cgroup.conf(5) Slurm Configuration File cgroup.conf(5)

cgroup.conf - Slurm configuration file for the cgroup support

cgroup.conf is an ASCII file which defines parameters used by Slurm's
Linux cgroup related plugins. The file location can be modified at
system build time using the DEFAULT_SLURM_CONF parameter or at
execution time by setting the SLURM_CONF environment variable. The
file will always be located in the same directory as the slurm.conf
file.

Parameter names are case insensitive. Any text following a "#" in the
configuration file is treated as a comment through the end of that
line. Changes to the configuration file take effect upon restart of
Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
command "scontrol reconfigure" unless otherwise noted.

For general Slurm Cgroups information, see the Cgroups Guide at
<https://slurm.schedmd.com/cgroups.html>.

The following cgroup.conf parameters are defined to control the general
behavior of Slurm cgroup plugins.

CgroupAutomount=<yes|no>
       Slurm cgroup plugins require a valid and functional cgroup
       subsystem to be mounted under /sys/fs/cgroup/<subsystem_name>.
       When launched, plugins check their subsystem availability. If
       the subsystem is not available, the plugin launch fails unless
       CgroupAutomount is set to yes. In that case, the plugin will
       first try to mount the required subsystems.

CgroupMountpoint=PATH
       Specify the PATH under which cgroups should be mounted. This
       should be a writable directory which will contain cgroups
       mounted one per subsystem. The default PATH is /sys/fs/cgroup.

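As an illustration of the two general parameters above, here is a
minimal sketch of a cgroup.conf that lets the plugins mount missing
subsystems themselves. The mount point is simply the documented default
written out explicitly; a site with a different cgroup hierarchy would
substitute its own path.

```
# Let Slurm's cgroup plugins mount any subsystem that is not yet available
CgroupAutomount=yes
# Documented default, stated explicitly; each subsystem is expected under
# /sys/fs/cgroup/<subsystem_name>
CgroupMountpoint=/sys/fs/cgroup
```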
The following cgroup.conf parameters are defined to control the
behavior of the task/cgroup plugin:

AllowedKmemSpace=<number>
       Constrain the job cgroup kernel memory to this amount of the
       allocated memory, specified in bytes. The AllowedKmemSpace must
       be between the upper and lower memory limits, specified by
       MaxKmemPercent and MinKmemSpace, respectively. If
       AllowedKmemSpace goes beyond the upper or lower limit, it will
       be reset to that upper or lower limit, whichever has been
       exceeded.

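The clamping described for AllowedKmemSpace can be sketched with a
hypothetical job. The figures are assumptions chosen for illustration:
a job allocated 1024 MB, and an upper bound derived from the job's
memory, which is how the MaxKmemPercent description appears to read.

```
# Assumed scenario: job allocated 1024 MB of memory.
# Kmem constraints only take effect when ConstrainRAMSpace is yes.
ConstrainRAMSpace=yes
ConstrainKmemSpace=yes
# Upper bound (assumed basis): 50% of 1024 MB = 512 MB
MaxKmemPercent=50
# Lower bound: 30 MB (the documented default, stated explicitly)
MinKmemSpace=30
# Requested kernel memory limit in bytes: 1073741824 bytes = 1024 MB.
# This exceeds the 512 MB upper bound, so it would be reset to 512 MB.
AllowedKmemSpace=1073741824
```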
AllowedRAMSpace=<number>
       Constrain the job/step cgroup RAM to this percentage of the
       allocated memory. The percentage supplied may be expressed as a
       floating point number, e.g. 101.5. Sets the cgroup soft memory
       limit at the allocated memory size and then sets the job/step
       hard memory limit at (AllowedRAMSpace/100) * allocated memory.
       If the job/step exceeds the hard limit, then it might trigger
       Out Of Memory (OOM) events (including oom-kill), which will be
       logged to the kernel log ring buffer (dmesg in Linux). Setting
       AllowedRAMSpace above 100 may cause system Out of Memory (OOM)
       events, as it allows a job/step to allocate more memory than is
       configured on the nodes. Reducing the configured node available
       memory to avoid system OOM events is suggested. Setting
       AllowedRAMSpace below 100 will result in jobs receiving less
       memory than allocated, and the soft memory limit will be set to
       the same value as the hard limit. Also see ConstrainRAMSpace.
       The default value is 100.

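To make the soft and hard limit arithmetic concrete, the following
sketch applies the formula above to a hypothetical job. The 4096 MB
allocation is an assumed figure, not a value from this page.

```
# Assumed scenario: job allocated 4096 MB of memory.
ConstrainRAMSpace=yes
# Soft limit = allocated memory      = 4096 MB
# Hard limit = (110 / 100) * 4096 MB = 4505.6 MB
AllowedRAMSpace=110
# With AllowedRAMSpace=90 instead, the hard limit would be
# (90 / 100) * 4096 MB = 3686.4 MB and the soft limit would match it.
```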
AllowedSwapSpace=<number>
       Constrain the job cgroup swap space to this percentage of the
       allocated memory. The default value is 0, which means that
       RAM+Swap will be limited to AllowedRAMSpace. The supplied
       percentage may be expressed as a floating point number, e.g.
       50.5. If the limit is exceeded, the job steps will be killed and
       a warning message will be written to standard error. Also see
       ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0 does
       not restrict the Linux kernel from using swap space. To control
       how the kernel uses swap space, see MemorySwappiness.

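A worked sketch of how a swap allowance might combine with the RAM
limit follows. It assumes a hypothetical 4096 MB job and assumes that
the RAM+Swap limit is the RAM limit plus AllowedSwapSpace percent of
the allocated memory, which is how the ConstrainSwapSpace description
reads; treat the totals as illustrative.

```
# Assumed scenario: job allocated 4096 MB of memory.
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
# RAM hard limit = (100 / 100) * 4096 MB = 4096 MB
AllowedRAMSpace=100
# Swap allowance = (25 / 100) * 4096 MB  = 1024 MB
# RAM+Swap limit = 4096 MB + 1024 MB     = 5120 MB (assumed combination)
AllowedSwapSpace=25
```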
ConstrainCores=<yes|no>
       If configured to "yes" then constrain allowed cores to the
       subset of allocated resources. This functionality makes use of
       the cpuset subsystem. Due to a bug fixed in version 1.11.5 of
       HWLOC, the task/affinity plugin may be required in addition to
       task/cgroup for this to function properly. The default value is
       "no".

ConstrainDevices=<yes|no>
       If configured to "yes" then constrain the job's allowed devices
       based on GRES allocated resources. This functionality makes use
       of the devices subsystem. The default value is "no".

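A site that wants both core and device confinement might combine
ConstrainCores and ConstrainDevices as in this short sketch. It assumes
that GRES (for example GPUs) are already configured elsewhere, which
this page does not cover.

```
# Restrict each job to its allocated cores (cpuset subsystem)
ConstrainCores=yes
# Restrict each job to the devices of its allocated GRES (devices subsystem)
ConstrainDevices=yes
```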
ConstrainKmemSpace=<yes|no>
       If configured to "yes" then constrain the job's Kmem RAM usage
       in addition to RAM usage. Only takes effect if ConstrainRAMSpace
       is set to "yes". If enabled, the job's Kmem limit will be
       assigned the value of AllowedKmemSpace or the value coming from
       MaxKmemPercent. The default value is "no", which will leave the
       Kmem setting untouched by Slurm. Also see AllowedKmemSpace and
       MaxKmemPercent.

ConstrainRAMSpace=<yes|no>
       If configured to "yes" then constrain the job's RAM usage by
       setting the memory soft limit to the allocated memory and the
       hard limit to the allocated memory * (AllowedRAMSpace/100). The
       default value is "no", in which case the job's RAM limit will be
       set to its swap space limit if ConstrainSwapSpace is set to
       "yes". Also see AllowedSwapSpace, AllowedRAMSpace and
       ConstrainSwapSpace. NOTE: When enabled, ConstrainRAMSpace can
       lead to a noticeable decline in per-node job throughput. Sites
       with high-throughput requirements should carefully weigh the
       tradeoff between per-node throughput and the potential problems
       that can arise from unconstrained memory usage on the node. See
       <https://slurm.schedmd.com/high_throughput.html> for further
       discussion.

ConstrainSwapSpace=<yes|no>
       If configured to "yes" then constrain the job's swap space
       usage. The default value is "no". Note that when set to "yes"
       and ConstrainRAMSpace is set to "no", AllowedRAMSpace is
       automatically set to 100% in order to limit the RAM+Swap amount
       to 100% of the job's requirement plus the percent of allowed
       swap space. This amount is thus set to both RAM and RAM+Swap
       limits. This means that in that particular case,
       ConstrainRAMSpace is automatically enabled with the same limit
       as the one used to constrain swap space. Also see
       AllowedSwapSpace.

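The special case in which only swap is constrained can be sketched as
follows. The 2048 MB allocation is an assumed figure, and the resulting
limit follows the wording of the description above.

```
# Assumed scenario: job allocated 2048 MB of memory.
ConstrainRAMSpace=no
ConstrainSwapSpace=yes
# AllowedRAMSpace is treated as 100 in this case.
# Limit = (100 + 50) / 100 * 2048 MB = 3072 MB,
# applied to both the RAM limit and the RAM+Swap limit.
AllowedSwapSpace=50
```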
MaxRAMPercent=PERCENT
       Set an upper bound in percent of total RAM on the RAM constraint
       for a job. This will be the memory constraint applied to jobs
       that are not explicitly allocated memory by Slurm (i.e. Slurm's
       select plugin is not configured to manage memory allocations).
       The PERCENT may be an arbitrary floating point number. The
       default value is 100.

MaxSwapPercent=PERCENT
       Set an upper bound (in percent of total RAM) on the amount of
       RAM+Swap that may be used for a job. This will be the swap limit
       applied to jobs on systems where memory is not being explicitly
       allocated to the job. The PERCENT may be an arbitrary floating
       point number between 0 and 100. The default value is 100.

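For nodes where Slurm does not allocate memory to jobs, the two
percentage bounds above could be applied as in this sketch. The node
size of 65536 MB is an assumption for the sake of the arithmetic.

```
# Assumed scenario: node with 65536 MB of RAM, memory not allocated by Slurm.
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
# RAM limit for a job      = (80 / 100) * 65536 MB = 52428.8 MB
MaxRAMPercent=80
# RAM+Swap limit for a job = (90 / 100) * 65536 MB = 58982.4 MB
MaxSwapPercent=90
```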
MaxKmemPercent=PERCENT
       Set an upper bound in percent of total RAM as the maximum Kmem
       for a job. The PERCENT may be an arbitrary floating point
       number; however, the product of MaxKmemPercent and job requested
       memory has to fall between MinKmemSpace and job requested
       memory, otherwise the boundary value is used. The default value
       is 100.

MemorySwappiness=<number>
       Configure the kernel's priority for swapping out anonymous pages
       (such as program data) versus file cache pages for the job
       cgroup. Valid values are between 0 and 100, inclusive. A value
       of 0 prevents the kernel from swapping out program data. A value
       of 100 gives equal priority to swapping out file cache or
       anonymous pages. If not set, then the kernel's default
       swappiness value will be used. Either ConstrainRAMSpace or
       ConstrainSwapSpace must be set to yes in order for this
       parameter to be applied.

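A minimal sketch of a configuration that keeps job program data out of
swap is shown below. It only combines parameters documented on this
page; whether a swappiness of 0 suits a given workload is a site
decision.

```
# MemorySwappiness is only applied when ConstrainRAMSpace or
# ConstrainSwapSpace is set to yes
ConstrainRAMSpace=yes
# 0 prevents the kernel from swapping out the job's program data
MemorySwappiness=0
```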
MinKmemSpace=<number>
       Set a lower bound (in MB) on the memory limits defined by
       AllowedKmemSpace. The default limit is 30M.

MinRAMSpace=<number>
       Set a lower bound (in MB) on the memory limits defined by
       AllowedRAMSpace and AllowedSwapSpace. This prevents accidentally
       creating a memory cgroup with such a low limit that slurmstepd
       is immediately killed due to lack of RAM. The default limit is
       30M.

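The effect of the lower bound can be illustrated with a deliberately
small job. The 16 MB allocation and the value of 64 are assumptions
picked to show the bound taking over.

```
# Assumed scenario: job allocated only 16 MB of memory.
ConstrainRAMSpace=yes
AllowedRAMSpace=100
# (100 / 100) * 16 MB = 16 MB is below the lower bound,
# so the limit would be raised to 64 MB.
MinRAMSpace=64
```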
TaskAffinity=<yes|no>
       If configured to "yes" then set a default task affinity to bind
       each step task to a subset of the allocated cores using
       sched_setaffinity. The default value is "no". Note: This
       feature requires the Portable Hardware Locality (hwloc) library
       to be installed.

Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
memsw (swap) cgroups by default. To include them, add the following
parameters to the kernel command line: cgroup_enable=memory
swapaccount=1

This can usually be placed in /etc/default/grub inside the
GRUB_CMDLINE_LINUX variable. A command such as update-grub must be run
after updating the file.

###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes
ConstrainCores=yes
#

Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced
at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2016 SchedMD LLC.

This file is part of Slurm, a resource management program. For
details, see <https://slurm.schedmd.com/>.

Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

Slurm is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

slurm.conf(5)

June 2020 Slurm Configuration File cgroup.conf(5)