cgroup.conf(5) Slurm Configuration File cgroup.conf(5)



cgroup.conf - Slurm configuration file for the cgroup support


cgroup.conf is an ASCII file which defines parameters used by Slurm's
Linux cgroup related plugins. The file location can be modified at
system build time using the DEFAULT_SLURM_CONF parameter or at
execution time by setting the SLURM_CONF environment variable. The file
will always be located in the same directory as the slurm.conf file.

Parameter names are case insensitive. Any text following a "#" in the
configuration file is treated as a comment through the end of that
line. Changes to the configuration file take effect upon restart of
Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
command "scontrol reconfigure" unless otherwise noted.
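
The syntax rules above (case-insensitive parameter names, "#" starting a
comment that runs to the end of the line) can be illustrated with a
minimal Python sketch. This is not Slurm's own parser; the function name
parse_cgroup_conf is hypothetical, and the sketch assumes one
Parameter=Value pair per line with no escaping of "#".

```python
def parse_cgroup_conf(text):
    """Return a dict of parameters keyed by lower-cased parameter name."""
    params = {}
    for raw_line in text.splitlines():
        # Everything from the first "#" to the end of the line is a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue  # blank line, comment-only line, or not a Key=Value pair
        key, value = line.split("=", 1)
        # Parameter names are case insensitive, so normalize the key.
        params[key.strip().lower()] = value.strip()
    return params


sample = """
###
# Slurm cgroup support configuration file.
###
CgroupAutomount=yes
CONSTRAINCORES=yes
ConstrainKmemSpace=no #avoid known Kernel issues
"""

conf = parse_cgroup_conf(sample)
print(conf)
```

Lower-casing the key is one simple way to make lookups such as
conf["constraincores"] independent of how the parameter was spelled in
the file.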


For general Slurm Cgroups information, see the Cgroups Guide at
<https://slurm.schedmd.com/cgroups.html>.


The following cgroup.conf parameters are defined to control the general
behavior of Slurm cgroup plugins.


CgroupAutomount=<yes|no>
    Slurm cgroup plugins require a valid and functional cgroup
    subsystem to be mounted under /sys/fs/cgroup/<subsystem_name>.
    When launched, plugins check their subsystem availability. If it
    is not available, the plugin launch fails unless CgroupAutomount
    is set to yes. In that case, the plugin will first try to mount
    the required subsystems.


CgroupMountpoint=PATH
    Specify the PATH under which cgroups should be mounted. This
    should be a writable directory which will contain cgroups mounted
    one per subsystem. The default PATH is /sys/fs/cgroup.


The following cgroup.conf parameters are defined to control the
behavior of the task/cgroup plugin:


AllowedKmemSpace=<number>
    Constrain the job cgroup kernel memory to this amount of the
    allocated memory, specified in bytes. AllowedKmemSpace must be
    between the upper and lower memory limits, specified by
    MaxKmemPercent and MinKmemSpace, respectively. If AllowedKmemSpace
    goes beyond the upper or lower limit, it will be reset to that
    upper or lower limit, whichever has been exceeded.
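
The reset-to-bound behavior described for AllowedKmemSpace amounts to a
clamp. The following Python sketch illustrates it under stated
assumptions: the helper name clamp_kmem_limit is hypothetical, both
bounds are passed in already converted to bytes, "MB" is taken to mean
2**20 bytes, and the 2048 MB upper bound is an invented example value,
not one computed by Slurm.

```python
MIB = 1024 * 1024  # assumption: "MB" is treated as 2**20 bytes here


def clamp_kmem_limit(allowed_kmem_bytes, lower_bytes, upper_bytes):
    """Reset a requested Kmem limit to whichever bound it exceeds."""
    if lower_bytes > upper_bytes:
        raise ValueError("lower bound must not exceed upper bound")
    return max(lower_bytes, min(allowed_kmem_bytes, upper_bytes))


lower = 30 * MIB     # MinKmemSpace default of 30M
upper = 2048 * MIB   # hypothetical upper bound derived from MaxKmemPercent

print(clamp_kmem_limit(10 * MIB, lower, upper))    # below the lower bound
print(clamp_kmem_limit(512 * MIB, lower, upper))   # within bounds, unchanged
print(clamp_kmem_limit(4096 * MIB, lower, upper))  # above the upper bound
```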


AllowedRAMSpace=<number>
    Constrain the job/step cgroup RAM to this percentage of the
    allocated memory. The percentage supplied may be expressed as a
    floating point number, e.g. 101.5. Sets the cgroup soft memory
    limit at the allocated memory size and then sets the job/step
    hard memory limit at (AllowedRAMSpace/100) * allocated memory. If
    the job/step exceeds the hard limit, then it might trigger Out Of
    Memory (OOM) events (including oom-kill) which will be logged to
    the kernel log ring buffer (dmesg in Linux). Setting
    AllowedRAMSpace above 100 may cause system Out of Memory (OOM)
    events, as it allows the job/step to allocate more memory than is
    configured on the nodes. Reducing the configured node available
    memory to avoid system OOM events is suggested. Setting
    AllowedRAMSpace below 100 will result in jobs receiving less
    memory than allocated, and the soft memory limit will be set to
    the same value as the hard limit. Also see ConstrainRAMSpace. The
    default value is 100.
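
The soft and hard limit arithmetic for AllowedRAMSpace can be worked
through with a short Python sketch. It follows the description above
and nothing more; the function name ram_limits_mb is hypothetical, and
lower bounds such as MinRAMSpace are deliberately ignored.

```python
def ram_limits_mb(allocated_mb, allowed_ram_space=100.0):
    """Return (soft_limit, hard_limit) in MB for a job/step."""
    # Hard limit: (AllowedRAMSpace/100) * allocated memory.
    hard = allocated_mb * allowed_ram_space / 100.0
    # Soft limit: the allocated memory, or the hard limit when
    # AllowedRAMSpace is below 100 and the hard limit is smaller.
    soft = min(float(allocated_mb), hard)
    return soft, hard


for percent in (100.0, 110.0, 50.0):
    soft, hard = ram_limits_mb(4096, percent)
    print(f"AllowedRAMSpace={percent}: soft={soft} MB, hard={hard} MB")
```

With a 4096 MB allocation, a value above 100 raises only the hard
limit, while a value below 100 lowers both limits to the same figure.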


AllowedSwapSpace=<number>
    Constrain the job cgroup swap space to this percentage of the
    allocated memory. The default value is 0, which means that
    RAM+Swap will be limited to AllowedRAMSpace. The supplied
    percentage may be expressed as a floating point number, e.g. 50.5.
    If the limit is exceeded, the job steps will be killed and a
    warning message will be written to standard error. Also see
    ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0 does not
    restrict the Linux kernel from using swap space. To control how
    the kernel uses swap space, see MemorySwappiness.
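
One reading of the AllowedSwapSpace description is that the combined
RAM+Swap limit is the RAM percentage plus the swap percentage, both
applied to the allocated memory. The Python sketch below works through
that reading; it is an assumption drawn from the text above rather than
from the Slurm source, and the function name ram_plus_swap_limit_mb is
hypothetical.

```python
def ram_plus_swap_limit_mb(allocated_mb, allowed_ram_space=100.0,
                           allowed_swap_space=0.0):
    """Return the assumed RAM+Swap limit in MB."""
    # Assumption: the swap allowance is added on top of the RAM allowance,
    # so AllowedSwapSpace=0 limits RAM+Swap to the AllowedRAMSpace amount.
    total_percent = allowed_ram_space + allowed_swap_space
    return allocated_mb * total_percent / 100.0


print(ram_plus_swap_limit_mb(4096))                           # default, no swap
print(ram_plus_swap_limit_mb(4096, allowed_swap_space=50.0))  # 50% extra as swap
```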


ConstrainCores=<yes|no>
    If configured to "yes" then constrain allowed cores to the subset
    of allocated resources. This functionality makes use of the cpuset
    subsystem. Due to a bug fixed in version 1.11.5 of HWLOC, the
    task/affinity plugin may be required in addition to task/cgroup
    for this to function properly. The default value is "no".


ConstrainDevices=<yes|no>
    If configured to "yes" then constrain the job's allowed devices
    based on GRES allocated resources. This uses the devices
    subsystem. The default value is "no".


ConstrainKmemSpace=<yes|no>
    If configured to "yes" then constrain the job's Kmem RAM usage in
    addition to RAM usage. Only takes effect if ConstrainRAMSpace is
    set to "yes". If enabled, the job's Kmem limit will be assigned
    the value of AllowedKmemSpace or the value coming from
    MaxKmemPercent. The default value is "no", which will leave the
    Kmem setting untouched by Slurm. Also see AllowedKmemSpace and
    MaxKmemPercent.


ConstrainRAMSpace=<yes|no>
    If configured to "yes" then constrain the job's RAM usage by
    setting the memory soft limit to the allocated memory and the hard
    limit to the allocated memory * (AllowedRAMSpace/100). The default
    value is "no", in which case the job's RAM limit will be set to
    its swap space limit if ConstrainSwapSpace is set to "yes". Also
    see AllowedSwapSpace, AllowedRAMSpace and ConstrainSwapSpace.

    NOTE: When using ConstrainRAMSpace, if a process tries to consume
    more memory than is available, the step that process is running in
    will be killed. This differs from the behavior when using
    OverMemoryKill, where just the offending process will be killed.

    NOTE: When enabled, ConstrainRAMSpace can lead to a noticeable
    decline in per-node job throughput. Sites with high-throughput
    requirements should carefully weigh the tradeoff between per-node
    throughput and the potential problems that can arise from
    unconstrained memory usage on the node. See
    <https://slurm.schedmd.com/high_throughput.html> for further
    discussion.


ConstrainSwapSpace=<yes|no>
    If configured to "yes" then constrain the job's swap space usage.
    The default value is "no". Note that when set to "yes" and
    ConstrainRAMSpace is set to "no", AllowedRAMSpace is automatically
    set to 100% in order to limit the RAM+Swap amount to 100% of the
    job's requirement plus the percent of allowed swap space. This
    amount is thus set to both RAM and RAM+Swap limits. This means
    that in that particular case, ConstrainRAMSpace is automatically
    enabled with the same limit as the one used to constrain swap
    space. Also see AllowedSwapSpace.


MaxRAMPercent=PERCENT
    Set an upper bound in percent of total RAM on the RAM constraint
    for a job. This will be the memory constraint applied to jobs that
    are not explicitly allocated memory by Slurm (i.e. Slurm's select
    plugin is not configured to manage memory allocations). The
    PERCENT may be an arbitrary floating point number. The default
    value is 100.


MaxSwapPercent=PERCENT
    Set an upper bound (in percent of total RAM) on the amount of
    RAM+Swap that may be used for a job. This will be the swap limit
    applied to jobs on systems where memory is not being explicitly
    allocated to the job. The PERCENT may be an arbitrary floating
    point number between 0 and 100. The default value is 100.


MaxKmemPercent=PERCENT
    Set an upper bound in percent of total RAM as the maximum Kmem for
    a job. The PERCENT may be an arbitrary floating point number;
    however, the product of MaxKmemPercent and job requested memory
    has to fall between MinKmemSpace and job requested memory,
    otherwise the boundary value is used. The default value is 100.


MemorySwappiness=<number>
    Configure the kernel's priority for swapping out anonymous pages
    (such as program data) versus file cache pages for the job cgroup.
    Valid values are between 0 and 100, inclusive. A value of 0
    prevents the kernel from swapping out program data. A value of 100
    gives equal priority to swapping out file cache or anonymous
    pages. If not set, then the kernel's default swappiness value will
    be used. ConstrainSwapSpace must be set to yes in order for this
    parameter to be applied.


MinKmemSpace=<number>
    Set a lower bound (in MB) on the memory limits defined by
    AllowedKmemSpace. The default limit is 30M.


MinRAMSpace=<number>
    Set a lower bound (in MB) on the memory limits defined by
    AllowedRAMSpace and AllowedSwapSpace. This prevents accidentally
    creating a memory cgroup with such a low limit that slurmstepd is
    immediately killed due to lack of RAM. The default limit is 30M.
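
The effect of the MinRAMSpace lower bound can be shown with a small
Python sketch. It only illustrates the floor described above; the
function name apply_min_ram_space is hypothetical, and the input is
assumed to be a limit in MB that has already been derived from
AllowedRAMSpace or AllowedSwapSpace.

```python
def apply_min_ram_space(limit_mb, min_ram_space_mb=30):
    """Raise a computed memory limit to the MinRAMSpace floor if needed."""
    return max(limit_mb, min_ram_space_mb)


# A 16 MB allocation with AllowedRAMSpace=50 would yield an 8 MB limit,
# which is raised to the default floor of 30 MB.
tiny_limit_mb = 16 * 50 / 100.0
print(apply_min_ram_space(tiny_limit_mb))

# A limit already above the floor is left unchanged.
print(apply_min_ram_space(4096))
```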


TaskAffinity=<yes|no>
    If configured to "yes" then set a default task affinity to bind
    each step task to a subset of the allocated cores using
    sched_setaffinity. The default value is "no".

    NOTE: The recommended configuration to bind steps to tasks is to
    set TaskPlugin=task/affinity,task/cgroup in the slurm.conf and set
    TaskAffinity=no with ConstrainCores=yes in the cgroup.conf. This
    setup uses the task/affinity plugin for setting the affinity of
    the tasks and uses the task/cgroup plugin to fence tasks into the
    specified resources, combining the best of both pieces.

    NOTE: This feature requires the Portable Hardware Locality (hwloc)
    library to be installed.


Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
memsw (swap) cgroups by default. To include them, add the following
parameters to the kernel command line: cgroup_enable=memory swapaccount=1

This can usually be placed in /etc/default/grub inside the
GRUB_CMDLINE_LINUX variable. A command such as update-grub must be run
after updating the file.
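
Whether the two kernel parameters named above are present can be
checked against a kernel command line string. The Python sketch below
is only an illustration: the function name missing_cgroup_params and
the sample command line are hypothetical, and the check is a plain
whitespace-separated token match.

```python
REQUIRED_PARAMS = ("cgroup_enable=memory", "swapaccount=1")


def missing_cgroup_params(cmdline):
    """Return the required kernel parameters absent from cmdline."""
    present = set(cmdline.split())
    return [param for param in REQUIRED_PARAMS if param not in present]


# Hypothetical command line with only one of the two parameters set.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet cgroup_enable=memory"
print(missing_cgroup_params(example))
```

On a running Linux system the current command line can be read from
/proc/cmdline and passed to the same function.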


/etc/slurm/cgroup.conf:
    This example cgroup.conf file shows a configuration that enables
    the more commonly used cgroup enforcement mechanisms.

    ###
    # Slurm cgroup support configuration file.
    ###
    CgroupAutomount=yes
    CgroupMountpoint=/sys/fs/cgroup
    ConstrainCores=yes
    ConstrainDevices=yes
    ConstrainKmemSpace=no #avoid known Kernel issues
    ConstrainRAMSpace=yes
    ConstrainSwapSpace=yes
    TaskAffinity=no #use task/affinity plugin instead

/etc/slurm/slurm.conf:
    These are the entries required in slurm.conf to activate the
    cgroup enforcement mechanisms. Make sure that the node definitions
    in your slurm.conf closely match the configuration as shown by
    "slurmd -C". Either MemSpecLimit should be set or RealMemory
    should be defined with less than the actual amount of memory for a
    node to ensure that all system/non-job processes will have
    sufficient memory at all times. Sites should also configure
    pam_slurm_adopt to ensure users cannot escape the cgroups via ssh.

    ###
    # Slurm configuration entries for cgroups
    ###
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup,task/affinity
    JobAcctGatherType=jobacct_gather/cgroup #optional for gathering metrics
    PrologFlags=Contain #X11 flag is also suggested
264
266 Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced
267 at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
268 Copyright (C) 2010-2016 SchedMD LLC.
269
270 This file is part of Slurm, a resource management program. For de‐
271 tails, see <https://slurm.schedmd.com/>.
272
273 Slurm is free software; you can redistribute it and/or modify it under
274 the terms of the GNU General Public License as published by the Free
275 Software Foundation; either version 2 of the License, or (at your op‐
276 tion) any later version.
277
278 Slurm is distributed in the hope that it will be useful, but WITHOUT
279 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
280 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
281 for more details.
282
283
285 slurm.conf(5)
286
287
288
289April 2021 Slurm Configuration File cgroup.conf(5)