cgroup.conf(5) Slurm Configuration File cgroup.conf(5)

cgroup.conf - Slurm configuration file for the cgroup support

cgroup.conf is an ASCII file which defines parameters used by Slurm's
Linux cgroup related plugins. The file location can be modified at
system build time using the DEFAULT_SLURM_CONF parameter or at
execution time by setting the SLURM_CONF environment variable. The file
will always be located in the same directory as the slurm.conf file.

Parameter names are case insensitive. Any text following a "#" in the
configuration file is treated as a comment through the end of that
line. Changes to the configuration file take effect upon restart of
Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
command "scontrol reconfigure" unless otherwise noted.

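The two parsing rules above (case-insensitive parameter names and "#"
comments that run to the end of the line) can be sketched in Python.
This is only an illustration and not Slurm's own parser: the function
name parse_cgroup_conf is hypothetical, and lowercasing the names is
just one way to model case insensitivity.

```python
def parse_cgroup_conf(text):
    """Parse Name=Value lines into a dict keyed by lowercased name.

    Hypothetical helper for illustration only; not part of Slurm.
    """
    params = {}
    for raw_line in text.splitlines():
        # Anything after "#" is a comment through the end of the line.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Parameter names are case insensitive, so normalize them.
        params[name.strip().lower()] = value.strip()
    return params


sample = """
###
# Slurm cgroup support configuration file.
###
CONSTRAINCORES=yes
ConstrainKmemSpace=no #avoid known Kernel issues
"""
conf = parse_cgroup_conf(sample)
print(conf["constraincores"], conf["constrainkmemspace"])  # yes no
```

Both spellings of a parameter name end up under the same key, and the
trailing comment on the second line is discarded before the value is
stored.
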
For general Slurm cgroups information, see the Cgroups Guide at
<https://slurm.schedmd.com/cgroups.html>.

The following cgroup.conf parameters are defined to control the general
behavior of Slurm cgroup plugins.

CgroupAutomount=<yes|no>
Slurm cgroup plugins require a valid and functional cgroup subsystem
to be mounted under /sys/fs/cgroup/<subsystem_name>. When launched,
plugins check their subsystem availability. If it is not available,
the plugin launch fails unless CgroupAutomount is set to yes. In that
case, the plugin will first try to mount the required subsystems.


CgroupMountpoint=PATH
Specify the PATH under which cgroups should be mounted. This should be
a writable directory which will contain cgroups mounted one per
subsystem. The default PATH is /sys/fs/cgroup.


CgroupPlugin=<cgroup/v1|autodetect>
Specify the plugin to be used when interacting with the cgroup
subsystem. The only supported values at the moment are "cgroup/v1",
which supports the legacy interface of cgroup v1, and "autodetect",
which tries to determine which cgroup version the system provides.
This is useful if nodes have support for different cgroup versions.
The default value is "autodetect".


The following cgroup.conf parameters are defined to control the
behavior of the task/cgroup plugin:


AllowedKmemSpace=<number>
Constrain the job cgroup kernel memory to this amount of the allocated
memory, specified in bytes. The AllowedKmemSpace must be between the
upper and lower memory limits, specified by MaxKmemPercent and
MinKmemSpace, respectively. If AllowedKmemSpace goes beyond the upper
or lower limit, it will be reset to that upper or lower limit,
whichever has been exceeded.


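The reset-to-boundary rule for AllowedKmemSpace amounts to a clamp. The
Python sketch below assumes both bounds have already been converted to
bytes; how Slurm derives them from MinKmemSpace and MaxKmemPercent is
not modelled here, and the helper name clamp_kmem and the use of 1 MB =
1024 * 1024 bytes are assumptions made for the example.

```python
MB = 1024 * 1024  # assumption: megabytes counted as 2**20 bytes


def clamp_kmem(allowed_kmem_bytes, lower_bytes, upper_bytes):
    """Reset a requested Kmem limit to whichever bound it exceeds.

    Hypothetical helper; both bounds are taken as given, in bytes.
    """
    return max(lower_bytes, min(allowed_kmem_bytes, upper_bytes))


lower = 30 * MB      # the default MinKmemSpace of 30M
upper = 1024 * MB    # example upper bound, chosen arbitrarily

print(clamp_kmem(10 * MB, lower, upper) // MB)    # 30
print(clamp_kmem(512 * MB, lower, upper) // MB)   # 512
print(clamp_kmem(4096 * MB, lower, upper) // MB)  # 1024
```
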
AllowedRAMSpace=<number>
Constrain the job/step cgroup RAM to this percentage of the allocated
memory. The percentage supplied may be expressed as a floating point
number, e.g. 101.5. Sets the cgroup soft memory limit at the allocated
memory size and then sets the job/step hard memory limit at
(AllowedRAMSpace/100) * allocated memory. If the job/step exceeds the
hard limit, then it might trigger Out Of Memory (OOM) events (including
oom-kill) which will be logged to the kernel log ring buffer (dmesg in
Linux). Setting AllowedRAMSpace above 100 may cause system Out of
Memory (OOM) events as it allows a job/step to allocate more memory
than is configured for the node. Reducing the configured node available
memory to avoid system OOM events is suggested. Setting AllowedRAMSpace
below 100 will result in jobs receiving less memory than allocated, and
the soft memory limit will be set to the same value as the hard limit.
Also see ConstrainRAMSpace. The default value is 100.


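The soft and hard limits described for AllowedRAMSpace can be worked
through in a few lines of Python. This is a sketch of the arithmetic as
stated in the text only; the function name ram_limits is hypothetical,
and lower bounds such as MinRAMSpace are deliberately left out.

```python
GiB = 1024 ** 3


def ram_limits(allocated_bytes, allowed_ram_space=100.0):
    """Return (soft_limit, hard_limit) in bytes.

    Hypothetical helper that follows the description in the text;
    it does not apply MinRAMSpace or any other bound.
    """
    hard = int(allocated_bytes * allowed_ram_space / 100.0)
    # The soft limit is the allocated memory, except that below 100
    # percent it is set to the same value as the hard limit.
    soft = min(allocated_bytes, hard)
    return soft, hard


for percent in (100, 101.5, 50):
    soft, hard = ram_limits(4 * GiB, percent)
    print(percent, soft, hard)
```

With a 4 GiB allocation, 101.5 leaves the soft limit at 4 GiB and
raises the hard limit slightly above it, while 50 sets both limits to
2 GiB.
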
AllowedSwapSpace=<number>
Constrain the job cgroup swap space to this percentage of the allocated
memory. The default value is 0, which means that RAM+Swap will be
limited to AllowedRAMSpace. The supplied percentage may be expressed as
a floating point number, e.g. 50.5. If the limit is exceeded, the job
steps will be killed and a warning message will be written to standard
error. Also see ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0
does not restrict the Linux kernel from using swap space. To control
how the kernel uses swap space, see MemorySwappiness.


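One reading of the AllowedSwapSpace description is that the combined
RAM+Swap limit is the AllowedRAMSpace share plus the AllowedSwapSpace
share of the allocated memory. The Python sketch below encodes that
reading as an assumption; the function name ram_plus_swap_limit is
hypothetical, and upper bounds such as MaxSwapPercent are not applied.

```python
GiB = 1024 ** 3


def ram_plus_swap_limit(allocated_bytes, allowed_ram_space=100.0,
                        allowed_swap_space=0.0):
    """Return an assumed RAM+Swap limit in bytes.

    Hypothetical helper: adds the swap percentage on top of the RAM
    percentage, which is an interpretation of the text, not a
    statement of how Slurm computes the value.
    """
    ram = allocated_bytes * allowed_ram_space / 100.0
    swap = allocated_bytes * allowed_swap_space / 100.0
    return int(ram + swap)


# Default of 0: RAM+Swap is limited to the AllowedRAMSpace amount.
print(ram_plus_swap_limit(4 * GiB) // GiB)                          # 4
# 50 percent swap on top of a 4 GiB allocation.
print(ram_plus_swap_limit(4 * GiB, allowed_swap_space=50) // GiB)   # 6
```
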
ConstrainCores=<yes|no>
If configured to "yes" then constrain allowed cores to the subset of
allocated resources. This functionality makes use of the cpuset
subsystem. Due to a bug fixed in version 1.11.5 of HWLOC, the
task/affinity plugin may be required in addition to task/cgroup for
this to function properly. The default value is "no".


ConstrainDevices=<yes|no>
If configured to "yes" then constrain the job's allowed devices based
on GRES allocated resources. This functionality makes use of the
devices subsystem. The default value is "no".


ConstrainKmemSpace=<yes|no>
If configured to "yes" then constrain the job's Kmem RAM usage in
addition to RAM usage. Only takes effect if ConstrainRAMSpace is set to
"yes". If enabled, the job's Kmem limit will be assigned the value of
AllowedKmemSpace or the value coming from MaxKmemPercent. The default
value is "no", which will leave the Kmem setting untouched by Slurm.
Also see AllowedKmemSpace and MaxKmemPercent.


ConstrainRAMSpace=<yes|no>
If configured to "yes" then constrain the job's RAM usage by setting
the memory soft limit to the allocated memory and the hard limit to the
allocated memory * (AllowedRAMSpace/100). The default value is "no", in
which case the job's RAM limit will be set to its swap space limit if
ConstrainSwapSpace is set to "yes". Also see AllowedSwapSpace,
AllowedRAMSpace and ConstrainSwapSpace.

NOTE: When using ConstrainRAMSpace, if the combined memory used by all
processes in a step is greater than the limit, then the kernel will
trigger an OOM event, killing one or more of the processes in the step.
The step state will be marked as OOM, but the step itself will keep
running and other processes in the step may continue to run as well.
This differs from the behavior of OverMemoryKill, where the whole step
will be killed/cancelled. OverMemoryKill also differs in that memory
usage is checked on a per-process basis by the JobAcctGather polling
system.

NOTE: When enabled, ConstrainRAMSpace can lead to a noticeable decline
in per-node job throughput. Sites with high-throughput requirements
should carefully weigh the tradeoff between per-node throughput and the
potential problems that can arise from unconstrained memory usage on
the node. See <https://slurm.schedmd.com/high_throughput.html> for
further discussion.


ConstrainSwapSpace=<yes|no>
If configured to "yes" then constrain the job's swap space usage. The
default value is "no". Note that when set to "yes" and
ConstrainRAMSpace is set to "no", AllowedRAMSpace is automatically set
to 100% in order to limit the RAM+Swap amount to 100% of the job's
requirement plus the percent of allowed swap space. This amount is thus
set to both RAM and RAM+Swap limits. This means that in that particular
case, ConstrainRAMSpace is automatically enabled with the same limit as
the one used to constrain swap space. Also see AllowedSwapSpace.


MaxRAMPercent=PERCENT
Set an upper bound in percent of total RAM on the RAM constraint for a
job. This will be the memory constraint applied to jobs that are not
explicitly allocated memory by Slurm (i.e. Slurm's select plugin is not
configured to manage memory allocations). The PERCENT may be an
arbitrary floating point number. The default value is 100.


MaxSwapPercent=PERCENT
Set an upper bound (in percent of total RAM) on the amount of RAM+Swap
that may be used for a job. This will be the swap limit applied to jobs
on systems where memory is not being explicitly allocated to jobs. The
PERCENT may be an arbitrary floating point number between 0 and 100.
The default value is 100.


MaxKmemPercent=PERCENT
Set an upper bound in percent of total RAM as the maximum Kmem for a
job. The PERCENT may be an arbitrary floating point number; however,
the product of MaxKmemPercent and job requested memory has to fall
between MinKmemSpace and job requested memory, otherwise the boundary
value is used. The default value is 100.


MemorySwappiness=<number>
Configure the kernel's priority for swapping out anonymous pages (such
as program data) versus file cache pages for the job cgroup. Valid
values are between 0 and 100, inclusive. A value of 0 prevents the
kernel from swapping out program data. A value of 100 gives equal
priority to swapping out file cache or anonymous pages. If not set,
then the kernel's default swappiness value will be used.
ConstrainSwapSpace must be set to yes in order for this parameter to be
applied.


MinKmemSpace=<number>
Set a lower bound (in MB) on the memory limits defined by
AllowedKmemSpace. The default limit is 30M.


MinRAMSpace=<number>
Set a lower bound (in MB) on the memory limits defined by
AllowedRAMSpace and AllowedSwapSpace. This prevents accidentally
creating a memory cgroup with such a low limit that slurmstepd is
immediately killed due to lack of RAM. The default limit is 30M.


Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
memsw (swap) cgroups by default. To include them, add the following
parameters to the kernel command line: cgroup_enable=memory swapaccount=1

This can usually be placed in /etc/default/grub inside the
GRUB_CMDLINE_LINUX variable. A command such as update-grub must be run
after updating the file.


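Whether the two kernel parameters are already present can be checked
with a short Python sketch. The helper name missing_kernel_params is
hypothetical, and the example command line is made up for illustration;
on a running Linux system the real string can be read from
/proc/cmdline.

```python
REQUIRED = ("cgroup_enable=memory", "swapaccount=1")


def missing_kernel_params(cmdline, required=REQUIRED):
    """Return the required parameters absent from a kernel command line.

    Hypothetical helper; compares whole whitespace-separated tokens.
    """
    present = set(cmdline.split())
    return [param for param in required if param not in present]


# Made-up command line used only for this example.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro cgroup_enable=memory"
print(missing_kernel_params(example))  # ['swapaccount=1']
```

An empty list means both parameters are already on the command line and
no change to GRUB_CMDLINE_LINUX is needed for them.
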
/etc/slurm/cgroup.conf:
This example cgroup.conf file shows a configuration that enables the
more commonly used cgroup enforcement mechanisms.

###
# Slurm cgroup support configuration file.
###
CgroupAutomount=yes
CgroupMountpoint=/sys/fs/cgroup
ConstrainCores=yes
ConstrainDevices=yes
ConstrainKmemSpace=no #avoid known Kernel issues
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes

/etc/slurm/slurm.conf:
These are the entries required in slurm.conf to activate the cgroup
enforcement mechanisms. Make sure that the node definitions in your
slurm.conf closely match the configuration as shown by "slurmd -C".
Either MemSpecLimit should be set or RealMemory should be defined with
less than the actual amount of memory for a node to ensure that all
system/non-job processes will have sufficient memory at all times.
Sites should also configure pam_slurm_adopt to ensure users cannot
escape the cgroups via ssh.

###
# Slurm configuration entries for cgroups
###
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup,task/affinity
JobAcctGatherType=jobacct_gather/cgroup #optional for gathering metrics
PrologFlags=Contain #X11 flag is also suggested


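A quick way to confirm that a slurm.conf carries the two non-optional
plugin entries from the example is sketched below in Python. The helper
name missing_cgroup_plugins is hypothetical, and the parsing is a
simplification that assumes one Name=Value entry per line with "#"
comments and comma-separated plugin lists, as in the example above.

```python
WANTED = {
    "proctracktype": "proctrack/cgroup",
    "taskplugin": "task/cgroup",
}


def missing_cgroup_plugins(slurm_conf_text):
    """Return lowercased parameter names whose cgroup plugin is absent.

    Hypothetical helper; a simplified reading of slurm.conf syntax.
    """
    seen = {}
    for raw_line in slurm_conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "=" in line:
            name, value = line.split("=", 1)
            # Plugin lists such as TaskPlugin are comma separated.
            seen[name.strip().lower()] = [v.strip() for v in value.split(",")]
    return [name for name, plugin in WANTED.items()
            if plugin not in seen.get(name, [])]


example = """
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity
"""
print(missing_cgroup_plugins(example))  # ['taskplugin']
```
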
Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced
at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2021 SchedMD LLC.

This file is part of Slurm, a resource management program. For
details, see <https://slurm.schedmd.com/>.

Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

Slurm is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

slurm.conf(5)

October 2021 Slurm Configuration File cgroup.conf(5)