cgroup.conf(5)             Slurm Configuration File              cgroup.conf(5)


NAME
       cgroup.conf - Slurm configuration file for the cgroup support


DESCRIPTION
       cgroup.conf is an ASCII file which defines parameters used by Slurm's
       Linux cgroup related plugins. The file will always be located in the
       same directory as slurm.conf.

       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of
       the command "scontrol reconfigure", unless otherwise noted.

       For general Slurm cgroups information, see the Cgroups Guide at
       <https://slurm.schedmd.com/cgroups.html>.

       The following cgroup.conf parameters are defined to control the
       general behavior of Slurm cgroup plugins.

       CgroupAutomount=<yes|no>
              Slurm cgroup plugins require a valid and functional cgroup
              subsystem to be mounted under /sys/fs/cgroup/<subsystem_name>.
              When launched, plugins check their subsystem availability. If
              a subsystem is not available, the plugin launch fails unless
              CgroupAutomount is set to yes. In that case, the plugin will
              first try to mount the required subsystems.

       CgroupMountpoint=PATH
              Specify the PATH under which cgroups should be mounted. This
              should be a writable directory which will contain cgroups
              mounted one per subsystem. The default PATH is /sys/fs/cgroup.

       CgroupPlugin=<cgroup/v1|autodetect>
              Specify the plugin to be used when interacting with the cgroup
              subsystem. The only supported values at the moment are
              "cgroup/v1", which supports the legacy cgroup v1 interface,
              and "autodetect", which tries to determine which cgroup
              version your system provides. This is useful if nodes support
              different cgroup versions. The default value is "cgroup/v1".

TASK/CGROUP PLUGIN
       The following cgroup.conf parameters are defined to control the
       behavior of this particular plugin:

       AllowedKmemSpace=<number>
              Constrain the job cgroup kernel memory to this amount of the
              allocated memory, specified in bytes. The AllowedKmemSpace
              value must lie between the upper and lower memory limits
              specified by MaxKmemPercent and MinKmemSpace, respectively. If
              AllowedKmemSpace goes beyond the upper or lower limit, it will
              be reset to whichever limit has been exceeded.

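              The clamping described above can be sketched in Python;
              effective_kmem_limit is a hypothetical helper (not part of
              Slurm), and all sizes are assumed to be in bytes.

```python
def effective_kmem_limit(allowed_kmem, allocated, max_kmem_percent=100.0,
                         min_kmem_space=30 * 1024 ** 2):
    """Clamp AllowedKmemSpace between the MinKmemSpace lower bound and the
    MaxKmemPercent upper bound (hypothetical sketch; sizes in bytes)."""
    upper = allocated * max_kmem_percent / 100.0
    lower = min_kmem_space
    # Reset to whichever boundary has been exceeded.
    return min(max(allowed_kmem, lower), upper)
```
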
       AllowedRAMSpace=<number>
              Constrain the job/step cgroup RAM to this percentage of the
              allocated memory. The percentage supplied may be expressed as
              a floating point number, e.g. 101.5. Sets the cgroup soft
              memory limit to the allocated memory size and then sets the
              job/step hard memory limit to (AllowedRAMSpace/100) *
              allocated memory. If the job/step exceeds the hard limit, it
              might trigger Out Of Memory (OOM) events (including oom-kill),
              which will be logged to the kernel log ring buffer (dmesg in
              Linux). Setting AllowedRAMSpace above 100 may cause system OOM
              events, as it allows the job/step to allocate more memory than
              is configured on the nodes. Reducing the configured node
              available memory to avoid system OOM events is suggested.
              Setting AllowedRAMSpace below 100 will result in jobs
              receiving less memory than allocated, and the soft memory
              limit will be set to the same value as the hard limit. Also
              see ConstrainRAMSpace. The default value is 100.

       AllowedSwapSpace=<number>
              Constrain the job cgroup swap space to this percentage of the
              allocated memory. The default value is 0, which means that
              RAM+Swap will be limited to AllowedRAMSpace. The supplied
              percentage may be expressed as a floating point number, e.g.
              50.5. If the limit is exceeded, the job steps will be killed
              and a warning message will be written to standard error. Also
              see ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0
              does not restrict the Linux kernel from using swap space. To
              control how the kernel uses swap space, see MemorySwappiness.

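              The limit arithmetic described by AllowedRAMSpace and
              AllowedSwapSpace can be sketched as follows; memory_limits is
              a hypothetical illustration of the rules above, not Slurm
              source code, and works in bytes.

```python
def memory_limits(allocated, allowed_ram_pct=100.0, allowed_swap_pct=0.0):
    """Return (soft, hard, ram_plus_swap) limits in bytes for a job/step
    (hypothetical sketch of the rules described above)."""
    soft = allocated                                # soft limit = allocation
    hard = allocated * allowed_ram_pct / 100.0      # RAM hard limit
    memsw = allocated * (allowed_ram_pct + allowed_swap_pct) / 100.0
    if allowed_ram_pct < 100.0:
        # Below 100%, the soft limit is set equal to the hard limit.
        soft = hard
    return soft, hard, memsw
```

              For example, with a 4 GiB allocation and AllowedSwapSpace=50,
              the RAM hard limit stays at 4 GiB while RAM+Swap is capped at
              6 GiB.
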
       ConstrainCores=<yes|no>
              If configured to "yes" then constrain allowed cores to the
              subset of allocated resources. This functionality makes use of
              the cpuset subsystem. Due to a bug fixed in version 1.11.5 of
              HWLOC, the task/affinity plugin may be required in addition to
              task/cgroup for this to function properly. The default value
              is "no".

       ConstrainDevices=<yes|no>
              If configured to "yes" then constrain the job's allowed
              devices based on GRES allocated resources. It uses the devices
              subsystem for that. The default value is "no".

       ConstrainKmemSpace=<yes|no>
              If configured to "yes" then constrain the job's Kmem RAM usage
              in addition to RAM usage. Only takes effect if
              ConstrainRAMSpace is set to "yes". If enabled, the job's Kmem
              limit will be assigned the value of AllowedKmemSpace or the
              value coming from MaxKmemPercent. The default value is "no",
              which leaves the Kmem setting untouched by Slurm. Also see
              AllowedKmemSpace and MaxKmemPercent.

       ConstrainRAMSpace=<yes|no>
              If configured to "yes" then constrain the job's RAM usage by
              setting the memory soft limit to the allocated memory and the
              hard limit to the allocated memory * AllowedRAMSpace. The
              default value is "no", in which case the job's RAM limit will
              be set to its swap space limit if ConstrainSwapSpace is set to
              "yes". Also see AllowedSwapSpace, AllowedRAMSpace and
              ConstrainSwapSpace.

              NOTE: When using ConstrainRAMSpace, if the combined memory
              used by all processes in a step is greater than the limit,
              then the kernel will trigger an OOM event, killing one or more
              of the processes in the step. The step state will be marked as
              OOM, but the step itself will keep running, and other
              processes in the step may continue to run as well. This
              differs from the behavior of OverMemoryKill, where the whole
              step will be killed/cancelled.

              NOTE: When enabled, ConstrainRAMSpace can lead to a noticeable
              decline in per-node job throughput. Sites with
              high-throughput requirements should carefully weigh the
              tradeoff between per-node throughput and the potential
              problems that can arise from unconstrained memory usage on the
              node. See <https://slurm.schedmd.com/high_throughput.html> for
              further discussion.

       ConstrainSwapSpace=<yes|no>
              If configured to "yes" then constrain the job's swap space
              usage. The default value is "no". Note that when set to "yes"
              and ConstrainRAMSpace is set to "no", AllowedRAMSpace is
              automatically set to 100% in order to limit the RAM+Swap
              amount to 100% of the job's requirement plus the percent of
              allowed swap space. This amount is then set as both the RAM
              and RAM+Swap limits. This means that in this particular case,
              ConstrainRAMSpace is automatically enabled with the same limit
              as the one used to constrain swap space. Also see
              AllowedSwapSpace.

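              The special case above, where only ConstrainSwapSpace is
              enabled, can be sketched as follows (hypothetical helper;
              sizes in bytes):

```python
def swap_only_limits(allocated, allowed_swap_pct):
    """When ConstrainSwapSpace=yes and ConstrainRAMSpace=no, AllowedRAMSpace
    is forced to 100 and the same value caps both RAM and RAM+Swap
    (hypothetical sketch of the rule described above)."""
    limit = allocated * (100.0 + allowed_swap_pct) / 100.0
    return limit, limit  # (RAM limit, RAM+Swap limit)
```
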
       MaxRAMPercent=PERCENT
              Set an upper bound, in percent of total RAM, on the RAM
              constraint for a job. This will be the memory constraint
              applied to jobs that are not explicitly allocated memory by
              Slurm (i.e. Slurm's select plugin is not configured to manage
              memory allocations). The PERCENT may be an arbitrary floating
              point number. The default value is 100.

       MaxSwapPercent=PERCENT
              Set an upper bound (in percent of total RAM) on the amount of
              RAM+Swap that may be used for a job. This will be the swap
              limit applied to jobs on systems where memory is not being
              explicitly allocated to jobs. The PERCENT may be an arbitrary
              floating point number between 0 and 100. The default value is
              100.

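              For jobs with no explicit memory allocation, these fallback
              bounds amount to the following (hypothetical sketch;
              total_ram is the node's total RAM in bytes):

```python
def fallback_limits(total_ram, max_ram_percent=100.0, max_swap_percent=100.0):
    """Limits applied when Slurm does not explicitly allocate memory to a
    job (hypothetical sketch of MaxRAMPercent / MaxSwapPercent)."""
    ram_limit = total_ram * max_ram_percent / 100.0
    memsw_limit = total_ram * max_swap_percent / 100.0  # RAM+Swap bound
    return ram_limit, memsw_limit
```
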
       MaxKmemPercent=PERCENT
              Set an upper bound, in percent of total RAM, on the maximum
              Kmem for a job. The PERCENT may be an arbitrary floating point
              number; however, the product of MaxKmemPercent and the job's
              requested memory has to fall between MinKmemSpace and the
              job's requested memory, otherwise the boundary value is used.
              The default value is 100.

       MemorySwappiness=<number>
              Configure the kernel's priority for swapping out anonymous
              pages (such as program data) versus file cache pages for the
              job cgroup. Valid values are between 0 and 100, inclusive. A
              value of 0 prevents the kernel from swapping out program data.
              A value of 100 gives equal priority to swapping out file cache
              or anonymous pages. If not set, the kernel's default
              swappiness value will be used. ConstrainSwapSpace must be set
              to yes in order for this parameter to be applied.

       MinKmemSpace=<number>
              Set a lower bound (in MB) on the memory limits defined by
              AllowedKmemSpace. The default limit is 30M.

       MinRAMSpace=<number>
              Set a lower bound (in MB) on the memory limits defined by
              AllowedRAMSpace and AllowedSwapSpace. This prevents
              accidentally creating a memory cgroup with such a low limit
              that slurmstepd is immediately killed due to lack of RAM. The
              default limit is 30M.

DISTRIBUTION-SPECIFIC NOTES
       Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
       memsw (swap) cgroups by default. To include them, add the following
       parameters to the kernel command line:

              cgroup_enable=memory swapaccount=1

       This can usually be placed in /etc/default/grub inside the
       GRUB_CMDLINE_LINUX variable. A command such as update-grub must be
       run after updating the file.


EXAMPLE
       /etc/slurm/cgroup.conf:
       This example cgroup.conf file shows a configuration that enables the
       more commonly used cgroup enforcement mechanisms.

              ###
              # Slurm cgroup support configuration file.
              ###
              CgroupAutomount=yes
              CgroupMountpoint=/sys/fs/cgroup
              ConstrainCores=yes
              ConstrainDevices=yes
              ConstrainKmemSpace=no        # avoid known kernel issues
              ConstrainRAMSpace=yes
              ConstrainSwapSpace=yes

       /etc/slurm/slurm.conf:
       These are the entries required in slurm.conf to activate the cgroup
       enforcement mechanisms. Make sure that the node definitions in your
       slurm.conf closely match the configuration shown by "slurmd -C".
       Either MemSpecLimit should be set, or RealMemory should be defined
       as less than the actual amount of memory on a node, to ensure that
       all system/non-job processes have sufficient memory at all times.
       Sites should also configure pam_slurm_adopt to ensure users cannot
       escape the cgroups via ssh.

              ###
              # Slurm configuration entries for cgroups
              ###
              ProctrackType=proctrack/cgroup
              TaskPlugin=task/cgroup,task/affinity
              JobAcctGatherType=jobacct_gather/cgroup  # optional for gathering metrics
              PrologFlags=Contain          # the X11 flag is also suggested

COPYING
       Copyright (C) 2010-2012 Lawrence Livermore National Security.
       Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2022 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at
       your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

SEE ALSO
       slurm.conf(5)



April 2022                 Slurm Configuration File              cgroup.conf(5)