cgroup.conf(5) Slurm Configuration File cgroup.conf(5)



cgroup.conf - Slurm configuration file for the cgroup support


cgroup.conf is an ASCII file which defines parameters used by Slurm's
Linux cgroup related plugins. The file location can be modified at
system build time using the DEFAULT_SLURM_CONF parameter or at
execution time by setting the SLURM_CONF environment variable. The file
will always be located in the same directory as the slurm.conf file.

Parameter names are case insensitive. Any text following a "#" in the
configuration file is treated as a comment through the end of that
line. Changes to the configuration file take effect upon restart of
Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
command "scontrol reconfigure" unless otherwise noted.
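
The two parsing rules above (case-insensitive parameter names and "#"
comments running to the end of the line) can be illustrated with a small
Python sketch. This is not Slurm's parser; the function name
parse_cgroup_conf and the choice to lower-case the keys are assumptions
made only for illustration.

```python
def parse_cgroup_conf(text):
    """Parse cgroup.conf-style text into a dict keyed by lower-cased name.

    Hypothetical helper for illustration; Slurm's own parser is not shown.
    """
    params = {}
    for raw_line in text.splitlines():
        # Everything after "#" is a comment through the end of the line.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Parameter names are case insensitive, so normalize them.
        params[name.strip().lower()] = value.strip()
    return params


sample = """
###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes   # trailing comment is ignored
constraincores=yes
"""
print(parse_cgroup_conf(sample))
```

Both spellings of a parameter name (for example ConstrainCores and
constraincores) end up under the same key in this sketch.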


For general Slurm Cgroups information, see the Cgroups Guide at
<https://slurm.schedmd.com/cgroups.html>.


The following cgroup.conf parameters are defined to control the general
behavior of Slurm cgroup plugins.


CgroupAutomount=<yes|no>
Slurm cgroup plugins require a valid and functional cgroup subsystem
to be mounted under /sys/fs/cgroup/<subsystem_name>. When
launched, plugins check their subsystem availability. If a subsystem
is not available, the plugin launch fails unless CgroupAutomount is set
to yes. In that case, the plugin will first try to mount the
required subsystems.


CgroupMountpoint=PATH
Specify the PATH under which cgroups should be mounted. This
should be a writable directory which will contain cgroups
mounted one per subsystem. The default PATH is /sys/fs/cgroup.


The following cgroup.conf parameters are defined to control the
behavior of the task/cgroup plugin:


AllowedKmemSpace=<number>
Constrain the job cgroup kernel memory to this amount of the
allocated memory, specified in bytes. The AllowedKmemSpace must
be between the upper and lower memory limits, specified by
MaxKmemPercent and MinKmemSpace, respectively. If AllowedKmemSpace
goes beyond the upper or lower limit, it will be reset to that
upper or lower limit, whichever has been exceeded.
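
The reset behavior described above amounts to clamping a value between
two bounds. The Python sketch below shows only that arithmetic; the
function name clamp_kmem_limit is hypothetical, and the caller is assumed
to have already converted the MinKmemSpace and MaxKmemPercent bounds to
bytes.

```python
MB = 1024 * 1024


def clamp_kmem_limit(allowed_kmem_bytes, lower_bytes, upper_bytes):
    """Reset the requested limit to whichever bound it exceeds.

    Hypothetical helper: lower_bytes and upper_bytes stand in for the
    bounds derived from MinKmemSpace and MaxKmemPercent.
    """
    return min(max(allowed_kmem_bytes, lower_bytes), upper_bytes)


# A 10 MB request with a 30 MB lower bound and an assumed 1024 MB upper bound.
print(clamp_kmem_limit(10 * MB, 30 * MB, 1024 * MB) // MB)
```

In this example the request is below the lower bound, so the result is
the lower bound.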


AllowedRAMSpace=<number>
Constrain the job/step cgroup RAM to this percentage of the
allocated memory. The percentage supplied may be expressed as a
floating point number, e.g. 101.5. Sets the cgroup soft memory
limit at the allocated memory size and then sets the job/step
hard memory limit at (AllowedRAMSpace/100) * allocated memory.
If the job/step exceeds the hard limit, then it might trigger
Out Of Memory (OOM) events (including oom-kill) which will
be logged to the kernel log ring buffer (dmesg in Linux). Setting
AllowedRAMSpace above 100 may cause system Out of Memory (OOM)
events as it allows a job/step to allocate more memory than is
configured on the nodes. Reducing the configured node available memory
to avoid system OOM events is suggested. Setting AllowedRAMSpace
below 100 will result in jobs receiving less memory
than allocated, and the soft memory limit will be set to the same value
as the hard limit. Also see ConstrainRAMSpace. The default
value is 100.
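
The soft and hard limit rules above can be worked through with a short
Python sketch. It only restates the arithmetic given in this section; the
function name ram_limits and the use of megabytes as the unit are
assumptions for illustration.

```python
def ram_limits(allocated_mb, allowed_ram_space=100.0):
    """Return (soft_limit_mb, hard_limit_mb) for a job/step.

    Hypothetical helper that mirrors the description of AllowedRAMSpace.
    """
    # Hard limit: (AllowedRAMSpace / 100) * allocated memory.
    hard = allocated_mb * allowed_ram_space / 100.0
    # Soft limit: the allocated memory, lowered to the hard limit when
    # AllowedRAMSpace is below 100.
    soft = min(allocated_mb, hard)
    return soft, hard


print(ram_limits(4000, 101.5))  # hard limit above the allocation
print(ram_limits(4000, 50))     # soft limit follows the hard limit down
```

With 4000 MB allocated, AllowedRAMSpace=101.5 leaves the soft limit at
the allocation and raises the hard limit slightly above it, while
AllowedRAMSpace=50 pulls both limits down to half the allocation.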


AllowedSwapSpace=<number>
Constrain the job cgroup swap space to this percentage of the
allocated memory. The default value is 0, which means that
RAM+Swap will be limited to AllowedRAMSpace. The supplied
percentage may be expressed as a floating point number, e.g. 50.5.
If the limit is exceeded, the job steps will be killed and a
warning message will be written to standard error. Also see
ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0 does
not restrict the Linux kernel from using swap space. To control
how the kernel uses swap space, see MemorySwappiness.
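
As a rough illustration of how the RAM+Swap limit relates to the two
percentages, the Python sketch below adds the swap percentage on top of
the RAM limit. That reading is an assumption based on the
ConstrainSwapSpace description (the job's requirement plus the percent of
allowed swap space); the function name and the megabyte unit are
hypothetical.

```python
def ram_plus_swap_limit(allocated_mb, allowed_ram_space=100.0,
                        allowed_swap_space=0.0):
    """Return the combined RAM+Swap limit in MB.

    Assumption: the swap allowance is a percentage of the allocated
    memory added to the RAM limit.
    """
    ram_limit = allocated_mb * allowed_ram_space / 100.0
    swap_allowance = allocated_mb * allowed_swap_space / 100.0
    return ram_limit + swap_allowance


print(ram_plus_swap_limit(4000))                           # default: no extra swap
print(ram_plus_swap_limit(4000, allowed_swap_space=50.5))  # 50.5% extra as swap
```

With the default AllowedSwapSpace of 0, the combined limit equals the RAM
limit, which matches the statement that RAM+Swap is then limited to
AllowedRAMSpace.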


ConstrainCores=<yes|no>
If configured to "yes" then constrain allowed cores to the
subset of allocated resources. This functionality makes use of the
cpuset subsystem. Due to a bug fixed in version 1.11.5 of
HWLOC, the task/affinity plugin may be required in addition to
task/cgroup for this to function properly. The default value is
"no".


ConstrainDevices=<yes|no>
If configured to "yes" then constrain the job's allowed devices
based on GRES allocated resources. It uses the devices subsystem
for that. The default value is "no".


ConstrainKmemSpace=<yes|no>
If configured to "yes" then constrain the job's Kmem RAM usage
in addition to RAM usage. Only takes effect if ConstrainRAMSpace
is set to "yes". The default value is "no". If set to yes, the
job's Kmem limit will be set to AllowedKmemSpace if set;
otherwise, the job's Kmem limit will be set to its RAM limit. Also
see AllowedKmemSpace.


ConstrainRAMSpace=<yes|no>
If configured to "yes" then constrain the job's RAM usage by
setting the memory soft limit to the allocated memory and the
hard limit to the allocated memory * (AllowedRAMSpace/100). The
default value is "no", in which case the job's RAM limit will be
set to its swap space limit if ConstrainSwapSpace is set to
"yes". Also see AllowedSwapSpace, AllowedRAMSpace and
ConstrainSwapSpace. NOTE: When enabled, ConstrainRAMSpace can lead
to a noticeable decline in per-node job throughput. Sites with
high-throughput requirements should carefully weigh the tradeoff
between per-node throughput and the potential problems that can
arise from unconstrained memory usage on the node. See
<https://slurm.schedmd.com/high_throughput.html> for further
discussion.


ConstrainSwapSpace=<yes|no>
If configured to "yes" then constrain the job's swap space
usage. The default value is "no". Note that when set to "yes"
and ConstrainRAMSpace is set to "no", AllowedRAMSpace is
automatically set to 100% in order to limit the RAM+Swap amount to
100% of the job's requirement plus the percent of allowed swap
space. This amount is thus set to both RAM and RAM+Swap limits.
This means that in that particular case, ConstrainRAMSpace is
automatically enabled with the same limit as the one used to
constrain swap space. Also see AllowedSwapSpace.


MaxRAMPercent=PERCENT
Set an upper bound in percent of total RAM on the RAM constraint
for a job. This will be the memory constraint applied to jobs
that are not explicitly allocated memory by Slurm (i.e. Slurm's
select plugin is not configured to manage memory allocations).
The PERCENT may be an arbitrary floating point number. The
default value is 100.


MaxSwapPercent=PERCENT
Set an upper bound (in percent of total RAM) on the amount of
RAM+Swap that may be used for a job. This will be the swap limit
applied to jobs on systems where memory is not being explicitly
allocated to jobs. The PERCENT may be an arbitrary floating point
number between 0 and 100. The default value is 100.


MaxKmemPercent=PERCENT
Set an upper bound in percent of total Kmem for a job. The
PERCENT may be an arbitrary floating point number. The default
value is 100.


MemorySwappiness=<number>
Configure the kernel's priority for swapping out anonymous pages
(such as program data) versus file cache pages for the job
cgroup. Valid values are between 0 and 100, inclusive. A value
of 0 prevents the kernel from swapping out program data. A value
of 100 gives equal priority to swapping out file cache or
anonymous pages. If not set, then the kernel's default
swappiness value will be used. Either ConstrainRAMSpace or
ConstrainSwapSpace must be set to yes in order for this parameter to be
applied.


MinKmemSpace=<number>
Set a lower bound (in MB) on the memory limits defined by
AllowedKmemSpace. The default limit is 30M.


MinRAMSpace=<number>
Set a lower bound (in MB) on the memory limits defined by
AllowedRAMSpace and AllowedSwapSpace. This prevents accidentally
creating a memory cgroup with such a low limit that slurmstepd
is immediately killed due to lack of RAM. The default limit is
30M.
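
The effect of the lower bound can be seen in a small Python sketch. It
shows only the floor applied to a computed limit; the function name
apply_min_ram_space is hypothetical, and the 8 MB allocation is an
invented example value.

```python
def apply_min_ram_space(limit_mb, min_ram_space_mb=30):
    """Raise a computed memory limit to the MinRAMSpace floor if needed.

    Hypothetical helper; 30 MB is the default given in this section.
    """
    return max(limit_mb, min_ram_space_mb)


# An 8 MB allocation with AllowedRAMSpace=50 would yield a 4 MB limit,
# which is raised to the 30 MB floor.
tiny_limit_mb = 8 * 50 / 100.0
print(apply_min_ram_space(tiny_limit_mb))
```

Limits that are already above the floor pass through unchanged.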


TaskAffinity=<yes|no>
If configured to "yes" then set a default task affinity to bind
each step task to a subset of the allocated cores using
sched_setaffinity. The default value is "no". Note: This
feature requires the Portable Hardware Locality (hwloc) library to
be installed.


Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
memsw (swap) cgroups by default. To include them, add the following
parameters to the kernel command line: cgroup_enable=memory
swapaccount=1

This can usually be placed in /etc/default/grub inside the
GRUB_CMDLINE_LINUX variable. A command such as update-grub must be run
after updating the file.
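
One way to confirm that both parameters are present is to inspect the
kernel command line of the running system. The Python sketch below checks
a command-line string for the two parameters named above; the function
name and the sample command line are made up for illustration.

```python
REQUIRED_KERNEL_ARGS = ("cgroup_enable=memory", "swapaccount=1")


def missing_cgroup_kernel_args(cmdline):
    """Return the required kernel parameters absent from cmdline.

    Hypothetical helper; cmdline is a whitespace-separated string such
    as the contents of /proc/cmdline on Linux.
    """
    present = set(cmdline.split())
    return [arg for arg in REQUIRED_KERNEL_ARGS if arg not in present]


# Invented sample command line that enables only the memory cgroup.
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet cgroup_enable=memory"
print(missing_cgroup_kernel_args(sample_cmdline))
```

An empty list means both parameters are already on the command line that
was checked.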


###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes
ConstrainCores=yes
#


Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced
at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2016 SchedMD LLC.

This file is part of Slurm, a resource management program. For
details, see <https://slurm.schedmd.com/>.

Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

Slurm is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.


slurm.conf(5)



December 2016 Slurm Configuration File cgroup.conf(5)