cgroup.conf(5)             Slurm Configuration File             cgroup.conf(5)

NAME

       cgroup.conf - Slurm configuration file for the cgroup support

DESCRIPTION

       cgroup.conf is an ASCII file which defines parameters used by Slurm's
       Linux cgroup related plugins. The file location can be modified at
       system build time using the DEFAULT_SLURM_CONF parameter or at
       execution time by setting the SLURM_CONF environment variable. The file
       will always be located in the same directory as the slurm.conf file.

       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
       command "scontrol reconfigure" unless otherwise noted.

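       The syntax rules described here (case-insensitive parameter names and
       "#" comments) can be illustrated with a small parser. This is a minimal
       sketch in Python, not Slurm's own parser. The function name
       parse_cgroup_conf is hypothetical, and the sketch assumes one
       Name=Value pair per line.

```python
def parse_cgroup_conf(text):
    """Parse cgroup.conf-style text into a dict keyed by lowercase name.

    Illustrative only: parse_cgroup_conf is a hypothetical helper, not a
    Slurm API, and it assumes one Name=Value pair per line.
    """
    params = {}
    for raw_line in text.splitlines():
        # Everything after "#" is a comment through the end of the line.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Parameter names are case insensitive, so normalize them.
        params[name.strip().lower()] = value.strip()
    return params


sample = """
###
# Slurm cgroup support configuration file.
###
ConstrainCores=yes
CONSTRAINRAMSPACE=yes      # names are case insensitive
AllowedRAMSpace=101.5
"""
conf = parse_cgroup_conf(sample)
print(conf["constraincores"], conf["constrainramspace"], conf["allowedramspace"])
```

       Values are kept as strings so that the caller decides how to interpret
       yes/no flags and percentages.
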
       For general Slurm Cgroups information, see the Cgroups Guide at
       <https://slurm.schedmd.com/cgroups.html>.

       The following cgroup.conf parameters are defined to control the general
       behavior of Slurm cgroup plugins.

       CgroupAutomount=<yes|no>
              Slurm cgroup plugins require a valid and functional cgroup
              subsystem to be mounted under /sys/fs/cgroup/<subsystem_name>.
              When launched, plugins check their subsystem availability. If it
              is not available, the plugin launch fails unless CgroupAutomount
              is set to yes. In that case, the plugin will first try to mount
              the required subsystems.


       CgroupMountpoint=PATH
              Specify the PATH under which cgroups should be mounted. This
              should be a writable directory which will contain cgroups
              mounted one per subsystem. The default PATH is /sys/fs/cgroup.
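
              The one-directory-per-subsystem layout can be inspected from a
              script. The following is a minimal sketch in Python, not the
              check the plugins actually perform. The helper names
              subsystem_dir and subsystem_is_mounted are hypothetical. On
              hosts that use a different layout the check reports False.

```python
import os


def subsystem_dir(subsystem, mountpoint="/sys/fs/cgroup"):
    """Return the directory where a subsystem is expected to be mounted."""
    return os.path.join(mountpoint, subsystem)


def subsystem_is_mounted(subsystem, mountpoint="/sys/fs/cgroup"):
    """Return True if the expected directory exists and is a mount point."""
    return os.path.ismount(subsystem_dir(subsystem, mountpoint))


# Subsystems named elsewhere in this page: cpuset, memory, devices.
for name in ("cpuset", "memory", "devices"):
    print(name, subsystem_dir(name), subsystem_is_mounted(name))
```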

TASK/CGROUP PLUGIN

       The following cgroup.conf parameters are defined to control the
       behavior of this particular plugin:


       AllowedKmemSpace=<number>
              Constrain the job cgroup kernel memory to this amount of the
              allocated memory, specified in bytes. The AllowedKmemSpace must
              be between the upper and lower memory limits, specified by
              MaxKmemPercent and MinKmemSpace, respectively. If
              AllowedKmemSpace goes beyond the upper or lower limit, it will
              be reset to that upper or lower limit, whichever has been
              exceeded.


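              The reset behavior can be sketched as a clamp. This is an
              illustration in Python, not Slurm's implementation. The function
              name effective_kmem_limit is hypothetical, and the sketch
              assumes that the upper bound is MaxKmemPercent of the job's
              allocated memory and that 1 MB means 1024 * 1024 bytes.

```python
MIB = 1024 * 1024  # assumption: MinKmemSpace megabytes are binary megabytes


def effective_kmem_limit(allowed_kmem_bytes, allocated_mem_bytes,
                         max_kmem_percent=100.0, min_kmem_space_mb=30):
    """Clamp AllowedKmemSpace (bytes) between the lower and upper bounds."""
    lower = min_kmem_space_mb * MIB
    # Assumption: the upper bound is a percentage of the allocated memory.
    upper = int(allocated_mem_bytes * max_kmem_percent / 100.0)
    # Keep the bounds ordered for very small allocations.
    upper = max(upper, lower)
    return min(max(allowed_kmem_bytes, lower), upper)


allocated = 4 * 1024 * MIB  # a job with 4 GiB allocated
print(effective_kmem_limit(3 * 1024 * MIB, allocated, max_kmem_percent=50.0))
print(effective_kmem_limit(1 * MIB, allocated, max_kmem_percent=50.0))
```

              With these inputs the first call is reset to the upper limit
              (2 GiB) and the second to the lower limit (30 MB).

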
       AllowedRAMSpace=<number>
              Constrain the job/step cgroup RAM to this percentage of the
              allocated memory. The percentage supplied may be expressed as a
              floating point number, e.g. 101.5. Sets the cgroup soft memory
              limit at the allocated memory size and then sets the job/step
              hard memory limit at (AllowedRAMSpace/100) * allocated memory.
              If the job/step exceeds the hard limit, then it might trigger
              Out Of Memory (OOM) events (including oom-kill) which will be
              logged to the kernel log ring buffer (dmesg in Linux). Setting
              AllowedRAMSpace above 100 may cause system Out Of Memory (OOM)
              events, as it allows the job/step to allocate more memory than
              is configured for the nodes. Reducing the configured node
              available memory to avoid system OOM events is suggested.
              Setting AllowedRAMSpace below 100 will result in jobs receiving
              less memory than allocated, and the soft memory limit will be
              set to the same value as the hard limit. Also see
              ConstrainRAMSpace. The default value is 100.


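              The soft and hard limits described for AllowedRAMSpace can be
              worked through numerically. This is a minimal sketch in Python.
              The function name ram_limits_mb is hypothetical, sizes are in
              megabytes for readability, and lower bounds such as MinRAMSpace
              are ignored.

```python
def ram_limits_mb(allocated_mb, allowed_ram_space=100.0):
    """Return (soft_limit_mb, hard_limit_mb) for a job/step."""
    # Hard limit: (AllowedRAMSpace / 100) * allocated memory.
    hard = allocated_mb * allowed_ram_space / 100.0
    # Soft limit: the allocated memory, lowered to the hard limit when
    # AllowedRAMSpace is below 100.
    soft = min(allocated_mb, hard)
    return soft, hard


for percent in (100.0, 101.5, 80.0):
    soft, hard = ram_limits_mb(4096, percent)
    print(f"AllowedRAMSpace={percent}: soft={soft} MB, hard={hard} MB")
```

              For a 4096 MB allocation, 101.5 leaves the soft limit at 4096 MB
              and raises the hard limit slightly above it, while 80 lowers
              both limits to the same value.

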
       AllowedSwapSpace=<number>
              Constrain the job cgroup swap space to this percentage of the
              allocated memory. The default value is 0, which means that
              RAM+Swap will be limited to AllowedRAMSpace. The supplied
              percentage may be expressed as a floating point number, e.g.
              50.5. If the limit is exceeded, the job steps will be killed and
              a warning message will be written to standard error. Also see
              ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0 does not
              restrict the Linux kernel from using swap space. To control how
              the kernel uses swap space, see MemorySwappiness.


       ConstrainCores=<yes|no>
              If configured to "yes" then constrain allowed cores to the
              subset of allocated resources. This functionality makes use of
              the cpuset subsystem. Due to a bug fixed in version 1.11.5 of
              HWLOC, the task/affinity plugin may be required in addition to
              task/cgroup for this to function properly. The default value is
              "no".


       ConstrainDevices=<yes|no>
              If configured to "yes" then constrain the job's allowed devices
              based on GRES allocated resources. It uses the devices subsystem
              for that. The default value is "no".


       ConstrainKmemSpace=<yes|no>
              If configured to "yes" then constrain the job's Kmem RAM usage
              in addition to RAM usage. Only takes effect if ConstrainRAMSpace
              is set to "yes". If enabled, the job's Kmem limit will be
              assigned the value of AllowedKmemSpace or the value coming from
              MaxKmemPercent. The default value is "no", which will leave the
              Kmem setting untouched by Slurm. Also see AllowedKmemSpace and
              MaxKmemPercent.


       ConstrainRAMSpace=<yes|no>
              If configured to "yes" then constrain the job's RAM usage by
              setting the memory soft limit to the allocated memory and the
              hard limit to the allocated memory * (AllowedRAMSpace/100). The
              default value is "no", in which case the job's RAM limit will be
              set to its swap space limit if ConstrainSwapSpace is set to
              "yes". Also see AllowedSwapSpace, AllowedRAMSpace and
              ConstrainSwapSpace.

              NOTE: When using ConstrainRAMSpace, if a process tries to
              consume more memory than is available, the step that process is
              running in will be killed. This differs from the behavior when
              using OverMemoryKill, where just the offending process will be
              killed.

              NOTE: When enabled, ConstrainRAMSpace can lead to a noticeable
              decline in per-node job throughput. Sites with high-throughput
              requirements should carefully weigh the tradeoff between
              per-node throughput and the potential problems that can arise
              from unconstrained memory usage on the node. See
              <https://slurm.schedmd.com/high_throughput.html> for further
              discussion.


       ConstrainSwapSpace=<yes|no>
              If configured to "yes" then constrain the job's swap space
              usage. The default value is "no". Note that when set to "yes"
              and ConstrainRAMSpace is set to "no", AllowedRAMSpace is
              automatically set to 100% in order to limit the RAM+Swap amount
              to 100% of the job's requirement plus the percent of allowed
              swap space. This amount is thus set to both RAM and RAM+Swap
              limits. This means that in that particular case,
              ConstrainRAMSpace is automatically enabled with the same limit
              as the one used to constrain swap space. Also see
              AllowedSwapSpace.


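              The interaction between ConstrainRAMSpace and ConstrainSwapSpace
              can be sketched as follows. This is an illustration in Python
              based on the descriptions in this page, not Slurm's
              implementation. The function name memory_limits_mb is
              hypothetical, and the sketch assumes that the RAM+Swap limit is
              the allocated memory times (AllowedRAMSpace + AllowedSwapSpace)
              / 100.

```python
def memory_limits_mb(allocated_mb, constrain_ram, constrain_swap,
                     allowed_ram_space=100.0, allowed_swap_space=0.0):
    """Return (ram_limit_mb, ram_plus_swap_limit_mb); None = unconstrained."""
    if not constrain_swap:
        ram = allocated_mb * allowed_ram_space / 100.0 if constrain_ram else None
        return ram, None
    if not constrain_ram:
        # AllowedRAMSpace is treated as 100% and the same amount is applied
        # to both the RAM limit and the RAM+Swap limit.
        both = allocated_mb * (100.0 + allowed_swap_space) / 100.0
        return both, both
    ram = allocated_mb * allowed_ram_space / 100.0
    # Assumption: the swap percentage is added on top of the RAM percentage.
    ram_plus_swap = allocated_mb * (allowed_ram_space + allowed_swap_space) / 100.0
    return ram, ram_plus_swap


print(memory_limits_mb(4096, constrain_ram=False, constrain_swap=True,
                       allowed_swap_space=50.0))
print(memory_limits_mb(4096, constrain_ram=True, constrain_swap=True,
                       allowed_swap_space=50.0))
```

              In the first call both limits are 6144 MB. In the second call
              RAM stays at 4096 MB while RAM+Swap is 6144 MB.

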
       MaxRAMPercent=PERCENT
              Set an upper bound in percent of total RAM on the RAM constraint
              for a job. This will be the memory constraint applied to jobs
              that are not explicitly allocated memory by Slurm (i.e. Slurm's
              select plugin is not configured to manage memory allocations).
              The PERCENT may be an arbitrary floating point number. The
              default value is 100.


       MaxSwapPercent=PERCENT
              Set an upper bound (in percent of total RAM) on the amount of
              RAM+Swap that may be used for a job. This will be the swap limit
              applied to jobs on systems where memory is not being explicitly
              allocated to the job. The PERCENT may be an arbitrary floating
              point number between 0 and 100. The default value is 100.


       MaxKmemPercent=PERCENT
              Set an upper bound in percent of total RAM as the maximum Kmem
              for a job. The PERCENT may be an arbitrary floating point
              number. However, the product of MaxKmemPercent and the job's
              requested memory has to fall between MinKmemSpace and the job's
              requested memory; otherwise the boundary value is used. The
              default value is 100.


       MemorySwappiness=<number>
              Configure the kernel's priority for swapping out anonymous pages
              (such as program data) versus file cache pages for the job
              cgroup. Valid values are between 0 and 100, inclusive. A value
              of 0 prevents the kernel from swapping out program data. A value
              of 100 gives equal priority to swapping out file cache or
              anonymous pages. If not set, then the kernel's default
              swappiness value will be used. ConstrainSwapSpace must be set to
              yes in order for this parameter to be applied.


       MinKmemSpace=<number>
              Set a lower bound (in MB) on the memory limits defined by
              AllowedKmemSpace. The default limit is 30M.


       MinRAMSpace=<number>
              Set a lower bound (in MB) on the memory limits defined by
              AllowedRAMSpace and AllowedSwapSpace. This prevents accidentally
              creating a memory cgroup with such a low limit that slurmstepd
              is immediately killed due to lack of RAM. The default limit is
              30M.


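              The lower bound from MinRAMSpace and the upper bound from
              MaxRAMPercent can be pictured as a clamp on a computed limit.
              This is a minimal sketch in Python. The function name
              bounded_ram_limit_mb is hypothetical, and how Slurm combines the
              two bounds internally is an assumption here.

```python
def bounded_ram_limit_mb(requested_limit_mb, total_ram_mb,
                         max_ram_percent=100.0, min_ram_space_mb=30):
    """Clamp a RAM limit between MinRAMSpace and MaxRAMPercent of total RAM."""
    upper = total_ram_mb * max_ram_percent / 100.0
    # Assumption: if the two bounds conflict, the lower bound wins so that
    # slurmstepd always has a workable amount of memory.
    return max(min(requested_limit_mb, upper), min_ram_space_mb)


total = 65536  # a node with 64 GiB of RAM, expressed in MB
print(bounded_ram_limit_mb(10, total))      # raised to the 30 MB floor
print(bounded_ram_limit_mb(4096, total))    # already within both bounds
print(bounded_ram_limit_mb(100000, total, max_ram_percent=90.0))
```

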
       TaskAffinity=<yes|no>
              If configured to "yes" then set a default task affinity to bind
              each step task to a subset of the allocated cores using
              sched_setaffinity. The default value is "no".

              NOTE: The recommended configuration to bind steps to tasks is to
              set TaskPlugin=task/affinity,task/cgroup in the slurm.conf and
              set TaskAffinity=no with ConstrainCores=yes in the cgroup.conf.
              This setup uses the task/affinity plugin for setting the
              affinity of the tasks and uses the task/cgroup plugin to fence
              tasks into the specified resources, combining the best of both
              pieces.

              NOTE: This feature requires the Portable Hardware Locality
              (hwloc) library to be installed.

DISTRIBUTION-SPECIFIC NOTES

       Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
       memsw (swap) cgroups by default. To include them, add the following
       parameters to the kernel command line: cgroup_enable=memory swapaccount=1

       This can usually be placed in /etc/default/grub inside the
       GRUB_CMDLINE_LINUX variable. A command such as update-grub must be run
       after updating the file.
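
       Whether the memory controller is enabled can be read from
       /proc/cgroups, whose columns are subsys_name, hierarchy, num_cgroups
       and enabled. The following is a minimal sketch in Python. The function
       name enabled_controllers is hypothetical and the sample text is made
       up for illustration. On a real host the contents of /proc/cgroups
       would be passed in instead. The swapaccount setting is not visible in
       that file and is not checked here.

```python
def enabled_controllers(proc_cgroups_text):
    """Return the set of controllers whose 'enabled' column is 1."""
    enabled = set()
    for line in proc_cgroups_text.splitlines():
        # Skip the header line (it starts with "#") and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        if len(fields) >= 4 and fields[3] == "1":
            enabled.add(fields[0])
    return enabled


# Made-up sample in the /proc/cgroups layout, with memory disabled.
sample = """#subsys_name\thierarchy\tnum_cgroups\tenabled
cpuset\t3\t1\t1
memory\t0\t1\t0
devices\t5\t60\t1
"""
print(sorted(enabled_controllers(sample)))
print("memory enabled:", "memory" in enabled_controllers(sample))
```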

EXAMPLE

       /etc/slurm/cgroup.conf:
              This example cgroup.conf file shows a configuration that enables
              the more commonly used cgroup enforcement mechanisms.

              ###
              # Slurm cgroup support configuration file.
              ###
              CgroupAutomount=yes
              CgroupMountpoint=/sys/fs/cgroup
              ConstrainCores=yes
              ConstrainDevices=yes
              ConstrainKmemSpace=no        #avoid known Kernel issues
              ConstrainRAMSpace=yes
              ConstrainSwapSpace=yes
              TaskAffinity=no              #use task/affinity plugin instead

       /etc/slurm/slurm.conf:
              These are the entries required in slurm.conf to activate the
              cgroup enforcement mechanisms. Make sure that the node
              definitions in your slurm.conf closely match the configuration
              as shown by "slurmd -C". Either MemSpecLimit should be set or
              RealMemory should be defined with less than the actual amount of
              memory for a node to ensure that all system/non-job processes
              will have sufficient memory at all times. Sites should also
              configure pam_slurm_adopt to ensure users cannot escape the
              cgroups via ssh.

              ###
              # Slurm configuration entries for cgroups
              ###
              ProctrackType=proctrack/cgroup
              TaskPlugin=task/cgroup,task/affinity
              JobAcctGatherType=jobacct_gather/cgroup #optional for gathering metrics
              PrologFlags=Contain                     #X11 flag is also suggested

COPYING

       Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced
       at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2016 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

SEE ALSO

       slurm.conf(5)



April 2021                 Slurm Configuration File             cgroup.conf(5)