cgroup.conf(5)             Slurm Configuration File             cgroup.conf(5)

NAME

       cgroup.conf - Slurm configuration file for the cgroup support

DESCRIPTION

       cgroup.conf is an ASCII file which defines parameters used by Slurm's
       Linux cgroup related plugins. The file location can be modified at
       system build time using the DEFAULT_SLURM_CONF parameter or at
       execution time by setting the SLURM_CONF environment variable. The file
       will always be located in the same directory as the slurm.conf file.

       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the
       command "scontrol reconfigure" unless otherwise noted.


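       The comment and case rules above can be illustrated with a short
       Python sketch. It is not Slurm's parser: the function name
       parse_cgroup_conf and the sample text are hypothetical, and real files
       may need handling that this sketch omits.

```python
def parse_cgroup_conf(text):
    """Parse cgroup.conf-style text into a dict keyed by lower-cased name.

    Illustrative only: assumes one Name=Value pair per line.
    """
    params = {}
    for raw in text.splitlines():
        # Anything after "#" is a comment through the end of the line.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Parameter names are case insensitive, so normalise them.
        params[name.strip().lower()] = value.strip()
    return params


sample = """
###
# Slurm cgroup support configuration file.
###
ConstrainCores=yes
CONSTRAINRAMSPACE=yes   # same parameter, different case
AllowedRAMSpace=101.5
"""

conf = parse_cgroup_conf(sample)
print(conf)
```

       Lower-casing the names is one simple way to make lookups case
       insensitive; values are kept as written.
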
       For general Slurm cgroups information, see the Cgroups Guide at
       <https://slurm.schedmd.com/cgroups.html>.


       The following cgroup.conf parameters are defined to control the general
       behavior of Slurm cgroup plugins.


       CgroupAutomount=<yes|no>
              Slurm cgroup plugins require a valid and functional cgroup
              subsystem to be mounted under /sys/fs/cgroup/<subsystem_name>.
              When launched, plugins check their subsystem availability. If
              not available, the plugin launch fails unless CgroupAutomount is
              set to yes. In that case, the plugin will first try to mount the
              required subsystems.


       CgroupMountpoint=PATH
              Specify the PATH under which cgroups should be mounted. This
              should be a writable directory which will contain cgroups
              mounted one per subsystem. The default PATH is /sys/fs/cgroup.


       CgroupPlugin=<cgroup/v1|autodetect>
              Specify the plugin to be used when interacting with the cgroup
              subsystem. The only supported values at the moment are
              "cgroup/v1", which supports the legacy interface of cgroup v1,
              and "autodetect", which tries to determine which cgroup version
              your system provides. This is useful if nodes have support for
              different cgroup versions. The default value is "autodetect".

TASK/CGROUP PLUGIN

       The following cgroup.conf parameters are defined to control the
       behavior of this particular plugin:


       AllowedKmemSpace=<number>
              Constrain the job cgroup kernel memory to this amount of the
              allocated memory, specified in bytes. The AllowedKmemSpace must
              be between the upper and lower memory limits, specified by
              MaxKmemPercent and MinKmemSpace, respectively. If
              AllowedKmemSpace goes beyond the upper or lower limit, it will
              be reset to that upper or lower limit, whichever has been
              exceeded.


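              As an illustration of the clamping described above, the
              following Python sketch resets an out-of-range value to the
              nearest bound. It assumes the upper bound is MaxKmemPercent
              percent of the job's allocated memory and that one MB of
              MinKmemSpace is 2**20 bytes. The function name clamp_kmem is
              hypothetical and this is not Slurm's implementation.

```python
MB = 1024 * 1024  # assumption: "MB" here means 2**20 bytes


def clamp_kmem(allowed_kmem_bytes, alloc_mem_bytes,
               max_kmem_percent=100.0, min_kmem_mb=30):
    """Return AllowedKmemSpace reset to the nearest bound if out of range."""
    # Assumed upper bound: MaxKmemPercent percent of the allocated memory.
    upper = int(alloc_mem_bytes * max_kmem_percent / 100.0)
    # Lower bound: MinKmemSpace, given in MB, converted to bytes.
    lower = min_kmem_mb * MB
    return max(lower, min(allowed_kmem_bytes, upper))


alloc = 4096 * MB
print(clamp_kmem(1024 * MB, alloc, max_kmem_percent=50) // MB)  # 1024
print(clamp_kmem(3072 * MB, alloc, max_kmem_percent=50) // MB)  # 2048
print(clamp_kmem(1 * MB, alloc, max_kmem_percent=50) // MB)     # 30
```

              The three calls show a value inside the range, one above the
              upper limit, and one below the lower limit.
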
       AllowedRAMSpace=<number>
              Constrain the job/step cgroup RAM to this percentage of the
              allocated memory. The percentage supplied may be expressed as a
              floating point number, e.g. 101.5. Sets the cgroup soft memory
              limit at the allocated memory size and then sets the job/step
              hard memory limit at (AllowedRAMSpace/100) * allocated memory.
              If the job/step exceeds the hard limit, then it might trigger
              Out Of Memory (OOM) events (including oom-kill), which will be
              logged to the kernel log ring buffer (dmesg in Linux). Setting
              AllowedRAMSpace above 100 may cause system Out of Memory (OOM)
              events, as it allows the job/step to allocate more memory than
              is configured on the nodes. Reducing the configured node
              available memory to avoid system OOM events is suggested.
              Setting AllowedRAMSpace below 100 will result in jobs receiving
              less memory than allocated, and the soft memory limit will be
              set to the same value as the hard limit. Also see
              ConstrainRAMSpace. The default value is 100.


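              The soft and hard limit arithmetic described above can be
              sketched in Python. This is only a worked example of the
              formula in the text, not Slurm code; the function name
              ram_limits and the use of MB as the unit are assumptions made
              for illustration.

```python
def ram_limits(alloc_mem_mb, allowed_ram_space=100.0):
    """Return (soft, hard) memory limits in MB for a job/step."""
    # Hard limit: (AllowedRAMSpace / 100) * allocated memory.
    hard = alloc_mem_mb * allowed_ram_space / 100.0
    # Soft limit: the allocated memory, or the hard limit when
    # AllowedRAMSpace is below 100.
    soft = min(alloc_mem_mb, hard)
    return soft, hard


print(ram_limits(4000, 101.5))  # (4000, 4060.0)
print(ram_limits(4000, 80))     # (3200.0, 3200.0)
```

              With 4000 MB allocated, a value of 101.5 leaves the soft limit
              at the allocation and raises the hard limit slightly, while a
              value of 80 lowers both limits to the same figure.
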
       AllowedSwapSpace=<number>
              Constrain the job cgroup swap space to this percentage of the
              allocated memory. The default value is 0, which means that
              RAM+Swap will be limited to AllowedRAMSpace. The supplied
              percentage may be expressed as a floating point number, e.g.
              50.5. If the limit is exceeded, the job steps will be killed and
              a warning message will be written to standard error. Also see
              ConstrainSwapSpace. NOTE: Setting AllowedSwapSpace to 0 does not
              restrict the Linux kernel from using swap space. To control how
              the kernel uses swap space, see MemorySwappiness.


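              A small Python sketch of the RAM+Swap arithmetic follows. It
              assumes the combined limit is the allocated memory scaled by
              the sum of AllowedRAMSpace and AllowedSwapSpace, which is one
              reading of the text above rather than a statement of Slurm's
              code. The function name memsw_limit is hypothetical.

```python
def memsw_limit(alloc_mem_mb, allowed_ram_space=100.0,
                allowed_swap_space=0.0):
    """Return the assumed RAM+Swap limit in MB."""
    # Assumption: the swap percentage is added on top of the RAM percentage.
    total_percent = allowed_ram_space + allowed_swap_space
    return alloc_mem_mb * total_percent / 100.0


print(memsw_limit(4000))                           # 4000.0
print(memsw_limit(4000, allowed_swap_space=50.5))  # 6020.0
```

              With the default AllowedSwapSpace of 0 the RAM+Swap limit equals
              the RAM limit, matching the description of the default.
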
       ConstrainCores=<yes|no>
              If configured to "yes" then constrain allowed cores to the
              subset of allocated resources. This functionality makes use of
              the cpuset subsystem. Due to a bug fixed in version 1.11.5 of
              HWLOC, the task/affinity plugin may be required in addition to
              task/cgroup for this to function properly. The default value is
              "no".


       ConstrainDevices=<yes|no>
              If configured to "yes" then constrain the job's allowed devices
              based on GRES allocated resources. It uses the devices subsystem
              for that. The default value is "no".


       ConstrainKmemSpace=<yes|no>
              If configured to "yes" then constrain the job's Kmem RAM usage
              in addition to RAM usage. Only takes effect if ConstrainRAMSpace
              is set to "yes". If enabled, the job's Kmem limit will be
              assigned the value of AllowedKmemSpace or the value coming from
              MaxKmemPercent. The default value is "no" which will leave Kmem
              setting untouched by Slurm. Also see AllowedKmemSpace,
              MaxKmemPercent.


       ConstrainRAMSpace=<yes|no>
              If configured to "yes" then constrain the job's RAM usage by
              setting the memory soft limit to the allocated memory and the
              hard limit to the allocated memory * (AllowedRAMSpace/100). The
              default value is "no", in which case the job's RAM limit will be
              set to its swap space limit if ConstrainSwapSpace is set to
              "yes". Also see AllowedSwapSpace, AllowedRAMSpace and
              ConstrainSwapSpace.

              NOTE: When using ConstrainRAMSpace, if the combined memory used
              by all processes in a step is greater than the limit, then the
              kernel will trigger an OOM event, killing one or more of the
              processes in the step. The step state will be marked as OOM, but
              the step itself will keep running and other processes in the
              step may continue to run as well. This differs from the behavior
              of OverMemoryKill, where the whole step will be
              killed/cancelled. It also differs in that the memory usage is
              checked on a per-process basis by the JobAcctGather polling
              system.

              NOTE: When enabled, ConstrainRAMSpace can lead to a noticeable
              decline in per-node job throughput. Sites with high-throughput
              requirements should carefully weigh the tradeoff between
              per-node throughput and potential problems that can arise from
              unconstrained memory usage on the node. See
              <https://slurm.schedmd.com/high_throughput.html> for further
              discussion.


       ConstrainSwapSpace=<yes|no>
              If configured to "yes" then constrain the job's swap space
              usage. The default value is "no". Note that when set to "yes"
              and ConstrainRAMSpace is set to "no", AllowedRAMSpace is
              automatically set to 100% in order to limit the RAM+Swap amount
              to 100% of the job's requirement plus the percent of allowed
              swap space. This amount is thus set to both RAM and RAM+Swap
              limits. This means that in that particular case,
              ConstrainRAMSpace is automatically enabled with the same limit
              as the one used to constrain swap space. Also see
              AllowedSwapSpace.


       MaxRAMPercent=PERCENT
              Set an upper bound in percent of total RAM on the RAM constraint
              for a job. This will be the memory constraint applied to jobs
              that are not explicitly allocated memory by Slurm (i.e. Slurm's
              select plugin is not configured to manage memory allocations).
              The PERCENT may be an arbitrary floating point number. The
              default value is 100.


       MaxSwapPercent=PERCENT
              Set an upper bound (in percent of total RAM) on the amount of
              RAM+Swap that may be used for a job. This will be the swap limit
              applied to jobs on systems where memory is not being explicitly
              allocated to the job. The PERCENT may be an arbitrary floating
              point number between 0 and 100. The default value is 100.


       MaxKmemPercent=PERCENT
              Set an upper bound in percent of total RAM as the maximum Kmem
              for a job. The PERCENT may be an arbitrary floating point
              number; however, the product of MaxKmemPercent and job requested
              memory has to fall between MinKmemSpace and job requested
              memory, otherwise the boundary value is used. The default value
              is 100.


       MemorySwappiness=<number>
              Configure the kernel's priority for swapping out anonymous pages
              (such as program data) versus file cache pages for the job
              cgroup. Valid values are between 0 and 100, inclusive. A value
              of 0 prevents the kernel from swapping out program data. A value
              of 100 gives equal priority to swapping out file cache or
              anonymous pages. If not set, then the kernel's default
              swappiness value will be used. ConstrainSwapSpace must be set to
              yes in order for this parameter to be applied.


       MinKmemSpace=<number>
              Set a lower bound (in MB) on the memory limits defined by
              AllowedKmemSpace. The default limit is 30M.


       MinRAMSpace=<number>
              Set a lower bound (in MB) on the memory limits defined by
              AllowedRAMSpace and AllowedSwapSpace. This prevents accidentally
              creating a memory cgroup with such a low limit that slurmstepd
              is immediately killed due to lack of RAM. The default limit is
              30M.

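              The interaction of the lower bound (MinRAMSpace) with the upper
              bound (MaxRAMPercent) can be sketched in Python. The order in
              which the bounds are applied is an assumption, as is the
              function name bounded_ram_limit; the sketch only illustrates
              the bounds described in this section and is not Slurm's
              implementation.

```python
def bounded_ram_limit(alloc_mem_mb, total_ram_mb, allowed_ram_space=100.0,
                      max_ram_percent=100.0, min_ram_space_mb=30):
    """Return a RAM limit in MB kept within the configured bounds."""
    # Requested limit from AllowedRAMSpace (a percentage of the allocation).
    limit = alloc_mem_mb * allowed_ram_space / 100.0
    # Upper bound: MaxRAMPercent percent of the node's total RAM.
    upper = total_ram_mb * max_ram_percent / 100.0
    # Assumed order: cap at the upper bound, then enforce MinRAMSpace.
    return max(min_ram_space_mb, min(limit, upper))


print(bounded_ram_limit(10, 64000))                         # 30
print(bounded_ram_limit(4000, 64000))                       # 4000.0
print(bounded_ram_limit(64000, 64000, max_ram_percent=90))  # 57600.0
```

              The first call shows the case MinRAMSpace guards against: a
              10 MB allocation is raised to the 30 MB floor.
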

DISTRIBUTION-SPECIFIC NOTES

       Debian and derivatives (e.g. Ubuntu) usually exclude the memory and
       memsw (swap) cgroups by default. To include them, add the following
       parameters to the kernel command line: cgroup_enable=memory
       swapaccount=1

       This can usually be placed in /etc/default/grub inside the
       GRUB_CMDLINE_LINUX variable. A command such as update-grub must be run
       after updating the file.

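       To check whether a kernel command line already carries both
       parameters, a helper along these lines may be useful. It is a minimal
       Python sketch: the function name missing_cgroup_boot_params and the
       sample command line are hypothetical.

```python
def missing_cgroup_boot_params(cmdline):
    """Return the required kernel parameters absent from cmdline."""
    required = ("cgroup_enable=memory", "swapaccount=1")
    # Kernel command-line parameters are separated by whitespace.
    present = set(cmdline.split())
    return [param for param in required if param not in present]


example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet cgroup_enable=memory"
print(missing_cgroup_boot_params(example))  # ['swapaccount=1']
```

       On a running Linux system the current command line can be read from
       /proc/cmdline and passed to the function.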

EXAMPLE

       /etc/slurm/cgroup.conf:
              This example cgroup.conf file shows a configuration that enables
              the more commonly used cgroup enforcement mechanisms.

              ###
              # Slurm cgroup support configuration file.
              ###
              CgroupAutomount=yes
              CgroupMountpoint=/sys/fs/cgroup
              ConstrainCores=yes
              ConstrainDevices=yes
              ConstrainKmemSpace=no        #avoid known Kernel issues
              ConstrainRAMSpace=yes
              ConstrainSwapSpace=yes

       /etc/slurm/slurm.conf:
              These are the entries required in slurm.conf to activate the
              cgroup enforcement mechanisms. Make sure that the node
              definitions in your slurm.conf closely match the configuration
              as shown by "slurmd -C". Either MemSpecLimit should be set or
              RealMemory should be defined with less than the actual amount of
              memory for a node to ensure that all system/non-job processes
              will have sufficient memory at all times. Sites should also
              configure pam_slurm_adopt to ensure users cannot escape the
              cgroups via ssh.

              ###
              # Slurm configuration entries for cgroups
              ###
              ProctrackType=proctrack/cgroup
              TaskPlugin=task/cgroup,task/affinity
              JobAcctGatherType=jobacct_gather/cgroup #optional for gathering metrics
              PrologFlags=Contain                     #X11 flag is also suggested

COPYING

       Copyright (C) 2010-2012 Lawrence Livermore National Security. Produced
       at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2021 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

SEE ALSO

       slurm.conf(5)



October 2021               Slurm Configuration File             cgroup.conf(5)