gres.conf(5)               Slurm Configuration File               gres.conf(5)

NAME

       gres.conf - Slurm configuration file for Generic RESource (GRES)
       management.

DESCRIPTION

       gres.conf is an ASCII file which describes the configuration of
       Generic RESource (GRES) on each compute node. If the GRES information
       in the slurm.conf file does not fully describe those resources, then a
       gres.conf file should be included on each compute node. The file
       location can be modified at system build time using the
       DEFAULT_SLURM_CONF parameter or at execution time by setting the
       SLURM_CONF environment variable. The file will always be located in
       the same directory as the slurm.conf file.

       If the GRES information in the slurm.conf file fully describes those
       resources (i.e. no "Cores", "File" or "Links" specification is
       required for that GRES type, or that information is automatically
       detected), that information may be omitted from the gres.conf file and
       only the configuration information in the slurm.conf file will be
       used. The gres.conf file may be omitted completely if the
       configuration information in the slurm.conf file fully describes all
       GRES.

       If using the gres.conf file to describe the resources available to
       nodes, the first parameter on the line should be NodeName. If
       configuring Generic Resources without specifying nodes, the first
       parameter on the line should be Name.

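       For example, a cluster where slurm.conf advertises four GPUs per node
       might pair the two files as follows (the node names, GPU type and
       device paths here are only illustrative, and other slurm.conf node
       parameters are omitted):

       # slurm.conf (excerpt)
       GresTypes=gpu
       NodeName=tux[0-15] Gres=gpu:tesla:4 ...

       # gres.conf
       NodeName=tux[0-15] Name=gpu Type=tesla File=/dev/nvidia[0-3]
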
       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of
       the command "scontrol reconfigure" unless otherwise noted.

       NOTE: Slurm support for gres/mps requires the use of the
       select/cons_tres plugin. For more information on how to configure MPS,
       see https://slurm.schedmd.com/gres.html#MPS_Management.

       For more information on GRES scheduling in general, see
       https://slurm.schedmd.com/gres.html.

       The overall configuration parameters available include:

       AutoDetect
              The hardware detection mechanisms to enable for automatic GRES
              configuration. Currently, the options are:

              nvml   Automatically detect NVIDIA GPUs.

              off    Do not automatically detect any GPUs. Used to override
                     other options.

              rsmi   Automatically detect AMD GPUs.

       AutoDetect can be on a line by itself, in which case it will apply
       globally to all lines in gres.conf by default. In addition, AutoDetect
       can be combined with NodeName to apply only to certain nodes. A
       node-specific AutoDetect overrides the global AutoDetect, and only
       needs to be specified once per node. If specified multiple times for
       the same node, the values must all be the same. To unset AutoDetect
       for a node when a global AutoDetect is set, simply set it to "off" in
       a node-specific GRES line. E.g.:
       NodeName=tux3 AutoDetect=off Name=gpu File=/dev/nvidia[0-3].

       Count  Number of resources of this type available on this node. The
              default value is set to the number of File values specified
              (if any), otherwise the default value is one. A suffix of "K",
              "M", "G", "T" or "P" may be used to multiply the number by
              1024, 1048576, 1073741824, etc. respectively. For example:
              "Count=10G".

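              For instance, the following two lines are equivalent, since the
              four device files imply Count=4 (device paths shown only as an
              illustration):

              Name=gpu File=/dev/nvidia[0-3]
              Name=gpu File=/dev/nvidia[0-3] Count=4
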
       Cores  Optionally specify the core index numbers for the specific
              cores which can use this resource. For example, it may be
              strongly preferable to use specific cores with specific GRES
              devices (e.g. on a NUMA architecture). While Slurm can track
              and assign resources at the CPU or thread level, the scheduling
              algorithms used to co-allocate GRES devices with CPUs operate
              at a socket or NUMA level. Therefore it is not possible to
              preferentially assign GRES to different specific CPUs within
              the same NUMA node or socket, and this option should be used to
              identify all cores on some socket.

              Multiple cores may be specified using a comma-delimited list,
              or a range may be specified using a "-" separator (e.g.
              "0,1,2,3" or "0-3"). If a job specifies
              --gres-flags=enforce-binding, then only the identified cores
              can be allocated with each generic resource. This will tend to
              improve performance of jobs, but delay the allocation of
              resources to them. If specified and a job is not submitted with
              the --gres-flags=enforce-binding option, the identified cores
              will be preferred for scheduling with each generic resource.

              If --gres-flags=disable-binding is specified, then any core can
              be used with the resources, which also increases the speed of
              Slurm's scheduling algorithm but can degrade the application
              performance. The --gres-flags=disable-binding option is
              currently required to use more CPUs than are bound to a GRES
              (i.e. if a GPU is bound to the CPUs on one socket, but
              resources on more than one socket are required to run the job).
              If any core can be effectively used with the resources, then do
              not specify the Cores option for improved speed in the Slurm
              scheduling logic. A restart of the slurmctld is needed for
              changes to the Cores option to take effect.

              NOTE: Since Slurm must be able to perform resource management
              on heterogeneous clusters having various processing unit
              numbering schemes, a logical core index must be specified
              instead of the physical core index. That logical core index
              might not correspond to your physical core index number. Core 0
              will be the first core on the first socket, while core 1 will
              be the second core on the first socket. This numbering
              coincides with the logical core number (Core L#) seen in the
              output of the "lstopo -l" command.

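              As an illustration (node layout, device paths and core ranges
              are hypothetical), a node with two GPUs, each attached to one
              of two 8-core sockets, could be described as:

              Name=gpu Type=tesla File=/dev/nvidia0 Cores=0-7
              Name=gpu Type=tesla File=/dev/nvidia1 Cores=8-15
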
       File   Fully qualified pathname of the device files associated with a
              resource. The name can include a numeric range suffix to be
              interpreted by Slurm (e.g. File=/dev/nvidia[0-3]).

              This field is generally required if enforcement of generic
              resource allocations is to be supported (i.e. prevents users
              from making use of resources allocated to a different user).
              Enforcement of the file allocation relies upon Linux Control
              Groups (cgroups) and Slurm's task/cgroup plugin, which will
              place the allocated files into the job's cgroup and prevent use
              of other files. Please see Slurm's Cgroups Guide for more
              information: https://slurm.schedmd.com/cgroups.html.

              If File is specified then Count must be either set to the
              number of file names specified or not set (the default value is
              the number of files specified). The exception to this is MPS.
              For MPS, each GPU would be identified by device file using the
              File parameter and Count would specify the number of MPS
              entries that would correspond to that GPU (typically 100 or
              some multiple of 100).

              NOTE: If you specify the File parameter for a resource on some
              node, the option must be specified on all nodes and Slurm will
              track the assignment of each specific resource on each node.
              Otherwise Slurm will only track a count of allocated resources
              rather than the state of each individual device file.

              NOTE: Drain a node before changing the count of records with
              File parameters (i.e. if you want to add or remove GPUs from a
              node's configuration). Failure to do so will result in any job
              using those GRES being aborted.

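              For example (device paths are illustrative), the following
              lines describe two GPUs plus 100 MPS entries on each of them,
              with each Count matching the File specification as described
              above:

              Name=gpu File=/dev/nvidia[0-1]
              Name=mps File=/dev/nvidia0 Count=100
              Name=mps File=/dev/nvidia1 Count=100
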
       Flags  Optional flags that can be specified to change the configured
              behavior of the GRES.

              Allowed values at present are:

              CountOnly           Do not attempt to load the plugin, as this
                                  GRES will only be used to track counts of
                                  GRES used. This avoids attempting to load a
                                  non-existent plugin, which can affect
                                  filesystems with high-latency metadata
                                  operations for non-existent files.

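              As a sketch (the resource name and count are hypothetical), a
              purely countable resource with no device files could be
              declared as:

              Name=bandwidth Count=4G Flags=CountOnly
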
       Links  A comma-delimited list of numbers identifying the number of
              connections between this device and other devices to allow
              coscheduling of better-connected devices. This is an ordered
              list in which the number of connections this specific device
              has to device number 0 would be in the first position, the
              number of connections it has to device number 1 in the second
              position, etc. A -1 indicates the device itself and a 0
              indicates no connection. If specified, then this line can only
              contain a single GRES device (i.e. can only contain a single
              file via File).

              This is an optional value and is usually automatically
              determined if AutoDetect is enabled. A typical use case would
              be to identify GPUs having NVLink connectivity. Note that for
              GPUs, the minor number assigned by the OS and used in the
              device file (i.e. the X in /dev/nvidiaX) is not necessarily the
              same as the device number/index. The device number is created
              by sorting the GPUs by PCI bus ID and then numbering them
              starting from the smallest bus ID. See
              https://slurm.schedmd.com/gres.html#GPU_Management.

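              As an illustrative sketch (device paths and link counts are
              hypothetical), two GPUs joined by two NVLink connections could
              be described as:

              Name=gpu Type=tesla File=/dev/nvidia0 Links=-1,2
              Name=gpu Type=tesla File=/dev/nvidia1 Links=2,-1
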
       Name   Name of the generic resource. Any desired name may be used. The
              name must match a value in GresTypes in slurm.conf. Each
              generic resource has an optional plugin which can provide
              resource-specific functionality. Generic resources that
              currently include an optional plugin are:

              gpu    Graphics Processing Unit

              mps    CUDA Multi-Process Service (MPS)

              nic    Network Interface Card

              mic    Intel Many Integrated Core (MIC) processor

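              For example (the device path and MPS count are illustrative),
              every Name used in gres.conf must also appear in the GresTypes
              list in slurm.conf:

              # slurm.conf
              GresTypes=gpu,mps

              # gres.conf
              Name=gpu File=/dev/nvidia0
              Name=mps File=/dev/nvidia0 Count=100
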
       NodeName
              An optional NodeName specification can be used to permit one
              gres.conf file to be used for all compute nodes in a cluster by
              specifying the node(s) that each line should apply to. The
              NodeName specification can use a Slurm hostlist specification
              as shown in the example below.

       Type   An optional arbitrary string identifying the type of device.
              For example, this might be used to identify a specific model of
              GPU, which users can then specify in a job request. If Type is
              specified, then Count is limited in size (currently 1024).

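              For example (the node, type and device names are illustrative),
              a typed GPU defined as

              NodeName=tux0 Name=gpu Type=tesla File=/dev/nvidia0

              can then be requested by a job with an option such as
              "--gres=gpu:tesla:1".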

EXAMPLES

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Define GPU devices with MPS support, with AutoDetect sanity checking
       ##################################################################
       AutoDetect=nvml
       Name=gpu Type=gtx560 File=/dev/nvidia0 COREs=0,1
       Name=gpu Type=tesla  File=/dev/nvidia1 COREs=2,3
       Name=mps Count=100 File=/dev/nvidia0 COREs=0,1
       Name=mps Count=100 File=/dev/nvidia1 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Overwrite system defaults and explicitly configure three GPUs
       ##################################################################
       Name=gpu Type=tesla File=/dev/nvidia[0-1] COREs=0,1
       # Name=gpu Type=tesla File=/dev/nvidia[2-3] COREs=2,3
       # NOTE: nvidia2 device is out of service
       Name=gpu Type=tesla File=/dev/nvidia3 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use a single gres.conf file for all compute nodes - positive method
       ##################################################################
       ## Explicitly specify devices on nodes tux0-tux15
       # NodeName=tux[0-15] Name=gpu File=/dev/nvidia[0-3]
       # NOTE: tux3 nvidia1 device is out of service
       NodeName=tux[0-2] Name=gpu File=/dev/nvidia[0-3]
       NodeName=tux3 Name=gpu File=/dev/nvidia[0,2-3]
       NodeName=tux[4-15] Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use NVML to gather GPU configuration information
       # for all nodes except one
       ##################################################################
       AutoDetect=nvml
       NodeName=tux3 AutoDetect=off Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Specify some nodes with NVML, some with RSMI, and some with no
       # AutoDetect
       ##################################################################
       NodeName=tux[0-7] AutoDetect=nvml
       NodeName=tux[8-11] AutoDetect=rsmi
       NodeName=tux[12-15] Name=gpu File=/dev/nvidia[0-3]

COPYING

       Copyright (C) 2010 The Regents of the University of California.
       Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2019 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

SEE ALSO

       slurm.conf(5)

October 2020               Slurm Configuration File               gres.conf(5)