gres.conf(5)               Slurm Configuration File               gres.conf(5)


NAME
       gres.conf - Slurm configuration file for Generic RESource (GRES)
       management.


DESCRIPTION
       gres.conf is an ASCII file which describes the configuration of
       Generic RESource (GRES) on each compute node. If the GRES information
       in the slurm.conf file does not fully describe those resources, then a
       gres.conf file should be included on each compute node. The file
       location can be modified at system build time using the
       DEFAULT_SLURM_CONF parameter or at execution time by setting the
       SLURM_CONF environment variable. The file will always be located in
       the same directory as the slurm.conf file.
       If the GRES information in the slurm.conf file fully describes those
       resources (i.e. no "Cores", "File" or "Links" specification is
       required for that GRES type or that information is automatically
       detected), that information may be omitted from the gres.conf file
       and only the configuration information in the slurm.conf file will be
       used. The gres.conf file may be omitted completely if the
       configuration information in the slurm.conf file fully describes all
       GRES.

       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of
       the command "scontrol reconfigure" unless otherwise noted.

       NOTE: Slurm support for gres/mps requires the use of the
       select/cons_tres plugin. For more information on how to configure
       MPS, see https://slurm.schedmd.com/gres.html#MPS_Management.

       For more information on GRES scheduling in general, see
       https://slurm.schedmd.com/gres.html.

       The overall configuration parameters available include:
       AutoDetect
              The hardware detection mechanisms to enable for automatic GRES
              configuration. This should be on a line by itself. Currently,
              the only valid option is nvml, which allows for automatically
              detecting NVIDIA GPUs.
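              For instance, on a node where NVML can discover every GPU, a
              minimal gres.conf may consist of this single directive (as in
              the final example in the EXAMPLES section below); the GPU
              device files and their Links are then detected rather than
              listed by hand:

              # Minimal configuration; GPU details come from NVML detection
              AutoDetect=nvml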
       Count  Number of resources of this type available on this node. The
              default value is set to the number of File values specified
              (if any), otherwise the default value is one. A suffix of "K",
              "M", "G", "T" or "P" may be used to multiply the number by
              1024, 1048576, 1073741824, etc. respectively. For example:
              "Count=10G".
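              As an illustrative sketch (the resource name "bandwidth" is
              hypothetical and must also be listed in GresTypes in
              slurm.conf), the suffixes allow large counts to be written
              compactly:

              # A count-only GRES with no File or Cores binding
              # 4G = 4 x 1073741824 units
              Name=bandwidth Count=4G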
       Cores  Optionally specify the first thread CPU index numbers for the
              specific cores which can use this resource. For example, it
              may be strongly preferable to use specific cores with specific
              GRES devices (e.g. on a NUMA architecture). While Slurm can
              track and assign resources at the CPU or thread level, the
              scheduling algorithms used to co-allocate GRES devices with
              CPUs operate at a socket or NUMA level. Therefore it is not
              possible to preferentially assign different GRES to specific
              CPUs within the same socket or NUMA node, and this option
              should be used to identify all cores on some socket.

              Multiple cores may be specified using a comma-delimited list,
              or a range may be specified using a "-" separator (e.g.
              "0,1,2,3" or "0-3"). If a job specifies
              --gres-flags=enforce-binding, then only the identified cores
              can be allocated with each generic resource. This will tend to
              improve performance of jobs, but delay the allocation of
              resources to them. If Cores is specified and a job is not
              submitted with the --gres-flags=enforce-binding option, the
              identified cores will be preferred for scheduling with each
              generic resource.

              If --gres-flags=disable-binding is specified, then any core
              can be used with the resources, which also increases the speed
              of Slurm's scheduling algorithm but can degrade the
              application performance. The --gres-flags=disable-binding
              option is currently required to use more CPUs than are bound
              to a GRES (i.e. if a GPU is bound to the CPUs on one socket,
              but resources on more than one socket are required to run the
              job). If any core can be effectively used with the resources,
              then do not specify the Cores option, for improved speed in
              the Slurm scheduling logic. A restart of the slurmctld is
              needed for changes to the Cores option to take effect.

              NOTE: If your cores contain multiple threads, only the first
              thread (processing unit) of each core needs to be listed.
              Also note that since Slurm must be able to perform resource
              management on heterogeneous clusters having various processing
              unit numbering schemes, a logical processing unit index must
              be specified instead of the physical processing unit index.
              That processing unit logical index might not correspond to
              your physical index number. Processing unit 0 will be the
              first socket, first core and (if configured) first thread. If
              hyperthreading is enabled, processing unit 1 will always be
              the first socket, first core and second thread. If
              hyperthreading is not enabled, processing unit 1 will always
              be the first socket and second core. This numbering coincides
              with the processing unit logical number (PU L#) seen in
              "lstopo -l" command output.
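              A sketch of socket-level binding, assuming a hypothetical
              two-socket node with eight cores per socket (logical
              processing units 0-7 and 8-15) and one GPU attached to each
              socket:

              # List every core on the socket local to each GPU, using the
              # first thread of each core only
              Name=gpu Type=tesla File=/dev/nvidia0 Cores=0-7
              Name=gpu Type=tesla File=/dev/nvidia1 Cores=8-15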
       File   Fully qualified pathname of the device files associated with a
              resource. The name can include a numeric range suffix to be
              interpreted by Slurm (e.g. File=/dev/nvidia[0-3]).

              This field is generally required if enforcement of generic
              resource allocations is to be supported (i.e. prevents users
              from making use of resources allocated to a different user).
              Enforcement of the file allocation relies upon Linux Control
              Groups (cgroups) and Slurm's task/cgroup plugin, which will
              place the allocated files into the job's cgroup and prevent
              use of other files. Please see Slurm's Cgroups Guide for more
              information: https://slurm.schedmd.com/cgroups.html.

              If File is specified then Count must be either set to the
              number of file names specified or not set (the default value
              is the number of files specified). The exception to this is
              MPS. For MPS, each GPU would be identified by device file
              using the File parameter and Count would specify the number of
              MPS entries that would correspond to that GPU (typically 100
              or some multiple of 100).

              NOTE: If you specify the File parameter for a resource on some
              node, the option must be specified on all nodes and Slurm will
              track the assignment of each specific resource on each node.
              Otherwise Slurm will only track a count of allocated resources
              rather than the state of each individual device file.

              NOTE: Drain a node before changing the count of records with
              File parameters (i.e. if you want to add or remove GPUs from a
              node's configuration). Failure to do so will result in any job
              using those GRES being aborted.
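              Two sketches of the File/Count relationship described above,
              assuming a hypothetical four-GPU node. With a bracketed range,
              Count defaults to the number of expanded device files; an MPS
              line names a single GPU device file and a Count that is
              typically a multiple of 100:

              # Count defaults to 4, one per expanded device file
              Name=gpu File=/dev/nvidia[0-3]
              # 100 MPS entries backed by the GPU at /dev/nvidia0
              Name=mps Count=100 File=/dev/nvidia0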
       Links  A comma-delimited list of numbers identifying the number of
              connections between this device and other devices to allow
              coscheduling of better connected devices. This is an ordered
              list in which the number of connections this specific device
              has to device number 0 would be in the first position, the
              number of connections it has to device number 1 in the second
              position, etc. A -1 indicates the device itself and a 0
              indicates no connection. If specified, then this line can only
              contain a single GRES device (i.e. can only contain a single
              file via File).

              This is an optional value and is usually automatically
              determined if AutoDetect is enabled. A typical use case would
              be to identify GPUs having NVLink connectivity. Note that for
              GPUs, the minor number assigned by the OS and used in the
              device file (i.e. the X in /dev/nvidiaX) is not necessarily
              the same as the device number/index. The device number is
              created by sorting the GPUs by PCI bus ID and then numbering
              them starting from the smallest bus ID. See
              https://slurm.schedmd.com/gres.html#GPU_Management.
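              An illustrative sketch for a hypothetical four-GPU node in
              which device 0 has two NVLink connections to device 1 and no
              connection to devices 2 and 3 (the connection counts are
              invented for the example; with AutoDetect=nvml this list is
              normally filled in automatically). A line with Links can
              describe only a single device:

              # Positions refer to device numbers 0, 1, 2 and 3 in order
              Name=gpu File=/dev/nvidia0 Links=-1,2,0,0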
       Name   Name of the generic resource. Any desired name may be used.
              The name must match a value in GresTypes in slurm.conf. Each
              generic resource has an optional plugin which can provide
              resource-specific functionality. Generic resources that
              currently include an optional plugin are:

              gpu    Graphics Processing Unit

              mps    CUDA Multi-Process Service (MPS)

              nic    Network Interface Card

              mic    Intel Many Integrated Core (MIC) processor

       NodeName
              An optional NodeName specification can be used to permit one
              gres.conf file to be used for all compute nodes in a cluster
              by specifying the node(s) that each line should apply to. The
              NodeName specification can use a Slurm hostlist specification
              as shown in the example below.

       Type   An optional arbitrary string identifying the type of device.
              For example, this might be used to identify a specific model
              of GPU, which users can then specify in a job request. If Type
              is specified, then Count is limited in size (currently 1024).
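              For example, a node offering two GPU models might be described
              as below (the device files and core lists are illustrative);
              users could then request a particular model with a job option
              such as --gres=gpu:tesla:1:

              Name=gpu Type=gtx560 File=/dev/nvidia0 Cores=0,1
              Name=gpu Type=tesla File=/dev/nvidia1 Cores=2,3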
EXAMPLES
       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Define GPU devices with MPS support
       ##################################################################
       AutoDetect=nvml
       Name=gpu Type=gtx560 File=/dev/nvidia0 COREs=0,1
       Name=gpu Type=tesla File=/dev/nvidia1 COREs=2,3
       Name=mps Count=100 File=/dev/nvidia0 COREs=0,1
       Name=mps Count=100 File=/dev/nvidia1 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Overwrite system defaults and explicitly configure three GPUs
       ##################################################################
       Name=gpu Type=tesla File=/dev/nvidia[0-1] COREs=0,1
       # Name=gpu Type=tesla File=/dev/nvidia[2-3] COREs=2,3
       # NOTE: nvidia2 device is out of service
       Name=gpu Type=tesla File=/dev/nvidia3 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use a single gres.conf file for all compute nodes - positive method
       ##################################################################
       ## Explicitly specify devices on nodes tux0-tux15
       # NodeName=tux[0-15] Name=gpu File=/dev/nvidia[0-3]
       # NOTE: tux3 nvidia1 device is out of service
       NodeName=tux[0-2] Name=gpu File=/dev/nvidia[0-3]
       NodeName=tux3 Name=gpu File=/dev/nvidia[0,2-3]
       NodeName=tux[4-15] Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use NVML to gather GPU configuration information
       # Information about all other GRES gathered from slurm.conf
       ##################################################################
       AutoDetect=nvml
COPYING
       Copyright (C) 2010 The Regents of the University of California.
       Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2019 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at
       your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.
SEE ALSO
       slurm.conf(5)


September 2019             Slurm Configuration File               gres.conf(5)