gres.conf(5)               Slurm Configuration File               gres.conf(5)

NAME
       gres.conf - Slurm configuration file for Generic RESource (GRES)
       management.

DESCRIPTION
       gres.conf is an ASCII file which describes the configuration of
       Generic RESource (GRES) on each compute node. If the GRES information
       in the slurm.conf file does not fully describe those resources, then a
       gres.conf file should be included on each compute node. The file
       location can be modified at system build time using the
       DEFAULT_SLURM_CONF parameter or at execution time by setting the
       SLURM_CONF environment variable. The file will always be located in
       the same directory as the slurm.conf file.

       If the GRES information in the slurm.conf file fully describes those
       resources (i.e. no "Cores", "File" or "Links" specification is
       required for that GRES type or that information is automatically
       detected), that information may be omitted from the gres.conf file and
       only the configuration information in the slurm.conf file will be
       used. The gres.conf file may be omitted completely if the
       configuration information in the slurm.conf file fully describes all
       GRES.

       If using the gres.conf file to describe the resources available to
       nodes, the first parameter on the line should be NodeName. If
       configuring Generic Resources without specifying nodes, the first
       parameter on the line should be Name.

       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of
       the command "scontrol reconfigure" unless otherwise noted.

       NOTE: Slurm support for gres/mps requires the use of the
       select/cons_tres plugin. For more information on how to configure
       MPS, see https://slurm.schedmd.com/gres.html#MPS_Management.

       For more information on GRES scheduling in general, see
       https://slurm.schedmd.com/gres.html.

       The overall configuration parameters available include:

       AutoDetect
              The hardware detection mechanisms to enable for automatic GRES
              configuration. Currently, the options are:

              nvml   Automatically detect NVIDIA GPUs

              off    Do not automatically detect any GPUs. Used to override
                     other options.

              rsmi   Automatically detect AMD GPUs

              AutoDetect can appear on a line by itself, in which case it
              applies globally to all lines in gres.conf by default. In
              addition, AutoDetect can be combined with NodeName to apply
              only to certain nodes. A node-specific AutoDetect overrides the
              global AutoDetect. A node-specific AutoDetect only needs to be
              specified once per node. If it is specified multiple times for
              the same nodes, all occurrences must have the same value. To
              unset AutoDetect for a node when a global AutoDetect is set,
              simply set it to "off" in a node-specific GRES line. E.g.:
              NodeName=tux3 AutoDetect=off Name=gpu File=/dev/nvidia[0-3].

       Count  Number of resources of this type available on this node. The
              default value is set to the number of File values specified
              (if any), otherwise the default value is one. A suffix of "K",
              "M", "G", "T" or "P" may be used to multiply the number by
              1024, 1048576, 1073741824, etc. respectively. For example:
              "Count=10G".

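              As an illustrative sketch (the node names and the GRES name
              "bandwidth" are hypothetical; the name would also need to be
              listed in GresTypes in slurm.conf), a resource tracked purely
              by count could be defined with a suffixed value:

                   NodeName=tux[0-15] Name=bandwidth Count=4G
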
       Cores  Optionally specify the core index numbers for the specific
              cores which can use this resource. For example, it may be
              strongly preferable to use specific cores with specific GRES
              devices (e.g. on a NUMA architecture). While Slurm can track
              and assign resources at the CPU or thread level, the
              scheduling algorithms used to co-allocate GRES devices with
              CPUs operate at a socket or NUMA level. Therefore it is not
              possible to preferentially assign GRES to different specific
              CPUs on the same NUMA node or socket, and this option should
              be used to identify all cores on some socket.

              Multiple cores may be specified using a comma-delimited list,
              or a range may be specified using a "-" separator (e.g.
              "0,1,2,3" or "0-3"). If a job specifies
              --gres-flags=enforce-binding, then only the identified cores
              can be allocated with each generic resource. This will tend to
              improve performance of jobs, but delay the allocation of
              resources to them. If specified and a job is not submitted
              with the --gres-flags=enforce-binding option, the identified
              cores will be preferred for scheduling with each generic
              resource.

              If --gres-flags=disable-binding is specified, then any core can
              be used with the resources, which also increases the speed of
              Slurm's scheduling algorithm but can degrade the application
              performance. The --gres-flags=disable-binding option is
              currently required to use more CPUs than are bound to a GRES
              (i.e. if a GPU is bound to the CPUs on one socket, but
              resources on more than one socket are required to run the
              job). If any core can be effectively used with the resources,
              then do not specify the cores option for improved speed in the
              Slurm scheduling logic. A restart of the slurmctld is needed
              for changes to the Cores option to take effect.

              NOTE: Since Slurm must be able to perform resource management
              on heterogeneous clusters having various processing unit
              numbering schemes, a logical core index must be specified
              instead of the physical core index. That logical core index
              might not correspond to your physical core index number. Core
              0 will be the first core on the first socket, while core 1
              will be the second core on the first socket. This numbering
              coincides with the logical core number (Core L#) seen in
              "lstopo -l" command output.

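              As a sketch, on a hypothetical two-socket node where logical
              cores 0-17 sit on socket 0 and cores 18-35 on socket 1 (the
              device paths and Type name are illustrative), each GPU could
              be tied to the cores of its local socket:

                   Name=gpu Type=a100 File=/dev/nvidia0 Cores=0-17
                   Name=gpu Type=a100 File=/dev/nvidia1 Cores=18-35
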
       File   Fully qualified pathname of the device files associated with a
              resource. The name can include a numeric range suffix to be
              interpreted by Slurm (e.g. File=/dev/nvidia[0-3]).

              This field is generally required if enforcement of generic
              resource allocations is to be supported (i.e. prevents users
              from making use of resources allocated to a different user).
              Enforcement of the file allocation relies upon Linux Control
              Groups (cgroups) and Slurm's task/cgroup plugin, which will
              place the allocated files into the job's cgroup and prevent
              use of other files. Please see Slurm's Cgroups Guide for more
              information: https://slurm.schedmd.com/cgroups.html.

              If File is specified then Count must be either set to the
              number of file names specified or not set (the default value
              is the number of files specified). The exception to this is
              MPS. For MPS, each GPU would be identified by device file
              using the File parameter and Count would specify the number of
              MPS entries that would correspond to that GPU (typically 100
              or some multiple of 100).

              NOTE: If you specify the File parameter for a resource on some
              node, the option must be specified on all nodes and Slurm will
              track the assignment of each specific resource on each node.
              Otherwise Slurm will only track a count of allocated resources
              rather than the state of each individual device file.

              NOTE: Drain a node before changing the count of records with
              File parameters (i.e. if you want to add or remove GPUs from a
              node's configuration). Failure to do so will result in any job
              using those GRES being aborted.

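              As a minimal sketch of the MPS exception described above (the
              device paths and counts are illustrative), a node with two
              GPUs could expose 100 MPS entries per GPU:

                   Name=gpu File=/dev/nvidia[0-1]
                   Name=mps File=/dev/nvidia0 Count=100
                   Name=mps File=/dev/nvidia1 Count=100
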
       Flags  Optional flags that can be specified to change the configured
              behavior of the GRES.

              Allowed values at present are:

              CountOnly
                     Do not attempt to load the plugin, as this GRES will
                     only be used to track counts of GRES used. This avoids
                     attempting to load a non-existent plugin, which can
                     affect filesystems with high-latency metadata
                     operations for non-existent files.

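              For instance, a counted resource with no associated device
              files or plugin (the GRES name "iotokens" and the node names
              are hypothetical) might be declared as:

                   NodeName=tux[0-15] Name=iotokens Count=64 Flags=CountOnly
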
       Links  A comma-delimited list of numbers identifying the number of
              connections between this device and other devices to allow
              coscheduling of better connected devices. This is an ordered
              list in which the number of connections this specific device
              has to device number 0 would be in the first position, the
              number of connections it has to device number 1 in the second
              position, etc. A -1 indicates the device itself and a 0
              indicates no connection. If specified, then this line can only
              contain a single GRES device (i.e. can only contain a single
              file via File).

              This is an optional value and is usually automatically
              determined if AutoDetect is enabled. A typical use case would
              be to identify GPUs having NVLink connectivity. Note that for
              GPUs, the minor number assigned by the OS and used in the
              device file (i.e. the X in /dev/nvidiaX) is not necessarily
              the same as the device number/index. The device number is
              created by sorting the GPUs by PCI bus ID and then numbering
              them starting from the smallest bus ID. See
              https://slurm.schedmd.com/gres.html#GPU_Management.

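              As an illustrative sketch, on a node with four GPUs where
              device 0 has one link to each of devices 1 and 2 and no link
              to device 3 (the link counts here are hypothetical), the line
              for device 0 could read:

                   Name=gpu File=/dev/nvidia0 Links=-1,1,1,0
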
       Name   Name of the generic resource. Any desired name may be used.
              The name must match a value in GresTypes in slurm.conf. Each
              generic resource has an optional plugin which can provide
              resource-specific functionality. Generic resources that
              currently include an optional plugin are:

              gpu    Graphics Processing Unit

              mps    CUDA Multi-Process Service (MPS)

              nic    Network Interface Card

              mic    Intel Many Integrated Core (MIC) processor

       NodeName
              An optional NodeName specification can be used to permit one
              gres.conf file to be used for all compute nodes in a cluster
              by specifying the node(s) that each line should apply to. The
              NodeName specification can use a Slurm hostlist specification
              as shown in the example below.

       Type   An optional arbitrary string identifying the type of device.
              For example, this might be used to identify a specific model
              of GPU, which users can then specify in a job request. If Type
              is specified, then Count is limited in size (currently 1024).

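              As a sketch (the model name "tesla" mirrors the examples
              below), a typed GPU configured as follows could then be
              requested by users with a job option such as
              --gres=gpu:tesla:1:

                   Name=gpu Type=tesla File=/dev/nvidia0
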
EXAMPLES
       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Define GPU devices with MPS support, with AutoDetect sanity checking
       ##################################################################
       AutoDetect=nvml
       Name=gpu Type=gtx560 File=/dev/nvidia0 COREs=0,1
       Name=gpu Type=tesla  File=/dev/nvidia1 COREs=2,3
       Name=mps Count=100   File=/dev/nvidia0 COREs=0,1
       Name=mps Count=100   File=/dev/nvidia1 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Overwrite system defaults and explicitly configure three GPUs
       ##################################################################
       Name=gpu Type=tesla File=/dev/nvidia[0-1] COREs=0,1
       # Name=gpu Type=tesla File=/dev/nvidia[2-3] COREs=2,3
       # NOTE: nvidia2 device is out of service
       Name=gpu Type=tesla File=/dev/nvidia3 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use a single gres.conf file for all compute nodes - positive method
       ##################################################################
       ## Explicitly specify devices on nodes tux0-tux15
       # NodeName=tux[0-15]  Name=gpu File=/dev/nvidia[0-3]
       # NOTE: tux3 nvidia1 device is out of service
       NodeName=tux[0-2]  Name=gpu File=/dev/nvidia[0-3]
       NodeName=tux3  Name=gpu File=/dev/nvidia[0,2-3]
       NodeName=tux[4-15]  Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use NVML to gather GPU configuration information
       # for all nodes except one
       ##################################################################
       AutoDetect=nvml
       NodeName=tux3 AutoDetect=off Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Specify some nodes with NVML, some with RSMI, and some with no
       # AutoDetect
       ##################################################################
       NodeName=tux[0-7] AutoDetect=nvml
       NodeName=tux[8-11] AutoDetect=rsmi
       NodeName=tux[12-15] Name=gpu File=/dev/nvidia[0-3]

COPYING
       Copyright (C) 2010 The Regents of the University of California.
       Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2019 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at
       your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

SEE ALSO
       slurm.conf(5)

October 2020               Slurm Configuration File               gres.conf(5)