gres.conf(5)               Slurm Configuration File               gres.conf(5)


NAME
       gres.conf - Slurm configuration file for Generic RESource (GRES)
       management.


DESCRIPTION
       gres.conf is an ASCII file which describes the configuration of
       Generic RESource (GRES) on each compute node. If the GRES information
       in the slurm.conf file does not fully describe those resources, then a
       gres.conf file should be included on each compute node. The file
       location can be modified at system build time using the
       DEFAULT_SLURM_CONF parameter or at execution time by setting the
       SLURM_CONF environment variable. The file will always be located in
       the same directory as the slurm.conf file.
       If the GRES information in the slurm.conf file fully describes those
       resources (i.e. no "Cores", "File" or "Links" specification is
       required for that GRES type or that information is automatically
       detected), that information may be omitted from the gres.conf file
       and only the configuration information in the slurm.conf file will be
       used. The gres.conf file may be omitted completely if the
       configuration information in the slurm.conf file fully describes all
       GRES.

       Parameter names are case insensitive. Any text following a "#" in the
       configuration file is treated as a comment through the end of that
       line. Changes to the configuration file take effect upon restart of
       Slurm daemons, daemon receipt of the SIGHUP signal, or execution of
       the command "scontrol reconfigure" unless otherwise noted.

       NOTE: Slurm support for gres/mps requires the use of the
       select/cons_tres plugin. For more information on how to configure
       MPS, see https://slurm.schedmd.com/gres.html#MPS_Management.

       For more information on GRES scheduling in general, see
       https://slurm.schedmd.com/gres.html.

       The overall configuration parameters available include:
       AutoDetect
              The hardware detection mechanisms to enable for automatic GRES
              configuration. This should be on a line by itself. Currently,
              the only valid option is nvml, which allows for automatically
              detecting NVIDIA GPUs.
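              For instance, on a node where NVML can discover every GPU, a
              minimal gres.conf may consist of this single directive (as in
              the final example in the EXAMPLES section below); the GPU
              device files and their Links are then detected rather than
              listed by hand:

              # Minimal configuration; GPU details come from NVML detection
              AutoDetect=nvml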
       Count  Number of resources of this type available on this node. The
              default value is set to the number of File values specified
              (if any), otherwise the default value is one. A suffix of "K",
              "M", "G", "T" or "P" may be used to multiply the number by
              1024, 1048576, 1073741824, etc. respectively. For example:
              "Count=10G".
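              As an illustrative sketch (the resource name "bandwidth" is
              hypothetical and must also be listed in GresTypes in
              slurm.conf), the suffixes allow large counts to be written
              compactly:

              # A count-only GRES with no File or Cores binding
              # 4G = 4 x 1073741824 units
              Name=bandwidth Count=4G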
       Cores  Optionally specify the first thread CPU index numbers for the
              specific cores which can use this resource. For example, it
              may be strongly preferable to use specific cores with specific
              GRES devices (e.g. on a NUMA architecture). While Slurm can
              track and assign resources at the CPU or thread level, the
              scheduling algorithms used to co-allocate GRES devices with
              CPUs operate at a socket or NUMA level. Therefore it is not
              possible to preferentially assign different GRES to specific
              CPUs within the same socket or NUMA node, and this option
              should be used to identify all cores on some socket.

              Multiple cores may be specified using a comma-delimited list,
              or a range may be specified using a "-" separator (e.g.
              "0,1,2,3" or "0-3"). If a job specifies
              --gres-flags=enforce-binding, then only the identified cores
              can be allocated with each generic resource. This will tend to
              improve performance of jobs, but delay the allocation of
              resources to them. If Cores is specified and a job is not
              submitted with the --gres-flags=enforce-binding option, the
              identified cores will be preferred for scheduling with each
              generic resource.

              If --gres-flags=disable-binding is specified, then any core
              can be used with the resources, which also increases the speed
              of Slurm's scheduling algorithm but can degrade the
              application performance. The --gres-flags=disable-binding
              option is currently required to use more CPUs than are bound
              to a GRES (i.e. if a GPU is bound to the CPUs on one socket,
              but resources on more than one socket are required to run the
              job). If any core can be effectively used with the resources,
              then do not specify the Cores option, for improved speed in
              the Slurm scheduling logic. A restart of the slurmctld is
              needed for changes to the Cores option to take effect.

              NOTE: If your cores contain multiple threads, only the first
              thread (processing unit) of each core needs to be listed.
              Also note that since Slurm must be able to perform resource
              management on heterogeneous clusters having various processing
              unit numbering schemes, a logical processing unit index must
              be specified instead of the physical processing unit index.
              That processing unit logical index might not correspond to
              your physical index number. Processing unit 0 will be the
              first socket, first core and (if configured) first thread. If
              hyperthreading is enabled, processing unit 1 will always be
              the first socket, first core and second thread. If
              hyperthreading is not enabled, processing unit 1 will always
              be the first socket and second core. This numbering coincides
              with the processing unit logical number (PU L#) seen in
              "lstopo -l" command output.
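              A sketch of socket-level binding, assuming a hypothetical
              two-socket node with eight cores per socket (logical
              processing units 0-7 and 8-15) and one GPU attached to each
              socket:

              # List every core on the socket local to each GPU, using the
              # first thread of each core only
              Name=gpu Type=tesla File=/dev/nvidia0 Cores=0-7
              Name=gpu Type=tesla File=/dev/nvidia1 Cores=8-15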
       File   Fully qualified pathname of the device files associated with a
              resource. The name can include a numeric range suffix to be
              interpreted by Slurm (e.g. File=/dev/nvidia[0-3]).

              This field is generally required if enforcement of generic
              resource allocations is to be supported (i.e. prevents users
              from making use of resources allocated to a different user).
              Enforcement of the file allocation relies upon Linux Control
              Groups (cgroups) and Slurm's task/cgroup plugin, which will
              place the allocated files into the job's cgroup and prevent
              use of other files. Please see Slurm's Cgroups Guide for more
              information: https://slurm.schedmd.com/cgroups.html.

              If File is specified then Count must be either set to the
              number of file names specified or not set (the default value
              is the number of files specified). The exception to this is
              MPS. For MPS, each GPU would be identified by device file
              using the File parameter and Count would specify the number of
              MPS entries that would correspond to that GPU (typically 100
              or some multiple of 100).

              NOTE: If you specify the File parameter for a resource on some
              node, the option must be specified on all nodes and Slurm will
              track the assignment of each specific resource on each node.
              Otherwise Slurm will only track a count of allocated resources
              rather than the state of each individual device file.

              NOTE: Drain a node before changing the count of records with
              File parameters (i.e. if you want to add or remove GPUs from a
              node's configuration). Failure to do so will result in any job
              using those GRES being aborted.
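              Two sketches of the File/Count relationship described above,
              assuming a hypothetical four-GPU node. With a bracketed range,
              Count defaults to the number of expanded device files; an MPS
              line names a single GPU device file and a Count that is
              typically a multiple of 100:

              # Count defaults to 4, one per expanded device file
              Name=gpu File=/dev/nvidia[0-3]
              # 100 MPS entries backed by the GPU at /dev/nvidia0
              Name=mps Count=100 File=/dev/nvidia0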
       Links  A comma-delimited list of numbers identifying the number of
              connections between this device and other devices to allow
              coscheduling of better connected devices. This is an ordered
              list in which the number of connections this specific device
              has to device number 0 would be in the first position, the
              number of connections it has to device number 1 in the second
              position, etc. A -1 indicates the device itself and a 0
              indicates no connection. If specified, then this line can only
              contain a single GRES device (i.e. can only contain a single
              file via File).

              This is an optional value and is usually automatically
              determined if AutoDetect is enabled. A typical use case would
              be to identify GPUs having NVLink connectivity. Note that for
              GPUs, the minor number assigned by the OS and used in the
              device file (i.e. the X in /dev/nvidiaX) is not necessarily
              the same as the device number/index. The device number is
              created by sorting the GPUs by PCI bus ID and then numbering
              them starting from the smallest bus ID. See
              https://slurm.schedmd.com/gres.html#GPU_Management.
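              An illustrative sketch for a hypothetical four-GPU node in
              which device 0 has two NVLink connections to device 1 and no
              connection to devices 2 and 3 (the connection counts are
              invented for the example; with AutoDetect=nvml this list is
              normally filled in automatically). A line with Links can
              describe only a single device:

              # Positions refer to device numbers 0, 1, 2 and 3 in order
              Name=gpu File=/dev/nvidia0 Links=-1,2,0,0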
       Name   Name of the generic resource. Any desired name may be used.
              The name must match a value in GresTypes in slurm.conf. Each
              generic resource has an optional plugin which can provide
              resource-specific functionality. Generic resources that
              currently include an optional plugin are:

              gpu    Graphics Processing Unit

              mps    CUDA Multi-Process Service (MPS)

              nic    Network Interface Card

              mic    Intel Many Integrated Core (MIC) processor

       NodeName
              An optional NodeName specification can be used to permit one
              gres.conf file to be used for all compute nodes in a cluster
              by specifying the node(s) that each line should apply to. The
              NodeName specification can use a Slurm hostlist specification
              as shown in the example below.

       Type   An optional arbitrary string identifying the type of device.
              For example, this might be used to identify a specific model
              of GPU, which users can then specify in a job request. If Type
              is specified, then Count is limited in size (currently 1024).
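              For example, a node offering two GPU models might be described
              as below (the device files and core lists are illustrative);
              users could then request a particular model with a job option
              such as --gres=gpu:tesla:1:

              Name=gpu Type=gtx560 File=/dev/nvidia0 Cores=0,1
              Name=gpu Type=tesla File=/dev/nvidia1 Cores=2,3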
EXAMPLES
       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Define GPU devices with MPS support
       ##################################################################
       AutoDetect=nvml
       Name=gpu Type=gtx560 File=/dev/nvidia0 COREs=0,1
       Name=gpu Type=tesla File=/dev/nvidia1 COREs=2,3
       Name=mps Count=100 File=/dev/nvidia0 COREs=0,1
       Name=mps Count=100 File=/dev/nvidia1 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Overwrite system defaults and explicitly configure three GPUs
       ##################################################################
       Name=gpu Type=tesla File=/dev/nvidia[0-1] COREs=0,1
       # Name=gpu Type=tesla File=/dev/nvidia[2-3] COREs=2,3
       # NOTE: nvidia2 device is out of service
       Name=gpu Type=tesla File=/dev/nvidia3 COREs=2,3

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use a single gres.conf file for all compute nodes - positive method
       ##################################################################
       ## Explicitly specify devices on nodes tux0-tux15
       # NodeName=tux[0-15] Name=gpu File=/dev/nvidia[0-3]
       # NOTE: tux3 nvidia1 device is out of service
       NodeName=tux[0-2] Name=gpu File=/dev/nvidia[0-3]
       NodeName=tux3 Name=gpu File=/dev/nvidia[0,2-3]
       NodeName=tux[4-15] Name=gpu File=/dev/nvidia[0-3]

       ##################################################################
       # Slurm's Generic Resource (GRES) configuration file
       # Use NVML to gather GPU configuration information
       # Information about all other GRES gathered from slurm.conf
       ##################################################################
       AutoDetect=nvml
COPYING
       Copyright (C) 2010 The Regents of the University of California.
       Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2010-2019 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at
       your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.
SEE ALSO
       slurm.conf(5)


September 2019             Slurm Configuration File               gres.conf(5)