salloc(1)                        Slurm Commands                       salloc(1)
2
3

NAME
6 salloc - Obtain a Slurm job allocation (a set of nodes), execute a com‐
7 mand, and then release the allocation when the command is finished.
8

SYNOPSIS
11 salloc [OPTIONS(0)...] [ : [OPTIONS(N)...]] [command(0) [args(0)...]]
12
13 Option(s) define multiple jobs in a co-scheduled heterogeneous job.
14 For more details about heterogeneous jobs see the document
15 https://slurm.schedmd.com/heterogeneous_jobs.html
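
       For illustration, a heterogeneous request might combine two components
       separated by " : " (the feature names and the command are placeholders):

              salloc -N1 -C haswell : -N4 -C knl ./my_script.sh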
16

DESCRIPTION
19 salloc is used to allocate a Slurm job allocation, which is a set of
20 resources (nodes), possibly with some set of constraints (e.g. number
21 of processors per node). When salloc successfully obtains the re‐
22 quested allocation, it then runs the command specified by the user.
23 Finally, when the user specified command is complete, salloc relin‐
24 quishes the job allocation.
25
26 The command may be any program the user wishes. Some typical commands
27 are xterm, a shell script containing srun commands, and srun (see the
28 EXAMPLES section). If no command is specified, then salloc runs the
29 user's default shell.
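
       As a minimal sketch (the allocation sizes and the program name are
       arbitrary), an interactive allocation and a wrapped srun launch might
       look like:

              salloc -N1 -n4                  (runs the user's default shell)
              salloc -N2 srun ./a.out         (srun launches ./a.out on the allocated nodes)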
30
31 The following document describes the influence of various options on
32 the allocation of cpus to jobs and tasks.
33 https://slurm.schedmd.com/cpu_management.html
34
35 NOTE: The salloc logic includes support to save and restore the termi‐
36 nal line settings and is designed to be executed in the foreground. If
37 you need to execute salloc in the background, set its standard input to
38 some file, for example: "salloc -n16 a.out </dev/null &"
39

RETURN VALUE
       If salloc is unable to execute the user command, it will return 1 and
       print errors to stderr. On success, or if killed by the signals HUP,
       INT, KILL, or QUIT, it will return 0.
45

COMMAND PATH RESOLUTION
48 If provided, the command is resolved in the following order:
49
50 1. If command starts with ".", then path is constructed as: current
51 working directory / command
52 2. If command starts with a "/", then path is considered absolute.
53 3. If command can be resolved through PATH. See path_resolution(7).
54 4. If command is in current working directory.
55
56 Current working directory is the calling process working directory un‐
57 less the --chdir argument is passed, which will override the current
58 working directory.
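
       As an illustration (the directory and script name are hypothetical), a
       command beginning with "." is resolved against the working directory,
       which --chdir overrides:

              cd /home/user/project; salloc -n1 ./setup.sh    (runs /home/user/project/setup.sh)
              salloc --chdir=/scratch/run1 -n1 ./setup.sh     (runs /scratch/run1/setup.sh)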
59

OPTIONS
62 -A, --account=<account>
63 Charge resources used by this job to specified account. The ac‐
64 count is an arbitrary string. The account name may be changed
65 after job submission using the scontrol command.
66
67 --acctg-freq=<datatype>=<interval>[,<datatype>=<interval>...]
68 Define the job accounting and profiling sampling intervals in
69 seconds. This can be used to override the JobAcctGatherFre‐
70 quency parameter in the slurm.conf file. <datatype>=<interval>
71 specifies the task sampling interval for the jobacct_gather
72 plugin or a sampling interval for a profiling type by the
73 acct_gather_profile plugin. Multiple comma-separated
74 <datatype>=<interval> pairs may be specified. Supported datatype
75 values are:
76
77 task Sampling interval for the jobacct_gather plugins and
78 for task profiling by the acct_gather_profile
79 plugin.
80 NOTE: This frequency is used to monitor memory us‐
81 age. If memory limits are enforced the highest fre‐
82 quency a user can request is what is configured in
83 the slurm.conf file. It can not be disabled.
84
85 energy Sampling interval for energy profiling using the
86 acct_gather_energy plugin.
87
88 network Sampling interval for infiniband profiling using the
89 acct_gather_interconnect plugin.
90
91 filesystem Sampling interval for filesystem profiling using the
92 acct_gather_filesystem plugin.
93
94 The default value for the task sampling interval is 30 seconds.
95 The default value for all other intervals is 0. An interval of
96 0 disables sampling of the specified type. If the task sampling
97 interval is 0, accounting information is collected only at job
98 termination (reducing Slurm interference with the job).
99 Smaller (non-zero) values have a greater impact upon job perfor‐
100 mance, but a value of 30 seconds is not likely to be noticeable
101 for applications having less than 10,000 tasks.
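
              For example, a hypothetical request sampling task statistics
              every 15 seconds and energy every 60 seconds (subject to the
              limits described above) might be:
                  salloc -n16 --acctg-freq=task=15,energy=60 a.out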
102
103
104 --bb=<spec>
105 Burst buffer specification. The form of the specification is
106 system dependent. Note the burst buffer may not be accessible
              from a login node, but may require that salloc spawn a shell on one
108 of its allocated compute nodes. When the --bb option is used,
109 Slurm parses this option and creates a temporary burst buffer
110 script file that is used internally by the burst buffer plugins.
111 See Slurm's burst buffer guide for more information and exam‐
112 ples:
113 https://slurm.schedmd.com/burst_buffer.html
114
115 --bbf=<file_name>
116 Path of file containing burst buffer specification. The form of
117 the specification is system dependent. Also see --bb. Note the
              burst buffer may not be accessible from a login node, but may
              require that salloc spawn a shell on one of its allocated compute
120 nodes. See Slurm's burst buffer guide for more information and
121 examples:
122 https://slurm.schedmd.com/burst_buffer.html
123
124 --begin=<time>
125 Defer eligibility of this job allocation until the specified
126 time.
127
128 Time may be of the form HH:MM:SS to run a job at a specific time
129 of day (seconds are optional). (If that time is already past,
130 the next day is assumed.) You may also specify midnight, noon,
131 fika (3 PM) or teatime (4 PM) and you can have a time-of-day
132 suffixed with AM or PM for running in the morning or the
133 evening. You can also say what day the job will be run, by
              specifying a date of the form MMDDYY, MM/DD/YY or YYYY-MM-DD.
135 Combine date and time using the following format
136 YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now +
137 count time-units, where the time-units can be seconds (default),
138 minutes, hours, days, or weeks and you can tell Slurm to run the
139 job today with the keyword today and to run the job tomorrow
140 with the keyword tomorrow. The value may be changed after job
141 submission using the scontrol command. For example:
142 --begin=16:00
143 --begin=now+1hour
144 --begin=now+60 (seconds by default)
145 --begin=2010-01-20T12:34:00
146
147
148 Notes on date/time specifications:
149 - Although the 'seconds' field of the HH:MM:SS time specifica‐
150 tion is allowed by the code, note that the poll time of the
151 Slurm scheduler is not precise enough to guarantee dispatch of
152 the job on the exact second. The job will be eligible to start
153 on the next poll following the specified time. The exact poll
154 interval depends on the Slurm scheduler (e.g., 60 seconds with
155 the default sched/builtin).
156 - If no time (HH:MM:SS) is specified, the default is
157 (00:00:00).
158 - If a date is specified without a year (e.g., MM/DD) then the
159 current year is assumed, unless the combination of MM/DD and
160 HH:MM:SS has already passed for that year, in which case the
161 next year is used.
162
163
164 --bell Force salloc to ring the terminal bell when the job allocation
165 is granted (and only if stdout is a tty). By default, salloc
166 only rings the bell if the allocation is pending for more than
167 ten seconds (and only if stdout is a tty). Also see the option
168 --no-bell.
169
170 -D, --chdir=<path>
171 Change directory to path before beginning execution. The path
172 can be specified as full path or relative path to the directory
173 where the command is executed.
174
175 --cluster-constraint=<list>
176 Specifies features that a federated cluster must have to have a
177 sibling job submitted to it. Slurm will attempt to submit a sib‐
178 ling job to a cluster if it has at least one of the specified
179 features.
180
181 -M, --clusters=<string>
182 Clusters to issue commands to. Multiple cluster names may be
183 comma separated. The job will be submitted to the one cluster
184 providing the earliest expected job initiation time. The default
185 value is the current cluster. A value of 'all' will query to run
186 on all clusters. Note that the SlurmDBD must be up for this op‐
187 tion to work properly.
188
189 --comment=<string>
190 An arbitrary comment.
191
192 -C, --constraint=<list>
193 Nodes can have features assigned to them by the Slurm adminis‐
194 trator. Users can specify which of these features are required
195 by their job using the constraint option. If you are looking for
              'soft' constraints please see --prefer for more information.
197 Only nodes having features matching the job constraints will be
198 used to satisfy the request. Multiple constraints may be speci‐
199 fied with AND, OR, matching OR, resource counts, etc. (some op‐
200 erators are not supported on all system types).
201
202 NOTE: If features that are part of the node_features/helpers
203 plugin are requested, then only the Single Name and AND options
204 are supported.
205
206 Supported --constraint options include:
207
208 Single Name
209 Only nodes which have the specified feature will be used.
210 For example, --constraint="intel"
211
212 Node Count
213 A request can specify the number of nodes needed with
214 some feature by appending an asterisk and count after the
215 feature name. For example, --nodes=16 --con‐
216 straint="graphics*4 ..." indicates that the job requires
217 16 nodes and that at least four of those nodes must have
218 the feature "graphics."
219
              AND    Only nodes with all of the specified features will be
                     used. The ampersand is used for an AND operator. For
                     example, --constraint="intel&gpu"

              OR     Only nodes with at least one of the specified features
                     will be used. The vertical bar is used for an OR opera‐
                     tor. For example, --constraint="intel|amd"
227
228 Matching OR
229 If only one of a set of possible options should be used
230 for all allocated nodes, then use the OR operator and en‐
231 close the options within square brackets. For example,
232 --constraint="[rack1|rack2|rack3|rack4]" might be used to
233 specify that all nodes must be allocated on a single rack
234 of the cluster, but any of those four racks can be used.
235
236 Multiple Counts
237 Specific counts of multiple resources may be specified by
238 using the AND operator and enclosing the options within
239 square brackets. For example, --con‐
240 straint="[rack1*2&rack2*4]" might be used to specify that
241 two nodes must be allocated from nodes with the feature
242 of "rack1" and four nodes must be allocated from nodes
243 with the feature "rack2".
244
245 NOTE: This construct does not support multiple Intel KNL
246 NUMA or MCDRAM modes. For example, while --con‐
247 straint="[(knl&quad)*2&(knl&hemi)*4]" is not supported,
248 --constraint="[haswell*2&(knl&hemi)*4]" is supported.
249 Specification of multiple KNL modes requires the use of a
250 heterogeneous job.
251
252 NOTE: Multiple Counts can cause jobs to be allocated with
253 a non-optimal network layout.
254
255 Brackets
256 Brackets can be used to indicate that you are looking for
257 a set of nodes with the different requirements contained
258 within the brackets. For example, --con‐
259 straint="[(rack1|rack2)*1&(rack3)*2]" will get you one
260 node with either the "rack1" or "rack2" features and two
261 nodes with the "rack3" feature. The same request without
262 the brackets will try to find a single node that meets
263 those requirements.
264
265 NOTE: Brackets are only reserved for Multiple Counts and
266 Matching OR syntax. AND operators require a count for
267 each feature inside square brackets (i.e.
268 "[quad*2&hemi*1]"). Slurm will only allow a single set of
269 bracketed constraints per job.
270
              Parentheses
                     Parentheses can be used to group like node features to‐
273 gether. For example, --con‐
274 straint="[(knl&snc4&flat)*4&haswell*1]" might be used to
275 specify that four nodes with the features "knl", "snc4"
276 and "flat" plus one node with the feature "haswell" are
                     required. All options within parentheses should be
                     grouped with AND (e.g. "&") operators.
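
              Putting these forms together, hypothetical requests that reuse
              the feature names from the examples above might look like:
                  salloc -N8 --constraint="intel&gpu"
                  salloc -N16 --constraint="[rack1|rack2|rack3|rack4]"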
279
280 --container=<path_to_container>
281 Absolute path to OCI container bundle.
282
283 --contiguous
284 If set, then the allocated nodes must form a contiguous set.
285
286 NOTE: If SelectPlugin=cons_res this option won't be honored with
287 the topology/tree or topology/3d_torus plugins, both of which
288 can modify the node ordering.
289
290 -S, --core-spec=<num>
291 Count of specialized cores per node reserved by the job for sys‐
292 tem operations and not used by the application. The application
293 will not use these cores, but will be charged for their alloca‐
294 tion. Default value is dependent upon the node's configured
295 CoreSpecCount value. If a value of zero is designated and the
296 Slurm configuration option AllowSpecResourcesUsage is enabled,
297 the job will be allowed to override CoreSpecCount and use the
298 specialized resources on nodes it is allocated. This option can
299 not be used with the --thread-spec option.
300
301 NOTE: Explicitly setting a job's specialized core value implic‐
302 itly sets its --exclusive option, reserving entire nodes for the
303 job.
304
305 --cores-per-socket=<cores>
306 Restrict node selection to nodes with at least the specified
              number of cores per socket. See additional information under the
              -B option below when the task/affinity plugin is enabled.
309 NOTE: This option may implicitly set the number of tasks (if -n
310 was not specified) as one task per requested thread.
311
312 --cpu-freq=<p1>[-p2[:p3]]
313
314 Request that job steps initiated by srun commands inside this
315 allocation be run at some requested frequency if possible, on
316 the CPUs selected for the step on the compute node(s).
317
318 p1 can be [#### | low | medium | high | highm1] which will set
319 the frequency scaling_speed to the corresponding value, and set
320 the frequency scaling_governor to UserSpace. See below for defi‐
321 nition of the values.
322
323 p1 can be [Conservative | OnDemand | Performance | PowerSave]
324 which will set the scaling_governor to the corresponding value.
325 The governor has to be in the list set by the slurm.conf option
326 CpuFreqGovernors.
327
328 When p2 is present, p1 will be the minimum scaling frequency and
329 p2 will be the maximum scaling frequency.
330
              p2 can be [#### | medium | high | highm1]. p2 must be greater
              than p1.
333
334 p3 can be [Conservative | OnDemand | Performance | PowerSave |
335 SchedUtil | UserSpace] which will set the governor to the corre‐
336 sponding value.
337
338 If p3 is UserSpace, the frequency scaling_speed will be set by a
339 power or energy aware scheduling strategy to a value between p1
340 and p2 that lets the job run within the site's power goal. The
341 job may be delayed if p1 is higher than a frequency that allows
342 the job to run within the goal.
343
344 If the current frequency is < min, it will be set to min. Like‐
345 wise, if the current frequency is > max, it will be set to max.
346
347 Acceptable values at present include:
348
349 #### frequency in kilohertz
350
351 Low the lowest available frequency
352
353 High the highest available frequency
354
355 HighM1 (high minus one) will select the next highest
356 available frequency
357
358 Medium attempts to set a frequency in the middle of the
359 available range
360
361 Conservative attempts to use the Conservative CPU governor
362
363 OnDemand attempts to use the OnDemand CPU governor (the de‐
364 fault value)
365
366 Performance attempts to use the Performance CPU governor
367
368 PowerSave attempts to use the PowerSave CPU governor
369
370 UserSpace attempts to use the UserSpace CPU governor
371
              The following informational environment variable is set in the
              job step when the --cpu-freq option is requested:
              SLURM_CPU_FREQ_REQ
376
377 This environment variable can also be used to supply the value
378 for the CPU frequency request if it is set when the 'srun' com‐
379 mand is issued. The --cpu-freq on the command line will over‐
380 ride the environment variable value. The form on the environ‐
381 ment variable is the same as the command line. See the ENVIRON‐
382 MENT VARIABLES section for a description of the
383 SLURM_CPU_FREQ_REQ variable.
384
385 NOTE: This parameter is treated as a request, not a requirement.
386 If the job step's node does not support setting the CPU fre‐
387 quency, or the requested value is outside the bounds of the le‐
388 gal frequencies, an error is logged, but the job step is allowed
389 to continue.
390
391 NOTE: Setting the frequency for just the CPUs of the job step
392 implies that the tasks are confined to those CPUs. If task con‐
393 finement (i.e. the task/affinity TaskPlugin is enabled, or the
394 task/cgroup TaskPlugin is enabled with "ConstrainCores=yes" set
395 in cgroup.conf) is not configured, this parameter is ignored.
396
397 NOTE: When the step completes, the frequency and governor of
398 each selected CPU is reset to the previous values.
399
              NOTE: Submitting jobs with the --cpu-freq option while linuxproc
              is configured as the ProctrackType can cause jobs to run too
              quickly, before accounting is able to poll for job information.
              As a result, not all of the accounting information will be
              present.
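
              For example, to ask that job steps launched inside the alloca‐
              tion run at the highest available frequency (the program name
              is a placeholder):
                  salloc -n8 --cpu-freq=high srun ./a.out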
404
405 --cpus-per-gpu=<ncpus>
406 Advise Slurm that ensuing job steps will require ncpus proces‐
407 sors per allocated GPU. Not compatible with the --cpus-per-task
408 option.
409
410 -c, --cpus-per-task=<ncpus>
411 Advise Slurm that ensuing job steps will require ncpus proces‐
412 sors per task. By default Slurm will allocate one processor per
413 task.
414
415 For instance, consider an application that has 4 tasks, each re‐
              quiring 3 processors. If our cluster is composed of quad-pro‐
              cessor nodes and we simply ask for 12 processors, the con‐
418 troller might give us only 3 nodes. However, by using the
419 --cpus-per-task=3 options, the controller knows that each task
420 requires 3 processors on the same node, and the controller will
421 grant an allocation of 4 nodes, one for each of the 4 tasks.
422
423 NOTE: Beginning with 22.05, srun will not inherit the
424 --cpus-per-task value requested by salloc or sbatch. It must be
425 requested again with the call to srun or set with the
426 SRUN_CPUS_PER_TASK environment variable if desired for the
427 task(s).
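
              A sketch of the scenario above, restating the value for srun as
              required since 22.05 (./a.out stands in for the application):
                  salloc -n4 -c3
                  srun -c3 ./a.out      (or set SRUN_CPUS_PER_TASK=3 first)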
428
429 --deadline=<OPT>
              Remove the job if no ending is possible before this deadline
431 (start > (deadline - time[-min])). Default is no deadline.
432 Valid time formats are:
433 HH:MM[:SS] [AM|PM]
434 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
435 MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
437 now[+count[seconds(default)|minutes|hours|days|weeks]]
438
439 --delay-boot=<minutes>
              Do not reboot nodes in order to satisfy this job's feature
441 specification if the job has been eligible to run for less than
442 this time period. If the job has waited for less than the spec‐
443 ified period, it will use only nodes which already have the
444 specified features. The argument is in units of minutes. A de‐
445 fault value may be set by a system administrator using the de‐
446 lay_boot option of the SchedulerParameters configuration parame‐
447 ter in the slurm.conf file, otherwise the default value is zero
448 (no delay).
449
450 -d, --dependency=<dependency_list>
451 Defer the start of this job until the specified dependencies
              have been satisfied. <dependency_list> is of the form
453 <type:job_id[:job_id][,type:job_id[:job_id]]> or
454 <type:job_id[:job_id][?type:job_id[:job_id]]>. All dependencies
455 must be satisfied if the "," separator is used. Any dependency
456 may be satisfied if the "?" separator is used. Only one separa‐
457 tor may be used. For instance:
458 -d afterok:20:21,afterany:23
459 means that the job can run only after a 0 return code of jobs 20
460 and 21 AND completion of job 23. However:
461 -d afterok:20:21?afterany:23
462 means that any of the conditions (afterok:20 OR afterok:21 OR
463 afterany:23) will be enough to release the job. Many jobs can
464 share the same dependency and these jobs may even belong to dif‐
465 ferent users. The value may be changed after job submission
466 using the scontrol command. Dependencies on remote jobs are al‐
467 lowed in a federation. Once a job dependency fails due to the
468 termination state of a preceding job, the dependent job will
469 never be run, even if the preceding job is requeued and has a
470 different termination state in a subsequent execution.
471
472 after:job_id[[+time][:jobid[+time]...]]
473 After the specified jobs start or are cancelled and
474 'time' in minutes from job start or cancellation happens,
475 this job can begin execution. If no 'time' is given then
476 there is no delay after start or cancellation.
477
478 afterany:job_id[:jobid...]
479 This job can begin execution after the specified jobs
480 have terminated. This is the default dependency type.
481
482 afterburstbuffer:job_id[:jobid...]
483 This job can begin execution after the specified jobs
484 have terminated and any associated burst buffer stage out
485 operations have completed.
486
487 aftercorr:job_id[:jobid...]
488 A task of this job array can begin execution after the
489 corresponding task ID in the specified job has completed
490 successfully (ran to completion with an exit code of
491 zero).
492
493 afternotok:job_id[:jobid...]
494 This job can begin execution after the specified jobs
495 have terminated in some failed state (non-zero exit code,
496 node failure, timed out, etc).
497
498 afterok:job_id[:jobid...]
499 This job can begin execution after the specified jobs
500 have successfully executed (ran to completion with an
501 exit code of zero).
502
503 singleton
504 This job can begin execution after any previously
505 launched jobs sharing the same job name and user have
506 terminated. In other words, only one job by that name
507 and owned by that user can be running or suspended at any
508 point in time. In a federation, a singleton dependency
509 must be fulfilled on all clusters unless DependencyParam‐
510 eters=disable_remote_singleton is used in slurm.conf.
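
              Reusing the dependency list from the example above, an alloca‐
              tion that waits on those jobs might be requested as (the script
              name is a placeholder):
                  salloc -d afterok:20:21,afterany:23 -n1 ./postprocess.sh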
511
512 -m, --distribution={*|block|cyclic|arbi‐
513 trary|plane=<size>}[:{*|block|cyclic|fcyclic}[:{*|block|cyclic|fcyclic}]][,{Pack|NoPack}]
514
515 Specify alternate distribution methods for remote processes.
516 For job allocation, this sets environment variables that will be
517 used by subsequent srun requests and also affects which cores
518 will be selected for job allocation.
519
520 This option controls the distribution of tasks to the nodes on
521 which resources have been allocated, and the distribution of
522 those resources to tasks for binding (task affinity). The first
523 distribution method (before the first ":") controls the distri‐
524 bution of tasks to nodes. The second distribution method (after
525 the first ":") controls the distribution of allocated CPUs
526 across sockets for binding to tasks. The third distribution
527 method (after the second ":") controls the distribution of allo‐
528 cated CPUs across cores for binding to tasks. The second and
529 third distributions apply only if task affinity is enabled. The
530 third distribution is supported only if the task/cgroup plugin
531 is configured. The default value for each distribution type is
532 specified by *.
533
534 Note that with select/cons_res and select/cons_tres, the number
535 of CPUs allocated to each socket and node may be different. Re‐
536 fer to https://slurm.schedmd.com/mc_support.html for more infor‐
537 mation on resource allocation, distribution of tasks to nodes,
538 and binding of tasks to CPUs.
539 First distribution method (distribution of tasks across nodes):
540
541
542 * Use the default method for distributing tasks to nodes
543 (block).
544
545 block The block distribution method will distribute tasks to a
546 node such that consecutive tasks share a node. For exam‐
547 ple, consider an allocation of three nodes each with two
548 cpus. A four-task block distribution request will dis‐
549 tribute those tasks to the nodes with tasks one and two
550 on the first node, task three on the second node, and
551 task four on the third node. Block distribution is the
552 default behavior if the number of tasks exceeds the num‐
553 ber of allocated nodes.
554
555 cyclic The cyclic distribution method will distribute tasks to a
556 node such that consecutive tasks are distributed over
557 consecutive nodes (in a round-robin fashion). For exam‐
558 ple, consider an allocation of three nodes each with two
559 cpus. A four-task cyclic distribution request will dis‐
560 tribute those tasks to the nodes with tasks one and four
561 on the first node, task two on the second node, and task
562 three on the third node. Note that when SelectType is
563 select/cons_res, the same number of CPUs may not be allo‐
564 cated on each node. Task distribution will be round-robin
565 among all the nodes with CPUs yet to be assigned to
566 tasks. Cyclic distribution is the default behavior if
567 the number of tasks is no larger than the number of allo‐
568 cated nodes.
569
570 plane The tasks are distributed in blocks of size <size>. The
571 size must be given or SLURM_DIST_PLANESIZE must be set.
572 The number of tasks distributed to each node is the same
573 as for cyclic distribution, but the taskids assigned to
574 each node depend on the plane size. Additional distribu‐
575 tion specifications cannot be combined with this option.
576 For more details (including examples and diagrams),
577 please see https://slurm.schedmd.com/mc_support.html and
578 https://slurm.schedmd.com/dist_plane.html
579
580 arbitrary
581 The arbitrary method of distribution will allocate pro‐
                     cesses in order as listed in the file designated by the
                     environment variable SLURM_HOSTFILE. If this variable is
                     set, it will override any other method specified. If it
                     is not set, the method will default to block. The host‐
                     file must contain at minimum the number of hosts re‐
                     quested, one per line or comma separated. If spec‐
588 ifying a task count (-n, --ntasks=<number>), your tasks
589 will be laid out on the nodes in the order of the file.
590 NOTE: The arbitrary distribution option on a job alloca‐
591 tion only controls the nodes to be allocated to the job
592 and not the allocation of CPUs on those nodes. This op‐
593 tion is meant primarily to control a job step's task lay‐
594 out in an existing job allocation for the srun command.
595 NOTE: If the number of tasks is given and a list of re‐
596 quested nodes is also given, the number of nodes used
597 from that list will be reduced to match that of the num‐
598 ber of tasks if the number of nodes in the list is
599 greater than the number of tasks.
600
601 Second distribution method (distribution of CPUs across sockets
602 for binding):
603
604
605 * Use the default method for distributing CPUs across sock‐
606 ets (cyclic).
607
608 block The block distribution method will distribute allocated
609 CPUs consecutively from the same socket for binding to
610 tasks, before using the next consecutive socket.
611
612 cyclic The cyclic distribution method will distribute allocated
613 CPUs for binding to a given task consecutively from the
614 same socket, and from the next consecutive socket for the
615 next task, in a round-robin fashion across sockets.
616 Tasks requiring more than one CPU will have all of those
617 CPUs allocated on a single socket if possible.
618
619 fcyclic
620 The fcyclic distribution method will distribute allocated
621 CPUs for binding to tasks from consecutive sockets in a
622 round-robin fashion across the sockets. Tasks requiring
                     more than one CPU will have each CPU allocated in a
624 cyclic fashion across sockets.
625
626 Third distribution method (distribution of CPUs across cores for
627 binding):
628
629
630 * Use the default method for distributing CPUs across cores
631 (inherited from second distribution method).
632
633 block The block distribution method will distribute allocated
634 CPUs consecutively from the same core for binding to
635 tasks, before using the next consecutive core.
636
637 cyclic The cyclic distribution method will distribute allocated
638 CPUs for binding to a given task consecutively from the
639 same core, and from the next consecutive core for the
640 next task, in a round-robin fashion across cores.
641
642 fcyclic
643 The fcyclic distribution method will distribute allocated
644 CPUs for binding to tasks from consecutive cores in a
645 round-robin fashion across the cores.
646
647 Optional control for task distribution over nodes:
648
649
              Pack   Rather than distributing a job step's tasks evenly
651 across its allocated nodes, pack them as tightly as pos‐
652 sible on the nodes. This only applies when the "block"
653 task distribution method is used.
654
655 NoPack Rather than packing a job step's tasks as tightly as pos‐
656 sible on the nodes, distribute them evenly. This user
657 option will supersede the SelectTypeParameters
658 CR_Pack_Nodes configuration parameter.
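
              For example, the three-node, four-task cyclic layout described
              above could be requested as (the program name is a placeholder):
                  salloc -N3 -n4 -m cyclic srun ./a.out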
659
660 -x, --exclude=<node_name_list>
661 Explicitly exclude certain nodes from the resources granted to
662 the job.
663
664 --exclusive[={user|mcs}]
665 The job allocation can not share nodes with other running jobs
666 (or just other users with the "=user" option or with the "=mcs"
667 option). If user/mcs are not specified (i.e. the job allocation
668 can not share nodes with other running jobs), the job is allo‐
669 cated all CPUs and GRES on all nodes in the allocation, but is
670 only allocated as much memory as it requested. This is by design
671 to support gang scheduling, because suspended jobs still reside
672 in memory. To request all the memory on a node, use --mem=0.
673 The default shared/exclusive behavior depends on system configu‐
674 ration and the partition's OverSubscribe option takes precedence
675 over the job's option. NOTE: Since shared GRES (MPS) cannot be
676 allocated at the same time as a sharing GRES (GPU) this option
677 only allocates all sharing GRES and no underlying shared GRES.
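
              For example, to obtain whole nodes together with all of their
              memory, --exclusive can be combined with --mem=0 as noted above:
                  salloc -N1 --exclusive --mem=0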
678
679 -B, --extra-node-info=<sockets>[:cores[:threads]]
680 Restrict node selection to nodes with at least the specified
681 number of sockets, cores per socket and/or threads per core.
682 NOTE: These options do not specify the resource allocation size.
683 Each value specified is considered a minimum. An asterisk (*)
684 can be used as a placeholder indicating that all available re‐
685 sources of that type are to be utilized. Values can also be
686 specified as min-max. The individual levels can also be speci‐
687 fied in separate options if desired:
688 --sockets-per-node=<sockets>
689 --cores-per-socket=<cores>
690 --threads-per-core=<threads>
691 If task/affinity plugin is enabled, then specifying an alloca‐
692 tion in this manner also results in subsequently launched tasks
693 being bound to threads if the -B option specifies a thread
694 count, otherwise an option of cores if a core count is speci‐
695 fied, otherwise an option of sockets. If SelectType is config‐
696 ured to select/cons_res, it must have a parameter of CR_Core,
697 CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this option
698 to be honored. If not specified, the scontrol show job will
699 display 'ReqS:C:T=*:*:*'. This option applies to job alloca‐
700 tions.
701 NOTE: This option is mutually exclusive with --hint,
702 --threads-per-core and --ntasks-per-core.
703 NOTE: This option may implicitly set the number of tasks (if -n
704 was not specified) as one task per requested thread.
705
706 --get-user-env[=timeout][mode]
707 This option will load login environment variables for the user
708 specified in the --uid option. The environment variables are
709 retrieved by running something along the lines of "su - <user‐
710 name> -c /usr/bin/env" and parsing the output. Be aware that
711 any environment variables already set in salloc's environment
712 will take precedence over any environment variables in the
713 user's login environment. The optional timeout value is in sec‐
714 onds. Default value is 3 seconds. The optional mode value con‐
715 trols the "su" options. With a mode value of "S", "su" is exe‐
716 cuted without the "-" option. With a mode value of "L", "su" is
717 executed with the "-" option, replicating the login environment.
718 If mode is not specified, the mode established at Slurm build
719 time is used. Examples of use include "--get-user-env",
720 "--get-user-env=10" "--get-user-env=10L", and
721 "--get-user-env=S". NOTE: This option only works if the caller
722 has an effective uid of "root".
723
724 --gid=<group>
725 Submit the job with the specified group's group access permis‐
726 sions. group may be the group name or the numerical group ID.
727 In the default Slurm configuration, this option is only valid
728 when used by the user root.
729
730 --gpu-bind=[verbose,]<type>
731 Bind tasks to specific GPUs. By default every spawned task can
732 access every GPU allocated to the step. If "verbose," is speci‐
733 fied before <type>, then print out GPU binding debug information
734 to the stderr of the tasks. GPU binding is ignored if there is
735 only one task.
736
737 Supported type options:
738
739 closest Bind each task to the GPU(s) which are closest. In a
740 NUMA environment, each task may be bound to more than
741 one GPU (i.e. all GPUs in that NUMA environment).
742
743 map_gpu:<list>
744 Bind by setting GPU masks on tasks (or ranks) as spec‐
745 ified where <list> is
746 <gpu_id_for_task_0>,<gpu_id_for_task_1>,... GPU IDs
747 are interpreted as decimal values. If the number of
748 tasks (or ranks) exceeds the number of elements in
749 this list, elements in the list will be reused as
750 needed starting from the beginning of the list. To
751 simplify support for large task counts, the lists may
752 follow a map with an asterisk and repetition count.
753 For example "map_gpu:0*4,1*4". If the task/cgroup
754 plugin is used and ConstrainDevices is set in
755 cgroup.conf, then the GPU IDs are zero-based indexes
756 relative to the GPUs allocated to the job (e.g. the
757 first GPU is 0, even if the global ID is 3). Other‐
758 wise, the GPU IDs are global IDs, and all GPUs on each
759 node in the job should be allocated for predictable
760 binding results.
761
762 mask_gpu:<list>
763 Bind by setting GPU masks on tasks (or ranks) as spec‐
764 ified where <list> is
765 <gpu_mask_for_task_0>,<gpu_mask_for_task_1>,... The
766 mapping is specified for a node and identical mapping
767 is applied to the tasks on every node (i.e. the lowest
768 task ID on each node is mapped to the first mask spec‐
769 ified in the list, etc.). GPU masks are always inter‐
770 preted as hexadecimal values but can be preceded with
771 an optional '0x'. To simplify support for large task
772 counts, the lists may follow a map with an asterisk
773 and repetition count. For example
774 "mask_gpu:0x0f*4,0xf0*4". If the task/cgroup plugin
775 is used and ConstrainDevices is set in cgroup.conf,
776 then the GPU IDs are zero-based indexes relative to
777 the GPUs allocated to the job (e.g. the first GPU is
778 0, even if the global ID is 3). Otherwise, the GPU IDs
779 are global IDs, and all GPUs on each node in the job
780 should be allocated for predictable binding results.
781
782 none Do not bind tasks to GPUs (turns off binding if
783 --gpus-per-task is requested).
784
785 per_task:<gpus_per_task>
786 Each task will be bound to the number of gpus speci‐
787 fied in <gpus_per_task>. Gpus are assigned in order to
788 tasks. The first task will be assigned the first x
789 number of gpus on the node etc.
790
791 single:<tasks_per_gpu>
792 Like --gpu-bind=closest, except that each task can
793 only be bound to a single GPU, even when it can be
794 bound to multiple GPUs that are equally close. The
795 GPU to bind to is determined by <tasks_per_gpu>, where
796 the first <tasks_per_gpu> tasks are bound to the first
797 GPU available, the second <tasks_per_gpu> tasks are
798 bound to the second GPU available, etc. This is basi‐
799 cally a block distribution of tasks onto available
800 GPUs, where the available GPUs are determined by the
801 socket affinity of the task and the socket affinity of
802 the GPUs as specified in gres.conf's Cores parameter.
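
              As a hypothetical example, binding one GPU to each of four
              tasks on a node and printing the binding information (gpu_app
              is a placeholder):
                  salloc -N1 -n4 --gpus=4 --gpu-bind=verbose,per_task:1 srun ./gpu_app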
803
804 --gpu-freq=[<type]=value>[,<type=value>][,verbose]
805 Request that GPUs allocated to the job are configured with spe‐
806 cific frequency values. This option can be used to indepen‐
807 dently configure the GPU and its memory frequencies. After the
808 job is completed, the frequencies of all affected GPUs will be
809 reset to the highest possible values. In some cases, system
810 power caps may override the requested values. The field type
811 can be "memory". If type is not specified, the GPU frequency is
812 implied. The value field can either be "low", "medium", "high",
813 "highm1" or a numeric value in megahertz (MHz). If the speci‐
814 fied numeric value is not possible, a value as close as possible
815 will be used. See below for definition of the values. The ver‐
816 bose option causes current GPU frequency information to be
817 logged. Examples of use include "--gpu-freq=medium,memory=high"
818 and "--gpu-freq=450".
819
820 Supported value definitions:
821
822 low the lowest available frequency.
823
824 medium attempts to set a frequency in the middle of the
825 available range.
826
827 high the highest available frequency.
828
829 highm1 (high minus one) will select the next highest avail‐
830 able frequency.
831
832 -G, --gpus=[type:]<number>
833 Specify the total number of GPUs required for the job. An op‐
834 tional GPU type specification can be supplied. For example
835 "--gpus=volta:3". Multiple options can be requested in a comma
836 separated list, for example: "--gpus=volta:3,kepler:1". See
837 also the --gpus-per-node, --gpus-per-socket and --gpus-per-task
838 options.
839 NOTE: The allocation has to contain at least one GPU per node.
840
841 --gpus-per-node=[type:]<number>
842 Specify the number of GPUs required for the job on each node in‐
843 cluded in the job's resource allocation. An optional GPU type
844 specification can be supplied. For example
845 "--gpus-per-node=volta:3". Multiple options can be requested in
846 a comma separated list, for example:
847 "--gpus-per-node=volta:3,kepler:1". See also the --gpus,
848 --gpus-per-socket and --gpus-per-task options.
849
850 --gpus-per-socket=[type:]<number>
851 Specify the number of GPUs required for the job on each socket
852 included in the job's resource allocation. An optional GPU type
853 specification can be supplied. For example
854 "--gpus-per-socket=volta:3". Multiple options can be requested
855 in a comma separated list, for example:
856 "--gpus-per-socket=volta:3,kepler:1". Requires job to specify a
857 sockets per node count ( --sockets-per-node). See also the
858 --gpus, --gpus-per-node and --gpus-per-task options.
859
860 --gpus-per-task=[type:]<number>
861 Specify the number of GPUs required for the job on each task to
862 be spawned in the job's resource allocation. An optional GPU
863 type specification can be supplied. For example
864 "--gpus-per-task=volta:1". Multiple options can be requested in
865 a comma separated list, for example:
866 "--gpus-per-task=volta:3,kepler:1". See also the --gpus,
867 --gpus-per-socket and --gpus-per-node options. This option re‐
868 quires an explicit task count, e.g. -n, --ntasks or "--gpus=X
869 --gpus-per-task=Y" rather than an ambiguous range of nodes with
870 -N, --nodes. This option will implicitly set
871 --gpu-bind=per_task:<gpus_per_task>, but that can be overridden
872 with an explicit --gpu-bind specification.
873
874 --gres=<list>
875 Specifies a comma-delimited list of generic consumable re‐
876 sources. The format of each entry on the list is
877 "name[[:type]:count]". The name is that of the consumable re‐
878 source. The count is the number of those resources with a de‐
879 fault value of 1. The count can have a suffix of "k" or "K"
880 (multiple of 1024), "m" or "M" (multiple of 1024 x 1024), "g" or
881 "G" (multiple of 1024 x 1024 x 1024), "t" or "T" (multiple of
882 1024 x 1024 x 1024 x 1024), "p" or "P" (multiple of 1024 x 1024
883 x 1024 x 1024 x 1024). The specified resources will be allo‐
884 cated to the job on each node. The available generic consumable
              resources are configurable by the system administrator. A list
886 of available generic consumable resources will be printed and
887 the command will exit if the option argument is "help". Exam‐
888 ples of use include "--gres=gpu:2", "--gres=gpu:kepler:2", and
889 "--gres=help".
890
891 --gres-flags=<type>
892 Specify generic resource task binding options.
893
894 disable-binding
895 Disable filtering of CPUs with respect to generic re‐
896 source locality. This option is currently required to
897 use more CPUs than are bound to a GRES (i.e. if a GPU is
898 bound to the CPUs on one socket, but resources on more
899 than one socket are required to run the job). This op‐
900 tion may permit a job to be allocated resources sooner
901 than otherwise possible, but may result in lower job per‐
902 formance.
903 NOTE: This option is specific to SelectType=cons_res.
904
905 enforce-binding
906 The only CPUs available to the job will be those bound to
907 the selected GRES (i.e. the CPUs identified in the
908 gres.conf file will be strictly enforced). This option
909 may result in delayed initiation of a job. For example a
910 job requiring two GPUs and one CPU will be delayed until
911 both GPUs on a single socket are available rather than
912 using GPUs bound to separate sockets, however, the appli‐
913 cation performance may be improved due to improved commu‐
914 nication speed. Requires the node to be configured with
915 more than one socket and resource filtering will be per‐
916 formed on a per-socket basis.
917 NOTE: This option is specific to SelectType=cons_tres.
918
919 -h, --help
920 Display help information and exit.
921
922 --hint=<type>
923 Bind tasks according to application hints.
924 NOTE: This option cannot be used in conjunction with
925 --ntasks-per-core, --threads-per-core or -B. If --hint is speci‐
926 fied as a command line argument, it will take precedence over
927 the environment.
928
929 compute_bound
930 Select settings for compute bound applications: use all
931 cores in each socket, one thread per core.
932
933 memory_bound
934 Select settings for memory bound applications: use only
935 one core in each socket, one thread per core.
936
937 [no]multithread
938 [don't] use extra threads with in-core multi-threading
939 which can benefit communication intensive applications.
940 Only supported with the task/affinity plugin.
941
942 help show this help message
943
944 -H, --hold
945 Specify the job is to be submitted in a held state (priority of
946 zero). A held job can now be released using scontrol to reset
947 its priority (e.g. "scontrol release <job_id>").
948
949 -I, --immediate[=<seconds>]
              Exit if resources are not available within the time period spec‐
951 ified. If no argument is given (seconds defaults to 1), re‐
952 sources must be available immediately for the request to suc‐
953 ceed. If defer is configured in SchedulerParameters and sec‐
954 onds=1 the allocation request will fail immediately; defer con‐
955 flicts and takes precedence over this option. By default, --im‐
956 mediate is off, and the command will block until resources be‐
957 come available. Since this option's argument is optional, for
958 proper parsing the single letter option must be followed immedi‐
959 ately with the value and not include a space between them. For
960 example "-I60" and not "-I 60".
961
962 -J, --job-name=<jobname>
963 Specify a name for the job allocation. The specified name will
964 appear along with the job id number when querying running jobs
965 on the system. The default job name is the name of the "com‐
966 mand" specified on the command line.
967
968 -K, --kill-command[=signal]
969 salloc always runs a user-specified command once the allocation
970 is granted. salloc will wait indefinitely for that command to
971 exit. If you specify the --kill-command option salloc will send
972 a signal to your command any time that the Slurm controller
973 tells salloc that its job allocation has been revoked. The job
974 allocation can be revoked for a couple of reasons: someone used
975 scancel to revoke the allocation, or the allocation reached its
976 time limit. If you do not specify a signal name or number and
977 Slurm is configured to signal the spawned command at job termi‐
978 nation, the default signal is SIGHUP for interactive and SIGTERM
979 for non-interactive sessions. Since this option's argument is
980 optional, for proper parsing the single letter option must be
981 followed immediately with the value and not include a space be‐
982 tween them. For example "-K1" and not "-K 1".
983
984 -L, --licenses=<license>[@db][:count][,license[@db][:count]...]
985 Specification of licenses (or other resources available on all
986 nodes of the cluster) which must be allocated to this job. Li‐
987 cense names can be followed by a colon and count (the default
988 count is one). Multiple license names should be comma separated
989 (e.g. "--licenses=foo:4,bar").
990
991 NOTE: When submitting heterogeneous jobs, license requests only
992 work correctly when made on the first component job. For exam‐
993 ple "salloc -L ansys:2 :".
994
995 --mail-type=<type>
996 Notify user by email when certain event types occur. Valid type
997 values are NONE, BEGIN, END, FAIL, REQUEUE, ALL (equivalent to
998 BEGIN, END, FAIL, INVALID_DEPEND, REQUEUE, and STAGE_OUT), IN‐
999 VALID_DEPEND (dependency never satisfied), STAGE_OUT (burst buf‐
1000 fer stage out and teardown completed), TIME_LIMIT, TIME_LIMIT_90
1001 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80
1002 percent of time limit), and TIME_LIMIT_50 (reached 50 percent of
1003 time limit). Multiple type values may be specified in a comma
1004 separated list. The user to be notified is indicated with
1005 --mail-user.
1006
1007 --mail-user=<user>
1008 User to receive email notification of state changes as defined
1009 by --mail-type. The default value is the submitting user.
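
              For example, to be notified by email on completion or failure
              (the address is a placeholder):
                  salloc --mail-type=END,FAIL --mail-user=user@example.com -N1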
1010
1011 --mcs-label=<mcs>
1012 Used only when the mcs/group plugin is enabled. This parameter
1013 is a group among the groups of the user. Default value is cal‐
1014 culated by the Plugin mcs if it's enabled.
1015
1016 --mem=<size>[units]
1017 Specify the real memory required per node. Default units are
1018 megabytes. Different units can be specified using the suffix
1019 [K|M|G|T]. Default value is DefMemPerNode and the maximum value
              is MaxMemPerNode. If configured, both parameters can be seen
1021 using the scontrol show config command. This parameter would
1022 generally be used if whole nodes are allocated to jobs (Select‐
1023 Type=select/linear). Also see --mem-per-cpu and --mem-per-gpu.
1024 The --mem, --mem-per-cpu and --mem-per-gpu options are mutually
1025 exclusive. If --mem, --mem-per-cpu or --mem-per-gpu are speci‐
1026 fied as command line arguments, then they will take precedence
1027 over the environment.
1028
1029 NOTE: A memory size specification of zero is treated as a spe‐
1030 cial case and grants the job access to all of the memory on each
1031 node.
1032
1033 NOTE: Enforcement of memory limits currently relies upon the
1034 task/cgroup plugin or enabling of accounting, which samples mem‐
1035 ory use on a periodic basis (data need not be stored, just col‐
1036 lected). In both cases memory use is based upon the job's Resi‐
1037 dent Set Size (RSS). A task may exceed the memory limit until
1038 the next periodic accounting sample.
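
              For instance, a request for 16 gigabytes of memory on each of
              two nodes (the sizes are arbitrary) might be written as:
                  salloc -N2 --mem=16G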
1039
1040 --mem-bind=[{quiet|verbose},]<type>
1041 Bind tasks to memory. Used only when the task/affinity plugin is
1042 enabled and the NUMA memory functions are available. Note that
1043 the resolution of CPU and memory binding may differ on some ar‐
1044 chitectures. For example, CPU binding may be performed at the
1045 level of the cores within a processor while memory binding will
1046 be performed at the level of nodes, where the definition of
1047 "nodes" may differ from system to system. By default no memory
1048 binding is performed; any task using any CPU can use any memory.
1049 This option is typically used to ensure that each task is bound
1050 to the memory closest to its assigned CPU. The use of any type
1051 other than "none" or "local" is not recommended.
1052
1053 NOTE: To have Slurm always report on the selected memory binding
1054 for all commands executed in a shell, you can enable verbose
1055 mode by setting the SLURM_MEM_BIND environment variable value to
1056 "verbose".
1057
1058 The following informational environment variables are set when
1059 --mem-bind is in use:
1060
1061 SLURM_MEM_BIND_LIST
1062 SLURM_MEM_BIND_PREFER
1063 SLURM_MEM_BIND_SORT
1064 SLURM_MEM_BIND_TYPE
1065 SLURM_MEM_BIND_VERBOSE
1066
1067 See the ENVIRONMENT VARIABLES section for a more detailed de‐
1068 scription of the individual SLURM_MEM_BIND* variables.
1069
1070 Supported options include:
1071
1072 help show this help message
1073
1074 local Use memory local to the processor in use
1075
1076 map_mem:<list>
1077 Bind by setting memory masks on tasks (or ranks) as spec‐
1078 ified where <list> is
1079 <numa_id_for_task_0>,<numa_id_for_task_1>,... The map‐
1080 ping is specified for a node and identical mapping is ap‐
1081 plied to the tasks on every node (i.e. the lowest task ID
1082 on each node is mapped to the first ID specified in the
1083 list, etc.). NUMA IDs are interpreted as decimal values
                     unless they are preceded with '0x', in which case they are
                     interpreted as hexadecimal values. If the number of tasks
1086 (or ranks) exceeds the number of elements in this list,
1087 elements in the list will be reused as needed starting
1088 from the beginning of the list. To simplify support for
1089 large task counts, the lists may follow a map with an as‐
1090 terisk and repetition count. For example
1091 "map_mem:0x0f*4,0xf0*4". For predictable binding re‐
1092 sults, all CPUs for each node in the job should be allo‐
1093 cated to the job.
1094
1095 mask_mem:<list>
1096 Bind by setting memory masks on tasks (or ranks) as spec‐
1097 ified where <list> is
1098 <numa_mask_for_task_0>,<numa_mask_for_task_1>,... The
1099 mapping is specified for a node and identical mapping is
1100 applied to the tasks on every node (i.e. the lowest task
1101 ID on each node is mapped to the first mask specified in
1102 the list, etc.). NUMA masks are always interpreted as
1103 hexadecimal values. Note that masks must be preceded
1104 with a '0x' if they don't begin with [0-9] so they are
1105 seen as numerical values. If the number of tasks (or
1106 ranks) exceeds the number of elements in this list, ele‐
1107 ments in the list will be reused as needed starting from
1108 the beginning of the list. To simplify support for large
1109 task counts, the lists may follow a mask with an asterisk
1110 and repetition count. For example "mask_mem:0*4,1*4".
1111 For predictable binding results, all CPUs for each node
1112 in the job should be allocated to the job.
1113
1114 no[ne] don't bind tasks to memory (default)
1115
1116 p[refer]
1117 Prefer use of first specified NUMA node, but permit
1118 use of other available NUMA nodes.
1119
1120 q[uiet]
1121 quietly bind before task runs (default)
1122
1123 rank bind by task rank (not recommended)
1124
1125 sort sort free cache pages (run zonesort on Intel KNL nodes)
1126
1127 v[erbose]
1128 verbosely report binding before task runs
1129
1130 --mem-per-cpu=<size>[units]
1131 Minimum memory required per usable allocated CPU. Default units
1132 are megabytes. Different units can be specified using the suf‐
1133 fix [K|M|G|T]. The default value is DefMemPerCPU and the maxi‐
1134 mum value is MaxMemPerCPU (see exception below). If configured,
1135 both parameters can be seen using the scontrol show config com‐
1136 mand. Note that if the job's --mem-per-cpu value exceeds the
1137 configured MaxMemPerCPU, then the user's limit will be treated
1138 as a memory limit per task; --mem-per-cpu will be reduced to a
1139 value no larger than MaxMemPerCPU; --cpus-per-task will be set
1140 and the value of --cpus-per-task multiplied by the new
1141 --mem-per-cpu value will equal the original --mem-per-cpu value
1142 specified by the user. This parameter would generally be used
1143 if individual processors are allocated to jobs (SelectType=se‐
1144 lect/cons_res). If resources are allocated by core, socket, or
1145 whole nodes, then the number of CPUs allocated to a job may be
1146 higher than the task count and the value of --mem-per-cpu should
1147 be adjusted accordingly. Also see --mem and --mem-per-gpu. The
1148 --mem, --mem-per-cpu and --mem-per-gpu options are mutually ex‐
1149 clusive.
1150
1151 NOTE: If the final amount of memory requested by a job can't be
1152 satisfied by any of the nodes configured in the partition, the
1153 job will be rejected. This could happen if --mem-per-cpu is
1154 used with the --exclusive option for a job allocation and
1155 --mem-per-cpu times the number of CPUs on a node is greater than
1156 the total memory of that node.
1157
1158 NOTE: This applies to usable allocated CPUs in a job allocation.
1159 This is important when more than one thread per core is config‐
1160 ured. If a job requests --threads-per-core with fewer threads
1161 on a core than exist on the core (or --hint=nomultithread which
1162 implies --threads-per-core=1), the job will be unable to use
1163 those extra threads on the core and those threads will not be
1164 included in the memory per CPU calculation. But if the job has
1165 access to all threads on the core, those threads will be in‐
1166 cluded in the memory per CPU calculation even if the job did not
1167 explicitly request those threads.
1168
1169 In the following examples, each core has two threads.
1170
1171 In this first example, two tasks can run on separate hyper‐
1172 threads in the same core because --threads-per-core is not used.
1173 The third task uses both threads of the second core. The allo‐
1174 cated memory per cpu includes all threads:
1175
1176 $ salloc -n3 --mem-per-cpu=100
1177 salloc: Granted job allocation 17199
1178 $ sacct -j $SLURM_JOB_ID -X -o jobid%7,reqtres%35,alloctres%35
1179 JobID ReqTRES AllocTRES
1180 ------- ----------------------------------- -----------------------------------
1181 17199 billing=3,cpu=3,mem=300M,node=1 billing=4,cpu=4,mem=400M,node=1
1182
1183 In this second example, because of --threads-per-core=1, each
1184 task is allocated an entire core but is only able to use one
1185 thread per core. Allocated CPUs includes all threads on each
1186 core. However, allocated memory per cpu includes only the usable
1187 thread in each core.
1188
1189 $ salloc -n3 --mem-per-cpu=100 --threads-per-core=1
1190 salloc: Granted job allocation 17200
1191 $ sacct -j $SLURM_JOB_ID -X -o jobid%7,reqtres%35,alloctres%35
1192 JobID ReqTRES AllocTRES
1193 ------- ----------------------------------- -----------------------------------
1194 17200 billing=3,cpu=3,mem=300M,node=1 billing=6,cpu=6,mem=300M,node=1
1195
1196 --mem-per-gpu=<size>[units]
1197 Minimum memory required per allocated GPU. Default units are
1198 megabytes. Different units can be specified using the suffix
1199 [K|M|G|T]. Default value is DefMemPerGPU and is available on
1200 both a global and per partition basis. If configured, the pa‐
1201 rameters can be seen using the scontrol show config and scontrol
1202 show partition commands. Also see --mem. The --mem,
1203 --mem-per-cpu and --mem-per-gpu options are mutually exclusive.
1204
1205 --mincpus=<n>
1206 Specify a minimum number of logical cpus/processors per node.
1207
1208 --network=<type>
1209 Specify information pertaining to the switch or network. The
1210 interpretation of type is system dependent. This option is sup‐
1211 ported when running Slurm on a Cray natively. It is used to re‐
1212 quest using Network Performance Counters. Only one value per
              request is valid. All options are case-insensitive. In this
1214 configuration supported values include:
1215
1216 system
1217 Use the system-wide network performance counters. Only
1218 nodes requested will be marked in use for the job alloca‐
                     tion. If the job does not fill up the entire system, the
                     rest of the nodes cannot be used by other jobs using NPC;
                     if idle, their state will appear as PerfCnts.
1222 These nodes are still available for other jobs not using
1223 NPC.
1224
1225 blade Use the blade network performance counters. Only nodes re‐
1226 quested will be marked in use for the job allocation. If
1227 the job does not fill up the entire blade(s) allocated to
1228                 the job, those blade(s) cannot be used by other jobs
1229                 using NPC; if idle, their state will appear as PerfC‐
1230                 nts.  These nodes are still available for other jobs not
1231 using NPC.
1232
1233 In all cases the job allocation request must specify the --ex‐
1234 clusive option. Otherwise the request will be denied.
1235
1236 Also with any of these options steps are not allowed to share
1237 blades, so resources would remain idle inside an allocation if
1238 the step running on a blade does not take up all the nodes on
1239 the blade.
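
          As an illustrative sketch on such a Cray system, blade-level
          network performance counters could be requested as follows
          (the node count is arbitrary):

              $ salloc -N2 --exclusive --network=blade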
1240
1241 --nice[=adjustment]
1242 Run the job with an adjusted scheduling priority within Slurm.
1243          With no adjustment value the scheduling priority is decreased
1244          by 100.  A negative nice value increases the priority, while a
1245          positive value decreases it.  The adjustment range is +/-
1246          2147483645.  Only privileged users can specify a negative adjustment.
1247
1248 --no-bell
1249 Silence salloc's use of the terminal bell. Also see the option
1250 --bell.
1251
1252 -k, --no-kill[=off]
1253 Do not automatically terminate a job if one of the nodes it has
1254 been allocated fails. The user will assume the responsibilities
1255 for fault-tolerance should a node fail. The job allocation will
1256 not be revoked so the user may launch new job steps on the re‐
1257 maining nodes in their allocation. This option does not set the
1258 SLURM_NO_KILL environment variable. Therefore, when a node
1259 fails, steps running on that node will be killed unless the
1260 SLURM_NO_KILL environment variable was explicitly set or srun
1261 calls within the job allocation explicitly requested --no-kill.
1262
1263 Specify an optional argument of "off" to disable the effect of
1264 the SALLOC_NO_KILL environment variable.
1265
1266 By default Slurm terminates the entire job allocation if any
1267 node fails in its range of allocated nodes.
1268
1269 --no-shell
1270          Immediately exit after allocating resources, without running a
1271          command.  The Slurm job will still be created and will own the
1272          allocated resources for as long as it remains active.  You will
1273          have a Slurm job id with no associated pro‐
1274 cesses or tasks. You can submit srun commands against this re‐
1275 source allocation, if you specify the --jobid= option with the
1276 job id of this Slurm job. Or, this can be used to temporarily
1277 reserve a set of resources so that other jobs cannot use them
1278 for some period of time. (Note that the Slurm job is subject to
1279 the normal constraints on jobs, including time limits, so that
1280 eventually the job will terminate and the resources will be
1281 freed, or you can terminate the job manually using the scancel
1282 command.)
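
          For example, the following illustrative sequence reserves one
          node for 30 minutes without running a command, launches a job
          step against the allocation, and finally releases it (the job
          id shown is made up):

              $ salloc --no-shell -N1 -t 30
              salloc: Granted job allocation 65541
              $ srun --jobid=65541 hostname
              $ scancel 65541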
1283
1284 -F, --nodefile=<node_file>
1285 Much like --nodelist, but the list is contained in a file of
1286          name node_file.  The node names of the list may also span multi‐
1287 ple lines in the file. Duplicate node names in the file will
1288 be ignored. The order of the node names in the list is not im‐
1289 portant; the node names will be sorted by Slurm.
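
          A node file is simply a list of node names, for example (the
          file name and node names are illustrative):

              $ cat my_nodes
              node10
              node11
              node12
              $ salloc --nodefile=my_nodes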
1290
1291 -w, --nodelist=<node_name_list>
1292 Request a specific list of hosts. The job will contain all of
1293 these hosts and possibly additional hosts as needed to satisfy
1294 resource requirements. The list may be specified as a
1295 comma-separated list of hosts, a range of hosts (host[1-5,7,...]
1296 for example), or a filename. The host list will be assumed to
1297 be a filename if it contains a "/" character. If you specify a
1298 minimum node or processor count larger than can be satisfied by
1299 the supplied host list, additional resources will be allocated
1300 on other nodes as needed. Duplicate node names in the list will
1301 be ignored. The order of the node names in the list is not im‐
1302 portant; the node names will be sorted by Slurm.
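
          For example, to request eight tasks on an allocation that must
          include three specific hosts (host names are illustrative):

              $ salloc -n8 -w "node[1-2],node7"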
1303
1304 -N, --nodes=<minnodes>[-maxnodes]
1305 Request that a minimum of minnodes nodes be allocated to this
1306 job. A maximum node count may also be specified with maxnodes.
1307 If only one number is specified, this is used as both the mini‐
1308 mum and maximum node count. The partition's node limits super‐
1309 sede those of the job. If a job's node limits are outside of
1310 the range permitted for its associated partition, the job will
1311 be left in a PENDING state. This permits possible execution at
1312 a later time, when the partition limit is changed. If a job
1313 node limit exceeds the number of nodes configured in the parti‐
1314 tion, the job will be rejected. Note that the environment vari‐
1315 able SLURM_JOB_NUM_NODES will be set to the count of nodes actu‐
1316 ally allocated to the job. See the ENVIRONMENT VARIABLES sec‐
1317 tion for more information. If -N is not specified, the default
1318 behavior is to allocate enough nodes to satisfy the requested
1319 resources as expressed by per-job specification options, e.g.
1320 -n, -c and --gpus. The job will be allocated as many nodes as
1321 possible within the range specified and without delaying the
1322 initiation of the job. The node count specification may include
1323 a numeric value followed by a suffix of "k" (multiplies numeric
1324 value by 1,024) or "m" (multiplies numeric value by 1,048,576).
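
          For example, to request anywhere from two to four nodes and let
          Slurm pick the count that allows the earliest start:

              $ salloc -N 2-4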
1325
1326 -n, --ntasks=<number>
1327          salloc does not launch tasks; it requests an allocation of re‐
1328          sources and executes some command.  This option advises the Slurm
1329 controller that job steps run within this allocation will launch
1330 a maximum of number tasks and sufficient resources are allocated
1331 to accomplish this. The default is one task per node, but note
1332 that the --cpus-per-task option will change this default.
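
          For example, the following illustrative allocation requests
          enough resources for eight tasks with four CPUs each and then
          launches them with srun (./myprogram is a placeholder):

              $ salloc -n8 -c4
              $ srun -n8 -c4 ./myprogram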
1333
1334 --ntasks-per-core=<ntasks>
1335 Request the maximum ntasks be invoked on each core. Meant to be
1336 used with the --ntasks option. Related to --ntasks-per-node ex‐
1337 cept at the core level instead of the node level. NOTE: This
1338 option is not supported when using SelectType=select/linear.
1339
1340 --ntasks-per-gpu=<ntasks>
1341 Request that there are ntasks tasks invoked for every GPU. This
1342 option can work in two ways: 1) either specify --ntasks in addi‐
1343 tion, in which case a type-less GPU specification will be auto‐
1344 matically determined to satisfy --ntasks-per-gpu, or 2) specify
1345 the GPUs wanted (e.g. via --gpus or --gres) without specifying
1346 --ntasks, and the total task count will be automatically deter‐
1347 mined. The number of CPUs needed will be automatically in‐
1348 creased if necessary to allow for any calculated task count.
1349 This option will implicitly set --gpu-bind=single:<ntasks>, but
1350 that can be overridden with an explicit --gpu-bind specifica‐
1351 tion. This option is not compatible with a node range (i.e.
1352 -N<minnodes-maxnodes>). This option is not compatible with
1353 --gpus-per-task, --gpus-per-socket, or --ntasks-per-node. This
1354 option is not supported unless SelectType=cons_tres is config‐
1355 ured (either directly or indirectly on Cray systems).
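
          As an illustration of the second form, requesting two GPUs with
          two tasks per GPU results in a total of four tasks:

              $ salloc --gpus=2 --ntasks-per-gpu=2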
1356
1357 --ntasks-per-node=<ntasks>
1358 Request that ntasks be invoked on each node. If used with the
1359 --ntasks option, the --ntasks option will take precedence and
1360 the --ntasks-per-node will be treated as a maximum count of
1361 tasks per node. Meant to be used with the --nodes option. This
1362 is related to --cpus-per-task=ncpus, but does not require knowl‐
1363 edge of the actual number of cpus on each node. In some cases,
1364 it is more convenient to be able to request that no more than a
1365 specific number of tasks be invoked on each node. Examples of
1366 this include submitting a hybrid MPI/OpenMP app where only one
1367 MPI "task/rank" should be assigned to each node while allowing
1368 the OpenMP portion to utilize all of the parallelism present in
1369 the node, or submitting a single setup/cleanup/monitoring job to
1370 each node of a pre-existing allocation as one step in a larger
1371 job script.
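
          For example, a hybrid MPI/OpenMP job might place a single task
          on each of four nodes and leave the remaining CPUs for OpenMP
          threads (the CPU count per task is illustrative):

              $ salloc -N4 --ntasks-per-node=1 --cpus-per-task=16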
1372
1373 --ntasks-per-socket=<ntasks>
1374 Request the maximum ntasks be invoked on each socket. Meant to
1375 be used with the --ntasks option. Related to --ntasks-per-node
1376 except at the socket level instead of the node level. NOTE:
1377 This option is not supported when using SelectType=select/lin‐
1378 ear.
1379
1380 -O, --overcommit
1381 Overcommit resources.
1382
1383 When applied to a job allocation (not including jobs requesting
1384 exclusive access to the nodes) the resources are allocated as if
1385 only one task per node is requested. This means that the re‐
1386          quested number of cpus per task (-c, --cpus-per-task) is allo‐
1387 cated per node rather than being multiplied by the number of
1388 tasks. Options used to specify the number of tasks per node,
1389 socket, core, etc. are ignored.
1390
1391 When applied to job step allocations (the srun command when exe‐
1392 cuted within an existing job allocation), this option can be
1393 used to launch more than one task per CPU. Normally, srun will
1394 not allocate more than one process per CPU. By specifying
1395 --overcommit you are explicitly allowing more than one process
1396 per CPU. However no more than MAX_TASKS_PER_NODE tasks are per‐
1397 mitted to execute per node. NOTE: MAX_TASKS_PER_NODE is defined
1398 in the file slurm.h and is not a variable, it is set at Slurm
1399 build time.
1400
1401 -s, --oversubscribe
1402 The job allocation can over-subscribe resources with other run‐
1403 ning jobs. The resources to be over-subscribed can be nodes,
1404 sockets, cores, and/or hyperthreads depending upon configura‐
1405 tion. The default over-subscribe behavior depends on system
1406 configuration and the partition's OverSubscribe option takes
1407 precedence over the job's option. This option may result in the
1408 allocation being granted sooner than if the --oversubscribe op‐
1409 tion was not set and allow higher system utilization, but appli‐
1410 cation performance will likely suffer due to competition for re‐
1411 sources. Also see the --exclusive option.
1412
1413 -p, --partition=<partition_names>
1414 Request a specific partition for the resource allocation. If
1415 not specified, the default behavior is to allow the slurm con‐
1416 troller to select the default partition as designated by the
1417 system administrator. If the job can use more than one parti‐
1418          tion, specify their names in a comma separated list and the one
1419 offering earliest initiation will be used with no regard given
1420 to the partition name ordering (although higher priority parti‐
1421 tions will be considered first). When the job is initiated, the
1422 name of the partition used will be placed first in the job
1423 record partition string.
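
          For example, to let the job run in whichever of two partitions
          can start it first (partition names are illustrative):

              $ salloc -N1 -p debug,batch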
1424
1425 --power=<flags>
1426 Comma separated list of power management plugin options. Cur‐
1427 rently available flags include: level (all nodes allocated to
1428 the job should have identical power caps, may be disabled by the
1429 Slurm configuration option PowerParameters=job_no_level).
1430
1431 --prefer=<list>
1432 Nodes can have features assigned to them by the Slurm adminis‐
1433 trator. Users can specify which of these features are desired
1434 but not required by their job using the prefer option. This op‐
1435 tion operates independently from --constraint and will override
1436          whatever is set there if possible.  When scheduling, the features
1437          in --prefer are tried first; if a node set is not available with
1438          those features, then --constraint is attempted.  See --constraint
1439          for more information; this option behaves the same way.
1440
1441
1442 --priority=<value>
1443 Request a specific job priority. May be subject to configura‐
1444 tion specific constraints. value should either be a numeric
1445 value or "TOP" (for highest possible value). Only Slurm opera‐
1446 tors and administrators can set the priority of a job.
1447
1448 --profile={all|none|<type>[,<type>...]}
1449 Enables detailed data collection by the acct_gather_profile
1450 plugin. Detailed data are typically time-series that are stored
1451 in an HDF5 file for the job or an InfluxDB database depending on
1452 the configured plugin.
1453
1454 All All data types are collected. (Cannot be combined with
1455 other values.)
1456
1457 None No data types are collected. This is the default.
1458 (Cannot be combined with other values.)
1459
1460 Valid type values are:
1461
1462 Energy Energy data is collected.
1463
1464 Task Task (I/O, Memory, ...) data is collected.
1465
1466 Lustre Lustre data is collected.
1467
1468 Network
1469 Network (InfiniBand) data is collected.
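
          For example, to collect task and energy time-series for the job
          (assuming an acct_gather_profile plugin is configured):

              $ salloc -n4 --profile=task,energy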
1470
1471 -q, --qos=<qos>
1472 Request a quality of service for the job. QOS values can be de‐
1473 fined for each user/cluster/account association in the Slurm
1474 database. Users will be limited to their association's defined
1475 set of qos's when the Slurm configuration parameter, Account‐
1476 ingStorageEnforce, includes "qos" in its definition.
1477
1478 -Q, --quiet
1479 Suppress informational messages from salloc. Errors will still
1480 be displayed.
1481
1482 --reboot
1483 Force the allocated nodes to reboot before starting the job.
1484 This is only supported with some system configurations and will
1485 otherwise be silently ignored. Only root, SlurmUser or admins
1486 can reboot nodes.
1487
1488 --reservation=<reservation_names>
1489 Allocate resources for the job from the named reservation. If
1490 the job can use more than one reservation, specify their names
1491          in a comma separated list and the one offering the earliest
1492          initiation will be used.  Each reservation will be considered in
1493          the order it was requested.  All reservations will be listed in
1494          scontrol/squeue through the life of the job.  In accounting, the
1495          first reservation will be seen; after the job starts, the reser‐
1496          vation actually used will replace it.
1497
1498 --signal=[R:]<sig_num>[@sig_time]
1499 When a job is within sig_time seconds of its end time, send it
1500 the signal sig_num. Due to the resolution of event handling by
1501 Slurm, the signal may be sent up to 60 seconds earlier than
1502 specified. sig_num may either be a signal number or name (e.g.
1503 "10" or "USR1"). sig_time must have an integer value between 0
1504 and 65535. By default, no signal is sent before the job's end
1505 time. If a sig_num is specified without any sig_time, the de‐
1506 fault time will be 60 seconds. Use the "R:" option to allow
1507 this job to overlap with a reservation with MaxStartDelay set.
1508 To have the signal sent at preemption time see the pre‐
1509 empt_send_user_signal SlurmctldParameter.
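
          For example, to have Slurm deliver SIGUSR1 to the job roughly
          ten minutes before the time limit expires:

              $ salloc -t 2:00:00 --signal=USR1@600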
1510
1511 --sockets-per-node=<sockets>
1512 Restrict node selection to nodes with at least the specified
1513 number of sockets. See additional information under -B option
1514 above when task/affinity plugin is enabled.
1515 NOTE: This option may implicitly set the number of tasks (if -n
1516 was not specified) as one task per requested thread.
1517
1518 --spread-job
1519 Spread the job allocation over as many nodes as possible and at‐
1520 tempt to evenly distribute tasks across the allocated nodes.
1521 This option disables the topology/tree plugin.
1522
1523 --switches=<count>[@max-time]
1524 When a tree topology is used, this defines the maximum count of
1525 leaf switches desired for the job allocation and optionally the
1526 maximum time to wait for that number of switches. If Slurm finds
1527 an allocation containing more switches than the count specified,
1528 the job remains pending until it either finds an allocation with
1529          desired switch count or the time limit expires.  If there is no
1530 switch count limit, there is no delay in starting the job. Ac‐
1531 ceptable time formats include "minutes", "minutes:seconds",
1532 "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
1533 "days-hours:minutes:seconds". The job's maximum time delay may
1534 be limited by the system administrator using the SchedulerParam‐
1535 eters configuration parameter with the max_switch_wait parameter
1536 option. On a dragonfly network the only switch count supported
1537 is 1 since communication performance will be highest when a job
1538          is allocated resources on one leaf switch or more than 2 leaf
1539          switches.  The default max-time is the max_switch_wait Sched‐
1540          ulerParameters option.
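
          For example, to prefer an allocation confined to a single leaf
          switch but accept any layout after waiting at most 60 minutes:

              $ salloc -N8 --switches=1@60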
1541
1542 --thread-spec=<num>
1543 Count of specialized threads per node reserved by the job for
1544 system operations and not used by the application. The applica‐
1545 tion will not use these threads, but will be charged for their
1546 allocation. This option can not be used with the --core-spec
1547 option.
1548
1549 NOTE: Explicitly setting a job's specialized thread value im‐
1550 plicitly sets its --exclusive option, reserving entire nodes for
1551 the job.
1552
1553 --threads-per-core=<threads>
1554 Restrict node selection to nodes with at least the specified
1555 number of threads per core. In task layout, use the specified
1556 maximum number of threads per core. NOTE: "Threads" refers to
1557 the number of processing units on each core rather than the num‐
1558 ber of application tasks to be launched per core. See addi‐
1559 tional information under -B option above when task/affinity
1560 plugin is enabled.
1561 NOTE: This option may implicitly set the number of tasks (if -n
1562 was not specified) as one task per requested thread.
1563
1564 -t, --time=<time>
1565 Set a limit on the total run time of the job allocation. If the
1566 requested time limit exceeds the partition's time limit, the job
1567 will be left in a PENDING state (possibly indefinitely). The
1568 default time limit is the partition's default time limit. When
1569 the time limit is reached, each task in each job step is sent
1570 SIGTERM followed by SIGKILL. The interval between signals is
1571 specified by the Slurm configuration parameter KillWait. The
1572 OverTimeLimit configuration parameter may permit the job to run
1573 longer than scheduled. Time resolution is one minute and second
1574 values are rounded up to the next minute.
1575
1576 A time limit of zero requests that no time limit be imposed.
1577 Acceptable time formats include "minutes", "minutes:seconds",
1578 "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
1579 "days-hours:minutes:seconds".
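
          For example, to request a limit of one day and six hours using
          the days-hours format:

              $ salloc -N2 -t 1-6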
1580
1581 --time-min=<time>
1582 Set a minimum time limit on the job allocation. If specified,
1583 the job may have its --time limit lowered to a value no lower
1584 than --time-min if doing so permits the job to begin execution
1585 earlier than otherwise possible. The job's time limit will not
1586 be changed after the job is allocated resources. This is per‐
1587 formed by a backfill scheduling algorithm to allocate resources
1588 otherwise reserved for higher priority jobs. Acceptable time
1589 formats include "minutes", "minutes:seconds", "hours:min‐
1590 utes:seconds", "days-hours", "days-hours:minutes" and
1591 "days-hours:minutes:seconds".
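
          For example, a job that would prefer ten hours but can still
          make useful progress with at least four might request:

              $ salloc -t 10:00:00 --time-min=4:00:00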
1592
1593 --tmp=<size>[units]
1594 Specify a minimum amount of temporary disk space per node. De‐
1595 fault units are megabytes. Different units can be specified us‐
1596 ing the suffix [K|M|G|T].
1597
1598 --uid=<user>
1599 Attempt to submit and/or run a job as user instead of the invok‐
1600 ing user id. The invoking user's credentials will be used to
1601 check access permissions for the target partition. This option
1602          is only valid for user root.  User root may use this option to
1603          run jobs as a normal user in a RootOnly partition, for example.
1604          If run as root, salloc will drop
1605 its permissions to the uid specified after node allocation is
1606 successful. user may be the user name or numerical user ID.
1607
1608 --usage
1609 Display brief help message and exit.
1610
1611 --use-min-nodes
1612 If a range of node counts is given, prefer the smaller count.
1613
1614 -v, --verbose
1615 Increase the verbosity of salloc's informational messages. Mul‐
1616 tiple -v's will further increase salloc's verbosity. By default
1617 only errors will be displayed.
1618
1619 -V, --version
1620 Display version information and exit.
1621
1622 --wait-all-nodes=<value>
1623 Controls when the execution of the command begins with respect
1624 to when nodes are ready for use (i.e. booted). By default, the
1625 salloc command will return as soon as the allocation is made.
1626 This default can be altered using the salloc_wait_nodes option
1627 to the SchedulerParameters parameter in the slurm.conf file.
1628
1629 0 Begin execution as soon as allocation can be made. Do not
1630 wait for all nodes to be ready for use (i.e. booted).
1631
1632 1 Do not begin execution until all nodes are ready for use.
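
          For example, to delay the start of the command until every
          allocated node has finished booting:

              $ salloc -N4 --wait-all-nodes=1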
1633
1634 --wckey=<wckey>
1635 Specify wckey to be used with job. If TrackWCKey=no (default)
1636 in the slurm.conf this value is ignored.
1637
1638 --x11[={all|first|last}]
1639 Sets up X11 forwarding on "all", "first" or "last" node(s) of
1640 the allocation. This option is only enabled if Slurm was com‐
1641 piled with X11 support and PrologFlags=x11 is defined in the
1642 slurm.conf. Default is "all".
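
          As one possible pattern (assuming Slurm was built with X11
          support and PrologFlags=x11 is set, and using xclock purely as
          an illustrative X client), forwarding can be set up for the
          allocation and then used by a graphical program launched with
          srun:

              $ salloc -N1 --x11
              $ srun xclock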
1643
1644 PERFORMANCE
1645 Executing salloc sends a remote procedure call to slurmctld. If enough
1646 calls from salloc or other Slurm client commands that send remote pro‐
1647 cedure calls to the slurmctld daemon come in at once, it can result in
1648 a degradation of performance of the slurmctld daemon, possibly result‐
1649 ing in a denial of service.
1650
1651 Do not run salloc or other Slurm client commands that send remote pro‐
1652 cedure calls to slurmctld from loops in shell scripts or other pro‐
1653 grams. Ensure that programs limit calls to salloc to the minimum neces‐
1654 sary for the information you are trying to gather.
1655
1656
1657 INPUT ENVIRONMENT VARIABLES
1658 Upon startup, salloc will read and handle the options set in the fol‐
1659 lowing environment variables. The majority of these variables are set
1660 the same way the options are set, as defined above. For flag options
1661 that are defined to expect no argument, the option can be enabled by
1662 setting the environment variable without a value (empty or NULL
1663 string), the string 'yes', or a non-zero number. Any other value for
1664 the environment variable will result in the option not being set.
1665 There are a couple exceptions to these rules that are noted below.
1666 NOTE: Command line options always override environment variables set‐
1667 tings.
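
       For example, a flag option such as --exclusive can be enabled from
       the environment; a command line option would still take precedence:

           $ export SALLOC_EXCLUSIVE=yes
           $ salloc -N1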
1668
1669
1670 SALLOC_ACCOUNT Same as -A, --account
1671
1672 SALLOC_ACCTG_FREQ Same as --acctg-freq
1673
1674 SALLOC_BELL Same as --bell
1675
1676 SALLOC_BURST_BUFFER Same as --bb
1677
1678 SALLOC_CLUSTERS or SLURM_CLUSTERS
1679 Same as --clusters
1680
1681 SALLOC_CONSTRAINT Same as -C, --constraint
1682
1683 SALLOC_CONTAINER Same as --container.
1684
1685 SALLOC_CORE_SPEC Same as --core-spec
1686
1687 SALLOC_CPUS_PER_GPU Same as --cpus-per-gpu
1688
1689 SALLOC_DEBUG Same as -v, --verbose. Must be set to 0 or 1 to
1690 disable or enable the option.
1691
1692 SALLOC_DELAY_BOOT Same as --delay-boot
1693
1694 SALLOC_EXCLUSIVE Same as --exclusive
1695
1696 SALLOC_GPU_BIND Same as --gpu-bind
1697
1698 SALLOC_GPU_FREQ Same as --gpu-freq
1699
1700 SALLOC_GPUS Same as -G, --gpus
1701
1702 SALLOC_GPUS_PER_NODE Same as --gpus-per-node
1703
1704 SALLOC_GPUS_PER_TASK Same as --gpus-per-task
1705
1706 SALLOC_GRES Same as --gres
1707
1708 SALLOC_GRES_FLAGS Same as --gres-flags
1709
1710 SALLOC_HINT or SLURM_HINT
1711 Same as --hint
1712
1713 SALLOC_IMMEDIATE Same as -I, --immediate
1714
1715 SALLOC_KILL_CMD Same as -K, --kill-command
1716
1717 SALLOC_MEM_BIND Same as --mem-bind
1718
1719 SALLOC_MEM_PER_CPU Same as --mem-per-cpu
1720
1721 SALLOC_MEM_PER_GPU Same as --mem-per-gpu
1722
1723 SALLOC_MEM_PER_NODE Same as --mem
1724
1725 SALLOC_NETWORK Same as --network
1726
1727 SALLOC_NO_BELL Same as --no-bell
1728
1729 SALLOC_NO_KILL Same as -k, --no-kill
1730
1731 SALLOC_OVERCOMMIT Same as -O, --overcommit
1732
1733 SALLOC_PARTITION Same as -p, --partition
1734
1735 SALLOC_POWER Same as --power
1736
1737 SALLOC_PROFILE Same as --profile
1738
1739 SALLOC_QOS Same as --qos
1740
1741 SALLOC_REQ_SWITCH When a tree topology is used, this defines the
1742 maximum count of switches desired for the job al‐
1743 location and optionally the maximum time to wait
1744 for that number of switches. See --switches.
1745
1746 SALLOC_RESERVATION Same as --reservation
1747
1748 SALLOC_SIGNAL Same as --signal
1749
1750 SALLOC_SPREAD_JOB Same as --spread-job
1751
1752 SALLOC_THREAD_SPEC Same as --thread-spec
1753
1754 SALLOC_THREADS_PER_CORE
1755 Same as --threads-per-core
1756
1757 SALLOC_TIMELIMIT Same as -t, --time
1758
1759 SALLOC_USE_MIN_NODES Same as --use-min-nodes
1760
1761 SALLOC_WAIT_ALL_NODES Same as --wait-all-nodes. Must be set to 0 or 1
1762 to disable or enable the option.
1763
1764 SALLOC_WAIT4SWITCH Max time waiting for requested switches. See
1765 --switches
1766
1767 SALLOC_WCKEY Same as --wckey
1768
1769 SLURM_CONF The location of the Slurm configuration file.
1770
1771 SLURM_EXIT_ERROR Specifies the exit code generated when a Slurm
1772 error occurs (e.g. invalid options). This can be
1773 used by a script to distinguish application exit
1774 codes from various Slurm error conditions. Also
1775 see SLURM_EXIT_IMMEDIATE.
1776
1777 SLURM_EXIT_IMMEDIATE Specifies the exit code generated when the --im‐
1778 mediate option is used and resources are not cur‐
1779 rently available. This can be used by a script
1780 to distinguish application exit codes from vari‐
1781 ous Slurm error conditions. Also see
1782 SLURM_EXIT_ERROR.
1783
1784 OUTPUT ENVIRONMENT VARIABLES
1785 salloc will set the following environment variables in the environment
1786 of the executed program:
1787
1788 SLURM_*_HET_GROUP_#
1789 For a heterogeneous job allocation, the environment variables
1790 are set separately for each component.
1791
1792 SLURM_CLUSTER_NAME
1793 Name of the cluster on which the job is executing.
1794
1795 SLURM_CONTAINER
1796 OCI Bundle for job. Only set if --container is specified.
1797
1798 SLURM_CPUS_PER_GPU
1799 Number of CPUs requested per allocated GPU. Only set if the
1800 --cpus-per-gpu option is specified.
1801
1802 SLURM_CPUS_PER_TASK
1803 Number of CPUs requested per task. Only set if the
1804 --cpus-per-task option is specified.
1805
1806 SLURM_DIST_PLANESIZE
1807 Plane distribution size. Only set for plane distributions. See
1808 -m, --distribution.
1809
1810 SLURM_DISTRIBUTION
1811 Only set if the -m, --distribution option is specified.
1812
1813 SLURM_GPU_BIND
1814 Requested binding of tasks to GPU. Only set if the --gpu-bind
1815 option is specified.
1816
1817 SLURM_GPU_FREQ
1818 Requested GPU frequency. Only set if the --gpu-freq option is
1819 specified.
1820
1821 SLURM_GPUS
1822 Number of GPUs requested. Only set if the -G, --gpus option is
1823 specified.
1824
1825 SLURM_GPUS_PER_NODE
1826 Requested GPU count per allocated node. Only set if the
1827 --gpus-per-node option is specified.
1828
1829 SLURM_GPUS_PER_SOCKET
1830 Requested GPU count per allocated socket. Only set if the
1831 --gpus-per-socket option is specified.
1832
1833 SLURM_GPUS_PER_TASK
1834 Requested GPU count per allocated task. Only set if the
1835 --gpus-per-task option is specified.
1836
1837 SLURM_HET_SIZE
1838 Set to count of components in heterogeneous job.
1839
1840 SLURM_JOB_ACCOUNT
1841          Account name associated with the job allocation.
1842
1843 SLURM_JOB_ID
1844 The ID of the job allocation.
1845
1846 SLURM_JOB_CPUS_PER_NODE
1847 Count of CPUs available to the job on the nodes in the alloca‐
1848 tion, using the format CPU_count[(xnumber_of_nodes)][,CPU_count
1849 [(xnumber_of_nodes)] ...]. For example:
1850 SLURM_JOB_CPUS_PER_NODE='72(x2),36' indicates that on the first
1851 and second nodes (as listed by SLURM_JOB_NODELIST) the alloca‐
1852 tion has 72 CPUs, while the third node has 36 CPUs. NOTE: The
1853 select/linear plugin allocates entire nodes to jobs, so the
1854 value indicates the total count of CPUs on allocated nodes. The
1855 select/cons_res and select/cons_tres plugins allocate individual
1856 CPUs to jobs, so this number indicates the number of CPUs allo‐
1857 cated to the job.
1858
1859 SLURM_JOB_GPUS
1860 The global GPU IDs of the GPUs allocated to this job. The GPU
1861 IDs are not relative to any device cgroup, even if devices are
1862 constrained with task/cgroup. Only set in batch and interactive
1863 jobs.
1864
1865 SLURM_JOB_NODELIST
1866 List of nodes allocated to the job.
1867
1868 SLURM_JOB_NUM_NODES
1869 Total number of nodes in the job allocation.
1870
1871 SLURM_JOB_PARTITION
1872 Name of the partition in which the job is running.
1873
1874 SLURM_JOB_QOS
1875 Quality Of Service (QOS) of the job allocation.
1876
1877 SLURM_JOB_RESERVATION
1878 Advanced reservation containing the job allocation, if any.
1879
1880 SLURM_JOBID
1881 The ID of the job allocation. See SLURM_JOB_ID. Included for
1882 backwards compatibility.
1883
1884 SLURM_MEM_BIND
1885 Set to value of the --mem-bind option.
1886
1887 SLURM_MEM_BIND_LIST
1888 Set to bit mask used for memory binding.
1889
1890 SLURM_MEM_BIND_PREFER
1891 Set to "prefer" if the --mem-bind option includes the prefer op‐
1892 tion.
1893
1894 SLURM_MEM_BIND_SORT
1895 Sort free cache pages (run zonesort on Intel KNL nodes)
1896
1897 SLURM_MEM_BIND_TYPE
1898 Set to the memory binding type specified with the --mem-bind op‐
1899 tion. Possible values are "none", "rank", "map_map", "mask_mem"
1900 and "local".
1901
1902 SLURM_MEM_BIND_VERBOSE
1903 Set to "verbose" if the --mem-bind option includes the verbose
1904 option. Set to "quiet" otherwise.
1905
1906 SLURM_MEM_PER_CPU
1907 Same as --mem-per-cpu
1908
1909 SLURM_MEM_PER_GPU
1910 Requested memory per allocated GPU. Only set if the
1911 --mem-per-gpu option is specified.
1912
1913 SLURM_MEM_PER_NODE
1914 Same as --mem
1915
1916 SLURM_NNODES
1917 Total number of nodes in the job allocation. See
1918 SLURM_JOB_NUM_NODES. Included for backwards compatibility.
1919
1920 SLURM_NODELIST
1921 List of nodes allocated to the job. See SLURM_JOB_NODELIST. In‐
1922          cluded for backwards compatibility.
1923
1924 SLURM_NODE_ALIASES
1925 Sets of node name, communication address and hostname for nodes
1926          allocated to the job from the cloud.  Each element in the set is
1927 colon separated and each set is comma separated. For example:
1928 SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
1929
1930 SLURM_NTASKS
1931 Same as -n, --ntasks
1932
1933 SLURM_NTASKS_PER_CORE
1934 Set to value of the --ntasks-per-core option, if specified.
1935
1936 SLURM_NTASKS_PER_GPU
1937 Set to value of the --ntasks-per-gpu option, if specified.
1938
1939 SLURM_NTASKS_PER_NODE
1940 Set to value of the --ntasks-per-node option, if specified.
1941
1942 SLURM_NTASKS_PER_SOCKET
1943 Set to value of the --ntasks-per-socket option, if specified.
1944
1945 SLURM_OVERCOMMIT
1946 Set to 1 if --overcommit was specified.
1947
1948 SLURM_PROFILE
1949 Same as --profile
1950
1951 SLURM_SHARDS_ON_NODE
1952 Number of GPU Shards available to the step on this node.
1953
1954 SLURM_SUBMIT_DIR
1955 The directory from which salloc was invoked or, if applicable,
1956 the directory specified by the -D, --chdir option.
1957
1958 SLURM_SUBMIT_HOST
1959 The hostname of the computer from which salloc was invoked.
1960
1961 SLURM_TASKS_PER_NODE
1962 Number of tasks to be initiated on each node. Values are comma
1963 separated and in the same order as SLURM_JOB_NODELIST. If two
1964 or more consecutive nodes are to have the same task count, that
1965 count is followed by "(x#)" where "#" is the repetition count.
1966 For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the
1967 first three nodes will each execute two tasks and the fourth
1968 node will execute one task.
1969
1970 SLURM_THREADS_PER_CORE
1971 This is only set if --threads-per-core or SAL‐
1972 LOC_THREADS_PER_CORE were specified. The value will be set to
1973 the value specified by --threads-per-core or SAL‐
1974 LOC_THREADS_PER_CORE. This is used by subsequent srun calls
1975 within the job allocation.
1976
1977 SIGNALS
1978 While salloc is waiting for a PENDING job allocation, most signals will
1979 cause salloc to revoke the allocation request and exit.
1980
1981 However if the allocation has been granted and salloc has already
1982 started the specified command, then salloc will ignore most signals.
1983 salloc will not exit or release the allocation until the command exits.
1984 One notable exception is SIGHUP. A SIGHUP signal will cause salloc to
1985 release the allocation and exit without waiting for the command to fin‐
1986 ish. Another exception is SIGTERM, which will be forwarded to the
1987 spawned process.
1988
1989
1990 EXAMPLES
1991 To get an allocation, and open a new xterm in which srun commands may
1992 be typed interactively:
1993
1994 $ salloc -N16 xterm
1995 salloc: Granted job allocation 65537
1996 # (at this point the xterm appears, and salloc waits for xterm to exit)
1997 salloc: Relinquishing job allocation 65537
1998
1999
2000 To grab an allocation of nodes and launch a parallel application on one
2001 command line:
2002
2003 $ salloc -N5 srun -n10 myprogram
2004
2005
2006 To create a heterogeneous job with 3 components, each allocating a
2007 unique set of nodes:
2008
2009 $ salloc -w node[2-3] : -w node4 : -w node[5-7] bash
2010 salloc: job 32294 queued and waiting for resources
2011 salloc: job 32294 has been allocated resources
2012 salloc: Granted job allocation 32294
2013
2014
2015 COPYING
2016 Copyright (C) 2006-2007 The Regents of the University of California.
2017 Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
2018 Copyright (C) 2008-2010 Lawrence Livermore National Security.
2019 Copyright (C) 2010-2022 SchedMD LLC.
2020
2021 This file is part of Slurm, a resource management program. For de‐
2022 tails, see <https://slurm.schedmd.com/>.
2023
2024 Slurm is free software; you can redistribute it and/or modify it under
2025 the terms of the GNU General Public License as published by the Free
2026 Software Foundation; either version 2 of the License, or (at your op‐
2027 tion) any later version.
2028
2029 Slurm is distributed in the hope that it will be useful, but WITHOUT
2030 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
2031 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
2032 for more details.
2033
2034
2035 SEE ALSO
2036 sinfo(1), sattach(1), sbatch(1), squeue(1), scancel(1), scontrol(1),
2037 slurm.conf(5), sched_setaffinity (2), numa (3)
2038
2039
2040
2041December 2022 Slurm Commands salloc(1)