sbatch(1)                       Slurm Commands                      sbatch(1)


NAME
       sbatch - Submit a batch script to Slurm.


SYNOPSIS
       sbatch [OPTIONS(0)...] [ : [OPTIONS(n)...]] script(0) [args(0)...]

       Option(s) define multiple jobs in a co-scheduled heterogeneous job.
       For more details about heterogeneous jobs see the document
       https://slurm.schedmd.com/heterogeneous_jobs.html


DESCRIPTION
       sbatch submits a batch script to Slurm.  The batch script may be given
       to sbatch through a file name on the command line, or if no file name
       is specified, sbatch will read in a script from standard input.  The
       batch script may contain options preceded with "#SBATCH" before any
       executable commands in the script.
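
       For illustration only, a minimal batch script might look like the
       following (the resource values and the program to run are
       placeholders):

              #!/bin/bash
              # The resource values and ./my_program below are placeholders.
              #SBATCH --job-name=example
              #SBATCH --ntasks=4
              #SBATCH --output=slurm-%j.out

              # Commands below run on the first node of the allocation.
              srun ./my_program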
23
       sbatch exits immediately after the script is successfully transferred
       to the Slurm controller and assigned a Slurm job ID.  The batch script
       is not necessarily granted resources immediately; it may sit in the
       queue of pending jobs for some time before its required resources
       become available.
29
30 By default both standard output and standard error are directed to a
31 file of the name "slurm-%j.out", where the "%j" is replaced with the
32 job allocation number. The file will be generated on the first node of
33 the job allocation. Other than the batch script itself, Slurm does no
34 movement of user files.
35
36 When the job allocation is finally granted for the batch script, Slurm
37 runs a single copy of the batch script on the first node in the set of
38 allocated nodes.
39
40 The following document describes the influence of various options on
41 the allocation of cpus to jobs and tasks.
42 https://slurm.schedmd.com/cpu_management.html
43
44
RETURN VALUE
       sbatch will return 0 on success or a non-zero error code on failure.
47
48
SCRIPT PATH RESOLUTION
       The batch script is resolved in the following order:

       1. If the script starts with ".", then the path is constructed as:
          current working directory / script

       2. If the script starts with a "/", then the path is considered
          absolute.

       3. If the script is in the current working directory.

       4. If the script can be resolved through PATH. See path_resolution(7).
60
61
OPTIONS
       -a, --array=<indexes>
              Submit a job array, multiple jobs to be executed with identical
              parameters.  The indexes specification identifies what array
              index values should be used.  Multiple values may be specified
              using a comma separated list and/or a range of values with a
              "-" separator.  For example, "--array=0-15" or
              "--array=0,6,16-32".  A step function can also be specified
              with a suffix containing a colon and number.  For example,
              "--array=0-15:4" is equivalent to "--array=0,4,8,12".  A
              maximum number of simultaneously running tasks from the job
              array may be specified using a "%" separator.  For example,
              "--array=0-15%4" will limit the number of simultaneously
              running tasks from this job array to 4.  The minimum index
              value is 0; the maximum value is one less than the
              configuration parameter MaxArraySize.  NOTE: Currently,
              federated job arrays only run on the local cluster.
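
              As an illustration (the program and input file names are
              placeholders), a job array can process a set of numbered input
              files, with each array task selecting its own file through the
              SLURM_ARRAY_TASK_ID environment variable:

                     #!/bin/bash
                     # Run 16 array tasks, at most 4 simultaneously.
                     #SBATCH --array=0-15%4

                     # ./process and input_<N>.dat are placeholder names.
                     srun ./process input_${SLURM_ARRAY_TASK_ID}.dat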
78
79
80 -A, --account=<account>
81 Charge resources used by this job to specified account. The
82 account is an arbitrary string. The account name may be changed
83 after job submission using the scontrol command.
84
85
86 --acctg-freq
87 Define the job accounting and profiling sampling intervals.
88 This can be used to override the JobAcctGatherFrequency parame‐
89 ter in Slurm's configuration file, slurm.conf. The supported
90 format is as follows:
91
92 --acctg-freq=<datatype>=<interval>
93 where <datatype>=<interval> specifies the task sam‐
94 pling interval for the jobacct_gather plugin or a
95 sampling interval for a profiling type by the
96 acct_gather_profile plugin. Multiple, comma-sepa‐
97 rated <datatype>=<interval> intervals may be speci‐
98 fied. Supported datatypes are as follows:
99
                     task=<interval>
                            where <interval> is the task sampling interval in
                            seconds for the jobacct_gather plugins and for
                            task profiling by the acct_gather_profile plugin.
                            NOTE: This frequency is used to monitor memory
                            usage.  If memory limits are enforced, the
                            highest frequency a user can request is what is
                            configured in the slurm.conf file.  Users cannot
                            turn it off (=0) either.
110
111 energy=<interval>
112 where <interval> is the sampling interval in
113 seconds for energy profiling using the
114 acct_gather_energy plugin
115
116 network=<interval>
117 where <interval> is the sampling interval in
118 seconds for infiniband profiling using the
119 acct_gather_infiniband plugin.
120
121 filesystem=<interval>
122 where <interval> is the sampling interval in
123 seconds for filesystem profiling using the
124 acct_gather_filesystem plugin.
125
              The default value for the task sampling interval is 30 seconds.
              The default value for all other intervals is 0.  An interval of
              0 disables sampling of the specified type.  If the task
              sampling interval is 0, accounting information is collected
              only at job termination (reducing Slurm interference with the
              job).  Smaller (non-zero) values have a greater impact upon job
              performance, but a value of 30 seconds is not likely to be
              noticeable for applications having fewer than 10,000 tasks.
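
              For example, to sample task data every 10 seconds and energy
              data every 60 seconds:

                     #SBATCH --acctg-freq=task=10,energy=60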
135
136
       -B, --extra-node-info=<sockets[:cores[:threads]]>
138 Restrict node selection to nodes with at least the specified
139 number of sockets, cores per socket and/or threads per core.
140 NOTE: These options do not specify the resource allocation size.
141 Each value specified is considered a minimum. An asterisk (*)
142 can be used as a placeholder indicating that all available
143 resources of that type are to be utilized. Values can also be
144 specified as min-max. The individual levels can also be speci‐
145 fied in separate options if desired:
146 --sockets-per-node=<sockets>
147 --cores-per-socket=<cores>
148 --threads-per-core=<threads>
              If the task/affinity plugin is enabled, then specifying an
              allocation in this manner also results in subsequently launched
              tasks being bound to threads if the -B option specifies a
              thread count, otherwise to cores if a core count is specified,
              otherwise to sockets.  If SelectType is configured to
              select/cons_res, it must have a parameter of CR_Core,
              CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this option
              to be honored.  If not specified, scontrol show job will
              display 'ReqS:C:T=*:*:*'.  This option applies to job
              allocations.
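
              For example, to restrict node selection to nodes with at least
              two sockets, eight cores per socket and two threads per core
              (the counts are illustrative):

                     #SBATCH --extra-node-info=2:8:2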
159
160
161 --batch=<list>
162 Nodes can have features assigned to them by the Slurm adminis‐
              trator.  Users can specify which of these features are required
              by their batch script using this option.  For example, a job's
165 allocation may include both Intel Haswell and KNL nodes with
166 features "haswell" and "knl" respectively. On such a configura‐
167 tion the batch script would normally benefit by executing on a
168 faster Haswell node. This would be specified using the option
169 "--batch=haswell". The specification can include AND and OR
170 operators using the ampersand and vertical bar separators. For
171 example: "--batch=haswell|broadwell" or
172 "--batch=haswell|big_memory". The --batch argument must be a
173 subset of the job's --constraint=<list> argument (i.e. the job
174 can not request only KNL nodes, but require the script to exe‐
175 cute on a Haswell node). If the request can not be satisfied
176 from the resources allocated to the job, the batch script will
177 execute on the first node of the job allocation.
178
179
180 --bb=<spec>
181 Burst buffer specification. The form of the specification is
              system dependent.  Note the burst buffer may not be accessible
              from a login node, but may require that salloc spawn a shell on
              one of its allocated compute nodes.  See the description of Sal‐
185 locDefaultCommand in the slurm.conf man page for more informa‐
186 tion about how to spawn a remote shell.
187
188
189 --bbf=<file_name>
190 Path of file containing burst buffer specification. The form of
191 the specification is system dependent. These burst buffer
192 directives will be inserted into the submitted batch script.
193
194
195 --begin=<time>
196 Submit the batch script to the Slurm controller immediately,
197 like normal, but tell the controller to defer the allocation of
198 the job until the specified time.
199
200 Time may be of the form HH:MM:SS to run a job at a specific time
201 of day (seconds are optional). (If that time is already past,
202 the next day is assumed.) You may also specify midnight, noon,
203 fika (3 PM) or teatime (4 PM) and you can have a time-of-day
204 suffixed with AM or PM for running in the morning or the
205 evening. You can also say what day the job will be run, by
              specifying a date of the form MMDDYY or MM/DD/YY or YYYY-MM-DD.
207 Combine date and time using the following format
208 YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now +
209 count time-units, where the time-units can be seconds (default),
210 minutes, hours, days, or weeks and you can tell Slurm to run the
211 job today with the keyword today and to run the job tomorrow
212 with the keyword tomorrow. The value may be changed after job
213 submission using the scontrol command. For example:
214 --begin=16:00
215 --begin=now+1hour
216 --begin=now+60 (seconds by default)
217 --begin=2010-01-20T12:34:00
218
219
220 Notes on date/time specifications:
221 - Although the 'seconds' field of the HH:MM:SS time specifica‐
222 tion is allowed by the code, note that the poll time of the
223 Slurm scheduler is not precise enough to guarantee dispatch of
224 the job on the exact second. The job will be eligible to start
225 on the next poll following the specified time. The exact poll
226 interval depends on the Slurm scheduler (e.g., 60 seconds with
227 the default sched/builtin).
228 - If no time (HH:MM:SS) is specified, the default is
229 (00:00:00).
230 - If a date is specified without a year (e.g., MM/DD) then the
231 current year is assumed, unless the combination of MM/DD and
232 HH:MM:SS has already passed for that year, in which case the
233 next year is used.
234
235
236 --checkpoint=<time>
237 Specifies the interval between creating checkpoints of the job
238 step. By default, the job step will have no checkpoints cre‐
239 ated. Acceptable time formats include "minutes", "minutes:sec‐
240 onds", "hours:minutes:seconds", "days-hours", "days-hours:min‐
241 utes" and "days-hours:minutes:seconds".
242
243
244 --checkpoint-dir=<directory>
245 Specifies the directory into which the job or job step's check‐
              point should be written (used by the checkpoint/blcr and check‐
247 point/xlch plugins only). The default value is the current
248 working directory. Checkpoint files will be of the form
249 "<job_id>.ckpt" for jobs and "<job_id>.<step_id>.ckpt" for job
250 steps.
251
252
253 --cluster-constraint=[!]<list>
254 Specifies features that a federated cluster must have to have a
255 sibling job submitted to it. Slurm will attempt to submit a sib‐
256 ling job to a cluster if it has at least one of the specified
257 features. If the "!" option is included, Slurm will attempt to
258 submit a sibling job to a cluster that has none of the specified
259 features.
260
261
262 --comment=<string>
              An arbitrary comment.  Enclose the string in double quotes if
              it contains spaces or special characters.
265
266
267 -C, --constraint=<list>
268 Nodes can have features assigned to them by the Slurm adminis‐
269 trator. Users can specify which of these features are required
270 by their job using the constraint option. Only nodes having
271 features matching the job constraints will be used to satisfy
272 the request. Multiple constraints may be specified with AND,
273 OR, matching OR, resource counts, etc. (some operators are not
274 supported on all system types). Supported constraint options
275 include:
276
277 Single Name
278 Only nodes which have the specified feature will be used.
279 For example, --constraint="intel"
280
281 Node Count
282 A request can specify the number of nodes needed with
283 some feature by appending an asterisk and count after the
284 feature name. For example "--nodes=16 --con‐
285 straint=graphics*4 ..." indicates that the job requires
286 16 nodes and that at least four of those nodes must have
287 the feature "graphics."
288
              AND    Only nodes with all of the specified features will be
                     used.  The ampersand is used for an AND operator.  For
                     example, --constraint="intel&gpu"

              OR     Only nodes with at least one of the specified features
                     will be used.  The vertical bar is used for an OR
                     operator.  For example, --constraint="intel|amd"
296
297 Matching OR
298 If only one of a set of possible options should be used
299 for all allocated nodes, then use the OR operator and
300 enclose the options within square brackets. For example:
301 "--constraint=[rack1|rack2|rack3|rack4]" might be used to
302 specify that all nodes must be allocated on a single rack
303 of the cluster, but any of those four racks can be used.
304
305 Multiple Counts
306 Specific counts of multiple resources may be specified by
307 using the AND operator and enclosing the options within
308 square brackets. For example: "--con‐
309 straint=[rack1*2&rack2*4]" might be used to specify that
310 two nodes must be allocated from nodes with the feature
311 of "rack1" and four nodes must be allocated from nodes
312 with the feature "rack2".
313
314 NOTE: This construct does not support multiple Intel KNL
315 NUMA or MCDRAM modes. For example, while "--con‐
316 straint=[(knl&quad)*2&(knl&hemi)*4]" is not supported,
317 "--constraint=[haswell*2&(knl&hemi)*4]" is supported.
318 Specification of multiple KNL modes requires the use of a
319 heterogeneous job.
320
321
              Parentheses
                     Parentheses can be used to group like node features
                     together.  For example,
                     "--constraint=[(knl&snc4&flat)*4&haswell*1]" might be
                     used to specify that four nodes with the features "knl",
                     "snc4" and "flat" plus one node with the feature
                     "haswell" are required.  All options within parentheses
                     should be grouped with AND (e.g. "&") operators.
330
331
332 --contiguous
333 If set, then the allocated nodes must form a contiguous set.
334 Not honored with the topology/tree or topology/3d_torus plugins,
335 both of which can modify the node ordering.
336
337
338 --cores-per-socket=<cores>
339 Restrict node selection to nodes with at least the specified
340 number of cores per socket. See additional information under -B
341 option above when task/affinity plugin is enabled.
342
343
       --cpu-freq=<p1[-p2[:p3]]>
345
346 Request that job steps initiated by srun commands inside this
347 sbatch script be run at some requested frequency if possible, on
348 the CPUs selected for the step on the compute node(s).
349
350 p1 can be [#### | low | medium | high | highm1] which will set
351 the frequency scaling_speed to the corresponding value, and set
352 the frequency scaling_governor to UserSpace. See below for defi‐
353 nition of the values.
354
355 p1 can be [Conservative | OnDemand | Performance | PowerSave]
356 which will set the scaling_governor to the corresponding value.
357 The governor has to be in the list set by the slurm.conf option
358 CpuFreqGovernors.
359
360 When p2 is present, p1 will be the minimum scaling frequency and
361 p2 will be the maximum scaling frequency.
362
              p2 can be [#### | medium | high | highm1].  p2 must be greater
              than p1.
365
366 p3 can be [Conservative | OnDemand | Performance | PowerSave |
367 UserSpace] which will set the governor to the corresponding
368 value.
369
370 If p3 is UserSpace, the frequency scaling_speed will be set by a
371 power or energy aware scheduling strategy to a value between p1
372 and p2 that lets the job run within the site's power goal. The
373 job may be delayed if p1 is higher than a frequency that allows
374 the job to run within the goal.
375
376 If the current frequency is < min, it will be set to min. Like‐
377 wise, if the current frequency is > max, it will be set to max.
378
379 Acceptable values at present include:
380
381 #### frequency in kilohertz
382
383 Low the lowest available frequency
384
385 High the highest available frequency
386
387 HighM1 (high minus one) will select the next highest
388 available frequency
389
390 Medium attempts to set a frequency in the middle of the
391 available range
392
393 Conservative attempts to use the Conservative CPU governor
394
395 OnDemand attempts to use the OnDemand CPU governor (the
396 default value)
397
398 Performance attempts to use the Performance CPU governor
399
400 PowerSave attempts to use the PowerSave CPU governor
401
402 UserSpace attempts to use the UserSpace CPU governor
403
404
              The following informational environment variable is set in the
              job step when the --cpu-freq option is requested:
                     SLURM_CPU_FREQ_REQ
409
410 This environment variable can also be used to supply the value
411 for the CPU frequency request if it is set when the 'srun' com‐
412 mand is issued. The --cpu-freq on the command line will over‐
              ride the environment variable value.  The form of the environ‐
414 ment variable is the same as the command line. See the ENVIRON‐
415 MENT VARIABLES section for a description of the
416 SLURM_CPU_FREQ_REQ variable.
417
418 NOTE: This parameter is treated as a request, not a requirement.
419 If the job step's node does not support setting the CPU fre‐
420 quency, or the requested value is outside the bounds of the
421 legal frequencies, an error is logged, but the job step is
422 allowed to continue.
423
424 NOTE: Setting the frequency for just the CPUs of the job step
425 implies that the tasks are confined to those CPUs. If task con‐
426 finement (i.e., TaskPlugin=task/affinity or TaskPlu‐
427 gin=task/cgroup with the "ConstrainCores" option) is not config‐
428 ured, this parameter is ignored.
429
430 NOTE: When the step completes, the frequency and governor of
431 each selected CPU is reset to the previous values.
432
              NOTE: Submitting jobs with the --cpu-freq option when linuxproc
              is configured as the ProctrackType can cause jobs to run too
              quickly, before accounting is able to poll for job information.
              As a result, not all of the accounting information will be
              present.
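
              For example, to request that job steps run between a 2.0 GHz
              minimum and a 2.4 GHz maximum under the OnDemand governor
              (frequencies are given in kilohertz and, as noted above, this
              is a request rather than a requirement):

                     #SBATCH --cpu-freq=2000000-2400000:OnDemand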
437
438
439 -c, --cpus-per-task=<ncpus>
440 Advise the Slurm controller that ensuing job steps will require
441 ncpus number of processors per task. Without this option, the
442 controller will just try to allocate one processor per task.
443
              For instance, consider an application that has 4 tasks, each
              requiring 3 processors.  If our cluster is composed of
              quad-processor nodes and we simply ask for 12 processors, the
              controller might give us only 3 nodes.  However, by using the
              --cpus-per-task=3 option, the controller knows that each task
              requires 3 processors on the same node, and the controller will
              grant an allocation of 4 nodes, one for each of the 4 tasks.
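
              The request from the example above could be expressed in a
              batch script as:

                     #SBATCH --ntasks=4
                     #SBATCH --cpus-per-task=3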
451
452
453 --deadline=<OPT>
              Remove the job if no ending is possible before this deadline
455 (start > (deadline - time[-min])). Default is no deadline.
456 Valid time formats are:
457 HH:MM[:SS] [AM|PM]
458 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
459 MM/DD[/YY]-HH:MM[:SS]
460 YYYY-MM-DD[THH:MM[:SS]]]
461
462
463 --delay-boot=<minutes>
              Do not reboot nodes in order to satisfy this job's feature
465 specification if the job has been eligible to run for less than
466 this time period. If the job has waited for less than the spec‐
467 ified period, it will use only nodes which already have the
468 specified features. The argument is in units of minutes. A
469 default value may be set by a system administrator using the
470 delay_boot option of the SchedulerParameters configuration
471 parameter in the slurm.conf file, otherwise the default value is
472 zero (no delay).
473
474
475 -d, --dependency=<dependency_list>
476 Defer the start of this job until the specified dependencies
              have been satisfied.  <dependency_list> is of the form
478 <type:job_id[:job_id][,type:job_id[:job_id]]> or
479 <type:job_id[:job_id][?type:job_id[:job_id]]>. All dependencies
480 must be satisfied if the "," separator is used. Any dependency
481 may be satisfied if the "?" separator is used. Many jobs can
482 share the same dependency and these jobs may even belong to dif‐
483 ferent users. The value may be changed after job submission
484 using the scontrol command. Once a job dependency fails due to
485 the termination state of a preceding job, the dependent job will
486 never be run, even if the preceding job is requeued and has a
487 different termination state in a subsequent execution.
488
489 after:job_id[:jobid...]
490 This job can begin execution after the specified jobs
491 have begun execution.
492
493 afterany:job_id[:jobid...]
494 This job can begin execution after the specified jobs
495 have terminated.
496
497 afterburstbuffer:job_id[:jobid...]
498 This job can begin execution after the specified jobs
499 have terminated and any associated burst buffer stage out
500 operations have completed.
501
502 aftercorr:job_id[:jobid...]
503 A task of this job array can begin execution after the
504 corresponding task ID in the specified job has completed
505 successfully (ran to completion with an exit code of
506 zero).
507
508 afternotok:job_id[:jobid...]
509 This job can begin execution after the specified jobs
510 have terminated in some failed state (non-zero exit code,
511 node failure, timed out, etc).
512
513 afterok:job_id[:jobid...]
514 This job can begin execution after the specified jobs
515 have successfully executed (ran to completion with an
516 exit code of zero).
517
518 expand:job_id
519 Resources allocated to this job should be used to expand
520 the specified job. The job to expand must share the same
521 QOS (Quality of Service) and partition. Gang scheduling
522 of resources in the partition is also not supported.
523
524 singleton
525 This job can begin execution after any previously
526 launched jobs sharing the same job name and user have
527 terminated. In other words, only one job by that name
528 and owned by that user can be running or suspended at any
529 point in time.
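
              As an illustration (the script names are placeholders), a job
              can be made to start only after another job completes
              successfully by capturing the first job's ID with the
              --parsable option described below:

                     # cut strips the optional cluster name from the output.
                     jobid=$(sbatch --parsable first_step.sh | cut -d';' -f1)
                     sbatch --dependency=afterok:${jobid} second_step.sh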
530
531
532 -D, --chdir=<directory>
533 Set the working directory of the batch script to directory
534 before it is executed. The path can be specified as full path or
535 relative path to the directory where the command is executed.
536
537
538 -e, --error=<filename pattern>
539 Instruct Slurm to connect the batch script's standard error
540 directly to the file name specified in the "filename pattern".
541 By default both standard output and standard error are directed
542 to the same file. For job arrays, the default file name is
543 "slurm-%A_%a.out", "%A" is replaced by the job ID and "%a" with
544 the array index. For other jobs, the default file name is
545 "slurm-%j.out", where the "%j" is replaced by the job ID. See
546 the filename pattern section below for filename specification
547 options.
548
549
550 --exclusive[=user|mcs]
551 The job allocation can not share nodes with other running jobs
552 (or just other users with the "=user" option or with the "=mcs"
553 option). The default shared/exclusive behavior depends on sys‐
554 tem configuration and the partition's OverSubscribe option takes
555 precedence over the job's option.
556
557
558 --export=<environment variables [ALL] | NONE>
559 Identify which environment variables from the submission envi‐
560 ronment are propagated to the launched application. By default,
561 all are propagated. Multiple environment variable names should
562 be comma separated. Environment variable names may be specified
563 to propagate the current value (e.g. "--export=EDITOR") or spe‐
564 cific values may be exported (e.g. "--export=EDI‐
565 TOR=/bin/emacs"). In these two examples, the propagated envi‐
566 ronment will only contain the variable EDITOR, along with
567 SLURM_* environment variables. However, Slurm will then implic‐
568 itly attempt to load the user's environment on the node where
569 the script is being executed, as if --get-user-env was speci‐
570 fied. This will happen whenever NONE or environment variables
571 are specified. If one desires to add to the submission environ‐
572 ment instead of replacing it, have the argument include ALL
573 (e.g. "--export=ALL,EDITOR=/bin/emacs"). Make sure ALL is speci‐
574 fied first, since sbatch applies the environment from left to
575 right, overwriting as necessary. Environment variables propa‐
576 gated from the submission environment will always overwrite
577 environment variables found in the user environment on the node.
578 If one desires no environment variables be propagated from the
579 submitting machine, use the argument NONE. Regardless of this
580 setting, the appropriate SLURM_* task environment variables are
581 always exported to the environment. This option is particularly
582 important for jobs that are submitted on one cluster and execute
583 on a different cluster (e.g. with different paths).
584
585
586 --export-file=<filename | fd>
587 If a number between 3 and OPEN_MAX is specified as the argument
588 to this option, a readable file descriptor will be assumed
589 (STDIN and STDOUT are not supported as valid arguments). Other‐
590 wise a filename is assumed. Export environment variables
591 defined in <filename> or read from <fd> to the job's execution
592 environment. The content is one or more environment variable
593 definitions of the form NAME=value, each separated by a null
594 character. This allows the use of special characters in envi‐
595 ronment definitions.
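
              As a sketch (the file and variable names are hypothetical), a
              null-separated environment file can be generated with printf
              and passed to sbatch:

                     # Each NAME=value pair is terminated by a null character.
                     printf 'APP_MODE=fast\0DATA_DIR=/tmp/data\0' > env.list
                     sbatch --export-file=env.list my_script.sh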
596
597
598 -F, --nodefile=<node file>
599 Much like --nodelist, but the list is contained in a file of
600 name node file. The node names of the list may also span multi‐
601 ple lines in the file. Duplicate node names in the file will
602 be ignored. The order of the node names in the list is not
603 important; the node names will be sorted by Slurm.
604
605
606 --get-user-env[=timeout][mode]
607 This option will tell sbatch to retrieve the login environment
608 variables for the user specified in the --uid option. The envi‐
609 ronment variables are retrieved by running something of this
610 sort "su - <username> -c /usr/bin/env" and parsing the output.
611 Be aware that any environment variables already set in sbatch's
612 environment will take precedence over any environment variables
613 in the user's login environment. Clear any environment variables
614 before calling sbatch that you do not want propagated to the
615 spawned program. The optional timeout value is in seconds.
              Default value is 8 seconds.  The optional mode value controls the
617 "su" options. With a mode value of "S", "su" is executed with‐
618 out the "-" option. With a mode value of "L", "su" is executed
619 with the "-" option, replicating the login environment. If mode
620 not specified, the mode established at Slurm build time is used.
              Examples of use include "--get-user-env", "--get-user-env=10",
622 "--get-user-env=10L", and "--get-user-env=S". This option was
623 originally created for use by Moab.
624
625
626 --gid=<group>
627 If sbatch is run as root, and the --gid option is used, submit
628 the job with group's group access permissions. group may be the
629 group name or the numerical group ID.
630
631
632 --gres=<list>
633 Specifies a comma delimited list of generic consumable
634 resources. The format of each entry on the list is
635 "name[[:type]:count]". The name is that of the consumable
636 resource. The count is the number of those resources with a
637 default value of 1. The specified resources will be allocated
638 to the job on each node. The available generic consumable
              resources are configurable by the system administrator.  A list
640 of available generic consumable resources will be printed and
641 the command will exit if the option argument is "help". Exam‐
642 ples of use include "--gres=gpu:2,mic:1", "--gres=gpu:kepler:2",
643 and "--gres=help".
644
645
646 --gres-flags=<type>
647 Specify generic resource task binding options.
648
649 disable-binding
650 Disable filtering of CPUs with respect to generic
651 resource locality. This option is currently required to
652 use more CPUs than are bound to a GRES (i.e. if a GPU is
653 bound to the CPUs on one socket, but resources on more
654 than one socket are required to run the job). This
655 option may permit a job to be allocated resources sooner
656 than otherwise possible, but may result in lower job per‐
657 formance.
658
659 enforce-binding
660 The only CPUs available to the job will be those bound to
661 the selected GRES (i.e. the CPUs identified in the
662 gres.conf file will be strictly enforced). This option
663 may result in delayed initiation of a job. For example a
664 job requiring two GPUs and one CPU will be delayed until
665 both GPUs on a single socket are available rather than
666 using GPUs bound to separate sockets, however the appli‐
667 cation performance may be improved due to improved commu‐
668 nication speed. Requires the node to be configured with
669 more than one socket and resource filtering will be per‐
670 formed on a per-socket basis.
671
672
673 -H, --hold
674 Specify the job is to be submitted in a held state (priority of
675 zero). A held job can now be released using scontrol to reset
676 its priority (e.g. "scontrol release <job_id>").
677
678
679 -h, --help
680 Display help information and exit.
681
682
683 --hint=<type>
684 Bind tasks according to application hints.
685
686 compute_bound
687 Select settings for compute bound applications: use all
688 cores in each socket, one thread per core.
689
690 memory_bound
691 Select settings for memory bound applications: use only
692 one core in each socket, one thread per core.
693
694 [no]multithread
695 [don't] use extra threads with in-core multi-threading
696 which can benefit communication intensive applications.
697 Only supported with the task/affinity plugin.
698
699 help show this help message
700
701
702 --ignore-pbs
703 Ignore any "#PBS" options specified in the batch script.
704
705
706 -i, --input=<filename pattern>
707 Instruct Slurm to connect the batch script's standard input
708 directly to the file name specified in the "filename pattern".
709
710 By default, "/dev/null" is open on the batch script's standard
711 input and both standard output and standard error are directed
712 to a file of the name "slurm-%j.out", where the "%j" is replaced
713 with the job allocation number, as described below in the file‐
714 name pattern section.
715
716
717 -J, --job-name=<jobname>
718 Specify a name for the job allocation. The specified name will
719 appear along with the job id number when querying running jobs
720 on the system. The default is the name of the batch script, or
721 just "sbatch" if the script is read on sbatch's standard input.
722
723
724 --jobid=<jobid>
725 Allocate resources as the specified job id. NOTE: Only valid
726 for users root and SlurmUser. NOTE: Not valid for federated
727 clusters.
728
729
730 -k, --no-kill
731 Do not automatically terminate a job if one of the nodes it has
732 been allocated fails. The user will assume the responsibilities
733 for fault-tolerance should a node fail. When there is a node
734 failure, any active job steps (usually MPI jobs) on that node
735 will almost certainly suffer a fatal error, but with --no-kill,
736 the job allocation will not be revoked so the user may launch
737 new job steps on the remaining nodes in their allocation.
738
739 By default Slurm terminates the entire job allocation if any
740 node fails in its range of allocated nodes.
741
742
743 --kill-on-invalid-dep=<yes|no>
              If a job has an invalid dependency and can never run, this
              parameter tells Slurm whether or not to terminate it.  A
              terminated job state will be JOB_CANCELLED.  If this option is
              not specified, the system wide behavior applies.  By default
              the job stays pending with reason DependencyNeverSatisfied, or,
              if kill_invalid_depend is specified in slurm.conf, the job is
              terminated.
751
752
753 -L, --licenses=<license>
754 Specification of licenses (or other resources available on all
755 nodes of the cluster) which must be allocated to this job.
756 License names can be followed by a colon and count (the default
757 count is one). Multiple license names should be comma separated
758 (e.g. "--licenses=foo:4,bar"). To submit jobs using remote
759 licenses, those served by the slurmdbd, specify the name of the
760 server providing the licenses. For example "--license=nas‐
761 tran@slurmdb:12".
762
763
764 -M, --clusters=<string>
765 Clusters to issue commands to. Multiple cluster names may be
766 comma separated. The job will be submitted to the one cluster
767 providing the earliest expected job initiation time. The default
768 value is the current cluster. A value of 'all' will query to run
769 on all clusters. Note the --export option to control environ‐
770 ment variables exported between clusters. Note that the Slur‐
771 mDBD must be up for this option to work properly.
772
773
774 -m, --distribution=
775 arbitrary|<block|cyclic|plane=<options>[:block|cyclic|fcyclic]>
776
777 Specify alternate distribution methods for remote processes. In
778 sbatch, this only sets environment variables that will be used
779 by subsequent srun requests. This option controls the assign‐
780 ment of tasks to the nodes on which resources have been allo‐
781 cated, and the distribution of those resources to tasks for
782 binding (task affinity). The first distribution method (before
783 the ":") controls the distribution of resources across nodes.
784 The optional second distribution method (after the ":") controls
785 the distribution of resources across sockets within a node.
786 Note that with select/cons_res, the number of cpus allocated on
787 each socket and node may be different. Refer to
788 https://slurm.schedmd.com/mc_support.html for more information
789 on resource allocation, assignment of tasks to nodes, and bind‐
790 ing of tasks to CPUs.
791
792 First distribution method:
793
794 block The block distribution method will distribute tasks to a
795 node such that consecutive tasks share a node. For exam‐
796 ple, consider an allocation of three nodes each with two
797 cpus. A four-task block distribution request will dis‐
798 tribute those tasks to the nodes with tasks one and two
799 on the first node, task three on the second node, and
800 task four on the third node. Block distribution is the
801 default behavior if the number of tasks exceeds the num‐
802 ber of allocated nodes.
803
804 cyclic The cyclic distribution method will distribute tasks to a
805 node such that consecutive tasks are distributed over
806 consecutive nodes (in a round-robin fashion). For exam‐
807 ple, consider an allocation of three nodes each with two
808 cpus. A four-task cyclic distribution request will dis‐
809 tribute those tasks to the nodes with tasks one and four
810 on the first node, task two on the second node, and task
811 three on the third node. Note that when SelectType is
812 select/cons_res, the same number of CPUs may not be allo‐
813 cated on each node. Task distribution will be round-robin
814 among all the nodes with CPUs yet to be assigned to
815 tasks. Cyclic distribution is the default behavior if
816 the number of tasks is no larger than the number of allo‐
817 cated nodes.
818
819 plane The tasks are distributed in blocks of a specified size.
820 The options include a number representing the size of the
821 task block. This is followed by an optional specifica‐
822 tion of the task distribution scheme within a block of
823 tasks and between the blocks of tasks. The number of
824 tasks distributed to each node is the same as for cyclic
825 distribution, but the taskids assigned to each node
826 depend on the plane size. For more details (including
827 examples and diagrams), please see
828 https://slurm.schedmd.com/mc_support.html
829 and
830 https://slurm.schedmd.com/dist_plane.html
831
832 arbitrary
833 The arbitrary method of distribution will allocate pro‐
834 cesses in-order as listed in file designated by the envi‐
835 ronment variable SLURM_HOSTFILE. If this variable is
                     listed it will override any other method specified.  If
                     not set the method will default to block.  The hostfile
                     must contain at minimum the number of hosts requested,
                     one per line or comma separated.  If
840 specifying a task count (-n, --ntasks=<number>), your
841 tasks will be laid out on the nodes in the order of the
842 file.
843 NOTE: The arbitrary distribution option on a job alloca‐
844 tion only controls the nodes to be allocated to the job
845 and not the allocation of CPUs on those nodes. This
846 option is meant primarily to control a job step's task
847 layout in an existing job allocation for the srun com‐
848 mand.
849
850
851 Second distribution method:
852
853 block The block distribution method will distribute tasks to
854 sockets such that consecutive tasks share a socket.
855
856 cyclic The cyclic distribution method will distribute tasks to
857 sockets such that consecutive tasks are distributed over
858 consecutive sockets (in a round-robin fashion). Tasks
859 requiring more than one CPU will have all of those CPUs
860 allocated on a single socket if possible.
861
862 fcyclic
863 The fcyclic distribution method will distribute tasks to
864 sockets such that consecutive tasks are distributed over
865 consecutive sockets (in a round-robin fashion). Tasks
                     requiring more than one CPU will have each of those CPUs
                     allocated in a cyclic fashion across sockets.
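
              For example, to distribute tasks round-robin across the
              allocated nodes and in blocks across the sockets within each
              node:

                     #SBATCH --distribution=cyclic:block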
868
869
870 --mail-type=<type>
871 Notify user by email when certain event types occur. Valid type
872 values are NONE, BEGIN, END, FAIL, REQUEUE, ALL (equivalent to
873 BEGIN, END, FAIL, REQUEUE, and STAGE_OUT), STAGE_OUT (burst buf‐
874 fer stage out and teardown completed), TIME_LIMIT, TIME_LIMIT_90
875 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80
876 percent of time limit), TIME_LIMIT_50 (reached 50 percent of
877 time limit) and ARRAY_TASKS (send emails for each array task).
878 Multiple type values may be specified in a comma separated list.
879 The user to be notified is indicated with --mail-user. Unless
880 the ARRAY_TASKS option is specified, mail notifications on job
881 BEGIN, END and FAIL apply to a job array as a whole rather than
882 generating individual email messages for each task in the job
883 array.
884
885
886 --mail-user=<user>
887 User to receive email notification of state changes as defined
888 by --mail-type. The default value is the submitting user.
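
              For example (the address is a placeholder), to be notified when
              the job ends or fails:

                     # user@example.com is a placeholder address.
                     #SBATCH --mail-type=END,FAIL
                     #SBATCH --mail-user=user@example.com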
889
890
891 --mcs-label=<mcs>
892 Used only when the mcs/group plugin is enabled. This parameter
              is a group among the groups of the user.  The default value is
              calculated by the mcs plugin if it is enabled.
895
896
897 --mem=<size[units]>
898 Specify the real memory required per node. Default units are
899 megabytes unless the SchedulerParameters configuration parameter
900 includes the "default_gbytes" option for gigabytes. Different
901 units can be specified using the suffix [K|M|G|T]. Default
902 value is DefMemPerNode and the maximum value is MaxMemPerNode.
903 If configured, both parameters can be seen using the scontrol
904 show config command. This parameter would generally be used if
905 whole nodes are allocated to jobs (SelectType=select/linear).
906 Also see --mem-per-cpu. --mem and --mem-per-cpu are mutually
907 exclusive.
908
909 NOTE: A memory size specification of zero is treated as a spe‐
910 cial case and grants the job access to all of the memory on each
911 node. If the job is allocated multiple nodes in a heterogeneous
912 cluster, the memory limit on each node will be that of the node
913 in the allocation with the smallest memory size (same limit will
914 apply to every node in the job's allocation).
915
916 NOTE: Enforcement of memory limits currently relies upon the
917 task/cgroup plugin or enabling of accounting, which samples mem‐
918 ory use on a periodic basis (data need not be stored, just col‐
919 lected). In both cases memory use is based upon the job's Resi‐
920 dent Set Size (RSS). A task may exceed the memory limit until
921 the next periodic accounting sample.
922
923
924 --mem-per-cpu=<size[units]>
925 Minimum memory required per allocated CPU. Default units are
926 megabytes unless the SchedulerParameters configuration parameter
927 includes the "default_gbytes" option for gigabytes. Default
928 value is DefMemPerCPU and the maximum value is MaxMemPerCPU (see
929 exception below). If configured, both parameters can be seen
930 using the scontrol show config command. Note that if the job's
931 --mem-per-cpu value exceeds the configured MaxMemPerCPU, then
932 the user's limit will be treated as a memory limit per task;
933 --mem-per-cpu will be reduced to a value no larger than MaxMem‐
934 PerCPU; --cpus-per-task will be set and the value of
935 --cpus-per-task multiplied by the new --mem-per-cpu value will
936 equal the original --mem-per-cpu value specified by the user.
937 This parameter would generally be used if individual processors
938 are allocated to jobs (SelectType=select/cons_res). If
939 resources are allocated by the core, socket or whole nodes; the
940 number of CPUs allocated to a job may be higher than the task
941 count and the value of --mem-per-cpu should be adjusted accord‐
942 ingly. Also see --mem. --mem and --mem-per-cpu are mutually
943 exclusive.
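
              As a worked example (assuming a configured MaxMemPerCPU of 2GB
              on a hypothetical system), a request of "--mem-per-cpu=4G
              --cpus-per-task=1" would be adjusted to "--mem-per-cpu=2G
              --cpus-per-task=2": the user's 4GB per-task limit is preserved,
              since 2 CPUs multiplied by 2GB per CPU equals the original
              request.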
944
945
946 --mem-bind=[{quiet,verbose},]type
947 Bind tasks to memory. Used only when the task/affinity plugin is
948 enabled and the NUMA memory functions are available. Note that
949 the resolution of CPU and memory binding may differ on some
950 architectures. For example, CPU binding may be performed at the
951 level of the cores within a processor while memory binding will
952 be performed at the level of nodes, where the definition of
953 "nodes" may differ from system to system. By default no memory
954 binding is performed; any task using any CPU can use any memory.
955 This option is typically used to ensure that each task is bound
              to the memory closest to its assigned CPU.  The use of any type
957 other than "none" or "local" is not recommended. If you want
958 greater control, try running a simple test code with the options
959 "--cpu-bind=verbose,none --mem-bind=verbose,none" to determine
960 the specific configuration.
961
962 NOTE: To have Slurm always report on the selected memory binding
963 for all commands executed in a shell, you can enable verbose
964 mode by setting the SLURM_MEM_BIND environment variable value to
965 "verbose".
966
967 The following informational environment variables are set when
968 --mem-bind is in use:
969
970 SLURM_MEM_BIND_LIST
971 SLURM_MEM_BIND_PREFER
972 SLURM_MEM_BIND_SORT
973 SLURM_MEM_BIND_TYPE
974 SLURM_MEM_BIND_VERBOSE
975
976 See the ENVIRONMENT VARIABLES section for a more detailed
977 description of the individual SLURM_MEM_BIND* variables.
978
979 Supported options include:
980
981 help show this help message
982
983 local Use memory local to the processor in use
984
985 map_mem:<list>
986 Bind by setting memory masks on tasks (or ranks) as spec‐
987 ified where <list> is
988 <numa_id_for_task_0>,<numa_id_for_task_1>,... The map‐
989 ping is specified for a node and identical mapping is
990 applied to the tasks on every node (i.e. the lowest task
991 ID on each node is mapped to the first ID specified in
992 the list, etc.). NUMA IDs are interpreted as decimal
993 values unless they are preceded with '0x' in which case
994 they interpreted as hexadecimal values. If the number of
995 tasks (or ranks) exceeds the number of elements in this
996 list, elements in the list will be reused as needed
997 starting from the beginning of the list. To simplify
998 support for large task counts, the lists may follow a map
999 with an asterisk and repetition count For example
1000 "map_mem:0x0f*4,0xf0*4". Not supported unless the entire
1001 node is allocated to the job.
1002
1003 mask_mem:<list>
1004 Bind by setting memory masks on tasks (or ranks) as spec‐
1005 ified where <list> is
1006 <numa_mask_for_task_0>,<numa_mask_for_task_1>,... The
1007 mapping is specified for a node and identical mapping is
1008 applied to the tasks on every node (i.e. the lowest task
1009 ID on each node is mapped to the first mask specified in
1010 the list, etc.). NUMA masks are always interpreted as
1011 hexadecimal values. Note that masks must be preceded
1012 with a '0x' if they don't begin with [0-9] so they are
1013 seen as numerical values. If the number of tasks (or
1014 ranks) exceeds the number of elements in this list, ele‐
1015 ments in the list will be reused as needed starting from
1016 the beginning of the list. To simplify support for large
1017 task counts, the lists may follow a mask with an asterisk
                     and repetition count.  For example "mask_mem:0*4,1*4".  Not
1019 supported unless the entire node is allocated to the job.
1020
1021 no[ne] don't bind tasks to memory (default)
1022
1023 p[refer]
1024 Prefer use of first specified NUMA node, but permit
1025 use of other available NUMA nodes.
1026
1027 q[uiet]
1028 quietly bind before task runs (default)
1029
1030 rank bind by task rank (not recommended)
1031
1032 sort sort free cache pages (run zonesort on Intel KNL nodes)
1033
1034 v[erbose]
1035 verbosely report binding before task runs
1036
1037
1038 --mincpus=<n>
1039 Specify a minimum number of logical cpus/processors per node.
1040
1041
1042 -N, --nodes=<minnodes[-maxnodes]>
1043 Request that a minimum of minnodes nodes be allocated to this
1044 job. A maximum node count may also be specified with maxnodes.
1045 If only one number is specified, this is used as both the mini‐
1046 mum and maximum node count. The partition's node limits super‐
1047 sede those of the job. If a job's node limits are outside of
1048 the range permitted for its associated partition, the job will
1049 be left in a PENDING state. This permits possible execution at
1050 a later time, when the partition limit is changed. If a job
1051 node limit exceeds the number of nodes configured in the parti‐
1052 tion, the job will be rejected. Note that the environment vari‐
1053 able SLURM_JOB_NODES will be set to the count of nodes actually
1054 allocated to the job. See the ENVIRONMENT VARIABLES section for
1055 more information. If -N is not specified, the default behavior
1056 is to allocate enough nodes to satisfy the requirements of the
1057 -n and -c options. The job will be allocated as many nodes as
1058 possible within the range specified and without delaying the
1059 initiation of the job. The node count specification may include
1060 a numeric value followed by a suffix of "k" (multiplies numeric
1061 value by 1,024) or "m" (multiplies numeric value by 1,048,576).
1062
1063
1064 -n, --ntasks=<number>
1065 sbatch does not launch tasks, it requests an allocation of
1066 resources and submits a batch script. This option advises the
1067 Slurm controller that job steps run within the allocation will
1068 launch a maximum of number tasks and to provide for sufficient
1069 resources. The default is one task per node, but note that the
1070 --cpus-per-task option will change this default.
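
              For example, to request enough resources to run 8 tasks spread
              over 2 nodes:

                     #SBATCH --nodes=2
                     #SBATCH --ntasks=8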
1071
1072
1073 --network=<type>
1074 Specify information pertaining to the switch or network. The
1075 interpretation of type is system dependent. This option is sup‐
1076 ported when running Slurm on a Cray natively. It is used to
1077 request using Network Performance Counters. Only one value per
              request is valid.  All options are case insensitive.  In this
1079 configuration supported values include:
1080
1081 system
1082 Use the system-wide network performance counters. Only
1083 nodes requested will be marked in use for the job alloca‐
1084 tion. If the job does not fill up the entire system the
1085 rest of the nodes are not able to be used by other jobs
                     using NPC; if idle, their state will appear as PerfCnts.
1087 These nodes are still available for other jobs not using
1088 NPC.
1089
1090 blade Use the blade network performance counters. Only nodes
1091 requested will be marked in use for the job allocation.
1092 If the job does not fill up the entire blade(s) allocated
1093 to the job those blade(s) are not able to be used by other
                     jobs using NPC; if idle, their state will appear as PerfC‐
1095 nts. These nodes are still available for other jobs not
1096 using NPC.
1097
1098
1099 In all cases the job allocation request must specify the
1100 --exclusive option. Otherwise the request will be denied.
1101
1102 Also with any of these options steps are not allowed to share
1103 blades, so resources would remain idle inside an allocation if
1104 the step running on a blade does not take up all the nodes on
1105 the blade.
1106
1107 The network option is also supported on systems with IBM's Par‐
1108 allel Environment (PE). See IBM's LoadLeveler job command key‐
1109 word documentation about the keyword "network" for more informa‐
1110 tion. Multiple values may be specified in a comma separated
              list.  All options are case insensitive.  Supported values
1112 include:
1113
1114 BULK_XFER[=<resources>]
1115 Enable bulk transfer of data using Remote
1116 Direct-Memory Access (RDMA). The optional resources
1117 specification is a numeric value which can have a
1118 suffix of "k", "K", "m", "M", "g" or "G" for kilo‐
1119 bytes, megabytes or gigabytes. NOTE: The resources
1120 specification is not supported by the underlying IBM
1121 infrastructure as of Parallel Environment version
1122 2.2 and no value should be specified at this time.
1123
1124 CAU=<count> Number of Collective Acceleration Units (CAU)
1125 required. Applies only to IBM Power7-IH processors.
1126 Default value is zero. Independent CAU will be
1127 allocated for each programming interface (MPI, LAPI,
1128 etc.)
1129
1130 DEVNAME=<name>
1131 Specify the device name to use for communications
1132 (e.g. "eth0" or "mlx4_0").
1133
1134 DEVTYPE=<type>
1135 Specify the device type to use for communications.
1136 The supported values of type are: "IB" (InfiniBand),
1137 "HFI" (P7 Host Fabric Interface), "IPONLY" (IP-Only
1138 interfaces), "HPCE" (HPC Ethernet), and "KMUX" (Ker‐
1139 nel Emulation of HPCE). The devices allocated to a
1140 job must all be of the same type. The default value
                            depends upon what hardware is available and, in
                            order of preference, is IPONLY (which is not
1143 considered in User Space mode), HFI, IB, HPCE, and
1144 KMUX.
1145
                     IMMED=<count>
1147 Number of immediate send slots per window required.
1148 Applies only to IBM Power7-IH processors. Default
1149 value is zero.
1150
                     INSTANCES=<count>
1152 Specify number of network connections for each task
1153 on each network connection. The default instance
1154 count is 1.
1155
1156 IPV4 Use Internet Protocol (IP) version 4 communications
1157 (default).
1158
1159 IPV6 Use Internet Protocol (IP) version 6 communications.
1160
1161 LAPI Use the LAPI programming interface.
1162
1163 MPI Use the MPI programming interface. MPI is the
1164 default interface.
1165
1166 PAMI Use the PAMI programming interface.
1167
1168 SHMEM Use the OpenSHMEM programming interface.
1169
1170 SN_ALL Use all available switch networks (default).
1171
1172 SN_SINGLE Use one available switch network.
1173
1174 UPC Use the UPC programming interface.
1175
1176 US Use User Space communications.
1177
1178
1179 Some examples of network specifications:
1180
1181 Instances=2,US,MPI,SN_ALL
1182 Create two user space connections for MPI communica‐
1183 tions on every switch network for each task.
1184
1185 US,MPI,Instances=3,Devtype=IB
1186 Create three user space connections for MPI communi‐
1187 cations on every InfiniBand network for each task.
1188
1189 IPV4,LAPI,SN_Single
                     Create an IP version 4 connection for LAPI communica‐
1191 tions on one switch network for each task.
1192
1193 Instances=2,US,LAPI,MPI
1194 Create two user space connections each for LAPI and
1195 MPI communications on every switch network for each
1196 task. Note that SN_ALL is the default option so
1197 every switch network is used. Also note that
1198 Instances=2 specifies that two connections are
1199 established for each protocol (LAPI and MPI) and
1200 each task. If there are two networks and four tasks
1201 on the node then a total of 32 connections are
1202 established (2 instances x 2 protocols x 2 networks
1203 x 4 tasks).
1204
1205
1206 --nice[=adjustment]
1207 Run the job with an adjusted scheduling priority within Slurm.
1208 With no adjustment value the scheduling priority is decreased by
1209 100. A negative nice value increases the priority, otherwise
1210 decreases it. The adjustment range is +/- 2147483645. Only priv‐
1211 ileged users can specify a negative adjustment.
1212
1213
1214 --no-requeue
1215 Specifies that the batch job should never be requeued under any
1216 circumstances. Setting this option will prevent system adminis‐
1217 trators from being able to restart the job (for example, after a
              scheduled downtime), recover it after a node failure, or
              requeue it upon preemption by a higher priority job.  When a
              job is
1220 requeued, the batch script is initiated from its beginning.
1221 Also see the --requeue option. The JobRequeue configuration
1222 parameter controls the default behavior on the cluster.
1223
1224
1225 --ntasks-per-core=<ntasks>
1226 Request the maximum ntasks be invoked on each core. Meant to be
1227 used with the --ntasks option. Related to --ntasks-per-node
1228 except at the core level instead of the node level. NOTE: This
1229 option is not supported unless SelectType=cons_res is configured
1230 (either directly or indirectly on Cray systems) along with the
1231 node's core count.
1232
1233
1234 --ntasks-per-node=<ntasks>
1235 Request that ntasks be invoked on each node. If used with the
1236 --ntasks option, the --ntasks option will take precedence and
1237 the --ntasks-per-node will be treated as a maximum count of
1238 tasks per node. Meant to be used with the --nodes option. This
1239 is related to --cpus-per-task=ncpus, but does not require knowl‐
1240 edge of the actual number of cpus on each node. In some cases,
1241 it is more convenient to be able to request that no more than a
1242 specific number of tasks be invoked on each node. Examples of
1243 this include submitting a hybrid MPI/OpenMP app where only one
1244 MPI "task/rank" should be assigned to each node while allowing
1245 the OpenMP portion to utilize all of the parallelism present in
1246 the node, or submitting a single setup/cleanup/monitoring job to
1247 each node of a pre-existing allocation as one step in a larger
1248 job script.
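
              For example (the core count is illustrative), a hybrid
              MPI/OpenMP job that places one MPI rank per node and gives each
              rank all of a node's cores might request:

                     #SBATCH --nodes=4
                     #SBATCH --ntasks-per-node=1
                     #SBATCH --cpus-per-task=16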
1249
1250
1251 --ntasks-per-socket=<ntasks>
1252 Request the maximum ntasks be invoked on each socket. Meant to
1253 be used with the --ntasks option. Related to --ntasks-per-node
1254 except at the socket level instead of the node level. NOTE:
1255 This option is not supported unless SelectType=cons_res is con‐
1256 figured (either directly or indirectly on Cray systems) along
1257 with the node's socket count.
1258
1259
1260 -O, --overcommit
1261 Overcommit resources. When applied to job allocation, only one
1262 CPU is allocated to the job per node and options used to specify
1263 the number of tasks per node, socket, core, etc. are ignored.
1264 When applied to job step allocations (the srun command when exe‐
1265 cuted within an existing job allocation), this option can be
1266 used to launch more than one task per CPU. Normally, srun will
1267 not allocate more than one process per CPU. By specifying
1268 --overcommit you are explicitly allowing more than one process
1269 per CPU. However no more than MAX_TASKS_PER_NODE tasks are per‐
1270 mitted to execute per node. NOTE: MAX_TASKS_PER_NODE is defined
1271 in the file slurm.h and is not a variable, it is set at Slurm
1272 build time.
1273
1274
1275 -o, --output=<filename pattern>
1276 Instruct Slurm to connect the batch script's standard output
1277 directly to the file name specified in the "filename pattern".
1278 By default both standard output and standard error are directed
1279 to the same file. For job arrays, the default file name is
1280 "slurm-%A_%a.out", "%A" is replaced by the job ID and "%a" with
1281 the array index. For other jobs, the default file name is
1282 "slurm-%j.out", where the "%j" is replaced by the job ID. See
1283 the filename pattern section below for filename specification
1284 options.
1285
1286
1287 --open-mode=append|truncate
1288 Open the output and error files using append or truncate mode as
1289 specified. The default value is specified by the system config‐
1290 uration parameter JobFileAppend.
1291
1292
1293 --parsable
1294 Outputs only the job id number and the cluster name if present.
1295 The values are separated by a semicolon. Errors will still be
1296 displayed.
1297
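A sketch of using --parsable to capture the job ID for a dependency chain (the script names are hypothetical; the ${jobid%%;*} expansion strips the ";cluster" suffix when a cluster name is present):

jobid=$(sbatch --parsable first.sh)
sbatch --dependency=afterok:${jobid%%;*} second.sh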
1298
1299 -p, --partition=<partition_names>
1300 Request a specific partition for the resource allocation. If
1301 not specified, the default behavior is to allow the Slurm con‐
1302 troller to select the default partition as designated by the
1303 system administrator. If the job can use more than one parti‐
1304 tion, specify their names in a comma separated list and the one
1305 offering earliest initiation will be used with no regard given
1306 to the partition name ordering (although higher priority parti‐
1307 tions will be considered first). When the job is initiated, the
1308 name of the partition used will be placed first in the job
1309 record partition string.
1310
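For example, to let the job start in whichever of two partitions can run it first (the partition names are illustrative):

#SBATCH --partition=short,general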
1311
1312 --power=<flags>
1313 Comma separated list of power management plugin options. Cur‐
1314 rently available flags include: level (all nodes allocated to
1315 the job should have identical power caps, may be disabled by the
1316 Slurm configuration option PowerParameters=job_no_level).
1317
1318
1319 --priority=<value>
1320 Request a specific job priority. May be subject to configura‐
1321 tion specific constraints. value should either be a numeric
1322 value or "TOP" (for highest possible value). Only Slurm opera‐
1323 tors and administrators can set the priority of a job.
1324
1325
1326 --profile=<all|none|[energy[,|task[,|lustre[,|network]]]]>
1327 Enables detailed data collection by the acct_gather_profile
1328 plugin. Detailed data are typically time-series that are stored
1329 in an HDF5 file for the job or an InfluxDB database, depending on
1330 the configured plugin. An example follows the list of values below.
1331
1332
1333 All All data types are collected. (Cannot be combined with
1334 other values.)
1335
1336
1337 None No data types are collected. This is the default.
1338 (Cannot be combined with other values.)
1339
1340
1341 Energy Energy data is collected.
1342
1343
1344 Task Task (I/O, Memory, ...) data is collected.
1345
1346
1347 Lustre Lustre data is collected.
1348
1349
1350 Network Network (InfiniBand) data is collected.
1351
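A sketch combining two data types (this assumes an acct_gather_profile plugin is configured on the cluster):

#SBATCH --profile=task,energy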
1352
1353 --propagate[=rlimit[,rlimit...]]
1354 Allows users to specify which of the modifiable (soft) resource
1355 limits to propagate to the compute nodes and apply to their
1356 jobs. If no rlimit is specified, then all resource limits will
1357 be propagated. The following rlimit names are supported by
1358 Slurm (although some options may not be supported on some sys‐
1359 tems):
1360
1361 ALL All limits listed below (default)
1362
1363 NONE No limits listed below
1364
1365 AS The maximum address space for a process
1366
1367 CORE The maximum size of core file
1368
1369 CPU The maximum amount of CPU time
1370
1371 DATA The maximum size of a process's data segment
1372
1373 FSIZE The maximum size of files created. Note that if the
1374 user sets FSIZE to less than the current size of the
1375 slurmd.log, job launches will fail with a 'File size
1376 limit exceeded' error.
1377
1378 MEMLOCK The maximum size that may be locked into memory
1379
1380 NOFILE The maximum number of open files
1381
1382 NPROC The maximum number of processes available
1383
1384 RSS The maximum resident set size
1385
1386 STACK The maximum stack size
1387
1388
1389 -q, --qos=<qos>
1390 Request a quality of service for the job. QOS values can be
1391 defined for each user/cluster/account association in the Slurm
1392 database. Users will be limited to their association's defined
1393 set of qos's when the Slurm configuration parameter, Account‐
1394 ingStorageEnforce, includes "qos" in its definition.
1395
1396
1397 -Q, --quiet
1398 Suppress informational messages from sbatch. Errors will still
1399 be displayed.
1400
1401
1402 --reboot
1403 Force the allocated nodes to reboot before starting the job.
1404 This is only supported with some system configurations and will
1405 otherwise be silently ignored.
1406
1407
1408 --requeue
1409 Specifies that the batch job should be eligible for requeuing.
1410 The job may be requeued explicitly by a system administrator,
1411 after node failure, or upon preemption by a higher priority job.
1412 When a job is requeued, the batch script is initiated from its
1413 beginning. Also see the --no-requeue option. The JobRequeue
1414 configuration parameter controls the default behavior on the
1415 cluster.
1416
1417
1418 --reservation=<name>
1419 Allocate resources for the job from the named reservation.
1420
1421 --share The --share option has been replaced by the --oversub‐
1422 scribe option described below.
1423
1424
1425 -s, --oversubscribe
1426 The job allocation can over-subscribe resources with other run‐
1427 ning jobs. The resources to be over-subscribed can be nodes,
1428 sockets, cores, and/or hyperthreads depending upon configura‐
1429 tion. The default over-subscribe behavior depends on system
1430 configuration and the partition's OverSubscribe option takes
1431 precedence over the job's option. This option may result in the
1432 allocation being granted sooner than if the --oversubscribe
1433 option was not set and allow higher system utilization, but
1434 application performance will likely suffer due to competition
1435 for resources. Also see the --exclusive option.
1436
1437
1438 -S, --core-spec=<num>
1439 Count of specialized cores per node reserved by the job for sys‐
1440 tem operations and not used by the application. The application
1441 will not use these cores, but will be charged for their alloca‐
1442 tion. Default value is dependent upon the node's configured
1443 CoreSpecCount value. If a value of zero is designated and the
1444 Slurm configuration option AllowSpecResourcesUsage is enabled,
1445 the job will be allowed to override CoreSpecCount and use the
1446 specialized resources on nodes it is allocated. This option can
1447 not be used with the --thread-spec option.
1448
1449
1450 --signal=[B:]<sig_num>[@<sig_time>]
1451 When a job is within sig_time seconds of its end time, send it
1452 the signal sig_num. Due to the resolution of event handling by
1453 Slurm, the signal may be sent up to 60 seconds earlier than
1454 specified. sig_num may either be a signal number or name (e.g.
1455 "10" or "USR1"). sig_time must have an integer value between 0
1456 and 65535. By default, no signal is sent before the job's end
1457 time. If a sig_num is specified without any sig_time, the
1458 default time will be 60 seconds. Use the "B:" option to signal
1459 only the batch shell; none of the other processes will be sig‐
1460 naled. By default all job steps will be signaled, but not the
1461 batch shell itself.
1462
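A sketch of catching the warning signal in the batch shell so the script can save state before the time limit (the application and the checkpoint action are placeholders):

#!/bin/sh
#SBATCH --time=01:00:00
#SBATCH --signal=B:USR1@300

# Only the batch shell receives USR1 because of the "B:" prefix.
trap 'echo "USR1 received, saving state"; touch STOP_REQUESTED' USR1
srun ./long_running_app &
wait    # returns when the trap fires or when the step finishes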
1463
1464 --sockets-per-node=<sockets>
1465 Restrict node selection to nodes with at least the specified
1466 number of sockets. See additional information under -B option
1467 above when task/affinity plugin is enabled.
1468
1469
1470 --spread-job
1471 Spread the job allocation over as many nodes as possible and
1472 attempt to evenly distribute tasks across the allocated nodes.
1473 This option disables the topology/tree plugin.
1474
1475
1476 --switches=<count>[@<max-time>]
1477 When a tree topology is used, this defines the maximum count of
1478 switches desired for the job allocation and optionally the maxi‐
1479 mum time to wait for that number of switches. If Slurm finds an
1480 allocation containing more switches than the count specified,
1481 the job remains pending until it either finds an allocation with
1482 desired switch count or the time limit expires. If there is no
1483 switch count limit, there is no delay in starting the job.
1484 Acceptable time formats include "minutes", "minutes:seconds",
1485 "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
1486 "days-hours:minutes:seconds". The job's maximum time delay may
1487 be limited by the system administrator using the SchedulerParam‐
1488 eters configuration parameter with the max_switch_wait parameter
1489 option. On a dragonfly network the only switch count supported
1490 is 1 since communication performance will be highest when a job
1491 is allocated resources on one leaf switch or more than 2 leaf
1492 switches. The default max-time is the max_switch_wait Sched‐
1493 ulerParameters option.
1494
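For example, to prefer an allocation within a single leaf switch but give up on that constraint after waiting 30 minutes (illustrative values):

#SBATCH --switches=1@30:00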
1495
1496 -t, --time=<time>
1497 Set a limit on the total run time of the job allocation. If the
1498 requested time limit exceeds the partition's time limit, the job
1499 will be left in a PENDING state (possibly indefinitely). The
1500 default time limit is the partition's default time limit. When
1501 the time limit is reached, each task in each job step is sent
1502 SIGTERM followed by SIGKILL. The interval between signals is
1503 specified by the Slurm configuration parameter KillWait. The
1504 OverTimeLimit configuration parameter may permit the job to run
1505 longer than scheduled. Time resolution is one minute and second
1506 values are rounded up to the next minute.
1507
1508 A time limit of zero requests that no time limit be imposed.
1509 Acceptable time formats include "minutes", "minutes:seconds",
1510 "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
1511 "days-hours:minutes:seconds".
1512
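Illustrative uses of the accepted time formats (a script would contain only one of these):

#SBATCH --time=90            # 90 minutes
#SBATCH --time=2:30:00       # 2 hours, 30 minutes
#SBATCH --time=1-12          # 1 day, 12 hours
#SBATCH --time=2-06:30:00    # 2 days, 6 hours, 30 minutes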
1513
1514 --tasks-per-node=<n>
1515 Specify the number of tasks to be launched per node. Equivalent
1516 to --ntasks-per-node.
1517
1518
1519 --test-only
1520 Validate the batch script and return an estimate of when a job
1521 would be scheduled to run given the current job queue and all
1522 the other arguments specifying the job requirements. No job is
1523 actually submitted.
1524
1525
1526 --thread-spec=<num>
1527 Count of specialized threads per node reserved by the job for
1528 system operations and not used by the application. The applica‐
1529 tion will not use these threads, but will be charged for their
1530 allocation. This option can not be used with the --core-spec
1531 option.
1532
1533
1534 --threads-per-core=<threads>
1535 Restrict node selection to nodes with at least the specified
1536 number of threads per core. NOTE: "Threads" refers to the num‐
1537 ber of processing units on each core rather than the number of
1538 application tasks to be launched per core. See additional
1539 information under -B option above when task/affinity plugin is
1540 enabled.
1541
1542
1543 --time-min=<time>
1544 Set a minimum time limit on the job allocation. If specified,
1545 the job may have its --time limit lowered to a value no lower
1546 than --time-min if doing so permits the job to begin execution
1547 earlier than otherwise possible. The job's time limit will not
1548 be changed after the job is allocated resources. This is per‐
1549 formed by a backfill scheduling algorithm to allocate resources
1550 otherwise reserved for higher priority jobs. Acceptable time
1551 formats include "minutes", "minutes:seconds", "hours:min‐
1552 utes:seconds", "days-hours", "days-hours:minutes" and
1553 "days-hours:minutes:seconds".
1554
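For example, a job that would ideally run for 8 hours but can make useful progress in as little as 2 might be submitted as (illustrative values):

#SBATCH --time=8:00:00
#SBATCH --time-min=2:00:00

The backfill scheduler may then start the job early with a time limit anywhere between the two values.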
1555
1556 --tmp=<size[units]>
1557 Specify a minimum amount of temporary disk space per node.
1558 Default units are megabytes unless the SchedulerParameters con‐
1559 figuration parameter includes the "default_gbytes" option for
1560 gigabytes. Different units can be specified using the suffix
1561 [K|M|G|T].
1562
1563
1564 -u, --usage
1565 Display brief help message and exit.
1566
1567
1568 --uid=<user>
1569 Attempt to submit and/or run a job as user instead of the invok‐
1570 ing user id. The invoking user's credentials will be used to
1571 check access permissions for the target partition. User root may
1572 use this option to run jobs as a normal user in a RootOnly par‐
1573 tition for example. If run as root, sbatch will drop its permis‐
1574 sions to the uid specified after node allocation is successful.
1575 user may be the user name or numerical user ID.
1576
1577
1578 --use-min-nodes
1579 If a range of node counts is given, prefer the smaller count.
1580
1581
1582 -V, --version
1583 Display version information and exit.
1584
1585
1586 -v, --verbose
1587 Increase the verbosity of sbatch's informational messages. Mul‐
1588 tiple -v's will further increase sbatch's verbosity. By default
1589 only errors will be displayed.
1590
1591
1592 -w, --nodelist=<node name list>
1593 Request a specific list of hosts. The job will contain all of
1594 these hosts and possibly additional hosts as needed to satisfy
1595 resource requirements. The list may be specified as a
1596 comma-separated list of hosts, a range of hosts (host[1-5,7,...]
1597 for example), or a filename. The host list will be assumed to
1598 be a filename if it contains a "/" character. If you specify a
1599 minimum node or processor count larger than can be satisfied by
1600 the supplied host list, additional resources will be allocated
1601 on other nodes as needed. Duplicate node names in the list will
1602 be ignored. The order of the node names in the list is not
1603 important; the node names will be sorted by Slurm.
1604
1605
1606 -W, --wait
1607 Do not exit until the submitted job terminates. The exit code
1608 of the sbatch command will be the same as the exit code of the
1609 submitted job. If the job terminated due to a signal rather than
1610 a normal exit, the exit code will be set to 1. In the case of a
1611 job array, the exit code recorded will be the highest value for
1612 any task in the job array.
1613
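A sketch of a driver script that blocks on the submitted job and reacts to its exit status (the script names are hypothetical):

if sbatch --wait compute.sh; then
    sbatch postprocess.sh
else
    echo "compute.sh failed" >&2
fi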
1614
1615 --wait-all-nodes=<value>
1616 Controls when the execution of the command begins. By default
1617 the job will begin execution as soon as the allocation is made.
1618
1619 0 Begin execution as soon as allocation can be made. Do not
1620 wait for all nodes to be ready for use (i.e. booted).
1621
1622 1 Do not begin execution until all nodes are ready for use.
1623
1624
1625 --wckey=<wckey>
1626 Specify wckey to be used with job. If TrackWCKey=no (default)
1627 in the slurm.conf this value is ignored.
1628
1629
1630 --wrap=<command string>
1631 Sbatch will wrap the specified command string in a simple "sh"
1632 shell script, and submit that script to the slurm controller.
1633 When --wrap is used, a script name and arguments may not be
1634 specified on the command line; instead the sbatch-generated
1635 wrapper script is used.
1636
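For example, to submit a short command without writing a script file (the command itself is illustrative):

$ sbatch --wrap="hostname; date"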
1637
1638 -x, --exclude=<node name list>
1639 Explicitly exclude certain nodes from the resources granted to
1640 the job.
1641
1642
1643 --x11[=<all|batch|first|last>]
1644 Sets up X11 forwarding on all, batch host, first or last node(s)
1645 of the allocation. This option is only enabled if Slurm was com‐
1646 piled with X11 support and PrologFlags=x11 is defined in the
1647 slurm.conf. Default is batch.
1648
1649
1651 sbatch allows for a filename pattern to contain one or more replacement
1652 symbols, which are a percent sign "%" followed by a letter (e.g. %j).
1653
1654 \\ Do not process any of the replacement symbols.
1655
1656 %% The character "%".
1657
1658 %A Job array's master job allocation number.
1659
1660 %a Job array ID (index) number.
1661
1662 %J jobid.stepid of the running job. (e.g. "128.0")
1663
1664 %j jobid of the running job.
1665
1666 %N short hostname. This will create a separate IO file per node.
1667
1668 %n Node identifier relative to current job (e.g. "0" is the first
1669 node of the running job). This will create a separate IO file per
1670 node.
1671
1672 %s stepid of the running job.
1673
1674 %t task identifier (rank) relative to current job. This will create
1675 a separate IO file per task.
1676
1677 %u User name.
1678
1679 %x Job name.
1680
1681 A number placed between the percent character and format specifier may
1682 be used to zero-pad the result in the IO filename. This number is
1683 ignored if the format specifier corresponds to non-numeric data (%N
1684 for example).
1685
1686 Some examples of how the format string may be used for a 4 task job
1687 step with a Job ID of 128 and step id of 0 are included below:
1688
1689 job%J.out job128.0.out
1690
1691 job%4j.out job0128.out
1692
1693 job%j-%2t.out job128-00.out, job128-01.out, ...
1694
1696 Upon startup, sbatch will read and handle the options set in the fol‐
1697 lowing environment variables. Note that environment variables will
1698 override any options set in a batch script, and command line options
1699 will override any environment variables.
1700
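For example, assuming a partition named "debug" exists, the following submits a hypothetical script job.sh to that partition, overriding any #SBATCH --partition line in the script, while an explicit -p on the command line would still take precedence:

$ SBATCH_PARTITION=debug sbatch job.sh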
1701
1702 SBATCH_ACCOUNT Same as -A, --account
1703
1704 SBATCH_ACCTG_FREQ Same as --acctg-freq
1705
1706 SBATCH_ARRAY_INX Same as -a, --array
1707
1708 SBATCH_BATCH Same as --batch
1709
1710 SBATCH_CHECKPOINT Same as --checkpoint
1711
1712 SBATCH_CHECKPOINT_DIR Same as --checkpoint-dir
1713
1714 SBATCH_CLUSTERS or SLURM_CLUSTERS
1715 Same as --clusters
1716
1717 SBATCH_CONSTRAINT Same as -C, --constraint
1718
1719 SBATCH_CORE_SPEC Same as --core-spec
1720
1721 SBATCH_DEBUG Same as -v, --verbose
1722
1723 SBATCH_DELAY_BOOT Same as --delay-boot
1724
1725 SBATCH_DISTRIBUTION Same as -m, --distribution
1726
1727 SBATCH_EXCLUSIVE Same as --exclusive
1728
1729 SBATCH_EXPORT Same as --export
1730
1731 SBATCH_GET_USER_ENV Same as --get-user-env
1732
1733 SBATCH_GRES Same as --gres
1734
1735 SBATCH_GRES_FLAGS Same as --gres-flags
1736
1737 SBATCH_HINT or SLURM_HINT
1738 Same as --hint
1739
1740 SBATCH_IGNORE_PBS Same as --ignore-pbs
1741
1742 SBATCH_JOBID Same as --jobid
1743
1744 SBATCH_JOB_NAME Same as -J, --job-name
1745
1746 SBATCH_MEM_BIND Same as --mem-bind
1747
1748 SBATCH_NETWORK Same as --network
1749
1750 SBATCH_NO_REQUEUE Same as --no-requeue
1751
1752 SBATCH_OPEN_MODE Same as --open-mode
1753
1754 SBATCH_OVERCOMMIT Same as -O, --overcommit
1755
1756 SBATCH_PARTITION Same as -p, --partition
1757
1758 SBATCH_POWER Same as --power
1759
1760 SBATCH_PROFILE Same as --profile
1761
1762 SBATCH_QOS Same as --qos
1763
1764 SBATCH_RESERVATION Same as --reservation
1765
1766 SBATCH_REQ_SWITCH When a tree topology is used, this defines the
1767 maximum count of switches desired for the job
1768 allocation and optionally the maximum time to
1769 wait for that number of switches. See --switches
1770
1771 SBATCH_REQUEUE Same as --requeue
1772
1773 SBATCH_SIGNAL Same as --signal
1774
1775 SBATCH_SPREAD_JOB Same as --spread-job
1776
1777 SBATCH_THREAD_SPEC Same as --thread-spec
1778
1779 SBATCH_TIMELIMIT Same as -t, --time
1780
1781 SBATCH_USE_MIN_NODES Same as --use-min-nodes
1782
1783 SBATCH_WAIT Same as -W, --wait
1784
1785 SBATCH_WAIT_ALL_NODES Same as --wait-all-nodes
1786
1787 SBATCH_WAIT4SWITCH Max time waiting for requested switches. See
1788 --switches
1789
1790 SBATCH_WCKEY Same as --wckey
1791
1792 SLURM_CONF The location of the Slurm configuration file.
1793
1794 SLURM_EXIT_ERROR Specifies the exit code generated when a Slurm
1795 error occurs (e.g. invalid options). This can be
1796 used by a script to distinguish application exit
1797 codes from various Slurm error conditions.
1798
1799 SLURM_STEP_KILLED_MSG_NODE_ID=ID
1800 If set, only the specified node will log when the
1801 job or step are killed by a signal.
1802
1803
1805 The Slurm controller will set the following variables in the environ‐
1806 ment of the batch script.
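As an illustration, a batch script might read a few of these variables as in the following sketch (./my_app is a placeholder; which variables are set depends on the submission options, as noted below):

#!/bin/sh
#SBATCH --ntasks=4
echo "job ${SLURM_JOB_ID} running in partition ${SLURM_JOB_PARTITION}"
echo "allocated nodes: ${SLURM_JOB_NODELIST}"
echo "submitted from: ${SLURM_SUBMIT_DIR}"
srun ./my_app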
1807
1808 BASIL_RESERVATION_ID
1809 The reservation ID on Cray systems running ALPS/BASIL only.
1810
1811 SBATCH_MEM_BIND
1812 Set to value of the --mem-bind option.
1813
1814 SBATCH_MEM_BIND_LIST
1815 Set to bit mask used for memory binding.
1816
1817 SBATCH_MEM_BIND_PREFER
1818 Set to "prefer" if the --mem-bind option includes the prefer
1819 option.
1820
1821 SBATCH_MEM_BIND_TYPE
1822 Set to the memory binding type specified with the --mem-bind
1823 option. Possible values are "none", "rank", "map_map",
1824 "mask_mem" and "local".
1825
1826 SBATCH_MEM_BIND_VERBOSE
1827 Set to "verbose" if the --mem-bind option includes the verbose
1828 option. Set to "quiet" otherwise.
1829
1830 SLURM_*_PACK_GROUP_#
1831 For a heterogeneous job allocation, the environment variables are
1832 set separately for each component.
1833
1834 SLURM_ARRAY_TASK_COUNT
1835 Total number of tasks in a job array.
1836
1837 SLURM_ARRAY_TASK_ID
1838 Job array ID (index) number.
1839
1840 SLURM_ARRAY_TASK_MAX
1841 Job array's maximum ID (index) number.
1842
1843 SLURM_ARRAY_TASK_MIN
1844 Job array's minimum ID (index) number.
1845
1846 SLURM_ARRAY_TASK_STEP
1847 Job array's index step size.
1848
1849 SLURM_ARRAY_JOB_ID
1850 Job array's master job ID number.
1851
1852 SLURM_CHECKPOINT_IMAGE_DIR
1853 Directory into which checkpoint images should be written if
1854 specified on the execute line.
1855
1856 SLURM_CLUSTER_NAME
1857 Name of the cluster on which the job is executing.
1858
1859 SLURM_CPUS_ON_NODE
1860 Number of CPUs on the allocated node.
1861
1862 SLURM_CPUS_PER_TASK
1863 Number of cpus requested per task. Only set if the
1864 --cpus-per-task option is specified.
1865
1866 SLURM_DISTRIBUTION
1867 Same as -m, --distribution
1868
1869 SLURM_GTIDS
1870 Global task IDs running on this node. Zero origin and comma
1871 separated.
1872
1873 SLURM_JOB_ACCOUNT
1874 Account name associated with the job allocation.
1875
1876 SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
1877 The ID of the job allocation.
1878
1879 SLURM_JOB_CPUS_PER_NODE
1880 Count of processors available to the job on this node. Note the
1881 select/linear plugin allocates entire nodes to jobs, so the
1882 value indicates the total count of CPUs on the node. The
1883 select/cons_res plugin allocates individual processors to jobs,
1884 so this number indicates the number of processors on this node
1885 allocated to the job.
1886
1887 SLURM_JOB_DEPENDENCY
1888 Set to value of the --dependency option.
1889
1890 SLURM_JOB_NAME
1891 Name of the job.
1892
1893 SLURM_JOB_NODELIST (and SLURM_NODELIST for backwards compatibility)
1894 List of nodes allocated to the job.
1895
1896 SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
1897 Total number of nodes in the job's resource allocation.
1898
1899 SLURM_JOB_PARTITION
1900 Name of the partition in which the job is running.
1901
1902 SLURM_JOB_QOS
1903 Quality Of Service (QOS) of the job allocation.
1904
1905 SLURM_JOB_RESERVATION
1906 Advanced reservation containing the job allocation, if any.
1907
1908 SLURM_LOCALID
1909 Node local task ID for the process within a job.
1910
1911 SLURM_MEM_PER_CPU
1912 Same as --mem-per-cpu
1913
1914 SLURM_MEM_PER_NODE
1915 Same as --mem
1916
1917 SLURM_NODE_ALIASES
1918 Sets of node name, communication address and hostname for nodes
1919 allocated to the job from the cloud. Each element in the set is
1920 colon separated and each set is comma separated. For example:
1921 SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
1922
1923 SLURM_NODEID
1924 ID of the node allocated.
1925
1926 SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
1927 Same as -n, --ntasks
1928
1929 SLURM_NTASKS_PER_CORE
1930 Number of tasks requested per core. Only set if the
1931 --ntasks-per-core option is specified.
1932
1933 SLURM_NTASKS_PER_NODE
1934 Number of tasks requested per node. Only set if the
1935 --ntasks-per-node option is specified.
1936
1937 SLURM_NTASKS_PER_SOCKET
1938 Number of tasks requested per socket. Only set if the
1939 --ntasks-per-socket option is specified.
1940
1941 SLURM_PACK_SIZE
1942 Set to count of components in heterogeneous job.
1943
1944 SLURM_PRIO_PROCESS
1945 The scheduling priority (nice value) at the time of job submis‐
1946 sion. This value is propagated to the spawned processes.
1947
1948 SLURM_PROCID
1949 The MPI rank (or relative process ID) of the current process.
1950
1951 SLURM_PROFILE
1952 Same as --profile
1953
1954 SLURM_RESTART_COUNT
1955 If the job has been restarted due to system failure or has been
1956 explicitly requeued, this will be set to the number of times
1957 the job has been restarted.
1958
1959 SLURM_SUBMIT_DIR
1960 The directory from which sbatch was invoked.
1961
1962 SLURM_SUBMIT_HOST
1963 The hostname of the computer from which sbatch was invoked.
1964
1965 SLURM_TASKS_PER_NODE
1966 Number of tasks to be initiated on each node. Values are comma
1967 separated and in the same order as SLURM_JOB_NODELIST. If two
1968 or more consecutive nodes are to have the same task count, that
1969 count is followed by "(x#)" where "#" is the repetition count.
1970 For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the
1971 first three nodes will each execute two tasks and the fourth
1972 node will execute one task.
1973
1974 SLURM_TASK_PID
1975 The process ID of the task being started.
1976
1977 SLURM_TOPOLOGY_ADDR
1978 This is set only if the system has the topology/tree plugin
1979 configured. The value will be set to the names of the network
1980 switches which may be involved in the job's communications,
1981 from the system's top level switch down to the leaf switch and
1982 ending with node name. A period is used to separate each hard‐
1983 ware component name.
1984
1985 SLURM_TOPOLOGY_ADDR_PATTERN
1986 This is set only if the system has the topology/tree plugin
1987 configured. The value will be set to the component types listed in
1988 SLURM_TOPOLOGY_ADDR. Each component will be identified as
1989 either "switch" or "node". A period is used to separate each
1990 hardware component type.
1991
1992 SLURMD_NODENAME
1993 Name of the node running the job script.
1994
1995
1997 Specify a batch script by filename on the command line. The batch
1998 script specifies a 1 minute time limit for the job.
1999
2000 $ cat myscript
2001 #!/bin/sh
2002 #SBATCH --time=1
2003 srun hostname |sort
2004
2005 $ sbatch -N4 myscript
2006 sbatch: Submitted batch job 65537
2007
2008 $ cat slurm-65537.out
2009 host1
2010 host2
2011 host3
2012 host4
2013
2014
2015 Pass a batch script to sbatch on standard input:
2016
2017 $ sbatch -N4 <<EOF
2018 > #!/bin/sh
2019 > srun hostname |sort
2020 > EOF
2021 sbatch: Submitted batch job 65541
2022
2023 $ cat slurm-65541.out
2024 host1
2025 host2
2026 host3
2027 host4
2028
2029
2030 To create a heterogeneous job with 3 components, each allocating a
2031 unique set of nodes:
2032
2033 sbatch -w node[2-3] : -w node4 : -w node[5-7] work.bash
2034 Submitted batch job 34987
2035
2036
2038 Copyright (C) 2006-2007 The Regents of the University of California.
2039 Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
2040 Copyright (C) 2008-2010 Lawrence Livermore National Security.
2041 Copyright (C) 2010-2017 SchedMD LLC.
2042
2043 This file is part of Slurm, a resource management program. For
2044 details, see <https://slurm.schedmd.com/>.
2045
2046 Slurm is free software; you can redistribute it and/or modify it under
2047 the terms of the GNU General Public License as published by the Free
2048 Software Foundation; either version 2 of the License, or (at your
2049 option) any later version.
2050
2051 Slurm is distributed in the hope that it will be useful, but WITHOUT
2052 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
2053 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
2054 for more details.
2055
2056
2058 sinfo(1), sattach(1), salloc(1), squeue(1), scancel(1), scontrol(1),
2059 slurm.conf(5), sched_setaffinity (2), numa (3)
2060
2061
2062
2063February 2019 Slurm Commands sbatch(1)