sbatch(1)                       Slurm Commands                      sbatch(1)
2
3
NAME
sbatch - Submit a batch script to Slurm.
7
SYNOPSIS
sbatch [OPTIONS(0)...] [ : [OPTIONS(N)...]] script(0) [args(0)...]
11
12 Option(s) define multiple jobs in a co-scheduled heterogeneous job.
13 For more details about heterogeneous jobs see the document
14 https://slurm.schedmd.com/heterogeneous_jobs.html
15
DESCRIPTION
18 sbatch submits a batch script to Slurm. The batch script may be given
19 to sbatch through a file name on the command line, or if no file name
20 is specified, sbatch will read in a script from standard input. The
21 batch script may contain options preceded with "#SBATCH" before any
22 executable commands in the script. sbatch will stop processing further
23 #SBATCH directives once the first non-comment non-whitespace line has
24 been reached in the script.
25
26 sbatch exits immediately after the script is successfully transferred
27 to the Slurm controller and assigned a Slurm job ID. The batch script
is not necessarily granted resources immediately; it may sit in the
29 queue of pending jobs for some time before its required resources
30 become available.
31
32 By default both standard output and standard error are directed to a
33 file of the name "slurm-%j.out", where the "%j" is replaced with the
34 job allocation number. The file will be generated on the first node of
35 the job allocation. Other than the batch script itself, Slurm does no
36 movement of user files.
37
38 When the job allocation is finally granted for the batch script, Slurm
39 runs a single copy of the batch script on the first node in the set of
40 allocated nodes.
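
For illustration, a minimal batch script might look like the following
sketch (the job name, time limit and program are placeholders, not
defaults imposed by sbatch):

     #!/bin/bash
     #SBATCH --job-name=example          # name shown by squeue
     #SBATCH --ntasks=1                  # a single task
     #SBATCH --time=00:10:00             # ten minute time limit

     srun hostname                       # job step run inside the allocation

It would be submitted with "sbatch example.sh", after which sbatch
prints the assigned job ID and exits.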
41
42 The following document describes the influence of various options on
43 the allocation of cpus to jobs and tasks.
44 https://slurm.schedmd.com/cpu_management.html
45
RETURN VALUE
sbatch will return 0 on success or an error code on failure.
49
SCRIPT PATH RESOLUTION
52 The batch script is resolved in the following order:
53
54 1. If script starts with ".", then path is constructed as: current
55 working directory / script
56
57 2. If script starts with a "/", then path is considered absolute.
58
59 3. If script is in current working directory.
60
61 4. If script can be resolved through PATH. See path_resolution(7).
62
63 Current working directory is the calling process working directory
64 unless the --chdir argument is passed, which will override the current
65 working directory.
66
OPTIONS
69 -a, --array=<indexes>
70 Submit a job array, multiple jobs to be executed with identical
71 parameters. The indexes specification identifies what array
72 index values should be used. Multiple values may be specified
73 using a comma separated list and/or a range of values with a "-"
74 separator. For example, "--array=0-15" or "--array=0,6,16-32".
75 A step function can also be specified with a suffix containing a
76 colon and number. For example, "--array=0-15:4" is equivalent to
77 "--array=0,4,8,12". A maximum number of simultaneously running
78 tasks from the job array may be specified using a "%" separator.
79 For example "--array=0-15%4" will limit the number of simultane‐
80 ously running tasks from this job array to 4. The minimum index
value is 0. The maximum value is one less than the configuration
parameter MaxArraySize. NOTE: Currently, federated job
83 arrays only run on the local cluster.
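
As a sketch of typical job array usage (the script, program and data
file names are illustrative only):

     #!/bin/bash
     #SBATCH --array=0-15%4              # 16 array tasks, at most 4 running at once
     #SBATCH --output=slurm-%A_%a.out    # %A = array master job ID, %a = array index

     # Each array task processes the input file selected by its own index.
     srun ./process_file input_${SLURM_ARRAY_TASK_ID}.dat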
84
85
86 -A, --account=<account>
87 Charge resources used by this job to specified account. The
88 account is an arbitrary string. The account name may be changed
89 after job submission using the scontrol command.
90
91
92 --acctg-freq
93 Define the job accounting and profiling sampling intervals.
94 This can be used to override the JobAcctGatherFrequency parame‐
95 ter in Slurm's configuration file, slurm.conf. The supported
96 format is as follows:
97
98 --acctg-freq=<datatype>=<interval>
99 where <datatype>=<interval> specifies the task sam‐
100 pling interval for the jobacct_gather plugin or a
101 sampling interval for a profiling type by the
102 acct_gather_profile plugin. Multiple, comma-sepa‐
103 rated <datatype>=<interval> intervals may be speci‐
104 fied. Supported datatypes are as follows:
105
106 task=<interval>
107 where <interval> is the task sampling inter‐
108 val in seconds for the jobacct_gather plugins
109 and for task profiling by the
110 acct_gather_profile plugin. NOTE: This fre‐
111 quency is used to monitor memory usage. If
112 memory limits are enforced the highest fre‐
113 quency a user can request is what is config‐
114 ured in the slurm.conf file. They can not
115 turn it off (=0) either.
116
117 energy=<interval>
118 where <interval> is the sampling interval in
119 seconds for energy profiling using the
120 acct_gather_energy plugin
121
122 network=<interval>
123 where <interval> is the sampling interval in
124 seconds for infiniband profiling using the
125 acct_gather_infiniband plugin.
126
127 filesystem=<interval>
128 where <interval> is the sampling interval in
129 seconds for filesystem profiling using the
130 acct_gather_filesystem plugin.
131
132 The default value for the task sampling
133 interval is 30 seconds.
134 The default value for all other intervals is 0. An interval of
135 0 disables sampling of the specified type. If the task sampling
136 interval is 0, accounting information is collected only at job
137 termination (reducing Slurm interference with the job).
138 Smaller (non-zero) values have a greater impact upon job perfor‐
139 mance, but a value of 30 seconds is not likely to be noticeable
140 for applications having less than 10,000 tasks.
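
For example, a plausible invocation (the 15 and 60 second intervals
are arbitrary illustrative values, and my_job.sh is a placeholder):

     sbatch --acctg-freq=task=15,energy=60 my_job.sh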
141
142
143 -B --extra-node-info=<sockets[:cores[:threads]]>
144 Restrict node selection to nodes with at least the specified
145 number of sockets, cores per socket and/or threads per core.
146 NOTE: These options do not specify the resource allocation size.
147 Each value specified is considered a minimum. An asterisk (*)
148 can be used as a placeholder indicating that all available
149 resources of that type are to be utilized. Values can also be
150 specified as min-max. The individual levels can also be speci‐
151 fied in separate options if desired:
152 --sockets-per-node=<sockets>
153 --cores-per-socket=<cores>
154 --threads-per-core=<threads>
If the task/affinity plugin is enabled, then specifying an allocation
in this manner also results in subsequently launched tasks being
bound to threads if the -B option specifies a thread count, to cores
if a core count is specified, or to sockets otherwise. If SelectType is config‐
160 ured to select/cons_res, it must have a parameter of CR_Core,
161 CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this option
162 to be honored. If not specified, the scontrol show job will
163 display 'ReqS:C:T=*:*:*'. This option applies to job alloca‐
164 tions.
165
166
167 --batch=<list>
168 Nodes can have features assigned to them by the Slurm adminis‐
169 trator. Users can specify which of these features are required
by their batch script using this option. For example a job's
171 allocation may include both Intel Haswell and KNL nodes with
172 features "haswell" and "knl" respectively. On such a configura‐
173 tion the batch script would normally benefit by executing on a
174 faster Haswell node. This would be specified using the option
175 "--batch=haswell". The specification can include AND and OR
176 operators using the ampersand and vertical bar separators. For
177 example: "--batch=haswell|broadwell" or
178 "--batch=haswell|big_memory". The --batch argument must be a
179 subset of the job's --constraint=<list> argument (i.e. the job
180 can not request only KNL nodes, but require the script to exe‐
181 cute on a Haswell node). If the request can not be satisfied
182 from the resources allocated to the job, the batch script will
183 execute on the first node of the job allocation.
184
185
186 --bb=<spec>
187 Burst buffer specification. The form of the specification is
188 system dependent. Note the burst buffer may not be accessible
189 from a login node, but require that salloc spawn a shell on one
of its allocated compute nodes. See the description of Sal‐
191 locDefaultCommand in the slurm.conf man page for more informa‐
192 tion about how to spawn a remote shell.
193
194
195 --bbf=<file_name>
196 Path of file containing burst buffer specification. The form of
197 the specification is system dependent. These burst buffer
198 directives will be inserted into the submitted batch script.
199
200
201 -b, --begin=<time>
202 Submit the batch script to the Slurm controller immediately,
203 like normal, but tell the controller to defer the allocation of
204 the job until the specified time.
205
206 Time may be of the form HH:MM:SS to run a job at a specific time
207 of day (seconds are optional). (If that time is already past,
208 the next day is assumed.) You may also specify midnight, noon,
209 fika (3 PM) or teatime (4 PM) and you can have a time-of-day
210 suffixed with AM or PM for running in the morning or the
211 evening. You can also say what day the job will be run, by
specifying a date of the form MMDDYY or MM/DD/YY or YYYY-MM-DD.
213 Combine date and time using the following format
214 YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now +
215 count time-units, where the time-units can be seconds (default),
216 minutes, hours, days, or weeks and you can tell Slurm to run the
217 job today with the keyword today and to run the job tomorrow
218 with the keyword tomorrow. The value may be changed after job
219 submission using the scontrol command. For example:
220 --begin=16:00
221 --begin=now+1hour
222 --begin=now+60 (seconds by default)
223 --begin=2010-01-20T12:34:00
224
225
226 Notes on date/time specifications:
227 - Although the 'seconds' field of the HH:MM:SS time specifica‐
228 tion is allowed by the code, note that the poll time of the
229 Slurm scheduler is not precise enough to guarantee dispatch of
230 the job on the exact second. The job will be eligible to start
231 on the next poll following the specified time. The exact poll
232 interval depends on the Slurm scheduler (e.g., 60 seconds with
233 the default sched/builtin).
234 - If no time (HH:MM:SS) is specified, the default is
235 (00:00:00).
236 - If a date is specified without a year (e.g., MM/DD) then the
237 current year is assumed, unless the combination of MM/DD and
238 HH:MM:SS has already passed for that year, in which case the
239 next year is used.
240
241
242 --checkpoint=<time>
243 Specifies the interval between creating checkpoints of the job
244 step. By default, the job step will have no checkpoints cre‐
245 ated. Acceptable time formats include "minutes", "minutes:sec‐
246 onds", "hours:minutes:seconds", "days-hours", "days-hours:min‐
247 utes" and "days-hours:minutes:seconds".
248
249
250 --cluster-constraint=[!]<list>
251 Specifies features that a federated cluster must have to have a
252 sibling job submitted to it. Slurm will attempt to submit a sib‐
253 ling job to a cluster if it has at least one of the specified
254 features. If the "!" option is included, Slurm will attempt to
255 submit a sibling job to a cluster that has none of the specified
256 features.
257
258
259 --comment=<string>
An arbitrary comment. Enclose it in double quotes if it contains
spaces or special characters.
262
263
264 -C, --constraint=<list>
265 Nodes can have features assigned to them by the Slurm adminis‐
266 trator. Users can specify which of these features are required
267 by their job using the constraint option. Only nodes having
268 features matching the job constraints will be used to satisfy
269 the request. Multiple constraints may be specified with AND,
270 OR, matching OR, resource counts, etc. (some operators are not
271 supported on all system types). Supported constraint options
272 include:
273
274 Single Name
275 Only nodes which have the specified feature will be used.
276 For example, --constraint="intel"
277
278 Node Count
279 A request can specify the number of nodes needed with
280 some feature by appending an asterisk and count after the
281 feature name. For example "--nodes=16 --con‐
282 straint=graphics*4 ..." indicates that the job requires
283 16 nodes and that at least four of those nodes must have
284 the feature "graphics."
285
AND    Only nodes with all of the specified features will be
used. The ampersand is used for an AND operator. For
288 example, --constraint="intel&gpu"
289
OR     Only nodes with at least one of the specified features will
be used. The vertical bar is used for an OR opera‐
292 tor. For example, --constraint="intel|amd"
293
294 Matching OR
295 If only one of a set of possible options should be used
296 for all allocated nodes, then use the OR operator and
297 enclose the options within square brackets. For example:
298 "--constraint=[rack1|rack2|rack3|rack4]" might be used to
299 specify that all nodes must be allocated on a single rack
300 of the cluster, but any of those four racks can be used.
301
302 Multiple Counts
303 Specific counts of multiple resources may be specified by
304 using the AND operator and enclosing the options within
305 square brackets. For example: "--con‐
306 straint=[rack1*2&rack2*4]" might be used to specify that
307 two nodes must be allocated from nodes with the feature
308 of "rack1" and four nodes must be allocated from nodes
309 with the feature "rack2".
310
311 NOTE: This construct does not support multiple Intel KNL
312 NUMA or MCDRAM modes. For example, while "--con‐
313 straint=[(knl&quad)*2&(knl&hemi)*4]" is not supported,
314 "--constraint=[haswell*2&(knl&hemi)*4]" is supported.
315 Specification of multiple KNL modes requires the use of a
316 heterogeneous job.
317
318
Parentheses
Parentheses can be used to group like node features
321 together. For example "--con‐
322 straint=[(knl&snc4&flat)*4&haswell*1]" might be used to
323 specify that four nodes with the features "knl", "snc4"
324 and "flat" plus one node with the feature "haswell" are
required. All options within parentheses should be
326 grouped with AND (e.g. "&") operands.
327
328
329 --contiguous
330 If set, then the allocated nodes must form a contiguous set.
331 Not honored with the topology/tree or topology/3d_torus plugins,
332 both of which can modify the node ordering.
333
334
335 --cores-per-socket=<cores>
336 Restrict node selection to nodes with at least the specified
337 number of cores per socket. See additional information under -B
338 option above when task/affinity plugin is enabled.
339
340
--cpu-freq=<p1[-p2[:p3]]>
342
343 Request that job steps initiated by srun commands inside this
344 sbatch script be run at some requested frequency if possible, on
345 the CPUs selected for the step on the compute node(s).
346
347 p1 can be [#### | low | medium | high | highm1] which will set
348 the frequency scaling_speed to the corresponding value, and set
349 the frequency scaling_governor to UserSpace. See below for defi‐
350 nition of the values.
351
352 p1 can be [Conservative | OnDemand | Performance | PowerSave]
353 which will set the scaling_governor to the corresponding value.
354 The governor has to be in the list set by the slurm.conf option
355 CpuFreqGovernors.
356
357 When p2 is present, p1 will be the minimum scaling frequency and
358 p2 will be the maximum scaling frequency.
359
p2 can be [#### | medium | high | highm1]. p2 must be greater
361 than p1.
362
363 p3 can be [Conservative | OnDemand | Performance | PowerSave |
364 UserSpace] which will set the governor to the corresponding
365 value.
366
367 If p3 is UserSpace, the frequency scaling_speed will be set by a
368 power or energy aware scheduling strategy to a value between p1
369 and p2 that lets the job run within the site's power goal. The
370 job may be delayed if p1 is higher than a frequency that allows
371 the job to run within the goal.
372
373 If the current frequency is < min, it will be set to min. Like‐
374 wise, if the current frequency is > max, it will be set to max.
375
376 Acceptable values at present include:
377
378 #### frequency in kilohertz
379
380 Low the lowest available frequency
381
382 High the highest available frequency
383
384 HighM1 (high minus one) will select the next highest
385 available frequency
386
387 Medium attempts to set a frequency in the middle of the
388 available range
389
390 Conservative attempts to use the Conservative CPU governor
391
392 OnDemand attempts to use the OnDemand CPU governor (the
393 default value)
394
395 Performance attempts to use the Performance CPU governor
396
397 PowerSave attempts to use the PowerSave CPU governor
398
399 UserSpace attempts to use the UserSpace CPU governor
400
401
The following informational environment variable is set in the
job step when the --cpu-freq option is requested.
SLURM_CPU_FREQ_REQ
406
407 This environment variable can also be used to supply the value
408 for the CPU frequency request if it is set when the 'srun' com‐
409 mand is issued. The --cpu-freq on the command line will over‐
ride the environment variable value. The form of the environ‐
411 ment variable is the same as the command line. See the ENVIRON‐
412 MENT VARIABLES section for a description of the
413 SLURM_CPU_FREQ_REQ variable.
414
415 NOTE: This parameter is treated as a request, not a requirement.
416 If the job step's node does not support setting the CPU fre‐
417 quency, or the requested value is outside the bounds of the
418 legal frequencies, an error is logged, but the job step is
419 allowed to continue.
420
421 NOTE: Setting the frequency for just the CPUs of the job step
422 implies that the tasks are confined to those CPUs. If task con‐
423 finement (i.e., TaskPlugin=task/affinity or TaskPlu‐
424 gin=task/cgroup with the "ConstrainCores" option) is not config‐
425 ured, this parameter is ignored.
426
427 NOTE: When the step completes, the frequency and governor of
428 each selected CPU is reset to the previous values.
429
NOTE: Submitting jobs with the --cpu-freq option when linuxproc is
configured as the ProctrackType can cause jobs to run too quickly,
before accounting is able to poll for job information. As a
result, not all of the accounting information will be present.
434
435
436 --cpus-per-gpu=<ncpus>
437 Advise Slurm that ensuing job steps will require ncpus proces‐
438 sors per allocated GPU. Requires the --gpus option. Not com‐
439 patible with the --cpus-per-task option.
440
441
442 -c, --cpus-per-task=<ncpus>
443 Advise the Slurm controller that ensuing job steps will require
444 ncpus number of processors per task. Without this option, the
445 controller will just try to allocate one processor per task.
446
447 For instance, consider an application that has 4 tasks, each
448 requiring 3 processors. If our cluster is comprised of
quad-processor nodes and we simply ask for 12 processors, the
450 controller might give us only 3 nodes. However, by using the
451 --cpus-per-task=3 options, the controller knows that each task
452 requires 3 processors on the same node, and the controller will
453 grant an allocation of 4 nodes, one for each of the 4 tasks.
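
The scenario above could be written, roughly, as the following script
(the application name is a placeholder):

     #!/bin/bash
     #SBATCH --ntasks=4                  # 4 tasks
     #SBATCH --cpus-per-task=3           # each task needs 3 processors on one node

     srun ./my_app                       # each of the 4 tasks gets 3 CPUs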
454
455
456 --deadline=<OPT>
Remove the job if no ending is possible before this deadline
458 (start > (deadline - time[-min])). Default is no deadline.
459 Valid time formats are:
460 HH:MM[:SS] [AM|PM]
461 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
462 MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]
464
465
466 --delay-boot=<minutes>
Do not reboot nodes in order to satisfy this job's feature
468 specification if the job has been eligible to run for less than
469 this time period. If the job has waited for less than the spec‐
470 ified period, it will use only nodes which already have the
471 specified features. The argument is in units of minutes. A
472 default value may be set by a system administrator using the
473 delay_boot option of the SchedulerParameters configuration
474 parameter in the slurm.conf file, otherwise the default value is
475 zero (no delay).
476
477
478 -d, --dependency=<dependency_list>
479 Defer the start of this job until the specified dependencies
have been satisfied. <dependency_list> is of the form
481 <type:job_id[:job_id][,type:job_id[:job_id]]> or
482 <type:job_id[:job_id][?type:job_id[:job_id]]>. All dependencies
483 must be satisfied if the "," separator is used. Any dependency
484 may be satisfied if the "?" separator is used. Many jobs can
485 share the same dependency and these jobs may even belong to dif‐
486 ferent users. The value may be changed after job submission
487 using the scontrol command. Once a job dependency fails due to
488 the termination state of a preceding job, the dependent job will
489 never be run, even if the preceding job is requeued and has a
490 different termination state in a subsequent execution.
491
492 after:job_id[:jobid...]
493 This job can begin execution after the specified jobs
494 have begun execution.
495
496 afterany:job_id[:jobid...]
497 This job can begin execution after the specified jobs
498 have terminated.
499
500 afterburstbuffer:job_id[:jobid...]
501 This job can begin execution after the specified jobs
502 have terminated and any associated burst buffer stage out
503 operations have completed.
504
505 aftercorr:job_id[:jobid...]
506 A task of this job array can begin execution after the
507 corresponding task ID in the specified job has completed
508 successfully (ran to completion with an exit code of
509 zero).
510
511 afternotok:job_id[:jobid...]
512 This job can begin execution after the specified jobs
513 have terminated in some failed state (non-zero exit code,
514 node failure, timed out, etc).
515
516 afterok:job_id[:jobid...]
517 This job can begin execution after the specified jobs
518 have successfully executed (ran to completion with an
519 exit code of zero).
520
521 expand:job_id
522 Resources allocated to this job should be used to expand
523 the specified job. The job to expand must share the same
524 QOS (Quality of Service) and partition. Gang scheduling
525 of resources in the partition is also not supported.
526
527 singleton
528 This job can begin execution after any previously
529 launched jobs sharing the same job name and user have
530 terminated. In other words, only one job by that name
531 and owned by that user can be running or suspended at any
532 point in time.
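
As a sketch of typical usage, the job ID of the first job can be
captured with the --parsable option (which makes sbatch print only the
job ID) and fed into the dependency of the second; the script names
are placeholders:

     jobid=$(sbatch --parsable preprocess.sh)
     sbatch --dependency=afterok:${jobid} analyze.sh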
533
534
535 -D, --chdir=<directory>
536 Set the working directory of the batch script to directory
537 before it is executed. The path can be specified as full path or
538 relative path to the directory where the command is executed.
539
540
541 -e, --error=<filename pattern>
542 Instruct Slurm to connect the batch script's standard error
543 directly to the file name specified in the "filename pattern".
544 By default both standard output and standard error are directed
545 to the same file. For job arrays, the default file name is
546 "slurm-%A_%a.out", "%A" is replaced by the job ID and "%a" with
547 the array index. For other jobs, the default file name is
548 "slurm-%j.out", where the "%j" is replaced by the job ID. See
549 the filename pattern section below for filename specification
550 options.
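
For example, to separate the two streams using such patterns (the file
name prefix is arbitrary):

     #SBATCH --output=myjob_%j.out       # standard output, %j = job ID
     #SBATCH --error=myjob_%j.err        # standard error in a separate file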
551
552
553 --exclusive[=user|mcs]
554 The job allocation can not share nodes with other running jobs
555 (or just other users with the "=user" option or with the "=mcs"
556 option). The default shared/exclusive behavior depends on sys‐
557 tem configuration and the partition's OverSubscribe option takes
558 precedence over the job's option.
559
560
561 --export=<environment variables [ALL] | NONE>
562 Identify which environment variables from the submission envi‐
563 ronment are propagated to the launched application. By default,
564 all are propagated. Multiple environment variable names should
565 be comma separated. Environment variable names may be specified
566 to propagate the current value (e.g. "--export=EDITOR") or spe‐
567 cific values may be exported (e.g. "--export=EDI‐
568 TOR=/bin/emacs"). In these two examples, the propagated envi‐
569 ronment will only contain the variable EDITOR, along with
570 SLURM_* environment variables. However, Slurm will then implic‐
571 itly attempt to load the user's environment on the node where
572 the script is being executed, as if --get-user-env was speci‐
573 fied. This will happen whenever NONE or environment variables
574 are specified. If one desires to add to the submission environ‐
575 ment instead of replacing it, have the argument include ALL
576 (e.g. "--export=ALL,EDITOR=/bin/emacs"). Make sure ALL is speci‐
577 fied first, since sbatch applies the environment from left to
578 right, overwriting as necessary. Environment variables propa‐
579 gated from the submission environment will always overwrite
580 environment variables found in the user environment on the node.
581 If one desires no environment variables be propagated from the
582 submitting machine, use the argument NONE. Regardless of this
583 setting, the appropriate SLURM_* task environment variables are
584 always exported to the environment. This option is particularly
585 important for jobs that are submitted on one cluster and execute
586 on a different cluster (e.g. with different paths). To avoid
587 steps inheriting environment export settings (e.g. NONE) from
588 sbatch command, the environment variable SLURM_EXPORT_ENV should
589 be set to ALL in the job script.
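
A sketch of that last pattern (the program name is a placeholder):

     #!/bin/bash
     #SBATCH --export=NONE               # do not propagate the submission environment

     export SLURM_EXPORT_ENV=ALL         # let job steps export their full environment
     srun ./my_program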
590
591
592 --export-file=<filename | fd>
593 If a number between 3 and OPEN_MAX is specified as the argument
594 to this option, a readable file descriptor will be assumed
595 (STDIN and STDOUT are not supported as valid arguments). Other‐
596 wise a filename is assumed. Export environment variables
597 defined in <filename> or read from <fd> to the job's execution
598 environment. The content is one or more environment variable
599 definitions of the form NAME=value, each separated by a null
600 character. This allows the use of special characters in envi‐
601 ronment definitions.
602
603
604 -F, --nodefile=<node file>
605 Much like --nodelist, but the list is contained in a file of
606 name node file. The node names of the list may also span multi‐
607 ple lines in the file. Duplicate node names in the file will
608 be ignored. The order of the node names in the list is not
609 important; the node names will be sorted by Slurm.
610
611
612 --get-user-env[=timeout][mode]
613 This option will tell sbatch to retrieve the login environment
614 variables for the user specified in the --uid option. The envi‐
615 ronment variables are retrieved by running something of this
616 sort "su - <username> -c /usr/bin/env" and parsing the output.
617 Be aware that any environment variables already set in sbatch's
618 environment will take precedence over any environment variables
619 in the user's login environment. Clear any environment variables
620 before calling sbatch that you do not want propagated to the
621 spawned program. The optional timeout value is in seconds.
Default value is 8 seconds. The optional mode value controls the
623 "su" options. With a mode value of "S", "su" is executed with‐
624 out the "-" option. With a mode value of "L", "su" is executed
625 with the "-" option, replicating the login environment. If mode
is not specified, the mode established at Slurm build time is used.
Examples of use include "--get-user-env", "--get-user-env=10",
628 "--get-user-env=10L", and "--get-user-env=S". This option was
629 originally created for use by Moab.
630
631
632 --gid=<group>
633 If sbatch is run as root, and the --gid option is used, submit
634 the job with group's group access permissions. group may be the
635 group name or the numerical group ID.
636
637
638 -G, --gpus=[<type>:]<number>
639 Specify the total number of GPUs required for the job. An
640 optional GPU type specification can be supplied. For example
641 "--gpus=volta:3". Multiple options can be requested in a comma
642 separated list, for example: "--gpus=volta:3,kepler:1". See
643 also the --gpus-per-node, --gpus-per-socket and --gpus-per-task
644 options.
645
646
647 --gpu-bind=<type>
648 Bind tasks to specific GPUs. By default every spawned task can
649 access every GPU allocated to the job.
650
651 Supported type options:
652
653 closest Bind each task to the GPU(s) which are closest. In a
654 NUMA environment, each task may be bound to more than
655 one GPU (i.e. all GPUs in that NUMA environment).
656
657 map_gpu:<list>
Bind by setting GPU IDs on tasks (or ranks) as spec‐
659 ified where <list> is
660 <gpu_id_for_task_0>,<gpu_id_for_task_1>,... GPU IDs
661 are interpreted as decimal values unless they are pre‐
ceded with '0x', in which case they are interpreted as
663 hexadecimal values. If the number of tasks (or ranks)
664 exceeds the number of elements in this list, elements
665 in the list will be reused as needed starting from the
666 beginning of the list. To simplify support for large
667 task counts, the lists may follow a map with an aster‐
668 isk and repetition count. For example
669 "map_gpu:0*4,1*4". Not supported unless the entire
670 node is allocated to the job.
671
672 mask_gpu:<list>
673 Bind by setting GPU masks on tasks (or ranks) as spec‐
674 ified where <list> is
675 <gpu_mask_for_task_0>,<gpu_mask_for_task_1>,... The
676 mapping is specified for a node and identical mapping
677 is applied to the tasks on every node (i.e. the lowest
678 task ID on each node is mapped to the first mask spec‐
679 ified in the list, etc.). GPU masks are always inter‐
680 preted as hexadecimal values but can be preceded with
681 an optional '0x'. Not supported unless the entire node
682 is allocated to the job. To simplify support for large
683 task counts, the lists may follow a map with an aster‐
684 isk and repetition count. For example
685 "mask_gpu:0x0f*4,0xf0*4". Not supported unless the
686 entire node is allocated to the job.
687
688
--gpu-freq=[<type>=]<value>[,<type>=<value>][,verbose]
690 Request that GPUs allocated to the job are configured with spe‐
691 cific frequency values. This option can be used to indepen‐
692 dently configure the GPU and its memory frequencies. After the
693 job is completed, the frequencies of all affected GPUs will be
694 reset to the highest possible values. In some cases, system
695 power caps may override the requested values. The field type
696 can be "memory". If type is not specified, the GPU frequency is
697 implied. The value field can either be "low", "medium", "high",
698 "highm1" or a numeric value in megahertz (MHz). If the speci‐
699 fied numeric value is not possible, a value as close as possible
700 will be used. See below for definition of the values. The ver‐
701 bose option causes current GPU frequency information to be
702 logged. Examples of use include "--gpu-freq=medium,memory=high"
703 and "--gpu-freq=450".
704
705 Supported value definitions:
706
707 low the lowest available frequency.
708
709 medium attempts to set a frequency in the middle of the
710 available range.
711
712 high the highest available frequency.
713
714 highm1 (high minus one) will select the next highest avail‐
715 able frequency.
716
717
718 --gpus-per-node=[<type>:]<number>
719 Specify the number of GPUs required for the job on each node
720 included in the job's resource allocation. An optional GPU type
721 specification can be supplied. For example
722 "--gpus-per-node=volta:3". Multiple options can be requested in
723 a comma separated list, for example:
724 "--gpus-per-node=volta:3,kepler:1". See also the --gpus,
725 --gpus-per-socket and --gpus-per-task options.
726
727
728 --gpus-per-socket=[<type>:]<number>
729 Specify the number of GPUs required for the job on each socket
730 included in the job's resource allocation. An optional GPU type
731 specification can be supplied. For example
732 "--gpus-per-socket=volta:3". Multiple options can be requested
733 in a comma separated list, for example:
734 "--gpus-per-socket=volta:3,kepler:1". Requires job to specify a
735 sockets per node count ( --sockets-per-node). See also the
736 --gpus, --gpus-per-node and --gpus-per-task options.
737
738
739 --gpus-per-task=[<type>:]<number>
740 Specify the number of GPUs required for the job on each task to
741 be spawned in the job's resource allocation. An optional GPU
742 type specification can be supplied. This option requires the
743 specification of a task count. For example
744 "--gpus-per-task=volta:1". Multiple options can be requested in
745 a comma separated list, for example:
746 "--gpus-per-task=volta:3,kepler:1". Requires job to specify a
task count (--ntasks). See also the --gpus, --gpus-per-socket
748 and --gpus-per-node options.
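
A hedged illustration of these options together (the GPU type "volta"
is an assumption about the local gres.conf, and the program name is a
placeholder):

     #!/bin/bash
     #SBATCH --ntasks=4                  # task count required by --gpus-per-task
     #SBATCH --gpus-per-task=volta:1     # one "volta" GPU for each task

     srun ./gpu_app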
749
750
751 --gres=<list>
752 Specifies a comma delimited list of generic consumable
753 resources. The format of each entry on the list is
754 "name[[:type]:count]". The name is that of the consumable
755 resource. The count is the number of those resources with a
756 default value of 1. The count can have a suffix of "k" or "K"
757 (multiple of 1024), "m" or "M" (multiple of 1024 x 1024), "g" or
758 "G" (multiple of 1024 x 1024 x 1024), "t" or "T" (multiple of
759 1024 x 1024 x 1024 x 1024), "p" or "P" (multiple of 1024 x 1024
760 x 1024 x 1024 x 1024). The specified resources will be allo‐
761 cated to the job on each node. The available generic consumable
762 resources is configurable by the system administrator. A list
763 of available generic consumable resources will be printed and
764 the command will exit if the option argument is "help". Exam‐
765 ples of use include "--gres=gpu:2,mic:1", "--gres=gpu:kepler:2",
766 and "--gres=help".
767
768
769 --gres-flags=<type>
770 Specify generic resource task binding options.
771
772 disable-binding
773 Disable filtering of CPUs with respect to generic
774 resource locality. This option is currently required to
775 use more CPUs than are bound to a GRES (i.e. if a GPU is
776 bound to the CPUs on one socket, but resources on more
777 than one socket are required to run the job). This
778 option may permit a job to be allocated resources sooner
779 than otherwise possible, but may result in lower job per‐
780 formance.
781
782 enforce-binding
783 The only CPUs available to the job will be those bound to
784 the selected GRES (i.e. the CPUs identified in the
785 gres.conf file will be strictly enforced). This option
786 may result in delayed initiation of a job. For example a
787 job requiring two GPUs and one CPU will be delayed until
788 both GPUs on a single socket are available rather than
789 using GPUs bound to separate sockets, however the appli‐
790 cation performance may be improved due to improved commu‐
791 nication speed. Requires the node to be configured with
792 more than one socket and resource filtering will be per‐
793 formed on a per-socket basis.
794
795
796 -H, --hold
797 Specify the job is to be submitted in a held state (priority of
798 zero). A held job can now be released using scontrol to reset
799 its priority (e.g. "scontrol release <job_id>").
800
801
802 -h, --help
803 Display help information and exit.
804
805
806 --hint=<type>
807 Bind tasks according to application hints.
808
809 compute_bound
810 Select settings for compute bound applications: use all
811 cores in each socket, one thread per core.
812
813 memory_bound
814 Select settings for memory bound applications: use only
815 one core in each socket, one thread per core.
816
817 [no]multithread
818 [don't] use extra threads with in-core multi-threading
819 which can benefit communication intensive applications.
820 Only supported with the task/affinity plugin.
821
822 help show this help message
823
824
825 --ignore-pbs
826 Ignore any "#PBS" options specified in the batch script.
827
828
829 -i, --input=<filename pattern>
830 Instruct Slurm to connect the batch script's standard input
831 directly to the file name specified in the "filename pattern".
832
833 By default, "/dev/null" is open on the batch script's standard
834 input and both standard output and standard error are directed
835 to a file of the name "slurm-%j.out", where the "%j" is replaced
836 with the job allocation number, as described below in the file‐
837 name pattern section.
838
839
840 -J, --job-name=<jobname>
841 Specify a name for the job allocation. The specified name will
842 appear along with the job id number when querying running jobs
843 on the system. The default is the name of the batch script, or
844 just "sbatch" if the script is read on sbatch's standard input.
845
846
847 -k, --no-kill [=off]
848 Do not automatically terminate a job if one of the nodes it has
849 been allocated fails. The user will assume the responsibilities
850 for fault-tolerance should a node fail. When there is a node
851 failure, any active job steps (usually MPI jobs) on that node
852 will almost certainly suffer a fatal error, but with --no-kill,
853 the job allocation will not be revoked so the user may launch
854 new job steps on the remaining nodes in their allocation.
855
Specify an optional argument of "off" to disable the effect of the
857 SBATCH_NO_KILL environment variable.
858
859 By default Slurm terminates the entire job allocation if any
860 node fails in its range of allocated nodes.
861
862
863 --kill-on-invalid-dep=<yes|no>
If a job has an invalid dependency and therefore can never run, this
parameter tells Slurm whether to terminate it or not. A terminated job
866 state will be JOB_CANCELLED. If this option is not specified
867 the system wide behavior applies. By default the job stays
pending with reason DependencyNeverSatisfied, or, if
kill_invalid_depend is specified in slurm.conf, the job is termi‐
870 nated.
871
872
873 -L, --licenses=<license>
874 Specification of licenses (or other resources available on all
875 nodes of the cluster) which must be allocated to this job.
876 License names can be followed by a colon and count (the default
877 count is one). Multiple license names should be comma separated
878 (e.g. "--licenses=foo:4,bar"). To submit jobs using remote
879 licenses, those served by the slurmdbd, specify the name of the
880 server providing the licenses. For example "--license=nas‐
881 tran@slurmdb:12".
882
883
884 -M, --clusters=<string>
885 Clusters to issue commands to. Multiple cluster names may be
886 comma separated. The job will be submitted to the one cluster
887 providing the earliest expected job initiation time. The default
888 value is the current cluster. A value of 'all' will query to run
889 on all clusters. Note the --export option to control environ‐
890 ment variables exported between clusters. Note that the Slur‐
891 mDBD must be up for this option to work properly.
892
893
894 -m, --distribution=
895 arbitrary|<block|cyclic|plane=<options>[:block|cyclic|fcyclic]>
896
897 Specify alternate distribution methods for remote processes. In
898 sbatch, this only sets environment variables that will be used
899 by subsequent srun requests. This option controls the assign‐
900 ment of tasks to the nodes on which resources have been allo‐
901 cated, and the distribution of those resources to tasks for
902 binding (task affinity). The first distribution method (before
903 the ":") controls the distribution of resources across nodes.
904 The optional second distribution method (after the ":") controls
905 the distribution of resources across sockets within a node.
906 Note that with select/cons_res, the number of cpus allocated on
907 each socket and node may be different. Refer to
908 https://slurm.schedmd.com/mc_support.html for more information
909 on resource allocation, assignment of tasks to nodes, and bind‐
910 ing of tasks to CPUs.
911
912 First distribution method:
913
914 block The block distribution method will distribute tasks to a
915 node such that consecutive tasks share a node. For exam‐
916 ple, consider an allocation of three nodes each with two
917 cpus. A four-task block distribution request will dis‐
918 tribute those tasks to the nodes with tasks one and two
919 on the first node, task three on the second node, and
920 task four on the third node. Block distribution is the
921 default behavior if the number of tasks exceeds the num‐
922 ber of allocated nodes.
923
924 cyclic The cyclic distribution method will distribute tasks to a
925 node such that consecutive tasks are distributed over
926 consecutive nodes (in a round-robin fashion). For exam‐
927 ple, consider an allocation of three nodes each with two
928 cpus. A four-task cyclic distribution request will dis‐
929 tribute those tasks to the nodes with tasks one and four
930 on the first node, task two on the second node, and task
931 three on the third node. Note that when SelectType is
932 select/cons_res, the same number of CPUs may not be allo‐
933 cated on each node. Task distribution will be round-robin
934 among all the nodes with CPUs yet to be assigned to
935 tasks. Cyclic distribution is the default behavior if
936 the number of tasks is no larger than the number of allo‐
937 cated nodes.
938
939 plane The tasks are distributed in blocks of a specified size.
940 The options include a number representing the size of the
941 task block. This is followed by an optional specifica‐
942 tion of the task distribution scheme within a block of
943 tasks and between the blocks of tasks. The number of
944 tasks distributed to each node is the same as for cyclic
945 distribution, but the taskids assigned to each node
946 depend on the plane size. For more details (including
947 examples and diagrams), please see
948 https://slurm.schedmd.com/mc_support.html
949 and
950 https://slurm.schedmd.com/dist_plane.html
951
952 arbitrary
953 The arbitrary method of distribution will allocate pro‐
cesses in order as listed in the file designated by the envi‐
ronment variable SLURM_HOSTFILE. If this variable is
set, it will override any other method specified. If it is
not set, the method will default to block. The hostfile
must contain at a minimum the number of hosts requested,
one per line or comma separated. If
960 specifying a task count (-n, --ntasks=<number>), your
961 tasks will be laid out on the nodes in the order of the
962 file.
963 NOTE: The arbitrary distribution option on a job alloca‐
964 tion only controls the nodes to be allocated to the job
965 and not the allocation of CPUs on those nodes. This
966 option is meant primarily to control a job step's task
967 layout in an existing job allocation for the srun com‐
968 mand.
969
970
971 Second distribution method:
972
973 block The block distribution method will distribute tasks to
974 sockets such that consecutive tasks share a socket.
975
976 cyclic The cyclic distribution method will distribute tasks to
977 sockets such that consecutive tasks are distributed over
978 consecutive sockets (in a round-robin fashion). Tasks
979 requiring more than one CPU will have all of those CPUs
980 allocated on a single socket if possible.
981
982 fcyclic
983 The fcyclic distribution method will distribute tasks to
984 sockets such that consecutive tasks are distributed over
985 consecutive sockets (in a round-robin fashion). Tasks
requiring more than one CPU will have each CPU allocated
987 in a cyclic fashion across sockets.
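
For example (illustrative only; recall that in sbatch these settings
are merely passed on to later srun calls through the environment):

     #SBATCH --ntasks=8
     #SBATCH --distribution=cyclic:block  # round-robin across nodes, block across sockets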
988
989
990 --mail-type=<type>
991 Notify user by email when certain event types occur. Valid type
992 values are NONE, BEGIN, END, FAIL, REQUEUE, ALL (equivalent to
993 BEGIN, END, FAIL, REQUEUE, and STAGE_OUT), STAGE_OUT (burst buf‐
994 fer stage out and teardown completed), TIME_LIMIT, TIME_LIMIT_90
995 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80
996 percent of time limit), TIME_LIMIT_50 (reached 50 percent of
997 time limit) and ARRAY_TASKS (send emails for each array task).
998 Multiple type values may be specified in a comma separated list.
999 The user to be notified is indicated with --mail-user. Unless
1000 the ARRAY_TASKS option is specified, mail notifications on job
1001 BEGIN, END and FAIL apply to a job array as a whole rather than
1002 generating individual email messages for each task in the job
1003 array.
1004
1005
1006 --mail-user=<user>
1007 User to receive email notification of state changes as defined
1008 by --mail-type. The default value is the submitting user.
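
For example (the address is a placeholder):

     #SBATCH --mail-type=END,FAIL        # mail when the job ends or fails
     #SBATCH --mail-user=user@example.com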
1009
1010
1011 --mcs-label=<mcs>
1012 Used only when the mcs/group plugin is enabled. This parameter
1013 is a group among the groups of the user. Default value is cal‐
culated by the mcs plugin if it is enabled.
1015
1016
1017 --mem=<size[units]>
1018 Specify the real memory required per node. Default units are
1019 megabytes unless the SchedulerParameters configuration parameter
1020 includes the "default_gbytes" option for gigabytes. Different
1021 units can be specified using the suffix [K|M|G|T]. Default
1022 value is DefMemPerNode and the maximum value is MaxMemPerNode.
1023 If configured, both parameters can be seen using the scontrol
1024 show config command. This parameter would generally be used if
1025 whole nodes are allocated to jobs (SelectType=select/linear).
1026 Also see --mem-per-cpu and --mem-per-gpu. The --mem,
1027 --mem-per-cpu and --mem-per-gpu options are mutually exclusive.
1028 If --mem, --mem-per-cpu or --mem-per-gpu are specified as com‐
1029 mand line arguments, then they will take precedence over the
1030 environment.
1031
1032 NOTE: A memory size specification of zero is treated as a spe‐
1033 cial case and grants the job access to all of the memory on each
1034 node. If the job is allocated multiple nodes in a heterogeneous
1035 cluster, the memory limit on each node will be that of the node
1036 in the allocation with the smallest memory size (same limit will
1037 apply to every node in the job's allocation).
1038
1039 NOTE: Enforcement of memory limits currently relies upon the
1040 task/cgroup plugin or enabling of accounting, which samples mem‐
1041 ory use on a periodic basis (data need not be stored, just col‐
1042 lected). In both cases memory use is based upon the job's Resi‐
1043 dent Set Size (RSS). A task may exceed the memory limit until
1044 the next periodic accounting sample.
1045
1046
1047 --mem-per-cpu=<size[units]>
1048 Minimum memory required per allocated CPU. Default units are
1049 megabytes unless the SchedulerParameters configuration parameter
1050 includes the "default_gbytes" option for gigabytes. Default
1051 value is DefMemPerCPU and the maximum value is MaxMemPerCPU (see
1052 exception below). If configured, both parameters can be seen
1053 using the scontrol show config command. Note that if the job's
1054 --mem-per-cpu value exceeds the configured MaxMemPerCPU, then
1055 the user's limit will be treated as a memory limit per task;
1056 --mem-per-cpu will be reduced to a value no larger than MaxMem‐
1057 PerCPU; --cpus-per-task will be set and the value of
1058 --cpus-per-task multiplied by the new --mem-per-cpu value will
1059 equal the original --mem-per-cpu value specified by the user.
1060 This parameter would generally be used if individual processors
1061 are allocated to jobs (SelectType=select/cons_res). If
1062 resources are allocated by the core, socket or whole nodes; the
1063 number of CPUs allocated to a job may be higher than the task
1064 count and the value of --mem-per-cpu should be adjusted accord‐
1065 ingly. Also see --mem and --mem-per-gpu. The --mem,
1066 --mem-per-cpu and --mem-per-gpu options are mutually exclusive.
1067
NOTE: If the final amount of memory requested by a job (e.g. when
--mem-per-cpu is used with the --exclusive option) can't be satisfied by
any of the nodes configured in the partition, the job will be
rejected.
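
As a sketch of the difference between the per-node and per-CPU forms
(the sizes are arbitrary, and the two options may not be combined):

     #SBATCH --mem=16G                   # 16 gigabytes on each allocated node

or, alternatively:

     #SBATCH --mem-per-cpu=2G            # 2 gigabytes per allocated CPU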
1072
1073
1074 --mem-per-gpu=<size[units]>
1075 Minimum memory required per allocated GPU. Default units are
1076 megabytes unless the SchedulerParameters configuration parameter
1077 includes the "default_gbytes" option for gigabytes. Different
1078 units can be specified using the suffix [K|M|G|T]. Default
1079 value is DefMemPerGPU and is available on both a global and per
1080 partition basis. If configured, the parameters can be seen
1081 using the scontrol show config and scontrol show partition com‐
1082 mands. Also see --mem. The --mem, --mem-per-cpu and
1083 --mem-per-gpu options are mutually exclusive.
1084
1085
1086 --mem-bind=[{quiet,verbose},]type
1087 Bind tasks to memory. Used only when the task/affinity plugin is
1088 enabled and the NUMA memory functions are available. Note that
1089 the resolution of CPU and memory binding may differ on some
1090 architectures. For example, CPU binding may be performed at the
1091 level of the cores within a processor while memory binding will
1092 be performed at the level of nodes, where the definition of
1093 "nodes" may differ from system to system. By default no memory
1094 binding is performed; any task using any CPU can use any memory.
1095 This option is typically used to ensure that each task is bound
to the memory closest to its assigned CPU. The use of any type
1097 other than "none" or "local" is not recommended. If you want
1098 greater control, try running a simple test code with the options
1099 "--cpu-bind=verbose,none --mem-bind=verbose,none" to determine
1100 the specific configuration.
1101
1102 NOTE: To have Slurm always report on the selected memory binding
1103 for all commands executed in a shell, you can enable verbose
1104 mode by setting the SLURM_MEM_BIND environment variable value to
1105 "verbose".
1106
1107 The following informational environment variables are set when
1108 --mem-bind is in use:
1109
1110 SLURM_MEM_BIND_LIST
1111 SLURM_MEM_BIND_PREFER
1112 SLURM_MEM_BIND_SORT
1113 SLURM_MEM_BIND_TYPE
1114 SLURM_MEM_BIND_VERBOSE
1115
1116 See the ENVIRONMENT VARIABLES section for a more detailed
1117 description of the individual SLURM_MEM_BIND* variables.
1118
1119 Supported options include:
1120
1121 help show this help message
1122
1123 local Use memory local to the processor in use
1124
1125 map_mem:<list>
1126 Bind by setting memory masks on tasks (or ranks) as spec‐
1127 ified where <list> is
1128 <numa_id_for_task_0>,<numa_id_for_task_1>,... The map‐
1129 ping is specified for a node and identical mapping is
1130 applied to the tasks on every node (i.e. the lowest task
1131 ID on each node is mapped to the first ID specified in
1132 the list, etc.). NUMA IDs are interpreted as decimal
1133 values unless they are preceded with '0x' in which case
they are interpreted as hexadecimal values. If the number of
1135 tasks (or ranks) exceeds the number of elements in this
1136 list, elements in the list will be reused as needed
1137 starting from the beginning of the list. To simplify
1138 support for large task counts, the lists may follow a map
with an asterisk and repetition count. For example
1140 "map_mem:0x0f*4,0xf0*4". Not supported unless the entire
1141 node is allocated to the job.
1142
1143 mask_mem:<list>
1144 Bind by setting memory masks on tasks (or ranks) as spec‐
1145 ified where <list> is
1146 <numa_mask_for_task_0>,<numa_mask_for_task_1>,... The
1147 mapping is specified for a node and identical mapping is
1148 applied to the tasks on every node (i.e. the lowest task
1149 ID on each node is mapped to the first mask specified in
1150 the list, etc.). NUMA masks are always interpreted as
1151 hexadecimal values. Note that masks must be preceded
1152 with a '0x' if they don't begin with [0-9] so they are
1153 seen as numerical values. If the number of tasks (or
1154 ranks) exceeds the number of elements in this list, ele‐
1155 ments in the list will be reused as needed starting from
1156 the beginning of the list. To simplify support for large
1157 task counts, the lists may follow a mask with an asterisk
and repetition count. For example "mask_mem:0*4,1*4". Not
1159 supported unless the entire node is allocated to the job.
1160
1161 no[ne] don't bind tasks to memory (default)
1162
1163 p[refer]
1164 Prefer use of first specified NUMA node, but permit
1165 use of other available NUMA nodes.
1166
1167 q[uiet]
1168 quietly bind before task runs (default)
1169
1170 rank bind by task rank (not recommended)
1171
1172 sort sort free cache pages (run zonesort on Intel KNL nodes)
1173
1174 v[erbose]
1175 verbosely report binding before task runs
1176
1177
1178 --mincpus=<n>
1179 Specify a minimum number of logical cpus/processors per node.
1180
1181
1182 -N, --nodes=<minnodes[-maxnodes]>
1183 Request that a minimum of minnodes nodes be allocated to this
1184 job. A maximum node count may also be specified with maxnodes.
1185 If only one number is specified, this is used as both the mini‐
1186 mum and maximum node count. The partition's node limits super‐
1187 sede those of the job. If a job's node limits are outside of
1188 the range permitted for its associated partition, the job will
1189 be left in a PENDING state. This permits possible execution at
1190 a later time, when the partition limit is changed. If a job
1191 node limit exceeds the number of nodes configured in the parti‐
1192 tion, the job will be rejected. Note that the environment vari‐
1193 able SLURM_JOB_NODES will be set to the count of nodes actually
1194 allocated to the job. See the ENVIRONMENT VARIABLES section for
1195 more information. If -N is not specified, the default behavior
1196 is to allocate enough nodes to satisfy the requirements of the
1197 -n and -c options. The job will be allocated as many nodes as
1198 possible within the range specified and without delaying the
1199 initiation of the job. The node count specification may include
1200 a numeric value followed by a suffix of "k" (multiplies numeric
1201 value by 1,024) or "m" (multiplies numeric value by 1,048,576).
1202
1203
1204 -n, --ntasks=<number>
1205 sbatch does not launch tasks, it requests an allocation of
1206 resources and submits a batch script. This option advises the
1207 Slurm controller that job steps run within the allocation will
1208 launch a maximum of number tasks and to provide for sufficient
1209 resources. The default is one task per node, but note that the
1210 --cpus-per-task option will change this default.
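
For instance, the controller can be left to choose the node count for
a given task count, or the node count can be pinned explicitly (the
values are illustrative):

     #SBATCH --ntasks=8                  # eight tasks; node count chosen by Slurm

or:

     #SBATCH --nodes=2                   # exactly two nodes
     #SBATCH --ntasks=8                  # eight tasks across them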
1211
1212
1213 --network=<type>
1214 Specify information pertaining to the switch or network. The
1215 interpretation of type is system dependent. This option is sup‐
1216 ported when running Slurm on a Cray natively. It is used to
1217 request using Network Performance Counters. Only one value per
request is valid. All options are case-insensitive. In this
1219 configuration supported values include:
1220
1221 system
1222 Use the system-wide network performance counters. Only
1223 nodes requested will be marked in use for the job alloca‐
1224 tion. If the job does not fill up the entire system the
1225 rest of the nodes are not able to be used by other jobs
using NPC; if idle, their state will appear as PerfCnts.
1227 These nodes are still available for other jobs not using
1228 NPC.
1229
1230 blade Use the blade network performance counters. Only nodes
1231 requested will be marked in use for the job allocation.
1232 If the job does not fill up the entire blade(s) allocated
1233 to the job those blade(s) are not able to be used by other
jobs using NPC; if idle, their state will appear as PerfC‐
1235 nts. These nodes are still available for other jobs not
1236 using NPC.
1237
1238
1239 In all cases the job allocation request must specify the
1240 --exclusive option. Otherwise the request will be denied.
1241
1242 Also with any of these options steps are not allowed to share
1243 blades, so resources would remain idle inside an allocation if
1244 the step running on a blade does not take up all the nodes on
1245 the blade.
1246
1247 The network option is also supported on systems with IBM's Par‐
1248 allel Environment (PE). See IBM's LoadLeveler job command key‐
1249 word documentation about the keyword "network" for more informa‐
1250 tion. Multiple values may be specified in a comma separated
list. All options are case-insensitive. Supported values
1252 include:
1253
1254 BULK_XFER[=<resources>]
1255 Enable bulk transfer of data using Remote
1256 Direct-Memory Access (RDMA). The optional resources
1257 specification is a numeric value which can have a
1258 suffix of "k", "K", "m", "M", "g" or "G" for kilo‐
1259 bytes, megabytes or gigabytes. NOTE: The resources
1260 specification is not supported by the underlying IBM
1261 infrastructure as of Parallel Environment version
1262 2.2 and no value should be specified at this time.
1263
1264 CAU=<count> Number of Collective Acceleration Units (CAU)
1265 required. Applies only to IBM Power7-IH processors.
1266 Default value is zero. Independent CAU will be
1267 allocated for each programming interface (MPI, LAPI,
1268 etc.)
1269
1270 DEVNAME=<name>
1271 Specify the device name to use for communications
1272 (e.g. "eth0" or "mlx4_0").
1273
1274 DEVTYPE=<type>
1275 Specify the device type to use for communications.
1276 The supported values of type are: "IB" (InfiniBand),
1277 "HFI" (P7 Host Fabric Interface), "IPONLY" (IP-Only
1278 interfaces), "HPCE" (HPC Ethernet), and "KMUX" (Ker‐
1279 nel Emulation of HPCE). The devices allocated to a
1280 job must all be of the same type. The default value
                   depends upon what hardware is available and, in order
                   of preference, is IPONLY (which is not considered in
                   User Space mode), HFI, IB, HPCE, and KMUX.
1285
1286 IMMED =<count>
1287 Number of immediate send slots per window required.
1288 Applies only to IBM Power7-IH processors. Default
1289 value is zero.
1290
1291 INSTANCES =<count>
                   Specify the number of network connections for each
                   task on each network. The default instance count is 1.
1295
1296 IPV4 Use Internet Protocol (IP) version 4 communications
1297 (default).
1298
1299 IPV6 Use Internet Protocol (IP) version 6 communications.
1300
1301 LAPI Use the LAPI programming interface.
1302
1303 MPI Use the MPI programming interface. MPI is the
1304 default interface.
1305
1306 PAMI Use the PAMI programming interface.
1307
1308 SHMEM Use the OpenSHMEM programming interface.
1309
1310 SN_ALL Use all available switch networks (default).
1311
1312 SN_SINGLE Use one available switch network.
1313
1314 UPC Use the UPC programming interface.
1315
1316 US Use User Space communications.
1317
1318
1319 Some examples of network specifications:
1320
1321 Instances=2,US,MPI,SN_ALL
1322 Create two user space connections for MPI communica‐
1323 tions on every switch network for each task.
1324
1325 US,MPI,Instances=3,Devtype=IB
1326 Create three user space connections for MPI communi‐
1327 cations on every InfiniBand network for each task.
1328
1329 IPV4,LAPI,SN_Single
                   Create an IP version 4 connection for LAPI
                   communications on one switch network for each task.
1332
1333 Instances=2,US,LAPI,MPI
1334 Create two user space connections each for LAPI and
1335 MPI communications on every switch network for each
1336 task. Note that SN_ALL is the default option so
1337 every switch network is used. Also note that
1338 Instances=2 specifies that two connections are
1339 established for each protocol (LAPI and MPI) and
1340 each task. If there are two networks and four tasks
1341 on the node then a total of 32 connections are
1342 established (2 instances x 2 protocols x 2 networks
1343 x 4 tasks).
1344
1345
1346 --nice[=adjustment]
1347 Run the job with an adjusted scheduling priority within Slurm.
1348 With no adjustment value the scheduling priority is decreased by
       100. A negative nice value increases the priority; a positive
       value decreases it. The adjustment range is +/- 2147483645. Only
       privileged users can specify a negative adjustment.
1352
1353
1354 --no-requeue
       Specifies that the batch job should never be requeued under any
       circumstances. Setting this option will prevent the job from
       being restarted by system administrators (for example, after a
       scheduled downtime), requeued after a node failure, or requeued
       upon preemption by a higher priority job. When a job is requeued,
       the batch script is initiated from its beginning.
1361 Also see the --requeue option. The JobRequeue configuration
1362 parameter controls the default behavior on the cluster.
1363
1364
1365 --ntasks-per-core=<ntasks>
1366 Request the maximum ntasks be invoked on each core. Meant to be
1367 used with the --ntasks option. Related to --ntasks-per-node
1368 except at the core level instead of the node level. NOTE: This
1369 option is not supported unless SelectType=cons_res is configured
1370 (either directly or indirectly on Cray systems) along with the
1371 node's core count.
1372
1373
1374 --ntasks-per-node=<ntasks>
1375 Request that ntasks be invoked on each node. If used with the
1376 --ntasks option, the --ntasks option will take precedence and
1377 the --ntasks-per-node will be treated as a maximum count of
1378 tasks per node. Meant to be used with the --nodes option. This
1379 is related to --cpus-per-task=ncpus, but does not require knowl‐
1380 edge of the actual number of cpus on each node. In some cases,
1381 it is more convenient to be able to request that no more than a
1382 specific number of tasks be invoked on each node. Examples of
1383 this include submitting a hybrid MPI/OpenMP app where only one
1384 MPI "task/rank" should be assigned to each node while allowing
1385 the OpenMP portion to utilize all of the parallelism present in
1386 the node, or submitting a single setup/cleanup/monitoring job to
1387 each node of a pre-existing allocation as one step in a larger
1388 job script.
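
       As an illustrative sketch of the hybrid MPI/OpenMP case described
       above (the application name and thread count are placeholders):

            #!/bin/sh
            #SBATCH --nodes=4
            #SBATCH --ntasks-per-node=1
            #SBATCH --cpus-per-task=8
            # One MPI rank per node; OpenMP threads use the CPUs
            # allocated to each task.
            export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
            srun ./hybrid_app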
1389
1390
1391 --ntasks-per-socket=<ntasks>
1392 Request the maximum ntasks be invoked on each socket. Meant to
1393 be used with the --ntasks option. Related to --ntasks-per-node
1394 except at the socket level instead of the node level. NOTE:
1395 This option is not supported unless SelectType=cons_res is con‐
1396 figured (either directly or indirectly on Cray systems) along
1397 with the node's socket count.
1398
1399
1400 -O, --overcommit
1401 Overcommit resources. When applied to job allocation, only one
1402 CPU is allocated to the job per node and options used to specify
1403 the number of tasks per node, socket, core, etc. are ignored.
1404 When applied to job step allocations (the srun command when exe‐
1405 cuted within an existing job allocation), this option can be
1406 used to launch more than one task per CPU. Normally, srun will
1407 not allocate more than one process per CPU. By specifying
1408 --overcommit you are explicitly allowing more than one process
       per CPU. However no more than MAX_TASKS_PER_NODE tasks are
       permitted to execute per node. NOTE: MAX_TASKS_PER_NODE is
       defined in the file slurm.h and is not a variable; it is set at
       Slurm build time.
1413
1414
1415 -o, --output=<filename pattern>
1416 Instruct Slurm to connect the batch script's standard output
1417 directly to the file name specified in the "filename pattern".
1418 By default both standard output and standard error are directed
       to the same file. For job arrays, the default file name is
       "slurm-%A_%a.out", where "%A" is replaced by the job ID and "%a"
       by the array index. For other jobs, the default file name is
1422 "slurm-%j.out", where the "%j" is replaced by the job ID. See
1423 the filename pattern section below for filename specification
1424 options.
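
       For example (the file names are illustrative), output and error
       files that embed the job ID, or array-aware names:

            #SBATCH --output=myjob_%j.out
            #SBATCH --error=myjob_%j.err
            # For a job array:
            #SBATCH --output=myjob_%A_%a.out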
1425
1426
1427 --open-mode=append|truncate
1428 Open the output and error files using append or truncate mode as
1429 specified. The default value is specified by the system config‐
1430 uration parameter JobFileAppend.
1431
1432
1433 --parsable
1434 Outputs only the job id number and the cluster name if present.
1435 The values are separated by a semicolon. Errors will still be
1436 displayed.
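
       A common use is capturing the job ID in a shell script, for
       example to submit a dependent job (the script names are
       placeholders):

            # Only the job ID (and cluster name, if any, after a ";") is
            # printed; strip a possible ";cluster" suffix before reuse.
            jobid=$(sbatch --parsable first_step.sh)
            sbatch --dependency=afterok:${jobid%%;*} second_step.sh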
1437
1438
1439 -p, --partition=<partition_names>
1440 Request a specific partition for the resource allocation. If
1441 not specified, the default behavior is to allow the slurm con‐
1442 troller to select the default partition as designated by the
1443 system administrator. If the job can use more than one parti‐
       tion, specify their names in a comma separated list and the one
1445 offering earliest initiation will be used with no regard given
1446 to the partition name ordering (although higher priority parti‐
1447 tions will be considered first). When the job is initiated, the
1448 name of the partition used will be placed first in the job
1449 record partition string.
1450
1451
1452 --power=<flags>
       Comma separated list of power management plugin options.
       Currently available flags include: level (all nodes allocated to
       the job should have identical power caps; this may be disabled by
       the Slurm configuration option PowerParameters=job_no_level).
1457
1458
1459 --priority=<value>
1460 Request a specific job priority. May be subject to configura‐
1461 tion specific constraints. value should either be a numeric
1462 value or "TOP" (for highest possible value). Only Slurm opera‐
1463 tors and administrators can set the priority of a job.
1464
1465
1466 --profile=<all|none|[energy[,|task[,|lustre[,|network]]]]>
1467 enables detailed data collection by the acct_gather_profile
1468 plugin. Detailed data are typically time-series that are stored
1469 in an HDF5 file for the job or an InfluxDB database depending on
1470 the configured plugin.
1471
1472
1473 All All data types are collected. (Cannot be combined with
1474 other values.)
1475
1476
1477 None No data types are collected. This is the default.
1478 (Cannot be combined with other values.)
1479
1480
1481 Energy Energy data is collected.
1482
1483
1484 Task Task (I/O, Memory, ...) data is collected.
1485
1486
1487 Lustre Lustre data is collected.
1488
1489
1490 Network Network (InfiniBand) data is collected.
1491
1492
1493 --propagate[=rlimit[,rlimit...]]
1494 Allows users to specify which of the modifiable (soft) resource
1495 limits to propagate to the compute nodes and apply to their
1496 jobs. If no rlimit is specified, then all resource limits will
1497 be propagated. The following rlimit names are supported by
1498 Slurm (although some options may not be supported on some sys‐
1499 tems):
1500
1501 ALL All limits listed below (default)
1502
1503 NONE No limits listed below
1504
1505 AS The maximum address space for a process
1506
1507 CORE The maximum size of core file
1508
1509 CPU The maximum amount of CPU time
1510
1511 DATA The maximum size of a process's data segment
1512
1513 FSIZE The maximum size of files created. Note that if the
1514 user sets FSIZE to less than the current size of the
1515 slurmd.log, job launches will fail with a 'File size
1516 limit exceeded' error.
1517
1518 MEMLOCK The maximum size that may be locked into memory
1519
1520 NOFILE The maximum number of open files
1521
1522 NPROC The maximum number of processes available
1523
1524 RSS The maximum resident set size
1525
1526 STACK The maximum stack size
1527
1528
1529 -q, --qos=<qos>
1530 Request a quality of service for the job. QOS values can be
1531 defined for each user/cluster/account association in the Slurm
1532 database. Users will be limited to their association's defined
1533 set of qos's when the Slurm configuration parameter, Account‐
       ingStorageEnforce, includes "qos" in its definition.
1535
1536
1537 -Q, --quiet
       Suppress informational messages from sbatch such as the Job ID.
       Errors will still be displayed.
1540
1541
1542 --reboot
1543 Force the allocated nodes to reboot before starting the job.
1544 This is only supported with some system configurations and will
1545 otherwise be silently ignored.
1546
1547
1548 --requeue
       Specifies that the batch job should be eligible for requeuing.
1550 The job may be requeued explicitly by a system administrator,
1551 after node failure, or upon preemption by a higher priority job.
1552 When a job is requeued, the batch script is initiated from its
1553 beginning. Also see the --no-requeue option. The JobRequeue
1554 configuration parameter controls the default behavior on the
1555 cluster.
1556
1557
1558 --reservation=<name>
1559 Allocate resources for the job from the named reservation.
1560
1561
1562 -s, --oversubscribe
1563 The job allocation can over-subscribe resources with other run‐
1564 ning jobs. The resources to be over-subscribed can be nodes,
1565 sockets, cores, and/or hyperthreads depending upon configura‐
1566 tion. The default over-subscribe behavior depends on system
1567 configuration and the partition's OverSubscribe option takes
1568 precedence over the job's option. This option may result in the
1569 allocation being granted sooner than if the --oversubscribe
1570 option was not set and allow higher system utilization, but
1571 application performance will likely suffer due to competition
1572 for resources. Also see the --exclusive option.
1573
1574
1575 -S, --core-spec=<num>
1576 Count of specialized cores per node reserved by the job for sys‐
1577 tem operations and not used by the application. The application
1578 will not use these cores, but will be charged for their alloca‐
1579 tion. Default value is dependent upon the node's configured
1580 CoreSpecCount value. If a value of zero is designated and the
1581 Slurm configuration option AllowSpecResourcesUsage is enabled,
1582 the job will be allowed to override CoreSpecCount and use the
1583 specialized resources on nodes it is allocated. This option can
1584 not be used with the --thread-spec option.
1585
1586
1587 --signal=[B:]<sig_num>[@<sig_time>]
1588 When a job is within sig_time seconds of its end time, send it
1589 the signal sig_num. Due to the resolution of event handling by
1590 Slurm, the signal may be sent up to 60 seconds earlier than
1591 specified. sig_num may either be a signal number or name (e.g.
1592 "10" or "USR1"). sig_time must have an integer value between 0
1593 and 65535. By default, no signal is sent before the job's end
1594 time. If a sig_num is specified without any sig_time, the
1595 default time will be 60 seconds. Use the "B:" option to signal
       only the batch shell; none of the other processes will be
       signaled. By default all job steps will be signaled, but not the
1598 batch shell itself. To have the signal sent at preemption time
1599 see the preempt_send_user_signal SlurmctldParameter.
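
       As an illustrative sketch (the program name and checkpoint action
       are placeholders), a script that asks to be signaled ten minutes
       before its time limit and traps the signal in the batch shell:

            #!/bin/sh
            #SBATCH --time=30
            #SBATCH --signal=B:USR1@600
            # "B:" delivers USR1 to the batch shell itself.
            trap 'echo "time limit near, saving state"; touch checkpoint.flag' USR1
            # Run srun in the background so the shell can handle the trap.
            srun ./my_program &
            wait          # returns early when USR1 is caught
            wait          # resume waiting for the job step to finish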
1600
1601
1602 --sockets-per-node=<sockets>
1603 Restrict node selection to nodes with at least the specified
1604 number of sockets. See additional information under -B option
1605 above when task/affinity plugin is enabled.
1606
1607
1608 --spread-job
1609 Spread the job allocation over as many nodes as possible and
1610 attempt to evenly distribute tasks across the allocated nodes.
1611 This option disables the topology/tree plugin.
1612
1613
1614 --switches=<count>[@<max-time>]
1615 When a tree topology is used, this defines the maximum count of
1616 switches desired for the job allocation and optionally the maxi‐
1617 mum time to wait for that number of switches. If Slurm finds an
1618 allocation containing more switches than the count specified,
1619 the job remains pending until it either finds an allocation with
       desired switch count or the time limit expires. If there is no
       switch count limit, there is no delay in starting the job.
1622 Acceptable time formats include "minutes", "minutes:seconds",
1623 "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
1624 "days-hours:minutes:seconds". The job's maximum time delay may
1625 be limited by the system administrator using the SchedulerParam‐
1626 eters configuration parameter with the max_switch_wait parameter
1627 option. On a dragonfly network the only switch count supported
       is 1 since communication performance will be highest when a job
       is allocated resources on one leaf switch or more than 2 leaf
       switches. The default max-time is the max_switch_wait
       SchedulerParameters value.
1632
1633
1634 -t, --time=<time>
1635 Set a limit on the total run time of the job allocation. If the
1636 requested time limit exceeds the partition's time limit, the job
1637 will be left in a PENDING state (possibly indefinitely). The
1638 default time limit is the partition's default time limit. When
1639 the time limit is reached, each task in each job step is sent
1640 SIGTERM followed by SIGKILL. The interval between signals is
1641 specified by the Slurm configuration parameter KillWait. The
1642 OverTimeLimit configuration parameter may permit the job to run
1643 longer than scheduled. Time resolution is one minute and second
1644 values are rounded up to the next minute.
1645
1646 A time limit of zero requests that no time limit be imposed.
1647 Acceptable time formats include "minutes", "minutes:seconds",
1648 "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
1649 "days-hours:minutes:seconds".
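
       For example (values are illustrative):

            # Alternative spellings of a time limit; a script would use
            # only one of these.
            #SBATCH --time=30              # 30 minutes
            #SBATCH --time=2:00:00         # 2 hours
            #SBATCH --time=1-12:00:00      # 1 day and 12 hours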
1650
1651
1652 --test-only
1653 Validate the batch script and return an estimate of when a job
1654 would be scheduled to run given the current job queue and all
1655 the other arguments specifying the job requirements. No job is
1656 actually submitted.
1657
1658
1659 --thread-spec=<num>
1660 Count of specialized threads per node reserved by the job for
1661 system operations and not used by the application. The applica‐
1662 tion will not use these threads, but will be charged for their
1663 allocation. This option can not be used with the --core-spec
1664 option.
1665
1666
1667 --threads-per-core=<threads>
1668 Restrict node selection to nodes with at least the specified
1669 number of threads per core. NOTE: "Threads" refers to the num‐
1670 ber of processing units on each core rather than the number of
1671 application tasks to be launched per core. See additional
1672 information under -B option above when task/affinity plugin is
1673 enabled.
1674
1675
1676 --time-min=<time>
1677 Set a minimum time limit on the job allocation. If specified,
       the job may have its --time limit lowered to a value no lower
1679 than --time-min if doing so permits the job to begin execution
1680 earlier than otherwise possible. The job's time limit will not
1681 be changed after the job is allocated resources. This is per‐
1682 formed by a backfill scheduling algorithm to allocate resources
1683 otherwise reserved for higher priority jobs. Acceptable time
1684 formats include "minutes", "minutes:seconds", "hours:min‐
1685 utes:seconds", "days-hours", "days-hours:minutes" and
1686 "days-hours:minutes:seconds".
1687
1688
1689 --tmp=<size[units]>
1690 Specify a minimum amount of temporary disk space per node.
1691 Default units are megabytes unless the SchedulerParameters con‐
1692 figuration parameter includes the "default_gbytes" option for
1693 gigabytes. Different units can be specified using the suffix
1694 [K|M|G|T].
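
       For example, to request at least 20 gigabytes of temporary disk
       space per node:

            #SBATCH --tmp=20G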
1695
1696
1697 --usage
1698 Display brief help message and exit.
1699
1700
1701 --uid=<user>
1702 Attempt to submit and/or run a job as user instead of the invok‐
1703 ing user id. The invoking user's credentials will be used to
1704 check access permissions for the target partition. User root may
1705 use this option to run jobs as a normal user in a RootOnly par‐
1706 tition for example. If run as root, sbatch will drop its permis‐
1707 sions to the uid specified after node allocation is successful.
1708 user may be the user name or numerical user ID.
1709
1710
1711 --use-min-nodes
1712 If a range of node counts is given, prefer the smaller count.
1713
1714
1715 -V, --version
1716 Display version information and exit.
1717
1718
1719 -v, --verbose
1720 Increase the verbosity of sbatch's informational messages. Mul‐
1721 tiple -v's will further increase sbatch's verbosity. By default
1722 only errors will be displayed.
1723
1724
1725 -w, --nodelist=<node name list>
1726 Request a specific list of hosts. The job will contain all of
1727 these hosts and possibly additional hosts as needed to satisfy
1728 resource requirements. The list may be specified as a
1729 comma-separated list of hosts, a range of hosts (host[1-5,7,...]
1730 for example), or a filename. The host list will be assumed to
1731 be a filename if it contains a "/" character. If you specify a
1732 minimum node or processor count larger than can be satisfied by
1733 the supplied host list, additional resources will be allocated
1734 on other nodes as needed. Duplicate node names in the list will
1735 be ignored. The order of the node names in the list is not
1736 important; the node names will be sorted by Slurm.
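
       For example (host names are placeholders), either form below
       requests specific hosts; the second reads the list from a file,
       which is assumed because the argument contains a "/":

            sbatch --nodelist="node[01-04,07]" myscript
            sbatch --nodelist=./hosts.txt myscript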
1737
1738
1739 -W, --wait
1740 Do not exit until the submitted job terminates. The exit code
1741 of the sbatch command will be the same as the exit code of the
1742 submitted job. If the job terminated due to a signal rather than
1743 a normal exit, the exit code will be set to 1. In the case of a
1744 job array, the exit code recorded will be the highest value for
1745 any task in the job array.
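
       For example, a wrapper script can block on the submitted job and
       act on its exit status (the script name is a placeholder):

            if sbatch --wait myjob.sh; then
                echo "job succeeded"
            else
                echo "job failed" >&2
            fi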
1746
1747
1748 --wait-all-nodes=<value>
1749 Controls when the execution of the command begins. By default
1750 the job will begin execution as soon as the allocation is made.
1751
1752 0 Begin execution as soon as allocation can be made. Do not
1753 wait for all nodes to be ready for use (i.e. booted).
1754
1755 1 Do not begin execution until all nodes are ready for use.
1756
1757
1758 --wckey=<wckey>
1759 Specify wckey to be used with job. If TrackWCKey=no (default)
1760 in the slurm.conf this value is ignored.
1761
1762
1763 --wrap=<command string>
1764 Sbatch will wrap the specified command string in a simple "sh"
1765 shell script, and submit that script to the slurm controller.
1766 When --wrap is used, a script name and arguments may not be
1767 specified on the command line; instead the sbatch-generated
1768 wrapper script is used.
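
       For example, to submit a single command without writing a script
       file:

            sbatch -N1 --wrap="hostname"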
1769
1770
1771 -x, --exclude=<node name list>
1772 Explicitly exclude certain nodes from the resources granted to
1773 the job.
1774
1775
filename pattern
       sbatch allows for a filename pattern to contain one or more
       replacement symbols, which are a percent sign "%" followed by a
       letter (e.g. %j).
1779
1780 \\ Do not process any of the replacement symbols.
1781
1782 %% The character "%".
1783
1784 %A Job array's master job allocation number.
1785
1786 %a Job array ID (index) number.
1787
1788 %J jobid.stepid of the running job. (e.g. "128.0")
1789
1790 %j jobid of the running job.
1791
1792 %N short hostname. This will create a separate IO file per node.
1793
       %n     Node identifier relative to current job (e.g. "0" is the
              first node of the running job). This will create a
              separate IO file per node.
1797
1798 %s stepid of the running job.
1799
1800 %t task identifier (rank) relative to current job. This will create
1801 a separate IO file per task.
1802
1803 %u User name.
1804
1805 %x Job name.
1806
1807 A number placed between the percent character and format specifier may
1808 be used to zero-pad the result in the IO filename. This number is
1809 ignored if the format specifier corresponds to non-numeric data (%N
1810 for example).
1811
1812 Some examples of how the format string may be used for a 4 task job
1813 step with a Job ID of 128 and step id of 0 are included below:
1814
1815 job%J.out job128.0.out
1816
1817 job%4j.out job0128.out
1818
1819 job%j-%2t.out job128-00.out, job128-01.out, ...
1820
INPUT ENVIRONMENT VARIABLES
       Upon startup, sbatch will read and handle the options set in the
       following environment variables. Note that environment variables
       will override any options set in a batch script, and command line
       options will override any environment variables.
1826
1827
1828 SBATCH_ACCOUNT Same as -A, --account
1829
1830 SBATCH_ACCTG_FREQ Same as --acctg-freq
1831
1832 SBATCH_ARRAY_INX Same as -a, --array
1833
1834 SBATCH_BATCH Same as --batch
1835
1836 SBATCH_CHECKPOINT Same as --checkpoint
1837
1838 SBATCH_CLUSTERS or SLURM_CLUSTERS
1839 Same as --clusters
1840
1841 SBATCH_CONSTRAINT Same as -C, --constraint
1842
1843 SBATCH_CORE_SPEC Same as --core-spec
1844
1845 SBATCH_CPUS_PER_GPU Same as --cpus-per-gpu
1846
1847 SBATCH_DEBUG Same as -v, --verbose
1848
1849 SBATCH_DELAY_BOOT Same as --delay-boot
1850
1851 SBATCH_DISTRIBUTION Same as -m, --distribution
1852
1853 SBATCH_EXCLUSIVE Same as --exclusive
1854
1855 SBATCH_EXPORT Same as --export
1856
1857 SBATCH_GET_USER_ENV Same as --get-user-env
1858
1859 SBATCH_GPUS Same as -G, --gpus
1860
1861 SBATCH_GPU_BIND Same as --gpu-bind
1862
1863 SBATCH_GPU_FREQ Same as --gpu-freq
1864
1865 SBATCH_GPUS_PER_NODE Same as --gpus-per-node
1866
       SBATCH_GPUS_PER_TASK  Same as --gpus-per-task

       SBATCH_GRES           Same as --gres
1869
1870 SBATCH_GRES_FLAGS Same as --gres-flags
1871
1872 SBATCH_HINT or SLURM_HINT
1873 Same as --hint
1874
1875 SBATCH_IGNORE_PBS Same as --ignore-pbs
1876
1877 SBATCH_JOB_NAME Same as -J, --job-name
1878
1879 SBATCH_MEM_BIND Same as --mem-bind
1880
1881 SBATCH_MEM_PER_GPU Same as --mem-per-gpu
1882
1883 SBATCH_NETWORK Same as --network
1884
1885 SBATCH_NO_KILL Same as -k, --no-kill
1886
1887 SBATCH_NO_REQUEUE Same as --no-requeue
1888
1889 SBATCH_OPEN_MODE Same as --open-mode
1890
1891 SBATCH_OVERCOMMIT Same as -O, --overcommit
1892
1893 SBATCH_PARTITION Same as -p, --partition
1894
1895 SBATCH_POWER Same as --power
1896
1897 SBATCH_PROFILE Same as --profile
1898
1899 SBATCH_QOS Same as --qos
1900
1901 SBATCH_RESERVATION Same as --reservation
1902
1903 SBATCH_REQ_SWITCH When a tree topology is used, this defines the
1904 maximum count of switches desired for the job
1905 allocation and optionally the maximum time to
1906 wait for that number of switches. See --switches
1907
1908 SBATCH_REQUEUE Same as --requeue
1909
1910 SBATCH_SIGNAL Same as --signal
1911
1912 SBATCH_SPREAD_JOB Same as --spread-job
1913
1914 SBATCH_THREAD_SPEC Same as --thread-spec
1915
1916 SBATCH_TIMELIMIT Same as -t, --time
1917
1918 SBATCH_USE_MIN_NODES Same as --use-min-nodes
1919
1920 SBATCH_WAIT Same as -W, --wait
1921
1922 SBATCH_WAIT_ALL_NODES Same as --wait-all-nodes
1923
1924 SBATCH_WAIT4SWITCH Max time waiting for requested switches. See
1925 --switches
1926
1927 SBATCH_WCKEY Same as --wckey
1928
1929 SLURM_CONF The location of the Slurm configuration file.
1930
1931 SLURM_EXIT_ERROR Specifies the exit code generated when a Slurm
1932 error occurs (e.g. invalid options). This can be
1933 used by a script to distinguish application exit
1934 codes from various Slurm error conditions.
1935
1936 SLURM_STEP_KILLED_MSG_NODE_ID=ID
1937 If set, only the specified node will log when the
1938 job or step are killed by a signal.
1939
1940
OUTPUT ENVIRONMENT VARIABLES
       The Slurm controller will set the following variables in the
       environment of the batch script.
1944
1945 SBATCH_MEM_BIND
1946 Set to value of the --mem-bind option.
1947
1948 SBATCH_MEM_BIND_LIST
1949 Set to bit mask used for memory binding.
1950
1951 SBATCH_MEM_BIND_PREFER
1952 Set to "prefer" if the --mem-bind option includes the prefer
1953 option.
1954
1955 SBATCH_MEM_BIND_TYPE
1956 Set to the memory binding type specified with the --mem-bind
1957 option. Possible values are "none", "rank", "map_map",
1958 "mask_mem" and "local".
1959
1960 SBATCH_MEM_BIND_VERBOSE
1961 Set to "verbose" if the --mem-bind option includes the verbose
1962 option. Set to "quiet" otherwise.
1963
1964 SLURM_*_PACK_GROUP_#
1965 For a heterogeneous job allocation, the environment variables
1966 are set separately for each component.
1967
1968 SLURM_ARRAY_TASK_COUNT
1969 Total number of tasks in a job array.
1970
1971 SLURM_ARRAY_TASK_ID
1972 Job array ID (index) number.
1973
1974 SLURM_ARRAY_TASK_MAX
1975 Job array's maximum ID (index) number.
1976
1977 SLURM_ARRAY_TASK_MIN
1978 Job array's minimum ID (index) number.
1979
1980 SLURM_ARRAY_TASK_STEP
1981 Job array's index step size.
1982
1983 SLURM_ARRAY_JOB_ID
1984 Job array's master job ID number.
1985
1986 SLURM_CLUSTER_NAME
1987 Name of the cluster on which the job is executing.
1988
1989 SLURM_CPUS_ON_NODE
1990 Number of CPUS on the allocated node.
1991
1992 SLURM_CPUS_PER_GPU
1993 Number of CPUs requested per allocated GPU. Only set if the
1994 --cpus-per-gpu option is specified.
1995
1996 SLURM_CPUS_PER_TASK
1997 Number of cpus requested per task. Only set if the
1998 --cpus-per-task option is specified.
1999
2000 SLURM_DISTRIBUTION
2001 Same as -m, --distribution
2002
2003 SLURM_EXPORT_ENV
              Same as --export.
2005
2006 SLURM_GPUS
2007 Number of GPUs requested. Only set if the -G, --gpus option is
2008 specified.
2009
2010 SLURM_GPU_BIND
2011 Requested binding of tasks to GPU. Only set if the --gpu-bind
2012 option is specified.
2013
2014 SLURM_GPU_FREQ
2015 Requested GPU frequency. Only set if the --gpu-freq option is
2016 specified.
2017
2018 SLURM_GPUS_PER_NODE
2019 Requested GPU count per allocated node. Only set if the
2020 --gpus-per-node option is specified.
2021
2022 SLURM_GPUS_PER_SOCKET
2023 Requested GPU count per allocated socket. Only set if the
2024 --gpus-per-socket option is specified.
2025
2026 SLURM_GPUS_PER_TASK
2027 Requested GPU count per allocated task. Only set if the
2028 --gpus-per-task option is specified.
2029
2030 SLURM_GTIDS
2031 Global task IDs running on this node. Zero origin and comma
2032 separated.
2033
2034 SLURM_JOB_ACCOUNT
              Account name associated with the job allocation.
2036
2037 SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
2038 The ID of the job allocation.
2039
2040 SLURM_JOB_CPUS_PER_NODE
2041 Count of processors available to the job on this node. Note the
2042 select/linear plugin allocates entire nodes to jobs, so the
2043 value indicates the total count of CPUs on the node. The
2044 select/cons_res plugin allocates individual processors to jobs,
2045 so this number indicates the number of processors on this node
2046 allocated to the job.
2047
2048 SLURM_JOB_DEPENDENCY
2049 Set to value of the --dependency option.
2050
2051 SLURM_JOB_NAME
2052 Name of the job.
2053
2054 SLURM_JOB_NODELIST (and SLURM_NODELIST for backwards compatibility)
2055 List of nodes allocated to the job.
2056
2057 SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
2058 Total number of nodes in the job's resource allocation.
2059
2060 SLURM_JOB_PARTITION
2061 Name of the partition in which the job is running.
2062
2063 SLURM_JOB_QOS
2064 Quality Of Service (QOS) of the job allocation.
2065
2066 SLURM_JOB_RESERVATION
2067 Advanced reservation containing the job allocation, if any.
2068
2069 SLURM_LOCALID
2070 Node local task ID for the process within a job.
2071
2072 SLURM_MEM_PER_CPU
2073 Same as --mem-per-cpu
2074
2075 SLURM_MEM_PER_GPU
2076 Requested memory per allocated GPU. Only set if the
2077 --mem-per-gpu option is specified.
2078
2079 SLURM_MEM_PER_NODE
2080 Same as --mem
2081
2082 SLURM_NODE_ALIASES
2083 Sets of node name, communication address and hostname for nodes
              allocated to the job from the cloud. Each element in the
              set is colon separated and each set is comma separated.
              For example:
2086 SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
2087
2088 SLURM_NODEID
              ID of the node allocated.
2090
2091 SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
2092 Same as -n, --ntasks
2093
2094 SLURM_NTASKS_PER_CORE
2095 Number of tasks requested per core. Only set if the
2096 --ntasks-per-core option is specified.
2097
2098 SLURM_NTASKS_PER_NODE
2099 Number of tasks requested per node. Only set if the
2100 --ntasks-per-node option is specified.
2101
2102 SLURM_NTASKS_PER_SOCKET
2103 Number of tasks requested per socket. Only set if the
2104 --ntasks-per-socket option is specified.
2105
2106 SLURM_PACK_SIZE
2107 Set to count of components in heterogeneous job.
2108
2109 SLURM_PRIO_PROCESS
2110 The scheduling priority (nice value) at the time of job submis‐
2111 sion. This value is propagated to the spawned processes.
2112
2113 SLURM_PROCID
2114 The MPI rank (or relative process ID) of the current process
2115
2116 SLURM_PROFILE
2117 Same as --profile
2118
2119 SLURM_RESTART_COUNT
2120 If the job has been restarted due to system failure or has been
              explicitly requeued, this will be set to the number of times
2122 the job has been restarted.
2123
2124 SLURM_SUBMIT_DIR
2125 The directory from which sbatch was invoked or, if applicable,
2126 the directory specified by the -D, --chdir option.
2127
2128 SLURM_SUBMIT_HOST
2129 The hostname of the computer from which sbatch was invoked.
2130
2131 SLURM_TASKS_PER_NODE
2132 Number of tasks to be initiated on each node. Values are comma
2133 separated and in the same order as SLURM_JOB_NODELIST. If two
2134 or more consecutive nodes are to have the same task count, that
2135 count is followed by "(x#)" where "#" is the repetition count.
              For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that
              the first three nodes will each execute two tasks and the
              fourth node will execute one task.
2139
2140 SLURM_TASK_PID
2141 The process ID of the task being started.
2142
2143 SLURM_TOPOLOGY_ADDR
2144 This is set only if the system has the topology/tree plugin
              configured. The value will be set to the names of the
              network switches which may be involved in the job's
              communications, from the system's top level switch down to
              the leaf switch and ending with the node name. A period is
              used to separate each hardware component name.
2150
2151 SLURM_TOPOLOGY_ADDR_PATTERN
2152 This is set only if the system has the topology/tree plugin
              configured. The value will be set to the component types listed in
2154 SLURM_TOPOLOGY_ADDR. Each component will be identified as
2155 either "switch" or "node". A period is used to separate each
2156 hardware component type.
2157
2158 SLURMD_NODENAME
2159 Name of the node running the job script.
2160
2161
EXAMPLES
       Specify a batch script by filename on the command line. The batch
       script specifies a 1 minute time limit for the job.
2165
2166 $ cat myscript
2167 #!/bin/sh
2168 #SBATCH --time=1
2169 srun hostname |sort
2170
2171 $ sbatch -N4 myscript
       Submitted batch job 65537
2173
2174 $ cat slurm-65537.out
2175 host1
2176 host2
2177 host3
2178 host4
2179
2180
2181 Pass a batch script to sbatch on standard input:
2182
2183 $ sbatch -N4 <<EOF
2184 > #!/bin/sh
2185 > srun hostname |sort
2186 > EOF
2187 sbatch: Submitted batch job 65541
2188
2189 $ cat slurm-65541.out
2190 host1
2191 host2
2192 host3
2193 host4
2194
2195
2196 To create a heterogeneous job with 3 components, each allocating a
2197 unique set of nodes:
2198
2199 sbatch -w node[2-3] : -w node4 : -w node[5-7] work.bash
2200 Submitted batch job 34987
2201
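       As an additional sketch (script, program, and file names are
       placeholders), a job array in which each task selects its own
       input file via SLURM_ARRAY_TASK_ID and writes to a per-task
       output file:

       $ cat array_job.sh
       #!/bin/sh
       #SBATCH --array=0-3
       #SBATCH --output=array_%A_%a.out
       ./process_file input_${SLURM_ARRAY_TASK_ID}.dat

       $ sbatch array_job.sh
       Submitted batch job 65542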
2202
COPYING
       Copyright (C) 2006-2007 The Regents of the University of California.
2205 Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
2206 Copyright (C) 2008-2010 Lawrence Livermore National Security.
2207 Copyright (C) 2010-2017 SchedMD LLC.
2208
2209 This file is part of Slurm, a resource management program. For
2210 details, see <https://slurm.schedmd.com/>.
2211
2212 Slurm is free software; you can redistribute it and/or modify it under
2213 the terms of the GNU General Public License as published by the Free
2214 Software Foundation; either version 2 of the License, or (at your
2215 option) any later version.
2216
2217 Slurm is distributed in the hope that it will be useful, but WITHOUT
2218 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
2219 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
2220 for more details.
2221
2222
SEE ALSO
       sinfo(1), sattach(1), salloc(1), squeue(1), scancel(1), scontrol(1),
       slurm.conf(5), sched_setaffinity(2), numa(3)
2226
2227
2228
2229December 2019 Slurm Commands sbatch(1)