1scontrol(1)                     Slurm Commands                     scontrol(1)
2
3
4

NAME

6       scontrol - Used to view and modify Slurm configuration and state.
7
8

SYNOPSIS

10       scontrol [OPTIONS...] [COMMAND...]
11
12

DESCRIPTION

14       scontrol  is used to view or modify Slurm configuration including: job,
15       job step, node, partition, reservation, and overall  system  configura‐
16       tion.  Most  of  the  commands  can only be executed by user root or an
17       Administrator. If an attempt to view or modify  configuration  informa‐
18       tion  is made by an unauthorized user, an error message will be printed
19       and the requested action will not occur.  If no command is  entered  on
20       the  execute  line,  scontrol  will  operate in an interactive mode and
21       prompt for input. It will continue prompting for  input  and  executing
22       commands  until  explicitly  terminated. If a command is entered on the
23       execute line, scontrol will execute that  command  and  terminate.  All
24       commands  and options are case-insensitive, although node names, parti‐
25       tion names, and reservation names are case-sensitive (node  names  "LX"
26       and  "lx" are distinct). All commands and options can be abbreviated to
27       the extent that the specification is unique. A modified Slurm  configu‐
28       ration  can  be  written to a file using the scontrol write config com‐
29       mand.  The  resulting  file  will  be  named   using   the   convention
30       "slurm.conf.<datetime>" and located in the same directory as the origi‐
31       nal "slurm.conf" file. The directory containing the original slurm.conf
32       must be writable for this to occur.
33
34

OPTIONS

36       -a, --all
37              When  the  show  command  is  used, then display all partitions,
38              their jobs and job steps. This causes information  to  be  dis‐
39              played about partitions that are configured as hidden and parti‐
40              tions that are unavailable to the user's group.
41
42       -d, --details
43              Causes the show command  to  provide  additional  details  where
44              available.
45
46       --federation
47              Report jobs from the federation if a member of one.
48
49       -F, --future
50              Report nodes in FUTURE state.
51
52       -h, --help
53              Print a help message describing the usage of scontrol.
54
55       --hide Do  not  display information about hidden partitions, their jobs
56              and job steps.  Neither partitions that are configured as hidden
57              nor partitions unavailable to the user's group will be displayed
58              (this is the default behavior).
59
60       --local
61              Show only information local to this cluster. Ignore other  clus‐
62              ters  in  the  federation if a member of one. Overrides --federa‐
63              tion.
64
65       -M, --clusters=<string>
66              The cluster to issue commands to. Only one cluster name  may  be
67              specified.  Note that the SlurmDBD must be up for this option to
68              work properly.  This option implicitly sets the --local option.
69
70
71       -o, --oneliner
72              Print information one line per record.
73
74       -Q, --quiet
75              Print no warning or informational  messages,  only  fatal  error
76              messages.
77
78       --sibling
79              Show  all sibling jobs on a federated cluster. Implies --federa‐
80              tion.
81
82       -u, --uid=<uid>
83              Attempt to update a job as user <uid> instead  of  the  invoking
84              user id.
85
86       -v, --verbose
87              Print   detailed  event  logging.  Multiple  -v's  will  further
88              increase the verbosity of logging. By default only  errors  will
89              be displayed.
90
91
92       -V, --version
93              Print version information and exit.
94
95       COMMANDS
96
97
98       abort  Instruct  the Slurm controller to terminate immediately and gen‐
99              erate a core file.  See "man slurmctld"  for  information  about
100              where the core file will be written.
101
102
103       cancel_reboot <NodeList>
104              Cancel pending reboots on nodes.
105
106
107       checkpoint CKPT_OP ID
108              Perform a checkpoint activity on the job step(s) with the speci‐
109              fied identification.  ID can be used to identify a specific  job
110              (e.g. "<job_id>", which applies to all of its existing steps) or
111              a specific job  step  (e.g.  "<job_id>.<step_id>").   Acceptable
112              values for CKPT_OP include:
113
114              able        Test if presently not disabled, report start time if
115                          checkpoint in progress
116
117              create      Create a checkpoint and continue the job or job step
118
119              disable     Disable future checkpoints
120
121              enable      Enable future checkpoints
122
123              error       Report the result for the last  checkpoint  request,
124                          error code and message
125
126              restart     Restart execution of the previously checkpointed job
127                          or job step
128
129              requeue     Create a checkpoint and requeue the batch job,  com‐
130                          bines vacate and restart operations
131
132              vacate      Create  a  checkpoint  and  terminate the job or job
133                          step
134       Acceptable values for additional CKPT_OP options include:
135
136              MaxWait=<seconds>   Maximum time for checkpoint to  be  written.
137                                  Default  value  is  10  seconds.  Valid with
138                                  create and vacate options only.
139
140              ImageDir=<directory_name>
141                                  Location of  checkpoint  file.   Valid  with
142                                  create,  vacate  and  restart  options only.
143                                  This value takes precedence over any --check‐
144                                  point-dir  value specified at job submission
145                                  time.
146
147              StickToNodes        If set, resume the job on the same nodes as
148                                  previously  used.   Valid  with  the restart
149                                  option only.
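              For example, assuming a hypothetical job step 1234.0 and an
              illustrative checkpoint directory, a checkpoint might be
              requested with:

                   scontrol checkpoint create 1234.0 MaxWait=60 ImageDir=/scratch/ckpt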
150
151
152       cluster CLUSTER_NAME
153              The cluster to issue commands to. Only one cluster name  may  be
154              specified.
155
156
157       create SPECIFICATION
158              Create  a  new  partition  or reservation.  See the full list of
159              parameters below.  Include the tag "res" to create a reservation
160              without specifying a reservation name.
161
162
163       completing
164              Display  all  jobs  in  a COMPLETING state along with associated
165              nodes in either a COMPLETING or DOWN state.
166
167
168       delete SPECIFICATION
169              Delete the entry with  the  specified  SPECIFICATION.   The  two
170              SPECIFICATION  choices  are  PartitionName=<name>  and  Reserva‐
171              tion=<name>. Reservations and partitions should have no  associ‐
172              ated jobs at the time of their deletion (modify the jobs first).
173              If the specified partition is in use, the request is denied.
174
175
176       errnumstr ERRNO
177              Given a Slurm error number, return a descriptive string.
178
179
180       fsdampeningfactor FACTOR
181              Set the FairShareDampeningFactor in slurmctld.
182
183
184       help   Display a description of scontrol options and commands.
185
186
187       hold job_list
188              Prevent a pending job from being started (sets its priority  to
189              0).   Use the release command to permit the job to be scheduled.
190              The job_list argument is a comma separated list of  job  IDs  OR
191              "jobname="  with  the job's name, which will attempt to hold all
192              jobs having that name.  Note that when a job is held by a system
193              administrator  using the hold command, only a system administra‐
194              tor may release the job for execution (also see the  uhold  com‐
195              mand).  When  the  job  is  held  by  its  owner, it may also be
196              released by the job's owner.  Additionally, attempting to hold a
197              running job will not suspend or cancel it, but it will set the
198              job priority to 0 and update the job reason field, which would
199              hold the job if it were requeued at a later time.
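              For example, a pending job with the hypothetical ID 1234 could
              be held and later released with:

                   scontrol hold 1234
                   scontrol release 1234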
200
201
202       notify job_id message
203              Send  a  message to standard error of the salloc or srun command
204              or batch job associated with the specified job_id.
205
206
207       pidinfo proc_id
208              Print the Slurm job id and  scheduled  termination  time  corre‐
209              sponding  to  the  supplied  process id, proc_id, on the current
210              node.  This will work only with processes on the node on which
211              scontrol is run, and only for those processes spawned by Slurm and
212              their descendants.
213
214
215       listpids [job_id[.step_id]] [NodeName]
216              Print  a  listing  of  the  process  IDs  in  a  job  step   (if
217              JOBID.STEPID  is provided), or all of the job steps in a job (if
218              job_id is provided), or all of the job steps in all of the  jobs
219              on  the local node (if job_id is not provided or job_id is "*").
220              This will work only with processes on the node on which scontrol
221              is  run, and only for those processes spawned by Slurm and their
222              descendants. Note that some Slurm configurations  (ProctrackType
223              value  of  pgid) are unable to identify all processes associated
224              with a job or job step.
225
226              Note that the NodeName option is only  really  useful  when  you
227              have  multiple  slurmd daemons running on the same host machine.
228              Multiple slurmd daemons on one host are, in general,  only  used
229              by Slurm developers.
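              For example, the process IDs belonging to a hypothetical job
              step 1234.0, or to every step of job 1234, could be listed with:

                   scontrol listpids 1234.0
                   scontrol listpids 1234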
230
231
232       ping   Ping  the  primary  and secondary slurmctld daemon and report if
233              they are responding.
234
235
236       reboot [ASAP] [nextstate=<RESUME|DOWN>] [reason=<reason>] [NodeList]
237              Reboot all nodes in the system when they become idle  using  the
238              RebootProgram  as  configured  in Slurm's slurm.conf file.  Each
239              node will have the "REBOOT" flag added to its node state.  After
240              a  node  reboots  and  the  slurmd  daemon  starts up again, the
241              HealthCheckProgram will run once. Then, the slurmd  daemon  will
242              register  itself with the slurmctld daemon and the "REBOOT" flag
243              will be cleared.  If the node reason  is  "Reboot  ASAP",  Slurm
244              will  clear  the  node's "DRAIN" state flag as well.  The "ASAP"
245              option adds the "DRAIN" flag to each  node's  state,  preventing
246              additional  jobs  from running on the node so it can be rebooted
247              and returned to service  "As  Soon  As  Possible"  (i.e.  ASAP).
248              "ASAP"  will  also  set  the node reason to "Reboot ASAP" if the
249              "reason" option isn't specified.  If the "nextstate"  option  is
250              specified  as  "DOWN", then the node will remain in a down state
251              after rebooting. If "nextstate" is specified as  "RESUME",  then
252              the  nodes will resume as normal when the node registers and the
253              node reason will be cleared.  Resuming nodes will be  considered
254              as available in backfill future scheduling and won't be replaced
255              by idle nodes in a reservation.  When using the "nextstate"  and
256              "reason"  options  together  the  reason  will  be appended with
257              "reboot issued" when the reboot is issued and "reboot  complete"
258              when the node registers with a "nextstate" of "DOWN".  The "rea‐
259              son" option sets each node's reason to a  user-defined  message.
260              An optional list of nodes to reboot may be specified. By default
261              all nodes are rebooted.  NOTE: By default, this command does not
262              prevent additional jobs from being scheduled on any nodes before
263              reboot.  To do this, you can either use  the  "ASAP"  option  or
264              explicitly drain the nodes beforehand.  You can alternately cre‐
265              ate an advanced reservation  to  prevent  additional  jobs  from
266              being initiated on nodes to be rebooted.  Pending reboots can be
267              cancelled by using "scontrol cancel_reboot  <node>"  or  setting
268              the node state to "CANCEL_REBOOT".  A node will be marked "DOWN"
269              if it doesn't reboot within ResumeTimeout.
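              For example, the hypothetical nodes tux[1-4] could be drained,
              rebooted as soon as they become idle, and returned to service
              when they register again with:

                   scontrol reboot ASAP nextstate=RESUME reason="maintenance" tux[1-4]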
270
271
272       reconfigure
273              Instruct all Slurm daemons to re-read  the  configuration  file.
274              This command does not restart the daemons.  This mechanism would
275              be used to  modify  configuration  parameters  (Epilog,  Prolog,
276              SlurmctldLogFile,  SlurmdLogFile,  etc.).   The Slurm controller
277              (slurmctld) forwards the request to all other daemons (slurmd
278              daemon on each compute node).  Running jobs continue execution.
279              Most configuration parameters can be  changed  by  just  running
280              this  command;  however,  Slurm  daemons  should be shut down and
281              restarted  if  any  of  these  parameters  are  to  be  changed:
282              AuthType,  BackupAddr,  BackupController,  ControlAddr, Control‐
283              Mach, PluginDir, StateSaveLocation, SlurmctldPort or SlurmdPort.
284              The slurmctld daemon and all slurmd daemons must be restarted if
285              nodes are added to or removed from the cluster.
286
287
288       release job_list
289              Release a previously held job to begin execution.  The  job_list
290              argument is a comma separated list of job IDs OR "jobname=" with
291              the job's name, which will attempt to release all jobs having that
292              name.  Also see hold.
293
294
295       requeue job_list
296              Requeue  a  running,  suspended or finished Slurm batch job into
297              pending state.  The job_list argument is a comma separated  list
298              of job IDs.
299
300
301       requeuehold job_list
302              Requeue  a  running,  suspended or finished Slurm batch job into
303              pending state, moreover the job is put in held  state  (priority
304              zero).   The  job_list argument is a comma separated list of job
305              IDs.  A held job can be released using  scontrol  to  reset  its
306              priority   (e.g.   "scontrol  release  <job_id>").  The  command
307              accepts the following option:
308
309              State=SpecialExit
310                     The "SpecialExit" keyword specifies that the job  has  to
311                     be  put  in a special state JOB_SPECIAL_EXIT.  The "scon‐
312                     trol show job" command will display the JobState as  SPE‐
313                     CIAL_EXIT, while the "squeue" command as SE.
314
315
316       resume job_list
317              Resume  a  previously suspended job.  The job_list argument is a
318              comma separated list of job IDs.  Also see suspend.
319
320              NOTE: A suspended job releases its CPUs for allocation to  other
321              jobs.   Resuming a previously suspended job may result in multi‐
322              ple jobs being allocated the same CPUs, which could trigger gang
323              scheduling  with  some  configurations  or severe degradation in
324              performance with other configurations.  Use of the scancel  com‐
325              mand  to send SIGSTOP and SIGCONT signals would stop a job with‐
326              out releasing its CPUs for allocation to other jobs and would be
327              a  preferable  mechanism  in  many  cases.  If performing system
328              maintenance you may want to use suspend/resume in the  following
329              way. Before suspending set all nodes to draining or set all par‐
330              titions to down so that no new jobs can be scheduled. Then  sus‐
331              pend  jobs.  Once  maintenance  is  done resume jobs then resume
332              nodes and/or set all partitions back to up.  Use with caution.
333
334
335       schedloglevel LEVEL
336              Enable or disable scheduler logging.  LEVEL  may  be  "0",  "1",
337              "disable" or "enable". "0" has the same effect as "disable". "1"
338              has the same effect as "enable".  This value  is  temporary  and
339              will   be  overwritten  when  the  slurmctld  daemon  reads  the
340              slurm.conf configuration file (e.g. when the daemon is restarted
341              or  scontrol  reconfigure is executed) if the SlurmSchedLogLevel
342              parameter is present.
343
344
345       setdebug LEVEL
346              Change the debug level of the slurmctld daemon.  LEVEL may be an
347              integer  value  between  zero and nine (using the same values as
348              SlurmctldDebug in the slurm.conf file) or the name of  the  most
349              detailed  message type to be printed: "quiet", "fatal", "error",
350              "info", "verbose", "debug",  "debug2",  "debug3",  "debug4",  or
351              "debug5".  This value is temporary and will be overwritten when‐
352              ever the slurmctld daemon  reads  the  slurm.conf  configuration
353              file  (e.g. when the daemon is restarted or scontrol reconfigure
354              is executed).
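              For example, slurmctld logging could be raised to the "debug2"
              level while troubleshooting and then returned to "info" with:

                   scontrol setdebug debug2
                   scontrol setdebug info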
355
356
357       setdebugflags [+|-]FLAG
358              Add or remove DebugFlags of  the  slurmctld  daemon.   See  "man
359              slurm.conf"  for a list of supported DebugFlags.  NOTE: Changing
360              the value  of  some  DebugFlags  will  have  no  effect  without
361              restarting  the  slurmctld  daemon,  which  would set DebugFlags
362              based upon the contents of the slurm.conf configuration file.
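              For example, backfill scheduler debugging could be switched on
              and later off again (the Backfill flag name is taken from the
              DebugFlags list in slurm.conf) with:

                   scontrol setdebugflags +Backfill
                   scontrol setdebugflags -Backfill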
363
364
365       show ENTITY ID
366              or
367
368       show ENTITY=ID
369              Display the state of the specified  entity  with  the  specified
370              identification.   ENTITY  may  be  aliases,  assoc_mgr,  bbstat,
371              burstbuffer, config, daemons, dwstat, federation, frontend, job,
372              node,  partition, powercap, reservation, slurmd, step, topology,
373              hostlist, hostlistsorted or hostnames.  ID can be used to identify
374              a  specific element of the identified entity: job ID, node name,
375              partition name, reservation name, or job step ID for job,  node,
376              partition,  or  step  respectively.   For an ENTITY of bbstat or
377              dwstat (they are equivalent) optional arguments are the  options
378              of  the  local status command.  The status commands will be exe‐
379              cuted by the slurmctld daemon and its response returned  to  the
380              user.  For an ENTITY of topology, the ID may be a node or switch
381              name.  If one node name is specified, all switches connected  to
382              that  node  (and  their parent switches) will be shown.  If more
383              than one node name is specified, only switches that  connect  to
384              all named nodes will be shown.  aliases will return all NodeName
385              values associated to a given NodeHostname  (useful  to  get  the
386              list  of virtual nodes associated with a real node in a configu‐
387              ration where multiple slurmd daemons execute on a single compute
388              node).   assoc_mgr  displays  the current contents of the slurm‐
389              ctld's internal cache for users, associations and/or qos. The ID
390              may                be               users=<user1>,[...,<userN>],
391              accounts=<acct1>,[...,<acctN>],  qos=<qos1>,[...,<qosN>]  and/or
392              flags=<users,assoc,qos>,  used  to filter the desired section to
393              be displayed. If no flags are specified, all sections  are  dis‐
394              played.   burstbuffer  displays the current status of the Burst‐
395              Buffer plugin.  config displays parameter names from the config‐
396              uration files in mixed case (e.g. SlurmdPort=7003) while derived
397              parameter names are in upper case  only  (e.g.  SLURM_VERSION).
398              hostnames  takes  an  optional  hostlist expression as input and
399              writes a list of individual host names to standard  output  (one
400              per  line).  If no hostlist expression is supplied, the contents
401              of the SLURM_JOB_NODELIST  environment  variable  are used.  For
402              example  "tux[1-3]"  is  mapped to "tux1","tux2" and "tux3" (one
403              hostname per line).  hostlist takes a list  of  host  names  and
404              prints  the  hostlist  expression for them (the inverse of host‐
405              names).  hostlist can also take the absolute pathname of a  file
406              (beginning  with  the  character '/') containing a list of host‐
407              names.  Multiple node names may be specified using  simple  node
408              range  expressions  (e.g. "lx[10-20]"). All other ID values must
409              identify a single element. The  job  step  ID  is  of  the  form
410              "job_id.step_id",  (e.g.  "1234.1").  slurmd reports the current
411              status of the slurmd daemon executing  on  the  same  node  from
412              which  the scontrol command is executed (the local host). It can
413              be useful to diagnose problems.  By default  hostlist  does  not
414              sort  the  node  list  or  make it unique (e.g. tux2,tux1,tux2 =
415              tux[2,1-2]).  If you wanted a  sorted  list  use  hostlistsorted
416              (e.g. tux2,tux1,tux2 = tux[1-2,2]).  By default, all elements of
417              the entity type specified are printed.  For an ENTITY of job, if
418              the  job  does  not specify socket-per-node, cores-per-socket or
419              threads-per-core then it  will  display  '*'  in  ReqS:C:T=*:*:*
420              field. For an ENTITY of federation, the federation name that the
421              controller is part of and the sibling clusters part of the  fed‐
422              eration will be listed.
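              Some illustrative invocations (the job ID and node names below
              are hypothetical):

                   scontrol show job 1234
                   scontrol -d show job 1234
                   scontrol show node tux3
                   scontrol show hostnames tux[1-3]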
423
424
425       shutdown OPTION
426              Instruct  Slurm daemons to save current state and terminate.  By
427              default, the Slurm controller (slurmctld) forwards the request to
428              all  other  daemons  (slurmd  daemon  on each compute node).  An
429              OPTION of slurmctld or controller results in only the  slurmctld
430              daemon being shutdown and the slurmd daemons remaining active.
431
432
433       suspend job_list
434              Suspend  a  running job.  The job_list argument is a comma sepa‐
435              rated list of job IDs.  Use the resume  command  to  resume  its
436              execution.   User processes must stop on receipt of SIGSTOP sig‐
437              nal and resume upon receipt of SIGCONT for this operation to  be
438              effective.  Not all architectures and configurations support job
439              suspension.  If a suspended job is requeued, it will  be  placed
440              in  a  held  state.   The time a job is suspended will not count
441              against a job's time limit.  Only  an  operator,  administrator,
442              SlurmUser, or root can suspend jobs.
443
444
445       takeover
446              Instruct Slurm's backup controller (slurmctld) to take over sys‐
447              tem control.  Slurm's backup controller  requests  control  from
448              the  primary  and  waits  for  its  termination.  After that, it
449              switches from backup mode to controller mode.  If the primary con‐
450              troller cannot be contacted, it directly switches to controller
451              mode. This  can  be  used  to  speed  up  the  Slurm  controller
452              fail-over  mechanism when the primary node is down.  This can be
453              used to minimize disruption if the computer executing  the  pri‐
454              mary Slurm controller is scheduled down.  (Note: Slurm's primary
455              controller will take control back at startup.)
456
457
458       top job_list
459              Move the specified job IDs to the  top  of  the  queue  of  jobs
460              belonging to the identical user ID, partition name, account, and
461              QOS.  The job_list argument is a comma separated ordered list of
462              job  IDs.   Any job not matching all of those fields will not be
463              affected.  Only jobs submitted to a  single  partition  will  be
464              affected.  This operation changes the order of jobs by adjusting
465              job nice values.  The net effect on that user's throughput  will
466              be  negligible to slightly negative.  This operation is disabled
467              by default for non-privileged (non-operator,  admin,  SlurmUser,
468              or root) users. This operation may be enabled for non-privileged
469              users by  the  system  administrator  by  including  the  option
470              "enable_user_top"   in   the  SchedulerParameters  configuration
471              parameter.
472
473
474       uhold job_list
475              Prevent a pending job from being started (sets its priority  to
476              0).   The job_list argument is a space separated list of job IDs
477              or job names.  Use the release command to permit the job  to  be
478              scheduled.   This command is designed for a system administrator
479              to hold a job so that the job owner may release it  rather  than
480              requiring  the  intervention of a system administrator (also see
481              the hold command).
482
483
484       update SPECIFICATION
485              Update job, step, node, partition, powercapping  or  reservation
486              configuration  per  the supplied specification. SPECIFICATION is
487              in the same format as the Slurm configuration file and the  out‐
488              put  of the show command described above. It may be desirable to
489              execute the show  command  (described  above)  on  the  specific
490              entity you want to update, then use cut-and-paste tools to enter
491              updated configuration values into the update command. Note that while most
492              configuration  values can be changed using this command, not all
493              can be changed using this mechanism. In particular, the hardware
494              configuration  of  a node or the physical addition or removal of
495              nodes from the cluster may only be accomplished through  editing
496              the  Slurm configuration file and executing the reconfigure com‐
497              mand (described above).
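              For example, assuming a hypothetical pending job 1234 and node
              tux4, the job could be moved to another partition and the node
              drained with a reason string as follows:

                   scontrol update JobId=1234 Partition=debug
                   scontrol update NodeName=tux4 State=DRAIN Reason="bad disk"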
498
499
500       version
501              Display the version number of scontrol being executed.
502
503
504       wait_job job_id
505              Wait until a job and all of its nodes are ready for use  or  the
506              job  has entered some termination state. This option is particu‐
507              larly useful in the Slurm Prolog or in the batch  script  itself
508              if nodes are powered down and restarted automatically as needed.
509
510
511       write batch_script job_id optional_filename
512              Write  the  batch script for a given job_id to a file or to std‐
513              out. The file will default to slurm-<job_id>.sh if the  optional
514              filename  argument  is  not given. The script will be written to
515              stdout if - is given instead of a filename.   The  batch  script
516              can  only  be retrieved by an admin or operator, or by the owner
517              of the job.
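              For example, the script of a hypothetical batch job 1234 could
              be saved to the default file name or printed to stdout with:

                   scontrol write batch_script 1234
                   scontrol write batch_script 1234 -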
518
519
520       write config
521              Write the current configuration to a file with the  naming  con‐
522              vention  of "slurm.conf.<datetime>" in the same directory as the
523              original slurm.conf file.
524
525
526       INTERACTIVE COMMANDS
527              NOTE: All commands listed below can be used in  the  interactive
528              mode, but NOT on the initial command line.
529
530
531       all    Show  all  partitions,  their  jobs  and job steps. This causes
532              information to be displayed about partitions that are configured
533              as hidden and partitions that are unavailable to the user's group.
534
535
536       details
537              Causes  the  show  command  to  provide additional details where
538              available.  Job information will include CPUs  and  NUMA  memory
539              allocated  on  each  node.   Note  that on computers with hyper‐
540              threading enabled and Slurm configured to allocate  cores,  each
541              listed  CPU  represents  one physical core.  Each hyperthread on
542              that core can be allocated a separate task, so a job's CPU count
543              and  task  count  may differ.  See the --cpu-bind and --mem-bind
544              option descriptions in srun man pages for more information.  The
545              details option is currently only supported for the show job com‐
546              mand.
547
548
549       exit   Terminate scontrol interactive session.
550
551
552       hide   Do not display partition, job or job step information for  par‐
553              titions  that  are  configured  as hidden or partitions that are
554              unavailable to the user's group.  This is the default behavior.
555
556
557       oneliner
558              Print information one line per record.
559
560
561       quiet  Print no warning or informational  messages,  only  fatal  error
562              messages.
563
564
565       quit   Terminate the execution of scontrol.
566
567
568       verbose
569              Print detailed event logging.  This includes time-stamps on data
570              structures, record counts, etc.
571
572
573       !!     Repeat the last command executed.
574
575
576       SPECIFICATIONS FOR UPDATE COMMAND, JOBS
577
578       Note that update requests done by either root, SlurmUser or Administra‐
579       tors  are  not  subject  to  certain  restrictions. For instance, if an
580       Administrator changes the QOS on a pending job, certain limits such  as
581       the  TimeLimit will not be changed automatically as changes made by the
582       Administrators are allowed to violate these restrictions.
583
584
585       Account=<account>
586              Account name to be changed for this job's resource  use.   Value
587              may be cleared with blank data value, "Account=".
588
589       AdminComment=<spec>
590              Arbitrary  descriptive string. Can only be set by a Slurm admin‐
591              istrator.
592
593       ArrayTaskThrottle=<count>
594              Specify the maximum number of tasks in a job array that can exe‐
595              cute at the same time.  Set the count to zero in order to elimi‐
596              nate any limit.  The task throttle count  for  a  job  array  is
597              reported  as part of its ArrayTaskId field, preceded with a per‐
598              cent sign.  For example "ArrayTaskId=1-10%2" indicates the maxi‐
599              mum number of running tasks is limited to 2.
600
601       BurstBuffer=<spec>
602              Burst buffer specification to be changed for this job's resource
603              use.  Value may  be  cleared  with  blank  data  value,  "Burst‐
604              Buffer=".  Format is burst buffer plugin specific.
605
606       Clusters=<spec>
607              Specifies the clusters that the federated job can run on.
608
609       ClusterFeatures=<spec>
610              Specifies  features that a federated cluster must have to have a
611              sibling job submitted to it. Slurm will attempt to submit a sib‐
612              ling  job  to  a cluster if it has at least one of the specified
613              features.
614
615       Comment=<spec>
616              Arbitrary descriptive string.
617
618       Contiguous=<yes|no>
619              Set the job's requirement for contiguous (consecutive) nodes  to
620              be  allocated.   Possible  values  are "YES" and "NO".  Only the
621              Slurm administrator or root can change this parameter.
622
623       CoreSpec=<count>
624              Number of cores to reserve per node for  system  use.   The  job
625              will  be  charged  for  these  cores, but be unable to use them.
626              Will be reported as "*" if not constrained.
627
628       CPUsPerTask=<count>
629              Change the CPUsPerTask job's value.
630
631       Deadline=<time_spec>
632              It accepts times of the form HH:MM:SS to specify a deadline to a
633              job  at  a specific time of day (seconds are optional).  You may
634              also specify midnight, noon, fika (3 PM) or teatime (4  PM)  and
635              you can have a time-of-day suffixed with AM or PM for a deadline
636              in the morning or the evening.  You can specify a  deadline  for
637              the  job with a date of the form MMDDYY or MM/DD/YY or MM.DD.YY,
638              or a date and time as  YYYY-MM-DD[THH:MM[:SS]].   You  can  also
639              give times like now + count time-units, where the time-units can
640              be minutes, hours, days, or weeks and you can tell Slurm to  put
641              a  deadline  for tomorrow with the keyword tomorrow.  The speci‐
642              fied deadline must be later than the current time.  Only pending
643              jobs  can have the deadline updated.  Only the Slurm administra‐
644              tor or root can change this parameter.
645
646       DelayBoot=<time_spec>
647              Change the time to decide whether to reboot nodes  in  order  to
648              satisfy job's feature specification if the job has been eligible
649              to run for less than this time  period.  See  salloc/sbatch  man
650              pages option --delay-boot.
651
652       Dependency=<dependency_list>
653              Defer job's initiation until specified job dependency specifica‐
654              tion is satisfied.   Cancel  dependency  with  an  empty  depen‐
655              dency_list  (e.g.  "Dependency=").   <dependency_list> is of the
656              form <type:job_id[:job_id][,type:job_id[:job_id]]>.   Many  jobs
657              can  share the same dependency and these jobs may even belong to
658              different  users.
659
660              after:job_id[:jobid...]
661                     This job can begin execution  after  the  specified  jobs
662                     have begun execution.
663
664              afterany:job_id[:jobid...]
665                     This  job  can  begin  execution after the specified jobs
666                     have terminated.
667
668              afternotok:job_id[:jobid...]
669                     This job can begin execution  after  the  specified  jobs
670                     have terminated in some failed state (non-zero exit code,
671                     node failure, timed out, etc).
672
673              afterok:job_id[:jobid...]
674                     This job can begin execution  after  the  specified  jobs
675                     have  successfully  executed  (ran  to completion with an
676                     exit code of zero).
677
678              singleton
679                     This  job  can  begin  execution  after  any   previously
680                     launched  jobs  sharing  the  same job name and user have
681                     terminated.  In other words, only one job  by  that  name
682                     and owned by that user can be running or suspended at any
683                     point in time.
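              For example, a hypothetical pending job 1235 could be made to
              wait for the successful completion of job 1234, or have its
              dependencies cleared, with:

                   scontrol update JobId=1235 Dependency=afterok:1234
                   scontrol update JobId=1235 Dependency=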
684
685       EligibleTime=<time_spec>
686              See StartTime.
687
688       EndTime
689              The time the job is expected to terminate  based  on  the  job's
690              time  limit.   When  the  job  ends  sooner,  this field will be
691              updated with the actual end time.
692
693       ExcNodeList=<nodes>
694              Set the job's list of excluded nodes. Multiple node names may  be
695              specified    using   simple   node   range   expressions   (e.g.
696              "lx[10-20]").  Value may be cleared with blank data value, "Exc‐
697              NodeList=".
698
699       Features=<features>
700              Set  the job's required node features.  The list of features may
701              include multiple feature  names  separated  by  ampersand  (AND)
702              and/or   vertical   bar   (OR)  operators.   For  example:  Fea‐
703              tures="opteron&video" or Features="fast|faster".  In  the  first
704              example,  only  nodes  having both the feature "opteron" AND the
705              feature "video" will be used.  There is no mechanism to  specify
706              that  you  want one node with feature "opteron" and another node
707              with feature "video" in case no node has both features.  If only
708              one  of  a  set of possible options should be used for all allo‐
709              cated nodes, then use the OR operator and  enclose  the  options
710              within      square     brackets.      For     example:     "Fea‐
711              tures=[rack1|rack2|rack3|rack4]" might be used to  specify  that
712              all nodes must be allocated on a single rack of the cluster, but
713              any of those four racks can be used.  A request can also specify
714              the  number  of  nodes  needed with some feature by appending an
715              asterisk and count after the feature name.   For  example  "Fea‐
716              tures=graphics*4"  indicates  that at least four allocated nodes
717              must have the feature "graphics."   Parentheses  are  also  sup‐
718              ported  for  features  to  be ANDed together.  For example "Fea‐
719              tures=[(knl&a2a&flat)*4&haswell*2]" indicates the resource allo‐
720              cation  should  include  4 nodes with ALL of the features "knl",
721              "a2a", and "flat" plus 2 nodes with the feature "haswell".  Con‐
722              straints  with  node counts may only be combined with AND opera‐
723              tors.  Value may be cleared with blank data value,  for  example
724              "Features=".
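              For example, a hypothetical pending job 1234 could be restricted
              to nodes providing both of two illustrative features with:

                   scontrol update JobId=1234 Features="opteron&video"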
725
726
727       Gres=<list>
728              Specifies   a   comma   delimited  list  of  generic  consumable
729              resources.   The  format  of  each  entry   on   the   list   is
730              "name[:count[*cpu]]".   The  name  is  that  of  the  consumable
731              resource.  The count is the number of  those  resources  with  a
732              default  value  of 1.  The specified resources will be allocated
733              to the job on each node allocated unless "*cpu" is appended,  in
734              which  case  the resources will be allocated on a per cpu basis.
735              The available generic consumable resources  are  configurable  by
736              the  system  administrator.  A list of available generic consum‐
737              able resources will be printed and the command will exit if  the
738              option   argument   is   "help".    Examples   of   use  include
739              "Gres=gpus:2*cpu,disk=40G" and "Gres=help".
740
741
742       JobId=<job_list>
743              Identify the job(s) to be updated.  The job_list may be a  comma
744              separated list of job IDs.  Either JobId or JobName is required.
745
746       Licenses=<name>
747              Specification  of  licenses (or other resources available on all
748              nodes of the cluster) as  described  in  salloc/sbatch/srun  man
749              pages.
750
751       MinCPUsNode=<count>
752              Set  the  job's minimum number of CPUs per node to the specified
753              value.
754
755       MinMemoryCPU=<megabytes>
756              Set the job's minimum real memory required per allocated CPU  to
757              the specified value. Either MinMemoryCPU or MinMemoryNode may be
758              set, but not both.
759
760       MinMemoryNode=<megabytes>
761              Set the job's minimum real memory required per node to the spec‐
762              ified  value.   Either MinMemoryCPU or MinMemoryNode may be set,
763              but not both.
764
765       MinTmpDiskNode=<megabytes>
766              Set the job's minimum temporary disk space required per node  to
767              the  specified  value.  Only the Slurm administrator or root can
768              change this parameter.
769
770       TimeMin=<timespec>
771              Change the TimeMin value, which specifies the minimum time limit
772              (in minutes) of the job.
773
774       JobName=<name>
775              Identify  the  name of jobs to be modified or set the job's name
776              to the specified value.  When used to identify jobs to be  modi‐
777              fied,  all  jobs  belonging to all users are modified unless the
778              UserID option is used to identify a specific user.  Either JobId
779              or JobName is required.
780
781       Name[=<name>]
782              See JobName.
783
784       Nice[=<adjustment>]
785              Update  the  job  with  an  adjusted  scheduling priority within
786              Slurm. With no  adjustment  value  the  scheduling  priority  is
787              decreased  by 100. A negative nice value increases the priority,
788              otherwise decreases it. The adjustment range is +/-  2147483645.
789              Only privileged users can specify a negative adjustment.
790
791       NodeList=<nodes>
792              Change the nodes allocated to a running job to shrink its size.
793              The specified list of nodes must be a subset of the  nodes  cur‐
794              rently  allocated  to the job. Multiple node names may be speci‐
795              fied using simple node  range  expressions  (e.g.  "lx[10-20]").
796              After  a  job's  allocation is reduced, subsequent srun commands
797              must explicitly specify node and task counts which are valid for
798              the new allocation.
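              For example, a running job with the hypothetical ID 1234 could
              be shrunk to a subset of its current allocation (here assumed to
              include lx[10-15]) with:

                   scontrol update JobId=1234 NodeList=lx[10-15]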
799
800       NumCPUs=<min_count>[-<max_count>]
801              Set the job's minimum and optionally maximum count of CPUs to be
802              allocated.
803
804       NumNodes=<min_count>[-<max_count>]
805              Set the job's minimum and optionally maximum count of  nodes  to
806              be  allocated.  If the job is already running, use this to spec‐
807              ify a node count less than  currently  allocated  and  resources
808              previously  allocated  to  the job will be relinquished. After a
809              job's allocation  is  reduced,  subsequent  srun  commands  must
810              explicitly  specify node and task counts which are valid for the
811              new allocation. Also see the NodeList parameter above.  This  is
812              the same as ReqNodes.
813
814       NumTasks=<count>
815              Set  the  job's  count of required tasks to the specified value.
816              This is the same as ReqProcs.
817
818       OverSubscribe=<yes|no>
819              Set the job's ability to share compute resources (i.e.  individ‐
820              ual  CPUs)  with other jobs. Possible values are "YES" and "NO".
821              This option can only be changed for pending jobs.
822
823       Partition=<name>
824              Set the job's partition to the specified value.
825
826       Priority=<number>
827              Set the job's priority to the specified value.  Note that a  job
828              priority of zero prevents the job from ever being scheduled.  By
829              setting a job's priority to zero it is held.  Set  the  priority
830              to  a  non-zero value to permit it to run.  Explicitly setting a
831              job's priority clears any previously set nice value and  removes
832              the priority/multifactor plugin's ability to manage a job's pri‐
833              ority.  In order to restore  the  priority/multifactor  plugin's
834              ability  to  manage  a job's priority, hold and then release the
835              job.  Only the Slurm administrator or root  can  increase  job's
836              priority.
837
838       QOS=<name>
839              Set  the  job's QOS (Quality Of Service) to the specified value.
840              Value may be cleared with blank data value, "QOS=".
841
842       Reboot=<yes|no>
843              Set the job's flag that specifies whether to force the allocated
844              nodes  to reboot before starting the job. This is only supported
845              with some  system  configurations  and  therefore  it  could  be
846              silently ignored.
847
848       ReqCores=<count>
849              Change the job's requested Cores count.
850
851       ReqNodeList=<nodes>
852              Set  the job's list of required nodes. Multiple node names may be
853              specified   using   simple   node   range   expressions    (e.g.
854              "lx[10-20]").   Value  may  be  cleared  with  blank data value,
855              "ReqNodeList=".
856
857       ReqNodes=<min_count>[-<max_count>]
858              See NumNodes.
859
860       ReqProcs=<count>
861              See NumTasks.
862
863       ReqSockets=<count>
864              Change the job's requested socket count.
865
866       ReqThreads=<count>
867              Change the job's requested threads count.
868
869       Requeue=<0|1>
870              Stipulates whether a job should be requeued after a  node  fail‐
871              ure: 0 for no, 1 for yes.
872
873       ReservationName=<name>
874              Set  the job's reservation to the specified value.  Value may be
875              cleared with blank data value, "ReservationName=".
876
877       ResetAccrueTime
878              Reset the job's accrue time value to 0, meaning it will lose any
879              time  previously  accrued  for  priority.  Helpful if you have a
880              large backlog of jobs already in the queue and want to start  lim‐
881              iting  how  many  jobs  can  accrue time without waiting for the
882              queue to flush out.
883
884       StdOut=<filepath>
885              Set the batch job's stdout file path.
886
887       Shared=<yes|no>
888              See OverSubscribe option above.
889
890       StartTime=<time_spec>
891              Set the job's earliest initiation time.  It accepts times of the
892              form  HH:MM:SS  to  run a job at a specific time of day (seconds
893              are optional).  (If that time is already past, the next  day  is
894              assumed.)   You  may also specify midnight, noon, fika (3 PM) or
895              teatime (4 PM) and you can have a time-of-day suffixed  with  AM
896              or  PM  for running in the morning or the evening.  You can also
897              say what day the job will be run, by specifying a  date  of  the
898              form  MMDDYY  or  MM/DD/YY  or  MM.DD.YY,  or a date and time as
899              YYYY-MM-DD[THH:MM[:SS]].  You can also give  times  like  now  +
900              count  time-units,  where  the time-units can be minutes, hours,
901              days, or weeks and you can tell Slurm to run the job today  with
902              the  keyword  today and to run the job tomorrow with the keyword
903              tomorrow.
904
905              Notes on date/time specifications:
906               - although the 'seconds' field of the HH:MM:SS time  specifica‐
907              tion  is  allowed  by  the  code, note that the poll time of the
908              Slurm scheduler is not precise enough to guarantee  dispatch  of
909              the  job on the exact second.  The job will be eligible to start
910              on the next poll following the specified time.  The  exact  poll
911              interval  depends  on the Slurm scheduler (e.g., 60 seconds with
912              the default sched/builtin).
913               -  if  no  time  (HH:MM:SS)  is  specified,  the   default   is
914              (00:00:00).
915               -  if a date is specified without a year (e.g., MM/DD) then the
916              current year is assumed, unless the  combination  of  MM/DD  and
917              HH:MM:SS  has  already  passed  for that year, in which case the
918              next year is used.
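              For example, a hypothetical pending job 1234 could be deferred
              until 6 PM today, or until two hours from now, with:

                   scontrol update JobId=1234 StartTime=18:00:00
                   scontrol update JobId=1234 StartTime=now+2hours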
919
920       Switches=<count>[@<max-time-to-wait>]
921              When a tree topology is used, this defines the maximum count  of
922              switches desired for the job allocation. If Slurm finds an allo‐
923              cation containing more switches than the  count  specified,  the
924              job  remains  pending  until  it  either finds an allocation with
925              the desired switch count or the time limit expires. By default there
926              is  no switch count limit and no time limit delay. Set the count
927              to zero in order to clear any previously  set  count  (disabling
928              the  limit).  The job's maximum time delay may be limited by the
929              system administrator using the SchedulerParameters configuration
930              parameter  with  the max_switch_wait parameter option.  Also see
931              wait-for-switch.
932
933
934       wait-for-switch=<seconds>
935              Change the maximum time to wait for a switch to <seconds> seconds.
936
937
938       TasksPerNode=<count>
939              Change the job's requested TasksPerNode.
940
941
942       ThreadSpec=<count>
943              Number of threads to reserve per node for system use.   The  job
944              will  be  charged  for these threads, but be unable to use them.
945              Will be reported as "*" if not constrained.
946
947
948       TimeLimit=<time>
949              The  job's  time  limit.   Output  format  is  [days-]hours:min‐
950              utes:seconds  or "UNLIMITED".  Input format (for update command)
951              set   is   minutes,   minutes:seconds,    hours:minutes:seconds,
952              days-hours,  days-hours:minutes  or  days-hours:minutes:seconds.
953              Time resolution is one minute and second values are  rounded  up
954              to the next minute.  If changing the time limit of a job, either
955              specify a new time limit value or precede  the  time  and  equal
956              sign  with  a  "+"  or "-" to increment or decrement the current
957              time limit (e.g. "TimeLimit+=30").  In  order  to  increment  or
958              decrement  the  current time limit, the JobId specification must
959              precede the TimeLimit specification.  Only the Slurm administra‐
960              tor or root can increase job's TimeLimit.
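              For example, the time limit of a hypothetical job 1234 could be
              set to two hours, or extended by 30 minutes, with:

                   scontrol update JobId=1234 TimeLimit=2:00:00
                   scontrol update JobId=1234 TimeLimit+=30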
961
962
963       UserID=<UID or name>
964              Used  with  the  JobName option to identify jobs to be modified.
965              Either a user name or a numeric ID (UID) may be specified.
966
967
968       WCKey=<key>
969              Set the job's workload characterization  key  to  the  specified
970              value.
971
972
973       SPECIFICATIONS FOR SHOW COMMAND, JOBS
974
975       The  "show"  command,  when used with the "job" or "job <jobid>" entity
976       displays detailed information about a job or jobs.  Much of this infor‐
977       mation  may  be  modified  using  the "update job" command as described
978       above.  However, the following fields displayed by the show job command
979       are read-only and cannot be modified:
980
981
982       AllocNode:Sid
983              Local node and system id making the resource allocation.
984
985       BatchFlag
986              Jobs submitted using the sbatch command have BatchFlag set to 1.
987              Jobs submitted using other commands have BatchFlag set to 0.
988
989       ExitCode=<exit>:<sig>
990              Exit status reported for the job by the  wait()  function.   The
991              first  number  is  the exit code, typically as set by the exit()
992              function.  The second number is the signal  that  caused  the
993              process to terminate if it was terminated by a signal.
994
995       GroupId
996              The group under which the job was submitted.
997
998       JobState
999              The current state of the job.
1000
1001       NodeListIndices
1002              The  NodeIndices expose the internal indices into the node table
1003              associated with the node(s) allocated to the job.
1004
1005       NtasksPerN:B:S:C=
1006              <tasks_per_node>:<tasks_per_base‐
1007              board>:<tasks_per_socket>:<tasks_per_core> Specifies the number
1008              of tasks to be started per hardware component (node, baseboard,
1009              socket and core).  Unconstrained values may be shown as "0" or
1010              "*".
1011
1012       PreemptTime
1013              Time at which job was signaled that it was selected for  preemp‐
1014              tion.  (Meaningful only for PreemptMode=CANCEL and the partition
1015              or QOS with which the job is associated has  a  GraceTime  value
1016              designated.)
1017
1018       PreSusTime
1019              Time the job ran prior to last suspend.
1020
1021       Reason The reason the job is not running: e.g., waiting for "Resources".
1022
1023       ReqB:S:C:T=
1024              <baseboard_count>:<socket_per_base‐
1025              board_count>:<core_per_socket_count>:<thread_per_core_count>
1026              Specifies the count of various hardware components requested by
1027              the job.  Unconstrained values may be shown as "0" or "*".
1028
1029       SecsPreSuspend=<seconds>
1030              If the job is suspended, this is the run time accumulated by the
1031              job (in seconds) prior to being suspended.
1032
1033       Socks/Node=<count>
1034              Count of desired sockets per node
1035
1036       SubmitTime
1037              The  time  and  date stamp (in localtime) the job was submitted.
1038              The format of the output is identical to  that  of  the  EndTime
1039              field.
1040
1041              NOTE: If a job is requeued, the submit time is reset.  To obtain
1042              the original submit time it is necessary to use  the  "sacct  -j
1043              <job_id[.<step_id>]>" command, also designating the -D or --dupli‐
1044              cate option to display all duplicate entries for a job.
1045
1046       SuspendTime
1047              Time the job was last suspended or resumed.
1048
1049       NOTE on information displayed for various job states:
1050              When you submit a request for the "show job" function the  scon‐
1051              trol  process  makes  an  RPC  request  call to slurmctld with a
1052              REQUEST_JOB_INFO message type.  If the state of the job is PEND‐
1053              ING, then it returns some detail information such as: min_nodes,
1054              min_procs, cpus_per_task, etc. If the state is other than  PEND‐
1055              ING  the code assumes that it is in a further state such as RUN‐
1056              NING, COMPLETE, etc. In these cases the code explicitly  returns
1057              zero for these values. These values are meaningless once the job
1058              resources have been allocated and the job has started.
1059
1060
1061       SPECIFICATIONS FOR UPDATE COMMAND, STEPS
1062
1063       StepId=<job_id>[.<step_id>]
1064              Identify the step to be updated.  If the job_id is given, but no
1065              step_id  is  specified then all steps of the identified job will
1066              be modified.  This specification is required.
1067
1068       CompFile=<completion file>
1069              Update a step with information about a steps completion.  Can be
1070              useful  if  step  statistics aren't directly available through a
1071              jobacct_gather plugin.  The file is a space-delimited file whose
1072              format for Version 1 is as follows:
1073
1074              1 34461 0 2 0 3 1361906011 1361906015 1 1 3368 13357 /bin/sleep
1075              A B     C D E F G          H          I J K    L     M
1076
1077              Field Descriptions:
1078
1079              A file version
1080              B ALPS apid
1081              C inblocks
1082              D outblocks
1083              E exit status
1084              F number of allocated CPUs
1085              G start time
1086              H end time
1087              I utime
1088              J stime
1089              K maxrss
1090              L uid
1091              M command name
1092
1093       TimeLimit=<time>
1094              The  job's  time  limit.   Output  format  is  [days-]hours:min‐
1095              utes:seconds or "UNLIMITED".  Input format (for update  command)
1096              is   minutes,   minutes:seconds,   hours:minutes:seconds,
1097              days-hours,  days-hours:minutes  or  days-hours:minutes:seconds.
1098              Time  resolution  is one minute and second values are rounded up
1099              to the next minute.  If changing  the  time  limit  of  a  step,
1100              either specify a new time limit value or precede the time with a
1101              "+" or "-" to increment or  decrement  the  current  time  limit
1102              (e.g.  "TimeLimit=+30").  In order to increment or decrement the
1103              current time limit, the StepId specification  must  precede  the
1104              TimeLimit specification.
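
                  For example, the following (illustrative) command adds 30 min‐
                  utes to the current time limit of step 0 of a hypothetical job
                  1234:

                  scontrol update StepId=1234.0 TimeLimit=+30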
1105
1106
1107       SPECIFICATIONS FOR UPDATE COMMAND, NODES
1108
1109       NodeName=<name>
1110              Identify  the  node(s) to be updated. Multiple node names may be
1111              specified   using   simple   node   range   expressions    (e.g.
1112              "lx[10-20]"). This specification is required.
1113
1114
1115       ActiveFeatures=<features>
1116              Identify  the feature(s) currently active on the specified node.
1117              Any previously active feature specification will be  overwritten
1118              with  the  new  value.   Also  see AvailableFeatures.  Typically
1119              ActiveFeatures will be identical to  AvailableFeatures;  however
1120              ActiveFeatures  may  be configured as a subset of the Available‐
1121              Features. For example, a node may be booted in multiple configu‐
1122              rations.  In that case, all possible configurations may be iden‐
1123              tified as AvailableFeatures, while ActiveFeatures would identify
1124              the current node configuration.
1125
1126
1127       AvailableFeatures=<features>
1128              Identify  the  feature(s)  available on the specified node.  Any
1129              previously defined available feature specification will be over‐
1130              written  with  the  new  value.   AvailableFeatures assigned via
1131              scontrol will only persist across the restart of  the  slurmctld
1132              daemon  with  the  -R option and state files preserved or slurm‐
1133              ctld's receipt of a SIGHUP.  Update slurm.conf with any  changes
1134              meant  to  be  persistent across normal restarts of slurmctld or
1135              the execution of scontrol reconfig.  Also see ActiveFeatures.
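
                  For example, a hypothetical node offering features "knl"  and
                  "haswell", with only "knl" currently active, might be updated
                  with:

                  scontrol update NodeName=lx10 \
                       AvailableFeatures=knl,haswell ActiveFeatures=knl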
1136
1137
1138       CpuBind=<node>
1139              Specify the task binding mode to be used  by  default  for  this
1140              node.   Supported  options  include:  "none", "board", "socket",
1141              "ldom" (NUMA), "core", "thread" and "off" (remove previous bind‐
1142              ing mode).
1143
1144
1145       Gres=<gres>
1146              Identify  generic  resources to be associated with the specified
1147              node.  Any previously defined generic resources  will  be  over‐
1148              written with the new value.  Specifications for multiple generic
1149              resources should be comma separated.  Each  resource  specifica‐
1150              tion  consists  of  a  name followed by an optional colon with a
1151              numeric  value  (default  value  is   one)   (e.g.   "Gres=band‐
1152              width:10000,gpus").   Generic  resources  assigned  via scontrol
1153              will only persist across the restart  of  the  slurmctld  daemon
1154              with  the  -R  option  and  state files preserved or slurmctld's
1155              receipt of a SIGHUP.  Update slurm.conf with any  changes  meant
1156              to be persistent across normal restarts of slurmctld or the exe‐
1157              cution of scontrol reconfig.
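
                  For example, using the specification shown above, generic re‐
                  sources could be set on a hypothetical node with:

                  scontrol update NodeName=lx10 Gres=bandwidth:10000,gpus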
1158
1159
1160       Reason=<reason>
1161              Identify the reason the node is in a "DOWN", "DRAINED",  "DRAIN‐
1162              ING", "FAILING" or "FAIL" state.  Use quotes to enclose a reason
1163              having more than one word.
1164
1165
1166       State=<state>
1167              Identify the state to be assigned to  the  node.  Possible  node
1168              states are "NoResp", "ALLOC", "ALLOCATED", "COMPLETING", "DOWN",
1169              "DRAIN", "FAIL", "FAILING", "FUTURE", "IDLE",  "MAINT",  "MIXED",
1170              "PERFCTRS/NPC",  "RESERVED",  "POWER_DOWN", "POWER_UP", "RESUME"
1171              or "UNDRAIN". Not all of those states can be set using the scon‐
1172              trol  command;  only the following can: "CANCEL_REBOOT", "DOWN",
1173              "DRAIN", "FAIL",  "FUTURE",  "RESUME",  "NoResp",  "POWER_DOWN",
1174              "POWER_UP"  and  "UNDRAIN".   If a node is in a "MIXED" state it
1175              usually means the node is in multiple states.  For  instance  if
1176              only part of the node is "ALLOCATED" and the rest of the node is
1177              "IDLE" the state will be "MIXED".  If you want to remove a  node
1178              from  service,  you typically want to set its state to "DRAIN".
1179              "CANCEL_REBOOT" cancels a pending reboot on the  node  (same  as
1180              "scontrol  cancel_reboot  <node>").   "FAILING"  is  similar  to
1181              "DRAIN" except that some applications will  seek  to  relinquish
1182              those  nodes before the job completes.  "PERFCTRS/NPC" indicates
1183              that Network Performance Counters associated with this node  are
1184              in  use,  rendering  this node as not usable for any other jobs.
1185              "RESERVED" indicates the node is in an advanced reservation  and
1186              not  generally available.  "RESUME" is not an actual node state,
1187              but will change a node state from "DRAINED", "DRAINING",  "DOWN"
1188              or  "REBOOT"  to either "IDLE" or "ALLOCATED" state as appropri‐
1189              ate.   "UNDRAIN"  clears  the  node  from  being  drained  (like
1190              "RESUME"),  but  will  not  change  the  node's base state (e.g.
1191              "DOWN").  Setting a node "DOWN" will cause all running and  sus‐
1192              pended  jobs  on  that  node to be terminated.  "POWER_DOWN" and
1193              "POWER_UP" will use the configured  SuspendProg  and  ResumeProg
1194              programs  to explicitly place a node in or out of a power saving
1195              mode. If a node is already in the process of being powered up or
1196              down,  the  command  will  only change the state of the node but
1197              won't have any effect until the configured ResumeTimeout or Sus‐
1198              pendTimeout  is  reached.   Use of this command can be useful in
1199              situations where a ResumeProg like capmc  in  Cray  machines  is
1200              stalled and one wants to restore the node to "IDLE" manually; in
1201              this case,  rebooting  the  node  and  setting  the   state   to
1202              "POWER_DOWN"  will  cancel the previous "POWER_UP" state and the
1203              node will become "IDLE".  The "NoResp" state will only  set  the
1204              "NoResp"  flag for a node without changing its underlying state.
1205              While all of the above states are valid, some of  them  are  not
1206              valid  new  node  states  given  their prior state.  If the node
1207              state code printed is followed by "~", this indicates  the  node
1208              is  presently  in  a  power  saving  mode  (typically running at
1209              reduced frequency).  If the node state code is followed by  "#",
1210              this indicates the node is presently being powered up or config‐
1211              ured.  If the node state code is followed by "$", this indicates
1212              the  node  is  currently  in  a reservation with a flag value of
1213              "maintenance".  If the node state code is followed by "@",  this
1214              indicates  the node is currently scheduled to be rebooted.  Gen‐
1215              erally only "DRAIN", "FAIL" and "RESUME" should be used.   NOTE:
1216              The  scontrol command should not be used to change node state on
1217              Cray systems. Use Cray tools such as xtprocadmin instead.
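
                  For example, nodes could be drained with a descriptive reason
                  (illustrative node names) using:

                  scontrol update NodeName=lx[10-20] State=DRAIN Reason="disk replacement"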
1218
1219
1220       Weight=<weight>
1221              Identify the weight to be associated with the specified  nodes.
1222              This allows dynamic changes to the weight associated with nodes,
1223              which will be used  for  subsequent  node  allocation  decisions.
1224              Weight  assigned  via  scontrol  will  only  persist  across the
1225              restart of the slurmctld daemon with the  -R  option  and  state
1226              files  preserved  or  slurmctld's  receipt  of a SIGHUP.  Update
1227              slurm.conf with any changes meant to be persistent across normal
1228              restarts of slurmctld or the execution of scontrol reconfig.
1229
1230
1231       SPECIFICATIONS FOR SHOW COMMAND, NODES
1232
1233       The meaning of the energy information is as follows:
1234
1235
1236       CurrentWatts
1237              The  instantaneous  power consumption of the node at the time of
1238              the last node energy accounting sample, in watts.
1239
1240
1241       LowestJoules
1242              The energy consumed by the node between the  last  time  it  was
1243              powered  on  and  the  last time it was registered by slurmd, in
1244              joules.
1245
1246
1247       ConsumedJoules
1248              The energy consumed by the node between the  last  time  it  was
1249              registered  by  the  slurmd  daemon  and  the  last  node energy
1250              accounting sample, in joules.
1251
1252
1253       If the reported value is "n/s" (not supported), the node does not  sup‐
1254       port  the configured AcctGatherEnergyType plugin. If the reported value
1255       is zero, energy accounting for nodes is disabled.
1256
1257
1258       The meaning of the external sensors information is as follows:
1259
1260
1261       ExtSensorsJoules
1262              The energy consumed by the node between the  last  time  it  was
1263              powered  on and the last external sensors plugin node sample, in
1264              joules.
1265
1266
1267
1268       ExtSensorsWatts
1269              The instantaneous power consumption of the node at the  time  of
1270              the last external sensors plugin node sample, in watts.
1271
1272
1273       ExtSensorsTemp
1274              The  temperature  of  the  node at the time of the last external
1275              sensors plugin node sample, in celsius.
1276
1277
1278       If the reported value is "n/s" (not supported), the node does not  sup‐
1279       port the configured ExtSensorsType plugin.
1280
1281
1282       The meaning of the resource specialization information is as follows:
1283
1284
1285       CPUSpecList
1286              The  list  of  Slurm  abstract CPU IDs on this node reserved for
1287              exclusive use by the Slurm compute node daemons (slurmd,  slurm‐
1288              stepd).
1289
1290
1291       MemSpecLimit
1292              The  combined  memory  limit, in megabytes, on this node for the
1293              Slurm compute node daemons (slurmd, slurmstepd).
1294
1295
1296       The meaning of the memory information is as follows:
1297
1298
1299       RealMemory
1300              The total memory, in MB, on the node.
1301
1302
1303       AllocMem
1304              The total memory, in MB, currently  allocated  by  jobs  on  the
1305              node.
1306
1307
1308       FreeMem
1309              The  total memory, in MB, currently free on the node as reported
1310              by the OS.
1311
1312
1313       SPECIFICATIONS FOR UPDATE COMMAND, FRONTEND
1314
1315
1316       FrontendName=<name>
1317              Identify the front end node to be updated. This specification is
1318              required.
1319
1320
1321       Reason=<reason>
1322              Identify  the  reason  the node is in a "DOWN" or "DRAIN" state.
1323              Use quotes to enclose a reason having more than one word.
1324
1325
1326       State=<state>
1327              Identify the state to be assigned to the front end node.  Possi‐
1328              ble  values  are  "DOWN",  "DRAIN"  or "RESUME".  If you want to
1329              remove a front end node from service, you typically want to  set
1330              its  state  to  "DRAIN".  "RESUME" is not an actual node state,
1331              but will return a "DRAINED", "DRAINING",  or  "DOWN"  front  end
1332              node to service, in either "IDLE" or "ALLOCATED" state as appro‐
1333              priate.  Setting a front end node "DOWN" will cause all running and
1334              suspended jobs on that node to be terminated.
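
                  For example, a hypothetical front end node could be removed
                  from service with:

                  scontrol update FrontendName=frontend0 State=DRAIN Reason="maintenance"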
1335
1336
1337       SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, PARTITIONS
1338
1339       AllowGroups=<name>
1340              Identify the user groups which may use this partition.  Multiple
1341              groups may be specified in a comma separated  list.   To  permit
1342              all groups to use the partition specify "AllowGroups=ALL".
1343
1344
1345       AllocNodes=<name>
1346              Comma  separated list of nodes from which users can execute jobs
1347              in the partition.  Node names may be specified  using  the  node
1348              range  expression  syntax described above.  The default value is
1349              "ALL".
1350
1351
1352       Alternate=<partition name>
1353              Alternate partition to be used if the state of this partition is
1354              "DRAIN" or "INACTIVE".  The value "NONE" will clear a previously
1355              set alternate partition.
1356
1357
1358       CpuBind=<node>
1359              Specify the task binding mode to be used  by  default  for  this
1360              partition.    Supported   options   include:   "none",  "board",
1361              "socket", "ldom" (NUMA), "core", "thread" and "off" (remove pre‐
1362              vious binding mode).
1363
1364
1365       Default=<yes|no>
1366              Specify  if  this  partition  is to be used by jobs which do not
1367              explicitly identify a partition to use.  Possible output  values
1368              are "YES" and "NO".  In order to change the default partition of
1369              a running system,  use  the  scontrol  update  command  and  set
1370              Default=yes  for  the  partition that you want to become the new
1371              default.
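
                  For example, to make a hypothetical partition named "batch" the
                  new default partition on a running system:

                  scontrol update PartitionName=batch Default=YES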
1372
1373
1374       DefaultTime=<time>
1375              Run time limit used for jobs that don't specify a value. If  not
1376              set  then  MaxTime will be used.  Format is the same as for Max‐
1377              Time.
1378
1379
1380       DefMemPerCPU=<MB>
1381              Set the default memory to be allocated per CPU for jobs in  this
1382              partition.  The memory size is specified in megabytes.
1383
1384       DefMemPerNode=<MB>
1385              Set the default memory to be allocated per node for jobs in this
1386              partition.  The memory size is specified in megabytes.
1387
1388
1389       DisableRootJobs=<yes|no>
1390              Specify if jobs can be executed as user root.   Possible  values
1391              are "YES" and "NO".
1392
1393
1394       GraceTime=<seconds>
1395              Specifies,  in units of seconds, the preemption grace time to be
1396              extended to a job which has been selected for  preemption.   The
1397              default  value  is  zero; no preemption grace time is allowed on
1398              this partition or QOS.  (Meaningful only for PreemptMode=CANCEL)
1399
1400
1401       Hidden=<yes|no>
1402              Specify if the partition and its  jobs  should  be  hidden  from
1403              view.   Hidden  partitions  will  by  default not be reported by
1404              Slurm APIs or commands.  Possible values are "YES" and "NO".
1405
1406
1407       MaxMemPerCPU=<MB>
1408              Set the maximum memory to be allocated per CPU for jobs in  this
1409              partition.  The memory size is specified in megabytes.
1410
1411       MaxMemPerNode=<MB>
1412              Set the maximum memory to be allocated per node for jobs in this
1413              partition.  The memory size is specified in megabytes.
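
                  For example, per-CPU memory defaults and limits for a hypothet‐
                  ical partition named "batch" could be set with:

                  scontrol update PartitionName=batch DefMemPerCPU=2048 MaxMemPerCPU=4096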
1414
1415
1416       MaxNodes=<count>
1417              Set the maximum number of nodes which will be allocated  to  any
1418              single  job  in  the  partition. Specify a number, "INFINITE" or
1419              "UNLIMITED".  Changing the MaxNodes of a partition has no effect
1420              upon jobs that have already begun execution.
1421
1422
1423       MaxTime=<time>
1424              The   maximum   run   time   for   jobs.    Output   format   is
1425              [days-]hours:minutes:seconds or "UNLIMITED".  Input format  (for
1426              update  command) is minutes, minutes:seconds, hours:minutes:sec‐
1427              onds, days-hours, days-hours:minutes or  days-hours:minutes:sec‐
1428              onds.   Time  resolution  is  one  minute  and second values are
1429              rounded up to the next minute.  Changing the MaxTime of a parti‐
1430              tion has no effect upon jobs that have already begun execution.
1431
1432
1433       MinNodes=<count>
1434              Set  the  minimum number of nodes which will be allocated to any
1435              single job in the partition.  Changing the MinNodes of a  parti‐
1436              tion has no effect upon jobs that have already begun execution.
1437
1438
1439       Nodes=<name>
1440              Identify  the node(s) to be associated with this partition. Mul‐
1441              tiple node names  may  be  specified  using  simple  node  range
1442              expressions  (e.g.  "lx[10-20]").   Note  that  jobs may only be
1443              associated with one partition at any time.  Specify a blank data
1444              value  to remove all nodes from a partition: "Nodes=".  Changing
1445              the Nodes in a partition has  no  effect  upon  jobs  that  have
1446              already begun execution.
1447
1448
1449       OverTimeLimit=<count>
1450              Number  of  minutes  by  which  a  job can exceed its time limit
1451              before being canceled.  The configured job time limit is treated
1452              as  a  soft  limit.  Adding OverTimeLimit to the soft limit pro‐
1453              vides a hard limit, at which point the job is canceled.  This is
1454              particularly useful for backfill scheduling,  which  bases  its
1455              decisions upon each job's soft time limit.  A partition-specific
1456              OverTimeLimit will override any global OverTimeLimit value.   If
1457              not specified, the global OverTimeLimit value will take  prece‐
1458              dence.  May not exceed 65533 minutes.  An input value of "UNLIMITED" will
1459              clear any previously configured partition-specific OverTimeLimit
1460              value.
1461
1462
1463       OverSubscribe=<yes|no|exclusive|force>[:<job_count>]
1464              Specify if compute resources (i.e. individual CPUs) in this par‐
1465              tition can be shared by  multiple  jobs.   Possible  values  are
1466              "YES",  "NO",  "EXCLUSIVE"  and  "FORCE".  An optional job count
1467              specifies how many jobs can be allocated to use each resource.
1468
1469
1470       PartitionName=<name>
1471              Identify the partition to  be  updated.  This  specification  is
1472              required.
1473
1474
1475       PreemptMode=<mode>
1476              Reset  the  mechanism  used to preempt jobs in this partition if
1477              PreemptType is configured to preempt/partition_prio. The default
1478              preemption  mechanism  is specified by the cluster-wide Preempt‐
1479              Mode configuration parameter.  Possible values are "OFF",  "CAN‐
1480              CEL", "CHECKPOINT", "REQUEUE" and "SUSPEND".
1481
1482
1483       Priority=<count>
1484              Jobs submitted to a higher priority partition will be dispatched
1485              before pending jobs in lower priority partitions and if possible
1486              they  will  preempt running jobs from lower priority partitions.
1487              Note that a partition's priority takes precedence over  a  job's
1488              priority.  The value may not exceed 65533.
1489
1490
1491       PriorityJobFactor=<count>
1492              Partition  factor  used by priority/multifactor plugin in calcu‐
1493              lating job priority.  The value may not exceed 65533.  Also  see
1494              PriorityTier.
1495
1496
1497       PriorityTier=<count>
1498              Jobs  submitted to a partition with a higher priority tier value
1499              will be dispatched before pending jobs in partition  with  lower
1500              priority  tier  value  and,   if  possible,  they  will  preempt
1501              running jobs from partitions with lower  priority  tier  values.
1502              Note  that  a  partition's priority tier takes precedence over a
1503              job's priority.  The value may not exceed 65533.  Also see  Pri‐
1504              orityJobFactor.
1505
1506
1507       QOS=<QOSname|blank to remove>
1508              Set the partition QOS with a QOS name, or leave the option blank
1509              to remove the partition QOS.
1510
1511
1512       RootOnly=<yes|no>
1513              Specify if only allocation requests initiated by user root  will
1514              be  satisfied.  This can be used to restrict control of the par‐
1515              tition to some meta-scheduler.  Possible values  are  "YES"  and
1516              "NO".
1517
1518
1519       ReqResv=<yes|no>
1520              Specify  if  only  allocation requests designating a reservation
1521              will be satisfied.  This is used to restrict partition usage  to
1522              be allowed only within a reservation.  Possible values are "YES"
1523              and "NO".
1524
1525
1526       Shared=<yes|no|exclusive|force>[:<job_count>]
1527              Renamed to OverSubscribe, see option descriptions above.
1528
1529
1530       State=<up|down|drain|inactive>
1531              Specify if jobs can be allocated nodes or queued in this  parti‐
1532              tion.  Possible values are "UP", "DOWN", "DRAIN" and "INACTIVE".
1533
1534              UP        Designates that new jobs may be queued on the partition,
1535                        and that jobs may be allocated nodes and run from  the
1536                        partition.
1537
1538              DOWN      Designates  that  new jobs may be queued on the parti‐
1539                        tion, but queued jobs may not be allocated  nodes  and
1540                        run  from  the  partition. Jobs already running on the
1541                        partition continue to run. The jobs must be explicitly
1542                        canceled to force their termination.
1543
1544              DRAIN     Designates  that no new jobs may be queued on the par‐
1545                        tition (job submission requests will be denied with an
1546                        error  message), but jobs already queued on the parti‐
1547                        tion may be allocated nodes and  run.   See  also  the
1548                        "Alternate" partition specification.
1549
1550              INACTIVE  Designates  that no new jobs may be queued on the par‐
1551                        tition, and jobs already queued may not  be  allocated
1552                        nodes  and  run.   See  also the "Alternate" partition
1553                        specification.
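
                  For example, to stop new job submissions to a hypothetical par‐
                  tition named "debug" while allowing already queued jobs to run:

                  scontrol update PartitionName=debug State=DRAIN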
1554
1555
1556       TRESBillingWeights=<TRES Billing Weights>
1557              TRESBillingWeights is used to define the billing weights of each
1558              TRES  type  that will be used in calculating the usage of a job.
1559              The calculated usage is used when calculating fairshare and when
1560              enforcing  the  TRES  billing limit on jobs.  Updates affect new
1561              jobs and not existing jobs.  See the  slurm.conf  man  page  for
1562              more information.
1563
1564
1565
1566       SPECIFICATIONS FOR UPDATE COMMAND, POWERCAP
1567
1568
1569       PowerCap=<count>
1570              Set  the  amount  of watts the cluster is limited to.  Specify a
1571              number, "INFINITE" to enable the  power  capping  logic  without
1572              power  restriction  or  "0"  to disable the power capping logic.
1573              Update slurm.conf with any changes meant to be persistent across
1574              normal restarts of slurmctld or the execution of scontrol recon‐
1575              fig.
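
                  For example, a cluster-wide cap might (illustratively) be set
                  to 1400000 watts, or the power capping logic disabled, with:

                  scontrol update PowerCap=1400000
                  scontrol update PowerCap=0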
1576
1577
1578       SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, RESERVATIONS
1579
1580
1581
1582       Reservation=<name>
1583              Identify the name  of  the  reservation  to  be  created,
1584              updated,  or  deleted.   This  parameter  is required for
1585              update and is the only parameter for delete.  For create,
1586              if you do not want to give a reservation name, use "scon‐
1587              trol create res ..." and a name will be created automati‐
1588              cally.
1589
1590
1591       Accounts=<account list>
1592              List of accounts permitted to use the reserved nodes, for
1593              example "Accounts=physcode1,physcode2".  A user in any of
1594              the  accounts may use the reserved nodes.  A new reserva‐
1595              tion must specify Users and/or Accounts.  If  both  Users
1596              and  Accounts  are  specified,  a  job must match both in
1597              order to use  the  reservation.   Accounts  can  also  be
1598              denied  access  to  reservations  by preceding all of the
1599              account names with '-'.  Alternately  precede  the  equal
1600              sign         with        '-'.         For        example,
1601              "Accounts=-physcode1,-physcode2"                       or
1602              "Accounts-=physcode1,physcode2"  will  permit any account
1603              except physcode1 and physcode2 to  use  the  reservation.
1604              You  can add or remove individual accounts from an exist‐
1605              ing reservation by using the update command and adding  a
1606              '+'  or  '-'  sign  before the '=' sign.  If accounts are
1607              denied access to a reservation (account name preceded  by
1608              a '-'), then all other accounts are implicitly allowed to
1609              use the reservation  and  it  is  not  possible  to  also
1610              explicitly specify allowed accounts.
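
                  For example, an account could be removed from the account
                  list of a hypothetical existing reservation with:

                  scontrol update Reservation=maint_1 Accounts-=physcode2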
1611
1612
1613       BurstBuffer=<buffer_spec>[,<buffer_spec>,...]
1614              Specification  of  burst buffer resources which are to be
1615              reserved.   "buffer_spec"  consists  of  four   elements:
1616              [plugin:][type:]#[units]  "plugin"  is  the  burst buffer
1617              plugin name, currently either "cray" or "generic".  If no
1618              plugin  is specified, the reservation applies to all con‐
1619              figured burst buffer plugins.  "type"  specifies  a  Cray
1620              generic  burst  buffer resource, for example "nodes".  If
1621              "type" is not specified, the number is a measure of stor‐
1622              age  space.   The  "units"  may  be "N" (nodes), "K|KiB",
1623              "M|MiB", "G|GiB", "T|TiB", "P|PiB" (for powers  of  1024)
1624              and  "KB",  "MB",  "GB", "TB", "PB" (for powers of 1000).
1625              The default units are bytes for reservations  of  storage
1626              space.   For  example "BurstBuffer=cray:2TB" (reserve 2TB
1627              of storage from the Cray plugin) or  "Burst‐
1628              Buffer=100GB" (reserve 100 GB of storage from all config‐
1629              ured burst buffer plugins).  Jobs using this  reservation
1630              are  not  restricted to these burst buffer resources, but
1631              may use these reserved resources plus any which are  gen‐
1632              erally available.  NOTE: Usually Slurm interprets KB, MB,
1633              GB, TB, PB units as powers of  1024,  but  for  Burst
1634              Buffers  size  specifications  Slurm supports both IEC/SI
1635              formats.  This is  because  the  CRAY  API  for  managing
1636              DataWarps supports both formats.
1637
1638
1639       CoreCnt=<num>
1640              This    option    is    only   supported   when   Select‐
1641              Type=select/cons_res. Identify  number  of  cores  to  be
1642              reserved.  If  NodeCnt  is  used  without the FIRST_CORES
1643              flag, this is the total number of cores to reserve  where
1644              cores  per  node  is  CoreCnt/NodeCnt.   If a nodelist is
1645              used, or if NodeCnt is used with  the  FIRST_CORES  flag,
1646              this  should  be  an  array  of  core  numbers  by  node:
1647              Nodes=node[1-5]  CoreCnt=2,2,3,3,4  or  flags=FIRST_CORES
1648              NodeCnt=5 CoreCnt=1,2,1,3,2.
1649
1650
1651       Licenses=<license>
1652              Specification  of  licenses (or other resources available
1653              on all nodes of the cluster) which are  to  be  reserved.
1654              License  names  can be followed by a colon and count (the
1655              default count is one).  Multiple license names should  be
1656              comma   separated  (e.g.  "Licenses=foo:4,bar").   A  new
1657              reservation must specify  one  or  more  resources  to  be
1658              included:  NodeCnt, Nodes and/or Licenses.  If a reserva‐
1659              tion includes Licenses, but no NodeCnt or Nodes, then the
1660              option  Flags=LICENSE_ONLY  must also be specified.  Jobs
1661              using  this  reservation  are  not  restricted  to  these
1662              licenses,  but  may  use these reserved licenses plus any
1663              which are generally available.
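
                  For example, a license-only reservation might (illustratively)
                  be created as follows (the user and license names are only
                  examples):

                  scontrol create reservation StartTime=now Duration=120 \
                       Users=admin Licenses=foo:4 Flags=LICENSE_ONLY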
1664
1665
1666       NodeCnt=<num>[,num,...]
1667              Identify number of nodes to be reserved. The  number  can
1668              include  a suffix of "k" or "K", in which case the number
1669              specified is multiplied by 1024.  A new reservation  must
1670              specify  one  or  more  resources to be included: NodeCnt,
1671              Nodes and/or Licenses.
1672
1673
1674       Nodes=<name>
1675              Identify the node(s) to be reserved. Multiple node  names
1676              may  be  specified  using  simple  node range expressions
1677              (e.g. "Nodes=lx[10-20]").  Specify a blank data value  to
1678              remove  all  nodes  from  a reservation: "Nodes=".  A new
1679              reservation must specify  one  or  more  resources  to  be
1680              included: NodeCnt, Nodes and/or Licenses. A specification
1681              of "ALL" will reserve all nodes. Set Flags=PART_NODES and
1682              PartitionName=  in order for changes in the nodes associ‐
1683              ated with a partition to also be reflected in  the  nodes
1684              associated with a reservation.
1685
1686
1687       StartTime=<time_spec>
1688              The  start  time  for the reservation.  A new reservation
1689              must specify a start time.  It accepts times of the  form
1690              HH:MM:SS   for  a  specific  time  of  day  (seconds  are
1691              optional).  (If that time is already past, the  next  day
1692              is  assumed.)   You may also specify midnight, noon, fika
1693              (3 PM) or teatime (4 PM) and you can have  a  time-of-day
1694              suffixed  with AM or PM for running in the morning or the
1695              evening.  You can also say what day the job will be  run,
1696              by  specifying  a  date of the form MMDDYY or MM/DD/YY or
1697              MM.DD.YY, or a date and time as  YYYY-MM-DD[THH:MM[:SS]].
1698              You  can  also  give  times  like now + count time-units,
1699              where the time-units can  be  minutes,  hours,  days,  or
1700              weeks  and  you  can tell Slurm to run the job today with
1701              the keyword today and to run the job  tomorrow  with  the
1702              keyword  tomorrow.  You  cannot update the StartTime of a
1703              reservation in ACTIVE state.
1704
1705
1706       EndTime=<time_spec>
1707              The end time for the reservation.  A new reservation must
1708              specify an end time or a duration.  Valid formats are the
1709              same as for StartTime.
1710
1711
1712       Duration=<time>
1713              The length of a  reservation.   A  new  reservation  must
1714              specify  an  end  time  or a duration.  Valid formats are
1715              minutes,     minutes:seconds,      hours:minutes:seconds,
1716              days-hours,  days-hours:minutes,  days-hours:minutes:sec‐
1717              onds, or UNLIMITED.  Time resolution is  one  minute  and
1718              second  values  are rounded up to the next minute. Output
1719              format is always [days-]hours:minutes:seconds.
1720
1721
1722       PartitionName=<name>
1723              Identify the partition to be reserved.
1724
1725
1726       Flags=<flags>
1727              Flags associated with the reservation.  You  can  add  or
1728              remove  individual  flags from an existing reservation by
1729              adding a '+' or '-' sign before the '=' sign.  For  exam‐
1730              ple:  Flags-=DAILY  (NOTE: this shortcut is not supported
1731              for all flags).  Currently supported flags include:
1732
1733              ANY_NODES     This is a  reservation  for  burst  buffers
1734                            and/or licenses only and not compute nodes.
1735                            If this flag  is  set,  a  job  using  this
1736                            reservation  may  use  the associated burst
1737                            buffers and/or licenses  plus  any  compute
1738                            nodes.   If  this  flag  is  not set, a job
1739                            using this reservation  may  use  only  the
1740                            nodes  and  licenses  associated  with  the
1741                            reservation.
1742
1743              DAILY         Repeat the reservation  at  the  same  time
1744                            every day.
1745
1746              FLEX          Permit  jobs  requesting the reservation to
1747                            begin  prior  to  the  reservation's  start
1748                            time, end after the reservation's end time,
1749                            and use any resources inside and/or outside
1750                            of  the  reservation regardless of any con‐
1751                            straints possibly set in the reservation. A
1752                            typical  use  case  is  to prevent jobs not
1753                            explicitly requesting the reservation  from
1754                            using  those reserved resources rather than
1755                            forcing jobs requesting the reservation  to
1756                            use  those  resources  in  the  time  frame
1757                            reserved. Another  use  case  could  be  to
1758                            always  have  a  particular number of nodes
1759                            with a specific feature reserved for a spe‐
1760                            cific  account so users in this account may
1761                            use these nodes plus  possibly  other  nodes
1762                            without this feature.
1763
1764              FIRST_CORES   Use  the  lowest  numbered  cores on a node
1765                            only.
1766
1767              IGNORE_JOBS   Ignore currently running jobs when creating
1768                            the  reservation.   This  can be especially
1769                            useful when reserving all nodes in the sys‐
1770                            tem for maintenance.
1771
1772              LICENSE_ONLY  See ANY_NODES.
1773
1774              MAINT         Maintenance mode, receives special account‐
1775                            ing treatment.  This partition is permitted
1776                            to   use  resources  that  are  already  in
1777                            another reservation.
1778
1779              NO_HOLD_JOBS_AFTER
1780                            By default, when  a  reservation  ends  the
1781                            reservation  request  will  be removed from
1782                            any pending jobs submitted to the  reserva‐
1783                            tion and those jobs will be put into a held state.
1784                            Use this flag to let jobs  run  outside  of
1785                            the  reservation  after  the reservation is
1786                            gone.
1787
1788              OVERLAP       This reservation can be allocated resources
1789                            that are already in another reservation.
1790
1791              PART_NODES    This  flag can be used to reserve all nodes
1792                            within the specified partition.  Partition‐
1793                            Name  and  Nodes=ALL  must  be specified or
1794                            this option is ignored.
1795
1796              PURGE_COMP    Purge the reservation once the last associ‐
1797                            ated  job has completed.  Once the reserva‐
1798                            tion has been created, it must be populated
1799                            within  5  minutes  of its start time or it
1800                            will be purged before any  jobs  have  been
1801                            run.
1802
1803              REPLACE       Nodes which are DOWN, DRAINED, or allocated
1804                            to jobs are automatically replenished using
1805                            idle resources.  This option can be used to
1806                            maintain  a   constant   number   of   idle
1807                            resources  available for pending jobs (sub‐
1808                            ject to availability  of  idle  resources).
1809                            This should be used with the NodeCnt reser‐
1810                            vation option;  do  not  identify  specific
1811                            nodes to be included in the reservation.
1812
1813              REPLACE_DOWN  Nodes  which  are DOWN or DRAINED are auto‐
1814                            matically replenished using idle resources.
1815                            This  option can be used to maintain a con‐
1816                            stant sized pool of resources available for
1817                            pending  jobs  (subject  to availability of
1818                            idle resources).  This should be used  with
1819                            the  NodeCnt  reservation  option;  do  not
1820                            identify specific nodes to be  included  in
1821                            the reservation.
1822
1823              SPEC_NODES    Reservation  is  for specific nodes (output
1824                            only)
1825
1826              STATIC_ALLOC  Once nodes are selected for a  reservation,
1827                            do not change them.  Without  this  option,
1828                            if a node selected for the reservation goes
1829                            down, the reservation will select a new node
1830                            to fill the spot.
1832
1833              TIME_FLOAT    The  reservation  start time is relative to
1834                            the current time and moves forward  through
1835                            time  (e.g.  a StartTime=now+10minutes will
1836                            always be 10 minutes in the future).
1837
1838              WEEKDAY       Repeat the reservation at the same time  on
1839                            every  weekday (Monday, Tuesday, Wednesday,
1840                            Thursday and Friday).
1841
1842              WEEKEND       Repeat the reservation at the same time  on
1843                            every weekend day (Saturday and Sunday).
1844
1845              WEEKLY        Repeat  the  reservation  at  the same time
1846                            every week.
1847
1848
1849       Features=<features>
1850              Set the reservation's required  node  features.  Multiple
1851              values  may be "&" separated if all features are required
1852              (AND operation) or separated by "|" if any of the  speci‐
1853              fied  features  are required (OR operation).  Parentheses
1854              are also supported for features to be ANDed together with
1855              counts of nodes having the specified features.  For exam‐
1856              ple "Features=[(knl&a2a&flat)*4&haswell*2]" indicates the
1857              advanced  reservation  should include 4 nodes with ALL of
1858              the features "knl", "a2a", and "flat" plus 2  nodes  with
1859              the feature "haswell".
1860
1861              Value may be cleared with blank data value, "Features=".
1862
1863
1864       Users=<user list>
1865              List  of  users  permitted to use the reserved nodes, for
1866              example "User=jones1,smith2".   A  new  reservation  must
1867              specify   Users  and/or  Accounts.   If  both  Users  and
1868              Accounts are specified, a job must match both in order to
1869              use  the reservation.  Users can also be denied access to
1870              reservations by preceding all of the user names with '-'.
1871              Alternately  precede  the equal sign with '-'.  For exam‐
1872              ple, "User=-jones1,-smith2" or "User-=jones1,smith2" will
1873              permit  any  user  except  jones1  and  smith2 to use the
1874              reservation.  You can add or remove individual users from
1875              an  existing  reservation by using the update command and
1876              adding a '+' or '-' sign before the '=' sign.   If  users
1877              are denied access to a reservation (user name preceded by
1878              a '-'), then all other users are  implicitly  allowed  to
1879              use  the  reservation  and  it  is  not  possible to also
1880              explicitly specify allowed users.
1881
1882
1883       TRES=<tres_spec>
1884              Comma-separated list of TRES required  for  the  reserva‐
1885              tion. Current supported TRES types with reservations are:
1886              CPU, Node, License and BB. CPU and Node follow  the  same
1887              format  as  CoreCnt  and NodeCnt parameters respectively.
1888              License names can be followed  by  an  equal  '='  and  a
1889              count:
1890
1891              License/<name1>=<count1>[,License/<name2>=<count2>,...]
1892
1893              BurstBuffer  can  be specified in a similar way as Burst‐
1894              Buffer parameter. The only difference is that the colon sym‐
1895              bol ':' should be replaced by an equal sign '=' in order to
1896              follow the TRES format.
1897
1898              Some examples of TRES valid specifications:
1899
1900              TRES=cpu=5,bb/cray=4,license/iop1=1,license/iop2=3
1901
1902              TRES=node=5k,license/iop1=2
1903
1904              As specified in CoreCnt, if a nodelist is specified,  cpu
1905              can  be  an  array  of  core  numbers by node: nodes=com‐
1906              pute[1-3] TRES=cpu=2,2,1,bb/cray=4,license/iop1=2
1907
1908              Please note that CPU, Node, License and BB  can  override
1909              CoreCnt,  NodeCnt,  Licenses  and  BurstBuffer parameters
1910              respectively.  Also, CPU represents CoreCnt in a  reserva‐
1911              tion and will be adjusted if you have threads per core on
1912              your nodes.
1913
1914
1915       SPECIFICATIONS FOR SHOW COMMAND, LAYOUTS
1916
1917       Without options, lists all configured  layouts.  With  a  layout
1918       specified, shows entities with following options:
1919
1920
1921       Key=<value>
1922              Keys/Values  to  update for the entities. The format must
1923              respect the layout.d configuration files. Key=Type cannot
1924              be  updated.  One  Key/Value  is required; several can be
1925              set.
1926
1927       Entity=<value>
1928              Entities to show, default is not used. Can be set to "*".
1929
1930       Type=<value>
1931              Type of entities to show, default is not used.
1932
1933       nolayout
1934              If not used, only entities  defining  the  tree  are
1935              shown.  With the option, only leaves are shown.
1936
1937

ENVIRONMENT VARIABLES

1939       Some  scontrol  options  may  be  set via environment variables.
1940       These environment  variables,  along  with  their  corresponding
1941       options,  are  listed  below.  (Note:  Commandline  options will
1942       always override these settings.)
1943
1944       SCONTROL_ALL        -a, --all
1945
1946       SCONTROL_FEDERATION --federation
1947
1948       SCONTROL_FUTURE     -F, --future
1949
1950       SCONTROL_LOCAL      --local
1951
1952       SCONTROL_SIBLING    --sibling
1953
1954       SLURM_BITSTR_LEN    Specifies the string length to be  used  for
1955                           holding  a  job  array's task ID expression.
1956                           The default value is 64 bytes.  A value of 0
1957                           will  print  the  full  expression  with any
1958                           length   required.    Larger   values    may
1959                           adversely  impact  the  application  perfor‐
1960                           mance.
1961
1962       SLURM_CLUSTERS      Same as --clusters
1963
1964       SLURM_CONF          The  location  of  the  Slurm  configuration
1965                           file.
1966
1967       SLURM_TIME_FORMAT   Specify  the  format  used  to  report  time
1968                           stamps. A value  of  standard,  the  default
1969                           value,   generates   output   in   the  form
1970                           "year-month-dateThour:minute:second".      A
1971                           value     of     relative    returns    only
1972                           "hour:minute:second"  for  the  current  day.
1973                           For  other  dates  in  the  current  year it
1974                           prints   the   "hour:minute"   preceded   by
1975                           "Tomorr"  (tomorrow),  "Ystday" (yesterday),
1976                           the name of the  day  for  the  coming  week
1977                           (e.g.  "Mon",  "Tue",  etc.),  otherwise the
1978                           date (e.g. "25 Apr").  For  other  years  it
1979                           returns the date, month and year without a time
1980                           (e.g.  "6 Jun 2012"). All of the time stamps
1981                           use a 24 hour format.
1982
1983                           A valid strftime() format can also be speci‐
1984                           fied. For example, a value of "%a  %T"  will
1985                           report  the day of the week and a time stamp
1986                           (e.g. "Mon 12:34:56").
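
                               For example, the following (illustrative) invoca‐
                               tion reports time stamps as a day-of-week and time
                               for a hypothetical job ID:

                               SLURM_TIME_FORMAT="%a %T" scontrol show job 1234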
1987
1988
1989       SLURM_TOPO_LEN      Specify the maximum size of  the  line  when
1990                           printing  Topology.  If not set, the default
1991                           value "512" will be used.
1992
1993

AUTHORIZATION

1995       When using the Slurm db, users who have an  AdminLevel  defined
1996       (Operator  or  Admin) and users who are account coordinators are
1997       given the authority  to  view  and  modify  jobs,  reservations,
1998       nodes,  etc.,  as defined in the following table - regardless of
1999       whether a  PrivateData  restriction  has  been  defined  in  the
2000       slurm.conf file.
2001
2002       scontrol show job(s):        Admin, Operator, Coordinator
2003       scontrol update job:         Admin, Operator, Coordinator
2004       scontrol requeue:            Admin, Operator, Coordinator
2005       scontrol show step(s):       Admin, Operator, Coordinator
2006       scontrol update step:        Admin, Operator, Coordinator
2007
2008       scontrol show node:          Admin, Operator
2009       scontrol update node:        Admin
2010
2011       scontrol create partition:   Admin
2012       scontrol show partition:     Admin, Operator
2013       scontrol update partition:   Admin
2014       scontrol delete partition:   Admin
2015
2016       scontrol create reservation: Admin, Operator
2017       scontrol show reservation:   Admin, Operator
2018       scontrol update reservation: Admin, Operator
2019       scontrol delete reservation: Admin, Operator
2020
2021       scontrol reconfig:           Admin
2022       scontrol shutdown:           Admin
2023       scontrol takeover:           Admin
2024
2025

EXAMPLES

2027       # scontrol
2028       scontrol: show part debug
2029       PartitionName=debug
2030          AllocNodes=ALL AllowGroups=ALL Default=YES
2031          DefaultTime=NONE DisableRootJobs=NO Hidden=NO
2032          MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1
2033          Nodes=snowflake[0-48]
2034          Priority=1 RootOnly=NO OverSubscribe=YES:4
2035          State=UP TotalCPUs=694 TotalNodes=49
2036       scontrol: update PartitionName=debug MaxTime=60:00 MaxNodes=4
2037       scontrol: show job 71701
2038       JobId=71701 Name=hostname
2039          UserId=da(1000) GroupId=da(1000)
2040          Priority=66264 Account=none QOS=normal WCKey=*123
2041          JobState=COMPLETED Reason=None Dependency=(null)
2042          TimeLimit=UNLIMITED  Requeue=1  Restarts=0  BatchFlag=0 Exit‐
2043       Code=0:0
2044          SubmitTime=2010-01-05T10:58:40                      Eligible‐
2045       Time=2010-01-05T10:58:40
2046          StartTime=2010-01-05T10:58:40 EndTime=2010-01-05T10:58:40
2047          SuspendTime=None SecsPreSuspend=0
2048          Partition=debug AllocNode:Sid=snowflake:4702
2049          ReqNodeList=(null) ExcNodeList=(null)
2050          NodeList=snowflake0
2051          NumNodes=1 NumCPUs=10 CPUs/Task=2 ReqS:C:T=1:1:1
2052          MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
2053          Features=(null) Reservation=(null)
2054          OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
2055       scontrol: update JobId=71701 TimeLimit=30:00 Priority=500
2056       scontrol: show hostnames tux[1-3]
2057       tux1
2058       tux2
2059       tux3
2060       scontrol:   create   res   StartTime=2009-04-01T08:00:00   Dura‐
2061       tion=5:00:00 Users=dbremer NodeCnt=10
2062       Reservation created: dbremer_1
2063       scontrol: update Reservation=dbremer_1 Flags=Maint NodeCnt=20
2064       scontrol: delete Reservation=dbremer_1
2065       scontrol: quit
2066
2067

COPYING

2069       Copyright (C) 2002-2007 The Regents of the University  of  Cali‐
2070       fornia.  Produced at Lawrence Livermore National Laboratory (cf,
2071       DISCLAIMER).
2072       Copyright (C) 2008-2010 Lawrence Livermore National Security.
2073       Copyright (C) 2010-2018 SchedMD LLC.
2074
2075       This file is part of Slurm, a resource management program.   For
2076       details, see <https://slurm.schedmd.com/>.
2077
2078       Slurm is free software; you can redistribute it and/or modify it
2079       under the terms of the GNU General Public License  as  published
2080       by  the  Free  Software  Foundation;  either  version  2  of the
2081       License, or (at your option) any later version.
2082
2083       Slurm is distributed in the hope that it  will  be  useful,  but
2084       WITHOUT  ANY WARRANTY; without even the implied warranty of MER‐
2085       CHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See  the  GNU
2086       General Public License for more details.
2087

FILES

2089       /etc/slurm.conf
2090

SEE ALSO

2092       scancel(1),    sinfo(1),    squeue(1),   slurm_checkpoint   (3),
2093       slurm_create_partition    (3),    slurm_delete_partition    (3),
2094       slurm_load_ctl_conf  (3),  slurm_load_jobs  (3), slurm_load_node
2095       (3),   slurm_load_partitions   (3),    slurm_reconfigure    (3),
2096       slurm_requeue   (3),   slurm_resume   (3),  slurm_shutdown  (3),
2097       slurm_suspend (3),  slurm_takeover  (3),  slurm_update_job  (3),
2098       slurm_update_node      (3),      slurm_update_partition     (3),
2099       slurm.conf(5), slurmctld(8)
2100
2101
2102
2103February 2019                   Slurm Commands                     scontrol(1)