smap(1)                         Slurm Commands                        smap(1)


NAME
       smap - graphically view information about Slurm jobs, partitions, and
       set configuration parameters.

SYNOPSIS
       smap [OPTIONS...]

DESCRIPTION
       smap is used to graphically view job, partition and node information
       for a system running Slurm. Note that information about nodes and
       partitions to which you lack access will always be displayed to avoid
       obvious gaps in the output. This is equivalent to the --all option of
       the sinfo and squeue commands.

OPTIONS
       -c, --commandline
              Print output to the commandline, no curses.
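
              For example, a single text-only report of the current state
              could be produced with:

                     smap -c
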
       -D <option>, --display=<option>
              Sets the display mode for smap, showing relevant information
              about the selected view and displaying a corresponding node
              chart. Note that unallocated nodes are indicated by a '.' and
              nodes in the DOWN, DRAINED or FAIL state by a '#'. When the
              --iterate=<seconds> option is also selected, you can switch
              displays by typing a different letter from the list below. An
              example invocation follows the list.

              j      Displays information about jobs running on the system.

              r      Displays information about advanced reservations. While
                     all current and future reservations will be listed,
                     only currently active reservations will appear on the
                     node map.

              s      Displays information about Slurm partitions on the
                     system.
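
              For example, smap could be started directly in the reservation
              view with:

                     smap -D r
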
       -h, --noheader
              Do not print a header on the output.

       -H, --show_hidden
              Display hidden partitions and their jobs.

       --help Print a message describing all smap options.

       -i <seconds>, --iterate=<seconds>
              Print the state on a periodic basis. Sleep for the indicated
              number of seconds between reports. The user can exit at any
              time by typing 'q' or hitting the return key. If in configure
              mode, type 'exit' to exit the program and 'quit' to exit
              configure mode.
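
              For example, the display could be refreshed every 10 seconds
              with:

                     smap -i 10

              Combined with --commandline, a new text report should be
              printed at each interval instead:

                     smap -c -i 60
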
       -M, --clusters=<string>
              Clusters to issue commands to. Note that the SlurmDBD must be
              up for this option to work properly.
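
              For example, output could be requested from a specific cluster
              (hypothetical cluster name) with:

                     smap --clusters=cluster1
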
       -n, --nodes
              Only show objects with these nodes.
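
              For example, using Slurm's hostlist syntax (hypothetical node
              names), the display could be limited to a range of nodes with:

                     smap -n tux[0-7]
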
       -Q, --quiet
              Avoid printing error messages.

       --usage
              Print a brief message listing the smap options.

       -V, --version
              Print version information and exit.

INTERACTIVE OPTIONS
       When using smap in curses mode and when the --iterate=<seconds>
       option is also selected, you can scroll through the different windows
       using the arrow keys. The up and down arrow keys scroll the window
       containing the grid, and the left and right arrow keys scroll the
       window containing the text information.

       With the iterate option selected, you can use any of the options
       available to the -D option listed above (except 'c') to change
       screens. You can also hide or make visible hidden partitions by
       pressing 'h' at any moment.
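
       For example, a typical interactive session might start in the job
       view with a two-second refresh:

              smap -D j -i 2

       While it runs, pressing 'r' or 's' switches views, 'h' toggles hidden
       partitions, and 'q' exits.
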
OUTPUT FIELD DESCRIPTIONS
       ACCESS_CONTROL
              Identifies the users or bank accounts which can use this
              advanced reservation. A prefix of "A:" indicates that the
              following account names may use this reservation. A prefix of
              "U:" indicates that the following user names may use this
              reservation.

       AVAIL  Partition state: up or down.

       END_TIME
              The time when an advanced reservation ended.

       ID     Key to identify the nodes associated with this entity in the
              node chart.

       NAME   Name of the job or advanced reservation.

       NODELIST
              Names of nodes associated with this configuration, partition
              or reservation.

       NODES  Count of nodes with this particular configuration.

       PARTITION
              Name of a partition. Note that the suffix "*" identifies the
              default partition.

       ST     State of a job in compact form. Possible states include: PD
              (pending), R (running), S (suspended), CD (completed), CF
              (configuring), CG (completing), F (failed), TO (timeout), and
              NF (node failure). See the JOB STATE CODES section below for
              more information.

       START_TIME
              The time when an advanced reservation started.

       STATE  State of the nodes. Possible states include: allocated,
              completing, down, drained, draining, fail, failing, idle, and
              unknown, plus their abbreviated forms: alloc, comp, down,
              drain, drng, fail, failg, idle, and unk respectively. Note
              that the suffix "*" identifies nodes that are presently not
              responding. See the NODE STATE CODES section below for more
              information.

       TIMELIMIT
              Maximum time limit for any user job in
              days-hours:minutes:seconds. infinite is used to identify jobs
              or partitions without a job time limit.

TOPOGRAPHY INFORMATION
       The node chart is designed to indicate relative locations of the
       nodes. On most Linux clusters this will represent a one-dimensional
       array of nodes. Larger clusters will utilize multiple lines as
       needed, with the right side of one line being logically followed by
       the left side of the next line.

NODE STATE CODES
       Node state codes are shortened as required for the field size. These
       node states may be followed by a special character to identify state
       flags associated with the node. The following node suffixes and
       states are used:

       *    The node is presently not responding and will not be allocated
            any new work. If the node remains non-responsive, it will be
            placed in the DOWN state (except in the case of COMPLETING,
            DRAINED, DRAINING, FAIL, FAILING nodes).

       ~    The node is presently in a power saving mode (typically running
            at reduced frequency).

       #    The node is presently being powered up or configured.

       $    The node is currently in a reservation with a flag value of
            "maintenance".

       @    The node is pending reboot.

       ALLOCATED   The node has been allocated to one or more jobs.

       ALLOCATED+  The node is allocated to one or more active jobs plus one
                   or more jobs are in the process of COMPLETING.

       COMPLETING  All jobs associated with this node are in the process of
                   COMPLETING. This node state will be removed when all of
                   the job's processes have terminated and the Slurm epilog
                   program (if any) has terminated. See the Epilog parameter
                   description in the slurm.conf man page for more
                   information.

       DOWN        The node is unavailable for use. Slurm can automatically
                   place nodes in this state if some failure occurs. System
                   administrators may also explicitly place nodes in this
                   state. If a node resumes normal operation, Slurm can
                   automatically return it to service. See the
                   ReturnToService and SlurmdTimeout parameter descriptions
                   in the slurm.conf(5) man page for more information.

       DRAINED     The node is unavailable for use per system administrator
                   request. See the update node command in the scontrol(1)
                   man page or the slurm.conf(5) man page for more
                   information.

       DRAINING    The node is currently executing a job, but will not be
                   allocated to additional jobs. The node state will be
                   changed to state DRAINED when the last job on it
                   completes. Nodes enter this state per system
                   administrator request. See the update node command in the
                   scontrol(1) man page or the slurm.conf(5) man page for
                   more information.

       FAIL        The node is expected to fail soon and is unavailable for
                   use per system administrator request. See the update node
                   command in the scontrol(1) man page or the slurm.conf(5)
                   man page for more information.

       FAILING     The node is currently executing a job, but is expected to
                   fail soon and is unavailable for use per system
                   administrator request. See the update node command in the
                   scontrol(1) man page or the slurm.conf(5) man page for
                   more information.

       IDLE        The node is not allocated to any jobs and is available
                   for use.

       MAINT       The node is currently in a reservation with a flag value
                   of "maintenance".

       REBOOT      The node is currently scheduled to be rebooted.

       UNKNOWN     The Slurm controller has just started and the node's
                   state has not yet been determined.

JOB STATE CODES
       Jobs typically pass through several states in the course of their
       execution. The typical states are PENDING, RUNNING, SUSPENDED,
       COMPLETING, and COMPLETED. An explanation of each state follows.

       BF  BOOT_FAIL       Job terminated due to launch failure, typically
                           due to a hardware failure (e.g. unable to boot
                           the node or block and the job can not be
                           requeued).

       CA  CANCELLED       Job was explicitly cancelled by the user or
                           system administrator. The job may or may not
                           have been initiated.

       CD  COMPLETED       Job has terminated all processes on all nodes
                           with an exit code of zero.

       CG  COMPLETING      Job is in the process of completing. Some
                           processes on some nodes may still be active.

       CF  CONFIGURING     Job has been allocated resources, but is waiting
                           for them to become ready for use (e.g. booting).

       F   FAILED          Job terminated with non-zero exit code or other
                           failure condition.

       NF  NODE_FAIL       Job terminated due to failure of one or more
                           allocated nodes.

       PD  PENDING         Job is awaiting resource allocation.

       PR  PREEMPTED       Job terminated due to preemption.

       RV  REVOKED         Sibling job (in federation) revoked.

       R   RUNNING         Job currently has an allocation.

       SI  SIGNALING       Job is being signaled.

       SO  STAGE_OUT       Staging out data after job completion.

       SE  SPECIAL_EXIT    The job was requeued in a special state. This
                           state can be set by users, typically in
                           EpilogSlurmctld, if the job has terminated with
                           a particular exit value.

       ST  STOPPED         Job has an allocation, but execution has been
                           stopped with SIGSTOP signal. CPUs have been
                           retained by this job.

       S   SUSPENDED       Job has an allocation, but execution has been
                           suspended and CPUs have been released for other
                           jobs.

       TO  TIMEOUT         Job terminated upon reaching its time limit.

ENVIRONMENT VARIABLES
       The following environment variables can be used to override settings
       compiled into smap.

       SLURM_CONF          The location of the Slurm configuration file.
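
       For example, smap could be pointed at an alternate configuration file
       (hypothetical path) for a single invocation:

              SLURM_CONF=/etc/slurm/slurm.conf.test smap -c
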
COPYING
       Copyright (C) 2004-2007 The Regents of the University of California.
       Produced at Lawrence Livermore National Laboratory (cf. DISCLAIMER).
       Copyright (C) 2008-2009 Lawrence Livermore National Security.
       Copyright (C) 2010-2013 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at
       your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

SEE ALSO
       scontrol(1), sinfo(1), squeue(1), slurm_load_ctl_conf(3),
       slurm_load_jobs(3), slurm_load_node(3), slurm_load_partitions(3),
       slurm_reconfigure(3), slurm_shutdown(3), slurm_update_job(3),
       slurm_update_node(3), slurm_update_partition(3), slurm.conf(5)



March 2018                      Slurm Commands                        smap(1)