PBSTOP(1)             User Contributed Perl Documentation            PBSTOP(1)

NAME

       pbstop - monitoring utility for OpenPBS or Torque

SYNOPSIS

       pbstop [OPTION]... [@hostname]...

DESCRIPTION

       Draws a full-terminal display of your nodes and jobs.  The default grid
       shows each node's 1st CPU as a single character.  The specific
       character denotes the state of the node or identifies the job running
       on that CPU.  The job listing shows the job name, queue name, state,
       etc. and, on the far left, the character used to identify nodes in the
       upper grid.  Pressing a number key will toggle the display of that CPU
       on all of the nodes.

       This program runs best if the "perl-PBS" module is installed.  While
       there is currently no loss of features if it isn't installed, it will
       run much faster with it.  If you are unsure whether the PBS module is
       installed, run this program, hit "h", and look for the Backend
       information at the top right.

COMMAND-LINE OPTIONS

       -s num
           seconds between refreshes

       -c num
           number of columns to display in the grid (0 scales based on term
           width)

       -m num
           max number of cpus in a node before it gets its own grid

       -n  don't put spaces between each node in the grid display for a more
           compact display (no space)

       -q  queue name for limiting the view of the grid and job list.  Only
           one name is supported at this time.  No corresponding interactive
           command.

       -u  usernames for limiting the view of the grid and job list.  Can be a
           comma-separated list of usernames or "all".  "me" is an alias for
           the username running pbstop(1).

       -C  toggle colorization

       -S  toggle state summary display

       -G  toggle grid display

       -Q  toggle queue display

       -t  toggle showing queued jobs in queue display

       -[0-9]...
           cpu numbers for grid display

       -J  toggle jobs in grid display

       -fillbg
           fill the background with black instead of using the terminal's
           default

       -V  print version and exit

INTERACTIVE COMMANDS

       Several single-key commands are recognized while pbstop(1) is running.
       The arrow keys, PageUp, and PageDown keys will scroll the display if it
       doesn't fit in your terminal.

       When prompted to type something, ctrl-g can be used to cancel the
       command.

       space
           Immediately update display

       q   Quit pbstop(1)

       h   Display help screen, version, and current settings

       c   Prompts for the number of columns to display the node grid (0 auto-
           scales based on term width)

       s   Prompts for the number of seconds to wait between display updates

       u   Prompts for a username.  The grid and job listing will be limited
           to the named user.  Input "all" will remove all limitations (the
           default), and "me" will limit to the current username running
           pbstop(1).  If the username or "me" is prefixed with a "+" or "-",
           the username will be added to or removed from the list of
           usernames to be limited.  "a" and "m" are shortcuts for "all" and
           "me".

       /   Prompts the user for a search string and displays the details of
           the matching object.  The search can optionally begin with one of
           the following pattern specifiers (think: mutt): "~s" for a server,
           "~n" for a node, or "~j" for a job number.  If no pattern
           specifier is found, pbstop will attempt to find the object that
           best matches the search string.  The string can be a server name,
           nodename, or a job number.  Nodenames can optionally be followed
           by a space and the server name.  Job numbers may optionally be
           followed by a dot and the server name.

           If an object is found, a subwindow will be opened displaying
           details.  Hit "q" to exit the window.

           When viewing a job detail subwindow, pressing "l" is a shortcut for
           jumping directly to the associated job's node load subwindow.

           (Mnemonic: like using / to search for text in vi or less)

       l   Prompts the user for a job id.  A node load report subwindow will
           be displayed for the given jobid.  This subwindow shows the current
           load average, the physical and available memory, and the number of
           sessions.  Available physical memory will be negative in the event
           of swapping.  If the number of sessions is 0, that might indicate a
           problem on that node.

           Pressing "l" in this subwindow jumps directly to the associated
           job detail subwindow, as if the user had typed "/jobid".

           (Mnemonic: load average)

       C   Toggle the use of the colors in the display

       S   Toggle the display of the state summary

       G   Toggle the display of the node grid

       Q   Toggle the display of the job queue

       t   Toggle the display of currently queued (not running) jobs in the
           display.  This can reduce the size of the queue display
           considerably in some environments.

           (Mnemonic: I don't know, toggle?  "Q" was already used for
           something more important)

       J   Toggle the display of job letters in the node grid.  This is handy
           because you can see the node state "hidden" behind the job letter.
           For example, use this to see which nodes are not yet "busy" that
           have jobs.

       f   Toggle background fill with black instead of using the terminal's
           default.  Use this if the display looks bad on your colored or
           transparent background.

       Any single number (0-9)
           Toggle display of that CPU number in the display.  This is
           confusing at first, but useful in SMP environments (See SMP section
           below).

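       The "+" and "-" prefixes accepted by the "u" command can be easier to
       follow as code.  The following Python sketch is illustrative only and
       is not taken from pbstop; the function name update_user_filter, the
       use of an empty set to mean "all", and the handling of edge cases are
       assumptions.

```python
def update_user_filter(current, text, me):
    """Return the new set of usernames to show.

    Assumption: an empty set stands for "all" (no limitation).
    `me` is the username of whoever is running the program.
    """
    text = text.strip()
    if not text:
        return set(current)          # nothing typed: leave the filter alone
    sign = ""
    if text[0] in "+-":
        sign, text = text[0], text[1:]
    if text in ("all", "a"):
        return set()                 # remove all limitations
    if text in ("me", "m"):
        text = me                    # "me" stands for the current user
    if sign == "+":
        return set(current) | {text}     # add to the list
    if sign == "-":
        return set(current) - {text}     # remove from the list
    return {text}                    # no prefix: limit to this one user


shown = update_user_filter(set(), "alice", me="bob")   # only alice
shown = update_user_filter(shown, "+me", me="bob")     # alice and bob
shown = update_user_filter(shown, "-alice", me="bob")  # only bob
```

       In this sketch, removing the last remaining username leaves an empty
       set, which falls back to showing everyone; the real program may
       behave differently.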

STARTUP

       pbstop(1) has many configuration variables that can be set on the
       command line, interactively, or from configuration files.  When
       pbstop(1) starts, it first initializes these variables with built-in
       defaults, then reads /etc/pbstoprc, then reads ~/.pbstoprc, and
       finally parses the command line arguments.  Note that several of the
       command line arguments and interactive commands are toggles; they
       don't directly set the value of the configuration.  In contrast,
       settings in the configuration files are not toggles.

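       The startup order above amounts to layering settings, with toggles
       applied last.  The following Python sketch illustrates the idea; it is
       not pbstop's own code.  The DEFAULTS values and the function names
       parse_rc and load_settings are hypothetical, and the comment and
       whitespace handling is an assumption based on the sample file format.

```python
# Hypothetical built-in defaults; the real values live inside pbstop.
DEFAULTS = {"colorize": "1", "show_grid": "1", "sleeptime": "30"}


def parse_rc(text):
    """Parse name=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def load_settings(rc_texts, toggles=()):
    """Layer rc contents over the defaults, then apply toggles."""
    settings = dict(DEFAULTS)
    for text in rc_texts:        # global file first, then the personal file
        settings.update(parse_rc(text))
    for name in toggles:         # a toggle flips whatever value is current
        settings[name] = "0" if settings[name] == "1" else "1"
    return settings


global_rc = "colorize=0\n"
user_rc = "# personal overrides\ncolorize=1\nsleeptime=10\n"
settings = load_settings([global_rc, user_rc], toggles=["colorize"])
```

       Here the personal file turns color back on, and a command-line toggle
       then flips it off again, which is why a toggle's effect depends on
       what the configuration files have already set.
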
       The configuration files may contain the following name=value pairs:

       columns
           Number of columns in the node grid, positive integer (0 scales
           based on term width)

       sleeptime
           Number of seconds to pause between display updates, positive
           integer

       colorize
           Use colors in the display, 1 or 0

       show_summary
           Display the summary at the top of the display, 1 or 0

       compact_summary
           Show node state summary on one line, 1 or 0

       showncpus
           Show the NCPUs job resource in the queue display, 1 or 0.

       nodesort
           Define the sorting method for the nodes in the main display grid.
           The current possible methods are:

           ordered
               Preserves the order given from pbs_server without sorting; good
               for nodes that don't follow a specific pattern or order.

           lexical
               Simple alphabetical sort.  Fastest method for nodes with zero-
               padded names such as node0023.

           integer
               Integer sort on the first number found in each name.  Useful
               if you are unfortunate enough to not have zero-padded nodes,
               like node1 and node23.

           mixed
               Lexical sort followed by an integer sort.  Should give
               meaningful results in all cases, especially if you are *really*
               unfortunate enough to not have zero-padded nodes and have
               different leading strings, like lin34 and win5.  This is the
               default.

           mixed2
               Mixed sort followed by another mixed sort.  Useful for
               pathological admins that name their nodes after rack positions,
               like rack1node4 and rack10node12.

       nodesort_host
           Defines sorting methods on a per-server basis.  It is a comma-
           delimited list of "host=method" pairs surrounded by parentheses,
           e.g.  nodesort_host=(serv1=ordered,serv2=lexical).  The host part
           is first checked as an exact match; otherwise it is interpreted as
           a perl regexp (first match wins).

       nospace
           No space between nodes in grid for a more compact display, 1 or 0

       show_grid
           Show the node grid, 1 or 0

       show_queue
           Show the job queue, 1 or 0

       show_qqueue
           Show queued (not running) jobs in the queue display, 1 or 0

       show_jobs
           Show job and color information in the node grid, 1 or 0

       show_cpu
           Comma-separated list of CPU numbers to display

       show_onlyq
           Queue name to limit the view in the grid and job list.  Only one
           name is supported at this time.

       show_user
           Usernames to limit the view in the grid and job list.  Can be a
           comma-separated list of users, "all", or "me".

           It might be reasonable for a site to have "show_user=me" in
           /etc/pbstoprc and for admin users to have "show_user=all" in their
           own ~/.pbstoprc.

           Members of a group might want all of their groupmates' usernames
           in their own ~/.pbstoprc.

       host
           Comma-separated list of hostnames running pbs_server

       maxrows
           Number of rows in the large scrollable panel

       maxcolums
           Number of columns in the large scrollable panel

       maxnodegrid
           Maximum number of CPUs in a node before it gets its own grid

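       The difference between the nodesort methods is easiest to see on
       sample names.  The following Python sketch shows one plausible reading
       of "lexical", "integer", "mixed", and "mixed2" as sort keys; the key
       functions are assumptions written for illustration and are not
       pbstop's implementation.

```python
import re


def integer_key(name):
    """Sort by the first run of digits in the name (0 if there is none)."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else 0


def mixed_key(name):
    """Sort by the leading non-digit string, then by the first integer."""
    prefix, digits = re.match(r"(\D*)(\d*)", name).groups()
    return (prefix, int(digits or 0))


def mixed2_key(name):
    """Apply the mixed rule twice: text, number, text, number."""
    p1, n1, p2, n2 = re.match(r"(\D*)(\d*)(\D*)(\d*)", name).groups()
    return (p1, int(n1 or 0), p2, int(n2 or 0))


nodes = ["node23", "node1", "node3"]
lexical = sorted(nodes)                    # plain string comparison
by_integer = sorted(nodes, key=integer_key)
by_mixed = sorted(["win5", "lin34", "lin4"], key=mixed_key)
by_mixed2 = sorted(["rack10node12", "rack1node4", "rack2node1"], key=mixed2_key)
```

       With names that are not zero-padded, the lexical order places node23
       before node3, while the integer key orders them numerically.
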
       A sample configuration file:

           # I'm grumpy and don't like color
           colorize=0

           # my 6 CPU machine should get a separate grid
           maxnodegrid=5

           # all of my Torque servers
           host=teraserver,bigbird,testhpc

           # teraserver has strict naming, testhpc has useless naming
           nodesort_host=(.*\.usc.edu=integer,teraserver=lexical,testhpc=ordered)

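       The nodesort_host lookup rule (exact match first, then the first
       matching regexp) can be sketched as follows.  This Python code is
       illustrative only: the function names are hypothetical, Python's re
       module stands in for perl regexps, and the unanchored search and the
       fallback to "mixed" are assumptions.

```python
import re


def parse_nodesort_host(value):
    """Turn "(host=method,host=method)" into an ordered list of pairs."""
    inner = value.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    pairs = []
    for item in inner.split(","):
        # Split on the last "=" so the host pattern may contain "=".
        host, sep, method = item.rpartition("=")
        if sep:
            pairs.append((host.strip(), method.strip()))
    return pairs


def method_for(server, pairs, default="mixed"):
    """Pick the sort method for a server name."""
    for host, method in pairs:        # pass 1: exact match
        if host == server:
            return method
    for host, method in pairs:        # pass 2: first regexp that matches
        if re.search(host, server):
            return method
    return default                    # assumed fallback when nothing matches


pairs = parse_nodesort_host(
    r"(.*\.usc.edu=integer,teraserver=lexical,testhpc=ordered)"
)
usc_method = method_for("hpc.usc.edu", pairs)
tera_method = method_for("teraserver", pairs)
other_method = method_for("bigbird", pairs)
```

       Because the sketch splits on commas, a host pattern cannot itself
       contain a comma.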

SMP ENVIRONMENTS

       pbstop(1) was developed with three specific clusters in mind: a 1700
       node cluster of dual SMP machines, a 64 proc SMP with 16 single node
       machines, and a 21 node cluster of single procs without nicely
       numbered hostnames.  With this kind of pedigree, pbstop(1) is fairly
       flexible.

       The number of columns in the grid can be shrunk or expanded on the
       command line with "-c", or interactively with "c".  Additional CPUs can
       be displayed by pressing the appropriate number key.  Using the number
       keys is confusing at first, but if you try it a few times it will
       become natural.  By default, nodes with 8 or more CPUs are displayed in
       a separate grid.

       The first two clusters mentioned above display well with the defaults.
       The third is typically displayed with the number of columns set to "1".

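       The separate-grid rule can be pictured as a simple partition of nodes
       by CPU count.  The following Python sketch is an illustration, not
       pbstop's code: the function name, the node names, and the strict
       "greater than" comparison (inferred from the sample configuration,
       where maxnodegrid=5 gives a 6 CPU machine its own grid) are
       assumptions.

```python
def split_grids(ncpus_by_node, maxnodegrid):
    """Separate nodes that stay in the main grid from those drawn apart."""
    main_grid, own_grid = [], []
    for node, ncpus in ncpus_by_node.items():
        if ncpus > maxnodegrid:       # assumed comparison, see lead-in
            own_grid.append(node)
        else:
            main_grid.append(node)
    return main_grid, own_grid


# Hypothetical cluster: two dual-CPU nodes and one 6 CPU SMP machine.
nodes = {"node1": 2, "node2": 2, "bigsmp": 6}
main_grid, own_grid = split_grids(nodes, maxnodegrid=5)
```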

FILES

       /etc/pbstoprc
           The global configuration file.

       ~/.pbstoprc
           The personal configuration file.

ENVIRONMENTAL VARIABLES

       PBS_DEFAULT
           The server's hostname (same as most PBS client commands)

SEE ALSO

       PBS(3pm), qstat(1B)

BUGS

       The large Job structure uses the servername supplied by the user, the
       Job structure uses the servername returned by the server... so they
       don't match up (this makes the jobloadreport imprecise).  The curses
       code is very inefficient and the display gets corrupted at times.  It
       can't produce plain text output like top's "batch" mode.  grep FIXME
       from pbstop for more!

AUTHOR

       pbstop(1) was originally written by Garrick Staples <garrick@usc.edu>.
       The node grid and lettering concept is from Dennis Smith.  Thanks to
       Egan Ford and the xCAT mailing list for testing and feedback.

perl v5.12.1                      2010-06-22                         PBSTOP(1)