PBSTOP(1)             User Contributed Perl Documentation            PBSTOP(1)

       pbstop - monitoring utility for OpenPBS or Torque

       pbstop [OPTION]... [@hostname]...

       Draws a full-terminal display of your nodes and jobs. The default grid
       shows each node's first CPU as a single character. The specific
       character denotes the state of the node or identifies the job running
       on that CPU. The job listing shows the job name, queue name, state,
       etc. and, on the far left, the character that identifies the job's
       nodes in the upper grid. Pressing a number key will toggle the display
       of that CPU on all of the nodes.

       This program runs best if the "perl-PBS" module is installed. While
       there is currently no loss of features if it isn't installed, pbstop
       will run much faster with it. If you are unsure whether the module is
       installed, run this program, hit "h", and look for the Backend
       information at the top right.

       -s num
           seconds between refreshes

       -c num
           number of columns to display in the grid (0 scales based on
           terminal width)

       -m num
           max number of CPUs in a node before it gets its own grid

       -n  don't put spaces between nodes in the grid, for a more compact
           display (no space)

       -q  queue name for limiting the view of the grid and job list. Only
           one name is supported at this time. No corresponding interactive
           command.

       -u  usernames for limiting the view of the grid and job list. Can be
           a comma-separated list of usernames or "all". "me" is a pseudonym
           for the username running pbstop(1).

       -C  toggle colorization

       -S  toggle state summary display

       -G  toggle grid display

       -Q  toggle queue display

       -t  toggle showing queued jobs in queue display

       -[0-9]...
           CPU numbers for grid display

       -J  toggle jobs in grid display

       -fillbg
           fill the background with black instead of using the terminal's
           default

       -V  print version and exit

       Several single-key commands are recognized while pbstop(1) is running.
       The arrow keys, PageUp, and PageDown will scroll the display if it
       doesn't fit in your terminal.

       When prompted to type something, ctrl-g can be used to cancel the
       command.

       space
           Immediately update the display

       q   Quit pbstop(1)

       h   Display help screen, version, and current settings

       c   Prompts for the number of columns in the node grid (0 auto-scales
           based on terminal width)

       s   Prompts for the number of seconds to wait between display updates

       u   Prompts for a username. The grid and job listing will be limited
           to the named user. Input "all" will remove all limitations (the
           default), and "me" will limit to the username running pbstop(1).
           If the username or "me" is prefixed with a "+" or "-", the
           username will be added to or removed from the list of usernames
           to be limited. "a" and "m" are shortcuts for "all" and "me".

       /   Prompts for a search string and displays the details of the
           matching object. The search can optionally begin with one of the
           following pattern specifiers (think: mutt): "~s" for a server,
           "~n" for a node, or "~j" for a job number. If no pattern
           specifier is found, pbstop will attempt to find the object that
           best matches the search string. The string can be a server name,
           node name, or job number. Node names can optionally be followed
           by a space and the server name. Job numbers may optionally be
           followed by a dot and the server name.

           If an object is found, a subwindow will be opened displaying
           details. Hit "q" to exit the window.

           When viewing a job detail subwindow, pressing "l" is a shortcut
           for jumping directly to the associated job's node load subwindow.

           (Mnemonic: like using / to search for text in vi or less)

       l   Prompts for a job id. A node load report subwindow will be
           displayed for the given job id. This subwindow shows the current
           load average, the physical and available memory, and the number
           of sessions. Available physical memory will be negative in the
           event of swapping. If the number of sessions is 0, that might
           indicate a problem on that node.

           Pressing "l" in this subwindow jumps directly to the associated
           job detail subwindow, as if you had typed "/jobid".

           (Mnemonic: load average)

       C   Toggle the use of colors in the display

       S   Toggle the display of the state summary

       G   Toggle the display of the node grid

       Q   Toggle the display of the job queue

       t   Toggle the display of currently queued (not running) jobs in the
           queue display. This can reduce the size of the queue display
           considerably in some environments.

           (Mnemonic: I don't know, toggle? "Q" was already used for
           something more important)

       J   Toggle the display of job letters in the node grid. This is handy
           because you can see the node state "hidden" behind the job
           letter. For example, use this to see which nodes have jobs but
           are not yet "busy".

       f   Toggle background fill with black instead of using the terminal's
           default. Use this if the display looks bad on your colored or
           transparent background.

       Any single number (0-9)
           Toggle display of that CPU number in the grid. This is confusing
           at first, but useful in SMP environments (see the SMP discussion
           below).
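The "+" and "-" prefixes accepted by the "u" prompt can be easier to follow as code. The following is a minimal sketch in Python, not pbstop's own Perl; the function name update_user_filter and the choice of an empty set to mean "no limit" are assumptions made for illustration.

```python
def update_user_filter(current, text, me):
    """Return the new set of usernames to show; an empty set means no limit.

    current: set of usernames currently shown
    text:    what was typed at the "u" prompt
    me:      the username running the program
    """
    text = text.strip()
    if not text:
        return set(current)          # nothing typed: leave the filter alone

    sign = ""
    if text[0] in "+-":
        sign, text = text[0], text[1:]

    if text in ("all", "a"):
        return set()                 # "all" removes every limitation

    name = me if text in ("me", "m") else text

    if sign == "+":
        return set(current) | {name}   # add to the existing list
    if sign == "-":
        return set(current) - {name}   # remove from the existing list
    return {name}                      # no prefix: limit to this user only
```

With this sketch, typing "+bob" while the view is limited to alice yields a view of both users, and a following "-m" typed by alice leaves only bob.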

       pbstop(1) has many configuration variables that can be set on the
       command line, interactively, or from configuration files. When
       pbstop(1) starts, it first initializes these variables with built-in
       defaults, then reads /etc/pbstoprc, then reads ~/.pbstoprc, and
       finally parses the command line arguments. Note that several of the
       command line arguments and interactive commands are toggles; they
       flip the current value rather than set it directly. In contrast,
       entries in the configuration files are not toggles; they set the
       value directly.
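The practical effect of this ordering is that a toggle option such as "-C" flips whatever the configuration files left behind. The following Python sketch illustrates the layering only; it is not pbstop's code, the names DEFAULTS, parse_rc, and effective_settings are hypothetical, and the default values shown are placeholders rather than pbstop's real defaults.

```python
# Placeholder defaults for illustration; pbstop's real defaults may differ.
DEFAULTS = {"colorize": 1, "show_grid": 1, "columns": 0}


def parse_rc(text):
    """Parse name=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def effective_settings(global_rc, user_rc, toggles=(), assignments=None):
    """Layer defaults, global file, personal file, then the command line."""
    cfg = dict(DEFAULTS)
    for text in (global_rc, user_rc):          # later files override earlier
        for name, value in parse_rc(text).items():
            cfg[name] = int(value) if value.isdigit() else value
    for name in toggles:                       # e.g. -C, -G: flip, don't set
        cfg[name] = 0 if cfg.get(name) else 1
    cfg.update(assignments or {})              # e.g. -c 4: set directly
    return cfg
```

For example, with "colorize=0" in the global file and "colorize" passed as a toggle, the result has colorize set to 1: the toggle turns color back on instead of confirming that it is off.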

       The configuration files may contain the following name=value pairs:

       columns
           Number of columns in the node grid, non-negative integer (0
           scales based on terminal width)

       sleeptime
           Number of seconds to pause between display updates, positive
           integer

       colorize
           Use colors in the display, 1 or 0

       show_summary
           Display the summary at the top of the display, 1 or 0

       compact_summary
           Show node state summary on one line, 1 or 0

       showncpus
           Show the NCPUs job resource in the queue display, 1 or 0

       nodesort
           Define the sorting method for the nodes in the main display grid.
           The current possible methods are:

           ordered
               Preserves the order given by pbs_server without sorting; good
               for nodes that don't follow a specific pattern or order.

           lexical
               Simple alphabetical sort. Fastest method for nodes with
               zero-padded names such as node0023.

           integer
               Integer sort on the first number found in each name. Useful
               if you are unfortunate enough to not have zero-padded nodes,
               like node1 and node23.

           mixed
               Lexical sort followed by an integer sort. Should give
               meaningful results in all cases, especially if you are
               *really* unfortunate enough to not have zero-padded nodes and
               have different leading strings, like lin34 and win5. This is
               the default.

           mixed2
               Mixed sort followed by another mixed sort. Useful for
               pathological admins that name their nodes after rack
               positions, like rack1node4 and rack10node12.

       nodesort_host
           Defines sorting methods on a per-server basis. It is a
           comma-delimited list of "host=method" pairs surrounded by
           parentheses, e.g. nodesort_host=(serv1=ordered,serv2=lexical).
           The host part is first checked as an exact match; otherwise it is
           interpreted as a perl regexp (first match wins).

       nospace
           No space between nodes in grid for a more compact display, 1 or 0

       show_grid
           Show the node grid, 1 or 0

       show_queue
           Show the job queue, 1 or 0

       show_qqueue
           Show queued (not running) jobs in the queue display, 1 or 0

       show_jobs
           Show job and color information in the node grid, 1 or 0

       show_cpu
           Comma-separated list of CPU numbers to display

       show_onlyq
           Queue name to limit the view in the grid and job list. Only one
           name is supported at this time.

       show_user
           Usernames to limit the view in the grid and job list. Can be a
           comma-separated list of users, "all", or "me".

           It might be reasonable for a site to have "show_user=me" in
           /etc/pbstoprc and for admin users to have "show_user=all" in
           their own ~/.pbstoprc.

           Members of a group might want all of their groupmates' usernames
           in their own ~/.pbstoprc.

       host
           Comma-separated list of hostnames running pbs_server

       maxrows
           Number of rows in the large scrollable panel

       maxcolums
           Number of columns in the large scrollable panel

       maxnodegrid
           Max number of CPUs in a node before it gets its own grid,
           positive integer
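The difference between the nodesort methods is easiest to see as sort keys. The Python sketch below is one reading of the descriptions above, not pbstop's implementation; the function names are hypothetical, and mixed2 is interpreted here as "text, number, text, number".

```python
import re


def integer_key(name):
    """Sort on the first run of digits in the name (0 if there is none)."""
    m = re.search(r"\d+", name)
    return int(m.group()) if m else 0


def mixed_key(name):
    """Leading text compared as a string, then the following number."""
    text, number = re.match(r"(\D*)(\d*)", name).groups()
    return (text, int(number or 0))


def mixed2_key(name):
    """Two text/number pairs, e.g. rack10node12 -> ("rack", 10, "node", 12)."""
    a, x, b, y = re.match(r"(\D*)(\d*)(\D*)(\d*)", name).groups()
    return (a, int(x or 0), b, int(y or 0))


nodes = ["node23", "node1", "node3"]
print(sorted(nodes))                    # lexical: node1, node23, node3
print(sorted(nodes, key=integer_key))   # integer: node1, node3, node23
print(sorted(["win5", "lin34", "lin4"], key=mixed_key))
print(sorted(["rack10node12", "rack1node4", "rack2node1"], key=mixed2_key))
```

The "ordered" method needs no key at all, since it keeps the list exactly as the server returned it.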

       A sample configuration file:

           # I'm grumpy and don't like color
           colorize=0

           # my 6 CPU machine should get a separate grid
           maxnodegrid=5

           # all of my Torque servers
           host=teraserver,bigbird,testhpc

           # teraserver has strict naming, testhpc has useless naming
           nodesort_host=(.*\.usc.edu=integer,teraserver=lexical,testhpc=ordered)
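To show how a nodesort_host value like the one in the sample might be resolved for a given server, here is a hedged Python sketch. It takes one reading of "exact match first, otherwise regexp, first match wins": all exact matches are tried before any pattern. The function names are hypothetical, Python's re module stands in for perl regexps, and falling back to "mixed" assumes the global nodesort setting was left at its documented default.

```python
import re


def parse_nodesort_host(value):
    """Turn "(h1=m1,h2=m2)" into an ordered list of (host, method) pairs."""
    inner = value.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    pairs = []
    for item in inner.split(","):
        # Split on the last "=" so the host pattern itself may contain "=".
        host, sep, method = item.rpartition("=")
        if sep:
            pairs.append((host.strip(), method.strip()))
    return pairs


def method_for(server, pairs, default="mixed"):
    """Pick the sort method for a server name."""
    for host, method in pairs:          # pass 1: exact name match
        if host == server:
            return method
    for host, method in pairs:          # pass 2: regexp, first match wins
        if re.search(host, server):
            return method
    return default


pairs = parse_nodesort_host(
    r"(.*\.usc.edu=integer,teraserver=lexical,testhpc=ordered)")
print(method_for("teraserver", pairs))
print(method_for("hpc.usc.edu", pairs))
print(method_for("bigbird", pairs))
```

Because the list is split on commas, a host pattern in this sketch cannot itself contain a comma.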

       pbstop(1) was developed with three specific clusters in mind: a 1700
       node cluster of dual SMP machines, a 64 proc SMP with 16 single node
       machines, and a 21 node cluster of single procs without nicely
       numbered hostnames. With this kind of pedigree, pbstop(1) is fairly
       flexible.

       The number of columns in the grid can be shrunk or expanded on the
       command line with "-c", or interactively with "c". Additional CPUs
       can be displayed by pressing the appropriate number key. Using the
       number keys is confusing at first, but if you try it a few times it
       will become natural. By default, nodes with 8 or more CPUs are
       displayed in a separate grid.

       The first two clusters mentioned above display well with the
       defaults. The third is typically displayed with the number of columns
       set to "1".
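The separate-grid rule can be sketched in a few lines of Python. This is an illustration, not pbstop's code: the function name split_grids is hypothetical, and the default of 7 is inferred from the statement that nodes with 8 or more CPUs get their own grid, not taken from the source.

```python
def split_grids(ncpus_by_node, maxnodegrid=7):
    """Split nodes into those drawn in the main grid and those drawn alone.

    A node gets its own grid when it has more CPUs than maxnodegrid.
    """
    main, own = [], []
    for node, ncpus in ncpus_by_node.items():
        (own if ncpus > maxnodegrid else main).append(node)
    return main, own


cluster = {"n1": 2, "six": 6, "big": 64}
print(split_grids(cluster))                  # only "big" gets its own grid
print(split_grids(cluster, maxnodegrid=5))   # "six" now gets one too
```

The second call corresponds to a configuration with maxnodegrid=5, where a 6 CPU machine is pulled out of the main grid.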

       /etc/pbstoprc
           The global configuration file

       ~/.pbstoprc
           The personal configuration file

       PBS_DEFAULT
           The server's hostname (same as most PBS client commands)

       PBS(3pm), qstat(1B)

       The large Job structure uses the servername supplied by the user,
       while the Job structure uses the servername returned by the server,
       so they don't match up (this makes the jobloadreport imprecise). The
       curses code is very inefficient and the display gets corrupted at
       times. It can't produce plain text output like top's "batch" mode.
       grep for FIXME in pbstop for more!

       pbstop(1) was originally written by Garrick Staples
       <garrick@usc.edu>. The node grid and lettering concept is from Dennis
       Smith. Thanks to Egan Ford and the xCAT mailing list for testing and
       feedback.

perl v5.12.1                      2010-06-22                      PBSTOP(1)