PBSTOP(1) User Contributed Perl Documentation PBSTOP(1)

pbstop - monitoring utility for OpenPBS or Torque

pbstop [OPTION]... [@hostname]...

Draws a full-terminal display of your nodes and jobs. The default grid
shows each node's first CPU as a single character. The specific character
denotes the state of the node or identifies the job running on that
CPU. The job listing shows the job name, queue name, state, etc. and,
on the far left, the character that identifies the job's nodes in the
upper grid. Pressing a number key will toggle the display of that CPU on
all of the nodes.

This program runs best if the "perl-PBS" module is installed. While
there is currently no loss of features if it isn't installed, pbstop will
run much faster with it. If you are unsure whether perl-PBS is installed,
run this program, hit "h", and look for the Backend information at the top
right.

-s num
    seconds between refreshes

-c num
    number of columns to display in the grid (0 scales based on term
    width)

-m num
    max number of CPUs in a node before it gets its own grid

-q  queue name for limiting the view of the grid and job list. Only
    one name is supported at this time. No corresponding interactive
    command.

-u  usernames for limiting the view of the grid and job list. Can be a
    comma-separated list of usernames or "all". "me" is an alias
    for the username running pbstop(1).

-C  toggle colorization

-S  toggle state summary display

-G  toggle grid display

-Q  toggle queue display

-t  toggle showing queued jobs in queue display

-[0-9]...
    CPU numbers for grid display

-J  toggle jobs in grid display

-fillbg
    fill the background with black instead of using the terminal's
    default

-V  print version and exit

Several single-key commands are recognized while pbstop(1) is running.
The arrow keys, PageUp, and PageDown keys will scroll the display if it
doesn't fit in your terminal.

When prompted to type something, ctrl-g can be used to cancel the
command.

space
    Immediately update display

q   Quit pbstop(1)

h   Display help screen, version, and current settings

c   Prompts for the number of columns to display the node grid (0
    auto-scales based on term width)

s   Prompts for the number of seconds to wait between display updates

u   Prompts for a username. The grid and job listing will be limited
    to the named user. Entering "all" removes all limitations (the
    default), and "me" limits the view to the username running
    pbstop(1). If the username or "me" is prefixed with a "+" or "-",
    the username will be added to or removed from the list of usernames
    being shown. "a" and "m" are shortcuts for "all" and "me".

/   Prompts the user for a search string and displays the details of
    the matching object. The search can optionally begin with one of the
    following pattern specifiers (think: mutt): "~s" for a server, "~n"
    for a node, or "~j" for a job number. If no pattern specifier is
    found, pbstop will attempt to find the object that best matches the
    search string. The string can be a server name, node name, or job
    number. Node names can optionally be followed by a space and the
    server name. Job numbers may optionally be followed by a dot and the
    server name.

    If an object is found, a subwindow will be opened displaying
    details. Hit "q" to exit the window.

    When viewing a job detail subwindow, pressing "l" is a shortcut for
    jumping directly to the associated job's node load subwindow.

    (Mnemonic: like using / to search for text in vi or less)

l   Prompts the user for a job ID. A node load report subwindow will
    be displayed for the given job. This subwindow shows the current
    load average, the physical and available memory, and the number of
    sessions. Available physical memory will be negative in the event
    of swapping. If the number of sessions is 0, that might indicate a
    problem on that node.

    Pressing "l" in this subwindow jumps directly to the associated
    job detail subwindow, as if the user had typed "/jobid".

    (Mnemonic: load average)

C   Toggle the use of colors in the display

S   Toggle the display of the state summary

G   Toggle the display of the node grid

Q   Toggle the display of the job queue

t   Toggle the display of currently queued (not running) jobs in the
    display. This can reduce the size of the queue display considerably
    in some environments.

    (Mnemonic: I don't know, toggle? "Q" was already used for something
    more important)

J   Toggle the display of job letters in the node grid. This is handy
    because you can see the node state "hidden" behind the job letter.
    For example, use this to see which nodes that have jobs are not yet
    "busy".

f   Toggle background fill with black instead of using the terminal's
    default. Use this if the display looks bad on your colored or
    transparent background.

Any single number (0-9)
    Toggle display of that CPU number in the display. This is confusing
    at first, but useful in SMP environments (See SMP section
    below).

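The "+" and "-" handling described for the "u" command can be modeled as
simple set operations. The Python sketch below is an illustration only
and is not taken from pbstop itself; the function name
update_user_filter, the use of an empty set to mean "all", and the
behavior of a prefixed "all" are assumptions.

```python
def update_user_filter(current, entry, me):
    """Return the new set of usernames after one entry at the "u" prompt.

    current: set of usernames being shown; an empty set stands for "all"
             (an assumption of this sketch).
    entry:   text typed at the prompt, e.g. "alice", "+me", "-bob", "a".
    me:      username of the person running the program.
    """
    shortcuts = {"a": "all", "m": "me"}

    # Split off an optional "+" or "-" prefix.
    op = ""
    if entry[:1] in ("+", "-"):
        op, entry = entry[0], entry[1:]

    name = shortcuts.get(entry, entry)
    if name == "all":
        return set()          # remove all limitations
    if name == "me":
        name = me

    if op == "+":
        return current | {name}   # add to the existing list
    if op == "-":
        return current - {name}   # remove from the existing list
    return {name}                 # no prefix: limit to this user only
```

Starting from "all", entering "alice", then "+me", then "-alice" would
leave only the invoking user in the filter.
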
pbstop(1) has many configuration variables that can be set on the command
line, interactively, or from configuration files. When pbstop(1)
starts, it first initializes these variables with built-in defaults,
then reads /etc/pbstoprc, then reads ~/.pbstoprc, and finally parses
the command line arguments. Note that several of the command line
arguments and interactive commands are toggles; they flip the current
value rather than setting it directly. In contrast, entries in the
configuration files set values directly.

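The start-up order described above has a practical consequence: a toggle
such as "-C" flips whatever value the configuration files left behind.
The following Python sketch illustrates this under stated assumptions;
the DEFAULTS values and the function name effective_settings are
hypothetical and do not reflect pbstop's real built-in defaults.

```python
# Illustrative built-in defaults only; the real values may differ.
DEFAULTS = {"colorize": 1, "show_grid": 1, "columns": 0}

def effective_settings(global_rc, user_rc, toggles):
    """Layer settings in the documented order, then apply toggles.

    global_rc: name=value pairs read from /etc/pbstoprc
    user_rc:   name=value pairs read from ~/.pbstoprc
    toggles:   names flipped by command line toggles (e.g. -C -> "colorize")
    """
    settings = dict(DEFAULTS)
    settings.update(global_rc)   # the global file overrides the defaults
    settings.update(user_rc)     # the personal file overrides the global file
    for name in toggles:         # toggles flip the current value
        settings[name] = 0 if settings[name] else 1
    return settings

# A site disables color globally; running with -C turns it back on.
result = effective_settings({"colorize": 0}, {}, ["colorize"])
```

In this model, result["colorize"] is 1: the toggle inverted the value set
in the global file instead of forcing color off.
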
The configuration files may contain the following name=value pairs:

columns
    Number of columns in the node grid, positive integer (0 scales
    based on term width)

sleeptime
    Number of seconds to pause between display updates, positive
    integer

colorize
    Use colors in the display, 1 or 0

show_summary
    Display the summary at the top of the display, 1 or 0

compact_summary
    Show node state summary on one line, 1 or 0

showncpus
    Show the NCPUs job resource in the queue display, 1 or 0

nodesort
    Define the sorting method for the nodes in the main display grid.
    The current possible methods are:

    ordered
        Preserves the order given by pbs_server without sorting; good
        for nodes that don't follow a specific pattern or order.

    lexical
        Simple alphabetical sort. Fastest method for nodes with
        zero-padded names such as node0023.

    integer
        Integer sort on the first number found in each name. Useful if
        you are unfortunate enough to not have zero-padded nodes, like
        node1 and node23.

    mixed
        Lexical sort followed by an integer sort. Should give meaningful
        results in all cases, especially if you are *really* unfortunate
        enough to not have zero-padded nodes and have different
        leading strings, like lin34 and win5. This is the default.

    mixed2
        Mixed sort followed by another mixed sort. Useful for
        pathological admins that name their nodes after rack positions,
        like rack1node4 and rack10node12.

nodesort_host
    Defines sorting methods on a per-server basis. It is a
    comma-delimited list of "host=method" pairs surrounded by
    parentheses, e.g. nodesort_host=(serv1=ordered,serv2=lexical). The
    host part is first checked as an exact match; otherwise it is
    interpreted as a perl regexp (first match wins).

show_grid
    Show the node grid, 1 or 0

show_queue
    Show the job queue, 1 or 0

show_qqueue
    Show queued (not running) jobs in the queue display, 1 or 0

show_jobs
    Show job and color information in the node grid, 1 or 0

show_cpu
    Comma-separated list of CPU numbers to display

show_onlyq
    Queue name to limit the view in the grid and job list. Only one
    name is supported at this time.

show_user
    Usernames to limit the view in the grid and job list. Can be a
    comma-separated list of users, "all", or "me".

    It might be reasonable for a site to have "show_user=me" in
    /etc/pbstoprc and for admin users to have "show_user=all" in their
    own ~/.pbstoprc.

    Members of a group might want all of their groupmates' usernames
    in their own ~/.pbstoprc.

host
    Comma-separated list of hostnames running pbs_server

maxrows
    Number of rows in the large scrollable panel

maxcolums
    Number of columns in the large scrollable panel

maxnodegrid
    Maximum number of CPUs in a node before it gets its own grid,
    positive integer

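The difference between the nodesort methods listed above is easiest to
see on node names that are not zero-padded. The Python sketch below is
one possible reading of the "integer", "mixed", and "mixed2"
descriptions, written as sort keys. It is not pbstop's actual
implementation, and the function names are made up for illustration.

```python
import re

def integer_key(name):
    """Key for "integer": the first number found in the name."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else 0

def mixed_key(name):
    """Key for "mixed": leading string first, then the number after it."""
    match = re.match(r"(\D*)(\d*)", name)
    return (match.group(1), int(match.group(2) or 0))

def mixed2_key(name):
    """Key for "mixed2": every string/number pair in the name, in order."""
    pairs = re.findall(r"(\D*)(\d*)", name)
    return [(text, int(num or 0)) for text, num in pairs if text or num]

nodes = ["node23", "node1", "node4"]
lexical = sorted(nodes)                      # plain alphabetical order
by_integer = sorted(nodes, key=integer_key)  # numeric order
by_mixed = sorted(["lin34", "win5", "lin4"], key=mixed_key)
by_mixed2 = sorted(["rack10node12", "rack1node4"], key=mixed2_key)
```

Here the lexical sort yields node1, node23, node4, while the integer key
yields node1, node4, node23. The mixed key orders the second list as
lin4, lin34, win5, and the mixed2 key places rack1node4 before
rack10node12.
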
A sample configuration file:

    # I'm grumpy and don't like color
    colorize=0

    # my 6 CPU machine should get a separate grid
    maxnodegrid=5

    # all of my Torque servers
    host=teraserver,bigbird,testhpc

    # teraserver has strict naming, testhpc has useless naming
    nodesort_host=(.*\.usc.edu=integer,teraserver=lexical,testhpc=ordered)

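The nodesort_host lookup rule (exact match first, then the first
matching regexp) can be sketched in Python using the value from the
sample file. This is an illustration under assumptions, not pbstop's own
code: the function names are hypothetical, the regexp is applied as an
unanchored search, and the fallback is the documented default method
"mixed".

```python
import re

def parse_nodesort_host(value):
    """Turn "(host=method,...)" into an ordered list of (host, method)."""
    inner = value.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    pairs = []
    # Naive split: a host regexp containing a comma is not supported here.
    for item in inner.split(","):
        host, _, method = item.rpartition("=")
        pairs.append((host.strip(), method.strip()))
    return pairs

def method_for(server, pairs, default="mixed"):
    """Pick the sort method for one server name."""
    # Pass 1: an exact hostname match takes priority.
    for host, method in pairs:
        if host == server:
            return method
    # Pass 2: treat each host as a regexp; the first match wins.
    for host, method in pairs:
        if re.search(host, server):
            return method
    return default

pairs = parse_nodesort_host(
    r"(.*\.usc.edu=integer,teraserver=lexical,testhpc=ordered)"
)
```

With these pairs, method_for("teraserver", pairs) gives "lexical" through
the exact match, method_for("hpc.usc.edu", pairs) gives "integer" through
the regexp, and method_for("bigbird", pairs) falls back to "mixed".
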
pbstop(1) was developed with three specific clusters in mind: a 1700
node cluster of dual SMP machines, a 64 proc SMP with 16 single
node machines, and a 21 node cluster of single procs without nicely
numbered hostnames. With this kind of pedigree, pbstop(1) is fairly
flexible.

The number of columns in the grid can be shrunk or expanded on the
command line with "-c", or interactively with "c". Additional CPUs can be
displayed by pressing the appropriate number key. Using the number
keys is confusing at first, but if you try it a few times it will
become natural. By default, nodes with 8 or more CPUs are displayed in
a separate grid.

The first two clusters mentioned above display well with the defaults.
The third is typically displayed with the number of columns set to "1".

/etc/pbstoprc
    The global configuration file

~/.pbstoprc
    The personal configuration file

PBS_DEFAULT
    The server's hostname (same as most PBS client commands)

PBS(3pm), qstat(1B)

The large Job structure uses the servername supplied by the user, while
the Job structure uses the servername returned by the server, so they
don't match up (this makes the jobloadreport imprecise). The curses
code is very inefficient and the display gets corrupted at times. It
can't produce plain text output like top's "batch" mode. grep for FIXME
in pbstop for more!

pbstop(1) was originally written by Garrick Staples <garrick@usc.edu>.
The node grid and lettering concept is from Dennis Smith. Thanks to
Egan Ford and the xCAT mailing list for testing and feedback.

perl v5.8.8 2006-09-08 PBSTOP(1)