QSTAT(1) Grid Engine User Commands QSTAT(1)

6 qstat - show the status of Grid Engine jobs and queues
7
qstat [ -explain a|A|c|E ] [ -ext ] [ -f ] [ -F [resource_name,...] ]
[ -g {c|d|t}[+] ] [ -help ] [ -j [job_list] ] [ -l resource[=value],... ]
[ -ne ] [ -pe pe_name,... ] [ -pri ] [ -q wc_queue_list ]
[ -qs {a|c|d|o|s|u|A|C|D|E|S} ] [ -r ]
[ -s {p|r|s|z|hu|ho|hs|hj|ha|h|a}[+] ] [ -t ] [ -U user,... ]
[ -u user,... ] [ -urg ] [ -xml ]
14
16 qstat shows the current status of the available Grid Engine queues and
17 the jobs associated with the queues. Selection options allow you to get
18 information about specific jobs, queues or users. Without any option
19 qstat will display only a list of jobs with no queue status informa‐
20 tion.
21
The administrator and the user may define default option files (see
sge_qstat(5)), which can contain any of the options described below. A
cluster-wide sge_qstat file may be placed at
$SGE_ROOT/$SGE_CELL/common/sge_qstat. The user's private file is
searched for at $HOME/.sge_qstat. The file in the home directory takes
precedence over the cluster-wide file. Options given on the command
line override the flags contained in either file.
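
The precedence order described above can be sketched in Python. This is
an illustration only, not how qstat itself is implemented: the function
name effective_qstat_args is hypothetical, the file syntax is assumed to
be whitespace-separated options with '#' comments (see sge_qstat(5) for
the real rules), and "later token wins" is an assumption about how
overriding works. The cell name falls back to "default" when SGE_CELL is
not set.

```python
import os
import shlex


def effective_qstat_args(cli_args, environ=None):
    """Return option tokens in ascending precedence order:
    cluster-wide file, then user file, then command line.

    Hypothetical helper; the parsing rules are assumptions.
    """
    env = os.environ if environ is None else environ
    candidates = []

    root = env.get("SGE_ROOT")
    if root:
        cell = env.get("SGE_CELL", "default")
        candidates.append(os.path.join(root, cell, "common", "sge_qstat"))

    home = env.get("HOME")
    if home:
        candidates.append(os.path.join(home, ".sge_qstat"))

    tokens = []
    for path in candidates:
        if os.path.isfile(path):
            with open(path) as handle:
                for line in handle:
                    # Assumed syntax: shell-like words, '#' starts a comment.
                    tokens.extend(shlex.split(line, comments=True))

    # Command-line options come last so they can override file defaults.
    return tokens + list(cli_args)
```

A caller could pass the result to a process launcher after the program
name "qstat"; whether a later flag really cancels an earlier one depends
on the option and is not modelled here.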
29
-explain a|A|c|E
'c' displays the reason for the c(onfiguration ambiguous) state
of a queue instance. 'a' shows the reason for the alarm state.
Suspend alarm state reasons are displayed by 'A'. 'E' displays
the reason for a queue instance error state.
36
37 The output format for the alarm reasons is one line per reason
38 containing the resource value and threshold. For details about
39 the resource value please refer to the description of the Full
40 Format in section OUTPUT FORMATS below.
41
42 -ext Displays additional information for each job related to the job
43 ticket policy scheme (see OUTPUT FORMATS below).
44
45 -f Specifies a "full" format display of information. The -f option
46 causes summary information on all queues to be displayed along
47 with the queued job list.
48
-F [ resource_name,... ]
As with -f, information is displayed on all jobs as well as
queues. In addition, qstat presents a detailed listing of the
current resource availability per queue with respect to all
resources (if the option argument is omitted) or with respect
to those resources contained in the resource_name list. Please
refer to the description of the Full Format in section OUTPUT
FORMATS below for further detail.
57
58 -g {c|d|t}[+]
59 The -g option allows for controlling grouping of displayed
60 objects.
61
62 With -g c a cluster queue summary is displayed. Find more
63 information in the section OUTPUT FORMATS.
64
65 With -g d array jobs are displayed verbosely in a one line per
66 job task fashion. By default, array jobs are grouped and all
67 tasks with the same status (for pending tasks only) are dis‐
68 played in a single line. The array job task id range field in
69 the output (see section OUTPUT FORMATS) specifies the corre‐
70 sponding set of tasks.
71
72 With -g t parallel jobs are displayed verbosely in a one line
73 per parallel job task fashion. By default, parallel job tasks
74 are displayed in a single line. Also with -g t option the func‐
75 tion of each parallel task is displayed rather than the jobs
76 slot amount (see section OUTPUT FORMATS).
77
78
79 -help Prints a listing of all options.
80
-j [job_list]
Prints various information either for all pending jobs or for
the jobs contained in job_list. The job_list can contain
job_ids, job_names, or wildcard expressions (see sge_types(1)).

For jobs in E(rror) state the error reason is displayed. For
jobs that could not be dispatched during the last scheduling
interval the obstacles are shown, if 'schedd_job_info' in
sched_conf(5) is configured accordingly.
90
91 For running jobs available information on resource utilization
92 is shown about consumed cpu time in seconds, integral memory
93 usage in Gbytes seconds, amount of data transferred in io opera‐
94 tions, current virtual memory utilization in Mbytes, and maximum
95 virtual memory utilization in Mbytes. This information is not
96 available if resource utilization retrieval is not supported for
97 the OS platform where the job is hosted.
98
-l resource[=value],...
Defines the resources required by the jobs or granted by the
queues on which information is requested. Matching is performed
on queues based on non-mutable resource availability information
only. That means load values are always ignored, except for the
so-called static load values (i.e. "arch", "num_proc",
"mem_total", "swap_total" and "virtual_total"). Consumable
utilization is also ignored. The pending jobs are restricted to
jobs that might run in one of the above queues. Similarly, the
queue-job matching is based only on non-mutable resource
availability information.
110
-ne In combination with -f, this option suppresses the display of
empty queues, i.e. queues in which no jobs are currently
running are not displayed.
114
115 -pe pe_name,...
116 Displays status information with respect to queues which are
117 attached to at least one of the parallel environments listed in
118 the comma separated option argument. Status information for jobs
119 is displayed either for those which execute in one of the
120 selected queues or which are pending and might get scheduled to
121 those queues in principle.
122
123
-pri Displays additional information for each job related to the job
priorities in general (see OUTPUT FORMATS below).
126
-q wc_queue_list
Specifies a wildcard expression queue list for which job
information is to be displayed. Find the definition of
wc_queue_list in sge_types(1).
131
132 -qs {a|c|d|o|s|u|A|C|D|E|S}
133 Allows for the filtering of queue instances according to state.
134
135 -r Prints extended information about the resource requirements of
136 the displayed jobs. Please refer to the OUTPUT FORMATS sub-sec‐
137 tion Expanded Format below for detailed information.
138
-s {p|r|s|z|hu|ho|hs|hj|ha|h|a}[+]
Prints only jobs in the specified state; any combination of
states is possible. -s prs corresponds to the regular qstat
output without -s at all. To show recently finished jobs, use
-s z. To display jobs in user/operator/system hold, use the -s
hu/ho/hs option. The -s ha option shows jobs which were
submitted with the qsub -a command. qstat -s hj displays all
jobs which are not eligible for execution because they have
entries in the job dependency list. qstat -s h is an
abbreviation for qstat -s huhohshjha and qstat -s a is an
abbreviation for qstat -s psr (see the -a and -hold_jid options
to qsub(1)).
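
The abbreviations above can be made concrete with a short Python sketch
that expands a -s argument into its individual states. The function name
expand_states is hypothetical, and the greedy matching of two-letter
hold states before single letters is an assumption about how a string
such as "huhohshjha" is meant to be read; qstat's own parser may differ.

```python
# Two-letter hold states are listed first so that "hu" is not read as
# "h" followed by an unknown "u".
TOKENS = ("hu", "ho", "hs", "hj", "ha", "p", "r", "s", "z", "h", "a")

# Abbreviations as documented for the -s option.
ABBREVIATIONS = {
    "h": ("hu", "ho", "hs", "hj", "ha"),
    "a": ("p", "s", "r"),
}


def expand_states(spec):
    """Split a -s state string into tokens and expand 'h' and 'a'."""
    result = []
    pos = 0
    while pos < len(spec):
        for token in TOKENS:
            if spec.startswith(token, pos):
                break
        else:
            raise ValueError("unknown state at position %d in %r" % (pos, spec))
        pos += len(token)
        for state in ABBREVIATIONS.get(token, (token,)):
            if state not in result:
                result.append(state)
    return result
```

With this reading, expand_states("h") and expand_states("huhohshjha")
yield the same list of five hold states.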
151
152 -t Prints extended information about the controlled sub-tasks of
153 the displayed parallel jobs. Please refer to the OUTPUT FORMATS
154 sub-section Reduced Format below for detailed information. Sub-
155 tasks of parallel jobs should not be confused with array job
156 tasks (see -g option above and -t option to qsub(1)).
157
158 -U user,...
159 Displays status information with respect to queues to which the
160 specified users have access. Status information for jobs is dis‐
161 played either for those which execute in one of the selected
162 queues or which are pending and might get scheduled to those
163 queues in principle.
164
165 -u user,...
166 Display information only on those jobs and queues being associ‐
167 ated with the users from the given user list. Queue status
168 information is displayed if the -f or -F options are specified
169 additionally and if the user runs jobs in those queues.
170
171 The string $user is a placeholder for the current username. An
172 asterisk "*" can be used as username wildcard to request any
173 users' jobs be displayed. The default value for this switch is
174 -u $user.
175
176
177 -urg Displays additional information for each job related to the job
178 urgency policy scheme (see OUTPUT FORMATS below).
179
-xml This option can be used with all other options and changes the
output to XML. The schemas used are referenced in the XML
output. The output is printed to stdout.
183
OUTPUT FORMATS
Depending on the presence or absence of the -explain, -f, -F, -qs, -r
and -t options, three output formats need to be differentiated.
187
188 The -ext and -urg options may be used to display additional information
189 for each job.
190
191 Cluster Queue Format (with -g c)
Following the header line a section for each cluster queue is provided.
When queue instance selection is applied (-l, -pe, -q, -U) the cluster
format contains only the cluster queues of the corresponding queue
instances.
196
197 · the cluster queue name.
198
· an average of the normalized load average of all queue hosts. In
order to reflect each host's different significance the number of
configured slots is used as a weighting factor when determining
cluster queue load. Please note that only hosts with a
np_load_value are considered for this value. When queue selection is
applied only data about selected queues is considered in this
formula. If the load value is not available at any of the hosts '-NA-'
is printed instead of the value from the complex attribute
definition.
208
209 · the number of currently used slots.
210
211 · the number of currently available slots.
212
213 · the total number of slots.
214
· the number of slots which are in at least one of the states 'aoACDS'
and in none of the states 'cdsuE'.

· the number of slots which are in one of these states or in any
combination of them: 'cdsuE'.
220
221 · the -g c option can be used in combination with -ext. In this case,
222 additional columns are added to the output. Each column contains the
223 slot count for one of the available queue states.
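
The slot-weighted load average in the second item of the list above can
be illustrated with a small Python sketch. It is a reading of the
description, not the actual qstat code: the function name
cluster_queue_load, the (load, slots) input shape and the two-decimal
output format are assumptions, and "-NA-" is returned when no host
reports a load value.

```python
def cluster_queue_load(hosts):
    """Compute a slot-weighted average of normalized host loads.

    hosts is an iterable of (np_load, slots) pairs; np_load is None for
    a host that reports no load value, and such hosts are skipped.
    """
    weighted_sum = 0.0
    total_slots = 0
    for np_load, slots in hosts:
        if np_load is None:
            continue  # only hosts with a load value are considered
        weighted_sum += np_load * slots
        total_slots += slots
    if total_slots == 0:
        return "-NA-"  # no host contributed a load value
    return "%.2f" % (weighted_sum / total_slots)
```

For example, two hosts with four slots each and loads 0.5 and 1.0 give
(0.5 * 4 + 1.0 * 4) / 8 = 0.75.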
224
225 Reduced Format (without -f, -F, and -qs)
226 Following the header line a line is printed for each job consisting of
227
228 · the job ID.
229
230 · the priority of the job determining its position in the pending jobs
231 list. The priority value is determined dynamically based on ticket
232 and urgency policy set-up (see also sge_priority(5) ).
233
234 · the name of the job.
235
236 · the user name of the job owner.
237
· the status of the job - one of d(eletion), E(rror), h(old),
r(unning), R(estarted), s(uspended), S(uspended), t(ransferring),
T(hreshold) or w(aiting).

The state d(eletion) indicates that qdel(1) has been used to
initiate job deletion. The states t(ransferring) and r(unning)
indicate that a job is about to be executed or is already executing,
whereas the states s(uspended), S(uspended) and T(hreshold) show that
an already running job has been suspended. The s(uspended) state is
caused by suspending the job via the qmod(1) command. The
S(uspended) state indicates that the queue containing the job is
suspended and therefore the job is also suspended. The T(hreshold)
state shows that at least one suspend threshold of the corresponding
queue was exceeded (see queue_conf(5)) and that the job has been
suspended as a consequence. The state R(estarted) indicates that the
job was restarted. This can be caused by a job migration or by one
of the reasons described in the -r section of the qsub(1) command.
256
The states w(aiting) and h(old) only appear for pending jobs. The
h(old) state indicates that a job is currently not eligible for
execution, either due to a hold state assigned to it via qhold(1),
qalter(1) or the qsub(1) -h option, or because the job is waiting for
the completion of jobs on which it depends, as assigned via the
-hold_jid option of qsub(1) or qalter(1).
263
264 The state E(rror) appears for pending jobs that couldn't be started
265 due to job properties. The reason for the job error is shown by the
266 qstat(1) -j job_list option.
267
268 · the submission or start time and date of the job.
269
270 · the queue the job is assigned to (for running or suspended jobs
271 only).
272
273 · the number of job slots or the function of parallel job tasks if -g
274 t is specified.
275
Without the -g t option the total number of slots occupied or
requested by the job is displayed. For pending parallel jobs with a
PE slot range request, the assumed future slot allocation is
displayed. With the -g t option the function of the running jobs
(MASTER or SLAVE - the latter for parallel jobs only) is displayed.

· the array job task id. Will be empty for non-array jobs. See the -t
option to qsub(1) and the -g option above for additional information.
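
As a reading aid for the status field in the list above, the following
Python sketch maps each state letter to the name given in this page. The
helper describe_job_state is hypothetical. It assumes the status column
may carry several letters at once, and it reports any letter not listed
in this page instead of guessing its meaning.

```python
# State letters and names exactly as listed for the Reduced Format.
JOB_STATES = {
    "d": "deletion",
    "E": "Error",
    "h": "hold",
    "r": "running",
    "R": "Restarted",
    "s": "suspended (via qmod)",
    "S": "Suspended (queue suspended)",
    "t": "transferring",
    "T": "Threshold (suspend threshold exceeded)",
    "w": "waiting",
}


def describe_job_state(column):
    """Translate each letter of a job status column into its name."""
    return [
        JOB_STATES.get(letter, "unlisted state letter %r" % letter)
        for letter in column
    ]
```

Calling describe_job_state("dr") returns the names for d(eletion) and
r(unning) in that order.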
284
285 If the -t option is supplied, each status line always contains parallel
286 job task information as if -g t were specified and each line contains
287 the following parallel job subtask information:
288
289 · the parallel task ID (do not confuse parallel tasks with array job
290 tasks),
291
292 · the status of the parallel task - one of r(unning), R(estarted),
293 s(uspended), S(uspended), T(hreshold), w(aiting), h(old), or
294 x(exited).
295
296 · the cpu, memory, and I/O usage,
297
298 · the exit status of the parallel task,
299
300 · and the failure code and message for the parallel task.
301
302 Full Format (with -f and -F)
303 Following the header line a section for each queue separated by a hori‐
304 zontal line is provided. For each queue the information printed con‐
305 sists of
306
307 · the queue name,
308
309 · the queue type - one of B(atch), I(nteractive), C(heckpointing),
310 P(arallel), T(ransfer) or combinations thereof or N(one),
311
312 · the number of used and available job slots,
313
314 · the load average of the queue host,
315
316 · the architecture of the queue host and
317
318 · the state of the queue - one of u(nknown) if the corresponding
319 sge_execd(8) cannot be contacted, a(larm), A(larm), C(alendar sus‐
320 pended), s(uspended), S(ubordinate), d(isabled), D(isabled), E(rror)
321 or combinations thereof.
322
If the state is a(larm) at least one of the load thresholds defined in
the load_thresholds list of the queue configuration (see queue_conf(5))
is currently exceeded, which prevents further jobs from being scheduled
to that queue.
327
328 As opposed to this, the state A(larm) indicates that at least one of
329 the suspend thresholds of the queue (see queue_conf(5)) is currently
330 exceeded. This will result in jobs running in that queue being succes‐
331 sively suspended until no threshold is violated.
332
333 The states s(uspended) and d(isabled) can be assigned to queues and
334 released via the qmod(1) command. Suspending a queue will cause all
335 jobs executing in that queue to be suspended.
336
The states D(isabled) and C(alendar suspended) indicate that the queue
has been disabled or suspended automatically via the calendar facility
of Grid Engine (see calendar_conf(5)), while the S(ubordinate) state
indicates that the queue has been suspended via subordination to another
queue (see queue_conf(5) for details). When suspending a queue
(regardless of the cause) all jobs executing in that queue are suspended
too.
343
344 If an E(rror) state is displayed for a queue, sge_execd(8) on that host
345 was unable to locate the sge_shepherd(8) executable on that host in
346 order to start a job. Please check the error logfile of that
347 sge_execd(8) for leads on how to resolve the problem. Please enable the
348 queue afterwards via the -c option of the qmod(1) command manually.
349
350 If the c(onfiguration ambiguous) state is displayed for a queue
351 instance this indicates that the configuration specified for this queue
352 instance in sge_conf(5) is ambiguous. This state is cleared when the
353 configuration becomes unambiguous again. This state prevents further
354 jobs from being scheduled to that queue instance. Detailed reasons why
355 a queue instance entered the c(onfiguration ambiguous) state can be
356 found in the sge_qmaster(8) messages file and are shown by the qstat
357 -explain switch. For queue instances in this state the cluster queue's
358 default settings are used for the ambiguous attribute.
359
If an o(rphaned) state is displayed for a queue instance, it indicates
that the queue instance is no longer demanded by the current cluster
queue's configuration or the host group configuration. The queue
instance is kept because jobs which have not yet finished are still
associated with it, and it will vanish from qstat output when these
jobs have finished. To speed up the removal of an orphaned queue
instance, the associated job(s) can be deleted using qdel(1). A queue
instance in o(rphaned) state can be revived by changing the cluster
queue configuration accordingly to cover that queue instance. This
state prevents further jobs from being scheduled to that queue
instance.
370
371 If the -F option was used, resource availability information is printed
372 following the queue status line. For each resource (as selected in an
373 option argument to -F or for all resources if the option argument was
374 omitted) a single line is displayed with the following format:
375
376 · a one letter specifier indicating whether the current resource
377 availability value was dominated by either
378 `g' - a cluster global,
379 `h' - a host total or
380 `q' - a queue related resource consumption.
381
382 · a second one letter specifier indicating the source for the current
383 resource availability value, being one of
384 `l' - a load value reported for the resource,
385 `L' - a load value for the resource after administrator defined load
386 scaling has been applied,
387 `c' - availability derived from the consumable resources facility
388 (see complexes(5)),
389 `f' - a fixed availability definition derived from a non-consumable
390 complex attribute or a fixed resource limit.
391
392 · after a colon the name of the resource on which information is dis‐
393 played.
394
395 · after an equal sign the current resource availability value.
396
The displayed availability values and the sources from which they
derive are always the minimum values of all possible combinations.
Hence, for example, a line of the form "qf:h_vmem=4G" indicates that a
queue currently has a maximum availability in virtual memory of 4
Gigabytes, where this value is a fixed value (e.g. a resource limit in
the queue configuration) and it is queue dominated, i.e. the host in
total may have more virtual memory available than this, but the queue
doesn't allow for more. In contrast, a line "hl:h_vmem=4G" would also
indicate an upper bound of 4 Gigabytes of virtual memory availability,
but the limit would be derived from a load value currently reported for
the host. So while the queue might allow for jobs with higher virtual
memory requirements, the host on which this particular queue resides
currently only has 4 Gigabytes available.
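
The line format described above (two one-letter specifiers, a colon, the
resource name, an equal sign and the value) can be parsed with a short
Python sketch. The function name parse_resource_line and the wording of
the labels are illustrative choices; only the letters and the overall
layout come from this page.

```python
import re

# First letter: what dominates the availability value.
DOMINANCE = {"g": "cluster global", "h": "host total", "q": "queue"}

# Second letter: where the value comes from.
SOURCE = {
    "l": "load value",
    "L": "scaled load value",
    "c": "consumable",
    "f": "fixed",
}

LINE_RE = re.compile(r"^([ghq])([lLcf]):([^=]+)=(.*)$")


def parse_resource_line(line):
    """Split one -F resource availability line into its four parts."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a resource availability line: %r" % line)
    dominance, source, name, value = match.groups()
    return {
        "dominance": DOMINANCE[dominance],
        "source": SOURCE[source],
        "name": name,
        "value": value,
    }
```

Applied to "qf:h_vmem=4G" this yields a queue dominated, fixed value of
4G for h_vmem, matching the example in the text.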
410
If the -explain option was used with the character 'a' or 'A',
information about resources that violate load or suspend thresholds is
displayed. The same format as with the -F option is used, with the
following extensions:
416
417 · the line starts with the keyword `alarm'
418
419 · appended to the resource value is the type and value of the appro‐
420 priate threshold
421
422 After the queue status line (in case of -f) or the resource availabil‐
423 ity information (in case of -F) a single line is printed for each job
424 running currently in this queue. Each job status line contains
425
426 · the job ID,
427
428 · the priority of the job determining its position in the pending jobs
429 list. The priority value is determined dynamically based on ticket
430 and urgency policy set-up (see also sge_priority(5) ).
431
432 · the job name,
433
434 · the job owner name,
435
· the status of the job - one of t(ransferring), r(unning),
R(estarted), s(uspended), S(uspended) or T(hreshold) (see the
Reduced Format section for detailed information),
439
440 · the submission or start time and date of the job.
441
442 · the number of job slots or the function of parallel job tasks if -g
443 t is specified.
444
Without the -g t option the number of slots occupied per queue or
requested by the job is displayed. For pending parallel jobs with a
PE slot range request, the assumed future slot allocation is
displayed. With the -g t option the function of the running jobs
(MASTER or SLAVE - the latter for parallel jobs only) is displayed.
450
451 If the -t option is supplied, each job status line also contains
452
453 · the task ID,
454
455 · the status of the task - one of r(unning), R(estarted), s(uspended),
456 S(uspended), T(hreshold), w(aiting), h(old), or x(exited) (see the
457 Reduced Format section for detailed information),
458
459 · the cpu, memory, and I/O usage,
460
461 · the exit status of the task,
462
463 · and the failure code and message for the task.
464
Following the list of queue sections a PENDING JOBS list may be printed
if jobs are waiting to be assigned to a queue. A status line for each
waiting job is displayed, similar to the one for the running jobs. The
differences are that the status for the jobs is w(aiting) or h(old),
that the submit time and date is shown instead of the start time and
that no function is displayed for the jobs.
471
In very rare cases, e.g. if sge_qmaster(8) starts up from an
inconsistent state in the job or queue spool files or if the clean
queue (-cq) option of qconf(1) is used, qstat cannot assign jobs to
either the running or pending jobs section of the output. In this case
a job status inconsistency (e.g. a job has a running status but is not
assigned to a queue) has been detected. Such jobs are printed in an
ERROR JOBS section at the very end of the output. The ERROR JOBS
section should disappear upon restart of sge_qmaster(8). Please contact
your Grid Engine support representative if you feel uncertain about the
cause or effects of such jobs.
482
483 Expanded Format (with -r)
484 If the -r option was specified together with qstat, the following
485 information for each displayed job is printed (a single line for each
486 of the following job characteristics):
487
488 · The job and master queue name.
489
· The hard and soft resource requirements of the job as specified with
the qsub(1) -l option. The per resource addend when determining the
job's urgency contribution value is printed (see also
sge_priority(5)).
494
495 · The requested parallel environment including the desired queue slot
496 range (see -pe option of qsub(1)).
497
498 · The requested checkpointing environment of the job (see the qsub(1)
499 -ckpt option).
500
501 · In case of running jobs, the granted parallel environment with the
502 granted number of queue slots.
503
504 Enhanced Output (with -ext)
505 For each job the following additional items are displayed:
506
507 ntckts The total number of tickets in normalized fashion.
508
509 project
510 The project to which the job is assigned as specified in the
511 qsub(1) -P option.
512
513 department
514 The department, to which the user belongs (use the -sul and -su
515 options of qconf(1) to display the current department defini‐
516 tions).
517
518 cpu The current accumulated CPU usage of the job in seconds.
519
520 mem The current accumulated memory usage of the job in Gbytes sec‐
521 onds.
522
523 io The current accumulated IO usage of the job.
524
tckts The total number of tickets currently assigned to the job.

ovrts The override tickets as assigned by the -ot option of qalter(1).

otckt The override portion of the total number of tickets currently
assigned to the job.

ftckt The functional portion of the total number of tickets currently
assigned to the job.

stckt The share portion of the total number of tickets currently
assigned to the job.
537
538 share The share of the total system to which the job is entitled cur‐
539 rently.
540
541 Enhanced Output (with -urg)
542 For each job the following additional urgency policy related items are
543 displayed (see also sge_priority(5)):
544
nurg The job's total urgency value in normalized fashion.

urg The job's total urgency value.

rrcontr
The urgency value contribution that reflects the urgency that is
related to the job's overall resource requirement.

wtcontr
The urgency value contribution that reflects the urgency related
to the job's waiting time.

dlcontr
The urgency value contribution that reflects the urgency related
to the job's deadline initiation time.
560
561 deadline
562 The deadline initiation time of the job as specified with the
563 qsub(1) -dl option.
564
565 Enhanced Output (with -pri)
566 For each job, the following additional job priority related items are
567 displayed (see also sge_priority(5)):
568
569 nurg The job's total urgency value in normalized fashion.
570
571 npprior
572 The job's -p priority in normalized fashion.
573
574 ntckts The job's ticket amount in normalized fashion.
575
576 ppri The job's -p priority as specified by the user.
577
579 SGE_ROOT Specifies the location of the Grid Engine standard con‐
580 figuration files.
581
582 SGE_CELL If set, specifies the default Grid Engine cell. To
583 address a Grid Engine cell qstat uses (in the order of
584 precedence):
585
586 The name of the cell specified in the environment
587 variable SGE_CELL, if it is set.
588
589 The name of the default cell, i.e. default.
590
591
592 SGE_DEBUG_LEVEL
593 If set, specifies that debug information should be writ‐
594 ten to stderr. In addition the level of detail in which
595 debug information is generated is defined.
596
597 SGE_QMASTER_PORT
598 If set, specifies the tcp port on which sge_qmaster(8)
599 is expected to listen for communication requests. Most
600 installations will use a services map entry for the ser‐
601 vice "sge_qmaster" instead to define that port.
602
SGE_LONG_QNAMES
qstat displays queue names with up to 30 characters. If that is
too much or not enough, a custom length can be set with this
variable. The minimum display length is 10 characters. If the
best display length is not known, SGE_LONG_QNAMES can be set to
-1 and qstat will determine the best length.
610
612 $SGE_ROOT/$SGE_CELL/common/act_qmaster
613 Grid Engine master host file
614 $SGE_ROOT/$SGE_CELL/common/sge_qstat
615 cluster qstat default options
616 $HOME/.sge_qstat
617 user qstat default options
618
620 sge_intro(1), qalter(1), qconf(1), qhold(1), qhost(1), qmod(1),
621 qsub(1), queue_conf(5), sge_execd(8), sge_qmaster(8), sge_shepherd(8).
622
624 See sge_intro(1) for a full statement of rights and permissions.

GE 6.1 $Date: 2007/11/06 18:18:12 $ QSTAT(1)