CONDOR_Q(1)                    HTCondor Manual                    CONDOR_Q(1)



NAME
       condor_q - HTCondor Manual

       Display information about jobs in queue

SYNOPSIS
       condor_q [-help [Universe | State] ]

       condor_q [-debug ] [general options ] [restriction list ]
       [output options ] [analyze options ]

DESCRIPTION
       condor_q displays information about jobs in the HTCondor job queue.
       By default, condor_q queries the local job queue, but this behavior
       may be modified by specifying one of the general options.

       As of version 8.5.2, condor_q defaults to querying only the current
       user's jobs. This default is overridden when the restriction list
       has usernames and/or job ids, when the -submitter or -allusers
       arguments are specified, or when the current user is a queue
       superuser. It can also be overridden by setting the
       CONDOR_Q_ONLY_MY_JOBS configuration macro to False.

       As of version 8.5.6, condor_q defaults to batch-mode output (see
       -batch in the Options section below). The old behavior can be
       obtained by specifying -nobatch on the command line. To change the
       default back to its pre-8.5.6 value, set the new configuration
       variable CONDOR_Q_DASH_BATCH_IS_DEFAULT to False.
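       For example, both defaults can be restored to their older behavior
       with two lines of HTCondor configuration. This is a sketch; the
       variable names are the ones named in the paragraphs above, and
       where the file lives depends on the local configuration setup:

       ```
       # Show all users' jobs by default, as before version 8.5.2
       CONDOR_Q_ONLY_MY_JOBS = False

       # Show one line per job rather than per batch, as before 8.5.6
       CONDOR_Q_DASH_BATCH_IS_DEFAULT = False
       ```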

BATCHES OF JOBS
       As of version 8.5.6, condor_q defaults to displaying information
       about batches of jobs, rather than individual jobs. The intention
       is that this will be a more useful, and user-friendly, format for
       users with large numbers of jobs in the queue. Ideally, users will
       specify meaningful batch names for their jobs, to make it easier to
       keep track of related jobs.

       (For information about specifying batch names for your jobs, see
       the condor_submit and condor_submit_dag manual pages.)

       A batch of jobs is defined as follows:

       • An entire workflow (a DAG or hierarchy of nested DAGs) (note that
         condor_dagman now specifies a default batch name for all jobs in
         a given workflow)

       • All jobs in a single cluster

       • All jobs submitted by a single user that have the same executable
         specified in their submit file (unless submitted with different
         batch names)

       • All jobs submitted by a single user that have the same batch name
         specified in their submit file or on the condor_submit or
         condor_submit_dag command line.
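       Batch names are assigned at submit time. A minimal sketch of a
       submit description that sets one explicitly (the batch_name command
       is documented in the condor_submit manual page; the executable and
       batch name here are hypothetical):

       ```
       # sweep.sub -- hypothetical parameter-sweep submission
       executable = analyze.sh
       arguments  = $(Process)
       batch_name = nightly_sweep
       queue 10
       ```

       The same name can instead be given on the condor_submit command
       line with its -batch-name option.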

OUTPUT
       There are many output options that modify the output generated by
       condor_q. The effects of these options, and the meanings of the
       various output data, are described below.

   Output options
       If the -long option is specified, condor_q displays a long
       description of the queried jobs by printing the entire job ClassAd
       for all jobs matching the restrictions, if any. Individual
       attributes of the job ClassAd can be displayed by means of the
       -format option, which displays attributes with a printf(3) format,
       or with the -autoformat option. Multiple -format options may be
       specified in the option list to display several attributes of the
       job.

       For most output options (except as specified), the last line of
       condor_q output contains a summary of the queue: the total number
       of jobs, and the number of jobs in the completed, removed, idle,
       running, held and suspended states.

       If no output options are specified, condor_q now defaults to batch
       mode, and displays the following columns of information, with one
       line of output per batch of jobs:

            OWNER, BATCH_NAME, SUBMITTED, DONE, RUN, IDLE, [HOLD,] TOTAL, JOB_IDS

       Note that the HOLD column is only shown if there are held jobs in
       the output or if there are no jobs in the output.

       If the -nobatch option is specified, condor_q displays the
       following columns of information, with one line of output per job:

            ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

       If the -dag option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one
       line of output per job; the owner is shown only for top-level jobs,
       and for all other jobs (including sub-DAGs) the node name is shown:

            ID, OWNER/NODENAME, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

       If the -run option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one
       line of output per running job:

            ID, OWNER, SUBMITTED, RUN_TIME, HOST(S)

       Also note that the -run option disables output of the totals line.

       If the -grid option is specified, condor_q displays the following
       columns of information, with one line of output per job:

            ID, OWNER, STATUS, GRID->MANAGER, HOST, GRID_JOB_ID

       If the -grid:ec2 option is specified, condor_q displays the
       following columns of information, with one line of output per job:

            ID, OWNER, STATUS, INSTANCE ID, CMD

       If the -goodput option is specified, condor_q displays the
       following columns of information, with one line of output per job:

            ID, OWNER, SUBMITTED, RUN_TIME, GOODPUT, CPU_UTIL, Mb/s

       If the -io option is specified, condor_q displays the following
       columns of information, with one line of output per job:

            ID, OWNER, RUNS, ST, INPUT, OUTPUT, RATE, MISC

       If the -cputime option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one
       line of output per job:

            ID, OWNER, SUBMITTED, CPU_TIME, ST, PRI, SIZE, CMD

       If the -hold option is specified, condor_q displays the following
       columns of information, with one line of output per job:

            ID, OWNER, HELD_SINCE, HOLD_REASON

       If the -totals option is specified, condor_q displays only one line
       of output no matter how many jobs and batches of jobs are in the
       queue. That line of output contains the total number of jobs, and
       the number of jobs in the completed, removed, idle, running, held
       and suspended states.

   Output data
       The available output data are as follows:

       ID     (Non-batch mode only) The cluster/process id of the HTCondor
              job.

       OWNER  The owner of the job or batch of jobs.

       OWNER/NODENAME
              (-dag only) The owner of a job or the DAG node name of the
              job.

       BATCH_NAME
              (Batch mode only) The batch name of the job or batch of
              jobs.

       SUBMITTED
              The month, day, hour, and minute the job was submitted to
              the queue.

       DONE   (Batch mode only) The number of job procs that are done, but
              still in the queue.

       RUN    (Batch mode only) The number of job procs that are running.

       IDLE   (Batch mode only) The number of job procs that are in the
              queue but idle.

       HOLD   (Batch mode only) The number of job procs that are in the
              queue but held.

       TOTAL  (Batch mode only) The total number of job procs in the
              queue, unless the batch is a DAG, in which case this is the
              total number of clusters in the queue. Note: for non-DAG
              batches, the TOTAL column contains correct values only in
              version 8.5.7 and later.

       JOB_IDS
              (Batch mode only) The range of job IDs belonging to the
              batch.

       RUN_TIME
              (Non-batch mode only) Wall-clock time accumulated by the job
              while running, in days, hours, minutes, and seconds. When
              the job is idle or held, the job's previously accumulated
              run time is displayed.

       ST     (Non-batch mode only) Current status of the job, which
              varies somewhat according to the job universe and the timing
              of updates. H = on hold, R = running, I = idle (waiting for
              a machine to execute on), C = completed, X = removed, S =
              suspended (execution of a running job temporarily suspended
              on the execute node), < = transferring input (or queued to
              do so), and > = transferring output (or queued to do so).

       PRI    (Non-batch mode only) User specified priority of the job,
              displayed as an integer, with higher numbers corresponding
              to better priority.

       SIZE   (Non-batch mode only) The peak amount of memory in Mbytes
              consumed by the job; note this value is only refreshed
              periodically. The actual value reported is taken from the
              job ClassAd attribute MemoryUsage if this attribute is
              defined, and from the job attribute ImageSize otherwise.

       CMD    (Non-batch mode only) The name of the executable. For EC2
              jobs, this field is arbitrary.

       HOST(S)
              (-run only) The host where the job is running.

       STATUS (-grid only) The state that HTCondor believes the job is in.
              Possible values are grid-type specific, but include:

              PENDING
                     The job is waiting for resources to become available
                     in order to run.

              ACTIVE The job has received resources, and the application
                     is executing.

              FAILED The job terminated before completion because of an
                     error, user-triggered cancel, or system-triggered
                     cancel.

              DONE   The job completed successfully.

              SUSPENDED
                     The job has been suspended. Resources which were
                     allocated for this job may have been released due to
                     a scheduler-specific reason.

              STAGE_IN
                     The job manager is staging in files, in order to run
                     the job.

              STAGE_OUT
                     The job manager is staging out files generated by the
                     job.

              UNKNOWN
                     The job's state is unknown.

       GRID->MANAGER
              (-grid only) A guess at what remote batch system is running
              the job. It is a guess, because HTCondor looks at the
              jobmanager contact string to attempt identification. If the
              value is fork, the job is running on the remote host without
              a jobmanager. Values may also be condor, lsf, or pbs.

       HOST   (-grid only) The host to which the job was submitted.

       GRID_JOB_ID
              (-grid only) (More information needed here.)

       INSTANCE ID
              (-grid:ec2 only) Usually the EC2 instance ID; may be blank
              or the client token, depending on job progress.

       GOODPUT
              (-goodput only) The percentage of RUN_TIME for this job
              which has been saved in a checkpoint. A low GOODPUT value
              indicates that the job is failing to checkpoint. If a job
              has not yet attempted a checkpoint, this column contains
              [?????].

       CPU_UTIL
              (-goodput only) The ratio of CPU_TIME to RUN_TIME for
              checkpointed work. A low CPU_UTIL indicates that the job is
              not running efficiently, perhaps because it is I/O bound or
              because the job requires more memory than is available on
              the remote workstations. If the job has not (yet)
              checkpointed, this column contains [??????].

       Mb/s   (-goodput only) The network usage of this job, in Megabits
              per second of run-time.

       READ   The total number of bytes the application has read from
              files and sockets.

       WRITE  The total number of bytes the application has written to
              files and sockets.

       SEEK   The total number of seek operations the application has
              performed on files.

       XPUT   The effective throughput (average bytes read and written per
              second) from the application's point of view.

       BUFSIZE
              The maximum number of bytes to be buffered per file.

       BLOCKSIZE
              The desired block size for large data transfers.

       These fields are updated when a job produces a checkpoint or
       completes. If a job has not yet produced a checkpoint, this
       information is not available.

       INPUT  (-io only) BytesRecvd.

       OUTPUT (-io only) BytesSent.

       RATE   (-io only) BytesRecvd+BytesSent.

       MISC   (-io only) JobUniverse.

       CPU_TIME
              (-cputime only) The remote CPU time accumulated by the job
              to date (which has been stored in a checkpoint) in days,
              hours, minutes, and seconds. (If the job is currently
              running, time accumulated during the current run is not
              shown. If the job has not produced a checkpoint, this column
              contains 0+00:00:00.)

       HELD_SINCE
              (-hold only) Month, day, hour and minute at which the job
              was held.

       HOLD_REASON
              (-hold only) The hold reason for the job.

   Analyze
       The -analyze or -better-analyze options can be used to determine
       why certain jobs are not running by performing an analysis on a
       per-machine basis for each machine in the pool. The reasons can
       vary among failed constraints, insufficient priority, resource
       owner preferences, and prevention of preemption by the
       PREEMPTION_REQUIREMENTS expression. If the analyze option -verbose
       is specified along with the -analyze option, the reason for failure
       is displayed on a per-machine basis. -better-analyze differs from
       -analyze in that it will do matchmaking analysis on jobs even if
       they are currently running, or if the reason they are not running
       is not due to matchmaking. -better-analyze also produces a more
       thorough analysis of complex Requirements expressions and shows the
       values of relevant job ClassAd attributes. When only a single
       machine is being analyzed via -machine or -mconstraint, the values
       of relevant attributes of the machine ClassAd are also displayed.

RESTRICTIONS
       To restrict the display to jobs of interest, a list of zero or more
       restriction options may be supplied. Each restriction may be one
       of:

       • cluster.process, which matches jobs which belong to the specified
         cluster and have the specified process number;

       • cluster (without a process), which matches all jobs belonging to
         the specified cluster;

       • owner, which matches all jobs owned by the specified owner;

       • -constraint expression, which matches all jobs that satisfy the
         specified ClassAd expression;

       • -unmatchable expression, which matches all jobs that do not match
         any slot that would be considered by -better-analyze;

       • -allusers, which overrides the default restriction of only
         matching jobs submitted by the current user.

       If cluster or cluster.process is specified, and the job matching
       that restriction is a condor_dagman job, information for all jobs
       of that DAG is displayed in batch mode (in non-batch mode, only the
       condor_dagman job itself is displayed).

       If no owner restrictions are present, the job matches the
       restriction list if it matches at least one restriction in the
       list. If owner restrictions are present, the job matches the list
       if it matches one of the owner restrictions and at least one
       non-owner restriction.
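       As illustrations of the restriction syntax (these command lines
       assume a running HTCondor pool; the cluster numbers and user name
       are hypothetical, so they are shown here rather than executed):

       ```
       condor_q 127.3                         # cluster 127, process 3 only
       condor_q 127                           # all jobs in cluster 127
       condor_q jdoe                          # all jobs owned by jdoe
       condor_q -constraint 'JobStatus == 5'  # held jobs (JobStatus 5 = held)
       condor_q jdoe 127                      # jobs owned by jdoe that are in cluster 127
       condor_q -allusers                     # everyone's jobs, not just the current user's
       ```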

OPTIONS
       -debug Causes debugging information to be sent to stderr, based on
              the value of the configuration variable TOOL_DEBUG.

       -batch (output option) Show a single line of progress information
              for a batch of jobs, where a batch is defined as follows:

              • An entire workflow (a DAG or hierarchy of nested DAGs)

              • All jobs in a single cluster

              • All jobs submitted by a single user that have the same
                executable specified in their submit file

              • All jobs submitted by a single user that have the same
                batch name specified in their submit file or on the
                condor_submit or condor_submit_dag command line.

              Also change the output columns as noted above.

              Note that, as of version 8.5.6, -batch is the default,
              unless the CONDOR_Q_DASH_BATCH_IS_DEFAULT configuration
              variable is set to False.

       -nobatch
              (output option) Show a line for each job (turn off the
              -batch option).

       -global
              (general option) Queries all job queues in the pool.

       -submitter submitter
              (general option) List jobs of a specific submitter in the
              entire pool, not just for a single condor_schedd.

       -name name
              (general option) Query only the job queue of the named
              condor_schedd daemon.

       -pool centralmanagerhostname[:portnumber]
              (general option) Use the centralmanagerhostname as the
              central manager to locate condor_schedd daemons. The default
              is the COLLECTOR_HOST, as specified in the configuration.

       -jobads file
              (general option) Display jobs from a list of ClassAds from a
              file, instead of the real ClassAds from the condor_schedd
              daemon. This is most useful for debugging purposes. The
              ClassAds appear as if condor_q -long is used with the header
              stripped out.

       -userlog file
              (general option) Display jobs, with job information coming
              from a job event log, instead of from the real ClassAds from
              the condor_schedd daemon. This is most useful for automated
              testing of the status of jobs known to be in the given job
              event log, because it reduces the load on the condor_schedd.
              A job event log does not contain all of the job information,
              so some fields in the normal output of condor_q will be
              blank.

       -factory
              (output option) Display information about late
              materialization job factories in the condor_schedd.

       -autocluster
              (output option) Output condor_schedd daemon auto cluster
              information. For each auto cluster, output the unique ID of
              the auto cluster along with the number of jobs in that auto
              cluster. This option is intended to be used together with
              the -long option to output the ClassAds representing auto
              clusters. The ClassAds can then be used to identify or
              classify the demand for sets of machine resources, which
              will be useful in the on-demand creation of execute nodes
              for glidein services.

       -cputime
              (output option) Instead of wall-clock allocation time
              (RUN_TIME), display remote CPU time accumulated by the job
              to date in days, hours, minutes, and seconds. If the job is
              currently running, time accumulated during the current run
              is not shown. Note that this option has no effect unless
              used in conjunction with -nobatch.

       -currentrun
              (output option) If this option is specified, RUN_TIME
              displays the time accumulated so far on the current run; if
              the job is in the IDLE or HELD state, RUN_TIME displays the
              time of the previous run. This is the default behavior, so
              this option is not normally required; it cannot be used in
              conjunction with -cumulative-time.

       -cumulative-time
              (output option) Normally, RUN_TIME contains the current or
              previous run's accumulated wall-clock time. If this option
              is specified, RUN_TIME displays the accumulated time for the
              current run plus all previous runs. Note that this option
              cannot be used in conjunction with -currentrun.

       -dag   (output option) Display DAG node jobs under their DAGMan
              instance. Child nodes are listed using indentation to show
              the structure of the DAG. Note that this option has no
              effect unless used in conjunction with -nobatch.

       -expert
              (output option) Display shorter error messages.

       -grid  (output option) Get information only about jobs submitted to
              grid resources.

       -grid:ec2
              (output option) Get information only about jobs submitted to
              grid resources and display it in a format better-suited for
              EC2 than the default.

       -goodput
              (output option) Display job goodput statistics.

       -help [Universe | State]
              (output option) Print usage information and, optionally,
              also print the list of job universes or job states.

       -hold  (output option) Get information about jobs in the hold
              state. Also display the time the job was placed into the
              hold state and the reason why the job was placed in the hold
              state.

       -limit Number
              (output option) Limit the number of items output to Number.

       -io    (output option) Display job input/output summaries.

       -long  (output option) Display entire job ClassAds in long format
              (one attribute per line).

       -idle  (output option) Get information about idle jobs. Note that
              this option implies -nobatch.

       -run   (output option) Get information about running jobs. Note
              that this option implies -nobatch.

       -stream-results
              (output option) Display results as jobs are fetched from the
              job queue rather than storing results in memory until all
              jobs have been fetched. This can reduce memory consumption
              when fetching large numbers of jobs, but if condor_q is
              paused while displaying results, this could result in a
              timeout in communication with the condor_schedd.

       -totals
              (output option) Display only the totals.

       -version
              (output option) Print the HTCondor version and exit.

       -wide  (output option) If this option is specified, and the command
              portion of the output would cause the output to extend
              beyond 80 columns, display beyond the 80 columns.

       -xml   (output option) Display entire job ClassAds in XML format.
              The XML format is fully defined in the reference manual,
              obtained from the ClassAds web page, with a link at
              http://htcondor.org/classad/classad.html.

       -json  (output option) Display entire job ClassAds in JSON format.

       -attributes Attr1[,Attr2 ...]
              (output option) Explicitly list the attributes, by name in a
              comma separated list, which should be displayed when using
              the -xml, -json or -long options. Limiting the number of
              attributes increases the efficiency of the query.

       -format fmt attr
              (output option) Display attribute or expression attr in
              format fmt. To display the attribute or expression, the
              format must contain a single printf(3)-style conversion
              specifier. Attributes must be from the job ClassAd.
              Expressions are ClassAd expressions and may refer to
              attributes in the job ClassAd. If the attribute is not
              present in a given ClassAd and cannot be parsed as an
              expression, then the format option will be silently skipped.
              %r prints the unevaluated, or raw, values. The conversion
              specifier must match the type of the attribute or
              expression. %s is suitable for strings such as Owner, %d for
              integers such as ClusterId, and %f for floating point
              numbers such as RemoteWallClockTime. %v identifies the type
              of the attribute, and then prints the value in an
              appropriate format. %V identifies the type of the attribute,
              and then prints the value in an appropriate format as it
              would appear in the -long format. As an example, strings
              used with %V will have quote marks. An incorrect format will
              result in undefined behavior. Do not use more than one
              conversion specifier in a given format; more than one
              conversion specifier will result in undefined behavior. To
              output multiple attributes, repeat the -format option once
              for each desired attribute. Like printf(3)-style formats,
              the format may include other text that will be reproduced
              directly. A format without any conversion specifiers may be
              specified, but an attribute is still required. Include a
              backslash followed by an 'n' to specify a line break.

       -autoformat[:jlhVr,tng] attr1 [attr2 ...] or -af[:jlhVr,tng] attr1
       [attr2 ...]
              (output option) Display attribute(s) or expression(s)
              formatted in a default way according to attribute types.
              This option takes an arbitrary number of attribute names as
              arguments, and prints out their values, with a space between
              each value and a newline character after the last value. It
              is like the -format option without format strings. This
              output option does not work in conjunction with any of the
              options -run, -currentrun, -hold, -grid, -goodput, or -io.

              It is assumed that no attribute names begin with a dash
              character, so that the next word that begins with a dash is
              the start of the next option. The autoformat option may be
              followed by a colon character and formatting qualifiers to
              deviate the output formatting from the default:

              j  print the job ID as the first field,

              l  label each field,

              h  print column headings before the first line of output,

              V  use %V rather than %v for formatting (string values are
                 quoted),

              r  print "raw", or unevaluated, values,

              ,  add a comma character after each field,

              t  add a tab character before each field instead of the
                 default space character,

              n  add a newline character after each field,

              g  add a newline character between ClassAds, and suppress
                 spaces before each field.

              Use -af:h to get tabular values with headings.

              Use -af:lrng to get -long equivalent format.

              The newline and comma characters may not be used together.
              The l and h characters may not be used together.

       -print-format file
              Read output formatting information from the given custom
              print format file. See Print Formats for more information
              about custom print format files.

       -analyze[:<qual>]
              (analyze option) Perform a matchmaking analysis on why the
              requested jobs are not running. First a simple analysis
              determines if the job is not running due to not being in a
              runnable state. If the job is in a runnable state, then this
              option is equivalent to -better-analyze. <qual> is a comma
              separated list containing one or more of

              priority to consider user priority during the analysis

              summary to show a one line summary for each job or machine

              reverse to analyze machines, rather than jobs

       -better-analyze[:<qual>]
              (analyze option) Perform a more detailed matchmaking
              analysis to determine how many resources are available to
              run the requested jobs. This option is never meaningful for
              Scheduler universe jobs and only meaningful for grid
              universe jobs doing matchmaking. When this option is used in
              conjunction with the -unmatchable option, the output will be
              a list of job ids that don't match any of the available
              slots. <qual> is a comma separated list containing one or
              more of

              priority to consider user priority during the analysis

              summary to show a one line summary for each job or machine

              reverse to analyze machines, rather than jobs

       -machine name
              (analyze option) When doing matchmaking analysis, analyze
              only machine ClassAds that have slot or machine names that
              match the given name.

       -mconstraint expression
              (analyze option) When doing matchmaking analysis, match only
              machine ClassAds which match the ClassAd expression
              constraint.

       -slotads file
              (analyze option) When doing matchmaking analysis, use the
              machine ClassAds from the file instead of the ones from the
              condor_collector daemon. This is most useful for debugging
              purposes. The ClassAds appear as if condor_status -long is
              used.

       -userprios file
              (analyze option) When doing matchmaking analysis with
              priority, read user priorities from the file rather than the
              ones from the condor_negotiator daemon. This is most useful
              for debugging purposes or to speed up analysis in situations
              where the condor_negotiator daemon is slow to respond to
              condor_userprio requests. The file should be in the format
              produced by condor_userprio -long.

       -nouserprios
              (analyze option) Do not consider user priority during the
              analysis.

       -reverse-analyze
              (analyze option) Analyze machine requirements against jobs.

       -verbose
              (analyze option) When doing analysis, show progress and
              include the names of specific machines in the output.

GENERAL REMARKS
       The default output from condor_q is formatted to be human readable,
       not script readable. In an effort to make the output fit within 80
       characters, values in some fields might be truncated. Furthermore,
       the HTCondor Project can (and does) change the formatting of this
       default output as we see fit. Therefore, any script that is
       attempting to parse data from condor_q is strongly encouraged to
       use the -format option (described above, examples given below).
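       For example, a script can ask for exactly the attributes it needs
       and count jobs per state from that fixed, minimal layout. A sketch,
       with the condor_q invocation replaced by canned sample output (the
       job IDs shown are hypothetical), assuming the standard JobStatus
       codes 1 = idle, 2 = running, 5 = held:

       ```shell
       # In a real script this would be:
       #   condor_q -af ClusterId ProcId JobStatus
       # Here the queue contents are simulated so the pipeline can be shown.
       sample_queue() {
           printf '%s\n' \
               '27 0 2' \
               '27 1 2' \
               '27 2 1' \
               '28 0 5'
       }

       # Count jobs by status code; immune to changes in condor_q's
       # human-oriented default layout.
       sample_queue | awk '{ count[$3]++ }
           END { printf "idle=%d running=%d held=%d\n", count[1], count[2], count[5] }'
       ```

       The same pattern extends to any attribute set: one -af (or
       -format) argument per field, one awk or read loop per consumer.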

       Although -analyze provides a very good first approximation, the
       analyzer cannot diagnose all possible situations, because the
       analysis is based on instantaneous and local information.
       Therefore, there are some situations, such as when several
       submitters are contending for resources or when the pool is
       rapidly changing state, which cannot be accurately diagnosed.

       It is possible to hold jobs that are in the X (removed) state. To
       avoid this, construct a -constraint expression that contains
       JobStatus != 3 if the user wishes to avoid this condition.

EXAMPLES
       The -format option provides a way to specify both the job
       attributes and formatting of those attributes. There must be only
       one conversion specification per -format option. As an example, to
       list only Jane Doe's jobs in the queue, choosing to print and
       format only the owner of the job, the command line arguments for
       the job, and the process ID of the job:

          $ condor_q -submitter jdoe -format "%s" Owner -format " %s " Args -format " ProcId = %d\n" ProcId
          jdoe 16386 2800 ProcId = 0
          jdoe 16386 3000 ProcId = 1
          jdoe 16386 3200 ProcId = 2
          jdoe 16386 3400 ProcId = 3
          jdoe 16386 3600 ProcId = 4
          jdoe 16386 4200 ProcId = 7

       To display only the JobIDs of Jane Doe's jobs, use the following:

          $ condor_q -submitter jdoe -format "%d." ClusterId -format "%d\n" ProcId
          27.0
          27.1
          27.2
          27.3
          27.4
          27.7

       An example that shows the analysis in summary format:

          $ condor_q -analyze:summary

          -- Submitter: submit-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> :
           submit-1.chtc.wisc.edu
          Analyzing matches for 5979 slots
                      Autocluster  Matches    Machine     Running  Serving
           JobId     Members/Idle  Reqmnts  Rejects Job  Users Job Other User Avail Owner
          ---------- ------------ -------- ------------ ---------- ---------- ----- -----
          25764522.0          7/0     5910          820       7/10       5046    34 smith
          25764682.0          9/0     2172          603        9/9       1531    29 smith
          25765082.0         18/0     2172          603       18/9       1531    29 smith
          25765900.0          1/0     2172          603        1/9       1531    29 smith

       An example that shows summary information by machine:

          $ condor_q -ana:sum,rev

          -- Submitter: s-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> : s-1.chtc.wisc.edu
          Analyzing matches for 2885 jobs
                                       Slot  Slot's Req    Job's Req     Both
          Name                         Type  Matches Job  Matches Slot  Match %
          ------------------------     ----  -----------  ------------  -------
          slot1@INFO.wisc.edu          Stat         2729             0     0.00
          slot2@INFO.wisc.edu          Stat         2729             0     0.00
          slot1@aci-001.chtc.wisc.edu  Part            0          2793     0.00
          slot1_1@a-001.chtc.wisc.edu  Dyn          2644          2792    91.37
          slot1_2@a-001.chtc.wisc.edu  Dyn          2623          2601    85.10
          slot1_3@a-001.chtc.wisc.edu  Dyn          2644          2632    85.82
          slot1_4@a-001.chtc.wisc.edu  Dyn          2644          2792    91.37
          slot1@a-002.chtc.wisc.edu    Part            0          2633     0.00
          slot1_10@a-002.chtc.wisc.edu Dyn          2623          2601    85.10

       An example with two independent DAGs in the queue:

          $ condor_q

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:35169?...
          OWNER  BATCH_NAME      SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
          wenger DAG: 3696      2/12 11:55      _     10      _     10 3698.0 ... 3707.0
          wenger DAG: 3697      2/12 11:55      1      1      1     10 3709.0 ... 3710.0

          14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended

       Note that the "13 running" in the last line is two more than the
       total of the RUN column, because the two condor_dagman jobs
       themselves are counted in the last line but not in the RUN column.

       Also note that the "completed" value in the last line does not
       correspond to the total of the DONE column, because the "completed"
       value in the last line only counts jobs that are completed but
       still in the queue, whereas the DONE column counts jobs that are no
       longer in the queue.
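       That accounting can be checked with a little shell arithmetic; the
       values below are copied from the sample output above:

       ```shell
       # RUN column from the sample: 10 (first DAG) + 1 (second DAG)
       run_column_total=$((10 + 1))
       # The two condor_dagman scheduler jobs are counted as running in the
       # totals line, but never appear in the RUN column.
       dagman_jobs=2
       echo "running=$((run_column_total + dagman_jobs))"
       ```

       The result, 13, matches the "13 running" in the totals line.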

       Here's an example with a held job, illustrating the addition of the
       HOLD column to the output:

          $ condor_q

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
          OWNER  BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
          wenger CMD: /bin/slee 9/13 16:25      _      3      _      1      4 599.0 ...

          4 jobs; 0 completed, 0 removed, 0 idle, 3 running, 1 held, 0 suspended
791
792 Here are some examples with a nested-DAG workflow in the queue, which
793 is one of the most complicated cases. The workflow consists of a
794 top-level DAG with nodes NodeA and NodeB, each consisting of a single
795 two-proc cluster, and a sub-DAG SubZ with nodes NodeSA and NodeSB,
796 each also consisting of a single two-proc cluster.
797
798 First of all, non-batch mode with all of the node jobs in the queue:
799
800 $ condor_q -nobatch
801
802 -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
803 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
804 591.0 wenger 9/13 16:05 0+00:00:13 R 0 2.4 condor_dagman -p 0
805 592.0 wenger 9/13 16:05 0+00:00:07 R 0 0.0 sleep 60
806 592.1 wenger 9/13 16:05 0+00:00:07 R 0 0.0 sleep 300
807 593.0 wenger 9/13 16:05 0+00:00:07 R 0 0.0 sleep 60
808 593.1 wenger 9/13 16:05 0+00:00:07 R 0 0.0 sleep 300
809 594.0 wenger 9/13 16:05 0+00:00:07 R 0 2.4 condor_dagman -p 0
810 595.0 wenger 9/13 16:05 0+00:00:01 R 0 0.0 sleep 60
811 595.1 wenger 9/13 16:05 0+00:00:01 R 0 0.0 sleep 300
812 596.0 wenger 9/13 16:05 0+00:00:01 R 0 0.0 sleep 60
813 596.1 wenger 9/13 16:05 0+00:00:01 R 0 0.0 sleep 300
814
815 10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
816
817 Now non-batch mode with the -dag option (note that condor_q does not
818 group procs from the same cluster together in this listing):
819
820 $ condor_q -nobatch -dag
821
822 -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
823 ID OWNER/NODENAME SUBMITTED RUN_TIME ST PRI SIZE CMD
824 591.0 wenger 9/13 16:05 0+00:00:27 R 0 2.4 condor_dagman -
825 592.0 |-NodeA 9/13 16:05 0+00:00:21 R 0 0.0 sleep 60
826 593.0 |-NodeB 9/13 16:05 0+00:00:21 R 0 0.0 sleep 60
827 594.0 |-SubZ 9/13 16:05 0+00:00:21 R 0 2.4 condor_dagman -
828 595.0 |-NodeSA 9/13 16:05 0+00:00:15 R 0 0.0 sleep 60
829 596.0 |-NodeSB 9/13 16:05 0+00:00:15 R 0 0.0 sleep 60
830 592.1 |-NodeA 9/13 16:05 0+00:00:21 R 0 0.0 sleep 300
831 593.1 |-NodeB 9/13 16:05 0+00:00:21 R 0 0.0 sleep 300
832 595.1 |-NodeSA 9/13 16:05 0+00:00:15 R 0 0.0 sleep 300
833 596.1 |-NodeSB 9/13 16:05 0+00:00:15 R 0 0.0 sleep 300
834
835 10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
836
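If grouped output is wanted anyway, the listing can be post-processed outside of condor_q. A sketch that re-sorts the job IDs from the -dag example above numerically by (cluster, proc), so each cluster's procs appear together (a hypothetical helper, not a condor_q option; note that re-sorting discards the node-name indentation):

```python
# Job IDs in the order condor_q -nobatch -dag printed them above.
ids = ["591.0", "592.0", "593.0", "594.0", "595.0", "596.0",
       "592.1", "593.1", "595.1", "596.1"]

# Sort numerically by (cluster, proc) so procs in a cluster group together.
ids.sort(key=lambda s: tuple(int(part) for part in s.split(".")))
print(ids)
# ['591.0', '592.0', '592.1', '593.0', '593.1',
#  '594.0', '595.0', '595.1', '596.0', '596.1']
```
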
837 Now, finally, the batch (default) mode:
838
839 $ condor_q
840
841 -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
842 OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS
843 wenger ex1.dag+591 9/13 16:05 _ 8 _ 5 592.0 ... 596.1
844
845 10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
846
847 There are several things about this output that may be slightly confus‐
848 ing:
849
850 • The TOTAL column is less than the RUN column. This is because, for
851 DAG node jobs, their contribution to the TOTAL column is the number
852 of clusters, not the number of procs (but their contribution to the
853 RUN column is the number of procs). So the four DAG nodes (8 procs)
854 contribute 4, and the sub-DAG contributes 1, to the TOTAL column.
855 (But, somewhat confusingly, the sub-DAG job is not counted in the RUN
856 column.)
857
858 • The sum of the RUN and IDLE columns (8) is less than the 10 jobs
859 listed in the totals line at the bottom. This is because the
860 top-level DAG and sub-DAG jobs are not counted in the RUN column, but
861 they are counted in the totals line.
862
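The counting rules in the two notes above can be sketched directly, with the job IDs copied from the -nobatch listing (the grouping logic here is illustrative, not condor_q's implementation):

```python
# DAG node procs (four clusters of two procs each) and the sub-DAG job,
# from the nested-DAG example above.
node_procs = ["592.0", "592.1", "593.0", "593.1",
              "595.0", "595.1", "596.0", "596.1"]
sub_dag_jobs = ["594.0"]

# Node jobs contribute clusters to TOTAL, but procs to RUN.
clusters = {p.split(".")[0] for p in node_procs}
total_column = len(clusters) + len(sub_dag_jobs)  # node clusters + sub-DAG
run_column = len(node_procs)                      # procs; dagman jobs excluded
print(total_column, run_column)  # 5 8
```
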
863 Now here is non-batch mode after proc 0 of each node job has finished:
864
865 $ condor_q -nobatch
866
867 -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
868 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
869 591.0 wenger 9/13 16:05 0+00:01:19 R 0 2.4 condor_dagman -p 0
870 592.1 wenger 9/13 16:05 0+00:01:13 R 0 0.0 sleep 300
871 593.1 wenger 9/13 16:05 0+00:01:13 R 0 0.0 sleep 300
872 594.0 wenger 9/13 16:05 0+00:01:13 R 0 2.4 condor_dagman -p 0
873 595.1 wenger 9/13 16:05 0+00:01:07 R 0 0.0 sleep 300
874 596.1 wenger 9/13 16:05 0+00:01:07 R 0 0.0 sleep 300
875
876 6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended
877
878 The same state, this time with the -dag option:
879
880 $ condor_q -nobatch -dag
881
882 -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
883 ID OWNER/NODENAME SUBMITTED RUN_TIME ST PRI SIZE CMD
884 591.0 wenger 9/13 16:05 0+00:01:30 R 0 2.4 condor_dagman -
885 592.1 |-NodeA 9/13 16:05 0+00:01:24 R 0 0.0 sleep 300
886 593.1 |-NodeB 9/13 16:05 0+00:01:24 R 0 0.0 sleep 300
887 594.0 |-SubZ 9/13 16:05 0+00:01:24 R 0 2.4 condor_dagman -
888 595.1 |-NodeSA 9/13 16:05 0+00:01:18 R 0 0.0 sleep 300
889 596.1 |-NodeSB 9/13 16:05 0+00:01:18 R 0 0.0 sleep 300
890
891 6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended
892
893 And, finally, that state in batch (default) mode:
894
895 $ condor_q
896
897 -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
898 OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS
899 wenger ex1.dag+591 9/13 16:05 _ 4 _ 5 592.1 ... 596.1
900
901 6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended
902
903 EXIT STATUS
904 condor_q will exit with a status value of 0 (zero) upon success, and it
905 will exit with the value 1 (one) upon failure.
906
907 AUTHOR
908 HTCondor Team
909
910 COPYRIGHT
911 1990-2023, Center for High Throughput Computing, Computer Sciences De‐
912 partment, University of Wisconsin-Madison, Madison, WI, US. Licensed
913 under the Apache License, Version 2.0.
914
915
916
917
918 Oct 02, 2023 CONDOR_Q(1)