CONDOR_Q(1)                      HTCondor Manual                     CONDOR_Q(1)

NAME
       condor_q - HTCondor Manual

       Display information about jobs in queue

SYNOPSIS
       condor_q [-help [Universe | State]]

       condor_q [-debug] [general options] [restriction list]
                [output options] [analyze options]

DESCRIPTION
       condor_q displays information about jobs in the HTCondor job
       queue. By default, condor_q queries the local job queue, but this
       behavior may be modified by specifying one of the general options.

       As of version 8.5.2, condor_q defaults to querying only the
       current user's jobs. This default is overridden when the
       restriction list has usernames and/or job ids, when the -submitter
       or -allusers arguments are specified, or when the current user is
       a queue superuser. It can also be overridden by setting the
       CONDOR_Q_ONLY_MY_JOBS configuration macro to False.
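
       For example, either of the following shows jobs from all users;
       the first applies to a single invocation, and the second (placed
       in configuration) makes it the default:

          $ condor_q -allusers

          CONDOR_Q_ONLY_MY_JOBS = False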

       As of version 8.5.6, condor_q defaults to batch-mode output (see
       -batch in the Options section below). The old behavior can be
       obtained by specifying -nobatch on the command line. To change the
       default back to its pre-8.5.6 value, set the new configuration
       variable CONDOR_Q_DASH_BATCH_IS_DEFAULT to False.
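
       For example, to get the old one-line-per-job display for a single
       invocation, or to make it the default via configuration:

          $ condor_q -nobatch

          CONDOR_Q_DASH_BATCH_IS_DEFAULT = False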

   Batches of jobs
       As of version 8.5.6, condor_q defaults to displaying information
       about batches of jobs, rather than individual jobs. The intention
       is that this will be a more useful, and user-friendly, format for
       users with large numbers of jobs in the queue. Ideally, users will
       specify meaningful batch names for their jobs, to make it easier
       to keep track of related jobs.

       (For information about specifying batch names for your jobs, see
       the condor_submit and condor_submit_dag manual pages.)
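
       As an illustration (the name nightly_run and the file job.sub are
       placeholders), a batch name may be given either with the
       batch_name command in the submit description file or with the
       -batch-name option on the condor_submit command line, both
       described in the condor_submit manual page:

          # in the submit description file
          batch_name = nightly_run

          # or on the command line
          $ condor_submit -batch-name nightly_run job.sub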

       A batch of jobs is defined as follows:

       · An entire workflow (a DAG or hierarchy of nested DAGs) (note
         that condor_dagman now specifies a default batch name for all
         jobs in a given workflow)

       · All jobs in a single cluster

       · All jobs submitted by a single user that have the same
         executable specified in their submit file (unless submitted
         with different batch names)

       · All jobs submitted by a single user that have the same batch
         name specified in their submit file or on the condor_submit or
         condor_submit_dag command line.

   Output
       There are many output options that modify the output generated by
       condor_q. The effects of these options, and the meanings of the
       various output data, are described below.

   Output options
       If the -long option is specified, condor_q displays a long
       description of the queried jobs by printing the entire job ClassAd
       for all jobs matching the restrictions, if any. Individual
       attributes of the job ClassAd can be displayed by means of the
       -format option, which displays attributes with a printf(3) format,
       or with the -autoformat option. Multiple -format options may be
       specified in the option list to display several attributes of the
       job.

       For most output options (except as specified), the last line of
       condor_q output contains a summary of the queue: the total number
       of jobs, and the number of jobs in the completed, removed, idle,
       running, held and suspended states.

       If no output options are specified, condor_q now defaults to batch
       mode, and displays the following columns of information, with one
       line of output per batch of jobs:

          OWNER, BATCH_NAME, SUBMITTED, DONE, RUN, IDLE, [HOLD,] TOTAL, JOB_IDS

       Note that the HOLD column is only shown if there are held jobs in
       the output or if there are no jobs in the output.

       If the -nobatch option is specified, condor_q displays the
       following columns of information, with one line of output per job:

          ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

       If the -dag option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one
       line of output per job; the owner is shown only for top-level
       jobs, and for all other jobs (including sub-DAGs) the node name is
       shown:

          ID, OWNER/NODENAME, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

       If the -run option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one
       line of output per running job:

          ID, OWNER, SUBMITTED, RUN_TIME, HOST(S)

       Also note that the -run option disables output of the totals line.

       If the -grid option is specified, condor_q displays the following
       columns of information, with one line of output per job:

          ID, OWNER, STATUS, GRID->MANAGER, HOST, GRID_JOB_ID

       If the -grid:ec2 option is specified, condor_q displays the
       following columns of information, with one line of output per job:

          ID, OWNER, STATUS, INSTANCE ID, CMD

       If the -goodput option is specified, condor_q displays the
       following columns of information, with one line of output per job:

          ID, OWNER, SUBMITTED, RUN_TIME, GOODPUT, CPU_UTIL, Mb/s

       If the -io option is specified, condor_q displays the following
       columns of information, with one line of output per job:

          ID, OWNER, RUNS, ST, INPUT, OUTPUT, RATE, MISC

       If the -cputime option is specified (in conjunction with
       -nobatch), condor_q displays the following columns of information,
       with one line of output per job:

          ID, OWNER, SUBMITTED, CPU_TIME, ST, PRI, SIZE, CMD

       If the -hold option is specified, condor_q displays the following
       columns of information, with one line of output per job:

          ID, OWNER, HELD_SINCE, HOLD_REASON

       If the -totals option is specified, condor_q displays only one
       line of output no matter how many jobs and batches of jobs are in
       the queue. That line of output contains the total number of jobs,
       and the number of jobs in the completed, removed, idle, running,
       held and suspended states.
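
       For example, a sketch of such an invocation; the hostname and all
       counts shown are invented for illustration:

          $ condor_q -totals

          -- Schedd: submit.example.org : <192.168.0.1:9618?...
          52 jobs; 0 completed, 0 removed, 12 idle, 40 running, 0 held, 0 suspended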

   Output data
       The available output data are as follows:

       ID     (Non-batch mode only) The cluster/process id of the
              HTCondor job.

       OWNER  The owner of the job or batch of jobs.

       OWNER/NODENAME
              (-dag only) The owner of a job or the DAG node name of the
              job.

       BATCH_NAME
              (Batch mode only) The batch name of the job or batch of
              jobs.

       SUBMITTED
              The month, day, hour, and minute the job was submitted to
              the queue.

       DONE   (Batch mode only) The number of job procs that are done,
              but still in the queue.

       RUN    (Batch mode only) The number of job procs that are running.

       IDLE   (Batch mode only) The number of job procs that are in the
              queue but idle.

       HOLD   (Batch mode only) The number of job procs that are in the
              queue but held.

       TOTAL  (Batch mode only) The total number of job procs in the
              queue, unless the batch is a DAG, in which case this is the
              total number of clusters in the queue. Note: for non-DAG
              batches, the TOTAL column contains correct values only in
              version 8.5.7 and later.

       JOB_IDS
              (Batch mode only) The range of job IDs belonging to the
              batch.

       RUN_TIME
              (Non-batch mode only) Wall-clock time accumulated by the
              job to date in days, hours, minutes, and seconds.

       ST     (Non-batch mode only) Current status of the job, which
              varies somewhat according to the job universe and the
              timing of updates. H = on hold, R = running, I = idle
              (waiting for a machine to execute on), C = completed, X =
              removed, S = suspended (execution of a running job
              temporarily suspended on the execute node), < =
              transferring input (or queued to do so), and > =
              transferring output (or queued to do so).

       PRI    (Non-batch mode only) User specified priority of the job,
              displayed as an integer, with higher numbers corresponding
              to better priority.

       SIZE   (Non-batch mode only) The peak amount of memory in Mbytes
              consumed by the job; note this value is only refreshed
              periodically. The actual value reported is taken from the
              job ClassAd attribute MemoryUsage if this attribute is
              defined, and from job attribute ImageSize otherwise.

       CMD    (Non-batch mode only) The name of the executable. For EC2
              jobs, this field is arbitrary.

       HOST(S)
              (-run only) The host where the job is running.

       STATUS (-grid only) The state that HTCondor believes the job is
              in. Possible values are grid-type specific, but include:

              PENDING
                     The job is waiting for resources to become available
                     in order to run.

              ACTIVE The job has received resources, and the application
                     is executing.

              FAILED The job terminated before completion because of an
                     error, user-triggered cancel, or system-triggered
                     cancel.

              DONE   The job completed successfully.

              SUSPENDED
                     The job has been suspended. Resources which were
                     allocated for this job may have been released due to
                     a scheduler-specific reason.

              UNSUBMITTED
                     The job has not been submitted to the scheduler yet,
                     pending the reception of the
                     GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST
                     signal from a client.

              STAGE_IN
                     The job manager is staging in files, in order to run
                     the job.

              STAGE_OUT
                     The job manager is staging out files generated by
                     the job.

              UNKNOWN
                     The job's state is unknown.

       GRID->MANAGER
              (-grid only) A guess at what remote batch system is running
              the job. It is a guess, because HTCondor looks at the
              Globus jobmanager contact string to attempt identification.
              If the value is fork, the job is running on the remote host
              without a jobmanager. Values may also be condor, lsf, or
              pbs.

       HOST   (-grid only) The host to which the job was submitted.

       GRID_JOB_ID
              (-grid only) The grid-type-specific job identifier, as
              recorded in the job ClassAd attribute GridJobId.

       INSTANCE ID
              (-grid:ec2 only) Usually the EC2 instance ID; may be blank
              or the client token, depending on job progress.

       GOODPUT
              (-goodput only) The percentage of RUN_TIME for this job
              which has been saved in a checkpoint. A low GOODPUT value
              indicates that the job is failing to checkpoint. If a job
              has not yet attempted a checkpoint, this column contains
              [?????].

       CPU_UTIL
              (-goodput only) The ratio of CPU_TIME to RUN_TIME for
              checkpointed work. A low CPU_UTIL indicates that the job is
              not running efficiently, perhaps because it is I/O bound or
              because the job requires more memory than available on the
              remote workstations. If the job has not (yet) checkpointed,
              this column contains [??????].

       Mb/s   (-goodput only) The network usage of this job, in Megabits
              per second of run-time.

       READ   The total number of bytes the application has read from
              files and sockets.

       WRITE  The total number of bytes the application has written to
              files and sockets.

       SEEK   The total number of seek operations the application has
              performed on files.

       XPUT   The effective throughput (average bytes read and written
              per second) from the application's point of view.

       BUFSIZE
              The maximum number of bytes to be buffered per file.

       BLOCKSIZE
              The desired block size for large data transfers.

       These fields are updated when a job produces a checkpoint or
       completes. If a job has not yet produced a checkpoint, this
       information is not available.

       INPUT  (-io only) For standard universe, FileReadBytes; otherwise,
              BytesRecvd.

       OUTPUT (-io only) For standard universe, FileWriteBytes;
              otherwise, BytesSent.

       RATE   (-io only) For standard universe,
              FileReadBytes+FileWriteBytes; otherwise,
              BytesRecvd+BytesSent.

       MISC   (-io only) JobUniverse.

       CPU_TIME
              (-cputime only) The remote CPU time accumulated by the job
              to date (which has been stored in a checkpoint) in days,
              hours, minutes, and seconds. (If the job is currently
              running, time accumulated during the current run is not
              shown. If the job has not produced a checkpoint, this
              column contains 0+00:00:00.)

       HELD_SINCE
              (-hold only) Month, day, hour and minute at which the job
              was held.

       HOLD_REASON
              (-hold only) The hold reason for the job.

   Analyze
       The -analyze or -better-analyze options can be used to determine
       why certain jobs are not running by performing an analysis on a
       per-machine basis for each machine in the pool. The reasons can
       vary among failed constraints, insufficient priority, resource
       owner preferences and prevention of preemption by the
       PREEMPTION_REQUIREMENTS expression. If the analyze option -verbose
       is specified along with the -analyze option, the reason for
       failure is displayed on a per-machine basis. -better-analyze
       differs from -analyze in that it will do matchmaking analysis on
       jobs even if they are currently running, or if the reason they are
       not running is not due to matchmaking. -better-analyze also
       produces a more thorough analysis of complex Requirements
       expressions and shows the values of relevant job ClassAd
       attributes. When only a single machine is being analyzed via
       -machine or -mconstraint, the values of relevant attributes of the
       machine ClassAd are also displayed.
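
       For example, to ask why a particular idle job is not running (the
       job id 123.0 is a placeholder):

          $ condor_q -better-analyze 123.0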

   Restrictions
       To restrict the display to jobs of interest, a list of zero or
       more restriction options may be supplied. Each restriction may be
       one of:

       · cluster.process, which matches jobs which belong to the
         specified cluster and have the specified process number;

       · cluster (without a process), which matches all jobs belonging
         to the specified cluster;

       · owner, which matches all jobs owned by the specified owner;

       · -constraint expression, which matches all jobs that satisfy the
         specified ClassAd expression;

       · -unmatchable, which matches all jobs that do not match any slot
         that would be considered by -better-analyze;

       · -allusers, which overrides the default restriction of only
         matching jobs submitted by the current user.

       If cluster or cluster.process is specified, and the job matching
       that restriction is a condor_dagman job, information for all jobs
       of that DAG is displayed in batch mode (in non-batch mode, only
       the condor_dagman job itself is displayed).

       If no owner restrictions are present, the job matches the
       restriction list if it matches at least one restriction in the
       list. If owner restrictions are present, the job matches the list
       if it matches one of the owner restrictions and at least one
       non-owner restriction.
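
       For example, the following sketch combines an owner restriction
       with a -constraint restriction to show only that user's held jobs
       (jdoe is a placeholder; a JobStatus value of 5 denotes the held
       state):

          $ condor_q jdoe -constraint 'JobStatus == 5'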

OPTIONS
       -debug Causes debugging information to be sent to stderr, based on
              the value of the configuration variable TOOL_DEBUG.

       -batch (output option) Show a single line of progress information
              for a batch of jobs, where a batch is defined as follows:

              · An entire workflow (a DAG or hierarchy of nested DAGs)

              · All jobs in a single cluster

              · All jobs submitted by a single user that have the same
                executable specified in their submit file

              · All jobs submitted by a single user that have the same
                batch name specified in their submit file or on the
                condor_submit or condor_submit_dag command line.

              Also change the output columns as noted above.

              Note that, as of version 8.5.6, -batch is the default,
              unless the CONDOR_Q_DASH_BATCH_IS_DEFAULT configuration
              variable is set to False.

       -nobatch
              (output option) Show a line for each job (turn off the
              -batch option).

       -global
              (general option) Queries all job queues in the pool.

       -submitter submitter
              (general option) List jobs of a specific submitter in the
              entire pool, not just for a single condor_schedd.

       -name name
              (general option) Query only the job queue of the named
              condor_schedd daemon.

       -pool centralmanagerhostname[:portnumber]
              (general option) Use the centralmanagerhostname as the
              central manager to locate condor_schedd daemons. The
              default is the COLLECTOR_HOST, as specified in the
              configuration.

       -jobads file
              (general option) Display jobs from a list of ClassAds from
              a file, instead of the real ClassAds from the condor_schedd
              daemon. This is most useful for debugging purposes. The
              ClassAds appear as if condor_q -long is used with the
              header stripped out.

       -userlog file
              (general option) Display jobs, with job information coming
              from a job event log, instead of from the real ClassAds
              from the condor_schedd daemon. This is most useful for
              automated testing of the status of jobs known to be in the
              given job event log, because it reduces the load on the
              condor_schedd. A job event log does not contain all of the
              job information, so some fields in the normal output of
              condor_q will be blank.

       -autocluster
              (output option) Output condor_schedd daemon auto cluster
              information. For each auto cluster, output the unique ID of
              the auto cluster along with the number of jobs in that auto
              cluster. This option is intended to be used together with
              the -long option to output the ClassAds representing auto
              clusters. The ClassAds can then be used to identify or
              classify the demand for sets of machine resources, which
              will be useful in the on-demand creation of execute nodes
              for glidein services.

       -cputime
              (output option) Instead of wall-clock allocation time
              (RUN_TIME), display remote CPU time accumulated by the job
              to date in days, hours, minutes, and seconds. If the job is
              currently running, time accumulated during the current run
              is not shown. Note that this option has no effect unless
              used in conjunction with -nobatch.

       -currentrun
              (output option) Normally, RUN_TIME contains all the time
              accumulated during the current run plus all previous runs.
              If this option is specified, RUN_TIME only displays the
              time accumulated so far on this current run.

       -dag   (output option) Display DAG node jobs under their DAGMan
              instance. Child nodes are listed using indentation to show
              the structure of the DAG. Note that this option has no
              effect unless used in conjunction with -nobatch.

       -expert
              (output option) Display shorter error messages.

       -grid  (output option) Get information only about jobs submitted
              to grid resources.

       -grid:ec2
              (output option) Get information only about jobs submitted
              to grid resources and display it in a format better-suited
              for EC2 than the default.

       -goodput
              (output option) Display job goodput statistics.

       -help [Universe | State]
              (output option) Print usage information; with the optional
              argument, also print the list of job universes or job
              states.

       -hold  (output option) Get information about jobs in the hold
              state. Also displays the time the job was placed into the
              hold state and the reason why the job was placed in the
              hold state.

       -limit Number
              (output option) Limit the number of items output to Number.

       -io    (output option) Display job input/output summaries.

       -long  (output option) Display entire job ClassAds in long format
              (one attribute per line).

       -run   (output option) Get information about running jobs. Note
              that this option has no effect unless used in conjunction
              with -nobatch.

       -stream-results
              (output option) Display results as jobs are fetched from
              the job queue rather than storing results in memory until
              all jobs have been fetched. This can reduce memory
              consumption when fetching large numbers of jobs, but if
              condor_q is paused while displaying results, this could
              result in a timeout in communication with condor_schedd.

       -totals
              (output option) Display only the totals.

       -version
              (output option) Print the HTCondor version and exit.

       -wide  (output option) If this option is specified, and the
              command portion of the output would cause the output to
              extend beyond 80 columns, display beyond the 80 columns.

       -xml   (output option) Display entire job ClassAds in XML format.
              The XML format is fully defined in the reference manual,
              obtained from the ClassAds web page, with a link at
              http://htcondor.org/classad/classad.html.

       -json  (output option) Display entire job ClassAds in JSON format.

       -attributes Attr1[,Attr2 ...]
              (output option) Explicitly list the attributes, by name in
              a comma separated list, which should be displayed when
              using the -xml, -json or -long options. Limiting the number
              of attributes increases the efficiency of the query.
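
              For example, a sketch that limits a -long query to three
              attributes (the attribute list shown is illustrative):

                 $ condor_q -long -attributes Owner,ClusterId,ProcId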

       -format fmt attr
              (output option) Display attribute or expression attr in
              format fmt. To display the attribute or expression the
              format must contain a single printf(3)-style conversion
              specifier. Attributes must be from the job ClassAd.
              Expressions are ClassAd expressions and may refer to
              attributes in the job ClassAd. If the attribute is not
              present in a given ClassAd and cannot be parsed as an
              expression, then the format option will be silently
              skipped. %r prints the unevaluated, or raw values. The
              conversion specifier must match the type of the attribute
              or expression. %s is suitable for strings such as Owner, %d
              for integers such as ClusterId, and %f for floating point
              numbers such as RemoteWallClockTime. %v identifies the type
              of the attribute, and then prints the value in an
              appropriate format. %V identifies the type of the
              attribute, and then prints the value in an appropriate
              format as it would appear in the -long format. As an
              example, strings used with %V will have quote marks. An
              incorrect format will result in undefined behavior. Do not
              use more than one conversion specifier in a given format.
              More than one conversion specifier will result in undefined
              behavior. To output multiple attributes repeat the -format
              option once for each desired attribute. Like printf(3)
              style formats, one may include other text that will be
              reproduced directly. A format without any conversion
              specifiers may be specified, but an attribute is still
              required. Include a backslash followed by an 'n' to specify
              a line break.

       -autoformat[:jlhVr,tng] attr1 [attr2 ...] or
       -af[:jlhVr,tng] attr1 [attr2 ...]
              (output option) Display attribute(s) or expression(s)
              formatted in a default way according to attribute types.
              This option takes an arbitrary number of attribute names as
              arguments, and prints out their values, with a space
              between each value and a newline character after the last
              value. It is like the -format option without format
              strings. This output option does not work in conjunction
              with any of the options -run, -currentrun, -hold, -grid,
              -goodput, or -io.

              It is assumed that no attribute names begin with a dash
              character, so that the next word that begins with a dash is
              the start of the next option. The autoformat option may be
              followed by a colon character and formatting qualifiers
              that change the output formatting from the default:

              j  print the job ID as the first field,

              l  label each field,

              h  print column headings before the first line of output,

              V  use %V rather than %v for formatting (string values are
                 quoted),

              r  print "raw", or unevaluated values,

              ,  add a comma character after each field,

              t  add a tab character before each field instead of the
                 default space character,

              n  add a newline character after each field,

              g  add a newline character between ClassAds, and suppress
                 spaces before each field.

              Use -af:h to get tabular values with headings.

              Use -af:lrng to get -long equivalent format.

              The newline and comma characters may not be used together.
              The l and h characters may not be used together.
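
              For example, a sketch that prints the job ID, owner, and
              job status in tabular form with headings:

                 $ condor_q -af:jh Owner JobStatus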

       -analyze[:<qual>]
              (analyze option) Perform a matchmaking analysis on why the
              requested jobs are not running. First a simple analysis
              determines if the job is not running due to not being in a
              runnable state. If the job is in a runnable state, then
              this option is equivalent to -better-analyze. <qual> is a
              comma separated list containing one or more of

              priority to consider user priority during the analysis

              summary to show a one line summary for each job or machine

              reverse to analyze machines, rather than jobs

       -better-analyze[:<qual>]
              (analyze option) Perform a more detailed matchmaking
              analysis to determine how many resources are available to
              run the requested jobs. This option is never meaningful for
              Scheduler universe jobs and only meaningful for grid
              universe jobs doing matchmaking. When this option is used
              in conjunction with the -unmatchable option, the output
              will be a list of job ids that don't match any of the
              available slots. <qual> is a comma separated list
              containing one or more of

              priority to consider user priority during the analysis

              summary to show a one line summary for each job or machine

              reverse to analyze machines, rather than jobs

       -machine name
              (analyze option) When doing matchmaking analysis, analyze
              only machine ClassAds that have slot or machine names that
              match the given name.

       -mconstraint expression
              (analyze option) When doing matchmaking analysis, match
              only machine ClassAds which match the ClassAd expression
              constraint.

       -slotads file
              (analyze option) When doing matchmaking analysis, use the
              machine ClassAds from the file instead of the ones from the
              condor_collector daemon. This is most useful for debugging
              purposes. The ClassAds appear as if condor_status -long is
              used.

       -userprios file
              (analyze option) When doing matchmaking analysis with
              priority, read user priorities from the file rather than
              the ones from the condor_negotiator daemon. This is most
              useful for debugging purposes or to speed up analysis in
              situations where the condor_negotiator daemon is slow to
              respond to condor_userprio requests. The file should be in
              the format produced by condor_userprio -long.

       -nouserprios
              (analyze option) Do not consider user priority during the
              analysis.

       -reverse-analyze
              (analyze option) Analyze machine requirements against jobs.

       -verbose
              (analyze option) When doing analysis, show progress and
              include the names of specific machines in the output.

GENERAL REMARKS
       The default output from condor_q is formatted to be human
       readable, not script readable. In an effort to make the output fit
       within 80 characters, values in some fields might be truncated.
       Furthermore, the HTCondor Project can (and does) change the
       formatting of this default output as we see fit. Therefore, any
       script that is attempting to parse data from condor_q is strongly
       encouraged to use the -format option (described above, examples
       given below).

       Although -analyze provides a very good first approximation, the
       analyzer cannot diagnose all possible situations, because the
       analysis is based on instantaneous and local information.
       Therefore, some situations, such as when several submitters are
       contending for resources or when the pool is rapidly changing
       state, cannot be accurately diagnosed.

       Options -goodput, -cputime, and -io are most useful for standard
       universe jobs, since they rely on values computed when a job
       produces a checkpoint.

       It is possible to hold jobs that are in the X state. Users who
       wish to avoid this condition should construct a -constraint
       expression that contains JobStatus != 3.
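
       For example, a sketch of such a constraint (a JobStatus value of 3
       denotes the removed state):

          $ condor_q -constraint 'JobStatus != 3'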

EXAMPLES
       The -format option provides a way to specify both the job
       attributes and formatting of those attributes. There must be only
       one conversion specification per -format option. As an example, to
       list only Jane Doe's jobs in the queue, choosing to print and
       format only the owner of the job, the command line arguments for
       the job, and the process ID of the job:

          $ condor_q -submitter jdoe -format "%s" Owner -format " %s " Args -format " ProcId = %d\n" ProcId
          jdoe 16386 2800 ProcId = 0
          jdoe 16386 3000 ProcId = 1
          jdoe 16386 3200 ProcId = 2
          jdoe 16386 3400 ProcId = 3
          jdoe 16386 3600 ProcId = 4
          jdoe 16386 4200 ProcId = 7

       To display only the job IDs of Jane Doe's jobs, use the following:

          $ condor_q -submitter jdoe -format "%d." ClusterId -format "%d\n" ProcId
          27.0
          27.1
          27.2
          27.3
          27.4
          27.7

       An example that shows the analysis in summary format:

          $ condor_q -analyze:summary

          -- Submitter: submit-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> :
           submit-1.chtc.wisc.edu
          Analyzing matches for 5979 slots
                     Autocluster  Matches    Machine      Running  Serving
           JobId     Members/Idle  Reqmnts  Rejects Job  Users Job Other User Avail Owner
          ---------- ------------ -------- ------------ ---------- ---------- ----- -----
          25764522.0          7/0     5910          820       7/10       5046    34 smith
          25764682.0          9/0     2172          603        9/9       1531    29 smith
          25765082.0         18/0     2172          603       18/9       1531    29 smith
          25765900.0          1/0     2172          603        1/9       1531    29 smith

       An example that shows summary information by machine:

          $ condor_q -ana:sum,rev

          -- Submitter: s-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> : s-1.chtc.wisc.edu
          Analyzing matches for 2885 jobs
                                        Slot  Slot's Req    Job's Req    Both
          Name                          Type  Matches Job  Matches Slot  Match %
          ------------------------      ----  -----------  ------------  ----------
          slot1@INFO.wisc.edu           Stat         2729             0        0.00
          slot2@INFO.wisc.edu           Stat         2729             0        0.00
          slot1@aci-001.chtc.wisc.edu   Part            0          2793        0.00
          slot1_1@a-001.chtc.wisc.edu   Dyn          2644          2792       91.37
          slot1_2@a-001.chtc.wisc.edu   Dyn          2623          2601       85.10
          slot1_3@a-001.chtc.wisc.edu   Dyn          2644          2632       85.82
          slot1_4@a-001.chtc.wisc.edu   Dyn          2644          2792       91.37
          slot1@a-002.chtc.wisc.edu     Part            0          2633        0.00
          slot1_10@a-002.chtc.wisc.edu  Dyn          2623          2601       85.10

       An example with two independent DAGs in the queue:

          $ condor_q

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:35169?...
          OWNER  BATCH_NAME  SUBMITTED   DONE  RUN  IDLE  TOTAL JOB_IDS
          wenger DAG: 3696   2/12 11:55    _    10    _      10 3698.0 ... 3707.0
          wenger DAG: 3697   2/12 11:55    1     1    1      10 3709.0 ... 3710.0

          14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended

       Note that the "13 running" in the last line is two more than the
       total of the RUN column, because the two condor_dagman jobs
       themselves are counted in the last line but not the RUN column.

       Also note that the "completed" value in the last line does not
       correspond to the total of the DONE column, because the
       "completed" value in the last line only counts jobs that are
       completed but still in the queue, whereas the DONE column counts
       jobs that are no longer in the queue.

       Here is an example with a held job, illustrating the addition of
       the HOLD column to the output:

          $ condor_q

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
          OWNER  BATCH_NAME      SUBMITTED   DONE  RUN  IDLE  HOLD  TOTAL JOB_IDS
          wenger CMD: /bin/slee  9/13 16:25    _     3    _     1      4 599.0 ...

          4 jobs; 0 completed, 0 removed, 0 idle, 3 running, 1 held, 0 suspended

       Here are some examples with a nested-DAG workflow in the queue,
       which is one of the most complicated cases. The workflow consists
       of a top-level DAG with nodes NodeA and NodeB, each consisting of
       a two-proc cluster, and a sub-DAG SubZ with nodes NodeSA and
       NodeSB, each also consisting of a two-proc cluster.

       First of all, non-batch mode with all of the node jobs in the
       queue:

          $ condor_q -nobatch

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
           ID      OWNER        SUBMITTED     RUN_TIME ST PRI SIZE CMD
          591.0   wenger      9/13 16:05   0+00:00:13 R  0    2.4 condor_dagman -p 0
          592.0   wenger      9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
          592.1   wenger      9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
          593.0   wenger      9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
          593.1   wenger      9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
          594.0   wenger      9/13 16:05   0+00:00:07 R  0    2.4 condor_dagman -p 0
          595.0   wenger      9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
          595.1   wenger      9/13 16:05   0+00:00:01 R  0    0.0 sleep 300
          596.0   wenger      9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
          596.1   wenger      9/13 16:05   0+00:00:01 R  0    0.0 sleep 300

          10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

       Now non-batch mode with the -dag option (unfortunately, condor_q
       doesn't do a good job of grouping procs in the same cluster
       together):

          $ condor_q -nobatch -dag

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
           ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD
          591.0   wenger          9/13 16:05   0+00:00:27 R  0    2.4 condor_dagman -
          592.0    |-NodeA        9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
          593.0    |-NodeB        9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
          594.0    |-SubZ         9/13 16:05   0+00:00:21 R  0    2.4 condor_dagman -
          595.0     |-NodeSA      9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
          596.0     |-NodeSB      9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
          592.1    |-NodeA        9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
          593.1    |-NodeB        9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
          595.1     |-NodeSA      9/13 16:05   0+00:00:15 R  0    0.0 sleep 300
          596.1     |-NodeSB      9/13 16:05   0+00:00:15 R  0    0.0 sleep 300

          10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

       Now, finally, the batch (default) mode:

          $ condor_q

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
          OWNER  BATCH_NAME   SUBMITTED   DONE  RUN  IDLE  TOTAL JOB_IDS
          wenger ex1.dag+591  9/13 16:05    _     8    _      5 592.0 ... 596.1

          10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

       There are several things about this output that may be slightly
       confusing:

       · The TOTAL column is less than the RUN column. This is because,
         for DAG node jobs, their contribution to the TOTAL column is
         the number of clusters, not the number of procs (but their
         contribution to the RUN column is the number of procs). So the
         four DAG nodes (8 procs) contribute 4, and the sub-DAG
         contributes 1, to the TOTAL column. (But, somewhat confusingly,
         the sub-DAG job is not counted in the RUN column.)

       · The sum of the RUN and IDLE columns (8) is less than the 10
         jobs listed in the totals line at the bottom. This is because
         the top-level DAG and sub-DAG jobs are not counted in the RUN
         column, but they are counted in the totals line.

       Now here is non-batch mode after proc 0 of each node job has
       finished:

          $ condor_q -nobatch

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
           ID      OWNER        SUBMITTED     RUN_TIME ST PRI SIZE CMD
          591.0   wenger      9/13 16:05   0+00:01:19 R  0    2.4 condor_dagman -p 0
          592.1   wenger      9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
          593.1   wenger      9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
          594.0   wenger      9/13 16:05   0+00:01:13 R  0    2.4 condor_dagman -p 0
          595.1   wenger      9/13 16:05   0+00:01:07 R  0    0.0 sleep 300
          596.1   wenger      9/13 16:05   0+00:01:07 R  0    0.0 sleep 300

          6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

       The same state also with the -dag option:

          $ condor_q -nobatch -dag

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
           ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD
          591.0   wenger          9/13 16:05   0+00:01:30 R  0    2.4 condor_dagman -
          592.1    |-NodeA        9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
          593.1    |-NodeB        9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
          594.0    |-SubZ         9/13 16:05   0+00:01:24 R  0    2.4 condor_dagman -
          595.1     |-NodeSA      9/13 16:05   0+00:01:18 R  0    0.0 sleep 300
          596.1     |-NodeSB      9/13 16:05   0+00:01:18 R  0    0.0 sleep 300

          6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

       And, finally, that state in batch (default) mode:

          $ condor_q

          -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
          OWNER  BATCH_NAME   SUBMITTED   DONE  RUN  IDLE  TOTAL JOB_IDS
          wenger ex1.dag+591  9/13 16:05    _     4    _      5 592.1 ... 596.1

          6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

EXIT STATUS
       condor_q will exit with a status value of 0 (zero) upon success,
       and it will exit with the value 1 (one) upon failure.

AUTHOR
       HTCondor Team

COPYRIGHT
       1990-2020, Center for High Throughput Computing, Computer Sciences
       Department, University of Wisconsin-Madison, Madison, WI, US.
       Licensed under the Apache License, Version 2.0.

8.8                              Aug 06, 2020                      CONDOR_Q(1)