condor_q(1)                 General Commands Manual                condor_q(1)

Name
condor_q - Display information about jobs in the HTCondor job queue

Synopsis
condor_q [-help [Universe | State]]

condor_q [-debug] [general options] [restriction list] [output options]
[analyze options]
Description
condor_q displays information about jobs in the HTCondor job queue. By
default, condor_q queries the local job queue, but this behavior may be
modified by specifying one of the general options.

As of version 8.5.2, condor_q defaults to querying only the current
user's jobs. This default is overridden when the restriction list
contains usernames and/or job ids, when the -submitter or -allusers
arguments are specified, or when the current user is a queue superuser.
It can also be overridden by setting the CONDOR_Q_ONLY_MY_JOBS
configuration macro to False.

As of version 8.5.6, condor_q defaults to batch-mode output (see
-batch in the Options section below). The old behavior can be obtained
by specifying -nobatch on the command line. To change the default back
to its pre-8.5.6 value, set the configuration variable
CONDOR_Q_DASH_BATCH_IS_DEFAULT to False.
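Both of these defaults can be pinned down in the configuration; a minimal
sketch, assuming a local configuration file that your condor_config
includes (the file name is an assumption, not mandated by HTCondor):

```
# Hypothetical condor_config.local fragment: restore pre-8.5.x behavior.
# Show one line per job instead of batch mode by default:
CONDOR_Q_DASH_BATCH_IS_DEFAULT = False
# Show all users' jobs by default, as before version 8.5.2:
CONDOR_Q_ONLY_MY_JOBS = False
```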
31
As of version 8.5.6, condor_q defaults to displaying information about
batches of jobs, rather than individual jobs. The intention is that
this is a more useful, and user-friendly, format for users with large
numbers of jobs in the queue. Ideally, users will specify meaningful
batch names for their jobs, to make it easier to keep track of related
jobs.

(For information about specifying batch names for your jobs, see the
condor_submit and condor_submit_dag man pages.)

A batch of jobs is defined as follows:

* An entire workflow (a DAG or hierarchy of nested DAGs) (note that
condor_dagman now specifies a default batch name for all jobs in a
given workflow)

* All jobs in a single cluster

* All jobs submitted by a single user that have the same executable
specified in their submit file (unless submitted with different batch
names)

* All jobs submitted by a single user that have the same batch name
specified in their submit file or on the condor_submit or
condor_submit_dag command line.
There are many output options that modify the output generated by
condor_q. The effects of these options, and the meanings of the
various output data, are described below.

Output options
If the -long option is specified, condor_q displays a long description
of the queried jobs by printing the entire job ClassAd for all jobs
matching the restrictions, if any. Individual attributes of the job
ClassAd can be displayed by means of the -format option, which displays
attributes with a printf(3) format, or with the -autoformat option.
Multiple -format options may be specified in the option list to display
several attributes of the job.

For most output options (except as specified), the last line of
condor_q output contains a summary of the queue: the total number of
jobs, and the number of jobs in the completed, removed, idle, running,
held and suspended states.

If no output options are specified, condor_q now defaults to batch
mode, and displays the following columns of information, with one line
of output per batch of jobs:

OWNER, BATCH_NAME, SUBMITTED, DONE, RUN, IDLE, [HOLD,] TOTAL, JOB_IDS

Note that the HOLD column is only shown if there are held jobs in the
output or if there are no jobs in the output.
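To make the batch-mode layout concrete, here is a sketch (my own code,
not part of HTCondor) that picks one such line apart; the sample line in
the usage note below is adapted from the Examples section, and an
underscore in a count column means zero.

```python
def parse_batch_line(line):
    """Split one default-mode condor_q output line into its columns.

    BATCH_NAME may contain spaces and SUBMITTED is two tokens, so the
    counts and JOB_IDS are located by walking in from the right end.
    """
    toks = line.split()
    # JOB_IDS tokens are job ids like 3698.0, possibly joined by "...".
    i = len(toks)
    while i > 0 and "." in toks[i - 1]:
        i -= 1
    job_ids = " ".join(toks[i:])
    # The run of integers/underscores before JOB_IDS holds the counts.
    j = i
    while j > 0 and (toks[j - 1].isdigit() or toks[j - 1] == "_"):
        j -= 1
    counts = [0 if t == "_" else int(t) for t in toks[j:i]]
    if len(counts) == 5:  # HOLD column present
        names = ["done", "run", "idle", "hold", "total"]
    else:
        names = ["done", "run", "idle", "total"]
    out = dict(zip(names, counts))
    out.update(owner=toks[0],
               batch_name=" ".join(toks[1:j - 2]),
               submitted=" ".join(toks[j - 2:j]),
               job_ids=job_ids)
    return out
```

For the sample line "wenger DAG: 3696 2/12 11:55 _ 10 _ 10 3698.0 ...
3707.0" this yields run=10, total=10, and job_ids "3698.0 ... 3707.0".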
87
If the -nobatch option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

If the -dag option is specified (in conjunction with -nobatch),
condor_q displays the following columns of information, with one line
of output per job; the owner is shown only for top-level jobs, and for
all other jobs (including sub-DAGs) the node name is shown:

ID, OWNER/NODENAME, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

If the -run option is specified (in conjunction with -nobatch),
condor_q displays the following columns of information, with one line
of output per running job:

ID, OWNER, SUBMITTED, RUN_TIME, HOST(S)

Also note that the -run option disables output of the totals line.

If the -grid option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, STATUS, GRID->MANAGER, HOST, GRID_JOB_ID

If the -grid:ec2 option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, STATUS, INSTANCE ID, CMD

If the -goodput option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, SUBMITTED, RUN_TIME, GOODPUT, CPU_UTIL, Mb/s

If the -io option is specified, condor_q displays the following columns
of information, with one line of output per job:

ID, OWNER, RUNS, ST, INPUT, OUTPUT, RATE, MISC

If the -cputime option is specified (in conjunction with -nobatch),
condor_q displays the following columns of information, with one line
of output per job:

ID, OWNER, SUBMITTED, CPU_TIME, ST, PRI, SIZE, CMD

If the -hold option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, HELD_SINCE, HOLD_REASON

If the -totals option is specified, condor_q displays only one line of
output no matter how many jobs and batches of jobs are in the queue.
That line of output contains the total number of jobs, and the number
of jobs in the completed, removed, idle, running, held and suspended
states.
145 Output data
146 The available output data are as follows:
147
148 ID
149
150 (Non-batch mode only) The cluster/process id of the HTCondor job.
151
152
153
154 OWNER
155
156 The owner of the job or batch of jobs.
157
158
159
160 OWNER/NODENAME
161
(-dag only) The owner of a job or the DAG node name of the job.
163
164
165
166 BATCH_NAME
167
168 (Batch mode only) The batch name of the job or batch of jobs.
169
170
171
172 SUBMITTED
173
174 The month, day, hour, and minute the job was submitted to the queue.
175
176
177
178 DONE
179
180 (Batch mode only) The number of job procs that are done, but still
181 in the queue.
182
183
184
185 RUN
186
187 (Batch mode only) The number of job procs that are running.
188
189
190
191 IDLE
192
193 (Batch mode only) The number of job procs that are in the queue but
194 idle.
195
196
197
198 HOLD
199
200 (Batch mode only) The number of job procs that are in the queue but
201 held.
202
203
204
205 TOTAL
206
207 (Batch mode only) The total number of job procs in the queue, unless
208 the batch is a DAG, in which case this is the total number of clus‐
209 ters in the queue. Note: for non-DAG batches, the TOTAL column con‐
210 tains correct values only in version 8.5.7 and later.
211
212
213
214 JOB_IDS
215
216 (Batch mode only) The range of job IDs belonging to the batch.
217
218
219
220 RUN_TIME
221
222 (Non-batch mode only) Wall-clock time accumulated by the job to date
223 in days, hours, minutes, and seconds.
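The days+hours:minutes:seconds rendering used by RUN_TIME (and by
CPU_TIME below) can be reproduced in a few lines; a sketch, with the
function name being my own:

```python
def fmt_time(total_seconds):
    """Render a duration the way condor_q does: days '+' HH:MM:SS,
    matching values such as 0+00:00:13 in the Examples section."""
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}+{hours:02d}:{minutes:02d}:{seconds:02d}"
```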
224
225
226
227 ST
228
229 (Non-batch mode only) Current status of the job, which varies some‐
230 what according to the job universe and the timing of updates. H = on
231 hold, R = running, I = idle (waiting for a machine to execute on), C
232 = completed, X = removed, S = suspended (execution of a running job
233 temporarily suspended on execute node), < = transferring input (or
234 queued to do so), and > = transferring output (or queued to do so).
235
236
237
238 PRI
239
240 (Non-batch mode only) User specified priority of the job, displayed
241 as an integer, with higher numbers corresponding to better priority.
242
243
244
245 SIZE
246
(Non-batch mode only) The peak amount of memory in Mbytes consumed by
the job; note this value is only refreshed periodically. The actual
value reported is taken from the job ClassAd attribute MemoryUsage if
this attribute is defined, and from the job attribute ImageSize
otherwise.
252
253
254
255 CMD
256
257 (Non-batch mode only) The name of the executable. For EC2 jobs, this
258 field is arbitrary.
259
260
261
262 HOST(S)
263
(-run only) The host where the job is running.
265
266
267
268 STATUS
269
(-grid only) The state that HTCondor believes the job is in. Possible
values are grid-type specific, but include:
272
273 PENDING
274
275 The job is waiting for resources to become available in order to
276 run.
277
278
279
280 ACTIVE
281
282 The job has received resources, and the application is executing.
283
284
285
286 FAILED
287
288 The job terminated before completion because of an error, user-
289 triggered cancel, or system-triggered cancel.
290
291
292
293 DONE
294
295 The job completed successfully.
296
297
298
299 SUSPENDED
300
The job has been suspended. Resources which were allocated for this
job may have been released due to a scheduler-specific reason.
304
305
306
307 UNSUBMITTED
308
309 The job has not been submitted to the scheduler yet, pending the
310 reception of the GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST
311 signal from a client.
312
313
314
315 STAGE_IN
316
317 The job manager is staging in files, in order to run the job.
318
319
320
321 STAGE_OUT
322
323 The job manager is staging out files generated by the job.
324
325
326
327 UNKNOWN
328
329
330
331
332
333
334
335 GRID->MANAGER
336
(-grid only) A guess at what remote batch system is running the job.
It is a guess, because HTCondor looks at the Globus jobmanager contact
string to attempt identification. If the value is fork, the job is
running on the remote host without a jobmanager. Values may also be
condor, lsf, or pbs.
342
343
344
345 HOST
346
(-grid only) The host to which the job was submitted.
348
349
350
351 GRID_JOB_ID
352
(-grid only) (More information needed here.)
354
355
356
357 INSTANCE ID
358
(-grid:ec2 only) Usually the EC2 instance ID; may be blank or the
client token, depending on job progress.
361
362
363
364 GOODPUT
365
(-goodput only) The percentage of RUN_TIME for this job which has been
saved in a checkpoint. A low GOODPUT value indicates that the job is
failing to checkpoint. If a job has not yet attempted a checkpoint,
this column contains [?????].
370
371
372
373 CPU_UTIL
374
(-goodput only) The ratio of CPU_TIME to RUN_TIME for checkpointed
work. A low CPU_UTIL indicates that the job is not running
efficiently, perhaps because it is I/O bound or because the job
requires more memory than is available on the remote workstations. If
the job has not (yet) checkpointed, this column contains [??????].
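As a back-of-the-envelope illustration of how these two columns relate,
per the definitions above (illustrative arithmetic only, not HTCondor
source code):

```python
def goodput_pct(checkpointed_secs, run_secs):
    """GOODPUT: the fraction of RUN_TIME saved in a checkpoint, as a
    percentage."""
    return 100.0 * checkpointed_secs / run_secs

def cpu_util(cpu_secs, run_secs):
    """CPU_UTIL: the ratio of CPU_TIME to RUN_TIME, computed over the
    checkpointed portion of the run."""
    return cpu_secs / run_secs
```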
380
381
382
383 Mb/s
384
(-goodput only) The network usage of this job, in Megabits per second
of run-time.
387
388
389
390 READ The total number of bytes the application has read from files
391 and sockets.
392
393
394
395 WRITE The total number of bytes the application has written to files
396 and sockets.
397
398
399
SEEK The total number of seek operations the application has performed
on files.
402
403
404
405 XPUT The effective throughput (average bytes read and written per
406 second) from the application's point of view.
407
408
409
410 BUFSIZE The maximum number of bytes to be buffered per file.
411
412
413
414 BLOCKSIZE The desired block size for large data transfers. These
415 fields are updated when a job produces a checkpoint or completes. If
416 a job has not yet produced a checkpoint, this information is not
417 available.
418
419
420
421 INPUT
422
(-io only) For standard universe, FileReadBytes; otherwise, BytesRecvd.
425
426
427
428 OUTPUT
429
(-io only) For standard universe, FileWriteBytes; otherwise, BytesSent.
432
433
434
435 RATE
436
(-io only) For standard universe, FileReadBytes+FileWriteBytes;
otherwise, BytesRecvd+BytesSent.
439
440
441
442 MISC
443
(-io only) JobUniverse.
445
446
447
448 CPU_TIME
449
(-cputime only) The remote CPU time accumulated by the job to date
(which has been stored in a checkpoint) in days, hours, minutes, and
seconds. (If the job is currently running, time accumulated during the
current run is not shown. If the job has not produced a checkpoint,
this column contains 0+00:00:00.)
455
456
457
458 HELD_SINCE
459
(-hold only) Month, day, hour and minute at which the job was held.
461
462
463
464 HOLD_REASON
465
(-hold only) The hold reason for the job.
467
468
469
Analyze
The -analyze or -better-analyze options can be used to determine why
certain jobs are not running by performing an analysis on a per-machine
basis for each machine in the pool. The reasons can vary among failed
constraints, insufficient priority, resource owner preferences and
prevention of preemption by the PREEMPTION_REQUIREMENTS expression. If
the analyze option -verbose is specified along with the -analyze
option, the reason for failure is displayed on a per-machine basis.
-better-analyze differs from -analyze in that it will do matchmaking
analysis on jobs even if they are currently running, or if the reason
they are not running is not due to matchmaking. -better-analyze also
produces a more thorough analysis of complex Requirements expressions
and shows the values of relevant job ClassAd attributes. When only a
single machine is being analyzed via -machine or -mconstraint, the
values of relevant attributes of the machine ClassAd are also
displayed.
485
Restrictions
To restrict the display to jobs of interest, a list of zero or more
restriction options may be supplied. Each restriction may be one of:

* cluster.process, which matches jobs which belong to the specified
cluster and have the specified process number;

* cluster (without a process), which matches all jobs belonging to the
specified cluster;

* owner, which matches all jobs owned by the specified owner;

* -constraint expression, which matches all jobs that satisfy the
specified ClassAd expression;

* -unmatchable expression, which matches all jobs that do not match
any slot that would be considered by -better-analyze;

* -allusers, which overrides the default restriction of only matching
jobs submitted by the current user.

If cluster or cluster.process is specified, and the job matching that
restriction is a condor_dagman job, information for all jobs of that
DAG is displayed in batch mode (in non-batch mode, only the
condor_dagman job itself is displayed).

If no owner restrictions are present, the job matches the restriction
list if it matches at least one restriction in the list. If owner
restrictions are present, the job matches the list if it matches one
of the owner restrictions and at least one non-owner restriction.
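The owner/non-owner matching rule above can be sketched as a predicate
combination (hypothetical helper, not HTCondor code; the predicate
lists stand in for the two kinds of restrictions):

```python
def matches_restrictions(job, owner_preds, other_preds):
    """A job matches the restriction list when it satisfies at least
    one owner restriction (if any were given) and at least one
    non-owner restriction (if any were given)."""
    owner_ok = not owner_preds or any(p(job) for p in owner_preds)
    other_ok = not other_preds or any(p(job) for p in other_preds)
    return owner_ok and other_ok
```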
516
Options
-debug

Causes debugging information to be sent to stderr, based on the value
of the configuration variable TOOL_DEBUG.
522
523
524
-batch

(output option) Show a single line of progress information for a batch
of jobs, where a batch is defined as follows:

* An entire workflow (a DAG or hierarchy of nested DAGs)

* All jobs in a single cluster

* All jobs submitted by a single user that have the same executable
specified in their submit file

* All jobs submitted by a single user that have the same batch name
specified in their submit file or on the condor_submit or
condor_submit_dag command line.

This option also changes the output columns as noted above.

Note that, as of version 8.5.6, -batch is the default, unless the
CONDOR_Q_DASH_BATCH_IS_DEFAULT configuration variable is set to False.
545
546
547
-nobatch

(output option) Show a line for each job (turn off the -batch option).
552
553
554
555 -global
556
557 (general option) Queries all job queues in the pool.
558
559
560
561 -submitter submitter
562
563 (general option) List jobs of a specific submitter in the entire
564 pool, not just for a single condor_schedd.
565
566
567
-name name

(general option) Query only the job queue of the named condor_schedd
daemon.
572
573
574
-pool centralmanagerhostname[:portnumber]

(general option) Use the centralmanagerhostname as the central manager
to locate condor_schedd daemons. The default is the COLLECTOR_HOST, as
specified in the configuration.
580
581
582
-jobads file

(general option) Display jobs from a list of ClassAds from a file,
instead of the real ClassAds from the condor_schedd daemon. This is
most useful for debugging purposes. The ClassAds appear as if condor_q
-long is used with the header stripped out.
589
590
591
-userlog file

(general option) Display jobs, with job information coming from a job
event log, instead of from the real ClassAds from the condor_schedd
daemon. This is most useful for automated testing of the status of
jobs known to be in the given job event log, because it reduces the
load on the condor_schedd. A job event log does not contain all of the
job information, so some fields in the normal output of condor_q will
be blank.
601
602
603
-autocluster

(output option) Output condor_schedd daemon auto cluster information.
For each auto cluster, output the unique ID of the auto cluster along
with the number of jobs in that auto cluster. This option is intended
to be used together with the -long option to output the ClassAds
representing auto clusters. The ClassAds can then be used to identify
or classify the demand for sets of machine resources, which will be
useful in the on-demand creation of execute nodes for glidein
services.
614
615
616
-cputime

(output option) Instead of wall-clock allocation time (RUN_TIME),
display remote CPU time accumulated by the job to date in days, hours,
minutes, and seconds. If the job is currently running, time
accumulated during the current run is not shown. Note that this
option has no effect unless used in conjunction with -nobatch.
624
625
626
627 -currentrun
628
629 (output option) Normally, RUN_TIME contains all the time accumulated
630 during the current run plus all previous runs. If this option is
631 specified, RUN_TIME only displays the time accumulated so far on
632 this current run.
633
634
635
636 -dag
637
638 (output option) Display DAG node jobs under their DAGMan instance.
639 Child nodes are listed using indentation to show the structure of
640 the DAG. Note that this option has no effect unless used in conjunc‐
641 tion with -nobatch.
642
643
644
645 -expert
646
647 (output option) Display shorter error messages.
648
649
650
651 -grid
652
653 (output option) Get information only about jobs submitted to grid
654 resources.
655
656
657
658 -grid:ec2
659
660 (output option) Get information only about jobs submitted to grid
661 resources and display it in a format better-suited for EC2 than the
662 default.
663
664
665
666 -goodput
667
668 (output option) Display job goodput statistics.
669
670
671
-help [Universe | State]

(output option) Print usage information and, optionally, also print
the job universes or job states.
676
677
678
679 -hold
680
681 (output option) Get information about jobs in the hold state. Also
682 displays the time the job was placed into the hold state and the
683 reason why the job was placed in the hold state.
684
685
686
687 -limit Number
688
689 (output option) Limit the number of items output to Number.
690
691
692
693 -io
694
695 (output option) Display job input/output summaries.
696
697
698
699 -long
700
701 (output option) Display entire job ClassAds in long format (one
702 attribute per line).
703
704
705
706 -run
707
708 (output option) Get information about running jobs. Note that this
709 option has no effect unless used in conjunction with -nobatch.
710
711
712
-stream-results

(output option) Display results as jobs are fetched from the job queue
rather than storing results in memory until all jobs have been
fetched. This can reduce memory consumption when fetching large
numbers of jobs, but if condor_q is paused while displaying results,
this could result in a timeout in communication with the
condor_schedd.
720
721
722
723 -totals
724
725 (output option) Display only the totals.
726
727
728
729 -version
730
731 (output option) Print the HTCondor version and exit.
732
733
734
735 -wide
736
737 (output option) If this option is specified, and the command portion
738 of the output would cause the output to extend beyond 80 columns,
739 display beyond the 80 columns.
740
741
742
743 -xml
744
(output option) Display entire job ClassAds in XML format. The XML
format is fully defined in the reference manual, obtained from the
ClassAds web page, with a link at
http://htcondor.org/classad/classad.html.
749
750
751
752 -json
753
754 (output option) Display entire job ClassAds in JSON format.
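Because -json emits a JSON array of job ClassAds, its output is easy to
post-process; a sketch with a made-up two-job sample shaped like that
output (the attribute values here are invented for illustration):

```python
import json

# Invented sample shaped like `condor_q -json` output: an array of ads.
sample = """
[
  {"ClusterId": 27, "ProcId": 0, "Owner": "jdoe", "JobStatus": 2},
  {"ClusterId": 27, "ProcId": 1, "Owner": "jdoe", "JobStatus": 1}
]
"""

ads = json.loads(sample)
# Collect the ids of running jobs (JobStatus == 2 means Running).
running = [f'{ad["ClusterId"]}.{ad["ProcId"]}' for ad in ads
           if ad["JobStatus"] == 2]
print(running)
```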
755
756
757
758 -attributes Attr1[,Attr2 ...]
759
(output option) Explicitly list the attributes, by name in a comma-
separated list, which should be displayed when using the -xml, -json
or -long options. Limiting the number of attributes increases the
efficiency of the query.
764
765
766
767 -format fmt attr
768
(output option) Display attribute or expression attr in format fmt.
To display the attribute or expression, the format must contain a
single printf(3)-style conversion specifier. Attributes must be from
the job ClassAd. Expressions are ClassAd expressions and may refer to
attributes in the job ClassAd. If the attribute is not present in a
given ClassAd and cannot be parsed as an expression, then the format
option will be silently skipped. %r prints the unevaluated, or raw,
values. The conversion specifier must match the type of the attribute
or expression. %s is suitable for strings such as Owner, %d for
integers such as ClusterId, and %f for floating point numbers such as
RemoteWallClockTime. %v identifies the type of the attribute, and
then prints the value in an appropriate format. %V identifies the
type of the attribute, and then prints the value in an appropriate
format as it would appear in the -long format. As an example, strings
used with %V will have quote marks. An incorrect format will result
in undefined behavior. Do not use more than one conversion specifier
in a given format; more than one conversion specifier will result in
undefined behavior. To output multiple attributes, repeat the -format
option once for each desired attribute. Like printf(3)-style formats,
one may include other text that will be reproduced directly. A format
without any conversion specifiers may be specified, but an attribute
is still required. Include a backslash followed by an `n' to specify
a line break.
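The per-job behaviour described above amounts to applying each
(format, attribute) pair to every matching job ad in turn; a rough
Python emulation (the sample ads are invented, and this is a sketch of
the semantics, not HTCondor's implementation):

```python
# Emulate: condor_q -format "%s" Owner -format " ProcId = %d\n" ProcId
ads = [{"Owner": "jdoe", "ProcId": 0},
       {"Owner": "jdoe", "ProcId": 1}]
fmts = [("%s", "Owner"), (" ProcId = %d\n", "ProcId")]

# For each ad, apply every format/attribute pair in order.
out = "".join(fmt % ad[attr] for ad in ads for fmt, attr in fmts)
print(out, end="")
```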
792
793
794
795
796
-autoformat[:jlhVr,tng] attr1 [attr2 ...] or -af[:jlhVr,tng] attr1
[attr2 ...]

(output option) Display attribute(s) or expression(s) formatted in a
default way according to attribute types. This option takes an
arbitrary number of attribute names as arguments, and prints out their
values, with a space between each value and a newline character after
the last value. It is like the -format option without format strings.
This output option does not work in conjunction with any of the
options -run, -currentrun, -hold, -grid, -goodput, or -io.

It is assumed that no attribute names begin with a dash character, so
that the next word that begins with a dash is the start of the next
option. The -autoformat option may be followed by a colon character
and formatting qualifiers to deviate the output formatting from the
default:

813
j   print the job ID as the first field,

l   label each field,

h   print column headings before the first line of output,

V   use %V rather than %v for formatting (string values are quoted),

r   print "raw", or unevaluated, values,

,   add a comma character after each field,

t   add a tab character before each field instead of the default space
character,

n   add a newline character after each field,

g   add a newline character between ClassAds, and suppress spaces
before each field.

Use -af:h to get tabular values with headings.

Use -af:lrng to get -long equivalent format.

The newline and comma characters may not be used together. The l and
h characters may not be used together.
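The default -autoformat layout (a space between values, a newline
after the last) then comes down to the following sketch, using an
invented job ad for illustration:

```python
# Sketch of -autoformat's default rendering for three attributes.
ads = [{"ClusterId": 27, "ProcId": 0, "Owner": "jdoe"}]
attrs = ["ClusterId", "ProcId", "Owner"]

# One line per ad: values separated by single spaces.
lines = [" ".join(str(ad[a]) for a in attrs) for ad in ads]
print("\n".join(lines))
```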
840
841
842
-analyze[:<qual>]

(analyze option) Perform a matchmaking analysis on why the requested
jobs are not running. First a simple analysis determines if the job
is not running due to not being in a runnable state. If the job is in
a runnable state, then this option is equivalent to -better-analyze.
<qual> is a comma-separated list containing one or more of

priority  to consider user priority during the analysis

summary   to show a one-line summary for each job or machine

reverse   to analyze machines, rather than jobs
856
857
858
-better-analyze[:<qual>]

(analyze option) Perform a more detailed matchmaking analysis to
determine how many resources are available to run the requested jobs.
This option is never meaningful for Scheduler universe jobs and is
only meaningful for grid universe jobs doing matchmaking. When this
option is used in conjunction with the -unmatchable option, the output
will be a list of job ids that don't match any of the available slots.
<qual> is a comma-separated list containing one or more of

priority  to consider user priority during the analysis

summary   to show a one-line summary for each job or machine

reverse   to analyze machines, rather than jobs
875
876
877
878 -machine name
879
880 (analyze option) When doing matchmaking analysis, analyze only
881 machine ClassAds that have slot or machine names that match the
882 given name.
883
884
885
886 -mconstraint expression
887
888 (analyze option) When doing matchmaking analysis, match only machine
889 ClassAds which match the ClassAd expression constraint.
890
891
892
-slotads file

(analyze option) When doing matchmaking analysis, use the machine
ClassAds from the file instead of the ones from the condor_collector
daemon. This is most useful for debugging purposes. The ClassAds
appear as if condor_status -long is used.
899
900
901
-userprios file

(analyze option) When doing matchmaking analysis with priority, read
user priorities from the file rather than the ones from the
condor_negotiator daemon. This is most useful for debugging purposes
or to speed up analysis in situations where the condor_negotiator
daemon is slow to respond to condor_userprio requests. The file
should be in the format produced by condor_userprio -long.
910
911
912
913 -nouserprios
914
915 (analyze option) Do not consider user priority during the analysis.
916
917
918
919 -reverse-analyze
920
921 (analyze option) Analyze machine requirements against jobs.
922
923
924
925 -verbose
926
927 (analyze option) When doing analysis, show progress and include the
928 names of specific machines in the output.
929
930
931
General Remarks
The default output from condor_q is formatted to be human readable,
not script readable. In an effort to make the output fit within 80
characters, values in some fields might be truncated. Furthermore,
the HTCondor Project can (and does) change the formatting of this
default output as we see fit. Therefore, any script that is
attempting to parse data from condor_q is strongly encouraged to use
the -format option (described above, examples given below).

Although -analyze provides a very good first approximation, the
analyzer cannot diagnose all possible situations, because the analysis
is based on instantaneous and local information. Therefore, there are
some situations, such as when several submitters are contending for
resources, or when the pool is rapidly changing state, which cannot be
accurately diagnosed.

Options -goodput, -cputime, and -io are most useful for standard
universe jobs, since they rely on values computed when a job produces
a checkpoint.

It is possible to hold jobs that are in the X state. If the user
wishes to avoid this condition, it is best to construct a -constraint
expression that contains JobStatus != 3.
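For reference when writing such constraints, the standard JobStatus
integer codes are as follows (verify against your HTCondor version):

```python
# Standard HTCondor JobStatus codes; 3 ("Removed") is the X state
# referred to above.
JOB_STATUS = {
    1: "Idle",
    2: "Running",
    3: "Removed",
    4: "Completed",
    5: "Held",
    6: "Transferring Output",
    7: "Suspended",
}
```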
955
Examples
The -format option provides a way to specify both the job attributes
and the formatting of those attributes. There must be only one
conversion specification per -format option. As an example, to list
only Jane Doe's jobs in the queue, choosing to print and format only
the owner of the job, the command line arguments for the job, and the
process ID of the job:

$ condor_q -submitter jdoe -format "%s" Owner -format " %s " Args -format " ProcId = %d\n" ProcId
jdoe 16386 2800 ProcId = 0
jdoe 16386 3000 ProcId = 1
jdoe 16386 3200 ProcId = 2
jdoe 16386 3400 ProcId = 3
jdoe 16386 3600 ProcId = 4
jdoe 16386 4200 ProcId = 7
972
To display only the job IDs of Jane Doe's jobs you can use the
following.

$ condor_q -submitter jdoe -format "%d." ClusterId -format "%d\n" ProcId
27.0
27.1
27.2
27.3
27.4
27.7
984
An example that shows the analysis in summary format:

$ condor_q -analyze:summary

-- Submitter: submit-1.chtc.wisc.edu :
<192.168.100.43:9618?sock=11794_95bb_3> : submit-1.chtc.wisc.edu
Analyzing matches for 5979 slots
           Autocluster  Matches  Machine      Running    Serving
 JobId     Members/Idle Reqmnts  Rejects Job  Users Job   Other User Avail Owner
---------- ------------ -------- ------------ ---------- ---------- ----- -----
25764522.0          7/0     5910          820       7/10       5046    34 smith
25764682.0          9/0     2172          603        9/9       1531    29 smith
25765082.0         18/0     2172          603       18/9       1531    29 smith
25765900.0          1/0     2172          603        1/9       1531    29 smith
An example that shows summary information by machine:

$ condor_q -ana:sum,rev

-- Submitter: s-1.chtc.wisc.edu :
<192.168.100.43:9618?sock=11794_95bb_3> : s-1.chtc.wisc.edu
Analyzing matches for 2885 jobs
                             Slot Slot's Req   Job's Req    Both
Name                         Type Matches Job  Matches Slot Match %
---------------------------- ---- ------------ ------------ ----------
slot1@INFO.wisc.edu          Stat         2729            0      0.00
slot2@INFO.wisc.edu          Stat         2729            0      0.00
slot1@aci-001.chtc.wisc.edu  Part            0         2793      0.00
slot1_1@a-001.chtc.wisc.edu  Dyn          2644         2792     91.37
slot1_2@a-001.chtc.wisc.edu  Dyn          2623         2601     85.10
slot1_3@a-001.chtc.wisc.edu  Dyn          2644         2632     85.82
slot1_4@a-001.chtc.wisc.edu  Dyn          2644         2792     91.37
slot1@a-002.chtc.wisc.edu    Part            0         2633      0.00
slot1_10@a-002.chtc.wisc.edu Dyn          2623         2601     85.10
1037
An example with two independent DAGs in the queue:

$ condor_q

-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:35169?...
OWNER  BATCH_NAME   SUBMITTED   DONE  RUN  IDLE  TOTAL JOB_IDS
wenger DAG: 3696   2/12 11:55     _    10     _     10 3698.0 ... 3707.0
wenger DAG: 3697   2/12 11:55     1     1     1     10 3709.0 ... 3710.0

14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended

Note that the "13 running" in the last line is two more than the total
of the RUN column, because the two condor_dagman jobs themselves are
counted in the last line but not in the RUN column.

Also note that the "completed" value in the last line does not
correspond to the total of the DONE column, because the "completed"
value in the last line only counts jobs that are completed but still
in the queue, whereas the DONE column counts jobs that are no longer
in the queue.
1061
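Distinctions like these matter when scripting against condor_q output.
As a minimal sketch (assuming the totals-line format shown above; the
totals variable here just holds a hand-copied sample of that line, not
live output), the idle and running counts can be extracted with sed:

```shell
# A captured condor_q totals line (sample text from the example above).
totals='14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended'

# Pull the counts out of the summary line with a BRE capture group.
idle=$(printf '%s\n' "$totals" | sed -n 's/.* \([0-9][0-9]*\) idle.*/\1/p')
running=$(printf '%s\n' "$totals" | sed -n 's/.* \([0-9][0-9]*\) running.*/\1/p')
echo "idle=$idle running=$running"
```

The same pattern works for the other fields (held, suspended, and so
on), since each count is followed by its label.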
Here's an example with a held job, illustrating the addition of the
HOLD column to the output:

    $ condor_q

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
    OWNER  BATCH_NAME        SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
    wenger CMD: /bin/slee   9/13 16:25      _      3      _      1      4 599.0 ...

    4 jobs; 0 completed, 0 removed, 0 idle, 3 running, 1 held, 0 suspended

Here are some examples with a nested-DAG workflow in the queue, which
is one of the most complicated cases. The workflow consists of a
top-level DAG with nodes NodeA and NodeB, each with a two-proc cluster;
and a sub-DAG SubZ with nodes NodeSA and NodeSB, each also with a
two-proc cluster.

First of all, non-batch mode with all of the node jobs in the queue:

    $ condor_q -nobatch

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER           SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:00:13 R  0    2.4 condor_dagman -p 0
     592.0   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
     592.1   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
     593.0   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
     593.1   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
     594.0   wenger         9/13 16:05   0+00:00:07 R  0    2.4 condor_dagman -p 0
     595.0   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
     595.1   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 300
     596.0   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
     596.1   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 300

    10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

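A script that wants per-cluster counts can post-process a listing like
this one. A minimal sketch, using the ID column from the output above
(the listing variable is a hand-copied stand-in, not live condor_q
output):

```shell
# Job IDs (cluster.proc) from the listing above. In practice this column
# might be captured with something like: condor_q -nobatch | awk '{print $1}'
listing='591.0
592.0
592.1
593.0
593.1
594.0
595.0
595.1
596.0
596.1'

# Count procs per cluster by splitting each ID on the dot.
per_cluster=$(printf '%s\n' "$listing" \
    | awk -F. '{count[$1]++} END {for (c in count) print c, count[c]}' \
    | sort)
printf '%s\n' "$per_cluster"
```

Here the two condor_dagman clusters (591 and 594) each show one proc,
and the four node clusters each show two.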
Now non-batch mode with the -dag option (unfortunately, condor_q
doesn't do a good job of grouping procs in the same cluster together):

    $ condor_q -nobatch -dag

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER/NODENAME  SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:00:27 R  0    2.4 condor_dagman -
     592.0    |-NodeA       9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
     593.0    |-NodeB       9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
     594.0    |-SubZ        9/13 16:05   0+00:00:21 R  0    2.4 condor_dagman -
     595.0     |-NodeSA     9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
     596.0     |-NodeSB     9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
     592.1    |-NodeA       9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
     593.1    |-NodeB       9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
     595.1     |-NodeSA     9/13 16:05   0+00:00:15 R  0    0.0 sleep 300
     596.1     |-NodeSB     9/13 16:05   0+00:00:15 R  0    0.0 sleep 300

    10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

Now, finally, the batch (default) mode:

    $ condor_q

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
    OWNER  BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
    wenger ex1.dag+591   9/13 16:05      _      8      _      5 592.0 ... 596.1

    10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

There are several things about this output that may be slightly
confusing:

* The TOTAL column is less than the RUN column. This is because, for
  DAG node jobs, their contribution to the TOTAL column is the number
  of clusters, not the number of procs (but their contribution to the
  RUN column is the number of procs). So the four DAG nodes (8 procs)
  contribute 4, and the sub-DAG contributes 1, to the TOTAL column.
  (But, somewhat confusingly, the sub-DAG job is not counted in the RUN
  column.)

* The sum of the RUN and IDLE columns (8) is less than the 10 jobs
  listed in the totals line at the bottom. This is because the
  top-level DAG and sub-DAG jobs are not counted in the RUN column, but
  they are counted in the totals line.

Now here is non-batch mode after proc 0 of each node job has finished:

    $ condor_q -nobatch

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER           SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:01:19 R  0    2.4 condor_dagman -p 0
     592.1   wenger         9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
     593.1   wenger         9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
     594.0   wenger         9/13 16:05   0+00:01:13 R  0    2.4 condor_dagman -p 0
     595.1   wenger         9/13 16:05   0+00:01:07 R  0    0.0 sleep 300
     596.1   wenger         9/13 16:05   0+00:01:07 R  0    0.0 sleep 300

    6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

The same state, with the -dag option:

    $ condor_q -nobatch -dag

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER/NODENAME  SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:01:30 R  0    2.4 condor_dagman -
     592.1    |-NodeA       9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
     593.1    |-NodeB       9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
     594.0    |-SubZ        9/13 16:05   0+00:01:24 R  0    2.4 condor_dagman -
     595.1     |-NodeSA     9/13 16:05   0+00:01:18 R  0    0.0 sleep 300
     596.1     |-NodeSB     9/13 16:05   0+00:01:18 R  0    0.0 sleep 300

    6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

And, finally, that state in batch (default) mode:

    $ condor_q

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
    OWNER  BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
    wenger ex1.dag+591   9/13 16:05      _      4      _      5 592.1 ... 596.1

    6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

EXIT STATUS
condor_q will exit with a status value of 0 (zero) upon success, and it
will exit with the value 1 (one) upon failure.

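A script can branch on this exit status directly. A minimal sketch (the
message strings are illustrative only, and the fallback branch also
covers machines where condor_q happens not to be installed):

```shell
# Branch on condor_q's exit status: 0 means the query succeeded,
# nonzero means it did not (or the command is missing entirely).
if command -v condor_q >/dev/null 2>&1 && condor_q >/dev/null 2>&1; then
    status_msg="queue query succeeded"
else
    status_msg="queue query failed or condor_q unavailable"
fi
echo "$status_msg"
```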
AUTHOR
Center for High Throughput Computing, University of Wisconsin–Madison

COPYRIGHT
Copyright © 1990-2019 Center for High Throughput Computing, Computer
Sciences Department, University of Wisconsin-Madison, Madison, WI. All
Rights Reserved. Licensed under the Apache License, Version 2.0.
date                                                              condor_q(1)