condor_q(1)                 General Commands Manual                condor_q(1)

Name
condor_q - Display information about jobs in the HTCondor job queue

Synopsis
condor_q [-help [Universe | State]]

condor_q [-debug] [general options] [restriction list] [output options]
[analyze options]
Description
condor_q displays information about jobs in the HTCondor job queue. By
default, condor_q queries the local job queue, but this behavior may be
modified by specifying one of the general options.

As of version 8.5.2, condor_q defaults to querying only the current
user's jobs. This default is overridden when the restriction list
contains usernames and/or job ids, when the -submitter or -allusers
arguments are specified, or when the current user is a queue superuser.
It can also be overridden by setting the CONDOR_Q_ONLY_MY_JOBS
configuration macro to False.

As of version 8.5.6, condor_q defaults to batch-mode output (see
-batch in the Options section below). The old behavior can be obtained
by specifying -nobatch on the command line. To change the default back
to its pre-8.5.6 value, set the configuration variable
CONDOR_Q_DASH_BATCH_IS_DEFAULT to False.
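Both of these defaults can be pinned down in the configuration; a minimal
sketch, assuming a local configuration file that your condor_config
includes (the file name is an assumption, not mandated by HTCondor):

```
# Hypothetical condor_config.local fragment: restore pre-8.5.x behavior.
# Show one line per job instead of batch mode by default:
CONDOR_Q_DASH_BATCH_IS_DEFAULT = False
# Show all users' jobs by default, as before version 8.5.2:
CONDOR_Q_ONLY_MY_JOBS = False
```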
31
As of version 8.5.6, condor_q defaults to displaying information about
batches of jobs, rather than individual jobs. The intention is that
this is a more useful, and user-friendly, format for users with large
numbers of jobs in the queue. Ideally, users will specify meaningful
batch names for their jobs, to make it easier to keep track of related
jobs.

(For information about specifying batch names for your jobs, see the
condor_submit and condor_submit_dag man pages.)

A batch of jobs is defined as follows:

* An entire workflow (a DAG or hierarchy of nested DAGs) (note that
condor_dagman now specifies a default batch name for all jobs in a
given workflow)

* All jobs in a single cluster

* All jobs submitted by a single user that have the same executable
specified in their submit file (unless submitted with different batch
names)

* All jobs submitted by a single user that have the same batch name
specified in their submit file or on the condor_submit or
condor_submit_dag command line.
There are many output options that modify the output generated by
condor_q. The effects of these options, and the meanings of the
various output data, are described below.

Output options
If the -long option is specified, condor_q displays a long description
of the queried jobs by printing the entire job ClassAd for all jobs
matching the restrictions, if any. Individual attributes of the job
ClassAd can be displayed by means of the -format option, which displays
attributes with a printf(3) format, or with the -autoformat option.
Multiple -format options may be specified in the option list to display
several attributes of the job.

For most output options (except as specified), the last line of
condor_q output contains a summary of the queue: the total number of
jobs, and the number of jobs in the completed, removed, idle, running,
held and suspended states.

If no output options are specified, condor_q now defaults to batch
mode, and displays the following columns of information, with one line
of output per batch of jobs:

OWNER, BATCH_NAME, SUBMITTED, DONE, RUN, IDLE, [HOLD,] TOTAL, JOB_IDS

Note that the HOLD column is only shown if there are held jobs in the
output or if there are no jobs in the output.
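To make the batch-mode layout concrete, here is a sketch (my own code,
not part of HTCondor) that picks one such line apart; the sample line in
the usage note below is adapted from the Examples section, and an
underscore in a count column means zero.

```python
def parse_batch_line(line):
    """Split one default-mode condor_q output line into its columns.

    BATCH_NAME may contain spaces and SUBMITTED is two tokens, so the
    counts and JOB_IDS are located by walking in from the right end.
    """
    toks = line.split()
    # JOB_IDS tokens are job ids like 3698.0, possibly joined by "...".
    i = len(toks)
    while i > 0 and "." in toks[i - 1]:
        i -= 1
    job_ids = " ".join(toks[i:])
    # The run of integers/underscores before JOB_IDS holds the counts.
    j = i
    while j > 0 and (toks[j - 1].isdigit() or toks[j - 1] == "_"):
        j -= 1
    counts = [0 if t == "_" else int(t) for t in toks[j:i]]
    if len(counts) == 5:  # HOLD column present
        names = ["done", "run", "idle", "hold", "total"]
    else:
        names = ["done", "run", "idle", "total"]
    out = dict(zip(names, counts))
    out.update(owner=toks[0],
               batch_name=" ".join(toks[1:j - 2]),
               submitted=" ".join(toks[j - 2:j]),
               job_ids=job_ids)
    return out
```

For the sample line "wenger DAG: 3696 2/12 11:55 _ 10 _ 10 3698.0 ...
3707.0" this yields run=10, total=10, and job_ids "3698.0 ... 3707.0".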
87
If the -nobatch option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

If the -dag option is specified (in conjunction with -nobatch),
condor_q displays the following columns of information, with one line
of output per job; the owner is shown only for top-level jobs, and for
all other jobs (including sub-DAGs) the node name is shown:

ID, OWNER/NODENAME, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

If the -run option is specified (in conjunction with -nobatch),
condor_q displays the following columns of information, with one line
of output per running job:

ID, OWNER, SUBMITTED, RUN_TIME, HOST(S)

Also note that the -run option disables output of the totals line.

If the -grid option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, STATUS, GRID->MANAGER, HOST, GRID_JOB_ID

If the -grid:ec2 option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, STATUS, INSTANCE ID, CMD

If the -goodput option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, SUBMITTED, RUN_TIME, GOODPUT, CPU_UTIL, Mb/s

If the -io option is specified, condor_q displays the following columns
of information, with one line of output per job:

ID, OWNER, RUNS, ST, INPUT, OUTPUT, RATE, MISC

If the -cputime option is specified (in conjunction with -nobatch),
condor_q displays the following columns of information, with one line
of output per job:

ID, OWNER, SUBMITTED, CPU_TIME, ST, PRI, SIZE, CMD

If the -hold option is specified, condor_q displays the following
columns of information, with one line of output per job:

ID, OWNER, HELD_SINCE, HOLD_REASON

If the -totals option is specified, condor_q displays only one line of
output no matter how many jobs and batches of jobs are in the queue.
That line of output contains the total number of jobs, and the number
of jobs in the completed, removed, idle, running, held and suspended
states.
145 Output data
146 The available output data are as follows:
147
148 ID
149
150 (Non-batch mode only) The cluster/process id of the HTCondor job.
151
152
153
154 OWNER
155
156 The owner of the job or batch of jobs.
157
158
159
160 OWNER/NODENAME
161
(-dag only) The owner of a job or the DAG node name of the job.
163
164
165
166 BATCH_NAME
167
168 (Batch mode only) The batch name of the job or batch of jobs.
169
170
171
172 SUBMITTED
173
174 The month, day, hour, and minute the job was submitted to the queue.
175
176
177
178 DONE
179
180 (Batch mode only) The number of job procs that are done, but still
181 in the queue.
182
183
184
185 RUN
186
187 (Batch mode only) The number of job procs that are running.
188
189
190
191 IDLE
192
193 (Batch mode only) The number of job procs that are in the queue but
194 idle.
195
196
197
198 HOLD
199
200 (Batch mode only) The number of job procs that are in the queue but
201 held.
202
203
204
205 TOTAL
206
207 (Batch mode only) The total number of job procs in the queue, unless
208 the batch is a DAG, in which case this is the total number of clus‐
209 ters in the queue. Note: for non-DAG batches, the TOTAL column con‐
210 tains correct values only in version 8.5.7 and later.
211
212
213
214 JOB_IDS
215
216 (Batch mode only) The range of job IDs belonging to the batch.
217
218
219
220 RUN_TIME
221
222 (Non-batch mode only) Wall-clock time accumulated by the job to date
223 in days, hours, minutes, and seconds.
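The days+hours:minutes:seconds rendering used by RUN_TIME (and by
CPU_TIME below) can be reproduced in a few lines; a sketch, with the
function name being my own:

```python
def fmt_time(total_seconds):
    """Render a duration the way condor_q does: days '+' HH:MM:SS,
    matching values such as 0+00:00:13 in the Examples section."""
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}+{hours:02d}:{minutes:02d}:{seconds:02d}"
```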
224
225
226
227 ST
228
229 (Non-batch mode only) Current status of the job, which varies some‐
230 what according to the job universe and the timing of updates. H = on
231 hold, R = running, I = idle (waiting for a machine to execute on), C
232 = completed, X = removed, S = suspended (execution of a running job
233 temporarily suspended on execute node), < = transferring input (or
234 queued to do so), and > = transferring output (or queued to do so).
235
236
237
238 PRI
239
240 (Non-batch mode only) User specified priority of the job, displayed
241 as an integer, with higher numbers corresponding to better priority.
242
243
244
245 SIZE
246
(Non-batch mode only) The peak amount of memory in Mbytes consumed by
the job; note this value is only refreshed periodically. The actual
value reported is taken from the job ClassAd attribute MemoryUsage if
this attribute is defined, and from the job attribute ImageSize
otherwise.
252
253
254
255 CMD
256
257 (Non-batch mode only) The name of the executable. For EC2 jobs, this
258 field is arbitrary.
259
260
261
262 HOST(S)
263
(-run only) The host where the job is running.
265
266
267
268 STATUS
269
(-grid only) The state that HTCondor believes the job is in. Possible
values are grid-type specific, but include:
272
273 PENDING
274
275 The job is waiting for resources to become available in order to
276 run.
277
278
279
280 ACTIVE
281
282 The job has received resources, and the application is executing.
283
284
285
286 FAILED
287
288 The job terminated before completion because of an error, user-
289 triggered cancel, or system-triggered cancel.
290
291
292
293 DONE
294
295 The job completed successfully.
296
297
298
299 SUSPENDED
300
The job has been suspended. Resources which were allocated for this
job may have been released due to a scheduler-specific reason.
304
305
306
307 UNSUBMITTED
308
309 The job has not been submitted to the scheduler yet, pending the
310 reception of the GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST
311 signal from a client.
312
313
314
315 STAGE_IN
316
317 The job manager is staging in files, in order to run the job.
318
319
320
321 STAGE_OUT
322
323 The job manager is staging out files generated by the job.
324
325
326
327 UNKNOWN
328
329
330
331
332
333
334
335 GRID->MANAGER
336
(-grid only) A guess at what remote batch system is running the job.
It is a guess, because HTCondor looks at the Globus jobmanager contact
string to attempt identification. If the value is fork, the job is
running on the remote host without a jobmanager. Values may also be
condor, lsf, or pbs.
342
343
344
345 HOST
346
(-grid only) The host to which the job was submitted.
348
349
350
351 GRID_JOB_ID
352
(-grid only) (More information needed here.)
354
355
356
357 INSTANCE ID
358
(-grid:ec2 only) Usually the EC2 instance ID; may be blank or the
client token, depending on job progress.
361
362
363
364 GOODPUT
365
(-goodput only) The percentage of RUN_TIME for this job which has been
saved in a checkpoint. A low GOODPUT value indicates that the job is
failing to checkpoint. If a job has not yet attempted a checkpoint,
this column contains [?????].
370
371
372
373 CPU_UTIL
374
(-goodput only) The ratio of CPU_TIME to RUN_TIME for checkpointed
work. A low CPU_UTIL indicates that the job is not running
efficiently, perhaps because it is I/O bound or because the job
requires more memory than is available on the remote workstations. If
the job has not (yet) checkpointed, this column contains [??????].
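As a back-of-the-envelope illustration of how these two columns relate,
per the definitions above (illustrative arithmetic only, not HTCondor
source code):

```python
def goodput_pct(checkpointed_secs, run_secs):
    """GOODPUT: the fraction of RUN_TIME saved in a checkpoint, as a
    percentage."""
    return 100.0 * checkpointed_secs / run_secs

def cpu_util(cpu_secs, run_secs):
    """CPU_UTIL: the ratio of CPU_TIME to RUN_TIME, computed over the
    checkpointed portion of the run."""
    return cpu_secs / run_secs
```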
380
381
382
383 Mb/s
384
(-goodput only) The network usage of this job, in Megabits per second
of run-time.
387
388
389
390 READ The total number of bytes the application has read from files
391 and sockets.
392
393
394
395 WRITE The total number of bytes the application has written to files
396 and sockets.
397
398
399
SEEK The total number of seek operations the application has performed
on files.
402
403
404
405 XPUT The effective throughput (average bytes read and written per
406 second) from the application's point of view.
407
408
409
410 BUFSIZE The maximum number of bytes to be buffered per file.
411
412
413
414 BLOCKSIZE The desired block size for large data transfers. These
415 fields are updated when a job produces a checkpoint or completes. If
416 a job has not yet produced a checkpoint, this information is not
417 available.
418
419
420
421 INPUT
422
(-io only) For standard universe, FileReadBytes; otherwise, BytesRecvd.
425
426
427
428 OUTPUT
429
(-io only) For standard universe, FileWriteBytes; otherwise, BytesSent.
432
433
434
435 RATE
436
(-io only) For standard universe, FileReadBytes+FileWriteBytes;
otherwise, BytesRecvd+BytesSent.
439
440
441
442 MISC
443
(-io only) JobUniverse.
445
446
447
448 CPU_TIME
449
(-cputime only) The remote CPU time accumulated by the job to date
(which has been stored in a checkpoint) in days, hours, minutes, and
seconds. (If the job is currently running, time accumulated during the
current run is not shown. If the job has not produced a checkpoint,
this column contains 0+00:00:00.)
455
456
457
458 HELD_SINCE
459
(-hold only) Month, day, hour and minute at which the job was held.
461
462
463
464 HOLD_REASON
465
(-hold only) The hold reason for the job.
467
468
469
Analyze
The -analyze or -better-analyze options can be used to determine why
certain jobs are not running by performing an analysis on a per-machine
basis for each machine in the pool. The reasons can vary among failed
constraints, insufficient priority, resource owner preferences and
prevention of preemption by the PREEMPTION_REQUIREMENTS expression. If
the analyze option -verbose is specified along with the -analyze
option, the reason for failure is displayed on a per-machine basis.
-better-analyze differs from -analyze in that it will do matchmaking
analysis on jobs even if they are currently running, or if the reason
they are not running is not due to matchmaking. -better-analyze also
produces a more thorough analysis of complex Requirements expressions
and shows the values of relevant job ClassAd attributes. When only a
single machine is being analyzed via -machine or -mconstraint, the
values of relevant attributes of the machine ClassAd are also
displayed.
485
Restrictions
To restrict the display to jobs of interest, a list of zero or more
restriction options may be supplied. Each restriction may be one of:

* cluster.process, which matches jobs which belong to the specified
cluster and have the specified process number;

* cluster (without a process), which matches all jobs belonging to the
specified cluster;

* owner, which matches all jobs owned by the specified owner;

* -constraint expression, which matches all jobs that satisfy the
specified ClassAd expression;

* -unmatchable expression, which matches all jobs that do not match
any slot that would be considered by -better-analyze;

* -allusers, which overrides the default restriction of only matching
jobs submitted by the current user.

If cluster or cluster.process is specified, and the job matching that
restriction is a condor_dagman job, information for all jobs of that
DAG is displayed in batch mode (in non-batch mode, only the
condor_dagman job itself is displayed).

If no owner restrictions are present, the job matches the restriction
list if it matches at least one restriction in the list. If owner
restrictions are present, the job matches the list if it matches one
of the owner restrictions and at least one non-owner restriction.
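The owner/non-owner matching rule above can be sketched as a predicate
combination (hypothetical helper, not HTCondor code; the predicate
lists stand in for the two kinds of restrictions):

```python
def matches_restrictions(job, owner_preds, other_preds):
    """A job matches the restriction list when it satisfies at least
    one owner restriction (if any were given) and at least one
    non-owner restriction (if any were given)."""
    owner_ok = not owner_preds or any(p(job) for p in owner_preds)
    other_ok = not other_preds or any(p(job) for p in other_preds)
    return owner_ok and other_ok
```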
516
Options
-debug

Causes debugging information to be sent to stderr, based on the value
of the configuration variable TOOL_DEBUG.
522
523
524
-batch

(output option) Show a single line of progress information for a batch
of jobs, where a batch is defined as follows:

* An entire workflow (a DAG or hierarchy of nested DAGs)

* All jobs in a single cluster

* All jobs submitted by a single user that have the same executable
specified in their submit file

* All jobs submitted by a single user that have the same batch name
specified in their submit file or on the condor_submit or
condor_submit_dag command line.

This option also changes the output columns as noted above.

Note that, as of version 8.5.6, -batch is the default, unless the
CONDOR_Q_DASH_BATCH_IS_DEFAULT configuration variable is set to False.
545
546
547
-nobatch

(output option) Show a line for each job (turn off the -batch option).
552
553
554
555 -global
556
557 (general option) Queries all job queues in the pool.
558
559
560
561 -submitter submitter
562
563 (general option) List jobs of a specific submitter in the entire
564 pool, not just for a single condor_schedd.
565
566
567
-name name

(general option) Query only the job queue of the named condor_schedd
daemon.
572
573
574
-pool centralmanagerhostname[:portnumber]

(general option) Use the centralmanagerhostname as the central manager
to locate condor_schedd daemons. The default is the COLLECTOR_HOST, as
specified in the configuration.
580
581
582
-jobads file

(general option) Display jobs from a list of ClassAds from a file,
instead of the real ClassAds from the condor_schedd daemon. This is
most useful for debugging purposes. The ClassAds appear as if condor_q
-long is used with the header stripped out.
589
590
591
-userlog file

(general option) Display jobs, with job information coming from a job
event log, instead of from the real ClassAds from the condor_schedd
daemon. This is most useful for automated testing of the status of
jobs known to be in the given job event log, because it reduces the
load on the condor_schedd. A job event log does not contain all of the
job information, so some fields in the normal output of condor_q will
be blank.
601
602
603
-autocluster

(output option) Output condor_schedd daemon auto cluster information.
For each auto cluster, output the unique ID of the auto cluster along
with the number of jobs in that auto cluster. This option is intended
to be used together with the -long option to output the ClassAds
representing auto clusters. The ClassAds can then be used to identify
or classify the demand for sets of machine resources, which will be
useful in the on-demand creation of execute nodes for glidein
services.
614
615
616
-cputime

(output option) Instead of wall-clock allocation time (RUN_TIME),
display remote CPU time accumulated by the job to date in days, hours,
minutes, and seconds. If the job is currently running, time
accumulated during the current run is not shown. Note that this
option has no effect unless used in conjunction with -nobatch.
624
625
626
627 -currentrun
628
629 (output option) Normally, RUN_TIME contains all the time accumulated
630 during the current run plus all previous runs. If this option is
631 specified, RUN_TIME only displays the time accumulated so far on
632 this current run.
633
634
635
636 -dag
637
638 (output option) Display DAG node jobs under their DAGMan instance.
639 Child nodes are listed using indentation to show the structure of
640 the DAG. Note that this option has no effect unless used in conjunc‐
641 tion with -nobatch.
642
643
644
645 -expert
646
647 (output option) Display shorter error messages.
648
649
650
651 -grid
652
653 (output option) Get information only about jobs submitted to grid
654 resources.
655
656
657
658 -grid:ec2
659
660 (output option) Get information only about jobs submitted to grid
661 resources and display it in a format better-suited for EC2 than the
662 default.
663
664
665
666 -goodput
667
668 (output option) Display job goodput statistics.
669
670
671
-help [Universe | State]

(output option) Print usage information and, optionally, also print
the job universes or job states.
676
677
678
679 -hold
680
681 (output option) Get information about jobs in the hold state. Also
682 displays the time the job was placed into the hold state and the
683 reason why the job was placed in the hold state.
684
685
686
687 -limit Number
688
689 (output option) Limit the number of items output to Number.
690
691
692
693 -io
694
695 (output option) Display job input/output summaries.
696
697
698
699 -long
700
701 (output option) Display entire job ClassAds in long format (one
702 attribute per line).
703
704
705
706 -run
707
708 (output option) Get information about running jobs. Note that this
709 option has no effect unless used in conjunction with -nobatch.
710
711
712
-stream-results

(output option) Display results as jobs are fetched from the job queue
rather than storing results in memory until all jobs have been
fetched. This can reduce memory consumption when fetching large
numbers of jobs, but if condor_q is paused while displaying results,
this could result in a timeout in communication with the
condor_schedd.
720
721
722
723 -totals
724
725 (output option) Display only the totals.
726
727
728
729 -version
730
731 (output option) Print the HTCondor version and exit.
732
733
734
735 -wide
736
737 (output option) If this option is specified, and the command portion
738 of the output would cause the output to extend beyond 80 columns,
739 display beyond the 80 columns.
740
741
742
743 -xml
744
(output option) Display entire job ClassAds in XML format. The XML
format is fully defined in the reference manual, obtained from the
ClassAds web page, with a link at
http://htcondor.org/classad/classad.html.
749
750
751
752 -json
753
754 (output option) Display entire job ClassAds in JSON format.
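Because -json emits a JSON array of job ClassAds, its output is easy to
post-process; a sketch with a made-up two-job sample shaped like that
output (the attribute values here are invented for illustration):

```python
import json

# Invented sample shaped like `condor_q -json` output: an array of ads.
sample = """
[
  {"ClusterId": 27, "ProcId": 0, "Owner": "jdoe", "JobStatus": 2},
  {"ClusterId": 27, "ProcId": 1, "Owner": "jdoe", "JobStatus": 1}
]
"""

ads = json.loads(sample)
# Collect the ids of running jobs (JobStatus == 2 means Running).
running = [f'{ad["ClusterId"]}.{ad["ProcId"]}' for ad in ads
           if ad["JobStatus"] == 2]
print(running)
```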
755
756
757
758 -attributes Attr1[,Attr2 ...]
759
(output option) Explicitly list the attributes, by name in a comma-
separated list, which should be displayed when using the -xml, -json
or -long options. Limiting the number of attributes increases the
efficiency of the query.
764
765
766
767 -format fmt attr
768
(output option) Display attribute or expression attr in format fmt.
To display the attribute or expression, the format must contain a
single printf(3)-style conversion specifier. Attributes must be from
the job ClassAd. Expressions are ClassAd expressions and may refer to
attributes in the job ClassAd. If the attribute is not present in a
given ClassAd and cannot be parsed as an expression, then the format
option will be silently skipped. %r prints the unevaluated, or raw,
values. The conversion specifier must match the type of the attribute
or expression. %s is suitable for strings such as Owner, %d for
integers such as ClusterId, and %f for floating point numbers such as
RemoteWallClockTime. %v identifies the type of the attribute, and
then prints the value in an appropriate format. %V identifies the
type of the attribute, and then prints the value in an appropriate
format as it would appear in the -long format. As an example, strings
used with %V will have quote marks. An incorrect format will result
in undefined behavior. Do not use more than one conversion specifier
in a given format; more than one conversion specifier will result in
undefined behavior. To output multiple attributes, repeat the -format
option once for each desired attribute. Like printf(3)-style formats,
one may include other text that will be reproduced directly. A format
without any conversion specifiers may be specified, but an attribute
is still required. Include a backslash followed by an `n' to specify
a line break.
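The per-job behaviour described above amounts to applying each
(format, attribute) pair to every matching job ad in turn; a rough
Python emulation (the sample ads are invented, and this is a sketch of
the semantics, not HTCondor's implementation):

```python
# Emulate: condor_q -format "%s" Owner -format " ProcId = %d\n" ProcId
ads = [{"Owner": "jdoe", "ProcId": 0},
       {"Owner": "jdoe", "ProcId": 1}]
fmts = [("%s", "Owner"), (" ProcId = %d\n", "ProcId")]

# For each ad, apply every format/attribute pair in order.
out = "".join(fmt % ad[attr] for ad in ads for fmt, attr in fmts)
print(out, end="")
```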
792
793
794
795
796
-autoformat[:jlhVr,tng] attr1 [attr2 ...] or -af[:jlhVr,tng] attr1
[attr2 ...]

(output option) Display attribute(s) or expression(s) formatted in a
default way according to attribute types. This option takes an
arbitrary number of attribute names as arguments, and prints out their
values, with a space between each value and a newline character after
the last value. It is like the -format option without format strings.
This output option does not work in conjunction with any of the
options -run, -currentrun, -hold, -grid, -goodput, or -io.

It is assumed that no attribute names begin with a dash character, so
that the next word that begins with a dash is the start of the next
option. The -autoformat option may be followed by a colon character
and formatting qualifiers to deviate the output formatting from the
default:

813
j   print the job ID as the first field,

l   label each field,

h   print column headings before the first line of output,

V   use %V rather than %v for formatting (string values are quoted),

r   print "raw", or unevaluated, values,

,   add a comma character after each field,

t   add a tab character before each field instead of the default space
character,

n   add a newline character after each field,

g   add a newline character between ClassAds, and suppress spaces
before each field.

Use -af:h to get tabular values with headings.

Use -af:lrng to get -long equivalent format.

The newline and comma characters may not be used together. The l and
h characters may not be used together.
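The default -autoformat layout (a space between values, a newline
after the last) then comes down to the following sketch, using an
invented job ad for illustration:

```python
# Sketch of -autoformat's default rendering for three attributes.
ads = [{"ClusterId": 27, "ProcId": 0, "Owner": "jdoe"}]
attrs = ["ClusterId", "ProcId", "Owner"]

# One line per ad: values separated by single spaces.
lines = [" ".join(str(ad[a]) for a in attrs) for ad in ads]
print("\n".join(lines))
```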
840
841
842
-analyze[:<qual>]

(analyze option) Perform a matchmaking analysis on why the requested
jobs are not running. First a simple analysis determines if the job
is not running due to not being in a runnable state. If the job is in
a runnable state, then this option is equivalent to -better-analyze.
<qual> is a comma-separated list containing one or more of

priority  to consider user priority during the analysis

summary   to show a one-line summary for each job or machine

reverse   to analyze machines, rather than jobs
856
857
858
-better-analyze[:<qual>]

(analyze option) Perform a more detailed matchmaking analysis to
determine how many resources are available to run the requested jobs.
This option is never meaningful for Scheduler universe jobs and is
only meaningful for grid universe jobs doing matchmaking. When this
option is used in conjunction with the -unmatchable option, the output
will be a list of job ids that don't match any of the available slots.
<qual> is a comma-separated list containing one or more of

priority  to consider user priority during the analysis

summary   to show a one-line summary for each job or machine

reverse   to analyze machines, rather than jobs
875
876
877
878 -machine name
879
880 (analyze option) When doing matchmaking analysis, analyze only
881 machine ClassAds that have slot or machine names that match the
882 given name.
883
884
885
886 -mconstraint expression
887
888 (analyze option) When doing matchmaking analysis, match only machine
889 ClassAds which match the ClassAd expression constraint.
890
891
892
-slotads file

(analyze option) When doing matchmaking analysis, use the machine
ClassAds from the file instead of the ones from the condor_collector
daemon. This is most useful for debugging purposes. The ClassAds
appear as if condor_status -long is used.
899
900
901
-userprios file

(analyze option) When doing matchmaking analysis with priority, read
user priorities from the file rather than the ones from the
condor_negotiator daemon. This is most useful for debugging purposes
or to speed up analysis in situations where the condor_negotiator
daemon is slow to respond to condor_userprio requests. The file
should be in the format produced by condor_userprio -long.
910
911
912
913 -nouserprios
914
915 (analyze option) Do not consider user priority during the analysis.
916
917
918
919 -reverse-analyze
920
921 (analyze option) Analyze machine requirements against jobs.
922
923
924
925 -verbose
926
927 (analyze option) When doing analysis, show progress and include the
928 names of specific machines in the output.
929
930
931
General Remarks
The default output from condor_q is formatted to be human readable,
not script readable. In an effort to make the output fit within 80
characters, values in some fields might be truncated. Furthermore,
the HTCondor Project can (and does) change the formatting of this
default output as we see fit. Therefore, any script that is
attempting to parse data from condor_q is strongly encouraged to use
the -format option (described above, examples given below).

Although -analyze provides a very good first approximation, the
analyzer cannot diagnose all possible situations, because the analysis
is based on instantaneous and local information. Therefore, there are
some situations, such as when several submitters are contending for
resources, or when the pool is rapidly changing state, which cannot be
accurately diagnosed.

Options -goodput, -cputime, and -io are most useful for standard
universe jobs, since they rely on values computed when a job produces
a checkpoint.

It is possible to hold jobs that are in the X state. If the user
wishes to avoid this condition, it is best to construct a -constraint
expression that contains JobStatus != 3.
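For reference when writing such constraints, the standard JobStatus
integer codes are as follows (verify against your HTCondor version):

```python
# Standard HTCondor JobStatus codes; 3 ("Removed") is the X state
# referred to above.
JOB_STATUS = {
    1: "Idle",
    2: "Running",
    3: "Removed",
    4: "Completed",
    5: "Held",
    6: "Transferring Output",
    7: "Suspended",
}
```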
955
Examples
The -format option provides a way to specify both the job attributes
and the formatting of those attributes. There must be only one
conversion specification per -format option. As an example, to list
only Jane Doe's jobs in the queue, choosing to print and format only
the owner of the job, the command line arguments for the job, and the
process ID of the job:

$ condor_q -submitter jdoe -format "%s" Owner -format " %s " Args -format " ProcId = %d\n" ProcId
jdoe 16386 2800 ProcId = 0
jdoe 16386 3000 ProcId = 1
jdoe 16386 3200 ProcId = 2
jdoe 16386 3400 ProcId = 3
jdoe 16386 3600 ProcId = 4
jdoe 16386 4200 ProcId = 7
972
To display only the job IDs of Jane Doe's jobs you can use the
following.

$ condor_q -submitter jdoe -format "%d." ClusterId -format "%d\n" ProcId
27.0
27.1
27.2
27.3
27.4
27.7
984
An example that shows the analysis in summary format:

$ condor_q -analyze:summary

-- Submitter: submit-1.chtc.wisc.edu :
<192.168.100.43:9618?sock=11794_95bb_3> : submit-1.chtc.wisc.edu
Analyzing matches for 5979 slots
           Autocluster  Matches  Machine      Running    Serving
 JobId     Members/Idle Reqmnts  Rejects Job  Users Job   Other User Avail Owner
---------- ------------ -------- ------------ ---------- ---------- ----- -----
25764522.0          7/0     5910          820       7/10       5046    34 smith
25764682.0          9/0     2172          603        9/9       1531    29 smith
25765082.0         18/0     2172          603       18/9       1531    29 smith
25765900.0          1/0     2172          603        1/9       1531    29 smith
An example that shows summary information by machine:

$ condor_q -ana:sum,rev

-- Submitter: s-1.chtc.wisc.edu :
<192.168.100.43:9618?sock=11794_95bb_3> : s-1.chtc.wisc.edu
Analyzing matches for 2885 jobs
                             Slot Slot's Req   Job's Req    Both
Name                         Type Matches Job  Matches Slot Match %
---------------------------- ---- ------------ ------------ ----------
slot1@INFO.wisc.edu          Stat         2729            0      0.00
slot2@INFO.wisc.edu          Stat         2729            0      0.00
slot1@aci-001.chtc.wisc.edu  Part            0         2793      0.00
slot1_1@a-001.chtc.wisc.edu  Dyn          2644         2792     91.37
slot1_2@a-001.chtc.wisc.edu  Dyn          2623         2601     85.10
slot1_3@a-001.chtc.wisc.edu  Dyn          2644         2632     85.82
slot1_4@a-001.chtc.wisc.edu  Dyn          2644         2792     91.37
slot1@a-002.chtc.wisc.edu    Part            0         2633      0.00
slot1_10@a-002.chtc.wisc.edu Dyn          2623         2601     85.10
1037
An example with two independent DAGs in the queue:

$ condor_q

-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:35169?...
OWNER  BATCH_NAME   SUBMITTED   DONE  RUN  IDLE  TOTAL JOB_IDS
wenger DAG: 3696   2/12 11:55     _    10     _     10 3698.0 ... 3707.0
wenger DAG: 3697   2/12 11:55     1     1     1     10 3709.0 ... 3710.0

14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended

Note that the "13 running" in the last line is two more than the total
of the RUN column, because the two condor_dagman jobs themselves are
counted in the last line but not in the RUN column.

Also note that the "completed" value in the last line does not
correspond to the total of the DONE column, because the "completed"
value in the last line only counts jobs that are completed but still
in the queue, whereas the DONE column counts jobs that are no longer
in the queue.
1061
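Distinctions like these matter when scripting against condor_q output.
As a minimal sketch (assuming the totals-line format shown above; the
totals variable here just holds a hand-copied sample of that line, not
live output), the idle and running counts can be extracted with sed:

```shell
# A captured condor_q totals line (sample text from the example above).
totals='14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended'

# Pull the counts out of the summary line with a BRE capture group.
idle=$(printf '%s\n' "$totals" | sed -n 's/.* \([0-9][0-9]*\) idle.*/\1/p')
running=$(printf '%s\n' "$totals" | sed -n 's/.* \([0-9][0-9]*\) running.*/\1/p')
echo "idle=$idle running=$running"
```

The same pattern works for the other fields (held, suspended, and so
on), since each count is followed by its label.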
Here's an example with a held job, illustrating the addition of the
HOLD column to the output:

    $ condor_q

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
    OWNER  BATCH_NAME        SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
    wenger CMD: /bin/slee   9/13 16:25      _      3      _      1      4 599.0 ...

    4 jobs; 0 completed, 0 removed, 0 idle, 3 running, 1 held, 0 suspended

Here are some examples with a nested-DAG workflow in the queue, which
is one of the most complicated cases. The workflow consists of a
top-level DAG with nodes NodeA and NodeB, each with a two-proc cluster;
and a sub-DAG SubZ with nodes NodeSA and NodeSB, each also with a
two-proc cluster.

First of all, non-batch mode with all of the node jobs in the queue:

    $ condor_q -nobatch

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER           SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:00:13 R  0    2.4 condor_dagman -p 0
     592.0   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
     592.1   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
     593.0   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
     593.1   wenger         9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
     594.0   wenger         9/13 16:05   0+00:00:07 R  0    2.4 condor_dagman -p 0
     595.0   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
     595.1   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 300
     596.0   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
     596.1   wenger         9/13 16:05   0+00:00:01 R  0    0.0 sleep 300

    10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

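A script that wants per-cluster counts can post-process a listing like
this one. A minimal sketch, using the ID column from the output above
(the listing variable is a hand-copied stand-in, not live condor_q
output):

```shell
# Job IDs (cluster.proc) from the listing above. In practice this column
# might be captured with something like: condor_q -nobatch | awk '{print $1}'
listing='591.0
592.0
592.1
593.0
593.1
594.0
595.0
595.1
596.0
596.1'

# Count procs per cluster by splitting each ID on the dot.
per_cluster=$(printf '%s\n' "$listing" \
    | awk -F. '{count[$1]++} END {for (c in count) print c, count[c]}' \
    | sort)
printf '%s\n' "$per_cluster"
```

Here the two condor_dagman clusters (591 and 594) each show one proc,
and the four node clusters each show two.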
Now non-batch mode with the -dag option (unfortunately, condor_q
doesn't do a good job of grouping procs in the same cluster together):

    $ condor_q -nobatch -dag

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER/NODENAME  SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:00:27 R  0    2.4 condor_dagman -
     592.0    |-NodeA       9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
     593.0    |-NodeB       9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
     594.0    |-SubZ        9/13 16:05   0+00:00:21 R  0    2.4 condor_dagman -
     595.0     |-NodeSA     9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
     596.0     |-NodeSB     9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
     592.1    |-NodeA       9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
     593.1    |-NodeB       9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
     595.1     |-NodeSA     9/13 16:05   0+00:00:15 R  0    0.0 sleep 300
     596.1     |-NodeSB     9/13 16:05   0+00:00:15 R  0    0.0 sleep 300

    10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

Now, finally, the batch (default) mode:

    $ condor_q

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
    OWNER  BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
    wenger ex1.dag+591   9/13 16:05      _      8      _      5 592.0 ... 596.1

    10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

There are several things about this output that may be slightly
confusing:

* The TOTAL column is less than the RUN column. This is because, for
  DAG node jobs, their contribution to the TOTAL column is the number
  of clusters, not the number of procs (but their contribution to the
  RUN column is the number of procs). So the four DAG nodes (8 procs)
  contribute 4, and the sub-DAG contributes 1, to the TOTAL column.
  (But, somewhat confusingly, the sub-DAG job is not counted in the RUN
  column.)

* The sum of the RUN and IDLE columns (8) is less than the 10 jobs
  listed in the totals line at the bottom. This is because the
  top-level DAG and sub-DAG jobs are not counted in the RUN column, but
  they are counted in the totals line.

Now here is non-batch mode after proc 0 of each node job has finished:

    $ condor_q -nobatch

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER           SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:01:19 R  0    2.4 condor_dagman -p 0
     592.1   wenger         9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
     593.1   wenger         9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
     594.0   wenger         9/13 16:05   0+00:01:13 R  0    2.4 condor_dagman -p 0
     595.1   wenger         9/13 16:05   0+00:01:07 R  0    0.0 sleep 300
     596.1   wenger         9/13 16:05   0+00:01:07 R  0    0.0 sleep 300

    6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

The same state, with the -dag option:

    $ condor_q -nobatch -dag

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
     ID      OWNER/NODENAME  SUBMITTED     RUN_TIME ST PRI SIZE CMD
     591.0   wenger         9/13 16:05   0+00:01:30 R  0    2.4 condor_dagman -
     592.1    |-NodeA       9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
     593.1    |-NodeB       9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
     594.0    |-SubZ        9/13 16:05   0+00:01:24 R  0    2.4 condor_dagman -
     595.1     |-NodeSA     9/13 16:05   0+00:01:18 R  0    0.0 sleep 300
     596.1     |-NodeSB     9/13 16:05   0+00:01:18 R  0    0.0 sleep 300

    6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

And, finally, that state in batch (default) mode:

    $ condor_q

    -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
    OWNER  BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
    wenger ex1.dag+591   9/13 16:05      _      4      _      5 592.1 ... 596.1

    6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

EXIT STATUS
condor_q will exit with a status value of 0 (zero) upon success, and it
will exit with the value 1 (one) upon failure.

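A script can branch on this exit status directly. A minimal sketch (the
message strings are illustrative only, and the fallback branch also
covers machines where condor_q happens not to be installed):

```shell
# Branch on condor_q's exit status: 0 means the query succeeded,
# nonzero means it did not (or the command is missing entirely).
if command -v condor_q >/dev/null 2>&1 && condor_q >/dev/null 2>&1; then
    status_msg="queue query succeeded"
else
    status_msg="queue query failed or condor_q unavailable"
fi
echo "$status_msg"
```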
AUTHOR
Center for High Throughput Computing, University of Wisconsin–Madison

COPYRIGHT
Copyright © 1990-2019 Center for High Throughput Computing, Computer
Sciences Department, University of Wisconsin-Madison, Madison, WI. All
Rights Reserved. Licensed under the Apache License, Version 2.0.
date                                                              condor_q(1)