just-man-pages/condor_q(1)  General Commands Manual just-man-pages/condor_q(1)

Name

       condor_q - Display information about jobs in queue

Synopsis

       condor_q [ -help [Universe | State] ]

       condor_q [ -debug ] [ general options ] [ restriction list ]
                [ output options ] [ analyze options ]

Description

       condor_q displays information about jobs in the HTCondor job queue. By
       default, condor_q queries the local job queue, but this behavior may be
       modified by specifying one of the general options.

       As of version 8.5.2, condor_q defaults to querying only the current
       user's jobs. This default is overridden when the restriction list
       contains usernames and/or job ids, when the -submitter or -allusers
       arguments are specified, or when the current user is a queue superuser.
       It can also be overridden by setting the CONDOR_Q_ONLY_MY_JOBS
       configuration macro to False.

       As of version 8.5.6, condor_q defaults to batch-mode output (see -batch
       in the Options section below). The old behavior can be obtained by
       specifying -nobatch on the command line. To change the default back to
       its pre-8.5.6 value, set the configuration variable
       CONDOR_Q_DASH_BATCH_IS_DEFAULT to False.

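       For example, to restore both pre-8.5 defaults, the two macros named
       above can be set in the HTCondor configuration (a sketch; the file
       name is illustrative, any file read by the configuration system
       works):

           # e.g. in /etc/condor/condor_config.local
           CONDOR_Q_ONLY_MY_JOBS = False
           CONDOR_Q_DASH_BATCH_IS_DEFAULT = False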

Batches of jobs

       As of version 8.5.6, condor_q defaults to displaying information about
       batches of jobs, rather than individual jobs. The intention is that
       this will be a more useful, and user-friendly, format for users with
       large numbers of jobs in the queue. Ideally, users will specify
       meaningful batch names for their jobs, to make it easier to keep track
       of related jobs.

       (For information about specifying batch names for your jobs, see the
       condor_submit(1) and condor_submit_dag(1) man pages.)

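       For instance, to give a set of jobs a meaningful batch name at submit
       time (a sketch; the submit file name is illustrative, and the
       -batch-name option is described in the condor_submit man page):

           $ condor_submit -batch-name weekly_analysis jobs.sub

       All jobs from that submission then appear as a single weekly_analysis
       line in the default condor_q output.
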
       A batch of jobs is defined as follows:

          * An entire workflow (a DAG or hierarchy of nested DAGs) (note that
          condor_dagman now specifies a default batch name for all jobs in a
          given workflow)

          * All jobs in a single cluster

          * All jobs submitted by a single user that have the same executable
          specified in their submit file (unless submitted with different
          batch names)

          * All jobs submitted by a single user that have the same batch name
          specified in their submit file or on the condor_submit or
          condor_submit_dag command line.

Output

       There are many output options that modify the output generated by
       condor_q. The effects of these options, and the meanings of the
       various output data, are described below.

   Output options
       If the -long option is specified, condor_q displays a long description
       of the queried jobs by printing the entire job ClassAd for all jobs
       matching the restrictions, if any. Individual attributes of the job
       ClassAd can be displayed by means of the -format option, which displays
       attributes with a printf(3) format, or with the -autoformat option.
       Multiple -format options may be specified in the option list to display
       several attributes of the job.

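       For example (a sketch; both commands print the ClusterId and ProcId
       attributes of every matching job, formatted slightly differently):

           $ condor_q -format "%d." ClusterId -format "%d\n" ProcId
           $ condor_q -af ClusterId ProcId
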
       For most output options (except as specified), the last line of
       condor_q output contains a summary of the queue: the total number of
       jobs, and the number of jobs in the completed, removed, idle, running,
       held and suspended states.

       If no output options are specified, condor_q now defaults to batch
       mode, and displays the following columns of information, with one line
       of output per batch of jobs:

           OWNER, BATCH_NAME, SUBMITTED, DONE, RUN, IDLE, [HOLD,] TOTAL,
           JOB_IDS

       Note that the HOLD column is only shown if there are held jobs in the
       output or if there are no jobs in the output.

       If the -nobatch option is specified, condor_q displays the following
       columns of information, with one line of output per job:

           ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

       If the -dag option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one line
       of output per job; the owner is shown only for top-level jobs, and for
       all other jobs (including sub-DAGs) the node name is shown:

           ID, OWNER/NODENAME, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD

       If the -run option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one line
       of output per running job:

           ID, OWNER, SUBMITTED, RUN_TIME, HOST(S)

       Also note that the -run option disables output of the totals line.

       If the -grid option is specified, condor_q displays the following
       columns of information, with one line of output per job:

           ID, OWNER, STATUS, GRID->MANAGER, HOST, GRID_JOB_ID

       If the -goodput option is specified, condor_q displays the following
       columns of information, with one line of output per job:

           ID, OWNER, SUBMITTED, RUN_TIME, GOODPUT, CPU_UTIL, Mb/s

       If the -io option is specified, condor_q displays the following
       columns of information, with one line of output per job:

           ID, OWNER, RUNS, ST, INPUT, OUTPUT, RATE, MISC

       If the -cputime option is specified (in conjunction with -nobatch),
       condor_q displays the following columns of information, with one line
       of output per job:

           ID, OWNER, SUBMITTED, CPU_TIME, ST, PRI, SIZE, CMD

       If the -hold option is specified, condor_q displays the following
       columns of information, with one line of output per job:

           ID, OWNER, HELD_SINCE, HOLD_REASON

       If the -totals option is specified, condor_q displays only one line of
       output no matter how many jobs and batches of jobs are in the queue.
       That line of output contains the total number of jobs, and the number
       of jobs in the completed, removed, idle, running, held and suspended
       states.

   Output data
       The available output data are as follows:

       ID

          (Non-batch mode only) The cluster/process id of the HTCondor job.

       OWNER

          The owner of the job or batch of jobs.

       OWNER/NODENAME

          (-dag only) The owner of a job or the DAG node name of the job.

       BATCH_NAME

          (Batch mode only) The batch name of the job or batch of jobs.

       SUBMITTED

          The month, day, hour, and minute the job was submitted to the queue.

       DONE

          (Batch mode only) The number of job procs that are done, but still
          in the queue.

       RUN

          (Batch mode only) The number of job procs that are running.

       IDLE

          (Batch mode only) The number of job procs that are in the queue but
          idle.

       HOLD

          (Batch mode only) The number of job procs that are in the queue but
          held.

       TOTAL

          (Batch mode only) The total number of job procs in the queue, unless
          the batch is a DAG, in which case this is the total number of
          clusters in the queue. Note: for non-DAG batches, the TOTAL column
          contains correct values only in version 8.5.7 and later.

       JOB_IDS

          (Batch mode only) The range of job IDs belonging to the batch.

       RUN_TIME

          (Non-batch mode only) Wall-clock time accumulated by the job to date
          in days, hours, minutes, and seconds.

       ST

          (Non-batch mode only) Current status of the job, which varies
          somewhat according to the job universe and the timing of updates.
          H = on hold, R = running, I = idle (waiting for a machine to execute
          on), C = completed, X = removed, S = suspended (execution of a
          running job temporarily suspended on execute node),
          < = transferring input (or queued to do so), and > = transferring
          output (or queued to do so).

       PRI

          (Non-batch mode only) User specified priority of the job, displayed
          as an integer, with higher numbers corresponding to better priority.

       SIZE

          (Non-batch mode only) The peak amount of memory in Mbytes consumed
          by the job; note this value is only refreshed periodically. The
          actual value reported is taken from the job ClassAd attribute
          MemoryUsage if this attribute is defined, and from job attribute
          ImageSize otherwise.

       CMD

          (Non-batch mode only) The name of the executable.

       HOST(S)

          (-run only) The host where the job is running.

       STATUS

          (-grid only) The state that HTCondor believes the job is in.
          Possible values are

          PENDING

             The job is waiting for resources to become available in order to
             run.

          ACTIVE

             The job has received resources, and the application is executing.

          FAILED

             The job terminated before completion because of an error, user-
             triggered cancel, or system-triggered cancel.

          DONE

             The job completed successfully.

          SUSPENDED

             The job has been suspended. Resources which were allocated for
             this job may have been released due to a scheduler-specific
             reason.

          UNSUBMITTED

             The job has not been submitted to the scheduler yet, pending the
             reception of the GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST
             signal from a client.

          STAGE_IN

             The job manager is staging in files, in order to run the job.

          STAGE_OUT

             The job manager is staging out files generated by the job.

          UNKNOWN

       GRID->MANAGER

          (-grid only) A guess at what remote batch system is running the
          job. It is a guess, because HTCondor looks at the Globus jobmanager
          contact string to attempt identification. If the value is fork, the
          job is running on the remote host without a jobmanager. Values may
          also be condor, lsf, or pbs.

       HOST

          (-grid only) The host to which the job was submitted.

       GRID_JOB_ID

          (-grid only) (More information needed here.)

       GOODPUT

          (-goodput only) The percentage of RUN_TIME for this job which has
          been saved in a checkpoint. A low GOODPUT value indicates that the
          job is failing to checkpoint. If a job has not yet attempted a
          checkpoint, this column contains [?????].

       CPU_UTIL

          (-goodput only) The ratio of CPU_TIME to RUN_TIME for checkpointed
          work. A low CPU_UTIL indicates that the job is not running
          efficiently, perhaps because it is I/O bound or because the job
          requires more memory than available on the remote workstations. If
          the job has not (yet) checkpointed, this column contains [??????].

       Mb/s

          (-goodput only) The network usage of this job, in Megabits per
          second of run-time.

          READ The total number of bytes the application has read from files
          and sockets.

          WRITE The total number of bytes the application has written to files
          and sockets.

          SEEK The total number of seek operations the application has
          performed on files.

          XPUT The effective throughput (average bytes read and written per
          second) from the application's point of view.

          BUFSIZE The maximum number of bytes to be buffered per file.

          BLOCKSIZE The desired block size for large data transfers. These
          fields are updated when a job produces a checkpoint or completes. If
          a job has not yet produced a checkpoint, this information is not
          available.

       INPUT

          (-io only) For standard universe, FileReadBytes; otherwise,
          BytesRecvd.

       OUTPUT

          (-io only) For standard universe, FileWriteBytes; otherwise,
          BytesSent.

       RATE

          (-io only) For standard universe, FileReadBytes+FileWriteBytes;
          otherwise, BytesRecvd+BytesSent.

       MISC

          (-io only) JobUniverse.

       CPU_TIME

          (-cputime only) The remote CPU time accumulated by the job to date
          (which has been stored in a checkpoint) in days, hours, minutes, and
          seconds. (If the job is currently running, time accumulated during
          the current run is not shown. If the job has not produced a
          checkpoint, this column contains 0+00:00:00.)

       HELD_SINCE

          (-hold only) Month, day, hour and minute at which the job was held.

       HOLD_REASON

          (-hold only) The hold reason for the job.

   Analyze
       The -analyze or -better-analyze options can be used to determine why
       certain jobs are not running by performing an analysis on a per machine
       basis for each machine in the pool. The reasons can vary among failed
       constraints, insufficient priority, resource owner preferences and
       prevention of preemption by the PREEMPTION_REQUIREMENTS expression. If
       the analyze option -verbose is specified along with the -analyze
       option, the reason for failure is displayed on a per machine basis.
       -better-analyze differs from -analyze in that it will do matchmaking
       analysis on jobs even if they are currently running, or if the reason
       they are not running is not due to matchmaking. -better-analyze also
       produces a more thorough analysis of complex Requirements and shows the
       values of relevant job ClassAd attributes. When only a single machine
       is being analyzed via -machine or -mconstraint, the values of relevant
       attributes of the machine ClassAd are also displayed.

Restrictions

       To restrict the display to jobs of interest, a list of zero or more
       restriction options may be supplied. Each restriction may be one of:

          * cluster.process, which matches jobs which belong to the specified
          cluster and have the specified process number;

          * cluster (without a process), which matches all jobs belonging to
          the specified cluster;

          * owner, which matches all jobs owned by the specified owner;

          * -constraint expression, which matches all jobs that satisfy the
          specified ClassAd expression;

          * -allusers, which overrides the default restriction of only
          matching jobs submitted by the current user.

       If cluster or cluster.process is specified, and the job matching that
       restriction is a condor_dagman job, information for all jobs of that
       DAG is displayed in batch mode (in non-batch mode, only the
       condor_dagman job itself is displayed).

       If no owner restrictions are present, the job matches the restriction
       list if it matches at least one restriction in the list. If owner
       restrictions are present, the job matches the list if it matches one of
       the owner restrictions and at least one non-owner restriction.

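       For example (the job IDs and user name here are illustrative), the
       following restrict the display to, respectively, a single job, a whole
       cluster, one user's jobs, and all jobs not in the removed state:

           $ condor_q 27.3
           $ condor_q 27
           $ condor_q jdoe
           $ condor_q -constraint 'JobStatus != 3'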

Options

       -debug

          Causes debugging information to be sent to stderr, based on the
          value of the configuration variable TOOL_DEBUG.

       -batch

          (output option) Show a single line of progress information for a
          batch of jobs, where a batch is defined as follows:

             * An entire workflow (a DAG or hierarchy of nested DAGs)

             * All jobs in a single cluster

             * All jobs submitted by a single user that have the same
             executable specified in their submit file

             * All jobs submitted by a single user that have the same batch
             name specified in their submit file or on the condor_submit or
             condor_submit_dag command line.

          This option also changes the output columns, as noted above.

          Note that, as of version 8.5.6, -batch is the default, unless the
          CONDOR_Q_DASH_BATCH_IS_DEFAULT configuration variable is set to
          False.

       -nobatch

          (output option) Show a line for each job (turn off the -batch
          option).

       -global

          (general option) Queries all job queues in the pool.

       -submitter submitter

          (general option) List jobs of a specific submitter in the entire
          pool, not just for a single condor_schedd.

       -name name

          (general option) Query only the job queue of the named condor_schedd
          daemon.

       -pool centralmanagerhostname[:portnumber]

          (general option) Use the centralmanagerhostname as the central
          manager to locate condor_schedd daemons. The default is the
          COLLECTOR_HOST, as specified in the configuration.

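          For example (a sketch; the host names are illustrative):

              $ condor_q -pool cm.example.org -name schedd.example.org

          This queries the condor_schedd named schedd.example.org, located
          via the central manager cm.example.org.
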
       -jobads file

          (general option) Display jobs from a list of ClassAds from a file,
          instead of the real ClassAds from the condor_schedd daemon. This is
          most useful for debugging purposes. The ClassAds appear as if
          condor_q -long is used with the header stripped out.

       -userlog file

          (general option) Display jobs, with job information coming from a
          job event log, instead of from the real ClassAds from the
          condor_schedd daemon. This is most useful for automated testing of
          the status of jobs known to be in the given job event log, because
          it reduces the load on the condor_schedd. A job event log does not
          contain all of the job information, so some fields in the normal
          output of condor_q will be blank.

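          For example (a sketch; the file name is illustrative, and the log
          must have been written via the log command in the jobs' submit
          file):

              $ condor_q -userlog my_jobs.log
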
       -autocluster

          (output option) Output condor_schedd daemon auto cluster
          information. For each auto cluster, output the unique ID of the
          auto cluster along with the number of jobs in that auto cluster.
          This option is intended to be used together with the -long option
          to output the ClassAds representing auto clusters. The ClassAds can
          then be used to identify or classify the demand for sets of machine
          resources, which will be useful in the on-demand creation of
          execute nodes for glidein services.

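          For example, to output one ClassAd per auto cluster:

              $ condor_q -autocluster -long
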
       -cputime

          (output option) Instead of wall-clock allocation time (RUN_TIME),
          display remote CPU time accumulated by the job to date in days,
          hours, minutes, and seconds. If the job is currently running, time
          accumulated during the current run is not shown. Note that this
          option has no effect unless used in conjunction with -nobatch.

       -currentrun

          (output option) Normally, RUN_TIME contains all the time accumulated
          during the current run plus all previous runs. If this option is
          specified, RUN_TIME only displays the time accumulated so far on
          this current run.

       -dag

          (output option) Display DAG node jobs under their DAGMan instance.
          Child nodes are listed using indentation to show the structure of
          the DAG. Note that this option has no effect unless used in
          conjunction with -nobatch.

       -expert

          (output option) Display shorter error messages.

       -grid

          (output option) Get information only about jobs submitted to grid
          resources described as gt2 or gt5.

       -goodput

          (output option) Display job goodput statistics.

       -help [Universe | State]

          (output option) Print usage info, and, optionally, additionally
          print job universes or job states.

       -hold

          (output option) Get information about jobs in the hold state. Also
          displays the time the job was placed into the hold state and the
          reason why the job was placed in the hold state.

       -limit Number

          (output option) Limit the number of items output to Number.

       -io

          (output option) Display job input/output summaries.

       -long

          (output option) Display entire job ClassAds in long format (one
          attribute per line).

       -run

          (output option) Get information about running jobs. Note that this
          option has no effect unless used in conjunction with -nobatch.

       -stream-results

          (output option) Display results as jobs are fetched from the job
          queue rather than storing results in memory until all jobs have been
          fetched. This can reduce memory consumption when fetching large
          numbers of jobs, but if condor_q is paused while displaying results,
          this could result in a timeout in communication with condor_schedd.

       -totals

          (output option) Display only the totals.

       -version

          (output option) Print the HTCondor version and exit.

       -wide

          (output option) If this option is specified, and the command portion
          of the output would cause the output to extend beyond 80 columns,
          display beyond the 80 columns.

       -xml

          (output option) Display entire job ClassAds in XML format. The XML
          format is fully defined in the reference manual, obtained from the
          ClassAds web page, with a link at
          http://htcondor.org/classad/classad.html.

       -json

          (output option) Display entire job ClassAds in JSON format.

       -attributes Attr1[,Attr2 ...]

          (output option) Explicitly list the attributes, by name in a comma
          separated list, which should be displayed when using the -xml,
          -json or -long options. Limiting the number of attributes increases
          the efficiency of the query.

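          For example, to limit a -long query to three attributes:

              $ condor_q -long -attributes Owner,ClusterId,ProcId
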
       -format fmt attr

          (output option) Display attribute or expression attr in format fmt.
          To display the attribute or expression, the format must contain a
          single printf(3)-style conversion specifier. Attributes must be
          from the job ClassAd. Expressions are ClassAd expressions and may
          refer to attributes in the job ClassAd. If the attribute is not
          present in a given ClassAd and cannot be parsed as an expression,
          then the format option will be silently skipped. %r prints the
          unevaluated, or raw, values. The conversion specifier must match
          the type of the attribute or expression. %s is suitable for strings
          such as Owner, %d for integers such as ClusterId, and %f for
          floating point numbers such as RemoteWallClockTime. %v identifies
          the type of the attribute, and then prints the value in an
          appropriate format. %V identifies the type of the attribute, and
          then prints the value in an appropriate format as it would appear
          in the -long format. As an example, strings used with %V will have
          quote marks. An incorrect format will result in undefined behavior.
          Do not use more than one conversion specifier in a given format;
          more than one conversion specifier will result in undefined
          behavior. To output multiple attributes, repeat the -format option
          once for each desired attribute. Like printf(3)-style formats, one
          may include other text that will be reproduced directly. A format
          without any conversion specifiers may be specified, but an
          attribute is still required. Include \n to specify a line break.

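          For example, %v and %V differ in how a string attribute such as
          Owner is rendered (a sketch; the outputs follow from the quoting
          rule above). The first command prints lines like jdoe, the second
          prints lines like "jdoe":

              $ condor_q -format "%v\n" Owner
              $ condor_q -format "%V\n" Owner
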
       -autoformat[:jlhVr,tng] attr1 [attr2 ...] or -af[:jlhVr,tng] attr1
       [attr2 ...]

          (output option) Display attribute(s) or expression(s) formatted in a
          default way according to attribute types. This option takes an
          arbitrary number of attribute names as arguments, and prints out
          their values, with a space between each value and a newline
          character after the last value. It is like the -format option
          without format strings. This output option does not work in
          conjunction with any of the options -run, -currentrun, -hold,
          -grid, -goodput, or -io.

          It is assumed that no attribute names begin with a dash character,
          so that the next word that begins with a dash is the start of the
          next option. The autoformat option may be followed by a colon
          character and formatting qualifiers to deviate the output
          formatting from the default:

          j print the job ID as the first field,

          l label each field,

          h print column headings before the first line of output,

          V use %V rather than %v for formatting (string values are quoted),

          r print "raw", or unevaluated, values,

          , add a comma character after each field,

          t add a tab character before each field instead of the default space
          character,

          n add a newline character after each field,

          g add a newline character between ClassAds, and suppress spaces
          before each field.

          Use -af:h to get tabular values with headings.

          Use -af:lrng to get -long equivalent format.

          The newline and comma characters may not be used together. The l and
          h characters may not be used together.

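          For example, to get a small table of attribute values with column
          headings, as suggested above:

              $ condor_q -af:h ClusterId ProcId Owner
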
       -analyze[:<qual>]

          (analyze option) Perform a matchmaking analysis on why the requested
          jobs are not running. First a simple analysis determines if the job
          is not running due to not being in a runnable state. If the job is
          in a runnable state, then this option is equivalent to
          -better-analyze.  <qual> is a comma separated list containing one or
          more of

          priority to consider user priority during the analysis

          summary to show a one line summary for each job or machine

          reverse to analyze machines, rather than jobs

       -better-analyze[:<qual>]

          (analyze option) Perform a more detailed matchmaking analysis to
          determine how many resources are available to run the requested
          jobs. This option is never meaningful for Scheduler universe jobs
          and only meaningful for grid universe jobs doing matchmaking.
          <qual> is a comma separated list containing one or more of

          priority to consider user priority during the analysis

          summary to show a one line summary for each job or machine

          reverse to analyze machines, rather than jobs

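          For example, to run the detailed analysis on a single job (the job
          ID is illustrative):

              $ condor_q -better-analyze 25764522.0
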
       -machine name

          (analyze option) When doing matchmaking analysis, analyze only
          machine ClassAds that have slot or machine names that match the
          given name.

       -mconstraint expression

          (analyze option) When doing matchmaking analysis, match only machine
          ClassAds which match the ClassAd expression constraint.

       -slotads file

          (analyze option) When doing matchmaking analysis, use the machine
          ClassAds from the file instead of the ones from the condor_collector
          daemon. This is most useful for debugging purposes. The ClassAds
          appear as if condor_status -long is used.

       -userprios file

          (analyze option) When doing matchmaking analysis with priority, read
          user priorities from the file rather than the ones from the
          condor_negotiator daemon. This is most useful for debugging purposes
          or to speed up analysis in situations where the condor_negotiator
          daemon is slow to respond to condor_userprio requests. The file
          should be in the format produced by condor_userprio -long.

       -nouserprios

          (analyze option) Do not consider user priority during the analysis.

       -reverse-analyze

          (analyze option) Analyze machine requirements against jobs.

       -verbose

          (analyze option) When doing analysis, show progress and include the
          names of specific machines in the output.

General Remarks

       The default output from condor_q is formatted to be human readable, not
       script readable. In an effort to make the output fit within 80
       characters, values in some fields might be truncated. Furthermore, the
       HTCondor Project can (and does) change the formatting of this default
       output as we see fit. Therefore, any script that is attempting to parse
       data from condor_q is strongly encouraged to use the -format option
       (described above, examples given below).

       Although -analyze provides a very good first approximation, the
       analyzer cannot diagnose all possible situations, because the analysis
       is based on instantaneous and local information. Therefore, there are
       some situations, such as when several submitters are contending for
       resources or when the pool is rapidly changing state, that cannot be
       accurately diagnosed.

       The -goodput, -cputime, and -io options are most useful for standard
       universe jobs, since they rely on values computed when a job produces
       a checkpoint.

       It is possible to hold jobs that are in the X (removed) state. To
       avoid this, construct a -constraint expression containing
       JobStatus != 3.

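       For example, a sketch using condor_hold (documented in its own man
       page) that holds all of one user's jobs except those already removed:

           $ condor_hold -constraint 'Owner == "jdoe" && JobStatus != 3'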

Examples

       The -format option provides a way to specify both the job attributes
       and formatting of those attributes. There must be only one conversion
       specification per -format option. As an example, to list only Jane
       Doe's jobs in the queue, choosing to print and format only the owner
       of the job, the command line arguments for the job, and the process ID
       of the job:

       $ condor_q -submitter jdoe -format "%s" Owner -format " %s " Args \
             -format " ProcId = %d\n" ProcId
       jdoe 16386 2800 ProcId = 0
       jdoe 16386 3000 ProcId = 1
       jdoe 16386 3200 ProcId = 2
       jdoe 16386 3400 ProcId = 3
       jdoe 16386 3600 ProcId = 4
       jdoe 16386 4200 ProcId = 7

       To display only the JobIDs of Jane Doe's jobs you can use the
       following:

       $ condor_q -submitter jdoe -format "%d." ClusterId -format "%d\n" ProcId
       27.0
       27.1
       27.2
       27.3
       27.4
       27.7

       An example that shows the analysis in summary format:

       $ condor_q -analyze:summary

       -- Submitter: submit-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> : submit-1.chtc.wisc.edu
       Analyzing matches for 5979 slots
                   Autocluster  Matches    Machine     Running  Serving
        JobId      Members/Idle   Reqmnts   Rejects  Job  Users Job Other User Avail Owner
       ---------- ------------  --------  ------------  ----------  ---------- ----- -----
       25764522.0  7/0             5910        820   7/10       5046        34 smith
       25764682.0  9/0             2172        603   9/9        1531        29 smith
       25765082.0  18/0            2172        603   18/9       1531        29 smith
       25765900.0  1/0             2172        603   1/9        1531        29 smith

       An example that shows summary information by machine:

       $ condor_q -ana:sum,rev

       -- Submitter: s-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> : s-1.chtc.wisc.edu
       Analyzing matches for 2885 jobs
                                        Slot  Slot's Req    Job's Req    Both
       Name                             Type  Matches Job   Matches Slot Match %
       ------------------------         ----  ------------  ------------ ----------
       slot1@INFO.wisc.edu              Stat          2729             0 0.00
       slot2@INFO.wisc.edu              Stat          2729             0 0.00
       slot1@aci-001.chtc.wisc.edu      Part             0          2793 0.00
       slot1_1@a-001.chtc.wisc.edu      Dyn           2644          2792 91.37
       slot1_2@a-001.chtc.wisc.edu      Dyn           2623          2601 85.10
       slot1_3@a-001.chtc.wisc.edu      Dyn           2644          2632 85.82
       slot1_4@a-001.chtc.wisc.edu      Dyn           2644          2792 91.37
       slot1@a-002.chtc.wisc.edu        Part             0          2633 0.00
       slot1_10@a-002.chtc.wisc.edu     Dyn           2623          2601 85.10

       An example with two independent DAGs in the queue:

       $ condor_q

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:35169?...
       OWNER  BATCH_NAME    SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
       wenger DAG: 3696    2/12 11:55      _     10      _     10 3698.0 ... 3707.0
       wenger DAG: 3697    2/12 11:55      1      1      1     10 3709.0 ... 3710.0

       14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended

       Note that the "13 running" in the last line is two more than the total
       of the RUN column, because the two condor_dagman jobs themselves are
       counted in the last line but not the RUN column.

       Also note that the "completed" value in the last line does not
       correspond to the total of the DONE column, because the "completed"
       value in the last line only counts jobs that are completed but still
       in the queue, whereas the DONE column counts jobs that are no longer
       in the queue.

       Here's an example with a held job, illustrating the addition of the
       HOLD column to the output:

       $ condor_q

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
       OWNER  BATCH_NAME       SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
       wenger CMD: /bin/slee  9/13 16:25      _      3      _      1      4 599.0 ...

       4 jobs; 0 completed, 0 removed, 0 idle, 3 running, 1 held, 0 suspended

       Here are some examples with a nested-DAG workflow in the queue, which
       is one of the most complicated cases. The workflow consists of a
       top-level DAG with nodes NodeA and NodeB, each consisting of a
       two-proc cluster, and a sub-DAG SubZ with nodes NodeSA and NodeSB,
       each also consisting of a two-proc cluster.

       First of all, non-batch mode with all of the node jobs in the queue:

       $ condor_q -nobatch

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
        ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
        591.0   wenger          9/13 16:05   0+00:00:13 R  0    2.4 condor_dagman -p 0
        592.0   wenger          9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
        592.1   wenger          9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
        593.0   wenger          9/13 16:05   0+00:00:07 R  0    0.0 sleep 60
        593.1   wenger          9/13 16:05   0+00:00:07 R  0    0.0 sleep 300
        594.0   wenger          9/13 16:05   0+00:00:07 R  0    2.4 condor_dagman -p 0
        595.0   wenger          9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
        595.1   wenger          9/13 16:05   0+00:00:01 R  0    0.0 sleep 300
        596.0   wenger          9/13 16:05   0+00:00:01 R  0    0.0 sleep 60
        596.1   wenger          9/13 16:05   0+00:00:01 R  0    0.0 sleep 300

       10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

       Now non-batch mode with the -dag option (unfortunately, condor_q
       doesn't do a good job of grouping procs in the same cluster together):

       $ condor_q -nobatch -dag

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
        ID      OWNER/NODENAME      SUBMITTED     RUN_TIME ST PRI SIZE CMD
        591.0   wenger             9/13 16:05   0+00:00:27 R  0    2.4 condor_dagman -
        592.0    |-NodeA           9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
        593.0    |-NodeB           9/13 16:05   0+00:00:21 R  0    0.0 sleep 60
        594.0    |-SubZ            9/13 16:05   0+00:00:21 R  0    2.4 condor_dagman -
        595.0     |-NodeSA         9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
        596.0     |-NodeSB         9/13 16:05   0+00:00:15 R  0    0.0 sleep 60
        592.1    |-NodeA           9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
        593.1    |-NodeB           9/13 16:05   0+00:00:21 R  0    0.0 sleep 300
        595.1     |-NodeSA         9/13 16:05   0+00:00:15 R  0    0.0 sleep 300
        596.1     |-NodeSB         9/13 16:05   0+00:00:15 R  0    0.0 sleep 300

       10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

       Now, finally, the batch (default) mode:

       $ condor_q

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
       OWNER  BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
       wenger ex1.dag+591   9/13 16:05      _      8      _      5 592.0 ... 596.1

       10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended

       There are several things about this output that may be slightly
       confusing:

          * The TOTAL column is less than the RUN column. This is because, for
          DAG node jobs, their contribution to the TOTAL column is the number
          of clusters, not the number of procs (but their contribution to the
          RUN column is the number of procs). So the four DAG nodes (8 procs)
          contribute 4, and the sub-DAG contributes 1, to the TOTAL column.
          (But, somewhat confusingly, the sub-DAG job is not counted in the
          RUN column.)

          * The sum of the RUN and IDLE columns (8) is less than the 10 jobs
          listed in the totals line at the bottom. This is because the
          top-level DAG and sub-DAG jobs are not counted in the RUN column,
          but they are counted in the totals line.

       Now here is non-batch mode after proc 0 of each node job has finished:

       $ condor_q -nobatch

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
        ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
        591.0   wenger          9/13 16:05   0+00:01:19 R  0    2.4 condor_dagman -p 0
        592.1   wenger          9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
        593.1   wenger          9/13 16:05   0+00:01:13 R  0    0.0 sleep 300
        594.0   wenger          9/13 16:05   0+00:01:13 R  0    2.4 condor_dagman -p 0
        595.1   wenger          9/13 16:05   0+00:01:07 R  0    0.0 sleep 300
        596.1   wenger          9/13 16:05   0+00:01:07 R  0    0.0 sleep 300

       6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

       The same state also with the -dag option:

       $ condor_q -nobatch -dag

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
        ID      OWNER/NODENAME      SUBMITTED     RUN_TIME ST PRI SIZE CMD
        591.0   wenger             9/13 16:05   0+00:01:30 R  0    2.4 condor_dagman -
        592.1    |-NodeA           9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
        593.1    |-NodeB           9/13 16:05   0+00:01:24 R  0    0.0 sleep 300
        594.0    |-SubZ            9/13 16:05   0+00:01:24 R  0    2.4 condor_dagman -
        595.1     |-NodeSA         9/13 16:05   0+00:01:18 R  0    0.0 sleep 300
        596.1     |-NodeSB         9/13 16:05   0+00:01:18 R  0    0.0 sleep 300

       6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

       And, finally, that state in batch (default) mode:

       $ condor_q

       -- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
       OWNER  BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
       wenger ex1.dag+591   9/13 16:05      _      4      _      5 592.1 ... 596.1

       6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended

Exit Status

       condor_q will exit with a status value of 0 (zero) upon success, and it
       will exit with the value 1 (one) upon failure.

Author

       Center for High Throughput Computing, University of Wisconsin-Madison

       Copyright (C) 1990-2018 Center for High Throughput Computing, Computer
       Sciences Department, University of Wisconsin-Madison, Madison, WI. All
       Rights Reserved. Licensed under the Apache License, Version 2.0.


                                     date           just-man-pages/condor_q(1)