condor_dagman(1)            General Commands Manual           condor_dagman(1)

Name

       condor_dagman - meta scheduler of the jobs submitted as the nodes of a
       DAG or DAGs

Synopsis

       condor_dagman -f -t -l . -help

       condor_dagman -version

       condor_dagman -f -l . -csdversion version_string [-debug level]
       [-maxidle NumberOfProcs] [-maxjobs NumberOfClusters]
       [-maxpre NumberOfPreScripts] [-maxpost NumberOfPostScripts]
       [-noeventchecks] [-allowlogerror] [-usedagdir] -lockfile filename
       [-waitfordebug] [-autorescue 0|1] [-dorescuefrom number]
       [-allowversionmismatch] [-DumpRescue] [-verbose] [-force]
       [-notification value] [-suppress_notification]
       [-dont_suppress_notification] [-dagman DagmanExecutable]
       [-outfile_dir directory] [-update_submit] [-import_env]
       [-priority number] [-dont_use_default_node_log] [-DontAlwaysRunPost]
       [-AlwaysRunPost] [-DoRecovery] -dag dag_file
       [-dag dag_file_2 ... -dag dag_file_n]

Description

       condor_dagman is a meta scheduler for the HTCondor jobs within a DAG
       (directed acyclic graph) or multiple DAGs. In typical usage, a
       submitter of jobs that are organized into a DAG submits the DAG using
       condor_submit_dag. condor_submit_dag does error checking on aspects of
       the DAG and then submits condor_dagman as an HTCondor job.
       condor_dagman uses log files to coordinate the further submission of
       the jobs within the DAG.

       All command line arguments to the DaemonCore library functions work
       for condor_dagman. When invoked from the command line, condor_dagman
       requires the arguments -f -l . to appear first on the command line, to
       be processed by DaemonCore. The -csdversion argument must also be
       specified; at start up, condor_dagman checks for a version mismatch
       with the condor_submit_dag version given in this argument. The -t
       argument must also be present for the -help option, so that output is
       sent to the terminal.

       Arguments to condor_dagman are either automatically set by
       condor_submit_dag or they are specified as command-line arguments to
       condor_submit_dag and passed on to condor_dagman. The method by which
       the arguments are set is given in their description below.

       condor_dagman can run multiple, independent DAGs. This is done by
       specifying multiple -dag arguments. Pass multiple DAG input files as
       command-line arguments to condor_submit_dag.

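       The ordering rules above can be illustrated with a minimal sketch
       that assembles an argument list for a direct invocation. The helper
       name build_dagman_argv, the version placeholder and the file names
       are assumptions made for illustration; they are not part of HTCondor.
       In normal use condor_submit_dag generates these arguments.

```python
from typing import List


def build_dagman_argv(csdversion: str, lockfile: str, dag_files: List[str]) -> List[str]:
    """Assemble a condor_dagman argument list (hypothetical helper)."""
    if not dag_files:
        raise ValueError("at least one DAG input file is required")
    # "-f -l ." must appear first so that DaemonCore can process them.
    argv = ["condor_dagman", "-f", "-l", "."]
    # -csdversion and -lockfile are not optional in the synopsis.
    argv += ["-csdversion", csdversion, "-lockfile", lockfile]
    # One -dag argument per independent DAG.
    for dag in dag_files:
        argv += ["-dag", dag]
    return argv


# Placeholder values; a real version string comes from condor_submit_dag.
argv = build_dagman_argv("VERSION_STRING", "first.dag.lock",
                         ["first.dag", "second.dag"])
print(" ".join(argv))
```

       The sketch only builds the list; it does not run condor_dagman.
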
       Debugging output may be obtained by using the -debug level option.
       The level values and the output they produce are:

          * level = 0; never produce output, except for usage info

          * level = 1; very quiet, output severe errors

          * level = 2; normal output, errors and warnings

          * level = 3; output errors, as well as all warnings

          * level = 4; internal debugging output

          * level = 5; internal debugging output; outer loop debugging

          * level = 6; internal debugging output; inner loop debugging; output
            DAG input file lines as they are parsed

          * level = 7; internal debugging output; rarely used; output DAG
            input file lines as they are parsed

Options

       -help

          Display usage information and exit.

       -version

          Display version information and exit.

       -debug level

          Sets the level of debugging output. level is an integer from 0 to 7
          inclusive, where 7 produces the most verbose output. This
          command-line option to condor_submit_dag is passed to
          condor_dagman; if it is omitted, the level defaults to 3.

       -maxidle NumberOfProcs

          Sets the maximum number of idle procs allowed before condor_dagman
          stops submitting more node jobs. Note that for this argument, each
          individual proc within a cluster counts as a job towards the limit,
          which is inconsistent with -maxjobs. Once idle procs start to run,
          condor_dagman will resume submitting jobs once the number of idle
          procs falls below the specified limit. NumberOfProcs is a
          non-negative integer. If this option is omitted, the number of idle
          procs is limited by the configuration variable
          DAGMAN_MAX_JOBS_IDLE, which defaults to 1000. To disable this
          limit, set NumberOfProcs to 0. Note that submit description files
          that queue multiple procs can cause the NumberOfProcs limit to be
          exceeded. Setting queue 5000 in the submit description file, where
          -maxidle is set to 250, will result in a cluster of 5000 new procs
          being submitted to the condor_schedd, not 250. In this case,
          condor_dagman will resume submitting jobs when the number of idle
          procs falls below 250.

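          The throttling rule can be modelled with a short sketch. This is
          a simplified model of the behavior described for -maxidle, not
          condor_dagman's actual code; the function names may_submit and
          submit_cluster are assumptions made for illustration.

```python
def may_submit(idle_procs: int, max_idle: int) -> bool:
    """Return True if another node job may be submitted (0 disables the limit)."""
    return max_idle == 0 or idle_procs < max_idle


def submit_cluster(idle_procs: int, procs_in_cluster: int, max_idle: int) -> int:
    """Submit a whole cluster if allowed and return the new idle proc count."""
    if may_submit(idle_procs, max_idle):
        # The cluster is submitted as a unit, so the limit can be overshot.
        return idle_procs + procs_in_cluster
    return idle_procs


idle = submit_cluster(idle_procs=0, procs_in_cluster=5000, max_idle=250)
print(idle)                    # 5000 idle procs, not 250
print(may_submit(idle, 250))   # False until idle procs fall below 250
print(may_submit(249, 250))    # True
```

          The check happens before a cluster is submitted, which is why a
          single "queue 5000" submit description file overshoots the limit.
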
       -maxjobs NumberOfClusters

          Sets the maximum number of clusters within the DAG that will be
          submitted to HTCondor at one time. Note that for this argument,
          each cluster counts as one job, no matter how many individual procs
          are in the cluster. NumberOfClusters is a non-negative integer. If
          this option is omitted, the number of clusters is limited by the
          configuration variable DAGMAN_MAX_JOBS_SUBMITTED, which defaults to
          0 (unlimited).

       -maxpre NumberOfPreScripts

          Sets the maximum number of PRE scripts within the DAG that may be
          running at one time. NumberOfPreScripts is a non-negative integer.
          If this option is omitted, the number of PRE scripts is limited by
          the configuration variable DAGMAN_MAX_PRE_SCRIPTS, which defaults
          to 20.

       -maxpost NumberOfPostScripts

          Sets the maximum number of POST scripts within the DAG that may be
          running at one time. NumberOfPostScripts is a non-negative integer.
          If this option is omitted, the number of POST scripts is limited by
          the configuration variable DAGMAN_MAX_POST_SCRIPTS, which defaults
          to 20.

       -noeventchecks

          This argument is no longer used; it is now ignored. Its
          functionality is now implemented by the DAGMAN_ALLOW_EVENTS
          configuration variable.

       -allowlogerror

          As of version 8.5.5 this argument is no longer supported, and
          setting it will generate a warning.

       -usedagdir

          This optional argument causes condor_dagman to run each specified
          DAG as if the directory containing that DAG file were the current
          working directory. This option is most useful when running multiple
          DAGs in a single condor_dagman.

       -lockfile filename

          Names the file created and used as a lock file. The lock file
          prevents two instances of the same DAG, as defined by a DAG input
          file, from executing at the same time. A default lock file ending
          with the suffix .dag.lock is passed to condor_dagman by
          condor_submit_dag.

       -waitfordebug

          This optional argument causes condor_dagman to wait at startup
          until someone attaches to the process with a debugger and sets the
          wait_for_debug variable in main_init() to false.

       -autorescue 0|1

          Specifies whether to automatically run the newest rescue DAG for
          the given DAG file, if one exists (0 = false, 1 = true).

       -dorescuefrom number

          Forces condor_dagman to run the specified rescue DAG number for the
          given DAG. A value of 0 is the same as not specifying this option.
          Specifying a nonexistent rescue DAG is a fatal error.

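          How -autorescue and -dorescuefrom combine can be sketched as
          follows. This is an illustrative model only: the function name
          choose_rescue_number is hypothetical, and the sketch assumes that
          a nonzero -dorescuefrom takes precedence over -autorescue and that
          the newest rescue DAG is the one with the highest number.

```python
from typing import Optional, Set


def choose_rescue_number(existing: Set[int], autorescue: bool,
                         dorescuefrom: int = 0) -> Optional[int]:
    """Return the rescue DAG number to run, or None to run the original DAG."""
    if dorescuefrom != 0:
        # Naming a rescue DAG that does not exist is a fatal error.
        if dorescuefrom not in existing:
            raise FileNotFoundError(f"rescue DAG {dorescuefrom} does not exist")
        return dorescuefrom
    if autorescue and existing:
        # Assumption: "newest" means the highest-numbered rescue DAG.
        return max(existing)
    return None


print(choose_rescue_number({1, 2, 3}, autorescue=True))
print(choose_rescue_number({1, 2, 3}, autorescue=True, dorescuefrom=2))
print(choose_rescue_number(set(), autorescue=True))
```
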
       -allowversionmismatch

          This optional argument causes condor_dagman to allow a version
          mismatch between condor_dagman itself and the .condor.sub file
          produced by condor_submit_dag (or, in other words, between
          condor_submit_dag and condor_dagman). WARNING! This option should
          be used only if absolutely necessary. Allowing version mismatches
          can cause subtle problems when running DAGs. (Note that, starting
          with version 7.4.0, condor_dagman no longer requires an exact
          version match between itself and the .condor.sub file. Instead, a
          "minimum compatible version" is defined, and any .condor.sub file
          of that version or newer is accepted.)

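          The "minimum compatible version" rule amounts to a numeric,
          component-wise comparison. The sketch below is an assumption-laden
          illustration: the function names are hypothetical, the version
          numbers passed in are placeholders, and the real minimum compatible
          version is defined by condor_dagman itself.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Turn a dotted version such as '8.5.4' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def sub_file_accepted(sub_file_version: str, minimum_compatible: str,
                      allow_mismatch: bool = False) -> bool:
    """Model the check of a .condor.sub file version (hypothetical helper)."""
    if allow_mismatch:
        # Corresponds to -allowversionmismatch: skip the check entirely.
        return True
    # Tuples compare numerically, so 7.10.0 is newer than 7.9.0.
    return parse_version(sub_file_version) >= parse_version(minimum_compatible)


# "7.1.2" is a placeholder minimum, used only for illustration.
print(sub_file_accepted("8.5.4", "7.1.2"))
print(sub_file_accepted("7.0.9", "7.1.2"))
print(sub_file_accepted("7.0.9", "7.1.2", allow_mismatch=True))
```
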
       -DumpRescue

          This optional argument causes condor_dagman to immediately dump a
          Rescue DAG and then exit, as opposed to actually running the DAG.
          This feature is mainly intended for testing. The Rescue DAG file is
          produced whether or not there are parse errors reading the original
          DAG input file. The name of the file differs if there was a parse
          error.

       -verbose

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) Causes
          condor_submit_dag to give verbose error messages.

       -force

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) Requires
          condor_submit_dag to overwrite the files that it produces, if the
          files already exist. Note that dagman.out will be appended to, not
          overwritten. If new-style rescue DAG mode is in effect, and any
          new-style rescue DAGs exist, the -force flag will cause them to be
          renamed, and the original DAG will be run. If old-style rescue DAG
          mode is in effect, any existing old-style rescue DAGs will be
          deleted, and the original DAG will be run. See the HTCondor manual
          section on Rescue DAGs for more information.

       -notification value

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) Sets the
          e-mail notification for DAGMan itself. This information will be
          used within the HTCondor submit description file for DAGMan. This
          file is produced by condor_submit_dag. The notification option is
          described in the condor_submit manual page.

       -suppress_notification

          Causes jobs submitted by condor_dagman to not send email
          notification for events. The same effect can be achieved by setting
          the configuration variable DAGMAN_SUPPRESS_NOTIFICATION to True.
          This command line option is independent of the -notification
          command line option, which controls notification for the
          condor_dagman job itself. This flag is generally superfluous, as
          DAGMAN_SUPPRESS_NOTIFICATION defaults to True.

       -dont_suppress_notification

          Causes jobs submitted by condor_dagman to defer to content within
          the submit description file when deciding to send email
          notification for events. The same effect can be achieved by setting
          the configuration variable DAGMAN_SUPPRESS_NOTIFICATION to False.
          This command line flag is independent of the -notification command
          line option, which controls notification for the condor_dagman job
          itself. If both -dont_suppress_notification and
          -suppress_notification are specified within the same command line,
          the last argument is used.

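          The "last argument is used" rule for the two notification flags
          can be sketched as a single left-to-right pass over the command
          line. The function name resolve_suppress_notification is
          hypothetical, and the sketch assumes that a command-line flag
          overrides the value of DAGMAN_SUPPRESS_NOTIFICATION.

```python
from typing import List


def resolve_suppress_notification(argv: List[str],
                                  config_default: bool = True) -> bool:
    """Return True if node jobs should not send email notification."""
    # Start from the configuration value (documented default: True).
    suppress = config_default
    for arg in argv:
        # Later flags overwrite earlier ones, so the last one wins.
        if arg == "-suppress_notification":
            suppress = True
        elif arg == "-dont_suppress_notification":
            suppress = False
    return suppress


print(resolve_suppress_notification(
    ["-suppress_notification", "-dont_suppress_notification"]))
print(resolve_suppress_notification(
    ["-dont_suppress_notification", "-suppress_notification"]))
```
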
       -dagman DagmanExecutable

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) Allows the
          specification of an alternate condor_dagman executable to be used
          instead of the one found in the user's path. This must be a fully
          qualified path.

       -outfile_dir directory

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) Specifies
          the directory in which the .dagman.out file will be written. The
          directory may be specified relative to the current working
          directory as condor_submit_dag is executed, or specified with an
          absolute path. Without this option, the .dagman.out file is placed
          in the same directory as the first DAG input file listed on the
          command line.

       -update_submit

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) This
          optional argument causes an existing .condor.sub file to not be
          treated as an error; rather, the .condor.sub file will be
          overwritten, but the existing values of -maxjobs, -maxidle,
          -maxpre, and -maxpost will be preserved.

       -import_env

          (This argument is included only to be passed to condor_submit_dag
          if lazy submit file generation is used for nested DAGs.) This
          optional argument causes condor_submit_dag to import the current
          environment into the environment command of the .condor.sub file it
          generates.

       -priority number

          Sets the minimum job priority of node jobs submitted and running
          under this condor_dagman job.

       -dont_use_default_node_log

          This option is disabled as of HTCondor version 8.3.1. Tells
          condor_dagman to use the file specified by the job ClassAd
          attribute UserLog to monitor job status. If this command line
          argument is used, then the job event log file cannot be defined
          with a macro.

       -DontAlwaysRunPost

          This option causes condor_dagman to not run the POST script of a
          node if the PRE script fails. (This was the default behavior prior
          to HTCondor version 7.7.2, and is again the default behavior from
          version 8.5.4 onwards.)

       -AlwaysRunPost

          This option causes condor_dagman to always run the POST script of a
          node, even if the PRE script fails. (This was the default behavior
          for HTCondor version 7.7.2 through version 8.5.3.)

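          The version history given for -DontAlwaysRunPost and
          -AlwaysRunPost can be summarized in a small sketch. The function
          name post_runs_after_pre_failure_by_default is hypothetical; the
          version boundaries are the ones stated in the two descriptions.

```python
from typing import Tuple


def post_runs_after_pre_failure_by_default(version: Tuple[int, int, int]) -> bool:
    """Return the default for the given HTCondor version (major, minor, patch).

    True means the POST script runs even if the PRE script fails.
    """
    # Always-run-POST was the default from 7.7.2 through 8.5.3 inclusive.
    return (7, 7, 2) <= version <= (8, 5, 3)


print(post_runs_after_pre_failure_by_default((7, 7, 1)))  # False
print(post_runs_after_pre_failure_by_default((8, 5, 3)))  # True
print(post_runs_after_pre_failure_by_default((8, 5, 4)))  # False
```

          Passing either flag explicitly removes any dependence on the
          default of the installed version.
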
       -DoRecovery

          Causes condor_dagman to start in recovery mode. This means that it
          reads the relevant job user log(s) and catches up to the given
          DAG's previous state before submitting any new jobs.

       -dag filename

          filename is the name of the DAG input file that is set as an
          argument to condor_submit_dag, and passed to condor_dagman.

Exit Status

       condor_dagman will exit with a status value of 0 (zero) upon success,
       and it will exit with the value 1 (one) upon failure.

Examples

       condor_dagman is normally not run directly, but submitted as an
       HTCondor job by running condor_submit_dag. See the condor_submit_dag
       manual page for examples.

Author

       Center for High Throughput Computing, University of Wisconsin-Madison

       Copyright (C) 1990-2019 Center for High Throughput Computing, Computer
       Sciences Department, University of Wisconsin-Madison, Madison, WI. All
       Rights Reserved. Licensed under the Apache License, Version 2.0.