CONDOR_DAGMAN(1)                HTCondor Manual               CONDOR_DAGMAN(1)

NAME

       condor_dagman - HTCondor Manual

       meta scheduler of the jobs submitted as the nodes of a DAG or DAGs

SYNOPSIS

       condor_dagman -f -t -l . -help

       condor_dagman -version

       condor_dagman -f -l . -csdversion version_string [-debug level]
       [-maxidle NumberOfProcs] [-maxjobs NumberOfClusters]
       [-maxpre NumberOfPreScripts] [-maxpost NumberOfPostScripts]
       [-noeventchecks] [-allowlogerror] [-usedagdir] -lockfile filename
       [-waitfordebug] [-autorescue 0|1] [-dorescuefrom number]
       [-allowversionmismatch] [-DumpRescue] [-verbose] [-force]
       [-notification value] [-suppress_notification]
       [-dont_suppress_notification] [-dagman DagmanExecutable]
       [-outfile_dir directory] [-update_submit] [-import_env]
       [-priority number] [-dont_use_default_node_log] [-DontAlwaysRunPost]
       [-AlwaysRunPost] [-DoRecovery] -dag dag_file
       [-dag dag_file_2 ... -dag dag_file_n]

DESCRIPTION

       condor_dagman is a meta scheduler for the HTCondor jobs within a DAG
       (directed acyclic graph) or multiple DAGs. In typical usage, a
       submitter of jobs that are organized into a DAG submits the DAG using
       condor_submit_dag. condor_submit_dag does error checking on aspects of
       the DAG and then submits condor_dagman as an HTCondor job.
       condor_dagman uses log files to coordinate the further submission of
       the jobs within the DAG.

       All command line arguments to the DaemonCore library functions work for
       condor_dagman. When invoked from the command line, condor_dagman
       requires the arguments -f -l . to appear first on the command line, to
       be processed by DaemonCore. The -csdversion argument must also be
       specified; at startup, condor_dagman checks for a version mismatch with
       the condor_submit_dag version given in this argument. The -t argument
       must also be present for the -help option, so that output is sent to
       the terminal.

       Arguments to condor_dagman are either set automatically by
       condor_submit_dag or specified as command-line arguments to
       condor_submit_dag and passed on to condor_dagman. The method by which
       each argument is set is given in its description below.

       condor_dagman can run multiple, independent DAGs. This is done by
       specifying multiple -dag arguments. Pass multiple DAG input files as
       command-line arguments to condor_submit_dag.
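
       The invocation rules above can be sketched in Python. This is only an
       illustration, not the logic condor_submit_dag actually uses: the helper
       name build_dagman_argv, the version string placeholder, and the file
       names are hypothetical. Only the option names and their order are taken
       from the synopsis.

```python
from typing import List


def build_dagman_argv(csd_version: str, lockfile: str,
                      dag_files: List[str]) -> List[str]:
    """Assemble a condor_dagman argument list (hypothetical helper).

    Follows the synopsis: -f -l . come first for DaemonCore, then
    -csdversion, -lockfile, and one -dag argument per DAG input file.
    """
    if not dag_files:
        raise ValueError("at least one DAG input file is required")
    argv = ["condor_dagman", "-f", "-l", ".",
            "-csdversion", csd_version, "-lockfile", lockfile]
    for dag in dag_files:
        argv += ["-dag", dag]  # repeat -dag once per independent DAG
    return argv


# Placeholder values; a real version string is supplied by condor_submit_dag.
argv = build_dagman_argv("VERSION_STRING", "first.dag.lock",
                         ["first.dag", "second.dag"])
print(" ".join(argv))
```

       In normal use the same effect is obtained by passing both DAG input
       files to condor_submit_dag, which generates the condor_dagman command
       line itself.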

       Debugging output may be obtained by using the -debug level option. The
       level values and the output they produce are as follows:

       • level = 0; never produce output, except for usage info

       • level = 1; very quiet, output severe errors

       • level = 2; normal output, errors and warnings

       • level = 3; output errors, as well as all warnings

       • level = 4; internal debugging output

       • level = 5; internal debugging output; outer loop debugging

       • level = 6; internal debugging output; inner loop debugging; output
         DAG input file lines as they are parsed

       • level = 7; internal debugging output; rarely used; output DAG input
         file lines as they are parsed

OPTIONS

          -help  Display usage information and exit.

          -version
                 Display version information and exit.

          -debug level
                 An integer level of debugging output. level is an integer
                 with values of 0-7 inclusive, where 7 is the most verbose
                 output. This command-line option to condor_submit_dag is
                 passed to condor_dagman or defaults to the value 3.

          -maxidle NumberOfProcs
                 Sets the maximum number of idle procs allowed before
                 condor_dagman stops submitting more node jobs. Note that for
                 this argument, each individual proc within a cluster counts
                 towards the limit, which is inconsistent with -maxjobs. Once
                 idle procs start to run, condor_dagman resumes submitting
                 jobs when the number of idle procs falls below the specified
                 limit. NumberOfProcs is a non-negative integer. If this
                 option is omitted, the number of idle procs is limited by the
                 configuration variable DAGMAN_MAX_JOBS_IDLE (see
                 admin-manual/configuration-macros:configuration file entries
                 for dagman), which defaults to 1000. To disable this limit,
                 set NumberOfProcs to 0. Note that submit description files
                 that queue multiple procs can cause the NumberOfProcs limit
                 to be exceeded. Setting queue 5000 in the submit description
                 file, where -maxidle is set to 250, will result in a cluster
                 of 5000 new procs being submitted to the condor_schedd, not
                 250. In this case, condor_dagman will resume submitting jobs
                 when the number of idle procs falls below 250.
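
                 The overshoot described above can be illustrated with a toy
                 model in Python. It is a sketch of the documented behavior
                 only, not DAGMan's implementation; the function name and the
                 assumption that every newly submitted proc starts out idle
                 are illustrative.

```python
from typing import List, Tuple


def submit_while_under_limit(idle_procs: int, max_idle: int,
                             pending: List[int]) -> Tuple[int, List[int], List[int]]:
    """Toy model of the -maxidle throttle (not DAGMan's real code).

    pending holds the proc count of each node job's cluster, in submit
    order. A cluster is always submitted whole, so the idle count can
    overshoot max_idle. max_idle == 0 disables the limit.
    """
    queue = list(pending)
    submitted = []
    while queue and (max_idle == 0 or idle_procs < max_idle):
        size = queue.pop(0)
        idle_procs += size  # assumption: every new proc starts idle
        submitted.append(size)
    return idle_procs, submitted, queue


# "queue 5000" with -maxidle 250: the whole cluster is submitted at once.
idle, submitted, remaining = submit_while_under_limit(0, 250, [5000, 10])
print(idle, submitted, remaining)  # 5000 [5000] [10]

# Later, once fewer than 250 procs are idle, submission resumes.
idle, submitted, remaining = submit_while_under_limit(249, 250, remaining)
print(idle, submitted, remaining)  # 259 [10] []
```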

          -maxjobs NumberOfClusters
                 Sets the maximum number of clusters within the DAG that will
                 be submitted to HTCondor at one time. Note that for this
                 argument, each cluster counts as one job, no matter how many
                 individual procs are in the cluster. NumberOfClusters is a
                 non-negative integer. If this option is omitted, the number
                 of clusters is limited by the configuration variable
                 DAGMAN_MAX_JOBS_SUBMITTED (see
                 admin-manual/configuration-macros:configuration file entries
                 for dagman), which defaults to 0 (unlimited).

          -maxpre NumberOfPreScripts
                 Sets the maximum number of PRE scripts within the DAG that
                 may be running at one time. NumberOfPreScripts is a
                 non-negative integer. If this option is omitted, the number
                 of PRE scripts is limited by the configuration variable
                 DAGMAN_MAX_PRE_SCRIPTS (see
                 admin-manual/configuration-macros:configuration file entries
                 for dagman), which defaults to 20.

          -maxpost NumberOfPostScripts
                 Sets the maximum number of POST scripts within the DAG that
                 may be running at one time. NumberOfPostScripts is a
                 non-negative integer. If this option is omitted, the number
                 of POST scripts is limited by the configuration variable
                 DAGMAN_MAX_POST_SCRIPTS (see
                 admin-manual/configuration-macros:configuration file entries
                 for dagman), which defaults to 20.

          -noeventchecks
                 This argument is no longer used; it is now ignored. Its
                 functionality is now implemented by the DAGMAN_ALLOW_EVENTS
                 configuration variable.

          -allowlogerror
                 As of version 8.5.5 this argument is no longer supported, and
                 setting it will generate a warning.

          -usedagdir
                 This optional argument causes condor_dagman to run each
                 specified DAG as if the directory containing that DAG file
                 were the current working directory. This option is most
                 useful when running multiple DAGs in a single condor_dagman.

          -lockfile filename
                 Names the file created and used as a lock file. The lock file
                 prevents two instances of the same DAG, as defined by a DAG
                 input file, from executing at the same time. A default lock
                 file ending with the suffix .dag.lock is passed to
                 condor_dagman by condor_submit_dag.

          -waitfordebug
                 This optional argument causes condor_dagman to wait at
                 startup until someone attaches to the process with a debugger
                 and sets the wait_for_debug variable in main_init() to false.

          -autorescue 0|1
                 Whether to automatically run the newest rescue DAG for the
                 given DAG file, if one exists (0 = false, 1 = true).

          -dorescuefrom number
                 Forces condor_dagman to run the specified rescue DAG number
                 for the given DAG. A value of 0 is the same as not specifying
                 this option. Specifying a nonexistent rescue DAG is a fatal
                 error.
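
                 How -autorescue and -dorescuefrom might combine can be
                 sketched in Python. This is an illustration under stated
                 assumptions, not DAGMan's code: the function name is
                 hypothetical, rescue DAGs are assumed to be identified by
                 positive integers with the highest number being the newest,
                 and -dorescuefrom is assumed to take precedence because it
                 "forces" a specific rescue DAG.

```python
from typing import Iterable, Optional


def choose_rescue_dag(existing: Iterable[int], autorescue: bool,
                      dorescuefrom: int = 0) -> Optional[int]:
    """Pick which rescue DAG number to run, or None for the original DAG.

    Illustrative only. Assumes the highest existing number is the newest
    rescue DAG and that a non-zero dorescuefrom overrides autorescue.
    """
    numbers = set(existing)
    if dorescuefrom != 0:
        if dorescuefrom not in numbers:
            # The text above calls a nonexistent rescue DAG a fatal error.
            raise FileNotFoundError(f"rescue DAG {dorescuefrom} does not exist")
        return dorescuefrom
    if autorescue and numbers:
        return max(numbers)
    return None


print(choose_rescue_dag([1, 2, 3], autorescue=True))                  # 3
print(choose_rescue_dag([1, 2, 3], autorescue=True, dorescuefrom=2))  # 2
print(choose_rescue_dag([1, 2, 3], autorescue=False))                 # None
```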

          -allowversionmismatch
                 This optional argument causes condor_dagman to allow a
                 version mismatch between condor_dagman itself and the
                 .condor.sub file produced by condor_submit_dag (or, in other
                 words, between condor_submit_dag and condor_dagman). WARNING!
                 This option should be used only if absolutely necessary.
                 Allowing version mismatches can cause subtle problems when
                 running DAGs. (Note that, starting with version 7.4.0,
                 condor_dagman no longer requires an exact version match
                 between itself and the .condor.sub file. Instead, a "minimum
                 compatible version" is defined, and any .condor.sub file of
                 that version or newer is accepted.)

          -DumpRescue
                 This optional argument causes condor_dagman to immediately
                 dump a Rescue DAG and then exit, as opposed to actually
                 running the DAG. This feature is mainly intended for testing.
                 The Rescue DAG file is produced whether or not there are
                 parse errors reading the original DAG input file. The name of
                 the file differs if there was a parse error.

          -verbose
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Causes condor_submit_dag to give verbose error
                 messages.

          -force (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Requires condor_submit_dag to overwrite the
                 files that it produces, if the files already exist. Note that
                 dagman.out will be appended to, not overwritten. If new-style
                 rescue DAG mode is in effect, and any new-style rescue DAGs
                 exist, the -force flag will cause them to be renamed, and the
                 original DAG will be run. If old-style rescue DAG mode is in
                 effect, any existing old-style rescue DAGs will be deleted,
                 and the original DAG will be run. See the HTCondor manual
                 section on Rescue DAGs for more information.

          -notification value
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Sets the e-mail notification for DAGMan itself.
                 This information will be used within the HTCondor submit
                 description file for DAGMan. This file is produced by
                 condor_submit_dag. The notification option is described in
                 the condor_submit manual page.

          -suppress_notification
                 Causes jobs submitted by condor_dagman to not send email
                 notification for events. The same effect can be achieved by
                 setting the configuration variable
                 DAGMAN_SUPPRESS_NOTIFICATION to True. This command line
                 option is independent of the -notification command line
                 option, which controls notification for the condor_dagman job
                 itself. This flag is generally superfluous, as
                 DAGMAN_SUPPRESS_NOTIFICATION defaults to True.

          -dont_suppress_notification
                 Causes jobs submitted by condor_dagman to defer to content
                 within the submit description file when deciding to send
                 email notification for events. The same effect can be
                 achieved by setting the configuration variable
                 DAGMAN_SUPPRESS_NOTIFICATION to False. This command line flag
                 is independent of the -notification command line option,
                 which controls notification for the condor_dagman job itself.
                 If both -dont_suppress_notification and
                 -suppress_notification are specified within the same command
                 line, the last argument is used.
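
                 The "last argument is used" rule can be sketched as a simple
                 left-to-right scan. The function below is a hypothetical
                 illustration in Python, not DAGMan's argument parser;
                 config_default stands in for the value of
                 DAGMAN_SUPPRESS_NOTIFICATION.

```python
from typing import Sequence


def resolve_suppress_notification(argv: Sequence[str],
                                  config_default: bool = True) -> bool:
    """Return True if node-job email notification is suppressed.

    Illustrative only: scans left to right so the last of the two flags
    wins; config_default stands in for DAGMAN_SUPPRESS_NOTIFICATION.
    """
    suppress = config_default
    for arg in argv:
        if arg == "-suppress_notification":
            suppress = True
        elif arg == "-dont_suppress_notification":
            suppress = False
    return suppress


print(resolve_suppress_notification(
    ["-suppress_notification", "-dont_suppress_notification"]))  # False
print(resolve_suppress_notification([]))                         # True
```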

          -dagman DagmanExecutable
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Allows the specification of an alternate
                 condor_dagman executable to be used instead of the one found
                 in the user's path. This must be a fully qualified path.

          -outfile_dir directory
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Specifies the directory in which the
                 .dagman.out file will be written. The directory may be
                 specified relative to the current working directory as
                 condor_submit_dag is executed, or specified with an absolute
                 path. Without this option, the .dagman.out file is placed in
                 the same directory as the first DAG input file listed on the
                 command line.

          -update_submit
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) This optional argument causes an existing
                 .condor.sub file to not be treated as an error; rather, the
                 .condor.sub file will be overwritten, but the existing values
                 of -maxjobs, -maxidle, -maxpre, and -maxpost will be
                 preserved.

          -import_env
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) This optional argument causes condor_submit_dag
                 to import the current environment into the environment
                 command of the .condor.sub file it generates.

          -priority number
                 Sets the minimum job priority of node jobs submitted and
                 running under this condor_dagman job.

          -dont_use_default_node_log
                 This option is disabled as of HTCondor version 8.3.1. Tells
                 condor_dagman to use the file specified by the job ClassAd
                 attribute UserLog to monitor job status. If this command line
                 argument is used, then the job event log file cannot be
                 defined with a macro.

          -DontAlwaysRunPost
                 This option causes condor_dagman to not run the POST script
                 of a node if the PRE script fails. (This was the default
                 behavior prior to HTCondor version 7.7.2, and is again the
                 default behavior from version 8.5.4 onwards.)

          -AlwaysRunPost
                 This option causes condor_dagman to always run the POST
                 script of a node, even if the PRE script fails. (This was the
                 default behavior for HTCondor version 7.7.2 through version
                 8.5.3.)
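
                 The difference between -DontAlwaysRunPost and -AlwaysRunPost
                 can be summarized as a small decision function. This Python
                 sketch is illustrative only: the function name is
                 hypothetical, and it assumes the POST script runs as usual
                 whenever the PRE script succeeds, a case the text above does
                 not describe.

```python
def should_run_post(pre_script_failed: bool, always_run_post: bool) -> bool:
    """Decide whether a node's POST script runs (illustrative only).

    always_run_post=True models -AlwaysRunPost; False models
    -DontAlwaysRunPost. Only the PRE-script failure case comes from the
    option descriptions; the other branch is an assumption.
    """
    if pre_script_failed:
        return always_run_post
    return True  # assumption: POST runs normally when PRE succeeded


print(should_run_post(pre_script_failed=True, always_run_post=False))  # False
print(should_run_post(pre_script_failed=True, always_run_post=True))   # True
```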

          -DoRecovery
                 Causes condor_dagman to start in recovery mode. This means
                 that it reads the relevant job user log(s) and catches up to
                 the given DAG's previous state before submitting any new
                 jobs.

          -dag filename
                 filename is the name of the DAG input file that is set as an
                 argument to condor_submit_dag, and passed to condor_dagman.

EXIT STATUS

       condor_dagman will exit with a status value of 0 (zero) upon success,
       and it will exit with the value 1 (one) upon failure.
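
       A caller that runs a command and inspects its exit status could apply
       this rule as follows. The Python sketch below is an illustration with
       hypothetical helper names; it uses stand-in commands that exit with 0
       and 1 so that it runs without an HTCondor installation.

```python
import subprocess
import sys
from typing import Sequence


def dagman_succeeded(returncode: int) -> bool:
    """Map an exit status to success (0) or failure (non-zero)."""
    return returncode == 0


def run_and_report(argv: Sequence[str]) -> bool:
    """Run a command and report success using the rule above."""
    completed = subprocess.run(list(argv))
    return dagman_succeeded(completed.returncode)


# Stand-in commands, not condor_dagman itself.
print(run_and_report([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(run_and_report([sys.executable, "-c", "raise SystemExit(1)"]))  # False
```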

EXAMPLES

       condor_dagman is normally not run directly, but submitted as an
       HTCondor job by running condor_submit_dag. See the condor_submit_dag
       manual page for examples.

AUTHOR

       HTCondor Team

COPYRIGHT

       1990-2021, Center for High Throughput Computing, Computer Sciences
       Department, University of Wisconsin-Madison, Madison, WI, US. Licensed
       under the Apache License, Version 2.0.

8.8                              Aug 23, 2021                 CONDOR_DAGMAN(1)