CONDOR_DAGMAN(1)                HTCondor Manual               CONDOR_DAGMAN(1)

NAME

       condor_dagman - HTCondor Manual

       meta scheduler of the jobs submitted as the nodes of a DAG or DAGs

SYNOPSIS

       condor_dagman -f -t -l . -help

       condor_dagman -version

       condor_dagman -f -l . -csdversion VersionString [-debug level]
       [-dryrun] [-maxidle NumberOfProcs] [-maxjobs NumberOfClusters]
       [-maxpre NumberOfPreScripts] [-maxpost NumberOfPostScripts]
       [-maxhold NumberOfHoldScripts] [-usedagdir] -lockfile filename
       [-waitfordebug] [-autorescue 0|1] [-dorescuefrom number]
       [-load_save filename] [-allowversionmismatch] [-DumpRescue]
       [-verbose] [-force] [-notification value] [-suppress_notification]
       [-dont_suppress_notification] [-dagman DagmanExecutable]
       [-outfile_dir directory] [-update_submit] [-import_env]
       [-include_env Variables] [-insert_env Key=Value] [-priority number]
       [-DontAlwaysRunPost] [-AlwaysRunPost] [-DoRecovery] [-dot]
       -dag dag_file [-dag dag_file_2 ... -dag dag_file_n]

DESCRIPTION

       condor_dagman is a meta scheduler for the HTCondor jobs within a DAG
       (directed acyclic graph) or multiple DAGs. In typical usage, a
       submitter of jobs that are organized into a DAG submits the DAG using
       condor_submit_dag. condor_submit_dag does error checking on aspects of
       the DAG and then submits condor_dagman as an HTCondor job.
       condor_dagman uses log files to coordinate the further submission of
       the jobs within the DAG.

       All command line arguments to the DaemonCore library functions work for
       condor_dagman. When invoked from the command line, condor_dagman
       requires the arguments -f -l . to appear first on the command line, to
       be processed by DaemonCore. The -csdversion argument must also be
       specified; at start up, condor_dagman checks for a version mismatch
       with the condor_submit_dag version given in this argument. The -t
       argument must also be present for the -help option, so that output is
       sent to the terminal.

       Arguments to condor_dagman are either automatically set by
       condor_submit_dag or they are specified as command-line arguments to
       condor_submit_dag and passed on to condor_dagman. The method by which
       each argument is set is given in its description below.

       condor_dagman can run multiple, independent DAGs. This is done by
       specifying multiple -dag arguments. Pass multiple DAG input files as
       command-line arguments to condor_submit_dag.
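
       The argument ordering described above can be illustrated with a small
       Python sketch. It only assembles an argument list and does not run
       anything. The function name build_dagman_argv, the placeholder version
       string, and the file names are hypothetical; the flags themselves are
       the ones listed in the SYNOPSIS.

```python
from typing import List, Sequence


def build_dagman_argv(csd_version: str,
                      lockfile: str,
                      dag_files: Sequence[str],
                      debug_level: int = 3) -> List[str]:
    """Assemble a condor_dagman argument list (hypothetical helper)."""
    if not dag_files:
        raise ValueError("at least one DAG input file is required")
    if not 0 <= debug_level <= 7:
        raise ValueError("debug level must be between 0 and 7 inclusive")

    # -f -l . must appear first so that DaemonCore can process them.
    argv = ["condor_dagman", "-f", "-l", "."]
    # -csdversion and -lockfile are shown as required in the SYNOPSIS.
    argv += ["-csdversion", csd_version]
    argv += ["-debug", str(debug_level)]
    argv += ["-lockfile", lockfile]
    # One -dag argument per independent DAG.
    for dag in dag_files:
        argv += ["-dag", dag]
    return argv


if __name__ == "__main__":
    # Placeholder values, chosen only for illustration.
    print(" ".join(build_dagman_argv("example-version",
                                     "first.dag.lock",
                                     ["first.dag", "second.dag"])))
```

       In normal use this command line is generated by condor_submit_dag, so
       a helper like this would mainly be of interest for testing.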

       Debugging output may be obtained by using the -debug level option.
       The level values and the output they produce are:

       • level = 0; never produce output, except for usage info

       • level = 1; very quiet, output severe errors

       • level = 2; normal output, errors and warnings

       • level = 3; output errors, as well as all warnings

       • level = 4; internal debugging output

       • level = 5; internal debugging output; outer loop debugging

       • level = 6; internal debugging output; inner loop debugging; output
         DAG input file lines as they are parsed

       • level = 7; internal debugging output; rarely used; output DAG input
         file lines as they are parsed

OPTIONS

          -help  Display usage information and exit.

          -version
                 Display version information and exit.

          -csdversion VersionString
                 Sets the version of the condor_submit_dag command used to
                 submit the DAGMan workflow. Used to help identify version
                 mismatches.

          -debug level
                 Sets the level of debugging output. level is an integer from
                 0 to 7 inclusive, where 7 produces the most verbose output.
                 This command-line option to condor_submit_dag is passed to
                 condor_dagman; if it is not given, the level defaults to 3.

          -dryrun
                 Tells DAGMan to do a dry run, in which the DAG is run but
                 node jobs are not actually submitted.

          -maxidle NumberOfProcs
                 Sets the maximum number of idle procs allowed before
                 condor_dagman stops submitting more node jobs. If this option
                 is omitted, the number of idle procs is limited by the
                 configuration variable DAGMAN_MAX_JOBS_IDLE, which defaults
                 to 1000. To disable this limit, set NumberOfProcs to 0. The
                 NumberOfProcs limit can be exceeded if a node's job has a
                 queue command that queues more than one proc. For example,
                 queue 500 will submit all procs even if NumberOfProcs is 250.
                 In this case DAGMan will wait for the number of idle procs to
                 fall below 250 before submitting more jobs to the
                 condor_schedd.
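
                 The throttling behavior described for -maxidle can be
                 modeled roughly as follows. This is a simplified sketch and
                 not DAGMan's actual implementation: it assumes every newly
                 submitted proc starts out idle, and the function name
                 submit_with_idle_limit is hypothetical.

```python
from typing import List


def submit_with_idle_limit(cluster_sizes: List[int],
                           idle_now: int,
                           max_idle: int) -> List[int]:
    """Return the sizes of the clusters submitted before the limit is hit.

    Simplified model (an assumption, not DAGMan's code): a cluster is always
    submitted whole, so the idle count may overshoot max_idle. A max_idle of
    0 disables the limit.
    """
    submitted = []
    for size in cluster_sizes:
        # Stop submitting once the idle count has reached the limit.
        if max_idle != 0 and idle_now >= max_idle:
            break
        submitted.append(size)
        # Assumption: every proc in the new cluster begins as idle.
        idle_now += size
    return submitted


if __name__ == "__main__":
    # A 500-proc cluster is submitted whole even though the limit is 250;
    # the following cluster then has to wait.
    print(submit_with_idle_limit([500, 10], idle_now=0, max_idle=250))
```

                 In this model the second cluster stays unsubmitted until
                 enough procs leave the idle state, which matches the
                 queue 500 example above.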

          -maxjobs NumberOfClusters
                 Sets the maximum number of clusters within the DAG that will
                 be submitted to HTCondor at one time. Each cluster is
                 associated with one node job no matter how many individual
                 procs are in the cluster. NumberOfClusters is a non-negative
                 integer. If this option is omitted, the number of clusters is
                 limited by the configuration variable
                 DAGMAN_MAX_JOBS_SUBMITTED, which defaults to 0 (unlimited).

          -maxpre NumberOfPreScripts
                 Sets the maximum number of PRE scripts within the DAG that
                 may be running at one time. NumberOfPreScripts is a
                 non-negative integer. If this option is omitted, the number
                 of PRE scripts is limited by the configuration variable
                 DAGMAN_MAX_PRE_SCRIPTS, which defaults to 20.

          -maxpost NumberOfPostScripts
                 Sets the maximum number of POST scripts within the DAG that
                 may be running at one time. NumberOfPostScripts is a
                 non-negative integer. If this option is omitted, the number
                 of POST scripts is limited by the configuration variable
                 DAGMAN_MAX_POST_SCRIPTS, which defaults to 20.

          -maxhold NumberOfHoldScripts
                 Sets the maximum number of HOLD scripts within the DAG that
                 may be running at one time. NumberOfHoldScripts is a
                 non-negative integer. If this option is omitted, the number
                 of HOLD scripts is limited by the configuration variable
                 DAGMAN_MAX_HOLD_SCRIPTS, which defaults to 0 (unlimited).

          -usedagdir
                 This optional argument causes condor_dagman to run each
                 specified DAG as if the directory containing that DAG file
                 were the current working directory. This option is most
                 useful when running multiple DAGs in a single condor_dagman.

          -lockfile filename
                 Names the file created and used as a lock file. The lock file
                 prevents two instances of the same DAG, as defined by a DAG
                 input file, from executing at once. A default lock file
                 ending with the suffix .dag.lock is passed to condor_dagman
                 by condor_submit_dag.

          -waitfordebug
                 This optional argument causes condor_dagman to wait at
                 startup until someone attaches to the process with a debugger
                 and sets the wait_for_debug variable in main_init() to false.

          -autorescue 0|1
                 Whether to automatically run the newest rescue DAG for the
                 given DAG file, if one exists (0 = false, 1 = true).

          -dorescuefrom number
                 Forces condor_dagman to run the specified rescue DAG number
                 for the given DAG. A value of 0 is the same as not specifying
                 this option. Specifying a nonexistent rescue DAG is a fatal
                 error.

          -load_save filename
                 Specifies a file with saved DAG progress to re-run the DAG
                 from. If filename includes a path, DAGMan will attempt to
                 read the file at that path. Otherwise, DAGMan will check for
                 the file in the DAG's save_files sub-directory.

          -allowversionmismatch
                 This optional argument causes condor_dagman to allow a
                 version mismatch between condor_dagman itself and the
                 .condor.sub file produced by condor_submit_dag (or, in other
                 words, between condor_submit_dag and condor_dagman). WARNING!
                 This option should be used only if absolutely necessary.
                 Allowing version mismatches can cause subtle problems when
                 running DAGs.

          -DumpRescue
                 This optional argument causes condor_dagman to immediately
                 dump a Rescue DAG and then exit, as opposed to actually
                 running the DAG. This feature is mainly intended for testing.
                 The Rescue DAG file is produced whether or not there are
                 parse errors reading the original DAG input file. The name of
                 the file differs if there was a parse error.

          -verbose
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Causes condor_submit_dag to give verbose error
                 messages.

          -force (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Requires condor_submit_dag to overwrite the
                 files that it produces, if the files already exist. Note that
                 dagman.out will be appended to, not overwritten. If new-style
                 rescue DAG mode is in effect, and any new-style rescue DAGs
                 exist, the -force flag will cause them to be renamed, and the
                 original DAG will be run. If old-style rescue DAG mode is in
                 effect, any existing old-style rescue DAGs will be deleted,
                 and the original DAG will be run. See the HTCondor manual
                 section on Rescue DAGs for more information.

          -notification value
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Sets the e-mail notification for DAGMan itself.
                 This information will be used within the HTCondor submit
                 description file for DAGMan. This file is produced by
                 condor_submit_dag. The notification option is described in
                 the condor_submit manual page.

          -suppress_notification
                 Causes jobs submitted by condor_dagman to not send email
                 notification for events. The same effect can be achieved by
                 setting the configuration variable
                 DAGMAN_SUPPRESS_NOTIFICATION to True. This command line
                 option is independent of the -notification command line
                 option, which controls notification for the condor_dagman job
                 itself. This flag is generally superfluous, as
                 DAGMAN_SUPPRESS_NOTIFICATION defaults to True.

          -dont_suppress_notification
                 Causes jobs submitted by condor_dagman to defer to content
                 within the submit description file when deciding whether to
                 send email notification for events. The same effect can be
                 achieved by setting the configuration variable
                 DAGMAN_SUPPRESS_NOTIFICATION to False. This command line flag
                 is independent of the -notification command line option,
                 which controls notification for the condor_dagman job itself.
                 If both -dont_suppress_notification and
                 -suppress_notification are specified within the same command
                 line, the last argument is used.
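
                 The "last argument is used" rule can be sketched in Python as
                 below. This is only an illustration of the precedence rule,
                 not DAGMan's own argument parser; the function name
                 resolve_suppress_notification is hypothetical.

```python
from typing import Optional, Sequence


def resolve_suppress_notification(argv: Sequence[str]) -> Optional[bool]:
    """Return the setting chosen by the last notification flag in argv.

    True means suppress, False means do not suppress, and None means that
    neither flag was given, in which case the configuration variable
    DAGMAN_SUPPRESS_NOTIFICATION would decide.
    """
    result = None
    for arg in argv:
        if arg == "-suppress_notification":
            result = True
        elif arg == "-dont_suppress_notification":
            result = False
    return result


if __name__ == "__main__":
    # The later flag overrides the earlier one.
    print(resolve_suppress_notification(
        ["-suppress_notification", "-dont_suppress_notification"]))
```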

          -dagman DagmanExecutable
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Allows the specification of an alternate
                 condor_dagman executable to be used instead of the one found
                 in the user's path. This must be a fully qualified path.

          -outfile_dir directory
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) Specifies the directory in which the
                 .dagman.out file will be written. The directory may be
                 specified relative to the current working directory as
                 condor_submit_dag is executed, or specified with an absolute
                 path. Without this option, the .dagman.out file is placed in
                 the same directory as the first DAG input file listed on the
                 command line.

          -update_submit
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) This optional argument causes an existing
                 .condor.sub file to not be treated as an error; rather, the
                 .condor.sub file will be overwritten, but the existing values
                 of -maxjobs, -maxidle, -maxpre, and -maxpost will be
                 preserved.

          -import_env
                 (This argument is included only to be passed to
                 condor_submit_dag if lazy submit file generation is used for
                 nested DAGs.) This optional argument causes condor_submit_dag
                 to import the current environment into the environment
                 command of the .condor.sub file it generates.

          -include_env Variables
                 This optional argument takes a comma-separated list of
                 environment variables to add to the getenv environment filter
                 in the .condor.sub file, which causes matching environment
                 variables that are found to be added to the DAGMan manager
                 job's environment.

          -insert_env Key=Value
                 This optional argument takes a delimited string of Key=Value
                 pairs to explicitly set in the .condor.sub file's environment
                 macro. The base delimiter is a semicolon, which can be
                 overridden by setting the first character in the string to a
                 valid delimiting character. If multiple -insert_env flags
                 contain the same Key, then the last occurrence's Value will
                 be set in the DAGMan job's environment.
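
                 The delimiter and last-value-wins rules for -insert_env can
                 be sketched as follows. This is an illustration under stated
                 assumptions, not DAGMan's parser: the man page does not
                 define which characters are valid delimiters, so the sketch
                 assumes that any leading character other than a letter,
                 digit, or underscore is a custom delimiter. The function
                 name parse_insert_env is hypothetical.

```python
from typing import Dict, Iterable

DEFAULT_DELIMITER = ";"


def parse_insert_env(values: Iterable[str]) -> Dict[str, str]:
    """Merge the Key=Value strings from several -insert_env flags."""
    env: Dict[str, str] = {}
    for raw in values:
        delimiter = DEFAULT_DELIMITER
        # Assumption: a first character that cannot begin a variable name
        # is treated as a custom delimiter for the rest of the string.
        if raw and not (raw[0].isalnum() or raw[0] == "_"):
            delimiter, raw = raw[0], raw[1:]
        for pair in raw.split(delimiter):
            if not pair:
                continue  # ignore empty segments such as a trailing delimiter
            key, sep, value = pair.partition("=")
            if not sep or not key:
                raise ValueError(f"expected Key=Value, got {pair!r}")
            # A later occurrence of the same Key replaces the earlier Value.
            env[key] = value
    return env


if __name__ == "__main__":
    # The second string uses '|' as its delimiter, so its ';' is data.
    print(parse_insert_env(["A=1;B=2", "|B=x;y|C=3"]))
```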

          -priority number
                 Sets the minimum job priority of node jobs submitted and
                 running under this condor_dagman job.

          -DontAlwaysRunPost
                 This option causes condor_dagman to not run the POST script
                 of a node if the PRE script fails.

          -AlwaysRunPost
                 This option causes condor_dagman to always run the POST
                 script of a node, even if the PRE script fails.

          -DoRecovery
                 Causes condor_dagman to start in recovery mode. This means
                 that it reads the relevant job user log(s) and catches up to
                 the given DAG's previous state before submitting any new
                 jobs.

          -dot   Run condor_dagman up until the point when a DOT file is
                 produced.

          -dag filename
                 filename is the name of the DAG input file that is set as an
                 argument to condor_submit_dag, and passed to condor_dagman.

EXIT STATUS

       condor_dagman will exit with a status value of 0 (zero) upon success,
       and it will exit with the value 1 (one) upon failure.

EXAMPLES

       condor_dagman is normally not run directly, but submitted as an
       HTCondor job by running condor_submit_dag. See the condor_submit_dag
       manual page for examples.

AUTHOR

       HTCondor Team

       1990-2023, Center for High Throughput Computing, Computer Sciences
       Department, University of Wisconsin-Madison, Madison, WI, US. Licensed
       under the Apache License, Version 2.0.

                                 Oct 02, 2023                 CONDOR_DAGMAN(1)