just-man-pages/condor_dagman(1)  General Commands Manual  just-man-pages/condor_dagman(1)

Name

condor_dagman - meta scheduler of the jobs submitted as the nodes of a
DAG or DAGs

Synopsis

condor_dagman -f -t -l . -help

condor_dagman -version

condor_dagman -f -l . -csdversion version_string [ -debug level ]
[ -maxidle NumberOfProcs ] [ -maxjobs NumberOfClusters ]
[ -maxpre NumberOfPreScripts ] [ -maxpost NumberOfPostScripts ]
[ -noeventchecks ] [ -allowlogerror ] [ -usedagdir ] -lockfile filename
[ -waitfordebug ] [ -autorescue 0|1 ] [ -dorescuefrom number ]
[ -allowversionmismatch ] [ -DumpRescue ] [ -verbose ] [ -force ]
[ -notification value ] [ -suppress_notification ]
[ -dont_suppress_notification ] [ -dagman DagmanExecutable ]
[ -outfile_dir directory ] [ -update_submit ] [ -import_env ]
[ -priority number ] [ -dont_use_default_node_log ]
[ -DontAlwaysRunPost ] [ -AlwaysRunPost ] [ -DoRecovery ] -dag dag_file
[ -dag dag_file_2 ... -dag dag_file_n ]

Description

condor_dagman is a meta scheduler for the HTCondor jobs within a DAG
(directed acyclic graph) (or multiple DAGs). In typical usage, a
submitter of jobs that are organized into a DAG submits the DAG using
condor_submit_dag. condor_submit_dag does error checking on aspects of
the DAG and then submits condor_dagman as an HTCondor job. condor_dagman
uses log files to coordinate the further submission of the jobs within
the DAG.

All command line arguments to the DaemonCore library functions work for
condor_dagman. When invoked from the command line, condor_dagman
requires the arguments -f -l . to appear first on the command line, to
be processed by DaemonCore. The -csdversion argument must also be
specified; at start up, condor_dagman checks for a version mismatch with
the condor_submit_dag version in this argument. The -t argument must
also be present for the -help option, so that output is sent to the
terminal.

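The argument ordering described above can be sketched as a small helper
that assembles a command line. This is only an illustration: the
function name build_dagman_argv, the placeholder version string, and the
file names are hypothetical, and in normal use condor_submit_dag
generates these arguments itself.

```python
def build_dagman_argv(csd_version, dag_files, lockfile, extra_args=()):
    """Assemble a condor_dagman argument list in the order described above.

    Hypothetical helper for illustration; not part of HTCondor.
    """
    if not dag_files:
        raise ValueError("at least one DAG input file is required")
    # The DaemonCore arguments -f -l . must appear first.
    argv = ["condor_dagman", "-f", "-l", "."]
    # The condor_submit_dag version string, used for the mismatch check.
    argv += ["-csdversion", csd_version]
    # Optional arguments such as -debug or -maxidle go here.
    argv += list(extra_args)
    argv += ["-lockfile", lockfile]
    # One -dag argument per DAG input file.
    for dag in dag_files:
        argv += ["-dag", dag]
    return argv


# Placeholder values; a real version string comes from condor_submit_dag.
argv = build_dagman_argv(
    "VERSION_STRING", ["first.dag", "second.dag"], "first.dag.lock",
    extra_args=["-debug", "3"],
)
```

Keeping the DaemonCore arguments in a fixed prefix makes it hard to
place them after other options by accident.
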
Arguments to condor_dagman are either automatically set by
condor_submit_dag or they are specified as command-line arguments to
condor_submit_dag and passed on to condor_dagman. The method by which
the arguments are set is given in their description below.

condor_dagman can run multiple, independent DAGs. This is done by
specifying multiple -dag arguments. Pass multiple DAG input files as
command-line arguments to condor_submit_dag.

Debugging output may be obtained by using the -debug level option. The
level values and the output they produce are:

* level = 0; never produce output, except for usage info

* level = 1; very quiet, output severe errors

* level = 2; normal output, errors and warnings

* level = 3; output errors, as well as all warnings

* level = 4; internal debugging output

* level = 5; internal debugging output; outer loop debugging

* level = 6; internal debugging output; inner loop debugging; output
  DAG input file lines as they are parsed

* level = 7; internal debugging output; rarely used; output DAG input
  file lines as they are parsed

Options

-help

Display usage information and exit.

-version

Display version information and exit.

-debug level

Sets the level of debugging output. level is an integer from 0 to 7
inclusive, where 7 produces the most verbose output. This command-line
option to condor_submit_dag is passed to condor_dagman or defaults to
the value 3.

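The range and default of the -debug level can be sketched as a small
validation helper. The function name resolve_debug_level is
hypothetical, and rejecting out-of-range values is an assumption of this
sketch; the text above does not say how condor_dagman itself reacts to
them.

```python
DEFAULT_DEBUG_LEVEL = 3  # default stated in the -debug description


def resolve_debug_level(value=None):
    """Return a debug level in 0..7, or the default if none was given.

    Hypothetical helper; raising on out-of-range input is an assumption.
    """
    if value is None:
        return DEFAULT_DEBUG_LEVEL
    level = int(value)
    if not 0 <= level <= 7:
        raise ValueError("debug level must be between 0 and 7 inclusive")
    return level
```
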
-maxidle NumberOfProcs

Sets the maximum number of idle procs allowed before condor_dagman stops
submitting more node jobs. Note that for this argument, each individual
proc within a cluster counts as one towards the limit, which is
inconsistent with -maxjobs. After idle procs start to run, condor_dagman
resumes submitting jobs once the number of idle procs falls below the
specified limit. NumberOfProcs is a non-negative integer. If this option
is omitted, the number of idle procs is limited by the configuration
variable DAGMAN_MAX_JOBS_IDLE, which defaults to 1000. To disable this
limit, set NumberOfProcs to 0. Note that submit description files that
queue multiple procs can cause the NumberOfProcs limit to be exceeded.
For example, a submit description file containing queue 5000, run with
-maxidle set to 250, results in a cluster of 5000 new procs being
submitted to the condor_schedd, not 250. In this case, condor_dagman
will resume submitting jobs when the number of idle procs falls below
250.

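The throttling rule above, including the way a single large cluster can
overshoot the limit, can be modeled in a few lines. This is a simplified
sketch and not DAGMan's implementation: the function names are
hypothetical, and it assumes that no proc starts running during the
simulation.

```python
def may_submit(idle_procs, max_idle):
    """True if another node job may be submitted; 0 disables the limit."""
    return max_idle == 0 or idle_procs < max_idle


def submit_while_allowed(cluster_sizes, max_idle, idle_procs=0):
    """Submit whole clusters in order until the idle limit blocks.

    Returns (clusters submitted, idle procs afterwards). Assumes every
    submitted proc stays idle, which is a simplification.
    """
    submitted = 0
    for size in cluster_sizes:
        if not may_submit(idle_procs, max_idle):
            break
        # A cluster is submitted as a whole, so the limit can be exceeded.
        idle_procs += size
        submitted += 1
    return submitted, idle_procs


# One node queues 5000 procs, the next queues 10; -maxidle is 250.
result = submit_while_allowed([5000, 10], max_idle=250)
```

Here the first cluster is submitted in full because no procs were idle
beforehand, and the second node job waits until the idle count drops
below 250.
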
-maxjobs NumberOfClusters

Sets the maximum number of clusters within the DAG that will be
submitted to HTCondor at one time. Note that for this argument, each
cluster counts as one job, no matter how many individual procs are in
the cluster. NumberOfClusters is a non-negative integer. If this option
is omitted, the number of clusters is limited by the configuration
variable DAGMAN_MAX_JOBS_SUBMITTED, which defaults to 0 (unlimited).

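The difference in counting between -maxjobs (clusters) and -maxidle
(procs) can be made concrete with a short sketch. The function names are
hypothetical. Treating a -maxjobs value of 0 as unlimited is an
assumption carried over from the documented default of
DAGMAN_MAX_JOBS_SUBMITTED.

```python
def counts_toward_limits(cluster_sizes):
    """Return (count toward -maxjobs, count toward -maxidle).

    cluster_sizes lists the number of procs in each submitted cluster;
    the second value assumes every proc is currently idle.
    """
    return len(cluster_sizes), sum(cluster_sizes)


def may_submit_cluster(submitted_clusters, max_jobs):
    """True if another cluster may be submitted (0 assumed unlimited)."""
    return max_jobs == 0 or submitted_clusters < max_jobs


# Three clusters holding 1, 20, and 5000 procs.
jobs_count, procs_count = counts_toward_limits([1, 20, 5000])
```
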
-maxpre NumberOfPreScripts

Sets the maximum number of PRE scripts within the DAG that may be
running at one time. NumberOfPreScripts is a non-negative integer. If
this option is omitted, the number of PRE scripts is limited by the
configuration variable DAGMAN_MAX_PRE_SCRIPTS, which defaults to 20.

-maxpost NumberOfPostScripts

Sets the maximum number of POST scripts within the DAG that may be
running at one time. NumberOfPostScripts is a non-negative integer. If
this option is omitted, the number of POST scripts is limited by the
configuration variable DAGMAN_MAX_POST_SCRIPTS, which defaults to 20.

-noeventchecks

This argument is no longer used; it is now ignored. Its functionality is
now implemented by the DAGMAN_ALLOW_EVENTS configuration variable.

-allowlogerror

As of version 8.5.5 this argument is no longer supported, and setting it
will generate a warning.

-usedagdir

This optional argument causes condor_dagman to run each specified DAG as
if the directory containing that DAG file were the current working
directory. This option is most useful when running multiple DAGs in a
single condor_dagman.

-lockfile filename

Names the file created and used as a lock file. The lock file prevents
execution of two of the same DAG, as defined by a DAG input file. A
default lock file ending with the suffix .dag.lock is passed to
condor_dagman by condor_submit_dag.

-waitfordebug

This optional argument causes condor_dagman to wait at startup until
someone attaches to the process with a debugger and sets the
wait_for_debug variable in main_init() to false.

-autorescue 0|1

Whether to automatically run the newest rescue DAG for the given DAG
file, if one exists (0 = false, 1 = true).

-dorescuefrom number

Forces condor_dagman to run the specified rescue DAG number for the
given DAG. A value of 0 is the same as not specifying this option.
Specifying a nonexistent rescue DAG is a fatal error.

-allowversionmismatch

This optional argument causes condor_dagman to allow a version mismatch
between condor_dagman itself and the .condor.sub file produced by
condor_submit_dag (or, in other words, between condor_submit_dag and
condor_dagman). WARNING! This option should be used only if absolutely
necessary. Allowing version mismatches can cause subtle problems when
running DAGs. (Note that, starting with version 7.4.0, condor_dagman no
longer requires an exact version match between itself and the
.condor.sub file. Instead, a "minimum compatible version" is defined,
and any .condor.sub file of that version or newer is accepted.)

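The "minimum compatible version" rule can be sketched as a numeric
comparison of dotted version strings. This is an illustration only: the
function names are hypothetical, the actual minimum compatible version
is not stated here and is passed in as a parameter, and the real check
inside condor_dagman may differ in detail.

```python
def parse_version(text):
    """Turn a dotted version such as '8.5.4' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_acceptable(sub_file_version, minimum_compatible, allow_mismatch=False):
    """Decide whether a .condor.sub file version would be accepted.

    allow_mismatch models -allowversionmismatch and skips the check.
    """
    if allow_mismatch:
        return True
    # Tuples compare element by element, so 8.10.0 is newer than 8.9.0.
    return parse_version(sub_file_version) >= parse_version(minimum_compatible)
```

Comparing integer tuples rather than raw strings avoids ranking 8.10.0
below 8.9.0.
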
-DumpRescue

This optional argument causes condor_dagman to immediately dump a Rescue
DAG and then exit, as opposed to actually running the DAG. This feature
is mainly intended for testing. The Rescue DAG file is produced whether
or not there are parse errors reading the original DAG input file. The
name of the file differs if there was a parse error.

-verbose

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Cause
condor_submit_dag to give verbose error messages.

-force

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Require
condor_submit_dag to overwrite the files that it produces, if the files
already exist. Note that dagman.out will be appended to, not
overwritten. If new-style rescue DAG mode is in effect, and any
new-style rescue DAGs exist, the -force flag will cause them to be
renamed, and the original DAG will be run. If old-style rescue DAG mode
is in effect, any existing old-style rescue DAGs will be deleted, and
the original DAG will be run. See the HTCondor manual section on Rescue
DAGs for more information.

-notification value

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Sets the e-mail
notification for DAGMan itself. This information will be used within the
HTCondor submit description file for DAGMan. This file is produced by
condor_submit_dag. The notification option is described in the
condor_submit manual page.

-suppress_notification

Causes jobs submitted by condor_dagman to not send email notification
for events. The same effect can be achieved by setting the configuration
variable DAGMAN_SUPPRESS_NOTIFICATION to True. This command line option
is independent of the -notification command line option, which controls
notification for the condor_dagman job itself. This flag is generally
superfluous, as DAGMAN_SUPPRESS_NOTIFICATION defaults to True.

-dont_suppress_notification

Causes jobs submitted by condor_dagman to defer to content within the
submit description file when deciding whether to send email notification
for events. The same effect can be achieved by setting the configuration
variable DAGMAN_SUPPRESS_NOTIFICATION to False. This command line flag
is independent of the -notification command line option, which controls
notification for the condor_dagman job itself. If both
-dont_suppress_notification and -suppress_notification are specified
within the same command line, the last argument is used.

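The "last argument is used" rule for the two notification flags can be
sketched as a single pass over the argument list. The function name is
hypothetical, and letting a command-line flag override the configured
DAGMAN_SUPPRESS_NOTIFICATION value is an assumption of this sketch.

```python
def suppress_notification(argv, config_default=True):
    """Return True if node job email notification should be suppressed.

    config_default stands in for DAGMAN_SUPPRESS_NOTIFICATION, which
    defaults to True. The last of the two flags on the command line wins.
    """
    result = config_default
    for arg in argv:
        if arg == "-suppress_notification":
            result = True
        elif arg == "-dont_suppress_notification":
            result = False
    return result
```
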
-dagman DagmanExecutable

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Allows the
specification of an alternate condor_dagman executable to be used
instead of the one found in the user's path. This must be a fully
qualified path.

-outfile_dir directory

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Specifies the
directory in which the .dagman.out file will be written. The directory
may be specified relative to the current working directory as
condor_submit_dag is executed, or specified with an absolute path.
Without this option, the .dagman.out file is placed in the same
directory as the first DAG input file listed on the command line.

-update_submit

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) This optional
argument causes an existing .condor.sub file to not be treated as an
error; rather, the .condor.sub file will be overwritten, but the
existing values of -maxjobs, -maxidle, -maxpre, and -maxpost will be
preserved.

-import_env

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) This optional
argument causes condor_submit_dag to import the current environment into
the environment command of the .condor.sub file it generates.

-priority number

Sets the minimum job priority of node jobs submitted and running under
this condor_dagman job.

-dont_use_default_node_log

This option is disabled as of HTCondor version 8.3.1. When it was
supported, it told condor_dagman to use the file specified by the job
ClassAd attribute UserLog to monitor job status. If this command line
argument is used, then the job event log file cannot be defined with a
macro.

-DontAlwaysRunPost

This option causes condor_dagman to not run the POST script of a node if
the PRE script fails. (This was the default behavior prior to HTCondor
version 7.7.2, and is again the default behavior from version 8.5.4
onwards.)

-AlwaysRunPost

This option causes condor_dagman to always run the POST script of a
node, even if the PRE script fails. (This was the default behavior for
HTCondor version 7.7.2 through version 8.5.3.)

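The version history of the default behavior given for -DontAlwaysRunPost
and -AlwaysRunPost can be summarized in a small lookup. The function
name is hypothetical; the version boundaries are the ones stated in the
two descriptions above.

```python
def post_runs_after_pre_failure_by_default(version):
    """True if, by default, the POST script runs even when PRE fails.

    version is a dotted HTCondor version string such as '8.5.4'.
    """
    v = tuple(int(part) for part in version.split("."))
    # Always running POST was the default from 7.7.2 through 8.5.3.
    return (7, 7, 2) <= v <= (8, 5, 3)
```
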
-DoRecovery

Causes condor_dagman to start in recovery mode. This means that it reads
the relevant job user log(s) and catches up to the given DAG's previous
state before submitting any new jobs.

-dag filename

filename is the name of the DAG input file that is set as an argument to
condor_submit_dag, and passed to condor_dagman.

Exit Status

condor_dagman will exit with a status value of 0 (zero) upon success,
and it will exit with the value 1 (one) upon failure.

Examples

condor_dagman is normally not run directly, but submitted as an HTCondor
job by running condor_submit_dag. See the condor_submit_dag manual page
for examples.

Author

Center for High Throughput Computing, University of Wisconsin-Madison

Copyright

Copyright (C) 1990-2018 Center for High Throughput Computing, Computer
Sciences Department, University of Wisconsin-Madison, Madison, WI. All
Rights Reserved. Licensed under the Apache License, Version 2.0.