condor_dagman(1)            General Commands Manual           condor_dagman(1)

NAME
condor_dagman - meta scheduler of the jobs submitted as the nodes of a
DAG or DAGs
8
SYNOPSIS
condor_dagman -f -t -l . -help

condor_dagman -version

condor_dagman -f -l . -csdversion version_string [-debug level]
[-maxidle NumberOfProcs] [-maxjobs NumberOfClusters]
[-maxpre NumberOfPreScripts] [-maxpost NumberOfPostScripts]
[-noeventchecks] [-allowlogerror] [-usedagdir] -lockfile filename
[-waitfordebug] [-autorescue 0|1] [-dorescuefrom number]
[-allowversionmismatch] [-DumpRescue] [-verbose] [-force]
[-notification value] [-suppress_notification]
[-dont_suppress_notification] [-dagman DagmanExecutable]
[-outfile_dir directory] [-update_submit] [-import_env]
[-priority number] [-dont_use_default_node_log] [-DontAlwaysRunPost]
[-AlwaysRunPost] [-DoRecovery] -dag dag_file
[-dag dag_file_2 ... -dag dag_file_n]
24
DESCRIPTION
condor_dagman is a meta scheduler for the HTCondor jobs within a DAG
(directed acyclic graph), or within multiple DAGs. In typical usage, a
submitter of jobs that are organized into a DAG submits the DAG using
condor_submit_dag. condor_submit_dag does error checking on aspects of
the DAG and then submits condor_dagman as an HTCondor job.
condor_dagman uses log files to coordinate the further submission of
the jobs within the DAG.
33
All command line arguments to the DaemonCore library functions work for
condor_dagman. When invoked from the command line, condor_dagman
requires the arguments -f -l . to appear first on the command line, to
be processed by DaemonCore. The -csdversion argument must also be
specified; at start up, condor_dagman checks for a version mismatch
with the condor_submit_dag version given in this argument. The -t
argument must also be present for the -help option, so that output is
sent to the terminal.
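The ordering requirement can be sketched in Python. This is only an
illustration: build_dagman_argv is a hypothetical helper, not part of
HTCondor, and the version string and file names are placeholders. It
assumes nothing beyond the synopsis: -f -l . first, then -csdversion,
then the remaining arguments.

```python
def build_dagman_argv(csd_version, lockfile, dag_files, extra_args=()):
    """Assemble a condor_dagman argument list in the documented order.

    Hypothetical helper for illustration; it is not part of HTCondor.
    """
    if not dag_files:
        raise ValueError("at least one DAG input file is required")
    # -f -l . must come first so that DaemonCore can process them.
    argv = ["condor_dagman", "-f", "-l", "."]
    # -csdversion carries the condor_submit_dag version string.
    argv += ["-csdversion", csd_version]
    # Optional arguments such as -maxidle go here.
    argv += list(extra_args)
    argv += ["-lockfile", lockfile]
    # One -dag argument per DAG input file.
    for dag in dag_files:
        argv += ["-dag", dag]
    return argv


# Placeholder values; a real version string comes from condor_submit_dag.
argv = build_dagman_argv("version_string", "diamond.dag.lock", ["diamond.dag"])
print(" ".join(argv))
```

In normal use condor_submit_dag builds this command line, so a helper
like this is useful mainly for understanding what it produces.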
41
Arguments to condor_dagman are either set automatically by
condor_submit_dag, or they are specified as command-line arguments to
condor_submit_dag and passed on to condor_dagman. The method by which
each argument is set is given in its description below.
46
condor_dagman can run multiple, independent DAGs. This is done by
specifying multiple -dag arguments. Pass multiple DAG input files as
command-line arguments to condor_submit_dag.
50
Debugging output may be obtained by using the -debug level option. The
level values and the output each one produces are:

* level = 0; never produce output, except for usage info

* level = 1; very quiet, output severe errors

* level = 2; normal output, errors and warnings

* level = 3; output errors, as well as all warnings

* level = 4; internal debugging output

* level = 5; internal debugging output; outer loop debugging

* level = 6; internal debugging output; inner loop debugging; output
  DAG input file lines as they are parsed

* level = 7; internal debugging output; rarely used; output DAG input
  file lines as they are parsed
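The level table can be written as a small Python lookup, which may help
when wrapping condor_submit_dag in a script. This is a sketch only:
DEBUG_LEVELS and describe_debug_level are hypothetical names, the
descriptions are copied from the list above, and the default of 3 is the
one documented for the -debug option.

```python
# Level descriptions copied from the list above.
DEBUG_LEVELS = {
    0: "never produce output, except for usage info",
    1: "very quiet, output severe errors",
    2: "normal output, errors and warnings",
    3: "output errors, as well as all warnings",
    4: "internal debugging output",
    5: "internal debugging output; outer loop debugging",
    6: "internal debugging output; inner loop debugging; "
       "output DAG input file lines as they are parsed",
    7: "internal debugging output; rarely used; "
       "output DAG input file lines as they are parsed",
}

# Default documented for the -debug option.
DEFAULT_DEBUG_LEVEL = 3


def describe_debug_level(level=DEFAULT_DEBUG_LEVEL):
    """Return the description for a level, rejecting values outside 0-7."""
    if level not in DEBUG_LEVELS:
        raise ValueError(f"debug level must be 0-7, got {level!r}")
    return DEBUG_LEVELS[level]


print(describe_debug_level())
```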
71
OPTIONS
-help

Display usage information and exit.
76
77
78
-version

Display version information and exit.
82
83
84
-debug level

Sets the level of debugging output. level is an integer from 0 to 7
inclusive, where 7 produces the most verbose output. This command-line
option to condor_submit_dag is passed to condor_dagman; if it is not
given, the level defaults to 3.
91
92
93
-maxidle NumberOfProcs

Sets the maximum number of idle procs allowed before condor_dagman
stops submitting more node jobs. Note that for this argument, each
individual proc within a cluster counts as one job towards the limit,
which is inconsistent with -maxjobs. After idle procs start to run,
condor_dagman resumes submitting jobs once the number of idle procs
falls below the specified limit. NumberOfProcs is a non-negative
integer. If this option is omitted, the number of idle procs is limited
by the configuration variable DAGMAN_MAX_JOBS_IDLE, which defaults to
1000. To disable this limit, set NumberOfProcs to 0. Note that submit
description files that queue multiple procs can cause the NumberOfProcs
limit to be exceeded. Setting queue 5000 in the submit description
file, where -maxidle is set to 250, will result in a cluster of 5000
new procs being submitted to the condor_schedd, not 250. In this case,
condor_dagman will resume submitting jobs when the number of idle procs
falls below 250.
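The overshoot described above can be reproduced with a simplified
Python model. It is not DAGMan's scheduling code: the function name
submit_until_idle_limit is hypothetical, and the model assumes that
every submitted proc stays idle for the duration of the pass.

```python
def submit_until_idle_limit(cluster_sizes, max_idle, idle=0):
    """Return (clusters_submitted, idle_procs) for a simplified model.

    Assumption: every submitted proc remains idle during this pass.
    """
    submitted = 0
    for size in cluster_sizes:
        # A limit of 0 disables the check entirely.
        if max_idle != 0 and idle >= max_idle:
            break
        # A cluster is submitted whole, so the limit can be exceeded.
        idle += size
        submitted += 1
    return submitted, idle


# One node queues 5000 procs, the next queues 10; the limit is 250.
print(submit_until_idle_limit([5000, 10], max_idle=250))
```

The first cluster goes in whole because the idle count is checked
before submission, not after; the second waits until the idle count
drops below the limit.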
111
112
113
-maxjobs NumberOfClusters

Sets the maximum number of clusters within the DAG that will be
submitted to HTCondor at one time. Note that for this argument, each
cluster counts as one job, no matter how many individual procs are in
the cluster. NumberOfClusters is a non-negative integer. If this option
is omitted, the number of clusters is limited by the configuration
variable DAGMAN_MAX_JOBS_SUBMITTED, which defaults to 0 (unlimited).
123
124
125
-maxpre NumberOfPreScripts

Sets the maximum number of PRE scripts within the DAG that may be
running at one time. NumberOfPreScripts is a non-negative integer. If
this option is omitted, the number of PRE scripts is limited by the
configuration variable DAGMAN_MAX_PRE_SCRIPTS, which defaults to 20.
133
134
135
-maxpost NumberOfPostScripts

Sets the maximum number of POST scripts within the DAG that may be
running at one time. NumberOfPostScripts is a non-negative integer. If
this option is omitted, the number of POST scripts is limited by the
configuration variable DAGMAN_MAX_POST_SCRIPTS, which defaults to 20.
143
144
145
-noeventchecks

This argument is no longer used and is ignored. Its functionality is
now implemented by the DAGMAN_ALLOW_EVENTS configuration variable.
151
152
153
-allowlogerror

As of version 8.5.5 this argument is no longer supported, and setting
it will generate a warning.
158
159
160
-usedagdir

This optional argument causes condor_dagman to run each specified DAG
as if the directory containing that DAG file were the current working
directory. This option is most useful when running multiple DAGs in a
single condor_dagman.
167
168
169
-lockfile filename

Names the file created and used as a lock file. The lock file prevents
two instances of the same DAG, as defined by a DAG input file, from
executing at the same time. A default lock file ending with the suffix
.dag.lock is passed to condor_dagman by condor_submit_dag.
176
177
178
-waitfordebug

This optional argument causes condor_dagman to wait at startup until
someone attaches to the process with a debugger and sets the
wait_for_debug variable in main_init() to false.
184
185
186
-autorescue 0|1

Whether to automatically run the newest rescue DAG for the given DAG
file, if one exists (0 = false, 1 = true).
191
192
193
-dorescuefrom number

Forces condor_dagman to run the specified rescue DAG number for the
given DAG. A value of 0 is the same as not specifying this option.
Specifying a nonexistent rescue DAG is a fatal error.
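How -autorescue and -dorescuefrom combine can be sketched as a small
Python selection function. This is a reading of the two descriptions,
not DAGMan's code: select_rescue_dag is a hypothetical name, "newest"
is assumed to mean the highest rescue number, and -dorescuefrom is
assumed to take precedence because it "forces" a specific rescue DAG.

```python
def select_rescue_dag(existing, autorescue, dorescuefrom=0):
    """Return the rescue DAG number to run, or None for the original DAG.

    existing: collection of rescue DAG numbers that exist for this DAG.
    Assumptions: "newest" means the highest number, and a non-zero
    dorescuefrom takes precedence over autorescue.
    """
    if dorescuefrom != 0:
        if dorescuefrom not in existing:
            # The man page calls this a fatal error.
            raise ValueError(f"rescue DAG {dorescuefrom} does not exist")
        return dorescuefrom
    if autorescue and existing:
        return max(existing)
    return None


print(select_rescue_dag({1, 2, 3}, autorescue=True))
print(select_rescue_dag({1, 2, 3}, autorescue=True, dorescuefrom=2))
```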
199
200
201
-allowversionmismatch

This optional argument causes condor_dagman to allow a version mismatch
between condor_dagman itself and the .condor.sub file produced by
condor_submit_dag (or, in other words, between condor_submit_dag and
condor_dagman). WARNING! This option should be used only if absolutely
necessary. Allowing version mismatches can cause subtle problems when
running DAGs. (Note that, starting with version 7.4.0, condor_dagman no
longer requires an exact version match between itself and the
.condor.sub file. Instead, a "minimum compatible version" is defined,
and any .condor.sub file of that version or newer is accepted.)
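The "minimum compatible version" rule amounts to a numeric comparison
of version components. The Python sketch below shows one way to express
it; the function names are hypothetical, and the minimum version used
in the example is a placeholder, since this page does not state the
actual value.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def sub_file_is_accepted(sub_file_version, minimum_compatible):
    """True if the .condor.sub file version is the minimum or newer."""
    return parse_version(sub_file_version) >= parse_version(minimum_compatible)


# Placeholder minimum; the real value is defined by condor_dagman.
MINIMUM = "7.4.0"
print(sub_file_is_accepted("7.10.0", MINIMUM))
print(sub_file_is_accepted("7.3.2", MINIMUM))
```

Comparing tuples of integers rather than strings matters here: as
strings, "7.10.0" would sort before "7.4.0".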
214
215
216
-DumpRescue

This optional argument causes condor_dagman to immediately dump a
Rescue DAG and then exit, as opposed to actually running the DAG. This
feature is mainly intended for testing. The Rescue DAG file is produced
whether or not there are parse errors reading the original DAG input
file. The name of the file differs if there was a parse error.
225
226
227
-verbose

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Causes
condor_submit_dag to give verbose error messages.
233
234
235
-force

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Requires
condor_submit_dag to overwrite the files that it produces, if the files
already exist. Note that dagman.out will be appended to, not
overwritten. If new-style rescue DAG mode is in effect, and any
new-style rescue DAGs exist, the -force flag will cause them to be
renamed, and the original DAG will be run. If old-style rescue DAG mode
is in effect, any existing old-style rescue DAGs will be deleted, and
the original DAG will be run. See the HTCondor manual section on Rescue
DAGs for more information.
248
249
250
-notification value

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Sets the e-mail
notification for DAGMan itself. This information will be used within
the HTCondor submit description file for DAGMan. This file is produced
by condor_submit_dag. The notification option is described in the
condor_submit manual page.
259
260
261
-suppress_notification

Causes jobs submitted by condor_dagman to not send email notification
for events. The same effect can be achieved by setting the
configuration variable DAGMAN_SUPPRESS_NOTIFICATION to True. This
command line option is independent of the -notification command line
option, which controls notification for the condor_dagman job itself.
This flag is generally superfluous, as DAGMAN_SUPPRESS_NOTIFICATION
defaults to True.
271
272
273
-dont_suppress_notification

Causes jobs submitted by condor_dagman to defer to content within the
submit description file when deciding to send email notification for
events. The same effect can be achieved by setting the configuration
variable DAGMAN_SUPPRESS_NOTIFICATION to False. This command line flag
is independent of the -notification command line option, which controls
notification for the condor_dagman job itself. If both
-dont_suppress_notification and -suppress_notification are specified
within the same command line, the last argument is used.
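The "last argument is used" rule can be sketched in a few lines of
Python. resolve_suppress_notification is a hypothetical helper; the
sketch assumes that either flag on the command line overrides the
DAGMAN_SUPPRESS_NOTIFICATION setting, whose documented default is True.

```python
def resolve_suppress_notification(argv, config_default=True):
    """Return True if node job notification should be suppressed.

    config_default stands in for DAGMAN_SUPPRESS_NOTIFICATION.
    Assumption: a flag on the command line overrides the configuration.
    """
    suppress = config_default
    for arg in argv:
        # Later flags overwrite earlier ones, so the last one wins.
        if arg == "-suppress_notification":
            suppress = True
        elif arg == "-dont_suppress_notification":
            suppress = False
    return suppress


print(resolve_suppress_notification(
    ["-suppress_notification", "-dont_suppress_notification"]))
```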
284
285
286
-dagman DagmanExecutable

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Allows the
specification of an alternate condor_dagman executable to be used
instead of the one found in the user's path. This must be a fully
qualified path.
294
295
296
-outfile_dir directory

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) Specifies the
directory in which the .dagman.out file will be written. The directory
may be specified relative to the current working directory as
condor_submit_dag is executed, or specified with an absolute path.
Without this option, the .dagman.out file is placed in the same
directory as the first DAG input file listed on the command line.
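The directory rules above can be sketched in Python. This is an
illustration only: dagman_out_directory is a hypothetical helper, it
uses POSIX path rules, and it assumes that a relative DAG file path is
resolved against the current working directory.

```python
import posixpath


def dagman_out_directory(cwd, dag_files, outfile_dir=None):
    """Return the directory that would hold the .dagman.out file.

    Assumption: relative paths are resolved against cwd.
    """
    if outfile_dir is None:
        # Default: the directory of the first DAG input file listed.
        first_dag = posixpath.join(cwd, dag_files[0])
        return posixpath.normpath(posixpath.dirname(first_dag))
    # posixpath.join keeps an absolute outfile_dir unchanged.
    return posixpath.normpath(posixpath.join(cwd, outfile_dir))


print(dagman_out_directory("/home/user/run", ["dags/a.dag", "b.dag"]))
print(dagman_out_directory("/home/user/run", ["dags/a.dag"], "logs"))
```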
306
307
308
-update_submit

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) This optional
argument causes an existing .condor.sub file to not be treated as an
error; rather, the .condor.sub file will be overwritten, but the
existing values of -maxjobs, -maxidle, -maxpre, and -maxpost will be
preserved.
317
318
319
-import_env

(This argument is included only to be passed to condor_submit_dag if
lazy submit file generation is used for nested DAGs.) This optional
argument causes condor_submit_dag to import the current environment
into the environment command of the .condor.sub file it generates.
326
327
328
-priority number

Sets the minimum job priority of node jobs submitted and running under
this condor_dagman job.
333
334
335
-dont_use_default_node_log

This option is disabled as of HTCondor version 8.3.1. Tells
condor_dagman to use the file specified by the job ClassAd attribute
UserLog to monitor job status. If this command line argument is used,
then the job event log file cannot be defined with a macro.
342
343
344
-DontAlwaysRunPost

This option causes condor_dagman to not run the POST script of a node
if the PRE script fails. (This was the default behavior prior to
HTCondor version 7.7.2, and is again the default behavior from version
8.5.4 onwards.)
351
352
353
-AlwaysRunPost

This option causes condor_dagman to always run the POST script of a
node, even if the PRE script fails. (This was the default behavior for
HTCondor version 7.7.2 through version 8.5.3.)
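The version history given for -DontAlwaysRunPost and -AlwaysRunPost can
be summarized as a range check. The Python sketch below uses a
hypothetical function name and only the version boundaries stated in
those two descriptions.

```python
def post_runs_after_pre_failure_by_default(version):
    """True if this HTCondor version ran POST after a failed PRE by default.

    Per the option descriptions, that was the default from version 7.7.2
    through version 8.5.3 inclusive.
    """
    parsed = tuple(int(part) for part in version.split("."))
    return (7, 7, 2) <= parsed <= (8, 5, 3)


for v in ("7.7.1", "7.7.2", "8.5.3", "8.5.4"):
    print(v, post_runs_after_pre_failure_by_default(v))
```

Passing -AlwaysRunPost or -DontAlwaysRunPost explicitly avoids
depending on which default a given installation uses.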
359
360
361
-DoRecovery

Causes condor_dagman to start in recovery mode. This means that it
reads the relevant job user log(s) and catches up to the given DAG's
previous state before submitting any new jobs.
367
368
369
-dag filename

filename is the name of the DAG input file that is set as an argument
to condor_submit_dag, and passed to condor_dagman.
374
375
376
377
378
EXIT STATUS
condor_dagman will exit with a status value of 0 (zero) upon success,
and it will exit with the value 1 (one) upon failure.
382
EXAMPLES
condor_dagman is normally not run directly, but submitted as an
HTCondor job by running condor_submit_dag. See the condor_submit_dag
manual page for examples.
387
AUTHOR
Center for High Throughput Computing, University of Wisconsin–Madison
391
COPYRIGHT
Copyright © 1990-2019 Center for High Throughput Computing, Computer
Sciences Department, University of Wisconsin-Madison, Madison, WI. All
Rights Reserved. Licensed under the Apache License, Version 2.0.
396
397
398
date                                                  condor_dagman(1)