CONDOR_SUBMIT(1) HTCondor Manual CONDOR_SUBMIT(1)

condor_submit - HTCondor Manual

Queue jobs for execution under HTCondor
10
condor_submit [-terse] [-verbose] [-unused] [-file submit_file]
[-name schedd_name] [-remote schedd_name] [-addr <ip:port>]
[-pool pool_name] [-disable] [-password passphrase] [-debug]
[-append command ...] [-batch-name batch_name] [-spool] [-dump filename]
[-interactive] [-allow-crlf-script] [-dry-run file] [-maxjobs number-of-jobs]
[-single-cluster] [-stm method] [<submit-variable>=<value>]
[submit description file] [-queue queue_arguments]
19
21 condor_submit is the program for submitting jobs for execution under
22 HTCondor. condor_submit requires one or more submit description com‐
23 mands to direct the queuing of jobs. These commands may come from a
24 file, standard input, the command line, or from some combination of
these. One submit description may contain specifications for the queuing
of many HTCondor jobs at once. A single invocation of condor_submit
may create one or more clusters. A cluster is a set of jobs specified in
the submit description between queue commands for which the executable
is not changed. It is advantageous to submit multiple jobs as a
single cluster because:
31
32 · Much less memory is used by the scheduler to hold the same number of
33 jobs.
34
35 · Only one copy of the checkpoint file is needed to represent all jobs
36 in a cluster until they begin execution.
37
38 · There is much less overhead involved for HTCondor to start the next
39 job in a cluster than for HTCondor to start a new cluster. This can
40 make a big difference when submitting lots of short jobs.
41
42 Multiple clusters may be specified within a single submit description.
43 Each cluster must specify a single executable.
44
45 The job ClassAd attribute ClusterId identifies a cluster.
46
47 The submit description file argument is the path and file name of the
48 submit description file. If this optional argument is the dash charac‐
49 ter (-), then the commands are taken from standard input. If - is spec‐
50 ified for the submit description file, -verbose is implied; this can be
51 overridden by specifying -terse.
52
If no submit description file argument is given, and no -queue argument
is given, commands are taken automatically from standard input.
55
56 Note that submission of jobs from a Windows machine requires a stashed
57 password to allow HTCondor to impersonate the user submitting the job.
58 To stash a password, use the condor_store_cred command. See the manual
59 page for details.
60
61 For lengthy lines within the submit description file, the backslash (\)
62 is a line continuation character. Placing the backslash at the end of a
63 line causes the current line's command to be continued with the next
64 line of the file. Submit description files may contain comments. A com‐
65 ment is any line beginning with a pound character (#).
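
As an illustration of the two rules above, the following sketch of a submit description file uses comment lines and a continued arguments line. The executable name analyze.sh and its options are hypothetical.

```
# A line beginning with a pound character is a comment.
executable = analyze.sh
# The backslash continues the arguments command onto the next line.
arguments  = --input data.txt \
             --verbose
queue
```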
66
68 -terse Terse output - display JobId ranges only.
69
70 -verbose
71 Verbose output - display the created job ClassAd
72
73 -unused
74 As a default, causes no warnings to be issued about
75 user-defined macros not being used within the submit descrip‐
76 tion file. The meaning reverses (toggles) when the configura‐
77 tion variable WARN_ON_UNUSED_SUBMIT_FILE_MACROS
78 is set to the non default value of False. Printing the
79 warnings can help identify spelling errors of submit descrip‐
80 tion file commands. The warnings are sent to stderr.
81
82 -file submit_file
Use submit_file as the submit description file. This is
equivalent to providing submit_file as an argument without
the preceding -file.
86
87 -name schedd_name
88 Submit to the specified condor_schedd. Use this option to
89 submit to a condor_schedd other than the default local one.
90 schedd_name is the value of the Name ClassAd attribute on the
91 machine where the condor_schedd daemon runs.
92
93 -remote schedd_name
94 Submit to the specified condor_schedd, spooling all required
95 input files over the network connection. schedd_name is the
96 value of the Name ClassAd attribute on the machine where the
97 condor_schedd daemon runs. This option is equivalent to using
98 both -name and -spool.
99
100 -addr <ip:port>
101 Submit to the condor_schedd at the IP address and port given
102 by the sinful string argument <ip:port>.
103
104 -pool pool_name
105 Look in the specified pool for the condor_schedd to submit
106 to. This option is used with -name or -remote.
107
108 -disable
Disable file permission checks when submitting a job. These are the
checks for read permission on all input files, such as those defined
by the commands input and transfer_input_files, and for write
permission on output files, such as a log file defined by log
and output files defined with output or transfer_output_files.
115
116 -password passphrase
117 Specify a password to the MyProxy server.
118
119 -debug Cause debugging information to be sent to stderr, based on
120 the value of the configuration variable TOOL_DEBUG.
121
122 -append command
123 Augment the commands in the submit description file with the
124 given command. This command will be considered to immediately
125 precede the queue command within the submit description file,
126 and come after all other previous commands. If the command
127 specifies a queue command, as in the example
128
129 condor_submit mysubmitfile -append "queue input in A, B, C"
130
131 then the entire -append command line option and its arguments
132 are converted to
133
134 condor_submit mysubmitfile -queue input in A, B, C
135
136 The submit description file is not modified. Multiple com‐
137 mands are specified by using the -append option multiple
138 times. Each new command is given in a separate -append
139 option. Commands with spaces in them will need to be enclosed
140 in double quote marks.
141
142 -batch-name batch_name
143 Set the batch name for this submit. The batch name is dis‐
144 played by condor_q -batch. It is intended for use by users to
145 give meaningful names to their jobs and to influence how con‐
146 dor_q groups jobs for display. Use of this argument takes
147 precedence over a batch name specified in the submit descrip‐
148 tion file itself.
149
-spool Spool all required input files, job event log, and proxy over
the connection to the condor_schedd. After submission, local copies
of the files may be modified without affecting the jobs. Any
output files for completed jobs need to be retrieved with
condor_transfer_data.
155
156 -dump filename
157 Sends all ClassAds to the specified file, instead of to the
158 condor_schedd.
159
160 -interactive
161 Indicates that the user wants to run an interactive shell on
162 an execute machine in the pool. This is equivalent to creat‐
163 ing a submit description file of a vanilla universe sleep
164 job, and then running condor_ssh_to_job by hand. Without any
165 additional arguments, condor_submit with the -interactive
166 flag creates a dummy vanilla universe job that sleeps, sub‐
167 mits it to the local scheduler, waits for the job to run, and
168 then launches condor_ssh_to_job to run a shell. If the user
169 would like to run the shell on a machine that matches a par‐
170 ticular requirements expression, the submit description file
171 is specified, and it will contain the expression. Note that
172 all policy expressions specified in the submit description
173 file are honored, but any executable or universe commands
174 are overwritten to be sleep and vanilla. The job ClassAd
175 attribute InteractiveJob is set to True to identify interac‐
176 tive jobs for condor_startd policy usage.
177
178 -allow-crlf-script
179 Changes the check for an invalid line ending on the exe‐
180 cutable script's #! line from an ERROR to a WARNING. The #!
181 line will be ignored by Windows, so it won't matter if it is
182 invalid; but Unix and Linux will not run a script that has a
183 Windows/DOS line ending on the first line of the script. So
184 condor_submit will not allow such a script to be submitted as
185 the job's executable unless this option is supplied.
186
187 -dry-run file
188 Parse the submit description file, sending the resulting job
189 ClassAd to the file given by file, but do not submit the
190 job(s). This permits observation of the job specification,
191 and it facilitates debugging the submit description file con‐
192 tents. If file is -, the output is written to stdout.
193
194 -maxjobs number-of-jobs
195 If the total number of jobs specified by the submit descrip‐
196 tion file is more than the integer value given by num‐
197 ber-of-jobs, then no jobs are submitted for execution and an
198 error message is generated. A 0 or negative value for the
199 number-of-jobs causes no limit to be imposed.
200
201 -single-cluster
If the jobs specified by the submit description file cause
more than a single cluster value to be assigned, then no jobs
are submitted for execution and an error message is generated.
206
207 -stm method
Specify the method used to move a sandbox into HTCondor.
method is one of stm_use_schedd_only or stm_use_transferd.
210
211 <submit-variable>=<value>
212 Defines a submit command or submit variable with a value, and
213 parses it as if it was placed at the beginning of the submit
214 description file. The submit description file is not changed.
215 To correctly parse the condor_submit command line, this
216 option must be specified without white space characters
217 before and after the equals sign (=), or the entire option
218 must be surrounded by double quote marks.
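
A minimal sketch of how a variable defined on the command line can feed a submit description file. The variable name n, the executable sweep.sh, and the file name sweep.sub are hypothetical; the invocation is shown as a comment.

```
# Invoked as:  condor_submit n=5 sweep.sub
# The definition n=5 is parsed as if it were the first line of this file.
executable = sweep.sh
arguments  = $(n)
queue
```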
219
220 -queue queue_arguments
221 A command line specification of how many jobs to queue, which
222 is only permitted if the submit description file does not
223 have a queue command. The queue_arguments are the same as may
224 be within a submit description file. The parsing of the
225 queue_arguments finishes at the end of the line or when a
226 dash character (-) is encountered. Therefore, its best place‐
227 ment within the command line will be at the end of the com‐
228 mand line.
229
230 On a Unix command line, the shell expands file globs before
231 parsing occurs.
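
As a sketch of this option, the submit description file below deliberately has no queue command, so the queue arguments can be supplied on the command line. The file name sweep.sub, the executable sweep.sh, and the variable name item are hypothetical.

```
# File sweep.sub; note that it contains no queue command.
# Invoked as:  condor_submit sweep.sub -queue item in A, B, C
# One job is queued for each of A, B, and C, with $(item) set accordingly.
executable = sweep.sh
arguments  = $(item)
```

Placing -queue last on the command line keeps its argument parsing from consuming other options.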
232
234 Note: more information on submitting HTCondor jobs can be found here:
235 /users-manual/submitting-a-job.
236
237 As of version 8.5.6, the condor_submit language supports multi-line
238 values in commands. The syntax is the same as the configuration lan‐
239 guage (see more details here: admin-manual/introduction-to-configura‐
240 tion:multi-line values).
241
242 Each submit description file describes one or more clusters of jobs to
243 be placed in the HTCondor execution pool. All jobs in a cluster must
244 share the same executable, but they may have different input and output
245 files, and different program arguments. The submit description file is
246 generally the last command-line argument to condor_submit. If the sub‐
247 mit description file argument is omitted, condor_submit will read the
248 submit description from standard input.
249
250 The submit description file must contain at least one executable com‐
251 mand and at least one queue command. All of the other commands have
252 default actions.
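
A minimal sketch of a submit description file that satisfies this requirement. The choice of /bin/hostname as the executable is only an assumption for illustration; every other command is left at its default.

```
# One executable command and one queue command; all else is defaulted.
executable = /bin/hostname
queue
```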
253
254 Note that a submit file that contains more than one executable command
255 will produce multiple clusters when submitted. This is not generally
256 recommended, and is not allowed for submit files that are run as DAG
257 node jobs by condor_dagman.
258
259 The commands which can appear in the submit description file are numer‐
260 ous. They are listed here in alphabetical order by category.
261
262 BASIC COMMANDS
263
264 arguments = <argument_list>
265 List of arguments to be supplied to the executable as part of
266 the command line.
267
268 In the java universe, the first argument must be the name of
269 the class containing main.
270
271 There are two permissible formats for specifying arguments,
272 identified as the old syntax and the new syntax. The old syn‐
273 tax supports white space characters within arguments only in
274 special circumstances; when used, the command line arguments
275 are represented in the job ClassAd attribute Args. The new
276 syntax supports uniform quoting of white space characters
277 within arguments; when used, the command line arguments are
278 represented in the job ClassAd attribute Arguments.
279
280 Old Syntax
281
282 In the old syntax, individual command line arguments are
283 delimited (separated) by space characters. To allow a double
284 quote mark in an argument, it is escaped with a backslash;
285 that is, the two character sequence \" becomes a single dou‐
286 ble quote mark within an argument.
287
Further interpretation of the argument string differs depending
on the operating system. On Windows, the entire argument
string is passed verbatim (other than the backslash in front
of double quote marks) to the Windows application. Most Windows
applications will allow spaces within an argument value
by surrounding the argument with double quote marks. In all
other cases, there is no further interpretation of the arguments.
296
297 Example:
298
299 arguments = one \"two\" 'three'
300
301 Produces in Unix vanilla universe:
302
303 argument 1: one
304 argument 2: "two"
305 argument 3: 'three'
306
307 New Syntax
308
309 Here are the rules for using the new syntax:
310
311 1. The entire string representing the command line arguments
312 is surrounded by double quote marks. This permits the
313 white space characters of spaces and tabs to potentially
314 be embedded within a single argument. Putting the double
315 quote mark within the arguments is accomplished by escap‐
316 ing it with another double quote mark.
317
318 2. The white space characters of spaces or tabs delimit argu‐
319 ments.
320
321 3. To embed white space characters of spaces or tabs within a
322 single argument, surround the entire argument with single
323 quote marks.
324
325 4. To insert a literal single quote mark, escape it within an
326 argument already delimited by single quote marks by adding
327 another single quote mark.
328
329 Example:
330
331 arguments = "3 simple arguments"
332
333 Produces:
334
335 argument 1: 3
336 argument 2: simple
337 argument 3: arguments
338
339 Another example:
340
341 arguments = "one 'two with spaces' 3"
342
343 Produces:
344
345 argument 1: one
346 argument 2: two with spaces
347 argument 3: 3
348
349 And yet another example:
350
351 arguments = "one ""two"" 'spacey ''quoted'' argument'"
352
353 Produces:
354
355 argument 1: one
356 argument 2: "two"
357 argument 3: spacey 'quoted' argument
358
359 Notice that in the new syntax, the backslash has no special
360 meaning. This is for the convenience of Windows users.
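
To illustrate the point about backslashes, here is a sketch using a hypothetical Windows-style path. Under the new syntax rules listed above, the backslashes pass through unchanged, and the single quote marks keep the embedded space inside one argument, so this should yield the two arguments C:\Program Files\tool\data.txt and --fast.

```
# New syntax: the whole list is in double quotes, and an argument
# containing a space is wrapped in single quotes.
arguments = "'C:\Program Files\tool\data.txt' --fast"
```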
361
362
363 environment = <parameter_list>
List of environment variables.
366
367 There are two different formats for specifying the environ‐
368 ment variables: the old format and the new format. The old
369 format is retained for backward-compatibility. It suffers
370 from a platform-dependent syntax and the inability to insert
371 some special characters into the environment.
372
373 The new syntax for specifying environment values:
374
375 1. Put double quote marks around the entire argument string.
376 This distinguishes the new syntax from the old. The old
377 syntax does not have double quote marks around it. Any
378 literal double quote marks within the string must be
379 escaped by repeating the double quote mark.
380
381 2. Each environment entry has the form
382
383 <name>=<value>
384
385 3. Use white space (space or tab characters) to separate
386 environment entries.
387
388 4. To put any white space in an environment entry, surround
389 the space and as much of the surrounding entry as desired
390 with single quote marks.
391
392 5. To insert a literal single quote mark, repeat the single
393 quote mark anywhere inside of a section surrounded by sin‐
394 gle quote marks.
395
396 Example:
397
398 environment = "one=1 two=""2"" three='spacey ''quoted'' value'"
399
400 Produces the following environment entries:
401
402 one=1
403 two="2"
404 three=spacey 'quoted' value
405
406 Under the old syntax, there are no double quote marks sur‐
407 rounding the environment specification. Each environment
408 entry remains of the form
409
410 <name>=<value>
411
412 Under Unix, list multiple environment entries by separating
413 them with a semicolon (;). Under Windows, separate multiple
414 entries with a vertical bar (|). There is no way to insert a
415 literal semicolon under Unix or a literal vertical bar under
416 Windows. Note that spaces are accepted, but rarely desired,
417 characters within parameter names and values, because they
418 are treated as literal characters, not separators or ignored
419 white space. Place spaces within the parameter list only if
420 required.
421
422 A Unix example:
423
424 environment = one=1;two=2;three="quotes have no 'special' meaning"
425
426 This produces the following:
427
428 one=1
429 two=2
430 three="quotes have no 'special' meaning"
431
432 If the environment is set with the environment command and
433 getenv is also set to true, values specified with environ‐
434 ment override values in the submitter's environment (regard‐
435 less of the order of the environment and getenv commands).
436
437
438 error = <pathname>
439 A path and file name used by HTCondor to capture any error
440 messages the program would normally write to the screen (that
441 is, this file becomes stderr). A path is given with respect
442 to the file system of the machine on which the job is submit‐
443 ted. The file is written (by the job) in the remote scratch
444 directory of the machine where the job is executed. When the
445 job exits, the resulting file is transferred back to the
446 machine where the job was submitted, and the path is utilized
447 for file placement. If not specified, the default value of
448 /dev/null is used for submission to a Unix machine. If not
449 specified, error messages are ignored for submission to a
450 Windows machine. More than one job should not use the same
451 error file, since this will cause one job to overwrite the
452 errors of another. If HTCondor detects that the error and
453 output files for a job are the same, it will run the job such
454 that the output and error data is merged.
455
456 executable = <pathname>
457 An optional path and a required file name of the executable
458 file for this job cluster. Only one executable command
459 within a submit description file is guaranteed to work prop‐
460 erly. More than one often works.
461
462 If no path or a relative path is used, then the executable
463 file is presumed to be relative to the current working direc‐
464 tory of the user as the condor_submit command is issued.
465
466 If submitting into the standard universe, then the named exe‐
467 cutable must have been re-linked with the HTCondor libraries
468 (such as via the condor_compile command). If submitting into
469 the vanilla universe (the default), then the named executable
470 need not be re-linked and can be any process which can run in
471 the background (shell scripts work fine as well). If submit‐
472 ting into the Java universe, then the argument must be a com‐
473 piled .class file.
474
475
476 getenv = <True | False>
If getenv is set to True, then condor_submit will copy all of the
user's current shell environment variables at the time of job
submission into the job ClassAd. The job will therefore execute with
the same set of environment variables that the user had at submit
time. Defaults to False.
483
484 If the environment is set with the environment command and
485 getenv is also set to true, values specified with environment
486 override values in the submitter's environment (regardless of
487 the order of the environment and getenv commands).
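
A short sketch of the interaction between getenv and environment. The variable LANG and its value C are chosen only for illustration. Per the rule above, the job receives the submitter's environment, except that LANG takes the value given by the environment command.

```
# Copy the submitter's environment, then override one variable.
getenv      = True
environment = "LANG=C"
```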
488
489 input = <pathname>
490 HTCondor assumes that its jobs are long-running, and that the
491 user will not wait at the terminal for their completion.
492 Because of this, the standard files which normally access the
493 terminal, (stdin, stdout, and stderr), must refer to files.
494 Thus, the file name specified with input should contain any
495 keyboard input the program requires (that is, this file
496 becomes stdin). A path is given with respect to the file sys‐
497 tem of the machine on which the job is submitted. The file is
498 transferred before execution to the remote scratch directory
499 of the machine where the job is executed. If not specified,
500 the default value of /dev/null is used for submission to a
501 Unix machine. If not specified, input is ignored for submis‐
502 sion to a Windows machine. For grid universe jobs, input
503 may be a URL that the Globus tool globus_url_copy under‐
504 stands.
505
506 Note that this command does not refer to the command-line
507 arguments of the program. The command-line arguments are
508 specified by the arguments command.
509
510
511 log = <pathname>
512 Use log to specify a file name where HTCondor will write a
513 log file of what is happening with this job cluster, called a
514 job event log. For example, HTCondor will place a log entry
515 into this file when and where the job begins running, when
516 the job produces a checkpoint, or moves (migrates) to another
517 machine, and when the job completes. Most users find specify‐
518 ing a log file to be handy; its use is recommended. If no log
519 entry is specified, HTCondor does not create a log for this
520 cluster. If a relative path is specified, it is relative to
521 the current working directory as the job is submitted or the
522 directory specified by submit command initialdir on the sub‐
523 mit machine.
524
525
526 log_xml = <True | False>
527 If log_xml is True, then the job event log file will be
528 written in ClassAd XML. If not specified, XML is not used.
529 Note that the file is an XML fragment; it is missing the file
530 header and footer. Do not mix XML and non-XML within a single
531 file. If multiple jobs write to a single job event log file,
532 ensure that all of the jobs specify this option in the same
533 way.
534
535
536
537
538 notification = <Always | Complete | Error | Never>
539 Owners of HTCondor jobs are notified by e-mail when certain
540 events occur. If defined by Always, the owner will be noti‐
541 fied whenever the job produces a checkpoint, as well as when
542 the job completes. If defined by Complete, the owner will be
543 notified when the job terminates. If defined by Error, the
544 owner will only be notified if the job terminates abnormally,
545 (as defined by JobSuccessExitCode, if defined) or if the job
546 is placed on hold because of a failure, and not by user
request. If defined by Never (the default), the owner will
not receive e-mail, regardless of what happens to the job.
549 The HTCondor User's manual documents statistics included in
550 the e-mail.
551
552 notify_user = <email-address>
553 Used to specify the e-mail address to use when HTCondor sends
554 e-mail about a job. If not specified, HTCondor defaults to
555 using the e-mail address defined by
556
557 job-owner@UID_DOMAIN
558
where the configuration variable UID_DOMAIN is specified by the
HTCondor site administrator. If UID_DOMAIN has not been specified,
HTCondor sends the e-mail to:
563
564 job-owner@submit-machine-name
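
A sketch combining notification and notify_user so that e-mail is sent only on abnormal termination or a failure hold. The address alice@example.org is a placeholder.

```
# Send e-mail only when something goes wrong, to an explicit address.
notification = Error
notify_user  = alice@example.org
```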
565
566
567
568 output = <pathname>
569 The output file captures any information the program would
570 ordinarily write to the screen (that is, this file becomes
571 stdout). A path is given with respect to the file system of
572 the machine on which the job is submitted. The file is writ‐
573 ten (by the job) in the remote scratch directory of the
574 machine where the job is executed. When the job exits, the
575 resulting file is transferred back to the machine where the
576 job was submitted, and the path is utilized for file place‐
577 ment. If not specified, the default value of /dev/null is
578 used for submission to a Unix machine. If not specified, out‐
579 put is ignored for submission to a Windows machine. Multiple
580 jobs should not use the same output file, since this will
581 cause one job to overwrite the output of another. If HTCondor
582 detects that the error and output files for a job are the
583 same, it will run the job such that the output and error data
584 is merged.
585
586 Note that if a program explicitly opens and writes to a file,
587 that file should not be specified as the output file.
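
One common way to keep jobs from overwriting each other's output and error files is to build the file names from per-job macros. The sketch below assumes the predefined submit macros $(Cluster) and $(Process), which this section does not describe; the executable name myprog is hypothetical.

```
executable = myprog
# Each job in the cluster gets its own output and error file.
output = myprog.$(Cluster).$(Process).out
error  = myprog.$(Cluster).$(Process).err
# A single job event log is shared by the whole cluster.
log    = myprog.$(Cluster).log
queue 3
```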
588
589
590 priority = <integer>
591 An HTCondor job priority can be any integer, with 0 being the
592 default. Jobs with higher numerical priority will run before
593 jobs with lower numerical priority. Note that this priority
594 is on a per user basis. One user with many jobs may use this
595 command to order his/her own jobs, and this will have no
596 effect on whether or not these jobs will run ahead of another
597 user's jobs.
598
599 Note that the priority setting in an HTCondor submit file
600 will be overridden by condor_dagman if the submit file is
601 used for a node in a DAG, and the priority of the node within
602 the DAG is non-zero (see users-manual/dagman-applica‐
603 tions:advanced features of dagman for more details).
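
A sketch of ordering one user's own jobs with priority. The executable name myprog and the argument values are hypothetical. The first job should be considered before the second because its priority is numerically higher.

```
executable = myprog

# Higher numerical priority: runs before the user's other jobs.
priority  = 10
arguments = urgent
queue

# Lower numerical priority: runs after them.
priority  = -5
arguments = background
queue
```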
604
queue [<int expr>]
Places zero or more copies of the job into the HTCondor queue.

queue [<int expr>] [<varname>] in [slice] <list of items>
Places zero or more copies of the job in the queue based on items in
a <list of items>.

queue [<int expr>] [<varname>] matching [files | dirs] [slice] <list of items with file globbing>
Places zero or more copies of the job in the queue based on files that
match a <list of items with file globbing>.

queue [<int expr>] [<list of varnames>] from [slice] <file name> | <list of items>
Places zero or more copies of the job in the queue based on lines from
the submit file or from <file name>.
622
The optional argument <int expr> specifies how many times to
repeat the job submission for a given set of arguments. It
may be an integer or an expression that evaluates to an integer,
and it defaults to 1. All but the first form of this
command are various ways of specifying a list of items. When
these forms are used, <int expr> jobs will be queued for each
item in the list. The in, matching, and from keywords indicate
how the list will be specified.
631
632 · in The list of items is an explicit comma and/or space sep‐
633 arated <list of items>. If the <list of items> begins with
634 an open paren, and the close paren is not on the same line
635 as the open, then the list continues until a line that
636 begins with a close paren is read from the submit file.
637
· matching Each item in the <list of items with file globbing>
will be matched against the names of files and directories
relative to the current directory. The set of matching
names is the resulting list of items.

· files Only file names will be matched.

· dirs Only directory names will be matched.
646
· from <file name> | <list of items> Each line from <file
name> or <list of items> is a single item, which allows
multiple variables to be set for each item. Lines from
<file name> or <list of items> will be split on comma
and/or space until there are values for each of the variables
specified in <list of varnames>. The last variable
will contain the remainder of the line. When the <list of
items> form is used, the list continues until the first
line that begins with a close paren, and lines beginning
with a pound sign ('#') will be skipped. When using the
<file name> form, if the <file name> ends with |, then it
will be executed as a script, and whatever the script writes
to stdout will be the list of items.
660
The optional argument <varname> or <list of varnames> is the
name or names of variables that will be set to the value
of the current item when queuing the job. If no <varname> is
specified, the variable ITEM will be used. Leading and trailing
whitespace will be trimmed. The optional argument <slice> is a
Python-style slice selecting only some of the items in the
list of items. Negative step values are not supported.
668
A submit file may contain more than one queue statement,
and if desired, any commands may be placed between subsequent
queue commands, such as new input, output, error,
initialdir, or arguments commands. This is handy when
submitting multiple runs into one cluster with one submit
description file.
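
The forms of the queue command described above can be sketched as follows. The executable process.sh, the data file names, and the variable names infile and label are hypothetical. The statements are shown together for comparison. Submitted as a single file, all of them would take effect, each queuing its own jobs into the same cluster.

```
executable = process.sh
arguments  = $(infile)

# One job per listed item; $(infile) is A.dat, then B.dat, then C.dat.
queue infile in A.dat, B.dat, C.dat

# Two jobs per item, four jobs in total.
queue 2 infile in A.dat, B.dat

# One job per file in the current directory that matches the glob.
queue infile matching files *.dat

# Two variables per item, read from an inline list that ends at the
# first line beginning with a close paren.
arguments = $(infile) $(label)
queue infile, label from (
    A.dat, first
    B.dat, second
)
```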
675
676
universe = <vanilla | standard | scheduler | local | grid | java | vm
| parallel | docker>
679 Specifies which HTCondor universe to use when running this
680 job. The HTCondor universe specifies an HTCondor execution
681 environment.
682
The vanilla universe is the default (except where the configuration
variable DEFAULT_UNIVERSE defines it otherwise), and is an
execution environment for jobs which do not use HTCondor's
mechanisms for taking checkpoints; these are ones that have not
been linked with the HTCondor libraries. Use the vanilla universe
to submit shell scripts to HTCondor.
690
691 The standard universe tells HTCondor that this job has been
692 re-linked via condor_compile with the HTCondor libraries and
693 therefore supports taking checkpoints and remote system
694 calls.
695
696 The scheduler universe is for a job that is to run on the
697 machine where the job is submitted. This universe is intended
698 for a job that acts as a metascheduler and will not be pre‐
699 empted.
700
701 The local universe is for a job that is to run on the machine
702 where the job is submitted. This universe runs the job imme‐
703 diately and will not preempt the job.
704
705 The grid universe forwards the job to an external job manage‐
706 ment system. Further specification of the grid universe is
707 done with the grid_resource command.
708
709 The java universe is for programs written to the Java Virtual
710 Machine.
711
712 The vm universe facilitates the execution of a virtual
713 machine.
714
715 The parallel universe is for parallel jobs (e.g. MPI) that
716 require multiple machines in order to run.
717
718 The docker universe runs a docker container as an HTCondor
719 job.
720
721 COMMANDS FOR MATCHMAKING
722
723 rank = <ClassAd Float Expression>
724 A ClassAd Floating-Point expression that states how to rank
725 machines which have already met the requirements expression.
726 Essentially, rank expresses preference. A higher numeric
727 value equals better rank. HTCondor will give the job the
728 machine with the highest rank. For example,
729
730 request_memory = max({60, Target.TotalSlotMemory})
731 rank = Memory
732
asks HTCondor to find all available machines with more than
60 megabytes of memory and give the job the machine with
the most memory. The HTCondor User's Manual contains
complete information on the syntax and available
attributes that can be used in the ClassAd expression.
738
739
740 request_cpus = <num-cpus>
741 A requested number of CPUs (cores). If not specified, the
742 number requested will be 1. If specified, the expression
743
744 && (RequestCpus <= Target.Cpus)
745
746 is appended to the requirements expression for the job.
747
748 For pools that enable dynamic condor_startd provisioning,
749 specifies the minimum number of CPUs requested for this job,
750 resulting in a dynamic slot being created with this many
751 cores.
752
753
754 request_disk = <quantity>
The amount of disk space in KiB requested for this
job. If not specified, it will be set to the job ClassAd
attribute DiskUsage. The expression
758
759 && (RequestDisk <= Target.Disk)
760
761 is appended to the requirements expression for the job.
762
763 For pools that enable dynamic condor_startd provisioning, a
764 dynamic slot will be created with at least this much disk
765 space.
766
767 Characters may be appended to a numerical value to indicate
768 units. K or KB indicates KiB, 2^10 bytes. M or MB indicates
769 MiB, 2^20 bytes. G or GB indicates GiB, 2^30 bytes. T or TB
770 indicates TiB, 2^40 bytes.
772
773
774 request_memory = <quantity>
775 The required amount of memory in MiB that this job needs to
776 avoid excessive swapping. If not specified and the submit
777 command vm_memory is specified, then the value specified
778 for vm_memory defines request_memory. If neither
779 request_memory nor vm_memory is specified, the value is set
780 by the configuration variable JOB_DEFAULT_REQUESTMEMORY.
781 The actual amount of memory used by a job is represented
782 by the job ClassAd attribute MemoryUsage.
783
784 For pools that enable dynamic condor_startd provisioning, a
785 dynamic slot will be created with at least this much RAM.
786
787 The expression
788
789 && (RequestMemory <= Target.Memory)
790
791 is appended to the requirements expression for the job.
792
793 Characters may be appended to a numerical value to indicate
794 units. K or KB indicates KiB, 2^10 bytes. M or MB indicates
795 MiB, 2^20 bytes. G or GB indicates GiB, 2^30 bytes. T or TB
796 indicates TiB, 2^40 bytes.
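
As a rough illustration of the three resource requests used
together, the following submit description lines are a sketch
only; the quantities are arbitrary values chosen for this
example:

# Illustrative values only: 2 cores, 2 GiB of RAM, 500 MiB of disk
request_cpus   = 2
request_memory = 2GB
request_disk   = 500MB

With the unit suffixes described above, 2GB corresponds to
2048 MiB of memory and 500MB corresponds to 512000 KiB of
disk space.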
798
799
800
801
802 request_<name> = <quantity>
803 The required amount of the custom machine resource identified
804 by <name> that this job needs. The custom machine resource is
805 defined in the machine's configuration. Machines that have
806 available GPUs will define <name> to be GPUs.
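
For instance, on a pool whose machines advertise GPUs as
described above, a job might ask for a single GPU with a line
such as the following. This is a sketch; whether the resource
is actually named GPUs depends on the machine configuration:

# Assumes the execute machines define a custom resource named GPUs
request_gpus = 1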
807
808
809 requirements = <ClassAd Boolean Expression>
810 The requirements command is a boolean ClassAd expression
811 which uses C-like operators. In order for any job in this
812 cluster to run on a given machine, this requirements expres‐
813 sion must evaluate to true on the given machine.
814
815 For scheduler and local universe jobs, the requirements
816 expression is evaluated against the Scheduler ClassAd which
817 represents the condor_schedd daemon running on the submit
818 machine, rather than a remote machine. Like all commands in
819 the submit description file, if multiple requirements com‐
820 mands are present, all but the last one are ignored. By
821 default, condor_submit appends the following clauses to the
822 requirements expression:
823
824 1. Arch and OpSys are set equal to the Arch and OpSys of the
825 submit machine. In other words: unless you request other‐
826 wise, HTCondor will give your job machines with the same
827 architecture and operating system version as the machine
828 running condor_submit.
829
830 2. Cpus >= RequestCpus, if the job ClassAd attribute
831 RequestCpus is defined.
832
833 3. Disk >= RequestDisk, if the job ClassAd attribute Request‐
834 Disk is defined. Otherwise, Disk >= DiskUsage is appended
835 to the requirements. The DiskUsage attribute is initial‐
836 ized to the size of the executable plus the size of any
837 files specified in a transfer_input_files command. It
838 exists to ensure there is enough disk space on the target
839 machine for HTCondor to copy over both the executable and
840 needed input files. The DiskUsage attribute represents the
841 maximum amount of total disk space required by the job in
842 kilobytes. HTCondor automatically updates the DiskUsage
843 attribute approximately every 20 minutes while the job
844 runs with the amount of space being used by the job on the
845 execute machine.
846
847 4. Memory >= RequestMemory, if the job ClassAd attribute
848 RequestMemory is defined.
849
850 5. If Universe is set to Vanilla, FileSystemDomain is set
851 equal to the submit machine's FileSystemDomain.
852
853 View the requirements of a job which has already been submit‐
854 ted (along with everything else about the job ClassAd) with
855 the command condor_q -l; see the condor_q manual page for
856 details. Also, see the HTCondor User's Manual for
857 complete information on the syntax and available attributes
858 that can be used in the ClassAd expression.
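
As a minimal sketch of a requirements expression, the line
below asks for a specific architecture and operating system
instead of relying on the values of the submit machine. The
attribute values "LINUX" and "X86_64" are assumed to be what
the machines in the pool advertise:

# Assumed attribute values; check what the pool's machines advertise
requirements = (OpSys == "LINUX") && (Arch == "X86_64")

The clauses for Cpus, Disk, and Memory listed above are still
appended by condor_submit. The resulting expression can be
inspected after submission with condor_q -l followed by the
job identifier.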
859
860 FILE TRANSFER COMMANDS
861
862
863
864 dont_encrypt_input_files = < file1,file2,file... >
865 A comma and/or space separated list of input files that are
866 not to be network encrypted when transferred with the file
867 transfer mechanism. Specification of files in this manner
868 overrides configuration that would use encryption. Each input
869 file must also be in the list given by transfer_input_files.
870 When a path to an input file or directory is specified,
871 this specifies the path to the file on the submit side. A
872 single wild card character (*) may be used in each file name.
873
874
875
876 dont_encrypt_output_files = < file1,file2,file... >
877 A comma and/or space separated list of output files that are
878 not to be network encrypted when transferred back with the
879 file transfer mechanism. Specification of files in this man‐
880 ner overrides configuration that would use encryption. The
881 output file(s) must also either be in the list given by
882 transfer_output_files or be discovered and transferred back
883 with the file transfer mechanism. When a path to an output
884 file or directory is specified, this specifies the path
885 to the file on the execute side. A single wild card character
886 (*) may be used in each file name.
887
888
889 encrypt_execute_directory = <True | False>
890 Defaults to False. If set to True, HTCondor will encrypt the
891 contents of the remote scratch directory of the machine where
892 the job is executed. This encryption is transparent to the
893 job itself, but ensures that files left behind on the local
894 disk of the execute machine, perhaps due to a system crash,
895 will remain private. In addition, condor_submit will append
896 to the job's requirements expression
897
898 && (TARGET.HasEncryptExecuteDirectory)
899
900 to ensure the job is matched to a machine that is capable of
901 encrypting the contents of the execute directory. This sup‐
902 port is limited to Windows platforms that use the NTFS file
903 system and Linux platforms with the ecryptfs-utils package
904 installed.
905
906
907
908 encrypt_input_files = < file1,file2,file... >
909 A comma and/or space separated list of input files that are
910 to be network encrypted when transferred with the file trans‐
911 fer mechanism. Specification of files in this manner over‐
912 rides configuration that would not use encryption. Each input
913 file must also be in the list given by transfer_input_files.
914 When a path to an input file or directory is specified,
915 this specifies the path to the file on the submit side. A
916 single wild card character (*) may be used in each file name.
917 The method of encryption utilized will be as agreed upon in
918 security negotiation; if that negotiation fails, then the
919 file transfer mechanism also fails for files that are to be
920 network encrypted.
921
922
923
924 encrypt_output_files = < file1,file2,file... >
925 A comma and/or space separated list of output files that are
926 to be network encrypted when transferred back with the file
927 transfer mechanism. Specification of files in this manner
928 overrides configuration that would not use encryption. The
929 output file(s) must also either be in the list given by
930 transfer_output_files or be discovered and transferred back
931 with the file transfer mechanism. When a path to an output
932 file or directory is specified, this specifies the path
933 to the file on the execute side. A single wild card character
934 (*) may be used in each file name. The method of encryption
935 utilized will be as agreed upon in security negotiation; if
936 that negotiation fails, then the file transfer mechanism
937 also fails for files that are to be network encrypted.
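
The per-file encryption commands can be combined with
transfer_input_files as in the following sketch. All file
names are hypothetical and chosen only for illustration:

# Hypothetical file names
transfer_input_files     = params.cfg, secrets.key, bulk_data.bin
# Encrypt this file on the wire even if configuration would not
encrypt_input_files      = secrets.key
# Skip network encryption for this file even if configuration
# would otherwise use it
dont_encrypt_input_files = bulk_data.bin

Files that appear in neither list, such as params.cfg here,
follow whatever the configuration specifies.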
938
939
940 max_transfer_input_mb = <ClassAd Integer Expression>
941 This integer expression specifies the maximum allowed total
942 size in MiB of the input files that are transferred for a
943 job. This expression does not apply to grid universe, stan‐
944 dard universe, or files transferred via file transfer
945 plug-ins. The expression may refer to attributes of the job.
946 The special value -1 indicates no limit. If not defined, the
947 value set by configuration variable MAX_TRANSFER_INPUT_MB
948 is used. If the observed size of all input files at submit
949 time is larger than the limit, the job will be immediately
950 placed on hold with a HoldReasonCode value of 32. If the job
951 passes this initial test, but the size of the input files
952 increases or the limit decreases so that the limit is vio‐
953 lated, the job will be placed on hold at the time when the
954 file transfer is attempted.
955
956
957 max_transfer_output_mb = <ClassAd Integer Expression>
958 This integer expression specifies the maximum allowed total
959 size in MiB of the output files that are transferred for a
960 job. This expression does not apply to grid universe, stan‐
961 dard universe, or files transferred via file transfer
962 plug-ins. The expression may refer to attributes of the job.
963 The special value -1 indicates no limit. If not set, the
964 value set by configuration variable MAX_TRANSFER_OUTPUT_MB
965 is used. If the total size of the job's output files to be
966 transferred is larger than the limit, the job will be placed
967 on hold with a HoldReasonCode value of 33. The output will be
968 transferred up to the point when the limit is hit, so some
969 files may be fully transferred, some partially, and some not
970 at all.
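
A short sketch of both limits in a submit description; the
numbers are arbitrary example values:

# Hold the job if its input files total more than 2048 MiB
max_transfer_input_mb  = 2048
# The special value -1 places no limit on output transfer
max_transfer_output_mb = -1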
971
972
973
974 output_destination = <destination-URL>
975 When present, defines a URL that specifies both a plug-in and
976 a destination for the transfer of the entire output sandbox
977 or a subset of output files as specified by the submit
978 command transfer_output_files. The plug-in does the transfer
979 of files, and no files are sent back to the submit machine.
980 The HTCondor Administrator's manual has full details.
981
982
983 should_transfer_files = <YES | NO | IF_NEEDED >
984 The should_transfer_files setting is used to define if HTCon‐
985 dor should transfer files to and from the remote machine
986 where the job runs. The file transfer mechanism is used to
987 run jobs which are not in the standard universe (and
988 therefore cannot use remote system calls for file access) on
989 machines which do not have a shared file system with the sub‐
990 mit machine. should_transfer_files equal to YES will cause
991 HTCondor to always transfer files for the job. NO disables
992 HTCondor's file transfer mechanism. IF_NEEDED will not trans‐
993 fer files for the job if it is matched with a resource in the
994 same FileSystemDomain as the submit machine (and therefore,
995 on a machine with the same shared file system). If the job is
996 matched with a remote resource in a different FileSystemDo‐
997 main, HTCondor will transfer the necessary files.
998
999 For more information about this and other settings related to
1000 transferring files, see the HTCondor User's manual section on
1001 the file transfer mechanism.
1002
1003 Note that should_transfer_files is not supported for jobs
1004 submitted to the grid universe.
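
As an illustration, a job that may or may not land on a
machine sharing a file system with the submit machine could
use lines like these. This is a sketch, pairing the command
with when_to_transfer_output, which is described later in
this section:

# Transfer files only when the matched machine is in a
# different FileSystemDomain
should_transfer_files   = IF_NEEDED
when_to_transfer_output = ON_EXIT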
1005
1006
1007 skip_filechecks = <True | False>
1008 When True, file permission checks for the submitted job are
1009 disabled. When False, file permissions are checked; this is
1010 the behavior when this command is not present in the submit
1011 description file. File permissions are checked for read
1012 permission on all input files, such as those defined by
1013 commands input and transfer_input_files, and for write
1014 permission to output files, such as a log file defined by log
1015 and output files defined with output or
1016 transfer_output_files.
1017
1018
1019 stream_error = <True | False>
1020 If True, then stderr is streamed back to the machine from
1021 which the job was submitted. If False, stderr is stored
1022 locally and transferred back when the job completes. This
1023 command is ignored if the job ClassAd attribute TransferErr
1024 is False. The default value is False. This command must be
1025 used in conjunction with error; otherwise stderr will be sent
1026 to /dev/null on Unix machines and ignored on Windows
1027 machines.
1028
1029
1030 stream_input = <True | False>
1031 If True, then stdin is streamed from the machine on which the
1032 job was submitted. The default value is False. The command is
1033 only relevant for jobs submitted to the vanilla or java uni‐
1034 verses, and it is ignored by the grid universe. This command
1035 must be used in conjunction with input; otherwise stdin
1036 will be /dev/null on Unix machines and ignored on Windows
1037 machines.
1038
1039 stream_output = <True | False>
1040 If True, then stdout is streamed back to the machine from
1041 which the job was submitted. If False, stdout is stored
1042 locally and transferred back when the job completes. This
1043 command is ignored if the job ClassAd attribute TransferOut
1044 is False. The default value is False. This command must be
1045 used in conjunction with output; otherwise stdout will be sent
1046 to /dev/null on Unix machines and ignored on Windows
1047 machines.
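
For example, to watch a job's standard output and standard
error on the submit machine while the job runs, the streaming
commands are paired with output and error. The file names in
this sketch are hypothetical:

# Hypothetical file names
output        = job.out
error         = job.err
stream_output = True
stream_error  = True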
1048
1049
1050 transfer_executable = <True | False>
1051 This command is applicable to jobs submitted to the grid and
1052 vanilla universes. If transfer_executable is set to False,
1053 then HTCondor looks for the executable on the remote machine,
1054 and does not transfer the executable over. This is useful for
1055 an already pre-staged executable; HTCondor behaves more like
1056 rsh. The default value is True.
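
A sketch of a pre-staged executable follows. It assumes that
/usr/bin/python3 already exists at the same path on every
execute machine, and analyze.py is a hypothetical script name:

# Assumes the interpreter is already installed on the execute side
executable           = /usr/bin/python3
transfer_executable  = False
# Hypothetical script, transferred as an ordinary input file
arguments            = analyze.py
transfer_input_files = analyze.py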
1057
1058
1059 transfer_input_files = < file1,file2,file... >
1060 A comma-delimited list of all the files and directories to be
1061 transferred into the working directory for the job, before
1062 the job is started. By default, the file specified in the
1063 executable command and any file specified in the input
1064 command (for example, stdin) are transferred.
1065
1066 When a path to an input file or directory is specified, this
1067 specifies the path to the file on the submit side. The file
1068 is placed in the job's temporary scratch directory on the
1069 execute side, and it is named using the base name of the
1070 original path. For example, /path/to/input_file becomes
1071 input_file in the job's scratch directory.
1072
1073 A directory may be specified by appending the forward slash
1074 character (/) as a trailing path separator. This syntax is
1075 used for both Windows and Linux submit hosts. A directory
1076 example using a trailing path separator is input_data/. When
1077 a directory is specified with the trailing path separator,
1078 the contents of the directory are transferred, but the direc‐
1079 tory itself is not transferred. It is as if each of the items
1080 within the directory were listed in the transfer list. When
1081 there is no trailing path separator, the directory is trans‐
1082 ferred, its contents are transferred, and these contents are
1083 placed inside the transferred directory.
1084
1085 For grid universe jobs other than HTCondor-C, the transfer of
1086 directories is not currently supported.
1087
1088 Symbolic links to files are transferred as the files they
1089 point to. Transfer of symbolic links to directories is not
1090 currently supported.
1091
1092 For vanilla and vm universe jobs only, a file may be speci‐
1093 fied by giving a URL, instead of a file name. The implementa‐
1094 tion for URL transfers requires both configuration and avail‐
1095 able plug-in.
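
The difference between the two directory forms can be seen
in a single list. In this sketch, config.ini is a
hypothetical file, and input_data and scripts are
hypothetical directories on the submit side:

# Hypothetical file and directory names
transfer_input_files = config.ini, input_data/, scripts

Under the rules above, config.ini and the contents of
input_data are placed directly in the job's scratch
directory, while scripts is transferred as a directory that
holds its own contents.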
1096
1097
1098 transfer_output_files = < file1,file2,file... >
1099 This command forms an explicit list of output files and
1100 directories to be transferred back from the temporary working
1101 directory on the execute machine to the submit machine. If
1102 there are multiple files, they must be delimited with commas.
1103 Setting transfer_output_files to the empty string ("") means
1104 that no files are to be transferred.
1105
1106 For HTCondor-C jobs and all other non-grid universe jobs, if
1107 transfer_output_files is not specified, HTCondor will auto‐
1108 matically transfer back all files in the job's temporary
1109 working directory which have been modified or created by the
1110 job. Subdirectories are not scanned for output, so if output
1111 from subdirectories is desired, the output list must be
1112 explicitly specified. For grid universe jobs other than
1113 HTCondor-C, desired output files must also be explicitly
1114 listed. Another reason to explicitly list output files is for
1115 a job that creates many files, and the user wants only a sub‐
1116 set transferred back.
1117
1118 For grid universe jobs other than with grid type condor, to
1119 have files other than standard output and standard error
1120 transferred from the execute machine back to the submit
1121 machine, use transfer_output_files, listing all files to
1122 be transferred. These files are found on the execute machine
1123 in the working directory of the job.
1124
1125 When a path to an output file or directory is specified, it
1126 specifies the path to the file on the execute side. As a des‐
1127 tination on the submit side, the file is placed in the job's
1128 initial working directory, and it is named using the base
1129 name of the original path. For example, path/to/output_file
1130 becomes output_file in the job's initial working directory.
1131 The name and path of the file that is written on the submit
1132 side may be modified by using transfer_output_remaps. Note
1133 that this remap function works only with files, not with
1134 directories.
1135
1136 A directory may be specified using a trailing path separator.
1137 An example of a trailing path separator is the slash charac‐
1138 ter on Unix platforms; a directory example using a trailing
1139 path separator is input_data/. When a directory is specified
1140 with a trailing path separator, the contents of the directory
1141 are transferred, but the directory itself is not transferred.
1142 It is as if each of the items within the directory were
1143 listed in the transfer list. When there is no trailing path
1144 separator, the directory is transferred, its contents are
1145 transferred, and these contents are placed inside the trans‐
1146 ferred directory.
1147
1148 For grid universe jobs other than HTCondor-C, the transfer of
1149 directories is not currently supported.
1150
1151 Symbolic links to files are transferred as the files they
1152 point to. Transfer of symbolic links to directories is not
1153 currently supported.
1154
1155 transfer_output_remaps = < name = newname ; name2 = newname2 ... >
1156 This specifies the name (and optionally path) to use when
1157 downloading output files from the completed job. Normally,
1158 output files are transferred back to the initial working
1159 directory with the same name they had in the execution direc‐
1160 tory. This gives you the option to save them with a different
1161 path or name. If you specify a relative path, the final path
1162 will be relative to the job's initial working directory.
1163
1164 name describes an output file name produced by your job, and
1165 newname describes the file name it should be downloaded to.
1166 Multiple remaps can be specified by separating each with a
1167 semicolon. If you wish to remap file names that contain
1168 equals signs or semicolons, these special characters may be
1169 escaped with a backslash. You cannot specify directories to
1170 be remapped.
1171
1172 Note that whether an output file is transferred is controlled
1173 by transfer_output_files. Listing a file in transfer_out‐
1174 put_remaps is not sufficient to cause it to be transferred.
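
As a sketch, the following lines transfer one output file
and store it under a per-job name. The file and directory
names are hypothetical, the results directory is assumed to
already exist under the job's initial working directory, and
the remap list is written as a quoted string, which is the
form commonly used in submit description files:

# Hypothetical names; $(Process) expands to the job's process number
transfer_output_files  = result.dat
transfer_output_remaps = "result.dat = results/result_$(Process).dat"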
1175
1176
1177 when_to_transfer_output = < ON_EXIT | ON_EXIT_OR_EVICT >
1178 Setting when_to_transfer_output equal to ON_EXIT will cause
1179 HTCondor to transfer the job's output files back to the sub‐
1180 mitting machine only when the job completes (exits on its
1181 own).
1182
1183 The ON_EXIT_OR_EVICT option is intended for fault tolerant
1184 jobs which periodically save their own state and can restart
1185 where they left off. In this case, files are spooled to the
1186 submit machine any time the job leaves a remote site, either
1187 because it exited on its own, or was evicted by the HTCondor
1188 system for any reason prior to job completion. The files
1189 spooled back are placed in a directory defined by the value
1190 of the SPOOL configuration variable. Any output files trans‐
1191 ferred back to the submit machine are automatically sent back
1192 out again as input files if the job restarts.
1193
1194 POLICY COMMANDS
1195
1196 max_retries = <integer>
1197 The maximum number of retries allowed for this job (must be
1198 non-negative). If the job fails (does not exit with the suc‐
1199 cess_exit_code exit code) it will be retried up to
1200 max_retries times (unless retries are ceased because of the
1201 retry_until command). If max_retries is not defined, and
1202 either retry_until or success_exit_code is, the value of
1203 DEFAULT_JOB_MAX_RETRIES will be used for the maximum number
1204 of retries.
1205
1206 The combination of the max_retries, retry_until, and suc‐
1207 cess_exit_code commands causes an appropriate OnExitRemove
1208 expression to be automatically generated. If retry command(s)
1209 and on_exit_remove are both defined, the OnExitRemove expres‐
1210 sion will be generated by OR'ing the expression specified in
1211 OnExitRemove and the expression generated by the retry com‐
1212 mands.
1213
1214
1215 retry_until = <Integer | ClassAd Boolean Expression>
1216 An integer value or boolean expression that prevents further
1217 retries from taking place, even if max_retries have not been
1218 exhausted. If retry_until is an integer, the job exiting
1219 with that exit code will cause retries to cease. If
1220 retry_until is a ClassAd expression, the expression evaluat‐
1221 ing to True will cause retries to cease.
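
A small sketch of the two retry commands together. The exit
code 2 is a hypothetical code that this example job is
assumed to use for an unrecoverable error:

# Retry a failing job up to 3 times, but stop retrying as soon
# as it exits with code 2 (hypothetical "bad input" code)
max_retries = 3
retry_until = 2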
1222
1223 success_exit_code = <integer>
1224 The exit code that is considered successful for this job.
1225 Defaults to 0 if not defined.
1226
1227 Note: non-zero values of success_exit_code should generally
1228 not be used for DAG node jobs. At the present time, con‐
1229 dor_dagman does not take into account the value of suc‐
1230 cess_exit_code. This means that, if success_exit_code is set
1231 to a non-zero value, condor_dagman will consider the job
1232 failed when it actually succeeds. For single-proc DAG node
1233 jobs, this can be overcome by using a POST script that takes
1234 into account the value of success_exit_code (although this is
1235 not recommended). For multi-proc DAG node jobs, there is cur‐
1236 rently no way to overcome this limitation.
1237
1238
1239 hold = <True | False>
1240 If hold is set to True, then the submitted job will be placed
1241 into the Hold state. Jobs in the Hold state will not run
1242 until released by condor_release. Defaults to False.
1243
1244
1245 keep_claim_idle = <integer>
1246 An integer number of seconds that a job requests the con‐
1247 dor_schedd to wait before releasing its claim after the job
1248 exits or after the job is removed.
1249
1250 The process by which the condor_schedd claims a condor_startd
1251 is somewhat time-consuming. To amortize this cost, the con‐
1252 dor_schedd tries to reuse claims to run subsequent jobs,
1253 after a job using a claim is done. However, it can only do
1254 this if there is an idle job in the queue at the moment the
1255 previous job completes. Sometimes, and especially for the
1256 node jobs when using DAGMan, there is a subsequent job about
1257 to be submitted, but it has not yet arrived in the queue when
1258 the previous job completes. As a result, the condor_schedd
1259 releases the claim, and the next job must wait an entire
1260 negotiation cycle to start. When this submit command is
1261 defined with a non-negative integer, when the job exits, the
1262 condor_schedd tries as usual to reuse the claim. If it can‐
1263 not, instead of releasing the claim, the condor_schedd keeps
1264 the claim until either the number of seconds given as a
1265 parameter elapses, or a new job which matches that claim arrives,
1266 whichever comes first. The condor_startd in question will
1267 remain in the Claimed/Idle state, and the original job will
1268 be "charged" (in terms of priority) for the time in this
1269 state.
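
For example, a DAG node job might ask the condor_schedd to
hold on to its claim for a few minutes so that the next node
job can reuse it. The value below is an arbitrary choice for
illustration:

# Keep the claim for up to 300 seconds after this job exits
keep_claim_idle = 300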
1270
1271
1272 leave_in_queue = <ClassAd Boolean Expression>
1273 When the ClassAd Expression evaluates to True, the job is not
1274 removed from the queue upon completion. This allows the user
1275 of a remotely spooled job to retrieve output files in cases
1276 where HTCondor would have removed them as part of the cleanup
1277 associated with completion. The job will only exit the queue
1278 once it has been marked for removal (via condor_rm, for exam‐
1279 ple) and the leave_in_queue expression has become False.
1280 leave_in_queue defaults to False.
1281
1282 As an example, if the job is to be removed once the output is
1283 retrieved with condor_transfer_data, then use
1284
1285 leave_in_queue = (JobStatus == 4) && ((StageOutFinish =?= UNDEFINED) ||\
1286 (StageOutFinish == 0))
1287
1288
1289
1290 next_job_start_delay = <ClassAd Integer Expression>
1291 This expression specifies the number of seconds to delay
1292 after starting up this job before the next job is started.
1293 The maximum allowed delay is specified by the HTCondor
1294 configuration variable MAX_NEXT_JOB_START_DELAY, which
1295 defaults to 10 minutes. This command does not apply
1296 to scheduler or local universe jobs.
1297
1298 This command has been historically used to implement a form
1299 of job start throttling from the job submitter's perspective.
1300 It was effective for the case of multiple job submission
1301 where the transfer of extremely large input data sets to the
1302 execute machine caused machine performance to suffer. This
1303 command is no longer useful, as throttling should be accom‐
1304 plished through configuration of the condor_schedd daemon.
1305
1306
1307 on_exit_hold = <ClassAd Boolean Expression>
1308 The ClassAd expression is checked when the job exits, and if
1309 True, places the job into the Hold state. If False (the
1310 default value when not defined), then nothing happens and the
1311 on_exit_remove expression is checked to determine if that
1312 needs to be applied.
1313
1314 For example: Suppose a job is known to run for a minimum of
1315 an hour. If the job exits after less than an hour, the job
1316 should be placed on hold and an e-mail notification sent,
1317 instead of being allowed to leave the queue.
1318
1319 on_exit_hold = (time() - JobStartDate) < (60 * $(MINUTE))
1320
1321 This expression places the job on hold if it exits for any
1322 reason before running for an hour. An e-mail will be sent to
1323 the user explaining that the job was placed on hold because
1324 this expression became True.
1325
1326 periodic_* expressions take precedence over on_exit_*
1327 expressions, and *_hold expressions take precedence over
1328 *_remove expressions.
1329
1330 Only job ClassAd attributes will be defined for use by this
1331 ClassAd expression. This expression is available for the
1332 vanilla, java, parallel, grid, local and scheduler universes.
1333 It is additionally available, when submitted from a Unix
1334 machine, for the standard universe.
1335
1336 on_exit_hold_reason = <ClassAd String Expression>
1337 When the job is placed on hold due to the on_exit_hold
1338 expression becoming True, this expression is evaluated to set
1339 the value of HoldReason in the job ClassAd. If this expres‐
1340 sion is UNDEFINED or produces an empty or invalid string, a
1341 default description is used.
1342
1343
1344 on_exit_hold_subcode = <ClassAd Integer Expression>
1345 When the job is placed on hold due to the on_exit_hold
1346 expression becoming True, this expression is evaluated to set
1347 the value of HoldReasonSubCode in the job ClassAd. The
1348 default subcode is 0. The HoldReasonCode will be set to 3,
1349 which indicates that the job went on hold due to a job policy
1350 expression.
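
The three on_exit_hold commands can be used together, as in
this sketch. The reason text and the subcode value are
arbitrary choices made for the example:

# Hold any job that exits on its own with a non-zero exit code
on_exit_hold         = (ExitBySignal == False) && (ExitCode != 0)
# Example text and subcode; any string and integer may be used
on_exit_hold_reason  = "Job exited with a non-zero exit code"
on_exit_hold_subcode = 1

A job held this way has HoldReasonCode set to 3 and
HoldReasonSubCode set to 1, which lets other policy
expressions or scripts tell this hold apart from others.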
1351
1352
1353 on_exit_remove = <ClassAd Boolean Expression>
1354 The ClassAd expression is checked when the job exits, and if
1355 True (the default value when undefined), then it allows the
1356 job to leave the queue normally. If False, then the job is
1357 placed back into the Idle state. If the user job runs under
1358 the vanilla universe, then the job restarts from the begin‐
1359 ning. If the user job runs under the standard universe, then
1360 it continues from where it left off, using the last check‐
1361 point.
1362
1363 For example, suppose a job occasionally segfaults, but
1364 chances are that the job will finish successfully if the job
1365 is run again with the same data. The on_exit_remove expres‐
1366 sion can cause the job to run again with the following com‐
1367 mand. Assume that the signal identifier for the segmentation
1368 fault is 11 on the platform where the job will be running.
1369
1370 on_exit_remove = (ExitBySignal == False) || (ExitSignal != 11)
1371
1372 This expression lets the job leave the queue if the job was
1373 not killed by a signal or if it was killed by a signal other
1374 than 11, representing segmentation fault in this example. So,
1375 if the job exited due to signal 11, it will stay in the job
1376 queue. In any other case of the job exiting, the job will
1377 leave the queue as it normally would have done.
1378
1379 As another example, if the job should only leave the queue if
1380 it exited on its own with status 0, this on_exit_remove
1381 expression works well:
1382
1383 on_exit_remove = (ExitBySignal == False) && (ExitCode == 0)
1384
1385 If the job was killed by a signal or exited with a non-zero
1386 exit status, HTCondor would leave the job in the queue to run
1387 again.
1388
1389 periodic_* expressions take precedence over on_exit_*
1390 expressions, and *_hold expressions take precedence over
1391 *_remove expressions.
1392
1393 Only job ClassAd attributes will be defined for use by this
1394 ClassAd expression.
1395
1396 periodic_hold = <ClassAd Boolean Expression>
1397 This expression is checked periodically when the job is not
1398 in the Held state. If it becomes True, the job will be placed
1399 on hold. If unspecified, the default value is False.
1400
1401 periodic_* expressions take precedence over on_exit_*
1402 expressions, and *_hold expressions take precedence over
1403 *_remove expressions.
1404
1405 Only job ClassAd attributes will be defined for use by this
1406 ClassAd expression. Note that, by default, this expression is
1407 only checked once every 60 seconds. The period of these eval‐
1408 uations can be adjusted by setting the PERIODIC_EXPR_INTER‐
1409 VAL, MAX_PERIODIC_EXPR_INTERVAL, and PERIODIC_EXPR_TIMESLICE
1410 configuration macros.
1411
1412
1413 periodic_hold_reason = <ClassAd String Expression>
1414 When the job is placed on hold due to the periodic_hold
1415 expression becoming True, this expression is evaluated to set
1416 the value of HoldReason in the job ClassAd. If this expres‐
1417 sion is UNDEFINED or produces an empty or invalid string, a
1418 default description is used.
1419
1420
1421 periodic_hold_subcode = <ClassAd Integer Expression>
1422 When the job is placed on hold due to the periodic_hold
1423 expression becoming True, this expression is evaluated to set
1424 the value of HoldReasonSubCode in the job ClassAd. The
1425 default subcode is 0. The HoldReasonCode will be set to 3,
1426 which indicates that the job went on hold due to a job policy
1427 expression.
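
As a minimal sketch of how the three periodic_hold commands fit together, the following submit description fragment holds a job that has been running for more than an hour. The one-hour limit, the reason text, and the subcode 42 are illustrative assumptions rather than defaults; a JobStatus of 2 denotes a running job.

```
# Hold the job once it has spent more than 3600 seconds in the
# Running state (JobStatus == 2). The threshold is an assumed value.
periodic_hold = (JobStatus == 2) && ((time() - EnteredCurrentStatus) > 3600)
# Text stored in HoldReason when the expression above becomes True.
periodic_hold_reason = "Job ran longer than one hour"
# Arbitrary, site-chosen subcode; HoldReasonCode itself will be 3.
periodic_hold_subcode = 42
```

Because the expression is checked only periodically, the hold may take effect somewhat later than the moment the threshold is crossed.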
1428
1429
1430 periodic_release = <ClassAd Boolean Expression>
1431 This expression is checked periodically when the job is in
1432 the Held state. If the expression becomes True, the job will
1433 be released.
1434
1435 Only job ClassAd attributes will be defined for use by this
1436 ClassAd expression. Note that, by default, this expression is
1437 only checked once every 60 seconds. The period of these eval‐
1438 uations can be adjusted by setting the PERIODIC_EXPR_INTER‐
1439 VAL, MAX_PERIODIC_EXPR_INTERVAL, and PERIODIC_EXPR_TIMESLICE
1440 configuration macros.
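
A hedged sketch of a periodic_release policy: release a job that was held by a job policy expression (HoldReasonCode 3) after it has spent ten minutes on hold, but only while it has started fewer than three times. The ten-minute delay and the limit of three starts are assumptions chosen for illustration.

```
# Release jobs held by a policy expression after 600 seconds on hold,
# as long as the job has started fewer than 3 times (assumed limits).
periodic_release = (HoldReasonCode == 3) && ((time() - EnteredCurrentStatus) > 600) && (NumJobStarts < 3)
```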
1441
1442
1443 periodic_remove = <ClassAd Boolean Expression>
1444 This expression is checked periodically. If it becomes True,
1445 the job is removed from the queue. If unspecified, the
1446 default value is False.
1447
1448 See the Examples section of this manual page for an example
1449 of a periodic_remove expression.
1450
periodic_* expressions take precedence over on_exit_* expressions,
and *_hold expressions take precedence over *_remove
expressions. So, the periodic_remove expression takes precedence
over the on_exit_remove expression, if the two describe
conflicting actions.
1456
1457 Only job ClassAd attributes will be defined for use by this
1458 ClassAd expression. Note that, by default, this expression is
1459 only checked once every 60 seconds. The period of these eval‐
1460 uations can be adjusted by setting the PERIODIC_EXPR_INTER‐
1461 VAL, MAX_PERIODIC_EXPR_INTERVAL, and PERIODIC_EXPR_TIMESLICE
1462 configuration macros.
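
For illustration only, this fragment removes a job that has been on hold for more than one day. The one-day limit is an assumption; a JobStatus of 5 denotes a held job.

```
# Remove the job if it has been Held (JobStatus == 5) for more than
# 86400 seconds. The limit is an assumed example value.
periodic_remove = (JobStatus == 5) && ((time() - EnteredCurrentStatus) > 86400)
```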
1463
1464 COMMANDS SPECIFIC TO THE STANDARD UNIVERSE
1465
1466
1467 allow_startup_script = <True | False>
1468 If True, a standard universe job will execute a script
1469 instead of submitting the job, and the consistency check to
1470 see if the executable has been linked using condor_compile is
1471 omitted. The executable command within the submit descrip‐
1472 tion file specifies the name of the script. The script is
1473 used to do preprocessing before the job is submitted. The
1474 shell script ends with an exec of the job executable, such
1475 that the process id of the executable is the same as that of
1476 the shell script. Here is an example script that gets a copy
1477 of a machine-specific executable before the exec.
1478
#! /bin/sh

# get the host name of the machine
host=`uname -n`

# grab a standard universe executable designed specifically
# for this host
scp elsewhere@cs.wisc.edu:${host} executable

# The PID MUST stay the same, so exec the new standard universe process.
exec ./executable ${1+"$@"}
1490
1491 If this command is not present (defined), then the value
1492 defaults to false.
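
A submit description fragment that uses such a script might look like the following sketch. The file name startup.sh is a hypothetical name for the wrapper script.

```
universe = standard
# startup.sh is a hypothetical name for the wrapper script.
executable = startup.sh
allow_startup_script = True
queue
```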
1493
1494 append_files = file1, file2, ...
1495 If your job attempts to access a file mentioned in this list,
1496 HTCondor will force all writes to that file to be appended to
1497 the end. Furthermore, condor_submit will not truncate it.
This list uses the same syntax as compress_files, described
below.
1500
1501 This option may yield some surprising results. If several
1502 jobs attempt to write to the same file, their output may be
1503 intermixed. If a job is evicted from one or more machines
1504 during the course of its lifetime, such an output file might
contain several copies of the results. This option should
only be used when you wish a certain file to be treated as a
running log instead of a precise result.
1508
1509 This option only applies to standard-universe jobs.
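
A minimal sketch, assuming the job opens a file named progress.log for writing (the file name is hypothetical):

```
universe = standard
# Writes to progress.log are appended; the file is never truncated.
append_files = progress.log
```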
1510
1511
1512
1513
1514 buffer_files = < name = (size,block-size) ; name2 =
1515 (size,block-size) ... >; buffer_size = <bytes-in-buffer>; buf‐
1516 fer_block_size = <bytes-in-block>
1517 HTCondor keeps a buffer of recently-used data for each file a
1518 job accesses. This buffer is used both to cache commonly-used
1519 data and to consolidate small reads and writes into larger
1520 operations that get better throughput. The default settings
1521 should produce reasonable results for most programs.
1522
1523 These options only apply to standard-universe jobs.
1524
If needed, you may set the buffer controls individually for
each file using the buffer_files option. For example, to set
the buffer size to 1,000,000 bytes (about 1 MB) and the block size
to 256,000 bytes (about 256 KB) for the file input.data, use this command:
1529
1530 buffer_files = "input.data=(1000000,256000)"
1531
1532 Alternatively, you may use these two options to set the
1533 default sizes for all files used by your job:
1534
1535 buffer_size = 1000000
1536 buffer_block_size = 256000
1537
1538 If you do not set these, HTCondor will use the values given
1539 by these two configuration file macros:
1540
1541 DEFAULT_IO_BUFFER_SIZE = 1000000
1542 DEFAULT_IO_BUFFER_BLOCK_SIZE = 256000
1543
1544 Finally, if no other settings are present, HTCondor will use
1545 a buffer of 512 KiB and a block size of 32 KiB.
1546
1547
1548 compress_files = file1, file2, ...
1549 If your job attempts to access any of the files mentioned in
1550 this list, HTCondor will automatically compress them (if
1551 writing) or decompress them (if reading). The compress format
1552 is the same as used by GNU gzip.
1553
1554 The files given in this list may be simple file names or com‐
1555 plete paths and may include * as a wild card. For example,
1556 this list causes the file /tmp/data.gz, any file named
1557 event.gz, and any file ending in .gzip to be automatically
1558 compressed or decompressed as needed:
1559
1560 compress_files = /tmp/data.gz, event.gz, *.gzip
1561
Due to the nature of the compression format, compressed files
should be accessed sequentially. Random access reading is
allowed but is very slow, while random access writing is simply
not possible. This restriction may be avoided by using
1566 both compress_files and fetch_files at the same time. When
1567 this is done, a file is kept in the decompressed state at the
1568 execution machine, but is compressed for transfer to its
1569 original location.
1570
1571 This option only applies to standard universe jobs.
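
The combination described above can be sketched as follows. The file name results.gz is a hypothetical example; listing it under both commands keeps a decompressed copy at the execution machine while the file is still stored compressed at its original location.

```
universe = standard
# Store results.gz compressed at its original location ...
compress_files = results.gz
# ... but work on a whole, decompressed copy at the execute machine,
# which permits random access.
fetch_files = results.gz
```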
1572
1573
1574 fetch_files = file1, file2, ...
1575 If your job attempts to access a file mentioned in this list,
1576 HTCondor will automatically copy the whole file to the exe‐
1577 cuting machine, where it can be accessed quickly. When your
1578 job closes the file, it will be copied back to its original
1579 location. This list uses the same syntax as compress_files,
1580 shown above.
1581
1582 This option only applies to standard universe jobs.
1583
1584
1585 file_remaps = < name = newname ; name2 = newname2 ... >
1586 Directs HTCondor to use a new file name in place of an old
1587 one. name describes a file name that your job may attempt to
1588 open, and newname describes the file name it should be
1589 replaced with. newname may include an optional leading
1590 access specifier, local: or remote:. If left unspecified, the
1591 default access specifier is remote:. Multiple remaps can be
1592 specified by separating each with a semicolon.
1593
1594 This option only applies to standard universe jobs.
1595
1596 If you wish to remap file names that contain equals signs or
1597 semicolons, these special characters may be escaped with a
1598 backslash.
1599
1600 Example One:
1601 Suppose that your job reads a file named
1602 dataset.1. To instruct HTCondor to force your job
1603 to read other.dataset instead, add this to the
1604 submit file:
1605
1606 file_remaps = "dataset.1=other.dataset"
1607
1608 Example Two:
Suppose that you run many jobs which all read in
the same large file, called very.big. If this file
can be found in the same place on a local disk in
every machine in the pool (say, /bigdisk/bigfile),
you can inform HTCondor of this fact by remapping
very.big to /bigdisk/bigfile and specifying
that the file is to be read locally, which will be
much faster than reading over the network.
1617
1618 file_remaps = "very.big = local:/bigdisk/bigfile"
1619
1620 Example Three:
1621 Several remaps can be applied at once by separat‐
1622 ing each with a semicolon.
1623
1624 file_remaps = "very.big = local:/bigdisk/bigfile ; dataset.1 = other.dataset"
1625
1626
1627
1628 local_files = file1, file2, ...
1629 If your job attempts to access a file mentioned in this list,
1630 HTCondor will cause it to be read or written at the execution
1631 machine. This is most useful for temporary files not used for
1632 input or output. This list uses the same syntax as com‐
1633 press_files, shown above.
1634
1635 local_files = /tmp/*
1636
1637 This option only applies to standard universe jobs.
1638
1639
1640 want_remote_io = <True | False>
1641 This option controls how a file is opened and manipulated in
1642 a standard universe job. If this option is true, which is the
1643 default, then the condor_shadow makes all decisions about how
1644 each and every file should be opened by the executing job.
1645 This entails a network round trip (or more) from the job to
1646 the condor_shadow and back again for every single open() in
1647 addition to other needed information about the file. If set
1648 to false, then when the job queries the condor_shadow for the
1649 first time about how to open a file, the condor_shadow will
1650 inform the job to automatically perform all of its file
1651 manipulation on the local file system on the execute machine
1652 and any file remapping will be ignored. This means that there
1653 must be a shared file system (such as NFS or AFS) between the
1654 execute machine and the submit machine and that ALL paths
1655 that the job could open on the execute machine must be valid.
1656 The ability of the standard universe job to checkpoint, pos‐
1657 sibly to a checkpoint server, is not affected by this
1658 attribute. However, when the job resumes it will be expecting
1659 the same file system conditions that were present when the
1660 job checkpointed.
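
As a brief sketch, a job that runs only on machines sharing a file system with the submit machine could turn remote I/O decisions off. This assumes every path the job opens is valid on the execute machine.

```
universe = standard
# Assumes a shared file system (such as NFS or AFS) between the
# submit and execute machines; file remaps are ignored in this mode.
want_remote_io = False
```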
1661
1662 COMMANDS FOR THE GRID
1663
1664 azure_admin_key = <pathname>
1665 For grid type azure jobs, specifies the path and file name of
1666 a file that contains an SSH public key. This key can be used
1667 to log into the administrator account of the instance via
1668 SSH.
1669
1670
1671 azure_admin_username = <account name>
1672 For grid type azure jobs, specifies the name of an adminis‐
1673 trator account to be created in the instance. This account
1674 can be logged into via SSH.
1675
1676 azure_auth_file = <pathname>
1677 For grid type azure jobs, specifies a path and file name of
1678 the authorization file that grants permission for HTCondor to
1679 use the Azure account. If it's not defined, then HTCondor
1680 will attempt to use the default credentials of the Azure CLI
1681 tools.
1682
1683
1684 azure_image = <image id>
1685 For grid type azure jobs, identifies the disk image to be
1686 used for the boot disk of the instance. This image must
1687 already be registered within Azure.
1688
1689
azure_location = <location>
1691 For grid type azure jobs, identifies the location within
1692 Azure where the instance should be run. As an example, one
1693 current location is centralus.
1694
1695
1696 azure_size = <machine type>
1697 For grid type azure jobs, the hardware configuration that the
1698 virtual machine instance is to run on.
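
The azure commands above might be combined as in this fragment. It is a sketch only: all paths and the account name are hypothetical, and the image and size values are left as placeholders because valid identifiers depend on the Azure account in use.

```
# Hypothetical paths and account name; replace with your own.
azure_auth_file = /home/user/.azure/auth_file
azure_admin_username = condoradmin
azure_admin_key = /home/user/.ssh/id_rsa.pub
# Placeholders: use an image id and machine type valid for your account.
azure_image = <image id>
azure_size = <machine type>
# Location taken from the example in the text above.
azure_location = centralus
```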
1699
1700
1701 batch_queue = <queuename>
1702 Used for pbs, lsf, and sge grid universe jobs. Specifies the
1703 name of the PBS/LSF/SGE job queue into which the job should
1704 be submitted. If not specified, the default queue is used.
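
For example, a grid universe job handed to a local PBS installation could name its queue like this. The queue name long is a hypothetical, site-specific value.

```
universe = grid
grid_resource = batch pbs
# "long" is an assumed queue name; use one defined at your site.
batch_queue = long
```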
1705
1706
1707 boinc_authenticator_file = <pathname>
1708 For grid type boinc jobs, specifies a path and file name of
1709 the authorization file that grants permission for HTCondor to
1710 use the BOINC service. There is no default value when not
1711 specified.
1712
1713
1714 cream_attributes = <name=value;...;name=value>
1715 Provides a list of attribute/value pairs to be set in a CREAM
1716 job description of a grid universe job destined for the CREAM
1717 grid system. The pairs are separated by semicolons, and writ‐
1718 ten in New ClassAd syntax.
1719
1720
1721 delegate_job_GSI_credentials_lifetime = <seconds>
Specifies the maximum number of seconds for which delegated
proxies should be valid. The default behavior when this command
is not specified is determined by the configuration variable
DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME, which defaults to one day.
A value of 0 indicates that the delegated proxy should be valid
for as long as allowed by the credential used to create the proxy.
This setting currently only applies to proxies delegated for
non-grid jobs and for HTCondor-C jobs. It does not currently apply
to globus grid jobs, which always behave as though this setting
were 0. This variable has no effect if the configuration variable
DELEGATE_JOB_GSI_CREDENTIALS is False, because in that case the
job proxy is copied rather than delegated.
1736
1737
1738 ec2_access_key_id = <pathname>
1739 For grid type ec2 jobs, identifies the file containing the
1740 access key.
1741
1742 ec2_ami_id = <EC2 xMI ID>
1743 For grid type ec2 jobs, identifies the machine image. Ser‐
1744 vices compatible with the EC2 Query API may refer to these
1745 with abbreviations other than AMI, for example EMI is valid
1746 for Eucalyptus.
1747
1748 ec2_availability_zone = <zone name>
1749 For grid type ec2 jobs, specifies the Availability Zone that
1750 the instance should be run in. This command is optional,
1751 unless ec2_ebs_volumes is set. As an example, one current
1752 zone is us-east-1b.
1753
1754
1755 ec2_block_device_mapping = <block-device>:<ker‐
1756 nel-device>,<block-device>:<kernel-device>, ...
1757 For grid type ec2 jobs, specifies the block device to kernel
1758 device mapping. This command is optional.
1759
1760
1761 ec2_ebs_volumes = <ebs name>:<device name>,<ebs name>:<device
1762 name>,...
1763 For grid type ec2 jobs, optionally specifies a list of Elas‐
1764 tic Block Store (EBS) volumes to be made available to the
1765 instance and the device names they should have in the
1766 instance.
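
Because ec2_availability_zone becomes mandatory once ec2_ebs_volumes is set, the two commands usually appear together. In this sketch the volume ID and device name are hypothetical.

```
# The zone must be given whenever EBS volumes are attached.
ec2_availability_zone = us-east-1b
# Hypothetical volume ID and device name.
ec2_ebs_volumes = vol-0123abcd:/dev/sdf
```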
1767
1768
1769 ec2_elastic_ip = <elastic IP address>
For grid type ec2 jobs, an optional specification of an
Elastic IP address that should be assigned to this instance.
1772
1773
1774 ec2_iam_profile_arn = <IAM profile ARN>
1775 For grid type ec2 jobs, an Amazon Resource Name (ARN) identi‐
1776 fying which Identity and Access Management (IAM) (instance)
1777 profile to associate with the instance.
1778
1779
ec2_iam_profile_name = <IAM profile name>
1781 For grid type ec2 jobs, a name identifying which Identity and
1782 Access Management (IAM) (instance) profile to associate with
1783 the instance.
1784
1785 ec2_instance_type = <instance type>
1786 For grid type ec2 jobs, identifies the instance type. Differ‐
1787 ent services may offer different instance types, so no
1788 default value is set.
1789
1790 ec2_keypair = <ssh key-pair name>
1791 For grid type ec2 jobs, specifies the name of an SSH key-pair
1792 that is already registered with the EC2 service. The associ‐
1793 ated private key can be used to ssh into the virtual machine
1794 once it is running.
1795
1796 ec2_keypair_file = <pathname>
1797 For grid type ec2 jobs, specifies the complete path and file
1798 name of a file into which HTCondor will write an SSH key for
1799 use with ec2 jobs. The key can be used to ssh into the vir‐
1800 tual machine once it is running. If ec2_keypair is speci‐
1801 fied for a job, ec2_keypair_file is ignored.
1802
1803 ec2_parameter_names = ParameterName1, ParameterName2, ...
1804 For grid type ec2 jobs, a space or comma separated list of
1805 the names of additional parameters to pass when instantiating
1806 an instance.
1807
1808 ec2_parameter_<name> = <value>
For grid type ec2 jobs, specifies the value for the correspondingly
named (instance instantiation) parameter. <name> is the parameter
name specified in the submit command ec2_parameter_names, but with
any periods replaced by underscores.
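
The renaming rule can be illustrated with a short sketch. The two parameter names are chosen as examples of EC2 instance parameters and are assumptions for illustration; note how the period in Placement.Tenancy becomes an underscore in the command name.

```
ec2_parameter_names = InstanceInitiatedShutdownBehavior, Placement.Tenancy
ec2_parameter_InstanceInitiatedShutdownBehavior = terminate
# The period in Placement.Tenancy is replaced by an underscore.
ec2_parameter_Placement_Tenancy = dedicated
```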
1814
1815
1816 ec2_secret_access_key = <pathname>
1817 For grid type ec2 jobs, specifies the path and file name con‐
1818 taining the secret access key.
1819
1820
1821 ec2_security_groups = group1, group2, ...
1822 For grid type ec2 jobs, defines the list of EC2 security
1823 groups which should be associated with the job.
1824
1825
1826 ec2_security_ids = id1, id2, ...
1827 For grid type ec2 jobs, defines the list of EC2 security
1828 group IDs which should be associated with the job.
1829
1830
1831 ec2_spot_price = <bid>
1832 For grid type ec2 jobs, specifies the spot instance bid,
1833 which is the most that the job submitter is willing to pay
1834 per hour to run this job.
1835
1836 ec2_tag_names = <name0,name1,name...>
1837 For grid type ec2 jobs, specifies the case of tag names that
1838 will be associated with the running instance. This is only
1839 necessary if a tag name case matters. By default the list
1840 will be automatically generated.
1841
1842
1843 ec2_tag_<name> = <value>
For grid type ec2 jobs, specifies a tag to be associated with
the running instance. The tag name will be lower-cased; use
ec2_tag_names to change the case.
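
A small sketch of tagging with preserved case. The tag names and values are hypothetical; without the ec2_tag_names line, the tags would be created as owner and costcenter.

```
# Preserve the mixed case of the tag names (hypothetical tags).
ec2_tag_names = Owner, CostCenter
ec2_tag_Owner = alice
ec2_tag_CostCenter = physics
```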
1847
1848 WantNameTag = <True | False>
1849 For grid type ec2 jobs, a job may request that its 'name' tag
1850 be (not) set by HTCondor. If the job does not otherwise spec‐
1851 ify any tags, not setting its name tag will eliminate a call
1852 by the EC2 GAHP, improving performance.
1853
1854
1855 ec2_user_data = <data>
1856 For grid type ec2 jobs, provides a block of data that can be
1857 accessed by the virtual machine. If both ec2_user_data and
1858 ec2_user_data_file are specified for a job, the two blocks of
1859 data are concatenated, with the data from this ec2_user_data
1860 submit command occurring first.
1861
1862 ec2_user_data_file = <pathname>
1863 For grid type ec2 jobs, specifies a path and file name whose
1864 contents can be accessed by the virtual machine. If both
1865 ec2_user_data and ec2_user_data_file are specified for a job,
the two blocks of data are concatenated, with the data from
the ec2_user_data submit command occurring first.
1868
1869 ec2_vpc_ip = <a.b.c.d>
For grid type ec2 jobs that are part of a Virtual Private
1871 Cloud (VPC), an optional specification of the IP address that
1872 this instance should have within the VPC.
1873
1874
1875 ec2_vpc_subnet = <subnet specification string>
1876 For grid type ec2 jobs, an optional specification of the Vir‐
1877 tual Private Cloud (VPC) that this instance should be a part
1878 of.
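
Putting several of the ec2 commands together, a fragment for an EC2 job might resemble the sketch below. Every path, the service URL, the image ID, and the instance type are hypothetical values that must be replaced with ones valid for the service in use.

```
universe = grid
# One parameter after the grid type: the EC2 service URL (assumed value).
grid_resource = ec2 https://ec2.us-east-1.amazonaws.com/
# Files holding the access key and the secret access key.
ec2_access_key_id = /home/user/.ec2/access_key_id
ec2_secret_access_key = /home/user/.ec2/secret_access_key
# Hypothetical machine image and instance type.
ec2_ami_id = ami-0123abcd
ec2_instance_type = m1.small
# HTCondor writes an SSH key for the instance into this file.
ec2_keypair_file = /home/user/.ec2/job_key.pem
ec2_security_groups = default
```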
1879
1880
1881 gce_account = <account name>
1882 For grid type gce jobs, specifies the Google cloud services
1883 account to use. If this submit command isn't specified, then
1884 a random account from the authorization file given by
1885 gce_auth_file will be used.
1886
1887 gce_auth_file = <pathname>
1888 For grid type gce jobs, specifies a path and file name of the
1889 authorization file that grants permission for HTCondor to use
1890 the Google account. If this command is not specified, then
1891 the default file of the Google command-line tools will be
1892 used.
1893
1894
1895 gce_image = <image id>
1896 For grid type gce jobs, the identifier of the virtual machine
1897 image representing the HTCondor job to be run. This virtual
machine image must already be registered with GCE and reside in
1899 Google's Cloud Storage service.
1900
1901 gce_json_file = <pathname>
1902 For grid type gce jobs, specifies the path and file name of a
1903 file that contains JSON elements that should be added to the
1904 instance description submitted to the GCE service.
1905
1906
1907 gce_machine_type = <machine type>
1908 For grid type gce jobs, the long form of the URL that
1909 describes the machine configuration that the virtual machine
1910 instance is to run on.
1911
1912 gce_metadata = <name=value,...,name=value>
1913 For grid type gce jobs, a comma separated list of name and
1914 value pairs that define metadata for a virtual machine
1915 instance that is an HTCondor job.
1916
1917 gce_metadata_file = <pathname>
1918 For grid type gce jobs, specifies a path and file name of the
1919 file that contains metadata for a virtual machine instance
1920 that is an HTCondor job. Within the file, each name and value
1921 pair is on its own line; so, the pairs are separated by the
1922 newline character.
1923
1924
1925 gce_preemptible = <True | False>
1926 For grid type gce jobs, specifies whether the virtual machine
1927 instance should be preemptible. The default is for the
1928 instance to not be preemptible.
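
The gce commands might be combined as in this sketch. The path and the metadata pairs are hypothetical, and the image and machine type are left as placeholders because their values depend on the Google project.

```
# Hypothetical path to the authorization file.
gce_auth_file = /home/user/.gce/auth_file
# Placeholders: supply values valid for your project.
gce_image = <image id>
gce_machine_type = <machine type>
# Comma separated name=value pairs (hypothetical metadata).
gce_metadata = role=worker,stage=test
gce_preemptible = True
```

When the metadata is kept in a file named by gce_metadata_file instead, the same pairs would appear one per line.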
1929
1930 globus_rematch = <ClassAd Boolean Expression>
1931 This expression is evaluated by the condor_gridmanager when‐
1932 ever:
1933
1934 1. the globus_resubmit expression evaluates to True
1935
1936 2. the condor_gridmanager decides it needs to retry a submis‐
1937 sion (as when a previous submission failed to commit)
1938
1939 If globus_rematch evaluates to True, then before the job is
1940 submitted again to globus, the condor_gridmanager will
1941 request that the condor_schedd daemon renegotiate with the
1942 matchmaker (the condor_negotiator). The result is this job
1943 will be matched again.
1944
1945
1946 globus_resubmit = <ClassAd Boolean Expression>
1947 The expression is evaluated by the condor_gridmanager each
1948 time the condor_gridmanager gets a job ad to manage. There‐
1949 fore, the expression is evaluated:
1950
1951 1. when a grid universe job is first submitted to HTCondor-G
1952
1953 2. when a grid universe job is released from the hold state
1954
1955 3. when HTCondor-G is restarted (specifically, whenever the
1956 condor_gridmanager is restarted)
1957
1958 If the expression evaluates to True, then any previous sub‐
1959 mission to the grid universe will be forgotten and this job
1960 will be submitted again as a fresh submission to the grid
1961 universe. This may be useful if there is a desire to give up
1962 on a previous submission and try again. Note that this may
1963 result in the same job running more than once. Do not treat
1964 this operation lightly.
1965
1966
1967 globus_rsl = <RSL-string>
1968 Used to provide any additional Globus RSL string attributes
1969 which are not covered by other submit description file com‐
1970 mands or job attributes. Used for grid universe jobs, where
1971 the grid resource has a grid-type-string of gt2.
1972
1973
1974 grid_resource = <grid-type-string> <grid-specific-parameter-list>
For each grid-type-string value, there are further type-specific
values that must be specified. This submit description
file command allows each to be given in a space-separated
1978 list. Allowable grid-type-string values are batch, condor,
1979 cream, ec2, gt2, gt5, lsf, nordugrid, pbs, sge, and unicore.
1980 The HTCondor manual chapter on Grid Computing details the
1981 variety of grid types.
1982
1983 For a grid-type-string of batch, the single parameter is the
1984 name of the local batch system, and will be one of pbs, lsf,
1985 or sge.
1986
1987 For a grid-type-string of condor, the first parameter is the
1988 name of the remote condor_schedd daemon. The second parameter
1989 is the name of the pool to which the remote condor_schedd
1990 daemon belongs.
1991
1992 For a grid-type-string of cream, there are three parameters.
1993 The first parameter is the web services address of the CREAM
1994 server. The second parameter is the name of the batch system
1995 that sits behind the CREAM server. The third parameter iden‐
1996 tifies a site-specific queue within the batch system.
1997
1998 For a grid-type-string of ec2, one additional parameter spec‐
1999 ifies the EC2 URL.
2000
2001 For a grid-type-string of gt2, the single parameter is the
2002 name of the pre-WS GRAM resource to be used.
2003
2004 For a grid-type-string of gt5, the single parameter is the
2005 name of the pre-WS GRAM resource to be used, which is the
2006 same as for the grid-type-string of gt2.
2007
2008 For a grid-type-string of lsf, no additional parameters are
2009 used.
2010
2011 For a grid-type-string of nordugrid, the single parameter is
2012 the name of the NorduGrid resource to be used.
2013
2014 For a grid-type-string of pbs, no additional parameters are
2015 used.
2016
2017 For a grid-type-string of sge, no additional parameters are
2018 used.
2019
2020 For a grid-type-string of unicore, the first parameter is the
2021 name of the Unicore Usite to be used. The second parameter is
2022 the name of the Unicore Vsite to be used.
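
The parameter lists described above can be sketched as follows. A job uses exactly one grid_resource line, so the alternatives are shown commented out; all host names and the queue name are hypothetical.

```
# Local PBS batch system through the batch grid type:
grid_resource = batch pbs
# Remote condor_schedd, followed by the name of its pool:
# grid_resource = condor schedd.example.com cm.example.com
# CREAM: web services address, batch system, site-specific queue:
# grid_resource = cream https://cream.example.com:8443/ce-cream/services/CREAM2 pbs short
# Unicore: Usite followed by Vsite:
# grid_resource = unicore usite.example.com vsite
```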
2023
2024
2025 keystore_alias = <name>
2026 A string to locate the certificate in a Java keystore file,
2027 as used for a unicore job.
2028
2029
2030 keystore_file = <pathname>
2031 The complete path and file name of the Java keystore file
2032 containing the certificate to be used for a unicore job.
2033
2034
2035 keystore_passphrase_file = <pathname>
2036 The complete path and file name to the file containing the
2037 passphrase protecting a Java keystore file containing the
2038 certificate. Relevant for a unicore job.
2039
2040
2041 MyProxyCredentialName = <symbolic name>
2042 The symbolic name that identifies a credential to the MyProxy
2043 server. This symbolic name is set as the credential is ini‐
2044 tially stored on the server (using myproxy-init).
2045
2046
2047 MyProxyHost = <host>:<port>
2048 The Internet address of the host that is the MyProxy server.
The host may be specified by either a host name (as in
head.example.com) or an IP address (of the form 192.0.2.8).
2051 The port number is an integer.
2052
2053
2054 MyProxyNewProxyLifetime = <number-of-minutes>
2055 The new lifetime (in minutes) of the proxy after it is
2056 refreshed.
2057
2058
2059 MyProxyPassword = <password>
2060 The password needed to refresh a credential on the MyProxy
2061 server. This password is set when the user initially stores
2062 credentials on the server (using myproxy-init). As an alter‐
2063 native to using MyProxyPassword in the submit description
2064 file, the password may be specified as a command line argu‐
2065 ment to condor_submit with the -password argument.
2066
2067 MyProxyRefreshThreshold = <number-of-seconds>
2068 The time (in seconds) before the expiration of a proxy that
2069 the proxy should be refreshed. For example, if MyProxyRe‐
2070 freshThreshold is set to the value 600, the proxy will be
2071 refreshed 10 minutes before it expires.
2072
2073 MyProxyServerDN = <credential subject>
2074 A string that specifies the expected Distinguished Name (cre‐
2075 dential subject, abbreviated DN) of the MyProxy server. It
2076 must be specified when the MyProxy server DN does not follow
2077 the conventional naming scheme of a host credential. This
2078 occurs, for example, when the MyProxy server DN begins with a
2079 user credential.
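
A sketch of the MyProxy commands used together. The host name and credential name are hypothetical, and port 7512 is assumed to be the port on which the MyProxy server listens.

```
# Hypothetical server; 7512 is assumed to be its listening port.
MyProxyHost = myproxy.example.com:7512
# Name given to the credential when it was stored with myproxy-init.
MyProxyCredentialName = my_cred
# Refresh the proxy 600 seconds (10 minutes) before it expires ...
MyProxyRefreshThreshold = 600
# ... and request a new lifetime of 720 minutes (12 hours).
MyProxyNewProxyLifetime = 720
```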
2080
2081
2082 nordugrid_rsl = <RSL-string>
2083 Used to provide any additional RSL string attributes which
2084 are not covered by regular submit description file parame‐
2085 ters. Used when the universe is grid, and the type of grid
2086 system is nordugrid.
2087
2088 transfer_error = <True | False>
2089 For jobs submitted to the grid universe only. If True, then
2090 the error output (from stderr) from the job is transferred
2091 from the remote machine back to the submit machine. The name
2092 of the file after transfer is given by the error command.
2093 If False, no transfer takes place (from the remote machine to
2094 submit machine), and the name of the file is given by the
2095 error command. The default value is True.
2096
2097
2098 transfer_input = <True | False>
2099 For jobs submitted to the grid universe only. If True, then
2100 the job input (stdin) is transferred from the machine where
2101 the job was submitted to the remote machine. The name of the
2102 file that is transferred is given by the input command. If
2103 False, then the job's input is taken from a pre-staged file
2104 on the remote machine, and the name of the file is given by
2105 the input command. The default value is True.
2106
For transferring files other than stdin, see
transfer_input_files.
2109
2110
2111 transfer_output = <True | False>
2112 For jobs submitted to the grid universe only. If True, then
2113 the output (from stdout) from the job is transferred from the
2114 remote machine back to the submit machine. The name of the
2115 file after transfer is given by the output command. If
2116 False, no transfer takes place (from the remote machine to
2117 submit machine), and the name of the file is given by the
2118 output command. The default value is True.
2119
For transferring files other than stdout, see
transfer_output_files.
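
As an illustration of the three transfer commands, this fragment sends stdin to the remote machine, brings stdout back, and leaves stderr on the remote machine. The file names are hypothetical.

```
universe = grid
input = job.in
output = job.out
error = job.err
# Send stdin from the submit machine and bring stdout back.
transfer_input = True
transfer_output = True
# Leave stderr in the file named job.err on the remote machine.
transfer_error = False
```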
2122
2123
2124 use_x509userproxy = <True | False>
2125 Set this command to True to indicate that the job requires an
2126 X.509 user proxy. If x509userproxy is set, then that file is
2127 used for the proxy. Otherwise, the proxy is looked for in the
2128 standard locations. If x509userproxy is set or if the job is
2129 a grid universe job of grid type gt2, gt5, cream, or nor‐
2130 dugrid, then the value of use_x509userproxy is forced to
2131 True. Defaults to False.
2132
2133
2134 x509userproxy = <full-pathname>
2135 Used to override the default path name for X.509 user cer‐
2136 tificates. The default location for X.509 proxies is the
2137 /tmp directory, which is generally a local file system. Set‐
2138 ting this value would allow HTCondor to access the proxy in a
2139 shared file system (for example, AFS). HTCondor will use the
2140 proxy specified in the submit description file first. If
2141 nothing is specified in the submit description file, it will
2142 use the environment variable X509_USER_PROXY. If that vari‐
2143 able is not present, it will search in the default location.
Note that proxies are only valid for a limited time.
condor_submit will not submit a job with an expired proxy;
it returns an error instead. Also, if the configuration parameter
CRED_MIN_TIME_LEFT is set to some number of seconds, and if
the proxy will expire before that many seconds, condor_submit
will also refuse to submit the job. That is, if
CRED_MIN_TIME_LEFT is set to 60, condor_submit will refuse to
submit a job whose proxy will expire within 60 seconds of the
time of submission.
2153
2154 x509userproxy is relevant when the universe is vanilla, or
2155 when the universe is grid and the type of grid system is one
2156 of gt2, gt5, condor, cream, or nordugrid. Defining a value
2157 causes the proxy to be delegated to the execute machine.
2158 Further, VOMS attributes defined in the proxy will appear in
2159 the job ClassAd.
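
A minimal sketch, assuming the proxy is kept on a shared file system; the path is hypothetical. Setting x509userproxy forces use_x509userproxy to True, so the second command does not need to be given.

```
# Hypothetical proxy location on a shared file system. This takes
# priority over the X509_USER_PROXY environment variable.
x509userproxy = /shared/home/user/x509up_u1000
```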
2160
2161 COMMANDS FOR PARALLEL, JAVA, and SCHEDULER UNIVERSES
2162
2163
2164 hold_kill_sig = <signal-number>
2165 For the scheduler universe only, signal-number is the sig‐
2166 nal delivered to the job when the job is put on hold with
2167 condor_hold. signal-number may be either the platform-spe‐
2168 cific name or value of the signal. If this command is not
2169 present, the value of kill_sig is used.
2170
2171
2172 jar_files = <file_list>
2173 Specifies a list of additional JAR files to include when
2174 using the Java universe. JAR files will be transferred along
2175 with the executable and automatically added to the classpath.
2176
2177
2178 java_vm_args = <argument_list>
Specifies a list of additional arguments to the Java VM
itself. When HTCondor runs the Java program, these are the
2181 arguments that go before the class name. This can be used to
2182 set VM-specific arguments like stack size, garbage-collector
2183 arguments and initial property values.
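
A short Java universe sketch that uses both commands. The class name, the JAR file name, and the VM arguments are hypothetical; it assumes the first argument names the class containing main.

```
universe = java
executable = Hello.class
# Assumed convention: the first argument is the main class name.
arguments = Hello
# Additional JAR transferred with the job and added to the classpath.
jar_files = util.jar
# Passed to the Java VM itself, before the class name.
java_vm_args = -Xmx256m -Dapp.mode=batch
queue
```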
2184
2185 machine_count = <max>
2186 For the parallel universe, a single value (max) is required.
2187 It is neither a maximum nor a minimum, but the number of
2188 machines to be dedicated toward running the job.
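
     For illustration, a parallel universe job that asks for four
     dedicated machines might be sketched as follows. The
     executable name my_parallel_app is hypothetical.

         universe      = parallel
         executable    = my_parallel_app
         machine_count = 4
         queue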
2189
2190
2191 remove_kill_sig = <signal-number>
2192 For the scheduler universe only, signal-number is the sig‐
2193 nal delivered to the job when the job is removed with con‐
2194 dor_rm. signal-number may be either the platform-specific
2195 name or value of the signal. This example shows it both ways
2196 for a Linux signal:
2197
2198 remove_kill_sig = SIGUSR1
2199 remove_kill_sig = 10
2200
2201 If this command is not present, the value of kill_sig is
2202 used.
2203
2204 COMMANDS FOR THE VM UNIVERSE
2205
2206 vm_disk = file1:device1:permission1, file2:device2:permission2:for‐
2207 mat2, ...
2208 A list of comma-separated disk files. Each disk file is spec‐
2209 ified by up to 4 colon-separated fields. The first field is the
2210 path and file name of the disk file. The second field speci‐
2211 fies the device. The third field specifies permissions, and
2212 the optional fourth field specifies the image format. If a
2213 disk file will be transferred by HTCondor, then the first
2214 field should just be the simple file name (no path informa‐
2215 tion).
2216
2217 An example that specifies two disk files:
2218
2219 vm_disk = /myxen/diskfile.img:sda1:w,/myxen/swap.img:sda2:w
2220
2221
2222
2223 vm_checkpoint = <True | False>
2224 A boolean value specifying whether or not to take check‐
2225 points. If not specified, the default value is False. In the
2226 current implementation, setting both vm_checkpoint and
2227 vm_networking to True does not yet work in all cases. Net‐
2228 working cannot be used if a vm universe job uses a checkpoint
2229 in order to continue execution after migration to another
2230 machine.
2231
2232
2233 vm_macaddr = <MACAddr>
2234 Defines the MAC address that the virtual machine's network
2235 interface should have, in the standard format of six groups
2236 of two hexadecimal digits separated by colons.
2237
2238
2239 vm_memory = <MBytes-of-memory>
2240 The amount of memory in MBytes that a vm universe job
2241 requires.
2242
2243
2244 vm_networking = <True | False>
2245 Specifies whether to use networking or not. In the current
2246 implementation, setting both vm_checkpoint and vm_networking
2247 to True does not yet work in all cases. Networking cannot be
2248 used if a vm universe job uses a checkpoint in order to con‐
2249 tinue execution after migration to another machine.
2250
2251
2252 vm_networking_type = <nat | bridge >
2253 When vm_networking is True, this definition augments the
2254 job's requirements to match only machines with the specified
2255 networking. If not specified, then either networking type
2256 matches.
2257
2258
2259 vm_no_output_vm = <True | False>
2260 When True, prevents HTCondor from transferring output files
2261 back to the machine from which the vm universe job was sub‐
2262 mitted. If not specified, the default value is False.
2263
2264
2265 vm_type = <vmware | xen | kvm>
2266 Specifies the underlying virtual machine software that this
2267 job expects.
2268
2269 vmware_dir = <pathname>
2270 The complete path and name of the directory where VMware-spe‐
2271 cific files and applications such as the VMDK (Virtual
2272 Machine Disk Format) and VMX (Virtual Machine Configuration)
2273 reside. This command is optional; when not specified, all
2274 relevant VMware image files are to be listed using trans‐
2275 fer_input_files.
2276
2277
2278 vmware_should_transfer_files = <True | False>
2279 Specifies whether HTCondor will transfer VMware-specific
2280 files located as specified by vmware_dir to the execute
2281 machine (True) or rely on access through a shared file system
2282 (False). Omission of this required command (for VMware vm
2283 universe jobs) results in an error message from condor_sub‐
2284 mit, and the job will not be submitted.
2285
2286
2287 vmware_snapshot_disk = <True | False>
2288 When True, causes HTCondor to utilize a VMware snapshot disk
2289 for new or modified files. If not specified, the default
2290 value is True.
2291
2292 xen_initrd = <image-file>
2293 When xen_kernel gives a file name for the kernel image to
2294 use, this optional command may specify a path to a ramdisk
2295 (initrd) image file. If the image file will be transferred by
2296 HTCondor, then the value should just be the simple file name
2297 (no path information).
2298
2299
2300 xen_kernel = <included | path-to-kernel>
2301 A value of included specifies that the kernel is included in
2302 the disk file. Otherwise, the value is interpreted as
2303 a path and file name of the kernel to be used. If a kernel
2304 file will be transferred by HTCondor, then the value should
2305 just be the simple file name (no path information).
2306
2307 xen_kernel_params = <string>
2308 A string that is appended to the Xen kernel command line.
2309
2310
2311 xen_root = <string>
2312 A string that is appended to the Xen kernel command line to
2313 specify the root device. This string is required when
2314 xen_kernel gives a path to a kernel. Omission for this
2315 required case results in an error message during submission.
2316
2317 COMMANDS FOR THE DOCKER UNIVERSE
2318
2319
2320 docker_image = < image-name >
2321 Defines the name of the Docker image that is the basis for
2322 the docker container.
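
     A minimal sketch of a docker universe submit description
     follows. The image name debian and the command being run are
     illustrative assumptions only; any image available to the
     execute machine could be named instead.

         universe     = docker
         docker_image = debian
         executable   = /bin/cat
         arguments    = /etc/hosts
         output       = out.$(Process)
         queue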
2323
2324 ADVANCED COMMANDS
2325
2326 accounting_group = <accounting-group-name>
2327 Causes jobs to negotiate under the given accounting group.
2328 This value is advertised in the job ClassAd as AcctGroup. The
2329 HTCondor Administrator's manual contains more information
2330 about accounting groups.
2331
2332
2333 accounting_group_user = <accounting-group-user-name>
2334 Sets the user name associated with the accounting group name
2335 for resource usage accounting purposes. If not set, defaults
2336 to the value of the job ClassAd attribute Owner. This value
2337 is advertised in the job ClassAd as AcctGroupUser. If an
2338 accounting group has not been set with the accounting_group
2339 command, this command is ignored.
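
     As a sketch, the two accounting commands are typically given
     together. The group name group_physics and the user name
     alice are hypothetical; real group names are defined by the
     pool administrator.

         accounting_group      = group_physics
         accounting_group_user = alice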
2340
2341
2342 concurrency_limits = <string-list>
2343 A list of resources that this job needs. The resources are
2344 presumed to have concurrency limits placed upon them, thereby
2345 limiting the number of concurrent jobs in execution which
2346 need the named resource. Commas and space characters delimit
2347 the items in the list. Each item in the list is a string
2348 that identifies the limit, or it is a ClassAd expression that
2349 evaluates to a string, and it is evaluated in the context of
2350 the machine ClassAd being considered as a match. Each item in the
2351 list also may specify a numerical value identifying the inte‐
2352 ger number of resources required for the job. The syntax
2353 follows the resource name by a colon character (:) and the
2354 numerical value. Details on concurrency limits are in the
2355 HTCondor Administrator's manual.
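
     The syntax described above can be sketched as follows. The
     limit names sw_license and db_access are hypothetical and
     would have to correspond to limits configured by the pool
     administrator.

         concurrency_limits = sw_license:2, db_access

     In this sketch the job requests 2 units of sw_license. No
     numerical value is given for db_access, so the amount counted
     against that limit is left to the default behavior described
     in the HTCondor Administrator's manual.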
2356
2357
2358 concurrency_limits_expr = <ClassAd String Expression>
2359 A ClassAd expression that represents the list of resources
2360 that this job needs after evaluation. The ClassAd expression
2361 may specify machine ClassAd attributes that are evaluated
2362 against a matched machine. After evaluation, the list sets
2363 concurrency_limits.
2364
2365
2366 copy_to_spool = <True | False>
2367 If copy_to_spool is True, then condor_submit copies the exe‐
2368 cutable to the local spool directory before running it on a
2369 remote host. As copying can be quite time consuming and
2370 unnecessary, the default value is False for all job universes
2371 other than the standard universe. When False, condor_submit
2372 does not copy the executable to a local spool directory. The
2373 default is True in standard universe, because resuming execu‐
2374 tion from a checkpoint can only be guaranteed to work using
2375 precisely the same executable that created the checkpoint.
2376
2377 coresize = <size>
2378 Should the user's program abort and produce a core file,
2379 coresize specifies the maximum size in bytes of the core file
2380 which the user wishes to keep. If coresize is not specified
2381 in the command file, the system's user resource limit core‐
2382 dumpsize is used (note that coredumpsize is not an HTCondor
2383 parameter - it is an operating system parameter that can be
2384 viewed with the limit or ulimit command on Unix and in the
2385 Registry on Windows). A value of -1 results in no limits
2386 being applied to the core file size. If HTCondor is running
2387 as root, a coresize setting greater than the system coredump‐
2388 size limit will override the system setting; if HTCondor is
2389 not running as root, the system coredumpsize limit will over‐
2390 ride coresize.
2391
2392
2393 cron_day_of_month = <Cron-evaluated Day>
2394 The set of days of the month for which a deferral time
2395 applies. The HTCondor User's manual section on Time Schedul‐
2396 ing for Job Execution has further details.
2397
2398
2399 cron_day_of_week = <Cron-evaluated Day>
2400 The set of days of the week for which a deferral time
2401 applies. The HTCondor User's manual section on Time Schedul‐
2402 ing for Job Execution has further details.
2403
2404 cron_hour = <Cron-evaluated Hour>
2405 The set of hours of the day for which a deferral time
2406 applies. The HTCondor User's manual section on Time Schedul‐
2407 ing for Job Execution has further details.
2408
2409 cron_minute = <Cron-evaluated Minute>
2410 The set of minutes within an hour for which a deferral time
2411 applies. The HTCondor User's manual section on Time Schedul‐
2412 ing for Job Execution has further details.
2413
2414
2415 cron_month = <Cron-evaluated Month>
2416 The set of months within a year for which a deferral time
2417 applies. The HTCondor User's manual section on Time Schedul‐
2418 ing for Job Execution has further details.
2419
2420
2421 cron_prep_time = <ClassAd Integer Expression>
2422 Analogous to deferral_prep_time. The number of seconds
2423 prior to a job's deferral time that the job may be matched
2424 and sent to an execution machine.
2425
2426
2427 cron_window = <ClassAd Integer Expression>
2428 Analogous to the submit command deferral_window. It allows
2429 cron jobs that miss their deferral time to begin execution.
2430
2431 The HTCondor User's manual section on Time Scheduling for Job
2432 Execution has further details.
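
     As a hedged sketch, the cron commands can be combined to
     describe a recurring start time. The values below are
     illustrative and assume conventional cron field syntax, in
     which 1-5 for cron_day_of_week denotes Monday through Friday.

         cron_minute      = 30
         cron_hour        = 2
         cron_day_of_week = 1-5
         cron_window      = 120

     Under these assumptions the job's deferral time falls at
     02:30 on weekdays, and a job that misses that time by up to
     120 seconds may still begin execution.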
2433
2434
2435 dagman_log = <pathname>
2436 DAGMan inserts this command to specify an event log that it
2437 watches to maintain the state of the DAG. If the log com‐
2438 mand is not specified in the submit file, DAGMan uses the log
2439 command to specify the event log.
2440
2441 deferral_prep_time = <ClassAd Integer Expression>
2442 The number of seconds prior to a job's deferral time that the
2443 job may be matched and sent to an execution machine.
2444
2445 The HTCondor User's manual section on Time Scheduling for Job
2446 Execution has further details.
2447
2448
2449 deferral_time = <ClassAd Integer Expression>
2450 Allows a job to specify the time at which its execution is to
2451 begin, instead of beginning execution as soon as it arrives
2452 at the execution machine. The deferral time is an expression
2453 that evaluates to a Unix Epoch timestamp (the number of sec‐
2454 onds elapsed since 00:00:00 on January 1, 1970, Coordinated
2455 Universal Time). Deferral time is evaluated with respect to
2456 the execution machine. This option delays the start of execu‐
2457 tion, but not the matching and claiming of a machine for the
2458 job. If the job is not available and ready to begin execution
2459 at the deferral time, it has missed its deferral time. A job
2460 that misses its deferral time will be put on hold in the
2461 queue.
2462
2463 The HTCondor User's manual section on Time Scheduling for Job
2464 Execution has further details.
2465
2466 Due to implementation details, a deferral time may not be
2467 used for scheduler universe jobs.
2468
2469
2470 deferral_window = <ClassAd Integer Expression>
2471 The deferral window is used in conjunction with the defer‐
2472 ral_time command to allow jobs that miss their deferral time
2473 to begin execution.
2474
2475 The HTCondor User's manual section on Time Scheduling for Job
2476 Execution has further details.
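
     The three deferral commands can be sketched together as
     follows. The timestamp is an arbitrary example value, not one
     taken from this manual.

         deferral_time      = 1700000000
         deferral_window    = 300
         deferral_prep_time = 120

     In this sketch the job may be matched and sent to an execute
     machine up to 120 seconds before the deferral time, and it
     may still begin execution if it misses the deferral time by
     no more than 300 seconds.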
2477
2478
2479 description = <string>
2480 A string that sets the value of the job ClassAd attribute
2481 JobDescription. When set, tools which display the executable
2482 such as condor_q will instead use this string.
2483
2484
2485 email_attributes = <list-of-job-ad-attributes>
2486 A comma-separated list of attributes from the job ClassAd.
2487 These attributes and their values will be included in the
2488 e-mail notification of job completion.
2489
2490
2491 image_size = <size>
2492 Advice to HTCondor specifying the maximum virtual image size
2493 to which the job will grow during its execution. HTCondor
2494 will then execute the job only on machines which have enough
2495 resources (such as virtual memory) to support executing the
2496 job. If not specified, HTCondor will automatically make a
2497 (reasonably accurate) estimate about the job's size and
2498 adjust this estimate as the program runs. If specified and
2499 underestimated, the job may crash due to the inability to
2500 acquire more address space; for example, if malloc() fails.
2501 If the image size is overestimated, HTCondor may have diffi‐
2502 culty finding machines which have the required resources.
2503 size is specified in KiB. For example, for an image size of 8
2504 MiB, size should be 8192.
2505
2506 initialdir = <directory-path>
2507 Used to give jobs a directory with respect to file input and
2508 output. Also provides a directory (on the machine from which
2509 the job is submitted) for the job event log, when a full path
2510 is not specified.
2511
2512 For vanilla universe jobs where there is a shared file sys‐
2513 tem, it is the current working directory on the machine where
2514 the job is executed.
2515
2516 For vanilla or grid universe jobs where file transfer mecha‐
2517 nisms are utilized (there is not a shared file system), it is
2518 the directory on the machine from which the job is submitted
2519 where the input files come from, and where the job's output
2520 files go to.
2521
2522 For standard universe jobs, it is the directory on the
2523 machine from which the job is submitted where the con‐
2524 dor_shadow daemon runs; the current working directory for
2525 file input and output accomplished through remote system
2526 calls.
2527
2528 For scheduler universe jobs, it is the directory on the
2529 machine from which the job is submitted where the job runs;
2530 the current working directory for file input and output with
2531 respect to relative path names.
2532
2533 Note that the path to the executable is not relative to ini‐
2534 tialdir; if it is a relative path, it is relative to the
2535 directory in which the condor_submit command is run.
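
     A common pattern, sketched here with hypothetical file names,
     gives each job in a cluster its own directory. The
     directories run_0, run_1, and run_2 are assumed to exist
     already on the submit machine.

         executable = my_program
         initialdir = run_$(Process)
         input      = data.in
         output     = data.out
         queue 3

     Each job reads data.in from, and writes data.out to, its own
     run_N directory, while my_program is still located relative
     to the directory in which condor_submit is run.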
2536
2537
2538 job_ad_information_attrs = <attribute-list>
2539 A comma-separated list of job ClassAd attribute names. The
2540 named attributes and their values are written to the job
2541 event log whenever any event is being written to the log.
2542 This implements the same thing as the configuration variable
2543 EVENT_LOG_INFORMATION_ATTRS (see the admin-manual/configura‐
2544 tion-macros:daemon logging configuration file entries page),
2545 but it applies to the job event log, instead of the system
2546 event log.
2547
2548
2549 JobBatchName = <batch_name>
2550 Set the batch name for this submit. The batch name is dis‐
2551 played by condor_q -batch. It is intended for use by users to
2552 give meaningful names to their jobs and to influence how con‐
2553 dor_q groups jobs for display. This value in a submit file
2554 can be overridden by specifying the -batch-name argument on
2555 the condor_submit command line.
2556
2557
2558 job_lease_duration = <number-of-seconds>
2559 For vanilla, parallel, VM, and java universe jobs only, the
2560 duration in seconds of a job lease. The default value is
2561 2,400, or forty minutes. If a job lease is not desired, the
2562 value can be explicitly set to 0 to disable the job lease
2563 semantics. The value can also be a ClassAd expression that
2564 evaluates to an integer. The HTCondor User's manual section
2565 on Special Environment Considerations has further details.
2566
2567
2568 job_machine_attrs = <attr1, attr2, ...>
2569 A comma and/or space separated list of machine attribute
2570 names that should be recorded in the job ClassAd in addition
2571 to the ones specified by the condor_schedd daemon's system
2572 configuration variable SYSTEM_JOB_MACHINE_ATTRS. When there
2573 are multiple run attempts, a history of machine
2574 attributes from previous run attempts may be kept. The number
2575 of run attempts to store may be extended beyond the sys‐
2576 tem-specified history length by using the submit file command
2577 job_machine_attrs_history_length. A machine attribute
2578 named X will be inserted into the job ClassAd as an attribute
2579 named MachineAttrX0. The previous value of this attribute
2580 will be named MachineAttrX1, the previous to that will be
2581 named MachineAttrX2, and so on, up to the specified history
2582 length. A history of length 1 means that only MachineAttrX0
2583 will be recorded. The value recorded in the job ClassAd is
2584 the evaluation of the machine attribute in the context of the
2585 job ClassAd when the condor_schedd daemon initiates the start
2586 up of the job. If the evaluation results in an Undefined or
2587 Error result, the value recorded in the job ad will be Unde‐
2588 fined or Error, respectively.
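
     As an illustration of the naming scheme described above, the
     following sketch records two machine attributes with a
     history length of 3. The choice of attributes is an
     assumption made for the example.

         job_machine_attrs = Machine, OpSys
         job_machine_attrs_history_length = 3

     With these settings the job ClassAd would be expected to
     carry MachineAttrMachine0 through MachineAttrMachine2 and
     MachineAttrOpSys0 through MachineAttrOpSys2 once the job has
     had at least three run attempts.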
2589
2590
2591 want_graceful_removal = <boolean expression>
2592 If true, this job will be given a chance to shut down cleanly
2593 when removed. The job will be given as much time as the
2594 administrator of the execute resource allows, which may be
2595 none. The default is false. For details, see the configura‐
2596 tion setting GRACEFULLY_REMOVE_JOBS.
2597
2598
2599 kill_sig = <signal-number>
2600 When HTCondor needs to kick a job off of a machine, it will
2601 send the job the signal specified by signal-number. sig‐
2602 nal-number needs to be an integer which represents a valid
2603 signal on the execution machine. For jobs submitted to the
2604 standard universe, the default value is the number for SIGT‐
2605 STP which tells the HTCondor libraries to initiate a check‐
2606 point of the process. For jobs submitted to other universes,
2607 the default value, when not defined, is SIGTERM, which is the
2608 standard way to terminate a program in Unix.
2609
2610 kill_sig_timeout = <seconds>
2611 This submit command should no longer be used as of HTCondor
2612 version 7.7.3; use job_max_vacate_time instead. If
2613 job_max_vacate_time is not defined, this defines the number
2614 of seconds that HTCondor should wait between sending
2615 the kill signal defined by kill_sig and forcibly killing
2616 the job. The actual amount of time between sending the signal
2617 and forcibly killing the job is the smaller of this value
2618 and the configuration variable KILLING_TIMEOUT,
2619 as defined on the execute machine.
2620
2621
2622 load_profile = <True | False>
2623 When True, loads the account profile of the dedicated run
2624 account for Windows jobs. May not be used with
2625 run_as_owner.
2626
2627
2628 match_list_length = <integer value>
2629 Defaults to the value zero (0). When match_list_length is
2630 defined with an integer value greater than zero (0),
2631 attributes are inserted into the job ClassAd. The maximum
2632 number of attributes defined is given by the integer value.
2633 The job ClassAd attributes introduced are given as
2634
2635 LastMatchName0 = "most-recent-Name"
2636 LastMatchName1 = "next-most-recent-Name"
2637
2638 The value for each introduced attribute is given by the value
2639 of the Name attribute from the machine ClassAd of a previous
2640 execution (match). As a job is matched, the definitions for
2641 these attributes will roll, with LastMatchName1 becoming
2642 LastMatchName2, LastMatchName0 becoming LastMatchName1, and
2643 LastMatchName0 being set by the most recent value of the Name
2644 attribute.
2645
2646 An intended use of these job attributes is in the require‐
2647 ments expression. The requirements can allow a job to prefer
2648 a match with either the same or a different resource than a
2649 previous match.
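
     A sketch of such a requirements expression follows. It
     assumes the ClassAd =!= (is not identical to) operator, so
     that the expression is still True on the first match, when
     LastMatchName0 is undefined.

         match_list_length = 1
         requirements = (TARGET.Name =!= LastMatchName0)

     Under that assumption, the job avoids the resource it was
     most recently matched with.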
2650
2651
2652
2653 job_max_vacate_time = <integer expression>
2654 An integer-valued expression (in seconds) that may be used to
2655 adjust the time given to an evicted job for gracefully shut‐
2656 ting down. If the job's setting is less than the machine's,
2657 the job's is used. If the job's setting is larger than the
2658 machine's, the result depends on whether the job has any
2659 excess retirement time. If the job has more retirement time
2660 left than the machine's max vacate time setting, then retire‐
2661 ment time will be converted into vacating time, up to the
2662 amount requested by the job.
2663
2664 Setting this expression does not affect the job's resource
2665 requirements or preferences. For a job to only run on a
2666 machine with a minimum MachineMaxVacateTime, or to preferen‐
2667 tially run on such machines, explicitly specify this in the
2668 requirements and/or rank expressions.
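
     As a sketch of the explicit requirement mentioned above, a
     job that wants 300 seconds to shut down, and only wants
     machines that offer at least that much, might specify the
     following. The value 300 is illustrative.

         job_max_vacate_time = 300
         requirements = (TARGET.MachineMaxVacateTime >= 300)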
2669
2670
2671 max_job_retirement_time = <integer expression>
2672 An integer-valued expression (in seconds) that does nothing
2673 unless the machine that runs the job has been configured to
2674 provide retirement time. Retirement time is a grace period
2675 given to a job to finish when a resource claim is about to be
2676 preempted. The default behavior in many cases is to take as
2677 much retirement time as the machine offers, so this command
2678 will rarely appear in a submit description file.
2679
2680 When a resource claim is to be preempted, this expression in
2681 the submit file specifies the maximum run time of the job (in
2682 seconds, since the job started). This expression has no
2683 effect if it is greater than the maximum retirement time
2684 provided by the machine policy. If the resource claim is not
2685 preempted, this expression and the machine retirement policy
2686 are irrelevant. If the resource claim is preempted the job
2687 will be allowed to run until the retirement time expires, at
2688 which point it is hard-killed. The job will be soft-killed
2689 when it is getting close to the end of retirement in order to
2690 give it time to gracefully shut down. The amount of lead-time
2691 for soft-killing is determined by the maximum vacating time
2692 granted to the job.
2693
2694 Standard universe jobs and any jobs running with nice_user
2695 priority have a default max_job_retirement_time of 0, so no
2696 retirement time is utilized by default. In all other cases,
2697 no default value is provided, so the maximum amount of
2698 retirement time is utilized by default.
2699
2700 Setting this expression does not affect the job's resource
2701 requirements or preferences. For a job to only run on a
2702 machine with a minimum MaxJobRetirementTime, or to preferen‐
2703 tially run on such machines, explicitly specify this in the
2704 requirements and/or rank expressions.
2705
2706 nice_user = <True | False>
2707 Normally, when a machine becomes available to HTCondor,
2708 HTCondor decides which job to run based upon user and job
2709 priorities. Setting nice_user equal to True tells HTCondor
2710 not to use your regular user priority, but that this job
2711 should have last priority among all users and all jobs. So
2712 jobs submitted in this fashion run only on machines which no
2713 other non-nice_user job wants - a true bottom-feeder job!
2714 This is very handy if a user has some jobs they wish to run,
2715 but do not wish to use resources that could instead be used
2716 to run other people's HTCondor jobs. Jobs submitted in this
2717 fashion have "nice-user." prepended to the owner name when
2718 viewed from condor_q or condor_userprio. The default value is
2719 False.
2720
2721 noop_job = <ClassAd Boolean Expression>
2722 When this boolean expression is True, the job is immediately
2723 removed from the queue, and HTCondor makes no attempt at run‐
2724 ning the job. The log file for the job will show a job sub‐
2725 mitted event and a job terminated event, along with an exit
2726 code of 0, unless the user specifies a different signal or
2727 exit code.
2728
2729
2730 noop_job_exit_code = <return value>
2731 When noop_job is in the submit description file and evalu‐
2732 ates to True, this command allows the job to specify the
2733 return value as shown in the job's log file job terminated
2734 event. If not specified, the job will show as having termi‐
2735 nated with status 0. This overrides any value specified with
2736 noop_job_exit_signal.
2737
2738
2739 noop_job_exit_signal = <signal number>
2740 When noop_job is in the submit description file and evalu‐
2741 ates to True, this command allows the job to specify the sig‐
2742 nal number that the job's log event will show the job having
2743 terminated with.
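
     For illustration, the following sketch marks a job as a no-op
     and chooses the exit code reported in the job terminated
     event. The exit code 2 is an arbitrary example value.

         noop_job           = True
         noop_job_exit_code = 2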
2744
2745
2746 remote_initialdir = <directory-path>
2747 The path specifies the directory in which the job is to be
2748 executed on the remote machine. This is currently supported
2749 in all universes except for the standard universe.
2750
2751
2752 rendezvousdir = <directory-path>
2753 Used to specify the shared file system directory to be used
2754 for file system authentication when submitting to a remote
2755 scheduler. Should be a path to a preexisting directory.
2756
2757
2758 run_as_owner = <True | False>
2759 A boolean value that causes the job to be run under the login
2760 of the submitter, if supported by the joint configuration of
2761 the submit and execute machines. On Unix platforms, this
2762 defaults to True, and on Windows platforms, it defaults to
2763 False. May not be used with load_profile. See the HTCondor
2764 manual Platform-Specific Information chapter for administra‐
2765 tive details on configuring Windows to support this option.
2766
2767 stack_size = <size in bytes>
2768 This command applies only to Linux platform jobs that are not
2769 standard universe jobs. An integer number of bytes, repre‐
2770 senting the amount of stack space to be allocated for the
2771 job. This value replaces the default allocation of stack
2772 space, which is unlimited in size.
2773
2774 submit_event_notes = <note>
2775 A string that is appended to the submit event in the job's
2776 log file. For DAGMan jobs, the string DAG Node: and the
2777 node's name is automatically defined for submit_event_notes,
2778 causing the logged submit event to identify the DAG node job
2779 submitted.
2780
2781 +<attribute> = <value>
2782 A line that begins with a '+' (plus) character instructs con‐
2783 dor_submit to insert the given attribute into the job ClassAd
2784 with the given value. Note that setting an attribute should
2785 not be used in place of one of the specific commands listed
2786 above. Often, the command name does not directly correspond
2787 to an attribute name; furthermore, many submit commands
2788 result in actions more complex than simply setting an
2789 attribute or attributes. See /classad-attributes/job-clas‐
2790 sad-attributes for a list of HTCondor job attributes.
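
     A minimal sketch of this form, using two hypothetical custom
     attribute names, is shown below. It assumes ClassAd literal
     syntax, in which string values are enclosed in double quotes
     and integers are written bare.

         +MyProjectTag = "climate-study"
         +MyRunNumber  = 42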
2791
2792 MACROS AND COMMENTS
2793
2794
2795 In addition to commands, the submit description file can contain macros
2796 and comments.
2797
2798 Macros Parameterless macros in the form of $(macro_name:default ini‐
2799 tial value) may be used anywhere in HTCondor submit descrip‐
2800 tion files to provide textual substitution at submit time.
2801 Macros can be defined by lines in the form of
2802
2803 <macro_name> = <string>
2804
2805 Two pre-defined macros are supplied by the submit description
2806 file parser. The $(Cluster) or $(ClusterId) macro supplies
2807 the value of the ClusterId job ClassAd attribute,
2809 and the $(Process) or
2810 $(ProcId) macro supplies the value of the ProcId job ClassAd
2811 attribute. These macros are intended to aid in the specifica‐
2812 tion of input/output files, arguments, etc., for clusters
2813 with lots of jobs, and/or could be used to supply an HTCondor
2814 process with its own cluster and process numbers on the com‐
2815 mand line.
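
     Both uses can be sketched in one submit description. The
     executable name my_program and its command-line options are
     hypothetical.

         executable = my_program
         arguments  = --cluster $(Cluster) --index $(Process)
         output     = out.$(Cluster).$(Process)
         error      = err.$(Cluster).$(Process)
         queue 5

     The five jobs share one cluster number and receive process
     numbers 0 through 4, so each job writes to its own output and
     error files.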
2816
2817 The $(Node) macro is defined for parallel universe jobs, and
2818 is especially relevant for MPI applications. It is a unique
2819 value assigned for the duration of the job that essentially
2820 identifies the machine (slot) on which a program is execut‐
2821 ing. Values assigned start at 0 and increase monotonically.
2822 The values are assigned as the parallel job is about to
2823 start.
2824
2825 Recursive definition of macros is permitted. An example of a
2826 construction that works is the following:
2827
2828 foo = bar
2829 foo = snap $(foo)
2830
2831 As a result, foo = snap bar.
2832
2833 Note that both left- and right-recursion work, so
2834
2835 foo = bar
2836 foo = $(foo) snap
2837
2838 has as its result foo = bar snap.
2839
2840 The construction
2841
2842 foo = $(foo) bar
2843
2844 by itself will not work, as it does not have an initial base
2845 case. Mutually recursive constructions such as:
2846
2847 B = bar
2848 C = $(B)
2849 B = $(C) boo
2850
2851 will not work, and will fill memory with expansions.
2852
2853 A default value may be specified, for use if the macro has no
2854 definition. Consider the example
2855
2856 D = $(E:24)
2857
2858 Where E is not defined within the submit description file,
2859 the default value 24 is used, resulting in
2860
2861 D = 24
2862
2863 This is of limited value, as the scope of macro substitution
2864 is the submit description file. Thus, either the macro is or
2865 is not defined within the submit description file. If the
2866 macro is defined, then the default value is useless. If the
2867 macro is not defined, then there is no point in using it in a
2868 submit command.
2869
2870
2871 To use the dollar sign character ($) as a literal, without
2872 macro expansion, use
2873
2874 $(DOLLAR)
2875
2876 In addition to the normal macro, there is also a special kind
2877 of macro called a substitution macro that
2878 allows the substitution of a machine ClassAd attribute
2879 value defined on the resource machine itself (obtained after a
2880 match to the machine has been made) into specific commands
2881 within the submit description file. The substitution macro is
2882 of the form:
2883
2884 $$(attribute)
2885
2886 As this form of the substitution macro is only evaluated
2887 within the context of the machine ClassAd, use of a scope
2888 resolution prefix TARGET. or MY. is not allowed.
2889
2890 A common use of this form of the substitution macro is for
2891 the heterogeneous submission of an executable:
2892
2893 executable = povray.$$(OpSys).$$(Arch)
2894
2895 Values for the OpSys and Arch attributes are substituted at
2896 match time for any given resource. This example allows HTCon‐
2897 dor to automatically choose the correct executable for the
2898 matched machine.
2899
2900 An extension to the syntax of the substitution macro provides
2901 an alternative string to use if the machine attribute within
2902 the substitution macro is undefined. The syntax appears as:
2903
2904 $$(attribute:string_if_attribute_undefined)
2905
2906 An example using this extended syntax provides a path name to
2907 a required input file. Since the file can be placed in dif‐
2908 ferent locations on different machines, the file's path name
2909 is given as an argument to the program.
2910
2911 arguments = $$(input_file_path:/usr/foo)
2912
2913 On the machine, if the attribute input_file_path is not
2914 defined, then the path /usr/foo is used instead.
2915
A further extension to the syntax of the substitution macro
allows the evaluation of a ClassAd expression to define the
value. In this form, the expression may refer to machine
attributes by prefacing them with the TARGET. scope resolution
prefix. To place a ClassAd expression into the substitution
macro, square brackets are added to delimit the expression.
The syntax appears as:
2923
2924 $$([ClassAd expression])
2925
An example of a job that uses this syntax is one that needs
to know how much memory it can use. The application cannot
detect this itself, as it would potentially use all of the
memory on a multi-slot machine. So the job determines the
memory per slot, reduces it by 10% to account for
miscellaneous overhead, and passes the result as a command
line argument to the application. The submit description file
contains
2933
2934 arguments = --memory $$([TARGET.Memory * 0.9])
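
Because the product TARGET.Memory * 0.9 is a real number, an
application that expects a whole number may need the value converted.
A possible variation, assuming the truncating behavior of the ClassAd
function int() is acceptable for the application, is:

```text
# int() converts the real-valued result to an integer
arguments = --memory $$([int(TARGET.Memory * 0.9)])
```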
2935
2936
2937
2938 To insert two dollar sign characters ($$) as literals into a
2939 ClassAd string, use
2940
2941 $$(DOLLARDOLLAR)
2942
2943
2944
2945
2946
The environment macro, $ENV, allows the evaluation of an
environment variable to be used in setting a submit
description file command. The syntax used is
2950
2951 $ENV(variable)
2952
2953 An example submit description file command that uses this
2954 functionality evaluates the submitter's home directory in
2955 order to set the path and file name of a log file:
2956
2957 log = $ENV(HOME)/jobs/logfile
2958
2959 The environment variable is evaluated when the submit
2960 description file is processed.
2961
2962
2963
2964
The $RANDOM_CHOICE macro allows a random choice to be made
from a given list of parameters at submission time. Where an
expression needs some randomness, the macro may appear as
2969
2970 $RANDOM_CHOICE(0,1,2,3,4,5,6)
2971
When evaluated, one of the parameter values will be chosen.
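
For instance, the macro could supply a seed to a program. This is a
sketch in which the option --seed and the listed values are
hypothetical:

```text
# one of the three listed values is substituted for the macro
arguments = --seed $RANDOM_CHOICE(11,23,42)
```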
2973
2974 Comments
Blank lines and lines beginning with a pound sign ('#')
character are ignored by the submit description file parser.
2977
While processing the queue command in a submit file or from the command
line, condor_submit will set the values of several automatic submit
variables so that they can be referred to by statements in the submit
file. With the exception of Cluster and Process, if these variables are
set by the submit file, they will not be modified during queue
processing.
2985
2986 ClusterId
Set to the integer value that the ClusterId attribute of the
job ClassAd will have when the job is submitted. All jobs in
a single submit will normally have the same value for the
ClusterId. If the -dry-run argument is specified, the value
will be 1.
2992
2993 Cluster
2994 Alternate name for the ClusterId submit variable. Before
2995 HTCondor version 8.4 this was the only name.
2996
2997 ProcId Set to the integer value that the ProcId attribute of the job
2998 ClassAd will have when the job is submitted. The value will
2999 start at 0 and increment by 1 for each job submitted.
3000
3001 Process
Alternate name for the ProcId submit variable. Before
HTCondor version 8.4 this was the only name.
3004
Node For parallel universes, set to the value #pArAlLeLnOdE# or
#MpInOdE#, depending on the parallel universe type. For other
universes, it is set to nothing.
3008
3009 Step Set to the step value as it varies from 0 to N-1 where N is
3010 the number provided on the queue argument. This variable
3011 changes at the same rate as ProcId when it changes at all.
3012 For submit files that don't make use of the queue number
3013 option, Step will always be 0. For submit files that don't
3014 make use of any of the foreach options, Step and ProcId will
3015 always be the same.
3016
3017 ItemIndex
Set to the index within the item list being processed by the
various queue foreach options. For submit files that don't
make use of any queue foreach list, ItemIndex will always be
0. For submit files that make use of a slice to select only
some items in a foreach list, ItemIndex will only be set to
selected values.
3024
3025 Row Alternate name for ItemIndex.
3026
Item When a queue foreach option is used and no variable list is
supplied, this variable will be set to the value of the
current item.
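
As an illustration of how these variables relate, consider the
following sketch, in which the executable name and the item names are
made up. It queues two jobs for each of two items:

```text
executable = analyze
arguments  = $(Item) $(Step)
output     = out.$(ClusterId).$(ProcId)

# two items, two jobs per item: four jobs in one cluster
queue 2 in (alpha, beta)
```

Under the definitions above, the four jobs would be expected to see
these values:

```text
ProcId  Item   ItemIndex  Step
0       alpha  0          0
1       alpha  0          1
2       beta   1          0
3       beta   1          1
```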
3030
3031 The automatic variables below are set before parsing the submit file,
3032 and will not vary during processing unless the submit file itself sets
3033 them.
3034
ARCH Set to the CPU architecture of the machine running
condor_submit. The value will be the same as the automatic
configuration variable of the same name.

OPSYS Set to the name of the operating system on the machine
running condor_submit. The value will be the same as the
automatic configuration variable of the same name.
3042
3043 OPSYSANDVER
3044 Set to the name and major version of the operating system on
3045 the machine running condor_submit. The value will be the same
3046 as the automatic configuration variable of the same name.
3047
3048 OPSYSMAJORVER
3049 Set to the major version of the operating system on the
3050 machine running condor_submit. The value will be the same as
3051 the automatic configuration variable of the same name.
3052
3053 OPSYSVER
3054 Set to the version of the operating system on the machine
3055 running condor_submit. The value will be the same as the
3056 automatic configuration variable of the same name.
3057
SPOOL Set to the full path of the HTCondor spool directory. The
value will be the same as the automatic configuration
variable of the same name.
3061
3062 IsLinux
3063 Set to true if the operating system of the machine running
3064 condor_submit is a Linux variant. Set to false otherwise.
3065
3066 IsWindows
3067 Set to true if the operating system of the machine running
3068 condor_submit is a Microsoft Windows variant. Set to false
3069 otherwise.
3070
3071 SUBMIT_FILE
3072 Set to the full pathname of the submit file being processed
3073 by condor_submit. If submit statements are read from standard
3074 input, it is set to nothing.
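
These variables can be referenced like any other submit macro. The
sketch below restricts a job to machines of the same platform as the
submit machine; the executable names are hypothetical. The if/else/endif
form assumes a condor_submit version that supports submit description
file conditionals.

```text
# match only machines with the submit machine's architecture and OS
requirements = (Arch == "$(ARCH)") && (OpSys == "$(OPSYS)")

# write the job log next to the submit file; if the submit statements
# come from standard input, SUBMIT_FILE is empty
log = $(SUBMIT_FILE).log

# choose a platform-specific executable at submit time
if $(IsWindows)
  executable = analyze.exe
else
  executable = analyze
endif
```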
3075
3077 condor_submit will exit with a status value of 0 (zero) upon success,
3078 and a non-zero value upon failure.
3079
· Submit Description File Example 1: This example queues three jobs for
execution by HTCondor. The first will be given command line arguments
of 15 and 2000, and it will write its standard output to foo.out0.
The second will be given command line arguments of 30 and 2000, and
it will write its standard output to foo.out1. Similarly, the third
will have arguments of 45 and 6000, and it will use foo.out2 for its
standard output. Standard error output (if any) from the three
programs will appear in foo.err0, foo.err1, and foo.err2,
respectively.
3089
####################
#
# submit description file
# Example 1: queuing multiple jobs with differing
# command line arguments and output files.
#
####################

Executable = foo
Universe = vanilla

Arguments = 15 2000
Output = foo.out0
Error = foo.err0
Queue

Arguments = 30 2000
Output = foo.out1
Error = foo.err1
Queue

Arguments = 45 6000
Output = foo.out2
Error = foo.err2
Queue
3115
The same results can be obtained by using a list of arguments with
the Queue statement:
3118
####################
#
# submit description file
# Example 1b: queuing multiple jobs with differing
# command line arguments and output files, alternate syntax
#
####################

Executable = foo
Universe = vanilla

# generate different output and error filenames for each process
Output = foo.out$(Process)
Error = foo.err$(Process)

Queue Arguments From (
15 2000
30 2000
45 6000
)
3139
· Submit Description File Example 2: This submit description file
example queues 150 runs of program foo, which must have been compiled
and linked for an Intel x86 processor running RHEL 3. HTCondor will
not attempt to run the processes on machines which have less than 32
Megabytes of physical memory, and it will prefer machines which have
at least 64 Megabytes, if such machines are available. Stdin, stdout,
and stderr will refer to in.0, out.0, and err.0 for the first run of
this program (process 0). Stdin, stdout, and stderr will refer to
in.1, out.1, and err.1 for process 1, and so forth. A log file
containing entries about where and when HTCondor runs, takes
checkpoints, and migrates processes in this cluster will be written
into file foo.log.
3152
####################
#
# Example 2: Show off some fancy features including
# use of pre-defined macros and logging.
#
####################

Executable = foo
Universe = standard
Requirements = OpSys == "LINUX" && Arch == "INTEL"
Rank = Memory >= 64
Request_Memory = 32 Mb
Image_Size = 28 Mb

Error = err.$(Process)
Input = in.$(Process)
Output = out.$(Process)
Log = foo.log
Queue 150
3172
· Submit Description File Example 3: This example targets the
/bin/sleep program to run only on a platform running a RHEL 6
operating system. The example presumes that the pool contains
machines running more than one version of Linux, and this job needs
the particular operating system to run correctly.
3178
####################
#
# Example 3: Run on a RedHat 6 machine
#
####################
Universe = vanilla
Executable = /bin/sleep
Arguments = 30
Requirements = (OpSysAndVer == "RedHat6")

Error = err.$(Process)
Input = in.$(Process)
Output = out.$(Process)
Log = sleep.log
Queue
3194
· Command Line example: The following command uses the -append option
(abbreviated here as -a) to add two commands before the job or jobs
are queued. A log file and an error log file are specified. The
submit description file is unchanged.

condor_submit -a "log = out.log" -a "error = error.log" mysubmitfile

Note that each of the added commands is contained within quote marks
because there are space characters within the command.
3204
· periodic_remove example: A job should be removed from the queue if
the total suspension time of the job is more than half of the run
time of the job.

Including the command

periodic_remove = CumulativeSuspensionTime > \
    ((RemoteWallClockTime - CumulativeSuspensionTime) / 2.0)

in the submit description file causes this to happen. The trailing
backslash continues the command onto the next line.
3215
3217 · For security reasons, HTCondor will refuse to run any jobs submitted
3218 by user root (UID = 0) or by a user whose default group is group
3219 wheel (GID = 0). Jobs submitted by user root or a user with a default
3220 group of wheel will appear to sit forever in the queue in an idle
3221 state.
3222
3223 · All path names specified in the submit description file must be less
3224 than 256 characters in length, and command line arguments must be
3225 less than 4096 characters in length; otherwise, condor_submit gives a
3226 warning message but the jobs will not execute properly.
3227
· Somewhat understandably, behavior gets bizarre if the user makes the
mistake of requesting multiple HTCondor jobs to write to the same
file, and/or if the user alters any files that need to be accessed by
an HTCondor job which is still in the queue. For example, the
compressing of data or output files before an HTCondor job has
completed is a common mistake.
3234
3235 · To disable checkpointing for Standard Universe jobs, include the
3236 line:
3237
3238 +WantCheckpoint = False
3239
3240 in the submit description file before the queue command(s).
3241
3243 HTCondor User Manual
3244
3246 HTCondor Team
3247
3249 1990-2020, Center for High Throughput Computing, Computer Sciences
3250 Department, University of Wisconsin-Madison, Madison, WI, US. Licensed
3251 under the Apache License, Version 2.0.
3252
3253
3254
3255
8.8 Aug 06, 2020 CONDOR_SUBMIT(1)