just-man-pages/condor_submit(1)   General Commands Manual   just-man-pages/condor_submit(1)


NAME
   condor_submit - Queue jobs for execution under Condor

SYNOPSIS
   condor_submit [ -verbose ] [ -unused ] [ -name schedd_name ]
   [ -remote schedd_name ] [ -pool pool_name ] [ -disable ]
   [ -password passphrase ] [ -debug ] [ -append command ... ]
   [ -spool ] [ -dump filename ] [ submit description file ]

DESCRIPTION
condor_submit is the program for submitting jobs for execution under
Condor. condor_submit requires a submit description file which contains
commands to direct the queuing of jobs. One submit description file may
contain specifications for the queuing of many Condor jobs at once. A
single invocation of condor_submit may create one or more clusters. A
cluster is a set of jobs specified in the submit description file
between queue commands for which the executable is not changed. It is
advantageous to submit multiple jobs as a single cluster because:

   * Only one copy of the checkpoint file is needed to represent all
     jobs in a cluster until they begin execution.

   * There is much less overhead involved for Condor to start the next
     job in a cluster than for Condor to start a new cluster. This can
     make a big difference when submitting lots of short jobs.

Multiple clusters may be specified within a single submit description
file. Each cluster must specify a single executable.
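
As an illustrative sketch (the executable and input file names are
placeholders), a single submit description file such as the following
creates two clusters, one per executable command:

```
executable = analyze
input      = data.in
queue

executable = summarize
input      = data.in
queue
```

Each executable command begins a new cluster; the two queue commands
here each queue one job.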

The job ClassAd attribute ClusterId identifies a cluster. See the
Appendix for specifics on this attribute.

Note that submission of jobs from a Windows machine requires a stashed
password to allow Condor to impersonate the user submitting the job. To
stash a password, use the condor_store_cred command; see its manual
page for details.

See the listing of submit description file commands below for the
commands that may be placed in the submit description file to direct
the submission of a job.

OPTIONS
   -verbose
      Verbose output - display the created job ClassAd.

   -unused
      As a default, causes no warnings to be issued about user-defined
      macros not being used within the submit description file. The
      meaning reverses (toggles) when the configuration variable
      WARN_ON_UNUSED_SUBMIT_FILE_MACROS is set to the non-default value
      of False. Printing the warnings can help identify spelling errors
      of submit description file commands. The warnings are sent to
      stderr.

   -name schedd_name
      Submit to the specified condor_schedd. Use this option to submit
      to a condor_schedd other than the default local one. schedd_name
      is the value of the Name ClassAd attribute on the machine where
      the condor_schedd daemon runs.

   -remote schedd_name
      Submit to the specified condor_schedd, spooling all required
      input files over the network connection. schedd_name is the value
      of the Name ClassAd attribute on the machine where the
      condor_schedd daemon runs. This option is equivalent to using
      both -name and -spool.

   -pool pool_name
      Look in the specified pool for the condor_schedd to submit to.
      This option is used with -name or -remote.

   -disable
      Disable file permission checks.

   -password passphrase
      Specify a password to the MyProxy server.

   -debug
      Cause debugging information to be sent to stderr, based on the
      value of the configuration variable TOOL_DEBUG.

   -append command
      Augment the commands in the submit description file with the
      given command. This command is considered to immediately precede
      the queue command within the submit description file, and to come
      after all other commands. The submit description file is not
      modified. Multiple commands are specified by using the -append
      option multiple times, each new command given in a separate
      -append option. Commands with spaces in them must be enclosed in
      double quote marks.

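      For example, the following invocation (the submit description
      file name and the appended command are illustrative) adds a
      requirements command without editing the file on disk:

```
condor_submit -append "requirements = Memory >= 64" job.submit
```
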
   -spool
      Spool all required input files, user log, and proxy over the
      connection to the condor_schedd. After submission, modify local
      copies of the files without affecting your jobs. Any output files
      for completed jobs need to be retrieved with
      condor_transfer_data.

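      A spooled submission might look like the following sketch, where
      job.submit and the cluster id 12 are placeholders:

```
condor_submit -spool job.submit
# ... after the jobs complete, retrieve their output:
condor_transfer_data 12
```
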
   -dump filename
      Sends all ClassAds to the specified file, instead of to the
      condor_schedd.

   submit description file
      The pathname to the submit description file. If this optional
      argument is missing or equal to ``-'', then the commands are
      taken from standard input.


SUBMIT DESCRIPTION FILE COMMANDS
Each submit description file describes one cluster of jobs to be placed
in the Condor execution pool. All jobs in a cluster must share the same
executable, but they may have different input and output files, and
different program arguments. The submit description file is the only
command-line argument to condor_submit. If the submit description file
argument is omitted, condor_submit will read the submit description
from standard input.

The submit description file must contain one executable command and at
least one queue command. All of the other commands have default
actions.

The commands which can appear in the submit description file are
numerous. They are listed here in alphabetical order by category.

BASIC COMMANDS

   arguments = <argument_list>
      List of arguments to be supplied to the program on the command
      line. In the Java Universe, the first argument must be the name
      of the class containing main.

      There are two permissible formats for specifying arguments: the
      new syntax supports uniform quoting of spaces within arguments;
      the old syntax supports spaces in arguments only in special
      circumstances.

      In the old syntax, arguments are delimited (separated) by space
      characters. Double-quotes must be escaped with a backslash (that
      is, put a backslash in front of each double-quote).

      Further interpretation of the argument string differs depending
      on the operating system. On Windows, the argument string is
      simply passed verbatim (other than the backslash in front of
      double-quotes) to the Windows application. Most Windows
      applications will allow spaces within an argument value by
      surrounding the argument with double-quotes. In all other cases,
      there is no further interpretation of the arguments.

      Example:

         arguments = one \"two\" 'three'

      Produces in Unix vanilla universe:

         argument 1: one
         argument 2: "two"
         argument 3: 'three'

      Here are the rules for using the new syntax:

      1. Put double quotes around the entire argument string. This
         distinguishes the new syntax from the old, because these
         double-quotes are not escaped with backslashes, as required in
         the old syntax. Any literal double-quotes within the string
         must be escaped by repeating them.

      2. Use white space (e.g. spaces or tabs) to separate arguments.

      3. To put any white space in an argument, surround the space and
         as much of the surrounding argument as you like with
         single-quotes.

      4. To insert a literal single-quote, repeat it anywhere inside of
         a single-quoted section.

      Example:

         arguments = "one ""two"" 'spacey ''quoted'' argument'"

      Produces:

         argument 1: one
         argument 2: "two"
         argument 3: spacey 'quoted' argument

      Notice that in the new syntax, backslash has no special meaning.
      This is for the convenience of Windows users.

   environment = <parameter_list>
      List of environment variables.

      There are two different formats for specifying the environment
      variables: the old format and the new format. The old format is
      retained for backward compatibility. It suffers from a
      platform-dependent syntax and the inability to insert some
      special characters into the environment.

      The new syntax for specifying environment values:

      1. Put double quote marks around the entire argument string. This
         distinguishes the new syntax from the old. The old syntax does
         not have double quote marks around it. Any literal double
         quote marks within the string must be escaped by repeating the
         double quote mark.

      2. Each environment entry has the form

            <name>=<value>

      3. Use white space (space or tab characters) to separate
         environment entries.

      4. To put any white space in an environment entry, surround the
         space and as much of the surrounding entry as desired with
         single quote marks.

      5. To insert a literal single quote mark, repeat the single quote
         mark anywhere inside of a section surrounded by single quote
         marks.

      Example:

         environment = "one=1 two=""2"" three='spacey ''quoted'' value'"

      Produces the following environment entries:

         one=1
         two="2"
         three=spacey 'quoted' value

      Under the old syntax, there are no double quote marks surrounding
      the environment specification. Each environment entry remains of
      the form

         <name>=<value>

      Under Unix, list multiple environment entries by separating them
      with a semicolon (;). Under Windows, separate multiple entries
      with a vertical bar (|). There is no way to insert a literal
      semicolon under Unix or a literal vertical bar under Windows.
      Note that spaces are accepted, but rarely desired, characters
      within parameter names and values, because they are treated as
      literal characters, not separators or ignored white space. Place
      spaces within the parameter list only if required.

      A Unix example:

         environment = one=1;two=2;three="quotes have no 'special' meaning"

      This produces the following:

         one=1
         two=2
         three="quotes have no 'special' meaning"

      If the environment is set with the environment command and getenv
      is also set to true, values specified with environment override
      values in the submitter's environment (regardless of the order of
      the environment and getenv commands).

   error = <pathname>
      A path and file name used by Condor to capture any error messages
      the program would normally write to the screen (that is, this
      file becomes stderr). If not specified, the default value of
      /dev/null is used for submission to a Unix machine. If not
      specified, error messages are ignored for submission to a Windows
      machine. More than one job should not use the same error file,
      since this will cause one job to overwrite the errors of another.
      The error file and the output file should not be the same file,
      as the outputs will overwrite each other or be lost. For grid
      universe jobs, error may be a URL that the Globus tool
      globus_url_copy understands.

   executable = <pathname>
      An optional path and a required file name of the executable file
      for this job cluster. Only one executable command within a submit
      description file is guaranteed to work properly. More than one
      often works.

      If no path or a relative path is used, then the executable file
      is presumed to be relative to the current working directory of
      the user as the condor_submit command is issued.

      If submitting into the standard universe, then the named
      executable must have been re-linked with the Condor libraries
      (such as via the condor_compile command). If submitting into the
      vanilla universe (the default), then the named executable need
      not be re-linked and can be any process which can run in the
      background (shell scripts work fine as well). If submitting into
      the Java universe, then the argument must be a compiled .class
      file.

   getenv = <True | False>
      If getenv is set to True, then condor_submit will copy all of the
      user's current shell environment variables at the time of job
      submission into the job ClassAd. The job will therefore execute
      with the same set of environment variables that the user had at
      submit time. Defaults to False.

      If the environment is set with the environment command and getenv
      is also set to true, values specified with environment override
      values in the submitter's environment (regardless of the order of
      the environment and getenv commands).

   input = <pathname>
      Condor assumes that its jobs are long-running, and that the user
      will not wait at the terminal for their completion. Because of
      this, the standard files which normally access the terminal
      (stdin, stdout, and stderr) must refer to files. Thus, the file
      name specified with input should contain any keyboard input the
      program requires (that is, this file becomes stdin). If not
      specified, the default value of /dev/null is used for submission
      to a Unix machine. If not specified, input is ignored for
      submission to a Windows machine. For grid universe jobs, input
      may be a URL that the Globus tool globus_url_copy understands.

      Note that this command does not refer to the command-line
      arguments of the program. The command-line arguments are
      specified by the arguments command.

   log = <pathname>
      Use log to specify a file name where Condor will write a log file
      of what is happening with this job cluster. For example, Condor
      will place a log entry into this file when and where the job
      begins running, when the job produces a checkpoint or moves
      (migrates) to another machine, and when the job completes. Most
      users find specifying a log file to be handy; its use is
      recommended. If no log entry is specified, Condor does not create
      a log for this cluster.

   log_xml = <True | False>
      If log_xml is True, then the log file will be written in ClassAd
      XML. If not specified, XML is not used. Note that the file is an
      XML fragment; it is missing the file header and footer. Do not
      mix XML and non-XML within a single file. If multiple jobs write
      to a single log file, ensure that all of the jobs specify this
      option in the same way.

   notification = <Always | Complete | Error | Never>
      Owners of Condor jobs are notified by e-mail when certain events
      occur. If defined by Always, the owner will be notified whenever
      the job produces a checkpoint, as well as when the job completes.
      If defined by Complete (the default), the owner will be notified
      when the job terminates. If defined by Error, the owner will only
      be notified if the job terminates abnormally. If defined by
      Never, the owner will not receive e-mail, regardless of what
      happens to the job. The statistics included in the e-mail are
      documented in the Condor manual.

   notify_user = <email-address>
      Used to specify the e-mail address to use when Condor sends
      e-mail about a job. If not specified, Condor defaults to using
      the e-mail address defined by

         job-owner@UID_DOMAIN

      where the configuration variable UID_DOMAIN is specified by the
      Condor site administrator. If UID_DOMAIN has not been specified,
      Condor sends the e-mail to:

         job-owner@submit-machine-name

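      For example, to receive e-mail only on abnormal termination, sent
      to an explicitly chosen address (the address is a placeholder):

```
notification = Error
notify_user  = jdoe@example.edu
```
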
   output = <pathname>
      The output file captures any information the program would
      ordinarily write to the screen (that is, this file becomes
      stdout). If not specified, the default value of /dev/null is used
      for submission to a Unix machine. If not specified, output is
      ignored for submission to a Windows machine. Multiple jobs should
      not use the same output file, since this will cause one job to
      overwrite the output of another. The output file and the error
      file should not be the same file, as the outputs will overwrite
      each other or be lost. For grid universe jobs, output may be a
      URL that the Globus tool globus_url_copy understands.

      Note that if a program explicitly opens and writes to a file,
      that file should not be specified as the output file.

   priority = <integer>
      A Condor job priority can be any integer, with 0 being the
      default. Jobs with higher numerical priority will run before jobs
      with lower numerical priority. Note that this priority is on a
      per-user basis. One user with many jobs may use this command to
      order his or her own jobs, and this will have no effect on
      whether or not these jobs will run ahead of another user's jobs.

   queue [ number-of-procs ]
      Places one or more copies of the job into the Condor queue. The
      optional argument number-of-procs specifies how many times to
      submit the job to the queue, and it defaults to 1. If desired,
      any commands may be placed between subsequent queue commands,
      such as new input, output, error, initialdir, or arguments
      commands. This is handy when submitting multiple runs into one
      cluster with one submit description file.

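      As a sketch (file names are placeholders), the following queues
      three jobs in a single cluster; the first two share one input
      file and the third uses another:

```
executable = analyze
input      = data.0
queue 2

input      = data.1
queue
```
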
   universe = <vanilla | standard | scheduler | local | grid | java | vm>
      Specifies which Condor Universe to use when running this job. The
      Condor Universe specifies a Condor execution environment. The
      standard Universe tells Condor that this job has been re-linked
      via condor_compile with the Condor libraries and therefore
      supports checkpointing and remote system calls. The vanilla
      Universe is the default (except where the configuration variable
      DEFAULT_UNIVERSE defines it otherwise), and is an execution
      environment for jobs which have not been linked with the Condor
      libraries. Note: use the vanilla Universe to submit shell scripts
      to Condor. The scheduler universe is for a job that should act as
      a metascheduler. The grid universe forwards the job to an
      external job management system. Further specification of the grid
      universe is done with the grid_resource command. The java
      universe is for programs written to the Java Virtual Machine. The
      vm universe facilitates the execution of a virtual machine.

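      For instance, a Java universe job might be described as follows
      (the class and file names are placeholders); per the executable
      and arguments commands above, the executable is a compiled .class
      file and the first argument is the class containing main:

```
universe   = java
executable = Sort.class
arguments  = Sort input.txt
queue
```
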
COMMANDS FOR MATCHMAKING

   rank = <ClassAd Float Expression>
      A ClassAd floating-point expression that states how to rank
      machines which have already met the requirements expression.
      Essentially, rank expresses preference. A higher numeric value
      equals better rank. Condor will give the job the machine with the
      highest rank. For example,

         requirements = Memory > 60
         rank = Memory

      asks Condor to find all available machines with more than 60
      megabytes of memory and give the job the machine with the most
      memory. See the Condor Users Manual for complete information on
      the syntax and available attributes that can be used in the
      ClassAd expression.

   requirements = <ClassAd Boolean Expression>
      The requirements command is a boolean ClassAd expression which
      uses C-like operators. In order for any job in this cluster to
      run on a given machine, this requirements expression must
      evaluate to true on the given machine. For example, to require
      that whatever machine executes a Condor job has at least 64
      megabytes of RAM and a MIPS performance rating greater than 45,
      use:

         requirements = Memory >= 64 && Mips > 45

      For scheduler and local universe jobs, the requirements
      expression is evaluated against the Scheduler ClassAd which
      represents the condor_schedd daemon running on the submit
      machine, rather than a remote machine. Like all commands in the
      submit description file, if multiple requirements commands are
      present, all but the last one are ignored. By default,
      condor_submit appends the following clauses to the requirements
      expression:

      1. Arch and OpSys are set equal to the Arch and OpSys of the
         submit machine. In other words: unless you request otherwise,
         Condor will give your job machines with the same architecture
         and operating system version as the machine running
         condor_submit.

      2. Disk >= DiskUsage. The DiskUsage attribute is initialized to
         the size of the executable plus the size of any files
         specified in a transfer_input_files command. It exists to
         ensure there is enough disk space on the target machine for
         Condor to copy over both the executable and needed input
         files. The DiskUsage attribute represents the maximum amount
         of total disk space required by the job in kilobytes. Condor
         automatically updates the DiskUsage attribute approximately
         every 20 minutes while the job runs with the amount of space
         being used by the job on the execute machine.

      3. (Memory * 1024) >= ImageSize. To ensure the target machine has
         enough memory to run your job.

      4. If Universe is set to Vanilla, FileSystemDomain is set equal
         to the submit machine's FileSystemDomain.

      View the requirements of a job which has already been submitted
      (along with everything else about the job ClassAd) with the
      command condor_q -l; see the command reference for condor_q.
      Also, see the Condor Users Manual for complete information on the
      syntax and available attributes that can be used in the ClassAd
      expression.

FILE TRANSFER COMMANDS

   should_transfer_files = <YES | NO | IF_NEEDED >
      The should_transfer_files setting is used to define if Condor
      should transfer files to and from the remote machine where the
      job runs. The file transfer mechanism is used to run jobs which
      are not in the standard universe (and therefore cannot use remote
      system calls for file access) on machines which do not have a
      shared file system with the submit machine.
      should_transfer_files equal to YES will cause Condor to always
      transfer files for the job. NO disables Condor's file transfer
      mechanism. IF_NEEDED will not transfer files for the job if it is
      matched with a resource in the same FileSystemDomain as the
      submit machine (and therefore, on a machine with the same shared
      file system). If the job is matched with a remote resource in a
      different FileSystemDomain, Condor will transfer the necessary
      files.

      If defining should_transfer_files, you must also define
      when_to_transfer_output (described below). For more information
      about this and other settings related to transferring files, see
      the Condor Users Manual.

      Note that should_transfer_files is not supported for jobs
      submitted to the grid universe.

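      A common pairing in a submit description file is:

```
should_transfer_files   = IF_NEEDED
when_to_transfer_output = ON_EXIT
```

      With this pairing, files are transferred only when the matched
      machine does not share a file system with the submit machine, and
      output comes back when the job exits on its own.
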
   stream_error = <True | False>
      If True, then stderr is streamed back to the machine from which
      the job was submitted. If False, stderr is stored locally and
      transferred back when the job completes. This command is ignored
      if the job ClassAd attribute TransferErr is False. The default
      value is True in the grid universe and False otherwise. This
      command must be used in conjunction with error; otherwise stderr
      will be sent to /dev/null on Unix machines and ignored on Windows
      machines.

   stream_input = <True | False>
      If True, then stdin is streamed from the machine on which the job
      was submitted. The default value is False. The command is only
      relevant for jobs submitted to the vanilla or java universes, and
      it is ignored by the grid universe. This command must be used in
      conjunction with input; otherwise stdin will be /dev/null on Unix
      machines and ignored on Windows machines.

   stream_output = <True | False>
      If True, then stdout is streamed back to the machine from which
      the job was submitted. If False, stdout is stored locally and
      transferred back when the job completes. This command is ignored
      if the job ClassAd attribute TransferOut is False. The default
      value is True in the grid universe and False otherwise. This
      command must be used in conjunction with output; otherwise stdout
      will be sent to /dev/null on Unix machines and ignored on Windows
      machines.

   transfer_executable = <True | False>
      This command is applicable to jobs submitted to the grid and
      vanilla universes. If transfer_executable is set to False, then
      Condor looks for the executable on the remote machine, and does
      not transfer the executable over. This is useful for an
      executable that has already been pre-staged; Condor behaves more
      like rsh. The default value is True.

   transfer_input_files = <file1,file2,file... >
      A comma-delimited list of all the files to be transferred into
      the working directory for the job before the job is started. By
      default, the file specified in the executable command and any
      file specified in the input command (for example, stdin) are
      transferred.

      Only the transfer of files is available; the transfer of
      subdirectories is not supported.

      For vanilla universe jobs only, a file may be specified by giving
      a URL, instead of a file name. The implementation for URL
      transfers requires both configuration and an available plug-in.

      For more information about this and other settings related to
      transferring files, see the Condor Users Manual.

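      For example (file names are placeholders):

```
transfer_input_files = data.dat,params.cfg
```

      Both files are placed in the job's working directory on the
      execute machine before the job starts, alongside the executable
      and any file given with the input command.
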
   transfer_output_files = <file1,file2,file... >
      This command forms an explicit list of output files to be
      transferred back from the temporary working directory on the
      execute machine to the submit machine. Most of the time, there is
      no need to use this command. For grid universe jobs, other than
      where the grid type is condor, the use of transfer_output_files
      is useful. For Condor-C jobs and all other non-grid universe
      jobs, if transfer_output_files is not specified, Condor will
      automatically transfer back all files in the job's temporary
      working directory which have been modified or created by the job.
      This is usually the desired behavior. Explicitly listing output
      files is typically only done when the job creates many files, and
      the user wants to keep a subset of those files. If there are
      multiple files, they must be delimited with commas. WARNING: Do
      not specify transfer_output_files in the submit description file
      unless there is a really good reason - it is best to let Condor
      figure things out by itself based upon what the job produces.

      For grid universe jobs other than with grid type condor, to have
      files other than standard output and standard error transferred
      from the execute machine back to the submit machine, do use
      transfer_output_files, listing all files to be transferred. These
      files are found on the execute machine in the working directory
      of the job.

      For more information about this and other settings related to
      transferring files, see the Condor Users Manual.

   transfer_output_remaps = < `` name = newname ; name2 = newname2 ... '' >
      This specifies the name (and optionally path) to use when
      downloading output files from the completed job. Normally, output
      files are transferred back to the initial working directory with
      the same name they had in the execution directory. This gives you
      the option to save them with a different path or name. If you
      specify a relative path, the final path will be relative to the
      job's initial working directory.

      name describes an output file name produced by your job, and
      newname describes the file name it should be downloaded to.
      Multiple remaps can be specified by separating each with a
      semicolon. If you wish to remap file names that contain equals
      signs or semicolons, these special characters may be escaped with
      a backslash.

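      An illustrative sketch, renaming one output file into a
      subdirectory of the initial working directory (names are
      placeholders):

```
transfer_output_remaps = "out.dat = results/run1.dat"
```
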
   when_to_transfer_output = <ON_EXIT | ON_EXIT_OR_EVICT >
      Setting when_to_transfer_output equal to ON_EXIT will cause
      Condor to transfer the job's output files back to the submitting
      machine only when the job completes (exits on its own).

      The ON_EXIT_OR_EVICT option is intended for fault tolerant jobs
      which periodically save their own state and can restart where
      they left off. In this case, files are spooled to the submit
      machine any time the job leaves a remote site, either because it
      exited on its own, or was evicted by the Condor system for any
      reason prior to job completion. The files spooled back are placed
      in a directory defined by the value of the SPOOL configuration
      variable. Any output files transferred back to the submit machine
      are automatically sent back out again as input files if the job
      restarts.

      For more information about this and other settings related to
      transferring files, see the Condor Users Manual.

POLICY COMMANDS

   hold = <True | False>
      If hold is set to True, then the submitted job will be placed
      into the Hold state. Jobs in the Hold state will not run until
      released by condor_release. Defaults to False.

   leave_in_queue = <ClassAd Boolean Expression>
      When the ClassAd Expression evaluates to True, the job is not
      removed from the queue upon completion. This allows the user of a
      remotely spooled job to retrieve output files in cases where
      Condor would have removed them as part of the cleanup associated
      with completion. The job will only exit the queue once it has
      been marked for removal (via condor_rm, for example) and the
      leave_in_queue expression has become False. leave_in_queue
      defaults to False.

      As an example, if the job is to be removed once the output is
      retrieved with condor_transfer_data, then use

         leave_in_queue = (JobStatus == 4) && ((StageOutFinish =?= UNDEFINED) ||
                          (StageOutFinish == 0))

   on_exit_hold = <ClassAd Boolean Expression>
      The ClassAd expression is checked when the job exits, and if
      True, places the job into the Hold state. If False (the default
      value when not defined), then nothing happens and the
      on_exit_remove expression is checked to determine if that needs
      to be applied.

      For example: suppose a job is known to run for a minimum of an
      hour. If the job exits after less than an hour, the job should be
      placed on hold and an e-mail notification sent, instead of being
      allowed to leave the queue.

         on_exit_hold = (CurrentTime - JobStartDate) < (60 * $(MINUTE))

      This expression places the job on hold if it exits for any reason
      before running for an hour. An e-mail will be sent to the user
      explaining that the job was placed on hold because this
      expression became True.

      periodic_* expressions take precedence over on_exit_*
      expressions, and *_hold expressions take precedence over *_remove
      expressions.

      Only job ClassAd attributes will be defined for use by this
      ClassAd expression. This expression is available for the vanilla,
      java, parallel, grid, local and scheduler universes. It is
      additionally available, when submitted from a Unix machine, for
      the standard universe.

922 on_exit_remove = <ClassAd Boolean Expression>
923
The ClassAd expression is checked when the job exits, and if True
(the default value when undefined), then it allows the job to leave
the queue normally. If False, then the job is placed back into the
Idle state. If the user job runs under the vanilla universe,
928 then the job restarts from the beginning. If the user job runs under
929 the standard universe, then it continues from where it left off,
930 using the last checkpoint.
931
932 For example, suppose a job occasionally segfaults, but chances are
933 that the job will finish successfully if the job is run again with
934 the same data. The on_exit_remove expression can cause the job to
935 run again with the following command. Assume that the signal identi‐
936 fier for the segmentation fault is 11 on the platform where the job
937 will be running.
938
939 on_exit_remove = (ExitBySignal == False) || (ExitSignal != 11)
940
This expression lets the job leave the queue if the job was not
killed by a signal, or if it was killed by a signal other than 11,
which represents a segmentation fault in this example. So, if the
job exited due to signal 11, it will stay in the job queue. In any
other case of the job exiting, the job will leave the queue as it
normally would.
947
948 As another example, if the job should only leave the queue if it
949 exited on its own with status 0, this on_exit_remove expression
950 works well:
951
952
953
954 on_exit_remove = (ExitBySignal == False) && (ExitCode == 0)
955
956 If the job was killed by a signal or exited with a non-zero exit
957 status, Condor would leave the job in the queue to run again.
958
periodic_* expressions take precedence over on_exit_* expressions,
and *_hold expressions take precedence over *_remove expressions.
961
962 Only job ClassAd attributes will be defined for use by this ClassAd
963 expression. This expression is available for the vanilla, java, par‐
964 allel, grid, local and scheduler universes. It is additionally
965 available, when submitted from a Unix machine, for the standard uni‐
966 verse. Note that the condor_schedd daemon, by default, only checks
967 these periodic expressions once every 300 seconds. The period of
these evaluations can be adjusted by setting the
PERIODIC_EXPR_INTERVAL configuration macro.
970
971
972
973
974
975 periodic_hold = <ClassAd Boolean Expression>
976
977 This expression is checked periodically at an interval of the number
978 of seconds set by the configuration variable PERIODIC_EXPR_INTERVAL.
979 If it becomes True, the job will be placed on hold. If unspecified,
980 the default value is False.
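
As an example, to place a running job on hold once it has run for
more than two days (a sketch; the 48-hour threshold is illustrative,
and JobStatus value 2 denotes a running job):

```
periodic_hold = (JobStatus == 2) && ((CurrentTime - JobStartDate) > (2 * 24 * 60 * 60))
```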
981
periodic_* expressions take precedence over on_exit_* expressions,
and *_hold expressions take precedence over *_remove expressions.
984
985 Only job ClassAd attributes will be defined for use by this ClassAd
986 expression. This expression is available for the vanilla, java, par‐
987 allel, grid, local and scheduler universes. It is additionally
988 available, when submitted from a Unix machine, for the standard uni‐
989 verse. Note that the condor_schedd daemon, by default, only checks
990 these periodic expressions once every 300 seconds. The period of
these evaluations can be adjusted by setting the
PERIODIC_EXPR_INTERVAL configuration macro.
993
994
995
996
997
998 periodic_release = <ClassAd Boolean Expression>
999
This expression is checked periodically, at an interval of the
number of seconds set by the configuration variable
PERIODIC_EXPR_INTERVAL, while the job is in the Hold state. If the
expression becomes True, the job will be released.
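
As an example, to release a held job after it has spent 30 minutes
in the Hold state, but only while it has been held a limited number
of times (a sketch; the threshold of 3 is illustrative):

```
periodic_release = (NumSystemHolds <= 3) && ((CurrentTime - EnteredCurrentStatus) > (30 * 60))
```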
1004
1005 Only job ClassAd attributes will be defined for use by this ClassAd
1006 expression. This expression is available for the vanilla, java, par‐
1007 allel, grid, local and scheduler universes. It is additionally
1008 available, when submitted from a Unix machine, for the standard uni‐
1009 verse. Note that the condor_schedd daemon, by default, only checks
1010 periodic expressions once every 300 seconds. The period of these
evaluations can be adjusted by setting the PERIODIC_EXPR_INTERVAL
configuration macro.
1013
1014
1015
1016
1017
1018 periodic_remove = <ClassAd Boolean Expression>
1019
1020 This expression is checked periodically at an interval of the number
1021 of seconds set by the configuration variable PERIODIC_EXPR_INTERVAL.
1022 If it becomes True, the job is removed from the queue. If unspeci‐
1023 fied, the default value is False.
1024
1025 See section , the Examples section of the condor_submit manual page,
1026 for an example of a periodic_remove expression.
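
For instance, a job that should give up after sitting idle in the
queue for more than one week could use the following sketch
(JobStatus value 1 denotes an idle job, and QDate is the time the
job was submitted):

```
periodic_remove = (JobStatus == 1) && ((CurrentTime - QDate) > (7 * 24 * 60 * 60))
```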
1027
periodic_* expressions take precedence over on_exit_* expressions,
and *_hold expressions take precedence over *_remove expressions.
So, the periodic_remove expression takes precedence over the
on_exit_remove expression, if the two describe conflicting actions.
1032
1033 Only job ClassAd attributes will be defined for use by this ClassAd
1034 expression. This expression is available for the vanilla, java, par‐
1035 allel, grid, local and scheduler universes. It is additionally
1036 available, when submitted from a Unix machine, for the standard uni‐
1037 verse. Note that the condor_schedd daemon, by default, only checks
1038 periodic expressions once every 300 seconds. The period of these
evaluations can be adjusted by setting the PERIODIC_EXPR_INTERVAL
configuration macro.
1041
1042
1043
1044
1045
1046 next_job_start_delay = <ClassAd Boolean Expression>
1047
1048 This expression specifies the number of seconds to delay after
1049 starting up this job before the next job is started. The maximum
1050 allowed delay is specified by the Condor configuration variable
1051 MAX_NEXT_JOB_START_DELAY, which defaults to 10 minutes. This command
1052 does not apply to scheduler or local universe jobs.
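
A minimal example that spaces successive job starts 30 seconds
apart:

```
next_job_start_delay = 30
```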
1053
1054 This command has been historically used to implement a form of job
1055 start throttling from the job submitter's perspective. It was effec‐
1056 tive for the case of multiple job submission where the transfer of
1057 extremely large input data sets to the execute machine caused
1058 machine performance to suffer. This command is no longer useful, as
1059 throttling should be accomplished through configuration of the con‐
1060 dor_schedd daemon.
1061
1062
1063
1064
1065
1066 COMMANDS SPECIFIC TO THE STANDARD UNIVERSE
1067
1068
1069
1070
1071
1072
1073
1074 allow_startup_script = <True | False>
1075
1076 If True, a standard universe job will execute a script instead of
1077 submitting the job, and the consistency check to see if the exe‐
1078 cutable has been linked using condor_compile is omitted. The exe‐
1079 cutable command within the submit description file specifies the
1080 name of the script. The script is used to do preprocessing before
1081 the job is submitted. The shell script ends with an exec of the job
1082 executable, such that the process id of the executable is the same
1083 as that of the shell script. Here is an example script that gets a
1084 copy of a machine-specific executable before the exec .
1085
#! /bin/sh

# get the host name of the machine
host=`uname -n`

# grab a standard universe executable designed specifically
# for this host
scp elsewhere@cs.wisc.edu:${host} executable

# The PID MUST stay the same, so exec the new standard universe
# process.
exec executable ${1+"$@"}

If this command is not present, then the value defaults to False.
1099
1100
1101
1102
1103
1104 append_files = file1, file2, ...
1105
1106
1107
1108 If your job attempts to access a file mentioned in this list, Condor
1109 will force all writes to that file to be appended to the end. Fur‐
1110 thermore, condor_submit will not truncate it. This list uses the
1111 same syntax as compress_files, shown above.
1112
1113 This option may yield some surprising results. If several jobs
1114 attempt to write to the same file, their output may be intermixed.
1115 If a job is evicted from one or more machines during the course of
1116 its lifetime, such an output file might contain several copies of
the results. This option should only be used when you wish a
certain file to be treated as a running log instead of a precise
result.
1120
1121 This option only applies to standard-universe jobs.
1122
1123
1124
1125
1126
buffer_files = < " name=(size,block-size) ; name2=(size,block-size) ... " >
1129
1130
1131
1132
1133
1134 buffer_size <bytes-in-buffer>
1135
1136
1137
1138
1139
1140 buffer_block_size <bytes-in-block>
1141
1142 Condor keeps a buffer of recently-used data for each file a job
1143 accesses. This buffer is used both to cache commonly-used data and
1144 to consolidate small reads and writes into larger operations that
1145 get better throughput. The default settings should produce reason‐
1146 able results for most programs.
1147
1148 These options only apply to standard-universe jobs.
1149
1150 If needed, you may set the buffer controls individually for each
1151 file using the buffer_files option. For example, to set the buffer
1152 size to 1 Mbyte and the block size to 256 Kbytes for the file
1153 input.data, use this command:
1154
1155
1156
1157 buffer_files = "input.data=(1000000,256000)"
1158
1159 Alternatively, you may use these two options to set the default
1160 sizes for all files used by your job:
1161
1162
1163
1164 buffer_size = 1000000
1165 buffer_block_size = 256000
1166
1167 If you do not set these, Condor will use the values given by these
1168 two configuration file macros:
1169
1170
1171
1172 DEFAULT_IO_BUFFER_SIZE = 1000000
1173 DEFAULT_IO_BUFFER_BLOCK_SIZE = 256000
1174
1175 Finally, if no other settings are present, Condor will use a buffer
1176 of 512 Kbytes and a block size of 32 Kbytes.
1177
1178
1179
1180
1181
1182 compress_files = file1, file2, ...
1183
1184
1185
1186 If your job attempts to access any of the files mentioned in this
1187 list, Condor will automatically compress them (if writing) or decom‐
1188 press them (if reading). The compress format is the same as used by
1189 GNU gzip.
1190
The files given in this list may be simple file names or complete
paths and may include * as a wildcard. For example, this list causes
the file /tmp/data.gz, any file named event.gz, and any file ending
in .gzip to be automatically compressed or decompressed as needed:
1195
1196
1197
1198 compress_files = /tmp/data.gz, event.gz, *.gzip
1199
1200 Due to the nature of the compression format, compressed files must
1201 only be accessed sequentially. Random access reading is allowed but
1202 is very slow, while random access writing is simply not possible.
1203 This restriction may be avoided by using both compress_files and
1204 fetch_files at the same time. When this is done, a file is kept in
1205 the decompressed state at the execution machine, but is compressed
1206 for transfer to its original location.
1207
1208 This option only applies to standard universe jobs.
1209
1210
1211
1212
1213
1214 fetch_files = file1, file2, ...
1215
1216 If your job attempts to access a file mentioned in this list, Condor
1217 will automatically copy the whole file to the executing machine,
1218 where it can be accessed quickly. When your job closes the file, it
1219 will be copied back to its original location. This list uses the
1220 same syntax as compress_files, shown above.
1221
1222 This option only applies to standard universe jobs.
1223
1224
1225
1226
1227
file_remaps = < " name = newname ; name2 = newname2 ... " >
1229
1230
1231
1232 Directs Condor to use a new file name in place of an old one. name
1233 describes a file name that your job may attempt to open, and newname
describes the file name it should be replaced with. newname may
include an optional leading access specifier, local: or remote:. If
left unspecified, the default access specifier is remote:. Multiple
remaps can be specified by separating each with a semicolon.
1238
1239 This option only applies to standard universe jobs.
1240
1241 If you wish to remap file names that contain equals signs or semi‐
1242 colons, these special characters may be escaped with a backslash.
1243
1244
1245
1246 Example One:
1247
Suppose that your job reads a file named dataset.1. To instruct
Condor to force your job to read other.dataset instead, add this
to the submit file:
1251
1252 file_remaps = "dataset.1=other.dataset"
1253
1254
1255
1256 Example Two:
1257
Suppose that you run many jobs which all read in the same large
file, called very.big. If this file can be found in the same
place on a local disk on every machine in the pool (say,
/bigdisk/bigfile), you can inform Condor of this fact by remapping
very.big to /bigdisk/bigfile and specifying that the file is to be
read locally, which will be much faster than reading over the
network.
1265
1266 file_remaps = "very.big = local:/bigdisk/bigfile"
1267
1268
1269
1270 Example Three:
1271
1272 Several remaps can be applied at once by separating each with a
1273 semicolon.
1274
1275 file_remaps = "very.big = local:/bigdisk/bigfile ; dataset.1 =
1276 other.dataset"
1277
1278
1279
1280
1281
1282
1283
1284 local_files = file1, file2, ...
1285
1286
1287
1288 If your job attempts to access a file mentioned in this list, Condor
1289 will cause it to be read or written at the execution machine. This
1290 is most useful for temporary files not used for input or output.
1291 This list uses the same syntax as compress_files, shown above.
1292
1293
1294
1295 local_files = /tmp/*
1296
1297 This option only applies to standard universe jobs.
1298
1299
1300
1301
1302
1303 want_remote_io = <True | False>
1304
1305 This option controls how a file is opened and manipulated in a stan‐
1306 dard universe job. If this option is true, which is the default,
1307 then the condor_shadow makes all decisions about how each and every
1308 file should be opened by the executing job. This entails a network
1309 round trip (or more) from the job to the condor_shadow and back
again for every single open(), in addition to other needed
information about the file. If set to false, then when the job
queries the condor_shadow for the first time about how to open a
file, the condor_shadow will inform the job to automatically
perform all of its
1314 file manipulation on the local file system on the execute machine
1315 and any file remapping will be ignored. This means that there must
1316 be a shared file system (such as NFS or AFS) between the execute
1317 machine and the submit machine and that ALL paths that the job could
1318 open on the execute machine must be valid. The ability of the stan‐
1319 dard universe job to checkpoint, possibly to a checkpoint server, is
1320 not affected by this attribute. However, when the job resumes it
1321 will be expecting the same file system conditions that were present
1322 when the job checkpointed.
1323
1324
1325
1326
1327
1328 COMMANDS FOR THE GRID
1329
1330
1331
1332
1333
1334
1335
1336 amazon_ami_id = <Amazon EC2 AMI ID>
1337
1338 AMI identifier of the VM image to run for amazon jobs.
1339
1340
1341
1342
1343
1344 amazon_instance_type = <VM Type>
1345
Identifier for the type of VM desired for an amazon job. The default
value is m1.small.
1348
1349
1350
1351
1352
1353 amazon_keypair_file = <pathname>
1354
1355 The complete path and filename of a file into which Condor will
1356 write an ssh key for use with amazon jobs. The key can be used to
1357 ssh into the virtual machine once it is running.
1358
1359
1360
1361
1362
1363 amazon_private_key = <pathname>
1364
1365 Used for amazon jobs. Path and filename of a file containing the
1366 private key to be used to authenticate with Amazon's EC2 service via
1367 SOAP.
1368
1369
1370
1371
1372
1373 amazon_public_key = <pathname>
1374
1375 Used for amazon jobs. Path and filename of a file containing the
1376 public X509 certificate to be used to authenticate with Amazon's EC2
1377 service via SOAP.
1378
1379
1380
1381
1382
1383 amazon_security_groups = group1, group2, ...
1384
1385 Used for amazon jobs. A list of Amazon EC2 security group names,
1386 which should be associated with the job.
1387
1388
1389
1390
1391
1392 amazon_user_data = <data>
1393
1394 Used for amazon jobs. A block of data that can be accessed by the
1395 virtual machine job inside Amazon EC2.
1396
1397
1398
1399
1400
1401 amazon_user_data_file = <pathname>
1402
1403 Used for amazon jobs. A file containing data that can be accessed by
1404 the virtual machine job inside Amazon EC2.
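
Putting the amazon commands together, a minimal grid universe
submission might be sketched as follows. This is only a sketch: the
AMI identifier and key file paths are hypothetical, and further
commands may be required for a complete submit description file.

```
universe = grid
grid_resource = amazon
amazon_public_key = /home/user/x509_cert.pem
amazon_private_key = /home/user/x509_key.pem
amazon_ami_id = ami-XXXXXXXX
amazon_instance_type = m1.small
queue
```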
1405
1406
1407
1408
1409
1410 globus_rematch = <ClassAd Boolean Expression>
1411
1412 This expression is evaluated by the condor_gridmanager whenever:
1413
1414 1. the globus_resubmit expression evaluates to True
1415
2. the condor_gridmanager decides it needs to retry a submission
   (as when a previous submission failed to commit)

If globus_rematch evaluates to True, then before the job is
submitted again to globus, the condor_gridmanager will request that
the condor_schedd daemon renegotiate with the matchmaker (the
condor_negotiator ). The result is that this job will be matched
again.
1422
1423
1424
1425
1426
1427 globus_resubmit = <ClassAd Boolean Expression>
1428
1429 The expression is evaluated by the condor_gridmanager each time the
1430 condor_gridmanager gets a job ad to manage. Therefore, the expres‐
1431 sion is evaluated:
1432
1433 1. when a grid universe job is first submitted to Condor-G
1434
1435 2. when a grid universe job is released from the hold state
1436
3. when Condor-G is restarted (specifically, whenever the
   condor_gridmanager is restarted)

If the expression evaluates to True, then any previous submission to
the grid universe will be forgotten, and this job will be submitted
again as a fresh submission to the grid universe. This may be useful
if there is a desire to give up on a previous submission and try
again. Note that this may result in the same job running more than
once. Do not treat this operation lightly.
1445
1446
1447
1448
1449
1450 globus_rsl = <RSL-string>
1451
1452 Used to provide any additional Globus RSL string attributes which
1453 are not covered by other submit description file commands or job
1454 attributes. Used for grid universe jobs, where the grid resource has
1455 a grid-type-string of gt2 .
1456
1457
1458
1459
1460
1461 globus_xml = <XML-string>
1462
1463 Used to provide any additional attributes in the GRAM XML job
1464 description that Condor writes which are not covered by regular sub‐
1465 mit description file parameters. Used for grid type gt4 jobs.
1466
1467
1468
1469
1470
1471 grid_resource = <grid-type-string><grid-specific-parameter-list>
1472
For each grid-type-string value, there are further type-specific
values that must be specified. This submit description file command
allows each to be given in a space-separated list. Allowable grid-
type-string values are amazon , condor , cream , gt2 , gt4 , gt5 ,
lsf , nordugrid , pbs , and unicore . See section for details on the
variety of grid types.
1479
1480 For a grid-type-string of amazon , no additional parameters are
1481 used. See section for details.
1482
1483 For a grid-type-string of condor , the first parameter is the name
1484 of the remote condor_schedd daemon. The second parameter is the name
1485 of the pool to which the remote condor_schedd daemon belongs. See
1486 section for details.
1487
1488 For a grid-type-string of cream , there are three parameters. The
1489 first parameter is the web services address of the CREAM server. The
1490 second parameter is the name of the batch system that sits behind
1491 the CREAM server. The third parameter identifies a site-specific
1492 queue within the batch system. See section for details.
1493
1494 For a grid-type-string of gt2 , the single parameter is the name of
1495 the pre-WS GRAM resource to be used. See section for details.
1496
1497 For a grid-type-string of gt4 , the first parameter is the name of
1498 the WS GRAM service to be used. The second parameter is the name of
1499 WS resource to be used (usually the name of the back-end scheduler).
1500 See section for details.
1501
1502 For a grid-type-string of gt5 , the single parameter is the name of
1503 the pre-WS GRAM resource to be used, which is the same as for the
1504 grid-type-string of gt2 . See section for details.
1505
1506 For a grid-type-string of lsf , no additional parameters are used.
1507 See section for details.
1508
1509 For a grid-type-string of nordugrid , the single parameter is the
1510 name of the NorduGrid resource to be used. See section for details.
1511
1512 For a grid-type-string of pbs , no additional parameters are used.
1513 See section for details.
1514
1515 For a grid-type-string of unicore , the first parameter is the name
1516 of the Unicore Usite to be used. The second parameter is the name of
1517 the Unicore Vsite to be used. See section for details.
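
Some illustrative grid_resource lines follow; the host, pool, and
resource names here are hypothetical:

```
grid_resource = gt2 grid.example.com/jobmanager-pbs
grid_resource = condor remote.schedd.example.com remote.pool.example.com
grid_resource = nordugrid ngrid.example.com
```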
1518
1519
1520
1521
1522
1523 keystore_alias = <name>
1524
1525 A string to locate the certificate in a Java keystore file, as used
1526 for a unicore job.
1527
1528
1529
1530
1531
1532 keystore_file = <pathname>
1533
1534 The complete path and file name of the Java keystore file containing
1535 the certificate to be used for a unicore job.
1536
1537
1538
1539
1540
1541 keystore_passphrase_file = <pathname>
1542
1543 The complete path and file name to the file containing the
1544 passphrase protecting a Java keystore file containing the certifi‐
1545 cate. Relevant for a unicore job.
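
The three keystore commands are normally given together for a
unicore job, as in this sketch (the paths and alias are
hypothetical):

```
keystore_file = /home/user/keystore.jks
keystore_alias = mygridcert
keystore_passphrase_file = /home/user/keystore.pass
```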
1546
1547
1548
1549
1550
1551 MyProxyCredentialName = <symbolic name>
1552
1553 The symbolic name that identifies a credential to the MyProxy
1554 server. This symbolic name is set as the credential is initially
1555 stored on the server (using myproxy-init ).
1556
1557
1558
1559
1560
1561 MyProxyHost = <host>:<port>
1562
1563 The Internet address of the host that is the MyProxy server. The
host may be specified by either a host name (as in head.example.com)
or an IP address (of the form 123.45.6.7). The port number is an
integer.
1567
1568
1569
1570
1571
1572 MyProxyNewProxyLifetime = <number-of-minutes>
1573
1574 The new lifetime (in minutes) of the proxy after it is refreshed.
1575
1576
1577
1578
1579
1580 MyProxyPassword = <password>
1581
1582 The password needed to refresh a credential on the MyProxy server.
1583 This password is set when the user initially stores credentials on
1584 the server (using myproxy-init ). As an alternative to using MyProx‐
1585 yPassword in the submit description file, the password may be speci‐
1586 fied as a command line argument to condor_submit with the -password
1587 argument.
1588
1589
1590
1591
1592
1593 MyProxyRefreshThreshold = <number-of-seconds>
1594
1595 The time (in seconds) before the expiration of a proxy that the
1596 proxy should be refreshed. For example, if MyProxyRefreshThreshold
1597 is set to the value 600, the proxy will be refreshed 10 minutes
1598 before it expires.
1599
1600
1601
1602
1603
1604 MyProxyServerDN = <credential subject>
1605
1606 A string that specifies the expected Distinguished Name (credential
1607 subject, abbreviated DN) of the MyProxy server. It must be specified
1608 when the MyProxy server DN does not follow the conventional naming
1609 scheme of a host credential. This occurs, for example, when the
1610 MyProxy server DN begins with a user credential.
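
The MyProxy-related commands are typically used together. A sketch,
with a hypothetical server address and credential name:

```
MyProxyHost = myproxy.example.com:7512
MyProxyCredentialName = myjobproxy
MyProxyRefreshThreshold = 600
MyProxyNewProxyLifetime = 1440
```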
1611
1612
1613
1614
1615
1616 nordugrid_rsl = <RSL-string>
1617
1618 Used to provide any additional RSL string attributes which are not
1619 covered by regular submit description file parameters. Used when the
1620 universe is grid , and the type of grid system is nordugrid .
1621
1622
1623
1624
1625
1626 transfer_error = <True | False>
1627
1628 For jobs submitted to the grid universe only. If True, then the
1629 error output (from stderr) from the job is transferred from the
1630 remote machine back to the submit machine. The name of the file
1631 after transfer is given by the error command. If False, no transfer
1632 takes place (from the remote machine to submit machine), and the
1633 name of the file is given by the error command. The default value is
1634 True.
1635
1636
1637
1638
1639
1640 transfer_input = <True | False>
1641
1642 For jobs submitted to the grid universe only. If True, then the job
1643 input (stdin) is transferred from the machine where the job was sub‐
1644 mitted to the remote machine. The name of the file that is trans‐
1645 ferred is given by the input command. If False, then the job's input
1646 is taken from a pre-staged file on the remote machine, and the name
1647 of the file is given by the input command. The default value is
1648 True.
1649
1650 For transferring files other than stdin, see transfer_input_files .
1651
1652
1653
1654
1655
1656 transfer_output = <True | False>
1657
1658 For jobs submitted to the grid universe only. If True, then the out‐
1659 put (from stdout) from the job is transferred from the remote
1660 machine back to the submit machine. The name of the file after
1661 transfer is given by the output command. If False, no transfer takes
1662 place (from the remote machine to submit machine), and the name of
1663 the file is given by the output command. The default value is True.
1664
1665 For transferring files other than stdout, see transfer_output_files
1666 .
1667
1668
1669
1670
1671
1672 x509userproxy = <full-pathname>
1673
Used to override the default path name for X.509 user certificates.
The default location for X.509 proxies is the /tmp directory, which
is generally a local file system. Setting this value would allow
1677 Condor to access the proxy in a shared file system (for example,
1678 AFS). Condor will use the proxy specified in the submit description
file first. If nothing is specified in the submit description file,
it will use the environment variable X509_USER_PROXY. If that
variable is not present, it will search in the default location.
1682
1683 x509userproxy is relevant when the universe is vanilla , or when the
1684 universe is grid and the type of grid system is one of gt2 , gt4 ,
1685 or nordugrid . Defining a value causes the proxy to be delegated to
1686 the execute machine. Further, VOMS attributes defined in the proxy
1687 will appear in the job ClassAd. See the unnumbered subsection
1688 labeled Job ClassAd Attributes on page for all job attribute
1689 descriptions.
1690
1691
1692
1693
1694
1695 COMMANDS FOR PARALLEL, JAVA, and SCHEDULER UNIVERSES
1696
1697
1698
1699
1700
1701
1702
1703 hold_kill_sig = <signal-number>
1704
1705 For the scheduler universe only, signal-number is the signal deliv‐
1706 ered to the job when the job is put on hold with condor_hold . sig‐
1707 nal-number may be either the platform-specific name or value of the
1708 signal. If this command is not present, the value of kill_sig is
1709 used.
1710
1711
1712
1713
1714
1715 jar_files = <file_list>
1716
1717 Specifies a list of additional JAR files to include when using the
1718 Java universe. JAR files will be transferred along with the exe‐
1719 cutable and automatically added to the classpath.
1720
1721
1722
1723
1724
1725 java_vm_args = <argument_list>
1726
Specifies a list of additional arguments to the Java VM itself. When
Condor runs the Java program, these are the arguments that go before
the class name. This can be used to set VM-specific arguments like
stack size, garbage-collector arguments, and initial property
values.
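
For example, a java universe job might set a larger maximum heap
and supply helper JAR files; the class, file, and property names
below are hypothetical:

```
universe = java
executable = MyProgram.class
arguments = MyProgram input.txt
jar_files = helper.jar, parser.jar
java_vm_args = -Xmx512m -DMyProperty=value
queue
```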
1731
1732
1733
1734
1735
1736 machine_count = <max>
1737
For the parallel universe, a single value ( max ) is required. It is
neither a maximum nor a minimum; it is the number of machines to be
dedicated to running the job.
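
For instance, to dedicate eight machines to a parallel universe
job:

```
universe = parallel
machine_count = 8
```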
1741
1742
1743
1744
1745
1746 remove_kill_sig = <signal-number>
1747
1748 For the scheduler universe only, signal-number is the signal deliv‐
1749 ered to the job when the job is removed with condor_rm . signal-
1750 number may be either the platform-specific name or value of the sig‐
1751 nal. This example shows it both ways for a Linux signal:
1752
1753 remove_kill_sig = SIGUSR1
1754 remove_kill_sig = 10
1755
1756 If this command is not present, the value of kill_sig is used.
1757
1758
1759
1760
1761
1762 COMMANDS FOR THE VM UNIVERSE
1763
1764
1765
1766
1767
1768
1769
1770 vm_cdrom_files = file1, file2, ...
1771
1772 A comma-separated list of input CD-ROM files.
1773
1774
1775
1776
1777
1778 vm_checkpoint = <True | False>
1779
A boolean value specifying whether or not to take checkpoints. If
not specified, the default value is False. In the current
implementation, setting both vm_checkpoint and vm_networking to True
does not yet work in all cases. Networking cannot be used if a vm
universe job uses a checkpoint in order to continue execution after
migration to another machine.
1786
1787
1788
1789
1790
1791 vm_macaddr = <MACAddr>
1792
Defines the MAC address that the virtual machine's network
interface should have, in the standard format of six groups of two
hexadecimal digits separated by colons.
1796
1797
1798
1799
1800
1801 vm_memory = <MBytes-of-memory>
1802
1803 The amount of memory in MBytes that a vm universe job requires.
1804
1805
1806
1807
1808
1809 vm_networking = <True | False>
1810
Specifies whether to use networking or not. In the current
implementation, setting both vm_checkpoint and vm_networking to True
does not yet work in all cases. Networking cannot be used if a vm
universe job uses a checkpoint in order to continue execution after
migration to another machine.
1816
1817
1818
1819
1820
1821 vm_networking_type = <nat | bridge >
1822
1823 When vm_networking is True, this definition augments the job's
1824 requirements to match only machines with the specified networking.
1825 If not specified, then either networking type matches.
1826
1827
1828
1829
1830
1831 vm_no_output_vm = <True | False>
1832
1833 When True, prevents Condor from transferring output files back to
1834 the machine from which the vm universe job was submitted. If not
1835 specified, the default value is False.
1836
1837
1838
1839
1840
1841 vm_should_transfer_cdrom_files = <True | False>
1842
1843 Specifies whether Condor will transfer CD-ROM files to the execute
1844 machine (True) or rely on access through a shared file system
1845 (False).
1846
1847
1848
1849
1850
1851 vm_type = <vmware | xen | kvm>
1852
1853 Specifies the underlying virtual machine software that this job
1854 expects.
1855
1856
1857
1858
1859
1860 vmware_dir = <pathname>
1861
1862 The complete path and name of the directory where VMware-specific
1863 files and applications such as the VMDK (Virtual Machine Disk For‐
1864 mat) and VMX (Virtual Machine Configuration) reside.
1865
1866
1867
1868
1869
1870 vmware_should_transfer_files = <True | False>
1871
1872 Specifies whether Condor will transfer VMware-specific files located
1873 as specified by vmware_dir to the execute machine (True) or rely on
1874 access through a shared file system (False). Omission of this
1875 required command (for VMware vm universe jobs) results in an error
1876 message from condor_submit , and the job will not be submitted.
1877
1878
1879
1880
1881
1882 vmware_snapshot_disk = <True | False>
1883
1884 When True, causes Condor to utilize a VMware snapshot disk for new
1885 or modified files. If not specified, the default value is True.
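
A VMware vm universe submission using these commands might be
sketched as follows; the memory size and directory are
hypothetical:

```
universe = vm
vm_type = vmware
vm_memory = 512
vmware_dir = /vms/myvm
vmware_should_transfer_files = True
queue
```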
1886
1887
1888
1889
1890
   xen_cdrom_device = <device>

      Describes the Xen CD-ROM device when vm_cdrom_files is defined.

   xen_disk = file1:device1:permission1, file2:device2:permission2, ...

      A comma-separated list of disk files. Each disk file is specified
      by 3 colon-separated fields. The first field is the path and file
      name of the disk file. The second field specifies the device, and
      the third field specifies permissions.

      An example that specifies two disk files:

         xen_disk = /myxen/diskfile.img:sda1:w,/myxen/swap.img:sda2:w

   xen_initrd = <image-file>

      When xen_kernel gives a path and file name for the kernel image to
      use, this optional command may specify the path to an initial
      ramdisk (initrd) image file.

   xen_kernel = <included | path-to-kernel>

      A value of included specifies that the kernel is included in the
      disk file. Any other value is taken as the path and file name of
      the kernel to be used.

   xen_kernel_params = <string>

      A string that is appended to the Xen kernel command line.

   xen_root = <string>

      A string that is appended to the Xen kernel command line to
      specify the root device. This string is required when xen_kernel
      gives a path to a kernel. Omission for this required case results
      in an error message during submission.

   xen_transfer_files = <list-of-files>

      A comma-separated list of all files that Condor is to transfer to
      the execute machine.

   kvm_disk = file1:device1:permission1, file2:device2:permission2, ...

      A comma-separated list of disk files. Each disk file is specified
      by 3 colon-separated fields. The first field is the path and file
      name of the disk file. The second field specifies the device, and
      the third field specifies permissions.

      An example that specifies two disk files:

         kvm_disk = /myxen/diskfile.img:sda1:w,/myxen/swap.img:sda2:w

   kvm_cdrom_device = <device>

      Describes the KVM CD-ROM device when vm_cdrom_files is defined.

   kvm_transfer_files = <list-of-files>

      A comma-separated list of all files that Condor is to transfer to
      the execute machine.

ADVANCED COMMANDS

   concurrency_limits = <string-list>

      A list of resources that this job needs. The resources are
      presumed to have concurrency limits placed upon them, thereby
      limiting the number of concurrently executing jobs that need the
      named resource. Commas and space characters delimit the items in
      the list. Each item in the list may specify a numerical value
      identifying the integer number of resources required for the job,
      given by following the resource name with a colon character (:)
      and the numerical value. See section for details on concurrency
      limits.

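      As an illustration (the resource names here are hypothetical; the
      limits themselves are defined by pool configuration, not by the
      job), a job needing three units of one resource and one of another
      might state:

         concurrency_limits = sw_license:3, db_connection
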
   copy_to_spool = <True | False>

      If copy_to_spool is True, then condor_submit copies the executable
      to the local spool directory before running it on a remote host.
      As copying can be quite time consuming and unnecessary, the
      default value is False for all job universes other than the
      standard universe. When False, condor_submit does not copy the
      executable to a local spool directory. The default is True in the
      standard universe, because resuming execution from a checkpoint
      can only be guaranteed to work using precisely the same executable
      that created the checkpoint.

   coresize = <size>

      Should the user's program abort and produce a core file, coresize
      specifies the maximum size in bytes of the core file that the user
      wishes to keep. If coresize is not specified in the command file,
      the system's user resource limit ``coredumpsize'' is used. This
      limit is not used in HP-UX operating systems.

   cron_day_of_month = <Cron-evaluated Day>

      The set of days of the month for which a deferral time applies.
      See section for further details and examples.

   cron_day_of_week = <Cron-evaluated Day>

      The set of days of the week for which a deferral time applies. See
      section for details, semantics, and examples.

   cron_hour = <Cron-evaluated Hour>

      The set of hours of the day for which a deferral time applies. See
      section for details, semantics, and examples.

   cron_minute = <Cron-evaluated Minute>

      The set of minutes within an hour for which a deferral time
      applies. See section for details, semantics, and examples.

   cron_month = <Cron-evaluated Month>

      The set of months within a year for which a deferral time applies.
      See section for details, semantics, and examples.

   cron_prep_time = <ClassAd Integer Expression>

      Analogous to deferral_prep_time. The number of seconds prior to a
      job's deferral time that the job may be matched and sent to an
      execution machine.

   cron_window = <ClassAd Integer Expression>

      Analogous to the submit command deferral_window. It allows cron
      jobs that miss their deferral time to begin execution.

      See section for further details and examples.

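      As a sketch of how these commands combine, the following entries
      defer a job to begin at 2:30 each afternoon, allowing it to start
      up to 300 seconds late:

         cron_hour   = 14
         cron_minute = 30
         cron_window = 300
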
   deferral_prep_time = <ClassAd Integer Expression>

      The number of seconds prior to a job's deferral time that the job
      may be matched and sent to an execution machine.

      See section for further details.

   deferral_time = <ClassAd Integer Expression>

      Allows a job to specify the time at which its execution is to
      begin, instead of beginning execution as soon as it arrives at the
      execution machine. The deferral time is an expression that
      evaluates to a Unix Epoch timestamp (the number of seconds elapsed
      since 00:00:00 on January 1, 1970, Coordinated Universal Time).
      Deferral time is evaluated with respect to the execution machine.
      This option delays the start of execution, but not the matching
      and claiming of a machine for the job. If the job is not available
      and ready to begin execution at the deferral time, it has missed
      its deferral time. A job that misses its deferral time will be put
      on hold in the queue.

      See section for further details and examples.

      Due to implementation details, a deferral time may not be used for
      scheduler universe jobs.

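      For example (the timestamp below is arbitrary), to defer execution
      until a fixed moment given as seconds since the Unix Epoch:

         deferral_time = 1254441600
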
   deferral_window = <ClassAd Integer Expression>

      The deferral window is used in conjunction with the deferral_time
      command to allow jobs that miss their deferral time to begin
      execution.

      See section for further details and examples.

   email_attributes = <list-of-job-ad-attributes>

      A comma-separated list of attributes from the job ClassAd. These
      attributes and their values will be included in the e-mail
      notification of job completion.

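      For example, to have the completion e-mail report the job's exit
      code and accumulated wall clock time (assuming these standard job
      ClassAd attribute names):

         email_attributes = ExitCode, RemoteWallClockTime
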
   image_size = <size>

      This command tells Condor the maximum virtual image size to which
      you believe your program will grow during its execution. Condor
      will then execute your job only on machines which have enough
      resources (such as virtual memory) to support executing your job.
      If you do not specify the image size of your job in the
      description file, Condor will automatically make a (reasonably
      accurate) estimate about its size and adjust this estimate as your
      program runs. If the image size of your job is underestimated, it
      may crash due to the inability to acquire more address space,
      e.g. malloc() fails. If the image size is overestimated, Condor
      may have difficulty finding machines which have the required
      resources. size must be in Kbytes, e.g. for an image size of 8
      megabytes, use a size of 8000.

   initialdir = <directory-path>

      Used to give jobs a directory with respect to file input and
      output. Also provides a directory (on the machine from which the
      job is submitted) for the user log, when a full path is not
      specified.

      For vanilla universe jobs where there is a shared file system, it
      is the current working directory on the machine where the job is
      executed.

      For vanilla or grid universe jobs where file transfer mechanisms
      are utilized (there is not a shared file system), it is the
      directory on the machine from which the job is submitted where the
      input files come from, and where the job's output files go to.

      For standard universe jobs, it is the directory on the machine
      from which the job is submitted where the condor_shadow daemon
      runs; the current working directory for file input and output
      accomplished through remote system calls.

      For scheduler universe jobs, it is the directory on the machine
      from which the job is submitted where the job runs; the current
      working directory for file input and output with respect to
      relative path names.

      Note that the path to the executable is not relative to
      initialdir ; if it is a relative path, it is relative to the
      directory in which the condor_submit command is run.

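      As a sketch (the directory layout is hypothetical, and the
      subdirectories must already exist), the $(Process) macro described
      below can give each job in a cluster its own working directory:

         initialdir = run_$(Process)
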
   job_lease_duration = <number-of-seconds>

      For vanilla and java universe jobs only, the duration (in seconds)
      of a job lease. The default value is twenty minutes for universes
      that support it. If a job lease is not desired, the value can be
      explicitly set to 0 to disable the job lease semantics. See
      section for details of job leases.

   kill_sig = <signal-number>

      When Condor needs to kick a job off of a machine, it will send the
      job the signal specified by signal-number. signal-number must be
      an integer which represents a valid signal on the execution
      machine. For jobs submitted to the standard universe, the default
      value is the number for SIGTSTP, which tells the Condor libraries
      to initiate a checkpoint of the process. For jobs submitted to the
      vanilla universe, the default is SIGTERM, which is the standard
      way to terminate a program in Unix.

   load_profile = <True | False>

      When True, loads the account profile of the dedicated run account
      for Windows jobs. May not be used with run_as_owner.

   match_list_length = <integer value>

      Defaults to the value zero (0). When match_list_length is defined
      with an integer value greater than zero (0), attributes are
      inserted into the job ClassAd. The maximum number of attributes
      defined is given by the integer value. The job ClassAd attributes
      introduced are given as

         LastMatchName0 = "most-recent-Name"
         LastMatchName1 = "next-most-recent-Name"

      The value for each introduced attribute is given by the value of
      the Name attribute from the machine ClassAd of a previous
      execution (match). As a job is matched, the definitions for these
      attributes will roll, with LastMatchName1 becoming LastMatchName2,
      LastMatchName0 becoming LastMatchName1, and LastMatchName0 being
      set by the most recent value of the Name attribute.

      An intended use of these job attributes is in the requirements
      expression. The requirements can allow a job to prefer a match
      with either the same or a different resource than a previous
      match.

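      As a sketch of that use, the following refuses to match the
      machine most recently matched (the ClassAd =!= operator tests for
      "is not identical to", so an as-yet-undefined LastMatchName0 still
      evaluates to True):

         match_list_length = 5
         requirements = (TARGET.Name =!= LastMatchName0)
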
   max_job_retirement_time = <integer expression>

      An integer-valued expression (in seconds) that does nothing unless
      the machine that runs the job has been configured to provide
      retirement time (see section ). Retirement time is a grace period
      given to a job to finish naturally when a resource claim is about
      to be preempted. No kill signals are sent during a retirement
      time. The default behavior in many cases is to take as much
      retirement time as the machine offers, so this command will rarely
      appear in a submit description file.

      When a resource claim is to be preempted, this expression in the
      submit file specifies the maximum run time of the job (in seconds,
      since the job started). This expression has no effect if it is
      greater than the maximum retirement time provided by the machine
      policy. If the resource claim is not preempted, this expression
      and the machine retirement policy are irrelevant. If the resource
      claim is preempted and the job finishes sooner than the maximum
      time, the claim closes gracefully and all is well. If the resource
      claim is preempted and the job does not finish in time, the usual
      preemption procedure is followed (typically a soft kill signal,
      followed by some time to gracefully shut down, followed by a hard
      kill signal).

      Standard universe jobs and any jobs running with nice_user
      priority have a default max_job_retirement_time of 0, so no
      retirement time is utilized by default. In all other cases, no
      default value is provided, so the maximum amount of retirement
      time is utilized by default.

      Setting this expression does not affect the job's resource
      requirements or preferences. For a job to run only on machines
      with a minimum amount of retirement time, or to preferentially run
      on such machines, explicitly specify this in the requirements
      and/or rank expressions.

   nice_user = <True | False>

      Normally, when a machine becomes available to Condor, Condor
      decides which job to run based upon user and job priorities.
      Setting nice_user equal to True tells Condor not to use your
      regular user priority, but that this job should have last priority
      among all users and all jobs. So jobs submitted in this fashion
      run only on machines which no other non-nice_user job wants -- a
      true ``bottom-feeder'' job! This is very handy if a user has some
      jobs they wish to run, but do not wish to use resources that could
      instead be used to run other people's Condor jobs. Jobs submitted
      in this fashion have ``nice-user.'' prepended to the owner name
      when viewed from condor_q or condor_userprio. The default value is
      False.

   noop_job = <ClassAd Boolean Expression>

      When this boolean expression is True, the job is immediately
      removed from the queue, and Condor makes no attempt at running the
      job. The log file for the job will show a job submitted event and
      a job terminated event, along with an exit code of 0, unless the
      user specifies a different signal or exit code.

   noop_job_exit_code = <return value>

      When noop_job is in the submit description file and evaluates to
      True, this command allows the job to specify the return value as
      shown in the job's log file job terminated event. If not
      specified, the job will show as having terminated with status 0.
      This overrides any value specified with noop_job_exit_signal.

   noop_job_exit_signal = <signal number>

      When noop_job is in the submit description file and evaluates to
      True, this command specifies the signal number that the job's log
      event will report as having terminated the job.

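      For example, to remove a job immediately and have its log record
      termination with a return value of 2 (an arbitrary value, perhaps
      to exercise a larger workflow without running anything):

         noop_job = True
         noop_job_exit_code = 2
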
   remote_initialdir = <directory-path>

      The path specifies the directory in which the job is to be
      executed on the remote machine. This is currently supported in all
      universes except for the standard universe.

   rendezvousdir = <directory-path>

      Used to specify the shared file system directory to be used for
      file system authentication when submitting to a remote scheduler.
      Should be a path to a preexisting directory.

   request_cpus = <num-cpus>

      For pools that enable dynamic condor_startd provisioning (see
      section ), the number of CPUs requested for this job. If not
      specified, the number requested under dynamic condor_startd
      provisioning will be 1.

   request_disk = <quantity>

      For pools that enable dynamic condor_startd provisioning (see
      section ), the amount of disk space in Kbytes requested for this
      job, setting an initial value for the job ClassAd attribute
      DiskUsage. If not specified, the initial amount requested under
      dynamic condor_startd provisioning will depend on the job
      universe. For vm universe jobs, it will be the size of the disk
      image. For other universes, it will be the sum of sizes of the
      job's executable and all input files.

   request_memory = <quantity>

      For pools that enable dynamic condor_startd provisioning (see
      section ), the amount of memory space in Mbytes requested for this
      job, setting an initial value for the job ClassAd attribute
      ImageSize. If not specified, the initial amount requested under
      dynamic condor_startd provisioning will depend on the job
      universe. For vm universe jobs that do not specify the request
      with vm_memory , it will be 0. For other universes, it will be the
      size of the job's executable.

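      As a sketch, a job on a dynamically provisioned pool might request
      2 CPUs, 2 Gbytes of memory, and 1 Gbyte of disk (units as defined
      above: memory in Mbytes, disk in Kbytes):

         request_cpus = 2
         request_memory = 2048
         request_disk = 1048576
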
   run_as_owner = <True | False>

      A boolean value that causes the job to be run under the login of
      the submitter, if supported by the joint configuration of the
      submit and execute machines. On Unix platforms, this defaults to
      True, and on Windows platforms, it defaults to False. May not be
      used with load_profile. See section for administrative details on
      configuring Windows to support this option, as well as for a
      definition of STARTER_ALLOW_RUNAS_OWNER.

   +<attribute> = <value>

      A line which begins with a '+' (plus) character instructs
      condor_submit to insert the given attribute into the job ClassAd
      with the given value.

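      For example (Department is a made-up attribute name, not one
      Condor defines), the following places a custom string attribute
      into the job ClassAd, where pool policy expressions could then
      reference it:

         +Department = "chemistry"
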
   In addition to commands, the submit description file can contain
   macros and comments:

   Macros

      Parameterless macros in the form of $(macro_name) may be inserted
      anywhere in Condor submit description files. Macros can be defined
      by lines in the form of

         <macro_name> = <string>

      Three pre-defined macros are supplied by the submit description
      file parser. The third of the pre-defined macros is only relevant
      to MPI applications under the parallel universe. The $(Cluster)
      macro supplies the value of the ClusterId job ClassAd attribute,
      and the $(Process) macro supplies the value of the ProcId job
      ClassAd attribute. These macros are intended to aid in the
      specification of input/output files, arguments, etc., for clusters
      with lots of jobs, and/or could be used to supply a Condor process
      with its own cluster and process numbers on the command line. The
      $(Node) macro is defined for MPI applications run as parallel
      universe jobs. It is a unique value assigned for the duration of
      the job that essentially identifies the machine on which a program
      is executing.

      To use the dollar sign character ($) as a literal, without macro
      expansion, use

         $(DOLLAR)

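      For example, these commands give each job in a cluster its own
      arguments and output file, avoiding collisions when many jobs are
      queued at once:

         arguments = $(Cluster) $(Process)
         output = out.$(Cluster).$(Process)
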
      In addition to the normal macro, there is also a special kind of
      macro called a substitution macro that allows the substitution of
      a ClassAd attribute value defined on the resource machine itself
      (obtained after a match to the machine has been made) into
      specific commands within the submit description file. The
      substitution macro is of the form:

         $$(attribute)

      A common use of this macro is for the heterogeneous submission of
      an executable:

         executable = povray.$$(opsys).$$(arch)

      Values for the opsys and arch attributes are substituted at match
      time for any given resource. This allows Condor to automatically
      choose the correct executable for the matched machine.

      An extension to the syntax of the substitution macro provides an
      alternative string to use if the machine attribute within the
      substitution macro is undefined. The syntax appears as:

         $$(attribute:string_if_attribute_undefined)

      An example using this extended syntax provides a path name to a
      required input file. Since the file can be placed in different
      locations on different machines, the file's path name is given as
      an argument to the program.

         arguments = $$(input_file_path:/usr/foo)

      On the machine, if the attribute input_file_path is not defined,
      then the path /usr/foo is used instead.

      A further extension to the syntax of the substitution macro allows
      the evaluation of a ClassAd expression to define the value. As
      with all substitution macros, the expression is evaluated after a
      match has been made. Therefore, the expression may refer to
      machine attributes by prefacing them with the scope resolution
      prefix TARGET., as specified in section . To place a ClassAd
      expression into the substitution macro, square brackets are added
      to delimit the expression. The syntax appears as:

         $$([ClassAd expression])

      An example of a job that uses this syntax may be one that wants to
      know how much memory it can use. The application cannot detect
      this itself, as it would potentially use all of the memory on a
      multi-slot machine. So the job determines the memory per slot,
      reduces it by 10% to account for miscellaneous overhead, and
      passes this as a command line argument to the application. In the
      submit description file will be

         arguments = --memory $$([TARGET.Memory * 0.9])

      To insert two dollar sign characters ($$) as literals into a
      ClassAd string, use

         $$(DOLLARDOLLAR)

      The environment macro, $ENV, allows the evaluation of an
      environment variable to be used in setting a submit description
      file command. The syntax used is

         $ENV(variable)

      An example submit description file command that uses this
      functionality evaluates the submitter's home directory in order to
      set the path and file name of a log file:

         log = $ENV(HOME)/jobs/logfile

      The environment variable is evaluated when the submit description
      file is processed.

      The $RANDOM_CHOICE macro allows a random choice to be made from a
      given list of parameters at submission time. For an expression, if
      some randomness needs to be generated, the macro may appear as

         $RANDOM_CHOICE(0,1,2,3,4,5,6)

      When evaluated, one of the parameter values will be chosen.

   Comments

      Blank lines and lines beginning with a pound sign ('#') character
      are ignored by the submit description file parser.

EXIT STATUS

   condor_submit will exit with a status value of 0 (zero) upon success,
   and a non-zero value upon failure.

EXAMPLES

   * Submit Description File Example 1: This example queues three jobs
     for execution by Condor. The first will be given command line
     arguments of 15 and 2000, and it will write its standard output to
     foo.out1. The second will be given command line arguments of 30 and
     2000, and it will write its standard output to foo.out2. Similarly
     the third will have arguments of 45 and 6000, and it will use
     foo.out3 for its standard output. Standard error output (if any)
     from the three programs will appear in foo.err1, foo.err2, and
     foo.err3, respectively.

        ####################
        #
        # submit description file
        # Example 1: queuing multiple jobs with differing
        # command line arguments and output files.
        #
        ####################

        Executable = foo
        Universe = standard

        Arguments = 15 2000
        Output = foo.out1
        Error = foo.err1
        Queue

        Arguments = 30 2000
        Output = foo.out2
        Error = foo.err2
        Queue

        Arguments = 45 6000
        Output = foo.out3
        Error = foo.err3
        Queue

   * Submit Description File Example 2: This submit description file
     example queues 150 runs of program foo which must have been
     compiled and linked for Sun workstations running Solaris 8. Condor
     will not attempt to run the processes on machines which have less
     than 32 Megabytes of physical memory, and it will run them on
     machines which have at least 64 Megabytes, if such machines are
     available. Stdin, stdout, and stderr will refer to in.0, out.0, and
     err.0 for the first run of this program (process 0). Stdin, stdout,
     and stderr will refer to in.1, out.1, and err.1 for process 1, and
     so forth. A log file containing entries about where and when Condor
     runs, takes checkpoints, and migrates processes in this cluster
     will be written into file foo.log.

        ####################
        #
        # Example 2: Show off some fancy features including
        # use of pre-defined macros and logging.
        #
        ####################

        Executable = foo
        Universe = standard
        Requirements = Memory >= 32 && OpSys == "SOLARIS28" && Arch == "SUN4u"
        Rank = Memory >= 64
        Image_Size = 28 Meg

        Error = err.$(Process)
        Input = in.$(Process)
        Output = out.$(Process)
        Log = foo.log

        Queue 150

   * Command Line example: The following command uses the -append option
     to add two commands before the job(s) is queued. A log file and an
     error log file are specified. The submit description file is
     unchanged.

        condor_submit -a "log = out.log" -a "error = error.log" mysubmitfile

     Note that each of the added commands is contained within quote
     marks because there are space characters within the command.

   * periodic_remove example: A job should be removed from the queue, if
     the total suspension time of the job is more than half of the run
     time of the job.

     Including the command

        periodic_remove = CumulativeSuspensionTime >
           ((RemoteWallClockTime - CumulativeSuspensionTime) / 2.0)

     in the submit description file causes this to happen.

GENERAL REMARKS

   * For security reasons, Condor will refuse to run any jobs submitted
     by user root (UID = 0) or by a user whose default group is group
     wheel (GID = 0). Jobs submitted by user root or a user with a
     default group of wheel will appear to sit forever in the queue in
     an idle state.

   * All path names specified in the submit description file must be
     less than 256 characters in length, and command line arguments must
     be less than 4096 characters in length; otherwise, condor_submit
     gives a warning message, but the jobs will not execute properly.

   * Somewhat understandably, behavior gets bizarre if the user
     mistakenly requests multiple Condor jobs to write to the same file,
     and/or if the user alters any files that need to be accessed by a
     Condor job which is still in the queue. For example, compressing
     data or output files before a Condor job has completed is a common
     mistake.

   * To disable checkpointing for Standard Universe jobs, include the
     line:

        +WantCheckpoint = False

     in the submit description file before the queue command(s).

SEE ALSO

   Condor User Manual

AUTHOR

   Condor Team, University of Wisconsin-Madison

COPYRIGHT

   Copyright (C) 1990-2009 Condor Team, Computer Sciences Department,
   University of Wisconsin-Madison, Madison, WI. All Rights Reserved.
   Licensed under the Apache License, Version 2.0.

   See the Condor Version 7.4.2 Manual or
   http://www.condorproject.org/license for additional notices.

   condor-admin@cs.wisc.edu

date                                        just-man-pages/condor_submit(1)