prun(1)                              PRRTE                             prun(1)

NAME

       prun - Execute serial and parallel jobs with the PMIx Reference RunTime
       Environment (PRTE).
8

SYNOPSIS

       prun requires a prte Distributed Virtual Machine (DVM) to be running at
       the time of the call.  See prte(1) for more information.
12
13       Single Process Multiple Data (SPMD) Model:
14
15              prun [ options ] <program> [ <args> ]
16
17       Multiple Instruction Multiple Data (MIMD) Model:
18
19              prun [ global_options ] \
20                   [ local_options1 ] <program1> [ <args1> ] : \
21                   [ local_options2 ] <program2> [ <args2> ] : \
22                   ... : \
23                   [ local_optionsN ] <programN> [ <argsN> ]
24
25       Note  that  in  both models, invoking prun via an absolute path name is
26       equivalent to specifying the --prefix option with a <dir> value equiva‐
27       lent  to the directory where prun resides, minus its last subdirectory.
28       For example:
29
30              $ /usr/local/bin/prun ...
31
32       is equivalent to
33
34              $ prun --prefix /usr/local
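
       The equivalence above amounts to taking the parent of the directory
       that contains prun.  A minimal sketch in POSIX shell; the function name
       prefix_from_path is hypothetical and is not part of PRTE:

```shell
#!/bin/sh
# Hypothetical helper (not part of PRTE): derive the implied --prefix
# value from an absolute path to prun by dropping the executable name
# and then the last subdirectory (typically "bin").
prefix_from_path() {
    dirname "$(dirname "$1")"
}

prefix_from_path /usr/local/bin/prun   # prints /usr/local
```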
35

QUICK SUMMARY

37       If you are simply looking for how to run an application,  you  probably
38       want to use a command line of the following form:
39
40              $ prun [ -np X ] [ --hostfile <filename> ] <program>
41
42       This  will  run X copies of <program> in your current run-time environ‐
43       ment over the set of hosts specified by <filename>, scheduling (by  de‐
44       fault)  in  a round-robin fashion by CPU slot.  If running under a sup‐
45       ported resource manager a hostfile is usually not required  unless  the
46       caller  wishes  to  further restrict the set of resources used for that
47       job.
48
49       Please note that PRTE automatically binds processes.   See  prte-map(1)
50       for defaults for the mapping, ranking, and binding of processes.
51
52       If your application uses threads, then you probably want to ensure that
53       you are either not bound at all  (by  specifying  --bind-to  none),  or
54       bound  to multiple cores using an appropriate binding level or specific
55       number of processing elements per application process.
56
       Default ranking is by slot if the number of processes is <= 2; otherwise
       the default is ranking by package (formerly known as “socket”).
59
60       See  prte-map(1)  for more details on mapping, ranking, and binding op‐
61       tions.
62

OPTIONS

64       This section includes many commonly used options.  There may  be  other
65       options listed with prun --help.
66
67       prun  will  send  the name of the directory where it was invoked on the
68       local node to each of the remote nodes, and attempt to change  to  that
69       directory.   See the “Current Working Directory” section below for fur‐
70       ther details.
71
72       <program>
73              The program executable.  This is identified as  the  first  non-
74              recognized argument to prun.
75
76       <args> Pass  these run-time arguments to every new process.  These must
77              always be the last arguments to prun after the <program>.  If an
78              app context file is used, <args> will be ignored.
79
80       -h, --help
81              Display help for this command
82
83       -q, --quiet
84              Suppress  informative messages from prun during application exe‐
85              cution.
86
87       -v, --verbose
88              Be verbose
89
90       -V, --version
91              Print version number.  If no other  arguments  are  given,  this
92              will also cause prun to exit.
93
94   Specifying Number of Processes
       The following options specify the number of processes to launch.  Note
       that none of the options implies a particular binding policy; e.g.,
       requesting N processes for each package does not imply that the processes
       will be bound to that package.
99
100       Additional options and details are presented in prte-map(1).  Below are
101       a few of the commonly used options.
102
103       -c, -n, --n, --np <#>
104              Run  this  many  copies of the program on the given nodes.  This
105              option indicates that the specified file is an  executable  pro‐
106              gram  and  not  an application context.  If no value is provided
107              for the number of copies to execute (i.e., neither the --np  nor
108              its  synonyms are provided on the command line), prun will auto‐
109              matically execute a copy of the program  on  each  process  slot
110              (see  below for description of a “process slot”).  This feature,
111              however, can only be used in the SPMD model and will  return  an
112              error  (without  beginning  execution of the application) other‐
113              wise.
114
115   I/O Management
116       To manage standard I/O:
117
118       --output-filename <filename>
119              Redirect the stdout, stderr, and stddiag of all processes  to  a
120              process-unique   version   of  the  specified  filename  (“file‐
121              name.id”).  Any directories in the filename  will  automatically
122              be  created.   Each  output  file will consist of “filename.id”,
              where the id will be the process’s rank, left-filled with zeros
              for correct ordering in listings.  Both stdout and stderr
125              will be redirected to the file.  A relative path value  will  be
126              converted  to  an absolute path based on the current working di‐
127              rectory where prun is executed.  Note that this will not work in
128              environments where the file system on compute nodes differs from
129              that where prun is executed.  This option accepts  one  case-in‐
130              sensitive  directive,  specified after a colon (:): NOCOPY indi‐
131              cates that the output is not to be echoed to the terminal.
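
       The “filename.id” naming scheme described above can be sketched as
       follows.  The helper output_name is hypothetical, and the pad width is
       an assumption made for illustration only; the text does not state how
       many digits prun uses.

```shell
#!/bin/sh
# Hypothetical helper: build the per-process file name "filename.id".
# The pad width is an assumption for illustration; the option text does
# not say how many digits prun actually uses.
output_name() {
    # $1 = base filename, $2 = rank, $3 = pad width
    printf "%s.%0${3}d\n" "$1" "$2"
}

output_name /tmp/run/out 7 3   # prints /tmp/run/out.007
```

       Zero-filling keeps a plain directory listing in rank order, so that
       rank 10 sorts after rank 9 rather than after rank 1.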
132
133       --output-directory <path>
134              Redirect the stdout, stderr, and stddiag of all processes  to  a
              process-unique location consisting of “//rank.id/std[out,err,diag]”,
              where the id will be the process’s rank, left-filled with zeros
              for correct ordering in listings.  Any directories in the
138              filename will automatically be created.  A relative  path  value
139              will be converted to an absolute path based on the current work‐
140              ing directory where prun is executed.  Note that this  will  not
141              work on environments where the file system on compute nodes dif‐
142              fers from that where prun is executed.  This  option  also  sup‐
143              ports two case-insensitive directives, specified in comma-delim‐
144              ited form after a colon (:): NOJOBID (omits the jobid  directory
145              layer) and NOCOPY (do not copy the output to the terminal).
146
147       --stdin <rank>
148              The  rank  of the process that is to receive stdin.  The default
149              is to forward stdin to rank 0, but this option can  be  used  to
150              forward  stdin to any process.  It is also acceptable to specify
151              none, indicating that no processes are to receive stdin.
152
153       --merge-stderr-to-stdout
154              Merge stderr to stdout for each process.
155
156       --map-by :TAGOUTPUT
157              Tag each line of output to stdout, stderr, and stddiag with [jo‐
158              bid,  MCW_rank]<stdxxx>  indicating  the  jobid  and rank of the
159              process that generated the output, and the channel which  gener‐
160              ated it.
161
162       --map-by :TIMESTAMPOUTPUT
163              Timestamp each line of output to stdout, stderr, and stddiag.
164
165       --map-by :XMLOUTPUT
166              Provide all output to stdout, stderr, and stddiag in an xml for‐
167              mat.
168
169       --xterm <ranks>
170              Display the output from the processes identified by their  ranks
171              in  separate xterm windows.  The ranks are specified as a comma-
172              separated list of ranges, with a -1 indicating all.  A  separate
173              window  will be created for each specified process.  Note: xterm
174              will normally terminate  the  window  upon  termination  of  the
175              process  running within it.  However, by adding a “!” to the end
176              of the list of specified ranks, the proper options will be  pro‐
177              vided  to  ensure  that  xterm  keeps  the window open after the
178              process terminates, thus allowing you to see the  process’  out‐
179              put.   Each  xterm  window will subsequently need to be manually
180              closed.  Note: In some environments, xterm may require that  the
181              executable be in the user’s path, or be specified in absolute or
182              relative terms.  Thus, it may be necessary to  specify  a  local
183              executable  as “./foo” instead of just “foo”.  If xterm fails to
184              find the executable, prun will hang, but still respond correctly
185              to  a  ctrl-c. If this happens, please check that the executable
186              is being specified correctly and try again.
187
188   File and Environment Management
189       To manage files and runtime environment:
190
191       --path <path>
192              <path> that will be used when attempting to locate the requested
193              executables.   This  is  used prior to using the local PATH set‐
194              ting.
195
196       --prefix <dir>
197              Prefix directory that will be used to set the  PATH  and  LD_LI‐
198              BRARY_PATH  on  the  remote  node  before  invoking  the  target
199              process.  See the “Remote Execution” section, below.
200
201       --noprefix
202              Disable the automatic --prefix behavior
203
       -s, --preload-binary
205              Copy the specified executable(s) to  remote  machines  prior  to
206              starting  remote  processes.   The executables will be copied to
207              the session directory and will be deleted upon completion of the
208              job.
209
210       --preload-files <files>
211              Preload the comma separated list of files to the current working
212              directory  of  the  remote  machines  where  processes  will  be
213              launched prior to starting those processes.
214
215       --set-cwd-to-session-dir
216              Set the working directory of the started processes to their ses‐
217              sion directory.
218
219       --wdir <dir>
220              Change to the directory <dir> before  the  user’s  program  exe‐
221              cutes.  See the “Current Working Directory” section for notes on
222              relative paths.  Note: If the --wdir option appears both on  the
223              command  line  and  in  an application context, the context will
224              take precedence over the command line.  Thus, if the path to the
225              desired  working  directory  is  different on the backend nodes,
226              then it must be specified as an absolute path  that  is  correct
227              for the backend node.
228
229       --wd <dir>
230              Synonym for --wdir.
231
232       -x <env>
233              Export  the  specified environment variables to the remote nodes
234              before executing the program.  Only one environment variable can
235              be  specified per -x option.  Existing environment variables can
236              be specified or new variable names specified with  corresponding
237              values.  If multiple -x options with the same variable name (re‐
238              gardless of value) are provided then the last one listed on  the
239              command  line  will  take precedence, and the others will be ig‐
240              nored.  The exception to this is for PRTE_MCA_ prefixed environ‐
241              ment  variables  which  will report an error in that scenario if
              any of the values differ.  For example:

                     $ prun -x DISPLAY -x OFILE=/tmp/out ...
244
245       The  parser  for  the  -x option is not very sophisticated; it does not
246       even understand quoted values.  Users are advised to set  variables  in
247       the environment, and then use -x to export (not define) them.
248
249   MCA Parameters
       Setting MCA parameters takes a few different forms depending on the target
       project for the parameter.  For example, MCA parameters targeting
252       OpenPMIx will contain the string pmix in their name, and MCA parameters
253       targeting PRTE will contain the string prte in  their  name.   See  the
254       “MCA” section, below, for finer details on the MCA.
255
256       --gpmixmca <key> <value>
257              Pass  global  PMIx MCA parameters that are applicable to all ap‐
258              plication contexts.  <key> is the parameter name; <value> is the
259              parameter value.
260
261       --mca <key> <value>
262              Send  arguments  to various MCA modules.  See the “MCA” section,
263              below.
264
265       --pmixmca <key> <value>
266              Send arguments to various PMIx MCA modules.  See the “MCA”  sec‐
267              tion, below.
268
269       --prtemca <key> <value>
270              Send  arguments to various PRTE MCA modules.  See the “MCA” sec‐
271              tion, below.
272
273       --pmixam <arg0>
274              Aggregate PMIx MCA parameter set file list.  The  arg0  argument
              is a comma-separated list of tuning files, each containing
              MCA parameter sets for this application context.
277
278   Debugging Options
279       --get-stack-traces
280              When paired with the --timeout  option,  prun  will  obtain  and
281              print  out  stack  traces  from  all launched processes that are
282              still alive when the timeout expires.  Note that obtaining stack
283              traces can take a little time and produce a lot of output, espe‐
284              cially for large process-count jobs.
285
286       --timeout <seconds>
287              The maximum number of seconds that prun will  run.   After  this
288              many  seconds,  prun will abort the launched job and exit with a
              non-zero exit status.  Using --timeout can also be useful when
              combined with the --get-stack-traces option.
291
292   Other Options
293       There are also other options:
294
295       --allow-run-as-root
296              Allow  prun to run when executed by the root user (prun defaults
297              to aborting when launched as the root user).
298
299       --app <appfile>
300              Provide an appfile, ignoring all other command line options.
301
302       --continuous
303              Job is to run until explicitly terminated.
304
305       --dvm-uri
306              Specify the URI of the DVM master,  or  the  name  of  the  file
307              (specified as file:filename) that contains that info.
308
309       --enable-recovery
310              Enable recovery from process failure [Default = disabled].
311
312       --disable-recovery
313              Disable recovery (resets all recovery options to off).
314
315       --map-by :DONOTLAUNCH
316              Perform all necessary operations to prepare to launch the appli‐
317              cation, but do not actually launch it.
318
319       --index-argv-by-rank
320              Uniquely index argv[0] for each process using its rank.
321
322       --max-restarts <num>
323              Max number of times to restart a failed process.
324
325       --pid  PID of the daemon to which we should connect.
326
327       --report-child-jobs-separately
328              Return the exit status of the primary job only.
329
330       --show-progress
331              Output a brief periodic report on launch progress.
332
333       --terminate
334              Terminate the DVM.
335
336       The following options are useful for developers; they are not generally
337       useful to most users:
338
339       --map-by :DISPLAYALLOC
340              Display  a  detailed  list  of the allocation being used by this
341              job.
342
343       --map-by :DISPLAYDEVEL
344              Display a more detailed table showing  the  mapped  location  of
345              each process prior to launch.
346
347       --map-by :DISPLAYTOPO
348              Display  the  topology  as  part  of the process map just before
349              launch.
350
351       --report-state-on-timeout
352              When paired with the --timeout command line option,  report  the
353              run-time  subsystem  state  of each process when the timeout ex‐
354              pires.
355

DESCRIPTION

357       One invocation of prun starts an application  running  under  the  PRTE
358       DVM.   If  the  application is single process multiple data (SPMD), the
359       application can be specified on the prun command line.
360
       If the application is multiple instruction multiple data (MIMD),
       comprising multiple programs, the set of programs and arguments can be
       specified in one of two ways: Extended Command Line Arguments and
       Application Context.
365
366       An application context describes the MIMD program set including all ar‐
367       guments in a separate file.  This file  essentially  contains  multiple
368       prun command lines, less the command name itself.  The ability to spec‐
369       ify different options for different instantiations of a program is  an‐
370       other reason to use an application context.
371
372       Extended command line arguments allow for the description of the appli‐
373       cation layout on the command line using  colons  (:)  to  separate  the
374       specification of programs and arguments.  Some options are globally set
375       across all specified programs (e.g. --hostfile), while others are  spe‐
376       cific to a single program (e.g. --np).
377
378   Specifying Host Nodes
379       Host  nodes  can be identified on the prun command line with the --host
380       option or in a hostfile.  See prte-map(1) for more details.
381
382   Application Context or Executable Program?
       To distinguish the two different forms, prun looks on the command line
       for the --app option.  If it is specified, then the file named on the
       command line is assumed to be an application context.  If it is not
       specified, then the file is assumed to be an executable program.
387
388   Locating Files
389       If  no  relative  or  absolute  path is specified for a file, prun will
390       first look for files by searching  the  directories  specified  by  the
391       --path  option.  If there is no --path option set or if the file is not
392       found at the --path location, then prun will search the user’s PATH en‐
393       vironment variable as defined on the source node(s).
394
395       If  a  relative directory is specified, it must be relative to the ini‐
396       tial working directory determined by the specific  starter  used.   For
397       example  when  using  the rsh or ssh starters, the initial directory is
398       $HOME by default.  Other starters may set the initial directory to  the
399       current working directory from the invocation of prun.
400
401   Current Working Directory
402       The  --wdir  prun  option  (and  its  synonym, --wd) allows the user to
403       change to an arbitrary directory before the program is invoked.  It can
404       also  be  used in application context files to specify working directo‐
405       ries on specific nodes and/or for specific applications.
406
407       If the --wdir option appears both in a context file and on the  command
408       line, the context file directory will override the command line value.
409
410       If  the  --wdir option is specified, prun will attempt to change to the
411       specified directory on all of the remote nodes.  If  this  fails,  prun
412       will abort.
413
414       If  the  --wdir  option  is not specified, prun will send the directory
415       name where prun was invoked to each of the remote  nodes.   The  remote
416       nodes  will try to change to that directory.  If they are unable (e.g.,
417       if the directory does not exist on that node), then prun will  use  the
418       default directory determined by the starter.
419
420       All directory changing occurs before the user’s program is invoked.
421
422   Standard I/O
423       The  PRTE DVM directs UNIX standard input to /dev/null on all processes
424       except the rank 0 process.  The rank 0 process inherits standard  input
425       from  prun.   Note:  The node that invoked prun need not be the same as
426       the node where the rank 0 process resides.  PRTE DVM handles the  redi‐
427       rection of prun’s standard input to the rank 0 process.
428
429       The  PRTE  DVM directs UNIX standard output and error from remote nodes
430       to the node that invoked prun and prints it on the standard  output/er‐
431       ror of prun.  Local processes inherit the standard output/error of prun
432       and transfer to it directly.
433
434       Thus it is possible to redirect standard I/O for applications by  using
435       the typical shell redirection procedure on prun.
436
437              $ prun --np 2 my_app < my_input > my_output
438
439       Note  that  in  this  example  only the rank 0 process will receive the
440       stream from my_input on stdin.  The stdin on all the other  nodes  will
441       be  tied to /dev/null.  However, the stdout from all nodes will be col‐
442       lected into the my_output file.
443
444   Signal Propagation
       When prun receives a SIGTERM or SIGINT, it will attempt to kill the
446       entire  job  by  sending  all processes in the job a SIGTERM, waiting a
447       small number of seconds, then  sending  all  processes  in  the  job  a
448       SIGKILL.
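
       The SIGTERM, wait, SIGKILL sequence described above is a common
       shutdown pattern.  The sketch below illustrates the pattern only; it is
       not prun's implementation, terminate_with_grace is a hypothetical
       helper, and the grace period is an arbitrary choice.

```shell
#!/bin/sh
# Illustration of the SIGTERM -> wait -> SIGKILL pattern.
# This is NOT prun's implementation; terminate_with_grace is a
# hypothetical helper and the grace period is chosen arbitrarily.
terminate_with_grace() {
    grace=$1
    shift
    # Ask every process to shut down cleanly.
    kill -TERM "$@" 2>/dev/null || true
    sleep "$grace"
    # Force-kill anything that still exists after the grace period.
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
    return 0
}

# Usage: stop a background process, allowing it one second to exit.
sleep 30 &
terminate_with_grace 1 "$!"
```

       Sending SIGTERM first gives each process a chance to clean up; SIGKILL
       cannot be caught, so it is kept as the last resort.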
449
450       SIGUSR1 and SIGUSR2 signals received by prun are propagated to all pro‐
451       cesses in the job.
452
       A SIGTSTP signal to prun will cause a SIGSTOP signal to be sent to all
       of the programs started by prun; likewise, a SIGCONT signal to prun
       will cause a SIGCONT signal to be sent to them.
456
457       Other signals are not currently propagated by prun.
458
459   Process Termination / Signal Handling
460       During the run of an application, if any process dies  abnormally  (ei‐
461       ther exiting before invoking PMIx_Finalize, or dying as the result of a
462       signal), prun will print out an error message and kill the rest of  the
463       application.
464
465   Process Environment
466       Processes  in  the  application inherit their environment from the PRTE
467       DVM daemon upon the node on which they are running.  The environment is
468       typically  inherited from the user’s shell.  On remote nodes, the exact
469       environment is determined by the boot MCA module used.  The rsh  launch
       module, for example, uses either rsh or ssh to launch the PRTE DVM daemon
471       on remote nodes, and typically executes  one  or  more  of  the  user’s
472       shell-setup files before launching the daemon.  When running dynamical‐
473       ly linked applications which require  the  LD_LIBRARY_PATH  environment
474       variable  to  be set, care must be taken to ensure that it is correctly
475       set when booting PRTE DVM.
476
477       See the “Remote Execution” section for more details.
478
479   Remote Execution
480       The PRTE DVM requires that the PATH environment variable be set to find
481       executables  on remote nodes.  This is typically only necessary in rsh-
482       or ssh-based environments.  Batch and scheduled environments  typically
483       copy the current environment to the execution of remote jobs, so if the
484       current environment has PATH and/or LD_LIBRARY_PATH set  properly,  the
485       remote  nodes will also have it set properly.  If the PRTE DVM was com‐
486       piled with shared library support, it may also be necessary to have the
487       LD_LIBRARY_PATH environment variable set on remote nodes as well (espe‐
488       cially to find the shared libraries required to run user applications).
489
490       However, it is not always desirable or possible to edit  shell  startup
491       files  to set PATH and/or LD_LIBRARY_PATH.  The --prefix option is pro‐
492       vided for some simple configurations where this is not possible.
493
494       The --prefix option takes a single argument: the base directory on  the
495       remote  node  where  PRTE DVM is installed.  The PRTE DVM will use this
496       directory to set the remote PATH and LD_LIBRARY_PATH  before  executing
497       any  user  applications.   This allows running jobs without having pre-
498       configured the PATH and LD_LIBRARY_PATH on the remote nodes.
499
500       The PRTE DVM adds the basename of the current node’s “bindir” (the  di‐
501       rectory  where  the PRTE DVM’s executables are installed) to the prefix
502       and uses that to set the PATH on the remote node.  Similarly, PRTE  DVM
503       adds  the  basename of the current node’s “libdir” (the directory where
504       the PRTE DVM’s libraries are installed) to the prefix and uses that  to
505       set the LD_LIBRARY_PATH on the remote node.  For example:
506
507       Local bindir:
508              /local/node/directory/bin
509
510       Local libdir:
511              /local/node/directory/lib64
512
513       If the following command line is used:
514
515              $ prun --prefix /remote/node/directory
516
517       The  PRTE  DVM  will  add  “/remote/node/directory/bin” to the PATH and
518       “/remote/node/directory/lib64” to the  LD_LIBRARY_PATH  on  the  remote
519       node before attempting to execute anything.
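
       The rule in this example (prefix plus the basename of the local
       directory) can be sketched as follows; remote_dir is a hypothetical
       helper used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: compute the directory that would be added to the
# remote PATH or LD_LIBRARY_PATH, i.e. <prefix>/<basename of local dir>.
remote_dir() {
    # $1 = value given to --prefix, $2 = local bindir or libdir
    printf '%s/%s\n' "$1" "$(basename "$2")"
}

remote_dir /remote/node/directory /local/node/directory/bin
remote_dir /remote/node/directory /local/node/directory/lib64
```

       Because only the basename is reused, the remote installation must use
       the same final directory names (“bin”, “lib64”) as the local one.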
520
521       The  --prefix option is not sufficient if the installation paths on the
522       remote node are different than the local node (e.g., if “/lib” is  used
523       on  the local node, but “/lib64” is used on the remote node), or if the
524       installation paths are something other than a subdirectory under a com‐
525       mon prefix.
526
527       Note  that  executing  prun  via  an absolute pathname is equivalent to
528       specifying --prefix without the last subdirectory in the absolute path‐
529       name to prun.
530
531       For example:
532
533              $ /usr/local/bin/prun ...
534
535       is equivalent to
536
537              $ prun --prefix /usr/local ...
538
539   Exported Environment Variables
       All environment variables that are named in the form PMIX_* will
       automatically be exported to new processes on the local and remote nodes.
       Environmental parameters can also be set/forwarded to the new processes
       using the MCA parameter mca_base_env_list.  While the syntax of the -x
       option and MCA param allows the definition of new variables, note that
       the parser for these options is currently not very sophisticated: it
       does not even understand quoted values.  Users are advised to set
       variables in the environment and use the option to export them, not to
       define them.
549
550   Setting MCA Parameters
551       The  --mca  / --pmixmca / --prtemca switches (referenced here as “--mca
552       switches” for brevity) allow the passing of parameters to  various  MCA
553       (Modular  Component Architecture) modules.  MCA modules have direct im‐
554       pact on programs because they allow tunable parameters to be set at run
555       time.
556
557       The  -mca switch takes two arguments: <key> and <value>.  The <key> ar‐
558       gument generally specifies which MCA module  will  receive  the  value.
       For example, the <key> “rmaps” selects which RMAPS component is used
560       for mapping processes to nodes.  The <value> argument is the value that
561       is passed.  For example:
562
563       prun -prtemca rmaps seq -np 1 foo
              Tells PRTE to use the “seq” RMAPS component, and to run a single
              copy of “foo” on an allocated node.
566
567       The -mca switch can be used multiple times to specify  different  <key>
568       and/or  <value>  arguments.   If  the same <key> is specified more than
569       once, the <value>s are concatenated with a comma (“,”) separating them.
570
571       Note that the -mca switch is simply a shortcut for setting  environment
572       variables.   The same effect may be accomplished by setting correspond‐
573       ing environment variables before running prun.  The form of  the  envi‐
574       ronment variables depends on the type of the --mca switch.
575
       --mca  PRTE_MCA_<key>=<value>

       --pmixmca
              PMIX_MCA_<key>=<value>

       --prtemca
              PRTE_MCA_<key>=<value>
583
584       Thus,  the  -mca  switch overrides any previously set environment vari‐
585       ables.  The -mca settings similarly override MCA parameters set in  the
586       $PRTE_PREFIX/etc/prte-mca-params.conf   or  $HOME/.prte/mca-params.conf
587       file.
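
       The switch-to-variable mapping listed above can be sketched as a small
       shell function.  The name mca_env is hypothetical, and the sketch
       assumes that --prtemca uses the same PRTE_MCA_ prefix as --mca.

```shell
#!/bin/sh
# Hypothetical helper: print the environment assignment that corresponds
# to one MCA switch.  Assumes --prtemca maps to the PRTE_MCA_ prefix.
mca_env() {
    # $1 = switch name without dashes, $2 = key, $3 = value
    case "$1" in
        mca|prtemca) printf 'PRTE_MCA_%s=%s\n' "$2" "$3" ;;
        pmixmca)     printf 'PMIX_MCA_%s=%s\n' "$2" "$3" ;;
        *)           echo "unknown switch: $1" >&2; return 1 ;;
    esac
}

mca_env prtemca rmaps seq   # prints PRTE_MCA_rmaps=seq
```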
588
       Unknown <key> arguments are still set as environment variables – they
590       are  not checked (by prun) for correctness.  Illegal or incorrect <val‐
591       ue> arguments may or may not be reported – it depends on  the  specific
592       MCA module.
593
594       To find the available component types under the MCA architecture, or to
595       find the available parameters for a specific component, use  the  pinfo
596       command.   See  the  pinfo(1)  man page for detailed information on the
597       command.
598
599   Running as root
600       The PRTE team strongly advises against executing prun as the root user.
601       Applications should be run as regular (non-root) users.
602
603       Reflecting this advice, prun will refuse to run as root by default.  To
604       override this default, you can add the  --allow-run-as-root  option  to
605       the prun command line.
606

RETURN VALUE

608       There  is no standard definition for what prun should return as an exit
609       status.  After considerable discussion, we  settled  on  the  following
610       method  for  assigning the prun exit status (note: in the following de‐
611       scription, the “primary” job is the initial application started by prun
612       -  all  jobs  that  are  spawned by that job are designated “secondary”
613       jobs):
614
615       • if all processes in the primary job normally terminate with exit sta‐
616         tus 0, we return 0
617
618       • if  one  or more processes in the primary job normally terminate with
619         non-zero exit status, we return the exit status of the  process  with
620         the lowest rank to have a non-zero status
621
622       • if all processes in the primary job normally terminate with exit sta‐
623         tus 0, and one or more processes in a secondary job  normally  termi‐
624         nate  with non-zero exit status, we (a) return the exit status of the
625         process with the lowest rank in the lowest jobid to have  a  non-zero
626         status,  and  (b) output a message summarizing the exit status of the
627         primary and all secondary jobs.
628
629       • if the cmd line option --report-child-jobs-separately is set, we will
630         return  -only- the exit status of the primary job.  Any non-zero exit
631         status in secondary jobs will be reported solely in a  summary  print
632         statement.
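
       The first two rules above can be sketched as follows.  The helper
       primary_exit_status is hypothetical; it covers the primary job only
       and ignores secondary jobs.

```shell
#!/bin/sh
# Hypothetical helper that mirrors the first two rules: given the exit
# statuses of the primary job's processes in rank order (rank 0 first),
# print 0 if all are 0, otherwise the status of the lowest-ranked
# process that exited with a non-zero status.
primary_exit_status() {
    for status in "$@"; do
        if [ "$status" -ne 0 ]; then
            echo "$status"
            return 0
        fi
    done
    echo 0
}

primary_exit_status 0 0 3 1   # prints 3
```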
633
634       By default, the job will abort when any process terminates with non-ze‐
635       ro status.  The MCA parameter prte_abort_on_non_zero_status can be  set
636       to false (or 0) to cause the PRTE DVM to not abort a job if one or more
637       processes return a non-zero status.  In that  situation  the  PRTE  DVM
       records and notes that processes exited with non-zero termination status
       so that it can report the appropriate exit status of prun (per the bullet
       points above).
641
642       If  the  --timeout  command line option is used and the timeout expires
643       before the job completes (thereby forcing prun to kill  the  job)  prun
       will return an exit status equivalent to the value of ETIMEDOUT (which
       is typically 110 on Linux and 60 on OS X systems).
646
2021-08-23                                                             prun(1)