pbs_mom(8B) PBS pbs_mom(8B)

6 pbs_mom - start a pbs batch execution mini-server
7
pbs_mom [-a alarm] [-C chkdirectory] [-c config] [-d directory]
[-H hostname] [-L logfile] [-M MOMport] [-R RPPport] [-S port]
[-p|-P|-q|-r] [-x]
11
13 The pbs_mom command starts the operation of a batch Machine Oriented
14 Mini-server, MOM, on the local host. Typically, this command will be
in a local boot file such as /etc/rc.local. To ensure that the
16 pbs_mom command is not runnable by the general user community, the
17 server will only execute if its real and effective uid is zero.
18
19 One function of pbs_mom is to place jobs into execution as directed by
20 the server, establish resource usage limits, monitor the job's usage,
21 and notify the server when the job completes. If they exist, pbs_mom
22 will execute a prologue script before executing a job and an epilogue
23 script after executing the job. The next function of pbs_mom is to
24 respond to resource monitor requests. This was done by a separate
25 process in previous versions of PBS but has now been combined into one
26 process. The resource monitor function is provided mainly for the PBS
27 scheduler. It provides information about the status of running jobs,
28 memory available etc. The next function of pbs_mom is to respond to
29 task manager requests. This involves communicating with running tasks
30 over a tcp socket as well as communicating with other MOMs within a job
31 (aka a "sisterhood").
32
33 Pbs_mom will record a diagnostic message in a log file for any error
34 occurrence. The log files are maintained in the mom_logs directory
35 below the home directory of the server. If the log file cannot be
36 opened, the diagnostic message is written to the system console.
37
39 -a alarm Used to specify the alarm timeout in seconds for com‐
40 puting a resource. Every time a resource request is
41 processed, an alarm is set for the given amount of
42 time. If the request has not completed before the
43 given time, an alarm signal is generated. The default
44 is 5 seconds.
45
-C chkdirectory Specifies the path of the directory used to hold
47 checkpoint files. [Currently this is only valid on
48 Cray systems.] The default directory is
49 PBS_HOME/spool/checkpoint, see the -d option. The
50 directory specified with the -C option must be owned by
51 root and accessible (rwx) only by root to protect the
52 security of the checkpoint files.
53
-c config Specify an alternative configuration file, see description
below. If this is a relative file name it will be
56 relative to PBS_HOME/mom_priv, see the -d option. If
57 the specified file cannot be opened, pbs_mom will
58 abort. If the -c option is not supplied, pbs_mom will
59 attempt to open the default
60 configuration file "config" in PBS_HOME/mom_priv. If
61 this file is not present, pbs_mom will log the fact and
62 continue.
63
64 -H hostname Set MOM's hostname. This can be useful on multi-homed
65 networks.
66
67 -d directory Specifies the path of the directory which is the home
of the server's working files, PBS_HOME. This option is
69 typically used along with -M when debugging MOM. The
70 default directory is given by $PBS_SERVER_HOME which is
71 typically /usr/spool/PBS.
72
73 -L logfile Specify an absolute path name for use as the log file.
74 If not specified, MOM will open a file named for the
75 current date in the PBS_HOME/mom_logs directory, see
76 the -d option.
77
78 -M port Specifies the port number on which the mini-server
79 (MOM) will listen for batch requests.
80
81 -R port Specifies the port number on which the mini-server
82 (MOM) will listen for resource monitor requests, task
83 manager requests and inter-MOM messages. Both a UDP
84 and a TCP port of this number will be used.
85
86 -p (Default after version 2.4.0) (Preserve running jobs)
87 -- Specifies the impact on jobs which were in execution
88 when the mini-server shut-down. The -p option tries
89 to preserve any running jobs when the MOM restarts.
90 The new mini-server will not be the parent of any run‐
91 ning jobs, MOM has lost control of her offspring (not
92 a new situation for a mother). The MOM will allow the
93 jobs to continue to run and monitor them indirectly via
94 polling. All recovered jobs will report an exit code of
95 0 when they are complete. The -p option is mutually
96 exclusive with the -r, -P and -q options.
97
98 -P (Terminate all jobs and remove them from the queue) --
99 Specifies the impact on jobs which were in execution
100 when the mini-server shut-down. With the -P option, it
101 is assumed that either the entire system has been
102 restarted or the MOM has been down so long that it can
103 no longer guarantee that the pid of any running process
104 is the same as the recorded job process pid of a recov‐
105 ering job. Unlike the -p option no attempt is made to
106 try and preserve or recover running jobs. All jobs are
terminated and removed from the queue. The -P option
is mutually exclusive with the -p, -q and -r options.
109
110 -q (Requeue all jobs - This is the default behavior in
111 versions prior to 2.4.0) -- Specifies the impact on
jobs which were in execution when the mini-server shut-down.
Do not terminate running processes. With the -q
114 option, it is assumed that either the entire system has
115 been restarted or the MOM has been down so long that it
116 can no longer guarantee that the pid of any running
117 process is the same as the recorded job process pid of
118 a recovering job. No attempt is made to kill job pro‐
119 cesses. The MOM will mark the jobs as terminated and
120 notify the batch server which owns the job. Re-runnable
121 jobs will be requeued. The -q option is mutually
122 exclusive with the -p, -P and -r options.
123
124 -r (Terminate running processes and requeue all jobs) --
125 Specifies the impact on jobs which were in execution
126 when the mini-server shut-down. With the -r option, MOM
127 will kill any processes belonging to running jobs, mark
128 the jobs as terminated and notify the batch server that
129 owns the job. Re-runnable jobs are reset to a queued
130 state so they can be run again. The -r option is mutu‐
131 ally exclusive with the -p, -P and -q options.
132
133 If the -r option is used following a reboot, process
134 IDs (pids) may be reused and MOM may kill a process
135 that is not a batch session.
136
137 -S port Specifies the port number on which the pbs_server is
138 listening for requests. If pbs_server is started with
139 a -p option, pbs_mom will need to use the -S option and
140 match the port value which was used to start
141 pbs_server.
142
143 -x Disables the check for privileged port resource monitor
144 connections. This is used mainly for testing since the
145 privileged port is the only mechanism used to prevent
146 any ordinary user from connecting.
147
149 The configuration file may be specified on the command line at program
150 start with the -c flag. The use of this file is to provide several
151 types of run time information to pbs_mom: static resource names and
152 values, external resources provided by a program to be run on request
153 via a shell escape, and values to pass to internal set up functions at
154 initialization (and re-initialization).
155
156 Each item type is on a single line with the component parts separated
157 by white space. If the line starts with a hash mark (pound sign, #),
158 the line is considered to be a comment and is skipped.
159
160 Static Resources
161 For static resource names and values, the configuration file
162 contains a list of resource names/values pairs, one pair per
line and separated by white space. An example of static
164 resource names and values could be the number of tape drives of
165 different types and could be specified by
166
167 tape3480 4
168 tape3420 2
169 tapedat 1
170 tape8mm 1
171
172 Shell Commands
173 If the first character of the value is an exclamation mark (!),
174 the entire rest of the line is saved to be executed through the
175 services of the system(3) standard library routine.
176
177 The shell escape provides a means for the resource monitor to
178 yield arbitrary information to the scheduler. Parameter substi‐
179 tution is done such that the value of any qualifier sent with
180 the query, as explained below, replaces a token with a percent
181 sign (%) followed by the name of the qualifier. For example,
182 here is a configuration file line which gives a resource name of
183 "escape":
184
185 escape !echo %xxx %yyy
186
187 If a query for "escape" is sent with no qualifiers, the command
188 executed would be "echo %xxx %yyy". If one qualifier is sent,
189 "escape[xxx=hi there]", the command executed would be "echo hi
190 there %yyy". If two qualifiers are sent,
191 "escape[xxx=hi][yyy=there]", the command executed would be "echo
192 hi there". If a qualifier is sent with no matching token in the
193 command line, "escape[zzz=snafu]", an error is reported.
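
The substitution rule described above can be sketched in a few lines of
Python. This is only an illustration of the documented behavior, not
pbs_mom's actual code; the function name expand_shell_escape and the use
of ValueError for the error case are assumptions made for the example.

```python
def expand_shell_escape(command, qualifiers):
    """Replace each %name token in command with the matching qualifier value.

    Tokens that have no qualifier are left untouched. A qualifier with no
    matching token raises ValueError, standing in for the error that the
    text says is reported.
    """
    for name, value in qualifiers.items():
        token = "%" + name
        if token not in command:
            raise ValueError("no token for qualifier %r" % name)
        command = command.replace(token, value)
    return command


# The three queries from the text, applied to "escape !echo %xxx %yyy".
print(expand_shell_escape("echo %xxx %yyy", {}))
print(expand_shell_escape("echo %xxx %yyy", {"xxx": "hi there"}))
print(expand_shell_escape("echo %xxx %yyy", {"xxx": "hi", "yyy": "there"}))
```

The sketch does plain substring replacement, so it does not guard against
one token name being a prefix of another.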
194
195 size[fs=<FS>]
196 Specifies that the available and configured disk space in the
197 <FS> filesystem is to be reported to the pbs_server and sched‐
198 uler. NOTE: To request disk space on a per job basis, specify
the file resource as in 'qsub -l nodes=1,file=1000kb'. For example,
the available and configured disk space in the
201 /localscratch filesystem will be reported:
202
203 size[fs=/localscratch]
204
205 Initialization Value
206 An initialization value directive has a name which starts with a
207 dollar sign ($) and must be known to MOM via an internal table.
208 The entries in this table now are:
209
210 pbsserver
211 which defines hostnames running pbs_server that will be
212 allowed to submit jobs, issue Resource Monitor (RM)
213 requests, and get status updates. MOM will continually
214 attempt to contact all server hosts for node status and
215 state updates. Like $PBS_SERVER_HOME/server_name, the
216 hostname may be followed by a colon and a port number.
217 This parameter replaces the oft-confused $clienthost
218 parameter from TORQUE 2.0.0p0 and earlier. Note that the
219 hostname in $PBS_SERVER_HOME/server_name is used if no
$pbsserver parameters are found.
221
222 pbsclient
223 which causes a host name to be added to the list of hosts
224 which will be allowed to connect to MOM as long as they
are using a privileged port for the purposes of resource
226 monitor requests. For example, here are two configura‐
227 tion file lines which will allow the hosts "fred" and
228 "wilma" to connect:
229
230 $pbsclient fred
231 $pbsclient wilma
232
Two host names are always allowed to connect to
234 pbs_mom, "localhost" and the name returned to pbs_mom by
235 the system call gethostname(). These names need not be
236 specified in the configuration file. The hosts listed as
237 "clients" can issue Resource Monitor (RM) requests.
238 Other MOM nodes and servers do not need to be listed as
239 clients.
240
241 restricted
242 which causes a host name to be added to the list of hosts
243 which will be allowed to connect to MOM without needing
to use a privileged port. These names allow for wildcard
245 matching. For example, here is a configuration file line
246 which will allow queries from any host from the domain
247 "ibm.com".
248
249 $restricted *.ibm.com
250
251 The restriction which applies to these connections is
252 that only internal queries may be made. No resources
253 from a config file will be found. This is to prevent any
254 shell commands from being run by a non-root process.
255 This parameter is generally not required except for some
256 versions of OSX.
257
258 logevent
259 which sets the mask that determines which event types are
260 logged by pbs_mom. For example:
261
$logevent 0x1ff
263 $logevent 255
264
265 The first example would set the log event mask to 0x1ff
266 (511) which enables logging of all events including debug
267 events. The second example would set the mask to 0x0ff
268 (255) which enables all events except debug events.
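
The arithmetic behind the two masks can be checked with a short Python
sketch. The only fact taken from the text is that 0x1ff includes debug
events and 0x0ff does not; treating the single differing bit (0x100) as
the debug bit is an inference made for illustration, and the helper name
is_logged is hypothetical.

```python
ALL_EVENTS = 0x1ff   # 511: every event type, including debug
NO_DEBUG = 0x0ff     # 255: every event type except debug

# The bit present in the first mask but not the second (inferred debug bit).
DEBUG_BIT = ALL_EVENTS & ~NO_DEBUG


def is_logged(mask, event_bit):
    """Return True if the event type's bit is set in the log event mask."""
    return bool(mask & event_bit)


print(hex(DEBUG_BIT))
print(is_logged(ALL_EVENTS, DEBUG_BIT), is_logged(NO_DEBUG, DEBUG_BIT))
```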
269
270 cputmult
271 which sets a factor used to adjust cpu time used by a
272 job. This is provided to allow adjustment of time
273 charged and limits enforced where the job might run on
274 systems with different cpu performance. If Mom's system
275 is faster than the reference system, set cputmult to a
276 decimal value greater than 1.0. If Mom's system is
277 slower, set cputmult to a value between 1.0 and 0.0. For
278 example:
279
280 $cputmult 1.5
281 $cputmult 0.75
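
As a rough illustration of what the factor does, the sketch below scales
a raw CPU time by cputmult. It assumes the adjustment is a plain
multiplication, which matches the description above but is not taken
from the pbs_mom source; the function name and the positivity check are
hypothetical.

```python
def adjusted_cpu_seconds(raw_cpu_seconds, cputmult):
    """Scale raw CPU seconds by the configured factor (assumed to multiply)."""
    if cputmult <= 0:
        # Hypothetical validation: the text only describes positive factors.
        raise ValueError("cputmult must be positive")
    return raw_cpu_seconds * cputmult


# 100 raw CPU seconds on a faster host (1.5) and on a slower host (0.75).
print(adjusted_cpu_seconds(100, 1.5))
print(adjusted_cpu_seconds(100, 0.75))
```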
282
283 usecp specifies which directories should be staged with cp
284 instead of rcp/scp. If a shared filesystem is available
285 on all hosts in a cluster, this directive is used to make
286 these filesystems known to MOM. For example, if /home is
287 NFS mounted on all nodes in a cluster:
288
289 $usecp *:/home /home
290
291 wallmult
which sets a factor used to adjust wall time usage by a
job to a common reference system. The factor is used for
walltime calculations and limits in the same way as cputmult is
used for cpu time.
296
297 configversion
298 specifies the version of the config file data, a string.
299
300 check_poll_time
301 specifies the MOM interval in seconds. MOM checks each
302 job for updated resource usages, exited processes, over-
303 limit conditions, etc. once per interval. This value
should be equal to or lower than pbs_server's job_stat_rate.
305 High values result in stale information reported to
306 pbs_server. Low values result in increased system usage
307 by MOM. Default is 45 seconds.
308
309 down_on_error
310 causes MOM to report itself as state "down" to pbs_server
311 in the event of a failed health check. This feature is
312 EXPERIMENTAL and likely to be removed in the future. See
313 HEALTH CHECK below.
314
315 ideal_load
316 ideal processor load. Represents a low water mark for
the load average. A node that is currently busy will
consider itself free after falling below ideal_load.
319
320 auto_ideal_load
if jobs are running, sets ideal_load based on a simple
expression. An expression consists of the variable 't'
(total assigned CPUs) or 'c' (existing CPUs), an operator
(+ - / *), and a float constant.
325
326 $auto_ideal_load t-0.2
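
The expression form used by auto_ideal_load and auto_max_load can be
sketched as a tiny evaluator. This is an illustration of the documented
grammar only (variable, operator, float constant); the function name,
the regular expression, and the error handling are assumptions, and
pbs_mom's own parser may accept or reject inputs differently.

```python
import operator
import re

_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}
# One variable (t or c), one operator, one non-negative float constant.
_EXPR = re.compile(r"^([tc])([-+*/])(\d+(?:\.\d+)?)$")


def eval_load_expr(expr, total_assigned_cpus, existing_cpus):
    """Evaluate an expression such as 't-0.2' or 'c*1.5'."""
    match = _EXPR.match(expr.strip())
    if match is None:
        raise ValueError("unsupported expression: %r" % expr)
    var, op, const = match.groups()
    base = total_assigned_cpus if var == "t" else existing_cpus
    return _OPS[op](base, float(const))


# With 4 assigned CPUs on an 8-CPU node, 't-0.2' is just under 4.
print(eval_load_expr("t-0.2", total_assigned_cpus=4, existing_cpus=8))
```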
327
328 loglevel
329 specifies the verbosity of logging with higher numbers
330 specifying more verbose logging. Values may range
331 between 0 and 7.
332
333 log_file_max_size
334 If this is set to a value > 0 then pbs_mom will roll
335 the current log file to log-file-name.1 when its size is
336 greater than or equal to the value of
337 log_file_max_size. This value is interpreted as kilo‐
338 bytes.
339
340 log_file_roll_depth
341 If this is set to a value >=1 and log_file_max_size is
342 set then pbs_mom will continue rolling the log files to
343 log-file-name.log_file_roll_depth.
344
345 max_load
346 maximum processor load. Nodes over this load average are
347 considered busy (see ideal_load above).
348
349 auto_max_load
if jobs are running, sets max_load based on a simple
expression. An expression consists of the variable 't'
(total assigned CPUs) or 'c' (existing CPUs), an operator
(+ - / *), and a float constant.
354
355 enablemomrestart
356 enable automatic restarts of MOM. If enabled, MOM will
357 check if its binary has been updated and restart itself
358 at a safe point when no jobs are running; thus making
359 upgrades easier. The check is made by comparing the
360 mtime of the pbs_mom executable. Command-line args, the
361 process name, and the PATH env variable are preserved
362 across restarts. It is recommended that this not be
363 enabled in the config file, but enabled when desired with
364 momctl (see RESOURCES for more information.)
365
366 node_check_script
367 specifies the fully qualified pathname of the health
368 check script to run (see HEALTH CHECK for more informa‐
369 tion).
370
371 node_check_interval
372 specifies when to run the MOM health check. The check
can be either periodic, event-driven, or both. The value
374 starts with an integer specifying the number of MOM
375 intervals between subsequent executions of the specified
376 health check. After the integer is an optional comma-
377 separated list of event names. Currently supported are
378 "jobstart" and "jobend". This value defaults to 1 with
379 no events indicating the check is run every MOM interval.
380 (see HEALTH CHECK for more information)
381
$node_check_interval 0                    Disabled.
$node_check_interval 0,jobstart           Only at job start.
$node_check_interval 10,jobstart,jobend
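
The value format (an integer followed by optional event names) can be
illustrated with a small parser. This is a sketch of the syntax as
described above, not of pbs_mom's implementation; the function name and
the decision to reject unknown event names are assumptions.

```python
SUPPORTED_EVENTS = {"jobstart", "jobend"}


def parse_node_check_interval(value):
    """Split '10,jobstart,jobend' into (10, {'jobstart', 'jobend'})."""
    parts = [part.strip() for part in value.split(",")]
    interval = int(parts[0])          # number of MOM intervals between checks
    if interval < 0:
        raise ValueError("interval must not be negative")
    events = set(parts[1:])
    unknown = events - SUPPORTED_EVENTS
    if unknown:
        raise ValueError("unsupported events: %s" % sorted(unknown))
    return interval, events


print(parse_node_check_interval("10,jobstart,jobend"))
```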
385
386 prologalarm
Specifies the maximum duration (in seconds) which the MOM
will wait for the job prolog or job epilog to complete.
This parameter defaults to 300 seconds (5 minutes).
390
rcpcmd Specifies the full path and arguments to be used for
392 remote file copies. This overrides the compile-time
393 default found in configure. This must contain 2 words:
394 the full path to the command and the switches. The copy
395 command must be able to recursively copy files to the
remote host and accept arguments of the form
"user@host:files". For example:
398
399 $rcpcmd /usr/bin/rcp -rp
400 $rcpcmd /usr/bin/scp -rpB
401
402 remote_checkpoint_dirs
403 Specifies what server checkpoint directories are remotely
404 mounted. This directive is used to tell the MOM which
405 directories are shared with the server. Using remote
406 checkpoint directories eliminates the need to copy the
407 checkpoint files back and forth between the MOM and the
408 server. This parameter is available in 2.4.1 and later.
409
410 $remote_checkpoint_dirs /var/spool/torque/checkpoint
411
412 remote_reconfig
413 Enables the ability to remotely reconfigure pbs_mom with
414 a new config file. Default is disabled. This parameter
415 accepts various forms of true, yes, and 1.
416
417 timeout
418 Specifies the number of seconds before TCP messages will
419 time out. TCP messages include job obituaries, and TM
420 requests if RPP is disabled. Default is 60 seconds.
421
422 tmpdir Sets the directory basename for a per-job temporary
423 directory. Before job launch, MOM will append the jobid
424 to the tmpdir basename and create the directory. After
the job exits, MOM will recursively delete it. The env
426 variable TMPDIR will be set for all pro/epilog scripts,
427 the job script, and TM tasks.
428 Directory creation and removal is done as the job owner
429 and group, so the owner must have write permission to
430 create the directory. If the directory already exists
431 and is owned by the job owner, it will not be deleted
432 after the job. If the directory already exists and is
433 NOT owned by the job owner, the job start will be
434 rejected.
435
436 status_update_time
437 Specifies (in seconds) how often MOM updates its status
438 information to pbs_server. This value should correlate
with the server's scheduling interval. High values
cause pbs_server to report stale information. Low
values increase the load of pbs_server and the network.
442 Default is 45 seconds.
443
444 varattr
445 This is similar to a shell escape above, but includes a
446 TTL. The command will only be run every TTL seconds. A
447 TTL of -1 will cause the command to be executed only
once. A TTL of 0 will cause the command to be run every
time varattr is requested. This parameter may be used
multiple times, but all output will be grouped into a
single "varattr" attribute in the request and status output.
The command should output data in the form of
varattrname=value1[+value2]...
454
455 $varattr 3600 /path/to/script [<ARGS>]...
456
457 The configuration file must be executable and "secure". It must be
458 owned by a user id and group id less than 10 and not be world writable.
459 Output from this file must be in the format $VAR=$VAL, i.e.,
460
461 dataset13=20070104
462 dataset22=20070202
463 viraltest=abdd3
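
A command run through $varattr could produce lines in this format with a
helper along the following lines. This is a minimal sketch: the function
name format_varattr is hypothetical, and the sample name and value are
taken from the example output above.

```python
def format_varattr(name, values):
    """Build one 'name=value1[+value2]...' line for varattr output."""
    if not values:
        raise ValueError("at least one value is required")
    return name + "=" + "+".join(str(value) for value in values)


if __name__ == "__main__":
    # A real script would compute these values; they are fixed here.
    print(format_varattr("dataset13", ["20070104"]))
```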
464
465 xauthpath
Specifies the path to the xauth binary to enable X11 forwarding.
467
468 ignvmem
469 If set to true, then pbs_mom will ignore vmem/pvmem limit
470 enforcement.
471
472 ignwalltime
473 If set to true, then pbs_mom will ignore walltime limit enforce‐
474 ment.
475
476 mom_host
477 Sets the local hostname as used by pbs_mom.
478
480 Resource Monitor queries can be made with momctl's -q option to
481 retrieve and set pbs_mom options. Any configured static resource may
482 be retrieved with a request of the same name. These are resource
483 requests not otherwise documented in the PBS ERS.
484
485 cycle forces an immediate MOM cycle
486
487 status_update_time
488 retrieve or set the $status_update_time parameter
489
490 check_poll_time
491 retrieve or set the $check_poll_time parameter
492
493 configversion
494 retrieve the config version
495
496 jobstartblocktime
497 retrieve or set the $jobstartblocktime parameter
498
499 enablemomrestart
500 retrieve or set the $enablemomrestart parameter
501
502 loglevel
503 retrieve or set the $loglevel parameter
504
505 down_on_error
506 retrieve or set the EXPERIMENTAL $down_on_error parameter
507
508 diag0 - diag4
509 retrieves various diagnostic information
510
511 rcpcmd retrieve or set the $rcpcmd parameter
512
513 version
514 retrieves the pbs_mom version
515
517 The health check script is executed directly by the pbs_mom daemon
518 under the root user id. It must be accessible from the compute node and
519 may be a script or compiled executable program. It may make any needed
520 system calls and execute any combination of system utilities but should
521 not execute resource manager client commands. Also, as of TORQUE
522 1.0.1, the pbs_mom daemon blocks until the health check is completed
523 and does not possess a built-in timeout. Consequently, it is advisable
524 to keep the launch script execution time short and verify that the
525 script will not block even under failure conditions.
526
527 If the script detects a failure, it should return the keyword 'ERROR'
528 to stdout followed by an error message. The message (up to 256 charac‐
529 ters) immediately following the ERROR string will be assigned to the
530 node attribute 'message' of the associated node.
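
A health check following this convention might look like the sketch
below. It is an assumption-laden example rather than a recommended
check: the directory being tested, the function name health_report, and
the choice to truncate the message in the script itself are all
hypothetical. It uses only quick local calls so that it cannot block.

```python
import os

MAX_MESSAGE_LEN = 256  # per the text, only this many characters are kept


def health_report(path):
    """Return 'ERROR <message>' if path is not a writable directory, else ''."""
    if not os.path.isdir(path):
        message = "%s is not a directory" % path
    elif not os.access(path, os.W_OK):
        message = "%s is not writable" % path
    else:
        return ""
    return "ERROR " + message[:MAX_MESSAGE_LEN]


if __name__ == "__main__":
    # Hypothetical check target; print nothing when the node is healthy.
    report = health_report("/tmp")
    if report:
        print(report)
```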
531
532 If the script detects a failure when run from "jobstart", then the job
533 will be rejected. This should probably only be used with advanced
534 schedulers like Moab so that the job can be routed to another node.
535
TORQUE currently ignores ERROR messages by default, but advanced
schedulers like Moab can be configured to react appropriately.
538
539 If the experimental $down_on_error MOM setting is enabled, MOM will set
540 itself to state down and report to pbs_server; and pbs_server will
541 report the node as "down". Additionally, the experimental
542 "down_on_error" server attribute can be enabled which has the same
543 effect but moves the decision to pbs_server. It is redundant to have
544 MOM's $down_on_error and pbs_server's down_on_error features enabled.
545 See "down_on_error" in pbs_server_attributes(7B).
546
548 $PBS_SERVER_HOME/server_name
549 contains the hostname running pbs_server.
550
551 $PBS_SERVER_HOME/mom_priv
552 the default directory for configuration files, typically
553 (/usr/spool/pbs)/mom_priv.
554
555 $PBS_SERVER_HOME/mom_logs
556 directory for log files recorded by the server.
557
558 $PBS_SERVER_HOME/mom_priv/prologue
559 the administrative script to be run before job execution.
560
561 $PBS_SERVER_HOME/mom_priv/epilogue
562 the administrative script to be run after job execution.
563
565 pbs_mom handles the following signals:
566
567 SIGHUP causes pbs_mom to re-read its configuration file, close and
568 reopen the log file, and reinitialize resource structures.
569
570 SIGALRM
571 results in a log file entry. The signal is used to limit the
time taken by certain child processes, such as the prologue
573 and epilogue.
574
575 SIGINT and SIGTERM
result in pbs_mom exiting without terminating any running jobs.
577 This is the action for the following signals as well: SIGXCPU,
578 SIGXFSZ, SIGCPULIM, and SIGSHUTDN.
579
580 SIGUSR1, SIGUSR2
cause MOM to increase and decrease logging levels,
respectively.
583
584 SIGPIPE, SIGINFO
585 are ignored.
586
587 SIGBUS, SIGFPE, SIGILL, SIGTRAP, and SIGSYS
cause a core dump if the PBSCOREDUMP environment variable is
589 defined.
590
591 All other signals have their default behavior installed.
592
594 If the mini-server command fails to begin operation, the server exits
595 with a value greater than zero.
596
598 pbs_server(8B), pbs_scheduler_basl(8B), pbs_scheduler_tcl(8B), the PBS
599 External Reference Specification, and the PBS Administrator's Guide.
600
Local pbs_mom(8B)