1pbs_mom(8B) PBS pbs_mom(8B)
2
3
4
5 NAME
6 pbs_mom - start a pbs batch execution mini-server
7
8 SYNOPSIS
9 pbs_mom [-a alarm] [-C chkdirectory] [-c config] [-d directory]
10 [-L logfile] [-M MOMport] [-R RPPport] [-p|-r] [-x]
11
12 DESCRIPTION
13 The pbs_mom command starts the operation of a batch Machine Oriented
14 Mini-server, MOM, on the local host. Typically, this command will be
15 in a local boot file such as /etc/rc.local. To ensure that the
16 pbs_mom command is not runnable by the general user community, the
17 server will only execute if its real and effective uid is zero.
18
19 One function of pbs_mom is to place jobs into execution as directed by
20 the server, establish resource usage limits, monitor the job's usage,
21 and notify the server when the job completes. If they exist, pbs_mom
22 will execute a prologue script before executing a job and an epilogue
23 script after executing the job. The next function of pbs_mom is to
24 respond to resource monitor requests. This was done by a separate
25 process in previous versions of PBS but has now been combined into one
26 process. The resource monitor function is provided mainly for the PBS
27 scheduler. It provides information about the status of running jobs,
28 memory available etc. The next function of pbs_mom is to respond to
29 task manager requests. This involves communicating with running tasks
30 over a TCP socket as well as communicating with other MOMs within a job
31 (aka a "sisterhood").
32
33 pbs_mom will record a diagnostic message in a log file for any error
34 occurrence. The log files are maintained in the mom_logs directory
35 below the home directory of the server. If the log file cannot be
36 opened, the diagnostic message is written to the system console.
37
38 OPTIONS
39 -C chkdirectory Specifies the path of the directory used to hold
40 checkpoint files. [Currently this is only valid on
41 Cray systems.] The default directory is
42 PBS_HOME/spool/checkpoint, see the -d option. The
43 directory specified with the -C option must be owned by
44 root and accessible (rwx) only by root to protect the
45 security of the checkpoint files.
46
47 -c config Specify an alternative configuration file, see description
48 below. If this is a relative file name it will be
49 relative to PBS_HOME/mom_priv, see the -d option. If
50 the specified file cannot be opened, pbs_mom will
51 abort. If the -c option is not supplied, pbs_mom will
52 attempt to open the default
53 configuration file "config" in PBS_HOME/mom_priv. If
54 this file is not present, pbs_mom will log the fact and
55 continue.
56
57 -d directory Specifies the path of the directory which is the home
58 of the server's working files, PBS_HOME. This option is
59 typically used along with -M when debugging MOM. The
60 default directory is given by $PBS_SERVER_HOME which is
61 typically /usr/spool/PBS.
62
63 -L logfile Specify an absolute path name for use as the log file.
64 If not specified, MOM will open a file named for the
65 current date in the PBS_HOME/mom_logs directory, see
66 the -d option.
67
68 -M port Specifies the port number on which the mini-server
69 (MOM) will listen for batch requests.
70
71 -R port Specifies the port number on which the mini-server
72 (MOM) will listen for resource monitor requests, task
73 manager requests and inter-MOM messages. Both a UDP
74 and a TCP port of this number will be used.
75
76 -p Specifies the impact on jobs which were in execution
77 when the mini-server shut down. On any restart of MOM,
78 the new mini-server will not be the parent of any running
79 jobs; MOM has lost control of her offspring (not a
80 new situation for a mother). With the -p option, MOM
81 will allow the jobs to continue to run and monitor them
82 indirectly via polling. The -p option is mutually
83 exclusive with the -r option.
84
85 -r Specifies the impact on jobs which were in execution
86 when the mini-server shut down. With the -r option,
87 MOM will kill any processes belonging to jobs, mark the
88 jobs as terminated, and notify the batch server which
89 owns the job. The -r option is mutually exclusive with
90 the -p option.
91
92 Normally the mini-server is started from the system
93 boot file without the -p or the -r option. The mini-
94 server will make no attempt to signal the former ses‐
95 sion of any job which may have been running when the
96 mini-server terminated. It is assumed that on reboot,
97 all processes have been killed.
98
99 If the -r option is used following a reboot, process
100 IDs (pids) may be reused and MOM may kill a process
101 that is not a batch session.
102
103 -a alarm Used to specify the alarm timeout in seconds for com‐
104 puting a resource. Every time a resource request is
105 processed, an alarm is set for the given amount of
106 time. If the request has not completed before the
107 given time, an alarm signal is generated. The default
108 is 5 seconds.
109
110 -x Disables the check for privileged port resource monitor
111 connections. This is used mainly for testing since the
112 privileged port is the only mechanism used to prevent
113 any ordinary user from connecting.
114
115 CONFIGURATION FILE
116 The configuration file may be specified on the command line at program
117 start with the -c flag. The use of this file is to provide several
118 types of run time information to pbs_mom: static resource names and
119 values, external resources provided by a program to be run on request
120 via a shell escape, and values to pass to internal set up functions at
121 initialization (and re-initialization).
122
123 Each item type is on a single line with the component parts separated
124 by white space. If the line starts with a hash mark (pound sign, #),
125 the line is considered to be a comment and is skipped.
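
As a rough illustration of the line format just described, the following Python sketch splits configuration text into name and value parts and skips comment lines. It is not pbs_mom's own parser: the function name parse_config and the decision to skip blank lines are assumptions made for this example.

```python
def parse_config(text):
    """Return (name, value) pairs from MOM-style configuration text."""
    entries = []
    for raw in text.splitlines():
        # A line that starts with '#' is a comment and is skipped.
        if raw.startswith("#"):
            continue
        line = raw.strip()
        # Assumption: blank lines carry no item and are skipped as well.
        if not line:
            continue
        # The first run of white space separates the name from the rest.
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        entries.append((name, value))
    return entries
```

Calling parse_config on a file's contents gives a list that later steps could inspect, for example to tell static resources from the other item types.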
126
127 Static Resources
128 For static resource names and values, the configuration file
129 contains a list of resource name/value pairs, one pair per
130 line and separated by white space. An example of static
131 resource names and values could be the number of tape drives of
132 different types and could be specified by
133
134 tape3480 4
135 tape3420 2
136 tapedat 1
137 tape8mm 1
138
139 Shell Commands
140 If the first character of the value is an exclamation mark (!),
141 the entire rest of the line is saved to be executed through the
142 services of the system(3) standard library routine.
143
144 The shell escape provides a means for the resource monitor to
145 yield arbitrary information to the scheduler. Parameter substi‐
146 tution is done such that the value of any qualifier sent with
147 the query, as explained below, replaces a token with a percent
148 sign (%) followed by the name of the qualifier. For example,
149 here is a configuration file line which gives a resource name of
150 "escape":
151
152 escape !echo %xxx %yyy
153
154 If a query for "escape" is sent with no qualifiers, the command
155 executed would be "echo %xxx %yyy". If one qualifier is sent,
156 "escape[xxx=hi there]", the command executed would be "echo hi
157 there %yyy". If two qualifiers are sent,
158 "escape[xxx=hi][yyy=there]", the command executed would be "echo
159 hi there". If a qualifier is sent with no matching token in the
160 command line, "escape[zzz=snafu]", an error is reported.
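
The substitution rules above can be sketched in Python as follows. This is only a model of the behaviour described in the text, not the code pbs_mom runs: the name expand_escape is hypothetical, and raising ValueError stands in for "an error is reported".

```python
import re

def expand_escape(command, qualifiers):
    """Replace %name tokens in command with the matching qualifier values."""
    tokens = set(re.findall(r"%(\w+)", command))
    # A qualifier with no matching token in the command line is an error.
    unknown = [name for name in qualifiers if name not in tokens]
    if unknown:
        raise ValueError("no matching token for: " + ", ".join(unknown))
    # Tokens that received no qualifier are left in place unchanged.
    return re.sub(
        r"%(\w+)",
        lambda m: qualifiers.get(m.group(1), m.group(0)),
        command,
    )
```

With the "escape" line shown earlier, expand_escape("echo %xxx %yyy", {"xxx": "hi there"}) yields the command "echo hi there %yyy", matching the one-qualifier case in the text.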
161
162 size[fs=<FS>]
163 Specifies that the available and configured disk space in the
164 <FS> filesystem is to be reported to the pbs_server and scheduler.
165 NOTE: To request disk space on a per job basis, specify
166 the file resource as in 'qsub -l nodes=1,file=1000kb'. For example,
167 the available and configured disk space in the
168 /localscratch filesystem will be reported:
169
170 size[fs=/localscratch]
171
172 Initialization Value
173 An initialization value directive has a name which starts with a
174 dollar sign ($) and must be known to MOM via an internal table.
175 The entries in this table now are:
176
177 pbsserver
178 which defines hostnames running pbs_server that will be
179 allowed to submit jobs, issue Resource Monitor (RM)
180 requests, and get status updates. MOM will continually
181 attempt to contact all server hosts for node status and
182 state updates. Like $PBS_SERVER_HOME/server_name, the
183 hostname may be followed by a colon and a port number.
184 This parameter replaces the oft-confused $clienthost
185 parameter from TORQUE 2.0.0p0 and earlier. Note that the
186 hostname in $PBS_SERVER_HOME/server_name is used if no
187 $pbsserver parameters are found.
188
189 pbsclient
190 which causes a host name to be added to the list of hosts
191 which will be allowed to connect to MOM as long as they
192 are using a privileged port for the purposes of resource
193 monitor requests. For example, here are two configura‐
194 tion file lines which will allow the hosts "fred" and
195 "wilma" to connect:
196
197 $pbsclient fred
198 $pbsclient wilma
199
200 Two host names are always allowed to connect to
201 pbs_mom, "localhost" and the name returned to pbs_mom by
202 the system call gethostname(). These names need not be
203 specified in the configuration file. The hosts listed as
204 "clients" can issue Resource Monitor (RM) requests.
205 Other MOM nodes and servers do not need to be listed as
206 clients.
207
208 restricted
209 which causes a host name to be added to the list of hosts
210 which will be allowed to connect to MOM without needing
211 to use a privileged port. These names allow for wildcard
212 matching. For example, here is a configuration file line
213 which will allow queries from any host from the domain
214 "ibm.com".
215
216 $restricted *.ibm.com
217
218 The restriction which applies to these connections is
219 that only internal queries may be made. No resources
220 from a config file will be found. This is to prevent any
221 shell commands from being run by a non-root process.
222 This parameter is generally not required except for some
223 versions of OSX.
224
225 logevent
226 which sets the mask that determines which event types are
227 logged by pbs_mom. For example:
228
229 $logevent 0x1ff
230 $logevent 255
231
232 The first example would set the log event mask to 0x1ff
233 (511) which enables logging of all events including debug
234 events. The second example would set the mask to 0x0ff
235 (255) which enables all events except debug events.
236
237 cputmult
238 which sets a factor used to adjust cpu time used by a
239 job. This is provided to allow adjustment of time
240 charged and limits enforced where the job might run on
241 systems with different cpu performance. If MOM's system
242 is faster than the reference system, set cputmult to a
243 decimal value greater than 1.0. If MOM's system is
244 slower, set cputmult to a value between 0.0 and 1.0. For
245 example:
246
247 $cputmult 1.5
248 $cputmult 0.75
249
250 usecp specifies which directories should be staged with cp
251 instead of rcp/scp. If a shared filesystem is available
252 on all hosts in a cluster, this directive is used to make
253 these filesystems known to MOM. For example, if /home is
254 NFS mounted on all nodes in a cluster:
255
256 $usecp *:/home /home
257
258 wallmult
259 which sets a factor used to adjust wall time usage by a
260 job to a common reference system. The factor is used for
261 walltime calculations and limits in the same way that
262 cputmult is used for cpu time.
263
264 configversion
265 specifies the version of the config file data, a string.
266
267 check_poll_time
268 specifies the MOM interval in seconds. MOM checks each
269 job for updated resource usages, exited processes, over-
270 limit conditions, etc. once per interval. This value
271 should be less than or equal to pbs_server's job_stat_rate.
272 High values result in stale information reported to
273 pbs_server. Low values result in increased system usage
274 by MOM. Default is 45 seconds.
275
276 down_on_error
277 causes MOM to report itself as state "down" to pbs_server
278 in the event of a failed health check. This feature is
279 EXPERIMENTAL and likely to be removed in the future. See
280 HEALTH CHECK below.
281
282 ideal_load
283 ideal processor load. Represents a low water mark for
284 the load average. A node that is currently busy will
285 consider itself free after falling below ideal_load.
286
287 loglevel
288 specifies the verbosity of logging with higher numbers
289 specifying more verbose logging. Values may range
290 between 0 and 7.
291
292 log_file_max_size
293 If this is set to a value > 0 then pbs_mom will roll
294 the current log file to log-file-name.1 when its size is
295 greater than or equal to the value of
296 log_file_max_size. This value is interpreted as kilo‐
297 bytes.
298
299 log_file_roll_depth
300 If this is set to a value >=1 and log_file_max_size is
301 set then pbs_mom will continue rolling the log files to
302 log-file-name.log_file_roll_depth.
303
304 max_load
305 maximum processor load. Nodes over this load average are
306 considered busy (see ideal_load above).
307
308 enablemomrestart
309 enable automatic restarts of MOM. If enabled, MOM will
310 check if its binary has been updated and restart itself
311 at a safe point when no jobs are running, thus making
312 upgrades easier. The check is made by comparing the
313 mtime of the pbs_mom executable. Command-line args, the
314 process name, and the PATH env variable are preserved
315 across restarts. It is recommended that this not be
316 enabled in the config file, but enabled when desired with
317 momctl (see RESOURCES for more information.)
318
319 node_check_script
320 specifies the fully qualified pathname of the health
321 check script to run (see HEALTH CHECK for more informa‐
322 tion).
323
324 node_check_interval
325 specifies the number of MOM intervals between subsequent
326 executions of the specified health check. This value
327 defaults to 1, indicating the check is run every MOM
328 interval (see HEALTH CHECK for more information).
329
330 prologalarm
331 Specifies the maximum duration (in seconds) which MOM
332 will wait for the job prolog or job epilog to complete.
333 This parameter defaults to 300 seconds (5 minutes).
334
335 rcpcmd Specify the full path and arguments to be used for
336 remote file copies. This overrides the compile-time
337 default found in configure. This must contain 2 words:
338 the full path to the command and the switches. The copy
339 command must be able to recursively copy files to the
340 remote host and accept arguments of the form
341 "user@host:files". For example:
342
343 $rcpcmd /usr/bin/rcp -rp
344 $rcpcmd /usr/bin/scp -rpB
345
346 remote_reconfig
347 Enables the ability to remotely reconfigure pbs_mom with
348 a new config file. Default is disabled.
349
350 timeout
351 Specifies the number of seconds before TCP messages will
352 time out. TCP messages include job obituaries, and TM
353 requests if RPP is disabled. Default is 60 seconds.
354
355 tmpdir Sets the directory basename for a per-job temporary
356 directory. Before job launch, MOM will append the jobid
357 to the tmpdir basename and create the directory. After
358 the job exits, MOM will recursively delete it. The env
359 variable TMPDIR will be set for all pro/epilog scripts,
360 the job script, and TM tasks.
361 Directory creation and removal is done as the job owner
362 and group, so the owner must have write permission to
363 create the directory. If the directory already exists
364 and is owned by the job owner, it will not be deleted
365 after the job. If the directory already exists and is
366 NOT owned by the job owner, the job start will be
367 rejected.
368
369 status_update_time
370 specifies (in seconds) how often MOM updates its status
371 information to pbs_server. This value should correlate
372 with the server's scheduling interval. Low values
373 increase the load on pbs_server and the network. High
374 values cause pbs_server to report stale information.
375 Default is 45 seconds.
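
The way $ideal_load and $max_load work together, as described in their entries above, can be read as a simple hysteresis. The Python sketch below shows that reading. The function name next_busy_state is hypothetical, and keeping the previous state while the load sits between the two thresholds is an interpretation of "low water mark", not something the text states outright.

```python
def next_busy_state(busy, load, ideal_load, max_load):
    """Return True if the node should be considered busy after this sample."""
    # Over max_load: the node is considered busy.
    if load > max_load:
        return True
    # A busy node becomes free only after falling below ideal_load.
    if busy and load < ideal_load:
        return False
    # Assumption: between the two thresholds the previous state is kept.
    return busy
```

For instance, with ideal_load 2.0 and max_load 4.0, a node that went busy at a load of 4.2 stays busy at 3.0 and is free again at 1.5.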
376
377 The configuration file must be "secure". It must be owned by a user id
378 and group id less than 10 and not be world writable.
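
An administrator might want to test a configuration file against this rule before starting MOM. The Python sketch below applies the three stated conditions to a file's ownership and mode. It is an illustration only: the names is_secure and check_file are hypothetical, and pbs_mom's own check may look at more than this.

```python
import os
import stat

def is_secure(uid, gid, mode):
    """Apply the stated rule: uid and gid below 10, not world writable."""
    return uid < 10 and gid < 10 and not (mode & stat.S_IWOTH)

def check_file(path):
    """Look up a file's ownership and mode, then apply is_secure."""
    st = os.stat(path)
    return is_secure(st.st_uid, st.st_gid, st.st_mode)
```

A file owned by root (uid 0, gid 0) with mode 0644 passes, while the same file with mode 0646 fails because of the world-write bit.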
379
380 RESOURCES
381 Resource Monitor queries can be made with momctl's -q option to
382 retrieve and set pbs_mom options. Any configured static resource may
383 be retrieved with a request of the same name. These are resource
384 requests not otherwise documented in the PBS ERS.
385
386 cycle forces an immediate MOM cycle
387
388 status_update_time
389 retrieve or set the $status_update_time parameter
390
391 check_poll_time
392 retrieve or set the $check_poll_time parameter
393
394 configversion
395 retrieve the config version
396
397 jobstartblocktime
398 retrieve or set the $jobstartblocktime parameter
399
400 enablemomrestart
401 retrieve or set the $enablemomrestart parameter
402
403 loglevel
404 retrieve or set the $loglevel parameter
405
406 down_on_error
407 retrieve or set the EXPERIMENTAL $down_on_error parameter
408
409 diag0 - diag4
410 retrieves various diagnostic information
411
412 rcpcmd retrieve or set the $rcpcmd parameter
413
414 version
415 retrieves the pbs_mom version
416
417 HEALTH CHECK
418 The health check script is executed directly by the pbs_mom daemon
419 under the root user id. It must be accessible from the compute node and
420 may be a script or compiled executable program. It may make any needed
421 system calls and execute any combination of system utilities but should
422 not execute resource manager client commands. Also, as of TORQUE
423 1.0.1, the pbs_mom daemon blocks until the health check is completed
424 and does not possess a built-in timeout. Consequently, it is advisable
425 to keep the launch script execution time short and verify that the
426 script will not block even under failure conditions.
427
428 If the script detects a failure, it should return the keyword 'ERROR'
429 to stdout followed by an error message. The message (up to 256 charac‐
430 ters) immediately following the ERROR string will be assigned to the
431 node attribute 'message' of the associated node.
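
A health check along these lines could be sketched in Python as shown below. Everything specific in it is an assumption made for illustration: the disk-space test, its threshold, the function names, and the "; " separator between problems. Only the ERROR keyword on stdout and the 256-character message limit come from the text above.

```python
import shutil

MAX_MESSAGE = 256  # the text above says up to 256 characters are kept

def health_line(problems):
    """Return the line to print on stdout, or "" when the node is healthy."""
    if not problems:
        return ""
    return "ERROR " + "; ".join(problems)[:MAX_MESSAGE]

def find_problems(path="/", min_free_bytes=64 * 1024 * 1024):
    """Hypothetical check: flag a nearly full filesystem."""
    problems = []
    if shutil.disk_usage(path).free < min_free_bytes:
        problems.append("low disk space on " + path)
    return problems

if __name__ == "__main__":
    line = health_line(find_problems())
    if line:
        print(line)
```

The check itself is a single local call, which keeps the run time short and avoids anything that could block, in keeping with the advice above.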
432
433 TORQUE currently ignores ERROR messages by default, but advanced
434 schedulers like Moab can be configured to react appropriately.
435
436 If the experimental $down_on_error MOM setting is enabled, MOM will set
437 itself to state down and report to pbs_server; and pbs_server will
438 report the node as "down". Additionally, the experimental
439 "down_on_error" server attribute can be enabled which has the same
440 effect but moves the decision to pbs_server. It is redundant to have
441 MOM's $down_on_error and pbs_server's down_on_error features enabled.
442 See "down_on_error" in pbs_server_attributes(7B).
443
444 FILES
445 $PBS_SERVER_HOME/server_name
446 contains the hostname running pbs_server.
447
448 $PBS_SERVER_HOME/mom_priv
449 the default directory for configuration files, typically
450 (/usr/spool/pbs)/mom_priv.
451
452 $PBS_SERVER_HOME/mom_logs
453 directory for log files recorded by the server.
454
455 $PBS_SERVER_HOME/mom_priv/prologue
456 the administrative script to be run before job execution.
457
458 $PBS_SERVER_HOME/mom_priv/epilogue
459 the administrative script to be run after job execution.
460
461 SIGNAL HANDLING
462 pbs_mom handles the following signals:
463
464 SIGHUP causes pbs_mom to re-read its configuration file, close and
465 reopen the log file, and reinitialize resource structures.
466
467 SIGALRM
468 results in a log file entry. The signal is used to limit the
469 time taken by certain child processes, such as the prologue
470 and epilogue.
471
472 SIGINT and SIGTERM
473 result in pbs_mom exiting without terminating any running jobs.
474 This is the action for the following signals as well: SIGXCPU,
475 SIGXFSZ, SIGCPULIM, and SIGSHUTDN.
476
477 SIGUSR1, SIGUSR2
478 cause MOM to increase and decrease logging levels,
479 respectively.
480
481 SIGPIPE, SIGINFO
482 are ignored.
483
484 SIGBUS, SIGFPE, SIGILL, SIGTRAP, and SIGSYS
485 cause a core dump if the PBSCOREDUMP environment variable is
486 defined.
488
489 All other signals have their default behavior installed.
490
491 EXIT STATUS
492 If the mini-server command fails to begin operation, the server exits
493 with a value greater than zero.
494
495 SEE ALSO
496 pbs_server(8B), pbs_scheduler_basl(8B), pbs_scheduler_tcl(8B), the PBS
497 External Reference Specification, and the PBS Administrator's Guide.
498
499
500
501Local pbs_mom(8B)