lamssi_boot(7) LAM SSI BOOT OVERVIEW lamssi_boot(7)



LAM SSI boot - overview of LAM's boot SSI modules

The "kind" for boot SSI modules is "boot". Specifically, the string
"boot" (without the quotes) is the prefix used on arguments when passing
values to boot modules at run time. For example:

lamboot -ssi boot rsh hostfile
    Specifies that the "rsh" boot module should be used, and runs
    lamboot across all the nodes listed in the file hostfile.

LAM currently has several boot modules: bproc, globus, rsh (which
includes ssh), slurm, and tm.

The LAM/MPI User's Guide contains much detail about all of the boot
modules. All users are strongly encouraged to read it. This man page
is a summary of the available information.

Only one boot module may be selected per command execution. Hence, the
selection of which module to use occurs once, when a given command
initializes. Once the module is chosen, it is used for the duration of
the program run.

In most cases, LAM will automatically select the "best" module at run
time. LAM will query all available modules at run time to obtain a
list of priorities. The module with the highest priority will be used.
If multiple modules return the same priority, LAM will select one at
random. Priorities are in the range of 0 to 100, with 0 being the
lowest priority and 100 being the highest. At run time, each module
will examine the run-time environment and return a priority value that
is appropriate.

For example, when running a PBS job, the tm module will return a
sufficiently high priority value such that it will be selected and the
other available modules will not.
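
The selection rule described above can be modeled in a few lines of
shell. This is only an illustrative sketch, not LAM's implementation:
the pick_boot_module function and its "name:priority" input format are
hypothetical, and ties fall to whatever order sort produces, whereas LAM
picks one of the tied modules at random.

```shell
# Hypothetical model of priority-based selection (not LAM code).
# Input: one "name:priority" pair per argument; output: the winning name.
pick_boot_module() {
    printf '%s\n' "$@" |
        sort -t: -k2,2nr |   # highest priority first
        head -n 1 |
        cut -d: -f1
}

# Priorities as described in this page for a PBS job:
# tm reports 50, rsh reports its default of 0.
selected=$(pick_boot_module rsh:0 tm:50)
echo "$selected"
```

With these inputs the tm module wins because 50 is greater than 0, which
matches the PBS example in the text.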

Most modules accept run-time parameters that override the priorities
they return, which changes the order (and therefore the ultimate
selection) of the available boot modules. See below.

Alternatively, the user may select a specific module by specifying a
value for the boot parameter (either by environment variable or by the
-ssi command line parameter). In this case, no other modules will be
queried by LAM. If the named module returns a valid priority, it will
be used. For example:

lamboot -ssi boot rsh hostfile
    Tells LAM to query only the rsh boot module and see if it is
    available to run.

If the boot module that is selected is unable to run (e.g., attempting
to use the tm boot module when not running in a PBS job), an
appropriate error message will be printed and execution will abort.
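
The environment-variable route mentioned above might look like the
following sketch. It assumes the variable follows the
LAM_MPI_SSI_name=value form described in this page with the name "boot";
"hostfile" is a placeholder boot schema, and lamboot is only invoked if
it is installed and that file exists.

```shell
# Select the rsh boot module through the environment rather than -ssi.
LAM_MPI_SSI_boot=rsh
export LAM_MPI_SSI_boot

# Only run lamboot when it is installed and the placeholder schema exists.
if command -v lamboot >/dev/null 2>&1 && [ -f hostfile ]; then
    lamboot hostfile
fi
```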

As with all SSI modules, it is possible to pass parameters at run time.
This section discusses the built-in LAM boot modules, as well as the
run-time parameters that they accept.

In the discussion below, parameters to boot modules are discussed in
terms of name and value. The name and value may be specified as command
line arguments to the lamboot, lamgrow, recon, and lamwipe commands with
the -ssi switch, or they may be set in environment variables of the form
LAM_MPI_SSI_name=value. Note that using the -ssi command line switch
will take precedence over any previously-set environment variables.
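
As a sketch of the precedence rule, the following sets a parameter in
the environment and then overrides it for a single invocation. The
parameter used is boot_rsh_agent, which is described later in this page;
"hostfile" is a placeholder, and the lamboot call is skipped when LAM is
not installed.

```shell
# Environment default: use ssh as the remote agent.
LAM_MPI_SSI_boot_rsh_agent=ssh
export LAM_MPI_SSI_boot_rsh_agent

# For this one invocation the -ssi switch takes precedence over the
# environment, so rsh is used instead of ssh.
if command -v lamboot >/dev/null 2>&1 && [ -f hostfile ]; then
    lamboot -ssi boot_rsh_agent rsh hostfile
fi
```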

bproc Boot Module
The bproc boot module uses native bproc functionality (e.g., the
bproc_execmove library call) to launch jobs on slave nodes from the
head node. Checks are made before launching to ensure that the nodes
are available and are "owned" by the user and/or the user's group.
Appropriate error messages will be displayed if the user is unable to
execute on the target nodes.

Hostnames should be specified using bproc notation: -1 indicates the
head node, and integer numbers starting with 0 represent slave nodes.
The string "localhost" will automatically be converted to "-1".
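
A boot schema using this notation might be written as in the sketch
below. The file location and the choice of three slave nodes are
assumptions made for illustration, as is the one-node-per-line layout;
consult the User's Guide for the exact schema syntax.

```shell
# Write a sample bproc boot schema: head node (-1) plus slave nodes 0-2.
schema="${TMPDIR:-/tmp}/bproc_hostfile.$$"
cat > "$schema" <<'EOF'
-1
0
1
2
EOF

# The bproc module is only usable from the bproc head node.
if command -v lamboot >/dev/null 2>&1; then
    lamboot -ssi boot bproc "$schema"
fi
```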

The default behavior is to mark the bproc head node as
"non-schedulable", meaning that the expansion of "N" and "C" when used
with mpirun and lamexec will exclude the bproc head node. For example,
"mpirun C my_mpi_program" will run copies of my_mpi_program on all
lambooted slave nodes, but not the bproc head node.

Note that the bproc boot module is only usable from the bproc head
node.

The bproc boot module has only one tunable parameter:

boot_bproc_priority
    Using the priority argument can override LAM's automatic run-time
    boot module selection algorithms. This parameter only has effect
    when the bproc module is eligible to be run (i.e., when running on
    a bproc cluster).

See the bproc notes in the user documentation for more details.

globus Boot Module
The globus boot module uses the globus-job-run command to launch
executables on remote nodes. It is currently limited to only allowing
jobs that can use the fork job manager on the Globus gatekeeper. Other
job managers are not yet supported.

LAM will effectively never select the globus boot module by default
because it has an extremely low default priority; it must be manually
selected with the boot SSI parameter or have its priority raised.
Additionally, LAM must be able to find the globus-job-run command in
your PATH.

The boot schema requires hosts to be listed as the Globus contact
string. For example:

"host1:port1:/O=xxx/OU=yyy/CN=aaa bbb ccc"

Note the use of quotes because the CN includes spaces -- the entire
contact name must be enclosed in quotes. Additionally, since
globus-job-run does not invoke the user's "dot" files on the remote
nodes, no PATH or environment is set up. Hence, the attribute
lam_install_path must be specified for each contact string in the
hostfile so that LAM knows where to find its executables on the remote
nodes. For example:

"host1:port1:/O=xxx/OU=yyy/CN=aaa bbb ccc" lam_install_path=/home/lam

The globus boot module has only one tunable parameter:

boot_globus_priority
    Using the priority argument can override LAM's automatic run-time
    boot module selection algorithms.
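
The two ways of getting the globus module chosen, together with the PATH
requirement, can be sketched as follows. The globus_ready variable is a
hypothetical helper, "hostfile" is a placeholder schema of contact
strings, and 75 is an arbitrary value within the 0 to 100 range.

```shell
# Check the PATH requirement before attempting a Globus boot.
if command -v globus-job-run >/dev/null 2>&1; then
    globus_ready=yes
else
    globus_ready=no
fi

if [ "$globus_ready" = yes ] && command -v lamboot >/dev/null 2>&1; then
    # Either select the module outright ...
    lamboot -ssi boot globus hostfile
    # ... or raise its priority instead:
    # lamboot -ssi boot_globus_priority 75 hostfile
else
    echo "globus-job-run or lamboot not found in PATH; skipping" >&2
fi
```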

rsh Boot Module
The rsh boot module uses rsh or ssh (or any other command line agent
that acts like rsh/ssh) to launch executables on remote nodes. It
requires that executables can be started on remote nodes without being
prompted for a password, and without outputting anything to stderr.

The rsh boot module is always available, and unless overridden, always
assigns itself a priority of 0.

The rsh module accepts a few run-time parameters:

boot_rsh_agent
    Used to override the compiled-in default remote agent program that
    was selected when LAM was compiled. For example, this parameter can
    be set to use "ssh" if LAM was compiled to use "rsh" by default.
    Previous versions of LAM/MPI used the LAMRSH environment variable
    for this purpose. While the LAMRSH environment variable still
    works, its use is deprecated in favor of the boot_rsh_agent SSI
    module argument.

boot_rsh_priority
    Using the priority argument can override LAM's automatic run-time
    boot module selection algorithms.

boot_rsh_username
    If the user has a different username on the remote machine, this
    parameter can be used to pass the -l argument to the underlying
    remote agent. Note that this is a coarse-grained control -- this
    one username will be used for all remote nodes. If more
    fine-grained control is required, the username should be specified
    in the boot schema file on a per-host basis.
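
A combined use of these parameters might look like the sketch below. The
username jdoe and the file name "hostfile" are placeholders, and the
example assumes that the -ssi switch may be repeated on one command
line.

```shell
agent=ssh          # remote agent to use instead of the compiled-in default
remote_user=jdoe   # hypothetical login name on the remote nodes

# Boot with the rsh module, using ssh and a single remote username
# for every node in the schema.
if command -v lamboot >/dev/null 2>&1 && [ -f hostfile ]; then
    lamboot -ssi boot rsh \
            -ssi boot_rsh_agent "$agent" \
            -ssi boot_rsh_username "$remote_user" \
            hostfile
fi
```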

slurm Boot Module
The slurm boot module uses the srun command to launch the LAM daemons
in a SLURM execution environment. When it detects that it is running
under SLURM, it automatically sets its priority to 50. It can be used
in two different modes: batch (where a script is submitted to SLURM and
it is run on the first node in the node allocation) and allocate (where
the -A option is used to srun to obtain an interactive allocation).
The slurm boot module does not support running in a script that is
launched by SLURM on all nodes in an allocation.

No boot schema file is required when using the slurm boot module; LAM
will automatically determine the host and CPU count from SLURM itself.
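
The body of a batch-mode script could be sketched as follows. Checking
SLURM_JOB_ID (or the older SLURM_JOBID) is an assumption about how to
detect an allocation from the shell and is not necessarily the test LAM
itself performs; my_mpi_program is a placeholder.

```shell
# Detect a SLURM allocation (assumed to export one of these variables).
if [ -n "${SLURM_JOB_ID:-${SLURM_JOBID:-}}" ]; then
    in_slurm=yes
else
    in_slurm=no
fi

if [ "$in_slurm" = yes ] && command -v lamboot >/dev/null 2>&1; then
    lamboot -ssi boot slurm      # no boot schema argument is needed
    mpirun C my_mpi_program      # placeholder MPI program
    lamhalt                      # shut down the LAM run-time environment
else
    echo "not in a SLURM allocation, or lamboot not installed; skipping" >&2
fi
```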

The slurm boot module has only one tunable parameter:

boot_slurm_priority
    Using the priority argument can override LAM's automatic run-time
    boot module selection algorithms. This parameter only has effect
    when the slurm module is eligible to be run (i.e., when running in
    a SLURM allocation).

tm Boot Module
The tm boot module uses the Task Management (TM) interface to launch
executables on remote nodes. Currently, OpenPBS and PBSPro are the only
two systems that implement the TM interface. Hence, when LAM detects
that it is running in a PBS job, it will automatically set the tm
priority to 50. When not running in a PBS job, the tm module will not
be available.

The tm boot module has only one tunable parameter:

boot_tm_priority
    Using the priority argument can override LAM's automatic run-time
    boot module selection algorithms. This parameter only has effect
    when the tm module is eligible to be run (i.e., when running in a
    PBS job).
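
Priority parameters can also be used to steer selection away from tm
inside a PBS job. The sketch below raises the rsh priority above tm's 50
so that rsh is chosen instead. The value 75 is arbitrary, and passing
the PBS_NODEFILE file as the boot schema is an assumption for
illustration.

```shell
rsh_priority=75   # arbitrary value above the tm module's priority of 50

# Inside a PBS job, prefer rsh over tm by outranking it.
if [ -n "${PBS_NODEFILE:-}" ] && command -v lamboot >/dev/null 2>&1; then
    lamboot -ssi boot_rsh_priority "$rsh_priority" "$PBS_NODEFILE"
fi
```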

lamssi(7), mpirun(1), LAM User's Guide



LAM 7.1.2 March, 2006 lamssi_boot(7)