LAMBOOT(1)                        LAM TOOLS                        LAMBOOT(1)

lamboot - Start a LAM multicomputer.

lamboot [-b] [-d] [-h] [-H] [-l] [-s] [-v] [-V] [-x] [-nn] [-np]
        [-c <conf file>] [-prefix </lam/install/path/>]
        [-sessionprefix <value>] [-sessionsuffix <value>]
        [-withlamprefixpath <value>] [-ssi <key> <value>] [<bhost>]

-b      Assume local and remote shell are the same. This means that
        only one remote shell invocation is used to each node. If -b
        is not used, two remote shell invocations are used to each
        node.

-d      Turn on debugging output. This implies -v.

-h      Print the command help menu.

-l      Delay hostname-to-IP-address resolution.

-prefix Use the LAM installation specified in </lam/install/path/>.
        Not compatible with LAM/MPI versions prior to 7.1.

-s      Close stdio on the local node.

-ssi <key> <value>
        Send arguments to various SSI modules. See the "SSI" section,
        below.

-v      Be verbose.

-x      Run in fault tolerant mode.

-H      Do not display the command header.

-nn     Do not add "-n" to the remote agent command line.

-np     Do not force the execution of $HOME/.profile on remote hosts.

-session-prefix <value>
        Set the session prefix, overriding LAM_MPI_SESSION_PREFIX.

-session-suffix <value>
        Set the session suffix, overriding LAM_MPI_SESSION_SUFFIX.

-withlamprefixpath <value>
        Override the internal installation path. For internal use
        only; do not use unless you know what you are doing.

LAM_MPI_SESSION_PREFIX

LAM_MPI_SESSION_SUFFIX
        It is possible to change the session directory used by
        LAM/MPI, which normally has the form:

        <tmpdir>/lam-<username>@<hostname>[-<suffix>]

        <tmpdir> will be set to LAM_MPI_SESSION_PREFIX if set.
        Otherwise, it will fall back to the value of TMPDIR. If
        neither of these is set, the default is /tmp.

        <suffix> can be overridden by the LAM_MPI_SESSION_SUFFIX
        environment variable. If LAM_MPI_SESSION_SUFFIX is not set and
        LAM is running under a supported batch scheduling system,
        <suffix> will be a value unique to the currently running job.

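The naming rule above can be sketched as a small shell function. This is an illustration only, not code taken from LAM. The function name lam_session_dir is hypothetical, the user and host names are taken from id -un and uname -n (LAM may derive them differently), an empty variable is treated the same as an unset one, and the suffix chosen under a batch scheduler is not modelled.

```shell
# Hypothetical helper: print the session directory that the rule
# described above would produce in the current environment.
lam_session_dir() {
    # LAM_MPI_SESSION_PREFIX wins, then TMPDIR, then /tmp.
    lam_tmpdir="${LAM_MPI_SESSION_PREFIX:-${TMPDIR:-/tmp}}"

    # Assumption: user and host names as reported by id and uname.
    lam_dir="${lam_tmpdir}/lam-$(id -un)@$(uname -n)"

    # The suffix is appended only when LAM_MPI_SESSION_SUFFIX is set.
    if [ -n "${LAM_MPI_SESSION_SUFFIX:-}" ]; then
        lam_dir="${lam_dir}-${LAM_MPI_SESSION_SUFFIX}"
    fi

    printf '%s\n' "$lam_dir"
}
```

Calling lam_session_dir with different values exported for the two variables shows how the prefix and suffix change the resulting path.
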
The lamboot tool starts the LAM software on each of the machines
specified in the boot schema, <bhost>. The boot schema specifies the
hostnames of nodes to be used in the run-time MPI environment, and
optionally lists how many CPUs LAM may use on each node. The user may
wish to first run the recon(1) tool to verify that LAM can be started.

Starting LAM is a three-step procedure. In the first step, hboot(1) is
invoked on each of the specified machines. In the second step, each
machine allocates a dynamic port and communicates it back to lamboot,
which collects them. In the third step, lamboot gives each machine the
list of machines/ports in order to form a fully connected topology. If
any machine was not able to start, or if a timeout period expires
before the first step completes, lamboot invokes lamwipe(1) to
terminate LAM and reports the error.

The <bhost> file is a LAM boot schema written in the host file syntax.
See bhost(5). Instead of the command line, a boot schema can be
specified in the LAMBHOST environment variable. Otherwise a default
file, lam-bhost.def, is used. LAM searches for <bhost> first in the
local directory and then in the installation directory under etc/.

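The selection and search order just described can be sketched in shell. This is a rough model, not LAM's own lookup code. The function name resolve_bhost is hypothetical, the installation directory is passed in as an assumed second parameter, and the handling of absolute paths is an assumption of the sketch rather than something the text above states.

```shell
# Hypothetical helper: print the boot schema path that the search order
# described above would select, or return non-zero if none is found.
#   $1  boot schema named on the command line (may be empty)
#   $2  LAM installation directory (an assumed parameter of this sketch)
resolve_bhost() {
    # Command line first, then LAMBHOST, then the default file name.
    bhost_name="${1:-${LAMBHOST:-lam-bhost.def}}"

    case "$bhost_name" in
        /*)
            # Absolute paths are used as given (assumption of this sketch).
            [ -f "$bhost_name" ] && printf '%s\n' "$bhost_name" && return 0
            ;;
        *)
            # Local directory first, then <install dir>/etc.
            for bhost_dir in . "$2/etc"; do
                if [ -f "$bhost_dir/$bhost_name" ]; then
                    printf '%s\n' "$bhost_dir/$bhost_name"
                    return 0
                fi
            done
            ;;
    esac
    return 1
}
```

For example, resolve_bhost "" /opt/lam (with /opt/lam standing in for the real installation directory) prints the path of whichever lam-bhost.def would be found first.
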
In addition, lamboot uses a process schema for the individual LAM
nodes. A process schema (see conf(5)) is a description of the
processes which constitute the operating system on a node. In general,
the system administrator maintains this file -- LAM/MPI users will
generally not need to change this file. It is also possible for the
user to customize the LAM software with a private process schema.

The bhost file
The format of the <bhost> file is documented in the bhost(5) man page.

lamboot will resolve all names in <bhost> on the node on which lamboot
was invoked (the origin node). After that, LAM will only use IP
addresses, not names. Specifically, the name resolution configuration
on all other nodes is not used. Hence, the origin node must be able to
resolve all the names in <bhost> to addresses that are reachable by
all other nodes.

A common mistake is to list localhost (or any name that resolves to the
special address 127.0.0.1 -- the loopback TCP/IP device) in a <bhost>
file that contains other nodes. In this case, the address 127.0.0.1
would be sent to each of the other nodes as the address of the origin
node. If the other nodes try to use 127.0.0.1 to contact the origin
node, they will actually be contacting themselves, and will eventually
time out and fail.

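A pre-flight check for this mistake might look like the following sketch. It is not part of LAM. The function name check_bhost_loopback is hypothetical, and the parsing assumes that "#" starts a comment and that the first field of each remaining line is the host name (see bhost(5) for the authoritative syntax). Where the getent utility is available it is used to resolve names; otherwise only the literal name localhost and 127.x.x.x addresses are caught.

```shell
# Hypothetical helper: warn when a boot schema with more than one node
# contains an entry that points at the loopback address.
#   $1  path to a boot schema file
check_bhost_loopback() {
    # Strip comments, keep the first field of each non-empty line.
    hosts=$(awk '{ sub(/#.*/, "") } NF > 0 { print $1 }' "$1")

    total=0
    bad=""
    for host in $hosts; do
        total=$((total + 1))
        addr=""
        case "$host" in
            localhost|127.*)
                addr="127.0.0.1"
                ;;
            *)
                # Resolve on this (origin) node when getent is available.
                if command -v getent >/dev/null 2>&1; then
                    addr=$(getent ahostsv4 "$host" 2>/dev/null |
                           awk 'NR == 1 { print $1 }')
                fi
                ;;
        esac
        case "$addr" in
            127.*) bad="$bad $host" ;;
        esac
    done

    # A single-node schema may legitimately use localhost.
    if [ "$total" -gt 1 ] && [ -n "$bad" ]; then
        echo "loopback entries in multi-node boot schema:$bad"
        return 1
    fi
    return 0
}
```

Running check_bhost_loopback hostfile before lamboot hostfile gives an early warning instead of a timeout during boot.
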
The IP addresses obtained from <bhost> are used for LAM's meta
messages: startup and shutdown of jobs, out-of-band messages used for
coordination, etc. The amount of traffic is fairly low (unless using
the "lamd" mode of MPI message passing, in which case all MPI traffic
will also utilize LAM's meta messages for transport -- see mpirun(1)).
When using the TCP RPI, these IP addresses are also used for MPI
message passing via direct sockets between each pair of nodes.

A common case is where a "master" node has multiple network interface
cards (NICs) -- one that is connected to a public network, and one that
is connected to a private network where parallel jobs are to be run.
To include the master node in a <bhost> file, the IP name (or address)
of the NIC on the private network should be listed in <bhost>. This
ensures that all the other nodes can reach the master node on the
private network.

As another example, some configurations have multiple TCP/IP NICs in
each node of a parallel job. One NIC is considered "slow" (e.g.,
10Mbps), while the other is considered "fast" (e.g., 100Mbps). It is
desirable to allow LAM to take advantage of the higher bandwidth on the
"fast" network for MPI messages. As such, <bhost> should list the IP
names (or addresses) of all the "fast" NICs. However, if the LAM RPI
does not use TCP/IP (e.g., the Myrinet/GM RPI), the <bhost> file should
probably list the "slow" NICs so that LAM's meta message traffic does
not add overhead on the "fast" network and potentially detract from the
performance of other high-performance applications using it.

Delaying hostname lookups
Normally, name resolution of hostnames is done on the machine where
lamboot is invoked. This is done for optimization reasons, so that the
list of hostnames only needs to be resolved once (potentially
minimizing the amount of DNS or other hostname-lookup network traffic).

However, in some non-uniform networking environments, this is not
sufficient because each host may have a different IP address on each of
its peers. For example, host A may have address Z on host B, but have
address Y on host C.

The -l option to lamboot will cause LAM to distribute hostnames to each
node rather than a fully resolved set of IP addresses. Hence, each
node where LAM is booted will do its own name resolution on the list of
hostnames.

SSI (System Services Interface)
The -ssi switch allows the passing of parameters to various SSI
modules. LAM's SSI modules are described in detail in lamssi(7). SSI
modules have direct impact on MPI programs because they allow tunable
parameters to be set at run time (such as which boot device driver to
use, what parameters to pass to that driver, etc.).

The -ssi switch takes two arguments: <key> and <value>. The <key>
argument generally specifies which SSI module will receive the value.
For example, the <key> "boot" is used to select which boot module is
used for starting processes on remote nodes. The <value> argument is
the value that is passed. For example:

lamboot -ssi boot tm
    Tells LAM to use the "tm" boot module for native launching in
    PBSPro / OpenPBS environments (the tm boot module does not require
    a boot schema).

And so on. LAM's boot SSI modules are described in lamssi_boot(7).
This page should be consulted for specific actions that are taken by,
and how to tweak the run-time behavior of, each boot module.

The -ssi switch can be used multiple times to specify different <key>
and/or <value> arguments. If the same <key> is specified more than
once, the <value>s are concatenated with a comma (",") separating them.

Note that the -ssi switch is simply a shortcut for setting environment
variables. The same effect may be accomplished by setting the
corresponding environment variables before running lamboot. The form of
the environment variables that LAM sets is: LAM_MPI_SSI_<key>=<value>.

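As an illustration of this mapping, the sketch below turns a list of -ssi style <key> <value> pairs into the environment assignments described above, joining repeated keys with commas. The function name ssi_to_env is hypothetical and the code is not taken from LAM. It only prints the assignments, and it assumes that values contain no tab characters.

```shell
# Hypothetical helper: print LAM_MPI_SSI_<key>=<value> lines for the
# given <key> <value> pairs. Repeated keys are merged with commas.
ssi_to_env() {
    # Emit one "key<TAB>value" line per pair, then merge them in awk.
    while [ "$#" -ge 2 ]; do
        printf '%s\t%s\n' "$1" "$2"
        shift 2
    done | awk -F '\t' '
        # First time a key is seen: remember its position and value.
        !($1 in val) { order[++n] = $1; val[$1] = $2; next }
        # Key seen before: append the new value after a comma.
        { val[$1] = val[$1] "," $2 }
        END {
            for (i = 1; i <= n; i++)
                printf "LAM_MPI_SSI_%s=%s\n", order[i], val[order[i]]
        }
    '
}
```

With the keys from the example above, ssi_to_env boot rsh rsh_agent "ssh -x" prints the two variables that would need to be exported before running lamboot boot_schema to get the same effect.
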
Note that the -ssi switch overrides any previously set environment
variables. Also note that unknown <key> arguments are still set as
environment variables -- they are not checked (by lamboot) for
correctness. Illegal or incorrect <value> arguments may or may not be
reported -- it depends on the specific SSI module.

Remote Executable Invocation
All tweakable aspects of launching executables on remote nodes during
lamboot are discussed in lamssi(7) and lamssi_boot(7). Topics include
(but are not limited to): discovery of remote shell, run-time overrides
of the agent used to launch remote executables (e.g., rsh and ssh), etc.

Closing stdio
The stdio of each LAM daemon on a remote host that is launched by
lamboot is closed by default. Normally, the stdio of the LAM daemon
launched on the local host is left open so that the internal LAM
tstdio(3) package works properly. However, it is sometimes desirable to
close the stdio of the local LAM daemon as well. For example:

    rsh somenode lamboot -s hostfile

This is because rsh waits for two conditions before exiting: lamboot to
exit, and stdout / stderr to be closed. Without -s, stdout / stderr
would not be closed, and rsh (and ssh) would hang even though lamboot
has completed. -s causes the stdout / stderr of the local LAM daemon
to be closed upon invocation, which will allow rsh to complete. Using
-s will not affect lamboot in any other way, but it will prevent the
tstdio(3) package from working properly.

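The hang can be reproduced without LAM or rsh. The sketch below is an analogy only: a shell command substitution, like rsh, keeps reading until every process holding the output stream has closed it. Here sleep stands in for the local LAM daemon, and the function names with_stdio and without_stdio are hypothetical.

```shell
# The background child inherits stdout, so a reader of that stream
# keeps waiting until the child exits (about three seconds here).
with_stdio() {
    sh -c 'sleep 3 & echo started'
}

# The background child has its stdout and stderr redirected, so the
# reader sees end-of-file as soon as the foreground command exits.
without_stdio() {
    sh -c 'sleep 3 >/dev/null 2>&1 & echo started'
}
```

Timing out=$(with_stdio) against out=$(without_stdio) shows the difference: the first blocks until sleep finishes, while the second returns right away, which is the effect -s has on rsh in the example above.
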
Fault Tolerance
If the -x option is given, LAM runs in fault tolerant mode. In this
mode, nodes exchange "heart beat" messages periodically to make sure
all nodes are running and the links connecting them are operational.
When a node's heart beats stop, it is declared "dead" and all LAM
nodes (and processes) are notified. This allows users to write fault
tolerant applications that can degrade gracefully, or fully recover by
replacing the defunct node with another (see lamgrow(1)). Since this
mode introduces a performance penalty, it is not activated by default.

lamboot -v
    Start LAM on the machines described in the default boot schema.
    Report about important steps as they are done.

lamboot -d hostfile
    Start LAM on the machines described in file hostfile. Provide
    incredibly detailed reports on what is happening at each stage in
    the boot process.

lamboot mynodes
    Start LAM on the machines described in the boot schema mynodes.
    Operate silently.

laminstalldir/etc/lam-bhost.def
    Default boot schema file, where "laminstalldir" is the directory
    where LAM/MPI was installed.

laminstalldir/etc/lam-conf.lamd
    Default process schema file for LAM nodes.

recon(1), lamwipe(1), hboot(1), tstdio(3), bhost(5), conf(5),
lam-helpfile(5), lamssi(7), lamssi_boot(7)

LAM 7.1.2                        March, 2006                        LAMBOOT(1)