RECON(1) LAM TOOLS RECON(1)

NAME
recon - Check if LAM can be started.

SYNTAX
recon [-a] [-b] [-d] [-h] [-v] [-nn] [-np] [-ssi <key> <value>]
[<bhost>]

OPTIONS
-a Report all host errors.

-b Assume the local and remote shells are the same. Only one
remote shell invocation is then used for each node. If -b
is not used, two remote shell invocations are used for each
node.

-d Turn on debugging.

-h Print the command help menu.

-ssi <key> <value>
Send arguments to various SSI modules. See the "SSI" section,
below.

-v Be verbose.

-nn Do not add "-n" to the remote agent command line.

-np Do not force the execution of $HOME/.profile on remote hosts.

DESCRIPTION
In order for LAM to be started on a remote UNIX machine, several
requirements have to be fulfilled:

1) The machine must be reachable via the network.

2) The user must be able to execute commands remotely on the machine
with the default remote shell program that was chosen when LAM was
configured. This is usually rsh(1), but any remote shell program
is acceptable (such as ssh(1)). Note that remote host permissions
must be configured such that the remote shell program will not ask
for a password when a command is invoked on the remote host.

3) The remote user's shell must have a search path that will locate
LAM executables.

4) The remote shell's startup file must not print anything to
standard error when invoked non-interactively.

If any of these requirements is not met for any machine declared in
<bhost>, LAM will not be able to start. By running recon first, the
user can quickly identify and correct problems in the setup
that would prevent LAM from starting.
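
The four requirements above can also be checked by hand for a single
host. The following is a minimal sketch and not part of LAM:
check_host and REMOTE_SHELL are hypothetical names, and ssh with
BatchMode=yes is assumed as the remote shell agent so that a password
prompt becomes a failure instead of a hang.

```shell
#!/bin/sh
# Hypothetical helper, not a LAM command.
# REMOTE_SHELL is an assumption; substitute the agent LAM was configured with.
REMOTE_SHELL=${REMOTE_SHELL:-"ssh -o BatchMode=yes"}

check_host() {
    host=$1
    errfile=$(mktemp) || return 1

    # Requirements 1-3: the host is reachable, a command can be run
    # without a password prompt, and tkill is on the remote search path.
    if ! $REMOTE_SHELL "$host" 'command -v tkill' >/dev/null 2>"$errfile"; then
        echo "$host: remote execution failed or tkill not found in PATH"
        rm -f "$errfile"
        return 1
    fi

    # Requirement 4: nothing may be written to standard error.
    # Warnings printed by the agent itself also count as output here.
    if [ -s "$errfile" ]; then
        echo "$host: remote shell startup wrote to standard error"
        rm -f "$errfile"
        return 1
    fi

    rm -f "$errfile"
    echo "$host: ok"
}
```

Calling check_host with a host name taken from <bhost> prints one
status line and returns non-zero on failure. recon remains the
authoritative test; this sketch only mirrors the listed requirements.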

The local machine where recon is invoked must be one of the machines
specified in <bhost>.

The <bhost> file is a LAM boot schema written in the host file syntax.
See bhost(5). Instead of the command line, a boot schema can be
specified in the LAMBHOST environment variable. Otherwise a default
file, lam-bhost.def, is used. LAM searches for <bhost> first in the
local directory and then in the installation directory under etc/.
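
The lookup order described above can be sketched as a small shell
function. This is an illustration under stated assumptions, not LAM's
own code: resolve_bhost and LAM_ETC_DIR are hypothetical names, the
fallback path for LAM_ETC_DIR is a placeholder for laminstalldir/etc,
and the default file name is the one this page lists under
laminstalldir/etc.

```shell
#!/bin/sh
# Hypothetical helper, not a LAM command.
# LAM_ETC_DIR stands in for laminstalldir/etc; its default is an assumption.
resolve_bhost() {
    # Command-line argument first, then LAMBHOST, then the default file.
    name=${1:-${LAMBHOST:-lam-bhost.def}}
    etc_dir=${LAM_ETC_DIR:-/usr/local/lam/etc}

    case $name in
        /*) # An absolute path is used as given.
            [ -f "$name" ] && { echo "$name"; return 0; } ;;
        *)  # Local directory first, then the installation's etc/.
            [ -f "./$name" ] && { echo "./$name"; return 0; }
            [ -f "$etc_dir/$name" ] && { echo "$etc_dir/$name"; return 0; } ;;
    esac

    echo "boot schema '$name' not found" >&2
    return 1
}
```

With no argument and LAMBHOST unset, the function falls through to the
default file in the installation directory.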

recon tests each machine defined in <bhost> by attempting to execute
the tkill(1) command on it using its "pretend" option (no action is
taken). This test, if successful, indicates that all the requirements
listed above are met, and thus LAM can be started on the machine. If
the attempt is successful, the next machine is checked. If the
attempt fails, a descriptive error message is displayed and recon stops
unless the -a option is used, in which case recon continues checking
the remaining machines.
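
The stop-on-first-error behavior and the effect of -a can be pictured
as the following loop. This is a sketch of the control flow only, not
recon's implementation: check_all, probe_host, and BAD_HOSTS are
hypothetical names, and probe_host is a local stub standing in for the
remote tkill(1) test.

```shell
#!/bin/sh
# Stub for the per-host test: hosts listed in BAD_HOSTS "fail".
probe_host() {
    case " ${BAD_HOSTS:-} " in
        *" $1 "*) return 1 ;;
        *)        return 0 ;;
    esac
}

# check_all <report_all> <host>...
# report_all=yes mimics -a: keep going after a failure.
check_all() {
    report_all=$1
    shift
    rc=0
    for host in "$@"; do
        if probe_host "$host"; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            rc=1
            # Without -a, stop at the first failing host.
            [ "$report_all" = yes ] || break
        fi
    done
    return $rc
}
```

With BAD_HOSTS="n2", "check_all no n1 n2 n3" stops after n2, while
"check_all yes n1 n2 n3" also reports n3.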

If recon takes a long time to finish successfully, this is a good
indication that the LAM system to be started has slow communication
links or heavily loaded machines, and it might be preferable
to exclude or replace some of the machines in the system.

SSI (System Services Interface)
The -ssi switch allows the passing of parameters to various SSI
modules. LAM's SSI modules are described in detail in lamssi(7). SSI
modules have direct impact on MPI programs because they allow tunable
parameters to be set at run time (such as which boot device driver to
use, what parameters to pass to that driver, etc.).

The -ssi switch takes two arguments: <key> and <value>. The <key>
argument generally specifies which SSI module will receive the value.
For example, the <key> "boot" is used to select which boot module is
used for starting processes on remote nodes. The <value> argument is
the value that is passed. For example:

recon -ssi boot tm
Tells LAM to use the "tm" boot module for native launching in
PBSPro / OpenPBS environments (the tm boot module does not require
a boot schema).

recon -ssi boot rsh -ssi rsh_agent "ssh -x" boot_file
Tells LAM to use the "rsh" boot module, and tells the rsh module to
use "ssh -x" as the specific agent to launch executables on remote
nodes.

And so on. LAM's boot SSI modules are described in lamssi_boot(7).
That page should be consulted for the specific actions taken by each
boot module and for how to tweak its run-time behavior.

The -ssi switch can be used multiple times to specify different <key>
and/or <value> arguments. If the same <key> is specified more than
once, the <value>s are concatenated with a comma (",") separating them.

Note that the -ssi switch is simply a shortcut for setting environment
variables. The same effect may be accomplished by setting the
corresponding environment variables before running recon. The form of
the environment variables that LAM sets is: LAM_MPI_SSI_<key>=<value>.

Note that the -ssi switch overrides any previously set environment
variables. Also note that unknown <key> arguments are still set as
environment variables; they are not checked (by recon) for
correctness. Illegal or incorrect <value> arguments may or may not be
reported, depending on the specific SSI module.
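
The mapping from -ssi arguments to environment variables, including
the comma concatenation of repeated keys, can be illustrated with a
short sketch. ssi_to_env is a hypothetical helper that only prints the
assignments implied by the LAM_MPI_SSI_<key>=<value> form; it is not
how LAM parses its command line.

```shell
#!/bin/sh
# Hypothetical helper: print the LAM_MPI_SSI_* assignments implied by
# a sequence of "-ssi <key> <value>" arguments.
ssi_to_env() {
    seen=""
    while [ "$#" -ge 3 ] && [ "$1" = "-ssi" ]; do
        key=$2
        value=$3
        shift 3
        # Keys become part of a variable name, so restrict their characters.
        case $key in
            ''|*[!A-Za-z0-9_]*) echo "bad key: $key" >&2; return 1 ;;
        esac
        case " $seen " in
            *" $key "*) # Repeated key: append with a comma.
                eval "ssi_$key=\"\${ssi_$key},\$value\"" ;;
            *)          # First occurrence of this key.
                eval "ssi_$key=\$value"
                seen="$seen $key" ;;
        esac
    done
    for key in $seen; do
        eval "printf 'LAM_MPI_SSI_%s=%s\n' \"\$key\" \"\${ssi_$key}\""
    done
}
```

For the second example above, the helper prints LAM_MPI_SSI_boot=rsh
and LAM_MPI_SSI_rsh_agent=ssh -x. Going by the description on this
page, exporting those two variables before running "recon boot_file"
should have the same effect as passing the two -ssi switches.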

Remote Executable Invocation
All tweakable aspects of launching executables on remote nodes during
recon are discussed in lamssi(7) and lamssi_boot(7). Topics include
(but are not limited to): discovery of the remote shell and run-time
overrides of the agent used to launch remote executables (e.g., rsh
and ssh).

FILES
laminstalldir/etc/lam-bhost.def
default boot schema file, where "laminstalldir" is the directory
where LAM/MPI was installed.

EXAMPLES
recon -v mynodes
Check if LAM can be started on all the UNIX machines described in
the boot schema mynodes. Report important steps as they are
performed.

recon -v -a
Check if LAM can be started on all the UNIX machines described in
the default boot schema. Report important steps as they are
performed. Check all the machines; do not stop after the first error
message.

SEE ALSO
rsh(1), tkill(1), bhost(5), lamboot(1), lamwipe(1), lam-helpfile(5),
lamssi(7), lamssi_boot(7)

LAM 7.1.2 March, 2006 RECON(1)