lamssi_boot(7)               LAM SSI BOOT OVERVIEW              lamssi_boot(7)

NAME

       LAM SSI boot - overview of LAM's boot SSI modules

DESCRIPTION

       The "kind" for boot SSI modules is "boot".  Specifically, the string
       "boot" (without the quotes) is the prefix for arguments that pass
       values to boot modules at run time.  For example:

       lamboot -ssi boot rsh hostfile
           Selects the "rsh" boot module and boots LAM across all the nodes
           listed in the file hostfile.

       LAM currently has several boot modules: bproc, globus, rsh (which
       includes ssh), slurm, and tm.

ADDITIONAL INFORMATION

       The LAM/MPI User's Guide contains much detail about all of the boot
       modules.  All users are strongly encouraged to read it.  This man page
       is a summary of the available information.

SELECTING A BOOT MODULE

       Only one boot module may be selected per command execution.  Hence,
       the module is selected once, when a given command initializes.  Once
       chosen, the module is used for the duration of the program run.

       In most cases, LAM will automatically select the "best" module at run
       time.  LAM queries all available modules at run time to obtain a list
       of priorities.  The module with the highest priority will be used.  If
       multiple modules return the same priority, LAM will select one at
       random.  Priorities are in the range of 0 to 100, with 0 being the
       lowest priority and 100 being the highest.  At run time, each module
       examines the run-time environment and returns an appropriate priority
       value.

       For example, when running a PBS job, the tm module will return a
       sufficiently high priority value such that it will be selected and the
       other available modules will not.

       Most modules accept run-time parameters that override the priorities
       they return, which changes the order (and therefore the ultimate
       selection) of the available boot modules.  See below.

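       The selection rule described above can be sketched in shell.  This is
       only an illustrative model, not LAM's implementation: the function
       name select_boot_module and the "name:priority" input format are
       invented for this example, and the priorities used (rsh 0, tm 50) are
       the defaults listed later in this page.  Unlike LAM, which picks
       randomly among equal priorities, the sketch keeps the first module
       listed.

```shell
# Hypothetical model of the selection rule; not LAM's actual code.
# Arguments: "module:priority" pairs.  Prints the highest-priority module.
select_boot_module() {
    best_name=""
    best_prio=-1
    for entry in "$@"; do
        name=${entry%%:*}     # text before the colon
        prio=${entry##*:}     # text after the colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    printf '%s\n' "$best_name"
}

# Inside a PBS job with default priorities, tm outranks rsh.
select_boot_module rsh:0 tm:50

# Raising the rsh priority (as boot_rsh_priority would) changes the outcome.
select_boot_module rsh:75 tm:50
```

       The first call prints "tm" and the second prints "rsh", mirroring how
       a priority override changes which module is selected.
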
       Alternatively, the user may select a specific module by specifying a
       value for the boot parameter (either by environment variable or with
       the -ssi command line parameter).  In this case, no other modules will
       be queried by LAM.  If the named module returns a valid priority, it
       will be used.  For example:

       lamboot -ssi boot rsh hostfile
           Tells LAM to query only the rsh boot module and see if it is
           available to run.

       If the selected boot module is unable to run (e.g., attempting to use
       the tm boot module when not running in a PBS job), an appropriate
       error message will be printed and execution will abort.

AVAILABLE MODULES

       As with all SSI modules, it is possible to pass parameters at run time.
       This section discusses the built-in LAM boot modules, as well as the
       run-time parameters that they accept.

       In the discussion below, parameters to boot modules are described in
       terms of name and value.  The name and value may be specified as
       command line arguments to the lamboot, lamgrow, recon, and lamwipe
       commands with the -ssi switch, or they may be set in environment
       variables of the form LAM_MPI_SSI_name=value.  Note that the -ssi
       command line switch takes precedence over any previously set
       environment variables.

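       As a minimal sketch of the two forms, the snippet below sets
       parameters through the environment and then builds an equivalent
       command line that overrides one of them.  The file name "hostfile" is
       a placeholder, and repeating the -ssi switch once per parameter is
       assumed here rather than taken from this page.  The command is printed
       instead of executed so that the snippet runs without LAM installed.

```shell
# Environment-variable form: LAM_MPI_SSI_<name>=<value>.
export LAM_MPI_SSI_boot=rsh
export LAM_MPI_SSI_boot_rsh_agent=rsh

# Command-line form.  Because -ssi takes precedence over the environment,
# the agent used would be ssh, not the rsh value exported above.
# "hostfile" is a placeholder boot schema file name.
cmd="lamboot -ssi boot rsh -ssi boot_rsh_agent ssh hostfile"

# Print the command rather than running it.
echo "$cmd"
```
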
   bproc Boot Module
       The bproc boot module uses native bproc functionality (e.g., the
       bproc_execmove library call) to launch jobs on slave nodes from the
       head node.  Checks are made before launching to ensure that the nodes
       are available and are "owned" by the user and/or the user's group.
       Appropriate error messages are displayed if the user is unable to
       execute on the target nodes.

       Hostnames should be specified using bproc notation: -1 indicates the
       head node, and integers starting with 0 represent slave nodes.  The
       string "localhost" is automatically converted to "-1".

       The default behavior is to mark the bproc head node as
       "non-schedulable", meaning that the expansion of "N" and "C" when used
       with mpirun and lamexec will exclude the bproc head node.  For
       example, "mpirun C my_mpi_program" will run copies of my_mpi_program
       on all lambooted slave nodes, but not on the bproc head node.

       Note that the bproc boot module is only usable from the bproc head
       node.

       The bproc boot module has only one tunable parameter:

       boot_bproc_priority
           Using the priority argument can override LAM's automatic run-time
           boot module selection algorithms.  This parameter only has effect
           when the bproc module is eligible to be run (i.e., when running on
           a bproc cluster).

       See the bproc notes in the user documentation for more details.

   globus Boot Module
       The globus boot module uses the globus-job-run command to launch
       executables on remote nodes.  It is currently limited to jobs that can
       use the fork job manager on the Globus gatekeeper.  Other job managers
       are not yet supported.

       LAM will effectively never select the globus boot module by default
       because it has an extremely low default priority; it must be manually
       selected with the boot SSI parameter or have its priority raised.
       Additionally, LAM must be able to find the globus-job-run command in
       your PATH.

       The boot schema requires each host to be listed by its Globus contact
       string.  For example:

       "host1:port1:/O=xxx/OU=yyy/CN=aaa bbb ccc"

       Note the use of quotes because the CN includes spaces -- the entire
       contact name must be enclosed in quotes.  Additionally, since
       globus-job-run does not invoke the user's "dot" files on the remote
       nodes, no PATH or environment is set up.  Hence, the attribute
       lam_install_path must be specified for each contact string in the
       hostfile so that LAM knows where to find its executables on the remote
       nodes.  For example:

       "host1:port1:/O=xxx/OU=yyy/CN=aaa bbb ccc" lam_install_path=/home/lam

       The globus boot module has only one tunable parameter:

       boot_globus_priority
           Using the priority argument can override LAM's automatic run-time
           boot module selection algorithms.

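       The globus boot schema described above could be prepared from a shell
       script roughly as follows.  This is a sketch under stated assumptions:
       the contact string and install path are the placeholder values from
       the example above, not real hosts, and the temporary file created with
       mktemp stands in for a real hostfile.  The lamboot command is printed
       rather than executed so the snippet runs without LAM or Globus.

```shell
# Create a temporary boot schema file (stand-in for a real hostfile).
schema=$(mktemp)

# The quoted 'EOF' delimiter keeps the double quotes and spaces of the
# contact string exactly as written.
cat > "$schema" <<'EOF'
"host1:port1:/O=xxx/OU=yyy/CN=aaa bbb ccc" lam_install_path=/home/lam
EOF

# globus is effectively never chosen automatically, so it is selected
# explicitly with the boot SSI parameter.
echo "lamboot -ssi boot globus $schema"
```
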
   rsh Boot Module
       The rsh boot module uses rsh or ssh (or any other command line agent
       that acts like rsh/ssh) to launch executables on remote nodes.  It
       requires that executables can be started on remote nodes without being
       prompted for a password, and without outputting anything to stderr.

       The rsh boot module is always available and, unless overridden, always
       assigns itself a priority of 0.

       The rsh module accepts a few run-time parameters:

       boot_rsh_agent
           Used to override the default remote agent program that was
           selected when LAM was compiled.  For example, this parameter can
           be set to use "ssh" if LAM was compiled to use "rsh" by default.
           Previous versions of LAM/MPI used the LAMRSH environment variable
           for this purpose.  While the LAMRSH environment variable still
           works, its use is deprecated in favor of the boot_rsh_agent SSI
           module argument.

       boot_rsh_priority
           Using the priority argument can override LAM's automatic run-time
           boot module selection algorithms.

       boot_rsh_username
           If the user has a different username on the remote machine, this
           parameter can be used to pass the -l argument to the underlying
           remote agent.  Note that this is a coarse-grained control -- this
           one username will be used for all remote nodes.  If more
           fine-grained control is required, the username should be specified
           in the boot schema file on a per-host basis.

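       The requirement that the remote agent starts commands without a
       password prompt and without writing to stderr can be checked by hand
       before booting.  The helper below is a hypothetical sketch, not part
       of LAM: runs_quietly is an invented name, and the host name node1 in
       the comment is a placeholder.  The demonstration lines use local
       commands so that the snippet runs anywhere.

```shell
# Hypothetical helper: succeeds only if the given command exits with
# status 0 and writes nothing to stderr.
runs_quietly() {
    # Capture stderr only; stdout is discarded.
    err=$("$@" 2>&1 >/dev/null) || return 1
    [ -z "$err" ]
}

# Local stand-ins for a remote command.
runs_quietly true && echo "quiet command accepted"
runs_quietly sh -c 'echo warning >&2' || echo "noisy command rejected"

# On a real cluster the check might look like this (node1 is a placeholder;
# BatchMode=yes makes ssh fail instead of prompting for a password):
#   runs_quietly ssh -o BatchMode=yes node1 true
```
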
   slurm Boot Module
       The slurm boot module uses the srun command to launch the LAM daemons
       in a SLURM execution environment.  When it detects that it is running
       under SLURM, it automatically sets its priority to 50.  It can be used
       in two different modes: batch (where a script is submitted to SLURM
       and is run on the first node in the node allocation) and allocate
       (where the -A option to srun is used to obtain an interactive
       allocation).  The slurm boot module does not support running in a
       script that is launched by SLURM on all nodes in an allocation.

       No boot schema file is required when using the slurm boot module; LAM
       will automatically determine the host and CPU count from SLURM itself.

       The slurm boot module has only one tunable parameter:

       boot_slurm_priority
           Using the priority argument can override LAM's automatic run-time
           boot module selection algorithms.  This parameter only has effect
           when the slurm module is eligible to be run (i.e., when running in
           a SLURM allocation).

   tm Boot Module
       The tm boot module uses the Task Management (TM) interface to launch
       executables on remote nodes.  Currently, OpenPBS and PBSPro are the
       only two systems that implement the TM interface.  Hence, when LAM
       detects that it is running in a PBS job, it will automatically set the
       tm priority to 50.  When not running in a PBS job, the tm module will
       not be available.

       The tm boot module has only one tunable parameter:

       boot_tm_priority
           Using the priority argument can override LAM's automatic run-time
           boot module selection algorithms.  This parameter only has effect
           when the tm module is eligible to be run (i.e., when running in a
           PBS job).

SEE ALSO

       lamssi(7), mpirun(1), LAM User's Guide

LAM 7.1.2                         March, 2006                   lamssi_boot(7)