lamssi_coll(7) LAM SSI COLL OVERVIEW lamssi_coll(7)

LAM SSI collectives - overview of LAM's MPI collective SSI modules

The "kind" for collectives SSI modules is "coll". Specifically, the
string "coll" (without the quotes) is the prefix to use with the -ssi
switch on the mpirun command line. For example:

mpirun -ssi coll_base_crossover 4 C my_mpi_program

LAM currently has four coll modules:

lam_basic
A full implementation of MPI collectives on intracommunicators.
The algorithms are the same as those in the LAM 6.5 series.
Collectives on intercommunicators are undefined and will result in
run-time errors.

impi
Collective functions for IMPI communicators. These are mostly
unimplemented; only the basics exist: MPI_BARRIER and MPI_REDUCE.

shmem
Shared memory collectives.

smp
SMP-aware collectives (based on the MagPIe algorithms). The
following algorithms provide SMP-aware performance on multiprocessors:
MPI_ALLREDUCE, MPI_ALLTOALL, MPI_ALLTOALLV, MPI_BARRIER, MPI_BCAST,
MPI_GATHER, MPI_GATHERV, MPI_REDUCE, MPI_SCATTER, and MPI_SCATTERV.
Note that the reduction algorithms must be specifically enabled by
marking the operations as associative before they will be used.
All other MPI collectives fall back to their lam_basic equivalents.

More collective modules are likely to be implemented in the future.

The parameters below are described in terms of kind and value. Because
coll modules, unlike other SSI module kinds, are selected on a
per-communicator basis, the kind and value may be specified as
attributes on a parent communicator.

Need to write much more here.

Selecting a coll module
coll modules are selected on a per-communicator basis. A module is
selected when the communicator is created and remains the active coll
module for the life of that communicator. For example, different coll
modules may be assigned to MPI_COMM_WORLD and MPI_COMM_SELF. In most
cases LAM/MPI will select the best coll module automatically. For
example, when a communicator spans multiple nodes and at least one node
has multiple MPI processes, the smp module will automatically be
selected.

However, the LAM_MPI_SSI_COLL keyval can be used to set an attribute on
a communicator that is then used to create a new communicator. The
attribute value should be the string name of the coll module to use for
the new communicator. If that module cannot be used, an MPI exception
will occur. This attribute is only examined on the parent communicator
when a new communicator is created.
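
The selection rules above can be sketched as a toy model. This is plain
Python, not LAM/MPI code: ToyComm, auto_select, coll_attr, and dup are
hypothetical names invented for illustration, the lam_basic fallback is
an assumption of the sketch, and ValueError merely stands in for the MPI
exception. It only shows that the module is fixed at creation time and
that the parent's attribute is read once, when the child is created.

```python
from collections import Counter

# Module names taken from this page; the set itself is a toy stand-in.
AVAILABLE = {"lam_basic", "impi", "shmem", "smp"}


def auto_select(node_of_rank):
    """Pick a module from process placement (one node name per rank).

    Mirrors the single rule stated above: smp when the communicator spans
    several nodes and at least one node hosts several processes.
    Falling back to lam_basic otherwise is an assumption of this sketch.
    """
    per_node = Counter(node_of_rank)
    if len(per_node) > 1 and max(per_node.values()) > 1:
        return "smp"
    return "lam_basic"


class ToyComm:
    """Toy communicator whose coll module is fixed at construction."""

    def __init__(self, node_of_rank, requested=None):
        if requested is not None and requested not in AVAILABLE:
            # Stands in for the MPI exception described above.
            raise ValueError(f"coll module {requested!r} cannot be used")
        self.node_of_rank = list(node_of_rank)
        self._coll = requested or auto_select(self.node_of_rank)
        # Stands in for an attribute set with the LAM_MPI_SSI_COLL keyval.
        self.coll_attr = None

    @property
    def coll(self):
        # Read-only: the module never changes after creation.
        return self._coll

    def dup(self):
        # The parent's attribute is examined only here, at creation time.
        return ToyComm(self.node_of_rank, requested=self.coll_attr)


world = ToyComm(["n0", "n0", "n1", "n1"])  # 2 nodes, 2 processes each
world.coll_attr = "lam_basic"              # request for future children
child = world.dup()
print(world.coll, child.coll)
```

In this model, setting coll_attr on world leaves world's own module
unchanged; only the communicator created from it picks up the request.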

coll SSI Parameters
The coll modules accept several parameters:

coll_associative
Because of specific wording in the MPI standard, LAM/MPI effectively
cannot assume that any reduction operator is associative (at least,
not without additional overhead). Hence, LAM/MPI relies on the user
to indicate that certain operations are associative. If the user
sets the coll_associative SSI parameter to 1, LAM/MPI may assume that
the reduction operator is associative, and may be able to optimize
the overall reduction operation. If it is 0 or undefined, LAM/MPI
will assume that the reduction operation is not associative, and will
use strict linear ordering of reduction operations (regardless of
data locality). This attribute is checked every time a reduction
operator is invoked. The User's Guide contains more information on
this topic.

coll_crossover
This parameter determines the maximum number of processes in a
communicator that will use linear algorithms. This SSI parameter is
only checked during MPI_INIT.

coll_reduce_crossover
During reduction operations, it makes sense to use the number of
bytes to be transferred, rather than the number of processes, as the
metric for deciding whether to use linear or logarithmic algorithms.
This parameter indicates the maximum number of bytes to be
transferred by each process by a linear algorithm. This SSI parameter
is only checked during MPI_INIT.
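
As a rough illustration of what the two crossover parameters control,
the sketch below models the choice between linear and logarithmic
algorithms. It is plain Python, not LAM/MPI source: the function names
are hypothetical, the value 4 simply echoes the mpirun example earlier
on this page, the 1024-byte threshold is made up, and the step counts
assume a typical tree-based logarithmic scheme, which may differ from
what LAM/MPI actually implements.

```python
def pick_algorithm(nprocs, crossover):
    """Linear for up to `crossover` processes, logarithmic above that."""
    return "linear" if nprocs <= crossover else "logarithmic"


def pick_reduce_algorithm(nbytes_per_process, reduce_crossover):
    """Reductions: linear for up to `reduce_crossover` bytes per process."""
    return "linear" if nbytes_per_process <= reduce_crossover else "logarithmic"


def sequential_steps(nprocs, algorithm):
    """Rough number of sequential communication steps (illustrative only)."""
    if algorithm == "linear":
        return nprocs - 1               # root handles each peer in turn
    return (nprocs - 1).bit_length()    # ceil(log2(nprocs)) tree rounds


for n in (2, 4, 8, 64):
    algo = pick_algorithm(n, crossover=4)
    print(n, algo, sequential_steps(n, algo))

# Hypothetical 1024-byte threshold, chosen only for this example.
print(pick_reduce_algorithm(512, reduce_crossover=1024))
print(pick_reduce_algorithm(4096, reduce_crossover=1024))
```

For small communicators the linear form needs about as many steps as the
tree, which is one reason a crossover point is useful; for 64 processes
the model gives 6 tree rounds against 63 linear steps.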

Notes on the smp coll Module
The smp coll module is based on the algorithms from the MagPIe project.
It is not yet complete; more algorithms can still be optimized for
SMP-aware execution. By the time LAM/MPI was frozen in preparation for
release, only some of the algorithms had been completed. It is expected
that future versions of LAM/MPI will have more SMP-optimized
algorithms.

The User's Guide contains much more detail about the smp module. In
particular, the coll_associative SSI parameter must be 1 for the
SMP-aware reduction algorithms to be used. If it is 0 or undefined, the
corresponding lam_basic algorithms will be used. The coll_associative
attribute is checked at every invocation of the reduction algorithms.
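
One common reason the ordering of a reduction matters is floating-point
rounding. The sketch below is plain Python rather than MPI code, with
hypothetical function names; it compares a strict left-to-right
reduction with one that first combines values in groups (as a
locality-aware scheme might combine the values on each node) and then
combines the partial results. It illustrates the general issue only and
does not reproduce the smp module's algorithms.

```python
def reduce_linear(values):
    """Strict left-to-right order: ((v0 + v1) + v2) + ..."""
    acc = values[0]
    for v in values[1:]:
        acc = acc + v
    return acc


def reduce_grouped(values, group_size):
    """Reduce each group first, then combine the partial results in order."""
    partials = [
        reduce_linear(values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]
    return reduce_linear(partials)


# Doubles near 1e16 are spaced 2 apart, so adding 1.0 is lost to rounding.
values = [1e16, 1.0, -1e16, 1.0]
print(reduce_linear(values))      # 1.0
print(reduce_grouped(values, 2))  # 0.0

# Integer addition is exactly associative, so both orders agree.
print(reduce_linear([1, 2, 3, 4]), reduce_grouped([1, 2, 3, 4], 2))
```

With the floating-point input the two orders give different answers,
which is why a regrouping optimization is only safe once the user has
stated that the operation may be treated as associative.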

lamssi(7), mpirun(1), LAM User's Guide



LAM 7.1.2 March, 2006 lamssi_coll(7)