fi_psm(7) Libfabric v1.6.1 fi_psm(7)

fi_psm - The PSM Fabric Provider

The psm provider runs over the PSM 1.x interface that is currently
supported by the Intel TrueScale Fabric. PSM provides tag-matching
message queue functions that are optimized for MPI implementations. PSM
also has limited Active Message support, which is not officially
published but is quite stable and well documented in the source code
(part of the OFED release). The psm provider makes use of both the
tag-matching message queue functions and the Active Message functions
to support a variety of libfabric data transfer APIs, including tagged
message queue, message queue, RMA, and atomic operations.

The psm provider can work with the psm2-compat library, which exposes a
PSM 1.x interface over the Intel Omni-Path Fabric.

The psm provider doesn't support all the features defined in the
libfabric API. Here are some of the limitations:

Endpoint types : Only the non-connection based types FI_DGRAM and
FI_RDM are supported.

Endpoint capabilities : Endpoints can support any combination of data
transfer capabilities FI_TAGGED, FI_MSG, FI_ATOMICS, and FI_RMA. These
capabilities can be further refined by FI_SEND, FI_RECV, FI_READ,
FI_WRITE, FI_REMOTE_READ, and FI_REMOTE_WRITE to limit the direction of
operations. The limitation is that no two endpoints can have
overlapping receive or RMA target capabilities in any of the above
categories. For example, it is fine to have two endpoints with
FI_TAGGED | FI_SEND, one endpoint with FI_TAGGED | FI_RECV, one
endpoint with FI_MSG, and one endpoint with FI_RMA | FI_ATOMICS.
However, it is not allowed to have two endpoints with FI_TAGGED, or two
endpoints with FI_RMA.

FI_MULTI_RECV is supported for non-tagged message queue only.

Other supported capabilities include FI_TRIGGER.

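The overlap rule for endpoint capabilities can be illustrated with a
small check. This is only a sketch, not the provider's implementation:
the CAP_* values are hypothetical stand-ins for the FI_* macros in
<rdma/fabric.h>, and it assumes that an endpoint with no direction flag
gets all directions in the corresponding category.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the FI_* capability bits. */
enum {
    CAP_MSG          = 1u << 0,
    CAP_TAGGED       = 1u << 1,
    CAP_RMA          = 1u << 2,
    CAP_ATOMICS      = 1u << 3,
    CAP_SEND         = 1u << 4,
    CAP_RECV         = 1u << 5,
    CAP_READ         = 1u << 6,
    CAP_WRITE        = 1u << 7,
    CAP_REMOTE_READ  = 1u << 8,
    CAP_REMOTE_WRITE = 1u << 9
};

/* Return the categories for which this endpoint receives messages or
 * acts as an RMA/atomic target. Assumption: no direction flag means
 * all directions are implied. */
static uint32_t receive_side(uint32_t caps)
{
    uint32_t msg_dir = caps & (CAP_SEND | CAP_RECV);
    uint32_t rma_dir = caps & (CAP_READ | CAP_WRITE |
                               CAP_REMOTE_READ | CAP_REMOTE_WRITE);
    bool can_recv = (msg_dir == 0) || (msg_dir & CAP_RECV);
    bool is_target = (rma_dir == 0) ||
                     (rma_dir & (CAP_REMOTE_READ | CAP_REMOTE_WRITE));
    uint32_t side = 0;

    if (can_recv)
        side |= caps & (CAP_MSG | CAP_TAGGED);
    if (is_target)
        side |= caps & (CAP_RMA | CAP_ATOMICS);
    return side;
}

/* Two endpoints conflict when their receive/target sides overlap. */
bool endpoints_conflict(uint32_t a, uint32_t b)
{
    return (receive_side(a) & receive_side(b)) != 0;
}
```

Under these assumptions, two endpoints with CAP_TAGGED | CAP_SEND do
not conflict, while two endpoints with plain CAP_TAGGED do.
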
Modes : FI_CONTEXT is required for the FI_TAGGED and FI_MSG
capabilities. That is, any request belonging to these two categories
that generates a completion must pass as the operation context a valid
pointer to type struct fi_context, and the space referenced by the
pointer must remain untouched until the request has completed. If
neither FI_TAGGED nor FI_MSG is requested, the FI_CONTEXT mode is not
required.

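One common way to satisfy the FI_CONTEXT requirement is to embed the
context in a per-request structure that lives until the completion
arrives. The sketch below is an illustration under assumptions: struct
ctx_standin is a hypothetical stand-in for struct fi_context, and
struct request, request_from_context, and on_completion are names
invented for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct fi_context from <rdma/fabric.h>. */
struct ctx_standin {
    void *internal[4];
};

/* Per-request state. The embedded context must not be modified or
 * freed until the request has completed. */
struct request {
    void *buf;
    size_t len;
    int done;
    struct ctx_standin ctx;   /* pointer to this is the operation context */
};

/* Recover the enclosing request from the context pointer that is
 * handed back with the completion. */
static inline struct request *request_from_context(void *op_context)
{
    return (struct request *)((char *)op_context -
                              offsetof(struct request, ctx));
}

/* Called by the application once the completion has been read. */
void on_completion(void *op_context)
{
    struct request *req = request_from_context(op_context);

    req->done = 1;   /* only now may the request storage be reused */
}
```

Allocating requests from a pool rather than on the stack of a function
that may return early helps keep the context valid for the lifetime of
the operation.
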
Progress : The psm provider requires manual progress. The application
is expected to call the fi_cq_read or fi_cntr_read function from time
to time when no other libfabric function is called, to ensure progress
is made in a timely manner. The provider does support auto progress
mode. However, performance can be significantly impacted if the
application depends purely on the provider to make auto progress.

Unsupported features : These features are unsupported: connection
management, scalable endpoint, passive endpoint, shared receive
context, send/inject with immediate data.

The psm provider checks for the following environment variables:

FI_PSM_UUID : PSM requires that each job has a unique ID (UUID). All
the processes in the same job need to use the same UUID in order to be
able to talk to each other. The PSM reference manual advises keeping
the UUID unique to each job. In practice, it generally works fine to
reuse a UUID as long as (1) no two jobs with the same UUID are running
at the same time; and (2) previous jobs with the same UUID have exited
normally. If "resource busy" or "connection failure" issues occur for
no apparent reason, it is advisable to manually set the UUID to a value
different from the default.

The default UUID is 0FFF0FFF-0000-0000-0000-0FFF0FFF0FFF.

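A process can set FI_PSM_UUID itself, presumably before its first
libfabric call so that the provider sees the value. The following is a
minimal sketch under assumptions: set_psm_uuid_for_job is a
hypothetical helper, and deriving the last UUID group from a job ID
shared by all processes of the job is only one possible scheme.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Build a UUID in the same 8-4-4-4-12 hex layout as the documented
 * default, replacing the last group with the low 48 bits of job_id,
 * and export it as FI_PSM_UUID. Every process of a job must pass the
 * same job_id. Returns 0 on success, -1 on failure. */
int set_psm_uuid_for_job(unsigned long long job_id)
{
    char uuid[37];   /* 36 characters plus the terminating NUL */
    int n = snprintf(uuid, sizeof uuid,
                     "0FFF0FFF-0000-0000-0000-%012llX",
                     job_id & 0xFFFFFFFFFFFFULL);

    if (n != 36)
        return -1;
    return setenv("FI_PSM_UUID", uuid, 1);
}
```

The same effect can be had by exporting the variable in the job launch
script, which avoids any ordering concerns inside the application.
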
FI_PSM_NAME_SERVER : The psm provider has a simple built-in name server
that can be used to resolve an IP address or host name into a transport
address needed by the fi_av_insert call. The main purpose of this name
server is to allow simple client-server type applications (such as
those in fabtests) to be written purely with libfabric, without using
any out-of-band communication mechanism. For such applications, the
server would run first to allow endpoints to be created and registered
with the name server, and then the client would call fi_getinfo with
the node parameter set to the IP address or host name of the server.
The resulting fi_info structure would have the transport address of the
endpoint created by the server in the dest_addr field. Optionally, the
service parameter can be used in addition to node. Note that the
service number is interpreted by the provider and is not a TCP/IP port
number.

The name server is on by default. It can be turned off by setting the
variable to 0. This may save a small amount of resources, since a
separate thread is created when the name server is on.

The provider detects OpenMPI and MPICH runs and changes the default
setting to off.

FI_PSM_TAGGED_RMA : The RMA functions are implemented on top of the PSM
Active Message functions. The Active Message functions have a limit on
the size of data that can be transferred in a single message. Large
transfers can be divided into small chunks and pipelined. However, the
bandwidth achieved this way is suboptimal.

The psm provider uses PSM tag-matching message queue functions to
achieve higher bandwidth for large RMA transfers. For this purpose, a
bit is reserved from the tag space to separate the RMA traffic from the
regular tagged message queue.

The option is on by default. To turn it off, set the variable to 0.

FI_PSM_AM_MSG : The psm provider implements the non-tagged message
queue over the PSM tag-matching message queue. One tag bit is reserved
for this purpose. Alternatively, the non-tagged message queue can be
implemented over Active Message. This experimental feature has
slightly higher latency.

This option is off by default. To turn it on, set the variable to 1.

FI_PSM_DELAY : Time (seconds) to sleep before closing PSM endpoints.
This is a workaround for a bug in some versions of the PSM library.

The default setting is 1.

FI_PSM_TIMEOUT : Timeout (seconds) for gracefully closing PSM
endpoints. A forced close is issued if the timeout expires.

The default setting is 5.

FI_PSM_PROG_INTERVAL : When auto progress is enabled (requested via the
hints passed to fi_getinfo), a progress thread is created to make
progress calls from time to time. This option sets the interval
(microseconds) between progress calls.

The default setting is 1 if affinity is set, or 1000 if not. See
FI_PSM_PROG_AFFINITY.

FI_PSM_PROG_AFFINITY : When set, specifies the set of CPU cores to
which the progress thread affinity is set. The format is
<start>[:<end>[:<stride>]][,<start>[:<end>[:<stride>]]]*, where each
triplet <start>:<end>:<stride> defines a block of core_ids. Both
<start> and <end> can be either the core_id (when >=0) or
core_id - num_cores (when <0).

By default affinity is not set.

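The FI_PSM_PROG_AFFINITY format can be made concrete with a small
parser. This is a sketch, not the provider's implementation:
parse_affinity is a hypothetical name, and the code assumes that a
missing <end> means a single core, that a missing <stride> means 1, and
that <end> is inclusive.

```c
#include <stdio.h>
#include <string.h>

/* Parse an affinity spec into mask[0..num_cores-1], setting mask[i]
 * to 1 for every selected core. Returns the number of selected cores,
 * or -1 if the spec is malformed or out of range. */
int parse_affinity(const char *spec, int num_cores, unsigned char *mask)
{
    const char *p = spec;
    int count = 0;

    if (!spec || !mask || num_cores <= 0)
        return -1;
    memset(mask, 0, (size_t)num_cores);

    while (p) {
        int start, end, stride;
        int n = sscanf(p, "%d:%d:%d", &start, &end, &stride);

        if (n < 1)
            return -1;
        if (n < 2)
            end = start;          /* assumption: single core */
        if (n < 3)
            stride = 1;           /* assumption: contiguous block */
        if (start < 0)
            start += num_cores;   /* negative means core_id - num_cores */
        if (end < 0)
            end += num_cores;
        if (stride < 1 || start < 0 || start > end || end >= num_cores)
            return -1;

        for (int i = start; i <= end; i += stride) {
            if (!mask[i]) {
                mask[i] = 1;
                count++;
            }
        }

        p = strchr(p, ',');       /* advance to the next block, if any */
        if (p)
            p++;
    }
    return count;
}
```

For example, with 8 cores the spec 0:3:2,-1 would select cores 0, 2,
and 7 under these assumptions.
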
fabric(7), fi_provider(7), fi_psm2(7)

OpenFabrics.

Libfabric Programmer's Manual 2018-02-13 fi_psm(7)