fi_psm(7)                      Libfabric v1.14.0                     fi_psm(7)

NAME

       fi_psm - The PSM Fabric Provider

OVERVIEW

       The psm provider runs over the PSM 1.x interface that is currently
       supported by the Intel TrueScale Fabric.  PSM provides tag-matching
       message queue functions that are optimized for MPI implementations.
       PSM also has limited Active Message support, which is not officially
       published but is quite stable and well documented in the source code
       (part of the OFED release).  The psm provider makes use of both the
       tag-matching message queue functions and the Active Message functions
       to support a variety of libfabric data transfer APIs, including tagged
       message queue, message queue, RMA, and atomic operations.

       The psm provider can work with the psm2-compat library, which exposes a
       PSM 1.x interface over the Intel Omni-Path Fabric.

LIMITATIONS

       The psm provider doesn’t support all the features defined in the
       libfabric API.  Here are some of the limitations:

       Endpoint types
              Only the connectionless endpoint types FI_DGRAM and FI_RDM are
              supported.

       Endpoint capabilities
              Endpoints can support any combination of the data transfer
              capabilities FI_TAGGED, FI_MSG, FI_ATOMICS, and FI_RMA.  These
              capabilities can be further refined by FI_SEND, FI_RECV,
              FI_READ, FI_WRITE, FI_REMOTE_READ, and FI_REMOTE_WRITE to limit
              the direction of operations.  The limitation is that no two
              endpoints can have overlapping receive or RMA target
              capabilities in any of the above categories.  For example, it
              is fine to have two endpoints with FI_TAGGED | FI_SEND, one
              endpoint with FI_TAGGED | FI_RECV, one endpoint with FI_MSG,
              and one endpoint with FI_RMA | FI_ATOMICS.  But it is not
              allowed to have two endpoints with FI_TAGGED, or two endpoints
              with FI_RMA.
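
       The overlap rule above can be sketched as a small check.  This is an
       illustration only, not the provider's actual validation code.  The X_*
       constants are stand-ins invented for this example (real code would use
       the FI_* flags from <rdma/fabric.h>), and the sketch assumes that a
       capability given without any direction flag covers both directions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in bit values for this sketch only; real code would use the
 * FI_* constants from <rdma/fabric.h>. */
enum {
    X_MSG = 1 << 0, X_TAGGED = 1 << 1, X_RMA = 1 << 2, X_ATOMICS = 1 << 3,
    X_SEND = 1 << 4, X_RECV = 1 << 5, X_READ = 1 << 6, X_WRITE = 1 << 7,
    X_REMOTE_READ = 1 << 8, X_REMOTE_WRITE = 1 << 9
};

/* Return the categories in which an endpoint acts as a receiver or as an
 * RMA/atomic target.  Assumption: a category requested without any
 * direction flag covers both directions. */
static uint32_t target_side(uint32_t caps)
{
    uint32_t out = 0;
    bool msg_dir = (caps & (X_SEND | X_RECV)) != 0;
    bool rma_dir = (caps & (X_READ | X_WRITE |
                            X_REMOTE_READ | X_REMOTE_WRITE)) != 0;
    bool recv = !msg_dir || (caps & X_RECV) != 0;
    bool target = !rma_dir ||
                  (caps & (X_REMOTE_READ | X_REMOTE_WRITE)) != 0;

    if (recv)
        out |= caps & (X_MSG | X_TAGGED);
    if (target)
        out |= caps & (X_RMA | X_ATOMICS);
    return out;
}

/* True if no two endpoints share a receive or RMA target category. */
static bool caps_compatible(const uint32_t *caps, size_t n)
{
    uint32_t seen = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t t = target_side(caps[i]);
        if (seen & t)
            return false;   /* two endpoints overlap on the target side */
        seen |= t;
    }
    return true;
}
```

       With the endpoint set from the example above (two FI_TAGGED | FI_SEND,
       one FI_TAGGED | FI_RECV, one FI_MSG, one FI_RMA | FI_ATOMICS) the check
       passes; with two plain FI_TAGGED endpoints it fails.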

       FI_MULTI_RECV is supported for non-tagged message queue only.

       Other supported capabilities include FI_TRIGGER.

       Modes  FI_CONTEXT is required for the FI_TAGGED and FI_MSG
              capabilities.  That means any request belonging to these two
              categories that generates a completion must pass as the
              operation context a valid pointer to type struct fi_context,
              and the space referenced by the pointer must remain untouched
              until the request has completed.  If neither FI_TAGGED nor
              FI_MSG is requested, the FI_CONTEXT mode is not required.

       Progress
              The psm provider requires manual progress.  The application is
              expected to call the fi_cq_read or fi_cntr_read function from
              time to time when no other libfabric function is called, to
              ensure progress is made in a timely manner.  The provider does
              support auto progress mode.  However, performance can be
              significantly impacted if the application depends purely on the
              provider to make auto progress.

       Unsupported features
              These features are unsupported: connection management, scalable
              endpoint, passive endpoint, shared receive context, and
              send/inject with immediate data.

RUNTIME PARAMETERS

       The psm provider checks for the following environment variables:

       FI_PSM_UUID
              PSM requires that each job has a unique ID (UUID).  All the
              processes in the same job need to use the same UUID in order to
              be able to talk to each other.  The PSM reference manual
              advises keeping the UUID unique to each job.  In practice, it
              generally works fine to reuse a UUID as long as (1) no two jobs
              with the same UUID are running at the same time; and (2)
              previous jobs with the same UUID have exited normally.  If
              “resource busy” or “connection failure” errors occur for no
              apparent reason, it is advisable to manually set the UUID to a
              value different from the default.

       The default UUID is 0FFF0FFF-0000-0000-0000-0FFF0FFF0FFF.
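
       One way to pick a non-default UUID is to derive it from a job
       identifier that every process in the job already shares.  The sketch
       below is an assumption-laden illustration: job_id and set_psm_uuid are
       hypothetical names, the string is assumed to be accepted in the same
       8-4-4-4-12 hexadecimal layout as the default value, and the variable
       is assumed to be read when the provider is initialized, so the
       function would have to run before the first libfabric call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Format a 64-bit job identifier (hypothetical; for example handed out by
 * a job launcher) in the same textual layout as the default UUID and
 * export it as FI_PSM_UUID.  Returns 0 on success, -1 on failure. */
static int set_psm_uuid(unsigned long long job_id)
{
    char uuid[37];   /* 36 characters plus the terminating NUL */
    int n = snprintf(uuid, sizeof(uuid),
                     "%08llX-%04llX-%04llX-%04llX-%012llX",
                     (job_id >> 32) & 0xFFFFFFFFULL,
                     (job_id >> 16) & 0xFFFFULL,
                     job_id & 0xFFFFULL,
                     0ULL, 0ULL);

    if (n != 36)
        return -1;
    /* Overwrite any existing value so all processes agree. */
    return setenv("FI_PSM_UUID", uuid, 1);
}
```

       All processes of one job must pass the same job_id, since processes
       with different UUIDs cannot talk to each other.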

       FI_PSM_NAME_SERVER
              The psm provider has a simple built-in name server that can be
              used to resolve an IP address or host name into a transport
              address needed by the fi_av_insert call.  The main purpose of
              this name server is to allow simple client-server type
              applications (such as those in fabtests) to be written purely
              with libfabric, without using any out-of-band communication
              mechanism.  For such applications, the server would run first
              to allow endpoints to be created and registered with the name
              server, and then the client would call fi_getinfo with the node
              parameter set to the IP address or host name of the server.
              The resulting fi_info structure would have the transport
              address of the endpoint created by the server in the dest_addr
              field.  Optionally, the service parameter can be used in
              addition to node.  Note that the service number is interpreted
              by the provider and is not a TCP/IP port number.

       The name server is on by default.  It can be turned off by setting the
       variable to 0.  This may save a small amount of resources, since a
       separate thread is created when the name server is on.

       The provider detects OpenMPI and MPICH runs and changes the default
       setting to off.

       FI_PSM_TAGGED_RMA
              The RMA functions are implemented on top of the PSM Active
              Message functions.  The Active Message functions have a limit
              on the size of data that can be transferred in a single
              message.  Large transfers can be divided into small chunks and
              pipelined.  However, the bandwidth achieved this way is
              suboptimal.

       The psm provider uses the PSM tag-matching message queue functions to
       achieve higher bandwidth for large RMA transfers.  For this purpose, a
       bit is reserved from the tag space to separate the RMA traffic from
       the regular tagged message queue.

       The option is on by default.  To turn it off, set the variable to 0.
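
       To illustrate the chunking that the Active Message path needs when the
       option is turned off, the following sketch splits one large transfer
       into size-limited pieces.  It is generic arithmetic, not provider code:
       chunk_count and chunk_at are hypothetical helpers, and the per-message
       limit is a caller-supplied parameter because the actual PSM limit is
       not stated here.

```c
#include <stddef.h>

/* Number of chunks needed to move total bytes when a single message can
 * carry at most max_chunk bytes.  Returns 0 if max_chunk is 0. */
static size_t chunk_count(size_t total, size_t max_chunk)
{
    if (max_chunk == 0)
        return 0;
    return total / max_chunk + (total % max_chunk != 0);
}

/* Describe chunk number idx (0-based) of the transfer.  On success stores
 * the byte offset and length of that chunk and returns 1; returns 0 when
 * idx is past the last chunk. */
static int chunk_at(size_t total, size_t max_chunk, size_t idx,
                    size_t *off, size_t *len)
{
    if (idx >= chunk_count(total, max_chunk))
        return 0;
    *off = idx * max_chunk;
    /* The last chunk carries whatever remains. */
    *len = (total - *off < max_chunk) ? total - *off : max_chunk;
    return 1;
}
```

       A sender would walk idx from 0 upward and issue one message per chunk,
       keeping several in flight to pipeline them.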

       FI_PSM_AM_MSG
              The psm provider implements the non-tagged message queue over
              the PSM tag-matching message queue.  One tag bit is reserved
              for this purpose.  Alternatively, the non-tagged message queue
              can be implemented over Active Message.  This experimental
              feature has slightly larger latency.

       This option is off by default.  To turn it on, set the variable to 1.

       FI_PSM_DELAY
              Time (seconds) to sleep before closing PSM endpoints.  This is
              a workaround for a bug in some versions of the PSM library.

       The default setting is 1.

       FI_PSM_TIMEOUT
              Timeout (seconds) for gracefully closing PSM endpoints.  A
              forced close will be issued if the timeout expires.

       The default setting is 5.

       FI_PSM_PROG_INTERVAL
              When auto progress is enabled (requested via the hints to
              fi_getinfo), a progress thread is created to make progress
              calls from time to time.  This option sets the interval
              (microseconds) between progress calls.

       The default setting is 1 if affinity is set, or 1000 if not.  See
       FI_PSM_PROG_AFFINITY.
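
       The default rule above can be written out as a small function.  This
       mirrors only the documented behavior and is not the provider's code;
       resolve_interval is a hypothetical helper, and falling back to the
       default for an unparsable or non-positive value is an assumption of
       this sketch.

```c
#include <stdlib.h>

/* Resolve the progress interval in microseconds from the values of
 * FI_PSM_PROG_INTERVAL and FI_PSM_PROG_AFFINITY (NULL when unset).
 * An explicit interval wins; otherwise the default is 1 when affinity
 * is set and 1000 when it is not. */
static long resolve_interval(const char *interval, const char *affinity)
{
    if (interval && *interval) {
        char *end;
        long v = strtol(interval, &end, 10);
        /* Assumption: ignore malformed or non-positive values. */
        if (*end == '\0' && v > 0)
            return v;
    }
    return affinity ? 1 : 1000;
}

/* Convenience wrapper that reads the two environment variables. */
static long prog_interval_usec(void)
{
    return resolve_interval(getenv("FI_PSM_PROG_INTERVAL"),
                            getenv("FI_PSM_PROG_AFFINITY"));
}
```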

       FI_PSM_PROG_AFFINITY
              When set, specifies the set of CPU cores to which the progress
              thread affinity is set.  The format is
              <start>[:<end>[:<stride>]][,<start>[:<end>[:<stride>]]]*, where
              each triplet <start>:<end>:<stride> defines a block of
              core_ids.  Both <start> and <end> can be either the core_id
              (when >=0) or core_id - num_cores (when <0).

       By default affinity is not set.
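
       The affinity format can be made concrete with a small parser.  This is
       a sketch of how such a string might be interpreted, not the provider's
       implementation.  parse_affinity and norm_core are hypothetical names,
       and the sketch assumes that an omitted <end> means a single core and
       an omitted <stride> means 1, which the text above does not spell out.

```c
#include <stdlib.h>
#include <string.h>

/* Map a possibly negative core number onto [0, num_cores); a negative
 * value v stands for core v + num_cores.  Returns -1 if out of range. */
static long norm_core(long v, long num_cores)
{
    if (v < 0)
        v += num_cores;
    return (v >= 0 && v < num_cores) ? v : -1;
}

/* Parse spec into mask[0..num_cores-1] (1 = core selected).
 * Assumptions of this sketch: <end> defaults to <start> and <stride>
 * defaults to 1.  Returns the number of selected cores, or -1 if the
 * spec is malformed or names a core outside the machine. */
static int parse_affinity(const char *spec, unsigned char *mask,
                          long num_cores)
{
    int count = 0;

    if (num_cores <= 0)
        return -1;
    memset(mask, 0, (size_t)num_cores);

    while (*spec) {
        long v[3];   /* start, end, stride */
        int n = 0;

        for (;;) {
            char *end;
            v[n] = strtol(spec, &end, 10);
            if (end == spec)
                return -1;          /* expected a number */
            n++;
            spec = end;
            if (*spec != ':')
                break;
            if (n == 3)
                return -1;          /* more than three fields */
            spec++;
        }

        long start = norm_core(v[0], num_cores);
        long stop = (n > 1) ? norm_core(v[1], num_cores) : start;
        long stride = (n > 2) ? v[2] : 1;
        if (start < 0 || stop < 0 || stride <= 0)
            return -1;
        if (stride > num_cores)
            stride = num_cores;     /* keeps c += stride from overflowing */

        for (long c = start; c <= stop; c += stride) {
            if (!mask[c]) {
                mask[c] = 1;
                count++;
            }
        }

        if (*spec == ',') {
            spec++;
            if (!*spec)
                return -1;          /* trailing comma */
        } else if (*spec) {
            return -1;              /* unexpected character */
        }
    }
    return count;
}
```

       Under these assumptions, on an 8-core machine the string "0:3:2,-1"
       selects cores 0, 2, and 7.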

SEE ALSO

       fabric(7), fi_provider(7), fi_psm2(7), fi_psm3(7)

AUTHORS

       OpenFabrics.



Libfabric Programmer’s Manual     2021-03-22                         fi_psm(7)