RSOCKET(7)               Librdmacm Programmer's Manual              RSOCKET(7)

NAME

       rsocket - RDMA socket API

SYNOPSIS

       #include <rdma/rsocket.h>

DESCRIPTION

       RDMA socket API and protocol

NOTES

       Rsockets is a protocol over RDMA that supports a socket-level API for
       applications.  Rsocket APIs are intended to match the behavior of the
       corresponding socket calls, except where noted.  Rsocket functions
       match the name and function signature of socket calls, with the
       exception that all function calls are prefixed with an 'r'.

       The following functions are defined:

       rsocket

       rbind, rlisten, raccept, rconnect

       rshutdown, rclose

       rrecv, rrecvfrom, rrecvmsg, rread, rreadv

       rsend, rsendto, rsendmsg, rwrite, rwritev

       rpoll, rselect

       rgetpeername, rgetsockname

       rsetsockopt, rgetsockopt, rfcntl

       Functions take the same parameters as those used for sockets.  The
       following capabilities and flags are supported at this time:

       PF_INET, PF_INET6, SOCK_STREAM, SOCK_DGRAM

       SOL_SOCKET - SO_ERROR, SO_KEEPALIVE (flag supported, but ignored),
       SO_LINGER, SO_OOBINLINE, SO_RCVBUF, SO_REUSEADDR, SO_SNDBUF

       IPPROTO_TCP - TCP_NODELAY, TCP_MAXSEG

       IPPROTO_IPV6 - IPV6_V6ONLY

       MSG_DONTWAIT, MSG_PEEK, O_NONBLOCK
       
       Rsockets provides extensions beyond normal socket routines that allow
       for direct placement of data into an application's buffer.  This is
       also known as zero-copy support, since data is sent and received
       directly, bypassing copies into network-controlled buffers.  The
       following calls and options support direct data placement.

       riomap, riounmap, riowrite

       off_t riomap(int socket, void *buf, size_t len, int prot, int flags,
       off_t offset)

       Riomap registers an application buffer with the RDMA hardware
       associated with an rsocket.  The buffer is registered either for
       local-only access (PROT_NONE) or for remote write access
       (PROT_WRITE).  When registered for remote access, the buffer is mapped
       to a given offset.  The offset is either provided by the user or, if
       the user passes -1 for the offset, selected by rsockets.  The remote
       peer may access an iomapped buffer directly by specifying the correct
       offset.  The mapping is not guaranteed to be available until after the
       remote peer receives a data transfer initiated after riomap has
       completed.

       In order to enable the use of remote IO mapping calls on an rsocket, an
       application must set the number of IO mappings that are available to
       the remote peer.  This may be done using the rsetsockopt RDMA_IOMAPSIZE
       option.  By default, an rsocket does not support remote IO mappings.
       
       riounmap

       int riounmap(int socket, void *buf, size_t len)

       Riounmap removes the mapping between a buffer and an rsocket.

       riowrite

       size_t riowrite(int socket, const void *buf, size_t count, off_t
       offset, int flags)

       Riowrite allows an application to transfer data over an rsocket
       directly into a remotely iomapped buffer.  The remote buffer is
       specified through an offset parameter, which corresponds to a remote
       iomapped buffer.  From the sender's perspective, riowrite behaves
       similarly to rwrite.  From the receiver's view, riowrite transfers are
       silently redirected into a predetermined data buffer.  Data is
       received automatically, and the receiver is not informed of the
       transfer.  However, riowrite data is still considered part of the data
       stream, such that riowrite data will be written before a subsequent
       transfer is received.  A message sent immediately after initiating a
       riowrite may be used to notify the receiver of the riowrite.
       
       In addition to standard socket options, rsockets supports options
       specific to RDMA devices and protocols.  These options are accessible
       through rsetsockopt using the SOL_RDMA option level.

       RDMA_SQSIZE - Integer size of the underlying send queue.

       RDMA_RQSIZE - Integer size of the underlying receive queue.

       RDMA_INLINE - Integer size of inline data.

       RDMA_IOMAPSIZE - Integer number of remote IO mappings supported.

       RDMA_ROUTE - struct ibv_path_data of path record for connection.
       
       Note that rsocket fds cannot be passed into non-rsocket calls.  For
       applications that must mix rsocket fds with standard socket fds or
       opened files, rpoll and rselect support polling both rsockets and
       normal fds.
       
       Existing applications can make use of rsockets through a preload
       library.  Because rsockets implements an end-to-end protocol, both
       sides of a connection must use rsockets.  The rdma_cm library provides
       such a preload library, librspreload.  To reduce the chance of the
       preload library intercepting calls without the user's explicit
       knowledge, the librspreload library is installed into the
       %libdir%/rsocket subdirectory.

       The preload library can be used by setting LD_PRELOAD when running an
       application.  Note that not all applications will work with rsockets.
       Support is limited based on the socket options used by the
       application.  Support for fork() is limited, but available.  To use
       rsockets with the preload library for applications that call fork,
       users must set the environment variable RDMAV_FORK_SAFE=1 on both the
       client and server side of the connection.  In general, fork is
       supportable for server applications that accept a connection, then
       fork off a process to handle the new connection.
       
       rsockets uses configuration files that give an administrator control
       over the default settings used by rsockets.  Use files under
       /etc/rdma/rsocket as shown:

       mem_default - default size of receive buffer(s)

       wmem_default - default size of send buffer(s)

       sqsize_default - default size of send queue

       rqsize_default - default size of receive queue

       inline_default - default size of inline data

       iomap_size - default size of remote iomapping table

       polling_time - default number of microseconds to poll for data before
       waiting

       wake_up_interval - maximum number of milliseconds to block in poll.
       This value is used to safeguard against potential application hangs in
       rpoll().

       All configuration files should contain a single integer value.  Values
       may be set by issuing a command similar to the following example.

       echo 1000000 > /etc/rdma/rsocket/mem_default

       If configuration files are not available, rsockets uses internal
       defaults.  Applications can override default values programmatically
       through the rsetsockopt routine.
       
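       An application that wants to inspect an administrator's setting
       itself could read such a single-integer file as sketched below.  This
       is not how rsockets reads its configuration internally; the helper
       name and the caller-supplied fallback are assumptions made for
       illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a configuration file such
 * as /etc/rdma/rsocket/mem_default.  If the file is missing or does not
 * start with an integer, the caller's fallback value is returned.
 */
long read_int_setting(const char *path, long fallback)
{
    FILE *f = fopen(path, "r");
    long value;

    if (!f)
        return fallback;          /* file not available */
    if (fscanf(f, "%ld", &value) != 1)
        value = fallback;         /* file present but not an integer */
    fclose(f);
    return value;
}
```

       The value obtained this way could then be compared with, or replaced
       by, a per-socket setting applied through rsetsockopt.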

SEE ALSO

       rdma_cm(7)

librdmacm                         2019-04-16                        RSOCKET(7)