io_uring_register(2)       Linux Programmer's Manual      io_uring_register(2)

NAME

       io_uring_register - register files or user buffers for asynchronous I/O

SYNOPSIS

       #include <liburing.h>

       int io_uring_register(unsigned int fd, unsigned int opcode,
                             void *arg, unsigned int nr_args);

DESCRIPTION

       The io_uring_register(2) system call registers resources (e.g. user
       buffers, files, eventfd, personality, restrictions) for use in an
       io_uring(7) instance referenced by fd. Registering files or user
       buffers allows the kernel to take long term references to internal
       data structures or create long term mappings of application memory,
       greatly reducing per-I/O overhead.

       fd is the file descriptor returned by a call to io_uring_setup(2). If
       opcode has the flag IORING_REGISTER_USE_REGISTERED_RING ored into it,
       fd is instead the index of a registered ring fd.

       opcode can be one of:

       IORING_REGISTER_BUFFERS
              arg points to a struct iovec array of nr_args entries. The
              buffers associated with the iovecs will be locked in memory and
              charged against the user's RLIMIT_MEMLOCK resource limit. See
              getrlimit(2) for more information. Additionally, there is a
              size limit of 1GiB per buffer. Currently, the buffers must be
              anonymous, non-file-backed memory, such as that returned by
              malloc(3) or mmap(2) with the MAP_ANONYMOUS flag set. It is
              expected that this limitation will be lifted in the future.
              Huge pages are supported as well. Note that the entire huge
              page will be pinned in the kernel, even if only a portion of it
              is used.

              After a successful call, the supplied buffers are mapped into
              the kernel and eligible for I/O. To make use of them, the
              application must specify the IORING_OP_READ_FIXED or
              IORING_OP_WRITE_FIXED opcodes in the submission queue entry
              (see the struct io_uring_sqe definition in io_uring_enter(2)),
              and set the buf_index field to the desired buffer index. The
              memory range described by the submission queue entry's addr and
              len fields must fall within the indexed buffer.

              It is perfectly valid to set up a large buffer and then only
              use part of it for an I/O, as long as the range is within the
              originally mapped region.

              An application can increase or decrease the size or number of
              registered buffers by first unregistering the existing buffers,
              and then issuing a new call to io_uring_register(2) with the
              new buffers.

              Note that before 5.13, registering buffers would wait for the
              ring to idle: if the application had requests in-flight, the
              registration waited for those to finish before proceeding.

              An application need not unregister buffers explicitly before
              shutting down the io_uring instance. Note, however, that
              shutdown processing may run asynchronously within the kernel.
              As a result, it is not guaranteed that pages are immediately
              unpinned in this case. Available since 5.1.
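
              The registration described above can be sketched as follows.
              This is a minimal illustration, not a reference
              implementation: it assumes the raw system call is reached
              through syscall(2) rather than liburing, and the helper names
              sys_io_uring_register() and register_one_buffer() are
              hypothetical. The wrapper converts the errno reported by
              syscall(2) into a negative return value so that it matches the
              convention described under RETURN VALUE.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: invoke the raw system call and return -errno on
 * failure, mirroring the negative-error convention of this page. */
int sys_io_uring_register(int fd, unsigned int opcode,
                          const void *arg, unsigned int nr_args)
{
    long ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : (int)ret;
}

/* Hypothetical helper: register a single anonymous buffer. With a one-entry
 * iovec array, a later IORING_OP_READ_FIXED or IORING_OP_WRITE_FIXED
 * request would refer to it with buf_index 0. */
int register_one_buffer(int ring_fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };

    /* nr_args is the number of iovec entries, here 1. */
    return sys_io_uring_register(ring_fd, IORING_REGISTER_BUFFERS, &iov, 1);
}
```

              A caller would pass the descriptor returned by
              io_uring_setup(2) as ring_fd and check for a negative result.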

       IORING_REGISTER_BUFFERS2
              Register buffers for I/O. Similar to IORING_REGISTER_BUFFERS but
              aims to have a more extensible ABI.

              arg points to a struct io_uring_rsrc_register, and nr_args
              should be set to the number of bytes in the structure.

                  struct io_uring_rsrc_register {
                      __u32 nr;
                      __u32 resv;
                      __u64 resv2;
                      __aligned_u64 data;
                      __aligned_u64 tags;
                  };

              The data field contains a pointer to a struct iovec array of nr
              entries. The tags field should either be 0, in which case
              tagging is disabled, or point to an array of nr "tags"
              (unsigned 64 bit integers). If a tag is zero, then tagging for
              this particular resource (a buffer in this case) is disabled.
              Otherwise, after the resource has been unregistered and is no
              longer in use, a CQE will be posted with user_data set to the
              specified tag and all other fields zeroed.

              Note that resource updates, e.g. IORING_REGISTER_BUFFERS_UPDATE,
              don't necessarily deallocate resources by the time they return;
              the resources might be held alive until all requests using them
              complete.

              Available since 5.13.
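
              As a rough sketch of the layout described above, the structure
              could be filled in as shown below. The helper names
              fill_buffer_registration() and register_tagged_buffers() are
              hypothetical, the raw system call is assumed to be reached
              through syscall(2), and the kernel headers are assumed to be
              recent enough (5.13 or later) to define struct
              io_uring_rsrc_register. Note that nr_args carries the size of
              the structure, not the number of buffers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: describe nr buffers in a zeroed registration
 * structure. tags may be NULL, which leaves the tags field 0 and thus
 * disables tagging for the whole registration. */
void fill_buffer_registration(struct io_uring_rsrc_register *reg,
                              const struct iovec *iov,
                              const uint64_t *tags, unsigned int nr)
{
    memset(reg, 0, sizeof(*reg));          /* leave reserved fields zero */
    reg->nr = nr;
    reg->data = (uint64_t)(uintptr_t)iov;  /* pointer to the iovec array */
    reg->tags = (uint64_t)(uintptr_t)tags; /* pointer to nr tags, or 0 */
}

/* Hypothetical helper: register the buffers, returning -errno on failure. */
int register_tagged_buffers(int ring_fd, const struct iovec *iov,
                            const uint64_t *tags, unsigned int nr)
{
    struct io_uring_rsrc_register reg;
    long ret;

    fill_buffer_registration(&reg, iov, tags, nr);
    /* nr_args is the number of bytes in the structure. */
    ret = syscall(__NR_io_uring_register, ring_fd, IORING_REGISTER_BUFFERS2,
                  &reg, (unsigned int)sizeof(reg));
    return ret < 0 ? -errno : (int)ret;
}
```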

       IORING_REGISTER_BUFFERS_UPDATE
              Updates registered buffers with new ones, either turning a
              sparse entry into a real one, or replacing an existing entry.

              arg must contain a pointer to a struct io_uring_rsrc_update2,
              which contains an offset on which to start the update, and an
              array of struct iovec. tags points to an array of tags. nr
              must contain the number of descriptors in the passed-in arrays.
              See IORING_REGISTER_BUFFERS2 for the resource tagging
              description.

                  struct io_uring_rsrc_update2 {
                      __u32 offset;
                      __u32 resv;
                      __aligned_u64 data;
                      __aligned_u64 tags;
                      __u32 nr;
                      __u32 resv2;
                  };

              Available since 5.13.

       IORING_UNREGISTER_BUFFERS
              This operation takes no argument, and arg must be passed as
              NULL. All previously registered buffers associated with the
              io_uring instance will be released synchronously. Available
              since 5.1.

       IORING_REGISTER_FILES
              Register files for I/O. arg contains a pointer to an array of
              nr_args file descriptors (signed 32 bit integers).

              To make use of the registered files, the IOSQE_FIXED_FILE flag
              must be set in the flags member of the struct io_uring_sqe, and
              the fd member is set to the index of the file in the file
              descriptor array.

              The file set may be sparse, meaning that the fd field in the
              array may be set to -1. See IORING_REGISTER_FILES_UPDATE for
              how to update files in place.

              Note that before 5.13, registering files would wait for the
              ring to idle: if the application had requests in-flight, the
              registration waited for those to finish before proceeding. See
              IORING_REGISTER_FILES_UPDATE for how to update an existing set
              without that limitation.

              Files are automatically unregistered when the io_uring instance
              is torn down. An application needs to unregister only if it
              wishes to register a new set of fds. Available since 5.1.

       IORING_REGISTER_FILES2
              Register files for I/O. Similar to IORING_REGISTER_FILES.

              arg points to a struct io_uring_rsrc_register, and nr_args
              should be set to the number of bytes in the structure.

              The data field contains a pointer to an array of nr file
              descriptors (signed 32 bit integers). The tags field should
              either be 0 or point to an array of nr "tags" (unsigned 64 bit
              integers). See IORING_REGISTER_BUFFERS2 for more info on
              resource tagging.

              Note that resource updates, e.g. IORING_REGISTER_FILES_UPDATE,
              don't necessarily deallocate resources immediately; they might
              be held until all requests using that resource complete.

              Available since 5.13.

       IORING_REGISTER_FILES_UPDATE
              This operation replaces existing files in the registered file
              set with new ones, either turning a sparse entry (one where fd
              is equal to -1) into a real one, removing an existing entry
              (new one is set to -1), or replacing an existing entry with a
              new one.

              arg must contain a pointer to a struct io_uring_files_update,
              which contains an offset on which to start the update, and an
              array of file descriptors to use for the update. nr_args must
              contain the number of descriptors in the passed-in array.
              Available since 5.5.

              File descriptors can be skipped if they are set to
              IORING_REGISTER_FILES_SKIP. Skipping an fd will not touch the
              file associated with the previous fd at that index. Available
              since 5.12.
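
              A single-slot update might look like the sketch below. It is
              an illustration under stated assumptions: the helper name
              update_registered_file() is hypothetical, the raw system call
              is reached through syscall(2), and the fds member of struct
              io_uring_files_update (as defined in <linux/io_uring.h>) is
              taken to hold the address of the descriptor array.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: replace the registered file at index slot with
 * new_fd. Passing -1 as new_fd removes the entry, leaving the slot sparse.
 * Returns a negative error value on failure. */
int update_registered_file(int ring_fd, unsigned int slot, int new_fd)
{
    struct io_uring_files_update up;
    long ret;

    memset(&up, 0, sizeof(up));
    up.offset = slot;                        /* where the update starts */
    up.fds = (uint64_t)(uintptr_t)&new_fd;   /* array of one descriptor */

    /* nr_args is the number of descriptors in the array, here 1. */
    ret = syscall(__NR_io_uring_register, ring_fd,
                  IORING_REGISTER_FILES_UPDATE, &up, 1);
    return ret < 0 ? -errno : (int)ret;
}
```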

       IORING_REGISTER_FILES_UPDATE2
              Similar to IORING_REGISTER_FILES_UPDATE, replaces existing files
              in the registered file set with new ones, either turning a
              sparse entry (one where fd is equal to -1) into a real one,
              removing an existing entry (new one is set to -1), or replacing
              an existing entry with a new one.

              arg must contain a pointer to a struct io_uring_rsrc_update2,
              which contains an offset on which to start the update, and an
              array of file descriptors to use for the update stored in data.
              tags points to an array of tags. nr must contain the number of
              descriptors in the passed-in arrays. See
              IORING_REGISTER_BUFFERS2 for the resource tagging description.

              Available since 5.13.

       IORING_UNREGISTER_FILES
              This operation requires no argument, and arg must be passed as
              NULL. All previously registered files associated with the
              io_uring instance will be unregistered. Available since 5.1.

       IORING_REGISTER_EVENTFD
              It's possible to use eventfd(2) to get notified of completion
              events on an io_uring instance. If this is desired, an eventfd
              file descriptor can be registered through this operation. arg
              must contain a pointer to the eventfd file descriptor, and
              nr_args must be 1. Note that while io_uring generally takes
              care to avoid spurious events, they can occur. Similarly,
              batched completions of CQEs may only trigger a single eventfd
              notification even if multiple CQEs are posted. The application
              should not assume that the number of available events
              correlates directly with the number of eventfd notifications
              posted. An eventfd notification must thus only be treated as a
              hint to check the CQ ring for completions. Available since 5.2.

              An application can temporarily disable notifications coming
              through the registered eventfd by setting the
              IORING_CQ_EVENTFD_DISABLED bit in the flags field of the CQ
              ring. Available since 5.8.
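
              The following sketch shows one way the eventfd registration
              could be done. The helper name register_completion_eventfd()
              is hypothetical and the raw system call is assumed to be
              reached through syscall(2). A reader of the returned
              descriptor should still treat each wakeup only as a hint to
              check the CQ ring.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create an eventfd and register it with the ring.
 * Returns the eventfd descriptor on success, or a negative error value. */
int register_completion_eventfd(int ring_fd)
{
    int efd = eventfd(0, EFD_CLOEXEC);
    long ret;

    if (efd < 0)
        return -errno;

    /* arg points to the eventfd descriptor and nr_args must be 1. */
    ret = syscall(__NR_io_uring_register, ring_fd,
                  IORING_REGISTER_EVENTFD, &efd, 1);
    if (ret < 0) {
        int err = errno;    /* save errno before close() can change it */
        close(efd);
        return -err;
    }
    return efd;
}
```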

       IORING_REGISTER_EVENTFD_ASYNC
              This works just like IORING_REGISTER_EVENTFD, except
              notifications are only posted for events that complete in an
              async manner. This means that events that complete inline while
              being submitted do not trigger a notification event. The
              arguments supplied are the same as for IORING_REGISTER_EVENTFD.
              Available since 5.6.

       IORING_UNREGISTER_EVENTFD
              Unregister an eventfd file descriptor to stop notifications.
              Since only one eventfd descriptor is currently supported, this
              operation takes no argument, and arg must be passed as NULL and
              nr_args must be zero. Available since 5.2.

       IORING_REGISTER_PROBE
              This operation returns a structure, io_uring_probe, which
              contains information about the opcodes supported by io_uring on
              the running kernel. arg must contain a pointer to a struct
              io_uring_probe, and nr_args must contain the size of the ops
              array in that probe struct. The ops array is of the type
              io_uring_probe_op, which holds the value of the opcode and a
              flags field. If the flags field has IO_URING_OP_SUPPORTED set,
              then this opcode is supported on the running kernel. Available
              since 5.6.
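
              A probe query could be sketched as below. Several details are
              assumptions rather than statements from this page: the helper
              names are hypothetical, the ops array is assumed to be indexed
              by opcode value, the structure is zero-filled before the call,
              and 256 is used as an assumed upper bound on the number of ops
              entries.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: return 1 if a filled-in probe marks op as
 * supported, 0 otherwise. Assumes ops[] is indexed by opcode value. */
int probe_op_supported(const struct io_uring_probe *p, unsigned int op)
{
    if (op > p->last_op || op >= p->ops_len)
        return 0;
    return (p->ops[op].flags & IO_URING_OP_SUPPORTED) != 0;
}

/* Hypothetical helper: ask the kernel whether op is supported.
 * Returns 1 or 0, or a negative error value if the probe fails. */
int kernel_supports_op(int ring_fd, unsigned int op)
{
    enum { NR_OPS = 256 };  /* assumed maximum size of the ops array */
    size_t len = sizeof(struct io_uring_probe) +
                 NR_OPS * sizeof(struct io_uring_probe_op);
    struct io_uring_probe *p = calloc(1, len);  /* zero-filled on input */
    int result;

    if (p == NULL)
        return -ENOMEM;

    /* nr_args is the number of entries in the ops array. */
    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_PROBE, p, NR_OPS) < 0) {
        int err = errno;    /* save errno before free() can change it */
        free(p);
        return -err;
    }
    result = probe_op_supported(p, op);
    free(p);
    return result;
}
```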

       IORING_REGISTER_PERSONALITY
              This operation registers credentials of the running application
              with io_uring, and returns an id associated with these
              credentials. Applications wishing to share a ring between
              separate users/processes can pass in this credential id in the
              sqe personality field. If set, that particular sqe will be
              issued with these credentials. Must be invoked with arg set to
              NULL and nr_args set to zero. Available since 5.6.

       IORING_UNREGISTER_PERSONALITY
              This operation unregisters a previously registered personality
              with io_uring. nr_args must be set to the id in question, and
              arg must be set to NULL. Available since 5.6.

       IORING_REGISTER_ENABLE_RINGS
              This operation enables an io_uring ring started in a disabled
              state (IORING_SETUP_R_DISABLED was specified in the call to
              io_uring_setup(2)). While the io_uring ring is disabled,
              submissions are not allowed and registrations are not
              restricted.

              After the execution of this operation, the io_uring ring is
              enabled: submissions and registrations are allowed, but they
              will be validated following the registered restrictions (if
              any). This operation takes no argument and must be invoked with
              arg set to NULL and nr_args set to zero. Available since 5.10.

       IORING_REGISTER_RESTRICTIONS
              arg points to a struct io_uring_restriction array of nr_args
              entries.

              With an entry it is possible to allow an io_uring_register(2)
              opcode, or specify which opcode and flags of the submission
              queue entry are allowed, or require certain flags to be
              specified (these flags must be set on each submission queue
              entry).

              All the restrictions must be submitted with a single
              io_uring_register(2) call and they are handled as an allowlist
              (opcodes and flags not registered are not allowed).

              Restrictions can be registered only if the io_uring ring started
              in a disabled state (IORING_SETUP_R_DISABLED must be specified
              in the call to io_uring_setup(2)).

              Available since 5.10.
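
              The allowlist flow described above, followed by enabling the
              ring, might be sketched like this. The helper name
              restrict_to_read_write() is hypothetical, and the field and
              constant names (sqe_op, IORING_RESTRICTION_SQE_OP) are taken
              from <linux/io_uring.h> rather than from this page. The ring
              is assumed to have been created with IORING_SETUP_R_DISABLED.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: allow only IORING_OP_READ and IORING_OP_WRITE
 * submissions on a ring created disabled, then enable the ring.
 * Returns 0 on success or a negative error value. */
int restrict_to_read_write(int ring_fd)
{
    struct io_uring_restriction res[2];

    memset(res, 0, sizeof(res));
    res[0].opcode = IORING_RESTRICTION_SQE_OP;
    res[0].sqe_op = IORING_OP_READ;
    res[1].opcode = IORING_RESTRICTION_SQE_OP;
    res[1].sqe_op = IORING_OP_WRITE;

    /* All restrictions go in one call; nr_args is the entry count. */
    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_RESTRICTIONS, res, 2) < 0)
        return -errno;

    /* Enabling takes no argument: arg is NULL and nr_args is zero. */
    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_ENABLE_RINGS, NULL, 0) < 0)
        return -errno;
    return 0;
}
```

              Since no register opcode is allowlisted in this sketch, any
              further io_uring_register(2) call on the enabled ring would be
              expected to fail with EACCES.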

       IORING_REGISTER_IOWQ_AFF
              By default, async workers created by io_uring will inherit the
              CPU mask of their parent. This is usually all the CPUs in the
              system, unless the parent is being run with a limited set. If
              this isn't the desired outcome, the application may explicitly
              tell io_uring what CPUs the async workers may run on. arg must
              point to a cpu_set_t mask, and nr_args must be the byte size of
              that mask.

              Available since 5.14.

       IORING_UNREGISTER_IOWQ_AFF
              Undoes a CPU mask previously set with IORING_REGISTER_IOWQ_AFF.
              Must not have arg or nr_args set.

              Available since 5.14.

       IORING_REGISTER_IOWQ_MAX_WORKERS
              By default, io_uring limits the number of unbounded workers
              created to the maximum process count set by RLIMIT_NPROC, and
              the number of bounded workers is a function of the SQ ring size
              and the number of CPUs in the system. Sometimes this can be
              excessive (or too little, for bounded), and this command
              provides a way to change the count per ring (per NUMA node)
              instead.

              arg must be set to an unsigned int pointer to an array of two
              values, with the values in the array being set to the maximum
              count of workers per NUMA node. Index 0 holds the bounded worker
              count, and index 1 holds the unbounded worker count. On
              successful return, the passed-in array will contain the previous
              maximum values for each type. If the count being passed in is
              0, then this command returns the current maximum values and
              doesn't modify the current setting. nr_args must be set to 2,
              as the command takes two values.

              Available since 5.15.
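
              Reading the current limits without changing them, as described
              above, could be sketched as follows. The helper name
              get_iowq_max_workers() is hypothetical and the raw system call
              is assumed to be reached through syscall(2).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: query the current worker limits. On success,
 * counts[0] holds the bounded limit and counts[1] the unbounded limit.
 * Returns 0 on success or a negative error value. */
int get_iowq_max_workers(int ring_fd, unsigned int counts[2])
{
    /* Passing 0 for a count asks for the current value and leaves the
     * setting unmodified. */
    counts[0] = 0;
    counts[1] = 0;

    /* nr_args must be 2, as the command takes two values. */
    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_IOWQ_MAX_WORKERS, counts, 2) < 0)
        return -errno;
    return 0;
}
```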

       IORING_REGISTER_RING_FDS
              Whenever io_uring_enter(2) is called to submit requests or wait
              for completions, the kernel must grab a reference to the file
              descriptor. If the application using io_uring is threaded, the
              file table is marked as shared, and the reference grab and put
              of the file descriptor count is more expensive than it is for a
              non-threaded application.

              Similarly to how io_uring allows registration of files, this
              allows registration of the ring file descriptor itself. This
              reduces the overhead of the io_uring_enter(2) system call.

              arg must point to an array of struct io_uring_rsrc_update of
              nr_args entries. The data field of this struct must contain an
              io_uring file descriptor, and the offset field can be either -1
              or an explicit offset desired for the registered file
              descriptor value. If -1 is used, then upon successful return of
              this system call, the field will contain the value of the
              registered file descriptor to be used for future
              io_uring_enter(2) system calls.

              On successful completion of this request, the returned
              descriptors may be used instead of the real file descriptor for
              io_uring_enter(2), provided that IORING_ENTER_REGISTERED_RING is
              set in the flags for the system call. This flag tells the
              kernel that a registered descriptor is used rather than a real
              file descriptor.

              Each thread or process using a ring must register the file
              descriptor directly by issuing this request.

              The maximum number of supported registered ring descriptors is
              currently limited to 16.

              Available since 5.18.

       IORING_UNREGISTER_RING_FDS
              Unregister descriptors previously registered with
              IORING_REGISTER_RING_FDS.

              arg must point to an array of struct io_uring_rsrc_update of
              nr_args entries. Only the offset field should be set in the
              structure, containing the registered file descriptor offset
              previously returned from IORING_REGISTER_RING_FDS that the
              application wishes to unregister.

              Note that this isn't done automatically on ring exit if the
              thread or task that previously registered a ring file descriptor
              isn't exiting. It is recommended to manually unregister any
              previously registered ring descriptors if the ring is closed
              and the task persists. This will free up a registration slot,
              making it available for future use.

              Available since 5.18.

       IORING_REGISTER_PBUF_RING
              Registers a shared buffer ring to be used with provided buffers.
              This is a newer and more efficient alternative to
              IORING_OP_PROVIDE_BUFFERS, to be used with request types that
              support the IOSQE_BUFFER_SELECT flag.

              The arg argument must be filled in with the appropriate
              information. It looks as follows:

                  struct io_uring_buf_reg {
                      __u64 ring_addr;
                      __u32 ring_entries;
                      __u16 bgid;
                      __u16 pad;
                      __u64 resv[3];
                  };

              The ring_addr field must contain the address of the memory
              allocated to fit this ring. The memory must be page aligned and
              hence allocated appropriately using e.g. posix_memalign(3) or
              similar. The size of the ring is the product of ring_entries
              and the size of struct io_uring_buf. ring_entries is the
              desired size of the ring, and must be a power of 2. The maximum
              size allowed is 2^15 (32768). bgid is the buffer group ID
              associated with this ring. SQEs that select a buffer have a
              buffer group associated with them in their buf_group field, and
              the associated CQEs will have IORING_CQE_F_BUFFER set in their
              flags member, which will also contain the specific ID of the
              buffer selected. The rest of the fields are reserved and must
              be cleared to zero.

              nr_args must be set to 1.

              Also see io_uring_register_buf_ring(3) for more details.
              Available since 5.19.
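
              The sizing and alignment rules above can be illustrated with
              the sketch below. It deliberately avoids the kernel headers:
              the helper names pbuf_ring_bytes() and alloc_pbuf_ring() are
              hypothetical, and the caller is assumed to pass
              sizeof(struct io_uring_buf) as entry_size. The returned
              address is what would be stored in ring_addr.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: bytes needed for a ring of the given number of
 * entries, or 0 if entries is not a power of 2 in the range 1..32768. */
size_t pbuf_ring_bytes(unsigned int entries, size_t entry_size)
{
    if (entries == 0 || entries > 32768u)
        return 0;
    if ((entries & (entries - 1)) != 0)     /* not a power of 2 */
        return 0;
    return (size_t)entries * entry_size;
}

/* Hypothetical helper: allocate zeroed, page-aligned memory for the ring.
 * Returns NULL if the entry count is invalid or allocation fails. */
void *alloc_pbuf_ring(unsigned int entries, size_t entry_size)
{
    size_t bytes = pbuf_ring_bytes(entries, entry_size);
    long page = sysconf(_SC_PAGESIZE);
    void *mem = NULL;

    if (bytes == 0 || page <= 0)
        return NULL;
    if (posix_memalign(&mem, (size_t)page, bytes) != 0)
        return NULL;
    memset(mem, 0, bytes);
    return mem;
}
```

              The memory is released with free(3) once the ring has been
              unregistered.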

       IORING_UNREGISTER_PBUF_RING
              Unregister a previously registered provided buffer ring. arg
              must be set to the address of a struct io_uring_buf_reg, with
              just the bgid field set to the buffer group ID of the previously
              registered provided buffer group. nr_args must be set to 1.
              Also see IORING_REGISTER_PBUF_RING.

              Available since 5.19.

       IORING_REGISTER_SYNC_CANCEL
              Performs a synchronous cancelation request, which works in a
              similar fashion to IORING_OP_ASYNC_CANCEL except it completes
              inline. This can be useful for scenarios where cancelations
              should happen synchronously, rather than needing to issue an SQE
              and wait for completion of that specific CQE.

              arg must be set to a pointer to a struct
              io_uring_sync_cancel_reg structure, with the details filled in
              for what request(s) to target for cancelation. See
              io_uring_register_sync_cancel(3) for details on that. The
              return values are the same, except they are passed back
              synchronously rather than through the CQE res field. nr_args
              must be set to 1.

              Available since 6.0.

       IORING_REGISTER_FILE_ALLOC_RANGE
              Sets the allowable range for fixed file index allocations within
              the kernel. When requests that can instantiate a new fixed file
              are used with IORING_FILE_INDEX_ALLOC, the application is asking
              the kernel to allocate a new fixed file descriptor rather than
              pass in a specific value for one. By default, the kernel will
              pick any available fixed file descriptor within the range
              available. This effectively allows the application to set aside
              a range just for dynamic allocations, with the remainder being
              used for specific values.

              nr_args must be set to 1 and arg must be set to a pointer to a
              struct io_uring_file_index_range:

                  struct io_uring_file_index_range {
                      __u32 off;
                      __u32 len;
                      __u64 resv;
                  };

              with off being set to the starting value for the range, and len
              being set to the number of descriptors. The reserved resv field
              must be cleared to zero.

              The application must have registered a file table first.

              Available since 6.0.

RETURN VALUE

       On success, io_uring_register(2) returns either 0 or a positive value,
       depending on the opcode used. On error, a negative error value is
       returned. The caller should not rely on the errno variable.

ERRORS

       EACCES The opcode field is not allowed due to registered restrictions.

       EBADF  One or more fds in the fd array are invalid.

       EBADFD IORING_REGISTER_ENABLE_RINGS or IORING_REGISTER_RESTRICTIONS was
              specified, but the io_uring ring is not disabled.

       EBUSY  IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES or
              IORING_REGISTER_RESTRICTIONS was specified, but there were
              already buffers, files, or restrictions registered.

       EFAULT buffer is outside of the process' accessible address space, or
              iov_len is greater than 1GiB.

       EINVAL IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES was specified,
              but nr_args is 0.

       EINVAL IORING_REGISTER_BUFFERS was specified, but nr_args exceeds
              UIO_MAXIOV.

       EINVAL IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
              specified, and nr_args is non-zero or arg is non-NULL.

       EINVAL IORING_REGISTER_RESTRICTIONS was specified, but nr_args exceeds
              the maximum allowed number of restrictions or restriction opcode
              is invalid.

       EMFILE IORING_REGISTER_FILES was specified and nr_args exceeds the
              maximum allowed number of files in a fixed file set.

       EMFILE IORING_REGISTER_FILES was specified and adding nr_args file
              references would exceed the maximum allowed number of files the
              user is allowed to have according to the RLIMIT_NOFILE resource
              limit and the caller does not have CAP_SYS_RESOURCE capability.
              Note that this is a per user limit, not per process.

       ENOMEM Insufficient kernel resources are available, or the caller had a
              non-zero RLIMIT_MEMLOCK soft resource limit, but tried to lock
              more memory than the limit permitted. This limit is not
              enforced if the process is privileged (CAP_IPC_LOCK).

       ENXIO  IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
              specified, but there were no buffers or files registered.

       ENXIO  Attempt to register files or buffers on an io_uring instance
              that is already undergoing file or buffer registration, or is
              being torn down.

       EOPNOTSUPP
              User buffers point to file-backed memory.

       EFAULT User buffers point to file-backed memory (newer kernels).

Linux                             2019-01-17              io_uring_register(2)