io_uring_register(2)       Linux Programmer's Manual      io_uring_register(2)

NAME

       io_uring_register - register files or user buffers for asynchronous I/O

SYNOPSIS

       #include <liburing.h>

       int io_uring_register(unsigned int fd, unsigned int opcode,
                             void *arg, unsigned int nr_args);

DESCRIPTION

       The io_uring_register(2) system call registers resources (e.g. user
       buffers, files, eventfd, personality, restrictions) for use in an
       io_uring(7) instance referenced by fd. Registering files or user
       buffers allows the kernel to take long-term references to internal data
       structures or create long-term mappings of application memory, greatly
       reducing per-I/O overhead.

       fd is the file descriptor returned by a call to io_uring_setup(2).
       opcode can be one of:


       IORING_REGISTER_BUFFERS
              arg points to a struct iovec array of nr_args entries. The
              buffers associated with the iovecs will be locked in memory and
              charged against the user's RLIMIT_MEMLOCK resource limit. See
              getrlimit(2) for more information. Additionally, there is a
              size limit of 1GiB per buffer. Currently, the buffers must be
              anonymous, non-file-backed memory, such as that returned by
              malloc(3) or mmap(2) with the MAP_ANONYMOUS flag set. It is
              expected that this limitation will be lifted in the future. Huge
              pages are supported as well. Note that the entire huge page will
              be pinned in the kernel, even if only a portion of it is used.

              After a successful call, the supplied buffers are mapped into
              the kernel and eligible for I/O. To make use of them, the
              application must specify the IORING_OP_READ_FIXED or
              IORING_OP_WRITE_FIXED opcodes in the submission queue entry (see
              the struct io_uring_sqe definition in io_uring_enter(2)), and
              set the buf_index field to the desired buffer index. The memory
              range described by the submission queue entry's addr and len
              fields must fall within the indexed buffer.

              It is perfectly valid to set up a large buffer and then use only
              part of it for an I/O, as long as the range is within the
              originally mapped region.

              An application can increase or decrease the size or number of
              registered buffers by first unregistering the existing buffers,
              and then issuing a new call to io_uring_register(2) with the new
              buffers.

              Note that before 5.13, registering buffers would wait for the
              ring to idle: if the application had requests in flight, the
              registration waited for those to finish before proceeding.

              An application need not unregister buffers explicitly before
              shutting down the io_uring instance. Available since 5.1.

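              The registration flow described above can be sketched as
              follows. This is a minimal illustration, not a reference
              implementation: it assumes the system calls are invoked
              directly through syscall(2) instead of liburing, and
              ring_open(), ring_register() and demo_register_buffer() are
              hypothetical helper names introduced only for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create a small ring; returns its fd or -errno. */
int ring_open(unsigned entries)
{
    struct io_uring_params p;

    memset(&p, 0, sizeof(p));
    int fd = (int)syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : fd;
}

/* Hypothetical helper: raw syscall(2) reports failure through errno, so
 * convert it to the negative-error convention used in RETURN VALUE. */
int ring_register(int fd, unsigned opcode, void *arg, unsigned nr_args)
{
    int ret = (int)syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : ret;
}

/* Register one anonymous buffer of len bytes, then release it again.
 * Returns 0 on success or a negative error value. */
int demo_register_buffer(size_t len)
{
    void *buf = NULL;
    int fd, ret;

    fd = ring_open(4);
    if (fd < 0)
        return fd;
    if (posix_memalign(&buf, 4096, len) != 0) {
        close(fd);
        return -ENOMEM;
    }

    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ret = ring_register(fd, IORING_REGISTER_BUFFERS, &iov, 1);
    if (ret == 0) {
        /* The buffer is now addressable as buf_index 0 by
         * IORING_OP_READ_FIXED / IORING_OP_WRITE_FIXED requests. */
        ret = ring_register(fd, IORING_UNREGISTER_BUFFERS, NULL, 0);
    }
    close(fd);
    free(buf);
    return ret;
}
```

              A caller would typically treat -ENOMEM here as a hint that
              RLIMIT_MEMLOCK is too low for the requested buffer size.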

       IORING_REGISTER_BUFFERS2
              Register buffers for I/O. Similar to IORING_REGISTER_BUFFERS but
              aims to have a more extensible ABI.

              arg points to a struct io_uring_rsrc_register, and nr_args
              should be set to the number of bytes in the structure.

                  struct io_uring_rsrc_register {
                      __u32 nr;
                      __u32 resv;
                      __u64 resv2;
                      __aligned_u64 data;
                      __aligned_u64 tags;
                  };

              The data field contains a pointer to a struct iovec array of nr
              entries. The tags field should either be 0, in which case
              tagging is disabled, or point to an array of nr "tags" (unsigned
              64-bit integers). If a tag is zero, then tagging for this
              particular resource (a buffer in this case) is disabled.
              Otherwise, after the resource has been unregistered and is no
              longer in use, a CQE will be posted with user_data set to the
              specified tag and all other fields zeroed.

              Note that resource updates, e.g. IORING_REGISTER_BUFFERS_UPDATE,
              don't necessarily deallocate resources by the time they return;
              the resources might be held alive until all requests using them
              complete.

              Available since 5.13.


       IORING_REGISTER_BUFFERS_UPDATE
              Updates registered buffers with new ones, either turning a
              sparse entry into a real one, or replacing an existing entry.

              arg must contain a pointer to a struct io_uring_rsrc_update2,
              which contains an offset at which to start the update, and a
              pointer to an array of struct iovec in data. tags points to an
              array of tags. nr must contain the number of descriptors in the
              passed-in arrays. See IORING_REGISTER_BUFFERS2 for the resource
              tagging description.

                  struct io_uring_rsrc_update2 {
                      __u32 offset;
                      __u32 resv;
                      __aligned_u64 data;
                      __aligned_u64 tags;
                      __u32 nr;
                      __u32 resv2;
                  };

              Available since 5.13.


       IORING_UNREGISTER_BUFFERS
              This operation takes no argument, and arg must be passed as
              NULL. All previously registered buffers associated with the
              io_uring instance will be released. Available since 5.1.


       IORING_REGISTER_FILES
              Register files for I/O. arg contains a pointer to an array of
              nr_args file descriptors (signed 32-bit integers).

              To make use of the registered files, the IOSQE_FIXED_FILE flag
              must be set in the flags member of the struct io_uring_sqe, and
              the fd member is set to the index of the file in the file
              descriptor array.

              The file set may be sparse, meaning that the fd field in the
              array may be set to -1. See IORING_REGISTER_FILES_UPDATE for how
              to update files in place.

              Note that before 5.13, registering files would wait for the ring
              to idle: if the application had requests in flight, the
              registration waited for those to finish before proceeding. See
              IORING_REGISTER_FILES_UPDATE for how to update an existing set
              without that limitation.

              Files are automatically unregistered when the io_uring instance
              is torn down. An application need only unregister if it wishes
              to register a new set of fds. Available since 5.1.


       IORING_REGISTER_FILES2
              Register files for I/O. Similar to IORING_REGISTER_FILES.

              arg points to a struct io_uring_rsrc_register, and nr_args
              should be set to the number of bytes in the structure.

              The data field contains a pointer to an array of nr file
              descriptors (signed 32-bit integers). The tags field should
              either be 0 or point to an array of nr "tags" (unsigned 64-bit
              integers). See IORING_REGISTER_BUFFERS2 for more info on
              resource tagging.

              Note that resource updates, e.g. IORING_REGISTER_FILES_UPDATE,
              don't necessarily deallocate resources; they might be held until
              all requests using that resource complete.

              Available since 5.13.


       IORING_REGISTER_FILES_UPDATE
              This operation replaces existing files in the registered file
              set with new ones, either turning a sparse entry (one where fd
              is equal to -1) into a real one, removing an existing entry
              (the new one is set to -1), or replacing an existing entry with
              a new one.

              arg must contain a pointer to a struct io_uring_files_update,
              which contains an offset at which to start the update, and an
              array of file descriptors to use for the update. nr_args must
              contain the number of descriptors in the passed-in array.
              Available since 5.5.

              File descriptors can be skipped if they are set to
              IORING_REGISTER_FILES_SKIP. Skipping an fd will not touch the
              file associated with the previous fd at that index. Available
              since 5.12.

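              A sparse file set and a later in-place update might be
              combined as in the sketch below. It is illustrative only: it
              assumes direct use of syscall(2) rather than liburing, uses
              /dev/null as a stand-in file, and ring_open(),
              ring_register() and demo_sparse_files() are hypothetical
              helper names.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create a small ring; returns its fd or -errno. */
int ring_open(unsigned entries)
{
    struct io_uring_params p;

    memset(&p, 0, sizeof(p));
    int fd = (int)syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : fd;
}

/* Hypothetical helper: map the errno convention of syscall(2) to a
 * negative error value. */
int ring_register(int fd, unsigned opcode, void *arg, unsigned nr_args)
{
    int ret = (int)syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : ret;
}

/* Register two sparse slots, then fill slot 1 in place.
 * Returns the result of the update call or a negative error value. */
int demo_sparse_files(void)
{
    int files[2] = { -1, -1 };   /* -1 marks a sparse entry */
    struct io_uring_files_update up;
    int fd, newfd, ret;

    fd = ring_open(4);
    if (fd < 0)
        return fd;

    ret = ring_register(fd, IORING_REGISTER_FILES, files, 2);
    if (ret == 0) {
        newfd = open("/dev/null", O_RDONLY);
        if (newfd < 0) {
            ret = -errno;
        } else {
            memset(&up, 0, sizeof(up));
            up.offset = 1;                    /* first slot to update */
            up.fds = (unsigned long)&newfd;   /* array of one descriptor */
            ret = ring_register(fd, IORING_REGISTER_FILES_UPDATE, &up, 1);
            close(newfd);   /* the ring keeps its own reference */
        }
    }
    close(fd);
    return ret;
}
```

              Requests would then reference the file by setting
              IOSQE_FIXED_FILE and using index 1 in the fd member.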

       IORING_REGISTER_FILES_UPDATE2
              Similar to IORING_REGISTER_FILES_UPDATE, replaces existing files
              in the registered file set with new ones, either turning a
              sparse entry (one where fd is equal to -1) into a real one,
              removing an existing entry (the new one is set to -1), or
              replacing an existing entry with a new one.

              arg must contain a pointer to a struct io_uring_rsrc_update2,
              which contains an offset at which to start the update, and an
              array of file descriptors to use for the update stored in data.
              tags points to an array of tags. nr must contain the number of
              descriptors in the passed-in arrays. See
              IORING_REGISTER_BUFFERS2 for the resource tagging description.

              Available since 5.13.


       IORING_UNREGISTER_FILES
              This operation requires no argument, and arg must be passed as
              NULL. All previously registered files associated with the
              io_uring instance will be unregistered. Available since 5.1.


       IORING_REGISTER_EVENTFD
              It's possible to use eventfd(2) to get notified of completion
              events on an io_uring instance. If this is desired, an eventfd
              file descriptor can be registered through this operation. arg
              must contain a pointer to the eventfd file descriptor, and
              nr_args must be 1. Note that while io_uring generally takes care
              to avoid spurious events, they can occur. Similarly, batched
              completions of CQEs may only trigger a single eventfd
              notification even if multiple CQEs are posted. The application
              should not assume that the number of events available
              correlates directly with the number of eventfd notifications
              posted. An eventfd notification must thus only be treated as a
              hint to check the CQ ring for completions. Available since 5.2.

              An application can temporarily disable notifications coming
              through the registered eventfd by setting the
              IORING_CQ_EVENTFD_DISABLED bit in the flags field of the CQ
              ring. Available since 5.8.

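              Registering and later removing an eventfd could look roughly
              like the sketch below. It assumes direct use of syscall(2)
              rather than liburing, and ring_open(), ring_register() and
              demo_register_eventfd() are hypothetical names used only for
              illustration.

```c
#include <errno.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create a small ring; returns its fd or -errno. */
int ring_open(unsigned entries)
{
    struct io_uring_params p;

    memset(&p, 0, sizeof(p));
    int fd = (int)syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : fd;
}

/* Hypothetical helper: map the errno convention of syscall(2) to a
 * negative error value. */
int ring_register(int fd, unsigned opcode, void *arg, unsigned nr_args)
{
    int ret = (int)syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : ret;
}

/* Attach an eventfd to a ring and detach it again.
 * Returns 0 on success or a negative error value. */
int demo_register_eventfd(void)
{
    int fd, efd, ret;

    fd = ring_open(4);
    if (fd < 0)
        return fd;
    efd = eventfd(0, EFD_CLOEXEC);
    if (efd < 0) {
        ret = -errno;
        close(fd);
        return ret;
    }

    /* arg points to the eventfd descriptor, nr_args is 1. */
    ret = ring_register(fd, IORING_REGISTER_EVENTFD, &efd, 1);
    if (ret == 0) {
        /* A real application would now read(2) or poll(2) efd and, on
         * wakeup, drain the CQ ring; the wakeup itself is only a hint. */
        ret = ring_register(fd, IORING_UNREGISTER_EVENTFD, NULL, 0);
    }
    close(efd);
    close(fd);
    return ret;
}
```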

       IORING_REGISTER_EVENTFD_ASYNC
              This works just like IORING_REGISTER_EVENTFD, except
              notifications are only posted for events that complete in an
              async manner. This means that events that complete inline while
              being submitted do not trigger a notification event. The
              arguments supplied are the same as for IORING_REGISTER_EVENTFD.
              Available since 5.6.


       IORING_UNREGISTER_EVENTFD
              Unregister an eventfd file descriptor to stop notifications.
              Since only one eventfd descriptor is currently supported, this
              operation takes no argument, and arg must be passed as NULL and
              nr_args must be zero. Available since 5.2.


       IORING_REGISTER_PROBE
              This operation returns a structure, io_uring_probe, which
              contains information about the opcodes supported by io_uring on
              the running kernel. arg must contain a pointer to a struct
              io_uring_probe, and nr_args must contain the size of the ops
              array in that probe struct. The ops array is of the type
              io_uring_probe_op, which holds the value of the opcode and a
              flags field. If the flags field has IO_URING_OP_SUPPORTED set,
              then this opcode is supported on the running kernel. Available
              since 5.6.

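              A probe for a single opcode might be written as in the
              following sketch. It assumes direct use of syscall(2), a ops
              array of 256 entries (a size chosen for this example), and a
              zero-initialised probe structure; ring_open(),
              ring_register() and demo_probe_opcode() are hypothetical
              helper names.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create a small ring; returns its fd or -errno. */
int ring_open(unsigned entries)
{
    struct io_uring_params p;

    memset(&p, 0, sizeof(p));
    int fd = (int)syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : fd;
}

/* Hypothetical helper: map the errno convention of syscall(2) to a
 * negative error value. */
int ring_register(int fd, unsigned opcode, void *arg, unsigned nr_args)
{
    int ret = (int)syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : ret;
}

/* Returns 1 if opcode op is supported, 0 if it is not, or a negative
 * error value if the probe itself failed. */
int demo_probe_opcode(unsigned op)
{
    enum { NR_OPS = 256 };   /* size of the ops array, an assumption */
    struct io_uring_probe *p;
    int fd, ret;

    /* calloc() hands the kernel an all-zero structure. */
    p = calloc(1, sizeof(*p) + NR_OPS * sizeof(p->ops[0]));
    if (!p)
        return -ENOMEM;
    fd = ring_open(4);
    if (fd < 0) {
        free(p);
        return fd;
    }

    ret = ring_register(fd, IORING_REGISTER_PROBE, p, NR_OPS);
    if (ret == 0) {
        /* Only entries below ops_len were filled in by the kernel. */
        ret = (op < p->ops_len &&
               (p->ops[op].flags & IO_URING_OP_SUPPORTED)) ? 1 : 0;
    }
    close(fd);
    free(p);
    return ret;
}
```

              For example, demo_probe_opcode(IORING_OP_READ_FIXED) would
              tell an application whether fixed-buffer reads are usable
              before it registers any buffers.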

       IORING_REGISTER_PERSONALITY
              This operation registers credentials of the running application
              with io_uring, and returns an id associated with these
              credentials. Applications wishing to share a ring between
              separate users/processes can pass in this credential id in the
              sqe personality field. If set, that particular sqe will be
              issued with these credentials. Must be invoked with arg set to
              NULL and nr_args set to zero. Available since 5.6.


       IORING_UNREGISTER_PERSONALITY
              This operation unregisters a previously registered personality
              with io_uring. nr_args must be set to the id in question, and
              arg must be set to NULL. Available since 5.6.


       IORING_REGISTER_ENABLE_RINGS
              This operation enables an io_uring ring started in a disabled
              state (IORING_SETUP_R_DISABLED was specified in the call to
              io_uring_setup(2)). While the io_uring ring is disabled,
              submissions are not allowed and registrations are not
              restricted.

              After the execution of this operation, the io_uring ring is
              enabled: submissions and registrations are allowed, but they
              will be validated following the registered restrictions (if
              any). This operation takes no argument and must be invoked with
              arg set to NULL and nr_args set to zero. Available since 5.10.


       IORING_REGISTER_RESTRICTIONS
              arg points to a struct io_uring_restriction array of nr_args
              entries.

              With an entry it is possible to allow an io_uring_register(2)
              opcode, or specify which opcode and flags of the submission
              queue entry are allowed, or require certain flags to be
              specified (these flags must be set on each submission queue
              entry).

              All the restrictions must be submitted with a single
              io_uring_register(2) call and they are handled as an allowlist
              (opcodes and flags that are not registered are not allowed).

              Restrictions can be registered only if the io_uring ring started
              in a disabled state (IORING_SETUP_R_DISABLED must be specified
              in the call to io_uring_setup(2)).

              Available since 5.10.


       IORING_REGISTER_IOWQ_AFF
              By default, async workers created by io_uring will inherit the
              CPU mask of their parent. This is usually all the CPUs in the
              system, unless the parent is being run with a limited set. If
              this isn't the desired outcome, the application may explicitly
              tell io_uring what CPUs the async workers may run on. arg must
              point to a cpu_set_t mask, and nr_args must be the byte size of
              that mask.

              Available since 5.14.


       IORING_UNREGISTER_IOWQ_AFF
              Undoes a CPU mask previously set with IORING_REGISTER_IOWQ_AFF.
              Must not have arg or nr_args set.

              Available since 5.14.


       IORING_REGISTER_IOWQ_MAX_WORKERS
              By default, io_uring limits the unbounded workers created to the
              maximum process count set by RLIMIT_NPROC, and the number of
              bounded workers is a function of the SQ ring size and the number
              of CPUs in the system. Sometimes this can be excessive (or too
              little, for bounded), and this command provides a way to change
              the count per ring (per NUMA node) instead.

              arg must be set to an unsigned int pointer to an array of two
              values, with the values in the array being set to the maximum
              count of workers per NUMA node. Index 0 holds the bounded worker
              count, and index 1 holds the unbounded worker count. On
              successful return, the passed-in array will contain the previous
              maximum values for each type. If the count being passed in is 0,
              then this command returns the current maximum values and doesn't
              modify the current setting. nr_args must be set to 2, as the
              command takes two values.

              Available since 5.15.

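              Querying the current limits without changing them, as
              described above, could be sketched like this. The example
              assumes direct use of syscall(2) and kernel headers recent
              enough to define IORING_REGISTER_IOWQ_MAX_WORKERS;
              ring_open(), ring_register() and demo_query_max_workers() are
              hypothetical helper names.

```c
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create a small ring; returns its fd or -errno. */
int ring_open(unsigned entries)
{
    struct io_uring_params p;

    memset(&p, 0, sizeof(p));
    int fd = (int)syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : fd;
}

/* Hypothetical helper: map the errno convention of syscall(2) to a
 * negative error value. */
int ring_register(int fd, unsigned opcode, void *arg, unsigned nr_args)
{
    int ret = (int)syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : ret;
}

/* Fill out[0] (bounded) and out[1] (unbounded) with the current
 * per-node worker limits. Returns 0 on success or a negative error. */
int demo_query_max_workers(unsigned out[2])
{
    int fd, ret;

    /* Passing 0 for both counts asks for the current values only. */
    out[0] = 0;
    out[1] = 0;

    fd = ring_open(4);
    if (fd < 0)
        return fd;
    ret = ring_register(fd, IORING_REGISTER_IOWQ_MAX_WORKERS, out, 2);
    close(fd);
    return ret;
}
```

              To change a limit, the caller would store the desired non-zero
              counts in the array before the call and read the previous
              values back afterwards.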

       IORING_REGISTER_RING_FDS
              Whenever io_uring_enter(2) is called to submit requests or wait
              for completions, the kernel must grab a reference to the file
              descriptor. If the application using io_uring is threaded, the
              file table is marked as shared, and the reference grab and put
              of the file descriptor count is more expensive than it is for a
              non-threaded application.

              Similarly to how io_uring allows registration of files, this
              allows registration of the ring file descriptor itself. This
              reduces the overhead of the io_uring_enter(2) system call.

              arg must be set to a pointer to an array of type struct
              io_uring_rsrc_update of nr_args number of entries. The data
              field of this struct must contain an io_uring file descriptor,
              and the offset field can be either -1 or an explicit offset
              desired for the registered file descriptor value. If -1 is used,
              then upon successful return of this system call, the field will
              contain the value of the registered file descriptor to be used
              for future io_uring_enter(2) system calls.

              On successful completion of this request, the returned
              descriptors may be used instead of the real file descriptor for
              io_uring_enter(2), provided that IORING_ENTER_REGISTERED_RING is
              set in the flags for the system call. This flag tells the kernel
              that a registered descriptor is used rather than a real file
              descriptor.

              Each thread or process using a ring must register the file
              descriptor directly by issuing this request.

              The maximum number of supported registered ring descriptors is
              currently limited to 16.

              Available since 5.18.


       IORING_UNREGISTER_RING_FDS
              Unregister descriptors previously registered with
              IORING_REGISTER_RING_FDS.

              arg must be set to a pointer to an array of type struct
              io_uring_rsrc_update of nr_args number of entries. Only the
              offset field should be set in the structure, containing the
              registered file descriptor offset previously returned from
              IORING_REGISTER_RING_FDS that the application wishes to
              unregister.

              Note that this isn't done automatically on ring exit if the
              thread or task that previously registered a ring file descriptor
              isn't exiting. It is recommended to manually unregister any
              previously registered ring descriptors if the ring is closed and
              the task persists. This will free up a registration slot, making
              it available for future use.

              Available since 5.18.


       IORING_REGISTER_PBUF_RING
              Registers a shared buffer ring to be used with provided buffers.
              This is a newer and more efficient alternative to
              IORING_OP_PROVIDE_BUFFERS, to be used with request types that
              support the IOSQE_BUFFER_SELECT flag.

              The arg argument must be filled in with the appropriate
              information. It looks as follows:

                  struct io_uring_buf_reg {
                      __u64 ring_addr;
                      __u32 ring_entries;
                      __u16 bgid;
                      __u16 pad;
                      __u64 resv[3];
                  };

              The ring_addr field must contain the address of the memory
              allocated to fit this ring. The memory must be page aligned and
              hence allocated appropriately using e.g. posix_memalign(3) or
              similar. The size of the ring is the product of ring_entries
              and the size of struct io_uring_buf. ring_entries is the
              desired size of the ring, and must be a power of 2. The maximum
              size allowed is 2^15 (32768). bgid is the buffer group ID
              associated with this ring. SQEs that select a buffer have a
              buffer group associated with them in their buf_group field, and
              the associated CQE will have IORING_CQE_F_BUFFER set in its
              flags member, which will also contain the specific ID of the
              buffer selected. The rest of the fields are reserved and must
              be cleared to zero.

              The flags argument is currently unused and must be set to zero.

              nr_args must be set to 1.

              Also see io_uring_register_buf_ring(3) for more details.
              Available since 5.19.

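              The sizing and alignment rules above can be captured in a
              small helper, sketched below. struct demo_io_uring_buf is a
              local stand-in assumed to mirror the 16-byte layout of the
              kernel's struct io_uring_buf, and pbuf_ring_bytes() and
              pbuf_ring_alloc() are hypothetical names; real code would use
              the definitions from <linux/io_uring.h>.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Stand-in for struct io_uring_buf (assumed layout, 16 bytes). */
struct demo_io_uring_buf {
    uint64_t addr;
    uint32_t len;
    uint16_t bid;
    uint16_t resv;
};

/* Bytes needed for a ring of ring_entries entries, or 0 if the count
 * is not a power of 2 in the range 1..32768. */
size_t pbuf_ring_bytes(uint32_t ring_entries)
{
    if (ring_entries == 0 || ring_entries > 32768)
        return 0;
    if (ring_entries & (ring_entries - 1))
        return 0;   /* not a power of 2 */
    return (size_t)ring_entries * sizeof(struct demo_io_uring_buf);
}

/* Allocate zeroed, page-aligned memory suitable for ring_addr.
 * Returns NULL on invalid size or allocation failure. */
void *pbuf_ring_alloc(uint32_t ring_entries)
{
    size_t bytes = pbuf_ring_bytes(ring_entries);
    long page = sysconf(_SC_PAGESIZE);
    void *mem = NULL;

    if (bytes == 0 || page <= 0)
        return NULL;
    if (posix_memalign(&mem, (size_t)page, bytes) != 0)
        return NULL;
    memset(mem, 0, bytes);
    return mem;
}
```

              The returned pointer would be stored in ring_addr, with the
              same ring_entries value and a chosen bgid, before calling
              io_uring_register(2) with nr_args set to 1.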

       IORING_UNREGISTER_PBUF_RING
              Unregister a previously registered provided buffer ring. arg
              must be set to the address of a struct io_uring_buf_reg, with
              just the bgid field set to the buffer group ID of the previously
              registered provided buffer group. nr_args must be set to 1.
              Also see IORING_REGISTER_PBUF_RING.

              Available since 5.19.


       IORING_REGISTER_SYNC_CANCEL
              Performs a synchronous cancelation request, which works in a
              similar fashion to IORING_OP_ASYNC_CANCEL except it completes
              inline. This can be useful for scenarios where cancelations
              should happen synchronously, rather than needing to issue an SQE
              and wait for completion of that specific CQE.

              arg must be set to a pointer to a struct
              io_uring_sync_cancel_reg, with the details filled in for what
              request(s) to target for cancelation. See
              io_uring_register_sync_cancel(3) for details on that. The return
              values are the same, except they are passed back synchronously
              rather than through the CQE res field. nr_args must be set to 1.

              Available since 6.0.


       IORING_REGISTER_FILE_ALLOC_RANGE
              Sets the allowable range for fixed file index allocations within
              the kernel. When requests that can instantiate a new fixed file
              are used with IORING_FILE_INDEX_ALLOC, the application is asking
              the kernel to allocate a new fixed file descriptor rather than
              pass in a specific value for one. By default, the kernel will
              pick any available fixed file descriptor within the range
              available. This opcode effectively allows the application to set
              aside a range just for dynamic allocations, with the remainder
              being used for specific values.

              nr_args must be set to 1 and arg must be set to a pointer to a
              struct io_uring_file_index_range:

                  struct io_uring_file_index_range {
                      __u32 off;
                      __u32 len;
                      __u64 resv;
                  };

              with off being set to the starting value for the range, and len
              being set to the number of descriptors. The reserved resv field
              must be cleared to zero.

              The application must have registered a file table first.

              Available since 6.0.

RETURN VALUE

       On success, io_uring_register(2) returns either 0 or a positive value,
       depending on the opcode used. On error, a negative error value is
       returned. The caller should not rely on the errno variable.

ERRORS

       EACCES The opcode field is not allowed due to registered restrictions.

       EBADF  One or more fds in the fd array are invalid.

       EBADFD IORING_REGISTER_ENABLE_RINGS or IORING_REGISTER_RESTRICTIONS was
              specified, but the io_uring ring is not disabled.

       EBUSY  IORING_REGISTER_BUFFERS, IORING_REGISTER_FILES, or
              IORING_REGISTER_RESTRICTIONS was specified, but there were
              already buffers, files, or restrictions registered.

       EFAULT The buffer is outside of the process's accessible address
              space, or iov_len is greater than 1GiB.

       EINVAL IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES was specified,
              but nr_args is 0.

       EINVAL IORING_REGISTER_BUFFERS was specified, but nr_args exceeds
              UIO_MAXIOV.

       EINVAL IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
              specified, and nr_args is non-zero or arg is non-NULL.

       EINVAL IORING_REGISTER_RESTRICTIONS was specified, but nr_args exceeds
              the maximum allowed number of restrictions or a restriction
              opcode is invalid.

       EMFILE IORING_REGISTER_FILES was specified and nr_args exceeds the
              maximum allowed number of files in a fixed file set.

       EMFILE IORING_REGISTER_FILES was specified and adding nr_args file
              references would exceed the maximum number of files the user is
              allowed to have according to the RLIMIT_NOFILE resource limit,
              and the caller does not have the CAP_SYS_RESOURCE capability.
              Note that this is a per-user limit, not per-process.

       ENOMEM Insufficient kernel resources are available, or the caller had a
              non-zero RLIMIT_MEMLOCK soft resource limit, but tried to lock
              more memory than the limit permitted. This limit is not
              enforced if the process is privileged (CAP_IPC_LOCK).

       ENXIO  IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
              specified, but there were no buffers or files registered.

       ENXIO  Attempt to register files or buffers on an io_uring instance
              that is already undergoing file or buffer registration, or is
              being torn down.

       EOPNOTSUPP
              User buffers point to file-backed memory.

Linux                             2019-01-17              io_uring_register(2)