io_uring_register(2)       Linux Programmer's Manual       io_uring_register(2)

NAME
io_uring_register - register files or user buffers for asynchronous I/O

SYNOPSIS
#include <liburing.h>

int io_uring_register(unsigned int fd, unsigned int opcode,
                      void *arg, unsigned int nr_args);

DESCRIPTION
The io_uring_register(2) system call registers resources (e.g. user
buffers, files, eventfd, personality, restrictions) for use in an
io_uring(7) instance referenced by fd. Registering files or user
buffers allows the kernel to take long-term references to internal data
structures or create long-term mappings of application memory, greatly
reducing per-I/O overhead.

fd is the file descriptor returned by a call to io_uring_setup(2). If
opcode has the flag IORING_REGISTER_USE_REGISTERED_RING ORed into it,
fd is instead the index of a registered ring fd.
25
26 opcode can be one of:
27
28
29 IORING_REGISTER_BUFFERS
30 arg points to a struct iovec array of nr_args entries. The buf‐
31 fers associated with the iovecs will be locked in memory and
32 charged against the user's RLIMIT_MEMLOCK resource limit. See
33 getrlimit(2) for more information. Additionally, there is a
34 size limit of 1GiB per buffer. Currently, the buffers must be
35 anonymous, non-file-backed memory, such as that returned by mal‐
36 loc(3) or mmap(2) with the MAP_ANONYMOUS flag set. It is ex‐
37 pected that this limitation will be lifted in the future. Huge
38 pages are supported as well. Note that the entire huge page will
39 be pinned in the kernel, even if only a portion of it is used.
40
41 After a successful call, the supplied buffers are mapped into
42 the kernel and eligible for I/O. To make use of them, the ap‐
43 plication must specify the IORING_OP_READ_FIXED or IOR‐
44 ING_OP_WRITE_FIXED opcodes in the submission queue entry (see
45 the struct io_uring_sqe definition in io_uring_enter(2)), and
46 set the buf_index field to the desired buffer index. The memory
47 range described by the submission queue entry's addr and len
48 fields must fall within the indexed buffer.
49
50 It is perfectly valid to setup a large buffer and then only use
51 part of it for an I/O, as long as the range is within the origi‐
52 nally mapped region.
53
54 An application can increase or decrease the size or number of
55 registered buffers by first unregistering the existing buffers,
56 and then issuing a new call to io_uring_register(2) with the new
57 buffers.
58
Note that on kernels before 5.13, registering buffers waits for the
ring to idle: if the application has requests in flight, the
registration waits for those to finish before proceeding.
63
64 An application need not unregister buffers explicitly before
65 shutting down the io_uring instance. Note, however, that shut‐
66 down processing may run asynchronously within the kernel. As a
67 result, it is not guaranteed that pages are immediately unpinned
68 in this case. Available since 5.1.
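
As a minimal sketch of the registration described above, the following C code registers a single buffer through the raw system call and checks whether an I/O range falls inside a registered buffer. The helper names register_one_buffer() and fixed_range_ok() are hypothetical and not part of any library. The sketch assumes the kernel headers provide <linux/io_uring.h> and __NR_io_uring_register. Because it uses syscall(2) directly, it converts the -1/errno convention into a negative error value itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: register buf as the only fixed buffer (index 0).
 * Returns 0 on success or a negative errno value on failure. */
static int register_one_buffer(int ring_fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };

    assert(buf != NULL && len > 0);
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_BUFFERS, &iov, 1);
    return ret < 0 ? -errno : (int)ret;
}

/* Hypothetical helper: returns 1 if [addr, addr + len) lies entirely
 * inside the registered buffer described by reg, 0 otherwise. */
static int fixed_range_ok(const struct iovec *reg, const void *addr,
                          size_t len)
{
    uintptr_t base = (uintptr_t)reg->iov_base;
    uintptr_t start = (uintptr_t)addr;

    if (start < base || len > reg->iov_len)
        return 0;
    /* Written this way to avoid overflow in start + len. */
    return start - base <= reg->iov_len - len;
}
```

An application would typically run a check like fixed_range_ok() before filling in addr, len and buf_index for an IORING_OP_READ_FIXED or IORING_OP_WRITE_FIXED request.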
69
70
71 IORING_REGISTER_BUFFERS2
72 Register buffers for I/O. Similar to IORING_REGISTER_BUFFERS but
73 aims to have a more extensible ABI.
74
75 arg points to a struct io_uring_rsrc_register, and nr_args
76 should be set to the number of bytes in the structure.
77
78
struct io_uring_rsrc_register {
    __u32 nr;
    __u32 resv;
    __u64 resv2;
    __aligned_u64 data;
    __aligned_u64 tags;
};
86
87
The data field contains a pointer to a struct iovec array of nr
entries. The tags field should either be 0, in which case tagging is
disabled, or point to an array of nr "tags" (unsigned 64-bit
integers). If a tag is zero, then tagging for this particular
resource (a buffer in this case) is disabled. Otherwise, after the
resource has been unregistered and is no longer in use, a CQE will be
posted with user_data set to the specified tag and all other fields
zeroed.

Note that resource updates, e.g. IORING_REGISTER_BUFFERS_UPDATE,
don't necessarily deallocate resources by the time they return; the
resources might be held alive until all requests using them complete.
101
102 Available since 5.13.
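
The following sketch shows one way the structure above might be filled in for a tagged registration. The helper name register_tagged_buffers() is hypothetical. The sketch assumes kernel headers from 5.13 or later, so that <linux/io_uring.h> defines struct io_uring_rsrc_register and IORING_REGISTER_BUFFERS2. The structure is zeroed first so that the reserved fields are clear.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: register nr buffers, optionally with tags.
 * Pass tags == NULL to disable tagging for the whole registration.
 * Returns 0 on success or a negative errno value on failure. */
static int register_tagged_buffers(int ring_fd, const struct iovec *iovs,
                                   const uint64_t *tags, unsigned int nr)
{
    struct io_uring_rsrc_register reg;

    assert(iovs != NULL && nr > 0);
    memset(&reg, 0, sizeof(reg));          /* reserved fields must be zero */
    reg.nr = nr;
    reg.data = (__u64)(uintptr_t)iovs;     /* pointer to the iovec array */
    reg.tags = (__u64)(uintptr_t)tags;     /* 0 disables tagging */

    /* nr_args carries the size of the structure, not an entry count. */
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_BUFFERS2, &reg,
                       (unsigned int)sizeof(reg));
    return ret < 0 ? -errno : (int)ret;
}
```

With a non-zero tag, the application can look for a CQE whose user_data equals that tag to learn when the buffer is no longer in use after it has been unregistered.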
103
104
105 IORING_REGISTER_BUFFERS_UPDATE
106 Updates registered buffers with new ones, either turning a
107 sparse entry into a real one, or replacing an existing entry.
108
arg must contain a pointer to a struct io_uring_rsrc_update2,
which contains an offset at which to start the update and, in
data, a pointer to an array of struct iovec. tags points to an
array of tags. nr must contain the number of descriptors in the
passed-in arrays. See IORING_REGISTER_BUFFERS2 for the resource
tagging description.
115
116
struct io_uring_rsrc_update2 {
    __u32 offset;
    __u32 resv;
    __aligned_u64 data;
    __aligned_u64 tags;
    __u32 nr;
    __u32 resv2;
};
125
126 Available since 5.13.
127
128
129 IORING_UNREGISTER_BUFFERS
130 This operation takes no argument, and arg must be passed as
131 NULL. All previously registered buffers associated with the
132 io_uring instance will be released synchronously. Available
133 since 5.1.
134
135
136 IORING_REGISTER_FILES
137 Register files for I/O. arg contains a pointer to an array of
138 nr_args file descriptors (signed 32 bit integers).
139
140 To make use of the registered files, the IOSQE_FIXED_FILE flag
141 must be set in the flags member of the struct io_uring_sqe, and
142 the fd member is set to the index of the file in the file de‐
143 scriptor array.
144
145 The file set may be sparse, meaning that the fd field in the ar‐
146 ray may be set to -1. See IORING_REGISTER_FILES_UPDATE for how
147 to update files in place.
148
Note that on kernels before 5.13, registering files waits for the
ring to idle: if the application has requests in flight, the
registration waits for those to finish before proceeding. See
IORING_REGISTER_FILES_UPDATE for how to update an existing set
without that limitation.
154
155 Files are automatically unregistered when the io_uring instance
156 is torn down. An application needs only unregister if it wishes
157 to register a new set of fds. Available since 5.1.
158
159
160 IORING_REGISTER_FILES2
161 Register files for I/O. Similar to IORING_REGISTER_FILES.
162
163 arg points to a struct io_uring_rsrc_register, and nr_args
164 should be set to the number of bytes in the structure.
165
The data field contains a pointer to an array of nr file
descriptors (signed 32-bit integers). The tags field should either be
0 or point to an array of nr "tags" (unsigned 64-bit integers). See
IORING_REGISTER_BUFFERS2 for more info on resource tagging.

Note that resource updates, e.g. IORING_REGISTER_FILES_UPDATE,
don't necessarily deallocate resources; they might be held until
all requests using that resource complete.
175
176 Available since 5.13.
177
178
179 IORING_REGISTER_FILES_UPDATE
This operation replaces existing files in the registered file
set with new ones, either turning a sparse entry (one where fd
is equal to -1) into a real one, removing an existing entry
(the new one is set to -1), or replacing an existing entry with a
new one.
185
186 arg must contain a pointer to a struct io_uring_files_update,
187 which contains an offset on which to start the update, and an
188 array of file descriptors to use for the update. nr_args must
189 contain the number of descriptors in the passed in array. Avail‐
190 able since 5.5.
191
192 File descriptors can be skipped if they are set to IORING_REGIS‐
193 TER_FILES_SKIP. Skipping an fd will not touch the file associ‐
194 ated with the previous fd at that index. Available since 5.12.
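
A minimal sketch of an in-place update of a single slot might look as follows. The helper name replace_registered_file() is hypothetical. The sketch assumes <linux/io_uring.h> defines struct io_uring_files_update with offset and fds members, as in current kernel headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: put new_fd into the registered file set at index
 * slot. Passing new_fd == -1 removes the entry and leaves the slot sparse.
 * Returns a non-negative value on success or a negative errno value. */
static int replace_registered_file(int ring_fd, unsigned int slot,
                                   int new_fd)
{
    struct io_uring_files_update up;
    int fds[1] = { new_fd };

    assert(new_fd >= -1);
    memset(&up, 0, sizeof(up));
    up.offset = slot;                   /* first index to update */
    up.fds = (__u64)(uintptr_t)fds;     /* array of descriptors to install */

    /* nr_args is the number of descriptors in the fds array. */
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_FILES_UPDATE, &up, 1);
    return ret < 0 ? -errno : (int)ret;
}
```

To leave a slot untouched while updating its neighbours, the corresponding array element would be set to IORING_REGISTER_FILES_SKIP instead of a descriptor.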
195
196
197 IORING_REGISTER_FILES_UPDATE2
Similar to IORING_REGISTER_FILES_UPDATE, this operation replaces
existing files in the registered file set with new ones, either
turning a sparse entry (one where fd is equal to -1) into a real one,
removing an existing entry (the new one is set to -1), or replacing
an existing entry with a new one.

arg must contain a pointer to a struct io_uring_rsrc_update2,
which contains an offset at which to start the update, and an
array of file descriptors to use for the update stored in data.
tags points to an array of tags. nr must contain the number of
descriptors in the passed-in arrays. See IORING_REGISTER_BUFFERS2
for the resource tagging description.
210
211 Available since 5.13.
212
213
214 IORING_UNREGISTER_FILES
215 This operation requires no argument, and arg must be passed as
216 NULL. All previously registered files associated with the
217 io_uring instance will be unregistered. Available since 5.1.
218
219
220 IORING_REGISTER_EVENTFD
It's possible to use eventfd(2) to get notified of completion
events on an io_uring instance. If this is desired, an eventfd
file descriptor can be registered through this operation. arg
must contain a pointer to the eventfd file descriptor, and
nr_args must be 1. Note that while io_uring generally takes care
to avoid spurious events, they can occur. Similarly, batched
completions of CQEs may only trigger a single eventfd
notification even if multiple CQEs are posted. The application
should not assume that the number of eventfd notifications
corresponds directly to the number of available events. An eventfd
notification must thus only be treated as a hint to check the CQ
ring for completions. Available since 5.2.
233
234 An application can temporarily disable notifications, coming
235 through the registered eventfd, by setting the IOR‐
236 ING_CQ_EVENTFD_DISABLED bit in the flags field of the CQ ring.
237 Available since 5.8.
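
The registration step can be sketched as below. The helper name register_completion_eventfd() is hypothetical, and the sketch assumes <linux/io_uring.h> is available. It creates the eventfd itself and closes it again if registration fails, so the caller only ever receives a descriptor that is actually registered.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: create an eventfd and register it with the ring.
 * On success, stores the eventfd in *efd_out and returns 0.
 * On failure, returns a negative errno value and leaves *efd_out alone. */
static int register_completion_eventfd(int ring_fd, int *efd_out)
{
    assert(efd_out != NULL);

    int efd = eventfd(0, EFD_CLOEXEC);
    if (efd < 0)
        return -errno;

    /* arg points at the descriptor; nr_args must be 1. */
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_EVENTFD, &efd, 1);
    if (ret < 0) {
        int err = errno;    /* save errno before close() can change it */
        close(efd);
        return -err;
    }

    *efd_out = efd;
    return 0;
}
```

After a read on the eventfd returns, the application would drain the CQ ring until it is empty rather than expecting one CQE per notification.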
238
239
240 IORING_REGISTER_EVENTFD_ASYNC
This works just like IORING_REGISTER_EVENTFD, except notifications
are only posted for events that complete in an async manner.
This means that events that complete inline while being
submitted do not trigger a notification event. The arguments
supplied are the same as for IORING_REGISTER_EVENTFD. Available
since 5.6.
247
248
249 IORING_UNREGISTER_EVENTFD
250 Unregister an eventfd file descriptor to stop notifications.
251 Since only one eventfd descriptor is currently supported, this
252 operation takes no argument, and arg must be passed as NULL and
253 nr_args must be zero. Available since 5.2.
254
255
256 IORING_REGISTER_PROBE
257 This operation returns a structure, io_uring_probe, which con‐
258 tains information about the opcodes supported by io_uring on the
259 running kernel. arg must contain a pointer to a struct io_ur‐
260 ing_probe, and nr_args must contain the size of the ops array in
261 that probe struct. The ops array is of the type io_ur‐
262 ing_probe_op, which holds the value of the opcode and a flags
263 field. If the flags field has IO_URING_OP_SUPPORTED set, then
264 this opcode is supported on the running kernel. Available since
265 5.6.
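
A probe could be performed roughly as in the sketch below. The helper names probe_ring() and probe_op_supported(), and the constant PROBE_OPS, are hypothetical. The sketch assumes that the kernel fills the ops array so that entry i describes opcode i and sets ops_len to the number of entries filled, which matches how liburing interprets the result.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

#define PROBE_OPS 256u  /* hypothetical capacity of the ops array */

/* Hypothetical helper: allocate a zeroed probe structure and ask the
 * kernel to fill it in. Returns NULL on failure; free() the result. */
static struct io_uring_probe *probe_ring(int ring_fd)
{
    size_t len = sizeof(struct io_uring_probe) +
                 PROBE_OPS * sizeof(struct io_uring_probe_op);
    struct io_uring_probe *p = calloc(1, len);

    if (p == NULL)
        return NULL;
    /* nr_args is the number of entries in the ops array. */
    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_PROBE, p, PROBE_OPS) < 0) {
        free(p);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: returns 1 if opcode op is marked as supported. */
static int probe_op_supported(const struct io_uring_probe *p,
                              unsigned int op)
{
    assert(p != NULL);
    if (op >= p->ops_len)
        return 0;
    return (p->ops[op].flags & IO_URING_OP_SUPPORTED) != 0;
}
```

A caller might check an opcode such as IORING_OP_READ_FIXED with probe_op_supported() once at startup and fall back to a different code path when it returns 0.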
266
267
268 IORING_REGISTER_PERSONALITY
269 This operation registers credentials of the running application
270 with io_uring, and returns an id associated with these creden‐
271 tials. Applications wishing to share a ring between separate
272 users/processes can pass in this credential id in the sqe per‐
273 sonality field. If set, that particular sqe will be issued with
274 these credentials. Must be invoked with arg set to NULL and
275 nr_args set to zero. Available since 5.6.
276
277
278 IORING_UNREGISTER_PERSONALITY
279 This operation unregisters a previously registered personality
280 with io_uring. nr_args must be set to the id in question, and
281 arg must be set to NULL. Available since 5.6.
282
283
284 IORING_REGISTER_ENABLE_RINGS
285 This operation enables an io_uring ring started in a disabled
286 state (IORING_SETUP_R_DISABLED was specified in the call to
287 io_uring_setup(2)). While the io_uring ring is disabled, sub‐
288 missions are not allowed and registrations are not restricted.
289
290 After the execution of this operation, the io_uring ring is en‐
291 abled: submissions and registration are allowed, but they will
292 be validated following the registered restrictions (if any).
293 This operation takes no argument, must be invoked with arg set
294 to NULL and nr_args set to zero. Available since 5.10.
295
296
297 IORING_REGISTER_RESTRICTIONS
298 arg points to a struct io_uring_restriction array of nr_args en‐
299 tries.
300
301 With an entry it is possible to allow an io_uring_register(2)
302 opcode, or specify which opcode and flags of the submission
303 queue entry are allowed, or require certain flags to be speci‐
304 fied (these flags must be set on each submission queue entry).
305
All the restrictions must be submitted with a single
io_uring_register(2) call and they are handled as an allowlist
(opcodes and flags not registered are not allowed).
309
310 Restrictions can be registered only if the io_uring ring started
311 in a disabled state (IORING_SETUP_R_DISABLED must be specified
312 in the call to io_uring_setup(2)).
313
314 Available since 5.10.
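
As an illustration of the allowlist model, the sketch below builds three restriction entries and submits them in one call. The helper names fill_read_restrictions() and register_restrictions(), and the particular choice of allowed opcodes, are hypothetical. The sketch assumes <linux/io_uring.h> defines struct io_uring_restriction together with the IORING_RESTRICTION_SQE_OP and IORING_RESTRICTION_REGISTER_OP entry types.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: fill res with an example allowlist that permits
 * two read opcodes in SQEs and one register opcode. Returns the number
 * of entries written, or 0 if cap is too small. */
static unsigned int fill_read_restrictions(struct io_uring_restriction *res,
                                           unsigned int cap)
{
    assert(res != NULL);
    if (cap < 3)
        return 0;
    memset(res, 0, 3 * sizeof(*res));   /* reserved fields must be zero */

    res[0].opcode = IORING_RESTRICTION_SQE_OP;
    res[0].sqe_op = IORING_OP_READV;

    res[1].opcode = IORING_RESTRICTION_SQE_OP;
    res[1].sqe_op = IORING_OP_READ_FIXED;

    res[2].opcode = IORING_RESTRICTION_REGISTER_OP;
    res[2].register_op = IORING_REGISTER_BUFFERS;
    return 3;
}

/* Hypothetical helper: submit all restrictions in a single call.
 * Returns 0 on success or a negative errno value on failure. */
static int register_restrictions(int ring_fd,
                                 struct io_uring_restriction *res,
                                 unsigned int nr)
{
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_RESTRICTIONS, res, nr);
    return ret < 0 ? -errno : (int)ret;
}
```

The ring would need to have been created with IORING_SETUP_R_DISABLED, and the restrictions take effect once it is enabled with IORING_REGISTER_ENABLE_RINGS.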
315
316
317 IORING_REGISTER_IOWQ_AFF
By default, async workers created by io_uring inherit the CPU
mask of their parent. This is usually all the CPUs in the system,
unless the parent is being run with a limited set. If this
isn't the desired outcome, the application may explicitly tell
io_uring what CPUs the async workers may run on. arg must point
to a cpu_set_t mask, and nr_args must be the byte size of that mask.
324
325 Available since 5.14.
326
327
328 IORING_UNREGISTER_IOWQ_AFF
329 Undoes a CPU mask previously set with IORING_REGISTER_IOWQ_AFF.
330 Must not have arg or nr_args set.
331
332 Available since 5.14.
333
334
335 IORING_REGISTER_IOWQ_MAX_WORKERS
By default, io_uring limits the number of unbounded workers to the
maximum process count set by RLIMIT_NPROC, and the number of bounded
workers is a function of the SQ ring size and the number of CPUs
in the system. Sometimes this can be excessive (or too little,
for bounded), and this command provides a way to change the
count per ring (per NUMA node) instead.

arg must be set to an unsigned int pointer to an array of two
values, with the values in the array being set to the maximum
count of workers per NUMA node. Index 0 holds the bounded worker
count, and index 1 holds the unbounded worker count. On successful
return, the passed-in array will contain the previous maximum
values for each type. If the count being passed in is 0,
then this command returns the current maximum values and doesn't
modify the current setting. nr_args must be set to 2, as the
command takes two values.
352
353 Available since 5.15.
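
Querying the current limits without changing them could be sketched as follows. The helper name get_iowq_max_workers() is hypothetical, and the sketch assumes kernel headers from 5.15 or later so that IORING_REGISTER_IOWQ_MAX_WORKERS is defined.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: read the current worker limits.
 * On success, out[0] holds the bounded and out[1] the unbounded maximum,
 * and 0 is returned. On failure, out is untouched and a negative errno
 * value is returned. */
static int get_iowq_max_workers(int ring_fd, unsigned int out[2])
{
    /* A count of 0 asks for the current value and changes nothing. */
    unsigned int vals[2] = { 0, 0 };

    assert(out != NULL);
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_IOWQ_MAX_WORKERS, vals, 2);
    if (ret < 0)
        return -errno;

    out[0] = vals[0];
    out[1] = vals[1];
    return 0;
}
```

Setting new limits would use the same call with non-zero values in the array, after which the array holds the previous maximums.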
354
355
356 IORING_REGISTER_RING_FDS
Whenever io_uring_enter(2) is called to submit requests or wait
for completions, the kernel must grab a reference to the file
descriptor. If the application using io_uring is threaded, the
file table is marked as shared, and the reference grab and put
of the file descriptor count is more expensive than it is for a
non-threaded application.

Similarly to how io_uring allows registration of files, this
allows registration of the ring file descriptor itself. This
reduces the overhead of the io_uring_enter(2) system call.

arg must be set to a pointer to an array of type
struct io_uring_rsrc_update of nr_args number of entries. The
data field of this struct must contain an io_uring file
descriptor, and the offset field can be either -1 or an explicit
offset desired for the registered file descriptor value. If -1
is used, then upon successful return of this system call, the
field will contain the value of the registered file descriptor
to be used for future io_uring_enter(2) system calls.
376
377 On successful completion of this request, the returned descrip‐
378 tors may be used instead of the real file descriptor for io_ur‐
379 ing_enter(2), provided that IORING_ENTER_REGISTERED_RING is set
380 in the flags for the system call. This flag tells the kernel
381 that a registered descriptor is used rather than a real file de‐
382 scriptor.
383
384 Each thread or process using a ring must register the file de‐
385 scriptor directly by issuing this request.
386
387 The maximum number of supported registered ring descriptors is
388 currently limited to 16.
389
390 Available since 5.18.
391
392
393 IORING_UNREGISTER_RING_FDS
394 Unregister descriptors previously registered with IORING_REGIS‐
395 TER_RING_FDS.
396
arg must be set to a pointer to an array of type
struct io_uring_rsrc_update of nr_args number of entries. Only
the offset field should be set in the structure, containing the
registered file descriptor offset previously returned from
IORING_REGISTER_RING_FDS that the application wishes to unregister.
402
403 Note that this isn't done automatically on ring exit, if the
404 thread or task that previously registered a ring file descriptor
405 isn't exiting. It is recommended to manually unregister any pre‐
406 viously registered ring descriptors if the ring is closed and
407 the task persists. This will free up a registration slot, making
408 it available for future use.
409
410 Available since 5.18.
411
412
413 IORING_REGISTER_PBUF_RING
414 Registers a shared buffer ring to be used with provided buffers.
415 This is a newer alternative to using IORING_OP_PROVIDE_BUFFERS
416 which is more efficient, to be used with request types that sup‐
417 port the IOSQE_BUFFER_SELECT flag.
418
419 The arg argument must be filled in with the appropriate informa‐
420 tion. It looks as follows:
421
struct io_uring_buf_reg {
    __u64 ring_addr;
    __u32 ring_entries;
    __u16 bgid;
    __u16 pad;
    __u64 resv[3];
};
429
The ring_addr field must contain the address of the memory
allocated to fit this ring. The memory must be page aligned and
hence allocated appropriately using e.g. posix_memalign(3) or
similar. The size of the ring is the product of ring_entries
and the size of struct io_uring_buf. ring_entries is the desired
number of entries in the ring, and must be a power of 2. The
maximum size allowed is 2^15 (32768). bgid is the buffer group
ID associated with this ring. SQEs that select a buffer have a
buffer group associated with them in their buf_group field, and
the associated CQEs will have IORING_CQE_F_BUFFER set in their
flags member, which will also contain the specific ID of the
buffer selected. The rest of the fields are reserved and must
be cleared to zero.
443
444 nr_args must be set to 1.
445
446 Also see io_uring_register_buf_ring(3) for more details. Avail‐
447 able since 5.19.
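
The sizing and alignment rules above can be sketched in plain C without touching the kernel. The helper names pbuf_ring_bytes() and alloc_pbuf_ring() and the constant PBUF_RING_MAX_ENTRIES are hypothetical. The entry size is passed in by the caller, who would normally supply sizeof(struct io_uring_buf) from <linux/io_uring.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PBUF_RING_MAX_ENTRIES 32768u    /* 2^15, the limit stated above */

/* Hypothetical helper: number of bytes needed for a ring of ring_entries
 * entries, or 0 if ring_entries is zero, too large, or not a power of 2. */
static size_t pbuf_ring_bytes(unsigned int ring_entries, size_t entry_size)
{
    assert(entry_size > 0);
    if (ring_entries == 0 || ring_entries > PBUF_RING_MAX_ENTRIES)
        return 0;
    if (ring_entries & (ring_entries - 1))  /* more than one bit set */
        return 0;
    return (size_t)ring_entries * entry_size;
}

/* Hypothetical helper: allocate zeroed, page-aligned memory for the ring.
 * Returns NULL on invalid arguments or allocation failure; free() it. */
static void *alloc_pbuf_ring(unsigned int ring_entries, size_t entry_size)
{
    size_t bytes = pbuf_ring_bytes(ring_entries, entry_size);
    long page = sysconf(_SC_PAGESIZE);
    void *ring = NULL;

    if (bytes == 0 || page <= 0)
        return NULL;
    if (posix_memalign(&ring, (size_t)page, bytes) != 0)
        return NULL;
    memset(ring, 0, bytes);
    return ring;
}
```

The returned address would go into ring_addr, with ring_entries and bgid filled in and the remaining fields of struct io_uring_buf_reg cleared to zero.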
448
449
450 IORING_UNREGISTER_PBUF_RING
Unregister a previously registered provided buffer ring. arg
must be set to the address of a struct io_uring_buf_reg, with
just the bgid field set to the buffer group ID of the previously
registered provided buffer group. nr_args must be set to 1.
Also see IORING_REGISTER_PBUF_RING.
456
457 Available since 5.19.
458
459
460 IORING_REGISTER_SYNC_CANCEL
461 Performs a synchronous cancelation request, which works in a
462 similar fashion to IORING_OP_ASYNC_CANCEL except it completes
463 inline. This can be useful for scenarios where cancelations
464 should happen synchronously, rather than needing to issue an SQE
465 and wait for completion of that specific CQE.
466
arg must be set to a pointer to a struct io_uring_sync_cancel_reg
structure, with the details filled in for what request(s) to
target for cancelation. See io_uring_register_sync_cancel(3) for
details on that. The return values are the same, except they are
passed back synchronously rather than through the CQE res field.
nr_args must be set to 1.
473
474 Available since 6.0.
475
476
477 IORING_REGISTER_FILE_ALLOC_RANGE
Sets the allowable range for fixed file index allocations within
the kernel. When requests that can instantiate a new fixed file
are used with IORING_FILE_INDEX_ALLOC, the application is asking
the kernel to allocate a new fixed file descriptor rather
than pass in a specific value for one. By default, the kernel
will pick any available fixed file descriptor within the range
available. This operation effectively allows the application to set
aside a range just for dynamic allocations, with the remainder being
used for specific values.
487
488 nr_args must be set to 1 and arg must be set to a pointer to a
489 struct io_uring_file_index_range:
490
struct io_uring_file_index_range {
    __u32 off;
    __u32 len;
    __u64 resv;
};
496
497 with off being set to the starting value for the range, and len
498 being set to the number of descriptors. The reserved resv field
499 must be cleared to zero.
500
501 The application must have registered a file table first.
502
503 Available since 6.0.
504

RETURN VALUE
On success, io_uring_register(2) returns either 0 or a positive value,
depending on the opcode used. On error, a negative error value is
returned. The caller should not rely on the errno variable.
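
The convention above differs from a raw syscall(2) invocation, which returns -1 and sets errno. A thin wrapper that maps one onto the other might look like the sketch below. The names sys_io_uring_register() and register_error_string() are hypothetical, and the sketch assumes the kernel headers define __NR_io_uring_register.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical wrapper: invoke the raw system call and return either the
 * non-negative result or a negative errno value, never -1 plus errno. */
static int sys_io_uring_register(int fd, unsigned int opcode, void *arg,
                                 unsigned int nr_args)
{
    long ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return ret < 0 ? -errno : (int)ret;
}

/* Hypothetical helper: describe a negative return value from the wrapper. */
static const char *register_error_string(int ret)
{
    assert(ret < 0);
    return strerror(-ret);
}
```

A caller would test the return value for ret < 0 and pass it to register_error_string() for reporting, without consulting errno.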

ERRORS
513 EACCES The opcode field is not allowed due to registered restrictions.
514
515 EBADF One or more fds in the fd array are invalid.
516
517 EBADFD IORING_REGISTER_ENABLE_RINGS or IORING_REGISTER_RESTRICTIONS was
518 specified, but the io_uring ring is not disabled.
519
520 EBUSY IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES or IORING_REG‐
521 ISTER_RESTRICTIONS was specified, but there were already buf‐
522 fers, files, or restrictions registered.
523
EFAULT Buffer is outside of the process's accessible address space, or
iov_len is greater than 1GiB.
526
527 EINVAL IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES was specified,
528 but nr_args is 0.
529
EINVAL IORING_REGISTER_BUFFERS was specified, but nr_args exceeds
UIO_MAXIOV.
532
533 EINVAL IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was speci‐
534 fied, and nr_args is non-zero or arg is non-NULL.
535
536 EINVAL IORING_REGISTER_RESTRICTIONS was specified, but nr_args exceeds
537 the maximum allowed number of restrictions or restriction opcode
538 is invalid.
539
540 EMFILE IORING_REGISTER_FILES was specified and nr_args exceeds the max‐
541 imum allowed number of files in a fixed file set.
542
543 EMFILE IORING_REGISTER_FILES was specified and adding nr_args file ref‐
544 erences would exceed the maximum allowed number of files the
545 user is allowed to have according to the RLIMIT_NOFILE resource
546 limit and the caller does not have CAP_SYS_RESOURCE capability.
547 Note that this is a per user limit, not per process.
548
549 ENOMEM Insufficient kernel resources are available, or the caller had a
550 non-zero RLIMIT_MEMLOCK soft resource limit, but tried to lock
551 more memory than the limit permitted. This limit is not en‐
552 forced if the process is privileged (CAP_IPC_LOCK).
553
554 ENXIO IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was speci‐
555 fied, but there were no buffers or files registered.
556
557 ENXIO Attempt to register files or buffers on an io_uring instance
558 that is already undergoing file or buffer registration, or is
559 being torn down.
560
561 EOPNOTSUPP
562 User buffers point to file-backed memory.
563
564 EFAULT User buffers point to file-backed memory (newer kernels).

Linux                             2019-01-17              io_uring_register(2)