io_uring_register(2)      Linux Programmer's Manual      io_uring_register(2)

NAME
io_uring_register - register files or user buffers for asynchronous I/O

SYNOPSIS
#include <liburing.h>

int io_uring_register(unsigned int fd, unsigned int opcode,
                      void *arg, unsigned int nr_args);

DESCRIPTION
The io_uring_register(2) system call registers resources (e.g. user
buffers, files, eventfd, personality, restrictions) for use in an
io_uring(7) instance referenced by fd. Registering files or user
buffers allows the kernel to take long-term references to internal data
structures or create long-term mappings of application memory, greatly
reducing per-I/O overhead.

fd is the file descriptor returned by a call to io_uring_setup(2).
opcode can be one of:

IORING_REGISTER_BUFFERS
arg points to a struct iovec array of nr_args entries. The buffers
associated with the iovecs will be locked in memory and charged
against the user's RLIMIT_MEMLOCK resource limit. See getrlimit(2)
for more information. Additionally, there is a size limit of 1GiB per
buffer. Currently, the buffers must be anonymous, non-file-backed
memory, such as that returned by malloc(3) or mmap(2) with the
MAP_ANONYMOUS flag set. It is expected that this limitation will be
lifted in the future. Huge pages are supported as well. Note that the
entire huge page will be pinned in the kernel, even if only a portion
of it is used.

After a successful call, the supplied buffers are mapped into the
kernel and eligible for I/O. To make use of them, the application
must specify the IORING_OP_READ_FIXED or IORING_OP_WRITE_FIXED opcode
in the submission queue entry (see the struct io_uring_sqe definition
in io_uring_enter(2)), and set the buf_index field to the desired
buffer index. The memory range described by the submission queue
entry's addr and len fields must fall within the indexed buffer.

It is perfectly valid to set up a large buffer and then use only part
of it for an I/O, as long as the range is within the originally
mapped region.
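
The range requirement above can be checked in the application before a
fixed read or write is queued. The following is a minimal sketch, not
part of io_uring or liburing: range_in_registered_buffer() is a
hypothetical helper, and bufs is assumed to be the same struct iovec
array that was passed to IORING_REGISTER_BUFFERS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical helper: returns true if [addr, addr + len) lies inside
 * registered buffer number idx of the bufs array. */
static bool range_in_registered_buffer(const struct iovec *bufs,
                                       unsigned int nr_bufs,
                                       unsigned int idx,
                                       const void *addr, size_t len)
{
    if (idx >= nr_bufs)
        return false;

    uintptr_t start = (uintptr_t)bufs[idx].iov_base;
    uintptr_t a = (uintptr_t)addr;

    if (a < start)
        return false;

    /* Compare offsets rather than end addresses to avoid overflow. */
    size_t off = a - start;
    return off <= bufs[idx].iov_len && len <= bufs[idx].iov_len - off;
}
```

With a single 4096-byte buffer registered at index 0, a 512-byte range
starting 1024 bytes into it passes the check, while a 512-byte range
starting 4000 bytes into it does not.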

An application can increase or decrease the size or number of
registered buffers by first unregistering the existing buffers, and
then issuing a new call to io_uring_register(2) with the new buffers.

Note that before 5.13, registering buffers would wait for the ring to
idle: if the application had requests in flight, the registration
would wait for those to finish before proceeding.

An application need not unregister buffers explicitly before shutting
down the io_uring instance. Available since 5.1.

IORING_REGISTER_BUFFERS2
Register buffers for I/O. Similar to IORING_REGISTER_BUFFERS but aims
to have a more extensible ABI.

arg points to a struct io_uring_rsrc_register, and nr_args should be
set to the number of bytes in the structure.

    struct io_uring_rsrc_register {
        __u32 nr;
        __u32 resv;
        __u64 resv2;
        __aligned_u64 data;
        __aligned_u64 tags;
    };

The data field contains a pointer to a struct iovec array of nr
entries. The tags field should either be 0, in which case tagging is
disabled, or point to an array of nr "tags" (unsigned 64-bit
integers). If a tag is zero, then tagging for this particular
resource (a buffer in this case) is disabled. Otherwise, after the
resource has been unregistered and is no longer in use, a CQE will be
posted with user_data set to the specified tag and all other fields
zeroed.

Note that resource updates, e.g. IORING_REGISTER_BUFFERS_UPDATE, don't
necessarily deallocate resources by the time they return; the
resources may be kept alive until all requests using them complete.

Available since 5.13.
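
Filling in the registration structure can be sketched as follows. This
is an illustration only: struct rsrc_register_sketch is a local mirror
of the structure shown above so that the example compiles without
kernel headers, and prep_buffers2() is a hypothetical helper. Zeroing
the reserved fields is done defensively here.

```c
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Local mirror of struct io_uring_rsrc_register as shown above; real
 * code should use the definition from <linux/io_uring.h>. */
struct rsrc_register_sketch {
    uint32_t nr;
    uint32_t resv;
    uint64_t resv2;
    uint64_t data;   /* address of the struct iovec array */
    uint64_t tags;   /* address of the tag array, or 0 for no tagging */
};

/* Hypothetical helper: describe nr buffers, optionally with tags. */
static void prep_buffers2(struct rsrc_register_sketch *reg,
                          const struct iovec *iovs,
                          const uint64_t *tags, uint32_t nr)
{
    memset(reg, 0, sizeof(*reg));           /* clears resv and resv2 */
    reg->nr = nr;
    reg->data = (uint64_t)(uintptr_t)iovs;
    reg->tags = (uint64_t)(uintptr_t)tags;  /* NULL becomes 0 */
}
```

The filled-in structure would then be passed as arg, with nr_args set
to sizeof(struct io_uring_rsrc_register). A tag value of 0 in the tags
array disables tagging for that one buffer only.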

IORING_REGISTER_BUFFERS_UPDATE
Updates registered buffers with new ones, either turning a sparse
entry into a real one, or replacing an existing entry.

arg must contain a pointer to a struct io_uring_rsrc_update2, which
contains an offset at which to start the update, and an array of
struct iovec whose address is stored in data. tags points to an array
of tags. nr must contain the number of descriptors in the passed-in
arrays. See IORING_REGISTER_BUFFERS2 for the resource tagging
description.

    struct io_uring_rsrc_update2 {
        __u32 offset;
        __u32 resv;
        __aligned_u64 data;
        __aligned_u64 tags;
        __u32 nr;
        __u32 resv2;
    };

Available since 5.13.

IORING_UNREGISTER_BUFFERS
This operation takes no argument, and arg must be passed as NULL. All
previously registered buffers associated with the io_uring instance
will be released. Available since 5.1.

IORING_REGISTER_FILES
Register files for I/O. arg contains a pointer to an array of nr_args
file descriptors (signed 32-bit integers).

To make use of the registered files, the IOSQE_FIXED_FILE flag must be
set in the flags member of the struct io_uring_sqe, and the fd member
is set to the index of the file in the file descriptor array.

The file set may be sparse, meaning that an fd entry in the array may
be set to -1. See IORING_REGISTER_FILES_UPDATE for how to update files
in place.

Note that before 5.13, registering files would wait for the ring to
idle: if the application had requests in flight, the registration
would wait for those to finish before proceeding. See
IORING_REGISTER_FILES_UPDATE for how to update an existing set without
that limitation.

Files are automatically unregistered when the io_uring instance is
torn down. An application need only unregister if it wishes to
register a new set of fds. Available since 5.1.

IORING_REGISTER_FILES2
Register files for I/O. Similar to IORING_REGISTER_FILES.

arg points to a struct io_uring_rsrc_register, and nr_args should be
set to the number of bytes in the structure.

The data field contains a pointer to an array of nr file descriptors
(signed 32-bit integers). The tags field should either be 0 or point
to an array of nr "tags" (unsigned 64-bit integers). See
IORING_REGISTER_BUFFERS2 for more info on resource tagging.

Note that resource updates, e.g. IORING_REGISTER_FILES_UPDATE, don't
necessarily deallocate resources; they might be held until all
requests using that resource complete.

Available since 5.13.

IORING_REGISTER_FILES_UPDATE
This operation replaces existing files in the registered file set
with new ones, either turning a sparse entry (one where fd is equal
to -1) into a real one, removing an existing entry (new one is set to
-1), or replacing an existing entry with a new one.

arg must contain a pointer to a struct io_uring_files_update, which
contains an offset at which to start the update, and an array of file
descriptors to use for the update. nr_args must contain the number of
descriptors in the passed-in array. Available since 5.5.

File descriptors can be skipped if they are set to
IORING_REGISTER_FILES_SKIP. Skipping an fd will not touch the file
associated with the previous fd at that index. Available since 5.12.
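
One way to build such an update array is to compare the current file
set with the desired one and skip every unchanged slot. The sketch
below is an illustration under stated assumptions: diff_file_set() is
a hypothetical helper, struct files_update_sketch mirrors the layout
of struct io_uring_files_update in <linux/io_uring.h> (offset, resv,
fds) as far as the author of this example is aware, and
SKETCH_FILES_SKIP restates the value of IORING_REGISTER_FILES_SKIP
(-2) so that the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value of IORING_REGISTER_FILES_SKIP; check the header. */
#define SKETCH_FILES_SKIP (-2)

/* Assumed mirror of struct io_uring_files_update. */
struct files_update_sketch {
    uint32_t offset;  /* first slot to update */
    uint32_t resv;
    uint64_t fds;     /* address of the int32_t descriptor array */
};

/* Hypothetical helper: fill update[] so that unchanged slots are
 * skipped. Returns the number of slots that actually change. */
static size_t diff_file_set(const int32_t *have, const int32_t *want,
                            int32_t *update, size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        if (have[i] == want[i]) {
            update[i] = SKETCH_FILES_SKIP;  /* leave slot untouched */
        } else {
            update[i] = want[i];            /* new fd, or -1 to clear */
            changed++;
        }
    }
    return changed;
}
```

The resulting array would be referenced from the fds member, with
offset set to the first slot covered and nr_args set to n. Comparing
descriptor numbers is a simplification; two equal numbers may refer to
different files if a descriptor was closed and reused.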

IORING_REGISTER_FILES_UPDATE2
Similar to IORING_REGISTER_FILES_UPDATE, replaces existing files in
the registered file set with new ones, either turning a sparse entry
(one where fd is equal to -1) into a real one, removing an existing
entry (new one is set to -1), or replacing an existing entry with a
new one.

arg must contain a pointer to a struct io_uring_rsrc_update2, which
contains an offset at which to start the update, and an array of file
descriptors to use for the update stored in data. tags points to an
array of tags. nr must contain the number of descriptors in the
passed-in arrays. See IORING_REGISTER_BUFFERS2 for the resource
tagging description.

Available since 5.13.

IORING_UNREGISTER_FILES
This operation requires no argument, and arg must be passed as NULL.
All previously registered files associated with the io_uring instance
will be unregistered. Available since 5.1.

IORING_REGISTER_EVENTFD
It's possible to use eventfd(2) to get notified of completion events
on an io_uring instance. If this is desired, an eventfd file
descriptor can be registered through this operation. arg must contain
a pointer to the eventfd file descriptor, and nr_args must be 1. Note
that while io_uring generally takes care to avoid spurious events,
they can occur. Similarly, batched completions of CQEs may only
trigger a single eventfd notification even if multiple CQEs are
posted. The application should not assume that the number of
available events correlates directly with the number of eventfd
notifications posted. An eventfd notification must thus only be
treated as a hint to check the CQ ring for completions. Available
since 5.2.

An application can temporarily disable notifications coming through
the registered eventfd by setting the IORING_CQ_EVENTFD_DISABLED bit
in the flags field of the CQ ring. Available since 5.8.
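
The "hint only" rule above can be illustrated with plain eventfd(2)
calls. In this minimal sketch, drain_eventfd() is a hypothetical
helper and the descriptor is assumed to have been created with
EFD_NONBLOCK. The value it returns says only that something may have
completed; it must not be used as a CQE count.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: read and reset the eventfd counter. Returns the
 * counter value, or 0 if no notification was pending. */
static uint64_t drain_eventfd(int efd)
{
    uint64_t count = 0;

    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

After a non-zero return, the application would reap the CQ ring until
it is empty, regardless of the counter value, since several CQEs may
share one notification and a notification may arrive with no new CQE.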

IORING_REGISTER_EVENTFD_ASYNC
This works just like IORING_REGISTER_EVENTFD, except notifications are
only posted for events that complete in an async manner. This means
that events that complete inline while being submitted do not trigger
a notification event. The arguments supplied are the same as for
IORING_REGISTER_EVENTFD. Available since 5.6.

IORING_UNREGISTER_EVENTFD
Unregister an eventfd file descriptor to stop notifications. Since
only one eventfd descriptor is currently supported, this operation
takes no argument, and arg must be passed as NULL and nr_args must be
zero. Available since 5.2.

IORING_REGISTER_PROBE
This operation returns a structure, io_uring_probe, which contains
information about the opcodes supported by io_uring on the running
kernel. arg must contain a pointer to a struct io_uring_probe, and
nr_args must contain the size of the ops array in that probe struct.
The ops array is of the type io_uring_probe_op, which holds the value
of the opcode and a flags field. If the flags field has
IO_URING_OP_SUPPORTED set, then this opcode is supported on the
running kernel. Available since 5.6.
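
Scanning the returned ops array can be sketched as follows. The
structure layouts and the value of IO_URING_OP_SUPPORTED are local
mirrors of what <linux/io_uring.h> is assumed to define, renamed with
a _sketch suffix so that the example compiles on its own;
alloc_probe() and opcode_supported() are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_OP_SUPPORTED (1U << 0)  /* assumed IO_URING_OP_SUPPORTED */

struct probe_op_sketch {       /* assumed struct io_uring_probe_op */
    uint8_t  op;
    uint8_t  resv;
    uint16_t flags;
    uint32_t resv2;
};

struct probe_sketch {          /* assumed struct io_uring_probe */
    uint8_t  last_op;
    uint8_t  ops_len;          /* number of valid entries in ops[] */
    uint16_t resv;
    uint32_t resv2[3];
    struct probe_op_sketch ops[];
};

/* Allocate a zeroed probe with room for nr_ops entries. */
static struct probe_sketch *alloc_probe(unsigned int nr_ops)
{
    return calloc(1, sizeof(struct probe_sketch) +
                     nr_ops * sizeof(struct probe_op_sketch));
}

/* Returns true if the kernel reported opcode op as supported. */
static bool opcode_supported(const struct probe_sketch *p, uint8_t op)
{
    for (unsigned int i = 0; i < p->ops_len; i++)
        if (p->ops[i].op == op)
            return (p->ops[i].flags & SKETCH_OP_SUPPORTED) != 0;
    return false;
}
```

The probe would be allocated with alloc_probe(n), passed as arg with
nr_args set to n, and queried with opcode_supported() after the call
returns.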

IORING_REGISTER_PERSONALITY
This operation registers credentials of the running application with
io_uring, and returns an id associated with these credentials.
Applications wishing to share a ring between separate users/processes
can pass in this credential id in the sqe personality field. If set,
that particular sqe will be issued with these credentials. Must be
invoked with arg set to NULL and nr_args set to zero. Available since
5.6.

IORING_UNREGISTER_PERSONALITY
This operation unregisters a previously registered personality with
io_uring. nr_args must be set to the id in question, and arg must be
set to NULL. Available since 5.6.

IORING_REGISTER_ENABLE_RINGS
This operation enables an io_uring ring started in a disabled state
(IORING_SETUP_R_DISABLED was specified in the call to
io_uring_setup(2)). While the io_uring ring is disabled, submissions
are not allowed and registrations are not restricted.

After the execution of this operation, the io_uring ring is enabled:
submissions and registration are allowed, but they will be validated
following the registered restrictions (if any). This operation takes
no argument, must be invoked with arg set to NULL and nr_args set to
zero. Available since 5.10.

IORING_REGISTER_RESTRICTIONS
arg points to a struct io_uring_restriction array of nr_args entries.

With an entry it is possible to allow an io_uring_register(2) opcode,
or specify which opcode and flags of the submission queue entry are
allowed, or require certain flags to be specified (these flags must
be set on each submission queue entry).

All the restrictions must be submitted with a single
io_uring_register(2) call and they are handled as an allowlist
(opcodes and flags that are not registered are not allowed).

Restrictions can be registered only if the io_uring ring started in a
disabled state (IORING_SETUP_R_DISABLED must be specified in the call
to io_uring_setup(2)).

Available since 5.10.
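
Building such an allowlist can be sketched as below. Nothing in this
example is defined by this page: the structure layout and the
restriction type values are local mirrors of what <linux/io_uring.h>
is assumed to provide (struct io_uring_restriction and the
IORING_RESTRICTION_* constants), and build_allowlist() is a
hypothetical helper. Verify them against the installed header before
relying on them.

```c
#include <stdint.h>
#include <string.h>

enum {  /* assumed values of the IORING_RESTRICTION_* constants */
    SKETCH_RESTRICTION_REGISTER_OP        = 0,
    SKETCH_RESTRICTION_SQE_OP             = 1,
    SKETCH_RESTRICTION_SQE_FLAGS_ALLOWED  = 2,
    SKETCH_RESTRICTION_SQE_FLAGS_REQUIRED = 3,
};

struct restriction_sketch {  /* assumed struct io_uring_restriction */
    uint16_t opcode;         /* one of the restriction types above */
    union {
        uint8_t register_op;
        uint8_t sqe_op;
        uint8_t sqe_flags;
    };
    uint8_t  resv;
    uint32_t resv2[3];
};

/* Hypothetical helper: allow the given SQE opcodes and, if non-zero,
 * require required_flags on every SQE. res must have room for
 * nr_ops + 1 entries. Returns the number of entries filled in. */
static unsigned int build_allowlist(struct restriction_sketch *res,
                                    const uint8_t *sqe_ops,
                                    unsigned int nr_ops,
                                    uint8_t required_flags)
{
    unsigned int n = 0;

    memset(res, 0, (nr_ops + 1) * sizeof(*res));
    for (unsigned int i = 0; i < nr_ops; i++) {
        res[n].opcode = SKETCH_RESTRICTION_SQE_OP;
        res[n].sqe_op = sqe_ops[i];
        n++;
    }
    if (required_flags) {
        res[n].opcode = SKETCH_RESTRICTION_SQE_FLAGS_REQUIRED;
        res[n].sqe_flags = required_flags;
        n++;
    }
    return n;
}
```

The array and the returned count would be passed as arg and nr_args in
one call, made while the ring is still disabled and before
IORING_REGISTER_ENABLE_RINGS.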

IORING_REGISTER_IOWQ_AFF
By default, async workers created by io_uring will inherit the CPU
mask of their parent. This is usually all the CPUs in the system,
unless the parent is being run with a limited set. If this isn't the
desired outcome, the application may explicitly tell io_uring what
CPUs the async workers may run on. arg must point to a cpu_set_t
mask, and nr_args must be the byte size of that mask.

Available since 5.14.

IORING_UNREGISTER_IOWQ_AFF
Undoes a CPU mask previously set with IORING_REGISTER_IOWQ_AFF. Must
not have arg or nr_args set.

Available since 5.14.

IORING_REGISTER_IOWQ_MAX_WORKERS
By default, io_uring limits the unbounded workers created to the
maximum process count set by RLIMIT_NPROC, and the number of bounded
workers is a function of the SQ ring size and the number of CPUs in
the system. Sometimes this can be excessive (or too little, for
bounded), and this command provides a way to change the count per
ring (per NUMA node) instead.

arg must point to an array of two unsigned int values, which hold the
maximum count of workers per NUMA node. Index 0 holds the bounded
worker count, and index 1 holds the unbounded worker count. On
successful return, the passed-in array will contain the previous
maximum values for each type. If the count being passed in is 0, then
this command returns the current maximum values and doesn't modify
the current setting. nr_args must be set to 2, as the command takes
two values.

Available since 5.15.

IORING_REGISTER_RING_FDS
Whenever io_uring_enter(2) is called to submit requests or wait for
completions, the kernel must grab a reference to the file descriptor.
If the application using io_uring is threaded, the file table is
marked as shared, and the reference grab and put of the file
descriptor count is more expensive than it is for a non-threaded
application.

Similarly to how io_uring allows registration of files, this allows
registration of the ring file descriptor itself. This reduces the
overhead of the io_uring_enter(2) system call.

arg must point to an array of type struct io_uring_rsrc_update of
nr_args number of entries. The data field of this struct must contain
an io_uring file descriptor, and the offset field can be either -1 or
an explicit offset desired for the registered file descriptor value.
If -1 is used, then upon successful return of this system call, the
field will contain the value of the registered file descriptor to be
used for future io_uring_enter(2) system calls.

On successful completion of this request, the returned descriptors
may be used instead of the real file descriptor for
io_uring_enter(2), provided that IORING_ENTER_REGISTERED_RING is set
in the flags for the system call. This flag tells the kernel that a
registered descriptor is used rather than a real file descriptor.

Each thread or process using a ring must register the file descriptor
directly by issuing this request.

The maximum number of supported registered ring descriptors is
currently limited to 16.

Available since 5.18.

IORING_UNREGISTER_RING_FDS
Unregister descriptors previously registered with
IORING_REGISTER_RING_FDS.

arg must point to an array of type struct io_uring_rsrc_update of
nr_args number of entries. Only the offset field should be set in the
structure, containing the registered file descriptor offset
previously returned from IORING_REGISTER_RING_FDS that the
application wishes to unregister.

Note that this isn't done automatically on ring exit if the thread or
task that previously registered a ring file descriptor isn't exiting.
It is recommended to manually unregister any previously registered
ring descriptors if the ring is closed and the task persists. This
will free up a registration slot, making it available for future use.

Available since 5.18.

IORING_REGISTER_PBUF_RING
Registers a shared buffer ring to be used with provided buffers. This
is a newer, more efficient alternative to IORING_OP_PROVIDE_BUFFERS,
to be used with request types that support the IOSQE_BUFFER_SELECT
flag.

The arg argument must be filled in with the appropriate information.
It looks as follows:

    struct io_uring_buf_reg {
        __u64 ring_addr;
        __u32 ring_entries;
        __u16 bgid;
        __u16 pad;
        __u64 resv[3];
    };

The ring_addr field must contain the address of the memory allocated
to fit this ring. The memory must be page aligned and hence allocated
appropriately using e.g. posix_memalign(3) or similar. The size of
the ring is the product of ring_entries and the size of struct
io_uring_buf. ring_entries is the desired size of the ring, and must
be a power of 2. The maximum size allowed is 2^15 (32768). bgid is
the buffer group ID associated with this ring. SQEs that select a
buffer have a buffer group associated with them in their buf_group
field, and the associated CQE will have IORING_CQE_F_BUFFER set in
its flags member, which will also contain the specific ID of the
buffer selected. The rest of the fields are reserved and must be
cleared to zero.

The flags argument is currently unused and must be set to zero.

nr_args must be set to 1.

Also see io_uring_register_buf_ring(3) for more details. Available
since 5.19.
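
The size and alignment rules above can be sketched as follows. The two
structures are local mirrors (struct io_uring_buf_reg as shown above,
and struct io_uring_buf as it is assumed to appear in
<linux/io_uring.h>), and prep_buf_ring() is a hypothetical helper, not
a liburing function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct buf_sketch {            /* assumed struct io_uring_buf */
    uint64_t addr;
    uint32_t len;
    uint16_t bid;
    uint16_t resv;
};

struct buf_reg_sketch {        /* mirror of struct io_uring_buf_reg */
    uint64_t ring_addr;
    uint32_t ring_entries;
    uint16_t bgid;
    uint16_t pad;
    uint64_t resv[3];
};

#define SKETCH_MAX_RING_ENTRIES 32768u  /* 2^15, as stated above */

/* Hypothetical helper: validate entries, allocate a zeroed,
 * page-aligned ring and fill in reg. Returns 0 on success, -1 on bad
 * input or allocation failure; reg is untouched on failure. */
static int prep_buf_ring(struct buf_reg_sketch *reg, uint32_t entries,
                         uint16_t bgid)
{
    void *mem;
    long page = sysconf(_SC_PAGESIZE);

    if (entries == 0 || (entries & (entries - 1)) != 0 ||
        entries > SKETCH_MAX_RING_ENTRIES || page <= 0)
        return -1;

    size_t size = (size_t)entries * sizeof(struct buf_sketch);
    if (posix_memalign(&mem, (size_t)page, size) != 0)
        return -1;
    memset(mem, 0, size);

    memset(reg, 0, sizeof(*reg));   /* reserved fields must be zero */
    reg->ring_addr = (uint64_t)(uintptr_t)mem;
    reg->ring_entries = entries;
    reg->bgid = bgid;
    return 0;
}
```

The caller owns the allocated ring memory and would release it with
free(3) only after the ring has been unregistered.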

IORING_UNREGISTER_PBUF_RING
Unregister a previously registered provided buffer ring. arg must be
set to the address of a struct io_uring_buf_reg, with just the bgid
field set to the buffer group ID of the previously registered
provided buffer group. nr_args must be set to 1. Also see
IORING_REGISTER_PBUF_RING.

Available since 5.19.

IORING_REGISTER_SYNC_CANCEL
Performs a synchronous cancelation request, which works in a similar
fashion to IORING_OP_ASYNC_CANCEL except it completes inline. This
can be useful for scenarios where cancelations should happen
synchronously, rather than needing to issue an SQE and wait for
completion of that specific CQE.

arg must be set to a pointer to a struct io_uring_sync_cancel_reg
structure, with the details filled in for what request(s) to target
for cancelation. See io_uring_register_sync_cancel(3) for details on
that. The return values are the same, except they are passed back
synchronously rather than through the CQE res field. nr_args must be
set to 1.

Available since 6.0.

IORING_REGISTER_FILE_ALLOC_RANGE
Sets the allowable range for fixed file index allocations within the
kernel. When requests that can instantiate a new fixed file are used
with IORING_FILE_INDEX_ALLOC, the application is asking the kernel to
allocate a new fixed file descriptor rather than pass in a specific
value for one. By default, the kernel will pick any available fixed
file descriptor within the range available. This operation
effectively allows the application to set aside a range just for
dynamic allocations, with the remainder being used for specific
values.

nr_args must be set to 1 and arg must be set to a pointer to a struct
io_uring_file_index_range:

    struct io_uring_file_index_range {
        __u32 off;
        __u32 len;
        __u64 resv;
    };

with off being set to the starting value for the range, and len being
set to the number of descriptors. The reserved resv field must be
cleared to zero.

The application must have registered a file table first.

Available since 6.0.
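
Filling in the range can be sketched as follows. The structure is a
local mirror of the one shown above, prep_alloc_range() is a
hypothetical helper, and the check that the range fits inside the
registered file table is an assumption of this example rather than
something this page states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of struct io_uring_file_index_range as shown above. */
struct file_index_range_sketch {
    uint32_t off;
    uint32_t len;
    uint64_t resv;
};

/* Hypothetical helper: describe the range [off, off + len) for dynamic
 * allocation. Returns false if it would not fit in a table of
 * table_size entries (assumed constraint). */
static bool prep_alloc_range(struct file_index_range_sketch *r,
                             uint32_t off, uint32_t len,
                             uint32_t table_size)
{
    /* Written this way so that off + len cannot overflow. */
    if (off > table_size || len > table_size - off)
        return false;

    r->off = off;
    r->len = len;
    r->resv = 0;   /* reserved field must be cleared to zero */
    return true;
}
```

For a 64-entry file table, prep_alloc_range(&r, 32, 32, 64) reserves
the upper half for IORING_FILE_INDEX_ALLOC and leaves slots 0 to 31
for explicitly chosen indexes.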

RETURN VALUE
On success, io_uring_register(2) returns either 0 or a positive value,
depending on the opcode used. On error, a negative error value is
returned. The caller should not rely on the errno variable.
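
The return convention above can be reproduced on top of the raw system
call. This is a sketch only: register_sketch() and
register_result_str() are hypothetical helpers, and the fallback
syscall number 427 is an assumption that holds on x86-64 and arm64 but
not on every architecture.

```c
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_io_uring_register
#define SYS_io_uring_register 427  /* assumption: x86-64 and arm64 */
#endif

/* Hypothetical raw-syscall wrapper following the convention described
 * above: 0 or a positive value on success, -errno on failure. */
static int register_sketch(unsigned int fd, unsigned int opcode,
                           void *arg, unsigned int nr_args)
{
    long ret = syscall(SYS_io_uring_register, fd, opcode, arg, nr_args);

    return ret < 0 ? -errno : (int)ret;
}

/* Turn a return value into a printable message. */
static const char *register_result_str(int ret)
{
    return ret < 0 ? strerror(-ret) : "success";
}
```

A caller would test the result for being negative and compare it
against negated error constants, for example ret == -EBUSY, rather
than inspecting errno.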

ERRORS
EACCES The opcode field is not allowed due to registered restrictions.

EBADF  One or more fds in the fd array are invalid.

EBADFD IORING_REGISTER_ENABLE_RINGS or IORING_REGISTER_RESTRICTIONS was
       specified, but the io_uring ring is not disabled.

EBUSY  IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES or
       IORING_REGISTER_RESTRICTIONS was specified, but there were
       already buffers, files, or restrictions registered.

EFAULT A buffer is outside of the process' accessible address space,
       or iov_len is greater than 1GiB.

EINVAL IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES was specified,
       but nr_args is 0.

EINVAL IORING_REGISTER_BUFFERS was specified, but nr_args exceeds
       UIO_MAXIOV.

EINVAL IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
       specified, and nr_args is non-zero or arg is non-NULL.

EINVAL IORING_REGISTER_RESTRICTIONS was specified, but nr_args exceeds
       the maximum allowed number of restrictions or a restriction
       opcode is invalid.

EMFILE IORING_REGISTER_FILES was specified and nr_args exceeds the
       maximum allowed number of files in a fixed file set.

EMFILE IORING_REGISTER_FILES was specified and adding nr_args file
       references would exceed the maximum number of files the user
       is allowed to have according to the RLIMIT_NOFILE resource
       limit, and the caller does not have the CAP_SYS_RESOURCE
       capability. Note that this is a per-user limit, not per
       process.

ENOMEM Insufficient kernel resources are available, or the caller had
       a non-zero RLIMIT_MEMLOCK soft resource limit, but tried to
       lock more memory than the limit permitted. This limit is not
       enforced if the process is privileged (CAP_IPC_LOCK).

ENXIO  IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
       specified, but there were no buffers or files registered.

ENXIO  Attempt to register files or buffers on an io_uring instance
       that is already undergoing file or buffer registration, or is
       being torn down.

EOPNOTSUPP
       User buffers point to file-backed memory.

Linux                         2019-01-17          io_uring_register(2)