USERFAULTFD(2)             Linux Programmer's Manual            USERFAULTFD(2)

NAME

       userfaultfd - create a file descriptor for handling page faults in user
       space

SYNOPSIS

       #include <fcntl.h>            /* Definition of O_* constants */
       #include <sys/syscall.h>      /* Definition of SYS_* constants */
       #include <unistd.h>

       int syscall(SYS_userfaultfd, int flags);

       Note: glibc provides no wrapper for userfaultfd(), necessitating the
       use of syscall(2).

DESCRIPTION

       userfaultfd() creates a new userfaultfd object that can be used for
       delegation of page-fault handling to a user-space application, and
       returns a file descriptor that refers to the new object.  The new
       userfaultfd object is configured using ioctl(2).

       Once the userfaultfd object is configured, the application can use
       read(2) to receive userfaultfd notifications.  Reads from the
       userfaultfd may be blocking or non-blocking, depending on the value of
       flags used for the creation of the userfaultfd or subsequent calls to
       fcntl(2).

       The following values may be bitwise ORed in flags to change the
       behavior of userfaultfd():

       O_CLOEXEC
              Enable the close-on-exec flag for the new userfaultfd file
              descriptor.  See the description of the O_CLOEXEC flag in
              open(2).

       O_NONBLOCK
              Enable non-blocking operation for the userfaultfd object.  See
              the description of the O_NONBLOCK flag in open(2).

       When the last file descriptor referring to a userfaultfd object is
       closed, all memory ranges that were registered with the object are
       unregistered and unread events are flushed.
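       
       As a minimal sketch of the call described above, the helper below
       wraps syscall(2), since glibc provides no wrapper.  The name
       uffd_open() is an assumption made for illustration and is not part of
       any library.  The helper only creates the descriptor; the object must
       still be enabled with the UFFDIO_API ioctl(2) before it can be used.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not a glibc function): create a userfaultfd
   object.  Returns a new file descriptor, or -1 with errno set. */
static int
uffd_open(int flags)
{
    return (int) syscall(SYS_userfaultfd, flags);
}
```

       A caller would typically pass O_CLOEXEC | O_NONBLOCK and check for a
       return value of -1, for example when the caller lacks the required
       privilege.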
       
       Userfaultfd supports two modes of registration:

       UFFDIO_REGISTER_MODE_MISSING (since 4.10)
              When registered with UFFDIO_REGISTER_MODE_MISSING mode, user
              space will receive a page-fault notification when a missing page
              is accessed.  The faulting thread will be stopped from execution
              until the page fault is resolved from user space by either a
              UFFDIO_COPY or a UFFDIO_ZEROPAGE ioctl.

       UFFDIO_REGISTER_MODE_WP (since 5.7)
              When registered with UFFDIO_REGISTER_MODE_WP mode, user space
              will receive a page-fault notification when a write-protected
              page is written.  The faulting thread will be stopped from
              execution until user space write-unprotects the page using a
              UFFDIO_WRITEPROTECT ioctl.

       Multiple modes can be enabled at the same time for the same memory
       range.

       Since Linux 4.14, a userfaultfd page-fault notification can optionally
       embed faulting thread ID information into the notification.  This
       feature must be enabled explicitly using the UFFD_FEATURE_THREAD_ID
       feature bit when initializing the userfaultfd context.  By default,
       thread ID reporting is disabled.

   Usage
       The userfaultfd mechanism is designed to allow a thread in a
       multithreaded program to perform user-space paging for the other
       threads in the process.  When a page fault occurs for one of the
       regions registered to the userfaultfd object, the faulting thread is
       put to sleep and an event is generated that can be read via the
       userfaultfd file descriptor.  The fault-handling thread reads events
       from this file descriptor and services them using the operations
       described in ioctl_userfaultfd(2).  When servicing the page-fault
       events, the fault-handling thread can trigger a wake-up for the
       sleeping thread.

       It is possible for the faulting threads and the fault-handling threads
       to run in the context of different processes.  In this case, these
       threads may belong to different programs, and the program that executes
       the faulting threads will not necessarily cooperate with the program
       that handles the page faults.  In such non-cooperative mode, the
       process that monitors userfaultfd and handles page faults needs to be
       aware of the changes in the virtual memory layout of the faulting
       process to avoid memory corruption.

       Since Linux 4.11, userfaultfd can also notify the fault-handling
       threads about changes in the virtual memory layout of the faulting
       process.  In addition, if the faulting process invokes fork(2), the
       userfaultfd objects associated with the parent may be duplicated into
       the child process and the userfaultfd monitor will be notified (via the
       UFFD_EVENT_FORK described below) about the file descriptor associated
       with the userfaultfd objects created for the child process, which
       allows the userfaultfd monitor to perform user-space paging for the
       child process.  Unlike page faults, which have to be synchronous and
       require an explicit or implicit wakeup, all other events are delivered
       asynchronously and the non-cooperative process resumes execution as
       soon as the userfaultfd manager executes read(2).  The userfaultfd
       manager should carefully synchronize calls to UFFDIO_COPY with the
       processing of events.

       The current asynchronous model of event delivery is optimal for
       single-threaded non-cooperative userfaultfd manager implementations.

       Since Linux 5.7, userfaultfd is able to do synchronous page dirty
       tracking using the new write-protect register mode.  One should check
       against the feature bit UFFD_FEATURE_PAGEFAULT_FLAG_WP before using
       this feature.  Similar to the original userfaultfd missing mode, the
       write-protect mode will generate a userfaultfd notification when the
       protected page is written.  The user needs to resolve the page fault by
       unprotecting the faulted page and waking the faulting thread so that it
       can continue.  For more information, please refer to the "Userfaultfd
       write-protect mode" section.

   Userfaultfd operation
       After the userfaultfd object is created with userfaultfd(), the
       application must enable it using the UFFDIO_API ioctl(2) operation.
       This operation allows a handshake between the kernel and user space to
       determine the API version and supported features.  This operation must
       be performed before any of the other ioctl(2) operations described
       below (or those operations fail with the EINVAL error).

       After a successful UFFDIO_API operation, the application then registers
       memory address ranges using the UFFDIO_REGISTER ioctl(2) operation.
       After successful completion of a UFFDIO_REGISTER operation, a page
       fault occurring in the requested memory range, and satisfying the mode
       defined at the registration time, will be forwarded by the kernel to
       the user-space application.  The application can then use the
       UFFDIO_COPY or UFFDIO_ZEROPAGE ioctl(2) operations to resolve the page
       fault.
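       
       The handshake and registration steps above can be sketched as two
       small helpers.  The names uffd_handshake() and uffd_register_missing()
       are assumptions made for illustration; only the ioctl(2) requests and
       structures from <linux/userfaultfd.h> are real.  The sketch assumes
       that addr and len are page-aligned.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: perform the UFFDIO_API handshake, requesting
   the feature bits in 'wanted'.  On success, the feature bits reported
   back by the kernel are stored in *supported (if non-NULL).
   Returns 0 on success, or -1 with errno set. */
static int
uffd_handshake(int uffd, uint64_t wanted, uint64_t *supported)
{
    struct uffdio_api api;

    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    api.features = wanted;
    if (ioctl(uffd, UFFDIO_API, &api) == -1)
        return -1;
    if (supported != NULL)
        *supported = api.features;
    return 0;
}

/* Hypothetical helper: register the range [addr, addr + len) for
   missing-page faults.  Returns 0 on success, or -1 with errno set. */
static int
uffd_register_missing(int uffd, void *addr, size_t len)
{
    struct uffdio_register reg;

    memset(&reg, 0, sizeof(reg));
    reg.range.start = (unsigned long) addr;
    reg.range.len = len;
    reg.mode = UFFDIO_REGISTER_MODE_MISSING;
    return ioctl(uffd, UFFDIO_REGISTER, &reg);
}
```

       The handshake must come first; calling the registration helper on a
       descriptor that has not been enabled is expected to fail with EINVAL,
       as described above.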
       
       Since Linux 4.14, if the application sets the UFFD_FEATURE_SIGBUS
       feature bit using the UFFDIO_API ioctl(2), no page-fault notification
       will be forwarded to user space.  Instead, a SIGBUS signal is delivered
       to the faulting process.  With this feature, userfaultfd can be used
       for robustness purposes to simply catch any access to areas within the
       registered address range that do not have pages allocated, without
       having to listen to userfaultfd events.  No userfaultfd monitor will be
       required for dealing with such memory accesses.  For example, this
       feature can be useful for applications that want to prevent the kernel
       from automatically allocating pages and filling holes in sparse files
       when the hole is accessed through a memory mapping.

       The UFFD_FEATURE_SIGBUS feature is implicitly inherited through fork(2)
       if used in combination with UFFD_FEATURE_EVENT_FORK.

       Details of the various ioctl(2) operations can be found in
       ioctl_userfaultfd(2).

       Since Linux 4.11, events other than page-fault may be enabled during
       the UFFDIO_API operation.

       Before Linux 4.11, userfaultfd could be used only with anonymous
       private memory mappings.  Since Linux 4.11, userfaultfd can also be
       used with hugetlbfs and shared memory mappings.

   Userfaultfd write-protect mode (since 5.7)
       Since Linux 5.7, userfaultfd supports write-protect mode.  Before using
       this feature, the user must first check its availability by testing the
       UFFD_FEATURE_PAGEFAULT_FLAG_WP feature bit with the UFFDIO_API ioctl.

       To register with userfaultfd write-protect mode, the user needs to
       initiate the UFFDIO_REGISTER ioctl with mode UFFDIO_REGISTER_MODE_WP
       set.  Note that it is legal to monitor the same memory range with
       multiple modes.  For example, the user can do UFFDIO_REGISTER with the
       mode set to UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP.
       When only UFFDIO_REGISTER_MODE_WP is registered, user space will not
       receive any notification when a missing page is written.  Instead, user
       space will receive a write-protect page-fault notification only when an
       existing but write-protected page is written.

       After the UFFDIO_REGISTER ioctl has completed with
       UFFDIO_REGISTER_MODE_WP mode set, the user can write-protect any
       existing memory within the range using the ioctl UFFDIO_WRITEPROTECT,
       where uffdio_writeprotect.mode should be set to
       UFFDIO_WRITEPROTECT_MODE_WP.

       When a write-protect event happens, user space will receive a
       page-fault notification whose uffd_msg.pagefault.flags has the
       UFFD_PAGEFAULT_FLAG_WP flag set.  Note: since only writes can trigger
       this kind of fault, write-protect notifications will always have the
       UFFD_PAGEFAULT_FLAG_WRITE bit set along with the UFFD_PAGEFAULT_FLAG_WP
       bit.

       To resolve a write-protection page fault, the user should initiate
       another UFFDIO_WRITEPROTECT ioctl on the faulted page or range, with
       the UFFDIO_WRITEPROTECT_MODE_WP flag cleared in
       uffdio_writeprotect.mode.

       Write-protect mode supports only private anonymous memory.
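       
       Both write-protecting a range and resolving a write-protect fault use
       the same ioctl and differ only in the mode field.  The sketch below
       illustrates this; the helper name uffd_set_wp() is an assumption made
       for illustration, and the code assumes kernel headers from Linux 5.7
       or later and a page-aligned range.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: write-protect (protect != 0) or write-unprotect
   (protect == 0) the range [addr, addr + len), which must lie within a
   range registered with UFFDIO_REGISTER_MODE_WP.
   Returns 0 on success, or -1 with errno set. */
static int
uffd_set_wp(int uffd, void *addr, size_t len, int protect)
{
    struct uffdio_writeprotect wp;

    memset(&wp, 0, sizeof(wp));
    wp.range.start = (unsigned long) addr;
    wp.range.len = len;
    /* Setting the flag protects; clearing it resolves a fault. */
    wp.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
    return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}
```

       A dirty-tracking monitor might call uffd_set_wp(uffd, addr, len, 1)
       once after registration, and then uffd_set_wp(uffd, page, page_size,
       0) for each write-protect fault it reads.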
       
   Reading from the userfaultfd structure
       Each read(2) from the userfaultfd file descriptor returns one or more
       uffd_msg structures, each of which describes a page-fault event or an
       event required for the non-cooperative userfaultfd usage:

           struct uffd_msg {
               __u8  event;            /* Type of event */
               ...
               union {
                   struct {
                       __u64 flags;    /* Flags describing fault */
                       __u64 address;  /* Faulting address */
                       union {
                           __u32 ptid; /* Thread ID of the fault */
                       } feat;
                   } pagefault;

                   struct {            /* Since Linux 4.11 */
                       __u32 ufd;      /* Userfault file descriptor
                                          of the child process */
                   } fork;

                   struct {            /* Since Linux 4.11 */
                       __u64 from;     /* Old address of remapped area */
                       __u64 to;       /* New address of remapped area */
                       __u64 len;      /* Original mapping length */
                   } remap;

                   struct {            /* Since Linux 4.11 */
                       __u64 start;    /* Start address of removed area */
                       __u64 end;      /* End address of removed area */
                   } remove;
                   ...
               } arg;

               /* Padding fields omitted */
           } __packed;

       If multiple events are available and the supplied buffer is large
       enough, read(2) returns as many events as will fit in the supplied
       buffer.  If the buffer supplied to read(2) is smaller than the size of
       the uffd_msg structure, the read(2) fails with the error EINVAL.
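       
       Because a single read(2) may return several events, a monitor can read
       into an array and divide the byte count by the structure size.  The
       following is a minimal sketch; the helper name uffd_read_events() is an
       assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to 'max' events into 'msgs'.
   Returns the number of events read, or -1 with errno set (for
   example, EAGAIN on a non-blocking descriptor with nothing pending). */
static ssize_t
uffd_read_events(int uffd, struct uffd_msg *msgs, size_t max)
{
    ssize_t nread;

    /* The buffer must hold at least one whole uffd_msg; otherwise
       read(2) fails with EINVAL. */
    nread = read(uffd, msgs, max * sizeof(*msgs));
    if (nread == -1)
        return -1;

    return nread / (ssize_t) sizeof(*msgs);
}
```

       The caller would then loop over the returned number of entries and
       inspect the event field of each one.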
       
       The fields set in the uffd_msg structure are as follows:

       event  The type of event.  Depending on the event type, different
              fields of the arg union represent details required for the event
              processing.  The non-page-fault events are generated only when
              the appropriate feature is enabled during the API handshake with
              the UFFDIO_API ioctl(2).

              The following values can appear in the event field:

              UFFD_EVENT_PAGEFAULT (since Linux 4.3)
                     A page-fault event.  The page-fault details are available
                     in the pagefault field.

              UFFD_EVENT_FORK (since Linux 4.11)
                     Generated when the faulting process invokes fork(2) (or
                     clone(2) without the CLONE_VM flag).  The event details
                     are available in the fork field.

              UFFD_EVENT_REMAP (since Linux 4.11)
                     Generated when the faulting process invokes mremap(2).
                     The event details are available in the remap field.

              UFFD_EVENT_REMOVE (since Linux 4.11)
                     Generated when the faulting process invokes madvise(2)
                     with MADV_DONTNEED or MADV_REMOVE advice.  The event
                     details are available in the remove field.

              UFFD_EVENT_UNMAP (since Linux 4.11)
                     Generated when the faulting process unmaps a memory
                     range, either explicitly using munmap(2) or implicitly
                     during mmap(2) or mremap(2).  The event details are
                     available in the remove field.

       pagefault.address
              The address that triggered the page fault.

       pagefault.flags
              A bit mask of flags that describe the event.  For
              UFFD_EVENT_PAGEFAULT, the following flags may appear:

              UFFD_PAGEFAULT_FLAG_WRITE
                     If the address is in a range that was registered with the
                     UFFDIO_REGISTER_MODE_MISSING flag (see
                     ioctl_userfaultfd(2)) and this flag is set, this is a
                     write fault; otherwise it is a read fault.

              UFFD_PAGEFAULT_FLAG_WP
                     If the address is in a range that was registered with the
                     UFFDIO_REGISTER_MODE_WP flag and this flag is set, this
                     is a write-protect fault; otherwise it is a page-missing
                     fault.

       pagefault.feat.ptid
              The thread ID that triggered the page fault.

       fork.ufd
              The file descriptor associated with the userfaultfd object
              created for the child created by fork(2).

       remap.from
              The original address of the memory range that was remapped using
              mremap(2).

       remap.to
              The new address of the memory range that was remapped using
              mremap(2).

       remap.len
              The original length of the memory range that was remapped using
              mremap(2).

       remove.start
              The start address of the memory range that was freed using
              madvise(2) or unmapped.

       remove.end
              The end address of the memory range that was freed using
              madvise(2) or unmapped.
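       
       The event types and page-fault flags listed above can be decoded with
       a simple switch.  The following sketch maps a message to a short
       label; the helper name uffd_event_label() and the label strings are
       assumptions made for illustration.

```c
#include <assert.h>
#include <linux/userfaultfd.h>
#include <string.h>

/* Hypothetical helper: return a short label describing an event.
   For page faults, UFFD_PAGEFAULT_FLAG_WP is tested first, because
   write-protect faults also carry UFFD_PAGEFAULT_FLAG_WRITE. */
static const char *
uffd_event_label(const struct uffd_msg *msg)
{
    switch (msg->event) {
    case UFFD_EVENT_PAGEFAULT:
        if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)
            return "write-protect fault";
        if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
            return "missing-page write fault";
        return "missing-page read fault";
    case UFFD_EVENT_FORK:
        return "fork";      /* details in msg->arg.fork */
    case UFFD_EVENT_REMAP:
        return "remap";     /* details in msg->arg.remap */
    case UFFD_EVENT_REMOVE:
        return "remove";    /* details in msg->arg.remove */
    case UFFD_EVENT_UNMAP:
        return "unmap";     /* details also in msg->arg.remove */
    default:
        return "unknown";
    }
}
```

       A monitor would normally replace each return statement with the
       handling appropriate for that event, reading only the arg member that
       matches the event type.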
       
       A read(2) on a userfaultfd file descriptor can fail with the following
       errors:

       EINVAL The userfaultfd object has not yet been enabled using the
              UFFDIO_API ioctl(2) operation.

       If the O_NONBLOCK flag is enabled in the associated open file
       description, the userfaultfd file descriptor can be monitored with
       poll(2), select(2), and epoll(7).  When events are available, the file
       descriptor is reported as readable.  If the O_NONBLOCK flag is not
       enabled, then poll(2) (always) indicates the file as having a POLLERR
       condition, and select(2) indicates the file descriptor as both readable
       and writable.
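       
       The polling behavior described above suggests checking for POLLERR as
       well as POLLIN.  The following is a minimal sketch of such a wait; the
       helper name wait_readable() is an assumption made for illustration,
       and the helper works on any pollable file descriptor.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to
   become readable.  Returns 1 if POLLIN is reported, 0 if it is not
   (for example, on timeout), and -1 on a poll(2) error or if POLLERR
   is reported. */
static int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int nready;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    nready = poll(&pfd, 1, timeout_ms);
    if (nready == -1)
        return -1;
    if (nready == 0)
        return 0;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLIN) != 0;
}
```

       With a userfaultfd descriptor created without O_NONBLOCK, this helper
       would be expected to return -1, reflecting the POLLERR condition noted
       above.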

RETURN VALUE

       On success, userfaultfd() returns a new file descriptor that refers to
       the userfaultfd object.  On error, -1 is returned, and errno is set to
       indicate the error.

ERRORS

       EINVAL An unsupported value was specified in flags.

       EMFILE The per-process limit on the number of open file descriptors has
              been reached.

       ENFILE The system-wide limit on the total number of open files has been
              reached.

       ENOMEM Insufficient kernel memory was available.

       EPERM (since Linux 5.2)
              The caller is not privileged (does not have the CAP_SYS_PTRACE
              capability in the initial user namespace), and
              /proc/sys/vm/unprivileged_userfaultfd has the value 0.
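       
       When diagnosing an EPERM failure, it can help to inspect the setting
       named above.  The following is a minimal sketch that reads the file;
       the helper name unprivileged_userfaultfd_setting() is an assumption
       made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read /proc/sys/vm/unprivileged_userfaultfd.
   Returns the value read, or -1 if the file cannot be opened or
   parsed (for example, on kernels that lack this setting). */
static int
unprivileged_userfaultfd_setting(void)
{
    FILE *fp;
    int value = -1;

    fp = fopen("/proc/sys/vm/unprivileged_userfaultfd", "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}
```

       A return value of 0 combined with an EPERM error from userfaultfd()
       indicates that the caller needs the CAP_SYS_PTRACE capability.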

VERSIONS

       The userfaultfd() system call first appeared in Linux 4.3.

       The support for hugetlbfs and shared memory areas and non-page-fault
       events was added in Linux 4.11.

CONFORMING TO

       userfaultfd() is Linux-specific and should not be used in programs
       intended to be portable.

NOTES

       The userfaultfd mechanism can be used as an alternative to traditional
       user-space paging techniques based on the use of the SIGSEGV signal and
       mmap(2).  It can also be used to implement lazy restore for
       checkpoint/restore mechanisms, as well as post-copy migration to allow
       (nearly) uninterrupted execution when transferring virtual machines and
       Linux containers from one host to another.

BUGS

       If the UFFD_FEATURE_EVENT_FORK feature is enabled and a system call
       from the fork(2) family is interrupted by a signal or fails, a stale
       userfaultfd descriptor might be created.  In this case, a spurious
       UFFD_EVENT_FORK will be delivered to the userfaultfd monitor.

EXAMPLES

       The program below demonstrates the use of the userfaultfd mechanism.
       The program creates two threads, one of which acts as the page-fault
       handler for the process, for the pages in a demand-zero paged region
       created using mmap(2).

       The program takes one command-line argument, which is the number of
       pages that will be created in a mapping whose page faults will be
       handled via userfaultfd.  After creating a userfaultfd object, the
       program then creates an anonymous private mapping of the specified size
       and registers the address range of that mapping using the
       UFFDIO_REGISTER ioctl(2) operation.  The program then creates a second
       thread that will perform the task of handling page faults.

       The main thread then walks through the pages of the mapping fetching
       bytes from successive pages.  Because the pages have not yet been
       accessed, the first access of a byte in each page will trigger a
       page-fault event on the userfaultfd file descriptor.

       Each of the page-fault events is handled by the second thread, which
       sits in a loop processing input from the userfaultfd file descriptor.
       In each loop iteration, the second thread first calls poll(2) to check
       the state of the file descriptor, and then reads an event from the file
       descriptor.  All such events should be UFFD_EVENT_PAGEFAULT events,
       which the thread handles by copying a page of data into the faulting
       region using the UFFDIO_COPY ioctl(2) operation.

       The following is an example of what we see when running the program:

           $ ./userfaultfd_demo 3
           Address returned by mmap() = 0x7fd30106c000

           fault_handler_thread():
               poll() returns: nready = 1; POLLIN = 1; POLLERR = 0
               UFFD_EVENT_PAGEFAULT event: flags = 0; address = 7fd30106c00f
                   (uffdio_copy.copy returned 4096)
           Read address 0x7fd30106c00f in main(): A
           Read address 0x7fd30106c40f in main(): A
           Read address 0x7fd30106c80f in main(): A
           Read address 0x7fd30106cc0f in main(): A

           fault_handler_thread():
               poll() returns: nready = 1; POLLIN = 1; POLLERR = 0
               UFFD_EVENT_PAGEFAULT event: flags = 0; address = 7fd30106d00f
                   (uffdio_copy.copy returned 4096)
           Read address 0x7fd30106d00f in main(): B
           Read address 0x7fd30106d40f in main(): B
           Read address 0x7fd30106d80f in main(): B
           Read address 0x7fd30106dc0f in main(): B

           fault_handler_thread():
               poll() returns: nready = 1; POLLIN = 1; POLLERR = 0
               UFFD_EVENT_PAGEFAULT event: flags = 0; address = 7fd30106e00f
                   (uffdio_copy.copy returned 4096)
           Read address 0x7fd30106e00f in main(): C
           Read address 0x7fd30106e40f in main(): C
           Read address 0x7fd30106e80f in main(): C
           Read address 0x7fd30106ec0f in main(): C
       
   Program source

       /* userfaultfd_demo.c

          Licensed under the GNU General Public License version 2 or later.
       */
       #define _GNU_SOURCE
       #include <inttypes.h>
       #include <sys/types.h>
       #include <stdio.h>
       #include <linux/userfaultfd.h>
       #include <pthread.h>
       #include <errno.h>
       #include <unistd.h>
       #include <stdlib.h>
       #include <fcntl.h>
       #include <signal.h>
       #include <poll.h>
       #include <string.h>
       #include <sys/mman.h>
       #include <sys/syscall.h>
       #include <sys/ioctl.h>

       #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                               } while (0)

       static int page_size;

       static void *
       fault_handler_thread(void *arg)
       {
           static struct uffd_msg msg;   /* Data read from userfaultfd */
           static int fault_cnt = 0;     /* Number of faults so far handled */
           long uffd;                    /* userfaultfd file descriptor */
           static char *page = NULL;
           struct uffdio_copy uffdio_copy;
           ssize_t nread;

           uffd = (long) arg;

           /* Create a page that will be copied into the faulting region. */

           if (page == NULL) {
               page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
               if (page == MAP_FAILED)
                   errExit("mmap");
           }

           /* Loop, handling incoming events on the userfaultfd
              file descriptor. */

           for (;;) {

               /* See what poll() tells us about the userfaultfd. */

               struct pollfd pollfd;
               int nready;
               pollfd.fd = uffd;
               pollfd.events = POLLIN;
               nready = poll(&pollfd, 1, -1);
               if (nready == -1)
                   errExit("poll");

               printf("\nfault_handler_thread():\n");
               printf("    poll() returns: nready = %d; "
                       "POLLIN = %d; POLLERR = %d\n", nready,
                       (pollfd.revents & POLLIN) != 0,
                       (pollfd.revents & POLLERR) != 0);

               /* Read an event from the userfaultfd. */

               nread = read(uffd, &msg, sizeof(msg));
               if (nread == 0) {
                   printf("EOF on userfaultfd!\n");
                   exit(EXIT_FAILURE);
               }

               if (nread == -1)
                   errExit("read");

               /* We expect only one kind of event; verify that assumption. */

               if (msg.event != UFFD_EVENT_PAGEFAULT) {
                   fprintf(stderr, "Unexpected event on userfaultfd\n");
                   exit(EXIT_FAILURE);
               }

               /* Display info about the page-fault event. */

               printf("    UFFD_EVENT_PAGEFAULT event: ");
               printf("flags = %"PRIx64"; ", msg.arg.pagefault.flags);
               printf("address = %"PRIx64"\n", msg.arg.pagefault.address);

               /* Copy the page pointed to by 'page' into the faulting
                  region. Vary the contents that are copied in, so that it
                  is more obvious that each fault is handled separately. */

               memset(page, 'A' + fault_cnt % 20, page_size);
               fault_cnt++;

               uffdio_copy.src = (unsigned long) page;

               /* We need to handle page faults in units of pages(!).
                  So, round faulting address down to page boundary. */

               uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
                                                  ~(page_size - 1);
               uffdio_copy.len = page_size;
               uffdio_copy.mode = 0;
               uffdio_copy.copy = 0;
               if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
                   errExit("ioctl-UFFDIO_COPY");

               printf("        (uffdio_copy.copy returned %"PRId64")\n",
                       uffdio_copy.copy);
           }
       }

       int
       main(int argc, char *argv[])
       {
           long uffd;          /* userfaultfd file descriptor */
           char *addr;         /* Start of region handled by userfaultfd */
           uint64_t len;       /* Length of region handled by userfaultfd */
           pthread_t thr;      /* ID of thread that handles page faults */
           struct uffdio_api uffdio_api;
           struct uffdio_register uffdio_register;
           int s;

           if (argc != 2) {
               fprintf(stderr, "Usage: %s num-pages\n", argv[0]);
               exit(EXIT_FAILURE);
           }

           page_size = sysconf(_SC_PAGE_SIZE);
           len = strtoull(argv[1], NULL, 0) * page_size;

           /* Create and enable userfaultfd object. */

           uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
           if (uffd == -1)
               errExit("userfaultfd");

           uffdio_api.api = UFFD_API;
           uffdio_api.features = 0;
           if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
               errExit("ioctl-UFFDIO_API");

           /* Create a private anonymous mapping. The memory will be
              demand-zero paged--that is, not yet allocated. When we
              actually touch the memory, it will be allocated via
              the userfaultfd. */

           addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
           if (addr == MAP_FAILED)
               errExit("mmap");

           printf("Address returned by mmap() = %p\n", addr);

           /* Register the memory range of the mapping we just created for
              handling by the userfaultfd object. In mode, we request to track
              missing pages (i.e., pages that have not yet been faulted in). */

           uffdio_register.range.start = (unsigned long) addr;
           uffdio_register.range.len = len;
           uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
           if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
               errExit("ioctl-UFFDIO_REGISTER");

           /* Create a thread that will process the userfaultfd events. */

           s = pthread_create(&thr, NULL, fault_handler_thread, (void *) uffd);
           if (s != 0) {
               errno = s;
               errExit("pthread_create");
           }

           /* Main thread now touches memory in the mapping, touching
              locations 1024 bytes apart. This will trigger userfaultfd
              events for all pages in the region. */

           uint64_t l; /* Unsigned, to match the type of 'len' */
           l = 0xf;    /* Ensure that faulting address is not on a page
                          boundary, in order to test that we correctly
                          handle that case in fault_handler_thread(). */
           while (l < len) {
               char c = addr[l];
               printf("Read address %p in main(): ", addr + l);
               printf("%c\n", c);
               l += 1024;
               usleep(100000);         /* Slow things down a little */
           }

           exit(EXIT_SUCCESS);
       }

SEE ALSO

       fcntl(2), ioctl(2), ioctl_userfaultfd(2), madvise(2), mmap(2)

       Documentation/admin-guide/mm/userfaultfd.rst in the Linux kernel source
       tree

COLOPHON

       This page is part of release 5.13 of the Linux man-pages project.  A
       description of the project, information about reporting bugs, and the
       latest version of this page, can be found at
       https://www.kernel.org/doc/man-pages/.

Linux                             2021-03-22                    USERFAULTFD(2)