MMAP(2)                Linux Programmer's Manual                MMAP(2)

4
NAME
6 mmap, munmap - map or unmap files or devices into memory
7
SYNOPSIS
#include <sys/mman.h>

void *mmap(void *addr, size_t length, int prot, int flags,
           int fd, off_t offset);
int munmap(void *addr, size_t length);
14
15 See NOTES for information on feature test macro requirements.
16
DESCRIPTION
18 mmap() creates a new mapping in the virtual address space of the call‐
19 ing process. The starting address for the new mapping is specified in
20 addr. The length argument specifies the length of the mapping (which
21 must be greater than 0).
22
23 If addr is NULL, then the kernel chooses the (page-aligned) address at
24 which to create the mapping; this is the most portable method of creat‐
25 ing a new mapping. If addr is not NULL, then the kernel takes it as a
26 hint about where to place the mapping; on Linux, the mapping will be
27 created at a nearby page boundary. The address of the new mapping is
28 returned as the result of the call.
29
30 The contents of a file mapping (as opposed to an anonymous mapping; see
31 MAP_ANONYMOUS below), are initialized using length bytes starting at
32 offset offset in the file (or other object) referred to by the file
33 descriptor fd. offset must be a multiple of the page size as returned
34 by sysconf(_SC_PAGE_SIZE).
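The page-alignment requirement on offset can be met by rounding the desired file offset down to a page boundary and then skipping the difference inside the mapping. The following is a minimal sketch; the helper names page_align_down() and offset_within_page() are hypothetical and are not part of any library.

```c
#include <sys/types.h>
#include <unistd.h>

/* Round a file offset down to the start of the page that contains it.
   Relies on the page size being a power of two. */
static off_t
page_align_down(off_t offset)
{
    off_t page_size = sysconf(_SC_PAGE_SIZE);

    return offset & ~(page_size - 1);
}

/* Number of bytes to skip from the start of the mapping in order to
   reach the byte at the originally requested offset. */
static off_t
offset_within_page(off_t offset)
{
    return offset - page_align_down(offset);
}
```

A caller would pass page_align_down(offset) as the offset argument of mmap(), add offset_within_page(offset) to the requested length, and add the same amount to the returned pointer before using the data.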
35
36 The prot argument describes the desired memory protection of the map‐
37 ping (and must not conflict with the open mode of the file). It is
38 either PROT_NONE or the bitwise OR of one or more of the following
39 flags:
40
41 PROT_EXEC Pages may be executed.
42
43 PROT_READ Pages may be read.
44
45 PROT_WRITE Pages may be written.
46
47 PROT_NONE Pages may not be accessed.
48
49 The flags argument determines whether updates to the mapping are visi‐
50 ble to other processes mapping the same region, and whether updates are
51 carried through to the underlying file. This behavior is determined by
52 including exactly one of the following values in flags:
53
54 MAP_SHARED
55 Share this mapping. Updates to the mapping are visible to other
56 processes mapping the same region, and (in the case of file-
57 backed mappings) are carried through to the underlying file.
58 (To precisely control when updates are carried through to the
59 underlying file requires the use of msync(2).)
60
61 MAP_SHARED_VALIDATE (since Linux 4.15)
62 This flag provides the same behavior as MAP_SHARED except that
63 MAP_SHARED mappings ignore unknown flags in flags. By contrast,
64 when creating a mapping using MAP_SHARED_VALIDATE, the kernel
65 verifies all passed flags are known and fails the mapping with
66 the error EOPNOTSUPP for unknown flags. This mapping type is
67 also required to be able to use some mapping flags (e.g.,
68 MAP_SYNC).
69
70 MAP_PRIVATE
71 Create a private copy-on-write mapping. Updates to the mapping
72 are not visible to other processes mapping the same file, and
73 are not carried through to the underlying file. It is unspeci‐
74 fied whether changes made to the file after the mmap() call are
75 visible in the mapped region.
76
77 Both MAP_SHARED and MAP_PRIVATE are described in POSIX.1-2001 and
78 POSIX.1-2008. MAP_SHARED_VALIDATE is a Linux extension.
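The difference between MAP_SHARED and MAP_PRIVATE can be observed with an anonymous mapping that is inherited across fork(2). The following is a sketch rather than a definitive test; the function name value_seen_by_parent() is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map one int anonymously with the given sharing flag, let a child
   process store 42 in it, and return the value that the parent sees
   afterward. Returns -1 on any failure. */
static int
value_seen_by_parent(int sharing_flag)
{
    int *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                  sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *p = 0;
    pid_t pid = fork();
    if (pid == -1) {
        munmap(p, sizeof(int));
        return -1;
    }
    if (pid == 0) {             /* Child: update the mapping and exit */
        *p = 42;
        _exit(0);
    }

    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *p;
    munmap(p, sizeof(int));
    return seen;
}
```

With MAP_SHARED the parent is expected to see the child's update; with MAP_PRIVATE the child's write goes to its own copy-on-write page and the parent still sees 0.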
79
80 In addition, zero or more of the following values can be ORed in flags:
81
82 MAP_32BIT (since Linux 2.4.20, 2.6)
83 Put the mapping into the first 2 Gigabytes of the process
84 address space. This flag is supported only on x86-64, for
85 64-bit programs. It was added to allow thread stacks to be
86 allocated somewhere in the first 2 GB of memory, so as to
87 improve context-switch performance on some early 64-bit proces‐
88 sors. Modern x86-64 processors no longer have this performance
89 problem, so use of this flag is not required on those systems.
90 The MAP_32BIT flag is ignored when MAP_FIXED is set.
91
92 MAP_ANON
93 Synonym for MAP_ANONYMOUS. Deprecated.
94
95 MAP_ANONYMOUS
96 The mapping is not backed by any file; its contents are initial‐
97 ized to zero. The fd argument is ignored; however, some imple‐
98 mentations require fd to be -1 if MAP_ANONYMOUS (or MAP_ANON) is
99 specified, and portable applications should ensure this. The
100 offset argument should be zero. The use of MAP_ANONYMOUS in
101 conjunction with MAP_SHARED is supported on Linux only since
102 kernel 2.4.
103
104 MAP_DENYWRITE
This flag is ignored. (Long ago, in Linux 2.0 and earlier, it
signaled that attempts to write to the underlying file should fail
with ETXTBSY. But this was a source of denial-of-service
attacks.)
109
110 MAP_EXECUTABLE
111 This flag is ignored.
112
113 MAP_FILE
114 Compatibility flag. Ignored.
115
116 MAP_FIXED
117 Don't interpret addr as a hint: place the mapping at exactly
118 that address. addr must be suitably aligned: for most architec‐
119 tures a multiple of the page size is sufficient; however, some
architectures may impose additional restrictions. If the memory
region specified by addr and length overlaps pages of any existing
mapping(s), then the overlapped part of the existing mapping(s)
will be discarded. If the specified address cannot be used,
mmap() will fail.
125
126 Software that aspires to be portable should use the MAP_FIXED
127 flag with care, keeping in mind that the exact layout of a
128 process's memory mappings is allowed to change significantly
129 between kernel versions, C library versions, and operating sys‐
130 tem releases. Carefully read the discussion of this flag in
131 NOTES!
132
133 MAP_FIXED_NOREPLACE (since Linux 4.17)
134 This flag provides behavior that is similar to MAP_FIXED with
135 respect to the addr enforcement, but differs in that
136 MAP_FIXED_NOREPLACE never clobbers a preexisting mapped range.
137 If the requested range would collide with an existing mapping,
138 then this call fails with the error EEXIST. This flag can
139 therefore be used as a way to atomically (with respect to other
140 threads) attempt to map an address range: one thread will suc‐
141 ceed; all others will report failure.
142
143 Note that older kernels which do not recognize the
144 MAP_FIXED_NOREPLACE flag will typically (upon detecting a colli‐
145 sion with a preexisting mapping) fall back to a "non-MAP_FIXED"
146 type of behavior: they will return an address that is different
147 from the requested address. Therefore, backward-compatible
148 software should check the returned address against the requested
149 address.
150
151 MAP_GROWSDOWN
152 This flag is used for stacks. It indicates to the kernel vir‐
153 tual memory system that the mapping should extend downward in
154 memory. The return address is one page lower than the memory
155 area that is actually created in the process's virtual address
156 space. Touching an address in the "guard" page below the map‐
157 ping will cause the mapping to grow by a page. This growth can
158 be repeated until the mapping grows to within a page of the high
159 end of the next lower mapping, at which point touching the
160 "guard" page will result in a SIGSEGV signal.
161
162 MAP_HUGETLB (since Linux 2.6.32)
163 Allocate the mapping using "huge pages." See the Linux kernel
164 source file Documentation/vm/hugetlbpage.txt for further infor‐
165 mation, as well as NOTES, below.
166
167 MAP_HUGE_2MB, MAP_HUGE_1GB (since Linux 3.8)
168 Used in conjunction with MAP_HUGETLB to select alternative
169 hugetlb page sizes (respectively, 2 MB and 1 GB) on systems that
170 support multiple hugetlb page sizes.
171
172 More generally, the desired huge page size can be configured by
173 encoding the base-2 logarithm of the desired page size in the
174 six bits at the offset MAP_HUGE_SHIFT. (A value of zero in this
175 bit field provides the default huge page size; the default huge
176 page size can be discovered via the Hugepagesize field exposed
177 by /proc/meminfo.) Thus, the above two constants are defined
178 as:
179
#define MAP_HUGE_2MB    (21 << MAP_HUGE_SHIFT)
#define MAP_HUGE_1GB    (30 << MAP_HUGE_SHIFT)
182
183 The range of huge page sizes that are supported by the system
184 can be discovered by listing the subdirectories in /sys/ker‐
185 nel/mm/hugepages.
186
187 MAP_LOCKED (since Linux 2.5.37)
188 Mark the mapped region to be locked in the same way as mlock(2).
189 This implementation will try to populate (prefault) the whole
190 range but the mmap() call doesn't fail with ENOMEM if this
fails. Therefore major faults might happen later on. So the
semantics are not as strong as mlock(2). One should use mmap()
193 plus mlock(2) when major faults are not acceptable after the
194 initialization of the mapping. The MAP_LOCKED flag is ignored
195 in older kernels.
196
197 MAP_NONBLOCK (since Linux 2.5.46)
198 This flag is meaningful only in conjunction with MAP_POPULATE.
Don't perform read-ahead: create page table entries only for
pages that are already present in RAM. Since Linux 2.6.23, this
flag causes MAP_POPULATE to do nothing. One day, the combination
of MAP_POPULATE and MAP_NONBLOCK may be reimplemented.
203
204 MAP_NORESERVE
205 Do not reserve swap space for this mapping. When swap space is
206 reserved, one has the guarantee that it is possible to modify
207 the mapping. When swap space is not reserved one might get
208 SIGSEGV upon a write if no physical memory is available. See
209 also the discussion of the file /proc/sys/vm/overcommit_memory
210 in proc(5). In kernels before 2.6, this flag had effect only
211 for private writable mappings.
212
213 MAP_POPULATE (since Linux 2.5.46)
214 Populate (prefault) page tables for a mapping. For a file map‐
215 ping, this causes read-ahead on the file. This will help to
216 reduce blocking on page faults later. MAP_POPULATE is supported
217 for private mappings only since Linux 2.6.23.
218
219 MAP_STACK (since Linux 2.6.27)
220 Allocate the mapping at an address suitable for a process or
221 thread stack. This flag is currently a no-op, but is used in
222 the glibc threading implementation so that if some architectures
223 require special treatment for stack allocations, support can
224 later be transparently implemented for glibc.
225
226 MAP_SYNC (since Linux 4.15)
227 This flag is available only with the MAP_SHARED_VALIDATE mapping
228 type; mappings of type MAP_SHARED will silently ignore this
229 flag. This flag is supported only for files supporting DAX
230 (direct mapping of persistent memory). For other files, creat‐
231 ing a mapping with this flag results in an EOPNOTSUPP error.
232
233 Shared file mappings with this flag provide the guarantee that
234 while some memory is writably mapped in the address space of the
235 process, it will be visible in the same file at the same offset
236 even after the system crashes or is rebooted. In conjunction
237 with the use of appropriate CPU instructions, this provides
238 users of such mappings with a more efficient way of making data
239 modifications persistent.
240
241 MAP_UNINITIALIZED (since Linux 2.6.33)
242 Don't clear anonymous pages. This flag is intended to improve
243 performance on embedded devices. This flag is honored only if
244 the kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIAL‐
245 IZED option. Because of the security implications, that option
246 is normally enabled only on embedded devices (i.e., devices
247 where one has complete control of the contents of user memory).
248
249 Of the above flags, only MAP_FIXED is specified in POSIX.1-2001 and
250 POSIX.1-2008. However, most systems also support MAP_ANONYMOUS (or its
251 synonym MAP_ANON).
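The MAP_LOCKED entry above recommends mmap() followed by mlock(2) when major faults are not acceptable after initialization. A minimal sketch of that combination follows; the function name map_and_lock() is hypothetical, and the call can fail in environments with a low RLIMIT_MEMLOCK limit.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <stddef.h>

/* Create an anonymous mapping and lock it with mlock(2), so that a
   failure to make the pages resident is reported to the caller
   instead of being silently ignored. Returns NULL on failure. */
static void *
map_and_lock(size_t length)
{
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    if (mlock(p, length) == -1) {   /* For example, ENOMEM or EPERM */
        munmap(p, length);
        return NULL;
    }
    return p;
}
```

The lock is dropped automatically when the region is removed with munmap().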
252
253 Memory mapped by mmap() is preserved across fork(2), with the same
254 attributes.
255
256 A file is mapped in multiples of the page size. For a file that is not
257 a multiple of the page size, the remaining memory is zeroed when
258 mapped, and writes to that region are not written out to the file. The
259 effect of changing the size of the underlying file of a mapping on the
260 pages that correspond to added or removed regions of the file is
261 unspecified.
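The zero filling of the partial page at the end of a file can be checked with a short file. The following sketch assumes that /tmp exists and is writable; the function name and the temporary file name template are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a temporary file holding the 5 bytes "hello", map its first
   page, and report whether the rest of the page reads as zero.
   Returns 1 if so, 0 if not, and -1 on failure. */
static int
tail_of_last_page_is_zero(void)
{
    char path[] = "/tmp/mmap_tail_XXXXXX";  /* Hypothetical template */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);               /* File goes away once it is unused */

    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }

    size_t page = (size_t) sysconf(_SC_PAGE_SIZE);
    char *p = mmap(NULL, page, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* Closing fd does not unmap the region */
    if (p == MAP_FAILED)
        return -1;

    int zero = 1;
    for (size_t i = 5; i < page; i++)
        if (p[i] != 0)
            zero = 0;

    munmap(p, page);
    return zero;
}
```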
262
263 munmap()
264 The munmap() system call deletes the mappings for the specified address
265 range, and causes further references to addresses within the range to
266 generate invalid memory references. The region is also automatically
267 unmapped when the process is terminated. On the other hand, closing
268 the file descriptor does not unmap the region.
269
270 The address addr must be a multiple of the page size (but length need
271 not be). All pages containing a part of the indicated range are
272 unmapped, and subsequent references to these pages will generate
273 SIGSEGV. It is not an error if the indicated range does not contain
274 any mapped pages.
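Because munmap() works on page-aligned ranges rather than on whole mappings, part of a mapping can be removed while the rest stays usable. The following single-threaded sketch illustrates this; the function name unmap_middle_page() is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Map three pages, unmap the middle one, and check that the outer
   pages are still usable. Returns 0 on success, -1 on failure. */
static int
unmap_middle_page(void)
{
    size_t page = (size_t) sysconf(_SC_PAGE_SIZE);
    char *p = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* addr must be page aligned; p + page is, because p is */
    if (munmap(p + page, page) == -1) {
        munmap(p, 3 * page);
        return -1;
    }

    p[0] = 'a';                 /* First page is still mapped */
    p[2 * page] = 'c';          /* Third page is still mapped */
    int ok = (p[0] == 'a' && p[2 * page] == 'c');

    /* A range that includes an unmapped hole is not an error */
    if (munmap(p, 3 * page) == -1)
        return -1;
    return ok ? 0 : -1;
}
```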
275
RETURN VALUE
277 On success, mmap() returns a pointer to the mapped area. On error, the
278 value MAP_FAILED (that is, (void *) -1) is returned, and errno is set
279 to indicate the cause of the error.
280
281 On success, munmap() returns 0. On failure, it returns -1, and errno
282 is set to indicate the cause of the error (probably to EINVAL).
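A common mistake is to compare the result of mmap() against NULL instead of MAP_FAILED. The following sketch wraps the check; the function name map_anonymous() is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <errno.h>
#include <stddef.h>

/* Create a private anonymous mapping. Converts the MAP_FAILED result
   of mmap() into NULL; errno is left as set by mmap(). */
static void *
map_anonymous(size_t length)
{
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* Compare against MAP_FAILED, not NULL: the failure value
       is (void *) -1 */
    return (p == MAP_FAILED) ? NULL : p;
}
```

On Linux 2.6.12 and later, a call such as map_anonymous(0) is expected to return NULL with errno set to EINVAL, because a length of 0 is rejected.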
283
ERRORS
285 EACCES A file descriptor refers to a non-regular file. Or a file map‐
286 ping was requested, but fd is not open for reading. Or
287 MAP_SHARED was requested and PROT_WRITE is set, but fd is not
288 open in read/write (O_RDWR) mode. Or PROT_WRITE is set, but the
289 file is append-only.
290
291 EAGAIN The file has been locked, or too much memory has been locked
292 (see setrlimit(2)).
293
294 EBADF fd is not a valid file descriptor (and MAP_ANONYMOUS was not
295 set).
296
EEXIST MAP_FIXED_NOREPLACE was specified in flags, and the range
covered by addr and length clashes with an existing mapping.
299
EINVAL addr, length, or offset is invalid (e.g., too large, or not
aligned on a page boundary).
302
303 EINVAL (since Linux 2.6.12) length was 0.
304
EINVAL flags contained neither MAP_PRIVATE nor MAP_SHARED, or contained
both of these values.
307
308 ENFILE The system-wide limit on the total number of open files has been
309 reached.
310
311 ENODEV The underlying filesystem of the specified file does not support
312 memory mapping.
313
314 ENOMEM No memory is available.
315
316 ENOMEM The process's maximum number of mappings would have been
317 exceeded. This error can also occur for munmap(), when unmap‐
318 ping a region in the middle of an existing mapping, since this
319 results in two smaller mappings on either side of the region
320 being unmapped.
321
322 ENOMEM (since Linux 4.7) The process's RLIMIT_DATA limit, described in
323 getrlimit(2), would have been exceeded.
324
325 EOVERFLOW
On 32-bit architectures with the large file extension (i.e.,
using 64-bit off_t): the number of pages used for length plus
the number of pages used for offset would overflow unsigned
long (32 bits).
330
331 EPERM The prot argument asks for PROT_EXEC but the mapped area belongs
332 to a file on a filesystem that was mounted no-exec.
333
334 EPERM The operation was prevented by a file seal; see fcntl(2).
335
336 ETXTBSY
337 MAP_DENYWRITE was set but the object specified by fd is open for
338 writing.
339
340 Use of a mapped region can result in these signals:
341
342 SIGSEGV
343 Attempted write into a region mapped as read-only.
344
345 SIGBUS Attempted access to a portion of the buffer that does not corre‐
346 spond to the file (for example, beyond the end of the file,
347 including the case where another process has truncated the
348 file).
349
ATTRIBUTES
351 For an explanation of the terms used in this section, see
352 attributes(7).
353
354 ┌───────────────────┬───────────────┬─────────┐
355 │Interface │ Attribute │ Value │
356 ├───────────────────┼───────────────┼─────────┤
357 │mmap(), munmap() │ Thread safety │ MT-Safe │
358 └───────────────────┴───────────────┴─────────┘
CONFORMING TO
360 POSIX.1-2001, POSIX.1-2008, SVr4, 4.4BSD.
361
AVAILABILITY
363 On POSIX systems on which mmap(), msync(2), and munmap() are available,
364 _POSIX_MAPPED_FILES is defined in <unistd.h> to a value greater than 0.
365 (See also sysconf(3).)
366
NOTES
368 On some hardware architectures (e.g., i386), PROT_WRITE implies
369 PROT_READ. It is architecture dependent whether PROT_READ implies
370 PROT_EXEC or not. Portable programs should always set PROT_EXEC if
371 they intend to execute code in the new mapping.
372
373 The portable way to create a mapping is to specify addr as 0 (NULL),
374 and omit MAP_FIXED from flags. In this case, the system chooses the
375 address for the mapping; the address is chosen so as not to conflict
376 with any existing mapping, and will not be 0. If the MAP_FIXED flag is
377 specified, and addr is 0 (NULL), then the mapped address will be 0
378 (NULL).
379
Certain flag constants are defined only if suitable feature test
macros are defined (possibly by default): _DEFAULT_SOURCE with glibc
2.19 or later; or _BSD_SOURCE or _SVID_SOURCE in glibc 2.19 and
earlier. (Employing _GNU_SOURCE also suffices, and requiring that macro
384 specifically would have been more logical, since these flags are all
385 Linux-specific.) The relevant flags are: MAP_32BIT, MAP_ANONYMOUS (and
386 the synonym MAP_ANON), MAP_DENYWRITE, MAP_EXECUTABLE, MAP_FILE,
387 MAP_GROWSDOWN, MAP_HUGETLB, MAP_LOCKED, MAP_NONBLOCK, MAP_NORESERVE,
388 MAP_POPULATE, and MAP_STACK.
389
390 An application can determine which pages of a mapping are currently
391 resident in the buffer/page cache using mincore(2).
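As an illustration of mincore(2), the following sketch counts the resident pages in a page-aligned range. The function name resident_pages() is hypothetical; the result is only a snapshot, since residency can change at any time.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>

/* Count how many pages of the range that starts at the page-aligned
   address addr and spans length bytes are resident in memory.
   Returns -1 on failure. */
static long
resident_pages(void *addr, size_t length)
{
    size_t page = (size_t) sysconf(_SC_PAGE_SIZE);
    size_t npages = (length + page - 1) / page;
    unsigned char *vec = malloc(npages);    /* One byte per page */
    if (vec == NULL)
        return -1;

    long count = -1;
    if (mincore(addr, length, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;    /* Low bit set: page is resident */
    }
    free(vec);
    return count;
}
```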
392
393 Using MAP_FIXED safely
394 The only safe use for MAP_FIXED is where the address range specified by
395 addr and length was previously reserved using another mapping; other‐
396 wise, the use of MAP_FIXED is hazardous because it forcibly removes
397 preexisting mappings, making it easy for a multithreaded process to
398 corrupt its own address space.
399
For example, suppose that thread A looks through /proc/<pid>/maps
in order to locate an unused address range that it can map using
402 MAP_FIXED, while thread B simultaneously acquires part or all of that
403 same address range. When thread A subsequently employs
404 mmap(MAP_FIXED), it will effectively clobber the mapping that thread B
405 created. In this scenario, thread B need not create a mapping
406 directly; simply making a library call that, internally, uses dlopen(3)
407 to load some other shared library, will suffice. The dlopen(3) call
408 will map the library into the process's address space. Furthermore,
409 almost any library call may be implemented in a way that adds memory
410 mappings to the address space, either with this technique, or by simply
411 allocating memory. Examples include brk(2), malloc(3), pthread_cre‐
412 ate(3), and the PAM libraries ⟨http://www.linux-pam.org⟩.
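The safe pattern mentioned at the start of this subsection, in which MAP_FIXED is applied only inside a range that the caller has already reserved, can be sketched as follows. The function names reserve_range() and commit_range() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <stddef.h>

/* Reserve address space without making it accessible. The kernel
   chooses the address, so no existing mapping is disturbed. */
static void *
reserve_range(size_t length)
{
    void *p = mmap(NULL, length, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Replace part of a range obtained from reserve_range() with usable
   memory. MAP_FIXED is safe here because the only mapping that it
   can discard is the caller's own reservation. addr must be page
   aligned. Returns 0 on success, -1 on failure. */
static int
commit_range(void *addr, size_t length)
{
    void *p = mmap(addr, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return (p == MAP_FAILED) ? -1 : 0;
}
```

Other threads cannot be handed addresses inside the reservation by the kernel, so the later MAP_FIXED call does not race with them.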
413
414 Since Linux 4.17, a multithreaded program can use the MAP_FIXED_NORE‐
415 PLACE flag to avoid the hazard described above when attempting to cre‐
416 ate a mapping at a fixed address that has not been reserved by a preex‐
417 isting mapping.
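A backward-compatible use of MAP_FIXED_NOREPLACE checks the returned address as well as the error. The following is a sketch under stated assumptions: the function name map_exactly_at() is hypothetical, and the fallback value 0x100000 for the flag is the one used by the generic Linux headers (some architectures may use a different value), supplied only for C libraries whose headers predate the flag.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stddef.h>

#ifndef MAP_FIXED_NOREPLACE
/* Assumption: generic Linux value; older headers lack the flag */
#define MAP_FIXED_NOREPLACE 0x100000
#endif

/* Try to create an anonymous mapping at exactly addr without
   replacing anything that is already mapped there. Returns addr on
   success, or NULL if the range is occupied or the call fails. */
static void *
map_exactly_at(void *addr, size_t length)
{
    void *p = mmap(addr, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                   -1, 0);
    if (p == MAP_FAILED)
        return NULL;            /* EEXIST on Linux 4.17 and later */

    if (p != addr) {
        /* Older kernel: flag not recognized, addr treated as a hint */
        munmap(p, length);
        return NULL;
    }
    return p;
}
```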
418
Timestamp changes for file-backed mappings
420 For file-backed mappings, the st_atime field for the mapped file may be
421 updated at any time between the mmap() and the corresponding unmapping;
422 the first reference to a mapped page will update the field if it has
423 not been already.
424
The st_ctime and st_mtime fields for a file mapped with PROT_WRITE and
426 MAP_SHARED will be updated after a write to the mapped region, and
427 before a subsequent msync(2) with the MS_SYNC or MS_ASYNC flag, if one
428 occurs.
429
430 Huge page (Huge TLB) mappings
431 For mappings that employ huge pages, the requirements for the arguments
432 of mmap() and munmap() differ somewhat from the requirements for map‐
433 pings that use the native system page size.
434
435 For mmap(), offset must be a multiple of the underlying huge page size.
436 The system automatically aligns length to be a multiple of the underly‐
437 ing huge page size.
438
439 For munmap(), addr and length must both be a multiple of the underlying
440 huge page size.
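The arithmetic involved in huge page mappings can be sketched with two small helpers: one encodes a huge page size into the flags bit field described under MAP_HUGE_2MB and MAP_HUGE_1GB, and the other rounds a length up for munmap(). Both function names are hypothetical, and the fallback value 26 for MAP_HUGE_SHIFT is an assumption taken from the Linux UAPI headers, used only if the C library does not define the constant.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stddef.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26       /* Assumption: Linux UAPI value */
#endif

/* Encode a huge page size, given as its base-2 logarithm, into the
   six-bit field at MAP_HUGE_SHIFT. The result is meant to be ORed
   with MAP_HUGETLB in the flags argument of mmap(). */
static int
huge_page_size_flag(unsigned int log2_size)
{
    return (int) ((log2_size & 0x3f) << MAP_HUGE_SHIFT);
}

/* Round length up to a multiple of huge_page_size, which must be a
   power of two (for example, 2 MB), as munmap() requires. */
static size_t
round_up_to_huge_page(size_t length, size_t huge_page_size)
{
    return (length + huge_page_size - 1) & ~(huge_page_size - 1);
}
```

For example, huge_page_size_flag(21) yields the same value as the MAP_HUGE_2MB definition shown earlier, and a 3 MB length rounds up to 4 MB when the huge page size is 2 MB.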
441
442 C library/kernel differences
443 This page describes the interface provided by the glibc mmap() wrapper
444 function. Originally, this function invoked a system call of the same
445 name. Since kernel 2.4, that system call has been superseded by
446 mmap2(2), and nowadays the glibc mmap() wrapper function invokes
447 mmap2(2) with a suitably adjusted value for offset.
448
BUGS
450 On Linux, there are no guarantees like those suggested above under
451 MAP_NORESERVE. By default, any process can be killed at any moment
452 when the system runs out of memory.
453
454 In kernels before 2.6.7, the MAP_POPULATE flag has effect only if prot
455 is specified as PROT_NONE.
456
457 SUSv3 specifies that mmap() should fail if length is 0. However, in
458 kernels before 2.6.12, mmap() succeeded in this case: no mapping was
459 created and the call returned addr. Since kernel 2.6.12, mmap() fails
460 with the error EINVAL for this case.
461
POSIX specifies that the system shall always zero fill any partial page
at the end of the object and that the system will never write any
modification of the object beyond its end. On Linux, when you write
data to such a partial page after the end of the object, the data stays
in the page cache even after the file is closed and unmapped; even
though the data is never written to the file itself, subsequent
mappings may see the modified content. In some cases, this could be
fixed by calling msync(2) before the unmap takes place; however, this
doesn't work on tmpfs(5) (for example, when using the POSIX shared
memory interface documented in shm_overview(7)).
472
EXAMPLE
474 The following program prints part of the file specified in its first
475 command-line argument to standard output. The range of bytes to be
476 printed is specified via offset and length values in the second and
477 third command-line arguments. The program creates a memory mapping of
478 the required pages of the file and then uses write(2) to output the
479 desired bytes.
480
481 Program source
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

int
main(int argc, char *argv[])
{
    char *addr;
    int fd;
    struct stat sb;
    off_t offset, pa_offset;
    size_t length;
    ssize_t s;

    if (argc < 3 || argc > 4) {
        fprintf(stderr, "%s file offset [length]\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        handle_error("open");

    if (fstat(fd, &sb) == -1)           /* To obtain file size */
        handle_error("fstat");

    offset = atoi(argv[2]);
    pa_offset = offset & ~(sysconf(_SC_PAGE_SIZE) - 1);
        /* offset for mmap() must be page aligned */

    if (offset >= sb.st_size) {
        fprintf(stderr, "offset is past end of file\n");
        exit(EXIT_FAILURE);
    }

    if (argc == 4) {
        length = atoi(argv[3]);
        if (offset + length > sb.st_size)
            length = sb.st_size - offset;
                /* Can't display bytes past end of file */

    } else {    /* No length arg ==> display to end of file */
        length = sb.st_size - offset;
    }

    addr = mmap(NULL, length + offset - pa_offset, PROT_READ,
                MAP_PRIVATE, fd, pa_offset);
    if (addr == MAP_FAILED)
        handle_error("mmap");

    s = write(STDOUT_FILENO, addr + offset - pa_offset, length);
    if (s == -1)
        handle_error("write");

    if ((size_t) s != length) {         /* Fewer bytes than requested */
        fprintf(stderr, "partial write\n");
        exit(EXIT_FAILURE);
    }

    munmap(addr, length + offset - pa_offset);
    close(fd);

    exit(EXIT_SUCCESS);
}
552
SEE ALSO
554 ftruncate(2), getpagesize(2), memfd_create(2), mincore(2), mlock(2),
555 mmap2(2), mprotect(2), mremap(2), msync(2), remap_file_pages(2), setr‐
556 limit(2), shmat(2), userfaultfd(2), shm_open(3), shm_overview(7)
557
558 The descriptions of the following files in proc(5): /proc/[pid]/maps,
559 /proc/[pid]/map_files, and /proc/[pid]/smaps.
560
561 B.O. Gallmeister, POSIX.4, O'Reilly, pp. 128–129 and 389–391.
562
COLOPHON
564 This page is part of release 4.16 of the Linux man-pages project. A
565 description of the project, information about reporting bugs, and the
566 latest version of this page, can be found at
567 https://www.kernel.org/doc/man-pages/.

Linux                          2018-04-30                       MMAP(2)