OPEN(2) Linux Programmer's Manual OPEN(2)

NAME
open, openat, creat - open and possibly create a file

SYNOPSIS
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

int creat(const char *pathname, mode_t mode);

int openat(int dirfd, const char *pathname, int flags);
int openat(int dirfd, const char *pathname, int flags, mode_t mode);

Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

openat():
    Since glibc 2.10:
        _POSIX_C_SOURCE >= 200809L
    Before glibc 2.10:
        _ATFILE_SOURCE

DESCRIPTION
The open() system call opens the file specified by pathname. If the
specified file does not exist, it may optionally (if O_CREAT is
specified in flags) be created by open().
33
34 The return value of open() is a file descriptor, a small, nonnegative
35 integer that is used in subsequent system calls (read(2), write(2),
36 lseek(2), fcntl(2), etc.) to refer to the open file. The file descrip‐
37 tor returned by a successful call will be the lowest-numbered file
38 descriptor not currently open for the process.
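
The lowest-numbered rule can be observed with a minimal sketch such as the following. The function name is hypothetical, and the result is only deterministic in a single-threaded process, since no other thread may open or close descriptors in between.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a descriptor number freed by close(2) is handed out
   again by the next open(), 0 if not, -1 on error. */
int reopen_reuses_lowest_fd(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first == -1)
        return -1;
    close(first);                 /* 'first' is the lowest free number again */

    int second = open("/dev/null", O_RDONLY);
    if (second == -1)
        return -1;
    close(second);
    return second == first;
}
```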
39
40 By default, the new file descriptor is set to remain open across an
41 execve(2) (i.e., the FD_CLOEXEC file descriptor flag described in
42 fcntl(2) is initially disabled); the O_CLOEXEC flag, described below,
43 can be used to change this default. The file offset is set to the
44 beginning of the file (see lseek(2)).
45
46 A call to open() creates a new open file description, an entry in the
47 system-wide table of open files. The open file description records the
48 file offset and the file status flags (see below). A file descriptor
49 is a reference to an open file description; this reference is unaf‐
50 fected if pathname is subsequently removed or modified to refer to a
51 different file. For further details on open file descriptions, see
52 NOTES.
53
54 The argument flags must include one of the following access modes:
55 O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-
56 only, write-only, or read/write, respectively.
57
58 In addition, zero or more file creation flags and file status flags can
59 be bitwise-or'd in flags. The file creation flags are O_CLOEXEC,
60 O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and
61 O_TRUNC. The file status flags are all of the remaining flags listed
62 below. The distinction between these two groups of flags is that the
63 file creation flags affect the semantics of the open operation itself,
64 while the file status flags affect the semantics of subsequent I/O
65 operations. The file status flags can be retrieved and (in some cases)
66 modified; see fcntl(2) for details.
67
68 The full list of file creation flags and file status flags is as fol‐
69 lows:
70
71 O_APPEND
72 The file is opened in append mode. Before each write(2), the
73 file offset is positioned at the end of the file, as if with
74 lseek(2). The modification of the file offset and the write
75 operation are performed as a single atomic step.
76
77 O_APPEND may lead to corrupted files on NFS filesystems if more
78 than one process appends data to a file at once. This is
79 because NFS does not support appending to a file, so the client
80 kernel has to simulate it, which can't be done without a race
81 condition.
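
As an illustration of append mode on a local filesystem, the sketch below rewinds an O_APPEND descriptor and then writes; the data still lands at the end of the file. The function name and the temporary file name are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Writes "abc", then writes "de" through an O_APPEND descriptor whose
   offset was first rewound to 0.  Returns the final file size, or -1. */
long append_ignores_offset(void)
{
    char path[] = "/tmp/open_append_XXXXXX";   /* hypothetical name */
    long size = -1;
    int afd = -1;
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    if (write(fd, "abc", 3) != 3)
        goto out;
    afd = open(path, O_WRONLY | O_APPEND);
    if (afd == -1)
        goto out;
    if (lseek(afd, 0, SEEK_SET) == -1)         /* rewind; write still appends */
        goto out;
    if (write(afd, "de", 2) != 2)
        goto out;
    size = (long) lseek(afd, 0, SEEK_END);     /* 5 if "de" was appended */
out:
    if (afd != -1)
        close(afd);
    close(fd);
    unlink(path);
    return size;
}
```

Had the second write honored the rewound offset, the file would have stayed three bytes long.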
82
83 O_ASYNC
84 Enable signal-driven I/O: generate a signal (SIGIO by default,
85 but this can be changed via fcntl(2)) when input or output
86 becomes possible on this file descriptor. This feature is
87 available only for terminals, pseudoterminals, sockets, and
88 (since Linux 2.6) pipes and FIFOs. See fcntl(2) for further
89 details. See also BUGS, below.
90
91 O_CLOEXEC (since Linux 2.6.23)
92 Enable the close-on-exec flag for the new file descriptor.
93 Specifying this flag permits a program to avoid additional
94 fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.
95
96 Note that the use of this flag is essential in some multi‐
97 threaded programs, because using a separate fcntl(2) F_SETFD
98 operation to set the FD_CLOEXEC flag does not suffice to avoid
99 race conditions where one thread opens a file descriptor and
100 attempts to set its close-on-exec flag using fcntl(2) at the
101 same time as another thread does a fork(2) plus execve(2).
102 Depending on the order of execution, the race may lead to the
103 file descriptor returned by open() being unintentionally leaked
104 to the program executed by the child process created by fork(2).
105 (This kind of race is in principle possible for any system call
106 that creates a file descriptor whose close-on-exec flag should
107 be set, and various other Linux system calls provide an equiva‐
108 lent of the O_CLOEXEC flag to deal with this problem.)
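
A minimal sketch of the difference, using a hypothetical helper that reports whether FD_CLOEXEC is set immediately after open() returns:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if FD_CLOEXEC is set on a descriptor opened with the given
   extra flags, 0 if it is clear, -1 on error. */
int cloexec_is_set(int extra_flags)
{
    int fd = open("/dev/null", O_RDONLY | extra_flags);
    if (fd == -1)
        return -1;

    int fdflags = fcntl(fd, F_GETFD);   /* file descriptor flags, not status flags */
    close(fd);
    if (fdflags == -1)
        return -1;
    return (fdflags & FD_CLOEXEC) != 0;
}
```

With O_CLOEXEC the flag is already set when the descriptor becomes visible to other threads, so there is no window in which a concurrent fork(2) plus execve(2) could inherit it.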
109
110 O_CREAT
111 If pathname does not exist, create it as a regular file.
112
113 The owner (user ID) of the new file is set to the effective user
114 ID of the process.
115
116 The group ownership (group ID) of the new file is set either to
117 the effective group ID of the process (System V semantics) or to
118 the group ID of the parent directory (BSD semantics). On Linux,
119 the behavior depends on whether the set-group-ID mode bit is set
120 on the parent directory: if that bit is set, then BSD semantics
apply; otherwise, System V semantics apply. For some filesystems,
the behavior also depends on the bsdgroups and sysvgroups mount
options described in mount(8).
124
The mode argument specifies the file mode bits to be applied when a
new file is created. This argument must be supplied when
127 O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT
128 nor O_TMPFILE is specified, then mode is ignored. The effective
129 mode is modified by the process's umask in the usual way: in the
130 absence of a default ACL, the mode of the created file is
131 (mode & ~umask). Note that this mode applies only to future
132 accesses of the newly created file; the open() call that creates
133 a read-only file may well return a read/write file descriptor.
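
The (mode & ~umask) rule can be checked with a sketch along these lines. The helper name and file names are hypothetical, and the sketch assumes the temporary directory carries no default ACL.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a file with the requested mode under the given umask and
   returns the permission bits actually stored, or -1 on error. */
int created_mode(mode_t mode, mode_t mask)
{
    char dir[] = "/tmp/open_creat_XXXXXX";     /* hypothetical name */
    char path[64];
    struct stat st;
    int result = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/file", dir);

    mode_t old = umask(mask);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    umask(old);                                /* restore the caller's umask */

    if (fd != -1) {
        if (fstat(fd, &st) == 0)
            result = (int) (st.st_mode & 07777);
        close(fd);
        unlink(path);
    }
    rmdir(dir);
    return result;
}
```

For example, created_mode(0666, 022) yields 0644. Calling created_mode(0444, 0) also succeeds even though the descriptor is opened write-only, which illustrates that mode governs only later accesses to the file.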
134
135 The following symbolic constants are provided for mode:
136
137 S_IRWXU 00700 user (file owner) has read, write, and execute
138 permission
139
140 S_IRUSR 00400 user has read permission
141
142 S_IWUSR 00200 user has write permission
143
144 S_IXUSR 00100 user has execute permission
145
146 S_IRWXG 00070 group has read, write, and execute permission
147
148 S_IRGRP 00040 group has read permission
149
150 S_IWGRP 00020 group has write permission
151
152 S_IXGRP 00010 group has execute permission
153
154 S_IRWXO 00007 others have read, write, and execute permission
155
156 S_IROTH 00004 others have read permission
157
158 S_IWOTH 00002 others have write permission
159
160 S_IXOTH 00001 others have execute permission
161
162 According to POSIX, the effect when other bits are set in mode
163 is unspecified. On Linux, the following bits are also honored
164 in mode:
165
166 S_ISUID 0004000 set-user-ID bit
167
168 S_ISGID 0002000 set-group-ID bit (see inode(7)).
169
170 S_ISVTX 0001000 sticky bit (see inode(7)).
171
172 O_DIRECT (since Linux 2.4.10)
173 Try to minimize cache effects of the I/O to and from this file.
174 In general this will degrade performance, but it is useful in
175 special situations, such as when applications do their own
176 caching. File I/O is done directly to/from user-space buffers.
177 The O_DIRECT flag on its own makes an effort to transfer data
178 synchronously, but does not give the guarantees of the O_SYNC
179 flag that data and necessary metadata are transferred. To guar‐
180 antee synchronous I/O, O_SYNC must be used in addition to
181 O_DIRECT. See NOTES below for further discussion.
182
183 A semantically similar (but deprecated) interface for block
184 devices is described in raw(8).
185
186 O_DIRECTORY
187 If pathname is not a directory, cause the open to fail. This
188 flag was added in kernel version 2.1.126, to avoid denial-of-
189 service problems if opendir(3) is called on a FIFO or tape
190 device.
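
A short sketch of the failure mode, using a hypothetical helper that returns the errno value from the attempt:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if path could be opened as a directory, otherwise the
   errno value reported by open(). */
int open_as_directory_errno(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}
```

Called on "/" this returns 0; called on a non-directory such as /dev/null it returns ENOTDIR without the device ever being opened.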
191
192 O_DSYNC
193 Write operations on the file will complete according to the
194 requirements of synchronized I/O data integrity completion.
195
196 By the time write(2) (and similar) return, the output data has
197 been transferred to the underlying hardware, along with any file
198 metadata that would be required to retrieve that data (i.e., as
199 though each write(2) was followed by a call to fdatasync(2)).
200 See NOTES below.
201
202 O_EXCL Ensure that this call creates the file: if this flag is speci‐
203 fied in conjunction with O_CREAT, and pathname already exists,
204 then open() fails with the error EEXIST.
205
206 When these two flags are specified, symbolic links are not fol‐
207 lowed: if pathname is a symbolic link, then open() fails regard‐
208 less of where the symbolic link points.
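
The exclusive-creation behavior can be sketched as follows; the function and file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates a file with O_CREAT | O_EXCL, then repeats the call on the
   same pathname.  Returns the errno of the second attempt (0 if it
   unexpectedly succeeded), or -1 if the setup failed. */
int second_exclusive_create_errno(void)
{
    char dir[] = "/tmp/open_excl_XXXXXX";      /* hypothetical name */
    char path[64];
    int err = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/lock", dir);

    int first = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (first != -1) {
        int second = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (second == -1) {
            err = errno;                       /* save before further calls */
        } else {
            close(second);
            err = 0;
        }
        close(first);
        unlink(path);
    }
    rmdir(dir);
    return err;
}
```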
209
210 In general, the behavior of O_EXCL is undefined if it is used
211 without O_CREAT. There is one exception: on Linux 2.6 and
212 later, O_EXCL can be used without O_CREAT if pathname refers to
213 a block device. If the block device is in use by the system
214 (e.g., mounted), open() fails with the error EBUSY.
215
216 On NFS, O_EXCL is supported only when using NFSv3 or later on
217 kernel 2.6 or later. In NFS environments where O_EXCL support
218 is not provided, programs that rely on it for performing locking
219 tasks will contain a race condition. Portable programs that
220 want to perform atomic file locking using a lockfile, and need
221 to avoid reliance on NFS support for O_EXCL, can create a unique
222 file on the same filesystem (e.g., incorporating hostname and
223 PID), and use link(2) to make a link to the lockfile. If
224 link(2) returns 0, the lock is successful. Otherwise, use
225 stat(2) on the unique file to check if its link count has
226 increased to 2, in which case the lock is also successful.
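
The link(2)-based lockfile procedure described above might be sketched as follows. All names are hypothetical; in real use the unique name would incorporate the hostname and PID, and the demo function exists only to exercise the helper on a local filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if the lock was taken, 0 if it is held elsewhere, -1 on
   error.  unique_path must be a name no other contender uses. */
int acquire_lock(const char *unique_path, const char *lock_path)
{
    struct stat st;
    int locked = 0;

    int fd = open(unique_path, O_WRONLY | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    close(fd);

    if (link(unique_path, lock_path) == 0)
        locked = 1;                    /* link(2) returned 0 */
    else if (stat(unique_path, &st) == 0 && st.st_nlink == 2)
        locked = 1;                    /* link count rose despite the error */

    unlink(unique_path);               /* lock_path keeps the inode alive */
    return locked;
}

/* Hypothetical self-check: the first contender wins, the second loses.
   Returns 1 if both outcomes are as expected, -1 on setup failure. */
int lockfile_demo(void)
{
    char dir[] = "/tmp/open_lock_XXXXXX";
    char a[64], b[64], lock[64];

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(a, sizeof a, "%s/host.1001", dir);
    snprintf(b, sizeof b, "%s/host.1002", dir);
    snprintf(lock, sizeof lock, "%s/lockfile", dir);

    int first = acquire_lock(a, lock);
    int second = acquire_lock(b, lock);

    unlink(lock);                      /* release the lock */
    rmdir(dir);
    return first == 1 && second == 0;
}
```

The lock is released by unlinking the lockfile. The link-count check covers the case where link(2) reports failure although the link was in fact created.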
227
228 O_LARGEFILE
229 (LFS) Allow files whose sizes cannot be represented in an off_t
230 (but can be represented in an off64_t) to be opened. The
231 _LARGEFILE64_SOURCE macro must be defined (before including any
232 header files) in order to obtain this definition. Setting the
233 _FILE_OFFSET_BITS feature test macro to 64 (rather than using
234 O_LARGEFILE) is the preferred method of accessing large files on
235 32-bit systems (see feature_test_macros(7)).
236
237 O_NOATIME (since Linux 2.6.8)
238 Do not update the file last access time (st_atime in the inode)
239 when the file is read(2).
240
241 This flag can be employed only if one of the following condi‐
242 tions is true:
243
244 * The effective UID of the process matches the owner UID of the
245 file.
246
247 * The calling process has the CAP_FOWNER capability in its user
248 namespace and the owner UID of the file has a mapping in the
249 namespace.
250
251 This flag is intended for use by indexing or backup programs,
252 where its use can significantly reduce the amount of disk activ‐
253 ity. This flag may not be effective on all filesystems. One
254 example is NFS, where the server maintains the access time.
255
256 O_NOCTTY
257 If pathname refers to a terminal device—see tty(4)—it will not
258 become the process's controlling terminal even if the process
259 does not have one.
260
261 O_NOFOLLOW
262 If pathname is a symbolic link, then the open fails, with the
263 error ELOOP. Symbolic links in earlier components of the path‐
264 name will still be followed. (Note that the ELOOP error that
265 can occur in this case is indistinguishable from the case where
266 an open fails because there are too many symbolic links found
267 while resolving components in the prefix part of the pathname.)
268
269 This flag is a FreeBSD extension, which was added to Linux in
270 version 2.1.126, and has subsequently been standardized in
271 POSIX.1-2008.
272
273 See also O_PATH below.
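
A sketch of the ELOOP failure when the final component is a symbolic link; the helper and the file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates a regular file and a symbolic link to it, then opens the
   link with O_NOFOLLOW.  Returns the errno of that open (0 if it
   unexpectedly succeeded), or -1 if the setup failed. */
int nofollow_errno(void)
{
    char dir[] = "/tmp/open_nofollow_XXXXXX";  /* hypothetical name */
    char target[64], link_path[64];
    int err = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(target, sizeof target, "%s/target", dir);
    snprintf(link_path, sizeof link_path, "%s/link", dir);

    int fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd != -1) {
        close(fd);
        if (symlink(target, link_path) == 0) {
            int lfd = open(link_path, O_RDONLY | O_NOFOLLOW);
            if (lfd == -1) {
                err = errno;
            } else {
                close(lfd);
                err = 0;
            }
            unlink(link_path);
        }
        unlink(target);
    }
    rmdir(dir);
    return err;
}
```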
274
275 O_NONBLOCK or O_NDELAY
276 When possible, the file is opened in nonblocking mode. Neither
277 the open() nor any subsequent operations on the file descriptor
278 which is returned will cause the calling process to wait.
279
280 Note that this flag has no effect for regular files and block
281 devices; that is, I/O operations will (briefly) block when
282 device activity is required, regardless of whether O_NONBLOCK is
283 set. Since O_NONBLOCK semantics might eventually be imple‐
284 mented, applications should not depend upon blocking behavior
285 when specifying this flag for regular files and block devices.
286
287 For the handling of FIFOs (named pipes), see also fifo(7). For
288 a discussion of the effect of O_NONBLOCK in conjunction with
289 mandatory file locks and with file leases, see fcntl(2).
290
291 O_PATH (since Linux 2.6.39)
292 Obtain a file descriptor that can be used for two purposes: to
293 indicate a location in the filesystem tree and to perform opera‐
294 tions that act purely at the file descriptor level. The file
295 itself is not opened, and other file operations (e.g., read(2),
296 write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
297 fail with the error EBADF.
298
299 The following operations can be performed on the resulting file
300 descriptor:
301
302 * close(2).
303
304 * fchdir(2), if the file descriptor refers to a directory
305 (since Linux 3.5).
306
307 * fstat(2) (since Linux 3.6).
308
309 * fstatfs(2) (since Linux 3.12).
310
311 * Duplicating the file descriptor (dup(2), fcntl(2) F_DUPFD,
312 etc.).
313
314 * Getting and setting file descriptor flags (fcntl(2) F_GETFD
315 and F_SETFD).
316
317 * Retrieving open file status flags using the fcntl(2) F_GETFL
318 operation: the returned flags will include the bit O_PATH.
319
320 * Passing the file descriptor as the dirfd argument of openat()
321 and the other "*at()" system calls. This includes linkat(2)
322 with AT_EMPTY_PATH (or via procfs using AT_SYMLINK_FOLLOW)
323 even if the file is not a directory.
324
325 * Passing the file descriptor to another process via a UNIX
326 domain socket (see SCM_RIGHTS in unix(7)).
327
328 When O_PATH is specified in flags, flag bits other than
329 O_CLOEXEC, O_DIRECTORY, and O_NOFOLLOW are ignored.
330
331 Opening a file or directory with the O_PATH flag requires no
332 permissions on the object itself (but does require execute per‐
333 mission on the directories in the path prefix). Depending on
334 the subsequent operation, a check for suitable file permissions
335 may be performed (e.g., fchdir(2) requires execute permission on
336 the directory referred to by its file descriptor argument). By
337 contrast, obtaining a reference to a filesystem object by open‐
338 ing it with the O_RDONLY flag requires that the caller have read
339 permission on the object, even when the subsequent operation
340 (e.g., fchdir(2), fstat(2)) does not require read permission on
341 the object.
342
343 If pathname is a symbolic link and the O_NOFOLLOW flag is also
344 specified, then the call returns a file descriptor referring to
345 the symbolic link. This file descriptor can be used as the
346 dirfd argument in calls to fchownat(2), fstatat(2), linkat(2),
347 and readlinkat(2) with an empty pathname to have the calls oper‐
348 ate on the symbolic link.
349
350 If pathname refers to an automount point that has not yet been
351 triggered, so no other filesystem is mounted on it, then the
352 call returns a file descriptor referring to the automount direc‐
353 tory without triggering a mount. fstatfs(2) can then be used to
354 determine if it is, in fact, an untriggered automount point
355 (.f_type == AUTOFS_SUPER_MAGIC).
356
357 One use of O_PATH for regular files is to provide the equivalent
358 of POSIX.1's O_EXEC functionality. This permits us to open a
359 file for which we have execute permission but not read permis‐
360 sion, and then execute that file, with steps something like the
361 following:
362
    char buf[PATH_MAX];
    fd = open("some_prog", O_PATH);
    snprintf(buf, PATH_MAX, "/proc/self/fd/%d", fd);
    execl(buf, "some_prog", (char *) NULL);
367
368 An O_PATH file descriptor can also be passed as the argument of
369 fexecve(3).
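
The restrictions on an O_PATH descriptor can be observed with a sketch like the following, which assumes Linux 3.6 or later so that fstat(2) is permitted. The helper name is hypothetical.

```c
#define _GNU_SOURCE                /* for O_PATH */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Opens path with O_PATH, confirms that fstat(2) works, then attempts
   a read(2).  Returns the errno of the read (0 if it unexpectedly
   succeeded), or -1 if open() or fstat() failed. */
int read_on_path_fd_errno(const char *path)
{
    char c;
    struct stat st;
    int err = -1;

    int fd = open(path, O_PATH | O_CLOEXEC);
    if (fd == -1)
        return -1;

    if (fstat(fd, &st) == 0)       /* acts purely at the descriptor level */
        err = (read(fd, &c, 1) == -1) ? errno : 0;

    close(fd);
    return err;
}
```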
370
O_SYNC Write operations on the file will complete according to the
requirements of synchronized I/O file integrity completion (by
contrast with the synchronized I/O data integrity completion
provided by O_DSYNC).
375
376 By the time write(2) (or similar) returns, the output data and
377 associated file metadata have been transferred to the underlying
378 hardware (i.e., as though each write(2) was followed by a call
379 to fsync(2)). See NOTES below.
380
381 O_TMPFILE (since Linux 3.11)
382 Create an unnamed temporary regular file. The pathname argument
383 specifies a directory; an unnamed inode will be created in that
384 directory's filesystem. Anything written to the resulting file
385 will be lost when the last file descriptor is closed, unless the
386 file is given a name.
387
388 O_TMPFILE must be specified with one of O_RDWR or O_WRONLY and,
389 optionally, O_EXCL. If O_EXCL is not specified, then linkat(2)
390 can be used to link the temporary file into the filesystem, mak‐
391 ing it permanent, using code like the following:
392
    char path[PATH_MAX];
    fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
              S_IRUSR | S_IWUSR);

    /* File I/O on 'fd'... */

    snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
    linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
           AT_SYMLINK_FOLLOW);
402
403 In this case, the open() mode argument determines the file per‐
404 mission mode, as with O_CREAT.
405
406 Specifying O_EXCL in conjunction with O_TMPFILE prevents a tem‐
407 porary file from being linked into the filesystem in the above
408 manner. (Note that the meaning of O_EXCL in this case is dif‐
409 ferent from the meaning of O_EXCL otherwise.)
410
411 There are two main use cases for O_TMPFILE:
412
413 * Improved tmpfile(3) functionality: race-free creation of tem‐
414 porary files that (1) are automatically deleted when closed;
415 (2) can never be reached via any pathname; (3) are not sub‐
416 ject to symlink attacks; and (4) do not require the caller to
417 devise unique names.
418
419 * Creating a file that is initially invisible, which is then
420 populated with data and adjusted to have appropriate filesys‐
421 tem attributes (fchown(2), fchmod(2), fsetxattr(2), etc.)
422 before being atomically linked into the filesystem in a fully
423 formed state (using linkat(2) as described above).
424
425 O_TMPFILE requires support by the underlying filesystem; only a
426 subset of Linux filesystems provide that support. In the ini‐
427 tial implementation, support was provided in the ext2, ext3,
428 ext4, UDF, Minix, and shmem filesystems. Support for other
filesystems has subsequently been added as follows: XFS (Linux
3.15); Btrfs (Linux 3.16); F2FS (Linux 3.16); and ubifs (Linux
4.9).
432
433 O_TRUNC
If the file already exists and is a regular file and the access
mode allows writing (i.e., is O_RDWR or O_WRONLY), it will be
436 truncated to length 0. If the file is a FIFO or terminal device
437 file, the O_TRUNC flag is ignored. Otherwise, the effect of
438 O_TRUNC is unspecified.
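
Truncation of an existing regular file can be sketched as follows; the function and file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Writes five bytes to a temporary file, reopens it with O_TRUNC, and
   returns the size seen through the new descriptor, or -1 on error. */
long size_after_trunc(void)
{
    char path[] = "/tmp/open_trunc_XXXXXX";    /* hypothetical name */
    struct stat st;
    long size = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    if (write(fd, "hello", 5) == 5) {
        int tfd = open(path, O_WRONLY | O_TRUNC);
        if (tfd != -1) {
            if (fstat(tfd, &st) == 0)
                size = (long) st.st_size;
            close(tfd);
        }
    }
    close(fd);
    unlink(path);
    return size;
}
```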
439
440 creat()
441 A call to creat() is equivalent to calling open() with flags equal to
442 O_CREAT|O_WRONLY|O_TRUNC.
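
One consequence of this equivalence is that the descriptor returned by creat() is write-only. The sketch below checks this with the fcntl(2) F_GETFL operation and the O_ACCMODE mask; the function and file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates a file with creat() and returns the access mode recorded in
   its file status flags, or -1 on error. */
int creat_access_mode(void)
{
    char dir[] = "/tmp/open_creat2_XXXXXX";    /* hypothetical name */
    char path[64];
    int mode = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/file", dir);

    int fd = creat(path, 0600);
    if (fd != -1) {
        int flags = fcntl(fd, F_GETFL);
        if (flags != -1)
            mode = flags & O_ACCMODE;
        close(fd);
        unlink(path);
    }
    rmdir(dir);
    return mode;
}
```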
443
444 openat()
445 The openat() system call operates in exactly the same way as open(),
446 except for the differences described here.
447
448 If the pathname given in pathname is relative, then it is interpreted
449 relative to the directory referred to by the file descriptor dirfd
450 (rather than relative to the current working directory of the calling
451 process, as is done by open() for a relative pathname).
452
453 If pathname is relative and dirfd is the special value AT_FDCWD, then
454 pathname is interpreted relative to the current working directory of
455 the calling process (like open()).
456
457 If pathname is absolute, then dirfd is ignored.
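
The two cases above, a relative pathname resolved against dirfd and an absolute pathname that ignores dirfd, can be sketched as follows. Function and file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates "child" relative to a directory descriptor and checks that
   it exists there.  Returns 1 on success, 0 or -1 otherwise. */
int openat_relative(void)
{
    char dir[] = "/tmp/openat_XXXXXX";         /* hypothetical name */
    struct stat st;
    int ok = -1;

    if (mkdtemp(dir) == NULL)
        return -1;

    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd != -1) {
        int fd = openat(dirfd, "child", O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd != -1) {
            ok = fstatat(dirfd, "child", &st, 0) == 0 && S_ISREG(st.st_mode);
            close(fd);
            unlinkat(dirfd, "child", 0);
        }
        close(dirfd);
    }
    rmdir(dir);
    return ok;
}

/* Returns 1 if an absolute pathname opens even though dirfd is not a
   valid descriptor, 0 otherwise. */
int openat_absolute_ignores_dirfd(void)
{
    int fd = openat(-1, "/dev/null", O_RDONLY);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```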

RETURN VALUE
open(), openat(), and creat() return the new file descriptor, or -1 if
an error occurred (in which case, errno is set appropriately).
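
A typical calling pattern checks for -1 and saves errno before making any further library call that might overwrite it. A minimal sketch, with a hypothetical helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Tries to open path with the given flags.  Returns 0 on success, or
   the errno value from open() after reporting it on stderr. */
int open_errno(const char *path, int flags)
{
    int fd = open(path, flags);
    if (fd == -1) {
        int saved = errno;             /* fprintf() may change errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

For instance, open_errno("/dev/null/x", O_RDONLY) returns ENOTDIR, because a component used as a directory in the pathname is not a directory.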

ERRORS
open(), openat(), and creat() can fail with the following errors:
465
466 EACCES The requested access to the file is not allowed, or search per‐
467 mission is denied for one of the directories in the path prefix
468 of pathname, or the file did not exist yet and write access to
469 the parent directory is not allowed. (See also path_resolu‐
470 tion(7).)
471
472 EDQUOT Where O_CREAT is specified, the file does not exist, and the
473 user's quota of disk blocks or inodes on the filesystem has been
474 exhausted.
475
476 EEXIST pathname already exists and O_CREAT and O_EXCL were used.
477
478 EFAULT pathname points outside your accessible address space.
479
480 EFBIG See EOVERFLOW.
481
482 EINTR While blocked waiting to complete an open of a slow device
483 (e.g., a FIFO; see fifo(7)), the call was interrupted by a sig‐
484 nal handler; see signal(7).
485
486 EINVAL The filesystem does not support the O_DIRECT flag. See NOTES
487 for more information.
488
489 EINVAL Invalid value in flags.
490
491 EINVAL O_TMPFILE was specified in flags, but neither O_WRONLY nor
492 O_RDWR was specified.
493
494 EINVAL O_CREAT was specified in flags and the final component ("base‐
495 name") of the new file's pathname is invalid (e.g., it contains
496 characters not permitted by the underlying filesystem).
497
498 EISDIR pathname refers to a directory and the access requested involved
499 writing (that is, O_WRONLY or O_RDWR is set).
500
501 EISDIR pathname refers to an existing directory, O_TMPFILE and one of
502 O_WRONLY or O_RDWR were specified in flags, but this kernel ver‐
503 sion does not provide the O_TMPFILE functionality.
504
505 ELOOP Too many symbolic links were encountered in resolving pathname.
506
507 ELOOP pathname was a symbolic link, and flags specified O_NOFOLLOW but
508 not O_PATH.
509
510 EMFILE The per-process limit on the number of open file descriptors has
511 been reached (see the description of RLIMIT_NOFILE in getr‐
512 limit(2)).
513
514 ENAMETOOLONG
515 pathname was too long.
516
517 ENFILE The system-wide limit on the total number of open files has been
518 reached.
519
520 ENODEV pathname refers to a device special file and no corresponding
521 device exists. (This is a Linux kernel bug; in this situation
522 ENXIO must be returned.)
523
524 ENOENT O_CREAT is not set and the named file does not exist. Or, a
525 directory component in pathname does not exist or is a dangling
526 symbolic link.
527
528 ENOENT pathname refers to a nonexistent directory, O_TMPFILE and one of
529 O_WRONLY or O_RDWR were specified in flags, but this kernel ver‐
530 sion does not provide the O_TMPFILE functionality.
531
532 ENOMEM The named file is a FIFO, but memory for the FIFO buffer can't
533 be allocated because the per-user hard limit on memory alloca‐
534 tion for pipes has been reached and the caller is not privi‐
535 leged; see pipe(7).
536
537 ENOMEM Insufficient kernel memory was available.
538
539 ENOSPC pathname was to be created but the device containing pathname
540 has no room for the new file.
541
542 ENOTDIR
543 A component used as a directory in pathname is not, in fact, a
544 directory, or O_DIRECTORY was specified and pathname was not a
545 directory.
546
547 ENXIO O_NONBLOCK | O_WRONLY is set, the named file is a FIFO, and no
548 process has the FIFO open for reading.
549
550 ENXIO The file is a device special file and no corresponding device
551 exists.
552
553 EOPNOTSUPP
554 The filesystem containing pathname does not support O_TMPFILE.
555
556 EOVERFLOW
557 pathname refers to a regular file that is too large to be
558 opened. The usual scenario here is that an application compiled
559 on a 32-bit platform without -D_FILE_OFFSET_BITS=64 tried to
560 open a file whose size exceeds (1<<31)-1 bytes; see also
561 O_LARGEFILE above. This is the error specified by POSIX.1; in
562 kernels before 2.6.24, Linux gave the error EFBIG for this case.
563
564 EPERM The O_NOATIME flag was specified, but the effective user ID of
565 the caller did not match the owner of the file and the caller
566 was not privileged.
567
568 EPERM The operation was prevented by a file seal; see fcntl(2).
569
570 EROFS pathname refers to a file on a read-only filesystem and write
571 access was requested.
572
573 ETXTBSY
574 pathname refers to an executable image which is currently being
575 executed and write access was requested.
576
577 EWOULDBLOCK
578 The O_NONBLOCK flag was specified, and an incompatible lease was
579 held on the file (see fcntl(2)).
580
581 The following additional errors can occur for openat():
582
583 EBADF dirfd is not a valid file descriptor.
584
585 ENOTDIR
586 pathname is a relative pathname and dirfd is a file descriptor
587 referring to a file other than a directory.

VERSIONS
openat() was added to Linux in kernel 2.6.16; library support was added
to glibc in version 2.4.

CONFORMING TO
open(), creat(): SVr4, 4.3BSD, POSIX.1-2001, POSIX.1-2008.
595
596 openat(): POSIX.1-2008.
597
598 The O_DIRECT, O_NOATIME, O_PATH, and O_TMPFILE flags are Linux-spe‐
599 cific. One must define _GNU_SOURCE to obtain their definitions.
600
601 The O_CLOEXEC, O_DIRECTORY, and O_NOFOLLOW flags are not specified in
602 POSIX.1-2001, but are specified in POSIX.1-2008. Since glibc 2.12, one
603 can obtain their definitions by defining either _POSIX_C_SOURCE with a
604 value greater than or equal to 200809L or _XOPEN_SOURCE with a value
605 greater than or equal to 700. In glibc 2.11 and earlier, one obtains
606 the definitions by defining _GNU_SOURCE.
607
608 As noted in feature_test_macros(7), feature test macros such as
609 _POSIX_C_SOURCE, _XOPEN_SOURCE, and _GNU_SOURCE must be defined before
610 including any header files.

NOTES
Under Linux, the O_NONBLOCK flag indicates that one wants to open but
does not necessarily have the intention to read or write. This is
typically used to open devices in order to get a file descriptor for use
with ioctl(2).
617
618 The (undefined) effect of O_RDONLY | O_TRUNC varies among implementa‐
619 tions. On many systems the file is actually truncated.
620
621 Note that open() can open device special files, but creat() cannot cre‐
622 ate them; use mknod(2) instead.
623
624 If the file is newly created, its st_atime, st_ctime, st_mtime fields
625 (respectively, time of last access, time of last status change, and
626 time of last modification; see stat(2)) are set to the current time,
627 and so are the st_ctime and st_mtime fields of the parent directory.
628 Otherwise, if the file is modified because of the O_TRUNC flag, its
629 st_ctime and st_mtime fields are set to the current time.
630
The files in the /proc/[pid]/fd directory show the open file
descriptors of the process with the PID pid. The files in the
/proc/[pid]/fdinfo directory show even more information about these
file descriptors. See proc(5) for further details of both of these
directories.
636
637 Open file descriptions
638 The term open file description is the one used by POSIX to refer to the
639 entries in the system-wide table of open files. In other contexts,
640 this object is variously also called an "open file object", a "file
641 handle", an "open file table entry", or—in kernel-developer parlance—a
642 struct file.
643
644 When a file descriptor is duplicated (using dup(2) or similar), the
645 duplicate refers to the same open file description as the original file
646 descriptor, and the two file descriptors consequently share the file
647 offset and file status flags. Such sharing can also occur between pro‐
648 cesses: a child process created via fork(2) inherits duplicates of its
649 parent's file descriptors, and those duplicates refer to the same open
650 file descriptions.
651
652 Each open() of a file creates a new open file description; thus, there
653 may be multiple open file descriptions corresponding to a file inode.
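
The difference between a duplicated descriptor and a second open() of the same file can be sketched as follows: the duplicate shares the file offset, while the independently opened descriptor has its own. The function and file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Reads 3 bytes through one descriptor, then stores the offset seen
   through a dup(2) of it in out[0] and the offset of a separately
   opened descriptor in out[1].  Returns 0 on success, -1 on error. */
int offsets_after_read(long out[2])
{
    char path[] = "/tmp/open_ofd_XXXXXX";      /* hypothetical name */
    char buf[3];
    int rc = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    int dupfd = dup(fd);                       /* same open file description */
    int other = open(path, O_RDONLY);          /* new open file description */

    if (dupfd != -1 && other != -1
        && write(fd, "abcdef", 6) == 6
        && lseek(fd, 0, SEEK_SET) == 0
        && read(fd, buf, 3) == 3) {
        out[0] = (long) lseek(dupfd, 0, SEEK_CUR);
        out[1] = (long) lseek(other, 0, SEEK_CUR);
        rc = 0;
    }

    if (other != -1)
        close(other);
    if (dupfd != -1)
        close(dupfd);
    close(fd);
    unlink(path);
    return rc;
}
```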
654
655 On Linux, one can use the kcmp(2) KCMP_FILE operation to test whether
656 two file descriptors (in the same process or in two different pro‐
657 cesses) refer to the same open file description.
658
659 Synchronized I/O
660 The POSIX.1-2008 "synchronized I/O" option specifies different variants
661 of synchronized I/O, and specifies the open() flags O_SYNC, O_DSYNC,
662 and O_RSYNC for controlling the behavior. Regardless of whether an
663 implementation supports this option, it must at least support the use
664 of O_SYNC for regular files.
665
666 Linux implements O_SYNC and O_DSYNC, but not O_RSYNC. (Somewhat incor‐
667 rectly, glibc defines O_RSYNC to have the same value as O_SYNC.)
668
669 O_SYNC provides synchronized I/O file integrity completion, meaning
670 write operations will flush data and all associated metadata to the
671 underlying hardware. O_DSYNC provides synchronized I/O data integrity
672 completion, meaning write operations will flush data to the underlying
673 hardware, but will only flush metadata updates that are required to
674 allow a subsequent read operation to complete successfully. Data
675 integrity completion can reduce the number of disk operations that are
676 required for applications that don't need the guarantees of file
677 integrity completion.
678
679 To understand the difference between the two types of completion, con‐
680 sider two pieces of file metadata: the file last modification timestamp
681 (st_mtime) and the file length. All write operations will update the
682 last file modification timestamp, but only writes that add data to the
683 end of the file will change the file length. The last modification
684 timestamp is not needed to ensure that a read completes successfully,
685 but the file length is. Thus, O_DSYNC would only guarantee to flush
686 updates to the file length metadata (whereas O_SYNC would also always
687 flush the last modification timestamp metadata).
688
689 Before Linux 2.6.33, Linux implemented only the O_SYNC flag for open().
690 However, when that flag was specified, most filesystems actually pro‐
691 vided the equivalent of synchronized I/O data integrity completion
692 (i.e., O_SYNC was actually implemented as the equivalent of O_DSYNC).
693
694 Since Linux 2.6.33, proper O_SYNC support is provided. However, to
695 ensure backward binary compatibility, O_DSYNC was defined with the same
696 value as the historical O_SYNC, and O_SYNC was defined as a new (two-
697 bit) flag value that includes the O_DSYNC flag value. This ensures
698 that applications compiled against new headers get at least O_DSYNC
699 semantics on pre-2.6.33 kernels.
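
The relationship between the two flag values can be checked at run time with a sketch like this one. It assumes Linux headers from 2.6.33 or later; on other systems the values may be unrelated.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>

/* Returns 1 if O_SYNC is a distinct value that contains every bit of
   O_DSYNC, as described above for Linux; 0 otherwise. */
int o_sync_contains_o_dsync(void)
{
    return (O_SYNC & O_DSYNC) == O_DSYNC && O_SYNC != O_DSYNC;
}
```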
700
701 C library/kernel differences
702 Since version 2.26, the glibc wrapper function for open() employs the
703 openat() system call, rather than the kernel's open() system call. For
704 certain architectures, this is also true in glibc versions before 2.26.
705
706 NFS
707 There are many infelicities in the protocol underlying NFS, affecting
708 amongst others O_SYNC and O_NDELAY.
709
710 On NFS filesystems with UID mapping enabled, open() may return a file
711 descriptor but, for example, read(2) requests are denied with EACCES.
712 This is because the client performs open() by checking the permissions,
713 but UID mapping is performed by the server upon read and write
714 requests.
715
716 FIFOs
717 Opening the read or write end of a FIFO blocks until the other end is
718 also opened (by another process or thread). See fifo(7) for further
719 details.
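
With O_NONBLOCK the open does not block. As described in fifo(7), a read-only open then succeeds immediately, while a write-only open with no reader fails with ENXIO (see ERRORS). A sketch with hypothetical function and file names:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates a FIFO that no other process has open and opens it with
   access_mode | O_NONBLOCK.  Returns 0 if the open succeeded, the
   errno of the open if it failed, or -1 if the setup failed. */
int fifo_nonblock_open(int access_mode)
{
    char dir[] = "/tmp/open_fifo_XXXXXX";      /* hypothetical name */
    char path[64];
    int err = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/fifo", dir);

    if (mkfifo(path, 0600) == 0) {
        int fd = open(path, access_mode | O_NONBLOCK);
        if (fd == -1) {
            err = errno;
        } else {
            close(fd);
            err = 0;
        }
        unlink(path);
    }
    rmdir(dir);
    return err;
}
```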
720
721 File access mode
722 Unlike the other values that can be specified in flags, the access mode
723 values O_RDONLY, O_WRONLY, and O_RDWR do not specify individual bits.
724 Rather, they define the low order two bits of flags, and are defined
725 respectively as 0, 1, and 2. In other words, the combination O_RDONLY
726 | O_WRONLY is a logical error, and certainly does not have the same
727 meaning as O_RDWR.
728
729 Linux reserves the special, nonstandard access mode 3 (binary 11) in
730 flags to mean: check for read and write permission on the file and
731 return a file descriptor that can't be used for reading or writing.
732 This nonstandard access mode is used by some Linux drivers to return a
733 file descriptor that is to be used only for device-specific ioctl(2)
734 operations.
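
Because the access mode is a two-bit field rather than a set of independent bits, it should be extracted with the standard O_ACCMODE mask and compared for equality. A minimal sketch follows; the helper names are hypothetical.

```c
#include <fcntl.h>
#include <stdbool.h>

/* True if the access mode in flags permits reading. */
bool flags_allow_read(int flags)
{
    int mode = flags & O_ACCMODE;

    return mode == O_RDONLY || mode == O_RDWR;
}

/* True if the access mode in flags permits writing. */
bool flags_allow_write(int flags)
{
    int mode = flags & O_ACCMODE;

    return mode == O_WRONLY || mode == O_RDWR;
}
```

The same helpers can be applied to the value returned by fcntl(fd, F_GETFL). Note that a test such as (flags & O_RDONLY) is always false, since O_RDONLY is 0, and that the Linux-specific access mode 3 permits neither reading nor writing.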
735
736 Rationale for openat() and other directory file descriptor APIs
737 openat() and the other system calls and library functions that take a
738 directory file descriptor argument (i.e., execveat(2), faccessat(2),
739 fanotify_mark(2), fchmodat(2), fchownat(2), fstatat(2), futimesat(2),
740 linkat(2), mkdirat(2), mknodat(2), name_to_handle_at(2), readlinkat(2),
741 renameat(2), statx(2), symlinkat(2), unlinkat(2), utimensat(2), mkfi‐
742 foat(3), and scandirat(3)) address two problems with the older inter‐
743 faces that preceded them. Here, the explanation is in terms of the
744 openat() call, but the rationale is analogous for the other interfaces.
745
746 First, openat() allows an application to avoid race conditions that
747 could occur when using open() to open files in directories other than
748 the current working directory. These race conditions result from the
749 fact that some component of the directory prefix given to open() could
750 be changed in parallel with the call to open(). Suppose, for example,
751 that we wish to create the file dir1/dir2/xxx.dep if the file
752 dir1/dir2/xxx exists. The problem is that between the existence check
753 and the file-creation step, dir1 or dir2 (which might be symbolic
754 links) could be modified to point to a different location. Such races
755 can be avoided by opening a file descriptor for the target directory,
756 and then specifying that file descriptor as the dirfd argument of (say)
757 fstatat(2) and openat(). The use of the dirfd file descriptor also has
758 other benefits:
759
760 * the file descriptor is a stable reference to the directory, even if
761 the directory is renamed; and
762
763 * the open file descriptor prevents the underlying filesystem from
764 being dismounted, just as when a process has a current working
765 directory on a filesystem.
766
767 Second, openat() allows the implementation of a per-thread "current
768 working directory", via file descriptor(s) maintained by the applica‐
769 tion. (This functionality can also be obtained by tricks based on the
770 use of /proc/self/fd/dirfd, but less efficiently.)
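
The first point above, creating xxx.dep only if xxx exists, can be sketched as follows. This is one possible arrangement, not a definitive implementation; the function names and the scratch directory under /tmp are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * If name exists in the directory dirpath, create depname in the same
 * directory and return a descriptor for it; otherwise return -1 with
 * errno set.  Both steps go through one directory descriptor, so later
 * changes to the components of dirpath cannot redirect the second step.
 */
int create_dep_if_exists(const char *dirpath, const char *name,
                         const char *depname)
{
    struct stat sb;
    int dirfd, fd, saved_errno;

    dirfd = open(dirpath, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (dirfd == -1)
        return -1;

    if (fstatat(dirfd, name, &sb, 0) == -1)
        fd = -1;
    else
        fd = openat(dirfd, depname, O_WRONLY | O_CREAT | O_CLOEXEC, 0644);

    saved_errno = errno;        /* close() must not clobber errno */
    close(dirfd);
    errno = saved_errno;
    return fd;
}

/* Hypothetical self-check in a scratch directory; returns 0 on success. */
int demo_create_dep(void)
{
    char dir[] = "/tmp/openat-demo-XXXXXX";
    char path[sizeof(dir) + 16];
    int fd, ok = 0;

    if (mkdtemp(dir) == NULL)
        return -1;

    /* "xxx" does not exist yet, so no .dep file may be created. */
    fd = create_dep_if_exists(dir, "xxx", "xxx.dep");
    if (fd == -1 && errno == ENOENT)
        ok++;
    else if (fd != -1)
        close(fd);

    /* Create "xxx"; now the .dep file should be created. */
    snprintf(path, sizeof(path), "%s/xxx", dir);
    fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd != -1)
        close(fd);
    fd = create_dep_if_exists(dir, "xxx", "xxx.dep");
    if (fd != -1) {
        ok++;
        close(fd);
    }

    unlink(path);
    snprintf(path, sizeof(path), "%s/xxx.dep", dir);
    unlink(path);
    rmdir(dir);
    return ok == 2 ? 0 : -1;
}
```

The directory path is resolved only once, in the initial open(); the existence check and the creation step then both refer to the same directory, whatever happens to the path afterward.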
771
772 O_DIRECT
773 The O_DIRECT flag may impose alignment restrictions on the length and
774 address of user-space buffers and the file offset of I/Os. In Linux,
775 alignment restrictions vary by filesystem and kernel version and might
776 be absent entirely. However, there is currently no filesystem-indepen‐
777 dent interface for an application to discover these restrictions for a
778 given file or filesystem. Some filesystems provide their own inter‐
779 faces for doing so, for example the XFS_IOC_DIOINFO operation in
780 xfsctl(3).
781
782 Under Linux 2.4, transfer sizes and the alignment of the user buffer
783 and the file offset must all be multiples of the logical block size of
784 the filesystem. Since Linux 2.6.0, alignment to the logical block size
785 of the underlying storage (typically 512 bytes) suffices. The logical
786 block size can be determined using the ioctl(2) BLKSSZGET operation or
787 from the shell using the command:
788
789 blockdev --getss
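
The following sketch shows one way to satisfy these alignment rules with posix_memalign(3). It assumes that the caller passes an alignment that is a power of two and a multiple of the logical block size (4096 is used in the self-check because it is a multiple of the typical 512 bytes); the function names and the scratch file under /tmp are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0      /* headers without O_DIRECT: fall back to buffered I/O */
#endif

/*
 * Write one block of align bytes at offset 0 of path using O_DIRECT.
 * The buffer address, the transfer size, and the file offset are all
 * multiples of align.  Returns the number of bytes written, or -1 with
 * errno set.
 */
ssize_t direct_write_block(const char *path, size_t align)
{
    void *buf;
    ssize_t n;
    int fd, err;

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT | O_CLOEXEC, 0600);
    if (fd == -1)
        return -1;

    err = posix_memalign(&buf, align, align);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }
    memset(buf, 'x', align);

    n = pwrite(fd, buf, align, 0);
    err = errno;
    free(buf);
    close(fd);
    errno = err;
    return n;
}

/* Hypothetical self-check; returns 0 on success. */
int demo_direct_write(void)
{
    char path[] = "/tmp/odirect-demo-XXXXXX";
    ssize_t n;
    int fd, err;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    close(fd);

    n = direct_write_block(path, 4096);
    err = errno;
    unlink(path);

    /*
     * EINVAL is tolerated here: it indicates a filesystem that does not
     * implement O_DIRECT or that requires stricter alignment.
     */
    return (n == 4096 || (n == -1 && err == EINVAL)) ? 0 : -1;
}
```

An application that cannot determine the required alignment in advance may need to treat EINVAL as a signal to fall back to ordinary buffered I/O.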
790
791 O_DIRECT I/Os should never be run concurrently with the fork(2) system
792 call, if the memory buffer is a private mapping (i.e., any mapping cre‐
793 ated with the mmap(2) MAP_PRIVATE flag; this includes memory allocated
794 on the heap and statically allocated buffers). Any such I/Os, whether
795 submitted via an asynchronous I/O interface or from another thread in
796 the process, should be completed before fork(2) is called. Failure to
797 do so can result in data corruption and undefined behavior in parent
798 and child processes. This restriction does not apply when the memory
799 buffer for the O_DIRECT I/Os was created using shmat(2) or mmap(2) with
800 the MAP_SHARED flag. Nor does this restriction apply when the memory
801 buffer has been advised as MADV_DONTFORK with madvise(2), ensuring that
802 it will not be available to the child after fork(2).
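
A buffer that is exempt from this restriction can be obtained by marking it MADV_DONTFORK before any I/O is submitted. The sketch below is one way to do so; the function names are hypothetical, and the length is assumed to be a multiple of the page size.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Allocate len bytes of page-aligned anonymous memory and advise the
 * kernel not to make the range available to children created by fork(2).
 * Returns NULL on failure.  Release the buffer with munmap(2).
 */
void *alloc_dontfork_buffer(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    if (madvise(p, len, MADV_DONTFORK) == -1) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Hypothetical self-check; returns 0 on success. */
int demo_dontfork_buffer(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = alloc_dontfork_buffer(len);
    int aligned;

    if (p == NULL)
        return -1;
    aligned = ((uintptr_t)p % len) == 0;
    munmap(p, len);
    return aligned ? 0 : -1;
}
```

Since mmap(2) returns page-aligned memory, such a buffer also meets the address alignment requirements discussed earlier, provided the page size is a multiple of the logical block size.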
803
804 The O_DIRECT flag was introduced in SGI IRIX, where it has alignment
805 restrictions similar to those of Linux 2.4. IRIX also has an fcntl(2)
806 call to query appropriate alignments and sizes. FreeBSD 4.x intro‐
807 duced a flag of the same name, but without alignment restrictions.
808
809 O_DIRECT support was added under Linux in kernel version 2.4.10. Older
810 Linux kernels simply ignore this flag. Some filesystems may not imple‐
811 ment the flag, in which case open() fails with the error EINVAL if it
812 is used.
813
814 Applications should avoid mixing O_DIRECT and normal I/O to the same
815 file, and especially to overlapping byte regions in the same file.
816 Even when the filesystem correctly handles the coherency issues in this
817 situation, overall I/O throughput is likely to be slower than using
818 either mode alone. Likewise, applications should avoid mixing mmap(2)
819 of files with direct I/O to the same files.
820
821 The behavior of O_DIRECT with NFS differs from that on local filesystems.
822 Older kernels, or kernels configured in certain ways, may not support
823 this combination. The NFS protocol does not support passing the flag
824 to the server, so O_DIRECT I/O will bypass the page cache only on the
825 client; the server may still cache the I/O. The client asks the server
826 to make the I/O synchronous to preserve the synchronous semantics of
827 O_DIRECT. Some servers will perform poorly under these circumstances,
828 especially if the I/O size is small. Some servers may also be config‐
829 ured to lie to clients about the I/O having reached stable storage;
830 this will avoid the performance penalty at some risk to data integrity
831 in the event of server power failure. The Linux NFS client places no
832 alignment restrictions on O_DIRECT I/O.
833
834 In summary, O_DIRECT is a potentially powerful tool that should be used
835 with caution. It is recommended that applications treat use of
836 O_DIRECT as a performance option which is disabled by default.
837
838 "The thing that has always disturbed me about O_DIRECT is that
839 the whole interface is just stupid, and was probably designed by
840 a deranged monkey on some serious mind-controlling sub‐
841 stances."—Linus
842
843 BUGS
844 Currently, it is not possible to enable signal-driven I/O by specifying
845 O_ASYNC when calling open(); use fcntl(2) to enable this flag.
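
The fcntl(2) sequence can be sketched as follows. This is a minimal example, assuming that the signal should be delivered to the calling process; the function names are hypothetical, and a real application would install a SIGIO handler rather than ignore the signal.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/*
 * Enable signal-driven I/O on an already open descriptor.
 * Returns 0 on success, or -1 with errno set.
 */
int enable_signal_driven_io(int fd)
{
    int flags;

    /* Direct SIGIO for this descriptor to the calling process. */
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)
        return -1;
    return 0;
}

/* Hypothetical self-check on a pipe; returns 0 on success. */
int demo_enable_sigio(void)
{
    int p[2], flags, result = -1;

    /* SIGIO terminates the process by default; ignore it for this demo. */
    if (signal(SIGIO, SIG_IGN) == SIG_ERR)
        return -1;
    if (pipe(p) == -1)
        return -1;

    if (enable_signal_driven_io(p[0]) == 0) {
        flags = fcntl(p[0], F_GETFL);
        if (flags != -1 && (flags & O_ASYNC))
            result = 0;
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

The disposition of SIGIO should be set before O_ASYNC is enabled, because the default action of the signal is to terminate the process.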
846
847 One must check for two different error codes, EISDIR and ENOENT, when
848 trying to determine whether the kernel supports O_TMPFILE functional‐
849 ity.
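
A probe along these lines might look as follows. It is a sketch only: the enum and function names are hypothetical, and ENOENT is ambiguous if the directory itself might not exist, so the caller is assumed to pass an existing directory.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum tmpfile_probe {
    TMPFILE_SUPPORTED,
    TMPFILE_NO_KERNEL_SUPPORT,
    TMPFILE_OTHER_ERROR
};

/*
 * Try to create an unnamed temporary file in the existing directory dir.
 * EISDIR and ENOENT are both taken to mean that the kernel lacks
 * O_TMPFILE; any other error (for example, EOPNOTSUPP from a filesystem
 * without support) is reported separately, with errno left set.
 */
enum tmpfile_probe probe_o_tmpfile(const char *dir)
{
#ifdef O_TMPFILE
    int fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0600);

    if (fd != -1) {
        close(fd);
        return TMPFILE_SUPPORTED;
    }
    if (errno == EISDIR || errno == ENOENT)
        return TMPFILE_NO_KERNEL_SUPPORT;
    return TMPFILE_OTHER_ERROR;
#else
    /* The C library headers predate O_TMPFILE. */
    (void)dir;
    errno = ENOSYS;
    return TMPFILE_OTHER_ERROR;
#endif
}
```

A caller that receives TMPFILE_NO_KERNEL_SUPPORT would typically fall back to creating a named temporary file, for example with mkstemp(3).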
850
851 When both O_CREAT and O_DIRECTORY are specified in flags and the file
852 specified by pathname does not exist, open() will create a regular file
853 (i.e., O_DIRECTORY is ignored).
854
855 SEE ALSO
856 chmod(2), chown(2), close(2), dup(2), fcntl(2), link(2), lseek(2),
857 mknod(2), mmap(2), mount(2), open_by_handle_at(2), read(2), socket(2),
858 stat(2), umask(2), unlink(2), write(2), fopen(3), acl(5), fifo(7),
859 inode(7), path_resolution(7), symlink(7)
860
861 COLOPHON
862 This page is part of release 4.15 of the Linux man-pages project. A
863 description of the project, information about reporting bugs, and the
864 latest version of this page, can be found at
865 https://www.kernel.org/doc/man-pages/.
866
867
868
869Linux 2017-09-15 OPEN(2)