FCNTL(2)                   Linux Programmer's Manual                  FCNTL(2)


NAME
       fcntl - manipulate file descriptor


SYNOPSIS
       #include <unistd.h>
       #include <fcntl.h>

       int fcntl(int fd, int cmd, ... /* arg */ );


DESCRIPTION
       fcntl() performs one of the operations described below on the open file
       descriptor fd.  The operation is determined by cmd.

       fcntl() can take an optional third argument.  Whether or not this
       argument is required is determined by cmd.  The required argument type
       is indicated in parentheses after each cmd name (in most cases, the
       required type is int, and we identify the argument using the name arg),
       or void is specified if the argument is not required.

       Certain of the operations below are supported only since a particular
       Linux kernel version.  The preferred method of checking whether the
       host kernel supports a particular operation is to invoke fcntl() with
       the desired cmd value and then test whether the call failed with
       EINVAL, indicating that the kernel does not recognize this value.

   Duplicating a file descriptor
       F_DUPFD (int)
              Duplicate the file descriptor fd using the lowest-numbered
              available file descriptor greater than or equal to arg.  This is
              different from dup2(2), which uses exactly the file descriptor
              specified.

              On success, the new file descriptor is returned.

              See dup(2) for further details.

       F_DUPFD_CLOEXEC (int; since Linux 2.6.24)
              As for F_DUPFD, but additionally set the close-on-exec flag for
              the duplicate file descriptor.  Specifying this flag permits a
              program to avoid an additional fcntl() F_SETFD operation to set
              the FD_CLOEXEC flag.  For an explanation of why this flag is
              useful, see the description of O_CLOEXEC in open(2).

   File descriptor flags
       The following commands manipulate the flags associated with a file
       descriptor.  Currently, only one such flag is defined: FD_CLOEXEC, the
       close-on-exec flag.  If the FD_CLOEXEC bit is set, the file descriptor
       will automatically be closed during a successful execve(2).  (If the
       execve(2) fails, the file descriptor is left open.)  If the FD_CLOEXEC
       bit is not set, the file descriptor will remain open across an
       execve(2).

       F_GETFD (void)
              Return (as the function result) the file descriptor flags; arg
              is ignored.

       F_SETFD (int)
              Set the file descriptor flags to the value specified by arg.

       In multithreaded programs, using fcntl() F_SETFD to set the
       close-on-exec flag at the same time as another thread performs a
       fork(2) plus execve(2) is vulnerable to a race condition that may
       unintentionally leak the file descriptor to the program executed in the
       child process.  See the discussion of the O_CLOEXEC flag in open(2) for
       details and a remedy to the problem.
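
       A common way to use F_GETFD and F_SETFD together is a read-modify-write
       sequence, sketched below.  The helper name set_cloexec() is
       hypothetical.  The sequence is not atomic, so it is subject to the
       multithreading race described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn the close-on-exec flag on (enable != 0)
 * or off (enable == 0) while preserving any other descriptor flags.
 * Returns 0 on success, -1 with errno set on failure. */
int set_cloexec(int fd, int enable)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;

    if (enable)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;

    return (fcntl(fd, F_SETFD, flags) == -1) ? -1 : 0;
}
```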

   File status flags
       Each open file description has certain associated status flags,
       initialized by open(2) and possibly modified by fcntl().  Duplicated
       file descriptors (made with dup(2), fcntl(F_DUPFD), fork(2), etc.)
       refer to the same open file description, and thus share the same file
       status flags.

       The file status flags and their semantics are described in open(2).

       F_GETFL (void)
              Return (as the function result) the file access mode and the
              file status flags; arg is ignored.

       F_SETFL (int)
              Set the file status flags to the value specified by arg.  File
              access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation flags
              (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in arg are ignored.
              On Linux, this command can change only the O_APPEND, O_ASYNC,
              O_DIRECT, O_NOATIME, and O_NONBLOCK flags.  It is not possible
              to change the O_DSYNC and O_SYNC flags; see BUGS, below.
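
       The following sketch shows one way to enable O_NONBLOCK with F_GETFL
       and F_SETFL, and to extract the access mode from the F_GETFL result.
       The helper names set_nonblocking() and access_mode() are assumptions
       made for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the file status flags,
 * keeping the flags that are already set.  Returns 0 or -1. */
int set_nonblocking(int fd)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) ? -1 : 0;
}

/* Hypothetical helper: return O_RDONLY, O_WRONLY, or O_RDWR for fd,
 * or -1 on error.  The access mode must be masked out with O_ACCMODE
 * because it is not a set of independent bits. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return (flags == -1) ? -1 : (flags & O_ACCMODE);
}
```

       Because the status flags belong to the open file description, the
       change is also visible through any duplicate of fd.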

   Advisory record locking
       Linux implements traditional ("process-associated") UNIX record locks,
       as standardized by POSIX.  For a Linux-specific alternative with better
       semantics, see the discussion of open file description locks below.

       F_SETLK, F_SETLKW, and F_GETLK are used to acquire, release, and test
       for the existence of record locks (also known as byte-range,
       file-segment, or file-region locks).  The third argument, lock, is a
       pointer to a structure that has at least the following fields (in
       unspecified order).

           struct flock {
               ...
               short l_type;    /* Type of lock: F_RDLCK,
                                   F_WRLCK, F_UNLCK */
               short l_whence;  /* How to interpret l_start:
                                   SEEK_SET, SEEK_CUR, SEEK_END */
               off_t l_start;   /* Starting offset for lock */
               off_t l_len;     /* Number of bytes to lock */
               pid_t l_pid;     /* PID of process blocking our lock
                                   (set by F_GETLK and F_OFD_GETLK) */
               ...
           };

       The l_whence, l_start, and l_len fields of this structure specify the
       range of bytes we wish to lock.  Bytes past the end of the file may be
       locked, but not bytes before the start of the file.

       l_start is the starting offset for the lock, and is interpreted
       relative to either: the start of the file (if l_whence is SEEK_SET);
       the current file offset (if l_whence is SEEK_CUR); or the end of the
       file (if l_whence is SEEK_END).  In the final two cases, l_start can be
       a negative number provided the offset does not lie before the start of
       the file.

       l_len specifies the number of bytes to be locked.  If l_len is
       positive, then the range to be locked covers bytes l_start up to and
       including l_start+l_len-1.  Specifying 0 for l_len has the special
       meaning: lock all bytes starting at the location specified by l_whence
       and l_start through to the end of file, no matter how large the file
       grows.

       POSIX.1-2001 allows (but does not require) an implementation to support
       a negative l_len value; if l_len is negative, the interval described by
       lock covers bytes l_start+l_len up to and including l_start-1.  This is
       supported by Linux since kernel versions 2.4.21 and 2.5.49.
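
       The three cases for l_len can be worked through in code.  The sketch
       below is purely arithmetic and performs no locking; struct byte_range
       and resolve_lock_range() are hypothetical names, and start is assumed
       to be the absolute offset obtained after l_whence has been applied.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical result type: the inclusive byte interval covered by a
 * lock.  When to_eof is nonzero, the interval has no upper bound and
 * 'last' is not meaningful. */
struct byte_range {
    off_t first;
    off_t last;
    int   to_eof;
};

/* Compute the interval described by (start, len), following the
 * rules for positive, zero, and negative l_len. */
struct byte_range resolve_lock_range(off_t start, off_t len)
{
    struct byte_range r = { 0, 0, 0 };

    assert(start >= 0);             /* cannot lock before the start of file */

    if (len > 0) {                  /* bytes start .. start+len-1 */
        r.first = start;
        r.last = start + len - 1;
    } else if (len == 0) {          /* from start to end of file, however large */
        r.first = start;
        r.to_eof = 1;
    } else {                        /* bytes start+len .. start-1 */
        r.first = start + len;
        r.last = start - 1;
    }
    return r;
}
```

       For example, a start of 100 with a length of 10 covers bytes 100 to
       109, while a start of 100 with a length of -10 covers bytes 90 to 99.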

       The l_type field can be used to place a read (F_RDLCK) or a write
       (F_WRLCK) lock on a file.  Any number of processes may hold a read lock
       (shared lock) on a file region, but only one process may hold a write
       lock (exclusive lock).  An exclusive lock excludes all other locks,
       both shared and exclusive.  A single process can hold only one type of
       lock on a file region; if a new lock is applied to an already-locked
       region, then the existing lock is converted to the new lock type.
       (Such conversions may involve splitting, shrinking, or coalescing with
       an existing lock if the byte range specified by the new lock does not
       precisely coincide with the range of the existing lock.)

       F_SETLK (struct flock *)
              Acquire a lock (when l_type is F_RDLCK or F_WRLCK) or release a
              lock (when l_type is F_UNLCK) on the bytes specified by the
              l_whence, l_start, and l_len fields of lock.  If a conflicting
              lock is held by another process, this call returns -1 and sets
              errno to EACCES or EAGAIN.  (The error returned in this case
              differs across implementations, so POSIX requires a portable
              application to check for both errors.)

       F_SETLKW (struct flock *)
              As for F_SETLK, but if a conflicting lock is held on the file,
              then wait for that lock to be released.  If a signal is caught
              while waiting, then the call is interrupted and (after the
              signal handler has returned) returns immediately (with return
              value -1 and errno set to EINTR; see signal(7)).

       F_GETLK (struct flock *)
              On input to this call, lock describes a lock we would like to
              place on the file.  If the lock could be placed, fcntl() does
              not actually place it, but returns F_UNLCK in the l_type field
              of lock and leaves the other fields of the structure unchanged.

              If one or more incompatible locks would prevent this lock being
              placed, then fcntl() returns details about one of those locks in
              the l_type, l_whence, l_start, and l_len fields of lock.  If the
              conflicting lock is a traditional (process-associated) record
              lock, then the l_pid field is set to the PID of the process
              holding that lock.  If the conflicting lock is an open file
              description lock, then l_pid is set to -1.  Note that the
              returned information may already be out of date by the time the
              caller inspects it.
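
       The nonblocking commands above might be wrapped as in the following
       sketch.  The names make_flock(), try_lock_region(), and probe_lock()
       are hypothetical, and offsets are taken relative to the start of the
       file (SEEK_SET) for simplicity.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Build a zeroed struct flock describing len bytes from offset start. */
static struct flock make_flock(short type, off_t start, off_t len)
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fl;
}

/* Place (F_RDLCK, F_WRLCK) or release (F_UNLCK) a lock without waiting.
 * Returns 0 on success, 1 if another process holds a conflicting lock
 * (both EACCES and EAGAIN are checked, as POSIX requires), -1 otherwise. */
int try_lock_region(int fd, short type, off_t start, off_t len)
{
    struct flock fl = make_flock(type, start, len);

    assert(fd >= 0);
    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;
    return (errno == EACCES || errno == EAGAIN) ? 1 : -1;
}

/* Ask whether a lock could be placed, without placing it.  Returns 0 if
 * it could, 1 if a conflicting lock exists (its l_pid is stored in
 * *holder; -1 means an open file description lock), -1 on error.
 * The answer may already be stale when the caller looks at it. */
int probe_lock(int fd, short type, off_t start, off_t len, pid_t *holder)
{
    struct flock fl = make_flock(type, start, len);

    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    if (fl.l_type == F_UNLCK)
        return 0;
    if (holder != NULL)
        *holder = fl.l_pid;
    return 1;
}
```

       Locks held by the calling process never conflict with its own
       requests, so probe_lock() reports 0 for a region that the caller
       itself has locked.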

       In order to place a read lock, fd must be open for reading.  In order
       to place a write lock, fd must be open for writing.  To place both
       types of lock, open a file read-write.

       When placing locks with F_SETLKW, the kernel detects deadlocks, whereby
       two or more processes have their lock requests mutually blocked by
       locks held by the other processes.  For example, suppose process A
       holds a write lock on byte 100 of a file, and process B holds a write
       lock on byte 200.  If each process then attempts to lock the byte
       already locked by the other process using F_SETLKW, then, without
       deadlock detection, both processes would remain blocked indefinitely.
       When the kernel detects such deadlocks, it causes one of the blocking
       lock requests to immediately fail with the error EDEADLK; an
       application that encounters such an error should release some of its
       locks to allow other applications to proceed before attempting to
       regain the locks that it requires.  Circular deadlocks involving more
       than two processes are also detected.  Note, however, that there are
       limitations to the kernel's deadlock-detection algorithm; see BUGS.
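
       A blocking acquisition that distinguishes the EINTR and EDEADLK cases
       might look like the sketch below.  The helper name
       lock_region_blocking() is hypothetical, and retrying after EINTR is a
       policy chosen for this example rather than something fcntl() requires.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: wait until a lock of the given type can be
 * placed on len bytes starting at offset start.
 * Returns 0 on success, 1 if the kernel reported a deadlock (the caller
 * should release some of its locks before trying again), -1 on any
 * other error. */
int lock_region_blocking(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    assert(type == F_RDLCK || type == F_WRLCK);

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    while (fcntl(fd, F_SETLKW, &fl) == -1) {
        if (errno == EINTR)
            continue;               /* a signal handler ran; wait again */
        return (errno == EDEADLK) ? 1 : -1;
    }
    return 0;
}
```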

       As well as being removed by an explicit F_UNLCK, record locks are
       automatically released when the process terminates.

       Record locks are not inherited by a child created via fork(2), but are
       preserved across an execve(2).

       Because of the buffering performed by the stdio(3) library, the use of
       record locking with routines in that package should be avoided; use
       read(2) and write(2) instead.

       The record locks described above are associated with the process
       (unlike the open file description locks described below).  This has
       some unfortunate consequences:

       *  If a process closes any file descriptor referring to a file, then
          all of the process's locks on that file are released, regardless of
          the file descriptor(s) on which the locks were obtained.  This is
          bad: it means that a process can lose its locks on a file such as
          /etc/passwd or /etc/mtab when for some reason a library function
          decides to open, read, and close the same file.

       *  The threads in a process share locks.  In other words, a
          multithreaded program can't use record locking to ensure that
          threads don't simultaneously access the same region of a file.

       Open file description locks solve both of these problems.
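
       The first consequence listed above can be observed with the following
       sketch, which uses a child process to check whether the parent's lock
       is still in place.  The function names are hypothetical, and the
       example assumes a local filesystem that supports record locks.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* From a child process, test whether a write lock on the whole file
 * would be blocked.  Returns 1 if blocked, 0 if not, -1 on error. */
static int locked_by_other(const char *path)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        struct flock fl;
        memset(&fl, 0, sizeof(fl)); /* l_start = 0, l_len = 0: whole file */
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        int fd = open(path, O_RDWR);
        if (fd == -1 || fcntl(fd, F_GETLK, &fl) == -1)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return (WEXITSTATUS(status) == 2) ? -1 : WEXITSTATUS(status);
}

/* Returns 1 if a lock taken via one descriptor vanished after an
 * unrelated descriptor for the same file was closed, 0 if it
 * survived, -1 on error. */
int lock_lost_after_unrelated_close(const char *path)
{
    struct flock fl;
    int fd, other, before, after;

    assert(path != NULL);
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    if (fcntl(fd, F_SETLK, &fl) == -1) {
        close(fd);
        return -1;
    }
    before = locked_by_other(path);

    other = open(path, O_RDONLY);   /* e.g., a library reading the file */
    if (other == -1) {
        close(fd);
        return -1;
    }
    close(other);                   /* drops all of our locks on the file */

    after = locked_by_other(path);
    close(fd);
    if (before == -1 || after == -1)
        return -1;
    return before == 1 && after == 0;
}
```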

   Open file description locks (non-POSIX)
       Open file description locks are advisory byte-range locks whose
       operation is in most respects identical to the traditional record locks
       described above.  This lock type is Linux-specific, and available since
       Linux 3.15.  (There is a proposal with the Austin Group to include this
       lock type in the next revision of POSIX.1.)  For an explanation of open
       file descriptions, see open(2).

       The principal difference between the two lock types is that whereas
       traditional record locks are associated with a process, open file
       description locks are associated with the open file description on
       which they are acquired, much like locks acquired with flock(2).
       Consequently (and unlike traditional advisory record locks), open file
       description locks are inherited across fork(2) (and clone(2) with
       CLONE_FILES), and are only automatically released on the last close of
       the open file description, instead of being released on any close of
       the file.

       Conflicting lock combinations (i.e., a read lock and a write lock or
       two write locks) where one lock is an open file description lock and
       the other is a traditional record lock conflict even when they are
       acquired by the same process on the same file descriptor.

       Open file description locks placed via the same open file description
       (i.e., via the same file descriptor, or via a duplicate of the file
       descriptor created by fork(2), dup(2), fcntl() F_DUPFD, and so on) are
       always compatible: if a new lock is placed on an already locked region,
       then the existing lock is converted to the new lock type.  (Such
       conversions may result in splitting, shrinking, or coalescing with an
       existing lock as discussed above.)

       On the other hand, open file description locks may conflict with each
       other when they are acquired via different open file descriptions.
       Thus, the threads in a multithreaded program can use open file
       description locks to synchronize access to a file region by having each
       thread perform its own open(2) on the file and applying locks via the
       resulting file descriptor.

       As with traditional advisory locks, the third argument to fcntl(),
       lock, is a pointer to an flock structure.  By contrast with traditional
       record locks, the l_pid field of that structure must be set to zero
       when using the commands described below.

       The commands for working with open file description locks are analogous
       to those used with traditional locks:

       F_OFD_SETLK (struct flock *)
              Acquire an open file description lock (when l_type is F_RDLCK or
              F_WRLCK) or release an open file description lock (when l_type
              is F_UNLCK) on the bytes specified by the l_whence, l_start, and
              l_len fields of lock.  If a conflicting lock is held by another
              process, this call returns -1 and sets errno to EAGAIN.

       F_OFD_SETLKW (struct flock *)
              As for F_OFD_SETLK, but if a conflicting lock is held on the
              file, then wait for that lock to be released.  If a signal is
              caught while waiting, then the call is interrupted and (after
              the signal handler has returned) returns immediately (with
              return value -1 and errno set to EINTR; see signal(7)).

       F_OFD_GETLK (struct flock *)
              On input to this call, lock describes an open file description
              lock we would like to place on the file.  If the lock could be
              placed, fcntl() does not actually place it, but returns F_UNLCK
              in the l_type field of lock and leaves the other fields of the
              structure unchanged.  If one or more incompatible locks would
              prevent this lock being placed, then details about one of these
              locks are returned via lock, as described above for F_GETLK.

       In the current implementation, no deadlock detection is performed for
       open file description locks.  (This contrasts with process-associated
       record locks, for which the kernel does perform deadlock detection.)
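
       A nonblocking open file description lock might be taken as in the
       sketch below.  The helper name ofd_try_lock() is hypothetical.  The
       example assumes glibc, where _GNU_SOURCE exposes the F_OFD_*
       constants; the guarded fallback definitions give the values used by
       the Linux kernel headers, in case the feature macro was not in effect
       when <fcntl.h> was first included.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                 /* exposes F_OFD_* on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLK                 /* Linux values, used only as a fallback */
#define F_OFD_GETLK  36
#define F_OFD_SETLK  37
#define F_OFD_SETLKW 38
#endif

/* Hypothetical helper: place or release an open file description lock
 * without waiting.  Returns 0 on success, 1 if a conflicting lock is
 * held via a different open file description, -1 on other errors. */
int ofd_try_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof(fl));     /* also leaves l_pid at 0, as required */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    if (fcntl(fd, F_OFD_SETLK, &fl) == 0)
        return 0;
    return (errno == EAGAIN) ? 1 : -1;
}
```

       Two descriptors obtained from separate open(2) calls on the same file
       conflict with each other even within one process, whereas a
       descriptor and its dup(2) duplicate do not.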

   Mandatory locking
       Warning: the Linux implementation of mandatory locking is unreliable.
       See BUGS below.  Because of these bugs, and the fact that the feature
       is believed to be little used, since Linux 4.5, mandatory locking has
       been made an optional feature, governed by a configuration option
       (CONFIG_MANDATORY_FILE_LOCKING).  This is an initial step toward
       removing this feature completely.

       By default, both traditional (process-associated) and open file
       description record locks are advisory.  Advisory locks are not enforced
       and are useful only between cooperating processes.

       Both lock types can also be mandatory.  Mandatory locks are enforced
       for all processes.  If a process tries to perform an incompatible
       access (e.g., read(2) or write(2)) on a file region that has an
       incompatible mandatory lock, then the result depends upon whether the
       O_NONBLOCK flag is enabled for its open file description.  If the
       O_NONBLOCK flag is not enabled, then the system call is blocked until
       the lock is removed or converted to a mode that is compatible with the
       access.  If the O_NONBLOCK flag is enabled, then the system call fails
       with the error EAGAIN.

       To make use of mandatory locks, mandatory locking must be enabled both
       on the filesystem that contains the file to be locked, and on the file
       itself.  Mandatory locking is enabled on a filesystem using the "-o
       mand" option to mount(8), or the MS_MANDLOCK flag for mount(2).
       Mandatory locking is enabled on a file by disabling group execute
       permission on the file and enabling the set-group-ID permission bit
       (see chmod(1) and chmod(2)).

       Mandatory locking is not specified by POSIX.  Some other systems also
       support mandatory locking, although the details of how to enable it
       vary across systems.

   Lost locks
       When an advisory lock is obtained on a networked filesystem such as NFS
       it is possible that the lock might get lost.  This may happen due to
       administrative action on the server, or due to a network partition
       (i.e., loss of network connectivity with the server) which lasts long
       enough for the server to assume that the client is no longer
       functioning.

       When the filesystem determines that a lock has been lost, future
       read(2) or write(2) requests may fail with the error EIO.  This error
       will persist until the lock is removed or the file descriptor is
       closed.  Since Linux 3.12, this happens at least for NFSv4 (including
       all minor versions).

       Some versions of UNIX send a signal (SIGLOST) in this circumstance.
       Linux does not define this signal, and does not provide any
       asynchronous notification of lost locks.

   Managing signals
       F_GETOWN, F_SETOWN, F_GETOWN_EX, F_SETOWN_EX, F_GETSIG, and F_SETSIG
       are used to manage I/O availability signals:

       F_GETOWN (void)
              Return (as the function result) the process ID or process group
              currently receiving SIGIO and SIGURG signals for events on file
              descriptor fd.  Process IDs are returned as positive values;
              process group IDs are returned as negative values (but see BUGS
              below).  arg is ignored.

       F_SETOWN (int)
              Set the process ID or process group ID that will receive SIGIO
              and SIGURG signals for events on the file descriptor fd.  The
              target process or process group ID is specified in arg.  A
              process ID is specified as a positive value; a process group ID
              is specified as a negative value.  Most commonly, the calling
              process specifies itself as the owner (that is, arg is specified
              as getpid(2)).

              As well as setting the file descriptor owner, one must also
              enable generation of signals on the file descriptor.  This is
              done by using the fcntl() F_SETFL command to set the O_ASYNC
              file status flag on the file descriptor.  Subsequently, a SIGIO
              signal is sent whenever input or output becomes possible on the
              file descriptor.  The fcntl() F_SETSIG command can be used to
              obtain delivery of a signal other than SIGIO.

              Sending a signal to the owner process (group) specified by
              F_SETOWN is subject to the same permissions checks as are
              described for kill(2), where the sending process is the one that
              employs F_SETOWN (but see BUGS below).  If this permission check
              fails, then the signal is silently discarded.  Note: The
              F_SETOWN operation records the caller's credentials at the time
              of the fcntl() call, and it is these saved credentials that are
              used for the permission checks.

              If the file descriptor fd refers to a socket, F_SETOWN also
              selects the recipient of SIGURG signals that are delivered when
              out-of-band data arrives on that socket.  (SIGURG is sent in any
              situation where select(2) would report the socket as having an
              "exceptional condition".)

              The following was true in 2.6.x kernels up to and including
              kernel 2.6.11:

                     If a nonzero value is given to F_SETSIG in a
                     multithreaded process running with a threading library
                     that supports thread groups (e.g., NPTL), then a positive
                     value given to F_SETOWN has a different meaning: instead
                     of being a process ID identifying a whole process, it is
                     a thread ID identifying a specific thread within a
                     process.  Consequently, it may be necessary to pass
                     F_SETOWN the result of gettid(2) instead of getpid(2) to
                     get sensible results when F_SETSIG is used.  (In current
                     Linux threading implementations, a main thread's thread
                     ID is the same as its process ID.  This means that a
                     single-threaded program can equally use gettid(2) or
                     getpid(2) in this scenario.)  Note, however, that the
                     statements in this paragraph do not apply to the SIGURG
                     signal generated for out-of-band data on a socket: this
                     signal is always sent to either a process or a process
                     group, depending on the value given to F_SETOWN.

              The above behavior was accidentally dropped in Linux 2.6.12, and
              won't be restored.  From Linux 2.6.32 onward, use F_SETOWN_EX to
              target SIGIO and SIGURG signals at a particular thread.

       F_GETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
              Return the current file descriptor owner settings as defined by
              a previous F_SETOWN_EX operation.  The information is returned
              in the structure pointed to by arg, which has the following
              form:

                  struct f_owner_ex {
                      int   type;
                      pid_t pid;
                  };

              The type field will have one of the values F_OWNER_TID,
              F_OWNER_PID, or F_OWNER_PGRP.  The pid field is a positive
              integer representing a thread ID, process ID, or process group
              ID.  See F_SETOWN_EX for more details.

       F_SETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
              This operation performs a similar task to F_SETOWN.  It allows
              the caller to direct I/O availability signals to a specific
              thread, process, or process group.  The caller specifies the
              target of signals via arg, which is a pointer to a f_owner_ex
              structure.  The type field has one of the following values,
              which define how pid is interpreted:

              F_OWNER_TID
                     Send the signal to the thread whose thread ID (the value
                     returned by a call to clone(2) or gettid(2)) is specified
                     in pid.

              F_OWNER_PID
                     Send the signal to the process whose ID is specified in
                     pid.

              F_OWNER_PGRP
                     Send the signal to the process group whose ID is
                     specified in pid.  (Note that, unlike with F_SETOWN, a
                     process group ID is specified as a positive value here.)

       F_GETSIG (void)
              Return (as the function result) the signal sent when input or
              output becomes possible.  A value of zero means SIGIO is sent.
              Any other value (including SIGIO) is the signal sent instead,
              and in this case additional info is available to the signal
              handler if installed with SA_SIGINFO.  arg is ignored.

       F_SETSIG (int)
              Set the signal sent when input or output becomes possible to the
              value given in arg.  A value of zero means to send the default
              SIGIO signal.  Any other value (including SIGIO) is the signal
              to send instead, and in this case additional info is available
              to the signal handler if installed with SA_SIGINFO.

              By using F_SETSIG with a nonzero value, and setting SA_SIGINFO
              for the signal handler (see sigaction(2)), extra information
              about I/O events is passed to the handler in a siginfo_t
              structure.  If the si_code field indicates the source is
              SI_SIGIO, the si_fd field gives the file descriptor associated
              with the event.  Otherwise, there is no indication which file
              descriptors are pending, and you should use the usual mechanisms
              (select(2), poll(2), read(2) with O_NONBLOCK set etc.) to
              determine which file descriptors are available for I/O.

              Note that the file descriptor provided in si_fd is the one that
              was specified during the F_SETSIG operation.  This can lead to
              an unusual corner case.  If the file descriptor is duplicated
              (dup(2) or similar), and the original file descriptor is closed,
              then I/O events will continue to be generated, but the si_fd
              field will contain the number of the now closed file descriptor.

              By selecting a real time signal (value >= SIGRTMIN), multiple
              I/O events may be queued using the same signal numbers.
              (Queuing is dependent on available memory.)  Extra information
              is available if SA_SIGINFO is set for the signal handler, as
              above.

              Note that Linux imposes a limit on the number of real-time
              signals that may be queued to a process (see getrlimit(2) and
              signal(7)) and if this limit is reached, then the kernel reverts
              to delivering SIGIO, and this signal is delivered to the entire
              process rather than to a specific thread.

       Using these mechanisms, a program can implement fully asynchronous I/O
       without using select(2) or poll(2) most of the time.

       The use of O_ASYNC is specific to BSD and Linux.  The only use of
       F_GETOWN and F_SETOWN specified in POSIX.1 is in conjunction with the
       use of the SIGURG signal on sockets.  (POSIX does not specify the SIGIO
       signal.)  F_GETOWN_EX, F_SETOWN_EX, F_GETSIG, and F_SETSIG are
       Linux-specific.  POSIX has asynchronous I/O and the aio_sigevent
       structure to achieve similar things; these are also available in Linux
       as part of the GNU C Library (Glibc).
512   Leases
513       F_SETLEASE  and  F_GETLEASE  (Linux 2.4 onward) are used to establish a
514       new lease, and retrieve the current lease, on the open file description
515       referred  to by the file descriptor fd.  A file lease provides a mecha‐
516       nism whereby the process holding the  lease  (the  "lease  holder")  is
517       notified  (via  delivery  of  a  signal)  when  a  process  (the "lease
518       breaker") tries to open(2) or truncate(2) the file referred to by  that
519       file descriptor.
521       F_SETLEASE (int)
522              Set  or  remove a file lease according to which of the following
523              values is specified in the integer arg:
525              F_RDLCK
526                     Take out a read  lease.   This  will  cause  the  calling
527                     process  to be notified when the file is opened for writ‐
528                     ing or is truncated.  A read lease can be placed only  on
529                     a file descriptor that is opened read-only.
531              F_WRLCK
532                     Take out a write lease.  This will cause the caller to be
533                     notified when the file is opened for reading  or  writing
534                     or  is  truncated.  A write lease may be placed on a file
535                     only if there are no other open file descriptors for  the
536                     file.
538              F_UNLCK
539                     Remove our lease from the file.
541       Leases  are  associated  with  an  open file description (see open(2)).
542       This means that duplicate file descriptors (created  by,  for  example,
543       fork(2) or dup(2)) refer to the same lease, and this lease may be modi‐
544       fied or released using any  of  these  descriptors.   Furthermore,  the
545       lease  is  released  by  either an explicit F_UNLCK operation on any of
546       these duplicate file descriptors, or when  all  such  file  descriptors
547       have been closed.
549       Leases may be taken out only on regular files.  An unprivileged process
550       may take out a lease only on a  file  whose  UID  (owner)  matches  the
551       filesystem UID of the process.  A process with the CAP_LEASE capability
552       may take out leases on arbitrary files.
554       F_GETLEASE (void)
555              Indicates what  type  of  lease  is  associated  with  the  file
556              descriptor  fd by returning either F_RDLCK, F_WRLCK, or F_UNLCK,
              indicating, respectively, a read lease, a write lease, or no
558              lease.  arg is ignored.
560       When a process (the "lease breaker") performs an open(2) or truncate(2)
561       that conflicts with a lease established via F_SETLEASE, the system call
562       is  blocked  by  the kernel and the kernel notifies the lease holder by
563       sending it a signal  (SIGIO  by  default).   The  lease  holder  should
564       respond to receipt of this signal by doing whatever cleanup is required
565       in preparation for the file to be accessed by  another  process  (e.g.,
566       flushing cached buffers) and then either remove or downgrade its lease.
567       A lease is removed by performing an F_SETLEASE command  specifying  arg
568       as  F_UNLCK.   If the lease holder currently holds a write lease on the
569       file, and the lease breaker is opening the file for reading, then it is
570       sufficient for the lease holder to downgrade the lease to a read lease.
571       This is done by performing an  F_SETLEASE  command  specifying  arg  as
572       F_RDLCK.
574       If  the  lease holder fails to downgrade or remove the lease within the
575       number of seconds specified in /proc/sys/fs/lease-break-time, then  the
576       kernel forcibly removes or downgrades the lease holder's lease.
578       Once  a  lease  break has been initiated, F_GETLEASE returns the target
579       lease type (either F_RDLCK or F_UNLCK, depending on what would be  com‐
580       patible  with  the  lease  breaker)  until the lease holder voluntarily
581       downgrades or removes the lease or the kernel forcibly  does  so  after
582       the lease break timer expires.
584       Once  the lease has been voluntarily or forcibly removed or downgraded,
585       and assuming the lease breaker has not unblocked its system  call,  the
586       kernel permits the lease breaker's system call to proceed.
588       If the lease breaker's blocked open(2) or truncate(2) is interrupted by
589       a signal handler, then the system call fails with the error EINTR,  but
590       the  other  steps still occur as described above.  If the lease breaker
591       is killed by a signal while blocked in open(2) or truncate(2), then the
592       other steps still occur as described above.  If the lease breaker spec‐
593       ifies the O_NONBLOCK flag when calling open(2), then the  call  immedi‐
594       ately fails with the error EWOULDBLOCK, but the other steps still occur
595       as described above.
597       The default signal used to notify the lease holder is SIGIO,  but  this
       can be changed using the F_SETSIG command to fcntl().  If an F_SETSIG
599       command is performed (even one specifying SIGIO), and the  signal  han‐
600       dler  is  established using SA_SIGINFO, then the handler will receive a
601       siginfo_t structure as its second argument, and the si_fd field of this
602       argument will hold the file descriptor of the leased file that has been
603       accessed by another process.  (This  is  useful  if  the  caller  holds
604       leases against multiple files.)
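       The lease operations above can be sketched as follows.  This is a
       minimal illustration, not a complete lease holder: the helper name
       read_lease_roundtrip() is hypothetical, and a real lease holder would
       also install a handler for the notification signal and release or
       downgrade the lease when a lease break is signaled.

```c
#define _GNU_SOURCE            /* F_SETLEASE, F_GETLEASE */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: take a read lease on path, query it, and
   release it again.  Returns the lease type reported by F_GETLEASE
   while the lease was held (F_RDLCK is expected), or -1 on error
   with errno set. */
int
read_lease_roundtrip(const char *path)
{
    int fd, type, saved;

    fd = open(path, O_RDONLY);      /* read leases need a read-only fd */
    if (fd == -1)
        return -1;

    if (fcntl(fd, F_SETLEASE, F_RDLCK) == -1) {
        saved = errno;              /* keep the fcntl() error for the caller */
        close(fd);
        errno = saved;
        return -1;
    }

    type = fcntl(fd, F_GETLEASE);   /* third argument is not required */

    fcntl(fd, F_SETLEASE, F_UNLCK); /* explicit release */
    close(fd);
    return type;
}
```

       The call can fail for the reasons given above, for example when the
       caller does not own the file or when the file is currently open for
       writing, so callers should be prepared for a -1 return.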
606   File and directory change notification (dnotify)
607       F_NOTIFY (int)
608              (Linux  2.4  onward)  Provide  notification  when  the directory
609              referred to by fd or any  of  the  files  that  it  contains  is
610              changed.   The events to be notified are specified in arg, which
611              is a bit mask specified by ORing together zero or  more  of  the
612              following bits:
614              DN_ACCESS   A  file  was  accessed (read(2), pread(2), readv(2),
                          and similar).
616              DN_MODIFY   A file was modified (write(2), pwrite(2), writev(2),
617                          truncate(2), ftruncate(2), and similar).
618              DN_CREATE   A  file  was  created  (open(2), creat(2), mknod(2),
619                          mkdir(2), link(2), symlink(2), rename(2)  into  this
620                          directory).
621              DN_DELETE   A file was unlinked (unlink(2), rename(2) to another
622                          directory, rmdir(2)).
623              DN_RENAME   A   file   was   renamed   within   this   directory
624                          (rename(2)).
625              DN_ATTRIB   The  attributes  of  a  file were changed (chown(2),
626                          chmod(2), utime(2), utimensat(2), and similar).
628              (In order to obtain these definitions, the  _GNU_SOURCE  feature
629              test macro must be defined before including any header files.)
631              Directory  notifications are normally "one-shot", and the appli‐
632              cation must reregister to receive further notifications.  Alter‐
633              natively,  if DN_MULTISHOT is included in arg, then notification
634              will remain in effect until explicitly removed.
636              A series of F_NOTIFY requests is cumulative, with the events  in
637              arg  being added to the set already monitored.  To disable noti‐
638              fication of all events, make an F_NOTIFY call specifying arg  as
639              0.
641              Notification  occurs via delivery of a signal.  The default sig‐
642              nal is SIGIO, but this can be changed using the F_SETSIG command
643              to  fcntl().  (Note that SIGIO is one of the nonqueuing standard
644              signals; switching to the use of a real-time signal  means  that
645              multiple  notifications  can  be queued to the process.)  In the
646              latter case, the signal handler receives a  siginfo_t  structure
647              as  its  second  argument  (if the handler was established using
648              SA_SIGINFO) and the si_fd field of this structure  contains  the
649              file  descriptor  which  generated the notification (useful when
650              establishing notification on multiple directories).
              Especially when using DN_MULTISHOT, a real-time signal should be
653              used  for  notification,  so  that multiple notifications can be
654              queued.
656              NOTE: New applications should use the inotify interface  (avail‐
657              able since kernel 2.6.13), which provides a much superior inter‐
658              face for obtaining notifications of filesystem events.  See ino‐
659              tify(7).
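       For code that must still use dnotify, the registration described
       above might look like the following sketch.  The helper names
       watch_directory() and next_notification_fd() are hypothetical.  The
       sketch blocks the chosen signal and collects notifications
       synchronously with sigtimedwait(2) rather than installing a handler;
       this is one possible design, not the only one.

```c
#define _GNU_SOURCE            /* F_NOTIFY, DN_*, F_SETSIG */
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: request multi-shot notification of file
   creation and deletion in the directory open on dirfd, delivered
   as signal sig (ideally a real-time signal such as SIGRTMIN).
   Returns 0 on success, -1 on error. */
int
watch_directory(int dirfd, int sig)
{
    sigset_t set;

    /* Block sig so that notifications are queued for later
       collection instead of triggering the default action. */
    sigemptyset(&set);
    sigaddset(&set, sig);
    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1)
        return -1;

    if (fcntl(dirfd, F_SETSIG, sig) == -1)
        return -1;

    if (fcntl(dirfd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: wait up to timeout_sec seconds for the next
   queued notification and return the si_fd field, which identifies
   the directory that changed.  Returns -1 on timeout or error. */
int
next_notification_fd(int sig, int timeout_sec)
{
    sigset_t set;
    siginfo_t info;
    struct timespec ts = { .tv_sec = timeout_sec, .tv_nsec = 0 };

    sigemptyset(&set);
    sigaddset(&set, sig);
    if (sigtimedwait(&set, &info, &ts) == -1)
        return -1;
    return info.si_fd;
}
```

       In a multithreaded program, pthread_sigmask(3) would be the
       appropriate call for blocking the signal.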
661   Changing the capacity of a pipe
662       F_SETPIPE_SZ (int; since Linux 2.6.35)
663              Change the capacity of the pipe referred to by fd to be at least
664              arg bytes.  An unprivileged process can adjust the pipe capacity
665              to  any value between the system page size and the limit defined
666              in /proc/sys/fs/pipe-max-size (see proc(5)).   Attempts  to  set
667              the pipe capacity below the page size are silently rounded up to
668              the page size.  Attempts by an unprivileged process to  set  the
669              pipe  capacity  above  the  limit  in /proc/sys/fs/pipe-max-size
670              yield the error EPERM; a privileged  process  (CAP_SYS_RESOURCE)
671              can override the limit.
673              When  allocating  the  buffer for the pipe, the kernel may use a
674              capacity larger than arg, if that is convenient for  the  imple‐
675              mentation.   (In  the  current implementation, the allocation is
676              the next higher power-of-two page-size multiple of the requested
677              size.)   The  actual capacity (in bytes) that is set is returned
678              as the function result.
680              Attempting to set the pipe capacity smaller than the  amount  of
681              buffer  space  currently  used  to store data produces the error
682              EBUSY.
684              Note that because of the way the pages of the  pipe  buffer  are
685              employed  when  data is written to the pipe, the number of bytes
686              that can be written may be less than the nominal size, depending
687              on the size of the writes.
689       F_GETPIPE_SZ (void; since Linux 2.6.35)
690              Return  (as  the  function  result)  the  capacity  of  the pipe
691              referred to by fd.
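       A short sketch of the two pipe-capacity operations follows.  The
       helper name resize_pipe() is hypothetical.  Because the kernel may
       round the request up, the sketch reads the capacity back rather than
       assuming that it equals the requested value.

```c
#define _GNU_SOURCE            /* F_SETPIPE_SZ, F_GETPIPE_SZ */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask for a pipe capacity of at least want
   bytes.  Returns the capacity reported by F_GETPIPE_SZ afterward,
   or -1 on error (for example, EPERM or EBUSY as described above). */
int
resize_pipe(int pipefd, int want)
{
    if (fcntl(pipefd, F_SETPIPE_SZ, want) == -1)
        return -1;

    return fcntl(pipefd, F_GETPIPE_SZ);
}
```

       Either end of the pipe can be passed as pipefd.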
693   File Sealing
694       File seals limit the set of allowed operations on a  given  file.   For
695       each seal that is set on a file, a specific set of operations will fail
696       with EPERM on this file from now on.  The file is said  to  be  sealed.
697       The default set of seals depends on the type of the underlying file and
698       filesystem.  For an overview of file sealing, a discussion of its  pur‐
699       pose, and some code examples, see memfd_create(2).
701       Currently, file seals can be applied only to a file descriptor returned
       by memfd_create(2) (if the MFD_ALLOW_SEALING flag was employed).  On other
703       filesystems,  all  fcntl() operations that operate on seals will return
704       EINVAL.
706       Seals are a property of an inode.   Thus,  all  open  file  descriptors
707       referring  to the same inode share the same set of seals.  Furthermore,
708       seals can never be removed, only added.
710       F_ADD_SEALS (int; since Linux 3.17)
711              Add the seals given in the bit-mask argument arg to the  set  of
712              seals of the inode referred to by the file descriptor fd.  Seals
713              cannot be removed again.  Once this call succeeds, the seals are
714              enforced by the kernel immediately.  If the current set of seals
715              includes  F_SEAL_SEAL  (see  below),  then  this  call  will  be
              rejected with EPERM.  Adding a seal that is already set is a
              no-op, provided that F_SEAL_SEAL is not already set.  In order to place a
718              seal, the file descriptor fd must be writable.
720       F_GET_SEALS (void; since Linux 3.17)
721              Return  (as the function result) the current set of seals of the
722              inode referred to by fd.  If no seals are set,  0  is  returned.
723              If  the  file does not support sealing, -1 is returned and errno
724              is set to EINVAL.
726       The following seals are available:
728       F_SEAL_SEAL
729              If  this  seal  is  set,  any  further  call  to  fcntl()   with
730              F_ADD_SEALS  fails  with  the error EPERM.  Therefore, this seal
731              prevents any modifications to the set of seals itself.   If  the
732              initial  set  of seals of a file includes F_SEAL_SEAL, then this
733              effectively causes the set of seals to be constant and locked.
735       F_SEAL_SHRINK
736              If this seal is set, the file in question cannot be  reduced  in
737              size.   This  affects  open(2)  with the O_TRUNC flag as well as
738              truncate(2) and ftruncate(2).  Those calls fail  with  EPERM  if
739              you  try  to  shrink  the file in question.  Increasing the file
740              size is still possible.
742       F_SEAL_GROW
743              If this seal is set, the size of the file in question cannot  be
744              increased.   This  affects  write(2) beyond the end of the file,
745              truncate(2), ftruncate(2), and fallocate(2).  These  calls  fail
746              with  EPERM  if  you use them to increase the file size.  If you
747              keep the size or shrink it, those calls still work as expected.
749       F_SEAL_WRITE
750              If this seal is set, you cannot modify the contents of the file.
751              Note  that  shrinking  or  growing the size of the file is still
752              possible and allowed.  Thus, this seal is normally used in  com‐
753              bination  with  one  of  the  other  seals.   This  seal affects
754              write(2) and fallocate(2) (only in  combination  with  the  FAL‐
755              LOC_FL_PUNCH_HOLE  flag).   Those  calls fail with EPERM if this
756              seal is set.  Furthermore, trying to create new shared, writable
757              memory-mappings via mmap(2) will also fail with EPERM.
759              Using  the  F_ADD_SEALS  operation  to set the F_SEAL_WRITE seal
760              fails with EBUSY if any writable, shared mapping  exists.   Such
761              mappings  must  be  unmapped before you can add this seal.  Fur‐
762              thermore, if there are any asynchronous I/O operations  (io_sub‐
763              mit(2)) pending on the file, all outstanding writes will be dis‐
764              carded.
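       The following sketch shows one way the sealing operations might be
       combined to produce an immutable in-memory file.  The helper name
       make_sealed_memfd() and the name string "sealed-demo" are
       hypothetical, and the sketch assumes a C library that provides a
       memfd_create() wrapper (glibc 2.27 or later).

```c
#define _GNU_SOURCE            /* memfd_create(), F_ADD_SEALS, F_SEAL_* */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a memfd holding len bytes copied from
   data, then seal it against resizing, writing, and further sealing.
   Returns the file descriptor, or -1 on error. */
int
make_sealed_memfd(const void *data, size_t len)
{
    int fd;

    fd = memfd_create("sealed-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd == -1)
        return -1;

    /* Fill the file before sealing; writes fail once
       F_SEAL_WRITE is in place. */
    if (write(fd, data, len) != (ssize_t) len) {
        close(fd);
        return -1;
    }

    /* F_SEAL_SEAL is added last (in the same call) so that the
       set of seals can no longer change. */
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW |
                               F_SEAL_WRITE | F_SEAL_SEAL) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

       A recipient of the descriptor can call F_GET_SEALS and check that
       the seals it relies on are present before trusting that the contents
       will not change.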
766   File read/write hints
767       Write lifetime hints can be used to inform the kernel about  the  rela‐
768       tive  expected  lifetime of writes on a given inode or via a particular
769       open file description.  (See open(2) for an explanation  of  open  file
770       descriptions.)   In  this  context, the term "write lifetime" means the
771       expected time the data will live on media, before being overwritten  or
772       erased.
774       An  application  may  use  the different hint values specified below to
775       separate writes into different write classes, so that multiple users or
776       applications  running  on a single storage back-end can aggregate their
777       I/O patterns in a consistent manner.  However, there are no  functional
778       semantics implied by these flags, and different I/O classes can use the
779       write lifetime hints in arbitrary ways, so long as the hints  are  used
780       consistently.
782       The following operations can be applied to the file descriptor, fd:
784       F_GET_RW_HINT (uint64_t *; since Linux 4.13)
785              Returns  the  value  of  the read/write hint associated with the
786              underlying inode referred to by fd.
788       F_SET_RW_HINT (uint64_t *; since Linux 4.13)
789              Sets the read/write hint value associated  with  the  underlying
790              inode  referred to by fd.  This hint persists until either it is
791              explicitly modified or the underlying filesystem is unmounted.
793       F_GET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
794              Returns the value of the read/write  hint  associated  with  the
795              open file description referred to by fd.
797       F_SET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
798              Sets  the  read/write  hint  value associated with the open file
799              description referred to by fd.
801       If an open file description has not been assigned  a  read/write  hint,
802       then it shall use the value assigned to the inode, if any.
804       The following read/write hints are valid since Linux 4.13:
       RWH_WRITE_LIFE_NOT_SET
              No specific hint has been set.  This is the default value.
       RWH_WRITE_LIFE_NONE
              No specific write lifetime is associated with this file or
              inode.
       RWH_WRITE_LIFE_SHORT
              Data written to this inode or via this open file description is
              expected to have a short lifetime.
       RWH_WRITE_LIFE_MEDIUM
              Data written to this inode or via this open file description is
              expected to have a lifetime longer than data written with
              RWH_WRITE_LIFE_SHORT.
       RWH_WRITE_LIFE_LONG
              Data written to this inode or via this open file description is
              expected to have a lifetime longer than data written with
              RWH_WRITE_LIFE_MEDIUM.
       RWH_WRITE_LIFE_EXTREME
              Data written to this inode or via this open file description is
              expected to have a lifetime longer than data written with
              RWH_WRITE_LIFE_LONG.
832       All  the  write-specific hints are relative to each other, and no indi‐
833       vidual absolute meaning should be attributed to them.
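       As an illustration, the inode-level hint operations might be used as
       in the sketch below.  The helper name mark_short_lived() is
       hypothetical.  The fallback definitions are an assumption for C
       library headers that predate these operations; the numeric values
       are those used by the Linux kernel headers.

```c
#define _GNU_SOURCE            /* F_GET_RW_HINT, F_SET_RW_HINT, RWH_* */
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Fallbacks for older C library headers (assumption: values taken
   from the kernel's <linux/fcntl.h>). */
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT 1035
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT 1036
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT 2
#endif

/* Hypothetical helper: tag the inode behind fd as holding short-lived
   data, then read the hint back.  Returns the hint value now
   associated with the inode, or -1 on error. */
int
mark_short_lived(int fd)
{
    uint64_t hint = RWH_WRITE_LIFE_SHORT;

    /* The argument is a pointer to a 64-bit value, not the value
       itself. */
    if (fcntl(fd, F_SET_RW_HINT, &hint) == -1)
        return -1;

    hint = 0;
    if (fcntl(fd, F_GET_RW_HINT, &hint) == -1)
        return -1;

    return (int) hint;
}
```

       Since the hints carry no functional semantics, a failure here can
       usually be ignored by the application.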

RETURN VALUE
836       For a successful call, the return value depends on the operation:
838       F_DUPFD  The new file descriptor.
840       F_GETFD  Value of file descriptor flags.
842       F_GETFL  Value of file status flags.
844       F_GETLEASE
845                Type of lease held on file descriptor.
847       F_GETOWN Value of file descriptor owner.
849       F_GETSIG Value of signal sent when read or write becomes  possible,  or
850                zero for traditional SIGIO behavior.
       F_GETPIPE_SZ, F_SETPIPE_SZ
                The pipe capacity.
855       F_GET_SEALS
856                A  bit  mask  identifying the seals that have been set for the
857                inode referred to by fd.
859       All other commands
860                Zero.
862       On error, -1 is returned, and errno is set appropriately.

ERRORS
865       EACCES or EAGAIN
866              Operation is prohibited by locks held by other processes.
868       EAGAIN The operation is prohibited because the file  has  been  memory-
869              mapped by another process.
       EBADF  fd is not an open file descriptor.
873       EBADF  cmd  is  F_SETLK  or  F_SETLKW and the file descriptor open mode
874              doesn't match with the type of lock requested.
876       EBUSY  cmd is F_SETPIPE_SZ and the new pipe capacity specified  in  arg
877              is  smaller  than  the  amount of buffer space currently used to
878              store data in the pipe.
880       EBUSY  cmd is F_ADD_SEALS, arg includes F_SEAL_WRITE, and there  exists
881              a writable, shared mapping on the file referred to by fd.
883       EDEADLK
884              It  was detected that the specified F_SETLKW command would cause
885              a deadlock.
887       EFAULT lock is outside your accessible address space.
889       EINTR  cmd is F_SETLKW or F_OFD_SETLKW and  the  operation  was  inter‐
890              rupted by a signal; see signal(7).
892       EINTR  cmd  is  F_GETLK,  F_SETLK, F_OFD_GETLK, or F_OFD_SETLK, and the
893              operation was interrupted  by  a  signal  before  the  lock  was
894              checked  or  acquired.   Most  likely when locking a remote file
895              (e.g., locking over NFS), but can sometimes happen locally.
897       EINVAL The value specified in cmd is not recognized by this kernel.
899       EINVAL cmd is F_ADD_SEALS and arg includes an unrecognized sealing bit.
901       EINVAL cmd is F_ADD_SEALS or F_GET_SEALS and the filesystem  containing
902              the inode referred to by fd does not support sealing.
904       EINVAL cmd  is F_DUPFD and arg is negative or is greater than the maxi‐
905              mum allowable value (see  the  discussion  of  RLIMIT_NOFILE  in
906              getrlimit(2)).
908       EINVAL cmd is F_SETSIG and arg is not an allowable signal number.
910       EINVAL cmd  is F_OFD_SETLK, F_OFD_SETLKW, or F_OFD_GETLK, and l_pid was
911              not specified as zero.
913       EMFILE cmd is F_DUPFD and the per-process limit on the number  of  open
914              file descriptors has been reached.
916       ENOLCK Too  many  segment  locks  open, lock table is full, or a remote
917              locking protocol failed (e.g., locking over NFS).
919       ENOTDIR
920              F_NOTIFY was specified in cmd, but fd does not refer to a direc‐
921              tory.
923       EPERM  cmd  is  F_SETPIPE_SZ  and  the soft or hard user pipe limit has
924              been reached; see pipe(7).
926       EPERM  Attempted to clear the O_APPEND flag on  a  file  that  has  the
927              append-only attribute set.
929       EPERM  cmd was F_ADD_SEALS, but fd was not open for writing or the cur‐
930              rent set of seals on the file already includes F_SEAL_SEAL.
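       The error list above implies that callers of F_SETLK need to
       distinguish a conflicting lock (EACCES or EAGAIN) from genuine
       failures such as EBADF.  A minimal sketch, using a hypothetical
       helper named try_write_lock():

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to place a write lock on the whole file
   without blocking.  Returns 1 if the lock was acquired, 0 if a
   conflicting lock is held by another process, and -1 on any other
   error (errno is left as set by fcntl()). */
int
try_write_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,             /* 0 means "through end of file" */
    };

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 1;

    /* Portable code checks both values. */
    if (errno == EACCES || errno == EAGAIN)
        return 0;

    return -1;                  /* e.g., EBADF, ENOLCK, EINTR */
}
```

       Passing a descriptor that was opened read-only produces the -1
       return with errno set to EBADF, because a write lock requires a
       descriptor open for writing.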

CONFORMING TO
933       SVr4, 4.3BSD, POSIX.1-2001.   Only  the  operations  F_DUPFD,  F_GETFD,
934       F_SETFD, F_GETFL, F_SETFL, F_GETLK, F_SETLK, and F_SETLKW are specified
935       in POSIX.1-2001.
937       F_GETOWN and F_SETOWN are specified in  POSIX.1-2001.   (To  get  their
938       definitions, define either _XOPEN_SOURCE with the value 500 or greater,
939       or _POSIX_C_SOURCE with the value 200809L or greater.)
941       F_DUPFD_CLOEXEC is specified in POSIX.1-2008.  (To get this definition,
942       define   _POSIX_C_SOURCE   with   the  value  200809L  or  greater,  or
943       _XOPEN_SOURCE with the value 700 or greater.)
       F_GETOWN_EX, F_SETOWN_EX, F_SETPIPE_SZ, F_GETPIPE_SZ, F_GETSIG, F_SETSIG,
       F_NOTIFY, F_GETLEASE, and F_SETLEASE are Linux-specific.  (Define
947       the _GNU_SOURCE macro to obtain these definitions.)
949       F_OFD_SETLK, F_OFD_SETLKW, and F_OFD_GETLK are Linux-specific (and  one
950       must define _GNU_SOURCE to obtain their definitions), but work is being
951       done to have them included in the next version of POSIX.1.
953       F_ADD_SEALS and F_GET_SEALS are Linux-specific.

NOTES
956       The errors returned by dup2(2) are different  from  those  returned  by
957       F_DUPFD.
959   File locking
960       The original Linux fcntl() system call was not designed to handle large
961       file offsets (in the flock structure).  Consequently, an fcntl64() sys‐
962       tem  call was added in Linux 2.4.  The newer system call employs a dif‐
963       ferent structure for file locking, flock64, and corresponding commands,
964       F_GETLK64,  F_SETLK64,  and  F_SETLKW64.  However, these details can be
965       ignored by applications using glibc,  whose  fcntl()  wrapper  function
966       transparently  employs  the  more recent system call where it is avail‐
967       able.
969   Record locks
970       Since kernel 2.0, there is no interaction between  the  types  of  lock
971       placed by flock(2) and fcntl().
973       Several  systems have more fields in struct flock such as, for example,
974       l_sysid (to identify the machine where the  lock  is  held).   Clearly,
975       l_pid  alone  is not going to be very useful if the process holding the
976       lock may live on a different machine; on Linux, while present  on  some
977       architectures (such as MIPS32), this field is not used.
988   Record locking and NFS
989       Before Linux 3.12, if an NFSv4 client loses contact with the server for
990       a  period  of  time (defined as more than 90 seconds with no communica‐
991       tion), it might lose and regain a lock without ever being aware of  the
992       fact.  (The period of time after which contact is assumed lost is known
993       as the NFSv4 leasetime.  On a Linux NFS server, this can be  determined
994       by  looking at /proc/fs/nfsd/nfsv4leasetime, which expresses the period
995       in seconds.  The default value for this file  is  90.)   This  scenario
996       potentially  risks data corruption, since another process might acquire
997       a lock in the intervening period and perform file I/O.
999       Since Linux 3.12, if an NFSv4 client loses contact with the server, any
1000       I/O  to  the file by a process which "thinks" it holds a lock will fail
1001       until that process closes and reopens the file.   A  kernel  parameter,
1002       nfs.recover_lost_locks,  can  be set to 1 to obtain the pre-3.12 behav‐
1003       ior, whereby the client will attempt to recover lost locks when contact
1004       is  reestablished  with  the  server.  Because of the attendant risk of
1005       data corruption, this parameter defaults to 0 (disabled).

BUGS
1008   F_SETFL
1009       It is not possible to use F_SETFL to change the state  of  the  O_DSYNC
1010       and  O_SYNC  flags.   Attempts  to  change the state of these flags are
1011       silently ignored.
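       Because the attempt is silently ignored rather than reported as an
       error, the only way to notice it is to read the flags back.  The
       following sketch uses a hypothetical helper named
       osync_after_setfl():

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to enable O_SYNC on an already open file
   descriptor using F_SETFL, then report whether the flag is actually
   set.  Returns 1 if O_SYNC is set afterward, 0 if it is not, and
   -1 on error. */
int
osync_after_setfl(int fd)
{
    int flags;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    /* This call succeeds, but the O_SYNC bits are not applied. */
    if (fcntl(fd, F_SETFL, flags | O_SYNC) == -1)
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    return (flags & O_SYNC) == O_SYNC;
}
```

       For a descriptor that was opened without O_SYNC, the helper is
       expected to return 0 on Linux.  A program that needs synchronized
       I/O must specify the flag when calling open(2).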
1013   F_GETOWN
1014       A limitation of the Linux system call conventions on some architectures
1015       (notably  i386)  means  that  if  a  (negative)  process group ID to be
1016       returned by F_GETOWN falls in the range -1 to -4095,  then  the  return
1017       value  is  wrongly interpreted by glibc as an error in the system call;
1018       that is, the return value of fcntl() will be -1, and errno will contain
1019       the (positive) process group ID.  The Linux-specific F_GETOWN_EX opera‐
1020       tion avoids this problem.  Since glibc version 2.11,  glibc  makes  the
1021       kernel  F_GETOWN  problem  invisible  by  implementing  F_GETOWN  using
1022       F_GETOWN_EX.
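       Code that cannot rely on a sufficiently recent glibc can call
       F_GETOWN_EX directly, since it returns the owner in a structure
       rather than in the function result.  A sketch, with a hypothetical
       helper named set_and_get_group_owner():

```c
#define _GNU_SOURCE            /* F_SETOWN_EX, F_GETOWN_EX, struct f_owner_ex */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make the process group pgrp the owner of fd,
   then read the owner back into *out.  Returns 0 on success, or -1
   on error.  The process group ID is passed and returned as a
   positive number in the pid field, so it cannot be mistaken for an
   error return. */
int
set_and_get_group_owner(int fd, pid_t pgrp, struct f_owner_ex *out)
{
    struct f_owner_ex in = { .type = F_OWNER_PGRP, .pid = pgrp };

    if (fcntl(fd, F_SETOWN_EX, &in) == -1)
        return -1;

    return fcntl(fd, F_GETOWN_EX, out);
}
```

       On success, out->type is F_OWNER_PGRP and out->pid holds the
       process group ID.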
1024   F_SETOWN
       In Linux 2.4 and earlier, there is a bug that can occur when an unprivileged
       process uses F_SETOWN to specify the owner of a socket file
1027       descriptor as a process (group) other than the caller.  In  this  case,
1028       fcntl()  can  return  -1  with  errno set to EPERM, even when the owner
1029       process (group) is one that the caller has permission to  send  signals
1030       to.   Despite  this error return, the file descriptor owner is set, and
1031       signals will be sent to the owner.
1033   Deadlock detection
1034       The deadlock-detection algorithm employed by the  kernel  when  dealing
1035       with  F_SETLKW  requests  can  yield  both false negatives (failures to
1036       detect deadlocks, leaving a set of deadlocked processes blocked indefi‐
1037       nitely) and false positives (EDEADLK errors when there is no deadlock).
1038       For example, the kernel limits the lock depth of its dependency  search
1039       to  10  steps,  meaning  that circular deadlock chains that exceed that
1040       size will not be detected.  In addition, the kernel may  falsely  indi‐
1041       cate  a  deadlock when two or more processes created using the clone(2)
1042       CLONE_FILES flag place locks that appear (to the kernel) to conflict.
1044   Mandatory locking
1045       The Linux implementation of mandatory locking is subject to race condi‐
1046       tions  which render it unreliable: a write(2) call that overlaps with a
1047       lock may modify data after the mandatory lock is  acquired;  a  read(2)
1048       call  that  overlaps  with  a lock may detect changes to data that were
1049       made only after a write lock was acquired.  Similar races exist between
1050       mandatory  locks  and  mmap(2).  It is therefore inadvisable to rely on
1051       mandatory locking.

SEE ALSO
1054       dup2(2), flock(2), open(2), socket(2), lockf(3), capabilities(7),  fea‐
1055       ture_test_macros(7), lslocks(8)
1057       locks.txt,  mandatory-locking.txt,  and dnotify.txt in the Linux kernel
1058       source directory Documentation/filesystems/ (on  older  kernels,  these
1059       files  are  directly under the Documentation/ directory, and mandatory-
1060       locking.txt is called mandatory.txt)

COLOPHON
1063       This page is part of release 5.04 of the Linux  man-pages  project.   A
1064       description  of  the project, information about reporting bugs, and the
1065       latest    version    of    this    page,    can     be     found     at
1066       https://www.kernel.org/doc/man-pages/.
1070Linux                             2019-03-06                          FCNTL(2)