FCNTL(2)                   Linux Programmer's Manual                  FCNTL(2)


NAME
       fcntl - manipulate file descriptor


SYNOPSIS
       #include <unistd.h>
       #include <fcntl.h>

       int fcntl(int fd, int cmd, ... /* arg */ );


DESCRIPTION
       fcntl() performs one of the operations described below on the open file
       descriptor fd.  The operation is determined by cmd.

       fcntl() can take an optional third argument.  Whether or not this argu‐
       ment is required is determined by cmd.  The required argument type is
       indicated in parentheses after each cmd name (in most cases, the
       required type is int, and we identify the argument using the name arg),
       or void is specified if the argument is not required.

       Certain of the operations below are supported only since a particular
       Linux kernel version.  The preferred method of checking whether the
       host kernel supports a particular operation is to invoke fcntl() with
       the desired cmd value and then test whether the call failed with EIN‐
       VAL, indicating that the kernel does not recognize this value.
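
       The probing approach described above can be sketched as follows.  This
       is a minimal illustration rather than a definitive implementation: the
       helper name fcntl_cmd_supported() is hypothetical, and the sketch
       assumes a command that takes no argument, because EINVAL can also be
       reported for an invalid argument.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: probe whether the running kernel recognizes cmd.
 * Only meaningful for commands whose argument is void (e.g., F_GETFD).
 * Returns 1 if the call succeeded, 0 if it failed with EINVAL,
 * and -1 on any other error (errno describes the failure).
 */
int fcntl_cmd_supported(int fd, int cmd)
{
    if (fcntl(fd, cmd) != -1)
        return 1;
    return (errno == EINVAL) ? 0 : -1;
}
```

       A caller would typically run such a probe once at startup and fall
       back to an older mechanism when the result is 0.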

   Duplicating a file descriptor
       F_DUPFD (int)
              Duplicate the file descriptor fd using the lowest-numbered
              available file descriptor greater than or equal to arg.  This is
              different from dup2(2), which uses exactly the file descriptor
              specified.

              On success, the new file descriptor is returned.

              See dup(2) for further details.

       F_DUPFD_CLOEXEC (int; since Linux 2.6.24)
              As for F_DUPFD, but additionally set the close-on-exec flag for
              the duplicate file descriptor.  Specifying this flag permits a
              program to avoid an additional fcntl() F_SETFD operation to set
              the FD_CLOEXEC flag.  For an explanation of why this flag is
              useful, see the description of O_CLOEXEC in open(2).

   File descriptor flags
       The following commands manipulate the flags associated with a file
       descriptor.  Currently, only one such flag is defined: FD_CLOEXEC, the
       close-on-exec flag.  If the FD_CLOEXEC bit is set, the file descriptor
       will automatically be closed during a successful execve(2).  (If the
       execve(2) fails, the file descriptor is left open.)  If the FD_CLOEXEC
       bit is not set, the file descriptor will remain open across an
       execve(2).

       F_GETFD (void)
              Return (as the function result) the file descriptor flags; arg
              is ignored.

       F_SETFD (int)
              Set the file descriptor flags to the value specified by arg.

       In multithreaded programs, using fcntl() F_SETFD to set the close-on-
       exec flag at the same time as another thread performs a fork(2) plus
       execve(2) is vulnerable to a race condition that may unintentionally
       leak the file descriptor to the program executed in the child process.
       See the discussion of the O_CLOEXEC flag in open(2) for details and a
       remedy to the problem.
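
       A common way to use F_GETFD and F_SETFD together is a read-modify-write
       sequence, sketched below.  The helper name set_cloexec() is hypotheti‐
       cal, and the sketch is subject to the multithreading race just
       described.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: turn the close-on-exec flag on (on != 0) or off
 * (on == 0) while preserving any other file descriptor flags.
 * Returns 0 on success, -1 on error.
 */
int set_cloexec(int fd, int on)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    if (on)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags);
}
```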

   File status flags
       Each open file description has certain associated status flags, ini‐
       tialized by open(2) and possibly modified by fcntl().  Duplicated file
       descriptors (made with dup(2), fcntl(F_DUPFD), fork(2), etc.) refer to
       the same open file description, and thus share the same file status
       flags.

       The file status flags and their semantics are described in open(2).

       F_GETFL (void)
              Return (as the function result) the file access mode and the
              file status flags; arg is ignored.

       F_SETFL (int)
              Set the file status flags to the value specified by arg.  File
              access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation flags
              (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in arg are ignored.
              On Linux, this command can change only the O_APPEND, O_ASYNC,
              O_DIRECT, O_NOATIME, and O_NONBLOCK flags.  It is not possible
              to change the O_DSYNC and O_SYNC flags; see BUGS, below.
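
       Because F_SETFL replaces the whole set of changeable status flags, a
       single flag is normally enabled by first fetching the current value
       with F_GETFL.  The following sketch does this for O_NONBLOCK; the
       helper name set_nonblocking() is hypothetical.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: enable O_NONBLOCK on the open file description
 * referred to by fd, leaving the other status flags as they were.
 * Returns 0 on success, -1 on error.
 */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

       Since the flag lives in the open file description, the change is also
       visible through every duplicate of fd.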

   Advisory record locking
       Linux implements traditional ("process-associated") UNIX record locks,
       as standardized by POSIX.  For a Linux-specific alternative with better
       semantics, see the discussion of open file description locks below.

       F_SETLK, F_SETLKW, and F_GETLK are used to acquire, release, and test
       for the existence of record locks (also known as byte-range, file-seg‐
       ment, or file-region locks).  The third argument, lock, is a pointer to
       a structure that has at least the following fields (in unspecified
       order).

           struct flock {
               ...
               short l_type;    /* Type of lock: F_RDLCK,
                                   F_WRLCK, F_UNLCK */
               short l_whence;  /* How to interpret l_start:
                                   SEEK_SET, SEEK_CUR, SEEK_END */
               off_t l_start;   /* Starting offset for lock */
               off_t l_len;     /* Number of bytes to lock */
               pid_t l_pid;     /* PID of process blocking our lock
                                   (set by F_GETLK and F_OFD_GETLK) */
               ...
           };

       The l_whence, l_start, and l_len fields of this structure specify the
       range of bytes we wish to lock.  Bytes past the end of the file may be
       locked, but not bytes before the start of the file.

       l_start is the starting offset for the lock, and is interpreted rela‐
       tive to either: the start of the file (if l_whence is SEEK_SET); the
       current file offset (if l_whence is SEEK_CUR); or the end of the file
       (if l_whence is SEEK_END).  In the final two cases, l_start can be a
       negative number provided the offset does not lie before the start of
       the file.

       l_len specifies the number of bytes to be locked.  If l_len is posi‐
       tive, then the range to be locked covers bytes l_start up to and
       including l_start+l_len-1.  Specifying 0 for l_len has the special
       meaning: lock all bytes starting at the location specified by l_whence
       and l_start through to the end of file, no matter how large the file
       grows.

       POSIX.1-2001 allows (but does not require) an implementation to support
       a negative l_len value; if l_len is negative, the interval described by
       lock covers bytes l_start+l_len up to and including l_start-1.  This is
       supported by Linux since kernel versions 2.4.21 and 2.5.49.

       The l_type field can be used to place a read (F_RDLCK) or a write
       (F_WRLCK) lock on a file.  Any number of processes may hold a read lock
       (shared lock) on a file region, but only one process may hold a write
       lock (exclusive lock).  An exclusive lock excludes all other locks,
       both shared and exclusive.  A single process can hold only one type of
       lock on a file region; if a new lock is applied to an already-locked
       region, then the existing lock is converted to the new lock type.
       (Such conversions may involve splitting, shrinking, or coalescing with
       an existing lock if the byte range specified by the new lock does not
       precisely coincide with the range of the existing lock.)

       F_SETLK (struct flock *)
              Acquire a lock (when l_type is F_RDLCK or F_WRLCK) or release a
              lock (when l_type is F_UNLCK) on the bytes specified by the
              l_whence, l_start, and l_len fields of lock.  If a conflicting
              lock is held by another process, this call returns -1 and sets
              errno to EACCES or EAGAIN.  (The error returned in this case
              differs across implementations, so POSIX requires a portable
              application to check for both errors.)
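
       The following sketch shows a nonblocking lock attempt that treats both
       EACCES and EAGAIN as "lock is busy", as recommended above.  The helper
       name try_lock_range() and its return convention are hypothetical; the
       range is always interpreted relative to the start of the file
       (SEEK_SET), which is a simplifying assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply a lock of the given type (F_RDLCK, F_WRLCK,
 * or F_UNLCK) to len bytes starting at offset start; len == 0 means
 * "through to the end of the file".
 * Returns 0 on success, 1 if another process holds a conflicting lock,
 * and -1 on any other error.
 */
int try_lock_range(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;
    if (errno == EACCES || errno == EAGAIN)
        return 1;               /* portable check for "lock is busy" */
    return -1;
}
```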

       F_SETLKW (struct flock *)
              As for F_SETLK, but if a conflicting lock is held on the file,
              then wait for that lock to be released.  If a signal is caught
              while waiting, then the call is interrupted and (after the sig‐
              nal handler has returned) returns immediately (with return value
              -1 and errno set to EINTR; see signal(7)).

       F_GETLK (struct flock *)
              On input to this call, lock describes a lock we would like to
              place on the file.  If the lock could be placed, fcntl() does
              not actually place it, but returns F_UNLCK in the l_type field
              of lock and leaves the other fields of the structure unchanged.

              If one or more incompatible locks would prevent this lock being
              placed, then fcntl() returns details about one of those locks in
              the l_type, l_whence, l_start, and l_len fields of lock.  If the
              conflicting lock is a traditional (process-associated) record
              lock, then the l_pid field is set to the PID of the process
              holding that lock.  If the conflicting lock is an open file
              description lock, then l_pid is set to -1.  Note that the
              returned information may already be out of date by the time the
              caller inspects it.
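
       A query with F_GETLK might be wrapped as in the sketch below.  The
       helper name lock_would_succeed() is hypothetical.  Note that locks held
       by the calling process itself never count as conflicts, so the helper
       reports only locks held by other processes or by open file description
       locks.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask whether a lock of the given type could be
 * placed on len bytes starting at offset start (len == 0 means "through
 * to the end of the file").
 * Returns 1 if it could, 0 if a conflicting lock exists (one such lock is
 * then described in *out, including out->l_pid), and -1 on error.
 */
int lock_would_succeed(int fd, short type, off_t start, off_t len,
                       struct flock *out)
{
    memset(out, 0, sizeof(*out));
    out->l_type = type;
    out->l_whence = SEEK_SET;
    out->l_start = start;
    out->l_len = len;

    if (fcntl(fd, F_GETLK, out) == -1)
        return -1;
    return out->l_type == F_UNLCK;
}
```

       The answer is only a snapshot; a subsequent F_SETLK can still fail if
       another process takes the lock in the meantime.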

       In order to place a read lock, fd must be open for reading.  In order
       to place a write lock, fd must be open for writing.  To place both
       types of lock, open a file read-write.

       When placing locks with F_SETLKW, the kernel detects deadlocks, whereby
       two or more processes have their lock requests mutually blocked by
       locks held by the other processes.  For example, suppose process A
       holds a write lock on byte 100 of a file, and process B holds a write
       lock on byte 200.  If each process then attempts to lock the byte
       already locked by the other process using F_SETLKW, then, without dead‐
       lock detection, both processes would remain blocked indefinitely.  When
       the kernel detects such deadlocks, it causes one of the blocking lock
       requests to immediately fail with the error EDEADLK; an application
       that encounters such an error should release some of its locks to allow
       other applications to proceed before attempting to regain the locks
       that it requires.  Circular deadlocks involving more than two processes
       are also detected.  Note, however, that there are limitations to the
       kernel's deadlock-detection algorithm; see BUGS.

       As well as being removed by an explicit F_UNLCK, record locks are auto‐
       matically released when the process terminates.

       Record locks are not inherited by a child created via fork(2), but are
       preserved across an execve(2).

       Because of the buffering performed by the stdio(3) library, the use of
       record locking with routines in that package should be avoided; use
       read(2) and write(2) instead.

       The record locks described above are associated with the process
       (unlike the open file description locks described below).  This has
       some unfortunate consequences:

       *  If a process closes any file descriptor referring to a file, then
          all of the process's locks on that file are released, regardless of
          the file descriptor(s) on which the locks were obtained.  This is
          bad: it means that a process can lose its locks on a file such as
          /etc/passwd or /etc/mtab when for some reason a library function
          decides to open, read, and close the same file.

       *  The threads in a process share locks.  In other words, a multi‐
          threaded program can't use record locking to ensure that threads
          don't simultaneously access the same region of a file.

       Open file description locks solve both of these problems.

   Open file description locks (non-POSIX)
       Open file description locks are advisory byte-range locks whose opera‐
       tion is in most respects identical to the traditional record locks
       described above.  This lock type is Linux-specific, and available since
       Linux 3.15.  (There is a proposal with the Austin Group to include this
       lock type in the next revision of POSIX.1.)  For an explanation of open
       file descriptions, see open(2).

       The principal difference between the two lock types is that whereas
       traditional record locks are associated with a process, open file
       description locks are associated with the open file description on
       which they are acquired, much like locks acquired with flock(2).  Con‐
       sequently (and unlike traditional advisory record locks), open file
       description locks are inherited across fork(2) (and clone(2) with
       CLONE_FILES), and are only automatically released on the last close of
       the open file description, instead of being released on any close of
       the file.

       Conflicting lock combinations (i.e., a read lock and a write lock or
       two write locks) where one lock is an open file description lock and
       the other is a traditional record lock conflict even when they are
       acquired by the same process on the same file descriptor.

       Open file description locks placed via the same open file description
       (i.e., via the same file descriptor, or via a duplicate of the file
       descriptor created by fork(2), dup(2), fcntl() F_DUPFD, and so on) are
       always compatible: if a new lock is placed on an already locked region,
       then the existing lock is converted to the new lock type.  (Such con‐
       versions may result in splitting, shrinking, or coalescing with an
       existing lock as discussed above.)

       On the other hand, open file description locks may conflict with each
       other when they are acquired via different open file descriptions.
       Thus, the threads in a multithreaded program can use open file descrip‐
       tion locks to synchronize access to a file region by having each thread
       perform its own open(2) on the file and applying locks via the result‐
       ing file descriptor.

       As with traditional advisory locks, the third argument to fcntl(),
       lock, is a pointer to an flock structure.  By contrast with traditional
       record locks, the l_pid field of that structure must be set to zero
       when using the commands described below.

       The commands for working with open file description locks are analogous
       to those used with traditional locks:

       F_OFD_SETLK (struct flock *)
              Acquire an open file description lock (when l_type is F_RDLCK or
              F_WRLCK) or release an open file description lock (when l_type
              is F_UNLCK) on the bytes specified by the l_whence, l_start, and
              l_len fields of lock.  If a conflicting lock is held by another
              process, this call returns -1 and sets errno to EAGAIN.

       F_OFD_SETLKW (struct flock *)
              As for F_OFD_SETLK, but if a conflicting lock is held on the
              file, then wait for that lock to be released.  If a signal is
              caught while waiting, then the call is interrupted and (after
              the signal handler has returned) returns immediately (with
              return value -1 and errno set to EINTR; see signal(7)).

       F_OFD_GETLK (struct flock *)
              On input to this call, lock describes an open file description
              lock we would like to place on the file.  If the lock could be
              placed, fcntl() does not actually place it, but returns F_UNLCK
              in the l_type field of lock and leaves the other fields of the
              structure unchanged.  If one or more incompatible locks would
              prevent this lock being placed, then details about one of these
              locks are returned via lock, as described above for F_GETLK.

       In the current implementation, no deadlock detection is performed for
       open file description locks.  (This contrasts with process-associated
       record locks, for which the kernel does perform deadlock detection.)
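
       The following sketch illustrates a nonblocking open file description
       write lock.  The helper name ofd_try_write_lock() is hypothetical, and
       the sketch assumes glibc, where the F_OFD_* constants are exposed by
       defining _GNU_SOURCE before including <fcntl.h>.  Zeroing the structure
       satisfies the requirement that l_pid be 0.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* assumption: glibc; exposes F_OFD_SETLK */
#endif
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to place an open file description write lock
 * on len bytes starting at offset start (len == 0 means "through to the
 * end of the file").
 * Returns 0 on success, 1 if a conflicting lock is held via another open
 * file description or by a traditional record lock, and -1 on error.
 */
int ofd_try_write_lock(int fd, off_t start, off_t len)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl)); /* also sets l_pid to 0, as required */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    if (fcntl(fd, F_OFD_SETLK, &fl) == 0)
        return 0;
    return (errno == EAGAIN) ? 1 : -1;
}
```

       If a process opens the same file twice and calls this helper on both
       descriptors, the second call reports a conflict, whereas repeating the
       call on the first descriptor succeeds.  This is the property that lets
       threads with separate open(2) calls exclude one another.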

   Mandatory locking
       Warning: the Linux implementation of mandatory locking is unreliable.
       See BUGS below.  Because of these bugs, and the fact that the feature
       is believed to be little used, since Linux 4.5, mandatory locking has
       been made an optional feature, governed by a configuration option (CON‐
       FIG_MANDATORY_FILE_LOCKING).  This is an initial step toward removing
       this feature completely.

       By default, both traditional (process-associated) and open file
       description record locks are advisory.  Advisory locks are not enforced
       and are useful only between cooperating processes.

       Both lock types can also be mandatory.  Mandatory locks are enforced
       for all processes.  If a process tries to perform an incompatible
       access (e.g., read(2) or write(2)) on a file region that has an incom‐
       patible mandatory lock, then the result depends upon whether the O_NON‐
       BLOCK flag is enabled for its open file description.  If the O_NONBLOCK
       flag is not enabled, then the system call is blocked until the lock is
       removed or converted to a mode that is compatible with the access.  If
       the O_NONBLOCK flag is enabled, then the system call fails with the
       error EAGAIN.

       To make use of mandatory locks, mandatory locking must be enabled both
       on the filesystem that contains the file to be locked, and on the file
       itself.  Mandatory locking is enabled on a filesystem using the "-o
       mand" option to mount(8), or the MS_MANDLOCK flag for mount(2).  Manda‐
       tory locking is enabled on a file by disabling group execute permission
       on the file and enabling the set-group-ID permission bit (see chmod(1)
       and chmod(2)).

       Mandatory locking is not specified by POSIX.  Some other systems also
       support mandatory locking, although the details of how to enable it
       vary across systems.

   Lost locks
       When an advisory lock is obtained on a networked filesystem such as NFS
       it is possible that the lock might get lost.  This may happen due to
       administrative action on the server, or due to a network partition
       (i.e., loss of network connectivity with the server) which lasts long
       enough for the server to assume that the client is no longer function‐
       ing.

       When the filesystem determines that a lock has been lost, future
       read(2) or write(2) requests may fail with the error EIO.  This error
       will persist until the lock is removed or the file descriptor is
       closed.  Since Linux 3.12, this happens at least for NFSv4 (including
       all minor versions).

       Some versions of UNIX send a signal (SIGLOST) in this circumstance.
       Linux does not define this signal, and does not provide any asynchro‐
       nous notification of lost locks.

   Managing signals
       F_GETOWN, F_SETOWN, F_GETOWN_EX, F_SETOWN_EX, F_GETSIG, and F_SETSIG are
       used to manage I/O availability signals:

       F_GETOWN (void)
              Return (as the function result) the process ID or process group
              currently receiving SIGIO and SIGURG signals for events on file
              descriptor fd.  Process IDs are returned as positive values;
              process group IDs are returned as negative values (but see BUGS
              below).  arg is ignored.

       F_SETOWN (int)
              Set the process ID or process group ID that will receive SIGIO
              and SIGURG signals for events on the file descriptor fd.  The
              target process or process group ID is specified in arg.  A
              process ID is specified as a positive value; a process group ID
              is specified as a negative value.  Most commonly, the calling
              process specifies itself as the owner (that is, arg is specified
              as getpid(2)).

              As well as setting the file descriptor owner, one must also
              enable generation of signals on the file descriptor.  This is
              done by using the fcntl() F_SETFL command to set the O_ASYNC
              file status flag on the file descriptor.  Subsequently, a SIGIO
              signal is sent whenever input or output becomes possible on the
              file descriptor.  The fcntl() F_SETSIG command can be used to
              obtain delivery of a signal other than SIGIO.

              Sending a signal to the owner process (group) specified by
              F_SETOWN is subject to the same permissions checks as are
              described for kill(2), where the sending process is the one that
              employs F_SETOWN (but see BUGS below).  If this permission check
              fails, then the signal is silently discarded.  Note: The
              F_SETOWN operation records the caller's credentials at the time
              of the fcntl() call, and it is these saved credentials that are
              used for the permission checks.

              If the file descriptor fd refers to a socket, F_SETOWN also
              selects the recipient of SIGURG signals that are delivered when
              out-of-band data arrives on that socket.  (SIGURG is sent in any
              situation where select(2) would report the socket as having an
              "exceptional condition".)

              The following was true in 2.6.x kernels up to and including ker‐
              nel 2.6.11:

                     If a nonzero value is given to F_SETSIG in a multi‐
                     threaded process running with a threading library that
                     supports thread groups (e.g., NPTL), then a positive
                     value given to F_SETOWN has a different meaning: instead
                     of being a process ID identifying a whole process, it is
                     a thread ID identifying a specific thread within a
                     process.  Consequently, it may be necessary to pass
                     F_SETOWN the result of gettid(2) instead of getpid(2) to
                     get sensible results when F_SETSIG is used.  (In current
                     Linux threading implementations, a main thread's thread
                     ID is the same as its process ID.  This means that a sin‐
                     gle-threaded program can equally use gettid(2) or get‐
                     pid(2) in this scenario.)  Note, however, that the state‐
                     ments in this paragraph do not apply to the SIGURG signal
                     generated for out-of-band data on a socket: this signal
                     is always sent to either a process or a process group,
                     depending on the value given to F_SETOWN.

              The above behavior was accidentally dropped in Linux 2.6.12, and
              won't be restored.  From Linux 2.6.32 onward, use F_SETOWN_EX to
              target SIGIO and SIGURG signals at a particular thread.

       F_GETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
              Return the current file descriptor owner settings as defined by
              a previous F_SETOWN_EX operation.  The information is returned
              in the structure pointed to by arg, which has the following
              form:

                  struct f_owner_ex {
                      int   type;
                      pid_t pid;
                  };

              The type field will have one of the values F_OWNER_TID,
              F_OWNER_PID, or F_OWNER_PGRP.  The pid field is a positive inte‐
              ger representing a thread ID, process ID, or process group ID.
              See F_SETOWN_EX for more details.

       F_SETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
              This operation performs a similar task to F_SETOWN.  It allows
              the caller to direct I/O availability signals to a specific
              thread, process, or process group.  The caller specifies the
              target of signals via arg, which is a pointer to a f_owner_ex
              structure.  The type field has one of the following values,
              which define how pid is interpreted:

              F_OWNER_TID
                     Send the signal to the thread whose thread ID (the value
                     returned by a call to clone(2) or gettid(2)) is specified
                     in pid.

              F_OWNER_PID
                     Send the signal to the process whose ID is specified in
                     pid.

              F_OWNER_PGRP
                     Send the signal to the process group whose ID is speci‐
                     fied in pid.  (Note that, unlike with F_SETOWN, a process
                     group ID is specified as a positive value here.)
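
       As a sketch of F_SETOWN_EX with F_OWNER_TID, the following helper
       directs I/O availability signals for fd to the calling thread.  The
       helper name is hypothetical.  The sketch assumes glibc with _GNU_SOURCE
       defined, and obtains the thread ID through syscall(2) because older C
       libraries provide no gettid() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* assumption: glibc; exposes f_owner_ex */
#endif
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: make the calling thread the target of I/O
 * availability signals for fd.
 * Returns 0 on success, -1 on error.
 */
int direct_io_signals_to_this_thread(int fd)
{
    struct f_owner_ex owner;

    owner.type = F_OWNER_TID;
    owner.pid = (pid_t) syscall(SYS_gettid);
    return fcntl(fd, F_SETOWN_EX, &owner);
}
```

       The setting can be read back with F_GETOWN_EX, which fills in a struct
       f_owner_ex of the same form.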

       F_GETSIG (void)
              Return (as the function result) the signal sent when input or
              output becomes possible.  A value of zero means SIGIO is sent.
              Any other value (including SIGIO) is the signal sent instead,
              and in this case additional info is available to the signal han‐
              dler if installed with SA_SIGINFO.  arg is ignored.

       F_SETSIG (int)
              Set the signal sent when input or output becomes possible to the
              value given in arg.  A value of zero means to send the default
              SIGIO signal.  Any other value (including SIGIO) is the signal
              to send instead, and in this case additional info is available
              to the signal handler if installed with SA_SIGINFO.

              By using F_SETSIG with a nonzero value, and setting SA_SIGINFO
              for the signal handler (see sigaction(2)), extra information
              about I/O events is passed to the handler in a siginfo_t struc‐
              ture.  If the si_code field indicates the source is SI_SIGIO,
              the si_fd field gives the file descriptor associated with the
              event.  Otherwise, there is no indication which file descriptors
              are pending, and you should use the usual mechanisms (select(2),
              poll(2), read(2) with O_NONBLOCK set, etc.) to determine which
              file descriptors are available for I/O.

              Note that the file descriptor provided in si_fd is the one that
              was specified during the F_SETSIG operation.  This can lead to
              an unusual corner case.  If the file descriptor is duplicated
              (dup(2) or similar), and the original file descriptor is closed,
              then I/O events will continue to be generated, but the si_fd
              field will contain the number of the now closed file descriptor.

              By selecting a real-time signal (value >= SIGRTMIN), multiple
              I/O events may be queued using the same signal numbers.  (Queu‐
              ing is dependent on available memory.)  Extra information is
              available if SA_SIGINFO is set for the signal handler, as above.

              Note that Linux imposes a limit on the number of real-time sig‐
              nals that may be queued to a process (see getrlimit(2) and sig‐
              nal(7)) and if this limit is reached, then the kernel reverts to
              delivering SIGIO, and this signal is delivered to the entire
              process rather than to a specific thread.
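
       The combination of F_SETSIG, SA_SIGINFO, and a real-time signal might
       be set up as in the sketch below.  All helper names are hypothetical,
       the choice of SIGRTMIN is an assumption made for illustration, and
       glibc with _GNU_SOURCE is assumed for F_SETSIG.  The handler only
       records si_fd, since little else is safe to do in a signal handler.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* assumption: glibc; exposes F_SETSIG */
#endif
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor reported by the most recent I/O signal, or -1 if none yet. */
static volatile sig_atomic_t last_ready_fd = -1;

static void io_handler(int sig, siginfo_t *si, void *ucontext)
{
    (void) sig;
    (void) ucontext;
    if (si->si_code == SI_SIGIO)    /* si_fd is valid only in this case */
        last_ready_fd = si->si_fd;
}

/* Hypothetical accessor for the value recorded by the handler. */
int last_signalled_fd(void)
{
    return last_ready_fd;
}

/*
 * Hypothetical helper: deliver SIGRTMIN (with siginfo) to this process
 * whenever I/O becomes possible on fd.
 * Returns 0 on success, -1 on error.
 */
int watch_fd_with_rt_signal(int fd)
{
    struct sigaction sa;
    int flags;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = io_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGRTMIN, &sa, NULL) == -1)
        return -1;

    if (fcntl(fd, F_SETSIG, SIGRTMIN) == -1)
        return -1;
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC);
}
```

       A robust program would additionally handle plain SIGIO, which the ker‐
       nel falls back to when the real-time signal queue is full.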

       Using these mechanisms, a program can implement fully asynchronous I/O
       without using select(2) or poll(2) most of the time.

       The use of O_ASYNC is specific to BSD and Linux.  The only use of
       F_GETOWN and F_SETOWN specified in POSIX.1 is in conjunction with the
       use of the SIGURG signal on sockets.  (POSIX does not specify the SIGIO
       signal.)  F_GETOWN_EX, F_SETOWN_EX, F_GETSIG, and F_SETSIG are Linux-
       specific.  POSIX has asynchronous I/O and the aio_sigevent structure to
       achieve similar things; these are also available in Linux as part of
       the GNU C Library (Glibc).
512   Leases
513       F_SETLEASE and F_GETLEASE (Linux 2.4 onward) are used (respectively) to
514       establish a new lease, and retrieve the current lease, on the open file
515       description  referred  to by the file descriptor fd.  A file lease pro‐
516       vides a mechanism whereby the process holding  the  lease  (the  "lease
517       holder")  is  notified  (via  delivery of a signal) when a process (the
518       "lease breaker") tries to open(2) or truncate(2) the file  referred  to
519       by that file descriptor.
521       F_SETLEASE (int)
522              Set  or  remove a file lease according to which of the following
523              values is specified in the integer arg:
525              F_RDLCK
526                     Take out a read  lease.   This  will  cause  the  calling
527                     process  to be notified when the file is opened for writ‐
528                     ing or is truncated.  A read lease can be placed only  on
529                     a file descriptor that is opened read-only.
531              F_WRLCK
532                     Take out a write lease.  This will cause the caller to be
533                     notified when the file is opened for reading  or  writing
534                     or  is  truncated.  A write lease may be placed on a file
535                     only if there are no other open file descriptors for  the
536                     file.
538              F_UNLCK
539                     Remove our lease from the file.
541       Leases  are  associated  with  an  open file description (see open(2)).
542       This means that duplicate file descriptors (created  by,  for  example,
543       fork(2) or dup(2)) refer to the same lease, and this lease may be modi‐
544       fied or released using any  of  these  descriptors.   Furthermore,  the
545       lease  is  released  by  either an explicit F_UNLCK operation on any of
546       these duplicate file descriptors, or when  all  such  file  descriptors
547       have been closed.
549       Leases may be taken out only on regular files.  An unprivileged process
550       may take out a lease only on a  file  whose  UID  (owner)  matches  the
551       filesystem UID of the process.  A process with the CAP_LEASE capability
552       may take out leases on arbitrary files.
554       F_GETLEASE (void)
555              Indicates what  type  of  lease  is  associated  with  the  file
556              descriptor  fd by returning either F_RDLCK, F_WRLCK, or F_UNLCK,
557              indicating, respectively, a read lease, a  write  lease,  or  no
558              lease.  arg is ignored.
560       When a process (the "lease breaker") performs an open(2) or truncate(2)
561       that conflicts with a lease established via F_SETLEASE, the system call
562       is  blocked  by  the kernel and the kernel notifies the lease holder by
563       sending it a signal  (SIGIO  by  default).   The  lease  holder  should
564       respond to receipt of this signal by doing whatever cleanup is required
565       in preparation for the file to be accessed by  another  process  (e.g.,
566       flushing cached buffers) and then either remove or downgrade its lease.
567       A lease is removed by performing an F_SETLEASE command  specifying  arg
568       as  F_UNLCK.   If the lease holder currently holds a write lease on the
569       file, and the lease breaker is opening the file for reading, then it is
570       sufficient for the lease holder to downgrade the lease to a read lease.
571       This is done by performing an  F_SETLEASE  command  specifying  arg  as
572       F_RDLCK.
574       If  the  lease holder fails to downgrade or remove the lease within the
575       number of seconds specified in /proc/sys/fs/lease-break-time, then  the
576       kernel forcibly removes or downgrades the lease holder's lease.
578       Once  a  lease  break has been initiated, F_GETLEASE returns the target
579       lease type (either F_RDLCK or F_UNLCK, depending on what would be  com‐
580       patible  with  the  lease  breaker)  until the lease holder voluntarily
581       downgrades or removes the lease or the kernel forcibly  does  so  after
582       the lease break timer expires.
584       Once  the lease has been voluntarily or forcibly removed or downgraded,
585       and assuming the lease breaker has not unblocked its system  call,  the
586       kernel permits the lease breaker's system call to proceed.
588       If the lease breaker's blocked open(2) or truncate(2) is interrupted by
589       a signal handler, then the system call fails with the error EINTR,  but
590       the  other  steps still occur as described above.  If the lease breaker
591       is killed by a signal while blocked in open(2) or truncate(2), then the
592       other steps still occur as described above.  If the lease breaker spec‐
593       ifies the O_NONBLOCK flag when calling open(2), then the  call  immedi‐
594       ately fails with the error EWOULDBLOCK, but the other steps still occur
595       as described above.
597       The default signal used to notify the lease holder is SIGIO,  but  this
598       can  be  changed  using the F_SETSIG command to fcntl().  If an F_SETSIG
599       command is performed (even one specifying SIGIO), and the  signal  han‐
600       dler  is  established using SA_SIGINFO, then the handler will receive a
601       siginfo_t structure as its second argument, and the si_fd field of this
602       argument will hold the file descriptor of the leased file that has been
603       accessed by another process.  (This  is  useful  if  the  caller  holds
604       leases against multiple files.)
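
       The lease protocol described above can be sketched roughly as
       follows.  This is a minimal illustration, not a reference
       implementation: the names take_read_lease(), release_lease(),
       last_broken_fd(), and on_lease_break() are hypothetical, and the
       choice of signal is left to the caller.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* F_SETLEASE, F_GETLEASE, F_SETSIG */
#endif
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor reported by the most recent lease-break signal, or -1. */
static volatile sig_atomic_t broken_fd = -1;

/* Runs when another process opens or truncates a leased file. */
static void on_lease_break(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    broken_fd = si->si_fd;      /* which leased file was touched */
}

/*
 * Take a read lease on fd (which must be open read-only) and arrange
 * for lease breaks to be delivered as signal sig with si_fd filled in.
 * Returns 0 on success, -1 with errno set on failure.
 */
int take_read_lease(int fd, int sig)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_lease_break;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, NULL) == -1)
        return -1;

    /* An explicit F_SETSIG is what makes si_fd available. */
    if (fcntl(fd, F_SETSIG, sig) == -1)
        return -1;

    return fcntl(fd, F_SETLEASE, F_RDLCK);
}

/* Descriptor named by the last break signal, or -1 if none arrived. */
int last_broken_fd(void)
{
    return broken_fd;
}

/*
 * Give the lease up once cleanup is done, so that the blocked
 * open(2) or truncate(2) of the lease breaker can proceed.
 */
int release_lease(int fd)
{
    return fcntl(fd, F_SETLEASE, F_UNLCK);
}
```

       A caller would typically poll last_broken_fd() from its main
       loop, flush whatever it has cached for that file, and then call
       release_lease() before the lease break timer expires.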
606   File and directory change notification (dnotify)
607       F_NOTIFY (int)
608              (Linux  2.4  onward)  Provide  notification  when  the directory
609              referred to by fd or any  of  the  files  that  it  contains  is
610              changed.   The events to be notified are specified in arg, which
611              is a bit mask specified by ORing together zero or  more  of  the
612              following bits:
614              DN_ACCESS   A  file  was  accessed (read(2), pread(2), readv(2),
615                          and similar).
616              DN_MODIFY   A file was modified (write(2), pwrite(2), writev(2),
617                          truncate(2), ftruncate(2), and similar).
618              DN_CREATE   A  file  was  created  (open(2), creat(2), mknod(2),
619                          mkdir(2), link(2), symlink(2), rename(2)  into  this
620                          directory).
621              DN_DELETE   A file was unlinked (unlink(2), rename(2) to another
622                          directory, rmdir(2)).
623              DN_RENAME   A   file   was   renamed   within   this   directory
624                          (rename(2)).
625              DN_ATTRIB   The  attributes  of  a  file were changed (chown(2),
626                          chmod(2), utime(2), utimensat(2), and similar).
628              (In order to obtain these definitions, the  _GNU_SOURCE  feature
629              test macro must be defined before including any header files.)
631              Directory  notifications are normally "one-shot", and the appli‐
632              cation must reregister to receive further notifications.  Alter‐
633              natively,  if DN_MULTISHOT is included in arg, then notification
634              will remain in effect until explicitly removed.
636              A series of F_NOTIFY requests is cumulative, with the events  in
637              arg  being added to the set already monitored.  To disable noti‐
638              fication of all events, make an F_NOTIFY call specifying arg  as
639              0.
641              Notification  occurs via delivery of a signal.  The default sig‐
642              nal is SIGIO, but this can be changed using the F_SETSIG command
643              to  fcntl().  (Note that SIGIO is one of the nonqueuing standard
644              signals; switching to the use of a real-time signal  means  that
645              multiple  notifications  can  be queued to the process.)  In the
646              latter case, the signal handler receives a  siginfo_t  structure
647              as  its  second  argument  (if the handler was established using
648              SA_SIGINFO) and the si_fd field of this structure  contains  the
649              file  descriptor  which  generated the notification (useful when
650              establishing notification on multiple directories).
652              Especially when using DN_MULTISHOT, a real-time signal should be
653              used  for  notification,  so  that multiple notifications can be
654              queued.
656              NOTE: New applications should use the inotify interface  (avail‐
657              able since kernel 2.6.13), which provides a much superior inter‐
658              face for obtaining notifications of filesystem events.  See ino‐
659              tify(7).
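
       For code that still has to use dnotify, registration might look
       like the sketch below.  The function names and the choice of
       SIGRTMIN are assumptions made for illustration; the handler only
       records the event, on the assumption that the real work happens
       outside signal context.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* F_NOTIFY, DN_*, F_SETSIG */
#endif
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t notify_count = 0;  /* events seen so far */
static volatile sig_atomic_t notify_fd = -1;    /* directory of last event */

static void on_dir_event(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    notify_fd = si->si_fd;      /* directory descriptor that fired */
    notify_count++;
}

/*
 * Watch the directory open on dfd for file creation and deletion.
 * Returns 0 on success, -1 with errno set on failure.
 */
int watch_directory(int dfd)
{
    struct sigaction sa;
    int sig = SIGRTMIN;         /* real-time signal, so events can queue */

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_dir_event;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, NULL) == -1)
        return -1;
    if (fcntl(dfd, F_SETSIG, sig) == -1)
        return -1;

    /* DN_MULTISHOT keeps the registration alive after the first event. */
    return fcntl(dfd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT);
}

/* An arg of 0 disables notification of all events. */
int unwatch_directory(int dfd)
{
    return fcntl(dfd, F_NOTIFY, 0);
}

/* Number of notifications delivered since the program started. */
int events_seen(void)
{
    return notify_count;
}
```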
661   Changing the capacity of a pipe
662       F_SETPIPE_SZ (int; since Linux 2.6.35)
663              Change the capacity of the pipe referred to by fd to be at least
664              arg bytes.  An unprivileged process can adjust the pipe capacity
665              to  any value between the system page size and the limit defined
666              in /proc/sys/fs/pipe-max-size (see proc(5)).   Attempts  to  set
667              the pipe capacity below the page size are silently rounded up to
668              the page size.  Attempts by an unprivileged process to  set  the
669              pipe  capacity  above  the  limit  in /proc/sys/fs/pipe-max-size
670              yield the error EPERM; a privileged  process  (CAP_SYS_RESOURCE)
671              can override the limit.
673              When  allocating  the  buffer for the pipe, the kernel may use a
674              capacity larger than arg, if that is convenient for  the  imple‐
675              mentation.   (In  the  current implementation, the allocation is
676              the next higher power-of-two page-size multiple of the requested
677              size.)   The  actual capacity (in bytes) that is set is returned
678              as the function result.
680              Attempting to set the pipe capacity smaller than the  amount  of
681              buffer  space  currently  used  to store data produces the error
682              EBUSY.
684       F_GETPIPE_SZ (void; since Linux 2.6.35)
685              Return (as  the  function  result)  the  capacity  of  the  pipe
686              referred to by fd.
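
       Because the kernel may round the request up, a caller that cares
       about the real capacity should use the value returned rather
       than the value requested.  A small sketch, with the hypothetical
       helper name set_pipe_capacity():

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* F_SETPIPE_SZ, F_GETPIPE_SZ */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Ask for a pipe capacity of at least want bytes.
 * Returns the capacity now in effect, or -1 with errno set
 * (for example EPERM above the limit, or EBUSY if the pipe
 * currently holds more data than want).
 */
int set_pipe_capacity(int fd, int want)
{
    if (fcntl(fd, F_SETPIPE_SZ, want) == -1)
        return -1;

    /* Read the value back instead of assuming want was used as is. */
    return fcntl(fd, F_GETPIPE_SZ);
}
```

       Either end of the pipe can be passed as fd; both descriptors
       refer to the same buffer.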
688   File Sealing
689       File  seals  limit  the set of allowed operations on a given file.  For
690       each seal that is set on a file, a specific set of operations will fail
691       with  EPERM  on  this file from now on.  The file is said to be sealed.
692       The default set of seals depends on the type of the underlying file and
693       filesystem.   For an overview of file sealing, a discussion of its pur‐
694       pose, and some code examples, see memfd_create(2).
696       Currently, file seals can be applied only to a file descriptor returned
697       by  memfd_create(2)  (if the MFD_ALLOW_SEALING flag was employed).  On other
698       filesystems, all fcntl() operations that operate on seals  will  return
699       EINVAL.
701       Seals  are  a  property  of  an inode.  Thus, all open file descriptors
702       referring to the same inode share the same set of seals.   Furthermore,
703       seals can never be removed, only added.
705       F_ADD_SEALS (int; since Linux 3.17)
706              Add  the  seals given in the bit-mask argument arg to the set of
707              seals of the inode referred to by the file descriptor fd.  Seals
708              cannot be removed again.  Once this call succeeds, the seals are
709              enforced by the kernel immediately.  If the current set of seals
710              includes  F_SEAL_SEAL  (see  below),  then  this  call  will  be
711              rejected with EPERM.  Adding a seal that is already set is  a
712              no-op, provided that F_SEAL_SEAL is not already set.  In order to place a
713              seal, the file descriptor fd must be writable.
715       F_GET_SEALS (void; since Linux 3.17)
716              Return (as the function result) the current set of seals of  the
717              inode  referred  to  by fd.  If no seals are set, 0 is returned.
718              If the file does not support sealing, -1 is returned  and  errno
719              is set to EINVAL.
721       The following seals are available:
723       F_SEAL_SEAL
724              If   this  seal  is  set,  any  further  call  to  fcntl()  with
725              F_ADD_SEALS fails with the error EPERM.   Therefore,  this  seal
726              prevents  any  modifications to the set of seals itself.  If the
727              initial set of seals of a file includes F_SEAL_SEAL,  then  this
728              effectively causes the set of seals to be constant and locked.
730       F_SEAL_SHRINK
731              If  this  seal is set, the file in question cannot be reduced in
732              size.  This affects open(2) with the O_TRUNC  flag  as  well  as
733              truncate(2)  and  ftruncate(2).   Those calls fail with EPERM if
734              you try to shrink the file in  question.   Increasing  the  file
735              size is still possible.
737       F_SEAL_GROW
738              If  this seal is set, the size of the file in question cannot be
739              increased.  This affects write(2) beyond the end  of  the  file,
740              truncate(2),  ftruncate(2),  and fallocate(2).  These calls fail
741              with EPERM if you use them to increase the file  size.   If  you
742              keep the size or shrink it, those calls still work as expected.
744       F_SEAL_WRITE
745              If this seal is set, you cannot modify the contents of the file.
746              Note that shrinking or growing the size of  the  file  is  still
747              possible  and allowed.  Thus, this seal is normally used in com‐
748              bination with  one  of  the  other  seals.   This  seal  affects
749              write(2)  and  fallocate(2)  (only  in combination with the FAL‐
750              LOC_FL_PUNCH_HOLE flag).  Those calls fail with  EPERM  if  this
751              seal is set.  Furthermore, trying to create new shared, writable
752              memory-mappings via mmap(2) will also fail with EPERM.
754              Using the F_ADD_SEALS operation to  set  the  F_SEAL_WRITE  seal
755              fails  with  EBUSY if any writable, shared mapping exists.  Such
756              mappings must be unmapped before you can add  this  seal.   Fur‐
757              thermore,  if there are any asynchronous I/O operations (io_sub‐
758              mit(2)) pending on the file, all outstanding writes will be dis‐
759              carded.
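
       As an illustration of how the seals combine, the sketch below
       builds an immutable in-memory buffer.  It assumes a C library
       that provides the memfd_create() wrapper and the F_SEAL_*
       definitions (glibc 2.27 or later); the helper name
       make_sealed_copy() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* memfd_create(), F_ADD_SEALS, F_SEAL_* */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create an anonymous file holding a copy of buf, then seal it so
 * that neither its size nor its contents can change afterwards.
 * Returns a file descriptor, or -1 with errno set.
 */
int make_sealed_copy(const void *buf, size_t len)
{
    int fd = memfd_create("sealed-copy", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd == -1)
        return -1;

    /* F_SEAL_SEAL goes in last so the set itself becomes constant. */
    if (write(fd, buf, len) != (ssize_t)len ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW |
                               F_SEAL_WRITE | F_SEAL_SEAL) == -1) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

       The returned descriptor can be handed to another process, which
       can check the result of F_GET_SEALS before trusting that the
       contents will not change underneath it.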
761   File read/write hints
762       Write  lifetime  hints can be used to inform the kernel about the rela‐
763       tive expected lifetime of writes on a given inode or via  a  particular
764       open  file  description.   (See open(2) for an explanation of open file
765       descriptions.)  In this context, the term "write  lifetime"  means  the
766       expected  time the data will live on media, before being overwritten or
767       erased.
769       An application may use the different hint  values  specified  below  to
770       separate writes into different write classes, so that multiple users or
771       applications running on a single storage back-end can  aggregate  their
772       I/O  patterns in a consistent manner.  However, there are no functional
773       semantics implied by these flags, and different I/O classes can use the
774       write  lifetime  hints in arbitrary ways, so long as the hints are used
775       consistently.
777       The following operations can be applied to the file descriptor, fd:
779       F_GET_RW_HINT (uint64_t *; since Linux 4.13)
780              Returns the value of the read/write  hint  associated  with  the
781              underlying inode referred to by fd.
783       F_SET_RW_HINT (uint64_t *; since Linux 4.13)
784              Sets  the  read/write  hint value associated with the underlying
785              inode referred to by fd.  This hint persists until either it  is
786              explicitly modified or the underlying filesystem is unmounted.
788       F_GET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
789              Returns  the  value  of  the read/write hint associated with the
790              open file description referred to by fd.
792       F_SET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
793              Sets the read/write hint value associated  with  the  open  file
794              description referred to by fd.
796       If  an  open  file description has not been assigned a read/write hint,
797       then it shall use the value assigned to the inode, if any.
799       The following read/write hints are valid since Linux 4.13:
801       RWH_WRITE_LIFE_NOT_SET
802              No specific hint has been set.  This is the default value.
804       RWH_WRITE_LIFE_NONE
805              No specific write lifetime  is  associated  with  this  file  or
806              inode.
808       RWH_WRITE_LIFE_SHORT
809              Data  written to this inode or via this open file description is
810              expected to have a short lifetime.
812       RWH_WRITE_LIFE_MEDIUM
813              Data written to this inode or via this open file description  is
814              expected  to  have  a  lifetime  longer  than  data written with
815              RWH_WRITE_LIFE_SHORT.
817       RWH_WRITE_LIFE_LONG
818              Data written to this inode or via this open file description  is
819              expected  to  have  a  lifetime  longer  than  data written with
820              RWH_WRITE_LIFE_MEDIUM.
822       RWH_WRITE_LIFE_EXTREME
823              Data written to this inode or via this open file description  is
824              expected  to  have  a  lifetime  longer  than  data written with
825              RWH_WRITE_LIFE_LONG.
827       All the write-specific hints are relative to each other, and  no  indi‐
828       vidual absolute meaning should be attributed to them.
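
       Note that the argument to these operations is a pointer to a
       uint64_t, not the hint value itself.  A brief sketch, assuming
       headers that define F_SET_RW_HINT and the RWH_* constants (glibc
       2.27 or later); the helper names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* F_SET_RW_HINT, F_GET_RW_HINT, RWH_* */
#endif
#include <fcntl.h>
#include <stdint.h>

/*
 * Tag the inode behind fd with a write lifetime hint such as
 * RWH_WRITE_LIFE_SHORT.  Returns 0 on success, -1 with errno set.
 */
int set_write_hint(int fd, uint64_t hint)
{
    /* Pass the address: the kernel reads the value through it. */
    return fcntl(fd, F_SET_RW_HINT, &hint);
}

/* Fetch the inode's current hint into *hint.  Returns 0 or -1. */
int get_write_hint(int fd, uint64_t *hint)
{
    return fcntl(fd, F_GET_RW_HINT, hint);
}
```

       A log writer might, for instance, call set_write_hint(fd,
       RWH_WRITE_LIFE_SHORT) on files it expects to rotate soon; on a
       kernel without support the call should simply fail and can be
       ignored.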


831       For a successful call, the return value depends on the operation:
833       F_DUPFD  The new file descriptor.
835       F_GETFD  Value of file descriptor flags.
837       F_GETFL  Value of file status flags.
839       F_GETLEASE
840                Type of lease held on file descriptor.
842       F_GETOWN Value of file descriptor owner.
844       F_GETSIG Value  of  signal sent when read or write becomes possible, or
845                zero for traditional SIGIO behavior.
847       F_GETPIPE_SZ, F_SETPIPE_SZ
848                The pipe capacity.
850       F_GET_SEALS
851                A bit mask identifying the seals that have been  set  for  the
852                inode referred to by fd.
854       All other commands
855                Zero.
857       On error, -1 is returned, and errno is set appropriately.
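
       The difference between operations that return a value and those
       that return zero shows up in the common read-modify-write idiom
       for file status flags.  A minimal sketch; set_nonblocking() is a
       hypothetical helper name.

```c
#include <fcntl.h>

/*
 * Turn O_NONBLOCK on for fd while preserving the other status flags.
 * Returns 0 on success, -1 with errno set by fcntl().
 */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);     /* success: the flag word */

    if (flags == -1)
        return -1;                      /* for example, EBADF */

    /* F_SETFL falls under "all other commands": success is zero. */
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```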


860       EACCES or EAGAIN
861              Operation is prohibited by locks held by other processes.
863       EAGAIN The  operation  is  prohibited because the file has been memory-
864              mapped by another process.
866       EBADF  fd is not an open file descriptor.
868       EBADF  cmd is F_SETLK or F_SETLKW and the  file  descriptor  open  mode
869              doesn't match with the type of lock requested.
871       EBUSY  cmd  is  F_SETPIPE_SZ and the new pipe capacity specified in arg
872              is smaller than the amount of buffer  space  currently  used  to
873              store data in the pipe.
875       EBUSY  cmd  is F_ADD_SEALS, arg includes F_SEAL_WRITE, and there exists
876              a writable, shared mapping on the file referred to by fd.
878       EDEADLK
879              It was detected that the specified F_SETLKW command would  cause
880              a deadlock.
882       EFAULT lock is outside your accessible address space.
884       EINTR  cmd  is  F_SETLKW  or  F_OFD_SETLKW and the operation was inter‐
885              rupted by a signal; see signal(7).
887       EINTR  cmd is F_GETLK, F_SETLK, F_OFD_GETLK, or  F_OFD_SETLK,  and  the
888              operation  was  interrupted  by  a  signal  before  the lock was
889              checked or acquired.  This is most likely when locking a  remote
890              file (e.g., locking over NFS), but can sometimes happen locally.
892       EINVAL The value specified in cmd is not recognized by this kernel.
894       EINVAL cmd is F_ADD_SEALS and arg includes an unrecognized sealing bit.
896       EINVAL cmd  is F_ADD_SEALS or F_GET_SEALS and the filesystem containing
897              the inode referred to by fd does not support sealing.
899       EINVAL cmd is F_DUPFD and arg is negative or is greater than the  maxi‐
900              mum  allowable  value  (see  the  discussion of RLIMIT_NOFILE in
901              getrlimit(2)).
903       EINVAL cmd is F_SETSIG and arg is not an allowable signal number.
905       EINVAL cmd is F_OFD_SETLK, F_OFD_SETLKW, or F_OFD_GETLK, and l_pid  was
906              not specified as zero.
908       EMFILE cmd  is  F_DUPFD and the per-process limit on the number of open
909              file descriptors has been reached.
911       ENOLCK Too many segment locks open, lock table is  full,  or  a  remote
912              locking protocol failed (e.g., locking over NFS).
914       ENOTDIR
915              F_NOTIFY was specified in cmd, but fd does not refer to a direc‐
916              tory.
918       EPERM  cmd is F_SETPIPE_SZ and the soft or hard  user  pipe  limit  has
919              been reached; see pipe(7).
921       EPERM  Attempted  to  clear  the  O_APPEND  flag on a file that has the
922              append-only attribute set.
924       EPERM  cmd was F_ADD_SEALS, but fd was not open for writing or the cur‐
925              rent set of seals on the file already includes F_SEAL_SEAL.
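
       Two of the errors above are easy to trip over with open file
       description locks: EINVAL when l_pid is left uninitialized, and
       EACCES or EAGAIN when a conflicting lock is held.  The sketch
       below zeroes the structure first to avoid the former; the helper
       name try_ofd_write_lock() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* F_OFD_SETLK */
#endif
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to take an open file description write lock on the whole file
 * without blocking.  Returns 0 on success, -1 with errno set
 * (EACCES or EAGAIN when a conflicting lock is held).
 */
int try_ofd_write_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));     /* also sets l_pid to 0, as required */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                   /* 0 means "through end of file" */
    return fcntl(fd, F_OFD_SETLK, &fl);
}
```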


928       SVr4,  4.3BSD,  POSIX.1-2001.   Only  the  operations F_DUPFD, F_GETFD,
929       F_SETFD, F_GETFL, F_SETFL, F_GETLK, F_SETLK, and F_SETLKW are specified
930       in POSIX.1-2001.
932       F_GETOWN  and  F_SETOWN  are  specified in POSIX.1-2001.  (To get their
933       definitions, define either _XOPEN_SOURCE with the value 500 or greater,
934       or _POSIX_C_SOURCE with the value 200809L or greater.)
936       F_DUPFD_CLOEXEC is specified in POSIX.1-2008.  (To get this definition,
937       define  _POSIX_C_SOURCE  with  the  value  200809L   or   greater,   or
938       _XOPEN_SOURCE with the value 700 or greater.)
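
       In a strictly conforming build, the feature test macro has to be
       defined before the first header is included.  A short sketch;
       dup_cloexec() is a hypothetical helper name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* exposes F_DUPFD_CLOEXEC */
#endif
#include <fcntl.h>

/*
 * Duplicate fd onto the lowest free descriptor >= min_fd, with the
 * close-on-exec flag already set on the copy.
 * Returns the new descriptor, or -1 with errno set.
 */
int dup_cloexec(int fd, int min_fd)
{
    return fcntl(fd, F_DUPFD_CLOEXEC, min_fd);
}
```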
940       F_GETOWN_EX, F_SETOWN_EX, F_SETPIPE_SZ, F_GETPIPE_SZ, F_GETSIG,
941       F_SETSIG, F_NOTIFY, F_GETLEASE, and F_SETLEASE are Linux-specific.  (Define
942       the _GNU_SOURCE macro to obtain these definitions.)
944       F_OFD_SETLK,  F_OFD_SETLKW, and F_OFD_GETLK are Linux-specific (and one
945       must define _GNU_SOURCE to obtain their definitions), but work is being
946       done to have them included in the next version of POSIX.1.
948       F_ADD_SEALS and F_GET_SEALS are Linux-specific.


951       The  errors  returned  by  dup2(2) are different from those returned by
952       F_DUPFD.
954   File locking
955       The original Linux fcntl() system call was not designed to handle large
956       file offsets (in the flock structure).  Consequently, an fcntl64() sys‐
957       tem call was added in Linux 2.4.  The newer system call employs a  dif‐
958       ferent structure for file locking, flock64, and corresponding commands,
959       F_GETLK64, F_SETLK64, and F_SETLKW64.  However, these  details  can  be
960       ignored  by  applications  using  glibc, whose fcntl() wrapper function
961       transparently employs the more recent system call where  it  is  avail‐
962       able.
964   Record locks
965       Since  kernel  2.0,  there  is no interaction between the types of lock
966       placed by flock(2) and fcntl().
968       Several systems have more fields in struct flock such as, for  example,
969       l_sysid.   Clearly,  l_pid  alone is not going to be very useful if the
970       process holding the lock may live on a different machine.
981   Record locking and NFS
982       Before Linux 3.12, if an NFSv4 client loses contact with the server for
983       a period of time (defined as more than 90 seconds  with  no  communica‐
984       tion),  it might lose and regain a lock without ever being aware of the
985       fact.  (The period of time after which contact is assumed lost is known
986       as  the NFSv4 leasetime.  On a Linux NFS server, this can be determined
987       by looking at /proc/fs/nfsd/nfsv4leasetime, which expresses the  period
988       in  seconds.   The  default  value for this file is 90.)  This scenario
989       potentially risks data corruption, since another process might  acquire
990       a lock in the intervening period and perform file I/O.
992       Since Linux 3.12, if an NFSv4 client loses contact with the server, any
993       I/O to the file by a process which "thinks" it holds a lock  will  fail
994       until  that  process  closes and reopens the file.  A kernel parameter,
995       nfs.recover_lost_locks, can be set to 1 to obtain the  pre-3.12  behav‐
996       ior, whereby the client will attempt to recover lost locks when contact
997       is reestablished with the server.  Because of  the  attendant  risk  of
998       data corruption, this parameter defaults to 0 (disabled).


1001   F_SETFL
1002       It  is  not  possible to use F_SETFL to change the state of the O_DSYNC
1003       and O_SYNC flags.  Attempts to change the  state  of  these  flags  are
1004       silently ignored.
1006   F_GETOWN
1007       A limitation of the Linux system call conventions on some architectures
1008       (notably i386) means that if  a  (negative)  process  group  ID  to  be
1009       returned  by  F_GETOWN  falls in the range -1 to -4095, then the return
1010       value is wrongly interpreted by glibc as an error in the  system  call;
1011       that is, the return value of fcntl() will be -1, and errno will contain
1012       the (positive) process group ID.  The Linux-specific F_GETOWN_EX opera‐
1013       tion  avoids  this  problem.  Since glibc version 2.11, glibc makes the
1014       kernel  F_GETOWN  problem  invisible  by  implementing  F_GETOWN  using
1015       F_GETOWN_EX.
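
       Code that must run with older C libraries can sidestep the
       ambiguity by calling F_GETOWN_EX directly, since that operation
       reports the owner through a structure rather than through the
       return value.  A sketch, assuming the struct f_owner_ex and
       F_OWNER_PGRP definitions from <fcntl.h>; get_owner() is a
       hypothetical helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* F_GETOWN_EX, struct f_owner_ex */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Store the owner of fd in *owner using F_GETOWN's encoding: a
 * process or thread ID is positive, a process group ID is negated.
 * Returns 0 on success, -1 with errno set.
 */
int get_owner(int fd, pid_t *owner)
{
    struct f_owner_ex ex;

    /* The return value only signals success or failure here. */
    if (fcntl(fd, F_GETOWN_EX, &ex) == -1)
        return -1;

    *owner = (ex.type == F_OWNER_PGRP) ? -ex.pid : ex.pid;
    return 0;
}
```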
1017   F_SETOWN
1018       In  Linux  2.4  and  earlier,  there is a bug that can occur when an
1019       unprivileged process uses F_SETOWN to specify the owner of a socket file
1020       descriptor  as  a process (group) other than the caller.  In this case,
1021       fcntl() can return -1 with errno set to  EPERM,  even  when  the  owner
1022       process  (group)  is one that the caller has permission to send signals
1023       to.  Despite this error return, the file descriptor owner is  set,  and
1024       signals will be sent to the owner.
1026   Deadlock detection
1027       The  deadlock-detection  algorithm  employed by the kernel when dealing
1028       with F_SETLKW requests can yield  both  false  negatives  (failures  to
1029       detect deadlocks, leaving a set of deadlocked processes blocked indefi‐
1030       nitely) and false positives (EDEADLK errors when there is no deadlock).
1031       For  example, the kernel limits the lock depth of its dependency search
1032       to 10 steps, meaning that circular deadlock  chains  that  exceed  that
1033       size  will  not be detected.  In addition, the kernel may falsely indi‐
1034       cate a deadlock when two or more processes created using  the  clone(2)
1035       CLONE_FILES flag place locks that appear (to the kernel) to conflict.
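
       Given these limits, callers of F_SETLKW generally need to treat
       EDEADLK as a recoverable condition rather than as proof of a
       real deadlock.  One possible shape for such a wrapper is
       sketched below; lock_region_wait() is a hypothetical name, and
       the retry-on-EINTR policy is an assumption that may not suit
       every program.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Block until a write lock on the len bytes starting at start is
 * obtained.  Returns 0 on success, -1 with errno set on failure.
 * When errno is EDEADLK the caller should release the locks it
 * already holds and start over, since the report may be a false
 * positive but waiting is not an option either way.
 */
int lock_region_wait(int fd, off_t start, off_t len)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    for (;;) {
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            return 0;
        if (errno != EINTR)     /* EINTR: a handler ran, so retry */
            return -1;
    }
}
```

       Because undetected deadlocks are also possible, a program that
       cannot afford to hang may additionally want a timeout around
       the call.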
1037   Mandatory locking
1038       The Linux implementation of mandatory locking is subject to race condi‐
1039       tions which render it unreliable: a write(2) call that overlaps with  a
1040       lock  may  modify  data after the mandatory lock is acquired; a read(2)
1041       call that overlaps with a lock may detect changes  to  data  that  were
1042       made only after a write lock was acquired.  Similar races exist between
1043       mandatory locks and mmap(2).  It is therefore inadvisable  to  rely  on
1044       mandatory locking.


1047       dup2(2),  flock(2), open(2), socket(2), lockf(3), capabilities(7), fea‐
1048       ture_test_macros(7), lslocks(8)
1050       locks.txt, mandatory-locking.txt, and dnotify.txt in the  Linux  kernel
1051       source  directory  Documentation/filesystems/  (on older kernels, these
1052       files are directly under the Documentation/ directory,  and  mandatory-
1053       locking.txt is called mandatory.txt)


1056       This  page  is  part of release 4.15 of the Linux man-pages project.  A
1057       description of the project, information about reporting bugs,  and  the
1058       latest     version     of     this    page,    can    be    found    at
1059       https://www.kernel.org/doc/man-pages/.
1063Linux                             2018-02-02                          FCNTL(2)