fcntl64(2)

FCNTL(2)                   Linux Programmer's Manual                  FCNTL(2)

NAME

       fcntl - manipulate file descriptor

SYNOPSIS

       #include <unistd.h>
       #include <fcntl.h>

       int fcntl(int fd, int cmd, ... /* arg */ );

DESCRIPTION

       fcntl() performs one of the operations described below on the open file
       descriptor fd.  The operation is determined by cmd.

       fcntl() can take an optional third argument.  Whether or not this argu‐
       ment  is  required is determined by cmd.  The required argument type is
       indicated in parentheses after  each  cmd  name  (in  most  cases,  the
       required type is int, and we identify the argument using the name arg),
       or void is specified if the argument is not required.

       Certain of the operations below are supported only since  a  particular
       Linux  kernel  version.   The  preferred method of checking whether the
       host kernel supports a particular operation is to invoke  fcntl()  with
       the  desired  cmd value and then test whether the call failed with EIN‐
       VAL, indicating that the kernel does not recognize this value.

   Duplicating a file descriptor
       F_DUPFD (int)
              Duplicate the  file  descriptor  fd  using  the  lowest-numbered
              available file descriptor greater than or equal to arg.  This is
              different from dup2(2), which uses exactly the  file  descriptor
              specified.

              On success, the new file descriptor is returned.

              See dup(2) for further details.

       F_DUPFD_CLOEXEC (int; since Linux 2.6.24)
              As  for F_DUPFD, but additionally set the close-on-exec flag for
              the duplicate file descriptor.  Specifying this flag  permits  a
              program  to avoid an additional fcntl() F_SETFD operation to set
              the FD_CLOEXEC flag.  For an explanation of  why  this  flag  is
              useful, see the description of O_CLOEXEC in open(2).

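       The EINVAL probe and the F_DUPFD_CLOEXEC command described above
       can be combined as in the following minimal sketch.  The helper
       name dup_cloexec() is hypothetical and not part of any library.
       The fallback path reopens the race window that F_DUPFD_CLOEXEC is
       meant to close, so it is shown for illustration only.  Note that
       EINVAL may also mean that arg is out of range, in which case the
       fallback fails in the same way.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd onto the lowest free descriptor
 * >= min_fd with close-on-exec set on the duplicate.
 * Returns the new descriptor, or -1 with errno set. */
int dup_cloexec(int fd, int min_fd)
{
    int newfd;

#ifdef F_DUPFD_CLOEXEC
    newfd = fcntl(fd, F_DUPFD_CLOEXEC, min_fd);
    /* Anything other than EINVAL is a final answer. */
    if (newfd != -1 || errno != EINVAL)
        return newfd;
#endif
    /* Fallback (assumed old kernel): two steps, not atomic. */
    newfd = fcntl(fd, F_DUPFD, min_fd);
    if (newfd == -1)
        return -1;
    if (fcntl(newfd, F_SETFD, FD_CLOEXEC) == -1) {
        int saved = errno;
        close(newfd);
        errno = saved;
        return -1;
    }
    return newfd;
}
```

       A caller might use dup_cloexec(fd, 3) to obtain a duplicate that
       stays clear of the standard descriptors.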
   File descriptor flags
       The  following  commands  manipulate  the  flags associated with a file
       descriptor.  Currently, only one such flag is defined: FD_CLOEXEC,  the
       close-on-exec  flag.  If the FD_CLOEXEC bit is set, the file descriptor
       will automatically be closed during a successful  execve(2).   (If  the
       execve(2)  fails, the file descriptor is left open.)  If the FD_CLOEXEC
       bit is not  set,  the  file  descriptor  will  remain  open  across  an
       execve(2).

       F_GETFD (void)
              Return  (as  the function result) the file descriptor flags; arg
              is ignored.

       F_SETFD (int)
              Set the file descriptor flags to the value specified by arg.

       In multithreaded programs, using fcntl() F_SETFD to set  the  close-on-
       exec  flag  at  the same time as another thread performs a fork(2) plus
       execve(2) is vulnerable to a race condition  that  may  unintentionally
       leak  the file descriptor to the program executed in the child process.
       See the discussion of the O_CLOEXEC flag in open(2) for details  and  a
       remedy to the problem.

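       Because F_SETFD replaces the whole flag set, a common pattern is
       to read the flags with F_GETFD first and modify only the bit of
       interest.  The following is a minimal sketch; set_cloexec() is a
       hypothetical helper name, and the race described above still
       applies to it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn the close-on-exec flag on or off
 * while leaving any other descriptor flags untouched.
 * Returns 0 on success, -1 with errno set on failure. */
int set_cloexec(int fd, int on)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (on)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}
```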
   File status flags
       Each  open  file  description has certain associated status flags, ini‐
       tialized by open(2) and possibly modified by fcntl().  Duplicated  file
       descriptors  (made with dup(2), fcntl(F_DUPFD), fork(2), etc.) refer to
       the same open file description, and thus share  the  same  file  status
       flags.

       The file status flags and their semantics are described in open(2).

       F_GETFL (void)
              Return  (as  the  function  result) the file access mode and the
              file status flags; arg is ignored.

       F_SETFL (int)
              Set the file status flags to the value specified by  arg.   File
              access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation flags
              (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in arg  are  ignored.
              On  Linux,  this  command can change only the O_APPEND, O_ASYNC,
              O_DIRECT, O_NOATIME, and O_NONBLOCK flags.  It is  not  possible
              to change the O_DSYNC and O_SYNC flags; see BUGS, below.

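       As with the descriptor flags, F_SETFL replaces the status flags
       as a whole, so they are usually read with F_GETFL first.  The
       following minimal sketch toggles a single status flag such as
       O_NONBLOCK; set_status_flag() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set or clear one file status flag
 * (for example O_NONBLOCK or O_APPEND) on the open file
 * description behind fd.  Returns 0 on success, -1 on error. */
int set_status_flag(int fd, int flag, int on)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (on)
        flags |= flag;
    else
        flags &= ~flag;
    /* Access mode and creation flags in flags are ignored. */
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

       Since the flags belong to the open file description, the change
       is visible through every duplicate of fd.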
   Advisory record locking
       Linux  implements traditional ("process-associated") UNIX record locks,
       as standardized by POSIX.  For a Linux-specific alternative with better
       semantics, see the discussion of open file description locks below.

       F_SETLK,  F_SETLKW,  and F_GETLK are used to acquire, release, and test
       for the existence of record locks (also known as byte-range,  file-seg‐
       ment, or file-region locks).  The third argument, lock, is a pointer to
       a structure that has at least  the  following  fields  (in  unspecified
       order).

           struct flock {
               ...
               short l_type;    /* Type of lock: F_RDLCK,
                                   F_WRLCK, F_UNLCK */
               short l_whence;  /* How to interpret l_start:
                                   SEEK_SET, SEEK_CUR, SEEK_END */
               off_t l_start;   /* Starting offset for lock */
               off_t l_len;     /* Number of bytes to lock */
               pid_t l_pid;     /* PID of process blocking our lock
                                   (set by F_GETLK and F_OFD_GETLK) */
               ...
           };

       The  l_whence,  l_start, and l_len fields of this structure specify the
       range of bytes we wish to lock.  Bytes past the end of the file may  be
       locked, but not bytes before the start of the file.

       l_start  is  the starting offset for the lock, and is interpreted rela‐
       tive to either: the start of the file (if l_whence  is  SEEK_SET);  the
       current  file  offset (if l_whence is SEEK_CUR); or the end of the file
       (if l_whence is SEEK_END).  In the final two cases, l_start  can  be  a
       negative  number  provided  the offset does not lie before the start of
       the file.

       l_len specifies the number of bytes to be locked.  If  l_len  is  posi‐
       tive,  then  the  range  to  be  locked  covers bytes l_start up to and
       including l_start+l_len-1.  Specifying 0  for  l_len  has  the  special
       meaning:  lock all bytes starting at the location specified by l_whence
       and l_start through to the end of file, no matter how  large  the  file
       grows.

       POSIX.1-2001 allows (but does not require) an implementation to support
       a negative l_len value; if l_len is negative, the interval described by
       lock covers bytes l_start+l_len up to and including l_start-1.  This is
       supported by Linux since kernel versions 2.4.21 and 2.5.49.

       The l_type field can be used to place  a  read  (F_RDLCK)  or  a  write
       (F_WRLCK) lock on a file.  Any number of processes may hold a read lock
       (shared lock) on a file region, but only one process may hold  a  write
       lock  (exclusive  lock).   An  exclusive lock excludes all other locks,
       both shared and exclusive.  A single process can hold only one type  of
       lock  on  a  file region; if a new lock is applied to an already-locked
       region, then the existing lock is  converted  to  the  new  lock  type.
       (Such  conversions may involve splitting, shrinking, or coalescing with
       an existing lock if the byte range specified by the new lock  does  not
       precisely coincide with the range of the existing lock.)

       F_SETLK (struct flock *)
              Acquire  a lock (when l_type is F_RDLCK or F_WRLCK) or release a
              lock (when l_type is F_UNLCK) on  the  bytes  specified  by  the
              l_whence,  l_start,  and l_len fields of lock.  If a conflicting
              lock is held by another process, this call returns -1  and  sets
              errno  to  EACCES  or  EAGAIN.  (The error returned in this case
              differs across implementations, so  POSIX  requires  a  portable
              application to check for both errors.)

       F_SETLKW (struct flock *)
              As  for  F_SETLK, but if a conflicting lock is held on the file,
              then wait for that lock to be released.  If a signal  is  caught
              while  waiting, then the call is interrupted and (after the sig‐
              nal handler has returned) returns immediately (with return value
              -1 and errno set to EINTR; see signal(7)).

       F_GETLK (struct flock *)
              On  input  to  this call, lock describes a lock we would like to
              place on the file.  If the lock could be  placed,  fcntl()  does
              not  actually  place it, but returns F_UNLCK in the l_type field
              of lock and leaves the other fields of the structure unchanged.

              If one or more incompatible locks would prevent this lock  being
              placed, then fcntl() returns details about one of those locks in
              the l_type, l_whence, l_start, and l_len fields of lock.  If the
              conflicting  lock  is  a traditional (process-associated) record
              lock, then the l_pid field is set to  the  PID  of  the  process
              holding  that  lock.   If  the  conflicting lock is an open file
              description lock, then l_pid  is  set  to  -1.   Note  that  the
              returned  information may already be out of date by the time the
              caller inspects it.

       In order to place a read lock, fd must be open for reading.   In  order
       to  place  a  write  lock,  fd must be open for writing.  To place both
       types of lock, open a file read-write.

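       The following minimal sketch shows one way to fill in a struct
       flock for F_SETLK and F_GETLK, including the portable check for
       both EACCES and EAGAIN.  The helper names try_lock_range() and
       probe_lock_range() are hypothetical, and offsets are taken
       relative to the start of the file (SEEK_SET) as a simplifying
       assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: place (F_RDLCK, F_WRLCK) or release (F_UNLCK)
 * a record lock on len bytes starting at start, without blocking.
 * Returns 0 on success, 1 if another process holds a conflicting
 * lock, and -1 on any other error. */
int try_lock_range(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;            /* 0 would mean "through end of file" */
    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;
    /* Portable code checks for both error values. */
    if (errno == EACCES || errno == EAGAIN)
        return 1;
    return -1;
}

/* Hypothetical helper: ask whether a lock could be placed.
 * Returns 0 if it could, 1 if it would conflict (storing l_pid of
 * one conflicting lock in *holder), and -1 on error. */
int probe_lock_range(int fd, short type, off_t start, off_t len,
                     pid_t *holder)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    if (fl.l_type == F_UNLCK)
        return 0;
    if (holder != NULL)
        *holder = fl.l_pid;    /* -1 for an open file description lock */
    return 1;
}
```

       A process never conflicts with its own record locks, so probing
       a range that the caller itself has locked reports that the lock
       could be placed.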
       When placing locks with F_SETLKW, the kernel detects deadlocks, whereby
       two  or  more  processes  have  their lock requests mutually blocked by
       locks held by the other processes.   For  example,  suppose  process  A
       holds  a  write lock on byte 100 of a file, and process B holds a write
       lock on byte 200.  If each process  then  attempts  to  lock  the  byte
       already locked by the other process using F_SETLKW, then, without dead‐
       lock detection, both processes would remain blocked indefinitely.  When
       the  kernel  detects such deadlocks, it causes one of the blocking lock
       requests to immediately fail with the  error  EDEADLK;  an  application
       that encounters such an error should release some of its locks to allow
       other applications to proceed before attempting to  regain  the   locks
       that  it  requires.  Circular deadlocks involving more than two proces‐
       ses are also detected.  Note, however, that there  are  limitations  to
       the kernel's deadlock-detection algorithm; see BUGS.

       As well as being removed by an explicit F_UNLCK, record locks are auto‐
       matically released when the process terminates.

       Record locks are not inherited by a child created via fork(2), but  are
       preserved across an execve(2).

       Because  of the buffering performed by the stdio(3) library, the use of
       record locking with routines in that package  should  be  avoided;  use
       read(2) and write(2) instead.

       The  record  locks  described  above  are  associated  with the process
       (unlike the open file description locks  described  below).   This  has
       some unfortunate consequences:

       *  If  a  process  closes any file descriptor referring to a file, then
          all of the process's locks on that file are released, regardless  of
          the  file  descriptor(s)  on which the locks were obtained.  This is
          bad: it means that a process can lose its locks on a  file  such  as
          /etc/passwd  or  /etc/mtab  when  for some reason a library function
          decides to open, read, and close the same file.

       *  The threads in a process share locks.   In  other  words,  a  multi‐
          threaded  program  can't  use  record locking to ensure that threads
          don't simultaneously access the same region of a file.

       Open file description locks solve both of these problems.

   Open file description locks (non-POSIX)
       Open file description locks are advisory byte-range locks whose  opera‐
       tion  is  in  most  respects  identical to the traditional record locks
       described above.  This lock type is Linux-specific, and available since
       Linux 3.15.  (There is a proposal with the Austin Group to include this
       lock type in the next revision of POSIX.1.)  For an explanation of open
       file descriptions, see open(2).

       The  principal  difference  between  the two lock types is that whereas
       traditional record locks are  associated  with  a  process,  open  file
       description  locks  are  associated  with  the open file description on
       which they are acquired, much like locks acquired with flock(2).   Con‐
       sequently  (and  unlike  traditional  advisory record locks), open file
       description locks are  inherited  across  fork(2)  (and  clone(2)  with
       CLONE_FILES),  and are only automatically released on the last close of
       the open file description, instead of being released on  any  close  of
       the file.

       Conflicting  lock  combinations  (i.e., a read lock and a write lock or
       two write locks) where one lock is an open file  description  lock  and
       the  other  is  a  traditional  record lock conflict even when they are
       acquired by the same process on the same file descriptor.

       Open file description locks placed via the same open  file  description
       (i.e.,  via  the  same  file descriptor, or via a duplicate of the file
       descriptor created by fork(2), dup(2), fcntl() F_DUPFD, and so on)  are
       always compatible: if a new lock is placed on an already locked region,
       then the existing lock is converted to the new lock type.   (Such  con‐
       versions  may  result  in  splitting,  shrinking, or coalescing with an
       existing lock as discussed above.)

       On the other hand, open file description locks may conflict  with  each
       other  when  they  are  acquired  via different open file descriptions.
       Thus, the threads in a multithreaded program can use open file descrip‐
       tion locks to synchronize access to a file region by having each thread
       perform its own open(2) on the file and applying locks via the  result‐
       ing file descriptor.

       As  with  traditional  advisory  locks,  the third argument to fcntl(),
       lock, is a pointer to an flock structure.  By contrast with traditional
       record  locks,  the  l_pid  field of that structure must be set to zero
       when using the commands described below.

       The commands for working with open file description locks are analogous
       to those used with traditional locks:

       F_OFD_SETLK (struct flock *)
              Acquire an open file description lock (when l_type is F_RDLCK or
              F_WRLCK) or release an open file description lock  (when  l_type
              is F_UNLCK) on the bytes specified by the l_whence, l_start, and
              l_len fields of lock.  If a conflicting lock is held by  another
              process, this call returns -1 and sets errno to EAGAIN.

       F_OFD_SETLKW (struct flock *)
              As  for  F_OFD_SETLK,  but  if a conflicting lock is held on the
              file, then wait for that lock to be released.  If  a  signal  is
              caught  while  waiting,  then the call is interrupted and (after
              the signal  handler  has  returned)  returns  immediately  (with
              return value -1 and errno set to EINTR; see signal(7)).

       F_OFD_GETLK (struct flock *)
              On  input  to this call, lock describes an open file description
              lock we would like to place on the file.  If the lock  could  be
              placed,  fcntl() does not actually place it, but returns F_UNLCK
              in the l_type field of lock and leaves the other fields  of  the
              structure  unchanged.   If  one or more incompatible locks would
              prevent this lock being placed, then details about one of  these
              locks are returned via lock, as described above for F_GETLK.

       In  the  current implementation, no deadlock detection is performed for
       open file description locks.  (This contrasts  with  process-associated
       record locks, for which the kernel does perform deadlock detection.)

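       The following minimal sketch illustrates the l_pid requirement
       and the per-description semantics described above.  It assumes
       that _GNU_SOURCE is defined before any header is included so
       that glibc exposes F_OFD_SETLK; ofd_try_lock() is a hypothetical
       helper name.

```c
#define _GNU_SOURCE            /* assumed necessary for F_OFD_* in glibc */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: place or release an open file description
 * lock without blocking.  Returns 0 on success, 1 if a conflicting
 * lock is held (EAGAIN), and -1 on any other error. */
int ofd_try_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));   /* l_pid must be zero for OFD locks */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    if (fcntl(fd, F_OFD_SETLK, &fl) == 0)
        return 0;
    return errno == EAGAIN ? 1 : -1;
}
```

       If the same file is opened twice, a write lock taken through the
       first descriptor makes ofd_try_lock() on the second descriptor
       return 1, even within a single process, whereas a further
       request through the first descriptor merely converts the
       existing lock.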
   Mandatory locking
       Warning:  the  Linux implementation of mandatory locking is unreliable.
       See BUGS below.  Because of these bugs, and the fact that  the  feature
       is  believed  to be little used, since Linux 4.5, mandatory locking has
       been made an optional feature, governed by a configuration option (CON‐
       FIG_MANDATORY_FILE_LOCKING).   This  is an initial step toward removing
       this feature completely.

       By  default,  both  traditional  (process-associated)  and  open   file
       description record locks are advisory.  Advisory locks are not enforced
       and are useful only between cooperating processes.

       Both lock types can also be mandatory.  Mandatory  locks  are  enforced
       for  all  processes.   If  a  process  tries to perform an incompatible
       access (e.g., read(2) or write(2)) on a file region that has an  incom‐
       patible mandatory lock, then the result depends upon whether the O_NON‐
       BLOCK flag is enabled for its open file description.  If the O_NONBLOCK
       flag  is not enabled, then the system call is blocked until the lock is
       removed or converted to a mode that is compatible with the access.   If
       the  O_NONBLOCK  flag  is  enabled, then the system call fails with the
       error EAGAIN.

       To make use of mandatory locks, mandatory locking must be enabled  both
       on  the filesystem that contains the file to be locked, and on the file
       itself.  Mandatory locking is enabled on a  filesystem  using  the  "-o
       mand" option to mount(8), or the MS_MANDLOCK flag for mount(2).  Manda‐
       tory locking is enabled on a file by disabling group execute permission
       on  the file and enabling the set-group-ID permission bit (see chmod(1)
       and chmod(2)).

       Mandatory locking is not specified by POSIX.  Some other  systems  also
       support  mandatory  locking,  although  the details of how to enable it
       vary across systems.

   Lost locks
       When an advisory lock is obtained on a networked filesystem such as NFS
       it  is  possible  that the lock might get lost.  This may happen due to
       administrative action on the server, or  due  to  a  network  partition
       (i.e.,  loss  of network connectivity with the server) which lasts long
       enough for the server to assume that the client is no longer  function‐
       ing.

       When  the  filesystem  determines  that  a  lock  has been lost, future
       read(2) or write(2) requests may fail with the error EIO.   This  error
       will  persist  until  the  lock  is  removed  or the file descriptor is
       closed.  Since Linux 3.12, this happens at least for  NFSv4  (including
       all minor versions).

       Some  versions  of  UNIX  send a signal (SIGLOST) in this circumstance.
       Linux does not define this signal, and does not provide  any  asynchro‐
       nous notification of lost locks.

   Managing signals
       F_GETOWN, F_SETOWN, F_GETOWN_EX, F_SETOWN_EX, F_GETSIG and F_SETSIG are
       used to manage I/O availability signals:

       F_GETOWN (void)
              Return (as the function result) the process ID or process  group
              ID  currently  receiving  SIGIO and SIGURG signals for events on
              file descriptor fd.  Process IDs are returned as  positive  val‐
              ues;  process group IDs are returned as negative values (but see
              BUGS below).  arg is ignored.

       F_SETOWN (int)
              Set the process ID or process group ID that will  receive  SIGIO
              and  SIGURG  signals  for events on the file descriptor fd.  The
              target process or process group  ID  is  specified  in  arg.   A
              process  ID is specified as a positive value; a process group ID
              is specified as a negative value.  Most  commonly,  the  calling
              process specifies itself as the owner (that is, arg is specified
              as getpid(2)).

              As well as setting the file  descriptor  owner,  one  must  also
              enable  generation  of  signals on the file descriptor.  This is
              done by using the fcntl() F_SETFL command  to  set  the  O_ASYNC
              file  status flag on the file descriptor.  Subsequently, a SIGIO
              signal is sent whenever input or output becomes possible on  the
              file  descriptor.   The  fcntl() F_SETSIG command can be used to
              obtain delivery of a signal other than SIGIO.

              Sending a signal to  the  owner  process  (group)  specified  by
              F_SETOWN  is  subject  to  the  same  permissions  checks as are
              described for kill(2), where the sending process is the one that
              employs F_SETOWN (but see BUGS below).  If this permission check
              fails,  then  the  signal  is  silently  discarded.   Note:  The
              F_SETOWN  operation records the caller's credentials at the time
              of the fcntl() call, and it is these saved credentials that  are
              used for the permission checks.

              If  the  file  descriptor  fd  refers to a socket, F_SETOWN also
              selects the recipient of SIGURG signals that are delivered  when
              out-of-band data arrives on that socket.  (SIGURG is sent in any
              situation where select(2) would report the socket as  having  an
              "exceptional condition".)

              The following was true in 2.6.x kernels up to and including ker‐
              nel 2.6.11:

                     If a nonzero value is  given  to  F_SETSIG  in  a  multi‐
                     threaded  process  running  with a threading library that
                     supports thread groups  (e.g.,  NPTL),  then  a  positive
                     value  given to F_SETOWN has a different meaning: instead
                     of being a process ID identifying a whole process, it  is
                     a  thread  ID  identifying  a  specific  thread  within a
                     process.  Consequently,  it  may  be  necessary  to  pass
                     F_SETOWN  the result of gettid(2) instead of getpid(2) to
                     get sensible results when F_SETSIG is used.  (In  current
                     Linux  threading  implementations, a main thread's thread
                     ID is the same as its process ID.  This means that a sin‐
                     gle-threaded  program  can  equally use gettid(2) or get‐
                     pid(2) in this scenario.)  Note, however, that the state‐
                     ments in this paragraph do not apply to the SIGURG signal
                     generated for out-of-band data on a socket:  this  signal
                     is  always  sent  to either a process or a process group,
                     depending on the value given to F_SETOWN.

              The above behavior was accidentally dropped in Linux 2.6.12, and
              won't be restored.  From Linux 2.6.32 onward, use F_SETOWN_EX to
              target SIGIO and SIGURG signals at a particular thread.

       F_GETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
              Return the current file descriptor owner settings as defined  by
              a  previous  F_SETOWN_EX operation.  The information is returned
              in the structure pointed to by  arg,  which  has  the  following
              form:

                  struct f_owner_ex {
                      int   type;
                      pid_t pid;
                  };

              The  type  field  will  have  one  of  the  values  F_OWNER_TID,
              F_OWNER_PID, or F_OWNER_PGRP.  The pid field is a positive inte‐
              ger  representing  a thread ID, process ID, or process group ID.
              See F_SETOWN_EX for more details.

       F_SETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
              This operation performs a similar task to F_SETOWN.   It  allows
              the  caller  to  direct  I/O  availability signals to a specific
              thread, process, or process group.   The  caller  specifies  the
              target  of  signals  via arg, which is a pointer to a f_owner_ex
              structure.  The type field has  one  of  the  following  values,
              which define how pid is interpreted:

              F_OWNER_TID
                     Send  the signal to the thread whose thread ID (the value
                     returned by a call to clone(2) or gettid(2)) is specified
                     in pid.

              F_OWNER_PID
                     Send  the  signal to the process whose ID is specified in
                     pid.

              F_OWNER_PGRP
                     Send the signal to the process group whose ID  is  speci‐
                     fied in pid.  (Note that, unlike with F_SETOWN, a process
                     group ID is specified as a positive value here.)

       F_GETSIG (void)
              Return (as the function result) the signal sent  when  input  or
              output  becomes  possible.  A value of zero means SIGIO is sent.
              Any other value (including SIGIO) is the  signal  sent  instead,
              and in this case additional info is available to the signal han‐
              dler if installed with SA_SIGINFO.  arg is ignored.

       F_SETSIG (int)
              Set the signal sent when input or output becomes possible to the
              value  given  in arg.  A value of zero means to send the default
              SIGIO signal.  Any other value (including SIGIO) is  the  signal
              to  send  instead, and in this case additional info is available
              to the signal handler if installed with SA_SIGINFO.

              By using F_SETSIG with a nonzero value, and  setting  SA_SIGINFO
              for  the  signal  handler  (see sigaction(2)), extra information
              about I/O events is passed to the handler in a siginfo_t  struc‐
              ture.   If  the  si_code field indicates the source is SI_SIGIO,
              the si_fd field gives the file descriptor  associated  with  the
              event.  Otherwise, there is no indication which file descriptors
              are pending, and you should use the usual mechanisms (select(2),
              poll(2),  read(2)  with O_NONBLOCK set, etc.) to determine which
              file descriptors are available for I/O.

              Note that the file descriptor provided in si_fd is the one  that
              was  specified  during the F_SETSIG operation.  This can lead to
              an unusual corner case.  If the file  descriptor  is  duplicated
              (dup(2) or similar), and the original file descriptor is closed,
              then I/O events will continue to be  generated,  but  the  si_fd
              field will contain the number of the now closed file descriptor.

              By  selecting  a  real-time signal (value >= SIGRTMIN), multiple
              I/O events may be queued using the same signal numbers.   (Queu‐
              ing  is  dependent  on  available memory.)  Extra information is
              available if SA_SIGINFO is set for the signal handler, as above.

              Note that Linux imposes a limit on the number of real-time  sig‐
              nals  that may be queued to a process (see getrlimit(2) and sig‐
              nal(7)) and if this limit is reached, then the kernel reverts to
              delivering  SIGIO,  and  this  signal is delivered to the entire
              process rather than to a specific thread.

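       The following minimal sketch shows how the f_owner_ex structure
       might be filled in for F_SETOWN_EX and read back with
       F_GETOWN_EX.  It assumes that _GNU_SOURCE is defined before any
       header is included; set_owner() and get_owner() are hypothetical
       helper names.

```c
#define _GNU_SOURCE            /* assumed necessary for F_SETOWN_EX in glibc */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: direct I/O availability signals for fd to a
 * thread (F_OWNER_TID), process (F_OWNER_PID), or process group
 * (F_OWNER_PGRP).  id is always given as a positive value.
 * Returns 0 on success, -1 with errno set on failure. */
int set_owner(int fd, int type, pid_t id)
{
    struct f_owner_ex owner;

    owner.type = type;
    owner.pid = id;
    return fcntl(fd, F_SETOWN_EX, &owner) == -1 ? -1 : 0;
}

/* Hypothetical helper: read the current owner settings into *owner. */
int get_owner(int fd, struct f_owner_ex *owner)
{
    return fcntl(fd, F_GETOWN_EX, owner) == -1 ? -1 : 0;
}
```

       For example, set_owner(fd, F_OWNER_PGRP, getpgrp()) targets the
       caller's process group using a positive ID, where F_SETOWN would
       need the negated value.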
       Using these mechanisms, a program can implement fully asynchronous  I/O
       without using select(2) or poll(2) most of the time.

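       The following minimal sketch combines F_SETOWN, F_SETSIG, and
       O_ASYNC with an SA_SIGINFO handler.  It assumes that _GNU_SOURCE
       is defined before any header is included; enable_io_signal(),
       on_io(), and last_ready_fd are hypothetical names.  A real
       program would do more than record the descriptor number in the
       handler.

```c
#define _GNU_SOURCE            /* assumed necessary for F_SETSIG in glibc */
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor reported by the most recent I/O signal, or -1. */
volatile sig_atomic_t last_ready_fd = -1;

/* SA_SIGINFO handler: si_fd names the descriptor that was
 * specified in the F_SETSIG operation. */
static void on_io(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    last_ready_fd = info->si_fd;
}

/* Hypothetical helper: deliver signo (for example SIGRTMIN) to the
 * calling process whenever I/O becomes possible on fd.
 * Returns 0 on success, -1 with errno set on failure. */
int enable_io_signal(int fd, int signo)
{
    struct sigaction sa;
    int flags;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_io;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) == -1)
        return -1;

    if (fcntl(fd, F_SETOWN, getpid()) == -1)   /* who gets the signal */
        return -1;
    if (fcntl(fd, F_SETSIG, signo) == -1)      /* which signal is sent */
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    /* O_ASYNC switches signal generation on. */
    return fcntl(fd, F_SETFL, flags | O_ASYNC) == -1 ? -1 : 0;
}
```

       After enable_io_signal(fd, SIGRTMIN) on the read end of a pipe,
       writing to the other end is expected to invoke the handler with
       si_fd equal to fd.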
       The  use  of  O_ASYNC  is  specific  to BSD and Linux.  The only use of
       F_GETOWN and F_SETOWN specified in POSIX.1 is in conjunction  with  the
       use of the SIGURG signal on sockets.  (POSIX does not specify the SIGIO
       signal.)  F_GETOWN_EX, F_SETOWN_EX, F_GETSIG, and F_SETSIG  are  Linux-
       specific.  POSIX has asynchronous I/O and the aio_sigevent structure to
       achieve similar things; these are also available in Linux  as  part  of
       the GNU C Library (Glibc).

512   Leases
513       F_SETLEASE  and  F_GETLEASE  (Linux 2.4 onward) are used to establish a
514       new lease, and retrieve the current lease, on the open file description
515       referred  to by the file descriptor fd.  A file lease provides a mecha‐
516       nism whereby the process holding the  lease  (the  "lease  holder")  is
517       notified  (via  delivery  of  a  signal)  when  a  process  (the "lease
518       breaker") tries to open(2) or truncate(2) the file referred to by  that
519       file descriptor.
520
521       F_SETLEASE (int)
522              Set  or  remove a file lease according to which of the following
523              values is specified in the integer arg:
524
525              F_RDLCK
526                     Take out a read  lease.   This  will  cause  the  calling
527                     process  to be notified when the file is opened for writ‐
528                     ing or is truncated.  A read lease can be placed only  on
529                     a file descriptor that is opened read-only.
530
531              F_WRLCK
532                     Take out a write lease.  This will cause the caller to be
533                     notified when the file is opened for reading  or  writing
534                     or  is  truncated.  A write lease may be placed on a file
535                     only if there are no other open file descriptors for  the
536                     file.
537
538              F_UNLCK
539                     Remove our lease from the file.
540
541       Leases  are  associated  with  an  open file description (see open(2)).
542       This means that duplicate file descriptors (created  by,  for  example,
543       fork(2) or dup(2)) refer to the same lease, and this lease may be modi‐
544       fied or released using any  of  these  descriptors.   Furthermore,  the
545       lease  is  released  by  either an explicit F_UNLCK operation on any of
546       these duplicate file descriptors, or when  all  such  file  descriptors
547       have been closed.
548
549       Leases may be taken out only on regular files.  An unprivileged process
550       may take out a lease only on a  file  whose  UID  (owner)  matches  the
551       filesystem UID of the process.  A process with the CAP_LEASE capability
552       may take out leases on arbitrary files.
553
554       F_GETLEASE (void)
555              Indicates what  type  of  lease  is  associated  with  the  file
556              descriptor  fd by returning either F_RDLCK, F_WRLCK, or F_UNLCK,
557              indicating, respectively, a read lease, a write  lease,  or  no
558              lease.  arg is ignored.
559
560       When a process (the "lease breaker") performs an open(2) or truncate(2)
561       that conflicts with a lease established via F_SETLEASE, the system call
562       is  blocked  by  the kernel and the kernel notifies the lease holder by
563       sending it a signal  (SIGIO  by  default).   The  lease  holder  should
564       respond to receipt of this signal by doing whatever cleanup is required
565       in preparation for the file to be accessed by  another  process  (e.g.,
566       flushing cached buffers) and then either remove or downgrade its lease.
567       A lease is removed by performing an F_SETLEASE command  specifying  arg
568       as  F_UNLCK.   If the lease holder currently holds a write lease on the
569       file, and the lease breaker is opening the file for reading, then it is
570       sufficient for the lease holder to downgrade the lease to a read lease.
571       This is done by performing an  F_SETLEASE  command  specifying  arg  as
572       F_RDLCK.
573
574       If  the  lease holder fails to downgrade or remove the lease within the
575       number of seconds specified in /proc/sys/fs/lease-break-time, then  the
576       kernel forcibly removes or downgrades the lease holder's lease.
577
578       Once  a  lease  break has been initiated, F_GETLEASE returns the target
579       lease type (either F_RDLCK or F_UNLCK, depending on what would be  com‐
580       patible  with  the  lease  breaker)  until the lease holder voluntarily
581       downgrades or removes the lease or the kernel forcibly  does  so  after
582       the lease break timer expires.
583
584       Once  the lease has been voluntarily or forcibly removed or downgraded,
585       and assuming the lease breaker has not unblocked its system  call,  the
586       kernel permits the lease breaker's system call to proceed.
587
588       If the lease breaker's blocked open(2) or truncate(2) is interrupted by
589       a signal handler, then the system call fails with the error EINTR,  but
590       the  other  steps still occur as described above.  If the lease breaker
591       is killed by a signal while blocked in open(2) or truncate(2), then the
592       other steps still occur as described above.  If the lease breaker spec‐
593       ifies the O_NONBLOCK flag when calling open(2), then the  call  immedi‐
594       ately fails with the error EWOULDBLOCK, but the other steps still occur
595       as described above.
596
597       The default signal used to notify the lease holder is SIGIO,  but  this
598       can  be  changed  using the F_SETSIG command to fcntl().  If an F_SETSIG
599       command is performed (even one specifying SIGIO), and the  signal  han‐
600       dler  is  established using SA_SIGINFO, then the handler will receive a
601       siginfo_t structure as its second argument, and the si_fd field of this
602       argument will hold the file descriptor of the leased file that has been
603       accessed by another process.  (This  is  useful  if  the  caller  holds
604       leases against multiple files.)
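
       The basic lease operations can be sketched as follows.  This is a
       minimal illustration, not a complete lease holder: it installs no
       signal handler, and the helper name read_lease_roundtrip() and  the
       scratch  path under /tmp are assumptions made for the example.  It
       takes a read lease on a read-only descriptor, queries  the  lease
       with F_GETLEASE, and then removes it with F_UNLCK.

```c
#define _GNU_SOURCE             /* F_SETLEASE and F_GETLEASE are Linux-specific */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Take a read lease on a scratch file, report the lease type seen by
 * F_GETLEASE, then release the lease.  Returns F_RDLCK on success, or
 * -1 if any step fails (for example, on a filesystem that does not
 * support leases). */
int read_lease_roundtrip(void)
{
    char path[] = "/tmp/lease-demo-XXXXXX";   /* hypothetical scratch file */
    int tmp = mkstemp(path);
    if (tmp == -1)
        return -1;
    close(tmp);                 /* mkstemp() opens read-write; drop that fd */

    int type = -1;
    int fd = open(path, O_RDONLY);   /* a read lease needs a read-only fd */
    if (fd != -1) {
        if (fcntl(fd, F_SETLEASE, F_RDLCK) == 0) {
            type = fcntl(fd, F_GETLEASE);      /* F_RDLCK while the lease is held */
            fcntl(fd, F_SETLEASE, F_UNLCK);    /* explicit release */
        }
        close(fd);
    }
    unlink(path);
    return type;
}
```

       A real lease holder would also establish a handler for the  lease-
       break  signal and, inside the time allowed by lease-break-time, flush
       its state and downgrade or remove the lease.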
605
606   File and directory change notification (dnotify)
607       F_NOTIFY (int)
608              (Linux  2.4  onward)  Provide  notification  when  the directory
609              referred to by fd or any  of  the  files  that  it  contains  is
610              changed.   The events to be notified are specified in arg, which
611              is a bit mask specified by ORing together zero or  more  of  the
612              following bits:
613
614              DN_ACCESS
615                     A  file  was  accessed  (read(2), pread(2), readv(2), and
616                     similar).
617              DN_MODIFY
618                     A file  was  modified  (write(2),  pwrite(2),  writev(2),
619                     truncate(2), ftruncate(2), and similar).
620              DN_CREATE
621                     A   file   was   created  (open(2),  creat(2),  mknod(2),
622                     mkdir(2), link(2), symlink(2), rename(2) into this direc‐
623                     tory).
624              DN_DELETE
625                     A  file  was  unlinked  (unlink(2),  rename(2) to another
626                     directory, rmdir(2)).
627              DN_RENAME
628                     A file was renamed within this directory (rename(2)).
629              DN_ATTRIB
630                     The  attributes  of  a  file  were   changed   (chown(2),
631                     chmod(2), utime(2), utimensat(2), and similar).
632
633              (In  order  to obtain these definitions, the _GNU_SOURCE feature
634              test macro must be defined before including any header files.)
635
636              Directory notifications are normally "one-shot", and the  appli‐
637              cation must reregister to receive further notifications.  Alter‐
638              natively, if DN_MULTISHOT is included in arg, then  notification
639              will remain in effect until explicitly removed.
640
641              A  series of F_NOTIFY requests is cumulative, with the events in
642              arg being added to the set already monitored.  To disable  noti‐
643              fication  of all events, make an F_NOTIFY call specifying arg as
644              0.
645
646              Notification occurs via delivery of a signal.  The default  sig‐
647              nal is SIGIO, but this can be changed using the F_SETSIG command
648              to fcntl().  (Note that SIGIO is one of the nonqueuing  standard
649              signals;  switching  to the use of a real-time signal means that
650              multiple notifications can be queued to the  process.)   In  the
651              latter  case,  the signal handler receives a siginfo_t structure
652              as its second argument (if the  handler  was  established  using
653              SA_SIGINFO)  and  the si_fd field of this structure contains the
654              file descriptor which generated the  notification  (useful  when
655              establishing notification on multiple directories).
656
657              Especially when using DN_MULTISHOT, a real-time signal should be
658              used for notification, so that  multiple  notifications  can  be
659              queued.
660
661              NOTE:  New applications should use the inotify interface (avail‐
662              able since kernel 2.6.13), which provides a much superior inter‐
663              face for obtaining notifications of filesystem events.  See ino‐
664              tify(7).
665
666   Changing the capacity of a pipe
667       F_SETPIPE_SZ (int; since Linux 2.6.35)
668              Change the capacity of the pipe referred to by fd to be at least
669              arg bytes.  An unprivileged process can adjust the pipe capacity
670              to any value between the system page size and the limit  defined
671              in  /proc/sys/fs/pipe-max-size  (see  proc(5)).  Attempts to set
672              the pipe capacity below the page size are silently rounded up to
673              the  page  size.  Attempts by an unprivileged process to set the
674              pipe capacity  above  the  limit  in  /proc/sys/fs/pipe-max-size
675              yield  the  error EPERM; a privileged process (CAP_SYS_RESOURCE)
676              can override the limit.
677
678              When allocating the buffer for the pipe, the kernel  may  use  a
679              capacity  larger  than arg, if that is convenient for the imple‐
680              mentation.  (In the current implementation,  the  allocation  is
681              the next higher power-of-two page-size multiple of the requested
682              size.)  The actual capacity (in bytes) that is set  is  returned
683              as the function result.
684
685              Attempting  to  set the pipe capacity smaller than the amount of
686              buffer space currently used to store  data  produces  the  error
687              EBUSY.
688
689              Note  that  because  of the way the pages of the pipe buffer are
690              employed when data is written to the pipe, the number  of  bytes
691              that can be written may be less than the nominal size, depending
692              on the size of the writes.
693
694       F_GETPIPE_SZ (void; since Linux 2.6.35)
695              Return (as  the  function  result)  the  capacity  of  the  pipe
696              referred to by fd.
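
       The following sketch shows the two pipe-capacity operations  used
       together.   The  helper  name pipe_capacity_after_resize() is an
       assumption made for the example.  It requests a capacity, then reads
       the capacity back, which may be larger than the value requested.

```c
#define _GNU_SOURCE             /* F_SETPIPE_SZ and F_GETPIPE_SZ are Linux-specific */
#include <fcntl.h>
#include <unistd.h>

/* Ask for a pipe capacity of at least `request` bytes and return the
 * capacity reported by F_GETPIPE_SZ afterwards, or -1 on error. */
int pipe_capacity_after_resize(int request)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    /* On success, F_SETPIPE_SZ returns the capacity actually set. */
    int granted = fcntl(fds[1], F_SETPIPE_SZ, request);
    int current = (granted == -1) ? -1 : fcntl(fds[1], F_GETPIPE_SZ);

    close(fds[0]);
    close(fds[1]);

    /* The two values should agree; report failure if they do not. */
    return (granted == current) ? current : -1;
}
```

       For example, pipe_capacity_after_resize(100000) should  return  a
       value  of  at least 100000, provided that the request is within the
       limit in /proc/sys/fs/pipe-max-size.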
697
698   File Sealing
699       File  seals  limit  the set of allowed operations on a given file.  For
700       each seal that is set on a file, a specific set of operations will fail
701       with  EPERM  on  this file from now on.  The file is said to be sealed.
702       The default set of seals depends on the type of the underlying file and
703       filesystem.   For an overview of file sealing, a discussion of its pur‐
704       pose, and some code examples, see memfd_create(2).
705
706       Currently, file seals can be applied only to a file descriptor returned
707       by memfd_create(2) (if the MFD_ALLOW_SEALING flag was employed).  On
708       other filesystems, all fcntl() operations that operate on seals will
709       return EINVAL.
710
711       Seals  are  a  property  of  an inode.  Thus, all open file descriptors
712       referring to the same inode share the same set of seals.   Furthermore,
713       seals can never be removed, only added.
714
715       F_ADD_SEALS (int; since Linux 3.17)
716              Add  the  seals given in the bit-mask argument arg to the set of
717              seals of the inode referred to by the file descriptor fd.  Seals
718              cannot be removed again.  Once this call succeeds, the seals are
719              enforced by the kernel immediately.  If the current set of seals
720              includes  F_SEAL_SEAL  (see  below),  then  this  call  will  be
721              rejected with EPERM.  Adding a seal that is already  set  is  a
722              no-op, provided that F_SEAL_SEAL is not already set.  In order
723              to place a seal, the file descriptor fd must be writable.
724
725       F_GET_SEALS (void; since Linux 3.17)
726              Return (as the function result) the current set of seals of  the
727              inode  referred  to  by fd.  If no seals are set, 0 is returned.
728              If the file does not support sealing, -1 is returned  and  errno
729              is set to EINVAL.
730
731       The following seals are available:
732
733       F_SEAL_SEAL
734              If   this  seal  is  set,  any  further  call  to  fcntl()  with
735              F_ADD_SEALS fails with the error EPERM.   Therefore,  this  seal
736              prevents  any  modifications to the set of seals itself.  If the
737              initial set of seals of a file includes F_SEAL_SEAL,  then  this
738              effectively causes the set of seals to be constant and locked.
739
740       F_SEAL_SHRINK
741              If  this  seal is set, the file in question cannot be reduced in
742              size.  This affects open(2) with the O_TRUNC  flag  as  well  as
743              truncate(2)  and  ftruncate(2).   Those calls fail with EPERM if
744              you try to shrink the file in  question.   Increasing  the  file
745              size is still possible.
746
747       F_SEAL_GROW
748              If  this seal is set, the size of the file in question cannot be
749              increased.  This affects write(2) beyond the end  of  the  file,
750              truncate(2),  ftruncate(2),  and fallocate(2).  These calls fail
751              with EPERM if you use them to increase the file  size.   If  you
752              keep the size or shrink it, those calls still work as expected.
753
754       F_SEAL_WRITE
755              If this seal is set, you cannot modify the contents of the file.
756              Note that shrinking or growing the size of  the  file  is  still
757              possible  and allowed.  Thus, this seal is normally used in com‐
758              bination with  one  of  the  other  seals.   This  seal  affects
759              write(2)  and  fallocate(2)  (only  in combination with the FAL‐
760              LOC_FL_PUNCH_HOLE flag).  Those calls fail with  EPERM  if  this
761              seal is set.  Furthermore, trying to create new shared, writable
762              memory-mappings via mmap(2) will also fail with EPERM.
763
764              Using the F_ADD_SEALS operation to  set  the  F_SEAL_WRITE  seal
765              fails  with  EBUSY if any writable, shared mapping exists.  Such
766              mappings must be unmapped before you can add  this  seal.   Fur‐
767              thermore,  if there are any asynchronous I/O operations (io_sub‐
768              mit(2)) pending on the file, all outstanding writes will be dis‐
769              carded.
770
771       F_SEAL_FUTURE_WRITE (since Linux 5.1)
772              The effect of this seal is similar to F_SEAL_WRITE, but the con‐
773              tents of the file can still be modified via shared writable map‐
774              pings  that  were  created  prior  to  the  seal being set.  Any
775              attempt to create a new writable mapping on the file via mmap(2)
776              will fail with EPERM.  Likewise, an attempt to write to the file
777              via write(2) will fail with EPERM.
778
779              Using this seal, one process can create a memory buffer that  it
780              can  continue  to  modify  while sharing that buffer on a "read-
781              only" basis with other processes.
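
       The size seals can be exercised with a short sketch such as the one
       below.   It  assumes  a C library that provides the memfd_create(2)
       wrapper; the helper name seal_size_demo() and  the  memfd  name  are
       assumptions  made  for the example.  The helper fixes the file size
       with F_SEAL_SHRINK and F_SEAL_GROW and then  checks  that  resizing
       fails with EPERM.

```c
#define _GNU_SOURCE             /* memfd_create(), F_ADD_SEALS, F_SEAL_* */
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a sealable memfd, fix its size with F_SEAL_SHRINK | F_SEAL_GROW,
 * and check that both seals are reported and that resizing is refused.
 * Returns 0 if everything behaved as expected, -1 otherwise. */
int seal_size_demo(void)
{
    int fd = memfd_create("seal-demo", MFD_ALLOW_SEALING); /* name is arbitrary */
    if (fd == -1)
        return -1;

    int ok = -1;
    int want = F_SEAL_SHRINK | F_SEAL_GROW;

    if (ftruncate(fd, 4096) == 0 &&
        fcntl(fd, F_ADD_SEALS, want) == 0) {
        int seals = fcntl(fd, F_GET_SEALS);

        /* Both resize directions should now fail with EPERM. */
        int shrink_denied = (ftruncate(fd, 0) == -1 && errno == EPERM);
        int grow_denied = (ftruncate(fd, 8192) == -1 && errno == EPERM);

        if (seals != -1 && (seals & want) == want &&
            shrink_denied && grow_denied)
            ok = 0;
    }
    close(fd);
    return ok;
}
```

       Adding F_SEAL_SEAL to the same F_ADD_SEALS call would  additionally
       prevent any later change to the set of seals.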
782
783   File read/write hints
784       Write lifetime hints can be used to inform the kernel about  the  rela‐
785       tive  expected  lifetime of writes on a given inode or via a particular
786       open file description.  (See open(2) for an explanation  of  open  file
787       descriptions.)   In  this  context, the term "write lifetime" means the
788       expected time the data will live on media, before being overwritten  or
789       erased.
790
791       An  application  may  use  the different hint values specified below to
792       separate writes into different write classes, so that multiple users or
793       applications  running  on a single storage back-end can aggregate their
794       I/O patterns in a consistent manner.  However, there are no  functional
795       semantics implied by these flags, and different I/O classes can use the
796       write lifetime hints in arbitrary ways, so long as the hints  are  used
797       consistently.
798
799       The following operations can be applied to the file descriptor, fd:
800
801       F_GET_RW_HINT (uint64_t *; since Linux 4.13)
802              Returns  the  value  of  the read/write hint associated with the
803              underlying inode referred to by fd.
804
805       F_SET_RW_HINT (uint64_t *; since Linux 4.13)
806              Sets the read/write hint value associated  with  the  underlying
807              inode  referred to by fd.  This hint persists until either it is
808              explicitly modified or the underlying filesystem is unmounted.
809
810       F_GET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
811              Returns the value of the read/write  hint  associated  with  the
812              open file description referred to by fd.
813
814       F_SET_FILE_RW_HINT (uint64_t *; since Linux 4.13)
815              Sets  the  read/write  hint  value associated with the open file
816              description referred to by fd.
817
818       If an open file description has not been assigned  a  read/write  hint,
819       then it shall use the value assigned to the inode, if any.
820
821       The following read/write hints are valid since Linux 4.13:
822
823       RWH_WRITE_LIFE_NOT_SET
824              No specific hint has been set.  This is the default value.
825
826       RWH_WRITE_LIFE_NONE
827              No  specific  write  lifetime  is  associated  with this file or
828              inode.
829
830       RWH_WRITE_LIFE_SHORT
831              Data written to this inode or via this open file description  is
832              expected to have a short lifetime.
833
834       RWH_WRITE_LIFE_MEDIUM
835              Data  written to this inode or via this open file description is
836              expected to have  a  lifetime  longer  than  data  written  with
837              RWH_WRITE_LIFE_SHORT.
838
839       RWH_WRITE_LIFE_LONG
840              Data  written to this inode or via this open file description is
841              expected to have  a  lifetime  longer  than  data  written  with
842              RWH_WRITE_LIFE_MEDIUM.
843
844       RWH_WRITE_LIFE_EXTREME
845              Data  written to this inode or via this open file description is
846              expected to have  a  lifetime  longer  than  data  written  with
847              RWH_WRITE_LIFE_LONG.
848
849       All  the  write-specific hints are relative to each other, and no indi‐
850       vidual absolute meaning should be attributed to them.
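
       As a hedged illustration of the pointer-based argument convention,
       the sketch below sets a  hint  on  an  inode  and  reads  it  back.
       The  helper  name inode_hint_roundtrip() and the scratch path are
       assumptions made for the example.  The fallback definitions  repeat
       the values from <linux/fcntl.h> and apply only if the installed C
       library headers lack them.  Some kernels or filesystems may  reject
       these operations, in which case the helper returns -1.

```c
#define _GNU_SOURCE             /* F_*_RW_HINT and RWH_* are Linux-specific */
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Fallbacks for older C library headers; values as in <linux/fcntl.h>. */
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT 1035
#define F_SET_RW_HINT 1036
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT 2
#endif

/* Set a "short lifetime" hint on the inode of a scratch file and read
 * it back.  Returns the hint reported by the kernel, or -1 if either
 * operation fails. */
int inode_hint_roundtrip(void)
{
    char path[] = "/tmp/rwhint-demo-XXXXXX";  /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    /* The argument is a pointer to a 64-bit value, not the value itself. */
    uint64_t hint = RWH_WRITE_LIFE_SHORT;
    uint64_t seen = 0;
    int result = -1;

    if (fcntl(fd, F_SET_RW_HINT, &hint) == 0 &&
        fcntl(fd, F_GET_RW_HINT, &seen) == 0)
        result = (int) seen;

    close(fd);
    unlink(path);
    return result;
}
```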
851

RETURN VALUE

853       For a successful call, the return value depends on the operation:
854
855       F_DUPFD
856              The new file descriptor.
857
858       F_GETFD
859              Value of file descriptor flags.
860
861       F_GETFL
862              Value of file status flags.
863
864       F_GETLEASE
865              Type of lease held on file descriptor.
866
867       F_GETOWN
868              Value of file descriptor owner.
869
870       F_GETSIG
871              Value of signal sent when read or  write  becomes  possible,  or
872              zero for traditional SIGIO behavior.
873
874       F_GETPIPE_SZ, F_SETPIPE_SZ
875              The pipe capacity.
876
877       F_GET_SEALS
878              A  bit  mask  identifying  the  seals that have been set for the
879              inode referred to by fd.
880
881       All other commands
882              Zero.
883
884       On error, -1 is returned, and errno is set appropriately.
885

ERRORS

887       EACCES or EAGAIN
888              Operation is prohibited by locks held by other processes.
889
890       EAGAIN The operation is prohibited because the file  has  been  memory-
891              mapped by another process.
892
893       EBADF  fd is not an open file descriptor.
894
895       EBADF  cmd  is  F_SETLK  or  F_SETLKW and the file descriptor open mode
896              doesn't match with the type of lock requested.
897
898       EBUSY  cmd is F_SETPIPE_SZ and the new pipe capacity specified  in  arg
899              is  smaller  than  the  amount of buffer space currently used to
900              store data in the pipe.
901
902       EBUSY  cmd is F_ADD_SEALS, arg includes F_SEAL_WRITE, and there  exists
903              a writable, shared mapping on the file referred to by fd.
904
905       EDEADLK
906              It  was detected that the specified F_SETLKW command would cause
907              a deadlock.
908
909       EFAULT lock is outside your accessible address space.
910
911       EINTR  cmd is F_SETLKW or F_OFD_SETLKW and  the  operation  was  inter‐
912              rupted by a signal; see signal(7).
913
914       EINTR  cmd  is  F_GETLK,  F_SETLK, F_OFD_GETLK, or F_OFD_SETLK, and the
915              operation was interrupted  by  a  signal  before  the  lock  was
916              checked  or  acquired.   Most  likely when locking a remote file
917              (e.g., locking over NFS), but can sometimes happen locally.
918
919       EINVAL The value specified in cmd is not recognized by this kernel.
920
921       EINVAL cmd is F_ADD_SEALS and arg includes an unrecognized sealing bit.
922
923       EINVAL cmd is F_ADD_SEALS or F_GET_SEALS and the filesystem  containing
924              the inode referred to by fd does not support sealing.
925
926       EINVAL cmd  is F_DUPFD and arg is negative or is greater than the maxi‐
927              mum allowable value (see  the  discussion  of  RLIMIT_NOFILE  in
928              getrlimit(2)).
929
930       EINVAL cmd is F_SETSIG and arg is not an allowable signal number.
931
932       EINVAL cmd  is F_OFD_SETLK, F_OFD_SETLKW, or F_OFD_GETLK, and l_pid was
933              not specified as zero.
934
935       EMFILE cmd is F_DUPFD and the per-process limit on the number  of  open
936              file descriptors has been reached.
937
938       ENOLCK Too  many  segment  locks  open, lock table is full, or a remote
939              locking protocol failed (e.g., locking over NFS).
940
941       ENOTDIR
942              F_NOTIFY was specified in cmd, but fd does not refer to a direc‐
943              tory.
944
945       EPERM  cmd  is  F_SETPIPE_SZ  and  the soft or hard user pipe limit has
946              been reached; see pipe(7).
947
948       EPERM  Attempted to clear the O_APPEND flag on  a  file  that  has  the
949              append-only attribute set.
950
951       EPERM  cmd was F_ADD_SEALS, but fd was not open for writing or the cur‐
952              rent set of seals on the file already includes F_SEAL_SEAL.
953

CONFORMING TO

955       SVr4, 4.3BSD, POSIX.1-2001.   Only  the  operations  F_DUPFD,  F_GETFD,
956       F_SETFD, F_GETFL, F_SETFL, F_GETLK, F_SETLK, and F_SETLKW are specified
957       in POSIX.1-2001.
958
959       F_GETOWN and F_SETOWN are specified in  POSIX.1-2001.   (To  get  their
960       definitions, define either _XOPEN_SOURCE with the value 500 or greater,
961       or _POSIX_C_SOURCE with the value 200809L or greater.)
962
963       F_DUPFD_CLOEXEC is specified in POSIX.1-2008.  (To get this definition,
964       define   _POSIX_C_SOURCE   with   the  value  200809L  or  greater,  or
965       _XOPEN_SOURCE with the value 700 or greater.)
966
967       F_GETOWN_EX, F_SETOWN_EX, F_SETPIPE_SZ, F_GETPIPE_SZ, F_GETSIG,  F_SET‐
968       SIG,  F_NOTIFY, F_GETLEASE, and F_SETLEASE are Linux-specific.  (Define
969       the _GNU_SOURCE macro to obtain these definitions.)
970
971       F_OFD_SETLK, F_OFD_SETLKW, and F_OFD_GETLK are Linux-specific (and  one
972       must define _GNU_SOURCE to obtain their definitions), but work is being
973       done to have them included in the next version of POSIX.1.
974
975       F_ADD_SEALS and F_GET_SEALS are Linux-specific.
976

NOTES

978       The errors returned by dup2(2) are different  from  those  returned  by
979       F_DUPFD.
980
981   File locking
982       The original Linux fcntl() system call was not designed to handle large
983       file offsets (in the flock structure).  Consequently, an fcntl64() sys‐
984       tem  call was added in Linux 2.4.  The newer system call employs a dif‐
985       ferent structure for file locking, flock64, and corresponding commands,
986       F_GETLK64,  F_SETLK64,  and  F_SETLKW64.  However, these details can be
987       ignored by applications using glibc,  whose  fcntl()  wrapper  function
988       transparently  employs  the  more recent system call where it is avail‐
989       able.
990
991   Record locks
992       Since kernel 2.0, there is no interaction between  the  types  of  lock
993       placed by flock(2) and fcntl().
994
995       Several  systems have more fields in struct flock such as, for example,
996       l_sysid (to identify the machine where the  lock  is  held).   Clearly,
997       l_pid  alone  is not going to be very useful if the process holding the
998       lock may live on a different machine; on Linux, while present  on  some
999       architectures (such as MIPS32), this field is not used.
1009
1010   Record locking and NFS
1011       Before Linux 3.12, if an NFSv4 client loses contact with the server for
1012       a  period  of  time (defined as more than 90 seconds with no communica‐
1013       tion), it might lose and regain a lock without ever being aware of  the
1014       fact.  (The period of time after which contact is assumed lost is known
1015       as the NFSv4 leasetime.  On a Linux NFS server, this can be  determined
1016       by  looking at /proc/fs/nfsd/nfsv4leasetime, which expresses the period
1017       in seconds.  The default value for this file  is  90.)   This  scenario
1018       potentially  risks data corruption, since another process might acquire
1019       a lock in the intervening period and perform file I/O.
1020
1021       Since Linux 3.12, if an NFSv4 client loses contact with the server, any
1022       I/O  to  the file by a process which "thinks" it holds a lock will fail
1023       until that process closes and reopens the file.   A  kernel  parameter,
1024       nfs.recover_lost_locks,  can  be set to 1 to obtain the pre-3.12 behav‐
1025       ior, whereby the client will attempt to recover lost locks when contact
1026       is  reestablished  with  the  server.  Because of the attendant risk of
1027       data corruption, this parameter defaults to 0 (disabled).
1028

BUGS

1030   F_SETFL
1031       It is not possible to use F_SETFL to change the state  of  the  O_DSYNC
1032       and  O_SYNC  flags.   Attempts  to  change the state of these flags are
1033       silently ignored.
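
       Because the request is ignored without an error, a caller that needs
       to know whether the flag took effect has to read the flags back.  A
       minimal sketch of such a check follows; the helper name
       setfl_sync_takes_effect() and the scratch path are assumptions made
       for the example.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Ask F_SETFL to turn on O_SYNC and read the flags back.  Returns 1 if
 * O_SYNC is set afterwards, 0 if the request was silently ignored, or
 * -1 if one of the calls failed. */
int setfl_sync_takes_effect(void)
{
    char path[] = "/tmp/setfl-demo-XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    int result = -1;
    int flags = fcntl(fd, F_GETFL);
    if (flags != -1 && fcntl(fd, F_SETFL, flags | O_SYNC) == 0) {
        int after = fcntl(fd, F_GETFL);
        if (after != -1)
            result = ((after & O_SYNC) == O_SYNC);
    }
    close(fd);
    unlink(path);
    return result;
}
```

       On a kernel that behaves as described here, the helper is expected
       to return 0.  To get synchronous writes, specify O_SYNC or O_DSYNC
       when calling open(2) instead.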
1034
1035   F_GETOWN
1036       A limitation of the Linux system call conventions on some architectures
1037       (notably  i386)  means  that  if  a  (negative)  process group ID to be
1038       returned by F_GETOWN falls in the range -1 to -4095,  then  the  return
1039       value  is  wrongly interpreted by glibc as an error in the system call;
1040       that is, the return value of fcntl() will be -1, and errno will contain
1041       the (positive) process group ID.  The Linux-specific F_GETOWN_EX opera‐
1042       tion avoids this problem.  Since glibc version 2.11,  glibc  makes  the
1043       kernel  F_GETOWN  problem  invisible  by  implementing  F_GETOWN  using
1044       F_GETOWN_EX.
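
       F_GETOWN_EX avoids the ambiguity because the owner is returned in a
       structure rather than in the function result.  The following is a
       minimal sketch; the helper name owner_ex_roundtrip() is an
       assumption made for the example.  It makes the caller's process
       group the owner of a pipe descriptor and reads the owner back.

```c
#define _GNU_SOURCE             /* F_SETOWN_EX, F_GETOWN_EX, struct f_owner_ex */
#include <fcntl.h>
#include <unistd.h>

/* Make the caller's process group the owner of a pipe descriptor, then
 * read the owner back with F_GETOWN_EX.  Returns 0 if the reported
 * owner matches what was set, -1 otherwise. */
int owner_ex_roundtrip(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    struct f_owner_ex set = { .type = F_OWNER_PGRP, .pid = getpgrp() };
    struct f_owner_ex got = { 0 };
    int ok = -1;

    /* The ID is passed as a positive number; the type field says that it
     * names a process group, so no negative return value is involved. */
    if (fcntl(fds[0], F_SETOWN_EX, &set) == 0 &&
        fcntl(fds[0], F_GETOWN_EX, &got) == 0 &&
        got.type == F_OWNER_PGRP && got.pid == set.pid)
        ok = 0;

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```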
1045
1046   F_SETOWN
1047       In Linux 2.4 and earlier, there is a bug that  can  occur  when  an
1048       unprivileged process uses F_SETOWN to specify the owner of a socket file
1049       descriptor as a process (group) other than the caller.  In  this  case,
1050       fcntl()  can  return  -1  with  errno set to EPERM, even when the owner
1051       process (group) is one that the caller has permission to  send  signals
1052       to.   Despite  this error return, the file descriptor owner is set, and
1053       signals will be sent to the owner.
1054
1055   Deadlock detection
1056       The deadlock-detection algorithm employed by the  kernel  when  dealing
1057       with  F_SETLKW  requests  can  yield  both false negatives (failures to
1058       detect deadlocks, leaving a set of deadlocked processes blocked indefi‐
1059       nitely) and false positives (EDEADLK errors when there is no deadlock).
1060       For example, the kernel limits the lock depth of its dependency  search
1061       to  10  steps,  meaning  that circular deadlock chains that exceed that
1062       size will not be detected.  In addition, the kernel may  falsely  indi‐
1063       cate  a  deadlock when two or more processes created using the clone(2)
1064       CLONE_FILES flag place locks that appear (to the kernel) to conflict.
1065
1066   Mandatory locking
1067       The Linux implementation of mandatory locking is subject to race condi‐
1068       tions  which render it unreliable: a write(2) call that overlaps with a
1069       lock may modify data after the mandatory lock is  acquired;  a  read(2)
1070       call  that  overlaps  with  a lock may detect changes to data that were
1071       made only after a write lock was acquired.  Similar races exist between
1072       mandatory  locks  and  mmap(2).  It is therefore inadvisable to rely on
1073       mandatory locking.
1074

COLOPHON

1085       This page is part of release 5.07 of the Linux  man-pages  project.   A
1086       description  of  the project, information about reporting bugs, and the
1087       latest    version    of    this    page,    can     be     found     at
1088       https://www.kernel.org/doc/man-pages/.
1089
1090
1091
1092Linux                             2020-02-09                          FCNTL(2)