pselect6(2)

SELECT(2)                  Linux Programmer's Manual                 SELECT(2)

NAME

       select,  pselect,  FD_CLR,  FD_ISSET, FD_SET, FD_ZERO - synchronous I/O
       multiplexing

SYNOPSIS

       #include <sys/select.h>

       int select(int nfds, fd_set *restrict readfds,
                  fd_set *restrict writefds, fd_set *restrict exceptfds,
                  struct timeval *restrict timeout);

       void FD_CLR(int fd, fd_set *set);
       int  FD_ISSET(int fd, fd_set *set);
       void FD_SET(int fd, fd_set *set);
       void FD_ZERO(fd_set *set);

       int pselect(int nfds, fd_set *restrict readfds,
                  fd_set *restrict writefds, fd_set *restrict exceptfds,
                  const struct timespec *restrict timeout,
                  const sigset_t *restrict sigmask);

   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

       pselect():
           _POSIX_C_SOURCE >= 200112L

DESCRIPTION

       WARNING: select() can monitor only file descriptor numbers that are
       less than FD_SETSIZE (1024), an unreasonably low limit for many modern
       applications, and this limitation will not change.  All modern
       applications should instead use poll(2) or epoll(7), which do not
       suffer this limitation.

       select() allows a program to monitor multiple file descriptors, waiting
       until one or more of the file descriptors become "ready" for some class
       of I/O operation (e.g., input possible).  A file descriptor is  consid‐
       ered  ready  if it is possible to perform a corresponding I/O operation
       (e.g., read(2), or a sufficiently small write(2)) without blocking.

   File descriptor sets
       The principal arguments of select() are three "sets" of  file  descrip‐
       tors  (declared  with  the type fd_set), which allow the caller to wait
       for three classes of events on the specified set of  file  descriptors.
       Each  of  the  fd_set arguments may be specified as NULL if no file de‐
       scriptors are to be watched for the corresponding class of events.

       Note well: Upon return, each of the file descriptor sets is modified in
       place  to indicate which file descriptors are currently "ready".  Thus,
       if using select() within a loop, the sets must be reinitialized  before
       each call.
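
       The reinitialization rule above can be sketched as follows.  This is
       a minimal illustration, not a definitive implementation: the helper
       name drain_with_select() is hypothetical, and keeping a "master" copy
       of the set and assigning it before each call is one common way to
       reinitialize (it assumes fd_set can be copied by structure
       assignment, as it can with glibc).

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a library function): read from fd until
 * end-of-file and return the number of bytes consumed, or -1 on error.
 */
long drain_with_select(int fd)
{
    fd_set master, readfds;
    char buf[256];
    long total = 0;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;              /* fd cannot be stored in an fd_set */

    FD_ZERO(&master);
    FD_SET(fd, &master);

    for (;;) {
        /* select() overwrites its sets, so start from a fresh copy. */
        readfds = master;

        if (select(fd + 1, &readfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal handler: retry */
            return -1;
        }

        if (FD_ISSET(fd, &readfds)) {
            ssize_t n = read(fd, buf, sizeof(buf));

            if (n == -1)
                return -1;
            if (n == 0)
                return total;   /* end-of-file also counts as "ready" */
            total += n;
        }
    }
}
```

       For example, calling drain_with_select() on the read end of a pipe
       whose write end has been closed returns the number of bytes that were
       still buffered in the pipe.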

       The contents of a file descriptor set can be manipulated using the fol‐
       lowing macros:

       FD_ZERO()
              This macro clears (removes all file descriptors from)  set.   It
              should  be employed as the first step in initializing a file de‐
              scriptor set.

       FD_SET()
              This macro adds the file descriptor fd to set.   Adding  a  file
              descriptor  that  is  already present in the set is a no-op, and
              does not produce an error.

       FD_CLR()
              This macro removes the file descriptor fd from set.  Removing  a
              file  descriptor  that is not present in the set is a no-op, and
              does not produce an error.

       FD_ISSET()
              select() modifies the contents of  the  sets  according  to  the
              rules  described  below.  After calling select(), the FD_ISSET()
              macro can be used to test if a file descriptor is still  present
              in  a set.  FD_ISSET() returns nonzero if the file descriptor fd
              is present in set, and zero if it is not.
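
              The behavior of the four macros can be checked with a short
              sketch.  The function name fd_set_macros_demo() is
              hypothetical; it uses STDIN_FILENO only because that
              descriptor is normally open, and it does not call select().

```c
#include <stdbool.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical demo: returns true if the macros behave as described. */
bool fd_set_macros_demo(void)
{
    fd_set set;

    FD_ZERO(&set);                      /* always initialize first */
    if (FD_ISSET(STDIN_FILENO, &set))
        return false;                   /* a cleared set holds nothing */

    FD_SET(STDIN_FILENO, &set);
    FD_SET(STDIN_FILENO, &set);         /* adding twice is a no-op */
    if (!FD_ISSET(STDIN_FILENO, &set))
        return false;

    FD_CLR(STDIN_FILENO, &set);
    FD_CLR(STDIN_FILENO, &set);         /* removing twice is a no-op */
    return !FD_ISSET(STDIN_FILENO, &set);
}
```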

   Arguments
       The arguments of select() are as follows:

       readfds
              The file descriptors in this set are watched to see if they  are
              ready  for reading.  A file descriptor is ready for reading if a
              read operation will not block; in particular, a file  descriptor
              is also ready on end-of-file.

              After select() has returned, readfds will be cleared of all file
              descriptors except for those that are ready for reading.

       writefds
              The file descriptors in this set are watched to see if they  are
              ready  for writing.  A file descriptor is ready for writing if a
              write operation will not block.  However, even if a file
              descriptor is reported as writable, a large write may still
              block.

              After  select()  has  returned,  writefds will be cleared of all
              file descriptors except for those that are ready for writing.

       exceptfds
              The file descriptors in this set are  watched  for  "exceptional
              conditions".   For  examples of some exceptional conditions, see
              the discussion of POLLPRI in poll(2).

              After select() has returned, exceptfds will be  cleared  of  all
              file  descriptors except for those for which an exceptional con‐
              dition has occurred.

       nfds   This argument should be set to  the  highest-numbered  file  de‐
              scriptor  in  any of the three sets, plus 1.  The indicated file
              descriptors in each set are checked, up to this limit  (but  see
              BUGS).
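
              One way to compute nfds while filling a set is sketched
              below.  The helper build_fd_set() is hypothetical; it also
              rejects descriptors that cannot be represented in an fd_set
              (see NOTES).

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: store the descriptors fds[0..count-1] in *set and
 * return the matching nfds value (highest descriptor plus 1).
 * Returns -1 if a descriptor is negative or not less than FD_SETSIZE.
 */
int build_fd_set(const int *fds, size_t count, fd_set *set)
{
    int maxfd = -1;

    FD_ZERO(set);
    for (size_t i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;          /* FD_SET() would be undefined behavior */
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;           /* 0 for an empty set */
}
```

              When several sets are in use, the value passed to select()
              would be the largest of the per-set results.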

       timeout
              The  timeout  argument is a timeval structure (shown below) that
              specifies the interval that select() should block waiting for  a
              file  descriptor to become ready.  The call will block until ei‐
              ther:

              • a file descriptor becomes ready;

              • the call is interrupted by a signal handler; or

              • the timeout expires.

              Note that the timeout interval will be rounded up to the  system
              clock  granularity,  and  kernel scheduling delays mean that the
              blocking interval may overrun by a small amount.

              If both fields of the timeval structure are zero, then  select()
              returns immediately.  (This is useful for polling.)

              If  timeout  is  specified as NULL, select() blocks indefinitely
              waiting for a file descriptor to become ready.
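
              The zero-timeout (polling) case might look like the following
              sketch.  The function name readable_now() is hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: report whether fd is ready for reading right now.
 * Returns 1 if ready, 0 if not, and -1 on error.
 */
int readable_now(int fd)
{
    fd_set readfds;
    struct timeval tv;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Both fields zero: select() returns immediately. */
    tv.tv_sec = 0;
    tv.tv_usec = 0;

    if (select(fd + 1, &readfds, NULL, NULL, &tv) == -1)
        return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```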

   pselect()
       The pselect() system call allows an application to  safely  wait  until
       either a file descriptor becomes ready or until a signal is caught.

       The  operation of select() and pselect() is identical, other than these
       three differences:

       • select() uses a timeout that is a struct timeval  (with  seconds  and
         microseconds),  while  pselect() uses a struct timespec (with seconds
         and nanoseconds).

       • select() may update the timeout argument to indicate  how  much  time
         was left.  pselect() does not change this argument.

       • select()  has  no  sigmask  argument, and behaves as pselect() called
         with NULL sigmask.

       sigmask is a pointer to a signal mask (see sigprocmask(2));  if  it  is
       not  NULL, then pselect() first replaces the current signal mask by the
       one pointed to by sigmask, then does the "select"  function,  and  then
       restores  the  original  signal  mask.  (If sigmask is NULL, the signal
       mask is not modified during the pselect() call.)

       Other than the difference in the precision of the timeout argument, the
       following pselect() call:

           ready = pselect(nfds, &readfds, &writefds, &exceptfds,
                           timeout, &sigmask);

       is equivalent to atomically executing the following calls:

           sigset_t origmask;

           pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
           ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
           pthread_sigmask(SIG_SETMASK, &origmask, NULL);

       The  reason  that  pselect() is needed is that if one wants to wait for
       either a signal or for a file  descriptor  to  become  ready,  then  an
       atomic  test is needed to prevent race conditions.  (Suppose the signal
       handler sets a global flag and returns.  Then a  test  of  this  global
       flag followed by a call of select() could hang indefinitely if the sig‐
       nal arrived just after the test but just before the call.  By contrast,
       pselect()  allows  one  to first block signals, handle the signals that
       have come in, then call pselect() with the  desired  sigmask,  avoiding
       the race.)
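
       The block-then-pselect() pattern described above can be sketched as
       follows.  The names got_signal, on_signal(), and wait_fd_or_signal()
       are hypothetical, and the sketch assumes that the signal is not
       already blocked when the function is called, so that the saved mask
       lets the signal through during pselect().

```c
#define _POSIX_C_SOURCE 200809L  /* any value >= 200112L exposes pselect() */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>

static volatile sig_atomic_t got_signal;    /* set by the handler */

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/*
 * Hypothetical helper: wait until fd is readable or signo is caught.
 * Returns 1 if fd is readable, 0 if the signal came first, -1 on error.
 */
int wait_fd_or_signal(int fd, int signo)
{
    struct sigaction sa;
    sigset_t blocked, origmask;
    fd_set readfds;
    int result;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) == -1)
        return -1;

    /* Block signo so that it can be delivered only inside pselect(). */
    sigemptyset(&blocked);
    sigaddset(&blocked, signo);
    if (sigprocmask(SIG_BLOCK, &blocked, &origmask) == -1)
        return -1;

    for (;;) {
        if (got_signal) {       /* safe to test: signo is blocked here */
            got_signal = 0;
            result = 0;
            break;
        }

        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        /* origmask is in effect only for the duration of the call. */
        int ready = pselect(fd + 1, &readfds, NULL, NULL, NULL, &origmask);
        if (ready > 0) {
            result = 1;
            break;
        }
        if (ready == -1 && errno != EINTR) {
            result = -1;
            break;
        }
    }

    sigprocmask(SIG_SETMASK, &origmask, NULL);
    return result;
}
```

       Because the flag is tested only while the signal is blocked, a signal
       that arrives between the test and the call stays pending until
       pselect() installs origmask, at which point the call fails with EINTR
       instead of sleeping.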

   The timeout
       The timeout argument for select() is a structure of the following type:

           struct timeval {
               time_t      tv_sec;         /* seconds */
               suseconds_t tv_usec;        /* microseconds */
           };

       The corresponding argument for pselect() has the following type:

           struct timespec {
               time_t      tv_sec;         /* seconds */
               long        tv_nsec;        /* nanoseconds */
           };

       On  Linux,  select() modifies timeout to reflect the amount of time not
       slept; most other implementations do not do this.  (POSIX.1 permits ei‐
       ther  behavior.)  This causes problems both when Linux code which reads
       timeout is ported to other operating systems, and when code  is  ported
       to  Linux that reuses a struct timeval for multiple select()s in a loop
       without reinitializing it.  Consider timeout to be undefined after  se‐
       lect() returns.
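
       A portable loop therefore sets both fields of the timeout before
       every call.  The following is a minimal sketch; the function name
       count_idle_intervals() is hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: wait up to `rounds` intervals of interval_usec
 * microseconds each for fd to become readable.  Returns the number of
 * intervals that expired with no input, or -1 on error.
 */
int count_idle_intervals(int fd, int rounds, long interval_usec)
{
    int idle = 0;

    if (fd < 0 || fd >= FD_SETSIZE || interval_usec < 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        fd_set readfds;
        struct timeval tv;

        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        /* Set both fields on every pass; never reuse the old contents. */
        tv.tv_sec = interval_usec / 1000000;
        tv.tv_usec = interval_usec % 1000000;

        int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1)
            return -1;
        if (ready > 0)
            break;              /* input arrived: stop waiting */
        idle++;                 /* the timeout expired */
    }
    return idle;
}
```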

RETURN VALUE

       On  success,  select() and pselect() return the number of file descrip‐
       tors contained in the three returned descriptor sets (that is, the
       total number of bits that are set in readfds, writefds, and
       exceptfds).  The return value may be zero if the timeout expired
       before any file descriptors became ready.

       On  error,  -1 is returned, and errno is set to indicate the error; the
       file descriptor sets are unmodified, and timeout becomes undefined.

ERRORS

       EBADF  An invalid file descriptor was given in one of the sets.   (Per‐
              haps  a file descriptor that was already closed, or one on which
              an error has occurred.)  However, see BUGS.

       EINTR  A signal was caught; see signal(7).

       EINVAL nfds is negative or exceeds  the  RLIMIT_NOFILE  resource  limit
              (see getrlimit(2)).

       EINVAL The value contained within timeout is invalid.

       ENOMEM Unable to allocate memory for internal tables.

VERSIONS

       pselect()  was  added  to  Linux in kernel 2.6.16.  Prior to this, pse‐
       lect() was emulated in glibc (but see BUGS).

CONFORMING TO

       select() conforms to POSIX.1-2001, POSIX.1-2008, and  4.4BSD  (select()
       first  appeared in 4.2BSD).  Generally portable to/from non-BSD systems
       supporting clones of the BSD socket  layer  (including  System V  vari‐
       ants).   However,  note  that  the  System V variant typically sets the
       timeout variable before returning, but the BSD variant does not.

       pselect() is defined in POSIX.1g, and in POSIX.1-2001 and POSIX.1-2008.

NOTES

       An fd_set is a fixed size buffer.  Executing FD_CLR() or FD_SET()  with
       a value of fd that is negative or is equal to or larger than FD_SETSIZE
       will result in undefined behavior.  Moreover, POSIX requires fd to be a
       valid file descriptor.

       The  operation  of select() and pselect() is not affected by the O_NON‐
       BLOCK flag.

       On some other UNIX systems, select() can fail with the error EAGAIN  if
       the  system  fails  to  allocate kernel-internal resources, rather than
       ENOMEM as Linux does.  POSIX specifies this error for poll(2), but  not
       for select().  Portable programs may wish to check for EAGAIN and loop,
       just as with EINTR.
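
       Such a retry loop might be sketched as below.  The wrapper name
       select_retry() is hypothetical.  It relies on the sets being left
       unmodified when select() fails (see RETURN VALUE), and it restarts
       the full interval on every retry, which is a simplification that may
       lengthen the total wait.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical wrapper: call select(), retrying on EINTR and EAGAIN.
 * The caller's timeout is never modified.
 */
int select_retry(int nfds, fd_set *readfds, fd_set *writefds,
                 fd_set *exceptfds, const struct timeval *timeout)
{
    for (;;) {
        struct timeval tv;
        struct timeval *tvp = NULL;     /* NULL means block indefinitely */

        if (timeout != NULL) {
            tv = *timeout;              /* fresh copy for each attempt */
            tvp = &tv;
        }

        int ready = select(nfds, readfds, writefds, exceptfds, tvp);
        if (ready == -1 && (errno == EINTR || errno == EAGAIN))
            continue;                   /* transient failure: try again */
        return ready;
    }
}
```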

   The self-pipe trick
       On systems that lack pselect(), reliable  (and  more  portable)  signal
       trapping can be achieved using the self-pipe trick.  In this technique,
       a signal handler writes a byte to a pipe whose other end  is  monitored
       by  select()  in  the  main  program.  (To avoid possibly blocking when
       writing to a pipe that may be full or reading from a pipe that  may  be
       empty,  nonblocking  I/O  is  used when reading from and writing to the
       pipe.)
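
       A minimal sketch of the technique follows.  All names in it
       (self_pipe, self_pipe_handler(), self_pipe_install(),
       self_pipe_wait()) are hypothetical, and a real program would
       normally monitor other descriptors in the same select() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int self_pipe[2] = { -1, -1 };   /* [0] read end, [1] write end */

static void self_pipe_handler(int sig)
{
    int saved_errno = errno;            /* write() may change errno */

    (void)sig;
    if (write(self_pipe[1], "x", 1) == -1) {
        /* Pipe full (EAGAIN): a wakeup byte is already queued. */
    }
    errno = saved_errno;
}

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe and install the handler for signo; 0 on success. */
int self_pipe_install(int signo)
{
    struct sigaction sa;

    if (pipe(self_pipe) == -1)
        return -1;
    if (self_pipe[0] >= FD_SETSIZE ||
        set_nonblocking(self_pipe[0]) == -1 ||
        set_nonblocking(self_pipe[1]) == -1)
        return -1;

    sa.sa_handler = self_pipe_handler;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Block until the handler has run; returns 1, or -1 on error. */
int self_pipe_wait(void)
{
    fd_set readfds;
    char buf[16];

    for (;;) {
        FD_ZERO(&readfds);
        FD_SET(self_pipe[0], &readfds);

        if (select(self_pipe[0] + 1, &readfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;       /* the byte is in the pipe: look again */
            return -1;
        }
        if (FD_ISSET(self_pipe[0], &readfds)) {
            /* Drain every queued byte; the read end is nonblocking. */
            while (read(self_pipe[0], buf, sizeof(buf)) > 0)
                continue;
            return 1;
        }
    }
}
```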

   Emulating usleep(3)
       Before the advent of usleep(3), some code employed a call  to  select()
       with  all  three  sets  empty,  nfds  zero, and a non-NULL timeout as a
       fairly portable way to sleep with subsecond precision.
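
       The idiom can be sketched as follows; the function name sleep_usec()
       is hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: sleep for usec microseconds using select() with
 * no file descriptors.  Returns 0 on success, -1 on error (for example,
 * EINTR if a signal handler interrupted the sleep).
 */
int sleep_usec(long usec)
{
    struct timeval tv;

    if (usec < 0)
        return -1;

    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;

    /* nfds is 0 and all three sets are NULL: only the timeout matters. */
    return select(0, NULL, NULL, NULL, &tv);
}
```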

   Correspondence between select() and poll() notifications
       Within the Linux kernel source, we find the following definitions which
       show the correspondence between the readable, writable, and exceptional
       condition notifications of select() and the  event  notifications  pro‐
       vided by poll(2) and epoll(7):

           #define POLLIN_SET  (EPOLLRDNORM | EPOLLRDBAND | EPOLLIN |
                                EPOLLHUP | EPOLLERR)
                              /* Ready for reading */
           #define POLLOUT_SET (EPOLLWRBAND | EPOLLWRNORM | EPOLLOUT |
                                EPOLLERR)
                              /* Ready for writing */
           #define POLLEX_SET  (EPOLLPRI)
                              /* Exceptional condition */

   Multithreaded applications
       If  a  file descriptor being monitored by select() is closed in another
       thread, the result is unspecified.  On some UNIX systems, select()  un‐
       blocks  and  returns,  with  an  indication that the file descriptor is
       ready (a subsequent I/O operation will likely fail with an  error,  un‐
       less another process reopens the file descriptor between the time
       select() returned and the I/O operation is performed).  On Linux (and
       some other systems), closing the file descriptor in another thread has
       no effect on select().  In summary, any application that relies on a
       particular behavior in this scenario must be considered buggy.
   C library/kernel differences
       The  Linux kernel allows file descriptor sets of arbitrary size, deter‐
       mining the length of the sets to be checked from  the  value  of  nfds.
       However, in the glibc implementation, the fd_set type is fixed in size.
       See also BUGS.

       The pselect() interface described in this page is implemented by glibc.
       The underlying Linux system call is named pselect6().  This system call
       has somewhat different behavior from the glibc wrapper function.

       The Linux pselect6() system call modifies its timeout  argument.   How‐
       ever,  the  glibc wrapper function hides this behavior by using a local
       variable for the timeout argument that is passed to  the  system  call.
       Thus,  the  glibc  pselect() function does not modify its timeout argu‐
       ment; this is the behavior required by POSIX.1-2001.

       The final argument of the pselect6() system call is  not  a  sigset_t *
       pointer, but is instead a structure of the form:

           struct {
               const kernel_sigset_t *ss;   /* Pointer to signal set */
               size_t ss_len;               /* Size (in bytes) of object
                                               pointed to by 'ss' */
           };

       This  allows the system call to obtain both a pointer to the signal set
       and its size, while allowing for the fact that most architectures  sup‐
       port a maximum of 6 arguments to a system call.  See sigprocmask(2) for
       a discussion of the difference between the kernel and  libc  notion  of
       the signal set.

   Historical glibc details
       Glibc  2.0 provided an incorrect version of pselect() that did not take
       a sigmask argument.

       In glibc versions 2.1 to 2.2.1, one must define _GNU_SOURCE in order to
       obtain the declaration of pselect() from <sys/select.h>.

BUGS

       POSIX allows an implementation to define an upper limit, advertised via
       the constant FD_SETSIZE, on the range of file descriptors that  can  be
       specified  in a file descriptor set.  The Linux kernel imposes no fixed
       limit, but the glibc implementation makes  fd_set  a  fixed-size  type,
       with  FD_SETSIZE  defined  as 1024, and the FD_*() macros operating ac‐
       cording to that limit.  To monitor file descriptors greater than  1023,
       use poll(2) or epoll(7) instead.
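
       As an illustration of the poll(2) alternative, the sketch below waits
       for a single descriptor to become readable without any FD_SETSIZE
       restriction.  The function name wait_readable_poll() is hypothetical;
       treating POLLHUP and POLLERR as "readable" mirrors the POLLIN_SET
       definition shown in NOTES.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds (a negative
 * value means no timeout) for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ret;

    pfd.fd = fd;                /* may be 1024 or greater */
    pfd.events = POLLIN;
    pfd.revents = 0;

    ret = poll(&pfd, 1, timeout_ms);
    if (ret <= 0)
        return ret;             /* 0: timed out; -1: error */
    if (pfd.revents & POLLNVAL)
        return -1;              /* fd is not an open descriptor */
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : 0;
}
```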

       The implementation of the fd_set arguments as value-result arguments is
       a design error that is avoided in poll(2) and epoll(7).

       According to POSIX, select() should check all specified  file  descrip‐
       tors  in  the three file descriptor sets, up to the limit nfds-1.  How‐
       ever, the current implementation ignores any file descriptor  in  these
       sets  that  is greater than the maximum file descriptor number that the
       process currently has open.  According to POSIX, any such file descrip‐
       tor  that  is  specified  in one of the sets should result in the error
       EBADF.

       Starting with version 2.1, glibc provided  an  emulation  of  pselect()
       that was implemented using sigprocmask(2) and select().  This implemen‐
       tation remained vulnerable to the very race  condition  that  pselect()
       was  designed to prevent.  Modern versions of glibc use the (race-free)
       pselect() system call on kernels where it is provided.

       On Linux, select() may report a socket file descriptor  as  "ready  for
       reading",  while nevertheless a subsequent read blocks.  This could for
       example happen when data has arrived but upon examination has the wrong
       checksum and is discarded.  There may be other circumstances in which a
       file descriptor is spuriously reported as ready.  Thus it may be  safer
       to use O_NONBLOCK on sockets that should not block.
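
       One way to apply this advice is sketched below: the descriptor is
       switched to nonblocking mode, and a read that would have blocked is
       reported separately from a genuine error.  The helper names
       set_nonblocking() and read_ready(), and the -2 return convention,
       are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: enable O_NONBLOCK on fd; returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: read after select() reported fd as ready.
 * Returns the byte count (0 at end-of-file), -1 on error, or -2 if the
 * readiness report was spurious and the read would have blocked.
 */
ssize_t read_ready(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;              /* nothing to read after all */
    return n;
}
```

       A caller that receives -2 would simply go back to select() and wait
       again.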

       On  Linux, select() also modifies timeout if the call is interrupted by
       a signal handler (i.e., the EINTR error return).  This is not permitted
       by POSIX.1.  The Linux pselect() system call has the same behavior, but
       the glibc wrapper hides this behavior by internally copying the timeout
       to a local variable and passing that variable to the system call.

EXAMPLES

       #include <stdio.h>
       #include <stdlib.h>
       #include <sys/select.h>

       int
       main(void)
       {
           fd_set rfds;
           struct timeval tv;
           int retval;

           /* Watch stdin (fd 0) to see when it has input. */

           FD_ZERO(&rfds);
           FD_SET(0, &rfds);

           /* Wait up to five seconds. */

           tv.tv_sec = 5;
           tv.tv_usec = 0;

           retval = select(1, &rfds, NULL, NULL, &tv);
           /* Don't rely on the value of tv now! */

           if (retval == -1) {
               perror("select()");
               exit(EXIT_FAILURE);
           } else if (retval) {
               /* FD_ISSET(0, &rfds) will be true. */
               printf("Data is available now.\n");
           } else {
               printf("No data within five seconds.\n");
           }

           exit(EXIT_SUCCESS);
       }

COLOPHON

       This page is part of release 5.13 of the Linux man-pages project.  A
       description of the project, information about reporting bugs, and the
       latest version of this page can be found at
       https://www.kernel.org/doc/man-pages/.

Linux                             2021-03-22                         SELECT(2)