select(2)                  System Calls Manual                  select(2)

NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO, fd_set - synchronous
I/O multiplexing

LIBRARY
Standard C library (libc, -lc)

SYNOPSIS
    #include <sys/select.h>

    typedef /* ... */ fd_set;

    int select(int nfds, fd_set *_Nullable restrict readfds,
               fd_set *_Nullable restrict writefds,
               fd_set *_Nullable restrict exceptfds,
               struct timeval *_Nullable restrict timeout);

    void FD_CLR(int fd, fd_set *set);
    int  FD_ISSET(int fd, fd_set *set);
    void FD_SET(int fd, fd_set *set);
    void FD_ZERO(fd_set *set);

    int pselect(int nfds, fd_set *_Nullable restrict readfds,
                fd_set *_Nullable restrict writefds,
                fd_set *_Nullable restrict exceptfds,
                const struct timespec *_Nullable restrict timeout,
                const sigset_t *_Nullable restrict sigmask);

Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

    pselect():
        _POSIX_C_SOURCE >= 200112L

DESCRIPTION
WARNING: select() can monitor only file descriptor numbers that are
less than FD_SETSIZE (1024), an unreasonably low limit for many modern
applications, and this limitation will not change. All modern
applications should instead use poll(2) or epoll(7), which do not
suffer this limitation.

select() allows a program to monitor multiple file descriptors, waiting
until one or more of the file descriptors become "ready" for some class
of I/O operation (e.g., input possible). A file descriptor is
considered ready if it is possible to perform a corresponding I/O
operation (e.g., read(2), or a sufficiently small write(2)) without
blocking.

fd_set
A structure type that can represent a set of file descriptors.
According to POSIX, the maximum number of file descriptors in an fd_set
structure is the value of the macro FD_SETSIZE.

File descriptor sets
The principal arguments of select() are three "sets" of file
descriptors (declared with the type fd_set), which allow the caller to
wait for three classes of events on the specified set of file
descriptors. Each of the fd_set arguments may be specified as NULL if
no file descriptors are to be watched for the corresponding class of
events.

Note well: Upon return, each of the file descriptor sets is modified in
place to indicate which file descriptors are currently "ready". Thus,
if using select() within a loop, the sets must be reinitialized before
each call.
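
The reinitialization rule can be sketched as follows. This is only an
illustration under stated assumptions: drain_ready() is a hypothetical
helper, not part of any library. It keeps a master set that is never
passed to select() and hands select() a fresh copy on every iteration.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: calls select() up to 'rounds' times on fd and
   reads one byte whenever fd is reported readable.  Returns the number
   of bytes read, or -1 on error. */
int
drain_ready(int fd, int rounds)
{
    fd_set master, work;
    int nread = 0;

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&master);
    FD_SET(fd, &master);

    for (int i = 0; i < rounds; i++) {
        struct timeval tv = { 0, 0 };   /* zero timeout: do not block */
        char c;

        work = master;      /* select() overwrites 'work', so copy anew */
        if (select(fd + 1, &work, NULL, NULL, &tv) == -1)
            return -1;
        if (FD_ISSET(fd, &work) && read(fd, &c, 1) == 1)
            nread++;
    }
    return nread;
}
```

Copying the master set by plain structure assignment is a common idiom;
rebuilding the set with FD_ZERO() and FD_SET() before each call works
equally well.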

The contents of a file descriptor set can be manipulated using the
following macros:

FD_ZERO()
    This macro clears (removes all file descriptors from) set. It
    should be employed as the first step in initializing a file
    descriptor set.

FD_SET()
    This macro adds the file descriptor fd to set. Adding a file
    descriptor that is already present in the set is a no-op, and
    does not produce an error.

FD_CLR()
    This macro removes the file descriptor fd from set. Removing a
    file descriptor that is not present in the set is a no-op, and
    does not produce an error.

FD_ISSET()
    select() modifies the contents of the sets according to the
    rules described below. After calling select(), the FD_ISSET()
    macro can be used to test if a file descriptor is still present
    in a set. FD_ISSET() returns nonzero if the file descriptor fd
    is present in set, and zero if it is not.
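
A minimal sketch of how the macros fit together, assuming a
hypothetical helper named fd_set_build() (not part of any library)
that fills a set from an array and reports a value usable as the nfds
argument of select():

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: initializes 'set' from an array of descriptors.
   Returns the highest descriptor added plus 1, or 0 if the array is
   empty. */
int
fd_set_build(fd_set *set, const int *fds, size_t count)
{
    int nfds = 0;

    FD_ZERO(set);                   /* always start from an empty set */
    for (size_t i = 0; i < count; i++) {
        /* Values outside [0, FD_SETSIZE) must never reach FD_SET(). */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);        /* adding a duplicate is a no-op */
        if (fds[i] >= nfds)
            nfds = fds[i] + 1;
    }
    return nfds;
}
```

After the call, FD_ISSET() reports membership, and FD_CLR() can remove
a descriptor again without affecting the others.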

Arguments
The arguments of select() are as follows:

readfds
    The file descriptors in this set are watched to see if they are
    ready for reading. A file descriptor is ready for reading if a
    read operation will not block; in particular, a file descriptor
    is also ready on end-of-file.

    After select() has returned, readfds will be cleared of all file
    descriptors except for those that are ready for reading.

writefds
    The file descriptors in this set are watched to see if they are
    ready for writing. A file descriptor is ready for writing if a
    write operation will not block. However, even if a file
    descriptor is reported as writable, a large write may still
    block.

    After select() has returned, writefds will be cleared of all
    file descriptors except for those that are ready for writing.

exceptfds
    The file descriptors in this set are watched for "exceptional
    conditions". For examples of some exceptional conditions, see
    the discussion of POLLPRI in poll(2).

    After select() has returned, exceptfds will be cleared of all
    file descriptors except for those for which an exceptional
    condition has occurred.

nfds
    This argument should be set to the highest-numbered file
    descriptor in any of the three sets, plus 1. The indicated file
    descriptors in each set are checked, up to this limit (but see
    BUGS).

timeout
    The timeout argument is a timeval structure (shown below) that
    specifies the interval that select() should block waiting for a
    file descriptor to become ready. The call will block until
    either:

    • a file descriptor becomes ready;

    • the call is interrupted by a signal handler; or

    • the timeout expires.

    Note that the timeout interval will be rounded up to the system
    clock granularity, and kernel scheduling delays mean that the
    blocking interval may overrun by a small amount.

    If both fields of the timeval structure are zero, then select()
    returns immediately. (This is useful for polling.)

    If timeout is specified as NULL, select() blocks indefinitely
    waiting for a file descriptor to become ready.
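
The three timeout cases and the nfds rule can be sketched with a
hypothetical helper, read_with_timeout() (an assumption of this
example, not an existing interface). Reporting a timeout through
ETIMEDOUT is likewise a choice made only for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: waits up to 'ms' milliseconds for fd to become
   readable, then reads up to 'len' bytes.  A negative 'ms' waits
   indefinitely (NULL timeout); zero polls without blocking.  Returns
   the read(2) result, or -1 with errno set to ETIMEDOUT on timeout. */
ssize_t
read_with_timeout(int fd, void *buf, size_t len, long ms)
{
    fd_set rfds;
    struct timeval tv, *tvp = NULL;
    int ready;

    assert(fd >= 0 && fd < FD_SETSIZE);

    if (ms >= 0) {
        tv.tv_sec = ms / 1000;
        tv.tv_usec = (ms % 1000) * 1000;
        tvp = &tv;
    }

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* Only one descriptor is watched, so nfds is fd + 1. */
    ready = select(fd + 1, &rfds, NULL, NULL, tvp);
    if (ready == -1)
        return -1;                  /* errno was set by select() */
    if (ready == 0) {
        errno = ETIMEDOUT;          /* interval expired, nothing ready */
        return -1;
    }
    return read(fd, buf, len);
}
```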

pselect()
The pselect() system call allows an application to safely wait until
either a file descriptor becomes ready or a signal is caught.

The operation of select() and pselect() is identical, other than these
three differences:

• select() uses a timeout that is a struct timeval (with seconds and
  microseconds), while pselect() uses a struct timespec (with seconds
  and nanoseconds).

• select() may update the timeout argument to indicate how much time
  was left. pselect() does not change this argument.

• select() has no sigmask argument, and behaves as pselect() called
  with NULL sigmask.

sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is
not NULL, then pselect() first replaces the current signal mask by the
one pointed to by sigmask, then does the "select" function, and then
restores the original signal mask. (If sigmask is NULL, the signal
mask is not modified during the pselect() call.)

Other than the difference in the precision of the timeout argument, the
following pselect() call:

    ready = pselect(nfds, &readfds, &writefds, &exceptfds,
                    timeout, &sigmask);

is equivalent to atomically executing the following calls:

    sigset_t origmask;

    pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
    ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
    pthread_sigmask(SIG_SETMASK, &origmask, NULL);

The reason that pselect() is needed is that if one wants to wait either
for a signal or for a file descriptor to become ready, then an atomic
test is needed to prevent race conditions. (Suppose the signal handler
sets a global flag and returns. Then a test of this global flag
followed by a call of select() could hang indefinitely if the signal
arrived just after the test but just before the call. By contrast,
pselect() allows one to first block signals, handle the signals that
have come in, then call pselect() with the desired sigmask, avoiding
the race.)
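
The block, test, then wait sequence described above might look like
the following sketch. The helper wait_fd_or_sigusr1(), the handler,
the flag variable, and the choice of SIGUSR1 are all assumptions made
for illustration; a real program would usually install its handler
once at startup rather than on every call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigusr1;

static void
on_sigusr1(int sig)
{
    (void) sig;
    got_sigusr1 = 1;
}

/* Hypothetical helper: waits until fd is readable or SIGUSR1 has been
   caught.  Returns 1 if fd is readable, 0 if SIGUSR1 was caught, and
   -1 on error. */
int
wait_fd_or_sigusr1(int fd)
{
    struct sigaction sa = { .sa_handler = on_sigusr1 };
    sigset_t block, orig, during;
    fd_set rfds;
    int ready = 0, result;

    assert(fd >= 0 && fd < FD_SETSIZE);

    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    /* Block SIGUSR1 so that it cannot slip in between the flag test
       and the start of the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) == -1)
        return -1;

    /* Mask in effect only while pselect() waits: SIGUSR1 unblocked. */
    during = orig;
    sigdelset(&during, SIGUSR1);

    if (!got_sigusr1) {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        ready = pselect(fd + 1, &rfds, NULL, NULL, NULL, &during);
    }

    if (got_sigusr1) {
        got_sigusr1 = 0;            /* consume the notification */
        result = 0;
    } else {
        result = (ready > 0) ? 1 : -1;
    }

    sigprocmask(SIG_SETMASK, &orig, NULL);
    return result;
}
```

Because SIGUSR1 stays blocked everywhere except inside pselect(), a
signal that arrives after the flag test remains pending and interrupts
the wait instead of being lost.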

The timeout
The timeout argument for select() is a structure of the following type:

    struct timeval {
        time_t      tv_sec;         /* seconds */
        suseconds_t tv_usec;        /* microseconds */
    };

The corresponding argument for pselect() is a timespec(3) structure.

On Linux, select() modifies timeout to reflect the amount of time not
slept; most other implementations do not do this. (POSIX.1 permits
either behavior.) This causes problems both when Linux code which
reads timeout is ported to other operating systems, and when code is
ported to Linux that reuses a struct timeval for multiple select()s in
a loop without reinitializing it. Consider timeout to be undefined
after select() returns.

RETURN VALUE
On success, select() and pselect() return the number of file
descriptors contained in the three returned descriptor sets (that is,
the total number of bits that are set in readfds, writefds, and
exceptfds). The return value may be zero if the timeout expired before
any file descriptors became ready.

On error, -1 is returned, and errno is set to indicate the error; the
file descriptor sets are unmodified, and timeout becomes undefined.

ERRORS
EBADF   An invalid file descriptor was given in one of the sets.
        (Perhaps a file descriptor that was already closed, or one on
        which an error has occurred.) However, see BUGS.

EINTR   A signal was caught; see signal(7).

EINVAL  nfds is negative or exceeds the RLIMIT_NOFILE resource limit
        (see getrlimit(2)).

EINVAL  The value contained within timeout is invalid.

ENOMEM  Unable to allocate memory for internal tables.

VERSIONS
On some other UNIX systems, select() can fail with the error EAGAIN if
the system fails to allocate kernel-internal resources, rather than
ENOMEM as Linux does. POSIX specifies this error for poll(2), but not
for select(). Portable programs may wish to check for EAGAIN and loop,
just as with EINTR.
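
One way to write such a loop is sketched below. select_retry() is a
hypothetical wrapper, limited to a read set for brevity; it works on
copies so that a failed attempt leaves the caller's set and timeout
untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: like select() for a read set only, but restarts
   the call on EINTR and EAGAIN.  The caller's set is updated only on
   success; the caller's timeout is never modified. */
int
select_retry(int nfds, fd_set *readfds, const struct timeval *timeout)
{
    assert(readfds != NULL);

    for (;;) {
        fd_set work = *readfds;     /* fresh copy for each attempt */
        struct timeval tv, *tvp = NULL;
        int ready;

        if (timeout != NULL) {
            tv = *timeout;          /* restart with the full interval */
            tvp = &tv;
        }

        ready = select(nfds, &work, NULL, NULL, tvp);
        if (ready >= 0) {
            *readfds = work;
            return ready;
        }
        if (errno != EINTR && errno != EAGAIN)
            return -1;
    }
}
```

Restarting with the full interval keeps the sketch short, but the total
wait can then exceed the requested timeout when signals arrive
repeatedly. Code that needs a firm upper bound would track a deadline
with a clock instead.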

STANDARDS
POSIX.1-2008.

HISTORY
select()
    POSIX.1-2001, 4.4BSD (first appeared in 4.2BSD).

    Generally portable to/from non-BSD systems supporting clones of
    the BSD socket layer (including System V variants). However,
    note that the System V variant typically sets the timeout
    variable before returning, but the BSD variant does not.

pselect()
    Linux 2.6.16. POSIX.1g, POSIX.1-2001.

    Prior to this, it was emulated in glibc (but see BUGS).

fd_set
    POSIX.1-2001.

NOTES
The following header also provides the fd_set type: <sys/time.h>.

An fd_set is a fixed-size buffer. Executing FD_CLR() or FD_SET() with
a value of fd that is negative or is equal to or larger than FD_SETSIZE
will result in undefined behavior. Moreover, POSIX requires fd to be a
valid file descriptor.
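
A range check before FD_SET() avoids the undefined behavior. The
following sketch assumes a hypothetical wrapper, fd_set_checked(); the
use of EINVAL to report an out-of-range descriptor is an arbitrary
choice for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: adds fd to set only if it is within the range
   that an fd_set can represent.  Returns 0 on success, or -1 with
   errno set to EINVAL otherwise. */
int
fd_set_checked(int fd, fd_set *set)
{
    assert(set != NULL);

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    FD_SET(fd, set);
    return 0;
}
```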

The operation of select() and pselect() is not affected by the
O_NONBLOCK flag.

The self-pipe trick
On systems that lack pselect(), reliable (and more portable) signal
trapping can be achieved using the self-pipe trick. In this technique,
a signal handler writes a byte to a pipe whose other end is monitored
by select() in the main program. (To avoid possibly blocking when
writing to a pipe that may be full or reading from a pipe that may be
empty, nonblocking I/O is used when reading from and writing to the
pipe.)
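
A compact sketch of the technique follows. The names self_pipe,
wake_handler(), self_pipe_init(), and wait_fd_or_signal() are all
hypothetical and chosen for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int self_pipe[2] = { -1, -1 };

static void
wake_handler(int sig)
{
    int saved_errno = errno;
    ssize_t n = write(self_pipe[1], "x", 1);

    (void) sig;
    (void) n;   /* a full pipe (EAGAIN) means a wakeup is already pending */
    errno = saved_errno;
}

/* Hypothetical setup: creates the pipe, makes both ends nonblocking,
   and installs the handler for signo.  Returns 0 or -1. */
int
self_pipe_init(int signo)
{
    struct sigaction sa = { .sa_handler = wake_handler };

    if (pipe(self_pipe) == -1)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(self_pipe[i], F_GETFL);

        if (flags == -1 ||
            fcntl(self_pipe[i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Hypothetical helper: waits until fd is readable or a signal was
   caught.  Returns 1 if a signal was caught (the pipe is drained),
   0 if only fd is readable, and -1 on error. */
int
wait_fd_or_signal(int fd)
{
    fd_set rfds;
    int nfds = (fd > self_pipe[0] ? fd : self_pipe[0]) + 1;
    char buf[64];

    assert(fd >= 0 && fd < FD_SETSIZE);
    assert(self_pipe[0] >= 0 && self_pipe[0] < FD_SETSIZE);

    for (;;) {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        FD_SET(self_pipe[0], &rfds);
        if (select(nfds, &rfds, NULL, NULL, NULL) != -1)
            break;
        if (errno != EINTR)     /* EINTR: the handler ran; look again */
            return -1;
    }

    if (!FD_ISSET(self_pipe[0], &rfds))
        return 0;
    while (read(self_pipe[0], buf, sizeof(buf)) > 0)
        continue;               /* drain until the pipe is empty */
    return 1;
}
```

The handler restricts itself to write(2), which is async-signal-safe,
and preserves errno so that the interrupted code is not disturbed.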

Emulating usleep(3)
Before the advent of usleep(3), some code employed a call to select()
with all three sets empty, nfds zero, and a non-NULL timeout as a
fairly portable way to sleep with subsecond precision.
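
The idiom amounts to the following sketch, where select_sleep() is a
hypothetical name used only for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: sleeps for about 'usec' microseconds using only
   select().  Returns 0 after the full interval, or -1 with errno set
   (for example, EINTR if a signal handler interrupted the sleep). */
int
select_sleep(long usec)
{
    struct timeval tv;

    assert(usec >= 0);

    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;
    return select(0, NULL, NULL, NULL, &tv);    /* no sets, nfds zero */
}
```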

Correspondence between select() and poll() notifications
Within the Linux kernel source, we find the following definitions which
show the correspondence between the readable, writable, and exceptional
condition notifications of select() and the event notifications
provided by poll(2) and epoll(7):

    #define POLLIN_SET  (EPOLLRDNORM | EPOLLRDBAND | EPOLLIN | \
                         EPOLLHUP | EPOLLERR)
                       /* Ready for reading */
    #define POLLOUT_SET (EPOLLWRBAND | EPOLLWRNORM | EPOLLOUT | \
                         EPOLLERR)
                       /* Ready for writing */
    #define POLLEX_SET  (EPOLLPRI)
                       /* Exceptional condition */

Multithreaded applications
If a file descriptor being monitored by select() is closed in another
thread, the result is unspecified. On some UNIX systems, select()
unblocks and returns, with an indication that the file descriptor is
ready (a subsequent I/O operation will likely fail with an error,
unless another process reopens the file descriptor between the time
select() returned and the I/O operation is performed). On Linux (and
some other systems), closing the file descriptor in another thread has
no effect on select(). In summary, any application that relies on a
particular behavior in this scenario must be considered buggy.

C library/kernel differences
The Linux kernel allows file descriptor sets of arbitrary size,
determining the length of the sets to be checked from the value of
nfds. However, in the glibc implementation, the fd_set type is fixed
in size. See also BUGS.

The pselect() interface described in this page is implemented by glibc.
The underlying Linux system call is named pselect6(). This system call
has somewhat different behavior from the glibc wrapper function.

The Linux pselect6() system call modifies its timeout argument.
However, the glibc wrapper function hides this behavior by using a
local variable for the timeout argument that is passed to the system
call. Thus, the glibc pselect() function does not modify its timeout
argument; this is the behavior required by POSIX.1-2001.

The final argument of the pselect6() system call is not a sigset_t *
pointer, but is instead a structure of the form:

    struct {
        const kernel_sigset_t *ss;   /* Pointer to signal set */
        size_t ss_len;               /* Size (in bytes) of object
                                        pointed to by 'ss' */
    };

This allows the system call to obtain both a pointer to the signal set
and its size, while allowing for the fact that most architectures
support a maximum of 6 arguments to a system call. See sigprocmask(2)
for a discussion of the difference between the kernel and libc notion
of the signal set.

Historical glibc details
glibc 2.0 provided an incorrect version of pselect() that did not take
a sigmask argument.

From glibc 2.1 to glibc 2.2.1, one must define _GNU_SOURCE in order to
obtain the declaration of pselect() from <sys/select.h>.

BUGS
POSIX allows an implementation to define an upper limit, advertised via
the constant FD_SETSIZE, on the range of file descriptors that can be
specified in a file descriptor set. The Linux kernel imposes no fixed
limit, but the glibc implementation makes fd_set a fixed-size type,
with FD_SETSIZE defined as 1024, and the FD_*() macros operating
according to that limit. To monitor file descriptors greater than
1023, use poll(2) or epoll(7) instead.

The implementation of the fd_set arguments as value-result arguments is
a design error that is avoided in poll(2) and epoll(7).

According to POSIX, select() should check all specified file
descriptors in the three file descriptor sets, up to the limit nfds-1.
However, the current implementation ignores any file descriptor in
these sets that is greater than the maximum file descriptor number that
the process currently has open. According to POSIX, any such file
descriptor that is specified in one of the sets should result in the
error EBADF.

Starting with glibc 2.1, glibc provided an emulation of pselect() that
was implemented using sigprocmask(2) and select(). This implementation
remained vulnerable to the very race condition that pselect() was
designed to prevent. Modern versions of glibc use the (race-free)
pselect() system call on kernels where it is provided.

On Linux, select() may report a socket file descriptor as "ready for
reading", while nevertheless a subsequent read blocks. This could for
example happen when data has arrived but upon examination has the wrong
checksum and is discarded. There may be other circumstances in which a
file descriptor is spuriously reported as ready. Thus it may be safer
to use O_NONBLOCK on sockets that should not block.
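
A sketch of that precaution, using two hypothetical helpers. The
return value -2 is a convention invented for this example to signal
that the descriptor was not readable after all.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: sets O_NONBLOCK on fd.  Returns 0 or -1. */
int
set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: reads from a nonblocking fd after select() has
   reported it readable.  Returns the byte count, 0 at end-of-file,
   -1 on error, or -2 if no data was available after all, in which
   case the caller should simply go back to select(). */
ssize_t
try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

With O_NONBLOCK set, a spurious readiness report costs one failed
read(2) instead of an indefinite block.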

On Linux, select() also modifies timeout if the call is interrupted by
a signal handler (i.e., the EINTR error return). This is not permitted
by POSIX.1. The Linux pselect() system call has the same behavior, but
the glibc wrapper hides this behavior by internally copying the timeout
to a local variable and passing that variable to the system call.

EXAMPLES
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/select.h>

    int
    main(void)
    {
        int             retval;
        fd_set          rfds;
        struct timeval  tv;

        /* Watch stdin (fd 0) to see when it has input. */

        FD_ZERO(&rfds);
        FD_SET(0, &rfds);

        /* Wait up to five seconds. */

        tv.tv_sec = 5;
        tv.tv_usec = 0;

        retval = select(1, &rfds, NULL, NULL, &tv);
        /* Don't rely on the value of tv now! */

        if (retval == -1)
            perror("select()");
        else if (retval)
            printf("Data is available now.\n");
            /* FD_ISSET(0, &rfds) will be true. */
        else
            printf("No data within five seconds.\n");

        exit(EXIT_SUCCESS);
    }

SEE ALSO
accept(2), connect(2), poll(2), read(2), recv(2), restart_syscall(2),
send(2), sigprocmask(2), write(2), timespec(3), epoll(7), time(7)

For a tutorial with discussion and examples, see select_tut(2).

Linux man-pages 6.04              2023-03-30                    select(2)