SELECT(2)                  Linux Programmer's Manual                 SELECT(2)

NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O
multiplexing

SYNOPSIS
    /* According to POSIX.1-2001, POSIX.1-2008 */
    #include <sys/select.h>

    /* According to earlier standards */
    #include <sys/time.h>
    #include <sys/types.h>
    #include <unistd.h>

    int select(int nfds, fd_set *readfds, fd_set *writefds,
               fd_set *exceptfds, struct timeval *timeout);

    void FD_CLR(int fd, fd_set *set);
    int  FD_ISSET(int fd, fd_set *set);
    void FD_SET(int fd, fd_set *set);
    void FD_ZERO(fd_set *set);

    #include <sys/select.h>

    int pselect(int nfds, fd_set *readfds, fd_set *writefds,
                fd_set *exceptfds, const struct timespec *timeout,
                const sigset_t *sigmask);

Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

    pselect(): _POSIX_C_SOURCE >= 200112L

DESCRIPTION
select() and pselect() allow a program to monitor multiple file
descriptors, waiting until one or more of the file descriptors become
"ready" for some class of I/O operation (e.g., input possible). A file
descriptor is considered ready if it is possible to perform a
corresponding I/O operation (e.g., read(2), or a sufficiently small
write(2)) without blocking.

select() can monitor only file descriptor numbers that are less than
FD_SETSIZE; poll(2) does not have this limitation. See BUGS.

The operation of select() and pselect() is identical, other than these
three differences:

(i)   select() uses a timeout that is a struct timeval (with seconds
      and microseconds), while pselect() uses a struct timespec (with
      seconds and nanoseconds).

(ii)  select() may update the timeout argument to indicate how much
      time was left. pselect() does not change this argument.

(iii) select() has no sigmask argument, and behaves as pselect()
      called with NULL sigmask.

Three independent sets of file descriptors are watched. The file
descriptors listed in readfds will be watched to see if characters
become available for reading (more precisely, to see if a read will not
block; in particular, a file descriptor is also ready on end-of-file).
The file descriptors in writefds will be watched to see if space is
available for write (though a large write may still block). The file
descriptors in exceptfds will be watched for exceptional conditions.
(For examples of some exceptional conditions, see the discussion of
POLLPRI in poll(2).)

On exit, each of the file descriptor sets is modified in place to
indicate which file descriptors actually changed status. (Thus, if
using select() within a loop, the sets must be reinitialized before
each call.)

Each of the three file descriptor sets may be specified as NULL if no
file descriptors are to be watched for the corresponding class of
events.

Four macros are provided to manipulate the sets. FD_ZERO() clears a
set. FD_SET() and FD_CLR() add and remove a given file descriptor from
a set. FD_ISSET() tests to see if a file descriptor is part of the
set; this is useful after select() returns.

nfds should be set to the highest-numbered file descriptor in any of
the three sets, plus 1. The indicated file descriptors in each set are
checked, up to this limit (but see BUGS).
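
As an illustration of the macros and of the nfds rule, the following
sketch fills a read set from an array of descriptors and returns the
value to pass as nfds. The helper name build_read_set() is hypothetical
and is not part of any library.

```c
#include <sys/select.h>

/*
 * Hypothetical helper: add each descriptor in fds[] to *set and return
 * the nfds value for select() (highest descriptor plus 1).
 * Returns -1 if a descriptor cannot be stored in an fd_set.
 */
static int
build_read_set(const int *fds, int count, fd_set *set)
{
    int maxfd = -1;

    FD_ZERO(set);
    for (int i = 0; i < count; i++) {
        /* FD_SET() is undefined outside the range 0..FD_SETSIZE-1. */
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;
}
```

The result can be passed directly as the first argument of select();
with an empty array it is 0, which is also a valid nfds value.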

The timeout argument specifies the interval that select() should block
waiting for a file descriptor to become ready. The call will block
until either:

*  a file descriptor becomes ready;

*  the call is interrupted by a signal handler; or

*  the timeout expires.

Note that the timeout interval will be rounded up to the system clock
granularity, and kernel scheduling delays mean that the blocking
interval may overrun by a small amount. If both fields of the timeval
structure are zero, then select() returns immediately. (This is useful
for polling.) If timeout is NULL (no timeout), select() can block
indefinitely.
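
The zero-timeout case can be sketched as follows. The function name
readable_now() is an assumption made for this illustration; it checks
one descriptor without blocking.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: report whether fd is readable right now.
 * Returns 1 if readable, 0 if not, -1 on error.
 */
static int
readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };   /* both fields zero: do not block */

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) == -1)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```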

sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is
not NULL, then pselect() first replaces the current signal mask by the
one pointed to by sigmask, then does the "select" function, and then
restores the original signal mask.

Other than the difference in the precision of the timeout argument, the
following pselect() call:

    ready = pselect(nfds, &readfds, &writefds, &exceptfds,
                    timeout, &sigmask);

is equivalent to atomically executing the following calls:

    sigset_t origmask;

    pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
    ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
    pthread_sigmask(SIG_SETMASK, &origmask, NULL);

The reason that pselect() is needed is that if one wants to wait for
either a signal or for a file descriptor to become ready, then an
atomic test is needed to prevent race conditions. (Suppose the signal
handler sets a global flag and returns. Then a test of this global
flag followed by a call of select() could hang indefinitely if the
signal arrived just after the test but just before the call. By
contrast, pselect() allows one to first block signals, handle the
signals that have come in, then call pselect() with the desired
sigmask, avoiding the race.)
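
The block-test-wait pattern described above might look like the
following sketch. The names got_signal, on_signal() and
wait_fd_or_signal() are hypothetical, and the code assumes a
single-threaded program (a multithreaded one would use
pthread_sigmask(3) instead of sigprocmask(2)).

```c
#define _POSIX_C_SOURCE 200809L     /* for pselect() and sigaction() */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>

static volatile sig_atomic_t got_signal;    /* set by on_signal() */

static void
on_signal(int sig)
{
    (void) sig;
    got_signal = 1;
}

/*
 * Hypothetical helper: wait until fd is readable or signo is caught.
 * Returns 1 if fd is readable, 0 if the signal arrived, -1 on error.
 */
static int
wait_fd_or_signal(int fd, int signo)
{
    struct sigaction sa;
    sigset_t blocked, origmask, waitmask;
    int result;

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) == -1)
        return -1;

    /* Block signo so it cannot arrive between the test and the wait. */
    sigemptyset(&blocked);
    sigaddset(&blocked, signo);
    if (sigprocmask(SIG_BLOCK, &blocked, &origmask) == -1)
        return -1;

    /* Mask in effect only while pselect() waits: signo is unblocked. */
    waitmask = origmask;
    sigdelset(&waitmask, signo);

    for (;;) {
        fd_set rfds;
        int ready;

        if (got_signal) {           /* signal already handled */
            result = 0;
            break;
        }
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        ready = pselect(fd + 1, &rfds, NULL, NULL, NULL, &waitmask);
        if (ready == -1 && errno == EINTR)
            continue;               /* a handler ran: re-test the flag */
        result = (ready == -1) ? -1 : 1;
        break;
    }

    sigprocmask(SIG_SETMASK, &origmask, NULL);
    return result;
}
```

Because the signal stays blocked everywhere except inside pselect(),
the handler can run only while the call is waiting, so the flag test
cannot miss it.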

The timeout
The time structures involved are defined in <sys/time.h> and look like:

    struct timeval {
        long    tv_sec;         /* seconds */
        long    tv_usec;        /* microseconds */
    };

and

    struct timespec {
        long    tv_sec;         /* seconds */
        long    tv_nsec;        /* nanoseconds */
    };

(However, see below on the POSIX.1 versions.)

Some code calls select() with all three sets empty, nfds zero, and a
non-NULL timeout as a fairly portable way to sleep with subsecond
precision.
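
A minimal sketch of that sleeping idiom follows; sleep_ms() is a
hypothetical name, and the millisecond argument is assumed to be
nonnegative (a negative value would produce an invalid timeout).

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: sleep for about ms milliseconds.
 * Returns 0 after the timeout expires, or -1 on error (for example,
 * EINTR if a signal handler interrupted the sleep).
 */
static int
sleep_ms(long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;

    /* No descriptors are watched, so only the timeout can end the call. */
    return select(0, NULL, NULL, NULL, &tv);
}
```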

On Linux, select() modifies timeout to reflect the amount of time not
slept; most other implementations do not do this. (POSIX.1 permits
either behavior.) This causes problems both when Linux code which
reads timeout is ported to other operating systems, and when code is
ported to Linux that reuses a struct timeval for multiple select()s in
a loop without reinitializing it. Consider timeout to be undefined
after select() returns.
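
A loop that follows this advice rebuilds both the descriptor set and
the timeout before every call. The sketch below is one way to write
it; drain_fd() is a hypothetical helper that counts the bytes
available on a descriptor until end-of-file or an idle period.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd until end-of-file, or until no
 * data arrives for idle_sec seconds.
 * Returns the number of bytes read, or -1 on error.
 */
static long
drain_fd(int fd, int idle_sec)
{
    char buf[256];
    long total = 0;

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    for (;;) {
        fd_set rfds;
        struct timeval tv;
        ssize_t n;
        int ready;

        /* select() overwrites the set, so rebuild it every time. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /* select() may overwrite the timeout, so reset it as well. */
        tv.tv_sec = idle_sec;
        tv.tv_usec = 0;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            return total;           /* idle timeout expired */

        n = read(fd, buf, sizeof(buf));
        if (n == -1)
            return -1;
        if (n == 0)
            return total;           /* end-of-file */
        total += n;
    }
}
```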

RETURN VALUE
On success, select() and pselect() return the number of file
descriptors contained in the three returned descriptor sets (that is,
the total number of bits that are set in readfds, writefds, and
exceptfds). This number may be zero if the timeout expires before any
file descriptor becomes ready. On error, -1 is returned, and errno is
set to indicate the error; the file descriptor sets are unmodified,
and timeout becomes undefined.

ERRORS
EBADF  An invalid file descriptor was given in one of the sets.
       (Perhaps a file descriptor that was already closed, or one on
       which an error has occurred.) However, see BUGS.

EINTR  A signal was caught; see signal(7).

EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit
       (see getrlimit(2)).

EINVAL The value contained within timeout is invalid.

ENOMEM Unable to allocate memory for internal tables.

VERSIONS
pselect() was added to Linux in kernel 2.6.16. Prior to this,
pselect() was emulated in glibc (but see BUGS).

CONFORMING TO
select() conforms to POSIX.1-2001, POSIX.1-2008, and 4.4BSD (select()
first appeared in 4.2BSD). It is generally portable to and from
non-BSD systems supporting clones of the BSD socket layer (including
System V variants). However, note that the System V variant typically
sets the timeout variable before exit, but the BSD variant does not.

pselect() is defined in POSIX.1g, and in POSIX.1-2001 and POSIX.1-2008.

NOTES
An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with
a value of fd that is negative or is equal to or larger than FD_SETSIZE
will result in undefined behavior. Moreover, POSIX requires fd to be a
valid file descriptor.

The operation of select() and pselect() is not affected by the
O_NONBLOCK flag.

On some other UNIX systems, select() can fail with the error EAGAIN if
the system fails to allocate kernel-internal resources, rather than
ENOMEM as Linux does. POSIX specifies this error for poll(2), but not
for select(). Portable programs may wish to check for EAGAIN and loop,
just as with EINTR.
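
Such a retry loop might be sketched as follows. The wrapper name
select_read_retry() is hypothetical. It works on private copies of the
set and the timeout so that each retry starts from the caller's
original values; note that this restarts the full timeout after every
interruption, so the total wait can exceed the requested interval.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical wrapper: wait for the descriptors in *want to become
 * readable, retrying when select() fails with EINTR or EAGAIN.
 * The result set is stored in *ready; *want and *timeout are left
 * unchanged. timeout may be NULL to wait indefinitely.
 */
static int
select_read_retry(int nfds, const fd_set *want, fd_set *ready,
                  const struct timeval *timeout)
{
    for (;;) {
        struct timeval tv;
        int n;

        *ready = *want;             /* fresh copy of the input set */
        if (timeout != NULL)
            tv = *timeout;          /* fresh copy of the timeout */

        n = select(nfds, ready, NULL, NULL,
                   timeout != NULL ? &tv : NULL);
        if (n != -1 || (errno != EINTR && errno != EAGAIN))
            return n;
    }
}
```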

On systems that lack pselect(), reliable (and more portable) signal
trapping can be achieved using the self-pipe trick. In this technique,
a signal handler writes a byte to a pipe whose other end is monitored
by select() in the main program. (To avoid possibly blocking when
writing to a pipe that may be full or reading from a pipe that may be
empty, nonblocking I/O is used when reading from and writing to the
pipe.)

Concerning the types involved, the classical situation is that the two
fields of a timeval structure are typed as long (as shown above), and
the structure is defined in <sys/time.h>. The POSIX.1 situation is

    struct timeval {
        time_t         tv_sec;     /* seconds */
        suseconds_t    tv_usec;    /* microseconds */
    };

where the structure is defined in <sys/select.h> and the data types
time_t and suseconds_t are defined in <sys/types.h>.

Concerning prototypes, the classical situation is that one should
include <time.h> for select(). The POSIX.1 situation is that one
should include <sys/select.h> for select() and pselect().

Under glibc 2.0, <sys/select.h> gives the wrong prototype for
pselect(). Under glibc 2.1 to 2.2.1, it gives pselect() when
_GNU_SOURCE is defined. Since glibc 2.2.2, the requirements are as
shown in the SYNOPSIS.

Correspondence between select() and poll() notifications
Within the Linux kernel source, we find the following definitions which
show the correspondence between the readable, writable, and exceptional
condition notifications of select() and the event notifications
provided by poll(2) and epoll(7):

    #define POLLIN_SET  (EPOLLRDNORM | EPOLLRDBAND | EPOLLIN | \
                         EPOLLHUP | EPOLLERR)
                       /* Ready for reading */
    #define POLLOUT_SET (EPOLLWRBAND | EPOLLWRNORM | EPOLLOUT | \
                         EPOLLERR)
                       /* Ready for writing */
    #define POLLEX_SET  (EPOLLPRI)
                       /* Exceptional condition */

Multithreaded applications
If a file descriptor being monitored by select() is closed in another
thread, the result is unspecified. On some UNIX systems, select()
unblocks and returns, with an indication that the file descriptor is
ready (a subsequent I/O operation will likely fail with an error,
unless another process reopens the file descriptor between the time
select() returned and the I/O operation is performed). On Linux (and
some other systems), closing the file descriptor in another thread has
no effect on select(). In summary, any application that relies on a
particular behavior in this scenario must be considered buggy.

C library/kernel differences
The Linux kernel allows file descriptor sets of arbitrary size,
determining the length of the sets to be checked from the value of
nfds. However, in the glibc implementation, the fd_set type is fixed
in size. See also BUGS.

The pselect() interface described in this page is implemented by glibc.
The underlying Linux system call is named pselect6(). This system call
has somewhat different behavior from the glibc wrapper function.

The Linux pselect6() system call modifies its timeout argument.
However, the glibc wrapper function hides this behavior by using a
local variable for the timeout argument that is passed to the system
call. Thus, the glibc pselect() function does not modify its timeout
argument; this is the behavior required by POSIX.1-2001.

The final argument of the pselect6() system call is not a sigset_t *
pointer, but is instead a structure of the form:

    struct {
        const kernel_sigset_t *ss;   /* Pointer to signal set */
        size_t ss_len;               /* Size (in bytes) of object
                                        pointed to by 'ss' */
    };

This allows the system call to obtain both a pointer to the signal set
and its size, while allowing for the fact that most architectures
support a maximum of 6 arguments to a system call. See sigprocmask(2)
for a discussion of the difference between the kernel and libc notion
of the signal set.

BUGS
POSIX allows an implementation to define an upper limit, advertised via
the constant FD_SETSIZE, on the range of file descriptors that can be
specified in a file descriptor set. The Linux kernel imposes no fixed
limit, but the glibc implementation makes fd_set a fixed-size type,
with FD_SETSIZE defined as 1024, and the FD_*() macros operating
according to that limit. To monitor file descriptors greater than
1023, use poll(2) instead.
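
One way to respect this limit is to test the descriptor against
FD_SETSIZE and fall back to poll(2) when it does not fit, as in the
sketch below. wait_readable() is a hypothetical helper; the poll(2)
branch reports any returned event (including error conditions) as
"ready", which roughly mirrors how select() treats the read set.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable. Returns 1 if ready, 0 on timeout, -1 on error.
 */
static int
wait_readable(int fd, int timeout_ms)
{
    int n;

    if (fd >= 0 && fd < FD_SETSIZE) {
        /* The descriptor fits in an fd_set, so select() is safe. */
        fd_set rfds;
        struct timeval tv;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        n = select(fd + 1, &rfds, NULL, NULL, &tv);
    } else {
        /* Too large for FD_SET(): poll() has no such limit. */
        struct pollfd pfd;

        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        n = poll(&pfd, 1, timeout_ms);
    }
    return n > 0 ? 1 : n;
}
```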

The implementation of the fd_set arguments as value-result arguments
means that they must be reinitialized on each call to select(). This
design error is avoided by poll(2), which uses separate structure
fields for the input and output of the call.

According to POSIX, select() should check all specified file
descriptors in the three file descriptor sets, up to the limit nfds-1.
However, the current implementation ignores any file descriptor in
these sets that is greater than the maximum file descriptor number
that the process currently has open. According to POSIX, any such file
descriptor that is specified in one of the sets should result in the
error EBADF.

Glibc 2.0 provided a version of pselect() that did not take a sigmask
argument.

Starting with version 2.1, glibc provided an emulation of pselect()
that was implemented using sigprocmask(2) and select(). This
implementation remained vulnerable to the very race condition that
pselect() was designed to prevent. Modern versions of glibc use the
(race-free) pselect() system call on kernels where it is provided.

Under Linux, select() may report a socket file descriptor as "ready for
reading", while nevertheless a subsequent read blocks. This could, for
example, happen when data has arrived but upon examination has the
wrong checksum and is discarded. There may be other circumstances in
which a file descriptor is spuriously reported as ready. Thus it may
be safer to use O_NONBLOCK on sockets that should not block.
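
The defensive pattern suggested above can be sketched as follows: the
descriptor is made nonblocking, and a read that fails with EAGAIN is
treated as a spurious readiness report and simply waited out again.
The helper names set_nonblocking() and read_when_ready() are
hypothetical, as is the choice of ETIMEDOUT to report a timeout.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: read from a nonblocking fd once select()
 * reports it readable. Returns the number of bytes read (0 means
 * end-of-file), or -1 with errno set; errno is ETIMEDOUT if nothing
 * arrived within timeout_sec seconds of the latest wait.
 */
static ssize_t
read_when_ready(int fd, void *buf, size_t len, int timeout_sec)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    for (;;) {
        fd_set rfds;
        struct timeval tv = { timeout_sec, 0 };
        ssize_t n;
        int ready;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0) {
            errno = ETIMEDOUT;
            return -1;
        }

        n = read(fd, buf, len);
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            continue;       /* spurious readiness: wait again */
        return n;
    }
}
```

Because the descriptor is nonblocking, read(2) can never stall the
caller, whatever select() reported.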

On Linux, select() also modifies timeout if the call is interrupted by
a signal handler (i.e., the EINTR error return). This is not permitted
by POSIX.1. The Linux pselect() system call has the same behavior, but
the glibc wrapper hides this behavior by internally copying the timeout
to a local variable and passing that variable to the system call.

EXAMPLE
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/select.h>
    #include <sys/time.h>
    #include <sys/types.h>
    #include <unistd.h>

    int
    main(void)
    {
        fd_set rfds;
        struct timeval tv;
        int retval;

        /* Watch stdin (fd 0) to see when it has input. */

        FD_ZERO(&rfds);
        FD_SET(0, &rfds);

        /* Wait up to five seconds. */

        tv.tv_sec = 5;
        tv.tv_usec = 0;

        retval = select(1, &rfds, NULL, NULL, &tv);
        /* Don't rely on the value of tv now! */

        if (retval == -1)
            perror("select()");
        else if (retval)
            printf("Data is available now.\n");
            /* FD_ISSET(0, &rfds) will be true. */
        else
            printf("No data within five seconds.\n");

        exit(EXIT_SUCCESS);
    }

SEE ALSO
accept(2), connect(2), poll(2), read(2), recv(2), restart_syscall(2),
send(2), sigprocmask(2), write(2), epoll(7), time(7)

For a tutorial with discussion and examples, see select_tut(2).

COLOPHON
This page is part of release 5.04 of the Linux man-pages project. A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.

Linux                             2019-11-19                         SELECT(2)