SELECT(2)                  Linux Programmer's Manual                 SELECT(2)

NAME
       select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O
       multiplexing

SYNOPSIS
       /* According to POSIX.1-2001, POSIX.1-2008 */
       #include <sys/select.h>

       /* According to earlier standards */
       #include <sys/time.h>
       #include <sys/types.h>
       #include <unistd.h>

       int select(int nfds, fd_set *readfds, fd_set *writefds,
                  fd_set *exceptfds, struct timeval *timeout);

       void FD_CLR(int fd, fd_set *set);
       int  FD_ISSET(int fd, fd_set *set);
       void FD_SET(int fd, fd_set *set);
       void FD_ZERO(fd_set *set);

       #include <sys/select.h>

       int pselect(int nfds, fd_set *readfds, fd_set *writefds,
                   fd_set *exceptfds, const struct timespec *timeout,
                   const sigset_t *sigmask);

   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

       pselect(): _POSIX_C_SOURCE >= 200112L

DESCRIPTION
       select() and pselect() allow a program to monitor multiple file
       descriptors, waiting until one or more of the file descriptors become
       "ready" for some class of I/O operation (e.g., input possible). A file
       descriptor is considered ready if it is possible to perform a
       corresponding I/O operation (e.g., read(2), or a sufficiently small
       write(2)) without blocking.

       select() can monitor only file descriptor numbers that are less than
       FD_SETSIZE; poll(2) does not have this limitation. See BUGS.

       The operation of select() and pselect() is identical, other than these
       three differences:

       (i)    select() uses a timeout that is a struct timeval (with seconds
              and microseconds), while pselect() uses a struct timespec (with
              seconds and nanoseconds).

       (ii)   select() may update the timeout argument to indicate how much
              time was left. pselect() does not change this argument.

       (iii)  select() has no sigmask argument, and behaves as pselect()
              called with NULL sigmask.

       Three independent sets of file descriptors are watched. The file
       descriptors listed in readfds will be watched to see if characters
       become available for reading (more precisely, to see if a read will not
       block; in particular, a file descriptor is also ready on end-of-file).
       The file descriptors in writefds will be watched to see if space is
       available for write (though a large write may still block). The file
       descriptors in exceptfds will be watched for exceptional conditions.
       (For examples of some exceptional conditions, see the discussion of
       POLLPRI in poll(2).)

       On exit, each of the file descriptor sets is modified in place to
       indicate which file descriptors actually changed status. (Thus, if
       using select() within a loop, the sets must be reinitialized before
       each call.)

       Each of the three file descriptor sets may be specified as NULL if no
       file descriptors are to be watched for the corresponding class of
       events.

       Four macros are provided to manipulate the sets. FD_ZERO() clears a
       set. FD_SET() and FD_CLR() add and remove a given file descriptor from
       a set. FD_ISSET() tests to see if a file descriptor is part of the
       set; this is useful after select() returns.

       nfds should be set to the highest-numbered file descriptor in any of
       the three sets, plus 1. The indicated file descriptors in each set are
       checked, up to this limit (but see BUGS).

       The timeout argument specifies the interval that select() should block
       waiting for a file descriptor to become ready. The call will block
       until either:

       *  a file descriptor becomes ready;

       *  the call is interrupted by a signal handler; or

       *  the timeout expires.

       Note that the timeout interval will be rounded up to the system clock
       granularity, and kernel scheduling delays mean that the blocking
       interval may overrun by a small amount. If both fields of the timeval
       structure are zero, then select() returns immediately. (This is useful
       for polling.) If timeout is NULL (no timeout), select() can block
       indefinitely.

       sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is
       not NULL, then pselect() first replaces the current signal mask by the
       one pointed to by sigmask, then does the "select" function, and then
       restores the original signal mask.

       Other than the difference in the precision of the timeout argument, the
       following pselect() call:

           ready = pselect(nfds, &readfds, &writefds, &exceptfds,
                           timeout, &sigmask);

       is equivalent to atomically executing the following calls:

           sigset_t origmask;

           pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
           ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
           pthread_sigmask(SIG_SETMASK, &origmask, NULL);

       The reason that pselect() is needed is that if one wants to wait for
       either a signal or for a file descriptor to become ready, then an
       atomic test is needed to prevent race conditions. (Suppose the signal
       handler sets a global flag and returns. Then a test of this global
       flag followed by a call of select() could hang indefinitely if the
       signal arrived just after the test but just before the call. By
       contrast, pselect() allows one to first block signals, handle the
       signals that have come in, then call pselect() with the desired
       sigmask, avoiding the race.)
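
       The block-then-pselect() pattern described above might look like the
       following sketch. It assumes SIGUSR1 is the signal of interest; the
       names got_sigusr1, on_sigusr1() and wait_fd_or_sigusr1() are
       hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>

static volatile sig_atomic_t got_sigusr1;   /* set by the handler */

static void
on_sigusr1(int sig)
{
    (void) sig;
    got_sigusr1 = 1;
}

/* Hypothetical helper: wait until fd is readable or SIGUSR1 arrives.
   Returns 1 if fd is readable, 0 if the signal was seen, -1 on error. */
int
wait_fd_or_sigusr1(int fd)
{
    struct sigaction sa;
    sigset_t block, origmask, waitmask;
    fd_set rfds;
    int ready, result;

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    /* Block SIGUSR1 so that it cannot arrive between the test of the
       flag and the call to pselect(). */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &origmask) == -1)
        return -1;

    /* Mask used only while waiting: SIGUSR1 is deliverable there. */
    waitmask = origmask;
    sigdelset(&waitmask, SIGUSR1);

    for (;;) {
        if (got_sigusr1) {          /* handle signals that have come in */
            got_sigusr1 = 0;
            result = 0;
            break;
        }

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /* Unblocking and waiting happen as one atomic step. */
        ready = pselect(fd + 1, &rfds, NULL, NULL, NULL, &waitmask);
        if (ready > 0) {
            result = 1;
            break;
        }
        if (ready == -1 && errno != EINTR) {
            result = -1;
            break;
        }
        /* EINTR: loop around and test the flag again. */
    }

    sigprocmask(SIG_SETMASK, &origmask, NULL);
    return result;
}
```

       Because SIGUSR1 stays blocked everywhere except inside pselect(),
       the handler can run only while the call is waiting, so the flag
       cannot be set unnoticed between the test and the wait.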

   The timeout
       The time structures involved are defined in <sys/time.h> and look like

           struct timeval {
               long    tv_sec;         /* seconds */
               long    tv_usec;        /* microseconds */
           };

       and

           struct timespec {
               long    tv_sec;         /* seconds */
               long    tv_nsec;        /* nanoseconds */
           };

       (However, see below on the POSIX.1 versions.)

       Some code calls select() with all three sets empty, nfds zero, and a
       non-NULL timeout as a fairly portable way to sleep with subsecond
       precision.

       On Linux, select() modifies timeout to reflect the amount of time not
       slept; most other implementations do not do this. (POSIX.1 permits
       either behavior.) This causes problems both when Linux code which
       reads timeout is ported to other operating systems, and when code is
       ported to Linux that reuses a struct timeval for multiple select()s in
       a loop without reinitializing it. Consider timeout to be undefined
       after select() returns.
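
       One way to stay portable is to rebuild both the descriptor set and
       the timeout at the top of every loop iteration, as in this sketch.
       The helper name drain_with_idle_timeout() and its return-value
       convention are hypothetical.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: read and discard data from fd until end-of-file.
   Returns the number of bytes read, -1 on error, or -2 if no data
   arrived within idle_sec seconds. */
long
drain_with_idle_timeout(int fd, long idle_sec)
{
    char buf[4096];
    long total = 0;

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    for (;;) {
        fd_set rfds;
        struct timeval tv;
        int ready;
        ssize_t n;

        /* Both the set and the timeout may have been changed by the
           previous call, so initialize them again each time. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = idle_sec;
        tv.tv_usec = 0;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            return -2;              /* timeout expired */

        n = read(fd, buf, sizeof(buf));
        if (n == -1) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (n == 0)
            return total;           /* end-of-file */
        total += n;
    }
}
```

       Declaring rfds and tv inside the loop body makes it harder to reuse
       stale values by accident.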

RETURN VALUE
       On success, select() and pselect() return the number of file
       descriptors contained in the three returned descriptor sets (that is,
       the total number of bits that are set in readfds, writefds, exceptfds)
       which may be zero if the timeout expires before anything interesting
       happens. On error, -1 is returned, and errno is set to indicate the
       error; the file descriptor sets are unmodified, and timeout becomes
       undefined.

ERRORS
       EBADF  An invalid file descriptor was given in one of the sets.
              (Perhaps a file descriptor that was already closed, or one on
              which an error has occurred.) However, see BUGS.

       EINTR  A signal was caught; see signal(7).

       EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit
              (see getrlimit(2)).

       EINVAL The value contained within timeout is invalid.

       ENOMEM Unable to allocate memory for internal tables.

VERSIONS
       pselect() was added to Linux in kernel 2.6.16. Prior to this,
       pselect() was emulated in glibc (but see BUGS).

CONFORMING TO
       select() conforms to POSIX.1-2001, POSIX.1-2008, and 4.4BSD (select()
       first appeared in 4.2BSD). Generally portable to/from non-BSD systems
       supporting clones of the BSD socket layer (including System V
       variants). However, note that the System V variant typically sets the
       timeout variable before exit, but the BSD variant does not.

       pselect() is defined in POSIX.1g, and in POSIX.1-2001 and POSIX.1-2008.

NOTES
       An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with
       a value of fd that is negative or is equal to or larger than FD_SETSIZE
       will result in undefined behavior. Moreover, POSIX requires fd to be a
       valid file descriptor.

       The operation of select() and pselect() is not affected by the
       O_NONBLOCK flag.

       On some other UNIX systems, select() can fail with the error EAGAIN if
       the system fails to allocate kernel-internal resources, rather than
       ENOMEM as Linux does. POSIX specifies this error for poll(2), but not
       for select(). Portable programs may wish to check for EAGAIN and loop,
       just as with EINTR.

       On systems that lack pselect(), reliable (and more portable) signal
       trapping can be achieved using the self-pipe trick. In this technique,
       a signal handler writes a byte to a pipe whose other end is monitored
       by select() in the main program. (To avoid possibly blocking when
       writing to a pipe that may be full or reading from a pipe that may be
       empty, nonblocking I/O is used when reading from and writing to the
       pipe.)
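
       The self-pipe trick can be sketched as follows. All names here
       (self_pipe, self_pipe_install(), self_pipe_wait()) are hypothetical,
       and the sketch supports a single pipe shared by whatever signals are
       installed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int self_pipe[2] = { -1, -1 };

static void
self_pipe_handler(int sig)
{
    int saved_errno = errno;

    (void) sig;
    /* Nonblocking write: if the pipe is full, a wakeup is already
       pending, so the failure can be ignored. */
    if (write(self_pipe[1], "x", 1) == -1) {
        /* nothing to do */
    }
    errno = saved_errno;
}

static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe and install the handler for sig.  Returns 0 or -1. */
int
self_pipe_install(int sig)
{
    struct sigaction sa;

    if (pipe(self_pipe) == -1)
        return -1;
    if (set_nonblocking(self_pipe[0]) == -1 ||
        set_nonblocking(self_pipe[1]) == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = self_pipe_handler;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Wait until fd is readable or a signal was caught.  Returns 1 if fd is
   readable, 0 if a signal was noticed, -1 on error. */
int
self_pipe_wait(int fd)
{
    if (fd < 0 || fd >= FD_SETSIZE || self_pipe[0] >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    for (;;) {
        fd_set rfds;
        char ch;
        int nfds, ready;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        FD_SET(self_pipe[0], &rfds);
        nfds = (fd > self_pipe[0] ? fd : self_pipe[0]) + 1;

        ready = select(nfds, &rfds, NULL, NULL, NULL);
        if (ready == -1) {
            if (errno == EINTR)
                continue;           /* the pipe is now readable */
            return -1;
        }
        if (FD_ISSET(self_pipe[0], &rfds)) {
            /* Drain the pipe; the read end is nonblocking. */
            while (read(self_pipe[0], &ch, 1) == 1)
                continue;
            return 0;
        }
        if (FD_ISSET(fd, &rfds))
            return 1;
    }
}
```

       A program would call self_pipe_install(SIGUSR1) once at startup and
       then call self_pipe_wait() wherever it would otherwise call select()
       directly.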

       Concerning the types involved, the classical situation is that the two
       fields of a timeval structure are typed as long (as shown above), and
       the structure is defined in <sys/time.h>. The POSIX.1 situation is

           struct timeval {
               time_t         tv_sec;     /* seconds */
               suseconds_t    tv_usec;    /* microseconds */
           };

       where the structure is defined in <sys/select.h> and the data types
       time_t and suseconds_t are defined in <sys/types.h>.

       Concerning prototypes, the classical situation is that one should
       include <time.h> for select(). The POSIX.1 situation is that one
       should include <sys/select.h> for select() and pselect().

       Under glibc 2.0, <sys/select.h> gives the wrong prototype for
       pselect(). Under glibc 2.1 to 2.2.1, it gives pselect() when
       _GNU_SOURCE is defined. Since glibc 2.2.2, the requirements are as
       shown in the SYNOPSIS.

   Correspondence between select() and poll() notifications
       Within the Linux kernel source, we find the following definitions which
       show the correspondence between the readable, writable, and exceptional
       condition notifications of select() and the event notifications
       provided by poll(2) (and epoll(7)):

           #define POLLIN_SET (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP |
                               POLLERR)
                              /* Ready for reading */
           #define POLLOUT_SET (POLLWRBAND | POLLWRNORM | POLLOUT | POLLERR)
                              /* Ready for writing */
           #define POLLEX_SET (POLLPRI)
                              /* Exceptional condition */

   Multithreaded applications
       If a file descriptor being monitored by select() is closed in another
       thread, the result is unspecified. On some UNIX systems, select()
       unblocks and returns, with an indication that the file descriptor is
       ready (a subsequent I/O operation will likely fail with an error,
       unless another process reopens the file descriptor between the time
       select() returned and the I/O operation is performed). On Linux (and
       some other systems), closing the file descriptor in another thread has
       no effect on select(). In summary, any application that relies on a
       particular behavior in this scenario must be considered buggy.

   C library/kernel differences
       The Linux kernel allows file descriptor sets of arbitrary size,
       determining the length of the sets to be checked from the value of
       nfds. However, in the glibc implementation, the fd_set type is fixed
       in size. See also BUGS.

       The pselect() interface described in this page is implemented by glibc.
       The underlying Linux system call is named pselect6(). This system call
       has somewhat different behavior from the glibc wrapper function.

       The Linux pselect6() system call modifies its timeout argument.
       However, the glibc wrapper function hides this behavior by using a
       local variable for the timeout argument that is passed to the system
       call. Thus, the glibc pselect() function does not modify its timeout
       argument; this is the behavior required by POSIX.1-2001.

       The final argument of the pselect6() system call is not a sigset_t *
       pointer, but is instead a structure of the form:

           struct {
               const kernel_sigset_t *ss;   /* Pointer to signal set */
               size_t ss_len;               /* Size (in bytes) of object
                                               pointed to by 'ss' */
           };

       This allows the system call to obtain both a pointer to the signal set
       and its size, while allowing for the fact that most architectures
       support a maximum of 6 arguments to a system call. See sigprocmask(2)
       for a discussion of the difference between the kernel and libc notion
       of the signal set.

BUGS
       POSIX allows an implementation to define an upper limit, advertised via
       the constant FD_SETSIZE, on the range of file descriptors that can be
       specified in a file descriptor set. The Linux kernel imposes no fixed
       limit, but the glibc implementation makes fd_set a fixed-size type,
       with FD_SETSIZE defined as 1024, and the FD_*() macros operating
       according to that limit. To monitor file descriptors greater than
       1023, use poll(2) instead.
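
       As a rough sketch of that alternative, the following function waits
       for one descriptor with poll(2), which places no FD_SETSIZE bound on
       the descriptor number. The name wait_readable_poll() is
       hypothetical.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to
   become readable.  Returns 1 if it is ready (including hang-up or
   error conditions that a read would report), 0 on timeout, and -1 on
   error. */
int
wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ready;

    pfd.fd = fd;                    /* any valid descriptor number */
    pfd.events = POLLIN;
    pfd.revents = 0;

    ready = poll(&pfd, 1, timeout_ms);
    if (ready > 0 && (pfd.revents & POLLNVAL)) {
        errno = EBADF;              /* fd was not an open descriptor */
        return -1;
    }
    return ready;
}
```

       Because the input (events) and output (revents) are separate
       fields, the same struct pollfd can be passed to poll(2) again
       without being rebuilt.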

       The implementation of the fd_set arguments as value-result arguments
       means that they must be reinitialized on each call to select(). This
       design error is avoided by poll(2), which uses separate structure
       fields for the input and output of the call.

       According to POSIX, select() should check all specified file
       descriptors in the three file descriptor sets, up to the limit nfds-1.
       However, the current implementation ignores any file descriptor in
       these sets that is greater than the maximum file descriptor number that
       the process currently has open. According to POSIX, any such file
       descriptor that is specified in one of the sets should result in the
       error EBADF.

       Glibc 2.0 provided a version of pselect() that did not take a sigmask
       argument.

       Starting with version 2.1, glibc provided an emulation of pselect()
       that was implemented using sigprocmask(2) and select(). This
       implementation remained vulnerable to the very race condition that
       pselect() was designed to prevent. Modern versions of glibc use the
       (race-free) pselect() system call on kernels where it is provided.

       Under Linux, select() may report a socket file descriptor as "ready for
       reading", while nevertheless a subsequent read blocks. This could for
       example happen when data has arrived but upon examination has wrong
       checksum and is discarded. There may be other circumstances in which a
       file descriptor is spuriously reported as ready. Thus it may be safer
       to use O_NONBLOCK on sockets that should not block.
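
       A sketch of that precaution: mark the descriptor nonblocking once,
       then treat EAGAIN after a "ready" report as a reason to go back to
       select() rather than as a failure. The helper names and the -2
       return convention are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set O_NONBLOCK on fd.  Returns 0 or -1. */
int
make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper, to be called after select() reported fd as
   readable.  Returns the number of bytes read (0 at end-of-file), -1 on
   error, or -2 if no data was available after all, in which case the
   caller should simply call select() again. */
long
read_if_really_ready(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return (long) n;
}
```

       With the descriptor in nonblocking mode, a spurious readiness
       report costs one extra loop iteration instead of an indefinite
       block in read(2).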

       On Linux, select() also modifies timeout if the call is interrupted by
       a signal handler (i.e., the EINTR error return). This is not permitted
       by POSIX.1. The Linux pselect() system call has the same behavior, but
       the glibc wrapper hides this behavior by internally copying the timeout
       to a local variable and passing that variable to the system call.

EXAMPLE
       #include <stdio.h>
       #include <stdlib.h>
       #include <sys/time.h>
       #include <sys/types.h>
       #include <unistd.h>

       int
       main(void)
       {
           fd_set rfds;
           struct timeval tv;
           int retval;

           /* Watch stdin (fd 0) to see when it has input. */

           FD_ZERO(&rfds);
           FD_SET(0, &rfds);

           /* Wait up to five seconds. */

           tv.tv_sec = 5;
           tv.tv_usec = 0;

           retval = select(1, &rfds, NULL, NULL, &tv);
           /* Don't rely on the value of tv now! */

           if (retval == -1)
               perror("select()");
           else if (retval)
               printf("Data is available now.\n");
               /* FD_ISSET(0, &rfds) will be true. */
           else
               printf("No data within five seconds.\n");

           exit(EXIT_SUCCESS);
       }

SEE ALSO
       accept(2), connect(2), poll(2), read(2), recv(2), restart_syscall(2),
       send(2), sigprocmask(2), write(2), epoll(7), time(7)

       For a tutorial with discussion and examples, see select_tut(2).

COLOPHON
       This page is part of release 5.02 of the Linux man-pages project. A
       description of the project, information about reporting bugs, and the
       latest version of this page, can be found at
       https://www.kernel.org/doc/man-pages/.

Linux                             2019-03-06                         SELECT(2)