SELECT(2) Linux Programmer's Manual SELECT(2)

NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O
multiplexing

SYNOPSIS
/* According to POSIX.1-2001, POSIX.1-2008 */
#include <sys/select.h>

/* According to earlier standards */
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

void FD_CLR(int fd, fd_set *set);
int  FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);

#include <sys/select.h>

int pselect(int nfds, fd_set *readfds, fd_set *writefds,
            fd_set *exceptfds, const struct timespec *timeout,
            const sigset_t *sigmask);

Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

    pselect(): _POSIX_C_SOURCE >= 200112L

DESCRIPTION
select() and pselect() allow a program to monitor multiple file
descriptors, waiting until one or more of the file descriptors become
"ready" for some class of I/O operation (e.g., input possible). A file
descriptor is considered ready if it is possible to perform a
corresponding I/O operation (e.g., read(2) without blocking, or a
sufficiently small write(2)).

select() can monitor only file descriptor numbers that are less than
FD_SETSIZE; poll(2) does not have this limitation. See BUGS.

The operation of select() and pselect() is identical, other than these
three differences:

(i)    select() uses a timeout that is a struct timeval (with seconds
       and microseconds), while pselect() uses a struct timespec (with
       seconds and nanoseconds).

(ii)   select() may update the timeout argument to indicate how much
       time was left. pselect() does not change this argument.

(iii)  select() has no sigmask argument, and behaves as pselect()
       called with NULL sigmask.

Three independent sets of file descriptors are watched. The file
descriptors listed in readfds will be watched to see if characters
become available for reading (more precisely, to see if a read will not
block; in particular, a file descriptor is also ready on end-of-file).
The file descriptors in writefds will be watched to see if space is
available for write (though a large write may still block). The file
descriptors in exceptfds will be watched for exceptional conditions.
(For examples of some exceptional conditions, see the discussion of
POLLPRI in poll(2).)

On exit, each of the file descriptor sets is modified in place to
indicate which file descriptors actually changed status. (Thus, if
using select() within a loop, the sets must be reinitialized before
each call.)

Each of the three file descriptor sets may be specified as NULL if no
file descriptors are to be watched for the corresponding class of
events.

Four macros are provided to manipulate the sets. FD_ZERO() clears a
set. FD_SET() and FD_CLR() respectively add and remove a given file
descriptor from a set. FD_ISSET() tests to see if a file descriptor is
part of the set; this is useful after select() returns.

nfds should be set to the highest-numbered file descriptor in any of
the three sets, plus 1. The indicated file descriptors in each set are
checked, up to this limit (but see BUGS).

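The use of the macros and the computation of nfds can be sketched as
follows. The helper name wait_readable() is hypothetical and not part
of any library; the sketch assumes that every descriptor passed in is
non-negative and less than FD_SETSIZE.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: wait until at least one of fds[0..n-1] is
   readable. On return, *ready holds only the descriptors that
   select() reported. Assumes 0 <= fds[i] < FD_SETSIZE for every i. */
int
wait_readable(const int *fds, size_t n, fd_set *ready,
              struct timeval *timeout)
{
    int maxfd = -1;

    FD_ZERO(ready);
    for (size_t i = 0; i < n; i++) {
        FD_SET(fds[i], ready);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* nfds is the highest-numbered descriptor plus 1. */
    return select(maxfd + 1, ready, NULL, NULL, timeout);
}
```

After a positive return, FD_ISSET(fds[i], ready) tells the caller
which of the descriptors can be read.
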
The timeout argument specifies the interval that select() should block
waiting for a file descriptor to become ready. The call will block
until either:

*  a file descriptor becomes ready;

*  the call is interrupted by a signal handler; or

*  the timeout expires.

Note that the timeout interval will be rounded up to the system clock
granularity, and kernel scheduling delays mean that the blocking
interval may overrun by a small amount. If both fields of the timeval
structure are zero, then select() returns immediately. (This is useful
for polling.) If timeout is NULL (no timeout), select() can block
indefinitely.

sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is
not NULL, then pselect() first replaces the current signal mask by the
one pointed to by sigmask, then does the "select" function, and then
restores the original signal mask.

Other than the difference in the precision of the timeout argument, the
following pselect() call:

    ready = pselect(nfds, &readfds, &writefds, &exceptfds,
                    timeout, &sigmask);

is equivalent to atomically executing the following calls:

    sigset_t origmask;

    pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
    ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
    pthread_sigmask(SIG_SETMASK, &origmask, NULL);

The reason that pselect() is needed is that if one wants to wait for
either a signal or for a file descriptor to become ready, then an
atomic test is needed to prevent race conditions. (Suppose the signal
handler sets a global flag and returns. Then a test of this global
flag followed by a call of select() could hang indefinitely if the
signal arrived just after the test but just before the call. By
contrast, pselect() allows one to first block signals, handle the
signals that have come in, then call pselect() with the desired
sigmask, avoiding the race.)

The timeout
The time structures involved are defined in <sys/time.h> and look like

    struct timeval {
        long    tv_sec;         /* seconds */
        long    tv_usec;        /* microseconds */
    };

and

    struct timespec {
        long    tv_sec;         /* seconds */
        long    tv_nsec;        /* nanoseconds */
    };

(However, see below on the POSIX.1 versions.)

Some code calls select() with all three sets empty, nfds zero, and a
non-NULL timeout as a fairly portable way to sleep with subsecond
precision.

On Linux, select() modifies timeout to reflect the amount of time not
slept; most other implementations do not do this. (POSIX.1 permits
either behavior.) This causes problems both when Linux code which
reads timeout is ported to other operating systems, and when code is
ported to Linux that reuses a struct timeval for multiple select()s in
a loop without reinitializing it. Consider timeout to be undefined
after select() returns.

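One way to stay portable is to rebuild both the descriptor set and the
timeout before every call, as in the following sketch. The helper name
drain_with_idle_timeout() is hypothetical; the sketch assumes that fd
is non-negative and less than FD_SETSIZE.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until end-of-file, giving up if
   no data arrives within idle_sec seconds. Returns the total number
   of bytes read, or -1 on error or idle timeout.
   Assumes 0 <= fd < FD_SETSIZE. */
long
drain_with_idle_timeout(int fd, long idle_sec)
{
    char buf[512];
    long total = 0;

    for (;;) {
        fd_set rfds;
        struct timeval tv;
        int ready;
        ssize_t n;

        /* Rebuild the set and the timeout before every call,
           because select() may overwrite both. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = idle_sec;
        tv.tv_usec = 0;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            return -1;              /* idle timeout */

        n = read(fd, buf, sizeof(buf));
        if (n == -1)
            return -1;
        if (n == 0)
            return total;           /* end-of-file */
        total += n;
    }
}
```
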
RETURN VALUE
On success, select() and pselect() return the number of file
descriptors contained in the three returned descriptor sets (that is,
the total number of bits that are set in readfds, writefds, and
exceptfds), which may be zero if the timeout expires before anything
interesting happens. On error, -1 is returned, and errno is set to
indicate the error; the file descriptor sets are unmodified, and
timeout becomes undefined.

ERRORS
EBADF  An invalid file descriptor was given in one of the sets.
       (Perhaps a file descriptor that was already closed, or one on
       which an error has occurred.) However, see BUGS.

EINTR  A signal was caught; see signal(7).

EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit
       (see getrlimit(2)).

EINVAL The value contained within timeout is invalid.

ENOMEM Unable to allocate memory for internal tables.

VERSIONS
pselect() was added to Linux in kernel 2.6.16. Prior to this,
pselect() was emulated in glibc (but see BUGS).

CONFORMING TO
select() conforms to POSIX.1-2001, POSIX.1-2008, and 4.4BSD (select()
first appeared in 4.2BSD). Generally portable to/from non-BSD systems
supporting clones of the BSD socket layer (including System V
variants). However, note that the System V variant typically sets the
timeout variable before exit, but the BSD variant does not.

pselect() is defined in POSIX.1g, and in POSIX.1-2001 and POSIX.1-2008.

NOTES
An fd_set is a fixed-size buffer. Executing FD_CLR() or FD_SET() with
a value of fd that is negative or is equal to or larger than FD_SETSIZE
will result in undefined behavior. Moreover, POSIX requires fd to be a
valid file descriptor.

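A caller can guard against the undefined behavior with a range check
before touching the set. The following is a minimal sketch; the helper
name fd_set_checked() is hypothetical.

```c
#include <sys/select.h>

/* Hypothetical helper: add fd to set only if it lies in the range
   that an fd_set can represent. Returns 0 on success, -1 if fd is
   out of range. It does not check that fd is an open descriptor. */
int
fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, set);
    return 0;
}
```
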
On some other UNIX systems, select() can fail with the error EAGAIN if
the system fails to allocate kernel-internal resources, rather than
ENOMEM as Linux does. POSIX specifies this error for poll(2), but not
for select(). Portable programs may wish to check for EAGAIN and loop,
just as with EINTR.

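Such a retry loop might be sketched as follows. The helper name
wait_readable_retry() is hypothetical; the sketch blocks without a
timeout and assumes that fd is non-negative and less than FD_SETSIZE.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: wait until fd is readable, retrying when
   select() fails with EINTR or EAGAIN. Returns 1 when fd is
   readable, -1 on any other error. Assumes 0 <= fd < FD_SETSIZE. */
int
wait_readable_retry(int fd)
{
    for (;;) {
        fd_set rfds;
        int ready;

        FD_ZERO(&rfds);             /* rebuild the set on every attempt */
        FD_SET(fd, &rfds);

        ready = select(fd + 1, &rfds, NULL, NULL, NULL);
        if (ready > 0)
            return ready;
        if (ready == -1 && errno != EINTR && errno != EAGAIN)
            return -1;
        /* EINTR or EAGAIN: try again. */
    }
}
```
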
On systems that lack pselect(), reliable (and more portable) signal
trapping can be achieved using the self-pipe trick. In this technique,
a signal handler writes a byte to a pipe whose other end is monitored
by select() in the main program. (To avoid possibly blocking when
writing to a pipe that may be full or reading from a pipe that may be
empty, nonblocking I/O is used when reading from and writing to the
pipe.)

Concerning the types involved, the classical situation is that the two
fields of a timeval structure are typed as long (as shown above), and
the structure is defined in <sys/time.h>. The POSIX.1 situation is

    struct timeval {
        time_t         tv_sec;     /* seconds */
        suseconds_t    tv_usec;    /* microseconds */
    };

where the structure is defined in <sys/select.h> and the data types
time_t and suseconds_t are defined in <sys/types.h>.

Concerning prototypes, the classical situation is that one should
include <time.h> for select(). The POSIX.1 situation is that one
should include <sys/select.h> for select() and pselect().

Under glibc 2.0, <sys/select.h> gives the wrong prototype for
pselect(). Under glibc 2.1 to 2.2.1, it gives pselect() when
_GNU_SOURCE is defined. Since glibc 2.2.2, the requirements are as
shown in the SYNOPSIS.

Correspondence between select() and poll() notifications
Within the Linux kernel source, we find the following definitions which
show the correspondence between the readable, writable, and exceptional
condition notifications of select() and the event notifications
provided by poll(2) (and epoll(7)):

    #define POLLIN_SET  (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | \
                         POLLERR)
                        /* Ready for reading */
    #define POLLOUT_SET (POLLWRBAND | POLLWRNORM | POLLOUT | POLLERR)
                        /* Ready for writing */
    #define POLLEX_SET  (POLLPRI)
                        /* Exceptional condition */

Multithreaded applications
If a file descriptor being monitored by select() is closed in another
thread, the result is unspecified. On some UNIX systems, select()
unblocks and returns, with an indication that the file descriptor is
ready (a subsequent I/O operation will likely fail with an error,
unless another process reopens the file descriptor between the time
select() returned and the I/O operation is performed). On Linux (and
some other systems), closing the file descriptor in another thread has
no effect on select(). In summary, any application that relies on a
particular behavior in this scenario must be considered buggy.

C library/kernel differences
The Linux kernel allows file descriptor sets of arbitrary size,
determining the length of the sets to be checked from the value of
nfds. However, in the glibc implementation, the fd_set type is fixed
in size. See also BUGS.

The pselect() interface described in this page is implemented by glibc.
The underlying Linux system call is named pselect6(). This system call
has somewhat different behavior from the glibc wrapper function.

The Linux pselect6() system call modifies its timeout argument.
However, the glibc wrapper function hides this behavior by using a
local variable for the timeout argument that is passed to the system
call. Thus, the glibc pselect() function does not modify its timeout
argument; this is the behavior required by POSIX.1-2001.

The final argument of the pselect6() system call is not a sigset_t *
pointer, but is instead a structure of the form:

    struct {
        const kernel_sigset_t *ss;   /* Pointer to signal set */
        size_t ss_len;               /* Size (in bytes) of object
                                        pointed to by 'ss' */
    };

This allows the system call to obtain both a pointer to the signal set
and its size, while allowing for the fact that most architectures
support a maximum of 6 arguments to a system call. See sigprocmask(2)
for a discussion of the difference between the kernel and libc notion
of the signal set.

BUGS
POSIX allows an implementation to define an upper limit, advertised via
the constant FD_SETSIZE, on the range of file descriptors that can be
specified in a file descriptor set. The Linux kernel imposes no fixed
limit, but the glibc implementation makes fd_set a fixed-size type,
with FD_SETSIZE defined as 1024, and the FD_*() macros operating
according to that limit. To monitor file descriptors greater than
1023, use poll(2) instead.

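For comparison, a wait on a single descriptor using poll(2), which has
no FD_SETSIZE restriction, might be sketched as follows. The helper
name wait_readable_poll() is hypothetical.

```c
#include <poll.h>

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to
   become readable; fd may be FD_SETSIZE or larger. Returns 1 if
   poll() reported an event on fd (input, hang-up, or error),
   0 on timeout, -1 on error. */
int
wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    return poll(&pfd, 1, timeout_ms);
}
```
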
According to POSIX, select() should check all specified file
descriptors in the three file descriptor sets, up to the limit nfds-1.
However, the current implementation ignores any file descriptor in
these sets that is greater than the maximum file descriptor number that
the process currently has open. According to POSIX, any such file
descriptor that is specified in one of the sets should result in the
error EBADF.

Glibc 2.0 provided a version of pselect() that did not take a sigmask
argument.

Starting with version 2.1, glibc provided an emulation of pselect()
that was implemented using sigprocmask(2) and select(). This
implementation remained vulnerable to the very race condition that
pselect() was designed to prevent. Modern versions of glibc use the
(race-free) pselect() system call on kernels where it is provided.

Under Linux, select() may report a socket file descriptor as "ready for
reading", while nevertheless a subsequent read blocks. This could for
example happen when data has arrived but upon examination has the wrong
checksum and is discarded. There may be other circumstances in which a
file descriptor is spuriously reported as ready. Thus it may be safer
to use O_NONBLOCK on sockets that should not block.

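The following sketch shows one way to apply this advice. The helper
names set_nonblocking() and read_ready() are hypothetical, as is the
return value -2 used to flag a spurious readiness report.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to nonblocking mode.
   Returns 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: read from a nonblocking fd after select()
   reported it as ready. Returns the byte count, 0 at end-of-file,
   -1 on error, or -2 if no data was available after all. */
ssize_t
read_ready(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

On a return of -2 the caller simply goes back to select() instead of
stalling inside read(2).
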
On Linux, select() also modifies timeout if the call is interrupted by
a signal handler (i.e., the EINTR error return). This is not permitted
by POSIX.1. The Linux pselect() system call has the same behavior, but
the glibc wrapper hides this behavior by internally copying the timeout
to a local variable and passing that variable to the system call.

EXAMPLE
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>
    #include <sys/types.h>
    #include <unistd.h>

    int
    main(void)
    {
        fd_set rfds;
        struct timeval tv;
        int retval;

        /* Watch stdin (fd 0) to see when it has input. */

        FD_ZERO(&rfds);
        FD_SET(0, &rfds);

        /* Wait up to five seconds. */

        tv.tv_sec = 5;
        tv.tv_usec = 0;

        retval = select(1, &rfds, NULL, NULL, &tv);
        /* Don't rely on the value of tv now! */

        if (retval == -1)
            perror("select()");
        else if (retval)
            printf("Data is available now.\n");
            /* FD_ISSET(0, &rfds) will be true. */
        else
            printf("No data within five seconds.\n");

        exit(EXIT_SUCCESS);
    }

SEE ALSO
accept(2), connect(2), poll(2), read(2), recv(2), restart_syscall(2),
send(2), sigprocmask(2), write(2), epoll(7), time(7)

For a tutorial with discussion and examples, see select_tut(2).

COLOPHON
This page is part of release 4.15 of the Linux man-pages project. A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.

Linux 2017-09-15 SELECT(2)