SELECT_TUT(2)          Linux Programmer's Manual          SELECT_TUT(2)

NAME
select, pselect - synchronous I/O multiplexing

SYNOPSIS
See select(2)

DESCRIPTION
The select() and pselect() system calls are used to efficiently monitor
multiple file descriptors, to see if any of them is, or becomes,
"ready"; that is, to see whether I/O becomes possible, or an
"exceptional condition" has occurred on any of the file descriptors.

This page provides background and tutorial information on the use of
these system calls.  For details of the arguments and semantics of
select() and pselect(), see select(2).

Combining signal and data events
pselect() is useful if you are waiting for a signal as well as for file
descriptor(s) to become ready for I/O.  Programs that receive signals
normally use the signal handler only to raise a global flag.  The
global flag indicates that the event must be processed in the main
loop of the program.  A caught signal causes the select() (or
pselect()) call to return -1 with errno set to EINTR.  This behavior is
essential so that signals can be processed in the main loop of the
program; otherwise, select() would block indefinitely.

Now, somewhere in the main loop will be a conditional to check the
global flag.  So we must ask: what if a signal arrives after the
conditional, but before the select() call?  The answer is that select()
would block indefinitely, even though an event is actually pending.
This race condition is solved by the pselect() call.  This call can be
used to set the signal mask to a set of signals that are to be received
only within the pselect() call.  For instance, let us say that the
event in question was the exit of a child process.  Before the start of
the main loop, we would block SIGCHLD using sigprocmask(2).  Our
pselect() call would enable SIGCHLD by using an empty signal mask.  Our
program would look like:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>

static volatile sig_atomic_t got_SIGCHLD = 0;

static void
child_sig_handler(int sig)
{
    got_SIGCHLD = 1;
}

int
main(int argc, char *argv[])
{
    sigset_t sigmask, empty_mask;
    struct sigaction sa;
    fd_set readfds, writefds, exceptfds;
    int nfds;
    int r;

    /* Block SIGCHLD so that it is delivered only inside pselect(). */
    sigemptyset(&sigmask);
    sigaddset(&sigmask, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &sigmask, NULL) == -1) {
        perror("sigprocmask");
        exit(EXIT_FAILURE);
    }

    sa.sa_flags = 0;
    sa.sa_handler = child_sig_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }

    sigemptyset(&empty_mask);

    for (;;) {          /* main loop */
        /* Initialize readfds, writefds, and exceptfds before the
           pselect() call, and set nfds to the highest-numbered
           descriptor plus 1.  (Code omitted; the sets are left
           empty here.) */
        FD_ZERO(&readfds);
        FD_ZERO(&writefds);
        FD_ZERO(&exceptfds);
        nfds = 0;

        r = pselect(nfds, &readfds, &writefds, &exceptfds,
                    NULL, &empty_mask);
        if (r == -1 && errno != EINTR) {
            /* Handle error */
            perror("pselect");
            exit(EXIT_FAILURE);
        }

        if (got_SIGCHLD) {
            got_SIGCHLD = 0;

            /* Handle signalled event here; e.g., wait() for all
               terminated children.  (Code omitted.) */
        }

        /* main body of program */
    }
}

Practical
So what is the point of select()?  Can't I just read and write to my
file descriptors whenever I want?  The point of select() is that it
watches multiple descriptors at the same time and properly puts the
process to sleep if there is no activity.  UNIX programmers often find
themselves in a position where they have to handle I/O from more than
one file descriptor where the data flow may be intermittent.  If you
were to merely create a sequence of read(2) and write(2) calls, you
would find that one of your calls may block waiting for data from/to a
file descriptor, while another file descriptor is unused though ready
for I/O.  select() efficiently copes with this situation.

Select law
Many people who try to use select() come across behavior that is
difficult to understand and produces nonportable or borderline results.
For instance, the example program below is carefully written not to
block at any point, even though it does not set its file descriptors to
nonblocking mode.  It is easy to introduce subtle errors that will
remove the advantage of using select(), so here is a list of essentials
to watch for when using select().

1.  You should always try to use select() without a timeout.  Your
    program should have nothing to do if there is no data available.
    Code that depends on timeouts is not usually portable and is
    difficult to debug.

2.  The value nfds must be properly calculated for efficiency: it
    should be the highest-numbered file descriptor in any of the sets,
    plus 1 (see select(2)).

3.  Do not add a file descriptor to any set unless you intend to check
    its result after the select() call and respond appropriately.  See
    the next rule.

4.  After select() returns, all file descriptors in all sets should be
    checked to see if they are ready.

5.  The functions read(2), recv(2), write(2), and send(2) do not
    necessarily read/write the full amount of data that you have
    requested.  If they do read/write the full amount, it's because
    you have a low traffic load and a fast stream.  This is not always
    going to be the case.  You should cope with the case of your
    functions managing to send or receive only a single byte.

6.  Never read/write only in single bytes at a time unless you are
    really sure that you have a small amount of data to process.  It
    is extremely inefficient not to read/write as much data as you can
    buffer each time.  The buffers in the example below are 1024 bytes
    although they could easily be made larger.

7.  Calls to read(2), recv(2), write(2), send(2), and select() can fail
    with the error EINTR, and calls to read(2), recv(2), write(2), and
    send(2) can fail with errno set to EAGAIN (EWOULDBLOCK).  These
    results must be properly managed (the example below does not
    handle them fully).  If your program is not going to receive any
    signals, then it is unlikely you will get EINTR.  If your program
    does not set nonblocking I/O, you will not get EAGAIN.

8.  Never call read(2), recv(2), write(2), or send(2) with a buffer
    length of zero.

9.  If the functions read(2), recv(2), write(2), and send(2) fail with
    errors other than those listed in 7., or one of the input functions
    returns 0, indicating end of file, then you should not pass that
    file descriptor to select() again.  In the example below, I close
    the file descriptor immediately, and then set it to -1 to prevent
    it being included in a set.

10. The timeout value must be initialized with each new call to
    select(), since some operating systems modify the structure.
    pselect() however does not modify its timeout structure.

11. Since select() modifies its file descriptor sets, if the call is
    being used in a loop, then the sets must be reinitialized before
    each call.
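Several of the rules above can be sketched in code.  The two helpers
below are illustrative only, and their names (write_all() and
wait_readable()) are hypothetical.  The first copes with partial writes
and EINTR (rules 5 and 7) and never calls write(2) with a length of
zero (rule 8); it assumes a descriptor in blocking mode.  The second
rebuilds both the descriptor set and the timeout before every select()
call (rules 10 and 11).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for rules 5, 7, and 8: keep calling write(2)
   until all len bytes are written.  Assumes fd is in blocking mode.
   Returns 0 on success, -1 on error (errno is set by write(2)). */
static int
write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;
        }
        done += (size_t) n;     /* possibly only a partial write */
    }
    return 0;
}

/* Hypothetical helper for rules 10 and 11: wait up to `seconds` for fd
   to become readable.  Returns 1 if readable, 0 on timeout, -1 on
   error. */
static int
wait_readable(int fd, long seconds)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        int ready;

        /* Rebuild both the set and the timeout before every call. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = seconds;
        tv.tv_usec = 0;

        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1 && errno == EINTR)
            continue;           /* note: this restarts the full timeout */
        return ready;
    }
}
```

Restarting the full timeout after EINTR is a simplification; a program
that needs an accurate deadline would have to track the remaining time
itself.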

RETURN VALUE
See select(2).

NOTES
Generally speaking, all operating systems that support sockets also
support select().  select() can be used to solve many problems in a
portable and efficient way that naive programmers try to solve in a
more complicated manner using threads, forking, IPCs, signals, memory
sharing, and so on.

The poll(2) system call has the same functionality as select(), and is
somewhat more efficient when monitoring sparse file descriptor sets.
It is nowadays widely available, but historically was less portable
than select().
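For comparison, here is a minimal sketch of waiting on two descriptors
with poll(2) instead of select().  The helper name poll_for_either() is
hypothetical.  Note that poll() takes an array of descriptors rather
than bit sets, so there is no nfds calculation and no FD_SETSIZE limit.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: block until fd_a or fd_b has input (or has hit
   a hangup or error condition) and return that descriptor, preferring
   fd_a if both are ready.  Returns -1 on error. */
static int
poll_for_either(int fd_a, int fd_b)
{
    struct pollfd fds[2];
    int ready;

    fds[0].fd = fd_a;
    fds[0].events = POLLIN;
    fds[1].fd = fd_b;
    fds[1].events = POLLIN;

    do {
        /* The kernel writes only to revents, so fd and events need not
           be reinitialized between calls.  A timeout of -1 blocks
           indefinitely. */
        ready = poll(fds, 2, -1);
    } while (ready == -1 && errno == EINTR);

    if (ready == -1)
        return -1;
    return (fds[0].revents != 0) ? fd_a : fd_b;
}
```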

The Linux-specific epoll(7) API provides an interface that is more
efficient than select(2) and poll(2) when monitoring large numbers of
file descriptors.
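The following Linux-only sketch shows the general shape of the epoll(7)
calls.  The helper name epoll_for_either() is hypothetical.  For
brevity it creates and destroys the epoll instance on each call; a real
program would create the instance once and reuse it across many
epoll_wait(2) calls, which is where the efficiency gain comes from.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd_a and fd_b for input with a fresh
   epoll instance, wait for one event, and return the descriptor that
   the event refers to.  Returns -1 on error. */
static int
epoll_for_either(int fd_a, int fd_b)
{
    struct epoll_event ev, out;
    int fds[2] = { fd_a, fd_b };
    int epfd, ready, i;

    epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    for (i = 0; i < 2; i++) {
        ev.events = EPOLLIN;
        ev.data.fd = fds[i];    /* handed back to us by epoll_wait() */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) == -1) {
            close(epfd);
            return -1;
        }
    }

    do {
        ready = epoll_wait(epfd, &out, 1, -1);
    } while (ready == -1 && errno == EINTR);

    close(epfd);
    return (ready == 1) ? out.data.fd : -1;
}
```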

EXAMPLES
Here is an example that better demonstrates the true utility of
select().  The listing below is a TCP forwarding program that forwards
from one TCP port to another.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>
#include <string.h>
#include <signal.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>

static int forward_port;

#undef max
#define max(x,y) ((x) > (y) ? (x) : (y))

static int
listen_socket(int listen_port)
{
    struct sockaddr_in addr;
    int lfd;
    int yes;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1) {
        perror("socket");
        return -1;
    }

    yes = 1;
    if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR,
                   &yes, sizeof(yes)) == -1) {
        perror("setsockopt");
        close(lfd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_port = htons(listen_port);
    addr.sin_family = AF_INET;
    if (bind(lfd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        perror("bind");
        close(lfd);
        return -1;
    }

    if (listen(lfd, 10) == -1) {
        perror("listen");
        close(lfd);
        return -1;
    }

    printf("accepting connections on port %d\n", listen_port);
    return lfd;
}

static int
connect_socket(int connect_port, char *address)
{
    struct sockaddr_in addr;
    int cfd;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd == -1) {
        perror("socket");
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_port = htons(connect_port);
    addr.sin_family = AF_INET;

    /* inet_aton() takes a struct in_addr *, so no cast is needed. */
    if (!inet_aton(address, &addr.sin_addr)) {
        fprintf(stderr, "inet_aton(): bad IP address format\n");
        close(cfd);
        return -1;
    }

    if (connect(cfd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        perror("connect()");
        shutdown(cfd, SHUT_RDWR);
        close(cfd);
        return -1;
    }
    return cfd;
}

#define SHUT_FD1 do {                                \
                     if (fd1 >= 0) {                 \
                         shutdown(fd1, SHUT_RDWR);   \
                         close(fd1);                 \
                         fd1 = -1;                   \
                     }                               \
                 } while (0)

#define SHUT_FD2 do {                                \
                     if (fd2 >= 0) {                 \
                         shutdown(fd2, SHUT_RDWR);   \
                         close(fd2);                 \
                         fd2 = -1;                   \
                     }                               \
                 } while (0)

#define BUF_SIZE 1024

int
main(int argc, char *argv[])
{
    int h;
    int fd1 = -1, fd2 = -1;
    char buf1[BUF_SIZE], buf2[BUF_SIZE];
    int buf1_avail = 0, buf1_written = 0;
    int buf2_avail = 0, buf2_written = 0;

    if (argc != 4) {
        fprintf(stderr, "Usage\n\tfwd <listen-port> "
                "<forward-to-port> <forward-to-ip-address>\n");
        exit(EXIT_FAILURE);
    }

    signal(SIGPIPE, SIG_IGN);

    forward_port = atoi(argv[2]);

    h = listen_socket(atoi(argv[1]));
    if (h == -1)
        exit(EXIT_FAILURE);

    for (;;) {
        int ready, nfds = 0;
        ssize_t nbytes;
        fd_set readfds, writefds, exceptfds;

        FD_ZERO(&readfds);
        FD_ZERO(&writefds);
        FD_ZERO(&exceptfds);
        FD_SET(h, &readfds);
        nfds = max(nfds, h);

        if (fd1 > 0 && buf1_avail < BUF_SIZE)
            FD_SET(fd1, &readfds);
            /* Note: nfds is updated below, when fd1 is added to
               exceptfds. */
        if (fd2 > 0 && buf2_avail < BUF_SIZE)
            FD_SET(fd2, &readfds);

        if (fd1 > 0 && buf2_avail - buf2_written > 0)
            FD_SET(fd1, &writefds);
        if (fd2 > 0 && buf1_avail - buf1_written > 0)
            FD_SET(fd2, &writefds);

        if (fd1 > 0) {
            FD_SET(fd1, &exceptfds);
            nfds = max(nfds, fd1);
        }
        if (fd2 > 0) {
            FD_SET(fd2, &exceptfds);
            nfds = max(nfds, fd2);
        }

        ready = select(nfds + 1, &readfds, &writefds, &exceptfds, NULL);

        if (ready == -1 && errno == EINTR)
            continue;

        if (ready == -1) {
            perror("select()");
            exit(EXIT_FAILURE);
        }

        if (FD_ISSET(h, &readfds)) {
            socklen_t addrlen;
            struct sockaddr_in client_addr;
            int fd;

            addrlen = sizeof(client_addr);
            memset(&client_addr, 0, addrlen);
            fd = accept(h, (struct sockaddr *) &client_addr, &addrlen);
            if (fd == -1) {
                perror("accept()");
            } else {
                SHUT_FD1;
                SHUT_FD2;
                buf1_avail = buf1_written = 0;
                buf2_avail = buf2_written = 0;
                fd1 = fd;
                fd2 = connect_socket(forward_port, argv[3]);
                if (fd2 == -1)
                    SHUT_FD1;
                else
                    printf("connect from %s\n",
                           inet_ntoa(client_addr.sin_addr));

                /* Skip any events on the old, closed file
                   descriptors. */

                continue;
            }
        }

        /* NB: read OOB data before normal reads. */

        if (fd1 > 0 && FD_ISSET(fd1, &exceptfds)) {
            char c;

            nbytes = recv(fd1, &c, 1, MSG_OOB);
            if (nbytes < 1)
                SHUT_FD1;
            else
                send(fd2, &c, 1, MSG_OOB);
        }
        if (fd2 > 0 && FD_ISSET(fd2, &exceptfds)) {
            char c;

            nbytes = recv(fd2, &c, 1, MSG_OOB);
            if (nbytes < 1)
                SHUT_FD2;
            else
                send(fd1, &c, 1, MSG_OOB);
        }
        if (fd1 > 0 && FD_ISSET(fd1, &readfds)) {
            nbytes = read(fd1, buf1 + buf1_avail,
                          BUF_SIZE - buf1_avail);
            if (nbytes < 1)
                SHUT_FD1;
            else
                buf1_avail += nbytes;
        }
        if (fd2 > 0 && FD_ISSET(fd2, &readfds)) {
            nbytes = read(fd2, buf2 + buf2_avail,
                          BUF_SIZE - buf2_avail);
            if (nbytes < 1)
                SHUT_FD2;
            else
                buf2_avail += nbytes;
        }
        if (fd1 > 0 && FD_ISSET(fd1, &writefds) && buf2_avail > 0) {
            nbytes = write(fd1, buf2 + buf2_written,
                           buf2_avail - buf2_written);
            if (nbytes < 1)
                SHUT_FD1;
            else
                buf2_written += nbytes;
        }
        if (fd2 > 0 && FD_ISSET(fd2, &writefds) && buf1_avail > 0) {
            nbytes = write(fd2, buf1 + buf1_written,
                           buf1_avail - buf1_written);
            if (nbytes < 1)
                SHUT_FD2;
            else
                buf1_written += nbytes;
        }

        /* Check if write data has caught read data. */

        if (buf1_written == buf1_avail)
            buf1_written = buf1_avail = 0;
        if (buf2_written == buf2_avail)
            buf2_written = buf2_avail = 0;

        /* One side has closed the connection, keep
           writing to the other side until empty. */

        if (fd1 < 0 && buf1_avail - buf1_written == 0)
            SHUT_FD2;
        if (fd2 < 0 && buf2_avail - buf2_written == 0)
            SHUT_FD1;
    }
    exit(EXIT_SUCCESS);
}

The above program properly forwards most kinds of TCP connections,
including OOB signal data transmitted by telnet servers.  It handles the
tricky problem of having data flow in both directions simultaneously.
You might think it more efficient to use a fork(2) call and devote a
separate process to each stream.  This becomes more tricky than you
might suspect.  Another idea is to set nonblocking I/O using fcntl(2).
This also has its problems because you end up using inefficient
timeouts.

The program handles only one connection at a time, although it could
easily be extended to handle several with a linked list of buffers, one
for each connection.  At the moment, new connections cause the current
connection to be dropped.
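One possible shape for such a linked list is sketched below.  The
struct and function names are hypothetical; the fields simply bundle
the variables that main() keeps for its single connection.  The main
loop would then walk the list to build the descriptor sets and to
service each connection after select() returns.

```c
#include <stdlib.h>

#define BUF_SIZE 1024

/* Hypothetical per-connection state, one list node per connection. */
struct connection {
    int fd1, fd2;
    char buf1[BUF_SIZE], buf2[BUF_SIZE];
    int buf1_avail, buf1_written;
    int buf2_avail, buf2_written;
    struct connection *next;
};

/* Allocate a node for the pair (fd1, fd2) and push it onto the list.
   Returns the new node, or NULL if allocation fails. */
static struct connection *
connection_add(struct connection **head, int fd1, int fd2)
{
    struct connection *c = calloc(1, sizeof(*c));

    if (c == NULL)
        return NULL;
    c->fd1 = fd1;
    c->fd2 = fd2;
    c->next = *head;
    *head = c;
    return c;
}

/* Unlink and free c.  The caller is expected to have closed the
   descriptors already. */
static void
connection_remove(struct connection **head, struct connection *c)
{
    struct connection **pp;

    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            free(c);
            return;
        }
    }
}
```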

SEE ALSO
accept(2), connect(2), poll(2), read(2), recv(2), select(2), send(2),
sigprocmask(2), write(2), epoll(7)

COLOPHON
This page is part of release 5.13 of the Linux man-pages project.  A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.

Linux                             2021-03-22                SELECT_TUT(2)