SELECT_TUT(2)              System Calls Manual              SELECT_TUT(2)

NAME
select, pselect - synchronous I/O multiplexing

LIBRARY
Standard C library (libc, -lc)

SYNOPSIS
See select(2)

DESCRIPTION
The select() and pselect() system calls are used to efficiently monitor
multiple file descriptors, to see if any of them is, or becomes,
"ready"; that is, to see whether I/O becomes possible, or an
"exceptional condition" has occurred on any of the file descriptors.

This page provides background and tutorial information on the use of
these system calls. For details of the arguments and semantics of
select() and pselect(), see select(2).

Combining signal and data events
pselect() is useful if you are waiting for a signal as well as for file
descriptor(s) to become ready for I/O. Programs that receive signals
normally use the signal handler only to raise a global flag. The
global flag indicates that the event must be processed in the main
loop of the program. A signal will cause the select() (or pselect())
call to return -1 with errno set to EINTR. This behavior is essential
so that signals can be processed in the main loop of the program;
otherwise select() would block indefinitely.

Now, somewhere in the main loop will be a conditional to check the
global flag. So we must ask: what if a signal arrives after the
conditional, but before the select() call? The answer is that select()
would block indefinitely, even though an event is actually pending.
This race condition is solved by the pselect() call. This call can be
used to set the signal mask to a set of signals that are to be received
only within the pselect() call. For instance, let us say that the
event in question was the exit of a child process. Before the start of
the main loop, we would block SIGCHLD using sigprocmask(2). Our
pselect() call would enable SIGCHLD by using an empty signal mask. Our
program would look like:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/select.h>

    static volatile sig_atomic_t got_SIGCHLD = 0;

    static void
    child_sig_handler(int sig)
    {
        got_SIGCHLD = 1;
    }

    int
    main(int argc, char *argv[])
    {
        sigset_t sigmask, empty_mask;
        struct sigaction sa;
        fd_set readfds, writefds, exceptfds;
        int r;
        int nfds = 0;   /* highest descriptor in any set, plus 1 */

        /* Block SIGCHLD so that it is delivered only inside pselect(). */
        sigemptyset(&sigmask);
        sigaddset(&sigmask, SIGCHLD);
        if (sigprocmask(SIG_BLOCK, &sigmask, NULL) == -1) {
            perror("sigprocmask");
            exit(EXIT_FAILURE);
        }

        sa.sa_flags = 0;
        sa.sa_handler = child_sig_handler;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGCHLD, &sa, NULL) == -1) {
            perror("sigaction");
            exit(EXIT_FAILURE);
        }

        sigemptyset(&empty_mask);

        for (;;) {          /* main loop */
            /* Initialize readfds, writefds, exceptfds, and nfds
               before the pselect() call. (Code omitted.) */

            r = pselect(nfds, &readfds, &writefds, &exceptfds,
                        NULL, &empty_mask);
            if (r == -1 && errno != EINTR) {
                /* Handle error */
            }

            if (got_SIGCHLD) {
                got_SIGCHLD = 0;

                /* Handle signalled event here; e.g., wait() for all
                   terminated children. (Code omitted.) */
            }

            /* main body of program */
        }
    }

Practical
So what is the point of select()? Can't I just read and write to my
file descriptors whenever I want? The point of select() is that it
watches multiple descriptors at the same time and properly puts the
process to sleep if there is no activity. UNIX programmers often find
themselves in a position where they have to handle I/O from more than
one file descriptor where the data flow may be intermittent. If you
were to merely create a sequence of read(2) and write(2) calls, you
would find that one of your calls may block waiting for data from/to a
file descriptor, while another file descriptor is unused though ready
for I/O. select() efficiently copes with this situation.

Select law
Many people who try to use select() come across behavior that is
difficult to understand and produces nonportable or borderline results.
For instance, the example program below is carefully written not to
block at any point, even though it does not set its file descriptors to
nonblocking mode. It is easy to introduce subtle errors that will
remove the advantage of using select(), so here is a list of essentials
to watch for when using select().

1.  You should always try to use select() without a timeout. Your
    program should have nothing to do if there is no data available.
    Code that depends on timeouts is not usually portable and is
    difficult to debug.

2.  The value nfds must be properly calculated for efficiency, as
    explained in select(2).

3.  Do not add a file descriptor to any set unless you intend to check
    its result after the select() call and respond appropriately. See
    the next rule.

4.  After select() returns, all file descriptors in all sets should be
    checked to see if they are ready.

5.  The functions read(2), recv(2), write(2), and send(2) do not
    necessarily read/write the full amount of data that you have
    requested. If they do read/write the full amount, it's because you
    have a low traffic load and a fast stream. This is not always
    going to be the case. You should cope with the case of your
    functions managing to send or receive only a single byte.

6.  Never read/write only in single bytes at a time unless you are
    really sure that you have a small amount of data to process. It is
    extremely inefficient not to read/write as much data as you can
    buffer each time. The buffers in the example below are 1024 bytes,
    although they could easily be made larger.

7.  Calls to read(2), recv(2), write(2), send(2), and select() can fail
    with the error EINTR, and calls to read(2), recv(2), write(2), and
    send(2) can fail with errno set to EAGAIN (EWOULDBLOCK). These
    results must be properly managed (the example below does not do so
    fully). If your program is not going to receive any signals, then
    it is unlikely you will get EINTR. If your program does not set
    nonblocking I/O, you will not get EAGAIN.

8.  Never call read(2), recv(2), write(2), or send(2) with a buffer
    length of zero.

9.  If the functions read(2), recv(2), write(2), and send(2) fail with
    errors other than those listed in rule 7, or one of the input
    functions returns 0, indicating end of file, then you should not
    pass that file descriptor to select() again. The example below
    closes the file descriptor immediately, and then sets it to -1 to
    prevent it from being included in a set.

10. The timeout value must be initialized with each new call to
    select(), since some operating systems modify the structure.
    pselect(), however, does not modify its timeout structure.

11. Since select() modifies its file descriptor sets, if the call is
    being used in a loop, then the sets must be reinitialized before
    each call.
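
Several of these rules can be sketched in a few lines. The two helpers
below are hypothetical (their names and return conventions are
assumptions made for illustration): write_fully() copes with partial
writes and EINTR (rules 5 and 7), and wait_readable() rebuilds both the
descriptor set and the timeout before every select() call (rules 10 and
11). The timeout is used here only to show rule 10; rule 1 still
applies.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Rules 5 and 7: keep calling write(2) until all len bytes are sent,
   retrying after EINTR. Returns 0 on success, -1 on any other error.
   Assumes fd is in blocking mode, so EAGAIN is not expected.
   With len == 0, write(2) is never called (rule 8). */
int
write_fully(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;
        }
        done += (size_t) n;     /* may be a partial write */
    }
    return 0;
}

/* Rules 10 and 11: rebuild the set and the timeout before each call.
   Returns 1 if fd is readable, 0 on timeout, -1 on error.
   Note: after EINTR the full timeout is restarted. */
int
wait_readable(int fd, long timeout_ms)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        int ready;

        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1 && errno == EINTR)
            continue;
        return ready;
    }
}
```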

RETURN VALUE
See select(2).

NOTES
Generally speaking, all operating systems that support sockets also
support select(). select() can be used to solve many problems in a
portable and efficient way that naive programmers try to solve in a
more complicated manner using threads, forking, IPCs, signals, memory
sharing, and so on.

The poll(2) system call has the same functionality as select(), and is
somewhat more efficient when monitoring sparse file descriptor sets.
It is nowadays widely available, but historically was less portable
than select().
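
For comparison, here is a rough sketch of waiting on two descriptors
with poll(2) instead of select(). The function name
wait_either_readable_poll() is a hypothetical choice for this
illustration. Note that no nfds calculation is needed, and no set has
to be rebuilt from scratch: poll(2) reports results in the separate
revents field.

```c
#include <poll.h>

/* Block until fd_a or fd_b is readable (or has hung up or failed).
   Returns the ready descriptor (fd_a if both are ready),
   or -1 if poll() fails. */
int
wait_either_readable_poll(int fd_a, int fd_b)
{
    struct pollfd fds[2];
    int ready;

    fds[0].fd = fd_a;
    fds[0].events = POLLIN;
    fds[1].fd = fd_b;
    fds[1].events = POLLIN;

    /* A timeout of -1 means wait indefinitely. */
    ready = poll(fds, 2, -1);
    if (ready == -1)
        return -1;

    /* POLLHUP and POLLERR are reported even when not requested. */
    if (fds[0].revents & (POLLIN | POLLHUP | POLLERR))
        return fd_a;
    return fd_b;
}
```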

The Linux-specific epoll(7) API provides an interface that is more
efficient than select(2) and poll(2) when monitoring large numbers of
file descriptors.

EXAMPLES
Here is an example that better demonstrates the true utility of
select(). The listing below is a TCP forwarding program that forwards
from one TCP port to another.

    #include <arpa/inet.h>
    #include <errno.h>
    #include <netinet/in.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static int forward_port;

    #undef max
    #define max(x, y) ((x) > (y) ? (x) : (y))

    static int
    listen_socket(int listen_port)
    {
        int lfd;
        int yes;
        struct sockaddr_in addr;

        lfd = socket(AF_INET, SOCK_STREAM, 0);
        if (lfd == -1) {
            perror("socket");
            return -1;
        }

        yes = 1;
        if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR,
                       &yes, sizeof(yes)) == -1)
        {
            perror("setsockopt");
            close(lfd);
            return -1;
        }

        memset(&addr, 0, sizeof(addr));
        addr.sin_port = htons(listen_port);
        addr.sin_family = AF_INET;
        if (bind(lfd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
            perror("bind");
            close(lfd);
            return -1;
        }

        /* Check listen() too; it can fail like any other call. */
        if (listen(lfd, 10) == -1) {
            perror("listen");
            close(lfd);
            return -1;
        }

        printf("accepting connections on port %d\n", listen_port);
        return lfd;
    }

    static int
    connect_socket(int connect_port, char *address)
    {
        int cfd;
        struct sockaddr_in addr;

        cfd = socket(AF_INET, SOCK_STREAM, 0);
        if (cfd == -1) {
            perror("socket");
            return -1;
        }

        memset(&addr, 0, sizeof(addr));
        addr.sin_port = htons(connect_port);
        addr.sin_family = AF_INET;

        /* inet_aton() takes a struct in_addr *; no cast is needed. */
        if (!inet_aton(address, &addr.sin_addr)) {
            fprintf(stderr, "inet_aton(): bad IP address format\n");
            close(cfd);
            return -1;
        }

        if (connect(cfd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
            perror("connect()");
            shutdown(cfd, SHUT_RDWR);
            close(cfd);
            return -1;
        }
        return cfd;
    }

    #define SHUT_FD1 do {                       \
                         if (fd1 >= 0) {        \
                             shutdown(fd1, SHUT_RDWR); \
                             close(fd1);        \
                             fd1 = -1;          \
                         }                      \
                     } while (0)

    #define SHUT_FD2 do {                       \
                         if (fd2 >= 0) {        \
                             shutdown(fd2, SHUT_RDWR); \
                             close(fd2);        \
                             fd2 = -1;          \
                         }                      \
                     } while (0)

    #define BUF_SIZE 1024

    int
    main(int argc, char *argv[])
    {
        int h;
        int ready, nfds;
        int fd1 = -1, fd2 = -1;
        int buf1_avail = 0, buf1_written = 0;
        int buf2_avail = 0, buf2_written = 0;
        char buf1[BUF_SIZE], buf2[BUF_SIZE];
        fd_set readfds, writefds, exceptfds;
        ssize_t nbytes;

        if (argc != 4) {
            fprintf(stderr, "Usage\n\tfwd <listen-port> "
                    "<forward-to-port> <forward-to-ip-address>\n");
            exit(EXIT_FAILURE);
        }

        signal(SIGPIPE, SIG_IGN);

        forward_port = atoi(argv[2]);

        h = listen_socket(atoi(argv[1]));
        if (h == -1)
            exit(EXIT_FAILURE);

        for (;;) {
            nfds = 0;

            FD_ZERO(&readfds);
            FD_ZERO(&writefds);
            FD_ZERO(&exceptfds);
            FD_SET(h, &readfds);
            nfds = max(nfds, h);

            if (fd1 > 0 && buf1_avail < BUF_SIZE)
                FD_SET(fd1, &readfds);
            /* Note: nfds is updated below, when fd1 is added to
               exceptfds. */
            if (fd2 > 0 && buf2_avail < BUF_SIZE)
                FD_SET(fd2, &readfds);

            if (fd1 > 0 && buf2_avail - buf2_written > 0)
                FD_SET(fd1, &writefds);
            if (fd2 > 0 && buf1_avail - buf1_written > 0)
                FD_SET(fd2, &writefds);

            if (fd1 > 0) {
                FD_SET(fd1, &exceptfds);
                nfds = max(nfds, fd1);
            }
            if (fd2 > 0) {
                FD_SET(fd2, &exceptfds);
                nfds = max(nfds, fd2);
            }

            ready = select(nfds + 1, &readfds, &writefds, &exceptfds, NULL);

            if (ready == -1 && errno == EINTR)
                continue;

            if (ready == -1) {
                perror("select()");
                exit(EXIT_FAILURE);
            }

            if (FD_ISSET(h, &readfds)) {
                socklen_t addrlen;
                struct sockaddr_in client_addr;
                int fd;

                addrlen = sizeof(client_addr);
                memset(&client_addr, 0, addrlen);
                fd = accept(h, (struct sockaddr *) &client_addr, &addrlen);
                if (fd == -1) {
                    perror("accept()");
                } else {
                    SHUT_FD1;
                    SHUT_FD2;
                    buf1_avail = buf1_written = 0;
                    buf2_avail = buf2_written = 0;
                    fd1 = fd;
                    fd2 = connect_socket(forward_port, argv[3]);
                    if (fd2 == -1)
                        SHUT_FD1;
                    else
                        printf("connect from %s\n",
                               inet_ntoa(client_addr.sin_addr));

                    /* Skip any events on the old, closed file
                       descriptors. */

                    continue;
                }
            }

            /* NB: read OOB data before normal reads. */

            if (fd1 > 0 && FD_ISSET(fd1, &exceptfds)) {
                char c;

                nbytes = recv(fd1, &c, 1, MSG_OOB);
                if (nbytes < 1)
                    SHUT_FD1;
                else
                    send(fd2, &c, 1, MSG_OOB);
            }
            if (fd2 > 0 && FD_ISSET(fd2, &exceptfds)) {
                char c;

                nbytes = recv(fd2, &c, 1, MSG_OOB);
                if (nbytes < 1)
                    SHUT_FD2;
                else
                    send(fd1, &c, 1, MSG_OOB);
            }
            if (fd1 > 0 && FD_ISSET(fd1, &readfds)) {
                nbytes = read(fd1, buf1 + buf1_avail,
                              BUF_SIZE - buf1_avail);
                if (nbytes < 1)
                    SHUT_FD1;
                else
                    buf1_avail += nbytes;
            }
            if (fd2 > 0 && FD_ISSET(fd2, &readfds)) {
                nbytes = read(fd2, buf2 + buf2_avail,
                              BUF_SIZE - buf2_avail);
                if (nbytes < 1)
                    SHUT_FD2;
                else
                    buf2_avail += nbytes;
            }
            if (fd1 > 0 && FD_ISSET(fd1, &writefds) && buf2_avail > 0) {
                nbytes = write(fd1, buf2 + buf2_written,
                               buf2_avail - buf2_written);
                if (nbytes < 1)
                    SHUT_FD1;
                else
                    buf2_written += nbytes;
            }
            if (fd2 > 0 && FD_ISSET(fd2, &writefds) && buf1_avail > 0) {
                nbytes = write(fd2, buf1 + buf1_written,
                               buf1_avail - buf1_written);
                if (nbytes < 1)
                    SHUT_FD2;
                else
                    buf1_written += nbytes;
            }

            /* Check if write data has caught read data. */

            if (buf1_written == buf1_avail)
                buf1_written = buf1_avail = 0;
            if (buf2_written == buf2_avail)
                buf2_written = buf2_avail = 0;

            /* One side has closed the connection, keep
               writing to the other side until empty. */

            if (fd1 < 0 && buf1_avail - buf1_written == 0)
                SHUT_FD2;
            if (fd2 < 0 && buf2_avail - buf2_written == 0)
                SHUT_FD1;
        }
        exit(EXIT_SUCCESS);
    }

The above program properly forwards most kinds of TCP connections,
including OOB signal data transmitted by telnet servers. It handles the
tricky problem of having data flow in both directions simultaneously.
You might think it more efficient to use a fork(2) call and devote a
process to each stream. This becomes more tricky than you might
suspect. Another idea is to set nonblocking I/O using fcntl(2). This
also has its problems because you end up using inefficient timeouts.

The program does not handle more than one connection at a time,
although it could easily be extended to do this with a linked list of
buffers, one for each connection. At the moment, new connections cause
the current connection to be dropped.

SEE ALSO
accept(2), connect(2), poll(2), read(2), recv(2), select(2), send(2),
sigprocmask(2), write(2), epoll(7)

Linux man-pages 6.04               2023-02-05               SELECT_TUT(2)