PERLIPC(1)             Perl Programmers Reference Guide             PERLIPC(1)

NAME
perlipc - Perl interprocess communication (signals, fifos, pipes, safe
subprocesses, sockets, and semaphores)
8
DESCRIPTION
The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and
SysV IPC calls. Each is used in slightly different situations.
13
Signals
Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers. Each handler is
called with one argument: the name of the signal that triggered it. A
signal may be generated intentionally from a particular keyboard
sequence like control-C or control-Z, sent to you from another
process, or triggered automatically by the kernel when special events
transpire, like a child process exiting, your own process running out
of stack space, or hitting a process file-size limit.
24
25 For example, to trap an interrupt signal, set up a handler like this:
26
    our $shucks;

    sub catch_zap {
        my $signame = shift;
        $shucks++;
        die "Somebody sent me a SIG$signame";
    }
    $SIG{INT} = __PACKAGE__ . "::catch_zap";
    $SIG{INT} = \&catch_zap;  # best strategy
36
37 Prior to Perl 5.8.0 it was necessary to do as little as you possibly
38 could in your handler; notice how all we do is set a global variable
39 and then raise an exception. That's because on most systems, libraries
40 are not re-entrant; particularly, memory allocation and I/O routines
41 are not. That meant that doing nearly anything in your handler could
42 in theory trigger a memory fault and subsequent core dump - see
43 "Deferred Signals (Safe Signals)" below.
44
45 The names of the signals are the ones listed out by "kill -l" on your
46 system, or you can retrieve them using the CPAN module IPC::Signal.
47
48 You may also choose to assign the strings "IGNORE" or "DEFAULT" as the
49 handler, in which case Perl will try to discard the signal or do the
50 default thing.
51
52 On most Unix platforms, the "CHLD" (sometimes also known as "CLD")
53 signal has special behavior with respect to a value of "IGNORE".
54 Setting $SIG{CHLD} to "IGNORE" on such a platform has the effect of not
55 creating zombie processes when the parent process fails to wait() on
56 its child processes (i.e., child processes are automatically reaped).
57 Calling wait() with $SIG{CHLD} set to "IGNORE" usually returns -1 on
58 such platforms.
59
60 Some signals can be neither trapped nor ignored, such as the KILL and
61 STOP (but not the TSTP) signals. Note that ignoring signals makes them
62 disappear. If you only want them blocked temporarily without them
63 getting lost you'll have to use the "POSIX" module's sigprocmask.
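
As a minimal sketch of temporary blocking, the following holds back
SIGUSR1 around a critical section and then restores the previous mask.
The choice of SIGUSR1 and the variable names are illustrative
assumptions; only the POSIX module calls come from the standard
distribution.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);   # sigprocmask, SIG_BLOCK, SIG_SETMASK, SIGUSR1

my $caught = 0;
$SIG{USR1} = sub { $caught++ };

my $block = POSIX::SigSet->new(SIGUSR1);   # signals to hold back
my $old   = POSIX::SigSet->new;            # receives the previous mask

sigprocmask(SIG_BLOCK, $block, $old)
    || die "cannot block SIGUSR1: $!";

kill USR1 => $$;                           # stays pending while blocked
my $during = $caught;                      # handler has not run yet

sigprocmask(SIG_SETMASK, $old)
    || die "cannot restore signal mask: $!";
select(undef, undef, undef, 0.1);          # let Perl reach a safe point
my $after = $caught;                       # pending signal now delivered

print "during block: $during, after unblock: $after\n";
```

Unlike "IGNORE", the blocked signal is not lost: it is held by the
kernel and delivered once the mask is restored.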
64
65 Sending a signal to a negative process ID means that you send the
66 signal to the entire Unix process group. This code sends a hang-up
67 signal to all processes in the current process group, and also sets
68 $SIG{HUP} to "IGNORE" so it doesn't kill itself:
69
70 # block scope for local
71 {
72 local $SIG{HUP} = "IGNORE";
73 kill HUP => -getpgrp();
74 # snazzy writing of: kill("HUP", -getpgrp())
75 }
76
77 Another interesting signal to send is signal number zero. This doesn't
78 actually affect a child process, but instead checks whether it's alive
79 or has changed its UIDs.
80
81 unless (kill 0 => $kid_pid) {
82 warn "something wicked happened to $kid_pid";
83 }
84
85 Signal number zero may fail because you lack permission to send the
86 signal when directed at a process whose real or saved UID is not
87 identical to the real or effective UID of the sending process, even
88 though the process is alive. You may be able to determine the cause of
89 failure using $! or "%!".
90
91 unless (kill(0 => $pid) || $!{EPERM}) {
92 warn "$pid looks dead";
93 }
94
95 You might also want to employ anonymous functions for simple signal
96 handlers:
97
98 $SIG{INT} = sub { die "\nOutta here!\n" };
99
SIGCHLD handlers require some special care. If a second child dies
while we are in the signal handler triggered by the first death, we
won't get another signal. So we must loop here, or else we will leave
the unreaped child as a zombie. And the next time two children die we
get another zombie. And so on.
105
    use POSIX ":sys_wait_h";
    my %Kid_Status;    # exit status of each reaped child, keyed by PID
    $SIG{CHLD} = sub {
        while ((my $child = waitpid(-1, WNOHANG)) > 0) {
            $Kid_Status{$child} = $?;
        }
    };
    # do something that forks...
113
114 Be careful: qx(), system(), and some modules for calling external
115 commands do a fork(), then wait() for the result. Thus, your signal
116 handler will be called. Because wait() was already called by system()
117 or qx(), the wait() in the signal handler will see no more zombies and
118 will therefore block.
119
120 The best way to prevent this issue is to use waitpid(), as in the
121 following example:
122
    use POSIX ":sys_wait_h"; # for WNOHANG (nonblocking waitpid)

    my %children;

    $SIG{CHLD} = sub {
        # don't change $! and $? outside handler
        local ($!, $?);
        while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) {
            delete $children{$pid};
            cleanup_child($pid, $?);
        }
    };

    while (1) {
        my $pid = fork();
        die "cannot fork: $!" unless defined $pid;
        if ($pid == 0) {
            # ...
            exit 0;
        } else {
            $children{$pid}=1;
            # ...
            system($command);
            # ...
        }
    }
149
Signal handling is also used for timeouts in Unix. While safely
protected within an "eval{}" block, you set a signal handler to trap
alarm signals and then schedule one to be delivered to you in some
number of seconds. Then try your blocking operation, clearing the
alarm as soon as it completes, while you are still inside the "eval{}"
block. If the alarm goes off, you'll use die() to jump out of the
block.
156
157 Here's an example:
158
    my $ALARM_EXCEPTION = "alarm clock restart";
    eval {
        local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
        alarm 10;
        flock($fh, 2)    # blocking write lock
            || die "cannot flock: $!";
        alarm 0;
    };
    if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }
168
169 If the operation being timed out is system() or qx(), this technique is
170 liable to generate zombies. If this matters to you, you'll need to
171 do your own fork() and exec(), and kill the errant child process.
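
One way to do that is sketched below. The helper name
run_with_timeout() and its return convention are assumptions made for
this illustration, not part of Perl; the child here is simply another
perl that sleeps.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @cmd, killing it after $timeout seconds.
# Returns ($timed_out, $wait_status).
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    defined($pid) || die "cannot fork: $!";

    if ($pid == 0) {                       # child
        exec { $cmd[0] } @cmd;             # list form: no shell involved
        warn "cannot exec $cmd[0]: $!";
        POSIX::_exit(127);                 # skip END blocks and destructors
    }

    my $timed_out = 0;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        waitpid($pid, 0);                  # blocks until the child exits
        alarm 0;
        1;
    };
    if (!$ok) {
        die $@ unless $@ eq "timeout\n";   # propagate unrelated errors
        $timed_out = 1;
        kill KILL => $pid;                 # stop the errant child...
        waitpid($pid, 0);                  # ...and reap it, so no zombie
    }
    return ($timed_out, $?);
}

my ($timed_out, $status) = run_with_timeout(1, $^X, "-e", "sleep 30");
print $timed_out ? "killed after timeout\n"
                 : "finished with status $status\n";
```

Because the parent knows the child's PID, it can both kill and reap the
child itself, which system() and qx() do not let you do.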
172
173 For more complex signal handling, you might see the standard POSIX
174 module. Lamentably, this is almost entirely undocumented, but the
175 ext/POSIX/t/sigaction.t file from the Perl source distribution has some
176 examples in it.
177
178 Handling the SIGHUP Signal in Daemons
179 A process that usually starts when the system boots and shuts down when
180 the system is shut down is called a daemon (Disk And Execution
181 MONitor). If a daemon process has a configuration file which is
182 modified after the process has been started, there should be a way to
183 tell that process to reread its configuration file without stopping the
184 process. Many daemons provide this mechanism using a "SIGHUP" signal
185 handler. When you want to tell the daemon to reread the file, simply
186 send it the "SIGHUP" signal.
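
A minimal sketch of the reread approach follows, assuming a
hypothetical "key = value" configuration format. The names
load_config() and service_once() are invented for this illustration.
The handler only records the request; the main loop does the actual
reloading at a convenient point.

```perl
use strict;
use warnings;

my %config;
my $reload = 1;                       # load once at startup
$SIG{HUP} = sub { $reload = 1 };      # handler only records the request

# Hypothetical loader: parses "key = value" lines from $file.
sub load_config {
    my ($file) = @_;
    open(my $fh, "<", $file) || die "can't open $file: $!";
    my %new;
    while (my $line = <$fh>) {
        next if $line =~ /^\s*(?:#|$)/;          # skip comments and blanks
        my ($key, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
        $new{$key} = $value;
    }
    close($fh) || die "can't close $file: $!";
    return %new;
}

# One pass of a daemon's main loop; call it repeatedly.
sub service_once {
    my ($file) = @_;
    if ($reload) {
        $reload = 0;
        %config = load_config($file);
    }
    # ... do one unit of real work with %config ...
}
```

A daemon would call service_once($path) inside its main loop; sending
the process a HUP makes the next pass pick up the edited file.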
187
188 The following example implements a simple daemon, which restarts itself
189 every time the "SIGHUP" signal is received. The actual code is located
190 in the subroutine code(), which just prints some debugging info to show
191 that it works; it should be replaced with the real code.
192
193 #!/usr/bin/perl
194
195 use v5.36;
196
197 use POSIX ();
198 use FindBin ();
199 use File::Basename ();
200 use File::Spec::Functions qw(catfile);
201
202 $| = 1;
203
204 # make the daemon cross-platform, so exec always calls the script
205 # itself with the right path, no matter how the script was invoked.
206 my $script = File::Basename::basename($0);
207 my $SELF = catfile($FindBin::Bin, $script);
208
209 # POSIX unmasks the sigprocmask properly
210 $SIG{HUP} = sub {
211 print "got SIGHUP\n";
212 exec($SELF, @ARGV) || die "$0: couldn't restart: $!";
213 };
214
215 code();
216
217 sub code {
218 print "PID: $$\n";
219 print "ARGV: @ARGV\n";
220 my $count = 0;
221 while (1) {
222 sleep 2;
223 print ++$count, "\n";
224 }
225 }
226
227 Deferred Signals (Safe Signals)
228 Before Perl 5.8.0, installing Perl code to deal with signals exposed
229 you to danger from two things. First, few system library functions are
230 re-entrant. If the signal interrupts while Perl is executing one
231 function (like malloc(3) or printf(3)), and your signal handler then
232 calls the same function again, you could get unpredictable
233 behavior--often, a core dump. Second, Perl isn't itself re-entrant at
234 the lowest levels. If the signal interrupts Perl while Perl is
235 changing its own internal data structures, similarly unpredictable
236 behavior may result.
237
238 There were two things you could do, knowing this: be paranoid or be
239 pragmatic. The paranoid approach was to do as little as possible in
240 your signal handler. Set an existing integer variable that already has
241 a value, and return. This doesn't help you if you're in a slow system
242 call, which will just restart. That means you have to "die" to
243 longjmp(3) out of the handler. Even this is a little cavalier for the
244 true paranoiac, who avoids "die" in a handler because the system is out
245 to get you. The pragmatic approach was to say "I know the risks, but
246 prefer the convenience", and to do anything you wanted in your signal
247 handler, and be prepared to clean up core dumps now and again.
248
249 Perl 5.8.0 and later avoid these problems by "deferring" signals. That
250 is, when the signal is delivered to the process by the system (to the C
251 code that implements Perl) a flag is set, and the handler returns
252 immediately. Then at strategic "safe" points in the Perl interpreter
253 (e.g. when it is about to execute a new opcode) the flags are checked
254 and the Perl level handler from %SIG is executed. The "deferred" scheme
255 allows much more flexibility in the coding of signal handlers as we
256 know the Perl interpreter is in a safe state, and that we are not in a
257 system library function when the handler is called. However the
258 implementation does differ from previous Perls in the following ways:
259
260 Long-running opcodes
261 As the Perl interpreter looks at signal flags only when it is about
262 to execute a new opcode, a signal that arrives during a long-
263 running opcode (e.g. a regular expression operation on a very large
264 string) will not be seen until the current opcode completes.
265
266 If a signal of any given type fires multiple times during an opcode
267 (such as from a fine-grained timer), the handler for that signal
268 will be called only once, after the opcode completes; all other
269 instances will be discarded. Furthermore, if your system's signal
270 queue gets flooded to the point that there are signals that have
271 been raised but not yet caught (and thus not deferred) at the time
272 an opcode completes, those signals may well be caught and deferred
273 during subsequent opcodes, with sometimes surprising results. For
274 example, you may see alarms delivered even after calling alarm(0)
275 as the latter stops the raising of alarms but does not cancel the
276 delivery of alarms raised but not yet caught. Do not depend on the
277 behaviors described in this paragraph as they are side effects of
278 the current implementation and may change in future versions of
279 Perl.
280
281 Interrupting IO
282 When a signal is delivered (e.g., SIGINT from a control-C) the
283 operating system breaks into IO operations like read(2), which is
284 used to implement Perl's readline() function, the "<>" operator. On
285 older Perls the handler was called immediately (and as "read" is
286 not "unsafe", this worked well). With the "deferred" scheme the
287 handler is not called immediately, and if Perl is using the
288 system's "stdio" library that library may restart the "read"
289 without returning to Perl to give it a chance to call the %SIG
290 handler. If this happens on your system the solution is to use the
291 ":perlio" layer to do IO--at least on those handles that you want
292 to be able to break into with signals. (The ":perlio" layer checks
293 the signal flags and calls %SIG handlers before resuming IO
294 operation.)
295
296 The default in Perl 5.8.0 and later is to automatically use the
297 ":perlio" layer.
298
299 Note that it is not advisable to access a file handle within a
300 signal handler where that signal has interrupted an I/O operation
301 on that same handle. While perl will at least try hard not to
302 crash, there are no guarantees of data integrity; for example, some
303 data might get dropped or written twice.
304
305 Some networking library functions like gethostbyname() are known to
306 have their own implementations of timeouts which may conflict with
307 your timeouts. If you have problems with such functions, try using
308 the POSIX sigaction() function, which bypasses Perl safe signals.
309 Be warned that this does subject you to possible memory corruption,
310 as described above.
311
312 Instead of setting $SIG{ALRM}:
313
314 local $SIG{ALRM} = sub { die "alarm" };
315
316 try something like the following:
317
318 use POSIX qw(SIGALRM);
319 POSIX::sigaction(SIGALRM,
320 POSIX::SigAction->new(sub { die "alarm" }))
321 || die "Error setting SIGALRM handler: $!\n";
322
323 Another way to disable the safe signal behavior locally is to use
324 the "Perl::Unsafe::Signals" module from CPAN, which affects all
325 signals.
326
327 Restartable system calls
328 On systems that supported it, older versions of Perl used the
329 SA_RESTART flag when installing %SIG handlers. This meant that
330 restartable system calls would continue rather than returning when
331 a signal arrived. In order to deliver deferred signals promptly,
332 Perl 5.8.0 and later do not use SA_RESTART. Consequently,
333 restartable system calls can fail (with $! set to "EINTR") in
334 places where they previously would have succeeded.
335
336 The default ":perlio" layer retries "read", "write" and "close" as
337 described above; interrupted "wait" and "waitpid" calls will always
338 be retried.
339
340 Signals as "faults"
341 Certain signals like SEGV, ILL, BUS and FPE are generated by
342 virtual memory addressing errors and similar "faults". These are
343 normally fatal: there is little a Perl-level handler can do with
344 them. So Perl delivers them immediately rather than attempting to
345 defer them.
346
347 It is possible to catch these with a %SIG handler (see perlvar),
348 but on top of the usual problems of "unsafe" signals the signal is
349 likely to get rethrown immediately on return from the signal
350 handler, so such a handler should "die" or "exit" instead.
351
352 Signals triggered by operating system state
353 On some operating systems certain signal handlers are supposed to
354 "do something" before returning. One example can be CHLD or CLD,
355 which indicates a child process has completed. On some operating
356 systems the signal handler is expected to "wait" for the completed
357 child process. On such systems the deferred signal scheme will not
358 work for those signals: it does not do the "wait". Again the
359 failure will look like a loop as the operating system will reissue
360 the signal because there are completed child processes that have
361 not yet been "wait"ed for.
362
363 If you want the old signal behavior back despite possible memory
364 corruption, set the environment variable "PERL_SIGNALS" to "unsafe".
365 This feature first appeared in Perl 5.8.1.
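
As an illustration of the "Restartable system calls" point above, code
that calls sysread() directly may want to retry when the call fails
with EINTR. The helper name sysread_retry() is an assumption for this
sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: sysread() that retries when a signal interrupts it.
# Returns the data read (empty string at end of file); dies on real errors.
sub sysread_retry {
    my ($fh, $length) = @_;
    while (1) {
        my $n = sysread($fh, my $buffer, $length);
        return $buffer if defined $n;     # success, or 0 bytes at EOF
        next if $!{EINTR};                # interrupted by a signal: retry
        die "sysread failed: $!";
    }
}

pipe(my $reader, my $writer) || die "pipe failed: $!";
syswrite($writer, "ping") // die "syswrite failed: $!";
my $data = sysread_retry($reader, 4);
print "read: $data\n";
```

Any %SIG handler for the interrupting signal runs before sysread()
returns the error, so the retry loop does not swallow the signal.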
366
Named Pipes
A named pipe (often referred to as a FIFO) is an old Unix IPC mechanism
for processes communicating on the same machine. It works just like
regular anonymous pipes, except that the processes rendezvous using a
filename and need not be related.
372
373 To create a named pipe, use the POSIX::mkfifo() function.
374
375 use POSIX qw(mkfifo);
376 mkfifo($path, 0700) || die "mkfifo $path failed: $!";
377
378 You can also use the Unix command mknod(1), or on some systems,
379 mkfifo(1). These may not be in your normal path, though.
380
381 # system return val is backwards, so && not ||
382 #
383 $ENV{PATH} .= ":/etc:/usr/etc";
384 if ( system("mknod", $path, "p")
385 && system("mkfifo", $path) )
386 {
387 die "mk{nod,fifo} $path failed";
388 }
389
390 A fifo is convenient when you want to connect a process to an unrelated
391 one. When you open a fifo, the program will block until there's
392 something on the other end.
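
The blocking rendezvous can be seen in a small self-contained sketch.
The fifo name and the temporary directory are assumptions for the demo.
A fork() is used only to get a second process; an unrelated program
would do just as well.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);          # scratch directory for the demo
my $fifo = "$dir/demo.fifo";               # hypothetical fifo name
mkfifo($fifo, 0700) || die "mkfifo $fifo failed: $!";

my $pid = fork();
defined($pid) || die "cannot fork: $!";

if ($pid == 0) {                           # child: the writer
    # this open blocks until a reader opens the fifo
    open(my $out, ">", $fifo) || die "can't write $fifo: $!";
    print $out "hello from $$\n";
    close($out) || die "can't close $fifo: $!";
    POSIX::_exit(0);
}

# this open blocks until a writer opens the fifo
open(my $in, "<", $fifo) || die "can't read $fifo: $!";
my $line = <$in>;
close($in) || die "can't close $fifo: $!";
waitpid($pid, 0);
print "reader got: $line";
```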
393
394 For example, let's say you'd like to have your .signature file be a
395 named pipe that has a Perl program on the other end. Now every time
396 any program (like a mailer, news reader, finger program, etc.) tries to
397 read from that file, the reading program will read the new signature
398 from your program. We'll use the pipe-checking file-test operator, -p,
399 to find out whether anyone (or anything) has accidentally removed our
400 fifo.
401
402 chdir(); # go home
403 my $FIFO = ".signature";
404
405 while (1) {
406 unless (-p $FIFO) {
407 unlink $FIFO; # discard any failure, will catch later
408 require POSIX; # delayed loading of heavy module
409 POSIX::mkfifo($FIFO, 0700)
410 || die "can't mkfifo $FIFO: $!";
411 }
412
413 # next line blocks till there's a reader
414 open (my $fh, ">", $FIFO) || die "can't open $FIFO: $!";
415 print $fh "John Smith (smith\@host.org)\n", `fortune -s`;
416 close($fh) || die "can't close $FIFO: $!";
417 sleep 2; # to avoid dup signals
418 }
419
Using open() for IPC
Perl's basic open() statement can also be used for unidirectional
interprocess communication by specifying the open mode as "|-" or "-|".
Here's how to start something up in a child process you intend to write
to:
425
    open(my $spooler, "|-", "cat -v | lpr -h 2>/dev/null")
        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print $spooler "stuff\n";
    # parentheses matter: "close $spooler || die" would never call die
    close($spooler) || die "bad spool: $! $?";
431
432 And here's how to start up a child process you intend to read from:
433
    open(my $status, "-|", "netstat -an 2>&1")
        || die "can't fork: $!";
    while (<$status>) {
        next if /^(tcp|udp)/;
        print;
    }
    # parentheses matter: "close $status || die" would never call die
    close($status) || die "bad netstat: $! $?";
441
442 Be aware that these operations are full Unix forks, which means they
443 may not be correctly implemented on all alien systems. See "open" in
444 perlport for portability details.
445
446 In the two-argument form of open(), a pipe open can be achieved by
447 either appending or prepending a pipe symbol to the second argument:
448
449 open(my $spooler, "| cat -v | lpr -h 2>/dev/null")
450 || die "can't fork: $!";
451 open(my $status, "netstat -an 2>&1 |")
452 || die "can't fork: $!";
453
454 This can be used even on systems that do not support forking, but this
455 possibly allows code intended to read files to unexpectedly execute
456 programs. If one can be sure that a particular program is a Perl
457 script expecting filenames in @ARGV using the two-argument form of
458 open() or the "<>" operator, the clever programmer can write something
459 like this:
460
461 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
462
463 and no matter which sort of shell it's called from, the Perl program
464 will read from the file f1, the process cmd1, standard input (tmpfile
465 in this case), the f2 file, the cmd2 command, and finally the f3 file.
466 Pretty nifty, eh?
467
468 You might notice that you could use backticks for much the same effect
469 as opening a pipe for reading:
470
471 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
472 die "bad netstatus ($?)" if $?;
473
474 While this is true on the surface, it's much more efficient to process
475 the file one line or record at a time because then you don't have to
476 read the whole thing into memory at once. It also gives you finer
477 control of the whole process, letting you kill off the child process
478 early if you'd like.
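
As a sketch of that finer control, the following reads only the first
three lines from a child that would otherwise run forever, then kills
it. The child is a perl one-liner standing in for any long-running
command.

```perl
use strict;
use warnings;

# The child prints numbers forever; it stands in for any slow command.
my $pid = open(my $kid, "-|", $^X, "-e",
               '$| = 1; my $i = 0; print $i++, "\n" while 1')
    || die "can't fork: $!";

my @first;
while (my $line = <$kid>) {
    push @first, $line;
    last if @first == 3;          # we have seen enough
}

kill TERM => $pid;                # stop the child instead of draining it
close($kid);                      # reaps the child; $? holds its status
print @first;
```

The return value of close() is deliberately not checked here, because a
child that was killed by a signal makes close() return false.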
479
480 Be careful to check the return values from both open() and close(). If
481 you're writing to a pipe, you should also trap SIGPIPE. Otherwise,
482 think of what happens when you start up a pipe to a command that
483 doesn't exist: the open() will in all likelihood succeed (it only
484 reflects the fork()'s success), but then your output will
485 fail--spectacularly. Perl can't know whether the command worked,
486 because your command is actually running in a separate process whose
487 exec() might have failed. Therefore, while readers of bogus commands
488 return just a quick EOF, writers to bogus commands will get hit with a
489 signal, which they'd best be prepared to handle. Consider:
490
491 open(my $fh, "|-", "bogus") || die "can't fork: $!";
492 print $fh "bang\n"; # neither necessary nor sufficient
493 # to check print retval!
494 close($fh) || die "can't close: $!";
495
The return value from print() is not checked here because of pipe
buffering: physical writes are delayed, so the failure won't surface
until the close, where it arrives as a SIGPIPE. To catch it, you could
use this:
500
501 $SIG{PIPE} = "IGNORE";
502 open(my $fh, "|-", "bogus") || die "can't fork: $!";
503 print $fh "bang\n";
504 close($fh) || die "can't close: status=$?";
505
506 Filehandles
507 Both the main process and any child processes it forks share the same
508 STDIN, STDOUT, and STDERR filehandles. If both processes try to access
509 them at once, strange things can happen. You may also want to close or
510 reopen the filehandles for the child. You can get around this by
511 opening your pipe with open(), but on some systems this means that the
512 child process cannot outlive the parent.
513
514 Background Processes
515 You can run a command in the background with:
516
517 system("cmd &");
518
519 The command's STDOUT and STDERR (and possibly STDIN, depending on your
520 shell) will be the same as the parent's. You won't need to catch
521 SIGCHLD because of the double-fork taking place; see below for details.
522
523 Complete Dissociation of Child from Parent
524 In some cases (starting server processes, for instance) you'll want to
525 completely dissociate the child process from the parent. This is often
526 called daemonization. A well-behaved daemon will also chdir() to the
527 root directory so it doesn't prevent unmounting the filesystem
528 containing the directory from which it was launched, and redirect its
529 standard file descriptors from and to /dev/null so that random output
530 doesn't wind up on the user's terminal.
531
532 use POSIX "setsid";
533
534 sub daemonize {
535 chdir("/") || die "can't chdir to /: $!";
536 open(STDIN, "<", "/dev/null") || die "can't read /dev/null: $!";
537 open(STDOUT, ">", "/dev/null") || die "can't write /dev/null: $!";
538 defined(my $pid = fork()) || die "can't fork: $!";
539 exit if $pid; # non-zero now means I am the parent
540 (setsid() != -1) || die "Can't start a new session: $!";
541 open(STDERR, ">&", STDOUT) || die "can't dup stdout: $!";
542 }
543
544 The fork() has to come before the setsid() to ensure you aren't a
545 process group leader; the setsid() will fail if you are. If your
546 system doesn't have the setsid() function, open /dev/tty and use the
547 "TIOCNOTTY" ioctl() on it instead. See tty(4) for details.
548
549 Non-Unix users should check their "Your_OS::Process" module for other
550 possible solutions.
551
552 Safe Pipe Opens
553 Another interesting approach to IPC is making your single program go
554 multiprocess and communicate between--or even amongst--yourselves. The
555 two-argument form of the open() function will accept a file argument of
556 either "-|" or "|-" to do a very interesting thing: it forks a child
557 connected to the filehandle you've opened. The child is running the
558 same program as the parent. This is useful for safely opening a file
when running under an assumed UID or GID, for example. If you open a
pipe to minus, you can write to the filehandle you opened and your kid
will find it in its STDIN. If you open a pipe from minus, you can read
from the filehandle you opened whatever your kid writes to its STDOUT.
563
564 my $PRECIOUS = "/path/to/some/safe/file";
565 my $sleep_count;
566 my $pid;
567 my $kid_to_write;
568
569 do {
570 $pid = open($kid_to_write, "|-");
571 unless (defined $pid) {
572 warn "cannot fork: $!";
573 die "bailing out" if $sleep_count++ > 6;
574 sleep 10;
575 }
576 } until defined $pid;
577
578 if ($pid) { # I am the parent
579 print $kid_to_write @some_data;
580 close($kid_to_write) || warn "kid exited $?";
581 } else { # I am the child
582 # drop permissions in setuid and/or setgid programs:
583 ($>, $)) = ($<, $();
584 open (my $outfile, ">", $PRECIOUS)
585 || die "can't open $PRECIOUS: $!";
586 while (<STDIN>) {
587 print $outfile; # child STDIN is parent $kid_to_write
588 }
589 close($outfile) || die "can't close $PRECIOUS: $!";
590 exit(0); # don't forget this!!
591 }
592
593 Another common use for this construct is when you need to execute
594 something without the shell's interference. With system(), it's
595 straightforward, but you can't use a pipe open or backticks safely.
596 That's because there's no way to stop the shell from getting its hands
597 on your arguments. Instead, use lower-level control to call exec()
598 directly.
599
600 Here's a safe backtick or pipe open for read:
601
602 my $pid = open(my $kid_to_read, "-|");
603 defined($pid) || die "can't fork: $!";
604
605 if ($pid) { # parent
606 while (<$kid_to_read>) {
607 # do something interesting
608 }
609 close($kid_to_read) || warn "kid exited $?";
610
611 } else { # child
612 ($>, $)) = ($<, $(); # suid only
613 exec($program, @options, @args)
614 || die "can't exec program: $!";
615 # NOTREACHED
616 }
617
618 And here's a safe pipe open for writing:
619
620 my $pid = open(my $kid_to_write, "|-");
621 defined($pid) || die "can't fork: $!";
622
623 $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
624
625 if ($pid) { # parent
626 print $kid_to_write @data;
627 close($kid_to_write) || warn "kid exited $?";
628
629 } else { # child
630 ($>, $)) = ($<, $();
631 exec($program, @options, @args)
632 || die "can't exec program: $!";
633 # NOTREACHED
634 }
635
636 It is very easy to dead-lock a process using this form of open(), or
637 indeed with any use of pipe() with multiple subprocesses. The example
638 above is "safe" because it is simple and calls exec(). See "Avoiding
639 Pipe Deadlocks" for general safety principles, but there are extra
640 gotchas with Safe Pipe Opens.
641
642 In particular, if you opened the pipe using "open $fh, "|-"", then you
643 cannot simply use close() in the parent process to close an unwanted
644 writer. Consider this code:
645
    my $pid = open(my $writer, "|-");        # fork open a kid
    defined($pid)            || die "first fork failed: $!";
    if ($pid) {
        # check for failure before branching: a failed fork() returns
        # undef, which would otherwise fall into the child branch
        my $sub_pid = fork();
        defined($sub_pid)    || die "second fork failed: $!";
        if ($sub_pid) {
            close($writer)   || die "couldn't close writer: $!";
            # now do something else...
        }
        else {
            # first write to $writer
            # ...
            # then when finished
            close($writer)   || die "couldn't close writer: $!";
            exit(0);
        }
    }
    else {
        # first do something with STDIN, then
        exit(0);
    }
666
667 In the example above, the true parent does not want to write to the
668 $writer filehandle, so it closes it. However, because $writer was
669 opened using "open $fh, "|-"", it has a special behavior: closing it
670 calls waitpid() (see "waitpid" in perlfunc), which waits for the
671 subprocess to exit. If the child process ends up waiting for something
672 happening in the section marked "do something else", you have deadlock.
673
674 This can also be a problem with intermediate subprocesses in more
675 complicated code, which will call waitpid() on all open filehandles
676 during global destruction--in no predictable order.
677
678 To solve this, you must manually use pipe(), fork(), and the form of
679 open() which sets one file descriptor to another, as shown below:
680
    pipe(my $reader, my $writer)   || die "pipe failed: $!";
    my $pid = fork();
    defined($pid)                  || die "first fork failed: $!";
    if ($pid) {
        close $reader;
        # check for failure before branching: a failed fork() returns
        # undef, which would otherwise fall into the child branch
        my $sub_pid = fork();
        defined($sub_pid)          || die "second fork failed: $!";
        if ($sub_pid) {
            close($writer)         || die "can't close writer: $!";
        }
        else {
            # write to $writer...
            # ...
            # then when finished
            close($writer)         || die "can't close writer: $!";
            exit(0);
        }
        # now do something else...
    }
    else {
        open(STDIN, "<&", $reader) || die "can't reopen STDIN: $!";
        close($writer)             || die "can't close writer: $!";
        # do something...
        exit(0);
    }
705
706 Since Perl 5.8.0, you can also use the list form of "open" for pipes.
707 This is preferred when you wish to avoid having the shell interpret
708 metacharacters that may be in your command string.
709
710 So for example, instead of using:
711
712 open(my $ps_pipe, "-|", "ps aux") || die "can't open ps pipe: $!";
713
714 One would use either of these:
715
716 open(my $ps_pipe, "-|", "ps", "aux")
717 || die "can't open ps pipe: $!";
718
719 my @ps_args = qw[ ps aux ];
720 open(my $ps_pipe, "-|", @ps_args)
721 || die "can't open @ps_args|: $!";
722
723 Because there are more than three arguments to open(), it forks the
724 ps(1) command without spawning a shell, and reads its standard output
725 via the $ps_pipe filehandle. The corresponding syntax to write to
726 command pipes is to use "|-" in place of "-|".
727
728 This was admittedly a rather silly example, because you're using string
729 literals whose content is perfectly safe. There is therefore no cause
730 to resort to the harder-to-read, multi-argument form of pipe open().
731 However, whenever you cannot be assured that the program arguments are
732 free of shell metacharacters, the fancier form of open() should be
733 used. For example:
734
735 my @grep_args = ("egrep", "-i", $some_pattern, @many_files);
736 open(my $grep_pipe, "-|", @grep_args)
737 || die "can't open @grep_args|: $!";
738
739 Here the multi-argument form of pipe open() is preferred because the
740 pattern and indeed even the filenames themselves might hold
741 metacharacters.
742
743 Avoiding Pipe Deadlocks
Whenever you have more than one subprocess, you must be careful that
each closes whichever half of any pipes created for interprocess
communication it is not using. Otherwise, a child process reading from
the pipe and expecting an EOF will never receive it, and will therefore
never exit. A single process closing its copy of the write end is not
enough; the reader sees EOF only after every process that holds the
write end open has closed it.
751
752 Certain built-in Unix features help prevent this most of the time. For
753 instance, filehandles have a "close on exec" flag, which is set en
754 masse under control of the $^F variable. This is so any filehandles
755 you didn't explicitly route to the STDIN, STDOUT or STDERR of a child
756 program will be automatically closed.
757
758 Always explicitly and immediately call close() on the writable end of
759 any pipe, unless that process is actually writing to it. Even if you
760 don't explicitly call close(), Perl will still close() all filehandles
761 during global destruction. As previously discussed, if those
762 filehandles have been opened with Safe Pipe Open, this will result in
763 calling waitpid(), which may again deadlock.
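
The rule can be seen in a minimal pipe() and fork() sketch, in which
each side closes the half it does not use. The line count of three is
an arbitrary choice for the demo.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) || die "pipe failed: $!";

my $pid = fork();
defined($pid) || die "cannot fork: $!";

if ($pid == 0) {                      # child: reads only
    # without this close the child holds a write end itself,
    # so its own read loop would never see EOF
    close($writer) || die "can't close writer: $!";
    my $count = 0;
    $count++ while <$reader>;
    exit($count == 3 ? 0 : 1);
}

close($reader) || die "can't close reader: $!";   # parent: writes only
print $writer "line $_\n" for 1 .. 3;
close($writer) || die "can't close writer: $!";   # delivers EOF to child
waitpid($pid, 0);
my $child_ok = ($? == 0);
print "child ", ($child_ok ? "saw EOF and exited cleanly"
                           : "exited with an error"), "\n";
```

If the close($writer) in the child is removed, the child's read loop
never ends and the parent's waitpid() blocks forever.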
764
765 Bidirectional Communication with Another Process
766 While this works reasonably well for unidirectional communication, what
767 about bidirectional communication? The most obvious approach doesn't
768 work:
769
770 # THIS DOES NOT WORK!!
771 open(my $prog_for_reading_and_writing, "| some program |")
772
773 If you forget to "use warnings", you'll miss out entirely on the
774 helpful diagnostic message:
775
776 Can't do bidirectional pipe at -e line 1.
777
778 If you really want to, you can use the standard open2() from the
779 IPC::Open2 module to catch both ends. There's also an open3() in
780 IPC::Open3 for tridirectional I/O so you can also catch your child's
781 STDERR, but doing so would then require an awkward select() loop and
782 wouldn't allow you to use normal Perl input operations.
783
If you look at its source, you'll see that open2() uses low-level
primitives like the pipe() and exec() syscalls to create all the
connections. Although using socketpair() might have been more
efficient, that would have made open2() even less portable than it
already is. The open2() and open3() functions are unlikely to work
anywhere except on a Unix system, or at least one purporting POSIX
compliance.
790
791 Here's an example of using open2():
792
793 use IPC::Open2;
794 my $pid = open2(my $reader, my $writer, "cat -un");
795 print $writer "stuff\n";
796 my $got = <$reader>;
797 waitpid $pid, 0;
798
The problem with this is that buffering is really going to ruin your
day. Even though your $writer filehandle is auto-flushed so the
process on the other end gets your data in a timely manner, you can't
usually do anything to force that process to give its data to you in a
similarly quick fashion. In this special case we actually could do so,
because we gave cat a -u flag to make it unbuffered. But very few
commands are designed to operate over pipes, so this seldom works
unless you yourself wrote the program on the other end of the
double-ended pipe.
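
When you do control the program on the other end, the exchange can be
kept in step by having that program flush after every line. The sketch
below assumes a hypothetical filter, written as a perl one-liner and
run through $^X, that upper-cases each line it reads.

```perl
use strict;
use warnings;
use IPC::Open2;

# Hypothetical filter that we control: it turns on autoflush ($| = 1)
# so every reply is written to the pipe as soon as it is printed.
my $filter = '$| = 1; while (<STDIN>) { print uc }';

my $pid = open2(my $reader, my $writer, $^X, "-e", $filter);

my @replies;
for my $word (qw(stuff more)) {
    print $writer "$word\n";        # open2 autoflushes $writer for us
    chomp(my $reply = <$reader>);   # safe only because the child flushes
    push @replies, $reply;
}

close($writer);                     # child sees EOF and exits
waitpid($pid, 0);
print "@replies\n";
```

Without the "$| = 1" in the child, the first read from $reader would
block, because the child's reply would sit in its output buffer.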
808
809 A solution to this is to use a library which uses pseudottys to make
810 your program behave more reasonably. This way you don't have to have
811 control over the source code of the program you're using. The "Expect"
812 module from CPAN also addresses this kind of thing. This module
813 requires two other modules from CPAN, "IO::Pty" and "IO::Stty". It
814 sets up a pseudo terminal to interact with programs that insist on
815 talking to the terminal device driver. If your system is supported,
816 this may be your best bet.
817
818 Bidirectional Communication with Yourself
819 If you want, you may make low-level pipe() and fork() syscalls to
820 stitch this together by hand. This example only talks to itself, but
821 you could reopen the appropriate handles to STDIN and STDOUT and call
822 other processes. (The following example lacks proper error checking.)
823
    #!/usr/bin/perl
    # pipe1 - bidirectional communication using two pipe pairs
    #         designed for the socketpair-challenged
    use v5.36;
    use IO::Handle;   # enable autoflush method before Perl 5.14
    pipe(my $parent_rdr, my $child_wtr);  # XXX: check failure?
    pipe(my $child_rdr,  my $parent_wtr); # XXX: check failure?
    $child_wtr->autoflush(1);
    $parent_wtr->autoflush(1);

    # $pid must be declared: "use v5.36" turns on strict
    if (my $pid = fork()) {
        close $parent_rdr;
        close $parent_wtr;
        print $child_wtr "Parent Pid $$ is sending this\n";
        chomp(my $line = <$child_rdr>);
        print "Parent Pid $$ just read this: '$line'\n";
        close $child_rdr; close $child_wtr;
        waitpid($pid, 0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close $child_rdr;
        close $child_wtr;
        chomp(my $line = <$parent_rdr>);
        print "Child Pid $$ just read this: '$line'\n";
        print $parent_wtr "Child Pid $$ is sending this\n";
        close $parent_rdr;
        close $parent_wtr;
        exit(0);
    }
853
854 But you don't actually have to make two pipe calls. If you have the
855 socketpair() system call, it will do this all for you.
856
    #!/usr/bin/perl
    # pipe2 - bidirectional communication using socketpair
    #   "the best ones always go both ways"

    use v5.36;
    use Socket;
    use IO::Handle;  # enable autoflush method before Perl 5.14

    # We say AF_UNIX because although *_LOCAL is the
    # POSIX 1003.1g form of the constant, many machines
    # still don't have it.
    socketpair(my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
                                ||  die "socketpair: $!";

    $child->autoflush(1);
    $parent->autoflush(1);

    # $pid must be declared: "use v5.36" turns on strict
    if (my $pid = fork()) {
        close $parent;
        print $child "Parent Pid $$ is sending this\n";
        chomp(my $line = <$child>);
        print "Parent Pid $$ just read this: '$line'\n";
        close $child;
        waitpid($pid, 0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close $child;
        chomp(my $line = <$parent>);
        print "Child Pid $$ just read this: '$line'\n";
        print $parent "Child Pid $$ is sending this\n";
        close $parent;
        exit(0);
    }
890
Sockets are not entirely limited to Unix-derived operating systems
(e.g., WinSock on PCs provides socket support, as do some VMS
libraries), but you still might not have them on your system, in which
case this section probably isn't going to do you much good. With
sockets, you can do both virtual circuits like TCP streams and
datagrams like UDP packets. You may be able to do even more depending
on your system.
898
899 The Perl functions for dealing with sockets have the same names as the
900 corresponding system calls in C, but their arguments tend to differ for
901 two reasons. First, Perl filehandles work differently than C file
902 descriptors. Second, Perl already knows the length of its strings, so
903 you don't need to pass that information.
904
905 One of the major problems with ancient, antemillennial socket code in
906 Perl was that it used hard-coded values for some of the constants,
907 which severely hurt portability. If you ever see code that does
908 anything like explicitly setting "$AF_INET = 2", you know you're in for
909 big trouble. An immeasurably superior approach is to use the Socket
910 module, which more reliably grants access to the various constants and
911 functions you'll need.
912
913 If you're not writing a server/client for an existing protocol like
914 NNTP or SMTP, you should give some thought to how your server will know
915 when the client has finished talking, and vice-versa. Most protocols
916 are based on one-line messages and responses (so one party knows the
917 other has finished when a "\n" is received) or multi-line messages and
918 responses that end with a period on an empty line ("\n.\n" terminates a
919 message/response).
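
As a sketch of the second convention, the following reads one
multi-line message that ends with a lone period. The subroutine name
read_dot_message() is hypothetical, an in-memory filehandle stands in
for a socket, and the "dot-stuffing" that real protocols such as SMTP
add for data lines that begin with a period is deliberately left out.

```perl
use strict;
use warnings;

# Read one multi-line message that ends with a lone "." line.
# Accepts either "\015\012" or a bare "\012" as the line terminator.
sub read_dot_message {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;
        return \@lines if $line eq ".";     # end-of-message marker
        push @lines, $line;
    }
    return;     # EOF before the terminator: message was incomplete
}

# Demonstrate with an in-memory filehandle standing in for a socket.
my $wire = "line one\015\012line two\015\012.\015\012";
open(my $fh, "<", \$wire) || die "can't open in-memory handle: $!";
my $message = read_dot_message($fh);
print scalar(@$message), " lines received\n";
```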
920
921 Internet Line Terminators
922 The Internet line terminator is "\015\012". Under ASCII variants of
923 Unix, that could usually be written as "\r\n", but under other systems,
924 "\r\n" might at times be "\015\015\012", "\012\012\015", or something
925 completely different. The standards specify writing "\015\012" to be
926 conformant (be strict in what you provide), but they also recommend
927 accepting a lone "\012" on input (be lenient in what you require). We
928 haven't always been very good about that in the code in this manpage,
929 but unless you're on a Mac from way back in its pre-Unix dark ages,
930 you'll probably be ok.
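
One way to follow both halves of that advice is sketched below: always
send "\015\012", and strip either terminator from whatever arrives.
The helper name strip_eol() and the sample protocol lines are
assumptions made for this example.

```perl
use strict;
use warnings;

my $EOL = "\015\012";       # what we always send

# Strip the terminator from a received line, tolerating a lone "\012".
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

my $outgoing = "HELO example" . $EOL;               # strict on output
my @incoming = ("250 ok\015\012", "250 ok\012");    # lenient on input
my @cleaned  = map { strip_eol($_) } @incoming;

printf "%d bytes to send; both replies equal: %s\n",
    length($outgoing), ($cleaned[0] eq $cleaned[1] ? "yes" : "no");
```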
931
932 Internet TCP Clients and Servers
933 Use Internet-domain sockets when you want to do client-server
934 communication that might extend to machines outside of your own system.
935
936 Here's a sample TCP client using Internet-domain sockets:
937
938 #!/usr/bin/perl
939 use v5.36;
940 use Socket;
941
942 my $remote = shift || "localhost";
943 my $port = shift || 2345; # random port
944 if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
945 die "No port" unless $port;
946 my $iaddr = inet_aton($remote) || die "no host: $remote";
947 my $paddr = sockaddr_in($port, $iaddr);
948
949 my $proto = getprotobyname("tcp");
950 socket(my $sock, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
951 connect($sock, $paddr) || die "connect: $!";
952 while (my $line = <$sock>) {
953 print $line;
954 }
955
956 close ($sock) || die "close: $!";
957 exit(0);
958
And here's a corresponding server to go along with it. We'll leave the
address as "INADDR_ANY" so that the kernel can choose the appropriate
interface on multihomed hosts. If you want to sit on a particular
interface (like the external side of a gateway or firewall machine),
fill this in with your real address instead.
964
965 #!/usr/bin/perl -T
966 use v5.36;
967 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
968 use Socket;
969 use Carp;
970 my $EOL = "\015\012";
971
972 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
973
974 my $port = shift || 2345;
975 die "invalid port" unless $port =~ /^ \d+ $/x;
976
977 my $proto = getprotobyname("tcp");
978
979 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
980 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
981 || die "setsockopt: $!";
982 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
983 listen($server, SOMAXCONN) || die "listen: $!";
984
985 logmsg "server started on port $port";
986
987 for (my $paddr; $paddr = accept(my $client, $server); close $client) {
988 my($port, $iaddr) = sockaddr_in($paddr);
989 my $name = gethostbyaddr($iaddr, AF_INET);
990
        logmsg "connection from $name [",
                inet_ntoa($iaddr),
                "] at port $port";
994
995 print $client "Hello there, $name, it's now ",
996 scalar localtime(), $EOL;
997 }
998
999 And here's a multitasking version. It's multitasked in that like most
1000 typical servers, it spawns (fork()s) a child server to handle the
1001 client request so that the master server can quickly go back to service
1002 a new client.
1003
1004 #!/usr/bin/perl -T
1005 use v5.36;
1006 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1007 use Socket;
1008 use Carp;
1009 my $EOL = "\015\012";
1010
1011 sub spawn; # forward declaration
1012 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1013
1014 my $port = shift || 2345;
1015 die "invalid port" unless $port =~ /^ \d+ $/x;
1016
1017 my $proto = getprotobyname("tcp");
1018
1019 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
1020 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
1021 || die "setsockopt: $!";
1022 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
1023 listen($server, SOMAXCONN) || die "listen: $!";
1024
1025 logmsg "server started on port $port";
1026
1027 my $waitedpid = 0;
1028
1029 use POSIX ":sys_wait_h";
1030 use Errno;
1031
    sub REAPER {
        local $!;   # don't let waitpid() overwrite current error
        while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
            # report the pid that waitpid() just returned
            logmsg "reaped $pid" . ($? ? " with exit $?" : "");
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }
1039
1040 $SIG{CHLD} = \&REAPER;
1041
1042 while (1) {
1043 my $paddr = accept(my $client, $server) || do {
1044 # try again if accept() returned because got a signal
1045 next if $!{EINTR};
1046 die "accept: $!";
1047 };
1048 my ($port, $iaddr) = sockaddr_in($paddr);
1049 my $name = gethostbyaddr($iaddr, AF_INET);
1050
1051 logmsg "connection from $name [",
1052 inet_ntoa($iaddr),
1053 "] at port $port";
1054
1055 spawn $client, sub {
1056 $| = 1;
1057 print "Hello there, $name, it's now ",
1058 scalar localtime(),
1059 $EOL;
1060 exec "/usr/games/fortune" # XXX: "wrong" line terminators
1061 or confess "can't exec fortune: $!";
1062 };
1063 close $client;
1064 }
1065
1066 sub spawn {
1067 my $client = shift;
1068 my $coderef = shift;
1069
1070 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1071 confess "usage: spawn CLIENT CODEREF";
1072 }
1073
1074 my $pid;
1075 unless (defined($pid = fork())) {
1076 logmsg "cannot fork: $!";
1077 return;
1078 }
1079 elsif ($pid) {
1080 logmsg "begat $pid";
1081 return; # I'm the parent
1082 }
1083 # else I'm the child -- go spawn
1084
1085 open(STDIN, "<&", $client) || die "can't dup client to stdin";
1086 open(STDOUT, ">&", $client) || die "can't dup client to stdout";
1087 ## open(STDERR, ">&", STDOUT) || die "can't dup stdout to stderr";
1088 exit($coderef->());
1089 }
1090
This server takes the trouble to clone off a child version via fork()
for each incoming request. That way it can handle many requests at
once, which you might not always want. Even if you don't fork(), the
listen() call will still let up to SOMAXCONN connections queue up
while they wait to be accepted. Forking servers have to be
particularly careful about cleaning up their dead children (called
"zombies" in Unix parlance), because otherwise you'll quickly fill up
your process table. The REAPER subroutine is used here to call
waitpid() for any child processes that have finished, thereby ensuring
that they are reaped and don't join the ranks of the living dead.
1101
1102 Within the while loop we call accept() and check to see if it returns a
1103 false value. This would normally indicate a system error needs to be
1104 reported. However, the introduction of safe signals (see "Deferred
1105 Signals (Safe Signals)" above) in Perl 5.8.0 means that accept() might
1106 also be interrupted when the process receives a signal. This typically
1107 happens when one of the forked subprocesses exits and notifies the
1108 parent process with a CHLD signal.
1109
1110 If accept() is interrupted by a signal, $! will be set to EINTR. If
1111 this happens, we can safely continue to the next iteration of the loop
1112 and another call to accept(). It is important that your signal
1113 handling code not modify the value of $!, or else this test will likely
1114 fail. In the REAPER subroutine we create a local version of $! before
1115 calling waitpid(). When waitpid() sets $! to ECHILD as it inevitably
1116 does when it has no more children waiting, it updates the local copy
1117 and leaves the original unchanged.
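
The effect of localizing $! can be seen in isolation with the sketch
below. No real signal is involved; careful_handler() is a hypothetical
stand-in for a handler such as REAPER, and the two assignments to $!
simulate what accept() and waitpid() would do.

```perl
use strict;
use warnings;
use Errno qw(EINTR ECHILD);

# Stand-in for a signal handler: it clobbers $! the way waitpid() would,
# but only inside its own dynamic scope.
sub careful_handler {
    local $!;
    $! = ECHILD;
    return;
}

$! = EINTR;             # pretend accept() was just interrupted
careful_handler();      # pretend SIGCHLD arrived before we looked at $!

my $still_eintr = $!{EINTR} ? 1 : 0;
print "errno preserved across handler: $still_eintr\n";
```

Remove the "local $!" line and the final test sees ECHILD instead,
which is exactly how an accept() loop ends up dying on a harmless
interruption.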
1118
You should use the -T flag to enable taint checking (see perlsec) even
if you aren't running setuid or setgid. This is always a good idea for
servers or any program run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside
will be able to compromise your system. Note that perl can be built
without taint support. Such builds come in two flavors: in one, -T
silently does nothing; in the other, -T results in a fatal error.
1126
1127 Let's look at another TCP client. This one connects to the TCP "time"
1128 service on a number of different machines and shows how far their
1129 clocks differ from the system on which it's being run:
1130
1131 #!/usr/bin/perl
1132 use v5.36;
1133 use Socket;
1134
1135 my $SECS_OF_70_YEARS = 2208988800;
1136 sub ctime { scalar localtime(shift() || time()) }
1137
1138 my $iaddr = gethostbyname("localhost");
1139 my $proto = getprotobyname("tcp");
1140 my $port = getservbyname("time", "tcp");
1141 my $paddr = sockaddr_in(0, $iaddr);
1142
1143 $| = 1;
1144 printf "%-24s %8s %s\n", "localhost", 0, ctime();
1145
1146 foreach my $host (@ARGV) {
1147 printf "%-24s ", $host;
1148 my $hisiaddr = inet_aton($host) || die "unknown host";
1149 my $hispaddr = sockaddr_in($port, $hisiaddr);
1150 socket(my $socket, PF_INET, SOCK_STREAM, $proto)
1151 || die "socket: $!";
1152 connect($socket, $hispaddr) || die "connect: $!";
1153 my $rtime = pack("C4", ());
1154 read($socket, $rtime, 4);
1155 close($socket);
1156 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1157 printf "%8d %s\n", $histime - time(), ctime($histime);
1158 }
1159
1160 Unix-Domain TCP Clients and Servers
1161 That's fine for Internet-domain clients and servers, but what about
1162 local communications? While you can use the same setup, sometimes you
1163 don't want to. Unix-domain sockets are local to the current host, and
1164 are often used internally to implement pipes. Unlike Internet domain
1165 sockets, Unix domain sockets can show up in the file system with an
1166 ls(1) listing.
1167
1168 % ls -l /dev/log
1169 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
1170
1171 You can test for these with Perl's -S file test:
1172
1173 unless (-S "/dev/log") {
1174 die "something's wicked with the log system";
1175 }
1176
1177 Here's a sample Unix-domain client:
1178
1179 #!/usr/bin/perl
1180 use v5.36;
1181 use Socket;
1182
1183 my $rendezvous = shift || "catsock";
1184 socket(my $sock, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1185 connect($sock, sockaddr_un($rendezvous)) || die "connect: $!";
1186 while (defined(my $line = <$sock>)) {
1187 print $line;
1188 }
1189 exit(0);
1190
1191 And here's a corresponding server. You don't have to worry about silly
1192 network terminators here because Unix domain sockets are guaranteed to
1193 be on the localhost, and thus everything works right.
1194
1195 #!/usr/bin/perl -T
1196 use v5.36;
1197 use Socket;
1198 use Carp;
1199
1200 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1201 sub spawn; # forward declaration
1202 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1203
1204 my $NAME = "catsock";
1205 my $uaddr = sockaddr_un($NAME);
1206 my $proto = getprotobyname("tcp");
1207
1208 socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1209 unlink($NAME);
1210 bind ($server, $uaddr) || die "bind: $!";
1211 listen($server, SOMAXCONN) || die "listen: $!";
1212
1213 logmsg "server started on $NAME";
1214
1215 my $waitedpid;
1216
1217 use POSIX ":sys_wait_h";
1218 sub REAPER {
1219 my $child;
1220 while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
1221 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1222 }
1223 $SIG{CHLD} = \&REAPER; # loathe SysV
1224 }
1225
1226 $SIG{CHLD} = \&REAPER;
1227
1228
1229 for ( $waitedpid = 0;
1230 accept(my $client, $server) || $waitedpid;
1231 $waitedpid = 0, close $client)
1232 {
1233 next if $waitedpid;
1234 logmsg "connection on $NAME";
1235 spawn $client, sub {
1236 print "Hello there, it's now ", scalar localtime(), "\n";
1237 exec("/usr/games/fortune") || die "can't exec fortune: $!";
1238 };
1239 }
1240
1241 sub spawn {
1242 my $client = shift();
1243 my $coderef = shift();
1244
1245 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1246 confess "usage: spawn CLIENT CODEREF";
1247 }
1248
1249 my $pid;
1250 unless (defined($pid = fork())) {
1251 logmsg "cannot fork: $!";
1252 return;
1253 }
1254 elsif ($pid) {
1255 logmsg "begat $pid";
1256 return; # I'm the parent
1257 }
1258 else {
1259 # I'm the child -- go spawn
1260 }
1261
1262 open(STDIN, "<&", $client)
1263 || die "can't dup client to stdin";
1264 open(STDOUT, ">&", $client)
1265 || die "can't dup client to stdout";
1266 ## open(STDERR, ">&", STDOUT)
1267 ## || die "can't dup stdout to stderr";
1268 exit($coderef->());
1269 }
1270
As you see, it's remarkably similar to the Internet domain TCP server,
so much so, in fact, that several of its functions (spawn(), logmsg(),
and REAPER()) are essentially the same as in the other server.
1275
1276 So why would you ever want to use a Unix domain socket instead of a
1277 simpler named pipe? Because a named pipe doesn't give you sessions.
1278 You can't tell one process's data from another's. With socket
1279 programming, you get a separate session for each client; that's why
1280 accept() takes two arguments.
1281
1282 For example, let's say that you have a long-running database server
1283 daemon that you want folks to be able to access from the Web, but only
1284 if they go through a CGI interface. You'd have a small, simple CGI
1285 program that does whatever checks and logging you feel like, and then
1286 acts as a Unix-domain client and connects to your private server.
1287
1289 For those preferring a higher-level interface to socket programming,
1290 the IO::Socket module provides an object-oriented approach. If for
1291 some reason you lack this module, you can just fetch IO::Socket from
1292 CPAN, where you'll also find modules providing easy interfaces to the
1293 following systems: DNS, FTP, Ident (RFC 931), NIS and NISPlus, NNTP,
1294 Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--to name just a few.
1295
1296 A Simple Client
1297 Here's a client that creates a TCP connection to the "daytime" service
1298 at port 13 of the host name "localhost" and prints out everything that
1299 the server there cares to provide.
1300
1301 #!/usr/bin/perl
1302 use v5.36;
1303 use IO::Socket;
1304 my $remote = IO::Socket::INET->new(
1305 Proto => "tcp",
1306 PeerAddr => "localhost",
1307 PeerPort => "daytime(13)",
1308 )
1309 || die "can't connect to daytime service on localhost";
1310 while (<$remote>) { print }
1311
1312 When you run this program, you should get something back that looks
1313 like this:
1314
1315 Wed May 14 08:40:46 MDT 1997
1316
1317 Here are what those parameters to the new() constructor mean:
1318
1319 "Proto"
This is which protocol to use. In this case, the socket handle
returned will be connected to a TCP socket, because we want a
stream-oriented connection, that is, one that acts pretty much like
a plain old file. Not all sockets are of this type. For example,
the UDP protocol can be used to make a datagram socket, used for
message-passing.
1326
1327 "PeerAddr"
1328 This is the name or Internet address of the remote host the server
1329 is running on. We could have specified a longer name like
1330 "www.perl.com", or an address like "207.171.7.72". For
1331 demonstration purposes, we've used the special hostname
1332 "localhost", which should always mean the current machine you're
1333 running on. The corresponding Internet address for localhost is
1334 "127.0.0.1", if you'd rather use that.
1335
1336 "PeerPort"
1337 This is the service name or port number we'd like to connect to.
1338 We could have gotten away with using just "daytime" on systems with
1339 a well-configured system services file,[FOOTNOTE: The system
1340 services file is found in /etc/services under Unixy systems.] but
1341 here we've specified the port number (13) in parentheses. Using
1342 just the number would have also worked, but numeric literals make
1343 careful programmers nervous.
1344
1345 A Webget Client
1346 Here's a simple client that takes a remote host to fetch a document
1347 from, and then a list of files to get from that host. This is a more
1348 interesting client than the previous one because it first sends
1349 something to the server before fetching the server's response.
1350
1351 #!/usr/bin/perl
1352 use v5.36;
1353 use IO::Socket;
1354 unless (@ARGV > 1) { die "usage: $0 host url ..." }
1355 my $host = shift(@ARGV);
1356 my $EOL = "\015\012";
1357 my $BLANK = $EOL x 2;
1358 for my $document (@ARGV) {
1359 my $remote = IO::Socket::INET->new( Proto => "tcp",
1360 PeerAddr => $host,
1361 PeerPort => "http(80)",
1362 ) || die "cannot connect to httpd on $host";
1363 $remote->autoflush(1);
1364 print $remote "GET $document HTTP/1.0" . $BLANK;
1365 while ( <$remote> ) { print }
1366 close $remote;
1367 }
1368
1369 The web server handling the HTTP service is assumed to be at its
1370 standard port, number 80. If the server you're trying to connect to is
1371 at a different port, like 1080 or 8080, you should specify it as the
1372 named-parameter pair, "PeerPort => 8080". The "autoflush" method is
1373 used on the socket because otherwise the system would buffer up the
1374 output we sent it. (If you're on a prehistoric Mac, you'll also need
1375 to change every "\n" in your code that sends data over the network to
1376 be a "\015\012" instead.)
1377
1378 Connecting to the server is only the first part of the process: once
1379 you have the connection, you have to use the server's language. Each
1380 server on the network has its own little command language that it
1381 expects as input. The string that we send to the server starting with
1382 "GET" is in HTTP syntax. In this case, we simply request each
1383 specified document. Yes, we really are making a new connection for
1384 each document, even though it's the same host. That's the way you
1385 always used to have to speak HTTP. Recent versions of web browsers may
1386 request that the remote server leave the connection open a little
1387 while, but the server doesn't have to honor such a request.
1388
1389 Here's an example of running that program, which we'll call webget:
1390
1391 % webget www.perl.com /guanaco.html
1392 HTTP/1.1 404 File Not Found
1393 Date: Thu, 08 May 1997 18:02:32 GMT
1394 Server: Apache/1.2b6
1395 Connection: close
1396 Content-type: text/html
1397
1398 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1399 <BODY><H1>File Not Found</H1>
1400 The requested URL /guanaco.html was not found on this server.<P>
1401 </BODY>
1402
1403 Ok, so that's not very interesting, because it didn't find that
1404 particular document. But a long response wouldn't have fit on this
1405 page.
1406
1407 For a more featureful version of this program, you should look to the
1408 lwp-request program included with the LWP modules from CPAN.
1409
1410 Interactive Client with IO::Socket
1411 Well, that's all fine if you want to send one command and get one
1412 answer, but what about setting up something fully interactive, somewhat
1413 like the way telnet works? That way you can type a line, get the
1414 answer, type a line, get the answer, etc.
1415
This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful "fork" call, the solution
isn't that rough. Once you've made the connection to whatever service
you'd like to chat with, call "fork" to clone your process. Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket. To
accomplish the same thing using just one process would be much harder,
because it's easier to code two processes to do one thing than it is to
code one process to do two things. (This keep-it-simple principle is
one of the cornerstones of the Unix philosophy, and of good software
engineering as well, which is probably why it's spread to other
systems.)
1428
1429 Here's the code:
1430
1431 #!/usr/bin/perl
1432 use v5.36;
1433 use IO::Socket;
1434
1435 unless (@ARGV == 2) { die "usage: $0 host port" }
1436 my ($host, $port) = @ARGV;
1437
1438 # create a tcp connection to the specified host and port
1439 my $handle = IO::Socket::INET->new(Proto => "tcp",
1440 PeerAddr => $host,
1441 PeerPort => $port)
1442 || die "can't connect to port $port on $host: $!";
1443
1444 $handle->autoflush(1); # so output gets there right away
1445 print STDERR "[Connected to $host:$port]\n";
1446
1447 # split the program into two processes, identical twins
1448 die "can't fork: $!" unless defined(my $kidpid = fork());
1449
1450 # the if{} block runs only in the parent process
1451 if ($kidpid) {
1452 # copy the socket to standard output
1453 while (defined (my $line = <$handle>)) {
1454 print STDOUT $line;
1455 }
1456 kill("TERM", $kidpid); # send SIGTERM to child
1457 }
1458 # the else{} block runs only in the child process
1459 else {
1460 # copy standard input to the socket
1461 while (defined (my $line = <STDIN>)) {
1462 print $handle $line;
1463 }
1464 exit(0); # just in case
1465 }
1466
1467 The "kill" function in the parent's "if" block is there to send a
1468 signal to our child process, currently running in the "else" block, as
1469 soon as the remote server has closed its end of the connection.
1470
If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not
happen), you may wish to replace the "while" loop in the parent with
the following:
1474
1475 my $byte;
1476 while (sysread($handle, $byte, 1) == 1) {
1477 print STDOUT $byte;
1478 }
1479
1480 Making a system call for each byte you want to read is not very
1481 efficient (to put it mildly) but is the simplest to explain and works
1482 reasonably well.
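
A middle ground, sketched below, is to ask sysread() for a larger
chunk. It returns as soon as any data is available rather than waiting
for the full amount, so a prompt without a newline still shows up right
away. The subroutine name copy_chunks() and the 4096-byte size are
arbitrary choices for this example, a pipe stands in for the socket,
and an interrupted read (EINTR) is simply treated as fatal.

```perl
use strict;
use warnings;

# Copy everything from $in to $out in chunks of up to 4096 bytes.
sub copy_chunks {
    my ($in, $out) = @_;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buf, 4096);
        die "sysread: $!" unless defined $got;
        last if $got == 0;                  # EOF
        print {$out} $buf;
        $total += $got;
    }
    return $total;
}

# Demonstrate with a pipe standing in for the socket.
pipe(my $rdr, my $wtr) || die "pipe: $!";
syswrite($wtr, "Command? ") // die "syswrite: $!";
close($wtr);
my $copied = copy_chunks($rdr, \*STDOUT);
```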
1483
As always, setting up a server is a little bit more involved than
running a client. The model is that the server creates a special kind
of socket that does nothing but listen on a particular port for
incoming connections. It does this by calling the
"IO::Socket::INET->new()" method with slightly different arguments than
the client did.
1490
1491 Proto
1492 This is which protocol to use. Like our clients, we'll still
1493 specify "tcp" here.
1494
1495 LocalPort
We specify a local port in the "LocalPort" argument, which we
didn't do for the client. This is the service name or port number
for which you want to be the server. (Under Unix, ports under 1024
are restricted to the superuser.) In our sample, we'll use port
9000, but you can use any port that's not currently in use on your
system. If you try to use one already in use, you'll get an
"Address already in use" message. Under Unix, the "netstat -a"
command will show which services currently have servers.
1504
1505 Listen
1506 The "Listen" parameter is set to the maximum number of pending
1507 connections we can accept until we turn away incoming clients.
1508 Think of it as a call-waiting queue for your telephone. The low-
1509 level Socket module has a special symbol for the system maximum,
1510 which is SOMAXCONN.
1511
1512 Reuse
The "Reuse" parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.
1516
1517 Once the generic server socket has been created using the parameters
1518 listed above, the server then waits for a new client to connect to it.
1519 The server blocks in the "accept" method, which eventually accepts a
1520 bidirectional connection from the remote client. (Make sure to
1521 autoflush this handle to circumvent buffering.)
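
The listen-then-accept sequence can be tried out in a single process,
as in the sketch below. It assumes a working loopback interface at
127.0.0.1 and uses port 0 so that the kernel picks a free port; those
choices, the "LocalAddr" setting, and the greeting text are all made up
for this example.

```perl
use strict;
use warnings;
use IO::Socket;

# Listening socket on the loopback interface; port 0 asks the kernel
# to pick any free port so we cannot collide with a running service.
my $server = IO::Socket::INET->new(
    Proto     => "tcp",
    LocalAddr => "127.0.0.1",
    LocalPort => 0,
    Listen    => SOMAXCONN,
    Reuse     => 1,
) || die "can't set up server: $@";

# Connect to ourselves; the connection waits in the listen queue.
my $client = IO::Socket::INET->new(
    Proto    => "tcp",
    PeerAddr => "127.0.0.1",
    PeerPort => $server->sockport,
) || die "can't connect: $@";

my $session = $server->accept() || die "accept: $!";
$session->autoflush(1);             # circumvent buffering, as advised
print $session "Command? \n";
chomp(my $greeting = <$client>);
print "client read: '$greeting'\n";

close($_) for $session, $client, $server;
```

Because the pending connection is held in the queue that "Listen"
sets up, the client's connect succeeds before accept() is ever called.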
1522
1523 To add to user-friendliness, our server prompts the user for commands.
1524 Most servers don't do this. Because of the prompt without a newline,
1525 you'll have to use the "sysread" variant of the interactive client
1526 above.
1527
1528 This server accepts one of five different commands, sending output back
1529 to the client. Unlike most network servers, this one handles only one
1530 incoming client at a time. Multitasking servers are covered in Chapter
1531 16 of the Camel.
1532
1533 Here's the code.
1534
1535 #!/usr/bin/perl
1536 use v5.36;
1537 use IO::Socket;
1538 use Net::hostent; # for OOish version of gethostbyaddr
1539
1540 my $PORT = 9000; # pick something not in use
1541
1542 my $server = IO::Socket::INET->new( Proto => "tcp",
1543 LocalPort => $PORT,
1544 Listen => SOMAXCONN,
1545 Reuse => 1);
1546
1547 die "can't setup server" unless $server;
1548 print "[Server $0 accepting clients]\n";
1549
1550 while (my $client = $server->accept()) {
1551 $client->autoflush(1);
1552 print $client "Welcome to $0; type help for command list.\n";
1553 my $hostinfo = gethostbyaddr($client->peeraddr);
1554 printf "[Connect from %s]\n",
1555 $hostinfo ? $hostinfo->name : $client->peerhost;
1556 print $client "Command? ";
1557 while ( <$client>) {
1558 next unless /\S/; # blank line
1559 if (/quit|exit/i) { last }
1560 elsif (/date|time/i) { printf $client "%s\n", scalar localtime() }
1561 elsif (/who/i ) { print $client `who 2>&1` }
1562 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` }
1563 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` }
1564 else {
1565 print $client "Commands: quit date who cookie motd\n";
1566 }
1567 } continue {
1568 print $client "Command? ";
1569 }
1570 close $client;
1571 }

Another kind of client-server setup is one that uses not connections,
but messages. UDP communications involve much lower overhead but also
provide less reliability, as there are no promises that messages will
arrive at all, let alone in order and unmangled. Still, UDP offers
some advantages over TCP, including being able to "broadcast" or
"multicast" to a whole bunch of destination hosts at once (usually on
your local subnet). If you find yourself overly concerned about
reliability and start building checks into your message system, then
you should probably just use TCP to begin with.

UDP datagrams are not a bytestream and should not be treated as such.
This makes using I/O mechanisms with internal buffering like stdio
(i.e. print() and friends) especially cumbersome. Use syswrite(), or
better send(), as in the example below.
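
The point about message boundaries can be seen on the loopback
interface. What follows is only a minimal sketch, not one of the
original examples: it assumes 127.0.0.1 is usable, lets the kernel
choose the port, and invents the two message strings. Each send()
produces one datagram and each recv() hands back one datagram, no
matter how large the receive buffer is.

```perl
#!/usr/bin/perl
use v5.36;
use IO::Socket::INET;
use IO::Select;

# Receiver on the loopback interface; port 0 lets the kernel pick one.
my $receiver = IO::Socket::INET->new(
    Proto     => "udp",
    LocalAddr => "127.0.0.1",
    LocalPort => 0,
) or die "can't create receiver: $@";

# Sender with a default destination, so send() needs no address.
my $sender = IO::Socket::INET->new(
    Proto    => "udp",
    PeerAddr => "127.0.0.1",
    PeerPort => $receiver->sockport,
) or die "can't create sender: $@";

# Each send() emits exactly one datagram.
for my $message ("first", "second message") {
    defined($sender->send($message)) || die "send: $!";
}

# Each recv() returns at most one datagram, however large the buffer.
my $select = IO::Select->new($receiver);
my @received;
for (1 .. 2) {
    my @ready = $select->can_read(5);     # give up after 5 seconds
    @ready || die "timed out waiting for a datagram";
    my $datagram = "";
    defined($receiver->recv($datagram, 1024)) || die "recv: $!";
    push @received, $datagram;
}
say "got: '$_'" for @received;
```

Had the same two strings been written to a TCP socket, a single read
could have returned them run together, which is why stream-style
buffered I/O is a poor fit for datagrams.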

Here's a UDP program similar to the sample Internet TCP client given
earlier. However, instead of checking one host at a time, the UDP
version will check many of them asynchronously by simulating a
multicast and then using select() to do a timed-out wait for I/O. To
do something similar with TCP, you'd have to use a different socket
handle for each host.

    #!/usr/bin/perl
    use v5.36;
    use Socket;
    use Sys::Hostname;

    # seconds between the 1900 epoch of the time protocol and the Unix epoch
    my $SECS_OF_70_YEARS = 2_208_988_800;

    my $iaddr = gethostbyname(hostname());
    my $proto = getprotobyname("udp");
    my $port  = getservbyname("time", "udp");
    my $paddr = sockaddr_in(0, $iaddr);   # 0 means let kernel pick

    socket(my $socket, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
    bind($socket, $paddr)                           || die "bind: $!";

    $| = 1;
    printf "%-12s %8s %s\n", "localhost", 0, scalar localtime();
    my $count = 0;
    for my $host (@ARGV) {
        $count++;
        my $hisiaddr = inet_aton($host)         || die "unknown host: $host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send($socket, 0, 0, $hispaddr)) || die "send $host: $!";
    }

    my $rout = my $rin = "";
    vec($rin, fileno($socket), 1) = 1;

    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        my $rtime = "";
        my $hispaddr = recv($socket, $rtime, 4, 0) || die "recv: $!";
        my ($port, $hisiaddr) = sockaddr_in($hispaddr);
        # fall back to the dotted quad when the reverse lookup fails
        my $host = gethostbyaddr($hisiaddr, AF_INET) // inet_ntoa($hisiaddr);
        my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time(), scalar localtime($histime);
        $count--;
    }

This example does not include any retries and may consequently fail to
contact a reachable host. The most prominent reason for this is
congestion of the queues on the sending host if the number of hosts to
contact is sufficiently large.
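
One way to add retries is to resend only to the hosts that have not
answered yet, for a bounded number of rounds. The sketch below shows
just that bookkeeping. Everything in it is hypothetical: the helper
name query_with_retries, its parameters, and the simulated hosts are
invented for illustration, and the two callbacks stand in for the
send() call and the select()/recv() loop of a real program.

```perl
#!/usr/bin/perl
use v5.36;

# Hypothetical helper: query every host in @$hosts, resending to the
# ones that have not answered, for at most $max_rounds rounds.
#   $send->($host)     sends one request datagram to $host
#   $collect->($wait)  returns the hosts that replied within $wait seconds
sub query_with_retries ($hosts, $send, $collect, $max_rounds = 3, $wait = 2.0) {
    my %pending = map { $_ => 1 } @$hosts;
    my @answered;
    for my $round (1 .. $max_rounds) {
        last unless %pending;
        $send->($_) for sort keys %pending;
        for my $host ($collect->($wait)) {
            # delete() is false for late duplicates and unknown senders
            push @answered, $host if delete $pending{$host};
        }
    }
    return (\@answered, [ sort keys %pending ]);
}

# Simulated network: "flaky" answers only its second request and
# "silent" never answers at all.
my (%requests, @inbox);
my $send = sub ($host) {
    my $n = ++$requests{$host};
    return if $host eq "silent";
    return if $host eq "flaky" && $n < 2;
    push @inbox, $host;
};
my $collect = sub ($wait) { my @got = @inbox; @inbox = (); return @got };

my ($ok, $lost) = query_with_retries([qw(alpha flaky silent)], $send, $collect);
say "answered: @$ok";
say "no reply: @$lost";
```

Bounding the number of rounds keeps an unreachable host from stalling
the program, at the price of reporting it as lost after the final
round.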

While System V IPC isn't as widely used as sockets, it still has some
interesting uses. However, you cannot use SysV IPC or Berkeley mmap()
to have a variable shared amongst several processes. That's because
Perl would reallocate your string when you didn't want it to. You
might look into the "IPC::Shareable" or "threads::shared" modules for
that.
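
Data can still cross process boundaries if each process copies it into
and out of the segment explicitly rather than expecting a Perl variable
to live there. The following is a minimal sketch under some
assumptions: the system supports SysV shared memory and fork(), and the
4-byte layout and the numbers are made up for the illustration.

```perl
#!/usr/bin/perl
use v5.36;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

my $size = 4;    # room for one packed 32-bit integer
my $id   = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
defined($id) || die "shmget: $!";

# Parent stores an initial value in the segment.
shmwrite($id, pack("N", 1), 0, $size) || die "shmwrite: $!";

my $pid = fork() // die "fork: $!";
if ($pid == 0) {
    # Child: copy the value out, change it, and copy it back in.
    shmread($id, my $raw, 0, $size) || die "shmread: $!";
    my $n = unpack("N", $raw);
    shmwrite($id, pack("N", $n + 41), 0, $size) || die "shmwrite: $!";
    exit 0;
}
waitpid($pid, 0);

# Parent: an ordinary variable would not show the child's change,
# but a fresh shmread() of the segment does.
shmread($id, my $raw, 0, $size) || die "shmread: $!";
my $value = unpack("N", $raw);
say "parent sees $value";

shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
```

The waitpid() call is the only synchronization here. Processes that
update the segment concurrently would need a lock, for which a
semaphore is one option.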

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

    my $size = 2000;
    my $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
    defined($id)                    || die "shmget: $!";
    print "shm id $id\n";           # shmget returns an id, not a key

    my $message = "Message #1";
    shmwrite($id, $message, 0, 60)  || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, my $buff, 0, 60)   || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = "";
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0)        || die "shmctl: $!";

Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    my $IPC_KEY = 1234;
    my $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id)                    || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file take:

    # look up the semaphore set created above

    my $IPC_KEY = 1234;
    my $id = semget($IPC_KEY, 0, 0);
    defined($id)                    || die "semget: $!";

    my $semnum  = 0;
    my $semflag = 0;

    # "take" semaphore
    # wait for semaphore to be zero
    my $semop = 0;
    my $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    my $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    my $opstring  = $opstring1 . $opstring2;

    semop($id, $opstring)           || die "semop: $!";

Put this code in a separate file to be run in more than one process.
Call this file give:

    # "give" the semaphore
    # run this in the original process and you will see
    # that the second process continues

    my $IPC_KEY = 1234;
    my $id = semget($IPC_KEY, 0, 0);
    defined($id)                    || die "semget: $!";

    my $semnum  = 0;
    my $semflag = 0;

    # Decrement the semaphore count
    my $semop = -1;
    my $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id, $opstring)           || die "semop: $!";

The SysV IPC code above was written long ago, and it's definitely
clunky looking. For a more modern look, see the IPC::SysV module.
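
As a rough illustration of that more modern look, the take and give
steps can be written with the IPC::Semaphore class that is distributed
alongside IPC::SysV. This is a sketch under assumptions, not a
replacement for the files above: it uses a private one-semaphore set
and runs both steps in a single process so that it never blocks.

```perl
#!/usr/bin/perl
use v5.36;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# A private set holding a single semaphore, explicitly set to zero.
my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT)
    // die "semget: $!";
$sem->setval(0, 0) || die "setval: $!";

# "take": wait for semaphore 0 to be zero, then increment it.
# op() accepts a flat list of (semnum, semop, semflag) triples.
$sem->op(0, 0, 0,
         0, 1, 0) || die "take: $!";
my $after_take = $sem->getval(0);

# "give": decrement the semaphore count again.
$sem->op(0, -1, 0) || die "give: $!";
my $after_give = $sem->getval(0);

say "after take: $after_take, after give: $after_give";

$sem->remove || die "remove: $!";
```

The class does the pack("s!s!s!", ...) step internally, so the calling
code deals only with plain integers.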

A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
    defined($id)                || die "msgget failed: $!";

    my $sent      = "message";
    my $type_sent = 1234;

    msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
                                || die "msgsnd failed: $!";

    msgrcv($id, my $rcvd_buf, 60, 0, 0)
                                || die "msgrcv failed: $!";

    my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);

    if ($rcvd eq $sent) {
        print "okay\n";
    } else {
        print "not okay\n";
    }

    msgctl($id, IPC_RMID, 0)    || die "msgctl failed: $!\n";
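
The same exchange can be sketched with the IPC::Msg class that ships
with IPC::SysV. As before, this is an illustrative variant rather than
the canonical way to do it. It assumes a private queue used within one
process and reuses the message text and type from the example above.

```perl
#!/usr/bin/perl
use v5.36;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Msg;

my $queue = IPC::Msg->new(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR)
    // die "msgget failed: $!";

my $type_sent = 1234;
$queue->snd($type_sent, "message") || die "msgsnd failed: $!";

# rcv() stores the payload in its first argument and returns the
# message type, so no manual unpack("l! a*", ...) is needed.
my $rcvd;
my $type_rcvd = $queue->rcv($rcvd, 60) // die "msgrcv failed: $!";

say "type $type_rcvd: '$rcvd'";

$queue->remove || die "msgctl failed: $!";
```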

Most of these routines quietly but politely return "undef" when they
fail instead of causing your program to die right then and there due to
an uncaught exception. (Actually, some of the new Socket conversion
functions do croak() on bad arguments.) It is therefore essential to
check return values from these functions. Always begin your socket
programs this way for optimal success, and don't forget to add the -T
taint-checking flag to the "#!" line for servers:

    #!/usr/bin/perl -T
    use v5.36;
    use sigtrap;
    use Socket;
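
The two failure styles mentioned above call for two kinds of check. In
this small sketch, the protocol name "no-such-protocol" and the
three-byte string are deliberately bogus inputs chosen for the
illustration: the lookup returns undef, while the conversion function
croaks and has to be trapped with eval.

```perl
#!/usr/bin/perl
use v5.36;
use Socket qw(inet_ntoa);

# Lookup-style calls return undef on failure, so test the result.
my $proto  = getprotobyname("no-such-protocol");
my $lookup = defined($proto) ? "found protocol $proto" : "lookup failed";

# Conversion functions croak on malformed input, so trap the exception.
# inet_ntoa() wants a packed 4-byte address; "abc" is only 3 bytes.
my $dotted     = eval { inet_ntoa("abc") };
my $conversion = defined($dotted) ? "converted to $dotted" : "croaked: $@";

say $lookup;
print $conversion;
```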

These routines all create system-specific portability problems. As
noted elsewhere, Perl is at the mercy of your C libraries for much of
its system behavior. It's probably safest to assume broken SysV
semantics for signals and to stick with simple TCP and UDP socket
operations; e.g., don't try to pass open file descriptors over a local
UDP datagram socket if you want your code to stand a chance of being
portable.

Tom Christiansen, with occasional vestiges of Larry Wall's original
version and suggestions from the Perl Porters.

There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is Unix Network
Programming, 2nd Edition, Volume 1 by W. Richard Stevens (published by
Prentice-Hall). Most books on networking address the subject from the
perspective of a C programmer; translation to Perl is left as an
exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the
Socket(3) manpage describes the low-level interface to sockets.
Besides the obvious functions in perlfunc, you should also check out
the modules file at your nearest CPAN site, especially
<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. See
perlmodlib or, best yet, the Perl FAQ for a description of what CPAN is
and where to get it if the previous link doesn't work for you.

Section 5 of CPAN's modules file is devoted to "Networking, Device
Control (modems), and Interprocess Communication", and contains
numerous unbundled modules, including networking modules, Chat and
Expect operations, CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty,
RPC, SNMP, SMTP, Telnet, Threads, and ToolTalk, to name just a few.



perl v5.38.2 2023-11-30 PERLIPC(1)