PERLIPC(1)             Perl Programmers Reference Guide             PERLIPC(1)

NAME
perlipc - Perl interprocess communication (signals, fifos, pipes, safe
subprocesses, sockets, and semaphores)

DESCRIPTION
The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and
SysV IPC calls. Each is used in slightly different situations.

Signals
Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers. Each handler is
called with one argument, the name of the signal that triggered it.
A signal may be generated intentionally from a
19 particular keyboard sequence like control-C or control-Z, sent to you
20 from another process, or triggered automatically by the kernel when
21 special events transpire, like a child process exiting, your own
22 process running out of stack space, or hitting a process file-size
23 limit.
24
25 For example, to trap an interrupt signal, set up a handler like this:
26
27 our $shucks;
28
29 sub catch_zap {
30 my $signame = shift;
31 $shucks++;
32 die "Somebody sent me a SIG$signame";
33 }
34 $SIG{INT} = __PACKAGE__ . "::catch_zap";
35 $SIG{INT} = \&catch_zap; # best strategy
36
37 Prior to Perl 5.7.3 it was necessary to do as little as you possibly
38 could in your handler; notice how all we do is set a global variable
39 and then raise an exception. That's because on most systems, libraries
40 are not re-entrant; particularly, memory allocation and I/O routines
41 are not. That meant that doing nearly anything in your handler could
42 in theory trigger a memory fault and subsequent core dump - see
43 "Deferred Signals (Safe Signals)" below.
44
45 The names of the signals are the ones listed out by "kill -l" on your
46 system, or you can retrieve them using the CPAN module IPC::Signal.
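
If IPC::Signal is not installed, the core Config module carries the same
information. The following is a minimal sketch, assuming your perl was built
with populated sig_name and sig_num entries (the usual case on Unix). The
helper name signal_tables() is made up for illustration.

```perl
use strict;
use warnings;
use Config;

# Build name <-> number lookup tables from the perl build configuration.
# signal_tables() is a hypothetical helper, not part of any module.
sub signal_tables {
    defined($Config{sig_name}) || die "no signal names in Config?";
    my @names = split ' ', $Config{sig_name};
    my @nums  = split ' ', $Config{sig_num};
    my (%signo, @signame);
    for my $i (0 .. $#names) {
        $signo{ $names[$i] } = $nums[$i];
        # aliases such as CLD come later in the list; the first name wins
        $signame[ $nums[$i] ] //= $names[$i];
    }
    return (\%signo, \@signame);
}

my ($signo, $signame) = signal_tables();
print "SIGINT is signal number $signo->{INT}\n";
```

The two tables let you go from a name to a number for kill(), and from a
number back to a name when decoding a wait status.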
47
48 You may also choose to assign the strings "IGNORE" or "DEFAULT" as the
49 handler, in which case Perl will try to discard the signal or do the
50 default thing.
51
52 On most Unix platforms, the "CHLD" (sometimes also known as "CLD")
53 signal has special behavior with respect to a value of "IGNORE".
54 Setting $SIG{CHLD} to "IGNORE" on such a platform has the effect of not
55 creating zombie processes when the parent process fails to "wait()" on
56 its child processes (i.e., child processes are automatically reaped).
57 Calling "wait()" with $SIG{CHLD} set to "IGNORE" usually returns "-1"
58 on such platforms.
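
A small sketch of that behavior follows. It assumes a platform where ignoring
CHLD makes the kernel reap children automatically (Linux and the BSDs do);
elsewhere wait() may return a real PID instead. The helper name
spawn_ignored_children() is hypothetical.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork $n short-lived children, never reap them by
# hand, and report what wait() returns once they are gone.
sub spawn_ignored_children {
    my ($n) = @_;
    local $SIG{CHLD} = "IGNORE";          # ask the system to reap for us
    for (1 .. $n) {
        defined(my $pid = fork()) || die "cannot fork: $!";
        POSIX::_exit(0) if $pid == 0;     # child: leave without END blocks
    }
    # Where auto-reaping applies, wait() blocks until the children have
    # gone and then fails, because nothing is left to collect.
    return wait();
}

print "wait() returned ", spawn_ignored_children(3), "\n";
```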
59
60 Some signals can be neither trapped nor ignored, such as the KILL and
61 STOP (but not the TSTP) signals. Note that ignoring signals makes them
62 disappear. If you only want them blocked temporarily without them
63 getting lost you'll have to use POSIX' sigprocmask.
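
Here is one way that might look, as a sketch rather than a recipe. It uses
POSIX::SigSet and sigprocmask() from the POSIX module; the wrapper name
with_sigint_blocked() is an invention for this example.

```perl
use strict;
use warnings;
use POSIX qw(SIGINT SIG_BLOCK SIG_SETMASK sigprocmask);

# Hypothetical wrapper: run $code with SIGINT blocked.  A SIGINT that
# arrives meanwhile stays pending and is delivered once the previous
# mask is restored.
sub with_sigint_blocked {
    my ($code) = @_;
    my $block = POSIX::SigSet->new(SIGINT);
    my $saved = POSIX::SigSet->new;
    sigprocmask(SIG_BLOCK, $block, $saved)
        || die "cannot block SIGINT: $!";
    my @result = eval { $code->() };
    my $err = $@;
    sigprocmask(SIG_SETMASK, $saved)
        || die "cannot restore signal mask: $!";
    die $err if $err;
    return @result;
}

my $caught = 0;
$SIG{INT} = sub { $caught++ };

with_sigint_blocked(sub {
    kill INT => $$;                        # signal ourselves
    print "inside:  caught=$caught\n";     # handler has not run yet
});
print "outside: caught=$caught\n";         # the pending SIGINT has arrived
```

Unlike "IGNORE", nothing is thrown away here: the signal is merely held back
until the mask is restored.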
64
65 Sending a signal to a negative process ID means that you send the
66 signal to the entire Unix process group. This code sends a hang-up
67 signal to all processes in the current process group, and also sets
68 $SIG{HUP} to "IGNORE" so it doesn't kill itself:
69
70 # block scope for local
71 {
72 local $SIG{HUP} = "IGNORE";
73 kill HUP => -$$;
74 # snazzy writing of: kill("HUP", -$$)
75 }
76
77 Another interesting signal to send is signal number zero. This doesn't
78 actually affect a child process, but instead checks whether it's alive
79 or has changed its UIDs.
80
81 unless (kill 0 => $kid_pid) {
82 warn "something wicked happened to $kid_pid";
83 }
84
85 Signal number zero may fail because you lack permission to send the
86 signal when directed at a process whose real or saved UID is not
87 identical to the real or effective UID of the sending process, even
88 though the process is alive. You may be able to determine the cause of
89 failure using $! or "%!".
90
91 unless (kill(0 => $pid) || $!{EPERM}) {
92 warn "$pid looks dead";
93 }
94
95 You might also want to employ anonymous functions for simple signal
96 handlers:
97
98 $SIG{INT} = sub { die "\nOutta here!\n" };
99
SIGCHLD handlers require some special care. If a second child dies
while in the signal handler caused by the first death, we won't get
another signal. So we must loop here, or else we will leave the
unreaped child as a zombie. And the next time two children die we get
another zombie. And so on.
105
106 use POSIX ":sys_wait_h";
107 $SIG{CHLD} = sub {
108 while ((my $child = waitpid(-1, WNOHANG)) > 0) {
109 $Kid_Status{$child} = $?;
110 }
111 };
112 # do something that forks...
113
114 Be careful: qx(), system(), and some modules for calling external
115 commands do a fork(), then wait() for the result. Thus, your signal
116 handler will be called. Because wait() was already called by system()
117 or qx(), the wait() in the signal handler will see no more zombies and
118 will therefore block.
119
120 The best way to prevent this issue is to use waitpid(), as in the
121 following example:
122
123 use POSIX ":sys_wait_h"; # for nonblocking read
124
125 my %children;
126
127 $SIG{CHLD} = sub {
128 # don't change $! and $? outside handler
129 local ($!, $?);
130 my $pid = waitpid(-1, WNOHANG);
131 return if $pid == -1;
132 return unless defined $children{$pid};
133 delete $children{$pid};
134 cleanup_child($pid, $?);
135 };
136
137 while (1) {
138 my $pid = fork();
139 die "cannot fork" unless defined $pid;
140 if ($pid == 0) {
141 # ...
142 exit 0;
143 } else {
144 $children{$pid}=1;
145 # ...
146 system($command);
147 # ...
148 }
149 }
150
151 Signal handling is also used for timeouts in Unix. While safely
152 protected within an "eval{}" block, you set a signal handler to trap
153 alarm signals and then schedule to have one delivered to you in some
154 number of seconds. Then try your blocking operation, clearing the
155 alarm when it's done but not before you've exited your "eval{}" block.
156 If it goes off, you'll use die() to jump out of the block.
157
158 Here's an example:
159
160 my $ALARM_EXCEPTION = "alarm clock restart";
161 eval {
162 local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
163 alarm 10;
164 flock(FH, 2) # blocking write lock
165 || die "cannot flock: $!";
166 alarm 0;
167 };
168 if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }
169
170 If the operation being timed out is system() or qx(), this technique is
171 liable to generate zombies. If this matters to you, you'll need to
172 do your own fork() and exec(), and kill the errant child process.
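
A rough sketch of that do-it-yourself approach is below. The helper name
run_with_timeout() and its return convention are assumptions made for this
example, and it ignores the small race where the alarm fires just after the
child has already been reaped.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @cmd without a shell and give up after $secs
# seconds.  Returns (timed_out_flag, wait_status).
sub run_with_timeout {
    my ($secs, @cmd) = @_;

    defined(my $pid = fork()) || die "cannot fork: $!";
    if ($pid == 0) {                       # child
        exec { $cmd[0] } @cmd;
        warn "cannot exec $cmd[0]: $!";
        POSIX::_exit(127);                 # never fall back into parent code
    }

    my $timed_out = 0;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $secs;
        waitpid($pid, 0);
        alarm 0;
        1;
    };
    unless ($ok) {
        alarm 0;
        die $@ unless $@ eq "timeout\n";   # some other failure: rethrow
        $timed_out = 1;
        kill KILL => $pid;                 # kill the errant child ...
        waitpid($pid, 0);                  # ... and reap it, so no zombie
    }
    return ($timed_out, $?);
}

my ($timed_out, $status) = run_with_timeout(10, $^X, "-e", "sleep 1");
print $timed_out ? "timed out\n" : "exit status " . ($status >> 8) . "\n";
```

Because the parent holds the child's PID, it can both kill and reap the
child, which is exactly what system() and qx() do not let you do.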
173
174 For more complex signal handling, you might see the standard POSIX
175 module. Lamentably, this is almost entirely undocumented, but the
176 t/lib/posix.t file from the Perl source distribution has some examples
177 in it.
178
179 Handling the SIGHUP Signal in Daemons
180 A process that usually starts when the system boots and shuts down when
181 the system is shut down is called a daemon (Disk And Execution
182 MONitor). If a daemon process has a configuration file which is
183 modified after the process has been started, there should be a way to
184 tell that process to reread its configuration file without stopping the
185 process. Many daemons provide this mechanism using a "SIGHUP" signal
186 handler. When you want to tell the daemon to reread the file, simply
187 send it the "SIGHUP" signal.
188
189 The following example implements a simple daemon, which restarts itself
190 every time the "SIGHUP" signal is received. The actual code is located
191 in the subroutine "code()", which just prints some debugging info to
192 show that it works; it should be replaced with the real code.
193
194 #!/usr/bin/perl -w
195
196 use POSIX ();
197 use FindBin ();
198 use File::Basename ();
199 use File::Spec::Functions;
200
201 $| = 1;
202
203 # make the daemon cross-platform, so exec always calls the script
204 # itself with the right path, no matter how the script was invoked.
205 my $script = File::Basename::basename($0);
206 my $SELF = catfile($FindBin::Bin, $script);
207
208 # POSIX unmasks the sigprocmask properly
209 $SIG{HUP} = sub {
210 print "got SIGHUP\n";
211 exec($SELF, @ARGV) || die "$0: couldn't restart: $!";
212 };
213
214 code();
215
216 sub code {
217 print "PID: $$\n";
218 print "ARGV: @ARGV\n";
219 my $count = 0;
220 while (++$count) {
221 sleep 2;
222 print "$count\n";
223 }
224 }
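
Restarting through exec() is the bluntest way to pick up new settings. Where
the daemon can reread its configuration in place, a lighter sketch is to have
the handler merely record the request and let the main loop act on it. The
"key = value" file format and the names load_config(), maybe_reload() and
$Reload are all assumptions made for illustration.

```perl
use strict;
use warnings;

our $Reload = 0;
$SIG{HUP} = sub { $Reload = 1 };           # just note the request

# Hypothetical parser: one "key = value" pair per line, "#" for comments.
sub load_config {
    my ($path) = @_;
    my %conf;
    open(my $fh, "<", $path) || die "can't open $path: $!";
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;           # trim, which also drops the newline
        next if $line eq "" || $line =~ /^#/;
        my ($key, $value) = split /\s*=\s*/, $line, 2;
        next unless defined $value;
        $conf{$key} = $value;
    }
    close($fh) || die "can't close $path: $!";
    return \%conf;
}

# Call this once per iteration of the daemon's main loop.
sub maybe_reload {
    my ($path, $conf) = @_;
    return $conf unless $Reload;
    $Reload = 0;
    return load_config($path);
}
```

The main loop would then begin each pass with something like
"$conf = maybe_reload($path, $conf);" before doing its real work.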
225
226 Deferred Signals (Safe Signals)
227 Before Perl 5.7.3, installing Perl code to deal with signals exposed
228 you to danger from two things. First, few system library functions are
229 re-entrant. If the signal interrupts while Perl is executing one
230 function (like malloc(3) or printf(3)), and your signal handler then
231 calls the same function again, you could get unpredictable
232 behavior--often, a core dump. Second, Perl isn't itself re-entrant at
233 the lowest levels. If the signal interrupts Perl while Perl is
234 changing its own internal data structures, similarly unpredictable
235 behavior may result.
236
237 There were two things you could do, knowing this: be paranoid or be
238 pragmatic. The paranoid approach was to do as little as possible in
239 your signal handler. Set an existing integer variable that already has
240 a value, and return. This doesn't help you if you're in a slow system
241 call, which will just restart. That means you have to "die" to
242 longjmp(3) out of the handler. Even this is a little cavalier for the
243 true paranoiac, who avoids "die" in a handler because the system is out
244 to get you. The pragmatic approach was to say "I know the risks, but
245 prefer the convenience", and to do anything you wanted in your signal
246 handler, and be prepared to clean up core dumps now and again.
247
248 Perl 5.7.3 and later avoid these problems by "deferring" signals. That
249 is, when the signal is delivered to the process by the system (to the C
250 code that implements Perl) a flag is set, and the handler returns
251 immediately. Then at strategic "safe" points in the Perl interpreter
252 (e.g. when it is about to execute a new opcode) the flags are checked
253 and the Perl level handler from %SIG is executed. The "deferred" scheme
254 allows much more flexibility in the coding of signal handlers as we
255 know the Perl interpreter is in a safe state, and that we are not in a
256 system library function when the handler is called. However the
257 implementation does differ from previous Perls in the following ways:
258
259 Long-running opcodes
260 As the Perl interpreter looks at signal flags only when it is about
261 to execute a new opcode, a signal that arrives during a long-
262 running opcode (e.g. a regular expression operation on a very large
263 string) will not be seen until the current opcode completes.
264
265 If a signal of any given type fires multiple times during an opcode
266 (such as from a fine-grained timer), the handler for that signal
267 will be called only once, after the opcode completes; all other
268 instances will be discarded. Furthermore, if your system's signal
269 queue gets flooded to the point that there are signals that have
270 been raised but not yet caught (and thus not deferred) at the time
271 an opcode completes, those signals may well be caught and deferred
272 during subsequent opcodes, with sometimes surprising results. For
273 example, you may see alarms delivered even after calling alarm(0)
274 as the latter stops the raising of alarms but does not cancel the
275 delivery of alarms raised but not yet caught. Do not depend on the
276 behaviors described in this paragraph as they are side effects of
277 the current implementation and may change in future versions of
278 Perl.
279
280 Interrupting IO
281 When a signal is delivered (e.g., SIGINT from a control-C) the
282 operating system breaks into IO operations like read(2), which is
283 used to implement Perl's readline() function, the "<>" operator. On
284 older Perls the handler was called immediately (and as "read" is
285 not "unsafe", this worked well). With the "deferred" scheme the
286 handler is not called immediately, and if Perl is using the
287 system's "stdio" library that library may restart the "read"
288 without returning to Perl to give it a chance to call the %SIG
289 handler. If this happens on your system the solution is to use the
290 ":perlio" layer to do IO--at least on those handles that you want
291 to be able to break into with signals. (The ":perlio" layer checks
292 the signal flags and calls %SIG handlers before resuming IO
293 operation.)
294
295 The default in Perl 5.7.3 and later is to automatically use the
296 ":perlio" layer.
297
298 Note that it is not advisable to access a file handle within a
299 signal handler where that signal has interrupted an I/O operation
300 on that same handle. While perl will at least try hard not to
301 crash, there are no guarantees of data integrity; for example, some
302 data might get dropped or written twice.
303
304 Some networking library functions like gethostbyname() are known to
305 have their own implementations of timeouts which may conflict with
306 your timeouts. If you have problems with such functions, try using
307 the POSIX sigaction() function, which bypasses Perl safe signals.
308 Be warned that this does subject you to possible memory corruption,
309 as described above.
310
311 Instead of setting $SIG{ALRM}:
312
313 local $SIG{ALRM} = sub { die "alarm" };
314
315 try something like the following:
316
317 use POSIX qw(SIGALRM);
318 POSIX::sigaction(SIGALRM, POSIX::SigAction->new(sub { die "alarm" }))
319 || die "Error setting SIGALRM handler: $!\n";
320
321 Another way to disable the safe signal behavior locally is to use
322 the "Perl::Unsafe::Signals" module from CPAN, which affects all
323 signals.
324
325 Restartable system calls
326 On systems that supported it, older versions of Perl used the
327 SA_RESTART flag when installing %SIG handlers. This meant that
328 restartable system calls would continue rather than returning when
329 a signal arrived. In order to deliver deferred signals promptly,
330 Perl 5.7.3 and later do not use SA_RESTART. Consequently,
331 restartable system calls can fail (with $! set to "EINTR") in
332 places where they previously would have succeeded.
333
334 The default ":perlio" layer retries "read", "write" and "close" as
335 described above; interrupted "wait" and "waitpid" calls will always
336 be retried.
337
338 Signals as "faults"
339 Certain signals like SEGV, ILL, and BUS are generated by virtual
340 memory addressing errors and similar "faults". These are normally
341 fatal: there is little a Perl-level handler can do with them. So
342 Perl delivers them immediately rather than attempting to defer
343 them.
344
345 Signals triggered by operating system state
346 On some operating systems certain signal handlers are supposed to
347 "do something" before returning. One example can be CHLD or CLD,
348 which indicates a child process has completed. On some operating
349 systems the signal handler is expected to "wait" for the completed
350 child process. On such systems the deferred signal scheme will not
351 work for those signals: it does not do the "wait". Again the
352 failure will look like a loop as the operating system will reissue
353 the signal because there are completed child processes that have
354 not yet been "wait"ed for.
355
356 If you want the old signal behavior back despite possible memory
357 corruption, set the environment variable "PERL_SIGNALS" to "unsafe".
358 This feature first appeared in Perl 5.8.1.
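
Code that calls sysread() directly does not go through the ":perlio" retry
logic mentioned under "Restartable system calls", so it has to cope with
EINTR itself. A minimal sketch follows; the helper name sysread_retry() is
hypothetical.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# Hypothetical helper: like sysread(), but try again when a signal
# interrupted the call before any data arrived.
sub sysread_retry {
    my ($fh, $len) = @_;
    while (1) {
        my $buf = "";
        my $n = sysread($fh, $buf, $len);
        return $buf if defined $n;         # data, or "" at end of file
        next if $! == EINTR;               # interrupted: simply retry
        die "read failed: $!";
    }
}

pipe(my $reader, my $writer) || die "pipe failed: $!";
syswrite($writer, "hello") // die "write failed: $!";
print "got: ", sysread_retry($reader, 100), "\n";
```

A handler that may run in between should protect $! with "local $!", as the
SIGCHLD example earlier does.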
359
Named Pipes
361 A named pipe (often referred to as a FIFO) is an old Unix IPC mechanism
362 for processes communicating on the same machine. It works just like
363 regular anonymous pipes, except that the processes rendezvous using a
364 filename and need not be related.
365
366 To create a named pipe, use the "POSIX::mkfifo()" function.
367
368 use POSIX qw(mkfifo);
369 mkfifo($path, 0700) || die "mkfifo $path failed: $!";
370
371 You can also use the Unix command mknod(1), or on some systems,
372 mkfifo(1). These may not be in your normal path, though.
373
374 # system return val is backwards, so && not ||
375 #
376 $ENV{PATH} .= ":/etc:/usr/etc";
377 if ( system("mknod", $path, "p")
378 && system("mkfifo", $path) )
379 {
380 die "mk{nod,fifo} $path failed";
381 }
382
383 A fifo is convenient when you want to connect a process to an unrelated
384 one. When you open a fifo, the program will block until there's
385 something on the other end.
386
387 For example, let's say you'd like to have your .signature file be a
388 named pipe that has a Perl program on the other end. Now every time
389 any program (like a mailer, news reader, finger program, etc.) tries to
390 read from that file, the reading program will read the new signature
391 from your program. We'll use the pipe-checking file-test operator, -p,
392 to find out whether anyone (or anything) has accidentally removed our
393 fifo.
394
395 chdir(); # go home
396 my $FIFO = ".signature";
397
398 while (1) {
399 unless (-p $FIFO) {
400 unlink $FIFO; # discard any failure, will catch later
401 require POSIX; # delayed loading of heavy module
402 POSIX::mkfifo($FIFO, 0700)
403 || die "can't mkfifo $FIFO: $!";
404 }
405
406 # next line blocks till there's a reader
407 open (FIFO, "> $FIFO") || die "can't open $FIFO: $!";
408 print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
409 close(FIFO) || die "can't close $FIFO: $!";
410 sleep 2; # to avoid dup signals
411 }
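
To watch the rendezvous from both ends in a single script, here is a
self-contained sketch that forks a writer and reads from the fifo in the
parent. The scratch directory, the file name demo.fifo and the helper name
fifo_round_trip() are arbitrary choices for this example.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# Hypothetical demo: create a fifo, let a child write one line into it,
# and return what the parent reads back.
sub fifo_round_trip {
    my ($message) = @_;
    my $dir  = tempdir(CLEANUP => 1);
    my $fifo = "$dir/demo.fifo";
    mkfifo($fifo, 0700) || die "mkfifo $fifo failed: $!";

    defined(my $pid = fork()) || die "cannot fork: $!";
    if ($pid == 0) {                       # child: the writer
        open(my $out, ">", $fifo) || POSIX::_exit(1);
        print {$out} "$message\n";
        close($out) || POSIX::_exit(1);
        POSIX::_exit(0);
    }

    # parent: the reader; this open() blocks until the writer opens too
    open(my $in, "<", $fifo) || die "can't read $fifo: $!";
    my $line = <$in>;
    close($in) || die "can't close $fifo: $!";
    waitpid($pid, 0);
    die "writer failed: status=$?" if $?;
    chomp $line;
    return $line;
}

print fifo_round_trip("hello through the fifo"), "\n";
```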
412
Using open() for IPC
414 Perl's basic open() statement can also be used for unidirectional
415 interprocess communication by either appending or prepending a pipe
416 symbol to the second argument to open(). Here's how to start something
417 up in a child process you intend to write to:
418
    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print SPOOLER "stuff\n";
    # parenthesize the handle: "close SPOOLER || die" could never die
    close(SPOOLER) || die "bad spool: $! $?";
424
425 And here's how to start up a child process you intend to read from:
426
    open(STATUS, "netstat -an 2>&1 |")
        || die "can't fork: $!";
    while (<STATUS>) {
        next if /^(tcp|udp)/;
        print;
    }
    # parenthesize the handle: "close STATUS || die" could never die
    close(STATUS) || die "bad netstat: $! $?";
434
435 If one can be sure that a particular program is a Perl script expecting
436 filenames in @ARGV, the clever programmer can write something like
437 this:
438
439 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
440
441 and no matter which sort of shell it's called from, the Perl program
442 will read from the file f1, the process cmd1, standard input (tmpfile
443 in this case), the f2 file, the cmd2 command, and finally the f3 file.
444 Pretty nifty, eh?
445
446 You might notice that you could use backticks for much the same effect
447 as opening a pipe for reading:
448
449 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
450 die "bad netstatus ($?)" if $?;
451
452 While this is true on the surface, it's much more efficient to process
453 the file one line or record at a time because then you don't have to
454 read the whole thing into memory at once. It also gives you finer
455 control of the whole process, letting you kill off the child process
456 early if you'd like.
457
458 Be careful to check the return values from both open() and close(). If
459 you're writing to a pipe, you should also trap SIGPIPE. Otherwise,
460 think of what happens when you start up a pipe to a command that
461 doesn't exist: the open() will in all likelihood succeed (it only
462 reflects the fork()'s success), but then your output will
463 fail--spectacularly. Perl can't know whether the command worked,
464 because your command is actually running in a separate process whose
465 exec() might have failed. Therefore, while readers of bogus commands
466 return just a quick EOF, writers to bogus commands will get hit with a
467 signal, which they'd best be prepared to handle. Consider:
468
469 open(FH, "|bogus") || die "can't fork: $!";
470 print FH "bang\n"; # neither necessary nor sufficient
471 # to check print retval!
472 close(FH) || die "can't close: $!";
473
474 The reason for not checking the return value from print() is because of
475 pipe buffering; physical writes are delayed. That won't blow up until
476 the close, and it will blow up with a SIGPIPE. To catch it, you could
477 use this:
478
479 $SIG{PIPE} = "IGNORE";
480 open(FH, "|bogus") || die "can't fork: $!";
481 print FH "bang\n";
482 close(FH) || die "can't close: status=$?";
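
When close() on a pipe returns false, $! and $? tell different stories: if
the only problem was the child's exit status, $! is zero and the details are
in $?. The sketch below decodes the two cases. It uses the list form of pipe
open, so it needs Perl 5.8.0 or later, and the helper name
describe_pipe_close() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: run @cmd through a pipe open, drain its output,
# and describe how the child ended.
sub describe_pipe_close {
    my (@cmd) = @_;
    open(my $fh, "-|", @cmd) || die "can't fork: $!";
    while (defined(my $line = <$fh>)) { }  # read until EOF
    return "ok" if close($fh);
    return "close failed: $!" if $!;       # a system call went wrong
    return "killed by signal " . ($? & 127) if $? & 127;
    return "exit status " . ($? >> 8);
}

print describe_pipe_close($^X, "-e", "exit 3"), "\n";
```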
483
484 Filehandles
485 Both the main process and any child processes it forks share the same
486 STDIN, STDOUT, and STDERR filehandles. If both processes try to access
487 them at once, strange things can happen. You may also want to close or
488 reopen the filehandles for the child. You can get around this by
489 opening your pipe with open(), but on some systems this means that the
490 child process cannot outlive the parent.
491
492 Background Processes
493 You can run a command in the background with:
494
495 system("cmd &");
496
497 The command's STDOUT and STDERR (and possibly STDIN, depending on your
498 shell) will be the same as the parent's. You won't need to catch
499 SIGCHLD because of the double-fork taking place; see below for details.
500
501 Complete Dissociation of Child from Parent
502 In some cases (starting server processes, for instance) you'll want to
503 completely dissociate the child process from the parent. This is often
504 called daemonization. A well-behaved daemon will also chdir() to the
505 root directory so it doesn't prevent unmounting the filesystem
506 containing the directory from which it was launched, and redirect its
507 standard file descriptors from and to /dev/null so that random output
508 doesn't wind up on the user's terminal.
509
    use POSIX "setsid";

    sub daemonize {
        chdir("/")                  || die "can't chdir to /: $!";
        open(STDIN,  "< /dev/null") || die "can't read /dev/null: $!";
        open(STDOUT, "> /dev/null") || die "can't write to /dev/null: $!";
        defined(my $pid = fork())   || die "can't fork: $!";
        exit if $pid;               # non-zero now means I am the parent
        (setsid() != -1)            || die "Can't start a new session: $!";
        open(STDERR, ">&STDOUT")    || die "can't dup stdout: $!";
    }
521
522 The fork() has to come before the setsid() to ensure you aren't a
523 process group leader; the setsid() will fail if you are. If your
524 system doesn't have the setsid() function, open /dev/tty and use the
525 "TIOCNOTTY" ioctl() on it instead. See tty(4) for details.
526
527 Non-Unix users should check their "Your_OS::Process" module for other
528 possible solutions.
529
530 Safe Pipe Opens
531 Another interesting approach to IPC is making your single program go
532 multiprocess and communicate between--or even amongst--yourselves. The
533 open() function will accept a file argument of either "-|" or "|-" to
534 do a very interesting thing: it forks a child connected to the
535 filehandle you've opened. The child is running the same program as the
536 parent. This is useful for safely opening a file when running under an
537 assumed UID or GID, for example. If you open a pipe to minus, you can
538 write to the filehandle you opened and your kid will find it in his
539 STDIN. If you open a pipe from minus, you can read from the filehandle
540 you opened whatever your kid writes to his STDOUT.
541
542 use English qw[ -no_match_vars ];
543 my $PRECIOUS = "/path/to/some/safe/file";
544 my $sleep_count;
545 my $pid;
546
547 do {
548 $pid = open(KID_TO_WRITE, "|-");
549 unless (defined $pid) {
550 warn "cannot fork: $!";
551 die "bailing out" if $sleep_count++ > 6;
552 sleep 10;
553 }
554 } until defined $pid;
555
556 if ($pid) { # I am the parent
557 print KID_TO_WRITE @some_data;
558 close(KID_TO_WRITE) || warn "kid exited $?";
559 } else { # I am the child
560 # drop permissions in setuid and/or setgid programs:
561 ($EUID, $EGID) = ($UID, $GID);
562 open (OUTFILE, "> $PRECIOUS")
563 || die "can't open $PRECIOUS: $!";
564 while (<STDIN>) {
565 print OUTFILE; # child's STDIN is parent's KID_TO_WRITE
566 }
567 close(OUTFILE) || die "can't close $PRECIOUS: $!";
568 exit(0); # don't forget this!!
569 }
570
571 Another common use for this construct is when you need to execute
572 something without the shell's interference. With system(), it's
573 straightforward, but you can't use a pipe open or backticks safely.
574 That's because there's no way to stop the shell from getting its hands
575 on your arguments. Instead, use lower-level control to call exec()
576 directly.
577
578 Here's a safe backtick or pipe open for read:
579
580 my $pid = open(KID_TO_READ, "-|");
581 defined($pid) || die "can't fork: $!";
582
583 if ($pid) { # parent
584 while (<KID_TO_READ>) {
585 # do something interesting
586 }
587 close(KID_TO_READ) || warn "kid exited $?";
588
589 } else { # child
590 ($EUID, $EGID) = ($UID, $GID); # suid only
591 exec($program, @options, @args)
592 || die "can't exec program: $!";
593 # NOTREACHED
594 }
595
596 And here's a safe pipe open for writing:
597
598 my $pid = open(KID_TO_WRITE, "|-");
599 defined($pid) || die "can't fork: $!";
600
601 $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
602
603 if ($pid) { # parent
604 print KID_TO_WRITE @data;
605 close(KID_TO_WRITE) || warn "kid exited $?";
606
607 } else { # child
608 ($EUID, $EGID) = ($UID, $GID);
609 exec($program, @options, @args)
610 || die "can't exec program: $!";
611 # NOTREACHED
612 }
613
614 It is very easy to dead-lock a process using this form of open(), or
615 indeed with any use of pipe() with multiple subprocesses. The example
616 above is "safe" because it is simple and calls exec(). See "Avoiding
617 Pipe Deadlocks" for general safety principles, but there are extra
618 gotchas with Safe Pipe Opens.
619
620 In particular, if you opened the pipe using "open FH, "|-"", then you
621 cannot simply use close() in the parent process to close an unwanted
622 writer. Consider this code:
623
    my $pid = open(WRITER, "|-");        # fork open a kid
    defined($pid) || die "first fork failed: $!";
    if ($pid) {
        # check fork() before branching, or a failure goes unnoticed
        my $sub_pid = fork();
        defined($sub_pid) || die "second fork failed: $!";
        if ($sub_pid) {
            close(WRITER) || die "couldn't close WRITER: $!";
            # now do something else...
        }
        else {
            # first write to WRITER
            # ...
            # then when finished
            close(WRITER) || die "couldn't close WRITER: $!";
            exit(0);
        }
    }
    else {
        # first do something with STDIN, then
        exit(0);
    }
644
645 In the example above, the true parent does not want to write to the
646 WRITER filehandle, so it closes it. However, because WRITER was opened
647 using "open FH, "|-"", it has a special behavior: closing it calls
648 waitpid() (see "waitpid" in perlfunc), which waits for the subprocess
649 to exit. If the child process ends up waiting for something happening
650 in the section marked "do something else", you have deadlock.
651
652 This can also be a problem with intermediate subprocesses in more
653 complicated code, which will call waitpid() on all open filehandles
654 during global destruction--in no predictable order.
655
656 To solve this, you must manually use pipe(), fork(), and the form of
657 open() which sets one file descriptor to another, as shown below:
658
    pipe(READER, WRITER) || die "pipe failed: $!";
    $pid = fork();
    defined($pid) || die "first fork failed: $!";
    if ($pid) {
        close READER;
        # check fork() before branching, or a failure goes unnoticed
        my $sub_pid = fork();
        defined($sub_pid) || die "second fork failed: $!";
        if ($sub_pid) {
            close(WRITER) || die "can't close WRITER: $!";
            # now do something else...
        }
        else {
            # write to WRITER...
            # ...
            # then when finished
            close(WRITER) || die "can't close WRITER: $!";
            exit(0);
        }
    }
    else {
        open(STDIN, "<&READER") || die "can't reopen STDIN: $!";
        close(WRITER) || die "can't close WRITER: $!";
        # do something...
        exit(0);
    }
683
684 Since Perl 5.8.0, you can also use the list form of "open" for pipes.
685 This is preferred when you wish to avoid having the shell interpret
686 metacharacters that may be in your command string.
687
688 So for example, instead of using:
689
690 open(PS_PIPE, "ps aux|") || die "can't open ps pipe: $!";
691
692 One would use either of these:
693
694 open(PS_PIPE, "-|", "ps", "aux")
695 || die "can't open ps pipe: $!";
696
697 @ps_args = qw[ ps aux ];
698 open(PS_PIPE, "-|", @ps_args)
699 || die "can't open @ps_args|: $!";
700
Because there are more than three arguments to open(), it forks the ps(1)
702 command without spawning a shell, and reads its standard output via the
703 "PS_PIPE" filehandle. The corresponding syntax to write to command
704 pipes is to use "|-" in place of "-|".
705
706 This was admittedly a rather silly example, because you're using string
707 literals whose content is perfectly safe. There is therefore no cause
708 to resort to the harder-to-read, multi-argument form of pipe open().
709 However, whenever you cannot be assured that the program arguments are
710 free of shell metacharacters, the fancier form of open() should be
711 used. For example:
712
713 @grep_args = ("egrep", "-i", $some_pattern, @many_files);
714 open(GREP_PIPE, "-|", @grep_args)
715 || die "can't open @grep_args|: $!";
716
717 Here the multi-argument form of pipe open() is preferred because the
718 pattern and indeed even the filenames themselves might hold
719 metacharacters.
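
To convince yourself that no shell gets involved, the following sketch hands
a thoroughly nasty argument to a child perl and reads it back unchanged. The
helper name echo_via_child() and the one-line child program are inventions
for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: have a child perl echo its first argument back,
# using the list form of pipe open so that no shell ever sees $arg.
sub echo_via_child {
    my ($arg) = @_;
    open(my $kid, "-|", $^X, "-e", 'print $ARGV[0]', $arg)
        || die "can't fork: $!";
    my $got = do { local $/; <$kid> };     # slurp everything the kid wrote
    close($kid) || die "child failed: status=$?";
    return $got;
}

my $nasty = q{*.c; echo $HOME `id`};
print echo_via_child($nasty) eq $nasty ? "unchanged\n" : "mangled\n";
```

Had the same string been pasted into a one-argument pipe open, the shell
would have expanded the glob, the variable, and the backticks.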
720
721 Be aware that these operations are full Unix forks, which means they
722 may not be correctly implemented on all alien systems. Additionally,
723 these are not true multithreading. To learn more about threading, see
724 the modules file mentioned below in the SEE ALSO section.
725
726 Avoiding Pipe Deadlocks
727 Whenever you have more than one subprocess, you must be careful that
728 each closes whichever half of any pipes created for interprocess
729 communication it is not using. This is because any child process
730 reading from the pipe and expecting an EOF will never receive it, and
731 therefore never exit. A single process closing a pipe is not enough to
732 close it; the last process with the pipe open must close it for it to
733 read EOF.
734
735 Certain built-in Unix features help prevent this most of the time. For
736 instance, filehandles have a "close on exec" flag, which is set en
737 masse under control of the $^F variable. This is so any filehandles
738 you didn't explicitly route to the STDIN, STDOUT or STDERR of a child
739 program will be automatically closed.
740
741 Always explicitly and immediately call close() on the writable end of
742 any pipe, unless that process is actually writing to it. Even if you
743 don't explicitly call close(), Perl will still close() all filehandles
744 during global destruction. As previously discussed, if those
745 filehandles have been opened with Safe Pipe Open, this will result in
746 calling waitpid(), which may again deadlock.
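
The close-what-you-don't-use rule can be sketched with a plain pipe() and
fork(). Here the child counts lines until EOF and reports the count through
its exit status, which limits this toy to 255 lines; the helper name
count_lines_in_child() is hypothetical.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical demo: the parent writes @lines to a child over a pipe;
# the child counts lines until EOF and exits with that count.
sub count_lines_in_child {
    my (@lines) = @_;
    pipe(my $reader, my $writer) || die "pipe failed: $!";

    defined(my $pid = fork()) || die "cannot fork: $!";
    if ($pid == 0) {                       # child: reads only
        close($writer);                    # without this, EOF never arrives
        my $count = 0;
        $count++ while <$reader>;
        POSIX::_exit($count);
    }

    close($reader);                        # parent: writes only
    print {$writer} @lines;
    close($writer) || die "can't close writer: $!";   # child now sees EOF
    waitpid($pid, 0);
    return $? >> 8;
}

print count_lines_in_child("one\n", "two\n", "three\n"), " lines seen\n";
```

If the child kept its copy of the writing end open, its own read loop would
wait forever for an EOF that only it could cause.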
747
748 Bidirectional Communication with Another Process
749 While this works reasonably well for unidirectional communication, what
750 about bidirectional communication? The most obvious approach doesn't
751 work:
752
753 # THIS DOES NOT WORK!!
754 open(PROG_FOR_READING_AND_WRITING, "| some program |")
755
756 If you forget to "use warnings", you'll miss out entirely on the
757 helpful diagnostic message:
758
759 Can't do bidirectional pipe at -e line 1.
760
761 If you really want to, you can use the standard open2() from the
762 "IPC::Open2" module to catch both ends. There's also an open3() in
763 "IPC::Open3" for tridirectional I/O so you can also catch your child's
764 STDERR, but doing so would then require an awkward select() loop and
765 wouldn't allow you to use normal Perl input operations.
766
If you look at its source, you'll see that open2() uses low-level
primitives like the pipe() and exec() syscalls to create all the
connections. Using socketpair() might have been more efficient, but it
would have made open2() even less portable than it already is. The
open2() and open3() functions are unlikely to work anywhere except on a
Unix system, or at least one purporting POSIX compliance.
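
To make the open3() case concrete, here is a minimal sketch that
captures a child's STDOUT and STDERR separately. The child is simply
another copy of the running perl ($^X), an assumption made so that the
example does not depend on any external command. Reading one handle to
EOF and then the other is only safe here because the output is tiny;
anything larger needs the select() loop mentioned above.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol qw(gensym);

my @cmd = ($^X, '-e',
           'print STDOUT "to stdout\n"; print STDERR "to stderr\n"');

# open3() gives STDERR its own handle only if that argument is already
# a glob; a false value would send STDERR to the STDOUT handle instead.
my $child_err = gensym;
my $pid = open3(my $child_in, my $child_out, $child_err, @cmd);

close($child_in);                # nothing to send to the child

my @out = <$child_out>;
my @err = <$child_err>;

waitpid($pid, 0);                # open3() does not reap the child
my $status = $? >> 8;

print "stdout: @out";
print "stderr: @err";
print "exit status: $status\n";
```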
773
774 Here's an example of using open2():
775
776 use FileHandle;
777 use IPC::Open2;
778 $pid = open2(*Reader, *Writer, "cat -un");
779 print Writer "stuff\n";
780 $got = <Reader>;
781
The problem with this is that buffering is really going to ruin your
day. Even though your "Writer" filehandle is auto-flushed so the
process on the other end gets your data in a timely manner, you can't
usually do anything to force that process to give its data to you in a
similarly quick fashion. In this special case, we could actually do so,
because we gave cat a -u flag to make it unbuffered. But very few
commands are designed to operate over pipes, so this seldom works
unless you yourself wrote the program on the other end of the
double-ended pipe.
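
When the exchange does not need to be interactive, one way to sidestep
the buffering problem is to send all the input, close the writing
handle, and only then read until EOF. The sketch below does that with a
child perl acting as a sorting filter; the one-liner and the handle
names are assumptions of this example. It suits small inputs only: with
a large input and a filter that writes as it reads, both sides can
still block on full pipes.

```perl
use strict;
use warnings;
use IPC::Open2;

# The child reads all of STDIN, sorts it, and prints the result.
my $pid = open2(my $from_child, my $to_child,
                $^X, '-e', 'print sort <STDIN>');

print {$to_child} "$_\n" for qw(pear apple fig);
close($to_child);                # EOF tells the filter to finish

chomp(my @sorted = <$from_child>);
close($from_child);
waitpid($pid, 0);                # open2() does not reap the child

print "@sorted\n";
```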
791
792 A solution to this is to use a library which uses pseudottys to make
793 your program behave more reasonably. This way you don't have to have
794 control over the source code of the program you're using. The "Expect"
795 module from CPAN also addresses this kind of thing. This module
796 requires two other modules from CPAN, "IO::Pty" and "IO::Stty". It
797 sets up a pseudo terminal to interact with programs that insist on
798 talking to the terminal device driver. If your system is supported,
799 this may be your best bet.
800
801 Bidirectional Communication with Yourself
802 If you want, you may make low-level pipe() and fork() syscalls to
803 stitch this together by hand. This example only talks to itself, but
804 you could reopen the appropriate handles to STDIN and STDOUT and call
805 other processes. (The following example lacks proper error checking.)
806
807 #!/usr/bin/perl -w
808 # pipe1 - bidirectional communication using two pipe pairs
809 # designed for the socketpair-challenged
810 use IO::Handle; # thousands of lines just for autoflush :-(
811 pipe(PARENT_RDR, CHILD_WTR); # XXX: check failure?
812 pipe(CHILD_RDR, PARENT_WTR); # XXX: check failure?
813 CHILD_WTR->autoflush(1);
814 PARENT_WTR->autoflush(1);
815
816 if ($pid = fork()) {
817 close PARENT_RDR;
818 close PARENT_WTR;
819 print CHILD_WTR "Parent Pid $$ is sending this\n";
820 chomp($line = <CHILD_RDR>);
821 print "Parent Pid $$ just read this: '$line'\n";
822 close CHILD_RDR; close CHILD_WTR;
823 waitpid($pid, 0);
824 } else {
825 die "cannot fork: $!" unless defined $pid;
826 close CHILD_RDR;
827 close CHILD_WTR;
828 chomp($line = <PARENT_RDR>);
829 print "Child Pid $$ just read this: '$line'\n";
830 print PARENT_WTR "Child Pid $$ is sending this\n";
831 close PARENT_RDR;
832 close PARENT_WTR;
833 exit(0);
834 }
835
836 But you don't actually have to make two pipe calls. If you have the
837 socketpair() system call, it will do this all for you.
838
839 #!/usr/bin/perl -w
840 # pipe2 - bidirectional communication using socketpair
841 # "the best ones always go both ways"
842
843 use Socket;
844 use IO::Handle; # thousands of lines just for autoflush :-(
845
846 # We say AF_UNIX because although *_LOCAL is the
847 # POSIX 1003.1g form of the constant, many machines
848 # still don't have it.
849 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
850 || die "socketpair: $!";
851
852 CHILD->autoflush(1);
853 PARENT->autoflush(1);
854
855 if ($pid = fork()) {
856 close PARENT;
857 print CHILD "Parent Pid $$ is sending this\n";
858 chomp($line = <CHILD>);
859 print "Parent Pid $$ just read this: '$line'\n";
860 close CHILD;
861 waitpid($pid, 0);
862 } else {
863 die "cannot fork: $!" unless defined $pid;
864 close CHILD;
865 chomp($line = <PARENT>);
866 print "Child Pid $$ just read this: '$line'\n";
867 print PARENT "Child Pid $$ is sending this\n";
868 close PARENT;
869 exit(0);
870 }
871
Sockets: Client/Server Communication
873 While not entirely limited to Unix-derived operating systems (e.g.,
874 WinSock on PCs provides socket support, as do some VMS libraries), you
875 might not have sockets on your system, in which case this section
876 probably isn't going to do you much good. With sockets, you can do
877 both virtual circuits like TCP streams and datagrams like UDP packets.
878 You may be able to do even more depending on your system.
879
880 The Perl functions for dealing with sockets have the same names as the
881 corresponding system calls in C, but their arguments tend to differ for
882 two reasons. First, Perl filehandles work differently than C file
883 descriptors. Second, Perl already knows the length of its strings, so
884 you don't need to pass that information.
885
886 One of the major problems with ancient, antemillennial socket code in
887 Perl was that it used hard-coded values for some of the constants,
888 which severely hurt portability. If you ever see code that does
889 anything like explicitly setting "$AF_INET = 2", you know you're in for
890 big trouble. An immeasurably superior approach is to use the "Socket"
891 module, which more reliably grants access to the various constants and
892 functions you'll need.
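
As a small sketch of what the "Socket" module provides, the following
prints two of the constants as defined on the local system and
round-trips an address through the packing helpers. The address and
port are arbitrary values chosen for illustration.

```perl
use strict;
use warnings;
use Socket qw(AF_INET SOCK_STREAM inet_aton inet_ntoa
              pack_sockaddr_in unpack_sockaddr_in);

# The values come from the system perl was built on, so they are
# right for this platform whatever they happen to be.
printf "AF_INET=%d SOCK_STREAM=%d\n", AF_INET, SOCK_STREAM;

# Build a packed address for 127.0.0.1 port 2345, then unpack it.
my $packed_ip = inet_aton("127.0.0.1");
my $sockaddr  = pack_sockaddr_in(2345, $packed_ip);

my ($port, $ip) = unpack_sockaddr_in($sockaddr);
my $dotted = inet_ntoa($ip);
print "round trip: $dotted:$port\n";
```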
893
894 If you're not writing a server/client for an existing protocol like
895 NNTP or SMTP, you should give some thought to how your server will know
896 when the client has finished talking, and vice-versa. Most protocols
897 are based on one-line messages and responses (so one party knows the
898 other has finished when a "\n" is received) or multi-line messages and
899 responses that end with a period on an empty line ("\n.\n" terminates a
900 message/response).
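
The second convention can be sketched as a small reader that collects
lines until it sees one holding only a period. The function name
read_dot_message() is hypothetical, an in-memory filehandle stands in
for a socket, and the dot-stuffing that protocols such as SMTP add on
top of this scheme is left out.

```perl
use strict;
use warnings;

# Read one multi-line message that ends with a line holding only ".".
# Returns the body lines with terminators removed, or an empty list
# if the stream ended before the terminating line arrived.
sub read_dot_message {
    my ($fh) = @_;
    my @body;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;      # accept CRLF or a bare LF
        return @body if $line eq ".";
        push @body, $line;
    }
    return;                            # EOF before the terminator
}

my $wire = "first line\015\012second line\015\012.\015\012leftover\015\012";
open(my $fh, "<", \$wire) or die "open: $!";

my @message = read_dot_message($fh);
print "got ", scalar(@message), " lines: @message\n";
```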
901
902 Internet Line Terminators
903 The Internet line terminator is "\015\012". Under ASCII variants of
904 Unix, that could usually be written as "\r\n", but under other systems,
905 "\r\n" might at times be "\015\015\012", "\012\012\015", or something
906 completely different. The standards specify writing "\015\012" to be
907 conformant (be strict in what you provide), but they also recommend
908 accepting a lone "\012" on input (be lenient in what you require). We
909 haven't always been very good about that in the code in this manpage,
910 but unless you're on a Mac from way back in its pre-Unix dark ages,
911 you'll probably be ok.
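
One way to follow both halves of that advice is to centralize line
handling in two small helpers, sketched below. The names wire_line()
and strip_eol() are made up for this example.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Strict on output: always terminate with CR LF.
sub wire_line {
    my ($text) = @_;
    return $text . $CRLF;
}

# Lenient on input: strip CR LF or a lone LF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

for my $raw (wire_line("HELO example"), "HELO example\012") {
    printf "%d bytes in, payload '%s'\n", length($raw), strip_eol($raw);
}
```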
912
913 Internet TCP Clients and Servers
914 Use Internet-domain sockets when you want to do client-server
915 communication that might extend to machines outside of your own system.
916
917 Here's a sample TCP client using Internet-domain sockets:
918
919 #!/usr/bin/perl -w
920 use strict;
921 use Socket;
922 my ($remote, $port, $iaddr, $paddr, $proto, $line);
923
924 $remote = shift || "localhost";
925 $port = shift || 2345; # random port
926 if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
927 die "No port" unless $port;
928 $iaddr = inet_aton($remote) || die "no host: $remote";
929 $paddr = sockaddr_in($port, $iaddr);
930
931 $proto = getprotobyname("tcp");
932 socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
933 connect(SOCK, $paddr) || die "connect: $!";
934 while ($line = <SOCK>) {
935 print $line;
936 }
937
938 close (SOCK) || die "close: $!";
939 exit(0);
940
And here's a corresponding server to go along with it. We'll leave the
address as "INADDR_ANY" so that the kernel can choose the appropriate
interface on multihomed hosts. If you want to sit on a particular
interface (like the external side of a gateway or firewall machine),
fill this in with your real address instead.
946
    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }

    my $port = shift || 2345;
    die "invalid port" unless $port =~ /^ \d+ $/x;

    my $proto = getprotobyname("tcp");

    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
        || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";

    logmsg "server started on port $port";

    my $paddr;

    # This server never forks, so it needs no SIGCHLD handler.
    for ( ; $paddr = accept(Client, Server); close Client) {
        my($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);

        logmsg "connection from $name [",
            inet_ntoa($iaddr), "] at port $port";

        print Client "Hello there, $name, it's now ",
            scalar localtime(), $EOL;
    }

984
And here's a multitasking version. It's multitasked in that, like most
typical servers, it spawns (fork()s) a child server to handle the
client request so that the parent server can quickly go back to service
a new client.
989
    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }

    my $port = shift || 2345;
    die "invalid port" unless $port =~ /^ \d+ $/x;

    my $proto = getprotobyname("tcp");

    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
        || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";

    logmsg "server started on port $port";

    my $paddr;

    use POSIX ":sys_wait_h";
    use Errno;

    sub REAPER {
        local $!;   # don't let waitpid() overwrite current error
        while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
            logmsg "reaped $pid" . ($? ? " with exit $?" : "");
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }

    $SIG{CHLD} = \&REAPER;

    while (1) {
        $paddr = accept(Client, Server) || do {
            # try again if accept() returned because got a signal
            next if $!{EINTR};
            die "accept: $!";
        };
        my ($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);

        logmsg "connection from $name [",
            inet_ntoa($iaddr),
            "] at port $port";

        spawn sub {
            $| = 1;
            print "Hello there, $name, it's now ", scalar localtime(), $EOL;
            exec "/usr/games/fortune"   # XXX: "wrong" line terminators
                or confess "can't exec fortune: $!";
        };
        close Client;
    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        unless (defined($pid = fork())) {
            logmsg "cannot fork: $!";
            return;
        }
        elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN, "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit($coderef->());
    }
1074
This server takes the trouble to clone off a child version via fork()
for each incoming request. That way it can handle many requests at
once, which you might not always want. Even if you don't fork(), the
listen() will allow up to SOMAXCONN pending connections to queue.
Forking servers have to be particularly careful about cleaning up their
dead children (called "zombies" in Unix parlance), because otherwise
you'll quickly fill up your process table. The REAPER subroutine is
used here to call waitpid() for any child processes that have finished,
thereby ensuring that they terminate cleanly and don't join the ranks
of the living dead.
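
The reaping step can be seen in isolation in the sketch below, which
forks three short-lived children and collects each exit status. Unlike
the REAPER handler, it uses a blocking waitpid() in the main flow, an
assumption made here only to keep the example deterministic.

```perl
use strict;
use warnings;

# Start three children that exit at once with distinct statuses.
my %kids;
for my $n (1 .. 3) {
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    exit($n) if $pid == 0;            # child
    $kids{$pid} = $n;                 # parent remembers what to expect
}

# Until they are waited for, finished children linger as zombies.
# A blocking waitpid(-1, 0) returns -1 once none are left.
my %status;
while ((my $pid = waitpid(-1, 0)) > 0) {
    $status{$pid} = $? >> 8;
}

printf "reaped %d children\n", scalar keys %status;
```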
1085
1086 Within the while loop we call accept() and check to see if it returns a
1087 false value. This would normally indicate a system error needs to be
1088 reported. However, the introduction of safe signals (see "Deferred
1089 Signals (Safe Signals)" above) in Perl 5.7.3 means that accept() might
1090 also be interrupted when the process receives a signal. This typically
1091 happens when one of the forked subprocesses exits and notifies the
1092 parent process with a CHLD signal.
1093
1094 If accept() is interrupted by a signal, $! will be set to EINTR. If
1095 this happens, we can safely continue to the next iteration of the loop
1096 and another call to accept(). It is important that your signal
1097 handling code not modify the value of $!, or else this test will likely
1098 fail. In the REAPER subroutine we create a local version of $! before
1099 calling waitpid(). When waitpid() sets $! to ECHILD as it inevitably
1100 does when it has no more children waiting, it updates the local copy
1101 and leaves the original unchanged.
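
The effect of localizing $! can be checked without any sockets at all.
In the sketch below, reaper_like() is a made-up stand-in for REAPER,
and $! is set by hand to imitate an interrupted accept(). The script is
assumed to have no child processes of its own, so waitpid() fails with
ECHILD.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);
use Errno qw(EINTR);

sub reaper_like {
    local $!;                         # changes below vanish on return
    waitpid(-1, WNOHANG);             # no children: sets $! to ECHILD
    return $!{ECHILD} ? 1 : 0;
}

$! = EINTR;                           # pretend accept() was interrupted
my $saw_echild_inside = reaper_like();
my $still_eintr       = $!{EINTR} ? 1 : 0;

print "inside handler: ECHILD=$saw_echild_inside\n";
print "after handler:  EINTR=$still_eintr\n";
```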
1102
You should use the -T flag to enable taint checking (see perlsec) even
if you aren't running setuid or setgid. This is always a good idea for
servers or any program run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside
will be able to compromise your system.
1108
1109 Let's look at another TCP client. This one connects to the TCP "time"
1110 service on a number of different machines and shows how far their
1111 clocks differ from the system on which it's being run:
1112
1113 #!/usr/bin/perl -w
1114 use strict;
1115 use Socket;
1116
1117 my $SECS_OF_70_YEARS = 2208988800;
1118 sub ctime { scalar localtime(shift() || time()) }
1119
1120 my $iaddr = gethostbyname("localhost");
1121 my $proto = getprotobyname("tcp");
1122 my $port = getservbyname("time", "tcp");
1123 my $paddr = sockaddr_in(0, $iaddr);
1124 my($host);
1125
1126 $| = 1;
1127 printf "%-24s %8s %s\n", "localhost", 0, ctime();
1128
1129 foreach $host (@ARGV) {
1130 printf "%-24s ", $host;
1131 my $hisiaddr = inet_aton($host) || die "unknown host";
1132 my $hispaddr = sockaddr_in($port, $hisiaddr);
1133 socket(SOCKET, PF_INET, SOCK_STREAM, $proto)
1134 || die "socket: $!";
1135 connect(SOCKET, $hispaddr) || die "connect: $!";
1136 my $rtime = pack("C4", ());
1137 read(SOCKET, $rtime, 4);
1138 close(SOCKET);
1139 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1140 printf "%8d %s\n", $histime - time(), ctime($histime);
1141 }
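
The arithmetic in that client is easier to follow on its own. The
"time" service (RFC 868) sends a 32-bit big-endian count of seconds
since 1900, so subtracting the 70-year offset yields Unix epoch
seconds. In the sketch below the reply is fabricated locally rather
than read from a server, and time_reply_to_epoch() is a name invented
for this example.

```perl
use strict;
use warnings;

my $SECS_OF_70_YEARS = 2_208_988_800;   # 1900-01-01 to 1970-01-01

# Convert the four bytes sent by a "time" server into epoch seconds.
sub time_reply_to_epoch {
    my ($reply) = @_;
    die "short reply\n" unless length($reply) == 4;
    return unpack("N", $reply) - $SECS_OF_70_YEARS;
}

# A made-up reply standing for exactly one day after the Unix epoch.
my $reply = pack("N", $SECS_OF_70_YEARS + 86_400);
my $epoch = time_reply_to_epoch($reply);

print "epoch seconds: $epoch\n";
print "as UTC: ", scalar gmtime($epoch), "\n";
```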
1142
1143 Unix-Domain TCP Clients and Servers
1144 That's fine for Internet-domain clients and servers, but what about
1145 local communications? While you can use the same setup, sometimes you
1146 don't want to. Unix-domain sockets are local to the current host, and
1147 are often used internally to implement pipes. Unlike Internet domain
1148 sockets, Unix domain sockets can show up in the file system with an
1149 ls(1) listing.
1150
1151 % ls -l /dev/log
1152 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
1153
1154 You can test for these with Perl's -S file test:
1155
1156 unless (-S "/dev/log") {
1157 die "something's wicked with the log system";
1158 }
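
If /dev/log does not exist on your system, the same test can be tried
on a socket you create yourself. This sketch binds a Unix-domain socket
to a name in a scratch directory; the file name demo.sock is arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Socket qw(PF_UNIX SOCK_STREAM sockaddr_un);

# A private scratch directory keeps the rendezvous name out of the way.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.sock";

socket(my $sock, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
bind($sock, sockaddr_un($path))           || die "bind: $!";  # creates the name

my $is_socket  = -S $path ? 1 : 0;
my $plain_file = -f $path ? 1 : 0;
print "$path: socket=$is_socket plain file=$plain_file\n";

close($sock);
unlink($path);                # the name outlives the socket otherwise
```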
1159
1160 Here's a sample Unix-domain client:
1161
1162 #!/usr/bin/perl -w
1163 use Socket;
1164 use strict;
1165 my ($rendezvous, $line);
1166
1167 $rendezvous = shift || "catsock";
1168 socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1169 connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
1170 while (defined($line = <SOCK>)) {
1171 print $line;
1172 }
1173 exit(0);
1174
1175 And here's a corresponding server. You don't have to worry about silly
1176 network terminators here because Unix domain sockets are guaranteed to
1177 be on the localhost, and thus everything works right.
1178
1179 #!/usr/bin/perl -Tw
1180 use strict;
1181 use Socket;
1182 use Carp;
1183
1184 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1185 sub spawn; # forward declaration
1186 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1187
1188 my $NAME = "catsock";
1189 my $uaddr = sockaddr_un($NAME);
1190 my $proto = getprotobyname("tcp");
1191
1192 socket(Server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1193 unlink($NAME);
1194 bind (Server, $uaddr) || die "bind: $!";
1195 listen(Server, SOMAXCONN) || die "listen: $!";
1196
1197 logmsg "server started on $NAME";
1198
1199 my $waitedpid;
1200
1201 use POSIX ":sys_wait_h";
1202 sub REAPER {
1203 my $child;
1204 while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
1205 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1206 }
1207 $SIG{CHLD} = \&REAPER; # loathe SysV
1208 }
1209
1210 $SIG{CHLD} = \&REAPER;
1211
1212
1213 for ( $waitedpid = 0;
1214 accept(Client, Server) || $waitedpid;
1215 $waitedpid = 0, close Client)
1216 {
1217 next if $waitedpid;
1218 logmsg "connection on $NAME";
1219 spawn sub {
1220 print "Hello there, it's now ", scalar localtime(), "\n";
1221 exec("/usr/games/fortune") || die "can't exec fortune: $!";
1222 };
1223 }
1224
1225 sub spawn {
1226 my $coderef = shift();
1227
1228 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1229 confess "usage: spawn CODEREF";
1230 }
1231
1232 my $pid;
1233 unless (defined($pid = fork())) {
1234 logmsg "cannot fork: $!";
1235 return;
1236 }
1237 elsif ($pid) {
1238 logmsg "begat $pid";
1239 return; # I'm the parent
1240 }
1241 else {
1242 # I'm the child -- go spawn
1243 }
1244
1245 open(STDIN, "<&Client") || die "can't dup client to stdin";
1246 open(STDOUT, ">&Client") || die "can't dup client to stdout";
1247 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1248 exit($coderef->());
1249 }
1250
As you see, it's remarkably similar to the Internet domain TCP server,
so much so, in fact, that the spawn(), logmsg(), and REAPER() functions
are nearly the same as in the other server.
1255
1256 So why would you ever want to use a Unix domain socket instead of a
1257 simpler named pipe? Because a named pipe doesn't give you sessions.
1258 You can't tell one process's data from another's. With socket
1259 programming, you get a separate session for each client; that's why
1260 accept() takes two arguments.
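
The per-client session can be demonstrated inside a single process, as
in the sketch below. Two clients connect to one listening socket, and
each accept() produces a separate handle whose replies reach only the
client on the other end. The client names and the socket path are
invented for the example, and it relies on connections and small writes
being queued until the server gets around to them.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Handle;
use Socket qw(PF_UNIX SOCK_STREAM SOMAXCONN sockaddr_un);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/sessions.sock";

socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
bind($server, sockaddr_un($path))           || die "bind: $!";
listen($server, SOMAXCONN)                  || die "listen: $!";

# Each client connects and announces its own name.
my %client;
for my $name (qw(alice bob)) {
    socket(my $c, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    connect($c, sockaddr_un($path))        || die "connect: $!";
    $c->autoflush(1);
    print {$c} "$name\n";
    $client{$name} = $c;
}

# accept() takes the new session handle and the listening handle.
for (1 .. 2) {
    accept(my $session, $server) || die "accept: $!";
    $session->autoflush(1);
    chomp(my $who = <$session>);
    print {$session} "hello, $who\n";
    close($session);
}

my %reply;
for my $name (sort keys %client) {
    chomp($reply{$name} = readline($client{$name}));
    print "$name received: $reply{$name}\n";
}
```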
1261
1262 For example, let's say that you have a long-running database server
1263 daemon that you want folks to be able to access from the Web, but only
1264 if they go through a CGI interface. You'd have a small, simple CGI
1265 program that does whatever checks and logging you feel like, and then
1266 acts as a Unix-domain client and connects to your private server.
1267
TCP Clients with IO::Socket
1269 For those preferring a higher-level interface to socket programming,
1270 the IO::Socket module provides an object-oriented approach. IO::Socket
1271 has been included in the standard Perl distribution ever since Perl
1272 5.004. If you're running an earlier version of Perl (in which case,
1273 how are you reading this manpage?), just fetch IO::Socket from CPAN,
1274 where you'll also find modules providing easy interfaces to the
1275 following systems: DNS, FTP, Ident (RFC 931), NIS and NISPlus, NNTP,
1276 Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--to name just a few.
1277
1278 A Simple Client
1279 Here's a client that creates a TCP connection to the "daytime" service
1280 at port 13 of the host name "localhost" and prints out everything that
1281 the server there cares to provide.
1282
1283 #!/usr/bin/perl -w
1284 use IO::Socket;
1285 $remote = IO::Socket::INET->new(
1286 Proto => "tcp",
1287 PeerAddr => "localhost",
1288 PeerPort => "daytime(13)",
1289 )
1290 || die "can't connect to daytime service on localhost";
1291 while (<$remote>) { print }
1292
1293 When you run this program, you should get something back that looks
1294 like this:
1295
1296 Wed May 14 08:40:46 MDT 1997
1297
1298 Here are what those parameters to the new() constructor mean:
1299
1300 "Proto"
This is which protocol to use. In this case, the socket handle
returned will be connected to a TCP socket, because we want a
stream-oriented connection, that is, one that acts pretty much like
a plain old file. Not all sockets are of this type. For example,
the UDP protocol can be used to make a datagram socket, used for
message-passing.
1307
1308 "PeerAddr"
1309 This is the name or Internet address of the remote host the server
1310 is running on. We could have specified a longer name like
1311 "www.perl.com", or an address like "207.171.7.72". For
1312 demonstration purposes, we've used the special hostname
1313 "localhost", which should always mean the current machine you're
1314 running on. The corresponding Internet address for localhost is
1315 "127.0.0.1", if you'd rather use that.
1316
1317 "PeerPort"
This is the service name or port number we'd like to connect to.
We could have gotten away with using just "daytime" on systems with
a well-configured system services file (found in /etc/services
under Unixy systems), but here we've specified the port number (13)
in parentheses. Using just the number would have also worked, but
numeric literals make careful programmers nervous.
1325
1326 Notice how the return value from the "new" constructor is used as a
1327 filehandle in the "while" loop? That's what's called an indirect
1328 filehandle, a scalar variable containing a filehandle. You can use it
1329 the same way you would a normal filehandle. For example, you can read
1330 one line from it this way:
1331
1332 $line = <$handle>;
1333
all remaining lines from it this way:
1335
1336 @lines = <$handle>;
1337
1338 and send a line of data to it this way:
1339
1340 print $handle "some data\n";
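
To try these three operations without a server, a connected pair of
sockets in one process is enough. The sketch below uses the
socketpair() class method of IO::Socket for that; the variable names
$near and $far are arbitrary.

```perl
use strict;
use warnings;
use IO::Socket;

# Two connected handles in one process stand in for client and server.
my ($near, $far) = IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

print $far "alpha\n";        # send a line of data
print $far "beta\n";
print $far "gamma\n";
close($far);                 # so the list read below can reach EOF

my $line  = <$near>;         # one line
my @lines = <$near>;         # all remaining lines
close($near);

print "first: $line";
print "rest:  @lines";
```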
1341
1342 A Webget Client
1343 Here's a simple client that takes a remote host to fetch a document
1344 from, and then a list of files to get from that host. This is a more
1345 interesting client than the previous one because it first sends
1346 something to the server before fetching the server's response.
1347
1348 #!/usr/bin/perl -w
1349 use IO::Socket;
1350 unless (@ARGV > 1) { die "usage: $0 host url ..." }
1351 $host = shift(@ARGV);
1352 $EOL = "\015\012";
1353 $BLANK = $EOL x 2;
1354 for my $document (@ARGV) {
1355 $remote = IO::Socket::INET->new( Proto => "tcp",
1356 PeerAddr => $host,
1357 PeerPort => "http(80)",
1358 ) || die "cannot connect to httpd on $host";
1359 $remote->autoflush(1);
1360 print $remote "GET $document HTTP/1.0" . $BLANK;
1361 while ( <$remote> ) { print }
1362 close $remote;
1363 }
1364
1365 The web server handling the HTTP service is assumed to be at its
1366 standard port, number 80. If the server you're trying to connect to is
1367 at a different port, like 1080 or 8080, you should specify it as the
1368 named-parameter pair, "PeerPort => 8080". The "autoflush" method is
1369 used on the socket because otherwise the system would buffer up the
1370 output we sent it. (If you're on a prehistoric Mac, you'll also need
1371 to change every "\n" in your code that sends data over the network to
1372 be a "\015\012" instead.)
1373
1374 Connecting to the server is only the first part of the process: once
1375 you have the connection, you have to use the server's language. Each
1376 server on the network has its own little command language that it
1377 expects as input. The string that we send to the server starting with
1378 "GET" is in HTTP syntax. In this case, we simply request each
1379 specified document. Yes, we really are making a new connection for
1380 each document, even though it's the same host. That's the way you
1381 always used to have to speak HTTP. Recent versions of web browsers may
1382 request that the remote server leave the connection open a little
1383 while, but the server doesn't have to honor such a request.
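
The request text itself can be built and inspected without touching
the network. The helper below, http10_get(), is a name made up for this
sketch. The optional Host header is an addition not used by webget:
HTTP/1.0 does not require it, but servers hosting several sites on one
address generally rely on it.

```perl
use strict;
use warnings;

my $EOL = "\015\012";

# Build the text of an HTTP/1.0 GET request, with an optional Host header.
sub http10_get {
    my ($path, $host) = @_;
    my @lines = ("GET $path HTTP/1.0");
    push @lines, "Host: $host" if defined $host;
    return join($EOL, @lines) . $EOL . $EOL;   # blank line ends the headers
}

my $request = http10_get("/index.html", "www.example.com");

# Make the line terminators visible when printing.
(my $visible = $request) =~ s/\015\012/<CRLF>\n/g;
print $visible;
```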
1384
1385 Here's an example of running that program, which we'll call webget:
1386
1387 % webget www.perl.com /guanaco.html
1388 HTTP/1.1 404 File Not Found
1389 Date: Thu, 08 May 1997 18:02:32 GMT
1390 Server: Apache/1.2b6
1391 Connection: close
1392 Content-type: text/html
1393
1394 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1395 <BODY><H1>File Not Found</H1>
1396 The requested URL /guanaco.html was not found on this server.<P>
1397 </BODY>
1398
1399 Ok, so that's not very interesting, because it didn't find that
1400 particular document. But a long response wouldn't have fit on this
1401 page.
1402
1403 For a more featureful version of this program, you should look to the
1404 lwp-request program included with the LWP modules from CPAN.
1405
1406 Interactive Client with IO::Socket
1407 Well, that's all fine if you want to send one command and get one
1408 answer, but what about setting up something fully interactive, somewhat
1409 like the way telnet works? That way you can type a line, get the
1410 answer, type a line, get the answer, etc.
1411
This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful "fork" call, the solution
isn't that rough. Once you've made the connection to whatever service
you'd like to chat with, call "fork" to clone your process. Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket. To
accomplish the same thing using just one process would be much harder,
because it's easier to code two processes to do one thing than it is to
code one process to do two things. (This keep-it-simple principle is
one of the cornerstones of the Unix philosophy, and of good software
engineering as well, which is probably why it's spread to other
systems.)
1424
1425 Here's the code:
1426
1427 #!/usr/bin/perl -w
1428 use strict;
1429 use IO::Socket;
1430 my ($host, $port, $kidpid, $handle, $line);
1431
1432 unless (@ARGV == 2) { die "usage: $0 host port" }
1433 ($host, $port) = @ARGV;
1434
1435 # create a tcp connection to the specified host and port
1436 $handle = IO::Socket::INET->new(Proto => "tcp",
1437 PeerAddr => $host,
1438 PeerPort => $port)
1439 || die "can't connect to port $port on $host: $!";
1440
1441 $handle->autoflush(1); # so output gets there right away
1442 print STDERR "[Connected to $host:$port]\n";
1443
1444 # split the program into two processes, identical twins
1445 die "can't fork: $!" unless defined($kidpid = fork());
1446
1447 # the if{} block runs only in the parent process
1448 if ($kidpid) {
1449 # copy the socket to standard output
1450 while (defined ($line = <$handle>)) {
1451 print STDOUT $line;
1452 }
1453 kill("TERM", $kidpid); # send SIGTERM to child
1454 }
1455 # the else{} block runs only in the child process
1456 else {
1457 # copy standard input to the socket
1458 while (defined ($line = <STDIN>)) {
1459 print $handle $line;
1460 }
1461 exit(0); # just in case
1462 }
1463
1464 The "kill" function in the parent's "if" block is there to send a
1465 signal to our child process, currently running in the "else" block, as
1466 soon as the remote server has closed its end of the connection.
1467
If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not
happen), you may wish to replace the "while" loop in the parent with
the following:
1471
1472 my $byte;
1473 while (sysread($handle, $byte, 1) == 1) {
1474 print STDOUT $byte;
1475 }
1476
1477 Making a system call for each byte you want to read is not very
1478 efficient (to put it mildly) but is the simplest to explain and works
1479 reasonably well.
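
A less wasteful variant, sketched here as an assumption rather than as
part of the client above, asks sysread() for a larger buffer. Because
sysread() returns as soon as some data is available, a short prompt
without a newline is still delivered promptly. A pipe and an in-memory
filehandle stand in for the socket and STDOUT, and copy_stream() is a
hypothetical helper name.

```perl
use strict;
use warnings;

# Copy everything from $in to $out using large sysread() requests.
sub copy_stream {
    my ($in, $out) = @_;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buf, 4096);
        die "sysread: $!" unless defined $got;
        last if $got == 0;                       # EOF
        print {$out} $buf;
        $total += $got;
    }
    return $total;
}

pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "Command? ") == 9 or die "syswrite: $!";   # no newline
close($writer);

my $captured = "";
open(my $sink, ">", \$captured) or die "open: $!";
my $bytes = copy_stream($reader, $sink);
close($sink);

print "copied $bytes bytes: '$captured'\n";
```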
1480
TCP Servers with IO::Socket
As always, setting up a server is a little bit more involved than
running a client. The model is that the server creates a special kind
of socket that does nothing but listen on a particular port for
incoming connections. It does this by calling the
"IO::Socket::INET->new()" method with slightly different arguments than
the client did.
1487
1488 Proto
1489 This is which protocol to use. Like our clients, we'll still
1490 specify "tcp" here.
1491
1492 LocalPort
We specify a local port in the "LocalPort" argument, which we
didn't do for the client. This is the service name or port number
for which you want to be the server. (Under Unix, ports under 1024
are restricted to the superuser.) In our sample, we'll use port
9000, but you can use any port that's not currently in use on your
system. If you try to use one already in use, you'll get an
"Address already in use" message. Under Unix, the "netstat -a"
command will show which services currently have servers.
1501
1502 Listen
1503 The "Listen" parameter is set to the maximum number of pending
1504 connections we can accept until we turn away incoming clients.
1505 Think of it as a call-waiting queue for your telephone. The low-
1506 level Socket module has a special symbol for the system maximum,
1507 which is SOMAXCONN.
1508
1509 Reuse
The "Reuse" parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.
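
Put together, the parameters above give a constructor call along the
lines of this sketch. Two things here are assumptions made for the
demonstration and are not part of the server described in this
section: LocalPort => 0 asks the kernel for any free port, and
LocalAddr restricts the listener to the loopback interface. The sketch
also spells the last option "ReuseAddr", which the IO::Socket::INET
documentation prefers over the older "Reuse".

```perl
use strict;
use warnings;
use IO::Socket;

my $server = IO::Socket::INET->new(
    Proto     => "tcp",
    LocalAddr => "127.0.0.1",   # loopback only (assumption of this sketch)
    LocalPort => 0,             # let the kernel pick a free port
    Listen    => SOMAXCONN,
    ReuseAddr => 1,
) or die "can't set up server: $@";

# Ask the socket which port the kernel actually assigned.
my $port = $server->sockport();
print "listening on 127.0.0.1:$port\n";
close($server);
```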
1513
1514 Once the generic server socket has been created using the parameters
1515 listed above, the server then waits for a new client to connect to it.
1516 The server blocks in the "accept" method, which eventually accepts a
1517 bidirectional connection from the remote client. (Make sure to
1518 autoflush this handle to circumvent buffering.)
1519
1520 To add to user-friendliness, our server prompts the user for commands.
1521 Most servers don't do this. Because of the prompt without a newline,
1522 you'll have to use the "sysread" variant of the interactive client
1523 above.
1524
1525 This server accepts one of five different commands, sending output back
1526 to the client. Unlike most network servers, this one handles only one
1527 incoming client at a time. Multithreaded servers are covered in
1528 Chapter 16 of the Camel.
1529
Here's the code:
1531
1532 #!/usr/bin/perl -w
1533 use IO::Socket;
1534 use Net::hostent; # for OOish version of gethostbyaddr
1535
1536 $PORT = 9000; # pick something not in use
1537
1538 $server = IO::Socket::INET->new( Proto => "tcp",
1539 LocalPort => $PORT,
1540 Listen => SOMAXCONN,
1541 Reuse => 1);
1542
1543 die "can't setup server" unless $server;
1544 print "[Server $0 accepting clients]\n";
1545
1546 while ($client = $server->accept()) {
1547 $client->autoflush(1);
1548 print $client "Welcome to $0; type help for command list.\n";
1549 $hostinfo = gethostbyaddr($client->peeraddr);
1550 printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
1551 print $client "Command? ";
1552 while ( <$client>) {
1553 next unless /\S/; # blank line
1554 if (/quit|exit/i) { last }
1555 elsif (/date|time/i) { printf $client "%s\n", scalar localtime() }
1556 elsif (/who/i ) { print $client `who 2>&1` }
1557 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` }
1558 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` }
1559 else {
1560 print $client "Commands: quit date who cookie motd\n";
1561 }
1562 } continue {
1563 print $client "Command? ";
1564 }
1565 close $client;
1566 }
1567
UDP: Message Passing
1569 Another kind of client-server setup is one that uses not connections,
1570 but messages. UDP communications involve much lower overhead but also
1571 provide less reliability, as there are no promises that messages will
1572 arrive at all, let alone in order and unmangled. Still, UDP offers
1573 some advantages over TCP, including being able to "broadcast" or
1574 "multicast" to a whole bunch of destination hosts at once (usually on
1575 your local subnet). If you find yourself overly concerned about
1576 reliability and start building checks into your message system, then
1577 you probably should use just TCP to start with.
1578
1579 UDP datagrams are not a bytestream and should not be treated as such.
1580 This makes using I/O mechanisms with internal buffering like stdio
1581 (i.e. print() and friends) especially cumbersome. Use syswrite(), or
1582 better send(), like in the example below.
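
The message-oriented behavior is easy to observe on the loopback
interface. In this sketch, which assumes that 127.0.0.1 is usable and
lets the kernel choose the port, two send() calls arrive as two
separate datagrams rather than as one merged stream of bytes. The
two-second timeout is there so the loop cannot hang if a datagram is
lost.

```perl
use strict;
use warnings;
use IO::Socket;
use IO::Select;

my $receiver = IO::Socket::INET->new(
    Proto => "udp", LocalAddr => "127.0.0.1", LocalPort => 0,
) or die "receiver: $@";

my $sender = IO::Socket::INET->new(
    Proto => "udp", PeerAddr => "127.0.0.1",
    PeerPort => $receiver->sockport(),
) or die "sender: $@";

# Each send() call produces exactly one datagram.
defined($sender->send("first message")) or die "send: $!";
defined($sender->send("second"))        or die "send: $!";

my @received;
my $ready = IO::Select->new($receiver);
while (@received < 2 && $ready->can_read(2)) {
    my $datagram;
    defined($receiver->recv($datagram, 1024)) or die "recv: $!";
    push @received, $datagram;      # one recv() per datagram
}

print "datagram: '$_'\n" for @received;
```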
1583
Here's a UDP program similar to the sample Internet TCP client given
earlier. However, instead of checking one host at a time, the UDP
version will check many of them asynchronously by simulating a
multicast and then using select() to wait for I/O with a timeout. To
do something similar with TCP, you'd have to use a different socket
handle for each host.

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use Sys::Hostname;

    my ( $count, $hisiaddr, $hispaddr, $histime,
         $host, $iaddr, $paddr, $port, $proto,
         $rin, $rout, $rtime, $SECS_OF_70_YEARS);

    # seconds between the time-protocol epoch (1900) and the Unix epoch (1970)
    $SECS_OF_70_YEARS = 2_208_988_800;

    $iaddr = gethostbyname(hostname());
    $proto = getprotobyname("udp");
    $port  = getservbyname("time", "udp");
    $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick

    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
    bind(SOCKET, $paddr)                        || die "bind: $!";

    $| = 1;
    printf "%-12s %8s %s\n",  "localhost", 0, scalar localtime();
    $count = 0;
    # send a request to every host before waiting for any reply
    for $host (@ARGV) {
        $count++;
        $hisiaddr = inet_aton($host)              || die "unknown host: $host";
        $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send(SOCKET, 0, 0, $hispaddr))    || die "send $host: $!";
    }

    $rin = "";
    vec($rin, fileno(SOCKET), 1) = 1;

    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        $rtime = "";
        # recv() returns the packed address of whoever answered
        $hispaddr = recv(SOCKET, $rtime, 4, 0)    || die "recv: $!";
        ($port, $hisiaddr) = sockaddr_in($hispaddr);
        $host = gethostbyaddr($hisiaddr, AF_INET);
        $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time(), scalar localtime($histime);
        $count--;
    }

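The only arithmetic in that program is the conversion of the reply. The
time service answers with a 32-bit count of seconds since 1900 in network
byte order, so subtracting the 70-year offset yields Unix epoch seconds.
A small sketch of that step in isolation follows; time_reply_to_epoch()
is a hypothetical helper name, and the reply is simulated rather than
read from a socket.

```perl
use strict;
use warnings;

# Seconds between 1900-01-01 (time-protocol epoch) and 1970-01-01 (Unix epoch).
my $SECS_OF_70_YEARS = 2_208_988_800;

# Convert a 4-byte network-order reply into Unix epoch seconds.
sub time_reply_to_epoch {
    my ($reply) = @_;
    length($reply) == 4 or die "short time reply";
    return unpack("N", $reply) - $SECS_OF_70_YEARS;
}

# Simulated reply: one day after the Unix epoch.
my $reply = pack("N", $SECS_OF_70_YEARS + 86_400);
print time_reply_to_epoch($reply), "\n";    # prints 86400
```
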
This example does not include any retries and may consequently fail to
contact a reachable host. The most likely cause is congestion of the
sending host's queues when the number of hosts to contact is large.

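If you do want retries, one simple approach is to resend the request
whenever the wait for a reply times out. The following is only a sketch
under our own assumptions: request_with_retry() is a hypothetical helper,
the attempt count and timeout are arbitrary, and the demo talks to a
local socket that never answers so that every attempt times out.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Hypothetical helper: send $msg to $dest up to $tries times, waiting
# $timeout seconds for a reply after each attempt. Returns the reply,
# or undef if every attempt timed out.
sub request_with_retry {
    my ($sock, $dest, $msg, $tries, $timeout) = @_;
    my $sel = IO::Select->new($sock);
    for (1 .. $tries) {
        defined send($sock, $msg, 0, $dest) or die "send: $!";
        my @ready = $sel->can_read($timeout) or next;   # timed out: resend
        defined recv($sock, my $reply, 512, 0) or die "recv: $!";
        return $reply;
    }
    return undef;
}

my $sock = IO::Socket::INET->new(
    Proto => "udp", LocalAddr => "127.0.0.1", LocalPort => 0,
) or die "socket: $@";

# A bound socket that never replies, standing in for an unresponsive host.
my $silent = IO::Socket::INET->new(
    Proto => "udp", LocalAddr => "127.0.0.1", LocalPort => 0,
) or die "socket: $@";

my $reply = request_with_retry($sock, $silent->sockname, "ping", 3, 0.1);
print defined $reply ? "reply: $reply\n" : "no reply after 3 attempts\n";
```

A real client would also check that a reply came from the host it asked,
since a late answer to an earlier attempt looks the same as a fresh one.
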
While System V IPC isn't as widely used as sockets, it still has some
interesting uses. However, you cannot use SysV IPC or Berkeley mmap()
to share a Perl variable directly among several processes, because
Perl would reallocate your string when you weren't expecting it to.
You might look into the "IPC::Shareable" or "threads::shared" modules
for that.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
    defined($id)                    || die "shmget: $!";
    print "shm id $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60)  || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60)      || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = "";
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0)        || die "shmctl: $!";

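Because shmwrite() and shmread() move raw bytes, anything other than a
plain string is best stored in a fixed-width packed form. Here is a
minimal sketch that keeps a 32-bit counter in a private segment. It
assumes SysV shared memory is available on your system, and the choice
of a 4-byte "N" field is ours.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

my $id = shmget(IPC_PRIVATE, 4, S_IRUSR | S_IWUSR);
defined($id) || die "shmget: $!";

# Store a 32-bit counter in network byte order at offset 0.
shmwrite($id, pack("N", 41), 0, 4) || die "shmwrite: $!";

# Read it back, increment it, and store it again.
my $raw;
shmread($id, $raw, 0, 4) || die "shmread: $!";
my $counter = unpack("N", $raw) + 1;
shmwrite($id, pack("N", $counter), 0, 4) || die "shmwrite: $!";

# Confirm what the segment now holds.
shmread($id, $raw, 0, 4) || die "shmread: $!";
my $final = unpack("N", $raw);
print "counter is now $final\n";

shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
```

The read-modify-write sequence is not atomic, so several processes
updating the same counter would need to guard it, for instance with a
semaphore.
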
Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id)                    || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file take:

    # open the existing semaphore set

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id)                    || die "semget: $!";

    $semnum  = 0;
    $semflag = 0;

    # "take" semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring  = $opstring1 . $opstring2;

    semop($id, $opstring)           || die "semop: $!";

Put this code in a separate file to be run in more than one process.
Call this file give:

    # "give" the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id)                    || die "semget: $!";

    $semnum  = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id, $opstring)           || die "semop: $!";

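To watch the take and give operations without juggling two terminals,
the following sketch runs both in a single process against a private
semaphore set and inspects the value with semctl(). It assumes SysV
semaphores are available; sembuf() is a hypothetical helper of ours, and
IPC_NOWAIT is used so that the second take fails instead of blocking.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID IPC_NOWAIT
                 GETVAL SETVAL S_IRUSR S_IWUSR);

# A private set holding one semaphore, explicitly started at zero.
my $id = semget(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT);
defined($id)                || die "semget: $!";
semctl($id, 0, SETVAL, 0)   || die "semctl SETVAL: $!";

# Pack one operation: semaphore number, operation, flags.
sub sembuf { pack("s!s!s!", @_) }

# "take": wait for zero, then increment, applied as one atomic unit.
semop($id, sembuf(0, 0, 0) . sembuf(0, 1, 0)) || die "semop take: $!";
my $held = semctl($id, 0, GETVAL, 0) + 0;     # "0 but true" becomes 0

# A second, non-blocking take fails while the value is still 1.
my $second = semop($id, sembuf(0, 0, IPC_NOWAIT) . sembuf(0, 1, 0));

# "give": decrement back to zero.
semop($id, sembuf(0, -1, 0)) || die "semop give: $!";
my $released = semctl($id, 0, GETVAL, 0) + 0;

print "held=$held second=", ($second ? "ok" : "busy"),
      " released=$released\n";
semctl($id, 0, IPC_RMID, 0) || die "semctl IPC_RMID: $!";
```
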
The SysV IPC code above was written long ago, and it's definitely
clunky looking. For a more modern look, see the IPC::SysV module which
is included with Perl starting from Perl 5.005.

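As a rough illustration of that more modern look, here is the same
take-and-give sequence written with the IPC::Semaphore class that ships
alongside IPC::SysV. This is a sketch that runs in one process on a
private semaphore set, not a replacement for the take and give scripts
above.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# One semaphore in a private set.
my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT)
    or die "semaphore: $!";

$sem->setval(0, 0)            or die "setval: $!";   # start at zero

# op() takes triples of (semaphore number, operation, flags).
$sem->op(0, 0, 0,  0, 1, 0)   or die "take: $!";     # wait for zero, increment
my $held = $sem->getval(0);

$sem->op(0, -1, 0)            or die "give: $!";     # decrement
my $released = $sem->getval(0);

print "held=$held released=$released\n";
$sem->remove                  or die "remove: $!";
```

The object interface packs the operation list for you, so the pack()
templates disappear from your own code.
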
A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
    defined($id)                || die "msgget failed: $!";

    my $sent      = "message";
    my $type_sent = 1234;

    msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
                                || die "msgsnd failed: $!";

    msgrcv($id, my $rcvd_buf, 60, 0, 0)
                                || die "msgrcv failed: $!";

    my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);

    if ($rcvd eq $sent) {
        print "okay\n";
    } else {
        print "not okay\n";
    }

    msgctl($id, IPC_RMID, 0)    || die "msgctl failed: $!\n";

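The example above passes 0 as the TYPE argument of msgrcv(), which takes
whatever message is first in the queue. A positive TYPE instead selects
the first queued message of that type, which lets a reader pick messages
out of order. The sketch below assumes SysV message queues are
available; the type numbers and message texts are arbitrary.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT IPC_NOWAIT S_IRUSR S_IWUSR);

my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
defined($id) || die "msgget failed: $!";

# Queue two messages with different types.
msgsnd($id, pack("l! a*", 1, "low"),  0) || die "msgsnd failed: $!";
msgsnd($id, pack("l! a*", 2, "high"), 0) || die "msgsnd failed: $!";

# Ask for type 2 first: it is delivered ahead of the older type-1 message.
my @order;
for my $type (2, 1) {
    msgrcv($id, my $buf, 60, $type, IPC_NOWAIT) || die "msgrcv failed: $!";
    my ($got_type, $text) = unpack("l! a*", $buf);
    push @order, "$got_type:$text";
}
print "@order\n";

msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!";
```
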
Most of these routines quietly but politely return "undef" when they
fail instead of causing your program to die right then and there due to
an uncaught exception. (Actually, some of the new Socket conversion
functions do croak() on bad arguments.) It is therefore essential to
check return values from these functions. Always begin your socket
programs this way for optimal success, and don't forget to add the -T
taint-checking flag to the "#!" line for servers:

    #!/usr/bin/perl -Tw
    use strict;
    use sigtrap;
    use Socket;

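To see what a quiet failure looks like, the following sketch provokes
one on purpose. It binds a loopback port without ever calling listen(),
then tries to connect to it; connect() simply returns false and leaves
the reason in $!. The setup is our own contrivance, and it assumes your
system refuses connections to a bound but non-listening TCP port, as
Linux does.

```perl
use strict;
use warnings;
use Socket qw(:DEFAULT IPPROTO_TCP);

# Reserve a loopback port but never listen() on it.
socket(my $probe, PF_INET, SOCK_STREAM, IPPROTO_TCP) || die "socket: $!";
bind($probe, sockaddr_in(0, INADDR_LOOPBACK))        || die "bind: $!";
my ($port) = sockaddr_in(getsockname($probe));

socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)  || die "socket: $!";

# connect() returns false on failure and sets $!; it does not die by itself.
my $ok    = connect($sock, sockaddr_in($port, INADDR_LOOPBACK));
my $error = $ok ? "" : "$!";
print $ok ? "connected\n" : "connect failed: $error\n";
```

Without the check on the return value, the program would carry on with
an unconnected socket and fail later in a much less obvious place.
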
These routines all create system-specific portability problems. As
noted elsewhere, Perl is at the mercy of your C libraries for much of
its system behavior. It's probably safest to assume broken SysV
semantics for signals and to stick with simple TCP and UDP socket
operations; e.g., don't try to pass open file descriptors over a local
UDP datagram socket if you want your code to stand a chance of being
portable.

Tom Christiansen, with occasional vestiges of Larry Wall's original
version and suggestions from the Perl Porters.

There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is Unix Network
Programming, 2nd Edition, Volume 1 by W. Richard Stevens (published by
Prentice-Hall). Most books on networking address the subject from the
perspective of a C programmer; translation to Perl is left as an
exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the
Socket(3) manpage describes the low-level interface to sockets.
Besides the obvious functions in perlfunc, you should also check out
the modules file at your nearest CPAN site, especially
<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. See
perlmodlib or, better yet, the Perl FAQ for a description of what CPAN
is and where to get it if the previous link doesn't work for you.

Section 5 of CPAN's modules file is devoted to "Networking, Device
Control (modems), and Interprocess Communication", and contains
numerous unbundled modules for networking, Chat and Expect operations,
CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP,
Telnet, Threads, and ToolTalk--to name just a few.



perl v5.16.3                      2013-03-04                      PERLIPC(1)