1PERLIPC(1) Perl Programmers Reference Guide PERLIPC(1)
2
3
4
6 perlipc - Perl interprocess communication (signals, fifos, pipes, safe
7 subprocesses, sockets, and semaphores)
8
10 The basic IPC facilities of Perl are built out of the good old Unix
11 signals, named pipes, pipe opens, the Berkeley socket routines, and
12 SysV IPC calls. Each is used in slightly different situations.
13
15 Perl uses a simple signal handling model: the %SIG hash contains names
16 or references of user-installed signal handlers. These handlers will
17 be called with an argument which is the name of the signal that
18 triggered it. A signal may be generated intentionally from a
19 particular keyboard sequence like control-C or control-Z, sent to you
20 from another process, or triggered automatically by the kernel when
21 special events transpire, like a child process exiting, your own
22 process running out of stack space, or hitting a process file-size
23 limit.
24
25 For example, to trap an interrupt signal, set up a handler like this:
26
27 our $shucks;
28
29 sub catch_zap {
30 my $signame = shift;
31 $shucks++;
32 die "Somebody sent me a SIG$signame";
33 }
34 $SIG{INT} = __PACKAGE__ . "::catch_zap";
35 $SIG{INT} = \&catch_zap; # best strategy
36
37 Prior to Perl 5.8.0 it was necessary to do as little as you possibly
38 could in your handler; notice how all we do is set a global variable
39 and then raise an exception. That's because on most systems, libraries
40 are not re-entrant; particularly, memory allocation and I/O routines
41 are not. That meant that doing nearly anything in your handler could
42 in theory trigger a memory fault and subsequent core dump - see
43 "Deferred Signals (Safe Signals)" below.
44
45 The names of the signals are the ones listed out by "kill -l" on your
46 system, or you can retrieve them using the CPAN module IPC::Signal.
47
48 You may also choose to assign the strings "IGNORE" or "DEFAULT" as the
49 handler, in which case Perl will try to discard the signal or do the
50 default thing.
51
52 On most Unix platforms, the "CHLD" (sometimes also known as "CLD")
53 signal has special behavior with respect to a value of "IGNORE".
54 Setting $SIG{CHLD} to "IGNORE" on such a platform has the effect of not
55 creating zombie processes when the parent process fails to "wait()" on
56 its child processes (i.e., child processes are automatically reaped).
57 Calling "wait()" with $SIG{CHLD} set to "IGNORE" usually returns "-1"
58 on such platforms.
59
60 Some signals can be neither trapped nor ignored, such as the KILL and
61 STOP (but not the TSTP) signals. Note that ignoring signals makes them
62 disappear. If you only want them blocked temporarily without them
63 getting lost you'll have to use POSIX' sigprocmask.
64
65 Sending a signal to a negative process ID means that you send the
66 signal to the entire Unix process group. This code sends a hang-up
67 signal to all processes in the current process group, and also sets
68 $SIG{HUP} to "IGNORE" so it doesn't kill itself:
69
70 # block scope for local
71 {
72 local $SIG{HUP} = "IGNORE";
73 kill HUP => -getpgrp();
74 # snazzy writing of: kill("HUP", -getpgrp())
75 }
76
77 Another interesting signal to send is signal number zero. This doesn't
78 actually affect a child process, but instead checks whether it's alive
79 or has changed its UIDs.
80
81 unless (kill 0 => $kid_pid) {
82 warn "something wicked happened to $kid_pid";
83 }
84
85 Signal number zero may fail because you lack permission to send the
86 signal when directed at a process whose real or saved UID is not
87 identical to the real or effective UID of the sending process, even
88 though the process is alive. You may be able to determine the cause of
89 failure using $! or "%!".
90
91 unless (kill(0 => $pid) || $!{EPERM}) {
92 warn "$pid looks dead";
93 }
94
95 You might also want to employ anonymous functions for simple signal
96 handlers:
97
98 $SIG{INT} = sub { die "\nOutta here!\n" };
99
100 SIGCHLD handlers require some special care. If a second child dies
101 while in the signal handler caused by the first death, we won't get
102 another signal. So must loop here else we will leave the unreaped child
103 as a zombie. And the next time two children die we get another zombie.
104 And so on.
105
106 use POSIX ":sys_wait_h";
107 $SIG{CHLD} = sub {
108 while ((my $child = waitpid(-1, WNOHANG)) > 0) {
109 $Kid_Status{$child} = $?;
110 }
111 };
112 # do something that forks...
113
114 Be careful: qx(), system(), and some modules for calling external
115 commands do a fork(), then wait() for the result. Thus, your signal
116 handler will be called. Because wait() was already called by system()
117 or qx(), the wait() in the signal handler will see no more zombies and
118 will therefore block.
119
120 The best way to prevent this issue is to use waitpid(), as in the
121 following example:
122
123 use POSIX ":sys_wait_h"; # for nonblocking read
124
125 my %children;
126
127 $SIG{CHLD} = sub {
128 # don't change $! and $? outside handler
129 local ($!, $?);
130 while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) {
131 delete $children{$pid};
132 cleanup_child($pid, $?);
133 }
134 };
135
136 while (1) {
137 my $pid = fork();
138 die "cannot fork" unless defined $pid;
139 if ($pid == 0) {
140 # ...
141 exit 0;
142 } else {
143 $children{$pid}=1;
144 # ...
145 system($command);
146 # ...
147 }
148 }
149
150 Signal handling is also used for timeouts in Unix. While safely
151 protected within an "eval{}" block, you set a signal handler to trap
152 alarm signals and then schedule to have one delivered to you in some
153 number of seconds. Then try your blocking operation, clearing the
154 alarm when it's done but not before you've exited your "eval{}" block.
155 If it goes off, you'll use die() to jump out of the block.
156
157 Here's an example:
158
159 my $ALARM_EXCEPTION = "alarm clock restart";
160 eval {
161 local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
162 alarm 10;
163 flock(FH, 2) # blocking write lock
164 || die "cannot flock: $!";
165 alarm 0;
166 };
167 if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }
168
169 If the operation being timed out is system() or qx(), this technique is
170 liable to generate zombies. If this matters to you, you'll need to
171 do your own fork() and exec(), and kill the errant child process.
172
173 For more complex signal handling, you might see the standard POSIX
174 module. Lamentably, this is almost entirely undocumented, but the
175 ext/POSIX/t/sigaction.t file from the Perl source distribution has some
176 examples in it.
177
178 Handling the SIGHUP Signal in Daemons
179 A process that usually starts when the system boots and shuts down when
180 the system is shut down is called a daemon (Disk And Execution
181 MONitor). If a daemon process has a configuration file which is
182 modified after the process has been started, there should be a way to
183 tell that process to reread its configuration file without stopping the
184 process. Many daemons provide this mechanism using a "SIGHUP" signal
185 handler. When you want to tell the daemon to reread the file, simply
186 send it the "SIGHUP" signal.
187
188 The following example implements a simple daemon, which restarts itself
189 every time the "SIGHUP" signal is received. The actual code is located
190 in the subroutine "code()", which just prints some debugging info to
191 show that it works; it should be replaced with the real code.
192
193 #!/usr/bin/perl
194
195 use strict;
196 use warnings;
197
198 use POSIX ();
199 use FindBin ();
200 use File::Basename ();
201 use File::Spec::Functions qw(catfile);
202
203 $| = 1;
204
205 # make the daemon cross-platform, so exec always calls the script
206 # itself with the right path, no matter how the script was invoked.
207 my $script = File::Basename::basename($0);
208 my $SELF = catfile($FindBin::Bin, $script);
209
210 # POSIX unmasks the sigprocmask properly
211 $SIG{HUP} = sub {
212 print "got SIGHUP\n";
213 exec($SELF, @ARGV) || die "$0: couldn't restart: $!";
214 };
215
216 code();
217
218 sub code {
219 print "PID: $$\n";
220 print "ARGV: @ARGV\n";
221 my $count = 0;
222 while (1) {
223 sleep 2;
224 print ++$count, "\n";
225 }
226 }
227
228 Deferred Signals (Safe Signals)
229 Before Perl 5.8.0, installing Perl code to deal with signals exposed
230 you to danger from two things. First, few system library functions are
231 re-entrant. If the signal interrupts while Perl is executing one
232 function (like malloc(3) or printf(3)), and your signal handler then
233 calls the same function again, you could get unpredictable
234 behavior--often, a core dump. Second, Perl isn't itself re-entrant at
235 the lowest levels. If the signal interrupts Perl while Perl is
236 changing its own internal data structures, similarly unpredictable
237 behavior may result.
238
239 There were two things you could do, knowing this: be paranoid or be
240 pragmatic. The paranoid approach was to do as little as possible in
241 your signal handler. Set an existing integer variable that already has
242 a value, and return. This doesn't help you if you're in a slow system
243 call, which will just restart. That means you have to "die" to
244 longjmp(3) out of the handler. Even this is a little cavalier for the
245 true paranoiac, who avoids "die" in a handler because the system is out
246 to get you. The pragmatic approach was to say "I know the risks, but
247 prefer the convenience", and to do anything you wanted in your signal
248 handler, and be prepared to clean up core dumps now and again.
249
250 Perl 5.8.0 and later avoid these problems by "deferring" signals. That
251 is, when the signal is delivered to the process by the system (to the C
252 code that implements Perl) a flag is set, and the handler returns
253 immediately. Then at strategic "safe" points in the Perl interpreter
254 (e.g. when it is about to execute a new opcode) the flags are checked
255 and the Perl level handler from %SIG is executed. The "deferred" scheme
256 allows much more flexibility in the coding of signal handlers as we
257 know the Perl interpreter is in a safe state, and that we are not in a
258 system library function when the handler is called. However the
259 implementation does differ from previous Perls in the following ways:
260
261 Long-running opcodes
262 As the Perl interpreter looks at signal flags only when it is about
263 to execute a new opcode, a signal that arrives during a long-
264 running opcode (e.g. a regular expression operation on a very large
265 string) will not be seen until the current opcode completes.
266
267 If a signal of any given type fires multiple times during an opcode
268 (such as from a fine-grained timer), the handler for that signal
269 will be called only once, after the opcode completes; all other
270 instances will be discarded. Furthermore, if your system's signal
271 queue gets flooded to the point that there are signals that have
272 been raised but not yet caught (and thus not deferred) at the time
273 an opcode completes, those signals may well be caught and deferred
274 during subsequent opcodes, with sometimes surprising results. For
275 example, you may see alarms delivered even after calling alarm(0)
276 as the latter stops the raising of alarms but does not cancel the
277 delivery of alarms raised but not yet caught. Do not depend on the
278 behaviors described in this paragraph as they are side effects of
279 the current implementation and may change in future versions of
280 Perl.
281
282 Interrupting IO
283 When a signal is delivered (e.g., SIGINT from a control-C) the
284 operating system breaks into IO operations like read(2), which is
285 used to implement Perl's readline() function, the "<>" operator. On
286 older Perls the handler was called immediately (and as "read" is
287 not "unsafe", this worked well). With the "deferred" scheme the
288 handler is not called immediately, and if Perl is using the
289 system's "stdio" library that library may restart the "read"
290 without returning to Perl to give it a chance to call the %SIG
291 handler. If this happens on your system the solution is to use the
292 ":perlio" layer to do IO--at least on those handles that you want
293 to be able to break into with signals. (The ":perlio" layer checks
294 the signal flags and calls %SIG handlers before resuming IO
295 operation.)
296
297 The default in Perl 5.8.0 and later is to automatically use the
298 ":perlio" layer.
299
300 Note that it is not advisable to access a file handle within a
301 signal handler where that signal has interrupted an I/O operation
302 on that same handle. While perl will at least try hard not to
303 crash, there are no guarantees of data integrity; for example, some
304 data might get dropped or written twice.
305
306 Some networking library functions like gethostbyname() are known to
307 have their own implementations of timeouts which may conflict with
308 your timeouts. If you have problems with such functions, try using
309 the POSIX sigaction() function, which bypasses Perl safe signals.
310 Be warned that this does subject you to possible memory corruption,
311 as described above.
312
313 Instead of setting $SIG{ALRM}:
314
315 local $SIG{ALRM} = sub { die "alarm" };
316
317 try something like the following:
318
319 use POSIX qw(SIGALRM);
320 POSIX::sigaction(SIGALRM,
321 POSIX::SigAction->new(sub { die "alarm" }))
322 || die "Error setting SIGALRM handler: $!\n";
323
324 Another way to disable the safe signal behavior locally is to use
325 the "Perl::Unsafe::Signals" module from CPAN, which affects all
326 signals.
327
328 Restartable system calls
329 On systems that supported it, older versions of Perl used the
330 SA_RESTART flag when installing %SIG handlers. This meant that
331 restartable system calls would continue rather than returning when
332 a signal arrived. In order to deliver deferred signals promptly,
333 Perl 5.8.0 and later do not use SA_RESTART. Consequently,
334 restartable system calls can fail (with $! set to "EINTR") in
335 places where they previously would have succeeded.
336
337 The default ":perlio" layer retries "read", "write" and "close" as
338 described above; interrupted "wait" and "waitpid" calls will always
339 be retried.
340
341 Signals as "faults"
342 Certain signals like SEGV, ILL, and BUS are generated by virtual
343 memory addressing errors and similar "faults". These are normally
344 fatal: there is little a Perl-level handler can do with them. So
345 Perl delivers them immediately rather than attempting to defer
346 them.
347
348 Signals triggered by operating system state
349 On some operating systems certain signal handlers are supposed to
350 "do something" before returning. One example can be CHLD or CLD,
351 which indicates a child process has completed. On some operating
352 systems the signal handler is expected to "wait" for the completed
353 child process. On such systems the deferred signal scheme will not
354 work for those signals: it does not do the "wait". Again the
355 failure will look like a loop as the operating system will reissue
356 the signal because there are completed child processes that have
357 not yet been "wait"ed for.
358
359 If you want the old signal behavior back despite possible memory
360 corruption, set the environment variable "PERL_SIGNALS" to "unsafe".
361 This feature first appeared in Perl 5.8.1.
362
364 A named pipe (often referred to as a FIFO) is an old Unix IPC mechanism
365 for processes communicating on the same machine. It works just like
366 regular anonymous pipes, except that the processes rendezvous using a
367 filename and need not be related.
368
369 To create a named pipe, use the "POSIX::mkfifo()" function.
370
371 use POSIX qw(mkfifo);
372 mkfifo($path, 0700) || die "mkfifo $path failed: $!";
373
374 You can also use the Unix command mknod(1), or on some systems,
375 mkfifo(1). These may not be in your normal path, though.
376
377 # system return val is backwards, so && not ||
378 #
379 $ENV{PATH} .= ":/etc:/usr/etc";
380 if ( system("mknod", $path, "p")
381 && system("mkfifo", $path) )
382 {
383 die "mk{nod,fifo} $path failed";
384 }
385
386 A fifo is convenient when you want to connect a process to an unrelated
387 one. When you open a fifo, the program will block until there's
388 something on the other end.
389
390 For example, let's say you'd like to have your .signature file be a
391 named pipe that has a Perl program on the other end. Now every time
392 any program (like a mailer, news reader, finger program, etc.) tries to
393 read from that file, the reading program will read the new signature
394 from your program. We'll use the pipe-checking file-test operator, -p,
395 to find out whether anyone (or anything) has accidentally removed our
396 fifo.
397
398 chdir(); # go home
399 my $FIFO = ".signature";
400
401 while (1) {
402 unless (-p $FIFO) {
403 unlink $FIFO; # discard any failure, will catch later
404 require POSIX; # delayed loading of heavy module
405 POSIX::mkfifo($FIFO, 0700)
406 || die "can't mkfifo $FIFO: $!";
407 }
408
409 # next line blocks till there's a reader
410 open (FIFO, "> $FIFO") || die "can't open $FIFO: $!";
411 print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
412 close(FIFO) || die "can't close $FIFO: $!";
413 sleep 2; # to avoid dup signals
414 }
415
417 Perl's basic open() statement can also be used for unidirectional
418 interprocess communication by either appending or prepending a pipe
419 symbol to the second argument to open(). Here's how to start something
420 up in a child process you intend to write to:
421
422 open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
423 || die "can't fork: $!";
424 local $SIG{PIPE} = sub { die "spooler pipe broke" };
425 print SPOOLER "stuff\n";
426 close SPOOLER || die "bad spool: $! $?";
427
428 And here's how to start up a child process you intend to read from:
429
430 open(STATUS, "netstat -an 2>&1 |")
431 || die "can't fork: $!";
432 while (<STATUS>) {
433 next if /^(tcp|udp)/;
434 print;
435 }
436 close STATUS || die "bad netstat: $! $?";
437
438 If one can be sure that a particular program is a Perl script expecting
439 filenames in @ARGV, the clever programmer can write something like
440 this:
441
442 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
443
444 and no matter which sort of shell it's called from, the Perl program
445 will read from the file f1, the process cmd1, standard input (tmpfile
446 in this case), the f2 file, the cmd2 command, and finally the f3 file.
447 Pretty nifty, eh?
448
449 You might notice that you could use backticks for much the same effect
450 as opening a pipe for reading:
451
452 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
453 die "bad netstatus ($?)" if $?;
454
455 While this is true on the surface, it's much more efficient to process
456 the file one line or record at a time because then you don't have to
457 read the whole thing into memory at once. It also gives you finer
458 control of the whole process, letting you kill off the child process
459 early if you'd like.
460
461 Be careful to check the return values from both open() and close(). If
462 you're writing to a pipe, you should also trap SIGPIPE. Otherwise,
463 think of what happens when you start up a pipe to a command that
464 doesn't exist: the open() will in all likelihood succeed (it only
465 reflects the fork()'s success), but then your output will
466 fail--spectacularly. Perl can't know whether the command worked,
467 because your command is actually running in a separate process whose
468 exec() might have failed. Therefore, while readers of bogus commands
469 return just a quick EOF, writers to bogus commands will get hit with a
470 signal, which they'd best be prepared to handle. Consider:
471
472 open(FH, "|bogus") || die "can't fork: $!";
473 print FH "bang\n"; # neither necessary nor sufficient
474 # to check print retval!
475 close(FH) || die "can't close: $!";
476
477 The reason for not checking the return value from print() is because of
478 pipe buffering; physical writes are delayed. That won't blow up until
479 the close, and it will blow up with a SIGPIPE. To catch it, you could
480 use this:
481
482 $SIG{PIPE} = "IGNORE";
483 open(FH, "|bogus") || die "can't fork: $!";
484 print FH "bang\n";
485 close(FH) || die "can't close: status=$?";
486
487 Filehandles
488 Both the main process and any child processes it forks share the same
489 STDIN, STDOUT, and STDERR filehandles. If both processes try to access
490 them at once, strange things can happen. You may also want to close or
491 reopen the filehandles for the child. You can get around this by
492 opening your pipe with open(), but on some systems this means that the
493 child process cannot outlive the parent.
494
495 Background Processes
496 You can run a command in the background with:
497
498 system("cmd &");
499
500 The command's STDOUT and STDERR (and possibly STDIN, depending on your
501 shell) will be the same as the parent's. You won't need to catch
502 SIGCHLD because of the double-fork taking place; see below for details.
503
504 Complete Dissociation of Child from Parent
505 In some cases (starting server processes, for instance) you'll want to
506 completely dissociate the child process from the parent. This is often
507 called daemonization. A well-behaved daemon will also chdir() to the
508 root directory so it doesn't prevent unmounting the filesystem
509 containing the directory from which it was launched, and redirect its
510 standard file descriptors from and to /dev/null so that random output
511 doesn't wind up on the user's terminal.
512
513 use POSIX "setsid";
514
515 sub daemonize {
516 chdir("/") || die "can't chdir to /: $!";
517 open(STDIN, "< /dev/null") || die "can't read /dev/null: $!";
518 open(STDOUT, "> /dev/null") || die "can't write to /dev/null: $!";
519 defined(my $pid = fork()) || die "can't fork: $!";
520 exit if $pid; # non-zero now means I am the parent
521 (setsid() != -1) || die "Can't start a new session: $!";
522 open(STDERR, ">&STDOUT") || die "can't dup stdout: $!";
523 }
524
525 The fork() has to come before the setsid() to ensure you aren't a
526 process group leader; the setsid() will fail if you are. If your
527 system doesn't have the setsid() function, open /dev/tty and use the
528 "TIOCNOTTY" ioctl() on it instead. See tty(4) for details.
529
530 Non-Unix users should check their "Your_OS::Process" module for other
531 possible solutions.
532
533 Safe Pipe Opens
534 Another interesting approach to IPC is making your single program go
535 multiprocess and communicate between--or even amongst--yourselves. The
536 open() function will accept a file argument of either "-|" or "|-" to
537 do a very interesting thing: it forks a child connected to the
538 filehandle you've opened. The child is running the same program as the
539 parent. This is useful for safely opening a file when running under an
540 assumed UID or GID, for example. If you open a pipe to minus, you can
541 write to the filehandle you opened and your kid will find it in his
542 STDIN. If you open a pipe from minus, you can read from the filehandle
543 you opened whatever your kid writes to his STDOUT.
544
545 use English;
546 my $PRECIOUS = "/path/to/some/safe/file";
547 my $sleep_count;
548 my $pid;
549
550 do {
551 $pid = open(KID_TO_WRITE, "|-");
552 unless (defined $pid) {
553 warn "cannot fork: $!";
554 die "bailing out" if $sleep_count++ > 6;
555 sleep 10;
556 }
557 } until defined $pid;
558
559 if ($pid) { # I am the parent
560 print KID_TO_WRITE @some_data;
561 close(KID_TO_WRITE) || warn "kid exited $?";
562 } else { # I am the child
563 # drop permissions in setuid and/or setgid programs:
564 ($EUID, $EGID) = ($UID, $GID);
565 open (OUTFILE, "> $PRECIOUS")
566 || die "can't open $PRECIOUS: $!";
567 while (<STDIN>) {
568 print OUTFILE; # child's STDIN is parent's KID_TO_WRITE
569 }
570 close(OUTFILE) || die "can't close $PRECIOUS: $!";
571 exit(0); # don't forget this!!
572 }
573
574 Another common use for this construct is when you need to execute
575 something without the shell's interference. With system(), it's
576 straightforward, but you can't use a pipe open or backticks safely.
577 That's because there's no way to stop the shell from getting its hands
578 on your arguments. Instead, use lower-level control to call exec()
579 directly.
580
581 Here's a safe backtick or pipe open for read:
582
583 my $pid = open(KID_TO_READ, "-|");
584 defined($pid) || die "can't fork: $!";
585
586 if ($pid) { # parent
587 while (<KID_TO_READ>) {
588 # do something interesting
589 }
590 close(KID_TO_READ) || warn "kid exited $?";
591
592 } else { # child
593 ($EUID, $EGID) = ($UID, $GID); # suid only
594 exec($program, @options, @args)
595 || die "can't exec program: $!";
596 # NOTREACHED
597 }
598
599 And here's a safe pipe open for writing:
600
601 my $pid = open(KID_TO_WRITE, "|-");
602 defined($pid) || die "can't fork: $!";
603
604 $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
605
606 if ($pid) { # parent
607 print KID_TO_WRITE @data;
608 close(KID_TO_WRITE) || warn "kid exited $?";
609
610 } else { # child
611 ($EUID, $EGID) = ($UID, $GID);
612 exec($program, @options, @args)
613 || die "can't exec program: $!";
614 # NOTREACHED
615 }
616
617 It is very easy to dead-lock a process using this form of open(), or
618 indeed with any use of pipe() with multiple subprocesses. The example
619 above is "safe" because it is simple and calls exec(). See "Avoiding
620 Pipe Deadlocks" for general safety principles, but there are extra
621 gotchas with Safe Pipe Opens.
622
623 In particular, if you opened the pipe using "open FH, "|-"", then you
624 cannot simply use close() in the parent process to close an unwanted
625 writer. Consider this code:
626
627 my $pid = open(WRITER, "|-"); # fork open a kid
628 defined($pid) || die "first fork failed: $!";
629 if ($pid) {
630 if (my $sub_pid = fork()) {
631 defined($sub_pid) || die "second fork failed: $!";
632 close(WRITER) || die "couldn't close WRITER: $!";
633 # now do something else...
634 }
635 else {
636 # first write to WRITER
637 # ...
638 # then when finished
639 close(WRITER) || die "couldn't close WRITER: $!";
640 exit(0);
641 }
642 }
643 else {
644 # first do something with STDIN, then
645 exit(0);
646 }
647
648 In the example above, the true parent does not want to write to the
649 WRITER filehandle, so it closes it. However, because WRITER was opened
650 using "open FH, "|-"", it has a special behavior: closing it calls
651 waitpid() (see "waitpid" in perlfunc), which waits for the subprocess
652 to exit. If the child process ends up waiting for something happening
653 in the section marked "do something else", you have deadlock.
654
655 This can also be a problem with intermediate subprocesses in more
656 complicated code, which will call waitpid() on all open filehandles
657 during global destruction--in no predictable order.
658
659 To solve this, you must manually use pipe(), fork(), and the form of
660 open() which sets one file descriptor to another, as shown below:
661
662 pipe(READER, WRITER) || die "pipe failed: $!";
663 $pid = fork();
664 defined($pid) || die "first fork failed: $!";
665 if ($pid) {
666 close READER;
667 if (my $sub_pid = fork()) {
668 defined($sub_pid) || die "first fork failed: $!";
669 close(WRITER) || die "can't close WRITER: $!";
670 }
671 else {
672 # write to WRITER...
673 # ...
674 # then when finished
675 close(WRITER) || die "can't close WRITER: $!";
676 exit(0);
677 }
678 # write to WRITER...
679 }
680 else {
681 open(STDIN, "<&READER") || die "can't reopen STDIN: $!";
682 close(WRITER) || die "can't close WRITER: $!";
683 # do something...
684 exit(0);
685 }
686
687 Since Perl 5.8.0, you can also use the list form of "open" for pipes.
688 This is preferred when you wish to avoid having the shell interpret
689 metacharacters that may be in your command string.
690
691 So for example, instead of using:
692
693 open(PS_PIPE, "ps aux|") || die "can't open ps pipe: $!";
694
695 One would use either of these:
696
697 open(PS_PIPE, "-|", "ps", "aux")
698 || die "can't open ps pipe: $!";
699
700 @ps_args = qw[ ps aux ];
701 open(PS_PIPE, "-|", @ps_args)
702 || die "can't open @ps_args|: $!";
703
704 Because there are more than three arguments to open(), forks the ps(1)
705 command without spawning a shell, and reads its standard output via the
706 "PS_PIPE" filehandle. The corresponding syntax to write to command
707 pipes is to use "|-" in place of "-|".
708
709 This was admittedly a rather silly example, because you're using string
710 literals whose content is perfectly safe. There is therefore no cause
711 to resort to the harder-to-read, multi-argument form of pipe open().
712 However, whenever you cannot be assured that the program arguments are
713 free of shell metacharacters, the fancier form of open() should be
714 used. For example:
715
716 @grep_args = ("egrep", "-i", $some_pattern, @many_files);
717 open(GREP_PIPE, "-|", @grep_args)
718 || die "can't open @grep_args|: $!";
719
720 Here the multi-argument form of pipe open() is preferred because the
721 pattern and indeed even the filenames themselves might hold
722 metacharacters.
723
724 Be aware that these operations are full Unix forks, which means they
725 may not be correctly implemented on all alien systems.
726
727 Avoiding Pipe Deadlocks
728 Whenever you have more than one subprocess, you must be careful that
729 each closes whichever half of any pipes created for interprocess
730 communication it is not using. This is because any child process
731 reading from the pipe and expecting an EOF will never receive it, and
732 therefore never exit. A single process closing a pipe is not enough to
733 close it; the last process with the pipe open must close it for it to
734 read EOF.
735
736 Certain built-in Unix features help prevent this most of the time. For
737 instance, filehandles have a "close on exec" flag, which is set en
738 masse under control of the $^F variable. This is so any filehandles
739 you didn't explicitly route to the STDIN, STDOUT or STDERR of a child
740 program will be automatically closed.
741
742 Always explicitly and immediately call close() on the writable end of
743 any pipe, unless that process is actually writing to it. Even if you
744 don't explicitly call close(), Perl will still close() all filehandles
745 during global destruction. As previously discussed, if those
746 filehandles have been opened with Safe Pipe Open, this will result in
747 calling waitpid(), which may again deadlock.
748
749 Bidirectional Communication with Another Process
750 While this works reasonably well for unidirectional communication, what
751 about bidirectional communication? The most obvious approach doesn't
752 work:
753
754 # THIS DOES NOT WORK!!
755 open(PROG_FOR_READING_AND_WRITING, "| some program |")
756
757 If you forget to "use warnings", you'll miss out entirely on the
758 helpful diagnostic message:
759
760 Can't do bidirectional pipe at -e line 1.
761
762 If you really want to, you can use the standard open2() from the
763 "IPC::Open2" module to catch both ends. There's also an open3() in
764 "IPC::Open3" for tridirectional I/O so you can also catch your child's
765 STDERR, but doing so would then require an awkward select() loop and
766 wouldn't allow you to use normal Perl input operations.
767
768 If you look at its source, you'll see that open2() uses low-level
769 primitives like the pipe() and exec() syscalls to create all the
770 connections. Although it might have been more efficient by using
771 socketpair(), this would have been even less portable than it already
772 is. The open2() and open3() functions are unlikely to work anywhere
773 except on a Unix system, or at least one purporting POSIX compliance.
774
775 Here's an example of using open2():
776
777 use FileHandle;
778 use IPC::Open2;
779 $pid = open2(*Reader, *Writer, "cat -un");
780 print Writer "stuff\n";
781 $got = <Reader>;
782
783 The problem with this is that buffering is really going to ruin your
784 day. Even though your "Writer" filehandle is auto-flushed so the
785 process on the other end gets your data in a timely manner, you can't
786 usually do anything to force that process to give its data to you in a
787 similarly quick fashion. In this special case, we could actually so,
788 because we gave cat a -u flag to make it unbuffered. But very few
789 commands are designed to operate over pipes, so this seldom works
790 unless you yourself wrote the program on the other end of the double-
791 ended pipe.
792
793 A solution to this is to use a library which uses pseudottys to make
794 your program behave more reasonably. This way you don't have to have
795 control over the source code of the program you're using. The "Expect"
796 module from CPAN also addresses this kind of thing. This module
797 requires two other modules from CPAN, "IO::Pty" and "IO::Stty". It
798 sets up a pseudo terminal to interact with programs that insist on
799 talking to the terminal device driver. If your system is supported,
800 this may be your best bet.
801
802 Bidirectional Communication with Yourself
803 If you want, you may make low-level pipe() and fork() syscalls to
804 stitch this together by hand. This example only talks to itself, but
805 you could reopen the appropriate handles to STDIN and STDOUT and call
806 other processes. (The following example lacks proper error checking.)
807
808 #!/usr/bin/perl -w
809 # pipe1 - bidirectional communication using two pipe pairs
810 # designed for the socketpair-challenged
811 use IO::Handle; # thousands of lines just for autoflush :-(
812 pipe(PARENT_RDR, CHILD_WTR); # XXX: check failure?
813 pipe(CHILD_RDR, PARENT_WTR); # XXX: check failure?
814 CHILD_WTR->autoflush(1);
815 PARENT_WTR->autoflush(1);
816
817 if ($pid = fork()) {
818 close PARENT_RDR;
819 close PARENT_WTR;
820 print CHILD_WTR "Parent Pid $$ is sending this\n";
821 chomp($line = <CHILD_RDR>);
822 print "Parent Pid $$ just read this: '$line'\n";
823 close CHILD_RDR; close CHILD_WTR;
824 waitpid($pid, 0);
825 } else {
826 die "cannot fork: $!" unless defined $pid;
827 close CHILD_RDR;
828 close CHILD_WTR;
829 chomp($line = <PARENT_RDR>);
830 print "Child Pid $$ just read this: '$line'\n";
831 print PARENT_WTR "Child Pid $$ is sending this\n";
832 close PARENT_RDR;
833 close PARENT_WTR;
834 exit(0);
835 }
836
837 But you don't actually have to make two pipe calls. If you have the
838 socketpair() system call, it will do this all for you.
839
840 #!/usr/bin/perl -w
841 # pipe2 - bidirectional communication using socketpair
842 # "the best ones always go both ways"
843
844 use Socket;
845 use IO::Handle; # thousands of lines just for autoflush :-(
846
847 # We say AF_UNIX because although *_LOCAL is the
848 # POSIX 1003.1g form of the constant, many machines
849 # still don't have it.
850 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
851 || die "socketpair: $!";
852
853 CHILD->autoflush(1);
854 PARENT->autoflush(1);
855
856 if ($pid = fork()) {
857 close PARENT;
858 print CHILD "Parent Pid $$ is sending this\n";
859 chomp($line = <CHILD>);
860 print "Parent Pid $$ just read this: '$line'\n";
861 close CHILD;
862 waitpid($pid, 0);
863 } else {
864 die "cannot fork: $!" unless defined $pid;
865 close CHILD;
866 chomp($line = <PARENT>);
867 print "Child Pid $$ just read this: '$line'\n";
868 print PARENT "Child Pid $$ is sending this\n";
869 close PARENT;
870 exit(0);
871 }
872
874 While not entirely limited to Unix-derived operating systems (e.g.,
875 WinSock on PCs provides socket support, as do some VMS libraries), you
876 might not have sockets on your system, in which case this section
877 probably isn't going to do you much good. With sockets, you can do
878 both virtual circuits like TCP streams and datagrams like UDP packets.
879 You may be able to do even more depending on your system.
880
881 The Perl functions for dealing with sockets have the same names as the
882 corresponding system calls in C, but their arguments tend to differ for
883 two reasons. First, Perl filehandles work differently than C file
884 descriptors. Second, Perl already knows the length of its strings, so
885 you don't need to pass that information.
886
887 One of the major problems with ancient, antemillennial socket code in
888 Perl was that it used hard-coded values for some of the constants,
889 which severely hurt portability. If you ever see code that does
890 anything like explicitly setting "$AF_INET = 2", you know you're in for
891 big trouble. An immeasurably superior approach is to use the "Socket"
892 module, which more reliably grants access to the various constants and
893 functions you'll need.
894
895 If you're not writing a server/client for an existing protocol like
896 NNTP or SMTP, you should give some thought to how your server will know
897 when the client has finished talking, and vice-versa. Most protocols
898 are based on one-line messages and responses (so one party knows the
899 other has finished when a "\n" is received) or multi-line messages and
900 responses that end with a period on an empty line ("\n.\n" terminates a
901 message/response).
902
903 Internet Line Terminators
904 The Internet line terminator is "\015\012". Under ASCII variants of
905 Unix, that could usually be written as "\r\n", but under other systems,
906 "\r\n" might at times be "\015\015\012", "\012\012\015", or something
907 completely different. The standards specify writing "\015\012" to be
908 conformant (be strict in what you provide), but they also recommend
909 accepting a lone "\012" on input (be lenient in what you require). We
910 haven't always been very good about that in the code in this manpage,
911 but unless you're on a Mac from way back in its pre-Unix dark ages,
912 you'll probably be ok.
913
914 Internet TCP Clients and Servers
915 Use Internet-domain sockets when you want to do client-server
916 communication that might extend to machines outside of your own system.
917
918 Here's a sample TCP client using Internet-domain sockets:
919
920 #!/usr/bin/perl -w
921 use strict;
922 use Socket;
923 my ($remote, $port, $iaddr, $paddr, $proto, $line);
924
925 $remote = shift || "localhost";
926 $port = shift || 2345; # random port
927 if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
928 die "No port" unless $port;
929 $iaddr = inet_aton($remote) || die "no host: $remote";
930 $paddr = sockaddr_in($port, $iaddr);
931
932 $proto = getprotobyname("tcp");
933 socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
934 connect(SOCK, $paddr) || die "connect: $!";
935 while ($line = <SOCK>) {
936 print $line;
937 }
938
939 close (SOCK) || die "close: $!";
940 exit(0);
941
942 And here's a corresponding server to go along with it. We'll leave the
943 address as "INADDR_ANY" so that the kernel can choose the appropriate
944 interface on multihomed hosts. If you want sit on a particular
945 interface (like the external side of a gateway or firewall machine),
946 fill this in with your real address instead.
947
948 #!/usr/bin/perl -Tw
949 use strict;
950 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
951 use Socket;
952 use Carp;
953 my $EOL = "\015\012";
954
955 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
956
957 my $port = shift || 2345;
958 die "invalid port" unless $port =~ /^ \d+ $/x;
959
960 my $proto = getprotobyname("tcp");
961
962 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
963 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
964 || die "setsockopt: $!";
965 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
966 listen(Server, SOMAXCONN) || die "listen: $!";
967
968 logmsg "server started on port $port";
969
970 my $paddr;
971
972 for ( ; $paddr = accept(Client, Server); close Client) {
973 my($port, $iaddr) = sockaddr_in($paddr);
974 my $name = gethostbyaddr($iaddr, AF_INET);
975
976 logmsg "connection from $name [",
977 inet_ntoa($iaddr), "]
978 at port $port";
979
980 print Client "Hello there, $name, it's now ",
981 scalar localtime(), $EOL;
982 }
983
984 And here's a multitasking version. It's multitasked in that like most
985 typical servers, it spawns (fork()s) a slave server to handle the
986 client request so that the master server can quickly go back to service
987 a new client.
988
989 #!/usr/bin/perl -Tw
990 use strict;
991 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
992 use Socket;
993 use Carp;
994 my $EOL = "\015\012";
995
996 sub spawn; # forward declaration
997 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
998
999 my $port = shift || 2345;
1000 die "invalid port" unless $port =~ /^ \d+ $/x;
1001
1002 my $proto = getprotobyname("tcp");
1003
1004 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
1005 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
1006 || die "setsockopt: $!";
1007 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
1008 listen(Server, SOMAXCONN) || die "listen: $!";
1009
1010 logmsg "server started on port $port";
1011
1012 my $waitedpid = 0;
1013 my $paddr;
1014
1015 use POSIX ":sys_wait_h";
1016 use Errno;
1017
1018 sub REAPER {
1019 local $!; # don't let waitpid() overwrite current error
1020 while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
1021 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1022 }
1023 $SIG{CHLD} = \&REAPER; # loathe SysV
1024 }
1025
1026 $SIG{CHLD} = \&REAPER;
1027
1028 while (1) {
1029 $paddr = accept(Client, Server) || do {
1030 # try again if accept() returned because got a signal
1031 next if $!{EINTR};
1032 die "accept: $!";
1033 };
1034 my ($port, $iaddr) = sockaddr_in($paddr);
1035 my $name = gethostbyaddr($iaddr, AF_INET);
1036
1037 logmsg "connection from $name [",
1038 inet_ntoa($iaddr),
1039 "] at port $port";
1040
1041 spawn sub {
1042 $| = 1;
1043 print "Hello there, $name, it's now ",
1044 scalar localtime(),
1045 $EOL;
1046 exec "/usr/games/fortune" # XXX: "wrong" line terminators
1047 or confess "can't exec fortune: $!";
1048 };
1049 close Client;
1050 }
1051
1052 sub spawn {
1053 my $coderef = shift;
1054
1055 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1056 confess "usage: spawn CODEREF";
1057 }
1058
1059 my $pid;
1060 unless (defined($pid = fork())) {
1061 logmsg "cannot fork: $!";
1062 return;
1063 }
1064 elsif ($pid) {
1065 logmsg "begat $pid";
1066 return; # I'm the parent
1067 }
1068 # else I'm the child -- go spawn
1069
1070 open(STDIN, "<&Client") || die "can't dup client to stdin";
1071 open(STDOUT, ">&Client") || die "can't dup client to stdout";
1072 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1073 exit($coderef->());
1074 }
1075
1076 This server takes the trouble to clone off a child version via fork()
1077 for each incoming request. That way it can handle many requests at
1078 once, which you might not always want. Even if you don't fork(), the
1079 listen() will allow that many pending connections. Forking servers
1080 have to be particularly careful about cleaning up their dead children
1081 (called "zombies" in Unix parlance), because otherwise you'll quickly
1082 fill up your process table. The REAPER subroutine is used here to call
1083 waitpid() for any child processes that have finished, thereby ensuring
1084 that they terminate cleanly and don't join the ranks of the living
1085 dead.
1086
1087 Within the while loop we call accept() and check to see if it returns a
1088 false value. This would normally indicate a system error needs to be
1089 reported. However, the introduction of safe signals (see "Deferred
1090 Signals (Safe Signals)" above) in Perl 5.8.0 means that accept() might
1091 also be interrupted when the process receives a signal. This typically
1092 happens when one of the forked subprocesses exits and notifies the
1093 parent process with a CHLD signal.
1094
1095 If accept() is interrupted by a signal, $! will be set to EINTR. If
1096 this happens, we can safely continue to the next iteration of the loop
1097 and another call to accept(). It is important that your signal
1098 handling code not modify the value of $!, or else this test will likely
1099 fail. In the REAPER subroutine we create a local version of $! before
1100 calling waitpid(). When waitpid() sets $! to ECHILD as it inevitably
1101 does when it has no more children waiting, it updates the local copy
1102 and leaves the original unchanged.
1103
1104 You should use the -T flag to enable taint checking (see perlsec) even
1105 if we aren't running setuid or setgid. This is always a good idea for
1106 servers or any program run on behalf of someone else (like CGI
1107 scripts), because it lessens the chances that people from the outside
1108 will be able to compromise your system.
1109
1110 Let's look at another TCP client. This one connects to the TCP "time"
1111 service on a number of different machines and shows how far their
1112 clocks differ from the system on which it's being run:
1113
1114 #!/usr/bin/perl -w
1115 use strict;
1116 use Socket;
1117
1118 my $SECS_OF_70_YEARS = 2208988800;
1119 sub ctime { scalar localtime(shift() || time()) }
1120
1121 my $iaddr = gethostbyname("localhost");
1122 my $proto = getprotobyname("tcp");
1123 my $port = getservbyname("time", "tcp");
1124 my $paddr = sockaddr_in(0, $iaddr);
1125 my($host);
1126
1127 $| = 1;
1128 printf "%-24s %8s %s\n", "localhost", 0, ctime();
1129
1130 foreach $host (@ARGV) {
1131 printf "%-24s ", $host;
1132 my $hisiaddr = inet_aton($host) || die "unknown host";
1133 my $hispaddr = sockaddr_in($port, $hisiaddr);
1134 socket(SOCKET, PF_INET, SOCK_STREAM, $proto)
1135 || die "socket: $!";
1136 connect(SOCKET, $hispaddr) || die "connect: $!";
1137 my $rtime = pack("C4", ());
1138 read(SOCKET, $rtime, 4);
1139 close(SOCKET);
1140 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1141 printf "%8d %s\n", $histime - time(), ctime($histime);
1142 }
1143
1144 Unix-Domain TCP Clients and Servers
1145 That's fine for Internet-domain clients and servers, but what about
1146 local communications? While you can use the same setup, sometimes you
1147 don't want to. Unix-domain sockets are local to the current host, and
1148 are often used internally to implement pipes. Unlike Internet domain
1149 sockets, Unix domain sockets can show up in the file system with an
1150 ls(1) listing.
1151
1152 % ls -l /dev/log
1153 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
1154
1155 You can test for these with Perl's -S file test:
1156
1157 unless (-S "/dev/log") {
1158 die "something's wicked with the log system";
1159 }
1160
1161 Here's a sample Unix-domain client:
1162
1163 #!/usr/bin/perl -w
1164 use Socket;
1165 use strict;
1166 my ($rendezvous, $line);
1167
1168 $rendezvous = shift || "catsock";
1169 socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1170 connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
1171 while (defined($line = <SOCK>)) {
1172 print $line;
1173 }
1174 exit(0);
1175
1176 And here's a corresponding server. You don't have to worry about silly
1177 network terminators here because Unix domain sockets are guaranteed to
1178 be on the localhost, and thus everything works right.
1179
1180 #!/usr/bin/perl -Tw
1181 use strict;
1182 use Socket;
1183 use Carp;
1184
1185 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1186 sub spawn; # forward declaration
1187 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1188
1189 my $NAME = "catsock";
1190 my $uaddr = sockaddr_un($NAME);
1191 my $proto = getprotobyname("tcp");
1192
1193 socket(Server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1194 unlink($NAME);
1195 bind (Server, $uaddr) || die "bind: $!";
1196 listen(Server, SOMAXCONN) || die "listen: $!";
1197
1198 logmsg "server started on $NAME";
1199
1200 my $waitedpid;
1201
1202 use POSIX ":sys_wait_h";
1203 sub REAPER {
1204 my $child;
1205 while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
1206 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1207 }
1208 $SIG{CHLD} = \&REAPER; # loathe SysV
1209 }
1210
1211 $SIG{CHLD} = \&REAPER;
1212
1213
1214 for ( $waitedpid = 0;
1215 accept(Client, Server) || $waitedpid;
1216 $waitedpid = 0, close Client)
1217 {
1218 next if $waitedpid;
1219 logmsg "connection on $NAME";
1220 spawn sub {
1221 print "Hello there, it's now ", scalar localtime(), "\n";
1222 exec("/usr/games/fortune") || die "can't exec fortune: $!";
1223 };
1224 }
1225
1226 sub spawn {
1227 my $coderef = shift();
1228
1229 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1230 confess "usage: spawn CODEREF";
1231 }
1232
1233 my $pid;
1234 unless (defined($pid = fork())) {
1235 logmsg "cannot fork: $!";
1236 return;
1237 }
1238 elsif ($pid) {
1239 logmsg "begat $pid";
1240 return; # I'm the parent
1241 }
1242 else {
1243 # I'm the child -- go spawn
1244 }
1245
1246 open(STDIN, "<&Client") || die "can't dup client to stdin";
1247 open(STDOUT, ">&Client") || die "can't dup client to stdout";
1248 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1249 exit($coderef->());
1250 }
1251
1252 As you see, it's remarkably similar to the Internet domain TCP server,
1253 so much so, in fact, that we've omitted several duplicate
1254 functions--spawn(), logmsg(), ctime(), and REAPER()--which are the same
1255 as in the other server.
1256
1257 So why would you ever want to use a Unix domain socket instead of a
1258 simpler named pipe? Because a named pipe doesn't give you sessions.
1259 You can't tell one process's data from another's. With socket
1260 programming, you get a separate session for each client; that's why
1261 accept() takes two arguments.
1262
1263 For example, let's say that you have a long-running database server
1264 daemon that you want folks to be able to access from the Web, but only
1265 if they go through a CGI interface. You'd have a small, simple CGI
1266 program that does whatever checks and logging you feel like, and then
1267 acts as a Unix-domain client and connects to your private server.
1268
1270 For those preferring a higher-level interface to socket programming,
1271 the IO::Socket module provides an object-oriented approach. If for
1272 some reason you lack this module, you can just fetch IO::Socket from
1273 CPAN, where you'll also find modules providing easy interfaces to the
1274 following systems: DNS, FTP, Ident (RFC 931), NIS and NISPlus, NNTP,
1275 Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--to name just a few.
1276
1277 A Simple Client
1278 Here's a client that creates a TCP connection to the "daytime" service
1279 at port 13 of the host name "localhost" and prints out everything that
1280 the server there cares to provide.
1281
1282 #!/usr/bin/perl -w
1283 use IO::Socket;
1284 $remote = IO::Socket::INET->new(
1285 Proto => "tcp",
1286 PeerAddr => "localhost",
1287 PeerPort => "daytime(13)",
1288 )
1289 || die "can't connect to daytime service on localhost";
1290 while (<$remote>) { print }
1291
1292 When you run this program, you should get something back that looks
1293 like this:
1294
1295 Wed May 14 08:40:46 MDT 1997
1296
1297 Here are what those parameters to the new() constructor mean:
1298
1299 "Proto"
1300 This is which protocol to use. In this case, the socket handle
1301 returned will be connected to a TCP socket, because we want a
1302 stream-oriented connection, that is, one that acts pretty much like
1303 a plain old file. Not all sockets are this of this type. For
1304 example, the UDP protocol can be used to make a datagram socket,
1305 used for message-passing.
1306
1307 "PeerAddr"
1308 This is the name or Internet address of the remote host the server
1309 is running on. We could have specified a longer name like
1310 "www.perl.com", or an address like "207.171.7.72". For
1311 demonstration purposes, we've used the special hostname
1312 "localhost", which should always mean the current machine you're
1313 running on. The corresponding Internet address for localhost is
1314 "127.0.0.1", if you'd rather use that.
1315
1316 "PeerPort"
1317 This is the service name or port number we'd like to connect to.
1318 We could have gotten away with using just "daytime" on systems with
1319 a well-configured system services file,[FOOTNOTE: The system
1320 services file is found in /etc/services under Unixy systems.] but
1321 here we've specified the port number (13) in parentheses. Using
1322 just the number would have also worked, but numeric literals make
1323 careful programmers nervous.
1324
1325 Notice how the return value from the "new" constructor is used as a
1326 filehandle in the "while" loop? That's what's called an indirect
1327 filehandle, a scalar variable containing a filehandle. You can use it
1328 the same way you would a normal filehandle. For example, you can read
1329 one line from it this way:
1330
1331 $line = <$handle>;
1332
1333 all remaining lines from is this way:
1334
1335 @lines = <$handle>;
1336
1337 and send a line of data to it this way:
1338
1339 print $handle "some data\n";
1340
1341 A Webget Client
1342 Here's a simple client that takes a remote host to fetch a document
1343 from, and then a list of files to get from that host. This is a more
1344 interesting client than the previous one because it first sends
1345 something to the server before fetching the server's response.
1346
1347 #!/usr/bin/perl -w
1348 use IO::Socket;
1349 unless (@ARGV > 1) { die "usage: $0 host url ..." }
1350 $host = shift(@ARGV);
1351 $EOL = "\015\012";
1352 $BLANK = $EOL x 2;
1353 for my $document (@ARGV) {
1354 $remote = IO::Socket::INET->new( Proto => "tcp",
1355 PeerAddr => $host,
1356 PeerPort => "http(80)",
1357 ) || die "cannot connect to httpd on $host";
1358 $remote->autoflush(1);
1359 print $remote "GET $document HTTP/1.0" . $BLANK;
1360 while ( <$remote> ) { print }
1361 close $remote;
1362 }
1363
1364 The web server handling the HTTP service is assumed to be at its
1365 standard port, number 80. If the server you're trying to connect to is
1366 at a different port, like 1080 or 8080, you should specify it as the
1367 named-parameter pair, "PeerPort => 8080". The "autoflush" method is
1368 used on the socket because otherwise the system would buffer up the
1369 output we sent it. (If you're on a prehistoric Mac, you'll also need
1370 to change every "\n" in your code that sends data over the network to
1371 be a "\015\012" instead.)
1372
1373 Connecting to the server is only the first part of the process: once
1374 you have the connection, you have to use the server's language. Each
1375 server on the network has its own little command language that it
1376 expects as input. The string that we send to the server starting with
1377 "GET" is in HTTP syntax. In this case, we simply request each
1378 specified document. Yes, we really are making a new connection for
1379 each document, even though it's the same host. That's the way you
1380 always used to have to speak HTTP. Recent versions of web browsers may
1381 request that the remote server leave the connection open a little
1382 while, but the server doesn't have to honor such a request.
1383
1384 Here's an example of running that program, which we'll call webget:
1385
1386 % webget www.perl.com /guanaco.html
1387 HTTP/1.1 404 File Not Found
1388 Date: Thu, 08 May 1997 18:02:32 GMT
1389 Server: Apache/1.2b6
1390 Connection: close
1391 Content-type: text/html
1392
1393 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1394 <BODY><H1>File Not Found</H1>
1395 The requested URL /guanaco.html was not found on this server.<P>
1396 </BODY>
1397
1398 Ok, so that's not very interesting, because it didn't find that
1399 particular document. But a long response wouldn't have fit on this
1400 page.
1401
1402 For a more featureful version of this program, you should look to the
1403 lwp-request program included with the LWP modules from CPAN.
1404
1405 Interactive Client with IO::Socket
1406 Well, that's all fine if you want to send one command and get one
1407 answer, but what about setting up something fully interactive, somewhat
1408 like the way telnet works? That way you can type a line, get the
1409 answer, type a line, get the answer, etc.
1410
1411 This client is more complicated than the two we've done so far, but if
1412 you're on a system that supports the powerful "fork" call, the solution
1413 isn't that rough. Once you've made the connection to whatever service
1414 you'd like to chat with, call "fork" to clone your process. Each of
1415 these two identical process has a very simple job to do: the parent
1416 copies everything from the socket to standard output, while the child
1417 simultaneously copies everything from standard input to the socket. To
1418 accomplish the same thing using just one process would be much harder,
1419 because it's easier to code two processes to do one thing than it is to
1420 code one process to do two things. (This keep-it-simple principle a
1421 cornerstones of the Unix philosophy, and good software engineering as
1422 well, which is probably why it's spread to other systems.)
1423
1424 Here's the code:
1425
1426 #!/usr/bin/perl -w
1427 use strict;
1428 use IO::Socket;
1429 my ($host, $port, $kidpid, $handle, $line);
1430
1431 unless (@ARGV == 2) { die "usage: $0 host port" }
1432 ($host, $port) = @ARGV;
1433
1434 # create a tcp connection to the specified host and port
1435 $handle = IO::Socket::INET->new(Proto => "tcp",
1436 PeerAddr => $host,
1437 PeerPort => $port)
1438 || die "can't connect to port $port on $host: $!";
1439
1440 $handle->autoflush(1); # so output gets there right away
1441 print STDERR "[Connected to $host:$port]\n";
1442
1443 # split the program into two processes, identical twins
1444 die "can't fork: $!" unless defined($kidpid = fork());
1445
1446 # the if{} block runs only in the parent process
1447 if ($kidpid) {
1448 # copy the socket to standard output
1449 while (defined ($line = <$handle>)) {
1450 print STDOUT $line;
1451 }
1452 kill("TERM", $kidpid); # send SIGTERM to child
1453 }
1454 # the else{} block runs only in the child process
1455 else {
1456 # copy standard input to the socket
1457 while (defined ($line = <STDIN>)) {
1458 print $handle $line;
1459 }
1460 exit(0); # just in case
1461 }
1462
1463 The "kill" function in the parent's "if" block is there to send a
1464 signal to our child process, currently running in the "else" block, as
1465 soon as the remote server has closed its end of the connection.
1466
1467 If the remote server sends data a byte at time, and you need that data
1468 immediately without waiting for a newline (which might not happen), you
1469 may wish to replace the "while" loop in the parent with the following:
1470
1471 my $byte;
1472 while (sysread($handle, $byte, 1) == 1) {
1473 print STDOUT $byte;
1474 }
1475
1476 Making a system call for each byte you want to read is not very
1477 efficient (to put it mildly) but is the simplest to explain and works
1478 reasonably well.
1479
1481 As always, setting up a server is little bit more involved than running
1482 a client. The model is that the server creates a special kind of
1483 socket that does nothing but listen on a particular port for incoming
1484 connections. It does this by calling the "IO::Socket::INET->new()"
1485 method with slightly different arguments than the client did.
1486
1487 Proto
1488 This is which protocol to use. Like our clients, we'll still
1489 specify "tcp" here.
1490
1491 LocalPort
1492 We specify a local port in the "LocalPort" argument, which we
1493 didn't do for the client. This is service name or port number for
1494 which you want to be the server. (Under Unix, ports under 1024 are
1495 restricted to the superuser.) In our sample, we'll use port 9000,
1496 but you can use any port that's not currently in use on your
1497 system. If you try to use one already in used, you'll get an
1498 "Address already in use" message. Under Unix, the "netstat -a"
1499 command will show which services current have servers.
1500
1501 Listen
1502 The "Listen" parameter is set to the maximum number of pending
1503 connections we can accept until we turn away incoming clients.
1504 Think of it as a call-waiting queue for your telephone. The low-
1505 level Socket module has a special symbol for the system maximum,
1506 which is SOMAXCONN.
1507
1508 Reuse
1509 The "Reuse" parameter is needed so that we restart our server
1510 manually without waiting a few minutes to allow system buffers to
1511 clear out.
1512
1513 Once the generic server socket has been created using the parameters
1514 listed above, the server then waits for a new client to connect to it.
1515 The server blocks in the "accept" method, which eventually accepts a
1516 bidirectional connection from the remote client. (Make sure to
1517 autoflush this handle to circumvent buffering.)
1518
1519 To add to user-friendliness, our server prompts the user for commands.
1520 Most servers don't do this. Because of the prompt without a newline,
1521 you'll have to use the "sysread" variant of the interactive client
1522 above.
1523
1524 This server accepts one of five different commands, sending output back
1525 to the client. Unlike most network servers, this one handles only one
1526 incoming client at a time. Multitasking servers are covered in Chapter
1527 16 of the Camel.
1528
1529 Here's the code. We'll
1530
1531 #!/usr/bin/perl -w
1532 use IO::Socket;
1533 use Net::hostent; # for OOish version of gethostbyaddr
1534
1535 $PORT = 9000; # pick something not in use
1536
1537 $server = IO::Socket::INET->new( Proto => "tcp",
1538 LocalPort => $PORT,
1539 Listen => SOMAXCONN,
1540 Reuse => 1);
1541
1542 die "can't setup server" unless $server;
1543 print "[Server $0 accepting clients]\n";
1544
1545 while ($client = $server->accept()) {
1546 $client->autoflush(1);
1547 print $client "Welcome to $0; type help for command list.\n";
1548 $hostinfo = gethostbyaddr($client->peeraddr);
1549 printf "[Connect from %s]\n",
1550 $hostinfo ? $hostinfo->name : $client->peerhost;
1551 print $client "Command? ";
1552 while ( <$client>) {
1553 next unless /\S/; # blank line
1554 if (/quit|exit/i) { last }
1555 elsif (/date|time/i) { printf $client "%s\n", scalar localtime() }
1556 elsif (/who/i ) { print $client `who 2>&1` }
1557 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` }
1558 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` }
1559 else {
1560 print $client "Commands: quit date who cookie motd\n";
1561 }
1562 } continue {
1563 print $client "Command? ";
1564 }
1565 close $client;
1566 }
1567
1569 Another kind of client-server setup is one that uses not connections,
1570 but messages. UDP communications involve much lower overhead but also
1571 provide less reliability, as there are no promises that messages will
1572 arrive at all, let alone in order and unmangled. Still, UDP offers
1573 some advantages over TCP, including being able to "broadcast" or
1574 "multicast" to a whole bunch of destination hosts at once (usually on
1575 your local subnet). If you find yourself overly concerned about
1576 reliability and start building checks into your message system, then
1577 you probably should use just TCP to start with.
1578
1579 UDP datagrams are not a bytestream and should not be treated as such.
1580 This makes using I/O mechanisms with internal buffering like stdio
1581 (i.e. print() and friends) especially cumbersome. Use syswrite(), or
1582 better send(), like in the example below.
1583
1584 Here's a UDP program similar to the sample Internet TCP client given
1585 earlier. However, instead of checking one host at a time, the UDP
1586 version will check many of them asynchronously by simulating a
1587 multicast and then using select() to do a timed-out wait for I/O. To
1588 do something similar with TCP, you'd have to use a different socket
1589 handle for each host.
1590
1591 #!/usr/bin/perl -w
1592 use strict;
1593 use Socket;
1594 use Sys::Hostname;
1595
1596 my ( $count, $hisiaddr, $hispaddr, $histime,
1597 $host, $iaddr, $paddr, $port, $proto,
1598 $rin, $rout, $rtime, $SECS_OF_70_YEARS);
1599
1600 $SECS_OF_70_YEARS = 2_208_988_800;
1601
1602 $iaddr = gethostbyname(hostname());
1603 $proto = getprotobyname("udp");
1604 $port = getservbyname("time", "udp");
1605 $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1606
1607 socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
1608 bind(SOCKET, $paddr) || die "bind: $!";
1609
1610 $| = 1;
1611 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime();
1612 $count = 0;
1613 for $host (@ARGV) {
1614 $count++;
1615 $hisiaddr = inet_aton($host) || die "unknown host";
1616 $hispaddr = sockaddr_in($port, $hisiaddr);
1617 defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
1618 }
1619
1620 $rin = "";
1621 vec($rin, fileno(SOCKET), 1) = 1;
1622
1623 # timeout after 10.0 seconds
1624 while ($count && select($rout = $rin, undef, undef, 10.0)) {
1625 $rtime = "";
1626 $hispaddr = recv(SOCKET, $rtime, 4, 0) || die "recv: $!";
1627 ($port, $hisiaddr) = sockaddr_in($hispaddr);
1628 $host = gethostbyaddr($hisiaddr, AF_INET);
1629 $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1630 printf "%-12s ", $host;
1631 printf "%8d %s\n", $histime - time(), scalar localtime($histime);
1632 $count--;
1633 }
1634
1635 This example does not include any retries and may consequently fail to
1636 contact a reachable host. The most prominent reason for this is
1637 congestion of the queues on the sending host if the number of hosts to
1638 contact is sufficiently large.
1639
1641 While System V IPC isn't so widely used as sockets, it still has some
1642 interesting uses. However, you cannot use SysV IPC or Berkeley mmap()
1643 to have a variable shared amongst several processes. That's because
1644 Perl would reallocate your string when you weren't wanting it to. You
1645 might look into the "IPC::Shareable" or "threads::shared" modules for
1646 that.
1647
1648 Here's a small example showing shared memory usage.
1649
1650 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);
1651
1652 $size = 2000;
1653 $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
1654 defined($id) || die "shmget: $!";
1655 print "shm key $id\n";
1656
1657 $message = "Message #1";
1658 shmwrite($id, $message, 0, 60) || die "shmwrite: $!";
1659 print "wrote: '$message'\n";
1660 shmread($id, $buff, 0, 60) || die "shmread: $!";
1661 print "read : '$buff'\n";
1662
1663 # the buffer of shmread is zero-character end-padded.
1664 substr($buff, index($buff, "\0")) = "";
1665 print "un" unless $buff eq $message;
1666 print "swell\n";
1667
1668 print "deleting shm $id\n";
1669 shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
1670
1671 Here's an example of a semaphore:
1672
1673 use IPC::SysV qw(IPC_CREAT);
1674
1675 $IPC_KEY = 1234;
1676 $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
1677 defined($id) || die "semget: $!";
1678 print "sem id $id\n";
1679
1680 Put this code in a separate file to be run in more than one process.
1681 Call the file take:
1682
1683 # create a semaphore
1684
1685 $IPC_KEY = 1234;
1686 $id = semget($IPC_KEY, 0, 0);
1687 defined($id) || die "semget: $!";
1688
1689 $semnum = 0;
1690 $semflag = 0;
1691
1692 # "take" semaphore
1693 # wait for semaphore to be zero
1694 $semop = 0;
1695 $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
1696
1697 # Increment the semaphore count
1698 $semop = 1;
1699 $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
1700 $opstring = $opstring1 . $opstring2;
1701
1702 semop($id, $opstring) || die "semop: $!";
1703
1704 Put this code in a separate file to be run in more than one process.
1705 Call this file give:
1706
1707 # "give" the semaphore
1708 # run this in the original process and you will see
1709 # that the second process continues
1710
1711 $IPC_KEY = 1234;
1712 $id = semget($IPC_KEY, 0, 0);
1713 die unless defined($id);
1714
1715 $semnum = 0;
1716 $semflag = 0;
1717
1718 # Decrement the semaphore count
1719 $semop = -1;
1720 $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
1721
1722 semop($id, $opstring) || die "semop: $!";
1723
1724 The SysV IPC code above was written long ago, and it's definitely
1725 clunky looking. For a more modern look, see the IPC::SysV module.
1726
1727 A small example demonstrating SysV message queues:
1728
1729 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);
1730
1731 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
1732 defined($id) || die "msgget failed: $!";
1733
1734 my $sent = "message";
1735 my $type_sent = 1234;
1736
1737 msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
1738 || die "msgsnd failed: $!";
1739
1740 msgrcv($id, my $rcvd_buf, 60, 0, 0)
1741 || die "msgrcv failed: $!";
1742
1743 my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);
1744
1745 if ($rcvd eq $sent) {
1746 print "okay\n";
1747 } else {
1748 print "not okay\n";
1749 }
1750
1751 msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!\n";
1752
1754 Most of these routines quietly but politely return "undef" when they
1755 fail instead of causing your program to die right then and there due to
1756 an uncaught exception. (Actually, some of the new Socket conversion
1757 functions do croak() on bad arguments.) It is therefore essential to
1758 check return values from these functions. Always begin your socket
1759 programs this way for optimal success, and don't forget to add the -T
1760 taint-checking flag to the "#!" line for servers:
1761
1762 #!/usr/bin/perl -Tw
1763 use strict;
1764 use sigtrap;
1765 use Socket;
1766
1768 These routines all create system-specific portability problems. As
1769 noted elsewhere, Perl is at the mercy of your C libraries for much of
1770 its system behavior. It's probably safest to assume broken SysV
1771 semantics for signals and to stick with simple TCP and UDP socket
1772 operations; e.g., don't try to pass open file descriptors over a local
1773 UDP datagram socket if you want your code to stand a chance of being
1774 portable.
1775
1777 Tom Christiansen, with occasional vestiges of Larry Wall's original
1778 version and suggestions from the Perl Porters.
1779
1781 There's a lot more to networking than this, but this should get you
1782 started.
1783
1784 For intrepid programmers, the indispensable textbook is Unix Network
1785 Programming, 2nd Edition, Volume 1 by W. Richard Stevens (published by
1786 Prentice-Hall). Most books on networking address the subject from the
1787 perspective of a C programmer; translation to Perl is left as an
1788 exercise for the reader.
1789
1790 The IO::Socket(3) manpage describes the object library, and the
1791 Socket(3) manpage describes the low-level interface to sockets.
1792 Besides the obvious functions in perlfunc, you should also check out
1793 the modules file at your nearest CPAN site, especially
1794 <http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. See
1795 perlmodlib or best yet, the Perl FAQ for a description of what CPAN is
1796 and where to get it if the previous link doesn't work for you.
1797
1798 Section 5 of CPAN's modules file is devoted to "Networking, Device
1799 Control (modems), and Interprocess Communication", and contains
1800 numerous unbundled modules numerous networking modules, Chat and Expect
1801 operations, CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC,
1802 SNMP, SMTP, Telnet, Threads, and ToolTalk--to name just a few.
1803
1804
1805
1806perl v5.28.2 2018-11-01 PERLIPC(1)