1PERLIPC(1) Perl Programmers Reference Guide PERLIPC(1)
2
3
4
6 perlipc - Perl interprocess communication (signals, fifos, pipes, safe
7 subprocesses, sockets, and semaphores)
8
10 The basic IPC facilities of Perl are built out of the good old Unix
11 signals, named pipes, pipe opens, the Berkeley socket routines, and
12 SysV IPC calls. Each is used in slightly different situations.
13
15 Perl uses a simple signal handling model: the %SIG hash contains names
16 or references of user-installed signal handlers. These handlers will
17 be called with an argument which is the name of the signal that
18 triggered it. A signal may be generated intentionally from a
19 particular keyboard sequence like control-C or control-Z, sent to you
20 from another process, or triggered automatically by the kernel when
21 special events transpire, like a child process exiting, your own
22 process running out of stack space, or hitting a process file-size
23 limit.
24
25 For example, to trap an interrupt signal, set up a handler like this:
26
27 our $shucks;
28
29 sub catch_zap {
30 my $signame = shift;
31 $shucks++;
32 die "Somebody sent me a SIG$signame";
33 }
34 $SIG{INT} = __PACKAGE__ . "::catch_zap";
35 $SIG{INT} = \&catch_zap; # best strategy
36
37 Prior to Perl 5.8.0 it was necessary to do as little as you possibly
38 could in your handler; notice how all we do is set a global variable
39 and then raise an exception. That's because on most systems, libraries
40 are not re-entrant; particularly, memory allocation and I/O routines
41 are not. That meant that doing nearly anything in your handler could
42 in theory trigger a memory fault and subsequent core dump - see
43 "Deferred Signals (Safe Signals)" below.
44
45 The names of the signals are the ones listed out by "kill -l" on your
46 system, or you can retrieve them using the CPAN module IPC::Signal.
47
48 You may also choose to assign the strings "IGNORE" or "DEFAULT" as the
49 handler, in which case Perl will try to discard the signal or do the
50 default thing.
51
52 On most Unix platforms, the "CHLD" (sometimes also known as "CLD")
53 signal has special behavior with respect to a value of "IGNORE".
54 Setting $SIG{CHLD} to "IGNORE" on such a platform has the effect of not
55 creating zombie processes when the parent process fails to "wait()" on
56 its child processes (i.e., child processes are automatically reaped).
57 Calling "wait()" with $SIG{CHLD} set to "IGNORE" usually returns "-1"
58 on such platforms.
59
60 Some signals can be neither trapped nor ignored, such as the KILL and
61 STOP (but not the TSTP) signals. Note that ignoring signals makes them
62 disappear. If you only want them blocked temporarily without them
63 getting lost you'll have to use the "POSIX" module's sigprocmask.
64
65 Sending a signal to a negative process ID means that you send the
66 signal to the entire Unix process group. This code sends a hang-up
67 signal to all processes in the current process group, and also sets
68 $SIG{HUP} to "IGNORE" so it doesn't kill itself:
69
70 # block scope for local
71 {
72 local $SIG{HUP} = "IGNORE";
73 kill HUP => -getpgrp();
74 # snazzy writing of: kill("HUP", -getpgrp())
75 }
76
77 Another interesting signal to send is signal number zero. This doesn't
78 actually affect a child process, but instead checks whether it's alive
79 or has changed its UIDs.
80
81 unless (kill 0 => $kid_pid) {
82 warn "something wicked happened to $kid_pid";
83 }
84
85 Signal number zero may fail because you lack permission to send the
86 signal when directed at a process whose real or saved UID is not
87 identical to the real or effective UID of the sending process, even
88 though the process is alive. You may be able to determine the cause of
89 failure using $! or "%!".
90
91 unless (kill(0 => $pid) || $!{EPERM}) {
92 warn "$pid looks dead";
93 }
94
95 You might also want to employ anonymous functions for simple signal
96 handlers:
97
98 $SIG{INT} = sub { die "\nOutta here!\n" };
99
100 SIGCHLD handlers require some special care. If a second child dies
101 while in the signal handler caused by the first death, we won't get
102 another signal. So must loop here else we will leave the unreaped child
103 as a zombie. And the next time two children die we get another zombie.
104 And so on.
105
106 use POSIX ":sys_wait_h";
107 $SIG{CHLD} = sub {
108 while ((my $child = waitpid(-1, WNOHANG)) > 0) {
109 $Kid_Status{$child} = $?;
110 }
111 };
112 # do something that forks...
113
114 Be careful: qx(), system(), and some modules for calling external
115 commands do a fork(), then wait() for the result. Thus, your signal
116 handler will be called. Because wait() was already called by system()
117 or qx(), the wait() in the signal handler will see no more zombies and
118 will therefore block.
119
120 The best way to prevent this issue is to use waitpid(), as in the
121 following example:
122
123 use POSIX ":sys_wait_h"; # for nonblocking read
124
125 my %children;
126
127 $SIG{CHLD} = sub {
128 # don't change $! and $? outside handler
129 local ($!, $?);
130 while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) {
131 delete $children{$pid};
132 cleanup_child($pid, $?);
133 }
134 };
135
136 while (1) {
137 my $pid = fork();
138 die "cannot fork" unless defined $pid;
139 if ($pid == 0) {
140 # ...
141 exit 0;
142 } else {
143 $children{$pid}=1;
144 # ...
145 system($command);
146 # ...
147 }
148 }
149
150 Signal handling is also used for timeouts in Unix. While safely
151 protected within an "eval{}" block, you set a signal handler to trap
152 alarm signals and then schedule to have one delivered to you in some
153 number of seconds. Then try your blocking operation, clearing the
154 alarm when it's done but not before you've exited your "eval{}" block.
155 If it goes off, you'll use die() to jump out of the block.
156
157 Here's an example:
158
159 my $ALARM_EXCEPTION = "alarm clock restart";
160 eval {
161 local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
162 alarm 10;
163 flock($fh, 2) # blocking write lock
164 || die "cannot flock: $!";
165 alarm 0;
166 };
167 if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }
168
169 If the operation being timed out is system() or qx(), this technique is
170 liable to generate zombies. If this matters to you, you'll need to
171 do your own fork() and exec(), and kill the errant child process.
172
173 For more complex signal handling, you might see the standard POSIX
174 module. Lamentably, this is almost entirely undocumented, but the
175 ext/POSIX/t/sigaction.t file from the Perl source distribution has some
176 examples in it.
177
178 Handling the SIGHUP Signal in Daemons
179 A process that usually starts when the system boots and shuts down when
180 the system is shut down is called a daemon (Disk And Execution
181 MONitor). If a daemon process has a configuration file which is
182 modified after the process has been started, there should be a way to
183 tell that process to reread its configuration file without stopping the
184 process. Many daemons provide this mechanism using a "SIGHUP" signal
185 handler. When you want to tell the daemon to reread the file, simply
186 send it the "SIGHUP" signal.
187
188 The following example implements a simple daemon, which restarts itself
189 every time the "SIGHUP" signal is received. The actual code is located
190 in the subroutine "code()", which just prints some debugging info to
191 show that it works; it should be replaced with the real code.
192
193 #!/usr/bin/perl
194
195 use strict;
196 use warnings;
197
198 use POSIX ();
199 use FindBin ();
200 use File::Basename ();
201 use File::Spec::Functions qw(catfile);
202
203 $| = 1;
204
205 # make the daemon cross-platform, so exec always calls the script
206 # itself with the right path, no matter how the script was invoked.
207 my $script = File::Basename::basename($0);
208 my $SELF = catfile($FindBin::Bin, $script);
209
210 # POSIX unmasks the sigprocmask properly
211 $SIG{HUP} = sub {
212 print "got SIGHUP\n";
213 exec($SELF, @ARGV) || die "$0: couldn't restart: $!";
214 };
215
216 code();
217
218 sub code {
219 print "PID: $$\n";
220 print "ARGV: @ARGV\n";
221 my $count = 0;
222 while (1) {
223 sleep 2;
224 print ++$count, "\n";
225 }
226 }
227
228 Deferred Signals (Safe Signals)
229 Before Perl 5.8.0, installing Perl code to deal with signals exposed
230 you to danger from two things. First, few system library functions are
231 re-entrant. If the signal interrupts while Perl is executing one
232 function (like malloc(3) or printf(3)), and your signal handler then
233 calls the same function again, you could get unpredictable
234 behavior--often, a core dump. Second, Perl isn't itself re-entrant at
235 the lowest levels. If the signal interrupts Perl while Perl is
236 changing its own internal data structures, similarly unpredictable
237 behavior may result.
238
239 There were two things you could do, knowing this: be paranoid or be
240 pragmatic. The paranoid approach was to do as little as possible in
241 your signal handler. Set an existing integer variable that already has
242 a value, and return. This doesn't help you if you're in a slow system
243 call, which will just restart. That means you have to "die" to
244 longjmp(3) out of the handler. Even this is a little cavalier for the
245 true paranoiac, who avoids "die" in a handler because the system is out
246 to get you. The pragmatic approach was to say "I know the risks, but
247 prefer the convenience", and to do anything you wanted in your signal
248 handler, and be prepared to clean up core dumps now and again.
249
250 Perl 5.8.0 and later avoid these problems by "deferring" signals. That
251 is, when the signal is delivered to the process by the system (to the C
252 code that implements Perl) a flag is set, and the handler returns
253 immediately. Then at strategic "safe" points in the Perl interpreter
254 (e.g. when it is about to execute a new opcode) the flags are checked
255 and the Perl level handler from %SIG is executed. The "deferred" scheme
256 allows much more flexibility in the coding of signal handlers as we
257 know the Perl interpreter is in a safe state, and that we are not in a
258 system library function when the handler is called. However the
259 implementation does differ from previous Perls in the following ways:
260
261 Long-running opcodes
262 As the Perl interpreter looks at signal flags only when it is about
263 to execute a new opcode, a signal that arrives during a long-
264 running opcode (e.g. a regular expression operation on a very large
265 string) will not be seen until the current opcode completes.
266
267 If a signal of any given type fires multiple times during an opcode
268 (such as from a fine-grained timer), the handler for that signal
269 will be called only once, after the opcode completes; all other
270 instances will be discarded. Furthermore, if your system's signal
271 queue gets flooded to the point that there are signals that have
272 been raised but not yet caught (and thus not deferred) at the time
273 an opcode completes, those signals may well be caught and deferred
274 during subsequent opcodes, with sometimes surprising results. For
275 example, you may see alarms delivered even after calling alarm(0)
276 as the latter stops the raising of alarms but does not cancel the
277 delivery of alarms raised but not yet caught. Do not depend on the
278 behaviors described in this paragraph as they are side effects of
279 the current implementation and may change in future versions of
280 Perl.
281
282 Interrupting IO
283 When a signal is delivered (e.g., SIGINT from a control-C) the
284 operating system breaks into IO operations like read(2), which is
285 used to implement Perl's readline() function, the "<>" operator. On
286 older Perls the handler was called immediately (and as "read" is
287 not "unsafe", this worked well). With the "deferred" scheme the
288 handler is not called immediately, and if Perl is using the
289 system's "stdio" library that library may restart the "read"
290 without returning to Perl to give it a chance to call the %SIG
291 handler. If this happens on your system the solution is to use the
292 ":perlio" layer to do IO--at least on those handles that you want
293 to be able to break into with signals. (The ":perlio" layer checks
294 the signal flags and calls %SIG handlers before resuming IO
295 operation.)
296
297 The default in Perl 5.8.0 and later is to automatically use the
298 ":perlio" layer.
299
300 Note that it is not advisable to access a file handle within a
301 signal handler where that signal has interrupted an I/O operation
302 on that same handle. While perl will at least try hard not to
303 crash, there are no guarantees of data integrity; for example, some
304 data might get dropped or written twice.
305
306 Some networking library functions like gethostbyname() are known to
307 have their own implementations of timeouts which may conflict with
308 your timeouts. If you have problems with such functions, try using
309 the POSIX sigaction() function, which bypasses Perl safe signals.
310 Be warned that this does subject you to possible memory corruption,
311 as described above.
312
313 Instead of setting $SIG{ALRM}:
314
315 local $SIG{ALRM} = sub { die "alarm" };
316
317 try something like the following:
318
319 use POSIX qw(SIGALRM);
320 POSIX::sigaction(SIGALRM,
321 POSIX::SigAction->new(sub { die "alarm" }))
322 || die "Error setting SIGALRM handler: $!\n";
323
324 Another way to disable the safe signal behavior locally is to use
325 the "Perl::Unsafe::Signals" module from CPAN, which affects all
326 signals.
327
328 Restartable system calls
329 On systems that supported it, older versions of Perl used the
330 SA_RESTART flag when installing %SIG handlers. This meant that
331 restartable system calls would continue rather than returning when
332 a signal arrived. In order to deliver deferred signals promptly,
333 Perl 5.8.0 and later do not use SA_RESTART. Consequently,
334 restartable system calls can fail (with $! set to "EINTR") in
335 places where they previously would have succeeded.
336
337 The default ":perlio" layer retries "read", "write" and "close" as
338 described above; interrupted "wait" and "waitpid" calls will always
339 be retried.
340
341 Signals as "faults"
342 Certain signals like SEGV, ILL, and BUS are generated by virtual
343 memory addressing errors and similar "faults". These are normally
344 fatal: there is little a Perl-level handler can do with them. So
345 Perl delivers them immediately rather than attempting to defer
346 them.
347
348 Signals triggered by operating system state
349 On some operating systems certain signal handlers are supposed to
350 "do something" before returning. One example can be CHLD or CLD,
351 which indicates a child process has completed. On some operating
352 systems the signal handler is expected to "wait" for the completed
353 child process. On such systems the deferred signal scheme will not
354 work for those signals: it does not do the "wait". Again the
355 failure will look like a loop as the operating system will reissue
356 the signal because there are completed child processes that have
357 not yet been "wait"ed for.
358
359 If you want the old signal behavior back despite possible memory
360 corruption, set the environment variable "PERL_SIGNALS" to "unsafe".
361 This feature first appeared in Perl 5.8.1.
362
364 A named pipe (often referred to as a FIFO) is an old Unix IPC mechanism
365 for processes communicating on the same machine. It works just like
366 regular anonymous pipes, except that the processes rendezvous using a
367 filename and need not be related.
368
369 To create a named pipe, use the "POSIX::mkfifo()" function.
370
371 use POSIX qw(mkfifo);
372 mkfifo($path, 0700) || die "mkfifo $path failed: $!";
373
374 You can also use the Unix command mknod(1), or on some systems,
375 mkfifo(1). These may not be in your normal path, though.
376
377 # system return val is backwards, so && not ||
378 #
379 $ENV{PATH} .= ":/etc:/usr/etc";
380 if ( system("mknod", $path, "p")
381 && system("mkfifo", $path) )
382 {
383 die "mk{nod,fifo} $path failed";
384 }
385
386 A fifo is convenient when you want to connect a process to an unrelated
387 one. When you open a fifo, the program will block until there's
388 something on the other end.
389
390 For example, let's say you'd like to have your .signature file be a
391 named pipe that has a Perl program on the other end. Now every time
392 any program (like a mailer, news reader, finger program, etc.) tries to
393 read from that file, the reading program will read the new signature
394 from your program. We'll use the pipe-checking file-test operator, -p,
395 to find out whether anyone (or anything) has accidentally removed our
396 fifo.
397
398 chdir(); # go home
399 my $FIFO = ".signature";
400
401 while (1) {
402 unless (-p $FIFO) {
403 unlink $FIFO; # discard any failure, will catch later
404 require POSIX; # delayed loading of heavy module
405 POSIX::mkfifo($FIFO, 0700)
406 || die "can't mkfifo $FIFO: $!";
407 }
408
409 # next line blocks till there's a reader
410 open (my $fh, ">", $FIFO) || die "can't open $FIFO: $!";
411 print $fh "John Smith (smith\@host.org)\n", `fortune -s`;
412 close($fh) || die "can't close $FIFO: $!";
413 sleep 2; # to avoid dup signals
414 }
415
417 Perl's basic open() statement can also be used for unidirectional
418 interprocess communication by specifying the open mode as "|-" or "-|".
419 Here's how to start something up in a child process you intend to write
420 to:
421
422 open(my $spooler, "|-", "cat -v | lpr -h 2>/dev/null")
423 || die "can't fork: $!";
424 local $SIG{PIPE} = sub { die "spooler pipe broke" };
425 print $spooler "stuff\n";
426 close $spooler || die "bad spool: $! $?";
427
428 And here's how to start up a child process you intend to read from:
429
430 open(my $status, "-|", "netstat -an 2>&1")
431 || die "can't fork: $!";
432 while (<$status>) {
433 next if /^(tcp|udp)/;
434 print;
435 }
436 close $status || die "bad netstat: $! $?";
437
438 Be aware that these operations are full Unix forks, which means they
439 may not be correctly implemented on all alien systems. See "open" in
440 perlport for portability details.
441
442 In the two-argument form of open(), a pipe open can be achieved by
443 either appending or prepending a pipe symbol to the second argument:
444
445 open(my $spooler, "| cat -v | lpr -h 2>/dev/null")
446 || die "can't fork: $!";
447 open(my $status, "netstat -an 2>&1 |")
448 || die "can't fork: $!";
449
450 This can be used even on systems that do not support forking, but this
451 possibly allows code intended to read files to unexpectedly execute
452 programs. If one can be sure that a particular program is a Perl
453 script expecting filenames in @ARGV using the two-argument form of
454 open() or the "<>" operator, the clever programmer can write something
455 like this:
456
457 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
458
459 and no matter which sort of shell it's called from, the Perl program
460 will read from the file f1, the process cmd1, standard input (tmpfile
461 in this case), the f2 file, the cmd2 command, and finally the f3 file.
462 Pretty nifty, eh?
463
464 You might notice that you could use backticks for much the same effect
465 as opening a pipe for reading:
466
467 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
468 die "bad netstatus ($?)" if $?;
469
470 While this is true on the surface, it's much more efficient to process
471 the file one line or record at a time because then you don't have to
472 read the whole thing into memory at once. It also gives you finer
473 control of the whole process, letting you kill off the child process
474 early if you'd like.
475
476 Be careful to check the return values from both open() and close(). If
477 you're writing to a pipe, you should also trap SIGPIPE. Otherwise,
478 think of what happens when you start up a pipe to a command that
479 doesn't exist: the open() will in all likelihood succeed (it only
480 reflects the fork()'s success), but then your output will
481 fail--spectacularly. Perl can't know whether the command worked,
482 because your command is actually running in a separate process whose
483 exec() might have failed. Therefore, while readers of bogus commands
484 return just a quick EOF, writers to bogus commands will get hit with a
485 signal, which they'd best be prepared to handle. Consider:
486
487 open(my $fh, "|-", "bogus") || die "can't fork: $!";
488 print $fh "bang\n"; # neither necessary nor sufficient
489 # to check print retval!
490 close($fh) || die "can't close: $!";
491
492 The reason for not checking the return value from print() is because of
493 pipe buffering; physical writes are delayed. That won't blow up until
494 the close, and it will blow up with a SIGPIPE. To catch it, you could
495 use this:
496
497 $SIG{PIPE} = "IGNORE";
498 open(my $fh, "|-", "bogus") || die "can't fork: $!";
499 print $fh "bang\n";
500 close($fh) || die "can't close: status=$?";
501
502 Filehandles
503 Both the main process and any child processes it forks share the same
504 STDIN, STDOUT, and STDERR filehandles. If both processes try to access
505 them at once, strange things can happen. You may also want to close or
506 reopen the filehandles for the child. You can get around this by
507 opening your pipe with open(), but on some systems this means that the
508 child process cannot outlive the parent.
509
510 Background Processes
511 You can run a command in the background with:
512
513 system("cmd &");
514
515 The command's STDOUT and STDERR (and possibly STDIN, depending on your
516 shell) will be the same as the parent's. You won't need to catch
517 SIGCHLD because of the double-fork taking place; see below for details.
518
519 Complete Dissociation of Child from Parent
520 In some cases (starting server processes, for instance) you'll want to
521 completely dissociate the child process from the parent. This is often
522 called daemonization. A well-behaved daemon will also chdir() to the
523 root directory so it doesn't prevent unmounting the filesystem
524 containing the directory from which it was launched, and redirect its
525 standard file descriptors from and to /dev/null so that random output
526 doesn't wind up on the user's terminal.
527
528 use POSIX "setsid";
529
530 sub daemonize {
531 chdir("/") || die "can't chdir to /: $!";
532 open(STDIN, "<", "/dev/null") || die "can't read /dev/null: $!";
533 open(STDOUT, ">", "/dev/null") || die "can't write /dev/null: $!";
534 defined(my $pid = fork()) || die "can't fork: $!";
535 exit if $pid; # non-zero now means I am the parent
536 (setsid() != -1) || die "Can't start a new session: $!";
537 open(STDERR, ">&", STDOUT) || die "can't dup stdout: $!";
538 }
539
540 The fork() has to come before the setsid() to ensure you aren't a
541 process group leader; the setsid() will fail if you are. If your
542 system doesn't have the setsid() function, open /dev/tty and use the
543 "TIOCNOTTY" ioctl() on it instead. See tty(4) for details.
544
545 Non-Unix users should check their "Your_OS::Process" module for other
546 possible solutions.
547
548 Safe Pipe Opens
549 Another interesting approach to IPC is making your single program go
550 multiprocess and communicate between--or even amongst--yourselves. The
551 two-argument form of the open() function will accept a file argument of
552 either "-|" or "|-" to do a very interesting thing: it forks a child
553 connected to the filehandle you've opened. The child is running the
554 same program as the parent. This is useful for safely opening a file
555 when running under an assumed UID or GID, for example. If you open a
556 pipe to minus, you can write to the filehandle you opened and your kid
557 will find it in his STDIN. If you open a pipe from minus, you can read
558 from the filehandle you opened whatever your kid writes to his STDOUT.
559
560 my $PRECIOUS = "/path/to/some/safe/file";
561 my $sleep_count;
562 my $pid;
563 my $kid_to_write;
564
565 do {
566 $pid = open($kid_to_write, "|-");
567 unless (defined $pid) {
568 warn "cannot fork: $!";
569 die "bailing out" if $sleep_count++ > 6;
570 sleep 10;
571 }
572 } until defined $pid;
573
574 if ($pid) { # I am the parent
575 print $kid_to_write @some_data;
576 close($kid_to_write) || warn "kid exited $?";
577 } else { # I am the child
578 # drop permissions in setuid and/or setgid programs:
579 ($>, $)) = ($<, $();
580 open (my $outfile, ">", $PRECIOUS)
581 || die "can't open $PRECIOUS: $!";
582 while (<STDIN>) {
583 print $outfile; # child STDIN is parent $kid_to_write
584 }
585 close($outfile) || die "can't close $PRECIOUS: $!";
586 exit(0); # don't forget this!!
587 }
588
589 Another common use for this construct is when you need to execute
590 something without the shell's interference. With system(), it's
591 straightforward, but you can't use a pipe open or backticks safely.
592 That's because there's no way to stop the shell from getting its hands
593 on your arguments. Instead, use lower-level control to call exec()
594 directly.
595
596 Here's a safe backtick or pipe open for read:
597
598 my $pid = open(my $kid_to_read, "-|");
599 defined($pid) || die "can't fork: $!";
600
601 if ($pid) { # parent
602 while (<$kid_to_read>) {
603 # do something interesting
604 }
605 close($kid_to_read) || warn "kid exited $?";
606
607 } else { # child
608 ($>, $)) = ($<, $(); # suid only
609 exec($program, @options, @args)
610 || die "can't exec program: $!";
611 # NOTREACHED
612 }
613
614 And here's a safe pipe open for writing:
615
616 my $pid = open(my $kid_to_write, "|-");
617 defined($pid) || die "can't fork: $!";
618
619 $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
620
621 if ($pid) { # parent
622 print $kid_to_write @data;
623 close($kid_to_write) || warn "kid exited $?";
624
625 } else { # child
626 ($>, $)) = ($<, $();
627 exec($program, @options, @args)
628 || die "can't exec program: $!";
629 # NOTREACHED
630 }
631
632 It is very easy to dead-lock a process using this form of open(), or
633 indeed with any use of pipe() with multiple subprocesses. The example
634 above is "safe" because it is simple and calls exec(). See "Avoiding
635 Pipe Deadlocks" for general safety principles, but there are extra
636 gotchas with Safe Pipe Opens.
637
638 In particular, if you opened the pipe using "open $fh, "|-"", then you
639 cannot simply use close() in the parent process to close an unwanted
640 writer. Consider this code:
641
642 my $pid = open(my $writer, "|-"); # fork open a kid
643 defined($pid) || die "first fork failed: $!";
644 if ($pid) {
645 if (my $sub_pid = fork()) {
646 defined($sub_pid) || die "second fork failed: $!";
647 close($writer) || die "couldn't close writer: $!";
648 # now do something else...
649 }
650 else {
651 # first write to $writer
652 # ...
653 # then when finished
654 close($writer) || die "couldn't close writer: $!";
655 exit(0);
656 }
657 }
658 else {
659 # first do something with STDIN, then
660 exit(0);
661 }
662
663 In the example above, the true parent does not want to write to the
664 $writer filehandle, so it closes it. However, because $writer was
665 opened using "open $fh, "|-"", it has a special behavior: closing it
666 calls waitpid() (see "waitpid" in perlfunc), which waits for the
667 subprocess to exit. If the child process ends up waiting for something
668 happening in the section marked "do something else", you have deadlock.
669
670 This can also be a problem with intermediate subprocesses in more
671 complicated code, which will call waitpid() on all open filehandles
672 during global destruction--in no predictable order.
673
674 To solve this, you must manually use pipe(), fork(), and the form of
675 open() which sets one file descriptor to another, as shown below:
676
677 pipe(my $reader, my $writer) || die "pipe failed: $!";
678 my $pid = fork();
679 defined($pid) || die "first fork failed: $!";
680 if ($pid) {
681 close $reader;
682 if (my $sub_pid = fork()) {
683 defined($sub_pid) || die "first fork failed: $!";
684 close($writer) || die "can't close writer: $!";
685 }
686 else {
687 # write to $writer...
688 # ...
689 # then when finished
690 close($writer) || die "can't close writer: $!";
691 exit(0);
692 }
693 # write to $writer...
694 }
695 else {
696 open(STDIN, "<&", $reader) || die "can't reopen STDIN: $!";
697 close($writer) || die "can't close writer: $!";
698 # do something...
699 exit(0);
700 }
701
702 Since Perl 5.8.0, you can also use the list form of "open" for pipes.
703 This is preferred when you wish to avoid having the shell interpret
704 metacharacters that may be in your command string.
705
706 So for example, instead of using:
707
708 open(my $ps_pipe, "-|", "ps aux") || die "can't open ps pipe: $!";
709
710 One would use either of these:
711
712 open(my $ps_pipe, "-|", "ps", "aux")
713 || die "can't open ps pipe: $!";
714
715 my @ps_args = qw[ ps aux ];
716 open(my $ps_pipe, "-|", @ps_args)
717 || die "can't open @ps_args|: $!";
718
719 Because there are more than three arguments to open(), it forks the
720 ps(1) command without spawning a shell, and reads its standard output
721 via the $ps_pipe filehandle. The corresponding syntax to write to
722 command pipes is to use "|-" in place of "-|".
723
724 This was admittedly a rather silly example, because you're using string
725 literals whose content is perfectly safe. There is therefore no cause
726 to resort to the harder-to-read, multi-argument form of pipe open().
727 However, whenever you cannot be assured that the program arguments are
728 free of shell metacharacters, the fancier form of open() should be
729 used. For example:
730
731 my @grep_args = ("egrep", "-i", $some_pattern, @many_files);
732 open(my $grep_pipe, "-|", @grep_args)
733 || die "can't open @grep_args|: $!";
734
735 Here the multi-argument form of pipe open() is preferred because the
736 pattern and indeed even the filenames themselves might hold
737 metacharacters.
738
739 Avoiding Pipe Deadlocks
740 Whenever you have more than one subprocess, you must be careful that
741 each closes whichever half of any pipes created for interprocess
742 communication it is not using. This is because any child process
743 reading from the pipe and expecting an EOF will never receive it, and
744 therefore never exit. A single process closing a pipe is not enough to
745 close it; the last process with the pipe open must close it for it to
746 read EOF.
747
748 Certain built-in Unix features help prevent this most of the time. For
749 instance, filehandles have a "close on exec" flag, which is set en
750 masse under control of the $^F variable. This is so any filehandles
751 you didn't explicitly route to the STDIN, STDOUT or STDERR of a child
752 program will be automatically closed.
753
754 Always explicitly and immediately call close() on the writable end of
755 any pipe, unless that process is actually writing to it. Even if you
756 don't explicitly call close(), Perl will still close() all filehandles
757 during global destruction. As previously discussed, if those
758 filehandles have been opened with Safe Pipe Open, this will result in
759 calling waitpid(), which may again deadlock.
760
761 Bidirectional Communication with Another Process
762 While this works reasonably well for unidirectional communication, what
763 about bidirectional communication? The most obvious approach doesn't
764 work:
765
766 # THIS DOES NOT WORK!!
767 open(my $prog_for_reading_and_writing, "| some program |")
768
769 If you forget to "use warnings", you'll miss out entirely on the
770 helpful diagnostic message:
771
772 Can't do bidirectional pipe at -e line 1.
773
774 If you really want to, you can use the standard open2() from the
775 IPC::Open2 module to catch both ends. There's also an open3() in
776 IPC::Open3 for tridirectional I/O so you can also catch your child's
777 STDERR, but doing so would then require an awkward select() loop and
778 wouldn't allow you to use normal Perl input operations.
779
780 If you look at its source, you'll see that open2() uses low-level
781 primitives like the pipe() and exec() syscalls to create all the
782 connections. Although it might have been more efficient by using
783 socketpair(), this would have been even less portable than it already
784 is. The open2() and open3() functions are unlikely to work anywhere
785 except on a Unix system, or at least one purporting POSIX compliance.
786
787 Here's an example of using open2():
788
789 use IPC::Open2;
790 my $pid = open2(my $reader, my $writer, "cat -un");
791 print $writer "stuff\n";
792 my $got = <$reader>;
793 waitpid $pid, 0;
794
795 The problem with this is that buffering is really going to ruin your
796 day. Even though your $writer filehandle is auto-flushed so the
797 process on the other end gets your data in a timely manner, you can't
798 usually do anything to force that process to give its data to you in a
799 similarly quick fashion. In this special case, we could actually so,
800 because we gave cat a -u flag to make it unbuffered. But very few
801 commands are designed to operate over pipes, so this seldom works
802 unless you yourself wrote the program on the other end of the double-
803 ended pipe.
804
805 A solution to this is to use a library which uses pseudottys to make
806 your program behave more reasonably. This way you don't have to have
807 control over the source code of the program you're using. The "Expect"
808 module from CPAN also addresses this kind of thing. This module
809 requires two other modules from CPAN, "IO::Pty" and "IO::Stty". It
810 sets up a pseudo terminal to interact with programs that insist on
811 talking to the terminal device driver. If your system is supported,
812 this may be your best bet.
813
814 Bidirectional Communication with Yourself
815 If you want, you may make low-level pipe() and fork() syscalls to
816 stitch this together by hand. This example only talks to itself, but
817 you could reopen the appropriate handles to STDIN and STDOUT and call
818 other processes. (The following example lacks proper error checking.)
819
820 #!/usr/bin/perl
821 # pipe1 - bidirectional communication using two pipe pairs
822 # designed for the socketpair-challenged
823 use strict;
824 use warnings;
825 use IO::Handle; # enable autoflush method before Perl 5.14
826 pipe(my $parent_rdr, my $child_wtr); # XXX: check failure?
827 pipe(my $child_rdr, my $parent_wtr); # XXX: check failure?
828 $child_wtr->autoflush(1);
829 $parent_wtr->autoflush(1);
830
831 if ($pid = fork()) {
832 close $parent_rdr;
833 close $parent_wtr;
834 print $child_wtr "Parent Pid $$ is sending this\n";
835 chomp(my $line = <$child_rdr>);
836 print "Parent Pid $$ just read this: '$line'\n";
837 close $child_rdr; close $child_wtr;
838 waitpid($pid, 0);
839 } else {
840 die "cannot fork: $!" unless defined $pid;
841 close $child_rdr;
842 close $child_wtr;
843 chomp(my $line = <$parent_rdr>);
844 print "Child Pid $$ just read this: '$line'\n";
845 print $parent_wtr "Child Pid $$ is sending this\n";
846 close $parent_rdr;
847 close $parent_wtr;
848 exit(0);
849 }
850
851 But you don't actually have to make two pipe calls. If you have the
852 socketpair() system call, it will do this all for you.
853
854 #!/usr/bin/perl
855 # pipe2 - bidirectional communication using socketpair
856 # "the best ones always go both ways"
857
858 use strict;
859 use warnings;
860 use Socket;
861 use IO::Handle; # enable autoflush method before Perl 5.14
862
863 # We say AF_UNIX because although *_LOCAL is the
864 # POSIX 1003.1g form of the constant, many machines
865 # still don't have it.
866 socketpair(my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
867 || die "socketpair: $!";
868
869 $child->autoflush(1);
870 $parent->autoflush(1);
871
872 if ($pid = fork()) {
873 close $parent;
874 print $child "Parent Pid $$ is sending this\n";
875 chomp(my $line = <$child>);
876 print "Parent Pid $$ just read this: '$line'\n";
877 close $child;
878 waitpid($pid, 0);
879 } else {
880 die "cannot fork: $!" unless defined $pid;
881 close $child;
882 chomp(my $line = <$parent>);
883 print "Child Pid $$ just read this: '$line'\n";
884 print $parent "Child Pid $$ is sending this\n";
885 close $parent;
886 exit(0);
887 }
888
890 While not entirely limited to Unix-derived operating systems (e.g.,
891 WinSock on PCs provides socket support, as do some VMS libraries), you
892 might not have sockets on your system, in which case this section
893 probably isn't going to do you much good. With sockets, you can do
894 both virtual circuits like TCP streams and datagrams like UDP packets.
895 You may be able to do even more depending on your system.
896
897 The Perl functions for dealing with sockets have the same names as the
898 corresponding system calls in C, but their arguments tend to differ for
899 two reasons. First, Perl filehandles work differently than C file
900 descriptors. Second, Perl already knows the length of its strings, so
901 you don't need to pass that information.
902
903 One of the major problems with ancient, antemillennial socket code in
904 Perl was that it used hard-coded values for some of the constants,
905 which severely hurt portability. If you ever see code that does
906 anything like explicitly setting "$AF_INET = 2", you know you're in for
907 big trouble. An immeasurably superior approach is to use the Socket
908 module, which more reliably grants access to the various constants and
909 functions you'll need.
910
911 If you're not writing a server/client for an existing protocol like
912 NNTP or SMTP, you should give some thought to how your server will know
913 when the client has finished talking, and vice-versa. Most protocols
914 are based on one-line messages and responses (so one party knows the
915 other has finished when a "\n" is received) or multi-line messages and
916 responses that end with a period on an empty line ("\n.\n" terminates a
917 message/response).
918
919 Internet Line Terminators
920 The Internet line terminator is "\015\012". Under ASCII variants of
921 Unix, that could usually be written as "\r\n", but under other systems,
922 "\r\n" might at times be "\015\015\012", "\012\012\015", or something
923 completely different. The standards specify writing "\015\012" to be
924 conformant (be strict in what you provide), but they also recommend
925 accepting a lone "\012" on input (be lenient in what you require). We
926 haven't always been very good about that in the code in this manpage,
927 but unless you're on a Mac from way back in its pre-Unix dark ages,
928 you'll probably be ok.
929
930 Internet TCP Clients and Servers
931 Use Internet-domain sockets when you want to do client-server
932 communication that might extend to machines outside of your own system.
933
934 Here's a sample TCP client using Internet-domain sockets:
935
936 #!/usr/bin/perl
937 use strict;
938 use warnings;
939 use Socket;
940
941 my $remote = shift || "localhost";
942 my $port = shift || 2345; # random port
943 if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
944 die "No port" unless $port;
945 my $iaddr = inet_aton($remote) || die "no host: $remote";
946 my $paddr = sockaddr_in($port, $iaddr);
947
948 my $proto = getprotobyname("tcp");
949 socket(my $sock, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
950 connect($sock, $paddr) || die "connect: $!";
951 while (my $line = <$sock>) {
952 print $line;
953 }
954
955 close ($sock) || die "close: $!";
956 exit(0);
957
958 And here's a corresponding server to go along with it. We'll leave the
959 address as "INADDR_ANY" so that the kernel can choose the appropriate
960 interface on multihomed hosts. If you want sit on a particular
961 interface (like the external side of a gateway or firewall machine),
962 fill this in with your real address instead.
963
964 #!/usr/bin/perl -T
965 use strict;
966 use warnings;
967 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
968 use Socket;
969 use Carp;
970 my $EOL = "\015\012";
971
972 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
973
974 my $port = shift || 2345;
975 die "invalid port" unless $port =~ /^ \d+ $/x;
976
977 my $proto = getprotobyname("tcp");
978
979 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
980 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
981 || die "setsockopt: $!";
982 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
983 listen($server, SOMAXCONN) || die "listen: $!";
984
985 logmsg "server started on port $port";
986
987 for (my $paddr; $paddr = accept(my $client, $server); close $client) {
988 my($port, $iaddr) = sockaddr_in($paddr);
989 my $name = gethostbyaddr($iaddr, AF_INET);
990
991 logmsg "connection from $name [",
992 inet_ntoa($iaddr), "]
993 at port $port";
994
995 print $client "Hello there, $name, it's now ",
996 scalar localtime(), $EOL;
997 }
998
999 And here's a multitasking version. It's multitasked in that like most
1000 typical servers, it spawns (fork()s) a slave server to handle the
1001 client request so that the master server can quickly go back to service
1002 a new client.
1003
1004 #!/usr/bin/perl -T
1005 use strict;
1006 use warnings;
1007 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1008 use Socket;
1009 use Carp;
1010 my $EOL = "\015\012";
1011
1012 sub spawn; # forward declaration
1013 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1014
1015 my $port = shift || 2345;
1016 die "invalid port" unless $port =~ /^ \d+ $/x;
1017
1018 my $proto = getprotobyname("tcp");
1019
1020 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
1021 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
1022 || die "setsockopt: $!";
1023 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
1024 listen($server, SOMAXCONN) || die "listen: $!";
1025
1026 logmsg "server started on port $port";
1027
1028 my $waitedpid = 0;
1029
1030 use POSIX ":sys_wait_h";
1031 use Errno;
1032
1033 sub REAPER {
1034 local $!; # don't let waitpid() overwrite current error
1035 while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
1036 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1037 }
1038 $SIG{CHLD} = \&REAPER; # loathe SysV
1039 }
1040
1041 $SIG{CHLD} = \&REAPER;
1042
1043 while (1) {
1044 my $paddr = accept(my $client, $server) || do {
1045 # try again if accept() returned because got a signal
1046 next if $!{EINTR};
1047 die "accept: $!";
1048 };
1049 my ($port, $iaddr) = sockaddr_in($paddr);
1050 my $name = gethostbyaddr($iaddr, AF_INET);
1051
1052 logmsg "connection from $name [",
1053 inet_ntoa($iaddr),
1054 "] at port $port";
1055
1056 spawn $client, sub {
1057 $| = 1;
1058 print "Hello there, $name, it's now ",
1059 scalar localtime(),
1060 $EOL;
1061 exec "/usr/games/fortune" # XXX: "wrong" line terminators
1062 or confess "can't exec fortune: $!";
1063 };
1064 close $client;
1065 }
1066
1067 sub spawn {
1068 my $client = shift;
1069 my $coderef = shift;
1070
1071 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1072 confess "usage: spawn CLIENT CODEREF";
1073 }
1074
1075 my $pid;
1076 unless (defined($pid = fork())) {
1077 logmsg "cannot fork: $!";
1078 return;
1079 }
1080 elsif ($pid) {
1081 logmsg "begat $pid";
1082 return; # I'm the parent
1083 }
1084 # else I'm the child -- go spawn
1085
1086 open(STDIN, "<&", $client) || die "can't dup client to stdin";
1087 open(STDOUT, ">&", $client) || die "can't dup client to stdout";
1088 ## open(STDERR, ">&", STDOUT) || die "can't dup stdout to stderr";
1089 exit($coderef->());
1090 }
1091
1092 This server takes the trouble to clone off a child version via fork()
1093 for each incoming request. That way it can handle many requests at
1094 once, which you might not always want. Even if you don't fork(), the
1095 listen() will allow that many pending connections. Forking servers
1096 have to be particularly careful about cleaning up their dead children
1097 (called "zombies" in Unix parlance), because otherwise you'll quickly
1098 fill up your process table. The REAPER subroutine is used here to call
1099 waitpid() for any child processes that have finished, thereby ensuring
1100 that they terminate cleanly and don't join the ranks of the living
1101 dead.
1102
1103 Within the while loop we call accept() and check to see if it returns a
1104 false value. This would normally indicate a system error needs to be
1105 reported. However, the introduction of safe signals (see "Deferred
1106 Signals (Safe Signals)" above) in Perl 5.8.0 means that accept() might
1107 also be interrupted when the process receives a signal. This typically
1108 happens when one of the forked subprocesses exits and notifies the
1109 parent process with a CHLD signal.
1110
1111 If accept() is interrupted by a signal, $! will be set to EINTR. If
1112 this happens, we can safely continue to the next iteration of the loop
1113 and another call to accept(). It is important that your signal
1114 handling code not modify the value of $!, or else this test will likely
1115 fail. In the REAPER subroutine we create a local version of $! before
1116 calling waitpid(). When waitpid() sets $! to ECHILD as it inevitably
1117 does when it has no more children waiting, it updates the local copy
1118 and leaves the original unchanged.
1119
1120 You should use the -T flag to enable taint checking (see perlsec) even
1121 if we aren't running setuid or setgid. This is always a good idea for
1122 servers or any program run on behalf of someone else (like CGI
1123 scripts), because it lessens the chances that people from the outside
1124 will be able to compromise your system.
1125
1126 Let's look at another TCP client. This one connects to the TCP "time"
1127 service on a number of different machines and shows how far their
1128 clocks differ from the system on which it's being run:
1129
1130 #!/usr/bin/perl
1131 use strict;
1132 use warnings;
1133 use Socket;
1134
1135 my $SECS_OF_70_YEARS = 2208988800;
1136 sub ctime { scalar localtime(shift() || time()) }
1137
1138 my $iaddr = gethostbyname("localhost");
1139 my $proto = getprotobyname("tcp");
1140 my $port = getservbyname("time", "tcp");
1141 my $paddr = sockaddr_in(0, $iaddr);
1142
1143 $| = 1;
1144 printf "%-24s %8s %s\n", "localhost", 0, ctime();
1145
1146 foreach my $host (@ARGV) {
1147 printf "%-24s ", $host;
1148 my $hisiaddr = inet_aton($host) || die "unknown host";
1149 my $hispaddr = sockaddr_in($port, $hisiaddr);
1150 socket(my $socket, PF_INET, SOCK_STREAM, $proto)
1151 || die "socket: $!";
1152 connect($socket, $hispaddr) || die "connect: $!";
1153 my $rtime = pack("C4", ());
1154 read($socket, $rtime, 4);
1155 close($socket);
1156 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1157 printf "%8d %s\n", $histime - time(), ctime($histime);
1158 }
1159
1160 Unix-Domain TCP Clients and Servers
1161 That's fine for Internet-domain clients and servers, but what about
1162 local communications? While you can use the same setup, sometimes you
1163 don't want to. Unix-domain sockets are local to the current host, and
1164 are often used internally to implement pipes. Unlike Internet domain
1165 sockets, Unix domain sockets can show up in the file system with an
1166 ls(1) listing.
1167
1168 % ls -l /dev/log
1169 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
1170
1171 You can test for these with Perl's -S file test:
1172
1173 unless (-S "/dev/log") {
1174 die "something's wicked with the log system";
1175 }
1176
1177 Here's a sample Unix-domain client:
1178
1179 #!/usr/bin/perl
1180 use Socket;
1181 use strict;
1182 use warnings;
1183
1184 my $rendezvous = shift || "catsock";
1185 socket(my $sock, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1186 connect($sock, sockaddr_un($rendezvous)) || die "connect: $!";
1187 while (defined(my $line = <$sock>)) {
1188 print $line;
1189 }
1190 exit(0);
1191
1192 And here's a corresponding server. You don't have to worry about silly
1193 network terminators here because Unix domain sockets are guaranteed to
1194 be on the localhost, and thus everything works right.
1195
1196 #!/usr/bin/perl -T
1197 use strict;
1198 use warnings;
1199 use Socket;
1200 use Carp;
1201
1202 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1203 sub spawn; # forward declaration
1204 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1205
1206 my $NAME = "catsock";
1207 my $uaddr = sockaddr_un($NAME);
1208 my $proto = getprotobyname("tcp");
1209
1210 socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1211 unlink($NAME);
1212 bind ($server, $uaddr) || die "bind: $!";
1213 listen($server, SOMAXCONN) || die "listen: $!";
1214
1215 logmsg "server started on $NAME";
1216
1217 my $waitedpid;
1218
1219 use POSIX ":sys_wait_h";
1220 sub REAPER {
1221 my $child;
1222 while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
1223 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1224 }
1225 $SIG{CHLD} = \&REAPER; # loathe SysV
1226 }
1227
1228 $SIG{CHLD} = \&REAPER;
1229
1230
1231 for ( $waitedpid = 0;
1232 accept(my $client, $server) || $waitedpid;
1233 $waitedpid = 0, close $client)
1234 {
1235 next if $waitedpid;
1236 logmsg "connection on $NAME";
1237 spawn $client, sub {
1238 print "Hello there, it's now ", scalar localtime(), "\n";
1239 exec("/usr/games/fortune") || die "can't exec fortune: $!";
1240 };
1241 }
1242
1243 sub spawn {
1244 my $client = shift();
1245 my $coderef = shift();
1246
1247 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1248 confess "usage: spawn CLIENT CODEREF";
1249 }
1250
1251 my $pid;
1252 unless (defined($pid = fork())) {
1253 logmsg "cannot fork: $!";
1254 return;
1255 }
1256 elsif ($pid) {
1257 logmsg "begat $pid";
1258 return; # I'm the parent
1259 }
1260 else {
1261 # I'm the child -- go spawn
1262 }
1263
1264 open(STDIN, "<&", $client)
1265 || die "can't dup client to stdin";
1266 open(STDOUT, ">&", $client)
1267 || die "can't dup client to stdout";
1268 ## open(STDERR, ">&", STDOUT)
1269 ## || die "can't dup stdout to stderr";
1270 exit($coderef->());
1271 }
1272
1273 As you see, it's remarkably similar to the Internet domain TCP server,
1274 so much so, in fact, that we've omitted several duplicate
1275 functions--spawn(), logmsg(), ctime(), and REAPER()--which are the same
1276 as in the other server.
1277
1278 So why would you ever want to use a Unix domain socket instead of a
1279 simpler named pipe? Because a named pipe doesn't give you sessions.
1280 You can't tell one process's data from another's. With socket
1281 programming, you get a separate session for each client; that's why
1282 accept() takes two arguments.
1283
1284 For example, let's say that you have a long-running database server
1285 daemon that you want folks to be able to access from the Web, but only
1286 if they go through a CGI interface. You'd have a small, simple CGI
1287 program that does whatever checks and logging you feel like, and then
1288 acts as a Unix-domain client and connects to your private server.
1289
1291 For those preferring a higher-level interface to socket programming,
1292 the IO::Socket module provides an object-oriented approach. If for
1293 some reason you lack this module, you can just fetch IO::Socket from
1294 CPAN, where you'll also find modules providing easy interfaces to the
1295 following systems: DNS, FTP, Ident (RFC 931), NIS and NISPlus, NNTP,
1296 Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--to name just a few.
1297
1298 A Simple Client
1299 Here's a client that creates a TCP connection to the "daytime" service
1300 at port 13 of the host name "localhost" and prints out everything that
1301 the server there cares to provide.
1302
1303 #!/usr/bin/perl
1304 use strict;
1305 use warnings;
1306 use IO::Socket;
1307 my $remote = IO::Socket::INET->new(
1308 Proto => "tcp",
1309 PeerAddr => "localhost",
1310 PeerPort => "daytime(13)",
1311 )
1312 || die "can't connect to daytime service on localhost";
1313 while (<$remote>) { print }
1314
1315 When you run this program, you should get something back that looks
1316 like this:
1317
1318 Wed May 14 08:40:46 MDT 1997
1319
1320 Here are what those parameters to the new() constructor mean:
1321
1322 "Proto"
1323 This is which protocol to use. In this case, the socket handle
1324 returned will be connected to a TCP socket, because we want a
1325 stream-oriented connection, that is, one that acts pretty much like
1326 a plain old file. Not all sockets are this of this type. For
1327 example, the UDP protocol can be used to make a datagram socket,
1328 used for message-passing.
1329
1330 "PeerAddr"
1331 This is the name or Internet address of the remote host the server
1332 is running on. We could have specified a longer name like
1333 "www.perl.com", or an address like "207.171.7.72". For
1334 demonstration purposes, we've used the special hostname
1335 "localhost", which should always mean the current machine you're
1336 running on. The corresponding Internet address for localhost is
1337 "127.0.0.1", if you'd rather use that.
1338
1339 "PeerPort"
1340 This is the service name or port number we'd like to connect to.
1341 We could have gotten away with using just "daytime" on systems with
1342 a well-configured system services file,[FOOTNOTE: The system
1343 services file is found in /etc/services under Unixy systems.] but
1344 here we've specified the port number (13) in parentheses. Using
1345 just the number would have also worked, but numeric literals make
1346 careful programmers nervous.
1347
1348 A Webget Client
1349 Here's a simple client that takes a remote host to fetch a document
1350 from, and then a list of files to get from that host. This is a more
1351 interesting client than the previous one because it first sends
1352 something to the server before fetching the server's response.
1353
1354 #!/usr/bin/perl
1355 use strict;
1356 use warnings;
1357 use IO::Socket;
1358 unless (@ARGV > 1) { die "usage: $0 host url ..." }
1359 my $host = shift(@ARGV);
1360 my $EOL = "\015\012";
1361 my $BLANK = $EOL x 2;
1362 for my $document (@ARGV) {
1363 my $remote = IO::Socket::INET->new( Proto => "tcp",
1364 PeerAddr => $host,
1365 PeerPort => "http(80)",
1366 ) || die "cannot connect to httpd on $host";
1367 $remote->autoflush(1);
1368 print $remote "GET $document HTTP/1.0" . $BLANK;
1369 while ( <$remote> ) { print }
1370 close $remote;
1371 }
1372
1373 The web server handling the HTTP service is assumed to be at its
1374 standard port, number 80. If the server you're trying to connect to is
1375 at a different port, like 1080 or 8080, you should specify it as the
1376 named-parameter pair, "PeerPort => 8080". The "autoflush" method is
1377 used on the socket because otherwise the system would buffer up the
1378 output we sent it. (If you're on a prehistoric Mac, you'll also need
1379 to change every "\n" in your code that sends data over the network to
1380 be a "\015\012" instead.)
1381
1382 Connecting to the server is only the first part of the process: once
1383 you have the connection, you have to use the server's language. Each
1384 server on the network has its own little command language that it
1385 expects as input. The string that we send to the server starting with
1386 "GET" is in HTTP syntax. In this case, we simply request each
1387 specified document. Yes, we really are making a new connection for
1388 each document, even though it's the same host. That's the way you
1389 always used to have to speak HTTP. Recent versions of web browsers may
1390 request that the remote server leave the connection open a little
1391 while, but the server doesn't have to honor such a request.
1392
1393 Here's an example of running that program, which we'll call webget:
1394
1395 % webget www.perl.com /guanaco.html
1396 HTTP/1.1 404 File Not Found
1397 Date: Thu, 08 May 1997 18:02:32 GMT
1398 Server: Apache/1.2b6
1399 Connection: close
1400 Content-type: text/html
1401
1402 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1403 <BODY><H1>File Not Found</H1>
1404 The requested URL /guanaco.html was not found on this server.<P>
1405 </BODY>
1406
1407 Ok, so that's not very interesting, because it didn't find that
1408 particular document. But a long response wouldn't have fit on this
1409 page.
1410
1411 For a more featureful version of this program, you should look to the
1412 lwp-request program included with the LWP modules from CPAN.
1413
1414 Interactive Client with IO::Socket
1415 Well, that's all fine if you want to send one command and get one
1416 answer, but what about setting up something fully interactive, somewhat
1417 like the way telnet works? That way you can type a line, get the
1418 answer, type a line, get the answer, etc.
1419
1420 This client is more complicated than the two we've done so far, but if
1421 you're on a system that supports the powerful "fork" call, the solution
1422 isn't that rough. Once you've made the connection to whatever service
1423 you'd like to chat with, call "fork" to clone your process. Each of
1424 these two identical process has a very simple job to do: the parent
1425 copies everything from the socket to standard output, while the child
1426 simultaneously copies everything from standard input to the socket. To
1427 accomplish the same thing using just one process would be much harder,
1428 because it's easier to code two processes to do one thing than it is to
1429 code one process to do two things. (This keep-it-simple principle a
1430 cornerstones of the Unix philosophy, and good software engineering as
1431 well, which is probably why it's spread to other systems.)
1432
1433 Here's the code:
1434
1435 #!/usr/bin/perl
1436 use strict;
1437 use warnings;
1438 use IO::Socket;
1439
1440 unless (@ARGV == 2) { die "usage: $0 host port" }
1441 my ($host, $port) = @ARGV;
1442
1443 # create a tcp connection to the specified host and port
1444 my $handle = IO::Socket::INET->new(Proto => "tcp",
1445 PeerAddr => $host,
1446 PeerPort => $port)
1447 || die "can't connect to port $port on $host: $!";
1448
1449 $handle->autoflush(1); # so output gets there right away
1450 print STDERR "[Connected to $host:$port]\n";
1451
1452 # split the program into two processes, identical twins
1453 die "can't fork: $!" unless defined(my $kidpid = fork());
1454
1455 # the if{} block runs only in the parent process
1456 if ($kidpid) {
1457 # copy the socket to standard output
1458 while (defined (my $line = <$handle>)) {
1459 print STDOUT $line;
1460 }
1461 kill("TERM", $kidpid); # send SIGTERM to child
1462 }
1463 # the else{} block runs only in the child process
1464 else {
1465 # copy standard input to the socket
1466 while (defined (my $line = <STDIN>)) {
1467 print $handle $line;
1468 }
1469 exit(0); # just in case
1470 }
1471
1472 The "kill" function in the parent's "if" block is there to send a
1473 signal to our child process, currently running in the "else" block, as
1474 soon as the remote server has closed its end of the connection.
1475
1476 If the remote server sends data a byte at time, and you need that data
1477 immediately without waiting for a newline (which might not happen), you
1478 may wish to replace the "while" loop in the parent with the following:
1479
1480 my $byte;
1481 while (sysread($handle, $byte, 1) == 1) {
1482 print STDOUT $byte;
1483 }
1484
1485 Making a system call for each byte you want to read is not very
1486 efficient (to put it mildly) but is the simplest to explain and works
1487 reasonably well.
1488
1490 As always, setting up a server is little bit more involved than running
1491 a client. The model is that the server creates a special kind of
1492 socket that does nothing but listen on a particular port for incoming
1493 connections. It does this by calling the "IO::Socket::INET->new()"
1494 method with slightly different arguments than the client did.
1495
1496 Proto
1497 This is which protocol to use. Like our clients, we'll still
1498 specify "tcp" here.
1499
1500 LocalPort
1501 We specify a local port in the "LocalPort" argument, which we
1502 didn't do for the client. This is service name or port number for
1503 which you want to be the server. (Under Unix, ports under 1024 are
1504 restricted to the superuser.) In our sample, we'll use port 9000,
1505 but you can use any port that's not currently in use on your
1506 system. If you try to use one already in used, you'll get an
1507 "Address already in use" message. Under Unix, the "netstat -a"
1508 command will show which services current have servers.
1509
1510 Listen
1511 The "Listen" parameter is set to the maximum number of pending
1512 connections we can accept until we turn away incoming clients.
1513 Think of it as a call-waiting queue for your telephone. The low-
1514 level Socket module has a special symbol for the system maximum,
1515 which is SOMAXCONN.
1516
1517 Reuse
1518 The "Reuse" parameter is needed so that we restart our server
1519 manually without waiting a few minutes to allow system buffers to
1520 clear out.
1521
1522 Once the generic server socket has been created using the parameters
1523 listed above, the server then waits for a new client to connect to it.
1524 The server blocks in the "accept" method, which eventually accepts a
1525 bidirectional connection from the remote client. (Make sure to
1526 autoflush this handle to circumvent buffering.)
1527
1528 To add to user-friendliness, our server prompts the user for commands.
1529 Most servers don't do this. Because of the prompt without a newline,
1530 you'll have to use the "sysread" variant of the interactive client
1531 above.
1532
1533 This server accepts one of five different commands, sending output back
1534 to the client. Unlike most network servers, this one handles only one
1535 incoming client at a time. Multitasking servers are covered in Chapter
1536 16 of the Camel.
1537
1538 Here's the code.
1539
1540 #!/usr/bin/perl
1541 use strict;
1542 use warnings;
1543 use IO::Socket;
1544 use Net::hostent; # for OOish version of gethostbyaddr
1545
1546 my $PORT = 9000; # pick something not in use
1547
1548 my $server = IO::Socket::INET->new( Proto => "tcp",
1549 LocalPort => $PORT,
1550 Listen => SOMAXCONN,
1551 Reuse => 1);
1552
1553 die "can't setup server" unless $server;
1554 print "[Server $0 accepting clients]\n";
1555
1556 while (my $client = $server->accept()) {
1557 $client->autoflush(1);
1558 print $client "Welcome to $0; type help for command list.\n";
1559 my $hostinfo = gethostbyaddr($client->peeraddr);
1560 printf "[Connect from %s]\n",
1561 $hostinfo ? $hostinfo->name : $client->peerhost;
1562 print $client "Command? ";
1563 while ( <$client>) {
1564 next unless /\S/; # blank line
1565 if (/quit|exit/i) { last }
1566 elsif (/date|time/i) { printf $client "%s\n", scalar localtime() }
1567 elsif (/who/i ) { print $client `who 2>&1` }
1568 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` }
1569 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` }
1570 else {
1571 print $client "Commands: quit date who cookie motd\n";
1572 }
1573 } continue {
1574 print $client "Command? ";
1575 }
1576 close $client;
1577 }
1578
1580 Another kind of client-server setup is one that uses not connections,
1581 but messages. UDP communications involve much lower overhead but also
1582 provide less reliability, as there are no promises that messages will
1583 arrive at all, let alone in order and unmangled. Still, UDP offers
1584 some advantages over TCP, including being able to "broadcast" or
1585 "multicast" to a whole bunch of destination hosts at once (usually on
1586 your local subnet). If you find yourself overly concerned about
1587 reliability and start building checks into your message system, then
1588 you probably should use just TCP to start with.
1589
1590 UDP datagrams are not a bytestream and should not be treated as such.
1591 This makes using I/O mechanisms with internal buffering like stdio
1592 (i.e. print() and friends) especially cumbersome. Use syswrite(), or
1593 better send(), like in the example below.
1594
1595 Here's a UDP program similar to the sample Internet TCP client given
1596 earlier. However, instead of checking one host at a time, the UDP
1597 version will check many of them asynchronously by simulating a
1598 multicast and then using select() to do a timed-out wait for I/O. To
1599 do something similar with TCP, you'd have to use a different socket
1600 handle for each host.
1601
1602 #!/usr/bin/perl
1603 use strict;
1604 use warnings;
1605 use Socket;
1606 use Sys::Hostname;
1607
1608 my $SECS_OF_70_YEARS = 2_208_988_800;
1609
1610 my $iaddr = gethostbyname(hostname());
1611 my $proto = getprotobyname("udp");
1612 my $port = getservbyname("time", "udp");
1613 my $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1614
1615 socket(my $socket, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
1616 bind($socket, $paddr) || die "bind: $!";
1617
1618 $| = 1;
1619 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime();
1620 my $count = 0;
1621 for my $host (@ARGV) {
1622 $count++;
1623 my $hisiaddr = inet_aton($host) || die "unknown host";
1624 my $hispaddr = sockaddr_in($port, $hisiaddr);
1625 defined(send($socket, 0, 0, $hispaddr)) || die "send $host: $!";
1626 }
1627
1628 my $rout = my $rin = "";
1629 vec($rin, fileno($socket), 1) = 1;
1630
1631 # timeout after 10.0 seconds
1632 while ($count && select($rout = $rin, undef, undef, 10.0)) {
1633 my $rtime = "";
1634 my $hispaddr = recv($socket, $rtime, 4, 0) || die "recv: $!";
1635 my ($port, $hisiaddr) = sockaddr_in($hispaddr);
1636 my $host = gethostbyaddr($hisiaddr, AF_INET);
1637 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1638 printf "%-12s ", $host;
1639 printf "%8d %s\n", $histime - time(), scalar localtime($histime);
1640 $count--;
1641 }
1642
1643 This example does not include any retries and may consequently fail to
1644 contact a reachable host. The most prominent reason for this is
1645 congestion of the queues on the sending host if the number of hosts to
1646 contact is sufficiently large.
1647
1649 While System V IPC isn't so widely used as sockets, it still has some
1650 interesting uses. However, you cannot use SysV IPC or Berkeley mmap()
1651 to have a variable shared amongst several processes. That's because
1652 Perl would reallocate your string when you weren't wanting it to. You
1653 might look into the "IPC::Shareable" or "threads::shared" modules for
1654 that.
1655
1656 Here's a small example showing shared memory usage.
1657
1658 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);
1659
1660 my $size = 2000;
1661 my $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
1662 defined($id) || die "shmget: $!";
1663 print "shm key $id\n";
1664
1665 my $message = "Message #1";
1666 shmwrite($id, $message, 0, 60) || die "shmwrite: $!";
1667 print "wrote: '$message'\n";
1668 shmread($id, my $buff, 0, 60) || die "shmread: $!";
1669 print "read : '$buff'\n";
1670
1671 # the buffer of shmread is zero-character end-padded.
1672 substr($buff, index($buff, "\0")) = "";
1673 print "un" unless $buff eq $message;
1674 print "swell\n";
1675
1676 print "deleting shm $id\n";
1677 shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
1678
1679 Here's an example of a semaphore:
1680
1681 use IPC::SysV qw(IPC_CREAT);
1682
1683 my $IPC_KEY = 1234;
1684 my $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
1685 defined($id) || die "semget: $!";
1686 print "sem id $id\n";
1687
1688 Put this code in a separate file to be run in more than one process.
1689 Call the file take:
1690
1691 # create a semaphore
1692
1693 my $IPC_KEY = 1234;
1694 my $id = semget($IPC_KEY, 0, 0);
1695 defined($id) || die "semget: $!";
1696
1697 my $semnum = 0;
1698 my $semflag = 0;
1699
1700 # "take" semaphore
1701 # wait for semaphore to be zero
1702 my $semop = 0;
1703 my $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
1704
1705 # Increment the semaphore count
1706 $semop = 1;
1707 my $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
1708 my $opstring = $opstring1 . $opstring2;
1709
1710 semop($id, $opstring) || die "semop: $!";
1711
1712 Put this code in a separate file to be run in more than one process.
1713 Call this file give:
1714
1715 # "give" the semaphore
1716 # run this in the original process and you will see
1717 # that the second process continues
1718
1719 my $IPC_KEY = 1234;
1720 my $id = semget($IPC_KEY, 0, 0);
1721 die unless defined($id);
1722
1723 my $semnum = 0;
1724 my $semflag = 0;
1725
1726 # Decrement the semaphore count
1727 my $semop = -1;
1728 my $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
1729
1730 semop($id, $opstring) || die "semop: $!";
1731
1732 The SysV IPC code above was written long ago, and it's definitely
1733 clunky looking. For a more modern look, see the IPC::SysV module.
1734
1735 A small example demonstrating SysV message queues:
1736
1737 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);
1738
1739 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
1740 defined($id) || die "msgget failed: $!";
1741
1742 my $sent = "message";
1743 my $type_sent = 1234;
1744
1745 msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
1746 || die "msgsnd failed: $!";
1747
1748 msgrcv($id, my $rcvd_buf, 60, 0, 0)
1749 || die "msgrcv failed: $!";
1750
1751 my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);
1752
1753 if ($rcvd eq $sent) {
1754 print "okay\n";
1755 } else {
1756 print "not okay\n";
1757 }
1758
1759 msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!\n";
1760
1762 Most of these routines quietly but politely return "undef" when they
1763 fail instead of causing your program to die right then and there due to
1764 an uncaught exception. (Actually, some of the new Socket conversion
1765 functions do croak() on bad arguments.) It is therefore essential to
1766 check return values from these functions. Always begin your socket
1767 programs this way for optimal success, and don't forget to add the -T
1768 taint-checking flag to the "#!" line for servers:
1769
1770 #!/usr/bin/perl -T
1771 use strict;
1772 use warnings;
1773 use sigtrap;
1774 use Socket;
1775
1777 These routines all create system-specific portability problems. As
1778 noted elsewhere, Perl is at the mercy of your C libraries for much of
1779 its system behavior. It's probably safest to assume broken SysV
1780 semantics for signals and to stick with simple TCP and UDP socket
1781 operations; e.g., don't try to pass open file descriptors over a local
1782 UDP datagram socket if you want your code to stand a chance of being
1783 portable.
1784
1786 Tom Christiansen, with occasional vestiges of Larry Wall's original
1787 version and suggestions from the Perl Porters.
1788
1790 There's a lot more to networking than this, but this should get you
1791 started.
1792
1793 For intrepid programmers, the indispensable textbook is Unix Network
1794 Programming, 2nd Edition, Volume 1 by W. Richard Stevens (published by
1795 Prentice-Hall). Most books on networking address the subject from the
1796 perspective of a C programmer; translation to Perl is left as an
1797 exercise for the reader.
1798
1799 The IO::Socket(3) manpage describes the object library, and the
1800 Socket(3) manpage describes the low-level interface to sockets.
1801 Besides the obvious functions in perlfunc, you should also check out
1802 the modules file at your nearest CPAN site, especially
1803 <http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. See
1804 perlmodlib or best yet, the Perl FAQ for a description of what CPAN is
1805 and where to get it if the previous link doesn't work for you.
1806
1807 Section 5 of CPAN's modules file is devoted to "Networking, Device
1808 Control (modems), and Interprocess Communication", and contains
1809 numerous unbundled modules numerous networking modules, Chat and Expect
1810 operations, CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC,
1811 SNMP, SMTP, Telnet, Threads, and ToolTalk--to name just a few.
1812
1813
1814
1815perl v5.32.1 2021-03-31 PERLIPC(1)