PERLIPC(1)          Perl Programmers Reference Guide          PERLIPC(1)

NAME
perlipc - Perl interprocess communication (signals, fifos, pipes, safe
subprocesses, sockets, and semaphores)

DESCRIPTION
The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and
SysV IPC calls. Each is used in slightly different situations.

Signals
Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers. Each handler is
called with one argument, the name of the signal that triggered it. A
signal may be generated intentionally from a particular keyboard
sequence like control-C or control-Z, sent to you from another process,
or triggered automatically by the kernel when special events transpire,
like a child process exiting, your process running out of stack space,
or hitting a file size limit.

For example, to trap an interrupt signal, set up a handler like this:

    sub catch_zap {
        my $signame = shift;
        $shucks++;
        die "Somebody sent me a SIG$signame";
    }
    $SIG{INT} = 'catch_zap';  # could fail in modules
    $SIG{INT} = \&catch_zap;  # best strategy

Prior to Perl 5.7.3 it was necessary to do as little as you possibly
could in your handler; notice how all we do is set a global variable
and then raise an exception. That's because on most systems, libraries
are not re-entrant; particularly, memory allocation and I/O routines
are not. That meant that doing nearly anything in your handler could
in theory trigger a memory fault and subsequent core dump; see
"Deferred Signals (Safe Signals)" below.

The names of the signals are the ones listed out by "kill -l" on your
system, or you can retrieve them from the Config module. Set up an
@signame list indexed by number to get the name and a %signo table
indexed by name to get the number:

    use Config;
    defined $Config{sig_name} || die "No sigs?";
    $i = 0;  # signal numbers start at zero
    foreach $name (split(' ', $Config{sig_name})) {
        $signo{$name} = $i;
        $signame[$i] = $name;
        $i++;
    }

So to check whether signal 17 and SIGALRM are the same, do just this:

    print "signal #17 = $signame[17]\n";
    if ($signo{ALRM}) {
        print "SIGALRM is $signo{ALRM}\n";
    }

You may also choose to assign the strings 'IGNORE' or 'DEFAULT' as the
handler, in which case Perl will try to discard the signal or do the
default thing.

On most Unix platforms, the "CHLD" (sometimes also known as "CLD")
signal has special behavior with respect to a value of 'IGNORE'.
Setting $SIG{CHLD} to 'IGNORE' on such a platform has the effect of not
creating zombie processes when the parent process fails to "wait()" on
its child processes (i.e. child processes are automatically reaped).
Calling "wait()" with $SIG{CHLD} set to 'IGNORE' usually returns "-1"
on such platforms.
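The automatic reaping described above can be observed with a short
experiment. This is only a sketch, assuming a Unix-like platform that
has the special CHLD behaviour; the subroutine name
wait_after_ignored_child is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: fork a child while SIGCHLD is ignored, then
# report what wait() returns in the parent.
sub wait_after_ignored_child {
    local $SIG{CHLD} = 'IGNORE';
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    exit 0 if $pid == 0;      # child: exit immediately
    my $reaped = wait();      # parent: nothing is left for us to reap
    return ($pid, $reaped);
}

my ($kid, $reaped) = wait_after_ignored_child();
print "child $kid, wait() returned $reaped\n";
```

On platforms without this behaviour, wait() would instead return the
child's PID, so the result should not be relied upon in portable code.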
73
Some signals can be neither trapped nor ignored, such as the KILL and
STOP (but not the TSTP) signals. One strategy for temporarily ignoring
signals is to use a local() statement, whose effect is automatically
undone once your block is exited. (Remember that local() values are
"inherited" by functions called from within that block.)

    sub precious {
        local $SIG{INT} = 'IGNORE';
        &more_functions;
    }
    sub more_functions {
        # interrupts still ignored, for now...
    }

Sending a signal to a negative process ID means that you send the
signal to the entire Unix process-group. This code sends a hang-up
signal to all processes in the current process group (and sets
$SIG{HUP} to IGNORE so it doesn't kill itself):

    {
        local $SIG{HUP} = 'IGNORE';
        kill HUP => -$$;
        # snazzy writing of: kill('HUP', -$$)
    }

Another interesting signal to send is signal number zero. This doesn't
actually affect a child process, but instead checks whether it's alive
or has changed its UID.

    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }

When directed at a process whose UID is not identical to that of the
sending process, signal number zero may fail because you lack
permission to send the signal, even though the process is alive. You
may be able to determine the cause of failure using "%!".

    unless (kill 0 => $pid or $!{EPERM}) {
        warn "$pid looks dead";
    }
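The two checks above can be wrapped in a small subroutine. This is a
minimal sketch; the name pid_alive is hypothetical and not part of any
module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a process with this PID exists, even
# when it belongs to another user and cannot be signalled by us.
sub pid_alive {
    my ($pid) = @_;
    return 1 if kill 0 => $pid;   # signal zero could be delivered
    return $!{EPERM} ? 1 : 0;     # exists, but we lack permission
}

print "our own process alive? ", pid_alive($$), "\n";
```

Note that a PID may be reused by an unrelated process after the
original one exits, so a true result only says that some process
currently holds that number.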
115
You might also want to employ anonymous functions for simple signal
handlers:

    $SIG{INT} = sub { die "\nOutta here!\n" };

But that will be problematic for the more complicated handlers that
need to reinstall themselves. Because Perl's signal mechanism is
currently based on the signal(3) function from the C library, you may
sometimes be so unfortunate as to run on systems where that function
is "broken", that is, it behaves in the old unreliable SysV way rather
than the newer, more reasonable BSD and POSIX fashion. So you'll see
defensive people writing signal handlers like this:

    sub REAPER {
        $waitedpid = wait;
        # loathe sysV: it makes us not only reinstate
        # the handler, but place it after the wait
        $SIG{CHLD} = \&REAPER;
    }
    $SIG{CHLD} = \&REAPER;
    # now do something that forks...

or better still:

    use POSIX ":sys_wait_h";
    sub REAPER {
        my $child;
        # If a second child dies while in the signal handler caused by the
        # first death, we won't get another signal. So must loop here else
        # we will leave the unreaped child as a zombie. And the next time
        # two children die we get another zombie. And so on.
        while (($child = waitpid(-1,WNOHANG)) > 0) {
            $Kid_Status{$child} = $?;
        }
        $SIG{CHLD} = \&REAPER;  # still loathe sysV
    }
    $SIG{CHLD} = \&REAPER;
    # do something that forks...

Signal handling is also used for timeouts in Unix. While safely
protected within an "eval{}" block, you set a signal handler to trap
alarm signals and then schedule to have one delivered to you in some
number of seconds. Then try your blocking operation, clearing the
alarm when it's done but not before you've exited your "eval{}" block.
If it goes off, you'll use die() to jump out of the block, much as you
might using longjmp() or throw() in other languages.

Here's an example:

    eval {
        local $SIG{ALRM} = sub { die "alarm clock restart" };
        alarm 10;
        flock(FH, 2);   # blocking write lock
        alarm 0;
    };
    if ($@ and $@ !~ /alarm clock restart/) { die }

If the operation being timed out is system() or qx(), this technique is
liable to generate zombies. If this matters to you, you'll need to
do your own fork() and exec(), and kill the errant child process.
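One way to do your own fork() and exec() with a timeout is sketched
below. The subroutine name run_with_timeout and its return convention
are assumptions made for this illustration; it relies on the alarm
handler's die() being able to break out of a blocked waitpid().

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @cmd without a shell and kill it if it
# takes longer than $secs seconds. Returns the wait status ($?), or
# undef if the command timed out.
sub run_with_timeout {
    my ($secs, @cmd) = @_;
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {                       # child
        exec { $cmd[0] } @cmd or do {
            warn "cannot exec $cmd[0]: $!";
            POSIX::_exit(127);             # leave without running parent cleanup
        };
    }
    my $status = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $secs;
        waitpid($pid, 0);
        alarm 0;
        $?;                                # value of the eval block
    };
    return $status if defined $status;
    alarm 0;
    die $@ unless $@ eq "timeout\n";       # propagate unexpected errors
    kill 'KILL', $pid;                     # kill the errant child ...
    waitpid($pid, 0);                      # ... and reap it, so no zombie
    return;
}
```

A call such as run_with_timeout(10, 'some_command', '--flag') would
then return undef after roughly ten seconds if the (hypothetical)
command had not finished by then.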
176
For more complex signal handling, you might see the standard POSIX
module. Lamentably, this is almost entirely undocumented, but the
t/lib/posix.t file from the Perl source distribution has some examples
in it.

Handling the SIGHUP Signal in Daemons

A process that usually starts when the system boots and shuts down when
the system is shut down is called a daemon (Disk And Execution
MONitor). If a daemon process has a configuration file which is
modified after the process has been started, there should be a way to
tell that process to re-read its configuration file, without stopping
the process. Many daemons provide this mechanism using the "SIGHUP"
signal handler. When you want to tell the daemon to re-read the file
you simply send it the "SIGHUP" signal.

Not all platforms automatically reinstall their (native) signal
handlers after a signal delivery. This means that the handler works
only the first time the signal is sent. The solution to this problem
is to use "POSIX" signal handlers if available; their behaviour is
well-defined.

The following example implements a simple daemon, which restarts itself
every time the "SIGHUP" signal is received. The actual code is located
in the subroutine "code()", which simply prints some debug info to show
that it works and should be replaced with the real code.

    #!/usr/bin/perl -w

    use POSIX ();
    use FindBin ();
    use File::Basename ();
    use File::Spec::Functions;

    $|=1;

    # make the daemon cross-platform, so exec always calls the script
    # itself with the right path, no matter how the script was invoked.
    my $script = File::Basename::basename($0);
    my $SELF = catfile $FindBin::Bin, $script;

    # POSIX unmasks the sigprocmask properly
    my $sigset = POSIX::SigSet->new();
    my $action = POSIX::SigAction->new('sigHUP_handler',
                                       $sigset,
                                       &POSIX::SA_NODEFER);
    POSIX::sigaction(&POSIX::SIGHUP, $action);

    sub sigHUP_handler {
        print "got SIGHUP\n";
        exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
    }

    code();

    sub code {
        print "PID: $$\n";
        print "ARGV: @ARGV\n";
        my $c = 0;
        while (++$c) {
            sleep 2;
            print "$c\n";
        }
    }
    __END__

Named Pipes
A named pipe (often referred to as a FIFO) is an old Unix IPC mechanism
for processes communicating on the same machine. It works just like
regular, connected anonymous pipes, except that the processes
rendezvous using a filename and don't have to be related.

To create a named pipe, use the "POSIX::mkfifo()" function.

    use POSIX qw(mkfifo);
    mkfifo($path, 0700) or die "mkfifo $path failed: $!";

You can also use the Unix command mknod(1) or, on some systems,
mkfifo(1). These may not be in your normal path.

    # system return val is backwards, so && not ||
    #
    $ENV{PATH} .= ":/etc:/usr/etc";
    if (    system('mknod',  $path, 'p')
         && system('mkfifo', $path) )
    {
        die "mk{nod,fifo} $path failed";
    }

A fifo is convenient when you want to connect a process to an unrelated
one. When you open a fifo, the program will block until there's
something on the other end.
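The blocking rendezvous can be seen in a small self-contained program.
This is a sketch assuming a Unix-like system; the fifo name and the
message are invented, and a forked child stands in for the "unrelated"
process so that the example fits in one file.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# Create the fifo in a scratch directory that is removed on exit.
my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/demo.fifo";
mkfifo($fifo, 0700) or die "mkfifo $fifo failed: $!";

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {
    # child: this open blocks until the parent opens the fifo to read
    open(my $out, '>', $fifo) or die "can't write $fifo: $!";
    print $out "hello through the fifo\n";
    close $out;
    POSIX::_exit(0);          # skip the parent's cleanup handlers
}

# parent: this open blocks until the child opens the fifo to write
open(my $in, '<', $fifo) or die "can't read $fifo: $!";
my $line = <$in>;
close $in;
waitpid($pid, 0);
print "read: $line";
```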
269
For example, let's say you'd like to have your .signature file be a
named pipe that has a Perl program on the other end. Now every time
any program (like a mailer, news reader, finger program, etc.) tries to
read from that file, the reading program will block and your program
will supply the new signature. We'll use the pipe-checking file test
-p to find out whether anyone (or anything) has accidentally removed
our fifo.

    chdir; # go home
    $FIFO = '.signature';

    while (1) {
        unless (-p $FIFO) {
            unlink $FIFO;
            require POSIX;
            POSIX::mkfifo($FIFO, 0700)
                or die "can't mkfifo $FIFO: $!";
        }

        # next line blocks until there's a reader
        open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
        print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
        close FIFO;
        sleep 2;    # to avoid dup signals
    }

Deferred Signals (Safe Signals)

In Perls before Perl 5.7.3, by installing Perl code to deal with
signals, you were exposing yourself to danger from two things. First,
few system library functions are re-entrant. If the signal interrupts
while Perl is executing one function (like malloc(3) or printf(3)), and
your signal handler then calls the same function again, you could get
unpredictable behavior--often, a core dump. Second, Perl isn't itself
re-entrant at the lowest levels. If the signal interrupts Perl while
Perl is changing its own internal data structures, similarly
unpredictable behaviour may result.

There were two things you could do, knowing this: be paranoid or be
pragmatic. The paranoid approach was to do as little as possible in
your signal handler. Set an existing integer variable that already has
a value, and return. This doesn't help you if you're in a slow system
call, which will just restart. That means you have to "die" to
longjmp(3) out of the handler. Even this is a little cavalier for the
true paranoiac, who avoids "die" in a handler because the system is out
to get you. The pragmatic approach was to say "I know the risks, but
prefer the convenience", and to do anything you wanted in your signal
handler, and be prepared to clean up core dumps now and again.

In Perl 5.7.3 and later, to avoid these problems, signals are
"deferred"--that is, when the signal is delivered to the process by
the system (to the C code that implements Perl) a flag is set, and the
handler returns immediately. Then at strategic "safe" points in the
Perl interpreter (e.g. when it is about to execute a new opcode) the
flags are checked and the Perl level handler from %SIG is executed.
The "deferred" scheme allows much more flexibility in the coding of
signal handlers, as we know the Perl interpreter is in a safe state
and that we are not in a system library function when the handler is
called. However, the implementation does differ from previous Perls in
the following ways:

Long running opcodes
    As the Perl interpreter only looks at the signal flags when it is
    about to execute a new opcode, a signal that arrives during a
    long-running opcode (e.g. a regular expression operation on a very
    large string) will not be seen until that operation completes.

Interrupting IO
    When a signal is delivered (e.g. INT control-C) the operating
    system breaks into IO operations like "read" (used to implement
    Perl's <> operator). On older Perls the handler was called
    immediately (and as "read" is not "unsafe" this worked well). With
    the "deferred" scheme the handler is not called immediately, and if
    Perl is using the system's "stdio" library that library may
    re-start the "read" without returning to Perl and giving it a
    chance to call the %SIG handler. If this happens on your system the
    solution is to use the ":perlio" layer to do IO, at least on those
    handles which you want to be able to break into with signals. (The
    ":perlio" layer checks the signal flags and calls %SIG handlers
    before resuming the IO operation.)

    Note that the default in Perl 5.7.3 and later is to automatically
    use the ":perlio" layer.

    Note that some networking library functions like gethostbyname()
    are known to have their own implementations of timeouts which may
    conflict with your timeouts. If you are having problems with such
    functions, you can try using the POSIX sigaction() function, which
    bypasses the Perl safe signals (note that this means subjecting
    yourself to possible memory corruption, as described above).
    Instead of setting $SIG{ALRM}:

        local $SIG{ALRM} = sub { die "alarm" };

    try something like the following:

        use POSIX qw(SIGALRM);
        POSIX::sigaction(SIGALRM,
                         POSIX::SigAction->new(sub { die "alarm" }))
            or die "Error setting SIGALRM handler: $!\n";

Restartable system calls
    On systems that supported it, older versions of Perl used the
    SA_RESTART flag when installing %SIG handlers. This meant that
    restartable system calls would continue rather than returning when
    a signal arrived. In order to deliver deferred signals promptly,
    Perl 5.7.3 and later do not use SA_RESTART. Consequently,
    restartable system calls can fail (with $! set to "EINTR") in
    places where they previously would have succeeded.

    Note that the default ":perlio" layer will retry "read", "write"
    and "close" as described above and that interrupted "wait" and
    "waitpid" calls will always be retried.

Signals as "faults"
    Certain signals, e.g. SEGV, ILL, and BUS, are generated as a result
    of virtual memory or other "faults". These are normally fatal and
    there is little a Perl-level handler can do with them. (The old
    signal scheme was particularly unsafe in such cases.) However, if a
    %SIG handler is set, the new scheme simply sets a flag and returns
    as described above. This may cause the operating system to try the
    offending machine instruction again and, as nothing has changed, it
    will generate the signal again. The result of this is a rather odd
    "loop". In future Perl's signal mechanism may be changed to avoid
    this, perhaps by simply disallowing %SIG handlers on signals of
    that type. Until then the workaround is not to set a %SIG handler
    on those signals. (Which signals they are is operating system
    dependent.)

Signals triggered by operating system state
    On some operating systems certain signal handlers are supposed to
    "do something" before returning. One example can be CHLD or CLD,
    which indicates a child process has completed. On some operating
    systems the signal handler is expected to "wait" for the completed
    child process. On such systems the deferred signal scheme will not
    work for those signals (it does not do the "wait"). Again the
    failure will look like a loop, as the operating system will
    re-issue the signal because there are un-waited-for completed child
    processes.

If you want the old signal behaviour back regardless of possible memory
corruption, set the environment variable "PERL_SIGNALS" to "unsafe" (a
new feature since Perl 5.8.1).
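Because restartable system calls can now fail with "EINTR", code that
calls them directly may want to retry. The following is only a sketch:
the subroutine name sysread_retry is hypothetical, and it assumes that
sysread(), which does not go through the ":perlio" buffering, reports
the interruption to the caller instead of retrying by itself.

```perl
use strict;
use warnings;

# Hypothetical helper: sysread() that starts over when a signal
# interrupts it. Returns the data read, or '' at end of file.
sub sysread_retry {
    my ($fh, $len) = @_;
    my $buf = '';
    while (1) {
        my $n = sysread($fh, $buf, $len);
        return $buf if defined $n;    # got data, or 0 bytes at EOF
        next if $!{EINTR};            # interrupted by a signal: retry
        die "sysread failed: $!";
    }
}

pipe(my $rd, my $wr) or die "pipe: $!";
my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {                      # child: write after a delay
    close $rd;
    sleep 2;
    syswrite($wr, "late data\n");
    exit 0;
}
close $wr;
my $alarms = 0;
local $SIG{ALRM} = sub { $alarms++ }; # goes off while sysread is blocked
alarm 1;
my $data = sysread_retry($rd, 1024);
waitpid($pid, 0);
print "alarms seen: $alarms, data: $data";
```

The handler here only counts; a handler that calls die() would instead
abandon the read, as in the timeout idiom shown earlier.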
412
Using open() for IPC
Perl's basic open() statement can also be used for unidirectional
interprocess communication by either appending or prepending a pipe
symbol to the second argument to open(). Here's how to start something
up in a child process you intend to write to:

    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print SPOOLER "stuff\n";
    close SPOOLER || die "bad spool: $! $?";

And here's how to start up a child process you intend to read from:

    open(STATUS, "netstat -an 2>&1 |")
        || die "can't fork: $!";
    while (<STATUS>) {
        next if /^(tcp|udp)/;
        print;
    }
    close STATUS || die "bad netstat: $! $?";

If one can be sure that a particular program is a Perl script that is
expecting filenames in @ARGV, the clever programmer can write something
like this:

    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

and irrespective of which shell it's called from, the Perl program will
read from the file f1, the process cmd1, standard input (tmpfile in
this case), the f2 file, the cmd2 command, and finally the f3 file.
Pretty nifty, eh?

You might notice that you could use backticks for much the same effect
as opening a pipe for reading:

    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
    die "bad netstat" if $?;

While this is true on the surface, it's much more efficient to process
the file one line or record at a time because then you don't have to
read the whole thing into memory at once. It also gives you finer
control of the whole process, letting you kill off the child process
early if you'd like.
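Killing the child early might look like the sketch below. The "noisy"
command is a stand-in built from the running perl itself ($^X), and
the example uses the list form of open(), which needs Perl 5.8.0 or
later.

```perl
use strict;
use warnings;

# Stand-in for a command that produces far more output than we want.
my @cmd = ($^X, '-e', '$| = 1; print "line $_\n" for 1 .. 1_000_000');
my $pid = open(my $kid, '-|', @cmd) or die "can't fork: $!";

my @first;
while (my $line = <$kid>) {
    push @first, $line;
    last if @first == 3;      # we have seen enough
}
kill 'TERM', $pid;            # stop the child early
close $kid;                   # reaps the child; $? holds its status
print @first;
```

The close() is expected to return false here, since the child was
terminated by a signal rather than exiting normally, so its return
value is deliberately not checked.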
457
Be careful to check both the open() and the close() return values. If
you're writing to a pipe, you should also trap SIGPIPE. Otherwise,
think of what happens when you start up a pipe to a command that
doesn't exist: the open() will in all likelihood succeed (it only
reflects the fork()'s success), but then your output will
fail--spectacularly. Perl can't know whether the command worked because
your command is actually running in a separate process whose exec()
might have failed. Therefore, while readers of bogus commands return
just a quick end of file, writers to bogus commands will trigger a
signal they'd better be prepared to handle. Consider:

    open(FH, "|bogus")  or die "can't fork: $!";
    print FH "bang\n"   or die "can't write: $!";
    close FH            or die "can't close: $!";

That won't blow up until the close, and it will blow up with a SIGPIPE.
To catch it, you could use this:

    $SIG{PIPE} = 'IGNORE';
    open(FH, "|bogus")  or die "can't fork: $!";
    print FH "bang\n"   or die "can't write: $!";
    close FH            or die "can't close: status=$?";

Filehandles

Both the main process and any child processes it forks share the same
STDIN, STDOUT, and STDERR filehandles. If both processes try to access
them at once, strange things can happen. You may also want to close or
reopen the filehandles for the child. You can get around this by
opening your pipe with open(), but on some systems this means that the
child process cannot outlive the parent.

Background Processes

You can run a command in the background with:

    system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your
shell) will be the same as the parent's. You won't need to catch
SIGCHLD because of the double-fork taking place (see below for more
details).

Complete Dissociation of Child from Parent

In some cases (starting server processes, for instance) you'll want to
completely dissociate the child process from the parent. This is often
called daemonization. A well-behaved daemon will also chdir() to the
root directory (so it doesn't prevent unmounting the filesystem
containing the directory from which it was launched) and redirect its
standard file descriptors from and to /dev/null (so that random output
doesn't wind up on the user's terminal).

    use POSIX 'setsid';

    sub daemonize {
        chdir '/'               or die "Can't chdir to /: $!";
        open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
        open STDOUT, '>/dev/null'
                                or die "Can't write to /dev/null: $!";
        defined(my $pid = fork) or die "Can't fork: $!";
        exit if $pid;
        setsid                  or die "Can't start a new session: $!";
        open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
    }

The fork() has to come before the setsid() to ensure that you aren't a
process group leader (the setsid() will fail if you are). If your
system doesn't have the setsid() function, open /dev/tty and use the
"TIOCNOTTY" ioctl() on it instead. See tty(4) for details.

Non-Unix users should check their Your_OS::Process module for other
solutions.

Safe Pipe Opens

Another interesting approach to IPC is making your single program go
multiprocess and communicate between (or even amongst) yourselves. The
open() function will accept a file argument of either "-|" or "|-" to
do a very interesting thing: it forks a child connected to the
filehandle you've opened. The child is running the same program as the
parent. This is useful for safely opening a file when running under an
assumed UID or GID, for example. If you open a pipe to minus, you can
write to the filehandle you opened and your kid will find it in its
STDIN. If you open a pipe from minus, you can read from the filehandle
you opened whatever your kid writes to its STDOUT.

    use English '-no_match_vars';
    my $sleep_count = 0;

    do {
        $pid = open(KID_TO_WRITE, "|-");
        unless (defined $pid) {
            warn "cannot fork: $!";
            die "bailing out" if $sleep_count++ > 6;
            sleep 10;
        }
    } until defined $pid;

    if ($pid) {  # parent
        print KID_TO_WRITE @some_data;
        close(KID_TO_WRITE) || warn "kid exited $?";
    } else {     # child
        ($EUID, $EGID) = ($UID, $GID); # suid progs only
        open (FILE, "> /safe/file")
            || die "can't open /safe/file: $!";
        while (<STDIN>) {
            print FILE; # child's STDIN is parent's KID_TO_WRITE
        }
        exit;  # don't forget this
    }

Another common use for this construct is when you need to execute
something without the shell's interference. With system(), it's
straightforward, but you can't use a pipe open or backticks safely.
That's because there's no way to stop the shell from getting its hands
on your arguments. Instead, use lower-level control to call exec()
directly.

Here's a safe backtick or pipe open for read:

    # add error processing as above
    $pid = open(KID_TO_READ, "-|");

    if ($pid) {   # parent
        while (<KID_TO_READ>) {
            # do something interesting
        }
        close(KID_TO_READ) || warn "kid exited $?";

    } else {      # child
        ($EUID, $EGID) = ($UID, $GID); # suid only
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

And here's a safe pipe open for writing:

    # add error processing as above
    $pid = open(KID_TO_WRITE, "|-");
    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };

    if ($pid) {  # parent
        for (@data) {
            print KID_TO_WRITE;
        }
        close(KID_TO_WRITE) || warn "kid exited $?";

    } else {     # child
        ($EUID, $EGID) = ($UID, $GID);
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

Since Perl 5.8.0, you can also use the list form of "open" for pipes:
the syntax

    open KID_PS, "-|", "ps", "aux" or die $!;

forks the ps(1) command (without spawning a shell, as there are more
than three arguments to open()), and reads its standard output via the
"KID_PS" filehandle. The corresponding syntax to write to command
pipes (with "|-" in place of "-|") is also implemented.
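To see that no shell gets its hands on the arguments, the sketch below
passes a string full of shell metacharacters through a list-form pipe
open. The child is a one-liner run by the current perl ($^X) so that
the example does not depend on any external command; the argument text
is invented.

```perl
use strict;
use warnings;

# This argument would be split and partly executed by a shell.
my $tricky = 'a b; echo pwned';

open(my $kid, '-|', $^X, '-e', 'print "[$_]\n" for @ARGV', $tricky)
    or die "can't fork: $!";
my @out = <$kid>;
close $kid or warn "kid exited $?";

print @out;   # the child saw one single argument
```

With the two-argument or single-string forms, the same text would have
been handed to the shell for interpretation.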
622
Note that these operations are full Unix forks, which means they may
not be correctly implemented on alien systems. Additionally, these are
not true multithreading. If you'd like to learn more about threading,
see the modules file mentioned below in the SEE ALSO section.

Bidirectional Communication with Another Process

While this works reasonably well for unidirectional communication, what
about bidirectional communication? The obvious thing you'd like to do
doesn't actually work:

    open(PROG_FOR_READING_AND_WRITING, "| some program |")

and if you forget to use the "use warnings" pragma or the -w flag, then
you'll miss out entirely on the diagnostic message:

    Can't do bidirectional pipe at -e line 1.

If you really want to, you can use the standard open2() library
function to catch both ends. There's also an open3() for tridirectional
I/O so you can also catch your child's STDERR, but doing so would then
require an awkward select() loop and wouldn't allow you to use normal
Perl input operations.

If you look at its source, you'll see that open2() uses low-level
primitives like Unix pipe() and exec() calls to create all the
connections. While it might have been slightly more efficient by using
socketpair(), it would have then been even less portable than it
already is. The open2() and open3() functions are unlikely to work
anywhere except on a Unix system or some other one purporting to be
POSIX compliant.

Here's an example of using open2():

    use FileHandle;
    use IPC::Open2;
    $pid = open2(*Reader, *Writer, "cat -u -n" );
    print Writer "stuff\n";
    $got = <Reader>;

The problem with this is that Unix buffering is really going to ruin
your day. Even though your "Writer" filehandle is auto-flushed, and
the process on the other end will get your data in a timely manner, you
can't usually do anything to force it to give it back to you in a
similarly quick fashion. In this case, we could, because we gave cat a
-u flag to make it unbuffered. But very few Unix commands are designed
to operate over pipes, so this seldom works unless you yourself wrote
the program on the other end of the double-ended pipe.
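When a true back-and-forth dialogue isn't needed, one way to sidestep
the buffering problem is to write all the input, close the writing
end, and only then read. The sketch below assumes the input is small
enough to fit in the pipe without the child's output backing up; the
child is a perl one-liner ($^X) standing in for a filter program.

```perl
use strict;
use warnings;
use IPC::Open2;

# Child: an uppercasing filter that buffers its output as it pleases.
my $pid = open2(my $reader, my $writer, $^X, '-ne', 'print uc');

print $writer "stuff\n", "more stuff\n";
close $writer;                # child sees EOF and flushes on exit

my @got = <$reader>;          # now everything it wrote is available
close $reader;
waitpid($pid, 0);             # open2() leaves reaping the child to us

print @got;
```

With large inputs this pattern can deadlock, because the child may
block writing its output while the parent is still blocked writing
input.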
670
A solution to this is the nonstandard Comm.pl library. It uses
pseudo-ttys to make your program behave more reasonably:

    require 'Comm.pl';
    $ph = open_proc('cat -n');
    for (1..10) {
        print $ph "a line\n";
        print "got back ", scalar <$ph>;
    }

This way you don't have to have control over the source code of the
program you're using. The Comm library also has expect() and
interact() functions. Find the library (and we hope its successor
IPC::Chat) at your nearest CPAN archive as detailed in the SEE ALSO
section below.

The newer Expect.pm module from CPAN also addresses this kind of thing.
This module requires two other modules from CPAN: IO::Pty and IO::Stty.
It sets up a pseudo-terminal to interact with programs that insist on
talking to the terminal device driver. If your system is amongst those
supported, this may be your best bet.

Bidirectional Communication with Yourself

If you want, you may make low-level pipe() and fork() calls to stitch
this together by hand. This example only talks to itself, but you could
reopen the appropriate handles to STDIN and STDOUT and call other
processes.

    #!/usr/bin/perl -w
    # pipe1 - bidirectional communication using two pipe pairs
    #         designed for the socketpair-challenged
    use IO::Handle;     # thousands of lines just for autoflush :-(
    pipe(PARENT_RDR, CHILD_WTR) or die "pipe: $!";
    pipe(CHILD_RDR,  PARENT_WTR) or die "pipe: $!";
    CHILD_WTR->autoflush(1);
    PARENT_WTR->autoflush(1);

    if ($pid = fork) {
        close PARENT_RDR; close PARENT_WTR;
        print CHILD_WTR "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD_RDR>);
        print "Parent Pid $$ just read this: `$line'\n";
        close CHILD_RDR; close CHILD_WTR;
        waitpid($pid,0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close CHILD_RDR; close CHILD_WTR;
        chomp($line = <PARENT_RDR>);
        print "Child Pid $$ just read this: `$line'\n";
        print PARENT_WTR "Child Pid $$ is sending this\n";
        close PARENT_RDR; close PARENT_WTR;
        exit;
    }

But you don't actually have to make two pipe calls. If you have the
socketpair() system call, it will do this all for you.

    #!/usr/bin/perl -w
    # pipe2 - bidirectional communication using socketpair
    #   "the best ones always go both ways"

    use Socket;
    use IO::Handle;     # thousands of lines just for autoflush :-(
    # We say AF_UNIX because although *_LOCAL is the
    # POSIX 1003.1g form of the constant, many machines
    # still don't have it.
    socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    CHILD->autoflush(1);
    PARENT->autoflush(1);

    if ($pid = fork) {
        close PARENT;
        print CHILD "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD>);
        print "Parent Pid $$ just read this: `$line'\n";
        close CHILD;
        waitpid($pid,0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close CHILD;
        chomp($line = <PARENT>);
        print "Child Pid $$ just read this: `$line'\n";
        print PARENT "Child Pid $$ is sending this\n";
        close PARENT;
        exit;
    }
760
762 While not limited to Unix-derived operating systems (e.g., WinSock on
763 PCs provides socket support, as do some VMS libraries), you may not
764 have sockets on your system, in which case this section probably isn't
765 going to do you much good. With sockets, you can do both virtual cir‐
766 cuits (i.e., TCP streams) and datagrams (i.e., UDP packets). You may
767 be able to do even more depending on your system.
768
769 The Perl function calls for dealing with sockets have the same names as
770 the corresponding system calls in C, but their arguments tend to differ
771 for two reasons: first, Perl filehandles work differently than C file
772 descriptors. Second, Perl already knows the length of its strings, so
773 you don't need to pass that information.
774
One of the major problems with old socket code in Perl was that it used
hard-coded values for some of the constants, which severely hurt
portability. If you ever see code that does anything like explicitly
setting "$AF_INET = 2", you know you're in for big trouble. An
immeasurably superior approach is to use the "Socket" module, which more
reliably grants access to various constants and functions you'll need.
781
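As a minimal sketch of what the "Socket" module buys you, the fragment
below builds and then unpacks an Internet address using only symbolic
names. The port number 2345 and the loopback address are arbitrary values
chosen for illustration.

```perl
use strict;
use warnings;
use Socket qw(AF_INET inet_aton inet_ntoa sockaddr_in);

# The numeric value of AF_INET comes from the system headers,
# so it never has to be written into the program.
print "AF_INET is ", AF_INET, " on this system\n";

# Pack a dotted-quad address; no name lookup is needed for 127.0.0.1.
my $packed_ip = inet_aton("127.0.0.1") or die "bad address";

# In scalar context sockaddr_in() packs a port and address together ...
my $paddr = sockaddr_in(2345, $packed_ip);

# ... and in list context it unpacks them again.
my ($port, $ip) = sockaddr_in($paddr);
my $dotted = inet_ntoa($ip);

print "port $port at $dotted\n";
```

Because the packed address is an ordinary Perl string, no separate
length argument is passed anywhere.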
782 If you're not writing a server/client for an existing protocol like
783 NNTP or SMTP, you should give some thought to how your server will know
784 when the client has finished talking, and vice-versa. Most protocols
785 are based on one-line messages and responses (so one party knows the
786 other has finished when a "\n" is received) or multi-line messages and
787 responses that end with a period on an empty line ("\n.\n" terminates a
788 message/response).
789
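The second convention can be sketched as follows. The helper name
read_dot_message() is hypothetical, and an in-memory filehandle stands in
for a real socket so that the fragment runs on its own.

```perl
use strict;
use warnings;

# Read one multi-line message from $fh. The message ends at a line
# holding only a period. Returns an array reference of lines, or
# undef if the stream ends before the terminator arrives.
sub read_dot_message {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;       # strip CRLF or a bare LF
        return \@lines if $line eq '.';
        push @lines, $line;
    }
    return;
}

# An in-memory filehandle stands in for a socket here.
my $wire = "first line\015\012second line\015\012.\015\012leftover\015\012";
open(my $fh, '<', \$wire) or die "open: $!";

my $message = read_dot_message($fh);
print scalar(@$message), " lines received\n";
```

Anything after the terminator stays unread on the handle, ready for the
next message.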
790 Internet Line Terminators
791
The Internet line terminator is "\015\012". Under ASCII variants of
Unix, that could usually be written as "\r\n", but under other systems,
"\r\n" might at times be "\015\015\012", "\012\012\015", or something
completely different. The standards specify writing "\015\012" to be
conformant (be strict in what you provide), but they also recommend
accepting a lone "\012" on input (be lenient in what you require).
We haven't always been very good about that in the code in this
manpage, but unless you're on a Mac, you'll probably be ok.
800
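One way to follow that advice is a pair of small helpers, sketched here.
The names net_line() and net_chomp() and the sample strings are made up
for illustration.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Be strict in what you send: always terminate with CRLF.
sub net_line {
    my ($text) = @_;
    return $text . $CRLF;
}

# Be lenient in what you accept: strip CRLF or a lone LF.
sub net_chomp {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

my $sent = net_line("HELO example");
my @got  = map { net_chomp($_) } ("250 ok\015\012", "250 ok\012");
printf "sent %d bytes; received '%s' and '%s'\n", length($sent), @got;
```

Writing the octal escapes rather than "\r\n" keeps the bytes on the wire
the same on every platform.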
801 Internet TCP Clients and Servers
802
803 Use Internet-domain sockets when you want to do client-server communi‐
804 cation that might extend to machines outside of your own system.
805
806 Here's a sample TCP client using Internet-domain sockets:
807
    #!/usr/bin/perl -w
    use strict;
    use Socket;
    my ($remote, $port, $iaddr, $paddr, $proto, $line);

    $remote = shift || 'localhost';
    $port   = shift || 2345;  # random port
    if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
    die "No port" unless $port;
    $iaddr  = inet_aton($remote) || die "no host: $remote";
    $paddr  = sockaddr_in($port, $iaddr);

    $proto  = getprotobyname('tcp');
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    connect(SOCK, $paddr) || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }

    close (SOCK) || die "close: $!";
    exit;
829
And here's a corresponding server to go along with it. We'll leave the
address as INADDR_ANY so that the kernel can choose the appropriate
interface on multihomed hosts. If you want to sit on a particular
interface (like the external side of a gateway or firewall machine), you
should fill this in with your real address instead.
835
    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $port  = shift || 2345;
    my $proto = getprotobyname('tcp');

    ($port) = $port =~ /^(\d+)$/ or die "invalid port";

    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
               pack("l", 1)) || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";

    logmsg "server started on port $port";

    my $paddr;

    # this server never forks, so no SIGCHLD handler is needed
    for ( ; $paddr = accept(Client, Server); close Client) {
        my ($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);

        logmsg "connection from $name [",
               inet_ntoa($iaddr), "] at port $port";

        print Client "Hello there, $name, it's now ",
                     scalar localtime, $EOL;
    }
873
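To sit on one particular interface, pass its address to bind() in place
of INADDR_ANY. The sketch below uses the loopback interface because every
machine has one. Asking for port 0 and reading the result back with
getsockname() are choices made for this illustration only.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOL_SOCKET SO_REUSEADDR
              INADDR_LOOPBACK IPPROTO_TCP sockaddr_in inet_ntoa);

socket(my $server, PF_INET, SOCK_STREAM, IPPROTO_TCP) || die "socket: $!";
setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
    || die "setsockopt: $!";

# Port 0 asks the kernel for any free port; INADDR_LOOPBACK limits the
# listener to the local machine instead of every interface.
bind($server, sockaddr_in(0, INADDR_LOOPBACK)) || die "bind: $!";
listen($server, 5) || die "listen: $!";

# getsockname() reports the address and port actually bound.
my ($port, $iaddr) = sockaddr_in(getsockname($server));
my $where = inet_ntoa($iaddr);
print "listening on $where port $port\n";

close($server);
```

A server bound this way is unreachable from other hosts, which is
sometimes exactly what you want.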
And here's a multitasking version. It's multitasked in that, like
most typical servers, it spawns (forks) a child server to handle the
client request so that the parent server can quickly go back to
servicing a new client.
878
    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $port  = shift || 2345;
    my $proto = getprotobyname('tcp');

    ($port) = $port =~ /^(\d+)$/ or die "invalid port";

    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
               pack("l", 1)) || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";

    logmsg "server started on port $port";

    my $waitedpid = 0;
    my $paddr;

    use POSIX ":sys_wait_h";
    sub REAPER {
        # reap every child that has exited, without blocking
        while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
        }
        $SIG{CHLD} = \&REAPER;  # loathe sysV
    }

    $SIG{CHLD} = \&REAPER;

    for ( $waitedpid = 0;
          ($paddr = accept(Client, Server)) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        # accept() was interrupted by SIGCHLD; go back and wait again
        next if $waitedpid and not $paddr;
        my ($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);

        logmsg "connection from $name [",
               inet_ntoa($iaddr), "] at port $port";

        spawn sub {
            $| = 1;
            print "Hello there, $name, it's now ", scalar localtime, $EOL;
            exec '/usr/games/fortune'  # XXX: `wrong' line terminators
                or confess "can't exec fortune: $!";
        };

    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        if (!defined($pid = fork)) {
            logmsg "cannot fork: $!";
            return;
        } elsif ($pid) {
            logmsg "begat $pid";
            return;  # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN,  "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit &$coderef();
    }
959
960 This server takes the trouble to clone off a child version via fork()
961 for each incoming request. That way it can handle many requests at
962 once, which you might not always want. Even if you don't fork(), the
963 listen() will allow that many pending connections. Forking servers
964 have to be particularly careful about cleaning up their dead children
965 (called "zombies" in Unix parlance), because otherwise you'll quickly
966 fill up your process table.
967
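The clean-up step can be tried in isolation. This sketch forks three
short-lived children and collects them with the same non-blocking
waitpid() loop that REAPER uses. The %exit_code hash, the polling loop,
and the exit codes 1 to 3 are illustrative assumptions, not part of the
server above.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %exit_code;   # pid => exit status of each reaped child

# Collect every child that has already exited, without blocking.
sub reap_children {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_code{$pid} = $? >> 8;
    }
}

my @kids;
for my $code (1 .. 3) {
    my $pid = fork;
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        POSIX::_exit($code);    # child: leave at once, skipping cleanup
    }
    push @kids, $pid;
}

# Poll briefly until all three are reaped (at most about five seconds).
for (1 .. 100) {
    reap_children();
    last if keys(%exit_code) == @kids;
    select(undef, undef, undef, 0.05);
}

print "reaped pid $_ with exit $exit_code{$_}\n" for @kids;
```

A real server would call the loop from its CHLD handler rather than
polling, as the REAPER subroutine does.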
We suggest that you use the -T flag to enable taint checking (see perlsec)
even if you aren't running setuid or setgid. This is always a good idea
for servers and other programs run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside
will be able to compromise your system.
973
974 Let's look at another TCP client. This one connects to the TCP "time"
975 service on a number of different machines and shows how far their
976 clocks differ from the system on which it's being run:
977
    #!/usr/bin/perl -w
    use strict;
    use Socket;

    my $SECS_of_70_YEARS = 2208988800;
    sub ctime { scalar localtime(shift) }

    my $iaddr = gethostbyname('localhost');
    my $proto = getprotobyname('tcp');
    my $port  = getservbyname('time', 'tcp');
    my $paddr = sockaddr_in(0, $iaddr);
    my ($host);

    $| = 1;
    printf "%-24s %8s %s\n", "localhost", 0, ctime(time());

    foreach $host (@ARGV) {
        printf "%-24s ", $host;
        my $hisiaddr = inet_aton($host) || die "unknown host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        socket(SOCKET, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
        connect(SOCKET, $hispaddr) || die "connect: $!";
        my $rtime = ' ';
        read(SOCKET, $rtime, 4);
        close(SOCKET);
        # the server counts seconds from 1900; convert to the Unix epoch
        my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%8d %s\n", $histime - time, ctime($histime);
    }
1006
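The arithmetic in that client can be checked without a network. The
"time" service (RFC 868) answers with a 32-bit count of seconds since
1900 in network byte order. The helper name time_reply_to_epoch() and the
fabricated reply below are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Seconds between 1900-01-01 and 1970-01-01, both at 00:00:00 UTC.
my $SECS_of_70_YEARS = 2208988800;

# Convert the four-byte, network-order reply of a "time" server into
# Unix epoch seconds. Returns undef unless exactly four bytes arrived.
sub time_reply_to_epoch {
    my ($raw) = @_;
    return undef unless defined $raw && length($raw) == 4;
    return unpack("N", $raw) - $SECS_of_70_YEARS;
}

# A fabricated reply meaning "one day after the Unix epoch".
my $reply = pack("N", $SECS_of_70_YEARS + 86400);
my $epoch = time_reply_to_epoch($reply);
print "epoch seconds: $epoch\n";
```

Checking the length first guards against a short read, which the client
above does not test for.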
1007 Unix-Domain TCP Clients and Servers
1008
1009 That's fine for Internet-domain clients and servers, but what about
1010 local communications? While you can use the same setup, sometimes you
1011 don't want to. Unix-domain sockets are local to the current host, and
1012 are often used internally to implement pipes. Unlike Internet domain
1013 sockets, Unix domain sockets can show up in the file system with an
1014 ls(1) listing.
1015
1016 % ls -l /dev/log
1017 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
1018
1019 You can test for these with Perl's -S file test:
1020
1021 unless ( -S '/dev/log' ) {
1022 die "something's wicked with the log system";
1023 }
1024
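To watch such an entry appear and disappear, the sketch below binds a
Unix-domain socket inside a scratch directory. The use of File::Temp and
the file name demo.sock are choices made for this illustration.

```perl
use strict;
use warnings;
use Socket qw(PF_UNIX SOCK_STREAM sockaddr_un);
use File::Temp qw(tempdir);

# A scratch directory keeps the demonstration away from real sockets.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.sock";

socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
bind($server, sockaddr_un($path))           || die "bind: $!";

# bind() created a file system entry that -S recognizes as a socket.
my $is_socket = (-S $path) ? 1 : 0;

close($server);
unlink($path);                      # closing does not remove the entry
my $still_there = (-e $path) ? 1 : 0;

print "socket: $is_socket, present after unlink: $still_there\n";
```

The entry outlives the process that created it, which is why servers
commonly unlink() the name before calling bind().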
1025 Here's a sample Unix-domain client:
1026
    #!/usr/bin/perl -w
    use Socket;
    use strict;
    my ($rendezvous, $line);

    $rendezvous = shift || 'catsock';
    socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }
    exit;
1039
1040 And here's a corresponding server. You don't have to worry about silly
1041 network terminators here because Unix domain sockets are guaranteed to
1042 be on the localhost, and thus everything works right.
1043
    #!/usr/bin/perl -Tw
    use strict;
    use Socket;
    use Carp;

    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $NAME  = 'catsock';
    my $uaddr = sockaddr_un($NAME);

    socket(Server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    unlink($NAME);  # remove any stale socket file from an earlier run
    bind  (Server, $uaddr) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";

    logmsg "server started on $NAME";

    my $waitedpid;

    use POSIX ":sys_wait_h";
    sub REAPER {
        # reap every child that has exited, without blocking
        while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
        }
        $SIG{CHLD} = \&REAPER;  # loathe sysV
    }

    $SIG{CHLD} = \&REAPER;

    for ( $waitedpid = 0;
          accept(Client, Server) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        next if $waitedpid;
        logmsg "connection on $NAME";
        spawn sub {
            print "Hello there, it's now ", scalar localtime, "\n";
            exec '/usr/games/fortune' or die "can't exec fortune: $!";
        };
    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        if (!defined($pid = fork)) {
            logmsg "cannot fork: $!";
            return;
        } elsif ($pid) {
            logmsg "begat $pid";
            return;  # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN,  "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit &$coderef();
    }
1111
As you see, it's remarkably similar to the Internet domain TCP server,
so much so, in fact, that several functions--spawn(), logmsg(), and
REAPER()--are exactly the same as in the other server.
1116
1117 So why would you ever want to use a Unix domain socket instead of a
1118 simpler named pipe? Because a named pipe doesn't give you sessions.
1119 You can't tell one process's data from another's. With socket program‐
1120 ming, you get a separate session for each client: that's why accept()
1121 takes two arguments.
1122
1123 For example, let's say that you have a long running database server
1124 daemon that you want folks from the World Wide Web to be able to
1125 access, but only if they go through a CGI interface. You'd have a
1126 small, simple CGI program that does whatever checks and logging you
1127 feel like, and then acts as a Unix-domain client and connects to your
1128 private server.
1129
1131 For those preferring a higher-level interface to socket programming,
1132 the IO::Socket module provides an object-oriented approach. IO::Socket
1133 is included as part of the standard Perl distribution as of the 5.004
1134 release. If you're running an earlier version of Perl, just fetch
1135 IO::Socket from CPAN, where you'll also find modules providing easy
1136 interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1137 NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1138 to name a few.
1139
1140 A Simple Client
1141
1142 Here's a client that creates a TCP connection to the "daytime" service
1143 at port 13 of the host name "localhost" and prints out everything that
1144 the server there cares to provide.
1145
    #!/usr/bin/perl -w
    use IO::Socket;
    $remote = IO::Socket::INET->new(
                        Proto    => "tcp",
                        PeerAddr => "localhost",
                        PeerPort => "daytime(13)",
                    )
                  or die "cannot connect to daytime port at localhost";
    while ( <$remote> ) { print }
1155
1156 When you run this program, you should get something back that looks
1157 like this:
1158
1159 Wed May 14 08:40:46 MDT 1997
1160
1161 Here are what those parameters to the "new" constructor mean:
1162
1163 "Proto"
This is which protocol to use. In this case, the socket handle
returned will be connected to a TCP socket, because we want a
stream-oriented connection, that is, one that acts pretty much like
a plain old file. Not all sockets are of this type. For
example, the UDP protocol can be used to make a datagram socket,
used for message-passing.
1170
1171 "PeerAddr"
1172 This is the name or Internet address of the remote host the server
1173 is running on. We could have specified a longer name like
1174 "www.perl.com", or an address like "204.148.40.9". For demonstra‐
1175 tion purposes, we've used the special hostname "localhost", which
1176 should always mean the current machine you're running on. The cor‐
1177 responding Internet address for localhost is "127.1", if you'd
1178 rather use that.
1179
1180 "PeerPort"
1181 This is the service name or port number we'd like to connect to.
1182 We could have gotten away with using just "daytime" on systems with
1183 a well-configured system services file,[FOOTNOTE: The system ser‐
1184 vices file is in /etc/services under Unix] but just in case, we've
1185 specified the port number (13) in parentheses. Using just the num‐
1186 ber would also have worked, but constant numbers make careful pro‐
1187 grammers nervous.
1188
1189 Notice how the return value from the "new" constructor is used as a
1190 filehandle in the "while" loop? That's what's called an indirect file‐
1191 handle, a scalar variable containing a filehandle. You can use it the
1192 same way you would a normal filehandle. For example, you can read one
1193 line from it this way:
1194
1195 $line = <$handle>;
1196
all remaining lines from it this way:
1198
1199 @lines = <$handle>;
1200
1201 and send a line of data to it this way:
1202
1203 print $handle "some data\n";
1204
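Those three operations can be exercised together without a server. In
this sketch a socketpair() supplies two connected handles held in scalar
variables. The variable names and the sample lines are arbitrary.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Handle;

socketpair(my $writer, my $reader, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    || die "socketpair: $!";
$writer->autoflush(1);

# Send three lines, then shut down the writing side so that the
# reader sees end-of-file after the last one.
print $writer "one\n", "two\n", "three\n";
shutdown($writer, 1);

my $line  = <$reader>;      # scalar context: just the next line
my @lines = <$reader>;      # list context: all remaining lines

print "first: $line";
print "rest:  @lines";
```

Without the shutdown() call, the list-context read would wait forever for
an end-of-file that never comes.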
1205 A Webget Client
1206
1207 Here's a simple client that takes a remote host to fetch a document
1208 from, and then a list of documents to get from that host. This is a
1209 more interesting client than the previous one because it first sends
1210 something to the server before fetching the server's response.
1211
    #!/usr/bin/perl -w
    use IO::Socket;
    unless (@ARGV > 1) { die "usage: $0 host document ..." }
    $host = shift(@ARGV);
    $EOL = "\015\012";
    $BLANK = $EOL x 2;
    foreach $document ( @ARGV ) {
        $remote = IO::Socket::INET->new( Proto    => "tcp",
                                         PeerAddr => $host,
                                         PeerPort => "http(80)",
                                       );
        unless ($remote) { die "cannot connect to http daemon on $host" }
        $remote->autoflush(1);
        print $remote "GET $document HTTP/1.0" . $BLANK;
        while ( <$remote> ) { print }
        close $remote;
    }
1229
The web server handling the "http" service is assumed to be at
its standard port, number 80. If the web server you're trying to
connect to is at a different port (like 1080 or 8080), you should specify
it as the named-parameter pair, "PeerPort => 8080". The "autoflush"
method is used on the socket because otherwise the system would buffer
up the output we sent it. (If you're on a Mac, you'll also need to
change every "\n" in your code that sends data over the network to be a
"\015\012" instead.)
1238
1239 Connecting to the server is only the first part of the process: once
1240 you have the connection, you have to use the server's language. Each
1241 server on the network has its own little command language that it
1242 expects as input. The string that we send to the server starting with
1243 "GET" is in HTTP syntax. In this case, we simply request each speci‐
1244 fied document. Yes, we really are making a new connection for each
1245 document, even though it's the same host. That's the way you always
1246 used to have to speak HTTP. Recent versions of web browsers may
1247 request that the remote server leave the connection open a little
1248 while, but the server doesn't have to honor such a request.
1249
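The two ends of that exchange can be sketched as plain string handling.
The function names http10_get() and parse_status_line() are hypothetical,
and the status line fed to the parser is a made-up sample rather than
output captured from a server.

```perl
use strict;
use warnings;

my $EOL = "\015\012";

# Build the request text that the client sends for one document.
sub http10_get {
    my ($document) = @_;
    return "GET $document HTTP/1.0" . $EOL . $EOL;
}

# Split a status line into protocol version, numeric code, and reason.
# Returns an empty list if the line does not look like a status line.
sub parse_status_line {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    my ($version, $code, $reason) =
        $line =~ m{^HTTP/(\d+\.\d+)\s+(\d{3})\s*(.*)$}
        or return;
    return ($version, $code, $reason);
}

my $request = http10_get("/guanaco.html");
my ($version, $code, $reason) =
    parse_status_line("HTTP/1.1 404 File Not Found\015\012");
print "HTTP $version said $code ($reason)\n";
```

The blank line after the request line is what tells an HTTP/1.0 server
that the request is complete.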
1250 Here's an example of running that program, which we'll call webget:
1251
1252 % webget www.perl.com /guanaco.html
1253 HTTP/1.1 404 File Not Found
1254 Date: Thu, 08 May 1997 18:02:32 GMT
1255 Server: Apache/1.2b6
1256 Connection: close
1257 Content-type: text/html
1258
1259 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1260 <BODY><H1>File Not Found</H1>
1261 The requested URL /guanaco.html was not found on this server.<P>
1262 </BODY>
1263
1264 Ok, so that's not very interesting, because it didn't find that partic‐
1265 ular document. But a long response wouldn't have fit on this page.
1266
1267 For a more fully-featured version of this program, you should look to
1268 the lwp-request program included with the LWP modules from CPAN.
1269
1270 Interactive Client with IO::Socket
1271
1272 Well, that's all fine if you want to send one command and get one
1273 answer, but what about setting up something fully interactive, somewhat
1274 like the way telnet works? That way you can type a line, get the
1275 answer, type a line, get the answer, etc.
1276
This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful "fork" call, the solution
isn't that rough. Once you've made the connection to whatever service
you'd like to chat with, call "fork" to clone your process. Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket. To
accomplish the same thing using just one process would be much harder,
because it's easier to code two processes to do one thing than it is to
code one process to do two things. (This keep-it-simple principle is one
of the cornerstones of the Unix philosophy, and of good software
engineering as well, which is probably why it's spread to other systems.)
1289
1290 Here's the code:
1291
    #!/usr/bin/perl -w
    use strict;
    use IO::Socket;
    my ($host, $port, $kidpid, $handle, $line);

    unless (@ARGV == 2) { die "usage: $0 host port" }
    ($host, $port) = @ARGV;

    # create a tcp connection to the specified host and port
    $handle = IO::Socket::INET->new(Proto    => "tcp",
                                    PeerAddr => $host,
                                    PeerPort => $port)
           or die "can't connect to port $port on $host: $!";

    $handle->autoflush(1);  # so output gets there right away
    print STDERR "[Connected to $host:$port]\n";

    # split the program into two processes, identical twins
    die "can't fork: $!" unless defined($kidpid = fork());

    # the if{} block runs only in the parent process
    if ($kidpid) {
        # copy the socket to standard output
        while (defined ($line = <$handle>)) {
            print STDOUT $line;
        }
        kill("TERM", $kidpid);  # send SIGTERM to child
    }
    # the else{} block runs only in the child process
    else {
        # copy standard input to the socket
        while (defined ($line = <STDIN>)) {
            print $handle $line;
        }
    }
1327
The "kill" function in the parent's "if" block is there to send a
signal to our child process (currently running in the "else" block) as
soon as the remote server has closed its end of the connection.

If the remote server sends data a byte at a time, and you need that data
immediately without waiting for a newline (which might not happen), you
may wish to replace the "while" loop in the parent with the following:
1335
    my $byte;
    while (sysread($handle, $byte, 1) == 1) {
        print STDOUT $byte;
    }
1340
1341 Making a system call for each byte you want to read is not very effi‐
1342 cient (to put it mildly) but is the simplest to explain and works rea‐
1343 sonably well.
1344
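A less wasteful variant asks sysread() for a larger chunk each time.
sysread() returns as soon as any data is available, so an unterminated
prompt still arrives promptly. In this sketch a pipe stands in for the
network connection, and the 4096-byte chunk size is an arbitrary choice.

```perl
use strict;
use warnings;

# A pipe stands in for the network connection in this sketch.
pipe(my $reader, my $writer) || die "pipe: $!";
defined(syswrite($writer, "Command? ")) || die "syswrite: $!";
close($writer);

my $received = '';
while (1) {
    # Ask for up to 4096 bytes; sysread returns as soon as any arrive.
    my $got = sysread($reader, my $chunk, 4096);
    die "sysread: $!" unless defined $got;
    last if $got == 0;              # end of file
    $received .= $chunk;
}
close($reader);

print "read ", length($received), " bytes: '$received'\n";
```

Testing the return value for undef separately from zero distinguishes a
read error from a normal end of file.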
As always, setting up a server is a little bit more involved than running
a client. The model is that the server creates a special kind of
socket that does nothing but listen on a particular port for incoming
connections. It does this by calling the "IO::Socket::INET->new()"
method with slightly different arguments than the client did.
1351
1352 Proto
1353 This is which protocol to use. Like our clients, we'll still spec‐
1354 ify "tcp" here.
1355
1356 LocalPort
We specify a local port in the "LocalPort" argument, which we
didn't do for the client. This is the service name or port number for
which you want to be the server. (Under Unix, ports under 1024 are
restricted to the superuser.) In our sample, we'll use port 9000,
but you can use any port that's not currently in use on your
system. If you try to use one already in use, you'll get an "Address
already in use" message. Under Unix, the "netstat -a" command will
show which services currently have servers.
1365
1366 Listen
1367 The "Listen" parameter is set to the maximum number of pending con‐
1368 nections we can accept until we turn away incoming clients. Think
1369 of it as a call-waiting queue for your telephone. The low-level
1370 Socket module has a special symbol for the system maximum, which is
1371 SOMAXCONN.
1372
1373 Reuse
The "Reuse" parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to clear
out.
1377
1378 Once the generic server socket has been created using the parameters
1379 listed above, the server then waits for a new client to connect to it.
1380 The server blocks in the "accept" method, which eventually accepts a
1381 bidirectional connection from the remote client. (Make sure to aut‐
1382 oflush this handle to circumvent buffering.)
1383
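The listen-then-accept sequence can be tried inside a single process,
because the kernel queues an incoming connection until accept() is
called. This sketch is confined to the loopback address. The "LocalAddr"
setting, the port 0 request, and the greeting text are assumptions made
for the demonstration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

my $server = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,            # let the kernel choose a free port
    Listen    => 5,
    Reuse     => 1,
) or die "can't set up server: $@";
my $port = $server->sockport;

# The kernel completes the connection into the listen queue, so the
# client can connect before the server calls accept().
my $client = IO::Socket::INET->new(
    Proto    => 'tcp',
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
) or die "can't connect: $@";

my $conn = $server->accept or die "accept: $!";
$conn->autoflush(1);
print $conn "Welcome\n";

my $greeting = <$client>;
print "client got: $greeting";

close($_) for $conn, $client, $server;
```

The handle returned by accept() is separate from the listening socket,
which stays open for the next client.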
1384 To add to user-friendliness, our server prompts the user for commands.
1385 Most servers don't do this. Because of the prompt without a newline,
1386 you'll have to use the "sysread" variant of the interactive client
1387 above.
1388
1389 This server accepts one of five different commands, sending output back
1390 to the client. Note that unlike most network servers, this one only
1391 handles one incoming client at a time. Multithreaded servers are cov‐
1392 ered in Chapter 6 of the Camel.
1393
Here's the code.
1395
    #!/usr/bin/perl -w
    use IO::Socket;
    use Net::hostent;  # for OO version of gethostbyaddr

    $PORT = 9000;  # pick something not in use

    $server = IO::Socket::INET->new( Proto     => 'tcp',
                                     LocalPort => $PORT,
                                     Listen    => SOMAXCONN,
                                     Reuse     => 1);

    die "can't setup server" unless $server;
    print "[Server $0 accepting clients]\n";

    while ($client = $server->accept()) {
        $client->autoflush(1);
        print $client "Welcome to $0; type help for command list.\n";
        $hostinfo = gethostbyaddr($client->peeraddr);
        printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
        print $client "Command? ";
        while ( <$client>) {
            next unless /\S/;  # blank line
            if    (/quit|exit/i) { last; }
            elsif (/date|time/i) { printf $client "%s\n", scalar localtime; }
            elsif (/who/i )      { print $client `who 2>&1`; }
            elsif (/cookie/i )   { print $client `/usr/games/fortune 2>&1`; }
            elsif (/motd/i )     { print $client `cat /etc/motd 2>&1`; }
            else {
                print $client "Commands: quit date who cookie motd\n";
            }
        } continue {
            print $client "Command? ";
        }
        close $client;
    }
1431
1433 Another kind of client-server setup is one that uses not connections,
1434 but messages. UDP communications involve much lower overhead but also
1435 provide less reliability, as there are no promises that messages will
1436 arrive at all, let alone in order and unmangled. Still, UDP offers
1437 some advantages over TCP, including being able to "broadcast" or "mul‐
1438 ticast" to a whole bunch of destination hosts at once (usually on your
1439 local subnet). If you find yourself overly concerned about reliability
1440 and start building checks into your message system, then you probably
1441 should use just TCP to start with.
1442
1443 Note that UDP datagrams are not a bytestream and should not be treated
1444 as such. This makes using I/O mechanisms with internal buffering like
1445 stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
1446 or better send(), like in the example below.
1447
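The smallest possible exchange looks like the sketch below: one datagram
sent with send() and collected with recv(), both on the loopback
interface. The two-second timeout, the 512-byte receive limit, and the
"ping" payload are arbitrary values chosen for illustration.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_DGRAM IPPROTO_UDP INADDR_LOOPBACK sockaddr_in);

socket(my $rx, PF_INET, SOCK_DGRAM, IPPROTO_UDP) || die "socket: $!";
bind($rx, sockaddr_in(0, INADDR_LOOPBACK))       || die "bind: $!";
my $rx_addr = getsockname($rx);     # where the sender should aim

socket(my $tx, PF_INET, SOCK_DGRAM, IPPROTO_UDP) || die "socket: $!";

# send() hands the kernel exactly one datagram; nothing is buffered.
defined(send($tx, "ping", 0, $rx_addr)) || die "send: $!";

# Wait at most two seconds, since UDP makes no delivery promises.
my $rin = '';
vec($rin, fileno($rx), 1) = 1;
my $datagram = '';
if (select(my $rout = $rin, undef, undef, 2.0)) {
    defined(recv($rx, $datagram, 512, 0)) || die "recv: $!";
}

print "received '$datagram'\n";
```

Each recv() call returns one whole datagram, so message boundaries
survive without any terminator convention.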
1448 Here's a UDP program similar to the sample Internet TCP client given
1449 earlier. However, instead of checking one host at a time, the UDP ver‐
1450 sion will check many of them asynchronously by simulating a multicast
1451 and then using select() to do a timed-out wait for I/O. To do some‐
1452 thing similar with TCP, you'd have to use a different socket handle for
1453 each host.
1454
    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use Sys::Hostname;

    my ( $count, $hisiaddr, $hispaddr, $histime,
         $host, $iaddr, $paddr, $port, $proto,
         $rin, $rout, $rtime, $SECS_of_70_YEARS);

    $SECS_of_70_YEARS = 2208988800;

    $iaddr = gethostbyname(hostname());
    $proto = getprotobyname('udp');
    $port  = getservbyname('time', 'udp');
    $paddr = sockaddr_in(0, $iaddr);  # 0 means let kernel pick

    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
    bind(SOCKET, $paddr) || die "bind: $!";

    $| = 1;
    printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time;
    $count = 0;
    for $host (@ARGV) {
        $count++;
        $hisiaddr = inet_aton($host) || die "unknown host";
        $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
    }

    $rin = '';
    vec($rin, fileno(SOCKET), 1) = 1;

    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        $rtime = '';
        ($hispaddr = recv(SOCKET, $rtime, 4, 0)) || die "recv: $!";
        ($port, $hisiaddr) = sockaddr_in($hispaddr);
        $host = gethostbyaddr($hisiaddr, AF_INET);
        $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time, scalar localtime($histime);
        $count--;
    }
1498
Note that this example does not include any retries and may
consequently fail to contact a reachable host. The most prominent reason
for this is congestion of the queues on the sending host if the number of
hosts to contact is sufficiently large.
1503
1505 While System V IPC isn't so widely used as sockets, it still has some
1506 interesting uses. You can't, however, effectively use SysV IPC or
1507 Berkeley mmap() to have shared memory so as to share a variable amongst
1508 several processes. That's because Perl would reallocate your string
1509 when you weren't wanting it to.
1510
1511 Here's a small example showing shared memory usage.
1512
    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRWXU);
    defined($id) || die "shmget: $!";  # 0 is a valid ID, so test definedness
    print "shm key $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60) || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60) || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = '';
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
1532
1533 Here's an example of a semaphore:
1534
    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id) || die "semget: $!";
    print "sem id $id\n";
1540
1541 Put this code in a separate file to be run in more than one process.
1542 Call the file take:
1543
    # look up the existing semaphore set

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    die if !defined($id);

    $semnum  = 0;
    $semflag = 0;

    # 'take' semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring  = $opstring1 . $opstring2;

    semop($id, $opstring) || die "semop: $!";
1564
1565 Put this code in a separate file to be run in more than one process.
1566 Call this file give:
1567
    # 'give' the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    die if !defined($id);

    $semnum  = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id, $opstring) || die "semop: $!";
1584
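The operation strings passed to semop() are nothing more than packed
triples of native short integers, which can be inspected without touching
a real semaphore. The helper name sem_operation() is hypothetical.

```perl
use strict;
use warnings;

# Pack one (sem_num, sem_op, sem_flg) triple as three native shorts.
sub sem_operation {
    my ($semnum, $semop, $semflag) = @_;
    return pack("s!s!s!", $semnum, $semop, $semflag);
}

# "Wait for zero, then increment" as a single atomic request.
my $take = sem_operation(0, 0, 0) . sem_operation(0, 1, 0);

# Unpack the string again to see the individual operations.
my @fields = unpack("(s!s!s!)*", $take);
my $count  = length($take) / length(sem_operation(0, 0, 0));

print "$count operations: @fields\n";
```

Concatenating the triples is what lets the take file request both
operations in one semop() call, so no other process can slip in between
the wait and the increment.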
1585 The SysV IPC code above was written long ago, and it's definitely
1586 clunky looking. For a more modern look, see the IPC::SysV module which
1587 is included with Perl starting from Perl 5.005.
1588
1589 A small example demonstrating SysV message queues:
1590
    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);

    my $sent = "message";
    my $type_sent = 1234;
    my $rcvd;
    my $type_rcvd;

    if (defined $id) {
        if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
            if (msgrcv($id, $rcvd, 60, 0, 0)) {
                ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
                if ($rcvd eq $sent) {
                    print "okay\n";
                } else {
                    print "not okay\n";
                }
            } else {
                die "# msgrcv failed\n";
            }
        } else {
            die "# msgsnd failed\n";
        }
        msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
    } else {
        die "# msgget failed\n";
    }
1619
1621 Most of these routines quietly but politely return "undef" when they
1622 fail instead of causing your program to die right then and there due to
1623 an uncaught exception. (Actually, some of the new Socket conversion
1624 functions croak() on bad arguments.) It is therefore essential to
check return values from these functions. Always begin your socket
programs this way for optimal success, and don't forget to add the -T
taint-checking flag to the #! line for servers:
1628
1629 #!/usr/bin/perl -Tw
1630 use strict;
1631 use sigtrap;
1632 use Socket;
1633
1635 All these routines create system-specific portability problems. As
1636 noted elsewhere, Perl is at the mercy of your C libraries for much of
1637 its system behaviour. It's probably safest to assume broken SysV
1638 semantics for signals and to stick with simple TCP and UDP socket oper‐
1639 ations; e.g., don't try to pass open file descriptors over a local UDP
1640 datagram socket if you want your code to stand a chance of being porta‐
1641 ble.
1642
1644 Tom Christiansen, with occasional vestiges of Larry Wall's original
1645 version and suggestions from the Perl Porters.
1646
1648 There's a lot more to networking than this, but this should get you
1649 started.
1650
1651 For intrepid programmers, the indispensable textbook is Unix Network
1652 Programming, 2nd Edition, Volume 1 by W. Richard Stevens (published by
1653 Prentice-Hall). Note that most books on networking address the subject
1654 from the perspective of a C programmer; translation to Perl is left as
1655 an exercise for the reader.
1656
1657 The IO::Socket(3) manpage describes the object library, and the
1658 Socket(3) manpage describes the low-level interface to sockets.
1659 Besides the obvious functions in perlfunc, you should also check out
1660 the modules file at your nearest CPAN site. (See perlmodlib or best
1661 yet, the Perl FAQ for a description of what CPAN is and where to get
1662 it.)
1663
Section 5 of the modules file is devoted to "Networking, Device Control
(modems), and Interprocess Communication", and contains numerous
unbundled modules, including numerous networking modules, Chat and Expect
operations, CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP,
SMTP, Telnet, Threads, and ToolTalk--just to name a few.
1669
1670
1671
1672perl v5.8.8 2006-01-07 PERLIPC(1)