PERLOPENTUT(1) Perl Programmers Reference Guide PERLOPENTUT(1)

perlopentut - tutorial on opening things in Perl
7
9 Perl has two simple, built-in ways to open files: the shell way for
10 convenience, and the C way for precision. The shell way also has 2-
11 and 3-argument forms, which have different semantics for handling the
12 filename. The choice is yours.
13
15 Perl's "open" function was designed to mimic the way command-line
16 redirection in the shell works. Here are some basic examples from the
17 shell:
18
19 $ myprogram file1 file2 file3
20 $ myprogram < inputfile
21 $ myprogram > outputfile
22 $ myprogram >> outputfile
23 $ myprogram | otherprogram
24 $ otherprogram | myprogram
25
26 And here are some more advanced examples:
27
28 $ otherprogram | myprogram f1 - f2
29 $ otherprogram 2>&1 | myprogram -
30 $ myprogram <&3
31 $ myprogram >&4
32
33 Programmers accustomed to constructs like those above can take comfort
34 in learning that Perl directly supports these familiar constructs using
35 virtually the same syntax as the shell.
36
37 Simple Opens
38 The "open" function takes two arguments: the first is a filehandle, and
39 the second is a single string comprising both what to open and how to
40 open it. "open" returns true when it works, and when it fails, returns
41 a false value and sets the special variable $! to reflect the system
42 error. If the filehandle was previously opened, it will be implicitly
43 closed first.
44
45 For example:
46
47 open(INFO, "datafile") || die("can't open datafile: $!");
48 open(INFO, "< datafile") || die("can't open datafile: $!");
49 open(RESULTS,"> runstats") || die("can't open runstats: $!");
50 open(LOG, ">> logfile ") || die("can't open logfile: $!");
51
52 If you prefer the low-punctuation version, you could write that this
53 way:
54
55 open INFO, "< datafile" or die "can't open datafile: $!";
56 open RESULTS,"> runstats" or die "can't open runstats: $!";
57 open LOG, ">> logfile " or die "can't open logfile: $!";
58
59 A few things to notice. First, the leading less-than is optional. If
60 omitted, Perl assumes that you want to open the file for reading.
61
62 Note also that the first example uses the "||" logical operator, and
63 the second uses "or", which has lower precedence. Using "||" in the
64 latter examples would effectively mean
65
66 open INFO, ( "< datafile" || die "can't open datafile: $!" );
67
68 which is definitely not what you want.
69
70 The other important thing to notice is that, just as in the shell, any
71 whitespace before or after the filename is ignored. This is good,
72 because you wouldn't want these to do different things:
73
open INFO, "<datafile"
open INFO, "< datafile"
open INFO, "<  datafile"
77
78 Ignoring surrounding whitespace also helps for when you read a filename
79 in from a different file, and forget to trim it before opening:
80
81 $filename = <INFO>; # oops, \n still there
82 open(EXTRA, "< $filename") || die "can't open $filename: $!";
83
This is not a bug, but a feature. Because "open" mimics the shell in
its style of using redirection arrows to specify how to open the file,
it also does so with respect to extra whitespace around the filename
itself. For accessing files with naughty names, see
"Dispelling the Dweomer".
89
90 There is also a 3-argument version of "open", which lets you put the
91 special redirection characters into their own argument:
92
93 open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";
94
95 In this case, the filename to open is the actual string in $datafile,
96 so you don't have to worry about $datafile containing characters that
97 might influence the open mode, or whitespace at the beginning of the
98 filename that would be absorbed in the 2-argument version. Also, any
99 reduction of unnecessary string interpolation is a good thing.
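
As a rough illustration of the difference, the following sketch creates and re-reads a file whose name contains a leading space, a ">" and a trailing space. The file name and the temporary directory are made up for the example, and it assumes a Unix-like filesystem that allows such names.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical awkward name: leading space, a ">" and a trailing space.
my $dir  = tempdir(CLEANUP => 1);
my $name = " >odd name ";
my $path = "$dir/$name";

# 3-argument form: the mode is separate, so $path is used exactly as given.
open(my $out, ">", $path) or die "Can't create '$path': $!";
print {$out} "hello\n";
close($out) or die "Can't close '$path': $!";

open(my $in, "<", $path) or die "Can't read '$path': $!";
my $line = <$in>;
close($in);

# List the directory to see which name was really created.
opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
my @names = grep { $_ ne "." && $_ ne ".." } readdir($dh);
closedir($dh);

print "created: '$names[0]'\n";
```

With the 2-argument form, "> $path" would have lost the trailing space before the file was created.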
100
101 Indirect Filehandles
102 "open"'s first argument can be a reference to a filehandle. As of perl
103 5.6.0, if the argument is uninitialized, Perl will automatically create
104 a filehandle and put a reference to it in the first argument, like so:
105
106 open( my $in, $infile ) or die "Couldn't read $infile: $!";
107 while ( <$in> ) {
108 # do something with $_
109 }
110 close $in;
111
112 Indirect filehandles make namespace management easier. Since
113 filehandles are global to the current package, two subroutines trying
114 to open "INFILE" will clash. With two functions opening indirect
115 filehandles like "my $infile", there's no clash and no need to worry
116 about future conflicts.
117
118 Another convenient behavior is that an indirect filehandle
119 automatically closes when it goes out of scope or when you undefine it:
120
121 sub firstline {
122 open( my $in, shift ) && return scalar <$in>;
123 # no close() required
124 }
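
Here is a slightly longer sketch of the same idea, written with the 3-argument form and an explicit failure path. The sub name first_line and the temporary file are assumptions made for the example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the first line of a file, or undef if it cannot be opened.
sub first_line {
    my ($path) = @_;
    open(my $in, "<", $path) or return;
    my $line = <$in>;
    return $line;    # $in goes out of scope here, which closes the file
}

# Build a small scratch file to read from.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond\n";
close($tmp) or die "can't close $tmpname: $!";

my $first   = first_line($tmpname);
my $missing = first_line("$tmpname.does-not-exist");

print "first line: $first";
print "missing file gave undef\n" unless defined $missing;
```

Because each call gets its own lexical $in, several such subs can use the same variable name without interfering with one another.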
125
126 Pipe Opens
127 In C, when you want to open a file using the standard I/O library, you
128 use the "fopen" function, but when opening a pipe, you use the "popen"
129 function. But in the shell, you just use a different redirection
130 character. That's also the case for Perl. The "open" call remains the
131 same--just its argument differs.
132
133 If the leading character is a pipe symbol, "open" starts up a new
134 command and opens a write-only filehandle leading into that command.
135 This lets you write into that handle and have what you write show up on
136 that command's standard input. For example:
137
138 open(PRINTER, "| lpr -Plp1") || die "can't run lpr: $!";
139 print PRINTER "stuff\n";
140 close(PRINTER) || die "can't close lpr: $!";
141
142 If the trailing character is a pipe, you start up a new command and
143 open a read-only filehandle leading out of that command. This lets
144 whatever that command writes to its standard output show up on your
145 handle for reading. For example:
146
147 open(NET, "netstat -i -n |") || die "can't fork netstat: $!";
148 while (<NET>) { } # do something with input
149 close(NET) || die "can't close netstat: $!";
150
151 What happens if you try to open a pipe to or from a non-existent
152 command? If possible, Perl will detect the failure and set $! as
153 usual. But if the command contains special shell characters, such as
154 ">" or "*", called 'metacharacters', Perl does not execute the command
155 directly. Instead, Perl runs the shell, which then tries to run the
156 command. This means that it's the shell that gets the error
157 indication. In such a case, the "open" call will only indicate failure
158 if Perl can't even run the shell. See "How can I capture STDERR from
159 an external command?" in perlfaq8 to see how to cope with this.
160 There's also an explanation in perlipc.
161
If you would like to open a bidirectional pipe, the IPC::Open2 library
will handle this for you. Check out "Bidirectional Communication with
Another Process" in perlipc.
165
166 perl-5.6.x introduced a version of piped open that executes a process
167 based on its command line arguments without relying on the shell.
168 (Similar to the "system(@LIST)" notation.) This is safer and faster
169 than executing a single argument pipe-command, but does not allow
170 special shell constructs. (It is also not supported on Microsoft
171 Windows, Mac OS Classic or RISC OS.)
172
173 Here's an example of "open '-|'", which prints a random Unix fortune
174 cookie as uppercase:
175
176 my $collection = shift(@ARGV);
177 open my $fortune, '-|', 'fortune', $collection
178 or die "Could not find fortune - $!";
179 while (<$fortune>)
180 {
181 print uc($_);
182 }
183 close($fortune);
184
185 And this "open '|-'" pipes into lpr:
186
187 open my $printer, '|-', 'lpr', '-Plp1'
188 or die "can't run lpr: $!";
189 print {$printer} "stuff\n";
190 close($printer)
191 or die "can't close lpr: $!";
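
When reading from a command, checking "close" is how you learn whether the command itself succeeded: "close" returns false if the command exits with a non-zero status, and $? holds the wait status. The sketch below uses $^X (the running perl) as a stand-in command so that it does not depend on any particular external program; like the list form itself, it assumes a platform that supports list-form pipe opens.

```perl
use strict;
use warnings;

# $^X is the path of the running perl, used here as a stand-in command.
# The child prints two lines and then exits with status 3.
my @cmd = ($^X, '-e', 'print "one\ntwo\n"; exit 3');

open(my $kid, '-|', @cmd) or die "can't run $cmd[0]: $!";
my @lines = <$kid>;

# close() waits for the child; it is false here because of the exit status.
my $closed_ok   = close($kid);
my $exit_status = $? >> 8;

printf "read %d lines, exit status %d\n", scalar(@lines), $exit_status;
```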
192
193 The Minus File
194 Again following the lead of the standard shell utilities, Perl's "open"
195 function treats a file whose name is a single minus, "-", in a special
196 way. If you open minus for reading, it really means to access the
197 standard input. If you open minus for writing, it really means to
198 access the standard output.
199
200 If minus can be used as the default input or default output, what
201 happens if you open a pipe into or out of minus? What's the default
202 command it would run? The same script as you're currently running!
203 This is actually a stealth "fork" hidden inside an "open" call. See
204 "Safe Pipe Opens" in perlipc for details.
205
206 Mixing Reads and Writes
207 It is possible to specify both read and write access. All you do is
208 add a "+" symbol in front of the redirection. But as in the shell,
209 using a less-than on a file never creates a new file; it only opens an
210 existing one. On the other hand, using a greater-than always clobbers
211 (truncates to zero length) an existing file, or creates a brand-new one
212 if there isn't an old one. Adding a "+" for read-write doesn't affect
213 whether it only works on existing files or always clobbers existing
214 ones.
215
216 open(WTMP, "+< /usr/adm/wtmp")
217 || die "can't open /usr/adm/wtmp: $!";
218
219 open(SCREEN, "+> lkscreen")
220 || die "can't open lkscreen: $!";
221
222 open(LOGFILE, "+>> /var/log/applog")
223 || die "can't open /var/log/applog: $!";
224
225 The first one won't create a new file, and the second one will always
226 clobber an old one. The third one will create a new file if necessary
227 and not clobber an old one, and it will allow you to read at any point
228 in the file, but all writes will always go to the end. In short, the
229 first case is substantially more common than the second and third
230 cases, which are almost always wrong. (If you know C, the plus in
231 Perl's "open" is historically derived from the one in C's fopen(3S),
232 which it ultimately calls.)
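
To make the "+<" case concrete, here is a minimal sketch that reads a record from an existing file and then overwrites it in place. The file contents are invented for the example, and the replacement text is deliberately the same length as the original so that nothing after it is disturbed.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Create a scratch file that already exists before the "+<" open.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "status=OLD\nrest\n";
close($tmp) or die "can't close $path: $!";

# "+<" opens for update without truncating or creating.
open(my $fh, "+<", $path) or die "can't open $path: $!";
my $first = <$fh>;                                  # read the first record
seek($fh, 0, SEEK_SET) or die "can't seek: $!";     # rewind before writing
print {$fh} "status=NEW\n";                         # same length as before
close($fh) or die "can't close $path: $!";

# Read the whole file back to see the result.
open(my $check, "<", $path) or die "can't reopen $path: $!";
my $contents = do { local $/; <$check> };
close($check);

print $contents;
```

Note the seek between the read and the write; switching directions on a read-write handle without one is not reliable.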
233
234 In fact, when it comes to updating a file, unless you're working on a
235 binary file as in the WTMP case above, you probably don't want to use
236 this approach for updating. Instead, Perl's -i flag comes to the
237 rescue. The following command takes all the C, C++, or yacc source or
238 header files and changes all their foo's to bar's, leaving the old
239 version in the original filename with a ".orig" tacked on the end:
240
241 $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]
242
This is a shortcut for some renaming games that are really the best
way to update text files. See the second question in perlfaq5 for more
details.
246
247 Filters
248 One of the most common uses for "open" is one you never even notice.
249 When you process the ARGV filehandle using "<ARGV>", Perl actually does
250 an implicit open on each file in @ARGV. Thus a program called like
251 this:
252
253 $ myprogram file1 file2 file3
254
255 can have all its files opened and processed one at a time using a
256 construct no more complex than:
257
258 while (<>) {
259 # do something with $_
260 }
261
262 If @ARGV is empty when the loop first begins, Perl pretends you've
263 opened up minus, that is, the standard input. In fact, $ARGV, the
264 currently open file during "<ARGV>" processing, is even set to "-" in
265 these circumstances.
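
The same loop can be exercised from inside a program by localizing @ARGV. The sketch below is only an illustration: it builds two scratch files, then counts the lines read from each one, using $ARGV to tell them apart.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build two scratch files with known contents.
my @paths;
for my $text ("a\nb\n", "c\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh) or die "can't close $path: $!";
    push @paths, $path;
}

# Count lines per file; $ARGV names the file currently being read.
my %lines_in;
{
    local @ARGV = @paths;
    while (my $line = <>) {
        $lines_in{$ARGV}++;
    }
}

printf "%s: %d\n", $_, $lines_in{$_} for @paths;
```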
266
You are welcome to pre-process your @ARGV before starting the loop to
make sure it's to your liking. One reason to do this might be to
remove command options beginning with a minus. While you can always
roll the simple ones by hand, the Getopt modules are good for this:
271
272 use Getopt::Std;
273
274 # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
275 getopts("vDo:");
276
277 # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
278 getopts("vDo:", \%args);
279
280 Or the standard Getopt::Long module to permit named arguments:
281
282 use Getopt::Long;
283 GetOptions( "verbose" => \$verbose, # --verbose
284 "Debug" => \$debug, # --Debug
285 "output=s" => \$output );
286 # --output=somestring or --output somestring
287
288 Another reason for preprocessing arguments is to make an empty argument
289 list default to all files:
290
291 @ARGV = glob("*") unless @ARGV;
292
You could even filter out all but plain text files. This drops the
other arguments silently, of course, and you might prefer to mention them on the way.
295
296 @ARGV = grep { -f && -T } @ARGV;
297
298 If you're using the -n or -p command-line options, you should put
299 changes to @ARGV in a "BEGIN{}" block.
300
Remember that a normal "open" has special properties, in that it might
call fopen(3S) or it might call popen(3S), depending on what its
argument looks like; that's why it's sometimes called "magic open".
Here's an example:
305
306 $pwdinfo = `domainname` =~ /^(\(none\))?$/
307 ? '< /etc/passwd'
308 : 'ypcat passwd |';
309
310 open(PWD, $pwdinfo)
311 or die "can't open $pwdinfo: $!";
312
313 This sort of thing also comes into play in filter processing. Because
314 "<ARGV>" processing employs the normal, shell-style Perl "open", it
315 respects all the special things we've already seen:
316
317 $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
318
319 That program will read from the file f1, the process cmd1, standard
320 input (tmpfile in this case), the f2 file, the cmd2 command, and
321 finally the f3 file.
322
323 Yes, this also means that if you have files named "-" (and so on) in
324 your directory, they won't be processed as literal files by "open".
325 You'll need to pass them as "./-", much as you would for the rm
326 program, or you could use "sysopen" as described below.
327
328 One of the more interesting applications is to change files of a
329 certain name into pipes. For example, to autoprocess gzipped or
330 compressed files by decompressing them with gzip:
331
332 @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;
333
334 Or, if you have the GET program installed from LWP, you can fetch URLs
335 before processing them:
336
337 @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;
338
339 It's not for nothing that this is called magic "<ARGV>". Pretty nifty,
340 eh?
341
343 If you want the convenience of the shell, then Perl's "open" is
344 definitely the way to go. On the other hand, if you want finer
345 precision than C's simplistic fopen(3S) provides you should look to
346 Perl's "sysopen", which is a direct hook into the open(2) system call.
347 That does mean it's a bit more involved, but that's the price of
348 precision.
349
350 "sysopen" takes 3 (or 4) arguments.
351
352 sysopen HANDLE, PATH, FLAGS, [MASK]
353
354 The HANDLE argument is a filehandle just as with "open". The PATH is a
355 literal path, one that doesn't pay attention to any greater-thans or
356 less-thans or pipes or minuses, nor ignore whitespace. If it's there,
357 it's part of the path. The FLAGS argument contains one or more values
358 derived from the Fcntl module that have been or'd together using the
359 bitwise "|" operator. The final argument, the MASK, is optional; if
360 present, it is combined with the user's current umask for the creation
361 mode of the file. You should usually omit this.
362
Although the traditional values of read-only, write-only, and
read-write are 0, 1, and 2 respectively, this is known not to hold true on
some systems. Instead, it's best to load in the appropriate constants
first from the Fcntl module, which supplies the following standard
flags:
368
369 O_RDONLY Read only
370 O_WRONLY Write only
371 O_RDWR Read and write
372 O_CREAT Create the file if it doesn't exist
373 O_EXCL Fail if the file already exists
374 O_APPEND Append to the file
375 O_TRUNC Truncate the file
376 O_NONBLOCK Non-blocking access
377
378 Less common flags that are sometimes available on some operating
379 systems include "O_BINARY", "O_TEXT", "O_SHLOCK", "O_EXLOCK",
380 "O_DEFER", "O_SYNC", "O_ASYNC", "O_DSYNC", "O_RSYNC", "O_NOCTTY",
381 "O_NDELAY" and "O_LARGEFILE". Consult your open(2) manpage or its
382 local equivalent for details. (Note: starting from Perl release 5.6
383 the "O_LARGEFILE" flag, if available, is automatically added to the
384 sysopen() flags because large files are the default.)
385
386 Here's how to use "sysopen" to emulate the simple "open" calls we had
387 before. We'll omit the "|| die $!" checks for clarity, but make sure
388 you always check the return values in real code. These aren't quite
389 the same, since "open" will trim leading and trailing whitespace, but
390 you'll get the idea.
391
392 To open a file for reading:
393
394 open(FH, "< $path");
395 sysopen(FH, $path, O_RDONLY);
396
397 To open a file for writing, creating a new file if needed or else
398 truncating an old file:
399
400 open(FH, "> $path");
401 sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);
402
403 To open a file for appending, creating one if necessary:
404
405 open(FH, ">> $path");
406 sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);
407
408 To open a file for update, where the file must already exist:
409
410 open(FH, "+< $path");
411 sysopen(FH, $path, O_RDWR);
412
413 And here are things you can do with "sysopen" that you cannot do with a
414 regular "open". As you'll see, it's just a matter of controlling the
415 flags in the third argument.
416
417 To open a file for writing, creating a new file which must not
418 previously exist:
419
420 sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);
421
422 To open a file for appending, where that file must already exist:
423
424 sysopen(FH, $path, O_WRONLY | O_APPEND);
425
426 To open a file for update, creating a new file if necessary:
427
428 sysopen(FH, $path, O_RDWR | O_CREAT);
429
430 To open a file for update, where that file must not already exist:
431
432 sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);
433
434 To open a file without blocking, creating one if necessary:
435
436 sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);
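
As a concrete case, "O_EXCL" together with "O_CREAT" gives you a create-only open: the second attempt below fails because the file is already there. This is a minimal sketch; the file name lock.pid and the temporary directory are assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl;                    # O_WRONLY, O_CREAT, O_EXCL
use Errno qw(EEXIST);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/lock.pid";   # hypothetical file name

# First attempt: the file does not exist yet, so it is created.
my $first = sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT);
die "can't create $path: $!" unless $first;
print {$fh} "$$\n";
close($fh) or die "can't close $path: $!";

# Second attempt: O_EXCL makes this fail because the file now exists.
my $second        = sysopen(my $again, $path, O_WRONLY | O_EXCL | O_CREAT);
my $already_there = !$second && $! == EEXIST;

print "second open refused: file exists\n" if $already_there;
```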
437
438 Permissions a la mode
439 If you omit the MASK argument to "sysopen", Perl uses the octal value
440 0666. The normal MASK to use for executables and directories should be
441 0777, and for anything else, 0666.
442
443 Why so permissive? Well, it isn't really. The MASK will be modified
444 by your process's current "umask". A umask is a number representing
445 disabled permissions bits; that is, bits that will not be turned on in
446 the created files' permissions field.
447
448 For example, if your "umask" were 027, then the 020 part would disable
449 the group from writing, and the 007 part would disable others from
450 reading, writing, or executing. Under these conditions, passing
451 "sysopen" 0666 would create a file with mode 0640, since "0666 & ~027"
452 is 0640.
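
The arithmetic can be checked directly, and on a Unix-like system you can confirm it against a file that is actually created. The sketch below sets the umask only for the duration of the example; the file name is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;                    # O_WRONLY, O_CREAT, O_EXCL
use File::Temp qw(tempdir);

my $requested = 0666;
my $mask      = 027;
my $expected  = $requested & ~$mask;          # 0640
printf "expected mode: %04o\n", $expected;

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/private.txt";                # hypothetical file name

# Apply the example umask, create the file, then restore the old umask.
my $old_umask = umask($mask);
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, $requested)
    or die "can't create $path: $!";
close($fh) or die "can't close $path: $!";
umask($old_umask);

# Keep only the permission bits of the mode reported by stat.
my $actual = (stat($path))[2] & 07777;
printf "actual mode:   %04o\n", $actual;
```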
453
454 You should seldom use the MASK argument to "sysopen()". That takes
455 away the user's freedom to choose what permission new files will have.
456 Denying choice is almost always a bad thing. One exception would be
457 for cases where sensitive or private data is being stored, such as with
458 mail folders, cookie files, and internal temporary files.
459
461 Re-Opening Files (dups)
462 Sometimes you already have a filehandle open, and want to make another
463 handle that's a duplicate of the first one. In the shell, we place an
464 ampersand in front of a file descriptor number when doing redirections.
465 For example, "2>&1" makes descriptor 2 (that's STDERR in Perl) be
466 redirected into descriptor 1 (which is usually Perl's STDOUT). The
467 same is essentially true in Perl: a filename that begins with an
468 ampersand is treated instead as a file descriptor if a number, or as a
469 filehandle if a string.
470
471 open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
472 open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!";
473
474 That means that if a function is expecting a filename, but you don't
475 want to give it a filename because you already have the file open, you
476 can just pass the filehandle with a leading ampersand. It's best to
477 use a fully qualified handle though, just in case the function happens
478 to be in a different package:
479
480 somefunction("&main::LOGFILE");
481
482 This way if somefunction() is planning on opening its argument, it can
483 just use the already opened handle. This differs from passing a
484 handle, because with a handle, you don't open the file. Here you have
485 something you can pass to open.
486
If you have one of those tricky, newfangled I/O objects that the C++
folks are raving about, then this doesn't work because those aren't
proper filehandles in the native Perl sense. You'll have to use
fileno() to pull out the proper descriptor number, assuming you can:
491
492 use IO::Socket;
493 $handle = IO::Socket::INET->new("www.perl.com:80");
494 $fd = $handle->fileno;
495 somefunction("&$fd"); # not an indirect function call
496
497 It can be easier (and certainly will be faster) just to use real
498 filehandles though:
499
500 use IO::Socket;
501 local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
502 die "can't connect" unless defined(fileno(REMOTE));
503 somefunction("&main::REMOTE");
504
505 If the filehandle or descriptor number is preceded not just with a
506 simple "&" but rather with a "&=" combination, then Perl will not
507 create a completely new descriptor opened to the same place using the
508 dup(2) system call. Instead, it will just make something of an alias
to the existing one using the fdopen(3S) library call. This is
slightly more parsimonious of system resources, although this is less
a concern these days. Here's an example of that:
512
513 $fd = $ENV{"MHCONTEXTFD"};
514 open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!";
515
516 If you're using magic "<ARGV>", you could even pass in as a command
517 line argument in @ARGV something like "<&=$MHCONTEXTFD", but we've
518 never seen anyone actually do this.
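
A common use of dups is to save STDOUT, point it somewhere else for a while, and then put it back. The following is a minimal sketch of that pattern using the 3-argument form with a lexical handle; the scratch file is an assumption made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "can't close $path: $!";

# Dup STDOUT so that we can get it back later.
open(my $saved, ">&", \*STDOUT) or die "can't dup STDOUT: $!";

# Redirect STDOUT into the scratch file and write something.
open(STDOUT, ">", $path) or die "can't redirect STDOUT: $!";
print "captured\n";
close(STDOUT) or die "can't close STDOUT: $!";

# Restore the original STDOUT from the saved duplicate.
open(STDOUT, ">&", $saved) or die "can't restore STDOUT: $!";
close($saved);

open(my $in, "<", $path) or die "can't read $path: $!";
my $captured = do { local $/; <$in> };
close($in);

print "back on the original stdout\n";
```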
519
520 Dispelling the Dweomer
521 Perl is more of a DWIMmer language than something like Java--where DWIM
522 is an acronym for "do what I mean". But this principle sometimes leads
523 to more hidden magic than one knows what to do with. In this way, Perl
524 is also filled with dweomer, an obscure word meaning an enchantment.
525 Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.
526
527 If magic "open" is a bit too magical for you, you don't have to turn to
528 "sysopen". To open a file with arbitrary weird characters in it, it's
529 necessary to protect any leading and trailing whitespace. Leading
530 whitespace is protected by inserting a "./" in front of a filename that
531 starts with whitespace. Trailing whitespace is protected by appending
532 an ASCII NUL byte ("\0") at the end of the string.
533
534 $file =~ s#^(\s)#./$1#;
535 open(FH, "< $file\0") || die "can't open $file: $!";
536
537 This assumes, of course, that your system considers dot the current
538 working directory, slash the directory separator, and disallows ASCII
539 NULs within a valid filename. Most systems follow these conventions,
540 including all POSIX systems as well as proprietary Microsoft systems.
541 The only vaguely popular system that doesn't work this way is the
542 "Classic" Macintosh system, which uses a colon where the rest of us use
543 a slash. Maybe "sysopen" isn't such a bad idea after all.
544
If you want to use "<ARGV>" processing in a totally boring and
non-magical way, you could do this first:
547
548 # "Sam sat on the ground and put his head in his hands.
549 # 'I wish I had never come here, and I don't want to see
550 # no more magic,' he said, and fell silent."
551 for (@ARGV) {
552 s#^([^./])#./$1#;
553 $_ .= "\0";
554 }
555 while (<>) {
556 # now process $_
557 }
558
559 But be warned that users will not appreciate being unable to use "-" to
560 mean standard input, per the standard convention.
561
562 Paths as Opens
563 You've probably noticed how Perl's "warn" and "die" functions can
564 produce messages like:
565
566 Some warning at scriptname line 29, <FH> line 7.
567
568 That's because you opened a filehandle FH, and had read in seven
569 records from it. But what was the name of the file, rather than the
570 handle?
571
572 If you aren't running with "strict refs", or if you've turned them off
573 temporarily, then all you have to do is this:
574
575 open($path, "< $path") || die "can't open $path: $!";
576 while (<$path>) {
577 # whatever
578 }
579
580 Since you're using the pathname of the file as its handle, you'll get
581 warnings more like
582
583 Some warning at scriptname line 29, </etc/motd> line 7.
584
585 Single Argument Open
586 Remember how we said that Perl's open took two arguments? That was a
587 passive prevarication. You see, it can also take just one argument.
588 If and only if the variable is a global variable, not a lexical, you
589 can pass "open" just one argument, the filehandle, and it will get the
590 path from the global scalar variable of the same name.
591
592 $FILE = "/etc/motd";
593 open FILE or die "can't open $FILE: $!";
594 while (<FILE>) {
595 # whatever
596 }
597
598 Why is this here? Someone has to cater to the hysterical porpoises.
599 It's something that's been in Perl since the very beginning, if not
600 before.
601
602 Playing with STDIN and STDOUT
603 One clever move with STDOUT is to explicitly close it when you're done
604 with the program.
605
606 END { close(STDOUT) || die "can't close stdout: $!" }
607
If you don't do this, and your program fills up the disk partition due
to a command line redirection, it won't report the error or exit with a
failure status.
611
612 You don't have to accept the STDIN and STDOUT you were given. You are
613 welcome to reopen them if you'd like.
614
615 open(STDIN, "< datafile")
616 || die "can't open datafile: $!";
617
618 open(STDOUT, "> output")
619 || die "can't open output: $!";
620
621 And then these can be accessed directly or passed on to subprocesses.
622 This makes it look as though the program were initially invoked with
623 those redirections from the command line.
624
625 It's probably more interesting to connect these to pipes. For example:
626
627 $pager = $ENV{PAGER} || "(less || more)";
628 open(STDOUT, "| $pager")
629 || die "can't fork a pager: $!";
630
631 This makes it appear as though your program were called with its stdout
632 already piped into your pager. You can also use this kind of thing in
633 conjunction with an implicit fork to yourself. You might do this if
634 you would rather handle the post processing in your own program, just
635 in a different process:
636
637 head(100);
638 while (<>) {
639 print;
640 }
641
642 sub head {
643 my $lines = shift || 20;
644 return if $pid = open(STDOUT, "|-"); # return if parent
645 die "cannot fork: $!" unless defined $pid;
646 while (<STDIN>) {
647 last if --$lines < 0;
648 print;
649 }
650 exit;
651 }
652
653 This technique can be applied to repeatedly push as many filters on
654 your output stream as you wish.
655
657 These topics aren't really arguments related to "open" or "sysopen",
658 but they do affect what you do with your open files.
659
660 Opening Non-File Files
661 When is a file not a file? Well, you could say when it exists but
662 isn't a plain file. We'll check whether it's a symbolic link first,
663 just in case.
664
665 if (-l $file || ! -f _) {
666 print "$file is not a plain file\n";
667 }
668
669 What other kinds of files are there than, well, files? Directories,
670 symbolic links, named pipes, Unix-domain sockets, and block and
671 character devices. Those are all files, too--just not plain files.
672 This isn't the same issue as being a text file. Not all text files are
673 plain files. Not all plain files are text files. That's why there are
674 separate "-f" and "-T" file tests.
675
676 To open a directory, you should use the "opendir" function, then
677 process it with "readdir", carefully restoring the directory name if
678 necessary:
679
680 opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
681 while (defined($file = readdir(DIR))) {
682 # do something with "$dirname/$file"
683 }
684 closedir(DIR);
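
The same loop works with a lexical directory handle. In this sketch the directory and its two files are created just for the example; note that "readdir" also returns "." and "..", and that it returns bare names, so the directory has to be put back on the front to get usable paths.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory with two empty files in it.
my $dirname = tempdir(CLEANUP => 1);
for my $name (qw(b.txt a.txt)) {
    open(my $fh, ">", "$dirname/$name") or die "can't create $name: $!";
    close($fh) or die "can't close $name: $!";
}

# Read the entries, skipping the "." and ".." that readdir also returns.
opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
my @entries = sort grep { $_ ne "." && $_ ne ".." } readdir($dh);
closedir($dh);

# readdir gives bare names; restore the directory to get usable paths.
my @paths = map { "$dirname/$_" } @entries;

print "$_\n" for @entries;
```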
685
686 If you want to process directories recursively, it's better to use the
687 File::Find module. For example, this prints out all files recursively
688 and adds a slash to their names if the file is a directory.
689
690 @ARGV = qw(.) unless @ARGV;
691 use File::Find;
692 find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;
693
694 This finds all bogus symbolic links beneath a particular directory:
695
696 find sub { print "$File::Find::name\n" if -l && !-e }, $dir;
697
698 As you see, with symbolic links, you can just pretend that it is what
699 it points to. Or, if you want to know what it points to, then
700 "readlink" is called for:
701
702 if (-l $file) {
703 if (defined($whither = readlink($file))) {
704 print "$file points to $whither\n";
705 } else {
706 print "$file points nowhere: $!\n";
707 }
708 }
709
710 Opening Named Pipes
711 Named pipes are a different matter. You pretend they're regular files,
712 but their opens will normally block until there is both a reader and a
713 writer. You can read more about them in "Named Pipes" in perlipc.
714 Unix-domain sockets are rather different beasts as well; they're
715 described in "Unix-Domain TCP Clients and Servers" in perlipc.
716
717 When it comes to opening devices, it can be easy and it can be tricky.
718 We'll assume that if you're opening up a block device, you know what
719 you're doing. The character devices are more interesting. These are
typically used for modems, mice, and some kinds of printers. This is
described in "How do I read and write the serial port?" in perlfaq8.
It's often enough to open them carefully:
723
724 sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
725 # (O_NOCTTY no longer needed on POSIX systems)
726 or die "can't open /dev/ttyS1: $!";
727 open(TTYOUT, "+>&TTYIN")
728 or die "can't dup TTYIN: $!";
729
730 $ofh = select(TTYOUT); $| = 1; select($ofh);
731
732 print TTYOUT "+++at\015";
733 $answer = <TTYIN>;
734
735 With descriptors that you haven't opened using "sysopen", such as
736 sockets, you can set them to be non-blocking using "fcntl":
737
738 use Fcntl;
739 my $old_flags = fcntl($handle, F_GETFL, 0)
740 or die "can't get flags: $!";
741 fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
742 or die "can't set non blocking: $!";
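
To see the effect without a socket, the same calls can be tried on the read end of a pipe. This is only a sketch, and it assumes a Unix-like system where "fcntl" and "O_NONBLOCK" are available: once the flag is set, reading from the empty pipe returns immediately with an error instead of waiting.

```perl
use strict;
use warnings;
use Fcntl;    # F_GETFL, F_SETFL, O_NONBLOCK

pipe(my $reader, my $writer) or die "can't create pipe: $!";

# Fetch the current flags and add O_NONBLOCK to them.
my $old_flags = fcntl($reader, F_GETFL, 0)
    or die "can't get flags: $!";
fcntl($reader, F_SETFL, $old_flags | O_NONBLOCK)
    or die "can't set non blocking: $!";

# Read the flags back to confirm the change took effect.
my $new_flags = fcntl($reader, F_GETFL, 0)
    or die "can't re-read flags: $!";
my $is_nonblocking = ($new_flags & O_NONBLOCK) ? 1 : 0;

# Nothing has been written, so this read fails at once rather than blocking.
my $got = sysread($reader, my $buf, 1024);
my $would_block = !defined($got) && ($!{EAGAIN} || $!{EWOULDBLOCK});

print "reader is non-blocking\n" if $is_nonblocking;
print "read would have blocked\n" if $would_block;
```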
743
744 Rather than losing yourself in a morass of twisting, turning "ioctl"s,
745 all dissimilar, if you're going to manipulate ttys, it's best to make
746 calls out to the stty(1) program if you have it, or else use the
747 portable POSIX interface. To figure this all out, you'll need to read
748 the termios(3) manpage, which describes the POSIX interface to tty
749 devices, and then POSIX, which describes Perl's interface to POSIX.
750 There are also some high-level modules on CPAN that can help you with
751 these games. Check out Term::ReadKey and Term::ReadLine.
752
753 Opening Sockets
754 What else can you open? To open a connection using sockets, you won't
755 use one of Perl's two open functions. See "Sockets: Client/Server
756 Communication" in perlipc for that. Here's an example. Once you have
757 it, you can use FH as a bidirectional filehandle.
758
759 use IO::Socket;
760 local *FH = IO::Socket::INET->new("www.perl.com:80");
761
762 For opening up a URL, the LWP modules from CPAN are just what the
763 doctor ordered. There's no filehandle interface, but it's still easy
764 to get the contents of a document:
765
766 use LWP::Simple;
767 $doc = get('http://www.linpro.no/lwp/');
768
769 Binary Files
770 On certain legacy systems with what could charitably be called
771 terminally convoluted (some would say broken) I/O models, a file isn't
772 a file--at least, not with respect to the C standard I/O library. On
773 these old systems whose libraries (but not kernels) distinguish between
774 text and binary streams, to get files to behave properly you'll have to
775 bend over backwards to avoid nasty problems. On such infelicitous
776 systems, sockets and pipes are already opened in binary mode, and there
777 is currently no way to turn that off. With files, you have more
778 options.
779
780 One option is to use the "binmode" function on the appropriate
781 handles before doing regular I/O on them:
782
783 binmode(STDIN);
784 binmode(STDOUT);
785 while (<STDIN>) { print }
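
The same idea applies to ordinary files. The following sketch writes
bytes that a text-mode stream on some systems would alter, reads them
back, and compares. The scratch file from File::Temp is just a
convenience for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Bytes that a text-mode stream on some systems would alter: CR LF pairs,
# a Control-Z, and a NUL.
my $data = "line one\r\nline two\r\n\x1a\x00tail";

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out) or die "can't binmode $path: $!";
print {$out} $data or die "can't write $path: $!";
close($out) or die "can't close $path: $!";

open(my $in, "<", $path) or die "can't open $path: $!";
binmode($in) or die "can't binmode $path: $!";
my $copy = do { local $/; <$in> };     # slurp the whole file
close($in);

print "round trip ", ($copy eq $data ? "preserved" : "changed"), " the data\n";
```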
786
787 Passing "sysopen" the non-standard "O_BINARY" flag will also open the file
788 in binary mode on those systems that support it. This is the equivalent
789 of opening the file normally, then calling "binmode" on the handle.
790
791 sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
792 || die "can't open records.data: $!";
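
Because "O_BINARY" is non-standard, code meant to run on several
systems may want to probe for it first. A hedged sketch follows; the
fallback to 0 and the scratch path are choices made for this example,
not something the Fcntl module requires.

```perl
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);

# O_BINARY is not defined everywhere. Asking for it where it is missing
# may raise an exception, so probe inside eval and fall back to 0,
# which adds no extra flag.
my $O_BINARY = eval { Fcntl::O_BINARY() } || 0;

my $path = tempdir(CLEANUP => 1) . "/records.data";   # scratch location
sysopen(my $bindat, $path, O_RDWR | O_CREAT | $O_BINARY)
    or die "can't open $path: $!";
binmode($bindat);     # harmless where O_BINARY already did the job
print {$bindat} "\x00\x01\r\n\xff" or die "can't write $path: $!";
close($bindat) or die "can't close $path: $!";
```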
793
794 Now you can use "read" and "print" on that handle without worrying
795 about the non-standard system I/O library breaking your data. It's not
796 a pretty picture, but then, legacy systems seldom are. CP/M will be
797 with us until the end of days, and after.
798
799 On platforms with exotic I/O systems, it turns out that, astonishingly
800 enough, even unbuffered I/O using "sysread" and "syswrite" might do
801 sneaky data mutilation behind your back.
802
803 while (sysread(WHENCE, $buf, 1024)) {
804 syswrite(WHITHER, $buf, length($buf));
805 }
806
807 Depending on the vicissitudes of your runtime system, even these calls
808 may need "binmode" or "O_BINARY" first. Systems known to be free of
809 such difficulties include Unix, the Mac OS, Plan 9, and Inferno.
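
The loop above also ignores read errors and partial writes. A more
defensive sketch is shown below; the helper name copy_bytes is
hypothetical, and the scratch files exist only to exercise it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copy everything from $from to $to, byte for byte.
sub copy_bytes {
    my ($from, $to) = @_;
    binmode($from);                    # makes no difference on Unix
    binmode($to);
    my $total = 0;
    while (1) {
        my $got = sysread($from, my $buf, 1024);
        die "read failed: $!" unless defined $got;
        last if $got == 0;             # end of file
        my $offset = 0;
        while ($offset < $got) {       # syswrite may write only part of $buf
            my $wrote = syswrite($to, $buf, $got - $offset, $offset);
            die "write failed: $!" unless defined $wrote;
            $offset += $wrote;
        }
        $total += $got;
    }
    return $total;
}

# Exercise the helper on 4000 bytes of awkward data.
my ($src, $src_path) = tempfile(UNLINK => 1);
binmode($src);
print {$src} "a\r\n\x00" x 1000;
close($src) or die "can't close $src_path: $!";

open(my $whence, "<", $src_path) or die "can't open $src_path: $!";
my ($whither, $dst_path) = tempfile(UNLINK => 1);
my $copied = copy_bytes($whence, $whither);
close($whither) or die "can't close $dst_path: $!";
print "copied $copied bytes\n";
```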
810
811 File Locking
812 In a multitasking environment, you may need to be careful not to
813 collide with other processes that want to do I/O on the same files
814 you are working on. You'll often need shared or exclusive locks on
815 files for reading and writing respectively. You might just pretend
816 that only exclusive locks exist.
817
818 Never use the existence of a file "-e $file" as a locking indication,
819 because there is a race condition between the test for the existence of
820 the file and its creation. It's possible for another process to create
821 a file in the slice of time between your existence check and your
822 attempt to create the file. Atomicity is critical.
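
When the goal really is "create this file only if nobody else has",
the check and the creation can be folded into one step with "sysopen"
and "O_EXCL". A minimal sketch, with a made-up helper name and a
scratch directory; note that "O_EXCL" has historically been
unreliable on some network filesystems.

```perl
use strict;
use warnings;
use Fcntl;                              # O_WRONLY, O_CREAT, O_EXCL
use File::Temp qw(tempdir);

# Hypothetical helper: create $path only if it does not exist yet.
# Returns a write handle on success, undef if the file was already there.
sub create_exclusively {
    my ($path) = @_;
    if (sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL)) {
        return $fh;
    }
    return if $!{EEXIST};               # somebody else got there first
    die "can't create $path: $!";
}

my $path   = tempdir(CLEANUP => 1) . "/marker";   # scratch location
my $first  = create_exclusively($path);
my $second = create_exclusively($path);
print "first attempt ",  ($first  ? "created the file" : "lost"), "\n";
print "second attempt ", ($second ? "created the file" : "found it already there"), "\n";
```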
823
824 Perl's most portable locking interface is via the "flock" function,
825 whose simplicity is emulated on systems that don't directly support it,
826 such as SysV or Windows. The underlying semantics may affect how it
827 all works, so you should learn how "flock" is implemented on your
828 system's port of Perl.
829
830 File locking does not stop another process from doing I/O on the
831 file; a lock only keeps out others that are trying to get a
832 conflicting lock. Because locks are advisory, if one process
833 uses locking and another doesn't, all bets are off.
834
835 By default, the "flock" call will block until a lock is granted. A
836 request for a shared lock will be granted as soon as there is no
837 exclusive locker. A request for an exclusive lock will be granted as
838 soon as there is no locker of any kind. Locks are on file descriptors,
839 not file names. You can't lock a file until you open it, and you can't
840 hold on to a lock once the file has been closed.
841
842 Here's how to get a blocking shared lock on a file, typically used for
843 reading:
844
845 use 5.004;
846 use Fcntl qw(:DEFAULT :flock);
847 open(FH, "< filename") or die "can't open filename: $!";
848 flock(FH, LOCK_SH) or die "can't lock filename: $!";
849 # now read from FH
850
851 You can get a non-blocking lock by using "LOCK_NB".
852
853 flock(FH, LOCK_SH | LOCK_NB)
854 or die "can't lock filename: $!";
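
With "LOCK_NB", a failed "flock" can mean either "somebody else holds
the lock" or a genuine error. The sketch below tells the two apart
and also shows that closing a handle releases its lock. The helper
name try_lock is hypothetical, and the demonstration assumes a system
with native flock(2), such as Linux, where two separate opens of one
file compete even inside a single process.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

# Hypothetical helper: returns 1 if the lock was granted, 0 if another
# holder is in the way, and dies on any other error.
sub try_lock {
    my ($fh, $mode) = @_;
    return 1 if flock($fh, $mode | LOCK_NB);
    return 0 if $!{EWOULDBLOCK} || $!{EAGAIN};
    die "can't lock: $!";
}

my ($writer, $path) = tempfile(UNLINK => 1);
flock($writer, LOCK_EX) or die "can't lock $path: $!";

# A second, independent open of the same file competes for the lock.
open(my $reader, "<", $path) or die "can't open $path: $!";
my $before = try_lock($reader, LOCK_SH);
close($writer);                        # closing the handle drops its lock
my $after  = try_lock($reader, LOCK_SH);
print "before close: $before, after close: $after\n";
```

Where "flock" is emulated with fcntl(2) locks, the first attempt may
succeed, because such locks do not conflict within one process.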
855
856 This can be useful for producing more user-friendly behaviour by
857 warning if you're going to be blocking:
858
859 use 5.004;
860 use Fcntl qw(:DEFAULT :flock);
861 open(FH, "< filename") or die "can't open filename: $!";
862 unless (flock(FH, LOCK_SH | LOCK_NB)) {
863 $| = 1;
864 print "Waiting for lock...";
865 flock(FH, LOCK_SH) or die "can't lock filename: $!";
866 print "got it.\n";
867 }
868 # now read from FH
869
870 To get an exclusive lock, typically used for writing, you have to be
871 careful. We "sysopen" the file so it can be locked before it gets
872 emptied. You can get a non-blocking version using "LOCK_EX | LOCK_NB".
873
874 use 5.004;
875 use Fcntl qw(:DEFAULT :flock);
876 sysopen(FH, "filename", O_WRONLY | O_CREAT)
877 or die "can't open filename: $!";
878 flock(FH, LOCK_EX)
879 or die "can't lock filename: $!";
880 truncate(FH, 0)
881 or die "can't truncate filename: $!";
882 # now write to FH
883
884 Finally, due to the uncounted millions who cannot be dissuaded from
885 wasting cycles on useless vanity devices called hit counters, here's
886 how to increment a number in a file safely:
887
888 use Fcntl qw(:DEFAULT :flock);
889
890 sysopen(FH, "numfile", O_RDWR | O_CREAT)
891 or die "can't open numfile: $!";
892 # autoflush FH
893 $ofh = select(FH); $| = 1; select ($ofh);
894 flock(FH, LOCK_EX)
895 or die "can't write-lock numfile: $!";
896
897 $num = <FH> || 0;
898 seek(FH, 0, 0)
899 or die "can't rewind numfile: $!";
900 print FH $num+1, "\n"
901 or die "can't write numfile: $!";
902
903 truncate(FH, tell(FH))
904 or die "can't truncate numfile: $!";
905 close(FH)
906 or die "can't close numfile: $!";
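
The same steps can be wrapped in a sub that uses a lexical filehandle,
which makes them easier to reuse and to try out. This is only a
sketch: the name bump_counter and the scratch directory are
assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use IO::Handle;                         # for the autoflush method
use File::Temp qw(tempdir);

# Hypothetical helper: add one to the number stored in $path and return
# the new value, holding an exclusive lock for the whole read-modify-write.
sub bump_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT)
        or die "can't open $path: $!";
    $fh->autoflush(1);
    flock($fh, LOCK_EX)
        or die "can't write-lock $path: $!";
    my $num = <$fh> || 0;               # an empty or new file counts as 0
    chomp $num;
    seek($fh, 0, 0)
        or die "can't rewind $path: $!";
    print {$fh} $num + 1, "\n"
        or die "can't write $path: $!";
    truncate($fh, tell($fh))
        or die "can't truncate $path: $!";
    close($fh)
        or die "can't close $path: $!";
    return $num + 1;
}

my $path = tempdir(CLEANUP => 1) . "/numfile";    # scratch location
my @seen = map { bump_counter($path) } 1 .. 3;
print "@seen\n";                        # prints "1 2 3"
```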
907
908 IO Layers
909 In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced. This
910 is a new "plumbing" for all the I/O happening in Perl; for the most
911 part everything will work just as it did, but PerlIO also brought in
912 some new features such as the ability to think of I/O as "layers". An
913 I/O layer may, in addition to just moving the data, also transform it.
914 Such transformations may include compression and decompression,
915 encryption and decryption, and conversion between various character
916 encodings.
917
918 A full discussion of the features of PerlIO is beyond the scope of this
919 tutorial, but here is how to recognize the layers being used:
920
921 · The three-(or more)-argument form of "open" is being used and the
922 second argument contains something else in addition to the usual
923 '<', '>', '>>', '|' and their variants, for example:
924
925 open(my $fh, "<:crlf", $fn);
926
927 · The two-argument form of "binmode" is being used, for example:
928
929 binmode($fh, ":encoding(utf16)");
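
To make the effect of a layer visible, the sketch below writes one
character through an encoding layer, then reads the file back both
as raw bytes and as decoded characters. The UTF-8 encoding and the
scratch file are choices made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# Write one character, U+00E9, through an encoding layer.
open(my $out, ">:encoding(UTF-8)", $path) or die "can't open $path: $!";
print {$out} "\x{e9}";
close($out) or die "can't close $path: $!";

# Read it back as raw bytes: the file holds two of them.
open(my $raw, "<:raw", $path) or die "can't open $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# Read it back through a layer: the two bytes decode to one character.
open(my $in, "<", $path) or die "can't open $path: $!";
binmode($in, ":encoding(UTF-8)");          # two-argument binmode adds a layer
my @layers = PerlIO::get_layers($in);      # names of the layers now on $in
my $chars  = do { local $/; <$in> };
close($in);

printf "%d byte(s), %d character(s), layers: %s\n",
    length($bytes), length($chars), join(" ", @layers);
```

The exact layer names reported by PerlIO::get_layers vary with the
platform and the Perl build.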
930
931 For a more detailed discussion of PerlIO see PerlIO; for a more detailed
932 discussion of Unicode and I/O see perluniintro.
933
935 The "open" and "sysopen" functions in perlfunc(1); the system open(2),
936 dup(2), fopen(3), and fdopen(3) manpages; the POSIX documentation.
937
939 Copyright 1998 Tom Christiansen.
940
941 This documentation is free; you can redistribute it and/or modify it
942 under the same terms as Perl itself.
943
944 Irrespective of its distribution, all code examples in these files are
945 hereby placed into the public domain. You are permitted and encouraged
946 to use this code in your own programs for fun or for profit as you see
947 fit. A simple comment in the code giving credit would be courteous but
948 is not required.
949
951 First release: Sat Jan 9 08:09:11 MST 1999
952
953
954
955perl v5.10.1 2009-02-12 PERLOPENTUT(1)