PERLOPENTUT(1)         Perl Programmers Reference Guide         PERLOPENTUT(1)

perlopentut - tutorial on opening things in Perl

Perl has two simple, built-in ways to open files: the shell way for
convenience, and the C way for precision. The shell way also has 2-
and 3-argument forms, which have different semantics for handling the
filename. The choice is yours.

Perl's "open" function was designed to mimic the way command-line
redirection in the shell works. Here are some basic examples from the
shell:

    $ myprogram file1 file2 file3
    $ myprogram < inputfile
    $ myprogram > outputfile
    $ myprogram >> outputfile
    $ myprogram | otherprogram
    $ otherprogram | myprogram

And here are some more advanced examples:

    $ otherprogram | myprogram f1 - f2
    $ otherprogram 2>&1 | myprogram -
    $ myprogram <&3
    $ myprogram >&4

Programmers accustomed to constructs like those above can take comfort
in learning that Perl directly supports these familiar constructs using
virtually the same syntax as the shell.

Simple Opens
The "open" function takes two arguments: the first is a filehandle, and
the second is a single string comprising both what to open and how to
open it. "open" returns true when it works, and when it fails, returns
a false value and sets the special variable $! to reflect the system
error. If the filehandle was previously opened, it will be implicitly
closed first.

For example:

    open(INFO, "datafile") || die("can't open datafile: $!");
    open(INFO, "< datafile") || die("can't open datafile: $!");
    open(RESULTS,"> runstats") || die("can't open runstats: $!");
    open(LOG, ">> logfile ") || die("can't open logfile: $!");

If you prefer the low-punctuation version, you could write that this
way:

    open INFO, "< datafile" or die "can't open datafile: $!";
    open RESULTS,"> runstats" or die "can't open runstats: $!";
    open LOG, ">> logfile " or die "can't open logfile: $!";

A few things to notice. First, the leading "<" is optional. If
omitted, Perl assumes that you want to open the file for reading.

Note also that the first example uses the "||" logical operator, and
the second uses "or", which has lower precedence. Using "||" in the
latter examples would effectively mean

    open INFO, ( "< datafile" || die "can't open datafile: $!" );

which is definitely not what you want.

The other important thing to notice is that, just as in the shell, any
whitespace before or after the filename is ignored. This is good,
because you wouldn't want these to do different things:

    open INFO,   "<datafile"
    open INFO,   "< datafile"
    open INFO,   "<  datafile"

Ignoring surrounding whitespace also helps for when you read a filename
in from a different file, and forget to trim it before opening:

    $filename = <INFO>;         # oops, \n still there
    open(EXTRA, "< $filename") || die "can't open $filename: $!";

This is not a bug, but a feature. Because "open" mimics the shell in
its style of using redirection arrows to specify how to open the file,
it also does so with respect to extra whitespace around the filename
itself. For accessing files with naughty names, see
"Dispelling the Dweomer".

There is also a 3-argument version of "open", which lets you put the
special redirection characters into their own argument:

    open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

In this case, the filename to open is the actual string in $datafile,
so you don't have to worry about $datafile containing characters that
might influence the open mode, or whitespace at the beginning of the
filename that would be absorbed in the 2-argument version. Also, any
reduction of unnecessary string interpolation is a good thing.
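
To make the literal-filename point concrete, here is a minimal sketch,
assuming a Unix-like filesystem that permits a trailing space in a
name. The helper write_and_read_back and the file name "report " are
hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: create $path with 3-argument open, then read it back.
sub write_and_read_back {
    my ($path, $text) = @_;

    # The mode is its own argument, so $path is used exactly as given.
    open(my $out, ">", $path) || die "Can't create $path: $!";
    print {$out} $text;
    close($out) || die "Can't close $path: $!";

    open(my $in, "<", $path) || die "Can't read $path: $!";
    my $line = <$in>;
    close($in);
    return $line;
}

my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/report ";              # note the trailing space
print write_and_read_back($name, "hello\n");

# If the space had been trimmed, a file called "report" would exist instead.
my $trimmed = -e "$dir/report";
print $trimmed ? "space was trimmed\n" : "space was kept\n";
```

With the 2-argument form, the same string would lose its trailing
space before the file was opened.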
100
Indirect Filehandles
"open"'s first argument can be a reference to a filehandle. As of perl
5.6.0, if the argument is uninitialized, Perl will automatically create
a filehandle and put a reference to it in the first argument, like so:

    open( my $in, $infile ) or die "Couldn't read $infile: $!";
    while ( <$in> ) {
        # do something with $_
    }
    close $in;

Indirect filehandles make namespace management easier. Since
filehandles are global to the current package, two subroutines trying
to open "INFILE" will clash. With two functions opening indirect
filehandles like "my $infile", there's no clash and no need to worry
about future conflicts.

Another convenient behavior is that an indirect filehandle
automatically closes when there are no more references to it:

    sub firstline {
        open( my $in, shift ) && return scalar <$in>;
        # no close() required
    }

Indirect filehandles also make it easy to pass filehandles to and
return filehandles from subroutines:

    for my $file ( qw(this.conf that.conf) ) {
        my $fin = open_or_throw('<', $file);
        process_conf( $fin );
        # no close() needed
    }

    use Carp;
    sub open_or_throw {
        my ($mode, $filename) = @_;
        open my $h, $mode, $filename
            or croak "Could not open '$filename': $!";
        return $h;
    }
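
A caller that would rather recover than die can trap the exception.
The sketch below repeats open_or_throw from above and adds a
hypothetical count_lines caller that wraps it in "eval"; the name
count_lines is an assumption made for this illustration.

```perl
use strict;
use warnings;
use Carp;

sub open_or_throw {
    my ($mode, $filename) = @_;
    open my $h, $mode, $filename
        or croak "Could not open '$filename': $!";
    return $h;
}

# Hypothetical caller: returns the line count, or undef if the open fails.
sub count_lines {
    my ($filename) = @_;
    my $fh = eval { open_or_throw('<', $filename) };
    return undef unless $fh;

    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;      # $fh goes out of scope here, which closes the file
}

my $n = count_lines($0);        # $0 is the path of the running script
print defined $n ? "$0 has $n lines\n" : "could not read $0\n";
```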
142
Pipe Opens
In C, when you want to open a file using the standard I/O library, you
use the "fopen" function, but when opening a pipe, you use the "popen"
function. But in the shell, you just use a different redirection
character. That's also the case for Perl. The "open" call remains the
same--just its argument differs.

If the leading character is a pipe symbol, "open" starts up a new
command and opens a write-only filehandle leading into that command.
This lets you write into that handle and have what you write show up on
that command's standard input. For example:

    open(PRINTER, "| lpr -Plp1") || die "can't run lpr: $!";
    print PRINTER "stuff\n";
    close(PRINTER) || die "can't close lpr: $!";

If the trailing character is a pipe, you start up a new command and
open a read-only filehandle leading out of that command. This lets
whatever that command writes to its standard output show up on your
handle for reading. For example:

    open(NET, "netstat -i -n |") || die "can't fork netstat: $!";
    while (<NET>) { }               # do something with input
    close(NET) || die "can't close netstat: $!";

What happens if you try to open a pipe to or from a non-existent
command? If possible, Perl will detect the failure and set $! as
usual. But if the command contains special shell characters, such as
">" or "*", called 'metacharacters', Perl does not execute the command
directly. Instead, Perl runs the shell, which then tries to run the
command. This means that it's the shell that gets the error
indication. In such a case, the "open" call will only indicate failure
if Perl can't even run the shell. See "How can I capture STDERR from
an external command?" in perlfaq8 to see how to cope with this.
There's also an explanation in perlipc.

If you would like to open a bidirectional pipe, the IPC::Open2 library
will handle this for you. Check out "Bidirectional Communication with
Another Process" in perlipc.

perl-5.6.x introduced a version of piped open that executes a process
based on its command line arguments without relying on the shell.
(Similar to the "system(@LIST)" notation.) This is safer and faster
than executing a single argument pipe-command, but does not allow
special shell constructs. (It is also not supported on Microsoft
Windows, Mac OS Classic or RISC OS.)

Here's an example of "open '-|'", which prints a random Unix fortune
cookie as uppercase:

    my $collection = shift(@ARGV);
    open my $fortune, '-|', 'fortune', $collection
        or die "Could not find fortune - $!";
    while (<$fortune>)
    {
        print uc($_);
    }
    close($fortune);
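
Because the list form bypasses the shell, arguments containing
metacharacters reach the command untouched. The following sketch shows
this under the assumption of a platform where list-form pipe opens
work; the helper read_from_command is hypothetical, and it runs the
current perl binary ($^X) so that no other program is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list, return its output lines.
sub read_from_command {
    my @command = @_;
    open(my $pipe, '-|', @command)
        or die "can't run $command[0]: $!";
    my @lines = <$pipe>;
    close($pipe) or die "$command[0] failed: status $?";
    chomp @lines;
    return @lines;
}

# The child simply echoes its arguments, one per line.  A shell would
# have expanded "*.c" and split "a; b" into two commands.
my @out = read_from_command($^X, '-e', 'print "$_\n" for @ARGV',
                            '*.c', 'a; b');
print "got: $_\n" for @out;
```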
201
And this "open '|-'" pipes into lpr:

    open my $printer, '|-', 'lpr', '-Plp1'
        or die "can't run lpr: $!";
    print {$printer} "stuff\n";
    close($printer)
        or die "can't close lpr: $!";
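
Checking "close" on a pipe matters because that is where the command's
exit status arrives: "close" waits for the command and leaves its
status in $?. Here is a small sketch; the helper exit_code_of is
hypothetical, and the child is again the running perl ($^X).

```perl
use strict;
use warnings;

# Hypothetical helper: run a command list, discard its output, and return
# the exit code that is available once the pipe has been closed.
sub exit_code_of {
    my @command = @_;
    open(my $pipe, '-|', @command)
        or die "can't run $command[0]: $!";
    1 while <$pipe>;            # drain whatever the command prints
    close($pipe);               # returns false if the command failed
    return $? >> 8;             # the exit code is the high byte of $?
}

print "exit code: ", exit_code_of($^X, '-e', 'exit 3'), "\n";
```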
209
The Minus File
Again following the lead of the standard shell utilities, Perl's "open"
function treats a file whose name is a single minus, "-", in a special
way. If you open minus for reading, it really means to access the
standard input. If you open minus for writing, it really means to
access the standard output.

If minus can be used as the default input or default output, what
happens if you open a pipe into or out of minus? What's the default
command it would run? The same script as you're currently running!
This is actually a stealth "fork" hidden inside an "open" call. See
"Safe Pipe Opens" in perlipc for details.

Mixing Reads and Writes
It is possible to specify both read and write access. All you do is
add a "+" symbol in front of the redirection. But as in the shell,
using a less-than on a file never creates a new file; it only opens an
existing one. On the other hand, using a greater-than always clobbers
(truncates to zero length) an existing file, or creates a brand-new one
if there isn't an old one. Adding a "+" for read-write doesn't affect
whether it only works on existing files or always clobbers existing
ones.

    open(WTMP, "+< /usr/adm/wtmp")
        || die "can't open /usr/adm/wtmp: $!";

    open(SCREEN, "+> lkscreen")
        || die "can't open lkscreen: $!";

    open(LOGFILE, "+>> /var/log/applog")
        || die "can't open /var/log/applog: $!";

The first one won't create a new file, and the second one will always
clobber an old one. The third one will create a new file if necessary
and not clobber an old one, and it will allow you to read at any point
in the file, but all writes will always go to the end. In short, the
first case is substantially more common than the second and third
cases, which are almost always wrong. (If you know C, the plus in
Perl's "open" is historically derived from the one in C's fopen(3S),
which it ultimately calls.)
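
As an illustration of the common "+<" case, the sketch below reads a
value from an existing file and rewrites it in place. The helper
bump_counter and the "count=N" file layout are hypothetical, chosen
only for this example; note the "seek" between reading and writing.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: bump a "count=N" line stored at the start of $path.
sub bump_counter {
    my ($path) = @_;
    open(my $fh, "+<", $path) or die "can't update $path: $!";

    my $line = <$fh>;
    my $n = ($line // '') =~ /^count=(\d+)/ ? $1 : 0;

    seek($fh, 0, 0)          or die "can't rewind $path: $!";
    print {$fh} "count=", $n + 1, "\n";
    truncate($fh, tell($fh)) or die "can't truncate $path: $!";
    close($fh)               or die "can't close $path: $!";
    return $n + 1;
}

my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "count=41\n";
close($tmp) or die "can't close $path: $!";
print "new value: ", bump_counter($path), "\n";
```

Opening the same file with "+>" instead would have emptied it before
the first read.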
250
In fact, when it comes to updating a file, unless you're working on a
binary file as in the WTMP case above, you probably don't want to use
this approach for updating. Instead, Perl's -i flag comes to the
rescue. The following command takes all the C, C++, or yacc source or
header files and changes all their foo's to bar's, leaving the old
version in the original filename with a ".orig" tacked on the end:

    $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

This is a shortcut for some renaming games that are really the best
way to update text files. See the second question in perlfaq5 for more
details.

Filters
One of the most common uses for "open" is one you never even notice.
When you process the ARGV filehandle using "<ARGV>", Perl actually does
an implicit open on each file in @ARGV. Thus a program called like
this:

    $ myprogram file1 file2 file3

can have all its files opened and processed one at a time using a
construct no more complex than:

    while (<>) {
        # do something with $_
    }

If @ARGV is empty when the loop first begins, Perl pretends you've
opened up minus, that is, the standard input. In fact, $ARGV, the
currently open file during "<ARGV>" processing, is even set to "-" in
these circumstances.
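
The implicit opens and the $ARGV variable can be watched in a small,
self-contained sketch. The helper lines_with_source is hypothetical;
it localizes @ARGV so that the caller's argument list is left alone.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the named files through <> and tag each line
# with the file it came from, as reported by $ARGV.
sub lines_with_source {
    local @ARGV = @_;
    my @tagged;
    while (my $line = <>) {
        chomp $line;
        push @tagged, "$ARGV: $line";
    }
    return @tagged;
}

# Build two throwaway input files.
my @paths;
for my $text ("alpha\nbeta\n", "gamma\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh) or die "can't close $path: $!";
    push @paths, $path;
}

print "$_\n" for lines_with_source(@paths);
```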
283
You are welcome to pre-process your @ARGV before starting the loop to
make sure it's to your liking. One reason to do this might be to
remove command options beginning with a minus. While you can always
roll the simple ones by hand, the Getopt modules are good for this:

    use Getopt::Std;

    # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
    getopts("vDo:");

    # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
    getopts("vDo:", \%args);

Or the standard Getopt::Long module to permit named arguments:

    use Getopt::Long;
    GetOptions( "verbose"  => \$verbose,        # --verbose
                "Debug"    => \$debug,          # --Debug
                "output=s" => \$output );
        # --output=somestring or --output somestring

Another reason for preprocessing arguments is to make an empty argument
list default to all files:

    @ARGV = glob("*") unless @ARGV;

You could even filter out all but plain text files. This is a bit
silent, of course, and you might prefer to mention them on the way.

    @ARGV = grep { -f && -T } @ARGV;

If you're using the -n or -p command-line options, you should put
changes to @ARGV in a "BEGIN{}" block.

Remember that a normal "open" has special properties, in that it might
call fopen(3S) or it might call popen(3S), depending on what its
argument looks like; that's why it's sometimes called "magic open".
Here's an example:

    $pwdinfo = `domainname` =~ /^(\(none\))?$/
                    ? '< /etc/passwd'
                    : 'ypcat passwd |';

    open(PWD, $pwdinfo)
        or die "can't open $pwdinfo: $!";

This sort of thing also comes into play in filter processing. Because
"<ARGV>" processing employs the normal, shell-style Perl "open", it
respects all the special things we've already seen:

    $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

That program will read from the file f1, the process cmd1, standard
input (tmpfile in this case), the f2 file, the cmd2 command, and
finally the f3 file.
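
The same behavior can be reproduced from inside a program by filling
@ARGV yourself. This is only a sketch: the helper read_magic is
hypothetical, and the example assumes a Unix-like system with an
"echo" command on the path.

```perl
use strict;
use warnings;

# Hypothetical helper: feed a list of <ARGV>-style arguments to <> and
# return every line read.  Arguments ending in "|" are run as commands.
sub read_magic {
    local @ARGV = @_;
    my @lines = <>;
    chomp @lines;
    return @lines;
}

# Assumption: "echo" is available, as on any Unix-like system.
print "$_\n" for read_magic("echo from a pipe |");
```

The flip side is that any untrusted string placed in @ARGV can start a
command in the same way, so filter such input before the loop runs.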
339
Yes, this also means that if you have files named "-" (and so on) in
your directory, they won't be processed as literal files by "open".
You'll need to pass them as "./-", much as you would for the rm
program, or you could use "sysopen" as described below.

One of the more interesting applications is to change files of a
certain name into pipes. For example, to autoprocess gzipped or
compressed files by decompressing them with gzip:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;

Or, if you have the GET program installed from LWP, you can fetch URLs
before processing them:

    @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

It's not for nothing that this is called magic "<ARGV>". Pretty nifty,
eh?

If you want the convenience of the shell, then Perl's "open" is
definitely the way to go. On the other hand, if you want finer
precision than C's simplistic fopen(3S) provides, you should look to
Perl's "sysopen", which is a direct hook into the open(2) system call.
That does mean it's a bit more involved, but that's the price of
precision.

"sysopen" takes 3 (or 4) arguments.

    sysopen HANDLE, PATH, FLAGS, [MASK]

The HANDLE argument is a filehandle just as with "open". The PATH is a
literal path, one that doesn't pay attention to any greater-thans or
less-thans or pipes or minuses, nor ignore whitespace. If it's there,
it's part of the path. The FLAGS argument contains one or more values
derived from the Fcntl module that have been or'd together using the
bitwise "|" operator. The final argument, the MASK, is optional; if
present, it is combined with the user's current umask for the creation
mode of the file. You should usually omit this.

Although the traditional values of read-only, write-only, and
read-write are 0, 1, and 2 respectively, this is known not to hold true
on some systems. Instead, it's best to load in the appropriate
constants first from the Fcntl module, which supplies the following
standard flags:

    O_RDONLY            Read only
    O_WRONLY            Write only
    O_RDWR              Read and write
    O_CREAT             Create the file if it doesn't exist
    O_EXCL              Fail if the file already exists
    O_APPEND            Append to the file
    O_TRUNC             Truncate the file
    O_NONBLOCK          Non-blocking access

Less common flags that are sometimes available on some operating
systems include "O_BINARY", "O_TEXT", "O_SHLOCK", "O_EXLOCK",
"O_DEFER", "O_SYNC", "O_ASYNC", "O_DSYNC", "O_RSYNC", "O_NOCTTY",
"O_NDELAY" and "O_LARGEFILE". Consult your open(2) manpage or its
local equivalent for details. (Note: starting from Perl release 5.6
the "O_LARGEFILE" flag, if available, is automatically added to the
sysopen() flags because large files are the default.)

Here's how to use "sysopen" to emulate the simple "open" calls we had
before. We'll omit the "|| die $!" checks for clarity, but make sure
you always check the return values in real code. These aren't quite
the same, since "open" will trim leading and trailing whitespace, but
you'll get the idea.

To open a file for reading:

    open(FH, "< $path");
    sysopen(FH, $path, O_RDONLY);

To open a file for writing, creating a new file if needed or else
truncating an old file:

    open(FH, "> $path");
    sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);

To open a file for appending, creating one if necessary:

    open(FH, ">> $path");
    sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

To open a file for update, where the file must already exist:

    open(FH, "+< $path");
    sysopen(FH, $path, O_RDWR);

And here are things you can do with "sysopen" that you cannot do with a
regular "open". As you'll see, it's just a matter of controlling the
flags in the third argument.

To open a file for writing, creating a new file which must not
previously exist:

    sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);

To open a file for appending, where that file must already exist:

    sysopen(FH, $path, O_WRONLY | O_APPEND);

To open a file for update, creating a new file if necessary:

    sysopen(FH, $path, O_RDWR | O_CREAT);

To open a file for update, where that file must not already exist:

    sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);

To open a file without blocking, creating one if necessary:

    sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);
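
As a worked example of the "must not previously exist" case, the
sketch below tries to create the same file twice with "O_EXCL". The
helper create_exclusively and the file name "app.lock" are
hypothetical, and the return value is checked here, as it should be in
real code.

```perl
use strict;
use warnings;
use Fcntl;                          # exports O_WRONLY, O_EXCL, O_CREAT, ...
use File::Temp qw(tempdir);

# Hypothetical helper: create $path only if it does not exist yet.
# Returns the open handle, or nothing (with $! set) if the create fails.
sub create_exclusively {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT)
        or return;
    return $fh;
}

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/app.lock";         # hypothetical file name

my $first  = create_exclusively($path);
my $second = create_exclusively($path);
print "first attempt:  ", ($first  ? "created" : "refused"), "\n";
print "second attempt: ", ($second ? "created" : "refused"), "\n";
```

When the second attempt is refused, $!{EEXIST} is true, which lets a
caller tell "already there" apart from other failures.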
454
Permissions a la mode
If you omit the MASK argument to "sysopen", Perl uses the octal value
0666. The normal MASK to use for executables and directories should be
0777, and for anything else, 0666.

Why so permissive? Well, it isn't really. The MASK will be modified
by your process's current "umask". A umask is a number representing
disabled permissions bits; that is, bits that will not be turned on in
the created file's permissions field.

For example, if your "umask" were 027, then the 020 part would disable
the group from writing, and the 007 part would disable others from
reading, writing, or executing. Under these conditions, passing
"sysopen" 0666 would create a file with mode 0640, since "0666 & ~027"
is 0640.
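
The arithmetic can be checked directly. This sketch assumes a
Unix-like system whose filesystem honors permission bits; the helper
mode_after_create is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);

# Hypothetical helper: create $path with the requested MASK under a
# temporary umask, and return the permission bits the file ended up with.
sub mode_after_create {
    my ($path, $mask, $umask) = @_;
    my $old = umask($umask);
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, $mask)
        or die "can't create $path: $!";
    close($fh);
    umask($old);                        # restore the caller's umask
    return (stat($path))[2] & 07777;    # keep only the permission bits
}

my $dir = tempdir(CLEANUP => 1);
printf "requested %04o, umask %03o, got %04o\n",
    0666, 027, mode_after_create("$dir/demo", 0666, 027);
printf "computed  %04o\n", 0666 & ~027;
```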
470
You should seldom use the MASK argument to "sysopen()". That takes
away the user's freedom to choose what permission new files will have.
Denying choice is almost always a bad thing. One exception would be
for cases where sensitive or private data is being stored, such as with
mail folders, cookie files, and internal temporary files.

Re-Opening Files (dups)
Sometimes you already have a filehandle open, and want to make another
handle that's a duplicate of the first one. In the shell, we place an
ampersand in front of a file descriptor number when doing redirections.
For example, "2>&1" makes descriptor 2 (that's STDERR in Perl) be
redirected into descriptor 1 (which is usually Perl's STDOUT). The
same is essentially true in Perl: a filename that begins with an
ampersand is treated instead as a file descriptor if a number, or as a
filehandle if a string.

    open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
    open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!";

That means that if a function is expecting a filename, but you don't
want to give it a filename because you already have the file open, you
can just pass the filehandle with a leading ampersand. It's best to
use a fully qualified handle though, just in case the function happens
to be in a different package:

    somefunction("&main::LOGFILE");

This way if somefunction() is planning on opening its argument, it can
just use the already opened handle. This differs from passing a
handle, because with a handle, you don't open the file. Here you have
something you can pass to open.

If you have one of those tricky, newfangled I/O objects that the C++
folks are raving about, then this doesn't work because those aren't
proper filehandles in the native Perl sense. You'll have to use
fileno() to pull out the proper descriptor number, assuming you can:

    use IO::Socket;
    $handle = IO::Socket::INET->new("www.perl.com:80");
    $fd = $handle->fileno;
    somefunction("&$fd");  # not an indirect function call

It can be easier (and certainly will be faster) just to use real
filehandles though:

    use IO::Socket;
    local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
    die "can't connect" unless defined(fileno(REMOTE));
    somefunction("&main::REMOTE");

If the filehandle or descriptor number is preceded not just with a
simple "&" but rather with a "&=" combination, then Perl will not
create a completely new descriptor opened to the same place using the
dup(2) system call. Instead, it will just make something of an alias
to the existing one using the fdopen(3S) library call. This is
slightly more parsimonious of system resources, although this is less
a concern these days. Here's an example of that:

    $fd = $ENV{"MHCONTEXTFD"};
    open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!";
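
The difference between "&" and "&=" shows up in the descriptor
numbers. The following is a minimal sketch using the 3-argument form
on a temporary file; the helper open_dup_and_alias is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return a dup(2)-style copy and an fdopen(3S)-style
# alias of an already-open handle.
sub open_dup_and_alias {
    my ($orig) = @_;

    # ">&" asks for a brand-new descriptor opened to the same place.
    open(my $dup, ">&", $orig)
        or die "couldn't dup: $!";

    # ">&=" builds a new handle on the existing descriptor number.
    open(my $alias, ">&=", fileno($orig))
        or die "couldn't fdopen: $!";

    return ($dup, $alias);
}

my ($orig, $path) = tempfile(UNLINK => 1);
my ($dup, $alias) = open_dup_and_alias($orig);
printf "original fd %d, dup fd %d, alias fd %d\n",
    fileno($orig), fileno($dup), fileno($alias);
```

The alias reports the same descriptor number as the original, while
the dup reports a different one.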
532
If you're using magic "<ARGV>", you could even pass in as a command
line argument in @ARGV something like "<&=$MHCONTEXTFD", but we've
never seen anyone actually do this.

Dispelling the Dweomer
Perl is more of a DWIMmer language than something like Java--where DWIM
is an acronym for "do what I mean". But this principle sometimes leads
to more hidden magic than one knows what to do with. In this way, Perl
is also filled with dweomer, an obscure word meaning an enchantment.
Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.

If magic "open" is a bit too magical for you, you don't have to turn to
"sysopen". To open a file with arbitrary weird characters in it, it's
necessary to protect any leading and trailing whitespace. Leading
whitespace is protected by inserting a "./" in front of a filename that
starts with whitespace. Trailing whitespace is protected by appending
an ASCII NUL byte ("\0") at the end of the string.

    $file =~ s#^(\s)#./$1#;
    open(FH, "< $file\0") || die "can't open $file: $!";

This assumes, of course, that your system considers dot the current
working directory, slash the directory separator, and disallows ASCII
NULs within a valid filename. Most systems follow these conventions,
including all POSIX systems as well as proprietary Microsoft systems.
The only vaguely popular system that doesn't work this way is the
"Classic" Macintosh system, which uses a colon where the rest of us use
a slash. Maybe "sysopen" isn't such a bad idea after all.

If you want to use "<ARGV>" processing in a totally boring and
non-magical way, you could do this first:

    #   "Sam sat on the ground and put his head in his hands.
    #   'I wish I had never come here, and I don't want to see
    #   no more magic,' he said, and fell silent."
    for (@ARGV) {
        s#^([^./])#./$1#;
        $_ .= "\0";
    }
    while (<>) {
        # now process $_
    }

But be warned that users will not appreciate being unable to use "-" to
mean standard input, per the standard convention.

Paths as Opens
You've probably noticed how Perl's "warn" and "die" functions can
produce messages like:

    Some warning at scriptname line 29, <FH> line 7.

That's because you opened a filehandle FH, and had read in seven
records from it. But what was the name of the file, rather than the
handle?

If you aren't running with "strict refs", or if you've turned them off
temporarily, then all you have to do is this:

    open($path, "< $path") || die "can't open $path: $!";
    while (<$path>) {
        # whatever
    }

Since you're using the pathname of the file as its handle, you'll get
warnings more like

    Some warning at scriptname line 29, </etc/motd> line 7.

Single Argument Open
Remember how we said that Perl's open took two arguments? That was a
passive prevarication. You see, it can also take just one argument.
If and only if the variable is a global variable, not a lexical, you
can pass "open" just one argument, the filehandle, and it will get the
path from the global scalar variable of the same name.

    $FILE = "/etc/motd";
    open FILE or die "can't open $FILE: $!";
    while (<FILE>) {
        # whatever
    }

Why is this here? Someone has to cater to the hysterical porpoises.
It's something that's been in Perl since the very beginning, if not
before.

Playing with STDIN and STDOUT
One clever move with STDOUT is to explicitly close it when you're done
with the program.

    END { close(STDOUT) || die "can't close stdout: $!" }

If you don't do this, and your program fills up the disk partition due
to a command line redirection, it won't report the error or exit with a
failure status.

You don't have to accept the STDIN and STDOUT you were given. You are
welcome to reopen them if you'd like.

    open(STDIN, "< datafile")
        || die "can't open datafile: $!";

    open(STDOUT, "> output")
        || die "can't open output: $!";

And then these can be accessed directly or passed on to subprocesses.
This makes it look as though the program were initially invoked with
those redirections from the command line.
642 It's probably more interesting to connect these to pipes. For example:
643
644 $pager = $ENV{PAGER} || "(less || more)";
645 open(STDOUT, "| $pager")
646 || die "can't fork a pager: $!";
647
648 This makes it appear as though your program were called with its stdout
649 already piped into your pager. You can also use this kind of thing in
650 conjunction with an implicit fork to yourself. You might do this if
651 you would rather handle the post processing in your own program, just
652 in a different process:
653
654 head(100);
655 while (<>) {
656 print;
657 }
658
659 sub head {
660 my $lines = shift || 20;
661 return if $pid = open(STDOUT, "|-"); # return if parent
662 die "cannot fork: $!" unless defined $pid;
663 while (<STDIN>) {
664 last if --$lines < 0;
665 print;
666 }
667 exit;
668 }
669
670 This technique can be applied to repeatedly push as many filters on
671 your output stream as you wish.
672
674 These topics aren't really arguments related to "open" or "sysopen",
675 but they do affect what you do with your open files.
676
677 Opening Non-File Files
678 When is a file not a file? Well, you could say when it exists but
679 isn't a plain file. We'll check whether it's a symbolic link first,
680 just in case.
681
682 if (-l $file || ! -f _) {
683 print "$file is not a plain file\n";
684 }
685
686 What other kinds of files are there than, well, files? Directories,
687 symbolic links, named pipes, Unix-domain sockets, and block and
688 character devices. Those are all files, too--just not plain files.
689 This isn't the same issue as being a text file. Not all text files are
690 plain files. Not all plain files are text files. That's why there are
691 separate "-f" and "-T" file tests.
692
693 To open a directory, you should use the "opendir" function, then
694 process it with "readdir", carefully restoring the directory name if
695 necessary:
696
697 opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
698 while (defined($file = readdir(DIR))) {
699 # do something with "$dirname/$file"
700 }
701 closedir(DIR);
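
The same loop works with a lexical directory handle. The sketch below
also skips the "." and ".." entries and shows why the directory name
has to be put back: "readdir" returns bare names. The helper
list_entries and the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the sorted names in $dirname, leaving out
# the "." and ".." entries.
sub list_entries {
    my ($dirname) = @_;
    opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @names;
}

my $dir = tempdir(CLEANUP => 1);
for my $name (qw(b.txt a.txt)) {
    open(my $fh, ">", "$dir/$name") or die "can't create $dir/$name: $!";
    close($fh);
}

# Prepend the directory, since the names alone are relative to it.
print "$dir/$_\n" for list_entries($dir);
```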
702
If you want to process directories recursively, it's better to use the
File::Find module. For example, this prints out all files recursively
and adds a slash to their names if the file is a directory.

    @ARGV = qw(.) unless @ARGV;
    use File::Find;
    find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;

This finds all bogus symbolic links beneath a particular directory:

    find sub { print "$File::Find::name\n" if -l && !-e }, $dir;

As you see, with symbolic links, you can just pretend that it is what
it points to. Or, if you want to know what it points to, then
"readlink" is called for:

    if (-l $file) {
        if (defined($whither = readlink($file))) {
            print "$file points to $whither\n";
        } else {
            print "$file points nowhere: $!\n";
        }
    }

Opening Named Pipes
Named pipes are a different matter. You pretend they're regular files,
but their opens will normally block until there is both a reader and a
writer. You can read more about them in "Named Pipes" in perlipc.
Unix-domain sockets are rather different beasts as well; they're
described in "Unix-Domain TCP Clients and Servers" in perlipc.

When it comes to opening devices, it can be easy and it can be tricky.
We'll assume that if you're opening up a block device, you know what
you're doing. The character devices are more interesting. These are
typically used for modems, mice, and some kinds of printers. This is
described in "How do I read and write the serial port?" in perlfaq8.
It's often enough to open them carefully:

    sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
                # (O_NOCTTY no longer needed on POSIX systems)
        or die "can't open /dev/ttyS1: $!";
    open(TTYOUT, "+>&TTYIN")
        or die "can't dup TTYIN: $!";

    $ofh = select(TTYOUT); $| = 1; select($ofh);

    print TTYOUT "+++at\015";
    $answer = <TTYIN>;

With descriptors that you haven't opened using "sysopen", such as
sockets, you can set them to be non-blocking using "fcntl":

    use Fcntl;
    my $old_flags = fcntl($handle, F_GETFL, 0)
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
        or die "can't set non blocking: $!";
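
To see the effect, the same two "fcntl" calls can be applied to the
read end of a pipe that has nothing in it yet. This is a sketch that
assumes a system providing "O_NONBLOCK" and EAGAIN; the helper
set_nonblocking is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;      # F_GETFL, F_SETFL, O_NONBLOCK

# Hypothetical helper: add O_NONBLOCK to an already-open handle.
sub set_nonblocking {
    my ($handle) = @_;
    my $old_flags = fcntl($handle, F_GETFL, 0)
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
        or die "can't set non blocking: $!";
    return;
}

pipe(my $reader, my $writer) or die "can't create pipe: $!";
set_nonblocking($reader);

# Nothing has been written yet, so a blocking read would hang here.
my $got = sysread($reader, my $buf, 1024);
if (!defined $got && ($!{EAGAIN} || $!{EWOULDBLOCK})) {
    print "no data yet, and the read returned immediately\n";
}
```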
760
Rather than losing yourself in a morass of twisting, turning "ioctl"s,
all dissimilar, if you're going to manipulate ttys, it's best to make
calls out to the stty(1) program if you have it, or else use the
portable POSIX interface. To figure this all out, you'll need to read
the termios(3) manpage, which describes the POSIX interface to tty
devices, and then POSIX, which describes Perl's interface to POSIX.
There are also some high-level modules on CPAN that can help you with
these games. Check out Term::ReadKey and Term::ReadLine.

Opening Sockets
What else can you open? To open a connection using sockets, you won't
use one of Perl's two open functions. See "Sockets: Client/Server
Communication" in perlipc for that. Here's an example. Once you have
it, you can use FH as a bidirectional filehandle.

    use IO::Socket;
    local *FH = IO::Socket::INET->new("www.perl.com:80");

For opening up a URL, the LWP modules from CPAN are just what the
doctor ordered. There's no filehandle interface, but it's still easy
to get the contents of a document:

    use LWP::Simple;
    $doc = get('http://www.cpan.org/');

Binary Files
787 On certain legacy systems with what could charitably be called
788 terminally convoluted (some would say broken) I/O models, a file isn't
789 a file--at least, not with respect to the C standard I/O library. On
790 these old systems whose libraries (but not kernels) distinguish between
791 text and binary streams, to get files to behave properly you'll have to
792 bend over backwards to avoid nasty problems. On such infelicitous
793 systems, sockets and pipes are already opened in binary mode, and there
794 is currently no way to turn that off. With files, you have more
795 options.
796
One option is to use the "binmode" function on the appropriate
handles before doing regular I/O on them:

    binmode(STDIN);
    binmode(STDOUT);
    while (<STDIN>) { print }

Passing "sysopen" the non-standard "O_BINARY" flag will also open the
file in binary mode on those systems that support it. This is the
equivalent of opening the file normally, then calling "binmode" on the
handle.

    use Fcntl;    # for the O_* constants
    sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
        || die "can't open records.data: $!";

Now you can use "read" and "print" on that handle without worrying
about the non-standard system I/O library breaking your data. It's not
a pretty picture, but then, legacy systems seldom are. CP/M will be
with us until the end of days, and after.

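A quick way to convince yourself that a handle really is in binary mode
is a round trip through a scratch file. This is only a sketch: the
payload and the temporary file are invented for the example. On Unix it
succeeds with or without "binmode", and the "binmode" calls only matter
on systems that distinguish text from binary streams.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Bytes a text-mode stream might alter: CR, LF, Ctrl-Z, NUL, 0xFF.
my $payload = "\x0D\x0A\x1A\x00\xFF";

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} $payload or die "can't write $path: $!";
close($out)           or die "can't close $path: $!";

open(my $in, '<', $path) or die "can't open $path: $!";
binmode($in);
my $count = read($in, my $data, 1024);
defined $count or die "can't read $path: $!";
close($in);

# True only if every byte came back unchanged.
my $intact = ($data eq $payload) ? 1 : 0;
print $intact ? "data intact\n" : "data was altered\n";
```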
On systems with exotic I/O systems, it turns out that, astonishingly
enough, even unbuffered I/O using "sysread" and "syswrite" might do
sneaky data mutilation behind your back.

    while (sysread(WHENCE, $buf, 1024)) {
        syswrite(WHITHER, $buf, length($buf));
    }

Depending on the vicissitudes of your runtime system, even these calls
may need "binmode" or "O_BINARY" first. Systems known to be free of
such difficulties include Unix, the Mac OS, Plan 9, and Inferno.

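The loop above ignores read errors and short writes. A more defensive
sketch might look like the following. The helper name "copy_stream" and
the scratch files are assumptions made for this example, and both
handles are put into binary mode first, as the preceding paragraph
suggests.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copy everything from $in to $out with unbuffered
# I/O and return the number of bytes copied.
sub copy_stream {
    my ($in, $out) = @_;
    binmode($in);
    binmode($out);
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buf, 1024);
        defined $got or die "read failed: $!";
        last if $got == 0;                      # end of file
        my $offset = 0;
        while ($offset < $got) {                # handle short writes
            my $put = syswrite($out, $buf, $got - $offset, $offset);
            defined $put or die "write failed: $!";
            $offset += $put;
        }
        $total += $got;
    }
    return $total;
}

# Build a 3000-byte source file full of CR LF pairs, then copy it.
my ($src, $src_path) = tempfile(UNLINK => 1);
binmode($src);
print {$src} "\x0D\x0A" x 1500 or die "can't write $src_path: $!";
close($src) or die "can't close $src_path: $!";

open(my $from, '<', $src_path) or die "can't open $src_path: $!";
my ($to, $dst_path) = tempfile(UNLINK => 1);
my $copied = copy_stream($from, $to);
close($to) or die "can't close $dst_path: $!";
close($from);
```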
File Locking
In a multitasking environment, you may need to be careful not to
collide with other processes that want to do I/O on the same files
you are working on. You'll often need shared or exclusive locks on
files for reading and writing respectively. You might just pretend
that only exclusive locks exist.

Never use the existence of a file "-e $file" as a locking indication,
because there is a race condition between the test for the existence of
the file and its creation. It's possible for another process to create
a file in the slice of time between your existence check and your
attempt to create the file. Atomicity is critical.

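To make the point about atomicity concrete, here is a small sketch of
creating a file so that the existence test and the creation happen in a
single system call, using "sysopen" with "O_CREAT | O_EXCL". The
directory and file name are hypothetical. It shows atomic creation only
and is not a substitute for the "flock" interface.

```perl
use strict;
use warnings;
use Fcntl;                        # O_WRONLY, O_CREAT, O_EXCL
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/marker";         # hypothetical file name

# O_EXCL makes the call fail if the file already exists, so there is
# no gap between checking for the file and creating it.
my $first  = sysopen(my $fh1, $path, O_WRONLY | O_CREAT | O_EXCL) ? 1 : 0;
my $second = sysopen(my $fh2, $path, O_WRONLY | O_CREAT | O_EXCL) ? 1 : 0;
my $exists_error = $!{EEXIST} ? 1 : 0;

print "first create:  ", ($first  ? "ok" : "failed"), "\n";
print "second create: ", ($second ? "ok" : "failed"), "\n";
```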
Perl's most portable locking interface is via the "flock" function,
whose simplicity is emulated on systems that don't directly support it
such as SysV or Windows. The underlying semantics may affect how it
all works, so you should learn how "flock" is implemented on your
system's port of Perl.

A file lock does not stop another process from doing I/O on the file;
it only keeps out others trying to get a lock. Because locks are
advisory, if one process uses locking and another doesn't, all bets
are off.

By default, the "flock" call will block until a lock is granted. A
request for a shared lock will be granted as soon as there is no
exclusive locker. A request for an exclusive lock will be granted as
soon as there is no locker of any kind. Locks are on file descriptors,
not file names. You can't lock a file until you open it, and you can't
hold on to a lock once the file has been closed.

Here's how to get a blocking shared lock on a file, typically used for
reading:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename") or die "can't open filename: $!";
    flock(FH, LOCK_SH) or die "can't lock filename: $!";
    # now read from FH

You can get a non-blocking lock by using "LOCK_NB".

    flock(FH, LOCK_SH | LOCK_NB)
        or die "can't lock filename: $!";

This can be useful for producing more user-friendly behaviour by
warning if you're going to be blocking:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename") or die "can't open filename: $!";
    unless (flock(FH, LOCK_SH | LOCK_NB)) {
        $| = 1;
        print "Waiting for lock...";
        flock(FH, LOCK_SH) or die "can't lock filename: $!";
        print "got it.\n";
    }
    # now read from FH

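The granting rules described earlier can be watched in action with two
independent opens of one file inside a single program. This is a sketch
under an assumption: Perl's "flock" is backed by the native flock(2)
call, as it is on Linux and the BSDs. Where it is emulated with
"fcntl" locks, two handles in the same process may not conflict and the
results will differ. The scratch file and handle names are invented.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# Two separate opens, much as two cooperating processes would have.
open(my $holder, '+<', $path) or die "can't open $path: $!";
open(my $waiter, '+<', $path) or die "can't open $path: $!";

flock($holder, LOCK_SH) or die "can't lock $path: $!";

# Shared locks coexist.
my $shared_ok = flock($waiter, LOCK_SH | LOCK_NB) ? 1 : 0;
flock($waiter, LOCK_UN) or die "can't unlock $path: $!";

# An exclusive lock is refused while any other locker remains.
my $excl_while_shared = flock($waiter, LOCK_EX | LOCK_NB) ? 1 : 0;

# Closing the handle gives up its lock.
close($holder) or die "can't close $path: $!";
my $excl_after_close = flock($waiter, LOCK_EX | LOCK_NB) ? 1 : 0;

close($waiter);
```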
To get an exclusive lock, typically used for writing, you have to be
careful. We "sysopen" the file so it can be locked before it gets
emptied. You can get a nonblocking version using "LOCK_EX | LOCK_NB".

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "filename", O_WRONLY | O_CREAT)
        or die "can't open filename: $!";
    flock(FH, LOCK_EX)
        or die "can't lock filename: $!";
    truncate(FH, 0)
        or die "can't truncate filename: $!";
    # now write to FH

Finally, due to the uncounted millions who cannot be dissuaded from
wasting cycles on useless vanity devices called hit counters, here's
how to increment a number in a file safely:

    use Fcntl qw(:DEFAULT :flock);

    sysopen(FH, "numfile", O_RDWR | O_CREAT)
        or die "can't open numfile: $!";
    # autoflush FH
    $ofh = select(FH); $| = 1; select($ofh);
    flock(FH, LOCK_EX)
        or die "can't write-lock numfile: $!";

    $num = <FH> || 0;
    seek(FH, 0, 0)
        or die "can't rewind numfile: $!";
    print FH $num+1, "\n"
        or die "can't write numfile: $!";

    truncate(FH, tell(FH))
        or die "can't truncate numfile: $!";
    close(FH)
        or die "can't close numfile: $!";

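The same steps fit naturally into a subroutine with a lexical
filehandle. The sketch below is one way to package them. The name
"bump_counter" and the scratch directory are made up for illustration,
and the locking logic is unchanged from the example above.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use IO::Handle;                   # for the autoflush method
use File::Temp qw(tempdir);

# Hypothetical helper: add one to the number stored in $path and
# return the new value.
sub bump_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT)
        or die "can't open $path: $!";
    $fh->autoflush(1);
    flock($fh, LOCK_EX)
        or die "can't write-lock $path: $!";

    my $num = <$fh> || 0;         # an empty file counts as zero
    chomp $num;
    seek($fh, 0, 0)
        or die "can't rewind $path: $!";
    print {$fh} $num + 1, "\n"
        or die "can't write $path: $!";
    truncate($fh, tell($fh))
        or die "can't truncate $path: $!";
    close($fh)                    # closing also releases the lock
        or die "can't close $path: $!";
    return $num + 1;
}

my $dir    = tempdir(CLEANUP => 1);
my $file   = "$dir/numfile";
my @counts = map { bump_counter($file) } 1 .. 3;
print "@counts\n";
```

Holding the exclusive lock from before the read until the close is what
keeps two concurrent callers from reading the same old value.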
IO Layers
In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced. This
is new "plumbing" for all the I/O happening in Perl; for the most
part everything works just as it did, but PerlIO also brought in
new features, such as the ability to think of I/O as "layers". In
addition to moving data, an I/O layer may also transform it. Such
transformations may include compression and decompression, encryption
and decryption, and conversion between character encodings.

Full discussion about the features of PerlIO is out of scope for this
tutorial, but here is how to recognize the layers being used:

·   The three-(or more)-argument form of "open" is being used and the
    second argument contains something else in addition to the usual
    '<', '>', '>>', '|' and their variants, for example:

        open(my $fh, "<:crlf", $fn);

·   The two-argument form of "binmode" is being used, for example:

        binmode($fh, ":encoding(utf16)");

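As a small illustration of a layer transforming data on its way through,
the sketch below writes a four-character string through an
":encoding(UTF-8)" layer and reads it back. The scratch file and
variable names are assumptions for the example. "PerlIO::get_layers" is
used only to list whatever layers happen to be on the handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

my $text = "caf\x{e9}";    # four characters, the last one non-ASCII

# The layer encodes characters to bytes on the way out ...
open(my $out, '>:encoding(UTF-8)', $path) or die "can't open $path: $!";
print {$out} $text or die "can't write $path: $!";
close($out)        or die "can't close $path: $!";

my $bytes_on_disk = -s $path;    # 5: e-acute takes two bytes in UTF-8

# ... and decodes bytes back to characters on the way in.
open(my $in, '<:encoding(UTF-8)', $path) or die "can't open $path: $!";
my @layers     = PerlIO::get_layers($in);
my $round_trip = <$in>;
close($in);

my $chars_read = length($round_trip);    # 4

print "layers: @layers\n";
print "$bytes_on_disk bytes on disk, $chars_read characters read\n";
```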
For more detailed discussion about PerlIO see PerlIO; for more detailed
discussion about Unicode and I/O see perluniintro.

SEE ALSO
The "open" and "sysopen" functions in perlfunc(1); the system open(2),
dup(2), fopen(3), and fdopen(3) manpages; the POSIX documentation.

AUTHOR and COPYRIGHT
Copyright 1998 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain. You are permitted and encouraged
to use this code in your own programs for fun or for profit as you see
fit. A simple comment in the code giving credit would be courteous but
is not required.

HISTORY
First release: Sat Jan 9 08:09:11 MST 1999

perl v5.16.3                      2013-03-04                  PERLOPENTUT(1)