perlopentut(1)

PERLOPENTUT(1)         Perl Programmers Reference Guide         PERLOPENTUT(1)

NAME

       perlopentut - tutorial on opening things in Perl

DESCRIPTION

       Perl has two simple, built-in ways to open files: the shell way for
       convenience, and the C way for precision.  The shell way also has 2-
       and 3-argument forms, which have different semantics for handling the
       filename.  The choice is yours.

Open a la shell

       Perl's "open" function was designed to mimic the way command-line
       redirection in the shell works.  Here are some basic examples from the
       shell:

           $ myprogram file1 file2 file3
           $ myprogram    <  inputfile
           $ myprogram    >  outputfile
           $ myprogram    >> outputfile
           $ myprogram    |  otherprogram
           $ otherprogram |  myprogram

       And here are some more advanced examples:

           $ otherprogram      | myprogram f1 - f2
           $ otherprogram 2>&1 | myprogram -
           $ myprogram     <&3
           $ myprogram     >&4

       Programmers accustomed to constructs like those above can take comfort
       in learning that Perl directly supports these familiar constructs using
       virtually the same syntax as the shell.

   Simple Opens
       The "open" function takes two arguments: the first is a filehandle, and
       the second is a single string comprising both what to open and how to
       open it.  "open" returns true when it works, and when it fails, returns
       a false value and sets the special variable $! to reflect the system
       error.  If the filehandle was previously opened, it will be implicitly
       closed first.

       For example:

           open(INFO,      "datafile") || die("can't open datafile: $!");
           open(INFO,   "<  datafile") || die("can't open datafile: $!");
           open(RESULTS,">  runstats") || die("can't open runstats: $!");
           open(LOG,    ">> logfile ") || die("can't open logfile:  $!");

       If you prefer the low-punctuation version, you could write that this
       way:

           open INFO,   "<  datafile"  or die "can't open datafile: $!";
           open RESULTS,">  runstats"  or die "can't open runstats: $!";
           open LOG,    ">> logfile "  or die "can't open logfile:  $!";

       A few things to notice.  First, the leading "<" is optional.  If
       omitted, Perl assumes that you want to open the file for reading.

       Note also that the first example uses the "||" logical operator, and
       the second uses "or", which has lower precedence.  Using "||" in the
       latter examples would effectively mean

           open INFO, ( "<  datafile"  || die "can't open datafile: $!" );

       which is definitely not what you want.

       The other important thing to notice is that, just as in the shell, any
       whitespace before or after the filename is ignored.  This is good,
       because you wouldn't want these to do different things:

           open INFO,   "<datafile"
           open INFO,   "< datafile"
           open INFO,   "<  datafile"

       Ignoring surrounding whitespace also helps for when you read a filename
       in from a different file, and forget to trim it before opening:

           $filename = <INFO>;         # oops, \n still there
           open(EXTRA, "< $filename") || die "can't open $filename: $!";

       This is not a bug, but a feature.  Because "open" mimics the shell in
       its style of using redirection arrows to specify how to open the file,
       it also does so with respect to extra whitespace around the filename
       itself.  For accessing files with naughty names, see "Dispelling the
       Dweomer".

       There is also a 3-argument version of "open", which lets you put the
       special redirection characters into their own argument:

           open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

       In this case, the filename to open is the actual string in $datafile,
       so you don't have to worry about $datafile containing characters that
       might influence the open mode, or whitespace at the beginning of the
       filename that would be absorbed in the 2-argument version.  Also, any
       reduction of unnecessary string interpolation is a good thing.

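       Here is a minimal sketch of that difference.  It assumes a filesystem
       that allows a trailing space in a filename; the scratch directory and
       the name "notes.txt " are made up for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory and a filename that ends in a space.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/notes.txt ";

# The 3-argument form takes the name literally, so this creates "notes.txt ".
open(my $out, '>', $path) or die "can't create '$path': $!";
print {$out} "hello\n";
close($out) or die "can't close '$path': $!";

# 3-argument open finds the file under its exact name.
my $three_arg_ok = open(my $in3, '<', $path) ? 1 : 0;

# 2-argument open strips the trailing space and looks for "notes.txt",
# which does not exist, so the open fails.
my $two_arg_ok = open(my $in2, "< $path") ? 1 : 0;

print "3-arg open: ", ($three_arg_ok ? "ok" : "failed"), "\n";
print "2-arg open: ", ($two_arg_ok   ? "ok" : "failed"), "\n";
```

       Only the 3-argument call sees the file, because only it leaves the
       trailing space alone.
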
   Indirect Filehandles
       "open"'s first argument can be a reference to a filehandle.  As of perl
       5.6.0, if the argument is uninitialized, Perl will automatically create
       a filehandle and put a reference to it in the first argument, like so:

           open( my $in, $infile )   or die "Couldn't read $infile: $!";
           while ( <$in> ) {
               # do something with $_
           }
           close $in;

       Indirect filehandles make namespace management easier.  Since
       filehandles are global to the current package, two subroutines trying
       to open "INFILE" will clash.  With two functions opening indirect
       filehandles like "my $infile", there's no clash and no need to worry
       about future conflicts.

       Another convenient behavior is that an indirect filehandle
       automatically closes when there are no more references to it:

           sub firstline {
               open( my $in, shift ) && return scalar <$in>;
               # no close() required
           }

       Indirect filehandles also make it easy to pass filehandles to and
       return filehandles from subroutines:

           for my $file ( qw(this.conf that.conf) ) {
               my $fin = open_or_throw('<', $file);
               process_conf( $fin );
               # no close() needed
           }

           use Carp;
           sub open_or_throw {
               my ($mode, $filename) = @_;
               open my $h, $mode, $filename
                   or croak "Could not open '$filename': $!";
               return $h;
           }

   Pipe Opens
       In C, when you want to open a file using the standard I/O library, you
       use the "fopen" function, but when opening a pipe, you use the "popen"
       function.  But in the shell, you just use a different redirection
       character.  That's also the case for Perl.  The "open" call remains the
       same--just its argument differs.

       If the leading character is a pipe symbol, "open" starts up a new
       command and opens a write-only filehandle leading into that command.
       This lets you write into that handle and have what you write show up on
       that command's standard input.  For example:

           open(PRINTER, "| lpr -Plp1")    || die "can't run lpr: $!";
           print PRINTER "stuff\n";
           close(PRINTER)                  || die "can't close lpr: $!";

       If the trailing character is a pipe, you start up a new command and
       open a read-only filehandle leading out of that command.  This lets
       whatever that command writes to its standard output show up on your
       handle for reading.  For example:

           open(NET, "netstat -i -n |")    || die "can't fork netstat: $!";
           while (<NET>) { }               # do something with input
           close(NET)                      || die "can't close netstat: $!";

       What happens if you try to open a pipe to or from a non-existent
       command?  If possible, Perl will detect the failure and set $! as
       usual.  But if the command contains special shell characters, such as
       ">" or "*", called 'metacharacters', Perl does not execute the command
       directly.  Instead, Perl runs the shell, which then tries to run the
       command.  This means that it's the shell that gets the error
       indication.  In such a case, the "open" call will only indicate failure
       if Perl can't even run the shell.  See "How can I capture STDERR from
       an external command?" in perlfaq8 to see how to cope with this.
       There's also an explanation in perlipc.

       If you would like to open a bidirectional pipe, the IPC::Open2 library
       will handle this for you.  Check out "Bidirectional Communication with
       Another Process" in perlipc.

       perl-5.6.x introduced a version of piped open that executes a process
       based on its command line arguments without relying on the shell.
       (Similar to the "system(@LIST)" notation.) This is safer and faster
       than executing a single argument pipe-command, but does not allow
       special shell constructs. (It is also not supported on Microsoft
       Windows, Mac OS Classic or RISC OS.)

       Here's an example of "open '-|'", which prints a random Unix fortune
       cookie as uppercase:

           my $collection = shift(@ARGV);
           open my $fortune, '-|', 'fortune', $collection
               or die "Could not find fortune - $!";
           while (<$fortune>)
           {
               print uc($_);
           }
           close($fortune);

       And this "open '|-'" pipes into lpr:

           open my $printer, '|-', 'lpr', '-Plp1'
               or die "can't run lpr: $!";
           print {$printer} "stuff\n";
           close($printer)
               or die "can't close lpr: $!";

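       Because a piped "open" can succeed even when the command itself later
       fails, it helps to check "close" as well.  The sketch below uses the
       running Perl interpreter ($^X) as a stand-in command, so it does not
       depend on fortune or lpr being installed.  It assumes a platform where
       list-form pipe opens are supported.

```perl
use strict;
use warnings;

# Run a child perl that prints one line and then exits with status 3.
# $^X (the current interpreter) is only a stand-in for a real command.
open(my $kid, '-|', $^X, '-e', 'print "partial output\n"; exit 3')
    or die "can't fork: $!";

my @lines = <$kid>;

# For a pipe, close() waits for the child and returns false if the child
# exited non-zero; the wait status is then available in $?.
my $closed_ok   = close($kid) ? 1 : 0;
my $exit_status = $? >> 8;

print "read ", scalar(@lines), " line(s); exit status $exit_status\n";
```

       The "open" succeeds and the output arrives, yet "close" returns false
       and $? carries the child's exit status.
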
   The Minus File
       Again following the lead of the standard shell utilities, Perl's "open"
       function treats a file whose name is a single minus, "-", in a special
       way.  If you open minus for reading, it really means to access the
       standard input.  If you open minus for writing, it really means to
       access the standard output.

       If minus can be used as the default input or default output, what
       happens if you open a pipe into or out of minus?  What's the default
       command it would run?  The same script as you're currently running!
       This is actually a stealth "fork" hidden inside an "open" call.  See
       "Safe Pipe Opens" in perlipc for details.

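       A minimal sketch of that hidden fork, assuming a system with a real
       fork(2); the message text is invented.  In the parent, "open" returns
       the child's process ID; in the child it returns 0, and the child's
       STDOUT is connected to the parent's handle.

```perl
use strict;
use warnings;

# Opening a pipe from "-" forks; the parent reads what the child prints.
my $pid = open(my $from_kid, '-|');
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: STDOUT is the write end of the pipe.
    print "hello from the child\n";
    exit 0;
}

# Parent: read the child's output, then reap the child via close().
my $line = <$from_kid>;
close($from_kid) or die "child exited abnormally: $?";
print "parent read: $line";
```
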
   Mixing Reads and Writes
       It is possible to specify both read and write access.  All you do is
       add a "+" symbol in front of the redirection.  But as in the shell,
       using a less-than on a file never creates a new file; it only opens an
       existing one.  On the other hand, using a greater-than always clobbers
       (truncates to zero length) an existing file, or creates a brand-new one
       if there isn't an old one.  Adding a "+" for read-write doesn't affect
       whether it only works on existing files or always clobbers existing
       ones.

           open(WTMP, "+< /usr/adm/wtmp")
               || die "can't open /usr/adm/wtmp: $!";

           open(SCREEN, "+> lkscreen")
               || die "can't open lkscreen: $!";

           open(LOGFILE, "+>> /var/log/applog")
               || die "can't open /var/log/applog: $!";

       The first one won't create a new file, and the second one will always
       clobber an old one.  The third one will create a new file if necessary
       and not clobber an old one, and it will allow you to read at any point
       in the file, but all writes will always go to the end.  In short, the
       first case is substantially more common than the second and third
       cases, which are almost always wrong.  (If you know C, the plus in
       Perl's "open" is historically derived from the one in C's fopen(3S),
       which it ultimately calls.)

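       As a small illustration of the first and most common case, the sketch
       below overwrites bytes in place with "+<".  The scratch file and its
       contents are invented for the example, and the 3-argument form is used
       so that the temporary path is taken literally.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file with known contents.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "abcdef\n";
close($tmp) or die "can't close $path: $!";

# "+<" opens an existing file for reading and writing without truncating it.
open(my $fh, '+<', $path) or die "can't open $path: $!";
my $before = <$fh>;                        # read the current contents
seek($fh, 0, 0) or die "can't seek: $!";   # rewind before switching to writing
print {$fh} "XY";                          # overwrite the first two bytes
close($fh) or die "can't close $path: $!";

# Read the file back to see the result of the in-place update.
open(my $check, '<', $path) or die "can't reopen $path: $!";
my $after = <$check>;
close($check);

print "before: $before", "after:  $after";
```
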
       In fact, when it comes to updating a file, unless you're working on a
       binary file as in the WTMP case above, you probably don't want to use
       this approach for updating.  Instead, Perl's -i flag comes to the
       rescue.  The following command takes all the C, C++, or yacc source or
       header files and changes all their foo's to bar's, leaving the old
       version in the original filename with a ".orig" tacked on the end:

           $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

       This is a shortcut for some renaming games that are really the best
       way to update text files.  See the second question in perlfaq5 for more
       details.

   Filters
       One of the most common uses for "open" is one you never even notice.
       When you process the ARGV filehandle using "<ARGV>", Perl actually does
       an implicit open on each file in @ARGV.  Thus a program called like
       this:

           $ myprogram file1 file2 file3

       can have all its files opened and processed one at a time using a
       construct no more complex than:

           while (<>) {
               # do something with $_
           }

       If @ARGV is empty when the loop first begins, Perl pretends you've
       opened up minus, that is, the standard input.  In fact, $ARGV, the
       currently open file during "<ARGV>" processing, is even set to "-" in
       these circumstances.

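       To watch the implicit opens at work, the sketch below fills @ARGV by
       hand with two scratch files (their names and contents are invented) and
       counts the lines read from each one by way of $ARGV.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Two hypothetical input files in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my %content = ('a.txt' => "one\ntwo\n", 'b.txt' => "three\n");
for my $name (sort keys %content) {
    open(my $out, '>', "$dir/$name") or die "can't create $name: $!";
    print {$out} $content{$name};
    close($out) or die "can't close $name: $!";
}

# Pretend the program was invoked as: myprogram a.txt b.txt
@ARGV = map { "$dir/$_" } sort keys %content;

my %lines_in;
while (my $line = <>) {
    $lines_in{$ARGV}++;          # $ARGV names the file currently being read
}

printf "%s: %d line(s)\n", $_, $lines_in{$_} for sort keys %lines_in;
```
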
       You are welcome to pre-process your @ARGV before starting the loop to
       make sure it's to your liking.  One reason to do this might be to
       remove command options beginning with a minus.  While you can always
       roll the simple ones by hand, the Getopt modules are good for this:

           use Getopt::Std;

           # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
           getopts("vDo:");

           # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
           getopts("vDo:", \%args);

       Or the standard Getopt::Long module to permit named arguments:

           use Getopt::Long;
           GetOptions( "verbose"  => \$verbose,        # --verbose
                       "Debug"    => \$debug,          # --Debug
                       "output=s" => \$output );
                   # --output=somestring or --output somestring

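       The point of doing this before the "<ARGV>" loop is that the option
       parser removes what it recognizes and leaves the filenames behind.  A
       short sketch with a simulated command line (the option values and
       filenames are made up):

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulated command line: myprogram --verbose --output=report.txt file1 file2
@ARGV = qw(--verbose --output=report.txt file1 file2);

my ($verbose, $debug, $output) = (0, 0, '');
GetOptions(
    "verbose"  => \$verbose,
    "Debug"    => \$debug,
    "output=s" => \$output,
) or die "bad options\n";

# Whatever GetOptions did not consume as an option stays in @ARGV,
# ready for a later while (<>) loop.
print "verbose=$verbose output=$output files=@ARGV\n";
```
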
       Another reason for preprocessing arguments is to make an empty argument
       list default to all files:

           @ARGV = glob("*") unless @ARGV;

       You could even filter out all but plain, text files.  This is a bit
       silent, of course, and you might prefer to mention them on the way.

           @ARGV = grep { -f && -T } @ARGV;

       If you're using the -n or -p command-line options, you should put
       changes to @ARGV in a "BEGIN{}" block.

       Remember that a normal "open" has special properties, in that it might
       call fopen(3S) or it might call popen(3S), depending on what its
       argument looks like; that's why it's sometimes called "magic open".
       Here's an example:

           $pwdinfo = `domainname` =~ /^(\(none\))?$/
                           ? '< /etc/passwd'
                           : 'ypcat passwd |';

           open(PWD, $pwdinfo)
                       or die "can't open $pwdinfo: $!";

       This sort of thing also comes into play in filter processing.  Because
       "<ARGV>" processing employs the normal, shell-style Perl "open", it
       respects all the special things we've already seen:

           $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

       That program will read from the file f1, the process cmd1, standard
       input (tmpfile in this case), the f2 file, the cmd2 command, and
       finally the f3 file.

       Yes, this also means that if you have files named "-" (and so on) in
       your directory, they won't be processed as literal files by "open".
       You'll need to pass them as "./-", much as you would for the rm
       program, or you could use "sysopen" as described below.

       One of the more interesting applications is to change files of a
       certain name into pipes.  For example, to autoprocess gzipped or
       compressed files by decompressing them with gzip:

           @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_  } @ARGV;

       Or, if you have the GET program installed from LWP, you can fetch URLs
       before processing them:

           @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

       It's not for nothing that this is called magic "<ARGV>".  Pretty nifty,
       eh?

Open a la C

       If you want the convenience of the shell, then Perl's "open" is
       definitely the way to go.  On the other hand, if you want finer
       precision than C's simplistic fopen(3S) provides you should look to
       Perl's "sysopen", which is a direct hook into the open(2) system call.
       That does mean it's a bit more involved, but that's the price of
       precision.

       "sysopen" takes 3 (or 4) arguments.

           sysopen HANDLE, PATH, FLAGS, [MASK]

       The HANDLE argument is a filehandle just as with "open".  The PATH is a
       literal path, one that doesn't pay attention to any greater-thans or
       less-thans or pipes or minuses, nor ignore whitespace.  If it's there,
       it's part of the path.  The FLAGS argument contains one or more values
       derived from the Fcntl module that have been or'd together using the
       bitwise "|" operator.  The final argument, the MASK, is optional; if
       present, it is combined with the user's current umask for the creation
       mode of the file.  You should usually omit this.

       Although the traditional values of read-only, write-only, and read-
       write are 0, 1, and 2 respectively, this is known not to hold true on
       some systems.  Instead, it's best to load in the appropriate constants
       first from the Fcntl module, which supplies the following standard
       flags:

           O_RDONLY            Read only
           O_WRONLY            Write only
           O_RDWR              Read and write
           O_CREAT             Create the file if it doesn't exist
           O_EXCL              Fail if the file already exists
           O_APPEND            Append to the file
           O_TRUNC             Truncate the file
           O_NONBLOCK          Non-blocking access

       Less common flags that are sometimes available on some operating
       systems include "O_BINARY", "O_TEXT", "O_SHLOCK", "O_EXLOCK",
       "O_DEFER", "O_SYNC", "O_ASYNC", "O_DSYNC", "O_RSYNC", "O_NOCTTY",
       "O_NDELAY" and "O_LARGEFILE".  Consult your open(2) manpage or its
       local equivalent for details.  (Note: starting from Perl release 5.6
       the "O_LARGEFILE" flag, if available, is automatically added to the
       sysopen() flags because large files are the default.)

       Here's how to use "sysopen" to emulate the simple "open" calls we had
       before.  We'll omit the "|| die $!" checks for clarity, but make sure
       you always check the return values in real code.  These aren't quite
       the same, since "open" will trim leading and trailing whitespace, but
       you'll get the idea.

       To open a file for reading:

           open(FH, "< $path");
           sysopen(FH, $path, O_RDONLY);

       To open a file for writing, creating a new file if needed or else
       truncating an old file:

           open(FH, "> $path");
           sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);

       To open a file for appending, creating one if necessary:

           open(FH, ">> $path");
           sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

       To open a file for update, where the file must already exist:

           open(FH, "+< $path");
           sysopen(FH, $path, O_RDWR);

       And here are things you can do with "sysopen" that you cannot do with a
       regular "open".  As you'll see, it's just a matter of controlling the
       flags in the third argument.

       To open a file for writing, creating a new file which must not
       previously exist:

           sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);

       To open a file for appending, where that file must already exist:

           sysopen(FH, $path, O_WRONLY | O_APPEND);

       To open a file for update, creating a new file if necessary:

           sysopen(FH, $path, O_RDWR | O_CREAT);

       To open a file for update, where that file must not already exist:

           sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);

       To open a file without blocking, creating one if necessary:

           sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);

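       A complete, checked version of the "must not previously exist" case
       might look like the sketch below.  The scratch directory and the name
       "app.lock" are invented for the illustration; the second call is
       expected to fail with EEXIST.

```perl
use strict;
use warnings;
use Fcntl;                      # exports the O_* constants
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/app.lock";     # hypothetical name

# First attempt: the file does not exist yet, so it is created.
my $first_ok = sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT) ? 1 : 0;

# Second attempt: O_EXCL makes the call fail because the file now exists.
my $second_ok     = sysopen(my $again, $path, O_WRONLY | O_EXCL | O_CREAT) ? 1 : 0;
my $already_there = $!{EEXIST} ? 1 : 0;

print "first: $first_ok, second: $second_ok, EEXIST: $already_there\n";
```
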
   Permissions a la mode
       If you omit the MASK argument to "sysopen", Perl uses the octal value
       0666.  The normal MASK to use for executables and directories should be
       0777, and for anything else, 0666.

       Why so permissive?  Well, it isn't really.  The MASK will be modified
       by your process's current "umask".  A umask is a number representing
       disabled permissions bits; that is, bits that will not be turned on in
       the created file's permissions field.

       For example, if your "umask" were 027, then the 020 part would disable
       the group from writing, and the 007 part would disable others from
       reading, writing, or executing.  Under these conditions, passing
       "sysopen" 0666 would create a file with mode 0640, since "0666 & ~027"
       is 0640.

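       The same arithmetic can be checked directly in Perl.  The helper name
       effective_mode below is made up for this sketch; it only computes the
       resulting bits and does not create any files.

```perl
use strict;
use warnings;

# Permission bits that survive a given umask: requested & ~umask.
sub effective_mode {
    my ($requested, $umask) = @_;
    return $requested & ~$umask & 07777;
}

printf "%04o\n", effective_mode(0666, 027);   # prints 0640
printf "%04o\n", effective_mode(0777, 022);   # prints 0755
printf "%04o\n", effective_mode(0666, 077);   # prints 0600
```
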
       You should seldom use the MASK argument to "sysopen()".  That takes
       away the user's freedom to choose what permission new files will have.
       Denying choice is almost always a bad thing.  One exception would be
       for cases where sensitive or private data is being stored, such as with
       mail folders, cookie files, and internal temporary files.

Obscure Open Tricks

   Re-Opening Files (dups)
       Sometimes you already have a filehandle open, and want to make another
       handle that's a duplicate of the first one.  In the shell, we place an
       ampersand in front of a file descriptor number when doing redirections.
       For example, "2>&1" makes descriptor 2 (that's STDERR in Perl) be
       redirected into descriptor 1 (which is usually Perl's STDOUT).  The
       same is essentially true in Perl: a filename that begins with an
       ampersand is treated instead as a file descriptor if a number, or as a
       filehandle if a string.

           open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
           open(MHCONTEXT, "<&4")     || die "couldn't dup fd4: $!";

       That means that if a function is expecting a filename, but you don't
       want to give it a filename because you already have the file open, you
       can just pass the filehandle with a leading ampersand.  It's best to
       use a fully qualified handle though, just in case the function happens
       to be in a different package:

           somefunction("&main::LOGFILE");

       This way if somefunction() is planning on opening its argument, it can
       just use the already opened handle.  This differs from passing a
       handle, because with a handle, you don't open the file.  Here you have
       something you can pass to open.

       If you have one of those tricky, newfangled I/O objects that the C++
       folks are raving about, then this doesn't work because those aren't
       proper filehandles in the native Perl sense.  You'll have to use
       fileno() to pull out the proper descriptor number, assuming you can:

           use IO::Socket;
           $handle = IO::Socket::INET->new("www.perl.com:80");
           $fd = $handle->fileno;
           somefunction("&$fd");  # not an indirect function call

       It can be easier (and certainly will be faster) just to use real
       filehandles though:

           use IO::Socket;
           local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
           die "can't connect" unless defined(fileno(REMOTE));
           somefunction("&main::REMOTE");

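       A common use of a plain "&" dup is to save STDOUT, point it somewhere
       else for a while, and then restore it.  The sketch below does that with
       a scratch file; the handle name SAVEOUT and the file are only for
       illustration.  Re-opening STDOUT implicitly closes (and so flushes) the
       previous one.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);   # hypothetical capture file
close($tmp);

# Duplicate STDOUT so it can be restored later.
open(SAVEOUT, ">&STDOUT")  or die "couldn't dup STDOUT: $!";

# Point STDOUT at the file and write to it.
open(STDOUT, '>', $path)   or die "can't redirect STDOUT: $!";
print "captured line\n";

# Put the original STDOUT back; this flushes and closes the file first.
open(STDOUT, ">&SAVEOUT")  or die "couldn't restore STDOUT: $!";
close(SAVEOUT);

open(my $in, '<', $path) or die "can't read $path: $!";
my $captured = <$in>;
close($in);
print "file received: $captured";
```
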
       If the filehandle or descriptor number is preceded not just with a
       simple "&" but rather with a "&=" combination, then Perl will not
       create a completely new descriptor opened to the same place using the
       dup(2) system call.  Instead, it will just make something of an alias
       to the existing one using the fdopen(3S) library call.  This is
       slightly more parsimonious of system resources, although this is less
       a concern these days.  Here's an example of that:

           $fd = $ENV{"MHCONTEXTFD"};
           open(MHCONTEXT, "<&=$fd")   or die "couldn't fdopen $fd: $!";

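       The difference between the two forms shows up in the descriptor
       numbers.  In this sketch (the scratch file is invented), the "&" handle
       gets a descriptor of its own, while the "&=" handle reports the same
       number as the original.  Since the alias shares the descriptor, closing
       one of the pair closes it for the other too, so the sketch leaves the
       closing to program exit.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($orig, $path) = tempfile(UNLINK => 1);   # hypothetical scratch file
my $fd = fileno($orig);

# "&" dups the descriptor: the new handle gets a descriptor of its own.
open(my $dup,   ">&$fd")  or die "couldn't dup fd $fd: $!";

# "&=" aliases it: the new handle shares the very same descriptor number.
open(my $alias, ">&=$fd") or die "couldn't fdopen fd $fd: $!";

my ($dup_fd, $alias_fd) = (fileno($dup), fileno($alias));
print "original $fd, dup $dup_fd, alias $alias_fd\n";
```
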
       If you're using magic "<ARGV>", you could even pass in as a command
       line argument in @ARGV something like "<&=$MHCONTEXTFD", but we've
       never seen anyone actually do this.

   Dispelling the Dweomer
       Perl is more of a DWIMmer language than something like Java--where DWIM
       is an acronym for "do what I mean".  But this principle sometimes leads
       to more hidden magic than one knows what to do with.  In this way, Perl
       is also filled with dweomer, an obscure word meaning an enchantment.
       Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.

       If magic "open" is a bit too magical for you, you don't have to turn to
       "sysopen".  To open a file with arbitrary weird characters in it, it's
       necessary to protect any leading and trailing whitespace.  Leading
       whitespace is protected by inserting a "./" in front of a filename that
       starts with whitespace.  Trailing whitespace is protected by appending
       an ASCII NUL byte ("\0") at the end of the string.

           $file =~ s#^(\s)#./$1#;
           open(FH, "< $file\0")   || die "can't open $file: $!";

       This assumes, of course, that your system considers dot the current
       working directory, slash the directory separator, and disallows ASCII
       NULs within a valid filename.  Most systems follow these conventions,
       including all POSIX systems as well as proprietary Microsoft systems.
       The only vaguely popular system that doesn't work this way is the
       "Classic" Macintosh system, which uses a colon where the rest of us use
       a slash.  Maybe "sysopen" isn't such a bad idea after all.

       If you want to use "<ARGV>" processing in a totally boring and non-
       magical way, you could do this first:

           #   "Sam sat on the ground and put his head in his hands.
           #   'I wish I had never come here, and I don't want to see
           #   no more magic,' he said, and fell silent."
           for (@ARGV) {
               s#^([^./])#./$1#;
               $_ .= "\0";
           }
           while (<>) {
               # now process $_
           }

       But be warned that users will not appreciate being unable to use "-" to
       mean standard input, per the standard convention.

   Paths as Opens
       You've probably noticed how Perl's "warn" and "die" functions can
       produce messages like:

           Some warning at scriptname line 29, <FH> line 7.

       That's because you opened a filehandle FH, and had read in seven
       records from it.  But what was the name of the file, rather than the
       handle?

       If you aren't running with "strict refs", or if you've turned them off
       temporarily, then all you have to do is this:

           open($path, "< $path") || die "can't open $path: $!";
           while (<$path>) {
               # whatever
           }

       Since you're using the pathname of the file as its handle, you'll get
       warnings more like

           Some warning at scriptname line 29, </etc/motd> line 7.

   Single Argument Open
       Remember how we said that Perl's open took two arguments?  That was a
       passive prevarication.  You see, it can also take just one argument.
       If and only if the variable is a global variable, not a lexical, you
       can pass "open" just one argument, the filehandle, and it will get the
       path from the global scalar variable of the same name.

           $FILE = "/etc/motd";
           open FILE or die "can't open $FILE: $!";
           while (<FILE>) {
               # whatever
           }

       Why is this here?  Someone has to cater to the hysterical porpoises.
       It's something that's been in Perl since the very beginning, if not
       before.

   Playing with STDIN and STDOUT
       One clever move with STDOUT is to explicitly close it when you're done
       with the program.

           END { close(STDOUT) || die "can't close stdout: $!" }

       If you don't do this, and your program fills up the disk partition due
       to a command line redirection, it won't report the error or exit with a
       failure status.

       You don't have to accept the STDIN and STDOUT you were given.  You are
       welcome to reopen them if you'd like.

           open(STDIN, "< datafile")
               || die "can't open datafile: $!";

           open(STDOUT, "> output")
               || die "can't open output: $!";

       And then these can be accessed directly or passed on to subprocesses.
       This makes it look as though the program were initially invoked with
       those redirections from the command line.

       It's probably more interesting to connect these to pipes.  For example:

           $pager = $ENV{PAGER} || "(less || more)";
           open(STDOUT, "| $pager")
               || die "can't fork a pager: $!";

       This makes it appear as though your program were called with its stdout
       already piped into your pager.  You can also use this kind of thing in
       conjunction with an implicit fork to yourself.  You might do this if
       you would rather handle the post processing in your own program, just
       in a different process:

           head(100);
           while (<>) {
               print;
           }

           sub head {
               my $lines = shift || 20;
               return if $pid = open(STDOUT, "|-");       # return if parent
               die "cannot fork: $!" unless defined $pid;
               while (<STDIN>) {
                   last if --$lines < 0;
                   print;
               }
               exit;
           }

       This technique can be applied to repeatedly push as many filters on
       your output stream as you wish.

Other I/O Issues

674       These topics aren't really arguments related to "open" or "sysopen",
675       but they do affect what you do with your open files.
676
677   Opening Non-File Files
678       When is a file not a file?  Well, you could say when it exists but
679       isn't a plain file.   We'll check whether it's a symbolic link first,
680       just in case.
681
682           if (-l $file || ! -f _) {
683               print "$file is not a plain file\n";
684           }
685
686       What other kinds of files are there than, well, files?  Directories,
687       symbolic links, named pipes, Unix-domain sockets, and block and
688       character devices.  Those are all files, too--just not plain files.
689       This isn't the same issue as being a text file. Not all text files are
690       plain files.  Not all plain files are text files.  That's why there are
691       separate "-f" and "-T" file tests.
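
       The kinds of files listed above can be told apart with the file test
       operators.  Here is a minimal sketch; the helper name "file_kind" is
       made up for illustration and is not part of Perl.  It calls "lstat"
       first so that a symbolic link is reported as a link, then reuses that
       result through the special "_" filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper: report what kind of filesystem object a path is.
# lstat first, so a symbolic link is reported as a link rather than as
# whatever it points to; the later tests reuse that result through "_".
sub file_kind {
    my ($path) = @_;
    lstat($path) or return "missing";
    return "symlink"          if -l _;
    return "directory"        if -d _;
    return "named pipe"       if -p _;
    return "socket"           if -S _;
    return "block device"     if -b _;
    return "character device" if -c _;
    return "plain file"       if -f _;
    return "unknown";
}

# Classify every path named on the command line.
print "$_: ", file_kind($_), "\n" for @ARGV;
```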
692
693       To open a directory, you should use the "opendir" function, then
694       process it with "readdir", carefully restoring the directory name if
695       necessary:
696
697           opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
698           while (defined($file = readdir(DIR))) {
699               # do something with "$dirname/$file"
700           }
701           closedir(DIR);
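
       The same loop can be written with a lexical directory handle.  This is
       only a sketch: "dir_entries" is a hypothetical helper name, and
       skipping "." and ".." is a choice made here, not something "readdir"
       does for you.

```perl
use strict;
use warnings;

# Hypothetical helper: return the full paths of the entries in a directory,
# leaving out the "." and ".." entries.
sub dir_entries {
    my ($dirname) = @_;
    opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
    my @paths;
    while (defined(my $file = readdir($dh))) {
        next if $file eq "." or $file eq "..";
        push @paths, "$dirname/$file";    # readdir returns bare names only
    }
    closedir($dh);
    my @sorted = sort @paths;
    return @sorted;
}

# List the contents of each directory named on the command line.
print "$_\n" for map { dir_entries($_) } @ARGV;
```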
702
703       If you want to process directories recursively, it's better to use the
704       File::Find module.  For example, this prints out all files recursively
705       and adds a slash to their names if the file is a directory.
706
707           @ARGV = qw(.) unless @ARGV;
708           use File::Find;
709           find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;
710
711       This finds all bogus symbolic links beneath a particular directory:
712
713           find sub { print "$File::Find::name\n" if -l && !-e }, $dir;
714
715       As you see, with a symbolic link, you can just pretend that it is what
716       it points to.  Or, if you want to know what it points to, then
717       "readlink" is called for:
718
719           if (-l $file) {
720               if (defined($whither = readlink($file))) {
721                   print "$file points to $whither\n";
722               } else {
723                   print "$file points nowhere: $!\n";
724               }
725           }
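
       The File::Find search and the "readlink" call can be combined.  The
       following sketch assumes a system that supports symbolic links;
       "dangling_links" is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every symbolic link under $dir whose target
# does not exist, paired with the target text stored in the link.
sub dangling_links {
    my ($dir) = @_;
    my @found;
    find(sub {
        # -l looks at the link itself; -e follows it to the target.
        return unless -l $_ && !-e $_;
        push @found, [ $File::Find::name, readlink($_) ];
    }, $dir);
    return @found;
}

# Report the dangling links under each directory named on the command line.
for my $pair (map { dangling_links($_) } @ARGV) {
    print "$pair->[0] points nowhere (target: $pair->[1])\n";
}
```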
726
727   Opening Named Pipes
728       Named pipes are a different matter.  You pretend they're regular files,
729       but their opens will normally block until there is both a reader and a
730       writer.  You can read more about them in "Named Pipes" in perlipc.
731       Unix-domain sockets are rather different beasts as well; they're
732       described in "Unix-Domain TCP Clients and Servers" in perlipc.
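
       To see the blocking behaviour of a named pipe for yourself, something
       like the following sketch may help.  It assumes a Unix-like system
       where "fork" and POSIX::mkfifo are available; the FIFO path and the
       message are made up for the demonstration.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/demo.fifo";              # hypothetical path
mkfifo($fifo, 0600) or die "can't mkfifo $fifo: $!";

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {
    # Child: this open blocks until the parent opens the FIFO for reading.
    open(my $out, ">", $fifo) or die "can't write $fifo: $!";
    print {$out} "hello through the pipe\n";
    close($out) or die "can't close $fifo: $!";
    POSIX::_exit(0);                      # leave without running END blocks
}

# Parent: this open blocks until the child opens the FIFO for writing.
open(my $in, "<", $fifo) or die "can't read $fifo: $!";
my $line = <$in>;
close($in);
waitpid($pid, 0);
print "parent read: $line";
```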
733
734       When it comes to opening devices, it can be easy and it can be tricky.
735       We'll assume that if you're opening up a block device, you know what
736       you're doing.  The character devices are more interesting.  These are
737       typically used for modems, mice, and some kinds of printers.  This is
738       described in "How do I read and write the serial port?" in perlfaq8.
739       It's often enough to open them carefully:
740
741           sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
742                       # (O_NOCTTY no longer needed on POSIX systems)
743               or die "can't open /dev/ttyS1: $!";
744           open(TTYOUT, "+>&TTYIN")
745               or die "can't dup TTYIN: $!";
746
747           $ofh = select(TTYOUT); $| = 1; select($ofh);
748
749           print TTYOUT "+++at\015";
750           $answer = <TTYIN>;
751
752       With descriptors that you haven't opened using "sysopen", such as
753       sockets, you can set them to be non-blocking using "fcntl":
754
755           use Fcntl;
756           my $old_flags = fcntl($handle, F_GETFL, 0)
757               or die "can't get flags: $!";
758           fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
759               or die "can't set non blocking: $!";
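
       A pipe makes it easy to watch this work.  The sketch below wraps the
       two "fcntl" calls in a hypothetical helper named "set_nonblocking",
       then reads from an empty pipe; it assumes a system where O_NONBLOCK
       is supported on pipes.

```perl
use strict;
use warnings;
use Fcntl;                                # F_GETFL, F_SETFL, O_NONBLOCK
use Errno;                                # enables the %! error hash

# Hypothetical helper: switch an already-open handle to non-blocking mode.
sub set_nonblocking {
    my ($handle) = @_;
    my $old_flags = fcntl($handle, F_GETFL, 0)
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
        or die "can't set non blocking: $!";
    return;
}

pipe(my $reader, my $writer) or die "can't make pipe: $!";
set_nonblocking($reader);

# Nothing has been written yet, so a blocking read would hang here.
my $got = sysread($reader, my $buf, 1024);
my $would_block = !defined($got) && ($!{EAGAIN} || $!{EWOULDBLOCK});
print $would_block ? "no data yet\n" : "unexpected result\n";
```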
760
761       Rather than losing yourself in a morass of twisting, turning "ioctl"s,
762       all dissimilar, if you're going to manipulate ttys, it's best to make
763       calls out to the stty(1) program if you have it, or else use the
764       portable POSIX interface.  To figure this all out, you'll need to read
765       the termios(3) manpage, which describes the POSIX interface to tty
766       devices, and then POSIX, which describes Perl's interface to POSIX.
767       There are also some high-level modules on CPAN that can help you with
768       these games.  Check out Term::ReadKey and Term::ReadLine.
769
770   Opening Sockets
771       What else can you open?  To open a connection using sockets, you won't
772       use one of Perl's two open functions.  See "Sockets: Client/Server
773       Communication" in perlipc for that.  Here's an example.  Once you have
774       it, you can use FH as a bidirectional filehandle.
775
776           use IO::Socket;
777           local *FH = IO::Socket::INET->new("www.perl.com:80");
778
779       For opening up a URL, the LWP modules from CPAN are just what the
780       doctor ordered.  There's no filehandle interface, but it's still easy
781       to get the contents of a document:
782
783           use LWP::Simple;
784           $doc = get('http://www.cpan.org/');
785
786   Binary Files
787       On certain legacy systems with what could charitably be called
788       terminally convoluted (some would say broken) I/O models, a file isn't
789       a file--at least, not with respect to the C standard I/O library.  On
790       these old systems whose libraries (but not kernels) distinguish between
791       text and binary streams, to get files to behave properly you'll have to
792       bend over backwards to avoid nasty problems.  On such infelicitous
793       systems, sockets and pipes are already opened in binary mode, and there
794       is currently no way to turn that off.  With files, you have more
795       options.
796
797       One option is to use the "binmode" function on the appropriate
798       handles before doing regular I/O on them:
799
800           binmode(STDIN);
801           binmode(STDOUT);
802           while (<STDIN>) { print }
803
804       Passing "sysopen" a non-standard flag option will also open the file in
805       binary mode on those systems that support it.  This is the equivalent
806       of opening the file normally, then calling "binmode" on the handle.
807
808           sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
809               || die "can't open records.data: $!";
810
811       Now you can use "read" and "print" on that handle without worrying
812       about the non-standard system I/O library breaking your data.  It's not
813       a pretty picture, but then, legacy systems seldom are.  CP/M will be
814       with us until the end of days, and after.
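
       If you want to check that a handle really is passing bytes through
       untouched, a round trip such as the following sketch is one way to do
       it.  The temporary file comes from File::Temp and is an assumption of
       this example; on systems that make no text/binary distinction the
       "binmode" calls are harmless.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh);                         # before any I/O on the handle

# Every byte value once, including \r, \n and \cZ.
my $bytes = join "", map { chr } 0 .. 255;
print {$tmp_fh} $bytes or die "can't write $tmp_name: $!";
close($tmp_fh)         or die "can't close $tmp_name: $!";

open(my $in, "<", $tmp_name) or die "can't open $tmp_name: $!";
binmode($in);
my $count = read($in, my $copy, 1024);
defined $count or die "can't read $tmp_name: $!";
close($in);

print $copy eq $bytes ? "round trip intact\n" : "data was altered\n";
```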
815
816       On systems with exotic I/O systems, it turns out that, astonishingly
817       enough, even unbuffered I/O using "sysread" and "syswrite" might do
818       sneaky data mutilation behind your back.
819
820           while (sysread(WHENCE, $buf, 1024)) {
821               syswrite(WHITHER, $buf, length($buf));
822           }
823
824       Depending on the vicissitudes of your runtime system, even these calls
825       may need "binmode" or "O_BINARY" first.  Systems known to be free of
826       such difficulties include Unix, the Mac OS, Plan 9, and Inferno.
827
828   File Locking
829       In a multitasking environment, you may need to be careful not to
830       collide with other processes that want to do I/O on the same files as
831       you are working on.  You'll often need shared or exclusive locks on
832       files for reading and writing respectively.  You might just pretend
833       that only exclusive locks exist.
834
835       Never use the existence of a file "-e $file" as a locking indication,
836       because there is a race condition between the test for the existence of
837       the file and its creation.  It's possible for another process to create
838       a file in the slice of time between your existence check and your
839       attempt to create the file.  Atomicity is critical.
840
841       Perl's most portable locking interface is via the "flock" function,
842       whose simplicity is emulated on systems that don't directly support it,
843       such as SysV or Windows.  The underlying semantics may affect how it
844       all works, so you should learn how "flock" is implemented on your
845       system's port of Perl.
846
847       File locking does not lock out another process that would like to do
848       I/O.  A file lock only locks out others trying to get a lock, not
849       processes trying to do I/O.  Because locks are advisory, if one process
850       uses locking and another doesn't, all bets are off.
851
852       By default, the "flock" call will block until a lock is granted.  A
853       request for a shared lock will be granted as soon as there is no
854       exclusive locker.  A request for an exclusive lock will be granted as
855       soon as there is no locker of any kind.  Locks are on file descriptors,
856       not file names.  You can't lock a file until you open it, and you can't
857       hold on to a lock once the file has been closed.
858
859       Here's how to get a blocking shared lock on a file, typically used for
860       reading:
861
862           use 5.004;
863           use Fcntl qw(:DEFAULT :flock);
864           open(FH, "< filename")  or die "can't open filename: $!";
865           flock(FH, LOCK_SH)      or die "can't lock filename: $!";
866           # now read from FH
867
868       You can get a non-blocking lock by using "LOCK_NB".
869
870           flock(FH, LOCK_SH | LOCK_NB)
871               or die "can't lock filename: $!";
872
873       This can be useful for producing more user-friendly behaviour by
874       warning if you're going to be blocking:
875
876           use 5.004;
877           use Fcntl qw(:DEFAULT :flock);
878           open(FH, "< filename")  or die "can't open filename: $!";
879           unless (flock(FH, LOCK_SH | LOCK_NB)) {
880               $| = 1;
881               print "Waiting for lock...";
882               flock(FH, LOCK_SH)  or die "can't lock filename: $!";
883               print "got it.\n";
884           }
885           # now read from FH
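
       The next sketch shows a non-blocking request being refused while
       another handle holds an exclusive lock, and granted once that handle
       is closed.  It assumes that Perl's "flock" is built on a native
       flock(2), as on Linux and the BSDs, where two separate opens of one
       file contend with each other even inside a single process; under an
       emulation the outcome may differ.  On such systems the first attempt
       is expected to fail and the second to succeed.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

my ($holder, $name) = tempfile(UNLINK => 1);
flock($holder, LOCK_EX) or die "can't lock $name: $!";

# A second, independent open of the same file.
open(my $contender, "<", $name) or die "can't open $name: $!";
my $got_while_held = flock($contender, LOCK_SH | LOCK_NB) ? 1 : 0;

# Closing the first handle gives up its lock.
close($holder) or die "can't close $name: $!";
my $got_after_close = flock($contender, LOCK_SH | LOCK_NB) ? 1 : 0;

print "while held: $got_while_held, after close: $got_after_close\n";
```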
886
887       To get an exclusive lock, typically used for writing, you have to be
888       careful.  We "sysopen" the file so it can be locked before it gets
889       emptied.  You can get a nonblocking version using "LOCK_EX | LOCK_NB".
890
891           use 5.004;
892           use Fcntl qw(:DEFAULT :flock);
893           sysopen(FH, "filename", O_WRONLY | O_CREAT)
894               or die "can't open filename: $!";
895           flock(FH, LOCK_EX)
896               or die "can't lock filename: $!";
897           truncate(FH, 0)
898               or die "can't truncate filename: $!";
899           # now write to FH
900
901       Finally, due to the uncounted millions who cannot be dissuaded from
902       wasting cycles on useless vanity devices called hit counters, here's
903       how to increment a number in a file safely:
904
905           use Fcntl qw(:DEFAULT :flock);
906
907           sysopen(FH, "numfile", O_RDWR | O_CREAT)
908               or die "can't open numfile: $!";
909           # autoflush FH
910           $ofh = select(FH); $| = 1; select ($ofh);
911           flock(FH, LOCK_EX)
912               or die "can't write-lock numfile: $!";
913
914           $num = <FH> || 0;
915           seek(FH, 0, 0)
916               or die "can't rewind numfile: $!";
917           print FH $num+1, "\n"
918               or die "can't write numfile: $!";
919
920           truncate(FH, tell(FH))
921               or die "can't truncate numfile: $!";
922           close(FH)
923               or die "can't close numfile: $!";
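
       The same steps can be packaged as a subroutine with a lexical
       filehandle.  This is a sketch only: "bump_counter" is a hypothetical
       name, and treating an empty or unparsable file as zero is an
       assumption made here.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);

# Hypothetical helper: add one to the number stored in $path and return
# the new value, holding an exclusive lock for the whole read-modify-write.
sub bump_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT)
        or die "can't open $path: $!";
    flock($fh, LOCK_EX)
        or die "can't write-lock $path: $!";

    my $num = <$fh>;
    $num = 0 unless defined $num;         # new, empty file
    chomp $num;
    $num = 0 unless $num =~ /^\d+$/;      # anything unparsable counts as zero
    $num++;

    seek($fh, 0, SEEK_SET)
        or die "can't rewind $path: $!";
    print {$fh} "$num\n"
        or die "can't write $path: $!";
    truncate($fh, tell($fh))
        or die "can't truncate $path: $!";
    close($fh)                            # flushes, then gives up the lock
        or die "can't close $path: $!";
    return $num;
}

# Bump the counter file named on the command line, if there is one.
print bump_counter($ARGV[0]), "\n" if @ARGV;
```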
924
925   IO Layers
926       In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced.  This
927       is a new "plumbing" for all the I/O happening in Perl; for the most
928       part everything will work just as it did, but PerlIO also brought in
929       some new features such as the ability to think of I/O as "layers".  One
930       I/O layer may in addition to just moving the data also do
931       transformations on the data.  Such transformations may include
932       compression and decompression, encryption and decryption, and
933       transforming between various character encodings.
934
935       Full discussion about the features of PerlIO is out of scope for this
936       tutorial, but here is how to recognize the layers being used:
937
938       ·   The three-(or more)-argument form of "open" is being used and the
939           second argument contains something else in addition to the usual
940           '<', '>', '>>', '|' and their variants, for example:
941
942               open(my $fh, "<:crlf", $fn);
943
944       ·   The two-argument form of "binmode" is being used, for example
945
946               binmode($fh, ":encoding(utf16)");
947
948       For more detailed discussion about PerlIO see PerlIO; for more detailed
949       discussion about Unicode and I/O see perluniintro.
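
       To see a layer at work, the following sketch writes a short string
       through an ":encoding(UTF-8)" layer and reads it back, using both of
       the forms described above.  The temporary file and the sample string
       are assumptions of this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

my $text = "caf\x{e9}\n";                 # five characters, one non-ASCII

# Layer given in the mode argument of three-argument open.
open(my $out, ">:encoding(UTF-8)", $tmp_name)
    or die "can't open $tmp_name: $!";
print {$out} $text or die "can't write $tmp_name: $!";
close($out)        or die "can't close $tmp_name: $!";

# Layer applied afterwards with two-argument binmode.
open(my $in, "<", $tmp_name) or die "can't open $tmp_name: $!";
binmode($in, ":encoding(UTF-8)");
my $decoded = <$in>;
close($in);

# The character count and the size on disk differ because of the encoding.
printf "%d characters, %d bytes on disk\n", length($decoded), (-s $tmp_name);
```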
950

AUTHOR and COPYRIGHT

956       Copyright 1998 Tom Christiansen.
957
958       This documentation is free; you can redistribute it and/or modify it
959       under the same terms as Perl itself.
960
961       Irrespective of its distribution, all code examples in these files are
962       hereby placed into the public domain.  You are permitted and encouraged
963       to use this code in your own programs for fun or for profit as you see
964       fit.  A simple comment in the code giving credit would be courteous but
965       is not required.
966

HISTORY

968       First release: Sat Jan  9 08:09:11 MST 1999
969
970
971
972perl v5.16.3                      2013-03-04                    PERLOPENTUT(1)