PERLFAQ5(1) Perl Programmers Reference Guide PERLFAQ5(1)

perlfaq5 - Files and Formats

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.
11
12 How do I flush/unbuffer an output filehandle? Why must I do this?
13 (contributed by brian d foy)
14
15 You might like to read Mark Jason Dominus's "Suffering From Buffering"
16 at <http://perl.plover.com/FAQs/Buffering.html> .
17
18 Perl normally buffers output so it doesn't make a system call for every
19 bit of output. By saving up output, it makes fewer expensive system
20 calls. For instance, in this little bit of code, you want to print a
21 dot to the screen for every line you process to watch the progress of
22 your program. Instead of seeing a dot for every line, Perl buffers the
23 output and you have a long wait before you see a row of 50 dots all at
24 once:
25
26 # long wait, then row of dots all at once
27 while( <> ) {
28 print ".";
29 print "\n" unless ++$count % 50;
30
31 #... expensive line processing operations
32 }
33
34 To get around this, you have to unbuffer the output filehandle, in this
35 case, "STDOUT". You can set the special variable $| to a true value
36 (mnemonic: making your filehandles "piping hot"):
37
38 $|++;
39
40 # dot shown immediately
41 while( <> ) {
42 print ".";
43 print "\n" unless ++$count % 50;
44
45 #... expensive line processing operations
46 }
47
48 The $| is one of the per-filehandle special variables, so each
49 filehandle has its own copy of its value. If you want to merge standard
50 output and standard error for instance, you have to unbuffer each
51 (although STDERR might be unbuffered by default):
52
53 {
54 my $previous_default = select(STDOUT); # save previous default
55 $|++; # autoflush STDOUT
56 select(STDERR);
57 $|++; # autoflush STDERR, to be sure
58 select($previous_default); # restore previous default
59 }
60
61 # now should alternate . and +
62 while( 1 ) {
63 sleep 1;
64 print STDOUT ".";
65 print STDERR "+";
66 print STDOUT "\n" unless ++$count % 25;
67 }
68
69 Besides the $| special variable, you can use "binmode" to give your
70 filehandle a ":unix" layer, which is unbuffered:
71
72 binmode( STDOUT, ":unix" );
73
74 while( 1 ) {
75 sleep 1;
76 print ".";
77 print "\n" unless ++$count % 50;
78 }
79
80 For more information on output layers, see the entries for "binmode"
81 and open in perlfunc, and the PerlIO module documentation.
82
If you are using IO::Handle or one of its subclasses, you can call the
"autoflush" method to change the settings of the filehandle:

use IO::Handle;
open my( $io_fh ), ">", "output.txt"
    or die "Can't open output.txt: $!";
$io_fh->autoflush(1);

The IO::Handle objects also have a "flush" method. You can flush the
buffer any time you want without turning on autoflush:

$io_fh->flush;
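
As a rough illustration of what an explicit "flush" does, this sketch
writes to a scratch file and looks at its size on disk before and after
the call. The scratch file from File::Temp and the variable names are
assumptions made for the example only:

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw( tempfile );

# scratch file used only for this demonstration
my( $fh, $path ) = tempfile( UNLINK => 1 );

print {$fh} "pending";
my $before = ( -s $path ) || 0;   # typically 0: the text is still buffered

$fh->flush;                       # hand the buffer to the operating system
my $after = ( -s $path ) || 0;    # 7, the length of "pending"

print "before flush: $before bytes, after flush: $after bytes\n";
```

The handle stays buffered afterwards; only the data written so far is
pushed out.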
94
95 How do I change, delete, or insert a line in a file, or append to the
96 beginning of a file?
97 (contributed by brian d foy)
98
The basic idea of inserting, changing, or deleting a line from a text
file involves reading and printing the file to the point you want to
make the change, making the change, then reading and printing the rest
of the file. Perl doesn't provide random access to lines (especially
since the input record separator, $/, is mutable), although modules
such as Tie::File can fake it.
105
106 A Perl program to do these tasks takes the basic form of opening a
107 file, printing its lines, then closing the file:
108
109 open my $in, '<', $file or die "Can't read old file: $!";
110 open my $out, '>', "$file.new" or die "Can't write new file: $!";
111
112 while( <$in> ) {
113 print $out $_;
114 }
115
116 close $out;
117
118 Within that basic form, add the parts that you need to insert, change,
119 or delete lines.
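
The basic form leaves its result in "$file.new". One way to finish the
job, sketched here on the assumption that you want the new file to
replace the original, is to close both handles and rename the new file
over the old one. The scratch directory, the name "notes.txt", and the
sample contents are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# set up a scratch file so the example can run anywhere
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/notes.txt";
open my $setup, '>', $file or die "Can't create $file: $!";
print {$setup} "first\n", "second\n";
close $setup or die "Can't close $file: $!";

open my $in,  '<', $file       or die "Can't read old file: $!";
open my $out, '>', "$file.new" or die "Can't write new file: $!";

while( <$in> ) {
    print {$out} $_;      # insert, change, or skip lines here
}

close $in;
close $out or die "Can't finish writing $file.new: $!";

# replace the original only after the new file is complete
rename "$file.new", $file or die "Can't rename $file.new: $!";
```

Checking the return value of "close" on the output handle matters here,
since a full disk may only show up when the last buffer is written.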
120
121 To prepend lines to the beginning, print those lines before you enter
122 the loop that prints the existing lines.
123
124 open my $in, '<', $file or die "Can't read old file: $!";
125 open my $out, '>', "$file.new" or die "Can't write new file: $!";
126
127 print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC
128
129 while( <$in> ) {
130 print $out $_;
131 }
132
133 close $out;
134
To change existing lines, insert the code to modify the lines inside
the "while" loop. In this case, the code finds every lowercase "perl"
and capitalizes it to "Perl". This happens for every line, so be sure
that you're supposed to do that on every line!
139
140 open my $in, '<', $file or die "Can't read old file: $!";
141 open my $out, '>', "$file.new" or die "Can't write new file: $!";
142
143 print $out "# Add this line to the top\n";
144
145 while( <$in> ) {
146 s/\b(perl)\b/Perl/g;
147 print $out $_;
148 }
149
150 close $out;
151
152 To change only a particular line, the input line number, $., is useful.
153 First read and print the lines up to the one you want to change. Next,
154 read the single line you want to change, change it, and print it. After
155 that, read the rest of the lines and print those:
156
157 while( <$in> ) { # print the lines before the change
158 print $out $_;
159 last if $. == 4; # line number before change
160 }
161
162 my $line = <$in>;
163 $line =~ s/\b(perl)\b/Perl/g;
164 print $out $line;
165
166 while( <$in> ) { # print the rest of the lines
167 print $out $_;
168 }
169
170 To skip lines, use the looping controls. The "next" in this example
171 skips comment lines, and the "last" stops all processing once it
172 encounters either "__END__" or "__DATA__".
173
while( <$in> ) {
    next if /^\s*#/;             # skip comment lines, indented or not
    last if /^__(END|DATA)__$/;  # stop at end of code marker
    print $out $_;
}
179
180 Do the same sort of thing to delete a particular line by using "next"
181 to skip the lines you don't want to show up in the output. This example
182 skips every fifth line:
183
184 while( <$in> ) {
185 next unless $. % 5;
186 print $out $_;
187 }
188
189 If, for some odd reason, you really want to see the whole file at once
190 rather than processing line-by-line, you can slurp it in (as long as
191 you can fit the whole thing in memory!):
192
open my $in, '<', $file or die "Can't read old file: $!";
open my $out, '>', "$file.new" or die "Can't write new file: $!";

my $content = do { local $/; <$in> }; # slurp the whole file into one string

# do your magic here

print $out $content;
201
202 Modules such as File::Slurp and Tie::File can help with that too. If
203 you can, however, avoid reading the entire file at once. Perl won't
204 give that memory back to the operating system until the process
205 finishes.
206
207 You can also use Perl one-liners to modify a file in-place. The
208 following changes all 'Fred' to 'Barney' in inFile.txt, overwriting the
209 file with the new contents. With the "-p" switch, Perl wraps a "while"
210 loop around the code you specify with "-e", and "-i" turns on in-place
211 editing. The current line is in $_. With "-p", Perl automatically
212 prints the value of $_ at the end of the loop. See perlrun for more
213 details.
214
215 perl -pi -e 's/Fred/Barney/' inFile.txt
216
217 To make a backup of "inFile.txt", give "-i" a file extension to add:
218
219 perl -pi.bak -e 's/Fred/Barney/' inFile.txt
220
221 To change only the fifth line, you can add a test checking $., the
222 input line number, then only perform the operation when the test
223 passes:
224
225 perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt
226
227 To add lines before a certain line, you can add a line (or lines!)
228 before Perl prints $_:
229
230 perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt
231
232 You can even add a line to the beginning of a file, since the current
233 line prints at the end of the loop:
234
235 perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt
236
237 To insert a line after one already in the file, use the "-n" switch.
238 It's just like "-p" except that it doesn't print $_ at the end of the
239 loop, so you have to do that yourself. In this case, print $_ first,
240 then print the line that you want to add.
241
242 perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt
243
To delete lines, only print the ones that you want. This keeps the
lines that contain a "d" and drops all the others:

perl -ni -e 'print if /d/' inFile.txt
247
248 How do I count the number of lines in a file?
249 (contributed by brian d foy)
250
251 Conceptually, the easiest way to count the lines in a file is to simply
252 read them and count them:
253
254 my $count = 0;
255 while( <$fh> ) { $count++; }
256
257 You don't really have to count them yourself, though, since Perl
258 already does that with the $. variable, which is the current line
259 number from the last filehandle read:
260
261 1 while( <$fh> );
262 my $count = $.;
263
264 If you want to use $., you can reduce it to a simple one-liner, like
265 one of these:
266
267 % perl -lne '} print $.; {' file
268
269 % perl -lne 'END { print $. }' file
270
271 Those can be rather inefficient though. If they aren't fast enough for
272 you, you might just read chunks of data and count the number of
273 newlines:
274
my $lines = 0;
my $buffer;
open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
while( sysread $fh, $buffer, 4096 ) {
    $lines += ( $buffer =~ tr/\n// );
}
close $fh;
281
282 However, that doesn't work if the line ending isn't a newline. You
283 might change that "tr///" to a "s///" so you can count the number of
284 times the input record separator, $/, shows up:
285
my $lines = 0;
my $buffer;
open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
while( sysread $fh, $buffer, 4096 ) {
    # a multi-character separator split across two chunks is missed
    $lines += ( $buffer =~ s|\Q$/\E||g );
}
close $fh;
292
293 If you don't mind shelling out, the "wc" command is usually the
294 fastest, even with the extra interprocess overhead. Ensure that you
295 have an untainted filename though:
296
#!perl -T

$ENV{PATH} = undef;

my $lines;
if( $filename =~ /^([0-9a-z_.]+)\z/ ) {
    $lines = `/usr/bin/wc -l $1`;  # output is the count, then the file name
    chomp $lines;
}
306
307 How do I delete the last N lines from a file?
308 (contributed by brian d foy)
309
310 The easiest conceptual solution is to count the lines in the file then
311 start at the beginning and print the number of lines (minus the last N)
312 to a new file.
313
314 Most often, the real question is how you can delete the last N lines
315 without making more than one pass over the file, or how to do it
316 without a lot of copying. The easy concept is the hard reality when you
317 might have millions of lines in your file.
318
319 One trick is to use File::ReadBackwards, which starts at the end of the
320 file. That module provides an object that wraps the real filehandle to
321 make it easy for you to move around the file. Once you get to the spot
322 you need, you can get the actual filehandle and work with it as normal.
323 In this case, you get the file position at the end of the last line you
324 want to keep and truncate the file to that point:
325
326 use File::ReadBackwards;
327
328 my $filename = 'test.txt';
329 my $Lines_to_truncate = 2;
330
331 my $bw = File::ReadBackwards->new( $filename )
332 or die "Could not read backwards in [$filename]: $!";
333
334 my $lines_from_end = 0;
335 until( $bw->eof or $lines_from_end == $Lines_to_truncate ) {
336 print "Got: ", $bw->readline;
337 $lines_from_end++;
338 }
339
340 truncate( $filename, $bw->tell );
341
342 The File::ReadBackwards module also has the advantage of setting the
343 input record separator to a regular expression.
344
345 You can also use the Tie::File module which lets you access the lines
346 through a tied array. You can use normal array operations to modify
347 your file, including setting the last index and using "splice".
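
A minimal sketch of the Tie::File approach, assuming a small scratch
file with five lines (the file, its contents, and the variable names
are invented for the example). Shrinking the tied array by setting its
last index removes the trailing lines from the file itself:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw( tempfile );

# scratch file with five lines, just for the demonstration
my( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} map { "line $_\n" } 1 .. 5;
close $tmp_fh or die "Can't close $path: $!";

my $lines_to_drop = 2;

tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";

# never ask for a negative length if the file is shorter than expected
my $keep = @lines > $lines_to_drop ? @lines - $lines_to_drop : 0;
$#lines = $keep - 1;      # shrinking the array truncates the file

untie @lines;
```

Tie::File is convenient here, but it has to find the line boundaries
first, so it is not necessarily faster than File::ReadBackwards on a
very large file.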
348
349 How can I use Perl's "-i" option from within a program?
350 "-i" sets the value of Perl's $^I variable, which in turn affects the
351 behavior of "<>"; see perlrun for more details. By modifying the
352 appropriate variables directly, you can get the same behavior within a
353 larger program. For example:
354
355 # ...
356 {
357 local($^I, @ARGV) = ('.orig', glob("*.c"));
358 while (<>) {
359 if ($. == 1) {
360 print "This line should appear at the top of each file\n";
361 }
362 s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case
363 print;
364 close ARGV if eof; # Reset $.
365 }
366 }
367 # $^I and @ARGV return to their old values here
368
369 This block modifies all the ".c" files in the current directory,
370 leaving a backup of the original data from each file in a new ".c.orig"
371 file.
372
373 How can I copy a file?
374 (contributed by brian d foy)
375
376 Use the File::Copy module. It comes with Perl and can do a true copy
377 across file systems, and it does its magic in a portable fashion.
378
379 use File::Copy;
380
381 copy( $original, $new_copy ) or die "Copy failed: $!";
382
383 If you can't use File::Copy, you'll have to do the work yourself: open
384 the original file, open the destination file, then print to the
385 destination file as you read the original. You also have to remember to
386 copy the permissions, owner, and group to the new file.
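
If you do end up copying by hand, the work might look something like
this sketch. The helper name "copy_by_hand", the 64K chunk size, and
the scratch files are assumptions for illustration. It copies the bytes
and the permission bits only; owner and group would need "chown", which
usually requires extra privileges and is left out:

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# hypothetical helper: copy the bytes, then the permission bits
sub copy_by_hand {
    my( $original, $new_copy ) = @_;

    open my $in,  '<:raw', $original or die "Can't read $original: $!";
    open my $out, '>:raw', $new_copy or die "Can't write $new_copy: $!";

    my $buffer;
    while( read $in, $buffer, 64 * 1024 ) {
        print {$out} $buffer or die "Can't write to $new_copy: $!";
    }
    close $in;
    close $out or die "Can't close $new_copy: $!";

    my $mode = ( stat $original )[2] & 07777;   # permission bits only
    chmod $mode, $new_copy or die "Can't chmod $new_copy: $!";
    return 1;
}

# try it on a scratch file
my $dir      = tempdir( CLEANUP => 1 );
my $original = "$dir/original.bin";
open my $fh, '>:raw', $original or die "Can't create $original: $!";
print {$fh} "some\0binary\ndata";
close $fh or die "Can't close $original: $!";

copy_by_hand( $original, "$dir/copy.bin" );
```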
387
388 How do I make a temporary file name?
389 If you don't need to know the name of the file, you can use "open()"
390 with "undef" in place of the file name. In Perl 5.8 or later, the
391 "open()" function creates an anonymous temporary file:
392
393 open my $tmp, '+>', undef or die $!;
394
395 Otherwise, you can use the File::Temp module.
396
use File::Temp qw/ tempfile tempdir /;

my $dir = tempdir( CLEANUP => 1 );
my ($fh, $filename) = tempfile( DIR => $dir );

# or if you don't need to know the filename

my $anon_fh = tempfile( DIR => $dir );
405
File::Temp has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the "new_tmpfile" class
method from the IO::File module to get a filehandle opened for reading
and writing. Use it if you don't need to know the file's name:
410
411 use IO::File;
412 my $fh = IO::File->new_tmpfile()
413 or die "Unable to make new temporary file: $!";
414
415 If you're committed to creating a temporary file by hand, use the
416 process ID and/or the current time-value. If you need to have many
417 temporary files in one process, use a counter:
418
419 BEGIN {
420 use Fcntl;
421 my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
422 my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time;
423
424 sub temp_file {
425 my $fh;
426 my $count = 0;
427 until( defined(fileno($fh)) || $count++ > 100 ) {
428 $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
429 # O_EXCL is required for security reasons.
430 sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT;
431 }
432
433 if( defined fileno($fh) ) {
434 return ($fh, $base_name);
435 }
436 else {
437 return ();
438 }
439 }
440 }
441
442 How can I manipulate fixed-record-length files?
443 The most efficient way is using pack() and unpack(). This is faster
444 than using substr() when taking many, many strings. It is slower for
445 just a few.
446
447 Here is a sample chunk of code to break up and put back together again
448 some fixed-format input lines, in this case from the output of a
449 normal, Berkeley-style ps:
450
451 # sample input line:
452 # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
453 my $PS_T = 'A6 A4 A7 A5 A*';
454 open my $ps, '-|', 'ps';
455 print scalar <$ps>;
456 my @fields = qw( pid tt stat time command );
457 while (<$ps>) {
458 my %process;
459 @process{@fields} = unpack($PS_T, $_);
460 for my $field ( @fields ) {
461 print "$field: <$process{$field}>\n";
462 }
463 print 'line=', pack($PS_T, @process{@fields} ), "\n";
464 }
465
466 We've used a hash slice in order to easily handle the fields of each
467 row. Storing the keys in an array makes it easy to operate on them as
468 a group or loop over them with "for". It also avoids polluting the
469 program with global variables and using symbolic references.
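
The same pack() and unpack() templates work on files made of
fixed-length records. This sketch assumes a made-up layout (a 10-byte
name, a 5-byte code, and a 32-bit count) and uses an in-memory
filehandle in place of a real file. Because every record has the same
size, "read" can fetch one record at a time and "seek" can jump
straight to any record:

```perl
use strict;
use warnings;

# hypothetical record layout: 10-byte name, 5-byte code, 32-bit count
my $template   = 'A10 A5 N';
my $record_len = length pack( $template, '', '', 0 );   # 19 bytes

# write three records into an in-memory "file"
my $data = '';
open my $out, '>:raw', \$data or die "Can't open string for writing: $!";
print {$out} pack( $template, @$_ ) for
    [ 'apple', 'A1', 3 ], [ 'banana', 'B22', 14 ], [ 'cherry', 'C333', 159 ];
close $out;

# read them back one fixed-size chunk at a time
open my $in, '<:raw', \$data or die "Can't open string for reading: $!";
my @records;
while( read( $in, my $record, $record_len ) == $record_len ) {
    push @records, [ unpack $template, $record ];
}

# jump straight to the third record (index 2) without reading the others
seek( $in, 2 * $record_len, 0 ) or die "Can't seek: $!";
read( $in, my $third, $record_len );
my( $name, $code, $count ) = unpack $template, $third;
close $in;

print "record 2: $name $code $count\n";
```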
470
471 How can I make a filehandle local to a subroutine? How do I pass
472 filehandles between subroutines? How do I make an array of filehandles?
473 As of perl5.6, open() autovivifies file and directory handles as
474 references if you pass it an uninitialized scalar variable. You can
475 then pass these references just like any other scalar, and use them in
476 the place of named handles.
477
478 open my $fh, $file_name;
479
480 open local $fh, $file_name;
481
482 print $fh "Hello World!\n";
483
484 process_file( $fh );
485
486 If you like, you can store these filehandles in an array or a hash. If
487 you access them directly, they aren't simple scalars and you need to
488 give "print" a little help by placing the filehandle reference in
489 braces. Perl can only figure it out on its own when the filehandle
490 reference is a simple scalar.
491
492 my @fhs = ( $fh1, $fh2, $fh3 );
493
494 for( $i = 0; $i <= $#fhs; $i++ ) {
495 print {$fhs[$i]} "just another Perl answer, \n";
496 }
497
498 Before perl5.6, you had to deal with various typeglob idioms which you
499 may see in older code.
500
501 open FILE, "> $filename";
502 process_typeglob( *FILE );
503 process_reference( \*FILE );
504
505 sub process_typeglob { local *FH = shift; print FH "Typeglob!" }
506 sub process_reference { local $fh = shift; print $fh "Reference!" }
507
508 If you want to create many anonymous handles, you should check out the
509 Symbol or IO::Handle modules.
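
Here is a small sketch that pulls these pieces together: lexical
filehandles stored in a hash, passed to a subroutine, and used with the
brace form of "print". The log levels, the "log_to" helper, and the
in-memory "files" are assumptions chosen so the example runs on its
own:

```perl
use strict;
use warnings;

# hypothetical log levels, each writing to its own in-memory "file"
my %buffer = ( info => '', error => '' );
my %fh_for;
for my $level ( keys %buffer ) {
    open $fh_for{$level}, '>', \$buffer{$level}
        or die "Can't open handle for $level: $!";
}

sub log_to {
    my( $fh, $message ) = @_;      # the handle arrives as a plain scalar
    print $fh "$message\n";
}

log_to( $fh_for{info}, 'starting up' );
print { $fh_for{error} } "something broke\n";   # braces needed here

close $_ for values %fh_for;
```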
510
511 How can I use a filehandle indirectly?
512 An indirect filehandle is the use of something other than a symbol in a
513 place that a filehandle is expected. Here are ways to get indirect
514 filehandles:
515
516 $fh = SOME_FH; # bareword is strict-subs hostile
517 $fh = "SOME_FH"; # strict-refs hostile; same package only
518 $fh = *SOME_FH; # typeglob
519 $fh = \*SOME_FH; # ref to typeglob (bless-able)
520 $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob
521
522 Or, you can use the "new" method from one of the IO::* modules to
523 create an anonymous filehandle and store that in a scalar variable.
524
525 use IO::Handle; # 5.004 or higher
526 my $fh = IO::Handle->new();
527
528 Then use any of those as you would a normal filehandle. Anywhere that
529 Perl is expecting a filehandle, an indirect filehandle may be used
530 instead. An indirect filehandle is just a scalar variable that contains
531 a filehandle. Functions like "print", "open", "seek", or the "<FH>"
532 diamond operator will accept either a named filehandle or a scalar
533 variable containing one:
534
($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
print $ofh "Type it: ";
my $got = <$ifh>;
print $efh "What was that: $got";
539
If you're passing a filehandle to a function, you can write the
function in two ways. It can take the handle as a scalar and use it
indirectly:
542
543 sub accept_fh {
544 my $fh = shift;
545 print $fh "Sending to indirect filehandle\n";
546 }
547
548 Or it can localize a typeglob and use the filehandle directly:
549
550 sub accept_fh {
551 local *FH = shift;
552 print FH "Sending to localized filehandle\n";
553 }
554
555 Both styles work with either objects or typeglobs of real filehandles.
556 (They might also work with strings under some circumstances, but this
557 is risky.)
558
559 accept_fh(*STDOUT);
560 accept_fh($handle);
561
562 In the examples above, we assigned the filehandle to a scalar variable
563 before using it. That is because only simple scalar variables, not
564 expressions or subscripts of hashes or arrays, can be used with built-
565 ins like "print", "printf", or the diamond operator. Using something
566 other than a simple scalar variable as a filehandle is illegal and
567 won't even compile:
568
my @fd = (*STDIN, *STDOUT, *STDERR);
print $fd[1] "Type it: ";           # WRONG
my $got = <$fd[0]>;                 # WRONG
print $fd[2] "What was that: $got"; # WRONG
573
574 With "print" and "printf", you get around this by using a block and an
575 expression where you would place the filehandle:
576
577 print { $fd[1] } "funny stuff\n";
578 printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
579 # Pity the poor deadbeef.
580
581 That block is a proper block like any other, so you can put more
582 complicated code there. This sends the message out to one of two
583 places:
584
585 my $ok = -x "/bin/cat";
586 print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
587 print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
588
This approach of treating "print" and "printf" like object method
calls doesn't work for the diamond operator. That's because it's a real
591 operator, not just a function with a comma-less argument. Assuming
592 you've been storing typeglobs in your structure as we did above, you
593 can use the built-in function named "readline" to read a record just as
594 "<>" does. Given the initialization shown above for @fd, this would
595 work, but only because readline() requires a typeglob. It doesn't work
596 with objects or strings, which might be a bug we haven't fixed yet.
597
598 $got = readline($fd[0]);
599
600 Let it be noted that the flakiness of indirect filehandles is not
601 related to whether they're strings, typeglobs, objects, or anything
602 else. It's the syntax of the fundamental operators. Playing the object
603 game doesn't help you at all here.
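
To see the block form of "print" and the "readline" function working
together on handles kept in an array, here is a small sketch. It uses
lexical handles opened on strings (an assumption made so the example
needs no terminal) rather than the typeglobs used above; current
versions of Perl accept these with "readline" as well:

```perl
use strict;
use warnings;

my $input  = "first\nsecond\n";
my $output = '';
open my $in,  '<', \$input  or die "Can't read from string: $!";
open my $out, '>', \$output or die "Can't write to string: $!";

my @fd = ( $in, $out );

my $got = readline( $fd[0] );        # one record, as <$in> would give
print { $fd[1] } "read: $got";       # block form for the array element
my @rest = readline( $fd[0] );       # list context: all remaining records

close $_ for @fd;
```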
604
605 How can I set up a footer format to be used with write()?
606 There's no builtin way to do this, but perlform has a couple of
607 techniques to make it possible for the intrepid hacker.
608
609 How can I write() into a string?
610 (contributed by brian d foy)
611
If you want to "write" into a string, you just have to "open" a
filehandle to a string, which Perl has been able to do since Perl 5.8:
614
615 open FH, '>', \my $string;
616 write( FH );
617
Since you want to be a good programmer, you probably want to use a
lexical filehandle, even though formats are designed to work with
bareword filehandles since the default format names take the filehandle
name. However, you can control this with some Perl special
per-filehandle variables: $^, which names the top-of-page format, and
$~, which names the line format. You have to change the default
filehandle to set these variables:
625
626 open my($fh), '>', \my $string;
627
628 { # set per-filehandle variables
629 my $old_fh = select( $fh );
630 $~ = 'ANIMAL';
631 $^ = 'ANIMAL_TOP';
632 select( $old_fh );
633 }
634
635 format ANIMAL_TOP =
636 ID Type Name
637 .
638
639 format ANIMAL =
640 @## @<<< @<<<<<<<<<<<<<<
641 $id, $type, $name
642 .
643
Although write can work with lexical or package variables, whatever
variables you use have to be in scope where the format is defined. That
most likely means you'll want to localize some package variables:
647
648 {
649 local( $id, $type, $name ) = qw( 12 cat Buster );
650 write( $fh );
651 }
652
653 print $string;
654
655 There are also some tricks that you can play with "formline" and the
656 accumulator variable $^A, but you lose a lot of the value of formats
657 since "formline" won't handle paging and so on. You end up
658 reimplementing formats when you use them.
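
For comparison, a bare-bones use of "formline" might look like the
following sketch. It reuses the picture line and sample values from the
ANIMAL format above; there is no header, paging, or filehandle
involved, which is exactly the part you would have to rebuild yourself:

```perl
use strict;
use warnings;

$^A = '';     # start with an empty accumulator

# single quotes keep Perl from treating the @ fields as arrays
my $picture = '@## @<<< @<<<<<<<<<<<<<<' . "\n";
formline( $picture, 12, 'cat', 'Buster' );

my $string = $^A;   # the formatted line now lives in the accumulator
$^A = '';           # clear it so later formats start fresh

print $string;
```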
659
660 How can I open a filehandle to a string?
661 (contributed by Peter J. Holzer, hjp-usenet2@hjp.at)
662
663 Since Perl 5.8.0 a file handle referring to a string can be created by
664 calling open with a reference to that string instead of the filename.
665 This file handle can then be used to read from or write to the string:
666
my $string = '';
open(my $out, '>', \$string) or die "Could not open string for writing";
print $out "foo\n";
print $out "bar\n"; # $string now contains "foo\nbar\n"
close $out;

open(my $in, '<', \$string) or die "Could not open string for reading";
my $x = <$in>; # $x now contains "foo\n"
673
674 With older versions of Perl, the IO::String module provides similar
675 functionality.
676
677 How can I output my numbers with commas added?
678 (contributed by brian d foy and Benjamin Goldberg)
679
680 You can use Number::Format to separate places in a number. It handles
681 locale information for those of you who want to insert full stops
682 instead (or anything else that they want to use, really).
683
684 This subroutine will add commas to your number:
685
686 sub commify {
687 local $_ = shift;
688 1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
689 return $_;
690 }
691
692 This regex from Benjamin Goldberg will add commas to numbers:
693
694 s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
695
696 It is easier to see with comments:
697
698 s/(
699 ^[-+]? # beginning of number.
700 \d+? # first digits before first comma
701 (?= # followed by, (but not included in the match) :
702 (?>(?:\d{3})+) # some positive multiple of three digits.
703 (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever.
704 )
705 | # or:
706 \G\d{3} # after the last group, get three digits
707 (?=\d) # but they have to have more digits after them.
708 )/$1,/xg;
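
As a quick check of how the "commify" subroutine above behaves, this
sketch runs it over a few sample values (the sample numbers are chosen
only for illustration). Note that it leaves the digits after a decimal
point alone:

```perl
use strict;
use warnings;

sub commify {
    local $_ = shift;
    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $_;
}

my @formatted;
for my $number ( qw( 999 1234567 -1234.5678 ) ) {
    push @formatted, commify( $number );
}

print "$_\n" for @formatted;   # 999, then 1,234,567, then -1,234.5678
```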
709
710 How can I translate tildes (~) in a filename?
711 Use the <> ("glob()") operator, documented in perlfunc. Versions of
712 Perl older than 5.6 require that you have a shell installed that groks
713 tildes. Later versions of Perl have this feature built in. The
714 File::KGlob module (available from CPAN) gives more portable glob
715 functionality.
716
717 Within Perl, you may use this directly:
718
719 $filename =~ s{
720 ^ ~ # find a leading tilde
721 ( # save this in $1
722 [^/] # a non-slash character
723 * # repeated 0 or more times (0 means me)
724 )
725 }{
726 $1
727 ? (getpwnam($1))[7]
728 : ( $ENV{HOME} || $ENV{LOGDIR} )
729 }ex;
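
The glob() route mentioned at the start of this answer is shorter. A
minimal sketch, assuming a Perl recent enough to expand tildes without
a shell:

```perl
use strict;
use warnings;

# list context, so glob returns the expansion instead of iterating
my( $home ) = glob( '~' );

print "home directory: $home\n";
```

Calling glob() in list context matters here; in scalar context it
iterates over the matches and returns undef once they run out.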
730
731 How come when I open a file read-write it wipes it out?
732 Because you're using something like this, which truncates the file then
733 gives you read-write access:
734
735 open my $fh, '+>', '/path/name'; # WRONG (almost always)
736
737 Whoops. You should instead use this, which will fail if the file
738 doesn't exist:
739
740 open my $fh, '+<', '/path/name'; # open for update
741
742 Using ">" always clobbers or creates. Using "<" never does either. The
743 "+" doesn't change this.
744
745 Here are examples of many kinds of file opens. Those using "sysopen"
746 all assume that you've pulled in the constants from Fcntl:
747
748 use Fcntl;
749
750 To open file for reading:
751
752 open my $fh, '<', $path or die $!;
753 sysopen my $fh, $path, O_RDONLY or die $!;
754
755 To open file for writing, create new file if needed or else truncate
756 old file:
757
758 open my $fh, '>', $path or die $!;
759 sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!;
760 sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!;
761
762 To open file for writing, create new file, file must not exist:
763
764 sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!;
765 sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!;
766
767 To open file for appending, create if necessary:
768
open my $fh, '>>', $path or die $!;
sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!;
sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!;
772
773 To open file for appending, file must exist:
774
775 sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!;
776
777 To open file for update, file must exist:
778
779 open my $fh, '+<', $path or die $!;
780 sysopen my $fh, $path, O_RDWR or die $!;
781
782 To open file for update, create file if necessary:
783
784 sysopen my $fh, $path, O_RDWR|O_CREAT or die $!;
785 sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!;
786
787 To open file for update, file must not exist:
788
789 sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!;
790 sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!;
791
792 To open a file without blocking, creating if necessary:
793
sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT
    or die "can't open /foo/somefile: $!";
796
797 Be warned that neither creation nor deletion of files is guaranteed to
798 be an atomic operation over NFS. That is, two processes might both
799 successfully create or unlink the same file! Therefore O_EXCL isn't as
800 exclusive as you might wish.
801
802 See also perlopentut.
803
804 Why do I sometimes get an "Argument list too long" when I use <*>?
805 The "<>" operator performs a globbing operation (see above). In Perl
806 versions earlier than v5.6.0, the internal glob() operator forks csh(1)
807 to do the actual glob expansion, but csh can't handle more than 127
808 items and so gives the error message "Argument list too long". People
809 who installed tcsh as csh won't have this problem, but their users may
810 be surprised by it.
811
812 To get around this, either upgrade to Perl v5.6.0 or later, do the glob
813 yourself with readdir() and patterns, or use a module like File::Glob,
814 one that doesn't use the shell to do globbing.
815
816 How can I open a file with a leading ">" or trailing blanks?
817 (contributed by Brian McCauley)
818
819 The special two-argument form of Perl's open() function ignores
820 trailing blanks in filenames and infers the mode from certain leading
821 characters (or a trailing "|"). In older versions of Perl this was the
822 only version of open() and so it is prevalent in old code and books.
823
824 Unless you have a particular reason to use the two-argument form you
825 should use the three-argument form of open() which does not treat any
826 characters in the filename as special.
827
828 open my $fh, "<", " file "; # filename is " file "
829 open my $fh, ">", ">file"; # filename is ">file"
830
831 How can I reliably rename a file?
832 If your operating system supports a proper mv(1) utility or its
833 functional equivalent, this works:
834
835 rename($old, $new) or system("mv", $old, $new);
836
It may be more portable to use the File::Copy module instead. You just
copy the file to the new name (checking return values), then
delete the old one. This isn't really the same semantically as a
"rename()", which preserves meta-information like permissions,
timestamps, inode info, etc.
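
File::Copy also exports a "move" function that tries a plain rename
first and falls back to copying and deleting when that fails. A short
sketch, with scratch file names invented for the example:

```perl
use strict;
use warnings;
use File::Copy qw( move );
use File::Temp qw( tempdir );

# scratch files so the example can run anywhere
my $dir = tempdir( CLEANUP => 1 );
my( $old, $new ) = ( "$dir/old.txt", "$dir/new.txt" );

open my $fh, '>', $old or die "Can't create $old: $!";
print {$fh} "contents\n";
close $fh or die "Can't close $old: $!";

move( $old, $new ) or die "Can't move $old to $new: $!";
```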
842
843 How can I lock a file?
844 Perl's builtin flock() function (see perlfunc for details) will call
845 flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004
846 and later), and lockf(3) if neither of the two previous system calls
847 exists. On some systems, it may even use a different form of native
848 locking. Here are some gotchas with Perl's flock():
849
850 1. Produces a fatal error if none of the three system calls (or their
851 close equivalent) exists.
852
853 2. lockf(3) does not provide shared locking, and requires that the
854 filehandle be open for writing (or appending, or read/writing).
855
856 3. Some versions of flock() can't lock files over a network (e.g. on
857 NFS file systems), so you'd need to force the use of fcntl(2) when
858 you build Perl. But even this is dubious at best. See the flock
859 entry of perlfunc and the INSTALL file in the source distribution
860 for information on building Perl to do this.
861
862 Two potentially non-obvious but traditional flock semantics are
863 that it waits indefinitely until the lock is granted, and that its
864 locks are merely advisory. Such discretionary locks are more
865 flexible, but offer fewer guarantees. This means that files locked
866 with flock() may be modified by programs that do not also use
867 flock(). Cars that stop for red lights get on well with each other,
868 but not with cars that don't stop for red lights. See the perlport
869 manpage, your port's specific documentation, or your system-
870 specific local manpages for details. It's best to assume
871 traditional behavior if you're writing portable programs. (If
872 you're not, you should as always feel perfectly free to write for
873 your own system's idiosyncrasies (sometimes called "features").
874 Slavish adherence to portability concerns shouldn't get in the way
875 of your getting your job done.)
876
877 For more information on file locking, see also "File Locking" in
878 perlopentut if you have it (new for 5.6).
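
If waiting indefinitely is not what you want, one option is to ask for the lock with LOCK_NB so that flock() returns at once instead of blocking. The following is only a sketch: the helper name try_lock and its return convention are made up for illustration, and the lock is still merely advisory.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: take an exclusive advisory lock without waiting.
# Returns the locked filehandle, or undef if someone else holds the lock.
sub try_lock {
    my ($path) = @_;
    open my $fh, '>>', $path or die "can't open $path: $!";
    return $fh if flock($fh, LOCK_EX | LOCK_NB);
    close $fh;
    return undef;
}

# Usage: the lock is released when the handle is closed.
#   my $lock = try_lock("numfile") or die "numfile is busy\n";
```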
879
880 Why can't I just open(FH, ">file.lock")?
881 A common bit of code NOT TO USE is this:
882
883 sleep(3) while -e 'file.lock'; # PLEASE DO NOT USE
884 open my $lock, '>', 'file.lock'; # THIS BROKEN CODE
885
886 This is a classic race condition: you take two steps to do something
887 which must be done in one. That's why computer hardware provides an
888 atomic test-and-set instruction. In theory, this "ought" to work:
889
use Fcntl;    # for the O_* constants
sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT
    or die "can't open file.lock: $!";
892
893 except that lamentably, file creation (and deletion) is not atomic over
894 NFS, so this won't work (at least, not every time) over the net.
895 Various schemes involving link() have been suggested, but these tend to
896 involve busy-wait, which is also less than desirable.
897
898 I still don't get locking. I just want to increment the number in the file.
899 How can I do this?
Didn't anyone ever tell you web-page hit counters were useless? They
don't count the number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. It's better to pick a random
number; it's more realistic.
904
905 Anyway, this is what you can do if you can't help yourself.
906
907 use Fcntl qw(:DEFAULT :flock);
908 sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!";
909 flock $fh, LOCK_EX or die "can't flock numfile: $!";
910 my $num = <$fh> || 0;
911 seek $fh, 0, 0 or die "can't rewind numfile: $!";
912 truncate $fh, 0 or die "can't truncate numfile: $!";
913 (print $fh $num+1, "\n") or die "can't write numfile: $!";
914 close $fh or die "can't close numfile: $!";
915
916 Here's a much better web-page hit counter:
917
918 $hits = int( (time() - 850_000_000) / rand(1_000) );
919
920 If the count doesn't impress your friends, then the code might. :-)
921
922 All I want to do is append a small amount of text to the end of a file. Do
923 I still have to use locking?
924 If you are on a system that correctly implements "flock" and you use
925 the example appending code from "perldoc -f flock" everything will be
926 OK even if the OS you are on doesn't implement append mode correctly
927 (if such a system exists). So if you are happy to restrict yourself to
928 OSs that implement "flock" (and that's not really much of a
929 restriction) then that is what you should do.
930
931 If you know you are only going to use a system that does correctly
932 implement appending (i.e. not Win32) then you can omit the "seek" from
933 the code in the previous answer.
934
935 If you know you are only writing code to run on an OS and filesystem
936 that does implement append mode correctly (a local filesystem on a
937 modern Unix for example), and you keep the file in block-buffered mode
938 and you write less than one buffer-full of output between each manual
939 flushing of the buffer then each bufferload is almost guaranteed to be
940 written to the end of the file in one chunk without getting
941 intermingled with anyone else's output. You can also use the "syswrite"
942 function which is simply a wrapper around your system's write(2) system
943 call.
944
945 There is still a small theoretical chance that a signal will interrupt
946 the system-level "write()" operation before completion. There is also a
947 possibility that some STDIO implementations may call multiple system
948 level "write()"s even if the buffer was empty to start. There may be
949 some systems where this probability is reduced to zero, and this is not
950 a concern when using ":perlio" instead of your system's STDIO.
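
For reference, the locked append recommended at the start of this answer might look roughly like the sketch below. It follows the pattern shown in "perldoc -f flock"; the helper name append_text is an invention for this example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Hypothetical helper: append one chunk of text under an exclusive lock.
sub append_text {
    my ($path, $text) = @_;
    open my $fh, '>>', $path or die "can't open $path: $!";
    flock($fh, LOCK_EX)      or die "can't lock $path: $!";
    # Someone may have appended while we were waiting for the lock.
    seek($fh, 0, SEEK_END)   or die "can't seek in $path: $!";
    print {$fh} $text        or die "can't write to $path: $!";
    # close flushes the buffer and then releases the lock.
    close $fh                or die "can't close $path: $!";
    return;
}
```

Closing the handle, rather than unlocking explicitly, keeps the flush and the unlock in the right order.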
951
952 How do I randomly update a binary file?
953 If you're just trying to patch a binary, in many cases something as
954 simple as this works:
955
956 perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
957
958 However, if you have fixed sized records, then you might do something
959 more like this:
960
961 my $RECSIZE = 220; # size of record, in bytes
962 my $recno = 37; # which record to update
963 open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!";
964 seek $fh, $recno * $RECSIZE, 0;
my $record;
read($fh, $record, $RECSIZE) == $RECSIZE or die "can't read record $recno: $!";
966 # munge the record
967 seek $fh, -$RECSIZE, 1;
968 print $fh $record;
969 close $fh;
970
971 Locking and error checking are left as an exercise for the reader.
972 Don't forget them or you'll be quite sorry.
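
If you'd rather not do the exercise from scratch, here is one way it could go. This is a sketch under assumptions, not a canonical solution: the helper name update_record and its callback interface are hypothetical, and it uses flock(), so the caveats about advisory locks from earlier in this document apply.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Hypothetical helper: rewrite record $recno of $path in place.
# $munge gets the old record and must return a string of the same length.
sub update_record {
    my ($path, $recsize, $recno, $munge) = @_;
    my $offset = $recno * $recsize;
    open my $fh, '+<', $path or die "can't update $path: $!";
    binmode $fh;
    flock($fh, LOCK_EX) or die "can't lock $path: $!";
    seek($fh, $offset, SEEK_SET) or die "can't seek in $path: $!";
    my $got = read($fh, my $record, $recsize);
    die "can't read record $recno: $!" unless defined $got;
    die "short read on record $recno\n" unless $got == $recsize;
    my $new = $munge->($record);
    die "record size changed\n" unless length($new) == $recsize;
    # Go back to the start of the record before overwriting it.
    seek($fh, $offset, SEEK_SET) or die "can't seek in $path: $!";
    print {$fh} $new or die "can't write record $recno: $!";
    close $fh or die "can't close $path: $!";
    return;
}
```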
973
974 How do I get a file's timestamp in perl?
975 If you want to retrieve the time at which the file was last read,
976 written, or had its meta-data (owner, etc) changed, you use the -A, -M,
977 or -C file test operations as documented in perlfunc. These retrieve
978 the age of the file (measured against the start-time of your program)
979 in days as a floating point number. Some platforms may not have all of
980 these times. See perlport for details. To retrieve the "raw" time in
981 seconds since the epoch, you would call the stat function, then use
982 "localtime()", "gmtime()", or "POSIX::strftime()" to convert this into
983 human-readable form.
984
985 Here's an example:
986
987 my $write_secs = (stat($file))[9];
988 printf "file %s updated at %s\n", $file,
989 scalar localtime($write_secs);
990
991 If you prefer something more legible, use the File::stat module (part
992 of the standard distribution in version 5.004 and later):
993
994 # error checking left as an exercise for reader.
995 use File::stat;
996 use Time::localtime;
997 my $date_string = ctime(stat($file)->mtime);
998 print "file $file updated at $date_string\n";
999
1000 The POSIX::strftime() approach has the benefit of being, in theory,
1001 independent of the current locale. See perllocale for details.
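
A sketch of the POSIX::strftime() route follows. The helper name mtime_string is made up for this example, and it deliberately sticks to numeric conversions and UTC so that month or day names from the locale never enter into it.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: a file's modification time as a UTC timestamp string.
sub mtime_string {
    my ($file) = @_;
    my @st = stat($file) or die "can't stat $file: $!";
    # Element 9 of the stat list is the mtime in seconds since the epoch.
    return strftime("%Y-%m-%dT%H:%M:%SZ", gmtime($st[9]));
}
```

Swap gmtime() for localtime() (and drop the trailing Z) if you want local time instead.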
1002
1003 How do I set a file's timestamp in perl?
1004 You use the utime() function documented in "utime" in perlfunc. By way
1005 of example, here's a little program that copies the read and write
1006 times from its first argument to all the rest of them.
1007
1008 if (@ARGV < 2) {
1009 die "usage: cptimes timestamp_file other_files ...\n";
1010 }
1011 my $timestamp = shift;
1012 my($atime, $mtime) = (stat($timestamp))[8,9];
1013 utime $atime, $mtime, @ARGV;
1014
1015 Error checking is, as usual, left as an exercise for the reader.
1016
1017 The perldoc for utime also has an example that has the same effect as
1018 touch(1) on files that already exist.
1019
1020 Certain file systems have a limited ability to store the times on a
file at the expected level of precision. For example, the FAT and HPFS
filesystems are unable to create dates on files with a finer granularity
1023 than two seconds. This is a limitation of the filesystems, not of
1024 utime().
1025
1026 How do I print to more than one file at once?
1027 To connect one filehandle to several output filehandles, you can use
1028 the IO::Tee or Tie::FileHandle::Multiplex modules.
1029
1030 If you only have to do this once, you can print individually to each
1031 filehandle.
1032
1033 for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" }
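
If you find yourself writing that loop more than once, you could wrap it in a small subroutine. This is just a sketch; the name print_to_all is hypothetical, and it adds the error check the one-liner leaves out.

```perl
use strict;
use warnings;

# Hypothetical helper: print the same text to every handle in a list.
sub print_to_all {
    my ($handles, @text) = @_;
    for my $fh (@$handles) {
        print {$fh} @text or die "can't write: $!";
    }
    return;
}

# Usage:
#   print_to_all([ $fh1, $fh2, $fh3 ], "whatever\n");
```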
1034
1035 How can I read in an entire file all at once?
1036 The customary Perl approach for processing all the lines in a file is
1037 to do so one line at a time:
1038
1039 open my $input, '<', $file or die "can't open $file: $!";
1040 while (<$input>) {
1041 chomp;
1042 # do something with $_
1043 }
1044 close $input or die "can't close $file: $!";
1045
1046 This is tremendously more efficient than reading the entire file into
1047 memory as an array of lines and then processing it one element at a
1048 time, which is often--if not almost always--the wrong approach.
1049 Whenever you see someone do this:
1050
1051 my @lines = <INPUT>;
1052
1053 You should think long and hard about why you need everything loaded at
1054 once. It's just not a scalable solution.
1055
1056 If you "mmap" the file with the File::Map module from CPAN, you can
1057 virtually load the entire file into a string without actually storing
1058 it in memory:
1059
1060 use File::Map qw(map_file);
1061
1062 map_file my $string, $filename;
1063
1064 Once mapped, you can treat $string as you would any other string.
1065 Since you don't necessarily have to load the data, mmap-ing can be very
1066 fast and may not increase your memory footprint.
1067
1068 You might also find it more fun to use the standard Tie::File module,
1069 or the DB_File module's $DB_RECNO bindings, which allow you to tie an
1070 array to a file so that accessing an element of the array actually
1071 accesses the corresponding line in the file.
1072
1073 If you want to load the entire file, you can use the File::Slurp module
to do it in one simple and efficient step:
1075
1076 use File::Slurp;
1077
1078 my $all_of_it = read_file($filename); # entire file in scalar
1079 my @all_lines = read_file($filename); # one line per element
1080
1081 Or you can read the entire file contents into a scalar like this:
1082
1083 my $var;
1084 {
1085 local $/;
1086 open my $fh, '<', $file or die "can't open $file: $!";
1087 $var = <$fh>;
1088 }
1089
1090 That temporarily undefs your record separator, and will automatically
1091 close the file at block exit. If the file is already open, just use
1092 this:
1093
1094 my $var = do { local $/; <$fh> };
1095
1096 You can also use a localized @ARGV to eliminate the "open":
1097
1098 my $var = do { local( @ARGV, $/ ) = $file; <> };
1099
1100 For ordinary files you can also use the "read" function.
1101
1102 read( $fh, $var, -s $fh );
1103
1104 That third argument tests the byte size of the data on the $fh
1105 filehandle and reads that many bytes into the buffer $var.
1106
1107 How can I read in a file by paragraphs?
1108 Use the $/ variable (see perlvar for details). You can either set it to
1109 "" to eliminate empty paragraphs ("abc\n\n\n\ndef", for instance, gets
1110 treated as two paragraphs and not three), or "\n\n" to accept empty
1111 paragraphs.
1112
1113 Note that a blank line must have no blanks in it. Thus
1114 "fred\n \nstuff\n\n" is one paragraph, but "fred\n\nstuff\n\n" is two.
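
To see the difference between the two settings, you can try them on a string through an in-memory filehandle. The helper read_records below is hypothetical and exists only for this demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records using $separator as $/.
# Pass "" for paragraph mode, or "\n\n" to keep empty paragraphs.
sub read_records {
    my ($text, $separator) = @_;
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    local $/ = $separator;
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @collapsed = read_records("abc\n\n\n\ndef", "");      # two paragraphs
my @literal   = read_records("abc\n\n\n\ndef", "\n\n");  # three records
printf "%d vs %d\n", scalar @collapsed, scalar @literal;
```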
1115
1116 How can I read a single character from a file? From the keyboard?
1117 You can use the builtin "getc()" function for most filehandles, but it
1118 won't (easily) work on a terminal device. For STDIN, either use the
1119 Term::ReadKey module from CPAN or use the sample code in "getc" in
1120 perlfunc.
1121
1122 If your system supports the portable operating system programming
1123 interface (POSIX), you can use the following code, which you'll note
1124 turns off echo processing as well.
1125
1126 #!/usr/bin/perl -w
1127 use strict;
1128 $| = 1;
1129 for (1..4) {
1130 print "gimme: ";
1131 my $got = getone();
1132 print "--> $got\n";
1133 }
1134 exit;
1135
1136 BEGIN {
1137 use POSIX qw(:termios_h);
1138
my ($term, $oterm, $echo, $noecho, $fd_stdin);

$fd_stdin = fileno(STDIN);
1142
1143 $term = POSIX::Termios->new();
1144 $term->getattr($fd_stdin);
1145 $oterm = $term->getlflag();
1146
1147 $echo = ECHO | ECHOK | ICANON;
1148 $noecho = $oterm & ~$echo;
1149
1150 sub cbreak {
1151 $term->setlflag($noecho);
1152 $term->setcc(VTIME, 1);
1153 $term->setattr($fd_stdin, TCSANOW);
1154 }
1155
1156 sub cooked {
1157 $term->setlflag($oterm);
1158 $term->setcc(VTIME, 0);
1159 $term->setattr($fd_stdin, TCSANOW);
1160 }
1161
1162 sub getone {
1163 my $key = '';
1164 cbreak();
1165 sysread(STDIN, $key, 1);
1166 cooked();
1167 return $key;
1168 }
1169 }
1170
1171 END { cooked() }
1172
The Term::ReadKey module from CPAN may be easier to use. Recent
versions also include support for non-portable systems.
1175
1176 use Term::ReadKey;
1177 open my $tty, '<', '/dev/tty';
1178 print "Gimme a char: ";
1179 ReadMode "raw";
1180 my $key = ReadKey 0, $tty;
1181 ReadMode "normal";
1182 printf "\nYou said %s, char number %03d\n",
1183 $key, ord $key;
1184
1185 How can I tell whether there's a character waiting on a filehandle?
1186 The very first thing you should do is look into getting the
1187 Term::ReadKey extension from CPAN. As we mentioned earlier, it now even
1188 has limited support for non-portable (read: not open systems, closed,
1189 proprietary, not POSIX, not Unix, etc.) systems.
1190
1191 You should also check out the Frequently Asked Questions list in
1192 comp.unix.* for things like this: the answer is essentially the same.
1193 It's very system-dependent. Here's one solution that works on BSD
1194 systems:
1195
1196 sub key_ready {
1197 my($rin, $nfd);
1198 vec($rin, fileno(STDIN), 1) = 1;
1199 return $nfd = select($rin,undef,undef,0);
1200 }
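
The same zero-timeout poll can be written with the core IO::Select module, which hides the vec() bookkeeping. A sketch, with a made-up helper name; like the select() version, it looks at the underlying descriptor and knows nothing about data already sitting in Perl's own read buffer.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: true if $fh has something to read right now.
sub handle_ready {
    my ($fh) = @_;
    # A timeout of 0 means poll and return immediately.
    my @ready = IO::Select->new($fh)->can_read(0);
    return scalar @ready;
}
```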
1201
1202 If you want to find out how many characters are waiting, there's also
1203 the FIONREAD ioctl call to be looked at. The h2ph tool that comes with
1204 Perl tries to convert C include files to Perl code, which can be
1205 "require"d. FIONREAD ends up defined as a function in the sys/ioctl.ph
1206 file:
1207
1208 require 'sys/ioctl.ph';
1209
1210 $size = pack("L", 0);
1211 ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
1212 $size = unpack("L", $size);
1213
1214 If h2ph wasn't installed or doesn't work for you, you can grep the
1215 include files by hand:
1216
1217 % grep FIONREAD /usr/include/*/*
1218 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
1219
1220 Or write a small C program using the editor of champions:
1221
1222 % cat > fionread.c
#include <stdio.h>
#include <sys/ioctl.h>
int main(void) {
    printf("%#08x\n", (unsigned int)FIONREAD);
    return 0;
}
1227 ^D
1228 % cc -o fionread fionread.c
1229 % ./fionread
1230 0x4004667f
1231
1232 And then hard-code it, leaving porting as an exercise to your
1233 successor.
1234
1235 $FIONREAD = 0x4004667f; # XXX: opsys dependent
1236
1237 $size = pack("L", 0);
1238 ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
1239 $size = unpack("L", $size);
1240
1241 FIONREAD requires a filehandle connected to a stream, meaning that
1242 sockets, pipes, and tty devices work, but not files.
1243
1244 How do I do a "tail -f" in perl?
1245 First try
1246
1247 seek($gw_fh, 0, 1);
1248
1249 The statement "seek($gw_fh, 0, 1)" doesn't change the current position,
1250 but it does clear the end-of-file condition on the handle, so that the
1251 next "<$gw_fh>" makes Perl try again to read something.
1252
1253 If that doesn't work (it relies on features of your stdio
1254 implementation), then you need something more like this:
1255
1256 for (;;) {
for ($curpos = tell($gw_fh); <$gw_fh>; $curpos = tell($gw_fh)) {
1258 # search for some stuff and put it into files
1259 }
1260 # sleep for a while
1261 seek($gw_fh, $curpos, 0); # seek to where we had been
1262 }
1263
1264 If this still doesn't work, look into the "clearerr" method from
1265 IO::Handle, which resets the error and end-of-file states on the
1266 handle.
1267
1268 There's also a File::Tail module from CPAN.
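
Putting the first suggestion into a reusable form, a polling reader might look like this sketch. The helper name read_new_lines is hypothetical, and it makes no attempt to cope with a file that is truncated or replaced while you watch it.

```perl
use strict;
use warnings;

# Hypothetical helper: return whatever lines can be read right now,
# then clear end-of-file so a later call can see newly appended data.
sub read_new_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        push @lines, $line;
    }
    seek($fh, 0, 1);    # same position, but resets the EOF condition
    return @lines;
}

# Usage:
#   while (1) {
#       print for read_new_lines($gw_fh);
#       sleep 1;
#   }
```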
1269
1270 How do I dup() a filehandle in Perl?
1271 If you check "open" in perlfunc, you'll see that several of the ways to
1272 call open() should do the trick. For example:
1273
1274 open my $log, '>>', '/foo/logfile';
1275 open STDERR, '>&', $log;
1276
1277 Or even with a literal numeric descriptor:
1278
1279 my $fd = $ENV{MHCONTEXTFD};
1280 open $mhcontext, "<&=$fd"; # like fdopen(3S)
1281
1282 Note that "<&STDIN" makes a copy, but "<&=STDIN" makes an alias. That
1283 means if you close an aliased handle, all aliases become inaccessible.
1284 This is not true with a copied one.
1285
1286 Error checking, as always, has been left as an exercise for the reader.
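
One way to convince yourself of the copy-versus-alias difference is to compare file descriptor numbers. In this sketch (the helper name is invented), the copy gets a descriptor of its own while the alias shares the original's.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy and an alias of an output handle.
# Closing the alias also closes the descriptor the original relies on.
sub dup_and_alias {
    my ($fh) = @_;
    open my $copy,  '>&',  $fh         or die "can't dup: $!";
    open my $alias, '>&=', fileno($fh) or die "can't alias: $!";
    return ($copy, $alias);
}
```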
1287
1288 How do I close a file descriptor by number?
1289 If, for some reason, you have a file descriptor instead of a filehandle
1290 (perhaps you used "POSIX::open"), you can use the "close()" function
1291 from the POSIX module:
1292
1293 use POSIX ();
1294
1295 POSIX::close( $fd );
1296
1297 This should rarely be necessary, as the Perl "close()" function is to
1298 be used for things that Perl opened itself, even if it was a dup of a
1299 numeric descriptor as with "MHCONTEXT" above. But if you really have
1300 to, you may be able to do this:
1301
1302 require 'sys/syscall.ph';
1303 my $rc = syscall(&SYS_close, $fd + 0); # must force numeric
die "can't sysclose $fd: $!" if $rc == -1;
1305
1306 Or, just use the fdopen(3S) feature of "open()":
1307
1308 {
1309 open my $fh, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
1310 close $fh;
1311 }
1312
1313 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe`
1314 work?
1315 Whoops! You just put a tab and a formfeed into that filename!
1316 Remember that within double quoted strings ("like\this"), the backslash
1317 is an escape character. The full list of these is in "Quote and Quote-
1318 like Operators" in perlop. Unsurprisingly, you don't have a file called
1319 "c:(tab)emp(formfeed)oo" or "c:(tab)emp(formfeed)oo.exe" on your legacy
1320 DOS filesystem.
1321
1322 Either single-quote your strings, or (preferably) use forward slashes.
1323 Since all DOS and Windows versions since something like MS-DOS 2.0 or
1324 so have treated "/" and "\" the same in a path, you might as well use
1325 the one that doesn't clash with Perl--or the POSIX shell, ANSI C and
1326 C++, awk, Tcl, Java, or Python, just to mention a few. POSIX paths are
1327 more portable, too.
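
If you want to see what Perl actually made of your string, a throwaway helper like the one below (its name is made up) can make the control characters visible.

```perl
use strict;
use warnings;

# Hypothetical helper: spell out tabs and formfeeds in a string.
sub show_path {
    my ($path) = @_;
    $path =~ s/\t/(tab)/g;
    $path =~ s/\f/(formfeed)/g;
    return $path;
}

print show_path("C:\temp\foo"), "\n";   # C:(tab)emp(formfeed)oo
print show_path('C:\temp\foo'), "\n";   # C:\temp\foo
print show_path("C:/temp/foo"), "\n";   # C:/temp/foo
```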
1328
1329 Why doesn't glob("*.*") get all the files?
1330 Because even on non-Unix ports, Perl's glob function follows standard
1331 Unix globbing semantics. You'll need "glob("*")" to get all (non-
1332 hidden) files. This makes glob() portable even to legacy systems. Your
1333 port may include proprietary globbing functions as well. Check its
1334 documentation for details.
1335
1336 Why does Perl let me delete read-only files? Why does "-i" clobber
1337 protected files? Isn't this a bug in Perl?
1338 This is elaborately and painstakingly described in the file-dir-perms
1339 article in the "Far More Than You Ever Wanted To Know" collection in
1340 <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> .
1341
1342 The executive summary: learn how your filesystem works. The permissions
1343 on a file say what can happen to the data in that file. The
1344 permissions on a directory say what can happen to the list of files in
1345 that directory. If you delete a file, you're removing its name from the
1346 directory (so the operation depends on the permissions of the
1347 directory, not of the file). If you try to write to the file, the
1348 permissions of the file govern whether you're allowed to.
1349
1350 How do I select a random line from a file?
1351 Short of loading the file into a database or pre-indexing the lines in
1352 the file, there are a couple of things that you can do.
1353
1354 Here's a reservoir-sampling algorithm from the Camel Book:
1355
1356 srand;
1357 rand($.) < 1 && ($line = $_) while <>;
1358
1359 This has a significant advantage in space over reading the whole file
1360 in. You can find a proof of this method in The Art of Computer
1361 Programming, Volume 2, Section 3.4.2, by Donald E. Knuth.
1362
1363 You can use the File::Random module which provides a function for that
1364 algorithm:
1365
1366 use File::Random qw/random_line/;
1367 my $line = random_line($filename);
1368
1369 Another way is to use the Tie::File module, which treats the entire
1370 file as an array. Simply access a random array element.
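
Spelled out with names, the one-pass selection shown earlier works like the sketch below. The helper pick_random_line is hypothetical; it keeps only one line in memory at a time, and Perl seeds rand() for you if you haven't called srand.

```perl
use strict;
use warnings;

# Hypothetical helper: pick one line uniformly at random in a single pass.
sub pick_random_line {
    my ($fh) = @_;
    my ($chosen, $seen) = (undef, 0);
    while (defined(my $line = <$fh>)) {
        $seen++;
        # Replace the current choice with probability 1/$seen.
        $chosen = $line if rand($seen) < 1;
    }
    return $chosen;    # undef if the file had no lines
}
```

The first line is always taken (rand(1) is always less than 1), and each later line displaces it with a steadily shrinking probability.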
1371
1372 Why do I get weird spaces when I print an array of lines?
1373 (contributed by brian d foy)
1374
1375 If you are seeing spaces between the elements of your array when you
1376 print the array, you are probably interpolating the array in double
1377 quotes:
1378
1379 my @animals = qw(camel llama alpaca vicuna);
1380 print "animals are: @animals\n";
1381
1382 It's the double quotes, not the "print", doing this. Whenever you
1383 interpolate an array in a double quote context, Perl joins the elements
1384 with spaces (or whatever is in $", which is a space by default):
1385
1386 animals are: camel llama alpaca vicuna
1387
1388 This is different than printing the array without the interpolation:
1389
1390 my @animals = qw(camel llama alpaca vicuna);
1391 print "animals are: ", @animals, "\n";
1392
1393 Now the output doesn't have the spaces between the elements because the
1394 elements of @animals simply become part of the list to "print":
1395
1396 animals are: camelllamaalpacavicuna
1397
You might notice this when each of the elements of @array ends with a
newline. You expect to print one element per line, but notice that
every line after the first is indented:

this is a line
 this is another line
 this is the third line
1405
1406 That extra space comes from the interpolation of the array. If you
1407 don't want to put anything between your array elements, don't use the
1408 array in double quotes. You can send it to print without them:
1409
1410 print @lines;
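
If you do want to interpolate the array, you can instead change the list separator for the duration of a block. A sketch, using a hypothetical helper to keep the local() contained:

```perl
use strict;
use warnings;

# Hypothetical helper: interpolate a list using the given list separator.
sub interpolate_with {
    my ($separator, @items) = @_;
    local $" = $separator;    # restored automatically when the sub returns
    return "@items";
}

print interpolate_with('', "this is a line\n", "this is another line\n");
```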
1411
1412 How do I traverse a directory tree?
1413 (contributed by brian d foy)
1414
The File::Find module, which comes with Perl, does all of the hard work
to traverse a directory structure. You simply call the "find"
subroutine with a callback subroutine and the directories you want to
traverse:
1419
1420 use File::Find;
1421
1422 find( \&wanted, @directories );
1423
1424 sub wanted {
1425 # full path in $File::Find::name
1426 # just filename in $_
# ... do whatever you want to do ...
1428 }
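
As a concrete instance of that skeleton, here is a sketch that collects plain files by suffix. The helper name find_by_suffix is hypothetical; the rest is the stock File::Find interface.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: plain files under @dirs whose names end in $suffix.
sub find_by_suffix {
    my ($suffix, @dirs) = @_;
    my @found;
    find(
        sub {
            # $_ is the bare filename; find() has chdir'ed to its directory.
            return unless -f $_;
            return unless /\Q$suffix\E\z/;
            push @found, $File::Find::name;
        },
        @dirs,
    );
    my @sorted = sort @found;
    return @sorted;
}
```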
1429
The File::Find::Closures module, which you can download from CPAN, provides
1431 many ready-to-use subroutines that you can use with File::Find.
1432
The File::Finder module, which you can download from CPAN, can help you create
1434 the callback subroutine using something closer to the syntax of the
1435 "find" command-line utility:
1436
1437 use File::Find;
1438 use File::Finder;
1439
1440 my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}');
1441
1442 find( $deep_dirs->as_options, @places );
1443
1444 The File::Find::Rule module, which you can download from CPAN, has a
1445 similar interface, but does the traversal for you too:
1446
1447 use File::Find::Rule;
1448
1449 my @files = File::Find::Rule->file()
1450 ->name( '*.pm' )
1451 ->in( @INC );
1452
1453 How do I delete a directory tree?
1454 (contributed by brian d foy)
1455
If you have an empty directory (so, no files or subdirectories), you
can use Perl's built-in "rmdir". If the directory is not empty, you
either have to empty it yourself (a lot of work) or use a module to
help you.
1460
1461 The File::Path module, which comes with Perl, has a "remove_tree" which
1462 can take care of all of the hard work for you:
1463
1464 use File::Path qw(remove_tree);
1465
1466 remove_tree( @directories );
1467
1468 The File::Path module also has a legacy interface to the older "rmtree"
1469 subroutine.
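
By default "remove_tree" complains with warnings when something can't be removed. If you'd rather collect the problems yourself, File::Path lets you pass an "error" option, as in this sketch (the wrapper name delete_tree is made up):

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# Hypothetical wrapper: delete a tree, report what couldn't be removed.
# Returns 1 if everything went, 0 otherwise.
sub delete_tree {
    my ($dir) = @_;
    remove_tree($dir, { error => \my $errors });
    for my $diag (@$errors) {
        # Each entry is a one-pair hash: file name => error message.
        my ($file, $message) = %$diag;
        warn $file eq ''
            ? "general error: $message\n"
            : "problem removing $file: $message\n";
    }
    return @$errors ? 0 : 1;
}
```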
1470
1471 How do I copy an entire directory?
1472 (contributed by Shlomi Fish)
1473
1474 To do the equivalent of "cp -R" (i.e. copy an entire directory tree
1475 recursively) in portable Perl, you'll either need to write something
1476 yourself or find a good CPAN module such as File::Copy::Recursive.
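
If you do decide to write something yourself, the core modules get you most of the way. The following is a deliberately limited sketch, not a replacement for a CPAN module: the name copy_tree is hypothetical, and it copies only plain files and directories, ignoring symlinks, special files, ownership, and permissions.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Find;
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: copy plain files and directories from $from to $to.
sub copy_tree {
    my ($from, $to) = @_;
    find(
        {
            no_chdir => 1,    # stay put so relative paths remain valid
            wanted   => sub {
                my $source = $File::Find::name;
                my $rel    = File::Spec->abs2rel($source, $from);
                my $dest   = $rel eq '.'
                    ? $to
                    : File::Spec->catfile($to, $rel);
                if (-d $source) {
                    # find() visits a directory before its contents.
                    make_path($dest);
                }
                elsif (-f $source) {
                    copy($source, $dest)
                        or die "can't copy $source to $dest: $!";
                }
            },
        },
        $from,
    );
    return;
}
```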
1477
1479 Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other
1480 authors as noted. All rights reserved.
1481
1482 This documentation is free; you can redistribute it and/or modify it
1483 under the same terms as Perl itself.
1484
1485 Irrespective of its distribution, all code examples here are in the
1486 public domain. You are permitted and encouraged to use this code and
1487 any derivatives thereof in your own programs for fun or for profit as
1488 you see fit. A simple comment in the code giving credit to the FAQ
1489 would be courteous but is not required.
1490
1491
1492
1493perl v5.16.3 2013-03-04 PERLFAQ5(1)