PERLOPENTUT(1)         Perl Programmers Reference Guide         PERLOPENTUT(1)

NAME
perlopentut - simple recipes for opening files and pipes in Perl

DESCRIPTION
Whenever you do I/O on a file in Perl, you do so through what in Perl
is called a filehandle. A filehandle is an internal name for an
external file. It is the job of the "open" function to make the
association between the internal name and the external name, and it is
the job of the "close" function to break that association.

For your convenience, Perl sets up a few special filehandles that are
already open when you run. These include "STDIN", "STDOUT", "STDERR",
and "ARGV". Since those are pre-opened, you can use them right away
without having to go to the trouble of opening them yourself:

    print STDERR "This is a debugging message.\n";

    print STDOUT "Please enter something: ";
    $response = <STDIN> // die "how come no input?";
    print STDOUT "Thank you!\n";

    while (<ARGV>) { ... }

As you see from those examples, "STDOUT" and "STDERR" are output
handles, and "STDIN" and "ARGV" are input handles. They are in all
capital letters because they are reserved to Perl, much like the @ARGV
array and the %ENV hash are. Their external associations were set up
by your shell.

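As a minimal sketch of how "ARGV" behaves, the following fakes a command
line by assigning to @ARGV and then reads through that handle. The
scratch file and its two lines are made up for illustration; in a real
program the names would come from whoever ran your script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input: a scratch file standing in for a file named
# on the command line.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\n" or die "can't write to $filename: $!";
close($tmp_fh) || die "can't close $filename: $!";

# Pretend the program was run as:  yourprogram $filename
@ARGV = ($filename);

my $count = 0;
while (my $line = <ARGV>) {
    $count++;
    # $ARGV holds the name of the file currently being read,
    # and $. holds the current line number.
    print STDOUT "$ARGV:$.: $line";
}
print STDERR "read $count lines\n";
```

Reading from "ARGV" walks through each file named in @ARGV in turn, which
is why it counts as an input handle even though you never opened it.
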
You will need to open every other filehandle on your own. Although
there are many variants, the most common way to call Perl's open()
function is with three arguments and one return value:

    OK = open(HANDLE, MODE, PATHNAME)

Where:

OK
    will be some defined value if the open succeeds, but "undef" if it
    fails;

HANDLE
    should be an undefined scalar variable to be filled in by the
    "open" function if it succeeds;

MODE
    is the access mode and the encoding format to open the file with;

PATHNAME
    is the external name of the file you want opened.

Most of the complexity of the "open" function lies in the many possible
values that the MODE parameter can take on.

One last thing before we show you how to open files: opening files does
not (usually) automatically lock them in Perl. See perlfaq5 for how to
lock.

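For a taste of what locking looks like, here is a minimal sketch using
Perl's built-in "flock" with the constants exported by the Fcntl module.
The scratch file is hypothetical, and the lock is advisory: it only
restrains other programs that also ask for a lock. See perlfaq5 for the
full story.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);            # LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
use File::Temp qw(tempfile);

# Hypothetical scratch file so that the sketch is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "$0: can't close $filename: $!";

open(my $handle, ">> :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for appending: $!";

# Block until we hold an exclusive (writer's) lock on the file.
flock($handle, LOCK_EX)
    || die "$0: can't lock $filename: $!";

print {$handle} "one locked line\n"
    or die "$0: can't write to $filename: $!";

# Closing the handle flushes buffered output and releases the lock.
close($handle) || die "$0: can't close $filename: $!";
```
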
Opening Text Files for Reading
If you want to read from a text file, first open it in read-only mode
like this:

    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";
    my $handle   = undef;     # this will be filled in on success

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

As with the shell, in Perl the "<" is used to open the file in
read-only mode. If it succeeds, Perl allocates a brand new filehandle
for you and fills in your previously undefined $handle argument with a
reference to that handle.

Now you may use functions like "readline", "read", "getc", and
"sysread" on that handle. Probably the most common input function is
the one that looks like an operator:

    $line = readline($handle);
    $line = <$handle>;          # same thing

Because the "readline" function returns "undef" at end of file or upon
error, you will sometimes see it used this way:

    $line = <$handle>;
    if (defined $line) {
        # do something with $line
    }
    else {
        # $line is not valid, so skip it
    }

You can also just quickly "die" on an undefined value this way:

    $line = <$handle> // die "no input found";

However, if hitting EOF is an expected and normal event, you do not
want to exit simply because you have run out of input. Instead, you
probably just want to exit an input loop. If you clear $! beforehand,
you can then test to see if an actual error has caused the loop to
terminate, and act accordingly:

    $! = 0;     # clear any stale error left by earlier calls
    while (<$handle>) {
        # do something with data in $_
    }
    if ($!) {
        die "unexpected error while reading from $filename: $!";
    }

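Putting the pieces together, a complete read might look like the sketch
below. The scratch file and its three lines are invented so that the
example can run on its own; the loop uses a named variable rather than
$_, which gets the same implicit "defined" test from Perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data, written to a scratch file.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
binmode($tmp_fh, ":encoding(UTF-8)") || die "cannot binmode: $!";
print {$tmp_fh} "alpha\nbeta\ngamma\n" or die "can't write to $filename: $!";
close($tmp_fh) || die "can't close $filename: $!";

open(my $handle, "< :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for reading: $!";

my @lines;
while (my $line = <$handle>) {    # stops when readline returns undef
    chomp $line;                  # remove the trailing newline
    push @lines, $line;
}
close($handle) || die "$0: can't close $filename: $!";

print scalar(@lines), " lines read; the last one was '$lines[-1]'\n";
```
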
A Note on Encodings: Having to specify the text encoding every time
might seem a bit of a bother. To set up a default encoding for "open"
so that you don't have to supply it each time, you can use the "open"
pragma:

    use open qw< :encoding(UTF-8) >;

Once you've done that, you can safely omit the encoding part of the
open mode:

    open($handle, "<", $filename)
        || die "$0: can't open $filename for reading: $!";

But never use the bare "<" without having set up a default encoding
first. Otherwise, Perl cannot know which of the many, many, many
possible flavors of text file you have, and Perl will have no idea how
to correctly map the data in your file into actual characters it can
work with. Other common encoding formats include "ASCII",
"ISO-8859-1", "ISO-8859-15", "Windows-1252", "MacRoman", and even
"UTF-16LE". See perlunitut for more about encodings.

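To see why the encoding matters, consider this small sketch. It writes a
four-character word containing one accented letter as UTF-8, then reads
it back twice: once decoded, once as raw bytes. The scratch file and the
variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "can't close $filename: $!";

# "\x{E9}" is LATIN SMALL LETTER E WITH ACUTE: one character,
# but two bytes (0xC3 0xA9) once encoded as UTF-8.
open(my $out_fh, "> :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for writing: $!";
print {$out_fh} "caf\x{E9}" or die "$0: can't write to $filename: $!";
close($out_fh) || die "$0: can't close $filename: $!";

# Read it back as decoded text ...
open(my $text_fh, "< :encoding(UTF-8)", $filename)
    || die "$0: can't open $filename for reading: $!";
my $chars = <$text_fh>;
close($text_fh) || die "$0: can't close $filename: $!";

# ... and again as undecoded bytes.
open(my $raw_fh, "< :raw", $filename)
    || die "$0: can't open $filename for reading: $!";
my $bytes = <$raw_fh>;
close($raw_fh) || die "$0: can't close $filename: $!";

printf "decoded: %d characters; raw: %d bytes\n",
    length($chars), length($bytes);
```

With the encoding layer in place, Perl hands you four characters; without
it, you get the five bytes that happen to be stored on disk.
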
Opening Text Files for Writing
When you want to write to a file, you first have to decide what to do
about any existing contents of that file. You have two basic choices
here: to preserve or to clobber.

If you want to preserve any existing contents, then you want to open
the file in append mode. As in the shell, in Perl you use ">>" to open
an existing file in append mode. ">>" creates the file if it does not
already exist.

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

Now you can write to that filehandle using any of "print", "printf",
"say", "write", or "syswrite".

As noted above, if the file does not already exist, then the
append-mode open will create it for you. But if the file does already
exist, its contents are safe from harm because you will be adding your
new text past the end of the old text.

On the other hand, sometimes you want to clobber whatever might already
be there. To empty out a file before you start writing to it, you can
open it in write-only mode:

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!";

Here again Perl works just like the shell in that the ">" clobbers an
existing file.

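The difference between preserving and clobbering is easy to check for
yourself. In this sketch, "write_line" and "count_lines" are
hypothetical helpers, and the scratch file is created only for the
demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "can't close $filename: $!";
my $encoding = ":encoding(UTF-8)";

# Hypothetical helper: open in the given mode, write one line, close.
sub write_line {
    my ($mode, $text) = @_;
    open(my $fh, "$mode $encoding", $filename)
        || die "$0: can't open $filename with mode $mode: $!";
    print {$fh} "$text\n" or die "$0: can't write to $filename: $!";
    close($fh) || die "$0: can't close $filename: $!";
}

# Hypothetical helper: return how many lines the file now holds.
sub count_lines {
    open(my $fh, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";
    my @lines = <$fh>;
    close($fh) || die "$0: can't close $filename: $!";
    return scalar @lines;
}

write_line(">",  "first");            # file holds one line
write_line(">>", "second");           # appended after the first
my $after_append = count_lines();

write_line(">",  "third");            # old contents are discarded
my $after_clobber = count_lines();

print "after append: $after_append; after clobber: $after_clobber\n";
```
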
As with the append mode, when you open a file in write-only mode, you
can now write to that filehandle using any of "print", "printf", "say",
"write", or "syswrite".

What about read-write mode? You should probably pretend it doesn't
exist, because opening text files in read-write mode is unlikely to do
what you would like. See perlfaq5 for details.

Opening Binary Files
If the file to be opened contains binary data instead of text
characters, then the "MODE" argument to "open" is a little different.
Instead of specifying the encoding, you tell Perl that your data are in
raw bytes.

    my $filename = "/some/path/to/a/binary/file/goes/here";
    my $encoding = ":raw :bytes";
    my $handle   = undef;     # this will be filled in on success

And then open as before, choosing "<", ">>", or ">" as needed:

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!";

Alternatively, you can change to binary mode on an existing handle this
way:

    binmode($handle) || die "cannot binmode handle";

This is especially handy for the handles that Perl has already opened
for you.

    binmode(STDIN)  || die "cannot binmode STDIN";
    binmode(STDOUT) || die "cannot binmode STDOUT";

You can also pass "binmode" an explicit encoding to change it on the
fly. This isn't exactly "binary" mode, but we still use "binmode" to
do it:

    binmode(STDIN,  ":encoding(MacRoman)") || die "cannot binmode STDIN";
    binmode(STDOUT, ":encoding(UTF-8)")    || die "cannot binmode STDOUT";

Once you have your binary file properly opened in the right mode, you
can use all the same Perl I/O functions as you used on text files.
However, you may wish to use the fixed-size "read" instead of the
variable-sized "readline" for your input.

Here's an example of how to copy a binary file:

    my $BUFSIZ   = 64 * (2 ** 10);
    my $name_in  = "/some/input/file";
    my $name_out = "/some/output/file";

    my($in_fh, $out_fh, $buffer);

    open($in_fh, "<", $name_in)
        || die "$0: cannot open $name_in for reading: $!";
    open($out_fh, ">", $name_out)
        || die "$0: cannot open $name_out for writing: $!";

    for my $fh ($in_fh, $out_fh) {
        binmode($fh) || die "binmode failed";
    }

    while (read($in_fh, $buffer, $BUFSIZ)) {
        unless (print $out_fh $buffer) {
            die "couldn't write to $name_out: $!";
        }
    }

    close($in_fh)  || die "couldn't close $name_in: $!";
    close($out_fh) || die "couldn't close $name_out: $!";

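The fixed-size "read" is a natural fit for files made of fixed-length
records. The sketch below assumes an invented layout of two 32-bit
big-endian integers per record, writes three such records to a scratch
file, and then reads them back one record at a time.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $RECORD_SIZE = 8;    # assumed layout: two 32-bit integers per record

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh) || die "can't close $filename: $!";

# Write three hypothetical records as raw bytes.
open(my $out_fh, "> :raw", $filename)
    || die "$0: cannot open $filename for writing: $!";
for my $pair ([1, 100], [2, 200], [3, 300]) {
    print {$out_fh} pack("N N", @$pair)
        or die "couldn't write to $filename: $!";
}
close($out_fh) || die "couldn't close $filename: $!";

# Read them back, one fixed-size record per call to read().
open(my $in_fh, "< :raw", $filename)
    || die "$0: cannot open $filename for reading: $!";
my @records;
while (1) {
    my $got = read($in_fh, my $buffer, $RECORD_SIZE);
    die "read error on $filename: $!" unless defined $got;
    last if $got == 0;                              # end of file
    die "short record ($got bytes)" if $got < $RECORD_SIZE;
    push @records, [ unpack("N N", $buffer) ];
}
close($in_fh) || die "couldn't close $filename: $!";

print "record $_->[0] holds $_->[1]\n" for @records;
```

Checking the return value of "read" three ways (undefined, zero, or
short) distinguishes an I/O error from a clean end of file and from a
truncated record.
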
Opening Pipes
Perl also lets you open a filehandle into an external program or shell
command rather than into a file. You can do this in order to pass data
from your Perl program to an external command for further processing,
or to receive data from another program for your own Perl program to
process.

Filehandles into commands are also known as pipes, since they work on
similar inter-process communication principles as Unix pipelines. Such
a filehandle has an active program instead of a static file on its
external end, but in every other sense it works just like a more
typical file-based filehandle, with all the techniques discussed
earlier in this article just as applicable.

As such, you open a pipe using the same "open" call that you use for
opening files, setting the second ("MODE") argument to special
characters that indicate either an input or an output pipe. Use "-|"
for a filehandle that will let your Perl program read data from an
external program, and "|-" for a filehandle that will send data to that
program instead.

Opening a pipe for reading
Let's say you'd like your Perl program to process data stored in a
nearby directory called "unsorted", which contains a number of
text files. You'd also like your program to sort all the contents from
these files into a single, alphabetically sorted list of unique lines
before it starts processing them.

You could do this through opening an ordinary filehandle into each of
those files, gradually building up an in-memory array of all the file
contents you load this way, and finally sorting and filtering that
array when you've run out of files to load. Or, you could offload all
that merging and sorting into your operating system's own "sort"
command by opening a pipe directly into its output, and get to work
that much faster.

Here's how that might look:

    open(my $sort_fh, '-|', 'sort -u unsorted/*.txt')
        or die "Couldn't open a pipe into sort: $!";

    # And right away, we can start reading sorted lines:
    while (my $line = <$sort_fh>) {
        #
        # ... Do something interesting with each $line here ...
        #
    }

The second argument to "open", "-|", makes it a read-pipe into a
separate program, rather than an ordinary filehandle into a file.

Note that the third argument to "open" is a string containing the
program name ("sort") plus all its arguments: in this case, "-u" to
specify unique sort, and then a fileglob specifying the files to sort.
The resulting filehandle $sort_fh works just like a read-only ("<")
filehandle, and your program can subsequently read data from it as if
it were opened onto an ordinary, single file.

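One difference from ordinary files shows up at "close" time: closing a
pipe waits for the command to finish, and returns false if the command
exited with a non-zero status. The sketch below illustrates this. As a
stand-in for a real external command it runs the current perl
interpreter (found in $^X), and it passes the command as a list, a form
of "open" described later in this document; both choices are
assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

# Stand-in external command: the running perl prints three lines,
# then deliberately exits with status 2.
my @command = ($^X, '-e', 'print "line $_\n" for 1 .. 3; exit 2');

open(my $pipe_fh, '-|', @command)
    or die "Couldn't open a pipe from $command[0]: $!";

my @output = <$pipe_fh>;

# close() waits for the command and reports how it fared.
my $closed_ok = close($pipe_fh);
my $exit_code = $? >> 8;

unless ($closed_ok) {
    # $! is 0 when the only problem was a non-zero exit status.
    warn $! ? "Error closing pipe: $!\n"
            : "Command exited with status $exit_code\n";
}
```
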
Opening a pipe for writing
Continuing the previous example, let's say that your program has
completed its processing, and the results sit in an array called
@processed. You want to print these lines to a file called
"numbered.txt" with a neatly formatted column of line-numbers.

Certainly you could write your own code to do this, or, once again,
you could kick that work over to another program. In this case, "cat",
running with its own "-n" option to activate line numbering, should do
the trick:

    open(my $cat_fh, '|-', 'cat -n > numbered.txt')
        or die "Couldn't open a pipe into cat: $!";

    for my $line (@processed) {
        print $cat_fh $line;
    }

Here, we use a second "open" argument of "|-", signifying that the
filehandle assigned to $cat_fh should be a write-pipe. We can then use
it just as we would a write-only ordinary filehandle, including the
basic function of "print"-ing data to it.

Note that the third argument, specifying the command that we wish to
pipe to, sets up "cat" to redirect its output via that ">" symbol into
the file "numbered.txt". This can start to look a little tricky,
because that same symbol would have meant something entirely different
had it shown up in the second argument to "open"! But here in the
third argument, it's simply part of the shell command that Perl will
open the pipe into, and Perl itself doesn't invest any special meaning
in it.

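With a write-pipe, the command may still be working after your last
"print", so it is worth closing the handle explicitly and checking the
result before you rely on the command's output. Here is a sketch of that
pattern. Instead of "cat -n" it pipes into a tiny perl one-liner (run
via $^X) that numbers its input and writes to a scratch file; the
one-liner, the file, and the sample data are all invented for the
demonstration, and a Unix-like system is assumed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my @processed = ("apple\n", "banana\n", "cherry\n");   # sample data

my ($tmp_fh, $outfile) = tempfile(UNLINK => 1);
close($tmp_fh) || die "can't close $outfile: $!";

# Stand-in for "cat -n > numbered.txt": number each input line and
# write it to the file named by the first argument.
my $numberer = 'my $out = shift; open(my $fh, ">", $out) or die $!; '
             . 'while (<STDIN>) { print {$fh} "$.\t$_" } '
             . 'close($fh) or die $!;';

open(my $num_fh, '|-', $^X, '-e', $numberer, $outfile)
    or die "Couldn't open a pipe into the numbering command: $!";

for my $line (@processed) {
    print {$num_fh} $line or die "Couldn't write to pipe: $!";
}

# close() flushes our output and waits for the command to finish.
close($num_fh)
    or die $! ? "Error closing pipe: $!"
              : "Command exited with status " . ($? >> 8);

# Only now is it safe to read what the command produced.
open(my $check_fh, '<', $outfile)
    or die "Couldn't open $outfile for reading: $!";
my @numbered = <$check_fh>;
close($check_fh) or die "Couldn't close $outfile: $!";

print @numbered;
```
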
Expressing the command as a list
For opening pipes, Perl offers the option to call "open" with a list
comprising the desired command and all its own arguments as separate
elements, rather than combining them into a single string as in the
examples above. For instance, we could have phrased the "open" call in
the first example like this:

    open(my $sort_fh, '-|', 'sort', '-u', glob('unsorted/*.txt'))
        or die "Couldn't open a pipe into sort: $!";

When you call "open" this way, Perl invokes the given command directly,
bypassing the shell. As such, the shell won't try to interpret any
special characters within the command's argument list, which might
otherwise have unwanted effects. This can make for safer, less
error-prone "open" calls, useful in cases such as passing in variables
as arguments, or even just referring to filenames with spaces in them.

However, when you do want to pass a meaningful metacharacter to the
shell, such as with the "*" inside that final "unsorted/*.txt" argument
here, you can't use this alternate syntax. In this case, we have worked
around it via Perl's handy "glob" built-in function, which evaluates
its argument into a list of filenames, and we can safely pass that
resulting list right into "open", as shown above.

Note also that representing piped-command arguments in list form like
this doesn't work on every platform. It will work on any Unix-based OS
that provides a real "fork" function (e.g. macOS or Linux), as well as
on Windows when running Perl 5.22 or later.

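To illustrate the safety of the list form, the sketch below hands a
deliberately awkward argument to a command. The command is again the
running perl (via $^X), here just reporting what it received, and the
"filename" is made up; on a Unix-like system the argument should arrive
as a single, untouched string, because no shell ever sees it.

```perl
use strict;
use warnings;

# A hypothetical filename containing spaces and a shell metacharacter.
my $tricky_name = 'my notes; echo oops.txt';

# The command reports how many arguments it got, and the first one.
my @command = ($^X, '-e', 'print scalar(@ARGV), ":", $ARGV[0]',
               $tricky_name);

open(my $report_fh, '-|', @command)
    or die "Couldn't open a pipe from $command[0]: $!";
my $report = <$report_fh>;
close($report_fh)
    or die "Command failed with status " . ($? >> 8);

print "$report\n";
```

Had the same text been interpolated into a single command string, the
shell would have split it at the spaces and treated the ";" as the end
of one command and the start of another.
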
SEE ALSO
The full documentation for "open" provides a thorough reference to this
function, beyond the best-practice basics covered here.

AUTHOR and COPYRIGHT
Copyright 2013 Tom Christiansen; now maintained by Perl5 Porters

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.34.1                      2022-03-15                    PERLOPENTUT(1)