File::Slurp(3)        User Contributed Perl Documentation        File::Slurp(3)
2
3
4
NAME
6 File::Slurp - Simple and Efficient Reading/Writing/Modifying of
7 Complete Files
8
SYNOPSIS
10 use File::Slurp;
11
12 # read in a whole file into a scalar
13 my $text = read_file( 'filename' ) ;
14
15 # read in a whole file into an array of lines
16 my @lines = read_file( 'filename' ) ;
17
18 # write out a whole file from a scalar
19 write_file( 'filename', $text ) ;
20
21 # write out a whole file from an array of lines
22 write_file( 'filename', @lines ) ;
23
# Here is a simple and fast way to load and save a simple config file
# made of key=value lines.
my %conf = read_file( $file_name ) =~ /^(\w+)=(.*)$/mg ;
write_file( $file_name, {atomic => 1}, map "$_=$conf{$_}\n", keys %conf ) ;
29
30 # insert text at the beginning of a file
31 prepend_file( 'filename', $text ) ;
32
33 # in-place edit to replace all 'foo' with 'bar' in file
34 edit_file { s/foo/bar/g } 'filename' ;
35
36 # in-place edit to delete all lines with 'foo' from file
37 edit_file_lines sub { $_ = '' if /foo/ }, 'filename' ;
38
39 # read in a whole directory of file names (skipping . and ..)
40 my @files = read_dir( '/path/to/dir' ) ;
41
DESCRIPTION
This module provides subs that allow you to read or write entire files
with one simple call. They are designed to be simple to use, to offer
flexible ways to pass in or get the file contents, and to be very
efficient. There is also a sub to read in all the file names in a
directory other than "." and "..".
48
49 These slurp/spew subs work for files, pipes and sockets, stdio, pseudo-
50 files, and the DATA handle. Read more about why slurping files is a
51 good thing in the file 'slurp_article.pod' in the extras/ directory.
52
53 If you are interested in how fast these calls work, check out the
54 slurp_bench.pl program in the extras/ directory. It compares many
55 different forms of slurping. You can select the I/O direction, context
56 and file sizes. Use the --help option to see how to run it.
57
58 read_file
59 This sub reads in an entire file and returns its contents to the
60 caller. In scalar context it returns the entire file as a single
61 scalar. In list context it will return a list of lines (using the
62 current value of $/ as the separator including support for paragraph
63 mode when it is set to '').
64
my $text = read_file( 'filename' ) ;
my $bin = read_file( 'filename', { binmode => ':raw' } ) ;
my @lines = read_file( 'filename' ) ;
my $lines = read_file( 'filename', array_ref => 1 ) ;
69
The first argument is the file to slurp in. If the next argument is a
hash reference, then it is used as the options. Otherwise the rest of
the argument list is used as key/value options.
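
As a minimal sketch of the calling conventions described above, the following script reads the same file in scalar and list context and passes options in both styles. The scratch directory and the file name demo.txt are made up for this illustration.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

# Scratch file with three lines (hypothetical name, removed at exit).
my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/demo.txt" ;
write_file( $file, "one\n", "two\n", "three\n" ) ;

# Scalar context: the whole file comes back as one string.
my $text = read_file( $file ) ;

# List context: one element per line, line endings kept.
my @lines = read_file( $file ) ;

# Options may be a hash reference or a flat key/value list.
my $raw_a = read_file( $file, { binmode => ':raw' } ) ;
my $raw_b = read_file( $file, binmode => ':raw' ) ;

print length( $text ), " bytes in ", scalar( @lines ), " lines\n" ;
```

Both option styles reach the same code path, so which one to use is a matter of taste.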
73
If the file argument is a handle (if it is a ref and is an IO or GLOB
object), then that handle is slurped in. This mode is supported so you
can slurp handles such as "DATA" and "STDIN". See the test handle.t for
an example that does "open( '-|' )" where the child process spews data
to the parent, which slurps it in. All of the options that control how
the data is returned to the caller still work in this case.
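
A small sketch of slurping from an already open handle follows. It assumes a lexical filehandle opened by the caller (a GLOB reference); the file name handle.txt is hypothetical.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/handle.txt" ;
write_file( $file, "alpha\n", "beta\n" ) ;

# Open the file ourselves and hand the handle to read_file.
open( my $fh, '<', $file ) or die "open $file: $!" ;

# Options that shape the return value, such as chomp, still apply.
my @lines = read_file( $fh, chomp => 1 ) ;
close( $fh ) ;

print join( ',', @lines ), "\n" ;
```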
80
81 If the first argument is an overloaded object then its stringified
82 value is used for the filename and that file is opened. This is a new
83 feature in 9999.14. See the stringify.t test for an example.
84
By default "read_file" returns an undef in scalar context or a single
undef in list context if it encounters an error. Those are both
impossible to get with a clean read_file call which means you can check
the return value and always know if you had an error. You can change
how errors are handled with the "err_mode" option.
90
Speed Note: If you call read_file and just get a scalar return value it
is now optimized to handle shorter files. This is only used if no
options are used, the file is shorter than 100k bytes, the filename is
a plain scalar and the file is returned as a scalar. If you want the
fastest slurping, use the "buf_ref" or "scalar_ref" options (see below).
96
97 NOTE: as of version 9999.06, read_file works correctly on the "DATA"
98 handle. It used to need a sysseek workaround but that is now handled
99 when needed by the module itself.
100
101 You can optionally request that "slurp()" is exported to your code.
102 This is an alias for read_file and is meant to be forward compatible
103 with Perl 6 (which will have slurp() built-in).
104
105 The options for "read_file" are:
106
107 binmode
108
109 If you set the binmode option, then its value is passed to a call to
110 binmode on the opened handle. You can use this to set the file to be
111 read in binary mode, utf8, etc. See perldoc -f binmode for more.
112
113 my $bin_data = read_file( $bin_file, binmode => ':raw' ) ;
114 my $utf_text = read_file( $bin_file, binmode => ':utf8' ) ;
115
116 array_ref
117
118 If this boolean option is set, the return value (only in scalar
119 context) will be an array reference which contains the lines of the
120 slurped file. The following two calls are equivalent:
121
122 my $lines_ref = read_file( $bin_file, array_ref => 1 ) ;
123 my $lines_ref = [ read_file( $bin_file ) ] ;
124
125 chomp
126
127 If this boolean option is set, the lines are chomped. This only happens
128 if you are slurping in a list context or using the "array_ref" option.
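
The sketch below combines "array_ref" and "chomp", and shows that "chomp" has no effect on a plain scalar read. The file name colors.txt is an assumption made for the example.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/colors.txt" ;
write_file( $file, "red\n", "green\n" ) ;

# Array reference of lines, with the line endings removed.
my $lines_ref = read_file( $file, array_ref => 1, chomp => 1 ) ;

# List context with chomp gives the same lines as a plain list.
my @lines = read_file( $file, chomp => 1 ) ;

# A plain scalar read is returned whole; chomp is not applied.
my $text = read_file( $file, chomp => 1 ) ;

print scalar( @{$lines_ref} ), " lines, first is '$lines_ref->[0]'\n" ;
```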
129
130 scalar_ref
131
If this boolean option is set, the return value (only in scalar
context) will be a scalar reference to a string which is the contents
of the slurped file. This will usually be faster than returning the
plain scalar. It will also save memory as it will not make a copy of
the file to return. Run the extras/slurp_bench.pl script to see speed
comparisons.
138
139 my $text_ref = read_file( $bin_file, scalar_ref => 1 ) ;
140
141 buf_ref
142
143 You can use this option to pass in a scalar reference and the slurped
144 file contents will be stored in the scalar. This can be used in
145 conjunction with any of the other options. This saves an extra copy of
146 the slurped file and can lower ram usage vs returning the file. It is
147 usually the fastest way to read a file into a scalar. Run the
148 extras/slurp_bench.pl script to see speed comparisons.
149
150 read_file( $bin_file, buf_ref => \$buffer ) ;
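
To make the two reference-based options concrete, here is a short sketch that reads one file with "scalar_ref" and again with "buf_ref". The file name data.txt and the variable names are assumptions for illustration; no timing claims are made here.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/data.txt" ;
write_file( $file, "payload\n" ) ;

# scalar_ref: read_file hands back a reference to the contents.
my $text_ref = read_file( $file, scalar_ref => 1 ) ;

# buf_ref: the caller supplies the scalar that receives the contents.
my $buffer ;
read_file( $file, buf_ref => \$buffer ) ;

print "same contents\n" if ${$text_ref} eq $buffer ;
```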
151
152 blk_size
153
154 You can use this option to set the block size used when slurping from
155 an already open handle (like \*STDIN). It defaults to 1MB.
156
my $lines_ref = read_file( $bin_file, blk_size => 10_000_000, array_ref => 1 ) ;
159
160 err_mode
161
You can use this option to control how read_file behaves when an error
occurs. This option defaults to 'croak'. You can set it to 'carp' or to
'quiet' to have no special error handling. This code wants to carp and
then read another file if it fails.

my $text_ref = read_file( $file, err_mode => 'carp', scalar_ref => 1 ) ;
unless ( $text_ref ) {

# read a different file but croak if not found
$text_ref = read_file( $another_file, scalar_ref => 1 ) ;
}

# process ${$text_ref}
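
The following sketch contrasts the 'quiet' mode with the default 'croak' mode on a file that does not exist. The path is deliberately bogus and the flag variables are hypothetical names used only to report what happened.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir     = tempdir( CLEANUP => 1 ) ;
my $missing = "$dir/no_such_file.txt" ;    # deliberately absent

# 'quiet': no warning and no exception; failure shows up as undef.
my $text = read_file( $missing, err_mode => 'quiet' ) ;
my $quiet_failed = defined( $text ) ? 0 : 1 ;

# Default 'croak': the error is an exception, so trap it with eval.
my $croaked = 0 ;
eval { my $unused = read_file( $missing ) ; 1 } or $croaked = 1 ;

print "quiet failed: $quiet_failed, croaked: $croaked\n" ;
```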
175
176 write_file
177 This sub writes out an entire file in one call.
178
179 write_file( 'filename', @data ) ;
180
181 The first argument to "write_file" is the filename. The next argument
182 is an optional hash reference and it contains key/values that can
183 modify the behavior of "write_file". The rest of the argument list is
184 the data to be written to the file.
185
186 write_file( 'filename', {append => 1 }, @data ) ;
187 write_file( 'filename', {binmode => ':raw'}, $buffer ) ;
188
As a shortcut, if the first data argument is a scalar or array
reference, it is used as the only data to be written to the file. Any
following arguments in @_ are ignored. This is a faster way to pass in
the output to be written to the file and is equivalent to the "buf_ref"
option of "read_file". The calls in each of the following pairs are
equivalent, but the pass by reference call will be faster in most cases
(especially with larger files).
196
197 write_file( 'filename', \$buffer ) ;
198 write_file( 'filename', $buffer ) ;
199
200 write_file( 'filename', \@lines ) ;
201 write_file( 'filename', @lines ) ;
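
As a quick check of the equivalence described above, this sketch writes the same data four ways and reads each file back. The file names a.txt through d.txt are invented for the example.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir    = tempdir( CLEANUP => 1 ) ;
my $buffer = "line 1\nline 2\n" ;
my @lines  = ( "line 1\n", "line 2\n" ) ;

# Plain scalar and scalar reference.
write_file( "$dir/a.txt", $buffer ) ;
write_file( "$dir/b.txt", \$buffer ) ;

# Plain list and array reference.
write_file( "$dir/c.txt", @lines ) ;
write_file( "$dir/d.txt", \@lines ) ;

# Force scalar context so each file comes back as one string.
my @contents = map { scalar read_file( "$dir/$_.txt" ) } qw( a b c d ) ;
my $all_same = ( grep { $_ eq $buffer } @contents ) == 4 ? 1 : 0 ;

print "all four files match: $all_same\n" ;
```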
202
If the first argument is a handle (if it is a ref and is an IO or GLOB
object), then that handle is written to. This mode is supported so you
can spew to handles such as \*STDOUT. See the test handle.t for an
example that does "open( '-|' )" where the child process spews data to
the parent, which slurps it in. All of the options that control how the
data are passed into "write_file" still work in this case.
209
If the first argument is an overloaded object then its stringified
value is used for the filename and that file is opened. This is a new
feature in 9999.14. See the stringify.t test for an example.
213
214 By default "write_file" returns 1 upon successfully writing the file or
215 undef if it encountered an error. You can change how errors are handled
216 with the "err_mode" option.
217
218 The options are:
219
220 binmode
221
If you set the binmode option, then its value is passed to a call to
binmode on the opened handle. You can use this to set the file to be
written in binary mode, utf8, etc. See perldoc -f binmode for more.
225
226 write_file( $bin_file, {binmode => ':raw'}, @data ) ;
227 write_file( $bin_file, {binmode => ':utf8'}, $utf_text ) ;
228
229 perms
230
231 The perms option sets the permissions of newly-created files. This
232 value is modified by your process's umask and defaults to 0666 (same as
233 sysopen).
234
NOTE: this option is new as of File::Slurp version 9999.14.
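
A sketch of the "perms" option on a Unix-like system follows. It assumes a File::Slurp version that supports "perms", sets the umask explicitly so the result does not depend on the caller's environment, and uses a hypothetical file name; permission bits behave differently on Windows.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/secret.txt" ;

# Pin the umask for this sketch, then restore it afterwards.
my $old_umask = umask( 0077 ) ;
write_file( $file, { perms => 0600 }, "secret\n" ) ;
umask( $old_umask ) ;

# Keep only the permission bits of the file mode.
my $mode = ( stat( $file ) )[2] & 07777 ;

printf "%s has mode %04o\n", $file, $mode ;
```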
236
237 buf_ref
238
239 You can use this option to pass in a scalar reference which has the
240 data to be written. If this is set then any data arguments (including
241 the scalar reference shortcut) in @_ will be ignored. These are
242 equivalent:
243
244 write_file( $bin_file, { buf_ref => \$buffer } ) ;
245 write_file( $bin_file, \$buffer ) ;
246 write_file( $bin_file, $buffer ) ;
247
248 atomic
249
250 If you set this boolean option, the file will be written to in an
251 atomic fashion. A temporary file name is created by appending the pid
252 ($$) to the file name argument and that file is spewed to. After the
253 file is closed it is renamed to the original file name (and rename is
254 an atomic operation on most OS's). If the program using this were to
255 crash in the middle of this, then the file with the pid suffix could be
256 left behind.
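
The sketch below replaces an existing file with the "atomic" option and then lists the directory to confirm that, after a successful write, no temporary file remains. The file name conf.txt is an assumption.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/conf.txt" ;
write_file( $file, "old\n" ) ;

# Spew to a temporary name first, then rename over the original.
write_file( $file, { atomic => 1 }, "new\n" ) ;

my $text    = read_file( $file ) ;
my @entries = read_dir( $dir ) ;

print "entries after atomic write: @entries\n" ;
```

Readers of the file see either the old or the new contents, never a half-written file, on systems where rename is atomic.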
257
258 append
259
260 If you set this boolean option, the data will be written at the end of
261 the current file. Internally this sets the sysopen mode flag O_APPEND.
262
263 write_file( $file, {append => 1}, @data ) ;
264
You can import append_file and it does the same thing.
267
268 no_clobber
269
270 If you set this boolean option, an existing file will not be
271 overwritten.
272
273 write_file( $file, {no_clobber => 1}, @data ) ;
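
Here is a small sketch of "no_clobber" together with appending. It uses err_mode 'quiet' so the refused overwrite is reported through the return value instead of an exception; the file name log.txt is hypothetical.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/log.txt" ;
write_file( $file, "first\n" ) ;

# The file exists, so this write is refused and returns a false value.
my $ok = write_file( $file,
	{ no_clobber => 1, err_mode => 'quiet' }, "second\n" ) ;

# Appending adds to the end and leaves the existing data alone.
append_file( $file, "more\n" ) ;

my $text = read_file( $file ) ;
print $text ;
```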
274
275 err_mode
276
You can use this option to control how "write_file" behaves when an
error occurs. This option defaults to 'croak'. You can set it to 'carp'
or to 'quiet' to have no error handling other than the return value. If
the first call to "write_file" fails it will carp and then write to
another file. If the second call to "write_file" fails, it will croak.

unless ( write_file( $file, { err_mode => 'carp' }, \$data ) ) {

# write a different file but croak on failure
write_file( $other_file, \$data ) ;
}
288
289 overwrite_file
290 This sub is just a typeglob alias to write_file since write_file always
291 overwrites an existing file. This sub is supported for backwards
292 compatibility with the original version of this module. See write_file
293 for its API and behavior.
294
295 append_file
296 This sub will write its data to the end of the file. It is a wrapper
297 around write_file and it has the same API so see that for the full
298 documentation. These calls are equivalent:
299
300 append_file( $file, @data ) ;
301 write_file( $file, {append => 1}, @data ) ;
302
303 prepend_file
304 This sub writes data to the beginning of a file. The previously
305 existing data is written after that so the effect is prepending data in
306 front of a file. It is a counterpart to the append_file sub in this
307 module. It works by first using "read_file" to slurp in the file and
308 then calling "write_file" with the new data and the existing file data.
309
310 The first argument to "prepend_file" is the filename. The next argument
311 is an optional hash reference and it contains key/values that can
312 modify the behavior of "prepend_file". The rest of the argument list is
313 the data to be written to the file and that is passed to "write_file"
314 as is (see that for allowed data).
315
Only the "binmode" and "err_mode" options are supported. The
"write_file" call has the "atomic" option set so you will always have a
consistent file. See above for more about those options.

"prepend_file" is not exported by default, so you need to import it
explicitly.

use File::Slurp qw( prepend_file ) ;
prepend_file( $file, $header ) ;
prepend_file( $file, \@lines ) ;
prepend_file( $file, { binmode => ':raw' }, $bin_data ) ;
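
A minimal sketch of "prepend_file" with a plain scalar follows. The file name report.txt is made up, and the import list names every sub the script uses because "prepend_file" is not in the default export list.

```perl
use strict ;
use warnings ;
use File::Slurp qw( read_file write_file prepend_file ) ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/report.txt" ;
write_file( $file, "body\n" ) ;

# The new data lands in front of what was already in the file.
prepend_file( $file, "header\n" ) ;

my $text = read_file( $file ) ;
print $text ;
```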
327
328 edit_file, edit_file_lines
329 These subs read in a file into $_, execute a code block which should
330 modify $_ and then write $_ back to the file. The difference between
331 them is that "edit_file" reads the whole file into $_ and calls the
332 code block one time. With "edit_file_lines" each line is read into $_
333 and the code is called for each line. In both cases the code should
334 modify $_ if desired and it will be written back out. These subs are
335 the equivalent of the -pi command line options of Perl but you can call
336 them from inside your program and not fork out a process. They are in
337 @EXPORT_OK so you need to request them to be imported on the use line
338 or you can import both of them with:
339
340 use File::Slurp qw( :edit ) ;
341
The first argument to "edit_file" and "edit_file_lines" is a code block
or a code reference. The code block is not followed by a comma (as with
grep and map) but a code reference is followed by a comma. See the
examples below for both styles. The next argument is the filename. The
last argument is an optional hash reference and it contains key/values
that can modify the behavior of "edit_file" and "edit_file_lines".

Only the "binmode" and "err_mode" options are supported. The
"write_file" call has the "atomic" option set so you will always have a
consistent file. See above for more about those options.
352
Each group of calls below shows a Perl command line instance and the
equivalent calls to "edit_file" and "edit_file_lines".
355
356 perl -0777 -pi -e 's/foo/bar/g' filename
357 use File::Slurp qw( edit_file ) ;
358 edit_file { s/foo/bar/g } 'filename' ;
359 edit_file sub { s/foo/bar/g }, 'filename' ;
360 edit_file \&replace_foo, 'filename' ;
361 sub replace_foo { s/foo/bar/g }
362
perl -pi -e '$_ = "" if /foo/' filename
use File::Slurp qw( edit_file_lines ) ;
edit_file_lines { $_ = '' if /foo/ } 'filename' ;
edit_file_lines sub { $_ = '' if /foo/ }, 'filename' ;
edit_file_lines \&delete_foo, 'filename' ;
sub delete_foo { $_ = '' if /foo/ }
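
The two editing subs can be tried end to end with the sketch below, which builds a scratch file, rewrites it as a whole, then filters it line by line. The file name and its contents are invented for this example.

```perl
use strict ;
use warnings ;
use File::Slurp qw( read_file write_file edit_file edit_file_lines ) ;
use File::Temp qw( tempdir ) ;

my $dir  = tempdir( CLEANUP => 1 ) ;
my $file = "$dir/edit.txt" ;
write_file( $file, "foo one\n", "keep me\n", "foo two\n" ) ;

# Whole-file edit: $_ holds the entire contents, block runs once.
edit_file { s/foo/bar/g } $file ;
my $after_edit = read_file( $file ) ;

# Per-line edit: the block runs for each line; '' deletes the line.
edit_file_lines { $_ = '' if /keep/ } $file ;
my $after_lines = read_file( $file ) ;

print $after_lines ;
```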
370
371 read_dir
This sub reads all the file names from a directory and returns them to
the caller, but "." and ".." are removed by default.
374
375 my @files = read_dir( '/path/to/dir' ) ;
376
The first argument is the path to the directory to read. If the next
argument is a hash reference, then it is used as the options.
Otherwise the rest of the argument list is used as key/value
options.
381
382 In list context "read_dir" returns a list of the entries in the
383 directory. In a scalar context it returns an array reference which has
384 the entries.
385
386 err_mode
387
388 If the "err_mode" option is set, it selects how errors are handled (see
389 "err_mode" in "read_file" or "write_file").
390
391 keep_dot_dot
392
393 If this boolean option is set, "." and ".." are not removed from the
394 list of files.
395
396 my @all_files = read_dir( '/path/to/dir', keep_dot_dot => 1 ) ;
397
398 prefix
399
If this boolean option is set, the string "$dir/" is prefixed to each
dir entry. This means you can directly use the results to open files. A
common newbie mistake is not putting the directory in front of entries
when opening them.
404
405 my @paths = read_dir( '/path/to/dir', prefix => 1 ) ;
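
The sketch below exercises "read_dir" with and without its options on a scratch directory holding two files. The file names are hypothetical, and the results are sorted explicitly because directory order is not guaranteed.

```perl
use strict ;
use warnings ;
use File::Slurp ;
use File::Temp qw( tempdir ) ;

my $dir = tempdir( CLEANUP => 1 ) ;
write_file( "$dir/a.txt", "a\n" ) ;
write_file( "$dir/b.txt", "b\n" ) ;

# Bare entry names, without "." and "..".
my @names = sort { $a cmp $b } read_dir( $dir ) ;

# Entries with "$dir/" in front, ready to pass to open or read_file.
my @paths = sort { $a cmp $b } read_dir( $dir, prefix => 1 ) ;

# Keep "." and ".." in the list.
my @all = read_dir( $dir, keep_dot_dot => 1 ) ;

# Scalar context returns an array reference instead of a list.
my $names_ref = read_dir( $dir ) ;

print "names: @names\n" ;
```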
406
407 EXPORT
408 These are exported by default or with
409 use File::Slurp qw( :std ) ;
410
411 read_file write_file overwrite_file append_file read_dir
412
413 These are exported with
414 use File::Slurp qw( :edit ) ;
415
416 edit_file edit_file_lines
417
418 You can get all subs in the module exported with
419 use File::Slurp qw( :all ) ;
420
421 LICENSE
422 Same as Perl.
423
424 SEE ALSO
425 An article on file slurping in extras/slurp_article.pod. There is also
426 a benchmarking script in extras/slurp_bench.pl.
427
428 BUGS
429 If run under Perl 5.004, slurping from the DATA handle will fail as
430 that requires B.pm which didn't get into core until 5.005.
431
AUTHOR
433 Uri Guttman, <uri AT stemsystems DOT com>
434
435
436
perl v5.12.3                      2011-05-30                    File::Slurp(3)