File::Find(3pm)        Perl Programmers Reference Guide        File::Find(3pm)

NAME

       File::Find - Traverse a directory tree.

SYNOPSIS

           use File::Find;
           find(\&wanted, @directories_to_search);
           sub wanted { ... }

           use File::Find;
           finddepth(\&wanted, @directories_to_search);
           sub wanted { ... }

           use File::Find;
           find({ wanted => \&process, follow => 1 }, '.');

DESCRIPTION

       These functions search directory trees and do work on each file found,
       much like the Unix find command.  File::Find exports two functions,
       "find" and "finddepth".  They work similarly but have subtle
       differences.

       find
             find(\&wanted,  @directories);
             find(\%options, @directories);

           find() does a depth-first search over the given @directories in the
           order they are given.  For each file or directory found, it calls
           the &wanted subroutine.  (See below for details on how to use the
           &wanted function.)  Additionally, for each directory found, it will
           chdir() into that directory and continue the search, invoking the
           &wanted function on each file or subdirectory in the directory.

       finddepth
             finddepth(\&wanted,  @directories);
             finddepth(\%options, @directories);

           finddepth() works just like find() except that it invokes the
           &wanted function for a directory after invoking it for the
           directory's contents.  It does a postorder traversal instead of a
           preorder traversal, working from the bottom of the directory tree
           up, whereas find() works from the top of the tree down.

       Despite the name of the finddepth() function, both find() and
       finddepth() perform a depth-first search of the directory hierarchy.

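       The difference in visiting order can be seen with a small sketch.  It
       assumes a throwaway tree built with File::Temp and File::Path (neither
       is part of File::Find); the names $root, @pre and @post are
       illustrative only.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical scratch tree: $root/a/b (directories only).
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b");

my (@pre, @post);

# find() reports a directory before its contents (preorder).
find(sub { push @pre, $File::Find::name }, $root);

# finddepth() reports a directory after its contents (postorder).
finddepth(sub { push @post, $File::Find::name }, $root);

print "find:      $_\n" for @pre;
print "finddepth: $_\n" for @post;
```

       With this tree, find() should list $root first and $root/a/b last,
       and finddepth() should list them in the opposite order.
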
   %options
       The first argument to find() is either a code reference to your &wanted
       function, or a hash reference describing the operations to be performed
       for each file.  The code reference is described in "The wanted
       function" below.

       Here are the possible keys for the hash:

       "wanted"
           The value should be a code reference.  This code reference is
           described in "The wanted function" below. The &wanted subroutine is
           mandatory.

       "bydepth"
           Reports the name of a directory only AFTER all its entries have
           been reported.  Entry point finddepth() is a shortcut for
           specifying "{ bydepth => 1 }" in the first argument of find().

       "preprocess"
           The value should be a code reference. This code reference is used
           to preprocess the current directory. The name of the currently
           processed directory is in $File::Find::dir. Your preprocessing
           function is called after readdir(), but before the loop that calls
           the wanted() function. It is called with a list of strings (the
           file and directory names) and is expected to return a list of
           strings. The code can be used to sort the names alphabetically or
           numerically, or to filter out directory entries based on their name
           alone. When follow or follow_fast is in effect, "preprocess" is a
           no-op.

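           A minimal sketch of a sorting "preprocess" hook follows.  The
           scratch directory and the file names are assumptions made for
           the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch directory holding three empty files.
my $root = tempdir(CLEANUP => 1);
for my $file (qw(c.txt a.txt b.txt)) {
    open my $fh, '>', "$root/$file" or die "cannot create $file: $!";
    close $fh or die "cannot close $file: $!";
}

my @seen;
find({
    wanted     => sub { push @seen, $_ if -f $_ },
    # Receives the entry names read from $File::Find::dir and returns
    # the names (here, sorted) that should be visited.
    preprocess => sub { return sort @_ },
}, $root);

print "$_\n" for @seen;
```

           Returning a filtered list (for example with grep) instead of a
           sorted one would skip the omitted entries entirely.
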
       "postprocess"
           The value should be a code reference. It is invoked just before
           leaving the currently processed directory. It is called in void
           context with no arguments. The name of the current directory is in
           $File::Find::dir. This hook is handy for summarizing a directory,
           such as calculating its disk usage. When follow or follow_fast is
           in effect, "postprocess" is a no-op.

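           As an illustration of the summarizing use case, the sketch below
           totals the sizes of the plain files directly inside each
           directory.  The tree layout, %bytes and @left are assumptions
           made for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical tree: $root/x (3 bytes) and $root/sub/y (5 bytes).
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!";
my %content = ("$root/x" => 'abc', "$root/sub/y" => 'hello');
for my $path (sort keys %content) {
    open my $fh, '>', $path or die "cannot create $path: $!";
    print {$fh} $content{$path};
    close $fh or die "cannot close $path: $!";
}

my (%bytes, @left);
find({
    # Add each plain file's size to the total of its own directory.
    wanted      => sub { $bytes{$File::Find::dir} += -s _ if -f $_ },
    # Runs once per directory, just before find() leaves it.
    postprocess => sub { push @left, $File::Find::dir },
}, $root);

printf "%6d  %s\n", $bytes{$_} // 0, $_ for @left;
```

           The totals here are per directory and do not include
           subdirectories; a real disk-usage report would have to
           accumulate them upward.
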
       "follow"
           Causes symbolic links to be followed. Since directory trees with
           symbolic links (followed) may contain files more than once and may
           even have cycles, a hash has to be built up with an entry for each
           file.  This might be expensive both in space and time for a large
           directory tree. See "follow_fast" and "follow_skip" below.  If
           either follow or follow_fast is in effect:

           •   It is guaranteed that an lstat has been called before the
               user's wanted() function is called. This enables fast file
               checks involving "_".  Note that this guarantee no longer holds
               if neither follow nor follow_fast is set.

           •   There is a variable $File::Find::fullname which holds the
               absolute pathname of the file with all symbolic links resolved.
               If the link is a dangling symbolic link, then fullname will be
               set to "undef".

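           The following sketch shows $File::Find::fullname for a followed
           symbolic link.  It assumes a platform where symlink() is
           available; the two scratch directories and the %resolved hash are
           hypothetical.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical layout: a symlink in $root pointing at a file that
# lives in a separate directory, $target.
my $root   = tempdir(CLEANUP => 1);
my $target = tempdir(CLEANUP => 1);
open my $fh, '>', "$target/data.txt" or die "cannot create file: $!";
close $fh or die "cannot close file: $!";
symlink "$target/data.txt", "$root/link.txt"
    or die "symlink failed (is it supported here?): $!";

my %resolved;
find({
    follow => 1,
    wanted => sub {
        # -l $_ performs a fresh lstat on the entry itself.
        $resolved{$_} = $File::Find::fullname if -l $_;
    },
}, $root);

print "$_ -> $resolved{$_}\n" for sort keys %resolved;
```

           Here link.txt should be reported with a fullname that points at
           data.txt in the other directory.
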
       "follow_fast"
           This is similar to follow except that it may report some files more
           than once.  It does detect cycles, however.  Since only symbolic
           links have to be hashed, this is much cheaper both in space and
           time.  If having the user's wanted() function process a file more
           than once is worse than the extra time, the option follow should be
           used instead.

       "follow_skip"
           "follow_skip==1", which is the default, causes all files which are
           neither directories nor symbolic links to be ignored if they are
           about to be processed a second time. If a directory or a symbolic
           link is about to be processed a second time, File::Find dies.

           "follow_skip==0" causes File::Find to die if any file is about to
           be processed a second time.

           "follow_skip==2" causes File::Find to ignore any duplicate files
           and directories but to proceed normally otherwise.

       "dangling_symlinks"
           Specifies what to do with symbolic links whose target doesn't
           exist.  If the value is a code reference, it will be called with
           the symbolic link name and the directory it lives in as arguments.
           Otherwise, if true and warnings are on, a warning of the form
           "symbolic_link_name is a dangling symbolic link\n" will be issued.
           If false, the dangling symbolic link will be silently ignored.

       "no_chdir"
           Does not chdir() to each directory as it recurses. The wanted()
           function will need to be aware of this, of course. In this case, $_
           will be the same as $File::Find::name.

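           A short sketch of the effect of "no_chdir": the working directory
           stays put and $_ carries the full name.  The scratch tree and the
           counters are assumptions made for the example; Cwd is used only
           to observe the working directory.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical scratch tree: $root/sub/file.txt
my $root = tempdir(CLEANUP => 1);
make_path("$root/sub");
open my $fh, '>', "$root/sub/file.txt" or die "cannot create file: $!";
close $fh or die "cannot close file: $!";

my $start = getcwd();
my ($visited, $differs, $moved) = (0, 0, 0);
find({
    no_chdir => 1,
    wanted   => sub {
        $visited++;
        $differs++ if $_ ne $File::Find::name;   # $_ is the full name
        $moved++   if getcwd() ne $start;        # cwd never changes
    },
}, $root);

print "visited=$visited differs=$differs moved=$moved\n";
```

           A wanted() function written for the default mode, which opens or
           tests files by their bare name, would need to use the full path
           in $_ here.
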
       "untaint"
           If find is used in taint-mode (-T command line switch or if EUID !=
           UID or if EGID != GID), then internally directory names have to be
           untainted before they can be "chdir"'d to. Therefore they are
           checked against a regular expression untaint_pattern.  Note that
           all names passed to the user's wanted() function are still tainted.
           If this option is used while not in taint-mode, "untaint" is a
           no-op.

       "untaint_pattern"
           See above. This should be set using the "qr" quoting operator.  The
           default is set to "qr|^([-+@\w./]+)$|".  Note that the parentheses
           are vital.

       "untaint_skip"
           If set, a directory which fails the untaint_pattern is skipped,
           including all its sub-directories. The default is to "die" in such
           a case.

   The wanted function
       The wanted() function does whatever verifications you want on each file
       and directory.  Note that despite its name, the wanted() function is a
       generic callback function, and does not tell File::Find if a file is
       "wanted" or not.  In fact, its return value is ignored.

       The wanted function takes no arguments but rather does its work through
       a collection of variables.

       $File::Find::dir is the current directory name,
       $_ is the current filename within that directory,
       $File::Find::name is the complete pathname to the file.

       The above variables have all been localized and may be changed without
       affecting data outside of the wanted function.

       For example, when examining the file /some/path/foo.ext you will have:

           $File::Find::dir  = /some/path
           $_                = foo.ext
           $File::Find::name = /some/path/foo.ext

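       The three variables can be inspected with a sketch like the one below.
       It builds a hypothetical tree under a temporary directory, so the
       leading part of each path will differ from /some; %at is an
       illustrative name.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical scratch tree: $root/path/foo.ext
my $root = tempdir(CLEANUP => 1);
make_path("$root/path");
open my $fh, '>', "$root/path/foo.ext" or die "cannot create file: $!";
close $fh or die "cannot close file: $!";

my %at;
find(sub {
    return unless $_ eq 'foo.ext';
    # Copy the values; they are only meaningful inside the callback.
    %at = (
        dir  => $File::Find::dir,
        file => $_,
        name => $File::Find::name,
    );
}, $root);

print "$_ = $at{$_}\n" for qw(dir file name);
```

       Copying the values inside the callback matters because File::Find
       changes them for every entry it visits.
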
       You are chdir()'d to $File::Find::dir when the function is called,
       unless "no_chdir" was specified. Note that when changing to directories
       is in effect, the root directory (/) is a somewhat special case
       inasmuch as the concatenation of $File::Find::dir, '/' and $_ is not
       literally equal to $File::Find::name. The table below summarizes all
       variants:

                     $File::Find::name  $File::Find::dir  $_
        default      /                  /                 .
        no_chdir=>0  /etc               /                 etc
                     /etc/x             /etc              x

        no_chdir=>1  /                  /                 /
                     /etc               /                 /etc
                     /etc/x             /etc              /etc/x

       When "follow" or "follow_fast" is in effect, there is also a
       $File::Find::fullname.  The function may set $File::Find::prune to
       prune the tree unless "bydepth" was specified.  Unless "follow" or
       "follow_fast" is specified, for compatibility reasons (find.pl,
       find2perl) the following globals are also available:
       $File::Find::topdir, $File::Find::topdev, $File::Find::topino,
       $File::Find::topmode and $File::Find::topnlink.

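       Pruning can be sketched as follows.  The directory names "keep" and
       "skip" and the @files array are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical scratch tree: $root/keep/a.txt and $root/skip/b.txt
my $root = tempdir(CLEANUP => 1);
make_path("$root/keep", "$root/skip");
for my $path ("$root/keep/a.txt", "$root/skip/b.txt") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh or die "cannot close $path: $!";
}

my @files;
find(sub {
    if (-d $_ && $_ eq 'skip') {
        $File::Find::prune = 1;   # do not descend into this directory
        return;
    }
    push @files, $File::Find::name if -f $_;
}, $root);

print "$_\n" for @files;
```

       Because the flag is consulted when a directory is reported before its
       contents, the same callback would not prune anything under
       finddepth() or "bydepth".
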
       This library is useful for the "find2perl" tool (distributed as part of
       the App-find2perl CPAN distribution), which, when fed

         find2perl / -name .nfs\* -mtime +7 \
           -exec rm -f {} \; -o -fstype nfs -prune

       produces something like:

        sub wanted {
           /^\.nfs.*\z/s &&
           (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
           int(-M _) > 7 &&
           unlink($_)
           ||
           ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
           $dev < 0 &&
           ($File::Find::prune = 1);
        }

       Notice the "_" in the above "int(-M _)": the "_" is a magical
       filehandle that caches the information from the preceding stat(),
       lstat(), or filetest.

       Here's another interesting wanted function.  It will find all symbolic
       links that don't resolve:

           sub wanted {
                -l && !-e && print "bogus link: $File::Find::name\n";
           }

       Note that you may mix directories and (non-directory) files in the list
       of paths to be searched.

           find(\&wanted, "./foo", "./bar", "./baz/epsilon");

       In the example above, no file in ./baz/ other than ./baz/epsilon will
       be evaluated by wanted().

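       The same behavior can be checked with a self-contained sketch.  The
       scratch tree mirrors the names used above, and the file "other" is a
       hypothetical sibling added to show that it is never visited.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical tree: $root/foo/one, $root/baz/epsilon, $root/baz/other
my $root = tempdir(CLEANUP => 1);
make_path("$root/foo", "$root/baz");
for my $path ("$root/foo/one", "$root/baz/epsilon", "$root/baz/other") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh or die "cannot close $path: $!";
}

my @seen;
# One directory and one plain file are given as starting points.
find(sub { push @seen, $File::Find::name }, "$root/foo", "$root/baz/epsilon");

print "$_\n" for @seen;
```

       Only the foo directory, its contents, and the epsilon file should
       appear in the output.
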
       See also the script "pfind" on CPAN for a nice application of this
       module.

WARNINGS

       If you run your program with the "-w" switch, or if you use the
       "warnings" pragma, File::Find will report warnings for several unusual
       situations. You can disable these warnings by putting the statement

           no warnings 'File::Find';

       in the appropriate scope. See warnings for more information about
       lexical warnings.

BUGS AND CAVEATS

       $dont_use_nlink
           You can set the variable $File::Find::dont_use_nlink to 0 if you
           are sure the filesystem you are scanning reflects the number of
           subdirectories in the parent directory's "nlink" count.

           If you do set $File::Find::dont_use_nlink to 0, you may notice an
           improvement in speed at the risk of not recursing into
           subdirectories if a filesystem doesn't populate "nlink" as
           expected.

           $File::Find::dont_use_nlink now defaults to 1 on all platforms.

       symlinks
           Be aware that the option to follow symbolic links can be dangerous.
           Depending on the structure of the directory tree (including
           symbolic links to directories) you might traverse a given
           (physical) directory more than once (only if "follow_fast" is in
           effect).  Furthermore, deleting or changing files in a symbolically
           linked directory might cause very unpleasant surprises, since you
           delete or change files in an unknown directory.

HISTORY

       File::Find used to produce incorrect results if called recursively.
       During the development of perl 5.8 this bug was fixed.  The first fixed
       version of File::Find was 1.01.

SEE ALSO

       find(1), find2perl.



perl v5.38.2                      2023-11-30                   File::Find(3pm)