File::Find(3pm)        Perl Programmers Reference Guide        File::Find(3pm)

NAME

       File::Find - Traverse a directory tree.

SYNOPSIS

           use File::Find;
           find(\&wanted, @directories_to_search);
           sub wanted { ... }

           use File::Find;
           finddepth(\&wanted, @directories_to_search);
           sub wanted { ... }

           use File::Find;
           find({ wanted => \&process, follow => 1 }, '.');

DESCRIPTION

       These are functions for searching through directory trees and doing
       work on each file found, similar to the Unix find command.  File::Find
       exports two functions, "find" and "finddepth".  They work similarly but
       differ in the order in which a directory and its contents are
       reported.

       find
             find(\&wanted,  @directories);
             find(\%options, @directories);

           "find()" does a depth-first search over the given @directories in
           the order they are given.  For each file or directory found, it
           calls the &wanted subroutine.  (See below for details on how to use
           the &wanted function.)  Additionally, for each directory found, it
           will "chdir()" into that directory and continue the search,
           invoking the &wanted function on each file or subdirectory in the
           directory.

       finddepth
             finddepth(\&wanted,  @directories);
             finddepth(\%options, @directories);

           "finddepth()" works just like "find()" except that it invokes the
           &wanted function for a directory after invoking it for the
           directory's contents.  It does a postorder traversal instead of a
           preorder traversal, working from the bottom of the directory tree
           up where "find()" works from the top of the tree down.

       Despite the name of the "finddepth()" function, both "find()" and
       "finddepth()" perform a depth-first search of the directory hierarchy.
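
       The difference in reporting order can be seen with a small sketch.
       It assumes a throwaway tree built with File::Temp and File::Path
       (both are illustrative choices, not something File::Find requires)
       and simply records $File::Find::name in each mode.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a tiny throwaway tree: $root/a/b (directories only).
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b");

my (@pre, @post);

# find() reports a directory before its contents (preorder).
find(sub { push @pre, $File::Find::name }, $root);

# finddepth() reports a directory after its contents (postorder).
finddepth(sub { push @post, $File::Find::name }, $root);

print "find:      @pre\n";
print "finddepth: @post\n";
```

       With this tree, "find()" lists the top directory first and "a/b"
       last, while "finddepth()" lists them in the reverse order.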

   %options
       The first argument to "find()" is either a code reference to your
       &wanted function, or a hash reference describing the operations to be
       performed for each file.  The code reference is described in "The
       wanted function" below.

       Here are the possible keys for the hash:

       "wanted"
           The value should be a code reference.  This code reference is
           described in "The wanted function" below. The &wanted subroutine is
           mandatory.

       "bydepth"
           Reports the name of a directory only AFTER all its entries have
           been reported.  Entry point "finddepth()" is a shortcut for
           specifying "{ bydepth => 1 }" in the first argument of "find()".

       "preprocess"
           The value should be a code reference. This code reference is used
           to preprocess the current directory. The name of the currently
           processed directory is in $File::Find::dir. Your preprocessing
           function is called after "readdir()", but before the loop that
           calls the "wanted()" function. It is called with a list of strings
           (actually file/directory names) and is expected to return a list of
           strings. The code can be used to sort the file/directory names
           alphabetically, numerically, or to filter out directory entries
           based on their name alone. When follow or follow_fast is in
           effect, "preprocess" is a no-op.

       "postprocess"
           The value should be a code reference. It is invoked just before
           leaving the currently processed directory. It is called in void
           context with no arguments. The name of the current directory is in
           $File::Find::dir. This hook is handy for summarizing a directory,
           such as calculating its disk usage. When follow or follow_fast is
           in effect, "postprocess" is a no-op.

       "follow"
           Causes symbolic links to be followed. Since directory trees with
           symbolic links (followed) may contain files more than once and may
           even have cycles, a hash has to be built up with an entry for each
           file.  This might be expensive both in space and time for a large
           directory tree. See "follow_fast" and "follow_skip" below.  If
           either follow or follow_fast is in effect:

           •   It is guaranteed that an lstat has been called before the
               user's "wanted()" function is called. This enables fast file
               checks involving "_".  Note that this guarantee does not hold
               if neither follow nor follow_fast is set.

           •   There is a variable $File::Find::fullname which holds the
               absolute pathname of the file with all symbolic links resolved.
               If the link is a dangling symbolic link, then fullname will be
               set to "undef".

           This is a no-op on Win32.

       "follow_fast"
           This is similar to follow except that it may report some files more
           than once.  It does detect cycles, however.  Since only symbolic
           links have to be hashed, this is much cheaper both in space and
           time.  If processing a file more than once (by the user's
           "wanted()" function) is worse than just taking time, the option
           follow should be used.

           This is also a no-op on Win32.

       "follow_skip"
           "follow_skip==1", which is the default, causes all files which are
           neither directories nor symbolic links to be ignored if they are
           about to be processed a second time. If a directory or a symbolic
           link is about to be processed a second time, File::Find dies.

           "follow_skip==0" causes File::Find to die if any file is about to
           be processed a second time.

           "follow_skip==2" causes File::Find to ignore any duplicate files
           and directories but to proceed normally otherwise.

       "dangling_symlinks"
           Specifies what to do with symbolic links whose target doesn't
           exist.  If true and a code reference, will be called with the
           symbolic link name and the directory it lives in as arguments.
           Otherwise, if true and warnings are on, a warning of the form
           "symbolic_link_name is a dangling symbolic link\n" will be issued.
           If false, the dangling symbolic link will be silently ignored.

       "no_chdir"
           Does not "chdir()" to each directory as it recurses. The "wanted()"
           function will need to be aware of this, of course. In this case, $_
           will be the same as $File::Find::name.

       "untaint"
           If find is used in taint-mode (-T command line switch or if EUID !=
           UID or if EGID != GID), then internally directory names have to be
           untainted before they can be "chdir"'d to. Therefore they are
           checked against a regular expression untaint_pattern.  Note that
           all names passed to the user's "wanted()" function are still
           tainted. If this option is used while not in taint-mode, "untaint"
           is a no-op.

       "untaint_pattern"
           See above. This should be set using the "qr" quoting operator.  The
           default is set to "qr|^([-+@\w./]+)$|".  Note that the parentheses
           are vital.

       "untaint_skip"
           If set, a directory which fails the untaint_pattern is skipped,
           including all its sub-directories. The default is to "die" in such
           a case.
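
       As an illustration of the hash-reference form, the following sketch
       combines "preprocess", "wanted" and "postprocess" to visit files in
       alphabetical order and to total the file sizes per directory.  The
       temporary tree and the write_file() helper are assumptions made for
       the example and are not part of File::Find.  Neither follow nor
       follow_fast is used, since both hooks would then be no-ops.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper (not part of File::Find): create a file with some text.
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "cannot create $path: $!";
    print {$fh} $text;
    close $fh or die "cannot close $path: $!";
}

# Throwaway tree: two files at the top, one file in a subdirectory.
my $root = tempdir(CLEANUP => 1);
make_path("$root/sub");
write_file("$root/b.txt",     'bb');      # 2 bytes
write_file("$root/a.txt",     'a');       # 1 byte
write_file("$root/sub/c.txt", 'cccc');    # 4 bytes

my (@visited, %bytes, @left);

find({
    # Receives the names from readdir() and returns the names to visit,
    # here sorted alphabetically.
    preprocess => sub { return sort @_ },

    wanted => sub {
        return unless -f $_;
        push @visited, $_;
        $bytes{$File::Find::dir} += -s _;   # "_" reuses the stat done by -f
    },

    # Called with no arguments just before leaving each directory.
    postprocess => sub { push @left, $File::Find::dir },
}, $root);

print "visited: @visited\n";
printf "%4d bytes in %s\n", $bytes{$_} // 0, $_ for @left;
```

       The sizes are keyed by $File::Find::dir, which is the same value the
       "postprocess" hook sees when the directory is about to be left.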

   The wanted function
       The "wanted()" function does whatever verifications you want on each
       file and directory.  Note that despite its name, the "wanted()"
       function is a generic callback function, and does not tell File::Find
       if a file is "wanted" or not.  In fact, its return value is ignored.

       The wanted function takes no arguments but rather does its work through
       a collection of variables.

       $File::Find::dir is the current directory name,
       $_ is the current filename within that directory,
       $File::Find::name is the complete pathname to the file.

       The above variables have all been localized and may be changed without
       affecting data outside of the wanted function.

       For example, when examining the file /some/path/foo.ext you will have:

           $File::Find::dir  = /some/path
           $_                = foo.ext
           $File::Find::name = /some/path/foo.ext
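
       The three variables can be inspected with a short sketch such as the
       one below.  It assumes a temporary tree containing some/path/foo.ext,
       created only for the demonstration, and records the values seen when
       the callback reaches that file.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway tree containing some/path/foo.ext.
my $root = tempdir(CLEANUP => 1);
make_path("$root/some/path");
open my $fh, '>', "$root/some/path/foo.ext" or die "cannot create file: $!";
close $fh;

# Capture the variables at the moment the callback sees foo.ext.
my %at_foo;
find(sub {
    return unless $_ eq 'foo.ext';
    %at_foo = (
        dir  => $File::Find::dir,
        file => $_,
        name => $File::Find::name,
    );
}, $root);

print "\$File::Find::dir  = $at_foo{dir}\n";
print "\$_                = $at_foo{file}\n";
print "\$File::Find::name = $at_foo{name}\n";
```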

       You are chdir()'d to $File::Find::dir when the function is called,
       unless "no_chdir" was specified. Note that when changing to directories
       is in effect, the root directory (/) is a somewhat special case
       inasmuch as the concatenation of $File::Find::dir, '/' and $_ is not
       literally equal to $File::Find::name. The table below summarizes all
       variants:

                     $File::Find::name  $File::Find::dir  $_
        default      /                  /                 .
        no_chdir=>0  /etc               /                 etc
                     /etc/x             /etc              x

        no_chdir=>1  /                  /                 /
                     /etc               /                 /etc
                     /etc/x             /etc              /etc/x
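
       The effect of "no_chdir" on $_ can be checked directly.  The sketch
       below assumes a temporary tree with a file etc/x (mirroring the table,
       but under a scratch directory rather than /) and records $_ for that
       file in both modes.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway tree containing etc/x.
my $root = tempdir(CLEANUP => 1);
make_path("$root/etc");
open my $fh, '>', "$root/etc/x" or die "cannot create file: $!";
close $fh;

# Record what $_ looks like for the file "x" in each mode.
my %underscore;
for my $no_chdir (0, 1) {
    find({
        no_chdir => $no_chdir,
        wanted   => sub {
            return unless $File::Find::name eq "$root/etc/x";
            $underscore{$no_chdir} = $_;
        },
    }, $root);
}

print "no_chdir => 0: $underscore{0}\n";
print "no_chdir => 1: $underscore{1}\n";
```

       Without "no_chdir", file tests on $_ work because the process is
       inside the directory; with it, $_ carries the whole path instead.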

       When "follow" or "follow_fast" is in effect, there is also a
       $File::Find::fullname.  The function may set $File::Find::prune to
       prune the tree unless "bydepth" was specified.  Unless "follow" or
       "follow_fast" is specified, for compatibility reasons (find.pl,
       find2perl) there are in addition the following globals available:
       $File::Find::topdir, $File::Find::topdev, $File::Find::topino,
       $File::Find::topmode and $File::Find::topnlink.
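
       A minimal sketch of pruning follows.  The directory name ".git" and
       the file names are arbitrary choices for the example.  Because
       pruning only works when "bydepth" is not in effect, the sketch uses
       plain "find()".

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway tree with one directory we want to skip entirely.
my $root = tempdir(CLEANUP => 1);
make_path("$root/src", "$root/.git");
for my $path ("$root/src/main.pl", "$root/.git/config") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}

my @files;
find(sub {
    if ($_ eq '.git' && -d $_) {
        $File::Find::prune = 1;    # do not descend into this directory
        return;
    }
    push @files, $File::Find::name if -f $_;
}, $root);

print "$_\n" for @files;
```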

       This library is useful for the "find2perl" tool (distributed as part of
       the App-find2perl CPAN distribution), which when fed

         find2perl / -name .nfs\* -mtime +7 \
           -exec rm -f {} \; -o -fstype nfs -prune

       produces something like:

        sub wanted {
           /^\.nfs.*\z/s &&
           (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
           int(-M _) > 7 &&
           unlink($_)
           ||
           ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
           $dev < 0 &&
           ($File::Find::prune = 1);
        }

       Notice the "_" in the above "int(-M _)": the "_" is a magical
       filehandle that caches the information from the preceding "stat()",
       "lstat()", or filetest.

       Here's another interesting wanted function.  It will find all symbolic
       links that don't resolve:

           sub wanted {
                -l && !-e && print "bogus link: $File::Find::name\n";
           }

       Note that you may mix directories and (non-directory) files in the list
       of paths passed to "find()"; the "wanted()" function is called for each
       of them.

           find(\&wanted, "./foo", "./bar", "./baz/epsilon");

       In the example above, no file in ./baz/ other than ./baz/epsilon will
       be evaluated by "wanted()".
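
       The same behaviour can be reproduced in a scratch directory.  In this
       sketch the names foo, baz, epsilon and other are placeholders.  Only
       the directory foo and the single file baz/epsilon are passed to
       "find()", so baz/other should never reach the callback.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway tree: foo/one.txt, baz/epsilon and baz/other.
my $root = tempdir(CLEANUP => 1);
make_path("$root/foo", "$root/baz");
for my $path ("$root/foo/one.txt", "$root/baz/epsilon", "$root/baz/other") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}

# Pass one directory and one plain file as starting points.
my @seen;
find(sub { push @seen, $File::Find::name }, "$root/foo", "$root/baz/epsilon");

print "$_\n" for @seen;
```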

       See also the script "pfind" on CPAN for a nice application of this
       module.

WARNINGS

       If you run your program with the "-w" switch, or if you use the
       "warnings" pragma, File::Find will report warnings for several unusual
       situations. You can disable these warnings by putting the statement

           no warnings 'File::Find';

       in the appropriate scope. See warnings for more info about lexical
       warnings.
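
       The scoping can be sketched as follows.  The example assumes that
       searching a nonexistent starting path is one of the situations that
       triggers a File::Find warning, and it collects warnings through a
       $SIG{__WARN__} handler instead of printing them.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# A path that cannot exist inside a fresh temporary directory.
my $root    = tempdir(CLEANUP => 1);
my $missing = "$root/no-such-dir";

# Collect warnings instead of printing them.
my @caught;
local $SIG{__WARN__} = sub { push @caught, @_ };

find(sub { }, $missing);             # File::Find warnings enabled here
my $loud = @caught;

@caught = ();
{
    no warnings 'File::Find';        # silenced for this block only
    find(sub { }, $missing);
}
my $quiet = @caught;

print "warnings with the category enabled:  $loud\n";
print "warnings with the category disabled: $quiet\n";
```

       Note that "no warnings 'File::Find'" has to appear after File::Find
       has been loaded, since the category is registered by the module.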

BUGS AND CAVEATS

       $dont_use_nlink
           You can set the variable $File::Find::dont_use_nlink to 0 if you
           are sure the filesystem you are scanning reflects the number of
           subdirectories in the parent directory's "nlink" count.

           If you do set $File::Find::dont_use_nlink to 0, you may notice an
           improvement in speed at the risk of not recursing into
           subdirectories if a filesystem doesn't populate "nlink" as
           expected.

           $File::Find::dont_use_nlink now defaults to 1 on all platforms.

       symlinks
           Be aware that the option to follow symbolic links can be dangerous.
           Depending on the structure of the directory tree (including
           symbolic links to directories) you might traverse a given
           (physical) directory more than once (only if "follow_fast" is in
           effect).  Furthermore, deleting or changing files in a symbolically
           linked directory might cause very unpleasant surprises, since you
           delete or change files in an unknown directory.

HISTORY

       File::Find used to produce incorrect results if called recursively.
       During the development of perl 5.8 this bug was fixed.  The first fixed
       version of File::Find was 1.01.

SEE ALSO

       find(1), find2perl.

perl v5.36.3                      2023-11-30                   File::Find(3pm)