File::Find(3pm)        Perl Programmers Reference Guide        File::Find(3pm)

NAME

       File::Find - Traverse a directory tree.

SYNOPSIS

           use File::Find;
           find(\&wanted, @directories_to_search);
           sub wanted { ... }

           use File::Find;
           finddepth(\&wanted, @directories_to_search);
           sub wanted { ... }

           use File::Find;
           find({ wanted => \&process, follow => 1 }, '.');

DESCRIPTION

       These functions search through directory trees and do work on each
       file found, much like the Unix find command.  File::Find exports two
       functions, "find" and "finddepth".  They work similarly but differ in
       whether a directory is reported before or after its contents.

       find
             find(\&wanted,  @directories);
             find(\%options, @directories);

           "find()" does a depth-first search over the given @directories in
           the order they are given.  For each file or directory found, it
           calls the &wanted subroutine.  (See below for details on how to use
           the &wanted function.)  Additionally, for each directory found, it
           will "chdir()" into that directory and continue the search,
           invoking the &wanted function on each file or subdirectory in the
           directory.

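           A minimal sketch of the code-reference form, assuming a throwaway
           tree built with File::Temp.  The file and directory names are
           hypothetical and chosen only for illustration:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway tree (hypothetical names):
#   $root/top.txt
#   $root/a/b/file.txt
my $root = tempdir(CLEANUP => 1);
mkdir "$root/a"   or die "mkdir failed: $!";
mkdir "$root/a/b" or die "mkdir failed: $!";
for my $path ("$root/top.txt", "$root/a/b/file.txt") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}

# find() calls the callback once per file or directory it visits.
# Here the callback records the full path of every plain file.
my @files;
find(sub { push @files, $File::Find::name if -f $_ }, $root);

@files = sort @files;
print "$_\n" for @files;
```

           The result is sorted afterwards because the order of entries
           within a directory is not something this sketch relies on.
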
       finddepth
             finddepth(\&wanted,  @directories);
             finddepth(\%options, @directories);

           "finddepth()" works just like "find()" except that it invokes the
           &wanted function for a directory after invoking it for the
           directory's contents.  It does a postorder traversal instead of a
           preorder traversal, working from the bottom of the directory tree
           up where "find()" works from the top of the tree down.

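           The difference in visiting order can be seen with a sketch like
           the following, which assumes a throwaway tree containing a single
           chain (hypothetical names) so that the order is unambiguous:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree with a single chain: $root/a/f.txt
my $root = tempdir(CLEANUP => 1);
mkdir "$root/a" or die "mkdir failed: $!";
open my $fh, '>', "$root/a/f.txt" or die "cannot create file: $!";
close $fh;

my (@pre, @post);
find(     sub { push @pre,  $File::Find::name }, $root);   # directory first
finddepth(sub { push @post, $File::Find::name }, $root);   # directory last

print "find:      @pre\n";
print "finddepth: @post\n";
```

           With this tree, @pre lists each directory before its contents and
           @post lists the same names in the reverse order.
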
       %options

       The first argument to "find()" is either a code reference to your
       &wanted function, or a hash reference describing the operations to be
       performed for each file.  The code reference is described in "The
       wanted function" below.

       Here are the possible keys for the hash:

       "wanted"
          The value should be a code reference.  This code reference is
          described in "The wanted function" below.

       "bydepth"
          Reports the name of a directory only AFTER all its entries have been
          reported.  Entry point "finddepth()" is a shortcut for specifying
          "{ bydepth => 1 }" in the first argument of "find()".

       "preprocess"
          The value should be a code reference.  This code reference is used
          to preprocess the current directory.  The name of the currently
          processed directory is in $File::Find::dir.  Your preprocessing
          function is called after "readdir()", but before the loop that calls
          the "wanted()" function.  It is called with a list of strings
          (actually file/directory names) and is expected to return a list of
          strings.  The code can be used to sort the file/directory names
          alphabetically or numerically, or to filter out directory entries
          based on their name alone.  When follow or follow_fast is in effect,
          "preprocess" is a no-op.

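          As a sketch of the sorting use case, assuming a throwaway directory
          whose three files (hypothetical names) are created out of order:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway directory holding three empty files, created out of order.
my $root = tempdir(CLEANUP => 1);
for my $name (qw(c.txt a.txt b.txt)) {
    open my $fh, '>', "$root/$name" or die "cannot create $name: $!";
    close $fh;
}

my @seen;
find({
    wanted     => sub { push @seen, $_ if -f $_ },
    # Receives the names read from the current directory and returns
    # them in the order they should be visited.
    preprocess => sub { return sort @_ },
}, $root);

print "@seen\n";
```

          Returning a filtered list instead of a sorted one would drop the
          omitted entries from the traversal.
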
       "postprocess"
          The value should be a code reference.  It is invoked just before
          leaving the currently processed directory.  It is called in void
          context with no arguments.  The name of the current directory is in
          $File::Find::dir.  This hook is handy for summarizing a directory,
          such as calculating its disk usage.  When follow or follow_fast is
          in effect, "postprocess" is a no-op.

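          The disk-usage idea might be sketched as follows.  The helper
          write_file() and the file names are hypothetical; the sketch only
          totals the plain files directly inside each directory:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Small helper (hypothetical) that writes $content to $path.
sub write_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "cannot create $path: $!";
    binmode $fh;
    print {$fh} $content;
    close $fh or die "cannot close $path: $!";
}

# Throwaway tree: 5 bytes directly in $root, 3 bytes in $root/sub.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!";
write_file("$root/x.dat",     'hello');
write_file("$root/sub/y.dat", 'abc');

my %bytes;   # directory => total size of the plain files directly in it
my @left;    # directories in the order postprocess was called for them
find({
    wanted      => sub { $bytes{$File::Find::dir} += (-s _) if -f $_ },
    postprocess => sub { push @left, $File::Find::dir },
}, $root);

printf "%6d  %s\n", $bytes{$_} || 0, $_ for @left;
```
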
       "follow"
          Causes symbolic links to be followed.  Since directory trees with
          (followed) symbolic links may contain files more than once and may
          even have cycles, a hash has to be built up with an entry for each
          file.  This might be expensive both in space and time for a large
          directory tree.  See follow_fast and follow_skip below.  If either
          follow or follow_fast is in effect:

          *     It is guaranteed that an lstat has been called before the
                user's "wanted()" function is called.  This enables fast file
                checks involving _.  Note that this guarantee does not hold
                if neither follow nor follow_fast is set.

          *     There is a variable $File::Find::fullname which holds the
                absolute pathname of the file with all symbolic links
                resolved.  If the link is a dangling symbolic link, then
                fullname will be set to "undef".

          This is a no-op on Win32.

       "follow_fast"
          This is similar to follow except that it may report some files more
          than once.  It does detect cycles, however.  Since only symbolic
          links have to be hashed, this is much cheaper both in space and
          time.  If processing a file more than once (by the user's "wanted()"
          function) is worse than just taking time, the option follow should
          be used.

          This is also a no-op on Win32.

       "follow_skip"
          "follow_skip==1", which is the default, causes all files which are
          neither directories nor symbolic links to be ignored if they are
          about to be processed a second time.  If a directory or a symbolic
          link is about to be processed a second time, File::Find dies.

          "follow_skip==0" causes File::Find to die if any file is about to be
          processed a second time.

          "follow_skip==2" causes File::Find to ignore any duplicate files and
          directories but to proceed normally otherwise.

       "dangling_symlinks"
          If true and a code reference, it will be called with the symbolic
          link name and the directory it lives in as arguments.  Otherwise, if
          true and warnings are on, the warning "symbolic_link_name is a
          dangling symbolic link\n" will be issued.  If false, the dangling
          symbolic link will be silently ignored.

       "no_chdir"
          Does not "chdir()" to each directory as it recurses.  The "wanted()"
          function will need to be aware of this, of course.  In this case, $_
          will be the same as $File::Find::name.

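          A sketch that checks both statements, assuming a throwaway tree
          (hypothetical names) and using Cwd to observe the working
          directory from inside the callback:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree: $root/d/f.txt
my $root = tempdir(CLEANUP => 1);
mkdir "$root/d" or die "mkdir failed: $!";
open my $fh, '>', "$root/d/f.txt" or die "cannot create file: $!";
close $fh;

my $start      = getcwd();
my $visited    = 0;
my $mismatches = 0;   # times $_ differed from $File::Find::name
my $moved      = 0;   # times the working directory had changed

find({
    no_chdir => 1,
    wanted   => sub {
        $visited++;
        $mismatches++ if $_ ne $File::Find::name;
        $moved++      if getcwd() ne $start;
    },
}, $root);

print "visited=$visited mismatches=$mismatches moved=$moved\n";
```

          Because the working directory stays put, file tests inside such a
          callback operate on $_ as a path relative to where the program
          started (or an absolute path, as here).
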
       "untaint"
          If find is used in taint-mode (-T command line switch or if EUID !=
          UID or if EGID != GID) then internally directory names have to be
          untainted before they can be chdir'ed to.  Therefore they are
          checked against a regular expression untaint_pattern.  Note that all
          names passed to the user's "wanted()" function are still tainted.
          If this option is used while not in taint-mode, "untaint" is a
          no-op.

       "untaint_pattern"
          See above.  This should be set using the "qr" quoting operator.  The
          default is set to "qr|^([-+@\w./]+)$|".  Note that the parentheses
          are vital.

       "untaint_skip"
          If set, a directory which fails the untaint_pattern is skipped,
          including all its sub-directories.  The default is to 'die' in such
          a case.

       The wanted function

       The "wanted()" function does whatever verifications you want on each
       file and directory.  Note that despite its name, the "wanted()"
       function is a generic callback function, and does not tell File::Find
       if a file is "wanted" or not.  In fact, its return value is ignored.

       The wanted function takes no arguments but rather does its work through
       a collection of variables.

       $File::Find::dir is the current directory name,
       $_ is the current filename within that directory,
       $File::Find::name is the complete pathname to the file.

       Don't modify these variables.

       For example, when examining the file /some/path/foo.ext you will have:

           $File::Find::dir  = /some/path/
           $_                = foo.ext
           $File::Find::name = /some/path/foo.ext

       You are chdir()'d to $File::Find::dir when the function is called,
       unless "no_chdir" was specified.  Note that when changing to
       directories is in effect, the root directory (/) is a somewhat special
       case inasmuch as the concatenation of $File::Find::dir, '/' and $_ is
       not literally equal to $File::Find::name.  The table below summarizes
       all variants:

                     $File::Find::name  $File::Find::dir  $_
        default      /                  /                 .
        no_chdir=>0  /etc               /                 etc
                     /etc/x             /etc              x

        no_chdir=>1  /                  /                 /
                     /etc               /                 /etc
                     /etc/x             /etc              /etc/x

       When "follow" or "follow_fast" is in effect, there is also a
       $File::Find::fullname.  The function may set $File::Find::prune to
       prune the tree unless "bydepth" was specified.  Unless "follow" or
       "follow_fast" is specified, for compatibility reasons (find.pl,
       find2perl) the following globals are also available:
       $File::Find::topdir, $File::Find::topdev, $File::Find::topino,
       $File::Find::topmode and $File::Find::topnlink.

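       A sketch of pruning, assuming a throwaway tree in which one directory
       (hypothetically named "skip") should not be descended into:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree (hypothetical names):
#   $root/keep/a.txt
#   $root/skip/b.txt
my $root = tempdir(CLEANUP => 1);
for my $dir (qw(keep skip)) {
    mkdir "$root/$dir" or die "mkdir failed: $!";
}
for my $path ("$root/keep/a.txt", "$root/skip/b.txt") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}

my @visited;
find(sub {
    # Setting prune while looking at a directory stops find() from
    # descending into it.  This relies on "bydepth" not being set.
    if ($_ eq 'skip' && -d $_) {
        $File::Find::prune = 1;
        return;
    }
    push @visited, $File::Find::name;
}, $root);

print "$_\n" for sort @visited;
```

       The callback returns early for the pruned directory, so neither the
       directory itself nor anything beneath it is recorded.
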
       This library is useful for the "find2perl" tool, which when fed

           find2perl / -name .nfs\* -mtime +7 \
               -exec rm -f {} \; -o -fstype nfs -prune

       produces something like:

           sub wanted {
               /^\.nfs.*\z/s &&
               (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
               int(-M _) > 7 &&
               unlink($_)
               ||
               ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
               $dev < 0 &&
               ($File::Find::prune = 1);
           }

       Notice the "_" in the above "int(-M _)": the "_" is a magical
       filehandle that caches the information from the preceding "stat()",
       "lstat()", or filetest.

       Here's another interesting wanted function.  It will find all symbolic
       links that don't resolve:

           sub wanted {
                -l && !-e && print "bogus link: $File::Find::name\n";
           }

       See also the script "pfind" on CPAN for a nice application of this
       module.

WARNINGS

       If you run your program with the "-w" switch, or if you use the
       "warnings" pragma, File::Find will report warnings for several weird
       situations.  You can disable these warnings by putting the statement

           no warnings 'File::Find';

       in the appropriate scope.  See perllexwarn for more info about lexical
       warnings.

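       A sketch of limiting that statement to one block, assuming a
       throwaway directory with a single file.  The module is loaded before
       the "no warnings" line so that the 'File::Find' category is already
       registered:

```perl
use strict;
use warnings;
use File::Find;   # load first so the 'File::Find' warnings category exists
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);
open my $fh, '>', "$root/f.txt" or die "cannot create file: $!";
close $fh;

my $count = 0;
{
    # File::Find's own warnings are disabled only inside this block.
    no warnings 'File::Find';
    find(sub { $count++ if -f $_ }, $root);
}

print "plain files: $count\n";
```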

CAVEAT

       $dont_use_nlink
         You can set the variable $File::Find::dont_use_nlink to 1, if you
         want to force File::Find to always stat directories.  This was used
         for file systems that do not have an "nlink" count matching the
         number of sub-directories.  Examples are ISO-9660 (CD-ROM), AFS, HPFS
         (OS/2 file system), FAT (DOS file system) and a couple of others.

         You shouldn't need to set this variable, since File::Find should now
         detect such file systems on-the-fly and switch itself to using stat.
         This works even for parts of your file system, like a mounted CD-ROM.

         If you do set $File::Find::dont_use_nlink to 1, you will notice
         slow-downs.

       symlinks
         Be aware that the option to follow symbolic links can be dangerous.
         Depending on the structure of the directory tree (including symbolic
         links to directories) you might traverse a given (physical) directory
         more than once (only if "follow_fast" is in effect).  Furthermore,
         deleting or changing files in a symbolically linked directory might
         cause very unpleasant surprises, since you delete or change files in
         an unknown directory.

NOTES

       ·   Mac OS (Classic) users should note a few differences:

           ·   The path separator is ':', not '/', and the current directory
               is denoted as ':', not '.'.  You should be careful about
               specifying relative pathnames.  While a full path always begins
               with a volume name, a relative pathname should always begin
               with a ':'.  If specifying a volume name only, a trailing ':'
               is required.

           ·   $File::Find::dir is guaranteed to end with a ':'.  If $_
               contains the name of a directory, that name may or may not end
               with a ':'.  Likewise, $File::Find::name, which contains the
               complete pathname to that directory, and $File::Find::fullname,
               which holds the absolute pathname of that directory with all
               symbolic links resolved, may or may not end with a ':'.

           ·   The default "untaint_pattern" (see above) on Mac OS is set to
               "qr|^(.+)$|".  Note that the parentheses are vital.

           ·   The invisible system file "Icon\015" is ignored.  While this
               file may appear in every directory, there are some more
               invisible system files on every volume, which are all located
               at the volume root level (i.e.  "MacintoshHD:").  These system
               files are not excluded automatically.  Your filter may use the
               following code to recognize invisible files or directories
               (requires Mac::Files):

                use Mac::Files;

                # invisible() --  returns 1 if file/directory is invisible,
                # 0 if it's visible or undef if an error occurred

                sub invisible($) {
                  my $file = shift;
                  my ($fileCat, $fileInfo);
                  my $invisible_flag =  1 << 14;

                  if ( $fileCat = FSpGetCatInfo($file) ) {
                    if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
                      return (($fileInfo->fdFlags & $invisible_flag) && 1);
                    }
                  }
                  return undef;
                }

               Generally, invisible files are system files, unless an odd
               application decides to use invisible files for its own
               purposes.  To distinguish such files from system files, you
               have to look at the type and creator file attributes.  The
               MacPerl built-in functions "GetFileInfo(FILE)" and
               "SetFileInfo(CREATOR, TYPE, FILES)" offer access to these
               attributes (see MacPerl.pm for details).

               Files that appear on the desktop actually reside in a (hidden)
               directory named "Desktop Folder" on the particular disk volume.
               Note that, although all desktop files appear to be on the same
               "virtual" desktop, each disk volume actually maintains its own
               "Desktop Folder" directory.

BUGS AND CAVEATS

       Despite the name of the "finddepth()" function, both "find()" and
       "finddepth()" perform a depth-first search of the directory hierarchy.

HISTORY

       File::Find used to produce incorrect results if called recursively.
       During the development of perl 5.8 this bug was fixed.  The first fixed
       version of File::Find was 1.01.

perl v5.8.8                       2001-09-21                   File::Find(3pm)