URIFIND(1)            User Contributed Perl Documentation           URIFIND(1)

NAME
urifind - find URIs in a document and dump them to STDOUT.

SYNOPSIS
    $ urifind file

DESCRIPTION
urifind is a simple script that finds URIs in one or more files (using
"URI::Find") and outputs them to STDOUT. That's it.

To find all the URIs in file1, use:

    $ urifind file1

To find the URIs in multiple files, simply list them as arguments:

    $ urifind file1 file2 file3

urifind will read from "STDIN" if no files are given or if a filename
of "-" is specified:

    $ wget http://www.boston.com/ -O - | urifind

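The input rule just described (no arguments, or an argument of "-", means
standard input) can be modelled in a few lines. This is a minimal Python
sketch for illustration only, not urifind's own Perl code; the helper name
read_inputs and its stdin parameter are hypothetical.

```python
import io
import sys


def read_inputs(args, stdin=None):
    """Yield (name, text) pairs for each input named in args.

    Models the documented rule: an empty argument list, or an argument
    of "-", selects standard input. Labelling that input "-" is a choice
    made for this sketch.
    """
    stdin = sys.stdin if stdin is None else stdin
    for name in (args or ["-"]):
        if name == "-":
            yield name, stdin.read()
        else:
            with open(name, encoding="utf-8", errors="replace") as handle:
                yield name, handle.read()


# Simulate `... | urifind` by substituting an in-memory stream for STDIN.
piped = io.StringIO("See http://www.boston.com/ for news.\n")
for name, text in read_inputs([], stdin=piped):
    print(name, len(text))
```

Passing the stream as a parameter keeps the sketch easy to try without a
real pipe.
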
When multiple files are listed, urifind prefixes each found URI with
the file from which it came:

    $ urifind file1 file2
    file1: http://www.boston.com/index.html
    file2: http://use.perl.org/

This can be turned on for single files with the "-p" ("prefix") switch:

    $ urifind -p file3
    file3: http://fsck.com/rt/

It can also be turned off for multiple files with the "-n" ("no
prefix") switch:

    $ urifind -n file1 file2
    http://www.boston.com/index.html
    http://use.perl.org/

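The prefix rules above reduce to a small decision: prefix when more than
one file is named, unless "-p" or "-n" overrides that default. The Python
sketch below models this; the function names are hypothetical. The text
does not say what happens when both switches are given, so the sketch
simply assumes "-n" wins.

```python
def use_prefix(file_count, force_on=False, force_off=False):
    """Decide whether each URI is printed as 'file: uri'.

    force_on models -p and force_off models -n. Letting -n win when both
    are set is an assumption of this sketch, not documented behavior.
    """
    if force_off:
        return False
    if force_on:
        return True
    return file_count > 1


def format_hit(filename, uri, prefix):
    """Render one result line, with or without the filename prefix."""
    return f"{filename}: {uri}" if prefix else uri


uri = "http://www.boston.com/index.html"
print(format_hit("file1", uri, use_prefix(2)))                  # default, two files
print(format_hit("file1", uri, use_prefix(1, force_on=True)))   # models -p
print(format_hit("file1", uri, use_prefix(2, force_off=True)))  # models -n
```
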
By default, URIs will be displayed in the order found; to sort them
ascii-betically, use the "-s" ("sort") option. To reverse sort them,
use the "-r" ("reverse") flag ("-r" implies "-s").

    $ urifind -s file1 file2
    http://use.perl.org/
    http://www.boston.com/index.html
    mailto:webmaster@boston.com

    $ urifind -r file1 file2
    mailto:webmaster@boston.com
    http://www.boston.com/index.html
    http://use.perl.org/

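To illustrate the ordering, the Python sketch below sorts the three URIs
from the examples above. It assumes "ascii-betically" means a plain
character-by-character comparison, which is what a default string sort
gives for ASCII-only input; it is not taken from urifind's source.

```python
uris = [
    "http://www.boston.com/index.html",
    "mailto:webmaster@boston.com",
    "http://use.perl.org/",
]

# Python compares str values code point by code point, which for these
# ASCII-only URIs is the same as a plain ASCII sort (models -s).
ascending = sorted(uris)

# Models -r: the same ordering, reversed.
descending = sorted(uris, reverse=True)

for uri in ascending:
    print(uri)
```

Under such an ordering, uppercase letters sort before lowercase ones, so
a URI written as "HTTP://..." would come before every "http://..." entry.
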
Finally, urifind supports limiting the returned URIs by scheme or by
arbitrary pattern, using the "-S" option (for schemes) and the "-P"
option (for patterns). Both "-S" and "-P" can be specified multiple times:

    $ urifind -S mailto file1
    mailto:webmaster@boston.com

    $ urifind -S mailto -S http file1
    mailto:webmaster@boston.com
    http://www.boston.com/index.html

"-P" takes an arbitrary Perl regex. It might need to be protected from
the shell:

    $ urifind -P 's?html?' file1
    http://www.boston.com/index.html

    $ urifind -P '\.org\b' -S http file4
    http://www.gnu.org/software/wget/wget.html

Add a "-d" to have urifind dump the regexes generated from "-S" and
"-P" to "STDERR". "-D" does the same but exits immediately:

    $ urifind -P '\.org\b' -S http -D
    $scheme = '^(\bhttp\b):'
    @pats = ('^(\bhttp\b):', '\.org\b')

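The dump above suggests how the filters fit together: the scheme becomes
an anchored regex, and it sits in the same list as the "-P" patterns. The
Python sketch below models that, using Python's re module as a stand-in
for Perl regexes. Two points are assumptions, since the text does not
state them: several "-S" values are joined as alternatives in one regex,
and a URI is kept only if every regex in the list matches. The names
build_patterns and keep are hypothetical.

```python
import re


def build_patterns(schemes, patterns):
    """Return compiled regexes modelled on the dump shown above.

    Schemes are joined as alternatives in one anchored regex; how urifind
    itself combines several -S values is an assumption here.
    """
    compiled = []
    if schemes:
        alternatives = "|".join(rf"\b{re.escape(s)}\b" for s in schemes)
        compiled.append(re.compile(rf"^({alternatives}):"))
    compiled.extend(re.compile(p) for p in patterns)
    return compiled


def keep(uri, compiled):
    # Assumption: a URI is kept only when every regex matches it.
    return all(rx.search(uri) for rx in compiled)


uris = [
    "http://www.gnu.org/software/wget/wget.html",
    "http://www.boston.com/index.html",
    "mailto:webmaster@boston.com",
]
filters = build_patterns(["http"], [r"\.org\b"])
print([rx.pattern for rx in filters])
print([u for u in uris if keep(u, filters)])
```

With no filters at all, keep() accepts every URI, which matches the
default of printing everything found.
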
To remove duplicates from the results, use the "-u" ("unique") switch.

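As an illustration of duplicate removal, the Python sketch below drops
repeated URIs while keeping the first occurrence of each. Whether urifind
preserves first-seen order under "-u" is an assumption; the text only
says that duplicates are removed. The function name unique is
hypothetical.

```python
def unique(uris):
    """Drop repeated URIs, keeping the first occurrence of each.

    dict.fromkeys preserves insertion order, so the result keeps the
    order in which URIs were first seen (an assumption about -u).
    """
    return list(dict.fromkeys(uris))


found = [
    "http://use.perl.org/",
    "http://www.boston.com/index.html",
    "http://use.perl.org/",
]
print(unique(found))
```
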
OPTIONS
-s  Sort results.

-r  Reverse sort results (implies -s).

-u  Return unique results only.

-n  Don't include filename in output.

-p  Include filename in output (off by default, but on if multiple
    files are given on the command line).

-P $re
    Print only URIs matching regex '$re' (may be specified multiple
    times).

-S $scheme
    Only this scheme (may be specified multiple times).

-h  Help summary.

-v  Display version and exit.

-d  Dump compiled regexes for "-S" and "-P" to "STDERR".

-D  Same as "-d", but exit after dumping.

AUTHOR
darren chamberlain <darren@cpan.org>

COPYRIGHT
(C) 2003 darren chamberlain

This library is free software; you may distribute it and/or modify it
under the same terms as Perl itself.

SEE ALSO
URI::Find

perl v5.34.0                      2022-01-21                        URIFIND(1)