URIFIND(1)            User Contributed Perl Documentation           URIFIND(1)

NAME

       urifind - find URIs in a document and dump them to STDOUT.

SYNOPSIS

           $ urifind file

DESCRIPTION

       urifind is a simple script that finds URIs in one or more files (using
       "URI::Find") and outputs them to STDOUT.  That's it.

       To find all the URIs in file1, use:

           $ urifind file1

       To find the URIs in multiple files, simply list them as arguments:

           $ urifind file1 file2 file3

       urifind will read from "STDIN" if no files are given or if a filename
       of "-" is specified:

           $ wget http://www.boston.com/ -O - | urifind

       When multiple files are listed, urifind prefixes each found URI with
       the file from which it came:

           $ urifind file1 file2
           file1: http://www.boston.com/index.html
           file2: http://use.perl.org/

       This can be turned on for single files with the "-p" ("prefix") switch:

           $ urifind -p file3
           file3: http://fsck.com/rt/

       It can also be turned off for multiple files with the "-n" ("no
       prefix") switch:

           $ urifind -n file1 file2
           http://www.boston.com/index.html
           http://use.perl.org/

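       The "file: URI" prefix format is straightforward to post-process with
       standard tools.  As a minimal sketch, assuming the output format shown
       in the examples above and file names that do not themselves contain
       ": ", the following counts URIs per file.  The helper name
       count_by_file is hypothetical and not part of urifind; the sample
       input is copied from the example above so the block runs without
       urifind installed.

```shell
# count_by_file: hypothetical helper, not part of urifind.
# Reads prefixed output ("file: URI" per line) on stdin and prints
# "file count" pairs, sorted by file name.
count_by_file() {
    awk '
        {
            sep = index($0, ": ")      # first ": " ends the file name
            if (sep == 0) next         # skip lines that carry no prefix
            count[substr($0, 1, sep - 1)]++
        }
        END { for (f in count) printf "%s %d\n", f, count[f] }
    ' | sort
}

# Sample input taken from the prefixed example output above.
printf '%s\n' \
    'file1: http://www.boston.com/index.html' \
    'file2: http://use.perl.org/' | count_by_file
```

       With urifind available, the same helper could be fed directly, for
       example "urifind file1 file2 | count_by_file".
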
       By default, URIs will be displayed in the order found; to sort them
       ascii-betically, use the "-s" ("sort") option.  To reverse sort them,
       use the "-r" ("reverse") flag ("-r" implies "-s").

           $ urifind -s file1 file2
           http://use.perl.org/
           http://www.boston.com/index.html
           mailto:webmaster@boston.com

           $ urifind -r file1 file2
           mailto:webmaster@boston.com
           http://www.boston.com/index.html
           http://use.perl.org/

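       "Ascii-betical" order means plain byte-wise comparison, so upper-case
       letters sort before lower-case ones.  As a rough sketch, assuming
       unprefixed one-URI-per-line output (as with "-n"), comparable
       orderings can be produced with sort(1) in the C locale.  The function
       names are hypothetical, and this illustrates the ordering only, not
       how urifind implements "-s" and "-r".

```shell
# Hypothetical helpers, not part of urifind.
# LC_ALL=C forces byte-wise comparison regardless of the user's locale.
ascii_sort()         { LC_ALL=C sort; }
ascii_sort_reverse() { LC_ALL=C sort -r; }

# Sample URIs taken from the examples above, deliberately out of order.
printf '%s\n' \
    'mailto:webmaster@boston.com' \
    'http://www.boston.com/index.html' \
    'http://use.perl.org/' | ascii_sort
```

       Piping the same input through ascii_sort_reverse yields the opposite
       order.
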
       urifind also supports limiting the returned URIs by scheme or by
       arbitrary pattern, using the "-S" option (for schemes) and the "-P"
       option (for patterns).  Both "-S" and "-P" can be specified multiple
       times:

           $ urifind -S mailto file1
           mailto:webmaster@boston.com

           $ urifind -S mailto -S http file1
           mailto:webmaster@boston.com
           http://www.boston.com/index.html

       "-P" takes an arbitrary Perl regex.  The pattern usually needs to be
       quoted to protect it from the shell:

           $ urifind -P 's?html?' file1
           http://www.boston.com/index.html

           $ urifind -P '\.org\b' -S http file4
           http://www.gnu.org/software/wget/wget.html

       Add a "-d" to have urifind dump the regexes generated from "-S" and
       "-P" to "STDERR".  "-D" does the same but exits immediately:

           $ urifind -P '\.org\b' -S http -D
           $scheme = '^(\bhttp\b):'
           @pats = ('^(\bhttp\b):', '\.org\b')

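       The dump above suggests that each "-S" scheme becomes an anchored
       pattern of the form '^(...):' that is checked alongside the "-P"
       patterns.  The following sketch imitates that filtering with grep(1);
       it is an approximation under stated assumptions, not urifind's code.
       It assumes that several schemes are joined with "|" inside the group
       and that a URI must match every pattern, neither of which is confirmed
       above.  The helper name filter_schemes is hypothetical, and Perl's
       "\b" is omitted because POSIX extended regexes do not define it.

```shell
# filter_schemes: hypothetical helper, not part of urifind.
# Keeps URIs (one per line on stdin) whose scheme is one of the arguments.
# Scheme names are used verbatim as regex fragments, so they should not
# contain regex metacharacters.
filter_schemes() {
    # Join the arguments with "|": "http mailto" becomes "http|mailto".
    alternation=$(IFS='|'; printf '%s' "$*")
    grep -E "^(${alternation}):"
}

# Assumed AND semantics: chain a second grep for an extra pattern.
printf '%s\n' \
    'mailto:webmaster@boston.com' \
    'http://www.boston.com/index.html' \
    'http://www.gnu.org/software/wget/wget.html' |
    filter_schemes http | grep -E '\.org'
```

       Comparing the output of "-D" for a given set of options against such
       a sketch is a quick way to check what urifind will actually match.
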
       To remove duplicates from the results, use the "-u" ("unique") switch.

OPTION SUMMARY

       -s  Sort results.

       -r  Reverse sort results (implies -s).

       -u  Return unique results only.

       -n  Don't include filename in output.

       -p  Include filename in output (off by default, but on if multiple
           files are given on the command line).

       -P $re
           Print only URIs matching regex '$re' (may be specified multiple
           times).

       -S $scheme
           Print only URIs with this scheme (may be specified multiple
           times).

       -h  Help summary.

       -v  Display version and exit.

       -d  Dump compiled regexes for "-S" and "-P" to "STDERR".

       -D  Same as "-d", but exit after dumping.

AUTHOR

       darren chamberlain <darren@cpan.org>

COPYRIGHT

       (C) 2003 darren chamberlain

       This library is free software; you may distribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO

       URI::Find

perl v5.30.1                      2020-01-30                        URIFIND(1)