Text::ParseWords(3pm)    Perl Programmers Reference Guide    Text::ParseWords(3pm)

NAME
Text::ParseWords - parse text into an array of tokens or array of
arrays

SYNOPSIS
    use Text::ParseWords;
    @lists = nested_quotewords($delim, $keep, @lines);
    @words = quotewords($delim, $keep, @lines);
    @words = shellwords(@lines);
    @words = parse_line($delim, $keep, $line);
    @words = old_shellwords(@lines); # DEPRECATED!

DESCRIPTION
The &nested_quotewords() and &quotewords() functions accept a delimiter
(which can be a regular expression) and a list of lines, and then break
those lines up into a list of words, ignoring delimiters that appear
inside quotes. &quotewords() returns all of the tokens in a single
long list, while &nested_quotewords() returns a list of token lists
corresponding to the elements of @lines. &parse_line() tokenizes
a single string. The &*quotewords() functions simply call
&parse_line(), so if you're only splitting one line you can call
&parse_line() directly and save a function call.
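
As a rough illustration of how the three entry points differ, the sketch below runs them over the same input. The sample lines and variable names are invented for this example; only the functions themselves come from the module.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical input: two lines, the first containing a quoted phrase.
my @lines = ('alpha "beta gamma"', 'delta epsilon');

# quotewords() flattens the tokens of every line into one list.
my @flat = quotewords('\s+', 0, @lines);

# nested_quotewords() returns one array reference per input line.
my @nested = nested_quotewords('\s+', 0, @lines);

# parse_line() handles exactly one string.
my @single = parse_line('\s+', 0, $lines[0]);

print scalar(@flat), " tokens in total\n";             # 4 tokens in total
print scalar(@{ $nested[0] }), " tokens on line 1\n";  # 2 tokens on line 1
print join('|', @single), "\n";                        # alpha|beta gamma
```

Note that each element of @nested is an array reference, so it has to be dereferenced before its tokens can be used.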

The $keep argument is a boolean flag. If true, then the tokens are
split on the specified delimiter, but all other characters (quotes,
backslashes, etc.) are kept in the tokens. If $keep is false then the
&*quotewords() functions remove all quotes and backslashes that are not
themselves backslash-escaped or inside of single quotes (i.e.,
&quotewords() tries to interpret these characters just like the Bourne
shell). NB: these semantics are significantly different from the
original version of this module shipped with Perl 5.000 through 5.004.
As an additional feature, $keep may be the keyword "delimiters" which
causes the functions to preserve the delimiters in each string as
tokens in the token lists, in addition to preserving quote and
backslash characters.
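
The three settings of $keep can be compared side by side. This is a minimal sketch, assuming a made-up sample line; the variable names are illustrative and not part of the module.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical sample line with a quoted phrase and an escaped space.
my $line = q{a "b c" d\ e};

# $keep false: quotes and backslashes are removed from the tokens.
my @stripped = parse_line('\s+', 0, $line);

# $keep true: quotes and backslashes stay in the tokens.
my @kept = parse_line('\s+', 1, $line);

# $keep eq "delimiters": as above, and each delimiter becomes a token too.
my @with_seps = parse_line('\s+', 'delimiters', $line);

print join('|', map { "<$_>" } @stripped),  "\n";  # <a>|<b c>|<d e>
print join('|', map { "<$_>" } @kept),      "\n";  # <a>|<"b c">|<d\ e>
print join('|', map { "<$_>" } @with_seps), "\n";  # <a>|< >|<"b c">|< >|<d\ e>
```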

&shellwords() is written as a special case of &quotewords(), and it
does token parsing with whitespace as a delimiter, similar to most
Unix shells.
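
A typical use is splitting a command line into arguments. The sketch below assumes a hypothetical command string; the file names are invented for the example.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical command line with a quoted argument and an escaped space.
my $command = q{cp -r "My Documents" /tmp/backup\ dir};

# shellwords() splits on whitespace and strips quotes and backslashes.
my @args = shellwords($command);

print "$_\n" for @args;
# cp
# -r
# My Documents
# /tmp/backup dir
```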

EXAMPLES
The sample program:

    use Text::ParseWords;
    @words = quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});
    $i = 0;
    foreach (@words) {
        print "$i: <$_>\n";
        $i++;
    }

produces:

    0: <this>
    1: <is>
    2: <a test>
    3: <of quotewords>
    4: <"for>
    5: <you>

demonstrating:

0   a simple word

1   multiple spaces are skipped because of our $delim

2   use of quotes to include a space in a word

3   use of a backslash to include a space in a word

4   use of a backslash to remove the special meaning of a double-quote

5   another simple word (note the lack of effect of the backslashed
    double-quote)

Replacing "quotewords('\s+', 0, q{this is...})" with
"shellwords(q{this is...})" is a simpler way to accomplish the same
thing.
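
The equivalence can be checked directly. This is only a sketch: the variable names are illustrative, and the comparison joins the tokens on a NUL byte on the assumption that no token contains one.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Same kind of input as the sample program, with a run of spaces after "this".
my $line = q{this   is "a test" of\ quotewords \"for you};

my @via_quotewords = quotewords('\s+', 0, $line);
my @via_shellwords = shellwords($line);

# Compare the two token lists element by element via a joined string.
my $same = join("\0", @via_quotewords) eq join("\0", @via_shellwords);

print $same ? "same token lists\n" : "different token lists\n";  # same token lists
```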

AUTHORS
Maintainer: Alexandr Ciornii <alexchornyATgmail.com>.

Previous maintainer: Hal Pomeranz <pomeranz@netcom.com>, 1994-1997
(Original author unknown). Much of the code for &parse_line()
(including the primary regexp) from Joerk Behrends
<jbehrends@multimediaproduzenten.de>.

Examples section and other documentation provided by John Heidemann
<johnh@ISI.EDU>

Bug reports, patches, and nagging provided by lots of folks. Thanks,
everybody! Special thanks to Michael Schwern <schwern@envirolink.org>
for assuring me that a &nested_quotewords() would be useful, and to
Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about
error-checking (sort of; you had to be there).

perl v5.12.4    2011-06-01    Text::ParseWords(3pm)