Text::ParseWords(3pm)  Perl Programmers Reference Guide  Text::ParseWords(3pm)

NAME

       Text::ParseWords - parse text into an array of tokens or array of
       arrays

SYNOPSIS

         use Text::ParseWords;
         @lists = &nested_quotewords($delim, $keep, @lines);
         @words = &quotewords($delim, $keep, @lines);
         @words = &shellwords(@lines);
         @words = &parse_line($delim, $keep, $line);
         @words = &old_shellwords(@lines); # DEPRECATED!

DESCRIPTION

       The &nested_quotewords() and &quotewords() functions accept a delimiter
       (which can be a regular expression) and a list of lines, and then break
       those lines up into a list of words, ignoring delimiters that appear
       inside quotes.  &quotewords() returns all of the tokens in a single
       long list, while &nested_quotewords() returns a list of token lists
       corresponding to the elements of @lines.  &parse_line() tokenizes a
       single string.  The &*quotewords() functions simply call
       &parse_line(), so if you're only splitting one line you can call
       &parse_line() directly and save a function call.
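
       As a minimal sketch of how the three calls differ (the sample lines
       and the '\s*,\s*' delimiter are made up for illustration):

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical input: two comma-separated lines, one with a quoted field.
my @lines = ('alpha, "beta, gamma"', 'delta ,epsilon');

# The delimiter is a regular expression: a comma with optional whitespace.
my $delim = '\s*,\s*';

# quotewords() flattens every line into one list of tokens.
my @flat = quotewords($delim, 0, @lines);

# nested_quotewords() returns one array reference per input line.
my @nested = nested_quotewords($delim, 0, @lines);

# parse_line() handles exactly one string.
my @single = parse_line($delim, 0, $lines[0]);

print join('|', @flat), "\n";              # alpha|beta, gamma|delta|epsilon
print scalar(@nested), " lists\n";         # 2 lists
print join('|', @{ $nested[1] }), "\n";    # delta|epsilon
print join('|', @single), "\n";            # alpha|beta, gamma
```

       The comma inside the double-quoted field is not treated as a delimiter.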

       The $keep argument is a boolean flag.  If true, then the tokens are
       split on the specified delimiter, but all other characters (quotes,
       backslashes, etc.) are kept in the tokens.  If $keep is false then the
       &*quotewords() functions remove all quotes and backslashes that are not
       themselves backslash-escaped or inside of single quotes (i.e.,
       &quotewords() tries to interpret these characters just like the Bourne
       shell).  NB: these semantics are significantly different from the
       original version of this module shipped with Perl 5.000 through 5.004.
       As an additional feature, $keep may be the keyword "delimiters" which
       causes the functions to preserve the delimiters in each string as
       tokens in the token lists, in addition to preserving quote and
       backslash characters.
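
       The three $keep modes can be compared side by side.  This is only a
       sketch; the input string is invented for the purpose:

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical input with a quoted phrase and a backslash-escaped space.
my $line = q{one "two three" four\ five};

# $keep false: quotes and the escaping backslash are removed.
my @stripped = parse_line('\s+', 0, $line);

# $keep true: quotes and backslashes stay in the tokens.
my @kept = parse_line('\s+', 1, $line);

# $keep eq "delimiters": as above, and each delimiter becomes a token too.
my @with_delims = parse_line('\s+', 'delimiters', $line);

print join('|', @stripped), "\n";      # one|two three|four five
print join('|', @kept), "\n";          # one|"two three"|four\ five
print join('|', @with_delims), "\n";   # one| |"two three"| |four\ five
```

       With "delimiters" the result has five elements, because the two runs
       of whitespace that separate the words are returned as tokens.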

       &shellwords() is written as a special case of &quotewords(), and it
       does token parsing with whitespace as a delimiter-- similar to most
       Unix shells.
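
       For instance, a command line can be split into arguments roughly the
       way a shell would.  The command string below is a hypothetical
       example, not taken from this manual:

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical command line, as it might be read from a config file.
my $command = q{cp -v "My File.txt" /tmp/backup\ dir};

# Split on whitespace, honouring quotes and backslash escapes.
my @argv = shellwords($command);

print scalar(@argv), " words\n";   # 4 words
print join('|', @argv), "\n";      # cp|-v|My File.txt|/tmp/backup dir
```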

EXAMPLES

       The sample program:

         use Text::ParseWords;
         @words = &quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});
         $i = 0;
         foreach (@words) {
             print "$i: <$_>\n";
             $i++;
         }

       produces:

         0: <this>
         1: <is>
         2: <a test>
         3: <of quotewords>
         4: <"for>
         5: <you>

       demonstrating:

       0   a simple word

       1   multiple spaces are skipped because of our $delim

       2   use of quotes to include a space in a word

       3   use of a backslash to include a space in a word

       4   use of a backslash to remove the special meaning of a double-quote

       5   another simple word (note the lack of effect of the backslashed
           double-quote)

       Replacing "&quotewords('\s+', 0, q{this   is...})" with
       "&shellwords(q{this   is...})" is a simpler way to accomplish the same
       thing.
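
       A quick way to check that equivalence, at least for this sample
       input, is to compare the two token lists directly (the variable names
       here are illustrative only):

```perl
use strict;
use warnings;
use Text::ParseWords;

my $text = q{this   is "a test" of\ quotewords \"for you};

my @via_quotewords = quotewords('\s+', 0, $text);
my @via_shellwords = shellwords($text);

# Compare the two token lists: same length, then element by element.
my $same = @via_quotewords == @via_shellwords;
for my $i (0 .. $#via_quotewords) {
    $same &&= $via_quotewords[$i] eq $via_shellwords[$i];
}

print $same ? "identical\n" : "different\n";   # identical
```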

AUTHORS

       Maintainer is Hal Pomeranz <pomeranz@netcom.com>, 1994-1997 (Original
       author unknown).  Much of the code for &parse_line() (including the
       primary regexp) from Joerk Behrends
       <jbehrends@multimediaproduzenten.de>.

       Examples section and other documentation provided by John Heidemann
       <johnh@ISI.EDU>.

       Bug reports, patches, and nagging provided by lots of folks-- thanks
       everybody!  Special thanks to Michael Schwern <schwern@envirolink.org>
       for assuring me that a &nested_quotewords() would be useful, and to
       Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about
       error-checking (sort of-- you had to be there).



perl v5.8.8                       2001-09-21             Text::ParseWords(3pm)