Regex(3)              User Contributed Perl Documentation             Regex(3)

NAME

       YAPE::Regex - Yet Another Parser/Extractor for Regular Expressions

SYNOPSIS

         use YAPE::Regex;
         use strict;

         my $regex = qr/reg(ular\s+)?exp?(ression)?/i;
         my $parser = YAPE::Regex->new($regex);

         # here is the tokenizing part
         while (my $chunk = $parser->next) {
           # ...
         }

"YAPE" MODULES

       The "YAPE" hierarchy of modules is an attempt at a unified means of
       parsing and extracting content.  It aims to maintain a generic
       interface, to promote simplicity and reusability.  The API is powerful,
       yet simple.  The modules do tokenization (which can be intercepted) and
       build trees, so that specific nodes can be extracted.

DESCRIPTION

       This module is yet another (?) parser and tree-builder for Perl regular
       expressions.  It builds a tree out of a regex, but at the moment, the
       extent of the extraction tool for the tree is quite limited (see
       "Extracting Sections").  However, the tree can be useful to extension
       modules.

USAGE

       In addition to the base class, "YAPE::Regex", there is the auxiliary
       class "YAPE::Regex::Element" (common to all "YAPE" base classes) that
       holds the individual nodes' classes.  There is documentation for the
       node classes in that module's documentation.

   Methods for "YAPE::Regex"
       ·   "use YAPE::Regex;"

       ·   "use YAPE::Regex qw( MyExt::Mod );"

           If given no arguments, the module is loaded normally, and the
           node classes are given the proper inheritance (from
           "YAPE::Regex::Element").  If you supply a module (or list of
           modules), "import" will automatically load them (if needed) and
           set up their node classes with the proper inheritance.  That is,
           it will append "YAPE::Regex" to @MyExt::Mod::ISA, and
           "YAPE::Regex::xxx" to each node class's @ISA (where "xxx" is the
           name of the specific node class).

             package MyExt::Mod;
             use YAPE::Regex 'MyExt::Mod';

             # does the work of:
             # @MyExt::Mod::ISA = 'YAPE::Regex'
             # @MyExt::Mod::text::ISA = 'YAPE::Regex::text'
             # ...

       ·   "my $p = YAPE::Regex->new($REx);"

           Creates a "YAPE::Regex" object, using the contents of $REx as a
           regular expression.  The "new" method will attempt to convert $REx
           to a compiled regex (using "qr//") if $REx isn't already one.  If
           there is an error in the regex, that compilation will fail, but
           the parser will carry on as if it had succeeded.  It will then
           report the bad token when it reaches it in the course of parsing.

       ·   "my $text = $p->chunk($len);"

           Returns the next $len characters in the input string; $len defaults
           to 30 characters.  This is useful for figuring out why a parsing
           error occurs.

       ·   "my $done = $p->done;"

           Returns true if the parser is done with the input string, and false
           otherwise.

       ·   "my $errstr = $p->error;"

           Returns the parser error message.

       ·   "my $backref = $p->extract;"

           Returns a code reference that, each time it is called, returns the
           next back-reference in the regex.  For more information on
           enhancements in upcoming versions of this module, check
           "Extracting Sections".

       ·   "my $str = $p->display(...);"

           Returns a string representation of the entire content.  It calls
           the "parse" method in case there is more data that has not yet been
           parsed.  This calls the "fullstring" method on the root nodes.
           Check the "YAPE::Regex::Element" docs on the arguments to
           "fullstring".

       ·   "my $node = $p->next;"

           Returns the next token, or "undef" if there is no valid token.
           There will be an error message (accessible with the "error" method)
           if there was a problem in the parsing.

       ·   "my $node = $p->parse;"

           Calls "next" until all the data has been parsed.

       ·   "my $node = $p->root;"

           Returns the root node of the tree structure.

       ·   "my $state = $p->state;"

           Returns the current state of the parser.  It is one of the
           following values: "alt", "anchor", "any", "backref", "capture(N)",
           "Cchar", "class", "close", "code", "comment", "cond(TYPE)", "ctrl",
           "cut", "done", "error", "flags", "group", "hex", "later",
           "lookahead(neg|pos)", "lookbehind(neg|pos)", "macro", "named",
           "oct", "slash", "text", and "utf8hex".

           For "capture(N)", N will be the number the captured pattern
           represents.

           For "cond(TYPE)", TYPE will either be a number representing the
           back-reference that the conditional depends on, or the string
           "assert".

           For "lookahead" and "lookbehind", one of "neg" and "pos" will be
           there, depending on the type of assertion.

       ·   "my $node = $p->top;"

           Synonymous with "root".

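       The methods above can be combined into a small tokenizing loop that
       reports where parsing stopped.  What follows is only a minimal sketch,
       not part of the module: the "tokenize" helper is a hypothetical name,
       and the code assumes that "error" returns a false value when nothing
       went wrong and that the "type" and "string" node methods behave as
       described in the "YAPE::Regex::Element" documentation.

```perl
use strict;
use warnings;
use YAPE::Regex;    # assumption: installed from CPAN

# Hypothetical helper: tokenize a pattern and return [type, string] pairs.
# Dies with the parser's own error message if parsing hit a problem.
sub tokenize {
    my ($pattern) = @_;
    my $p = YAPE::Regex->new($pattern);
    my @tokens;

    # next() returns one node per call and a false value when it stops
    while (my $node = $p->next) {
        # type() and string() come from YAPE::Regex::Element
        push @tokens, [ $node->type, $node->string ];
    }

    # Assumption: error() is false unless parsing ran into a problem
    if (my $err = $p->error) {
        die sprintf("parse stopped in state '%s': %s (next input: '%s')\n",
            $p->state, $err, $p->chunk(20));
    }
    return @tokens;
}

# Print one token per line: node type, then its string form
printf "%-12s %s\n", @$_ for tokenize(qr/reg(ular\s+)?exp?(ression)?/i);
```

       Reporting "state" and "chunk" alongside "error" shows both what the
       parser was doing and which part of the input it had not yet consumed.
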
   Extracting Sections
       While extraction of nodes is the goal of the "YAPE" modules, the author
       is at a loss for words as to what needs to be extracted from a regex.
       At the current time, all the "extract" method does is allow you access
       to the regex's set of back-references:

         my $extor = $parser->extract;
         while (my $backref = $extor->()) {
           # ...
         }

       "japhy" is very open to suggestions as to the approach to node
       extraction (in how the API should look, in addition to what should be
       proffered).  Preliminary ideas include extraction keywords like the
       output of -Dr (or the "re" module's "debug" option).
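
       As a slightly fuller illustration of the iterator returned by
       "extract", the sketch below collects the text of each back-reference.
       The "capture_strings" helper is a hypothetical name, the explicit call
       to "parse" is a precaution that may be redundant, and "fullstring" is
       assumed to behave as described in the "YAPE::Regex::Element"
       documentation.

```perl
use strict;
use warnings;
use YAPE::Regex;    # assumption: installed from CPAN

# Hypothetical helper: return the text of every back-reference in a pattern.
sub capture_strings {
    my ($pattern) = @_;
    my $parser = YAPE::Regex->new($pattern);

    # Make sure the whole regex has been parsed (may be redundant)
    $parser->parse;

    # extract() hands back a code reference acting as an iterator
    my $extor = $parser->extract;

    my @groups;
    while (my $backref = $extor->()) {
        # fullstring() comes from YAPE::Regex::Element
        push @groups, $backref->fullstring;
    }
    return @groups;
}

print "$_\n" for capture_strings(qr/reg(ular\s+)?exp?(ression)?/i);
```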

EXTENSIONS

       ·   "YAPE::Regex::Explain" 3.011

           Presents an explanation of a regular expression, node by node.

       ·   "YAPE::Regex::Reverse" (Not released)

           Reverses the nodes of a regular expression.

TO DO

       This is a listing of things to add to future versions of this module.

   API
       ·   Create a robust "extract" method

           Open to suggestions.

BUGS

       Following is a list of known or reported bugs.

   Pending
       ·   "use charnames ':full'"

           To understand "\N{...}" properly, you must be using Perl 5.6.0 or
           higher.  However, the parser only knows how to resolve full names
           (those made using "use charnames ':full'").  There might be an
           option in the future to specify a class name.

SEE ALSO

       The "YAPE::Regex::Element" documentation, for information on the node
       classes.  Also, "Text::Balanced", Damian Conway's excellent module,
       used for the matching of "(?{ ... })" and "(??{ ... })" blocks.

AUTHOR

         Jeff "japhy" Pinyan
         CPAN ID: PINYAN
         PINYAN@cpan.org

perl v5.12.1                      2009-11-30                          Regex(3)