re(3pm)                Perl Programmers Reference Guide                re(3pm)

NAME

       re - Perl pragma to alter regular expression behaviour

SYNOPSIS

           use re 'taint';
           ($x) = ($^X =~ /^(.*)$/s);     # $x is tainted here

           $pat = '(?{ $foo = 1 })';
           use re 'eval';
           /foo${pat}bar/;                # won't fail (when not under -T switch)

           {
               no re 'taint';             # the default
               ($x) = ($^X =~ /^(.*)$/s); # $x is not tainted here

               no re 'eval';              # the default
               /foo${pat}bar/;            # disallowed (with or without -T switch)
           }

           use re 'debug';                # output debugging info during
           /^(.*)$/s;                     #     compile and run time

           use re 'debugcolor';           # same as 'debug', but with colored output
           ...

           use re qw(Debug All);          # Finer tuned debugging options.
           use re qw(Debug More);
           no re qw(Debug ALL);           # Turn off all re debugging in this scope

           use re qw(is_regexp regexp_pattern); # import utility functions
           my ($pat,$mods)=regexp_pattern(qr/foo/i);
           if (is_regexp($obj)) {
               print "Got regexp: ",
                   scalar regexp_pattern($obj); # just as perl would stringify it
           }                                    # but no hassle with blessed re's.

       (We use $^X in these examples because it's tainted by default.)

DESCRIPTION

   'taint' mode
       When "use re 'taint'" is in effect, and a tainted string is the target
       of a regexp, the regexp memories (or values returned by the m//
       operator in list context) are tainted.  This feature is useful when
       regexp operations on tainted data aren't meant to extract safe
       substrings, but to perform other transformations.

   'eval' mode
       When "use re 'eval'" is in effect, a regexp is allowed to contain "(?{
       ... })" zero-width assertions and "(??{ ... })" postponed
       subexpressions, even if the regular expression contains variable
       interpolation.  That is normally disallowed, since it is a potential
       security risk.  Note that this pragma is ignored when the regular
       expression is obtained from tainted data, i.e., evaluation is always
       disallowed with tainted regular expressions.  See "(?{ code })" in
       perlre and "(??{ code })" in perlre.

       For the purpose of this pragma, interpolation of precompiled regular
       expressions (i.e., the result of "qr//") is not considered variable
       interpolation.  Thus:

           /foo${pat}bar/

       is allowed if $pat is a precompiled regular expression, even if $pat
       contains "(?{ ... })" assertions or "(??{ ... })" subexpressions.
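
       The three cases described above can be sketched as follows.  This is
       only an illustration, assuming a perl that is not running under the
       -T switch; the counter $main::hits and the variable names are made up
       for the example.

```perl
use strict;
use warnings;

our $hits = 0;    # illustrative counter bumped by the code blocks

# A precompiled pattern may carry a code block without "use re 'eval'".
my $compiled = qr/b(?{ $main::hits++ })/;
my $qr_ok = ("abc" =~ /a${compiled}c/) ? 1 : 0;

# The same kind of code block arriving in a plain string is refused at
# run time, so the match is wrapped in eval {} to trap the error.
my $pat = '(?{ $main::hits++ })';
my $refused = eval { "foobar" =~ /foo${pat}bar/; 1 } ? 0 : 1;

# Inside the scope of "use re 'eval'" the interpolation is permitted.
my $string_ok;
{
    use re 'eval';
    $string_ok = ("foobar" =~ /foo${pat}bar/) ? 1 : 0;
}

print "qr_ok=$qr_ok refused=$refused string_ok=$string_ok\n";
```

       Each of the three flags should come out as 1: the "qr//" object is
       accepted, the plain string is rejected, and the plain string is
       accepted once "use re 'eval'" is in scope.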

   'debug' mode
       When "use re 'debug'" is in effect, perl emits debugging messages when
       compiling and using regular expressions.  The output is the same as
       that obtained by running a "-DDEBUGGING"-enabled perl interpreter with
       the -Dr switch.  It may be quite voluminous depending on the complexity
       of the match.  Using "debugcolor" instead of "debug" enables a form of
       output that can be used to get a colorful display on terminals that
       understand termcap color sequences.  Set $ENV{PERL_RE_TC} to a comma-
       separated list of "termcap" properties to use for highlighting strings
       on/off, pre-point part on/off.  See "Debugging regular expressions" in
       perldebug for additional info.

       As of 5.9.5 the directive "use re 'debug'" and its equivalents are
       lexically scoped, as the other directives are.  However they have both
       compile-time and run-time effects.

       See "Pragmatic Modules" in perlmodlib.

   'Debug' mode
       Similarly "use re 'Debug'" produces debugging output, the difference
       being that it allows the fine tuning of what debugging output will be
       emitted. Options are divided into three groups, those related to
       compilation, those related to execution and those related to special
       purposes. The options are as follows:

       Compile related options
           COMPILE
               Turns on all compile related debug options.

           PARSE
               Turns on debug output related to the process of parsing the
               pattern.

           OPTIMISE
               Enables output related to the optimisation phase of
               compilation.

           TRIEC
               Detailed info about trie compilation.

           DUMP
               Dump the final program out after it is compiled and optimised.

       Execute related options
           EXECUTE
               Turns on all execute related debug options.

           MATCH
               Turns on debugging of the main matching loop.

           TRIEE
               Extra debugging of how tries execute.

           INTUIT
               Enable debugging of start point optimisations.

       Extra debugging options
           EXTRA
               Turns on all "extra" debugging options.

           BUFFERS
               Enable debugging the capture buffer storage during match.
               Warning, this can potentially produce extremely large output.

           TRIEM
               Enable enhanced TRIE debugging. Enhances both TRIEE and TRIEC.

           STATE
               Enable debugging of states in the engine.

           STACK
               Enable debugging of the recursion stack in the engine. Enabling
               or disabling this option automatically does the same for
               debugging states as well. The output from this can be quite
               large.

           OPTIMISEM
               Enable enhanced optimisation debugging and start point
               optimisations.  Probably not useful except when debugging the
               regexp engine itself.

           OFFSETS
               Dump offset information. This can be used to see how regops
               correlate to the pattern. Output format is

                  NODENUM:POSITION[LENGTH]

               Where 1 is the position of the first char in the string. Note
               that position can be 0, or larger than the actual length of the
               pattern; likewise, length can be zero.

           OFFSETSDBG
               Enable debugging of offsets information. This emits copious
               amounts of trace information and doesn't mesh well with other
               debug options.

               Almost definitely only useful to people hacking on the offsets
               part of the debug engine.

       Other useful flags
           These are useful shortcuts to save on the typing.

           ALL Enable all options at once except OFFSETS, OFFSETSDBG and
               BUFFERS.

           All Enable DUMP and all execute options. Equivalent to:

                 use re 'debug';

           MORE
           More
               Enable TRIEM and all compile and execute options.

       As of 5.9.5 the directive "use re 'debug'" and its equivalents are
       lexically scoped, as the other directives are.  However they have both
       compile-time and run-time effects.
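
       As a small sketch of how individual options might be combined, the
       following asks only for the compiled program dump and the main
       matching loop trace, and confines them to one block.  The strings and
       variable names are illustrative; the debugging output itself is
       written to STDERR and is not reproduced here.

```perl
use strict;
use warnings;

my ($inside, $outside);
{
    # Request only DUMP and MATCH output; the pragma is lexically scoped,
    # so it stops applying at the end of this block.
    use re qw(Debug DUMP MATCH);
    $inside = ("abcabd" =~ /ab[cd]$/) ? 1 : 0;
}

# This pattern is compiled outside the block, so it is not traced.
$outside = ("abcabd" =~ /abd/) ? 1 : 0;

print "inside=$inside outside=$outside\n";
```

       Swapping "DUMP MATCH" for "All" or "More" selects the broader groups
       listed under "Other useful flags".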

   Exportable Functions
       As of perl 5.9.5 the 're' pragma contains a number of utility functions
       that may be optionally exported into the caller's namespace. They are
       listed below.

       is_regexp($ref)
           Returns true if the argument is a compiled regular expression as
           returned by "qr//", false if it is not.

           This function will not be confused by overloading or blessing. In
           internals terms, this extracts the regexp pointer out of the
           PERL_MAGIC_qr structure so it cannot be fooled.

       regexp_pattern($ref)
           If the argument is a compiled regular expression as returned by
           "qr//", then this function returns the pattern.

           In list context it returns a two element list, the first element
           containing the pattern and the second containing the modifiers used
           when the pattern was compiled.

             my ($pat, $mods) = regexp_pattern($ref);

           In scalar context it returns the same as perl would when
           stringifying a raw "qr//" with the same pattern inside.  If the
           argument is not a compiled regular expression then this routine
           returns false but defined in scalar context, and the empty list in
           list context. Thus the following

               if (regexp_pattern($ref) eq '(?i-xsm:foo)')

           will be warning free regardless of what $ref actually is.

           Like "is_regexp" this function will not be confused by overloading
           or blessing of the object.
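
       The behaviour of both functions can be sketched in a few lines.  The
       class name My::Pattern is hypothetical and exists only to show that
       blessing does not hide the compiled regexp.

```perl
use strict;
use warnings;
use re qw(is_regexp regexp_pattern);

my $plain   = qr/foo/i;
my $blessed = bless qr/bar/, 'My::Pattern';   # hypothetical class name

# List context: the pattern text and the modifiers it was compiled with.
my ($pat, $mods) = regexp_pattern($plain);

# Blessing into another class does not fool is_regexp.
my $blessed_is_re = is_regexp($blessed) ? 1 : 0;
my $string_is_re  = is_regexp('foo')    ? 1 : 0;

# For a non-regexp argument: false but defined in scalar context,
# and the empty list in list context.
my $scalar = regexp_pattern('foo');
my @list   = regexp_pattern('foo');

print "pat=$pat mods=$mods\n";
print "blessed=$blessed_is_re string=$string_is_re\n";
print "scalar defined=", (defined $scalar ? 1 : 0),
      " list size=", scalar(@list), "\n";
```

       The scalar-context string is deliberately not printed, since it is
       whatever the running perl produces when stringifying the "qr//".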

       regmust($ref)
           If the argument is a compiled regular expression as returned by
           "qr//", then this function returns what the optimiser considers to
           be the longest anchored fixed string and longest floating fixed
           string in the pattern.

           A fixed string is defined as being a substring that must appear for
           the pattern to match. An anchored fixed string is a fixed string
           that must appear at a particular offset from the beginning of the
           match. A floating fixed string is defined as a fixed string that
           can appear at any point in a range of positions relative to the
           start of the match. For example,

               use re qw(regmust);
               my $qr = qr/here .* there/x;
               my ($anchored, $floating) = regmust($qr);
               print "anchored:'$anchored'\nfloating:'$floating'\n";

           results in

               anchored:'here'
               floating:'there'

           Because the "here" is before the ".*" in the pattern, its position
           can be determined exactly. That's not true, however, for the
           "there"; it could appear at any point after where the anchored
           string appeared.  Perl uses both for its optimisations, preferring
           the longer, or, if they are equal, the floating.

           NOTE: This may not necessarily be the definitive longest anchored
           and floating string. This will be what the optimiser of the Perl
           that you are using thinks is the longest. If you believe that the
           result is wrong please report it via the perlbug utility.

       regname($name,$all)
           Returns the contents of a named buffer of the last successful
           match. If $all is true, then returns an array ref containing one
           entry per buffer, otherwise returns the first defined buffer.

       regnames($all)
           Returns a list of all of the named buffers defined in the last
           successful match. If $all is true, then it returns all names
           defined; if not, it returns only names which were involved in the
           match.

       regnames_count()
           Returns the number of distinct names defined in the pattern used
           for the last successful match.

           Note: this result is always the actual number of distinct named
           buffers defined; it may not actually match that which is returned
           by "regnames()" and related routines when those routines have not
           been called with the $all parameter set.
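
       The difference between the three named-buffer functions is easiest
       to see with a pattern in which one name is defined but takes no part
       in the match.  The following is a sketch only; the date string and
       the buffer names (year, month, day, tz) are invented for the example.

```perl
use strict;
use warnings;
use re qw(regname regnames regnames_count);

my ($year, $year_all, @used, @all, $count);

# The optional "tz" group is defined in the pattern but does not take
# part in this particular match.
if ("2011-11-04" =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)(?<tz>Z)?/) {
    $year     = regname('year');                # first defined "year" buffer
    $year_all = regname('year', 1);             # array ref, one entry per buffer
    @used     = sort { $a cmp $b } regnames();  # names involved in the match
    @all      = sort { $a cmp $b } regnames(1); # every name in the pattern
    $count    = regnames_count();               # number of distinct names
}

print "year=$year count=$count\n";
print "used=@used\n";
print "all=@all\n";
```

       Here "regnames()" leaves out "tz" while "regnames(1)" and
       "regnames_count()" still account for it, which is the discrepancy
       the note above warns about.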

SEE ALSO

       "Pragmatic Modules" in perlmodlib.

perl v5.12.4                      2011-11-04                           re(3pm)