Regexp::Pattern(3)    User Contributed Perl Documentation    Regexp::Pattern(3)

NAME
Regexp::Pattern - Convention/framework for modules that contain
collection of regexes

SPECIFICATION VERSION
0.2

VERSION
This document describes version 0.2.9 of Regexp::Pattern (from Perl
distribution Regexp-Pattern), released on 2019-06-09.

SYNOPSIS
Subroutine interface:

    use Regexp::Pattern; # exports re()

    my $re = re('YouTube::video_id');
    say "ID does not look like a YouTube video ID" unless $id =~ /\A$re\z/;

    # a dynamic pattern (generated on-demand) with generator arguments
    my $re2 = re('Example::re3', {variant=>"B"});

Hash interface (a la Regexp::Common, but simpler: a regular, non-magical
hash that is only 1-level deep):

    use Regexp::Pattern 'YouTube::video_id';
    say "ID does not look like a YouTube video ID"
        unless $id =~ /\A$RE{video_id}\z/;

    # more complex example

    use Regexp::Pattern (
        're',                                # we still want the re() function
        'Foo::bar' => (-as => 'qux'),        # the pattern will be in your $RE{qux}
        'YouTube::*',                        # wildcard import
        'Example::re3' => (variant => 'B'),  # supply generator arguments
        'JSON::*' => (-prefix => 'json_'),   # add prefix
        'License::*' => (
            -has_tag   => 'family:cc',         # select by tag
            -lacks_tag => 'type:unversioned',  # also select by lack of tag
            -suffix    => '_license',          # also add suffix
        ),
    );

DESCRIPTION
Regexp::Pattern is a convention for organizing reusable regexp patterns
in modules, as well as a framework that makes those patterns convenient
to use in your program.

Structure of an example Regexp::Pattern::* module
    package Regexp::Pattern::Example;

    our %RE = (
        # the minimum spec
        re1 => { pat => qr/\d{3}-\d{3}/ },

        # more complete spec
        re2 => {
            summary => 'This is regexp for blah',
            description => <<'_',

    A longer description.

    _
            pat => qr/\d{3}-\d{3}(?:-\d{5})?/,
            tags => ['A','B'],
            examples => [
                {
                    str => '123-456',
                    matches => 1,
                },
                {
                    summary => 'Another example that matches',
                    str => '123-456-78901',
                    matches => 1,
                },
                {
                    summary => 'An example that does not match',
                    str => '123456',
                    matches => 0,
                },
                {
                    summary => 'An example that does not get tested',
                    str => '123456',
                },
                {
                    summary => 'Another example that does not get tested nor rendered to POD',
                    str => '234567',
                    matches => 0,
                    test => 0,
                    doc => 0,
                },
            ],
        },

        # dynamic (regexp generator)
        re3 => {
            summary => 'This is a regexp for blah blah',
            description => <<'_',

    ...

    _
            gen => sub {
                my %args = @_;
                my $variant = $args{variant} || 'A';
                if ($variant eq 'A') {
                    return qr/\d{3}-\d{3}/;
                } else { # B
                    return qr/\d{3}-\d{2}-\d{5}/;
                }
            },
            gen_args => {
                variant => {
                    summary => 'Choose variant',
                    schema => ['str*', in=>['A','B']],
                    default => 'A',
                    req => 1,
                },
            },
            tags => ['B','C'],
            examples => [
                {
                    summary => 'An example that matches',
                    gen_args => {variant=>'A'},
                    str => '123-456',
                    matches => 1,
                },
                {
                    summary => "An example that doesn't match",
                    gen_args => {variant=>'B'},
                    str => '123-456',
                    matches => 0,
                },
            ],
        },

        re4 => {
            summary => 'This is a regexp that does capturing',
            tags => ['capturing'],
            pat => qr/(\d{3})-(\d{3})/,
            examples => [
                {str=>'123-456', matches=>[123, 456]},
                {str=>'foo-bar', matches=>[]},
            ],
        },

        re5 => {
            summary => 'This is another regexp that does (named) capturing and anchoring',
            tags => ['capturing', 'anchored'],
            pat => qr/^(?<cap1>\d{3})-(?<cap2>\d{3})/,
            examples => [
                {str=>'123-456', matches=>{cap1=>123, cap2=>456}},
                {str=>'something 123-456', matches=>{}},
            ],
        },
    );

A Regexp::Pattern::* module must declare a package global hash variable
named %RE. Hash keys are pattern names; hash values are pattern
definitions in the form of defhashes (see DefHash).

A pattern name should be a simple identifier that matches this regexp:
"/\A[A-Za-z_][A-Za-z_0-9]*\z/". The definition for the qualified
pattern name "Foo::Bar::baz" can then be located in
%Regexp::Pattern::Foo::Bar::RE under the hash key "baz".
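
The naming rule can be illustrated with a short sketch. The helper
"split_qualified_name" below is hypothetical (it is not part of
Regexp::Pattern); it only shows how a qualified name could be split into
the package that holds %RE and the key inside that hash.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regexp::Pattern): split a qualified
# pattern name into the package holding %RE and the hash key.
sub split_qualified_name {
    my ($qname) = @_;

    # Everything before the last "::" is the module part.
    my ($module, $key) = $qname =~ /\A(\w+(?:::\w+)*)::(\w+)\z/
        or die "Invalid qualified pattern name '$qname'";

    # The pattern name itself must be a simple identifier.
    $key =~ /\A[A-Za-z_][A-Za-z_0-9]*\z/
        or die "Invalid pattern name '$key'";

    return ("Regexp::Pattern::$module", $key);
}

my ($package, $key) = split_qualified_name('Foo::Bar::baz');
print "$package / $key\n";   # Regexp::Pattern::Foo::Bar / baz
```

A name whose last component starts with a digit, such as "Foo::9x", is
rejected by the identifier check.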

At a minimum, a pattern definition hash should be:

    { pat => qr/.../ }

You can add more properties from the defhash specification, e.g. summary,
description, tags, and so on. For example (taken from
Regexp::Pattern::CPAN):

    {
        summary => 'PAUSE author ID, or PAUSE ID for short',
        pat => qr/[A-Z][A-Z0-9]{1,8}/,
        description => <<~HERE,
            I'm not sure whether PAUSE allows digit for the first letter. For safety
            I'm assuming no.
            HERE
        examples => [
            {str=>'PERLANCAR', matches=>1},
            {str=>'BAD ID', matches=>0},
        ],
    }

Examples. Your regexp specification can include an "examples" property
(see above for an example). The value of the "examples" property is an
array, each element of which should be a defhash. For each example, you
should specify at a minimum "str" (the string to be matched by the
regexp), "gen_args" (hash, for dynamic patterns: the arguments to use
when generating the regexp pattern), and "matches" (a boolean value that
specifies whether the regexp should match the string, or an array/hash
that specifies the captures). You can of course specify other defhash
properties (e.g. "summary", "description", etc). Other example
properties might be introduced in the future.
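
To make the semantics of "matches" concrete, here is a minimal sketch of
how an example entry could be checked against its pattern. The function
"check_example" is hypothetical and is not how the Dist::Zilla plugin or
Test::Regexp::Pattern is implemented; it handles only the boolean and
array forms of "matches" and skips examples that have no "matches"
property or that set "test => 0".

```perl
use strict;
use warnings;

# A pattern definition in the shape described above, inlined so the
# sketch runs without any Regexp::Pattern::* module installed.
my $def = {
    pat      => qr/(\d{3})-(\d{3})/,
    examples => [
        { str => '123-456', matches => [123, 456] },
        { str => 'foo-bar', matches => [] },
        { str => '999-000', matches => 1 },
        { str => 'untested' },               # no "matches": skipped
    ],
};

# Hypothetical checker: returns 1 if the example behaves as declared,
# 0 if it does not, and undef if the example is not meant to be tested.
sub check_example {
    my ($re, $eg) = @_;
    return undef unless exists $eg->{matches};
    return undef if exists $eg->{test} && !$eg->{test};

    my @caps    = $eg->{str} =~ $re;    # captures, or empty list on failure
    my $matched = @caps ? 1 : 0;
    my $want    = $eg->{matches};

    if (ref $want eq 'ARRAY') {
        return $matched ? 0 : 1 unless @$want;   # [] means "must not match"
        return 0 unless $matched;
        return "@caps" eq "@$want" ? 1 : 0;      # compare captures in order
    }

    # Boolean form: both true or both false.
    my $agree = ($matched && $want) || (!$matched && !$want);
    return $agree ? 1 : 0;
}

for my $eg (@{ $def->{examples} }) {
    my $res = check_example($def->{pat}, $eg);
    printf "%-10s %s\n", $eg->{str},
        !defined $res ? 'skipped' : $res ? 'ok' : 'not ok';
}
```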

If you use Dist::Zilla to build your distribution, you can use the
plugin [Regexp::Pattern] to test the examples during building, and the
Pod::Weaver plugin [-Regexp::Pattern] to render the examples in your
POD.

Using a Regexp::Pattern::* module
Standalone

A Regexp::Pattern::* module can be used standalone (i.e. without going
through the Regexp::Pattern framework), as it simply contains data that
can be accessed by normal means, e.g.:

    use Regexp::Pattern::Example;

    say "Input does not match blah"
        unless $input =~ /\A$Regexp::Pattern::Example::RE{re1}{pat}\z/;

Via Regexp::Pattern, sub interface

Regexp::Pattern (this module) also provides the "re()" function to help
retrieve the regexp pattern. See "re" for more details.

Via Regexp::Pattern, hash interface

Additionally, Regexp::Pattern (since v0.2.0) lets you import regexp
patterns into your %RE package hash variable, a la Regexp::Common (but
simpler because the hash is just a regular hash, only 1-level deep, and
not magical).

To import, you specify qualified pattern names as the import arguments:

    use Regexp::Pattern 'Q::pat1', 'Q::pat2', ...;

Each qualified pattern name can optionally be followed by a list of
name-value pairs. A pair name can be an option name (a dash followed by
a word, e.g. "-as", "-prefix") or a generator argument name for a
dynamic pattern.
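
The distinction between option names and generator arguments can be
sketched as follows. The helper "split_pairs" is hypothetical and only
illustrates the dash convention; the generator is a condensed copy of
the "re3" entry from the example module above.

```perl
use strict;
use warnings;

# Generator condensed from the re3 entry of the example module.
my $gen = sub {
    my %args = @_;
    my $variant = $args{variant} || 'A';
    return $variant eq 'A' ? qr/\d{3}-\d{3}/ : qr/\d{3}-\d{2}-\d{5}/;
};

# Hypothetical helper: names starting with a dash are import options,
# everything else is passed on to the generator.
sub split_pairs {
    my %pairs = @_;
    my (%opts, %gen_args);
    for my $name (keys %pairs) {
        if ($name =~ /\A-/) { $opts{$name}     = $pairs{$name} }
        else                { $gen_args{$name} = $pairs{$name} }
    }
    return (\%opts, \%gen_args);
}

my ($opts, $gen_args) = split_pairs(-as => 'b_form', variant => 'B');
my $re = $gen->(%$gen_args);

print $opts->{-as}, " matches\n" if '123-45-67890' =~ /\A$re\z/;
```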

Wildcard import. Instead of a qualified pattern name, you can use
'Module::SubModule::*' wildcard syntax to import all patterns from a
pattern module.

Importing into a different name. You can add the import option "-as" to
import into a different name, for example:

    use Regexp::Pattern 'YouTube::video_id' => (-as => 'yt_id');

Prefix and suffix. You can also add a prefix and/or suffix to the
imported name:

    use Regexp::Pattern 'Example::*' => (-prefix => 'example_');
    use Regexp::Pattern 'Example::*' => (-suffix => '_sample');

Filtering. When wildcard-importing, you can select the patterns you
want using a combination of these options: "-has_tag" (only select
patterns that have a specified tag), "-lacks_tag" (only select patterns
that do not have a specified tag).
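
The combined effect of filtering and renaming can be sketched without
the framework. The function "select_patterns" below is a hypothetical
stand-in for a wildcard import, and %SOURCE is a trimmed copy of the
example module's %RE; the real import logic may differ in details.

```perl
use strict;
use warnings;

# Stand-in for %Regexp::Pattern::Example::RE, trimmed to pat and tags.
my %SOURCE = (
    re2 => { pat => qr/\d{3}-\d{3}(?:-\d{5})?/, tags => ['A', 'B'] },
    re3 => { pat => qr/\d{3}-\d{2}-\d{5}/,      tags => ['B', 'C'] },
    re4 => { pat => qr/(\d{3})-(\d{3})/,        tags => ['capturing'] },
);

# Hypothetical helper mimicking a wildcard import with options.
sub select_patterns {
    my ($source, %opt) = @_;
    my %out;
    for my $name (sort keys %$source) {
        my %has = map { $_ => 1 } @{ $source->{$name}{tags} || [] };

        # Apply the tag filters, if given.
        next if defined $opt{-has_tag}   && !$has{ $opt{-has_tag} };
        next if defined $opt{-lacks_tag} &&  $has{ $opt{-lacks_tag} };

        # Build the imported name from prefix, original name, and suffix.
        my $as = ($opt{-prefix} // '') . $name . ($opt{-suffix} // '');
        $out{$as} = $source->{$name}{pat};
    }
    return %out;
}

my %RE = select_patterns(\%SOURCE,
    -has_tag => 'B', -lacks_tag => 'C', -prefix => 'example_');
print join(', ', sort keys %RE), "\n";   # example_re2
```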

Recommendations for writing the regex patterns
· Regexp pattern should be written as a "qr//" literal

  Using a string literal is less desirable. That is:

      pat => qr/foo[abc]+/,

  is preferred over:

      pat => 'foo[abc]+',

· Regexp pattern should not be anchored (unless really necessary)

  That is:

      pat => qr/foo/,

  is preferred over:

      pat => qr/^foo/, # or qr/foo$/, or qr/\Afoo\z/

  Adding anchors limits the reusability of the pattern. When
  composing patterns, users can add anchors themselves if needed.

  When you define an anchored pattern, adding the tag "anchored" is
  recommended:

      tags => ['anchored'],

· Regexp pattern should not contain capture groups (unless really
  necessary)

  Adding capture groups limits the reusability of the pattern because
  it can affect the groups of the composed pattern. When composing
  patterns, users can add captures themselves if needed.

  When you define a capturing pattern, adding the tag "capturing" is
  recommended:

      tags => ['capturing'],
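
As an illustration of why unanchored, non-capturing patterns compose
well, consider the sketch below. The patterns $area and $number are made
up for this example and do not come from any Regexp::Pattern::* module.

```perl
use strict;
use warnings;

# Two reusable building blocks: unanchored and without capture groups.
my $area   = qr/\d{3}/;
my $number = qr/\d{3}-\d{4}/;

# The caller decides where the anchors and the captures go.
my $phone = qr/\A\((?<area>$area)\) (?<number>$number)\z/;

if ('(555) 123-4567' =~ $phone) {
    print "area=$+{area} number=$+{number}\n";   # area=555 number=123-4567
}

# Had $area contained its own capture group, every positional group to
# its right in the composed pattern would have shifted by one.
```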

FUNCTIONS
re
Exported by default. Get a regexp pattern by name from a
"Regexp::Pattern::*" module.

Usage:

    re($name[, \%args ]) => $re

$name is MODULE_NAME::PATTERN_NAME, where MODULE_NAME is the name of a
"Regexp::Pattern::*" module without the "Regexp::Pattern::" prefix and
PATTERN_NAME is a key of the %RE package global hash in the module. A
dynamic pattern can accept arguments for its generator, and you can
pass them as a hashref in the second argument of "re()".

Anchoring. You can also put "-anchor => 1" in %args. This will
conveniently wrap the regex inside "qr/\A(?:...)\z/".

Dies when the pattern named $name cannot be found (either the module
cannot be loaded or a pattern with that name is not found in the
module).
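
The effect of the anchoring wrapper can be seen in isolation. The
function "anchor" below is a hypothetical stand-in that applies the same
"qr/\A(?:...)\z/" wrapping described above; it does not call the real
"re()" and makes no claim about how the module implements the option.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the wrapping that "-anchor => 1" is described
# as performing; the real re() is not called here.
sub anchor {
    my ($pat) = @_;
    return qr/\A(?:$pat)\z/;
}

my $re = qr/\d{3}-\d{3}/;
print "unanchored: match\n"  if     'id 123-456' =~ $re;
print "anchored: no match\n" unless 'id 123-456' =~ anchor($re);
print "anchored: match\n"    if     '123-456'    =~ anchor($re);

# Why the (?:...) group matters for a string pattern with alternation:
# without it, \A binds only to "cat" and \z only to "dog".
print "ungrouped anchors leak\n" if     'hotdog' =~ /\Acat|dog\z/;
print "grouped anchors hold\n"   unless 'hotdog' =~ anchor('cat|dog');
```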

HOMEPAGE
Please visit the project's homepage at
<https://metacpan.org/release/Regexp-Pattern>.

SOURCE
Source repository is at
<https://github.com/perlancar/perl-Regexp-Pattern>.

BUGS
Please report any bugs or feature requests on the bugtracker website
<https://rt.cpan.org/Public/Dist/Display.html?Name=Regexp-Pattern>

When submitting a bug or request, please include a test-file or a patch
to an existing test-file that illustrates the bug or desired feature.

SEE ALSO
Regexp::Common. Regexp::Pattern is an alternative to Regexp::Common.
Regexp::Pattern offers simplicity and lower startup overhead. Instead
of a magic hash, you retrieve available regexes from a normal data
structure or via the provided "re()" function. Regexp::Pattern also
provides a hash interface, although the hash is not magic.

Regexp::Common::RegexpPattern, a bridge module to use patterns in
"Regexp::Pattern::*" modules via Regexp::Common.

Regexp::Pattern::RegexpCommon, a bridge module to use patterns in
"Regexp::Common::*" modules via Regexp::Pattern.

App::RegexpPatternUtils

If you use Dist::Zilla: Dist::Zilla::Plugin::Regexp::Pattern,
Pod::Weaver::Plugin::Regexp::Pattern,
Dist::Zilla::Plugin::AddModule::RegexpCommon::FromRegexpPattern,
Dist::Zilla::Plugin::AddModule::RegexpPattern::FromRegexpCommon.

Test::Regexp::Pattern and test-regexp-pattern.

AUTHOR
perlancar <perlancar@cpan.org>

COPYRIGHT AND LICENSE
This software is copyright (c) 2019, 2018, 2016 by perlancar@cpan.org.

This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.

perl v5.30.0                      2019-07-26                Regexp::Pattern(3)