DateTime::Format::Builder(3)   User Contributed Perl Documentation   DateTime::Format::Builder(3)

6 DateTime::Format::Builder - Create DateTime parser classes and objects.
7
9 version 0.81
10
    package DateTime::Format::Brief;

    use DateTime::Format::Builder
    (
        parsers => {
            parse_datetime => [
                {
                    regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                    params => [qw( year month day hour minute second )],
                },
                {
                    regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                    params => [qw( year month day )],
                },
            ],
        }
    );
29
31 DateTime::Format::Builder creates DateTime parsers. Many string
32 formats of dates and times are simple and just require a basic regular
33 expression to extract the relevant information. Builder provides a
34 simple way to do this without writing reams of structural code.
35
Builder provides a number of methods, most of which you'll never need,
or at least rarely need. They exist mainly to expose the module's
innards to subclasses, or for when you need to do something slightly
beyond what I expected.
40
This creates the end methods. The resulting coderefs die on a bad
parse and return "DateTime" objects on a good parse.
43
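As a rough illustration of those end methods, the sketch below builds a class with the same two regex specifications as the synopsis and calls the generated method. The package name My::Format::Brief is made up for this example.

```perl
use strict;
use warnings;

# My::Format::Brief is a hypothetical package name used only in this sketch.
package My::Format::Brief;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

# The generated method can be called on the class name.
my $date_only = My::Format::Brief->parse_datetime('20030716');
my $full      = My::Format::Brief->parse_datetime('20030716101155');

print $date_only->ymd, "\n";    # 2003-07-16
print $full->hms,      "\n";    # 10:11:55
```

The specifications are tried in the order given, so the eight-digit input falls through the first regex and is picked up by the second.
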
45 See DateTime::Format::Builder::Tutorial.
46
I will often speak of "undef" being returned. That's not strictly
true.
50
51 When a simple single specification is given for a method, the method
52 isn't given a single parser directly. It's given a wrapper that will
53 call "on_fail()" if the single parser returns "undef". The single
54 parser must return "undef" so that a multiple parser can work nicely
55 and actual errors can be thrown from any of the callbacks.
56
Similarly, a multiple parser will only call "on_fail()" right at the
end, once it has tried every parser it could.
59
60 "on_fail()" (see later) is defined, by default, to throw an error.
61
62 Multiple parser specifications can also specify "on_fail" with a
63 coderef as an argument in the options block. This will take precedence
64 over the inheritable and over-ridable method.
65
66 That said, don't throw real errors from callbacks in multiple parser
67 specifications unless you really want parsing to stop right there and
68 not try any other parsers.
69
In summary: calling a method will result in either a "DateTime" object
being returned or an error being thrown (unless you've overridden
"on_fail()" or "create_method()", or you've specified an "on_fail" key
to a multiple parser specification).
74
75 Individual parsers (be they multiple parsers or single parsers) will
76 return either the "DateTime" object or "undef".
77
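In practice this means a caller that wants to tolerate bad input has to trap the error. A minimal sketch, assuming the default "on_fail()" is in place; the package My::Format::Strict and the helper try_parse() are hypothetical names for this example:

```perl
use strict;
use warnings;

# My::Format::Strict is a hypothetical package name.
package My::Format::Strict;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

package main;

# try_parse() is a hypothetical helper: it returns a DateTime object,
# or undef if the generated method threw an error.
sub try_parse {
    my ($string) = @_;
    my $dt = eval { My::Format::Strict->parse_datetime($string) };
    return $dt;
}

my $good = try_parse('2003-07-16');
my $bad  = try_parse('not a date');

my $good_text = defined $good ? $good->ymd : 'parse failed';
my $bad_text  = defined $bad  ? $bad->ymd  : 'parse failed';

print "$good_text\n";    # 2003-07-16
print "$bad_text\n";     # parse failed
```
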
79 A single specification is a hash ref of instructions on how to create a
80 parser.
81
82 The precise set of keys and values varies according to parser type.
83 There are some common ones though:
84
85 · length is an optional parameter that can be used to specify that
86 this particular regex is only applicable to strings of a certain
87 fixed length. This can be used to make parsers more efficient. It's
88 strongly recommended that any parser that can use this parameter
89 does.
90
91 You may happily specify the same length twice. The parsers will be
92 tried in order of specification.
93
94 You can also specify multiple lengths by giving it an arrayref of
95 numbers rather than just a single scalar. If doing so, please keep
96 the number of lengths to a minimum.
97
98 If any specifications without lengths are given and the particular
99 length parser fails, then the non-length parsers are tried.
100
101 This parameter is ignored unless the specification is part of a
102 multiple parser specification.
103
· label provides a name for the specification and is passed to some
of the callbacks about to be mentioned.
106
107 · on_match and on_fail are callbacks. Both routines will be called
108 with parameters of:
109
110 · input, being the input to the parser (after any preprocessing
111 callbacks).
112
113 · label, being the label of the parser, if there is one.
114
· self, being the object on which the method has been invoked
(which may just be a class name). Naturally, you can then
invoke your own methods on it to get the information you want.
118
119 · args, being an arrayref of any passed arguments, if any. If
120 there were no arguments, then this parameter is not given.
121
122 These routines will be called depending on whether the regex match
123 succeeded or failed.
124
125 · preprocess is a callback provided for cleaning up input prior to
126 parsing. It's given a hash as arguments with the following keys:
127
128 · input being the datetime string the parser was given (if using
129 multiple specifications and an overall preprocess then this is
130 the date after it's been through that preprocessor).
131
132 · parsed being the state of parsing so far. Usually empty at this
133 point unless an overall preprocess was given. Items may be
134 placed in it and will be given to any postprocessor and
135 "DateTime->new" (unless the postprocessor deletes it).
136
137 · self, args, label as per on_match and on_fail.
138
The return value from the routine is what is given to the regex.
Note that this is the last code stop before the match.
141
142 Note: mixing length and a preprocess that modifies the length of
143 the input string is probably not what you meant to do. You probably
144 meant to use the multiple parser variant of preprocess which is
145 done before any length calculations. This "single parser" variant
146 of preprocess is performed after any length calculations.
147
148 · postprocess is the last code stop before "DateTime->new()" is
149 called. It's given the same arguments as preprocess. This allows it
150 to modify the parsed parameters after the parse and before the
151 creation of the object. For example, you might use:
152
    {
        regex       => qr/^(\d\d) (\d\d) (\d\d)$/,
        params      => [qw( year month day )],
        postprocess => \&_fix_year,
    }
158
159 where "_fix_year" is defined as:
160
    sub _fix_year
    {
        my %args = @_;
        my ($date, $p) = @args{qw( input parsed )};
        $p->{year} += $p->{year} > 69 ? 1900 : 2000;
        return 1;
    }
168
169 This will cause the two digit years to be corrected according to
170 the cut off. If the year was '69' or lower, then it is made into
171 2069 (or 2045, or whatever the year was parsed as). Otherwise it is
172 assumed to be 19xx. The DateTime::Format::Mail module uses code
173 similar to this (only it allows the cut off to be configured and it
174 doesn't use Builder).
175
176 Note: It is very important to return an explicit value from the
177 postprocess callback. If the return value is false then the parse
178 is taken to have failed. If the return value is true, then the
179 parse is taken to have succeeded and "DateTime->new()" is called.
180
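The common keys can be combined in one multiple parser specification. The following sketch uses length, label, on_match and a per-parser preprocess together. The package name, the @MATCHED variable and the two helper subs are assumptions made for this example only.

```perl
use strict;
use warnings;

# My::Format::Lengths is a hypothetical package name.
package My::Format::Lengths;

# Hypothetical record of which labelled specification matched.
our @MATCHED;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            {
                label    => 'compact',
                length   => 8,
                regex    => qr/^(\d{4})(\d\d)(\d\d)$/,
                params   => [qw( year month day )],
                on_match => \&_remember,
            },
            {
                label      => 'separated',
                length     => 10,
                preprocess => \&_slashes_to_dashes,
                regex      => qr/^(\d{4})-(\d\d)-(\d\d)$/,
                params     => [qw( year month day )],
                on_match   => \&_remember,
            },
        ],
    },
);

# on_match callback: receives a hash including input and label.
sub _remember {
    my %args = @_;
    push @MATCHED, $args{label};
    return;
}

# Per-parser preprocess: its return value is what the regex sees.
# It does not change the string length, so it is safe to mix with length.
sub _slashes_to_dashes {
    my %args = @_;
    ( my $input = $args{input} ) =~ tr{/}{-};
    return $input;
}

package main;

my $first  = My::Format::Lengths->parse_datetime('20030716');
my $second = My::Format::Lengths->parse_datetime('2003/07/16');

print join( ', ', @My::Format::Lengths::MATCHED ), "\n";    # compact, separated
print $second->ymd, "\n";                                   # 2003-07-16
```
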
181 See the documentation for the individual parsers for their valid keys.
182
183 Parsers at the time of writing are:
184
185 · DateTime::Format::Builder::Parser::Regex - provides regular
186 expression based parsing.
187
188 · DateTime::Format::Builder::Parser::Strptime - provides strptime
189 based parsing.
190
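The two parser types can sit side by side in one method. A small sketch, assuming DateTime::Format::Strptime is installed; the package name My::Format::Log is hypothetical:

```perl
use strict;
use warnings;

# My::Format::Log is a hypothetical package name.
package My::Format::Log;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            # Strptime-based specification: the value is a strptime pattern.
            { strptime => '%Y-%m-%d %H:%M:%S' },

            # Regex-based specification, tried if the one above fails.
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

my $stamp = My::Format::Log->parse_datetime('2003-07-16 10:11:55');
my $day   = My::Format::Log->parse_datetime('20030716');

print $stamp->iso8601, "\n";    # 2003-07-16T10:11:55
print $day->ymd,       "\n";    # 2003-07-16
```
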
191 Subroutines / coderefs as specifications.
192 A single parser specification can be a coderef. This was added mostly
193 because it could be and because I knew someone, somewhere, would want
194 to use it.
195
196 If the specification is a reference to a piece of code, be it a
197 subroutine, anonymous, or whatever, then it's passed more or less
198 straight through. The code should return "undef" in event of failure
199 (or any false value, but "undef" is strongly preferred), or a true
200 value in the event of success (ideally a "DateTime" object or some
201 object that has the same interface).
202
203 This all said, I generally wouldn't recommend using this feature unless
204 you have to.
205
206 Callbacks
207 I mention a number of callbacks in this document.
208
209 Any time you see a callback being mentioned, you can, if you like,
210 substitute an arrayref of coderefs rather than having the straight
211 coderef.
212
214 These are very easily described as an array of single specifications.
215
216 Note that if the first element of the array is an arrayref, then you're
217 specifying options.
218
219 · preprocess lets you specify a preprocessor that is called before
220 any of the parsers are tried. This lets you do things like strip
221 off timezones or any unnecessary data. The most common use people
222 have for it at present is to get the input date to a particular
223 length so that the length is usable (DateTime::Format::ICal would
224 use it to strip off the variable length timezone).
225
226 Arguments are as for the single parser preprocess variant with the
227 exception that label is never given.
228
229 · on_fail should be a reference to a subroutine that is called if the
230 parser fails. If this is not provided, the default action is to
231 call "DateTime::Format::Builder::on_fail", or the "on_fail" method
232 of the subclass of DTFB that was used to create the parser.
233
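As an illustration of the options block, the sketch below uses an overall preprocess to strip a trailing "Z" before the length-based parsers are chosen, and records the time zone in the parsed workspace. The package name and the "Z" convention are assumptions for this example.

```perl
use strict;
use warnings;

# My::Format::Zoned is a hypothetical package name.
package My::Format::Zoned;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            # First element is an arrayref, so it is the options block.
            [ preprocess => \&_strip_utc ],
            {
                length => 8,
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
            {
                length => 14,
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
        ],
    },
);

# Overall preprocess: runs before any length calculation.
sub _strip_utc {
    my %args = @_;
    my ( $input, $parsed ) = @args{qw( input parsed )};

    # Anything placed in the workspace is later given to DateTime->new.
    $parsed->{time_zone} = 'UTC' if $input =~ s/Z$//;

    return $input;
}

package main;

my $utc      = My::Format::Zoned->parse_datetime('20030716101155Z');
my $floating = My::Format::Zoned->parse_datetime('20030716');

print $utc->time_zone->name,      "\n";    # UTC
print $floating->time_zone->name, "\n";    # floating
```
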
235 Builder allows you to plug in a fair few callbacks, which can make
236 following how a parse failed (or succeeded unexpectedly) somewhat
237 tricky.
238
239 For Single Specifications
240 A single specification will do the following:
241
242 User calls parser:
243
244 my $dt = $class->parse_datetime( $string );
245
246 1. preprocess is called. It's given $string and a reference to the
247 parsing workspace hash, which we'll call $p. At this point, $p is
248 empty. The return value is used as $date for the rest of this
249 single parser. Anything put in $p is also used for the rest of
250 this single parser.
251
252 2. regex is applied.
253
254 3. If regex did not match, then on_fail is called (and is given $date
255 and also label if it was defined). Any return value is ignored and
256 the next thing is for the single parser to return "undef".
257
258 If regex did match, then on_match is called with the same arguments
259 as would be given to on_fail. The return value is similarly
260 ignored, but we then move to step 4 rather than exiting the parser.
261
4. postprocess is called with $date and a filled out $p. The return
value is taken as an indication of whether the parse was a success
or not. If it wasn't a success then the single parser will exit at
this point, returning undef.
266
267 5. "DateTime->new()" is called and the user is given the resultant
268 "DateTime" object.
269
270 See the section on error handling regarding the "undef"s mentioned
271 above.
272
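One way to watch this flow is to have every callback record its own name. The sketch below does that for a successful and a failed parse. The package name and the @TRACE variable are hypothetical, and the default "on_fail()" method is assumed, so the failed parse is wrapped in eval.

```perl
use strict;
use warnings;

# My::Format::Traced is a hypothetical package name.
package My::Format::Traced;

# Hypothetical log of the callbacks, in the order they ran.
our @TRACE;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => {
            regex      => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params     => [qw( year month day )],
            preprocess => sub {
                my %args = @_;
                push @TRACE, 'preprocess';
                return $args{input};    # passed on to the regex unchanged
            },
            on_match    => sub { push @TRACE, 'on_match';    return },
            on_fail     => sub { push @TRACE, 'on_fail';     return },
            postprocess => sub { push @TRACE, 'postprocess'; return 1 },
        },
    },
);

package main;

My::Format::Traced->parse_datetime('2003-07-16');
my $success_trace = join ' ', @My::Format::Traced::TRACE;

@My::Format::Traced::TRACE = ();
eval { My::Format::Traced->parse_datetime('nonsense') };
my $failure_trace = join ' ', @My::Format::Traced::TRACE;

print "$success_trace\n";    # preprocess on_match postprocess
print "$failure_trace\n";    # preprocess on_fail
```
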
273 For Multiple Specifications
274 With multiple specifications:
275
276 User calls parser:
277
278 my $dt = $class->complex_parse( $string );
279
280 1. The overall preprocessor is called and is given $string and the
281 hashref $p (identically to the per parser preprocess mentioned in
282 the previous flow).
283
284 If the callback modifies $p then a copy of $p is given to each of
285 the individual parsers. This is so parsers won't accidentally
286 pollute each other's workspace.
287
288 2. If an appropriate length specific parser is found, then it is
289 called and the single parser flow (see the previous section) is
290 followed, and the parser is given a copy of $p and the return value
291 of the overall preprocessor as $date.
292
If a "DateTime" object was returned, we go straight back to the
user.
295
296 If no appropriate parser was found, or the parser returned "undef",
297 then we progress to step 3!
298
299 3. Any non-length based parsers are tried in the order they were
300 specified.
301
302 For each of those the single specification flow above is performed,
303 and is given a copy of the output from the overall preprocessor.
304
305 If a real "DateTime" object is returned then we exit back to the
306 user.
307
308 If no parser could parse, then an error is thrown.
309
310 See the section on error handling regarding the "undef"s mentioned
311 above.
312
314 In the general course of things you won't need any of the methods. Life
315 often throws unexpected things at us so the methods are all available
316 for use.
317
318 import
319 "import()" is a wrapper for "create_class()". If you specify the class
320 option (see documentation for "create_class()") it will be ignored.
321
322 create_class
323 This method can be used as the runtime equivalent of "import()". That
324 is, it takes the exact same parameters as when one does:
325
326 use DateTime::Format::Builder ( blah blah blah )
327
328 That can be (almost) equivalently written as:
329
330 use DateTime::Format::Builder;
331 DateTime::Format::Builder->create_class( blah blah blah );
332
333 The difference being that the first is done at compile time while the
334 second is done at run time.
335
In the tutorial I said there were only two parameters at present. I
lied. There are actually five of them.
338
339 · parsers takes a hashref of methods and their parser specifications.
340 See the DateTime::Format::Builder::Tutorial for details.
341
342 Note that if you define a subroutine of the same name as one of the
343 methods you define here, an error will be thrown.
344
345 · constructor determines whether and how to create a "new()" function
346 in the new class. If given a true value, a constructor is created.
347 If given a false value, one isn't.
348
349 If given an anonymous sub or a reference to a sub then that is used
350 as "new()".
351
352 The default is 1 (that is, create a constructor using our default
353 code which simply creates a hashref and blesses it).
354
355 If your class defines its own "new()" method it will not be
356 overwritten. If you define your own "new()" and also tell Builder
357 to define one an error will be thrown.
358
359 · verbose takes a value. If the value is undef, then logging is
360 disabled. If the value is a filehandle then that's where logging
361 will go. If it's a true value, then output will go to "STDERR".
362
363 Alternatively, call "$DateTime::Format::Builder::verbose()" with
364 the relevant value. Whichever value is given more recently is
365 adhered to.
366
Be aware that verbosity is a global setting.
368
369 · class is optional and specifies the name of the class in which to
370 create the specified methods.
371
372 If using this method in the guise of "import()" then this field
373 will cause an error so it is only of use when calling as
374 "create_class()".
375
· version is also optional and specifies the value to give $VERSION
in the class. It's generally not recommended unless you're
combining with the class option. An "ExtUtils::MakeMaker" / "CPAN"
compliant version specification is much better.
380
381 In addition to creating any of the methods it also creates a "new()"
382 method that can instantiate (or clone) objects.
383
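A minimal sketch of the run-time form, using the class and parsers parameters described above. The class name My::Format::Runtime and the method name parse_date are chosen for illustration only.

```perl
use strict;
use warnings;

use DateTime::Format::Builder;

# My::Format::Runtime is a hypothetical class name; it is created here,
# at run time, rather than by a "use" line inside its own package.
DateTime::Format::Builder->create_class(
    class   => 'My::Format::Runtime',
    parsers => {
        parse_date => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# The generated method works as a class method ...
my $from_class = My::Format::Runtime->parse_date('2003-07-16');

# ... and on an object made by the default generated constructor.
my $formatter   = My::Format::Runtime->new;
my $from_object = $formatter->parse_date('1979-07-16');

print $from_class->ymd,  "\n";    # 2003-07-16
print $from_object->ymd, "\n";    # 1979-07-16
```
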
385 In the rest of the documentation I've often lied in order to get some
386 of the ideas across more easily. The thing is, this module's very
387 flexible. You can get markedly different behaviour from simply
388 subclassing it and overriding some methods.
389
390 create_method
391 Given a parser coderef, returns a coderef that is suitable to be a
392 method.
393
394 The default action is to call "on_fail()" in the event of a non-parse,
395 but you can make it do whatever you want.
396
397 on_fail
This is called in the event of a non-parse (unless you've overridden
"create_method()" to do something else).
400
401 The single argument is the input string. The default action is to call
402 "croak()". Above, where I've said parsers or methods throw errors, this
403 is the method that is doing the error throwing.
404
405 You could conceivably override this method to, say, return "undef".
406
408 The methods listed in the METHODS section are all you generally need
409 when creating your own class. Sometimes you may not want a full blown
410 class to parse something just for this one program. Some methods are
411 provided to make that task easier.
412
413 new
414 The basic constructor. It takes no arguments, merely returns a new
415 "DateTime::Format::Builder" object.
416
417 my $parser = DateTime::Format::Builder->new();
418
419 If called as a method on an object (rather than as a class method),
420 then it clones the object.
421
422 my $clone = $parser->new();
423
424 clone
425 Provided for those who prefer an explicit "clone()" method rather than
426 using "new()" as an object method.
427
428 my $clone_of_clone = $clone->clone();
429
430 parser
431 Given either a single or multiple parser specification, sets the object
432 to have a parser based on that specification.
433
    $parser->parser(
        regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
        params => [qw( year month day )],
    );
438
439 The arguments given to "parser()" are handed directly to
440 "create_parser()". The resultant parser is passed to "set_parser()".
441
442 If called as an object method, it returns the object.
443
444 If called as a class method, it creates a new object, sets its parser
445 and returns that object.
446
447 set_parser
448 Sets the parser of the object to the given parser.
449
450 $parser->set_parser( $coderef );
451
452 Note: this method does not take specifications. It also does not take
453 anything except coderefs. Luckily, coderefs are what most of the other
454 methods produce.
455
456 The method return value is the object itself.
457
458 get_parser
459 Returns the parser the object is using.
460
461 my $code = $parser->get_parser();
462
463 parse_datetime
464 Given a string, it calls the parser and returns the "DateTime" object
465 that results.
466
467 my $dt = $parser->parse_datetime( "1979 07 16" );
468
469 The return value, if not a "DateTime" object, is whatever the parser
470 wants to return. Generally this means that if the parse failed an error
471 will be thrown.
472
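Put together, the object interface can be used for a one-off parser without creating a class. A short sketch, using only the methods described above; the regex here matches literal spaces (no /x flag) so that it accepts the "1979 07 16" input.

```perl
use strict;
use warnings;

use DateTime::Format::Builder;

# Build a parser object and give it a single specification.
my $parser = DateTime::Format::Builder->new;
$parser->parser(
    regex  => qr/^(\d{4}) (\d\d) (\d\d)$/,
    params => [qw( year month day )],
);

my $dt = $parser->parse_datetime('1979 07 16');
print $dt->ymd, "\n";    # 1979-07-16

# A clone shares the same parser coderef.
my $clone = $parser->clone;
my $other = $clone->parse_datetime('2003 07 16');
print $other->ymd, "\n";    # 2003-07-16

# parser() called as a class method creates the object in one step.
my $quick = DateTime::Format::Builder->parser(
    regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
    params => [qw( year month day )],
);
my $third = $quick->parse_datetime('2003-07-16');
print $third->ymd, "\n";    # 2003-07-16
```
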
473 format_datetime
If you call this function, it will throw an error.
475
Some longer examples are provided in the distribution. These implement
some of the common parsing DateTime modules using Builder. Each of them
is, or was, a drop-in replacement for the corresponding module at the
time of writing.
481
483 Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
484 DateTime::Format::ICal and DateTime::Format::MySQL, and some much
485 needed review.
486
487 Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
488 writing the multilength code (both one length with multiple parsers and
489 single parser with multiple lengths), blame for the Regex custom
490 constructor code, spotting a bug in Dispatch, and more much needed
491 review.
492
493 Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
494 DateTime::Format::W3CDTF and the encouragement to rewrite these docs
495 almost 100%!
496
497 Claus Färber (CFAERBER) for having me get around to fixing the auto-
498 constructor writing, providing the 'args'/'self' patch, and suggesting
499 the multi-callbacks.
500
501 Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
502 supports.
503
504 Matthew McGillis for pointing out that "on_fail" overriding should be
505 simpler.
506
507 Simon Cozens (SIMON) for saying it was cool.
508
510 Support for this module is provided via the datetime@perl.org email
511 list. See http://lists.perl.org/ for more details.
512
Alternatively, log bugs via the CPAN RT system, by web or email:
514
515 http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DateTime%3A%3AFormat%3A%3ABuilder
516 bug-datetime-format-builder@rt.cpan.org
517
518 This makes it much easier for me to track things and thus means your
519 problem is less likely to be neglected.
520
522 "datetime@perl.org" mailing list.
523
524 http://datetime.perl.org/
525
526 perl, DateTime, DateTime::Format::Builder::Tutorial,
527 DateTime::Format::Builder::Parser
528
530 · Dave Rolsky <autarch@urth.org>
531
532 · Iain Truskett
533
535 This software is Copyright (c) 2013 by Dave Rolsky.
536
537 This is free software, licensed under:
538
539 The Artistic License 2.0 (GPL Compatible)
540

perl v5.28.0                      2018-07-14      DateTime::Format::Builder(3)