DateTime::Format::Builder(3)  User Contributed Perl Documentation  DateTime::Format::Builder(3)

NAME

       DateTime::Format::Builder - Create DateTime parser classes and objects.

VERSION

       version 0.83

SYNOPSIS

           package DateTime::Format::Brief;

           use DateTime::Format::Builder (
               parsers => {
                   parse_datetime => [
                       {
                           regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                           params => [qw( year month day hour minute second )],
                       },
                       {
                           regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                           params => [qw( year month day )],
                       },
                   ],
               }
           );

DESCRIPTION

       DateTime::Format::Builder creates DateTime parsers. Many string formats
       of dates and times are simple and just require a basic regular
       expression to extract the relevant information. Builder provides a
       simple way to do this without writing reams of structural code.

       Builder provides a number of methods, most of which you'll never need,
       or at least rarely need. They exist mainly to expose the module's
       innards to subclasses, or for when you need to do something slightly
       beyond what I expected.

TUTORIAL

       See DateTime::Format::Builder::Tutorial.

ERROR HANDLING AND BAD PARSES

       Often, I will speak of "undef" being returned; however, that's not
       strictly true.

       When a simple single specification is given for a method, the method
       isn't given a single parser directly. It's given a wrapper that will
       call "on_fail" if the single parser returns "undef". The single parser
       must return "undef" so that a multiple parser can work nicely and
       actual errors can be thrown from any of the callbacks.

       Similarly, a multiple parser will only call "on_fail" right at the
       end, once it has tried every parser it could.

       "on_fail" (see later) is defined, by default, to throw an error.

       Multiple parser specifications can also specify "on_fail" with a
       coderef as an argument in the options block. This will take precedence
       over the inheritable and overridable method.

       That said, don't throw real errors from callbacks in multiple parser
       specifications unless you really want parsing to stop right there and
       not try any other parsers.

       In summary: calling a method will result in either a "DateTime" object
       being returned or an error being thrown (unless you've overridden
       "on_fail" or "create_method", or you've specified an "on_fail" key to a
       multiple parser specification).

       Individual parsers (be they multiple parsers or single parsers) will
       return either the "DateTime" object or "undef".

SINGLE SPECIFICATIONS

       A single specification is a hashref of instructions on how to create a
       parser.

       The precise set of keys and values varies according to parser type.
       There are some common ones though:

       •   length

           length is an optional parameter that can be used to specify that
           this particular regex is only applicable to strings of a certain
           fixed length. This can be used to make parsers more efficient. It's
           strongly recommended that any parser that can use this parameter
           does.

           You may happily specify the same length twice. The parsers will be
           tried in order of specification.

           You can also specify multiple lengths by giving it an arrayref of
           numbers rather than just a single scalar. If doing so, please keep
           the number of lengths to a minimum.

           If any specifications without lengths are given and the particular
           length parser fails, then the non-length parsers are tried.

           This parameter is ignored unless the specification is part of a
           multiple parser specification.

       •   label

           label provides a name for the specification and is passed to some
           of the callbacks about to be mentioned.

       •   on_match and on_fail

           on_match and on_fail are callbacks. Both routines will be called
           with parameters of:

           •   input

               input is the input to the parser (after any preprocessing
               callbacks).

           •   label

               label is the label of the parser if there is one.

           •   self

               self is the object on which the method has been invoked (which
               may just be a class name). Naturally, you can then invoke your
               own methods on it to get information you want.

           •   args

               args is an arrayref of any passed arguments, if any. If there
               were no arguments, then this parameter is not given.

           These routines will be called depending on whether the regex match
           succeeded or failed.

       •   preprocess

           preprocess is a callback provided for cleaning up input prior to
           parsing. It's given a hash as arguments with the following keys:

           •   input

               input is the datetime string the parser was given (if using
               multiple specifications and an overall preprocess then this is
               the date after it's been through that preprocessor).

           •   parsed

               parsed is the state of parsing so far. Usually empty at this
               point unless an overall preprocess was given. Items may be
               placed in it and will be given to any postprocessor and
               "DateTime->new" (unless the postprocessor deletes it).

           •   self, args, label

               self, args, label as per on_match and on_fail.

           The return value from the routine is what is given to the regex.
           Note that this is the last code stop before the match.

           Note: mixing length and a preprocess that modifies the length of
           the input string is probably not what you meant to do. You probably
           meant to use the multiple parser variant of preprocess which is
           done before any length calculations. This "single parser" variant
           of preprocess is performed after any length calculations.

       •   postprocess

           postprocess is the last code stop before "DateTime->new" is called.
           It's given the same arguments as preprocess. This allows it to
           modify the parsed parameters after the parse and before the
           creation of the object. For example, you might use:

               {
                   regex       => qr/^(\d\d) (\d\d) (\d\d)$/,
                   params      => [qw( year  month  day   )],
                   postprocess => \&_fix_year,
               }

           where "_fix_year" is defined as:

               sub _fix_year {
                   my %args = @_;
                   my ( $date, $p ) = @args{qw( input parsed )};
                   $p->{year} += $p->{year} > 69 ? 1900 : 2000;
                   return 1;
               }

           This will cause the two digit years to be corrected according to
           the cutoff. If the year was '69' or lower, then it is made into
           2069 (or 2045, or whatever the year was parsed as). Otherwise it is
           assumed to be 19xx. The DateTime::Format::Mail module uses code
           similar to this (only it allows the cutoff to be configured and it
           doesn't use Builder).

           Note: It is very important to return an explicit value from the
           postprocess callback. If the return value is false then the parse
           is taken to have failed. If the return value is true, then the
           parse is taken to have succeeded and "DateTime->new" is called.

       See the documentation for the individual parsers for their valid keys.

       Parsers at the time of writing are:

       •   DateTime::Format::Builder::Parser::Regex - provides regular
           expression based parsing.

       •   DateTime::Format::Builder::Parser::Strptime - provides strptime
           based parsing.

   Subroutines / coderefs as specifications.
       A single parser specification can be a coderef. This was added mostly
       because it could be and because I knew someone, somewhere, would want
       to use it.

       If the specification is a reference to a piece of code, be it a
       subroutine, anonymous, or whatever, then it's passed more or less
       straight through. The code should return "undef" in the event of
       failure (or any false value, but "undef" is strongly preferred), or a
       true value in the event of success (ideally a "DateTime" object or
       some object that has the same interface).

       This all said, I generally wouldn't recommend using this feature unless
       you have to.

   Callbacks
       I mention a number of callbacks in this document.

       Any time you see a callback being mentioned, you can, if you like,
       supply an arrayref of coderefs instead of a single coderef.
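
       As a fuller sketch of the postprocess callback described in the list
       of common keys, the following wires "_fix_year" into a throwaway
       Builder object and parses one date on each side of the cutoff. The
       variable names and sample dates are illustrative assumptions, not part
       of the module.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Same cutoff rule as _fix_year in the text: two-digit years above 69
# become 19xx, everything else becomes 20xx.
sub _fix_year {
    my %args = @_;
    my $p    = $args{parsed};
    $p->{year} += $p->{year} > 69 ? 1900 : 2000;
    return 1;    # an explicit true value marks the parse as successful
}

# $parser is a hypothetical, one-off Builder object for this example.
my $parser = DateTime::Format::Builder->new->parser(
    regex       => qr/^(\d\d) (\d\d) (\d\d)$/,
    params      => [qw( year month day )],
    postprocess => \&_fix_year,
);

my $old = $parser->parse_datetime('79 07 16');
my $new = $parser->parse_datetime('45 07 16');

print $old->ymd, "\n";    # 1979-07-16
print $new->ymd, "\n";    # 2045-07-16
```

       Returning 0 or an empty list from "_fix_year" instead would make the
       parse count as failed, as the note on explicit return values explains.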

MULTIPLE SPECIFICATIONS

       These are very easily described as an array of single specifications.

       Note that if the first element of the array is an arrayref, then you're
       specifying options.

       •   preprocess

           preprocess lets you specify a preprocessor that is called before
           any of the parsers are tried. This lets you do things like strip
           off timezones or any unnecessary data. The most common use people
           have for it at present is to get the input date to a particular
           length so that the length is usable (DateTime::Format::ICal would
           use it to strip off the variable length timezone).

           Arguments are as for the single parser preprocess variant with the
           exception that label is never given.

       •   on_fail

           on_fail should be a reference to a subroutine that is called if the
           parser fails. If this is not provided, the default action is to
           call "DateTime::Format::Builder::on_fail", or the "on_fail" method
           of the subclass of DTFB that was used to create the parser.
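
       A minimal sketch of the options arrayref, assuming a hypothetical
       class "My::Format::Lenient" that would rather count failures and
       return "undef" than throw. The callback deliberately ignores its
       arguments, since only its existence matters here.

```perl
package My::Format::Lenient;    # hypothetical class name
use strict;
use warnings;

our $failures = 0;    # counts failed parses, for demonstration only

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            # A leading arrayref holds options for the whole group.
            [ on_fail => sub { $failures++; return } ],
            {
                regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
                params => [qw( year month day )],
            },
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

my $good = My::Format::Lenient->parse_datetime('2003-07-16');
my $bad  = My::Format::Lenient->parse_datetime('not a date');

print $good->ymd, "\n";
print "failed parses: $My::Format::Lenient::failures\n";
```

       Because the custom on_fail replaces the default error-throwing one,
       callers of this class have to check for "undef" themselves.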

EXECUTION FLOW

       Builder allows you to plug in a fair few callbacks, which can make
       following how a parse failed (or succeeded unexpectedly) somewhat
       tricky.

   For Single Specifications
       A single specification will do the following:

       User calls parser:

           my $dt = $class->parse_datetime($string);

       1.  preprocess is called. It's given $string and a reference to the
           parsing workspace hash, which we'll call $p. At this point, $p is
           empty. The return value is used as $date for the rest of this
           single parser. Anything put in $p is also used for the rest of
           this single parser.

       2.  regex is applied.

       3.  If regex did not match, then on_fail is called (and is given $date
           and also label if it was defined). Any return value is ignored and
           the next thing is for the single parser to return "undef".

           If regex did match, then on_match is called with the same arguments
           as would be given to on_fail. The return value is similarly
           ignored, but we then move to step 4 rather than exiting the parser.

       4.  postprocess is called with $date and a filled out $p. The return
           value is taken as an indication of whether the parse was a success
           or not. If it wasn't a success then the single parser will exit at
           this point, returning "undef".

       5.  "DateTime->new" is called and the user is given the resultant
           "DateTime" object.

       See the section on error handling regarding the "undef"s mentioned
       above.

   For Multiple Specifications
       With multiple specifications:

       User calls parser:

           my $dt = $class->complex_parse($string);

       1.  The overall preprocessor is called and is given $string and the
           hashref $p (identically to the per parser preprocess mentioned in
           the previous flow).

           If the callback modifies $p then a copy of $p is given to each of
           the individual parsers. This is so parsers won't accidentally
           pollute each other's workspace.

       2.  If an appropriate length specific parser is found, then it is
           called and the single parser flow (see the previous section) is
           followed, and the parser is given a copy of $p and the return value
           of the overall preprocessor as $date.

           If a "DateTime" object was returned, we go straight back to the
           user.

           If no appropriate parser was found, or the parser returned "undef",
           then we progress to step 3!

       3.  Any non-length based parsers are tried in the order they were
           specified.

           For each of those the single specification flow above is performed,
           and is given a copy of the output from the overall preprocessor.

           If a real "DateTime" object is returned then we exit back to the
           user.

           If no parser could parse, then an error is thrown.

       See the section on error handling regarding the "undef"s mentioned
       above.
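
       The multiple specification flow above can be sketched as follows. The
       class "My::Format::Flow", its labels and the @trace array are
       assumptions made up for this example. An overall preprocess strips
       whitespace, a length-8 parser gets first refusal, and a parser without
       a length acts as the fallback.

```perl
package My::Format::Flow;    # hypothetical class name
use strict;
use warnings;

our @trace;    # labels of the parsers that matched, in call order

# on_match callback: receives input, label, self (and args, if any).
sub _record_label {
    my %args = @_;
    push @trace, $args{label};
}

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            # Step 1: the overall preprocess runs before any length check.
            [
                preprocess => sub {
                    my %args  = @_;
                    my $input = $args{input};
                    $input =~ s/\s+//g;
                    return $input;
                },
            ],
            # Step 2: tried only when the cleaned input is 8 characters.
            {
                length   => 8,
                label    => 'compact',
                regex    => qr/^(\d{4})(\d\d)(\d\d)$/,
                params   => [qw( year month day )],
                on_match => \&_record_label,
            },
            # Step 3: parsers without a length are the fallback.
            {
                label    => 'dashed',
                regex    => qr/^(\d{4})-(\d\d)-(\d\d)$/,
                params   => [qw( year month day )],
                on_match => \&_record_label,
            },
        ],
    },
);

package main;

my $first  = My::Format::Flow->parse_datetime(' 2003 07 16 ');
my $second = My::Format::Flow->parse_datetime('2003-07-16');

print join( ',', @My::Format::Flow::trace ), "\n";    # compact,dashed
```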

METHODS

       In the general course of things you won't need any of the methods. Life
       often throws unexpected things at us so the methods are all available
       for use.

   import
       "import" is a wrapper for "create_class". If you specify the class
       option (see documentation for "create_class") it will be ignored.

   create_class
       This method can be used as the runtime equivalent of "import". That is,
       it takes the exact same parameters as when one does:

           use DateTime::Format::Builder ( ... )

       That can be (almost) equivalently written as:

           use DateTime::Format::Builder;
           DateTime::Format::Builder->create_class( ... );

       The difference being that the first is done at compile time while the
       second is done at run time.

       In the tutorial I said there were only two parameters at present. I
       lied. There are actually five of them.

       •   parsers

           parsers takes a hashref of methods and their parser specifications.
           See the DateTime::Format::Builder::Tutorial for details.

           Note that if you define a subroutine of the same name as one of the
           methods you define here, an error will be thrown.

       •   constructor

           constructor determines whether and how to create a "new" function
           in the new class. If given a true value, a constructor is created.
           If given a false value, one isn't.

           If given an anonymous sub or a reference to a sub then that is used
           as "new".

           The default is 1 (that is, create a constructor using our default
           code which simply creates a hashref and blesses it).

           If your class defines its own "new" method it will not be
           overwritten. If you define your own "new" and also tell Builder to
           define one an error will be thrown.

       •   verbose

           verbose takes a value. If the value is "undef", then logging is
           disabled. If the value is a filehandle then that's where logging
           will go. If it's a true value, then output will go to "STDERR".

           Alternatively, call $DateTime::Format::Builder::verbose with the
           relevant value. Whichever value is given more recently is adhered
           to.

           Be aware that verbosity is a global setting.

       •   class

           class is optional and specifies the name of the class in which to
           create the specified methods.

           If using this method in the guise of "import" then this field will
           cause an error so it is only of use when calling as "create_class".

       •   version

           version is also optional and specifies the value to give $VERSION
           in the class. It's generally not recommended unless you're
           combining with the class option. An "ExtUtils::MakeMaker" / "CPAN"
           compliant version specification is much better.

       In addition to creating any of the methods it also creates a "new"
       method that can instantiate (or clone) objects.
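
       A minimal sketch of the run time form, assuming a hypothetical target
       class "My::Format::Runtime" named through the class option and a
       made-up method name "parse_date":

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Build the class at run time; My::Format::Runtime is a hypothetical name.
DateTime::Format::Builder->create_class(
    class   => 'My::Format::Runtime',
    parsers => {
        parse_date => {
            regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

my $dt = My::Format::Runtime->parse_date('20240229');
print $dt->ymd, "\n";    # 2024-02-29
```

       Since the methods only exist once "create_class" has run, any code
       that calls them has to execute after that point.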

SUBCLASSING

       In the rest of the documentation I've often lied in order to get some
       of the ideas across more easily. The thing is, this module's very
       flexible. You can get markedly different behaviour from simply
       subclassing it and overriding some methods.

   create_method
       Given a parser coderef, returns a coderef that is suitable to be a
       method.

       The default action is to call "on_fail" in the event of a non-parse,
       but you can make it do whatever you want.

   on_fail
       This is called in the event of a non-parse (unless you've overridden
       "create_method" to do something else).

       The single argument is the input string. The default action is to call
       "croak". Above, where I've said parsers or methods throw errors, this
       is the method that is doing the error throwing.

       You could conceivably override this method to, say, return "undef".
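
       As a sketch of that last suggestion, assuming hypothetical class names
       "My::Quiet::Builder" and "My::Format::Quiet": the subclass overrides
       "on_fail", and classes created through it then return "undef" on a
       failed parse rather than throwing.

```perl
package My::Quiet::Builder;    # hypothetical subclass of Builder
use strict;
use warnings;
use parent 'DateTime::Format::Builder';

# Override the inheritable failure hook: return undef instead of croaking.
sub on_fail { return undef }

package main;

# Create the parser class through the subclass so its on_fail is used.
My::Quiet::Builder->create_class(
    class   => 'My::Format::Quiet',
    parsers => {
        parse_datetime => [
            {
                regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
                params => [qw( year month day )],
            },
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

my $good = My::Format::Quiet->parse_datetime('20030716');
my $bad  = My::Format::Quiet->parse_datetime('garbage');

print $good->ymd, "\n";
print "bad input gave undef\n" unless defined $bad;
```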

USING BUILDER OBJECTS aka USERS USING BUILDER

       The methods listed in the METHODS section are all you generally need
       when creating your own class. Sometimes you may not want a full blown
       class to parse something just for this one program. Some methods are
       provided to make that task easier.

   new
       The basic constructor. It takes no arguments, merely returns a new
       "DateTime::Format::Builder" object.

           my $parser = DateTime::Format::Builder->new;

       If called as a method on an object (rather than as a class method),
       then it clones the object.

           my $clone = $parser->new;

   clone
       Provided for those who prefer an explicit "clone" method rather than
       using "new" as an object method.

           my $clone_of_clone = $clone->clone;

   parser
       Given either a single or multiple parser specification, sets the object
       to have a parser based on that specification.

           $parser->parser(
               regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
               params => [qw( year    month  day    )],
           );

       The arguments given to "parser" are handed directly to "create_parser".
       The resultant parser is passed to "set_parser".

       If called as an object method, it returns the object.

       If called as a class method, it creates a new object, sets its parser
       and returns that object.

   set_parser
       Sets the parser of the object to the given parser.

           $parser->set_parser($coderef);

       Note: this method does not take specifications. It also does not take
       anything except coderefs. Luckily, coderefs are what most of the other
       methods produce.

       The method return value is the object itself.

   get_parser
       Returns the parser the object is using.

           my $code = $parser->get_parser;

   parse_datetime
       Given a string, it calls the parser and returns the "DateTime" object
       that results.

           my $dt = $parser->parse_datetime('1979 07 16');

       The return value, if not a "DateTime" object, is whatever the parser
       wants to return. Generally this means that if the parse failed an error
       will be thrown.

   format_datetime
       If you call this function, it will throw an error.
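
       Putting "new", "parser" and "parse_datetime" together, here is a
       minimal sketch of a one-off parser object. The regex and sample
       strings are assumptions for illustration. The "eval" is there because
       a failed parse throws by default.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# parser() returns the object when called as an object method,
# so the calls can be chained.
my $parser = DateTime::Format::Builder->new->parser(
    regex  => qr/^(\d{4}) (\d\d) (\d\d)$/,
    params => [qw( year month day )],
);

my $dt = $parser->parse_datetime('1979 07 16');
print $dt->ymd, "\n";    # 1979-07-16

# The default on_fail croaks, so guard input you do not trust.
my $bad   = eval { $parser->parse_datetime('16 July 1979') };
my $error = $@;
print "parse failed: $error" unless defined $bad;
```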

LONGER EXAMPLES

       Some longer examples are provided in the distribution. These implement
       some of the common parsing DateTime modules using Builder. Each of them
       is, or was, a drop-in replacement for the corresponding module at the
       time of writing.

THANKS

       Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
       DateTime::Format::ICal and DateTime::Format::MySQL, and some much
       needed review.

       Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
       writing the multi-length code (both one length with multiple parsers
       and single parser with multiple lengths), blame for the Regex custom
       constructor code, spotting a bug in Dispatch, and more much needed
       review.

       Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
       DateTime::Format::W3CDTF and the encouragement to rewrite these docs
       almost 100%!

       Claus Färber (CFAERBER) for having me get around to fixing the
       auto-constructor writing, providing the 'args'/'self' patch, and
       suggesting the multi-callbacks.

       Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
       supports.

       Matthew McGillis for pointing out that "on_fail" overriding should be
       simpler.

       Simon Cozens (SIMON) for saying it was cool.

SEE ALSO

       "datetime@perl.org" mailing list.

       http://datetime.perl.org/

       perl, DateTime, DateTime::Format::Builder::Tutorial,
       DateTime::Format::Builder::Parser

SUPPORT

       Bugs may be submitted at
       <https://github.com/houseabsolute/DateTime-Format-Builder/issues>.

       I am also usually active on IRC as 'autarch' on "irc://irc.perl.org".

SOURCE

       The source code repository for DateTime-Format-Builder can be found at
       <https://github.com/houseabsolute/DateTime-Format-Builder>.

DONATIONS

       If you'd like to thank me for the work I've done on this module, please
       consider making a "donation" to me via PayPal. I spend a lot of free
       time creating free software, and would appreciate any support you'd
       care to offer.

       Please note that I am not suggesting that you must do this in order for
       me to continue working on this particular software. I will continue to
       do so, inasmuch as I have in the past, for as long as it interests me.

       Similarly, a donation made in this way will probably not make me work
       on this software much more, unless I get so many donations that I can
       consider working on free software full time (let's all have a chuckle
       at that together).

       To donate, log into PayPal and send money to autarch@urth.org, or use
       the button at <https://www.urth.org/fs-donation.html>.

AUTHORS

       •   Dave Rolsky <autarch@urth.org>

       •   Iain Truskett <spoon@cpan.org>

CONTRIBUTORS

       •   Daisuke Maki <daisuke@endeworks.jp>

       •   James Raspass <jraspass@gmail.com>

COPYRIGHT AND LICENSE

       This software is Copyright (c) 2020 by Dave Rolsky.

       This is free software, licensed under:

         The Artistic License 2.0 (GPL Compatible)

       The full text of the license can be found in the LICENSE file included
       with this distribution.

perl v5.36.0                      2023-01-20      DateTime::Format::Builder(3)