DateTime::Format::Builder(3)  User Contributed Perl Documentation  DateTime::Format::Builder(3)

NAME

       DateTime::Format::Builder - Create DateTime parser classes and objects.

VERSION

       version 0.81

SYNOPSIS

           package DateTime::Format::Brief;

           use DateTime::Format::Builder
           (
               parsers => {
                   parse_datetime => [
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                       params => [qw( year month day hour minute second )],
                   },
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)$/,
                       params => [qw( year month day )],
                   },
                   ],
               }
           );
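
       As a usage sketch, the class from the synopsis can be called as
       below. This assumes DateTime::Format::Builder and DateTime are
       installed; the input strings are made up for illustration, and the
       class is declared inline only to keep the example self-contained.

```perl
use strict;
use warnings;

{
    # The class from the synopsis, declared inline.
    package DateTime::Format::Brief;

    use DateTime::Format::Builder (
        parsers => {
            parse_datetime => [
                {
                    regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                    params => [qw( year month day hour minute second )],
                },
                {
                    regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                    params => [qw( year month day )],
                },
            ],
        },
    );
}

# The generated method can be invoked on the class name itself.
my $date_only = DateTime::Format::Brief->parse_datetime('20030716');
my $full      = DateTime::Format::Brief->parse_datetime('20030716120530');

print $date_only->ymd, "\n";
print $full->iso8601,  "\n";
```

       Each specification is tried in turn, so the 8-character input falls
       through to the second regex.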

DESCRIPTION

       DateTime::Format::Builder creates DateTime parsers. Many string
       formats of dates and times are simple and just require a basic regular
       expression to extract the relevant information. Builder provides a
       simple way to do this without writing reams of structural code.

       Builder provides a number of methods, most of which you'll rarely, if
       ever, need. They exist mainly to expose the module's innards to
       subclasses, or for when you need to do something slightly beyond what
       I expected.

       Builder creates the end methods for you. The generated parser methods
       die on a bad parse and return "DateTime" objects on a good parse.

TUTORIAL

       See DateTime::Format::Builder::Tutorial.

ERROR HANDLING AND BAD PARSES

       Often I will speak of "undef" being returned; however, that's not
       strictly true.

       When a simple single specification is given for a method, the method
       isn't given a single parser directly. It's given a wrapper that will
       call "on_fail()" if the single parser returns "undef". The single
       parser must return "undef" so that a multiple parser can work nicely
       and actual errors can be thrown from any of the callbacks.

       Similarly, a multiple parser will only call "on_fail()" right at the
       end, when it has tried everything it could.

       "on_fail()" (see later) is defined, by default, to throw an error.

       Multiple parser specifications can also specify "on_fail" with a
       coderef as an argument in the options block. This will take precedence
       over the inheritable and overridable method.

       That said, don't throw real errors from callbacks in multiple parser
       specifications unless you really want parsing to stop right there and
       not try any other parsers.

       In summary: calling a method will result in either a "DateTime" object
       being returned or an error being thrown (unless you've overridden
       "on_fail()" or "create_method()", or you've specified an "on_fail" key
       to a multiple parser specification).

       Individual parsers (be they multiple parsers or single parsers) will
       return either the "DateTime" object or "undef".
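
       Since a generated method either returns a "DateTime" object or
       throws, callers that expect bad input can trap the error with
       "eval". The following is a minimal sketch; the class name
       My::Format::DateOnly and the helper try_parse() are hypothetical and
       exist only for this illustration.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Hypothetical class, built at run time with create_class().
DateTime::Format::Builder->create_class(
    class   => 'My::Format::DateOnly',
    parsers => {
        parse_datetime => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# Hypothetical helper: returns a DateTime object, or undef if the
# parse method threw (the default on_fail behaviour).
sub try_parse {
    my $input = shift;
    my $dt = eval { My::Format::DateOnly->parse_datetime($input) };
    return $dt;
}

for my $input ( '2003-07-16', 'not a date' ) {
    my $dt = try_parse($input);
    if ( defined $dt ) {
        print "parsed '$input' as ", $dt->ymd, "\n";
    }
    else {
        print "could not parse '$input'\n";
    }
}
```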

SINGLE SPECIFICATIONS

       A single specification is a hash ref of instructions on how to create a
       parser.

       The precise set of keys and values varies according to parser type.
       There are some common ones though:

       ·   length is an optional parameter that can be used to specify that
           this particular regex is only applicable to strings of a certain
           fixed length. This can be used to make parsers more efficient. It's
           strongly recommended that any parser that can use this parameter
           does.

           You may happily specify the same length twice. The parsers will be
           tried in order of specification.

           You can also specify multiple lengths by giving it an arrayref of
           numbers rather than just a single scalar. If doing so, please keep
           the number of lengths to a minimum.

           If any specifications without lengths are given and the particular
           length parser fails, then the non-length parsers are tried.

           This parameter is ignored unless the specification is part of a
           multiple parser specification.

       ·   label provides a name for the specification and is passed to some
           of the callbacks about to be mentioned.

       ·   on_match and on_fail are callbacks. Both routines will be called
           with parameters of:

           ·   input, being the input to the parser (after any preprocessing
               callbacks).

           ·   label, being the label of the parser, if there is one.

           ·   self, being the object on which the method has been invoked
               (which may just be a class name). Naturally, you can then
               invoke your own methods on it to get information you want.

           ·   args, being an arrayref of any passed arguments, if any. If
               there were no arguments, then this parameter is not given.

           These routines will be called depending on whether the regex match
           succeeded or failed.

       ·   preprocess is a callback provided for cleaning up input prior to
           parsing. It's given a hash as arguments with the following keys:

           ·   input being the datetime string the parser was given (if using
               multiple specifications and an overall preprocess then this is
               the date after it's been through that preprocessor).

           ·   parsed being the state of parsing so far. Usually empty at this
               point unless an overall preprocess was given. Items may be
               placed in it and will be given to any postprocessor and
               "DateTime->new" (unless the postprocessor deletes it).

           ·   self, args, label as per on_match and on_fail.

           The return value from the routine is what is given to the regex.
           Note that this is the last code stop before the match.

           Note: mixing length and a preprocess that modifies the length of
           the input string is probably not what you meant to do. You probably
           meant to use the multiple parser variant of preprocess which is
           done before any length calculations. This "single parser" variant
           of preprocess is performed after any length calculations.

       ·   postprocess is the last code stop before "DateTime->new()" is
           called. It's given the same arguments as preprocess. This allows it
           to modify the parsed parameters after the parse and before the
           creation of the object. For example, you might use:

               {
                   regex  => qr/^(\d\d) (\d\d) (\d\d)$/,
                   params => [qw( year  month  day   )],
                   postprocess => \&_fix_year,
               }

           where "_fix_year" is defined as:

               sub _fix_year
               {
                   my %args = @_;
                   my ($date, $p) = @args{qw( input parsed )};
                   $p->{year} += $p->{year} > 69 ? 1900 : 2000;
                   return 1;
               }

           This will cause the two digit years to be corrected according to
           the cut off. If the year was '69' or lower, then it is made into
           2069 (or 2045, or whatever the year was parsed as). Otherwise it is
           assumed to be 19xx. The DateTime::Format::Mail module uses code
           similar to this (only it allows the cut off to be configured and it
           doesn't use Builder).

           Note: It is very important to return an explicit value from the
           postprocess callback. If the return value is false then the parse
           is taken to have failed. If the return value is true, then the
           parse is taken to have succeeded and "DateTime->new()" is called.
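
       To show how the common keys combine, here is a hedged sketch that
       uses length, label, on_match and postprocess together inside a
       multiple parser specification. The class name My::Format::Short and
       the @matched array are hypothetical; "_fix_year" follows the example
       above.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Expand a two-digit year, using the same cut-off as _fix_year above.
sub _fix_year {
    my %args = @_;
    my $p    = $args{parsed};
    $p->{year} += $p->{year} > 69 ? 1900 : 2000;
    return 1;    # a true value tells Builder the parse succeeded
}

my @matched;     # labels of the specifications whose regex matched

DateTime::Format::Builder->create_class(
    class   => 'My::Format::Short',    # hypothetical class name
    parsers => {
        parse_datetime => [
            {
                label       => 'two_digit_year',
                length      => 8,      # e.g. "98 07 16"
                regex       => qr/^(\d\d) (\d\d) (\d\d)$/,
                params      => [qw( year month day )],
                postprocess => \&_fix_year,
                on_match    => sub { my %args = @_; push @matched, $args{label} },
            },
            {
                label    => 'four_digit_year',
                length   => 10,        # e.g. "2003 07 16"
                regex    => qr/^(\d{4}) (\d\d) (\d\d)$/,
                params   => [qw( year month day )],
                on_match => sub { my %args = @_; push @matched, $args{label} },
            },
        ],
    },
);

my $old = My::Format::Short->parse_datetime('98 07 16');
my $new = My::Format::Short->parse_datetime('2003 07 16');

print $old->ymd, "\n";
print $new->ymd, "\n";
print "matched: @matched\n";
```

       Because each specification declares its length, only the parser
       whose length fits the input needs to be tried first.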
       See the documentation for the individual parsers for their valid keys.

       Parsers at the time of writing are:

       ·   DateTime::Format::Builder::Parser::Regex - provides regular
           expression based parsing.

       ·   DateTime::Format::Builder::Parser::Strptime - provides strptime
           based parsing.

   Subroutines / coderefs as specifications
       A single parser specification can be a coderef. This was added mostly
       because it could be and because I knew someone, somewhere, would want
       to use it.

       If the specification is a reference to a piece of code, be it a
       subroutine, anonymous, or whatever, then it's passed more or less
       straight through. The code should return "undef" in the event of
       failure (or any false value, but "undef" is strongly preferred), or a
       true value in the event of success (ideally a "DateTime" object or
       some object that has the same interface).

       That said, I generally wouldn't recommend using this feature unless
       you have to.

   Callbacks
       I mention a number of callbacks in this document.

       Any time you see a callback being mentioned, you can, if you like,
       substitute an arrayref of coderefs rather than having the straight
       coderef.

MULTIPLE SPECIFICATIONS

       These are very easily described as an array of single specifications.

       Note that if the first element of the array is an arrayref, then you're
       specifying options.

       ·   preprocess lets you specify a preprocessor that is called before
           any of the parsers are tried. This lets you do things like strip
           off timezones or any unnecessary data. The most common use people
           have for it at present is to get the input date to a particular
           length so that the length is usable (DateTime::Format::ICal would
           use it to strip off the variable length timezone).

           Arguments are as for the single parser preprocess variant with the
           exception that label is never given.

       ·   on_fail should be a reference to a subroutine that is called if the
           parser fails. If this is not provided, the default action is to
           call "DateTime::Format::Builder::on_fail", or the "on_fail" method
           of the subclass of DateTime::Format::Builder that was used to
           create the parser.
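
       The options block can be sketched as follows. This example assumes
       an input format with an optional trailing "Z" for UTC; the class name
       My::Format::Stamp and the sub "_strip_utc_marker" are hypothetical.
       The overall preprocess strips the marker so that the length based
       parsers see a fixed-length string.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Overall preprocessor: strip a trailing "Z" and record UTC in the
# parsing workspace so that it is handed on to DateTime->new().
sub _strip_utc_marker {
    my %args = @_;
    my ( $date, $p ) = @args{qw( input parsed )};
    $p->{time_zone} = 'UTC' if $date =~ s/Z$//;
    return $date;    # this is what the individual parsers will see
}

DateTime::Format::Builder->create_class(
    class   => 'My::Format::Stamp',    # hypothetical class name
    parsers => {
        parse_datetime => [
            # First element is an arrayref, so it is the options block.
            [ preprocess => \&_strip_utc_marker ],
            {
                length => 15,          # e.g. "20030716T120530"
                regex  => qr/^(\d{4})(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                length => 8,           # e.g. "20030716"
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

my $utc      = My::Format::Stamp->parse_datetime('20030716T120530Z');
my $floating = My::Format::Stamp->parse_datetime('20030716');

print $utc->iso8601,      ' ', $utc->time_zone->name,      "\n";
print $floating->iso8601, ' ', $floating->time_zone->name, "\n";
```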

EXECUTION FLOW

       Builder allows you to plug in a fair few callbacks, which can make
       following how a parse failed (or succeeded unexpectedly) somewhat
       tricky.

   For Single Specifications
       A single specification will do the following:

       User calls parser:

              my $dt = $class->parse_datetime( $string );

       1.  preprocess is called. It's given $string and a reference to the
           parsing workspace hash, which we'll call $p. At this point, $p is
           empty. The return value is used as $date for the rest of this
           single parser. Anything put in $p is also used for the rest of
           this single parser.

       2.  regex is applied.

       3.  If regex did not match, then on_fail is called (and is given $date
           and also label if it was defined). Any return value is ignored and
           the next thing is for the single parser to return "undef".

           If regex did match, then on_match is called with the same arguments
           as would be given to on_fail. The return value is similarly
           ignored, but we then move to step 4 rather than exiting the parser.

       4.  postprocess is called with $date and a filled out $p. The return
           value is taken as an indication of whether the parse was a success
           or not. If it wasn't a success then the single parser will exit at
           this point, returning "undef".

       5.  "DateTime->new()" is called and the user is given the resultant
           "DateTime" object.

       See the section on error handling regarding the "undef"s mentioned
       above.

   For Multiple Specifications
       With multiple specifications:

       User calls parser:

             my $dt = $class->complex_parse( $string );

       1.  The overall preprocessor is called and is given $string and the
           hashref $p (identically to the per parser preprocess mentioned in
           the previous flow).

           If the callback modifies $p then a copy of $p is given to each of
           the individual parsers. This is so parsers won't accidentally
           pollute each other's workspace.

       2.  If an appropriate length specific parser is found, then it is
           called and the single parser flow (see the previous section) is
           followed, and the parser is given a copy of $p and the return value
           of the overall preprocessor as $date.

           If a "DateTime" object was returned, we go straight back to the
           user.

           If no appropriate parser was found, or the parser returned "undef",
           then we progress to step 3.

       3.  Any non-length based parsers are tried in the order they were
           specified.

           For each of those the single specification flow above is performed,
           and is given a copy of the output from the overall preprocessor.

           If a real "DateTime" object is returned then we exit back to the
           user.

           If no parser could parse, then an error is thrown.

       See the section on error handling regarding the "undef"s mentioned
       above.
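
       One way to follow the flow in practice is to attach labelled
       on_match and on_fail callbacks that record what happened. The sketch
       below assumes two specifications without lengths, so they are tried
       in the order given; the class name My::Format::Traced and the helper
       tracer() are hypothetical.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

my @trace;    # events, in the order the callbacks fire

# Hypothetical helper: builds a callback that logs "<event>:<label>".
sub tracer {
    my $event = shift;
    return sub {
        my %args = @_;
        push @trace, "$event:$args{label}";
    };
}

DateTime::Format::Builder->create_class(
    class   => 'My::Format::Traced',    # hypothetical class name
    parsers => {
        parse_datetime => [
            {
                label    => 'compact',
                regex    => qr/^(\d{4})(\d\d)(\d\d)$/,
                params   => [qw( year month day )],
                on_match => tracer('match'),
                on_fail  => tracer('fail'),
            },
            {
                label    => 'dashed',
                regex    => qr/^(\d{4})-(\d\d)-(\d\d)$/,
                params   => [qw( year month day )],
                on_match => tracer('match'),
                on_fail  => tracer('fail'),
            },
        ],
    },
);

my $dt = My::Format::Traced->parse_datetime('2003-07-16');

print $dt->ymd, "\n";
print join( ', ', @trace ), "\n";
```

       The trace makes it visible that the first specification was tried
       and rejected before the second one matched.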

METHODS

       In the general course of things you won't need any of the methods.
       Life often throws unexpected things at us, though, so the methods are
       all available for use.

   import
       "import()" is a wrapper for "create_class()". If you specify the class
       option (see documentation for "create_class()") it will be ignored.

   create_class
       This method can be used as the runtime equivalent of "import()". That
       is, it takes the exact same parameters as when one does:

          use DateTime::Format::Builder ( blah blah blah )

       That can be (almost) equivalently written as:

          use DateTime::Format::Builder;
          DateTime::Format::Builder->create_class( blah blah blah );

       The difference being that the first is done at compile time while the
       second is done at run time.

       In the tutorial I said there were only two parameters at present. I
       lied. There are actually five of them.

       ·   parsers takes a hashref of methods and their parser specifications.
           See the DateTime::Format::Builder::Tutorial for details.

           Note that if you define a subroutine of the same name as one of the
           methods you define here, an error will be thrown.

       ·   constructor determines whether and how to create a "new()" function
           in the new class. If given a true value, a constructor is created.
           If given a false value, one isn't.

           If given an anonymous sub or a reference to a sub then that is used
           as "new()".

           The default is 1 (that is, create a constructor using our default
           code which simply creates a hashref and blesses it).

           If your class defines its own "new()" method it will not be
           overwritten. If you define your own "new()" and also tell Builder
           to define one an error will be thrown.

       ·   verbose takes a value. If the value is undef, then logging is
           disabled. If the value is a filehandle then that's where logging
           will go. If it's a true value, then output will go to "STDERR".

           Alternatively, call "DateTime::Format::Builder::verbose()" with
           the relevant value. Whichever value is given more recently is
           adhered to.

           Be aware that verbosity is a global setting.

       ·   class is optional and specifies the name of the class in which to
           create the specified methods.

           If using this method in the guise of "import()" then this field
           will cause an error so it is only of use when calling as
           "create_class()".

       ·   version is also optional and specifies the value to give $VERSION
           in the class. It's generally not recommended unless you're
           combining with the class option. An "ExtUtils::MakeMaker" / "CPAN"
           compliant version specification is much better.

       In addition to creating any of the methods it also creates a "new()"
       method that can instantiate (or clone) objects.
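
       As a rough illustration of the run time form, the sketch below
       builds a class with the class and version options and then uses the
       generated "new()". The class name My::Format::Runtime and the version
       string are invented for the example.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Build the class at run time instead of via "use ... ( ... )".
DateTime::Format::Builder->create_class(
    class   => 'My::Format::Runtime',    # hypothetical class name
    version => '0.01',                   # value given to $VERSION
    parsers => {
        parse_datetime => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# The constructor option defaults to true, so new() is available.
my $parser = My::Format::Runtime->new;
my $dt     = $parser->parse_datetime('2003-07-16');

print ref($parser), ' ', My::Format::Runtime->VERSION, "\n";
print $dt->ymd, "\n";
```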

SUBCLASSING

       In the rest of the documentation I've often lied in order to get some
       of the ideas across more easily. The thing is, this module's very
       flexible. You can get markedly different behaviour from simply
       subclassing it and overriding some methods.

   create_method
       Given a parser coderef, returns a coderef that is suitable to be a
       method.

       The default action is to call "on_fail()" in the event of a non-parse,
       but you can make it do whatever you want.

   on_fail
       This is called in the event of a non-parse (unless you've overridden
       "create_method()" to do something else).

       The single argument is the input string. The default action is to call
       "croak()". Above, where I've said parsers or methods throw errors, this
       is the method that is doing the error throwing.

       You could conceivably override this method to, say, return "undef".
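
       Overriding "on_fail()" to return "undef" might look like the sketch
       below. It relies on the behaviour described above, namely that the
       "on_fail" method of the subclass used to create the parser is the one
       that gets called. The package names My::Lenient::Builder and
       My::Format::Lenient are hypothetical.

```perl
use strict;
use warnings;

{
    # Hypothetical subclass whose parsers give up quietly.
    package My::Lenient::Builder;
    use parent 'DateTime::Format::Builder';

    # Called with the input string on a non-parse; do not croak.
    sub on_fail { return undef }
}

# Create the parser class through the subclass so that its on_fail()
# is the one consulted.
My::Lenient::Builder->create_class(
    class   => 'My::Format::Lenient',
    parsers => {
        parse_datetime => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

my $ok  = My::Format::Lenient->parse_datetime('2003-07-16');
my $bad = My::Format::Lenient->parse_datetime('garbage');

print defined $ok  ? $ok->ymd : 'undef', "\n";
print defined $bad ? 'parsed' : 'undef', "\n";
```

       Callers of such a class must check for "undef" themselves, since no
       error is thrown on bad input.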

USING BUILDER OBJECTS aka USERS USING BUILDER

       The methods listed in the METHODS section are all you generally need
       when creating your own class. Sometimes you may not want a full blown
       class to parse something just for this one program. Some methods are
       provided to make that task easier.

   new
       The basic constructor. It takes no arguments, merely returns a new
       "DateTime::Format::Builder" object.

           my $parser = DateTime::Format::Builder->new();

       If called as a method on an object (rather than as a class method),
       then it clones the object.

           my $clone = $parser->new();

   clone
       Provided for those who prefer an explicit "clone()" method rather than
       using "new()" as an object method.

           my $clone_of_clone = $clone->clone();

   parser
       Given either a single or multiple parser specification, sets the object
       to have a parser based on that specification.

           $parser->parser(
               regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
               params => [qw( year    month  day    )],
           );

       The arguments given to "parser()" are handed directly to
       "create_parser()". The resultant parser is passed to "set_parser()".

       If called as an object method, it returns the object.

       If called as a class method, it creates a new object, sets its parser
       and returns that object.

   set_parser
       Sets the parser of the object to the given parser.

          $parser->set_parser( $coderef );

       Note: this method does not take specifications. It also does not take
       anything except coderefs. Luckily, coderefs are what most of the other
       methods produce.

       The method return value is the object itself.

   get_parser
       Returns the parser the object is using.

          my $code = $parser->get_parser();

   parse_datetime
       Given a string, it calls the parser and returns the "DateTime" object
       that results.

          my $dt = $parser->parse_datetime( "1979 07 16" );

       The return value, if not a "DateTime" object, is whatever the parser
       wants to return. Generally this means that if the parse failed an error
       will be thrown.
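
       Putting the object methods together, a one-off parser for a single
       program might be sketched like this. The regex is written without
       the /x flag so that the literal spaces in the input are matched; the
       input string is the one used above.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# parser() called as an object method returns the object itself,
# so the calls can be chained.
my $parser = DateTime::Format::Builder->new->parser(
    regex  => qr/^(\d{4}) (\d\d) (\d\d)$/,
    params => [qw( year month day )],
);

my $dt = $parser->parse_datetime('1979 07 16');

# A clone uses the same parser as the original object.
my $clone    = $parser->clone;
my $dt_clone = $clone->parse_datetime('1979 07 16');

# get_parser() and set_parser() hand the parser between objects;
# set_parser() returns the object it was called on.
my $other    = DateTime::Format::Builder->new->set_parser( $parser->get_parser );
my $dt_other = $other->parse_datetime('1979 07 16');

print $dt->ymd, "\n";
```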

   format_datetime
       If you call this function, it will throw an error.

LONGER EXAMPLES

       Some longer examples are provided in the distribution. These implement
       some of the common parsing DateTime modules using Builder. Each of them
       is, or was, a drop-in replacement for the corresponding module at the
       time of writing.

THANKS

       Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
       DateTime::Format::ICal and DateTime::Format::MySQL, and some much
       needed review.

       Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
       writing the multilength code (both one length with multiple parsers and
       single parser with multiple lengths), blame for the Regex custom
       constructor code, spotting a bug in Dispatch, and more much needed
       review.

       Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
       DateTime::Format::W3CDTF and the encouragement to rewrite these docs
       almost 100%!

       Claus Färber (CFAERBER) for having me get around to fixing the
       auto-constructor writing, providing the 'args'/'self' patch, and
       suggesting the multi-callbacks.

       Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
       supports.

       Matthew McGillis for pointing out that "on_fail" overriding should be
       simpler.

       Simon Cozens (SIMON) for saying it was cool.

SUPPORT

       Support for this module is provided via the datetime@perl.org email
       list. See http://lists.perl.org/ for more details.

       Alternatively, log bugs via the CPAN RT system via the web or email:

           http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DateTime%3A%3AFormat%3A%3ABuilder
           bug-datetime-format-builder@rt.cpan.org

       This makes it much easier for me to track things and thus means your
       problem is less likely to be neglected.

SEE ALSO

       "datetime@perl.org" mailing list.

       http://datetime.perl.org/

       perl, DateTime, DateTime::Format::Builder::Tutorial,
       DateTime::Format::Builder::Parser

AUTHORS

       ·   Dave Rolsky <autarch@urth.org>

       ·   Iain Truskett

COPYRIGHT AND LICENSE

       This software is Copyright (c) 2013 by Dave Rolsky.

       This is free software, licensed under:

         The Artistic License 2.0 (GPL Compatible)

perl v5.28.0                      2018-07-14      DateTime::Format::Builder(3)