DateTime::Format::Builder(3)  User Contributed Perl Documentation  DateTime::Format::Builder(3)

NAME

       DateTime::Format::Builder - Create DateTime parser classes and objects.

VERSION

       version 0.82

SYNOPSIS

           package DateTime::Format::Brief;

           use DateTime::Format::Builder
           (
               parsers => {
                   parse_datetime => [
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                       params => [qw( year month day hour minute second )],
                   },
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)$/,
                       params => [qw( year month day )],
                   },
                   ],
               }
           );

DESCRIPTION

       DateTime::Format::Builder creates DateTime parsers.  Many string
       formats of dates and times are simple and just require a basic regular
       expression to extract the relevant information. Builder provides a
       simple way to do this without writing reams of structural code.

       Builder provides a number of methods, most of which you'll never need,
       or at least rarely need. They're provided mainly to expose the
       module's innards to any subclasses, or for when you need to do
       something slightly beyond what I expected.

TUTORIAL

       See DateTime::Format::Builder::Tutorial.

ERROR HANDLING AND BAD PARSES

       Often, I will speak of "undef" being returned; however, that's not
       strictly true.

       When a simple single specification is given for a method, the method
       isn't given a single parser directly. It's given a wrapper that will
       call "on_fail()" if the single parser returns "undef". The single
       parser must return "undef" so that a multiple parser can work nicely
       and actual errors can be thrown from any of the callbacks.

       Similarly, a multiple parser will only call "on_fail()" right at the
       end, when it has tried every parser it could.

       "on_fail()" (see later) is defined, by default, to throw an error.

       Multiple parser specifications can also specify "on_fail" with a
       coderef as an argument in the options block. This will take precedence
       over the inheritable and overrideable method.

       That said, don't throw real errors from callbacks in multiple parser
       specifications unless you really want parsing to stop right there and
       not try any other parsers.

       In summary: calling a method will result in either a "DateTime" object
       being returned or an error being thrown (unless you've overridden
       "on_fail()" or "create_method()", or you've specified an "on_fail" key
       to a multiple parser specification).

       Individual parsers (be they multiple parsers or single parsers) will
       return either the "DateTime" object or "undef".

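       As a hedged illustration of the summary above, the sketch below builds
       a throwaway parser object and traps the error thrown by the default
       "on_fail()". It assumes DateTime::Format::Builder and DateTime are
       installed; the regex and variable names are invented for the example.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Illustrative parser for strings such as "20030716".
my $parser = DateTime::Format::Builder->parser(
    regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
    params => [qw( year month day )],
);

# A good parse returns a DateTime object.
my $dt = $parser->parse_datetime('20030716');

# A bad parse throws by default, so wrap the call in eval to recover.
my $bad   = eval { $parser->parse_datetime('not a date') };
my $error = $@;
warn "Parse failed: $error" unless defined $bad;
```

       Wrapping the call in "eval" is only one way to recover; overriding
       "on_fail()" is the other route the text describes.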

SINGLE SPECIFICATIONS

       A single specification is a hash ref of instructions on how to create a
       parser.

       The precise set of keys and values varies according to parser type.
       There are some common ones though:

       ·   length is an optional parameter that can be used to specify that
           this particular regex is only applicable to strings of a certain
           fixed length. This can be used to make parsers more efficient. It's
           strongly recommended that any parser that can use this parameter
           does.

           You may happily specify the same length twice. The parsers will be
           tried in order of specification.

           You can also specify multiple lengths by giving it an arrayref of
           numbers rather than just a single scalar.  If doing so, please keep
           the number of lengths to a minimum.

           If any specifications without lengths are given and the particular
           length parser fails, then the non-length parsers are tried.

           This parameter is ignored unless the specification is part of a
           multiple parser specification.

       ·   label provides a name for the specification and is passed to some
           of the callbacks about to be mentioned.

       ·   on_match and on_fail are callbacks. Both routines will be called
           with parameters of:

           ·   input, being the input to the parser (after any preprocessing
               callbacks).

           ·   label, being the label of the parser, if there is one.

           ·   self, being the object on which the method has been invoked
               (which may just be a class name). Naturally, you can then
               invoke your own methods on it to get the information you want.

           ·   args, being an arrayref of any passed arguments, if any.  If
               there were no arguments, then this parameter is not given.

           These routines will be called depending on whether the regex match
           succeeded or failed.

       ·   preprocess is a callback provided for cleaning up input prior to
           parsing. It's given a hash as arguments with the following keys:

           ·   input being the datetime string the parser was given (if using
               multiple specifications and an overall preprocess then this is
               the date after it's been through that preprocessor).

           ·   parsed being the state of parsing so far. Usually empty at this
               point unless an overall preprocess was given.  Items may be
               placed in it and will be given to any postprocessor and
               "DateTime->new" (unless the postprocessor deletes them).

           ·   self, args, label as per on_match and on_fail.

           The return value from the routine is what is given to the regex.
           Note that this is the last code stop before the match.

           Note: mixing length and a preprocess that modifies the length of
           the input string is probably not what you meant to do. You probably
           meant to use the multiple parser variant of preprocess which is
           done before any length calculations. This "single parser" variant
           of preprocess is performed after any length calculations.

       ·   postprocess is the last code stop before "DateTime->new()" is
           called. It's given the same arguments as preprocess. This allows it
           to modify the parsed parameters after the parse and before the
           creation of the object. For example, you might use:

               {
                   regex  => qr/^(\d\d) (\d\d) (\d\d)$/,
                   params => [qw( year  month  day   )],
                   postprocess => \&_fix_year,
               }

           where "_fix_year" is defined as:

               sub _fix_year
               {
                   my %args = @_;
                   my ($date, $p) = @args{qw( input parsed )};
                   $p->{year} += $p->{year} > 69 ? 1900 : 2000;
                   return 1;
               }

           This will cause the two digit years to be corrected according to
           the cut off. If the year was '69' or lower, then it is made into
           2069 (or 2045, or whatever the year was parsed as). Otherwise it is
           assumed to be 19xx. The DateTime::Format::Mail module uses code
           similar to this (only it allows the cut off to be configured and it
           doesn't use Builder).

           Note: It is very important to return an explicit value from the
           postprocess callback. If the return value is false then the parse
           is taken to have failed. If the return value is true, then the
           parse is taken to have succeeded and "DateTime->new()" is called.

       See the documentation for the individual parsers for their valid keys.

       Parsers at the time of writing are:

       ·   DateTime::Format::Builder::Parser::Regex - provides regular
           expression based parsing.

       ·   DateTime::Format::Builder::Parser::Strptime - provides strptime
           based parsing.

   Subroutines / coderefs as specifications.
       A single parser specification can be a coderef. This was added mostly
       because it could be and because I knew someone, somewhere, would want
       to use it.

       If the specification is a reference to a piece of code, be it a
       subroutine, anonymous, or whatever, then it's passed more or less
       straight through. The code should return "undef" in the event of
       failure (or any false value, but "undef" is strongly preferred), or a
       true value in the event of success (ideally a "DateTime" object or some
       object that has the same interface).

       This all said, I generally wouldn't recommend using this feature unless
       you have to.

   Callbacks
       I mention a number of callbacks in this document.

       Any time you see a callback being mentioned, you can, if you like,
       substitute an arrayref of coderefs rather than having the straight
       coderef.

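       To tie the common keys together, here is a minimal sketch of one
       single specification that uses length, label, preprocess and an
       arrayref of on_match callbacks. The callback names, the label and the
       @log array are invented for the example, and the callbacks only rely
       on the "input" and "label" arguments described above. Nothing here
       calls Builder itself; the hashref is simply the shape you would hand
       to it.

```perl
use strict;
use warnings;

my @log;    # filled in by the on_match callbacks below

# preprocess receives a hash (input, parsed, ...) and returns the string
# handed to the regex. This one deliberately keeps the length unchanged.
sub _normalise_separators {
    my %args = @_;
    ( my $date = $args{input} ) =~ tr{/.}{-};
    return $date;
}

# on_match callbacks; their return values are ignored.
sub _log_label {
    my %args = @_;
    push @log, 'matched: ' . ( $args{label} // '?' );
    return;
}

sub _log_input {
    my %args = @_;
    push @log, "input: $args{input}";
    return;
}

my $spec = {
    length     => 10,                 # only used inside a multiple spec
    label      => 'dashed_date',
    regex      => qr/^(\d{4})-(\d\d)-(\d\d)$/,
    params     => [qw( year month day )],
    preprocess => \&_normalise_separators,
    on_match   => [ \&_log_label, \&_log_input ],    # arrayref of coderefs
};
```

       Such a hashref would be one element of the array given for a method
       in the parsers option.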

MULTIPLE SPECIFICATIONS

       These are very easily described as an array of single specifications.

       Note that if the first element of the array is an arrayref, then you're
       specifying options.

       ·   preprocess lets you specify a preprocessor that is called before
           any of the parsers are tried. This lets you do things like strip
           off timezones or any unnecessary data. The most common use people
           have for it at present is to get the input date to a particular
           length so that the length is usable (DateTime::Format::ICal would
           use it to strip off the variable length timezone).

           Arguments are as for the single parser preprocess variant with the
           exception that label is never given.

       ·   on_fail should be a reference to a subroutine that is called if the
           parser fails. If this is not provided, the default action is to
           call "DateTime::Format::Builder::on_fail", or the "on_fail" method
           of the subclass of DTFB that was used to create the parser.

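       A sketch of a multiple specification with an options block follows.
       The class name My::Format::Loose and the helper _strip_zulu are
       hypothetical, and the example assumes DateTime::Format::Builder is
       installed. The on_fail coderef deliberately ignores its arguments,
       since this document does not spell them out.

```perl
package My::Format::Loose;    # hypothetical class name
use strict;
use warnings;

use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            # Options block: the first element is an arrayref.
            [
                preprocess => \&_strip_zulu,
                on_fail    => sub { return },    # give up quietly
            ],
            {
                length => 14,
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                length => 8,
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

# Runs before any length check, so it may change the length of the input.
sub _strip_zulu {
    my %args = @_;
    ( my $date = $args{input} ) =~ s/Z\z//;
    return $date;
}

package main;

my $dt   = My::Format::Loose->parse_datetime('20030716Z');
my $none = My::Format::Loose->parse_datetime('garbage');
```

       Because the overall preprocess runs before length selection,
       stripping the trailing "Z" here lets the length 8 parser apply.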

EXECUTION FLOW

       Builder allows you to plug in a fair few callbacks, which can make
       following how a parse failed (or succeeded unexpectedly) somewhat
       tricky.

   For Single Specifications
       A single specification will do the following:

       User calls parser:

              my $dt = $class->parse_datetime( $string );

       1.  preprocess is called. It's given $string and a reference to the
           parsing workspace hash, which we'll call $p. At this point, $p is
           empty. The return value is used as $date for the rest of this
           single parser.  Anything put in $p is also used for the rest of
           this single parser.

       2.  regex is applied.

       3.  If regex did not match, then on_fail is called (and is given $date
           and also label if it was defined). Any return value is ignored and
           the next thing is for the single parser to return "undef".

           If regex did match, then on_match is called with the same arguments
           as would be given to on_fail. The return value is similarly
           ignored, but we then move to step 4 rather than exiting the parser.

       4.  postprocess is called with $date and a filled out $p. The return
           value is taken as an indication of whether the parse was a success
           or not. If it wasn't a success then the single parser will exit at
           this point, returning undef.

       5.  "DateTime->new()" is called and the user is given the resultant
           "DateTime" object.

       See the section on error handling regarding the "undef"s mentioned
       above.

   For Multiple Specifications
       With multiple specifications:

       User calls parser:

             my $dt = $class->complex_parse( $string );

       1.  The overall preprocessor is called and is given $string and the
           hashref $p (identically to the per parser preprocess mentioned in
           the previous flow).

           If the callback modifies $p then a copy of $p is given to each of
           the individual parsers.  This is so parsers won't accidentally
           pollute each other's workspace.

       2.  If an appropriate length specific parser is found, then it is
           called and the single parser flow (see the previous section) is
           followed, and the parser is given a copy of $p and the return value
           of the overall preprocessor as $date.

           If a "DateTime" object was returned, we go straight back to the
           user.

           If no appropriate parser was found, or the parser returned "undef",
           then we progress to step 3!

       3.  Any non-length based parsers are tried in the order they were
           specified.

           For each of those the single specification flow above is performed,
           and is given a copy of the output from the overall preprocessor.

           If a real "DateTime" object is returned then we exit back to the
           user.

           If no parser could parse, then an error is thrown.

       See the section on error handling regarding the "undef"s mentioned
       above.

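       The two flows above can be mimicked in plain Perl. The sketch below is
       not Builder's implementation: run_single and run_multiple are
       hypothetical helpers, they return the workspace hashref instead of
       calling "DateTime->new()", they pass only "input" and "parsed" to the
       callbacks, and the overall preprocessor is left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper following steps 1-5 of the single specification
# flow. Returns the workspace hashref or nothing on failure.
sub run_single {
    my ( $spec, $string, $p ) = @_;
    $p = { %{ $p || {} } };    # work on a private copy of the workspace

    # 1. preprocess: its return value becomes $date
    my $date = $spec->{preprocess}
        ? $spec->{preprocess}->( input => $string, parsed => $p )
        : $string;

    # 2. apply the regex
    my @captures = $date =~ $spec->{regex};

    # 3. on_fail or on_match; return values are ignored
    if ( !@captures ) {
        $spec->{on_fail}->( input => $date ) if $spec->{on_fail};
        return;
    }
    $spec->{on_match}->( input => $date ) if $spec->{on_match};
    @{$p}{ @{ $spec->{params} } } = @captures;

    # 4. postprocess decides whether the parse counts as a success
    return
        if $spec->{postprocess}
        && !$spec->{postprocess}->( input => $date, parsed => $p );

    # 5. the real module would call DateTime->new(%$p) here
    return $p;
}

# Hypothetical helper for the multiple specification flow: parsers whose
# length matches are tried first, then those without a length.
sub run_multiple {
    my ( $specs, $string ) = @_;
    my ( @by_length, @others );
    for my $spec (@$specs) {
        if ( !defined $spec->{length} ) { push @others, $spec; next }
        my @lengths =
            ref $spec->{length} ? @{ $spec->{length} } : ( $spec->{length} );
        push @by_length, $spec if grep { $_ == length $string } @lengths;
    }
    for my $spec ( @by_length, @others ) {
        my $p = run_single( $spec, $string );
        return $p if $p;
    }
    return;    # the real module calls on_fail() at this point
}

my @specs = (
    {
        length => 8,
        regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
        params => [qw( year month day )],
    },
    { regex => qr/^(\d{4})-(\d\d)$/, params => [qw( year month )] },
);
my $full    = run_multiple( \@specs, '20030716' );
my $partial = run_multiple( \@specs, '2003-07' );
```

       Tracing a troublesome input through helpers like these can make it
       easier to see which callback or regex rejected it.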

METHODS

       In the general course of things you won't need any of the methods. Life
       often throws unexpected things at us so the methods are all available
       for use.

   import
       "import()" is a wrapper for "create_class()". If you specify the class
       option (see documentation for "create_class()") it will be ignored.

   create_class
       This method can be used as the runtime equivalent of "import()". That
       is, it takes the exact same parameters as when one does:

          use DateTime::Format::Builder ( blah blah blah )

       That can be (almost) equivalently written as:

          use DateTime::Format::Builder;
          DateTime::Format::Builder->create_class( blah blah blah );

       The difference being that the first is done at compile time while the
       second is done at run time.

       In the tutorial I said there were only two parameters at present. I
       lied. There are actually five of them.

       ·   parsers takes a hashref of methods and their parser specifications.
           See the DateTime::Format::Builder::Tutorial for details.

           Note that if you define a subroutine of the same name as one of the
           methods you define here, an error will be thrown.

       ·   constructor determines whether and how to create a "new()" function
           in the new class. If given a true value, a constructor is created.
           If given a false value, one isn't.

           If given an anonymous sub or a reference to a sub then that is used
           as "new()".

           The default is 1 (that is, create a constructor using our default
           code which simply creates a hashref and blesses it).

           If your class defines its own "new()" method it will not be
           overwritten. If you define your own "new()" and also tell Builder
           to define one an error will be thrown.

       ·   verbose takes a value. If the value is undef, then logging is
           disabled. If the value is a filehandle then that's where logging
           will go. If it's a true value, then output will go to "STDERR".

           Alternatively, call "$DateTime::Format::Builder::verbose()" with
           the relevant value. Whichever value is given more recently is
           adhered to.

           Be aware that verbosity is a global setting.

       ·   class is optional and specifies the name of the class in which to
           create the specified methods.

           If using this method in the guise of "import()" then this field
           will cause an error so it is only of use when calling as
           "create_class()".

       ·   version is also optional and specifies the value to give $VERSION
           in the class. It's generally not recommended unless you're
           combining with the class option. An "ExtUtils::MakeMaker" / "CPAN"
           compliant version specification is much better.

       In addition to creating any of the methods it also creates a "new()"
       method that can instantiate (or clone) objects.

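       As a sketch of the run time form, the following creates a class with
       "create_class()" and then uses it. The class name My::Format::Runtime
       and the method name parse_date are made up for the example, and
       DateTime::Format::Builder is assumed to be installed.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Build the hypothetical class My::Format::Runtime at run time.
DateTime::Format::Builder->create_class(
    class   => 'My::Format::Runtime',
    parsers => {
        parse_date => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# The default constructor option gives the new class a new() method.
my $formatter = My::Format::Runtime->new;
my $dt        = $formatter->parse_date('2003-07-16');
```

       Since nothing happens until "create_class()" runs, the parser
       specifications can be computed by the program before the class is
       built.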

SUBCLASSING

       In the rest of the documentation I've often lied in order to get some
       of the ideas across more easily. The thing is, this module's very
       flexible. You can get markedly different behaviour from simply
       subclassing it and overriding some methods.

   create_method
       Given a parser coderef, returns a coderef that is suitable to be a
       method.

       The default action is to call "on_fail()" in the event of a non-parse,
       but you can make it do whatever you want.

   on_fail
       This is called in the event of a non-parse (unless you've overridden
       "create_method()" to do something else).

       The single argument is the input string. The default action is to call
       "croak()". Above, where I've said parsers or methods throw errors, this
       is the method that is doing the error throwing.

       You could conceivably override this method to, say, return "undef".

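       A minimal sketch of that last suggestion: a subclass whose "on_fail()"
       returns nothing instead of croaking. The package name
       My::Lenient::Builder is hypothetical, and the example assumes
       DateTime::Format::Builder is installed and that the parser object
       dispatches failures to the subclass as described above.

```perl
package My::Lenient::Builder;    # hypothetical subclass
use strict;
use warnings;
use parent 'DateTime::Format::Builder';

# Return nothing rather than croaking when no parser matches.
sub on_fail { return }

package main;

my $parser = My::Lenient::Builder->parser(
    regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
    params => [qw( year month day )],
);

my $ok  = $parser->parse_datetime('20030716');
my $bad = $parser->parse_datetime('nonsense');
```

       Callers of such a class have to check for an undefined result
       themselves, since failures no longer raise an exception.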

USING BUILDER OBJECTS aka USERS USING BUILDER

       The methods listed in the METHODS section are all you generally need
       when creating your own class. Sometimes you may not want a full blown
       class to parse something just for this one program. Some methods are
       provided to make that task easier.

   new
       The basic constructor. It takes no arguments, merely returns a new
       "DateTime::Format::Builder" object.

           my $parser = DateTime::Format::Builder->new();

       If called as a method on an object (rather than as a class method),
       then it clones the object.

           my $clone = $parser->new();

   clone
       Provided for those who prefer an explicit "clone()" method rather than
       using "new()" as an object method.

           my $clone_of_clone = $clone->clone();

   parser
       Given either a single or multiple parser specification, sets the object
       to have a parser based on that specification.

           $parser->parser(
               regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
               params => [qw( year    month  day    )],
           );

       The arguments given to "parser()" are handed directly to
       "create_parser()". The resultant parser is passed to "set_parser()".

       If called as an object method, it returns the object.

       If called as a class method, it creates a new object, sets its parser
       and returns that object.

   set_parser
       Sets the parser of the object to the given parser.

          $parser->set_parser( $coderef );

       Note: this method does not take specifications. It also does not take
       anything except coderefs. Luckily, coderefs are what most of the other
       methods produce.

       The method return value is the object itself.

   get_parser
       Returns the parser the object is using.

          my $code = $parser->get_parser();

   parse_datetime
       Given a string, it calls the parser and returns the "DateTime" object
       that results.

          my $dt = $parser->parse_datetime( "1979 07 16" );

       The return value, if not a "DateTime" object, is whatever the parser
       wants to return. Generally this means that if the parse failed an error
       will be thrown.

   format_datetime
       If you call this function, it will throw an error.

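       The object methods above can be strung together as in the following
       sketch. It assumes DateTime::Format::Builder is installed; the regex
       and variable names are illustrative only.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

my $parser = DateTime::Format::Builder->new;

# As an object method, parser() sets the parser and returns the object.
$parser->parser(
    regex  => qr/^ (\d{4}) \s+ (\d\d) \s+ (\d\d) $/x,
    params => [qw( year month day )],
);

# The clone is expected to carry the same parser.
my $clone = $parser->clone;
my $dt    = $clone->parse_datetime('1979 07 16');

# get_parser() and set_parser() deal in coderefs, not specifications.
my $code  = $parser->get_parser;
my $other = DateTime::Format::Builder->new->set_parser($code);

# format_datetime() is documented to throw, so guard the call.
my $formatted = eval { $parser->format_datetime($dt); 1 };
```

       Here $formatted stays false because "format_datetime()" throws.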

LONGER EXAMPLES

       Some longer examples are provided in the distribution. These implement
       some of the common parsing DateTime modules using Builder. Each of them
       is, or was, a drop-in replacement for the corresponding module at the
       time of writing.

THANKS

       Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
       DateTime::Format::ICal and DateTime::Format::MySQL, and some much
       needed review.

       Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
       writing the multi-length code (both one length with multiple parsers
       and single parser with multiple lengths), blame for the Regex custom
       constructor code, spotting a bug in Dispatch, and more much needed
       review.

       Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
       DateTime::Format::W3CDTF and the encouragement to rewrite these docs
       almost 100%!

       Claus Färber (CFAERBER) for having me get around to fixing the
       auto-constructor writing, providing the 'args'/'self' patch, and
       suggesting the multi-callbacks.

       Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
       supports.

       Matthew McGillis for pointing out that "on_fail" overriding should be
       simpler.

       Simon Cozens (SIMON) for saying it was cool.

SEE ALSO

       "datetime@perl.org" mailing list.

       http://datetime.perl.org/

       perl, DateTime, DateTime::Format::Builder::Tutorial,
       DateTime::Format::Builder::Parser

SUPPORT

       Bugs may be submitted at
       <http://rt.cpan.org/Public/Dist/Display.html?Name=DateTime-Format-Builder>
       or via email to bug-datetime-format-builder@rt.cpan.org
       <mailto:bug-datetime-format-builder@rt.cpan.org>.

       I am also usually active on IRC as 'autarch' on "irc://irc.perl.org".

SOURCE

       The source code repository for DateTime-Format-Builder can be found at
       <https://github.com/houseabsolute/DateTime-Format-Builder>.

DONATIONS

       If you'd like to thank me for the work I've done on this module, please
       consider making a "donation" to me via PayPal. I spend a lot of free
       time creating free software, and would appreciate any support you'd
       care to offer.

       Please note that I am not suggesting that you must do this in order for
       me to continue working on this particular software. I will continue to
       do so, inasmuch as I have in the past, for as long as it interests me.

       Similarly, a donation made in this way will probably not make me work
       on this software much more, unless I get so many donations that I can
       consider working on free software full time (let's all have a chuckle
       at that together).

       To donate, log into PayPal and send money to autarch@urth.org, or use
       the button at <http://www.urth.org/~autarch/fs-donation.html>.

AUTHORS

       ·   Dave Rolsky <autarch@urth.org>

       ·   Iain Truskett

CONTRIBUTORS

       ·   Daisuke Maki <daisuke@endeworks.jp>

       ·   Ian Truskett <spoon@cpan.org>

       ·   (no author) <(no author)@49043108-e40d-0410-ab17-85caa8b5b18d>

COPYRIGHT AND LICENSE

       This software is Copyright (c) 2019 by Dave Rolsky.

       This is free software, licensed under:

         The Artistic License 2.0 (GPL Compatible)

       The full text of the license can be found in the LICENSE file included
       with this distribution.

perl v5.30.0                      2019-07-26      DateTime::Format::Builder(3)