DateTime::Format::Builder(3)  User Contributed Perl Documentation  DateTime::Format::Builder(3)

NAME

       DateTime::Format::Builder - Create DateTime parser classes and objects.

SYNOPSIS

           package DateTime::Format::Brief;
           our $VERSION = '0.07';
           use DateTime::Format::Builder
           (
               parsers => {
                   parse_datetime => [
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                       params => [qw( year month day hour minute second )],
                   },
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)$/,
                       params => [qw( year month day )],
                   },
                   ],
               }
           );

DESCRIPTION

       DateTime::Format::Builder creates DateTime parsers.  Many string
       formats of dates and times are simple and just require a basic regular
       expression to extract the relevant information. Builder provides a
       simple way to do this without writing reams of structural code.

       Builder provides a number of methods, most of which you'll never need,
       or at least rarely need. They're provided mainly to expose the
       module's innards to any subclasses, or for when you need to do
       something slightly beyond what I expected.

TUTORIAL

       See DateTime::Format::Builder::Tutorial.

ERROR HANDLING AND BAD PARSES

       Often, I will speak of "undef" being returned. However, that's not
       strictly true.

       When a simple single specification is given for a method, the method
       isn't given a single parser directly. It's given a wrapper that will
       call "on_fail()" if the single parser returns "undef". The single
       parser must return "undef" so that a multiple parser can work nicely
       and actual errors can be thrown from any of the callbacks.

       Similarly, a multiple parser will only call "on_fail()" right at the
       end, when it has tried every parser it could.

       "on_fail()" (see later) is defined, by default, to throw an error.

       Multiple parser specifications can also specify "on_fail" with a
       coderef as an argument in the options block. This will take precedence
       over the inheritable and over-ridable method.

       That said, don't throw real errors from callbacks in multiple parser
       specifications unless you really want parsing to stop right there and
       not try any other parsers.

       In summary: calling a method will result in either a "DateTime" object
       being returned or an error being thrown (unless you've overridden
       "on_fail()" or "create_method()", or you've specified an "on_fail" key
       to a multiple parser specification).

       Individual parsers (be they multiple parsers or single parsers) will
       return either the "DateTime" object or "undef".
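
       As a minimal sketch of the summary above, the following traps the
       error thrown by the default "on_fail()". The class name
       My::Format::Brief is made up for illustration; the parser
       specifications mirror the SYNOPSIS. The exact error text is not
       relied upon.

```perl
use strict;
use warnings;

# Hypothetical class name; the parser specs are those of the SYNOPSIS.
package My::Format::Brief;
use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

# A string that one of the parsers matches: a DateTime object comes back.
my $ok = My::Format::Brief->parse_datetime('20030716');

# A string no parser matches: the default on_fail() throws, so trap it.
my $bad   = eval { My::Format::Brief->parse_datetime('not a date') };
my $error = $@;

print "parsed: ", $ok->ymd, "\n";
print "failed: $error" unless defined $bad;
```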

SINGLE SPECIFICATIONS

       A single specification is a hash ref of instructions on how to create a
       parser.

       The precise set of keys and values varies according to parser type.
       There are some common ones though:

       ·   length is an optional parameter that can be used to specify that
           this particular regex is only applicable to strings of a certain
           fixed length. This can be used to make parsers more efficient. It's
           strongly recommended that any parser that can use this parameter
           does.

           You may happily specify the same length twice. The parsers will be
           tried in order of specification.

           You can also specify multiple lengths by giving it an arrayref of
           numbers rather than just a single scalar.  If doing so, please keep
           the number of lengths to a minimum.

           If any specifications without lengths are given and the particular
           length parser fails, then the non-length parsers are tried.

           This parameter is ignored unless the specification is part of a
           multiple parser specification.

       ·   label provides a name for the specification and is passed to some
           of the callbacks about to be mentioned.

       ·   on_match and on_fail are callbacks. Both routines will be called
           with parameters of:

           ·   input, being the input to the parser (after any preprocessing
               callbacks).

           ·   label, being the label of the parser, if there is one.

           ·   self, being the object on which the method has been invoked
               (which may just be a class name). Naturally, you can then
               invoke your own methods on it to get information you want.

           ·   args, being an arrayref of any passed arguments, if any.  If
               there were no arguments, then this parameter is not given.

           These routines will be called depending on whether the regex match
           succeeded or failed.

       ·   preprocess is a callback provided for cleaning up input prior to
           parsing. It's given a hash as arguments with the following keys:

           ·   input being the datetime string the parser was given (if using
               multiple specifications and an overall preprocess then this is
               the date after it's been through that preprocessor).

           ·   parsed being the state of parsing so far. Usually empty at this
               point unless an overall preprocess was given.  Items may be
               placed in it and will be given to any postprocessor and
               "DateTime->new" (unless the postprocessor deletes it).

           ·   self, args, label as per on_match and on_fail.

           The return value from the routine is what is given to the regex.
           Note that this is the last code stop before the match.

           Note: mixing length and a preprocess that modifies the length of
           the input string is probably not what you meant to do. You probably
           meant to use the multiple parser variant of preprocess which is
           done before any length calculations. This "single parser" variant
           of preprocess is performed after any length calculations.

       ·   postprocess is the last code stop before "DateTime->new()" is
           called. It's given the same arguments as preprocess. This allows it
           to modify the parsed parameters after the parse and before the
           creation of the object. For example, you might use:

               {
                   regex  => qr/^(\d\d) (\d\d) (\d\d)$/,
                   params => [qw( year  month  day   )],
                   postprocess => \&_fix_year,
               }

           where "_fix_year" is defined as:

               sub _fix_year
               {
                   my %args = @_;
                   my ($date, $p) = @args{qw( input parsed )};
                   $p->{year} += $p->{year} > 69 ? 1900 : 2000;
                   return 1;
               }

           This will cause the two digit years to be corrected according to
           the cut off. If the year was '69' or lower, then it is made into
           2069 (or 2045, or whatever the year was parsed as). Otherwise it is
           assumed to be 19xx. The DateTime::Format::Mail module uses code
           similar to this (only it allows the cut off to be configured and it
           doesn't use Builder).

           Note: It is very important to return an explicit value from the
           postprocess callback. If the return value is false then the parse
           is taken to have failed. If the return value is true, then the
           parse is taken to have succeeded and "DateTime->new()" is called.

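       To see the cut off in action without involving a parser at all, the
       "_fix_year" callback can be called by hand. This is only a sketch:
       the hashes below stand in for the "parsed" workspace that Builder
       would normally pass, and the sample values are made up.

```perl
use strict;
use warnings;

# Same callback as in the text: pivot two-digit years around 69.
sub _fix_year {
    my %args = @_;
    my ( $date, $p ) = @args{qw( input parsed )};
    $p->{year} += $p->{year} > 69 ? 1900 : 2000;
    return 1;    # explicit true value: the parse counts as a success
}

# Stand-ins for what a postprocess callback receives: the input string
# and a hashref of the fields parsed so far.
my %early = ( year => 45, month => 7, day => 16 );
my %late  = ( year => 79, month => 7, day => 16 );

_fix_year( input => '45 07 16', parsed => \%early );
_fix_year( input => '79 07 16', parsed => \%late );

print "$early{year} $late{year}\n";    # prints "2045 1979"
```
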
       See the documentation for the individual parsers for their valid keys.

       Parsers at the time of writing are:

       ·   DateTime::Format::Builder::Parser::Regex - provides regular
           expression based parsing.

       ·   DateTime::Format::Builder::Parser::Strptime - provides strptime
           based parsing.

   Subroutines / coderefs as specifications.
       A single parser specification can be a coderef. This was added mostly
       because it could be and because I knew someone, somewhere, would want
       to use it.

       If the specification is a reference to a piece of code, be it a
       subroutine, anonymous, or whatever, then it's passed more or less
       straight through. The code should return "undef" in event of failure
       (or any false value, but "undef" is strongly preferred), or a true
       value in the event of success (ideally a "DateTime" object or some
       object that has the same interface).

       This all said, I generally wouldn't recommend using this feature unless
       you have to.

   Callbacks
       I mention a number of callbacks in this document.

       Any time you see a callback being mentioned, you can, if you like,
       substitute an arrayref of coderefs rather than having the straight
       coderef.

MULTIPLE SPECIFICATIONS

       These are very easily described as an array of single specifications.

       Note that if the first element of the array is an arrayref, then you're
       specifying options.

       ·   preprocess lets you specify a preprocessor that is called before
           any of the parsers are tried. This lets you do things like strip
           off timezones or any unnecessary data. The most common use people
           have for it at present is to get the input date to a particular
           length so that the length is usable (DateTime::Format::ICal would
           use it to strip off the variable length timezone).

           Arguments are as for the single parser preprocess variant with the
           exception that label is never given.

       ·   on_fail should be a reference to a subroutine that is called if the
           parser fails. If this is not provided, the default action is to
           call "DateTime::Format::Builder::on_fail", or the "on_fail" method
           of the subclass of DTFB that was used to create the parser.
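
       The options block might be used along these lines. This is a sketch
       rather than a recipe: the class name My::Format::Lenient and the
       whitespace trimming are invented for illustration, and the "on_fail"
       coderef ignores whatever arguments it is handed, since they are not
       spelled out above.

```perl
use strict;
use warnings;

package My::Format::Lenient;    # hypothetical class name
use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            # Options block: the first element, and itself an arrayref.
            [
                # Trim surrounding whitespace before any length checks.
                preprocess => sub {
                    my %args = @_;
                    ( my $input = $args{input} ) =~ s/^\s+|\s+$//g;
                    return $input;
                },
                # Return quietly instead of throwing on a failed parse.
                on_fail => sub { return },
            ],
            {
                length => 14,
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                length => 8,
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

my $dt   = My::Format::Lenient->parse_datetime('  20030716  ');
my $none = My::Format::Lenient->parse_datetime('garbage');

print defined $dt   ? $dt->ymd : 'no parse', "\n";
print defined $none ? 'parsed' : 'no parse', "\n";
```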

EXECUTION FLOW

       Builder allows you to plug in a fair few callbacks, which can make
       following how a parse failed (or succeeded unexpectedly) somewhat
       tricky.

   For Single Specifications
       A single specification will do the following:

       User calls parser:

              my $dt = $class->parse_datetime( $string );

       1.  preprocess is called. It's given $string and a reference to the
           parsing workspace hash, which we'll call $p. At this point, $p is
           empty. The return value is used as $date for the rest of this
           single parser.  Anything put in $p is also used for the rest of
           this single parser.

       2.  regex is applied.

       3.  If regex did not match, then on_fail is called (and is given $date
           and also label if it was defined). Any return value is ignored and
           the next thing is for the single parser to return "undef".

           If regex did match, then on_match is called with the same arguments
           as would be given to on_fail. The return value is similarly
           ignored, but we then move to step 4 rather than exiting the parser.

       4.  postprocess is called with $date and a filled out $p. The return
           value is taken as an indication of whether the parse was a success
           or not. If it wasn't a success then the single parser will exit at
           this point, returning undef.

       5.  "DateTime->new()" is called and the user is given the resultant
           "DateTime" object.

       See the section on error handling regarding the "undef"s mentioned
       above.

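       The five steps can be mimicked in plain Perl. The sketch below is
       not Builder's code: "run_single_spec" is a made-up helper, it passes
       only the input and parsed arguments to callbacks, and it returns the
       workspace hashref where the real module would call "DateTime->new()".

```perl
use strict;
use warnings;

# Illustrative only: run one specification against a string, following
# steps 1 to 5. Returns a hashref of fields, or undef on a failed parse.
sub run_single_spec {
    my ( $spec, $string ) = @_;
    my %p;    # the parsing workspace, empty at the start

    # Step 1: preprocess may rewrite the input and pre-fill %p.
    my $date = $spec->{preprocess}
        ? $spec->{preprocess}->( input => $string, parsed => \%p )
        : $string;

    # Step 2: apply the regex.
    my @matches = $date =~ $spec->{regex};

    # Step 3: on_fail or on_match; their return values are ignored.
    if ( !@matches ) {
        $spec->{on_fail}->( input => $date ) if $spec->{on_fail};
        return undef;
    }
    $spec->{on_match}->( input => $date ) if $spec->{on_match};
    @p{ @{ $spec->{params} } } = @matches;

    # Step 4: postprocess decides whether the parse counts as a success.
    if ( $spec->{postprocess} ) {
        return undef
            unless $spec->{postprocess}->( input => $date, parsed => \%p );
    }

    # Step 5: the real module would construct the DateTime object here.
    return \%p;
}

my $spec = {
    preprocess  => sub { my %a = @_; ( my $s = $a{input} ) =~ s/\s+//g; $s },
    regex       => qr/^(\d{4})(\d\d)(\d\d)$/,
    params      => [qw( year month day )],
    postprocess => sub { my %a = @_; $a{parsed}{month} <= 12 },
};

my $good = run_single_spec( $spec, '2003 07 16' );    # hashref of fields
my $bad  = run_single_spec( $spec, '2003 13 16' );    # undef: month 13
```
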
   For Multiple Specifications
       With multiple specifications:

       User calls parser:

             my $dt = $class->complex_parse( $string );

       1.  The overall preprocessor is called and is given $string and the
           hashref $p (identically to the per parser preprocess mentioned in
           the previous flow).

           If the callback modifies $p then a copy of $p is given to each of
           the individual parsers.  This is so parsers won't accidentally
           pollute each other's workspace.

       2.  If an appropriate length specific parser is found, then it is
           called and the single parser flow (see the previous section) is
           followed, and the parser is given a copy of $p and the return value
           of the overall preprocessor as $date.

           If a "DateTime" object was returned, we go straight back to the
           user.

           If no appropriate parser was found, or the parser returned "undef",
           then we progress to step 3!

       3.  Any non-length based parsers are tried in the order they were
           specified.

           For each of those the single specification flow above is performed,
           and is given a copy of the output from the overall preprocessor.

           If a real "DateTime" object is returned then we exit back to the
           user.

           If no parser could parse, then an error is thrown.

       See the section on error handling regarding the "undef"s mentioned
       above.
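
       Steps 2 and 3 amount to a two-stage dispatch, which can be sketched
       in plain Perl. This is not Builder's code: "run_multiple", the
       "code" key and the error text are invented for illustration, each
       parser is reduced to a coderef returning a string or "undef", and
       the overall preprocessor of step 1 is left out.

```perl
use strict;
use warnings;

# Illustrative only: each entry holds a coderef and, optionally, the
# input length it applies to.
sub run_multiple {
    my ( $parsers, $string ) = @_;

    # Step 2: try parsers registered for this exact length, in order.
    for my $entry ( grep { defined $_->{length} } @$parsers ) {
        next unless $entry->{length} == length $string;
        my $result = $entry->{code}->($string);
        return $result if defined $result;
    }

    # Step 3: fall back to the parsers without a length, in order.
    for my $entry ( grep { !defined $_->{length} } @$parsers ) {
        my $result = $entry->{code}->($string);
        return $result if defined $result;
    }

    # Nothing matched: this is the point at which an error is thrown.
    die "Invalid date format: $string\n";
}

my @parsers = (
    {
        length => 8,
        code   => sub {
            $_[0] =~ /^(\d{4})(\d\d)(\d\d)$/ ? "ymd:$1-$2-$3" : undef;
        },
    },
    {
        code => sub { $_[0] =~ /^(\d{4})-(\d\d)$/ ? "ym:$1-$2" : undef },
    },
);

print run_multiple( \@parsers, '20030716' ), "\n";    # prints "ymd:2003-07-16"
print run_multiple( \@parsers, '2003-07' ),  "\n";    # prints "ym:2003-07"

# Eight characters, but neither parser matches, so the call dies.
my $failed = !defined eval { run_multiple( \@parsers, 'nonsense' ) };
```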

METHODS

       In the general course of things you won't need any of the methods. Life
       often throws unexpected things at us so the methods are all available
       for use.

   import
       "import()" is a wrapper for "create_class()". If you specify the class
       option (see documentation for "create_class()") it will be ignored.

   create_class
       This method can be used as the runtime equivalent of "import()". That
       is, it takes the exact same parameters as when one does:

          use DateTime::Format::Builder ( blah blah blah )

       That can be (almost) equivalently written as:

          use DateTime::Format::Builder;
          DateTime::Format::Builder->create_class( blah blah blah );

       The difference being that the first is done at compile time while the
       second is done at run time.

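       A concrete run time call might look like the sketch below. The class
       name My::Format::Runtime and the method name "parse_date" are made
       up for illustration; the class and parsers parameters are the ones
       described in this section.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# 'My::Format::Runtime' is a hypothetical class name.
DateTime::Format::Builder->create_class(
    class   => 'My::Format::Runtime',
    parsers => {
        # A single specification: one hashref for the method.
        parse_date => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# The method exists only once create_class() has run.
my $dt = My::Format::Runtime->parse_date('2003-07-16');
print $dt->ymd, "\n";
```
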
       In the tutorial I said there were only two parameters at present. I
       lied. There are actually five of them.

       ·   parsers takes a hashref of methods and their parser specifications.
           See the tutorial above for details.

           Note that if you define a subroutine of the same name as one of the
           methods you define here, an error will be thrown.

       ·   constructor determines whether and how to create a "new()" function
           in the new class. If given a true value, a constructor is created.
           If given a false value, one isn't.

           If given an anonymous sub or a reference to a sub then that is used
           as "new()".

           The default is 1 (that is, create a constructor using our default
           code which simply creates a hashref and blesses it).

           If your class defines its own "new()" method it will not be
           overwritten. If you define your own "new()" and also tell Builder
           to define one an error will be thrown.

       ·   verbose takes a value. If the value is undef, then logging is
           disabled. If the value is a filehandle then that's where logging
           will go. If it's a true value, then output will go to "STDERR".

           Alternatively, call "$DateTime::Format::Builder::verbose()" with
           the relevant value. Whichever value is given more recently is
           adhered to.

           Be aware that verbosity is a global setting.

       ·   class is optional and specifies the name of the class in which to
           create the specified methods.

           If using this method in the guise of "import()" then this field
           will cause an error so it is only of use when calling as
           "create_class()".

       ·   version is also optional and specifies the value to give $VERSION
           in the class. It's generally not recommended unless you're
           combining with the class option. An "ExtUtils::MakeMaker" / "CPAN"
           compliant version specification is much better.

       In addition to creating any of the methods it also creates a "new()"
       method that can instantiate (or clone) objects.

SUBCLASSING

       In the rest of the documentation I've often lied in order to get some
       of the ideas across more easily. The thing is, this module's very
       flexible. You can get markedly different behaviour from simply
       subclassing it and overriding some methods.

   create_method
       Given a parser coderef, returns a coderef that is suitable to be a
       method.

       The default action is to call "on_fail()" in the event of a non-parse,
       but you can make it do whatever you want.

   on_fail
       This is called in the event of a non-parse (unless you've overridden
       "create_method()" to do something else).

       The single argument is the input string. The default action is to call
       "croak()". Above, where I've said parsers or methods throw errors, this
       is the method that is doing the error throwing.

       You could conceivably override this method to, say, return "undef".

USING BUILDER OBJECTS aka USERS USING BUILDER

       The methods listed in the METHODS section are all you generally need
       when creating your own class. Sometimes you may not want a full blown
       class to parse something just for this one program. Some methods are
       provided to make that task easier.

   new
       The basic constructor. It takes no arguments, merely returns a new
       "DateTime::Format::Builder" object.

           my $parser = DateTime::Format::Builder->new();

       If called as a method on an object (rather than as a class method),
       then it clones the object.

           my $clone = $parser->new();

   clone
       Provided for those who prefer an explicit "clone()" method rather than
       using "new()" as an object method.

           my $clone_of_clone = $clone->clone();

   parser
       Given either a single or multiple parser specification, sets the object
       to have a parser based on that specification.

           $parser->parser(
               regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
               params => [qw( year    month  day    )],
           );

       The arguments given to "parser()" are handed directly to
       "create_parser()". The resultant parser is passed to "set_parser()".

       If called as an object method, it returns the object.

       If called as a class method, it creates a new object, sets its parser
       and returns that object.

   set_parser
       Sets the parser of the object to the given parser.

          $parser->set_parser( $coderef );

       Note: this method does not take specifications. It also does not take
       anything except coderefs. Luckily, coderefs are what most of the other
       methods produce.

       The method return value is the object itself.

   get_parser
       Returns the parser the object is using.

          my $code = $parser->get_parser();

   parse_datetime
       Given a string, it calls the parser and returns the "DateTime" object
       that results.

          my $dt = $parser->parse_datetime( "1979 07 16" );

       The return value, if not a "DateTime" object, is whatever the parser
       wants to return. Generally this means that if the parse failed an error
       will be thrown.

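       Put together, a throwaway parser object might be used as in the
       sketch below. The regex here is my own variation: because of the /x
       flag, literal spaces in a pattern are ignored, so "\s+" is used to
       match the spaces in the input string.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Build a one-off parser object instead of a whole class.
my $parser = DateTime::Format::Builder->new();
$parser->parser(
    regex  => qr/^ (\d{4}) \s+ (\d\d) \s+ (\d\d) $/x,
    params => [qw( year month day )],
);

# The object now parses strings such as "1979 07 16".
my $dt = $parser->parse_datetime('1979 07 16');
print $dt->ymd, "\n";
```
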
   format_datetime
       If you call this function, it will throw an error.

LONGER EXAMPLES

       Some longer examples are provided in the distribution. These implement
       some of the common parsing DateTime modules using Builder. Each of them
       is, or was, a drop in replacement for the corresponding module at the
       time of writing.

THANKS

       Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
       DateTime::Format::ICal and DateTime::Format::MySQL, and some much
       needed review.

       Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
       writing the multilength code (both one length with multiple parsers and
       single parser with multiple lengths), blame for the Regex custom
       constructor code, spotting a bug in Dispatch, and more much needed
       review.

       Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
       DateTime::Format::W3CDTF and the encouragement to rewrite these docs
       almost 100%!

       Claus Faerber (CFAERBER) for having me get around to fixing the
       auto-constructor writing, providing the 'args'/'self' patch, and
       suggesting the multi-callbacks.

       Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
       supports.

       Matthew McGillis for pointing out that "on_fail" overriding should be
       simpler.

       Simon Cozens (SIMON) for saying it was cool.

SUPPORT

       Support for this module is provided via the datetime@perl.org email
       list. See http://lists.perl.org/ for more details.

       Alternatively, log bugs via the CPAN RT system via the web or email:

           http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DateTime%3A%3AFormat%3A%3ABuilder
           bug-datetime-format-builder@rt.cpan.org

       This makes it much easier for me to track things and thus means your
       problem is less likely to be neglected.

       Copyright (C) Iain Truskett, 2003. All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself, either Perl version 5.000 or, at
       your option, any later version of Perl 5 you may have available.

       The full text of the licences can be found in the Artistic and COPYING
       files included with this module, or in perlartistic and perlgpl as
       supplied with Perl 5.8.1 and later.

AUTHOR

       Originally written by Iain Truskett <spoon@cpan.org>, who died on
       December 29, 2003.

       Maintained by Dave Rolsky <autarch@urth.org>.

SEE ALSO

       "datetime@perl.org" mailing list.

       http://datetime.perl.org/

       perl, DateTime, DateTime::Format::Builder::Tutorial,
       DateTime::Format::Builder::Parser

perl v5.12.0                      2010-05-14      DateTime::Format::Builder(3)