DateTime::Format::Builder(3)  User Contributed Perl Documentation  DateTime::Format::Builder(3)

NAME

       DateTime::Format::Builder - Create DateTime parser classes and objects.

SYNOPSIS

           package DateTime::Format::Brief;
           our $VERSION = '0.07';
           use DateTime::Format::Builder
           (
               parsers => {
                   parse_datetime => [
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                       params => [qw( year month day hour minute second )],
                   },
                   {
                       regex => qr/^(\d{4})(\d\d)(\d\d)$/,
                       params => [qw( year month day )],
                   },
                   ],
               }
           );

DESCRIPTION

       DateTime::Format::Builder creates DateTime parsers.  Many string
       formats of dates and times are simple and just require a basic regular
       expression to extract the relevant information. Builder provides a
       simple way to do this without writing reams of structural code.

       Builder provides a number of methods, most of which you'll never need,
       or at least rarely need. They're provided mainly to expose the
       module's innards to any subclasses, or for when you need to do something
       slightly beyond what I expected.

TUTORIAL

       See DateTime::Format::Builder::Tutorial.

ERROR HANDLING AND BAD PARSES

       Often, I will speak of "undef" being returned; however, that's not
       strictly true.

       When a simple single specification is given for a method, the method
       isn't given a single parser directly. It's given a wrapper that will
       call "on_fail()" if the single parser returns "undef". The single
       parser must return "undef" so that a multiple parser can work nicely
       and actual errors can be thrown from any of the callbacks.

       Similarly, a multiple parser will only call "on_fail()" right at the
       end, when it has tried all it could.

       "on_fail()" (see later) is defined, by default, to throw an error.

       Multiple parser specifications can also specify "on_fail" with a
       coderef as an argument in the options block. This will take precedence
       over the inheritable and overridable method.

       That said, don't throw real errors from callbacks in multiple parser
       specifications unless you really want parsing to stop right there and
       not try any other parsers.

       In summary: calling a method will result in either a "DateTime" object
       being returned or an error being thrown (unless you've overridden
       "on_fail()" or "create_method()", or you've specified an "on_fail" key
       to a multiple parser specification).

       Individual parsers (be they multiple parsers or single parsers) will
       return either the "DateTime" object or "undef".

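       As a rough illustration of the summary above, the following sketch
       builds a class like the one in the SYNOPSIS and guards a call with
       "eval". The class name My::Format::Brief is made up for this
       example; the behaviour shown assumes the default "on_fail()" has
       not been overridden.

```perl
use strict;
use warnings;

# Hypothetical class, modelled on the SYNOPSIS.
package My::Format::Brief;
use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

# A successful parse hands back a DateTime object.
my $dt = My::Format::Brief->parse_datetime('20080201');

# With the default on_fail() a failed parse throws, so wrap calls
# that may see bad input in an eval.
my $bad   = eval { My::Format::Brief->parse_datetime('not a date') };
my $error = $@;
print "parse failed: $error" if !defined $bad;
```

       Catching the error at the call site keeps the parser class itself
       simple; the alternative is to change what "on_fail()" does.
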

SINGLE SPECIFICATIONS

       A single specification is a hash ref of instructions on how to create a
       parser.

       The precise set of keys and values varies according to parser type.
       There are some common ones though:

       ·   length is an optional parameter that can be used to specify that
           this particular regex is only applicable to strings of a certain
           fixed length. This can be used to make parsers more efficient. It's
           strongly recommended that any parser that can use this parameter
           does.

           You may happily specify the same length twice. The parsers will be
           tried in order of specification.

           You can also specify multiple lengths by giving it an arrayref of
           numbers rather than just a single scalar.  If doing so, please keep
           the number of lengths to a minimum.

           If any specifications without lengths are given and the particular
           length parser fails, then the non-length parsers are tried.

           This parameter is ignored unless the specification is part of a
           multiple parser specification.

       ·   label provides a name for the specification and is passed to some
           of the callbacks about to be mentioned.

       ·   on_match and on_fail are callbacks. Both routines will be called
           with parameters of:

           ·   input, being the input to the parser (after any preprocessing
               callbacks).

           ·   label, being the label of the parser, if there is one.

           ·   self, being the object on which the method has been invoked
               (which may just be a class name). Naturally, you can then
               invoke your own methods on it to get information you want.

           ·   args, being an arrayref of any passed arguments, if any.  If
               there were no arguments, then this parameter is not given.

           These routines will be called depending on whether the regex match
           succeeded or failed.

       ·   preprocess is a callback provided for cleaning up input prior to
           parsing. It's given a hash as arguments with the following keys:

           ·   input being the datetime string the parser was given (if using
               multiple specifications and an overall preprocess then this is
               the date after it's been through that preprocessor).

           ·   parsed being the state of parsing so far. Usually empty at this
               point unless an overall preprocess was given.  Items may be
               placed in it and will be given to any postprocessor and
               "DateTime->new" (unless the postprocessor deletes it).

           ·   self, args, label as per on_match and on_fail.

           The return value from the routine is what is given to the regex.
           Note that this is the last code stop before the match.

           Note: mixing length and a preprocess that modifies the length of
           the input string is probably not what you meant to do. You probably
           meant to use the multiple parser variant of preprocess which is
           done before any length calculations. This "single parser" variant
           of preprocess is performed after any length calculations.

       ·   postprocess is the last code stop before "DateTime->new()" is
           called. It's given the same arguments as preprocess. This allows it
           to modify the parsed parameters after the parse and before the
           creation of the object. For example, you might use:

               {
                   regex  => qr/^(\d\d) (\d\d) (\d\d)$/,
                   params => [qw( year  month  day   )],
                   postprocess => \&_fix_year,
               }

           where "_fix_year" is defined as:

               sub _fix_year
               {
                   my %args = @_;
                   my ($date, $p) = @args{qw( input parsed )};
                   $p->{year} += $p->{year} > 69 ? 1900 : 2000;
                   return 1;
               }

           This will cause the two digit years to be corrected according to
           the cut off. If the year was '69' or lower, then it is made into
           2069 (or 2045, or whatever the year was parsed as). Otherwise it is
           assumed to be 19xx. The DateTime::Format::Mail module uses code
           similar to this (only it allows the cut off to be configured and it
           doesn't use Builder).

           Note: It is very important to return an explicit value from the
           postprocess callback. If the return value is false then the parse
           is taken to have failed. If the return value is true, then the
           parse is taken to have succeeded and "DateTime->new()" is called.

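       To see what the preprocess and postprocess callbacks do with their
       arguments, they can be called by hand. The sketch below does not
       load Builder at all: it passes the documented "input" and "parsed"
       keys itself. "_trim_input" is a made-up preprocess callback;
       "_fix_year" is the one shown above.

```perl
use strict;
use warnings;

# Hypothetical preprocess callback: trim surrounding whitespace.
# Its return value is what the regex would be matched against.
sub _trim_input {
    my %args  = @_;
    my $input = $args{input};
    $input =~ s/^\s+|\s+$//g;
    return $input;
}

# The postprocess callback from the text above.
sub _fix_year {
    my %args = @_;
    my ( $date, $p ) = @args{qw( input parsed )};
    $p->{year} += $p->{year} > 69 ? 1900 : 2000;
    return 1;    # an explicit true value: the parse succeeded
}

# Stand in for the parser: clean the input, apply the regex, fill
# the workspace hash from the captures, then run the postprocess.
my $clean = _trim_input( input => "  03 12 25 \n", parsed => {} );
my ( $y, $m, $d ) = $clean =~ /^(\d\d) (\d\d) (\d\d)$/;
my %parsed = ( year => $y, month => $m, day => $d );
my $ok = _fix_year( input => $clean, parsed => \%parsed );

print "$parsed{year}-$parsed{month}-$parsed{day}\n";
```

       After the postprocess has run, %parsed holds the values that would
       be handed to "DateTime->new()".
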
       See the documentation for the individual parsers for their valid keys.

       Parsers at the time of writing are:

       ·   DateTime::Format::Builder::Parser::Regex - provides regular
           expression based parsing.

       ·   DateTime::Format::Builder::Parser::Strptime - provides strptime
           based parsing.

       Subroutines / coderefs as specifications.

       A single parser specification can be a coderef. This was added mostly
       because it could be and because I knew someone, somewhere, would want
       to use it.

       If the specification is a reference to a piece of code, be it a
       subroutine, anonymous, or whatever, then it's passed more or less
       straight through. The code should return "undef" in event of failure
       (or any false value, but "undef" is strongly preferred), or a true
       value in the event of success (ideally a "DateTime" object or some
       object that has the same interface).

       This all said, I generally wouldn't recommend using this feature unless
       you have to.

       Callbacks

       I mention a number of callbacks in this document.

       Any time you see a callback being mentioned, you can, if you like,
       substitute an arrayref of coderefs rather than having the straight
       coderef.

MULTIPLE SPECIFICATIONS

       These are very easily described as an array of single specifications.

       Note that if the first element of the array is an arrayref, then you're
       specifying options.

       ·   preprocess lets you specify a preprocessor that is called before
           any of the parsers are tried. This lets you do things like strip
           off timezones or any unnecessary data. The most common use people
           have for it at present is to get the input date to a particular
           length so that the length is usable (DateTime::Format::ICal would
           use it to strip off the variable length timezone).

           Arguments are as for the single parser preprocess variant with the
           exception that label is never given.

       ·   on_fail should be a reference to a subroutine that is called if the
           parser fails. If this is not provided, the default action is to
           call "DateTime::Format::Builder::on_fail", or the "on_fail" method
           of the subclass of DTFB that was used to create the parser.

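       A sketch of a multiple specification with an options block might
       look like the following. The class name My::Format::Loose and both
       callbacks are invented for the example, and it assumes that the two
       options may share a single arrayref.

```perl
use strict;
use warnings;

# Hypothetical class name; the options arrayref comes first.
package My::Format::Loose;
use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            [
                # Overall preprocess: drop all whitespace so that the
                # length-keyed parsers below see a predictable length.
                preprocess => sub {
                    my %args = @_;
                    ( my $input = $args{input} ) =~ s/\s+//g;
                    return $input;
                },

                # Return undef rather than throwing when nothing matches.
                on_fail => sub { return undef },
            ],
            {
                length => 14,
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                length => 8,
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

my $dt  = My::Format::Loose->parse_datetime(' 2008 02 01 ');
my $bad = My::Format::Loose->parse_datetime('garbage');

print $dt->ymd, "\n" if defined $dt;
print "no parser matched\n" if !defined $bad;
```

       Doing the whitespace stripping in the overall preprocess, rather
       than in each single specification, is what makes the length keys
       usable here.
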

EXECUTION FLOW

       Builder allows you to plug in a fair few callbacks, which can make
       following how a parse failed (or succeeded unexpectedly) somewhat tricky.

       For Single Specifications

       A single specification will do the following:

       User calls parser:

              my $dt = $class->parse_datetime( $string );

       1   preprocess is called. It's given $string and a reference to the
           parsing workspace hash, which we'll call $p. At this point, $p is
           empty. The return value is used as $date for the rest of this
           single parser.  Anything put in $p is also used for the rest of this
           single parser.

       2   regex is applied.

       3   If regex did not match, then on_fail is called (and is given $date
           and also label if it was defined). Any return value is ignored and
           the next thing is for the single parser to return "undef".

           If regex did match, then on_match is called with the same arguments
           as would be given to on_fail. The return value is similarly
           ignored, but we then move to step 4 rather than exiting the parser.

       4   postprocess is called with $date and a filled out $p. The return
           value is taken as an indication of whether the parse was a success
           or not. If it wasn't a success then the single parser will exit at
           this point, returning undef.

       5   "DateTime->new()" is called and the user is given the resultant
           "DateTime" object.

       See the section on error handling regarding the "undef"s mentioned
       above.

       For Multiple Specifications

       With multiple specifications:

       User calls parser:

             my $dt = $class->complex_parse( $string );

       1   The overall preprocessor is called and is given $string and the
           hashref $p (identically to the per parser preprocess mentioned in
           the previous flow).

           If the callback modifies $p then a copy of $p is given to each of
           the individual parsers.  This is so parsers won't accidentally
           pollute each other's workspace.

       2   If an appropriate length specific parser is found, then it is
           called and the single parser flow (see the previous section) is
           followed, and the parser is given a copy of $p and the return value
           of the overall preprocessor as $date.

           If a "DateTime" object was returned, we go straight back to the
           user.

           If no appropriate parser was found, or the parser returned "undef",
           then we progress to step 3!

       3   Any non-length based parsers are tried in the order they were
           specified.

           For each of those the single specification flow above is performed,
           and is given a copy of the output from the overall preprocessor.

           If a real "DateTime" object is returned then we exit back to the
           user.

           If no parser could parse, then "on_fail" is called, which by
           default throws an error.

       See the section on error handling regarding the "undef"s mentioned
       above.

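       The multiple specification flow can be modelled in a few lines of
       plain Perl. This is a simplified sketch and not the module's code:
       "run_multiple" and its spec layout are invented here, parsers are
       plain coderefs returning a hashref or "undef", and only a single
       scalar length per spec is handled.

```perl
use strict;
use warnings;

# Simplified model of the flow above (NOT the module's own code).
sub run_multiple {
    my ( $input, %opt ) = @_;
    my %p;    # overall parsing workspace

    # Step 1: the overall preprocess, if any, produces $date.
    my $date = $opt{preprocess}
        ? $opt{preprocess}->( input => $input, parsed => \%p )
        : $input;

    # Step 2: parsers registered for this exact length, in order.
    my @by_length = grep {
        defined $_->{length} && $_->{length} == length($date)
    } @{ $opt{specs} };

    # Step 3: then the parsers that gave no length at all.
    my @no_length = grep { !defined $_->{length} } @{ $opt{specs} };

    for my $spec ( @by_length, @no_length ) {
        # Each parser gets its own copy of the workspace.
        my $result = $spec->{parser}->( $date, {%p} );
        return $result if defined $result;
    }

    # Nothing matched: hand over to on_fail.
    return $opt{on_fail}->($input);
}

my @specs = (
    {
        parser => sub {
            return unless $_[0] =~ /^(\d{4})-(\d\d)$/;
            return { year => $1, month => $2 };
        },
    },
    {
        length => 8,
        parser => sub {
            return unless $_[0] =~ /^(\d{4})(\d\d)(\d\d)$/;
            return { year => $1, month => $2, day => $3 };
        },
    },
);

my %common = (
    specs      => \@specs,
    preprocess => sub { my %a = @_; ( my $s = $a{input} ) =~ s/\s+//g; $s },
    on_fail    => sub { return undef },
);

my $full  = run_multiple( ' 20080201 ', %common );
my $month = run_multiple( '2008-02',    %common );
my $none  = run_multiple( 'junk',       %common );
```

       Here ' 20080201 ' is handled by the length 8 parser, '2008-02'
       falls through to the parser without a length, and 'junk' ends up
       in on_fail.
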

METHODS

       In the general course of things you won't need any of the methods. Life
       often throws unexpected things at us so the methods are all available
       for use.

       import

       "import()" is a wrapper for "create_class()". If you specify the class
       option (see documentation for "create_class()") it will be ignored.

       create_class

       This method can be used as the runtime equivalent of "import()". That
       is, it takes the exact same parameters as when one does:

          use DateTime::Format::Builder ( blah blah blah )

       That can be (almost) equivalently written as:

          use DateTime::Format::Builder;
          DateTime::Format::Builder->create_class( blah blah blah );

       The difference being that the first is done at compile time while the
       second is done at run time.

       In the tutorial I said there were only two parameters at present. I
       lied. There are actually a few more.

       ·   parsers takes a hashref of methods and their parser specifications.
           See the tutorial above for details.

           Note that if you define a subroutine of the same name as one of the
           methods you define here, an error will be thrown.

       ·   constructor determines whether and how to create a "new()" function
           in the new class. If given a true value, a constructor is created.
           If given a false value, one isn't.

           If given an anonymous sub or a reference to a sub then that is used
           as "new()".

           The default is 1 (that is, create a constructor using our default
           code which simply creates a hashref and blesses it).

           If your class defines its own "new()" method it will not be
           overwritten. If you define your own "new()" and also tell Builder to
           define one an error will be thrown.

       ·   verbose takes a value. If the value is undef, then logging is
           disabled. If the value is a filehandle then that's where logging
           will go. If it's a true value, then output will go to "STDERR".

           Alternatively, call "$DateTime::Format::Builder::verbose()" with
           the relevant value. Whichever value is given more recently is
           adhered to.

           Be aware that verbosity is a global setting.

       ·   class is optional and specifies the name of the class in which to
           create the specified methods.

           If using this method in the guise of "import()" then this field
           will cause an error so it is only of use when calling as
           "create_class()".

       ·   version is also optional and specifies the value to give $VERSION
           in the class. It's generally not recommended unless you're
           combining with the class option. An "ExtUtils::MakeMaker" / "CPAN"
           compliant version specification is much better.

       In addition to creating any of the methods it also creates a "new()"
       method that can instantiate (or clone) objects.

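       As a sketch of the run time form, the following creates a class on
       the fly and then uses it. The class name My::Runtime::Format, the
       method name parse_date and the version string are placeholders
       chosen for the example.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# 'My::Runtime::Format' is a hypothetical class name.
DateTime::Format::Builder->create_class(
    class   => 'My::Runtime::Format',
    version => '0.01',
    parsers => {
        parse_date => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# The generated method can be called on the new class straight away.
my $dt = My::Runtime::Format->parse_date('2008-02-01');
print $dt->ymd, "\n";
```

       Because this happens at run time, the class and version options
       are useful here in a way they are not with "use".
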

SUBCLASSING

       In the rest of the documentation I've often lied in order to get some
       of the ideas across more easily. The thing is, this module's very
       flexible. You can get markedly different behaviour from simply
       subclassing it and overriding some methods.

       create_method

       Given a parser coderef, returns a coderef that is suitable to be a
       method.

       The default action is to call "on_fail()" in the event of a non-parse,
       but you can make it do whatever you want.

       on_fail

       This is called in the event of a non-parse (unless you've overridden
       "create_method()" to do something else).

       The single argument is the input string. The default action is to call
       "croak()". Above, where I've said parsers or methods throw errors, this
       is the method that is doing the error throwing.

       You could conceivably override this method to, say, return "undef".

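       Overriding "on_fail()" to return "undef" could be sketched as
       below. My::Quiet::Builder and My::Quiet::Format are invented names;
       the example relies on the behaviour described earlier, namely that
       the "on_fail" of the Builder subclass used to create the parser is
       the one that gets called.

```perl
use strict;
use warnings;

# Hypothetical Builder subclass whose parsers return undef on failure.
package My::Quiet::Builder;
use base 'DateTime::Format::Builder';

sub on_fail { return undef }

package main;

# Create the parser class through the subclass, not through
# DateTime::Format::Builder itself.
My::Quiet::Builder->create_class(
    class   => 'My::Quiet::Format',
    parsers => {
        parse_datetime => {
            regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

my $dt  = My::Quiet::Format->parse_datetime('20080201');
my $bad = My::Quiet::Format->parse_datetime('not a date');

print $dt->ymd, "\n" if defined $dt;
print "could not parse\n" if !defined $bad;
```

       Callers of such a class need to check the return value, since a
       bad date no longer raises an error.
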

USING BUILDER OBJECTS aka USERS USING BUILDER

       The methods listed in the METHODS section are all you generally need
       when creating your own class. Sometimes you may not want a full blown
       class to parse something just for this one program. Some methods are
       provided to make that task easier.

       new

       The basic constructor. It takes no arguments, merely returns a new
       "DateTime::Format::Builder" object.

           my $parser = DateTime::Format::Builder->new();

       If called as a method on an object (rather than as a class method),
       then it clones the object.

           my $clone = $parser->new();

       clone

       Provided for those who prefer an explicit "clone()" method rather than
       using "new()" as an object method.

           my $clone_of_clone = $clone->clone();

       parser

       Given either a single or multiple parser specification, sets the object
       to have a parser based on that specification.

           $parser->parser(
               regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
               params => [qw( year    month  day    )],
           );

       The arguments given to "parser()" are handed directly to
       "create_parser()". The resultant parser is passed to "set_parser()".

       If called as an object method, it returns the object.

       If called as a class method, it creates a new object, sets its parser
       and returns that object.

       set_parser

       Sets the parser of the object to the given parser.

          $parser->set_parser( $coderef );

       Note: this method does not take specifications. It also does not take
       anything except coderefs. Luckily, coderefs are what most of the other
       methods produce.

       The method return value is the object itself.

       get_parser

       Returns the parser the object is using.

          my $code = $parser->get_parser();

       parse_datetime

       Given a string, it calls the parser and returns the "DateTime" object
       that results.

          my $dt = $parser->parse_datetime( "1979 07 16" );

       The return value, if not a "DateTime" object, is whatever the parser
       wants to return. Generally this means that if the parse failed an error
       will be thrown.

       format_datetime

       If you call this function, it will throw an error.

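       Put together, using a Builder object directly might look like the
       sketch below. The input string is an assumption of the example: it
       is chosen to suit the /x regex, which ignores the spaces in the
       pattern and so expects eight consecutive digits.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Build a one-off parser object rather than a whole class.
my $parser = DateTime::Format::Builder->new();
$parser->parser(
    regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
    params => [qw( year month day )],
);

# Under /x the whitespace in the pattern is not significant, so the
# input is eight digits with no separators.
my $dt = $parser->parse_datetime('19790716');
print $dt->ymd, "\n";

# Clone the object and fetch the coderef it is using.
my $copy = $parser->clone();
my $code = $copy->get_parser();
```

       The coderef returned by "get_parser()" is the kind of value that
       "set_parser()" accepts.
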

LONGER EXAMPLES

       Some longer examples are provided in the distribution. These implement
       some of the common parsing DateTime modules using Builder. Each of them
       is, or was, a drop-in replacement for the corresponding module at the
       time of writing.

THANKS

       Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
       DateTime::Format::ICal and DateTime::Format::MySQL, and some much
       needed review.

       Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
       writing the multilength code (both one length with multiple parsers and
       single parser with multiple lengths), blame for the Regex custom
       constructor code, spotting a bug in Dispatch, and more much needed
       review.

       Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
       DateTime::Format::W3CDTF and the encouragement to rewrite these docs
       almost 100%!

       Claus Faerber (CFAERBER) for having me get around to fixing the
       auto-constructor writing, providing the 'args'/'self' patch, and
       suggesting the multi-callbacks.

       Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
       supports.

       Matthew McGillis for pointing out that "on_fail" overriding should be
       simpler.

       Simon Cozens (SIMON) for saying it was cool.

SUPPORT

       Support for this module is provided via the datetime@perl.org email
       list. See http://lists.perl.org/ for more details.

       Alternatively, log bugs via the CPAN RT system via the web or email:

           http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DateTime%3A%3AFormat%3A%3ABuilder
           bug-datetime-format-builder@rt.cpan.org

       This makes it much easier for me to track things and thus means your
       problem is less likely to be neglected.

LICENCE AND COPYRIGHT

       Copyright (C) Iain Truskett, 2003. All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself, either Perl version 5.000 or, at
       your option, any later version of Perl 5 you may have available.

       The full text of the licences can be found in the Artistic and COPYING
       files included with this module, or in perlartistic and perlgpl as
       supplied with Perl 5.8.1 and later.

AUTHOR

       Originally written by Iain Truskett <spoon@cpan.org>, who died on
       December 29, 2003.

       Maintained by Dave Rolsky <autarch@urth.org>.

SEE ALSO

       "datetime@perl.org" mailing list.

       http://datetime.perl.org/

       perl, DateTime, DateTime::Format::Builder::Tutorial,
       DateTime::Format::Builder::Parser

perl v5.8.8                       2008-02-01      DateTime::Format::Builder(3)