Text::Reform(3) User Contributed Perl Documentation Text::Reform(3)

NAME
Text::Reform - Manual text wrapping and reformatting

VERSION
This document describes version 1.20 of Text::Reform, released
2009-09-06.

SYNOPSIS
    use Text::Reform;

    print form $template,
               $data, $to, $fill, $it, $with;


    use Text::Reform qw( tag );

    print tag 'B', $enboldened_text;

DESCRIPTION
The "form" sub
25 The "form()" subroutine may be exported from the module. It takes a
26 series of format (or "picture") strings followed by replacement values,
27 interpolates those values into each picture string, and returns the
28 result. The effect is similar to the inbuilt perl "format" mechanism,
29 although the field specification syntax is simpler and some of the
30 formatting behaviour is more sophisticated.
31
32 A picture string consists of sequences of the following characters:
33
34 < Left-justified field indicator. A series of two or more
35 sequential <'s specify a left-justified field to be filled by a
36 subsequent value. A single < is formatted as the literal
37 character '<'
38
39 > Right-justified field indicator. A series of two or more
40 sequential >'s specify a right-justified field to be filled by
41 a subsequent value. A single > is formatted as the literal
42 character '>'
43
44 <<<>>> Fully-justified field indicator. Field may be of any width,
45 and brackets need not balance, but there must be at least 2 '<'
46 and 2 '>'.
47
48 ^ Centre-justified field indicator. A series of two or more
49 sequential ^'s specify a centred field to be filled by a
50 subsequent value. A single ^ is formatted as the literal
51 character '^'
52
53 >>>.<<<<
54 A numerically formatted field with the specified number of
55 digits to either side of the decimal place. See "Numerical
56 formatting" below.
57
58 [ Left-justified block field indicator. Just like a < field,
59 except it repeats as required on subsequent lines. See below.
60 A single [ is formatted as the literal character '['
61
62 ] Right-justified block field indicator. Just like a > field,
63 except it repeats as required on subsequent lines. See below.
64 A single ] is formatted as the literal character ']'
65
66 [[[]]] Fully-justified block field indicator. Just like a <<<>>>
67 field, except it repeats as required on subsequent lines. See
68 below. Field may be of any width, and brackets need not
69 balance, but there must be at least 2 '[' and 2 ']'.
70
71 | Centre-justified block field indicator. Just like a ^ field,
72 except it repeats as required on subsequent lines. See below.
73 A single | is formatted as the literal character '|'
74
75 ]]].[[[[
76 A numerically formatted block field with the specified number
77 of digits to either side of the decimal place. Just like a
78 >>>.<<<< field, except it repeats as required on subsequent
79 lines. See below.
80
81 ~ A one-character wide block field.
82
83 \ Literal escape of next character (e.g. "\~" is formatted as
84 '~', not a one character wide block field).
85
86 Any other character
87 That literal character.
88
89 Any substitution value which is "undef" (either explicitly so, or
90 because it is missing) is replaced by an empty string.
91
92 Controlling line filling.
Note that, unlike a perl "format", "form" preserves whitespace
(including newlines) unless called with certain options.
95
96 The "squeeze" option (when specified with a true value) causes any
97 sequence of spaces and/or tabs (but not newlines) in an interpolated
98 string to be replaced with a single space.
99
100 A true value for the "fill" option causes (only) newlines to be
101 squeezed.
102
103 To minimize all whitespace, you need to specify both options. Hence:
104
105 $format = "EG> [[[[[[[[[[[[[[[[[[[[[";
106 $data = "h e\t l lo\nworld\t\t\t\t\t";
107
108 print form $format, $data; # all whitespace preserved:
109 #
110 # EG> h e l lo
111 # EG> world
112
113
114 print form {squeeze=>1}, # only newlines preserved:
115 $format, $data; #
116 # EG> h e l lo
117 # EG> world
118
119
120 print form {fill=>1}, # only spaces/tabs preserved:
121 $format, $data; #
122 # EG> h e l lo world
123
124
125 print form {squeeze=>1, fill=>1}, # no whitespace preserved:
126 $format, $data; #
127 # EG> h e l lo world
128
129 Whether or not filling or squeezing is in effect, "form" can also be
130 directed to trim any extra whitespace from the end of each line it
131 formats, using the "trim" option. If this option is specified with a
132 true value, every line returned by "form" will automatically have the
133 substitution "s/[ \t]+$//gm" applied to it.
134
135 Hence:
136
137 print length form "[[[[[[[[[[", "short";
138 # 11
139
140 print length form {trim=>1}, "[[[[[[[[[[", "short";
141 # 6
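
The effect of the "squeeze" and "trim" options can be approximated with
ordinary substitutions. The following is only a sketch in plain Perl and
is not the module's own code; the helper names squeeze_ws and trim_ws are
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers that mimic the documented effect of the options;
# they are not part of Text::Reform.
sub squeeze_ws {                 # like squeeze=>1: collapse spaces/tabs
    my ($text) = @_;
    $text =~ s/[ \t]+/ /g;
    return $text;
}

sub trim_ws {                    # like trim=>1: strip trailing spaces/tabs
    my ($text) = @_;
    $text =~ s/[ \t]+$//gm;
    return $text;
}

my $data     = "h e\t l lo\nworld\t\t\t\t\t";
my $squeezed = squeeze_ws($data);        # "h e l lo\nworld "
my $padded   = "short     \n";           # a 10-column field plus newline
my $trimmed  = trim_ws($padded);         # "short\n"

print length($padded), " ", length($trimmed), "\n";   # prints: 11 6
```

The two lengths match the 11 and 6 reported in the example above.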
142
143 It is also possible to control the character used to fill lines that
144 are too short, using the 'filler' option. If this option is specified
145 the value of the 'filler' flag is used as the fill string, rather than
146 the default " ".
147
148 For example:
149
150 print form { filler=>'*' },
151 "Pay bearer: ^^^^^^^^^^^^^^^^^^^",
152 '$123.45';
153
154 prints:
155
156 Pay bearer: ******$123.45******
157
158 If the filler string is longer than one character, it is truncated to
159 the appropriate length. So:
160
161 print form { filler=>'-->' },
162 "Pay bearer: ]]]]]]]]]]]]]]]]]]]",
163 ['$1234.50', '$123.45', '$12.34'];
164
165 prints:
166
167 Pay bearer: ->-->-->-->$1234.50
168 Pay bearer: -->-->-->-->$123.45
169 Pay bearer: >-->-->-->-->$12.34
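
The sample output above is consistent with the filler being repeated and
then cut on its left-hand side so that it fits the gap exactly. A minimal
sketch of that idea follows; pad_left is a hypothetical helper written
for illustration, not a Text::Reform function, and the field width of 19
is taken from the sample output.

```perl
use strict;
use warnings;

# Hypothetical helper: right-justify $value in $width columns, padding on
# the left with a repeated $filler that is cut to fit (keeping the
# rightmost part of the repeated run).
sub pad_left {
    my ($value, $width, $filler) = @_;
    my $need = $width - length($value);
    return $value if $need <= 0;
    my $run = $filler x (int($need / length($filler)) + 1);
    return substr($run, -$need) . $value;
}

print "Pay bearer: ", pad_left($_, 19, '-->'), "\n"
    for '$1234.50', '$123.45', '$12.34';
```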
170
If the value of the 'filler' option is a hash, then its 'left' and
'right' entries specify separate filler strings for each side of an
interpolated value. So:
174
175 print form { filler=>{left=>'->', right=>'*'} },
176 "Pay bearer: <<<<<<<<<<<<<<<<<<",
177 '$123.45',
178 "Pay bearer: >>>>>>>>>>>>>>>>>>",
179 '$123.45',
180 "Pay bearer: ^^^^^^^^^^^^^^^^^^",
181 '$123.45';
182
183 prints:
184
185 Pay bearer: $123.45***********
186 Pay bearer: >->->->->->$123.45
187 Pay bearer: >->->$123.45******
188
189 Temporary and permanent default options
If "form" is called with options, but no template string or data, it
resets its defaults to the options specified. If called in a void
context:
193
194 form { squeeze => 1, trim => 1 };
195
196 the options become permanent defaults.
197
198 However, when called with only options in non-void context, "form"
199 resets its defaults to those options and returns an object. The reset
200 default values persist only until that returned object is destroyed.
201 Hence to temporarily reset "form"'s defaults within a single
202 subroutine:
203
    sub single {
        my $tmp = form { squeeze => 1, trim => 1 };

        # do formatting with the above defaults

    } # form's defaults revert to previous values as $tmp object destroyed
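
The distinction between void and non-void context, and the restoration
of defaults when the returned object is destroyed, can be pictured with
a small scope guard. This is a sketch of the general technique only;
%defaults, set_defaults and DefaultsGuard are hypothetical names, and
nothing here is taken from the module's source.

```perl
use strict;
use warnings;

my %defaults = (squeeze => 0, trim => 0);   # stand-in for form's defaults

package DefaultsGuard {
    sub new     { my ($class, $saved) = @_; return bless { saved => $saved }, $class }
    sub DESTROY { my ($self) = @_; %defaults = %{ $self->{saved} } }
}

sub set_defaults {
    my (%new) = @_;
    my %saved = %defaults;
    %defaults = (%defaults, %new);
    return unless defined wantarray;        # void context: change is permanent
    return DefaultsGuard->new(\%saved);     # otherwise: undone when guard dies
}

sub single {
    my $tmp = set_defaults(squeeze => 1, trim => 1);
    return $defaults{squeeze};              # 1 while $tmp is alive
}

my $inside = single();
my $after  = $defaults{squeeze};            # back to 0 once $tmp is gone
print "$inside $after\n";                   # prints: 1 0
```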
210
211 Multi-line format specifiers and interleaving
212 By default, if a format specifier contains two or more lines (i.e. one
213 or more newline characters), the entire format specifier is repeatedly
214 filled as a unit, until all block fields have consumed their
215 corresponding arguments. For example, to build a simple look-up table:
216
217 my @values = (1..12);
218
219 my @squares = map { sprintf "%.6g", $_**2 } @values;
220 my @roots = map { sprintf "%.6g", sqrt($_) } @values;
221 my @logs = map { sprintf "%.6g", log($_) } @values;
222 my @inverses = map { sprintf "%.6g", 1/$_ } @values;
223
224 print form
225 " N N**2 sqrt(N) log(N) 1/N",
226 "=====================================================",
227 "| [[ | [[[ | [[[[[[[[[[ | [[[[[[[[[ | [[[[[[[[[ |
228 -----------------------------------------------------",
229 \@values, \@squares, \@roots, \@logs, \@inverses;
230
231 The multiline format specifier:
232
233 "| [[ | [[[ | [[[[[[[[[[ | [[[[[[[[[ | [[[[[[[[[ |
234 -----------------------------------------------------",
235
236 is treated as a single logical line. So "form" alternately fills the
237 first physical line (interpolating one value from each of the arrays)
238 and the second physical line (which puts a line of dashes between each
239 row of the table) producing:
240
241 N N**2 sqrt(N) log(N) 1/N
242 =====================================================
243 | 1 | 1 | 1 | 0 | 1 |
244 -----------------------------------------------------
245 | 2 | 4 | 1.41421 | 0.693147 | 0.5 |
246 -----------------------------------------------------
247 | 3 | 9 | 1.73205 | 1.09861 | 0.333333 |
248 -----------------------------------------------------
249 | 4 | 16 | 2 | 1.38629 | 0.25 |
250 -----------------------------------------------------
251 | 5 | 25 | 2.23607 | 1.60944 | 0.2 |
252 -----------------------------------------------------
253 | 6 | 36 | 2.44949 | 1.79176 | 0.166667 |
254 -----------------------------------------------------
255 | 7 | 49 | 2.64575 | 1.94591 | 0.142857 |
256 -----------------------------------------------------
257 | 8 | 64 | 2.82843 | 2.07944 | 0.125 |
258 -----------------------------------------------------
259 | 9 | 81 | 3 | 2.19722 | 0.111111 |
260 -----------------------------------------------------
261 | 10 | 100 | 3.16228 | 2.30259 | 0.1 |
262 -----------------------------------------------------
263 | 11 | 121 | 3.31662 | 2.3979 | 0.0909091 |
264 -----------------------------------------------------
265 | 12 | 144 | 3.4641 | 2.48491 | 0.0833333 |
266 -----------------------------------------------------
267
268 This implies that formats and the variables from which they're filled
269 need to be interleaved. That is, a multi-line specification like this:
270
271 print form
272 "Passed: ##
273 [[[[[[[[[[[[[[[ # single format specification
274 Failed: # (needs two sets of data)
275 [[[[[[[[[[[[[[[", ##
276
277 \@passes, \@fails; ## data for previous format
278
279 would print:
280
281 Passed:
282 <pass 1>
283 Failed:
284 <fail 1>
285 Passed:
286 <pass 2>
287 Failed:
288 <fail 2>
289 Passed:
290 <pass 3>
291 Failed:
292 <fail 3>
293
294 because the four-line format specifier is treated as a single unit, to
295 be repeatedly filled until all the data in @passes and @fails has been
296 consumed.
297
298 Unlike the table example, where this unit filling correctly put a line
299 of dashes between lines of data, in this case the alternation of passes
300 and fails is probably not the desired effect.
301
302 Judging by the labels, it is far more likely that the user wanted:
303
304 Passed:
305 <pass 1>
306 <pass 2>
307 <pass 3>
308 Failed:
<fail 1>
<fail 2>
<fail 3>
312
313 To achieve that, either explicitly interleave the formats and their
314 data sources:
315
316 print form
317 "Passed:", ## single format (no data required)
318 " [[[[[[[[[[[[[[[", ## single format (needs one set of data)
319 \@passes, ## data for previous format
320 "Failed:", ## single format (no data required)
321 " [[[[[[[[[[[[[[[", ## single format (needs one set of data)
322 \@fails; ## data for previous format
323
324 or instruct "form" to do it for you automagically, by setting the
325 'interleave' flag true:
326
print form {interleave=>1},
328 "Passed: ##
329 [[[[[[[[[[[[[[[ # single format
330 Failed: # (needs two sets of data)
331 [[[[[[[[[[[[[[[", ##
332
333 ## data to be automagically interleaved
334 \@passes, \@fails; # as necessary between lines of previous
335 ## format
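
The difference between the two filling orders can be illustrated
without the module at all. The sketch below hard-codes the "Passed:" and
"Failed:" layout and shows which order the values come out in; both
subroutine names are hypothetical and the code is not how "form" is
implemented.

```perl
use strict;
use warnings;

my @passes = ('<pass 1>', '<pass 2>');
my @fails  = ('<fail 1>', '<fail 2>');

# Whole format repeated as a unit: one value from each list per repetition.
sub fill_as_unit {
    my ($passes, $fails) = @_;
    my $rows = @$passes > @$fails ? @$passes : @$fails;
    return map { ("Passed:", "    " . ($passes->[$_] // ''),
                  "Failed:", "    " . ($fails->[$_]  // '')) } 0 .. $rows - 1;
}

# Interleaved: each format line uses up its data before the next line starts.
sub fill_interleaved {
    my ($passes, $fails) = @_;
    return ("Passed:", map("    $_", @$passes),
            "Failed:", map("    $_", @$fails));
}

print "$_\n" for fill_as_unit(\@passes, \@fails);
print "\n";
print "$_\n" for fill_interleaved(\@passes, \@fails);
```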
336
337 How "form" hyphenates
338 Any line with a block field repeats on subsequent lines until all block
339 fields on that line have consumed all their data. Non-block fields on
340 these lines are replaced by the appropriate number of spaces.
341
342 Words are wrapped whole, unless they will not fit into the field at
343 all, in which case they are broken and (by default) hyphenated. Simple
344 hyphenation is used (i.e. break at the N-1th character and insert a
345 '-'), unless a suitable alternative subroutine is specified instead.
346
Words will not be broken if the break would leave fewer than 2
characters on the current line. This minimum can be varied by setting
the 'minbreak' option to a numeric value indicating the minimum total
broken characters (including hyphens) required on the current line.
Note that, for very narrow fields, words will still be broken (but
unhyphenated). For example:
353
354 print form '~', 'split';
355
356 would print:
357
358 s
359 p
360 l
361 i
362 t
363
364 whilst:
365
366 print form {minbreak=>1}, '~', 'split';
367
368 would print:
369
370 s-
371 p-
372 l-
373 i-
374 t
375
376 Alternative breaking subroutines can be specified using the "break"
377 option in a configuration hash. For example:
378
form { break => \&my_line_breaker },
     $format_str,
     @data;
382
"form" expects any user-defined line-breaking subroutine to take three
arguments (the string to be broken, the maximum permissible length of
the initial section, and the total width of the field being filled).
The subroutine must return a list of two strings: the initial (broken)
section of the word, and the remainder of the string, respectively.
389
390 For example:
391
sub tilde_break
{
    # returns (first part ending in '~', remainder of the string)
    (substr($_[0],0,$_[1]-1).'~', substr($_[0],$_[1]-1));
}

form { break => \&tilde_break },
     $format_str,
     @data;
400
401 makes '~' the hyphenation character, whilst:
402
sub wrap_and_slop
{
    my ($text, $reqlen, $fldlen) = @_;
    # on an empty line, take the whole next word, however long it is;
    # otherwise take nothing, which pushes the word to the next line
    if ($reqlen==$fldlen) { return $text =~ m/\A(\s*\S*)(.*)/s }
    else                  { return ("", $text) }
}

form { break => \&wrap_and_slop },
     $format_str,
     @data;
413
414 wraps excessively long words to the next line and "slops" them over the
415 right margin if necessary.
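
A breaking subroutine can be exercised on its own, which is a convenient
way to check that it honours the three-argument, two-result contract
before handing it to "form". The sketch below uses a hypothetical
plus_break subroutine and calls it directly; no Text::Reform code is
involved.

```perl
use strict;
use warnings;

# Hypothetical breaker following the documented contract:
# arguments are (string, max length of first part, field width);
# the return value is the list (first part, remainder).
sub plus_break {
    my ($text, $reqlen, $fldlen) = @_;
    return (substr($text, 0, $reqlen - 1) . '+', substr($text, $reqlen - 1));
}

my ($head, $rest) = plus_break('methodology', 5, 14);
print "$head|$rest\n";    # prints: meth+|odology
```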
416
The Text::Reform package provides four functions to simplify the use
of variant hyphenation schemes. The exportable subroutine
"Text::Reform::break_wrap" generates a reference to a subroutine
implementing the "wrap-and-slop" algorithm shown in the last example,
which could therefore be rewritten:
422
423 use Text::Reform qw( form break_wrap );
424
form { break => break_wrap },
     $format_str,
     @data;
428
429 The subroutine "Text::Reform::break_with" takes a single string
430 argument and returns a reference to a sub which hyphenates by cutting
431 off the text at the right margin and appending the string argument.
432 Hence the first of the two examples could be rewritten:
433
434 use Text::Reform qw( form break_with );
435
form { break => break_with('~') },
     $format_str,
     @data;
439
440 The subroutine "Text::Reform::break_at" takes a single string argument
441 and returns a reference to a sub which hyphenates by breaking
442 immediately after that string. For example:
443
444 use Text::Reform qw( form break_at );
445
form { break => break_at('-') },
     "[[[[[[[[[[[[[[",
     "The Newton-Raphson methodology";
449
450 # returns:
451 #
452 # "The Newton-
453 # Raphson
454 # methodology"
455
456 Note that this differs from the behaviour of "break_with", which would
457 be:
458
form { break => break_with('-') },
     "[[[[[[[[[[[[[[",
     "The Newton-Raphson methodology";
462
463 # returns:
464 #
465 # "The Newton-R-
466 # aphson metho-
467 # dology"
468
469 Hence "break_at" is generally a better choice.
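
The idea of a function that returns a breaking subroutine can be
sketched as a closure. The example below imitates the documented
behaviour of breaking immediately after a given string, falling back to
simple hyphenation when that string does not occur; make_break_after is
a hypothetical name and this is not the module's implementation of
"break_at".

```perl
use strict;
use warnings;

# Hypothetical generator: returns a breaker that splits just after the
# last $sep found within the first $reqlen characters.
sub make_break_after {
    my ($sep) = @_;
    return sub {
        my ($text, $reqlen, $fldlen) = @_;
        my $pos = rindex(substr($text, 0, $reqlen), $sep);
        if ($pos >= 0) {
            my $cut = $pos + length($sep);
            return (substr($text, 0, $cut), substr($text, $cut));
        }
        # no separator available: fall back to simple hyphenation
        return (substr($text, 0, $reqlen - 1) . '-', substr($text, $reqlen - 1));
    };
}

my $breaker = make_break_after('-');
my @at_sep  = $breaker->('Newton-Raphson', 10, 14);
my @plain   = $breaker->('methodology', 5, 14);
print "@at_sep\n@plain\n";
```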
470
471 "break_at" also takes an 'except' option, which tells the resulting
472 subroutine not to break in the middle of certain strings. For example:
473
form { break => break_at('-', {except=>qr/Newton-Raphson/}) },
     "[[[[[[[[[[[[[[",
     "The Newton-Raphson methodology";
477
478 # returns:
479 #
480 # "The
481 # Newton-Raphson
482 # methodology"
483
484 This option is particularly useful for preserving URLs.
485
486 The subroutine "Text::Reform::break_TeX" returns a reference to a sub
487 which hyphenates using Jan Pazdziora's TeX::Hyphen module. For example:
488
use Text::Reform qw( form break_TeX );

form { break => break_TeX },
     $format_str,
     @data;
494
Note that in the previous examples there is no leading '\&' before
"break_wrap", "break_with", "break_at", or "break_TeX", since each is
being directly called (and returns a reference to some other suitable
subroutine).
499
500 The "form" formatting algorithm
501 The algorithm "form" uses is:
502
503 1. If interleaving is specified, split the first string in the
504 argument list into individual format lines and add a
505 terminating newline (unless one is already present).
506 Otherwise, treat the entire string as a single "line" (like
507 /s does in regexes)
508
509 2. For each format line...
510
511 2.1. determine the number of fields and shift
512 that many values off the argument list and
513 into the filling list. If insufficient
514 arguments are available, generate as many
515 empty strings as are required.
516
517 2.2. generate a text line by filling each field
518 in the format line with the initial contents
519 of the corresponding arg in the filling list
520 (and remove those initial contents from the arg).
521
522 2.3. replace any <,>, or ^ fields by an equivalent
523 number of spaces. Splice out the corresponding
524 args from the filling list.
525
526 2.4. Repeat from step 2.2 until all args in the
527 filling list are empty.
528
529 3. concatenate the text lines generated in step 2
530
531 4. repeat from step 1 until the argument list is empty
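
Steps 2.2 and 2.4 can be sketched for the simplest case of a single
left-justified block field: keep taking as many whole words as fit in
the field until the data is used up, hyphenating only when a word is too
long for the field. The subroutine fill_block is hypothetical and far
simpler than "form" itself (one field, no options).

```perl
use strict;
use warnings;

# Hypothetical single-field version of steps 2.2 and 2.4.
sub fill_block {
    my ($width, $text) = @_;
    die "width must be at least 2\n" if $width < 2;
    $text =~ s/\A\s+//;
    my @lines;
    while (length $text) {
        my $chunk;
        if ($text =~ s/\A(.{1,$width})(?:\s+|\z)//s) {
            $chunk = $1;                                      # whole words fit
        }
        else {
            $chunk = substr($text, 0, $width - 1, '') . '-';  # simple hyphenation
        }
        push @lines, sprintf('%-*s', $width, $chunk);         # pad to field width
    }
    return @lines;
}

print "[$_]\n" for fill_block(10, 'piece of text to be formatted exquisitely');
```

Each pass through the loop corresponds to one generated text line, and
the loop ends when the filling data is empty.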
532
533 "form" examples
534 As an example of the use of "form", the following:
535
536 $count = 1;
537 $text = "A big long piece of text to be formatted exquisitely";
538
print form
540 q{ |||| <<<<<<<<<< },
541 $count, $text,
542 q{ ---------------- },
543 q{ ^^^^ ]]]]]]]]]]| },
544 $count+11, $text,
545 q{ =
546 ]]].[[[ },
547 "123 123.4\n123.456789";
548
549 produces the following output:
550
551 1 A big long
552 ----------------
553 12 piece of|
554 text to be|
555 formatted|
556 exquisite-|
557 ly|
558 =
559 123.0
560 =
561 123.4
562 =
563 123.456
564
Note that block fields in a multi-line format string cause the entire
multi-line format to be repeated as often as necessary.
567
568 Picture strings and replacement values are interleaved in the
569 traditional "format" format, but care is needed to ensure that the
570 correct number of substitution values are provided. Another example:
571
572 $report = form
573 'Name Rank Serial Number',
574 '==== ==== =============',
575 '<<<<<<<<<<<<< ^^^^ <<<<<<<<<<<<<',
576 $name, $rank, $serial_number,
'',
578 'Age Sex Description',
579 '=== === ===========',
580 '^^^ ^^^ [[[[[[[[[[[',
581 $age, $sex, $description;
582
583 How "form" consumes strings
584 Unlike "format", within "form" non-block fields do consume the text
585 they format, so the following:
586
587 $text = "a line of text to be formatted over three lines";
588 print form "<<<<<<<<<<\n <<<<<<<<\n <<<<<<\n",
589 $text, $text, $text;
590
591 produces:
592
593 a line of
594 text to
595 be fo-
596
597 not:
598
599 a line of
600 a line
601 a line
602
603 To achieve the latter effect, convert the variable arguments to
604 independent literals (by double-quoted interpolation):
605
606 $text = "a line of text to be formatted over three lines";
607 print form "<<<<<<<<<<\n <<<<<<<<\n <<<<<<\n",
608 "$text", "$text", "$text";
609
610 Although values passed from variable arguments are progressively
611 consumed within "form", the values of the original variables passed to
612 "form" are not altered. Hence:
613
614 $text = "a line of text to be formatted over three lines";
615 print form "<<<<<<<<<<\n <<<<<<<<\n <<<<<<\n",
616 $text, $text, $text;
617 print $text, "\n";
618
619 will print:
620
621 a line of
622 text to
623 be fo-
624 a line of text to be formatted over three lines
625
626 To cause "form" to consume the values of the original variables passed
627 to it, pass them as references. Thus:
628
629 $text = "a line of text to be formatted over three lines";
630 print form "<<<<<<<<<<\n <<<<<<<<\n <<<<<<\n",
631 \$text, \$text, \$text;
632 print $text, "\n";
633
634 will print:
635
636 a line of
637 text to
638 be fo-
639 rmatted over three lines
640
641 Note that, for safety, the "non-consuming" behaviour takes precedence,
642 so if a variable is passed to "form" both by reference and by value,
643 its final value will be unchanged.
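
The difference between passing a value and passing a reference can be
reduced to a few lines of plain Perl. In this sketch take_prefix is a
hypothetical helper that consumes the first few characters of its
argument; it works on a private copy unless it is handed a scalar
reference.

```perl
use strict;
use warnings;

# Hypothetical helper: remove and return the first $n characters.
# A plain value is copied first, so the caller's variable is untouched;
# a scalar reference is consumed in place.
sub take_prefix {
    my ($source, $n) = @_;
    my $copy   = $source;
    my $target = ref($source) eq 'SCALAR' ? $source : \$copy;
    return substr($$target, 0, $n, '');
}

my $text = "a line of text";

my $first = take_prefix($text, 6);      # by value
my $kept  = $text;                      # still "a line of text"

my $second = take_prefix(\$text, 6);    # by reference
print "$first|$kept|$second|$text\n";   # prints: a line|a line of text|a line| of text
```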
644
645 Numerical formatting
646 The ">>>.<<<" and "]]].[[[" field specifiers may be used to format
647 numeric values about a fixed decimal place marker. For example:
648
649 print form '(]]]]].[[)', <<EONUMS;
650 1
651 1.0
652 1.001
653 1.009
654 123.456
655 1234567
656 one two
657 EONUMS
658
659 would print:
660
661 ( 1.0 )
662 ( 1.0 )
663 ( 1.00)
664 ( 1.01)
665 ( 123.46)
666 (#####.##)
667 (?????.??)
668 (?????.??)
669
670 Fractions are rounded to the specified number of places after the
671 decimal, but only significant digits are shown. That's why, in the
672 above example, 1 and 1.0 are formatted as "1.0", whilst 1.001 is
673 formatted as "1.00".
674
675 You can specify that the maximal number of decimal places always be
676 used by giving the configuration option 'numeric' a value that matches
677 /\bAllPlaces\b/i. For example:
678
print form { numeric => 'AllPlaces' },
'(]]]]].[[)', <<'EONUMS';
681 1
682 1.0
683 EONUMS
684
685 would print:
686
687 ( 1.00)
688 ( 1.00)
689
690 Note that although decimal digits are rounded to fit the specified
691 width, the integral part of a number is never modified. If there are
692 not enough places before the decimal place to represent the number, the
693 entire number is replaced with hashes.
694
695 If a non-numeric sequence is passed as data for a numeric field, it is
696 formatted as a series of question marks. This querulous behaviour can
697 be changed by giving the configuration option 'numeric' a value that
698 matches /\bSkipNaN\b/i in which case, any invalid numeric data is
699 simply ignored. For example:
700
print form { numeric => 'SkipNaN' },
702 '(]]]]].[[)',
703 <<EONUMS;
704 1
705 two three
706 4
707 EONUMS
708
709 would print:
710
711 ( 1.0 )
712 ( 4.0 )
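
The default rules described in this section (rounding to the field's
decimal places, showing only as many decimals as the input supplies,
hashes on overflow, and question marks for non-numeric data) can be
summarised in a short sketch. The subroutine format_fixed is
hypothetical; it was written to reproduce the sample output of the first
example in this section and makes no claim to match the module in every
case.

```perl
use strict;
use warnings;

# Hypothetical formatter for one value in a field with $int_places
# digits before the decimal point and $dec_places digits after it.
sub format_fixed {
    my ($value, $int_places, $dec_places) = @_;

    # non-numeric data: a run of question marks
    return ('?' x $int_places) . '.' . ('?' x $dec_places)
        unless $value =~ /\A[-+]?(?:\d+(?:\.\d*)?|\.\d+)\z/;

    # show at least one decimal, at most $dec_places, and otherwise
    # only as many as the input supplies
    my ($frac) = $value =~ /\.(\d*)/;
    my $given  = defined $frac ? length $frac : 0;
    my $shown  = $given < 1 ? 1 : $given > $dec_places ? $dec_places : $given;

    my ($int, $dec) = split /\./, sprintf('%.*f', $shown, $value);

    # integral part too wide: the whole field becomes hashes
    return ('#' x $int_places) . '.' . ('#' x $dec_places)
        if length($int) > $int_places;

    return sprintf('%*s.%-*s', $int_places, $int, $dec_places, $dec);
}

print "(", format_fixed($_, 5, 2), ")\n"
    for qw(1 1.0 1.001 1.009 123.456 1234567 one);
```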
713
714 Filling block fields with lists of values
715 If an argument corresponding to a field is an array reference, then
716 "form" automatically joins the elements of the array into a single
717 string, separating each element with a newline character. As a result,
718 a call like this:
719
720 @values = qw( 1 10 100 1000 );
721 print form "(]]]].[[)", \@values;
722
723 will print out
724
725 ( 1.00)
726 ( 10.00)
727 ( 100.00)
728 (1000.00)
729
730 as might be expected.
731
732 Note however that arrays must be passed by reference (so that "form"
733 knows that the entire array holds data for a single field). If the
734 previous example had not passed @values by reference:
735
736 @values = qw( 1 10 100 1000 );
737 print form "(]]]].[[)", @values;
738
739 the output would have been:
740
741 ( 1.00)
742 10
743 100
744 1000
745
This is because @values would have been interpolated into "form"'s
argument list, so only $values[0] would have been used as the data for
the initial format string. The remaining elements of @values would have
been treated as separate format strings, and printed out "verbatim".
750
751 Note too that, because arrays must be passed using a reference, their
752 original contents are consumed by "form", just like the contents of
753 scalars passed by reference.
754
755 To avoid having an array consumed by "form", pass it as an anonymous
756 array:
757
758 print form "(]]]].[[)", [@values];
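
Why the anonymous array protects the original can be seen with a
hypothetical consumer that empties whatever array it is given. In this
sketch drain stands in for the consuming behaviour described above and
is not part of Text::Reform.

```perl
use strict;
use warnings;

# Hypothetical consumer: removes and returns every element of the array
# it is given a reference to.
sub drain {
    my ($aref) = @_;
    my @taken;
    push @taken, shift @$aref while @$aref;
    return @taken;
}

my @values = qw( 1 10 100 1000 );

drain([@values]);                  # consumes a temporary copy
my $after_copy = scalar @values;   # 4: original intact

drain(\@values);                   # consumes the original
my $after_ref = scalar @values;    # 0: original emptied

print "$after_copy $after_ref\n";  # prints: 4 0
```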
759
760 Headers, footers, and pages
761 The "form" subroutine can also insert headers, footers, and page-feeds
762 as it formats. These features are controlled by the "header", "footer",
763 "pagefeed", "pagelen", and "pagenum" options.
764
765 The "pagenum" option takes a scalar value or a reference to a scalar
766 variable and starts page numbering at that value. If a reference to a
767 scalar variable is specified, the value of that variable is updated as
768 the formatting proceeds, so that the final page number is available in
769 it after formatting. This can be useful for multi-part reports.
770
771 The "pagelen" option specifies the total number of lines in a page
772 (including headers, footers, and page-feeds).
773
774 The "pagewidth" option specifies the total number of columns in a page.
775
776 If the "header" option is specified with a string value, that string is
777 used as the header of every page generated. If it is specified as a
778 reference to a subroutine, that subroutine is called at the start of
779 every page and its return value used as the header string. When called,
780 the subroutine is passed the current page number.
781
782 Likewise, if the "footer" option is specified with a string value, that
783 string is used as the footer of every page generated. If it is
784 specified as a reference to a subroutine, that subroutine is called at
785 the start of every page and its return value used as the footer string.
786 When called, the footer subroutine is passed the current page number.
787
Both the header and footer options can also be specified as hash
references. In this case the hash entries for keys "left", "centre"
(or "center"), and "right" specify what is to appear on the left,
centre, and right of the header/footer. The entry for the key "width"
specifies how wide the header or footer is to be. If the "width" key is
omitted, the "pagewidth" configuration option (which defaults to 72
characters) is used.
795
The "left", "centre", and "right" values may be literal strings or
subroutines (just as a normal header/footer specification may be). See
the second example, below.
799
800 Another alternative for header and footer options is to specify them as
801 a subroutine that returns a hash reference. The subroutine is called
802 for each page, then the resulting hash is treated like the hashes
803 described in the preceding paragraph. See the third example, below.
804
805 The "pagefeed" option acts in exactly the same way, to produce a
806 pagefeed which is appended after the footer. But note that the pagefeed
807 is not counted as part of the page length.
808
809 All three of these page components are recomputed at the start of each
810 new page, before the page contents are formatted (recomputing the
811 header and footer first makes it possible to determine how many lines
812 of data to format so as to adhere to the specified page length).
813
814 When the call to "form" is complete and the data has been fully
815 formatted, the footer subroutine is called one last time, with an extra
816 argument of 1. The string returned by this final call is used as the
817 final footer.
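
The calling sequence for header and footer subroutines can be sketched
with a much simplified pagination loop. Everything here is hypothetical
(paginate, count_lines, and the option names passed to paginate); the
sketch ignores pagefeeds and hash-style headers, and only shows how the
page number and the extra final-footer argument reach the callbacks.

```perl
use strict;
use warnings;

sub count_lines {
    my ($text) = @_;
    my $n = () = $text =~ /\n/g;    # number of newlines in $text
    return $n;
}

# Hypothetical, simplified pagination loop.
sub paginate {
    my (%opt) = @_;
    my @body  = @{ $opt{lines} };
    my $page  = $opt{pagenum} // 1;
    my @pages;
    while (@body) {
        my $header = $opt{header}->($page);
        my $footer = $opt{footer}->($page);
        my $room   = $opt{pagelen} - count_lines($header) - count_lines($footer);
        die "page too short\n" if $room < 1;
        my @chunk = splice @body, 0, $room;
        $footer = $opt{footer}->($page, 1) unless @body;   # final footer call
        push @pages, join '', $header, map("$_\n", @chunk), $footer;
        $page++;
    }
    return @pages;
}

my @pages = paginate(
    lines   => [ map("line $_", 1 .. 5) ],
    pagelen => 4,
    pagenum => 7,
    header  => sub { "Page $_[0]\n" },
    footer  => sub { my ($n, $last) = @_;
                     $last ? "" : "...continued on " . ($n + 1) . "\n" },
);
print scalar(@pages), " pages\n";
print $pages[-1];
```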
818
819 So for example, a 60-line per page report, starting at page 7, with
820 appropriate headers and footers might be set up like so:
821
822 $page = 7;
823
824 form { header => sub { "Page $_[0]\n\n" },
825 footer => sub { my ($pagenum, $lastpage) = @_;
826 return "" if $lastpage;
827 return "-"x50 . "\n"
828 .form ">"x50, "...".($pagenum+1);
829 },
830 pagefeed => "\n\n",
pagelen => 60,
832 pagenum => \$page,
833 },
834 $template,
835 @data;
836
837 Note the recursive use of "form" within the "footer" option!
838
839 Alternatively, to set up headers and footers such that the running head
840 is right justified in the header and the page number is centred in the
841 footer:
842
843 form { header => { right => "Running head" },
844 footer => { centre => sub { "Page $_[0]" } },
845 pagelen => 60
846 },
847 $template,
848 @data;
849
850 The footer in the previous example could also have been specified the
851 other way around, as a subroutine that returns a hash (rather than a
852 hash containing a subroutine):
853
854 form { header => { right => "Running head" },
855 footer => sub { return {centre => "Page $_[0]"} },
856 pagelen => 60
857 },
858 $template,
859 @data;
860
861 The "cols" option
862 Sometimes data to be used in a "form" call needs to be extracted from a
863 nested data structure. For example, whilst it's easy to print a table
864 if you already have the data in columns:
865
866 @name = qw(Tom Dick Harry);
867 @score = qw( 88 54 99);
868 @time = qw( 15 13 18);
869
870 print form
871 '-------------------------------',
872 'Name Score Time',
873 '-------------------------------',
874 '[[[[[[[[[[[[[[ ||||| ||||',
875 \@name, \@score, \@time;
876
if the data is aggregated by rows:
878
879 @data = (
880 { name=>'Tom', score=>88, time=>15 },
881 { name=>'Dick', score=>54, time=>13 },
882 { name=>'Harry', score=>99, time=>18 },
883 );
884
885 you need to do some fancy mapping before it can be fed to "form":
886
887 print form
888 '-------------------------------',
889 'Name Score Time',
890 '-------------------------------',
891 '[[[[[[[[[[[[[[ ||||| ||||',
892 [map $$_{name}, @data],
893 [map $$_{score}, @data],
894 [map $$_{time} , @data];
895
896 Or you could just use the 'cols' option:
897
898 use Text::Reform qw(form columns);
899
900 print form
901 '-------------------------------',
902 'Name Score Time',
903 '-------------------------------',
904 '[[[[[[[[[[[[[[ ||||| ||||',
905 { cols => [qw(name score time)],
906 from => \@data
907 };
908
909 This option takes an array of strings that specifies the keys of the
910 hash entries to be extracted into columns. The 'from' entry (which must
911 be present) also takes an array, which is expected to contain a list of
912 references to hashes. For each key specified, this option inserts into
913 "form"'s argument list a reference to an array containing the entries
914 for that key, extracted from each of the hash references supplied by
915 'from'. So, for example, the option:
916
917 { cols => [qw(name score time)],
918 from => \@data
919 }
920
921 is replaced by three array references, the first containing the 'name'
922 entries for each hash inside @data, the second containing the 'score'
923 entries for each hash inside @data, and the third containing the 'time'
924 entries for each hash inside @data.
925
926 If, instead, you have a list of arrays containing the data:
927
928 @data = (
929 # Time Name Score
930 [ 15, 'Tom', 88 ],
931 [ 13, 'Dick', 54 ],
932 [ 18, 'Harry', 99 ],
933 );
934
935 the 'cols' option can extract the appropriate columns for that too. You
936 just specify the required indices, rather than keys:
937
938 print form
939 '-----------------------------',
940 'Name Score Time',
941 '-----------------------------',
942 '[[[[[[[[[[[[[[ ||||| ||||',
943 { cols => [1,2,0],
944 from => \@data
};
946
947 Note that the indices can be in any order, and the resulting arrays are
948 returned in the same order.
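
The column extraction that 'cols' performs is equivalent in spirit to
the following plain-Perl sketch, which handles rows stored either as
hashes or as arrays. The subroutine extract_columns is hypothetical and
only illustrates the transformation described above.

```perl
use strict;
use warnings;

# Hypothetical helper: for each key (or index) in $cols, build one array
# holding that entry from every row in $rows.
sub extract_columns {
    my ($cols, $rows) = @_;
    return map {
        my $col = $_;
        [ map { ref($_) eq 'HASH' ? $_->{$col} : $_->[$col] } @$rows ];
    } @$cols;
}

my @by_hash = (
    { name => 'Tom',   score => 88, time => 15 },
    { name => 'Dick',  score => 54, time => 13 },
    { name => 'Harry', score => 99, time => 18 },
);
my @by_array = (
    [ 15, 'Tom',   88 ],
    [ 13, 'Dick',  54 ],
    [ 18, 'Harry', 99 ],
);

my @from_hash  = extract_columns([qw(name score time)], \@by_hash);
my @from_array = extract_columns([1, 2, 0], \@by_array);

print "@$_\n" for @from_hash;
```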
949
950 If you need to merge columns extracted from two hierarchical data
951 structures, just concatenate the data structures first, like so:
952
953 print form
954 '---------------------------------------',
'Name Score Time Ranking',
'---------------------------------------',
'[[[[[[[[[[[[[[ ||||| |||| |||||||',
{ cols => [1,2,0],
  from => [@data, @olddata],
};
961
962 Of course, this only works if the columns are in the same positions in
963 both data sets (and both datasets are stored in arrays) or if the
964 columns have the same keys (and both datasets are in hashes). If not,
965 you would need to format each dataset separately, like so:
966
967 print form
968 '-----------------------------',
'Name Score Time',
970 '-----------------------------',
971 '[[[[[[[[[[[[[[ ||||| ||||',
972 { cols=>[1,2,0], from=>\@data },
973 '[[[[[[[[[[[[[[ ||||| ||||',
974 { cols=>[3,8,1], from=>\@olddata },
975 '[[[[[[[[[[[[[[ ||||| ||||',
976 { cols=>[qw(name score time)], from=>\@otherdata };
977
978 The "tag" sub
979 The "tag" subroutine may be exported from the module. It takes two
980 arguments: a tag specifier and a text to be entagged. The tag specifier
981 indicates the indenting of the tag, and of the text. The sub generates
982 an end-tag (using the usual "/tag" variant), unless an explicit end-tag
983 is provided as the third argument.
984
985 The tag specifier consists of the following components (in order):
986
987 An optional vertical spacer (zero or more whitespace-separated
988 newlines)
989 One or more whitespace characters up to a final mandatory newline.
990 This vertical space is inserted before the tag and after the
991 end-tag.
992
993 An optional tag indent
994 Zero or more whitespace characters. Both the tag and the end-tag
995 are indented by this whitespace.
996
997 An optional left (opening) tag delimiter
998 Zero or more non-"word" characters (not alphanumeric or '_'). If
999 the opening delimiter is omitted, the character '<' is used.
1000
1001 A tag
1002 One or more "word" characters (alphanumeric or '_').
1003
1004 Optional tag arguments
1005 Any number of any characters
1006
1007 An optional right (closing) tag delimiter
1008 Zero or more non-"word" characters which balance some sequential
1009 portion of the opening tag delimiter. For example, if the opening
1010 delimiter is "<-(" then any of the following are acceptable closing
1011 delimiters: ")->", "->", or ">". If the closing delimiter is
1012 omitted, the "inverse" of the opening delimiter is used (for
1013 example, ")->").
1014
1015 An optional vertical spacer (zero or more newlines)
1016 One or more whitespace characters up to a mandatory newline. This
1017 vertical space is inserted before and after the complete text.
1018
1019 An optional text indent
1020 Zero or more space or tab characters. Each line of text is indented
1021 by this whitespace (in addition to the tag indent).
1022
1023 For example:
1024
1025 $text = "three lines\nof tagged\ntext";
1026
1027 print tag "A HREF=#nextsection", $text;
1028
1029 prints:
1030
1031 <A HREF=#nextsection>three lines
1032 of tagged
1033 text</A>
1034
1035 whereas:
1036
1037 print tag "[-:GRIN>>>\n", $text;
1038
1039 prints:
1040
1041 [-:GRIN>>>:-]
1042 three lines
1043 of tagged
1044 text
1045 [-:/GRIN>>>:-]
1046
1047 and:
1048
1049 print tag "\n\n <BOLD>\n\n ", $text, "<END BOLD>";
1050
1051 prints:
1052
1053
1054
1055 <BOLD>
1056
1057 three lines
1058 of tagged
1059 text
1060
1061 <END BOLD>
1062
1063
1064
1065 (with the indicated spacing fore and aft).
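
The "inverse" of an opening delimiter, used when no closing delimiter is given, can be sketched in plain Perl as a reversal with each bracket swapped for its partner. The function inverse_delimiter and its bracket table are assumptions made for illustration; this is not the module's own code, and the set of characters that Text::Reform actually mirrors may differ.

```perl
use strict;
use warnings;

# Assumed bracket pairs; any other character is left unchanged.
my %mirror = (
    '<' => '>', '>' => '<',
    '(' => ')', ')' => '(',
    '[' => ']', ']' => '[',
    '{' => '}', '}' => '{',
);

# Hypothetical helper: reverse the opening delimiter and swap each
# bracket for its partner.
sub inverse_delimiter {
    my ($open) = @_;
    return join '',
           map { exists $mirror{$_} ? $mirror{$_} : $_ }
           reverse split //, $open;
}

print inverse_delimiter('<-('), "\n";    # )->
print inverse_delimiter('[-:'), "\n";    # :-]
print inverse_delimiter('<'),   "\n";    # >
```

The second call matches the "[-:GRIN>>>" example above, where the omitted closing delimiter is rendered as ":-]".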
1066
1068 Damian Conway (damian@conway.org)
1069
1071 The module uses "POSIX::strtod", which may be broken under certain
1072 versions of Windows. Applying the WINDOWS_PATCH patch to Reform.pm will
1073 replace the POSIX function with a copycat subroutine.
1074
1075 There are undoubtedly serious bugs lurking somewhere in code this funky
1076 :-) Bug reports and other feedback are most welcome.
1077
1079 Copyright (c) 1997-2007, Damian Conway "<DCONWAY@CPAN.org>". All rights
1080 reserved.
1081
1082 This module is free software; you can redistribute it and/or modify it
1083 under the same terms as Perl itself. See perlartistic.
1084
1086 BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
1087 FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT
1088 WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER
1089 PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND,
1090 EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
1091 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
1092 ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH
1093 YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
1094 NECESSARY SERVICING, REPAIR, OR CORRECTION.
1095
1096 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
1097 WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
1098 REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE
1099 TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR
1100 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
1101 SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
1102 RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
1103 FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
1104 SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
1105 DAMAGES.
1106
1107
1108
1109perl v5.32.0 2020-07-28 Text::Reform(3)