PERLFORM(1)          Perl Programmers Reference Guide          PERLFORM(1)

NAME
perlform - Perl formats

DESCRIPTION
Perl has a mechanism to help you generate simple reports and charts.
To facilitate this, Perl helps you code up your output page close to
how it will look when it's printed. It can keep track of things like
how many lines are on a page, what page you're on, when to print page
headers, etc. Keywords are borrowed from FORTRAN: format() to declare
and write() to execute; see their entries in perlfunc. Fortunately,
the layout is much more legible, more like BASIC's PRINT USING
statement. Think of it as a poor man's nroff(1).

Formats, like packages and subroutines, are declared rather than
executed, so they may occur at any point in your program. (Usually it's
best to keep them all together though.) They have their own namespace
apart from all the other "types" in Perl. This means that if you have
a function named "Foo", it is not the same thing as having a format
named "Foo". However, the default name for the format associated with
a given filehandle is the same as the name of the filehandle. Thus,
the default format for STDOUT is named "STDOUT", and the default format
for filehandle TEMP is named "TEMP". They just look the same. They
aren't.

Output record formats are declared as follows:

    format NAME =
    FORMLIST
    .

If the name is omitted, format "STDOUT" is defined. A single "." in
column 1 is used to terminate a format. FORMLIST consists of a
sequence of lines, each of which may be one of three types:

1.  A comment, indicated by putting a '#' in the first column.

2.  A "picture" line giving the format for one output line.

3.  An argument line supplying values to plug into the previous picture
    line.
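
The three line types can be seen together in a small sketch. The format
name ITEM, the variables $name and $qty, and the helper render_item()
are made up for illustration; the in-memory filehandle is used only so
that the result can be captured as a string.

```perl
use strict;
use warnings;

# Hypothetical record fields; package variables are visible to the format.
our ($name, $qty);

format ITEM =
# A comment line: '#' in the first column, ignored on output.
@<<<<<<<<< @>>>>
$name,     $qty
.

# Write one record through ITEM and return the formatted text.
sub render_item {
    ($name, $qty) = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'ITEM';              # associate format ITEM with this handle
    select($old);
    write($fh);
    close($fh);
    return $buf;
}

print render_item('widget', 12);
```

The picture line asks for a 10-column left-justified field and a
5-column right-justified field; the argument line supplies one value
for each, in order.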

Picture lines contain output field definitions, intermingled with
literal text. These lines do not undergo any kind of variable
interpolation. Field definitions are made up from a set of characters,
for starting and extending a field to its desired width. This is the
complete set of characters for field definitions:

    @    start of regular field
    ^    start of special field
    <    pad character for left justification
    |    pad character for centering
    >    pad character for right justification
    #    pad character for a right-justified numeric field
    0    instead of first #: pad number with leading zeroes
    .    decimal point within a numeric field
    ...  terminate a text field, show "..." as truncation evidence
    @*   variable width field for a multi-line value
    ^*   variable width field for next line of a multi-line value
    ~    suppress line with all fields empty
    ~~   repeat line until all fields are exhausted

Each field in a picture line starts with either "@" (at) or "^"
(caret), indicating what we'll call, respectively, a "regular" or
"special" field. The choice of pad characters determines whether a field
is textual or numeric. The tilde operators are not part of a field.
Let's look at the various possibilities in detail.

Text Fields

The length of the field is supplied by padding out the field with
multiple "<", ">", or "|" characters to specify a non-numeric field
with, respectively, left justification, right justification, or
centering. For a regular field, the value (up to the first newline) is
taken and printed according to the selected justification, truncating
excess characters. If you terminate a text field with "...", three dots
will be shown if the value is truncated. A special text field may be
used to do rudimentary multi-line text block filling; see "Using Fill
Mode" for details.

Example:

    format STDOUT =
    @<<<<<<   @||||||   @>>>>>>
    "left",   "middle", "right"
    .

Output:

    left      middle      right

Numeric Fields

Using "#" as a padding character specifies a numeric field, with right
justification. An optional "." defines the position of the decimal
point. With a "0" (zero) instead of the first "#", the formatted number
will be padded with leading zeroes if necessary. A special numeric
field is blanked out if the value is undefined. If the resulting value
would exceed the width specified, the field is filled with "#" as
overflow evidence.

Example:

    format STDOUT =
    @###   @.###   @##.###  @###   @###   ^####
      42,  3.1415,  undef,     0, 10000,  undef
    .

Output:

      42   3.142     0.000     0   ####

The Field @* for Variable Width Multi-Line Text

The field "@*" can be used for printing multi-line, nontruncated
values; it should (but need not) appear by itself on a line. A final
line feed is chomped off, but all other characters are emitted verbatim.
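
A short sketch of "@*" in action follows. It uses formline() and $^A
(both described under "Accessing Formatting Internals") so that the
result ends up in a string; render_note() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Render a label line followed by a multi-line value via "@*".
sub render_note {
    my ($note) = @_;
    # Single quotes keep "@*" from being mistaken for an array.
    my $picture = "Note:\n" . '@*' . "\n";
    $^A = '';                      # clear the accumulator
    formline($picture, $note);     # "@*" emits the whole value
    my $out = $^A;
    $^A = '';
    return $out;
}

print render_note("first line\nsecond line\n");
```

The trailing newline of the value is dropped by the field, and the
newline that ends the picture line takes its place.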

The Field ^* for Variable Width One-line-at-a-time Text

Like "@*", this is a variable width field. The value supplied must be a
scalar variable. Perl puts the first line (up to the first "\n") of the
text into the field, and then chops off the front of the string so that
the next time the variable is referenced, more of the text can be
printed. The variable will not be restored.

Example:

    $text = "line 1\nline 2\nline 3";
    format STDOUT =
    Text: ^*
          $text
    ~~    ^*
          $text
    .

Output:

    Text: line 1
          line 2
          line 3

Specifying Values

The values are specified on the following format line in the same order
as the picture fields. The expressions providing the values must be
separated by commas. They are all evaluated in a list context before
the line is processed, so a single list expression could produce
multiple list elements. The expressions may be spread out to more than
one line if enclosed in braces. If so, the opening brace must be the
first token on the first line. If an expression evaluates to a number
with a decimal part, and if the corresponding picture specifies that the
decimal part should appear in the output (that is, any picture except
multiple "#" characters without an embedded "."), the character used for
the decimal point is always determined by the current LC_NUMERIC
locale. This means that, if, for example, the run-time environment
happens to specify a German locale, "," will be used instead of the
default ".". See perllocale and "WARNINGS" for more information.
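
Here is a sketch of both points: one list expression feeding several
fields, and an argument list spread over several lines inside braces.
The array @row (name, unit price, quantity), the format name ROW and
the helper render_row() are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical record: (name, unit price, quantity).
our @row;

# @row fills the first three fields in one list expression; the fourth
# value is computed. The braces let the arguments span several lines.
format ROW =
@<<<<<<<<< @##.## @>>> @###.##
{
    @row,
    $row[1] * $row[2]
}
.

# Write one record through ROW and return the formatted text.
sub render_row {
    @row = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'ROW';
    select($old);
    write($fh);
    close($fh);
    return $buf;
}

print render_row('bolt', 0.25, 40);
```

Under a locale whose decimal separator is a comma, the two numeric
fields may show "," where this sketch would otherwise show ".".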

Using Fill Mode

On text fields the caret enables a kind of fill mode. Instead of an
arbitrary expression, the value supplied must be a scalar variable that
contains a text string. Perl puts the next portion of the text into
the field, and then chops off the front of the string so that the next
time the variable is referenced, more of the text can be printed.
(Yes, this means that the variable itself is altered during execution
of the write() call, and is not restored.) The next portion of text is
determined by a crude line breaking algorithm. You may use the carriage
return character ("\r") to force a line break. You can change which
characters are legal to break on by changing the variable $: (that's
$FORMAT_LINE_BREAK_CHARACTERS if you're using the English module) to a
list of the desired characters.

Normally you would use a sequence of fields in a vertical stack
associated with the same scalar variable to print out a block of text.
You might wish to end the final field with the text "...", which will
appear in the output if the text was too long to appear in its
entirety.
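
A sketch of such a vertical stack follows. The names $text, PARA and
fill_block() are made up for illustration. The helper returns both the
formatted block and whatever is left in the variable afterwards, which
shows that write() really does consume it.

```perl
use strict;
use warnings;

our $text;    # consumed piece by piece by the caret fields below

# Three stacked caret fields, each 20 columns wide; the last one ends
# in "..." so that leftover text is flagged in the output.
format PARA =
^<<<<<<<<<<<<<<<<<<<
$text
^<<<<<<<<<<<<<<<<<<<
$text
^<<<<<<<<<<<<<<<<...
$text
.

# Returns the formatted block and the text that did not fit.
sub fill_block {
    ($text) = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'PARA';
    select($old);
    write($fh);
    close($fh);
    return ($buf, $text);
}

my ($block, $rest) = fill_block(
    'The quick brown fox jumps over the lazy dog and keeps on running.');
print $block;
print "left over: $rest\n";
```

If the caller needs the original string later, it has to pass a copy,
as fill_block() does by assigning to $text.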

Suppressing Lines Where All Fields Are Void

Using caret fields can produce lines where all fields are blank. You
can suppress such lines by putting a "~" (tilde) character anywhere in
the line. The tilde will be translated to a space upon output.

Repeating Format Lines

If you put two contiguous tilde characters "~~" anywhere into a line,
the line will be repeated until all the fields on the line are
exhausted, i.e. undefined. For special (caret) text fields this will
occur sooner or later, but if you use a text field of the at variety,
the expression you supply had better not give the same value every
time forever! ("shift(@f)" is a simple example that would work.) Don't
use a regular (at) numeric field in such lines, because it will never
go blank.
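
The shift() idea can be sketched like this. The array @queue, the
format name LIST and the helper render_list() are hypothetical.

```perl
use strict;
use warnings;
no warnings 'uninitialized';   # the final shift() returns undef on purpose

our @queue;

# One picture line, repeated by "~~" until shift() runs out of values.
format LIST =
- @<<<<<<<<<<<<<<<<<< ~~
shift(@queue)
.

# Write every element of the list through LIST and return the text.
sub render_list {
    @queue = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'LIST';
    select($old);
    write($fh);
    close($fh);
    return $buf;
}

print render_list(qw(alpha beta gamma));
```

A single write() call emits one line per element and leaves @queue
empty, so the array plays the same role here that the consumed scalar
plays in fill mode.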

Top of Form Processing

Top-of-form processing is by default handled by a format with the same
name as the current filehandle with "_TOP" concatenated to it. It's
triggered at the top of each page. See "write" in perlfunc.
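
To watch the page header fire more than once, here is a sketch with a
deliberately tiny page. The names INV, INV_TOP, $item, $count and
render_inventory() are invented; the variables $~, $^, $= and $% are
covered under "Format Variables".

```perl
use strict;
use warnings;

our ($item, $count);

# Header: two lines, with the current page number from $%.
format INV_TOP =
Item       Count   Page @<
$%
----------------------
.

format INV =
@<<<<<<<<< @>>>>
$item,     $count
.

# Write all records to an in-memory handle and return the text.
sub render_inventory {
    my @records = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);     # the format variables follow the selected handle
    $~ = 'INV';
    $^ = 'INV_TOP';            # named explicitly instead of relying on "_TOP"
    $= = 5;                    # 5 lines per page: 2 header lines + 3 records
    for my $rec (@records) {
        ($item, $count) = @$rec;
        write;
    }
    select($old);
    close($fh);
    return $buf;
}

print render_inventory(['bolt', 40], ['nut', 75], ['washer', 120], ['screw', 8]);
```

With four records and room for three per page, the fourth write()
starts a second page, preceded by the form feed string.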

Examples:

    # a report on the /etc/passwd file
    format STDOUT_TOP =
                            Passwd File
    Name                Login    Office   Uid   Gid Home
    ------------------------------------------------------------------
    .
    format STDOUT =
    @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
    $name,              $login,  $office,$uid,$gid, $home
    .

    # a report from a bug report form
    format STDOUT_TOP =
                            Bug Reports
    @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
    $system,                     $%,          $date
    ------------------------------------------------------------------
    .
    format STDOUT =
    Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
             $subject
    Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
           $index,                       $description
    Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
              $priority,        $date,   $description
    From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
          $from,                         $description
    Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $programmer,            $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                         $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                         $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                         $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                         $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
                                         $description
    .

It is possible to intermix print()s with write()s on the same output
channel, but you'll have to handle "$-" ($FORMAT_LINES_LEFT) yourself.
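
One way to do that bookkeeping, as a rough sketch: charge every line
you print() yourself against the page. The helper name print_counted()
is made up, and it assumes that $line contains no embedded newlines.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# Print one raw line to $fh and count it against the current page, so
# that a later write() on the same handle still breaks pages correctly.
sub print_counted {
    my ($fh, $line) = @_;
    my $old = select($fh);              # $FORMAT_LINES_LEFT is per-handle
    print "$line\n";
    $FORMAT_LINES_LEFT-- if $FORMAT_LINES_LEFT > 0;
    select($old);
    return;
}

print_counted(\*STDOUT, 'a line printed outside any format');
```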

Format Variables

The current format name is stored in the variable $~ ($FORMAT_NAME),
and the current top of form format name is in $^ ($FORMAT_TOP_NAME).
The current output page number is stored in $% ($FORMAT_PAGE_NUMBER),
and the number of lines on the page is in $= ($FORMAT_LINES_PER_PAGE).
Whether to autoflush output on this handle is stored in $|
($OUTPUT_AUTOFLUSH). The string output before each top of page (except
the first) is stored in $^L ($FORMAT_FORMFEED). These variables are set
on a per-filehandle basis, so you'll need to select() into a different
one to affect them:

    select((select(OUTF),
            $~ = "My_Other_Format",
            $^ = "My_Top_Format"
           )[0]);

Pretty ugly, eh? It's a common idiom though, so don't be too surprised
when you see it. You can at least use a temporary variable to hold the
previous filehandle. This is a much better approach in general: not
only does legibility improve, but you also get an intermediate stage in
the expression to single-step through in the debugger:

    $ofh = select(OUTF);
    $~ = "My_Other_Format";
    $^ = "My_Top_Format";
    select($ofh);

If you use the English module, you can even read the variable names:

    use English '-no_match_vars';
    $ofh = select(OUTF);
    $FORMAT_NAME     = "My_Other_Format";
    $FORMAT_TOP_NAME = "My_Top_Format";
    select($ofh);

But you still have those funny select()s. So just use the FileHandle
module. Now, you can access these special variables using lowercase
method names instead:

    use FileHandle;
    format_name     OUTF "My_Other_Format";
    format_top_name OUTF "My_Top_Format";

Much better!

NOTES

Because the values line may contain arbitrary expressions (for at
fields, not caret fields), you can farm out more sophisticated
processing to other functions, like sprintf() or one of your own. For
example:

    format Ident =
    @<<<<<<<<<<<<<<<
    &commify($n)
    .
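
The example above calls commify() without defining it. One possible
definition, using a commonly seen regex idiom, is sketched below
together with a format that uses it; the format name IDENT, the
variable $n and the helper render_number() are assumptions made for
the illustration.

```perl
use strict;
use warnings;

# One possible commify(): insert commas between groups of three digits
# in the integer part of a number.
sub commify {
    my ($number) = @_;
    1 while $number =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $number;
}

our $n;

# The argument line is an ordinary expression, so it can call commify().
format IDENT =
@>>>>>>>>>>>>>>
commify($n)
.

# Write $n through IDENT and return the formatted text.
sub render_number {
    ($n) = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'IDENT';
    select($old);
    write($fh);
    close($fh);
    return $buf;
}

print render_number(1234567);
```

A text field is used rather than a numeric one because the value
handed to the format is already a string containing commas.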

To get a real at or caret into the field, do this:

    format Ident =
    I have an @ here.
              "@"
    .

To center a whole line of text, do something like this:

    format Ident =
    @|||||||||||||||||||||||||||||||||||||||||||||||
    "Some text line"
    .

There is no builtin way to say "float this to the right hand side of
the page, however wide it is." You have to specify where it goes. The
truly desperate can generate their own format on the fly, based on the
current number of columns, and then eval() it:

    $format  = "format STDOUT = \n"
             . '^' . '<' x $cols . "\n"
             . '$entry' . "\n"
             . "\t^" . "<" x ($cols-8) . "~~\n"
             . '$entry' . "\n"
             . ".\n";
    print $format if $Debugging;
    eval $format;
    die $@ if $@;

Which would generate a format looking something like this:

    format STDOUT =
    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    $entry
            ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
    $entry
    .

Here's a little program that's somewhat like fmt(1):

    format =
    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ~~
    $_

    .

    $/ = '';
    while (<>) {
        s/\s*\n\s*/ /g;
        write;
    }

Footers

While $FORMAT_TOP_NAME contains the name of the current header format,
there is no corresponding mechanism to automatically do the same thing
for a footer. Not knowing how big a format is going to be until you
evaluate it is one of the major problems. It's on the TODO list.

Here's one strategy: If you have a fixed-size footer, you can get
footers by checking $FORMAT_LINES_LEFT before each write() and printing
the footer yourself if necessary.
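
A sketch of this first strategy, assuming a one-line footer and a body
format that always produces exactly one line. The names LOG, LOG_TOP,
$entry, $FOOTER and render_log() are invented for the example, and the
page length is kept tiny on purpose.

```perl
use strict;
use warnings;

our $entry;

format LOG_TOP =
Log entries
.

format LOG =
@<<<<<<<<<<<<<<<<<<<
$entry
.

my $FOOTER       = "-- end of page --\n";   # fixed-size footer: one line
my $FOOTER_LINES = 1;

# Write all entries, reserving room for the footer on every page.
sub render_log {
    my @entries = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'LOG';
    $^ = 'LOG_TOP';
    $= = 5;                          # short page, for demonstration only
    for my $e (@entries) {
        # $- is 0 before the first page; after that, stop early enough
        # to leave room for the footer.
        if ($- > 0 && $- <= $FOOTER_LINES) {
            print $FOOTER;
            $- = 0;                  # force top-of-form on the next write()
        }
        $entry = $e;
        write;
    }
    print $FOOTER;                   # footer for the final page
    select($old);
    close($fh);
    return $buf;
}

print render_log(qw(one two three four));
```

The last page is not padded here, so its footer follows the final
record directly; a real report might print blank lines first.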

Here's another strategy: Open a pipe to yourself, using "open(MYSELF,
"|-")" (see "open()" in perlfunc) and always write() to MYSELF instead
of STDOUT. Have your child process massage its STDIN to rearrange
headers and footers however you like. Not very convenient, but doable.

Accessing Formatting Internals

For low-level access to the formatting mechanism, you may use
formline() and access $^A (the $ACCUMULATOR variable) directly.

For example:

    $str = formline <<'END', 1,2,3;
    @<<<  @|||  @>>>
    END

    print "Wow, I just stored `$^A' in the accumulator!\n";

Or to make an swrite() subroutine, which is to write() what sprintf()
is to printf(), do this:

    use Carp;
    sub swrite {
        croak "usage: swrite PICTURE ARGS" unless @_;
        my $format = shift;
        $^A = "";
        formline($format,@_);
        return $^A;
    }

    $string = swrite(<<'END', 1, 2, 3);
     Check me out
     @<<<  @|||  @>>>
    END
    print $string;

WARNINGS

The lone dot that ends a format can also prematurely end a mail message
passing through a misconfigured Internet mailer (and based on
experience, such misconfiguration is the rule, not the exception). So
when sending format code through mail, you should indent it so that the
format-ending dot is not on the left margin; this will prevent SMTP
cutoff.

Lexical variables (declared with "my") are not visible within a format
unless the format is declared within the scope of the lexical variable.
(They weren't visible at all before version 5.001.)

Formats are the only part of Perl that unconditionally use information
from a program's locale; if a program's environment specifies an
LC_NUMERIC locale, it is always used to specify the decimal point
character in formatted output. Perl ignores all other aspects of locale
handling unless the "use locale" pragma is in effect. Formatted output
cannot be controlled by "use locale" because the pragma is tied to the
block structure of the program, and, for historical reasons, formats
exist outside that block structure. See perllocale for further
discussion of locale handling.

Within strings that are to be displayed in a fixed length text field,
each control character is substituted by a space. (But remember the
special meaning of "\r" when using fill mode.) This is done to avoid
misalignment when control characters "disappear" on some output media.

perl v5.8.8                       2006-01-07                   PERLFORM(1)