Format_tutorial(3) OCaml library Format_tutorial(3)

Format_tutorial - Principles

Module Format_tutorial

Module Format_tutorial
: sig end


Principles
Line breaking is based on three concepts:

-boxes: a box is a logical pretty-printing unit, which defines how the
pretty-printing engine displays the material inside the box.

-break hints: a break hint is a directive to the pretty-printing engine
that proposes to break the line here, if it is necessary to properly
print the rest of the material. Otherwise, the pretty-printing engine
never breaks lines (except "in case of emergency" to avoid very bad
output). In short, a break hint tells the pretty printer that a line
break here may be appropriate.

-indentation rules: when a line break occurs, the pretty-printing
engine fixes the indentation (or amount of leading spaces) of the new
line using indentation rules, as follows:

-A box can state the extra indentation of every new line opened in its
scope. This extra indentation is named box breaking indentation.

-A break hint can also set the additional indentation of the new line
it may fire. This extra indentation is named hint breaking indentation.

-If break hint bh fires a new line within box b, then the indentation
of the new line is simply the sum of: the current indentation of box b
+ the additional box breaking indentation, as defined by box b + the
additional hint breaking indentation, as defined by break hint bh.


Boxes
There are 4 types of boxes. (The most often used is the "hov" box type,
so skip the rest at first reading.)

-horizontal box (h box, as obtained by the open_hbox procedure): within
this box, break hints do not lead to line breaks.

-vertical box (v box, as obtained by the open_vbox procedure): within
this box, every break hint leads to a new line.

-vertical/horizontal box (hv box, as obtained by the open_hvbox
procedure): if it is possible, the entire box is written on a single
line; otherwise, every break hint within the box leads to a new line.

-vertical or horizontal box (hov box, as obtained by the open_box or
open_hovbox procedures): within this box, break hints are used to cut
the line when there is no more room on the line. There are two kinds
of "hov" boxes; the details are given below. As a first approximation,
let me consider these two kinds of "hov" boxes as equivalent and
obtained by calling the open_box procedure.

Let me give an example. Suppose we can write 10 chars before the right
margin (that indicates no more room). We represent any char as a -
sign; characters [ and ] indicate the opening and closing of a box and
b stands for a break hint given to the pretty-printing engine.

The output "--b--b--" is displayed like this (the b symbol stands for
the value of the break that is explained below):

Within a "h" box:

--b--b--

Within a "v" box:

--b
--b
--

Within a "hv" box:

If there is enough room to print the box on the line:

--b--b--

But "---b---b---" that cannot fit on the line is written

---b
---b
---

Within a "hov" box:

If there is enough room to print the box on the line:

--b--b--

But if "---b---b---" cannot fit on the line, it is written as

---b---b
---

The first break hint does not lead to a new line, since there is enough
room on the line. The second one leads to a new line since there is no
more room to print the material following it. If the room left on the
line were even shorter, the first break hint would also lead to a new
line and "---b---b---" would be written as:

---b
---b
---


Printing spaces
Break hints are also used to output spaces (if the line is not split
when the break is encountered; otherwise the new line properly
indicates the separation between printing items). You output a break
hint using print_break sp indent, and this sp integer is used to print
"sp" spaces. Thus print_break sp ... may be thought of as: print sp
spaces or output a new line.

For instance, if b is print_break 1 0 in the output "--b--b--", we get

within a "h" box:

-- -- --

within a "v" box:

--
--
--

within a "hv" box:

-- -- --

or, according to the remaining room on the line:

--
--
--

and similarly for "hov" boxes.
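
The four behaviours can be tried out with a small experiment. The
sketch below is an illustration only: the render helper, the
six-character words and the 18-column margin are assumptions chosen so
that the four words do not fit on one line. It uses the pp_ variants of
the procedures named above, which take an explicit formatter, so that
the output is collected in a buffer instead of going to stdout.

```ocaml
(* Print four words separated by print_space break hints inside a single
   box opened by [open_box], and return the text that was produced. *)
let render ~margin open_box =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  open_box ppf 0;
  List.iteri
    (fun i word ->
      if i > 0 then Format.pp_print_space ppf ();
      Format.pp_print_string ppf word)
    ["aaaaaa"; "bbbbbb"; "cccccc"; "dddddd"];
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* "h" box: break hints never split the line, even past the margin. *)
let h = render ~margin:18 (fun ppf _ -> Format.pp_open_hbox ppf ())

(* "v" box: every break hint starts a new line. *)
let v = render ~margin:18 Format.pp_open_vbox

(* "hv" box: all on one line if it fits, otherwise one word per line. *)
let hv = render ~margin:18 Format.pp_open_hvbox
let hv_wide = render ~margin:40 Format.pp_open_hvbox

(* "hov" box: the line is split only where the room runs out. *)
let hov = render ~margin:18 Format.pp_open_hovbox
```

With these settings hov holds two lines of two words each, hv and v
hold one word per line, and h and hv_wide hold a single line.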

Generally speaking, a printing routine using "format" should not
directly output white spaces: the routine should use break hints
instead (for instance print_space (), which is a convenient
abbreviation for print_break 1 0 and outputs a single space or breaks
the line).


Indentation of new lines
The user gets 2 ways to fix the indentation of new lines:

When defining the box: when you open a box, you can fix the indentation
added to each new line opened within that box.

For instance: open_hovbox 1 opens a "hov" box with new lines indented 1
more than the initial indentation of the box. With output
"---[--b--b--b--", we get:

---[--b--b
     --b--

with open_hovbox 2, we get

---[--b--b
      --b--

Note: the [ sign in the display is not visible on the screen, it is
just there to materialise the opening of the pretty-printing box. The
last "screen" stands for:

-----b--b
     --b--

When defining the break that makes the new line: as said above, you
output a break hint using print_break sp indent. The indent integer is
used to fix the additional indentation of the new line. Namely, it is
added to the default indentation offset of the box where the break
occurs.

For instance, if [ stands for the opening of a "hov" box with 1 as
extra indentation (as obtained by open_hovbox 1), and b is print_break
1 2, then from output "---[--b--b--b--", we get:

---[-- --
       --
       --
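
The way the two indentations add up can be checked with a short
experiment. This is a sketch under stated assumptions: the render
helper, the 20-column margin and the six-character words are
illustrative choices, not taken from the text. A margin of 20 rather
than 10 is used because the engine also caps the indentation of new
lines (see set_max_indent), and with very small margins that cap would
hide the effect shown here.

```ocaml
(* Print "---", then open a "hov" box with [box_indent] as box breaking
   indentation. Inside the box, three words are separated by break hints
   of one space with [hint_indent] as hint breaking indentation. *)
let render ~box_indent ~hint_indent =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 20;
  Format.pp_print_string ppf "---";
  Format.pp_open_hovbox ppf box_indent;
  Format.pp_print_string ppf "aaaaaa";
  Format.pp_print_break ppf 1 hint_indent;
  Format.pp_print_string ppf "bbbbbb";
  Format.pp_print_break ppf 1 hint_indent;
  Format.pp_print_string ppf "cccccc";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* The box opens at column 3, so the second line is indented by
   3 + box_indent + hint_indent columns. *)
let plain = render ~box_indent:0 ~hint_indent:0   (* indented by 3 *)
let boxed = render ~box_indent:1 ~hint_indent:0   (* indented by 4 *)
let both = render ~box_indent:1 ~hint_indent:2    (* indented by 6 *)
```

In each case the first two words fit after "---" and only the third
word moves to a new line, so the three results differ only in the
number of leading spaces on the second line.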


Refinement on hov boxes
The "hov" box type is refined into two categories.

-the vertical or horizontal packing box (as obtained by the open_hovbox
procedure): break hints are used to cut the line when there is no more
room on the line; no new line occurs if there is enough room on the
line.

-the vertical or horizontal structural box (as obtained by the open_box
procedure): similar to the "hov" packing box, the break hints are used
to cut the line when there is no more room on the line; in addition,
break hints that can show the box structure lead to new lines even if
there is enough room on the current line.

The difference between a packing and a structural "hov" box is shown by
a routine that closes boxes and parentheses at the end of printing:
with packing boxes, the closure of boxes and parentheses does not lead
to new lines if there is enough room on the line, whereas with
structural boxes each break hint will lead to a new line. For instance,
consider printing "[(---[(----[(---b)]b)]b)]", where "b" is a break
hint without extra indentation (print_cut ()). If "[" means the opening
of a packing "hov" box (open_hovbox), "[(---[(----[(---b)]b)]b)]" is
printed as follows:

(---
 (----
  (---)))

If we replace the packing boxes by structural boxes (open_box), each
break hint that precedes a closing parenthesis can show the box
structure, if it leads to a new line; hence "[(---[(----[(---b)]b)]b)]"
is printed like this:

(---
 (----
  (---
  )
 )
)
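
The difference can also be observed on a smaller case than the nested
parentheses above. The following sketch is an illustration under
assumptions: the begin/end material, the 16-column margin and the
render helper are invented for the experiment. The same tokens are
printed once in a packing box (open_hovbox) and once in a structural
box (open_box). The last break hint has no extra indentation, whereas
the previous ones indent by 2.

```ocaml
(* Print "begin", two indented items and "end" inside one box opened by
   [open_box], with a margin too narrow for everything to fit. *)
let render open_box =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 16;
  open_box ppf 0;
  Format.pp_print_string ppf "begin";
  Format.pp_print_break ppf 1 2;
  Format.pp_print_string ppf "aaaaaaaaaa;";
  Format.pp_print_break ppf 1 2;
  Format.pp_print_string ppf "bbbbbbbb";
  Format.pp_print_space ppf ();     (* same as print_break 1 0 *)
  Format.pp_print_string ppf "end";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Packing box: "end" stays on the last line because there is room. *)
let packing = render Format.pp_open_hovbox

(* Structural box: breaking before "end" goes back to a smaller
   indentation, which shows the structure, so the line is broken even
   though "end" would fit. *)
let structural = render Format.pp_open_box
```

The two results agree on the first two lines. The packing box ends
with "  bbbbbbbb end" on one line, while the structural box puts "end"
on a line of its own at the indentation of "begin".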


Practical advice
When writing a pretty-printing routine, follow these simple rules:

-Boxes must be opened and closed consistently (open_* and close_box
must be nested like parentheses).

-Never hesitate to open a box.

-Output many break hints, otherwise the pretty-printer is in a bad
situation where it tries to do its best, which is always "worse than
your bad".

-Do not try to force spacing using explicit spaces in the character
strings. For each space you want in the output emit a break hint
(print_space ()), unless you explicitly don't want the line to be
broken here. For instance, imagine you want to pretty print an OCaml
definition, more precisely a let rec ident = expression value
definition. You will probably treat the first three spaces as
"unbreakable spaces" and write them directly in the string constants
for keywords: print "let rec " before the identifier, and similarly
write " =" to get an unbreakable space after the identifier. In
contrast, the space after the = sign is certainly a break hint, since
breaking the line after = is a usual (and elegant) way to indent the
expression part of a definition. In short, it is often necessary to
print unbreakable spaces; however, most of the time a space should be
considered a break hint.

-Do not try to force new lines, let the pretty-printer do it for you:
that's its only job. In particular, do not use force_newline: this
procedure effectively leads to a newline, but it also has the
unfortunate side effect of partially reinitialising the pretty-printing
engine, so that the rest of the printing material is noticeably messed
up.

-Never put newline characters directly in the strings to be printed:
the pretty-printing engine will consider this newline character as any
other character written on the current line and this will completely
mess up the output. Instead of newline characters use line break
hints: if those break hints must always result in new lines, it just
means that the surrounding box must be a vertical box!

-End your main program with a print_newline () call, which flushes the
pretty-printer tables (hence the output). (Note that the top-level
loop of the interactive system does it as well, just before a new
input.)
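
The let rec advice can be sketched as follows. This is only an
illustration: the definition helper, its arguments and the two margins
are assumptions made for the experiment. The spaces inside "let rec "
and " =" are written in the string and can never be broken, whereas the
space after = is a break hint (written "@ " in the format-string
notation of the Format module, equivalent to print_space ()).

```ocaml
(* Print "let rec <name> = <body>" in a box with 2 as extra indentation.
   Only the space after "=" is a break hint; the others are ordinary
   characters of the string constants. *)
let definition ~margin name body =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  Format.fprintf ppf "@[<2>let rec %s =@ %s@]" name body;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Enough room: the break hint prints a single space. *)
let wide = definition ~margin:78 "f" "fun x -> x + 1"

(* Not enough room: the line is broken after "=" and the body is
   indented by 2. *)
let narrow = definition ~margin:20 "f" "fun x -> x + 1"
```

Here wide is a single line, and narrow has "let rec f =" on the first
line followed by the body indented by two spaces.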


Printing to stdout: using printf
The Format module provides a general printing facility "a la" printf.
In addition to the usual conversion facility provided by printf, you
can write pretty-printing indications directly inside the format string
(opening and closing boxes, indicating breaking hints, etc).

Pretty-printing annotations are introduced by the @ symbol, directly
into the format string. Almost any function of the Format module can be
called from within a printf format string. For instance

-"@[" opens a box (open_box 0). You may specify the box type as an
extra argument. For instance @[<hov n> is equivalent to open_hovbox n.

-"@]" closes a box (close_box ()).

-"@ " outputs a breakable space (print_space ()).

-"@," outputs a break hint (print_cut ()).

-"@;<n m>" emits a "full" break hint (print_break n m).

-"@." ends the pretty-printing, closing all the boxes still opened
(print_newline ()).

For instance

printf "@[<1>%s@ =@ %d@ %s@]@." "Prix TTC" 100 "Euros";;
Prix TTC = 100 Euros
- : unit = ()
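
To make the correspondence between annotations and procedures concrete,
the sketch below renders the same material twice: once with the
annotated format string, and once with explicit calls. It is an
illustration, not part of the tutorial: it uses Format.asprintf and a
buffer formatter (with the pp_ variants of the procedures) so that both
results are strings that can be compared, and it leaves out the final
"@." so that no newline is appended.

```ocaml
(* The annotated format string, rendered to a string. *)
let with_annotations =
  Format.asprintf "@[<1>%s@ =@ %d@ %s@]" "Prix TTC" 100 "Euros"

(* The same box and break hints, written as explicit procedure calls. *)
let with_calls =
  let buf = Buffer.create 32 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_open_box ppf 1;            (* "@[<1>" *)
  Format.pp_print_string ppf "Prix TTC";
  Format.pp_print_space ppf ();        (* "@ " *)
  Format.pp_print_string ppf "=";
  Format.pp_print_space ppf ();        (* "@ " *)
  Format.pp_print_int ppf 100;
  Format.pp_print_space ppf ();        (* "@ " *)
  Format.pp_print_string ppf "Euros";
  Format.pp_close_box ppf ();          (* "@]" *)
  Format.pp_print_flush ppf ();
  Buffer.contents buf
```

Both values are the string "Prix TTC = 100 Euros", since everything
fits within the default margin.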


A concrete example
Let me give a full example: the shortest non-trivial example you could
imagine, that is the lambda calculus :)

Thus the problem is to pretty-print the values of a concrete data type
that models a language of expressions that defines functions and their
applications to arguments.

First, I give the abstract syntax of lambda-terms:

type lambda =
  | Lambda of string * lambda
  | Var of string
  | Apply of lambda * lambda
;;

I use the format library to print the lambda-terms:

open Format;;
let ident = print_string;;
let kwd = print_string;;
val ident : string -> unit = <fun>
val kwd : string -> unit = <fun>

let rec print_exp0 = function
  | Var s -> ident s
  | lam -> open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box ()

and print_app = function
  | e -> open_hovbox 2; print_other_applications e; close_box ()

and print_other_applications f =
  match f with
  | Apply (f, arg) -> print_app f; print_space (); print_exp0 arg
  | f -> print_exp0 f

and print_lambda = function
  | Lambda (s, lam) ->
      open_hovbox 1;
      kwd "\\"; ident s; kwd "."; print_space (); print_lambda lam;
      close_box ()
  | e -> print_app e;;
val print_exp0 : lambda -> unit = <fun>
val print_app : lambda -> unit = <fun>
val print_other_applications : lambda -> unit = <fun>
val print_lambda : lambda -> unit = <fun>


Most general pretty-printing: using fprintf
We use the fprintf function to write the most versatile version of the
pretty-printing functions for lambda-terms. Now, the functions get an
extra argument, namely a pretty-printing formatter (the ppf argument)
where printing will occur. This way the printing routines are more
general, since they can print on any formatter defined in the program
(either printing to a file, or to stdout, to stderr, or even to a
string). Furthermore, the pretty-printing functions are now
compositional, since they may be used in conjunction with the special
%a conversion, which prints a fprintf argument with a user-supplied
function (these user-supplied functions also have a formatter as first
argument).

Using fprintf, the lambda-terms printing routines can be written as
follows:

open Format;;
let ident ppf s = fprintf ppf "%s" s;;
let kwd ppf s = fprintf ppf "%s" s;;
val ident : Format.formatter -> string -> unit
val kwd : Format.formatter -> string -> unit

let rec pr_exp0 ppf = function
  | Var s -> fprintf ppf "%a" ident s
  | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam

and pr_app ppf = function
  | e -> fprintf ppf "@[<2>%a@]" pr_other_applications e

and pr_other_applications ppf f =
  match f with
  | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
  | f -> pr_exp0 ppf f

and pr_lambda ppf = function
  | Lambda (s, lam) ->
      fprintf ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam
  | e -> pr_app ppf e
;;
val pr_exp0 : Format.formatter -> lambda -> unit
val pr_app : Format.formatter -> lambda -> unit
val pr_other_applications : Format.formatter -> lambda -> unit
val pr_lambda : Format.formatter -> lambda -> unit

Given those general printing routines, defining procedures to print to
stdout or stderr is just a matter of partial application:

let print_lambda = pr_lambda std_formatter;;
let eprint_lambda = pr_lambda err_formatter;;
val print_lambda : lambda -> unit
val eprint_lambda : lambda -> unit
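
Printing "even to a string" can be sketched in the same way. The block
below repeats the type and the fprintf-based routines so that it is
self-contained. The to_string helper and the sample terms are
assumptions added for illustration. It relies on Format.asprintf,
which formats into a string, together with the %a conversion.

```ocaml
type lambda =
  | Lambda of string * lambda
  | Var of string
  | Apply of lambda * lambda

let ident ppf s = Format.fprintf ppf "%s" s
let kwd ppf s = Format.fprintf ppf "%s" s

(* Same routines as in the text, with Format.fprintf written in full. *)
let rec pr_exp0 ppf = function
  | Var s -> Format.fprintf ppf "%a" ident s
  | lam -> Format.fprintf ppf "@[<1>(%a)@]" pr_lambda lam

and pr_app ppf = function
  | e -> Format.fprintf ppf "@[<2>%a@]" pr_other_applications e

and pr_other_applications ppf f =
  match f with
  | Apply (f, arg) -> Format.fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
  | f -> pr_exp0 ppf f

and pr_lambda ppf = function
  | Lambda (s, lam) ->
      Format.fprintf ppf "@[<1>%a%a%a@ %a@]"
        kwd "\\" ident s kwd "." pr_lambda lam
  | e -> pr_app ppf e

(* Hypothetical helper: render a term to a string via %a. *)
let to_string term = Format.asprintf "%a" pr_lambda term

let identity = to_string (Lambda ("x", Var "x"))
let redex = to_string (Apply (Lambda ("x", Var "x"), Var "y"))
let nested = to_string (Apply (Var "f", Apply (Var "g", Var "x")))
```

These short terms fit within the default margin, so each result is a
single line. Parentheses appear only around a lambda or an application
that is itself part of an application.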

OCamldoc 2022-02-04 Format_tutorial(3)