Format_tutorial(3)             OCaml library             Format_tutorial(3)

Format_tutorial - Using the Format module

Module Format_tutorial

Module Format_tutorial
: sig end

Using the Format module
Principles
Line breaking is based on three concepts:

-boxes: a box is a logical pretty-printing unit, which defines a
behaviour of the pretty-printing engine to display the material inside
the box.

-break hints: a break hint is a directive to the pretty-printing engine
that proposes to break the line here, if it is necessary to properly
print the rest of the material. Otherwise, the pretty-printing engine
never breaks lines (except "in case of emergency" to avoid very bad
output). In short, a break hint tells the pretty printer that a line
break here may be appropriate.

-indentation rules: when a line break occurs, the pretty-printing
engine fixes the indentation (or amount of leading spaces) of the new
line using indentation rules, as follows:

-A box can state the extra indentation of every new line opened in its
scope. This extra indentation is named box breaking indentation.

-A break hint can also set the additional indentation of the new line
it may fire. This extra indentation is named hint breaking indentation.

-If break hint bh fires a new line within box b, then the indentation
of the new line is simply the sum of: the current indentation of box b
+ the additional box breaking indentation, as defined by box b + the
additional hint breaking indentation, as defined by break hint bh.

Boxes
There are 4 types of boxes. (The most often used is the "hov" box type,
so skip the rest at first reading.)

-horizontal box (h box, as obtained by the Format.open_hbox procedure):
within this box, break hints do not lead to line breaks.

-vertical box (v box, as obtained by the Format.open_vbox procedure):
within this box, every break hint leads to a new line.

-vertical/horizontal box (hv box, as obtained by the Format.open_hvbox
procedure): if it is possible, the entire box is written on a single
line; otherwise, every break hint within the box leads to a new line.

-vertical or horizontal box (hov box, as obtained by the
Format.open_box or Format.open_hovbox procedures): within this box,
break hints are used to cut the line when there is no more room on the
line. There are two kinds of "hov" boxes; the details are given in the
section "Refinement on hov boxes". As a first approximation, consider
these two kinds of "hov" boxes as equivalent and obtained by calling
the Format.open_box procedure.

Let me give an example. Suppose we can write 10 chars before the right
margin (that indicates no more room). We represent any char as a -
sign; characters [ and ] indicate the opening and closing of a box, and
b stands for a break hint given to the pretty-printing engine.

The output "--b--b--" is displayed like this (the b symbol stands for
the value of the break, which is explained in the section "Printing
spaces"):

Within a "h" box:

--b--b--

Within a "v" box:

--b
--b
--

Within a "hv" box:

If there is enough room to print the box on the line:

--b--b--

But "---b---b---", which cannot fit on the line, is written as:

---b
---b
---

Within a "hov" box:

If there is enough room to print the box on the line:

--b--b--

But if "---b---b---" cannot fit on the line, it is written as:

---b---b
---

The first break hint does not lead to a new line, since there is enough
room on the line. The second one leads to a new line since there is no
more room to print the material following it. If the room left on the
line were even shorter, the first break hint would also lead to a new
line and "---b---b---" would be written as:

---b
---b
---
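
The four behaviours can be tried out with a small sketch. It assumes a
hypothetical helper, render, which is not part of Format: it prints
into a Buffer through a formatter whose right margin is set to 10. Each
b is taken to be print_space, so it shows up as one space when the line
is not broken.

```ocaml
(* Hypothetical helper (not part of Format): run a printer on a fresh
   formatter whose right margin is [margin] and return the text. *)
let render ?(margin = 10) (pp : Format.formatter -> unit) : string =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  pp ppf;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Print "---b---b---" inside the box opened by [open_box]; each b is
   a print_space, i.e. one space or a new line. *)
let three_items open_box ppf =
  open_box ppf;
  Format.pp_print_string ppf "---";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "---";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "---";
  Format.pp_close_box ppf ()

let in_hbox = render (three_items (fun ppf -> Format.pp_open_hbox ppf ()))
let in_vbox = render (three_items (fun ppf -> Format.pp_open_vbox ppf 0))
let in_hvbox = render (three_items (fun ppf -> Format.pp_open_hvbox ppf 0))
let in_hovbox = render (three_items (fun ppf -> Format.pp_open_hovbox ppf 0))

let () =
  List.iter
    (fun (name, text) -> Printf.printf "%s box:\n%s\n\n" name text)
    [ "h", in_hbox; "v", in_vbox; "hv", in_hvbox; "hov", in_hovbox ]
```

The 11 characters of "--- --- ---" do not fit in 10 columns, so the
"h" box overflows the margin on one line, the "v" and "hv" boxes put
one item per line, and the "hov" box breaks only at the second hint.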

Printing spaces
Break hints are also used to output spaces (if the line is not split
when the break is encountered; otherwise the new line properly
indicates the separation between printing items). You output a break
hint using print_break sp indent, and this sp integer is used to print
"sp" spaces. Thus print_break sp ... may be thought of as: print sp
spaces or output a new line.

For instance, if b is print_break 1 0 in the output "--b--b--", we get

within a "h" box:

-- -- --

within a "v" box:

--
--
--

within a "hv" box:

-- -- --

or, according to the remaining room on the line:

--
--
--

and similarly for "hov" boxes.

Generally speaking, a printing routine using "format" should not
directly output white spaces: the routine should use break hints
instead. (For instance print_space (), which is a convenient
abbreviation for print_break 1 0 and outputs a single space or breaks
the line.)
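
The "sp spaces or a new line" reading can be checked with a short
sketch. The helper items is hypothetical; Format.asprintf prints into a
string and its %t conversion runs a printer on the underlying
formatter. Only "h" and "v" boxes are used here because their behaviour
does not depend on the margin.

```ocaml
(* Print "--b--b--" where b is a break hint of [sp] spaces and no
   extra indentation, inside the box opened by [open_box]. *)
let items ~sp open_box ppf =
  open_box ppf;
  Format.pp_print_string ppf "--";
  Format.pp_print_break ppf sp 0;
  Format.pp_print_string ppf "--";
  Format.pp_print_break ppf sp 0;
  Format.pp_print_string ppf "--";
  Format.pp_close_box ppf ()

let hbox ppf = Format.pp_open_hbox ppf ()
let vbox ppf = Format.pp_open_vbox ppf 0

(* In a "h" box the hint is never a new line, so its spaces are shown. *)
let h1 = Format.asprintf "%t" (items ~sp:1 hbox)
let h3 = Format.asprintf "%t" (items ~sp:3 hbox)
let h0 = Format.asprintf "%t" (items ~sp:0 hbox)

(* In a "v" box every hint is a new line, so its spaces are dropped. *)
let v1 = Format.asprintf "%t" (items ~sp:1 vbox)
let v3 = Format.asprintf "%t" (items ~sp:3 vbox)

let () =
  List.iter print_endline [ h1; h3; h0; v1; v3 ]
```

Note that v1 and v3 are identical: once a hint fires a new line, its sp
argument no longer matters.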

Indentation of new lines
There are 2 ways to fix the indentation of new lines:

When defining the box: when you open a box, you can fix the indentation
added to each new line opened within that box.

For instance: open_hovbox 1 opens a "hov" box with new lines indented 1
more than the initial indentation of the box. With output
"---[--b--b--b--", we get:


---[--b--b
     --b--

with open_hovbox 2, we get

---[--b--b
      --b--

Note: the [ sign in the display is not visible on the screen; it is
just there to mark the opening of the pretty-printing box. The last
"screen" actually stands for:

-----b--b
     --b--

When defining the break that makes the new line: as said above, you
output a break hint using print_break sp indent. The indent integer is
used to fix the additional indentation of the new line. Namely, it is
added to the default indentation offset of the box where the break
occurs.

For instance, if [ stands for the opening of a "hov" box with 1 as
extra indentation (as obtained by open_hovbox 1), and b is
print_break 1 2, then from output "---[--b--b--b--", we get:


---[-- --
       --
       --
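
The sum of the three indentations can be checked with the following
sketch, which replays this last example with explicit calls and prints
into a Buffer. One assumption is worth flagging: with a margin as small
as 10, the formatter's default maximum indentation is lower than the 6
columns needed here, so the sketch raises it with
Format.pp_set_geometry (available since OCaml 4.08).

```ocaml
let output =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  (* 10 columns, as in the text. max_indent is raised to 9 so that the
     computed indentation of 6 is not capped by the formatter. *)
  Format.pp_set_geometry ppf ~max_indent:9 ~margin:10;
  Format.pp_print_string ppf "---";
  Format.pp_open_hovbox ppf 1;      (* box breaking indentation: 1 *)
  Format.pp_print_string ppf "--";
  Format.pp_print_break ppf 1 2;    (* hint breaking indentation: 2 *)
  Format.pp_print_string ppf "--";
  Format.pp_print_break ppf 1 2;
  Format.pp_print_string ppf "--";
  Format.pp_print_break ppf 1 2;
  Format.pp_print_string ppf "--";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () = print_endline output
```

The box is opened at column 3, so each new line starts at column
3 + 1 + 2 = 6. The [ marker of the display is not printed, which is why
the actual output is one column narrower than the schematic display.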

Refinement on hov boxes
The "hov" box type is refined into two categories.

-the vertical or horizontal packing box (as obtained by the
Format.open_hovbox procedure): break hints are used to cut the line
when there is no more room on the line; no new line occurs if there is
enough room on the line.

-the vertical or horizontal structural box (as obtained by the
Format.open_box procedure): similar to the "hov" packing box, the break
hints are used to cut the line when there is no more room on the line;
in addition, break hints that can show the box structure lead to new
lines even if there is enough room on the current line.

The difference between a packing and a structural "hov" box is shown by
a routine that closes boxes and parentheses at the end of printing:
with packing boxes, the closure of boxes and parentheses does not lead
to new lines if there is enough room on the line, whereas with
structural boxes each break hint will lead to a new line. For instance,
consider printing "[(---[(----[(---b)]b)]b)]", where "b" is a break
hint without extra indentation (print_cut ()). If "[" means opening of
a packing "hov" box (Format.open_hovbox), "[(---[(----[(---b)]b)]b)]"
is printed as follows:

(---
 (----
  (---)))

If we replace the packing boxes by structural boxes (Format.open_box),
each break hint that precedes a closing parenthesis can show the box
structure, if it leads to a new line; hence
"[(---[(----[(---b)]b)]b)]" is printed like this:

(---
 (----
  (---
  )
 )
)
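
To compare the two kinds of "hov" boxes on your own system, a sketch
along the following lines may help. The functions nested and render are
hypothetical, and the sketch departs from the schematic string in one
respect: it adds a cut before each nested box so that the engine has a
place to break the line. The exact layout depends on the margin, so no
output is claimed here; run it and compare the two results.

```ocaml
(* Print "(" and a name, then the nested term, then a cut and ")", all
   inside a box opened by [open_box] with 1 column of extra
   indentation. *)
let rec nested open_box ppf = function
  | [] -> ()
  | name :: rest ->
      open_box ppf 1;
      Format.pp_print_string ppf ("(" ^ name);
      (* Assumption of this sketch: a cut before the nested box. *)
      if rest <> [] then Format.pp_print_cut ppf ();
      nested open_box ppf rest;
      Format.pp_print_cut ppf ();   (* the b before the closing paren *)
      Format.pp_print_string ppf ")";
      Format.pp_close_box ppf ()

(* Hypothetical helper: print the nested term with a margin of 10. *)
let render open_box =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 10;
  nested open_box ppf [ "---"; "----"; "---" ];
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let packing = render Format.pp_open_hovbox     (* packing "hov" box *)
let structural = render Format.pp_open_box     (* structural "hov" box *)

let () =
  Printf.printf "packing:\n%s\n\nstructural:\n%s\n" packing structural
```

Both results contain the same characters; only the line breaks and the
indentation chosen by the engine may differ.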

Practical advice
When writing a pretty-printing routine, follow these simple rules:

-Boxes must be opened and closed consistently (open_* and
Format.close_box must be nested like parentheses).

-Never hesitate to open a box.

-Output many break hints; otherwise the pretty-printer is in a bad
situation where it tries to do its best, which is always worse than
even a mediocre hint from you.

-Do not try to force spacing using explicit spaces in the character
strings. For each space you want in the output, emit a break hint
(print_space ()), unless you explicitly don't want the line to be
broken here. For instance, imagine you want to pretty-print an OCaml
definition, more precisely a "let rec ident = expression" value
definition. You will probably treat the first three spaces as
"unbreakable spaces" and write them directly in the string constants
for keywords: print "let rec " before the identifier, and similarly
write " =" to get an unbreakable space after the identifier. In
contrast, the space after the = sign is certainly a break hint, since
breaking the line after = is a usual (and elegant) way to indent the
expression part of a definition. In short, it is often necessary to
print unbreakable spaces; however, most of the time a space should be
considered a break hint.

-Do not try to force new lines; let the pretty-printer do it for you:
that's its only job. In particular, do not use Format.force_newline:
this procedure effectively leads to a newline, but it also has the
unfortunate side effect of partially reinitialising the pretty-printing
engine, so that the rest of the printing material is noticeably messed
up.

-Never put newline characters directly in the strings to be printed:
the pretty-printing engine will consider this newline character as any
other character written on the current line, and this will completely
mess up the output. Instead of newline characters, use line break
hints: if those break hints must always result in new lines, it just
means that the surrounding box must be a vertical box!

-End your main program with a print_newline () call, which flushes the
pretty-printer tables (hence the output). (Note that the top-level
loop of the interactive system does it as well, just before a new
input.)
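
The "let rec" advice can be sketched as follows. Both pp_let_rec and
render are hypothetical names, and the expression is passed as an
already formatted string to keep the example short; a real printer
would print it with its own boxes and break hints.

```ocaml
(* Hypothetical printer for "let rec ident = expression". *)
let pp_let_rec ppf (ident, expr) =
  Format.pp_open_hovbox ppf 2;
  Format.pp_print_string ppf "let rec ";  (* two unbreakable spaces *)
  Format.pp_print_string ppf ident;
  Format.pp_print_string ppf " =";        (* one unbreakable space *)
  Format.pp_print_space ppf ();           (* break hint after = *)
  Format.pp_print_string ppf expr;
  Format.pp_close_box ppf ()

(* Hypothetical helper: print [x] with [pp] under the given margin. *)
let render ~margin pp x =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  pp ppf x;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let wide = render ~margin:78 pp_let_rec ("fact", "fun n -> n + 1")
let narrow = render ~margin:20 pp_let_rec ("fact", "fun n -> n + 1")

let () =
  print_endline wide;
  print_endline narrow
```

With enough room the definition stays on one line; with a margin of 20
the only possible break is the one after =, and the expression goes to
a new line indented by the 2 columns of the box.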

Printing to stdout: using printf
The Format module provides a general printing facility "a la" printf.
In addition to the usual conversion facility provided by printf, you
can write pretty-printing indications directly inside the format string
(opening and closing boxes, indicating breaking hints, etc).

Pretty-printing annotations are introduced by the @ symbol, directly
in the format string. Almost any function of the Format module can be
called from within a printf format string. For instance

-"@[" opens a box (open_box 0). You may specify the box type as an
extra argument. For instance @[<hov n> is equivalent to open_hovbox n.

-"@]" closes a box (close_box ()).

-"@ " outputs a breakable space (print_space ()).

-"@," outputs a break hint (print_cut ()).

-"@;<n m>" emits a "full" break hint (print_break n m).

-"@." ends the pretty-printing, closing all the boxes still opened
(print_newline ()).

For instance

printf "@[<1>%s@ =@ %d@ %s@]@." "Prix TTC" 100 "Euros";;
Prix TTC = 100 Euros
- : unit = ()
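
To make the correspondence between annotations and procedures
concrete, the sketch below prints the same material twice into a
string with Format.asprintf: once with annotations, once with explicit
calls through the %t conversion. The final "@." is left out on purpose
so that neither string ends with a newline.

```ocaml
(* The format-string version from the text, printing into a string. *)
let with_annotations =
  Format.asprintf "@[<1>%s@ =@ %d@ %s@]" "Prix TTC" 100 "Euros"

(* The same printing written with explicit calls. *)
let with_calls =
  Format.asprintf "%t" (fun ppf ->
    Format.pp_open_box ppf 1;               (* "@[<1>" *)
    Format.pp_print_string ppf "Prix TTC";  (* "%s" *)
    Format.pp_print_space ppf ();           (* "@ " *)
    Format.pp_print_string ppf "=";
    Format.pp_print_space ppf ();           (* "@ " *)
    Format.pp_print_int ppf 100;            (* "%d" *)
    Format.pp_print_space ppf ();           (* "@ " *)
    Format.pp_print_string ppf "Euros";     (* "%s" *)
    Format.pp_close_box ppf ())             (* "@]" *)

let () =
  print_endline with_annotations;
  print_endline with_calls
```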

A concrete example
Let me give a full example: the shortest non-trivial example you could
imagine, that is the lambda calculus :)

Thus the problem is to pretty-print the values of a concrete data type
that models a language of expressions that defines functions and their
applications to arguments.

First, I give the abstract syntax of lambda-terms:

type lambda =
  | Lambda of string * lambda
  | Var of string
  | Apply of lambda * lambda
;;

I use the format library to print the lambda-terms:

open Format;;
let ident = print_string;;
let kwd = print_string;;
val ident : string -> unit = <fun>
val kwd : string -> unit = <fun>
let rec print_exp0 = function
  | Var s -> ident s
  | lam -> open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box ()
and print_app = function
  | e -> open_hovbox 2; print_other_applications e; close_box ()
and print_other_applications f =
  match f with
  | Apply (f, arg) -> print_app f; print_space (); print_exp0 arg
  | f -> print_exp0 f
and print_lambda = function
  | Lambda (s, lam) ->
      open_hovbox 1;
      kwd "\\"; ident s; kwd "."; print_space(); print_lambda lam;
      close_box()
  | e -> print_app e;;
val print_exp0 : lambda -> unit = <fun>
val print_app : lambda -> unit = <fun>
val print_other_applications : lambda -> unit = <fun>
val print_lambda : lambda -> unit = <fun>

Most general pretty-printing: using fprintf
We use the fprintf function to write the most versatile version of the
pretty-printing functions for lambda-terms. Now, the functions get an
extra argument, namely a pretty-printing formatter (the ppf argument)
where printing will occur. This way the printing routines are more
general, since they can print on any formatter defined in the program
(either printing to a file, or to stdout, to stderr, or even to a
string). Furthermore, the pretty-printing functions are now
compositional, since they may be used in conjunction with the special
%a conversion, which prints an fprintf argument with a user-supplied
function (these user-supplied functions also take a formatter as first
argument).

Using fprintf, the lambda-terms printing routines can be written as
follows:

open Format;;
let ident ppf s = fprintf ppf "%s" s;;
let kwd ppf s = fprintf ppf "%s" s;;
val ident : Format.formatter -> string -> unit
val kwd : Format.formatter -> string -> unit
let rec pr_exp0 ppf = function
  | Var s -> fprintf ppf "%a" ident s
  | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam
and pr_app ppf = function
  | e -> fprintf ppf "@[<2>%a@]" pr_other_applications e
and pr_other_applications ppf f =
  match f with
  | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
  | f -> pr_exp0 ppf f
and pr_lambda ppf = function
  | Lambda (s, lam) ->
      fprintf ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam
  | e -> pr_app ppf e
;;
val pr_exp0 : Format.formatter -> lambda -> unit
val pr_app : Format.formatter -> lambda -> unit
val pr_other_applications : Format.formatter -> lambda -> unit
val pr_lambda : Format.formatter -> lambda -> unit

Given those general printing routines, printing to stdout or stderr is
just a matter of partial application:

let print_lambda = pr_lambda std_formatter;;
let eprint_lambda = pr_lambda err_formatter;;
val print_lambda : lambda -> unit
val eprint_lambda : lambda -> unit
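
Printing to a string works the same way. The sketch below repeats the
type and the fprintf printers so that it is self-contained, then adds a
hypothetical lambda_to_string built on Format.asprintf and the %a
conversion. The two sample terms t1 and t2 are made up for
illustration.

```ocaml
open Format

type lambda =
  | Lambda of string * lambda
  | Var of string
  | Apply of lambda * lambda

(* Printers as in the text. *)
let ident ppf s = fprintf ppf "%s" s
let kwd ppf s = fprintf ppf "%s" s

let rec pr_exp0 ppf = function
  | Var s -> fprintf ppf "%a" ident s
  | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam
and pr_app ppf e = fprintf ppf "@[<2>%a@]" pr_other_applications e
and pr_other_applications ppf f =
  match f with
  | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
  | f -> pr_exp0 ppf f
and pr_lambda ppf = function
  | Lambda (s, lam) ->
      fprintf ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam
  | e -> pr_app ppf e

(* Hypothetical helper: asprintf passes its own formatter to %a. *)
let lambda_to_string (t : lambda) : string = asprintf "%a" pr_lambda t

(* A function applying x to y, and the identity applied to y. *)
let t1 = Lambda ("x", Apply (Var "x", Var "y"))
let t2 = Apply (Lambda ("x", Var "x"), Var "y")

let () =
  print_endline (lambda_to_string t1);
  print_endline (lambda_to_string t2)
```

Because pr_lambda takes the formatter as its first argument, the same
function serves for stdout, stderr and strings without any change.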

OCamldoc                        2023-07-20               Format_tutorial(3)