Format_tutorial(3)               OCaml library              Format_tutorial(3)

NAME

       Format_tutorial - Principles

Module

       Module   Format_tutorial

Documentation

       Module Format_tutorial
        : sig end

   Principles
       Line breaking is based on three concepts:

       - boxes: a box is a logical pretty-printing unit, which defines how
       the pretty-printing engine displays the material inside the box.

       - break hints: a break hint is a directive to the pretty-printing
       engine that proposes to break the line here, if this is necessary to
       properly print the rest of the material. Otherwise, the
       pretty-printing engine never breaks lines (except "in case of
       emergency", to avoid very bad output). In short, a break hint tells
       the pretty-printer that a line break here may be appropriate.

       - indentation rules: when a line break occurs, the pretty-printing
       engine fixes the indentation (the number of leading spaces) of the
       new line using indentation rules, as follows:

       - A box can state the extra indentation of every new line opened in
       its scope. This extra indentation is named box breaking indentation.

       - A break hint can also set the additional indentation of the new
       line it may fire. This extra indentation is named hint breaking
       indentation.

       - If break hint bh fires a new line within box b, then the
       indentation of the new line is the sum of: the current indentation
       of box b + the additional box breaking indentation, as defined by
       box b + the additional hint breaking indentation, as defined by
       break hint bh.
   Boxes
       There are 4 types of boxes. (The most often used is the "hov" box
       type, so skip the rest at first reading.)

       - horizontal box (h box, as obtained by the open_hbox procedure):
       within this box, break hints do not lead to line breaks.

       - vertical box (v box, as obtained by the open_vbox procedure):
       within this box, every break hint leads to a new line.

       - vertical/horizontal box (hv box, as obtained by the open_hvbox
       procedure): if possible, the entire box is written on a single
       line; otherwise, every break hint within the box leads to a new
       line.

       - vertical or horizontal box (hov box, as obtained by the open_box
       or open_hovbox procedures): within this box, break hints are used to
       cut the line when there is no more room on the line. There are two
       kinds of "hov" boxes; the details are given below. As a first
       approximation, let me consider these two kinds of "hov" boxes as
       equivalent and obtained by calling the open_box procedure.

       Let me give an example. Suppose we can write 10 chars before the
       right margin (that indicates no more room). We represent any char
       as a - sign; the characters [ and ] indicate the opening and closing
       of a box, and b stands for a break hint given to the pretty-printing
       engine.

       The output "--b--b--" is displayed like this (the b symbol stands
       for the value of the break that is explained below):

       Within a "h" box:

       --b--b--

       Within a "v" box:

       --b
       --b
       --

       Within a "hv" box:

       If there is enough room to print the box on the line:

       --b--b--

       But "---b---b---", which cannot fit on the line, is written:

       ---b
       ---b
       ---

       Within a "hov" box:

       If there is enough room to print the box on the line:

       --b--b--

       But if "---b---b---" cannot fit on the line, it is written as:

       ---b---b
       ---

       The first break hint does not lead to a new line, since there is
       enough room on the line. The second one leads to a new line since
       there is no more room to print the material following it. If the
       room left on the line were even shorter, the first break hint could
       also lead to a new line, and "---b---b---" would be written as:

       ---b
       ---b
       ---
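       The four behaviours above can be checked with a small sketch. It
       assumes the formatter-explicit variants of the Format procedures
       (pp_open_hbox, pp_print_space, and so on), a buffer-backed formatter,
       and a right margin of 10 set with pp_set_margin. The helper name
       render and the three 3-character words are made up for the
       illustration. Break hints are emitted with pp_print_space, so a hint
       that does not break shows up as a single space.

```ocaml
(* Render three 3-character words separated by break hints inside one box,
   on a buffer-backed formatter whose right margin is 10 columns.
   [render] is a hypothetical helper, not part of Format. *)
let render (open_box : Format.formatter -> unit) : string =
  let buf = Buffer.create 32 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 10;
  open_box ppf;
  Format.pp_print_string ppf "---";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "---";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "---";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* The material needs 11 columns, one more than the margin allows. *)
let h_out   = render (fun ppf -> Format.pp_open_hbox ppf ())
let v_out   = render (fun ppf -> Format.pp_open_vbox ppf 0)
let hv_out  = render (fun ppf -> Format.pp_open_hvbox ppf 0)
let hov_out = render (fun ppf -> Format.pp_open_hovbox ppf 0)

let () =
  List.iter
    (fun (name, out) -> Printf.printf "%s box:\n%s\n\n" name out)
    ["h", h_out; "v", v_out; "hv", hv_out; "hov", hov_out]
```

       With this input the "h" box keeps everything on one (overlong)
       line, the "v" and "hv" boxes put one word per line, and the "hov"
       box prints two words on the first line and one on the second.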
   Printing spaces
       Break hints are also used to output spaces (if the line is not
       split when the break is encountered; otherwise the new line itself
       properly separates the printing items). You output a break hint
       using print_break sp indent, and this sp integer is used to print
       "sp" spaces. Thus print_break sp ... may be thought of as: print sp
       spaces or output a new line.

       For instance, if b is print_break 1 0 in the output "--b--b--", we
       get

       within a "h" box:

       -- -- --

       within a "v" box:

       --
       --
       --

       within a "hv" box:

       -- -- --

       or, according to the remaining room on the line:

       --
       --
       --

       and similarly for "hov" boxes.

       Generally speaking, a printing routine using Format should not
       directly output white spaces: the routine should use break hints
       instead. (For instance print_space (), which is a convenient
       abbreviation for print_break 1 0 and outputs a single space or
       breaks the line.)
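       As a small illustration of "print sp spaces or output a new line",
       the sketch below sends the same hint, print_break 3 0, through an
       "h" box and a "v" box. It uses the formatter-explicit
       pp_print_break on a buffer-backed formatter; the helper name
       with_break and the words "ab" and "cd" are made up for the
       example.

```ocaml
(* Print "ab", the break hint [print_break 3 0], then "cd" inside the box
   opened by [open_box], and return the text that was produced.
   [with_break] is a hypothetical helper, not part of Format. *)
let with_break (open_box : Format.formatter -> unit) : string =
  let buf = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer buf in
  open_box ppf;
  Format.pp_print_string ppf "ab";
  Format.pp_print_break ppf 3 0;
  Format.pp_print_string ppf "cd";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* In an "h" box the hint never breaks, so its 3 spaces are printed. *)
let same_line = with_break (fun ppf -> Format.pp_open_hbox ppf ())

(* In a "v" box the hint always breaks, so no space is printed at all. *)
let split = with_break (fun ppf -> Format.pp_open_vbox ppf 0)

let () = Printf.printf "%S\n%S\n" same_line split
```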
   Indentation of new lines
       There are two ways to fix the indentation of new lines:

       When defining the box: when you open a box, you can fix the
       indentation added to each new line opened within that box.

       For instance: open_hovbox 1 opens a "hov" box with new lines
       indented 1 more than the initial indentation of the box. With output
       "---[--b--b--b--", we get:

       ---[--b--b
            --b--

       with open_hovbox 2, we get

       ---[--b--b
             --b--

       Note: the [ sign in the display is not visible on the screen; it is
       just there to mark the opening of the pretty-printing box. The last
       display actually appears on screen as:

       -----b--b
            --b--

       When defining the break that makes the new line: as said above, you
       output a break hint using print_break sp indent. The indent integer
       is used to fix the additional indentation of the new line. Namely,
       it is added to the default indentation offset of the box where the
       break occurs.

       For instance, if [ stands for the opening of a "hov" box with 1 as
       extra indentation (as obtained by open_hovbox 1), and b is
       print_break 1 2, then from output "---[--b--b--b--", we get:

       ---[-- --
             --
             --
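       The way the two indentations add up can be observed with the
       following sketch. It is only an illustration under stated
       assumptions: a buffer-backed formatter, a right margin of 20 (wide
       enough that the computed indentation is not capped by the
       formatter's maximum indentation limit), and four made-up
       6-character words. The box is opened at column 3 with open_hovbox
       1 and every hint is print_break 1 2, so a new line is expected to
       start at column 3 + 1 + 2 = 6.

```ocaml
(* Print "---", then a "hov" box opened with [open_hovbox 1] that holds
   four 6-character words separated by [print_break 1 2].
   The margin and the words are assumptions made for this sketch. *)
let indented : string =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 20;
  Format.pp_print_string ppf "---";
  Format.pp_open_hovbox ppf 1;        (* box breaking indentation: 1 *)
  Format.pp_print_string ppf "aaaaaa";
  Format.pp_print_break ppf 1 2;      (* hint breaking indentation: 2 *)
  Format.pp_print_string ppf "bbbbbb";
  Format.pp_print_break ppf 1 2;
  Format.pp_print_string ppf "cccccc";
  Format.pp_print_break ppf 1 2;
  Format.pp_print_string ppf "dddddd";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () = print_endline indented
```

       Only the second hint breaks the line here; the first and third
       ones fit and are printed as single spaces.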
   Refinement on hov boxes
       The "hov" box type is refined into two categories.

       - the vertical or horizontal packing box (as obtained by the
       open_hovbox procedure): break hints are used to cut the line when
       there is no more room on the line; no new line occurs if there is
       enough room on the line.

       - the vertical or horizontal structural box (as obtained by the
       open_box procedure): similar to the "hov" packing box, the break
       hints are used to cut the line when there is no more room on the
       line; in addition, break hints that can show the box structure lead
       to new lines even if there is enough room on the current line.

       The difference between a packing and a structural "hov" box is
       shown by a routine that closes boxes and parentheses at the end of
       printing: with packing boxes, the closing of boxes and parentheses
       does not lead to new lines if there is enough room on the line,
       whereas with structural boxes each break hint will lead to a new
       line. For instance, consider printing "[(---[(----[(---b)]b)]b)]",
       where "b" is a break hint without extra indentation (print_cut ()).
       If "[" means the opening of a packing "hov" box (open_hovbox),
       "[(---[(----[(---b)]b)]b)]" is printed as follows:

       (---
        (----
         (---)))

       If we replace the packing boxes by structural boxes (open_box),
       each break hint that precedes a closing parenthesis can show the
       box structure, if it leads to a new line; hence
       "[(---[(----[(---b)]b)]b)]" is printed like this:

       (---
        (----
         (---
         )
        )
       )
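       A runnable variant of this comparison is sketched below. It is
       not the exact example of the text: to force the first line breaks
       it assumes a right margin of 20, two nesting levels only, made-up
       words, and inner break hints written as print_break 1 1 so that
       nested material is indented by one column. The helper name nested
       is hypothetical; the same routine is run once with pp_open_hovbox
       (packing) and once with pp_open_box (structural).

```ocaml
(* Print two nested boxes, both opened with [open_b], on a 20-column
   buffer-backed formatter. Each box holds "(", some material, a cut
   ([print_cut ()]) and the closing ")". *)
let nested (open_b : Format.formatter -> int -> unit) : string =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 20;
  open_b ppf 0;
  Format.pp_print_string ppf "(aaaaaaaa";
  Format.pp_print_break ppf 1 1;
  open_b ppf 0;
  Format.pp_print_string ppf "(bbbbbbbbb";
  Format.pp_print_break ppf 1 1;
  Format.pp_print_string ppf "cccccccccc";
  Format.pp_print_cut ppf ();          (* hint before the inner ")" *)
  Format.pp_print_string ppf ")";
  Format.pp_close_box ppf ();
  Format.pp_print_cut ppf ();          (* hint before the outer ")" *)
  Format.pp_print_string ppf ")";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let packing = nested Format.pp_open_hovbox
let structural = nested Format.pp_open_box

let () =
  print_endline packing;
  print_endline "";
  print_endline structural
```

       With packing boxes both closing parentheses stay on the last line,
       since there is room for them. With structural boxes each cut before
       a closing parenthesis breaks the line, because breaking there
       brings the indentation back to that of the enclosing box.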
   Practical advice
       When writing a pretty-printing routine, follow these simple rules:

       - Boxes must be opened and closed consistently (open_* and close_box
       must be nested like parentheses).

       - Never hesitate to open a box.

       - Output many break hints; otherwise the pretty-printer is left in
       a bad situation where it tries to do its best, which is always
       "worse than your bad".

       - Do not try to force spacing using explicit spaces in the
       character strings. For each space you want in the output, emit a
       break hint (print_space ()), unless you explicitly don't want the
       line to be broken there. For instance, imagine you want to
       pretty-print an OCaml definition, more precisely a
       let rec ident = expression value definition. You will probably treat
       the first three spaces as "unbreakable spaces" and write them
       directly in the string constants for keywords: print "let rec "
       before the identifier, and similarly write " =" to get an
       unbreakable space after the identifier. In contrast, the space
       after the = sign is certainly a break hint, since breaking the line
       after = is a usual (and elegant) way to indent the expression part
       of a definition. In short, it is often necessary to print
       unbreakable spaces; however, most of the time a space should be
       considered a break hint.

       - Do not try to force new lines; let the pretty-printer do it for
       you: that's its only job. In particular, do not use force_newline:
       this procedure does lead to a new line, but it also has the
       unfortunate side effect of partially reinitialising the
       pretty-printing engine, so that the rest of the printing material
       is noticeably messed up.

       - Never put newline characters directly in the strings to be
       printed: the pretty-printing engine will treat this newline
       character as any other character written on the current line, and
       this will completely mess up the output. Instead of newline
       characters, use line break hints: if those break hints must always
       result in new lines, it just means that the surrounding box must be
       a vertical box!

       - End your main program with a print_newline () call, which flushes
       the pretty-printer tables (hence the output). (Note that the
       top-level loop of the interactive system does it as well, just
       before a new input.)
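       The advice about unbreakable spaces can be sketched as follows for
       the let rec case. The printer name pp_let_rec, the helper
       to_string, and the idea of passing the expression as an already
       rendered string are simplifying assumptions made for this
       illustration; a real printer would call a routine for expressions
       instead.

```ocaml
(* Print "let rec <name> = <body>": the spaces inside "let rec " and
   before "=" are unbreakable because they are part of the strings; the
   space after "=" is a break hint. [pp_let_rec] is a hypothetical name. *)
let pp_let_rec ppf (name, body) =
  Format.pp_open_hovbox ppf 2;
  Format.pp_print_string ppf "let rec ";
  Format.pp_print_string ppf name;
  Format.pp_print_string ppf " =";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf body;
  Format.pp_close_box ppf ()

(* Run the printer on a buffer-backed formatter with the given margin. *)
let to_string ~margin def =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  pp_let_rec ppf def;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let wide = to_string ~margin:78 ("fact", "expression_body")
let narrow = to_string ~margin:20 ("fact", "expression_body")

let () =
  print_endline wide;
  print_endline narrow
```

       With a margin of 78 the definition stays on one line; with a margin
       of 20 the only place where the line can break is after the = sign,
       and the expression is then indented by the 2 columns of the box.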
   Printing to stdout: using printf
       The Format module provides a general printing facility "a la"
       printf. In addition to the usual conversion facility provided by
       printf, you can write pretty-printing indications directly inside
       the format string (opening and closing boxes, indicating breaking
       hints, etc).

       Pretty-printing annotations are introduced by the @ symbol,
       directly in the format string. Almost any function of the Format
       module can be called from within a printf format string. For
       instance

       - "@[" opens a box (open_box 0). You may specify the box type as an
       extra argument. For instance @[<hov n> is equivalent to
       open_hovbox n.

       - "@]" closes a box (close_box ()).

       - "@ " outputs a breakable space (print_space ()).

       - "@," outputs a break hint (print_cut ()).

       - "@;<n m>" emits a "full" break hint (print_break n m).

       - "@." ends the pretty-printing, closing all the boxes still opened
       (print_newline ()).

       For instance

       printf "@[<1>%s@ =@ %d@ %s@]@." "Prix TTC" 100 "Euros";;
       Prix TTC = 100 Euros
       - : unit = ()
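       To make the correspondence between annotations and procedures
       concrete, the sketch below prints the same material twice: once
       with a format string and once with the explicit calls listed
       above, in their formatter-explicit pp_* form. It uses
       Format.asprintf and a buffer-backed formatter so that both results
       can be compared as strings; the names with_annotations and
       with_calls are made up, and the final "@." is left out so that no
       newline is appended.

```ocaml
(* The same material printed with @ annotations in a format string... *)
let with_annotations : string =
  Format.asprintf "@[<hov 1>%s@ =@ %d@ %s@]" "Prix TTC" 100 "Euros"

(* ...and with the corresponding explicit Format calls. *)
let with_calls : string =
  let buf = Buffer.create 32 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_open_hovbox ppf 1;           (* "@[<hov 1>" *)
  Format.pp_print_string ppf "Prix TTC";
  Format.pp_print_space ppf ();          (* "@ " *)
  Format.pp_print_string ppf "=";
  Format.pp_print_space ppf ();
  Format.pp_print_int ppf 100;
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "Euros";
  Format.pp_close_box ppf ();            (* "@]" *)
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () = print_endline with_annotations
```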
   A concrete example
       Let me give a full example: the shortest non-trivial example you
       could imagine, that is the lambda calculus :)

       Thus the problem is to pretty-print the values of a concrete data
       type that models a language of expressions that defines functions
       and their applications to arguments.

       First, I give the abstract syntax of lambda-terms:

       type lambda =
         | Lambda of string * lambda
         | Var of string
         | Apply of lambda * lambda
       ;;

       I use the Format library to print the lambda-terms:

       open Format;;
       let ident = print_string;;
       let kwd = print_string;;
       val ident : string -> unit = <fun>
       val kwd : string -> unit = <fun>

       let rec print_exp0 = function
         | Var s -> ident s
         | lam ->
             open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box ()
       and print_app = function
         | e -> open_hovbox 2; print_other_applications e; close_box ()
       and print_other_applications f =
         match f with
         | Apply (f, arg) -> print_app f; print_space (); print_exp0 arg
         | f -> print_exp0 f
       and print_lambda = function
         | Lambda (s, lam) ->
             open_hovbox 1;
             kwd "\\"; ident s; kwd "."; print_space (); print_lambda lam;
             close_box ()
         | e -> print_app e;;
       val print_exp0 : lambda -> unit = <fun>
       val print_app : lambda -> unit = <fun>
       val print_other_applications : lambda -> unit = <fun>
       val print_lambda : lambda -> unit = <fun>
   Most general pretty-printing: using fprintf
       We use the fprintf function to write the most versatile version of
       the pretty-printing functions for lambda-terms. Now, the functions
       get an extra argument, namely a pretty-printing formatter (the ppf
       argument) where printing will occur. This way the printing routines
       are more general, since they can print on any formatter defined in
       the program (either printing to a file, or to stdout, to stderr, or
       even to a string). Furthermore, the pretty-printing functions are
       now compositional, since they may be used in conjunction with the
       special %a conversion, which prints an fprintf argument with a
       user-supplied function (these user-supplied functions also take a
       formatter as first argument).

       Using fprintf, the lambda-terms printing routines can be written as
       follows:

       open Format;;
       let ident ppf s = fprintf ppf "%s" s;;
       let kwd ppf s = fprintf ppf "%s" s;;
       val ident : Format.formatter -> string -> unit
       val kwd : Format.formatter -> string -> unit

       let rec pr_exp0 ppf = function
         | Var s -> fprintf ppf "%a" ident s
         | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam
       and pr_app ppf = function
         | e -> fprintf ppf "@[<2>%a@]" pr_other_applications e
       and pr_other_applications ppf f =
         match f with
         | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
         | f -> pr_exp0 ppf f
       and pr_lambda ppf = function
         | Lambda (s, lam) ->
             fprintf ppf "@[<1>%a%a%a@ %a@]"
               kwd "\\" ident s kwd "." pr_lambda lam
         | e -> pr_app ppf e
       ;;
       val pr_exp0 : Format.formatter -> lambda -> unit
       val pr_app : Format.formatter -> lambda -> unit
       val pr_other_applications : Format.formatter -> lambda -> unit
       val pr_lambda : Format.formatter -> lambda -> unit

       Given those general printing routines, printing to stdout or stderr
       is just a matter of partial application:

       let print_lambda = pr_lambda std_formatter;;
       let eprint_lambda = pr_lambda err_formatter;;
       val print_lambda : lambda -> unit
       val eprint_lambda : lambda -> unit
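       Printing "even to a string" can be sketched with Format.asprintf,
       which runs a %a printer on an internal buffer and returns its
       contents. The block below repeats the type and the routines of this
       section so that it is self-contained; the names string_of_lambda
       and identity_app are made up for the illustration.

```ocaml
type lambda =
  | Lambda of string * lambda
  | Var of string
  | Apply of lambda * lambda

open Format

let ident ppf s = fprintf ppf "%s" s
let kwd ppf s = fprintf ppf "%s" s

(* Same printing routines as in the text above. *)
let rec pr_exp0 ppf = function
  | Var s -> fprintf ppf "%a" ident s
  | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam
and pr_app ppf = function
  | e -> fprintf ppf "@[<2>%a@]" pr_other_applications e
and pr_other_applications ppf f =
  match f with
  | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
  | f -> pr_exp0 ppf f
and pr_lambda ppf = function
  | Lambda (s, lam) ->
      fprintf ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam
  | e -> pr_app ppf e

(* Printing to a string: [asprintf] applies the printer to a formatter
   that writes into an internal buffer. [string_of_lambda] is a
   hypothetical name. *)
let string_of_lambda (t : lambda) : string = asprintf "%a" pr_lambda t

(* The identity function applied to the variable y. *)
let identity_app = Apply (Lambda ("x", Var "x"), Var "y")

let () = print_endline (string_of_lambda identity_app)
```

       The same pr_lambda can be passed unchanged to fprintf, printf or
       asprintf through %a, which is what makes the routines
       compositional.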
OCamldoc                          2022-02-04                Format_tutorial(3)