roff(7)                Miscellaneous Information Manual                roff(7)

Name

       roff - concepts and history of roff typesetting

Description

9       The  term roff denotes a family of document formatting systems known by
10       names like troff, nroff, and ditroff.  A roff system consists of an in‐
11       terpreter  for an extensible text formatting language and a set of pro‐
12       grams for preparing output for various devices and file formats.  Unix-
13       like  operating  systems  often  distribute  a roff system.  The manual
14       pages on Unix systems (“man pages”) and bestselling books  on  software
15       engineering,  including Brian Kernighan and Dennis Ritchie's The C Pro‐
16       gramming Language and W. Richard Stevens's Advanced Programming in  the
       Unix Environment, have been written using roff systems.  GNU roff
       (groff) is arguably the most widespread roff implementation.
19
20       Below we present typographical concepts that form the background of all
21       roff implementations, narrate the development history of some roff sys‐
22       tems, detail the command pipeline managed by groff(1), survey the  for‐
23       matting  language,  suggest  tips for editing roff input, and recommend
24       further reading materials.
25

Concepts

27       roff input files contain text interspersed with instructions to control
28       the  formatter.   Even in the absence of such instructions, a roff for‐
29       matter still processes its input in several ways, by filling, hyphenat‐
30       ing,  breaking,  and adjusting it, and supplementing it with inter-sen‐
31       tence space.  These processes are basic to typesetting, and can be con‐
32       trolled at the input document's discretion.
33
34       When a device-independent roff formatter starts up, it obtains informa‐
35       tion about the device for which it is preparing output  from  the  lat‐
36       ter's  description  file (see groff_font(5)).  An essential property is
37       the length of the output line, such as “6.5 inches”.
38
39       The formatter interprets plain text files employing the Unix  line-end‐
40       ing convention.  It reads input a character at a time, collecting words
41       as it goes, and fits as many words together on an  output  line  as  it
42       can—this is known as filling.  To a roff system, a word is any sequence
43       of one or more characters that aren't spaces or newlines.   The  excep‐
44       tions separate words.
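
       As a rough illustration of filling, consider the input below.  It is
       only a sketch: the 30-en line length is an arbitrary choice, set with
       the ll request, so that the effect is easy to see.  The formatter
       disregards where the input lines end and packs the words into output
       lines no longer than the configured length.

```roff
.\" Set a deliberately short line length so filling is easy to see.
.ll 30n
These input lines
are broken at arbitrary places, but
the formatter collects their words and fits as many
as it can
on each output line.
```

       Formatting this for a terminal device should produce a paragraph whose
       lines each approach, but do not exceed, 30 character cells.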
45
46       A  roff  formatter attempts to detect boundaries between sentences, and
47       supplies additional inter-sentence space between them.  It  flags  cer‐
48       tain  characters  (normally  “!”, “?”, and “.”) as potentially ending a
49       sentence.  When the formatter encounters one of  these  end-of-sentence
50       characters  at  the end of an input line, or one of them is followed by
51       two (unescaped) spaces on the same input line, it appends an inter-word
52       space  followed  by  an  inter-sentence space in the output.  The dummy
53       character escape sequence \& can be used after an end-of-sentence char‐
54       acter  to  defeat  end-of-sentence  detection  on a per-instance basis.
55       Normally, the occurrence of a visible non-end-of-sentence character (as
56       opposed to a space or tab) immediately after an end-of-sentence charac‐
57       ter cancels detection of the end of a sentence.  However, several char‐
58       acters are treated transparently after the occurrence of an end-of-sen‐
59       tence character.  That is, a roff does not cancel  end-of-sentence  de‐
60       tection  when  it  processes them.  This is because such characters are
61       often used as footnote markers or to close quotations  and  parentheti‐
62       cals.   The  default  set  is  ",  ', ), ], *, \[dg], \[dd], \[rq], and
63       \[cq].  The last four are examples of special  characters,  escape  se‐
64       quences  whose purpose is to obtain glyphs that are not easily typed at
65       the keyboard, or which have special meaning to the formatter (like \).
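
       The rules above can be exercised with a few text lines.  This is a
       minimal sketch; the sentences are invented for illustration.

```roff
.\" The period ending this input line is taken as the end of a sentence.
We shipped the fix.
.\" \& after the period defeats detection for this abbreviation.
It was tested on Jan.\&
14 by two reviewers.
.\" Two spaces after "?" on the same input line mark a sentence end.
Did it pass?  It did.
.\" The closing quotation mark and parenthesis are transparent,
.\" so the period before them still ends the sentence.
(She said, "It works.")
Everyone went home.
```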
66
67       When an output line is nearly full, it is uncommon for  the  next  word
68       collected  from  the  input to exactly fill it—typically, there is room
69       left over only for part of the next word.  The process of  splitting  a
70       word  so  that it appears partially on one line (with a hyphen to indi‐
71       cate to the reader that the word has been broken) with its remainder on
72       the next is hyphenation.  Hyphenation points can be manually specified;
73       groff also uses a hyphenation algorithm and  language-specific  pattern
74       files  to  decide which words can be hyphenated and where.  Hyphenation
75       does not always occur even when the hyphenation rules for a word  allow
76       it; it can be disabled, and when not disabled there are several parame‐
77       ters that can prevent it in certain circumstances.
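
       The following sketch shows some common ways of steering hyphenation.
       The hw request and the \% escape sequence are standard troff features
       that this page does not otherwise describe; the example words are
       arbitrary.

```roff
.\" Supply explicit hyphenation points for a word with the hw request.
.hw type-set-ter
.\" \% inside a word marks a permitted hyphenation point.
An un\%be\%liev\%ably long word may be broken at the marked points.
.\" \% at the start of a word prevents its hyphenation.
Never break \%groff across lines.
.\" Disable automatic hyphenation, then enable it again.
.nh
This text is set without automatic hyphenation.
.hy
This text may be hyphenated again.
```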
78
79       Once an output line is full, the next word (or remainder of  a  hyphen‐
80       ated one) is placed on a different output line; this is called a break.
81       In this document and in roff discussions generally, a  “break”  if  not
82       further  qualified  always refers to the termination of an output line.
83       When the formatter is filling text, it introduces breaks  automatically
84       to  keep output lines from exceeding the configured line length.  After
85       an automatic break, a roff formatter adjusts  the  line  if  applicable
86       (see  below),  and then resumes collecting and filling text on the next
87       output line.
88
89       Sometimes, a line cannot be broken automatically.   This  usually  does
90       not happen with natural language text unless the output line length has
91       been manipulated to be extremely short, but  it  can  with  specialized
92       text  like  program source code.  groff provides a means of telling the
93       formatter where the line may be broken without hyphens.  This  is  done
94       with the non-printing break point escape sequence \:.
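
       For instance, a long file name offers the formatter no natural place to
       break.  In the sketch below, which assumes groff and an arbitrarily
       short line length, \: marks the places where a break is acceptable.

```roff
.ll 25n
.\" Each \: permits a break at that point without adding a hyphen.
The file /usr/\:share/\:groff/\:site-tmac/\:man.local is read.
```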
95
96       There  are  several ways to cause a break at a predictable location.  A
97       blank input line not only causes a break, but by default it  also  out‐
98       puts  a  one-line  vertical  space  (effectively  a blank output line).
99       Macro packages may discourage or disable this “blank  line  method”  of
100       paragraphing in favor of their own macros.  A line that begins with one
101       or more spaces causes a break.  The spaces are output at the  beginning
102       of  the  next  line  without  being adjusted (see below).  Again, macro
103       packages may provide other methods of  producing  indented  paragraphs.
104       Trailing  spaces  on  text lines (see below) are discarded.  The end of
105       input causes a break.
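
       The breaking methods just described can be compared side by side.  The
       sketch below also uses the br request, which breaks the line without
       adding vertical space.

```roff
These two input lines
are filled together.

A blank line precedes this text, so it starts after a break
and one line of vertical space.
   Three leading spaces force a break and indent this line.
Filling then resumes with the following words.
.br
The br request above breaks the line without adding space.
```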
106
107       After the formatter performs an automatic break, it may then adjust the
108       line,  widening inter-word spaces until the text reaches the right mar‐
109       gin.  Extra spaces between words are preserved.  Leading  and  trailing
110       spaces  are handled as noted above.  Text can be aligned to the left or
111       right margin only, or centered, using requests.
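
       Alignment is selected with the ad request.  The sketch below assumes
       its conventional arguments (l, r, c, and b); each change is preceded by
       a br request so that the pending output line is flushed under the old
       setting first.

```roff
.ad l
This text is aligned to the left margin only,
leaving the right margin ragged.
.br
.ad r
This text is aligned to the right margin only.
.br
.ad c
This text is centered.
.br
.ad b
This text is adjusted to both margins
by widening its inter-word spaces.
```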
112
113       A roff formatter translates horizontal tab characters, also called sim‐
114       ply  “tabs”,  in  the input into movements to the next tab stop.  These
115       tab stops are by default located every half inch measured from the cur‐
116       rent position on the input line.  With them, simple tables can be made.
117       However, this method can be deceptive, as the appearance (and width) of
118       the  text  in  an  editor  and  the results from the formatter can vary
119       greatly, particularly when proportional  typefaces  are  used.   A  tab
120       character does not cause a break and therefore does not interrupt fill‐
121       ing.  The formatter provides facilities for sophisticated table  compo‐
122       sition;  there  are  many  details  to  track  when using the “tab” and
123       “field” low-level features, so most users turn to the tbl(1) preproces‐
124       sor to lay out tables.
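
       A simple tab-based table might look like the sketch below.  It uses the
       \t escape sequence in place of literal tab characters purely so that
       the tabs are visible here; real documents would more typically contain
       the tab characters themselves.  The tab stop positions are arbitrary.

```roff
.\" Disable filling so that each input line becomes one table row.
.nf
.\" Set tab stops 1.5 inches and 3 inches from the left.
.ta 1.5i 3i
Request\tPurpose\tOrigin
\&.ll\tline length\tRUNOFF
\&.ta\ttab stops\tMcIlroy's roff
.\" Resume filling.
.fi
```

       The rows that begin with a dot are prefixed with \& so that they are
       read as text lines rather than control lines.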
125
126   Requests and macros
127       A  request  is an instruction to the formatter that occurs after a con‐
128       trol character, which is recognized at the beginning of an input  line.
129       The  regular  control character is a dot “.”.  Its counterpart, the no-
130       break control character, a neutral apostrophe “'”, suppresses the break
131       implied  by  some requests.  These characters were chosen because it is
132       uncommon for lines of text in natural languages to begin with them.  If
133       you  require a formatted period or apostrophe (closing single quotation
134       mark) where the formatter is expecting a control character, prefix  the
       dot or neutral apostrophe with the dummy character escape sequence,
       “\&”.
137
138       An input line beginning with a control character is  called  a  control
139       line.  Every line of input that is not a control line is a text line.
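
       The distinction can be seen in a short sketch.  The sp and bp requests
       used here (space and begin page) are the ones listed in the History
       section.

```roff
.\" A control line: the sp request outputs one line of vertical space.
.sp
.\" Text lines that would otherwise be mistaken for control lines.
\&.profile is read by login shells.
\&'Tis a text line, too.
.\" The no-break control character: begin a new page without first
.\" breaking the partially collected output line.
'bp
```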
140
141       Requests  often  take arguments, words (separated from the request name
142       and each other by spaces) that specify details of the action  the  for‐
143       matter is expected to perform.  If a request is meaningless without ar‐
144       guments, it is typically ignored.  Of key importance are  the  requests
145       that define macros.  Macros are invoked like requests, enabling the re‐
146       quest repertoire to be extended or overridden.
147
148       A macro can be thought of as an abbreviation you can define for a  col‐
149       lection  of control and text lines.  When the macro is called by giving
150       its name after a control character, it is replaced with what it  stands
151       for.   The  process  of  textual replacement is known as interpolation.
152       Interpolations are handled as soon as they  are  recognized,  and  once
153       performed, a roff formatter scans the replacement for further requests,
154       macro calls, and escape sequences.
155
156       In roff systems, the “de” request defines a macro.
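
       As a minimal sketch, the following defines and then calls a macro.  The
       name Nt is hypothetical and chosen only for illustration; \\$1 stands
       for the first argument given in the call.

```roff
.\" Define a hypothetical macro "Nt" that sets a labeled, indented note.
.de Nt
.sp
.ti +3n
Note: \\$1
.br
..
.\" Call it like a request; the quoted text is its first argument.
.Nt "Back up your files first."
```

       The definition ends at the line containing only two dots.  When Nt is
       called, its body is interpolated and scanned as if it had appeared in
       the input at that point.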
157
158   Page geometry
159       roff systems format text under certain assumptions about  the  size  of
160       the  output  medium,  or  page.  For the formatter to correctly break a
161       line it is filling, it must know the line length, which it derives from
162       the  page  width.   For it to decide whether to write an output line to
163       the current page or wait until the next one,  it  must  know  the  page
164       length.   A device's resolution converts practical units like inches or
165       centimeters to basic units, a convenient length measure for the  output
166       device or file format.  The formatter and output driver use basic units
167       to reckon page measurements.  The device description file  defines  its
168       resolution and page dimensions (see groff_font(5)).
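
       The conversion to basic units can be observed directly.  In this
       sketch, the register names In and Cm are arbitrary; nr stores a value
       in a register and tm writes a message to the standard error stream.

```roff
.\" Store one inch and 2.54 centimeters; both are converted to
.\" basic units using the output device's resolution.
.nr In 1i
.nr Cm 2.54c
.\" Report the stored values.
.tm one inch = \n(In basic units; 2.54 cm = \n(Cm basic units
```

       Because 2.54 centimeters is one inch, the two values should agree.  On
       a device whose resolution is 240 units per inch, for example, both
       would be 240.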
169
170       A  page is a two-dimensional structure upon which a roff system imposes
171       a rectangular coordinate system with its upper left corner as the  ori‐
172       gin.  Coordinate values are in basic units and increase down and to the
173       right.  Useful ones are therefore always positive  and  within  numeric
174       ranges corresponding to the page boundaries.
175
176       While  the  formatter (and, later, output driver) is processing a page,
177       it keeps track of its drawing position, which is the location at  which
178       the next glyph will be written, from which the next motion will be mea‐
179       sured, or where a geometric object will  commence  rendering.   Notion‐
180       ally,  glyphs are drawn from the text baseline upward and to the right.
181       (groff does not yet support right-to-left scripts.)  The text  baseline
182       is  a  (usually invisible) line upon which the glyphs of a typeface are
183       aligned.  A glyph therefore “starts” at  its  bottom-left  corner.   If
184       drawn  at  the  origin,  a  typical letter glyph would lie partially or
185       wholly off the page, depending on whether, like “g”, it features a  de‐
186       scender below the baseline.
187
188       Such  a situation is nearly always undesirable.  It is furthermore con‐
189       ventional not to write or draw  at  the  extreme  edges  of  the  page.
190       Therefore  the  initial  drawing position of a roff formatter is not at
191       the origin, but below and to the right of  it.   This  rightward  shift
192       from the left edge is known as the page offset.  (groff's terminal out‐
193       put devices have page offsets of zero.)  The downward shift leaves room
194       for a text output line.
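
       A document or macro package typically sets the page offset and line
       length together.  The sketch below assumes paper 8.5 inches wide, which
       the source text does not specify.

```roff
.\" Shift the text one inch from the left edge of the page.
.po 1i
.\" A 6.5-inch line then leaves a one-inch right margin:
.\" 8.5i - 1i - 6.5i = 1i.
.ll 6.5i
Body text starts here.
```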
195
196       Text  is  arranged  on a one-dimensional lattice of text baselines from
197       the top to the bottom of the page.  Vertical spacing  is  the  distance
198       between adjacent text baselines.  Typographic tradition sets this quan‐
199       tity to 120% of the type size.  The initial vertical  drawing  position
200       is  one unit of vertical spacing below the page top.  Typographers term
201       this unit a vee.
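
       As a worked instance of the 120% convention, 10-point type calls for
       10 × 1.2 = 12 points of vertical spacing.  The ps and vs requests used
       in this sketch are the standard troff means of setting the two values;
       terminal devices generally ignore them.

```roff
.\" 10-point type with 12-point vertical spacing (120% of 10).
.ps 10
.vs 12p
This paragraph is set in 10-point type on 12-point baselines.
```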
202
203       Vertical spacing has an impact on page-breaking decisions.   Generally,
204       when  a  break  occurs, the formatter moves the drawing position to the
205       next text baseline automatically.  If the formatter were already  writ‐
206       ing  to  the last line that would fit on the page, advancing by one vee
207       would place the next text baseline off the page.  Rather than let  that
208       happen,  roff  formatters instruct the output driver to eject the page,
209       start a new one, and again set the drawing position to  one  vee  below
210       the page top; this is a page break.
211
212       When  the  last  line of input text corresponds to the last output line
213       that fits on the page, the break caused by the end of input  will  also
214       break  the  page,  producing  a useless blank one.  Macro packages keep
215       users from having to confront this difficulty by setting “traps”; more‐
216       over,  all but the simplest page layouts tend to have headers and foot‐
217       ers, or at least bear vertical margins larger than one vee.
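
       A macro package might plant such traps along the following lines.  This
       is a sketch, not any particular package's implementation; the macro
       names Hd and Ft and the distances are hypothetical.

```roff
.\" Hypothetical header macro: space down from the page top,
.\" print a centered page number, and space down again.
.de Hd
'sp 0.5i
.tl ''- % -''
'sp 0.5i
..
.\" Hypothetical footer macro: begin a new page.
.de Ft
'bp
..
.\" Spring Hd at the top of every page and Ft one inch above
.\" the page bottom.
.wh 0 Hd
.wh -1i Ft
```

       The no-break control character is used inside the macros so that a
       partially collected output line is carried over to the next page
       instead of being flushed in the middle of the header or footer.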
218
219   Other language elements
220       Escape sequences start with the escape character, a  backslash  \,  and
221       are  followed  by  at  least one additional character.  They can appear
222       anywhere in the input.
223
224       With requests, the escape and control characters can be  changed;  fur‐
225       ther, escape sequence recognition can be turned off and back on.
226
227       Strings store character sequences.  In groff, they can be parameterized
228       as macros can.
229
230       Registers store numerical values, including measurements.   The  latter
231       are  generally in basic units; scaling units can be appended to numeric
232       expressions to clarify their meaning when stored or interpolated.  Some
233       read-only predefined registers interpolate text.
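
       A short sketch of strings and registers follows.  The names Pr, Ch, and
       Wd are arbitrary; the two-character interpolation forms \*(xx and \n(xx
       are used because they are understood by AT&T troff as well as groff.

```roff
.\" Define a string and a register.
.ds Pr The Example Project
.nr Ch 3
.\" Interpolate them with \* and \n.
\*(Pr manual, chapter \n(Ch.
.\" Increment Ch by one.
.nr Ch +1
.\" Store a two-inch measurement in Wd; it is kept in basic units.
.nr Wd 2i
```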
234
235       Fonts are identified either by a name or by a mounting position (a non-
236       negative number).  Four styles are available on all devices.  R is “ro‐
237       man”:  normal,  upright  text.   B  is bold, an upright typeface with a
238       heavier weight.  I is italic, a face that is oblique on typesetter out‐
239       put  devices and usually underlined instead on terminal devices.  BI is
240       bold-italic, combining both of the foregoing style  variations.   Type‐
241       setting  devices  group  these four styles into families of text fonts;
242       they also typically offer one or more special fonts  that  provide  un‐
243       styled glyphs; see groff_char(7).
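
       The four styles can be selected inline with the \f escape sequence or
       with the ft request, as in this sketch.

```roff
.\" Select styles inline with \f; \fP returns to the previous font.
Press \fBEnter\fP to accept the \fIdefault\fP value.
.\" Two-character font names take the form \f(xx.
\f(BIWarning:\fP this cannot be undone.
.\" The ft request changes the font until further notice.
.ft B
This sentence is bold.
.ft R
This sentence is roman again.
```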
244
245       groff  supports named colors for glyph rendering and drawing of geomet‐
246       ric objects.  Stroke and fill colors are distinct; the stroke color  is
247       used for glyphs.
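
       In groff, a color can be defined with the defcolor request and selected
       as the stroke color with the \m escape sequence; \M selects the fill
       color in the same way.  The color name in this sketch is arbitrary.

```roff
.\" Define a named color from red, green, and blue components.
.defcolor rust rgb #b7410e
.\" \m selects the stroke color used for glyphs; \m[] restores
.\" the previous one.
\m[rust]Caution:\m[] check the cable before powering on.
```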
248
249       Glyphs  are  visual  representation forms of characters.  In groff, the
250       distinction between those two elements is not  always  obvious  (and  a
251       full  discussion  is  beyond  our scope).  In brief, “A” is a character
252       when we consider it in the abstract: to make it a glyph, we must select
253       a  typeface  with  which  to render it, and determine its type size and
254       color.  The formatting  process  turns  input  characters  into  output
255       glyphs.   A  few characters commonly seen on keyboards are treated spe‐
256       cially by the roff language and may not look correct in output if  used
257       unthinkingly;  they  are  the  (double) quotation mark ("), the neutral
258       apostrophe ('), the minus sign (-), the backslash  (\),  the  caret  or
259       circumflex accent (^), the grave accent (`), and the tilde (~).  All of
260       these and more can be produced with special character escape sequences;
261       see groff_char(7).
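
       The sketch below shows special character escape sequences for several
       of the characters just listed, using glyph names documented in
       groff_char(7).  The sentences themselves are invented.

```roff
.\" \- is a minus sign, \[dq] a double quotation mark, \[aq] a
.\" neutral apostrophe, \[rs] a backslash, \[ha] a caret, \[ga] a
.\" grave accent, and \[ti] a tilde.  \[oq] and \[cq] are opening
.\" and closing single quotation marks.
Run \[oq]grep \-i \[dq]don\[aq]t\[dq] \[ti]/notes.txt\[cq] to search.
A Windows path looks like C:\[rs]Temp; a regex may begin with \[ha].
Older shell scripts write \[ga]date\[ga] for command substitution.
```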
262
263       groff  offers streams, identifiers for writable files, but for security
264       reasons this feature is disabled by default.
265
266       A further few language elements arise as page layouts become  more  so‐
267       phisticated  and demanding.  Environments collect formatting parameters
268       like line length and typeface.  A diversion stores formatted output for
269       later  use.  A trap is a condition on the input or output, tested auto‐
270       matically by the formatter, that is associated with a macro, calling it
271       when that condition is fulfilled.
272
273       Footnote  support  often exercises all three of the foregoing features.
274       A simple implementation might work as follows.  A pair of macros is de‐
275       fined:  one  starts a footnote and the other ends it.  The author calls
276       the first macro where a footnote marker is desired.  The  macro  estab‐
277       lishes  a diversion so that the footnote text is collected at the place
       in the body text where its corresponding marker appears.  An environment
       is created for the footnote so that it is set in a smaller type size.
       The footnote text is formatted in the diversion using that environment,
       but it does not yet appear in the output.  The document author
282       calls the footnote end macro, which returns to the previous environment
283       and  ends the diversion.  Later, after much more body text in the docu‐
284       ment, a trap, set a small distance above the page  bottom,  is  sprung.
285       The macro called by the trap draws a line across the page and emits the
286       stored diversion.  Thus, the footnote is rendered.
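
       The scheme just described might be sketched as follows.  The macro
       names Fs, Fe, and Fx, the diversion name Fn, and all sizes and
       distances are hypothetical.  The sketch ignores details that a real
       macro package must handle, such as several footnotes on one page,
       pages without footnotes, and footnotes too long to fit below the trap.

```roff
.\" Fs: hypothetical "footnote start" macro.
.de Fs
.ev 1           \" switch to a separate environment
.ps 8           \" smaller type for the footnote
.vs 10p
.di Fn          \" divert formatted output into Fn
..
.\" Fe: hypothetical "footnote end" macro.
.de Fe
.br             \" flush the last footnote line into the diversion
.di             \" end the diversion
.ev             \" return to the previous environment
..
.\" Fx: called by the trap; emit the stored footnote, then
.\" begin a new page.
.de Fx
.ev 1
.nf             \" the diverted text is already formatted
\l'1i'
.Fn             \" emit the stored footnote
.fi
.rm Fn
.ev
'bp
..
.\" Spring Fx one inch above the page bottom.
.wh -1i Fx
Body text with a marker.*
.Fs
* The text of the footnote.
.Fe
More body text follows.
```

       Here \l'1i' draws a short rule rather than one spanning the whole line,
       which keeps the sketch simple.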
287

History

289       Computer-driven document formatting dates back to the 1960s.  The  roff
290       system  is intimately connected with Unix, but its origins lie with the
291       earlier operating systems CTSS, GECOS, and Multics.
292
293   The predecessor—RUNOFF
294       roff's ancestor RUNOFF was written in the MAD language by Jerry Saltzer
295       to  prepare  his  Ph.D.  thesis  on  the Compatible Time Sharing System
296       (CTSS), a project of the Massachusetts Institute of  Technology  (MIT).
297       This  program  is  referred to in full capitals, both to distinguish it
298       from its many descendants, and because bits  were  expensive  in  those
299       days;  five-  and  six-bit character encodings were still in widespread
300       usage, and mixed-case alphabetics in  file  names  seen  as  a  luxury.
301       RUNOFF introduced a syntax of inlining formatting directives amid docu‐
302       ment text, by beginning a line with a period (an unlikely occurrence in
303       human-readable  material)  followed by a “control word”.  Control words
304       with obvious meaning like “.line length n” were supported as well as an
305       abbreviation system; the latter came to overwhelm the former in popular
306       usage and later derivatives of the program.  A sample of control  words
       from a RUNOFF manual of December 1966
       ⟨http://web.mit.edu/Saltzer/www/publications/ctss/AH.9.01.html⟩ was
       documented as follows (with the parameter notation slightly altered).
       The abbreviations will be familiar
310       to roff veterans.
311
312                           Abbreviation   Control word
313                                    .ad   .adjust
314                                    .bp   .begin page
315                                    .br   .break
316                                    .ce   .center
317                                    .in   .indent n
318                                    .ll   .line length n
319                                    .nf   .nofill
320                                    .pl   .paper length n
321                                    .sp   .space [n]
322
323       In 1965, MIT's Project MAC teamed with Bell Telephone Laboratories  and
       General Electric (GE) to inaugurate the Multics
       ⟨http://www.multicians.org⟩ project.  After a few years, Bell Labs
       discontinued its participation in Multics, famously prompting the
       development of Unix.  Meanwhile, Saltzer's RUNOFF proved influential,
       seeing many ports and derivations elsewhere.
329
330       In  1969,  Doug  McIlroy wrote one such reimplementation, adding exten‐
331       sions, in the BCPL language for a GE 645 running GECOS at the Bell Labs
332       location  in  Murray Hill, New Jersey.  In its manual, the control com‐
333       mands were termed “requests”, their two-letter  names  were  canonical,
334       and  the  control character was configurable with a .cc request.  Other
       familiar requests emerged at this time: no-adjust (.na), need (.ne),
336       page  offset  (.po),  tab  configuration (.ta, though it worked differ‐
337       ently), temporary indent (.ti), character translation (.tr), and  auto‐
338       matic  underlining  (.ul; on RUNOFF you had to backspace and underscore
339       in the input yourself).  .fi to enable filling of output lines got  the
340       name it retains to this day.  McIlroy's program also featured a heuris‐
341       tic system for automatically placing hyphenation points,  designed  and
342       implemented  by  Molly Wagner.  It furthermore introduced numeric vari‐
343       ables, termed registers.  By 1971, this program had been ported to Mul‐
344       tics and was known as roff, a name McIlroy attributes to Bob Morris, to
345       distinguish it from CTSS RUNOFF.
346
347   Unix and roff
348       McIlroy's roff was one of the first Unix programs.  In Ritchie's  term,
349       it  was  “transliterated”  from BCPL to DEC PDP-7 assembly language for
350       the fledgling Unix operating system.  Automatic hyphenation was managed
351       with  .hc  and  .hy requests, line spacing control was generalized with
352       the .ls request, and what later roffs would call diversions were avail‐
353       able  via  “footnote”  requests.  This roff indirectly funded operating
354       systems research at Murray Hill; AT&T prepared patent  applications  to
355       the U.S. government with it.  This arrangement enabled the group to ac‐
356       quire a PDP-11; roff promptly proved equal to the  task  of  formatting
357       the  manual  for what would become known as “First Edition Unix”, dated
358       November 1971.
359
360       Output from all of the foregoing programs was limited to line  printers
361       and  paper  terminals such as the IBM 2471 (based on the Selectric line
362       of typewriters) and the Teletype Corporation Model 37.   Proportionally
363       spaced type was unavailable.
364
365   New roff and Typesetter roff
366       The first years of Unix were spent in rapid evolution.  The practicali‐
367       ties of preparing standardized documents like patent applications  (and
368       Unix  manual  pages), combined with McIlroy's enthusiasm for macro lan‐
369       guages, perhaps created an irresistible pressure to make roff  extensi‐
370       ble.   Joe  Ossanna's nroff, literally a “new roff”, was the outlet for
371       this pressure.  By the time of Unix Version 3 (February 1973)—and still
372       in  PDP-11 assembly language—it sported a swath of features now consid‐
373       ered essential to roff systems: definition of macros  (.de),  diversion
374       of  text  thither (.di), and removal thereof (.rm); trap planting (.wh;
375       “when”) and relocation (.ch; “change”); conditional  processing  (.if);
376       and  environments  (.ev).  Incremental improvements included assignment
377       of the next page number (.pn); no-space mode (.ns) and  restoration  of
378       vertical  spacing  (.rs); the saving (.sv) and output (.os) of vertical
379       space; specification of replacement characters for tabs (.tc) and lead‐
380       ers  (.lc);  configuration  of  the  no-break  control character (.c2);
381       shorthand to disable automatic hyphenation  (.nh);  a  condensation  of
382       what  were  formerly  six  different requests for configuration of page
383       “titles” (headers and footers) into one (.tl) with a length  controlled
384       separately  from the line length (.lt); automatic line numbering (.nm);
385       interactive input (.rd), which necessitated buffer-flushing (.fl),  and
386       was made convenient with early program cessation (.ex); source file in‐
387       clusion in its modern form (.so; though RUNOFF had an “.append” control
388       word for a similar purpose) and early advance to the next file argument
389       (.nx); ignorable content (.ig); and programmable abort (.ab).
390
391       Third Edition Unix also brought the pipe(2) system call, the  explosive
392       growth  of a componentized system based around it, and a “filter model”
393       that remains perceptible today.  Equally  importantly,  the  Bell  Labs
394       site  in  Murray Hill acquired a Graphic Systems C/A/T phototypesetter,
395       and with it came the necessity of expanding the capabilities of a  roff
396       system  to  cope  with  a variety of proportionally spaced typefaces at
397       multiple sizes.  Ossanna wrote a parallel implementation of  nroff  for
398       the  C/A/T,  dubbing  it troff (for “typesetter roff”).  Unfortunately,
399       surviving documentation does not illustrate what requests  were  imple‐
400       mented  at this time for C/A/T support; the troff(1) man page in Fourth
401       Edition Unix (November 1973) does not feature a  request  list,  unlike
402       nroff(1).   Apart from typesetter-driven features, Unix Version 4 roffs
403       added string definitions (.ds); made the escape character  configurable
404       (.ec);  and enabled the user to write diagnostics to the standard error
405       stream (.tm).  Around 1974, empowered with multiple type  sizes,  ital‐
406       ics, and a symbol font specially commissioned by Bell Labs from Graphic
407       Systems, Kernighan and Lorinda Cherry implemented eqn  for  typesetting
408       mathematics.   In  the  same year, for Fifth Edition Unix, Ossanna com‐
409       bined and reimplemented the two roffs in C, using that language's  pre‐
410       processor to generate both from a single source tree.
411
412       Ossanna  documented  the  syntax of the input language to the nroff and
413       troff programs in the “Troff User's Manual”, first published  in  1976,
414       with  further  revisions  as  late as 1992 by Kernighan.  (The original
415       version was entitled “Nroff/Troff User's Manual”, which  may  partially
416       explain  why  roff practitioners have tended to refer to it by its AT&T
417       document identifier, “CSTR #54”.)  Its final revision serves as the  de
418       facto  specification  of AT&T troff, and all subsequent implementors of
419       roff systems have done so in its shadow.
420
421       A small and simple set of roff macros was first  used  for  the  manual
422       pages of Unix Version 4 and persisted for two further releases, but the
423       first macro package to be formally described and installed  was  ms  by
424       Michael  Lesk  in Version 6.  He also wrote a manual, “Typing Documents
425       on the Unix System”, describing ms and basic nroff/troff usage,  updat‐
426       ing it as the package accrued features.  Sixth Edition additionally saw
427       the debut of the tbl preprocessor for formatting tables, also by Lesk.
428
429       For Unix Version 7 (January 1979), McIlroy designed,  implemented,  and
430       documented  the  man  macro package, introducing most of the macros de‐
431       scribed in groff_man(7) today, and edited volume 1  of  the  Version  7
432       manual  using  it.   Documents  composed using ms featured in volume 2,
433       edited by Kernighan.
434
435       Meanwhile, troff proved popular even at Unix sites that lacked a  C/A/T
436       device.   Tom  Ferrin  of the University of California at San Francisco
437       combined it with  Allen  Hershey's  popular  vector  fonts  to  produce
438       vtroff, which translated troff's output to the command language used by
439       Versatec and Benson-Varian plotters.
440
441       Ossanna had passed away unexpectedly in 1977, and after the release  of
442       Version 7, with the C/A/T typesetter becoming supplanted by alternative
443       devices such as the Mergenthaler Linotron 202,  Kernighan  undertook  a
444       revision  and  rewrite of troff to generalize its design.  To implement
445       this revised architecture, he developed the font and device description
446       file  formats  and the page description language that remain in use to‐
447       day.  He described these novelties in the article  “A  Typesetter-inde‐
448       pendent TROFF”, last revised in 1982, and like the troff manual itself,
449       it is widely known by a shorthand, “CSTR #97”.
450
       Kernighan's innovations prepared troff well for the introduction of the
       Adobe PostScript language in 1984 and a vibrant market in laser printers
       with built-in interpreters for it.  An output driver for PostScript,
       dpost, was swiftly developed.  However, AT&T's software licensing
       practices kept Ossanna's troff, with its tight coupling to the C/A/T's
       capabilities, in parallel distribution with device-independent troff
       throughout the 1980s.  Today, however, all actively maintained troffs
       follow Kernighan's device-independent design.
459
460   groff—a free roff from GNU
461       The  most  important free roff project historically has been groff, the
462       GNU implementation of troff, developed by James Clark starting in  1989
463       and  distributed under copyleft ⟨http://www.gnu.org/copyleft⟩ licenses,
464       ensuring to all the availability of source code and the freedom to mod‐
465       ify  and  redistribute  it, properties unprecedented in roff systems to
466       that point.  groff rapidly attracted contributors, and has served as  a
467       replacement  for  almost all applications of AT&T troff (exceptions in‐
468       clude mv, a macro package for preparation of viewgraphs and slides, and
469       the  ideal preprocessor, which produces diagrams from mathematical con‐
470       straints).   Beyond  that,  it  has  added   numerous   features;   see
471       groff_diff(7).   Since  its  inception  and  for at least the following
472       three decades, it has been used by practically all  GNU/Linux  and  BSD
473       operating systems.
474
475       groff  continues to be developed, is available for almost all operating
476       systems in common use (along with several obscure ones), and  is  free.
477       These factors make groff the de facto roff standard today.
478
479   Other free roffs
480       In  2007,  Caldera/SCO  and Sun Microsystems, having acquired rights to
481       AT&T Documenter's Workbench (DWB) troff (a descendant of the Bell  Labs
482       code),  released  it  under  a free but GPL-incompatible license.  This
483       implementation ⟨https://github.com/n-t-roff/DWB3.3⟩ was  made  portable
484       to  modern POSIX systems, and adopted and enhanced first by Gunnar Rit‐
485       ter  and  then  Carsten  Kunze  to  produce  Heirloom  Doctools   troff
486https://github.com/n-t-roff/heirloom-doctools⟩.
487
488       In July 2013, Ali Gholami Rudi announced neatroff ⟨https://github.com/aligrudi/neatroff⟩,
489       a permissively licensed new implementation.
490
491       Another descendant of DWB troff is part  of  Plan  9  from  User  Space
492https://9fans.github.io/plan9port/⟩.   Since 2021, this troff has been
493       available under permissive terms.
494

Using roff

496       When you read a man page, often a roff is  the  program  rendering  it.
497       Some roff implementations provide wrapper programs that make it easy to
498       use the roff system from the shell's command line.  These can  be  spe‐
499       cific  to  a  macro package, like mmroff(1), or more general.  groff(1)
500       provides command-line options sparing the user  from  constructing  the
501       long, order-dependent pipelines familiar to AT&T troff users.  Further,
502       a heuristic program, grog(1), is available to infer from  a  document's
503       contents which groff arguments should be used to process it.
504
505   The roff pipeline
506       A typical roff document is prepared by running one or more preprocessors
507       in series, followed by a formatter program and then an output driver
508       (or  “device  postprocessor”).  Commonly, these programs are structured
509       into a pipeline; that is, each is run in sequence such that the  output
510       of  one is taken as the input to the next, without passing through sec‐
511       ondary storage.  (On non-Unix systems, pipelines may have to  be  simu‐
512       lated with temporary files.)
513
514              $ preproc1 < input-file | preproc2 | ... | troff [option] ... \
515                  | output-driver
516
517       Once  all preprocessors have run, they deliver pure roff language input
518       to the formatter, which in turn generates a document in a page descrip‐
519       tion  language that is then interpreted by a postprocessor for viewing,
520       printing, or further processing.
521
522       Each program interprets input in a language that is independent of  the
523       others;  some  are  purely descriptive, as with tbl(1) and roff output,
524       and some permit the definition of macros, as with eqn(1) and  roff  in‐
525       put.   Most roff input files employ the macros of a document formatting
526       package, intermixed with instructions for one  or  more  preprocessors,
527       and seasoned with escape sequences and requests from the roff language.
528       Some documents are simpler still, since their formatting packages  dis‐
529       courage direct use of roff requests; man pages are a prominent example.
530       Many features of the roff language are seldom needed by users; only au‐
531       thors of macro packages require a substantial command of them.
532
533   Preprocessors
534       A roff preprocessor is a program that, directly or ultimately, generates
535       output in the roff language.  Typically, each preprocessor defines a
536       language of its own and translates that language into input for roff or
537       another preprocessor.  As an example of the latter, chem produces pic
538       input.  Preprocessors must consequently be run in an appropriate order;
539       groff(1) handles this automatically for all preprocessors  supplied  by
540       the GNU roff system.
541
542       Portions  of the document written in preprocessor languages are usually
543       bracketed by tokens that look like roff macro calls.  roff preprocessor
544       programs  transform only the regions of the document intended for them.
545       When a preprocessor language is used by a document,  its  corresponding
546       program  must  process it before the input is seen by the formatter, or
547       incorrect rendering is almost guaranteed.
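
       As a sketch of such bracketing, the fragment below marks a small
       table for tbl between the .TS and .TE tokens; the text lines around
       it are not meant for tbl and should pass through it unchanged.  The
       table contents are illustrative only, and the fragment assumes that
       tbl runs before the formatter (for instance, via the -t option of
       groff(1)).

```roff
.\" Only the region between .TS and .TE is intended for tbl.
Some running text before the table.
.TS
tab(;);
l l.
Name;Role
eqn;equations
pic;diagrams
.TE
Running text resumes here.
```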
548
549       GNU roff provides several preprocessors, including eqn, grn, pic,  tbl,
550       refer, and soelim.  See groff(1) for a complete list.  Other preproces‐
551       sors for roff systems are known.
552
553              dformat   depicts data structures;
554              grap      constructs statistical charts; and
555              ideal     draws diagrams using a constraint-based language.
556
557   Formatter programs
558       A roff formatter transforms roff language input into a single file in a
559       page description language, described in groff_out(5), intended for pro‐
560       cessing by a selected device.  This page description language  is  spe‐
561       cialized  in  its  parameters, but not its syntax, for the selected de‐
562       vice; the format is device-independent, but not  device-agnostic.   The
563       parameters the formatter uses to arrange the document are stored in de‐
564       vice and font description files; see groff_font(5).
565
566       AT&T Unix had two formatters—nroff for terminals, and troff  for  type‐
567       setters.  Often, the name troff is used loosely to refer to both.  When
568       generalizing thus, groff documentation prefers the term “roff”.  In GNU
569       roff, the formatter program is always troff(1).
570
571   Devices and output drivers
572       To  a  roff  system, a device is a hardware interface like a printer, a
573       text or graphical terminal, or a standardized file  format  that  unre‐
574       lated  software  can  interpret.   An  output  driver is a program that
575       parses the output of troff and produces instructions  specific  to  the
576       device or file format it supports.  An output driver might support mul‐
577       tiple devices, particularly if they are similar.
578
579       The names of the devices and their driver programs  are  not  standard‐
580       ized.   Technological  fashions  evolve;  the devices used for document
581       preparation when AT&T troff was first  written  in  the  1970s  are  no
582       longer  used  in  production  environments.   Device  capabilities have
583       tended to increase,  improving  resolution  and  font  repertoire,  and
584       adding color output and hyperlinking.  Further, to reduce file size and
585       processing time, AT&T troff's page description language placed low lim‐
586       its on the magnitudes of some quantities it could represent.  Its Post‐
587       Script output driver, dpost(1), had a resolution of 720 units per inch;
588       groff's grops(1) uses 72,000.
589

roff programming

591       Documents  using  roff are normal text files interleaved with roff for‐
592       matting elements.  The roff language is powerful enough to support  ar‐
593       bitrary  computation  and  it supplies facilities that encourage exten‐
594       sion.  The primary such facility is macro definition;  with  this  fea‐
595       ture, macro packages have been developed that are tailored for particu‐
596       lar applications.
597
598   Macro packages
599       Macro packages can have a much smaller  vocabulary  than  roff  itself;
600       this  trait  combined  with  their domain-specific nature can make them
601       easy to acquire and master.  The macro definitions  of  a  package  are
602       typically  kept  in  a file called name.tmac (historically, tmac.name).
603       Find  details  on  the  naming  and  placement  of  macro  packages  in
604       groff_tmac(5).
605
606       A  macro  package  anticipated for use in a document can be declared to
607       the formatter by the command-line option -m; see troff(1).  It can  al‐
608       ternatively be specified within a document using the mso request of the
609       groff language; see groff(7).
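
       A minimal sketch of the in-document alternative follows.  It
       assumes a GNU roff installation that supplies the ms package under
       the file name ms.tmac; the paragraph text is a placeholder.

```roff
.\" Load the ms macros from within the document; this stands in for
.\" passing "-m ms" on the command line.
.mso ms.tmac
.LP
This paragraph is started by the LP macro of the ms package.
```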
610
611       Well-known macro packages include man for  traditional  man  pages  and
612       mdoc for BSD-style manual pages.  Macro packages for typesetting books,
613       articles, and letters include ms (from “manuscript macros”), me  (named
614       by a system administrator from the first name of its creator, Eric All‐
615       man), mm (from “memorandum macros”), and mom, a punningly named package
616       exercising many groff extensions.  See groff_tmac(5) for more.
617
618   The roff formatting language
619       The roff language provides requests, escape sequences, macro definition
620       facilities, string variables, registers for storage of numbers  or  di‐
621       mensions, and control of execution flow.  The theoretically minded will
622       observe that a roff is not a mere markup language, but Turing-complete.
623       It has storage (registers), it can perform tests (as in conditional ex‐
624       pressions like “(\n[i] >= 1)”), its “if” and related requests alter the
625       flow of control, and macro definition permits unbounded recursion.
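
       The ingredients named above can be combined in a few lines.  The
       following sketch uses a hypothetical macro, countdown, and a
       hypothetical register, i; names longer than two characters are a
       GNU extension, so this is groff input rather than portable roff.
       The macro stores its argument in the register, tests it, and calls
       itself with the value reduced by one until the test fails.

```roff
.\" Hypothetical macro "countdown": sets its argument as text, then
.\" calls itself with one less, stopping when the value drops below 1.
.de countdown
.  nr i \\$1
.  if (\\n[i] >= 1) \{\
\\n[i]
.    nr i -1
.    countdown \\n[i]
.  \}
..
.countdown 3
```

       The doubled backslashes defer interpolation of the argument and the
       register until the macro is called rather than when it is defined.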
626
627       Requests and escape sequences are instructions, predefined parts of the
628       language, that perform formatting operations, interpolate stored  mate‐
629       rial, or otherwise change the state of the parser.  The user can define
630       their own request-like elements by composing together  text,  requests,
631       and  escape sequences ad libitum.  A document writer will not (usually)
632       note any difference in usage for requests or macros; both are found  on
633       control lines.  However, there is a distinction; requests take either a
634       fixed number of arguments (sometimes zero), silently ignoring  any  ex‐
635       cess,  or consume the rest of the input line, whereas macros can take a
636       variable number of arguments.  Since arguments are separated by spaces,
637       macros  require  a  means of embedding a space in an argument; in other
638       words, of quoting it.  This then demands a mechanism of  embedding  the
639       quoting character itself, in case it is needed literally in a macro ar‐
640       gument.  AT&T troff had complex rules involving the placement and repe‐
641       tition  of the double quote to achieve both aims.  groff cuts this knot
642       by supporting a special character escape sequence for the neutral  dou‐
643       ble  quote,  “\[dq]”,  which  never performs quoting in the typesetting
644       language, but is simply a glyph, ‘"’.
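
       A brief sketch of both needs follows, using a hypothetical macro,
       say, that sets its first argument in bold.  Double quotes keep the
       spaces inside one argument, and \[dq] supplies literal quotation
       marks without taking part in the quoting.

```roff
.\" Hypothetical macro "say": sets its first argument in bold.
.de say
\\fB\\$1\\fP
..
.\" Double quotes group words separated by spaces into one argument.
.say "one argument with spaces"
.\" \[dq] is only a glyph, so it is safe inside a quoted argument.
.say "a \[dq]quoted\[dq] word"
```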
645
646       Escape sequences start with a backslash, “\”.  They can  appear  almost
647       anywhere,  even  in  the midst of text on a line, and implement various
648       features, including the insertion of special characters with “\(xx”  or
649\[xxx]”,  break  suppression  at  input  line  endings with “\c”, font
650       changes with “\f”, type size changes with “\s”, in-line  comments  with
651\"”, and many others.
652
653       Strings  store text.  They are populated with the ds request and inter‐
654       polated using the \* escape sequence.
655
656       Registers store numbers and measurements.  A register can be  set  with
657       the  request  nr  and its value can be retrieved by the escape sequence
658       \n.
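
       A small sketch of both facilities follows.  The string and register
       names (project, year, pj, yr) are hypothetical; the bracketed forms
       with long names are groff syntax, while the two-character forms are
       also understood by AT&T troff.

```roff
.\" Define a string and a register, then interpolate them in text.
.ds project groff
.nr year 1989
Development of \*[project] began in \n[year].
.\" Equivalent forms for two-character names: \*(xx and \n(xx.
.ds pj groff
.nr yr 1989
Development of \*(pj began in \n(yr.
```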
659

File naming conventions

661       The structure or content of a file name, beyond  its  location  in  the
662       file  system, is not significant to roff tools.  roff documents employ‐
663       ing “full-service” macro packages (see groff_tmac(5)) tend to be  named
664       with a suffix identifying the package; we thus see file names ending in
665       .man, .ms, .me, .mm, and .mom, for instance.  When installed, man pages
666       tend  to  be named with the manual's section number as the suffix.  For
667       example, the file name for this document is roff.7.  Practice for “raw”
668       roff  documents  is  less consistent; they are sometimes seen with a .t
669       suffix.
670

Input conventions

672       Since troff fills text automatically, it is common practice in the roff
673       language  to  avoid  visual composition of text in input files: the es‐
674       thetic appeal of the formatted output is what matters.  Therefore, roff
675       input should be arranged such that it is easy for authors and maintain‐
676       ers to compose and develop the document, understand the syntax of  roff
677       requests, macro calls, and preprocessor languages used, and predict the
678       behavior of the formatter.  Several traditions have accrued in  service
679       of these goals.
680
681       • Follow  sentence  endings  in  the  input with newlines to ease their
682         recognition.  It is frequently convenient to  end  text  lines  after
683         colons and semicolons as well, as these typically precede independent
684         clauses.  Consider doing so after commas; they often occur  in  lists
685         that become easy to scan when itemized by line, or constitute supple‐
686         ments to the sentence that are added, deleted, or updated to  clarify
687         it.   Parenthetical  and  quoted phrases are also good candidates for
688         placement on text lines by themselves.
689
690       • Set your text editor's line length to 72 characters or fewer; see the
691         subsections  below.   This  limit, combined with the previous item of
692         advice, makes it less common that an input line  will  wrap  in  your
693         text  editor,  and  thus will help you perceive excessively long con‐
694         structions in your text.  Recall that natural languages originate  in
695         speech,  not  writing, and that punctuation is correlated with pauses
696         for breathing and changes in prosody.
697
698       • Use \& after “!”, “?”, and “.” if they are followed by space, tab, or
699         newline characters and don't end a sentence.
700
701       • In  filled text lines, use \& before “.” and “'” if they are preceded
702         by space, so that reflowing the input doesn't turn them into  control
703         lines.
704
705       • Do not use spaces to perform indentation or align columns of a table.
706         Leading spaces are reliable only when text is not being filled.
707
708       • Comment your document.  It is never too soon  to  apply  comments  to
709         record  information  of use to future document maintainers (including
710         your future self).  The \" escape sequence causes troff to ignore the
711         remainder of the input line.
712
713       • Use  the  empty request—a control character followed immediately by a
714         newline—to visually manage separation of  material  in  input  files.
715         Many  of  the  groff project's own documents use an empty request be‐
716         tween sentences, after macro definitions, and where a  break  is  ex‐
717         pected,  and  two empty requests between paragraphs or other requests
718         or macro calls that will introduce vertical space into the  document.
719         You can combine the empty request with the comment escape sequence to
720         include whole-line comments in your document, and even “comment  out”
721         sections of it.
722
723       An  example  sufficiently  long to illustrate most of the above sugges‐
724       tions in practice follows.  An arrow → indicates a tab character.
725
726              .\"   nroff this_file.roff | less
727              .\"   groff -T ps this_file.roff > this_file.ps
728              →The theory of relativity is intimately connected with
729              the theory of space and time.
730              .
731              I shall therefore begin with a brief investigation of
732              the origin of our ideas of space and time,
733              although in doing so I know that I introduce a
734              controversial subject.  \" remainder of paragraph elided
735              .
736              .
737
738              →The experiences of an individual appear to us arranged
739              in a series of events;
740              in this series the single events which we remember
741              appear to be ordered according to the criterion of
742              \[lq]earlier\[rq] and \[lq]later\[rq], \" punct swapped
743              which cannot be analysed further.
744              .
745              There exists,
746              therefore,
747              for the individual,
748              an I-time,
749              or subjective time.
750              .
751              This itself is not measurable.
752              .
753              I can,
754              indeed,
755              associate numbers with the events,
756              in such a way that the greater number is associated with
757              the later event than with an earlier one;
758              but the nature of this association may be quite
759              arbitrary.
760              .
761              This association I can define by means of a clock by
762              comparing the order of events furnished by the clock
763              with the order of a given series of events.
764              .
765              We understand by a clock something which provides a
766              series of events which can be counted,
767              and which has other properties of which we shall speak
768              later.
769              .\" Albert Einstein, _The Meaning of Relativity_, 1922
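
       The example above does not exercise the two suggestions concerning
       \&.  A short sketch of them follows; the sentences are placeholders.

```roff
.\" \& after a period that does not end a sentence, so that no
.\" inter-sentence space is added after the initial.
The book is by W.\&
Richard Stevens.
.\" \& before a leading period, so that the line remains a text line
.\" instead of being read as a control line.
Macro definitions are typically kept in files ending in
\&.tmac, as described earlier.
```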
770
771   Editing with Emacs
772       Official GNU doctrine holds that the best program for  editing  a  roff
773       document  is Emacs; see emacs(1).  It provides an nroff major mode that
774       is suitable for all kinds of roff dialects.  This mode can be activated
775       by the following methods.
776
777       When editing a file within Emacs the mode can be changed by typing “M-x
778       nroff-mode”, where M-x means to hold down the meta key (often  labelled
779       “Alt”) while pressing and releasing the “x” key.
780
781       It is also possible to have the mode automatically selected when a roff
782       file is loaded into the editor.
783
784       • The most general method is to include file-local variables at the end
785         of the file; we can also configure the fill column this way.
786
787                .\" Local Variables:
788                .\" fill-column: 72
789                .\" mode: nroff
790                .\" End:
791
792       • Certain  file  name  extensions,  such  as those commonly used by man
793         pages, trigger the automatic activation of the nroff mode.
794
795       • Technically, having the sequence
796
797                .\" -*- nroff -*-
798
799         in the first line of a file will cause Emacs to enter the nroff major
800         mode  when  it is loaded into the buffer.  Unfortunately, some imple‐
801         mentations of the man(1) program are confused by this practice, so we
802         discourage it.
803
804   Editing with Vim
805       Other editors provide support for roff-style files too, such as vim(1),
806       an extension of the vi(1) program.  Vim's highlighting can be  made  to
807       recognize  roff files by setting the filetype option in a Vim modeline.
808       For this feature to work, your copy of vim must be built  with  support
809       for,  and  configured to enable, several features; consult the editor's
810       online help topics “auto-setting”, “filetype”, and “syntax”.  Then  put
811       the following at the end of your roff files, after any Emacs configura‐
812       tion:
813
814                     .\" vim: set filetype=groff textwidth=72:
815
816       Replace “groff” in the above with “nroff” if you want highlighting that
817       does not recognize many of the GNU extensions to roff, such as request,
818       register, and string names longer than two characters.
819

Authors

821       This document was written by Bernd Warken ⟨groff-bernd.warken-72@web.de⟩
822       and G. Branden Robinson ⟨g.branden.robinson@gmail.com⟩.
823

See also

825       Much  roff documentation is available.  The Bell Labs papers describing
826       AT&T troff remain available, and groff is documented comprehensively.
827
828   Internet sites
829       Unix Text Processing ⟨https://github.com/larrykollar/Unix-Text-Processing⟩,
830       by Dale Dougherty and Tim O'Reilly, 1987, Hayden
831       Books.  This well-regarded text brings the reader from a  state  of  no
832       knowledge  of Unix or text editing (if necessary) to sophisticated com‐
833       puter-aided typesetting.  It has been placed under a free software  li‐
834       cense  by  its  authors and updated by a team of groff contributors and
835       enthusiasts.
836
837       “History of Unix  Manpages”  ⟨http://manpages.bsd.lv/history.html⟩,  an
838       online  article  maintained by the mdocml project, provides an overview
839       of roff development from Saltzer's RUNOFF to 2008, with links to origi‐
840       nal  documentation and recollections of the authors and their contempo‐
841       raries.
842
843       troff.org ⟨http://www.troff.org/⟩, Ralph Corderoy's  troff  site,  pro‐
844       vides an overview and pointers to much historical roff information.
845
846       Multicians ⟨http://www.multicians.org/⟩, a site by Multics enthusiasts,
847       contains a lot of information on the MIT projects CTSS and Multics, in‐
848       cluding  RUNOFF;  it is especially useful for its glossary and the many
849       links to historical documents.
850
851       The Unix Archive ⟨http://www.tuhs.org/Archive/⟩, curated  by  the  Unix
852       Heritage Society, provides the source code and some binaries of histor‐
853       ical Unices (including the source code of some versions  of  troff  and
854       its documentation) contributed by their copyright holders.
855
856       Jerry Saltzer's home page
857       ⟨http://web.mit.edu/Saltzer/www/publications/pubs.html⟩ stores some
858       documents using the original RUNOFF formatting language.
859
860       groffhttp://www.gnu.org/software/groff⟩,  GNU  roff's web site, pro‐
861       vides convenient access to groff's source code repository, bug tracker,
862       and mailing lists (including archives and the subscription interface).
863
864   Historical roff documentation
865       Many  AT&T  troff  documents  are available online, and can be found at
866       Ralph Corderoy's site (see above) or via Internet search.
867
868       Of foremost significance are two mentioned in section “History”  above,
869       describing  the language and its device-independent implementation, re‐
870       spectively.
871
872       “Troff User's Manual” by Joseph F. Ossanna, 1976 (revised by  Brian  W.
873       Kernighan,  1992),  AT&T  Bell Laboratories Computing Science Technical
874       Report No. 54.
875
876       “A Typesetter-independent TROFF” by Brian W. Kernighan, 1982, AT&T Bell
877       Laboratories Computing Science Technical Report No. 97.
878
879       You can obtain many relevant Bell Labs papers in PDF from Bernd
880       Warken's “roff classical” GitHub repository
881       ⟨https://github.com/bwarken/roff_classical.git⟩.
882
883   Manual pages
884       As  a system of multiple components, a roff system potentially has many
885       man pages, each describing an aspect of it.  Unfortunately, there is no
886       consistent  naming  scheme for these pages among the different roff im‐
887       plementations.
888
889       For GNU roff, the groff(1) man page enumerates all man  pages  distrib‐
890       uted with the system, and individual pages frequently refer to external
891       resources as well as manuals distributed with groff  on  a  variety  of
892       topics.
893
894       With  other  roffs,  you  are on your own, but troff(1) might be a good
895       starting point.
896
897
898
899groff 1.23.0                    2 November 2023                        roff(7)