1PolyglotMan(1) General Commands Manual PolyglotMan(1)
2
3
4
6 PolyglotMan, rman - reverse compile man pages from formatted form to a
7 number of source formats
8
10 rman [ options ] [ file ]
11
13 Up-to-date instructions can be found at
14 http://polyglotman.sourceforge.net/rman.html
15
16
17 PolyglotMan takes man pages from most of the popular flavors of UNIX
18 and transforms them into any of a number of text source formats.
19 PolyglotMan was formerly known as RosettaMan. The binary is still
20 named rman, for scripts that depend on that name; mnemonically,
21 just think "reverse man". Previously PolyglotMan required pages to be
22 formatted by nroff prior to its processing. With version 3.0, it
23 prefers [tn]roff source and usually produces results that are better
24 yet. And source processing is the only way to translate tables. Source
25 format translation is not as mature as formatted, however, so try for‐
26 matted translation as a backup.
27
28 In parsing [tn]roff source, one could implement an arbitrarily large
29 subset of [tn]roff, which I did not and will not do, so the results can
30 be off. I did implement a significant subset of those used in man pages,
31 however, including tbl (but not eqn), if tests, and general macro defi‐
32 nitions, so usually the results look great. If they don't, format the
33 page with nroff before sending it to PolyglotMan. If PolyglotMan
34 doesn't recognize a key macro used by a large class of pages, however,
35 e-mail me the source and a uuencoded nroff-formatted page and I'll see
36 what I can do. When running PolyglotMan with man page source that
37 includes or redirects to other [tn]roff source using the .so (source or
38 inclusion) macro, you should be in the parent directory of the page,
39 since pages are written with this assumption. For example, if you are
40 translating /usr/man/man1/ls.1, first cd into /usr/man.
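The directory rule above can be sketched in shell. This is a minimal sketch, assuming the conventional layout <root>/man<section>/<page> and an absolute page path; the helper names man_root and rman_from_root are made up for illustration and are not part of PolyglotMan.

```shell
# man_root: print the parent of the directory holding a man page,
# e.g. /usr/man for /usr/man/man1/ls.1 (assumes <root>/manN/<page>).
man_root() {
    dirname "$(dirname "$1")"
}

# rman_from_root (hypothetical helper): run rman on a source page from
# the man root, so that relative .so references resolve. The page
# argument comes first; any rman options follow it.
rman_from_root() {
    page=$1
    shift
    ( cd "$(man_root "$page")" && rman "$@" "$page" )
}

man_root /usr/man/man1/ls.1    # prints /usr/man
```

With these definitions, something like rman_from_root /usr/man/man1/ls.1 -f html would run the translation from /usr/man. The subshell keeps the caller's working directory unchanged.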
41
42 PolyglotMan accepts man pages from: SunOS, Sun Solaris, Hewlett-
43 Packard HP-UX, AT&T System V, OSF/1 aka Digital UNIX, DEC Ultrix, SGI
44 IRIX, Linux, FreeBSD, SCO. Source processing works for: SunOS, Sun
45 Solaris, Hewlett-Packard HP-UX, AT&T System V, OSF/1 aka Digital UNIX,
46 DEC Ultrix. It can produce printable ASCII-only (control characters
47 stripped), section headers-only, Tk, TkMan, [tn]roff (traditional man
48 page source), SGML, HTML, MIME, LaTeX, LaTeX2e, RTF, Perl 5 POD. A
49 modular architecture makes it easy to add further output formats.
50
51 The latest version of PolyglotMan is available from
52 http://polyglotman.sourceforge.net/.
53
55 The following options should not be used with any others and exit Poly‐
56 glotMan without processing any input.
57
58 -h|--help Show list of command line options and exit.
59
60 -v|--version Show version number and exit.
61
62 You should specify the filter first, as this sets a number of parame‐
63 ters, and then specify other options.
64
65 -f|--filter
66 <ASCII|roff|TkMan|Tk|Sections|HTML|SGML|MIME|LaTeX|LaTeX2e|RTF|POD>
67 Set the output filter. Defaults to ASCII.
68
69 -S|--source PolyglotMan tries to automatically determine whether its
70 input is source or formatted; use this option to declare
71 source input.
72
73 -F|--format|--formatted
74 PolyglotMan tries to automatically determine whether its
75 input is source or formatted; use this option to declare
76 formatted input.
77
78 -l|--title printf-string
79 In HTML mode this sets the <TITLE> of the man pages,
80 given the same parameters as -r.
81
82 -r|--reference|--manref printf-string
83 In HTML and SGML modes this sets the URL form by which
84 to retrieve other man pages. The string can use two sup‐
85 plied parameters: the man page name and its section.
86 (See the Examples section.) If the string is null (as
87 if set from a shell by "-r ''"), `-' or `off', then man
88 page references will not be HREFs, just set in italics.
89 If your printf supports XPG3 position specifiers, this
90 can be quite flexible.
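To see what a printf-string for -r expands to, the shell's own printf can stand in for the substitution. This is only an illustration, assuming the two parameters are supplied in the order the text gives them (page name, then section); expand_manref is a hypothetical helper, not an rman feature.

```shell
# Expand a -r style template by hand: the first %s receives the page
# name and the second %s receives the section.
template='http:/usr/man/html/%s.%s.html'

expand_manref() {
    # $1 = template, $2 = page name, $3 = section
    printf "$1\n" "$2" "$3"
}

expand_manref "$template" ls 1    # prints http:/usr/man/html/ls.1.html
```

Trying a template this way before passing it to -r is a cheap check that the name and section land where you expect.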
91
92 -V|--volumes <colon-separated list>
93 Set the list of valid volumes to check against when
94 looking for cross-references to other man pages.
95 Defaults to 1:2:3:4:5:6:7:8:9:o:l:n:p (volume names can
96 be multicharacter). A non-whitespace string in the
97 page is reported as a reference to another manual
98 page if it is immediately followed by a left
99 parenthesis, then one of the valid volumes, then
100 optional other characters, and finally a right
101 parenthesis. If the -V string starts with an equals
102 sign, then no optional characters are allowed between
103 the matched volume and the right parenthesis. (This
104 option is needed for SCO UNIX.)
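The matching rule for -V can be approximated with an extended regular expression. This is a sketch of the rule as described here, not PolyglotMan's actual matcher; xref_regex is a hypothetical helper, and it assumes volume names contain no regex metacharacters.

```shell
# xref_regex: build an ERE from a -V style volume list.
# A leading "=" means nothing may follow the volume before ")".
xref_regex() {
    spec=$1
    tail='[^)]*'                      # optional other characters
    case $spec in
        =*) spec=${spec#=}; tail='' ;;
    esac
    alts=$(printf '%s' "$spec" | tr ':' '|')
    printf '[^[:space:]()]+\\((%s)%s\\)' "$alts" "$tail"
}

text='see ls(1), curses(3X) and foo(bar)'
echo "$text" | grep -Eo "$(xref_regex 1:2:3)"     # ls(1) and curses(3X)
echo "$text" | grep -Eo "$(xref_regex =1:2:3)"    # ls(1) only
```

The second call shows the effect of the equals sign: curses(3X) is no longer reported because X sits between the volume and the right parenthesis.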
105
106 The following options apply only when formatted pages are given as
107 input. With source input they either do not apply or are always
108 handled correctly.
109
110 -b|--subsections
111 Try to recognize subsection titles in addition to sec‐
112 tion titles. This can cause problems on some UNIX fla‐
113 vors.
114
115 -K|--nobreak Indicate manual pages don't have page breaks, so don't
116 look for footers and headers around them. (Older nroff
117 -man macros always put in page breaks, but lately some
118 vendors have realized that printouts are made through
119 troff, whereas nroff -man is used to format pages for
120 reading on screen, and so have eliminated page breaks.)
121 PolyglotMan usually gets this right even without this
122 flag.
123
124 -k|--keep Keep headers and footers, as a canonical report at the
125 end of the page.
126 changeleft Move changebars, such as those found in the
127 Tcl/Tk manual pages, to the left.
128 notaggressive Disable aggressive man page parsing. Aggressive
129 manual page parsing, which is on by default, elides
130 headers and footers, identifies sections and more.
131
132 -n|--name name Set name of man page (used in roff format). If the
133 filename is given in the form "name.section", the name
134 and section are automatically determined. If the page is
135 being parsed from [tn]roff source and it has a .TH line,
136 this information is extracted from that line.
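The "name.section" convention can be illustrated with shell parameter expansion. The text does not say which dot is used when a filename contains several, so this sketch assumes the last one; split_page is a hypothetical helper.

```shell
# split_page: derive name and section from a path whose last
# component looks like "name.section".
split_page() {
    base=${1##*/}          # strip directories: /usr/man/cat1/ls.1 -> ls.1
    name=${base%.*}        # everything before the last dot
    section=${base##*.}    # everything after the last dot
    printf '%s %s\n' "$name" "$section"
}

split_page /usr/man/cat1/ls.1    # prints: ls 1
```

When the filename does not follow this form, for example when the page arrives on standard input, -n and -s supply the same two values explicitly.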
137
138 -p|--paragraph Paragraph mode toggle. The filter determines whether
139 lines should be linebroken as they were by nroff, or
140 whether lines should be flowed together into paragraphs.
141 Mainly for internal use.
142
143 -s|--section # Set volume (aka section) number of man page (used in
144 roff format).
145 tables Turn on aggressive table parsing.
146
147 -t|--tabstops #
148 For those macro sets that use tabs in place of spaces
149 where possible in order to reduce the number of charac‐
150 ters used, set tabstops every # columns. Defaults to 8.
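What "tabstops every # columns" means can be seen with the standard expand utility, which is used here only as a stand-in; it is not how PolyglotMan is implemented.

```shell
# A tab advances to the next multiple of the tabstop width.
printf 'ab\tcd\n' | expand -t 8    # "ab", six spaces, "cd"
printf 'ab\tcd\n' | expand -t 4    # "ab", two spaces, "cd"
```

If columns in the translated output look misaligned, the page was probably formatted with a tabstop width other than the default of 8.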
151
153 ROFF
154 Some flavors of UNIX ship man pages without [tn]roff source, making
155 one's laser printer little more than a laser-powered daisy wheel. This
156 filter tries to intuit the original [tn]roff directives, which can then
157 be recompiled by [tn]roff.
158
159 TkMan
160 TkMan, a hypertext man page browser, uses PolyglotMan to show man pages
161 without the (usually) useless headers and footers on each page. It
162 also collects section and (optionally) subsection heads for direct
163 access from a pulldown menu. TkMan and Tcl/Tk, the toolkit in which
164 it's written, are available via anonymous ftp from
165 ftp://ftp.smli.com/pub/tcl/
166
167 Tk
168 This option outputs the text in a series of Tcl lists consisting of
169 text-tags pairs, where tag names roughly correspond to HTML. This out‐
170 put can be inserted into a Tk text widget by doing an eval <textwidget>
171 insert end <text>. This format should be relatively easy to parse for
172 other programs that want both the text and the tags. Also see ASCII.
173
174 ASCII
175 When printed on a line printer, man pages try to produce special text
176 effects by overstriking characters with themselves (to produce bold)
177 and underscores (underlining). Other text processing software, such as
178 text editors, searchers, and indexers, must counteract this. The ASCII
179 filter strips away this formatting. Piping nroff output through col -b
180 also strips away this formatting, but it leaves behind unsightly page
181 headers and footers. Also see Tk.
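The overstriking described above is a character, a backspace, and a second character. A minimal sketch of undoing it with sed follows; it covers only simple overstrikes and, like col -b, leaves headers and footers alone. strip_overstrikes is a hypothetical helper, not part of PolyglotMan.

```shell
# strip_overstrikes: delete every "character, backspace" pair, which
# undoes both bold (X BS X) and underline (_ BS X) overstriking.
strip_overstrikes() {
    bs=$(printf '\b')
    sed "s/.$bs//g"
}

# "NAME" in bold followed by "ls" underlined, as nroff would emit them.
printf 'N\bNA\bAM\bME\bE _\bl_\bs\n' | strip_overstrikes    # prints: NAME ls
```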
182
183 Sections
184 Dumps section and (optionally) subsection titles. This might be useful
185 for another program that processes man pages.
186
187 HTML
188 With a simple extension to an HTTP server for Mosaic or other World
189 Wide Web browser, PolyglotMan can produce high quality HTML on the
190 fly. Several such extensions and pointers to several others are
191 included in PolyglotMan's contrib directory.
192
193 SGML
194 This is approaching the DocBook DTD, but I'm hoping that someone
195 with a real interest in this will polish the tags generated.
196 Try it to see how close the tags are now.
197
198 MIME
199 MIME (Multipurpose Internet Mail Extensions) as defined by RFC 1563,
200 good for consumption by MIME-aware e-mailers or as Emacs (>=19.29)
201 enriched documents.
202
203 LaTeX and LaTeX2e
204 Why not?
205
206 RTF
207 Use output on Mac or NeXT or whatever. Maybe take random man pages and
208 integrate with NeXT's documentation system better. Maybe NeXT has its
209 own man page macros that do this.
210
211 PostScript and FrameMaker
212 To produce PostScript, use groff or psroff. To produce FrameMaker
213 MIF, use FrameMaker's builtin filter. In both cases you need [tn]roff
214 source, so if you only have a formatted version of the manual page, use
215 PolyglotMan's roff filter first.
216
218 To convert the formatted man page named ls.1 back into [tn]roff
219 source form:
220
221 rman -f roff /usr/local/man/cat1/ls.1 > /usr/local/man/man1/ls.1
222
223 Long man pages are often compressed to conserve space (compression is
224 especially effective on formatted man pages as many of the characters
225 are spaces). As it is a long man page, it probably has subsections,
226 which we try to separate out (some macro sets don't distinguish subsec‐
227 tions well enough for PolyglotMan to detect them). Let's convert this
228 to LaTeX format:
229
230 pcat /usr/catman/a_man/cat1/automount.z |
231 rman -b -n automount -s 1 -f latex > automount.man
232
233 Alternatively, man 1 automount | rman -b -n automount -s 1 -f latex >
234 automount.man
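When pages are stored in more than one compressed format, the decompressor can be chosen by suffix before piping into rman. This is a sketch under assumptions: the .z case follows the pcat example above, while the .Z and .gz cases and the helper name cat_for are additions for illustration.

```shell
# cat_for: print a command that writes the uncompressed page to
# stdout, chosen by filename suffix.
cat_for() {
    case $1 in
        *.z)  echo pcat ;;        # pack(1) format, as in the example above
        *.Z)  echo zcat ;;        # compress(1) format
        *.gz) echo 'gzip -dc' ;;  # gzip format
        *)    echo cat ;;         # assume the page is not compressed
    esac
}

cat_for /usr/catman/a_man/cat1/automount.z    # prints: pcat
```

A pipeline would then look like $(cat_for "$page") "$page" | rman -b -f latex, with -n and -s added as in the examples above because the name cannot be taken from standard input.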
235
236 For HTML/Mosaic users, PolyglotMan can, without modification of the
237 source code, produce HTML links that point to other HTML man pages
238 either pregenerated or generated on the fly. First let's assume
239 pregenerated HTML versions of man pages stored in /usr/man/html.
240 Generate these one by one with the following form:
241 rman -f html -r 'http:/usr/man/html/%s.%s.html' /usr/man/cat1/ls.1 >
242 /usr/man/html/ls.1.html
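Repeating that command for a whole directory can be sketched as a loop. The paths are the ones from the example; html_target and convert_cat_dir are hypothetical helpers, and the loop assumes every regular file in the directory is an uncompressed formatted page.

```shell
# html_target: map a formatted page path to its pregenerated HTML
# path, e.g. /usr/man/cat1/ls.1 -> /usr/man/html/ls.1.html.
# $2 optionally overrides the output directory.
html_target() {
    printf '%s/%s.html\n' "${2:-/usr/man/html}" "${1##*/}"
}

# convert_cat_dir: run the rman command from the example on every
# regular file in the directory given as $1.
convert_cat_dir() {
    for page in "$1"/*; do
        [ -f "$page" ] || continue
        rman -f html -r 'http:/usr/man/html/%s.%s.html' "$page" \
            > "$(html_target "$page")"
    done
}

html_target /usr/man/cat1/ls.1    # prints /usr/man/html/ls.1.html
```

Calling convert_cat_dir /usr/man/cat1 would then regenerate the HTML for section 1. The output names match the -r template, so the generated links resolve to the generated files.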
243
244 If you've extended your HTML client to generate HTML on the fly you
245 should use something like:
246 rman -f html -r 'http:~/bin/man2html?%s:%s' /usr/man/cat1/ls.1
247 when generating HTML.
248
250 PolyglotMan is not perfect in all cases, but it usually does a good
251 job, and in any case reduces the problem of converting man pages to
252 light editing.
253
254 Tables in formatted pages, especially H-P's, aren't handled very well.
255 Be sure to pass in source for the page to recognize tables.
256
257 The man pager woman applies its own idea of formatting for man pages,
258 which can confuse PolyglotMan. Bypass woman by passing the formatted
259 manual page text directly into PolyglotMan.
260
261 The [tn]roff output format uses \fB to turn on boldface. If your macro
262 set requires .B, you'll have to postprocess the PolyglotMan output.
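Such postprocessing could start from a sed filter like the one below. It is a sketch under assumptions: it handles only lines that consist entirely of one bold run written with the roff font escapes \fB and \fR, it assumes \fR is what closes the run, and it leaves lines with inline font changes untouched. fb_to_B is a hypothetical helper.

```shell
# fb_to_B: rewrite lines of the form \fBtext\fR into ".B text".
fb_to_B() {
    sed 's/^\\fB\(.*\)\\fR$/.B \1/'
}

printf '%s\n' '\fBls\fR' 'plain \fBbold\fR text' | fb_to_B
# first line becomes ".B ls"; the second line is passed through unchanged
```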
263
265 tkman(1), xman(1), man(1), man(7) or man(5) depending on your flavor
266 of UNIX
267
269 PolyglotMan
270 by Thomas A. Phelps ( phelps@ACM.org )
271 developed at the
272 University of California, Berkeley
273 Computer Science Division
274
275 Manual page last updated on $Date: 1998/07/13 09:47:28 $
276
277
278
279 PolyglotMan(1)