htmldoc(1)                    Michael R Sweet                    htmldoc(1)

htmldoc - convert HTML source files into HTML, PostScript, or PDF.

htmldoc [options] filename1.{html,md} [ ... filenameN.{html,md} ]

htmldoc [options] -

htmldoc [filename.book]

Htmldoc(1) converts HTML and Markdown source files into indexed HTML,
PostScript, or Portable Document Format (PDF) files that can be viewed
online or printed. With no options an HTML document is produced on
stdout.

The second form of htmldoc reads HTML source from stdin, which allows
you to use htmldoc as a filter.

The third form of htmldoc launches a graphical interface that allows
you to change options and generate documents interactively.
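
The filter form described above can be wrapped in a small shell function. This is only a sketch: the function name html_to_pdf is made up for illustration, and it assumes that the output is written to stdout when no -f option is given.

```shell
# html_to_pdf: hypothetical helper that reads HTML on stdin ("-") and
# writes a PDF to stdout (assumed, since no -f option is given).
html_to_pdf() {
    htmldoc --webpage -t pdf -
}

# Usage (commented out so the sketch has no side effects):
#   generate_report | html_to_pdf > report.pdf
```

The --webpage option is included because input arriving from another program is usually an unstructured page rather than a book.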

There are two types of HTML files: structured documents using headings
(H1, H2, etc.), which htmldoc calls "books", and unstructured documents
that do not use headings, which htmldoc calls "web pages".

A very common mistake is to try converting a web page using:

    htmldoc -f filename.pdf filename.html

which will likely produce a PDF file with no pages. To convert web
page files you must use the --webpage or --continuous option on the
command line or choose Web Page or Continuous in the input tab of the
GUI.
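
One way to avoid the empty-PDF mistake in scripts is to pick the mode from the file contents. The sketch below is a heuristic and not something htmldoc does itself: the helper names pick_mode and convert_html are invented here, and the rule "any H1-H6 tag means book" is an assumption.

```shell
# pick_mode FILE: print --book if FILE contains an H1-H6 tag, else --webpage.
# This is a heuristic chosen for illustration, not htmldoc behavior.
pick_mode() {
    if grep -qiE '<h[1-6][ >]' "$1"; then
        echo --book
    else
        echo --webpage
    fi
}

# convert_html INPUT.html OUTPUT.pdf: convert using the detected mode.
convert_html() {
    htmldoc "$(pick_mode "$1")" -f "$2" "$1"
}
```

When the document type is already known, passing --webpage or --book explicitly is simpler and more predictable than guessing.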

The following command-line options are supported by htmldoc:

--batch filename.book
    Generates the specified book file without opening the GUI.

--bodycolor color
    Specifies the background color for all pages.

--bodyfont {courier,helvetica,monospace,sans,serif,times}

--textfont {courier,helvetica,monospace,sans,serif,times}
    Specifies the default typeface for all normal text.

--bodyimage filename
    Specifies the background image that is tiled on all pages.

--book
    Specifies that the HTML sources are structured (headings,
    chapters, etc.).

--bottom margin
    Specifies the bottom margin in points (no suffix or ##pt), inches
    (##in), centimeters (##cm), or millimeters (##mm).

--charset {cp-nnnn,iso-8859-1,...,iso-8859-15,utf-8}
    Specifies the character set to use for the output. Note: UTF-8
    support is limited to the first 128 Unicode characters that are
    found in the input.

--color
    Specifies that PostScript or PDF output should be in color.

--continuous
    Specifies that the HTML sources are unstructured (plain web
    pages). No page breaks are inserted between files or URLs in the
    output.

--datadir directory
    Specifies the location of the htmldoc data files, usually
    /usr/share/htmldoc or C:/Program Files/HTMLDOC.

--duplex
    Specifies that the output should be formatted for double-sided
    printing.

--effectduration {0.1...10.0}
    Specifies the duration in seconds of PDF page transition effects.

--embedfonts
    Specifies that fonts should be embedded in PDF and PostScript
    output.

--encryption
    Enables encryption of PDF files.

--fontsize size
    Specifies the default font size for body text.

--fontspacing spacing
    Specifies the default line spacing for body text. The line spacing
    is a multiplier for the font size, so a value of 1.2 will provide
    an additional 20% of space between the lines.
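
As a worked example of the multiplier, the sketch below computes the resulting line height. The helper name line_height is hypothetical, and treating the font size as a value in points is an assumption that the text above does not state.

```shell
# line_height SIZE SPACING: print SIZE * SPACING with one decimal place.
# Hypothetical helper; SIZE is assumed to be in points.
line_height() {
    awk -v size="$1" -v spacing="$2" 'BEGIN { printf "%.1f\n", size * spacing }'
}

line_height 11 1.2    # prints 13.2

# The matching htmldoc invocation (commented out; file names are placeholders):
#   htmldoc --webpage --fontsize 11 --fontspacing 1.2 -f out.pdf in.html
```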

--footer fff
    Sets the page footer to use on body pages. See the HEADERS/FOOTERS
    FORMATS section below.

--format format

-t format
    Specifies the output format: epub, html, htmlsep (separate HTML
    files for each heading in the table-of-contents), ps or ps2
    (PostScript Level 2), ps1 (PostScript Level 1), ps3 (PostScript
    Level 3), pdf11 (PDF 1.1/Acrobat 2.0), pdf12 (PDF 1.2/Acrobat 3.0),
    pdf or pdf13 (PDF 1.3/Acrobat 4.0), or pdf14 (PDF 1.4/Acrobat 5.0).

--gray
    Specifies that PostScript or PDF output should be grayscale.

--header fff
    Sets the page header to use on body pages. See the HEADERS/FOOTERS
    FORMATS section below.

--header1 fff
    Sets the page header to use on the first body/chapter page. See
    the HEADERS/FOOTERS FORMATS section below.

--headfootfont font
    Sets the font to use on headers and footers.

--headfootsize size
    Sets the size of the font to use on headers and footers.

--headingfont typeface
    Sets the typeface to use for headings.

--help
    Displays a summary of command-line options.

--helpdir directory
    Specifies the location of the htmldoc online help files, usually
    /usr/share/doc/htmldoc or C:/Program Files/HTMLDOC/DOC.

--hfimageN filename
    Specifies an image (numbered from 1 to 10) to be used in the
    header or footer in a PostScript or PDF document.

--jpeg[=quality]
    Sets the JPEG compression level to use for large images. A value
    of 0 disables JPEG compression.

--left margin
    Specifies the left margin in points (no suffix or ##pt), inches
    (##in), centimeters (##cm), or millimeters (##mm).

--linkcolor color
    Sets the color of links.

--links
    Enables generation of links in PDF files (default).

--linkstyle {plain,underline}
    Sets the style of links.

--logoimage filename
    Specifies an image to be used as a logo in the header or footer in
    a PostScript or PDF document, and in the navigation bar of an HTML
    document. Note that you need to use the --header, --header1,
    and/or --footer options with the l parameter or use the
    corresponding HTML page comments to display the logo image in the
    header or footer.

--no-compression
    Disables compression of PostScript or PDF files.

--no-duplex
    Disables double-sided printing.

--no-embedfonts
    Specifies that fonts should not be embedded in PDF and PostScript
    output.

--no-encryption
    Disables document encryption.

--no-jpeg
    Disables JPEG compression of large images.

--no-links
    Disables generation of links in a PDF document.

--no-numbered
    Disables automatic heading numbering.

--no-pscommands
    Disables generation of PostScript setpagedevice commands.

--no-strict
    Disables strict HTML input checking.

--no-title
    Disables generation of a title page.

--no-toc
    Disables generation of a table of contents.

--numbered
    Numbers all headings in a document.

--nup pages
    Sets the number of pages that are placed on each output page.
    Valid values are 1, 2, 4, 6, 9, and 16.

--outdir directory

-d directory
    Specifies that output should be sent to a directory in multiple
    files. (Not compatible with PDF output.)
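
Directory output pairs naturally with the htmlsep format, which produces one HTML file per table-of-contents heading. A minimal sketch follows; the function name split_book is invented, and the mkdir call is a precaution, since whether htmldoc creates the directory itself is not stated here.

```shell
# split_book OUTDIR FILE...: write separate HTML files into OUTDIR.
# Hypothetical helper; mkdir -p is a precaution, not a documented requirement.
split_book() {
    outdir=$1
    shift
    mkdir -p "$outdir" && htmldoc --book -t htmlsep -d "$outdir" "$@"
}

# Usage (commented out; file names are placeholders):
#   split_book site chapter1.html chapter2.html
```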

--outfile filename

-f filename
    Specifies that output should be sent to a single file.

--owner-password password
    Sets the owner password for encrypted PDF files.

--pageduration {1.0...60.0}
    Sets the view duration of a page in a PDF document.

--pageeffect effect
    Specifies the page transition effect for all pages; this attribute
    is ignored by all Adobe PDF viewers.

--pagelayout {single,one,twoleft,tworight}
    Specifies the initial layout of pages for a PDF file.

--pagemode {document,outlines,fullscreen}
    Specifies the initial viewing mode for a PDF file.

--path
    Specifies a search path for files in a document.

--permissions permission[,permission,...]
    Specifies document permissions for encrypted PDF files. The
    following permissions are understood: all, none, annotate,
    no-annotate, copy, no-copy, modify, no-modify, print, and
    no-print. Separate multiple permissions with commas.
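
The encryption options are typically used together. The sketch below combines them; the function name encrypt_pdf and the USER_PW and OWNER_PW environment variables are hypothetical, chosen so that passwords are not hard-coded in a script.

```shell
# encrypt_pdf INPUT.html OUTPUT.pdf: produce a PDF that can be printed
# but not copied or modified. USER_PW and OWNER_PW are hypothetical
# environment variables; the function aborts if either is unset.
encrypt_pdf() {
    htmldoc --webpage --encryption \
        --user-password "${USER_PW:?set USER_PW}" \
        --owner-password "${OWNER_PW:?set OWNER_PW}" \
        --permissions print,no-copy,no-modify \
        -f "$2" "$1"
}
```

Note that passwords passed as command-line arguments may be visible to other users in the process list, so this approach suits single-user machines or controlled build hosts.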

--pscommands
    Specifies that PostScript setpagedevice commands should be
    included in the output.

--quiet
    Suppresses all messages, even error messages.

--referer url
    Specifies the URL that is passed in the Referer: field of HTTP
    requests.

--right margin
    Specifies the right margin in points (no suffix or ##pt), inches
    (##in), centimeters (##cm), or millimeters (##mm).

--size pagesize
    Specifies the page size using a standard name or in points (no
    suffix or ##x##pt), inches (##x##in), centimeters (##x##cm), or
    millimeters (##x##mm). The standard sizes that are currently
    recognized are "letter" (8.5x11in), "legal" (8.5x14in), "a4"
    (210x297mm), and "universal" (8.27x11in).
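
Page size and margins accept the same unit suffixes, so they can be mixed freely in one command. A short sketch, with the invented function name a4_pdf and arbitrary margin values:

```shell
# a4_pdf INPUT.html OUTPUT.pdf: A4 page with margins in mixed units.
# Hypothetical helper; the margin values are arbitrary examples.
a4_pdf() {
    htmldoc --webpage --size a4 \
        --left 2cm --right 2cm --top 15mm --bottom 0.5in \
        -f "$2" "$1"
}

# A custom size uses the ##x##unit form instead of a name, for example:
#   htmldoc --webpage --size 148x210mm -f out.pdf in.html
```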

--strict
    Enables strict HTML input checking.

--textcolor color
    Specifies the default color of all text.

--title
    Enables the generation of a title page.

--titlefile filename

--titleimage filename
    Specifies the file to use for the title page. If the file is an
    image then the title page is automatically generated using the
    document metadata and title image.

--tocfooter fff
    Sets the page footer to use on table-of-contents pages. See the
    HEADERS/FOOTERS FORMATS section below.

--tocheader fff
    Sets the page header to use on table-of-contents pages. See the
    HEADERS/FOOTERS FORMATS section below.

--toclevels levels
    Sets the number of levels in the table-of-contents.

--toctitle string
    Sets the title for the table-of-contents.

--top margin
    Specifies the top margin in points (no suffix or ##pt), inches
    (##in), centimeters (##cm), or millimeters (##mm).

--user-password password
    Specifies the user password for encryption of PDF files.

--verbose

-v
    Provides verbose messages.

--version
    Displays the current version number.

--webpage
    Specifies that the HTML sources are unstructured (plain web
    pages). A page break is inserted between each file or URL in the
    output.

Htmldoc returns a non-zero exit status if any errors are seen, zero
otherwise.
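
In a script, the exit status can be tested directly. The sketch below uses an invented function name, build_pdf, and removes the output file on failure as a design choice of this example rather than anything htmldoc requires.

```shell
# build_pdf INPUT.html OUTPUT.pdf: report failures and clean up.
# Hypothetical helper; removing the partial output is this sketch's choice.
build_pdf() {
    if htmldoc --webpage -f "$2" "$1"; then
        echo "wrote $2"
    else
        rc=$?
        echo "htmldoc failed with exit status $rc" >&2
        rm -f "$2"
        return "$rc"
    fi
}
```

Avoid combining this with --quiet during debugging, since that option suppresses the error messages that explain a non-zero status.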

The header and footer of each page can contain up to three preformatted
values. These values are specified using a single character for the
left, middle, and right of the page, resulting in the fff notation
shown previously.

Each character can be one of the following:

    .    blank
    /    n/N arabic page numbers (1/3, 2/3, 3/3)
    :    c/C arabic chapter page numbers (1/2, 2/2, 1/4, 2/4, ...)
    1    arabic numbers (1, 2, 3, ...)
    a    lowercase letters
    A    uppercase letters
    c    current chapter heading
    C    current chapter page number (arabic)
    d    current date
    D    current date and time
    h    current heading
    i    lowercase roman numerals
    I    uppercase roman numerals
    l    logo image
    t    title text
    T    current time
    u    current filename or URL
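
To make the fff notation concrete, the sketch below sets three format strings and spells out what each position means. The function name book_pdf is invented, and the particular layout is only an example.

```shell
# book_pdf OUTPUT.pdf FILE...: hypothetical helper showing fff strings.
book_pdf() {
    out=$1
    shift
    # --header .t.     left: blank     middle: title text  right: blank
    # --footer h.1     left: heading   middle: blank       right: arabic page number
    # --tocfooter ..i  left: blank     middle: blank       right: lowercase roman numeral
    htmldoc --book --header .t. --footer h.1 --tocfooter ..i -f "$out" "$@"
}

# Usage (commented out; file names are placeholders):
#   book_pdf manual.pdf intro.html usage.html
```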

HTMLDOC looks for several environment variables that can override the
default directories, display additional debugging information, and
disable CGI mode:

HTMLDOC_DATA
    This environment variable specifies the location of htmldoc's data
    and fonts directories, normally /usr/share/htmldoc or C:/Program
    Files/HTMLDOC.

HTMLDOC_DEBUG
    This environment variable enables debugging information that is
    sent to stderr. The value is a list of any of the following
    keywords separated by spaces: "all", "links", "memory",
    "remotebytes", "table", "tempfiles", and/or "timing".

HTMLDOC_HELP
    This environment variable specifies the location of htmldoc's
    documentation directory, normally /usr/share/doc/htmldoc or
    C:/Program Files/HTMLDOC/doc.

HTMLDOC_NOCGI
    This environment variable, when set (the value doesn't matter),
    disables CGI mode. It is most useful for using htmldoc on a web
    server from a scripting language or invocation from a program.
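
These variables can be set for a single invocation by prefixing the command. The sketch below is an illustration only: the function name render_page and the HTMLDOC_LOG variable are hypothetical and are not read by htmldoc itself.

```shell
# render_page URL OUTPUT.pdf: run htmldoc from a server-side script with
# CGI mode disabled and link/timing debug output appended to a log.
# HTMLDOC_LOG is a hypothetical variable used only by this sketch.
render_page() {
    HTMLDOC_NOCGI=1 HTMLDOC_DEBUG="links timing" \
        htmldoc --webpage -f "$2" "$1" 2>>"${HTMLDOC_LOG:-/dev/null}"
}
```

Because the HTMLDOC_DEBUG keywords are separated by spaces, the value must be quoted when more than one keyword is given.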

Create a PDF file from a web site:

    htmldoc --webpage -f example.pdf http://www.example.com/

Create a PostScript book from a directory of HTML files:

    htmldoc --book -t ps -f example.ps *.html

HTMLDOC Users Manual

https://michaelrsweet.github.io/htmldoc

Michael R Sweet

HTMLDOC is copyright © 1997-2019 by Michael R Sweet.

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

28 August 2019                  HTMLDOC 1.9.6                  htmldoc(1)