htmldoc(1)                      Michael R Sweet                     htmldoc(1)

NAME

       htmldoc - convert HTML source files into HTML, PostScript, or PDF.

SYNOPSIS

       htmldoc [options] filename1.{html,md} [ ... filenameN.{html,md} ]

       htmldoc [options] -

       htmldoc [filename.book]

DESCRIPTION

       Htmldoc(1) converts HTML and Markdown source files into indexed HTML,
       PostScript, or Portable Document Format (PDF) files that can be viewed
       online or printed. With no options, an HTML document is produced on
       stdout.

       The second form of htmldoc reads HTML source from stdin, which allows
       you to use htmldoc as a filter.

       The third form of htmldoc launches a graphical interface that allows
       you to change options and generate documents interactively.
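
       The second form (reading from stdin) can be wrapped in a small shell
       function. This is a minimal sketch, not part of htmldoc: the function
       name html_to_pdf and the file names page.html and page.pdf are
       placeholders, and it assumes that PDF output, like the default HTML
       output, is written to stdout when no -f or -d option is given.

```shell
# Sketch: run htmldoc as a filter (second SYNOPSIS form).
# HTML arrives on stdin and the generated PDF leaves on stdout.
html_to_pdf() {
    # --webpage  treat the input as an unstructured page
    # -t pdf     select PDF output
    # -          read the HTML source from stdin
    htmldoc --webpage -t pdf -
}

# Example use, only attempted when htmldoc and the input file exist.
if command -v htmldoc >/dev/null 2>&1 && [ -r page.html ]; then
    html_to_pdf < page.html > page.pdf
fi
```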

COMMON MISTAKES

       There are two types of HTML files: structured documents using headings
       (H1, H2, etc.), which htmldoc calls "books", and unstructured documents
       that do not use headings, which htmldoc calls "web pages".

       A very common mistake is to try converting a web page using:
           htmldoc -f filename.pdf filename.html
       which will likely produce a PDF file with no pages. To convert web
       page files you must use the --webpage or --continuous option on the
       command line, or choose Web Page or Continuous in the Input tab of the
       GUI.
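
       One way to avoid the mistake above is to route every web page
       conversion through a helper that always passes one of the two
       options. The following is a sketch under that assumption; the name
       pages_to_pdf and its argument order are hypothetical.

```shell
# Sketch: convert unstructured web pages, always passing --webpage or
# --continuous so the PDF is not left without pages.
# $1 = "webpage" (page break between files) or "continuous" (no breaks)
# $2 = output PDF file; remaining arguments = input HTML files or URLs
pages_to_pdf() {
    mode=$1 out=$2
    shift 2
    case $mode in
        webpage|continuous)
            htmldoc "--$mode" -f "$out" "$@"
            ;;
        *)
            echo "mode must be webpage or continuous" >&2
            return 2
            ;;
    esac
}

# Example use with the file names from the text above.
if command -v htmldoc >/dev/null 2>&1 && [ -r filename.html ]; then
    pages_to_pdf webpage filename.pdf filename.html
fi
```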

OPTIONS

       The following command-line options are supported by htmldoc:

       --batch filename.book
            Generates the specified book file without opening the GUI.

       --bodycolor color
            Specifies the background color for all pages.

       --bodyfont {courier,helvetica,monospace,sans,serif,times}

       --textfont {courier,helvetica,monospace,sans,serif,times}
            Specifies the default typeface for all normal text.

       --bodyimage filename
            Specifies the background image that is tiled on all pages.

       --book
            Specifies that the HTML sources are structured (headings,
            chapters, etc.).

       --bottom margin
            Specifies the bottom margin in points (no suffix or ##pt), inches
            (##in), centimeters (##cm), or millimeters (##mm).

       --charset {cp-nnnn,iso-8859-1,...,iso-8859-15,utf-8}
            Specifies the character set to use for the output. Note: UTF-8
            support is limited to the first 128 Unicode characters that are
            found in the input.

       --color
            Specifies that PostScript or PDF output should be in color.

       --continuous
            Specifies that the HTML sources are unstructured (plain web
            pages). No page breaks are inserted between files or URLs in the
            output.

       --datadir directory
            Specifies the location of the htmldoc data files, usually
            /usr/share/htmldoc or C:/Program Files/HTMLDOC.

       --duplex
            Specifies that the output should be formatted for double-sided
            printing.

       --effectduration {0.1...10.0}
            Specifies the duration in seconds of PDF page transition effects.

       --embedfonts
            Specifies that fonts should be embedded in PDF and PostScript
            output.

       --encryption
            Enables encryption of PDF files.

       --fontsize size
            Specifies the default font size for body text.

       --fontspacing spacing
            Specifies the default line spacing for body text. The line spacing
            is a multiplier for the font size, so a value of 1.2 will provide
            an additional 20% of space between the lines.

       --footer fff
            Sets the page footer to use on body pages. See the HEADER/FOOTER
            FORMATS section below.

       --format format

       -t format
            Specifies the output format: epub, html, htmlsep (separate HTML
            files for each heading in the table-of-contents), ps or ps2
            (PostScript Level 2), ps1 (PostScript Level 1), ps3 (PostScript
            Level 3), pdf11 (PDF 1.1/Acrobat 2.0), pdf12 (PDF 1.2/Acrobat
            3.0), pdf or pdf13 (PDF 1.3/Acrobat 4.0), or pdf14 (PDF
            1.4/Acrobat 5.0).

       --gray
            Specifies that PostScript or PDF output should be grayscale.

       --header fff
            Sets the page header to use on body pages. See the HEADER/FOOTER
            FORMATS section below.

       --header1 fff
            Sets the page header to use on the first body/chapter page. See
            the HEADER/FOOTER FORMATS section below.

       --headfootfont font
            Sets the font to use on headers and footers.

       --headfootsize size
            Sets the size of the font to use on headers and footers.

       --headingfont typeface
            Sets the typeface to use for headings.

       --help
            Displays a summary of command-line options.

       --helpdir directory
            Specifies the location of the htmldoc online help files, usually
            /usr/share/doc/htmldoc or C:/Program Files/HTMLDOC/DOC.

       --hfimageN filename
            Specifies an image (numbered from 1 to 10) to be used in the
            header or footer in a PostScript or PDF document.

       --jpeg[=quality]
            Sets the JPEG compression level to use for large images. A value
            of 0 disables JPEG compression.

       --left margin
            Specifies the left margin in points (no suffix or ##pt), inches
            (##in), centimeters (##cm), or millimeters (##mm).

       --linkcolor color
            Sets the color of links.

       --links
            Enables generation of links in PDF files (default).

       --linkstyle {plain,underline}
            Sets the style of links.

       --logoimage filename
            Specifies an image to be used as a logo in the header or footer in
            a PostScript or PDF document, and in the navigation bar of an HTML
            document. Note that you need to use the --header, --header1,
            and/or --footer options with the l parameter, or use the
            corresponding HTML page comments, to display the logo image in the
            header or footer.

       --no-compression
            Disables compression of PostScript or PDF files.

       --no-duplex
            Disables double-sided printing.

       --no-embedfonts
            Specifies that fonts should not be embedded in PDF and PostScript
            output.

       --no-encryption
            Disables document encryption.

       --no-jpeg
            Disables JPEG compression of large images.

       --no-links
            Disables generation of links in a PDF document.

       --no-numbered
            Disables automatic heading numbering.

       --no-pscommands
            Disables generation of PostScript setpagedevice commands.

       --no-strict
            Disables strict HTML input checking.

       --no-title
            Disables generation of a title page.

       --no-toc
            Disables generation of a table of contents.

       --numbered
            Numbers all headings in a document.

       --nup pages
            Sets the number of pages that are placed on each output page.
            Valid values are 1, 2, 4, 6, 9, and 16.

       --outdir directory

       -d directory
            Specifies that output should be sent to a directory in multiple
            files. (Not compatible with PDF output.)

       --outfile filename

       -f filename
            Specifies that output should be sent to a single file.

       --owner-password password
            Sets the owner password for encrypted PDF files.

       --pageduration {1.0...60.0}
            Sets the view duration of a page in a PDF document.

       --pageeffect effect
            Specifies the page transition effect for all pages; this attribute
            is ignored by all Adobe PDF viewers.

       --pagelayout {single,one,twoleft,tworight}
            Specifies the initial layout of pages for a PDF file.

       --pagemode {document,outlines,fullscreen}
            Specifies the initial viewing mode for a PDF file.

       --path
            Specifies a search path for files in a document.

       --permissions permission[,permission,...]
            Specifies document permissions for encrypted PDF files. The
            following permissions are understood: all, none, annotate,
            no-annotate, copy, no-copy, modify, no-modify, print, and
            no-print. Separate multiple permissions with commas.

       --pscommands
            Specifies that PostScript setpagedevice commands should be
            included in the output.

       --quiet
            Suppresses all messages, even error messages.

       --referer url
            Specifies the URL that is passed in the Referer: field of HTTP
            requests.

       --right margin
            Specifies the right margin in points (no suffix or ##pt), inches
            (##in), centimeters (##cm), or millimeters (##mm).

       --size pagesize
            Specifies the page size using a standard name or in points (no
            suffix or ##x##pt), inches (##x##in), centimeters (##x##cm), or
            millimeters (##x##mm). The standard sizes that are currently
            recognized are "letter" (8.5x11in), "legal" (8.5x14in), "a4"
            (210x297mm), and "universal" (8.27x11in).

       --strict
            Enables strict HTML input checking.

       --textcolor color
            Specifies the default color of all text.

       --title
            Enables the generation of a title page.

       --titlefile filename

       --titleimage filename
            Specifies the file to use for the title page. If the file is an
            image then the title page is automatically generated using the
            document metadata and title image.

       --tocfooter fff
            Sets the page footer to use on table-of-contents pages. See the
            HEADER/FOOTER FORMATS section below.

       --tocheader fff
            Sets the page header to use on table-of-contents pages. See the
            HEADER/FOOTER FORMATS section below.

       --toclevels levels
            Sets the number of levels in the table-of-contents.

       --toctitle string
            Sets the title for the table-of-contents.

       --top margin
            Specifies the top margin in points (no suffix or ##pt), inches
            (##in), centimeters (##cm), or millimeters (##mm).

       --user-password password
            Specifies the user password for encryption of PDF files.

       --verbose

       -v   Provides verbose messages.

       --version
            Displays the current version number.

       --webpage
            Specifies that the HTML sources are unstructured (plain web
            pages). A page break is inserted between each file or URL in the
            output.
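
       Several of the options above are typically combined in one invocation.
       The following sketch shows one plausible combination for a structured
       document: numbered headings, a two-level table of contents, A4 paper,
       2 cm side margins, and duplex layout. The function name make_book and
       the file names are hypothetical, and the chosen values are only
       illustrative.

```shell
# Sketch: build a numbered, duplex A4 book from structured HTML files.
# $1 = output file; remaining arguments = input HTML files
make_book() {
    out=$1
    shift
    htmldoc --book --numbered --toclevels 2 \
            --size a4 --left 2cm --right 2cm --duplex \
            -f "$out" "$@"
}

# Example use, only attempted when htmldoc and the input file exist.
if command -v htmldoc >/dev/null 2>&1 && [ -r chapter1.html ]; then
    make_book book.pdf chapter1.html
fi
```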

EXIT STATUS

       Htmldoc returns a non-zero exit status if any errors are seen, zero
       otherwise.
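
       A calling script can act on the exit status described above. The
       following is a minimal sketch; the function name build_pdf and its
       messages are hypothetical, and the use of --webpage assumes
       unstructured input.

```shell
# Sketch: stop a build step when htmldoc reports an error.
# $1 = output PDF file; remaining arguments = input files or URLs
build_pdf() {
    out=$1
    shift
    if htmldoc --webpage -f "$out" "$@"; then
        echo "wrote $out"
    else
        # In the else branch, $? still holds htmldoc's exit status.
        status=$?
        echo "htmldoc failed with exit status $status" >&2
        return "$status"
    fi
}

# Typical call from a larger script:
#   build_pdf report.pdf report.html || exit 1
```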

HEADER/FOOTER FORMATS

       The header and footer of each page can contain up to three preformatted
       values. These values are specified using a single character for the
       left, middle, and right of the page, resulting in the fff notation
       shown previously.

       Each character can be one of the following:

       .    blank

       /    n/N arabic page numbers (1/3, 2/3, 3/3)

       :    c/C arabic chapter page numbers (1/2, 2/2, 1/4, 2/4, ...)

       1    arabic numbers (1, 2, 3, ...)

       a    lowercase letters

       A    uppercase letters

       c    current chapter heading

       C    current chapter page number (arabic)

       d    current date

       D    current date and time

       h    current heading

       i    lowercase roman numerals

       I    uppercase roman numerals

       l    logo image

       t    title text

       T    current time

       u    current filename or URL
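
       Because each fff value is exactly three characters drawn from the list
       above, a script can check a value before passing it to htmldoc. The
       following is a sketch only; valid_fff and book_with_header_footer are
       hypothetical helper names, and htmldoc itself may accept or reject
       values differently.

```shell
# Sketch: accept a header/footer string only if it is exactly three
# characters, each one taken from the list of format characters.
valid_fff() {
    case $1 in
        [./:1aAcCdDhiIltTu][./:1aAcCdDhiIltTu][./:1aAcCdDhiIltTu])
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# $1 = header, $2 = footer, $3 = output file; remaining = input files
book_with_header_footer() {
    valid_fff "$1" || { echo "bad header: $1" >&2; return 2; }
    valid_fff "$2" || { echo "bad footer: $2" >&2; return 2; }
    header=$1 footer=$2 out=$3
    shift 3
    htmldoc --book --header "$header" --footer "$footer" -f "$out" "$@"
}

# Example: title text at the left of the header, n/N page numbers
# centered in the footer:
#   book_with_header_footer t.. ./. manual.pdf manual.html
```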

ENVIRONMENT

       HTMLDOC looks for several environment variables which can override the
       default directories, display additional debugging information, and
       disable CGI mode:

       HTMLDOC_DATA
            This environment variable specifies the location of htmldoc's data
            and fonts directories, normally /usr/share/htmldoc or C:/Program
            Files/HTMLDOC.

       HTMLDOC_DEBUG
            This environment variable enables debugging information that is
            sent to stderr. The value is a list of any of the following
            keywords separated by spaces: "all", "links", "memory",
            "remotebytes", "table", "tempfiles", and/or "timing".

       HTMLDOC_HELP
            This environment variable specifies the location of htmldoc's
            documentation directory, normally /usr/share/doc/htmldoc or
            C:/Program Files/HTMLDOC/doc.

       HTMLDOC_NOCGI
            This environment variable, when set (the value doesn't matter),
            disables CGI mode. It is most useful when htmldoc is invoked on a
            web server from a scripting language or another program.
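
       The variables above can be set for a single invocation by prefixing
       the command. The following is a sketch, assuming a script that may run
       under a web server; run_htmldoc, the chosen debug keywords, and the
       file names are illustrative only.

```shell
# Sketch: call htmldoc with CGI mode disabled and some debugging enabled.
run_htmldoc() {
    # HTMLDOC_NOCGI: any value disables CGI mode.
    # HTMLDOC_DEBUG: space-separated keywords; messages go to stderr.
    HTMLDOC_NOCGI=1 HTMLDOC_DEBUG="links timing" htmldoc "$@"
}

# Example use: keep the debugging output in a separate log file.
if command -v htmldoc >/dev/null 2>&1 && [ -r page.html ]; then
    run_htmldoc --webpage -f page.pdf page.html 2> htmldoc-debug.log
fi
```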

EXAMPLES

       Create a PDF file from a web site:
           htmldoc --webpage -f example.pdf http://www.example.com/
       Create a PostScript book from a directory of HTML files:
           htmldoc --book -t ps -f example.ps *.html

SEE ALSO

       HTMLDOC Users Manual

       https://michaelrsweet.github.io/htmldoc

AUTHOR

       Michael R Sweet

       HTMLDOC is copyright © 1997-2019 by Michael R Sweet.

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License version 2 as
       published by the Free Software Foundation.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
       Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program; if not, write to the Free Software Foundation, Inc.,
       59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

28 August 2019                   HTMLDOC 1.9.6                      htmldoc(1)