PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME

       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS

       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]

DESCRIPTION

       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory.
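
       As an illustration of the synopsis, the following Python sketch
       assembles a pdftohtml argument list from a few of the documented
       options (-f, -l, -s, -xml). The helper name build_pdftohtml_command
       and the file names are hypothetical, and the sketch only builds and
       prints the command; it does not run it.

```python
def build_pdftohtml_command(pdf_path, output_name=None, first_page=None,
                            last_page=None, single_file=False, xml=False):
    """Return an argv list for pdftohtml (hypothetical helper, not part of pdftohtml)."""
    cmd = ["pdftohtml"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # -f <int>: first page
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # -l <int>: last page
    if single_file:
        cmd.append("-s")                 # -s: single HTML with all pages
    if xml:
        cmd.append("-xml")               # -xml: output for XML post-processing
    cmd.append(pdf_path)                 # <PDF-file> is required
    if output_name is not None:
        cmd.append(output_name)          # optional output file from the synopsis
    return cmd


# Example with made-up file names.
print(" ".join(build_pdftohtml_command("report.pdf", "report",
                                       first_page=1, last_page=3,
                                       single_file=True)))
# prints: pdftohtml -f 1 -l 3 -s report.pdf report
```

       The returned list can be passed to subprocess.run() on a system
       where pdftohtml is installed and on the PATH.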

OPTIONS

       A summary of options is included below.

       -h, -help
              Show summary of options.

       -f <int>
              first page to print

       -l <int>
              last page to print

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate a single HTML file that includes all pages

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -noRoundedCoordinates
              do not round coordinates (with XML output only)

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output (-c) is selected but -fmt is not specified, -fmt png
              is assumed.

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold percentage. Default is 10. A
              word break occurs when the distance between two adjacent
              characters is greater than this percentage of the character
              height.

       -fontfullname
              output the font name without any substitutions
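
       The -wbt rule can be read as a simple comparison. The Python sketch
       below is one interpretation of that sentence, not the program's
       actual source; the function name is_word_break and the sample
       values are assumptions chosen for illustration, with both lengths
       given in the same arbitrary unit.

```python
def is_word_break(gap, char_height, wbt_percent=10.0):
    """Return True if the gap between two adjacent characters exceeds
    wbt_percent percent of the character height (illustrative reading
    of the -wbt description; the default 10 is taken from that text)."""
    threshold = char_height * wbt_percent / 100.0
    return gap > threshold


# With a character height of 12 and the default 10 percent,
# the threshold is 1.2 units.
print(is_word_break(1.0, 12.0))        # False: 1.0 is not greater than 1.2
print(is_word_break(1.5, 12.0))        # True: 1.5 is greater than 1.2
print(is_word_break(1.5, 12.0, 20.0))  # False: the threshold is now 2.4
```

       Under this reading, raising -wbt makes word breaks less frequent,
       and lowering it makes them more frequent.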

AUTHOR

       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits greatly from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>,
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO

       pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1), pdfseparate(1), pdfsig(1),
       pdfunite(1)