PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME

       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS

       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]

DESCRIPTION

       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory. If PDF-file is
       '-', it reads the PDF file from stdin.
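
       As an illustration of the invocation described above, the following
       Python sketch assembles the argument list and, only if pdftohtml is
       found on the PATH, feeds PDF data on stdin by passing '-' as the
       PDF-file argument. The helper names and the file names are
       hypothetical, and the sketch is not part of pdftohtml itself.

```python
import shutil
import subprocess


def build_command(pdf_file, output_name=None):
    """Build an argument list of the form: pdftohtml <PDF-file> [<HTML-file>]."""
    argv = ["pdftohtml", pdf_file]
    if output_name is not None:
        argv.append(output_name)
    return argv


def convert_from_stdin(pdf_bytes, output_name):
    """Pass '-' as PDF-file so that pdftohtml reads the PDF from stdin."""
    argv = build_command("-", output_name)
    if shutil.which("pdftohtml") is None:
        return None  # pdftohtml is not installed, so nothing is run
    # Output files are written to the current working directory.
    return subprocess.run(argv, input=pdf_bytes, check=False)
```

       For example, build_command("report.pdf") yields the arguments for a
       plain conversion of a hypothetical file named report.pdf.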

OPTIONS

       A summary of options is included below.

       -h, -help
              Show summary of options.

       -f <int>
              first page to print

       -l <int>
              last page to print

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate single HTML that includes all pages

       -dataurls
              use data URLs instead of external images in HTML. Not available
              on all platforms.

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -noroundcoord
              do not round coordinates (with XML output only)

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output is selected but -fmt is not specified, -fmt png is
              assumed.

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold percent. Default is 10. A word
              break occurs when the distance between two adjacent characters
              is greater than this percentage of the character height.

       -fontfullname
              output the font name without any substitutions.
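
       To show how several of the options listed above combine on one
       command line, the following Python sketch builds an argument list
       from keyword parameters. The function name, parameter names and the
       file name report.pdf are hypothetical. Only flags documented in this
       manual page are used.

```python
def build_argv(pdf_file, first_page=None, last_page=None,
               single_file=False, ignore_images=False,
               xml=False, zoom=None):
    """Assemble a pdftohtml argument list from a few documented options."""
    argv = ["pdftohtml"]
    if first_page is not None:
        argv += ["-f", str(first_page)]   # first page to print
    if last_page is not None:
        argv += ["-l", str(last_page)]    # last page to print
    if single_file:
        argv.append("-s")                 # single HTML with all pages
    if ignore_images:
        argv.append("-i")                 # ignore images
    if xml:
        argv.append("-xml")               # output for XML post-processing
    if zoom is not None:
        argv += ["-zoom", str(zoom)]      # zoom factor
    argv.append(pdf_file)                 # options precede the PDF file
    return argv


print(build_argv("report.pdf", first_page=2, last_page=5,
                 single_file=True, ignore_images=True))
```

       The resulting list can be handed to a process launcher such as
       subprocess.run, which avoids shell quoting issues in file names.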
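
       The word break rule given for -wbt can be read as a simple
       comparison. The following Python sketch restates that rule as
       described in this page. It is not taken from the pdftohtml source,
       and the function and parameter names are hypothetical.

```python
def is_word_break(gap, char_height, wbt_percent=10.0):
    """Return True when gap exceeds wbt_percent percent of char_height.

    gap and char_height must be expressed in the same units.
    """
    threshold = char_height * wbt_percent / 100.0
    return gap > threshold
```

       With the default of 10 and a character height of 12 units, the
       threshold is 1.2 units, so a gap of 1.5 units counts as a word break
       and a gap of 1.0 does not. Raising the value to 20 moves the
       threshold to 2.4 units, so the same 1.5 unit gap no longer breaks
       the word.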

AUTHOR

       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits a lot from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>,
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO

       pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1), pdfseparate(1), pdfsig(1),
       pdfunite(1)

                                                                  PDFTOHTML(1)