PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME

       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS

       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]
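
       The synopsis can be read as "options first, then the PDF file, then an
       optional output name". A minimal Python sketch of assembling such a
       command line follows. The function name and its parameters are
       hypothetical helpers introduced for illustration; only the -f, -l and
       -s flags and the argument order come from this page. The sketch only
       builds the argument list and does not run the program.

```python
from typing import List, Optional


def build_pdftohtml_command(pdf_file: str,
                            output_name: Optional[str] = None,
                            first_page: Optional[int] = None,
                            last_page: Optional[int] = None,
                            single_file: bool = False) -> List[str]:
    """Assemble an argv list in the order given by the synopsis.

    Hypothetical helper: only flags listed under OPTIONS are used.
    """
    cmd = ["pdftohtml"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to print
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to print
    if single_file:
        cmd.append("-s")                 # single HTML with all pages
    cmd.append(pdf_file)                 # required <PDF-file>
    if output_name is not None:
        cmd.append(output_name)          # optional output name
    return cmd


print(build_pdftohtml_command("report.pdf", first_page=2, last_page=5,
                              single_file=True))
# ['pdftohtml', '-f', '2', '-l', '5', '-s', 'report.pdf']
```

       The resulting list could be handed to subprocess.run() on a system
       where pdftohtml is installed.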

DESCRIPTION

       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory.

OPTIONS

       A summary of options is included below.

       -h, -help
              Show summary of options.

       -f <int>
              first page to print

       -l <int>
              last page to print

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate single HTML that includes all pages

       -dataurls
              use data URLs instead of external images in HTML. Not available
              on all platforms.

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -noroundcoord
              do not round coordinates (with XML output only)

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output (-c) is selected but -fmt is not specified, -fmt png is
              assumed.

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold, as a percentage (default 10).
              A word break occurs when the distance between two adjacent
              characters is greater than this percentage of the character
              height.

       -fontfullname
              outputs the font name without any substitutions.
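
       The -wbt rule above amounts to a simple comparison. The following
       Python sketch illustrates the arithmetic only; it is not taken from the
       pdftohtml source, and the function name, parameter names and the choice
       of units are assumptions made for illustration.

```python
def is_word_break(gap: float, char_height: float,
                  wbt_percent: float = 10.0) -> bool:
    """Return True when the gap exceeds wbt_percent of the character height.

    Illustrative only: gap and char_height are assumed to share one unit.
    """
    threshold = char_height * wbt_percent / 100.0
    return gap > threshold


# With the default of 10, text 12 units high breaks at gaps above 1.2 units.
print(is_word_break(1.5, 12.0))        # True
print(is_word_break(1.0, 12.0))        # False
# Raising the threshold to 20 moves the cutoff to 2.4 units.
print(is_word_break(1.5, 12.0, 20.0))  # False
```

       Under this reading, a larger -wbt value tolerates wider gaps between
       characters before they are treated as separate words.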

AUTHOR

       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits greatly from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>,
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO

       pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1), pdfseparate(1), pdfsig(1),
       pdfunite(1)

                                                                  PDFTOHTML(1)