PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME

       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS

       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]

DESCRIPTION

       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory.

OPTIONS

       A summary of options is included below.

       -h, -help
              show summary of options

       -f <int>
              first page to print

       -l <int>
              last page to print

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate single HTML that includes all pages

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output (-c) is selected but -fmt is not specified, -fmt png
              is assumed.

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold percent. Default is 10. A
              word break occurs when the distance between two adjacent
              characters is greater than this percentage of the character
              height.

       -fontfullname
              output the font name without any substitutions
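
       The word break rule described for -wbt can be illustrated with a
       small sketch. This is not taken from the pdftohtml source; the
       function name is_word_break and its parameters are hypothetical and
       only restate the rule as written above: a break occurs when the gap
       between two adjacent characters is greater than the given percentage
       of the character height.

```python
def is_word_break(gap: float, char_height: float, wbt_percent: float = 10.0) -> bool:
    """Illustrative restatement of the -wbt rule (hypothetical helper).

    gap          distance between two adjacent characters
    char_height  height of the characters, in the same unit as gap
    wbt_percent  threshold as a percentage (the documented default is 10)
    """
    # The threshold is wbt_percent percent of the character height.
    threshold = char_height * wbt_percent / 100.0
    # "greater than": a gap exactly equal to the threshold is not a break.
    return gap > threshold
```

       Under this reading, with 12-unit-high characters and the default of
       10, a gap of 1.5 units counts as a word break while a gap of 1.0
       does not; raising -wbt to 20 would also treat the 1.5-unit gap as
       part of the same word.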
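
EXAMPLES

       The following is a minimal sketch of how a script might assemble a
       pdftohtml command line from the options listed above. The helper
       build_pdftohtml_args and its parameter names are assumptions made
       for illustration; only the flags themselves (-f, -l, -c, -s, -i,
       -noframes, -zoom) come from this page. The sketch builds the
       argument list and does not run the program.

```python
from typing import List, Optional


def build_pdftohtml_args(
    pdf_file: str,
    first_page: Optional[int] = None,
    last_page: Optional[int] = None,
    complex_output: bool = False,
    single_html: bool = False,
    ignore_images: bool = False,
    no_frames: bool = False,
    zoom: Optional[float] = None,
) -> List[str]:
    """Build an argument list of the form: pdftohtml [options] <PDF-file>.

    Hypothetical helper; it only arranges flags documented under OPTIONS.
    """
    # -noframes is documented as unsupported in complex output mode.
    if complex_output and no_frames:
        raise ValueError("-noframes is not supported together with -c")

    args = ["pdftohtml"]
    if first_page is not None:
        args += ["-f", str(first_page)]   # first page to print
    if last_page is not None:
        args += ["-l", str(last_page)]    # last page to print
    if complex_output:
        args.append("-c")                 # generate complex output
    if single_html:
        args.append("-s")                 # single HTML with all pages
    if ignore_images:
        args.append("-i")                 # ignore images
    if no_frames:
        args.append("-noframes")          # generate no frames
    if zoom is not None:
        args += ["-zoom", str(zoom)]      # zoom factor
    args.append(pdf_file)                 # the input PDF comes after the options
    return args
```

       For example, build_pdftohtml_args("report.pdf", first_page=2,
       last_page=5, single_html=True) yields the list for
       "pdftohtml -f 2 -l 5 -s report.pdf". On a system where pdftohtml is
       installed, the list could be passed to subprocess.run(args,
       check=True); the output is then written to the current working
       directory, as noted in DESCRIPTION.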

AUTHOR

       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits greatly from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>,
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO

       pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1), pdfseparate(1), pdfsig(1),
       pdfunite(1)

                                                                  PDFTOHTML(1)