PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME

       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS

       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]

DESCRIPTION

       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory.
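       Because output lands in the current working directory, a caller that
       wants the files elsewhere can change the working directory for the
       child process. The following is a minimal Python sketch, assuming
       pdftohtml is installed and on PATH. The helper names build_command
       and convert, and the default base name "out", are hypothetical and
       not part of pdftohtml.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_command(pdf_file: Path, html_base: str,
                  options: Sequence[str] = ()) -> List[str]:
    """Assemble argv in SYNOPSIS order: pdftohtml [options] <PDF-file> <HTML-file>."""
    # Use an absolute input path so it stays valid after the working
    # directory of the child process changes.
    return ["pdftohtml", *options, str(pdf_file.resolve()), html_base]


def convert(pdf_file: Path, out_dir: Path, html_base: str = "out") -> None:
    """Run pdftohtml with out_dir as its working directory (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # html_base is a relative name, so it is resolved against out_dir.
    subprocess.run(build_command(pdf_file, html_base), cwd=out_dir, check=True)
```

       For example, convert(Path("report.pdf"), Path("html")) would run the
       conversion inside the html directory; report.pdf is a placeholder
       file name.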

OPTIONS

       A summary of options is included below.

       -h, -help
              Show summary of options.

       -f <int>
              first page to print

       -l <int>
              last page to print

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate single HTML that includes all pages

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output (-c) is selected but -fmt is not specified, -fmt png is
              assumed.

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold, given as a percentage. Default
              is 10. A word break occurs when the distance between two
              adjacent characters is greater than this percentage of the
              character height.

       -fontfullname
              output the font name without any substitutions.
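       The -wbt rule described above can be sketched in a few lines of
       Python. This only illustrates the rule as this page states it and is
       not pdftohtml's actual implementation; the Glyph layout and the
       function names is_word_break and group_words are assumptions made
       for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical glyph record: (character, left edge, right edge).
Glyph = Tuple[str, float, float]


def is_word_break(gap: float, char_height: float,
                  wbt_percent: float = 10.0) -> bool:
    """True when the gap exceeds wbt_percent percent of the character height."""
    return gap > char_height * wbt_percent / 100.0


def group_words(glyphs: List[Glyph], char_height: float,
                wbt_percent: float = 10.0) -> List[str]:
    """Split a left-to-right run of glyphs into words using the threshold."""
    words: List[str] = []
    current = ""
    prev_right: Optional[float] = None
    for char, left, right in glyphs:
        # Compare the gap to the previous glyph against the threshold.
        if prev_right is not None and is_word_break(
                left - prev_right, char_height, wbt_percent):
            words.append(current)
            current = ""
        current += char
        prev_right = right
    if current:
        words.append(current)
    return words
```

       With a character height of 12 and the default of 10, the threshold
       is 1.2 units, so a gap of 1.5 starts a new word while a gap of 1.0
       does not. Raising the value (for example, -wbt 20) makes breaks less
       frequent.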

AUTHOR

       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits greatly from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO

       pdfdetach(1),  pdffonts(1),  pdfimages(1),  pdfinfo(1),  pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1)

                                                                  PDFTOHTML(1)