PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME
       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS
       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]

DESCRIPTION
       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory.

OPTIONS
       A summary of options is included below.

       -h, -help
              Show summary of options.

       -f <int>
              first page to convert

       -l <int>
              last page to convert

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate single HTML that includes all pages

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -noRoundedCoordinates
              do not round coordinates (with XML output only)

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output (-c) is selected but -fmt is not specified, -fmt png is
              assumed

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold, as a percentage. The default
              is 10. A word break occurs when the distance between two
              adjacent characters is greater than this percentage of the
              character height.

       -fontfullname
              outputs the font name without any substitutions.

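       The options above combine on a single command line, placed before
       the PDF file name as shown in the synopsis. The following Python
       sketch only assembles such a command line and does not run
       anything. The helper name build_pdftohtml_args, its parameters, and
       the file names report.pdf and out are hypothetical and chosen for
       illustration; only the option spellings come from this page.

```python
from typing import List, Optional


def build_pdftohtml_args(
    pdf_path: str,
    output_base: Optional[str] = None,
    first_page: Optional[int] = None,
    last_page: Optional[int] = None,
    xml: bool = False,
    ignore_images: bool = False,
    zoom: Optional[float] = None,
) -> List[str]:
    """Return an argv list for pdftohtml (hypothetical helper, runs nothing)."""
    args = ["pdftohtml"]
    if first_page is not None:
        args += ["-f", str(first_page)]   # -f <int>: first page
    if last_page is not None:
        args += ["-l", str(last_page)]    # -l <int>: last page
    if xml:
        args.append("-xml")               # output for XML post-processing
    if ignore_images:
        args.append("-i")                 # ignore images
    if zoom is not None:
        args += ["-zoom", str(zoom)]      # -zoom <fp>: zoom factor
    args.append(pdf_path)                 # options precede the PDF file
    if output_base is not None:
        args.append(output_base)          # optional output file name
    return args


# Example: pages 2 to 5 of a hypothetical report.pdf, as XML.
print(build_pdftohtml_args("report.pdf", "out", first_page=2, last_page=5, xml=True))
```

       On a system where pdftohtml is installed, the resulting list could
       be passed to subprocess.run(args, check=True). Building a list
       rather than a single string avoids shell quoting problems with file
       names that contain spaces.
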
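       The word break rule given for -wbt can be restated as a small
       comparison. The Python sketch below is only an illustration of the
       rule as worded on this page, not code taken from pdftohtml; the
       function name is_word_break and its parameter names are
       assumptions, and both lengths are assumed to be in the same unit.

```python
def is_word_break(gap: float, char_height: float, wbt_percent: float = 10.0) -> bool:
    """Restate the -wbt rule: break when the gap between two adjacent
    characters exceeds wbt_percent percent of the character height.

    Hypothetical helper; gap and char_height must share one unit.
    """
    threshold = char_height * wbt_percent / 100.0
    return gap > threshold


# With the default of 10 and a character height of 12, the threshold is 1.2.
print(is_word_break(1.5, 12.0))        # gap above the threshold
print(is_word_break(1.0, 12.0))        # gap below the threshold
print(is_word_break(1.5, 12.0, 20.0))  # a larger -wbt raises the threshold to 2.4
```

       Under this reading, raising -wbt makes pdftohtml less eager to split
       words, and lowering it makes splits more frequent.
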
AUTHOR
       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits a lot from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>,
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO
       pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1), pdfseparate(1), pdfsig(1),
       pdfunite(1)

                                                                  PDFTOHTML(1)