PDFTOHTML(1)                General Commands Manual               PDFTOHTML(1)

NAME
       pdftohtml - program to convert PDF files into HTML, XML and PNG images

SYNOPSIS
       pdftohtml [options] <PDF-file> [<HTML-file> <XML-file>]

DESCRIPTION
       This manual page briefly documents the pdftohtml command. It was
       written for the Debian GNU/Linux distribution because the original
       program does not have a manual page.

       pdftohtml is a program that converts PDF documents into HTML. It
       generates its output in the current working directory.

OPTIONS
       A summary of options is included below.

       -h, -help
              Show summary of options.

       -f <int>
              first page to print

       -l <int>
              last page to print

       -q     do not print any messages or errors

       -v     print copyright and version info

       -p     exchange .pdf links with .html

       -c     generate complex output

       -s     generate single HTML that includes all pages

       -dataurls
              use data URLs instead of external images in HTML. Not available
              on all platforms.

       -i     ignore images

       -noframes
              generate no frames. Not supported in complex output mode.

       -stdout
              use standard output

       -zoom <fp>
              zoom the PDF document (default 1.5)

       -xml   output for XML post-processing

       -noroundcoord
              do not round coordinates (with XML output only)

       -enc <string>
              output text encoding name

       -opw <string>
              owner password (for encrypted files)

       -upw <string>
              user password (for encrypted files)

       -hidden
              force hidden text extraction

       -fmt   image file format for Splash output (png or jpg). If complex
              output (-c) is selected but -fmt is not specified, -fmt png is
              assumed.

       -nomerge
              do not merge paragraphs

       -nodrm override document DRM settings

       -wbt <fp>
              adjust the word break threshold percent. Default is 10. A word
              break occurs when the distance between two adjacent characters
              is greater than this percentage of the character height.

       -fontfullname
              output the font name without any substitutions.

89
AUTHOR
       Pdftohtml was developed by Gueorgui Ovtcharov and Rainer Dorsch. It is
       based on, and benefits a lot from, Derek Noonburg's xpdf package.

       This manual page was written by Søren Boll Overgaard <boll@debian.org>,
       for the Debian GNU/Linux system (but may be used by others).

SEE ALSO
       pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1),
       pdftoppm(1), pdftops(1), pdftotext(1), pdfseparate(1), pdfsig(1),
       pdfunite(1)

                                                                  PDFTOHTML(1)