PDF2DJVU(1)                     pdf2djvu manual                    PDF2DJVU(1)

NAME
       pdf2djvu - creates DjVu files from PDF files

SYNOPSIS
       pdf2djvu [{-o | --output} output-djvu-file] [option...] pdf-file

       pdf2djvu {-i | --indirect} index-djvu-file [option...] pdf-file

       pdf2djvu {--version | --help | -h}

DESCRIPTION
       This program creates a DjVu file from the Portable Document Format file
       pdf-file.

OPTIONS
       pdf2djvu accepts the following options:

   Document type, file names
       -o, --output=output-djvu-file
           Generate a bundled multi-page document. Write the file into
           output-djvu-file instead of standard output.

       -i, --indirect=index-djvu-file
           Generate an indirect multi-page document. Use index-djvu-file as
           the index file name; put the component files into the same
           directory. The directory must exist and be writable.

       --pageid-template=template
           Specifies the naming scheme for page identifiers. Consult the
           “TEMPLATE LANGUAGE” section for the template language description.

           The default template is “p{page:04*}.djvu”.

           For portability reasons, page identifiers:

           ·   must consist only of lowercase ASCII letters, digits, _, +, -
               and dot,

           ·   cannot start with a dot,

           ·   cannot contain two consecutive dots,

           ·   must end with the .djvu or the .djv extension.

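The four portability rules for page identifiers can be checked mechanically. The following is a minimal sketch in Python, not code taken from pdf2djvu; the helper name is_portable_page_id is hypothetical and only restates the rules listed above.

```python
import re

# Hypothetical helper: restates the portability rules for page identifiers.
# Allowed characters: lowercase ASCII letters, digits, "_", "+", "-" and dot.
_ALLOWED = re.compile(r"[a-z0-9_+.-]+")


def is_portable_page_id(page_id: str) -> bool:
    """Return True if page_id satisfies all four rules."""
    if _ALLOWED.fullmatch(page_id) is None:
        return False          # empty, or contains a forbidden character
    if page_id.startswith("."):
        return False          # cannot start with a dot
    if ".." in page_id:
        return False          # cannot contain two consecutive dots
    return page_id.endswith((".djvu", ".djv"))


print(is_portable_page_id("p0001.djvu"))   # True
print(is_portable_page_id("Page1.djvu"))   # False (uppercase letter)
```

A check like this may be useful for validating a custom --pageid-template result before starting a long conversion.
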
       --pageid-prefix=prefix
           Equivalent to “--pageid-template=prefix{page:04*}.djvu”.

       --page-title-template=template
           Specifies the template for page titles. Consult the “TEMPLATE
           LANGUAGE” section for the template language description.

           The default is to set no page titles.

   Resolution, page size
       -d, --dpi=resolution
           Sets the desired resolution to resolution dots per inch. The
           default is 300 dpi. The allowed range is 72 ≤ resolution ≤ 6000.

       --media-box
           Use MediaBox to determine page size. CropBox is used by default.

       --page-size=widthxheight
           Sets the preferred page size to width pixels × height pixels.
           The actual page size may be altered in order to respect aspect
           ratio and DjVu limitations on resolution. (This option takes
           precedence over -d/--dpi.)

       --guess-dpi
           Try to guess native resolution by inspecting embedded images. Use
           with care.

   Image quality
       --bg-slices=n+...+n, --bg-slices=n,...,n
           Specifies the encoding quality of the IW44 background layer. This
           option is similar to the -slice option of c44. Consult the c44(1)
           manual page for details. The default is 72+11+10+10.

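The two spellings of --bg-slices differ in how the numbers are read. As c44(1) describes its -slice option, the plus-separated form gives the number of slices added by each chunk, while the comma-separated form gives cumulative totals. The sketch below assumes pdf2djvu follows the same convention; parse_bg_slices is a hypothetical helper, not part of pdf2djvu.

```python
from itertools import accumulate


def parse_bg_slices(spec: str) -> list[int]:
    """Return cumulative slice counts for either spelling of the option."""
    if "+" in spec:
        # Plus-separated: per-chunk increments, so sum them up as we go.
        return list(accumulate(int(n) for n in spec.split("+")))
    # Comma-separated: the values are already cumulative.
    return [int(n) for n in spec.split(",")]


print(parse_bg_slices("72+11+10+10"))   # [72, 83, 93, 103]
print(parse_bg_slices("72,83,93,103"))  # [72, 83, 93, 103]
```

Under that assumption, the default 72+11+10+10 could equally be written 72,83,93,103.
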
       --bg-subsample=n
           Specifies the background subsampling ratio. The default is 3. Valid
           values are integers between 1 and 12, inclusive.

       --fg-colors=default
           Try to preserve all the foreground layer colors. This is the
           default.

       --fg-colors=web
           Reduce foreground layer colors to the web palette (216 colors).
           This option is not recommended.

       --fg-colors=n
           Use GraphicsMagick to reduce number of distinct colors in the
           foreground layer to n. Valid values are integers between 1 and
           4080. This option is not recommended.

       --fg-colors=black
           Discard any color information from the foreground layer.

       --monochrome
           Render pages as monochrome bitmaps. With this option, --bg-... and
           --fg-... options are not respected.

       --loss-level=n
           Specifies the aggressiveness of the lossy compression. The default
           is 0 (lossless). Valid values are integers between 0 and 200,
           inclusive. This option is similar to the -losslevel option of cjb2;
           consult the cjb2(1) manual page for details. This option is
           respected only along with the --monochrome option.

       --lossy
           Synonym for --loss-level=100.

       --anti-alias
           Enable font and vector anti-aliasing. This option is not
           recommended.

   Extraction
       --no-metadata
           Don't extract the metadata.

           By default:

           ·   The following entries of the document information dictionary
               are extracted: Title, Author, Subject, Creator, Producer,
               CreationDate, ModDate. Timestamps are formatted according to
               RFC 3339[1], with date and time components separated by a
               single space.

           The XMP metadata is extracted (or created) and updated
           accordingly.

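To illustrate the timestamp layout described above (RFC 3339, with a single space between the date and the time), here is a small Python sketch. It only demonstrates the format; the helper name format_timestamp and the sample date are made up, and the exact precision and UTC-offset style pdf2djvu emits are not specified here.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp(moment: datetime) -> str:
    """Format an aware datetime as RFC 3339 with a space instead of 'T'."""
    return moment.isoformat(sep=" ", timespec="seconds")


# Sample value, chosen arbitrarily for the illustration.
sample = datetime(2010, 5, 24, 14, 30, 0, tzinfo=timezone(timedelta(hours=2)))
stamp = format_timestamp(sample)
print(stamp)  # 2010-05-24 14:30:00+02:00
```
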
       --verbatim-metadata
           Keep the original metadata intact.

       --no-outline
           Don't extract the document outline.

       --hyperlinks=border-avis
           Make hyperlink borders always visible.

           By default, a hyperlink border is visible only when the mouse is
           over the hyperlink.

       --hyperlinks=#RRGGBB
           Force the specified border color for hyperlinks.

       --no-hyperlinks, --hyperlinks=none
           Don't extract hyperlinks.

       --no-text
           Don't extract the text.

       --words
           Extract the text. Record the location of every word. This is the
           default.

       --lines
           Extract the text. Record the location of every line, rather than
           every word.

       --crop-text
           Extract no text outside the page boundary.

       --no-nfkc
           Don't NFKC[2]-normalize the text.

       --filter-text=command-line
           Filter the text through the command-line. The provided filter must
           preserve whitespace, control characters and decimal digits.

           This option implies --no-nfkc.

       -p, --pages=page-range
           Specifies pages to convert. page-range is a comma-separated list
           of sub-ranges. Each sub-range is either a single page (e.g. 17) or
           a contiguous range of pages (e.g. 37-42). Pages are numbered from
           1.

           The default is to convert all pages.

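The page-range syntax accepted by -p/--pages can be sketched as a small parser. This is an illustration in Python, not pdf2djvu's own code; parse_page_range is a hypothetical name, and rejecting out-of-bounds or reversed sub-ranges with an error is an assumption about reasonable behaviour rather than documented behaviour.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a comma-separated list of sub-ranges into page numbers."""
    pages = []
    for part in spec.split(","):
        first, dash, last = part.partition("-")
        low = int(first)
        high = int(last) if dash else low   # a single page is a range of one
        # Pages are numbered from 1; bounds checking is an assumption.
        if not 1 <= low <= high <= page_count:
            raise ValueError(f"invalid sub-range: {part!r}")
        pages.extend(range(low, high + 1))
    return pages


print(parse_page_range("17,37-42", page_count=50))
# [17, 37, 38, 39, 40, 41, 42]
```
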
   Performance
       -j, --jobs=n
           Use n threads to perform conversion. The default is to use one
           thread.

       -j0, --jobs=0
           Determine automatically how many threads to use to perform
           conversion.

   Verbosity, help
       -v, --verbose
           Display more informational messages while converting the file.

       -q, --quiet
           Don't display informational messages while converting the file.

       --version
           Output version information and exit.

       -h, --help
           Display help and exit.

ENVIRONMENT
       OMP_*
           Details of runtime behaviour with respect to parallelism can be
           controlled by several environment variables. Please refer to the
           OpenMP API specification[3] for details.

TEMPLATE LANGUAGE
   Template syntax
       The template language is roughly modelled on the Python string
       formatting syntax[4].

       A template is a piece of text which contains fields, surrounded by
       curly braces {}. Fields are replaced with appropriately formatted
       values when the template is evaluated. Moreover, {{ is replaced with a
       single { and }} is replaced with a single }.

   Field syntax
       Each field consists of a variable name, optionally followed by a shift,
       optionally followed by a format specification.

       The shift is a signed (i.e. starting with a + or - character) integer.

       The format specification consists of a colon, followed by a width
       specification.

       The width specification is a decimal integer defining the minimum field
       width. If not specified, then the field width will be determined by the
       content. Preceding the width specification with a zero (0) character
       enables zero-padding.

       The width specification is optionally followed by an asterisk (*)
       character, which increases the minimum field width to the width of the
       longest possible content of the variable.

   Available variables
       page, spage
           Page number in the PDF document.

       dpage
           Page number in the DjVu document.

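The template rules above can be made concrete with a small evaluator. This is a minimal sketch in Python, written from the description in this section rather than from pdf2djvu's source. The names expand_template, values and max_values are hypothetical. It assumes that padding is added on the left, and that the asterisk measures the largest possible value after the shift has been applied.

```python
import re

# One token is "{{", "}}", or a single {field}.
_TOKEN = re.compile(r"\{\{|\}\}|\{([^{}]*)\}")
# field = name [shift] [":" ["0"] [width] ["*"]]
_FIELD = re.compile(
    r"(?P<name>[a-z]+)(?P<shift>[+-][0-9]+)?"
    r"(?::(?P<zero>0)?(?P<width>[0-9]*)(?P<star>\*)?)?$"
)


def expand_template(template, values, max_values):
    """Expand template using current values and their largest possible values."""
    def replace(match):
        if match.group(0) == "{{":
            return "{"
        if match.group(0) == "}}":
            return "}"
        field = _FIELD.match(match.group(1))
        if field is None or field.group("name") not in values:
            raise ValueError(f"bad field: {match.group(0)!r}")
        name = field.group("name")
        shift = int(field.group("shift") or 0)
        width = int(field.group("width") or 0)
        if field.group("star"):
            # Widen to fit the longest possible content (assumed: after shift).
            width = max(width, len(str(max_values[name] + shift)))
        text = str(values[name] + shift)
        return text.zfill(width) if field.group("zero") else text.rjust(width)
    return _TOKEN.sub(replace, template)


# Default page identifier template, page 7 of a 120-page document.
print(expand_template("p{page:04*}.djvu", {"page": 7}, {"page": 120}))
# p0007.djvu
```

With a document of 123456 pages, the asterisk would widen the same field to six digits, giving p000007.djvu for page 7.
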
IMPLEMENTATION DETAILS
   Layer separation algorithm
       Unless the --monochrome option is on, pdf2djvu uses the following naïve
       layer separation algorithm:

        1. For each page, do the following:

            1. Raster the page into a pixmap, in the usual manner.

            2. Raster the page into another pixmap, omitting the following
               page elements:

               ·   text,

               ·   1 bit-per-pixel raster images,

               ·   vector elements (except fills of large areas).

            3. Compare both pixmaps, pixel by pixel:

                1. If their colors match, classify the pixel as a part of
                   the background layer.

                2. Otherwise, classify the pixel as a part of the foreground
                   layer.

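The comparison in step 3 can be sketched in a few lines. The Python below is an illustration only: it models a pixmap as a list of rows of RGB tuples, which is an assumption made for the example and not pdf2djvu's internal representation, and separate_layers is a hypothetical name.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def separate_layers(full, without_foreground):
    """Return a mask of the page: True marks a foreground pixel.

    full               -- the page rendered in the usual manner
    without_foreground -- the page rendered without text, 1 bpp images
                          and most vector elements
    """
    return [
        [pixel_a != pixel_b for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(full, without_foreground)
    ]


# A 2x2 page with one black text pixel on a white background.
full = [[WHITE, BLACK], [WHITE, WHITE]]
background_only = [[WHITE, WHITE], [WHITE, WHITE]]
mask = separate_layers(full, background_only)
print(mask)  # [[False, True], [False, False]]
```

One consequence of comparing colors is that a foreground element drawn in exactly the color of the background beneath it is classified as background.
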
BUGS
       If you find a bug in pdf2djvu, please report it at the issue
       tracker[5].

SEE ALSO
       djvu(1), djvudigital(1), csepdjvu(1)

AUTHOR
       Jakub Wilk <jwilk@jwilk.net>
           Author.

COPYRIGHT
       Copyright © 2007, 2008, 2009, 2010 Jakub Wilk

NOTES
        1. RFC 3339
           http://www.ietf.org/rfc/rfc3339

        2. NFKC
           http://unicode.org/reports/tr15/

        3. OpenMP API specification
           http://openmp.org/wp/openmp-specifications/

        4. Python string formatting syntax
           http://docs.python.org/library/string.html#format-string-syntax

        5. the issue tracker
           http://code.google.com/p/pdf2djvu/issues/

pdf2djvu 0.7.3                    05/24/2010                       PDF2DJVU(1)