PDF2DJVU(1)                     pdf2djvu manual                    PDF2DJVU(1)

NAME

       pdf2djvu - creates DjVu files from PDF files

SYNOPSIS

       pdf2djvu [{-o | --output} output-djvu-file] [option...] pdf-file

       pdf2djvu {-i | --indirect} index-djvu-file [option...] pdf-file

       pdf2djvu {--version | --help | -h}

DESCRIPTION

       This program creates a DjVu file from the Portable Document Format file
       pdf-file.

OPTIONS

       pdf2djvu accepts the following options:

   Document type, file names
       -o, --output=output-djvu-file
           Generate a bundled multi-page document. Write the file into
           output-djvu-file instead of standard output.

       -i, --indirect=index-djvu-file
           Generate an indirect multi-page document. Use index-djvu-file as
           the index file name; put the component files into the same
           directory. The directory must exist and be writable.

       --pageid-template=template
           Specifies the naming scheme for page identifiers. Consult the
           “TEMPLATE LANGUAGE” section for the template language description.

           The default template is “p{page:04*}.djvu”.

           For portability reasons, page identifiers:

           ·   must consist only of lowercase ASCII letters, digits, _, +, -
               and dot,

           ·   cannot start with a dot,

           ·   cannot contain two consecutive dots,

           ·   must end with the .djvu or the .djv extension.

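           The portability rules listed above can be checked mechanically. The
           following Python sketch is purely illustrative and is not part of
           pdf2djvu; the function name is_portable_page_id is hypothetical.

```python
import re

# Hypothetical helper (not part of pdf2djvu): characters permitted by the
# first portability rule.
_ALLOWED = re.compile(r"[a-z0-9_+.-]+")


def is_portable_page_id(page_id: str) -> bool:
    """Return True if page_id satisfies the four portability rules."""
    if not _ALLOWED.fullmatch(page_id):
        return False  # only lowercase ASCII letters, digits, _, +, - and dot
    if page_id.startswith("."):
        return False  # cannot start with a dot
    if ".." in page_id:
        return False  # cannot contain two consecutive dots
    return page_id.endswith((".djvu", ".djv"))  # required extension


if __name__ == "__main__":
    for candidate in ("p0001.djvu", "Page1.djvu", ".p1.djvu", "p..1.djvu"):
        print(candidate, is_portable_page_id(candidate))
```
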
       --pageid-prefix=prefix
           Equivalent to “--pageid-template=prefix{page:04*}.djvu”.

       --page-title-template=template
           Specifies the template for page titles. Consult the “TEMPLATE
           LANGUAGE” section for the template language description.

           The default is to set no page titles.

   Resolution, page size
       -d, --dpi=resolution
           Sets the desired resolution to resolution dots per inch. The
           default is 300 dpi. The allowed range is: 72 ≤ resolution ≤ 6000.

       --media-box
           Use MediaBox to determine page size. CropBox is used by default.

       --page-size=widthxheight
           Sets the preferred page size to width pixels × height pixels.
           The actual page size may be altered in order to respect aspect
           ratio and DjVu limitations on resolution. (This option takes
           precedence over -d/--dpi.)

       --guess-dpi
           Try to guess native resolution by inspecting embedded images. Use
           with care.

   Image quality
       --bg-slices=n+...+n, --bg-slices=n,...,n
           Specifies the encoding quality of the IW44 background layer. This
           option is similar to the -slice option of c44. Consult the c44(1)
           manual page for details. The default is 72+11+10+10.

       --bg-subsample=n
           Specifies the background subsampling ratio. The default is 3. Valid
           values are integers between 1 and 12, inclusive.

       --fg-colors=default
           Try to preserve all the foreground layer colors. This is the
           default.

       --fg-colors=web
           Reduce foreground layer colors to the web palette (216 colors).
           This option is not recommended.

       --fg-colors=n
           Use GraphicsMagick to reduce the number of distinct colors in the
           foreground layer to n. Valid values are integers between 1 and
           4080. This option is not recommended.

       --fg-colors=black
           Discard any color information from the foreground layer.

       --monochrome
           Render pages as monochrome bitmaps. With this option, --bg-... and
           --fg-... options are not respected.

       --loss-level=n
           Specifies the aggressiveness of the lossy compression. The default
           is 0 (lossless). Valid values are integers between 0 and 200,
           inclusive. This option is similar to the -losslevel option of cjb2;
           consult the cjb2(1) manual page for details. This option is
           respected only along with the --monochrome option.

       --lossy
           Synonym for --loss-level=100.

       --anti-alias
           Enable font and vector anti-aliasing. This option is not
           recommended.

   Extraction
       --no-metadata
           Don't extract the metadata.

           By default:

           ·   The following entries of the document information dictionary
               are extracted: Title, Author, Subject, Creator, Producer,
               CreationDate, ModDate. Timestamps are formatted according to
               RFC 3339[1], with date and time components separated by a
               single space.

               The XMP metadata is extracted (or created) and updated
               accordingly.

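           To illustrate the timestamp layout described above (RFC 3339 with
           a single space between the date and time components), here is a
           small Python sketch. It is not taken from pdf2djvu; the function
           name format_timestamp is hypothetical, and details such as how
           pdf2djvu renders the time zone offset are assumptions.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp(moment: datetime) -> str:
    """Format a timestamp RFC 3339-style, using a space instead of 'T'."""
    # Assumption: whole-second precision is sufficient for this illustration.
    return moment.isoformat(sep=" ", timespec="seconds")


if __name__ == "__main__":
    created = datetime(2010, 5, 24, 14, 30, 0,
                       tzinfo=timezone(timedelta(hours=2)))
    print(format_timestamp(created))
```
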
       --verbatim-metadata
           Keep the original metadata intact.

       --no-outline
           Don't extract the document outline.

       --hyperlinks=border-avis
           Make hyperlink borders always visible.

           By default, a hyperlink border is visible only when the mouse is
           over the hyperlink.

       --hyperlinks=#RRGGBB
           Force the specified border color for hyperlinks.

       --no-hyperlinks, --hyperlinks=none
           Don't extract hyperlinks.

       --no-text
           Don't extract the text.

       --words
           Extract the text. Record the location of every word. This is the
           default.

       --lines
           Extract the text. Record the location of every line, rather than
           every word.

       --crop-text
           Extract no text outside the page boundary.

       --no-nfkc
           Don't NFKC[2]-normalize the text.

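           For readers unfamiliar with NFKC, the Python sketch below shows
           the kind of change this normalization form makes to text, using
           the standard unicodedata module. It only illustrates the Unicode
           normalization form itself and says nothing about how pdf2djvu
           applies it internally; the wrapper name nfkc is hypothetical.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply Unicode NFKC normalization to text."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FB01 is the "fi" ligature; NFKC replaces it with the letters f and i.
    print(nfkc("\ufb01le"))
    # U+00B2 is superscript two; NFKC replaces it with the plain digit 2.
    print(nfkc("x\u00b2"))
```
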
       --filter-text=command-line
           Filter the text through the command-line. The provided filter must
           preserve whitespace, control characters and decimal digits.

           This option implies --no-nfkc.

       -p, --pages=page-range
           Specifies pages to convert. page-range is a comma-separated list
           of sub-ranges. Each sub-range is either a single page (e.g. 17) or
           a contiguous range of pages (e.g. 37-42). Pages are numbered from
           1.

           The default is to convert all pages.

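           The page-range syntax described above can be sketched as a short
           Python parser. This is an illustration, not pdf2djvu's own code:
           the name parse_page_range is hypothetical, and rejecting
           descending sub-ranges while keeping duplicate pages are
           assumptions that the manual does not spell out.

```python
from typing import List


def parse_page_range(page_range: str) -> List[int]:
    """Expand a comma-separated list of sub-ranges into page numbers."""
    pages: List[int] = []
    for sub_range in page_range.split(","):
        first, dash, last = sub_range.partition("-")
        start = int(first)
        end = int(last) if dash else start  # a single page is a 1-page range
        # Pages are numbered from 1; descending ranges are rejected here
        # (an assumption made for this sketch).
        if start < 1 or end < start:
            raise ValueError(f"invalid sub-range: {sub_range!r}")
        pages.extend(range(start, end + 1))
    return pages


if __name__ == "__main__":
    print(parse_page_range("17,37-42"))
```
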
   Performance
       -j, --jobs=n
           Use n threads to perform conversion. The default is to use one
           thread.

       -j0, --jobs=0
           Determine automatically how many threads to use to perform
           conversion.

   Verbosity, help
       -v, --verbose
           Display more informational messages while converting the file.

       -q, --quiet
           Don't display informational messages while converting the file.

       --version
           Output version information and exit.

       -h, --help
           Display help and exit.

ENVIRONMENT

       OMP_*
           Details of runtime behaviour with respect to parallelism can be
           controlled by several environment variables. Please refer to the
           OpenMP API specification[3] for details.

TEMPLATE LANGUAGE

   Template syntax
       The template language is roughly modelled on the Python string
       formatting syntax[4].

       A template is a piece of text which contains fields, surrounded by
       curly braces {}. Fields are replaced with appropriately formatted
       values when the template is evaluated. Moreover, {{ is replaced with a
       single { and }} is replaced with a single }.

   Field syntax
       Each field consists of a variable name, optionally followed by a shift,
       optionally followed by a format specification.

       The shift is a signed (i.e. starting with a + or - character) integer.

       The format specification consists of a colon, followed by a width
       specification.

       The width specification is a decimal integer defining the minimum field
       width. If not specified, then the field width will be determined by the
       content. Preceding the width specification with a zero (0) character
       enables zero-padding.

       The width specification is optionally followed by an asterisk (*)
       character, which increases the minimum field width to the width of the
       longest possible content of the variable.

   Available variables
       page, spage
           Page number in the PDF document.

       dpage
           Page number in the DjVu document.
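
   Illustrative sketch
       The rules above can be made concrete with a small evaluator. The
       Python sketch below is an approximation written for illustration and
       is not pdf2djvu's implementation. The names render, values and
       max_values are hypothetical. It assumes that padding without a zero
       flag is right-aligned with spaces, and that the "longest possible
       content" of a variable is its largest value in the document with the
       shift applied.

```python
import re
from typing import Dict

# One token is an escaped brace pair or a single {field}.
_TOKEN = re.compile(r"\{\{|\}\}|\{([^{}]*)\}")
# name, optional signed shift, optional ":" + optional "0" + width + "*".
_FIELD = re.compile(r"([a-z]+)([+-]\d+)?(?::(0)?(\d+)?(\*)?)?")


def render(template: str, values: Dict[str, int],
           max_values: Dict[str, int]) -> str:
    """Evaluate template; max_values holds each variable's largest value."""

    def replace(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token == "{{":
            return "{"
        if token == "}}":
            return "}"
        field = _FIELD.fullmatch(match.group(1))
        if field is None or field.group(1) not in values:
            raise ValueError(f"invalid field: {token}")
        name, shift, zero, width, star = field.groups()
        offset = int(shift) if shift else 0
        text = str(values[name] + offset)
        min_width = int(width) if width else 0
        if star:
            # Widen to fit the longest value the variable can take.
            longest = len(str(max_values[name] + offset))
            min_width = max(min_width, longest)
        return text.zfill(min_width) if zero else text.rjust(min_width)

    return _TOKEN.sub(replace, template)


if __name__ == "__main__":
    # The default page identifier template, for page 7 of a 120-page file.
    print(render("p{page:04*}.djvu", {"page": 7}, {"page": 120}))
```

       With the default template, the asterisk only has an effect once the
       largest page number needs more than four digits.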

IMPLEMENTATION DETAILS

   Layer separation algorithm
       Unless the --monochrome option is on, pdf2djvu uses the following naïve
       layer separation algorithm:

        1. For each page, do the following:

            1. Raster the page into a pixmap, in the usual manner.

            2. Raster the page into another pixmap, omitting the following
               page elements:

               ·   text,

               ·   1 bit-per-pixel raster images,

               ·   vector elements (except fills of large areas).

            3. Compare both pixmaps, pixel by pixel:

                1. If their colors match, classify the pixel as a part of the
                   background layer.

                2. Otherwise, classify the pixel as a part of the foreground
                   layer.
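
       The comparison step of the algorithm above can be sketched in a few
       lines of Python. This is a simplified illustration and not pdf2djvu's
       code: rendering is left out, the two pixmaps are assumed to be given
       as equally sized nested lists of RGB tuples, and the name
       separate_layers is hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Pixmap = List[List[Pixel]]


def separate_layers(full: Pixmap, reduced: Pixmap) -> List[List[bool]]:
    """Return a mask that is True where a pixel belongs to the foreground.

    full    -- the page rendered in the usual manner
    reduced -- the page rendered without text, 1 bit-per-pixel images and
               most vector elements
    """
    return [
        # Matching colors mean background; any difference means foreground.
        [full_pixel != reduced_pixel
         for full_pixel, reduced_pixel in zip(full_row, reduced_row)]
        for full_row, reduced_row in zip(full, reduced)
    ]


if __name__ == "__main__":
    white, black = (255, 255, 255), (0, 0, 0)
    full_page = [[white, black], [white, white]]
    reduced_page = [[white, white], [white, white]]
    print(separate_layers(full_page, reduced_page))
```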

BUG REPORTS

       If you find a bug in pdf2djvu, please report it at the issue
       tracker[5].

SEE ALSO

       djvu(1), djvudigital(1), csepdjvu(1)

AUTHOR

       Jakub Wilk <jwilk@jwilk.net>
           Author.

       Copyright © 2007, 2008, 2009, 2010 Jakub Wilk

NOTES

        1. RFC 3339
           http://www.ietf.org/rfc/rfc3339

        2. NFKC
           http://unicode.org/reports/tr15/

        3. OpenMP API specification
           http://openmp.org/wp/openmp-specifications/

        4. Python string formatting syntax
           http://docs.python.org/library/string.html#format-string-syntax

        5. the issue tracker
           http://code.google.com/p/pdf2djvu/issues/

pdf2djvu 0.7.3                    05/24/2010                       PDF2DJVU(1)