DJVUSED(1)                       DjVuLibre-3.5                      DJVUSED(1)

NAME

       djvused - Multi-purpose DjVu document editor.

SYNOPSIS

       djvused [options] djvufile

DESCRIPTION

       Program djvused is a powerful command line tool for manipulating
       multi-page documents, creating or editing annotation chunks, creating
       or editing hidden text layers, pre-computing thumbnail images, and
       more. The program first reads the DjVu document djvufile and executes
       a number of djvused commands.

       Djvused commands can be read from a specific file (when option -f is
       specified), read from the command line (when option -e is specified),
       or read from the standard input (the default).

OPTIONS

       -v     Cause djvused to print a command line prompt before reading
              commands and a brief message describing how each command was
              executed. This option is very useful for debugging djvused
              scripts and also for interactively entering djvused commands
              on the standard input.

       -f scriptfile
              Cause djvused to read commands from file scriptfile.

       -e commands
              Cause djvused to execute the commands specified by the option
              argument commands. It is advisable to surround the djvused
              commands with single quotes in order to prevent unwanted shell
              expansion.

       -s     Cause djvused to save the file djvufile after executing the
              specified commands. This is similar to executing command save
              immediately before terminating the program.

       -n     Cause djvused to disregard save commands. This is useful for
              debugging djvused scripts without overwriting files on your
              disk.

DJVUSED EXAMPLES

       There are many ways to use program djvused. The following examples
       illustrate some common uses of this program.

   Obtaining the size of a page
       Command size outputs the width and height of the selected pages using
       an HTML-friendly syntax. For instance, the following command prints
       the size of page 3 of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; size'

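       A script often needs the two numbers rather than the HTML-style
       attributes. The sketch below assumes that each line printed by size
       contains attributes of the form width=W height=H; this format is an
       assumption and should be checked against the output of your djvused.
       The helper name parse_size is hypothetical.

```shell
# Hypothetical helper: extract width and height from one line of `size` output.
# Assumes the line contains attributes of the form width=W height=H.
parse_size() {
    printf '%s\n' "$1" |
        sed -n 's/.*width=\([0-9][0-9]*\).*height=\([0-9][0-9]*\).*/\1 \2/p'
}

# Sample line standing in for real output (the values are made up).
parse_size 'width=2550 height=3300'
```

       With a real document the helper could be fed directly, for instance
       parse_size "$(djvused myfile.djvu -e 'select 3; size')".
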
   Extracting the hidden text
       Command print-pure-txt outputs the text associated with a page or a
       document. For instance, the following shell command outputs the text
       for the entire document. Lines and pages are delimited by the usual
       control characters.

          djvused myfile.djvu -e 'print-pure-txt'

       Command print-txt produces a more extensive output describing the
       structure and the location of the text components. The syntax of this
       output is described later in this man page. For instance, the
       following shell command outputs extended text information for page 3
       of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; print-txt'

   Extracting the annotations
       Annotation data can be extracted using command print-ant. The syntax
       of the annotation data is described later in this man page. For
       instance, the following shell command outputs the annotation data for
       the first page of document myfile.djvu.

          djvused myfile.djvu -e 'select 1; print-ant'

       Command print-ant only prints the annotations stored in the selected
       component file. Command print-merged-ant also retrieves annotations
       from all the component files referenced by the current page (using
       INCL chunks) and prints the merged information.

   Dumping/restoring annotations and text
       Three commands, output-txt, output-ant, and output-all, produce
       djvused scripts. For instance, the following shell command produces a
       djvused script, myfile.dsed, that recreates all the text and
       annotation data in document myfile.djvu.

          djvused myfile.djvu -e 'output-all' > myfile.dsed

       Script myfile.dsed is a text file that can be easily edited. The
       following shell command then recreates the text and annotation
       information in file myfile.djvu.

          djvused myfile.djvu -f myfile.dsed -s

   Extracting a page
       Both commands save-page and save-page-with create a DjVu file
       representing the selected component file of a document. The following
       shell command, for instance, creates a file p05.djvu containing page 5
       of document myfile.djvu.

          djvused myfile.djvu -e 'select 5; save-page p05.djvu'

       Each page of a document might import data from another component file
       using the so-called inclusion (INCL) chunks. Command save-page then
       produces a file with unresolved references to imported data. Such a
       file should then be made part of a multi-page document containing the
       required data in other component files. On the other hand, command
       save-page-with copies all the imported data into the output file. This
       file is directly usable. Yet collecting several such files into a
       multi-page document might lead to useless data replication.

   Pre-computing thumbnails
       Command set-thumbnails constructs thumbnails that can later be
       displayed by DjVu viewers. The following shell command, for instance,
       computes thumbnails of size 64x64 pixels for all pages of file
       myfile.djvu.

          djvused myfile.djvu -e 'set-thumbnails 64' -s

DJVUSED COMMANDS

       Command lines might contain zero, one, or more djvused commands and an
       optional comment. Multiple djvused commands must be separated by a
       semicolon character ';'. Comments are introduced by the '#' character
       and extend until the end of the command line.

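       As an illustration of these rules, the following sketch writes a
       three-line script that mixes a comment-only line, a single command
       with a trailing comment, and two commands separated by a semicolon.
       The helper name write_script and the file name passed to it are
       hypothetical.

```shell
# Write a small djvused script to the path given as $1.
# The path is chosen by the caller; nothing here runs djvused.
write_script() {
    cat > "$1" <<'EOF'
# Comments run from '#' to the end of the line.
n                 # print the number of pages
select 1; size    # two commands on one line, separated by ';'
EOF
}
```

       Such a script can then be tried without risk of overwriting the
       document by combining options -v and -n, for instance
       write_script demo.dsed followed by
       djvused myfile.djvu -f demo.dsed -v -n.
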
   Selection commands
       Multi-page DjVu documents are composed of a number of component files.
       Most component files describe a specific page of a document. Some
       component files contain information shared by several pages such as
       shared image data, shared annotations or thumbnails. Many djvused
       commands operate on selected component files. All component files are
       initially selected. The following commands are useful for changing the
       selection.

       ls     List all component files in the document. Each line contains an
              optional page number, a letter describing the component file
              type, the size of the component file, and the identifier of the
              component file. Component file type letters P, I, A, and T
              respectively stand for page data, shared image data, shared
              annotation data, and thumbnail data. Page numbers are only
              listed for component files containing page data.

       select [fileid]
              Select the component file identified by argument fileid.
              Argument fileid must be either a page number or a component
              file identifier. The select command selects all component files
              when the argument fileid is omitted.

       select-shared-ant
              Select a component file containing shared annotations. Only one
              such component file is supported by the current DjVu software.
              This component file usually contains annotations pertaining to
              the whole document as opposed to specific pages. An error
              message is displayed if there is no such component file.

       create-shared-ant
              Create and select a component file containing shared
              annotations. This command only selects the shared annotation
              component file if such a component file already exists.
              Otherwise it creates a new shared annotation component file and
              makes sure that it is imported by all pages in the document.

   Miscellaneous commands
       help   Display a help message listing all commands supported by
              djvused.

       n      Print the total number of pages in the document.

       dump   Display the EA IFF 85 structure of the document or of the
              selected component file. A similar capability is offered by
              program djvudump.

       size   Display the width and the height of the selected pages. The
              dimensions of each page are displayed using a syntax suitable
              for direct insertion into the <EMBED...></EMBED> tags.

   Text and annotation commands
       print-pure-txt
              Print the text stored in the hidden text layer of the selected
              pages. A similar capability is offered by program djvutxt.
              Structural information is sometimes represented by control
              characters. Text from different pages is delimited by form feed
              characters ("\f"). Lines are delimited by newline characters
              ("\n"). Columns, regions, and paragraphs are sometimes
              delimited by vertical tabs ("\013"), group separators ("\035"),
              and unit separators ("\037") respectively.

       print-txt
              Print extensive hidden text information for the selected pages.
              This information describes the structure of the text on the
              document page and locates the structural elements in the page
              image. The syntax of this output is described later in this man
              page.

       remove-txt
              Remove the hidden text information from the selected component
              files. For instance, executing commands select and remove-txt
              removes all hidden text information from the DjVu document.

       set-txt [djvusedtxtfile]
              Insert hidden text information into the selected pages. The
              optional argument djvusedtxtfile names a file containing the
              hidden text information. This file must contain data similar to
              what is produced by command print-txt. When the optional
              argument is omitted, the program reads the hidden text
              information from the djvused script until reaching an
              end-of-file or a line containing a single period.

       output-txt
              Print a djvused script that reconstructs the hidden text
              information for the selected pages. This script can later be
              edited and executed by invoking program djvused with option -f.

       print-ant
              Print the annotations of the selected component file. The
              annotation data is represented using a simple syntax described
              later in this document.

       print-merged-ant
              Merge the annotations stored in the selected component files
              with the annotations imported from other component files such
              as the shared annotation component file. The annotation data is
              represented using a simple syntax described later in this
              document.

       remove-ant
              Remove the annotation information from the selected component
              files. For instance, executing commands select and remove-ant
              removes all annotation information from the DjVu document.

       set-ant [djvusedantfile]
              Insert annotations into the selected component file. The
              optional argument djvusedantfile names a file containing the
              annotation data. This file must contain data similar to what is
              produced by command print-ant. When the optional argument is
              omitted, the program reads the annotation data from the djvused
              script itself until reaching an end-of-file or a line
              containing a single period.

       output-ant
              Print a djvused script that reconstructs the annotation
              information for the selected pages. This script can later be
              edited and executed by invoking program djvused with option -f.

       print-meta
              Print the meta-data part of the annotations for the selected
              component file. This command displays a subset of the
              information printed by command print-ant using a different
              syntax. Meta-data are organized as key-value pairs. Each
              printed line contains the key name, such as author or title,
              followed by a tab character ("\t") and a double-quoted string
              representing the UTF-8 encoded meta-data value.

       set-meta [djvusedmetafile]
              Set the meta-data part of the annotations of the selected
              component file. The remaining part of the annotations is left
              unchanged. The optional argument djvusedmetafile names a file
              containing the meta-data. This file must contain data similar
              to what is produced by command print-meta. When the optional
              argument is omitted, the program reads the meta-data from the
              djvused script itself until reaching an end-of-file or a line
              containing a single period.

       output-all
              Print a djvused script that reconstructs both the hidden text
              and the annotation information for the selected pages. This
              script can later be edited and executed by invoking program
              djvused with option -f.

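       The inline form of set-meta, where the data follows the command in
       the script and ends with a line containing a single period, can be
       sketched as follows. The helper name meta_script is hypothetical, and
       the sketch assumes that both values are printable ASCII without double
       quotes or backslashes, since it performs no escaping.

```shell
# Hypothetical helper: print a djvused script that stores two meta-data
# entries in the shared annotation component file.
# $1 = author, $2 = title. No escaping is done, so the values are assumed
# to contain neither double quotes nor backslashes.
meta_script() {
    printf 'create-shared-ant\n'
    printf 'set-meta\n'
    printf 'author\t"%s"\n' "$1"   # key, tab, double-quoted value
    printf 'title\t"%s"\n' "$2"
    printf '.\n'                   # a single period ends the inline data
}

meta_script 'A. Writer' 'Demo document'
```

       The output can be redirected to a file, for instance meta.dsed, and
       applied with djvused myfile.djvu -f meta.dsed -s.
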
   Outline/bookmarks commands
       print-outline
              Print the outline of the document. Nothing is printed if the
              document contains no outline.

       set-outline [djvusedoutlinefile]
              Insert outline information into the document. The optional
              argument djvusedoutlinefile names a file containing the outline
              information. This file must contain data similar to what is
              produced by command print-outline. When the optional argument
              is omitted, the program reads the outline information from the
              djvused script until reaching an end-of-file or a line
              containing a single period.

   Thumbnail commands
       set-thumbnails sz
              Compute thumbnails of size sz x sz pixels and insert them into
              the document. DjVu viewers can later display these thumbnails
              very efficiently without needing to download the data for each
              page. Typical thumbnail sizes range from 48 to 128 pixels.

       remove-thumbnails
              Remove the pre-computed thumbnails from the DjVu document. New
              thumbnails can then be computed using command set-thumbnails.

   Save commands
       The above commands only modify the memory image of the DjVu document.
       The following commands provide means to save the modified data into
       the file system.

       save   Save the modified DjVu document back into the input file
              djvufile specified by the arguments of the program djvused.
              Nothing is done if the DjVu file was not modified. Passing
              option -s to program djvused is equivalent to executing command
              save before exiting the program.

       save-bundled filename
              Save the current DjVu document as a bundled multi-page DjVu
              document named filename. A similar capability is offered by
              program djvmcvt.

       save-indirect filename
              Save the current DjVu document as an indirect multi-page DjVu
              document. The index file of the indirect document will be named
              filename. All other files composing the indirect document will
              be saved into the same directory as the index file. A similar
              capability is offered by program djvmcvt.

       save-page filename
              Save the selected component file into DjVu file filename. The
              selected component file might import data from another
              component file using the so-called inclusion (INCL) chunks.
              This command then produces a file with unresolved references to
              imported data. Such a file should then be made part of a
              multi-page document containing the required data in other
              component files.

       save-page-with filename
              Save the selected component file into DjVu file filename. All
              data imported from other component files is copied into the
              output file as well. This command always produces a usable DjVu
              file. On the other hand, collecting several such files into a
              multi-page document might lead to useless data replication.

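       Because only one component file can be selected at a time, splitting
       a document into standalone pages takes one select and one
       save-page-with per page. The following sketch only prints the
       required command lines; the helper name page_commands and the output
       names p0001.djvu, p0002.djvu, and so on are arbitrary choices.

```shell
# Hypothetical helper: print one djvused command line per page, for pages
# 1 through $1. Each line selects a page and saves it with its imported data.
page_commands() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf 'select %d; save-page-with p%04d.djvu\n' "$i" "$i"
        i=$((i + 1))
    done
}

page_commands 3
```

       The page count can be obtained with command n, so a complete run
       might look like
       page_commands "$(djvused myfile.djvu -e n)" | djvused myfile.djvu,
       where the second djvused reads the generated commands from the
       standard input.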

DJVUSED FILE FORMATS

       Djvused uses a simple parenthesized syntax to represent both
       annotations and hidden text.

       *  This syntax is the native syntax used by DjVu for storing
          annotations. Program djvused simply compresses the annotation data
          using the bzz(1) algorithm.

       *  This syntax differs from the native syntax used by DjVu for storing
          the hidden text. Program djvused performs the translations between
          the compact binary representation used by DjVu and the easily
          modifiable parenthesized syntax.

   General syntax
       Djvused files are ASCII text files. The legal characters in djvused
       files are the printable ASCII characters and the space, tab, cr, and
       nl characters. Using other characters has undefined results.

       Djvused files are composed of a sequence of expressions separated by
       blank characters (space, tab, cr, or nl). There are four kinds of
       expressions, namely integers, symbols, strings, and lists.

       Integers:
              Integer numbers are represented by one or more digits, with the
              usual interpretation.

       Symbols:
              Symbols, or identifiers, are sequences of printable ASCII
              characters representing a name or a keyword. Acceptable
              characters are the alpha-numeric characters, the underscore
              "_", the minus character "-", and the hash character "#". Names
              should not begin with a digit or a minus character.

       Strings:
              Strings denote an arbitrary sequence of bytes, usually
              interpreted as a sequence of UTF-8 encoded characters. Strings
              in djvused files are similar to strings in the C language. They
              are surrounded by double quote characters. Certain sequences of
              characters starting with a backslash ("\") have a special
              meaning. A backslash followed by the letter "a", "b", "t", "n",
              "v", "f", or "r", by a backslash, or by a double quote stands
              for the ASCII character BEL (007), BS (010), HT (011), LF
              (012), VT (013), FF (014), CR (015), BACKSLASH (134), or
              DOUBLEQUOTE (042) respectively, where the numbers in
              parentheses are octal codes. A backslash followed by one to
              three digits stands for the byte whose octal code is expressed
              by the digits. All other backslash sequences are illegal. All
              non-printable ASCII characters must be escaped.

       Lists: Lists are sequences of expressions separated by blanks and
              surrounded by parentheses. All expression types are acceptable
              within a list, including sub-lists.

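       When a script is generated from arbitrary titles or comments, the
       string rules above have to be applied to each value. The following
       sketch covers only the two most common cases, backslash and double
       quote. It assumes that the input is otherwise printable ASCII and
       produces no octal escapes, so it is not a complete implementation of
       the string syntax. The helper name dsed_quote is hypothetical.

```shell
# Hypothetical helper: print $1 as a double-quoted djvused string.
# Only backslash and double quote are escaped. The input is assumed to be
# printable ASCII, so control characters and non-ASCII bytes are NOT handled.
dsed_quote() {
    printf '"%s"\n' "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

dsed_quote 'The "DjVu" format'
```

       Backslashes are escaped first so that the backslashes introduced for
       double quotes are not escaped a second time.
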
   Hidden text syntax
       The building blocks of the hidden text syntax are lists representing
       each structural component of the hidden text. Structural components
       have the following form:

          (type xmin ymin xmax ymax ... )

       The symbol type must be one of page, column, region, para, line, word,
       or char, listed here in decreasing order of importance. The integers
       xmin, ymin, xmax, and ymax represent the coordinates of a rectangle
       indicating the position of the structural component in the page.
       Coordinates are measured in pixels and have their origin at the bottom
       left corner of the page. The remaining expressions in the list are
       either a single string representing the encoded text associated with
       this structural component, or a sequence of structural components
       with a lesser type.

       The hidden text for each page is simply represented by a single
       structural element of type page. Various levels of structural
       information are acceptable. For instance, the page level component
       might only specify a page level string, or might only provide a list
       of lines, or might provide a full hierarchy down to the individual
       characters.

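       A small script using this syntax might look like the one printed by
       the sketch below, which describes a page containing one line of two
       words. All coordinates are made up. The rectangles are written in the
       order xmin ymin xmax ymax, which is understood to be the order
       emitted by print-txt; comparing against the print-txt output of a real
       page is advisable before relying on it. The helper name txt_script is
       hypothetical.

```shell
# Hypothetical helper: print a djvused script that sets the hidden text of
# page 1 to a single line of two words. Coordinates are invented and are
# written as xmin ymin xmax ymax (an assumption to verify with print-txt).
txt_script() {
    cat <<'EOF'
select 1
set-txt
(page 0 0 2550 3300
 (line 300 3000 900 3060
  (word 300 3000 560 3060 "Hello")
  (word 600 3000 900 3060 "world")))
.
EOF
}

txt_script
```

       Saving the output to a file and running djvused with options -f and
       -s would then replace the hidden text of page 1.
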
   Outline/Bookmark syntax
       The outline syntax is a single list of the form

          (bookmarks ...)

       The first element of the list is symbol bookmarks. The subsequent
       elements are lists representing the toplevel outline entries. Each
       outline entry is represented by a list with the following form:

          (title url ... )

       The string title is the title of the outline entry. The string url is
       composed of the hash character ("#") followed by either the component
       file identifier or the page number corresponding to the outline entry.
       The remaining expressions describe subentries of this outline entry.

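       For example, an outline with two chapters, the first of which has one
       subsection, could be installed with a script such as the one printed
       below. The titles and page numbers are invented, and the helper name
       outline_script is hypothetical.

```shell
# Hypothetical helper: print a djvused script that installs a two-level
# outline. Titles and page numbers are placeholders.
outline_script() {
    cat <<'EOF'
set-outline
(bookmarks
 ("Chapter 1" "#1"
  ("Section 1.1" "#2"))
 ("Chapter 2" "#5"))
.
EOF
}

outline_script
```

       Each entry is a list whose first two elements are the title and url
       strings; the nested list under "Chapter 1" is a subentry.
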
   Annotation syntax
       Annotations are represented by a sequence of annotation expressions.
       The following annotation expressions are recognized:

       (background color)
              Specify the color of the viewer area surrounding the DjVu
              image. Colors are represented with the X11 hexadecimal syntax
              #RRGGBB. For instance, #000000 is black and #FFFFFF is white.

       (zoom zoomvalue)
              Specify the initial zoom factor of the image. Argument
              zoomvalue can be one of stretch, one2one, width, page, or the
              letter d followed by a number in range 1 to 999 representing a
              zoom factor (for instance d300 or d150).

       (mode modevalue)
              Specify the initial display mode of the image. Argument
              modevalue is one of color, bw, fore, or back.

       (align horzalign vertalign)
              Specify how the image should be aligned on the viewer surface.
              By default the image is located in the center. Argument
              horzalign can be one of left, center, or right. Argument
              vertalign can be one of top, center, or bottom.

       (maparea url comment area ...)
              Define a hyperlink for the specified destination.

              Argument url can have one of the following forms:

                 href
                 (url href target)

              where href is a string representing the destination and target
              is a string representing the target frame for the hyperlink, as
              defined by the HTML anchor tag <A>. The destination string href
              can be an arbitrary URL or can be composed of the hash
              character ("#") followed by either a component file identifier
              or a page number. Page numbers may be prefixed with an optional
              sign to represent a page displacement. For instance, the
              strings "#-1" and "#+1" can be used to access the previous page
              and the next page.

              Argument comment is a string that might be displayed by the
              viewer when the user moves the mouse over the hyperlink.

              Argument area defines the shape and the location of the
              hyperlink. The following forms are recognized:

                 (rect xmin ymin width height)
                 (oval xmin ymin width height)
                 (poly x0 y0 x1 y1 ... )
                 (text xmin ymin width height) - Not implemented.
                 (line x0 y0 x1 y1) - Not implemented.

              All parameters are numbers representing coordinates.
              Coordinates are measured in pixels and have their origin at the
              bottom left corner of the page.

              The remaining expressions in the maparea list represent the
              visual effect associated with the hyperlink.

              A first set of options defines how borders are drawn for rect,
              oval, poly, or text hyperlink areas.

                 (none)
                 (xor)
                 (border color)
                 (shadow_in [thickness])
                 (shadow_out [thickness])
                 (shadow_ein [thickness])
                 (shadow_eout [thickness])

              where parameter color has syntax #RRGGBB as described above,
              and parameter thickness is an integer in range 1 to 32. The
              last four border options are only supported for rect hyperlink
              areas. The default border is a simple black line. Border
              options do not apply to line areas.

              When a border option is specified, the border becomes visible
              when the user moves the mouse over the hyperlink. The border
              may be made always visible by using the following option:

                 (border-avis)

              The following two options may be used with rect hyperlink
              areas. The complete area will be highlighted using the
              specified color at the specified opacity (0-100, default 50).

                 (hilite color)
                 (opacity op) - Not implemented.

              This is often used with an empty URL for simply emphasizing a
              specific segment of an image.

              The following three options may be used with line areas to
              specify an optional ending arrow, the line width, and the line
              color. The default is a black line with width 1 and without
              arrow.

                 (arrow) - Not implemented.
                 (width w) - Not implemented.
                 (lineclr color) - Not implemented.

              Finally, the following three options can be used with text
              areas. The default background color is transparent. The default
              text color is black. The pushpin option indicates that the text
              is symbolized by a small pushpin icon. Clicking the icon
              reveals the text.

                 (backclr bkcolor) - Not implemented.
                 (textclr txtcolor) - Not implemented.
                 (pushpin) - Not implemented.

       (metadata ... (key value) ... )
              Define meta-data entries. Each entry is identified by a symbol
              key representing the nature of the meta-data entry. Typical
              keys include year, booktitle, editor, author, etc. It is
              suggested to use the same key names as the BibTeX bibliography
              system. String value represents the value associated with the
              corresponding key.

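       The expressions above can be combined in a single set-ant block. The
       sketch below prints a script that gives page 1 a white background, an
       initial fit-width zoom, a highlighted rectangular link to the next
       page, and two meta-data entries. All coordinates, colors, and values
       are invented for illustration, and the helper name ant_script is
       hypothetical.

```shell
# Hypothetical helper: print a djvused script that replaces the annotations
# of page 1. Every coordinate, color, and string below is a placeholder.
ant_script() {
    cat <<'EOF'
select 1
set-ant
(background #FFFFFF)
(zoom width)
(mode color)
(align center top)
(maparea "#+1" "Go to the next page"
 (rect 100 100 400 60)
 (border #0000FF) (border-avis)
 (hilite #FFFF00))
(metadata (author "A. Writer") (year "2005"))
.
EOF
}

ant_script
```

       The here-document delimiter is quoted so that the shell leaves the
       "#" characters and the rest of the data untouched. As with the other
       scripts, the output can be saved to a file and applied with options
       -f and -s.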

LIMITATIONS

       The current version of program djvused only supports selecting one
       component file or all component files. There is no way to select only
       a few component files.

CREDITS

       This program was initially written by Léon Bottou
       <leonb@users.sourceforge.net> and was improved by Yann Le Cun
       <profshadoko@users.sourceforge.net>, Florin Nicsa, Bill Riemers
       <docbill@sourceforge.net> and many others.

SEE ALSO

       djvu(1), djvutxt(1), djvmcvt(1), djvudump(1), bzz(1)

DjVuLibre-3.5                      5/22/2005                        DJVUSED(1)