DJVUSED(1)                       DjVuLibre-3.5                      DJVUSED(1)

NAME

       djvused - Multi-purpose DjVu document editor.

SYNOPSIS

       djvused [options] djvufile

DESCRIPTION

       Program djvused is a powerful command line tool for manipulating
       multi-page documents, creating or editing annotation chunks, creating
       or editing hidden text layers, pre-computing thumbnail images, and
       more. The program first reads the DjVu document djvufile and executes
       a number of djvused commands.

       Djvused commands can be read from a specific file (when option -f is
       specified), read from the command line (when option -e is specified),
       or read from the standard input (the default).

OPTIONS

       -v     Cause djvused to print a command line prompt before reading
              commands and a brief message describing how each command was
              executed. This option is very useful for debugging djvused
              scripts and also for interactively entering djvused commands on
              the standard input.

       -f scriptfile
              Cause djvused to read commands from file scriptfile.

       -e command
              Cause djvused to execute the commands specified by the option
              argument command. It is advisable to surround the djvused
              commands with single quotes in order to prevent unwanted shell
              expansion.

       -s     Cause djvused to save the file djvufile after executing the
              specified commands. This is similar to executing command save
              immediately before terminating the program.

       -n     Cause djvused to disregard save commands. This is useful for
              debugging djvused scripts without overwriting files on your
              disk.

DJVUSED EXAMPLES

       There are many ways to use program djvused. The following examples
       illustrate some common uses of this program.

   Obtaining the size of a page
       Command size outputs the width and height of the selected pages using
       an HTML-friendly syntax. For instance, the following command prints
       the size of page 3 of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; size'

   Extracting the hidden text
       Command print-pure-txt outputs the text associated with a page or a
       document. For instance, the following shell command outputs the text
       for the entire document. Lines and pages are delimited by the usual
       control characters.

          djvused myfile.djvu -e 'print-pure-txt'

       Command print-txt produces a more extensive output describing the
       structure and the location of the text components. The syntax of this
       output is described later in this man page. For instance, the
       following shell command outputs extended text information for page 3
       of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; print-txt'

   Extracting the annotations
       Annotation data can be extracted using command print-ant. The syntax
       of the annotation data is described later in this man page. For
       instance, the following shell command outputs the annotation data for
       the first page of document myfile.djvu.

          djvused myfile.djvu -e 'select 1; print-ant'

       Command print-ant only prints the annotations stored in the selected
       component file. Command print-merged-ant also retrieves annotations
       from all the component files referenced by the current page (using
       INCL chunks) and prints the merged information.

   Dumping/restoring annotations and text
       Three commands, output-txt, output-ant, and output-all, produce
       djvused scripts. For instance, the following shell command produces a
       djvused script, myfile.dsed, that recreates all the text and
       annotation data in document myfile.djvu.

          djvused myfile.djvu -e 'output-all' > myfile.dsed

       Script myfile.dsed is a text file that can be easily edited. The
       following shell command then recreates the text and annotation
       information in file myfile.djvu.

          djvused myfile.djvu -f myfile.dsed -s

   Extracting a page
       Both commands save-page and save-page-with create a DjVu file
       representing the selected component file of a document. The following
       shell command, for instance, creates a file p05.djvu containing page 5
       of document myfile.djvu.

          djvused myfile.djvu -e 'select 5; save-page p05.djvu'

       Each page of a document might import data from another component file
       using the so-called inclusion (INCL) chunks. Command save-page then
       produces a file with unresolved references to imported data. Such a
       file should then be made part of a multi-page document containing the
       required data in other component files. On the other hand, command
       save-page-with copies all the imported data into the output file. This
       file is directly usable. Yet collecting several such files into a
       multi-page document might lead to useless data replication.

   Pre-computing thumbnails
       Command set-thumbnails constructs thumbnails that can later be
       displayed by DjVu viewers. The following shell command, for instance,
       computes thumbnails of size 64x64 pixels for all pages of file
       myfile.djvu.

          djvused myfile.djvu -e 'set-thumbnails 64' -s

DJVUSED COMMANDS

       Command lines might contain zero, one, or more djvused commands and an
       optional comment. Multiple djvused commands must be separated by a
       semicolon character ';'. Comments are introduced by the '#' character
       and extend until the end of the command line.
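
       As an illustration only, the following shell sketch writes a small
       djvused script that combines several commands and trailing comments,
       then runs it with option -f. The names report.dsed and myfile.djvu
       are placeholders, and the djvused call is skipped when the program or
       the document is not available.

```shell
# Write a djvused script: ';' separates commands, '#' starts a comment.
cat > report.dsed <<'EOF'
n                 # print the number of pages
ls                # list all component files
select 1; size    # select page 1, then print its dimensions
EOF

# Run the script only if djvused and the placeholder document exist.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused myfile.djvu -f report.dsed
fi
```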

   Selection commands
       Multi-page DjVu documents are composed of a number of component files.
       Most component files describe a specific page of a document. Some
       component files contain information shared by several pages such as
       shared image data, shared annotations or thumbnails. Many djvused
       commands operate on selected component files. All component files are
       initially selected. The following commands are useful for changing the
       selection.

       n      Print the total number of pages in the document.

       ls     List all component files in the document. Each line contains an
              optional page number, a letter describing the component file
              type, the size of the component file, and the identifier of the
              component file. Component file type letters P, I, A, and T
              respectively stand for page data, shared image data, shared
              annotation data, and thumbnail data. Page numbers are only
              listed for component files containing page data. When it is
              set, the optional page title (see command set-page-title below)
              is displayed after the component file identifier.

       select [fileid]
              Select the component file identified by argument fileid.
              Argument fileid must be either a page number or a component
              file identifier. The select command selects all component files
              when the argument fileid is omitted.

       select-shared-ant
              Select a component file containing shared annotations. Only one
              such component file is supported by the current DjVu software.
              This component file usually contains annotations pertaining to
              the whole document as opposed to specific pages. An error
              message is displayed if there is no such component file.

       create-shared-ant
              Create and select a component file containing shared
              annotations. This command only selects the shared annotation
              component file if such a component file already exists.
              Otherwise it creates a new shared annotation component file and
              makes sure that it is imported by all pages in the document.

   Text and annotation commands
       print-pure-txt
              Print the text stored in the hidden text layer of the selected
              pages. A similar capability is offered by program djvutxt.
              Structural information is sometimes represented by control
              characters. Text from different pages is delimited by form feed
              characters ("\f"). Lines are delimited by newline characters
              ("\n"). Columns, regions, and paragraphs are sometimes
              delimited by vertical tab ("\013"), group separators ("\035")
              and unit separators ("\037") respectively.

       print-txt
              Print extensive hidden text information for the selected pages.
              This information describes the structure of the text on the
              document page and locates the structural elements in the page
              image. The syntax of this output is described later in this man
              page.

       remove-txt
              Remove the hidden text information from the selected component
              files. For instance, executing commands select and remove-txt
              removes all hidden text information from the DjVu document.

       set-txt [djvusedtxtfile]
              Insert hidden text information into the selected pages. The
              optional argument djvusedtxtfile names a file containing the
              hidden text information. This file must contain data similar to
              what is produced by command print-txt. When the optional
              argument is omitted, the program reads the hidden text
              information from the djvused script until reaching an
              end-of-file or a line containing a single period.

       output-txt
              Print a djvused script that reconstructs the hidden text
              information for the selected pages. This script can later be
              edited and executed by invoking program djvused with option -f.

       print-ant
              Print the annotations of the selected component file. The
              annotation data is represented using a simple syntax described
              later in this document.

       print-merged-ant
              Merge the annotations stored in the selected component files
              with the annotations imported from other component files such
              as the shared annotation component file. The annotation data is
              represented using a simple syntax described later in this
              document.

       remove-ant
              Remove the annotation information from the selected component
              files. For instance, executing commands select and remove-ant
              removes all annotation information from the DjVu document.

       set-ant [djvusedantfile]
              Insert annotations into the selected component file. The
              optional argument djvusedantfile names a file containing the
              annotation data. This file must contain data similar to what is
              produced by command print-ant. When the optional argument is
              omitted, the program reads the annotation data from the djvused
              script itself until reaching an end-of-file or a line
              containing a single period.

       output-ant
              Print a djvused script that reconstructs the annotation
              information for the selected pages. This script can later be
              edited and executed by invoking program djvused with option -f.

       print-meta
              Print the meta-data part of the annotations for the selected
              component file. This command displays a subset of the
              information printed by command print-ant using a different
              syntax. Meta-data are organized as key-value pairs. Each
              printed line contains the key name, such as author or title,
              followed by a tab character ("\t") and a double-quoted string
              representing the UTF-8 encoded meta-data value.

       set-meta [djvusedmetafile]
              Set the meta-data part of the annotations of the selected
              component file. The remaining part of the annotations is left
              unchanged. The optional argument djvusedmetafile names a file
              containing the meta-data. This file must contain data similar
              to what is produced by command print-meta. When the optional
              argument is omitted, the program reads the meta-data from the
              djvused script itself until reaching an end-of-file or a line
              containing a single period.

       output-all
              Print a djvused script that reconstructs both the hidden text
              and the annotation information for the selected pages. This
              script can later be edited and executed by invoking program
              djvused with option -f.
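
       The inline form of set-meta can be sketched as follows. This is a
       minimal example, not a reference procedure: the file names meta.dsed
       and myfile.djvu and the key values are placeholders. The script
       stores document-wide meta-data in the shared annotation component
       file, using the tab-separated layout printed by print-meta.

```shell
# Build the script with printf so that key and value are separated by a
# real tab character, as in the output of print-meta.
{
    printf 'create-shared-ant\n'
    printf 'set-meta\n'
    printf 'author\t"Jane Doe"\n'
    printf 'title\t"An Example Document"\n'
    printf '.\n'                 # a single period ends the inline data
} > meta.dsed

# Apply and save (-s) only if djvused and the placeholder document exist.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused myfile.djvu -f meta.dsed -s
fi
```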

   Outline/bookmarks commands
       print-outline
              Print the outline of the document. Nothing is printed if the
              document contains no outline.

       set-outline [djvusedoutlinefile]
              Insert outline information into the document. The optional
              argument djvusedoutlinefile names a file containing the outline
              information. This file must contain data similar to what is
              produced by command print-outline. When the optional argument
              is omitted, the program reads the outline information from the
              djvused script until reaching an end-of-file or a line
              containing a single period.

   Thumbnail commands
       set-thumbnails sz
              Compute thumbnails of size sz x sz pixels and insert them into
              the document. DjVu viewers can later display these thumbnails
              very efficiently without needing to download the data for each
              page. Typical thumbnail sizes range from 48 to 128 pixels.

       remove-thumbnails
              Remove the pre-computed thumbnails from the DjVu document. New
              thumbnails can then be computed using command set-thumbnails.

   Save commands
       The above commands only modify the memory image of the DjVu document.
       The following commands provide means to save the modified data into
       the file system.

       save   Save the modified DjVu document back into the input file
              djvufile specified by the arguments of the program djvused.
              Nothing is done if the DjVu file was not modified. Passing
              option -s to program djvused is equivalent to executing command
              save before exiting the program.

       save-bundled filename
              Save the current DjVu document as a bundled multi-page DjVu
              document named filename. A similar capability is offered by
              program djvmcvt.

       save-indirect filename
              Save the current DjVu document as an indirect multi-page DjVu
              document. The index file of the indirect document will be named
              filename. All other files composing the indirect document will
              be saved into the same directory as the index file. A similar
              capability is offered by program djvmcvt.

       save-page filename
              Save the selected component file into DjVu file filename. The
              selected component file might import data from another
              component file using the so-called inclusion (INCL) chunks.
              This command then produces a file with unresolved references to
              imported data. Such a file should then be made part of a
              multi-page document containing the required data in other
              component files.

       save-page-with filename
              Save the selected component file into DjVu file filename. All
              data imported from other component files is copied into the
              output file as well. This command always produces a usable DjVu
              file. On the other hand, collecting several such files into a
              multi-page document might lead to useless data replication.

   Miscellaneous commands
       help   Display a help message listing all commands supported by
              djvused.

       dump   Display the EA IFF 85 structure of the document or of the
              selected component file. A similar capability is offered by
              program djvudump.

       size   Display the width and the height of the selected pages. The
              dimensions of each page are displayed using a syntax suitable
              for direct insertion into the <EMBED...></EMBED> tags.

       set-page-title title
              Set a page title for the selected page. When page titles are
              available, recent versions of the DjVuLibre viewers display
              these page titles instead of page numbers and also accept them
              in page selection options. Command ls can be used to see both
              the page titles and page identifiers. To unset a page title,
              simply make it equal to the page identifier.

DJVUSED FILE FORMATS

       Djvused uses a simple parenthesized syntax to represent both
       annotations and hidden text.

       *  This syntax is the native syntax used by DjVu for storing
          annotations. Program djvused simply compresses the annotation data
          using the bzz(1) algorithm.

       *  This syntax differs from the native syntax used by DjVu for storing
          the hidden text. Program djvused performs the translations between
          the compact binary representation used by DjVu and the easily
          modifiable parenthesized syntax.

   General syntax
       Djvused files are ASCII text files. The legal characters in djvused
       files are the printable ASCII characters and the space, tab, cr, and
       nl characters. Using other characters has undefined results.

       Djvused files are composed of a sequence of expressions separated by
       blank characters (space, tab, cr, or nl). There are four kinds of
       expressions, namely integers, symbols, strings and lists.

       Integers:
              Integer numbers are represented by one or more digits, with the
              usual interpretation.

       Symbols:
              Symbols, or identifiers, are sequences of printable ASCII
              characters representing a name or a keyword. Acceptable
              characters are the alphanumeric characters, the underscore
              "_", the minus character "-", and the hash character "#".
              Names should not begin with a digit or a minus character.

       Strings:
              Strings denote an arbitrary sequence of bytes, usually
              interpreted as a sequence of UTF-8 encoded characters. Strings
              in djvused files are similar to strings in the C language. They
              are surrounded by double quote characters. Certain sequences of
              characters starting with a backslash ("\") have a special
              meaning. A backslash followed by one of the characters "a",
              "b", "t", "n", "v", "f", "r", "\", or a double quote stands
              for the ASCII character BEL(007), BS(010), HT(011), LF(012),
              VT(013), FF(014), CR(015), BACKSLASH(134), or DOUBLEQUOTE(042)
              respectively, where the numbers are octal codes. A backslash
              followed by one to three digits stands for the byte whose octal
              code is expressed by the digits. All other backslash sequences
              are illegal. All non-printable ASCII characters must be
              escaped.

       Lists: Lists are sequences of expressions separated by blanks and
              surrounded by parentheses. All expression types are acceptable
              within a list, including sub-lists.
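
       The string rules above can be illustrated with a small shell helper.
       The function name dsed_quote is hypothetical and not part of
       DjVuLibre. This sketch only escapes backslashes and double quotes and
       assumes a single line of printable ASCII input. Other bytes would
       need octal escapes, which it does not produce.

```shell
# dsed_quote: read one line of printable ASCII text on standard input and
# print it as a double-quoted djvused string. Backslashes are escaped
# first so that the backslashes added for the quotes are not doubled.
dsed_quote() {
    printf '"%s"\n' "$(sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

printf '%s\n' 'He said "hi" \ bye' | dsed_quote
# prints: "He said \"hi\" \\ bye"
```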

   Hidden text syntax
       The building blocks of the hidden text syntax are lists representing
       each structural component of the hidden text. Structural components
       have the following form:

          (type xmin ymin xmax ymax ... )

       The symbol type must be one of page, column, region, para, line, word,
       or char, listed here by decreasing order of importance. The integers
       xmin, ymin, xmax, and ymax represent the coordinates of a rectangle
       indicating the position of the structural component in the page.
       Coordinates are measured in pixels and have their origin at the bottom
       left corner of the page. The remaining expressions in the list are
       either a single string representing the encoded text associated with
       this structural component, or a sequence of structural components
       with a lesser type.

       The hidden text for each page is simply represented by a single
       structural element of type page. Various levels of structural
       information are acceptable. For instance, the page level component
       might only specify a page level string, or might only provide a list
       of lines, or might provide a full hierarchy down to the individual
       characters.
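
       A hidden text description for one page might look like the sketch
       below. All coordinates are invented for illustration, and the page
       rectangle assumes a 2550x3300 pixel page. A real script should use
       the dimensions reported by command size. The file names text.dsed and
       myfile.djvu are placeholders.

```shell
# A page containing one line made of two words. Each box lies inside the
# box of its parent, with the origin at the bottom left corner.
cat > text.dsed <<'EOF'
select 1
set-txt
(page 0 0 2550 3300
  (line 300 3000 1500 3060
    (word 300 3000 700 3060 "Hello")
    (word 760 3000 1500 3060 "world")))
.
EOF

# Apply and save (-s) only if djvused and the placeholder document exist.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused myfile.djvu -f text.dsed -s
fi
```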

   Outline/Bookmark syntax
       The outline syntax is a single list of the form

          (bookmarks ...)

       The first element of the list is symbol bookmarks. The subsequent
       elements are lists representing the toplevel outline entries. Each
       outline entry is represented by a list with the following form:

          (title url ... )

       The string title is the title of the outline entry. The destination
       string url can be an arbitrary URL or can be composed of the hash
       character ("#") followed by either the component file identifier or
       the page number corresponding to the outline entry. The remaining
       expressions describe subentries of this outline entry.
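
       For instance, a two-level outline could be installed with a script
       such as the following sketch. The chapter titles, page numbers, and
       file names are made up for the example.

```shell
# Two toplevel entries; the first one has two subentries.
cat > outline.dsed <<'EOF'
set-outline
(bookmarks
  ("Chapter 1" "#1"
    ("Section 1.1" "#3")
    ("Section 1.2" "#7"))
  ("Chapter 2" "#12"))
.
EOF

# Apply and save (-s) only if djvused and the placeholder document exist.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused myfile.djvu -f outline.dsed -s
fi
```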

   Annotation syntax
       Annotations are represented by a sequence of annotation expressions.
       The following annotation expressions are recognized:

       (background color)
              Specify the color of the viewer area surrounding the DjVu
              image. Colors are represented with the X11 hexadecimal syntax
              #RRGGBB. For instance, #000000 is black and #FFFFFF is white.

       (zoom zoomvalue)
              Specify the initial zoom factor of the image. Argument
              zoomvalue can be one of stretch, one2one, width, page, or the
              letter d followed by a number in range 1 to 999 representing a
              zoom factor (for instance d300 or d150).

       (mode modevalue)
              Specify the initial display mode of the image. Argument
              modevalue is one of color, bw, fore, or back.

       (align horzalign vertalign)
              Specify how the image should be aligned on the viewer surface.
              By default the image is located in the center. Argument
              horzalign can be one of left, center, or right. Argument
              vertalign can be one of top, center, or bottom.

       (maparea url comment area ...)
              Define a hyperlink for the specified destination.

              Argument url can have one of the following forms:

                 href
                 (url href target)

              where href is a string representing the destination and target
              is a string representing the target frame for the hyperlink,
              as defined by the HTML anchor tag <A>. The destination string
              href can be an arbitrary URL or can be composed of the hash
              character ("#") followed by either a component file identifier
              or a page number. Page numbers may be prefixed with an optional
              sign to represent a page displacement. For instance the strings
              "#-1" and "#+1" can be used to access the previous page and the
              next page.

              Argument comment is a string that might be displayed by the
              viewer when the user moves the mouse over the hyperlink.

              Argument area defines the shape and the location of the
              hyperlink. The following forms are recognized:

                 (rect xmin ymin width height)
                 (oval xmin ymin width height)
                 (poly x0 y0 x1 y1 ... )
                 (text xmin ymin width height)
                 (line x0 y0 x1 y1)

              All parameters are numbers representing coordinates.
              Coordinates are measured in pixels and have their origin at the
              bottom left corner of the page.

              The remaining expressions in the maparea list represent the
              visual effect associated with the hyperlink.

              A first set of options defines how borders are drawn for rect,
              oval, poly, or text hyperlink areas.

                 (none)
                 (xor)
                 (border color)
                 (shadow_in [thickness])
                 (shadow_out [thickness])
                 (shadow_ein [thickness])
                 (shadow_eout [thickness])

              where parameter color has syntax #RRGGBB as described above,
              and parameter thickness is an integer in range 1 to 32. The
              last four border options are only supported for rect hyperlink
              areas. The default border is a simple black line. Border
              options do not apply to line areas.

              When a border option is specified, the border becomes visible
              when the user moves the mouse over the hyperlink. The border
              may be made always visible by using the following option:

                 (border_avis)

              The following two options may be used with rect hyperlink
              areas. The complete area will be highlighted using the
              specified color at the specified opacity (0-100, default 50).

                 (hilite color)
                 (opacity op)

              This is often used with an empty URL for simply emphasizing a
              specific segment of an image.

              The following three options may be used with line areas to
              specify an optional ending arrow, the line width and color.
              The default is a black line with width 1 and without arrow.

                 (arrow)
                 (width w)
                 (lineclr color)

              Finally the following three options can be used with text
              areas. The default background color is transparent. The default
              text color is black. The pushpin option indicates that the text
              is symbolized by a small pushpin icon. Clicking the icon
              reveals the text.

                 (backclr bkcolor)
                 (textclr txtcolor)
                 (pushpin)

       (metadata ... (key value) ... )
              Define meta-data entries. Each entry is identified by a symbol
              key representing the nature of the meta-data entry. The string
              value represents the value associated with the corresponding
              key. Two sets of keys are noteworthy: keys borrowed from the
              BibTeX bibliography system, and keys borrowed from the PDF
              DocInfo metadata. BibTeX keys are always expressed in
              lowercase, such as year, booktitle, editor, author, etc.
              DocInfo keys start with an uppercase letter, such as Title,
              Author, Subject, Creator, Producer, Trapped, CreationDate, and
              ModDate. The values associated with the last two keys should be
              dates expressed according to RFC 3339.
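
       The annotation expressions above can be combined as in the following
       sketch, which sets viewer defaults for page 1, adds a link to the
       next page, and highlights a region. Every coordinate, color, comment
       string, and file name here is an illustrative assumption.

```shell
# Annotations for page 1. Inside the set-ant data, '#' is part of the
# color symbols and link targets, not a djvused comment.
cat > ant.dsed <<'EOF'
select 1
set-ant
(background #FFFFFF)
(zoom width)
(mode color)
(align center top)
(maparea "#+1" "Go to the next page"
  (rect 100 100 400 80) (border #0000FF) (border_avis))
(maparea "" "Highlighted region"
  (rect 100 300 400 80) (none) (hilite #FFFF00) (opacity 30))
(metadata (author "Jane Doe") (year "2005"))
.
EOF

# Apply and save (-s) only if djvused and the placeholder document exist.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused myfile.djvu -f ant.dsed -s
fi
```

       Note that set-ant replaces the annotations of the selected component
       file, so existing annotations are best dumped first with print-ant or
       output-ant.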

LIMITATIONS

       The current version of program djvused only supports selecting one
       component file or all component files. There is no way to select only
       a few component files.
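
       One possible workaround is to generate a script that selects the
       desired pages one at a time and repeats the same command for each.
       The helper name for_pages, the page list, and the file names below
       are hypothetical.

```shell
# for_pages CMD PAGE...: print one "select PAGE; CMD" line per page.
for_pages() {
    cmd=$1
    shift
    for page in "$@"; do
        printf 'select %s; %s\n' "$page" "$cmd"
    done
}

# Remove the annotations of pages 2, 5 and 9 only.
for_pages 'remove-ant' 2 5 9 > subset.dsed

# Apply and save (-s) only if djvused and the placeholder document exist.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused myfile.djvu -f subset.dsed -s
fi
```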

CREDITS

       This program was initially written by Léon Bottou
       <leonb@users.sourceforge.net> and was improved by Yann Le Cun
       <profshadoko@users.sourceforge.net>, Florin Nicsa, Bill Riemers
       <docbill@sourceforge.net> and many others.

SEE ALSO

       djvu(1), djvutxt(1), djvmcvt(1), djvudump(1), bzz(1)



DjVuLibre-3.5                      5/22/2005                        DJVUSED(1)