UNPAPER(1)                          unpaper                         UNPAPER(1)

NAME

       unpaper - post-processing tool for scanned sheets of paper

SYNOPSIS

       unpaper [options] (input patterns output patterns | input files output
       files)

OVERVIEW

       unpaper is a post-processing tool for scanned sheets of paper,
       especially for book pages that have been scanned from previously created
       photocopies. The main purpose is to make scanned book pages more
       readable on screen after conversion to PDF. Additionally, unpaper might
       be useful to enhance the quality of scanned pages before performing
       optical character recognition (OCR).

       unpaper tries to clean scanned images by removing dark edges that
       appeared through scanning or copying on areas outside the actual page
       content (e.g. dark areas between the left-hand side and the right-hand
       side of a double-sided book-page scan). The program also tries to detect
       misaligned centering and rotation of pages and will automatically
       straighten each page by rotating it to the correct angle. This process
       is called "deskewing". Note that the automatic processing will sometimes
       fail. It is always a good idea to manually check the results of unpaper
       and adjust the parameter settings according to the requirements of the
       input. Each processing step can also be disabled individually for each
       sheet.

       Input and output files can be in either .pbm, .pgm or .ppm format, thus
       generally in .pnm format, as also used by the Linux scanning tools
       scanimage and scanadf. Conversion to PDF can be achieved, for example,
       with the Linux tools pgm2tiff, tiffcp and tiff2pdf.

INPUT AND OUTPUT FILES

       Input and output files need to be designated either by using patterns or
       by an ordered list of input and output files. If patterns such as %04d
       are used, the input or output sheet number is substituted into them
       before the file is opened for input or output.
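The substitution can be pictured with printf-style formatting. The following is a minimal sketch, assuming the pattern follows ordinary printf conventions; the function name expand_pattern and the file names are hypothetical and not part of unpaper.

```python
def expand_pattern(pattern: str, number: int) -> str:
    """Substitute a number into a printf-style pattern such as %04d.

    Assumption: the pattern contains exactly one integer placeholder.
    """
    return pattern % number


# Hypothetical file names, for illustration only.
for sheet in (1, 2, 3):
    print(expand_pattern("scan%04d.pgm", sheet))
```

With a %04d placeholder the number is zero-padded to four digits, so files sort in page order.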
       
       If you're not using patterns, then the program expects one or two input
       files depending on what is passed as --input-pages and one or two
       output files depending on what is passed as --output-pages, in order.
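The ordering rule for explicit file lists can be sketched as a small helper that assembles an argument list. This is only an illustration under the assumption that the options come first and the files follow in order, as the SYNOPSIS suggests; build_argv and the file names are hypothetical, and the command is built but not executed.

```python
from typing import List, Sequence


def build_argv(inputs: Sequence[str], outputs: Sequence[str]) -> List[str]:
    """Build an unpaper argument list for explicit (non-pattern) file names.

    Assumption: one or two inputs and one or two outputs, listed in order,
    matching the values passed to --input-pages and --output-pages.
    """
    if len(inputs) not in (1, 2) or len(outputs) not in (1, 2):
        raise ValueError("expected one or two input files and one or two output files")
    return [
        "unpaper",
        "--input-pages", str(len(inputs)),
        "--output-pages", str(len(outputs)),
        *inputs,
        *outputs,
    ]


# Two single-page scans combined into one double-layout sheet (hypothetical names).
print(build_argv(["left.pgm", "right.pgm"], ["sheet.pgm"]))
```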
       
       Missing output file names are fatal and will stop processing; missing
       initial input file names are fatal, and so is any missing input file if
       a range of sheets is defined through --sheet or --end-sheet.

       unpaper accepts files in PNM format, which means they might be in .pbm,
       .pgm, .ppm or .pnm format, which is what is produced by Linux command
       line scanning tools such as scanimage and scanadf.

OPTIONS

       -l { single | double | none } ; --layout { single | double | none }
              Set default layout options for a sheet:

              single One page per sheet.

              double Two pages per sheet, landscape orientation (one page on
                     the left half, one page on the right half).

              none   No auto-layout, mask-scan-points may individually be
                     specified.

              Using single or double automatically sets corresponding
              --mask-scan-points. The default is single.
       
       -start sheet ; --start-sheet sheet
              Number of first sheet to process in multi-sheet mode. (default:
              1)

       -end sheet ; --end-sheet sheet
              Number of last sheet to process in multi-sheet mode. -1 indicates
              processing until no more input files with the corresponding page
              number are available. (default: -1)

       -# sheet-range ; --sheet sheet-range
              Optionally specifies which sheets to process in the range between
              start-sheet and end-sheet.

       -x sheet-range ; --exclude sheet-range
              Excludes sheets from processing in the range between start-sheet
              and end-sheet.
       
       --pre-rotate { -90 | 90 }
              Rotates the whole image clockwise (90) or anti-clockwise (-90)
              before any other processing.

       --post-rotate { -90 | 90 }
              Rotates the whole image clockwise (90) or anti-clockwise (-90)
              after any other processing.

       -M { v | h | v,h } ; --pre-mirror { v | h | v,h }
              Mirror the image, after possible pre-rotation. Either v (for
              vertical mirroring), h (for horizontal mirroring) or v,h (for
              both) can be specified.

       --post-mirror { v | h | v,h }
              Mirror the image, after any other processing except possible
              post-rotation. Either v (for vertical mirroring), h (for
              horizontal mirroring) or v,h (for both) can be specified.
       
       --pre-shift h, v
              Shift the image before further processing. Values for h
              (horizontal shift) and v (vertical shift) can be either positive
              or negative.

       --post-shift h, v
              Shift the image after other processing. Values for h (horizontal
              shift) and v (vertical shift) can be either positive or
              negative.

       --pre-wipe left, top, right, bottom
              Manually wipe out an area before further processing. Any pixel
              in a wiped area will be set to white. Multiple areas to be wiped
              may be specified by multiple occurrences of this option.

       --post-wipe left, top, right, bottom
              Manually wipe out an area after processing. Any pixel in a wiped
              area will be set to white. Multiple areas to be wiped may be
              specified by multiple occurrences of this option.

       --pre-border left, top, right, bottom
              Clear the border-area of the sheet before further processing.
              Any pixel in the border area will be set to white.

       --post-border left, top, right, bottom
              Clear the border-area of the sheet after other processing. Any
              pixel in the border area will be set to white.
       
       --pre-mask x1, y1, x2, y2
              Specify masks to apply before any other processing. Any pixel
              outside a mask will be set to white, unless another mask
              includes this pixel.

              Only pixels inside a mask will remain. Multiple masks may be
              specified. No deskewing will be applied to the masks specified
              by --pre-mask.

       -s { width, height | size-name } ; --size { width, height | size-name }
              Change the sheet size before other processing is applied.
              Content on the sheet gets zoomed to fit the appropriate size,
              but the aspect ratio is preserved. If the sheet's aspect ratio
              changes, the zoomed content is centered on the sheet.

              Possible values for size-name are: a5, a4, a3, letter, legal.
              All size names can also be applied in rotated landscape
              orientation, use a4-landscape, letter-landscape etc.
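The zoom-and-center behaviour described for --size can be sketched as a small calculation. This is an illustration only, assuming the content is scaled by the largest factor that still fits the sheet and then centered with integer rounding; the rounding details and the name fit_to_sheet are assumptions, not taken from unpaper.

```python
def fit_to_sheet(content_w, content_h, sheet_w, sheet_h):
    """Scale content to fit a sheet while keeping its aspect ratio.

    Returns (new_width, new_height, x_offset, y_offset) in pixels.
    Assumption: rounding to whole pixels and floor-centering.
    """
    scale = min(sheet_w / content_w, sheet_h / content_h)
    new_w = round(content_w * scale)
    new_h = round(content_h * scale)
    # Center the zoomed content on the sheet.
    off_x = (sheet_w - new_w) // 2
    off_y = (sheet_h - new_h) // 2
    return new_w, new_h, off_x, off_y


# A portrait page placed on a square sheet ends up centered horizontally.
print(fit_to_sheet(1000, 2000, 2000, 2000))
```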
       
       --post-size { width, height | size-name }
              Change the sheet size preserving the content's aspect ratio
              after other processing steps are applied.

       --stretch { width, height | size-name }
              Change the sheet size before other processing is applied.
              Content on the sheet gets stretched to the specified size,
              possibly changing the aspect ratio.

       --post-stretch { width, height | size-name }
              Change the sheet size after other processing is applied. Content
              on the sheet gets stretched to the specified size, possibly
              changing the aspect ratio.

       -z factor ; --zoom factor
              Change the sheet size according to the given factor before other
              processing is done.

       --post-zoom factor
              Change the sheet size according to the given factor after
              processing is done.
       
       -bn { v | h | v,h } ; --blackfilter-scan-direction { v | h | v,h }
              Directions in which to search for solidly black areas. Either v
              (for vertical searching), h (for horizontal searching) or v,h
              (for both) can be specified. The blackfilter works by moving a
              virtual bar across each page. The darkness inside the virtual
              bar is determined and, if it exceeds blackfilter-scan-threshold,
              black pixels in the area are filled. During filling, the
              blackness of each pixel is determined by black-threshold. The
              bar is then moved by blackfilter-scan-step in the scanning
              direction. Once a page border is encountered, the bar is moved
              down (horizontal scan) or right (vertical scan) by its
              blackfilter-scan-size.
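The bar scan described for the blackfilter can be sketched in a few lines. This is a simplified illustration of the detection step along a single horizontal stripe, not unpaper's implementation: the image is assumed to be a list of rows holding 1 for dark and 0 for light pixels, and the names dark_ratio and scan_stripe are hypothetical. Filling of detected areas and exclusion areas are left out.

```python
def dark_ratio(image, left, top, width, height):
    """Fraction of dark pixels (value 1) inside the bar, clipped to the image."""
    rows = image[top:top + height]
    cells = [px for row in rows for px in row[left:left + width]]
    return sum(cells) / len(cells) if cells else 0.0


def scan_stripe(image, top, size, depth, step, threshold):
    """Slide a size-by-depth bar from left to right along one stripe.

    Returns the x offsets at which the dark ratio exceeds the threshold.
    Assumption: horizontal pass, so size is the bar width and depth its height.
    """
    width = len(image[0])
    hits = []
    for left in range(0, width - size + 1, step):
        if dark_ratio(image, left, top, size, depth) > threshold:
            hits.append(left)
    return hits


# Toy image: 4 rows, 10 columns, with a solid dark band in columns 0..3.
toy = [[1, 1, 1, 1, 0, 0, 0, 0, 0, 0] for _ in range(4)]
print(scan_stripe(toy, top=0, size=2, depth=4, step=1, threshold=0.95))
```

A high threshold such as the default 0.95 means that only almost solidly dark bar positions are reported, which is why ordinary text is usually left alone.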
       
       -bs { size | h-size, v-size } ;
       --blackfilter-scan-size { size | h-size, v-size }
              Size of virtual bar in direction of scanning (meaning width for
              horizontal pass, height for vertical pass) used for black area
              detection. Two values may be specified to individually set the
              size for the horizontal scanning-pass and the vertical pass.
              (default: 20,20)

       -bd { depth | h-depth, v-depth } ;
       --blackfilter-scan-depth { depth | h-depth, v-depth }
              Depth of virtual bar in non-scanning direction (meaning height
              for horizontal pass, width for vertical pass) used for black
              area detection. Two values may be specified to individually set
              the depth for the horizontal scanning-pass and the vertical
              pass. (default: 500,500)

       -bp { step | h-step, v-step } ;
       --blackfilter-scan-step { step | h-step, v-step }
              Steps to move virtual bar for black area detection. Two values
              may be specified to individually set the step for the horizontal
              scanning-pass and the vertical pass. (default: 5,5)

       -bt threshold ; --blackfilter-scan-threshold threshold
              Ratio of dark pixels above which a black area gets detected.
              (default: 0.95)

       -bx left, top, right, bottom ;
       --blackfilter-scan-exclude left, top, right, bottom
              Area on which the blackfilter should not operate. This can be
              useful to prevent the blackfilter from working on inner page
              content. May be specified multiple times to set more than one
              area.

       -bi intensity ; --blackfilter-intensity intensity
              Intensity with which to delete black areas. This deletes pixels
              around the virtual scan bar. Larger values will leave fewer
              noise pixels around former black areas, but may delete page
              content. (default: 20)
       
       -ni intensity ; --noisefilter-intensity intensity
              Intensity with which to delete individual pixels or tiny
              clusters of pixels. Any cluster which contains at most intensity
              dark pixels will be deleted. (default: 4)

       -ls { size | h-size, v-size } ;
       --blurfilter-size { size | h-size, v-size }
              Size of blurfilter area to search for "lonely" clusters of
              pixels. (default: 100,100)

       -lp { step | h-step, v-step } ;
       --blurfilter-step { step | h-step, v-step }
              Size of "blurring" steps in each direction. (default: 50,50)

       -li ratio ; --blurfilter-intensity ratio
              Relative intensity with which to delete tiny clusters of pixels.
              Any blurred area which contains at most the ratio of dark pixels
              will be cleared. (default: 0.01)

       -gs { size | h-size, v-size } ;
       --grayfilter-size { size | h-size, v-size }
              Size of grayfilter mask to search for "gray-only" areas of
              pixels. (default: 50,50)

       -gp { step | h-step, v-step } ;
       --grayfilter-step { step | h-step, v-step }
              Size of steps moving the grayfilter mask in each direction.
              (default: 20,20)

       -gt ratio ; --grayfilter-threshold ratio
              Relative intensity of grayness which is accepted before clearing
              the grayfilter mask in cases where no black pixel is found in
              the mask. (default: 0.5)
       
       -p x, y ; --mask-scan-point x, y
              Manually set starting point for mask-detection. Multiple
              --mask-scan-point options may be specified to detect multiple
              masks.

       -m x1, y1, x2, y2 ; --mask x1, y1, x2, y2
              Manually add a mask, in addition to masks automatically detected
              around the --mask-scan-point coordinates (unless --no-mask-scan
              is specified).

              Any pixel outside a mask will be set to white, unless another
              mask covers this pixel.

       -mn { v | h | v,h } ; --mask-scan-direction { v | h | v,h }
              Directions in which to search for mask borders, starting from
              --mask-scan-point coordinates. Either v (for vertical scanning),
              h (for horizontal scanning) or v,h (for both) can be specified.
              (default: h, as v may cut text paragraphs on single-page sheets)

       -ms { size | h-size, v-size } ;
       --mask-scan-size { size | h-size, v-size }
              Width of the virtual bar used for mask detection. Two values may
              be specified to individually set horizontal and vertical size.
              (default: 50,50)

       -md { depth | h-depth, v-depth } ;
       --mask-scan-depth { depth | h-depth, v-depth }
              Height of the virtual bar used for mask detection. (default:
              -1,-1, using the total width or height of the sheet)
       
       -mp { step | h-step, v-step } ;
       --mask-scan-step { step | h-step, v-step }
              Steps to move the virtual bar for mask detection. (default: 5,5)

       -mt { threshold | h-threshold, v-threshold } ;
       --mask-scan-threshold { threshold | h-threshold, v-threshold }
              Ratio of dark pixels below which an edge gets detected, relative
              to maximum blackness when counting from the start coordinate
              heading towards one edge. (default: 0.1)

       -mm w, h ; --mask-scan-minimum w, h
              Minimum allowed size of an auto-detected mask. Masks detected
              below this size will be ignored and set to the size specified by
              mask-scan-maximum. (default: 100,100)

       -mM w, h ; --mask-scan-maximum w, h
              Maximum allowed size of an auto-detected mask. Masks detected
              above this size will be shrunk to the maximum value, each
              direction individually. (default: sheet size, or page size
              derived from --layout option)

       -mc color ; --mask-color color
              Color value with which to wipe out pixels not covered by any
              mask. May be useful for testing in order to visualize the effect
              of masking. (Note that an RGB value is expected: R*65536 +
              G*256 + B.)
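The RGB formula above can be evaluated with a short helper. This is a minimal sketch; the function name rgb_value is hypothetical, and each channel is assumed to be an integer in the range 0 to 255.

```python
def rgb_value(r: int, g: int, b: int) -> int:
    """Pack 8-bit channels into a single number: R*65536 + G*256 + B."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be in the range 0..255")
    return r * 65536 + g * 256 + b


# Pure red, which makes masked-out areas easy to spot while testing.
print(rgb_value(255, 0, 0))
```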
       
       -dn { left | top | right | bottom },... ;
       --deskew-scan-direction { left | top | right | bottom },...
              Edges from which to scan for rotation. Each edge of a mask can
              be used to detect the mask's rotation. If multiple edges are
              specified, the average value will be used, unless the
              statistical deviation exceeds --deskew-scan-deviation. Use left
              for scanning from the left edge, top for scanning from the top
              edge, right for scanning from the right edge, bottom for
              scanning from the bottom. Multiple directions can be separated
              by commas. (default: left,right)

       -ds pixels ; --deskew-scan-size pixels
              Size of virtual line for rotation detection. (default: 1500)

       -dd ratio ; --deskew-scan-depth ratio
              Amount of dark pixels to accumulate until scanning is stopped,
              relative to scan-bar size. (default: 0.5)

       -dr degrees ; --deskew-scan-range degrees
              Range in which to search for rotation, from -degrees to +degrees
              rotation. (default: 5.0)

       -dp degrees ; --deskew-scan-step degrees
              Steps between single rotation-angle detections. Lower numbers
              lead to better results but slow down processing. (default: 0.1)

       -dv deviation ; --deskew-scan-deviation deviation
              Maximum statistical deviation allowed among the results from
              detected edges. No rotation if exceeded. (default: 1.0)
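How the range, step and deviation settings interact can be sketched as follows. This is an illustration under stated assumptions, not unpaper's algorithm: "statistical deviation" is taken here to mean the population standard deviation, the per-edge angles are assumed to be given, and the names candidate_angles and combined_rotation are hypothetical.

```python
from statistics import mean, pstdev


def candidate_angles(scan_range=5.0, step=0.1):
    """Angles from -scan_range to +scan_range in increments of step (degrees)."""
    n = round(scan_range / step)
    return [i * step for i in range(-n, n + 1)]


def combined_rotation(edge_angles, max_deviation=1.0):
    """Average the per-edge results, or return 0.0 (no rotation) if they disagree.

    Assumption: disagreement is measured with the population standard deviation.
    """
    if not edge_angles:
        return 0.0
    if pstdev(edge_angles) > max_deviation:
        return 0.0
    return mean(edge_angles)


print(len(candidate_angles()))          # number of angles tried with the defaults
print(combined_rotation([1.0, 1.2]))    # edges agree: the average is used
print(combined_rotation([-3.0, 3.0]))   # edges disagree: no rotation
```

Halving the step roughly doubles the number of candidate angles, which is the cost behind the note that lower step values slow down processing.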
       
       -W left, top, right, bottom ; --wipe left, top, right, bottom
              Manually wipe out an area. Any pixel in a wiped area will be set
              to white. Multiple --wipe areas may be specified. This is
              applied after deskewing and before automatic border-scan.

       -mw { size | left, right } ; --middle-wipe { size | left, right }
              If --layout is set to double, this may specify the size of a
              middle area to wipe out between the two pages on the sheet. This
              may be useful if the blackfilter fails to remove some black
              areas (e.g. resulting from photo-copying in the middle between
              two pages).

       -B left, top, right, bottom ; --border left, top, right, bottom
              Manually add a border. Any pixel in the border area will be set
              to white. This is applied after deskewing and before automatic
              border-scan.

       -Bn { v | h | v,h } ; --border-scan-direction { v | h | v,h }
              Directions in which to search for outer border. Either v (for
              vertical scanning), h (for horizontal scanning) or v,h (for
              both) can be specified. (default: v)

       -Bs { size | h-size, v-size } ;
       --border-scan-size { size | h-size, v-size }
              Width of virtual bar used for border detection. Two values may
              be specified to individually set horizontal and vertical size.
              (default: 5,5)

       -Bp { step | h-step, v-step } ;
       --border-scan-step { step | h-step, v-step }
              Steps to move virtual bar for border detection. (default: 5,5)

       -Bt threshold ; --border-scan-threshold threshold
              Absolute number of dark pixels covered by the border-scan mask
              above which a border is detected. (default: 5)
       
       -Ba { left | top | right | bottom } ;
       --border-align { left | top | right | bottom }
              Direction where to shift the detected border-area. Use
              --border-margin to specify horizontal and vertical distances to
              be kept from the sheet-edge. (default: none)

       -Bm vertical, horizontal ; --border-margin vertical, horizontal
              Distance to keep from the sheet edge when aligning a border
              area. May use measurement suffixes such as cm, in.

       -w threshold ; --white-threshold threshold
              Brightness ratio above which a pixel is considered white.
              (default: 0.9)

       -b threshold ; --black-threshold threshold
              Brightness ratio below which a pixel is considered black
              (non-gray). This is used by the gray-filter and the
              blackfilter. This value is also used when converting a grayscale
              image to black-and-white mode. (default: 0.33)
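The two thresholds split the brightness scale into three bands. The following is a minimal sketch of that reading, assuming brightness is expressed as a ratio between 0.0 (black) and 1.0 (white); the function name classify is hypothetical.

```python
def classify(brightness, white_threshold=0.9, black_threshold=0.33):
    """Classify a brightness ratio as 'white', 'black' or 'gray'.

    Assumption: brightness is a ratio in 0.0..1.0, e.g. an 8-bit value / 255.
    """
    if brightness > white_threshold:
        return "white"
    if brightness < black_threshold:
        return "black"
    return "gray"


for value in (0.95, 0.5, 0.2):
    print(value, classify(value))
```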
       
       -ip { 1 | 2 } ; --input-pages { 1 | 2 }
              If 2 is specified, read two input images instead of one and
              internally combine them to a doubled-layout sheet before further
              processing. Before internally combining, --pre-rotate is
              optionally applied individually to both input images as the very
              first processing step.

       -op { 1 | 2 } ; --output-pages { 1 | 2 }
              If 2 is specified, write two output images instead of one, as a
              result of splitting a doubled-layout sheet after processing.
              After splitting the sheet, --post-rotate is optionally applied
              individually to both output images as the very last processing
              step.

       -S { width, height | size-name } ;
       --sheet-size { width, height | size-name }
              Force a fixed sheet size. Usually, the sheet size is determined
              by the input image size (if input-pages=1), or by double the
              size of the first page in a two-page input set (if
              input-pages=2). If the input image is smaller than the size
              specified here, it will appear centered and surrounded with a
              white border on the sheet. If the input image is bigger, it will
              be centered and the edges will be cropped. This option may also
              be helpful to get regular sized output images if the input image
              sizes differ. Standard size-names like a4-landscape, letter,
              etc. may be used (see --size). (default: as in input file)
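The centering rule for --sheet-size reduces to one offset per axis. This is a sketch under the assumption that centering uses integer division; the name place_on_sheet and the pixel sizes are illustrative only.

```python
def place_on_sheet(image_size, sheet_size):
    """Return the (x, y) offset of an image centered on a sheet.

    A positive offset means a border of that many pixels on that side;
    a negative offset means the image is cropped on that axis.
    Assumption: integer (floor) division is used for centering.
    """
    return tuple((sheet - image) // 2 for image, sheet in zip(image_size, sheet_size))


# Smaller image on a larger sheet: surrounded by a border.
print(place_on_sheet((2000, 3000), (2480, 3508)))
# Larger image on the same sheet: edges are cropped.
print(place_on_sheet((3000, 4000), (2480, 3508)))
```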
       
       --sheet-background { black | white }
              Sets a color with which the sheet is filled before any image is
              loaded and placed onto it. This can be useful when the sheet
              size and the image size differ.

       --no-blackfilter sheet-range
              Disables black area scan. Individual sheet indices can be
              specified.

       --no-noisefilter sheet-range
              Disables the noisefilter. Individual sheet indices can be
              specified.

       --no-blurfilter sheet-range
              Disables the blurfilter. Individual sheet indices can be
              specified.

       --no-grayfilter sheet-range
              Disables the grayfilter. Individual sheet indices can be
              specified.

       --no-mask-scan sheet-range
              Disables mask-detection. Masks explicitly set by --mask will
              still have effect. Individual sheet indices can be specified.

       --no-mask-center sheet-range
              Disables auto-centering of each mask. Auto-centering is
              performed by default if the --layout option has been set.
              Individual sheet indices can be specified.

       --no-deskew sheet-range
              Disables deskewing. Individual sheet indices can be specified.

       --no-wipe sheet-range
              Disables explicit wipe-areas. This means the effect of parameter
              --wipe can be disabled individually per sheet.

       --no-border sheet-range
              Disables explicitly set borders. This means the effect of
              parameter --border can be disabled individually per sheet.

       --no-border-scan sheet-range
              Disables border-scanning from the edges of the sheet. Individual
              sheet indices can be specified.

       --no-border-align sheet-range
              Disables aligning of the area detected by border-scanning (see
              --border-align). Individual sheet indices can be specified.

       -n sheet-range ; --no-processing sheet-range
              Do not perform any processing on a sheet except pre/post
              rotating and mirroring, and file-depth conversions on saving.
              This option has the same effect as setting all --no-xxx options
              together. Individual sheet indices can be specified.
       
       --interpolate { nearest | linear | cubic }
              Set the interpolation function used for deskewing and
              stretching. The cubic option provides the best image quality,
              while nearest is the fastest. (default: cubic)

       --no-multi-pages
              Disable multi-page processing even if the input filename
              contains a % (usually indicating the start of a placeholder for
              the page counter).

       --dpi dpi
              Dots per inch used for conversion of measured size values, such
              as 21cm,27.9cm. Note that this parameter should occur before
              specifying any size value with a measurement suffix. (default:
              300)
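The conversion that --dpi controls is ordinary unit arithmetic. The sketch below assumes only the cm and in suffixes mentioned in this page and returns an unrounded pixel count, since the rounding unpaper applies is not stated here; the function name to_pixels is hypothetical.

```python
def to_pixels(value: str, dpi: float = 300) -> float:
    """Convert a size such as '21cm' or '8.5in' to pixels at the given dpi.

    Assumption: a value without a suffix is already a pixel count.
    """
    inches_per_unit = {"cm": 1 / 2.54, "in": 1.0}
    for suffix, factor in inches_per_unit.items():
        if value.endswith(suffix):
            return float(value[:-len(suffix)]) * factor * dpi
    return float(value)


# The example size from the option text, at the default 300 dpi.
print(to_pixels("21cm"), to_pixels("27.9cm"))
```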
       
       -t { pbm | pgm | ppm } ; --type { pbm | pgm | ppm }
              Output file type (and bit depth). If not specified, the one with
              the same, or closest, pixel format as the original input files
              will be used.

              pbm    Portable Bit Map, monochrome raw image.

              pgm    Portable Grayscale Map, 8-bit per pixel grayscale raw
                     image.

              ppm    Portable Pixel Map, 24-bit per pixel RGB raw image.

       -T ; --test-only
              Do not write any output. May be useful in combination with
              --verbose to get information about the input.

       -si nr ; --start-input nr
              Set the first page number to substitute for '%d' in input
              filenames. Every time the input file sequence is repeated, this
              number gets increased by 1. (default:
              (startsheet-1)*inputpages+1)

       -so nr ; --start-output nr
              Set the first page number to substitute for '%d' in output
              filenames. Every time the output file sequence is repeated, this
              number gets increased by 1. (default:
              (startsheet-1)*outputpages+1)
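The default formulas for --start-input and --start-output can be checked with a one-line function. This is a direct transcription of the expression in the option text; the function name default_start is hypothetical.

```python
def default_start(start_sheet: int, pages_per_sheet: int) -> int:
    """Default first page number: (startsheet - 1) * pages + 1."""
    return (start_sheet - 1) * pages_per_sheet + 1


# Starting at sheet 3 with two input pages per sheet, the first input page is 5.
print(default_start(3, 2))
```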
       
       --insert-blank nr [,nr...]
              Use blank input instead of an input file from the input file
              sequence at the specified index-positions. The input file
              sequence will be interrupted temporarily and will continue with
              the next input file afterwards. This can be useful to insert
              blank content into a sequence of input images.

       --replace-blank nr [,nr...]
              Like --insert-blank, but the input images at the specified index
              positions get replaced with blank content and thus will be
              ignored.

       --overwrite
              Allow overwriting existing files. Otherwise the program
              terminates with an error if an output file to be written already
              exists.

       -q ; --quiet
              Quiet mode, no output at all.

       -v ; --verbose
              Verbose output, more info messages.

       -vv    Even more verbose output, show parameter settings before
              processing.

       -V ; --version
              Output version and build information.

AUTHOR

       The unpaper authors

       2023, The unpaper Authors

                                 Jul 22, 2023                       UNPAPER(1)