GSCAN2PDF(1)          User Contributed Perl Documentation         GSCAN2PDF(1)

NAME

6       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents
7

USAGE

9       1. Scan one or several pages in with File/Scan
10       2. Create PDF of selected pages with File/Save
11

REQUIRED ARGUMENTS

13       None
14

OPTIONS

16       gscan2pdf has the following command-line options:
17
18       --device=device
           Specifies the device to use, instead of getting the list of devices
           from the SANE API.  This can be useful if the scanner is on a
           remote computer which is not broadcasting its existence.
22
23       --help
24           Displays this help page and exits.
25
26       --log=log-file
27           Specifies a file to store logging messages.
28
29       --debug, --info, --warn, --error, --fatal
30           Defines the log level.  If a log file is specified, this defaults
31           to --debug, otherwise --error.
32
33       --import=PDF|DjVu|images
34           Imports the specified file(s). If the document has more than one
35           page, a window is displayed to select the required pages.
36
       --import-all=PDF|DjVu|images
           Imports all pages of the specified file(s).

       --version
           Displays the program version and exits.
41
42       Scanning is handled with SANE via scanimage.  PDF conversion is done by
43       PDF::Builder.  TIFF export is handled by libtiff (faster and smaller
44       memory footprint for multipage files).
45

DIAGNOSTICS

47       To diagnose a possible error, start gscan2pdf from the command line
48       with logging enabled:
49
50       "gscan2pdf --log=file.log"
51
52       and check file.log.
53

EXIT STATUS

55       None
56

CONFIGURATION

58       gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
59       directory can be changed by setting the $XDG_CONFIG_HOME variable.
60       Generally, however, preferences should be changed via the
61       Edit/Preferences menu, or are captured automatically during normal
62       usage of the program.
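
       As a minimal sketch of the lookup described above, the following
       shell function prints where the resource file would be expected. It
       assumes the usual XDG convention that an unset or empty
       $XDG_CONFIG_HOME falls back to ~/.config. The function name
       config_file is made up for this illustration and is not part of
       gscan2pdf.

```shell
# Hypothetical helper: print the expected location of gscan2pdfrc.
# Assumption: an unset or empty XDG_CONFIG_HOME falls back to $HOME/.config.
config_file() {
    printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/gscan2pdfrc"
}

config_file
```

       Running it with XDG_CONFIG_HOME set to another directory shows the
       alternative location without touching any existing settings.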
63

INCOMPATIBILITIES

65       None known.
66

BUGS AND LIMITATIONS

68       Whilst it is possible to import PDFs, this is intended to be able to
69       round-trip files created by gscan2pdf.
70

Download

72       gscan2pdf is available on Sourceforge
73       (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).
74
75   Debian-based
76       If you are using Debian, you should find that sid
77       <https://www.debian.org/releases/sid/> has the latest version already
78       packaged.
79
       If you are using an Ubuntu-based system, you can automatically keep up
       to date with the latest version via the ppa:
82
83       "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"
84
       If you are using Synaptic, then use menu Edit/Reload Package
       Information, search for gscan2pdf in the package list, and lo and
       behold, you can install the nice shiny new version.
88
89       From the command line:
90
91       "sudo apt update"
92
93       "sudo apt install gscan2pdf"
94
95   From source
96       The source is hosted in the files section of the gscan2pdf project on
97       Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).
98
99   From the repository
100       gscan2pdf uses Git for its Revision Control System. You can browse the
101       tree at <https://sourceforge.net/p/gscan2pdf/code/>.
102
103       Git users can clone the complete tree with "git clone
104       git://git.code.sf.net/p/gscan2pdf/code"
105

Building gscan2pdf from source

       Having downloaded the source either from a Sourceforge file release, or
       from the Git repository, unpack it if necessary and change into the
       source directory:

       "tar xvfz gscan2pdf-x.x.x.tar.gz"

       "cd gscan2pdf-x.x.x"

       "perl Makefile.PL" will create the Makefile.
112
113       "make test" should run several hundred tests to confirm that things
114       will work properly on your system.
115
116       You can install directly from the source with "make install", but
117       building the appropriate package for your distribution should be as
118       straightforward as "make debdist" or "make rpmdist". However, you will
119       additionally need the rpm, devscripts, fakeroot, debhelper and gettext
120       packages.
121

Dependencies

       The list below looks daunting, but all packages are available from any
       reasonably up-to-date distribution. If you are using Synaptic, having
125       installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-
126       click it and you can install them under Recommends. Note also that the
127       library names given below are the Debian/Ubuntu ones. Those
128       distributions using RPM typically use perl(module) where Debian has
129       libmodule-perl.
130
131       Required
132           libgtk3-perl >= 0.028
               There is a bug in versions of libgtk3-perl before 0.028 that
               causes gscan2pdf to crash when saving. Whilst I could prevent
               gscan2pdf from crashing, it would still be impossible to save
               anything, rendering gscan2pdf rather useless.
137
138           libgtk3-simplelist-perl
139               A simple interface to Gtk3's complex MVC list widget
140
141           liblocale-gettext-perl (>= 1.05)
142               Using libc functions for internationalisation in Perl
143
144           libpdf-builder-perl
145               provides the functions for creating PDF documents in Perl
146
147           libsane
148               API library for scanners
149
150           libimage-sane-perl
151               Perl bindings for libsane.
152
153           libset-intspan-perl
154               manages sets of integers
155
156           libtiff-tools
157               TIFF manipulation and conversion tools
158
           imagemagick
160               Image manipulation programs
161
162           perlmagick
163               A perl interface to the libMagick graphics routines
164
165           sane-utils
166               API library for scanners -- utilities.
167
168       Optional
169           sane
170               scanner graphical frontends. Only required for the scanadf
171               frontend.
172
173           unpaper
174               post-processing tool for scanned pages. See
175               <https://www.flameeyes.eu/projects/unpaper>.
176
177           xdg-utils
178               Desktop integration utilities from freedesktop.org. Required
179               for Email as PDF.  See
180               <https://www.freedesktop.org/wiki/Software/xdg-utils/>
181
182           djvulibre-bin
183               Utilities for the DjVu image format. See
184               <http://djvu.sourceforge.net/>
185
186           gocr
187               A command line OCR. See <http://jocr.sourceforge.net/>.
188
189           tesseract
190               A command line OCR. See
191               <https://github.com/tesseract-ocr/tesseract>
192
193           cuneiform
194               A command line OCR. See <http://launchpad.net/cuneiform-linux>
195

Support

197       There are two mailing lists for gscan2pdf:
198
199       gscan2pdf-announce
200           A low-traffic list for announcements, mostly of new releases. You
201           can subscribe at
202           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>
203
204       gscan2pdf-help
           General support, questions, etc. You can subscribe at
206           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>
207

Reporting bugs

209       Before reporting bugs, please read the "FAQs" section.
210
211       Please report any bugs found, preferably against the Debian
212       package[1][2].  You do not need to be a Debian user, or set up an
213       account to do this.  The Debian tool "reportbug" provides a convenient
214       GUI for doing so.
215
216       1. https://packages.debian.org/sid/gscan2pdf
217       2. https://www.debian.org/Bugs/
218
219       Alternatively, there is a bug tracker for the gscan2pdf project on
220       Sourceforge
221       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).
222
223       Please include the log file created by "gscan2pdf --log=log" with any
224       new bug report.
225

Translations

227       gscan2pdf has already been partly translated into several languages.
228       If you would like to contribute to an existing or new translation,
229       please check out Rosetta:
230       <https://translations.launchpad.net/gscan2pdf>
231
       Note that the translations for the scanner options are taken directly
       from sane-backends. If you would like to contribute to these, you can
       do so by contacting the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and having a look at the po/
       directory in the source code <http://www.sane-project.org/cvs.html>.
237
238       Alternatively, Ubuntu has its own translation project. For the 9.04
239       release, the translations are available at
240       <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>
241
       If you have updated a ".po" file in the "po" directory of the
243       gscan2pdf source tree and would like to test it, pick a test directory
244       for the compiled locales, e.g.  "./locale", and create the ".mo" files
245       with:
246
247       "perl Makefile.PL LOCALEDIR=./locale"
248
249       If the updated locale is your standard one, then the following will
250       find the updated file:
251
252       "perl -I lib bin/gscan2pdf --log=log --locale=locale"
253
254       If it is not your standard locale, you will need something like (for
255       Russian):
256
257       "LC_ALL=ru_RU.utf8 LC_MESSAGES=ru_RU.utf8 LC_CTYPE=ru_RU.utf8
258       LANG=ru_RU.utf8 LANGUAGE=ru_RU.utf8 perl -I lib bin/gscan2pdf --log=log
259       --locale=locale"
260
261       or German:
262
263       "LC_ALL=de_DE LC_MESSAGES=de_DE LC_CTYPE=de_DE LANG=de_DE
264       LANGUAGE=de_DE perl -I lib bin/gscan2pdf --log=log --locale=locale"
265
       If the above doesn't work, make sure the locale is in the list produced
       by "locale -a", including any ".utf8" suffix. If necessary, generate
       new locales with "sudo dpkg-reconfigure locales".
269

DESCRIPTION

271   File
272       New
273
274       Clears the page list.
275
276       Open
277
278       Opens any format that imagemagick supports. PDFs will have their
279       embedded images extracted and imported one per page.
280
281       Note that files can also be imported by dragging them into the
282       thumbnail list from a program like nautilus or konqueror.
283
284       Scan
285
286       Sets options before scanning via SANE.
287
288       Device
289
290       Chooses between available scanners.
291
292       # Pages
293
294       Selects the number of pages, or all pages to scan.
295
296       Source document
297
       Selects between single-sided or double-sided pages.

       This affects the page numbering.  Single-sided scans are numbered
       consecutively.  Double-sided scans are incremented (or decremented, see
       below) by 2, i.e. 1, 3, 5, etc.
303
304       Side to scan
305
306       If double sided is selected above, assuming a non-duplex scanner, i.e.
307       a scanner that cannot automatically scan both sides of a page, this
308       determines whether the page number is incremented or decremented by 2.
309
310       To scan both sides of three pages, i.e. 6 sides:
311
312       1. Select:
313           # Pages = 3 (or "all" if your scanner can detect when it is out of
314           paper)
315
316           Double sided
317
318           Facing side
319
320       2. Scans sides 1, 3 & 5.
321       3. Put pile back with scanner ready to scan back of last page.
322       4. Select:
323           # Pages = 3 (or "all" if your scanner can detect when it is out of
324           paper)
325
326           Double sided
327
328           Reverse side
329
330       5. Scans sides 6, 4 & 2.
331       6. gscan2pdf automatically sorts the pages so that they appear in the
332       correct order.
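
       The numbering in the steps above can be sketched as a small shell
       function. This is only an illustration of the arithmetic, not code
       taken from gscan2pdf. The name page_numbers and its arguments are
       hypothetical.

```shell
# Hypothetical helper: print the page numbers given to a run of N scans
# of a double-sided document.
#   facing  side: 1, 3, 5, ...      (incremented by 2)
#   reverse side: 2N, 2N-2, ..., 2  (decremented by 2)
page_numbers() {
    n=$1
    side=$2
    if [ "$side" = "facing" ]; then
        page=1
        step=2
    else
        page=$((2 * n))
        step=-2
    fi
    i=0
    out=""
    while [ "$i" -lt "$n" ]; do
        out="$out${out:+ }$page"
        page=$((page + step))
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

page_numbers 3 facing     # prints: 1 3 5
page_numbers 3 reverse    # prints: 6 4 2
```

       Sorting the two runs together gives 1 to 6, which is the order in
       which the pages appear once both passes are complete.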
333
334       Device-dependent options
335
336       These, naturally, depend on your scanner.  They can include
337
338       Page size.
339       Mode (colour/black & white/greyscale)
340       Resolution (in PPI)
341       Batch-scan
342           Guarantees that a "no documents" condition will be returned after
343           the last scanned page, to prevent endless flatbed scans after a
344           batch scan.
345
346       Wait-for-button/Button-wait
347           After sending the scan command, wait until the button on the
348           scanner is pressed before actually starting the scan process.
349
350       Source
351           Selects the document source.  Possible options can include Flatbed
352           or ADF.  On some scanners, this is the only way of generating an
353           out-of-documents signal.
354
355       Save
356
357       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
358       GIF.
359
360       Metadata
361
       Metadata is information that is not visible when viewing the
       PDF/DjVu, but is embedded in the file, so it is searchable and can be
       examined, typically with the "Properties" option of the document
       viewer.

       The metadata is completely optional, but can also be used to generate
       the filename; see Preferences for details.
369
370       The date can be selected with use of the calendar widget. The displayed
371       date can be incremented or decremented with use of the '+' and '-'
372       keys.
373
374       DjVu
375
       Both black and white and colour images compress better as DjVu than as
       PDF. See <http://www.djvuzone.org/> for more details.
378
379       Email as PDF
380
381       Attaches the selected or all pages as a PDF to a blank email.  This
382       requires xdg-email, which is in the xdg-utils package.  If this is not
383       present, the option is ghosted out.
384
385       Print
386
387       Prints the selected or all pages.
388
389       Compress temporary files
390
       If your temporary ($TMPDIR) directory is getting full, this function
       can be useful: it compresses all images as LZW-compressed TIFFs. These
       require much less space than the PNM files that are typically produced
       by SANE or by importing a PDF.
395
396   Edit
397       Delete
398
399       Deletes the selected page.
400
401       Renumber
402
403       Renumbers the pages from 1..n.
404
405       Note that the page order can also be changed by drag and drop in the
406       thumbnail view.
407
408       Select
409
       The select menus can be used to select all, even, odd, blank, dark or
       modified pages. Selecting blank or dark pages runs imagemagick to make
       the decision.  Selecting modified pages selects those which have been
       modified by threshold, unsharp, etc., since the last OCR run was made.
414
415       Properties
416
417       When an image is scanned, gscan2pdf attempts to extract the resolution
418       from the scan options. This nearly always works without problem.
419
420       Importing an image can be trickier, however. Some image formats such as
421       PNM do not encode metadata for resolution. In other cases, the data is
422       incorrect.  Edit/Properties allows the user to manually correct the
       metadata for a particular page, thus correcting the size of the final PDF
424       or DjVu. The image itself is otherwise not changed - it is not down- or
425       upscaled.
426
427       Preferences
428
429       The preferences menu item allows the control of the default behaviour
430       of various functions. Most of these are self-explanatory.
431
432       Frontends
433
434       gscan2pdf initially supported two frontends, scanimage and scanadf.
435       scanadf support was added when it was realised that scanadf works
436       better than scanimage with some scanners. On Debian-based systems,
437       scanadf is in the sane package, not, like scanimage, in sane-utils. If
438       scanadf is not present, the option is obviously ghosted out.
439
440       In 0.9.27, Perl bindings for SANE were introduced. These are called
441       libsane-perl.
442
443       Before 1.2.0, options available through CLI frontends like scanimage
444       were made visible as users asked for them. In 1.2.0, all options can be
445       shown or hidden via Edit/Preferences, along with the ability to specify
446       which options trigger a reload.
447
       In 1.8.3, new Perl bindings for SANE were introduced. These are called
449       libimage-sane-perl and are the preferred frontend.
450
451       In 1.8.5, support for libsane-perl was removed.
452
453       Device blacklist
454
455       Ignore listed devices.
456
457       Note that this is a device name regular expression, e.g. /dev/video,
458       and not the name as listed in the scan window, e.g. Noname
459       Integrated_Webcam_HD.
460
461       Default filename for PDF or DjVu files
462
463       All strftime codes (e.g. %Y for the current year) are available as
464       variables, with the following additions:
465
466       %Da author
467
468       %De filename extension
469
470       %Dt title
471
472       All document date codes use strftime codes with a leading D, e.g.:
473
474       %DY document year
475
476       %Dm document month
477
478       %Dd document day
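
       To make the codes above concrete, here is a minimal sketch that
       expands only the document-specific codes listed in this section (%Da,
       %Dt, %De, %DY, %Dm, %Dd). It is not gscan2pdf's own implementation.
       The function name expand_doc_codes and its argument order are
       assumptions made for the example, the document date is assumed to be
       given as YYYY-MM-DD, and plain strftime codes such as %Y as well as
       any other %D code are left untouched.

```shell
# Hypothetical helper: expand the document codes in a filename template.
# Usage: expand_doc_codes TEMPLATE AUTHOR TITLE EXTENSION YYYY-MM-DD
expand_doc_codes() {
    template=$1 author=$2 title=$3 ext=$4 docdate=$5
    # Split the document date into year, month and day.
    year=${docdate%%-*}
    rest=${docdate#*-}
    month=${rest%%-*}
    day=${rest#*-}
    out=""
    while :; do
        case $template in
            *%D?*)
                head=${template%%"%D"*}     # text before the first %D
                tail=${template#*"%D"}      # text after the first %D
                code=${tail%"${tail#?}"}    # the character following %D
                tail=${tail#?}
                case $code in
                    a) val=$author ;;
                    t) val=$title ;;
                    e) val=$ext ;;
                    Y) val=$year ;;
                    m) val=$month ;;
                    d) val=$day ;;
                    *) val="%D$code" ;;     # not handled here: keep as is
                esac
                out="$out$head$val"
                template=$tail
                ;;
            *)
                out="$out$template"
                break
                ;;
        esac
    done
    printf '%s\n' "$out"
}

expand_doc_codes '%Da-%Dt-%DY%Dm%Dd.%De' 'Jane Doe' Invoice pdf 2022-07-11
# prints: Jane Doe-Invoice-20220711.pdf
```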
479
480   View
481       Zoom 100%
482
483       Zooms to 1:1. How this appears depends on the desktop resolution.
484
485       Zoom to fit
486
487       Scales the view such that all the page is visible.
488
489       Zoom in
490
491       Zoom out
492
493       Rotate 90° clockwise
494
495       The rotate options require the package imagemagick and, if this is not
496       present, are ghosted out.
497
498       Rotate 180°
499
500       Rotate 90° anticlockwise
501
502   Tools
503       Threshold
504
505       Changes all pixels darker than the given value to black; all others
506       become white.
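
       The rule can be illustrated on a handful of grey values. This is a
       sketch only: it assumes values on a 0 (black) to 255 (white) scale,
       which may differ from the scale gscan2pdf uses for the threshold, and
       the function name threshold is made up for the example.

```shell
# Hypothetical helper: apply a threshold to a list of grey values.
# Usage: threshold LIMIT VALUE...
# Values darker than (numerically below) LIMIT become 0 (black);
# all others become 255 (white).
threshold() {
    limit=$1
    shift
    out=""
    for v in "$@"; do
        if [ "$v" -lt "$limit" ]; then
            p=0
        else
            p=255
        fi
        out="$out${out:+ }$p"
    done
    printf '%s\n' "$out"
}

threshold 128 12 127 128 240    # prints: 0 0 255 255
```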
507
508       Unsharp mask
509
510       The unsharp option sharpens an image. The image is convolved with a
511       Gaussian operator of the given radius and standard deviation (sigma).
512       For reasonable results, radius should be larger than sigma. Use a
513       radius of 0 to have the method select a suitable radius.
514
515       Crop
516
517       unpaper
518
519       unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
520       for cleaning up a scan.
521
522       OCR (Optical Character Recognition)
523
524       The gocr, tesseract or cuneiform utilities are used to produce text
525       from an image.
526
       There is an OCR output buffer for each page, which is embedded as plain
       text behind the scanned image in the PDF produced. This way, Beagle can
       index (i.e. search) the plain text.
530
531       In DjVu files, the OCR output buffer is embedded in the hidden text
532       layer.  Thus these can also be indexed by Beagle.
533
534       There is an interesting review of OCR software at
535       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
536       An important conclusion was that 400ppi is necessary for decent
537       results.
538
539       Up to v2.04, the only way to tell which languages were available to
540       tesseract was to look for the language files. Therefore, gscan2pdf
541       checks the path returned by:
542
543       "tesseract '' '' -l ''"
544
545       If there are no language files in the above location, then gscan2pdf
546       assumes that tesseract v1.0 is installed, which had no language files.
547
548       Variables for user-defined tools
549
550       The following variables are available:
551
552       %i  input filename
553
554       %o  output filename
555
556       %r  resolution
557
558       An image can be modified in-place by just specifying %i.
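
       As an illustration of how these variables might be filled in, the
       sketch below substitutes them into a command template and prints the
       result instead of running it. It is not gscan2pdf's own code. The
       function name expand_tool_command, the file names and the example
       commands are assumptions made for this example.

```shell
# Hypothetical helper: substitute %i, %o and %r in a tool command template.
# Usage: expand_tool_command TEMPLATE INPUT OUTPUT RESOLUTION
expand_tool_command() {
    cmd=$1 input=$2 output=$3 resolution=$4
    out=""
    while :; do
        case $cmd in
            *%?*)
                head=${cmd%%"%"*}           # text before the first %
                tail=${cmd#*"%"}            # text after the first %
                code=${tail%"${tail#?}"}    # the character following %
                tail=${tail#?}
                case $code in
                    i) val=$input ;;
                    o) val=$output ;;
                    r) val=$resolution ;;
                    *) val="%$code" ;;      # unknown variable: keep as is
                esac
                out="$out$head$val"
                cmd=$tail
                ;;
            *)
                out="$out$cmd"
                break
                ;;
        esac
    done
    printf '%s\n' "$out"
}

# A tool that writes a separate output file:
expand_tool_command 'convert %i -negate %o' in.pnm out.pnm 300
# prints: convert in.pnm -negate out.pnm

# A tool that modifies the image in place, so only %i is given:
expand_tool_command 'mogrify -negate %i' page.pnm unused.pnm 300
# prints: mogrify -negate page.pnm
```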
559

FAQs

561   Why isn't option xyz available in the scan window?
562       Possibly because SANE or your scanner doesn't support it.
563
564       If an option listed in the output of "scanimage --help" that you would
565       like to use isn't available, send me the output and I will look at
566       implementing it.
567
568   I've only got an old flatbed scanner with no automatic sheetfeeder. How do
569       I scan a multipage document?
570       In Edit/Preferences, tick the box "Allow batch scanning from flatbed".
571
572       Some Brother scanners report "out of documents", despite scanning from
573       flatbed.  This can be worked around by ticking the box "Force new scan
574       job between pages".
575
576       If you are lucky, you have an option like Wait-for-button or Button-
577       wait, where the scanner will wait for you to press the scan button on
578       the device before it starts the scan, allowing you to scan multiple
579       pages without touching the computer.
580
581       If you are quick, you might be able to change the document on the
582       flatbed whilst the scan head is returning.
583
584       Otherwise, you have to set the number of pages to scan to 1 and hit the
585       scan button on the scan window for each page.
586
587   Why is option xyz ghosted out?
588       Probably because the package required for that option is not installed.
       Email as PDF requires xdg-email (xdg-utils); unpaper and the rotate
       options require imagemagick.
591
592   Why can I not scan from the flatbed of my HP scanner?
593       Generally for HP scanners with an ADF, to scan from the flatbed, you
594       should set "# Pages" to "1", and possibly "Batch scan" to "No".
595
596   When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
597       of changes never displayed?
598       As far as I can tell, this is pulled from changelogs.ubuntu.com, and
599       therefore only the changelogs from official Ubuntu builds are
600       displayed.
601
602   Why can gscan2pdf not find my scanner?
603       If your scanner is not connected directly to the machine on which you
604       are running gscan2pdf and you have not installed the SANE daemon,
605       saned, gscan2pdf cannot automatically find it. In this case, you can
606       specify the scanner device on the command line:
607
       "gscan2pdf --device <device>"
609
610   How can I search for text in the OCR layer of the finished PDF or DJVU
611       file?
612       pdftotext or djvutxt can extract the text layer from PDF or DJVU files.
613       See the respective man pages for details.
614
615       Having opened a PDF or DJVU file in evince or Acrobat Reader, the
616       search function will typically find the page with the requested text
617       and highlight it.
618
619       There are various tools for searching or indexing files, including PDF
620       and DJVU:
621
622       •   (meta) Tracker (<https://projects.gnome.org/tracker/>)
623
624       •   plone (<http://plone.org/>)
625
       •   pdfgrep (<http://pdfgrep.sourceforge.net/>)
627
628       •   swish-e (<http://www.swish-e.org/>)
629
630       •   recoll (<http://www.lesbonscomptes.com/recoll/>)
631
632       •   terrier (<http://www.lesbonscomptes.com/recoll/>)
633
634   How can I change the colour of the selection box in the image viewer?
635       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
636       content:
637
638        .rubberband,
639        rubberband,
640        flowbox rubberband,
641        treeview.view rubberband,
642        .content-view rubberband,
643        .content-view .rubberband {
644          border: 1px solid #2a76c6;
645          background-color: rgba(42, 118, 198, 0.2); }
646
   How can I change the colour of the OCR output?
648       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
649       content:
650
651        #gscan2pdf-ocr-output {
652          color: black;
653        }
654

See Also

656       XSane (<http://xsane.org/>)
657
658       Scan Tailor (<http://scantailor.org/>)
659

Author

661       Jeffrey Ratcliffe (jffry at posteo dot net)
662

Thanks to

664       •   all the people who have sent patches, translations, bugs and
665           feedback.
666
667       •   the gtk+ project for a most excellent graphics toolkit.
668
669       •   the Gtk3-Perl project for their superb Perl bindings for GTK3.
670
671       •   The SANE project for scanner access
672
673       •   Björn Lindqvist for the gtkimageview widget
674
675       •   Sourceforge for hosting the project.
676
678       Copyright (C) 2006--2022 Jeffrey Ratcliffe <jffry@posteo.net>
679
680       This program is free software: you can redistribute it and/or modify it
681       under the terms of the version 3 GNU General Public License as
682       published by the Free Software Foundation.
683
684       This program is distributed in the hope that it will be useful, but
685       WITHOUT ANY WARRANTY; without even the implied warranty of
686       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
687       General Public License for more details.
688
689       You should have received a copy of the GNU General Public License along
690       with this program.  If not, see <https://www.gnu.org/licenses/>.

perl v5.34.1                      2022-07-11                      GSCAN2PDF(1)