GSCAN2PDF(1)          User Contributed Perl Documentation         GSCAN2PDF(1)

NAME

6       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents
7

USAGE

9       1. Scan one or several pages in with File/Scan
10       2. Create PDF of selected pages with File/Save
11

REQUIRED ARGUMENTS

13       None
14

OPTIONS

16       gscan2pdf has the following command-line options:
17
       --device=<device>
           Specifies the device to use, instead of getting the list of
           devices via the SANE API. This can be useful if the scanner is on
           a remote computer which is not broadcasting its existence.

       --help
           Displays this help page and exits.

       --log=<log file>
           Specifies a file to store logging messages.

       --(debug|info|warn|error|fatal)
           Defines the log level. If a log file is specified, this defaults
           to 'debug', otherwise 'warn'.

       --import=<PDF|DjVu|images>
           Imports the specified file(s). If the document has more than one
           page, a window is displayed to select the required pages.

       --import-all=<PDF|DjVu|images>
           Imports all pages of the specified file(s).

       --version
           Displays the program version and exits.
32
33       Scanning is handled with SANE via scanimage.  PDF conversion is done by
34       PDF::API2.  TIFF export is handled by libtiff (faster and smaller
35       memory footprint for multipage files).
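
       The default log level rule given for the logging options above can be
       sketched in shell as follows. The helper names default_log_level and
       gscan2pdf_command are hypothetical and exist only for illustration;
       the second one prints a command line rather than running it.

```shell
# Hypothetical helper: mirrors the documented default, 'debug' when a
# log file is given, otherwise 'warn'.
default_log_level() {
    # $1: log file name, empty if --log is not used
    if [ -n "$1" ]; then
        echo debug
    else
        echo warn
    fi
}

# Hypothetical helper: prints (does not run) a gscan2pdf command line.
gscan2pdf_command() {
    # $1: log file name (may be empty), $2: optional explicit level
    logfile=$1
    level=${2:-$(default_log_level "$logfile")}
    if [ -n "$logfile" ]; then
        printf 'gscan2pdf --log=%s --%s\n' "$logfile" "$level"
    else
        printf 'gscan2pdf --%s\n' "$level"
    fi
}
```

       For example, "gscan2pdf_command file.log info" prints a command line
       that logs to file.log at the 'info' level instead of the default.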
36

DIAGNOSTICS

38       To diagnose a possible error, start gscan2pdf from the command line
39       with logging enabled:
40
41       "gscan2pdf --log=file.log"
42
43       and check file.log.
44

EXIT STATUS

46       None
47

CONFIGURATION

49       gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
50       directory can be changed by setting the $XDG_CONFIG_HOME variable.
51       Generally, however, preferences should be changed via the
52       Edit/Preferences menu, or are captured automatically during normal
53       usage of the program.
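
       To locate the resource file from a script, the path can be derived as
       sketched below. The function name gscan2pdf_config_file is an
       assumption made for this example; the fallback to ~/.config when
       $XDG_CONFIG_HOME is unset or empty follows the usual XDG convention.

```shell
# Hypothetical helper: prints the expected location of gscan2pdfrc.
gscan2pdf_config_file() {
    # Use $XDG_CONFIG_HOME if set and non-empty, else $HOME/.config
    printf '%s/gscan2pdfrc\n' "${XDG_CONFIG_HOME:-$HOME/.config}"
}
```

       Calling "gscan2pdf_config_file" prints the path, which can then be
       passed to a pager or backed up before experimenting with preferences.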
54

INCOMPATIBILITIES

56       None known.
57

BUGS AND LIMITATIONS

59       Whilst it is possible to import PDFs, this is intended to be able to
60       round-trip files created by gscan2pdf.
61

Download

63       gscan2pdf is available on Sourceforge
64       (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).
65
66   Debian-based
67       If you are using Debian, you should find that sid has the latest
68       version already packaged.
69
70       If you are using a Ubuntu-based system, you can automatically keep up
71       to date with the latest version via the ppa:
72
73       "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"
74
       If you are using Synaptic, then use menu Edit/Reload Package
       Information, search for gscan2pdf in the package list, and lo and
       behold, you can install the nice shiny new version.
78
79       From the command line:
80
81       "sudo apt update"
82
83       "sudo apt install gscan2pdf"
84
85   RPMs
86       Download the rpm from Sourceforge, and then install it with "rpm -i
87       gscan2pdf-version.rpm"
88
89   From source
90       The source is hosted in the files section of the gscan2pdf project on
91       Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).
92
93   From the repository
94       gscan2pdf uses Git for its Revision Control System. You can browse the
95       tree at <https://sourceforge.net/p/gscan2pdf/code/>.
96
97       Git users can clone the complete tree with "git clone
98       git://git.code.sf.net/p/gscan2pdf/code"
99

Building gscan2pdf from source

       Having downloaded the source either from a Sourceforge file release, or
       from the Git repository, unpack it if necessary and change into the
       source directory with "tar xvfz gscan2pdf-x.x.x.tar.gz" followed by
       "cd gscan2pdf-x.x.x".

       "perl Makefile.PL" will create the Makefile.
106
107       "make test" should run several hundred tests to confirm that things
108       will work properly on your system.
109
110       You can install directly from the source with "make install", but
111       building the appropriate package for your distribution should be as
112       straightforward as "make debdist" or "make rpmdist". However, you will
113       additionally need the rpm, devscripts, fakeroot, debhelper and gettext
114       packages.
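
       The build steps above can be collected into a small script, sketched
       here. The helpers run and build_gscan2pdf are hypothetical names; with
       DRY_RUN=1 the commands are only printed, which is a convenient way to
       check the sequence before running it. The sketch assumes the tarball
       unpacks into a directory of the same name without the .tar.gz suffix.

```shell
# Hypothetical helper: print the command if DRY_RUN=1, else execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: unpack a release tarball, create the Makefile
# and run the test suite. Stops at the first failing step.
build_gscan2pdf() {
    # $1: release tarball, e.g. gscan2pdf-x.x.x.tar.gz
    tarball=$1
    srcdir=$(basename "$tarball" .tar.gz)
    run tar xvfz "$tarball" &&
    run cd "$srcdir" &&
    run perl Makefile.PL &&
    run make test
}
```

       "DRY_RUN=1 build_gscan2pdf gscan2pdf-x.x.x.tar.gz" prints the four
       commands; without DRY_RUN they are executed in order.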
115

Dependencies

117       The list below looks daunting, but all packages are available from any
118       reasonable up-to-date distribution. If you are using Synaptic, having
119       installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-
120       click it and you can install them under Recommends. Note also that the
121       library names given below are the Debian/Ubuntu ones. Those
122       distributions using RPM typically use perl(module) where Debian has
123       libmodule-perl.
124
125       Required
126           libgtk3-perl >= 0.028
               There is a bug in versions of libgtk3-perl before 0.028 that
               causes gscan2pdf to crash when saving. Whilst I could prevent
               gscan2pdf from crashing, it would still be impossible to save
               anything, rendering gscan2pdf rather useless.
131
132           libgtk3-simplelist-perl
133               A simple interface to Gtk3's complex MVC list widget
134
135           liblocale-gettext-perl (>= 1.05)
136               Using libc functions for internationalisation in Perl
137
138           libpdf-api2-perl
139               provides the functions for creating PDF documents in Perl
140
141           libsane
142               API library for scanners
143
144           libimage-sane-perl
145               Perl bindings for libsane.
146
147           libset-intspan-perl
148               manages sets of integers
149
150           libtiff-tools
151               TIFF manipulation and conversion tools
152
153           Imagemagick
154               Image manipulation programs
155
156           perlmagick
157               A perl interface to the libMagick graphics routines
158
159           sane-utils
160               API library for scanners -- utilities.
161
162       Optional
163           sane
164               scanner graphical frontends. Only required for the scanadf
165               frontend.
166
167           unpaper
168               post-processing tool for scanned pages. See
169               <https://www.flameeyes.eu/projects/unpaper>.
170
171           xdg-utils
172               Desktop integration utilities from freedesktop.org. Required
173               for Email as PDF.  See
174               <https://www.freedesktop.org/wiki/Software/xdg-utils/>
175
176           djvulibre-bin
177               Utilities for the DjVu image format. See
178               <http://djvu.sourceforge.net/>
179
180           gocr
181               A command line OCR. See <http://jocr.sourceforge.net/>.
182
183           tesseract
184               A command line OCR. See
185               <https://github.com/tesseract-ocr/tesseract>
186
187           ocropus
188               A command line OCR. See <http://code.google.com/p/ocropus/>
189
190           cuneiform
191               A command line OCR. See <http://launchpad.net/cuneiform-linux>
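
       A quick way to see which optional tools are installed is to look for
       their executables on the PATH, as in the sketch below. The helpers
       have_tool and report_optional_tools are hypothetical, and the
       executable names passed to them are assumptions that may differ
       between distributions.

```shell
# Hypothetical helper: succeeds if the named executable is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: prints one "name: found" or "name: missing"
# line per executable given as an argument.
report_optional_tools() {
    for tool in "$@"; do
        if have_tool "$tool"; then
            printf '%s: found\n' "$tool"
        else
            printf '%s: missing\n' "$tool"
        fi
    done
}
```

       For example: "report_optional_tools unpaper xdg-email djvutxt gocr
       tesseract cuneiform".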
192

Support

194       There are two mailing lists for gscan2pdf:
195
196       gscan2pdf-announce
197           A low-traffic list for announcements, mostly of new releases. You
198           can subscribe at
199           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>
200
201       gscan2pdf-help
           General support, questions, etc. You can subscribe at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>
204

Reporting bugs

206       Before reporting bugs, please read the "FAQs" section.
207
208       Please report any bugs found, preferably against the Debian
209       package[1][2].  You do not need to be a Debian user, or set up an
210       account to do this.  The Debian tool "reportbug" provides a convenient
211       GUI for doing so.
212
213       1. https://packages.debian.org/sid/gscan2pdf
214       2. https://www.debian.org/Bugs/
215
216       Alternatively, there is a bug tracker for the gscan2pdf project on
217       Sourceforge
218       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).
219
220       Please include the log file created by "gscan2pdf --log=log" with any
221       new bug report.
222

Translations

224       gscan2pdf has already been partly translated into several languages.
225       If you would like to contribute to an existing or new translation,
226       please check out Rosetta:
227       <https://translations.launchpad.net/gscan2pdf>
228
       Note that the translations for the scanner options are taken directly
       from sane-backends. If you would like to contribute to these, you can
       contact the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and have a look at the po/
       directory in the source code <http://www.sane-project.org/cvs.html>.
234
235       Alternatively, Ubuntu has its own translation project. For the 9.04
236       release, the translations are available at
237       <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>
238

DESCRIPTION

240   File
241       New
242
243       Clears the page list.
244
245       Open
246
247       Opens any format that imagemagick supports. PDFs will have their
248       embedded images extracted and imported one per page.
249
250       Note that files can also be imported by dragging them into the
251       thumbnail list from a program like nautilus or konqueror.
252
253       Scan
254
255       Sets options before scanning via SANE.
256
257       Device
258
259       Chooses between available scanners.
260
261       # Pages
262
263       Selects the number of pages, or all pages to scan.
264
265       Source document
266
       Selects between single-sided or double-sided pages.

       This affects the page numbering.  Single-sided scans are numbered
       consecutively.  Double-sided scans are incremented (or decremented, see
       below) by 2, i.e. 1, 3, 5, etc.
272
273       Side to scan
274
275       If double sided is selected above, assuming a non-duplex scanner, i.e.
276       a scanner that cannot automatically scan both sides of a page, this
277       determines whether the page number is incremented or decremented by 2.
278
279       To scan both sides of three pages, i.e. 6 sides:
280
281       1. Select:
282           # Pages = 3 (or "all" if your scanner can detect when it is out of
283           paper)
284
285           Double sided
286
287           Facing side
288
289       2. Scans sides 1, 3 & 5.
290       3. Put pile back with scanner ready to scan back of last page.
291       4. Select:
292           # Pages = 3 (or "all" if your scanner can detect when it is out of
293           paper)
294
295           Double sided
296
297           Reverse side
298
299       5. Scans sides 6, 4 & 2.
300       6. gscan2pdf automatically sorts the pages so that they appear in the
301       correct order.
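
       The numbering used in the steps above can be reproduced with a few
       lines of shell. This is only an illustration of the arithmetic, not
       gscan2pdf's own code, and page_numbers is a hypothetical name.

```shell
# Hypothetical helper: prints the page numbers assigned to a batch.
page_numbers() {
    # $1: facing or reverse, $2: number of sheets in the pile
    side=$1 sheets=$2 out=
    case $side in
        facing)  n=1;               step=2  ;;  # 1, 3, 5, ...
        reverse) n=$((2 * sheets)); step=-2 ;;  # 2n, 2n-2, ..., 2
        *) echo "unknown side: $side" >&2; return 1 ;;
    esac
    i=0
    while [ "$i" -lt "$sheets" ]; do
        out="$out${out:+ }$n"
        n=$((n + step))
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}
```

       "page_numbers facing 3" prints "1 3 5" and "page_numbers reverse 3"
       prints "6 4 2", matching steps 2 and 5.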
302
303       Device-dependent options
304
305       These, naturally, depend on your scanner.  They can include
306
307       Page size.
308       Mode (colour/black & white/greyscale)
309       Resolution (in PPI)
310       Batch-scan
311           Guarantees that a "no documents" condition will be returned after
312           the last scanned page, to prevent endless flatbed scans after a
313           batch scan.
314
315       Wait-for-button/Button-wait
316           After sending the scan command, wait until the button on the
317           scanner is pressed before actually starting the scan process.
318
319       Source
320           Selects the document source.  Possible options can include Flatbed
321           or ADF.  On some scanners, this is the only way of generating an
322           out-of-documents signal.
323
324       Save
325
326       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
327       GIF.
328
329       Metadata
330
       Metadata are information that is not visible when viewing the
       PDF/DjVu, but is embedded in the file, and so is searchable and can
       be examined, typically with the "Properties" option of the document
       viewer.

       The metadata are completely optional, but can also be used to generate
       the filename; see preferences for details.
338
339       The date can be selected with use of the calendar widget. The displayed
340       date can be incremented or decremented with use of the '+' and '-'
341       keys.
342
343       DjVu
344
345       Both black and white, and colour images produce better compression than
346       PDF. See <http://www.djvuzone.org/> for more details.
347
348       Email as PDF
349
350       Attaches the selected or all pages as a PDF to a blank email.  This
351       requires xdg-email, which is in the xdg-utils package.  If this is not
352       present, the option is ghosted out.
353
354       Print
355
356       Prints the selected or all pages.
357
358       Compress temporary files
359
       If your temporary ($TMPDIR) directory is getting full, this function
       can be useful - compressing all images as LZW-compressed TIFFs. These
       require much less space than the PNM files that are typically produced
       by SANE or by importing a PDF.
364
365   Edit
366       Delete
367
368       Deletes the selected page.
369
370       Renumber
371
372       Renumbers the pages from 1..n.
373
374       Note that the page order can also be changed by drag and drop in the
375       thumbnail view.
376
377       Select
378
       The select menus can be used to select all, even, odd, blank, dark or
       modified pages. Selecting blank or dark pages runs imagemagick to make
       the decision.  Selecting modified pages selects those which have been
       modified by threshold, unsharp, etc., since the last OCR run was made.
383
384       Properties
385
386       When an image is scanned, gscan2pdf attempts to extract the resolution
387       from the scan options. This nearly always works without problem.
388
       Importing an image can be trickier, however. Some image formats such as
       PNM do not encode metadata for resolution. In other cases, the data is
       incorrect.  Edit/Properties allows the user to manually correct the
       metadata for a particular page, thus correcting the size of the final
       PDF or DjVu. The image itself is otherwise not changed - it is not
       down- or upscaled.
395
396       Preferences
397
398       The preferences menu item allows the control of the default behaviour
399       of various functions. Most of these are self-explanatory.
400
401       Frontends
402
403       gscan2pdf initially supported two frontends, scanimage and scanadf.
404       scanadf support was added when it was realised that scanadf works
405       better than scanimage with some scanners. On Debian-based systems,
406       scanadf is in the sane package, not, like scanimage, in sane-utils. If
407       scanadf is not present, the option is obviously ghosted out.
408
409       In 0.9.27, Perl bindings for SANE were introduced. These are called
410       libsane-perl.
411
412       Before 1.2.0, options available through CLI frontends like scanimage
413       were made visible as users asked for them. In 1.2.0, all options can be
414       shown or hidden via Edit/Preferences, along with the ability to specify
415       which options trigger a reload.
416
       In 1.8.3, new Perl bindings for SANE were introduced. These are called
       libimage-sane-perl and are the preferred frontend.
419
420       In 1.8.5, support for libsane-perl was removed.
421
422       Device blacklist
423
424       Ignore listed devices.
425
426       Note that this is a device name regular expression, e.g. /dev/video,
427       and not the name as listed in the scan window, e.g. Noname
428       Integrated_Webcam_HD.
429
430       Default filename for PDF or DjVu files
431
432       All strftime codes (e.g. %Y for the current year) are available as
433       variables, with the following additions:
434
435        %Da    author
436        %De    filename extension
437        %Dt    title
438
439       All document date codes use strftime codes with a leading D, e.g.:
440
441        %DY    document year
442        %Dm    document month
443        %Dd    document day
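
       How the document codes expand can be illustrated with the sketch
       below. It is not gscan2pdf's implementation: expand_template is a
       hypothetical helper that handles only the six %D codes listed above,
       leaves plain strftime codes such as %Y untouched, and assumes the
       values contain no "|", "&" or backslash characters.

```shell
# Hypothetical helper: expands %Da, %Dt, %De, %DY, %Dm and %Dd.
expand_template() {
    # $1: template, $2: author, $3: title, $4: extension,
    # $5: document date as YYYY-MM-DD
    template=$1 author=$2 title=$3 ext=$4 date=$5
    year=${date%%-*}      # text before the first '-'
    rest=${date#*-}       # MM-DD
    month=${rest%%-*}
    day=${rest#*-}
    printf '%s\n' "$template" | sed \
        -e "s|%Da|$author|g" \
        -e "s|%Dt|$title|g" \
        -e "s|%De|$ext|g" \
        -e "s|%DY|$year|g" \
        -e "s|%Dm|$month|g" \
        -e "s|%Dd|$day|g"
}
```

       For instance, "expand_template '%Da-%Dt-%DY%Dm%Dd.%De' jane invoice
       pdf 2020-09-25" prints "jane-invoice-20200925.pdf".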
444
445   View
446       Zoom 100%
447
448       Zooms to 1:1. How this appears depends on the desktop resolution.
449
450       Zoom to fit
451
452       Scales the view such that all the page is visible.
453
454       Zoom in
455
456       Zoom out
457
458       Rotate 90° clockwise
459
460       The rotate options require the package imagemagick and, if this is not
461       present, are ghosted out.
462
463       Rotate 180°
464
465       Rotate 90° anticlockwise
466
467   Tools
468       Threshold
469
470       Changes all pixels darker than the given value to black; all others
471       become white.
472
473       Unsharp mask
474
475       The unsharp option sharpens an image. The image is convolved with a
476       Gaussian operator of the given radius and standard deviation (sigma).
477       For reasonable results, radius should be larger than sigma. Use a
478       radius of 0 to have the method select a suitable radius.
479
480       Crop
481
482       unpaper
483
484       unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
485       for cleaning up a scan.
486
487       OCR (Optical Character Recognition)
488
489       The gocr, tesseract, ocropus or cuneiform utilities are used to produce
490       text from an image.
491
       There is an OCR output buffer for each page, which is embedded as plain
       text behind the scanned image in the PDF produced. This way, Beagle can
       index (i.e. search) the plain text.
495
496       In DjVu files, the OCR output buffer is embedded in the hidden text
497       layer.  Thus these can also be indexed by Beagle.
498
499       There is an interesting review of OCR software at
500       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
501       An important conclusion was that 400ppi is necessary for decent
502       results.
503
504       Up to v2.04, the only way to tell which languages were available to
505       tesseract was to look for the language files. Therefore, gscan2pdf
506       checks the path returned by:
507
508        tesseract '' '' -l ''
509
510       If there are no language files in the above location, then gscan2pdf
511       assumes that tesseract v1.0 is installed, which had no language files.
512
513       Variables for user-defined tools
514
515       The following variables are available:
516
517        %i     input filename
518        %o     output filename
519        %r     resolution
520
521       An image can be modified in-place by just specifying %i.
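
       The substitution of these variables can be pictured as follows. This
       is a sketch only: expand_tool_command is a hypothetical helper, the
       ImageMagick commands are merely example tool definitions, and file
       names are assumed not to contain "|", "&" or backslash characters.

```shell
# Hypothetical helper: replaces %i, %o and %r in a tool definition.
expand_tool_command() {
    # $1: tool definition, $2: input file, $3: output file,
    # $4: resolution
    printf '%s\n' "$1" |
        sed -e "s|%i|$2|g" -e "s|%o|$3|g" -e "s|%r|$4|g"
}
```

       "expand_tool_command 'convert %i -negate %o' in.png out.png 300"
       prints "convert in.png -negate out.png". A definition that names only
       %i, such as 'mogrify -negate %i', corresponds to the in-place case.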
522

FAQs

524   Why isn't option xyz available in the scan window?
525       Possibly because SANE or your scanner doesn't support it.
526
527       If an option listed in the output of "scanimage --help" that you would
528       like to use isn't available, send me the output and I will look at
529       implementing it.
530
531   I've only got an old flatbed scanner with no automatic sheetfeeder. How do
532       I scan a multipage document?
533       In Edit/Preferences, tick the box "Allow batch scanning from flatbed".
534
535       Some Brother scanners report "out of documents", despite scanning from
536       flatbed.  This can be worked around by ticking the box "Force new scan
537       job between pages".
538
539       If you are lucky, you have an option like Wait-for-button or Button-
540       wait, where the scanner will wait for you to press the scan button on
541       the device before it starts the scan, allowing you to scan multiple
542       pages without touching the computer.
543
544       If you are quick, you might be able to change the document on the
545       flatbed whilst the scan head is returning.
546
547       Otherwise, you have to set the number of pages to scan to 1 and hit the
548       scan button on the scan window for each page.
549
550   Why is option xyz ghosted out?
551       Probably because the package required for that option is not installed.
552       Email as PDF requires xdg-email (xdg-utils), unpaper and the rotate
553       options require imagemagick.
554
555   Why can I not scan from the flatbed of my HP scanner?
556       Generally for HP scanners with an ADF, to scan from the flatbed, you
557       should set "# Pages" to "1", and possibly "Batch scan" to "No".
558
559   When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
560       of changes never displayed?
561       As far as I can tell, this is pulled from changelogs.ubuntu.com, and
562       therefore only the changelogs from official Ubuntu builds are
563       displayed.
564
565   Why can gscan2pdf not find my scanner?
566       If your scanner is not connected directly to the machine on which you
567       are running gscan2pdf and you have not installed the SANE daemon,
568       saned, gscan2pdf cannot automatically find it. In this case, you can
569       specify the scanner device on the command line:
570
       "gscan2pdf --device=<device>"
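
       Device names in the form SANE expects can be read from the output of
       "scanimage -L" on a machine that can see the scanner. The sketch
       below extracts them; list_device_names is a hypothetical helper, and
       the line format it parses ("device `NAME' is a ...") is an assumption
       about scanimage's output that may vary between versions.

```shell
# Hypothetical helper: reads "scanimage -L" style output on stdin and
# prints one device name per line.
list_device_names() {
    sed -n "s/^device \`\(.*\)' is .*/\1/p"
}
```

       Typical use would be "scanimage -L | list_device_names", passing one
       of the printed names to the --device option.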
572
573   How can I search for text in the OCR layer of the finished PDF or DJVU
574       file?
575       pdftotext or djvutxt can extract the text layer from PDF or DJVU files.
576       See the respective man pages for details.
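
       A minimal sketch of such a search from the command line is given
       below. search_ocr_text is a hypothetical helper; it relies on
       pdftotext writing to standard output when "-" is given as the output
       file and on djvutxt printing to standard output by default.

```shell
# Hypothetical helper: case-insensitive search of the text layer.
search_ocr_text() {
    # $1: PDF or DjVu file, $2: text to look for
    case $1 in
        *.pdf|*.PDF)   pdftotext "$1" - | grep -i -- "$2" ;;
        *.djvu|*.DJVU) djvutxt "$1" | grep -i -- "$2" ;;
        *) echo "unsupported file type: $1" >&2; return 2 ;;
    esac
}
```

       "search_ocr_text scan.pdf invoice" prints the matching lines of the
       text layer, or nothing if the text is not found.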
577
578       Having opened a PDF or DJVU file in evince or Acrobat Reader, the
579       search function will typically find the page with the requested text
580       and highlight it.
581
582       There are various tools for searching or indexing files, including PDF
583       and DJVU:
584
585       ·   (meta) Tracker (<https://projects.gnome.org/tracker/>)
586
587       ·   plone (<http://plone.org/>)
588
       ·   pdfgrep (<http://pdfgrep.sourceforge.net/>)
590
591       ·   swish-e (<http://www.swish-e.org/>)
592
593       ·   recoll (<http://www.lesbonscomptes.com/recoll/>)
594
       ·   terrier (<http://terrier.org/>)
596
597   How can I change the colour of the selection box in the image viewer?
598       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
599       content:
600
601        .rubberband,
602        rubberband,
603        flowbox rubberband,
604        treeview.view rubberband,
605        .content-view rubberband,
606        .content-view .rubberband {
607          border: 1px solid #2a76c6;
608          background-color: rgba(42, 118, 198, 0.2); }
609
   How can I change the colour of the OCR output?
611       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
612       content:
613
614       #gscan2pdf-ocr-output {
615         color: black; }
616

See Also

618       XSane (<http://xsane.org/>)
619
620       Scan Tailor (<http://scantailor.org/>)
621

Author

623       Jeffrey Ratcliffe (jffry at posteo dot net)
624

Thanks to

626       ·   all the people who have sent patches, translations, bugs and
627           feedback.
628
629       ·   the gtk+ project for a most excellent graphics toolkit.
630
631       ·   the Gtk3-Perl project for their superb Perl bindings for GTK3.
632
633       ·   The SANE project for scanner access
634
635       ·   Björn Lindqvist for the gtkimageview widget
636
637       ·   Sourceforge for hosting the project.
638
640       Copyright (C) 2006--2020 Jeffrey Ratcliffe <jffry@posteo.net>
641
642       This program is free software: you can redistribute it and/or modify it
643       under the terms of the version 3 GNU General Public License as
644       published by the Free Software Foundation.
645
646       This program is distributed in the hope that it will be useful, but
647       WITHOUT ANY WARRANTY; without even the implied warranty of
648       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
649       General Public License for more details.
650
651       You should have received a copy of the GNU General Public License along
652       with this program.  If not, see <https://www.gnu.org/licenses/>.
653
654
655
perl v5.32.0                      2020-09-25                      GSCAN2PDF(1)