GSCAN2PDF(1)          User Contributed Perl Documentation         GSCAN2PDF(1)

NAME

       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents

USAGE

       1. Scan one or several pages in with File/Scan
       2. Create PDF of selected pages with File/Save

REQUIRED ARGUMENTS

       None

OPTIONS

       gscan2pdf has the following command-line options:

       --device=<device>
           Specifies the device to use, instead of getting the list of
           devices via the SANE API. This can be useful if the scanner is on
           a remote computer which is not broadcasting its existence.

       --help
           Displays this help page and exits.

       --log=<log file>
           Specifies a file to store logging messages.

       --(debug|info|warn|error|fatal)
           Defines the log level. If a log file is specified, this defaults
           to 'debug', otherwise 'warn'.

       --import=<PDF|DjVu|image>
           Imports the specified file.

       --version
           Displays the program version and exits.

       Scanning is handled with SANE via scanimage.  PDF conversion is done by
       PDF::API2.  TIFF export is handled by libtiff (faster and smaller
       memory footprint for multipage files).
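       
       As a minimal sketch of the log-level rule described above (not
       gscan2pdf's own code), the hypothetical shell function
       effective_log_level reports which level would apply to a given set of
       arguments. It only looks at the options listed in this section.

```shell
# Hypothetical helper, not part of gscan2pdf: print the log level implied
# by the documented rule for the arguments given.
effective_log_level() {
    level=""
    logfile=""
    for arg in "$@"; do
        case "$arg" in
            --debug|--info|--warn|--error|--fatal) level=${arg#--} ;;
            --log=*) logfile=${arg#--log=} ;;
        esac
    done
    if [ -n "$level" ]; then
        echo "$level"      # an explicit level always wins
    elif [ -n "$logfile" ]; then
        echo debug         # log file given, no level: 'debug'
    else
        echo warn          # neither given: 'warn'
    fi
}

effective_log_level --log=file.log    # prints: debug
```

       So "gscan2pdf --log=file.log" alone should already log at the most
       verbose level, and a level option is only needed to make the log
       quieter.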

DIAGNOSTICS

       To diagnose a possible error, start gscan2pdf from the command line
       with logging enabled:

       "gscan2pdf --log=file.log"

       and check file.log.

EXIT STATUS

       None

CONFIGURATION

       gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
       directory can be changed by setting the $XDG_CONFIG_HOME variable.
       Generally, however, preferences should be changed via the
       Edit/Preferences menu, or are captured automatically during normal
       usage of the program.
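       
       To see where the resource file should end up on a given system, the
       lookup can be sketched in shell. The function name gscan2pdfrc_path is
       hypothetical, and the sketch assumes the usual XDG convention that an
       unset or empty $XDG_CONFIG_HOME falls back to ~/.config.

```shell
# Hypothetical helper, not part of gscan2pdf: print the expected location
# of the resource file.
gscan2pdfrc_path() {
    printf '%s/gscan2pdfrc\n' "${XDG_CONFIG_HOME:-$HOME/.config}"
}

gscan2pdfrc_path
```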

INCOMPATIBILITIES

       None known.

BUGS AND LIMITATIONS

       Whilst it is possible to import PDFs, this feature is intended
       primarily for round-tripping files created by gscan2pdf.

Download

       gscan2pdf is available on Sourceforge
       (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).

   Debian-based
       If you are using Debian, you should find that sid has the latest
       version already packaged.

       If you are using an Ubuntu-based system, you can automatically keep up
       to date with the latest version via the ppa:

       "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"

       If you are using Synaptic, then use menu Edit/Reload Package
       Information, search for gscan2pdf in the package list, and lo and
       behold, you can install the nice shiny new version.

       From the command line:

       "sudo apt-get update"

       "sudo apt-get install gscan2pdf"

   RPMs
       Download the rpm from Sourceforge, and then install it with "rpm -i
       gscan2pdf-version.rpm"

   From source
       The source is hosted in the files section of the gscan2pdf project on
       Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).

   From the repository
       gscan2pdf uses Git for revision control. You can browse the tree at
       <https://sourceforge.net/p/gscan2pdf/code/>.

       Git users can clone the complete tree with "git clone
       git://git.code.sf.net/p/gscan2pdf/code"

Building gscan2pdf from source

       Having downloaded the source either from a Sourceforge file release, or
       from the Git repository, unpack it if necessary and change into the
       source directory:

        tar xvfz gscan2pdf-x.x.x.tar.gz
        cd gscan2pdf-x.x.x

       "perl Makefile.PL" will create the Makefile.

       "make test" should run several hundred tests to confirm that things
       will work properly on your system.

       You can install directly from the source with "make install", but
       building the appropriate package for your distribution should be as
       straightforward as "make debdist" or "make rpmdist". However, you will
       additionally need the rpm, devscripts, fakeroot, debhelper and gettext
       packages.

Dependencies

       The list below looks daunting, but all packages are available from any
       reasonably up-to-date distribution. If you are using Synaptic, having
       installed gscan2pdf, locate the gscan2pdf entry in Synaptic,
       right-click it and you can install them under Recommends. Note also
       that the library names given below are the Debian/Ubuntu ones. Those
       distributions using RPM typically use perl(module) where Debian has
       libmodule-perl.

       Required
           libgtk3-perl (>= 0.028)
               There is a bug in versions of libgtk3-perl before 0.028 that
               causes gscan2pdf to crash when saving. Whilst I could prevent
               gscan2pdf from crashing, it would still be impossible to save
               anything, rendering gscan2pdf rather useless.

           libgtk3-simplelist-perl
               A simple interface to Gtk3's complex MVC list widget

           liblocale-gettext-perl (>= 1.05)
               Using libc functions for internationalisation in Perl

           libpdf-api2-perl
               provides the functions for creating PDF documents in Perl

           libsane
               API library for scanners

           libimage-sane-perl
               Perl bindings for libsane.

           libset-intspan-perl
               manages sets of integers

           libtiff-tools
               TIFF manipulation and conversion tools

           imagemagick
               Image manipulation programs

           perlmagick
               A Perl interface to the libMagick graphics routines

           sane-utils
               API library for scanners -- utilities.

       Optional
           sane
               scanner graphical frontends. Only required for the scanadf
               frontend.

           unpaper
               post-processing tool for scanned pages. See
               <https://www.flameeyes.eu/projects/unpaper>.

           xdg-utils
               Desktop integration utilities from freedesktop.org. Required
               for Email as PDF.  See
               <https://www.freedesktop.org/wiki/Software/xdg-utils/>

           djvulibre-bin
               Utilities for the DjVu image format. See
               <http://djvu.sourceforge.net/>

           gocr
               A command line OCR. See <http://jocr.sourceforge.net/>.

           tesseract
               A command line OCR. See
               <https://github.com/tesseract-ocr/tesseract>

           ocropus
               A command line OCR. See <http://code.google.com/p/ocropus/>

           cuneiform
               A command line OCR. See <http://launchpad.net/cuneiform-linux>

Support

       There are two mailing lists for gscan2pdf:

       gscan2pdf-announce
           A low-traffic list for announcements, mostly of new releases. You
           can subscribe at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>

       gscan2pdf-help
           General support, questions, etc. You can subscribe at
           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>

Reporting bugs

       Before reporting bugs, please read the "FAQs" section.

       Please report any bugs found, preferably against the Debian
       package[1][2].  You do not need to be a Debian user, or set up an
       account to do this.  The Debian tool "reportbug" provides a convenient
       GUI for doing so.

       1. https://packages.debian.org/sid/gscan2pdf
       2. https://www.debian.org/Bugs/

       Alternatively, there is a bug tracker for the gscan2pdf project on
       Sourceforge
       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).

       Please include the log file created by "gscan2pdf --log=log" with any
       new bug report.

Translations

       gscan2pdf has already been partly translated into several languages.
       If you would like to contribute to an existing or new translation,
       please check out Rosetta:
       <https://translations.launchpad.net/gscan2pdf>

       Note that the translations for the scanner options are taken directly
       from sane-backends. If you would like to contribute to these, you can
       contact the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and have a look at the po/
       directory in the source code <http://www.sane-project.org/cvs.html>.

       Alternatively, Ubuntu has its own translation project. For the 9.04
       release, the translations are available at
       <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>

DESCRIPTION

   File
       New

       Clears the page list.

       Open

       Opens any format that imagemagick supports. PDFs will have their
       embedded images extracted and imported one per page.

       Note that files can also be imported by dragging them into the
       thumbnail list from a program like nautilus or konqueror.

       Scan

       Sets options before scanning via SANE.

       Device

       Chooses between available scanners.

       # Pages

       Selects the number of pages to scan, or all pages.

       Source document

       Selects between single sided or double sided pages.

       This affects the page numbering.  Single sided scans are numbered
       consecutively.  Double sided scans are incremented (or decremented, see
       below) by 2, i.e. 1, 3, 5, etc.

       Side to scan

       If double sided is selected above, assuming a non-duplex scanner, i.e.
       a scanner that cannot automatically scan both sides of a page, this
       determines whether the page number is incremented or decremented by 2.

       To scan both sides of three pages, i.e. 6 sides:

       1. Select:
           # Pages = 3 (or "all" if your scanner can detect when it is out of
           paper)

           Double sided

           Facing side

       2. Scan sides 1, 3 & 5.
       3. Put the pile back with the scanner ready to scan the back of the
       last page.
       4. Select:
           # Pages = 3 (or "all" if your scanner can detect when it is out of
           paper)

           Double sided

           Reverse side

       5. Scan sides 6, 4 & 2.
       6. gscan2pdf automatically sorts the pages so that they appear in the
       correct order.
       
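       The numbering in the steps above can be sketched as follows. This is
       an illustration only, not gscan2pdf's implementation; the function
       name double_sided_numbers and its arguments are hypothetical. Given
       the side and the number of sheets, it prints the page numbers in the
       order in which they are assigned.

```shell
# Hypothetical helper: print the page numbers assigned to each scanned side.
# $1 = facing | reverse, $2 = number of sheets in the pile
double_sided_numbers() {
    side=$1
    sheets=$2
    if [ "$side" = facing ]; then
        page=1                 # facing sides count up: 1, 3, 5, ...
        step=2
    else
        page=$((2 * sheets))   # reverse sides count down from the last side
        step=-2
    fi
    count=0
    numbers=""
    while [ "$count" -lt "$sheets" ]; do
        numbers="${numbers:+$numbers }$page"
        page=$((page + step))
        count=$((count + 1))
    done
    echo "$numbers"
}

double_sided_numbers facing 3    # prints: 1 3 5
double_sided_numbers reverse 3   # prints: 6 4 2
```

       Sorting the two runs together numerically gives 1 to 6, which is the
       order in which the pages should end up in the thumbnail list.
       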
       Device-dependent options

       These, naturally, depend on your scanner.  They can include:

       Page size.
       Mode (colour/black & white/greyscale)
       Resolution (in PPI)
       Batch-scan
           Guarantees that a "no documents" condition will be returned after
           the last scanned page, to prevent endless flatbed scans after a
           batch scan.

       Wait-for-button/Button-wait
           After sending the scan command, wait until the button on the
           scanner is pressed before actually starting the scan process.

       Source
           Selects the document source.  Possible options can include Flatbed
           or ADF.  On some scanners, this is the only way of generating an
           out-of-documents signal.

       Save

       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
       GIF.

       PDF Metadata

       Metadata is information that is not visible when viewing the PDF, but
       is embedded in the file, and so is searchable and can be examined,
       typically with the "Properties" option of the PDF viewer.

       The metadata is completely optional, but can also be used to generate
       the filename; see Preferences for details.

       DjVu

       Both black and white and colour images compress better as DjVu than as
       PDF. See <http://www.djvuzone.org/> for more details.

       Email as PDF

       Attaches the selected or all pages as a PDF to a blank email.  This
       requires xdg-email, which is in the xdg-utils package.  If this is not
       present, the option is ghosted out.

       Print

       Prints the selected or all pages.

       Compress temporary files

       If your temporary ($TMPDIR) directory is getting full, this function
       can be useful - compressing all images as LZW-compressed TIFFs. These
       require much less space than the PNM files that are typically produced
       by SANE or by importing a PDF.

   Edit
       Delete

       Deletes the selected page.

       Renumber

       Renumbers the pages from 1..n.

       Note that the page order can also be changed by drag and drop in the
       thumbnail view.

       Select

       The select menus can be used to select all, even, odd, blank, dark or
       modified pages. Selecting blank or dark pages runs imagemagick to make
       the decision.  Selecting modified pages selects those which have been
       modified by threshold, unsharp, etc., since the last OCR run was made.

       Properties

       When an image is scanned, gscan2pdf attempts to extract the resolution
       from the scan options. This nearly always works without problem.

       Importing an image can be trickier, however. Some image formats such as
       PNM do not encode metadata for resolution. In other cases, the data is
       incorrect.  Edit/Properties allows the user to manually correct the
       metadata for a particular page, thus correcting the size of the final
       PDF or DjVu. The image itself is otherwise not changed - it is not
       down- or upscaled.

       Preferences

       The preferences menu item allows the control of the default behaviour
       of various functions. Most of these are self-explanatory.

       Frontends

       gscan2pdf initially supported two frontends, scanimage and scanadf.
       scanadf support was added when it was realised that scanadf works
       better than scanimage with some scanners. On Debian-based systems,
       scanadf is in the sane package, not, like scanimage, in sane-utils. If
       scanadf is not present, the option is obviously ghosted out.

       In 0.9.27, Perl bindings for SANE were introduced. These are called
       libsane-perl.

       Before 1.2.0, options available through CLI frontends like scanimage
       were made visible as users asked for them. In 1.2.0, all options can be
       shown or hidden via Edit/Preferences, along with the ability to specify
       which options trigger a reload.

       In 1.8.3, new Perl bindings for SANE were introduced. These are called
       libimage-sane-perl and are the preferred frontend.

       In 1.8.5, support for libsane-perl was removed.

       Device blacklist

       Ignore listed devices.

       Note that this is a device name regular expression, e.g. /dev/video,
       and not the name as listed in the scan window, e.g. Noname
       Integrated_Webcam_HD.
       
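       As a rough illustration of how such an expression behaves, the sketch
       below filters a list of device names with grep. The device names are
       made-up examples, the helper apply_blacklist is hypothetical, and
       grep's extended regular expressions stand in for whatever matching
       gscan2pdf performs internally.

```shell
# Hypothetical helper: drop every device name (one per line on stdin) that
# matches the blacklist expression given as $1.
apply_blacklist() {
    grep -v -E -e "$1"
}

# Example device names, made up for illustration.
printf '%s\n' 'v4l:/dev/video0' 'epson2:libusb:001:004' |
    apply_blacklist '/dev/video'
```

       Here only the second name is printed, because the first contains
       /dev/video.
       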
       Default filename for PDF or DjVu files

       All strftime codes (e.g. %Y for the current year) are available as
       variables, with the following additions:

        %Da    author
        %De    filename extension
        %Dt    title

       All document date codes use strftime codes with a leading D, e.g.:

        %DY    document year
        %Dm    document month
        %Dd    document day
       
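       A small shell sketch can make the substitution concrete. It is not
       gscan2pdf's implementation: the functions replace_all and
       expand_filename are hypothetical, only the six %D codes listed above
       are handled, the extension is assumed to be given without a leading
       dot, and the author and title are assumed not to contain a % sign.
       Codes that remain after the substitution are passed to date(1) and so
       refer to the current time.

```shell
# replace_all STRING SEARCH REPLACEMENT
# Print STRING with every literal occurrence of SEARCH replaced.
replace_all() {
    rest=$1
    done_part=""
    while :; do
        case $rest in
            *"$2"*)
                done_part=$done_part${rest%%"$2"*}$3
                rest=${rest#*"$2"}
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$done_part$rest"
}

# expand_filename TEMPLATE AUTHOR TITLE EXTENSION DOCDATE
# DOCDATE is the document date, written as YYYY-MM-DD.
expand_filename() {
    name=$(replace_all "$1" '%Da' "$2")
    name=$(replace_all "$name" '%Dt' "$3")
    name=$(replace_all "$name" '%De' "$4")
    year=${5%%-*}
    month_day=${5#*-}
    name=$(replace_all "$name" '%DY' "$year")
    name=$(replace_all "$name" '%Dm' "${month_day%%-*}")
    name=$(replace_all "$name" '%Dd' "${month_day#*-}")
    # Anything still left (e.g. %Y) is a plain strftime code for "now".
    date "+$name"
}

expand_filename '%Dt_%DY-%Dm-%Dd.%De' 'J. Smith' 'Invoice' 'pdf' '2019-02-25'
# prints: Invoice_2019-02-25.pdf
```
       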
   View
       Zoom 100%

       Zooms to 1:1. How this appears depends on the desktop resolution.

       Zoom to fit

       Scales the view so that the whole page is visible.

       Zoom in

       Zoom out

       Rotate 90° clockwise

       The rotate options require the package imagemagick and, if this is not
       present, are ghosted out.

       Rotate 180°

       Rotate 90° anticlockwise

   Tools
       Threshold

       Changes all pixels darker than the given value to black; all others
       become white.

       Unsharp mask

       The unsharp option sharpens an image. The image is convolved with a
       Gaussian operator of the given radius and standard deviation (sigma).
       For reasonable results, radius should be larger than sigma. Use a
       radius of 0 to have the method select a suitable radius.

       Crop

       unpaper

       unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
       for cleaning up a scan.

       OCR (Optical Character Recognition)

       The gocr, tesseract, ocropus or cuneiform utilities are used to produce
       text from an image.

       There is an OCR output buffer for each page, which is embedded as plain
       text behind the scanned image in the PDF produced. This way, Beagle can
       index (i.e. search) the plain text.

       In DjVu files, the OCR output buffer is embedded in the hidden text
       layer.  Thus these can also be indexed by Beagle.

       There is an interesting review of OCR software at
       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
       An important conclusion was that 400ppi is necessary for decent
       results.

       Up to v2.04, the only way to tell which languages were available to
       tesseract was to look for the language files. Therefore, gscan2pdf
       checks the path returned by:

        tesseract '' '' -l ''

       If there are no language files in the above location, then gscan2pdf
       assumes that tesseract v1.0 is installed, which had no language files.

       Variables for user-defined tools

       The following variables are available:

        %i     input filename
        %o     output filename
        %r     resolution

       An image can be modified in-place by just specifying %i.
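       
       The effect of these variables can be sketched in shell. The functions
       replace_all and expand_tool_command are hypothetical and only print
       the resulting command line; how gscan2pdf itself quotes or runs the
       command is not covered here. The ImageMagick commands are merely
       example tools.

```shell
# replace_all STRING SEARCH REPLACEMENT
# Print STRING with every literal occurrence of SEARCH replaced.
replace_all() {
    rest=$1
    done_part=""
    while :; do
        case $rest in
            *"$2"*)
                done_part=$done_part${rest%%"$2"*}$3
                rest=${rest#*"$2"}
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$done_part$rest"
}

# expand_tool_command TEMPLATE INPUT OUTPUT RESOLUTION
expand_tool_command() {
    cmd=$(replace_all "$1" '%i' "$2")
    cmd=$(replace_all "$cmd" '%o' "$3")
    cmd=$(replace_all "$cmd" '%r' "$4")
    printf '%s\n' "$cmd"
}

expand_tool_command 'convert %i -negate %o' in.pnm out.pnm 300
# prints: convert in.pnm -negate out.pnm
```

       A template such as 'mogrify -negate %i' names only the input file,
       which corresponds to the in-place case mentioned above.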

FAQs

   Why isn't option xyz available in the scan window?
       Possibly because SANE or your scanner doesn't support it.

       If an option listed in the output of "scanimage --help" that you would
       like to use isn't available, send me the output and I will look at
       implementing it.

   I've only got an old flatbed scanner with no automatic sheet feeder. How do
       I scan a multipage document?
       In Edit/Preferences, tick the box "Allow batch scanning from flatbed".

       Some Brother scanners report "out of documents", despite scanning from
       flatbed.  This can be worked around by ticking the box "Force new scan
       job between pages".

       If you are lucky, you have an option like Wait-for-button or
       Button-wait, where the scanner will wait for you to press the scan
       button on the device before it starts the scan, allowing you to scan
       multiple pages without touching the computer.

       If you are quick, you might be able to change the document on the
       flatbed whilst the scan head is returning.

       Otherwise, you have to set the number of pages to scan to 1 and hit the
       scan button on the scan window for each page.

   Why is option xyz ghosted out?
       Probably because the package required for that option is not installed.
       Email as PDF requires xdg-email (xdg-utils); unpaper and the rotate
       options require imagemagick.

   Why can I not scan from the flatbed of my HP scanner?
       Generally for HP scanners with an ADF, to scan from the flatbed, you
       should set "# Pages" to "1", and possibly "Batch scan" to "No".

   When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
       of changes never displayed?
       As far as I can tell, this is pulled from changelogs.ubuntu.com, and
       therefore only the changelogs from official Ubuntu builds are
       displayed.

   Why can gscan2pdf not find my scanner?
       If your scanner is not connected directly to the machine on which you
       are running gscan2pdf and you have not installed the SANE daemon,
       saned, gscan2pdf cannot automatically find it. In this case, you can
       specify the scanner device on the command line:

       "gscan2pdf --device=<device>"

   How can I search for text in the OCR layer of the finished PDF or DjVu
       file?
       pdftotext or djvutxt can extract the text layer from PDF or DjVu files.
       See the respective man pages for details.

       Having opened a PDF or DjVu file in evince or Acrobat Reader, the
       search function will typically find the page with the requested text
       and highlight it.

       There are various tools for searching or indexing files, including PDF
       and DjVu:

       ·   (meta) Tracker (<https://projects.gnome.org/tracker/>)

       ·   plone (<http://plone.org/>)

       ·   pdfgrep (<http://pdfgrep.sourceforge.net/>)

       ·   swish-e (<http://www.swish-e.org/>)

       ·   recoll (<http://www.lesbonscomptes.com/recoll/>)

       ·   terrier (<http://terrier.org/>)
       
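       For a quick search from the command line, the two extraction tools
       mentioned above can be combined with grep. The wrapper search_ocr_text
       below is a hypothetical sketch; it assumes pdftotext and djvutxt are
       installed and picks the tool from the file extension.

```shell
# Hypothetical helper: print the lines of the text layer that match a
# pattern, with line numbers.
# $1 = pattern (matched case-insensitively), $2 = PDF or DjVu file
search_ocr_text() {
    case "$2" in
        *.pdf)  pdftotext "$2" - | grep -n -i -e "$1" ;;
        *.djvu) djvutxt "$2" | grep -n -i -e "$1" ;;
        *)
            echo "unsupported file type: $2" >&2
            return 2
            ;;
    esac
}
```

       For example, "search_ocr_text invoice scan.pdf" would list every line
       of the OCR text of scan.pdf that contains the word.
       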
   How can I change the colour of the selection box in the image viewer?
       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
       content:

        .rubberband,
        rubberband,
        flowbox rubberband,
        treeview.view rubberband,
        .content-view rubberband,
        .content-view .rubberband {
          border: 1px solid #2a76c6;
          background-color: rgba(42, 118, 198, 0.2); }

   How can I change the colour of the OCR output?
       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
       content:

        #gscan2pdf-ocr-output {
          color: black; }

See Also

       XSane (<http://xsane.org/>)

       Scan Tailor (<http://scantailor.org/>)

Author

       Jeffrey Ratcliffe (jffry at posteo dot net)

Thanks to

       ·   all the people who have sent patches, translations, bugs and
           feedback.

       ·   the gtk+ project for a most excellent graphics toolkit.

       ·   the Gtk3-Perl project for their superb Perl bindings for GTK3.

       ·   the SANE project for scanner access.

       ·   Björn Lindqvist for the gtkimageview widget.

       ·   Sourceforge for hosting the project.
       
LICENSE AND COPYRIGHT

       Copyright (C) 2006-2019 Jeffrey Ratcliffe <jffry@posteo.net>

       This program is free software: you can redistribute it and/or modify it
       under the terms of version 3 of the GNU General Public License as
       published by the Free Software Foundation.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program.  If not, see <https://www.gnu.org/licenses/>.

perl v5.28.1                      2019-02-25                      GSCAN2PDF(1)