GSCAN2PDF(1)          User Contributed Perl Documentation         GSCAN2PDF(1)

NAME

6       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents
7

USAGE

9       1. Scan one or several pages in with File/Scan
10       2. Create PDF of selected pages with File/Save
11

REQUIRED ARGUMENTS

13       None
14

OPTIONS

16       gscan2pdf has the following command-line options:
17
18       --device=device
           Specifies the device to use, instead of getting the list of devices
           via the SANE API.  This can be useful if the scanner is on a
           remote computer which is not broadcasting its existence.
22
23       --help
24           Displays this help page and exits.
25
26       --log=log-file
27           Specifies a file to store logging messages.
28
29       --debug, --info, --warn, --error, --fatal
30           Defines the log level.  If a log file is specified, this defaults
31           to --debug, otherwise --error.
32
33       --import=PDF|DjVu|images
34           Imports the specified file(s). If the document has more than one
35           page, a window is displayed to select the required pages.
36
       --import-all=PDF|DjVu|images
           Imports all pages of the specified file(s).

       --version
           Displays the program version and exits.
41
42       Scanning is handled with SANE via scanimage.  PDF conversion is done by
43       PDF::Builder.  TIFF export is handled by libtiff (faster and smaller
44       memory footprint for multipage files).
45

DIAGNOSTICS

47       To diagnose a possible error, start gscan2pdf from the command line
48       with logging enabled:
49
50       "gscan2pdf --log=file.log"
51
52       and check file.log.
53

EXIT STATUS

55       None
56

CONFIGURATION

58       gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
59       directory can be changed by setting the $XDG_CONFIG_HOME variable.
60       Generally, however, preferences should be changed via the
61       Edit/Preferences menu, or are captured automatically during normal
62       usage of the program.
63
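       As a rough illustration of the lookup described above, the following
       Python sketch resolves the resource file location. The function name
       config_path is hypothetical and not part of gscan2pdf, and the fallback
       to ~/.config when $XDG_CONFIG_HOME is unset or empty is an assumption.

```python
import os


def config_path(environ=None):
    """Return the assumed location of the gscan2pdfrc resource file."""
    environ = os.environ if environ is None else environ
    # Assumption: an unset or empty XDG_CONFIG_HOME falls back to ~/.config.
    base = environ.get("XDG_CONFIG_HOME") or os.path.join(
        os.path.expanduser("~"), ".config"
    )
    return os.path.join(base, "gscan2pdfrc")


print(config_path({"XDG_CONFIG_HOME": "/tmp/cfg"}))
```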

INCOMPATIBILITIES

65       None known.
66

BUGS AND LIMITATIONS

68       Whilst it is possible to import PDFs, this is intended to be able to
69       round-trip files created by gscan2pdf.
70

Download

72       gscan2pdf is available on Sourceforge
73       (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).
74
75   Debian-based
76       If you are using Debian, you should find that sid
77       <https://www.debian.org/releases/sid/> has the latest version already
78       packaged.
79
       If you are using an Ubuntu-based system, you can automatically keep up
       to date with the latest version via the PPA:
82
83       "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"
84
       If you are using Synaptic, then use menu Edit/Reload Package
86       Information, search for gscan2pdf in the package list, and lo and
87       behold, you can install the nice shiny new version.
88
89       From the command line:
90
91       "sudo apt update"
92
93       "sudo apt install gscan2pdf"
94
95   From source
96       The source is hosted in the files section of the gscan2pdf project on
97       Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).
98
99   From the repository
100       gscan2pdf uses Git for its Revision Control System. You can browse the
101       tree at <https://sourceforge.net/p/gscan2pdf/code/>.
102
103       Git users can clone the complete tree with "git clone
104       git://git.code.sf.net/p/gscan2pdf/code"
105

Building gscan2pdf from source

       Having downloaded the source either from a Sourceforge file release, or
       from the Git repository, unpack it if necessary and change into the
       source directory with "tar xvfz gscan2pdf-x.x.x.tar.gz" followed by
       "cd gscan2pdf-x.x.x".

       "perl Makefile.PL" will create the Makefile.
112
113       "make test" should run several hundred tests to confirm that things
114       will work properly on your system.
115
116       You can install directly from the source with "make install", but
117       building the appropriate package for your distribution should be as
118       straightforward as "make debdist" or "make rpmdist". However, you will
119       additionally need the rpm, devscripts, fakeroot, debhelper and gettext
120       packages.
121

Dependencies

123       The list below looks daunting, but all packages are available from any
124       reasonable up-to-date distribution. If you are using Synaptic, having
125       installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-
126       click it and you can install them under Recommends. Note also that the
127       library names given below are the Debian/Ubuntu ones. Those
128       distributions using RPM typically use perl(module) where Debian has
129       libmodule-perl.
130
131       Required
132           libgtk3-perl >= 0.028
               There is a bug in versions of libgtk3-perl before 0.028 that
134               causes gscan2pdf to crash when saving. Whilst I could prevent
135               gscan2pdf from crashing, it would still be impossible to save
136               anything, rendering gscan2pdf rather useless.
137
138           libgtk3-simplelist-perl
139               A simple interface to Gtk3's complex MVC list widget
140
141           liblocale-gettext-perl (>= 1.05)
142               Using libc functions for internationalisation in Perl
143
144           libpdf-builder-perl
145               provides the functions for creating PDF documents in Perl
146
147           libsane
148               API library for scanners
149
150           libimage-sane-perl
151               Perl bindings for libsane.
152
153           libset-intspan-perl
154               manages sets of integers
155
156           libtiff-tools
157               TIFF manipulation and conversion tools
158
           imagemagick
160               Image manipulation programs
161
162           perlmagick
163               A perl interface to the libMagick graphics routines
164
165           sane-utils
166               API library for scanners -- utilities.
167
168       Optional
169           sane
170               scanner graphical frontends. Only required for the scanadf
171               frontend.
172
173           unpaper
174               post-processing tool for scanned pages. See
175               <https://www.flameeyes.eu/projects/unpaper>.
176
177           xdg-utils
178               Desktop integration utilities from freedesktop.org. Required
179               for Email as PDF.  See
180               <https://www.freedesktop.org/wiki/Software/xdg-utils/>
181
182           djvulibre-bin
183               Utilities for the DjVu image format. See
184               <http://djvu.sourceforge.net/>
185
186           gocr
187               A command line OCR. See <http://jocr.sourceforge.net/>.
188
189           tesseract
190               A command line OCR. See
191               <https://github.com/tesseract-ocr/tesseract>
192
193           ocropus
194               A command line OCR. See <http://code.google.com/p/ocropus/>
195
196           cuneiform
197               A command line OCR. See <http://launchpad.net/cuneiform-linux>
198

Support

200       There are two mailing lists for gscan2pdf:
201
202       gscan2pdf-announce
203           A low-traffic list for announcements, mostly of new releases. You
204           can subscribe at
205           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>
206
207       gscan2pdf-help
           General support, questions, etc. You can subscribe at
209           <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>
210

Reporting bugs

212       Before reporting bugs, please read the "FAQs" section.
213
214       Please report any bugs found, preferably against the Debian
215       package[1][2].  You do not need to be a Debian user, or set up an
216       account to do this.  The Debian tool "reportbug" provides a convenient
217       GUI for doing so.
218
219       1. https://packages.debian.org/sid/gscan2pdf
220       2. https://www.debian.org/Bugs/
221
222       Alternatively, there is a bug tracker for the gscan2pdf project on
223       Sourceforge
224       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).
225
226       Please include the log file created by "gscan2pdf --log=log" with any
227       new bug report.
228

Translations

230       gscan2pdf has already been partly translated into several languages.
231       If you would like to contribute to an existing or new translation,
232       please check out Rosetta:
233       <https://translations.launchpad.net/gscan2pdf>
234
       Note that the translations for the scanner options are taken directly
       from sane-backends. If you would like to contribute to these, you can
       do so by contacting the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and having a look at the po/
       directory in the source code <http://www.sane-project.org/cvs.html>.
240
241       Alternatively, Ubuntu has its own translation project. For the 9.04
242       release, the translations are available at
243       <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>
244

DESCRIPTION

246   File
247       New
248
249       Clears the page list.
250
251       Open
252
253       Opens any format that imagemagick supports. PDFs will have their
254       embedded images extracted and imported one per page.
255
256       Note that files can also be imported by dragging them into the
257       thumbnail list from a program like nautilus or konqueror.
258
259       Scan
260
261       Sets options before scanning via SANE.
262
263       Device
264
265       Chooses between available scanners.
266
267       # Pages
268
269       Selects the number of pages, or all pages to scan.
270
271       Source document
272
       Selects between single-sided or double-sided pages.

       This affects the page numbering.  Single-sided scans are numbered
       consecutively.  Double-sided scans are incremented (or decremented, see
       below) by 2, i.e. 1, 3, 5, etc.
278
279       Side to scan
280
281       If double sided is selected above, assuming a non-duplex scanner, i.e.
282       a scanner that cannot automatically scan both sides of a page, this
283       determines whether the page number is incremented or decremented by 2.
284
285       To scan both sides of three pages, i.e. 6 sides:
286
287       1. Select:
288           # Pages = 3 (or "all" if your scanner can detect when it is out of
289           paper)
290
291           Double sided
292
293           Facing side
294
295       2. Scans sides 1, 3 & 5.
296       3. Put pile back with scanner ready to scan back of last page.
297       4. Select:
298           # Pages = 3 (or "all" if your scanner can detect when it is out of
299           paper)
300
301           Double sided
302
303           Reverse side
304
305       5. Scans sides 6, 4 & 2.
306       6. gscan2pdf automatically sorts the pages so that they appear in the
307       correct order.
308
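       The page numbering in the steps above can be sketched in Python. The
       helper side_numbers is a hypothetical name used only for illustration;
       it is not a description of how gscan2pdf is implemented.

```python
def side_numbers(count, start, step):
    """Number `count` scanned sides, beginning at `start` and moving by `step`."""
    return [start + i * step for i in range(count)]


facing = side_numbers(3, start=1, step=2)    # first pass: facing sides
reverse = side_numbers(3, start=6, step=-2)  # second pass: reverse sides, last page first

# Sorting by number interleaves the two passes into reading order.
document = sorted(facing + reverse)
print(facing, reverse, document)
```
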
309       Device-dependent options
310
311       These, naturally, depend on your scanner.  They can include
312
313       Page size.
314       Mode (colour/black & white/greyscale)
315       Resolution (in PPI)
316       Batch-scan
317           Guarantees that a "no documents" condition will be returned after
318           the last scanned page, to prevent endless flatbed scans after a
319           batch scan.
320
321       Wait-for-button/Button-wait
322           After sending the scan command, wait until the button on the
323           scanner is pressed before actually starting the scan process.
324
325       Source
326           Selects the document source.  Possible options can include Flatbed
327           or ADF.  On some scanners, this is the only way of generating an
328           out-of-documents signal.
329
330       Save
331
332       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
333       GIF.
334
335       Metadata
336
       Metadata are information that is not visible when viewing the
       PDF/DjVu, but is embedded in the file, and so is searchable and can be
       examined, typically with the "Properties" option of the document
       viewer.

       The metadata are completely optional, but can also be used to generate
       the filename; see Preferences for details.
344
345       The date can be selected with use of the calendar widget. The displayed
346       date can be incremented or decremented with use of the '+' and '-'
347       keys.
348
349       DjVu
350
351       Both black and white, and colour images produce better compression than
352       PDF. See <http://www.djvuzone.org/> for more details.
353
354       Email as PDF
355
356       Attaches the selected or all pages as a PDF to a blank email.  This
357       requires xdg-email, which is in the xdg-utils package.  If this is not
358       present, the option is ghosted out.
359
360       Print
361
362       Prints the selected or all pages.
363
364       Compress temporary files
365
366       If your temporary ($TMPDIR) directory is getting full, this function
       can be useful - compressing all images as LZW-compressed TIFFs. These
368       require much less space than the PNM files that are typically produced
369       by SANE or by importing a PDF.
370
371   Edit
372       Delete
373
374       Deletes the selected page.
375
376       Renumber
377
378       Renumbers the pages from 1..n.
379
380       Note that the page order can also be changed by drag and drop in the
381       thumbnail view.
382
383       Select
384
       The select menus can be used to select all, even, odd, blank, dark or
       modified pages. Selecting blank or dark pages runs imagemagick to make
       the decision.  Selecting modified pages selects those which have been
       modified by threshold, unsharp, etc., since the last OCR run was made.
389
390       Properties
391
392       When an image is scanned, gscan2pdf attempts to extract the resolution
393       from the scan options. This nearly always works without problem.
394
395       Importing an image can be trickier, however. Some image formats such as
396       PNM do not encode metadata for resolution. In other cases, the data is
397       incorrect.  Edit/Properties allows the user to manually correct the
       metadata for a particular page, thus correcting the size of the final PDF
399       or DjVu. The image itself is otherwise not changed - it is not down- or
400       upscaled.
401
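       The effect of the resolution metadata on the final page size can be
       illustrated with a short Python sketch. The function page_size_points
       and the sample pixel dimensions are assumptions chosen for
       illustration; the conversion itself (inches = pixels / PPI, with 72
       points per inch) is the standard one.

```python
def page_size_points(width_px, height_px, ppi):
    """Convert pixel dimensions to a page size in PDF points (1/72 inch)."""
    return (width_px * 72 / ppi, height_px * 72 / ppi)


# The same 2480x3508 pixel image, interpreted at two resolutions.
print(page_size_points(2480, 3508, 300))  # roughly A4
print(page_size_points(2480, 3508, 150))  # twice as wide and tall on paper
```

       The pixel data is identical in both cases; only the declared resolution,
       and therefore the physical size of the page, differs.
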
402       Preferences
403
404       The preferences menu item allows the control of the default behaviour
405       of various functions. Most of these are self-explanatory.
406
407       Frontends
408
409       gscan2pdf initially supported two frontends, scanimage and scanadf.
410       scanadf support was added when it was realised that scanadf works
411       better than scanimage with some scanners. On Debian-based systems,
412       scanadf is in the sane package, not, like scanimage, in sane-utils. If
413       scanadf is not present, the option is obviously ghosted out.
414
415       In 0.9.27, Perl bindings for SANE were introduced. These are called
416       libsane-perl.
417
418       Before 1.2.0, options available through CLI frontends like scanimage
419       were made visible as users asked for them. In 1.2.0, all options can be
420       shown or hidden via Edit/Preferences, along with the ability to specify
421       which options trigger a reload.
422
       In 1.8.3, new Perl bindings for SANE were introduced. These are called
424       libimage-sane-perl and are the preferred frontend.
425
426       In 1.8.5, support for libsane-perl was removed.
427
428       Device blacklist
429
430       Ignore listed devices.
431
432       Note that this is a device name regular expression, e.g. /dev/video,
433       and not the name as listed in the scan window, e.g. Noname
434       Integrated_Webcam_HD.
435
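       A minimal Python sketch of the filtering idea follows. The device list
       and the helper name filter_devices are made up for illustration, and it
       is an assumption that the pattern is searched for anywhere in the
       device name rather than matched against the whole of it.

```python
import re


def filter_devices(devices, blacklist_pattern):
    """Drop every device whose SANE device name matches the pattern."""
    pattern = re.compile(blacklist_pattern)
    # Assumption: the pattern may match anywhere in the name (re.search).
    return [name for name in devices if not pattern.search(name)]


devices = ["v4l:/dev/video0", "epson2:libusb:001:004"]  # hypothetical names
print(filter_devices(devices, r"/dev/video"))
```
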
436       Default filename for PDF or DjVu files
437
438       All strftime codes (e.g. %Y for the current year) are available as
439       variables, with the following additions:
440
441       %Da author
442
443       %De filename extension
444
445       %Dt title
446
447       All document date codes use strftime codes with a leading D, e.g.:
448
449       %DY document year
450
451       %Dm document month
452
453       %Dd document day
454
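       The template codes listed above can be sketched in Python as follows.
       The function expand_filename and its parameters are hypothetical, and
       the precedence shown (a leading D always selects a document code) is an
       assumption, not a statement about the gscan2pdf source.

```python
import re
from datetime import date, datetime


def expand_filename(template, author, title, extension, doc_date, now=None):
    """Expand %D... document codes and plain strftime codes in one pass."""
    now = now or datetime.now()
    extras = {"a": author, "e": extension, "t": title}

    def replace(match):
        is_document_code, code = match.group(1), match.group(2)
        if is_document_code:
            # %Da, %De and %Dt are the additions; any other %D<code> is
            # treated as a strftime code applied to the document date.
            if code in extras:
                return extras[code]
            return doc_date.strftime("%" + code)
        # A code without the leading D refers to the current time.
        return now.strftime("%" + code)

    return re.sub(r"%(D?)(.)", replace, template)


print(expand_filename("%DY-%Dm-%Dd %Dt.%De", "A. Author", "Invoice", "pdf",
                      date(2021, 4, 22)))
```
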
455   View
456       Zoom 100%
457
458       Zooms to 1:1. How this appears depends on the desktop resolution.
459
460       Zoom to fit
461
462       Scales the view such that all the page is visible.
463
464       Zoom in
465
466       Zoom out
467
468       Rotate 90° clockwise
469
470       The rotate options require the package imagemagick and, if this is not
471       present, are ghosted out.
472
473       Rotate 180°
474
475       Rotate 90° anticlockwise
476
477   Tools
478       Threshold
479
480       Changes all pixels darker than the given value to black; all others
481       become white.
482
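       As a sketch of the operation, assuming 8-bit greyscale values where 0
       is black and 255 is white (the value scale gscan2pdf itself uses may
       differ), thresholding can be written in Python as:

```python
def threshold(pixels, value, black=0, white=255):
    """Map greyscale pixels darker than `value` to black, all others to white."""
    return [[black if p < value else white for p in row] for row in pixels]


image = [[10, 200], [128, 127]]  # made-up 2x2 greyscale image, 0 = black
print(threshold(image, 128))
```
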
483       Unsharp mask
484
485       The unsharp option sharpens an image. The image is convolved with a
486       Gaussian operator of the given radius and standard deviation (sigma).
487       For reasonable results, radius should be larger than sigma. Use a
488       radius of 0 to have the method select a suitable radius.
489
490       Crop
491
492       unpaper
493
494       unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
495       for cleaning up a scan.
496
497       OCR (Optical Character Recognition)
498
499       The gocr, tesseract, ocropus or cuneiform utilities are used to produce
500       text from an image.
501
       There is an OCR output buffer for each page, which is embedded as plain
       text behind the scanned image in the PDF produced. This way, Beagle can
       index (i.e. search) the plain text.
505
506       In DjVu files, the OCR output buffer is embedded in the hidden text
507       layer.  Thus these can also be indexed by Beagle.
508
509       There is an interesting review of OCR software at
510       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
511       An important conclusion was that 400ppi is necessary for decent
512       results.
513
514       Up to v2.04, the only way to tell which languages were available to
515       tesseract was to look for the language files. Therefore, gscan2pdf
516       checks the path returned by:
517
518       "tesseract '' '' -l ''"
519
520       If there are no language files in the above location, then gscan2pdf
521       assumes that tesseract v1.0 is installed, which had no language files.
522
523       Variables for user-defined tools
524
525       The following variables are available:
526
527       %i  input filename
528
529       %o  output filename
530
531       %r  resolution
532
533       An image can be modified in-place by just specifying %i.
534
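       The substitution of these variables can be sketched in Python. The
       function expand_tool_command is a hypothetical name, the shell quoting
       of filenames is an assumption added for safety, and the ImageMagick
       commands are only sample tool definitions.

```python
import re
import shlex


def expand_tool_command(command, infile, outfile, resolution):
    """Substitute %i, %o and %r in a user-defined tool command line."""
    values = {
        "i": shlex.quote(infile),
        "o": shlex.quote(outfile),
        "r": str(resolution),
    }
    # Single pass, so substituted text is never expanded a second time.
    return re.sub(r"%([ior])", lambda m: values[m.group(1)], command)


# Separate input and output files.
print(expand_tool_command("convert %i -negate %o", "in.pnm", "out.pnm", 300))
# In-place modification: only %i appears in the command.
print(expand_tool_command("mogrify -negate %i", "my page.pnm", "unused.pnm", 300))
```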

FAQs

536   Why isn't option xyz available in the scan window?
537       Possibly because SANE or your scanner doesn't support it.
538
539       If an option listed in the output of "scanimage --help" that you would
540       like to use isn't available, send me the output and I will look at
541       implementing it.
542
543   I've only got an old flatbed scanner with no automatic sheetfeeder. How do
544       I scan a multipage document?
545       In Edit/Preferences, tick the box "Allow batch scanning from flatbed".
546
547       Some Brother scanners report "out of documents", despite scanning from
548       flatbed.  This can be worked around by ticking the box "Force new scan
549       job between pages".
550
551       If you are lucky, you have an option like Wait-for-button or Button-
552       wait, where the scanner will wait for you to press the scan button on
553       the device before it starts the scan, allowing you to scan multiple
554       pages without touching the computer.
555
556       If you are quick, you might be able to change the document on the
557       flatbed whilst the scan head is returning.
558
559       Otherwise, you have to set the number of pages to scan to 1 and hit the
560       scan button on the scan window for each page.
561
562   Why is option xyz ghosted out?
563       Probably because the package required for that option is not installed.
564       Email as PDF requires xdg-email (xdg-utils), unpaper and the rotate
565       options require imagemagick.
566
567   Why can I not scan from the flatbed of my HP scanner?
568       Generally for HP scanners with an ADF, to scan from the flatbed, you
569       should set "# Pages" to "1", and possibly "Batch scan" to "No".
570
571   When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
572       of changes never displayed?
573       As far as I can tell, this is pulled from changelogs.ubuntu.com, and
574       therefore only the changelogs from official Ubuntu builds are
575       displayed.
576
577   Why can gscan2pdf not find my scanner?
578       If your scanner is not connected directly to the machine on which you
579       are running gscan2pdf and you have not installed the SANE daemon,
580       saned, gscan2pdf cannot automatically find it. In this case, you can
581       specify the scanner device on the command line:
582
       "gscan2pdf --device <device>"
584
585   How can I search for text in the OCR layer of the finished PDF or DJVU
586       file?
587       pdftotext or djvutxt can extract the text layer from PDF or DJVU files.
588       See the respective man pages for details.
589
590       Having opened a PDF or DJVU file in evince or Acrobat Reader, the
591       search function will typically find the page with the requested text
592       and highlight it.
593
594       There are various tools for searching or indexing files, including PDF
595       and DJVU:
596
597       •   (meta) Tracker (<https://projects.gnome.org/tracker/>)
598
599       •   plone (<http://plone.org/>)
600
       •   pdfgrep (<http://pdfgrep.sourceforge.net/>)
602
603       •   swish-e (<http://www.swish-e.org/>)
604
605       •   recoll (<http://www.lesbonscomptes.com/recoll/>)
606
       •   terrier (<http://terrier.org/>)
608
609   How can I change the colour of the selection box in the image viewer?
610       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
611       content:
612
613        .rubberband,
614        rubberband,
615        flowbox rubberband,
616        treeview.view rubberband,
617        .content-view rubberband,
618        .content-view .rubberband {
619          border: 1px solid #2a76c6;
620          background-color: rgba(42, 118, 198, 0.2); }
621
   How can I change the colour of the OCR output?
623       Create a file called "~/.config/gtk-3.0/gtk.css" with the following
624       content:
625
626        #gscan2pdf-ocr-output {
627          color: black;
628        }
629

See Also

631       XSane (<http://xsane.org/>)
632
633       Scan Tailor (<http://scantailor.org/>)
634

Author

636       Jeffrey Ratcliffe (jffry at posteo dot net)
637

Thanks to

639       •   all the people who have sent patches, translations, bugs and
640           feedback.
641
642       •   the gtk+ project for a most excellent graphics toolkit.
643
644       •   the Gtk3-Perl project for their superb Perl bindings for GTK3.
645
646       •   The SANE project for scanner access
647
648       •   Björn Lindqvist for the gtkimageview widget
649
650       •   Sourceforge for hosting the project.
651
653       Copyright (C) 2006--2021 Jeffrey Ratcliffe <jffry@posteo.net>
654
655       This program is free software: you can redistribute it and/or modify it
656       under the terms of the version 3 GNU General Public License as
657       published by the Free Software Foundation.
658
659       This program is distributed in the hope that it will be useful, but
660       WITHOUT ANY WARRANTY; without even the implied warranty of
661       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
662       General Public License for more details.
663
664       You should have received a copy of the GNU General Public License along
665       with this program.  If not, see <https://www.gnu.org/licenses/>.
666

perl v5.32.1                      2021-04-22                      GSCAN2PDF(1)