GSCAN2PDF(1) User Contributed Perl Documentation GSCAN2PDF(1)
2
3
4
6 gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents
7
9 1. Scan one or several pages in with File/Scan
10 2. Create PDF of selected pages with File/Save
11
13 None
14
16 gscan2pdf has the following command-line options:
17
--device=<device> Specifies the device to use, instead of getting the
list of devices via the SANE API. This can be useful if the
scanner is on a remote computer which is not broadcasting its
existence.
22 --help Displays this help page and exits.
23 --log=<log file> Specifies a file to store logging messages.
24 --(debug|info|warn|error|fatal) Defines the log level. If a log file is
25 specified, this defaults to 'debug', otherwise 'warn'.
26 --import=<PDF|DjVu|images> Imports the specified file(s). If the
27 document has more than one page, a window is displayed to select the
28 required pages.
29 --import-all=<PDF|DjVu|images> Imports all pages of the specified
30 file(s).
31 --version Displays the program version and exits.
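
As a rough illustration of how these options combine, the sketch below
prints two possible invocations rather than launching the GUI. The
"run" helper, the DRY_RUN variable and the file names are assumptions
made for this example; they are not part of gscan2pdf.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print the command instead of running it.
# Set DRY_RUN=0 in the environment to execute the commands for real.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# File names below are placeholders for illustration only.
# Import every page of an existing PDF and write log messages to a file.
run gscan2pdf --import-all=old-scan.pdf --log=gscan2pdf.log
# Import a DjVu file (with page selection if it has several pages)
# and set the log level to 'info'.
run gscan2pdf --import=letter.djvu --info
```

With DRY_RUN left at its default, the script only echoes the two
command lines, which makes it safe to try on a machine without a
scanner attached.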
32
33 Scanning is handled with SANE via scanimage. PDF conversion is done by
34 PDF::API2. TIFF export is handled by libtiff (faster and smaller
35 memory footprint for multipage files).
36
38 To diagnose a possible error, start gscan2pdf from the command line
39 with logging enabled:
40
41 "gscan2pdf --log=file.log"
42
43 and check file.log.
44
46 None
47
49 gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
50 directory can be changed by setting the $XDG_CONFIG_HOME variable.
51 Generally, however, preferences should be changed via the
52 Edit/Preferences menu, or are captured automatically during normal
53 usage of the program.
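
The location described above can be sketched in shell as follows. The
function name rc_path is made up for this example, and the fallback to
$HOME/.config assumes the usual XDG default when $XDG_CONFIG_HOME is
unset.

```shell
#!/bin/sh
# rc_path (hypothetical helper): print where the resource file is expected.
# Uses $XDG_CONFIG_HOME when set and non-empty, otherwise $HOME/.config.
rc_path() {
    printf '%s/gscan2pdfrc\n' "${XDG_CONFIG_HOME:-$HOME/.config}"
}

rc_path
```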
54
56 None known.
57
Whilst it is possible to import PDFs, this is intended to allow
round-tripping of files created by gscan2pdf.
61
63 gscan2pdf is available on Sourceforge
64 (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).
65
66 Debian-based
67 If you are using Debian, you should find that sid has the latest
68 version already packaged.
69
If you are using an Ubuntu-based system, you can automatically keep up
to date with the latest version via the PPA:
72
73 "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"
74
If you are using Synaptic, then use menu Edit/Reload Package
Information, search for gscan2pdf in the package list, and lo and
behold, you can install the nice shiny new version.
78
79 From the command line:
80
81 "sudo apt-get update"
82
83 "sudo apt-get install gscan2pdf"
84
85 RPMs
86 Download the rpm from Sourceforge, and then install it with "rpm -i
87 gscan2pdf-version.rpm"
88
89 From source
90 The source is hosted in the files section of the gscan2pdf project on
91 Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).
92
93 From the repository
94 gscan2pdf uses Git for its Revision Control System. You can browse the
95 tree at <https://sourceforge.net/p/gscan2pdf/code/>.
96
97 Git users can clone the complete tree with "git clone
98 git://git.code.sf.net/p/gscan2pdf/code"
99
Having downloaded the source either from a Sourceforge file release, or
from the Git repository, unpack it if necessary with "tar xvfz
gscan2pdf-x.x.x.tar.gz" and change into the new directory with "cd
gscan2pdf-x.x.x".

"perl Makefile.PL" will create the Makefile.
106
107 "make test" should run several hundred tests to confirm that things
108 will work properly on your system.
109
110 You can install directly from the source with "make install", but
111 building the appropriate package for your distribution should be as
112 straightforward as "make debdist" or "make rpmdist". However, you will
113 additionally need the rpm, devscripts, fakeroot, debhelper and gettext
114 packages.
115
The list below looks daunting, but all packages are available from any
reasonably up-to-date distribution. If you are using Synaptic, having
119 installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-
120 click it and you can install them under Recommends. Note also that the
121 library names given below are the Debian/Ubuntu ones. Those
122 distributions using RPM typically use perl(module) where Debian has
123 libmodule-perl.
124
125 Required
126 libgtk3-perl >= 0.028
There is a bug in versions of libgtk3-perl before 0.028 that
128 causes gscan2pdf to crash when saving. Whilst I could prevent
129 gscan2pdf from crashing, it would still be impossible to save
130 anything, rendering gscan2pdf rather useless.
131
132 libgtk3-simplelist-perl
133 A simple interface to Gtk3's complex MVC list widget
134
135 liblocale-gettext-perl (>= 1.05)
136 Using libc functions for internationalisation in Perl
137
138 libpdf-api2-perl
139 provides the functions for creating PDF documents in Perl
140
141 libsane
142 API library for scanners
143
144 libimage-sane-perl
145 Perl bindings for libsane.
146
147 libset-intspan-perl
148 manages sets of integers
149
150 libtiff-tools
151 TIFF manipulation and conversion tools
152
153 Imagemagick
154 Image manipulation programs
155
156 perlmagick
157 A perl interface to the libMagick graphics routines
158
159 sane-utils
160 API library for scanners -- utilities.
161
162 Optional
163 sane
164 scanner graphical frontends. Only required for the scanadf
165 frontend.
166
167 unpaper
168 post-processing tool for scanned pages. See
169 <https://www.flameeyes.eu/projects/unpaper>.
170
171 xdg-utils
172 Desktop integration utilities from freedesktop.org. Required
173 for Email as PDF. See
174 <https://www.freedesktop.org/wiki/Software/xdg-utils/>
175
176 djvulibre-bin
177 Utilities for the DjVu image format. See
178 <http://djvu.sourceforge.net/>
179
180 gocr
181 A command line OCR. See <http://jocr.sourceforge.net/>.
182
183 tesseract
184 A command line OCR. See
185 <https://github.com/tesseract-ocr/tesseract>
186
187 ocropus
188 A command line OCR. See <http://code.google.com/p/ocropus/>
189
190 cuneiform
191 A command line OCR. See <http://launchpad.net/cuneiform-linux>
192
194 There are two mailing lists for gscan2pdf:
195
196 gscan2pdf-announce
197 A low-traffic list for announcements, mostly of new releases. You
198 can subscribe at
199 <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>
200
201 gscan2pdf-help
General support, questions, etc. You can subscribe at
203 <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>
204
206 Before reporting bugs, please read the "FAQs" section.
207
208 Please report any bugs found, preferably against the Debian
209 package[1][2]. You do not need to be a Debian user, or set up an
210 account to do this. The Debian tool "reportbug" provides a convenient
211 GUI for doing so.
212
213 1. https://packages.debian.org/sid/gscan2pdf
214 2. https://www.debian.org/Bugs/
215
216 Alternatively, there is a bug tracker for the gscan2pdf project on
217 Sourceforge
218 (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).
219
220 Please include the log file created by "gscan2pdf --log=log" with any
221 new bug report.
222
224 gscan2pdf has already been partly translated into several languages.
225 If you would like to contribute to an existing or new translation,
226 please check out Rosetta:
227 <https://translations.launchpad.net/gscan2pdf>
228
Note that the translations for the scanner options are taken directly
from sane-backends. If you would like to contribute to these, you can
do so either by contacting the sane-devel mailing list
(sane-devel@lists.alioth.debian.org) or by looking at the po/
directory in the source code <http://www.sane-project.org/cvs.html>.
234
235 Alternatively, Ubuntu has its own translation project. For the 9.04
236 release, the translations are available at
237 <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>
238
240 File
241 New
242
243 Clears the page list.
244
245 Open
246
247 Opens any format that imagemagick supports. PDFs will have their
248 embedded images extracted and imported one per page.
249
250 Note that files can also be imported by dragging them into the
251 thumbnail list from a program like nautilus or konqueror.
252
253 Scan
254
255 Sets options before scanning via SANE.
256
257 Device
258
259 Chooses between available scanners.
260
261 # Pages
262
Selects the number of pages to scan, or all pages.
264
265 Source document
266
Selects between single-sided or double-sided pages.

This affects the page numbering. Single-sided scans are numbered
consecutively. Double-sided scans are incremented (or decremented, see
below) by 2, i.e. 1, 3, 5, etc.
272
273 Side to scan
274
275 If double sided is selected above, assuming a non-duplex scanner, i.e.
276 a scanner that cannot automatically scan both sides of a page, this
277 determines whether the page number is incremented or decremented by 2.
278
279 To scan both sides of three pages, i.e. 6 sides:
280
281 1. Select:
282 # Pages = 3 (or "all" if your scanner can detect when it is out of
283 paper)
284
285 Double sided
286
287 Facing side
288
289 2. Scans sides 1, 3 & 5.
290 3. Put pile back with scanner ready to scan back of last page.
291 4. Select:
292 # Pages = 3 (or "all" if your scanner can detect when it is out of
293 paper)
294
295 Double sided
296
297 Reverse side
298
299 5. Scans sides 6, 4 & 2.
300 6. gscan2pdf automatically sorts the pages so that they appear in the
301 correct order.
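
The numbering in the steps above can be sketched as a small shell
function. This is only an illustration of the arithmetic, not
gscan2pdf's own code; the name number_sides and its arguments are
invented for the example.

```shell
#!/bin/sh
# number_sides N SIDE (hypothetical helper)
# Print the page numbers given to N sheets scanned in one pass.
#   SIDE=facing  : start at 1 and increment by 2
#   SIDE=reverse : start at 2*N and decrement by 2
number_sides() {
    n=$1
    if [ "$2" = facing ]; then
        start=1
        step=2
    else
        start=$((2 * n))
        step=-2
    fi
    i=0
    out=
    while [ "$i" -lt "$n" ]; do
        # Append the next number, separated by a space after the first.
        out="$out${out:+ }$((start + i * step))"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

number_sides 3 facing    # prints: 1 3 5
number_sides 3 reverse   # prints: 6 4 2
```

Sorting the union of both passes by page number gives 1 to 6, which
is the order in which the sides appear in the original document.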
302
303 Device-dependent options
304
305 These, naturally, depend on your scanner. They can include
306
307 Page size.
308 Mode (colour/black & white/greyscale)
309 Resolution (in PPI)
310 Batch-scan
311 Guarantees that a "no documents" condition will be returned after
312 the last scanned page, to prevent endless flatbed scans after a
313 batch scan.
314
315 Wait-for-button/Button-wait
316 After sending the scan command, wait until the button on the
317 scanner is pressed before actually starting the scan process.
318
319 Source
320 Selects the document source. Possible options can include Flatbed
321 or ADF. On some scanners, this is the only way of generating an
322 out-of-documents signal.
323
324 Save
325
326 Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
327 GIF.
328
329 Metadata
330
Metadata are pieces of information that are not visible when viewing
the PDF/DjVu, but are embedded in the file and so are searchable and
can be examined, typically with the "Properties" option of the document
viewer.

The metadata are completely optional, but can also be used to generate
the filename; see Preferences for details.
338
339 The date can be selected with use of the calendar widget. The displayed
340 date can be incremented or decremented with use of the '+' and '-'
341 keys.
342
343 DjVu
344
DjVu compresses both black and white and colour images better than
PDF. See <http://www.djvuzone.org/> for more details.
347
348 Email as PDF
349
350 Attaches the selected or all pages as a PDF to a blank email. This
351 requires xdg-email, which is in the xdg-utils package. If this is not
352 present, the option is ghosted out.
353
354 Print
355
356 Prints the selected or all pages.
357
358 Compress temporary files
359
If your temporary ($TMPDIR) directory is getting full, this function
can be useful: it compresses all images as LZW-compressed TIFFs. These
require much less space than the PNM files that are typically produced
by SANE or by importing a PDF.
364
365 Edit
366 Delete
367
368 Deletes the selected page.
369
370 Renumber
371
372 Renumbers the pages from 1..n.
373
374 Note that the page order can also be changed by drag and drop in the
375 thumbnail view.
376
377 Select
378
The select menus can be used to select all, even, odd, blank, dark or
modified pages. Selecting blank or dark pages runs imagemagick to make
the decision. Selecting modified pages selects those which have been
modified by threshold, unsharp, etc., since the last OCR run was made.
383
384 Properties
385
386 When an image is scanned, gscan2pdf attempts to extract the resolution
387 from the scan options. This nearly always works without problem.
388
389 Importing an image can be trickier, however. Some image formats such as
390 PNM do not encode metadata for resolution. In other cases, the data is
incorrect. Edit/Properties allows the user to manually correct the
metadata for a particular page, thus correcting the size of the final
PDF or DjVu. The image itself is otherwise not changed: it is neither
downscaled nor upscaled.
395
396 Preferences
397
398 The preferences menu item allows the control of the default behaviour
399 of various functions. Most of these are self-explanatory.
400
401 Frontends
402
403 gscan2pdf initially supported two frontends, scanimage and scanadf.
404 scanadf support was added when it was realised that scanadf works
405 better than scanimage with some scanners. On Debian-based systems,
406 scanadf is in the sane package, not, like scanimage, in sane-utils. If
407 scanadf is not present, the option is obviously ghosted out.
408
409 In 0.9.27, Perl bindings for SANE were introduced. These are called
410 libsane-perl.
411
412 Before 1.2.0, options available through CLI frontends like scanimage
413 were made visible as users asked for them. In 1.2.0, all options can be
414 shown or hidden via Edit/Preferences, along with the ability to specify
415 which options trigger a reload.
416
In 1.8.3, new Perl bindings for SANE were introduced. These are called
418 libimage-sane-perl and are the preferred frontend.
419
420 In 1.8.5, support for libsane-perl was removed.
421
422 Device blacklist
423
424 Ignore listed devices.
425
426 Note that this is a device name regular expression, e.g. /dev/video,
427 and not the name as listed in the scan window, e.g. Noname
428 Integrated_Webcam_HD.
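
To illustrate the idea of matching on the device name rather than the
displayed name, the sketch below filters a list of device names with
grep. It assumes an unanchored extended regular expression; how
gscan2pdf itself applies the expression is not specified here, and the
function name filter_devices is hypothetical.

```shell
#!/bin/sh
# filter_devices REGEX (hypothetical helper)
# Read device names on stdin, one per line, and drop those matching REGEX.
filter_devices() {
    grep -v -E -- "$1"
}

# Example device names in the form SANE backends typically report.
printf '%s\n' 'v4l:/dev/video0' 'test:0' | filter_devices '/dev/video'
# prints: test:0
```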
429
430 Default filename for PDF or DjVu files
431
432 All strftime codes (e.g. %Y for the current year) are available as
433 variables, with the following additions:
434
435 %Da author
436 %De filename extension
437 %Dt title
438
439 All document date codes use strftime codes with a leading D, e.g.:
440
441 %DY document year
442 %Dm document month
443 %Dd document day
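
A minimal sketch of how such a template might expand is shown below.
It handles only the document codes listed above and leaves the plain
strftime codes (current date and time) untouched. The function name
expand_name, its argument order and the sample values are assumptions
for illustration and do not reflect gscan2pdf's implementation.

```shell
#!/bin/sh
# expand_name TEMPLATE AUTHOR TITLE EXT DATE (hypothetical helper)
# DATE is the document date as YYYY-MM-DD. The values must not contain
# '|', '&' or '\', which are special in the sed expressions below.
expand_name() {
    docdate=$5
    year=${docdate%%-*}     # text before the first '-'
    rest=${docdate#*-}      # text after the first '-'
    month=${rest%%-*}
    day=${rest#*-}
    printf '%s\n' "$1" | sed \
        -e "s|%Da|$2|g" -e "s|%Dt|$3|g" -e "s|%De|$4|g" \
        -e "s|%DY|$year|g" -e "s|%Dm|$month|g" -e "s|%Dd|$day|g"
}

expand_name '%DY-%Dm-%Dd %Da %Dt.%De' 'Ratcliffe' 'Invoice' 'pdf' '2020-04-09'
# prints: 2020-04-09 Ratcliffe Invoice.pdf
```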
444
445 View
446 Zoom 100%
447
448 Zooms to 1:1. How this appears depends on the desktop resolution.
449
450 Zoom to fit
451
Scales the view such that the whole page is visible.
453
454 Zoom in
455
456 Zoom out
457
458 Rotate 90° clockwise
459
460 The rotate options require the package imagemagick and, if this is not
461 present, are ghosted out.
462
463 Rotate 180°
464
465 Rotate 90° anticlockwise
466
467 Tools
468 Threshold
469
470 Changes all pixels darker than the given value to black; all others
471 become white.
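
The rule can be illustrated on plain grey values with a short awk
filter. This is a sketch under stated assumptions: values run from 0
(black) to 255 (white), "darker" means strictly less than the
threshold, and the function name threshold is invented for the
example. The scale gscan2pdf uses for its threshold value may differ.

```shell
#!/bin/sh
# threshold T (hypothetical helper)
# Read rows of grey values (0-255) on stdin; values strictly below T
# become 0 (black), all others become 255 (white).
threshold() {
    awk -v t="$1" '{
        for (i = 1; i <= NF; i++)
            $i = ($i < t) ? 0 : 255
        print
    }'
}

printf '10 128 200\n127 129 255\n' | threshold 128
# prints:
# 0 255 255
# 0 255 255
```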
472
473 Unsharp mask
474
475 The unsharp option sharpens an image. The image is convolved with a
476 Gaussian operator of the given radius and standard deviation (sigma).
477 For reasonable results, radius should be larger than sigma. Use a
478 radius of 0 to have the method select a suitable radius.
479
480 Crop
481
482 unpaper
483
484 unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
485 for cleaning up a scan.
486
487 OCR (Optical Character Recognition)
488
489 The gocr, tesseract, ocropus or cuneiform utilities are used to produce
490 text from an image.
491
Each page has an OCR output buffer, which is embedded as plain text
behind the scanned image in the PDF produced. This way, Beagle can
index (i.e. search) the plain text.
495
496 In DjVu files, the OCR output buffer is embedded in the hidden text
497 layer. Thus these can also be indexed by Beagle.
498
499 There is an interesting review of OCR software at
500 <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
501 An important conclusion was that 400ppi is necessary for decent
502 results.
503
504 Up to v2.04, the only way to tell which languages were available to
505 tesseract was to look for the language files. Therefore, gscan2pdf
506 checks the path returned by:
507
508 tesseract '' '' -l ''
509
510 If there are no language files in the above location, then gscan2pdf
511 assumes that tesseract v1.0 is installed, which had no language files.
512
513 Variables for user-defined tools
514
515 The following variables are available:
516
517 %i input filename
518 %o output filename
519 %r resolution
520
521 An image can be modified in-place by just specifying %i.
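
As an illustration of how these variables could be filled in, the
sketch below substitutes them into a command template and prints the
result. The helper name expand_tool and the file names are
hypothetical; "convert ... -negate" is used only as an example of an
ImageMagick command that reads one file and writes another.

```shell
#!/bin/sh
# expand_tool TEMPLATE INPUT OUTPUT RESOLUTION (hypothetical helper)
# Replace %i, %o and %r in TEMPLATE and print the resulting command.
# The values must not contain '|', '&' or '\'.
expand_tool() {
    printf '%s\n' "$1" | sed -e "s|%i|$2|g" -e "s|%o|$3|g" -e "s|%r|$4|g"
}

expand_tool 'convert %i -negate %o' page1.pnm page1-out.pnm 300
# prints: convert page1.pnm -negate page1-out.pnm
```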
522
524 Why isn't option xyz available in the scan window?
525 Possibly because SANE or your scanner doesn't support it.
526
527 If an option listed in the output of "scanimage --help" that you would
528 like to use isn't available, send me the output and I will look at
529 implementing it.
530
531 I've only got an old flatbed scanner with no automatic sheetfeeder. How do
532 I scan a multipage document?
533 In Edit/Preferences, tick the box "Allow batch scanning from flatbed".
534
535 Some Brother scanners report "out of documents", despite scanning from
536 flatbed. This can be worked around by ticking the box "Force new scan
537 job between pages".
538
539 If you are lucky, you have an option like Wait-for-button or Button-
540 wait, where the scanner will wait for you to press the scan button on
541 the device before it starts the scan, allowing you to scan multiple
542 pages without touching the computer.
543
544 If you are quick, you might be able to change the document on the
545 flatbed whilst the scan head is returning.
546
547 Otherwise, you have to set the number of pages to scan to 1 and hit the
548 scan button on the scan window for each page.
549
550 Why is option xyz ghosted out?
551 Probably because the package required for that option is not installed.
Email as PDF requires xdg-email (xdg-utils); unpaper and the rotate
options require imagemagick.
554
555 Why can I not scan from the flatbed of my HP scanner?
556 Generally for HP scanners with an ADF, to scan from the flatbed, you
557 should set "# Pages" to "1", and possibly "Batch scan" to "No".
558
559 When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
560 of changes never displayed?
561 As far as I can tell, this is pulled from changelogs.ubuntu.com, and
562 therefore only the changelogs from official Ubuntu builds are
563 displayed.
564
565 Why can gscan2pdf not find my scanner?
566 If your scanner is not connected directly to the machine on which you
567 are running gscan2pdf and you have not installed the SANE daemon,
568 saned, gscan2pdf cannot automatically find it. In this case, you can
569 specify the scanner device on the command line:
570
"gscan2pdf --device=<device>"
572
573 How can I search for text in the OCR layer of the finished PDF or DJVU
574 file?
575 pdftotext or djvutxt can extract the text layer from PDF or DJVU files.
576 See the respective man pages for details.
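
For example, the two extractors can be wrapped in a small function
that searches the text layer with grep. The function name search_ocr
and the file names are made up for this sketch; it relies on pdftotext
writing to standard output when the output file is given as "-", and
on djvutxt printing to standard output when no output file is given.

```shell
#!/bin/sh
# search_ocr FILE PATTERN (hypothetical helper)
# Print matching lines of the text layer, with line numbers,
# ignoring case. Returns 2 for unsupported file types.
search_ocr() {
    case $1 in
        *.pdf)  pdftotext "$1" - | grep -n -i -- "$2" ;;
        *.djvu) djvutxt "$1" | grep -n -i -- "$2" ;;
        *)      echo "unsupported file type: $1" >&2; return 2 ;;
    esac
}

# Usage (file names are placeholders):
#   search_ocr scan.pdf invoice
#   search_ocr scan.djvu invoice
```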
577
578 Having opened a PDF or DJVU file in evince or Acrobat Reader, the
579 search function will typically find the page with the requested text
580 and highlight it.
581
582 There are various tools for searching or indexing files, including PDF
583 and DJVU:
584
585 · (meta) Tracker (<https://projects.gnome.org/tracker/>)
586
587 · plone (<http://plone.org/>)
588
· pdfgrep (<http://pdfgrep.sourceforge.net/>)
590
591 · swish-e (<http://www.swish-e.org/>)
592
593 · recoll (<http://www.lesbonscomptes.com/recoll/>)
594
595 · terrier (<http://www.lesbonscomptes.com/recoll/>)
596
597 How can I change the colour of the selection box in the image viewer?
598 Create a file called "~/.config/gtk-3.0/gtk.css" with the following
599 content:
600
601 .rubberband,
602 rubberband,
603 flowbox rubberband,
604 treeview.view rubberband,
605 .content-view rubberband,
606 .content-view .rubberband {
607 border: 1px solid #2a76c6;
608 background-color: rgba(42, 118, 198, 0.2); }
609
How can I change the colour of the OCR output?
611 Create a file called "~/.config/gtk-3.0/gtk.css" with the following
612 content:
613
614 #gscan2pdf-ocr-output {
615 color: black; }
616
618 XSane (<http://xsane.org/>)
619
620 Scan Tailor (<http://scantailor.org/>)
621
623 Jeffrey Ratcliffe (jffry at posteo dot net)
624
626 · all the people who have sent patches, translations, bugs and
627 feedback.
628
629 · the gtk+ project for a most excellent graphics toolkit.
630
631 · the Gtk3-Perl project for their superb Perl bindings for GTK3.
632
633 · The SANE project for scanner access
634
635 · Björn Lindqvist for the gtkimageview widget
636
637 · Sourceforge for hosting the project.
638
640 Copyright (C) 2006--2020 Jeffrey Ratcliffe <jffry@posteo.net>
641
642 This program is free software: you can redistribute it and/or modify it
643 under the terms of the version 3 GNU General Public License as
644 published by the Free Software Foundation.
645
646 This program is distributed in the hope that it will be useful, but
647 WITHOUT ANY WARRANTY; without even the implied warranty of
648 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
649 General Public License for more details.
650
651 You should have received a copy of the GNU General Public License along
652 with this program. If not, see <https://www.gnu.org/licenses/>.
653
654
655
perl v5.30.2 2020-04-09 GSCAN2PDF(1)