1GSCAN2PDF(1) User Contributed Perl Documentation GSCAN2PDF(1)
2
3
4
6 gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents
7
9 1. Scan one or several pages in with File/Scan
10 2. Create PDF of selected pages with File/Save
11
13 None
14
16 gscan2pdf has the following command-line options:
17
--device=<device> Specifies the device to use, instead of getting the
list of devices from the SANE API. This can be useful if the
scanner is on a remote computer which is not broadcasting its
existence.
22 --help Displays this help page and exits.
23 --log=<log file> Specifies a file to store logging messages.
24 --(debug|info|warn|error|fatal) Defines the log level. If a log file is
25 specified, this defaults to 'debug', otherwise 'warn'.
26 --import=<PDF|DjVu|image> Imports the specified file
27 --version Displays the program version and exits.
28
29 Scanning is handled with SANE via scanimage. PDF conversion is done by
30 PDF::API2. TIFF export is handled by libtiff (faster and smaller
31 memory footprint for multipage files).
32
34 To diagnose a possible error, start gscan2pdf from the command line
35 with logging enabled:
36
37 "gscan2pdf --log=file.log"
38
39 and check file.log.
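
When gscan2pdf is started from a wrapper script, the documented options can be assembled programmatically. The following is a minimal Python sketch, not part of gscan2pdf itself. The names build_command and default_log_level are hypothetical, and only options listed on this page are used.

```python
# Log levels accepted on the command line, as documented above.
LOG_LEVELS = ("debug", "info", "warn", "error", "fatal")


def default_log_level(log_file=None):
    """Return the documented default: 'debug' with a log file, else 'warn'."""
    return "debug" if log_file else "warn"


def build_command(log_file=None, level=None, device=None):
    """Build an argument list for gscan2pdf from documented options only."""
    argv = ["gscan2pdf"]
    if device:
        argv.append("--device=" + device)
    if log_file:
        argv.append("--log=" + log_file)
    if level is not None:
        if level not in LOG_LEVELS:
            raise ValueError("unknown log level: " + level)
        argv.append("--" + level)
    return argv
```

The resulting list can be handed to subprocess.run() from the Python standard library, after which file.log can be inspected as described above.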
40
42 None
43
45 gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
46 directory can be changed by setting the $XDG_CONFIG_HOME variable.
47 Generally, however, preferences should be changed via the
48 Edit/Preferences menu, or are captured automatically during normal
49 usage of the program.
50
52 None known.
53
Whilst it is possible to import PDFs, this feature is intended for
round-tripping files created by gscan2pdf.
57
59 gscan2pdf is available on Sourceforge
60 (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).
61
62 Debian-based
63 If you are using Debian, you should find that sid has the latest
64 version already packaged.
65
If you are using an Ubuntu-based system, you can automatically keep up
to date with the latest version via the PPA:
68
69 "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"
70
If you are using Synaptic, then use menu Edit/Reload Package
Information, search for gscan2pdf in the package list, and lo and
behold, you can install the nice shiny new version.
74
75 From the command line:
76
77 "sudo apt-get update"
78
79 "sudo apt-get install gscan2pdf"
80
81 RPMs
82 Download the rpm from Sourceforge, and then install it with "rpm -i
83 gscan2pdf-version.rpm"
84
85 From source
86 The source is hosted in the files section of the gscan2pdf project on
87 Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).
88
89 From the repository
90 gscan2pdf uses Git for its Revision Control System. You can browse the
91 tree at <https://sourceforge.net/p/gscan2pdf/code/>.
92
93 Git users can clone the complete tree with "git clone
94 git://git.code.sf.net/p/gscan2pdf/code"
95
Having downloaded the source either from a Sourceforge file release, or
from the Git repository, unpack it if necessary with "tar xvfz
gscan2pdf-x.x.x.tar.gz" and change into the new directory with "cd
gscan2pdf-x.x.x".
100
"perl Makefile.PL" will create the Makefile.
102
103 "make test" should run several hundred tests to confirm that things
104 will work properly on your system.
105
106 You can install directly from the source with "make install", but
107 building the appropriate package for your distribution should be as
108 straightforward as "make debdist" or "make rpmdist". However, you will
109 additionally need the rpm, devscripts, fakeroot, debhelper and gettext
110 packages.
111
The list below looks daunting, but all packages are available from any
reasonably up-to-date distribution. If you are using Synaptic, having
installed gscan2pdf, locate the gscan2pdf entry in Synaptic,
right-click it and you can install them under Recommends. Note also that
the library names given below are the Debian/Ubuntu ones. Those
distributions using RPM typically use perl(module) where Debian has
libmodule-perl.
120
121 Required
122 libgtk3-perl >= 0.028
There is a bug in versions of libgtk3-perl before 0.028 that
causes gscan2pdf to crash when saving. Whilst I could prevent
gscan2pdf from crashing, it would still be impossible to save
anything, rendering gscan2pdf rather useless.
127
128 libgtk3-simplelist-perl
129 A simple interface to Gtk3's complex MVC list widget
130
131 liblocale-gettext-perl (>= 1.05)
132 Using libc functions for internationalisation in Perl
133
134 libpdf-api2-perl
135 provides the functions for creating PDF documents in Perl
136
137 libsane
138 API library for scanners
139
140 libimage-sane-perl
141 Perl bindings for libsane.
142
143 libset-intspan-perl
144 manages sets of integers
145
146 libtiff-tools
147 TIFF manipulation and conversion tools
148
149 Imagemagick
150 Image manipulation programs
151
152 perlmagick
153 A perl interface to the libMagick graphics routines
154
155 sane-utils
156 API library for scanners -- utilities.
157
158 Optional
159 sane
160 scanner graphical frontends. Only required for the scanadf
161 frontend.
162
163 unpaper
164 post-processing tool for scanned pages. See
165 <https://www.flameeyes.eu/projects/unpaper>.
166
167 xdg-utils
168 Desktop integration utilities from freedesktop.org. Required
169 for Email as PDF. See
170 <https://www.freedesktop.org/wiki/Software/xdg-utils/>
171
172 djvulibre-bin
173 Utilities for the DjVu image format. See
174 <http://djvu.sourceforge.net/>
175
176 gocr
177 A command line OCR. See <http://jocr.sourceforge.net/>.
178
179 tesseract
180 A command line OCR. See
181 <https://github.com/tesseract-ocr/tesseract>
182
183 ocropus
184 A command line OCR. See <http://code.google.com/p/ocropus/>
185
186 cuneiform
187 A command line OCR. See <http://launchpad.net/cuneiform-linux>
188
190 There are two mailing lists for gscan2pdf:
191
192 gscan2pdf-announce
193 A low-traffic list for announcements, mostly of new releases. You
194 can subscribe at
195 <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>
196
197 gscan2pdf-help
General support, questions, etc. You can subscribe at
<https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>
200
202 Before reporting bugs, please read the "FAQs" section.
203
204 Please report any bugs found, preferably against the Debian
205 package[1][2]. You do not need to be a Debian user, or set up an
206 account to do this. The Debian tool "reportbug" provides a convenient
207 GUI for doing so.
208
209 1. https://packages.debian.org/sid/gscan2pdf
210 2. https://www.debian.org/Bugs/
211
212 Alternatively, there is a bug tracker for the gscan2pdf project on
213 Sourceforge
214 (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).
215
216 Please include the log file created by "gscan2pdf --log=log" with any
217 new bug report.
218
220 gscan2pdf has already been partly translated into several languages.
221 If you would like to contribute to an existing or new translation,
222 please check out Rosetta:
223 <https://translations.launchpad.net/gscan2pdf>
224
Note that the translations for the scanner options are taken directly
from sane-backends. If you would like to contribute to these, you can
do so by contacting the sane-devel mailing list
(sane-devel@lists.alioth.debian.org) and having a look at the po/
directory in the source code <http://www.sane-project.org/cvs.html>.
230
231 Alternatively, Ubuntu has its own translation project. For the 9.04
232 release, the translations are available at
233 <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>
234
236 File
237 New
238
239 Clears the page list.
240
241 Open
242
243 Opens any format that imagemagick supports. PDFs will have their
244 embedded images extracted and imported one per page.
245
246 Note that files can also be imported by dragging them into the
247 thumbnail list from a program like nautilus or konqueror.
248
249 Scan
250
251 Sets options before scanning via SANE.
252
253 Device
254
255 Chooses between available scanners.
256
257 # Pages
258
259 Selects the number of pages, or all pages to scan.
260
261 Source document
262
Selects between single-sided or double-sided pages.

This affects the page numbering. Single-sided scans are numbered
consecutively. Double-sided scans are incremented (or decremented, see
below) by 2, i.e. 1, 3, 5, etc.
268
269 Side to scan
270
271 If double sided is selected above, assuming a non-duplex scanner, i.e.
272 a scanner that cannot automatically scan both sides of a page, this
273 determines whether the page number is incremented or decremented by 2.
274
275 To scan both sides of three pages, i.e. 6 sides:
276
277 1. Select:
278 # Pages = 3 (or "all" if your scanner can detect when it is out of
279 paper)
280
281 Double sided
282
283 Facing side
284
285 2. Scans sides 1, 3 & 5.
286 3. Put pile back with scanner ready to scan back of last page.
287 4. Select:
288 # Pages = 3 (or "all" if your scanner can detect when it is out of
289 paper)
290
291 Double sided
292
293 Reverse side
294
295 5. Scans sides 6, 4 & 2.
296 6. gscan2pdf automatically sorts the pages so that they appear in the
297 correct order.
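
The numbering in the procedure above can be illustrated with a short Python sketch. It only models the page numbers described here and is not gscan2pdf's own code. The function name side_numbers and the strings "facing" and "reverse" are hypothetical.

```python
def side_numbers(pages, side):
    """Return the page numbers assigned to one pass over a pile of sheets.

    "facing" counts up in steps of 2 from 1; "reverse" counts down in
    steps of 2 from the last side, because the pile is scanned back to front.
    """
    if side == "facing":
        return list(range(1, 2 * pages, 2))
    if side == "reverse":
        return list(range(2 * pages, 0, -2))
    raise ValueError("side must be 'facing' or 'reverse'")


# Two passes over three sheets, then sorting by page number.
facing = side_numbers(3, "facing")
reverse = side_numbers(3, "reverse")
document_order = sorted(facing + reverse)
```

Sorting the combined numbers gives the six sides in reading order, which corresponds to step 6 above.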
298
299 Device-dependent options
300
301 These, naturally, depend on your scanner. They can include
302
303 Page size.
304 Mode (colour/black & white/greyscale)
305 Resolution (in PPI)
306 Batch-scan
307 Guarantees that a "no documents" condition will be returned after
308 the last scanned page, to prevent endless flatbed scans after a
309 batch scan.
310
311 Wait-for-button/Button-wait
312 After sending the scan command, wait until the button on the
313 scanner is pressed before actually starting the scan process.
314
315 Source
316 Selects the document source. Possible options can include Flatbed
317 or ADF. On some scanners, this is the only way of generating an
318 out-of-documents signal.
319
320 Save
321
322 Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
323 GIF.
324
325 PDF Metadata
326
Metadata is information that is not visible when viewing the PDF, but
is embedded in the file, and so is searchable and can be examined,
typically with the "Properties" option of the PDF viewer.

The metadata is completely optional, but can also be used to generate
the filename; see Preferences for details.
333
334 DjVu
335
DjVu compresses both black and white and colour images better than
PDF. See <http://www.djvuzone.org/> for more details.
338
339 Email as PDF
340
341 Attaches the selected or all pages as a PDF to a blank email. This
342 requires xdg-email, which is in the xdg-utils package. If this is not
343 present, the option is ghosted out.
344
345 Print
346
347 Prints the selected or all pages.
348
349 Compress temporary files
350
If your temporary ($TMPDIR) directory is getting full, this function
can be useful: it compresses all images as LZW-compressed TIFFs. These
require much less space than the PNM files that are typically produced
by SANE or by importing a PDF.
355
356 Edit
357 Delete
358
359 Deletes the selected page.
360
361 Renumber
362
363 Renumbers the pages from 1..n.
364
365 Note that the page order can also be changed by drag and drop in the
366 thumbnail view.
367
368 Select
369
The select menus can be used to select all, even, odd, blank, dark or
modified pages. Selecting blank or dark pages runs imagemagick to make
the decision. Selecting modified pages selects those which have been
modified by threshold, unsharp, etc., since the last OCR run was made.
374
375 Properties
376
377 When an image is scanned, gscan2pdf attempts to extract the resolution
378 from the scan options. This nearly always works without problem.
379
Importing an image can be trickier, however. Some image formats such as
PNM do not encode metadata for resolution. In other cases, the data is
incorrect. Edit/Properties allows the user to manually correct the
metadata for a particular page, thus correcting the size of the final
PDF or DjVu. The image itself is otherwise not changed - it is not down-
or upscaled.
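
To see why the resolution metadata determines the size of the final page, consider the following Python sketch. It assumes the usual PDF convention of 72 points per inch. The function name page_size_points is hypothetical, and gscan2pdf's own calculation may differ in detail.

```python
POINTS_PER_INCH = 72.0  # PDF default user space unit is 1/72 inch


def page_size_points(width_px, height_px, ppi):
    """Convert pixel dimensions and a resolution (PPI) to a size in points."""
    if ppi <= 0:
        raise ValueError("resolution must be positive")
    width_pt = width_px / ppi * POINTS_PER_INCH
    height_pt = height_px / ppi * POINTS_PER_INCH
    return (width_pt, height_pt)
```

A 2480 x 3508 pixel image tagged as 300 PPI comes out at roughly A4 size, whereas the same pixels tagged as 150 PPI give a page twice as wide and twice as tall. The pixel data is identical in both cases.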
386
387 Preferences
388
389 The preferences menu item allows the control of the default behaviour
390 of various functions. Most of these are self-explanatory.
391
392 Frontends
393
394 gscan2pdf initially supported two frontends, scanimage and scanadf.
395 scanadf support was added when it was realised that scanadf works
396 better than scanimage with some scanners. On Debian-based systems,
397 scanadf is in the sane package, not, like scanimage, in sane-utils. If
398 scanadf is not present, the option is obviously ghosted out.
399
400 In 0.9.27, Perl bindings for SANE were introduced. These are called
401 libsane-perl.
402
403 Before 1.2.0, options available through CLI frontends like scanimage
404 were made visible as users asked for them. In 1.2.0, all options can be
405 shown or hidden via Edit/Preferences, along with the ability to specify
406 which options trigger a reload.
407
In 1.8.3, new Perl bindings for SANE were introduced. These are called
libimage-sane-perl and are the preferred frontend.
410
411 In 1.8.5, support for libsane-perl was removed.
412
413 Device blacklist
414
415 Ignore listed devices.
416
417 Note that this is a device name regular expression, e.g. /dev/video,
418 and not the name as listed in the scan window, e.g. Noname
419 Integrated_Webcam_HD.
420
421 Default filename for PDF or DjVu files
422
423 All strftime codes (e.g. %Y for the current year) are available as
424 variables, with the following additions:
425
426 %Da author
427 %De filename extension
428 %Dt title
429
430 All document date codes use strftime codes with a leading D, e.g.:
431
432 %DY document year
433 %Dm document month
434 %Dd document day
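
As an illustration of how such a template might be expanded, here is a minimal Python sketch. It is not gscan2pdf's implementation. The function name expand_filename and its parameters are hypothetical, and the rule that a leading D always takes precedence over the plain strftime code %D is an assumption made for this example.

```python
import re
from datetime import date, datetime

# A percent sign, an optional leading D, then a single code character.
_CODE = re.compile(r"%(D?)(.)")


def expand_filename(template, author, title, extension, doc_date, now=None):
    """Expand strftime codes plus the %Da, %De, %Dt and %D<code> additions."""
    now = now or datetime.now()
    extras = {"a": author, "t": title, "e": extension}

    def substitute(match):
        is_document_code, code = match.group(1), match.group(2)
        if is_document_code:
            if code in extras:
                return extras[code]
            # Any other code after D is formatted from the document date.
            return doc_date.strftime("%" + code)
        # Plain strftime codes refer to the current date and time.
        return now.strftime("%" + code)

    return _CODE.sub(substitute, template)
```

For example, the template "%Da-%Dt-%DY-%Dm-%Dd.%De" with author "Jane", title "Invoice", extension "pdf" and a document date of 21 May 2019 expands to "Jane-Invoice-2019-05-21.pdf".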
435
436 View
437 Zoom 100%
438
439 Zooms to 1:1. How this appears depends on the desktop resolution.
440
441 Zoom to fit
442
Scales the view so that the whole page is visible.
444
445 Zoom in
446
447 Zoom out
448
449 Rotate 90° clockwise
450
451 The rotate options require the package imagemagick and, if this is not
452 present, are ghosted out.
453
454 Rotate 180°
455
456 Rotate 90° anticlockwise
457
458 Tools
459 Threshold
460
461 Changes all pixels darker than the given value to black; all others
462 become white.
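
The rule can be sketched in a few lines of Python. This example assumes 8-bit greyscale values, where 0 is black and 255 is white, and a threshold given on the same scale. Both are assumptions for illustration, and the function name threshold is hypothetical.

```python
BLACK, WHITE = 0, 255


def threshold(pixels, value):
    """Map pixels darker than value to black and all others to white.

    pixels is a list of rows, each row a list of grey values (0..255).
    """
    return [[BLACK if p < value else WHITE for p in row] for row in pixels]
```

With a value of 128, a pixel of 127 becomes black, while pixels of 128 or more become white.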
463
464 Unsharp mask
465
466 The unsharp option sharpens an image. The image is convolved with a
467 Gaussian operator of the given radius and standard deviation (sigma).
468 For reasonable results, radius should be larger than sigma. Use a
469 radius of 0 to have the method select a suitable radius.
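
The idea behind unsharp masking can be shown on a one-dimensional signal. The following Python sketch blurs the signal with a normalised Gaussian kernel and then adds back the difference between the original and the blurred copy. The amount parameter, the edge handling and the function names are assumptions made for this illustration. The sketch does not reproduce the automatic radius selection mentioned above, and it is not the code used by gscan2pdf or imagemagick.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return normalised Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, radius, sigma):
    """Convolve signal with the Gaussian kernel, clamping at the edges."""
    kernel = gaussian_kernel(radius, sigma)
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * signal[j]
        out.append(acc)
    return out


def unsharp(signal, radius, sigma, amount=1.0):
    """Sharpen by adding amount * (original - blurred) to the original."""
    blurred = blur(signal, radius, sigma)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]
```

Flat regions are left unchanged because the blurred copy equals the original there. At an edge, the dark side is pushed darker and the light side lighter, which is what makes the edge look sharper.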
470
471 Crop
472
473 unpaper
474
475 unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
476 for cleaning up a scan.
477
478 OCR (Optical Character Recognition)
479
480 The gocr, tesseract, ocropus or cuneiform utilities are used to produce
481 text from an image.
482
Each page has an OCR output buffer, which is embedded as plain
text behind the scanned image in the PDF produced. This way, Beagle can
index (i.e. search) the plain text.
486
487 In DjVu files, the OCR output buffer is embedded in the hidden text
488 layer. Thus these can also be indexed by Beagle.
489
490 There is an interesting review of OCR software at
491 <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
492 An important conclusion was that 400ppi is necessary for decent
493 results.
494
495 Up to v2.04, the only way to tell which languages were available to
496 tesseract was to look for the language files. Therefore, gscan2pdf
497 checks the path returned by:
498
499 tesseract '' '' -l ''
500
501 If there are no language files in the above location, then gscan2pdf
502 assumes that tesseract v1.0 is installed, which had no language files.
503
504 Variables for user-defined tools
505
506 The following variables are available:
507
508 %i input filename
509 %o output filename
510 %r resolution
511
512 An image can be modified in-place by just specifying %i.
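
The substitution of these variables can be pictured with a small Python sketch. It is an illustration only. The function name expand_tool_command and the tool name mytool are hypothetical, and no shell quoting is attempted.

```python
import re


def expand_tool_command(template, input_file, output_file, resolution):
    """Replace %i, %o and %r in a user-defined tool command template."""
    values = {"i": input_file, "o": output_file, "r": str(resolution)}
    # Single pass, so substituted text is never scanned again.
    return re.sub(r"%([ior])", lambda m: values[m.group(1)], template)
```

For example, the template "mytool --dpi %r %i %o" with input in.pnm, output out.pnm and a resolution of 300 becomes "mytool --dpi 300 in.pnm out.pnm". A template that mentions only %i describes a tool that modifies the image in place.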
513
515 Why isn't option xyz available in the scan window?
516 Possibly because SANE or your scanner doesn't support it.
517
518 If an option listed in the output of "scanimage --help" that you would
519 like to use isn't available, send me the output and I will look at
520 implementing it.
521
522 I've only got an old flatbed scanner with no automatic sheetfeeder. How do
523 I scan a multipage document?
524 In Edit/Preferences, tick the box "Allow batch scanning from flatbed".
525
526 Some Brother scanners report "out of documents", despite scanning from
527 flatbed. This can be worked around by ticking the box "Force new scan
528 job between pages".
529
530 If you are lucky, you have an option like Wait-for-button or Button-
531 wait, where the scanner will wait for you to press the scan button on
532 the device before it starts the scan, allowing you to scan multiple
533 pages without touching the computer.
534
535 If you are quick, you might be able to change the document on the
536 flatbed whilst the scan head is returning.
537
538 Otherwise, you have to set the number of pages to scan to 1 and hit the
539 scan button on the scan window for each page.
540
541 Why is option xyz ghosted out?
Probably because the package required for that option is not installed.
Email as PDF requires xdg-email (xdg-utils); unpaper and the rotate
options require imagemagick.
545
546 Why can I not scan from the flatbed of my HP scanner?
547 Generally for HP scanners with an ADF, to scan from the flatbed, you
548 should set "# Pages" to "1", and possibly "Batch scan" to "No".
549
550 When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
551 of changes never displayed?
552 As far as I can tell, this is pulled from changelogs.ubuntu.com, and
553 therefore only the changelogs from official Ubuntu builds are
554 displayed.
555
556 Why can gscan2pdf not find my scanner?
557 If your scanner is not connected directly to the machine on which you
558 are running gscan2pdf and you have not installed the SANE daemon,
559 saned, gscan2pdf cannot automatically find it. In this case, you can
560 specify the scanner device on the command line:
561
"gscan2pdf --device <device>"
563
564 How can I search for text in the OCR layer of the finished PDF or DJVU
565 file?
566 pdftotext or djvutxt can extract the text layer from PDF or DJVU files.
567 See the respective man pages for details.
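
A script that searches the extracted text might look like the following Python sketch. It assumes that pdftotext writes to standard output when the output file is given as "-", and that djvutxt prints to standard output when no output file is named. The function names are hypothetical.

```python
import os


def text_extract_command(path):
    """Return an argument list that prints the text layer of path."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".pdf":
        return ["pdftotext", path, "-"]
    if extension == ".djvu":
        return ["djvutxt", path]
    raise ValueError("expected a .pdf or .djvu file: " + path)


def matching_lines(text, needle):
    """Return the lines of text that contain needle, ignoring case."""
    needle = needle.lower()
    return [line for line in text.splitlines() if needle in line.lower()]
```

The command can be run with subprocess.run(..., capture_output=True, text=True) from the Python standard library, and the captured output passed to matching_lines().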
568
569 Having opened a PDF or DJVU file in evince or Acrobat Reader, the
570 search function will typically find the page with the requested text
571 and highlight it.
572
573 There are various tools for searching or indexing files, including PDF
574 and DJVU:
575
576 · (meta) Tracker (<https://projects.gnome.org/tracker/>)
577
578 · plone (<http://plone.org/>)
579
· pdfgrep (<http://pdfgrep.sourceforge.net/>)
581
582 · swish-e (<http://www.swish-e.org/>)
583
584 · recoll (<http://www.lesbonscomptes.com/recoll/>)
585
· terrier (<http://terrier.org/>)
587
588 How can I change the colour of the selection box in the image viewer?
589 Create a file called "~/.config/gtk-3.0/gtk.css" with the following
590 content:
591
592 .rubberband,
593 rubberband,
594 flowbox rubberband,
595 treeview.view rubberband,
596 .content-view rubberband,
597 .content-view .rubberband {
598 border: 1px solid #2a76c6;
599 background-color: rgba(42, 118, 198, 0.2); }
600
How can I change the colour of the OCR output?
602 Create a file called "~/.config/gtk-3.0/gtk.css" with the following
603 content:
604
605 #gscan2pdf-ocr-output {
606 color: black; }
607
609 XSane (<http://xsane.org/>)
610
611 Scan Tailor (<http://scantailor.org/>)
612
614 Jeffrey Ratcliffe (jffry at posteo dot net)
615
617 · all the people who have sent patches, translations, bugs and
618 feedback.
619
620 · the gtk+ project for a most excellent graphics toolkit.
621
622 · the Gtk3-Perl project for their superb Perl bindings for GTK3.
623
624 · The SANE project for scanner access
625
626 · Björn Lindqvist for the gtkimageview widget
627
628 · Sourceforge for hosting the project.
629
631 Copyright (C) 2006--2019 Jeffrey Ratcliffe <jffry@posteo.net>
632
633 This program is free software: you can redistribute it and/or modify it
634 under the terms of the version 3 GNU General Public License as
635 published by the Free Software Foundation.
636
637 This program is distributed in the hope that it will be useful, but
638 WITHOUT ANY WARRANTY; without even the implied warranty of
639 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
640 General Public License for more details.
641
642 You should have received a copy of the GNU General Public License along
643 with this program. If not, see <https://www.gnu.org/licenses/>.
644
645
646
647perl v5.28.2 2019-05-21 GSCAN2PDF(1)