GSCAN2PDF(1) User Contributed Perl Documentation GSCAN2PDF(1)
2
3
4
6 gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents
7
9 1. Scan one or several pages in with File/Scan
10 2. Create PDF of selected pages with File/Save
11
13 None
14
16 gscan2pdf has the following command-line options:
17
18 --device=device
Specifies the device to use, instead of getting the list of devices
from the SANE API. This can be useful if the scanner is on a
remote computer which is not broadcasting its existence.
22
23 --help
24 Displays this help page and exits.
25
26 --log=log-file
27 Specifies a file to store logging messages.
28
29 --debug, --info, --warn, --error, --fatal
30 Defines the log level. If a log file is specified, this defaults
31 to --debug, otherwise --error.
32
33 --import=PDF|DjVu|images
34 Imports the specified file(s). If the document has more than one
35 page, a window is displayed to select the required pages.
36
--import-all=PDF|DjVu|images
Imports all pages of the specified file(s).

39 --version
40 Displays the program version and exits.
41
42 Scanning is handled with SANE via scanimage. PDF conversion is done by
43 PDF::Builder. TIFF export is handled by libtiff (faster and smaller
44 memory footprint for multipage files).
45
47 To diagnose a possible error, start gscan2pdf from the command line
48 with logging enabled:
49
50 "gscan2pdf --log=file.log"
51
52 and check file.log.
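
Checking the log can be sketched in shell as follows. The helper name
show_log_problems is hypothetical, and the assumption that problem lines
contain the words WARN, ERROR or FATAL is a guess based on the log level
names listed above, not a documented log format; adjust the pattern to
whatever file.log actually contains.

```shell
# Hypothetical helper: print the lines of a gscan2pdf log that mention a
# problem, prefixed with their line numbers.
# Assumption: severities appear in the log as WARN, ERROR or FATAL.
show_log_problems() {
    grep -n -E 'WARN|ERROR|FATAL' "$1"
}

# Typical use, run by hand because the first command starts the GUI:
#   gscan2pdf --log=file.log
#   show_log_problems file.log
```

If nothing is printed, the full log is still worth reading, since the
default level with a log file is --debug.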
53
55 None
56
58 gscan2pdf creates a text resource file in ~/.config/gscan2pdfrc. The
59 directory can be changed by setting the $XDG_CONFIG_HOME variable.
60 Generally, however, preferences should be changed via the
61 Edit/Preferences menu, or are captured automatically during normal
62 usage of the program.
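
As a minimal sketch, the location of the resource file can be worked out
in shell like this. The function name gscan2pdf_rc_path is hypothetical,
and the fallback to ~/.config when $XDG_CONFIG_HOME is unset or empty is
assumed from the XDG Base Directory convention rather than from
gscan2pdf itself.

```shell
# Print the path where the resource file is expected.
# Assumption: an unset or empty XDG_CONFIG_HOME means ~/.config.
gscan2pdf_rc_path() {
    printf '%s/gscan2pdfrc\n' "${XDG_CONFIG_HOME:-$HOME/.config}"
}

# Example, to look at the file without editing it:
#   less "$(gscan2pdf_rc_path)"
```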
63
65 None known.
66
Whilst it is possible to import PDFs, the import is intended for
round-tripping files created by gscan2pdf.
70
72 gscan2pdf is available on Sourceforge
73 (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).
74
75 Debian-based
76 If you are using Debian, you should find that sid
77 <https://www.debian.org/releases/sid/> has the latest version already
78 packaged.
79
If you are using an Ubuntu-based system, you can automatically keep up
to date with the latest version via the ppa:
82
83 "sudo apt-add-repository ppa:jeffreyratcliffe/ppa"
84
If you are using Synaptic, then use menu Edit/Reload Package
Information, search for gscan2pdf in the package list, and lo and
behold, you can install the nice shiny new version.
88
89 From the command line:
90
91 "sudo apt update"
92
93 "sudo apt install gscan2pdf"
94
95 From source
96 The source is hosted in the files section of the gscan2pdf project on
97 Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).
98
99 From the repository
100 gscan2pdf uses Git for its Revision Control System. You can browse the
101 tree at <https://sourceforge.net/p/gscan2pdf/code/>.
102
103 Git users can clone the complete tree with "git clone
104 git://git.code.sf.net/p/gscan2pdf/code"
105
Having downloaded the source either from a Sourceforge file release, or
from the Git repository, unpack it if necessary with "tar xvfz
gscan2pdf-x.x.x.tar.gz" and change into the new directory with "cd
gscan2pdf-x.x.x".
110
"perl Makefile.PL" will create the Makefile.
112
113 "make test" should run several hundred tests to confirm that things
114 will work properly on your system.
115
116 You can install directly from the source with "make install", but
117 building the appropriate package for your distribution should be as
118 straightforward as "make debdist" or "make rpmdist". However, you will
119 additionally need the rpm, devscripts, fakeroot, debhelper and gettext
120 packages.
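
The unpack, configure and test steps can be strung together as in the
sketch below. The function names run and build_gscan2pdf and the
DRY_RUN variable are illustrative only; the version number passed in
stands for the x.x.x of the release that was downloaded.

```shell
# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_gscan2pdf VERSION
# Unpacks the tarball in the current directory, creates the Makefile and
# runs the test suite. Stops at the first step that fails.
build_gscan2pdf() {
    version=$1
    run tar xvfz "gscan2pdf-$version.tar.gz" &&
    run cd "gscan2pdf-$version" &&
    run perl Makefile.PL &&
    run make test
}
```

Installation is deliberately left out of the sketch: "make install",
"make debdist" or "make rpmdist" would follow once the tests pass.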
121
The list below looks daunting, but all packages are available from any
reasonably up-to-date distribution. If you are using Synaptic, having
125 installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-
126 click it and you can install them under Recommends. Note also that the
127 library names given below are the Debian/Ubuntu ones. Those
128 distributions using RPM typically use perl(module) where Debian has
129 libmodule-perl.
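
The naming pattern can be illustrated with a small shell sketch. The
function names are hypothetical, and the rule used for the Debian name
(lower-case the module name and replace "::" with "-") is the usual
convention, assumed here; individual packages may deviate from it.

```shell
# Debian/Ubuntu-style package name for a Perl module, e.g. PDF::Builder.
deb_name_for_module() {
    printf 'lib%s-perl\n' \
        "$(printf '%s' "$1" | tr 'A-Z' 'a-z' | sed 's/::/-/g')"
}

# RPM-style capability name for the same module.
rpm_name_for_module() {
    printf 'perl(%s)\n' "$1"
}
```

For example, "deb_name_for_module Image::Sane" gives the
libimage-sane-perl entry in the list below, and "rpm_name_for_module
Image::Sane" gives perl(Image::Sane).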
130
131 Required
132 libgtk3-perl >= 0.028
There is a bug in versions of libgtk3-perl before 0.028 that
134 causes gscan2pdf to crash when saving. Whilst I could prevent
135 gscan2pdf from crashing, it would still be impossible to save
136 anything, rendering gscan2pdf rather useless.
137
138 libgtk3-simplelist-perl
139 A simple interface to Gtk3's complex MVC list widget
140
141 liblocale-gettext-perl (>= 1.05)
142 Using libc functions for internationalisation in Perl
143
144 libpdf-builder-perl
145 provides the functions for creating PDF documents in Perl
146
147 libsane
148 API library for scanners
149
150 libimage-sane-perl
151 Perl bindings for libsane.
152
153 libset-intspan-perl
154 manages sets of integers
155
156 libtiff-tools
157 TIFF manipulation and conversion tools
158
imagemagick
160 Image manipulation programs
161
162 perlmagick
163 A perl interface to the libMagick graphics routines
164
165 sane-utils
166 API library for scanners -- utilities.
167
168 Optional
169 sane
170 scanner graphical frontends. Only required for the scanadf
171 frontend.
172
173 unpaper
174 post-processing tool for scanned pages. See
175 <https://www.flameeyes.eu/projects/unpaper>.
176
177 xdg-utils
178 Desktop integration utilities from freedesktop.org. Required
179 for Email as PDF. See
180 <https://www.freedesktop.org/wiki/Software/xdg-utils/>
181
182 djvulibre-bin
183 Utilities for the DjVu image format. See
184 <http://djvu.sourceforge.net/>
185
186 gocr
187 A command line OCR. See <http://jocr.sourceforge.net/>.
188
189 tesseract
190 A command line OCR. See
191 <https://github.com/tesseract-ocr/tesseract>
192
193 ocropus
194 A command line OCR. See <http://code.google.com/p/ocropus/>
195
196 cuneiform
197 A command line OCR. See <http://launchpad.net/cuneiform-linux>
198
200 There are two mailing lists for gscan2pdf:
201
202 gscan2pdf-announce
203 A low-traffic list for announcements, mostly of new releases. You
204 can subscribe at
205 <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>
206
207 gscan2pdf-help
General support, questions, etc. You can subscribe at
209 <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>
210
212 Before reporting bugs, please read the "FAQs" section.
213
214 Please report any bugs found, preferably against the Debian
215 package[1][2]. You do not need to be a Debian user, or set up an
216 account to do this. The Debian tool "reportbug" provides a convenient
217 GUI for doing so.
218
219 1. https://packages.debian.org/sid/gscan2pdf
220 2. https://www.debian.org/Bugs/
221
222 Alternatively, there is a bug tracker for the gscan2pdf project on
223 Sourceforge
224 (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).
225
226 Please include the log file created by "gscan2pdf --log=log" with any
227 new bug report.
228
230 gscan2pdf has already been partly translated into several languages.
231 If you would like to contribute to an existing or new translation,
232 please check out Rosetta:
233 <https://translations.launchpad.net/gscan2pdf>
234
Note that the translations for the scanner options are taken directly
from sane-backends. If you would like to contribute to these, you can
do so by contacting the sane-devel mailing list
(sane-devel@lists.alioth.debian.org) and having a look at the po/
directory in the source code <http://www.sane-project.org/cvs.html>.
240
241 Alternatively, Ubuntu has its own translation project. For the 9.04
242 release, the translations are available at
243 <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>
244
246 File
247 New
248
249 Clears the page list.
250
251 Open
252
253 Opens any format that imagemagick supports. PDFs will have their
254 embedded images extracted and imported one per page.
255
256 Note that files can also be imported by dragging them into the
257 thumbnail list from a program like nautilus or konqueror.
258
259 Scan
260
261 Sets options before scanning via SANE.
262
263 Device
264
265 Chooses between available scanners.
266
267 # Pages
268
Selects the number of pages to scan, or all pages.
270
271 Source document
272
Selects between single-sided or double-sided pages.
274
This affects the page numbering. Single-sided scans are numbered
consecutively. Page numbers of double-sided scans are incremented (or
decremented, see below) by 2, i.e. 1, 3, 5, etc.
278
279 Side to scan
280
281 If double sided is selected above, assuming a non-duplex scanner, i.e.
282 a scanner that cannot automatically scan both sides of a page, this
283 determines whether the page number is incremented or decremented by 2.
284
285 To scan both sides of three pages, i.e. 6 sides:
286
287 1. Select:
288 # Pages = 3 (or "all" if your scanner can detect when it is out of
289 paper)
290
291 Double sided
292
293 Facing side
294
295 2. Scans sides 1, 3 & 5.
296 3. Put pile back with scanner ready to scan back of last page.
297 4. Select:
298 # Pages = 3 (or "all" if your scanner can detect when it is out of
299 paper)
300
301 Double sided
302
303 Reverse side
304
305 5. Scans sides 6, 4 & 2.
306 6. gscan2pdf automatically sorts the pages so that they appear in the
307 correct order.
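
The numbering in the steps above can be reproduced with a short shell
sketch. The function name duplex_page_numbers and the side labels
"facing" and "reverse" are illustrative; the sketch only shows the
arithmetic and is not how gscan2pdf itself is implemented.

```shell
# duplex_page_numbers SHEETS SIDE
# Prints the page numbers given to one pass over a pile of SHEETS sheets.
#   SIDE=facing   counts up from 1 in steps of 2
#   SIDE=reverse  counts down from 2 * SHEETS in steps of 2
duplex_page_numbers() {
    sheets=$1
    side=$2
    i=0
    out=""
    while [ "$i" -lt "$sheets" ]; do
        if [ "$side" = "facing" ]; then
            page=$((1 + 2 * i))
        else
            page=$((2 * sheets - 2 * i))
        fi
        out="$out $page"
        i=$((i + 1))
    done
    printf '%s\n' "${out# }"
}
```

With three sheets, the facing pass yields 1 3 5 and the reverse pass
yields 6 4 2; sorting the two passes together numerically gives pages 1
to 6 in reading order.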
308
309 Device-dependent options
310
311 These, naturally, depend on your scanner. They can include
312
313 Page size.
314 Mode (colour/black & white/greyscale)
315 Resolution (in PPI)
316 Batch-scan
317 Guarantees that a "no documents" condition will be returned after
318 the last scanned page, to prevent endless flatbed scans after a
319 batch scan.
320
321 Wait-for-button/Button-wait
322 After sending the scan command, wait until the button on the
323 scanner is pressed before actually starting the scan process.
324
325 Source
326 Selects the document source. Possible options can include Flatbed
327 or ADF. On some scanners, this is the only way of generating an
328 out-of-documents signal.
329
330 Save
331
332 Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG, PNM or
333 GIF.
334
335 Metadata
336
Metadata are items of information that are not visible when viewing the
PDF/DjVu, but are embedded in the file, and so are searchable and can be
examined, typically with the "Properties" option of the document
viewer.
341
The metadata are completely optional, but can also be used to generate
the filename; see Preferences for details.
344
345 The date can be selected with use of the calendar widget. The displayed
346 date can be incremented or decremented with use of the '+' and '-'
347 keys.
348
349 DjVu
350
DjVu typically compresses both black and white, and colour images better
than PDF. See <http://www.djvuzone.org/> for more details.
353
354 Email as PDF
355
356 Attaches the selected or all pages as a PDF to a blank email. This
357 requires xdg-email, which is in the xdg-utils package. If this is not
358 present, the option is ghosted out.
359
360 Print
361
362 Prints the selected or all pages.
363
364 Compress temporary files
365
If your temporary ($TMPDIR) directory is getting full, this function
can be useful: it compresses all images as LZW-compressed TIFFs. These
require much less space than the PNM files that are typically produced
by SANE or by importing a PDF.
370
371 Edit
372 Delete
373
374 Deletes the selected page.
375
376 Renumber
377
378 Renumbers the pages from 1..n.
379
380 Note that the page order can also be changed by drag and drop in the
381 thumbnail view.
382
383 Select
384
The select menus can be used to select all, even, odd, blank, dark or
modified pages. Selecting blank or dark pages runs imagemagick to make
the decision. Selecting modified pages selects those which have been
modified by threshold, unsharp, etc., since the last OCR run was made.
389
390 Properties
391
392 When an image is scanned, gscan2pdf attempts to extract the resolution
393 from the scan options. This nearly always works without problem.
394
395 Importing an image can be trickier, however. Some image formats such as
396 PNM do not encode metadata for resolution. In other cases, the data is
incorrect. Edit/Properties allows the user to manually correct the
metadata for a particular page, thus correcting the size of the final
PDF or DjVu. The image itself is otherwise not changed; it is neither
down- nor upscaled.
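
The relationship between pixel count, resolution and page size is simple
arithmetic, sketched here in shell. The function name size_mm is
hypothetical; the only facts used are that resolution is given in pixels
per inch and that one inch is 25.4 mm.

```shell
# size_mm PIXELS PPI
# Physical length in millimetres of PIXELS pixels at PPI pixels per inch.
size_mm() {
    LC_ALL=C awk -v px="$1" -v ppi="$2" \
        'BEGIN { printf "%.1f\n", px * 25.4 / ppi }'
}
```

An image 2480 pixels wide is about 210.0 mm wide (A4 width) at 300 PPI,
but would come out at roughly twice that width if the resolution were
wrongly recorded as 150 PPI, which is the kind of error Edit/Properties
is meant to correct.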
401
402 Preferences
403
404 The preferences menu item allows the control of the default behaviour
405 of various functions. Most of these are self-explanatory.
406
407 Frontends
408
409 gscan2pdf initially supported two frontends, scanimage and scanadf.
410 scanadf support was added when it was realised that scanadf works
411 better than scanimage with some scanners. On Debian-based systems,
412 scanadf is in the sane package, not, like scanimage, in sane-utils. If
413 scanadf is not present, the option is obviously ghosted out.
414
415 In 0.9.27, Perl bindings for SANE were introduced. These are called
416 libsane-perl.
417
418 Before 1.2.0, options available through CLI frontends like scanimage
419 were made visible as users asked for them. In 1.2.0, all options can be
420 shown or hidden via Edit/Preferences, along with the ability to specify
421 which options trigger a reload.
422
In 1.8.3, new Perl bindings for SANE were introduced. These are called
424 libimage-sane-perl and are the preferred frontend.
425
426 In 1.8.5, support for libsane-perl was removed.
427
428 Device blacklist
429
430 Ignore listed devices.
431
432 Note that this is a device name regular expression, e.g. /dev/video,
433 and not the name as listed in the scan window, e.g. Noname
434 Integrated_Webcam_HD.
435
436 Default filename for PDF or DjVu files
437
438 All strftime codes (e.g. %Y for the current year) are available as
439 variables, with the following additions:
440
441 %Da author
442
443 %De filename extension
444
445 %Dt title
446
447 All document date codes use strftime codes with a leading D, e.g.:
448
449 %DY document year
450
451 %Dm document month
452
453 %Dd document day
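
To make the template idea concrete, here is a sketch in shell that
expands only the document codes listed above. The function names, the
argument order and the YYYY-MM-DD date argument are all assumptions made
for the illustration; gscan2pdf performs the real expansion internally,
and ordinary strftime codes such as %Y are left untouched by this
sketch.

```shell
# Escape characters that are special in the replacement part of sed's
# s command when "," is used as the delimiter.
sed_escape() {
    printf '%s' "$1" | sed 's/[,&\\]/\\&/g'
}

# expand_doc_codes TEMPLATE AUTHOR TITLE EXTENSION DATE
# DATE is the document date written as YYYY-MM-DD.
expand_doc_codes() {
    doc_date=$5
    year=${doc_date%%-*}
    rest=${doc_date#*-}
    month=${rest%%-*}
    day=${rest#*-}
    printf '%s\n' "$1" | sed \
        -e "s,%DY,$year,g" \
        -e "s,%Dm,$month,g" \
        -e "s,%Dd,$day,g" \
        -e "s,%Da,$(sed_escape "$2"),g" \
        -e "s,%Dt,$(sed_escape "$3"),g" \
        -e "s,%De,$(sed_escape "$4"),g"
}
```

For instance, the template "%DY-%Dm-%Dd %Dt.%De" with title Invoice,
extension pdf and document date 2021-09-20 expands to "2021-09-20
Invoice.pdf".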
454
455 View
456 Zoom 100%
457
458 Zooms to 1:1. How this appears depends on the desktop resolution.
459
460 Zoom to fit
461
Scales the view so that the whole page is visible.
463
464 Zoom in
465
466 Zoom out
467
468 Rotate 90° clockwise
469
470 The rotate options require the package imagemagick and, if this is not
471 present, are ghosted out.
472
473 Rotate 180°
474
475 Rotate 90° anticlockwise
476
477 Tools
478 Threshold
479
480 Changes all pixels darker than the given value to black; all others
481 become white.
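
The rule can be shown on a handful of grey values with a shell sketch.
The function name threshold_values and the 0 to 255 grey scale (0 being
black) are assumptions for the illustration; the scale that the
Threshold dialog itself uses is not specified here.

```shell
# threshold_values LIMIT VALUE...
# Prints 0 (black) for every value darker than LIMIT and 255 (white)
# for all others.
threshold_values() {
    limit=$1
    shift
    out=""
    for value in "$@"; do
        if [ "$value" -lt "$limit" ]; then
            out="$out 0"
        else
            out="$out 255"
        fi
    done
    printf '%s\n' "${out# }"
}
```

Note that a value exactly equal to the limit is not darker than it, and
so becomes white.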
482
483 Unsharp mask
484
485 The unsharp option sharpens an image. The image is convolved with a
486 Gaussian operator of the given radius and standard deviation (sigma).
487 For reasonable results, radius should be larger than sigma. Use a
488 radius of 0 to have the method select a suitable radius.
489
490 Crop
491
492 unpaper
493
494 unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
495 for cleaning up a scan.
496
497 OCR (Optical Character Recognition)
498
499 The gocr, tesseract, ocropus or cuneiform utilities are used to produce
500 text from an image.
501
There is an OCR output buffer for each page, which is embedded as plain
text behind the scanned image in the PDF produced. This way, Beagle can
index (i.e. search) the plain text.
505
506 In DjVu files, the OCR output buffer is embedded in the hidden text
507 layer. Thus these can also be indexed by Beagle.
508
509 There is an interesting review of OCR software at
510 <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
511 An important conclusion was that 400ppi is necessary for decent
512 results.
513
514 Up to v2.04, the only way to tell which languages were available to
515 tesseract was to look for the language files. Therefore, gscan2pdf
516 checks the path returned by:
517
518 "tesseract '' '' -l ''"
519
520 If there are no language files in the above location, then gscan2pdf
521 assumes that tesseract v1.0 is installed, which had no language files.
522
523 Variables for user-defined tools
524
525 The following variables are available:
526
527 %i input filename
528
529 %o output filename
530
531 %r resolution
532
533 An image can be modified in-place by just specifying %i.
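
How such a command template turns into a concrete command line can be
sketched in shell. The function names and argument order are
hypothetical, and the sketch does no quoting of filenames, so it is only
suitable for names without spaces; gscan2pdf does the real substitution
itself.

```shell
# Escape characters that are special in the replacement part of sed's
# s command when "," is used as the delimiter.
tool_escape() {
    printf '%s' "$1" | sed 's/[,&\\]/\\&/g'
}

# expand_tool_command TEMPLATE INPUT OUTPUT RESOLUTION
expand_tool_command() {
    printf '%s\n' "$1" | sed \
        -e "s,%i,$(tool_escape "$2"),g" \
        -e "s,%o,$(tool_escape "$3"),g" \
        -e "s,%r,$(tool_escape "$4"),g"
}
```

The template "convert %i -negate %o" with in.png and out.png becomes
"convert in.png -negate out.png", while a template that mentions only
%i, such as "mogrify -density %r %i", describes an in-place tool.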
534
536 Why isn't option xyz available in the scan window?
537 Possibly because SANE or your scanner doesn't support it.
538
539 If an option listed in the output of "scanimage --help" that you would
540 like to use isn't available, send me the output and I will look at
541 implementing it.
542
543 I've only got an old flatbed scanner with no automatic sheetfeeder. How do
544 I scan a multipage document?
545 In Edit/Preferences, tick the box "Allow batch scanning from flatbed".
546
547 Some Brother scanners report "out of documents", despite scanning from
548 flatbed. This can be worked around by ticking the box "Force new scan
549 job between pages".
550
551 If you are lucky, you have an option like Wait-for-button or Button-
552 wait, where the scanner will wait for you to press the scan button on
553 the device before it starts the scan, allowing you to scan multiple
554 pages without touching the computer.
555
556 If you are quick, you might be able to change the document on the
557 flatbed whilst the scan head is returning.
558
559 Otherwise, you have to set the number of pages to scan to 1 and hit the
560 scan button on the scan window for each page.
561
562 Why is option xyz ghosted out?
563 Probably because the package required for that option is not installed.
Email as PDF requires xdg-email (xdg-utils); unpaper and the rotate
options require imagemagick.
566
567 Why can I not scan from the flatbed of my HP scanner?
568 Generally for HP scanners with an ADF, to scan from the flatbed, you
569 should set "# Pages" to "1", and possibly "Batch scan" to "No".
570
571 When I update gscan2pdf using the Update Manager in Ubuntu, why is the list
572 of changes never displayed?
573 As far as I can tell, this is pulled from changelogs.ubuntu.com, and
574 therefore only the changelogs from official Ubuntu builds are
575 displayed.
576
577 Why can gscan2pdf not find my scanner?
578 If your scanner is not connected directly to the machine on which you
579 are running gscan2pdf and you have not installed the SANE daemon,
580 saned, gscan2pdf cannot automatically find it. In this case, you can
581 specify the scanner device on the command line:
582
"gscan2pdf --device <device>"
584
How can I search for text in the OCR layer of the finished PDF or DjVu
file?
pdftotext or djvutxt can extract the text layer from PDF or DjVu files.
588 See the respective man pages for details.
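
A small wrapper along these lines can search either kind of file from
the command line. The function names are hypothetical, and the sketch
assumes that pdftotext and djvutxt are installed and that the file
extension reliably identifies the format.

```shell
# Print the name of the extractor suited to FILE, judged by extension.
ocr_extractor_for() {
    case "$1" in
        *.pdf|*.PDF)          echo pdftotext ;;
        *.djvu|*.DJVU|*.djv)  echo djvutxt ;;
        *)                    return 1 ;;
    esac
}

# search_ocr_text FILE PATTERN
# Prints matching lines of the text layer, ignoring case.
search_ocr_text() {
    case "$(ocr_extractor_for "$1")" in
        pdftotext) pdftotext "$1" - | grep -n -i -- "$2" ;;
        djvutxt)   djvutxt "$1" | grep -n -i -- "$2" ;;
        *)         echo "unsupported file type: $1" >&2; return 1 ;;
    esac
}
```

For example, "search_ocr_text scan.pdf invoice" prints every line of the
OCR text in scan.pdf that contains "invoice".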
589
Having opened a PDF or DjVu file in evince or Acrobat Reader, the
591 search function will typically find the page with the requested text
592 and highlight it.
593
There are various tools for searching or indexing files, including PDF
and DjVu:
596
597 • (meta) Tracker (<https://projects.gnome.org/tracker/>)
598
599 • plone (<http://plone.org/>)
600
• pdfgrep (<http://pdfgrep.sourceforge.net/>)
602
603 • swish-e (<http://www.swish-e.org/>)
604
605 • recoll (<http://www.lesbonscomptes.com/recoll/>)
606
• terrier (<http://terrier.org/>)
608
609 How can I change the colour of the selection box in the image viewer?
610 Create a file called "~/.config/gtk-3.0/gtk.css" with the following
611 content:
612
613 .rubberband,
614 rubberband,
615 flowbox rubberband,
616 treeview.view rubberband,
617 .content-view rubberband,
618 .content-view .rubberband {
619 border: 1px solid #2a76c6;
620 background-color: rgba(42, 118, 198, 0.2); }
621
How can I change the colour of the OCR output?
623 Create a file called "~/.config/gtk-3.0/gtk.css" with the following
624 content:
625
626 #gscan2pdf-ocr-output {
627 color: black;
628 }
629
631 XSane (<http://xsane.org/>)
632
633 Scan Tailor (<http://scantailor.org/>)
634
636 Jeffrey Ratcliffe (jffry at posteo dot net)
637
639 • all the people who have sent patches, translations, bugs and
640 feedback.
641
642 • the gtk+ project for a most excellent graphics toolkit.
643
644 • the Gtk3-Perl project for their superb Perl bindings for GTK3.
645
• the SANE project for scanner access.
647
648 • Björn Lindqvist for the gtkimageview widget
649
650 • Sourceforge for hosting the project.
651
653 Copyright (C) 2006--2021 Jeffrey Ratcliffe <jffry@posteo.net>
654
655 This program is free software: you can redistribute it and/or modify it
656 under the terms of the version 3 GNU General Public License as
657 published by the Free Software Foundation.
658
659 This program is distributed in the hope that it will be useful, but
660 WITHOUT ANY WARRANTY; without even the implied warranty of
661 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
662 General Public License for more details.
663
664 You should have received a copy of the GNU General Public License along
665 with this program. If not, see <https://www.gnu.org/licenses/>.
666
667
668
perl v5.34.0 2021-09-20 GSCAN2PDF(1)