GOCR(1)                          User's Manual                         GOCR(1)

NAME

       gocr - command line text recognition tool

SYNOPSIS

       gocr [OPTION] [-i] pnm-file

DESCRIPTION

       gocr is an optical character recognition program that can be used from
       the command line. It takes input in PNM, PGM, PBM, PPM, or PCX format
       and writes the recognized text to stdout. If pnm-file is a single dash,
       PNM data is read from stdin. If gzip, bzip2 and netpbm-progs are
       installed and your system supports popen(3), then pnm.gz, pnm.bz2, png,
       jpg, jpeg, tiff, gif, bmp, ps (single pages only) and eps are also
       supported as input files (but not as an input stream), where pnm can
       be replaced by any of ppm, pgm and pbm.
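
       As a rough illustration of the stdin mode described above, the
       following Python sketch pipes PNM data to gocr and returns the text
       it writes to stdout. The helper names gocr_command and recognize_pnm
       are hypothetical, the sketch assumes a gocr executable on PATH, and
       it relies only on the -f and -i options documented in this manual.

```python
import subprocess


def gocr_command(executable="gocr", output_format="UTF8"):
    """Build an argument list that makes gocr read PNM data from stdin."""
    # "-i -" selects stdin as input; "-f" selects the output format.
    return [executable, "-f", output_format, "-i", "-"]


def recognize_pnm(pnm_data, executable="gocr"):
    """Run gocr on in-memory PNM bytes and return what it writes to stdout."""
    result = subprocess.run(
        gocr_command(executable),
        input=pnm_data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    # UTF8 output was requested above, so decode accordingly.
    return result.stdout.decode("utf-8", errors="replace")
```

       Because compressed and non-PNM formats are supported only as input
       files, not as an input stream, data passed this way has to be PNM
       already.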

OPTIONS

       -h     show usage information

       -V     show version information

       -i file
              read input from file (or stdin if file is a single dash)

       -o file
              send output to file instead of stdout

       -e file
              send errors to file instead of stderr, or to stdout if file is
              a dash

       -x file
              write progress output to file (file can be a file name, a fifo
              name or a file descriptor 1...255); this is useful for GUI
              developers who want to show the OCR progress; the file
              descriptor argument is only available if gocr was compiled with
              __USE_POSIX defined

       -p path
              database path, which must end with a slash (default: ./db/);
              this path will be populated with images of learned characters

       -f format
              output format of the recognized text (one of ISO8859_1, TeX,
              HTML, XML, UTF8, ASCII); XML output also includes position and
              probability data

       -l level
              set grey level to level (0<160<=255, default: 0 for autodetect);
              darker pixels belong to characters, brighter pixels are
              interpreted as background of the input image

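       The effect of a fixed grey level can be pictured with a small Python
       sketch. This is not gocr's implementation: the function name binarize
       is hypothetical, and treating "darker" as strictly below level is an
       assumption, since the manual does not say how a pixel equal to level
       is classified.

```python
def binarize(pixels, level):
    """Mark each grey value as character (True) or background (False)."""
    # Assumption: a pixel is "darker" when its value is strictly below level.
    return [[value < level for value in row] for row in pixels]


image = [
    [0, 200, 255],
    [159, 160, 30],
]
mask = binarize(image, 160)
```

       Here the values 0, 159 and 30 are classified as character pixels and
       the remaining ones as background.
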
       -d size
              set dust size in pixels (clusters smaller than this are
              removed); 0 means no clusters are removed; the default is -1
              for autodetection

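       Dust removal can be sketched as dropping small connected clusters
       from a binary mask. The Python below only illustrates the idea under
       stated assumptions: remove_dust is a hypothetical name, cluster size
       is taken to be the pixel count, 4-connectivity is assumed, and the
       autodetection selected by -1 is not modelled.

```python
def remove_dust(mask, size):
    """Return a copy of mask with clusters of fewer than size pixels cleared."""
    height = len(mask)
    width = len(mask[0]) if mask else 0
    result = [row[:] for row in mask]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood fill to collect one 4-connected cluster (an assumption).
            cluster, stack = [], [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                cluster.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # With size 0 this test never succeeds, so nothing is removed.
            if len(cluster) < size:
                for cy, cx in cluster:
                    result[cy][cx] = False
    return result


T, F = True, False
page = [
    [T, T, F, F],
    [T, T, F, T],
    [F, F, F, F],
]
cleaned = remove_dust(page, 2)
```

       With a dust size of 2 the isolated pixel on the right is cleared,
       while the four-pixel block is kept.
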
       -s num set space width between words in units of dots (default: 0 for
              autodetect); wider gaps are interpreted as word spaces,
              narrower ones as character spaces

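       The space width rule can be illustrated by joining glyphs according
       to the gaps measured between them. In this Python sketch join_glyphs
       is a hypothetical helper, gaps are given in dots, and a gap exactly
       equal to the space width is assumed to count as a character space;
       the manual does not specify that case.

```python
def join_glyphs(glyphs, gaps, space_width):
    """Join glyphs, inserting a space wherever a gap exceeds space_width."""
    if not glyphs:
        return ""
    if len(gaps) != len(glyphs) - 1:
        raise ValueError("need exactly one gap between each pair of glyphs")
    parts = [glyphs[0]]
    for glyph, gap in zip(glyphs[1:], gaps):
        # Assumption: only gaps strictly wider than space_width split words.
        if gap > space_width:
            parts.append(" ")
        parts.append(glyph)
    return "".join(parts)


line = join_glyphs(list("gocrman"), [2, 3, 2, 9, 2, 3], 5)
```

       Only the 9-dot gap exceeds the space width of 5, so the glyphs are
       split into two words at that position.
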
       -v verbosity
              be verbose to stderr; verbosity is a bitfield

       -c string
              restrict verbose output on stderr to the characters in string;
              more output is generated for all characters within the string;
              the underscore stands for unknown chars; this option is useful
              for limiting debug information to what is necessary

       -C string
              only recognise characters from string; this is a filter
              function for cases where only part of the character alphabet
              is of interest, you