classifier_tester(1)

CLASSIFIER_TESTER(1)                                      CLASSIFIER_TESTER(1)

NAME

       classifier_tester - test a character classifier for the *legacy
       tesseract* engine.

SYNOPSIS

       classifier_tester -U unicharset_file -F font_properties_file -X
       xheights_file -classifier x -lang lang [-output_trainer trainer] *.tr

DESCRIPTION

       classifier_tester(1) runs Tesseract in a special mode. It takes a list
       of .tr files and tests a character classifier on them. The data must
       be in the training format, but it doesn’t have to be the data the
       classifier was trained on.
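
       The invocation in the SYNOPSIS can be assembled programmatically. The
       following is a minimal Python sketch, not part of Tesseract: the
       helper name and the file names are hypothetical, and the flag
       spellings are copied from the SYNOPSIS. It only builds and prints the
       argument list; it does not run the tool.

```python
import shlex
from typing import List, Optional

# Classifier names listed under OPTIONS in this page.
VALID_CLASSIFIERS = ("pruner", "full")


def build_classifier_tester_command(
    unicharset: str,
    font_properties: str,
    xheights: str,
    classifier: str,
    tr_files: List[str],
    lang: str = "eng",
    output_trainer: Optional[str] = None,
) -> List[str]:
    """Return an argv list that mirrors the SYNOPSIS (hypothetical helper)."""
    if classifier not in VALID_CLASSIFIERS:
        raise ValueError(f"classifier must be one of {VALID_CLASSIFIERS}")
    if not tr_files:
        raise ValueError("at least one .tr file is required")

    cmd = [
        "classifier_tester",
        "-U", unicharset,
        "-F", font_properties,
        "-X", xheights,
        "-classifier", classifier,
        "-lang", lang,
    ]
    # -output_trainer is optional, so it is added only when requested.
    if output_trainer is not None:
        cmd += ["-output_trainer", output_trainer]
    # The .tr files come last, as in the SYNOPSIS.
    cmd += list(tr_files)
    return cmd


if __name__ == "__main__":
    # All file names below are placeholders for illustration.
    example = build_classifier_tester_command(
        "eng.unicharset", "font_properties", "eng.xheights",
        "full", ["page1.tr", "page2.tr"],
    )
    print(shlex.join(example))
```

       The returned list can be passed to subprocess.run() once the
       placeholder file names are replaced with real ones.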

IN/OUT ARGUMENTS

       a list of .tr files

OPTIONS

       -lang lang
           (Input) three-character language code; default value eng.
       -classifier x
           (Input) One of "pruner", "full".

       -U unicharset
           (Input) The unicharset for the language.
       -F font_properties_file
           (Input) Font properties file. Each line has the following form,
           where each field other than the font name is 0 or 1:

               *font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur*
       -X xheights_file
           (Input) x-heights file. Each line has the following form, where
           xheight is the x-height in pixels of a character drawn at 32pt
           at 300 dpi. [ That is, if base x-height + ascenders +
           descenders = 133, how much is the x-height? ]

               *font_name* *xheight*
       -output_trainer trainer
           (Output, Optional) Filename for output trainer.
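
       The line formats described for -F and -X can be checked with a small
       script. The following Python sketch is an illustration only, not
       Tesseract's own reader: the function names and the font name
       ExampleFont_Bold are hypothetical, and it assumes whitespace-separated
       fields, a font name without spaces, and an integer xheight. The last
       helper shows where the figure 133 comes from, using the standard
       definition of 72 points per inch.

```python
from typing import NamedTuple, Tuple


class FontProperties(NamedTuple):
    name: str
    italic: bool
    bold: bool
    fixed_pitch: bool
    serif: bool
    fraktur: bool


def parse_font_properties_line(line: str) -> FontProperties:
    """Parse one '-F' line: font_name italic bold fixed_pitch serif fraktur."""
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    name, *flags = parts
    # Every field other than the font name must be 0 or 1.
    if any(flag not in ("0", "1") for flag in flags):
        raise ValueError(f"flags must be 0 or 1: {line!r}")
    return FontProperties(name, *(flag == "1" for flag in flags))


def parse_xheights_line(line: str) -> Tuple[str, int]:
    """Parse one '-X' line: font_name xheight (integer pixels is assumed)."""
    parts = line.split()
    if len(parts) != 2:
        raise ValueError(f"expected 2 fields, got {len(parts)}: {line!r}")
    return parts[0], int(parts[1])


def em_size_in_pixels(point_size: float = 32.0, dpi: float = 300.0) -> float:
    """Full glyph height in pixels: points / 72 gives inches, times dpi."""
    return point_size / 72.0 * dpi


if __name__ == "__main__":
    print(parse_font_properties_line("ExampleFont_Bold 0 1 0 0 0"))
    print(parse_xheights_line("ExampleFont_Bold 67"))
    print(round(em_size_in_pixels()))  # 32pt at 300 dpi is about 133 pixels
```

       The xheight value in the example line is a placeholder; the real
       value depends on the font and is some fraction of the roughly 133
       pixel total.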

COPYING

       Copyright (C) 2012 Google, Inc. Licensed under the Apache License,
       Version 2.0

AUTHOR

       The Tesseract OCR engine was written by Ray Smith and his research
       groups at Hewlett Packard (1985-1995) and Google (2006-present).

                                  07/22/2023              CLASSIFIER_TESTER(1)