lstmtraining(1)

LSTMTRAINING(1)                                                LSTMTRAINING(1)

NAME

       lstmtraining - Training program for LSTM-based networks.

SYNOPSIS

       lstmtraining --continue_from train_output_dir/continue_from_lang.lstm
       --old_traineddata bestdata_dir/continue_from_lang.traineddata
       --traineddata train_output_dir/lang/lang.traineddata --max_iterations
       NNN --debug_interval 0|-1 --train_listfile
       train_output_dir/lang.training_files.txt --model_output
       train_output_dir/newlstmmodel

DESCRIPTION

       lstmtraining(1) trains LSTM-based networks using a list of lstmf files
       and a starter traineddata file as the main input. Training from
       scratch is not recommended for users. Fine-tuning (see the example
       command in the synopsis above) or replacing a layer can be used
       instead. Different options apply to different types of training. Read
       the training documentation
       (https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html)
       for details.

OPTIONS

       '--debug_interval '
           How often to display the alignment. (type:int default:0)

       '--net_mode '
           Controls network behavior. (type:int default:192)

       '--perfect_sample_delay '
           How many imperfect samples between perfect ones. (type:int
           default:0)

       '--max_image_MB '
           Max memory to use for images. (type:int default:6000)

       '--append_index '
           Index in continue_from Network at which to attach the new network
           defined by net_spec (type:int default:-1)

       '--max_iterations '
           If set, exit after this many iterations. A negative value is
           interpreted as epochs; 0 means infinite iterations. (type:int
           default:0)

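           The three cases of --max_iterations can be sketched as a small
           shell helper. This is an illustration only, not the program's
           own logic: the helper name effective_iterations is hypothetical,
           and it assumes that one epoch is one pass over every training
           sample, which this page does not state.

```shell
#!/bin/sh
# Sketch of the documented meaning of --max_iterations.
# Assumption (not from this page): one epoch = one pass over all samples.
effective_iterations() {
    flag=$1      # value given to --max_iterations
    samples=$2   # hypothetical number of training samples
    if [ "$flag" -gt 0 ]; then
        # Positive: a plain iteration count.
        echo "$flag"
    elif [ "$flag" -lt 0 ]; then
        # Negative: treated as a number of epochs.
        echo $(( (0 - flag) * samples ))
    else
        # Zero: no limit on iterations.
        echo "unlimited"
    fi
}

effective_iterations 5000 1200   # prints 5000
effective_iterations -3 1200     # prints 3600
effective_iterations 0 1200      # prints unlimited
```
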
       '--target_error_rate '
           Final error rate in percent. (type:double default:0.01)

       '--weight_range '
           Range of initial random weights. (type:double default:0.1)

       '--learning_rate '
           Weight factor for new deltas. (type:double default:0.001)

       '--momentum '
           Decay factor for repeating deltas. (type:double default:0.5)

       '--adam_beta '
           Decay factor for repeating deltas. (type:double default:0.999)

       '--stop_training '
           Just convert the training model to a runtime model. (type:bool
           default:false)

       '--convert_to_int '
           Convert the recognition model to an integer model. (type:bool
           default:false)

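           A minimal sketch of converting a training checkpoint into an
           integer runtime model with --stop_training and --convert_to_int.
           All paths are hypothetical, and the checkpoint file name
           (the --model_output basename plus _checkpoint) is an assumption
           rather than something this page documents. The script only
           prints the command when lstmtraining or the checkpoint is
           missing.

```shell
#!/bin/sh
# Hypothetical paths; substitute your own.
CHECKPOINT=train_output_dir/newlstmmodel_checkpoint   # assumed checkpoint name
STARTER=train_output_dir/lang/lang.traineddata
FINAL=train_output_dir/lang_int.traineddata

# Flags come from the OPTIONS list. The paths contain no spaces, so
# plain word splitting of $ARGS is safe here.
ARGS="--stop_training --convert_to_int --continue_from $CHECKPOINT --traineddata $STARTER --model_output $FINAL"

if command -v lstmtraining >/dev/null 2>&1 && [ -f "$CHECKPOINT" ]; then
    lstmtraining $ARGS
else
    # Nothing to convert on this machine: show the command instead.
    echo "dry run: lstmtraining $ARGS"
fi
```

           Dropping --convert_to_int from ARGS would keep the same
           conversion step without the integer conversion.
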
       '--sequential_training '
           Use the training files sequentially instead of round-robin.
           (type:bool default:false)

       '--debug_network '
           Get info on distribution of weight values (type:bool default:false)

       '--randomly_rotate '
           Train OSD and randomly turn training samples upside-down (type:bool
           default:false)

       '--net_spec '
           Network specification (type:string default:)

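           The layer replacement mentioned in the description combines
           --append_index and --net_spec with --continue_from. The sketch
           below is illustrative only: the index 5, the specification
           string, the iteration count, and all paths are assumptions
           chosen for the example, and suitable values depend on the model
           being extended.

```shell
#!/bin/sh
# Illustrative values only; the right index and spec depend on the model.
APPEND_INDEX=5
NET_SPEC='[Lfx256 O1c111]'

# Hypothetical paths.
BASE_MODEL=train_output_dir/continue_from_lang.lstm
STARTER=train_output_dir/lang/lang.traineddata
TRAIN_LIST=train_output_dir/lang.training_files.txt
OUTPUT_BASE=train_output_dir/newlstmmodel

if command -v lstmtraining >/dev/null 2>&1 && [ -f "$BASE_MODEL" ]; then
    # NET_SPEC contains a space, so every argument is quoted.
    lstmtraining \
        --continue_from "$BASE_MODEL" \
        --traineddata "$STARTER" \
        --append_index "$APPEND_INDEX" \
        --net_spec "$NET_SPEC" \
        --train_listfile "$TRAIN_LIST" \
        --model_output "$OUTPUT_BASE" \
        --max_iterations 3000
else
    echo "dry run: attach $NET_SPEC to $BASE_MODEL at index $APPEND_INDEX"
fi
```
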
       '--continue_from '
           Existing model to extend (type:string default:)

       '--model_output '
           Basename for output models (type:string default:lstmtrain)

       '--train_listfile '
           File listing training files in lstmf training format. (type:string
           default:)

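           One way to produce the file passed to --train_listfile, assuming
           it is a plain text file with one lstmf path per line (an
           assumption; this page does not spell out the layout). The
           directory and file names are stand-ins created in a temporary
           location.

```shell
#!/bin/sh
# Temporary workspace with stand-in files; real .lstmf files would
# come from an earlier training-data generation step.
WORK=$(mktemp -d)
mkdir -p "$WORK/lstmf"
touch "$WORK/lstmf/page1.lstmf" "$WORK/lstmf/page2.lstmf" "$WORK/lstmf/notes.txt"

# Collect only the .lstmf files, one path per line, in a stable order.
LIST="$WORK/lang.training_files.txt"
find "$WORK/lstmf" -name '*.lstmf' | sort > "$LIST"

cat "$LIST"
```
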
       '--eval_listfile '
           File listing eval files in lstmf training format. (type:string
           default:)

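           A sketch of holding back part of the data for --eval_listfile.
           The 90/10 split, the file names, and the one-path-per-line
           layout are all assumptions made for the example, not
           recommendations from this page.

```shell
#!/bin/sh
WORK=$(mktemp -d)
ALL="$WORK/all_files.txt"

# Stand-in list of ten hypothetical lstmf paths.
i=1
while [ "$i" -le 10 ]; do
    echo "data/page$i.lstmf"
    i=$((i + 1))
done > "$ALL"

TRAIN_LIST="$WORK/lang.training_files.txt"
EVAL_LIST="$WORK/lang.eval_files.txt"

# Every tenth line goes to the eval list, the rest to the training list.
awk -v trainf="$TRAIN_LIST" -v evalf="$EVAL_LIST" \
    'NR % 10 == 0 { print > evalf; next } { print > trainf }' "$ALL"

wc -l "$TRAIN_LIST" "$EVAL_LIST"
```

           The two resulting files would then be passed to --train_listfile
           and --eval_listfile respectively.
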
       '--traineddata '
           Starter traineddata with combined Dawgs/Unicharset/Recoder for
           language model (type:string default:)

       '--old_traineddata '
           When changing the character set, this specifies the traineddata
           with the old character set that is to be replaced (type:string
           default:)

HISTORY

       lstmtraining(1) was first made available for tesseract 4.00.00alpha.

RESOURCES

       Main web site: https://github.com/tesseract-ocr
       Information on training tesseract LSTM:
       https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html

COPYING

       Copyright (C) 2012 Google, Inc. Licensed under the Apache License,
       Version 2.0

AUTHOR

       The Tesseract OCR engine was written by Ray Smith and his research
       groups at Hewlett Packard (1985-1995) and Google (2006-present).

                                  07/22/2023                   LSTMTRAINING(1)