LSTMTRAINING(1) LSTMTRAINING(1)

NAME
lstmtraining - Training program for LSTM-based networks.

SYNOPSIS
lstmtraining --continue_from train_output_dir/continue_from_lang.lstm
    --old_traineddata bestdata_dir/continue_from_lang.traineddata
    --traineddata train_output_dir/lang/lang.traineddata
    --max_iterations NNN
    --debug_interval 0|-1
    --train_listfile train_output_dir/lang.training_files.txt
    --model_output train_output_dir/newlstmmodel

DESCRIPTION
lstmtraining(1) trains LSTM-based networks using a list of lstmf files
and a starter traineddata file as the main input. Training from scratch
is not recommended for users. Fine-tuning (see the example command in
the synopsis above) or replacing a layer can be used instead. Different
options apply to different types of training. Read the training
documentation for details:
https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html
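
The list file named by --train_listfile is plain text with one lstmf path per line. A minimal sketch of building it, assuming all lstmf files sit under a single directory; the scratch directory and the file names page1.lstmf, page2.lstmf and notes.txt are placeholders, created here only so the snippet runs on its own:

```shell
# Placeholder layout: a scratch directory standing in for train_output_dir.
train_dir=$(mktemp -d)
touch "$train_dir/page1.lstmf" "$train_dir/page2.lstmf" "$train_dir/notes.txt"

# Collect every lstmf file, one path per line, in a stable sorted order.
list_file="$train_dir/lang.training_files.txt"
find "$train_dir" -type f -name '*.lstmf' | sort > "$list_file"

# Show the list that would be handed to --train_listfile.
cat "$list_file"
```

A second list built the same way from held-out files could serve as the --eval_listfile input.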

OPTIONS
'--debug_interval '
How often to display the alignment. (type:int default:0)

'--net_mode '
Controls network behavior. (type:int default:192)

'--perfect_sample_delay '
How many imperfect samples between perfect ones. (type:int
default:0)

'--max_image_MB '
Max memory to use for images. (type:int default:6000)

'--append_index '
Index in the continue_from network at which to attach the new network
defined by net_spec. (type:int default:-1)

'--max_iterations '
If set, exit after this many iterations. A negative value is
interpreted as a number of epochs; 0 means unlimited iterations.
(type:int default:0)

'--target_error_rate '
Final error rate in percent. (type:double default:0.01)

'--weight_range '
Range of initial random weights. (type:double default:0.1)

'--learning_rate '
Weight factor for new deltas. (type:double default:0.001)

'--momentum '
Decay factor for repeating deltas. (type:double default:0.5)

'--adam_beta '
Decay factor for repeating deltas squared. (type:double default:0.999)

'--stop_training '
Just convert the training model to a runtime model. (type:bool
default:false)

'--convert_to_int '
Convert the recognition model to an integer model. (type:bool
default:false)

'--sequential_training '
Use the training files sequentially instead of round-robin.
(type:bool default:false)

'--debug_network '
Get info on distribution of weight values. (type:bool default:false)

'--randomly_rotate '
Train OSD and randomly turn training samples upside-down. (type:bool
default:false)

'--net_spec '
Network specification. (type:string default:)

'--continue_from '
Existing model to extend. (type:string default:)

'--model_output '
Basename for output models. (type:string default:lstmtrain)

'--train_listfile '
File listing training files in lstmf training format. (type:string
default:)

'--eval_listfile '
File listing eval files in lstmf training format. (type:string
default:)

'--traineddata '
Starter traineddata with combined Dawgs/Unicharset/Recoder for the
language model. (type:string default:)

'--old_traineddata '
When changing the character set, this specifies the traineddata
with the old character set that is to be replaced. (type:string
default:)
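
The --stop_training and --convert_to_int options can be combined after training to turn a checkpoint into an integer runtime model. The sketch below only assembles and prints such a command rather than running it. All paths are placeholders, and the checkpoint name (the --model_output basename plus a _checkpoint suffix) is an assumption to verify against the files your training run actually wrote.

```shell
# Placeholder paths; substitute your own checkpoint and starter traineddata.
checkpoint=train_output_dir/newlstmmodel_checkpoint
starter=train_output_dir/lang/lang.traineddata
output=train_output_dir/lang_int.traineddata

# Collect the arguments first so the command can be inspected before use.
set -- lstmtraining \
  --stop_training \
  --convert_to_int \
  --continue_from "$checkpoint" \
  --traineddata "$starter" \
  --model_output "$output"

# Dry run: print the assembled command line instead of executing it.
cmd="$*"
echo "$cmd"
```

Once the printed command looks right, running "$@" in place of the echo would execute it. Dropping --convert_to_int from the argument list should leave the converted model in floating point.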

HISTORY
lstmtraining(1) was first made available for Tesseract 4.00.00alpha.

RESOURCES
Main web site: https://github.com/tesseract-ocr
Information on training Tesseract LSTM:
https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html

SEE ALSO
tesseract(1)

COPYING
Copyright (C) 2012 Google, Inc. Licensed under the Apache License,
Version 2.0.

AUTHOR
The Tesseract OCR engine was written by Ray Smith and his research
groups at Hewlett Packard (1985-1995) and Google (2006-present).

07/22/2023 LSTMTRAINING(1)