TEXT2IMAGE(1) TEXT2IMAGE(1)

NAME
text2image - generate OCR training pages.

SYNOPSIS
text2image --text FILE --outputbase PATH --fonts_dir PATH [OPTION]

DESCRIPTION
text2image(1) generates OCR training pages. Given a text file, it
outputs an image rendered with a given font and degradation.
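As an illustration of the synopsis, the sketch below assembles a minimal invocation from the three options shown there plus --font. The text file name, output basename, and font directory are hypothetical placeholders, not values taken from this page. The command is printed first and is only executed if text2image is found on PATH and the input file exists.

```shell
# Hypothetical inputs: adjust these for your own system.
text_file="training_text.txt"      # assumed UTF-8 text to render
output_base="out/eng.Arial.exp0"   # assumed basename for the output files
fonts_dir="/usr/share/fonts"       # assumed font location

# Build the command in the positional parameters so each argument
# stays a single word even if a path contains spaces.
set -- text2image \
    --text "$text_file" \
    --outputbase "$output_base" \
    --fonts_dir "$fonts_dir" \
    --font "Arial"

cmd_line="$*"
echo "$cmd_line"

# Run only when the tool and the input are actually available.
if command -v text2image >/dev/null 2>&1 && [ -f "$text_file" ]; then
    mkdir -p "$(dirname "$output_base")"
    "$@"
fi
```

Printing the command before running it makes it easy to check the option values when a run produces no output pages.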

OPTIONS
--text FILE
    File name of text input to use for creating synthetic training
    data. (type:string default:)

--outputbase FILE
    Basename for output image/box file (type:string default:)

--fontconfig_tmpdir PATH
    Overrides fontconfig default temporary dir (type:string
    default:/tmp)

--fonts_dir PATH
    If empty, the system default font location is used. Otherwise
    this path overrides the system default font location
    (type:string default:)

--font FONTNAME
    Font description name to use (type:string default:Arial)

--writing_mode MODE
    Specify one of the following writing modes:
      horizontal        Render regular horizontal text. (default)
      vertical          Render vertical text. Glyph orientation is
                        selected by Pango.
      vertical-upright  Render vertical text. Glyph orientation is
                        set to be upright.
    (type:string default:horizontal)

--tlog_level INT
    Minimum logging level for tlog() output (type:int default:0)

--max_pages INT
    Maximum number of pages to output (0=unlimited) (type:int
    default:0)

--degrade_image BOOL
    Degrade rendered image with speckle noise, dilation/erosion and
    rotation (type:bool default:true)

--rotate_image BOOL
    Rotate the image in a random way. (type:bool default:true)

--strip_unrenderable_words BOOL
    Remove unrenderable words from source text (type:bool default:true)

--ligatures BOOL
    Rebuild and render ligatures (type:bool default:false)

--exposure INT
    Exposure level in photocopier (type:int default:0)

--resolution INT
    Pixels per inch (type:int default:300)

--xsize INT
    Width of output image (type:int default:3600)

--ysize INT
    Height of output image (type:int default:4800)

--margin INT
    Margin around the edges of the image (type:int default:100)

--ptsize INT
    Size of printed text (type:int default:12)

--leading INT
    Inter-line space (in pixels) (type:int default:12)

--box_padding INT
    Padding around produced bounding boxes (type:int default:0)

--char_spacing DOUBLE
    Inter-character space in ems (type:double default:0)

--underline_start_prob DOUBLE
    Fraction of words to underline (value in [0,1]) (type:double
    default:0)

--underline_continuation_prob DOUBLE
    Probability that an underline, once started, continues onto the
    next word (value in [0,1]) (type:double default:0)

--render_ngrams BOOL
    Put each space-separated entity from the input file into one
    bounding box. The ngrams in the input file will be randomly
    permuted before rendering (so that there is sufficient variety of
    characters on each line). (type:bool default:false)

--output_word_boxes BOOL
    Output word bounding boxes instead of character boxes. This is used
    for Cube training, and implied by --render_ngrams. (type:bool
    default:false)

--unicharset_file FILE
    File with characters in the unicharset. If --render_ngrams is true
    and --unicharset_file is specified, ngrams with characters that are
    not in unicharset will be omitted (type:string default:)

--bidirectional_rotation BOOL
    Rotate the generated characters both ways. (type:bool
    default:false)

--only_extract_font_properties BOOL
    Assumes that the input file contains a list of ngrams. Renders each
    ngram, extracts spacing properties and records them in
    output_base/[font_name].fontinfo file. (type:bool default:false)
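To make the default page geometry concrete, the following sketch converts the --xsize, --ysize, --margin and --ptsize defaults into physical units at the default --resolution. It rests on assumptions this page does not state: that --xsize, --ysize and --margin are measured in pixels, that the margin applies on every edge, and that a point is 1/72 inch.

```shell
# Defaults as listed in the options above.
resolution=300   # --resolution, pixels per inch
xsize=3600       # --xsize (assumed to be pixels)
ysize=4800       # --ysize (assumed to be pixels)
margin=100       # --margin (assumed to be pixels, applied on each edge)
ptsize=12        # --ptsize, in points (assumed 72 points per inch)

# Physical page size in inches.
width_in=$((xsize / resolution))
height_in=$((ysize / resolution))

# Area left for text once the margin is removed from both sides.
text_width_px=$((xsize - 2 * margin))
text_height_px=$((ysize - 2 * margin))

# Nominal text height in pixels at this resolution.
ptsize_px=$((ptsize * resolution / 72))

echo "page: ${width_in}in x ${height_in}in"
echo "text area: ${text_width_px}px x ${text_height_px}px"
echo "${ptsize}pt text is about ${ptsize_px}px tall"
```

Under these assumptions the defaults describe a 12 by 16 inch page with a 3400 by 4600 pixel text area, and 12 point text is nominally 50 pixels tall.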

--output_individual_glyph_images BOOL
    If true also outputs individual character images (type:bool
    default:false)

--glyph_resized_size INT
    Each glyph is square with this side length in pixels (type:int
    default:0)

--glyph_num_border_pixels_to_pad INT
    Final_size = glyph_resized_size + 2 * glyph_num_border_pixels_to_pad
    (type:int default:0)
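The Final_size formula above can be checked with a short calculation. The two values below are hypothetical, chosen only for illustration (both options default to 0), and the sketch assumes the padding is added on every side of the square glyph.

```shell
# Hypothetical settings, not defaults.
glyph_resized_size=32              # --glyph_resized_size
glyph_num_border_pixels_to_pad=2   # --glyph_num_border_pixels_to_pad

# Final_size = glyph_resized_size + 2 * glyph_num_border_pixels_to_pad
final_size=$((glyph_resized_size + 2 * glyph_num_border_pixels_to_pad))

echo "each glyph image would be ${final_size} x ${final_size} pixels"
```

With these values, a 32 pixel glyph padded by 2 pixels per side gives a side length of 36 pixels.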

--find_fonts BOOL
    Search for all fonts that can render the text (type:bool
    default:false)

--render_per_font BOOL
    If find_fonts==true, render each font to its own image. Image
    filenames are of the form output_name.font_name.tif (type:bool
    default:true)

--min_coverage DOUBLE
    If find_fonts==true, the minimum coverage the font has of the
    characters in the text file to include it, between 0 and 1.
    (type:double default:1)

Example Usage:
```
# Keep the output lines mentioning "raw" and turn each into a quoted,
# backslash-terminated entry in fontslist.txt.
text2image --find_fonts \
  --fonts_dir /usr/share/fonts \
  --text ../langdata/hin/hin.training_text \
  --min_coverage .9 \
  --render_per_font \
  --outputbase ../langdata/hin/hin \
  |& grep raw | sed -e 's/ :.*/" \\/g' | sed -e 's/^/ "/' >../langdata/hin/fontslist.txt
```

--list_available_fonts BOOL
    List available fonts and quit. (type:bool default:false)
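A possible way to use --list_available_fonts before choosing a value for --font is sketched below. The font directory is a hypothetical placeholder, and the format of the listing is not described on this page, so the sketch only shows the first few lines. Standard error is merged into standard output on the assumption that the listing may be written there, as the use of |& in the example usage suggests.

```shell
fonts_dir="/usr/share/fonts"   # assumed font location

# Build the listing command in the positional parameters.
set -- text2image --list_available_fonts --fonts_dir "$fonts_dir"

list_cmd="$*"
echo "$list_cmd"

# Run only when text2image is installed; show the first few entries.
if command -v text2image >/dev/null 2>&1; then
    "$@" 2>&1 | head -n 5
fi
```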

HISTORY
text2image(1) was first made available for tesseract 3.03.

RESOURCES
Main web site: https://github.com/tesseract-ocr
Information on training tesseract LSTM:
https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html

SEE ALSO
tesseract(1)

COPYING
Copyright (C) 2012 Google, Inc. Licensed under the Apache License,
Version 2.0

AUTHOR
The Tesseract OCR engine was written by Ray Smith and his research
groups at Hewlett Packard (1985-1995) and Google (2006-present).

03/20/2023 TEXT2IMAGE(1)