TEXT2IMAGE(1)                                                    TEXT2IMAGE(1)

NAME

       text2image - generate OCR training pages.

SYNOPSIS

       text2image --text FILE --outputbase PATH --fonts_dir PATH [OPTION]

DESCRIPTION

       text2image(1) generates OCR training pages. Given a text file, it
       renders the text in the specified font and writes the result as an
       image, optionally degraded with noise and rotation.

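       As a minimal sketch of the invocation in the SYNOPSIS, the Python
       helper below assembles an argument list from the three required
       flags plus an optional --font. The function name, the paths and
       the font name are hypothetical, chosen only for illustration, and
       the helper builds the command without running it.

```python
def build_text2image_command(text_file, output_base, fonts_dir, font=None):
    """Return an argv list for a basic text2image run (nothing is executed)."""
    cmd = [
        "text2image",
        "--text", text_file,
        "--outputbase", output_base,
        "--fonts_dir", fonts_dir,
    ]
    # --font is optional; the man page lists Arial as its default.
    if font is not None:
        cmd += ["--font", font]
    return cmd


# Hypothetical paths and font name, used only as an example.
cmd = build_text2image_command(
    "eng.training_text", "out/eng.Arial.exp0", "/usr/share/fonts", font="Arial"
)
print(" ".join(cmd))
```

       The resulting list can be handed to subprocess.run(cmd, check=True)
       on a machine where text2image is installed.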

OPTIONS

       --text FILE
           File name of the text input to use for creating synthetic
           training data. (type:string default:)

       --outputbase FILE
           Basename for output image/box file (type:string default:)

       --fontconfig_tmpdir PATH
           Overrides the fontconfig default temporary directory (type:string
           default:/tmp)

       --fonts_dir PATH
           If empty, the system default font location is used; otherwise
           this value overrides it (type:string default:)

       --font FONTNAME
           Font description name to use (type:string default:Arial)

       --writing_mode MODE
           Specify one of the following writing modes (type:string
           default:horizontal):

           horizontal
               Render regular horizontal text. (default)

           vertical
               Render vertical text. Glyph orientation is selected by Pango.

           vertical-upright
               Render vertical text. Glyph orientation is set to be upright.

       --tlog_level INT
           Minimum logging level for tlog() output (type:int default:0)

       --max_pages INT
           Maximum number of pages to output (0=unlimited) (type:int
           default:0)

       --degrade_image BOOL
           Degrade rendered image with speckle noise, dilation/erosion and
           rotation (type:bool default:true)

       --rotate_image BOOL
           Rotate the image in a random way. (type:bool default:true)

       --strip_unrenderable_words BOOL
           Remove unrenderable words from source text (type:bool default:true)

       --ligatures BOOL
           Rebuild and render ligatures (type:bool default:false)

       --exposure INT
           Exposure level in photocopier (type:int default:0)

       --resolution INT
           Pixels per inch (type:int default:300)

       --xsize INT
           Width of output image (type:int default:3600)

       --ysize INT
           Height of output image (type:int default:4800)

       --margin INT
           Margin around the edges of the image (type:int default:100)

       --ptsize INT
           Point size of printed text (type:int default:12)

       --leading INT
           Inter-line space (in pixels) (type:int default:12)

       --box_padding INT
           Padding around produced bounding boxes (type:int default:0)

       --char_spacing DOUBLE
           Inter-character space in ems (type:double default:0)

       --underline_start_prob DOUBLE
           Probability of starting an underline at a word (value in [0,1])
           (type:double default:0)

       --underline_continuation_prob DOUBLE
           Probability of continuing a started underline onto the next word
           (value in [0,1]) (type:double default:0)

       --render_ngrams BOOL
           Put each space-separated entity from the input file into one
           bounding box. The ngrams in the input file will be randomly
           permuted before rendering (so that there is sufficient variety of
           characters on each line). (type:bool default:false)

       --output_word_boxes BOOL
           Output word bounding boxes instead of character boxes. This is used
           for Cube training, and implied by --render_ngrams. (type:bool
           default:false)

       --unicharset_file FILE
           File with characters in the unicharset. If --render_ngrams is true
           and --unicharset_file is specified, ngrams with characters that are
           not in the unicharset will be omitted (type:string default:)

       --bidirectional_rotation BOOL
           Rotate the generated characters both ways. (type:bool
           default:false)

       --only_extract_font_properties BOOL
           Assumes that the input file contains a list of ngrams. Renders each
           ngram, extracts spacing properties and records them in the
           output_base/[font_name].fontinfo file. (type:bool default:false)

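       As a worked example of how --resolution, --xsize, --ysize, --margin
       and --ptsize relate, the sketch below derives the physical page
       size and the usable area from the default values listed above. It
       assumes that xsize, ysize and margin are measured in pixels, that
       the margin applies on all four sides, and that one point is 1/72
       inch. The function name and result keys are hypothetical. With the
       defaults the page comes out at 12 by 16 inches.

```python
def page_geometry(resolution=300, xsize=3600, ysize=4800, margin=100, ptsize=12):
    """Derive page size and usable area from text2image flag values.

    Assumptions (not stated by the man page): xsize, ysize and margin are
    in pixels, and the margin is applied on every edge of the page.
    """
    return {
        "width_in": xsize / resolution,         # page width in inches
        "height_in": ysize / resolution,        # page height in inches
        "usable_width_px": xsize - 2 * margin,  # width left after both side margins
        "usable_height_px": ysize - 2 * margin, # height left after top and bottom margins
        "ptsize_px": ptsize * resolution / 72,  # nominal point size in pixels
    }


geometry = page_geometry()
print(geometry)
```

       The nominal point size in pixels is only a guide: the height of
       the rendered glyphs depends on the font.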

USE THESE FLAGS TO OUTPUT ZERO-PADDED, SQUARE INDIVIDUAL CHARACTER IMAGES

       --output_individual_glyph_images BOOL
           If true, also outputs individual character images (type:bool
           default:false)

       --glyph_resized_size INT
           Each glyph is square with this side length in pixels (type:int
           default:0)

       --glyph_num_border_pixels_to_pad INT
           Number of border pixels to pad on each side, so that
           Final_size = glyph_resized_size + 2 * glyph_num_border_pixels_to_pad
           (type:int default:0)

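       The Final_size formula above can be checked with a short sketch.
       The function name is hypothetical, and the values 32 and 2 are
       example inputs, not recommended settings.

```python
def glyph_final_size(glyph_resized_size, glyph_num_border_pixels_to_pad):
    """Side length in pixels of a padded square glyph image."""
    if glyph_resized_size < 0 or glyph_num_border_pixels_to_pad < 0:
        raise ValueError("sizes must be non-negative")
    # The border is added on both sides of the glyph, hence the factor 2.
    return glyph_resized_size + 2 * glyph_num_border_pixels_to_pad


print(glyph_final_size(32, 2))  # 32 + 2 * 2 = 36
```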

USE THESE FLAGS TO FIND FONTS THAT CAN RENDER A GIVEN TEXT

       --find_fonts BOOL
           Search for all fonts that can render the text (type:bool
           default:false)

       --render_per_font BOOL
           If find_fonts==true, render each font to its own image. Image
           filenames are of the form output_name.font_name.tif (type:bool
           default:true)

       --min_coverage DOUBLE
           If find_fonts==true, the minimum coverage the font has of the
           characters in the text file to include it, between 0 and 1.
           (type:double default:1)

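       To illustrate what a coverage threshold means, the sketch below
       computes coverage as the fraction of distinct non-whitespace
       characters in the text that a font can render, then keeps the
       fonts that reach the threshold. This is only one plausible reading
       of --min_coverage; text2image may compute coverage differently
       (for example by counting occurrences). The function names and the
       font data are hypothetical.

```python
def font_coverage(text, supported_chars):
    """Fraction of distinct non-whitespace characters in text that are supported."""
    needed = {ch for ch in text if not ch.isspace()}
    if not needed:
        return 1.0  # nothing to render, so treat the text as fully covered
    return len(needed & set(supported_chars)) / len(needed)


def fonts_meeting_coverage(text, fonts, min_coverage=1.0):
    """Names of the fonts whose coverage of text is at least min_coverage."""
    return [
        name for name, chars in fonts.items()
        if font_coverage(text, chars) >= min_coverage
    ]


# Hypothetical fonts: FontB lacks the glyph for "d".
fonts = {"FontA": "abcd", "FontB": "abc"}
print(fonts_meeting_coverage("ab cd", fonts, min_coverage=1.0))
print(fonts_meeting_coverage("ab cd", fonts, min_coverage=0.75))
```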
       Example Usage:

       ```
       text2image --find_fonts \
       --fonts_dir /usr/share/fonts \
       --text ../langdata/hin/hin.training_text \
       --min_coverage .9 \
       --render_per_font \
       --outputbase ../langdata/hin/hin \
       |& grep raw | sed -e 's/ :.*/" \\/g' | sed -e 's/^/ "/' >../langdata/hin/fontslist.txt
       ```

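       The grep and sed stages of the example above can be mirrored in
       Python to show what ends up in fontslist.txt. The sketch assumes
       the two sed expressions read s/ :.*/" \\/g and s/^/ "/, and that
       the lines of interest contain the word raw after a " :" separator.
       The sample input is hypothetical and is not real text2image
       output.

```python
import re

SUFFIX = '" \\'  # closing quote, space, backslash (line continuation)


def to_fontslist_lines(output_lines):
    """Mimic: grep raw | sed -e 's/ :.*/" \\\\/g' | sed -e 's/^/ "/'."""
    entries = []
    for line in output_lines:
        if "raw" not in line:  # grep raw: keep only lines containing "raw"
            continue
        # First sed: replace everything from the first " :" with the suffix.
        quoted = re.sub(r" :.*", lambda _m: SUFFIX, line.rstrip("\n"))
        # Second sed: prefix the line with a space and an opening quote.
        entries.append(' "' + quoted)
    return entries


# Hypothetical input lines, used only to exercise the transformation.
sample = [
    "Lohit Devanagari : hypothetical raw report text",
    "a line the filter drops",
]
for entry in to_fontslist_lines(sample):
    print(entry)
```

       Each kept line becomes a quoted font name followed by a backslash,
       a form that can be pasted into a shell list of fonts. In bash,
       |& pipes standard error along with standard output.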

SINGLE OPTIONS

       --list_available_fonts BOOL
           List available fonts and quit. (type:bool default:false)

HISTORY

       text2image(1) was first made available for tesseract 3.03.

RESOURCES

       Main web site: https://github.com/tesseract-ocr

       Information on training tesseract LSTM:
       https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html

SEE ALSO

       tesseract(1)

COPYING

       Copyright (C) 2012 Google, Inc. Licensed under the Apache License,
       Version 2.0

AUTHOR

       The Tesseract OCR engine was written by Ray Smith and his research
       groups at Hewlett Packard (1985-1995) and Google (2006-present).

                                  03/20/2023                     TEXT2IMAGE(1)