HTML::FromText(3) User Contributed Perl Documentation HTML::FromText(3)

NAME
HTML::FromText - converts plain text to HTML

VERSION
version 2.07

SYNOPSIS
    use HTML::FromText;
    text2html( $text, %options );

    # or

    use HTML::FromText ();
    my $t2h = HTML::FromText->new( \%options );
    my $html = $t2h->parse( $text );

DESCRIPTION
"HTML::FromText" converts plain text to HTML. There are a handful of
options that shape the conversion. There is a utility function,
"text2html", that's exported by default. This function is simply a
shortcut to the object-oriented interface described in detail below.

METHODS

new
    my $t2h = HTML::FromText->new({
        paras     => 1,
        blockcode => 1,
        tables    => 1,
        bullets   => 1,
        numbers   => 1,
        urls      => 1,
        email     => 1,
        bold      => 1,
        underline => 1,
    });

Constructs a new "HTML::FromText" object using the given configuration.
The resulting object can be reused to parse any number of texts with the
"parse" method.

Options to "new" are passed by name in a hash reference, with the value
being either true or false. If true, the option will be turned on. If
false, it will be turned off. The following outlines all the options.

Decorators

metachars
    This option is on by default.

    All characters that are unsafe for HTML display will be encoded
    using HTML::Entities::encode_entities().

urls
    This option is off by default.

    Replaces URLs with links.

email
    This option is off by default.

    Replaces email addresses with "mailto:" links.

bold
    This option is off by default.

    Replaces text surrounded by asterisks ("*") with the same text
    surrounded by "strong" tags.

underline
    This option is off by default.

    Replaces text surrounded by underscores ("_") with the same text
    surrounded by "span" tags with an underline style.
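
As a rough illustration of the decorators above, the following sketch turns on all four optional decorators through "text2html". The sample text, URL, and address are made up for the example. The exact markup produced (attributes, class names) may differ between releases, so the result is printed rather than assumed.

```perl
use strict;
use warnings;
use HTML::FromText;    # exports text2html by default

# Sample input; the URL and the address are placeholders for illustration.
my $text = 'See http://example.com or mail someone@example.com '
         . 'about *this* and _that_ & more.';

my $html = text2html(
    $text,
    urls      => 1,    # replace URLs with links
    email     => 1,    # replace addresses with "mailto:" links
    bold      => 1,    # *this* is wrapped in "strong" tags
    underline => 1,    # _that_ is wrapped in an underlined "span"
);

# "metachars" is on by default, so the bare "&" is entity-encoded too.
print "$html\n";
```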

Output Modes

The following are the three output modes and the options associated with
them. They are listed in order of precedence. If none of these modes is
supplied, the basic decorators are applied to the text as a whole.

pre
    This option is off by default.

    Wraps the entire text in "pre" tags.

lines
    This option is off by default.

    Preserves line breaks by inserting "br" tags at the end of each
    line.

    This mode has further options.

    spaces
        This option is off by default.

        All spaces are HTML encoded.

paras
    This option is off by default.

    Preserves paragraphs by wrapping them in "p" tags.

    This mode has further options.

    bullets
        This option is off by default.

        Convert bulleted lists into unordered lists ("ul"). Bullets
        can be either an asterisk ("*") or a hyphen ("-"). Lists can
        be nested.

    numbers
        This option is off by default.

        Convert numbered lists into ordered lists ("ol"). Numbered
        lists are identified by numerals. Lists may be nested.

    headings
        This option is off by default.

        Convert paragraphs identified as headings into HTML headings
        at the appropriate level. The heading "1. Top" would be
        heading level one ("h1"). The heading "2.5.1. Blah" would be
        heading level three ("h3").

    title
        This option is off by default.

        Convert the first paragraph to a heading level one ("h1").

    tables
        This option is off by default.

        Convert paragraphs identified as tables to HTML tables.
        Tables are two or more rows and two or more columns. Columns
        should be separated by two or more spaces.

    The following options apply specifically to indented paragraphs.
    They are listed in order of precedence.

    blockparas
        This option is off by default.

        Convert indented paragraphs to block quotes using the
        "blockquote" tag.

    blockquotes
        Convert indented paragraphs as "blockparas" would, but also
        preserve line breaks.

    blockcode
        Convert indented paragraphs as "blockquotes" would, but also
        preserve spaces using "pre" tags.
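
The recognition rules described for "headings" and "tables" can be sketched in plain Perl. This is only an illustration of the rules as documented above, not the module's actual implementation; "heading_level" and "table_cells" are hypothetical helper names introduced for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a heading level from a numbered heading by
# counting its dotted numerals, so "1. Top" gives 1 and "2.5.1. Blah"
# gives 3. Returns undef when the text does not start with such numbering.
sub heading_level {
    my ($para) = @_;
    return unless $para =~ /\A\s*((?:\d+\.)+)\s+\S/;
    my $numbering = $1;
    my $level = () = $numbering =~ /\d+\./g;
    return $level;
}

# Hypothetical helper: split one table row into cells on runs of two or
# more spaces, so single spaces inside a cell are preserved.
sub table_cells {
    my ($line) = @_;
    $line =~ s/\A\s+|\s+\z//g;
    return split /\s{2,}/, $line;
}

print heading_level('2.5.1. Blah'), "\n";                           # prints 3
print join( '|', table_cells('First name  Last name   Age') ), "\n";
# prints First name|Last name|Age
```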

parse
    my $html = $t2h->parse( $text );

Parses text supplied as a single scalar string and returns the HTML as
a single scalar string. All the tabs in your text will be expanded
using Text::Tabs::expand().
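
Tab expansion can be previewed by calling "Text::Tabs" directly. A minimal sketch, assuming the default tab stop of 8 columns; whether "parse" changes that setting is not stated here.

```perl
use strict;
use warnings;
use Text::Tabs;    # core module; exports expand()

my $line     = "a\tb";
my $expanded = expand($line);

# With the default tab stop of 8, the tab becomes seven spaces and "b"
# lands in column 9.
printf "[%s] length %d\n", $expanded, length $expanded;
```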

FUNCTIONS

text2html
    my $html = text2html(
        $text,
        urls  => 1,
        email => 1,
    );

Functional interface that just wraps the OO interface. This function is
exported by default. If you don't want it you can "require" the module
or "use" it with an empty list.

    require HTML::FromText;
    # or ...
    use HTML::FromText ();
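
Since "text2html" is described as a wrapper around the object interface, the two calls below should give the same result for the same options. A minimal sketch; the sample text is made up for the example.

```perl
use strict;
use warnings;
use HTML::FromText;    # exports text2html by default

my $text    = "Fish & chips\n\nSee http://example.com for the menu.";
my %options = ( paras => 1, urls => 1 );

# Functional interface: options are passed as a flat list.
my $functional = text2html( $text, %options );

# Object interface: options are passed as a hash reference.
my $oo = HTML::FromText->new( \%options )->parse($text);

print $functional eq $oo ? "same output\n" : "outputs differ\n";
```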

Subclassing

Note: At the time of this release, the internals of "HTML::FromText"
are in a state of development and cannot be expected to stay the same
from release to release. I expect that release version 3.00 will be
analogous to a 1.00 release of other software. This is because the
current maintainer has rewritten this distribution from the ground up
for the "2.x" series. You have been warned.

The following methods may be used for subclassing "HTML::FromText" to
create your own text to HTML conversions. Each of these methods is
passed just one argument, the object ($self), unless otherwise stated.

The structure of $self is as follows for this release.

    {
        options => {
            option_name => $value,
            ...
        },
        text => $text,   # as passed to parse(), with tabs expanded
        html => $html,   # the HTML that will be returned from parse()
    }

pre
    Used when "pre" mode is specified.

    Should set "$self->{html}".

    Return value is ignored.

lines
    Used when "lines" mode is specified.

    Implements the "spaces" option internally when the option is set to a
    true value.

    Should set "$self->{html}".

    Return value is ignored.

paras
    Used when the "paras" mode is specified.

    Splits "$self->{text}" into paragraphs internally and sets up
    "$self->{paras}" as follows.

        paras => {
            0 => {
                text => $text,   # paragraph text
                html => $html,   # paragraph html
            },
            ...                  # and so on for all paragraphs
        },

    Implements the "title" option internally when the option is turned on.

    Converts any normal paragraphs to HTML paragraphs (surrounded by "p"
    tags) internally.

    Should set "$self->{html}".

    Return value is ignored.

headings
    Used to format headings when the "headings" option is turned on.

    Return value is ignored.

bullets
    Formats bulleted lists when the "bullets" option is turned on.

    Return value is ignored.

numbers
    Formats numbered lists when the "numbers" option is turned on.

    Return value is ignored.

tables
    Formats tables when the "tables" option is turned on.

    Return value is ignored.

blockparas
    Used when the "blockparas" option is turned on.

    Return value is ignored.

blockquotes
    Used when the "blockquotes" option is turned on.

    Return value is ignored.

blockcode
    Used when the "blockcode" option is turned on.

    Return value is ignored.

urls
    Turns URLs into links when the "urls" option is turned on.

    Should operate on "$self->{html}".

    Return value is ignored.

email
    Turns email addresses into "mailto:" links when the "email" option is
    turned on.

    Should operate on "$self->{html}".

    Return value is ignored.

underline
    Underlines things between _underscores_ when the "underline" option is
    turned on.

    Should operate on "$self->{html}".

    Return value is ignored.

bold
    Bolds things between *asterisks* when the "bold" option is turned on.

    Should operate on "$self->{html}".

    Return value is ignored.

metachars
    Encodes meta characters when the "metachars" option is turned on.

    Should operate on "$self->{html}".

    Return value is ignored.
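
As an illustration of the hooks above, here is a minimal subclass sketch that overrides "bold" to emit "b" tags instead of "strong". The package name "My::FromText" and the substitution pattern are hypothetical, and because the internals may change between releases this should be read as a sketch rather than a supported recipe.

```perl
use strict;
use warnings;

{
    package My::FromText;    # hypothetical subclass name
    use parent 'HTML::FromText';

    # Override the "bold" decorator. Like the other decorators it edits
    # $self->{html} in place, and its return value is ignored.
    sub bold {
        my ($self) = @_;

        # Illustrative pattern only, not the one the module uses.
        $self->{html} =~ s{\*([^*\s][^*\n]*)\*}{<b>$1</b>}g;
        return;
    }
}

my $t2h  = My::FromText->new( { bold => 1 } );
my $html = $t2h->parse('Some *important* text.');
print "$html\n";
```

Overriding a single decorator keeps the rest of the conversion (entity encoding, output modes) in the parent class.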

Output

The output from "HTML::FromText" has been updated to pass XHTML 1.1
validation. Every HTML tag that should have a CSS class name has one.
Class names are prefixed with "hft-" and correspond to the names of the
options to new() (or text2html()). For example "hft-lines", "hft-paras",
and "hft-urls".

One important note concerns the output for "underline". Because the <u>
tag is deprecated in this specification, a "span" is used with a style
attribute of "text-decoration: underline". The class is
"hft-underline". If you want to override the "text-decoration" style in
the CSS class you'll need to do so like this.

    text-decoration: none !important;

SEE ALSO
text2html(1).

AUTHORS
• Ricardo SIGNES <rjbs@cpan.org>

• Casey West <casey@geeknest.com>

• Gareth Rees <garethr@cre.canon.co.uk>

COPYRIGHT AND LICENSE
This software is copyright (c) 2003 by Casey West.

This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.

perl v5.36.0 2023-01-20 HTML::FromText(3)