HTML::FromText(3)     User Contributed Perl Documentation     HTML::FromText(3)

HTML::FromText - Convert plain text to HTML.

    use HTML::FromText;
    text2html( $text, %options );

    # or

    use HTML::FromText ();
    my $t2h  = HTML::FromText->new( \%options );
    my $html = $t2h->parse( $text );

"HTML::FromText" converts plain text to HTML. A handful of options shape
the conversion. The utility function "text2html" is exported by default;
it is simply a shortcut to the object-oriented interface described in
detail below.

Methods
The following methods may be used as the public interface.

new

    my $t2h = HTML::FromText->new({
        paras     => 1,
        blockcode => 1,
        tables    => 1,
        bullets   => 1,
        numbers   => 1,
        urls      => 1,
        email     => 1,
        bold      => 1,
        underline => 1,
    });

Constructs a new "HTML::FromText" object using the given configuration.
The resulting object can be reused to parse any number of texts with the
"parse" method.

Options to "new" are passed by name, with the value being either true
or false. A true value turns the option on; a false value turns it off.
The following outlines all the options.
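
As a minimal sketch of reusing one configured object, the following parses
two inputs with the same options. The sample strings are made up for
illustration; only "new" and "parse" come from the interface described
here.

```perl
use strict;
use warnings;
use HTML::FromText ();

# One configuration, reused for several inputs.
my $t2h = HTML::FromText->new({ paras => 1, urls => 1 });

# Hypothetical sample inputs, not taken from the module's documentation.
my @texts = (
    "First document.\n\nIt has two paragraphs.",
    "Second document, see http://www.example.com/ for details.",
);

# parse() takes one scalar string and returns one scalar string.
my @html = map { $t2h->parse($_) } @texts;

print "$_\n\n" for @html;
```

Options that are not named keep their documented defaults, so only
"paras" and "urls" are switched on here in addition to "metachars".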

Decorators

metachars
    This option is on by default.

    All characters that are unsafe for HTML display will be encoded
    using "HTML::Entities::encode_entities()".

urls
    This option is off by default.

    Replaces URLs with links.

email
    This option is off by default.

    Replaces email addresses with "mailto:" links.

bold
    This option is off by default.

    Replaces text surrounded by asterisks ("*") with the same text
    surrounded by "strong" tags.

underline
    This option is off by default.

    Replaces text surrounded by underscores ("_") with the same text
    surrounded by "span" tags with an underline style.
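
The decorators above can be combined in one call. This is a small sketch
with a made-up input string; the exact markup the module emits is not
reproduced here, so the script simply prints whatever comes back.

```perl
use strict;
use warnings;
use HTML::FromText;

# Hypothetical input exercising each decorator once.
# Single quotes keep Perl from interpolating the "@" in the address.
my $text = 'Write to user@example.com or visit http://www.example.com/ '
         . 'for *important* and _underlined_ notes about 1 < 2.';

my $html = text2html(
    $text,
    urls      => 1,   # URLs become links
    email     => 1,   # addresses become mailto: links
    bold      => 1,   # *text* becomes strong
    underline => 1,   # _text_ becomes an underlined span
);                    # metachars stays on by default, so "<" is encoded

print $html, "\n";
```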

Output Modes

The following are three output modes and the options associated with
them. They are listed in order of precedence. If none of these modes
is supplied, the basic decorators are applied to the text as a whole.

pre
    This option is off by default.

    Wraps the entire text in "pre" tags.

lines
    This option is off by default.

    Preserves line breaks by inserting "br" tags at the end of each
    line.

    This mode has further options.

    spaces
        This option is off by default.

        All spaces are HTML encoded.

paras
    This option is off by default.

    Preserves paragraphs by wrapping them in "p" tags.

    This mode has further options.

    bullets
        This option is off by default.

        Convert bulleted lists into unordered lists ("ul"). Bullets
        can be either an asterisk ("*") or a hyphen ("-"). Lists can
        be nested.

    numbers
        This option is off by default.

        Convert numbered lists into ordered lists ("ol"). Numbered
        lists are identified by numerals. Lists may be nested.

    headings
        This option is off by default.

        Convert paragraphs identified as headings into HTML headings
        at the appropriate level. The heading "1. Top" would be
        heading level one ("h1"). The heading "2.5.1. Blah" would be
        heading level three ("h3").

    title
        This option is off by default.

        Convert the first paragraph to a heading level one ("h1").

    tables
        This option is off by default.

        Convert paragraphs identified as tables to HTML tables.
        Tables are two or more rows and two or more columns. Columns
        should be separated by two or more spaces.

    The following options apply specifically to indented paragraphs.
    They are listed in order of precedence.

    blockparas
        This option is off by default.

        Convert indented paragraphs to block quotes using the
        "blockquote" tag.

    blockquotes
        Convert indented paragraphs as "blockparas" would, but also
        preserving line breaks.

    blockcode
        Convert indented paragraphs as "blockquotes" would, but also
        preserving spaces using "pre" tags.
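
To illustrate the order of precedence among the three modes, here is a
brief sketch that turns on several modes at once. The input text is
invented, and the comments describe what the precedence rule above
implies rather than the module's exact output.

```perl
use strict;
use warnings;
use HTML::FromText;

# Hypothetical input: two lines, a blank line, then a second paragraph.
my $text = "line one\nline two\n\nsecond paragraph";

# "pre" is listed first, so it should win over "lines" and "paras".
my $as_pre   = text2html( $text, pre => 1, lines => 1, paras => 1 );

# Without "pre", "lines" should win over "paras".
my $as_lines = text2html( $text, lines => 1, paras => 1 );

# With only "paras", each paragraph is wrapped in "p" tags.
my $as_paras = text2html( $text, paras => 1 );

print "$_\n\n" for $as_pre, $as_lines, $as_paras;
```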

parse

    my $html = $t2h->parse( $text );

Parses text supplied as a single scalar string and returns the HTML as
a single scalar string. All the tabs in your text will be expanded
using "Text::Tabs::expand()".
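
For readers unfamiliar with tab expansion, this short sketch shows what
"Text::Tabs::expand()" does on its own, outside of "HTML::FromText". It
assumes the default tab stop of eight columns; the sample string is
invented.

```perl
use strict;
use warnings;
use Text::Tabs;    # exports expand() and the $tabstop variable

# A tab after one character advances to the next tab stop (column 8),
# so it is replaced by seven spaces.
my $raw = "a\tb";
my ($expanded) = expand($raw);

printf "%d characters before, %d after\n", length $raw, length $expanded;
```

Because "parse" expands tabs before anything else, column-based features
such as indented paragraphs see spaces rather than tab characters.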

Functions
text2html

    my $html = text2html(
        $text,
        urls  => 1,
        email => 1,
    );

Functional interface that just wraps the OO interface. This function is
exported by default. If you don't want it imported, either "require" the
module or "use" it with an empty list.

    require HTML::FromText;
    # or ...
    use HTML::FromText ();
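
Since "text2html" is described as a wrapper around the OO interface, the
two calls in this sketch should produce the same string for the same
options. The input is a made-up example.

```perl
use strict;
use warnings;
use HTML::FromText;

# Hypothetical input and options.
my $text    = "See http://www.example.com/ or mail user\@example.com";
my %options = ( urls => 1, email => 1 );

# Functional form: options as a flat list.
my $functional = text2html( $text, %options );

# Object-oriented form: options as a hash reference.
my $oo = HTML::FromText->new( \%options )->parse( $text );

print $functional eq $oo ? "same output\n" : "different output\n";
```

Note the difference in calling convention: "text2html" takes the options
as a list after the text, while "new" takes a single hash reference.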

Subclassing
Note: At the time of this release, the internals of "HTML::FromText"
are in a state of development and cannot be expected to stay the same
from release to release. I expect that release version 3.00 will be
analogous to a 1.00 release of other software. This is because the
current maintainer has rewritten this distribution from the ground up
for the "2.x" series. You have been warned.

The following methods may be overridden when subclassing
"HTML::FromText" to create your own text-to-HTML conversions. Each of
these methods is passed just one argument, the object ($self), unless
otherwise stated.

The structure of $self is as follows for this release.

    {
        options => {
            option_name => $value,
            ...
        },
        text => $text, # as passed to parse(), with tabs expanded
        html => $html, # the HTML that will be returned from parse()
    }

pre

Used when "pre" mode is specified.

Should set "$self->{html}".

Return value is ignored.
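
A subclass that overrides "pre" might look like the following sketch.
The package name "My::FromText" and the class "my-pre" are hypothetical,
and the code assumes that "$self->{html}" already holds the working copy
of the text when "pre" is called; given the warning above about unstable
internals, treat this as an illustration rather than a supported recipe.

```perl
package My::FromText;    # hypothetical subclass name
use strict;
use warnings;
use base 'HTML::FromText';

# Override the "pre" output mode with a custom wrapper.
sub pre {
    my ($self) = @_;

    # Assumption: $self->{html} is the text being converted at this point.
    $self->{html} = qq{<pre class="my-pre">$self->{html}</pre>};

    return;    # the return value is ignored by the caller
}

package main;

my $html = My::FromText->new({ pre => 1 })->parse("some   spaced\ntext");
print $html, "\n";
```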

lines

Used when "lines" mode is specified.

Implements the "spaces" option internally when the option is set to a
true value.

Should set "$self->{html}".

Return value is ignored.

paras

Used when the "paras" mode is specified.

Splits "$self->{text}" into paragraphs internally and sets up
"$self->{paras}" as follows.

    paras => {
        0 => {
            text => $text, # paragraph text
            html => $html, # paragraph html
        },
        ... # and so on for all paragraphs
    },

Implements the "title" option internally when the option is turned on.

Converts any normal paragraphs to HTML paragraphs (surrounded by "p"
tags) internally.

Should set "$self->{html}".

Return value is ignored.
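
Because "$self->{paras}" is a hash keyed by paragraph number rather than
an array, code that walks it needs a numeric sort. The sketch below uses
a hand-built stand-in for that structure (the data is invented) and does
not call the module at all.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $self->{paras} as laid out above.
my %paras = (
    0  => { text => 'Title', html => '<h1>Title</h1>' },
    2  => { text => 'Body',  html => '<p>Body</p>'    },
    10 => { text => 'End',   html => '<p>End</p>'     },
);

# Sort keys numerically; a plain string sort would put 10 before 2.
my @order = sort { $a <=> $b } keys %paras;

# Assemble the per-paragraph HTML in document order.
my $html = join "\n", map { $paras{$_}{html} } @order;

print $html, "\n";
```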

headings

Used to format headings when the "headings" option is turned on.

Return value is ignored.
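
The level rule given for the "headings" option ("1. Top" is level one,
"2.5.1. Blah" is level three) amounts to counting the numbers in the
leading section label. The helper below, "heading_level", is a
hypothetical illustration of that rule and is not how the module itself
detects headings.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a heading level from a numbered label.
sub heading_level {
    my ($heading) = @_;

    # Match a leading run of dot-terminated numbers, e.g. "2.5.1. ".
    return unless $heading =~ /^((?:\d+\.)+)\s/;
    my $label = $1;

    # One level per number in the label.
    my @parts = $label =~ /\d+/g;
    return scalar @parts;
}

for my $heading ( '1. Top', '2.5.1. Blah', 'Plain paragraph' ) {
    my $level = heading_level($heading);
    printf "%-16s => %s\n", $heading, defined $level ? "h$level" : 'not a heading';
}
```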

bullets

Format bulleted lists when the "bullets" option is turned on.

Return value is ignored.

numbers

Format numbered lists when the "numbers" option is turned on.

Return value is ignored.

tables

Format tables when the "tables" option is turned on.

Return value is ignored.

blockparas

Used when the "blockparas" option is turned on.

Return value is ignored.

blockquotes

Used when the "blockquotes" option is turned on.

Return value is ignored.

blockcode

Used when the "blockcode" option is turned on.

Return value is ignored.

urls

Turn URLs into links when the "urls" option is turned on.

Should operate on "$self->{html}".

Return value is ignored.

email

Turn email addresses into "mailto:" links when the "email" option is
turned on.

Should operate on "$self->{html}".

Return value is ignored.

underline

Underline things between _underscores_ when the "underline" option is
turned on.

Should operate on "$self->{html}".

Return value is ignored.

bold

Bold things between *asterisks* when the "bold" option is turned on.

Should operate on "$self->{html}".

Return value is ignored.
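
A decorator override follows the same pattern as a mode override: edit
"$self->{html}" in place. In this sketch the package name "My::PlainBold"
and the simplified pattern are hypothetical; the pattern is deliberately
naive and is not the one the module uses.

```perl
package My::PlainBold;    # hypothetical subclass name
use strict;
use warnings;
use base 'HTML::FromText';

# Emit plain "b" tags for *asterisk* spans instead of "strong".
sub bold {
    my ($self) = @_;

    # Naive pattern: an asterisk, some text without asterisks or
    # newlines, then a closing asterisk.
    $self->{html} =~ s{\*([^*\n]+)\*}{<b>$1</b>}g;

    return;    # the return value is ignored by the caller
}

package main;

my $html = My::PlainBold->new({ bold => 1 })
                        ->parse('This is *really* important.');
print $html, "\n";
```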

metachars

Encode meta characters when the "metachars" option is turned on.

Should operate on "$self->{html}".

Return value is ignored.

Output
The output from "HTML::FromText" has been updated to pass XHTML 1.1
validation. Every HTML tag that should have a CSS class name does. Class
names are prefixed with "hft-" and correspond to the names of the
options to "new()" (or "text2html()"). For example "hft-lines",
"hft-paras", and "hft-urls".

One important note is the output for "underline". Because the <u> tag
is deprecated in this specification, a "span" is used with a style
attribute of "text-decoration: underline". The class is
"hft-underline". Because the style is set inline, overriding
"text-decoration" from the CSS class requires "!important", like this.

    .hft-underline { text-decoration: none !important; }

text2html(1).

Casey West <casey@geeknest.com>.

Gareth Rees <garethr@cre.canon.co.uk>.

Copyright (c) 2003 Casey West. All rights reserved.
This module is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.12.0                      2003-10-14                 HTML::FromText(3)