HTML::FormatText::Html2text(3)  User Contributed Perl Documentation  HTML::FormatText::Html2text(3)

NAME

       HTML::FormatText::Html2text - format HTML as plain text using html2text

SYNOPSIS

        use HTML::FormatText::Html2text;
        $text = HTML::FormatText::Html2text->format_file ($filename);
        $text = HTML::FormatText::Html2text->format_string ($html_string);

        use HTML::TreeBuilder;
        $formatter = HTML::FormatText::Html2text->new;
        $tree = HTML::TreeBuilder->new_from_file ($filename);
        $text = $formatter->format ($tree);

DESCRIPTION

       "HTML::FormatText::Html2text" turns HTML into plain text using the
       "html2text" program.

           <http://www.mbayer.de/html2text/>

       The module interface is compatible with formatters like
       "HTML::FormatText", but all parsing etc. is done by "html2text".

       See "HTML::FormatExternal" for the formatting functions and options,
       with the following caveats:

       "input_charset"
           Currently this option has no effect.  Input generally has to be
           latin-1 only, though the Debian extended "html2text" interprets a
           "<meta>" charset directive in the HTML header.

           Various "&" style named or numbered entities are recognised and
           result in suitable output.  For maximum portability among
           "html2text" versions, the suggestion is to entitize the input.

       "output_charset"
           If set to "ascii" or "ANSI_X3.4-1968" (both case-insensitive) the
           "html2text -ascii" option is used, when available ("html2text"
           1.3.2 from Jan 2004).

           If set to "UTF-8" then the Debian extension "-utf8" option is
           used (circa 2009).

           Apart from this there's no control over the output charset.
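
       The entitizing suggestion under "input_charset" can be sketched as
       below.  This is only an illustration under stated assumptions: the
       entitize() helper is hypothetical and not part of
       HTML-FormatExternal, and it expects a decoded Perl character string
       rather than raw bytes.  The call to format_string() with
       "output_charset" runs only if the module loads, and is wrapped in
       eval in case the "html2text" program is missing.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML-FormatExternal): replace every
# non-ASCII character with a decimal numeric entity, so the HTML handed
# to html2text is pure ASCII whatever the input charset handling is.
# Assumes $html is a decoded character string, not raw UTF-8 bytes.
sub entitize {
    my ($html) = @_;
    $html =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $html;
}

my $html = "<p>caf\x{E9} costs \x{20AC}5</p>";
my $safe = entitize($html);
print "$safe\n";    # <p>caf&#233; costs &#8364;5</p>

# Only attempt the conversion when the module is installed.
if (eval { require HTML::FormatText::Html2text; 1 }) {
    my $text = eval {
        HTML::FormatText::Html2text->format_string
            ($safe, output_charset => 'ascii');
    };
    print $text if defined $text;
}
```

       Since the entitized input is plain ASCII, it should not depend on
       whether a given "html2text" build assumes latin-1 or honours a
       "<meta>" charset.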

SEE ALSO

       HTML::FormatExternal, html2text(1)

HOME PAGE

       <http://user42.tuxfamily.org/html-formatexternal/index.html>

LICENSE

       Copyright 2008, 2009, 2010, 2013, 2015 Kevin Ryde

       HTML-FormatExternal is free software; you can redistribute it and/or
       modify it under the terms of the GNU General Public License as
       published by the Free Software Foundation; either version 3, or (at
       your option) any later version.

       HTML-FormatExternal is distributed in the hope that it will be useful,
       but WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License along
       with HTML-FormatExternal.  If not, see <http://www.gnu.org/licenses/>.

perl v5.38.0                      2023-07-20    HTML::FormatText::Html2text(3)