HTTP::Proxy::BodyFilter(3)  User Contributed Perl Documentation  HTTP::Proxy::BodyFilter(3)

NAME

       HTTP::Proxy::BodyFilter - A base class for HTTP message body filters

SYNOPSIS

           package MyFilter;

           use base qw( HTTP::Proxy::BodyFilter );

           # a simple modification, that may break things
           sub filter {
               my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
               $$dataref =~ s/PERL/Perl/g;
           }

           1;

DESCRIPTION

       The HTTP::Proxy::BodyFilter class is used to create filters for HTTP
       request/response body data.

       Creating a BodyFilter

       A BodyFilter is just a derived class that implements some methods
       called by the proxy. Of all the methods presented below, only
       "filter()" must be defined in the derived class.

       filter()
           The signature of the filter() method is the following:

               sub filter {
                   my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
                   ...
               }

           where $self is the filter object, $dataref is a reference to the
           chunk of body data received, $message is a reference to either an
           HTTP::Request or an HTTP::Response object, and $protocol is a
           reference to the LWP::Protocol protocol object.

           Note that this subroutine signature looks a lot like that of the
           callbacks of LWP::UserAgent (except that $message is either an
           HTTP::Request or an HTTP::Response object).

           $buffer is a reference to a buffer where some of the unprocessed
           data can be stored for the next time the filter is called (see
           "Using a buffer to store data for a later use" for details). Thanks
           to the built-in HTTP::Proxy::BodyFilter::* filters, this is rarely
           needed.

           It is possible to access the headers of the message with
           "$message->headers()". This HTTP::Headers object is the one that
           was sent to the client (if the filter is on the response stack) or
           origin server (if the filter is on the request stack). Modifying it
           in the filter() method is useless, since the headers have already
           been sent.

           Since $dataref is a reference to the data string, the referent can
           be modified and the changes will be transmitted through the filters
           that follow, until the data reaches its recipient.

           An HTTP::Proxy::BodyFilter object is a blessed hash, and the base
           class reserves only hash keys that start with "_hpbf".

       new()
           The constructor is defined for all subclasses. Initialisation tasks
           (if any) for subclasses should be done in the "init()" method (see
           below).

       init()
           This method is called by the "new()" constructor to perform all
           initialisation tasks. It's called once in the filter lifetime.

           It receives all the parameters passed to "new()".

       begin()
           Some filters might require initialisation before they are able to
           handle the data. If a "begin()" method is defined in your subclass,
           the proxy will call it before sending data to the "filter()"
           method.

           It's called once per HTTP message handled by the filter, before
           data processing begins.

           The method signature is as follows:

               sub begin {
                   my ( $self, $message ) = @_;
                   ...
               }

       end()
           Some filters might require finalisation after they are finished
           handling the data. If an "end()" method is defined in your subclass,
           the proxy will call it after it has finished sending data to the
           "filter()" method.

           It's called once per HTTP message handled by the filter, after all
           data processing is done.

           This method does not expect any parameters.

       will_modify()
           This method returns a boolean value that indicates whether the
           filter will modify the body data on the fly.

           The default implementation returns a true value.
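
       As a minimal sketch of how these methods fit together, the class below
       defines init(), begin(), filter(), end() and will_modify(). The class
       name WordReplacer, its "from" and "to" parameters, and the hash keys it
       uses are invented for this illustration. So that the example runs
       without the HTTP::Proxy distribution installed, it supplies a stand-in
       new() (real code would inherit the constructor with "use base"), and a
       small loop in package main plays the role of the proxy. Both are
       assumptions based only on the call order described above.

```perl
use strict;
use warnings;

package WordReplacer;

# ASSUMPTION: stand-in for the inherited constructor. Real code would say
#     use base qw( HTTP::Proxy::BodyFilter );
# and not define new() at all. This version only mimics the documented
# behaviour: init() is called once, with the arguments given to new().
sub new {
    my $class = shift;
    my $self  = bless {}, $class;
    $self->init(@_);
    return $self;
}

# Once per filter object: remember the substitution to apply.
# The keys used here do not start with "_hpbf", which the base class reserves.
sub init {
    my ( $self, %args ) = @_;
    @{$self}{qw( from to )} = @args{qw( from to )};
    $self->{log} = [];
}

# Once per message, before any body data: reset the counter.
sub begin {
    my ( $self, $message ) = @_;
    $self->{count} = 0;
}

# Once per chunk: edit the chunk in place through the reference.
sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
    my ( $from, $to ) = @{$self}{qw( from to )};
    $self->{count} += ( $$dataref =~ s/\Q$from\E/$to/g ) || 0;
}

# Once per message, after the last chunk: record the total.
sub end {
    my ($self) = @_;
    push @{ $self->{log} }, $self->{count};
}

# This filter rewrites the body, so it gives the same answer as the default.
# A filter that only inspects the data could return a false value instead.
sub will_modify { return 1 }

package main;

# Hypothetical driver standing in for the proxy.
my $filter = WordReplacer->new( from => 'PERL', to => 'Perl' );
my @output;
for my $chunks ( [ 'PERL is fun. ', 'I like PERL.' ], ['No match here.'] ) {
    $filter->begin(undef);    # no real HTTP message in this sketch
    my $body = '';
    for my $chunk (@$chunks) {
        my ( $data, $buffer ) = ( $chunk, '' );
        $filter->filter( \$data, undef, undef, \$buffer );
        $body .= $data;
    }
    $filter->end;
    push @output, $body;
}

print "$_\n" for @output;
print "replacements: @{ $filter->{log} }\n";    # replacements: 2 0
```

       Like the filter in the SYNOPSIS, this one misses a word that is split
       across two chunks.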

       Using a buffer to store data for a later use

       Some filters cannot handle arbitrary data: for example, a filter that
       lowercases tag names could apply a simple regex such as
       "s/<\s*(\w+)([^>]*)>/<\L$1\E$2>/g". But the filter will fail if the
       chunk of data contains a tag that is cut before the final ">".

       It would be extremely complicated and error-prone to let each filter
       (and its author) do its own buffering, so the HTTP::Proxy architecture
       handles this too. Each time it calls a filter, the proxy passes it a
       reference to an empty string ($buffer in the above signature) that the
       filter can use to store some data for the next run.

       When $buffer is "undef" instead of a reference, it means that the
       filter cannot store any data, because this is the very last run, needed
       to gather all the data left in all buffers.

       It is recommended to store as little data as possible in the buffer, so
       as to avoid (badly) reproducing what HTTP::Proxy::BodyFilter::complete
       does.

       In particular, you have to remember that all the data that remains in
       the buffer after the last piece of data is received from the origin
       server will be sent back to your filter in one big piece.
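
       The following sketch applies the tag-lowercasing regex quoted above
       while holding back an unfinished tag in $buffer. The class name
       TagLowercaser and the driver loop are invented for this illustration.
       The loop stands in for the proxy and assumes that stored data is handed
       back in front of the next chunk, and that the very last call passes
       "undef" as the buffer. The stand-in new() only keeps the example
       runnable without the HTTP::Proxy distribution; real code would inherit
       it with "use base".

```perl
use strict;
use warnings;

package TagLowercaser;

# ASSUMPTION: stand-in constructor; real code would inherit it with
#     use base qw( HTTP::Proxy::BodyFilter );
sub new { return bless {}, shift }

sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

    # While a buffer is offered, hold back a trailing tag that has no
    # closing ">" yet. On the last run $buffer is undef, so whatever is
    # left is processed as is.
    if ( defined $buffer && $$dataref =~ s/(<[^>]*)\z// ) {
        $$buffer = $1;
    }

    # The regex quoted in the text: lowercase the name of each opening tag.
    $$dataref =~ s/<\s*(\w+)([^>]*)>/<\L$1\E$2>/g;
}

package main;

# Hypothetical driver standing in for the proxy.
# ASSUMPTION: stored data is prepended to the next chunk.
my $filter = TagLowercaser->new;
my $stored = '';
my $output = '';
for my $chunk ( '<P>Hello <B', 'R>world<H', 'R>' ) {
    my $data = $stored . $chunk;
    $stored = '';
    $filter->filter( \$data, undef, undef, \$stored );
    $output .= $data;
}

# Very last run: leftover data, and no buffer to store anything in.
my $rest = $stored;
$filter->filter( \$rest, undef, undef, undef );
$output .= $rest;

print "$output\n";    # <p>Hello <br>world<hr>
```

       Only the unfinished tag is kept back, so the buffer stays small, in
       line with the recommendation above.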

       The store and forward approach

       HTTP::Proxy implements a store and forward mechanism, for those filters
       which need to have the whole message body to work. It's enabled simply
       by pushing the HTTP::Proxy::BodyFilter::complete filter on the filter
       stack.

       The data is stored in memory by the "complete" filter, which passes it
       on to the following filter once the full message body has been
       received.

       Standard BodyFilters

       Standard HTTP::Proxy::BodyFilter classes have lowercase names.

       The following BodyFilters are included in the HTTP::Proxy distribution:

       lines
           This filter makes sure that the next filter in the filter chain
           will only receive complete lines. The "chunks" of data received by
           the following filters will either end with "\n" or be the last
           piece of data for the current HTTP message body.

       htmltext
           This class lets you create a filter that runs a given code
           reference against text included in an HTML document (outside
           "<script>" and "<style>" tags). HTML entities are not included in
           the text.

       htmlparser
           Creates a filter from an HTML::Parser object.

       simple
           This class lets you create a simple body filter from a code
           reference.

       save
           Store the message body to a file.

       complete
           This filter stores the whole message body in memory, thus allowing
           some actions to be taken only when the full page has been received
           by the proxy.

       tags
           The HTTP::Proxy::BodyFilter::tags filter makes sure that the next
           filter in the filter chain will only receive complete tags. The
           current implementation is not 100% perfect, though.

       Please read each filter's documentation for more details about their
       use.

USEFUL METHODS FOR SUBCLASSES

       Some methods are available to filters, so that they can, if needed, use
       the little knowledge they might have of HTTP::Proxy's internals. They
       are mostly accessors.

       proxy()
           Gets a reference to the HTTP::Proxy object that owns the filter.
           This gives access to some of the proxy methods.

AUTHOR

       Philippe "BooK" Bruhat, <book@cpan.org>.

SEE ALSO

       HTTP::Proxy, HTTP::Proxy::HeaderFilter.

COPYRIGHT

       Copyright 2003-2005, Philippe Bruhat.

LICENSE

       This module is free software; you can redistribute it or modify it
       under the same terms as Perl itself.



perl v5.8.8                       2006-09-04        HTTP::Proxy::BodyFilter(3)