HTTP::Proxy::BodyFilter(3)   User Contributed Perl Documentation   HTTP::Proxy::BodyFilter(3)

NAME

HTTP::Proxy::BodyFilter - A base class for HTTP message body filters

SYNOPSIS

    package MyFilter;

    use base qw( HTTP::Proxy::BodyFilter );

    # a simple modification, that may break things
    sub filter {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        $$dataref =~ s/PERL/Perl/g;
    }

    1;

DESCRIPTION

The HTTP::Proxy::BodyFilter class is used to create filters for HTTP
request/response body data.

Creating a BodyFilter

A BodyFilter is just a derived class that implements some methods
called by the proxy. Of all the methods presented below, only
"filter()" must be defined in the derived class.

filter()
    The signature of the filter() method is the following:

        sub filter {
            my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
            ...
        }

    where $self is the filter object, $dataref is a reference to the
    chunk of body data received, $message is a reference to either an
    HTTP::Request or an HTTP::Response object, and $protocol is a
    reference to the LWP::Protocol protocol object.

    Note that this subroutine signature looks a lot like that of the
    callbacks of LWP::UserAgent (except that $message is either an
    HTTP::Request or an HTTP::Response object).

    $buffer is a reference to a buffer where some of the unprocessed
    data can be stored until the next time the filter is called (see
    "Using a buffer to store data for a later use" for details). Thanks
    to the built-in HTTP::Proxy::BodyFilter::* filters, this is rarely
    needed.

    It is possible to access the headers of the message with
    "$message->headers()". This HTTP::Headers object is the one that was
    sent to the client (if the filter is on the response stack) or
    origin server (if the filter is on the request stack). Modifying it
    in the filter() method has no effect, since the headers have already
    been sent.

    Since $dataref is a reference to the data string, the referent can
    be modified and the changes will be transmitted through the filters
    that follow, until the data reaches its recipient.

    An HTTP::Proxy::BodyFilter object is a blessed hash, and the base
    class reserves only hash keys that start with "_hpbf".

new()
    The constructor is defined for all subclasses. Initialisation tasks
    (if any) for subclasses should be done in the "init()" method (see
    below).

init()
    This method is called by the "new()" constructor to perform all
    initialisation tasks. It is called once in the filter's lifetime.

    It receives all the parameters passed to "new()".

begin()
    Some filters might require initialisation before they are able to
    handle the data. If a "begin()" method is defined in your subclass,
    the proxy will call it before sending data to the "filter()"
    method.

    It is called once per HTTP message handled by the filter, before
    data processing begins.

    The method signature is as follows:

        sub begin {
            my ( $self, $message ) = @_;
            ...
        }

end()
    Some filters might require finalisation after they are finished
    handling the data. If an "end()" method is defined in your subclass,
    the proxy will call it after it has finished sending data to the
    "filter()" method.

    It is called once per HTTP message handled by the filter, after all
    data processing is done.

    This method does not expect any parameters.

will_modify()
    This method returns a boolean value that indicates whether the
    filter will modify the body data on the fly.

    The default implementation returns a true value.
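
As an illustration of how these methods fit together, here is a minimal sketch of a filter that only counts the bytes of each message body. The ByteCounter package, its "bytes" and "total" hash keys, and the driver loop at the end are made up for this example. So that the sketch runs without HTTP::Proxy installed, it does not inherit from the base class and supplies a stand-in new(); a real filter would use the base class and its constructor instead.

```perl
use strict;
use warnings;

package ByteCounter;

# In real use: use base qw( HTTP::Proxy::BodyFilter );
# The stand-in constructor below only exists to keep the sketch standalone.
sub new { my $class = shift; return bless {}, $class }

# Called once per message, before the first chunk: reset the counter.
sub begin {
    my ( $self, $message ) = @_;
    $self->{bytes} = 0;
}

# Called once per chunk: read the data without changing it.
sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
    $self->{bytes} += length $$dataref;
}

# Called once per message, with no parameters, after the last chunk.
sub end {
    my ($self) = @_;
    $self->{total} = $self->{bytes};
}

# The body is passed through untouched, so return a false value.
sub will_modify { 0 }

package main;

# Hypothetical driver imitating the order in which the proxy calls the methods.
my $filter = ByteCounter->new;
$filter->begin(undef);
for my $chunk ( 'Hello, ', 'world!' ) {
    my $data   = $chunk;
    my $buffer = '';
    $filter->filter( \$data, undef, undef, \$buffer );
}
$filter->end;
print "body length: $filter->{total}\n";
```

The hash keys used here do not start with "_hpbf", so they would not clash with the keys reserved by the base class.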

Using a buffer to store data for a later use

Some filters cannot handle arbitrary data: for example, a filter that
lowercases tag names will apply a simple regex such as
"s/<\s*(\w+)([^>]*)>/<\L$1\E$2>/g". But the filter will fail if the
chunk of data contains a tag that is cut before the final ">".

It would be extremely complicated and error-prone to let each filter
(and its author) do its own buffering, so the HTTP::Proxy architecture
handles this too. The proxy passes to each filter, each time it is
called, a reference to an empty string ($buffer in the above signature)
that the filter can use to store some data for the next run.

When the reference is "undef", it means that the filter cannot store
any data, because this is the very last run, needed to gather all the
data left in all buffers.

It is recommended to store as little data as possible in the buffer, so
as to avoid (badly) reproducing what HTTP::Proxy::BodyFilter::complete
does.

In particular, you have to remember that all the data that remains in
the buffer after the last piece of data is received from the origin
server will be sent back to your filter in one big piece.
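
The buffering described above might look like the following sketch. It reuses the tag-lowercasing regex from this section and holds back only a trailing, unfinished tag. The lowercase_tags() function and the driver loop are illustrative names, not part of HTTP::Proxy, and the driver only approximates the proxy: it assumes the held-back data is handed to the filter in front of the next chunk, and that the last call receives "undef" instead of a buffer reference.

```perl
use strict;
use warnings;

# Body of a hypothetical filter() method, reduced to the two arguments
# that matter here: the data reference and the buffer reference.
sub lowercase_tags {
    my ( $dataref, $buffer ) = @_;

    # Hold back a tag that is cut before its final ">", unless this is
    # the very last run ($buffer is undef) and nothing can be stored.
    if ( defined $buffer && $$dataref =~ s/(<[^>]*)\z// ) {
        $$buffer = $1;
    }

    $$dataref =~ s/<\s*(\w+)([^>]*)>/<\L$1\E$2>/g;
}

# Driver standing in for the proxy (an assumption, see above).
my $held   = '';
my $output = '';
for my $chunk ( '<P>Hello <B', 'R>world' ) {
    my $data   = $held . $chunk;    # buffered data comes back first
    my $buffer = '';                # each call gets an empty buffer
    lowercase_tags( \$data, \$buffer );
    $held = $buffer;
    $output .= $data;
}

# Very last run: whatever is left, with no buffer to store into.
my $rest = $held;
lowercase_tags( \$rest, undef );
$output .= $rest;

print "$output\n";    # prints "<p>Hello <br>world"
```

Only the unfinished tag is kept between calls, which follows the advice above to store as little as possible in the buffer.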

The store and forward approach

HTTP::Proxy implements a store and forward mechanism, for those filters
which need to have the whole message body to work. It is enabled simply
by pushing the HTTP::Proxy::BodyFilter::complete filter on the filter
stack.

The data is stored in memory by the "complete" filter, which passes it
on to the following filter once the full message body has been
received.

Standard BodyFilters

Standard HTTP::Proxy::BodyFilter classes are lowercase.

The following BodyFilters are included in the HTTP::Proxy distribution:

lines
    This filter makes sure that the next filter in the filter chain
    will only receive complete lines. The "chunks" of data received by
    the following filters will either end with "\n" or will be the last
    piece of data for the current HTTP message body.

htmltext
    This class lets you create a filter that runs a given code
    reference against text included in an HTML document (outside
    "<script>" and "<style>" tags). HTML entities are not included in
    the text.

htmlparser
    Creates a filter from an HTML::Parser object.

simple
    This class lets you create a simple body filter from a code
    reference.

save
    Stores the message body to a file.

complete
    This filter stores the whole message body in memory, thus allowing
    some actions to be taken only when the full page has been received
    by the proxy.

tags
    The HTTP::Proxy::BodyFilter::tags filter makes sure that the next
    filter in the filter chain will only receive complete tags. The
    current implementation is not 100% perfect, though.

Please read each filter's documentation for more details about its
use.

AVAILABLE METHODS

Some methods are available to filters, so that they can make use of
what little they know of HTTP::Proxy's internals. They are mostly
accessors.

proxy()
    Gets a reference to the HTTP::Proxy object that owns the filter.
    This gives access to some of the proxy methods.

AUTHOR

Philippe "BooK" Bruhat, <book@cpan.org>.

SEE ALSO

HTTP::Proxy, HTTP::Proxy::HeaderFilter.

COPYRIGHT

Copyright 2003-2005, Philippe Bruhat.

LICENSE

This module is free software; you can redistribute it or modify it
under the same terms as Perl itself.

perl v5.8.8   2006-09-04   HTTP::Proxy::BodyFilter(3)