HTTP::Proxy::BodyFilter(3)   User Contributed Perl Documentation   HTTP::Proxy::BodyFilter(3)

NAME
HTTP::Proxy::BodyFilter - A base class for HTTP message body filters

SYNOPSIS
    package MyFilter;

    use base qw( HTTP::Proxy::BodyFilter );

    # a simple modification, that may break things
    sub filter {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        $$dataref =~ s/PERL/Perl/g;
    }

    1;

DESCRIPTION
The HTTP::Proxy::BodyFilter class is used to create filters for HTTP
request/response body data.

Creating a BodyFilter
A BodyFilter is just a derived class that implements some methods
called by the proxy. Of all the methods presented below, only filter()
must be defined in the derived class.

filter()
The signature of the filter() method is the following:

    sub filter {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
        ...
    }

where $self is the filter object, $dataref is a reference to the
chunk of body data received, $message is a reference to either an
HTTP::Request or an HTTP::Response object, and $protocol is a
reference to the LWP::Protocol protocol object.

Note that this subroutine signature looks a lot like that of the
callbacks of LWP::UserAgent (except that $message is either an
HTTP::Request or an HTTP::Response object).

$buffer is a reference to a buffer where some of the unprocessed
data can be stored for the next time the filter is called (see
"Using a buffer to store data for a later use" for details). Thanks
to the built-in HTTP::Proxy::BodyFilter::* filters, this is rarely
needed.

It is possible to access the headers of the message with
"$message->headers()". This HTTP::Headers object is the one that
was sent to the client (if the filter is on the response stack) or
origin server (if the filter is on the request stack). Modifying it
in the filter() method is useless, since the headers have already
been sent.

Since $dataref is a reference to the data string, the referent can
be modified and the changes will be transmitted through the filters
that follow, until the data reaches its recipient.

An HTTP::Proxy::BodyFilter object is a blessed hash, and the base
class reserves only hash keys that start with "_hpbf".

new()
The constructor is defined for all subclasses. Initialisation tasks
(if any) for subclasses should be done in the init() method (see
below).

init()
This method is called by the new() constructor to perform all
initialisation tasks. It's called once in the filter lifetime.

It receives all the parameters passed to new().

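The following sketch shows one way a subclass might use init() to
pick up the parameters given to new(). The package name My::Replace,
the "from" and "to" parameters and the "my_from"/"my_to" hash keys are
all assumptions made for this example.

```perl
package My::Replace;

use strict;
use warnings;
use base qw( HTTP::Proxy::BodyFilter );

# called once by new(), with the arguments that were passed to new()
sub init {
    my ( $self, %args ) = @_;
    $self->{my_from} = defined $args{from} ? $args{from} : 'PERL';
    $self->{my_to}   = defined $args{to}   ? $args{to}   : 'Perl';
}

sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

    # \Q...\E makes the configured string match as literal text
    $$dataref =~ s/\Q$self->{my_from}\E/$self->{my_to}/g;
}

1;
```

With this in place, "My::Replace->new( from => 'colour', to => 'color' )"
would build a filter configured at construction time, without
overriding new() itself.
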
begin()
Some filters might require initialisation before they are able to
handle the data. If a begin() method is defined in your subclass,
the proxy will call it before sending data to the filter() method.

It's called once per HTTP message handled by the filter, before
data processing begins.

The method signature is as follows:

    sub begin {
        my ( $self, $message ) = @_;
        ...
    }

end()
Some filters might require finalisation after they are finished
handling the data. If an end() method is defined in your subclass,
the proxy will call it after it has finished sending data to the
filter() method.

It's called once per HTTP message handled by the filter, after all
data processing is done.

This method does not expect any parameters.

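To make the per-message life cycle concrete, here is a small sketch
of a filter that resets a counter in begin(), updates it in filter()
and records the total in end(). The package name My::ByteCounter and
the "my_bytes"/"my_last_total" keys are hypothetical.

```perl
package My::ByteCounter;

use strict;
use warnings;
use base qw( HTTP::Proxy::BodyFilter );

# once per message, before any data: reset the per-message state
sub begin {
    my ( $self, $message ) = @_;
    $self->{my_bytes} = 0;
}

# once per chunk of body data: the data itself is left untouched
sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
    $self->{my_bytes} += length $$dataref;
}

# once per message, after all data: only the filter object is passed
sub end {
    my ($self) = @_;
    $self->{my_last_total} = $self->{my_bytes};
}

1;
```

Because the same filter object may handle several messages, resetting
state in begin() rather than in init() keeps the totals separate.
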
will_modify()
This method returns a boolean value that indicates whether the filter
will modify the body data on the fly.

The default implementation returns a true value.

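A filter that only inspects the data could override will_modify() to
say so. The sketch below assumes a hypothetical My::Observer class and
a "my_seen_perl" key; how the proxy uses the returned value is not
covered here.

```perl
package My::Observer;

use strict;
use warnings;
use base qw( HTTP::Proxy::BodyFilter );

# this filter only looks at the data, so it reports a false value
sub will_modify { 0 }

sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

    # remember that the string was seen, but leave the data alone
    $self->{my_seen_perl} = 1 if $$dataref =~ /PERL/;
}

1;
```
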
Using a buffer to store data for a later use
Some filters cannot handle arbitrary data: for example, a filter that
lowercases tag names will apply a simple regex such as
"s/<\s*(\w+)([^>]*)>/<\L$1\E$2>/g". But the filter will fail if the
chunk of data contains a tag that is cut before the final ">".

It would be extremely complicated and error-prone to let each filter
(and its author) do its own buffering, so the HTTP::Proxy architecture
handles this too. The proxy passes to each filter, each time it is
called, a reference to an empty string ($buffer in the above signature)
that the filter can use to store some data for the next run.

When the reference is "undef", it means that the filter cannot store
any data, because this is the very last run, needed to gather all the
data left in all buffers.

It is recommended to store as little data as possible in the buffer, so
as to avoid (badly) reproducing what HTTP::Proxy::BodyFilter::complete
does.

In particular, you have to remember that all the data that remains in
the buffer after the last piece of data is received from the origin
server will be sent back to your filter in one big piece.

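The tag-lowercasing case described above could be sketched as
follows. The package name My::LowercaseTags is hypothetical, and the
rule used to detect a tag that was cut short (a trailing "<" with no
closing ">") is a simplification chosen for the example, not the
behaviour of any bundled filter.

```perl
package My::LowercaseTags;

use strict;
use warnings;
use base qw( HTTP::Proxy::BodyFilter );

sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

    # $buffer is undef on the very last run: nothing may be held back.
    # Otherwise, move a trailing unterminated tag into the buffer so
    # that it is handled with the data of the next run.
    if ( defined $buffer && $$dataref =~ s/(<[^>]*)\z// ) {
        $$buffer = $1;
    }

    # only complete tags are left in the data at this point
    $$dataref =~ s/<\s*(\w+)([^>]*)>/<\L$1\E$2>/g;
}

1;
```

Only the unterminated tail is stored, which keeps the buffer small, in
line with the recommendation above.
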
The store and forward approach
HTTP::Proxy implements a store and forward mechanism, for those filters
which need to have the whole message body to work. It's enabled simply
by pushing the HTTP::Proxy::BodyFilter::complete filter on the filter
stack.

The data is stored in memory by the "complete" filter, which passes it
on to the following filter once the full message body has been
received.

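A minimal sketch of this setup is shown below. The port number, the
$mark_body code reference and the comment it inserts are assumptions
made for the example; the guard on empty data is a defensive choice,
in case the filter is also called before the full body is available.

```perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::complete;
use HTTP::Proxy::BodyFilter::simple;

# hypothetical whole-body rewrite: mark the end of the HTML body
my $mark_body = sub {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
    return unless length $$dataref;    # nothing to do on empty data
    $$dataref =~ s{</body>}{<!-- filtered --></body>}i;
};

my $proxy = HTTP::Proxy->new( port => 3128 );

# "complete" is pushed first, so the filter pushed after it
# works on the whole message body
$proxy->push_filter(
    mime     => 'text/html',
    response => HTTP::Proxy::BodyFilter::complete->new,
    response => HTTP::Proxy::BodyFilter::simple->new($mark_body),
);

# $proxy->start;    # uncomment to run the proxy (this call blocks)
```

Since the whole body is held in memory, this approach is best kept
for message types that are known to be reasonably small.
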
Standard BodyFilters
The names of standard HTTP::Proxy::BodyFilter classes are lowercase.

The following BodyFilters are included in the HTTP::Proxy distribution:

lines
This filter makes sure that the next filter in the filter chain
will only receive complete lines. The "chunks" of data received by
the following filters will either end with "\n" or will be the last
piece of data for the current HTTP message body.

htmltext
This class lets you create a filter that runs a given code
reference against text included in an HTML document (outside
"<script>" and "<style>" tags). HTML entities are not included in
the text.

htmlparser
Creates a filter from an HTML::Parser object.

simple
This class lets you create a simple body filter from a code
reference.

save
Store the message body to a file.

complete
This filter stores the whole message body in memory, thus allowing
some actions to be taken only when the full page has been received
by the proxy.

tags
The HTTP::Proxy::BodyFilter::tags filter makes sure that the next
filter in the filter chain will only receive complete tags. The
current implementation is not 100% perfect, though.

Please read each filter's documentation for more details about their
use.

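As an example of combining the standard filters, the sketch below
puts "lines" in front of a "simple" filter, so that a line-oriented
edit never sees a line cut in two. The $trim code reference, the port
number and the choice of text/plain are assumptions made for the
example.

```perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::lines;
use HTTP::Proxy::BodyFilter::simple;

# hypothetical line-oriented edit: strip trailing spaces and tabs
my $trim = sub {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
    $$dataref =~ s/[ \t]+$//mg;
};

my $proxy = HTTP::Proxy->new( port => 3128 );

# "lines" is pushed first, so $trim only receives complete lines
$proxy->push_filter(
    mime     => 'text/plain',
    response => HTTP::Proxy::BodyFilter::lines->new,
    response => HTTP::Proxy::BodyFilter::simple->new($trim),
);

# $proxy->start;    # uncomment to run the proxy (this call blocks)
```

Without "lines", a chunk that happens to end in the middle of a line
could lose spaces that are not actually trailing.
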
AVAILABLE METHODS
Some methods are available to filters, so that they can, if needed, use
what little knowledge they have of HTTP::Proxy's internals. They are
mostly accessors.

proxy()
Gets a reference to the HTTP::Proxy object that owns the filter.
This gives access to some of the proxy methods.

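For instance, a filter might use proxy() to write to the proxy's log.
The sketch below assumes the log() method and the FILTERS log level
of HTTP::Proxy; the package name My::Logger and the "MYFILTER" prefix
are hypothetical. Please check the HTTP::Proxy documentation before
relying on the exact call.

```perl
package My::Logger;

use strict;
use warnings;
use HTTP::Proxy;
use base qw( HTTP::Proxy::BodyFilter );

sub filter {
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

    # do nothing if the filter is not attached to a proxy
    my $proxy = $self->proxy or return;

    # assumed call shape: log( $level, $prefix, $message )
    $proxy->log( HTTP::Proxy::FILTERS(), 'MYFILTER',
        'got ' . length($$dataref) . ' bytes' );
}

1;
```

Whether the line is actually written depends on the proxy's log mask.
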
AUTHOR
Philippe "BooK" Bruhat, <book@cpan.org>.

SEE ALSO
HTTP::Proxy, HTTP::Proxy::HeaderFilter.

COPYRIGHT
Copyright 2003-2015, Philippe Bruhat.

LICENSE
This module is free software; you can redistribute it or modify it
under the same terms as Perl itself.

perl v5.36.0                      2023-01-20        HTTP::Proxy::BodyFilter(3)