POE::Component::Client::HTTP(3)  User Contributed Perl Documentation  POE::Component::Client::HTTP(3)

NAME

       POE::Component::Client::HTTP - an HTTP user-agent component

SYNOPSIS

         use POE qw(Component::Client::HTTP);

         POE::Component::Client::HTTP->spawn(
           Agent     => 'SpiffCrawler/0.90',   # defaults to something long
           Alias     => 'ua',                  # defaults to 'weeble'
           From      => 'spiffster@perl.org',  # defaults to undef (no header)
           Protocol  => 'HTTP/0.9',            # defaults to 'HTTP/1.1'
           Timeout   => 60,                    # defaults to 180 seconds
           MaxSize   => 16384,                 # defaults to entire response
           Streaming => 4096,                  # defaults to 0 (off)
           FollowRedirects => 2,               # defaults to 0 (off)
           Proxy     => "http://localhost:80", # defaults to HTTP_PROXY env. variable
           NoProxy   => [ "localhost", "127.0.0.1" ], # defaults to NO_PROXY env. variable
           BindAddr  => "12.34.56.78",         # defaults to INADDR_ANY
         );

         $kernel->post(
           'ua',        # posts to the 'ua' alias
           'request',   # posts to ua's 'request' state
           'response',  # which of our states will receive the response
           $request,    # an HTTP::Request object
         );

         # This is the sub which is called when the session receives a
         # 'response' event.
         sub response_handler {
           my ($request_packet, $response_packet) = @_[ARG0, ARG1];

           # HTTP::Request
           my $request_object  = $request_packet->[0];

           # HTTP::Response
           my $response_object = $response_packet->[0];

           my $stream_chunk;
           if (! defined($response_object->content)) {
             $stream_chunk = $response_packet->[1];
           }

           print(
             "*" x 78, "\n",
             "*** my request:\n",
             "-" x 78, "\n",
             $request_object->as_string(),
             "*" x 78, "\n",
             "*** their response:\n",
             "-" x 78, "\n",
             $response_object->as_string(),
           );

           if (defined $stream_chunk) {
             print "-" x 40, "\n", $stream_chunk, "\n";
           }

           print "*" x 78, "\n";
         }

DESCRIPTION

       POE::Component::Client::HTTP is an HTTP user-agent for POE.  It lets
       other sessions run while HTTP transactions are being processed, and it
       lets several HTTP transactions be processed in parallel.

       It supports keep-alive through POE::Component::Client::Keepalive, which
       in turn uses POE::Component::Client::DNS for asynchronous name
       resolution.

       HTTP client components are not proper objects.  Instead of being
       created, as most objects are, they are "spawned" as separate sessions.
       To avoid confusion (and hopefully not cause other confusion), they must
       be spawned with a "spawn" method, not created anew with a "new" one.

CONSTRUCTOR

   spawn
       PoCo::Client::HTTP's "spawn" method takes a few named parameters:

       Agent => $user_agent_string
       Agent => \@list_of_agents
         If a User-Agent header is not present in the HTTP::Request, a random
         one will be used from those specified by the "Agent" parameter.  If
         none are supplied, POE::Component::Client::HTTP will advertise itself
         to the server.

         "Agent" may contain a reference to a list of user agents.  If this is
         the case, PoCo::Client::HTTP will choose one of them at random for
         each request.

       Alias => $session_alias
         "Alias" sets the name by which the session will be known.  If no
         alias is given, the component defaults to "weeble".  The alias lets
         several sessions interact with HTTP components without keeping (or
         even knowing) hard references to them.  It's possible to spawn
         several HTTP components with different names.

       ConnectionManager => $poco_client_keepalive
         "ConnectionManager" sets this component's connection pool manager.
         It expects the connection manager to be a reference to a
         POE::Component::Client::Keepalive object.  The HTTP client component
         will call "allocate()" on the connection manager itself, so the
         application should not call it beforehand.

           my $pool = POE::Component::Client::Keepalive->new(
             keep_alive    => 10, # seconds to keep connections alive
             max_open      => 100, # max concurrent connections - total
             max_per_host  => 20, # max concurrent connections - per host
             timeout       => 30, # max time (seconds) to establish a new connection
           );

           POE::Component::Client::HTTP->spawn(
             # ...
             ConnectionManager => $pool,
             # ...
           );

         See POE::Component::Client::Keepalive for more information.

       CookieJar => $cookie_jar
         "CookieJar" sets the component's cookie jar.  It expects the cookie
         jar to be a reference to an HTTP::Cookies object.

       From => $admin_address
         "From" holds an e-mail address where the client's administrator
         and/or maintainer may be reached.  It defaults to undef, which means
         no From header will be included in requests.

       MaxSize => OCTETS
         "MaxSize" specifies the largest response to accept from a server.
         The content of larger responses will be truncated to OCTETS octets.
         This has been used to return the <head></head> section of web pages
         without the need to wade through <body></body>.

       NoProxy => [ $host_1, $host_2, ..., $host_N ]
       NoProxy => "host1,host2,hostN"
         "NoProxy" specifies a list of server hosts that will not be proxied.
         It is useful for local hosts and hosts that do not properly support
         proxying.  If NoProxy is not specified, a list will be taken from the
         NO_PROXY environment variable.

           NoProxy => [ "localhost", "127.0.0.1" ],
           NoProxy => "localhost,127.0.0.1",
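
         The two forms carry the same information.  As a minimal sketch (not
         the component's own code), an application that reads host lists from
         its own configuration could normalize either form into a list before
         passing it to "spawn".  The helper name normalize_no_proxy is
         hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either an array reference or a
# comma-separated string and return a reference to a list of hosts.
sub normalize_no_proxy {
  my ($no_proxy) = @_;
  return [] unless defined $no_proxy;

  my @hosts = ref($no_proxy) eq 'ARRAY'
    ? @$no_proxy
    : split /\s*,\s*/, $no_proxy;

  # Trim surrounding whitespace on the copies, then drop empty entries.
  for my $host (@hosts) {
    next unless defined $host;
    $host =~ s/^\s+//;
    $host =~ s/\s+$//;
  }
  return [ grep { defined($_) && length($_) } @hosts ];
}

my $from_list   = normalize_no_proxy([ "localhost", "127.0.0.1" ]);
my $from_string = normalize_no_proxy("localhost,127.0.0.1");
print join(",", @$from_string), "\n";
```

         Either result could then be passed as NoProxy => $from_list.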

       BindAddr => $local_ip
         Specify "BindAddr" to bind all client sockets to a particular local
         address.  The value of BindAddr will be passed through
         POE::Component::Client::Keepalive to POE::Wheel::SocketFactory (as
         "bind_address").  See that module's documentation for implementation
         details.

           BindAddr => "12.34.56.78"

       Protocol => $http_protocol_string
         "Protocol" advertises the protocol that the client wishes to see.
         Under normal circumstances, it should be left to its default value:
         "HTTP/1.1".

       Proxy => [ $proxy_host, $proxy_port ]
       Proxy => $proxy_url
       Proxy => "$proxy_url,$proxy_url,..."
         "Proxy" specifies one or more proxy hosts that requests will be
         passed through.  If not specified, proxy servers will be taken from
         the HTTP_PROXY (or http_proxy) environment variable.  No proxying
         will occur unless Proxy is set or one of the environment variables
         exists.

         The proxy can be specified either as a host and port, or as one or
         more URLs.  Proxy URLs must specify the proxy port, even if it is 80.

           Proxy => [ "127.0.0.1", 80 ],
           Proxy => "http://127.0.0.1:80/",

         "Proxy" may specify multiple proxies separated by commas.
         PoCo::Client::HTTP will choose proxies from this list at random.
         This is useful for load balancing requests through multiple gateways.

           Proxy => "http://127.0.0.1:80/,http://127.0.0.1:81/",

       Streaming => OCTETS
         "Streaming" allows Client::HTTP to return large content in chunks
         (of OCTETS octets each) rather than combine the entire content into
         a single HTTP::Response object.

         By default, Client::HTTP reads the entire content for a response into
         memory before returning an HTTP::Response object.  This is obviously
         bad for applications like streaming MP3 clients, because they often
         fetch songs that never end.  Yes, they go on and on, my friend.

         When "Streaming" is set to nonzero, however, the response handler
         receives chunks of up to OCTETS octets apiece.  The response handler
         accepts slightly different parameters in this case.  The first
         element of the response packet (ARG1) is still an HTTP::Response
         object, but it does not contain response content.  The second
         element contains a chunk of raw response content, or undef if the
         stream has ended.

           sub streaming_response_handler {
             my $response_packet = $_[ARG1];
             my ($response, $data) = @$response_packet;
             print SAVED_STREAM $data if defined $data;
           }
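
         The handler above assumes that SAVED_STREAM is already open.  A
         minimal sketch of the same idea follows, assuming a hypothetical
         make_stream_writer helper that owns the file handle and closes it
         when the undef end-of-stream marker arrives.  Nothing here is part
         of the component itself.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that takes a response packet
# ([ $response, $chunk ]) and appends each chunk to $path.
sub make_stream_writer {
  my ($path) = @_;
  open my $fh, '>', $path or die "Cannot open $path: $!";
  binmode $fh;
  my $bytes = 0;

  return sub {
    my ($response_packet) = @_;
    my ($response, $chunk) = @$response_packet;

    if (defined $chunk) {
      print {$fh} $chunk or die "Cannot write to $path: $!";
      $bytes += length $chunk;
      return;            # stream still in progress
    }

    # An undefined chunk marks the end of the stream.
    close $fh or die "Cannot close $path: $!";
    return $bytes;       # total octets written
  };
}
```

         Inside a streaming response handler, the closure would be called as
         $writer->($_[ARG1]) for every event.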

       FollowRedirects => $number_of_hops_to_follow
         "FollowRedirects" specifies how many redirects (e.g. 302 Moved) to
         follow.  If not specified, it defaults to 0, and thus no redirection
         is followed.  This maintains compatibility with the previous
         behavior, which was not to follow redirects at all.

         If redirects are followed, a response chain should be built, and can
         be accessed through $response_object->previous().  See HTTP::Response
         for details.

       Timeout => $query_timeout
         "Timeout" sets how long POE::Component::Client::HTTP has to process
         an application's request, in seconds.  "Timeout" defaults to 180
         (three minutes) if not specified.

         It's important to note that the timeout begins when the component
         receives an application's request, not when it attempts to connect to
         the web server.

         Timeouts may result from sending the component too many requests at
         once.  Each request would need to be received and tracked in order.
         Consider this:

           $_[KERNEL]->post(component => request => ...) for (1..15_000);

         15,000 requests are queued together in one enormous bolus.  The
         component would receive and initialize them in order.  The first
         socket activity wouldn't arrive until the 15,000th request was set
         up.  If that took longer than "Timeout", then the requests that have
         waited too long would fail.

         "ConnectionManager"'s own timeout and concurrency limits also affect
         how many requests may be processed at once.  For example, most of the
         15,000 requests would wait in the connection manager's pool until
         sockets become available.  Meanwhile, the "Timeout" would be counting
         down.

         Applications may elect to control concurrency outside the component's
         "Timeout".  They may do so in a few ways.

         The easiest way is to limit the initial number of requests to
         something more manageable.  As responses arrive, the application
         should handle them and start new requests.  This limits concurrency
         to the initial request count.
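
         This first approach can be sketched without any POE-specific code.
         It is only an illustration under stated assumptions: make_throttle
         and its start callback are hypothetical names.  In a real
         application the callback would post a request to the component, and
         the response handler would call the returned closure once per
         response.

```perl
use strict;
use warnings;

# Hypothetical helper: keep at most $limit requests in flight.
# $start is called with each queued item when it is time to issue it.
sub make_throttle {
  my ($limit, $start, @queue) = @_;
  my $in_flight = 0;

  # Start queued requests until the limit is reached or the queue is empty.
  my $fill = sub {
    while ($in_flight < $limit && @queue) {
      $in_flight++;
      $start->(shift @queue);
    }
  };

  $fill->();    # issue the initial batch

  # Call the returned closure once per response received.
  return sub {
    $in_flight-- if $in_flight > 0;
    $fill->();
    return $in_flight;    # requests still outstanding
  };
}

my @started;
my $on_response = make_throttle(2, sub { push @started, $_[0] }, qw(a b c));

# Two requests start immediately; "c" waits until a response arrives.
$on_response->();
```

         The limit passed to make_throttle plays the role of the "initial
         request count" described above.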

         An application may also outsource job throttling to another module,
         such as POE::Component::JobQueue.

         In any case, "Timeout" and "ConnectionManager" may be tuned to
         maximize timeouts and concurrency limits.  This may help in some
         cases.  Developers should be aware that doing so will increase memory
         usage.  POE::Component::Client::HTTP and
         POE::Component::Client::Keepalive track requests in memory, while
         applications are free to keep pending requests on disk.

ACCEPTED EVENTS

       Sessions communicate asynchronously with PoCo::Client::HTTP.  They post
       requests to it, and it posts responses back.

   request
       Requests are posted to the component's "request" state.  They include
       an HTTP::Request object which defines the request.  For example:

         $kernel->post(
           'ua', 'request',           # http session alias & state
           'response',                # my state to receive responses
           GET 'http://poe.perl.org', # a simple HTTP request
                                      # (GET is from HTTP::Request::Common)
           'unique id',               # a tag to identify the request
           'progress',                # an event to indicate progress
           'http://1.2.3.4:80/'       # proxy to use for this request
         );

       Requests include the state to which responses will be posted.  In the
       previous example, the handler for a 'response' state will be called
       with each HTTP response.  The "progress" handler is optional; if one
       is installed, the component will provide progress metrics (see the
       sample progress handler under SENT EVENTS).  The "proxy" parameter is
       also optional; if it is not defined, a default proxy will be used if
       configured.  No proxy will be used if neither a default one nor a
       "proxy" parameter is defined.

   pending_requests_count
       There's also a pending_requests_count state that returns the number of
       requests currently being processed.  To receive the return value, it
       must be invoked with $kernel->call().

         my $count = $kernel->call('ua' => 'pending_requests_count');

   cancel
       Cancel a specific HTTP request.  Requires a reference to the original
       request (blessed or stringified) so it knows which one to cancel.  See
       "progress handler" below for notes on canceling streaming requests.

       To cancel a request based on its blessed HTTP::Request object:

         $kernel->post( component => cancel => $http_request );

       To cancel a request based on its stringified HTTP::Request object:

         $kernel->post( component => cancel => "$http_request" );

   shutdown
       Responds to all pending requests with 408 (request timeout), and then
       shuts down the component and all subcomponents.

SENT EVENTS

   response handler
       In addition to all the usual POE parameters, HTTP responses come with
       two list references:

         my ($request_packet, $response_packet) = @_[ARG0, ARG1];

       $request_packet contains a reference to the original HTTP::Request
       object.  This is useful for matching responses back to the requests
       that generated them.

         my $http_request_object = $request_packet->[0];
         my $http_request_tag    = $request_packet->[1]; # from the 'request' post

       $response_packet contains a reference to the resulting HTTP::Response
       object.

         my $http_response_object = $response_packet->[0];

       Please see the HTTP::Request and HTTP::Response manpages for more
       information.

   progress handler
       The example progress handler shows how to calculate a percentage of
       download completion.

         sub progress_handler {
           my $gen_args  = $_[ARG0];    # args passed to all calls
           my $call_args = $_[ARG1];    # args specific to the call

           my $req = $gen_args->[0];    # HTTP::Request object being serviced
           my $tag = $gen_args->[1];    # Request ID tag from the 'request' post.
           my $got = $call_args->[0];   # Number of bytes retrieved so far.
           my $tot = $call_args->[1];   # Total bytes to be retrieved.
           my $oct = $call_args->[2];   # Chunk of raw octets received this time.

           my $percent = $got / $tot * 100;

           printf(
             "-- %.0f%% [%d/%d]: %s\n", $percent, $got, $tot, $req->uri()
           );

           # To cancel the request:
           # $_[KERNEL]->post( component => cancel => $req );
         }
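
       The handler above divides by $tot, which assumes the total size is
       known and nonzero.  A minimal sketch of a guard follows.  It assumes
       (the documentation does not confirm this) that the total may be
       undefined or zero when a server does not report a content length.
       percent_complete is a hypothetical helper, not part of the component.

```perl
use strict;
use warnings;

# Hypothetical helper: return a percentage, or undef when the total
# size is unknown.
sub percent_complete {
  my ($got, $tot) = @_;
  return undef unless defined($tot) && $tot > 0;
  my $percent = $got / $tot * 100;
  return $percent > 100 ? 100 : $percent;   # clamp in case $got overshoots
}

my $percent = percent_complete(512, 2048);
printf "-- %s\n", defined $percent ? sprintf("%.0f%%", $percent) : "unknown";
```

       A progress handler could call percent_complete($got, $tot) and print
       "unknown" instead of dying on a division by zero.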

       DEPRECATION WARNING

       The third return argument (the raw octets received) has been
       deprecated.  Instead of it, use the Streaming parameter to get chunks
       of content in the response handler.

REQUEST CALLBACKS

       The HTTP::Request object passed to the request event can contain a CODE
       reference as "content".  This allows for sending large files without
       wasting memory.  Your callback should return a chunk of data each time
       it is called, and an empty string when done.  Don't forget to set the
       Content-Length header correctly.  Example:

         my $request = HTTP::Request->new( PUT => 'http://...' );

         my $file = '/path/to/large_file';

         # Fail early if the file cannot be read; read it as raw octets.
         open my $fh, '<', $file or die "Cannot open $file: $!";
         binmode $fh;

         # Return the next chunk, or an empty string at end of file.
         my $upload_cb = sub {
           if ( sysread $fh, my $buf, 4096 ) {
             return $buf;
           }
           else {
             close $fh;
             return '';
           }
         };

         $request->content_length( -s $file );

         $request->content( $upload_cb );

         $kernel->post( ua => 'request', 'response', $request );

CONTENT ENCODING AND COMPRESSION

       Transparent content decoding has been disabled as of version 0.84.
       This also removes support for transparent gzip requesting and
       decompression.

       To re-enable gzip compression, request it with the Accept-Encoding
       header and use HTTP::Response's decoded_content() method rather than
       content():

         my $request = HTTP::Request->new(
           GET => "http://www.yahoo.com/", [
             'Accept-Encoding' => 'gzip'
           ]
         );

         # ... time passes ...

         my $content = $response->decoded_content();

       The change in POE::Component::Client::HTTP behavior was prompted by
       changes in HTTP::Response that surfaced a bug in the component's
       transparent gzip handling.

       Allowing the application to specify and handle content encodings seems
       to be the most reliable and flexible resolution.

       For more information about the problem and discussions regarding the
       solution, see: <http://www.perlmonks.org/?node_id=683833> and
       <http://rt.cpan.org/Ticket/Display.html?id=35538>

CLIENT HEADERS

       POE::Component::Client::HTTP sets its own response headers with
       additional information.  All of its headers begin with "X-PCCH".

   X-PCCH-Peer
       X-PCCH-Peer contains the remote IPv4 address and port, separated by a
       period.  For example, "127.0.0.1.8675" represents port 8675 on
       localhost.
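
       Because the address and port are joined by a period, the port is the
       text after the last period.  A minimal sketch of splitting the header
       value follows, assuming an IPv4 peer as described above.
       parse_pcch_peer is a hypothetical helper, not part of the component.

```perl
use strict;
use warnings;

# Hypothetical helper: split "a.b.c.d.port" into (address, port).
# Returns an empty list if the value does not look like IPv4 plus port.
sub parse_pcch_peer {
  my ($peer) = @_;
  return unless defined $peer;
  return unless $peer =~ /\A(\d{1,3}(?:\.\d{1,3}){3})\.(\d+)\z/;
  return ($1, $2);
}

my ($address, $port) = parse_pcch_peer("127.0.0.1.8675");
print "address=$address port=$port\n";
```

       In a response handler, the value would come from
       $response->header('X-PCCH-Peer').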

       Proxying will render X-PCCH-Peer nearly useless, since the socket will
       be connected to a proxy rather than the server itself.

       This feature was added at Doreen Grey's request.  Doreen wanted a means
       to find the remote server's address without having to make an
       additional request.

       Patches for IPv6 support are welcome.

ENVIRONMENT

       POE::Component::Client::HTTP uses two standard environment variables:
       HTTP_PROXY and NO_PROXY.

       HTTP_PROXY sets the proxy server that Client::HTTP will forward
       requests through.  NO_PROXY sets a list of hosts that will not be
       forwarded through a proxy.

       See the Proxy and NoProxy constructor parameters for more information
       about these variables.

SEE ALSO

       This component is built upon HTTP::Request, HTTP::Response, and POE.
       Please see its source code and the documentation for its foundation
       modules to learn more.  If you want to use cookies, you'll need to read
       about HTTP::Cookies as well.

       Also see the test program, t/01_request.t, in the PoCo::Client::HTTP
       distribution.

BUGS

       There is no support for CGI_PROXY or CgiProxy.

       Secure HTTP (https) proxying is not supported at this time.

       There is no object oriented interface.  See
       POE::Component::Client::Keepalive and POE::Component::Client::DNS for
       examples of a decent OO interface.

AUTHOR, COPYRIGHT, & LICENSE

       POE::Component::Client::HTTP is

       · Copyright 1999-2009 Rocco Caputo

       · Copyright 2004 Rob Bloodgood

       · Copyright 2004-2005 Martijn van Beers

       All rights are reserved.  POE::Component::Client::HTTP is free
       software; you may redistribute it and/or modify it under the same terms
       as Perl itself.

CONTRIBUTORS

       Joel Bernstein solved some nasty race conditions.  Portugal Telecom
       <http://www.sapo.pt/> was kind enough to support his contributions.

       Jeff Bisbee added POD tests to version 0.79, along with the
       documentation needed to pass several of them.  He's a
       kwalitee-increasing machine!

BUG TRACKER

       https://rt.cpan.org/Dist/Display.html?Queue=POE-Component-Client-HTTP

REPOSITORY

       http://github.com/rcaputo/poe-component-client-http
       http://gitorious.org/poe-component-client-http

OTHER RESOURCES

       http://search.cpan.org/dist/POE-Component-Client-HTTP/

perl v5.12.0                      2010-02-15   POE::Component::Client::HTTP(3)