LWP::UserAgent(3)     User Contributed Perl Documentation    LWP::UserAgent(3)

NAME

       LWP::UserAgent - Web user agent class

SYNOPSIS

        require LWP::UserAgent;

        my $ua = LWP::UserAgent->new;
        $ua->timeout(10);
        $ua->env_proxy;

        my $response = $ua->get('http://search.cpan.org/');

        if ($response->is_success) {
            print $response->content;  # or whatever
        }
        else {
            die $response->status_line;
        }

DESCRIPTION

       The "LWP::UserAgent" is a class implementing a web user agent.
       "LWP::UserAgent" objects can be used to dispatch web requests.

       In normal use the application creates an "LWP::UserAgent" object, and
       then configures it with values for timeouts, proxies, name, etc. It
       then creates an instance of "HTTP::Request" for the request that needs
       to be performed. This request is then passed to one of the request
       methods of the UserAgent, which dispatches it using the relevant
       protocol, and returns a "HTTP::Response" object.  There are convenience
       methods for sending the most common request types: get(), head() and
       post().  When using these methods, the creation of the request object
       is hidden, as shown in the synopsis above.
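
       The explicit form that the convenience methods hide can be sketched
       as follows.  This is a minimal illustration, not the only way to do
       it; the URL and the "Accept" header value are placeholders chosen for
       the example.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->timeout(10);

# Build the request object explicitly instead of calling $ua->get().
# The URL and header value are placeholders for illustration only.
my $request = HTTP::Request->new(GET => 'http://www.example.com/');
$request->header('Accept' => 'text/html');

# request() dispatches the object and hands back an HTTP::Response.
my $response = $ua->request($request);
print $response->status_line, "\n";
```

       The response is inspected exactly as in the synopsis, for example
       with is_success() and content().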

       The basic approach of the library is to use HTTP style communication
       for all protocol schemes.  This means that you will construct
       "HTTP::Request" objects and receive "HTTP::Response" objects even for
       non-HTTP resources like gopher and ftp.  In order to achieve even more
       similarity to HTTP style communications, gopher menus and file
       directories are converted to HTML documents.

CONSTRUCTOR METHODS

       The following constructor methods are available:

       $ua = LWP::UserAgent->new( %options )
           This method constructs a new "LWP::UserAgent" object and returns
           it.  Key/value pair arguments may be provided to set up the initial
           state.  The following options correspond to attribute methods
           described below:

              KEY                     DEFAULT
              -----------             --------------------
              agent                   "libwww-perl/#.##"
              from                    undef
              conn_cache              undef
              cookie_jar              undef
              default_headers         HTTP::Headers->new
              max_size                undef
              max_redirect            7
              parse_head              1
              protocols_allowed       undef
              protocols_forbidden     undef
              requests_redirectable   ['GET', 'HEAD']
              timeout                 180

           The following additional options are also accepted: If the
           "env_proxy" option is passed in with a TRUE value, then proxy
           settings are read from environment variables (see env_proxy()
           method below).  If the "keep_alive" option is passed in, then a
           "LWP::ConnCache" is set up (see conn_cache() method below).  The
           "keep_alive" value is passed on as the "total_capacity" for the
           connection cache.
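
           A short sketch of passing options to the constructor.  The values
           are arbitrary examples rather than recommendations.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Option values here are arbitrary examples.
my $ua = LWP::UserAgent->new(
    timeout      => 30,   # seconds of inactivity before giving up
    max_redirect => 3,    # follow at most three redirects
    keep_alive   => 4,    # passed on as total_capacity of the cache
);

# Each option can be read back through the attribute method of the
# same name; keep_alive shows up as a connection cache object.
printf "timeout=%d max_redirect=%d\n", $ua->timeout, $ua->max_redirect;
print "connection cache: ", ref($ua->conn_cache), "\n";
```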

       $ua->clone
           Returns a copy of the LWP::UserAgent object.

ATTRIBUTES

       The settings of the configuration attributes modify the behaviour of
       the "LWP::UserAgent" when it dispatches requests.  Most of these can
       also be initialized by options passed to the constructor method.

       The following attribute methods are provided.  The attribute value is
       left unchanged if no argument is given.  The return value from each
       method is the old attribute value.

       $ua->agent
       $ua->agent( $product_id )
           Get/set the product token that is used to identify the user agent
           on the network.  The agent value is sent as the "User-Agent" header
           in the requests.  The default is the string returned by the
           _agent() method (see below).

           If the $product_id ends with a space then the _agent() string is
           appended to it.

           The user agent string should be one or more simple product
           identifiers with an optional version number separated by the "/"
           character.  Examples are:

             $ua->agent('Checkbot/0.4 ' . $ua->_agent);
             $ua->agent('Checkbot/0.4 ');    # same as above
             $ua->agent('Mozilla/5.0');
             $ua->agent("");                 # don't identify

       $ua->_agent
           Returns the default agent identifier.  This is a string of the form
           "libwww-perl/#.##", where "#.##" is substituted with the version
           number of this library.

       $ua->from
       $ua->from( $email_address )
           Get/set the e-mail address for the human user who controls the
           requesting user agent.  The address should be machine-usable, as
           defined in RFC 822.  The "from" value is sent as the "From" header
           in the requests.  Example:

             $ua->from('gaas@cpan.org');

           The default is to not send a "From" header.  See the
           default_headers() method for the more general interface that allows
           any header to be defaulted.

       $ua->cookie_jar
       $ua->cookie_jar( $cookie_jar_obj )
           Get/set the cookie jar object to use.  The only requirement is that
           the cookie jar object must implement the extract_cookies($response)
           and add_cookie_header($request) methods.  These methods will then
           be invoked by the user agent as requests are sent and responses are
           received.  Normally this will be a "HTTP::Cookies" object or some
           subclass.

           The default is to have no cookie_jar, i.e. never automatically add
           "Cookie" headers to the requests.

           Shortcut: If a reference to a plain hash is passed in as the
           $cookie_jar_obj, then it is replaced with an instance of
           "HTTP::Cookies" that is initialized based on the hash.  This form
           also automatically loads the "HTTP::Cookies" module.  It means
           that:

             $ua->cookie_jar({ file => "$ENV{HOME}/.cookies.txt" });

           is really just a shortcut for:

             require HTTP::Cookies;
             $ua->cookie_jar(HTTP::Cookies->new(file => "$ENV{HOME}/.cookies.txt"));

       $ua->default_headers
       $ua->default_headers( $headers_obj )
           Get/set the headers object that will provide default header values
           for any requests sent.  By default this will be an empty
           "HTTP::Headers" object.  Example:

             $ua->default_headers->push_header('Accept-Language' => "no, en");

       $ua->default_header( $field )
       $ua->default_header( $field => $value )
           This is just a short-cut for
           $ua->default_headers->header( $field => $value ).  Example:

             $ua->default_header('Accept-Language' => "no, en");

       $ua->conn_cache
       $ua->conn_cache( $cache_obj )
           Get/set the "LWP::ConnCache" object to use.  See LWP::ConnCache for
           details.

       $ua->credentials( $netloc, $realm, $uname, $pass )
           Set the user name and password to be used for a realm.  It is often
           more useful to specialize the get_basic_credentials() method
           instead.
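
           A minimal sketch of storing credentials and reading them back
           through the base get_basic_credentials() method.  The host, realm,
           user name and password are made-up placeholders, and the
           "host:port" form of $netloc is an assumption based on how the base
           implementation looks credentials up.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

my $ua = LWP::UserAgent->new;

# Host, port, realm and login below are placeholders.
$ua->credentials('www.example.com:80', 'Members Only', 'alice', 's3cret');

# The base get_basic_credentials() finds the pair again by host:port and
# realm; request() consults it when a server asks for authentication.
my ($user, $pass) = $ua->get_basic_credentials(
    'Members Only', URI->new('http://www.example.com/private/'), 0);
print "user=$user\n";
```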

       $ua->max_size
       $ua->max_size( $bytes )
           Get/set the size limit for response content.  The default is
           "undef", which means that there is no limit.  If the returned
           response content is only partial, because the size limit was
           exceeded, then a "Client-Aborted" header will be added to the
           response.  The content might end up longer than "max_size" as we
           abort once appending a chunk of data makes the length exceed the
           limit.  The "Content-Length" header, if present, will indicate the
           length of the full content and will normally not be the same as
           "length($res->content)".
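
           The truncation behaviour can be observed with a sketch like the
           one below.  To stay independent of the network it fetches a
           temporary local file through a file: URL; the file size and the
           limit are arbitrary values picked for the illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempfile);
use URI::file;

# Create a 20000-byte local file so the sketch needs no network access.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} 'x' x 20_000;
close $fh;

my $ua       = LWP::UserAgent->new(max_size => 1_000);
my $response = $ua->get(URI::file->new_abs($path));

# A truncated body is flagged with the Client-Aborted header.  The content
# may be somewhat longer than max_size because whole chunks are appended.
if (defined $response->header('Client-Aborted')) {
    printf "truncated after %d bytes\n", length $response->content;
}
```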

       $ua->max_redirect
       $ua->max_redirect( $n )
           This reads or sets the object's limit of how many times it will
           obey redirection responses in a given request cycle.

           By default, the value is 7. This means that if you call the
           request() method and the response is a redirect elsewhere which is
           in turn a redirect, and so on seven times, then LWP gives up after
           that seventh request.

       $ua->parse_head
       $ua->parse_head( $boolean )
           Get/set a value indicating whether we should initialize response
           headers from the <head> section of HTML documents. The default is
           TRUE.  Do not turn this off, unless you know what you are doing.

       $ua->protocols_allowed
       $ua->protocols_allowed( \@protocols )
           This reads (or sets) this user agent's list of protocols that the
           request methods will exclusively allow.  The protocol names are
           case insensitive.

           For example: "$ua->protocols_allowed( [ 'http', 'https'] );" means
           that this user agent will allow only those protocols, and attempts
           to use this user agent to access URLs with any other schemes (like
           "ftp://...") will result in a 500 error.

           To delete the list, call: "$ua->protocols_allowed(undef)"

           By default, an object has neither a "protocols_allowed" list, nor a
           "protocols_forbidden" list.

           Note that having a "protocols_allowed" list causes any
           "protocols_forbidden" list to be ignored.

       $ua->protocols_forbidden
       $ua->protocols_forbidden( \@protocols )
           This reads (or sets) this user agent's list of protocols that the
           request methods will not allow. The protocol names are case
           insensitive.

           For example: "$ua->protocols_forbidden( [ 'file', 'mailto'] );"
           means that this user agent will not allow those protocols, and
           attempts to use this user agent to access URLs with those schemes
           will result in a 500 error.

           To delete the list, call: "$ua->protocols_forbidden(undef)"
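
           A brief sketch of how a protocol restriction surfaces to the
           caller.  The ftp URL is a placeholder; with the restriction in
           place the user agent is expected to refuse the request itself.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->protocols_allowed(['http', 'https']);

# The ftp scheme is not on the list, so the attempt comes back as a
# 500 response rather than as an exception.
my $response = $ua->get('ftp://ftp.example.com/pub/README');
print $response->status_line, "\n";

# Removing the list lifts the restriction again.
$ua->protocols_allowed(undef);
```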

       $ua->requests_redirectable
       $ua->requests_redirectable( \@requests )
           This reads or sets the object's list of request names that
           "$ua->redirect_ok(...)" will allow redirection for.  By default,
           this is "['GET', 'HEAD']", as per RFC 2616.  To change to include
           'POST', consider:

              push @{ $ua->requests_redirectable }, 'POST';

       $ua->timeout
       $ua->timeout( $secs )
           Get/set the timeout value in seconds. The default timeout() value
           is 180 seconds, i.e. 3 minutes.

           The request is aborted if no activity on the connection to the
           server is observed for "timeout" seconds.  This means that the time
           it takes for the complete transaction and the request() method to
           actually return might be longer.

       Proxy attributes

       The following methods set up when requests should be passed via a proxy
       server.

       $ua->proxy(\@schemes, $proxy_url)
       $ua->proxy($scheme, $proxy_url)
           Set/retrieve proxy URL for a scheme:

            $ua->proxy(['http', 'ftp'], 'http://proxy.sn.no:8001/');
            $ua->proxy('gopher', 'http://proxy.sn.no:8001/');

           The first form specifies that the URL is to be used for proxying
           the access schemes listed in the first method argument, i.e. 'http'
           and 'ftp'.

           The second form is a shorthand for specifying the proxy URL for a
           single access scheme.
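
           The "retrieve" half can be sketched as follows, assuming that
           calling proxy() with only a scheme returns the current setting for
           that scheme.  The proxy host is a placeholder.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# proxy.example.com is a placeholder host.
$ua->proxy(['http', 'ftp'], 'http://proxy.example.com:8001/');

# Calling proxy() with only a scheme reads the current setting back.
my $http_proxy = $ua->proxy('http');
my $ftp_proxy  = $ua->proxy('ftp');
print "http via: $http_proxy\n";
print "ftp via:  $ftp_proxy\n";
```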

       $ua->no_proxy( $domain, ... )
           Do not proxy requests to the given domains.  Calling no_proxy
           without any domains clears the list of domains.  For example:

            $ua->no_proxy('localhost', 'no', ...);

       $ua->env_proxy
           Load proxy settings from *_proxy environment variables.  You might
           specify proxies like this (sh-syntax):

             gopher_proxy=http://proxy.my.place/
             wais_proxy=http://proxy.my.place/
             no_proxy="localhost,my.domain"
             export gopher_proxy wais_proxy no_proxy

           csh or tcsh users should use the "setenv" command to define these
           environment variables.

           On systems with case insensitive environment variables there exists
           a name clash between the CGI environment variables and the
           "HTTP_PROXY" environment variable normally picked up by
           env_proxy().  Because of this "HTTP_PROXY" is not honored for CGI
           scripts.  The "CGI_HTTP_PROXY" environment variable can be used
           instead.

REQUEST METHODS

       The methods described in this section are used to dispatch requests via
       the user agent.  The following request methods are provided:

       $ua->get( $url )
       $ua->get( $url , $field_name => $value, ... )
           This method will dispatch a "GET" request on the given $url.
           Further arguments can be given to initialize the headers of the
           request. These are given as separate name/value pairs.  The return
           value is a response object.  See HTTP::Response for a description
           of the interface it provides.

           Field names that start with ":" are special.  These will not
           initialize headers of the request but will determine how the
           response content is treated.  The following special field names are
           recognized:

               :content_file   => $filename
               :content_cb     => \&callback
               :read_size_hint => $bytes
320           If a $filename is provided with the ":content_file" option, then
321           the response content will be saved here instead of in the response
322           object.  If a callback is provided with the ":content_cb" option
323           then this function will be called for each chunk of the response
324           content as it is received from the server.  If neither of these
325           options are given, then the response content will accumulate in the
326           response object itself.  This might not be suitable for very large
327           response bodies.  Only one of ":content_file" or ":content_cb" can
328           be specified.  The content of unsuccessful responses will always
329           accumulate in the response object itself, regardless of the ":con‐
330           tent_file" or ":content_cb" options passed in.
331
332           The ":read_size_hint" option is passed to the protocol module which
333           will try to read data from the server in chunks of this size.  A
334           smaller value for the ":read_size_hint" will result in a higher
335           number of callback invocations.
336
337           The callback function is called with 3 arguments: a chunk of data,
338           a reference to the response object, and a reference to the protocol
339           object.  The callback can abort the request by invoking die().  The
340           exception message will show up as the "X-Died" header field in the
341           response returned by the get() function.
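
           A sketch of the ":content_cb" form described above.  To avoid
           depending on a remote server it reads a temporary local file
           through a file: URL; the file size and the read size hint are
           arbitrary values for the illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempfile);
use URI::file;

# A 10000-byte local file stands in for a large remote document.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} 'x' x 10_000;
close $fh;

my $ua = LWP::UserAgent->new;
my ($chunks, $bytes) = (0, 0);

my $response = $ua->get(
    URI::file->new_abs($path),
    ':content_cb' => sub {
        # Arguments: the data chunk, the response and the protocol object.
        my ($chunk, $res, $protocol) = @_;
        $chunks++;
        $bytes += length $chunk;
    },
    ':read_size_hint' => 1024,
);

print "received $bytes bytes in $chunks chunk(s)\n";
```

           Because the callback consumes the data, the body is not kept in
           the response object in this form.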

       $ua->head( $url )
       $ua->head( $url , $field_name => $value, ... )
           This method will dispatch a "HEAD" request on the given $url.
           Otherwise it works like the get() method described above.

       $ua->post( $url, \%form )
       $ua->post( $url, \@form )
       $ua->post( $url, \%form, $field_name => $value, ... )
           This method will dispatch a "POST" request on the given $url, with
           %form or @form providing the key/value pairs for the fill-in form
           content. Additional headers and content options are the same as for
           the get() method.

           This method will use the POST() function from
           "HTTP::Request::Common" to build the request.  See
           HTTP::Request::Common for details on how to pass form content and
           other advanced features.
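
           To see what such a request looks like before it is sent, the
           POST() function can be called directly, as in this sketch.  The
           URL and the form field names are placeholders.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# The URL and field names are placeholders.  An array reference keeps
# the fields in the order given.
my $request = POST 'http://www.example.com/search',
    [ q => 'perl', lang => 'en' ];

print $request->method, " ", $request->uri, "\n";
print $request->header('Content-Type'), "\n";
print $request->content, "\n";
```

           Passing the same URL and form to $ua->post() builds an equivalent
           request and dispatches it in one step.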

       $ua->mirror( $url, $filename )
           This method will get the document identified by $url and store it
           in a file called $filename.  If the file already exists, then the
           request will contain an "If-Modified-Since" header matching the
           modification time of the file.  If the document on the server has
           not changed since this time, then nothing happens.  If the document
           has been updated, it will be downloaded again.  The modification
           time of the file will be forced to match that of the server.

           The return value is the response object.
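
           The two cases can be sketched like this.  A temporary local file
           reached through a file: URL stands in for the server document so
           that the example does not need network access; the file names are
           invented for the illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempdir);
use File::Spec;
use URI::file;

# A local file stands in for the remote document.
my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'source.txt');
my $copy   = File::Spec->catfile($dir, 'copy.txt');
open my $out, '>', $source or die "Cannot write $source: $!";
print {$out} "hello mirror\n";
close $out;

my $ua  = LWP::UserAgent->new;
my $url = URI::file->new_abs($source);

my $first  = $ua->mirror($url, $copy);   # copy does not exist yet
my $second = $ua->mirror($url, $copy);   # sent with If-Modified-Since

print "first:  ", $first->status_line,  "\n";
print "second: ", $second->status_line, "\n";
```

           An unchanged document is normally reported with a 304 status on
           the second call, which is_success() does not count as a success.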

       $ua->request( $request )
       $ua->request( $request, $content_file )
       $ua->request( $request, $content_cb )
       $ua->request( $request, $content_cb, $read_size_hint )
           This method will dispatch the given $request object.  Normally this
           will be an instance of the "HTTP::Request" class, but any object
           with a similar interface will do.  The return value is a response
           object.  See HTTP::Request and HTTP::Response for a description of
           the interface provided by these classes.

           The request() method will process redirects and authentication
           responses transparently.  This means that it may actually send
           several simple requests via the simple_request() method described
           below.

           The request methods described above, get(), head(), post() and
           mirror(), will all dispatch the request they build via this method.
           They are convenience methods that simply hide the creation of the
           request object for you.

           The $content_file, $content_cb and $read_size_hint all correspond
           to options described with the get() method above.

           You are allowed to use a CODE reference as "content" in the request
           object passed in.  The "content" function should return the content
           when called.  The content can be returned in chunks.  The content
           function will be invoked repeatedly until it returns an empty
           string to signal that there is no more content.
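
           A sketch of the $content_file form.  As a stand-in for a remote
           document it fetches a temporary local file through a file: URL;
           all file names are invented for the example.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use File::Temp qw(tempdir);
use File::Spec;
use URI::file;

my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'source.txt');
my $saved  = File::Spec->catfile($dir, 'saved.txt');
open my $out, '>', $source or die "Cannot write $source: $!";
print {$out} "payload\n";
close $out;

my $ua      = LWP::UserAgent->new;
my $request = HTTP::Request->new(GET => URI::file->new_abs($source));

# The second argument is a plain file name: the body of a successful
# response is written to that file instead of the response object.
my $response = $ua->request($request, $saved);
print $response->status_line, "\n";
print "saved ", (-s $saved), " bytes\n";
```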

       $ua->simple_request( $request )
       $ua->simple_request( $request, $content_file )
       $ua->simple_request( $request, $content_cb )
       $ua->simple_request( $request, $content_cb, $read_size_hint )
           This method dispatches a single request and returns the response
           received.  Arguments are the same as for request() described above.

           The difference from request() is that simple_request() will not try
           to handle redirects or authentication responses.  The request()
           method will in fact invoke this method for each simple request it
           sends.

       $ua->is_protocol_supported( $scheme )
           You can use this method to test whether this user agent object
           supports the specified "scheme".  (The "scheme" might be a string
           (like 'http' or 'ftp') or it might be a URI object reference.)

           Whether a scheme is supported is determined by the user agent's
           "protocols_allowed" or "protocols_forbidden" lists (if any), and by
           the capabilities of LWP.  I.e., this will return TRUE only if LWP
           supports this protocol and it's permitted for this particular
           object.
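
           A small sketch showing how a per-object restriction changes the
           answer.  Forbidding 'http' is done only to make the effect
           visible.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# With no restriction lists the answer depends only on what LWP can do.
my $before = $ua->is_protocol_supported('http');

# Forbidding the scheme on this object changes the answer for this
# object, even though LWP itself still implements the protocol.
$ua->protocols_forbidden(['http']);
my $after = $ua->is_protocol_supported('http');

print "before: ", ($before ? "yes" : "no"), "\n";
print "after:  ", ($after  ? "yes" : "no"), "\n";
```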

       Callback methods

       The following methods will be invoked as requests are processed. These
       methods are documented here because subclasses of "LWP::UserAgent"
       might want to override their behaviour.

       $ua->prepare_request( $request )
           This method is invoked by simple_request().  Its task is to modify
           the given $request object by setting up various headers based on
           the attributes of the user agent. The return value should normally
           be the $request object passed in.  If a different request object is
           returned it will be the one actually processed.

           The headers affected by the base implementation are: "User-Agent",
           "From", "Range" and "Cookie".

       $ua->redirect_ok( $prospective_request, $response )
           This method is called by request() before it tries to follow a
           redirection to the request in $response.  This should return a TRUE
           value if this redirection is permissible.  The $prospective_request
           will be the request to be sent if this method returns TRUE.

           The base implementation will return FALSE unless the method is in
           the object's "requests_redirectable" list, FALSE if the proposed
           redirection is to a "file://..."  URL, and TRUE otherwise.
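
           One possible override is sketched below.  The "SameHostUA" class
           and its same-host policy are hypothetical, introduced only to
           illustrate the hook; the URLs are placeholders.  It assumes the
           response carries the request that produced it, as responses
           returned by the user agent normally do.

```perl
use strict;
use warnings;

# Hypothetical subclass: only follow redirects that stay on the same host.
package SameHostUA;
use parent 'LWP::UserAgent';

sub redirect_ok {
    my ($self, $prospective_request, $response) = @_;

    # Keep the base checks (request method, no file:// targets).
    return 0 unless $self->SUPER::redirect_ok($prospective_request, $response);

    # Then require the new host to match the host of the original request.
    my $old_host = $response->request->uri->host;
    my $new_host = $prospective_request->uri->host;
    return lc($old_host) eq lc($new_host) ? 1 : 0;
}

package main;
use HTTP::Request;
use HTTP::Response;

# Simulate a redirect response without contacting any server.
my $ua       = SameHostUA->new;
my $original = HTTP::Request->new(GET => 'http://www.example.com/old');
my $redirect = HTTP::Response->new(302, 'Found');
$redirect->request($original);

for my $target ('http://www.example.com/new', 'http://other.example.org/') {
    my $next = HTTP::Request->new(GET => $target);
    printf "%-30s %s\n", $target,
        $ua->redirect_ok($next, $redirect) ? 'follow' : 'refuse';
}
```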

       $ua->get_basic_credentials( $realm, $uri, $isproxy )
           This is called by request() to retrieve credentials for documents
           protected by Basic or Digest Authentication.  The arguments passed
           in are the $realm provided by the server, the $uri requested and a
           boolean flag to indicate if this is authentication against a proxy
           server.

           The method should return a username and password.  It should return
           an empty list to abort the authentication resolution attempt.
           Subclasses can override this method to prompt the user for the
           information. An example of this can be found in the "lwp-request"
           program distributed with this library.

           The base implementation simply checks a set of pre-stored member
           variables, set up with the credentials() method.
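
           A sketch of an override that takes credentials from the
           environment instead of prompting.  The "EnvAuthUA" class and the
           MYAPP_HTTP_USER and MYAPP_HTTP_PASS variable names are
           hypothetical, invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical subclass that reads credentials from environment
# variables (the variable names are invented for this sketch).
package EnvAuthUA;
use parent 'LWP::UserAgent';

sub get_basic_credentials {
    my ($self, $realm, $uri, $isproxy) = @_;

    # Leave proxy authentication to the base implementation.
    return $self->SUPER::get_basic_credentials($realm, $uri, $isproxy)
        if $isproxy;

    my ($user, $pass) = @ENV{qw(MYAPP_HTTP_USER MYAPP_HTTP_PASS)};
    return ($user, $pass) if defined $user && defined $pass;
    return;    # empty list: abort the authentication attempt
}

package main;
use URI;

# Set the variables here only so the sketch is self-contained.
local @ENV{qw(MYAPP_HTTP_USER MYAPP_HTTP_PASS)} = ('alice', 's3cret');

my $ua = EnvAuthUA->new;
my ($user, $pass) = $ua->get_basic_credentials(
    'Members Only', URI->new('http://www.example.com/'), 0);
print "would log in as $user\n";
```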

SEE ALSO

       See LWP for a complete overview of libwww-perl5.  See lwpcook and the
       scripts lwp-request and lwp-download for examples of usage.

       See HTTP::Request and HTTP::Response for a description of the message
       objects dispatched and received.  See HTTP::Request::Common and
       HTML::Form for other ways to build request objects.

       See WWW::Mechanize and WWW::Search for examples of more specialized
       user agents based on "LWP::UserAgent".

COPYRIGHT

       Copyright 1995-2004 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.8.8                       2004-04-06                 LWP::UserAgent(3)