LWP::UserAgent(3)     User Contributed Perl Documentation    LWP::UserAgent(3)


NAME
       LWP::UserAgent - Web user agent class


SYNOPSIS
        require LWP::UserAgent;

        my $ua = LWP::UserAgent->new;
        $ua->timeout(10);
        $ua->env_proxy;

        my $response = $ua->get('http://search.cpan.org/');

        if ($response->is_success) {
            print $response->content;  # or whatever
        }
        else {
            die $response->status_line;
        }


DESCRIPTION
       The "LWP::UserAgent" is a class implementing a web user agent.
       "LWP::UserAgent" objects can be used to dispatch web requests.

       In normal use the application creates an "LWP::UserAgent" object, and
       then configures it with values for timeouts, proxies, name, etc. It
       then creates an instance of "HTTP::Request" for the request that needs
       to be performed. This request is then passed to one of the request
       methods of the UserAgent, which dispatches it using the relevant
       protocol and returns an "HTTP::Response" object.  There are
       convenience methods for sending the most common request types: get(),
       head() and post().  When these methods are used, the creation of the
       request object is hidden, as shown in the synopsis above.

       The basic approach of the library is to use HTTP style communication
       for all protocol schemes.  This means that you will construct
       "HTTP::Request" objects and receive "HTTP::Response" objects even for
       non-HTTP resources like gopher and ftp.  In order to achieve even more
       similarity to HTTP style communications, gopher menus and file
       directories are converted to HTML documents.
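
       The flow described above can be sketched as follows: build an
       "HTTP::Request" by hand, pass it to request(), and inspect the
       "HTTP::Response" that comes back.  The "data:" URL is an assumption
       made so that the sketch runs without network access; any URL whose
       scheme LWP supports could take its place.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->timeout(10);

# Build the request object explicitly instead of relying on get().
# The data: URL is illustrative; it keeps the example offline.
my $request = HTTP::Request->new(GET => 'data:,Hello%20World');

# request() dispatches it and hands back an HTTP::Response,
# even though the scheme is not HTTP.
my $response = $ua->request($request);

if ($response->is_success) {
    print $response->content, "\n";
}
else {
    warn $response->status_line, "\n";
}
```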


CONSTRUCTOR METHODS
       The following constructor methods are available:

       $ua = LWP::UserAgent->new( %options )
           This method constructs a new "LWP::UserAgent" object and returns
           it.  Key/value pair arguments may be provided to set up the initial
           state.  The following options correspond to attribute methods
           described below:

              KEY                     DEFAULT
              -----------             --------------------
              agent                   "libwww-perl/#.##"
              from                    undef
              conn_cache              undef
              cookie_jar              undef
              default_headers         HTTP::Headers->new
              max_size                undef
              max_redirect            7
              parse_head              1
              protocols_allowed       undef
              protocols_forbidden     undef
              requests_redirectable   ['GET', 'HEAD']
              timeout                 180

           The following additional options are also accepted: If the
           "env_proxy" option is passed in with a TRUE value, then proxy
           settings are read from environment variables (see env_proxy()
           method below).  If the "keep_alive" option is passed in, then an
           "LWP::ConnCache" is set up (see conn_cache() method below).  The
           "keep_alive" value is passed on as the "total_capacity" for the
           connection cache.
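
           As a minimal sketch of the constructor options, the following
           overrides a few defaults from the table and enables a connection
           cache.  The agent name "MyApp/0.1" and the chosen values are
           placeholders, not recommendations.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(
    agent        => 'MyApp/0.1',   # hypothetical product token
    timeout      => 15,            # seconds, instead of the default 180
    max_redirect => 3,             # instead of the default 7
    keep_alive   => 4,             # sets up an LWP::ConnCache, capacity 4
);

# Each option is readable afterwards through its attribute method.
printf "agent:        %s\n", $ua->agent;
printf "timeout:      %d\n", $ua->timeout;
printf "max_redirect: %d\n", $ua->max_redirect;
printf "conn_cache:   %s\n", ref $ua->conn_cache;
```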

       $ua->clone
           Returns a copy of the LWP::UserAgent object.


ATTRIBUTES
       The settings of the configuration attributes modify the behaviour of
       the "LWP::UserAgent" when it dispatches requests.  Most of these can
       also be initialized by options passed to the constructor method.

       The following attribute methods are provided.  The attribute value is
       left unchanged if no argument is given.  The return value from each
       method is the old attribute value.

       $ua->agent
       $ua->agent( $product_id )
           Get/set the product token that is used to identify the user agent
           on the network.  The agent value is sent as the "User-Agent" header
           in the requests.  The default is the string returned by the
           _agent() method (see below).

           If the $product_id ends with a space then the _agent() string is
           appended to it.

           The user agent string should be one or more simple product
           identifiers with an optional version number separated by the "/"
           character.  Examples are:

             $ua->agent('Checkbot/0.4 ' . $ua->_agent);
             $ua->agent('Checkbot/0.4 ');    # same as above
             $ua->agent('Mozilla/5.0');
             $ua->agent("");                 # don't identify

       $ua->_agent
           Returns the default agent identifier.  This is a string of the form
           "libwww-perl/#.##", where "#.##" is substituted with the version
           number of this library.

       $ua->from
       $ua->from( $email_address )
           Get/set the e-mail address for the human user who controls the
           requesting user agent.  The address should be machine-usable, as
           defined in RFC 822.  The "from" value is sent as the "From" header
           in the requests.  Example:

             $ua->from('gaas@cpan.org');

           The default is to not send a "From" header.  See the
           default_headers() method for the more general interface that
           allows any header to be defaulted.

       $ua->cookie_jar
       $ua->cookie_jar( $cookie_jar_obj )
           Get/set the cookie jar object to use.  The only requirement is that
           the cookie jar object must implement the extract_cookies($response)
           and add_cookie_header($request) methods.  These methods will then
           be invoked by the user agent as requests are sent and responses are
           received.  Normally this will be an "HTTP::Cookies" object or some
           subclass.

           The default is to have no cookie_jar, i.e. never automatically add
           "Cookie" headers to the requests.

           Shortcut: If a reference to a plain hash is passed in as the
           $cookie_jar_obj, then it is replaced with an instance of
           "HTTP::Cookies" that is initialized based on the hash.  This form
           also automatically loads the "HTTP::Cookies" module.  It means
           that:

             $ua->cookie_jar({ file => "$ENV{HOME}/.cookies.txt" });

           is really just a shortcut for:

             require HTTP::Cookies;
             $ua->cookie_jar(HTTP::Cookies->new(file => "$ENV{HOME}/.cookies.txt"));

       $ua->default_headers
       $ua->default_headers( $headers_obj )
           Get/set the headers object that will provide default header values
           for any requests sent.  By default this will be an empty
           "HTTP::Headers" object.  Example:

             $ua->default_headers->push_header('Accept-Language' => "no, en");

       $ua->default_header( $field )
       $ua->default_header( $field => $value )
           This is just a short-cut for $ua->default_headers->header( $field
           => $value ). Example:

             $ua->default_header('Accept-Language' => "no, en");

       $ua->conn_cache
       $ua->conn_cache( $cache_obj )
           Get/set the "LWP::ConnCache" object to use.  See LWP::ConnCache for
           details.

       $ua->credentials( $netloc, $realm, $uname, $pass )
           Set the user name and password to be used for a realm.  It is often
           more useful to specialize the get_basic_credentials() method
           instead.
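
           A minimal sketch of credentials() in use follows.  The host,
           realm, user name and password are all placeholders, and the lookup
           through get_basic_credentials() is shown only to make the stored
           pair visible without contacting a server.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

my $ua = LWP::UserAgent->new;

# All four values are placeholders.  The network location is "host:port".
$ua->credentials('www.example.com:80', 'Restricted Area', 'alice', 's3cret');

# The base get_basic_credentials() returns the pair stored for this
# host:port and realm.
my ($user, $pass) = $ua->get_basic_credentials(
    'Restricted Area',
    URI->new('http://www.example.com/private/'),
    0,    # not a proxy challenge
);

print "would authenticate as $user\n";
```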

       $ua->max_size
       $ua->max_size( $bytes )
           Get/set the size limit for response content.  The default is
           "undef", which means that there is no limit.  If the returned
           response content is only partial, because the size limit was
           exceeded, then a "Client-Aborted" header will be added to the
           response.  The content might end up longer than "max_size" as we
           abort once appending a chunk of data makes the length exceed the
           limit.  The "Content-Length" header, if present, will indicate the
           length of the full content and will normally not be the same as
           "length($res->content)".
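
           One way to detect a truncated body is to look for the
           "Client-Aborted" header, as in the sketch below.  The helper name
           was_truncated() is hypothetical, and the hand-built responses
           (including the header value used) merely stand in for real ones so
           that the example runs offline.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Hypothetical helper: true if the user agent cut the body short.
sub was_truncated {
    my ($response) = @_;
    return defined $response->header('Client-Aborted') ? 1 : 0;
}

my $ua = LWP::UserAgent->new;
$ua->max_size(64 * 1024);    # stop reading once the body passes 64 KiB

# Typical use after a fetch:
#   my $response = $ua->get($url);
#   warn "partial content\n" if was_truncated($response);

# Hand-built responses stand in for real ones here.
my $complete = HTTP::Response->new(200, 'OK');
my $partial  = HTTP::Response->new(200, 'OK');
$partial->header('Client-Aborted' => 'max_size');    # illustrative value

printf "complete truncated? %d\n", was_truncated($complete);
printf "partial  truncated? %d\n", was_truncated($partial);
```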

       $ua->max_redirect
       $ua->max_redirect( $n )
           This reads or sets the object's limit of how many times it will
           obey redirection responses in a given request cycle.

           By default, the value is 7. This means that if you call the
           request() method and the response is a redirect elsewhere which is
           in turn a redirect, and so on seven times, then LWP gives up after
           that seventh request.

       $ua->parse_head
       $ua->parse_head( $boolean )
           Get/set a value indicating whether we should initialize response
           headers from the <head> section of HTML documents. The default is
           TRUE.  Do not turn this off, unless you know what you are doing.

       $ua->protocols_allowed
       $ua->protocols_allowed( \@protocols )
           This reads (or sets) this user agent's list of protocols that the
           request methods will exclusively allow.  The protocol names are
           case insensitive.

           For example: "$ua->protocols_allowed( [ 'http', 'https'] );" means
           that this user agent will allow only those protocols, and attempts
           to use this user agent to access URLs with any other schemes (like
           "ftp://...") will result in a 500 error.

           To delete the list, call: "$ua->protocols_allowed(undef)"

           By default, an object has neither a "protocols_allowed" list, nor a
           "protocols_forbidden" list.

           Note that having a "protocols_allowed" list causes any
           "protocols_forbidden" list to be ignored.

       $ua->protocols_forbidden
       $ua->protocols_forbidden( \@protocols )
           This reads (or sets) this user agent's list of protocols that the
           request methods will not allow. The protocol names are case
           insensitive.

           For example: "$ua->protocols_forbidden( [ 'file', 'mailto'] );"
           means that this user agent will not allow those protocols, and
           attempts to use this user agent to access URLs with those schemes
           will result in a 500 error.

           To delete the list, call: "$ua->protocols_forbidden(undef)"
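
           The effect of an allow list can be sketched as follows.  The host
           ftp.example.com is a placeholder; with the list in place the
           request is expected to be refused with an error response rather
           than an exception.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->protocols_allowed(['http', 'https']);

# is_protocol_supported() consults the list as well as LWP's own
# capabilities.
my %supported;
$supported{$_} = $ua->is_protocol_supported($_) ? 1 : 0 for qw(http ftp);
printf "%-5s %s\n", $_, $supported{$_} ? 'allowed' : 'refused'
    for sort keys %supported;

# A scheme outside the list yields an error response.
my $response = $ua->get('ftp://ftp.example.com/README');
print $response->status_line, "\n";

# Removing the list lifts the restriction again.
$ua->protocols_allowed(undef);
```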

       $ua->requests_redirectable
       $ua->requests_redirectable( \@requests )
           This reads or sets the object's list of request names that
           "$ua->redirect_ok(...)" will allow redirection for.  By default,
           this is "['GET', 'HEAD']", as per RFC 2616.  To change to include
           'POST', consider:

              push @{ $ua->requests_redirectable }, 'POST';

       $ua->timeout
       $ua->timeout( $secs )
           Get/set the timeout value in seconds. The default timeout() value
           is 180 seconds, i.e. 3 minutes.

           The request is aborted if no activity on the connection to the
           server is observed for "timeout" seconds.  This means that the time
           it takes for the complete transaction and the request() method to
           actually return might be longer.

   Proxy attributes

       The following methods set up when requests should be passed via a proxy
       server.

       $ua->proxy(\@schemes, $proxy_url)
       $ua->proxy($scheme, $proxy_url)
           Set/retrieve proxy URL for a scheme:

            $ua->proxy(['http', 'ftp'], 'http://proxy.sn.no:8001/');
            $ua->proxy('gopher', 'http://proxy.sn.no:8001/');

           The first form specifies that the URL is to be used for proxying of
           the access schemes listed in the first method argument, i.e. 'http'
           and 'ftp'.

           The second form is a shorthand for specifying the proxy URL for a
           single access scheme.

       $ua->no_proxy( $domain, ... )
           Do not proxy requests to the given domains.  Calling no_proxy
           without any domains clears the list of domains. E.g.:

            $ua->no_proxy('localhost', 'no', ...);
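
           A minimal sketch combining proxy() and no_proxy() follows.  The
           host names proxy.example.com and intranet.example.com are
           placeholders.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# proxy.example.com is a placeholder for a real proxy host.
$ua->proxy(['http', 'ftp'], 'http://proxy.example.com:8001/');

# Hosts in these domains are contacted directly.
$ua->no_proxy('localhost', 'intranet.example.com');

# With only a scheme, proxy() reports the URL currently configured.
printf "%-4s via %s\n", $_, $ua->proxy($_) for qw(http ftp);
```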

       $ua->env_proxy
           Load proxy settings from *_proxy environment variables.  You might
           specify proxies like this (sh-syntax):

             gopher_proxy=http://proxy.my.place/
             wais_proxy=http://proxy.my.place/
             no_proxy="localhost,my.domain"
             export gopher_proxy wais_proxy no_proxy

           csh or tcsh users should use the "setenv" command to define these
           environment variables.

           On systems with case insensitive environment variables there exists
           a name clash between the CGI environment variables and the
           "HTTP_PROXY" environment variable normally picked up by
           env_proxy().  Because of this "HTTP_PROXY" is not honored for CGI
           scripts.  The "CGI_HTTP_PROXY" environment variable can be used
           instead.


REQUEST METHODS
       The methods described in this section are used to dispatch requests via
       the user agent.  The following request methods are provided:

       $ua->get( $url )
       $ua->get( $url , $field_name => $value, ... )
           This method will dispatch a "GET" request on the given $url.
           Further arguments can be given to initialize the headers of the
           request. These are given as separate name/value pairs.  The return
           value is a response object.  See HTTP::Response for a description
           of the interface it provides.

           Field names that start with ":" are special.  These will not
           initialize headers of the request but will determine how the
           response content is treated.  The following special field names
           are recognized:

               :content_file   => $filename
               :content_cb     => \&callback
               :read_size_hint => $bytes

           If a $filename is provided with the ":content_file" option, then
           the response content will be saved in that file instead of in the
           response object.  If a callback is provided with the ":content_cb"
           option then this function will be called for each chunk of the
           response content as it is received from the server.  If neither of
           these options is given, then the response content will accumulate
           in the response object itself.  This might not be suitable for very
           large response bodies.  Only one of ":content_file" or
           ":content_cb" can be specified.  The content of unsuccessful
           responses will always accumulate in the response object itself,
           regardless of the ":content_file" or ":content_cb" options passed
           in.

           The ":read_size_hint" option is passed to the protocol module which
           will try to read data from the server in chunks of this size.  A
           smaller value for the ":read_size_hint" will result in a higher
           number of callback invocations.

           The callback function is called with 3 arguments: a chunk of data,
           a reference to the response object, and a reference to the protocol
           object.  The callback can abort the request by invoking die().  The
           exception message will show up as the "X-Died" header field in the
           response returned by the get() function.
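
           The ":content_cb" option can be sketched as follows.  The "data:"
           URL is an assumption that keeps the example offline; with a real
           server the callback would typically run once per chunk received.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

my $received = '';
my $chunks   = 0;

# The data: URL is illustrative; substitute any URL LWP can fetch.
my $response = $ua->get(
    'data:,Hello',
    ':read_size_hint' => 4096,
    ':content_cb'     => sub {
        my ($chunk, $res, $protocol) = @_;
        $chunks++;
        $received .= $chunk;    # a real handler might write to a file
        # Calling die() here would abort the transfer (see "X-Died").
    },
);

if ($response->is_success) {
    print "got ", length($received), " bytes in $chunks chunk(s)\n";
}
else {
    warn $response->status_line, "\n";
}
```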

       $ua->head( $url )
       $ua->head( $url , $field_name => $value, ... )
           This method will dispatch a "HEAD" request on the given $url.
           Otherwise it works like the get() method described above.

       $ua->post( $url, \%form )
       $ua->post( $url, \@form )
       $ua->post( $url, \%form, $field_name => $value, ... )
           This method will dispatch a "POST" request on the given $url, with
           %form or @form providing the key/value pairs for the fill-in form
           content. Additional headers and content options are the same as for
           the get() method.

           This method will use the POST() function from
           "HTTP::Request::Common" to build the request.  See
           HTTP::Request::Common for details on how to pass form content and
           other advanced features.
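
           To see roughly what post() sends, the request can be built directly
           with POST() and printed, as in this sketch.  The URL and form
           fields are placeholders, and nothing is sent over the network.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# www.example.com and the form fields are placeholders.
my $url  = 'http://www.example.com/search';
my @form = (q => 'perl', lang => 'en');

# Build the form-encoded POST request without sending it.
my $request = POST $url, \@form;
print $request->as_string;

# Sending it would look like this (needs network access):
#   require LWP::UserAgent;
#   my $ua       = LWP::UserAgent->new;
#   my $response = $ua->post($url, \@form);
#   print $response->status_line, "\n";
```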

       $ua->mirror( $url, $filename )
           This method will get the document identified by $url and store it
           in a file called $filename.  If the file already exists, then the
           request will contain an "If-Modified-Since" header matching the
           modification time of the file.  If the document on the server has
           not changed since this time, then nothing happens.  If the document
           has been updated, it will be downloaded again.  The modification
           time of the file will be forced to match that of the server.

           The return value is the response object.
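
           The sketch below calls mirror() twice on the same document.  A
           local file reached through a "file:" URL stands in for the remote
           document so that the example runs offline; the file names are
           placeholders.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempdir);
use File::Spec;
use URI::file;

# A local file plays the part of the remote document.
my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'source.txt');
my $copy   = File::Spec->catfile($dir, 'copy.txt');

open my $out, '>', $source or die "Cannot write $source: $!";
print {$out} "hello mirror\n";
close $out or die "Cannot close $source: $!";

my $ua  = LWP::UserAgent->new;
my $url = URI::file->new($source);    # file: URL for the source path

# First call: no local copy yet, so the document is downloaded.
my $first = $ua->mirror($url, $copy);
print "first:  ", $first->status_line, "\n";

# Second call: the copy already exists, so the request is conditional
# and the document should not need to be fetched again.
my $second = $ua->mirror($url, $copy);
print "second: ", $second->status_line, "\n";
```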

       $ua->request( $request )
       $ua->request( $request, $content_file )
       $ua->request( $request, $content_cb )
       $ua->request( $request, $content_cb, $read_size_hint )
           This method will dispatch the given $request object.  Normally this
           will be an instance of the "HTTP::Request" class, but any object
           with a similar interface will do.  The return value is a response
           object.  See HTTP::Request and HTTP::Response for a description of
           the interface provided by these classes.

           The request() method will process redirects and authentication
           responses transparently.  This means that it may actually send
           several simple requests via the simple_request() method described
           below.

           The request methods described above, get(), head(), post() and
           mirror(), all dispatch the request they build via this method.
           They are convenience methods that simply hide the creation of the
           request object for you.

           The $content_file, $content_cb and $read_size_hint all correspond
           to options described with the get() method above.

           You are allowed to use a CODE reference as "content" in the request
           object passed in.  The "content" function should return the content
           when called.  The content can be returned in chunks.  The content
           function will be invoked repeatedly until it returns an empty
           string to signal that there is no more content.

       $ua->simple_request( $request )
       $ua->simple_request( $request, $content_file )
       $ua->simple_request( $request, $content_cb )
       $ua->simple_request( $request, $content_cb, $read_size_hint )
           This method dispatches a single request and returns the response
           received.  Arguments are the same as for request() described above.

           The difference from request() is that simple_request() will not try
           to handle redirects or authentication responses.  The request()
           method will in fact invoke this method for each simple request it
           sends.

       $ua->is_protocol_supported( $scheme )
           You can use this method to test whether this user agent object
           supports the specified "scheme".  (The "scheme" might be a string
           (like 'http' or 'ftp') or it might be a URI object reference.)

           Whether a scheme is supported is determined by the user agent's
           "protocols_allowed" or "protocols_forbidden" lists (if any), and by
           the capabilities of LWP.  I.e., this will return TRUE only if LWP
           supports this protocol and it's permitted for this particular
           object.

   Callback methods

       The following methods will be invoked as requests are processed. These
       methods are documented here because subclasses of "LWP::UserAgent"
       might want to override their behaviour.

       $ua->prepare_request( $request )
           This method is invoked by simple_request().  Its task is to modify
           the given $request object by setting up various headers based on
           the attributes of the user agent. The return value should normally
           be the $request object passed in.  If a different request object is
           returned it will be the one actually processed.

           The headers affected by the base implementation are "User-Agent",
           "From", "Range" and "Cookie".

       $ua->redirect_ok( $prospective_request, $response )
           This method is called by request() before it tries to follow a
           redirection to the request in $response.  This should return a TRUE
           value if this redirection is permissible.  The $prospective_request
           will be the request to be sent if this method returns TRUE.

           The base implementation will return FALSE unless the method is in
           the object's "requests_redirectable" list, FALSE if the proposed
           redirection is to a "file://..."  URL, and TRUE otherwise.
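
           A subclass can tighten this policy.  The sketch below, using the
           hypothetical class name SameHostUA, keeps the base checks and
           additionally refuses redirects that leave the original host.  The
           hand-built request and response objects only exercise the policy;
           nothing is sent.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

{
    package SameHostUA;    # hypothetical subclass name
    use base 'LWP::UserAgent';

    # Apply the base checks first, then refuse redirects to another host.
    sub redirect_ok {
        my ($self, $prospective_request, $response) = @_;
        return 0
            unless $self->SUPER::redirect_ok($prospective_request, $response);

        my $from = $response->request->uri->host;
        my $to   = $prospective_request->uri->host;
        return lc($from) eq lc($to) ? 1 : 0;
    }
}

my $ua = SameHostUA->new;

# A redirect response whose original request went to www.example.com.
my $response = HTTP::Response->new(302, 'Found');
$response->request(HTTP::Request->new(GET => 'http://www.example.com/old'));

my $same  = HTTP::Request->new(GET => 'http://www.example.com/new');
my $other = HTTP::Request->new(GET => 'http://elsewhere.example.org/new');

printf "same host:  %d\n", $ua->redirect_ok($same,  $response);
printf "other host: %d\n", $ua->redirect_ok($other, $response);
```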

       $ua->get_basic_credentials( $realm, $uri, $isproxy )
           This is called by request() to retrieve credentials for documents
           protected by Basic or Digest Authentication.  The arguments passed
           in are the $realm provided by the server, the $uri requested and a
           boolean flag to indicate if this is authentication against a proxy
           server.

           The method should return a username and password.  It should return
           an empty list to abort the authentication resolution attempt.
           Subclasses can override this method to prompt the user for the
           information. An example of this can be found in the "lwp-request"
           program distributed with this library.

           The base implementation simply checks a set of pre-stored member
           variables, set up with the credentials() method.
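
           An override might look like the following sketch.  The class name
           TableUA, the realm and the credentials are all hypothetical; a real
           program might prompt the user instead of consulting a table.

```perl
use strict;
use warnings;
use URI;

{
    package TableUA;    # hypothetical subclass name
    use base 'LWP::UserAgent';

    # Placeholder credentials keyed by realm.
    my %by_realm = (
        'Staff Only' => ['alice', 's3cret'],
    );

    sub get_basic_credentials {
        my ($self, $realm, $uri, $isproxy) = @_;
        return if $isproxy;    # this sketch never answers proxy challenges
        my $entry = $by_realm{$realm}
            or return;         # an empty list aborts the attempt
        return @$entry;
    }
}

my $ua  = TableUA->new;
my $uri = URI->new('http://www.example.com/staff/');

my @known   = $ua->get_basic_credentials('Staff Only',  $uri, 0);
my @unknown = $ua->get_basic_credentials('Other Realm', $uri, 0);

print "known realm:   @known\n";
print "unknown realm: ", scalar(@unknown), " value(s)\n";
```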


SEE ALSO
       See LWP for a complete overview of libwww-perl5.  See lwpcook and the
       scripts lwp-request and lwp-download for examples of usage.

       See HTTP::Request and HTTP::Response for a description of the message
       objects dispatched and received.  See HTTP::Request::Common and
       HTML::Form for other ways to build request objects.

       See WWW::Mechanize and WWW::Search for examples of more specialized
       user agents based on "LWP::UserAgent".

COPYRIGHT
       Copyright 1995-2004 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.
perl v5.8.8                       2004-04-06                 LWP::UserAgent(3)