LWP::RobotUA(3)       User Contributed Perl Documentation      LWP::RobotUA(3)

NAME

       LWP::RobotUA - a class for well-behaved Web robots

SYNOPSIS

         use LWP::RobotUA;
         my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
         $ua->delay(10);  # be very nice -- max one hit every ten minutes!
         ...

         # Then use it just like a normal LWP::UserAgent:
         my $response = $ua->get('http://whatever.int/...');
         ...

DESCRIPTION

       This class implements a user agent that is suitable for robot
       applications.  Robots should be nice to the servers they visit: they
       should consult the /robots.txt file to ensure that they are welcome,
       and they should not make requests too frequently.

       Before you consider writing a robot, take a look at
       <URL:http://www.robotstxt.org/>.

       When you use an LWP::RobotUA object as your user agent, you do not
       have to handle these things yourself: "robots.txt" files are
       automatically consulted and obeyed, and requests to the same server
       are spaced out by a minimum delay.  Just send requests as you would
       with a normal LWP::UserAgent object (using "$ua->get(...)",
       "$ua->head(...)", "$ua->request(...)", etc.), and this special agent
       will make sure you are nice.

METHODS

       LWP::RobotUA is a subclass of LWP::UserAgent and implements the
       same methods.  In addition, the following methods are provided:

       $ua = LWP::RobotUA->new( %options )
       $ua = LWP::RobotUA->new( $agent, $from )
       $ua = LWP::RobotUA->new( $agent, $from, $rules )
           The LWP::UserAgent options "agent" and "from" are mandatory.  The
           options "delay", "use_sleep" and "rules" initialize attributes
           private to the RobotUA.  If "rules" is not provided, then a
           WWW::RobotRules object is instantiated to provide an internal
           database of robots.txt rules.

           It is also possible to pass the values of "agent", "from" and,
           optionally, "rules" as plain positional arguments.

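           As a rough illustration of the two calling styles, the sketch
           below builds one agent from named options and one from positional
           arguments.  The robot name, contact address and option values are
           placeholders chosen for this example.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# Named options: "agent" and "from" are required; the rest are optional.
my $ua = LWP::RobotUA->new(
    agent     => 'my-robot/0.1',     # placeholder robot name
    from      => 'me@example.com',   # placeholder contact address
    delay     => 0.5,                # minutes between hits on one server
    use_sleep => 0,                  # hand back a 503 instead of sleeping
);

# Positional form: agent, from and (optionally) a rules object.
my $ua2 = LWP::RobotUA->new('my-robot/0.1', 'me@example.com');

printf "delay=%s min, use_sleep=%s\n",
    $ua->delay, $ua->use_sleep ? 'on' : 'off';
```
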
       $ua->delay
       $ua->delay( $minutes )
           Get/set the minimum delay between requests to the same server, in
           minutes.  The default is 1 minute.  Note that this number doesn't
           have to be an integer; for example, this sets the delay to 10
           seconds:

               $ua->delay(10/60);

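           Because the unit is minutes, it can be easier to think in seconds
           and convert at the call site.  A small sketch, assuming a
           placeholder robot name and contact address:

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@example.com');

my $seconds = 30;              # desired gap between requests, in seconds
$ua->delay($seconds / 60);     # delay() expects minutes

# Reading the value back gives minutes again; convert for display.
printf "waiting at least %d seconds between requests\n", $ua->delay * 60;
```
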
       $ua->use_sleep
       $ua->use_sleep( $boolean )
           Get/set a value indicating whether the UA should sleep() if
           requests are made too fast, that is, if fewer than $ua->delay
           minutes have passed since the last request to the given server.
           The default is TRUE.  If this value is FALSE, an internal
           SERVICE_UNAVAILABLE response is generated instead.  It has a
           Retry-After header that indicates when it is OK to send another
           request to this server.

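           With sleeping turned off, the caller has to notice the generated
           response and reschedule the request.  One possible way to do that
           is sketched below; "fetch_or_defer" is a hypothetical helper
           written for this example, not part of LWP::RobotUA, and the robot
           name and address are placeholders.  Note that a real server may
           also answer 503 with a Retry-After header, so the check cannot
           tell the two cases apart.

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua = LWP::RobotUA->new(
    agent     => 'my-robot/0.1',     # placeholder robot name
    from      => 'me@example.com',   # placeholder contact address
    use_sleep => 0,                  # never block inside the agent
);

# Hypothetical helper: returns ($response, undef) when the request went
# through, or (undef, $retry_after) when it should be tried again later.
sub fetch_or_defer {
    my ($agent, $url) = @_;
    my $res   = $agent->get($url);
    my $retry = $res->header('Retry-After');
    if ($res->code == 503 && defined $retry) {
        return (undef, $retry);      # too early; caller should requeue
    }
    return ($res, undef);
}
```

           A crawler loop could call "fetch_or_defer($ua, $url)" and push the
           URL back onto its queue whenever the second return value is
           defined.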
       $ua->rules
       $ua->rules( $rules )
           Get/set which WWW::RobotRules object to use.

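           To show how a rules object and the agent fit together, the sketch
           below parses a robots.txt body by hand, passes the resulting
           WWW::RobotRules object to the constructor, and then queries it
           through "$ua->rules".  The host name and the robots.txt content
           are made up for this example; normally the agent fetches and
           parses the file itself.

```perl
use strict;
use warnings;
use LWP::RobotUA;
use WWW::RobotRules;

# Rules object for the same robot name the agent will announce.
my $rules = WWW::RobotRules->new('my-robot/0.1');

# Feed it an invented robots.txt body for an example host.
$rules->parse('http://www.example.com/robots.txt',
              "User-agent: *\nDisallow: /private/\n");

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@example.com', $rules);

for my $url ('http://www.example.com/index.html',
             'http://www.example.com/private/notes.html') {
    printf "%-45s %s\n", $url,
        $ua->rules->allowed($url) ? 'allowed' : 'disallowed';
}
```
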
       $ua->no_visits( $netloc )
           Returns the number of documents fetched from this server host. Yeah
           I know, this method should probably have been named num_visits() or
           something like that. :-(

       $ua->host_wait( $netloc )
           Returns the number of seconds (from now) you must wait before you
           can make a new request to this host.

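           The two per-host queries above can be combined to inspect the
           agent's bookkeeping before scheduling a request.  This sketch
           assumes that $netloc is the "host:port" pair of the target URL,
           obtained here with URI's host_port method; the host name, robot
           name and address are placeholders.

```perl
use strict;
use warnings;
use LWP::RobotUA;
use URI;

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@example.com');

# Assumption: $netloc is the "host:port" pair of the target URL.
my $netloc = URI->new('http://www.example.com/index.html')->host_port;

my $wait   = $ua->host_wait($netloc);   # seconds until a request is OK
my $visits = $ua->no_visits($netloc);   # documents fetched from this host

print "$netloc: $visits visit(s), wait $wait second(s)\n";
```
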
       $ua->as_string
           Returns a string that describes the state of the UA.  Mainly useful
           for debugging.

SEE ALSO

       LWP::UserAgent, WWW::RobotRules

COPYRIGHT

       Copyright 1996-2004 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.12.4                      2010-05-05                   LWP::RobotUA(3)