LWP::RobotUA(3)       User Contributed Perl Documentation      LWP::RobotUA(3)

NAME
LWP::RobotUA - a class for well-behaved Web robots

SYNOPSIS
    use LWP::RobotUA;
    my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
    $ua->delay(10);  # be very nice -- max one hit every ten minutes!
    ...

    # Then use it just like a normal LWP::UserAgent:
    my $response = $ua->get('http://whatever.int/...');
    ...

DESCRIPTION
This class implements a user agent that is suitable for robot
applications. Robots should be nice to the servers they visit. They
should consult the /robots.txt file to ensure that they are welcome,
and they should not make requests too frequently.

Before you consider writing a robot, take a look at
<URL:http://www.robotstxt.org/>.

When you use an LWP::RobotUA object as your user agent, you do not
have to handle these things yourself; "robots.txt" files are
automatically consulted and obeyed, the server isn't queried too
rapidly, and so on. Just send requests as you do when you are using a
normal LWP::UserAgent object (using "$ua->get(...)", "$ua->head(...)",
"$ua->request(...)", etc.), and this special agent will make sure you
are nice.

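As a rough illustration of the robots.txt handling described above, the
sketch below seeds the rules for a host by hand, so that nothing is
fetched over the network, and then requests a disallowed path. The host
example.com, the rule text, and the use of "$ua->rules->parse(...)" to
pre-load the rules are assumptions made for this example; in normal use
the agent retrieves /robots.txt on its own. The request is expected to
come back as a locally generated 403 response.

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');

# Seed the rules database by hand (assumption: normally the agent
# downloads this file itself on first contact with the host).
$ua->rules->parse(
    'http://example.com/robots.txt',
    "User-agent: *\nDisallow: /private\n",
);

# The path is covered by the Disallow rule, so the agent should refuse
# the request without contacting the server.
my $response = $ua->get('http://example.com/private/data.html');
print $response->status_line, "\n";
```
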
METHODS
LWP::RobotUA is a subclass of LWP::UserAgent and implements the same
methods. In addition, the following methods are provided:

new
    my $ua = LWP::RobotUA->new( %options )
    my $ua = LWP::RobotUA->new( $agent, $from )
    my $ua = LWP::RobotUA->new( $agent, $from, $rules )

The LWP::UserAgent options "agent" and "from" are mandatory. The
options "delay", "use_sleep" and "rules" initialize attributes private
to the RobotUA. If "rules" is not provided, a "WWW::RobotRules" object
is instantiated, providing an internal database of robots.txt rules.

It is also possible to pass the values of "agent", "from" and,
optionally, "rules" as plain positional arguments.

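A minimal sketch of the named-option form of the constructor follows.
The agent name, address and option values are placeholders chosen for
the example, not recommendations.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# Named options: "agent" and "from" are required, the rest are optional.
my $ua = LWP::RobotUA->new(
    agent     => 'my-robot/0.1',
    from      => 'me@foo.com',
    delay     => 0.5,   # minutes between requests to the same server
    use_sleep => 0,     # do not sleep when a request comes too early
);

print "agent: ", $ua->agent, "\n";
print "from:  ", $ua->from,  "\n";
print "delay: ", $ua->delay, " minutes\n";
```
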
delay
    my $delay = $ua->delay;
    $ua->delay( $minutes );

Get/set the minimum delay between requests to the same server, in
minutes. The default is 1 minute. Note that this number doesn't have
to be an integer; for example, this sets the delay to 10 seconds:

    $ua->delay(10/60);

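Because the unit is minutes, it can be convenient to wrap the
conversion. The helper "set_delay_seconds" below is hypothetical and
not part of LWP::RobotUA; it only shows the arithmetic.

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');

# Hypothetical helper: take the delay in seconds, store it in minutes,
# and return the value now held by the agent.
sub set_delay_seconds {
    my ($agent, $seconds) = @_;
    $agent->delay($seconds / 60);
    return $agent->delay;
}

set_delay_seconds($ua, 30);
printf "delay: %.2f minutes (%d seconds)\n", $ua->delay, $ua->delay * 60;
```
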
use_sleep
    my $bool = $ua->use_sleep;
    $ua->use_sleep( $boolean );

Get/set a value indicating whether the UA should sleep() if requests
arrive too fast, that is, if fewer than "$ua->delay" minutes have
passed since the last request to the given server. The default is
true. If this value is false, an internal "SERVICE_UNAVAILABLE" (503)
response is generated instead. It has a "Retry-After" header that
indicates when it is OK to send another request to this server.

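The non-sleeping mode can be sketched offline as follows. To avoid any
network traffic, the example seeds empty rules for example.com and
records a visit by hand through the rules object; both steps, and the
"visit" call in particular (used internally by the agent and not a
documented part of WWW::RobotRules), are assumptions made for the demo.

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
$ua->use_sleep(0);   # report "too early" instead of sleeping

# Assumptions for an offline demo: seed empty rules for the host and
# record a visit by hand, as if a request had just been made.
$ua->rules->parse('http://example.com/robots.txt', '');
$ua->rules->visit('example.com:80');

my $response = $ua->get('http://example.com/index.html');
if ($response->code == 503 && defined $response->header('Retry-After')) {
    # The caller decides what to do: queue the URL, try another host, ...
    print "Too early; retry after ", $response->header('Retry-After'), "\n";
}
else {
    print $response->status_line, "\n";
}
```
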
rules
    my $rules = $ua->rules;
    $ua->rules( $rules );

Get/set the WWW::RobotRules object to use.

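One reason to supply your own rules object is to keep robots.txt data
between runs. The sketch below assumes the WWW::RobotRules::AnyDBM_File
class shipped with WWW::RobotRules and uses a temporary directory as a
stand-in for a real cache location.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use LWP::RobotUA;
use WWW::RobotRules::AnyDBM_File;

# Assumption: a throwaway directory stands in for a real cache location.
my $dir   = tempdir(CLEANUP => 1);
my $rules = WWW::RobotRules::AnyDBM_File->new('my-robot/0.1', "$dir/robots-cache");

# Hand the disk-backed rules to the agent as the third positional argument.
my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com', $rules);
print "rules class: ", ref($ua->rules), "\n";
```
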
no_visits
    my $num = $ua->no_visits( $netloc )

Returns the number of documents fetched from this server host. Yeah, I
know, this method should probably have been named "num_visits" or
something like that. :-(

host_wait
    my $num = $ua->host_wait( $netloc )

Returns the number of seconds (from now) you must wait before you can
make a new request to this host.

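A small sketch of querying both counters for one host. It assumes that
$netloc takes the "host:port" form, and it records a visit by hand
through the rules object (an internal call, used here only to avoid a
real request).

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua     = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
my $netloc = 'example.com:80';   # assumed "host:port" form

printf "before: %d visits, wait %d s\n",
    $ua->no_visits($netloc), $ua->host_wait($netloc);

# Assumption: record a visit by hand instead of making a real request.
$ua->rules->visit($netloc);

printf "after:  %d visits, wait %d s\n",
    $ua->no_visits($netloc), $ua->host_wait($netloc);
```
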
as_string
    my $string = $ua->as_string;

Returns a string that describes the state of the UA. Mainly useful for
debugging.

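For instance, the state can be dumped to STDERR while debugging. The
exact layout of the string is not specified here, so the sketch only
prints it.

```perl
use strict;
use warnings;
use LWP::RobotUA;

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
$ua->delay(2);

# Dump the agent's current settings for inspection.
my $state = $ua->as_string;
print STDERR $state;
```
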
SEE ALSO
LWP::UserAgent, WWW::RobotRules

COPYRIGHT
Copyright 1996-2004 Gisle Aas.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.30.1                      2019-11-27                   LWP::RobotUA(3)