WWW::Mechanize(3)     User Contributed Perl Documentation    WWW::Mechanize(3)

NAME

       WWW::Mechanize - Handy web browsing in a Perl object

VERSION

       version 2.17

SYNOPSIS

       WWW::Mechanize supports performing a sequence of page fetches including
       following links and submitting forms. Each fetched page is parsed and
       its links and forms are extracted. A link or a form can be selected,
       form fields can be filled and the next page can be fetched.  Mech also
       stores a history of the URLs you've visited, which can be queried and
       revisited.

           use WWW::Mechanize ();
           my $mech = WWW::Mechanize->new();

           $mech->get( $url );

           $mech->follow_link( n => 3 );
           $mech->follow_link( text_regex => qr/download this/i );
           $mech->follow_link( url => 'http://host.com/index.html' );

           $mech->submit_form(
               form_number => 3,
               fields      => {
                   username    => 'mungo',
                   password    => 'lost-and-alone',
               }
           );

           $mech->submit_form(
               form_name => 'search',
               fields    => { query  => 'pot of gold', },
               button    => 'Search Now'
           );

           # Enable strict form processing to catch typos and nonexistent form fields.
           my $strict_mech = WWW::Mechanize->new( strict_forms => 1 );

           $strict_mech->get( $url );

           # This method call will die, saving you lots of time looking for the bug.
           $strict_mech->submit_form(
               form_number => 3,
               fields      => {
                   usernaem     => 'mungo',           # typo in field name
                   password     => 'lost-and-alone',
                   extra_field  => 123,               # field does not exist
               }
           );

DESCRIPTION

       "WWW::Mechanize", or Mech for short, is a Perl module for stateful
       programmatic web browsing, used for automating interaction with
       websites.

       Features include:

       •   All HTTP methods

       •   High-level hyperlink and HTML form support, without having to parse
           HTML yourself

       •   SSL support

       •   Automatic cookies

       •   Custom HTTP headers

       •   Automatic handling of redirections

       •   Proxies

       •   HTTP authentication

       Mech is well suited for use in testing web applications.  If you use
       one of the Test::* modules, like Test::HTML::Lint, you can check the
       fetched content and use that as input to a test call.

           use Test::More;
           like( $mech->content(), qr/$expected/, "Got expected content" );

       Each page fetch stores its URL in a history stack which you can
       traverse.

           $mech->back();

       If you want finer control over your page fetching, you can use these
       methods. follow_link() and submit_form() are just high-level wrappers
       around them.

           $mech->find_link( n => $number );
           $mech->form_number( $number );
           $mech->form_name( $name );
           $mech->field( $name, $value );
           $mech->set_fields( %field_values );
           $mech->set_visible( @criteria );
           $mech->click( $button );

       WWW::Mechanize is a proper subclass of LWP::UserAgent and you can also
       use any of LWP::UserAgent's methods.

           $mech->add_header($name => $value);

       Please note that Mech does NOT support JavaScript; you need additional
       software for that. Please check "JavaScript" in WWW::Mechanize::FAQ for
       more.

IMPORTANT LINKS

       •   <https://github.com/libwww-perl/WWW-Mechanize/issues>

           The queue for bugs & enhancements in WWW::Mechanize.  Please note
           that the queue at <http://rt.cpan.org> is no longer maintained.

       •   <https://metacpan.org/pod/WWW::Mechanize>

           The CPAN documentation page for Mechanize.

       •   <https://metacpan.org/pod/distribution/WWW-Mechanize/lib/WWW/Mechanize/FAQ.pod>

           Frequently asked questions.  Make sure you read this FIRST.

CONSTRUCTOR AND STARTUP

   new()
       Creates and returns a new WWW::Mechanize object, hereafter referred to
       as the "agent".

           my $mech = WWW::Mechanize->new();

       The constructor for WWW::Mechanize overrides two of the params to the
       LWP::UserAgent constructor:

           agent => 'WWW-Mechanize/#.##'
           cookie_jar => {}    # an empty, memory-only HTTP::Cookies object

       You can override these overrides by passing params to the constructor,
       as in:

           my $mech = WWW::Mechanize->new( agent => 'wonderbot 1.01' );

       If you want none of the overhead of a cookie jar, or don't want your
       bot accepting cookies, you have to explicitly disallow it, like so:

           my $mech = WWW::Mechanize->new( cookie_jar => undef );

       Here are the params that WWW::Mechanize recognizes.  These do not
       include params that LWP::UserAgent recognizes.

       •   "autocheck => [0|1]"

           Checks each request made to see if it was successful.  This saves
           you the trouble of manually checking yourself.  Any failures found
           are reported as fatal errors, not warnings.

           The default value is ON, unless it's being subclassed, in which
           case it is OFF.  This means that standalone WWW::Mechanize
           instances have autocheck turned on, which is protective for the
           vast majority of Mech users who don't bother checking the return
           value of get() and post() and can't figure out why their code
           fails.  However, if WWW::Mechanize is subclassed, such as for
           Test::WWW::Mechanize or Test::WWW::Mechanize::Catalyst, this may
           not be an appropriate default, so it's off.

       •   "noproxy => [0|1]"

           Turn off the automatic call to the LWP::UserAgent "env_proxy"
           function.

           This needs to be explicitly turned off if you're using
           Crypt::SSLeay to access an https site via a proxy server.  Note:
           you still need to set your HTTPS_PROXY environment variable as
           appropriate.

       •   "onwarn => \&func"

           Reference to a "warn"-compatible function, such as "Carp::carp",
           that is called when a warning needs to be shown.

           If this is set to "undef", no warnings will ever be shown.
           However, it's probably better to use the "quiet" method to control
           that behavior.

           If this value is not passed, Mech uses "Carp::carp" if Carp is
           installed, or "CORE::warn" if not.

       •   "onerror => \&func"

           Reference to a "die"-compatible function, such as "Carp::croak",
           that is called when there's a fatal error.

           If this is set to "undef", no errors will ever be shown.

           If this value is not passed, Mech uses "Carp::croak" if Carp is
           installed, or "CORE::die" if not.

       •   "quiet => [0|1]"

           Don't complain on warnings.  Setting "quiet => 1" is the same as
           calling "$mech->quiet(1)".  Default is off.

       •   "stack_depth => $value"

           Sets the depth of the page stack that keeps track of all the
           downloaded pages. Default is effectively infinite stack size.  If
           the stack is eating up your memory, then set this to a smaller
           number, say 5 or 10.  Setting this to zero means Mech will keep no
           history.

       In addition, WWW::Mechanize also allows you to globally enable strict
       and verbose mode for form handling, which is done with HTML::Form.

       •   "strict_forms => [0|1]"

           Globally sets the HTML::Form strict flag which causes form
           submission to croak if any of the passed fields don't exist in the
           form, and/or a value doesn't exist in a select element. This can
           still be disabled in individual calls to submit_form().

           Default is off.

       •   "verbose_forms => [0|1]"

           Globally sets the HTML::Form verbose flag which causes form
           submission to warn about any bad HTML form constructs found. This
           cannot be disabled later.

           Default is off.

       •   "marked_sections => [0|1]"

           Globally sets the HTML::Parser marked sections flag which causes
           HTML "<![CDATA[ ... ]]>" sections to be honoured. This cannot be
           disabled later.

           Default is on.

       To support forms, WWW::Mechanize's constructor pushes POST on to the
       agent's "requests_redirectable" list (see also LWP::UserAgent).

   $mech->agent_alias( $alias )
       Sets the user agent string to the expanded version from a table of
       actual user agent strings.  If $alias is one of the known aliases, the
       agent's User-Agent is replaced with the corresponding full browser
       string.  For instance,

           $mech->agent_alias( 'Windows IE 6' );

       sets your User-Agent to

           Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

       The list of valid aliases can be returned from known_agent_aliases().
       The current list is:

       •   Windows IE 6

       •   Windows Mozilla

       •   Mac Safari

       •   Mac Mozilla

       •   Linux Mozilla

       •   Linux Konqueror

   $mech->known_agent_aliases()
       Returns a list of all the agent aliases that Mech knows about.  This
       can also be called as a package or class method.

           @aliases = WWW::Mechanize::known_agent_aliases();
           @aliases = WWW::Mechanize->known_agent_aliases();
           @aliases = $mech->known_agent_aliases();

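       The two alias methods can be combined as in this small sketch, which
       lists the aliases the installed version knows about and switches the
       agent to the first one.  It makes no requests; agent() is the
       LWP::UserAgent accessor for the current User-Agent string.

```perl
use strict;
use warnings;
use WWW::Mechanize ();

my $mech = WWW::Mechanize->new();

# List every alias this installed version knows about.
my @aliases = WWW::Mechanize->known_agent_aliases();
print "Known alias: $_\n" for @aliases;

# Switch to one of them and inspect the resulting User-Agent string.
$mech->agent_alias( $aliases[0] );
my $user_agent = $mech->agent();
print "User-Agent is now: $user_agent\n";
```
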

PAGE-FETCHING METHODS

   $mech->get( $uri )
       Given a URL/URI, fetches it.  Returns an HTTP::Response object.  $uri
       can be a well-formed URL string, a URI object, or a
       WWW::Mechanize::Link object.

       The results are stored internally in the agent object, but you don't
       know that.  Just use the accessors listed below.  Poking at the
       internals is deprecated and subject to change in the future.

       get() is a well-behaved overloaded version of the method in
       LWP::UserAgent.  This lets you do things like

           $mech->get( $uri, ':content_file' => $filename );

       and you can rest assured that the params will get filtered down
       appropriately. See "get" in LWP::UserAgent for more details.

       NOTE: The file in ":content_file" will contain the raw content of the
       response. If the response content is encoded (e.g. gzip encoded), the
       file will be encoded as well. Use $mech->save_content if you need the
       decoded content.

       NOTE: Because ":content_file" causes the page contents to be stored in
       a file instead of the response object, some Mech functions that expect
       it to be there won't work as expected. Use with caution.

       Here is an incomplete list of methods that do not work as expected
       with ":content_file": forms(), current_form(), links(), title(),
       content(...), text(), all content-handling methods, all link methods,
       all image methods, all form methods, all field methods,
       save_content(...), dump_links(...), dump_images(...), dump_forms(...),
       dump_text(...)

   $mech->post( $uri, content => $content )
       POSTs $content to $uri.  Returns an HTTP::Response object.  $uri can be
       a well-formed URI string, a URI object, or a WWW::Mechanize::Link
       object.

   $mech->put( $uri, content => $content )
       PUTs $content to $uri.  Returns an HTTP::Response object.  $uri can be
       a well-formed URI string, a URI object, or a WWW::Mechanize::Link
       object.

           my $res = $mech->put( $uri );
           my $res = $mech->put( $uri , $field_name => $value, ... );

   $mech->head( $uri )
       Performs a HEAD request to $uri. Returns an HTTP::Response object.
       $uri can be a well-formed URI string, a URI object, or a
       WWW::Mechanize::Link object.

   $mech->delete( $uri )
       Performs a DELETE request to $uri. Returns an HTTP::Response object.
       $uri can be a well-formed URI string, a URI object, or a
       WWW::Mechanize::Link object.

   $mech->reload()
       Acts like the reload button in a browser: repeats the current request.
       The history (as per the back() method) is not altered.

       Returns the HTTP::Response object from the reload, or "undef" if
       there's no current request.

   $mech->back()
       The equivalent of hitting the "back" button in a browser.  Returns to
       the previous page.  Won't go back past the first page. (Really, what
       would it do if it could?)

       Returns true if it could go back, or false if not.

   $mech->clear_history()
       This deletes all the history entries and returns true.

   $mech->history_count()
       This returns the number of items in the browser history.  This number
       does include the most recently made request.

   $mech->history($n)
       This returns the nth item in history.  The 0th item is the most recent
       request and response, which would be acted on by methods like
       find_link().  The 1st item is the state you'd return to if you called
       back().

       The maximum useful value for $n is "$mech->history_count - 1".
       Requests beyond that bound will return "undef".

       History items are returned as hash references, in the form:

         { req => $http_request, res => $http_response }

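       The history methods can be exercised with a sketch like the following,
       which fetches two pages and then steps back.  The two local HTML files
       are hypothetical and are fetched through file: URIs only so that the
       example needs no network access.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use URI::file ();
use WWW::Mechanize ();

# Write two small local pages (hypothetical content).
my $dir = tempdir( CLEANUP => 1 );
my %uri_for;
for my $name (qw( first second )) {
    my $path = "$dir/$name.html";
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} "<html><head><title>$name</title></head>",
        "<body>$name page</body></html>";
    close $fh or die "Cannot close $path: $!";
    $uri_for{$name} = URI::file->new($path);
}

my $mech = WWW::Mechanize->new();
$mech->get( $uri_for{first} );
$mech->get( $uri_for{second} );

# The count includes the current request.
my $count_before = $mech->history_count();

# history(0) is the current request/response pair;
# history(1) is the state that back() would return to.
my $previous_uri = $mech->history(1)->{req}->uri;

my $went_back        = $mech->back();
my $title_after_back = $mech->title();

print "history_count before back(): $count_before\n";
print "previous request: $previous_uri\n";
print "title after back(): $title_after_back\n";
```
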

STATUS METHODS

   $mech->success()
       Returns a boolean telling whether the last request was successful.  If
       there hasn't been an operation yet, returns false.

       This is a convenience function that wraps "$mech->res->is_success".

   $mech->uri()
       Returns the current URI as a URI object. This object stringifies to the
       URI itself.

   $mech->response() / $mech->res()
       Returns the current response as an HTTP::Response object.
       "$mech->res()" is a synonym for "$mech->response()".

   $mech->status()
       Returns the HTTP status code of the response.  This is a 3-digit number
       like 200 for OK, 404 for not found, and so on.

   $mech->ct() / $mech->content_type()
       Returns the content type of the response.

   $mech->base()
       Returns the base URI for the current response.

   $mech->forms()
       When called in a list context, returns a list of the forms found in the
       last fetched page. In a scalar context, returns a reference to an array
       with those forms. The forms returned are all HTML::Form objects.

   $mech->current_form()
       Returns the current form as an HTML::Form object.

   $mech->links()
       When called in a list context, returns a list of the links found in the
       last fetched page.  In a scalar context it returns a reference to an
       array with those links.  Each link is a WWW::Mechanize::Link object.

   $mech->is_html()
       Returns true/false on whether our content is HTML, according to the
       HTTP headers.

   $mech->title()
       Returns the contents of the "<TITLE>" tag, as parsed by
       HTML::HeadParser.  Returns "undef" if the content is not HTML.

   $mech->redirects()
       Convenience method to get the redirects from the most recent
       HTTP::Response.

       Note that you can also use is_redirect to see if the most recent
       response was a redirect, like this:

           $mech->get($url);
           do_stuff() if $mech->res->is_redirect;

CONTENT-HANDLING METHODS

   $mech->content(...)
       Returns the content that the mech uses internally for the last page
       fetched. Ordinarily this is the same as
       "$mech->response()->decoded_content()", but this may differ for HTML
       documents if "update_html" is overloaded (in which case the value
       passed to the base-class implementation of same will be returned),
       and/or extra named arguments are passed to content():

       $mech->content( format => 'text' )
         Returns a text-only version of the page, with all HTML markup
         stripped. This feature requires HTML::TreeBuilder version 5 or higher
         to be installed, or a fatal error will be thrown. This works only if
         the contents are HTML.

       $mech->content( base_href => [$base_href|undef] )
         Returns the HTML document, modified to contain a "<base
         href="$base_href">" mark-up in the header.  $base_href is
         "$mech->base()" if not specified. This is handy to pass the HTML to
         e.g. HTML::Display. This works only if the contents are HTML.

       $mech->content( raw => 1 )
         Returns "$mech->response()->content()", i.e. the raw contents from
         the response.

       $mech->content( decoded_by_headers => 1 )
         Returns the content after applying all "Content-Encoding" headers but
         with no additional mangling.

       $mech->content( charset => $charset )
         Returns "$mech->response()->decoded_content(charset => $charset)"
         (see HTTP::Response for details).

       To preserve backwards compatibility, additional parameters will be
       ignored unless none of "raw | decoded_by_headers | charset" is
       specified and the text is HTML, in which case an error will be
       triggered.

       A fresh instance of WWW::Mechanize will return "undef" when
       "$mech->content()" is called, because no content is present before a
       request has been made.

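       A short sketch of a few content() variants follows.  It assumes a
       hypothetical local page fetched through a file: URI so that it runs
       offline, and the base URL "http://example.com/" is only an
       illustration.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use URI::file ();
use WWW::Mechanize ();

my $mech = WWW::Mechanize->new();

# No request has been made yet, so there is no content.
my $before = $mech->content();

# Hypothetical local page.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/page.html";
open my $fh, '>', $path or die "Cannot write $path: $!";
print {$fh} '<html><head><title>Demo</title></head>',
    '<body><p>Hello</p></body></html>';
close $fh or die "Cannot close $path: $!";

$mech->get( URI::file->new($path) );

my $decoded = $mech->content();                 # decoded content
my $raw     = $mech->content( raw => 1 );       # raw response body
my $based   = $mech->content( base_href => 'http://example.com/' );

print "before first request: ", ( defined $before ? 'defined' : 'undef' ), "\n";
print "decoded length: ", length($decoded), "\n";
print "raw length: ", length($raw), "\n";
print "with base href:\n$based\n";
```
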
   $mech->text()
       Returns the text of the current HTML content.  If the content isn't
       HTML, $mech will die.

       The text is extracted by parsing the content, and then the extracted
       text is cached, so don't worry about performance of calling this
       repeatedly.

LINK METHODS

   $mech->links()
       Lists all the links on the current page.  Each link is a
       WWW::Mechanize::Link object. In list context, returns a list of all
       links.  In scalar context, returns an array reference of all links.

   $mech->follow_link(...)
       Follows a specified link on the page.  You specify the match to be
       found using the same params that find_link() uses.

       Here are some examples:

       •   3rd link called "download"

               $mech->follow_link( text => 'download', n => 3 );

       •   first link where the URL has "download" in it, regardless of case:

               $mech->follow_link( url_regex => qr/download/i );

           or

               $mech->follow_link( url_regex => qr/(?i:download)/ );

       •   3rd link on the page

               $mech->follow_link( n => 3 );

       •   the link with the url

               $mech->follow_link( url => '/other/page' );

           or

               $mech->follow_link( url => 'http://example.com/page' );

       Returns the result of the "GET" method (an HTTP::Response object) if a
       link was found.

       If the page has no links, or the specified link couldn't be found,
       returns "undef".  If "autocheck" is enabled an exception will be thrown
       instead.

   $mech->find_link( ... )
       Finds a link in the currently fetched page. It returns a
       WWW::Mechanize::Link object which describes the link.  (You'll probably
       be most interested in the url() property.)  If it fails to find a link
       it returns "undef".

       You can take the URL part and pass it to the get() method.  If that's
       your plan, you might as well use the follow_link() method directly,
       since it does the get() for you automatically.

       Note that "<FRAME SRC="...">" tags are parsed out of the HTML and
       treated as links so this method works with them.

       You can select which link to find by passing in one or more of these
       key/value pairs:

       •   "text => 'string'," and "text_regex => qr/regex/,"

           "text" matches the text of the link against string, which must be
           an exact match.  To select a link with text that is exactly
           "download", use

               $mech->find_link( text => 'download' );

           "text_regex" matches the text of the link against regex.  To select
           a link with text that has "download" anywhere in it, regardless of
           case, use

               $mech->find_link( text_regex => qr/download/i );

           Note that the text extracted from the page's links is trimmed.
           For example, "<a> foo </a>" is stored as 'foo', and searching for
           leading or trailing spaces will fail.

       •   "url => 'string'," and "url_regex => qr/regex/,"

           Matches the URL of the link against string or regex, as
           appropriate.  The URL may be a relative URL, like foo/bar.html,
           depending on how it's coded on the page.

       •   "url_abs => string" and "url_abs_regex => regex"

           Matches the absolute URL of the link against string or regex, as
           appropriate.  The URL will be an absolute URL, even if it's
           relative in the page.

       •   "name => string" and "name_regex => regex"

           Matches the name of the link against string or regex, as
           appropriate.

       •   "rel => string" and "rel_regex => regex"

           Matches the rel of the link against string or regex, as
           appropriate.  This can be used to find stylesheets, favicons, or
           links the author of the page does not want bots to follow.

       •   "id => string" and "id_regex => regex"

           Matches the attribute 'id' of the link against string or regex, as
           appropriate.

       •   "class => string" and "class_regex => regex"

           Matches the attribute 'class' of the link against string or regex,
           as appropriate.

       •   "tag => string" and "tag_regex => regex"

           Matches the tag that the link came from against string or regex, as
           appropriate.  The "tag_regex" is probably most useful to check for
           more than one tag, as in:

               $mech->find_link( tag_regex => qr/^(a|frame)$/ );

           The tags and attributes looked at are defined below.

       If "n" is not specified, it defaults to 1.  Therefore, if you don't
       specify any params, this method defaults to finding the first link on
       the page.

       Note that you can specify multiple text or URL parameters, which will
       be ANDed together.  For example, to find the first link with text of
       "News" and with "cnn.com" in the URL, use:

           $mech->find_link( text => 'News', url_regex => qr/cnn\.com/ );

       The return value is a WWW::Mechanize::Link object describing the
       matching link, or "undef" if no link matches.  The links searched are
       all those found in "$mech->content".

       The links come from the following:

       "<a href=...>"
       "<area href=...>"
       "<frame src=...>"
       "<iframe src=...>"
       "<link href=...>"
       "<meta content=...>"

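       The selection rules above can be tried out with a sketch like this
       one.  The page content and file names are hypothetical, and the page is
       fetched through a file: URI only so that the example runs offline.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use URI::file ();
use WWW::Mechanize ();

# Hypothetical page with three links, written locally to avoid the network.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/links.html";
open my $fh, '>', $path or die "Cannot write $path: $!";
print {$fh} <<'HTML';
<html><head><title>Links</title></head><body>
<a href="/docs/index.html">Documentation</a>
<a href="files/tool-1.0.tar.gz"> Download latest </a>
<a href="files/tool-0.9.tar.gz">download previous</a>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new($path) );

# First link whose trimmed text contains "download", ignoring case.
my $first = $mech->find_link( text_regex => qr/download/i );

# Same pattern, but ask for the second match with "n".
my $second = $mech->find_link( text_regex => qr/download/i, n => 2 );

# No match: find_link() returns undef.
my $none = $mech->find_link( text => 'No such link' );

# All matching links; scalar context would give an array reference instead.
my @tarballs = $mech->find_all_links( url_regex => qr/\.tar\.gz$/ );

printf "first: %s (%s)\n", $first->text, $first->url;
printf "second: %s\n", $second->url;
printf "absolute: %s\n", $_->url_abs for @tarballs;
```
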
   $mech->find_all_links( ... )
       Returns all the links on the current page that match the criteria.  The
       method for specifying link criteria is the same as in find_link().
       Each of the links returned is a WWW::Mechanize::Link object.

       In list context, find_all_links() returns a list of the links.
       Otherwise, it returns a reference to the list of links.

       find_all_links() with no parameters returns all links in the page.

   $mech->find_all_inputs( ... criteria ... )
       find_all_inputs() returns an array of all the input controls in the
       current form whose properties match all of the regexes passed in.  The
       controls returned are all descended from HTML::Form::Input.  See
       "INPUTS" in HTML::Form for details.

       If no criteria are passed, all inputs will be returned.

       If there is no current page, there is no form on the current page, or
       there are no submit controls in the current form then the return will
       be an empty array.

       You may use a regex or a literal string:

           # get all textarea controls whose names begin with "customer"
           my @customer_text_inputs = $mech->find_all_inputs(
               type       => 'textarea',
               name_regex => qr/^customer/,
           );

           # get all text or textarea controls called "customer"
           my @customer_inputs = $mech->find_all_inputs(
               type_regex => qr/^(text|textarea)$/,
               name       => 'customer',
           );

   $mech->find_all_submits( ... criteria ... )
       find_all_submits() does the same thing as find_all_inputs() except that
       it only returns controls that are submit controls, ignoring other types
       of input controls like text and checkboxes.

IMAGE METHODS

   $mech->images
       Lists all the images on the current page.  Each image is a
       WWW::Mechanize::Image object. In list context, returns a list of all
       images.  In scalar context, returns an array reference of all images.

   $mech->find_image()
       Finds an image in the current page. It returns a WWW::Mechanize::Image
       object which describes the image.  If it fails to find an image it
       returns "undef".

       You can select which image to find by passing in one or more of these
       key/value pairs:

       •   "alt => 'string'" and "alt_regex => qr/regex/"

           "alt" matches the ALT attribute of the image against string, which
           must be an exact match. To select an image with an ALT attribute
           that is exactly "download", use

               $mech->find_image( alt => 'download' );

           "alt_regex" matches the ALT attribute of the image against a
           regular expression.  To select an image with an ALT attribute that
           has "download" anywhere in it, regardless of case, use

               $mech->find_image( alt_regex => qr/download/i );

       •   "url => 'string'" and "url_regex => qr/regex/"

           Matches the URL of the image against string or regex, as
           appropriate.  The URL may be a relative URL, like foo/bar.html,
           depending on how it's coded on the page.

       •   "url_abs => string" and "url_abs_regex => regex"

           Matches the absolute URL of the image against string or regex, as
           appropriate.  The URL will be an absolute URL, even if it's
           relative in the page.

       •   "tag => string" and "tag_regex => regex"

           Matches the tag that the image came from against string or regex,
           as appropriate.  The "tag_regex" is probably most useful to check
           for more than one tag, as in:

               $mech->find_image( tag_regex => qr/^(img|input)$/ );

           The tags supported are "<img>" and "<input>".

       •   "id => string" and "id_regex => regex"

           "id" matches the id attribute of the image against string, which
           must be an exact match. To select an image with the exact id
           "download-image", use

               $mech->find_image( id => 'download-image' );

           "id_regex" matches the id attribute of the image against a regular
           expression. To select the first image with an id that contains
           "download" anywhere in it, use

               $mech->find_image( id_regex => qr/download/ );

       •   "class => string" and "class_regex => regex"

           "class" matches the class attribute of the image against string,
           which must be an exact match. To select an image with the exact
           class "img-fluid", use

               $mech->find_image( class => 'img-fluid' );

           To select an image with the class attribute "rounded float-left",
           use

               $mech->find_image( class => 'rounded float-left' );

           Note that the classes have to be matched as a complete string, in
           the exact order they appear in the website's source code.

           "class_regex" matches the class attribute of the image against a
           regular expression. Use this if you want a partial class name, or
           if an image has several classes, but you only care about one.

           To select the first image with the class "rounded", where there are
           multiple images that might also have either class "float-left" or
           "float-right", use

               $mech->find_image( class_regex => qr/\brounded\b/ );

           Selecting an image with multiple classes where you do not care
           about the order they appear in the website's source code is not
           currently supported.

       If "n" is not specified, it defaults to 1.  Therefore, if you don't
       specify any params, this method defaults to finding the first image on
       the page.

       Note that you can specify multiple ALT or URL parameters, which will be
       ANDed together.  For example, to find the first image with ALT text of
       "News" and with "cnn.com" in the URL, use:

           $mech->find_image( alt => 'News', url_regex => qr/cnn\.com/ );

       The return value is a WWW::Mechanize::Image object describing the
       matching image, or "undef" if no image matches.  The images searched
       are all those found in "$mech->content".

   $mech->find_all_images( ... )
       Returns all the images on the current page that match the criteria.
       The method for specifying image criteria is the same as in
       find_image().  Each of the images returned is a WWW::Mechanize::Image
       object.

       In list context, find_all_images() returns a list of the images.
       Otherwise, it returns a reference to the list of images.

       find_all_images() with no parameters returns all images in the page.


FORM METHODS

792       These methods let you work with the forms on a page.  The idea is to
793       choose a form that you'll later work with using the field methods
794       below.
795
796   $mech->forms
797       Lists all the forms on the current page.  Each form is an HTML::Form
798       object.  In list context, returns a list of all forms.  In scalar
799       context, returns an array reference of all forms.
800
801   $mech->form_number($number)
802       Selects form number $number on the page as the target for subsequent
803       calls to field() and click().  Also returns the form that was selected.
804
805       If it is found, the form is returned as an HTML::Form object and set
806       internally for later use with Mech's form methods such as field() and
807       click().  When called in a list context, the number of the found form
808       is also returned as a second value.
809
810       Emits a warning and returns "undef" if no form is found.
811
812       The first form is number 1, not zero.
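
       As a rough illustration of the context-sensitive return values of
       forms() and form_number(), consider the sketch below. The two-form
       page is hypothetical and is loaded from a temporary file (via
       File::Temp and URI::file) so that no network access is needed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical page with two forms, written to a temporary file.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form name="search" action="/search"><input type="text" name="q"></form>
<form name="login" action="/login"><input type="text" name="user"></form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );

my @forms     = $mech->forms;    # list context: list of HTML::Form objects
my $forms_ref = $mech->forms;    # scalar context: array reference

# Forms are numbered from 1; in list context the number is returned too.
my ( $form, $number ) = $mech->form_number(2);

printf "%d forms, selected #%d (%s)\n", scalar @forms, $number,
    $form->attr('name');
```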
813
814   $mech->form_action( $action )
815       Selects a form by action, using a regex containing $action.  If there
816       is more than one form on the page matching that action, then the first
817       one is used, and a warning is generated.
818
819       If it is found, the form is returned as an HTML::Form object and set
820       internally for later use with Mech's form methods such as field() and
821       click().
822
823       Returns "undef" if no form is found.
824
825   $mech->form_name( $name [, \%args ] )
826       Selects a form by name.
827
828       By default, the first form that has this name will be returned.
829
830           my $form = $mech->form_name("order_form");
831
832       If you want the second, third or nth match, pass an optional arguments
833       hash reference as the final parameter with a key "n" to pick which
834       instance you want. The numbering starts at 1.
835
836           my $third_product_form = $mech->form_name("buy_now", { n => 3 });
837
838       If the "n" parameter is not passed, and there is more than one form on
839       the page with that name, then the first one is used, and a warning is
840       generated.
841
842       If it is found, the form is returned as an HTML::Form object and set
843       internally for later use with Mech's form methods such as field() and
844       click().
845
846       Returns "undef" if no form is found.
847
848   $mech->form_id( $id [, \%args ] )
849       Selects a form by ID.
850
851       By default, the first form that has this ID will be returned.
852
853           my $form = $mech->form_id("order_form");
854
855       Although the HTML specification requires the ID to be unique within a
856       page, some pages might not adhere to that. If you want the second,
857       third or nth match, pass an optional arguments hash reference as the
858       final parameter with a key "n" to pick which instance you want. The
859       numbering starts at 1.
860
861           my $third_product_form = $mech->form_id("buy_now", { n => 3 });
862
863       If the "n" parameter is not passed, and there is more than one form on
864       the page with that ID, then the first one is used, and a warning is
865       generated.
866
867       If it is found, the form is returned as an HTML::Form object and set
868       internally for later use with Mech's form methods such as field() and
869       click().
870
871       If no form is found it returns "undef".  This will also trigger a
872       warning, unless "quiet" is enabled.
873
874   $mech->all_forms_with_fields( @fields )
875       Selects a form by passing in a list of field names it must contain.
876       All matching forms (perhaps none) are returned as a list of HTML::Form
877       objects.
878
879   $mech->form_with_fields( @fields, [ \%args ] )
880       Selects a form by passing in a list of field names it must contain. By
881       default, the first form that matches all of these field names will be
882       returned.
883
884           my $form = $mech->form_with_fields( qw/sku quantity add_to_cart/ );
885
886       If you want the second, third or nth match, pass an optional arguments
887       hash reference as the final parameter with a key "n" to pick which
888       instance you want. The numbering starts at 1.
889
890           my $form = $mech->form_with_fields( 'sku', 'qty', { n => 2 } );
891
892       If the "n" parameter is not passed, and there is more than one form on
893       the page containing those fields, then the first one is used, and a
894       warning is generated.
895
896       If it is found, the form is returned as an HTML::Form object and set
897       internally for later use with Mech's form methods such as field() and
898       click().
899
900       Returns "undef" and emits a warning if no form is found.
901
902       Note that this functionality requires libwww-perl 5.69 or higher.
903
904   $mech->all_forms_with( $attr1 => $value1, $attr2 => $value2, ... )
905       Searches for forms with arbitrary attribute/value pairs within the
906       <form> tag.  When given more than one pair, all criteria must match.
907       Using "undef" as value means that the attribute in question must not be
908       present.
909
910       All matching forms (perhaps none) are returned as a list of HTML::Form
911       objects.
912
913   $mech->form_with( $attr1 => $value1, $attr2 => $value2, ..., [ \%args ] )
914       Searches for forms with arbitrary attribute/value pairs within the
915       <form> tag.  When given more than one pair, all criteria must match.
916       Using "undef" as value means that the attribute in question must not be
917       present.
918
919       By default, the first form that matches all criteria will be returned.
920
921           my $form = $mech->form_with( name => 'order_form', method => 'POST' );
922
923       If you want the second, third or nth match, pass an optional arguments
924       hash reference as the final parameter with a key "n" to pick which
925       instance you want. The numbering starts at 1.
926
927           my $form = $mech->form_with( method => 'POST', { n => 4 } );
928
929       If the "n" parameter is not passed, and there is more than one form on
930       the page matching these criteria, then the first one is used, and a
931       warning is generated.
932
933       If it is found, the form is returned as an HTML::Form object and set
934       internally for later use with Mech's form methods such as field() and
935       click().
936
937       Returns "undef" if no form is found.
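
       The form selectors above can be compared side by side on a page
       with duplicate form names. This is a sketch under assumptions: the
       markup, form names and field names are invented for the example,
       and the page is read from a temporary file rather than a web
       server.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical page: two forms share the name "buy_now".
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form id="f1" name="buy_now" action="/buy">
  <input type="text" name="sku" value="A1">
  <input type="text" name="quantity" value="1">
</form>
<form id="f2" name="buy_now" action="/buy">
  <input type="text" name="sku" value="B2">
  <input type="text" name="quantity" value="1">
</form>
<form id="newsletter" name="newsletter" action="/subscribe">
  <input type="text" name="email">
</form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );

# Pick the second form named "buy_now" (numbering starts at 1).
my $second = $mech->form_name( 'buy_now', { n => 2 } );

# Select by ID; this ID is unique on the sample page.
my $news = $mech->form_id('newsletter');

# Select by the fields a form must contain, again taking the second match.
my $by_fields = $mech->form_with_fields( 'sku', 'quantity', { n => 2 } );

# all_forms_with() returns every match and does not need an "n" argument.
my @buy_forms = $mech->all_forms_with( name => 'buy_now' );

# form_with() takes attribute/value pairs plus the optional { n => ... }.
my $first = $mech->form_with( name => 'buy_now', { n => 1 } );

printf "%s %s %s %d %s\n", $second->value('sku'), $news->attr('id'),
    $by_fields->value('sku'), scalar @buy_forms, $first->value('sku');
```

       Passing "n" explicitly, as done here, also avoids the warning that
       is generated when several forms match and the first one is used.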
938

FIELD METHODS

940       These methods allow you to set the values of fields in a given form.
941
942   $mech->field( $name, $value, $number )
943   $mech->field( $name, \@values, $number )
944   $mech->field( $name, \@file_upload_values, $number )
945       Given the name of a field, set its value to the value specified.  This
946       applies to the current form (as set by the form_name() or form_number()
947       method or defaulting to the first form on the page).
948
949       If the field is of type "file", its value should be an arrayref.
950       Example:
951
952           $mech->field( $file_input, ['/tmp/file.txt'] );
953
954       Value examples for "file" inputs, followed by an explanation of what
955       each index means:
956
957           # 0: filepath      1: filename    2: headers
958           ['/tmp/file.txt']
959           ['/tmp/file.txt', 'filename.txt']
960           ['/tmp/file.txt', 'filename.txt', @headers]
961           ['/tmp/file.txt', 'filename.txt', Content => 'some content']
962           [undef,           'filename.txt', Content => 'content here']
963
964       Index 0 is the filepath that will be read from disk. Index 1 is the
965       filename which will be used in the HTTP request body; if not given,
966       filepath (index 0) is used instead. If "Content => 'content here'" is
967       used as shown, then filepath will be ignored.
968
969       The optional $number parameter is used to distinguish between two
970       fields with the same name.  The fields are numbered from 1.
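
       To make the role of $number concrete, here is a small sketch that
       sets two fields sharing one name and reads them back with value().
       The form markup and field names are assumptions for the example;
       the page is loaded from a temporary file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical form with two text inputs that share the name "tag".
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form action="/post">
  <input type="text" name="title">
  <input type="text" name="tag">
  <input type="text" name="tag">
</form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );
$mech->form_number(1);

$mech->field( 'title', 'Hello' );    # only one "title" field
$mech->field( 'tag', 'perl', 1 );    # first "tag" field
$mech->field( 'tag', 'web',  2 );    # second "tag" field

printf "%s %s %s\n", $mech->value('title'), $mech->value( 'tag', 1 ),
    $mech->value( 'tag', 2 );
```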
971
972   $mech->select($name, $new_or_additional_single_value)
973   $mech->select($name, \%new_single_value_by_number)
974   $mech->select($name, \@new_list_of_values)
975   $mech->select($name, \%new_list_of_values_by_number)
976       Given the name of a "select" field, set its value to the value
977       specified.
978
979           # select 'foo'
980           $mech->select($name, 'foo');
981
982       If the field is not "<select multiple>" and the $value is an array
983       reference, only the first value will be set.  [Note: until version
984       1.05_03 the documentation claimed that only the last value would be
985       set, but this was incorrect.]
986
987           # select 'bar'
988           $mech->select($name, ['bar', 'ignored', 'ignored']);
989
990       Passing $value as a hash reference with an "n" key selects an item by
991       number.
992
993           # select the third value
994           $mech->select($name, {n => 3});
995
996       The numbering starts at 1.  This applies to the current form.
997
998       If you have a field with "<select multiple>" and you pass a single
999       $value, then $value will be added to the list of fields selected,
1000       without clearing the others.
1001
1002           # add 'bar' to the list of selected values
1003           $mech->select($name, 'bar');
1004
1005       However, if you pass an array reference, then all previously selected
1006       values will be cleared and replaced with all values inside the array
1007       reference.
1008
1009           # replace the selection with 'foo' and 'bar'
1010           $mech->select($name, ['foo', 'bar']);
1011
1012       This also works when selecting by numbers, in which case the value of
1013       the "n" key will be an array reference of value numbers you want to
1014       replace the selection with.
1015
1016           # replace the selection with the 2nd and 4th element
1017           $mech->select($name, {n => [2, 4]});
1018
1019       To add multiple additional values to the list of selected fields
1020       without clearing, call "select" in the simple $value form with each
1021       single value in a loop.
1022
1023           # add all values in the array to the selection
1024           $mech->select($name, $_) for @additional_values;
1025
1026       Returns true on successfully setting the value. On failure, returns
1027       false and calls "$self->warn()" with an error message.
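
       A minimal sketch of the two most common select() calls follows:
       one value for a plain select box and an array reference that
       replaces the selection of a "<select multiple>". The markup is
       hypothetical, and reading the selection back through
       "$mech->current_form->param(...)" relies on HTML::Form rather than
       on a WWW::Mechanize method.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical order form with a single and a multiple select.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form name="order" action="/order">
  <select name="size">
    <option value="small">Small</option>
    <option value="medium">Medium</option>
    <option value="large">Large</option>
  </select>
  <select name="toppings" multiple>
    <option value="cheese">Cheese</option>
    <option value="ham">Ham</option>
    <option value="olive">Olive</option>
  </select>
</form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );
$mech->form_name('order');

# Plain select: the given value becomes the selection.
my $ok = $mech->select( 'size', 'large' );

# Multiple select: the array reference replaces the whole selection.
$mech->select( 'toppings', [ 'ham', 'olive' ] );

# HTML::Form's param() lists the currently selected values.
my @toppings = $mech->current_form->param('toppings');

printf "%s: %s\n", $mech->value('size'), join ',', @toppings;
```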
1028
1029   $mech->set_fields( $name => $value ... )
1030   $mech->set_fields( $name => \@value_and_instance_number )
1031   $mech->set_fields( $name => \$value_instance_number )
1032   $mech->set_fields( $name => \@file_upload )
1033       This method sets multiple fields of the current form. It takes a list
1034       of field name and value pairs. If there is more than one field with the
1035       same name, the first one found is set. To choose which of the duplicate
1036       fields to set, pass an anonymous array containing the field value and
1037       the field's number as its two elements.
1038
1039               # set the second $name field to 'foo'
1040               $mech->set_fields( $name => [ 'foo', 2 ] );
1041
1042       The value of a field of type "file" should be an arrayref as described
1043       in field(). Examples:
1044
1045               $mech->set_fields( $file_field => ['/tmp/file.txt'] );
1046               $mech->set_fields( $file_field => ['/tmp/file.txt', 'filename.txt'] );
1047
1048       The value for a "file" input can also be an arrayref containing an
1049       arrayref and a number, as documented in submit_form().  The number will
1050       be used to find the field in the form. Example:
1051
1052               $mech->set_fields( $file_field => [['/tmp/file.txt'], 1] );
1053
1054       The fields are numbered from 1.
1055
1056       For fields that have a predefined set of values, you may also provide a
1057       reference to an integer, if you don't know the options for the field,
1058       but you know you just want (e.g.) the first one.
1059
1060               # select the first value in the $name select box
1061               $mech->set_fields( $name => \0 );
1062               # select the last value in the $name select box
1063               $mech->set_fields( $name => \-1 );
1064
1065       This applies to the current form.
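
       The three value forms can be combined in a single call, as the
       following sketch suggests. The form and its field names are
       invented for the example, and the page comes from a temporary file
       instead of a live site.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical form: a duplicated "tag" field and a "size" select box.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form name="post" action="/post">
  <input type="text" name="title">
  <input type="text" name="tag">
  <input type="text" name="tag">
  <select name="size">
    <option value="small">Small</option>
    <option value="medium">Medium</option>
    <option value="large">Large</option>
  </select>
</form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );
$mech->form_name('post');

$mech->set_fields(
    title => 'Hello',           # plain value, first matching field
    tag   => [ 'web', 2 ],      # value plus field number (from 1)
    size  => \-1,               # reference to an index: last option
);

printf "%s %s %s\n", $mech->value('title'), $mech->value( 'tag', 2 ),
    $mech->value('size');
```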
1066
1067   $mech->set_visible( @criteria )
1068       This method sets fields of the current form without having to know
1069       their names.  So if you have a login screen that wants a username and
1070       password, you do not have to fetch the form and inspect the source (or
1071       use the mech-dump utility, installed with WWW::Mechanize) to see what
1072       the field names are; you can just say
1073
1074           $mech->set_visible( $username, $password );
1075
1076       and the first and second fields will be set accordingly.  The method is
1077       called set_visible because it acts only on visible fields; hidden form
1078       inputs are not considered.  The order of the fields is the order in
1079       which they appear in the HTML source, which is nearly always the order
1080       anyone viewing the page would think they are in, but some creative work
1081       with tables could change that; caveat user.
1082
1083       Each element in @criteria is either a field value or a field specifier.
1084       A field value is a scalar.  A field specifier allows you to specify the
1085       type of input field you want to set and is denoted with an arrayref
1086       containing two elements.  So you could specify the first radio button
1087       with
1088
1089           $mech->set_visible( [ radio => 'KCRW' ] );
1090
1091       Field values and specifiers can be intermixed, hence
1092
1093           $mech->set_visible( 'fred', 'secret', [ option => 'Checking' ] );
1094
1095       would set the first two fields to "fred" and "secret", and the next
1096       "OPTION" menu field to "Checking".
1097
1098       The possible field specifier types are: "text", "password", "hidden",
1099       "textarea", "file", "image", "submit", "radio", "checkbox" and
1100       "option".
1101
1102       "set_visible" returns the number of values set.
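
       The following sketch shows how set_visible() skips a hidden input
       and how a field specifier jumps ahead to the next input of a given
       type. The login form is hypothetical and is served from a
       temporary file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical login form; the hidden "token" input comes first.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form name="login" action="/login">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
  <select name="account">
    <option value="Checking">Checking</option>
    <option value="Savings">Savings</option>
  </select>
</form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );
$mech->form_name('login');

# Two plain values fill the first two visible fields in source order;
# the specifier then sets the next "option" (select) field.
my $count = $mech->set_visible( 'fred', 'secret', [ option => 'Savings' ] );

printf "%d set: %s / %s / %s (token still %s)\n", $count,
    $mech->value('username'), $mech->value('password'),
    $mech->value('account'), $mech->value('token');
```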
1103
1104   $mech->tick( $name, $value [, $set] )
1105       "Ticks" the first checkbox that has both the name and value associated
1106       with it on the current form.  If there is no value to the input, just
1107       pass an empty string as the value.  Dies if there is no named checkbox
1108       for the value given, if a value is given.  Passing in a false value as
1109       the third optional argument will cause the checkbox to be unticked.
1110       The third value does not need to be set if you wish to merely tick the
1111       box.
1112
1113           $mech->tick('extra', 'cheese');
1114           $mech->tick('extra', 'mushrooms');
1115
1116           $mech->tick('no_value', ''); # <input type="checkbox" name="no_value">
1117
1118   $mech->untick($name, $value)
1119       Causes the checkbox to be unticked.  Shorthand for
1120       "tick($name,$value,undef)"
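
       A short sketch of tick() and untick() on a group of checkboxes
       that share one name. The markup is hypothetical, and the ticked
       values are read back with HTML::Form's param() on the current
       form, which is an assumption about how one might inspect the
       result.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical form with three "extra" checkboxes.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form name="pizza" action="/order">
  <input type="checkbox" name="extra" value="cheese">
  <input type="checkbox" name="extra" value="mushrooms">
  <input type="checkbox" name="extra" value="olives">
</form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );
$mech->form_name('pizza');

$mech->tick( 'extra', 'cheese' );
$mech->tick( 'extra', 'mushrooms' );
$mech->untick( 'extra', 'cheese' );    # same as tick(..., undef)

# Only boxes that are currently ticked contribute a value.
my @ticked = $mech->current_form->param('extra');
print "ticked: @ticked\n";
```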
1121
1122   $mech->value( $name [, $number] )
1123       Given the name of a field, return its value. This applies to the
1124       current form.
1125
1126       The optional $number parameter is used to distinguish between two
1127       fields with the same name.  The fields are numbered from 1.
1128
1129       If the field is of type file (file upload field), the value is always
1130       cleared to prevent remote sites from downloading your local files.  To
1131       upload a file, specify its file name explicitly.
1132
1133   $mech->click( $button [, $x, $y] )
1134       Has the effect of clicking a button on the current form.  The first
1135       argument is the name of the button to be clicked.  The second and third
1136       arguments (optional) allow you to specify the (x,y) coordinates of the
1137       click.
1138
1139       If there is only one button on the form, "$mech->click()" with no
1140       arguments simply clicks that one button.
1141
1142       Returns an HTTP::Response object.
1143
1144   $mech->click_button( ... )
1145       Has the effect of clicking a button on the current form by specifying
1146       its attributes. The arguments are a list of key/value pairs. Exactly
1147       one of name, id, number, input or value must be specified in the keys.
1148
1149       Dies if no button is found.
1150
1151       •   "name => name"
1152
1153           Clicks the button named name in the current form.
1154
1155       •   "id => id"
1156
1157           Clicks the button with the id id in the current form.
1158
1159       •   "number => n"
1160
1161           Clicks the nth button with type submit in the current form.
1162           Numbering starts at 1.
1163
1164       •   "value => value"
1165
1166           Clicks the button with the value value in the current form.
1167
1168       •   "input => $inputobject"
1169
1170           Clicks on the button referenced by $inputobject, an instance of
1171           HTML::Form::SubmitInput obtained e.g. from
1172
1173               $mech->current_form()->find_input( undef, 'submit' )
1174
1175           $inputobject must belong to the current form.
1176
1177       •   "x => x"
1178
1179       •   "y => y"
1180
1181           These arguments (optional) allow you to specify the (x,y)
1182           coordinates of the click.
1183
1184   $mech->submit()
1185       Submits the current form, without specifying a button to click.
1186       Actually, no button is clicked at all.
1187
1188       Returns an HTTP::Response object.
1189
1190       This used to be a synonym for "$mech->click( 'submit' )", but is no
1191       longer so.
1192
1193   $mech->submit_form( ... )
1194       This method lets you select a form from the previously fetched page,
1195       fill in its fields, and submit it. It combines the
1196       "form_number"/"form_name", "set_fields" and "click" methods into one
1197       higher level call. Its arguments are a list of key/value pairs, all of
1198       which are optional.
1199
1200       •   "fields => \%fields"
1201
1202           Specifies the fields to be filled in the current form.
1203
1204       •   "with_fields => \%fields"
1205
1206           Probably all you need for the common case. It combines a smart form
1207           selector and data setting in one operation. It selects the first
1208           form that contains all fields mentioned in "\%fields".  This is
1209           nice because you don't need to know the name or number of the form
1210           to do this.
1211
1212           (calls form_with_fields() and set_fields()).
1213
1214           If you choose "with_fields", the "fields" option will be ignored.
1215           The "form_number", "form_name" and "form_id" options will still be
1216           used.  An exception will be thrown unless exactly one form matches
1217           all of the provided criteria.
1218
1219       •   "form_number => n"
1220
1221           Selects the nth form (calls form_number()).  If this param is not
1222           specified, the currently-selected form is used.
1223
1224       •   "form_name => name"
1225
1226           Selects the form named name (calls form_name())
1227
1228       •   "form_id => ID"
1229
1230           Selects the form with ID ID (calls form_id())
1231
1232       •   "button => button"
1233
1234           Clicks on button button (calls click())
1235
1236       •   "x => x, y => y"
1237
1238           Sets the x or y values for click()
1239
1240       •   "strict_forms => bool"
1241
1242           Sets the HTML::Form strict flag which causes form submission to
1243           croak if any of the passed fields don't exist on the page, and/or a
1244           value doesn't exist in a select element.  By default HTML::Form
1245           sets this value to false.
1246
1247           This behavior can also be turned on globally by passing
1248           "strict_forms => 1" to "WWW::Mechanize->new". If you do that, you
1249           can still disable it for individual calls by passing "strict_forms
1250           => 0" here.
1251
1252       If no form is selected, the first form found is used.
1253
1254       If button is not passed, then the submit() method is used instead.
1255
1256       If you want to submit a file and get its content from a scalar rather
1257       than a file in the filesystem, you can use:
1258
1259           $mech->submit_form(with_fields => { logfile => [ [ undef, 'whatever', Content => $content ], 1 ] } );
1260
1261       Returns an HTTP::Response object.
1262

MISCELLANEOUS METHODS

1264   $mech->add_header( name => $value [, name => $value... ] )
1265       Sets HTTP headers for the agent to add or remove from the HTTP request.
1266
1267           $mech->add_header( Encoding => 'text/klingon' );
1268
1269       If a value is "undef", then that header will be removed from any future
1270       requests.  For example, to never send a Referer header:
1271
1272           $mech->add_header( Referer => undef );
1273
1274       If you want to delete a header, use "delete_header".
1275
1276       Returns the number of name/value pairs added.
1277
1278       NOTE: This method was very different in WWW::Mechanize before 1.00.
1279       Back then, the headers were stored in a package hash, not as a member
1280       of the object instance.  Calling add_header() would modify the headers
1281       for every WWW::Mechanize object, even after your object no longer
1282       existed.
1283
1284   $mech->delete_header( name [, name ... ] )
1285       Removes HTTP headers from the agent's list of special headers.  For
1286       instance, you might need to do something like:
1287
1288           # Don't send a Referer for this URL
1289           $mech->add_header( Referer => undef );
1290
1291           # Get the URL
1292           $mech->get( $url );
1293
1294           # Back to the default behavior
1295           $mech->delete_header( 'Referer' );
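
       One way to observe add_header() and delete_header() is to inspect
       the request object attached to the response. In this sketch the
       "X-Demo" header name is made up, the page is a local temporary
       file, and "response->request->header" comes from HTTP::Response
       and HTTP::Request rather than from WWW::Mechanize itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# A trivial local page to fetch.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} "<html><body><p>hello</p></body></html>\n";
close $fh or die "Cannot close $path: $!";
my $url = URI::file->new_abs($path)->as_string;

my $mech = WWW::Mechanize->new();

# add_header() returns the number of name/value pairs it stored.
my $added = $mech->add_header( 'X-Demo' => 'on' );

$mech->get($url);
my $sent = $mech->response->request->header('X-Demo');

# After delete_header() the header is no longer added to requests.
$mech->delete_header('X-Demo');
$mech->get($url);
my $after = $mech->response->request->header('X-Demo');

printf "added=%d sent=%s after=%s\n", $added, $sent,
    defined $after ? $after : '(none)';
```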
1296
1297   $mech->quiet(true/false)
1298       Allows you to suppress warnings to the screen.
1299
1300           $mech->quiet(0); # turns on warnings (the default)
1301           $mech->quiet(1); # turns off warnings
1302           $mech->quiet();  # returns the current quietness status
1303
1304   $mech->autocheck(true/false)
1305       Allows you to enable and disable autochecking.
1306
1307       Autocheck checks each request made to see if it was successful. This
1308       saves you the trouble of manually checking yourself. Any failure is
1309       raised as an error, not a warning. Please see "new" for more details.
1310
1311           $mech->autocheck(1); # turns on automatic request checking (the default)
1312           $mech->autocheck(0); # turns off automatic request checking
1313           $mech->autocheck();  # returns the current autocheck status
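
       The difference between the two autocheck settings can be seen by
       requesting a file that does not exist. This is a sketch: the
       missing path inside a temporary directory is an assumption chosen
       so that the example runs without network access.

```perl
use strict;
use warnings;
use File::Spec ();
use File::Temp qw(tempdir);
use URI::file ();
use WWW::Mechanize ();

# A path that is guaranteed not to exist inside a fresh temp directory.
my $dir     = tempdir( CLEANUP => 1 );
my $missing = URI::file->new_abs( File::Spec->catfile( $dir, 'missing.html' ) )
    ->as_string;

my $mech = WWW::Mechanize->new();    # autocheck is on by default

# With autocheck on, a failed request dies, so wrap it in eval.
my $died = !eval { $mech->get($missing); 1 };

# With autocheck off, the caller inspects the outcome manually.
$mech->autocheck(0);
my $res = $mech->get($missing);

printf "died=%d success=%d status=%d\n", $died ? 1 : 0,
    $mech->success ? 1 : 0, $res->code;
```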
1314
1315   $mech->stack_depth( $max_depth )
1316       Get or set the page stack depth. Use this if you're doing a lot of page
1317       scraping and running out of memory.
1318
1319       A value of 0 means "no history at all."  By default, the max stack
1320       depth is humongously large, effectively keeping all history.
1321
1322   $mech->save_content( $filename, %opts )
1323       Dumps the contents of "$mech->content" into $filename.  $filename will
1324       be overwritten.  Dies if there are any errors.
1325
1326       If the content type does not begin with "text/", then the content is
1327       saved in binary mode (i.e. binmode() is set on the output filehandle).
1328
1329       Additional arguments can be passed as key/value pairs:
1330
1331       $mech->save_content( $filename, binary => 1 )
1332           Filehandle is set with "binmode" to ":raw" and contents are taken
1333           calling "$self->content(decoded_by_headers => 1)". Same as calling:
1334
1335               $mech->save_content( $filename, binmode => ':raw',
1336                                    decoded_by_headers => 1 );
1337
1338           This should be the safest way to save contents verbatim.
1339
1340       $mech->save_content( $filename, binmode => $binmode )
1341           Filehandle is set to binary mode. If $binmode begins with ':', it
1342           is passed as a parameter to "binmode":
1343
1344               binmode $fh, $binmode;
1345
1346           otherwise the filehandle is set to binary mode if $binmode is true:
1347
1348               binmode $fh;
1349
1350       all other arguments
1351           are passed as-is to "$mech->content(%opts)". In particular,
1352           "decoded_by_headers" might come in handy if you want to revert the
1353           effect of line compression performed by the web server but without
1354           further interpreting the contents (e.g. decoding it according to
1355           the charset).
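
       As a sketch of the "binary => 1" form, the following saves a
       fetched page and compares the copy byte for byte with the source.
       The file names and the temporary directory are assumptions made
       for the example.

```perl
use strict;
use warnings;
use File::Spec ();
use File::Temp qw(tempdir);
use URI::file ();
use WWW::Mechanize ();

my $dir  = tempdir( CLEANUP => 1 );
my $src  = File::Spec->catfile( $dir, 'page.html' );
my $copy = File::Spec->catfile( $dir, 'copy.html' );

# Create the page to fetch.
my $html = "<html><body><p>saved</p></body></html>\n";
open my $out, '>:raw', $src or die "Cannot write $src: $!";
print {$out} $html;
close $out or die "Cannot close $src: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($src)->as_string );

# binary => 1 implies binmode => ':raw' and decoded_by_headers => 1.
$mech->save_content( $copy, binary => 1 );

# Read the copy back without any I/O layer translation.
open my $in, '<:raw', $copy or die "Cannot read $copy: $!";
my $saved = do { local $/; <$in> };
close $in;

print $saved eq $html ? "identical\n" : "different\n";
```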
1356
1357   $mech->dump_headers( [$fh] )
1358       Prints a dump of the HTTP response headers for the most recent
1359       response.  If $fh is not specified or is "undef", it dumps to STDOUT.
1360
1361       Unlike the rest of the "dump_*" methods, $fh can be a scalar. It will
1362       be used as a file name.
1363
1364   $mech->dump_links( [[$fh], $absolute] )
1365       Prints a dump of the links on the current page to $fh.  If $fh is not
1366       specified or is "undef", it dumps to STDOUT.
1367
1368       If $absolute is true, links displayed are absolute, not relative.
1369
1370   $mech->dump_images( [[$fh], $absolute] )
1371       Prints a dump of the images on the current page to $fh.  If $fh is not
1372       specified or is "undef", it dumps to STDOUT.
1373
1374       If $absolute is true, image URLs displayed are absolute, not relative.
1375
1376       The output will include empty lines for images that have no "src"
1377       attribute and therefore no URL.
1378
1379   $mech->dump_forms( [$fh] )
1380       Prints a dump of the forms on the current page to $fh.  If $fh is not
1381       specified or is "undef", it dumps to STDOUT. Running the following:
1382
1383           my $mech = WWW::Mechanize->new();
1384           $mech->get("https://www.google.com/");
1385           $mech->dump_forms;
1386
1387       will print:
1388
1389           GET https://www.google.com/search [f]
1390             ie=ISO-8859-1                  (hidden readonly)
1391             hl=en                          (hidden readonly)
1392             source=hp                      (hidden readonly)
1393             biw=                           (hidden readonly)
1394             bih=                           (hidden readonly)
1395             q=                             (text)
1396             btnG=Google Search             (submit)
1397             btnI=I'm Feeling Lucky         (submit)
1398             gbv=1                          (hidden readonly)
1399
1400   $mech->dump_text( [$fh] )
1401       Prints a dump of the text on the current page to $fh.  If $fh is not
1402       specified or is "undef", it dumps to STDOUT.
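
       Because the "dump_*" methods accept a filehandle, their output can
       be captured in a string by opening a handle on a scalar reference.
       The sketch below does this for dump_links() and dump_forms(); the
       sample page is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file ();
use WWW::Mechanize ();

# Hypothetical page with two links and one form.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<a href="a.html">A</a>
<a href="b.html">B</a>
<form name="search" action="/search"><input type="text" name="q"></form>
</body></html>
HTML
close $fh or die "Cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );

# In-memory filehandles collect the dumps instead of STDOUT.
open my $links_fh, '>', \my $links or die "Cannot open buffer: $!";
$mech->dump_links($links_fh);    # relative URLs, one per line
close $links_fh;

open my $forms_fh, '>', \my $forms or die "Cannot open buffer: $!";
$mech->dump_forms($forms_fh);
close $forms_fh;

print $links, $forms;
```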
1403

OVERRIDDEN LWP::UserAgent METHODS

1405   $mech->clone()
1406       Clone the mech object.  The clone will be using the same cookie jar as
1407       the original mech.
1408
1409   $mech->redirect_ok()
1410       An overridden version of redirect_ok() in LWP::UserAgent.  This method
1411       is used to determine whether a redirection in the request should be
1412       followed.
1413
1414       Note that WWW::Mechanize's constructor pushes POST on to the agent's
1415       "requests_redirectable" list.
1416
1417   $mech->request( $request [, $arg [, $size]])
1418       Overridden version of request() in LWP::UserAgent.  Performs the actual
1419       request.  Normally, if you're using WWW::Mechanize, it's because you
1420       don't want to deal with this level of stuff anyway.
1421
1422       Note that $request will be modified.
1423
1424       Returns an HTTP::Response object.
1425
1426   $mech->update_html( $html )
1427       Allows you to replace the HTML that the mech has found.  Updates the
1428       forms and links parse-trees that the mech uses internally.
1429
1430       Say you have a page that you know has malformed output, and you want to
1431       update it so the links come out correctly:
1432
1433           my $html = $mech->content;
1434           $html =~ s[</option>.{0,3}</td>][</option></select></td>]isg;
1435           $mech->update_html( $html );
1436
1437       This method is also used internally by the mech itself to update its
1438       own HTML content when loading a page. This means that if you would like
1439       to systematically perform the above HTML substitution, you would
1440       override "update_html" in a subclass thusly:
1441
1442          package MyMech;
1443          use base 'WWW::Mechanize';
1444
1445          sub update_html {
1446              my ($self, $html) = @_;
1447              $html =~ s[</option>.{0,3}</td>][</option></select></td>]isg;
1448              $self->WWW::Mechanize::update_html( $html );
1449          }
1450
1451       If you do this, then the mech will use the tidied-up HTML instead of
1452       the original both when parsing for its own needs, and for returning to
1453       you through content().
1454
1455       Overriding this method is also the recommended way of implementing
1456       extra validation steps (e.g. link checkers) for every HTML page
1457       received.  "warn" and "die" would then come in handy to signal
1458       validation errors.
1459
1460   $mech->credentials( $username, $password )
1461       Provide credentials to be used for HTTP Basic authentication for all
1462       sites and realms until further notice.
1463
1464       The four argument form described in LWP::UserAgent is still supported.
1465
1466   $mech->get_basic_credentials( $realm, $uri, $isproxy )
1467       Returns the credentials for the realm and URI.
1468
1469   $mech->clear_credentials()
1470       Remove any credentials set up with credentials().
1471

INHERITED UNCHANGED LWP::UserAgent METHODS

1473       As a subclass of LWP::UserAgent, WWW::Mechanize inherits all of
1474       LWP::UserAgent's methods, many of which are overridden or extended.
1475       The following methods are inherited unchanged. View the LWP::UserAgent
1476       documentation for their implementation descriptions.
1477
1478       This is not meant to be an exhaustive list.  LWP::UserAgent may have
1479       added others.
1480
1481   $mech->head()
1482       Inherited from LWP::UserAgent.
1483
1484   $mech->mirror()
1485       Inherited from LWP::UserAgent.
1486
1487   $mech->simple_request()
1488       Inherited from LWP::UserAgent.
1489
1490   $mech->is_protocol_supported()
1491       Inherited from LWP::UserAgent.
1492
1493   $mech->prepare_request()
1494       Inherited from LWP::UserAgent.
1495
1496   $mech->progress()
1497       Inherited from LWP::UserAgent.
1498

INTERNAL-ONLY METHODS

1500       These methods are only used internally.  You probably don't need to
1501       know about them.
1502
1503   $mech->_update_page($request, $response)
1504       Updates all internal variables in $mech as if $request was just
1505       performed, and returns $response. The page stack is not altered by this
1506       method; it is up to the caller (e.g. "request") to do that.
1507
1508   $mech->_modify_request( $req )
1509       Modifies an HTTP::Request before the request is sent out, for both GET
1510       and POST requests.
1511
1512       We add a "Referer" header, as well as a header to note that we can
1513       accept gzip encoded content, if Compress::Zlib is installed.
1514
1515   $mech->_make_request()
1516       Convenience method to make it easier for subclasses like
1517       WWW::Mechanize::Cached to intercept the request.
1518
1519   $mech->_reset_page()
1520       Resets the internal fields that track the parsed state of the page.
1521
1522   $mech->_extract_links()
1523       Extracts links from the content of a webpage, and populates the
1524       "{links}" property with WWW::Mechanize::Link objects.
1525
1526   $mech->_push_page_stack()
1527       The agent keeps a stack of visited pages, which it can pop when it
1528       needs to go BACK and so on.
1529
1530       The current page needs to be pushed onto the stack before we get a new
1531       page, and the stack needs to be popped when BACK occurs.
1532
1533       Neither operation takes any arguments; they just operate on the $mech
1534       object.
1535
1536   warn( @messages )
1537       Centralized warning method, for diagnostics and non-fatal problems.
1538       Defaults to calling "CORE::warn", but may be overridden by setting
1539       "onwarn" in the constructor.
1540
1541   die( @messages )
1542       Centralized error method.  Defaults to calling "CORE::die", but may be
1543       overridden by setting "onerror" in the constructor.
1544
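       A minimal sketch of overriding both handlers through the constructor
       follows.  The handler bodies and the message strings are invented for
       illustration; only the "onwarn" and "onerror" options and the warn()
       and die() methods come from the text above.

       ```perl
       use strict;
       use warnings;

       use WWW::Mechanize ();

       my @warnings;    # illustrative sink for non-fatal diagnostics

       my $mech = WWW::Mechanize->new(
           # Collect warnings instead of printing them.
           onwarn  => sub { push @warnings, join( '', @_ ) },
           # Turn fatal errors into a tagged exception.
           onerror => sub { die "mech error: ", @_, "\n" },
       );

       # Call the centralized methods directly to show where messages end up.
       $mech->warn('something looks off');
       my $survived = eval { $mech->die('fatal problem'); 1 };

       print "collected warning: $_\n" for @warnings;
       print "caught: $@" unless $survived;
       ```

       Routing diagnostics through handlers like these can make it easier to
       log or test Mech's behavior without capturing STDERR.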

BEST PRACTICES

1546       The default settings can get you up and running quickly, but there are
1547       settings you can change in order to make your life easier.
1548
1549       autocheck
1550           "autocheck" can save you the overhead of checking status codes for
1551           success.  You may outgrow it as your needs get more sophisticated,
1552           but it's a safe option to start with.
1553
1554               my $agent = WWW::Mechanize->new( autocheck => 1 );
1555
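           The difference between the two settings can be sketched as
           follows.  The URL is a made-up "file:" path that is assumed not
           to exist, chosen only so the failure happens without network
           access; the variable names are illustrative.

           ```perl
           use strict;
           use warnings;

           use WWW::Mechanize ();

           # Assumption: this path does not exist, so the fetch fails.
           my $missing = 'file:///no/such/dir/no-such-file.html';

           # autocheck on: a failed fetch dies, so trap it with eval.
           my $checked = WWW::Mechanize->new( autocheck => 1 );
           my $fetched = eval { $checked->get($missing); 1 };
           print "autocheck => 1: ", ( $fetched ? "fetched\n" : "died: $@" );

           # autocheck off: the call returns and you inspect the result.
           my $unchecked = WWW::Mechanize->new( autocheck => 0 );
           $unchecked->get($missing);
           print "autocheck => 0: status ", $unchecked->status, "\n"
               unless $unchecked->success;
           ```

           With "autocheck" off, every fetch needs its own success() test,
           which is the overhead the option is meant to remove.
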
1556       cookie_jar
1557           You are encouraged to install Mozilla::PublicSuffix and use
1558           HTTP::CookieJar::LWP as your cookie jar.  HTTP::CookieJar::LWP
1559           provides a better security model matching that of current Web
1560           browsers when Mozilla::PublicSuffix is installed.
1561
1562               use HTTP::CookieJar::LWP ();
1563
1564               my $jar = HTTP::CookieJar::LWP->new;
1565               my $agent = WWW::Mechanize->new( cookie_jar => $jar );
1566
1567       protocols_allowed
1568           This option is inherited directly from LWP::UserAgent.  It may be
1569           used to restrict the agent to an explicit list of protocols.
1570
1571               my $agent = WWW::Mechanize->new(
1572                   protocols_allowed => [ 'http', 'https' ]
1573               );
1574
1575           This will prevent you from inadvertently following URLs like
1576           "file:///etc/passwd".
1577
1578       protocols_forbidden
1579           This option is also inherited directly from LWP::UserAgent.  It may
1580           be used to deny arbitrary protocols.
1581
1582               my $agent = WWW::Mechanize->new(
1583                   protocols_forbidden => [ 'file', 'mailto', 'ssh', ]
1584               );
1585
1586           This will prevent you from inadvertently following URLs like
1587           "file:///etc/passwd".
1588
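           One way to see the effect of "protocols_allowed" and
           "protocols_forbidden" is to ask each agent which schemes it will
           handle, using the inherited is_protocol_supported() method.  This
           is only a sketch; the variable names are illustrative.

           ```perl
           use strict;
           use warnings;

           use WWW::Mechanize ();

           my $allow_only = WWW::Mechanize->new(
               protocols_allowed => [ 'http', 'https' ],
           );
           my $deny_some = WWW::Mechanize->new(
               protocols_forbidden => [ 'file', 'mailto', 'ssh' ],
           );

           # Report how each agent treats a permitted and a blocked scheme.
           for my $scheme (qw( http file )) {
               printf "%-5s allow-list: %-3s  deny-list: %s\n", $scheme,
                   ( $allow_only->is_protocol_supported($scheme) ? 'yes' : 'no' ),
                   ( $deny_some->is_protocol_supported($scheme)  ? 'yes' : 'no' );
           }
           ```

           An allow-list is usually the stricter choice, since any scheme
           you did not name is refused.
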
1589       strict_forms
1590           Consider turning on the "strict_forms" option when you create a new
1591           Mech.  This will perform a helpful sanity check on form fields
1592           every time you are submitting a form, which can save you a lot of
1593           debugging time.
1594
1595               my $agent = WWW::Mechanize->new( strict_forms => 1 );
1596
1597           If you do not want to have this option globally, you can still turn
1598           it on for individual forms.
1599
1600               $agent->submit_form( fields => { foo => 'bar' }, strict_forms => 1 );
1601

WWW::MECHANIZE'S GIT REPOSITORY

1603       WWW::Mechanize is hosted at GitHub.
1604
1605       Repository: <https://github.com/libwww-perl/WWW-Mechanize>.  Bugs:
1606       <https://github.com/libwww-perl/WWW-Mechanize/issues>.
1607

OTHER DOCUMENTATION

1609   Spidering Hacks, by Kevin Hemenway and Tara Calishain
1610       Spidering Hacks from O'Reilly
1611       (<http://www.oreilly.com/catalog/spiderhks/>) is a great book for
1612       anyone wanting to know more about screen-scraping and spidering.
1613
1614       There are six hacks that use Mech or a Mech derivative:
1615
1616       #21 WWW::Mechanize 101
1617       #22 Scraping with WWW::Mechanize
1618       #36 Downloading Images from Webshots
1619       #44 Archiving Yahoo! Groups Messages with WWW::Yahoo::Groups
1620       #64 Super Author Searching
1621       #73 Scraping TV Listings
1622
1623       The book was also positively reviewed on Slashdot:
1624       <http://books.slashdot.org/article.pl?sid=03/12/11/2126256>
1625

ONLINE RESOURCES AND SUPPORT

1627       •   WWW::Mechanize mailing list
1628
1629           The Mech mailing list is at
1630           <http://groups.google.com/group/www-mechanize-users> and is
1631           specific to Mechanize, unlike the LWP mailing list below.  Although
1632           it is a users list, all development discussion takes place here,
1633           too.
1634
1635       •   LWP mailing list
1636
1637           The LWP mailing list is at
1638           <http://lists.perl.org/showlist.cgi?name=libwww>, and is more
1639           user-oriented and well-populated than the WWW::Mechanize list.
1640
1641       •   Perlmonks
1642
1643           <http://perlmonks.org> is an excellent community of support, and
1644           many questions about Mech have already been answered there.
1645
1646       •   WWW::Mechanize::Examples
1647
1648           A random array of examples submitted by users, included with the
1649           Mechanize distribution.
1650

ARTICLES ABOUT WWW::MECHANIZE

1652       •   <http://www.ibm.com/developerworks/linux/library/wa-perlsecure/>
1653
1654           IBM article "Secure Web site access with Perl"
1655
1656       •   <http://www.oreilly.com/catalog/googlehks2/chapter/hack84.pdf>
1657
1658           Leland Johnson's hack #84 in Google Hacks, 2nd Edition is an
1659           example of a production script that uses WWW::Mechanize and
1660           HTML::TableContentParser. It takes in keywords and returns the
1661           estimated price of these keywords on Google's AdWords program.
1662
1663       •   <http://www.perl.com/pub/a/2004/06/04/recorder.html>
1664
1665           Linda Julien writes about using HTTP::Recorder to create
1666           WWW::Mechanize scripts.
1667
1668       •   <http://www.developer.com/lang/other/article.php/3454041>
1669
1670           Jason Gilmore's article on using WWW::Mechanize for scraping sales
1671           information from Amazon and eBay.
1672
1673       •   <http://www.perl.com/pub/a/2003/01/22/mechanize.html>
1674
1675           Chris Ball's article about using WWW::Mechanize for scraping TV
1676           listings.
1677
1678       •   <http://www.stonehenge.com/merlyn/LinuxMag/col47.html>
1679
1680           Randal Schwartz's article on scraping Yahoo News for images.  It's
1681           already out of date: he manually walks the list of links hunting
1682           for matches, which wouldn't have been necessary if the find_link()
1683           method had existed at press time.
1684
1685       •   <http://www.perladvent.org/2002/16th/>
1686
1687           WWW::Mechanize on the Perl Advent Calendar, by Mark Fowler.
1688
1689       •   <http://www.linux-magazin.de/ausgaben/2004/03/datenruessel/>
1690
1691           Michael Schilli's article on Mech and WWW::Mechanize::Shell for the
1692           German magazine Linux Magazin.
1693
1694   Other modules that use Mechanize
1695       Here are modules that use or subclass Mechanize.  Let me know of any
1696       others:
1697
1698       •   Finance::Bank::LloydsTSB
1699
1700       •   HTTP::Recorder
1701
1702           Acts as a proxy for web interaction, and then generates
1703           WWW::Mechanize scripts.
1704
1705       •   Win32::IE::Mechanize
1706
1707           Just like Mech, but using Microsoft Internet Explorer to do the
1708           work.
1709
1710       •   WWW::Bugzilla
1711
1712       •   WWW::Google::Groups
1713
1714       •   WWW::Hotmail
1715
1716       •   WWW::Mechanize::Cached
1717
1718       •   WWW::Mechanize::Cached::GZip
1719
1720       •   WWW::Mechanize::FormFiller
1721
1722       •   WWW::Mechanize::Shell
1723
1724       •   WWW::Mechanize::Sleepy
1725
1726       •   WWW::Mechanize::SpamCop
1727
1728       •   WWW::Mechanize::Timed
1729
1730       •   WWW::SourceForge
1731
1732       •   WWW::Yahoo::Groups
1733
1734       •   WWW::Scripter
1735

ACKNOWLEDGEMENTS

1737       Thanks to the numerous people who have helped out on WWW::Mechanize in
1738       one way or another, including Kirrily Robert for the original
1739       "WWW::Automate", Lyle Hopkins, Damien Clark, Ansgar Burchardt, Gisle
1740       Aas, Jeremy Ary, Hilary Holz, Rafael Kitover, Norbert Buchmuller, Dave
1741       Page, David Sainty, H.Merijn Brand, Matt Lawrence, Michael Schwern,
1742       Adriano Ferreira, Miyagawa, Peteris Krumins, David
1743       Steinbrunner, Kevin Falcone, Mike O'Regan, Mark Stosberg, Uri Guttman,
1744       Peter Scott, Philippe Bruhat, Ian Langworth, John Beppu, Gavin Estey,
1745       Jim Brandt, Ask Bjoern Hansen, Greg Davies, Ed Silva, Mark-Jason
1746       Dominus, Autrijus Tang, Mark Fowler, Stuart Children, Max Maischein,
1747       Meng Wong, Prakash Kailasa, Abigail, Jan Pazdziora, Dominique
1748       Quatravaux, Scott Lanning, Rob Casey, Leland Johnson, Joshua Gatcomb,
1749       Julien Beasley, Abe Timmerman, Peter Stevens, Pete Krawczyk, Tad
1750       McClellan, and the late great Iain Truskett.
1751

AUTHOR

1753       Andy Lester <andy at petdance.com>
1754

COPYRIGHT AND LICENSE

1756       This software is copyright (c) 2004 by Andy Lester.
1757
1758       This is free software; you can redistribute it and/or modify it under
1759       the same terms as the Perl 5 programming language system itself.
1760
1761
1762
1763perl v5.38.0                      2023-07-21                 WWW::Mechanize(3)