WWW::Mechanize(3)     User Contributed Perl Documentation    WWW::Mechanize(3)

NAME

       WWW::Mechanize - Handy web browsing in a Perl object

VERSION

       version 2.15

SYNOPSIS

       WWW::Mechanize supports performing a sequence of page fetches including
       following links and submitting forms. Each fetched page is parsed and
       its links and forms are extracted. A link or a form can be selected,
       form fields can be filled and the next page can be fetched.  Mech also
       stores a history of the URLs you've visited, which can be queried and
       revisited.

           use WWW::Mechanize ();
           my $mech = WWW::Mechanize->new();

           $mech->get( $url );

           $mech->follow_link( n => 3 );
           $mech->follow_link( text_regex => qr/download this/i );
           $mech->follow_link( url => 'http://host.com/index.html' );

           $mech->submit_form(
               form_number => 3,
               fields      => {
                   username    => 'mungo',
                   password    => 'lost-and-alone',
               }
           );

           $mech->submit_form(
               form_name => 'search',
               fields    => { query  => 'pot of gold', },
               button    => 'Search Now'
           );

           # Enable strict form processing to catch typos and nonexistent form fields.
           my $strict_mech = WWW::Mechanize->new( strict_forms => 1 );

           $strict_mech->get( $url );

           # This method call will die, saving you lots of time looking for the bug.
           $strict_mech->submit_form(
               form_number => 3,
               fields      => {
                   usernaem     => 'mungo',           # typo in field name
                   password     => 'lost-and-alone',
                   extra_field  => 123,               # field does not exist
               }
           );

DESCRIPTION

       "WWW::Mechanize", or Mech for short, is a Perl module for stateful
       programmatic web browsing, used for automating interaction with
       websites.

       Features include:

       •   All HTTP methods

       •   High-level hyperlink and HTML form support, without having to parse
           HTML yourself

       •   SSL support

       •   Automatic cookies

       •   Custom HTTP headers

       •   Automatic handling of redirections

       •   Proxies

       •   HTTP authentication

       Mech is well suited for use in testing web applications.  If you use
       one of the Test::* modules, like Test::HTML::Lint, you can check the
       fetched content and use that as input to a test call.

           use Test::More;
           like( $mech->content(), qr/$expected/, "Got expected content" );

       Each page fetch stores its URL in a history stack which you can
       traverse.

           $mech->back();

       If you want finer control over your page fetching, you can use these
       methods. follow_link() and submit_form() are just high-level wrappers
       around them.

           $mech->find_link( n => $number );
           $mech->form_number( $number );
           $mech->form_name( $name );
           $mech->field( $name, $value );
           $mech->set_fields( %field_values );
           $mech->set_visible( @criteria );
           $mech->click( $button );

       WWW::Mechanize is a proper subclass of LWP::UserAgent and you can also
       use any of LWP::UserAgent's methods.

           $mech->add_header($name => $value);

       Please note that Mech does NOT support JavaScript; you need additional
       software for that. Please check "JavaScript" in WWW::Mechanize::FAQ for
       more.

IMPORTANT LINKS

       •   <https://github.com/libwww-perl/WWW-Mechanize/issues>

           The queue for bugs & enhancements in WWW::Mechanize.  Please note
           that the queue at <http://rt.cpan.org> is no longer maintained.

       •   <https://metacpan.org/pod/WWW::Mechanize>

           The CPAN documentation page for Mechanize.

       •   <https://metacpan.org/pod/distribution/WWW-Mechanize/lib/WWW/Mechanize/FAQ.pod>

           Frequently asked questions.  Make sure you read here FIRST.

CONSTRUCTOR AND STARTUP

   new()
       Creates and returns a new WWW::Mechanize object, hereafter referred to
       as the "agent".

           my $mech = WWW::Mechanize->new();

       The constructor for WWW::Mechanize overrides two of the params to the
       LWP::UserAgent constructor:

           agent => 'WWW-Mechanize/#.##'
           cookie_jar => {}    # an empty, memory-only HTTP::Cookies object

       You can override these overrides by passing params to the constructor,
       as in:

           my $mech = WWW::Mechanize->new( agent => 'wonderbot 1.01' );

       If you want none of the overhead of a cookie jar, or don't want your
       bot accepting cookies, you have to explicitly disallow it, like so:

           my $mech = WWW::Mechanize->new( cookie_jar => undef );

       Here are the params that WWW::Mechanize recognizes.  These do not
       include params that LWP::UserAgent recognizes.

       •   "autocheck => [0|1]"

           Checks each request made to see if it was successful.  This saves
           you the trouble of manually checking yourself.  Any errors found
           are errors, not warnings.

           The default value is ON, unless it's being subclassed, in which
           case it is OFF.  This means that standalone WWW::Mechanize
           instances have autocheck turned on, which is protective for the
           vast majority of Mech users who don't bother checking the return
           value of get() and post() and can't figure out why their code
           fails.  However, if WWW::Mechanize is subclassed, such as for
           Test::WWW::Mechanize or Test::WWW::Mechanize::Catalyst, this may
           not be an appropriate default, so it's off.

       •   "noproxy => [0|1]"

           Turn off the automatic call to the LWP::UserAgent "env_proxy"
           function.

           This needs to be explicitly turned off if you're using
           Crypt::SSLeay to access an https site via a proxy server.  Note:
           you still need to set your HTTPS_PROXY environment variable as
           appropriate.

       •   "onwarn => \&func"

           Reference to a "warn"-compatible function, such as "Carp::carp",
           that is called when a warning needs to be shown.

           If this is set to "undef", no warnings will ever be shown.
           However, it's probably better to use the "quiet" method to control
           that behavior.

           If this value is not passed, Mech uses "Carp::carp" if Carp is
           installed, or "CORE::warn" if not.

       •   "onerror => \&func"

           Reference to a "die"-compatible function, such as "Carp::croak",
           that is called when there's a fatal error.

           If this is set to "undef", no errors will ever be shown.

           If this value is not passed, Mech uses "Carp::croak" if Carp is
           installed, or "CORE::die" if not.

       •   "quiet => [0|1]"

           Don't complain on warnings.  Setting "quiet => 1" is the same as
           calling "$mech->quiet(1)".  Default is off.

       •   "stack_depth => $value"

           Sets the depth of the page stack that keeps track of all the
           downloaded pages. Default is effectively infinite stack size.  If
           the stack is eating up your memory, then set this to a smaller
           number, say 5 or 10.  Setting this to zero means Mech will keep no
           history.

       In addition, WWW::Mechanize also allows you to globally enable strict
       and verbose mode for form handling, which is done with HTML::Form.

       •   "strict_forms => [0|1]"

           Globally sets the HTML::Form strict flag which causes form
           submission to croak if any of the passed fields don't exist in the
           form, and/or a value doesn't exist in a select element. This can
           still be disabled in individual calls to submit_form().

           Default is off.

       •   "verbose_forms => [0|1]"

           Globally sets the HTML::Form verbose flag which causes form
           submission to warn about any bad HTML form constructs found. This
           cannot be disabled later.

           Default is off.

       •   "marked_sections => [0|1]"

           Globally sets the HTML::Parser marked sections flag which causes
           HTML "CDATA[[" sections to be honoured. This cannot be disabled
           later.

           Default is on.

       To support forms, WWW::Mechanize's constructor pushes POST on to the
       agent's "requests_redirectable" list (see also LWP::UserAgent).

   $mech->agent_alias( $alias )
       Sets the user agent string to the expanded version from a table of
       actual user agent strings.  $alias can be one of the following:

       •   Windows IE 6

       •   Windows Mozilla

       •   Mac Safari

       •   Mac Mozilla

       •   Linux Mozilla

       •   Linux Konqueror

       If $alias is one of these, the User-Agent string is replaced with the
       corresponding full browser string.  For instance,

           $mech->agent_alias( 'Windows IE 6' );

       sets your User-Agent to

           Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

       The list of valid aliases can be returned from known_agent_aliases().

   $mech->known_agent_aliases()
       Returns a list of all the agent aliases that Mech knows about.  This
       can also be called as a package or class method.

           @aliases = WWW::Mechanize::known_agent_aliases();
           @aliases = WWW::Mechanize->known_agent_aliases();
           @aliases = $mech->known_agent_aliases();

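       A minimal sketch tying the two methods together: list the known
       aliases, pick one, and confirm what the agent will send.  The choice
       of 'Linux Mozilla' is arbitrary.

```perl
use strict;
use warnings;
use WWW::Mechanize ();

my $mech = WWW::Mechanize->new;

# Called as a class method here; the instance form works the same way.
my @aliases = WWW::Mechanize->known_agent_aliases;
print "known alias: $_\n" for @aliases;

# Swap the default User-Agent for the expanded string behind an alias.
$mech->agent_alias('Linux Mozilla');
print "User-Agent is now: ", $mech->agent, "\n";
```
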

PAGE-FETCHING METHODS

   $mech->get( $uri )
       Given a URL/URI, fetches it.  Returns an HTTP::Response object.  $uri
       can be a well-formed URL string, a URI object, or a
       WWW::Mechanize::Link object.

       The results are stored internally in the agent object, but you don't
       know that.  Just use the accessors listed below.  Poking at the
       internals is deprecated and subject to change in the future.

       get() is a well-behaved overloaded version of the method in
       LWP::UserAgent.  This lets you do things like

           $mech->get( $uri, ':content_file' => $filename );

       and you can rest assured that the params will get filtered down
       appropriately. See "get" in LWP::UserAgent for more details.

       NOTE: Because ":content_file" causes the page contents to be stored in
       a file instead of the response object, some Mech functions that expect
       it to be there won't work as expected. Use with caution.

       Here is a non-exhaustive list of methods that do not work as expected
       with ":content_file": forms(), current_form(), links(), title(),
       content(...), text(), all content-handling methods, all link methods,
       all image methods, all form methods, all field methods,
       save_content(...), dump_links(...), dump_images(...), dump_forms(...),
       dump_text(...).

   $mech->post( $uri, content => $content )
       POSTs $content to $uri.  Returns an HTTP::Response object.  $uri can be
       a well-formed URI string, a URI object, or a WWW::Mechanize::Link
       object.

   $mech->put( $uri, content => $content )
       PUTs $content to $uri.  Returns an HTTP::Response object.  $uri can be
       a well-formed URI string, a URI object, or a WWW::Mechanize::Link
       object.

   $mech->head( $uri )
       Performs a HEAD request to $uri. Returns an HTTP::Response object.
       $uri can be a well-formed URI string, a URI object, or a
       WWW::Mechanize::Link object.

           my $res = $mech->head( $uri );
           my $res = $mech->head( $uri, $field_name => $value, ... );

   $mech->reload()
       Acts like the reload button in a browser: repeats the current request.
       The history (as per the back() method) is not altered.

       Returns the HTTP::Response object from the reload, or "undef" if
       there's no current request.

   $mech->back()
       The equivalent of hitting the "back" button in a browser.  Returns to
       the previous page.  Won't go back past the first page. (Really, what
       would it do if it could?)

       Returns true if it could go back, or false if not.

   $mech->clear_history()
       This deletes all the history entries and returns true.

   $mech->history_count()
       This returns the number of items in the browser history.  This number
       does include the most recently made request.

   $mech->history($n)
       This returns the nth item in history.  The 0th item is the most recent
       request and response, which would be acted on by methods like
       find_link().  The 1st item is the state you'd return to if you called
       back().

       The maximum useful value for $n is "$mech->history_count - 1".
       Requests beyond that bound will return "undef".

       History items are returned as hash references, in the form:

         { req => $http_request, res => $http_response }

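       The history methods can be tried without a network connection by
       fetching local files through "file:" URIs.  The sketch below assumes
       that approach; the two temporary pages and their titles are made up
       for the example.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Create two throwaway local pages so the sketch needs no network.
my @tmp = map { File::Temp->new( SUFFIX => '.html' ) } 1 .. 2;
for my $i ( 0 .. 1 ) {
    print { $tmp[$i] }
      "<html><head><title>Page $i</title></head><body></body></html>\n";
    $tmp[$i]->close;
}
my @uri = map { URI::file->new( $_->filename ) } @tmp;

my $mech = WWW::Mechanize->new;
$mech->get($_) for @uri;

print "history entries: ", $mech->history_count, "\n";

# Item 1 is the state that back() would restore.
my $previous = $mech->history(1);
print "previous request: ", $previous->{req}->uri, "\n";

$mech->back or die "could not go back";
print "now on: ", $mech->title, "\n";
```
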

STATUS METHODS

   $mech->success()
       Returns a boolean telling whether the last request was successful.  If
       there hasn't been an operation yet, returns false.

       This is a convenience function that wraps "$mech->res->is_success".

   $mech->uri()
       Returns the current URI as a URI object. This object stringifies to the
       URI itself.

   $mech->response() / $mech->res()
       Returns the current response as an HTTP::Response object.  res() is a
       synonym for response().

   $mech->status()
       Returns the HTTP status code of the response.  This is a 3-digit number
       like 200 for OK, 404 for not found, and so on.

   $mech->ct() / $mech->content_type()
       Returns the content type of the response.

   $mech->base()
       Returns the base URI for the current response.

   $mech->forms()
       When called in a list context, returns a list of the forms found in the
       last fetched page. In a scalar context, returns a reference to an array
       with those forms. The forms returned are all HTML::Form objects.

   $mech->current_form()
       Returns the current form as an HTML::Form object.

   $mech->links()
       When called in a list context, returns a list of the links found in the
       last fetched page.  In a scalar context it returns a reference to an
       array with those links.  Each link is a WWW::Mechanize::Link object.

   $mech->is_html()
       Returns true/false on whether our content is HTML, according to the
       HTTP headers.

   $mech->title()
       Returns the contents of the "<TITLE>" tag, as parsed by
       HTML::HeadParser.  Returns "undef" if the content is not HTML.

   $mech->redirects()
       Convenience method to get the redirects from the most recent
       HTTP::Response.

       Note that you can also use is_redirect to see if the most recent
       response was a redirect, like this:

           $mech->get($url);
           do_stuff() if $mech->res->is_redirect;

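       A short sketch of the status accessors in use, with "autocheck"
       turned off so that a failed fetch can be inspected rather than
       raising an error.  It assumes local "file:" URIs to stay offline; the
       page content and the ".missing" file name are invented for the
       example.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp}
  "<html><head><title>Status demo</title></head><body>ok</body></html>\n";
$tmp->close;

# autocheck off: failures are reported through success() instead of dying.
my $mech = WWW::Mechanize->new( autocheck => 0 );

$mech->get( URI::file->new( $tmp->filename ) );
my $first_ok = $mech->success;
if ($first_ok) {
    printf "%s -> status %d, type %s, title '%s'\n",
      $mech->uri, $mech->status, $mech->ct, $mech->title;
}

# A file that does not exist, to show the failure path.
$mech->get( URI::file->new( $tmp->filename . '.missing' ) );
my $second_ok = $mech->success;
print "second fetch failed with status ", $mech->status, "\n"
  unless $second_ok;
```
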

CONTENT-HANDLING METHODS

   $mech->content(...)
       Returns the content that the mech uses internally for the last page
       fetched. Ordinarily this is the same as
       "$mech->response()->decoded_content()", but this may differ for HTML
       documents if "update_html" is overloaded (in which case the value
       passed to the base-class implementation of same will be returned),
       and/or extra named arguments are passed to content():

       $mech->content( format => 'text' )
         Returns a text-only version of the page, with all HTML markup
         stripped. This feature requires HTML::TreeBuilder version 5 or higher
         to be installed, or a fatal error will be thrown. This works only if
         the contents are HTML.

       $mech->content( base_href => [$base_href|undef] )
         Returns the HTML document, modified to contain a "<base
         href="$base_href">" mark-up in the header.  $base_href is
         "$mech->base()" if not specified. This is handy to pass the HTML to
         e.g. HTML::Display. This works only if the contents are HTML.

       $mech->content( raw => 1 )
         Returns "$self->response()->content()", i.e. the raw contents from
         the response.

       $mech->content( decoded_by_headers => 1 )
         Returns the content after applying all "Content-Encoding" headers but
         with no additional mangling.

       $mech->content( charset => $charset )
         Returns "$self->response()->decoded_content(charset => $charset)"
         (see HTTP::Response for details).

       To preserve backwards compatibility, additional parameters will be
       ignored unless none of "raw | decoded_by_headers | charset" is
       specified and the text is HTML, in which case an error will be
       triggered.

       A fresh instance of WWW::Mechanize will return "undef" when
       "$mech->content()" is called, because no content is present before a
       request has been made.

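       To compare a few of the content() variants side by side, a sketch
       along these lines may help.  It assumes a small local HTML file
       fetched through a "file:" URI; the page and the example.com base URL
       are placeholders.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp}
  "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>\n";
$tmp->close;

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new( $tmp->filename ) );

my $decoded   = $mech->content;                  # content as Mech sees it
my $raw       = $mech->content( raw => 1 );      # content as received
my $with_base = $mech->content( base_href => 'http://example.com/' );

printf "decoded length: %d, raw length: %d\n", length $decoded, length $raw;
print "base tag was added\n"
  if $with_base =~ m{<base href="http://example\.com/">};
```
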
   $mech->text()
       Returns the text of the current HTML content.  If the content isn't
       HTML, $mech will die.

       The text is extracted by parsing the content, and then the extracted
       text is cached, so don't worry about performance of calling this
       repeatedly.

LINK METHODS

   $mech->links()
       Lists all the links on the current page.  Each link is a
       WWW::Mechanize::Link object. In list context, returns a list of all
       links.  In scalar context, returns an array reference of all links.

   $mech->follow_link(...)
       Follows a specified link on the page.  You specify the match to be
       found using the same params that find_link() uses.

       Here are some examples:

       •   3rd link called "download"

               $mech->follow_link( text => 'download', n => 3 );

       •   first link where the URL has "download" in it, regardless of case:

               $mech->follow_link( url_regex => qr/download/i );

           or

               $mech->follow_link( url_regex => qr/(?i:download)/ );

       •   3rd link on the page

               $mech->follow_link( n => 3 );

       •   the link with the url

               $mech->follow_link( url => '/other/page' );

           or

               $mech->follow_link( url => 'http://example.com/page' );

       Returns the result of the "GET" method (an HTTP::Response object) if a
       link was found.

       If the page has no links, or the specified link couldn't be found,
       returns "undef".  If "autocheck" is enabled, an exception will be
       thrown instead.

   $mech->find_link( ... )
       Finds a link in the currently fetched page. It returns a
       WWW::Mechanize::Link object which describes the link.  (You'll probably
       be most interested in the url() property.)  If it fails to find a link
       it returns "undef".

       You can take the URL part and pass it to the get() method.  If that's
       your plan, you might as well use the follow_link() method directly,
       since it does the get() for you automatically.

       Note that "<FRAME SRC="...">" tags are parsed out of the HTML and
       treated as links so this method works with them.

       You can select which link to find by passing in one or more of these
       key/value pairs:

       •   "text => 'string'," and "text_regex => qr/regex/,"

           "text" matches the text of the link against string, which must be
           an exact match.  To select a link with text that is exactly
           "download", use

               $mech->find_link( text => 'download' );

           "text_regex" matches the text of the link against regex.  To select
           a link with text that has "download" anywhere in it, regardless of
           case, use

               $mech->find_link( text_regex => qr/download/i );

           Note that the text extracted from the page's links is trimmed.
           For example, "<a> foo </a>" is stored as 'foo', and searching for
           leading or trailing spaces will fail.

       •   "url => 'string'," and "url_regex => qr/regex/,"

           Matches the URL of the link against string or regex, as
           appropriate.  The URL may be a relative URL, like foo/bar.html,
           depending on how it's coded on the page.

       •   "url_abs => string" and "url_abs_regex => regex"

           Matches the absolute URL of the link against string or regex, as
           appropriate.  The URL will be an absolute URL, even if it's
           relative in the page.

       •   "name => string" and "name_regex => regex"

           Matches the name of the link against string or regex, as
           appropriate.

       •   "rel => string" and "rel_regex => regex"

           Matches the rel of the link against string or regex, as
           appropriate.  This can be used to find stylesheets, favicons, or
           links the author of the page does not want bots to follow.

       •   "id => string" and "id_regex => regex"

           Matches the attribute 'id' of the link against string or regex, as
           appropriate.

       •   "class => string" and "class_regex => regex"

           Matches the attribute 'class' of the link against string or regex,
           as appropriate.

       •   "tag => string" and "tag_regex => regex"

           Matches the tag that the link came from against string or regex, as
           appropriate.  The "tag_regex" is probably most useful to check for
           more than one tag, as in:

               $mech->find_link( tag_regex => qr/^(a|frame)$/ );

           The tags and attributes looked at are defined below.

       If "n" is not specified, it defaults to 1.  Therefore, if you don't
       specify any params, this method defaults to finding the first link on
       the page.

       Note that you can specify multiple text or URL parameters, which will
       be ANDed together.  For example, to find the first link with text of
       "News" and with "cnn.com" in the URL, use:

           $mech->find_link( text => 'News', url_regex => qr/cnn\.com/ );

       The return value is a WWW::Mechanize::Link object for the matching
       link, or "undef" if no link matches.

       The links come from the following:

       "<a href=...>"
       "<area href=...>"
       "<frame src=...>"
       "<iframe src=...>"
       "<link href=...>"
       "<meta content=...>"

   $mech->find_all_links( ... )
       Returns all the links on the current page that match the criteria.  The
       method for specifying link criteria is the same as in find_link().
       Each of the links returned is a WWW::Mechanize::Link object.

       In list context, find_all_links() returns a list of the links.
       Otherwise, it returns a reference to the list of links.

       find_all_links() with no parameters returns all links in the page.

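       The matching rules for find_link() and find_all_links() can be seen
       in a small offline sketch.  It assumes a local page fetched through a
       "file:" URI; the three links and their URLs are invented for the
       example.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<a href="/news/today.html">News</a>
<a href="files/Download.zip"> Download this </a>
<a href="files/other.zip">download</a>
</body></html>
HTML
$tmp->close;

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new( $tmp->filename ) );

# "text" is an exact, case-sensitive match against the trimmed link text.
my $exact = $mech->find_link( text => 'download' );
print "exact match: ", $exact->url, "\n" if $exact;

# Several criteria are ANDed; the first link satisfying all of them wins.
my $both = $mech->find_link(
    text_regex => qr/download/i,
    url_regex  => qr/\.zip$/,
);
print "regex match: ", $both->url, " (text '", $both->text, "')\n" if $both;

# find_all_links() returns every match; url_abs() resolves against the base.
my @zips = $mech->find_all_links( url_regex => qr/\.zip$/ );
printf "%s => %s\n", $_->url, $_->url_abs for @zips;
```
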
   $mech->find_all_inputs( ... criteria ... )
       find_all_inputs() returns an array of all the input controls in the
       current form whose properties match all of the regexes passed in.  The
       controls returned are all descended from HTML::Form::Input.  See
       "INPUTS" in HTML::Form for details.

       If no criteria are passed, all inputs will be returned.

       If there is no current page, there is no form on the current page, or
       there are no submit controls in the current form then the return will
       be an empty array.

       You may use a regex or a literal string:

           # get all textarea controls whose names begin with "customer"
           my @customer_text_inputs = $mech->find_all_inputs(
               type       => 'textarea',
               name_regex => qr/^customer/,
           );

           # get all text or textarea controls called "customer"
           my @customer_text_inputs = $mech->find_all_inputs(
               type_regex => qr/^(text|textarea)$/,
               name       => 'customer',
           );

   $mech->find_all_submits( ... criteria ... )
       find_all_submits() does the same thing as find_all_inputs() except that
       it only returns controls that are submit controls, ignoring other types
       of input controls like text and checkboxes.

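       A sketch of find_all_inputs() and find_all_submits() against a
       made-up order form, loaded from a local file so no server is needed.
       The form and its field names are assumptions for illustration only.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form name="order" action="/order" method="post">
  <input type="text" name="customer_name">
  <textarea name="customer_notes"></textarea>
  <input type="text" name="coupon">
  <input type="submit" name="buy" value="Buy">
</form>
</body></html>
HTML
$tmp->close;

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new( $tmp->filename ) );

# With no form selected explicitly, the first form on the page is used.
my @customer = $mech->find_all_inputs( name_regex => qr/^customer/ );
printf "customer field: %s (%s)\n", $_->name, $_->type for @customer;

my @submits = $mech->find_all_submits;
printf "submit control: %s\n", $_->name for @submits;
```
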

IMAGE METHODS

   $mech->images
       Lists all the images on the current page.  Each image is a
       WWW::Mechanize::Image object. In list context, returns a list of all
       images.  In scalar context, returns an array reference of all images.

   $mech->find_image()
       Finds an image in the current page. It returns a WWW::Mechanize::Image
       object which describes the image.  If it fails to find an image it
       returns "undef".

       You can select which image to find by passing in one or more of these
       key/value pairs:

       •   "alt => 'string'" and "alt_regex => qr/regex/"

           "alt" matches the ALT attribute of the image against string, which
           must be an exact match. To select an image with an ALT attribute
           that is exactly "download", use

               $mech->find_image( alt => 'download' );

           "alt_regex" matches the ALT attribute of the image against a
           regular expression.  To select an image with an ALT attribute that
           has "download" anywhere in it, regardless of case, use

               $mech->find_image( alt_regex => qr/download/i );

       •   "url => 'string'" and "url_regex => qr/regex/"

           Matches the URL of the image against string or regex, as
           appropriate.  The URL may be a relative URL, like foo/bar.html,
           depending on how it's coded on the page.

       •   "url_abs => string" and "url_abs_regex => regex"

           Matches the absolute URL of the image against string or regex, as
           appropriate.  The URL will be an absolute URL, even if it's
           relative in the page.

       •   "tag => string" and "tag_regex => regex"

           Matches the tag that the image came from against string or regex,
           as appropriate.  The "tag_regex" is probably most useful to check
           for more than one tag, as in:

               $mech->find_image( tag_regex => qr/^(img|input)$/ );

           The tags supported are "<img>" and "<input>".

       •   "id => string" and "id_regex => regex"

           "id" matches the id attribute of the image against string, which
           must be an exact match. To select an image with the exact id
           "download-image", use

               $mech->find_image( id => 'download-image' );

           "id_regex" matches the id attribute of the image against a regular
           expression. To select the first image with an id that contains
           "download" anywhere in it, use

               $mech->find_image( id_regex => qr/download/ );

       •   "class => string" and "class_regex => regex"

           "class" matches the class attribute of the image against string,
           which must be an exact match. To select an image with the exact
           class "img-fluid", use

               $mech->find_image( class => 'img-fluid' );

           To select an image with the class attribute "rounded float-left",
           use

               $mech->find_image( class => 'rounded float-left' );

           Note that the classes have to be matched as a complete string, in
           the exact order they appear in the website's source code.

           "class_regex" matches the class attribute of the image against a
           regular expression. Use this if you want a partial class name, or
           if an image has several classes, but you only care about one.

           To select the first image with the class "rounded", where there are
           multiple images that might also have either class "float-left" or
           "float-right", use

               $mech->find_image( class_regex => qr/\brounded\b/ );

           Selecting an image with multiple classes where you do not care
           about the order they appear in the website's source code is not
           currently supported.

       If "n" is not specified, it defaults to 1.  Therefore, if you don't
       specify any params, this method defaults to finding the first image on
       the page.

       Note that you can specify multiple ALT or URL parameters, which will be
       ANDed together.  For example, to find the first image with ALT text of
       "News" and with "cnn.com" in the URL, use:

           $mech->find_image( alt => 'News', url_regex => qr/cnn\.com/ );

       The return value is a WWW::Mechanize::Image object for the matching
       image, or "undef" if no image matches.

   $mech->find_all_images( ... )
       Returns all the images on the current page that match the criteria.
       The method for specifying image criteria is the same as in
       find_image().  Each of the images returned is a WWW::Mechanize::Image
       object.

       In list context, find_all_images() returns a list of the images.
       Otherwise, it returns a reference to the list of images.

       find_all_images() with no parameters returns all images in the page.

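       The image criteria can be exercised the same way as the link
       criteria.  This sketch assumes a local page with three invented
       images and shows an exact "alt" match next to a "class_regex" match.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<img src="/img/logo.png" alt="Site logo" id="logo" class="rounded float-left">
<img src="/img/download.png" alt="Download" class="img-fluid">
<img src="/img/news.png" alt="News" class="rounded float-right">
</body></html>
HTML
$tmp->close;

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new( $tmp->filename ) );

# Exact match on the ALT attribute.
my $download = $mech->find_image( alt => 'Download' );
print "by alt: ", $download->url, "\n" if $download;

# A regex picks out one class among several, whatever the order.
my $first_rounded = $mech->find_image( class_regex => qr/\brounded\b/ );
print "first rounded: ", $first_rounded->url, "\n" if $first_rounded;

my @rounded = $mech->find_all_images( class_regex => qr/\brounded\b/ );
printf "rounded image: %s (alt '%s')\n", $_->url, $_->alt for @rounded;
```
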

FORM METHODS

782       These methods let you work with the forms on a page.  The idea is to
783       choose a form that you'll later work with using the field methods
784       below.
785
786   $mech->forms
787       Lists all the forms on the current page.  Each form is an HTML::Form
788       object.  In list context, returns a list of all forms.  In scalar
789       context, returns an array reference of all forms.
790
791   $mech->form_number($number)
       Selects form number $number on the page as the target for subsequent
       calls to field() and click().  Also returns the form that was selected.
794
795       If it is found, the form is returned as an HTML::Form object and set
796       internally for later use with Mech's form methods such as field() and
797       click().  When called in a list context, the number of the found form
798       is also returned as a second value.
799
800       Emits a warning and returns "undef" if no form is found.
801
802       The first form is number 1, not zero.
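
       A minimal sketch of forms() and form_number(), assuming a page with
       two forms.  The fixture markup and the form names "search" and
       "login" are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture so the example runs without a network connection.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form name="search" action="/search" method="get">
  <input type="text" name="q">
</form>
<form name="login" action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );

my @forms     = $mech->forms;    # list context: HTML::Form objects
my $forms_ref = $mech->forms;    # scalar context: array reference

# Forms are numbered from 1; in list context the number comes back too.
my ( $login, $number ) = $mech->form_number(2);

printf "%d forms; selected #%d (%s)\n",
    scalar @forms, $number, $login->attr('name');
```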
803
804   $mech->form_action( $action )
805       Selects a form by action, using a regex containing $action.  If there
806       is more than one form on the page matching that action, then the first
807       one is used, and a warning is generated.
808
809       If it is found, the form is returned as an HTML::Form object and set
810       internally for later use with Mech's form methods such as field() and
811       click().
812
813       Returns "undef" if no form is found.
814
815   $mech->form_name( $name [, \%args ] )
816       Selects a form by name.
817
818       By default, the first form that has this name will be returned.
819
820           my $form = $mech->form_name("order_form");
821
822       If you want the second, third or nth match, pass an optional arguments
823       hash reference as the final parameter with a key "n" to pick which
824       instance you want. The numbering starts at 1.
825
826           my $third_product_form = $mech->form_name("buy_now", { n => 3 });
827
828       If the "n" parameter is not passed, and there is more than one form on
829       the page with that name, then the first one is used, and a warning is
830       generated.
831
832       If it is found, the form is returned as an HTML::Form object and set
833       internally for later use with Mech's form methods such as field() and
834       click().
835
836       Returns "undef" if no form is found.
837
838   $mech->form_id( $id [, \%args ] )
839       Selects a form by ID.
840
841       By default, the first form that has this ID will be returned.
842
843           my $form = $mech->form_id("order_form");
844
845       Although the HTML specification requires the ID to be unique within a
846       page, some pages might not adhere to that. If you want the second,
847       third or nth match, pass an optional arguments hash reference as the
848       final parameter with a key "n" to pick which instance you want. The
849       numbering starts at 1.
850
851           my $third_product_form = $mech->form_id("buy_now", { n => 3 });
852
853       If the "n" parameter is not passed, and there is more than one form on
854       the page with that ID, then the first one is used, and a warning is
855       generated.
856
857       If it is found, the form is returned as an HTML::Form object and set
858       internally for later use with Mech's form methods such as field() and
859       click().
860
861       If no form is found it returns "undef".  This will also trigger a
862       warning, unless "quiet" is enabled.
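
       The following sketch shows form_name() with the "n" argument and
       form_id() on a page where two forms share a name.  The fixture, the
       IDs "widget" and "gadget", and the "sku" values are hypothetical.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture with two forms that share the name "buy_now".
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form name="buy_now" id="widget" action="/cart" method="post">
  <input type="hidden" name="sku" value="W-1">
</form>
<form name="buy_now" id="gadget" action="/cart" method="post">
  <input type="hidden" name="sku" value="G-2">
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );

# Passing n selects the second match explicitly.
my $second = $mech->form_name( 'buy_now', { n => 2 } );
printf "second buy_now form has id %s\n", $second->attr('id');

# Selecting by ID also makes that form the current form.
my $gadget = $mech->form_id('gadget');
printf "current sku: %s\n", $mech->value('sku');
```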
863
864   $mech->all_forms_with_fields( @fields )
865       Selects a form by passing in a list of field names it must contain.
866       All matching forms (perhaps none) are returned as a list of HTML::Form
867       objects.
868
869   $mech->form_with_fields( @fields, [ \%args ] )
870       Selects a form by passing in a list of field names it must contain. By
871       default, the first form that matches all of these field names will be
872       returned.
873
874           my $form = $mech->form_with_fields( qw/sku quantity add_to_cart/ );
875
876       If you want the second, third or nth match, pass an optional arguments
877       hash reference as the final parameter with a key "n" to pick which
878       instance you want. The numbering starts at 1.
879
           my $form = $mech->form_with_fields( 'sku', 'qty', { n => 2 } );
881
       If the "n" parameter is not passed, and there is more than one form on
       the page containing those fields, then the first one is used, and a
       warning is generated.
885
       If it is found, the form is returned as an HTML::Form object and set
       internally for later use with Mech's form methods such as field() and
       click().
889
890       Returns "undef" and emits a warning if no form is found.
891
892       Note that this functionality requires libwww-perl 5.69 or higher.
893
894   $mech->all_forms_with( $attr1 => $value1, $attr2 => $value2, ... )
895       Searches for forms with arbitrary attribute/value pairs within the
896       <form> tag.  When given more than one pair, all criteria must match.
897       Using "undef" as value means that the attribute in question must not be
898       present.
899
900       All matching forms (perhaps none) are returned as a list of HTML::Form
901       objects.
902
903   $mech->form_with( $attr1 => $value1, $attr2 => $value2, ..., [ \%args ] )
904       Searches for forms with arbitrary attribute/value pairs within the
905       <form> tag.  When given more than one pair, all criteria must match.
906       Using "undef" as value means that the attribute in question must not be
907       present.
908
909       By default, the first form that matches all criteria will be returned.
910
911           my $form = $mech->form_with( name => 'order_form', method => 'POST' );
912
913       If you want the second, third or nth match, pass an optional arguments
914       hash reference as the final parameter with a key "n" to pick which
915       instance you want. The numbering starts at 1.
916
917           my $form = $mech->form_with( method => 'POST', { n => 4 } );
918
919       If the "n" parameter is not passed, and there is more than one form on
920       the page matching these criteria, then the first one is used, and a
921       warning is generated.
922
       If it is found, the form is returned as an HTML::Form object and set
       internally for later use with Mech's form methods such as field() and
       click().
926
927       Returns "undef" if no form is found.
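
       A short sketch of selecting forms by their fields and by their
       attributes, including the "undef" convention for "attribute must not
       be present".  The fixture and its form names are assumptions made
       for this example.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture: one form without an id, one with name and id.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form name="search" action="/search" method="get">
  <input type="text" name="q">
</form>
<form name="order_form" id="order" action="/order" method="post">
  <input type="text" name="sku">
  <input type="text" name="quantity">
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );

# The form must contain every listed field name.
my $by_fields = $mech->form_with_fields( 'sku', 'quantity' );

# Every attribute/value pair must match.
my $by_attrs = $mech->form_with( name => 'order_form', id => 'order' );

# undef means "the form has no such attribute".
my @without_id = $mech->all_forms_with( id => undef );

printf "by fields: %s, by attributes: %s, without id: %s\n",
    $by_fields->attr('name'), $by_attrs->attr('name'),
    join ',', map { $_->attr('name') } @without_id;
```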
928

FIELD METHODS

930       These methods allow you to set the values of fields in a given form.
931
932   $mech->field( $name, $value, $number )
933   $mech->field( $name, \@values, $number )
934   $mech->field( $name, \@file_upload_values, $number )
935       Given the name of a field, set its value to the value specified.  This
936       applies to the current form (as set by the form_name() or form_number()
937       method or defaulting to the first form on the page).
938
939       If the field is of type "file", its value should be an arrayref.
940       Example:
941
942           $mech->field( $file_input, ['/tmp/file.txt'] );
943
       Value examples for "file" inputs, followed by an explanation of what
       each index means:

           # 0: filepath      1: filename    2 onward: headers
948           ['/tmp/file.txt']
949           ['/tmp/file.txt', 'filename.txt']
950           ['/tmp/file.txt', 'filename.txt', @headers]
951           ['/tmp/file.txt', 'filename.txt', Content => 'some content']
952           [undef,           'filename.txt', Content => 'content here']
953
       Index 0 is the filepath that will be read from disk. Index 1 is the
       filename which will be used in the HTTP request body; if not given,
       filepath (index 0) is used instead. If "Content => 'content here'" is
       used as shown, then filepath will be ignored.
958
959       The optional $number parameter is used to distinguish between two
960       fields with the same name.  The fields are numbered from 1.
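
       To make the role of $number concrete, here is a sketch that sets two
       same-named fields with field() and reads them back with value().
       The fixture and the field names "title" and "tag" are illustrative
       only.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture with two inputs that share the name "tag".
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form action="/post" method="post">
  <input type="text" name="title">
  <input type="text" name="tag">
  <input type="text" name="tag">
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );
$mech->form_number(1);

$mech->field( 'title', 'Release notes' );
$mech->field( 'tag',   'perl' );       # first field named "tag"
$mech->field( 'tag',   'www', 2 );     # second field named "tag"

printf "title=%s tags=%s,%s\n",
    $mech->value('title'), $mech->value('tag'), $mech->value( 'tag', 2 );
```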
961
962   $mech->select($name, $value)
963   $mech->select($name, \@values)
       Given the name of a "select" field, set its value to the value
       specified.  If the field is not "<select multiple>" and $value is an
       array reference, only the first value will be set.  [Note: the
       documentation previously claimed that only the last value would be
       set, but this was incorrect.]  Passing $value as a hash reference with
       an "n" key selects an item by number (e.g.  "{n => 3}" or
       "{n => [2,4]}").  The numbering starts at 1.  This applies to the
       current form.
971
972       If you have a field with "<select multiple>" and you pass a single
973       $value, then $value will be added to the list of fields selected,
974       without clearing the others.  However, if you pass an array reference,
975       then all previously selected values will be cleared.
976
977       Returns true on successfully setting the value. On failure, returns
978       false and calls "$self->warn()" with an error message.
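
       A sketch of select() on a single-choice and a multiple-choice menu.
       The fixture and option values are invented; reading the selection
       back through param() on the current HTML::Form object is just one
       convenient way to inspect the result.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture with one plain and one multiple select.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form action="/pizza" method="post">
  <select name="size">
    <option value="small">Small</option>
    <option value="medium">Medium</option>
    <option value="large">Large</option>
  </select>
  <select name="toppings" multiple>
    <option value="cheese">Cheese</option>
    <option value="mushrooms">Mushrooms</option>
    <option value="olives">Olives</option>
  </select>
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );
$mech->form_number(1);

# Single value on a plain select; the return value is true on success.
my $ok = $mech->select( 'size', 'large' );

# An array reference replaces the whole selection of a multiple select.
$mech->select( 'toppings', [ 'cheese', 'olives' ] );

my @toppings = $mech->current_form->param('toppings');
printf "size=%s toppings=%s\n", $mech->value('size'), join ',', @toppings;
```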
979
980   $mech->set_fields( $name => $value ... )
981   $mech->set_fields( $name => \@value_and_instance_number )
982   $mech->set_fields( $name => \$value_instance_number )
983   $mech->set_fields( $name => \@file_upload )
       This method sets multiple fields of the current form. It takes a list
       of field name and value pairs. If there is more than one field with the
       same name, the first one found is set. To choose which of the
       duplicate fields to set, pass an anonymous array containing the field
       value and its number as its two elements.
989
990               # set the second $name field to 'foo'
991               $mech->set_fields( $name => [ 'foo', 2 ] );
992
993       The value of a field of type "file" should be an arrayref as described
994       in field(). Examples:
995
996               $mech->set_fields( $file_field => ['/tmp/file.txt'] );
997               $mech->set_fields( $file_field => ['/tmp/file.txt', 'filename.txt'] );
998
999       The value for a "file" input can also be an arrayref containing an
1000       arrayref and a number, as documented in submit_form().  The number will
1001       be used to find the field in the form. Example:
1002
1003               $mech->set_fields( $file_field => [['/tmp/file.txt'], 1] );
1004
1005       The fields are numbered from 1.
1006
1007       For fields that have a predefined set of values, you may also provide a
1008       reference to an integer, if you don't know the options for the field,
1009       but you know you just want (e.g.) the first one.
1010
1011               # select the first value in the $name select box
1012               $mech->set_fields( $name => \0 );
1013               # select the last value in the $name select box
1014               $mech->set_fields( $name => \-1 );
1015
1016       This applies to the current form.
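
       The three value styles described above can be combined in a single
       call.  The sketch below assumes a hypothetical form with a "title"
       field, two "tag" fields and a "color" menu.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture; all field names and options are made up.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form action="/post" method="post">
  <input type="text" name="title">
  <input type="text" name="tag">
  <input type="text" name="tag">
  <select name="color">
    <option value="red">Red</option>
    <option value="green">Green</option>
    <option value="blue">Blue</option>
  </select>
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );
$mech->form_number(1);

$mech->set_fields(
    title => 'Release notes',    # plain value
    tag   => [ 'perl', 2 ],      # value for the second field named "tag"
    color => \-1,                # last predefined option of the menu
);

printf "title=%s second tag=%s color=%s\n",
    $mech->value('title'), $mech->value( 'tag', 2 ), $mech->value('color');
```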
1017
1018   $mech->set_visible( @criteria )
1019       This method sets fields of the current form without having to know
1020       their names.  So if you have a login screen that wants a username and
1021       password, you do not have to fetch the form and inspect the source (or
1022       use the mech-dump utility, installed with WWW::Mechanize) to see what
1023       the field names are; you can just say
1024
1025           $mech->set_visible( $username, $password );
1026
1027       and the first and second fields will be set accordingly.  The method is
1028       called set_visible because it acts only on visible fields; hidden form
1029       inputs are not considered.  The order of the fields is the order in
1030       which they appear in the HTML source which is nearly always the order
1031       anyone viewing the page would think they are in, but some creative work
1032       with tables could change that; caveat user.
1033
1034       Each element in @criteria is either a field value or a field specifier.
1035       A field value is a scalar.  A field specifier allows you to specify the
1036       type of input field you want to set and is denoted with an arrayref
1037       containing two elements.  So you could specify the first radio button
1038       with
1039
1040           $mech->set_visible( [ radio => 'KCRW' ] );
1041
1042       Field values and specifiers can be intermixed, hence
1043
1044           $mech->set_visible( 'fred', 'secret', [ option => 'Checking' ] );
1045
1046       would set the first two fields to "fred" and "secret", and the next
1047       "OPTION" menu field to "Checking".
1048
1049       The possible field specifier types are: "text", "password", "hidden",
1050       "textarea", "file", "image", "submit", "radio", "checkbox" and
1051       "option".
1052
1053       "set_visible" returns the number of values set.
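
       As a sketch of how hidden inputs are skipped, the example below
       fills a hypothetical login form whose first input is a hidden token.
       The fixture and its field names are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture: a hidden token followed by three visible fields.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form action="/login" method="post">
  <input type="hidden" name="csrf" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
  <select name="account">
    <option value="Savings">Savings</option>
    <option value="Checking">Checking</option>
  </select>
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );
$mech->form_number(1);

# Two positional values, then the next "option" (select) field.
my $count = $mech->set_visible( 'fred', 'secret', [ option => 'Checking' ] );

printf "%d values set; username=%s account=%s csrf=%s\n",
    $count, $mech->value('username'), $mech->value('account'),
    $mech->value('csrf');
```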
1054
1055   $mech->tick( $name, $value [, $set] )
       "Ticks" the first checkbox that has both the name and value associated
       with it on the current form.  If there is no value to the input, just
       pass an empty string as the value.  Dies if the form has no checkbox
       with the given name and value.  Passing in a false value as the third
       optional argument will cause the checkbox to be unticked.  The third
       argument does not need to be set if you merely wish to tick the box.
1063
1064           $mech->tick('extra', 'cheese');
1065           $mech->tick('extra', 'mushrooms');
1066
1067           $mech->tick('no_value', ''); # <input type="checkbox" name="no_value">
1068
1069   $mech->untick($name, $value)
1070       Causes the checkbox to be unticked.  Shorthand for
1071       "tick($name,$value,undef)"
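
       A small sketch of tick() and untick() on a group of same-named
       checkboxes.  The fixture is invented, and param() on the current
       HTML::Form object is used here only to list what ended up ticked.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture with three checkboxes that share the name "extra".
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<form action="/pizza" method="post">
  <input type="checkbox" name="extra" value="cheese">
  <input type="checkbox" name="extra" value="mushrooms">
  <input type="checkbox" name="extra" value="olives">
</form>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );
$mech->form_number(1);

$mech->tick( 'extra', 'cheese' );
$mech->tick( 'extra', 'olives' );
$mech->untick( 'extra', 'olives' );    # change of mind

my @ticked = $mech->current_form->param('extra');
print "ticked: @ticked\n";
```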
1072
1073   $mech->value( $name [, $number] )
1074       Given the name of a field, return its value. This applies to the
1075       current form.
1076
1077       The optional $number parameter is used to distinguish between two
1078       fields with the same name.  The fields are numbered from 1.
1079
1080       If the field is of type file (file upload field), the value is always
1081       cleared to prevent remote sites from downloading your local files.  To
1082       upload a file, specify its file name explicitly.
1083
1084   $mech->click( $button [, $x, $y] )
1085       Has the effect of clicking a button on the current form.  The first
1086       argument is the name of the button to be clicked.  The second and third
1087       arguments (optional) allow you to specify the (x,y) coordinates of the
1088       click.
1089
1090       If there is only one button on the form, "$mech->click()" with no
1091       arguments simply clicks that one button.
1092
1093       Returns an HTTP::Response object.
1094
1095   $mech->click_button( ... )
       Has the effect of clicking a button on the current form by specifying
       its attributes. The arguments are a list of key/value pairs. Exactly
       one of name, id, number, input or value must be specified in the keys.
1099
1100       Dies if no button is found.
1101
1102       •   "name => name"
1103
1104           Clicks the button named name in the current form.
1105
1106       •   "id => id"
1107
1108           Clicks the button with the id id in the current form.
1109
1110       •   "number => n"
1111
1112           Clicks the nth button with type submit in the current form.
1113           Numbering starts at 1.
1114
1115       •   "value => value"
1116
1117           Clicks the button with the value value in the current form.
1118
1119       •   "input => $inputobject"
1120
1121           Clicks on the button referenced by $inputobject, an instance of
1122           HTML::Form::SubmitInput obtained e.g. from
1123
1124               $mech->current_form()->find_input( undef, 'submit' )
1125
1126           $inputobject must belong to the current form.
1127
1128       •   "x => x"
1129
1130       •   "y => y"
1131
1132           These arguments (optional) allow you to specify the (x,y)
1133           coordinates of the click.
1134
1135   $mech->submit()
1136       Submits the current form, without specifying a button to click.
1137       Actually, no button is clicked at all.
1138
1139       Returns an HTTP::Response object.
1140
1141       This used to be a synonym for "$mech->click( 'submit' )", but is no
1142       longer so.
1143
1144   $mech->submit_form( ... )
1145       This method lets you select a form from the previously fetched page,
1146       fill in its fields, and submit it. It combines the
1147       "form_number"/"form_name", "set_fields" and "click" methods into one
1148       higher level call. Its arguments are a list of key/value pairs, all of
1149       which are optional.
1150
1151       •   "fields => \%fields"
1152
1153           Specifies the fields to be filled in the current form.
1154
1155       •   "with_fields => \%fields"
1156
1157           Probably all you need for the common case. It combines a smart form
1158           selector and data setting in one operation. It selects the first
1159           form that contains all fields mentioned in "\%fields".  This is
1160           nice because you don't need to know the name or number of the form
1161           to do this.
1162
1163           (calls form_with_fields() and set_fields()).
1164
1165           If you choose "with_fields", the "fields" option will be ignored.
1166           The "form_number", "form_name" and "form_id" options will still be
1167           used.  An exception will be thrown unless exactly one form matches
1168           all of the provided criteria.
1169
1170       •   "form_number => n"
1171
           Selects the nth form (calls form_number()).  If this param is not
           specified, the currently-selected form is used.
1174
1175       •   "form_name => name"
1176
1177           Selects the form named name (calls form_name())
1178
1179       •   "form_id => ID"
1180
1181           Selects the form with ID ID (calls form_id())
1182
1183       •   "button => button"
1184
1185           Clicks on button button (calls click())
1186
1187       •   "x => x, y => y"
1188
1189           Sets the x or y values for click()
1190
1191       •   "strict_forms => bool"
1192
1193           Sets the HTML::Form strict flag which causes form submission to
1194           croak if any of the passed fields don't exist on the page, and/or a
1195           value doesn't exist in a select element.  By default HTML::Form
1196           sets this value to false.
1197
1198           This behavior can also be turned on globally by passing
1199           "strict_forms => 1" to "WWW::Mechanize->new". If you do that, you
1200           can still disable it for individual calls by passing "strict_forms
1201           => 0" here.
1202
1203       If no form is selected, the first form found is used.
1204
1205       If button is not passed, then the submit() method is used instead.
1206
1207       If you want to submit a file and get its content from a scalar rather
1208       than a file in the filesystem, you can use:
1209
1210           $mech->submit_form(with_fields => { logfile => [ [ undef, 'whatever', Content => $content ], 1 ] } );
1211
1212       Returns an HTTP::Response object.
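
       To see what submit_form() actually sends without touching the
       network, one option is a test double that overrides
       _make_request(), the interception hook listed under INTERNAL-ONLY
       METHODS.  The sketch below is an assumption-laden illustration:
       the "OfflineMech" class, the canned page and the "example.test"
       host are hypothetical and not part of WWW::Mechanize.

```perl
use strict;
use warnings;
use HTTP::Response ();
use WWW::Mechanize ();

{
    package OfflineMech;    # hypothetical test double
    use parent -norequire, 'WWW::Mechanize';

    my $PAGE = <<'HTML';
<html><body>
<form name="login" action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="submit" name="go" value="Log in">
</form>
</body></html>
HTML

    # Answer every request with the canned page instead of using the network.
    sub _make_request {
        my ( $self, $request ) = @_;
        my $response = HTTP::Response->new(
            200, 'OK', [ 'Content-Type' => 'text/html' ], $PAGE );
        $response->request($request);    # lets Mech derive the base URL
        return $response;
    }
}

my $mech = OfflineMech->new;
$mech->get('http://example.test/login');

# Pick the form by its fields, fill them in and click the "go" button.
my $response = $mech->submit_form(
    with_fields => { username => 'mungo', password => 'lost-and-alone' },
    button      => 'go',
);

my $sent = $response->request;
printf "%s %s\n%s\n", $sent->method, $sent->uri, $sent->content;
```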
1213

MISCELLANEOUS METHODS

1215   $mech->add_header( name => $value [, name => $value... ] )
1216       Sets HTTP headers for the agent to add or remove from the HTTP request.
1217
1218           $mech->add_header( Encoding => 'text/klingon' );
1219
1220       If a value is "undef", then that header will be removed from any future
1221       requests.  For example, to never send a Referer header:
1222
1223           $mech->add_header( Referer => undef );
1224
1225       If you want to delete a header, use "delete_header".
1226
1227       Returns the number of name/value pairs added.
1228
1229       NOTE: This method was very different in WWW::Mechanize before 1.00.
1230       Back then, the headers were stored in a package hash, not as a member
1231       of the object instance.  Calling add_header() would modify the headers
1232       for every WWW::Mechanize object, even after your object no longer
1233       existed.
1234
1235   $mech->delete_header( name [, name ... ] )
1236       Removes HTTP headers from the agent's list of special headers.  For
1237       instance, you might need to do something like:
1238
1239           # Don't send a Referer for this URL
1240           $mech->add_header( Referer => undef );
1241
1242           # Get the URL
1243           $mech->get( $url );
1244
1245           # Back to the default behavior
1246           $mech->delete_header( 'Referer' );
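
       A sketch of the add_header()/delete_header() round trip against a
       local file, inspecting the request object stored in the response.
       The "X-Example" header name and the fixture are assumptions made
       for this example.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture so the requests never leave the machine.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} "<html><body><p>hello</p></body></html>\n";
close $tmp or die "Cannot close fixture: $!";
my $uri = URI::file->new_abs( $tmp->filename )->as_string;

my $mech = WWW::Mechanize->new;

# Two name/value pairs: one header to send, one to suppress.
my $added = $mech->add_header( 'X-Example' => 'demo', Referer => undef );

$mech->get($uri);
my $with = $mech->response->request->header('X-Example');

# Back to the default behavior for X-Example.
$mech->delete_header('X-Example');
$mech->get($uri);
my $without = $mech->response->request->header('X-Example');

printf "pairs added: %d, first request: %s, second request: %s\n",
    $added, $with, defined $without ? $without : '(not sent)';
```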
1247
1248   $mech->quiet(true/false)
1249       Allows you to suppress warnings to the screen.
1250
1251           $mech->quiet(0); # turns on warnings (the default)
1252           $mech->quiet(1); # turns off warnings
1253           $mech->quiet();  # returns the current quietness status
1254
1255   $mech->autocheck(true/false)
1256       Allows you to enable and disable autochecking.
1257
1258       Autocheck checks each request made to see if it was successful. This
1259       saves you the trouble of manually checking yourself. Any errors found
1260       are errors, not warnings. Please see "new" for more details.
1261
1262           $mech->autocheck(1); # turns on automatic request checking (the default)
1263           $mech->autocheck(0); # turns off automatic request checking
1264           $mech->autocheck();  # returns the current autocheck status
1265
1266   $mech->stack_depth( $max_depth )
1267       Get or set the page stack depth. Use this if you're doing a lot of page
1268       scraping and running out of memory.
1269
1270       A value of 0 means "no history at all."  By default, the max stack
1271       depth is humongously large, effectively keeping all history.
1272
1273   $mech->save_content( $filename, %opts )
1274       Dumps the contents of "$mech->content" into $filename.  $filename will
1275       be overwritten.  Dies if there are any errors.
1276
1277       If the content type does not begin with "text/", then the content is
1278       saved in binary mode (i.e. binmode() is set on the output filehandle).
1279
1280       Additional arguments can be passed as key/value pairs:
1281
1282       $mech->save_content( $filename, binary => 1 )
1283           Filehandle is set with "binmode" to ":raw" and contents are taken
1284           calling "$self->content(decoded_by_headers => 1)". Same as calling:
1285
1286               $mech->save_content( $filename, binmode => ':raw',
1287                                    decoded_by_headers => 1 );
1288
1289           This should be the safest way to save contents verbatim.
1290
1291       $mech->save_content( $filename, binmode => $binmode )
1292           Filehandle is set to binary mode. If $binmode begins with ':', it
1293           is passed as a parameter to "binmode":
1294
1295               binmode $fh, $binmode;
1296
1297           otherwise the filehandle is set to binary mode if $binmode is true:
1298
1299               binmode $fh;
1300
1301       all other arguments
1302           are passed as-is to "$mech->content(%opts)". In particular,
           "decoded_by_headers" might come in handy if you want to revert the
           effect of compression performed by the web server but without
1305           further interpreting the contents (e.g. decoding it according to
1306           the charset).
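
       A sketch of save_content() in its default mode and with
       "binary => 1", writing into a temporary directory.  The fixture and
       output file names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture to fetch.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} "<html><body><p>hello</p></body></html>\n";
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );

# Temporary output directory, removed when $dir goes out of scope.
my $dir  = File::Temp->newdir;
my $text = "$dir/page.html";
my $raw  = "$dir/page.raw";

$mech->save_content($text);                  # mode chosen from the content type
$mech->save_content( $raw, binary => 1 );    # verbatim bytes

printf "%s: %d bytes\n", $_, -s $_ for $text, $raw;
```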
1307
1308   $mech->dump_headers( [$fh] )
1309       Prints a dump of the HTTP response headers for the most recent
1310       response.  If $fh is not specified or is "undef", it dumps to STDOUT.
1311
1312       Unlike the rest of the "dump_*" methods, $fh can be a scalar. It will
1313       be used as a file name.
1314
1315   $mech->dump_links( [[$fh], $absolute] )
1316       Prints a dump of the links on the current page to $fh.  If $fh is not
1317       specified or is "undef", it dumps to STDOUT.
1318
1319       If $absolute is true, links displayed are absolute, not relative.
1320
1321   $mech->dump_images( [[$fh], $absolute] )
1322       Prints a dump of the images on the current page to $fh.  If $fh is not
1323       specified or is "undef", it dumps to STDOUT.
1324
       If $absolute is true, image URLs displayed are absolute, not
       relative.
1326
1327       The output will include empty lines for images that have no "src"
1328       attribute and therefore no URL.
1329
1330   $mech->dump_forms( [$fh] )
1331       Prints a dump of the forms on the current page to $fh.  If $fh is not
1332       specified or is "undef", it dumps to STDOUT. Running the following:
1333
1334           my $mech = WWW::Mechanize->new();
1335           $mech->get("https://www.google.com/");
1336           $mech->dump_forms;
1337
1338       will print:
1339
1340           GET https://www.google.com/search [f]
1341             ie=ISO-8859-1                  (hidden readonly)
1342             hl=en                          (hidden readonly)
1343             source=hp                      (hidden readonly)
1344             biw=                           (hidden readonly)
1345             bih=                           (hidden readonly)
1346             q=                             (text)
1347             btnG=Google Search             (submit)
1348             btnI=I'm Feeling Lucky         (submit)
1349             gbv=1                          (hidden readonly)
1350
1351   $mech->dump_text( [$fh] )
1352       Prints a dump of the text on the current page to $fh.  If $fh is not
1353       specified or is "undef", it dumps to STDOUT.
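
       Because the "dump_*" methods accept a filehandle, their output can
       be captured in a string through an in-memory handle.  The sketch
       below does this for dump_links(); the fixture and its link targets
       are invented.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Local fixture with one relative and one absolute link.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} <<'HTML';
<html><body>
<a href="/about.html">About</a>
<a href="https://example.test/news">News</a>
</body></html>
HTML
close $tmp or die "Cannot close fixture: $!";

my $mech = WWW::Mechanize->new;
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );

# Capture the dump in a scalar instead of printing to STDOUT.
open my $fh, '>', \my $buffer or die "Cannot open in-memory handle: $!";
$mech->dump_links($fh);
close $fh or die "Cannot close in-memory handle: $!";

print $buffer;
```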
1354

OVERRIDDEN LWP::UserAgent METHODS

1356   $mech->clone()
1357       Clone the mech object.  The clone will be using the same cookie jar as
1358       the original mech.
1359
1360   $mech->redirect_ok()
       An overridden version of redirect_ok() in LWP::UserAgent.  This method
       is used to determine whether a redirection in the request should be
       followed.
1364
1365       Note that WWW::Mechanize's constructor pushes POST on to the agent's
1366       "requests_redirectable" list.
1367
1368   $mech->request( $request [, $arg [, $size]])
       Overridden version of request() in LWP::UserAgent.  Performs the actual
1370       request.  Normally, if you're using WWW::Mechanize, it's because you
1371       don't want to deal with this level of stuff anyway.
1372
1373       Note that $request will be modified.
1374
1375       Returns an HTTP::Response object.
1376
1377   $mech->update_html( $html )
1378       Allows you to replace the HTML that the mech has found.  Updates the
1379       forms and links parse-trees that the mech uses internally.
1380
1381       Say you have a page that you know has malformed output, and you want to
1382       update it so the links come out correctly:
1383
1384           my $html = $mech->content;
1385           $html =~ s[</option>.{0,3}</td>][</option></select></td>]isg;
1386           $mech->update_html( $html );
1387
       This method is also used internally by the mech itself to update its
       own HTML content when loading a page. This means that if you would like
       to systematically perform the above HTML substitution, you would
       override "update_html" in a subclass thusly:
1392
1393          package MyMech;
1394          use base 'WWW::Mechanize';
1395
1396          sub update_html {
1397              my ($self, $html) = @_;
1398              $html =~ s[</option>.{0,3}</td>][</option></select></td>]isg;
1399              $self->WWW::Mechanize::update_html( $html );
1400          }
1401
1402       If you do this, then the mech will use the tidied-up HTML instead of
1403       the original both when parsing for its own needs, and for returning to
1404       you through content().
1405
       Overriding this method is also the recommended way of implementing
       extra validation steps (e.g. link checkers) for every HTML page
       received.  "warn" and "die" would then come in handy to signal
       validation errors.
1410
1411   $mech->credentials( $username, $password )
1412       Provide credentials to be used for HTTP Basic authentication for all
1413       sites and realms until further notice.
1414
1415       The four argument form described in LWP::UserAgent is still supported.
1416
1417   $mech->get_basic_credentials( $realm, $uri, $isproxy )
1418       Returns the credentials for the realm and URI.
1419
1420   $mech->clear_credentials()
1421       Remove any credentials set up with credentials().
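
       A minimal sketch of the two-argument credentials() form and how
       get_basic_credentials() reports it.  The user name, password, realm
       and URL are placeholders.

```perl
use strict;
use warnings;
use URI ();
use WWW::Mechanize ();

my $mech = WWW::Mechanize->new;

# Two-argument form: offered to every site and realm.
$mech->credentials( 'alice', 's3cret' );

# Realm and URL are placeholders here; the pair set above is returned
# regardless of them.
my @cred = $mech->get_basic_credentials(
    'Example Realm', URI->new('http://example.test/'), 0 );
print "user=$cred[0] password=$cred[1]\n";

# Forget the stored pair again.
$mech->clear_credentials;
```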
1422

INHERITED UNCHANGED LWP::UserAgent METHODS

       As a subclass of LWP::UserAgent, WWW::Mechanize inherits all of
       LWP::UserAgent's methods, many of which are overridden or extended.
       The following methods are inherited unchanged. View the LWP::UserAgent
       documentation for their implementation descriptions.

       This is not meant to be an exhaustive list.  LWP::UserAgent may have
       added others.
1431
1432   $mech->head()
1433       Inherited from LWP::UserAgent.
1434
1435   $mech->mirror()
1436       Inherited from LWP::UserAgent.
1437
1438   $mech->simple_request()
1439       Inherited from LWP::UserAgent.
1440
1441   $mech->is_protocol_supported()
1442       Inherited from LWP::UserAgent.
1443
1444   $mech->prepare_request()
1445       Inherited from LWP::UserAgent.
1446
1447   $mech->progress()
1448       Inherited from LWP::UserAgent.
1449

INTERNAL-ONLY METHODS

1451       These methods are only used internally.  You probably don't need to
1452       know about them.
1453
1454   $mech->_update_page($request, $response)
       Updates all internal variables in $mech as if $request was just
       performed, and returns $response. The page stack is not altered by this
       method; it is up to the caller (e.g.  "request") to do that.
1458
1459   $mech->_modify_request( $req )
       Modifies an HTTP::Request before the request is sent out, for both GET
       and POST requests.

       We add a "Referer" header, as well as a header to note that we can
       accept gzip encoded content, if Compress::Zlib is installed.
1465
1466   $mech->_make_request()
1467       Convenience method to make it easier for subclasses like
1468       WWW::Mechanize::Cached to intercept the request.
1469
1470   $mech->_reset_page()
       Resets the internal fields that track parsed page data.
1472
1473   $mech->_extract_links()
1474       Extracts links from the content of a webpage, and populates the
1475       "{links}" property with WWW::Mechanize::Link objects.
1476
1477   $mech->_push_page_stack()
1478       The agent keeps a stack of visited pages, which it can pop when it
1479       needs to go BACK and so on.
1480
1481       The current page needs to be pushed onto the stack before we get a new
1482       page, and the stack needs to be popped when BACK occurs.
1483
       Neither of these takes any arguments; they just operate on the $mech
       object.
1486
1487   warn( @messages )
1488       Centralized warning method, for diagnostics and non-fatal problems.
1489       Defaults to calling "CORE::warn", but may be overridden by setting
1490       "onwarn" in the constructor.
1491
1492   die( @messages )
1493       Centralized error method.  Defaults to calling "CORE::die", but may be
1494       overridden by setting "onerror" in the constructor.
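
       A sketch of routing warn() and die() through custom handlers set in
       the constructor.  The handler bodies and message texts are
       illustrative assumptions; the example also shows that quiet()
       suppresses warnings before they reach the handler.

```perl
use strict;
use warnings;
use WWW::Mechanize ();

my @warnings;
my $mech = WWW::Mechanize->new(
    onwarn  => sub { push @warnings, join q{}, @_ },    # collect, do not print
    onerror => sub { die "mech error: @_\n" },          # add a prefix, still fatal
);

$mech->warn('link check failed');    # reaches the onwarn handler

$mech->quiet(1);
$mech->warn('not recorded');         # suppressed while quiet is on
$mech->quiet(0);

# die() goes through the onerror handler, which throws.
my $ok    = eval { $mech->die('page is invalid'); 1 };
my $error = $@;

print "warnings: @warnings\n";
print "caught: $error" unless $ok;
```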
1495

BEST PRACTICES

1497       The default settings can get you up and running quickly, but there are
1498       settings you can change in order to make your life easier.
1499
1500       autocheck
1501           "autocheck" can save you the overhead of checking status codes for
1502           success.  You may outgrow it as your needs get more sophisticated,
1503           but it's a safe option to start with.
1504
1505               my $agent = WWW::Mechanize->new( autocheck => 1 );
1506
1507       cookie_jar
1508           You are encouraged to install Mozilla::PublicSuffix and use
1509           HTTP::CookieJar::LWP as your cookie jar.  HTTP::CookieJar::LWP
1510           provides a better security model matching that of current Web
1511           browsers when Mozilla::PublicSuffix is installed.
1512
               use WWW::Mechanize ();
               use HTTP::CookieJar::LWP ();

               my $jar   = HTTP::CookieJar::LWP->new;
               my $agent = WWW::Mechanize->new( cookie_jar => $jar );
1517
       protocols_allowed
           This option is inherited directly from LWP::UserAgent.  It
           restricts requests to the listed protocol schemes; any scheme not
           on the list is refused.

               my $agent = WWW::Mechanize->new(
                   protocols_allowed => [ 'http', 'https' ]
               );

           This will prevent you from inadvertently following URLs like
           "file:///etc/passwd".
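
           The restriction can be checked locally.  In the sketch below,
           which assumes a throwaway temporary file rather than a real
           system file, an unrestricted agent fetches the "file:" URL while
           an agent limited to "http" and "https" does not.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Temporary file standing in for something you would not want fetched.
my $tmp = File::Temp->new( SUFFIX => '.txt' );
print {$tmp} "local data\n";
close $tmp;
my $uri = URI::file->new_abs( $tmp->filename )->as_string;

# autocheck is off so that failures can be inspected instead of trapped.
my $open = WWW::Mechanize->new( autocheck => 0 );
$open->get($uri);

my $restricted = WWW::Mechanize->new(
    autocheck         => 0,
    protocols_allowed => [ 'http', 'https' ],
);
$restricted->get($uri);

printf "unrestricted: %s\n", $open->success       ? 'fetched' : 'refused';
printf "restricted:   %s\n", $restricted->success ? 'fetched' : 'refused';
```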
1528
       protocols_forbidden
           This option is also inherited directly from LWP::UserAgent.  It
           denies requests for the listed protocol schemes while leaving all
           others available.

               my $agent = WWW::Mechanize->new(
                   protocols_forbidden => [ 'file', 'mailto', 'ssh', ]
               );

           This will prevent you from inadvertently following URLs like
           "file:///etc/passwd".
1539
1540       strict_forms
1541           Consider turning on the "strict_forms" option when you create a new
1542           Mech.  This will perform a helpful sanity check on form fields
1543           every time you are submitting a form, which can save you a lot of
1544           debugging time.
1545
               my $agent = WWW::Mechanize->new( strict_forms => 1 );
1547
1548           If you do not want to have this option globally, you can still turn
1549           it on for individual forms.
1550
               $agent->submit_form( fields => { foo => 'bar' }, strict_forms => 1 );
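
           A minimal sketch of the check in action follows.  The login form,
           its field names and the temporary file are invented for the
           example; the misspelled field name is expected to make
           submit_form() die, which the eval block traps.

```perl
use strict;
use warnings;
use File::Temp ();
use URI::file ();
use WWW::Mechanize ();

# Hypothetical login page, served from a temporary local file.
my $tmp = File::Temp->new( SUFFIX => '.html' );
print {$tmp} join "\n",
    '<html><body>',
    '<form action="login" method="post">',
    '<input type="text" name="username">',
    '<input type="password" name="password">',
    '</form>',
    '</body></html>';
close $tmp;

my $mech = WWW::Mechanize->new( strict_forms => 1 );
$mech->get( URI::file->new_abs( $tmp->filename )->as_string );

# "usernaem" is a deliberate typo for "username".
my $ok = eval {
    $mech->submit_form(
        form_number => 1,
        fields      => { usernaem => 'mungo', password => 'lost-and-alone' },
    );
    1;
};
print "rejected: $@" unless $ok;
```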
1552

WWW::MECHANIZE'S GIT REPOSITORY

1554       WWW::Mechanize is hosted at GitHub.
1555
1556       Repository: <https://github.com/libwww-perl/WWW-Mechanize>.  Bugs:
1557       <https://github.com/libwww-perl/WWW-Mechanize/issues>.
1558

OTHER DOCUMENTATION

1560   Spidering Hacks, by Kevin Hemenway and Tara Calishain
1561       Spidering Hacks from O'Reilly
1562       (<http://www.oreilly.com/catalog/spiderhks/>) is a great book for
1563       anyone wanting to know more about screen-scraping and spidering.
1564
1565       There are six hacks that use Mech or a Mech derivative:
1566
1567       #21 WWW::Mechanize 101
1568       #22 Scraping with WWW::Mechanize
1569       #36 Downloading Images from Webshots
1570       #44 Archiving Yahoo! Groups Messages with WWW::Yahoo::Groups
1571       #64 Super Author Searching
1572       #73 Scraping TV Listings
1573
1574       The book was also positively reviewed on Slashdot:
1575       <http://books.slashdot.org/article.pl?sid=03/12/11/2126256>
1576

ONLINE RESOURCES AND SUPPORT

1578       •   WWW::Mechanize mailing list
1579
1580           The Mech mailing list is at
1581           <http://groups.google.com/group/www-mechanize-users> and is
1582           specific to Mechanize, unlike the LWP mailing list below.  Although
1583           it is a users list, all development discussion takes place here,
1584           too.
1585
1586       •   LWP mailing list
1587
1588           The LWP mailing list is at
1589           <http://lists.perl.org/showlist.cgi?name=libwww>, and is more user-
1590           oriented and well-populated than the WWW::Mechanize list.
1591
1592       •   Perlmonks
1593
1594           <http://perlmonks.org> is an excellent community of support, and
1595           many questions about Mech have already been answered there.
1596
1597       •   WWW::Mechanize::Examples
1598
1599           A random array of examples submitted by users, included with the
1600           Mechanize distribution.
1601

ARTICLES ABOUT WWW::MECHANIZE

1603       •   <http://www.ibm.com/developerworks/linux/library/wa-perlsecure/>
1604
1605           IBM article "Secure Web site access with Perl"
1606
1607       •   <http://www.oreilly.com/catalog/googlehks2/chapter/hack84.pdf>
1608
1609           Leland Johnson's hack #84 in Google Hacks, 2nd Edition is an
1610           example of a production script that uses WWW::Mechanize and
1611           HTML::TableContentParser. It takes in keywords and returns the
1612           estimated price of these keywords on Google's AdWords program.
1613
1614       •   <http://www.perl.com/pub/a/2004/06/04/recorder.html>
1615
1616           Linda Julien writes about using HTTP::Recorder to create
1617           WWW::Mechanize scripts.
1618
1619       •   <http://www.developer.com/lang/other/article.php/3454041>
1620
1621           Jason Gilmore's article on using WWW::Mechanize for scraping sales
1622           information from Amazon and eBay.
1623
1624       •   <http://www.perl.com/pub/a/2003/01/22/mechanize.html>
1625
1626           Chris Ball's article about using WWW::Mechanize for scraping TV
1627           listings.
1628
1629       •   <http://www.stonehenge.com/merlyn/LinuxMag/col47.html>
1630
           Randal Schwartz's article on scraping Yahoo News for images.  It's
           already out of date: he manually walks the list of links hunting
           for matches, which wouldn't have been necessary if the find_link()
           method had existed at press time.
1635
1636       •   <http://www.perladvent.org/2002/16th/>
1637
1638           WWW::Mechanize on the Perl Advent Calendar, by Mark Fowler.
1639
1640       •   <http://www.linux-magazin.de/ausgaben/2004/03/datenruessel/>
1641
1642           Michael Schilli's article on Mech and WWW::Mechanize::Shell for the
1643           German magazine Linux Magazin.
1644
1645   Other modules that use Mechanize
1646       Here are modules that use or subclass Mechanize.  Let me know of any
1647       others:
1648
1649       •   Finance::Bank::LloydsTSB
1650
1651       •   HTTP::Recorder
1652
1653           Acts as a proxy for web interaction, and then generates
1654           WWW::Mechanize scripts.
1655
1656       •   Win32::IE::Mechanize
1657
1658           Just like Mech, but using Microsoft Internet Explorer to do the
1659           work.
1660
1661       •   WWW::Bugzilla
1662
1663       •   WWW::Google::Groups
1664
1665       •   WWW::Hotmail
1666
1667       •   WWW::Mechanize::Cached
1668
1669       •   WWW::Mechanize::Cached::GZip
1670
1671       •   WWW::Mechanize::FormFiller
1672
1673       •   WWW::Mechanize::Shell
1674
1675       •   WWW::Mechanize::Sleepy
1676
1677       •   WWW::Mechanize::SpamCop
1678
1679       •   WWW::Mechanize::Timed
1680
1681       •   WWW::SourceForge
1682
1683       •   WWW::Yahoo::Groups
1684
1685       •   WWW::Scripter
1686

ACKNOWLEDGEMENTS

1688       Thanks to the numerous people who have helped out on WWW::Mechanize in
1689       one way or another, including Kirrily Robert for the original
1690       "WWW::Automate", Lyle Hopkins, Damien Clark, Ansgar Burchardt, Gisle
1691       Aas, Jeremy Ary, Hilary Holz, Rafael Kitover, Norbert Buchmuller, Dave
1692       Page, David Sainty, H.Merijn Brand, Matt Lawrence, Michael Schwern,
1693       Adriano Ferreira, Miyagawa, Peteris Krumins, Rafael Kitover, David
1694       Steinbrunner, Kevin Falcone, Mike O'Regan, Mark Stosberg, Uri Guttman,
1695       Peter Scott, Philippe Bruhat, Ian Langworth, John Beppu, Gavin Estey,
1696       Jim Brandt, Ask Bjoern Hansen, Greg Davies, Ed Silva, Mark-Jason
1697       Dominus, Autrijus Tang, Mark Fowler, Stuart Children, Max Maischein,
1698       Meng Wong, Prakash Kailasa, Abigail, Jan Pazdziora, Dominique
1699       Quatravaux, Scott Lanning, Rob Casey, Leland Johnson, Joshua Gatcomb,
1700       Julien Beasley, Abe Timmerman, Peter Stevens, Pete Krawczyk, Tad
1701       McClellan, and the late great Iain Truskett.
1702

AUTHOR

1704       Andy Lester <andy at petdance.com>
1705

COPYRIGHT AND LICENSE

1707       This software is copyright (c) 2004 by Andy Lester.
1708
1709       This is free software; you can redistribute it and/or modify it under
1710       the same terms as the Perl 5 programming language system itself.
1711
1712
1713
perl v5.36.0                      2023-01-20                 WWW::Mechanize(3)