LW2(3)                User Contributed Perl Documentation               LW2(3)

NAME

6       LW2 - Perl HTTP library version 2.4
7

SYNOPSIS

9       use LW2;
10
11       require 'LW2.pm';
12

DESCRIPTION

14       Libwhisker is a Perl library useful for HTTP testing scripts.  It con‐
15       tains a pure-Perl reimplementation of functionality found in the "LWP",
16       "URI", "Digest::MD5", "Digest::MD4", "Data::Dumper", "Authen::NTLM",
17       "HTML::Parser", "HTML::FormParser", "CGI::Upload", "MIME::Base64", and
       "Getopt::Std" modules.
19
20       Libwhisker is designed to be portable (a single perl file), fast (gen‐
21       eral benchmarks show libwhisker is faster than LWP), and flexible
22       (great care was taken to ensure the library does exactly what you want
23       to do, even if it means breaking the protocol).
24

FUNCTIONS

26       The following are the functions contained in Libwhisker:
27
28       auth_brute_force
29           Params: $auth_method, \%req, $user, \@passwords [, $domain,
30           $fail_code ]
31
32           Return: $first_valid_password, undef if error/none found
33
           Perform an HTTP authentication brute force against a server (host
35           and URI defined in %req).  It will try every password in the pass‐
36           word array for the given user.  The first password (in conjunction
37           with the given user) that doesn't return HTTP 401 is returned (and
38           the brute force is stopped at that point).  You should retry the
39           request with the given password and double-check that you got a
40           useful HTTP return code that indicates successful authentication
41           (200, 302), and not something a bit more abnormal (407, 500, etc).
42           $domain is optional, and is only used for NTLM auth.
43
44           Note: set up any proxy settings and proxy auth in %req before call‐
45           ing this function.
46
47           You can brute-force proxy authentication by setting up the target
48           proxy as proxy_host and proxy_port in %req, using an arbitrary host
49           and uri (preferably one that is reachable upon successful proxy
50           authorization), and setting the $fail_code to 407.  The
51           $auth_method passed to this function should be a proxy-based one
52           ('proxy-basic', 'proxy-ntlm', etc).
53
           If your server returns something other than 401 upon auth failure,
55           then set $fail_code to whatever is returned (and it needs to be
56           something *different* than what is received on auth success, or
57           this function won't be able to tell the difference).
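
           The brute-force-then-verify advice above can be sketched roughly
           as follows.  This is only a sketch: brute_and_verify() is a
           made-up helper, the host, URI, user and password list come from
           the command line, and reading the status code from
           {whisker}->{code} is an assumption not stated in this document.

```perl
use strict;
use warnings;

# Hypothetical helper: brute-force basic auth, then re-check the winner.
# Host, URI, user, and password list are placeholders supplied by the caller.
sub brute_and_verify {
    my ($host, $uri, $user, @passwords) = @_;
    require LW2;    # loaded lazily so this file compiles without LW2

    my $req = LW2::http_new_request(host => $host, uri => $uri);
    LW2::http_fixup_request($req);

    # $domain is undef (not NTLM); 401 is passed explicitly as $fail_code.
    my $pass = LW2::auth_brute_force('basic', $req, $user, \@passwords, undef, 401);
    return unless defined $pass;

    # Retry with the candidate password and inspect the status code.
    LW2::auth_set('basic', $req, $user, $pass);
    my %resp;
    return if LW2::http_do_request($req, \%resp);    # non-zero means error

    # Assumption: the status code is stored in {whisker}->{code}.
    my $code = $resp{whisker}->{code};
    return (defined $code && ($code == 200 || $code == 302)) ? $pass : undef;
}

# Only runs when host, URI, user, and at least one password are given.
if (@ARGV >= 4) {
    my ($host, $uri, $user, @list) = @ARGV;
    my $found = brute_and_verify($host, $uri, $user, @list);
    print defined $found ? "valid password: $found\n" : "nothing found\n";
}
```

           Only run something like this against servers you are authorized
           to test.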
58
59       auth_unset
60           Params: \%req
61
62           Return: nothing (modifies %req)
63
           Modifies %req to disable all authentication (regular and proxy).
65
66           Note: it only removes the values set by auth_set().  Manually-
67           defined [Proxy-]Authorization headers will also be deleted (but you
68           shouldn't be using the auth_* functions if you're manually handling
69           your own auth...)
70
71       auth_set
72           Params: $auth_method, \%req, $user, $password [, $domain]
73
74           Return: nothing (modifies %req)
75
           Modifies %req to use the indicated authentication info.
77
78           Auth_method can be: 'basic', 'proxy-basic', 'ntlm', 'proxy-ntlm'.
79
80           Note: this function may not necessarily set any headers after being
81           called.  Also, proxy-ntlm with SSL is not currently supported.
82
83       cookie_new_jar
84           Params: none
85
86           Return: $jar
87
88           Create a new cookie jar, for use with the other functions.  Even
89           though the jar is technically just a hash, you should still use
90           this function in order to be future-compatible (should the jar for‐
91           mat change).
92
93       cookie_read
94           Params: $jar, \%response [, \%request, $reject ]
95
96           Return: $num_of_cookies_read
97
           Read in cookies from a %response hash, and put them in $jar.
99
100           Notice: cookie_read uses internal magic done by http_do_request in
101           order to read cookies regardless of 'Set-Cookie[2]' header appear‐
102           ance.
103
104           If the optional %request hash is supplied, then it will be used to
105           calculate default host and path values, in case the cookie doesn't
106           specify them explicitly.  If $reject is set to 1, then the %request
107           hash values are used to calculate and reject cookies which are not
108           appropriate for the path and domains of the given request.
109
110       cookie_parse
111           Params: $jar, $cookie [, $default_domain, $default_path, $reject ]
112
113           Return: nothing
114
115           Parses the cookie into the various parts and then sets the appro‐
116           priate values in the cookie $jar. If the cookie value is blank, it
117           will delete it from the $jar.  See the 'docs/cookies.txt' document
118           for a full explanation of how Libwhisker parses cookies and what
119           RFC aspects are supported.
120
121           The optional $default_domain value is taken literally.  Values with
122           no leading dot (e.g. 'www.host.com') are considered to be strict
123           hostnames and will only match the identical hostname.  Values with
124           leading dots (e.g.  '.host.com') are treated as sub-domain matches
125           for a single domain level.  If the cookie does not indicate a
126           domain, and a $default_domain is not provided, then the cookie is
127           considered to match all domains/hosts.
128
129           The optional $default_path is used when the cookie does not specify
130           a path.  $default_path must be absolute (start with '/'), or it
131           will be ignored.  If the cookie does not specify a path, and
132           $default_path is not provided, then the default value '/' will be
133           used.
134
135           Set $reject to 1 if you wish to reject cookies based upon the pro‐
136           vided $default_domain and $default_path.  Note that $default_domain
137           and $default_path must be specified for $reject to actually do
138           something meaningful.
139
140       cookie_write
141           Params: $jar, \%request, $override
142
143           Return: nothing
144
           Goes through the given $jar and sets the Cookie header in %request
           for cookies whose domain and path match the request.  If $override
           is true, then the secure, domain and path restrictions of the
           cookies are ignored and all cookies are included.
149
           Notice: cookie expiration is currently not implemented.  URL
           restriction comparison is also case-insensitive.
152
153       cookie_get
154           Params: $jar, $name
155
156           Return: @elements
157
158           Fetch the named cookie from the $jar, and return the components.
159           The returned items will be an array in the following order:
160
161           value, domain, path, expire, secure
162
            value  = cookie value, should always be a non-empty string
            domain = domain root for cookie, can be undefined
            path   = URL path for cookie, should always be a non-empty string
            expire = undefined (deprecated, but exists for
                     backwards-compatibility)
            secure = whether or not the cookie is limited to HTTPS; value
                     is 0 or 1
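
           A small round trip through a jar might look like the sketch
           below.  The cookie name, value and domain are made up, and the
           example skips itself when LW2 is not installed.

```perl
use strict;
use warnings;

my @fields;    # filled in only when LW2 is installed

if (eval { require LW2; 1 }) {
    my $jar = LW2::cookie_new_jar();

    # Name, value, domain, path, expire (ignored), secure flag.
    LW2::cookie_set($jar, 'session', 'abc123', '.example.com', '/', undef, 0);

    # Unpack in the documented order.
    @fields = LW2::cookie_get($jar, 'session');
    my ($value, $domain, $path, $expire, $secure) = @fields;
    printf "value=%s domain=%s path=%s secure=%s\n",
        $value, $domain // '(undef)', $path, $secure;
}
else {
    print "LW2 not installed; skipping cookie example\n";
}
```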
168
169       cookie_get_names
170           Params: $jar
171
172           Return: @names
173
           Fetch all the cookie names from the jar, which then lets you
           cookie_get() them individually.
176
177       cookie_get_valid_names
178           Params: $jar, $domain, $url, $ssl
179
180           Return: @names
181
182           Fetch all the cookie names from the jar which are valid for the
           given $domain, $url, and $ssl values.  $domain should be a string
184           scalar of the target host domain ('www.example.com', etc.).  $url
185           should be the absolute URL for the page ('/index.html',
186           '/cgi-bin/foo.cgi', etc.).  $ssl should be 0 for non-secure cook‐
187           ies, or 1 for all (secure and normal) cookies.  The return value is
188           an array of names compatible with cookie_get().
189
190       cookie_set
191           Params: $jar, $name, $value, $domain, $path, $expire, $secure
192
193           Return: nothing
194
           Set the named cookie with the provided values into the $jar.  $name
196           is required to be a non-empty string.  $value is required, and will
197           delete the named cookie from the $jar if it is an empty string.
198           $domain and $path can be strings or undefined.  $expire is ignored
199           (but exists for backwards-compatibility).  $secure should be the
200           numeric value of 0 or 1.
201
202       crawl_new
203           Params: $START, $MAX_DEPTH, \%request_hash [, \%tracking_hash ]
204
205           Return: $crawl_object
206
           The crawl_new() function initializes a crawl object (hash) to the
208           default values, and then returns it for later use by crawl().
209           $START is the starting URL (in the form of
210           'http://www.host.com/url'), and MAX_DEPTH is the maximum number of
211           levels to crawl (the START URL counts as 1, so a value of 2 will
212           crawl the START URL and all URLs found on that page).  The
213           request_hash is a standard initialized request hash to be used for
214           requests; you should set any authentication information or headers
215           in this hash in order for the crawler to use them.  The optional
216           tracking_hash lets you supply a hash for use in tracking URL
217           results (otherwise crawl_new() will allocate a new anon hash).
218
219       crawl
220           Params: $crawl_object [, $START, $MAX_DEPTH ]
221
222           Return: $count [ undef on error ]
223
224           The heart of the crawl package.  Will perform an HTTP crawl on the
225           specified HOST, starting at START URI, proceeding up to MAX_DEPTH.
226
227           Crawl_object needs to be the variable returned by crawl_new().  You
228           can also indirectly call crawl() via the crawl_object itself:
229
230                   $crawl_object->{crawl}->($START,$MAX_DEPTH)
231
232           Returns the number of URLs actually crawled (not including those
233           skipped).
234
235       dump
236           Params: $name, \@array [, $name, \%hash, $name, \$scalar ]
237
238           Return: $code [ undef on error ]
239
240           The dump function will take the given $name and data reference, and
241           will create an ASCII perl code representation suitable for eval'ing
242           later to recreate the same structure.  $name is the name of the
243           variable that it will be saved as.  Example:
244
245            $output = LW2::dump('request',\%request);
246
247           NOTE: dump() creates anonymous structures under the name given.
248           For example, if you dump the hash %hin under the name 'hin', then
249           when you eval the dumped code you will need to use %$hin, since
250           $hin is now a *reference* to a hash.
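
           The reference behaviour described in the NOTE can be seen with
           the core Data::Dumper module, which is used here purely as a
           stand-in for LW2::dump().  The exact text LW2::dump() emits may
           differ; the point is only that the eval'ed code assigns an
           anonymous hash to a scalar.

```perl
use strict;
use warnings;
use Data::Dumper;

my %hin = (host => 'www.example.com', port => 80);

# Stand-in for LW2::dump('hin', \%hin): emit Perl code that assigns
# an anonymous hash to a scalar named $hin.
local $Data::Dumper::Sortkeys = 1;
my $code = Data::Dumper->Dump([\%hin], ['hin']);

# Evaluate the code later; $hin must be declared where eval can see it.
my $hin;
eval $code;
die "eval failed: $@" if $@;

# $hin is a reference, so dereference it to get the hash back.
my %restored = %$hin;
print "$_ => $restored{$_}\n" for sort keys %restored;
```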
251
252       dump_writefile
           Params: $file, $name, \@array [, $name, \%hash, $name, \$scalar ]
254
255           Return: 0 if success; 1 if error
256
257           This calls dump() and saves the output to the specified $file.
258
           Note: LW2 does no checking on the validity of the file name, its
           creation, or anything of the sort.  Files are opened in overwrite
           mode.
262
263       encode_base64
264           Params: $data [, $eol]
265
266           Return: $b64_encoded_data
267
268           This function does Base64 encoding.  If the binary MIME::Base64
269           module is available, it will use that; otherwise, it falls back to
270           an internal perl version.  The perl version carries the following
271           copyright:
272
273            Copyright 1995-1999 Gisle Aas <gisle@aas.no>
274
           NOTE: the $eol parameter will be inserted every 76 characters.
           This is used to format the data for output on an 80-character-wide
           terminal.
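
           The effect of $eol can be seen with MIME::Base64 itself, which
           this function uses when available.  The sketch calls
           MIME::Base64 directly and assumes LW2::encode_base64() wraps its
           output the same way.

```perl
use strict;
use warnings;
use MIME::Base64 ();

my $data = 'A' x 100;    # 100 bytes encode to 136 Base64 characters

# Second argument is the line terminator appended after each 76-char chunk.
my $wrapped = MIME::Base64::encode_base64($data, "\n");

# An empty $eol yields one unbroken string.
my $unwrapped = MIME::Base64::encode_base64($data, '');

my @lines = split /\n/, $wrapped;
my ($longest) = sort { $b <=> $a } map { length $_ } @lines;
printf "%d lines, longest is %d characters\n", scalar @lines, $longest;
```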
278
279       decode_base64
280           Params: $data
281
282           Return: $b64_decoded_data
283
284           A perl implementation of base64 decoding.  The perl code for this
285           function was actually taken from an older MIME::Base64 perl module,
286           and bears the following copyright:
287
288           Copyright 1995-1999 Gisle Aas <gisle@aas.no>
289
290       encode_uri_hex
291           Params: $data
292
293           Return: $result
294
295           This function encodes every character (except the / character) with
296           normal URL hex encoding.
297
298       encode_uri_randomhex
299           Params: $data
300
301           Return: $result
302
303           This function randomly encodes characters (except the / character)
304           with normal URL hex encoding.
305
306       encode_uri_randomcase
307           Params: $data
308
309           Return: $result
310
311           This function randomly changes the case of characters in the
312           string.
313
314       encode_unicode
315           Params: $data
316
317           Return: $result
318
319           This function converts a normal string into Windows unicode format
320           (non-overlong or anything fancy).
321
322       decode_unicode
323           Params: $unicode_string
324
325           Return: $decoded_string
326
           This function attempts to decode a unicode (UTF-8) string by
           converting it into a single-byte-character string.  Overlong
           characters are converted to their standard characters in place;
           non-overlong (aka multi-byte) characters are substituted with 0xff;
           invalid encoding characters are left as-is.
332
333           Note: this function is useful for dealing with the various unicode
334           exploits/vulnerabilities found in web servers; it is *not* good for
335           doing actual UTF-8 parsing, since characters over a single byte are
336           basically dropped/replaced with a placeholder.
337
338       encode_anti_ids
339           Params: \%request, $modes
340
341           Return: nothing
342
           encode_anti_ids computes the proper anti-IDS encoding/tricks
           specified by $modes, and sets up %request in order to use those
           tricks.  Valid modes are (the mode numbers are the same as those
           found in whisker 1.4):
347
348           1 Encode some of the characters via normal URL encoding
349           2 Insert directory self-references (/./)
350           3 Premature URL ending (make it appear the request line is done)
351           4 Prepend a long random string in the form of "/string/../URL"
352           5 Add a fake URL parameter
353           6 Use a tab instead of a space as a request spacer
354           7 Change the case of the URL (works against Windows and Novell)
           8 Change normal separators ('/') to Windows version ('\')
356           9 Session splicing [NOTE: not currently available]
357
358           You can set multiple modes by setting the string to contain all the
359           modes desired; i.e. $modes="146" will use modes 1, 4, and 6.
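
           Since $modes is just a string of digits, each digit acts as an
           independent flag.  The sketch below shows one way a caller could
           take such a string apart; it is not the library's own parsing
           code, and the %label table is a made-up summary of three of the
           modes listed above.

```perl
use strict;
use warnings;

my $modes = '146';

# Each digit is an independent mode flag.
my %active = map { $_ => 1 } grep { /^[1-9]$/ } split //, $modes;

my %label = (
    1 => 'URL-encode some characters',
    4 => 'prepend /string/../',
    6 => 'tab as request spacer',
);
for my $m (sort keys %active) {
    print "mode $m: ", ($label{$m} // 'see the mode list'), "\n";
}
```

           With the library loaded, the same string would be passed as
           LW2::encode_anti_ids(\%request, $modes).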
360
361       FORMS FUNCTIONS
           The goal is to parse the variable, human-readable HTML into
           concrete structures usable by your program.  The forms functions
           do a good job of making these structures, but I will admit: they
           are not exactly simple, and thus not a cinch to work with.  But
           then again, representing something as complex as an HTML form is
           not a simple thing either.  I think the results are acceptable for
           what's trying to be done.  Anyways...
369
370           Forms are stored in perl hashes, with elements in the following
371           format:
372
373            $form{'element_name'}=@([ 'type', 'value', @params ])
374
375           Thus every element in the hash is an array of anonymous arrays.
376           The first array value contains the element type (which is 'select',
377           'textarea', 'button', or an 'input' value of the form 'input-text',
378           'input-hidden', 'input-radio', etc).
379
380           The second value is the value, if applicable (it could be undef if
381           no value was specified).  Note that select elements will always
382           have an undef value--the actual values are in the subsequent
383           options elements.
384
385           The third value, if defined, is an anonymous array of additional
386           tag parameters found in the element (like 'onchange="blah"',
387           'size="20"', 'maxlength="40"', 'selected', etc).
388
           The hash does contain one special element, which is stored under
           a NULL character ("\0") key.  This element is of the format:
392
393            $form{"\0"}=['name', 'method', 'action', @parameters];
394
395           The element is an anonymous array that contains strings of the
396           form's name, method, and action (values can be undef), and a
397           @parameters array similar to that found in normal elements (above).
398
399           Accessing individual values stored in the form hash becomes a test
400           of your perl referencing skills.  Hint: to access the 'value' of
401           the third element named 'choices', you would need to do:
402
403            $form{'choices'}->[2]->[1];
404
405           The '[2]' is the third element (normal array starts with 0), and
406           the actual value is '[1]' (the type is '[0]', and the parameter
407           array is '[2]').
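
           To make the layout concrete, here is a hand-built hash in the
           format described above.  The form name, action and radio values
           are invented for illustration; a real hash would come from
           forms_read().

```perl
use strict;
use warnings;

my %form = (
    # Special entry: form name, method, action, then extra parameters.
    "\0" => [ 'search', 'POST', '/cgi-bin/search.cgi' ],

    # Three elements share the name 'choices'; each is [type, value, \@params].
    'choices' => [
        [ 'input-radio', 'red',   [ 'checked' ] ],
        [ 'input-radio', 'green', [] ],
        [ 'input-radio', 'blue',  [] ],
    ],
);

my $third_value = $form{'choices'}->[2]->[1];
my $third_type  = $form{'choices'}->[2]->[0];
my ($name, $method, $action) = @{ $form{"\0"} };
print "$method $action: third choice is $third_value ($third_type)\n";
```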
408
409       forms_read
410           Params: \$html_data
411
412           Return: \@found_forms
413
414           This function parses the given $html_data into libwhisker form
415           hashes.  It returns a reference to an array of hash references to
416           the found forms.
417
418       forms_write
419           Params: \%form_hash
420
421           Return: $html_of_form [undef on error]
422
423           This function will take the given %form hash and compose a generic
424           HTML representation of it, formatted with tabs and newlines in
425           order to make it neat and tidy for printing.
426
427           Note: this function does *not* escape any special characters that
428           were embedded in the element values.
429
430       html_find_tags
431           Params: \$data, \&callback_function [, $xml_flag, $funcref,
432           \%tag_map]
433
434           Return: nothing
435
436           html_find_tags parses a piece of HTML and 'extracts' all found
437           tags, passing the info to the given callback function.  The call‐
438           back function must accept two parameters: the current tag (as a
439           scalar), and a hash ref of all the tag's elements. For example, the
440           tag <a href="/file"> will pass 'a' as the current tag, and a hash
441           reference which contains {'href'=>"/file"}.
442
           The xml_flag, when set, causes the parser to do some extra
           processing and checks to accommodate XML style tags such as <tag
           foo="bar"/>.
446
447           The optional %tagmap is a hash of lowercase tag names.  If a tagmap
448           is supplied, then the parser will only call the callback function
449           if the tag name exists in the tagmap.
450
451           The optional $funcref variable is passed straight to the callback
452           function, allowing you to pass flags or references to more complex
453           structures to your callback function.
454
455       html_find_tags_rewrite
456           Params: $position, $length, $replacement
457
458           Return: nothing
459
460           html_find_tags_rewrite() is used to 'rewrite' an HTML stream from
461           within an html_find_tags() callback function.  In general, you can
462           think of html_find_tags_rewrite working as:
463
464           substr(DATA, $position, $length) = $replacement
465
466           Where DATA is the current HTML string the html parser is using.
467           The reason you need to use this function and not substr() is
468           because a few internal parser pointers and counters need to be
           adjusted to accommodate the changes.
470
471           If you want to remove a piece of the string, just set the replace‐
472           ment to an empty string ('').  If you wish to insert a string
473           instead of overwrite, just set $length to 0; your string will be
474           inserted at the indicated $position.
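
           The three cases (overwrite, remove, insert) behave like plain
           substr() on an ordinary string, as sketched below.  This only
           illustrates the position/length semantics; inside a real
           callback you would call html_find_tags_rewrite() instead, for
           the reason given above.

```perl
use strict;
use warnings;

my $data = '<a href="/old">link</a>';

# Overwrite: replace the 4 characters at offset 9 ("/old") with "/new".
substr($data, 9, 4) = '/new';

# Remove: an empty replacement deletes the span ("link").
substr($data, 15, 4) = '';

# Insert: a length of 0 inserts at the position without overwriting.
substr($data, 15, 0) = 'home';

print "$data\n";
```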
475
476       html_link_extractor
477           Params: \$html_data
478
479           Return: @urls
480
481           The html_link_extractor() function uses the internal crawl tests to
482           extract all the HTML links from the given HTML data stream.
483
           Note: html_link_extractor() does not deduplicate the returned array of
485           discovered links, nor does it attempt to remove javascript links or
486           make the links absolute.  It just extracts every raw link from the
487           HTML stream and returns it.  You'll have to do your own post-pro‐
488           cessing.
489
490       http_new_request
491           Params: %parameters
492
493           Return: \%request_hash
494
495           This function basically 'objectifies' the creation of whisker
496           request hash objects.  You would call it like:
497
498            $req = http_new_request( host=>'www.example.com', uri=>'/' )
499
500           where 'host' and 'uri' can be any number of {whisker} hash control
501           values (see http_init_request for default list).
502
503       http_new_response
504           Params: [none]
505
506           Return: \%response_hash
507
508           This function basically 'objectifies' the creation of whisker
509           response hash objects.  You would call it like:
510
511                   $resp = http_new_response()
512
513       http_init_request
514           Params: \%request_hash_to_initialize
515
516           Return: Nothing (modifies input hash)
517
518           Sets default values to the input hash for use.  Sets the host to
519           'localhost', port 80, request URI '/', using HTTP 1.1 with GET
520           method.  The timeout is set to 10 seconds, no proxies are defined,
521           and all URI formatting is set to standard HTTP syntax.  It also
522           sets the Connection (Keep-Alive) and User-Agent headers.
523
524           NOTICE!!  It's important to use http_init_request before calling
525           http_do_request, or http_do_request might puke.  Thus, a special
526           magic value is placed in the hash to let http_do_request know that
527           the hash has been properly initialized.  If you really must 'roll
528           your own' and not use http_init_request before you call
529           http_do_request, you will at least need to set the MAGIC value
530           (amongst other things).
531
532       http_do_request
533           Params: \%request, \%response [, \%configs]
534
535           Return: >=1 if error; 0 if no error (also modifies response hash)
536
537           *THE* core function of libwhisker.  http_do_request actually per‐
538           forms the HTTP request, using the values submitted in %request, and
539           placing result values in %response.  This allows you to resubmit
540           %request in subsequent requests (%response is automatically cleared
541           upon execution).  You can submit 'runtime' config directives as
542           %configs, which will be spliced into $hin{whisker}->{} before any‐
543           thing else.  That means you can do:
544
545           LW2::http_do_request(\%req,\%resp,{'uri'=>'/cgi-bin/'});
546
547           This will set $req{whisker}->{'uri'}='/cgi-bin/' before execution,
548           and provides a simple shortcut (note: it does modify %req).
549
550           This function will also retry any requests that bomb out during the
551           transaction (but not during the connecting phase).  This is con‐
552           trolled by the {whisker}->{retry} value.  Also note that the
553           returned error message in hout is the *last* error received.  All
554           retry errors are put into {whisker}->{retry_errors}, which is an
555           anonymous array.
556
557           Also note that all NTLM auth logic is implemented in
558           http_do_request().  NTLM requires multiple requests in order to
559           work correctly, and so this function attempts to wrap that and make
560           it all transparent, so that the final end result is what's passed
561           to the application.
562
563           This function will return 0 on success, 1 on HTTP protocol error,
564           and 2 on non-recoverable network connection error (you can retry
565           error 1, but error 2 means that the server is totally unreachable
566           and there's no point in retrying).
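
           A caller might act on the three return values along these lines.
           This is a sketch only: fetch() is a made-up helper, the host and
           URI come from the command line, and reading the status code from
           {whisker}->{code} is an assumption not stated in this document.

```perl
use strict;
use warnings;

# Hypothetical helper: one GET request with return-code handling.
sub fetch {
    my ($host, $uri) = @_;
    require LW2;    # loaded lazily so this file compiles without LW2

    my $req = LW2::http_new_request(host => $host, uri => $uri);
    LW2::http_fixup_request($req);

    my %resp;
    my $rc = LW2::http_do_request($req, \%resp);

    if ($rc == 0) {
        # Assumption: status code lives in {whisker}->{code}.
        return "OK: HTTP $resp{whisker}->{code}";
    }
    my $err = $resp{whisker}->{error} // 'unknown error';
    return $rc == 1
        ? "protocol error (may be worth retrying): $err"
        : "connection error (retrying is pointless): $err";
}

# Only runs when a host and URI are given on the command line.
print fetch(@ARGV[0, 1]), "\n" if @ARGV == 2;
```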
567
568       http_req2line
569           Params: \%request, $uri_only_switch
570
571           Return: $request
572
           http_req2line is used internally by http_do_request, and also
           provides a convenient way to turn a %request configuration into an
           actual HTTP request line.  If $uri_only_switch is set to 1, then
           the returned $request will be the URI only
           ('/requested/page.html'), versus the entire HTTP request ('GET
           /requested/page.html HTTP/1.0\n\n').  Also, if the
           'full_request_override' whisker config variable is set in
           %request, then it will be returned instead of the constructed URI.
580
581       http_resp2line
582           Params: \%response
583
584           Return: $response
585
           http_resp2line provides a convenient way to turn a %response hash
587           back into the original HTTP response line.
588
589       http_fixup_request
590           Params: $hash_ref
591
592           Return: Nothing
593
594           This function takes a %hin hash reference and makes sure the proper
595           headers exist (for example, it will add the Host: header, calculate
596           the Content-Length: header for POST requests, etc).  For standard
597           requests (i.e. you want the request to be HTTP RFC-compliant), you
598           should call this function right before you call http_do_request.
599
600       http_reset
601           Params: Nothing
602
603           Return: Nothing
604
605           The http_reset function will walk through the %http_host_cache,
606           closing all open sockets and freeing SSL resources.  It also clears
607           out the host cache in case you need to rerun everything fresh.
608
609           Note: if you just want to close a single connection, and you have a
610           copy of the %request hash you used, you should use the http_close()
611           function instead.
612
613       ssl_is_available
614           Params: Nothing
615
616           Return: $boolean [, $lib_name, $version]
617
           The ssl_is_available() function will inform you whether SSL
           requests are allowed, which is dependent on whether the appropriate
           SSL libraries are installed on the machine.  In scalar context, the
           function will return 1 or 0.  In array context, the second element
           will be the SSL library name that is currently being used by LW2,
           and the third element will be the SSL library version number.
           Elements two and three (name and version) will be undefined if
           called in array context and no SSL libraries are available.
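
           The two calling contexts can be exercised as in the sketch
           below, which skips itself when LW2 is not installed.  $checked
           is just a marker variable for this example.

```perl
use strict;
use warnings;

my $checked = 0;

if (eval { require LW2; 1 }) {
    # Scalar context: just the boolean.
    my $has_ssl = LW2::ssl_is_available();

    # List context: boolean, library name, library version.
    my ($ok, $lib, $version) = LW2::ssl_is_available();

    print $ok
        ? "SSL available via $lib $version\n"
        : "SSL not available\n";
    $checked = 1;
}
else {
    print "LW2 not installed; skipping SSL check\n";
}
```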
626
627       http_read_headers
628           Params: $stream, \%in, \%out
629
630           Return: $result_code, $encoding, $length, $connection
631
           Read HTTP headers from the given stream, storing the results in
           %out.  On success, $result_code will be 1 and $encoding, $length,
           and $connection will hold the values of the Transfer-Encoding,
           Content-Length, and Connection headers, respectively.  If any of
           those headers is not present, the corresponding value will be
           undef.  On an error, $result_code will be 0 and $encoding will
           contain an error message.
639
640           This function can be used to parse both request and response head‐
641           ers.
642
643           Note: if there are multiple Transfer-Encoding, Content-Length, or
644           Connection headers, then only the last header value is the one
645           returned by the function.
646
647       http_read_body
648           Params: $stream, \%in, \%out, $encoding, $length
649
           Return: 1 on success, 0 on error (and sets
           $out->{whisker}->{error})
652
653           Read the body from the given stream, placing it in
654           $out->{whisker}->{data}.  Handles chunked encoding.  Can be used to
655           read HTTP (POST) request or HTTP response bodies.  $encoding param‐
656           eter should be lowercase encoding type.
657
658           NOTE: $out->{whisker}->{data} is erased/cleared when this function
659           is called, leaving {data} to just contain this particular HTTP
660           body.
661
662       http_construct_headers
663           Params: \%in
664
665           Return: $data
666
667           This function assembles the headers in the given hash into a data
668           string.
669
670       http_close
671           Params: \%request
672
673           Return: nothing
674
675           This function will close any open streams for the given request.
676
677           Note: in order for http_close() to find the right connection, all
678           original host/proxy/port parameters in %request must be the exact
679           same as when the original request was made.
680
681       http_do_request_timeout
682           Params: \%request, \%response, $timeout
683
684           Return: $result
685
686           This function is identical to http_do_request(), except that it
687           wraps the entire request in a timeout wrapper.  $timeout is the
688           number of seconds to allow for the entire request to be completed.
689
690           Note: this function uses alarm() and signals, and thus will only
691           work on Unix-ish platforms.  It should be safe to call on any plat‐
692           form though.
693
       md5
           Params: $data

           Return: $hex_md5_string

           This function takes a data scalar, computes an MD5 hash of it, and
           returns it as a hex ASCII string.  It will use the fastest MD5
           function available.
701
       md4
           Params: $data

           Return: $hex_md4_string

           This function takes a data scalar, computes an MD4 hash of it, and
           returns it as a hex ASCII string.  It will use the fastest MD4
           function available.
709
710       multipart_set
711           Params: \%multi_hash, $param_name, $param_value
712
713           Return: nothing
714
715           This function sets the named parameter to the given value within
716           the supplied multipart hash.
717
718       multipart_get
719           Params: \%multi_hash, $param_name
720
721           Return: $param_value, undef on error
722
723           This function retrieves the value of the named parameter from the
724           supplied multipart hash.  There is a special case where the named
725           parameter is actually a file, in which case the resulting value
726           will be "\0FILE".  In general, all special values are prefixed
727           with a NUL character.  In order to get a file's info, use
728           multipart_getfile().
729
730       multipart_setfile
731           Params: \%multi_hash, $param_name, $file_path [, $filename]
732
733           Return: undef on error, 1 on success
734
735           NOTE: this function does not actually add the contents of
736           $file_path into the %multi_hash; instead, multipart_write() inserts
737           the content when generating the final request.
738
739       multipart_getfile
740           Params: \%multi_hash, $file_param_name
741
742           Return: $path, $name ($path=undef on error)
743
744           multipart_getfile is used to retrieve information for a file param‐
745           eter contained in %multi_hash.  To use this you would most likely
746           do:
747
748            ($path,$fname)=LW2::multipart_getfile(\%multi,"param_name");
749
750       multipart_boundary
751           Params: \%multi_hash [, $new_boundary_name]
752
753           Return: $current_boundary_name
754
755           multipart_boundary is used to retrieve, and optionally set, the
756           multipart boundary used for the request.
757
758           NOTE: the function does no checking on the supplied boundary, so if
759           you want things to work make sure it's a legit boundary.  Lib‐
760           whisker does *not* prefix it with any '---' characters.
761
762       multipart_write
763           Params: \%multi_hash, \%request
764
765           Return: 1 if successful, undef on error
766
767           multipart_write is used to parse and construct the multipart data
768           contained in %multi_hash, and place it ready to go in the given
769           whisker hash (%request) structure, to be sent to the server.
770
771           NOTE: file contents are read into the final %request, so it's pos‐
772           sible for the hash to get *very* large if you have (a) large
773           file(s).
774
775       multipart_read
776           Params: \%multi_hash, \%hout_response [, $filepath ]
777
778           Return: 1 if successful, undef on error
779
780           multipart_read will parse the data contents of the supplied
781           %hout_response hash, by passing the appropriate info to multi‐
782           part_read_data().  Please see multipart_read_data() for more info
783           on parameters and behaviour.
784
785           NOTE: this function will return an error if the given
786           %hout_response Content-Type is not set to "multipart/form-data".
787
788       multipart_read_data
789           Params: \%multi_hash, \$data, $boundary [, $filepath ]
790
791           Return: 1 if successful, undef on error
792
793           multipart_read_data parses the contents of the supplied data using
794           the given boundary and puts the values in the supplied %multi_hash.
795           Embedded files will *not* be saved unless a $filepath is given,
796           which should be a directory suitable for writing out temporary
797           files.
798
799           NOTE: currently application/octet-stream is the only supported
800           file encoding.  All other file encodings will not be parsed/saved.
801
802       multipart_files_list
803           Params: \%multi_hash
804
805           Return: @files
806
807           multipart_files_list returns an array of parameter names for all
808           the files that are contained in %multi_hash.
809
810       multipart_params_list
811           Params: \%multi_hash
812
813           Return: @params
814
815           multipart_params_list returns an array of parameter names for all
816           the regular parameters (non-file) that are contained in
817           %multi_hash.
818
819       ntlm_new
820           Params: $username, $password [, $domain, $ntlm_only]
821
822           Return: $ntlm_object
823
824           Returns a reference to an array (otherwise known as the 'ntlm
825           object') which contains the various pieces of information specific to a
826           user/pass combo.  If $ntlm_only is set to 1, then only the NTLM
827           hash (and not the LanMan hash) will be generated.  This results in
828           a speed boost, and is typically fine for using against IIS servers.
829
830           The array contains the following items, in order: username, pass‐
831           word, domain, lmhash(password), ntlmhash(password)
832
833       ntlm_decode_challenge
834           Params: $challenge
835
836           Return: @challenge_parts
837
838           Splits the supplied challenge into the various parts.  The returned
839           array contains elements in the following order:
840
841           unicode_domain, ident, packet_type, domain_len, domain_maxlen,
842           domain_offset, flags, challenge_token, reserved, empty, raw_data
843
844       ntlm_client
845           Params: $ntlm_obj [, $server_challenge]
846
847           Return: $response
848
849           ntlm_client() is responsible for generating the base64-encoded text
850           you include in the HTTP Authorization header.  If you call
851           ntlm_client() without a $server_challenge, the function will return
852           the initial NTLM request packet (message packet #1).  You send this
853           to the server, and take the server's response (message packet #2)
854           and pass that as $server_challenge, causing ntlm_client() to gener‐
855           ate the final response packet (message packet #3).
856
857           Note: $server_challenge is expected to be base64 encoded.
858
859       get_page
860           Params: $url [, \%request]
861
862           Return: $code, $data ($code will be set to undef on error, $data
863           will contain the error message)
864
865           This function will fetch the page at the given URL, and return the
866           HTTP response code and page contents.  Use this in the form of:
867           ($code,$html)=LW2::get_page("http://host.com/page.html")
868
869           The optional %request will be used if supplied.  This allows you to
870           set headers and other parameters.
871
872       get_page_hash
873           Params: $url [, \%request]
874
875           Return: $hash_ref (undef on no URL)
876
877           This function will fetch the page at the given URL, and return the
878           whisker HTTP response hash.  The return code of the function is set
879           to $hash_ref->{whisker}->{get_page_hash}, and uses the
880           http_do_request() return values.
881
882           Note: undef is returned if no URL is supplied
883
884       get_page_to_file
885           Params: $url, $filepath [, \%request]
886
887           Return: $code ($code will be set to undef on error)
888
889           This function will fetch the page at the given URL, place the
890           resulting HTML in the file specified, and return the HTTP response
891           code.  The optional %request hash sets the default parameters to be
892           used in the request.
893
894           NOTE: libwhisker does not do any file checking; libwhisker will
895           open the supplied filepath for writing, overwriting any previously-
896           existing files.  Libwhisker does not differentiate between a bad
897           request, and a bad file open.  If you're having troubles making
898           this function work, make sure that your $filepath is legal and
899           valid, and that you have appropriate write permissions to cre‐
900           ate/overwrite that file.
901
902       time_mktime
903           Params: $seconds, $minutes, $hours, $day_of_month, $month,
904           $year_minus_1900
905
906           Return: $seconds [ -1 on error ]
907
908           Performs a general mktime calculation with the given time
909           components.  Note that the input parameter values are expected to be
910           in the format output by localtime/gmtime.  Namely, $seconds is 0-60
911           (yes, there can be a leap second value of 60 occasionally),
912           $minutes is 0-59, $hours is 0-23, $day_of_month is 1-31, $month is
913           0-11, and $year_minus_1900 is 70-137.  This function is limited in
914           that it will not process dates prior to 1970 or after 2037 (that
915           way 32-bit time_t overflow calculations aren't required).
916
917           Additional parameters passed to the function are ignored, so it is
918           safe to use the full localtime/gmtime output, such as:
919
920                   $seconds = LW2::time_mktime( localtime( time ) );
921
922           Note: this function does not adjust for time zone, daylight savings
923           time, etc.  You must do that yourself.
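
           As a rough illustration of the calculation described above, the
           following sketch rebuilds an epoch value from gmtime-style
           components.  mktime_sketch() is a hypothetical name and this is
           not libwhisker's implementation; it assumes the 1970 to 2037
           limit stated above, within which every fourth year is a leap
           year.

```perl
use strict;
use warnings;

# Hypothetical re-creation of the documented behaviour, not LW2's own code.
sub mktime_sketch {
    my ( $sec, $min, $hour, $mday, $mon, $year ) = @_;    # extra args ignored

    for my $v ( $sec, $min, $hour, $mday, $mon, $year ) {
        return -1 if !defined($v) || $v < 0;
    }
    return -1
      if $sec > 60 || $min > 59 || $hour > 23
      || $mday < 1 || $mday > 31 || $mon > 11
      || $year < 70 || $year > 137;    # 1970..2037, the limit stated above

    # days elapsed before the first of each month in a non-leap year
    my @days_before = ( 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 );

    my $y    = $year + 1900;
    my $days = ( $y - 1970 ) * 365;
    # leap days in 1970 .. $y-1; within 1970-2037 every 4th year is a leap year
    $days += int( ( $y - 1969 ) / 4 );
    $days += $days_before[$mon] + $mday - 1;
    $days++ if $mon > 1 && $y % 4 == 0;    # after February in a leap year

    return ( ( $days * 24 + $hour ) * 60 + $min ) * 60 + $sec;
}

my $t = 1_000_000_000;                      # 2001-09-09 01:46:40 UTC
print mktime_sketch( gmtime($t) ), "\n";    # round-trips to the same value
```

           Like the documented function, the sketch applies no time zone or
           daylight savings adjustment, so it only round-trips cleanly with
           gmtime() output.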
924
925       time_gmtolocal
926           Params: $seconds_gmt
927
928           Return: $seconds_local_timezone
929
930           Takes a seconds value in UTC/GMT time and adjusts it to reflect the
931           current timezone.  This function is slightly expensive; it takes
932           the gmtime() and localtime() representations of the current time,
933           calculates the delta difference by turning them back into seconds
934           via time_mktime, and then applies this delta difference to $sec‐
935           onds_gmt.
936
937           Note that if you give this function a time and subtract the return
938           value from the original time, you will get the delta value.  At
939           that point, you can just apply the delta directly and skip calling
940           this function, which is a massive performance boost.  However, this
941           will cause problems if you have a long running program which
942           crosses daylight savings time boundaries, as the DST adjustment
943           will not be accounted for unless you recalculate the new delta.
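
           The "compute the delta once, then reuse it" approach from the
           paragraph above might look like the following.  This sketch uses
           the core Time::Local module instead of libwhisker's functions,
           and local_delta() is a hypothetical helper name.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper, not LW2 code: seconds to add to a UTC time to get
# the local wall-clock time, as of $now.
sub local_delta {
    my ($now) = @_;
    # Re-reading the local broken-down time as if it were UTC yields a value
    # that differs from $now by exactly the current zone offset.
    return timegm( ( localtime($now) )[ 0 .. 5 ] ) - $now;
}

my $delta        = local_delta(time);                   # compute once ...
my @stamps_gmt   = ( 1_000_000_000, 1_000_003_600 );
my @stamps_local = map { $_ + $delta } @stamps_gmt;     # ... reuse many times
print "offset: $delta seconds\n";
```

           A long-running program would need to call local_delta() again
           after a daylight savings change, for the reason given above.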
944
945       uri_split
946           Params: $uri_string [, \%request_hash]
947
948           Return: @uri_parts
949
950           Return an array of the following values, in order:  uri, protocol,
951           host, port, params, frag, user, password.  Values not defined are
952           given an undef value.  If a %request hash is passed in, then
953           uri_split() will also set the appropriate values in the hash.
954
955           Note:  uri_split() will only set the %request hash if the protocol
956           is HTTP or HTTPS!
957
958       uri_join
959           Params: @vals
960
961           Return: $url
962
963           Takes the @vals array output from uri_split(), and returns a
964           single scalar/string with them joined again, in the form of:
965           protocol://user:pass@host:port/uri?params#frag
966
967       uri_absolute
968           Params: $uri, $base_uri [, $normalize_flag ]
969
970           Return: $absolute_uri
971
972           Double checks that the given $uri is in absolute form (that is,
973           "http://host/file"), and if not (it's in the form "/file"), then it
974           will combine it with the given $base_uri to make it absolute.  This
975           provides compatibility similar to that found in the URI module.
976
977           If $normalize_flag is set to 1, then the output will be passed
978           through uri_normalize() before being returned.
979
980       uri_normalize
981           Params: $uri [, $fix_windows_slashes ]
982
983           Return: $normalized_uri [ undef on error ]
984
985           Takes the given $uri and does any /./ and /../ dereferencing in
986           order to come up with the correct absolute URL.  If the
987           $fix_windows_slashes parameter is set to 1, all \ (back slashes)
988           will be converted to / (forward slashes).
989
990           Non-http/https URIs return an error.
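
           The /./ and /../ dereferencing can be pictured with the sketch
           below.  It is an assumption-laden illustration rather than
           libwhisker's code: normalize_path_sketch() is a hypothetical
           helper that handles only the path portion of a URI and expects
           it to start with a slash.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code: works on the path portion only.
sub normalize_path_sketch {
    my ( $path, $fix_windows_slashes ) = @_;
    $path =~ tr{\\}{/} if $fix_windows_slashes;

    my @segments = split m{/}, $path, -1;    # -1 keeps trailing empty fields
    my @out;
    for my $i ( 0 .. $#segments ) {
        my $seg  = $segments[$i];
        my $last = ( $i == $#segments );
        if ( $seg eq '.' ) {
            push @out, '' if $last;          # "/a/." becomes "/a/"
        }
        elsif ( $seg eq '..' ) {
            pop @out if @out > 1;            # never climb above the root
            push @out, '' if $last;          # "/a/b/.." becomes "/a/"
        }
        else {
            push @out, $seg;
        }
    }
    return join '/', @out;
}

print normalize_path_sketch('/a/./b/../c.html'), "\n";    # prints /a/c.html
```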
991
992       uri_get_dir
993           Params: $uri
994
995           Return: $uri_directory
996
997           Will take a URI and return the directory base of it, e.g.
998           /rfp/page.php will return /rfp/.
999
1000       uri_strip_path_parameters
1001           Params: $uri [, \%param_hash]
1002
1003           Return: $stripped_uri
1004
1005           This function removes all URI path parameters of the form
1006
1007            /blah1;foo=bar/blah2;baz
1008
1009           and returns the stripped URI ('/blah1/blah2').  If the optional
1010           parameter hash reference is provided, the stripped parameters are
1011           saved in the form of 'blah1'=>'foo=bar', 'blah2'=>'baz'.
1012
1013           Note: only the last value of a duplicate name is saved into the
1014           param_hash, if provided.  So a $uri of '/foo;A/foo;B/' will result
1015           in a single hash entry of 'foo'=>'B'.
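
           The stripping behaviour, including the "last duplicate wins"
           rule from the note, can be sketched as follows.
           strip_path_params_sketch() is a hypothetical name; the sketch
           only handles a bare path (no query string) and is not
           libwhisker's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code; handles the path only (no query string).
sub strip_path_params_sketch {
    my ( $uri, $params ) = @_;
    my @out;
    for my $segment ( split m{/}, $uri, -1 ) {
        # split "name;parameters" at the first semicolon
        my ( $name, $param ) = $segment =~ /^([^;]*)(?:;(.*))?$/s;
        push @out, $name;
        # a later segment with the same name overwrites an earlier one
        $params->{$name} = $param if ref($params) && defined($param);
    }
    return join '/', @out;
}

my %saved;
print strip_path_params_sketch( '/blah1;foo=bar/blah2;baz', \%saved ), "\n";
print "$_ => $saved{$_}\n" for sort keys %saved;
```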
1016
1017       uri_parse_parameters
1018           Params: $parameter_string [, $decode, $multi_flag ]
1019
1020           Return: \%parameter_hash
1021
1022           This function takes a string in the form of:
1023
1024            foo=1&bar=2&baz=3&foo=4
1025
1026           And parses it into a hash.  In the above example, the element 'foo'
1027           has two values (1 and 4).  If $multi_flag is set to 1, then the
1028           'foo' hash entry will hold an anonymous array of both values.  Oth‐
1029           erwise, the default is to just contain the last value (in this
1030           case, '4').
1031
1032           If $decode is set to 1, then normal hex decoding is done on the
1033           characters, where needed (both the name and value are decoded).
1034
1035           Note: if a URL parameter name appears without a value, then the
1036           value will be set to undef.  E.g. for the string "foo=1&bar&baz=2",
1037           the 'bar' hash element will have an undef value.
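
           The effect of $decode and $multi_flag can be illustrated with the
           sketch below.  parse_params_sketch() is a hypothetical stand-in,
           not libwhisker's code; in particular, it is an assumption of the
           sketch that names seen only once stay plain scalars when
           $multi_flag is set.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code.
sub parse_params_sketch {
    my ( $string, $decode, $multi_flag ) = @_;
    my %out;
    for my $pair ( split /&/, $string ) {
        next if $pair eq '';
        my ( $name, $value ) = split /=/, $pair, 2;    # "bar" leaves $value undef
        if ($decode) {
            for my $part ( $name, $value ) {
                next unless defined $part;
                $part =~ s/%([0-9A-Fa-f]{2})/chr( hex($1) )/ge;
            }
        }
        if ( $multi_flag && exists $out{$name} ) {
            # assumption: promote to an array only when a name repeats
            $out{$name} = [ $out{$name} ] unless ref $out{$name};
            push @{ $out{$name} }, $value;
        }
        else {
            $out{$name} = $value;    # without $multi_flag the last value wins
        }
    }
    return \%out;
}

my $multi = parse_params_sketch( 'foo=1&bar=2&baz=3&foo=4', 0, 1 );
print "foo: @{ $multi->{foo} }\n";    # prints "foo: 1 4"
```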
1038
1039       uri_escape
1040           Params: $data
1041
1042           Return: $encoded_data
1043
1044           This function encodes the given $data so it is safe to be used in
1045           URIs.
1046
1047       uri_unescape
1048           Params: $encoded_data
1049
1050           Return: $data
1051
1052           This function decodes the given $data out of URI format.
1053
1054       utils_recperm
1055           Params: $uri, $depth, \@dir_parts, \@valid, \&func, \%track,
1056           \%arrays, \&cfunc
1057
1058           Return: nothing
1059
1060           This is a special function which is used to recursively permute
1061           through a given directory listing.  This is really only used by
1062           whisker, in order to traverse down directories, testing them as it
1063           goes.  See whisker 2.0 for exact usage examples.
1064
1065       utils_array_shuffle
1066           Params: \@array
1067
1068           Return: nothing
1069
1070           This function will randomize the order of the elements in the given
1071           array.
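
           One common way to randomize an array in place is the
           Fisher-Yates shuffle, sketched below.  Whether libwhisker uses
           this exact algorithm is not stated here; shuffle_in_place() is a
           hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code: Fisher-Yates shuffle of @$array.
sub shuffle_in_place {
    my ($array) = @_;
    for ( my $i = $#$array ; $i > 0 ; $i-- ) {
        my $j = int( rand( $i + 1 ) );             # 0 <= $j <= $i
        @$array[ $i, $j ] = @$array[ $j, $i ];     # swap the two elements
    }
    return;                                        # nothing, like the original
}

my @list = ( 1 .. 10 );
shuffle_in_place( \@list );
print "@list\n";
```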
1072
1073       utils_randstr
1074           Params: [ $size, $chars ]
1075
1076           Return: $random_string
1077
1078           This function generates a random string between 10 and 20 charac‐
1079           ters long, or of $size if specified.  If $chars is specified, then
1080           the random function picks characters from the supplied string.  For
1081           example, to have a random string of 10 characters, composed of only
1082           the characters 'abcdef', then you would run:
1083
1084            utils_randstr(10,'abcdef');
1085
1086           The default character string is alphanumeric.
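
           A minimal sketch of the documented behaviour, assuming the
           default alphabet is A-Z, a-z and 0-9.  randstr_sketch() is a
           hypothetical name, not libwhisker's code.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code.
sub randstr_sketch {
    my ( $size, $chars ) = @_;
    $size = 10 + int( rand(11) ) unless defined $size;    # 10 .. 20
    $chars = join( '', 'A' .. 'Z', 'a' .. 'z', '0' .. '9' )
      unless defined($chars) && length($chars);
    return join '',
      map { substr( $chars, int( rand( length($chars) ) ), 1 ) } 1 .. $size;
}

print randstr_sketch( 10, 'abcdef' ), "\n";
print randstr_sketch(), "\n";
```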
1087
1088       utils_port_open
1089           Params: $host, $port
1090
1091           Return: $result
1092
1093           Quick function to attempt to make a connection to the given host
1094           and port.  If a connection was successfully made, function will
1095           return true (1).  Otherwise it returns false (0).
1096
1097           Note: this uses standard TCP connections, thus is not recommended
1098           for use in port-scanning type applications.  Extremely slow.
1099
1100       utils_lowercase_keys
1101           Params: \%hash
1102
1103           Return: $number_changed
1104
1105           Will lowercase all the header names (but not values) of the given
1106           hash.
1107
1108       utils_find_lowercase_key
1109           Params: \%hash, $key
1110
1111           Return: $value, undef on error or not exist
1112
1113           Searches the given hash for the $key (regardless of case), and
1114           returns the value.  If the return value is placed into an array, the
1115           function will dereference any multi-value references and return an
1116           array of all values.
1117
1118           WARNING!  In scalar context, $value can either be a single-value
1119           scalar or an array reference for multiple scalar values.  That
1120           means you either need to check the return value and act appropri‐
1121           ately, or use an array context (even if you only want a single
1122           value).  This is very important, even if you know there are no
1123           multi-value hash keys.  This function may still return an array of
1124           multiple values even if all hash keys are single value, since low‐
1125           ercasing the keys could result in multiple keys matching.  For
1126           example, a hash with the values { 'Foo'=>'a', 'fOo'=>'b' } techni‐
1127           cally has two keys with the lowercase name 'foo', and so this func‐
1128           tion will either return an array or array reference with both 'a'
1129           and 'b'.
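
           The context-dependent return value in the warning above can be
           sketched with wantarray.  find_key_ci_sketch() is a hypothetical
           re-creation for illustration, not libwhisker's code; the sorted
           key order is an assumption made so the output is predictable.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code.
sub find_key_ci_sketch {
    my ( $hash, $key ) = @_;
    return unless ref($hash) eq 'HASH' && defined $key;

    my @found;
    for my $k ( sort keys %$hash ) {
        next unless lc($k) eq lc($key);
        my $v = $hash->{$k};
        push @found, ( ref($v) eq 'ARRAY' ? @$v : $v );    # flatten multi-values
    }
    return unless @found;
    return @found if wantarray;                  # list context: all values
    return @found == 1 ? $found[0] : \@found;    # scalar: value or array ref
}

my %headers = ( 'Foo' => 'a', 'fOo' => 'b', 'Bar' => 'c' );
my @all     = find_key_ci_sketch( \%headers, 'foo' );    # ('a', 'b')
my $one     = find_key_ci_sketch( \%headers, 'foo' );    # array reference
print "list: @all, scalar is a ", ( ref($one) || 'plain value' ), "\n";
```

           Calling in list context, as recommended above, avoids having to
           test the result with ref().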
1130
1131       utils_find_key
1132           Params: \%hash, $key
1133
1134           Return: $value, undef on error or not exist
1135
1136           Searches the given hash for the $key (case-sensitive), and returns
1137           the value.  If the return value is placed into an array, the
1138           function will dereference any multi-value references and return an
1139           array of all values.
1140
1141       utils_delete_lowercase_key
1142           Params: \%hash, $key
1143
1144           Return: $number_found
1145
1146           Searches the given hash for the $key (regardless of case), and
1147           deletes the key out of the hash if found.  The function returns the
1148           number of keys found and deleted (since multiple keys can exist
1149           under the names 'Key', 'key', 'keY', 'KEY', etc.).
1150
1151       utils_getline
1152           Params: \$data [, $resetpos ]
1153
1154           Return: $line (undef if no more data)
1155
1156           Fetches the next \n terminated line from the given data.  Use the
1157           optional $resetpos to reset the internal position pointer.  Does
1158           *NOT* return trailing \n.
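
           The "internal position pointer" idea can be sketched with a
           closure, as below.  getline_sketch() is a hypothetical name and
           not libwhisker's code; returning a final line that lacks a
           terminating \n is an assumption of the sketch.

```perl
use strict;
use warnings;

{
    my $pos = 0;    # internal position pointer shared between calls

    # Hypothetical helper, not LW2 code.
    sub getline_sketch {
        my ( $dataref, $resetpos ) = @_;
        $pos = 0 if $resetpos;
        return if $pos >= length($$dataref);    # no more data

        my $nl   = index( $$dataref, "\n", $pos );
        my $end  = $nl < 0 ? length($$dataref) : $nl;
        my $line = substr( $$dataref, $pos, $end - $pos );
        $pos = $end + 1;                        # step over the "\n"
        return $line;
    }
}

my $data = "first\nsecond\n\nlast";
my $line = getline_sketch( \$data, 1 );         # reset, then read line one
while ( defined $line ) {
    print "[$line]\n";
    $line = getline_sketch( \$data );
}
```

           Since the pointer lives outside the data, switching to a
           different buffer requires passing $resetpos on the first call.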
1159
1160       utils_getline_crlf
1161           Params: \$data [, $resetpos ]
1162
1163           Return: $line (undef if no more data)
1164
1165           Fetches the next \r\n terminated line from the given data.  Use the
1166           optional $resetpos to reset the internal position pointer.  Does
1167           *NOT* return trailing \r\n.
1168
1169       utils_save_page
1170           Params: $file, \%response
1171
1172           Return: 0 on success, 1 on error
1173
1174           Saves the data portion of the given whisker %response hash to the
1175           indicated file.  Can technically save the data portion of a
1176           %request hash too.  A file is not written if there is no data.
1177
1178           Note: libwhisker does not do any special file checking; files are
1179           opened in overwrite mode.
1180
1181       utils_getopts
1182           Params: $opt_str, \%opt_results
1183
1184           Return: 0 on success, 1 on error
1185
1186           This function is a general implementation of Getopt::Std.  It will
1187           parse @ARGV, looking for the options specified in $opt_str, and
1188           will put the results in %opt_results.  Behavior/parameter values
1189           are similar to Getopt::Std's getopts().
1190
1191           Note: this function does *not* support long options (--option),
1192           option grouping (-opq), or options with immediate values (-ovalue).
1193           If an option is indicated as having a value, it will take the next
1194           argument regardless.
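
           The restrictions in the note can be seen in the following
           sketch.  getopts_sketch() is a hypothetical re-creation, not
           libwhisker's code; it takes the argument list as an extra
           parameter (instead of reading @ARGV) so that it can be exercised
           directly.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code.  LW2 reads @ARGV; this sketch takes the
# argument list as a third parameter so it can be exercised directly.
sub getopts_sketch {
    my ( $opt_str, $results, $argv ) = @_;

    my %takes_value;    # option letter => 1 if followed by ':' in $opt_str
    while ( $opt_str =~ /([A-Za-z0-9])(:?)/g ) {
        $takes_value{$1} = $2 eq ':' ? 1 : 0;
    }

    # only single "-x" style arguments are recognised: no grouping, no -ovalue
    while ( @$argv && $argv->[0] =~ /^-([A-Za-z0-9])$/ ) {
        my $opt = $1;
        shift @$argv;
        return 1 unless exists $takes_value{$opt};    # unknown option
        if ( $takes_value{$opt} ) {
            return 1 unless @$argv;                   # value is missing
            $results->{$opt} = shift @$argv;          # next argument, regardless
        }
        else {
            $results->{$opt} = 1;
        }
    }
    return 0;
}

my %opt;
my @args = ( '-v', '-f', '-x', 'file.txt' );
my $err  = getopts_sketch( 'vf:', \%opt, \@args );
print "error=$err v=$opt{v} f=$opt{f} rest=@args\n";
```

           Note how -f swallows "-x" as its value, which is the "take the
           next argument regardless" behaviour described above.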
1195
1196       utils_text_wrapper
1197           Params: $long_text_string [, $crlf, $width ]
1198
1199           Return: $formatted_text_string
1200
1201           This is a simple function used to format a long line of text for
1202           display on a typical limited-character screen, such as a unix shell
1203           console.
1204
1205           $crlf defaults to "\n", and $width defaults to 76.
1206
1207       utils_bruteurl
1208           Params: \%req, $pre, $post, \@values_in, \@values_out
1209
1210           Return: Nothing (adds to @values_out)
1211
1212           Bruteurl will perform a brute force against the host/server
1213           specified in %req.  It makes one request per entry in @values_in,
1214           taking each value and setting $req{'whisker'}->{'uri'} to
1215           $pre.$value.$post.  Any URI responding with an HTTP 200 or 403
1216           response is pushed into @values_out.  An example of this would be to
1217           brute force usernames, putting a list of common usernames in
1218           @values_in, setting $pre='/~' and $post='/'.
1219
1220       utils_join_tag
1221           Params: $tag_name, \%attributes
1222
1223           Return: $tag_string [undef on error]
1224
1225           This function takes the $tag_name (like 'A') and a hash full of
1226           attributes (like {href=>'http://foo/'}) and returns the constructed
1227           HTML tag string (<A href="http://foo/">).
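
           A rough sketch of how such a tag string could be assembled.
           join_tag_sketch() is a hypothetical name, not libwhisker's code;
           sorting the attribute names, emitting a bare name for an
           undefined value, and performing no escaping are all assumptions
           of the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, not LW2 code.
sub join_tag_sketch {
    my ( $tag, $attrs ) = @_;
    return undef unless defined($tag) && ref($attrs) eq 'HASH';

    my $out = "<$tag";
    for my $name ( sort keys %$attrs ) {    # sorted for predictable output
        my $value = $attrs->{$name};
        # assumption: an undefined value becomes a bare attribute name
        $out .= defined($value) ? qq{ $name="$value"} : " $name";
    }
    return "$out>";
}

print join_tag_sketch( 'A', { href => 'http://foo/' } ), "\n";
```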
1228
1229       utils_request_clone
1230           Params: \%from_request, \%to_request
1231
1232           Return: 1 on success, 0 on error
1233
1234           This function takes the connection/request-specific values from the
1235           given from_request hash, and copies them to the to_request hash.
1236
1237       utils_request_fingerprint
1238           Params: \%request [, $hash ]
1239
1240           Return: $fingerprint [undef on error]
1241
1242           This function constructs a 'fingerprint' of the given request by
1243           using a cryptographic hashing function on the constructed original
1244           HTTP request.
1245
1246           Note: $hash can be 'md5' (default) or 'md4'.
1247
1248       utils_flatten_lwhash
1249           Params: \%lwhash
1250
1251           Return: $flat_version [undef on error]
1252
1253           This function takes a %request or %response libwhisker hash, and
1254           creates an approximate flat data string of the original request/
1255           response (i.e. before it was parsed into components and placed into
1256           the libwhisker hash).
1257
1258       utils_carp
1259           Params: [ $package_name ]
1260
1261           Return: nothing
1262
1263           This function acts like Carp's carp function.  It warns with the
1264           file and line number of the user's code which causes a problem.  It
1265           traces up the call stack and reports the first function that is not
1266           in the LW2 package or the optional $package_name package.
1267
1268       utils_croak
1269           Params: [ $package_name ]
1270
1271           Return: nothing
1272
1273           This function acts like Carp's croak function.  It dies with the
1274           file and line number of the user's code which causes a problem.  It
1275           traces up the call stack and reports the first function that is not
1276           in the LW2 package or the optional $package_name package.
1277

SEE ALSO

1279       LWP
1280

COPYRIGHT

1282       Copyright 2001-2006 Rain Forest Puppy
1283
1284       This program is free software; you can redistribute it and/or modify it
1285       under the terms of the GPL.
1286
1287
1288
12892.4                               2007-05-27                            LW2(3)