LW2(3)                User Contributed Perl Documentation               LW2(3)

NAME

6       LW2 - Perl HTTP library version 2.5
7

SYNOPSIS

9       use LW2;
10
11       require 'LW2.pm';
12

DESCRIPTION

       Libwhisker is a Perl library useful for HTTP testing scripts.  It
       contains a pure-Perl reimplementation of functionality found in the
       "LWP", "URI", "Digest::MD5", "Digest::MD4", "Data::Dumper",
       "Authen::NTLM", "HTML::Parser", "HTML::FormParser", "CGI::Upload",
       "MIME::Base64", and "Getopt::Std" modules.
19
20       Libwhisker is designed to be portable (a single perl file), fast
21       (general benchmarks show libwhisker is faster than LWP), and flexible
22       (great care was taken to ensure the library does exactly what you want
23       to do, even if it means breaking the protocol).
24

FUNCTIONS

26       The following are the functions contained in Libwhisker:
27
28       auth_brute_force
29           Params: $auth_method, \%req, $user, \@passwords [, $domain,
30           $fail_code ]
31
32           Return: $first_valid_password, undef if error/none found
33
           Perform an HTTP authentication brute force against a server (host
           and URI defined in %req).  It will try every password in the
           password array for the given user.  The first password (in
           conjunction with the given user) that doesn't return HTTP 401 (or
           $fail_code, if given) is returned, and the brute force is stopped
           at that point.  You should retry the request with the given
           password and double-check that you got a useful HTTP return code
           that indicates successful authentication (200, 302), and not
           something a bit more abnormal (407, 500, etc).  $domain is
           optional, and is only used for NTLM auth.
44
45           Note: set up any proxy settings and proxy auth in %req before
46           calling this function.
47
48           You can brute-force proxy authentication by setting up the target
49           proxy as proxy_host and proxy_port in %req, using an arbitrary host
50           and uri (preferably one that is reachable upon successful proxy
51           authorization), and setting the $fail_code to 407.  The
52           $auth_method passed to this function should be a proxy-based one
53           ('proxy-basic', 'proxy-ntlm', etc).
54
           If your server returns something other than 401 upon auth failure,
           then set $fail_code to whatever is returned (it must be
           *different* from what is received on auth success, or this
           function won't be able to tell the difference).
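
           The stopping rule described above can be sketched in plain Perl
           without touching the network.  This is only an illustration, not
           libwhisker's code: first_accepted_password() and the $try
           callback are hypothetical names, and the callback stands in for
           whatever performs the request and returns the HTTP status code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LW2): return the first password whose
# response code differs from $fail_code, or nothing if every one fails.
sub first_accepted_password {
    my ($try, $user, $passwords, $fail_code) = @_;
    $fail_code = 401 unless defined $fail_code;    # default failure code
    for my $password (@$passwords) {
        my $code = $try->($user, $password);
        return $password if $code != $fail_code;   # stop at first non-failure
    }
    return;
}

# Stand-in for a real request: only 'secret' is accepted.
my $fake_server = sub {
    my ($user, $password) = @_;
    return $password eq 'secret' ? 200 : 401;
};

my $found = first_accepted_password($fake_server, 'admin',
    [ 'guest', 'secret', 'letmein' ]);
print defined $found ? "found: $found\n" : "none found\n";
```

           With the library loaded, the corresponding call would be
           LW2::auth_brute_force('basic', \%req, 'admin', \@passwords).  As
           noted above, the returned password should still be re-checked
           with a normal request.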
59
60       auth_unset
61           Params: \%req
62
63           Return: nothing (modifies %req)
64
           Modifies %req to disable all authentication (regular and proxy).

           Note: this removes the values set by auth_set().  Manually defined
           [Proxy-]Authorization headers will also be deleted (but you
           shouldn't be using the auth_* functions if you're manually handling
           your own auth...)
71
72       auth_set
73           Params: $auth_method, \%req, $user, $password [, $domain]
74
75           Return: nothing (modifies %req)
76
           Modifies %req to use the indicated authentication info.

           $auth_method can be: 'basic', 'proxy-basic', 'ntlm', 'proxy-ntlm'.
80
81           Note: this function may not necessarily set any headers after being
82           called.  Also, proxy-ntlm with SSL is not currently supported.
83
84       cookie_new_jar
85           Params: none
86
87           Return: $jar
88
89           Create a new cookie jar, for use with the other functions.  Even
90           though the jar is technically just a hash, you should still use
91           this function in order to be future-compatible (should the jar
92           format change).
93
94       cookie_read
95           Params: $jar, \%response [, \%request, $reject ]
96
97           Return: $num_of_cookies_read
98
           Read in cookies from a %response hash, and put them in $jar.
100
101           Notice: cookie_read uses internal magic done by http_do_request in
102           order to read cookies regardless of 'Set-Cookie[2]' header
103           appearance.
104
105           If the optional %request hash is supplied, then it will be used to
106           calculate default host and path values, in case the cookie doesn't
107           specify them explicitly.  If $reject is set to 1, then the %request
108           hash values are used to calculate and reject cookies which are not
109           appropriate for the path and domains of the given request.
110
111       cookie_parse
112           Params: $jar, $cookie [, $default_domain, $default_path, $reject ]
113
114           Return: nothing
115
116           Parses the cookie into the various parts and then sets the
117           appropriate values in the cookie $jar. If the cookie value is
118           blank, it will delete it from the $jar.  See the 'docs/cookies.txt'
119           document for a full explanation of how Libwhisker parses cookies
120           and what RFC aspects are supported.
121
122           The optional $default_domain value is taken literally.  Values with
123           no leading dot (e.g. 'www.host.com') are considered to be strict
124           hostnames and will only match the identical hostname.  Values with
125           leading dots (e.g.  '.host.com') are treated as sub-domain matches
126           for a single domain level.  If the cookie does not indicate a
127           domain, and a $default_domain is not provided, then the cookie is
128           considered to match all domains/hosts.
129
130           The optional $default_path is used when the cookie does not specify
131           a path.  $default_path must be absolute (start with '/'), or it
132           will be ignored.  If the cookie does not specify a path, and
133           $default_path is not provided, then the default value '/' will be
134           used.
135
136           Set $reject to 1 if you wish to reject cookies based upon the
137           provided $default_domain and $default_path.  Note that
138           $default_domain and $default_path must be specified for $reject to
139           actually do something meaningful.
140
141       cookie_write
142           Params: $jar, \%request, $override
143
144           Return: nothing
145
           Goes through the given $jar and sets the Cookie header in %request
           for the cookies whose domain and path match the request.  If
           $override is true, then the secure, domain and path restrictions
           of the cookies are ignored and all cookies are included.

           Notice: cookie expiration is currently not implemented.  URL
           restriction comparison is also case-insensitive.
153
154       cookie_get
155           Params: $jar, $name
156
157           Return: @elements
158
159           Fetch the named cookie from the $jar, and return the components.
160           The returned items will be an array in the following order:
161
162           value, domain, path, expire, secure
163
           value  = cookie value, should always be a non-empty string
           domain = domain root for cookie, can be undefined
           path   = URL path for cookie, should always be a non-empty string
           expire = undefined (deprecated, but exists for
                    backwards-compatibility)
           secure = whether or not the cookie is limited to HTTPS; value is
                    0 or 1
169
170       cookie_get_names
171           Params: $jar
172
173           Return: @names
174
           Fetch all the cookie names from the jar, which then lets you
           cookie_get() them individually.
177
178       cookie_get_valid_names
179           Params: $jar, $domain, $url, $ssl
180
181           Return: @names
182
           Fetch all the cookie names from the jar which are valid for the
           given $domain, $url, and $ssl values.  $domain should be a string
           scalar of the target host domain ('www.example.com', etc.).  $url
           should be the absolute URL path for the page ('/index.html',
           '/cgi-bin/foo.cgi', etc.).  $ssl should be 0 for non-secure
           cookies, or 1 for all (secure and normal) cookies.  The return
           value is an array of names compatible with cookie_get().
190
191       cookie_set
192           Params: $jar, $name, $value, $domain, $path, $expire, $secure
193
194           Return: nothing
195
           Set the named cookie with the provided values into the $jar.  $name
           is required to be a non-empty string.  $value is required, and will
           delete the named cookie from the $jar if it is an empty string.
           $domain and $path can be strings or undefined.  $expire is ignored
           (but exists for backwards-compatibility).  $secure should be the
           numeric value of 0 or 1.
202
203       crawl_new
204           Params: $START, $MAX_DEPTH, \%request_hash [, \%tracking_hash ]
205
206           Return: $crawl_object
207
           The crawl_new() function initializes a crawl object (hash) to the
           default values, and then returns it for later use by crawl().
           $START is the starting URL (in the form of
           'http://www.host.com/url'), and $MAX_DEPTH is the maximum number
           of levels to crawl (the $START URL counts as 1, so a value of 2
           will crawl the $START URL and all URLs found on that page).  The
           request hash is a standard initialized request hash to be used for
           requests; you should set any authentication information or headers
           in this hash in order for the crawler to use them.  The optional
           tracking hash lets you supply a hash for use in tracking URL
           results (otherwise crawl_new() will allocate a new anon hash).
219
220       crawl
221           Params: $crawl_object [, $START, $MAX_DEPTH ]
222
223           Return: $count [ undef on error ]
224
225           The heart of the crawl package.  Will perform an HTTP crawl on the
226           specified HOST, starting at START URI, proceeding up to MAX_DEPTH.
227
228           Crawl_object needs to be the variable returned by crawl_new().  You
229           can also indirectly call crawl() via the crawl_object itself:
230
231                   $crawl_object->{crawl}->($START,$MAX_DEPTH)
232
233           Returns the number of URLs actually crawled (not including those
234           skipped).
235
236       dump
237           Params: $name, \@array [, $name, \%hash, $name, \$scalar ]
238
239           Return: $code [ undef on error ]
240
241           The dump function will take the given $name and data reference, and
242           will create an ASCII perl code representation suitable for eval'ing
243           later to recreate the same structure.  $name is the name of the
244           variable that it will be saved as.  Example:
245
246            $output = LW2::dump('request',\%request);
247
248           NOTE: dump() creates anonymous structures under the name given.
249           For example, if you dump the hash %hin under the name 'hin', then
250           when you eval the dumped code you will need to use %$hin, since
251           $hin is now a *reference* to a hash.
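
           The reference behaviour in the NOTE can be seen with a
           hand-written string standing in for dump() output.  The exact text
           that dump() produces is an assumption here; only the fact that the
           evaluated name ends up holding a reference comes from the
           description above.

```perl
use strict;
use warnings;

# Assumed shape of dumped code for a hash saved under the name 'hin'.
my $code = q{$hin = { 'host' => 'localhost', 'port' => 80 };};

my $hin;                       # the name the structure was dumped under
eval $code;                    # recreate the structure
die "eval failed: $@" if $@;

my %restored = %$hin;          # $hin is a reference, so dereference it
print "$restored{host}:$restored{port}\n";
```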
252
253       dump_writefile
           Params: $file, $name, \@array [, $name, \%hash, $name, \$scalar ]
255
256           Return: 0 if success; 1 if error
257
258           This calls dump() and saves the output to the specified $file.
259
           Note: Libwhisker does no checking on the validity of the file
           name, its creation, or anything of the sort.  Files are opened in
           overwrite mode.
263
264       encode_base64
265           Params: $data [, $eol]
266
267           Return: $b64_encoded_data
268
269           This function does Base64 encoding.  If the binary MIME::Base64
270           module is available, it will use that; otherwise, it falls back to
271           an internal perl version.  The perl version carries the following
272           copyright:
273
274            Copyright 1995-1999 Gisle Aas <gisle@aas.no>
275
           NOTE: the $eol parameter will be inserted every 76 characters.
           This is used to format the data for output on an 80-character-wide
           terminal.
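
           The 76-character wrapping can be checked with the standard
           MIME::Base64 module, whose encode_base64() takes the same
           ($data, $eol) arguments.  That LW2::encode_base64 wraps in the
           same way is taken from the description above rather than verified
           here.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

my $data    = 'A' x 100;                    # 100 bytes -> 136 Base64 characters
my $encoded = encode_base64($data, "\n");   # "\n" is the $eol string
my @lines   = split /\n/, $encoded;

printf "%d lines, lengths: %s\n",
    scalar @lines, join(', ', map { length } @lines);
```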
279
280       decode_base64
281           Params: $data
282
283           Return: $b64_decoded_data
284
285           A perl implementation of base64 decoding.  The perl code for this
286           function was actually taken from an older MIME::Base64 perl module,
287           and bears the following copyright:
288
289           Copyright 1995-1999 Gisle Aas <gisle@aas.no>
290
291       encode_uri_hex
292           Params: $data
293
294           Return: $result
295
296           This function encodes every character (except the / character) with
297           normal URL hex encoding.
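
           A rough pure-Perl equivalent of this rule (every character except
           '/' becomes %XX) is sketched below.  hex_encode_all() is a
           hypothetical name, and the lowercase hex digits are an arbitrary
           choice; the library's own output may differ in case.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the behaviour described above.
sub hex_encode_all {
    my ($data) = @_;
    # Replace every character other than '/' with its %XX hex escape.
    $data =~ s{([^/])}{sprintf('%%%02x', ord($1))}ge;
    return $data;
}

print hex_encode_all('/cgi-bin/a.pl'), "\n";
```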
298
299       encode_uri_randomhex
300           Params: $data
301
302           Return: $result
303
304           This function randomly encodes characters (except the / character)
305           with normal URL hex encoding.
306
307       encode_uri_randomcase
308           Params: $data
309
310           Return: $result
311
312           This function randomly changes the case of characters in the
313           string.
314
315       encode_unicode
316           Params: $data
317
318           Return: $result
319
320           This function converts a normal string into Windows unicode format
321           (non-overlong or anything fancy).
322
323       decode_unicode
324           Params: $unicode_string
325
326           Return: $decoded_string
327
           This function attempts to decode a unicode (UTF-8) string by
           converting it into a single-byte-character string.  Overlong
           characters are converted to their standard characters in place;
           non-overlong (aka multi-byte) characters are substituted with
           0xff; invalid encoding characters are left as-is.
333
334           Note: this function is useful for dealing with the various unicode
335           exploits/vulnerabilities found in web servers; it is *not* good for
336           doing actual UTF-8 parsing, since characters over a single byte are
337           basically dropped/replaced with a placeholder.
338
339       encode_anti_ids
340           Params: \%request, $modes
341
342           Return: nothing
343
           encode_anti_ids computes the proper anti-ids encoding/tricks
           specified by $modes, and sets up %request in order to use those
           tricks.  Valid modes are (the mode numbers are the same as those
           found in whisker 1.4):
348
           1 Encode some of the characters via normal URL encoding
           2 Insert directory self-references (/./)
           3 Premature URL ending (make it appear the request line is done)
           4 Prepend a long random string in the form of "/string/../URL"
           5 Add a fake URL parameter
           6 Use a tab instead of a space as a request spacer
           7 Change the case of the URL (works against Windows and Novell)
           8 Change normal separators ('/') to Windows version ('\')
           9 Session splicing [NOTE: not currently available]
           A Use a carriage return (0x0d) as a request spacer
           B Use binary value 0x0b as a request spacer
360
361           You can set multiple modes by setting the string to contain all the
362           modes desired; i.e. $modes="146" will use modes 1, 4, and 6.
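
           How a $modes string breaks down into individual modes can be
           sketched as follows.  split_modes() and %MODE_NAME are
           hypothetical names used only for this illustration; the
           descriptions are abbreviated from the list above, and nothing here
           reflects how the library parses the string internally.

```perl
use strict;
use warnings;

# Mode descriptions, abbreviated from the list above.
my %MODE_NAME = (
    '1' => 'URL-encode some characters',
    '2' => 'insert /./ self-references',
    '3' => 'premature URL ending',
    '4' => 'prepend /string/../',
    '5' => 'fake URL parameter',
    '6' => 'tab as request spacer',
    '7' => 'change URL case',
    '8' => 'backslash separators',
    '9' => 'session splicing (not currently available)',
    'A' => 'carriage return as request spacer',
    'B' => '0x0b as request spacer',
);

# Hypothetical helper: list the known modes named in a $modes string.
sub split_modes {
    my ($modes) = @_;
    return grep { exists $MODE_NAME{$_} } split //, $modes;
}

for my $mode (split_modes('146')) {
    print "$mode: $MODE_NAME{$mode}\n";
}
```

           The real call for the same selection would be
           LW2::encode_anti_ids(\%request, '146').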
363
364       FORMS FUNCTIONS
           The goal is to parse the variable, human-readable HTML into
           concrete structures usable by your program.  The forms functions
           do a good job at making these structures, but I will admit: they
           are not exactly simple, and thus not a cinch to work with.  But
           then again, representing something as complex as an HTML form is
           not a simple thing either.  I think the results are acceptable for
           what's trying to be done.  Anyways...
372
373           Forms are stored in perl hashes, with elements in the following
374           format:
375
            $form{'element_name'} = [ [ 'type', 'value', \@params ], ... ]
377
378           Thus every element in the hash is an array of anonymous arrays.
379           The first array value contains the element type (which is 'select',
380           'textarea', 'button', or an 'input' value of the form 'input-text',
381           'input-hidden', 'input-radio', etc).
382
383           The second value is the value, if applicable (it could be undef if
384           no value was specified).  Note that select elements will always
385           have an undef value--the actual values are in the subsequent
386           options elements.
387
388           The third value, if defined, is an anonymous array of additional
389           tag parameters found in the element (like 'onchange="blah"',
390           'size="20"', 'maxlength="40"', 'selected', etc).
391
           The hash does contain one special element, which is stored under
           a NULL character ("\0") key.  This element is of the format:
395
396            $form{"\0"}=['name', 'method', 'action', @parameters];
397
398           The element is an anonymous array that contains strings of the
399           form's name, method, and action (values can be undef), and a
400           @parameters array similar to that found in normal elements (above).
401
402           Accessing individual values stored in the form hash becomes a test
403           of your perl referencing skills.  Hint: to access the 'value' of
404           the third element named 'choices', you would need to do:
405
406            $form{'choices'}->[2]->[1];
407
408           The '[2]' is the third element (normal array starts with 0), and
409           the actual value is '[1]' (the type is '[0]', and the parameter
410           array is '[2]').
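
           To make the layout concrete, the sketch below builds a form hash
           by hand and walks it.  The form contents are invented for
           illustration, and the "\0" entry is limited to name, method and
           action because the exact layout of its parameters is not spelled
           out above.  In practice the hash would come from forms_read().

```perl
use strict;
use warnings;

# Hand-built form hash following the layout described above.
my %form = (
    "\0"      => [ 'search', 'POST', '/cgi-bin/search.cgi' ],
    'query'   => [ [ 'input-text', 'libwhisker', [ 'size="20"' ] ] ],
    'choices' => [
        [ 'input-radio', 'one',   [] ],
        [ 'input-radio', 'two',   [ 'checked' ] ],
        [ 'input-radio', 'three', [] ],
    ],
);

# The special "\0" entry describes the form itself.
my ($name, $method, $action) = @{ $form{"\0"} };
print "form $name: $method $action\n";

# Value of the third element named 'choices'.
my $third = $form{'choices'}->[2]->[1];
print "third choices value: $third\n";

# Walk every ordinary element: [ type, value, \@params ].
for my $element (grep { $_ ne "\0" } sort keys %form) {
    for my $entry (@{ $form{$element} }) {
        my ($type, $value, $params) = @$entry;
        $value = '(undef)' unless defined $value;
        print "$element: $type = $value [@{ $params || [] }]\n";
    }
}
```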
411
412       forms_read
413           Params: \$html_data
414
415           Return: \@found_forms
416
417           This function parses the given $html_data into libwhisker form
418           hashes.  It returns a reference to an array of hash references to
419           the found forms.
420
421       forms_write
422           Params: \%form_hash
423
424           Return: $html_of_form [undef on error]
425
426           This function will take the given %form hash and compose a generic
427           HTML representation of it, formatted with tabs and newlines in
428           order to make it neat and tidy for printing.
429
430           Note: this function does *not* escape any special characters that
431           were embedded in the element values.
432
433       html_find_tags
434           Params: \$data, \&callback_function [, $xml_flag, $funcref,
435           \%tag_map]
436
437           Return: nothing
438
439           html_find_tags parses a piece of HTML and 'extracts' all found
440           tags, passing the info to the given callback function.  The
441           callback function must accept two parameters: the current tag (as a
442           scalar), and a hash ref of all the tag's elements. For example, the
443           tag <a href="/file"> will pass 'a' as the current tag, and a hash
444           reference which contains {'href'=>"/file"}.
445
           The $xml_flag, when set, causes the parser to do some extra
           processing and checks to accommodate XML style tags such as <tag
           foo="bar"/>.

           The optional %tag_map is a hash of lowercase tag names.  If a tag
           map is supplied, then the parser will only call the callback
           function if the tag name exists in the tag map.
453
454           The optional $funcref variable is passed straight to the callback
455           function, allowing you to pass flags or references to more complex
456           structures to your callback function.
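
           A callback of the documented shape might look like the sketch
           below.  Since the parser itself is not loaded here, the callback
           is invoked by hand with the arguments described above for
           <a href="/file">; collect_links() and @links are hypothetical
           names.

```perl
use strict;
use warnings;

my @links;

# Callback taking the two documented parameters: the tag name and a hash
# reference of the tag's elements.  Any further arguments are ignored.
sub collect_links {
    my ($tag, $attrs) = @_;
    push @links, $attrs->{href} if $tag eq 'a' && defined $attrs->{href};
    return;
}

# Simulate what the parser would pass for two tags.
collect_links('a',   { 'href' => '/file' });
collect_links('img', { 'src'  => 'x.png' });

print "links: @links\n";
```

           With the library loaded, the same callback would be registered as
           LW2::html_find_tags(\$html, \&collect_links), optionally followed
           by $xml_flag, $funcref and a tag map such as { 'a' => 1 }.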
457
458       html_find_tags_rewrite
459           Params: $position, $length, $replacement
460
461           Return: nothing
462
463           html_find_tags_rewrite() is used to 'rewrite' an HTML stream from
464           within an html_find_tags() callback function.  In general, you can
465           think of html_find_tags_rewrite working as:
466
467           substr(DATA, $position, $length) = $replacement
468
           Where DATA is the current HTML string the html parser is using.
           The reason you need to use this function and not substr() is
           because a few internal parser pointers and counters need to be
           adjusted to accommodate the changes.

           If you want to remove a piece of the string, just set the
           replacement to an empty string ('').  If you wish to insert a
           string instead of overwriting, just set $length to 0; your string
           will be inserted at the indicated $position.
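
           The three cases (replace, remove, insert) behave like Perl's own
           substr() used as an lvalue, which is shown below on an ordinary
           string.  This only illustrates the ($position, $length,
           $replacement) semantics; inside a real callback you would call
           html_find_tags_rewrite() instead, for the reason given above.

```perl
use strict;
use warnings;

my $data = 'Hello <b>world</b>!';

# Replace: overwrite the 3 characters at position 6 ('<b>').
substr($data, 6, 3) = '<i>';
print "$data\n";

# Remove: an empty replacement deletes the 4 characters at position 14.
substr($data, 14, 4) = '';
print "$data\n";

# Insert: a length of 0 inserts at position 14 without overwriting.
substr($data, 14, 0) = '</i>';
print "$data\n";
```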
478
479       html_link_extractor
480           Params: \$html_data
481
482           Return: @urls
483
484           The html_link_extractor() function uses the internal crawl tests to
485           extract all the HTML links from the given HTML data stream.
486
           Note: html_link_extractor() does not de-duplicate the returned
           array of discovered links, nor does it attempt to remove
           javascript links or make the links absolute.  It just extracts
           every raw link from the HTML stream and returns it.  You'll have
           to do your own post-processing.
492
493       http_new_request
494           Params: %parameters
495
496           Return: \%request_hash
497
498           This function basically 'objectifies' the creation of whisker
499           request hash objects.  You would call it like:
500
501            $req = http_new_request( host=>'www.example.com', uri=>'/' )
502
503           where 'host' and 'uri' can be any number of {whisker} hash control
504           values (see http_init_request for default list).
505
506       http_new_response
507           Params: [none]
508
509           Return: \%response_hash
510
511           This function basically 'objectifies' the creation of whisker
512           response hash objects.  You would call it like:
513
514                   $resp = http_new_response()
515
516       http_init_request
517           Params: \%request_hash_to_initialize
518
519           Return: Nothing (modifies input hash)
520
521           Sets default values to the input hash for use.  Sets the host to
522           'localhost', port 80, request URI '/', using HTTP 1.1 with GET
523           method.  The timeout is set to 10 seconds, no proxies are defined,
524           and all URI formatting is set to standard HTTP syntax.  It also
525           sets the Connection (Keep-Alive) and User-Agent headers.
526
527           NOTICE!!  It's important to use http_init_request before calling
528           http_do_request, or http_do_request might puke.  Thus, a special
529           magic value is placed in the hash to let http_do_request know that
530           the hash has been properly initialized.  If you really must 'roll
531           your own' and not use http_init_request before you call
532           http_do_request, you will at least need to set the MAGIC value
533           (amongst other things).
534
535       http_do_request
536           Params: \%request, \%response [, \%configs]
537
538           Return: >=1 if error; 0 if no error (also modifies response hash)
539
           *THE* core function of libwhisker.  http_do_request actually
           performs the HTTP request, using the values submitted in %request,
           and placing result values in %response.  This allows you to
           resubmit %request in subsequent requests (%response is
           automatically cleared upon execution).  You can submit 'runtime'
           config directives as %configs, which will be spliced into
           $request{whisker}->{} before anything else.  That means you can do:
547
548           LW2::http_do_request(\%req,\%resp,{'uri'=>'/cgi-bin/'});
549
550           This will set $req{whisker}->{'uri'}='/cgi-bin/' before execution,
551           and provides a simple shortcut (note: it does modify %req).
552
           This function will also retry any requests that bomb out during the
           transaction (but not during the connecting phase).  This is
           controlled by the {whisker}->{retry} value.  Also note that the
           error message returned in %response is the *last* error received.
           All retry errors are put into {whisker}->{retry_errors}, which is
           an anonymous array.
559
560           Also note that all NTLM auth logic is implemented in
561           http_do_request().  NTLM requires multiple requests in order to
562           work correctly, and so this function attempts to wrap that and make
563           it all transparent, so that the final end result is what's passed
564           to the application.
565
566           This function will return 0 on success, 1 on HTTP protocol error,
567           and 2 on non-recoverable network connection error (you can retry
568           error 1, but error 2 means that the server is totally unreachable
569           and there's no point in retrying).
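
           Putting the pieces together, a request might be issued roughly as
           sketched below.  fetch_uri() is a hypothetical wrapper, and
           reading the body from {whisker}->{data} and the error text from
           {whisker}->{error} is inferred from the http_read_body description
           rather than stated for this function.  LW2 is loaded lazily so
           that the file still compiles where the library is not installed.

```perl
use strict;
use warnings;

# LW2 may not be installed; remember whether loading it worked.
my $have_lw2 = eval { require LW2; 1 };

# Hypothetical wrapper: fetch $uri from $host.
# Returns ($status, $body_or_error); $status is undef if LW2 is missing.
sub fetch_uri {
    my ($host, $uri) = @_;
    return (undef, 'LW2 is not installed') unless $have_lw2;

    my (%req, %resp);
    LW2::http_init_request(\%req);      # defaults plus the MAGIC marker
    $req{whisker}->{host} = $host;
    LW2::http_fixup_request(\%req);     # add Host: etc. for a compliant request

    # The third argument is spliced into $req{whisker} before execution.
    my $status = LW2::http_do_request(\%req, \%resp, { 'uri' => $uri });

    # 1 = protocol error (may be retried), 2 = connection error.
    return ($status, $resp{whisker}->{error}) if $status;
    return (0, $resp{whisker}->{data});
}
```

           A caller would then write something like
           my ($status, $result) = fetch_uri('www.example.com', '/');
           and treat a status of 1 as retryable and 2 as unreachable.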
570
571       http_req2line
572           Params: \%request, $uri_only_switch
573
574           Return: $request
575
           http_req2line is used internally by http_do_request, and also
           provides a convenient way to turn a %request configuration into an
           actual HTTP request line.  If $uri_only_switch is set to 1, then
           the returned $request will be the URI only
           ('/requested/page.html'), versus the entire HTTP request ('GET
           /requested/page.html HTTP/1.0\n\n').  Also, if the
           'full_request_override' whisker config variable is set in
           %request, then it will be returned instead of the constructed URI.
583
584       http_resp2line
585           Params: \%response
586
587           Return: $response
588
           http_resp2line provides a convenient way to turn a %response hash
           back into the original HTTP response line.
591
592       http_fixup_request
593           Params: $hash_ref
594
595           Return: Nothing
596
           This function takes a %request hash reference and makes sure the
           proper headers exist (for example, it will add the Host: header,
           calculate the Content-Length: header for POST requests, etc).  For
           standard requests (i.e. you want the request to be HTTP
           RFC-compliant), you should call this function right before you
           call http_do_request.
602
603       http_reset
604           Params: Nothing
605
606           Return: Nothing
607
608           The http_reset function will walk through the %http_host_cache,
609           closing all open sockets and freeing SSL resources.  It also clears
610           out the host cache in case you need to rerun everything fresh.
611
612           Note: if you just want to close a single connection, and you have a
613           copy of the %request hash you used, you should use the http_close()
614           function instead.
615
616       ssl_is_available
617           Params: Nothing
618
619           Return: $boolean [, $lib_name, $version]
620
           The ssl_is_available() function will inform you whether SSL
           requests are allowed, which is dependent on whether the appropriate
           SSL libraries are installed on the machine.  In scalar context, the
           function will return 1 or 0.  In array context, the second element
           will be the SSL library name that is currently being used by LW2,
           and the third element will be the SSL library version number.
           Elements two and three (name and version) will be undefined if
           called in array context and no SSL libraries are available.
629
630       http_read_headers
631           Params: $stream, \%in, \%out
632
633           Return: $result_code, $encoding, $length, $connection
634
           Read HTTP headers from the given stream, storing the results in
           %out.  On success, $result_code will be 1 and $encoding, $length,
           and $connection will hold the values of the Transfer-Encoding,
           Content-Length, and Connection headers, respectively.  If any of
           those headers is not present, the corresponding value will be
           undef.  On an error, the $result_code will be 0 and $encoding will
           contain an error message.
642
643           This function can be used to parse both request and response
644           headers.
645
646           Note: if there are multiple Transfer-Encoding, Content-Length, or
647           Connection headers, then only the last header value is the one
648           returned by the function.
649
650       http_read_body
651           Params: $stream, \%in, \%out, $encoding, $length
652
           Return: 1 on success, 0 on error (and sets
           $out->{whisker}->{error})
655
656           Read the body from the given stream, placing it in
657           $out->{whisker}->{data}.  Handles chunked encoding.  Can be used to
658           read HTTP (POST) request or HTTP response bodies.  $encoding
659           parameter should be lowercase encoding type.
660
661           NOTE: $out->{whisker}->{data} is erased/cleared when this function
662           is called, leaving {data} to just contain this particular HTTP
663           body.
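
           Chunked transfer encoding, which http_read_body is said to
           handle, frames the body as a series of hex-length-prefixed chunks
           ending in a zero-length chunk.  The sketch below decodes such a
           body from an in-memory string.  It illustrates the wire format
           only and is not libwhisker's implementation; decode_chunked() is a
           hypothetical name, and any trailers after the last chunk are
           ignored.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a chunked body held in a string.
# Returns the decoded body, or nothing if the input is malformed.
sub decode_chunked {
    my ($raw) = @_;
    my $body = '';
    my $pos  = 0;
    while (1) {
        my $eol = index($raw, "\r\n", $pos);
        return if $eol < 0;                          # no size line found
        my $size_line = substr($raw, $pos, $eol - $pos);
        my ($hex) = $size_line =~ /^([0-9A-Fa-f]+)/  # ignore chunk extensions
            or return;
        my $size = hex($hex);
        $pos = $eol + 2;                             # skip CRLF after the size
        last if $size == 0;                          # zero-length chunk ends body
        return if $pos + $size > length $raw;        # truncated chunk
        $body .= substr($raw, $pos, $size);
        $pos  += $size + 2;                          # skip data and its CRLF
    }
    return $body;
}

print decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"), "\n";
```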
664
665       http_construct_headers
666           Params: \%in
667
668           Return: $data
669
670           This function assembles the headers in the given hash into a data
671           string.
672
673       http_close
674           Params: \%request
675
676           Return: nothing
677
678           This function will close any open streams for the given request.
679
680           Note: in order for http_close() to find the right connection, all
681           original host/proxy/port parameters in %request must be the exact
682           same as when the original request was made.
683
684       http_do_request_timeout
685           Params: \%request, \%response, $timeout
686
687           Return: $result
688
689           This function is identical to http_do_request(), except that it
690           wraps the entire request in a timeout wrapper.  $timeout is the
691           number of seconds to allow for the entire request to be completed.
692
693           Note: this function uses alarm() and signals, and thus will only
694           work on Unix-ish platforms.  It should be safe to call on any
695           platform though.
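
           The alarm()-based wrapping mentioned in the note follows a common
           Perl idiom, sketched below for an arbitrary code reference.
           with_timeout() is a hypothetical helper written for this
           illustration; how libwhisker structures its own wrapper is not
           shown in this document.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns (1, @results) on completion, or (0, $error) on timeout/failure.
sub with_timeout {
    my ($seconds, $code) = @_;
    my @results;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;            # schedule SIGALRM
        @results = $code->();
        alarm 0;                   # finished in time: cancel the alarm
        1;
    };
    alarm 0;                       # make sure no alarm is left pending
    return $ok ? (1, @results) : (0, $@);
}

my ($done, $value) = with_timeout(5, sub { return 42 });
print $done ? "finished: $value\n" : "failed: $value";
```

           Wrapping a call such as sub { LW2::http_do_request(\%req, \%resp) }
           in the same way would bound the whole request, which is what
           http_do_request_timeout() is described as doing.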
696
       md5
           Params: $data
698
699           Return: $hex_md5_string
700
701           This function takes a data scalar, and composes a MD5 hash of it,
702           and returns it in a hex ascii string.  It will use the fastest MD5
703           function available.
704
       md4
           Params: $data
706
707           Return: $hex_md4_string
708
709           This function takes a data scalar, and composes a MD4 hash of it,
710           and returns it in a hex ascii string.  It will use the fastest MD4
711           function available.
712
713       multipart_set
714           Params: \%multi_hash, $param_name, $param_value
715
716           Return: nothing
717
718           This function sets the named parameter to the given value within
719           the supplied multipart hash.
720
721       multipart_get
722           Params: \%multi_hash, $param_name
723
724           Return: $param_value, undef on error
725
726           This function retrieves the value of the named parameter from the
727           supplied multipart hash.  There is a special case where the named
728           parameter is actually a file, in which case the resulting value
729           will be "\0FILE".  In general, all special values are prefixed
730           with a NUL character.  To get a file's info, use
731           multipart_getfile().
732
733       multipart_setfile
734           Params: \%multi_hash, $param_name, $file_path [, $filename]
735
736           Return: undef on error, 1 on success
737
738           NOTE: this function does not actually add the contents of
739           $file_path into the %multi_hash; instead, multipart_write() inserts
740           the content when generating the final request.
741
742       multipart_getfile
743           Params: \%multi_hash, $file_param_name
744
745           Return: $path, $name ($path=undef on error)
746
747           multipart_getfile is used to retrieve information for a file
748           parameter contained in %multi_hash.  To use this you would most
749           likely do:
750
751            ($path,$fname)=LW2::multipart_getfile(\%multi,"param_name");
752
753       multipart_boundary
754           Params: \%multi_hash [, $new_boundary_name]
755
756           Return: $current_boundary_name
757
758           multipart_boundary is used to retrieve, and optionally set, the
759           multipart boundary used for the request.
760
761           NOTE: the function does no checking on the supplied boundary, so if
762           you want things to work make sure it's a legit boundary.
763           Libwhisker does *not* prefix it with any '---' characters.
764
765       multipart_write
766           Params: \%multi_hash, \%request
767
768           Return: 1 if successful, undef on error
769
770           multipart_write is used to parse and construct the multipart data
771           contained in %multi_hash, and place it ready to go in the given
772           whisker hash (%request) structure, to be sent to the server.
773
774           NOTE: file contents are read into the final %request, so it's
775           possible for the hash to get *very* large if you have (a) large
776           file(s).
777
778       multipart_read
779           Params: \%multi_hash, \%hout_response [, $filepath ]
780
781           Return: 1 if successful, undef on error
782
783           multipart_read will parse the data contents of the supplied
784           %hout_response hash, by passing the appropriate info to
785           multipart_read_data().  Please see multipart_read_data() for more
786           info on parameters and behaviour.
787
788           NOTE: this function will return an error if the given
789           %hout_response Content-Type is not set to "multipart/form-data".
790
791       multipart_read_data
792           Params: \%multi_hash, \$data, $boundary [, $filepath ]
793
794           Return: 1 if successful, undef on error
795
796           multipart_read_data parses the contents of the supplied data using
797           the given boundary and puts the values in the supplied %multi_hash.
798           Embedded files will *not* be saved unless a $filepath is given,
799           which should be a directory suitable for writing out temporary
800           files.
801
802           NOTE: currently application/octet-stream is the only supported file
803           encoding.  Files in any other encoding will not be parsed/saved.
804
805       multipart_files_list
806           Params: \%multi_hash
807
808           Return: @files
809
810           multipart_files_list returns an array of parameter names for all
811           the files that are contained in %multi_hash.
812
813       multipart_params_list
814           Params: \%multi_hash
815
816           Return: @params
817
818           multipart_params_list returns an array of parameter names for all
819           the regular parameters (non-file) that are contained in
820           %multi_hash.
821
822       ntlm_new
823           Params: $username, $password [, $domain, $ntlm_only]
824
825           Return: $ntlm_object
826
827           Returns a reference to an array (otherwise known as the 'ntlm
828           object') which contains the information specific to a user/pass
829           combo.  If $ntlm_only is set to 1, then only the NTLM hash (and not
830           the LanMan hash) will be generated.  This results in a speed boost,
831           and is typically fine for use against IIS servers.
832
833           The array contains the following items, in order: username,
834           password, domain, lmhash(password), ntlmhash(password)
835
836       ntlm_decode_challenge
837           Params: $challenge
838
839           Return: @challenge_parts
840
841           Splits the supplied challenge into the various parts.  The returned
842           array contains elements in the following order:
843
844           unicode_domain, ident, packet_type, domain_len, domain_maxlen,
845           domain_offset, flags, challenge_token, reserved, empty, raw_data
846
847       ntlm_client
848           Params: $ntlm_obj [, $server_challenge]
849
850           Return: $response
851
852           ntlm_client() is responsible for generating the base64-encoded text
853           you include in the HTTP Authorization header.  If you call
854           ntlm_client() without a $server_challenge, the function will return
855           the initial NTLM request packet (message packet #1).  You send this
856           to the server, and take the server's response (message packet #2)
857           and pass that as $server_challenge, causing ntlm_client() to
858           generate the final response packet (message packet #3).
859
860           Note: $server_challenge is expected to be base64 encoded.
861
862       get_page
863           Params: $url [, \%request]
864
865           Return: $code, $data ($code will be set to undef on error, $data
866           will contain error message)
867
868           This function will fetch the page at the given URL, and return the
869           HTTP response code and page contents.  Use this in the form of:
870           ($code,$html)=LW2::get_page("http://host.com/page.html")
871
872           The optional %request will be used if supplied.  This allows you to
873           set headers and other parameters.
874
875       get_page_hash
876           Params: $url [, \%request]
877
878           Return: $hash_ref (undef on no URL)
879
880           This function will fetch the page at the given URL, and return the
881           whisker HTTP response hash.  The return code of the underlying
882           http_do_request() call is stored in
883           $hash_ref->{whisker}->{get_page_hash}.
884
885           Note: undef is returned if no URL is supplied.
886
887       get_page_to_file
888           Params: $url, $filepath [, \%request]
889
890           Return: $code ($code will be set to undef on error)
891
892           This function will fetch the page at the given URL, place the
893           resulting HTML in the file specified, and return the HTTP response
894           code.  The optional %request hash sets the default parameters to be
895           used in the request.
896
897           NOTE: libwhisker does not do any file checking; libwhisker will
898           open the supplied filepath for writing, overwriting any previously-
899           existing files.  Libwhisker does not differentiate between a bad
900           request, and a bad file open.  If you're having troubles making
901           this function work, make sure that your $filepath is legal and
902           valid, and that you have appropriate write permissions to
903           create/overwrite that file.
904
905       time_mktime
906           Params: $seconds, $minutes, $hours, $day_of_month, $month,
907           $year_minus_1900
908
909           Return: $seconds [ -1 on error ]
910
911           Performs a general mktime calculation with the given time
912           components.  Note that the input parameter values are expected to
913           be in the format output by localtime/gmtime.  Namely, $seconds is
914           0-60 (yes, there can be a leap second value of 60 occasionally),
915           $minutes is 0-59, $hours is 0-23, $day_of_month is 1-31, $month is
916           0-11, and $year_minus_1900 is 70-137.  This function is limited in
917           that it will not process dates prior to 1970 or after 2037 (that
918           way 32-bit time_t overflow calculations aren't required).
919
920           Additional parameters passed to the function are ignored, so it is
921           safe to use the full localtime/gmtime output, such as:
922
923                   $seconds = LW2::time_mktime( localtime( time ) );
924
925           Note: this function does not adjust for time zone, daylight savings
926           time, etc.  You must do that yourself.
927
928       time_gmtolocal
929           Params: $seconds_gmt
930
931           Return: $seconds_local_timezone
932
933           Takes a seconds value in UTC/GMT time and adjusts it to reflect the
934           current timezone.  This function is slightly expensive; it takes
935           the gmtime() and localtime() representations of the current time,
936           calculates the delta difference by turning them back into seconds
937           via time_mktime, and then applies this delta difference to
938           $seconds_gmt.
939
940           Note that if you give this function a time and subtract the return
941           value from the original time, you will get the delta value.  At
942           that point, you can just apply the delta directly and skip calling
943           this function, which is a massive performance boost.  However, this
944           will cause problems if you have a long running program which
945           crosses daylight savings time boundaries, as the DST adjustment
946           will not be accounted for unless you recalculate the new delta.
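
           The delta calculation described above can be sketched with the
           core Time::Local module standing in for time_mktime.  That
           substitution is an assumption made so the sketch runs without
           libwhisker, and tz_delta is a hypothetical helper name.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: the number of seconds to add to a GMT timestamp
# to get the "local" value.  It turns the gmtime() and localtime()
# breakdowns of the same instant back into seconds and subtracts them,
# which is the approach the text describes.
sub tz_delta {
    my ($now) = @_;
    my $as_gmt   = timegm( ( gmtime($now) )[ 0 .. 5 ] );
    my $as_local = timegm( ( localtime($now) )[ 0 .. 5 ] );
    return $as_local - $as_gmt;
}

my $now   = time;
my $delta = tz_delta($now);    # compute once ...
my $local = $now + $delta;     # ... then apply it directly
printf "delta is %d seconds\n", $delta;
```

           A long-running program would need to call tz_delta() again after
           a daylight saving time change, for the reason given above.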
947
948       uri_split
949           Params: $uri_string [, \%request_hash]
950
951           Return: @uri_parts
952
953           Return an array of the following values, in order:  uri, protocol,
954           host, port, params, frag, user, password.  Values not defined are
955           given an undef value.  If a %request hash is passed in, then
956           uri_split() will also set the appropriate values in the hash.
957
958           Note:  uri_split() will only set the %request hash if the protocol
959           is HTTP or HTTPS!
960
961       uri_join
962           Params: @vals
963
964           Return: $url
965
966           Takes the @vals array output from uri_split, and returns a
967           single scalar/string with them joined again, in the form of:
968           protocol://user:pass@host:port/uri?params#frag
969
970       uri_absolute
971           Params: $uri, $base_uri [, $normalize_flag ]
972
973           Return: $absolute_uri
974
975           Double checks that the given $uri is in absolute form (that is,
976           "http://host/file"), and if not (it's in the form "/file"), then it
977           is resolved against the given $base_uri to make it absolute.  This
978           provides functionality similar to that found in the URI module.
979
980           If $normalize_flag is set to 1, then the output will be passed
981           through uri_normalize before being returned.
982
983       uri_normalize
984           Params: $uri [, $fix_windows_slashes ]
985
986           Return: $normalized_uri [ undef on error ]
987
988           Takes the given $uri and does any /./ and /../ dereferencing in
989           order to come up with the correct absolute URL.  If the
990           $fix_windows_slashes parameter is set to 1, all \ (backslashes) will
991           be converted to / (forward slashes).
992
993           Non-http/https URIs return an error.
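
           The /./ and /../ dereferencing can be illustrated on the path
           portion of a URI.  The following is a rough sketch of the idea
           only, not libwhisker's implementation; normalize_path is a
           hypothetical name, and scheme and host handling are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse "." and ".." segments in a URI path.
# If $fix_windows_slashes is true, backslashes become forward slashes
# before the segments are examined.
sub normalize_path {
    my ( $path, $fix_windows_slashes ) = @_;
    $path =~ tr{\\}{/} if $fix_windows_slashes;
    my @out;
    for my $seg ( split m{/}, $path, -1 ) {
        if    ( $seg eq '.' )  { next }
        elsif ( $seg eq '..' ) { pop @out if @out > 1 }    # never pop the root
        else                   { push @out, $seg }
    }
    # A trailing "." or ".." names a directory, so keep the final slash.
    push @out, '' if $path =~ m{/\.\.?$};
    return join '/', @out;
}

print normalize_path('/rfp/./cgi/../page.php'), "\n";    # /rfp/page.php
```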
994
995       uri_get_dir
996           Params: $uri
997
998           Return: $uri_directory
999
1000           Will take a URI and return the directory base of it, e.g.
1001           /rfp/page.php will return /rfp/.
1002
1003       uri_strip_path_parameters
1004           Params: $uri [, \%param_hash]
1005
1006           Return: $stripped_uri
1007
1008           This function removes all URI path parameters of the form
1009
1010            /blah1;foo=bar/blah2;baz
1011
1012           and returns the stripped URI ('/blah1/blah2').  If the optional
1013           parameter hash reference is provided, the stripped parameters are
1014           saved in the form of 'blah1'=>'foo=bar', 'blah2'=>'baz'.
1015
1016           Note: only the last value of a duplicate name is saved into the
1017           param_hash, if provided.  So a $uri of '/foo;A/foo;B/' will result
1018           in a single hash entry of 'foo'=>'B'.
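
           The documented behaviour, including the last-value-wins rule for
           duplicate names, can be reproduced in a few lines of plain Perl.
           This is an illustrative sketch rather than libwhisker's code, and
           strip_path_parameters is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: remove ";param" suffixes from each path segment.
# If $params (a hash reference) is given, each stripped parameter is
# stored under its segment name; a later duplicate overwrites an
# earlier one.
sub strip_path_parameters {
    my ( $uri, $params ) = @_;
    my @clean;
    for my $seg ( split m{/}, $uri, -1 ) {
        my ( $name, $param ) = split /;/, $seg, 2;
        $name = '' unless defined $name;    # empty segment, e.g. leading "/"
        $params->{$name} = $param if $params && defined $param;
        push @clean, $name;
    }
    return join '/', @clean;
}

my %saved;
my $stripped = strip_path_parameters( '/blah1;foo=bar/blah2;baz', \%saved );
print "$stripped\n";    # /blah1/blah2
```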
1019
1020       uri_parse_parameters
1021           Params: $parameter_string [, $decode, $multi_flag ]
1022
1023           Return: \%parameter_hash
1024
1025           This function takes a string in the form of:
1026
1027            foo=1&bar=2&baz=3&foo=4
1028
1029           And parses it into a hash.  In the above example, the element 'foo'
1030           has two values (1 and 4).  If $multi_flag is set to 1, then the
1031           'foo' hash entry will hold an anonymous array of both values.
1032           Otherwise, the default is to just contain the last value (in this
1033           case, '4').
1034
1035           If $decode is set to 1, then normal hex decoding is done on the
1036           characters, where needed (both the name and value are decoded).
1037
1038           Note: if a URL parameter name appears without a value, then the
1039           value will be set to undef.  E.g. for the string "foo=1&bar&baz=2",
1040           the 'bar' hash element will have an undef value.
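
           The parsing rules above can be sketched in plain Perl.  This is
           not libwhisker's code, and parse_parameters is a hypothetical
           name.  One detail is an assumption: with the multi flag set, a
           name that appears only once keeps a plain scalar value.

```perl
use strict;
use warnings;

# Hypothetical helper following the rules described in the text.
sub parse_parameters {
    my ( $string, $decode, $multi ) = @_;
    my %out;
    for my $pair ( split /&/, $string ) {
        my ( $name, $value ) = split /=/, $pair, 2;    # no "=" => undef value
        next unless defined $name && length $name;
        if ($decode) {
            for ( $name, $value ) {
                next unless defined;
                s/%([0-9a-fA-F]{2})/chr hex $1/ge;     # %XX hex decoding
            }
        }
        if ( $multi && exists $out{$name} ) {
            # Second and later values: promote to an anonymous array.
            $out{$name} = [ $out{$name} ] unless ref $out{$name};
            push @{ $out{$name} }, $value;
        }
        else {
            $out{$name} = $value;                      # last value wins
        }
    }
    return \%out;
}

my $last  = parse_parameters('foo=1&bar=2&baz=3&foo=4');
my $multi = parse_parameters( 'foo=1&bar=2&baz=3&foo=4', 0, 1 );
print "foo (default): $last->{foo}\n";
print "foo (multi):   @{ $multi->{foo} }\n";
```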
1041
1042       uri_escape
1043           Params: $data
1044
1045           Return: $encoded_data
1046
1047           This function encodes the given $data so it is safe to be used in
1048           URIs.
1049
1050       uri_unescape
1051           Params: $encoded_data
1052
1053           Return: $data
1054
1055           This function decodes the given $encoded_data out of URI format.
1056
1057       utils_recperm
1058           Params: $uri, $depth, \@dir_parts, \@valid, \&func, \%track,
1059           \%arrays, \&cfunc
1060
1061           Return: nothing
1062
1063           This is a special function which is used to recursively permute
1064           through a given directory listing.  This is really only used by
1065           whisker, in order to traverse down directories, testing them as it
1066           goes.  See whisker 2.0 for exact usage examples.
1067
1068       utils_array_shuffle
1069           Params: \@array
1070
1071           Return: nothing
1072
1073           This function will randomize the order of the elements in the given
1074           array.
1075
1076       utils_randstr
1077           Params: [ $size, $chars ]
1078
1079           Return: $random_string
1080
1081           This function generates a random string between 10 and 20
1082           characters long, or of $size if specified.  If $chars is specified,
1083           then the random function picks characters from the supplied string.
1084           For example, to have a random string of 10 characters, composed of
1085           only the characters 'abcdef', then you would run:
1086
1087            utils_randstr(10,'abcdef');
1088
1089           The default character string is alphanumeric.
1090
1091       utils_port_open
1092           Params: $host, $port
1093
1094           Return: $result
1095
1096           Quick function to attempt to make a connection to the given host
1097           and port.  If a connection was successfully made, function will
1098           return true (1).  Otherwise it returns false (0).
1099
1100           Note: this uses standard TCP connections, thus is not recommended
1101           for use in port-scanning type applications.  Extremely slow.
1102
1103       utils_lowercase_keys
1104           Params: \%hash
1105
1106           Return: $number_changed
1107
1108           Will lowercase all the header names (but not values) of the given
1109           hash.
1110
1111       utils_find_lowercase_key
1112           Params: \%hash, $key
1113
1114           Return: $value, undef on error or not exist
1115
1116           Searches the given hash for the $key (regardless of case), and
1117           returns the value.  If the return value is placed into an array, the
1118           function will dereference any multi-value references and return an
1119           array of all values.
1120
1121           WARNING!  In scalar context, $value can either be a single-value
1122           scalar or an array reference for multiple scalar values.  That
1123           means you either need to check the return value and act
1124           appropriately, or use an array context (even if you only want a
1125           single value).  This is very important, even if you know there are
1126           no multi-value hash keys.  This function may still return an array
1127           of multiple values even if all hash keys are single value, since
1128           lowercasing the keys could result in multiple keys matching.  For
1129           example, a hash with the values { 'Foo'=>'a', 'fOo'=>'b' }
1130           technically has two keys with the lowercase name 'foo', and so this
1131           function will either return an array or array reference with both
1132           'a' and 'b'.
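
           The context-dependent return value is easy to get wrong, so here
           is a sketch of both calling styles.  To keep it runnable without
           libwhisker, find_key_ci is a hypothetical stand-in that imitates
           the documented behaviour; it is not the library's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in: case-insensitive lookup that returns a list in
# list context, and a scalar or array reference in scalar context.
sub find_key_ci {
    my ( $hash, $key ) = @_;
    my @found;
    for my $k ( sort keys %$hash ) {
        next unless lc($k) eq lc($key);
        my $v = $hash->{$k};
        push @found, ref($v) eq 'ARRAY' ? @$v : $v;
    }
    return @found if wantarray;
    return unless @found;    # undef when nothing matched
    return @found == 1 ? $found[0] : [@found];
}

my %headers = ( 'Foo' => 'a', 'fOo' => 'b', 'Bar' => 'c' );

# List context: always safe, even when only one value is expected.
my @all = find_key_ci( \%headers, 'foo' );

# Scalar context: check for an array reference before using the value.
my $one    = find_key_ci( \%headers, 'foo' );
my @values = ref($one) eq 'ARRAY' ? @$one : ($one);
print "list: @all / scalar, normalised: @values\n";
```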
1133
1134       utils_find_key
1135           Params: \%hash, $key
1136
1137           Return: $value, undef on error or not exist
1138
1139           Searches the given hash for the $key (case-sensitive), and returns
1140           the value.  If the return value is placed into an array, the
1141           function will dereference any multi-value references and return an
1142           array of all values.
1143
1144       utils_delete_lowercase_key
1145           Params: \%hash, $key
1146
1147           Return: $number_found
1148
1149           Searches the given hash for the $key (regardless of case), and
1150           deletes the key out of the hash if found.  The function returns the
1151           number of keys found and deleted (since multiple keys can exist
1152           under the names 'Key', 'key', 'keY', 'KEY', etc.).
1153
1154       utils_getline
1155           Params: \$data [, $resetpos ]
1156
1157           Return: $line (undef if no more data)
1158
1159           Fetches the next \n terminated line from the given data.  Use the
1160           optional $resetpos to reset the internal position pointer.  Does
1161           *NOT* return trailing \n.
1162
1163       utils_getline_crlf
1164           Params: \$data [, $resetpos ]
1165
1166           Return: $line (undef if no more data)
1167
1168           Fetches the next \r\n terminated line from the given data.  Use the
1169           optional $resetpos to reset the internal position pointer.  Does
1170           *NOT* return trailing \r\n.
1171
1172       utils_save_page
1173           Params: $file, \%response
1174
1175           Return: 0 on success, 1 on error
1176
1177           Saves the data portion of the given whisker %response hash to the
1178           indicated file.  Can technically save the data portion of a
1179           %request hash too.  A file is not written if there is no data.
1180
1181           Note: LW does not do any special file checking; files are opened in
1182           overwrite mode.
1183
1184       utils_getopts
1185           Params: $opt_str, \%opt_results
1186
1187           Return: 0 on success, 1 on error
1188
1189           This function is a general implementation of Getopt::Std.  It will
1190           parse @ARGV, looking for the options specified in $opt_str, and
1191           will put the results in %opt_results.  Behavior/parameter values
1192           are similar to Getopt::Std's getopts().
1193
1194           Note: this function does *not* support long options (--option),
1195           option grouping (-opq), or options with immediate values (-ovalue).
1196           If an option is indicated as having a value, it will take the next
1197           argument regardless.
1198
1199       utils_text_wrapper
1200           Params: $long_text_string [, $crlf, $width ]
1201
1202           Return: $formatted_text_string
1203
1204           This is a simple function used to format a long line of text for
1205           display on a typical limited-character screen, such as a unix shell
1206           console.
1207
1208           $crlf defaults to "\n", and $width defaults to 76.
1209
1210       utils_bruteurl
1211           Params: \%req, $pre, $post, \@values_in, \@values_out
1212
1213           Return: nothing (adds to @values_out)
1214
1215           Bruteurl will perform a brute force against the host/server
1216           specified in %req.  It makes one request per entry in @values_in,
1217           setting $req{'whisker'}->{'uri'} to $pre.$value.$post for each
1218           value.  Any URI responding with an HTTP 200 or 403 response is
1219           pushed into @values_out.  An example of this would be to brute
1220           force usernames, putting a list of common usernames in @values_in
1221           and setting $pre='/~' and $post='/'.
1222
1223       utils_join_tag
1224           Params: $tag_name, \%attributes
1225
1226           Return: $tag_string [undef on error]
1227
1228           This function takes the $tag_name (like 'A') and a hash full of
1229           attributes (like {href=>'http://foo/'}) and returns the constructed
1230           HTML tag string (<A href="http://foo/">).
1231
1232       utils_request_clone
1233           Params: \%from_request, \%to_request
1234
1235           Return: 1 on success, 0 on error
1236
1237           This function takes the connection/request-specific values from the
1238           given from_request hash, and copies them to the to_request hash.
1239
1240       utils_request_fingerprint
1241           Params: \%request [, $hash ]
1242
1243           Return: $fingerprint [undef on error]
1244
1245           This function constructs a 'fingerprint' of the given request by
1246           using a cryptographic hashing function on the constructed original
1247           HTTP request.
1248
1249           Note: $hash can be 'md5' (default) or 'md4'.
1250
1251       utils_flatten_lwhash
1252           Params: \%lwhash
1253
1254           Return: $flat_version [undef on error]
1255
1256           This function takes a %request or %response libwhisker hash, and
1257           creates an approximate flat data string of the original request/
1258           response (i.e. before it was parsed into components and placed into
1259           the libwhisker hash).
1260
1261       utils_carp
1262           Params: [ $package_name ]
1263
1264           Return: nothing
1265
1266           This function acts like Carp's carp function.  It warns with the
1267           file and line number of user's code which causes a problem.  It
1268           traces up the call stack and reports the first function that is not
1269           in the LW2 or optional $package_name package.
1270
1271       utils_croak
1272           Params: [ $package_name ]
1273
1274           Return: nothing
1275
1276           This function acts like Carp's croak function.  It dies with the
1277           file and line number of user's code which causes a problem.  It
1278           traces up the call stack and reports the first function that is not
1279           in the LW2 or optional $package_name package.
1280

SEE ALSO

1282       LWP
1283

COPYRIGHT

1285       Copyright 2009 Jeff Forristal
1286
1287
1288
12892.5                               2022-01-21                            LW2(3)