1Hash::Util(3pm)        Perl Programmers Reference Guide        Hash::Util(3pm)
2
3
4

NAME

6       Hash::Util - A selection of general-utility hash subroutines
7

SYNOPSIS

9         # Restricted hashes
10
11         use Hash::Util qw(
12                            fieldhash fieldhashes
13
14                            all_keys
15                            lock_keys unlock_keys
16                            lock_value unlock_value
17                            lock_hash unlock_hash
18                            lock_keys_plus
19                            hash_locked hash_unlocked
20                            hashref_locked hashref_unlocked
21                            hidden_keys legal_keys
22
23                            lock_ref_keys unlock_ref_keys
24                            lock_ref_value unlock_ref_value
25                            lock_hashref unlock_hashref
26                            lock_ref_keys_plus
27                            hidden_ref_keys legal_ref_keys
28
29                            hash_seed hash_value hv_store
30                            bucket_stats bucket_info bucket_array
31                            lock_hash_recurse unlock_hash_recurse
32                            lock_hashref_recurse unlock_hashref_recurse
33
34                            hash_traversal_mask
35                          );
36
37         %hash = (foo => 42, bar => 23);
38         # Ways to restrict a hash
39         lock_keys(%hash);
40         lock_keys(%hash, @keyset);
41         lock_keys_plus(%hash, @additional_keys);
42
43         # Ways to inspect the properties of a restricted hash
44         my @legal = legal_keys(%hash);
45         my @hidden = hidden_keys(%hash);
46         my $ref = all_keys(%hash,@keys,@hidden);
47         my $is_locked = hash_locked(%hash);
48
49         # Remove restrictions on the hash
50         unlock_keys(%hash);
51
52         # Lock individual values in a hash
53         lock_value  (%hash, 'foo');
54         unlock_value(%hash, 'foo');
55
56         # Ways to change the restrictions on both keys and values
57         lock_hash  (%hash);
58         unlock_hash(%hash);
59
60         my $hashes_are_randomised = hash_seed() !~ /^\0+$/;
61
62         my $int_hash_value = hash_value( 'string' );
63
64         my $mask= hash_traversal_mask(%hash);
65
66         hash_traversal_mask(%hash,1234);
67

DESCRIPTION

69       "Hash::Util" and "Hash::Util::FieldHash" contain special functions for
70       manipulating hashes that don't really warrant a keyword.
71
72       "Hash::Util" contains a set of functions that support restricted
73       hashes. These are described in this document.  "Hash::Util::FieldHash"
74       contains an (unrelated) set of functions that support the use of hashes
75       in inside-out classes, described in Hash::Util::FieldHash.
76
77       By default "Hash::Util" does not export anything.
78
79   Restricted hashes
80       5.8.0 introduces the ability to restrict a hash to a certain set of
81       keys.  No keys outside of this set can be added.  It also introduces
82       the ability to lock an individual key so it cannot be deleted and the
83       ability to ensure that an individual value cannot be changed.
84
85       This is intended to largely replace the deprecated pseudo-hashes.
86
87       lock_keys
88       unlock_keys
89             lock_keys(%hash);
90             lock_keys(%hash, @keys);
91
92           Restricts the given %hash's set of keys to @keys.  If @keys is not
93           given it restricts it to its current keyset.  No more keys can be
94           added. delete() and exists() will still work, but will not alter
95           the set of allowed keys. Note: the current implementation prevents
96           the hash from being bless()ed while it is in a locked state. Any
97           attempt to do so will raise an exception. Of course you can still
98           bless() the hash before you call lock_keys() so this shouldn't be a
99           problem.
100
101             unlock_keys(%hash);
102
103           Removes the restriction on the %hash's keyset.
104
105           Note that if any of the values of the hash have been locked they
106           will not be unlocked after this sub executes.
107
108           Both routines return a reference to the hash operated on.
109
110       lock_keys_plus
111             lock_keys_plus(%hash,@additional_keys)
112
113           Similar to "lock_keys()", with the difference being that the
114           optional key list specifies keys that may or may not be already in
115           the hash. Essentially this is an easier way to say
116
117             lock_keys(%hash,@additional_keys,keys %hash);
118
119           Returns a reference to %hash
120
121       lock_value
122       unlock_value
123             lock_value  (%hash, $key);
124             unlock_value(%hash, $key);
125
126           Locks and unlocks the value for an individual key of a hash.  The
127           value of a locked key cannot be changed.
128
129           Unless %hash has already been locked the key/value could be deleted
130           regardless of this setting.
131
132           Returns a reference to the %hash.
133
134       lock_hash
135       unlock_hash
136               lock_hash(%hash);
137
138           lock_hash() locks an entire hash, making all keys and values read-
139           only.  No value can be changed, no keys can be added or deleted.
140
141               unlock_hash(%hash);
142
143           unlock_hash() does the opposite of lock_hash().  All keys and
144           values are made writable.  All values can be changed and keys can
145           be added and deleted.
146
147           Returns a reference to the %hash.
148
149       lock_hash_recurse
150       unlock_hash_recurse
151               lock_hash_recurse(%hash);
152
153           lock_hash() locks an entire hash and any hashes it references
154           recursively, making all keys and values read-only. No value can be
155           changed, no keys can be added or deleted.
156
157           This method only recurses into hashes that are referenced by
158           another hash.  Thus a Hash of Hashes (HoH) will all be restricted,
159           but a Hash of Arrays of Hashes (HoAoH) will only have the top hash
160           restricted.
161
162               unlock_hash_recurse(%hash);
163
164           unlock_hash_recurse() does the opposite of lock_hash_recurse().
165           All keys and values are made writable.  All values can be changed
166           and keys can be added and deleted. Identical recursion restrictions
167           apply as to lock_hash_recurse().
168
169           Returns a reference to the %hash.
170
171       hashref_locked
172       hash_locked
173             hashref_locked(\%hash) and print "Hash is locked!\n";
174             hash_locked(%hash) and print "Hash is locked!\n";
175
176           Returns true if the hash and its keys are locked.
177
178       hashref_unlocked
179       hash_unlocked
180             hashref_unlocked(\%hash) and print "Hash is unlocked!\n";
181             hash_unlocked(%hash) and print "Hash is unlocked!\n";
182
183           Returns true if the hash and its keys are unlocked.
184
185       legal_keys
186             my @keys = legal_keys(%hash);
187
188           Returns the list of the keys that are legal in a restricted hash.
189           In the case of an unrestricted hash this is identical to calling
190           keys(%hash).
191
192       hidden_keys
193             my @keys = hidden_keys(%hash);
194
195           Returns the list of the keys that are legal in a restricted hash
196           but do not have a value associated to them. Thus if 'foo' is a
197           "hidden" key of the %hash it will return false for both "defined"
198           and "exists" tests.
199
200           In the case of an unrestricted hash this will return an empty list.
201
202           NOTE this is an experimental feature that is heavily dependent on
203           the current implementation of restricted hashes. Should the
204           implementation change, this routine may become meaningless, in
205           which case it will return an empty list.
206
207       all_keys
208             all_keys(%hash,@keys,@hidden);
209
210           Populates the arrays @keys with the all the keys that would pass an
211           "exists" tests, and populates @hidden with the remaining legal keys
212           that have not been utilized.
213
214           Returns a reference to the hash.
215
216           In the case of an unrestricted hash this will be equivalent to
217
218             $ref = do {
219                 @keys = keys %hash;
220                 @hidden = ();
221                 \%hash
222             };
223
224           NOTE this is an experimental feature that is heavily dependent on
225           the current implementation of restricted hashes. Should the
226           implementation change this routine may become meaningless in which
227           case it will behave identically to how it would behave on an
228           unrestricted hash.
229
230       hash_seed
231               my $hash_seed = hash_seed();
232
233           hash_seed() returns the seed bytes used to randomise hash ordering.
234
235           Note that the hash seed is sensitive information: by knowing it one
236           can craft a denial-of-service attack against Perl code, even
237           remotely, see "Algorithmic Complexity Attacks" in perlsec for more
238           information.  Do not disclose the hash seed to people who don't
239           need to know it.  See also "PERL_HASH_SEED_DEBUG" in perlrun.
240
241           Prior to Perl 5.17.6 this function returned a UV, it now returns a
242           string, which may be of nearly any size as determined by the hash
243           function your Perl has been built with. Possible sizes may be but
244           are not limited to 4 bytes (for most hash algorithms) and 16 bytes
245           (for siphash).
246
247       hash_value
248               my $hash_value = hash_value($string);
249
250           hash_value() returns the current perl's internal hash value for a
251           given string.
252
253           Returns a 32 bit integer representing the hash value of the string
254           passed in. This value is only reliable for the lifetime of the
255           process. It may be different depending on invocation, environment
256           variables,  perl version, architectures, and build options.
257
258           Note that the hash value of a given string is sensitive
259           information: by knowing it one can deduce the hash seed which in
260           turn can allow one to craft a denial-of-service attack against Perl
261           code, even remotely, see "Algorithmic Complexity Attacks" in
262           perlsec for more information.  Do not disclose the hash value of a
263           string to people who don't need to know it. See also
264           "PERL_HASH_SEED_DEBUG" in perlrun.
265
266       bucket_info
267           Return a set of basic information about a hash.
268
269               my ($keys, $buckets, $used, @length_counts)= bucket_info($hash);
270
271           Fields are as follows:
272
273               0: Number of keys in the hash
274               1: Number of buckets in the hash
275               2: Number of used buckets in the hash
276               rest : list of counts, Kth element is the number of buckets
277                      with K keys in it.
278
279           See also bucket_stats() and bucket_array().
280
281       bucket_stats
282           Returns a list of statistics about a hash.
283
284            my ($keys, $buckets, $used, $quality, $utilization_ratio,
285                   $collision_pct, $mean, $stddev, @length_counts)
286               = bucket_stats($hashref);
287
288           Fields are as follows:
289
290               0: Number of keys in the hash
291               1: Number of buckets in the hash
292               2: Number of used buckets in the hash
293               3: Hash Quality Score
294               4: Percent of buckets used
295               5: Percent of keys which are in collision
296               6: Mean bucket length of occupied buckets
297               7: Standard Deviation of bucket lengths of occupied buckets
298               rest : list of counts, Kth element is the number of buckets
299                      with K keys in it.
300
301           See also bucket_info() and bucket_array().
302
303           Note that Hash Quality Score would be 1 for an ideal hash, numbers
304           close to and below 1 indicate good hashing, and number
305           significantly above indicate a poor score. In practice it should be
306           around 0.95 to 1.05.  It is defined as:
307
308            $score= sum( $count[$length] * ($length * ($length + 1) / 2) )
309                       /
310                       ( ( $keys / 2 * $buckets ) *
311                         ( $keys + ( 2 * $buckets ) - 1 ) )
312
313           The formula is from the Red Dragon book (reformulated to use the
314           data available) and is documented at
315           <http://www.strchr.com/hash_functions>
316
317       bucket_array
318               my $array= bucket_array(\%hash);
319
320           Returns a packed representation of the bucket array associated with
321           a hash. Each element of the array is either an integer K, in which
322           case it represents K empty buckets, or a reference to another array
323           which contains the keys that are in that bucket.
324
325           Note that the information returned by bucket_array is sensitive
326           information: by knowing it one can directly attack perl's hash
327           function which in turn may allow one to craft a denial-of-service
328           attack against Perl code, even remotely, see "Algorithmic
329           Complexity Attacks" in perlsec for more information.  Do not
330           disclose the output of this function to people who don't need to
331           know it. See also "PERL_HASH_SEED_DEBUG" in perlrun. This function
332           is provided strictly for  debugging and diagnostics purposes only,
333           it is hard to imagine a reason why it would be used in production
334           code.
335
336       bucket_stats_formatted
337             print bucket_stats_formatted($hashref);
338
339           Return a formatted report of the information returned by
340           bucket_stats().  An example report looks like this:
341
342            Keys: 50 Buckets: 33/64 Quality-Score: 1.01 (Good)
343            Utilized Buckets: 51.56% Optimal: 78.12% Keys In Collision: 34.00%
344            Chain Length - mean: 1.52 stddev: 0.66
345            Buckets 64          [0000000000000000000000000000000111111111111111111122222222222333]
346            Len   0 Pct:  48.44 [###############################]
347            Len   1 Pct:  29.69 [###################]
348            Len   2 Pct:  17.19 [###########]
349            Len   3 Pct:   4.69 [###]
350            Keys    50          [11111111111111111111111111111111122222222222222333]
351            Pos   1 Pct:  66.00 [#################################]
352            Pos   2 Pct:  28.00 [##############]
353            Pos   3 Pct:   6.00 [###]
354
355           The first set of stats gives some summary statistical information,
356           including the quality score translated into "Good", "Poor" and
357           "Bad", (score<=1.05, score<=1.2, score>1.2). See the documentation
358           in bucket_stats() for more details.
359
360           The two sets of barcharts give stats and a visual indication of
361           performance of the hash.
362
363           The first gives data on bucket chain lengths and provides insight
364           on how much work a fetch *miss* will take. In this case we have to
365           inspect every item in a bucket before we can be sure the item is
366           not in the list. The performance for an insert is equivalent to
367           this case, as is a delete where the item is not in the hash.
368
369           The second gives data on how many keys are at each depth in the
370           chain, and gives an idea of how much work a fetch *hit* will take.
371           The performance for an update or delete of an item in the hash is
372           equivalent to this case.
373
374           Note that these statistics are summary only. Actual performance
375           will depend on real hit/miss ratios accessing the hash. If you are
376           concerned by hit ratios you are recommended to "oversize" your hash
377           by using something like:
378
379              keys(%hash)= keys(%hash) << $k;
380
381           With $k chosen carefully, and likely to be a small number like 1 or
382           2. In theory the larger the bucket array the less chance of
383           collision.
384
385       hv_store
386             my $sv = 0;
387             hv_store(%hash,$key,$sv) or die "Failed to alias!";
388             $hash{$key} = 1;
389             print $sv; # prints 1
390
391           Stores an alias to a variable in a hash instead of copying the
392           value.
393
394       hash_traversal_mask
395           As of Perl 5.18 every hash has its own hash traversal order, and
396           this order changes every time a new element is inserted into the
397           hash. This functionality is provided by maintaining an unsigned
398           integer mask (U32) which is xor'ed with the actual bucket id during
399           a traversal of the hash buckets using keys(), values() or each().
400
401           You can use this subroutine to get and set the traversal mask for a
402           specific hash. Setting the mask ensures that a given hash will
403           produce the same key order. Note that this does not guarantee that
404           two hashes will produce the same key order for the same hash seed
405           and traversal mask, items that collide into one bucket may have
406           different orders regardless of this setting.
407
408       bucket_ratio
409           This function behaves the same way that scalar(%hash) behaved prior
410           to Perl 5.25. Specifically if the hash is tied, then it calls the
411           SCALAR tied hash method, if untied then if the hash is empty it
412           return 0, otherwise it returns a string containing the number of
413           used buckets in the hash, followed by a slash, followed by the
414           total number of buckets in the hash.
415
416               my %hash=("foo"=>1);
417               print Hash::Util::bucket_ratio(%hash); # prints "1/8"
418
419       used_buckets
420           This function returns the count of used buckets in the hash. It is
421           expensive to calculate and the value is NOT cached, so avoid use of
422           this function in production code.
423
424       num_buckets
425           This function returns the total number of buckets the hash holds,
426           or would hold if the array were created. (When a hash is freshly
427           created the array may not be allocated even though this value will
428           be non-zero.)
429
430   Operating on references to hashes.
431       Most subroutines documented in this module have equivalent versions
432       that operate on references to hashes instead of native hashes.  The
433       following is a list of these subs. They are identical except in name
434       and in that instead of taking a %hash they take a $hashref, and
435       additionally are not prototyped.
436
437       lock_ref_keys
438       unlock_ref_keys
439       lock_ref_keys_plus
440       lock_ref_value
441       unlock_ref_value
442       lock_hashref
443       unlock_hashref
444       lock_hashref_recurse
445       unlock_hashref_recurse
446       hash_ref_unlocked
447       legal_ref_keys
448       hidden_ref_keys
449

CAVEATS

451       Note that the trapping of the restricted operations is not atomic: for
452       example
453
454           eval { %hash = (illegal_key => 1) }
455
456       leaves the %hash empty rather than with its original contents.
457

BUGS

459       The interface exposed by this module is very close to the current
460       implementation of restricted hashes. Over time it is expected that this
461       behavior will be extended and the interface abstracted further.
462

AUTHOR

464       Michael G Schwern <schwern@pobox.com> on top of code by Nick Ing-
465       Simmons and Jeffrey Friedl.
466
467       hv_store() is from Array::RefElem, Copyright 2000 Gisle Aas.
468
469       Additional code by Yves Orton.
470

SEE ALSO

472       Scalar::Util, List::Util and "Algorithmic Complexity Attacks" in
473       perlsec.
474
475       Hash::Util::FieldHash.
476
477
478
479perl v5.32.1                      2021-03-31                   Hash::Util(3pm)
Impressum