1Hash::Util(3pm)        Perl Programmers Reference Guide        Hash::Util(3pm)
2
3
4

NAME

6       Hash::Util - A selection of general-utility hash subroutines
7

SYNOPSIS

9         # Restricted hashes
10
11         use Hash::Util qw(
12                            fieldhash fieldhashes
13
14                            all_keys
15                            lock_keys unlock_keys
16                            lock_value unlock_value
17                            lock_hash unlock_hash
18                            lock_keys_plus
19                            hash_locked hash_unlocked
20                            hashref_locked hashref_unlocked
21                            hidden_keys legal_keys
22
23                            lock_ref_keys unlock_ref_keys
24                            lock_ref_value unlock_ref_value
25                            lock_hashref unlock_hashref
26                            lock_ref_keys_plus
27                            hidden_ref_keys legal_ref_keys
28
29                            hash_seed hash_value hv_store
30                            bucket_stats bucket_info bucket_array
31                            lock_hash_recurse unlock_hash_recurse
32                            lock_hashref_recurse unlock_hashref_recurse
33
34                            hash_traversal_mask
35                          );
36
37         %hash = (foo => 42, bar => 23);
38         # Ways to restrict a hash
39         lock_keys(%hash);
40         lock_keys(%hash, @keyset);
41         lock_keys_plus(%hash, @additional_keys);
42
43         # Ways to inspect the properties of a restricted hash
44         my @legal = legal_keys(%hash);
45         my @hidden = hidden_keys(%hash);
46         my $ref = all_keys(%hash,@keys,@hidden);
47         my $is_locked = hash_locked(%hash);
48
49         # Remove restrictions on the hash
50         unlock_keys(%hash);
51
52         # Lock individual values in a hash
53         lock_value  (%hash, 'foo');
54         unlock_value(%hash, 'foo');
55
56         # Ways to change the restrictions on both keys and values
57         lock_hash  (%hash);
58         unlock_hash(%hash);
59
60         my $hashes_are_randomised = hash_seed() !~ /^\0+$/;
61
62         my $int_hash_value = hash_value( 'string' );
63
64         my $mask= hash_traversal_mask(%hash);
65
66         hash_traversal_mask(%hash,1234);
67

DESCRIPTION

69       "Hash::Util" and "Hash::Util::FieldHash" contain special functions for
70       manipulating hashes that don't really warrant a keyword.
71
72       "Hash::Util" contains a set of functions that support restricted
73       hashes. These are described in this document.  "Hash::Util::FieldHash"
74       contains an (unrelated) set of functions that support the use of hashes
75       in inside-out classes, described in Hash::Util::FieldHash.
76
77       By default "Hash::Util" does not export anything.
78
79   Restricted hashes
80       5.8.0 introduces the ability to restrict a hash to a certain set of
81       keys.  No keys outside of this set can be added.  It also introduces
82       the ability to lock an individual key so it cannot be deleted and the
83       ability to ensure that an individual value cannot be changed.
84
85       This is intended to largely replace the deprecated pseudo-hashes.
86
87       lock_keys
88       unlock_keys
89             lock_keys(%hash);
90             lock_keys(%hash, @keys);
91
92           Restricts the given %hash's set of keys to @keys.  If @keys is not
93           given it restricts it to its current keyset.  No more keys can be
94           added. delete() and exists() will still work, but will not alter
95           the set of allowed keys. Note: the current implementation prevents
96           the hash from being bless()ed while it is in a locked state. Any
97           attempt to do so will raise an exception. Of course you can still
98           bless() the hash before you call lock_keys() so this shouldn't be a
99           problem.
100
101             unlock_keys(%hash);
102
103           Removes the restriction on the %hash's keyset.
104
105           Note that if any of the values of the hash have been locked they
106           will not be unlocked after this sub executes.
107
108           Both routines return a reference to the hash operated on.
109
110       lock_keys_plus
111             lock_keys_plus(%hash,@additional_keys)
112
113           Similar to "lock_keys()", with the difference being that the
114           optional key list specifies keys that may or may not be already in
115           the hash. Essentially this is an easier way to say
116
117             lock_keys(%hash,@additional_keys,keys %hash);
118
119           Returns a reference to %hash
120
121       lock_value
122       unlock_value
123             lock_value  (%hash, $key);
124             unlock_value(%hash, $key);
125
126           Locks and unlocks the value for an individual key of a hash.  The
127           value of a locked key cannot be changed.
128
129           Unless %hash has already been locked the key/value could be deleted
130           regardless of this setting.
131
132           Returns a reference to the %hash.
133
134       lock_hash
135       unlock_hash
136               lock_hash(%hash);
137
138           lock_hash() locks an entire hash, making all keys and values read-
139           only.  No value can be changed, no keys can be added or deleted.
140
141               unlock_hash(%hash);
142
143           unlock_hash() does the opposite of lock_hash().  All keys and
144           values are made writable.  All values can be changed and keys can
145           be added and deleted.
146
147           Returns a reference to the %hash.
148
149       lock_hash_recurse
150       unlock_hash_recurse
151               lock_hash_recurse(%hash);
152
153           lock_hash() locks an entire hash and any hashes it references
154           recursively, making all keys and values read-only. No value can be
155           changed, no keys can be added or deleted.
156
157           This method only recurses into hashes that are referenced by
158           another hash.  Thus a Hash of Hashes (HoH) will all be restricted,
159           but a Hash of Arrays of Hashes (HoAoH) will only have the top hash
160           restricted.
161
162               unlock_hash_recurse(%hash);
163
164           unlock_hash_recurse() does the opposite of lock_hash_recurse().
165           All keys and values are made writable.  All values can be changed
166           and keys can be added and deleted. Identical recursion restrictions
167           apply as to lock_hash_recurse().
168
169           Returns a reference to the %hash.
170
171       hashref_locked
172       hash_locked
173             hashref_locked(\%hash) and print "Hash is locked!\n";
174             hash_locked(%hash) and print "Hash is locked!\n";
175
176           Returns true if the hash and its keys are locked.
177
178       hashref_unlocked
179       hash_unlocked
180             hashref_unlocked(\%hash) and print "Hash is unlocked!\n";
181             hash_unlocked(%hash) and print "Hash is unlocked!\n";
182
183           Returns true if the hash and its keys are unlocked.
184
185       legal_keys
186             my @keys = legal_keys(%hash);
187
188           Returns the list of the keys that are legal in a restricted hash.
189           In the case of an unrestricted hash this is identical to calling
190           keys(%hash).
191
192       hidden_keys
193             my @keys = hidden_keys(%hash);
194
195           Returns the list of the keys that are legal in a restricted hash
196           but do not have a value associated to them. Thus if 'foo' is a
197           "hidden" key of the %hash it will return false for both "defined"
198           and "exists" tests.
199
200           In the case of an unrestricted hash this will return an empty list.
201
202           NOTE this is an experimental feature that is heavily dependent on
203           the current implementation of restricted hashes. Should the
204           implementation change, this routine may become meaningless, in
205           which case it will return an empty list.
206
207       all_keys
208             all_keys(%hash,@keys,@hidden);
209
210           Populates the arrays @keys with the all the keys that would pass an
211           "exists" tests, and populates @hidden with the remaining legal keys
212           that have not been utilized.
213
214           Returns a reference to the hash.
215
216           In the case of an unrestricted hash this will be equivalent to
217
218             $ref = do {
219                 @keys = keys %hash;
220                 @hidden = ();
221                 \%hash
222             };
223
224           NOTE this is an experimental feature that is heavily dependent on
225           the current implementation of restricted hashes. Should the
226           implementation change this routine may become meaningless in which
227           case it will behave identically to how it would behave on an
228           unrestricted hash.
229
230       hash_seed
231               my $hash_seed = hash_seed();
232
233           hash_seed() returns the seed bytes used to randomise hash ordering.
234
235           Note that the hash seed is sensitive information: by knowing it one
236           can craft a denial-of-service attack against Perl code, even
237           remotely, see "Algorithmic Complexity Attacks" in perlsec for more
238           information.  Do not disclose the hash seed to people who don't
239           need to know it.  See also "PERL_HASH_SEED_DEBUG" in perlrun.
240
241           Prior to Perl 5.17.6 this function returned a UV, it now returns a
242           string, which may be of nearly any size as determined by the hash
243           function your Perl has been built with. Possible sizes may be but
244           are not limited to 4 bytes (for most hash algorithms) and 16 bytes
245           (for siphash).
246
247       hash_value
248               my $hash_value = hash_value($string);
249               my $hash_value = hash_value($string, $seed);
250
251           "hash_value($string)" returns the current perl's internal hash
252           value for a given string.  "hash_value($string, $seed)" returns the
253           hash value as if computed with a different seed.  If the custom
254           seed is too short, the function errors out.  The minimum length of
255           the seed is implementation-dependent.
256
257           Returns a 32-bit integer representing the hash value of the string
258           passed in.  The 1-parameter value is only reliable for the lifetime
259           of the process.  It may be different depending on invocation,
260           environment variables, perl version, architectures, and build
261           options.
262
263           Note that the hash value of a given string is sensitive
264           information: by knowing it one can deduce the hash seed which in
265           turn can allow one to craft a denial-of-service attack against Perl
266           code, even remotely, see "Algorithmic Complexity Attacks" in
267           perlsec for more information.  Do not disclose the hash value of a
268           string to people who don't need to know it. See also
269           "PERL_HASH_SEED_DEBUG" in perlrun.
270
271       bucket_info
272           Return a set of basic information about a hash.
273
274               my ($keys, $buckets, $used, @length_counts)= bucket_info($hash);
275
276           Fields are as follows:
277
278               0: Number of keys in the hash
279               1: Number of buckets in the hash
280               2: Number of used buckets in the hash
281               rest : list of counts, Kth element is the number of buckets
282                      with K keys in it.
283
284           See also bucket_stats() and bucket_array().
285
286       bucket_stats
287           Returns a list of statistics about a hash.
288
289            my ($keys, $buckets, $used, $quality, $utilization_ratio,
290                   $collision_pct, $mean, $stddev, @length_counts)
291               = bucket_stats($hashref);
292
293           Fields are as follows:
294
295               0: Number of keys in the hash
296               1: Number of buckets in the hash
297               2: Number of used buckets in the hash
298               3: Hash Quality Score
299               4: Percent of buckets used
300               5: Percent of keys which are in collision
301               6: Mean bucket length of occupied buckets
302               7: Standard Deviation of bucket lengths of occupied buckets
303               rest : list of counts, Kth element is the number of buckets
304                      with K keys in it.
305
306           See also bucket_info() and bucket_array().
307
308           Note that Hash Quality Score would be 1 for an ideal hash, numbers
309           close to and below 1 indicate good hashing, and number
310           significantly above indicate a poor score. In practice it should be
311           around 0.95 to 1.05.  It is defined as:
312
313            $score= sum( $count[$length] * ($length * ($length + 1) / 2) )
314                       /
315                       ( ( $keys / 2 * $buckets ) *
316                         ( $keys + ( 2 * $buckets ) - 1 ) )
317
318           The formula is from the Red Dragon book (reformulated to use the
319           data available) and is documented at
320           <http://www.strchr.com/hash_functions>
321
322       bucket_array
323               my $array= bucket_array(\%hash);
324
325           Returns a packed representation of the bucket array associated with
326           a hash. Each element of the array is either an integer K, in which
327           case it represents K empty buckets, or a reference to another array
328           which contains the keys that are in that bucket.
329
330           Note that the information returned by bucket_array is sensitive
331           information: by knowing it one can directly attack perl's hash
332           function which in turn may allow one to craft a denial-of-service
333           attack against Perl code, even remotely, see "Algorithmic
334           Complexity Attacks" in perlsec for more information.  Do not
335           disclose the output of this function to people who don't need to
336           know it. See also "PERL_HASH_SEED_DEBUG" in perlrun. This function
337           is provided strictly for  debugging and diagnostics purposes only,
338           it is hard to imagine a reason why it would be used in production
339           code.
340
341       bucket_stats_formatted
342             print bucket_stats_formatted($hashref);
343
344           Return a formatted report of the information returned by
345           bucket_stats().  An example report looks like this:
346
347            Keys: 50 Buckets: 33/64 Quality-Score: 1.01 (Good)
348            Utilized Buckets: 51.56% Optimal: 78.12% Keys In Collision: 34.00%
349            Chain Length - mean: 1.52 stddev: 0.66
350            Buckets 64          [0000000000000000000000000000000111111111111111111122222222222333]
351            Len   0 Pct:  48.44 [###############################]
352            Len   1 Pct:  29.69 [###################]
353            Len   2 Pct:  17.19 [###########]
354            Len   3 Pct:   4.69 [###]
355            Keys    50          [11111111111111111111111111111111122222222222222333]
356            Pos   1 Pct:  66.00 [#################################]
357            Pos   2 Pct:  28.00 [##############]
358            Pos   3 Pct:   6.00 [###]
359
360           The first set of stats gives some summary statistical information,
361           including the quality score translated into "Good", "Poor" and
362           "Bad", (score<=1.05, score<=1.2, score>1.2). See the documentation
363           in bucket_stats() for more details.
364
365           The two sets of barcharts give stats and a visual indication of
366           performance of the hash.
367
368           The first gives data on bucket chain lengths and provides insight
369           on how much work a fetch *miss* will take. In this case we have to
370           inspect every item in a bucket before we can be sure the item is
371           not in the list. The performance for an insert is equivalent to
372           this case, as is a delete where the item is not in the hash.
373
374           The second gives data on how many keys are at each depth in the
375           chain, and gives an idea of how much work a fetch *hit* will take.
376           The performance for an update or delete of an item in the hash is
377           equivalent to this case.
378
379           Note that these statistics are summary only. Actual performance
380           will depend on real hit/miss ratios accessing the hash. If you are
381           concerned by hit ratios you are recommended to "oversize" your hash
382           by using something like:
383
384              keys(%hash)= keys(%hash) << $k;
385
386           With $k chosen carefully, and likely to be a small number like 1 or
387           2. In theory the larger the bucket array the less chance of
388           collision.
389
390       hv_store
391             my $sv = 0;
392             hv_store(%hash,$key,$sv) or die "Failed to alias!";
393             $hash{$key} = 1;
394             print $sv; # prints 1
395
396           Stores an alias to a variable in a hash instead of copying the
397           value.
398
399       hash_traversal_mask
400           As of Perl 5.18 every hash has its own hash traversal order, and
401           this order changes every time a new element is inserted into the
402           hash. This functionality is provided by maintaining an unsigned
403           integer mask (U32) which is xor'ed with the actual bucket id during
404           a traversal of the hash buckets using keys(), values() or each().
405
406           You can use this subroutine to get and set the traversal mask for a
407           specific hash. Setting the mask ensures that a given hash will
408           produce the same key order. Note that this does not guarantee that
409           two hashes will produce the same key order for the same hash seed
410           and traversal mask, items that collide into one bucket may have
411           different orders regardless of this setting.
412
413       bucket_ratio
414           This function behaves the same way that scalar(%hash) behaved prior
415           to Perl 5.25. Specifically if the hash is tied, then it calls the
416           SCALAR tied hash method, if untied then if the hash is empty it
417           return 0, otherwise it returns a string containing the number of
418           used buckets in the hash, followed by a slash, followed by the
419           total number of buckets in the hash.
420
421               my %hash=("foo"=>1);
422               print Hash::Util::bucket_ratio(%hash); # prints "1/8"
423
424       used_buckets
425           This function returns the count of used buckets in the hash. It is
426           expensive to calculate and the value is NOT cached, so avoid use of
427           this function in production code.
428
429       num_buckets
430           This function returns the total number of buckets the hash holds,
431           or would hold if the array were created. (When a hash is freshly
432           created the array may not be allocated even though this value will
433           be non-zero.)
434
435   Operating on references to hashes.
436       Most subroutines documented in this module have equivalent versions
437       that operate on references to hashes instead of native hashes.  The
438       following is a list of these subs. They are identical except in name
439       and in that instead of taking a %hash they take a $hashref, and
440       additionally are not prototyped.
441
442       lock_ref_keys
443       unlock_ref_keys
444       lock_ref_keys_plus
445       lock_ref_value
446       unlock_ref_value
447       lock_hashref
448       unlock_hashref
449       lock_hashref_recurse
450       unlock_hashref_recurse
451       hash_ref_unlocked
452       legal_ref_keys
453       hidden_ref_keys
454

CAVEATS

456       Note that the trapping of the restricted operations is not atomic: for
457       example
458
459           eval { %hash = (illegal_key => 1) }
460
461       leaves the %hash empty rather than with its original contents.
462

BUGS

464       The interface exposed by this module is very close to the current
465       implementation of restricted hashes. Over time it is expected that this
466       behavior will be extended and the interface abstracted further.
467

AUTHOR

469       Michael G Schwern <schwern@pobox.com> on top of code by Nick Ing-
470       Simmons and Jeffrey Friedl.
471
472       hv_store() is from Array::RefElem, Copyright 2000 Gisle Aas.
473
474       Additional code by Yves Orton.
475
476       Description of "hash_value($string, $seed)" by Christopher Yeleighton
477       <ne01026@shark.2a.pl>
478

SEE ALSO

480       Scalar::Util, List::Util and "Algorithmic Complexity Attacks" in
481       perlsec.
482
483       Hash::Util::FieldHash.
484
485
486
487perl v5.36.0                      2022-08-30                   Hash::Util(3pm)
Impressum