1Hash::Util(3pm) Perl Programmers Reference Guide Hash::Util(3pm)
2
3
4
6 Hash::Util - A selection of general-utility hash subroutines
7
9 # Restricted hashes
10
11 use Hash::Util qw(
12 fieldhash fieldhashes
13
14 all_keys
15 lock_keys unlock_keys
16 lock_value unlock_value
17 lock_hash unlock_hash
18 lock_keys_plus
19 hash_locked hash_unlocked
20 hashref_locked hashref_unlocked
21 hidden_keys legal_keys
22
23 lock_ref_keys unlock_ref_keys
24 lock_ref_value unlock_ref_value
25 lock_hashref unlock_hashref
26 lock_ref_keys_plus
27 hidden_ref_keys legal_ref_keys
28
29 hash_seed hash_value hv_store
30 bucket_stats bucket_info bucket_array
31 lock_hash_recurse unlock_hash_recurse
32 lock_hashref_recurse unlock_hashref_recurse
33
34 hash_traversal_mask
35 );
36
37 %hash = (foo => 42, bar => 23);
38 # Ways to restrict a hash
39 lock_keys(%hash);
40 lock_keys(%hash, @keyset);
41 lock_keys_plus(%hash, @additional_keys);
42
43 # Ways to inspect the properties of a restricted hash
44 my @legal = legal_keys(%hash);
45 my @hidden = hidden_keys(%hash);
46 my $ref = all_keys(%hash,@keys,@hidden);
47 my $is_locked = hash_locked(%hash);
48
49 # Remove restrictions on the hash
50 unlock_keys(%hash);
51
52 # Lock individual values in a hash
53 lock_value (%hash, 'foo');
54 unlock_value(%hash, 'foo');
55
56 # Ways to change the restrictions on both keys and values
57 lock_hash (%hash);
58 unlock_hash(%hash);
59
60 my $hashes_are_randomised = hash_seed() !~ /^\0+$/;
61
62 my $int_hash_value = hash_value( 'string' );
63
64 my $mask= hash_traversal_mask(%hash);
65
66 hash_traversal_mask(%hash,1234);
67
69 "Hash::Util" and "Hash::Util::FieldHash" contain special functions for
70 manipulating hashes that don't really warrant a keyword.
71
72 "Hash::Util" contains a set of functions that support restricted
73 hashes. These are described in this document. "Hash::Util::FieldHash"
74 contains an (unrelated) set of functions that support the use of hashes
75 in inside-out classes, described in Hash::Util::FieldHash.
76
77 By default "Hash::Util" does not export anything.
78
79 Restricted hashes
80 5.8.0 introduces the ability to restrict a hash to a certain set of
81 keys. No keys outside of this set can be added. It also introduces
82 the ability to lock an individual key so it cannot be deleted and the
83 ability to ensure that an individual value cannot be changed.
84
85 This is intended to largely replace the deprecated pseudo-hashes.
86
87 lock_keys
88 unlock_keys
89 lock_keys(%hash);
90 lock_keys(%hash, @keys);
91
92 Restricts the given %hash's set of keys to @keys. If @keys is not
93 given it restricts it to its current keyset. No more keys can be
94 added. delete() and exists() will still work, but will not alter
95 the set of allowed keys. Note: the current implementation prevents
96 the hash from being bless()ed while it is in a locked state. Any
97 attempt to do so will raise an exception. Of course you can still
98 bless() the hash before you call lock_keys() so this shouldn't be a
99 problem.
100
101 unlock_keys(%hash);
102
103 Removes the restriction on the %hash's keyset.
104
105 Note that if any of the values of the hash have been locked they
106 will not be unlocked after this sub executes.
107
108 Both routines return a reference to the hash operated on.
109
110 lock_keys_plus
111 lock_keys_plus(%hash,@additional_keys)
112
113 Similar to "lock_keys()", with the difference being that the
114 optional key list specifies keys that may or may not be already in
115 the hash. Essentially this is an easier way to say
116
117 lock_keys(%hash,@additional_keys,keys %hash);
118
119 Returns a reference to %hash
120
121 lock_value
122 unlock_value
123 lock_value (%hash, $key);
124 unlock_value(%hash, $key);
125
126 Locks and unlocks the value for an individual key of a hash. The
127 value of a locked key cannot be changed.
128
129 Unless %hash has already been locked the key/value could be deleted
130 regardless of this setting.
131
132 Returns a reference to the %hash.
133
134 lock_hash
135 unlock_hash
136 lock_hash(%hash);
137
138 lock_hash() locks an entire hash, making all keys and values read-
139 only. No value can be changed, no keys can be added or deleted.
140
141 unlock_hash(%hash);
142
143 unlock_hash() does the opposite of lock_hash(). All keys and
144 values are made writable. All values can be changed and keys can
145 be added and deleted.
146
147 Returns a reference to the %hash.
148
149 lock_hash_recurse
150 unlock_hash_recurse
151 lock_hash_recurse(%hash);
152
153 lock_hash() locks an entire hash and any hashes it references
154 recursively, making all keys and values read-only. No value can be
155 changed, no keys can be added or deleted.
156
157 This method only recurses into hashes that are referenced by
158 another hash. Thus a Hash of Hashes (HoH) will all be restricted,
159 but a Hash of Arrays of Hashes (HoAoH) will only have the top hash
160 restricted.
161
162 unlock_hash_recurse(%hash);
163
164 unlock_hash_recurse() does the opposite of lock_hash_recurse().
165 All keys and values are made writable. All values can be changed
166 and keys can be added and deleted. Identical recursion restrictions
167 apply as to lock_hash_recurse().
168
169 Returns a reference to the %hash.
170
171 hashref_locked
172 hash_locked
173 hashref_locked(\%hash) and print "Hash is locked!\n";
174 hash_locked(%hash) and print "Hash is locked!\n";
175
176 Returns true if the hash and its keys are locked.
177
178 hashref_unlocked
179 hash_unlocked
180 hashref_unlocked(\%hash) and print "Hash is unlocked!\n";
181 hash_unlocked(%hash) and print "Hash is unlocked!\n";
182
183 Returns true if the hash and its keys are unlocked.
184
185 legal_keys
186 my @keys = legal_keys(%hash);
187
188 Returns the list of the keys that are legal in a restricted hash.
189 In the case of an unrestricted hash this is identical to calling
190 keys(%hash).
191
192 hidden_keys
193 my @keys = hidden_keys(%hash);
194
195 Returns the list of the keys that are legal in a restricted hash
196 but do not have a value associated to them. Thus if 'foo' is a
197 "hidden" key of the %hash it will return false for both "defined"
198 and "exists" tests.
199
200 In the case of an unrestricted hash this will return an empty list.
201
202 NOTE this is an experimental feature that is heavily dependent on
203 the current implementation of restricted hashes. Should the
204 implementation change, this routine may become meaningless, in
205 which case it will return an empty list.
206
207 all_keys
208 all_keys(%hash,@keys,@hidden);
209
210 Populates the arrays @keys with the all the keys that would pass an
211 "exists" tests, and populates @hidden with the remaining legal keys
212 that have not been utilized.
213
214 Returns a reference to the hash.
215
216 In the case of an unrestricted hash this will be equivalent to
217
218 $ref = do {
219 @keys = keys %hash;
220 @hidden = ();
221 \%hash
222 };
223
224 NOTE this is an experimental feature that is heavily dependent on
225 the current implementation of restricted hashes. Should the
226 implementation change this routine may become meaningless in which
227 case it will behave identically to how it would behave on an
228 unrestricted hash.
229
230 hash_seed
231 my $hash_seed = hash_seed();
232
233 hash_seed() returns the seed bytes used to randomise hash ordering.
234
235 Note that the hash seed is sensitive information: by knowing it one
236 can craft a denial-of-service attack against Perl code, even
237 remotely, see "Algorithmic Complexity Attacks" in perlsec for more
238 information. Do not disclose the hash seed to people who don't
239 need to know it. See also "PERL_HASH_SEED_DEBUG" in perlrun.
240
241 Prior to Perl 5.17.6 this function returned a UV, it now returns a
242 string, which may be of nearly any size as determined by the hash
243 function your Perl has been built with. Possible sizes may be but
244 are not limited to 4 bytes (for most hash algorithms) and 16 bytes
245 (for siphash).
246
247 hash_value
248 my $hash_value = hash_value($string);
249 my $hash_value = hash_value($string, $seed);
250
251 "hash_value($string)" returns the current perl's internal hash
252 value for a given string. "hash_value($string, $seed)" returns the
253 hash value as if computed with a different seed. If the custom
254 seed is too short, the function errors out. The minimum length of
255 the seed is implementation-dependent.
256
257 Returns a 32-bit integer representing the hash value of the string
258 passed in. The 1-parameter value is only reliable for the lifetime
259 of the process. It may be different depending on invocation,
260 environment variables, perl version, architectures, and build
261 options.
262
263 Note that the hash value of a given string is sensitive
264 information: by knowing it one can deduce the hash seed which in
265 turn can allow one to craft a denial-of-service attack against Perl
266 code, even remotely, see "Algorithmic Complexity Attacks" in
267 perlsec for more information. Do not disclose the hash value of a
268 string to people who don't need to know it. See also
269 "PERL_HASH_SEED_DEBUG" in perlrun.
270
271 bucket_info
272 Return a set of basic information about a hash.
273
274 my ($keys, $buckets, $used, @length_counts)= bucket_info($hash);
275
276 Fields are as follows:
277
278 0: Number of keys in the hash
279 1: Number of buckets in the hash
280 2: Number of used buckets in the hash
281 rest : list of counts, Kth element is the number of buckets
282 with K keys in it.
283
284 See also bucket_stats() and bucket_array().
285
286 bucket_stats
287 Returns a list of statistics about a hash.
288
289 my ($keys, $buckets, $used, $quality, $utilization_ratio,
290 $collision_pct, $mean, $stddev, @length_counts)
291 = bucket_stats($hashref);
292
293 Fields are as follows:
294
295 0: Number of keys in the hash
296 1: Number of buckets in the hash
297 2: Number of used buckets in the hash
298 3: Hash Quality Score
299 4: Percent of buckets used
300 5: Percent of keys which are in collision
301 6: Mean bucket length of occupied buckets
302 7: Standard Deviation of bucket lengths of occupied buckets
303 rest : list of counts, Kth element is the number of buckets
304 with K keys in it.
305
306 See also bucket_info() and bucket_array().
307
308 Note that Hash Quality Score would be 1 for an ideal hash, numbers
309 close to and below 1 indicate good hashing, and number
310 significantly above indicate a poor score. In practice it should be
311 around 0.95 to 1.05. It is defined as:
312
313 $score= sum( $count[$length] * ($length * ($length + 1) / 2) )
314 /
315 ( ( $keys / 2 * $buckets ) *
316 ( $keys + ( 2 * $buckets ) - 1 ) )
317
318 The formula is from the Red Dragon book (reformulated to use the
319 data available) and is documented at
320 <http://www.strchr.com/hash_functions>
321
322 bucket_array
323 my $array= bucket_array(\%hash);
324
325 Returns a packed representation of the bucket array associated with
326 a hash. Each element of the array is either an integer K, in which
327 case it represents K empty buckets, or a reference to another array
328 which contains the keys that are in that bucket.
329
330 Note that the information returned by bucket_array is sensitive
331 information: by knowing it one can directly attack perl's hash
332 function which in turn may allow one to craft a denial-of-service
333 attack against Perl code, even remotely, see "Algorithmic
334 Complexity Attacks" in perlsec for more information. Do not
335 disclose the output of this function to people who don't need to
336 know it. See also "PERL_HASH_SEED_DEBUG" in perlrun. This function
337 is provided strictly for debugging and diagnostics purposes only,
338 it is hard to imagine a reason why it would be used in production
339 code.
340
341 bucket_stats_formatted
342 print bucket_stats_formatted($hashref);
343
344 Return a formatted report of the information returned by
345 bucket_stats(). An example report looks like this:
346
347 Keys: 50 Buckets: 33/64 Quality-Score: 1.01 (Good)
348 Utilized Buckets: 51.56% Optimal: 78.12% Keys In Collision: 34.00%
349 Chain Length - mean: 1.52 stddev: 0.66
350 Buckets 64 [0000000000000000000000000000000111111111111111111122222222222333]
351 Len 0 Pct: 48.44 [###############################]
352 Len 1 Pct: 29.69 [###################]
353 Len 2 Pct: 17.19 [###########]
354 Len 3 Pct: 4.69 [###]
355 Keys 50 [11111111111111111111111111111111122222222222222333]
356 Pos 1 Pct: 66.00 [#################################]
357 Pos 2 Pct: 28.00 [##############]
358 Pos 3 Pct: 6.00 [###]
359
360 The first set of stats gives some summary statistical information,
361 including the quality score translated into "Good", "Poor" and
362 "Bad", (score<=1.05, score<=1.2, score>1.2). See the documentation
363 in bucket_stats() for more details.
364
365 The two sets of barcharts give stats and a visual indication of
366 performance of the hash.
367
368 The first gives data on bucket chain lengths and provides insight
369 on how much work a fetch *miss* will take. In this case we have to
370 inspect every item in a bucket before we can be sure the item is
371 not in the list. The performance for an insert is equivalent to
372 this case, as is a delete where the item is not in the hash.
373
374 The second gives data on how many keys are at each depth in the
375 chain, and gives an idea of how much work a fetch *hit* will take.
376 The performance for an update or delete of an item in the hash is
377 equivalent to this case.
378
379 Note that these statistics are summary only. Actual performance
380 will depend on real hit/miss ratios accessing the hash. If you are
381 concerned by hit ratios you are recommended to "oversize" your hash
382 by using something like:
383
384 keys(%hash)= keys(%hash) << $k;
385
386 With $k chosen carefully, and likely to be a small number like 1 or
387 2. In theory the larger the bucket array the less chance of
388 collision.
389
390 hv_store
391 my $sv = 0;
392 hv_store(%hash,$key,$sv) or die "Failed to alias!";
393 $hash{$key} = 1;
394 print $sv; # prints 1
395
396 Stores an alias to a variable in a hash instead of copying the
397 value.
398
399 hash_traversal_mask
400 As of Perl 5.18 every hash has its own hash traversal order, and
401 this order changes every time a new element is inserted into the
402 hash. This functionality is provided by maintaining an unsigned
403 integer mask (U32) which is xor'ed with the actual bucket id during
404 a traversal of the hash buckets using keys(), values() or each().
405
406 You can use this subroutine to get and set the traversal mask for a
407 specific hash. Setting the mask ensures that a given hash will
408 produce the same key order. Note that this does not guarantee that
409 two hashes will produce the same key order for the same hash seed
410 and traversal mask, items that collide into one bucket may have
411 different orders regardless of this setting.
412
413 bucket_ratio
414 This function behaves the same way that scalar(%hash) behaved prior
415 to Perl 5.25. Specifically if the hash is tied, then it calls the
416 SCALAR tied hash method, if untied then if the hash is empty it
417 return 0, otherwise it returns a string containing the number of
418 used buckets in the hash, followed by a slash, followed by the
419 total number of buckets in the hash.
420
421 my %hash=("foo"=>1);
422 print Hash::Util::bucket_ratio(%hash); # prints "1/8"
423
424 used_buckets
425 This function returns the count of used buckets in the hash. It is
426 expensive to calculate and the value is NOT cached, so avoid use of
427 this function in production code.
428
429 num_buckets
430 This function returns the total number of buckets the hash holds,
431 or would hold if the array were created. (When a hash is freshly
432 created the array may not be allocated even though this value will
433 be non-zero.)
434
435 Operating on references to hashes.
436 Most subroutines documented in this module have equivalent versions
437 that operate on references to hashes instead of native hashes. The
438 following is a list of these subs. They are identical except in name
439 and in that instead of taking a %hash they take a $hashref, and
440 additionally are not prototyped.
441
442 lock_ref_keys
443 unlock_ref_keys
444 lock_ref_keys_plus
445 lock_ref_value
446 unlock_ref_value
447 lock_hashref
448 unlock_hashref
449 lock_hashref_recurse
450 unlock_hashref_recurse
451 hash_ref_unlocked
452 legal_ref_keys
453 hidden_ref_keys
454
456 Note that the trapping of the restricted operations is not atomic: for
457 example
458
459 eval { %hash = (illegal_key => 1) }
460
461 leaves the %hash empty rather than with its original contents.
462
464 The interface exposed by this module is very close to the current
465 implementation of restricted hashes. Over time it is expected that this
466 behavior will be extended and the interface abstracted further.
467
469 Michael G Schwern <schwern@pobox.com> on top of code by Nick Ing-
470 Simmons and Jeffrey Friedl.
471
472 hv_store() is from Array::RefElem, Copyright 2000 Gisle Aas.
473
474 Additional code by Yves Orton.
475
476 Description of "hash_value($string, $seed)" by Christopher Yeleighton
477 <ne01026@shark.2a.pl>
478
480 Scalar::Util, List::Util and "Algorithmic Complexity Attacks" in
481 perlsec.
482
483 Hash::Util::FieldHash.
484
485
486
487perl v5.36.3 2023-11-30 Hash::Util(3pm)