1Hash::Util(3pm) Perl Programmers Reference Guide Hash::Util(3pm)
2
3
4
6 Hash::Util - A selection of general-utility hash subroutines
7
9 # Restricted hashes
10
11 use Hash::Util qw(
12 fieldhash fieldhashes
13
14 all_keys
15 lock_keys unlock_keys
16 lock_value unlock_value
17 lock_hash unlock_hash
18 lock_keys_plus
19 hash_locked hash_unlocked
20 hashref_locked hashref_unlocked
21 hidden_keys legal_keys
22
23 lock_ref_keys unlock_ref_keys
24 lock_ref_value unlock_ref_value
25 lock_hashref unlock_hashref
26 lock_ref_keys_plus
27 hidden_ref_keys legal_ref_keys
28
29 hash_seed hash_value hv_store
30 bucket_stats bucket_info bucket_array
31 lock_hash_recurse unlock_hash_recurse
32 lock_hashref_recurse unlock_hashref_recurse
33
34 hash_traversal_mask
35 );
36
37 %hash = (foo => 42, bar => 23);
38 # Ways to restrict a hash
39 lock_keys(%hash);
40 lock_keys(%hash, @keyset);
41 lock_keys_plus(%hash, @additional_keys);
42
43 # Ways to inspect the properties of a restricted hash
44 my @legal = legal_keys(%hash);
45 my @hidden = hidden_keys(%hash);
46 my $ref = all_keys(%hash,@keys,@hidden);
47 my $is_locked = hash_locked(%hash);
48
49 # Remove restrictions on the hash
50 unlock_keys(%hash);
51
52 # Lock individual values in a hash
53 lock_value (%hash, 'foo');
54 unlock_value(%hash, 'foo');
55
56 # Ways to change the restrictions on both keys and values
57 lock_hash (%hash);
58 unlock_hash(%hash);
59
60 my $hashes_are_randomised = hash_seed() != 0;
61
62 my $int_hash_value = hash_value( 'string' );
63
64 my $mask= hash_traversal_mask(%hash);
65
66 hash_traversal_mask(%hash,1234);
67
69 "Hash::Util" and "Hash::Util::FieldHash" contain special functions for
70 manipulating hashes that don't really warrant a keyword.
71
72 "Hash::Util" contains a set of functions that support restricted
73 hashes. These are described in this document. "Hash::Util::FieldHash"
74 contains an (unrelated) set of functions that support the use of hashes
75 in inside-out classes, described in Hash::Util::FieldHash.
76
77 By default "Hash::Util" does not export anything.
78
79 Restricted hashes
80 5.8.0 introduces the ability to restrict a hash to a certain set of
81 keys. No keys outside of this set can be added. It also introduces
82 the ability to lock an individual key so it cannot be deleted and the
83 ability to ensure that an individual value cannot be changed.
84
85 This is intended to largely replace the deprecated pseudo-hashes.
86
87 lock_keys
88 unlock_keys
89 lock_keys(%hash);
90 lock_keys(%hash, @keys);
91
92 Restricts the given %hash's set of keys to @keys. If @keys is not
93 given it restricts it to its current keyset. No more keys can be
94 added. delete() and exists() will still work, but will not alter
95 the set of allowed keys. Note: the current implementation prevents
96 the hash from being bless()ed while it is in a locked state. Any
97 attempt to do so will raise an exception. Of course you can still
98 bless() the hash before you call lock_keys() so this shouldn't be a
99 problem.
100
101 unlock_keys(%hash);
102
103 Removes the restriction on the %hash's keyset.
104
105 Note that if any of the values of the hash have been locked they
106 will not be unlocked after this sub executes.
107
108 Both routines return a reference to the hash operated on.
109
110 lock_keys_plus
111 lock_keys_plus(%hash,@additional_keys)
112
113 Similar to "lock_keys()", with the difference being that the
114 optional key list specifies keys that may or may not be already in
115 the hash. Essentially this is an easier way to say
116
117 lock_keys(%hash,@additional_keys,keys %hash);
118
119 Returns a reference to %hash
120
121 lock_value
122 unlock_value
123 lock_value (%hash, $key);
124 unlock_value(%hash, $key);
125
126 Locks and unlocks the value for an individual key of a hash. The
127 value of a locked key cannot be changed.
128
129 Unless %hash has already been locked the key/value could be deleted
130 regardless of this setting.
131
132 Returns a reference to the %hash.
133
134 lock_hash
135 unlock_hash
136 lock_hash(%hash);
137
138 lock_hash() locks an entire hash, making all keys and values read-
139 only. No value can be changed, no keys can be added or deleted.
140
141 unlock_hash(%hash);
142
143 unlock_hash() does the opposite of lock_hash(). All keys and
144 values are made writable. All values can be changed and keys can
145 be added and deleted.
146
147 Returns a reference to the %hash.
148
149 lock_hash_recurse
150 unlock_hash_recurse
151 lock_hash_recurse(%hash);
152
153 lock_hash() locks an entire hash and any hashes it references
154 recursively, making all keys and values read-only. No value can be
155 changed, no keys can be added or deleted.
156
157 This method only recurses into hashes that are referenced by
158 another hash. Thus a Hash of Hashes (HoH) will all be restricted,
159 but a Hash of Arrays of Hashes (HoAoH) will only have the top hash
160 restricted.
161
162 unlock_hash_recurse(%hash);
163
164 unlock_hash_recurse() does the opposite of lock_hash_recurse().
165 All keys and values are made writable. All values can be changed
166 and keys can be added and deleted. Identical recursion restrictions
167 apply as to lock_hash_recurse().
168
169 Returns a reference to the %hash.
170
171 hashref_locked
172 hash_locked
173 hashref_locked(\%hash) and print "Hash is locked!\n";
174 hash_locked(%hash) and print "Hash is locked!\n";
175
176 Returns true if the hash and its keys are locked.
177
178 hashref_unlocked
179 hash_unlocked
180 hashref_unlocked(\%hash) and print "Hash is unlocked!\n";
181 hash_unlocked(%hash) and print "Hash is unlocked!\n";
182
183 Returns true if the hash and its keys are unlocked.
184
185 legal_keys
186 my @keys = legal_keys(%hash);
187
188 Returns the list of the keys that are legal in a restricted hash.
189 In the case of an unrestricted hash this is identical to calling
190 keys(%hash).
191
192 hidden_keys
193 my @keys = hidden_keys(%hash);
194
195 Returns the list of the keys that are legal in a restricted hash
196 but do not have a value associated to them. Thus if 'foo' is a
197 "hidden" key of the %hash it will return false for both "defined"
198 and "exists" tests.
199
200 In the case of an unrestricted hash this will return an empty list.
201
202 NOTE this is an experimental feature that is heavily dependent on
203 the current implementation of restricted hashes. Should the
204 implementation change, this routine may become meaningless, in
205 which case it will return an empty list.
206
207 all_keys
208 all_keys(%hash,@keys,@hidden);
209
210 Populates the arrays @keys with the all the keys that would pass an
211 "exists" tests, and populates @hidden with the remaining legal keys
212 that have not been utilized.
213
214 Returns a reference to the hash.
215
216 In the case of an unrestricted hash this will be equivalent to
217
218 $ref = do {
219 @keys = keys %hash;
220 @hidden = ();
221 \%hash
222 };
223
224 NOTE this is an experimental feature that is heavily dependent on
225 the current implementation of restricted hashes. Should the
226 implementation change this routine may become meaningless in which
227 case it will behave identically to how it would behave on an
228 unrestricted hash.
229
230 hash_seed
231 my $hash_seed = hash_seed();
232
233 hash_seed() returns the seed bytes used to randomise hash ordering.
234
235 Note that the hash seed is sensitive information: by knowing it one
236 can craft a denial-of-service attack against Perl code, even
237 remotely, see "Algorithmic Complexity Attacks" in perlsec for more
238 information. Do not disclose the hash seed to people who don't
239 need to know it. See also "PERL_HASH_SEED_DEBUG" in perlrun.
240
241 Prior to Perl 5.17.6 this function returned a UV, it now returns a
242 string, which may be of nearly any size as determined by the hash
243 function your Perl has been built with. Possible sizes may be but
244 are not limited to 4 bytes (for most hash algorithms) and 16 bytes
245 (for siphash).
246
247 hash_value
248 my $hash_value = hash_value($string);
249
250 hash_value() returns the current perl's internal hash value for a
251 given string.
252
253 Returns a 32 bit integer representing the hash value of the string
254 passed in. This value is only reliable for the lifetime of the
255 process. It may be different depending on invocation, environment
256 variables, perl version, architectures, and build options.
257
258 Note that the hash value of a given string is sensitive
259 information: by knowing it one can deduce the hash seed which in
260 turn can allow one to craft a denial-of-service attack against Perl
261 code, even remotely, see "Algorithmic Complexity Attacks" in
262 perlsec for more information. Do not disclose the hash value of a
263 string to people who don't need to know it. See also
264 "PERL_HASH_SEED_DEBUG" in perlrun.
265
266 bucket_info
267 Return a set of basic information about a hash.
268
269 my ($keys, $buckets, $used, @length_counts)= bucket_info($hash);
270
271 Fields are as follows:
272
273 0: Number of keys in the hash
274 1: Number of buckets in the hash
275 2: Number of used buckets in the hash
276 rest : list of counts, Kth element is the number of buckets
277 with K keys in it.
278
279 See also bucket_stats() and bucket_array().
280
281 bucket_stats
282 Returns a list of statistics about a hash.
283
284 my ($keys, $buckets, $used, $quality, $utilization_ratio,
285 $collision_pct, $mean, $stddev, @length_counts)
286 = bucket_stats($hashref);
287
288 Fields are as follows:
289
290 0: Number of keys in the hash
291 1: Number of buckets in the hash
292 2: Number of used buckets in the hash
293 3: Hash Quality Score
294 4: Percent of buckets used
295 5: Percent of keys which are in collision
296 6: Mean bucket length of occupied buckets
297 7: Standard Deviation of bucket lengths of occupied buckets
298 rest : list of counts, Kth element is the number of buckets
299 with K keys in it.
300
301 See also bucket_info() and bucket_array().
302
303 Note that Hash Quality Score would be 1 for an ideal hash, numbers
304 close to and below 1 indicate good hashing, and number
305 significantly above indicate a poor score. In practice it should be
306 around 0.95 to 1.05. It is defined as:
307
308 $score= sum( $count[$length] * ($length * ($length + 1) / 2) )
309 /
310 ( ( $keys / 2 * $buckets ) *
311 ( $keys + ( 2 * $buckets ) - 1 ) )
312
313 The formula is from the Red Dragon book (reformulated to use the
314 data available) and is documented at
315 <http://www.strchr.com/hash_functions>
316
317 bucket_array
318 my $array= bucket_array(\%hash);
319
320 Returns a packed representation of the bucket array associated with
321 a hash. Each element of the array is either an integer K, in which
322 case it represents K empty buckets, or a reference to another array
323 which contains the keys that are in that bucket.
324
325 Note that the information returned by bucket_array is sensitive
326 information: by knowing it one can directly attack perl's hash
327 function which in turn may allow one to craft a denial-of-service
328 attack against Perl code, even remotely, see "Algorithmic
329 Complexity Attacks" in perlsec for more information. Do not
330 disclose the output of this function to people who don't need to
331 know it. See also "PERL_HASH_SEED_DEBUG" in perlrun. This function
332 is provided strictly for debugging and diagnostics purposes only,
333 it is hard to imagine a reason why it would be used in production
334 code.
335
336 bucket_stats_formatted
337 print bucket_stats_formatted($hashref);
338
339 Return a formatted report of the information returned by
340 bucket_stats(). An example report looks like this:
341
342 Keys: 50 Buckets: 33/64 Quality-Score: 1.01 (Good)
343 Utilized Buckets: 51.56% Optimal: 78.12% Keys In Collision: 34.00%
344 Chain Length - mean: 1.52 stddev: 0.66
345 Buckets 64 [0000000000000000000000000000000111111111111111111122222222222333]
346 Len 0 Pct: 48.44 [###############################]
347 Len 1 Pct: 29.69 [###################]
348 Len 2 Pct: 17.19 [###########]
349 Len 3 Pct: 4.69 [###]
350 Keys 50 [11111111111111111111111111111111122222222222222333]
351 Pos 1 Pct: 66.00 [#################################]
352 Pos 2 Pct: 28.00 [##############]
353 Pos 3 Pct: 6.00 [###]
354
355 The first set of stats gives some summary statistical information,
356 including the quality score translated into "Good", "Poor" and
357 "Bad", (score<=1.05, score<=1.2, score>1.2). See the documentation
358 in bucket_stats() for more details.
359
360 The two sets of barcharts give stats and a visual indication of
361 performance of the hash.
362
363 The first gives data on bucket chain lengths and provides insight
364 on how much work a fetch *miss* will take. In this case we have to
365 inspect every item in a bucket before we can be sure the item is
366 not in the list. The performance for an insert is equivalent to
367 this case, as is a delete where the item is not in the hash.
368
369 The second gives data on how many keys are at each depth in the
370 chain, and gives an idea of how much work a fetch *hit* will take.
371 The performance for an update or delete of an item in the hash is
372 equivalent to this case.
373
374 Note that these statistics are summary only. Actual performance
375 will depend on real hit/miss ratios accessing the hash. If you are
376 concerned by hit ratios you are recommended to "oversize" your hash
377 by using something like:
378
379 keys(%hash)= keys(%hash) << $k;
380
381 With $k chosen carefully, and likely to be a small number like 1 or
382 2. In theory the larger the bucket array the less chance of
383 collision.
384
385 hv_store
386 my $sv = 0;
387 hv_store(%hash,$key,$sv) or die "Failed to alias!";
388 $hash{$key} = 1;
389 print $sv; # prints 1
390
391 Stores an alias to a variable in a hash instead of copying the
392 value.
393
394 hash_traversal_mask
395 As of Perl 5.18 every hash has its own hash traversal order, and
396 this order changes every time a new element is inserted into the
397 hash. This functionality is provided by maintaining an unsigned
398 integer mask (U32) which is xor'ed with the actual bucket id during
399 a traversal of the hash buckets using keys(), values() or each().
400
401 You can use this subroutine to get and set the traversal mask for a
402 specific hash. Setting the mask ensures that a given hash will
403 produce the same key order. Note that this does not guarantee that
404 two hashes will produce the same key order for the same hash seed
405 and traversal mask, items that collide into one bucket may have
406 different orders regardless of this setting.
407
408 bucket_ratio
409 This function behaves the same way that scalar(%hash) behaved prior
410 to Perl 5.25. Specifically if the hash is tied, then it calls the
411 SCALAR tied hash method, if untied then if the hash is empty it
412 return 0, otherwise it returns a string containing the number of
413 used buckets in the hash, followed by a slash, followed by the
414 total number of buckets in the hash.
415
416 my %hash=("foo"=>1);
417 print Hash::Util::bucket_ratio(%hash); # prints "1/8"
418
419 used_buckets
420 This function returns the count of used buckets in the hash. It is
421 expensive to calculate and the value is NOT cached, so avoid use of
422 this function in production code.
423
424 num_buckets
425 This function returns the total number of buckets the hash holds,
426 or would hold if the array were created. (When a hash is freshly
427 created the array may not be allocated even though this value will
428 be non-zero.)
429
430 Operating on references to hashes.
431 Most subroutines documented in this module have equivalent versions
432 that operate on references to hashes instead of native hashes. The
433 following is a list of these subs. They are identical except in name
434 and in that instead of taking a %hash they take a $hashref, and
435 additionally are not prototyped.
436
437 lock_ref_keys
438 unlock_ref_keys
439 lock_ref_keys_plus
440 lock_ref_value
441 unlock_ref_value
442 lock_hashref
443 unlock_hashref
444 lock_hashref_recurse
445 unlock_hashref_recurse
446 hash_ref_unlocked
447 legal_ref_keys
448 hidden_ref_keys
449
451 Note that the trapping of the restricted operations is not atomic: for
452 example
453
454 eval { %hash = (illegal_key => 1) }
455
456 leaves the %hash empty rather than with its original contents.
457
459 The interface exposed by this module is very close to the current
460 implementation of restricted hashes. Over time it is expected that this
461 behavior will be extended and the interface abstracted further.
462
464 Michael G Schwern <schwern@pobox.com> on top of code by Nick Ing-
465 Simmons and Jeffrey Friedl.
466
467 hv_store() is from Array::RefElem, Copyright 2000 Gisle Aas.
468
469 Additional code by Yves Orton.
470
472 Scalar::Util, List::Util and "Algorithmic Complexity Attacks" in
473 perlsec.
474
475 Hash::Util::FieldHash.
476
477
478
479perl v5.28.2 2018-11-01 Hash::Util(3pm)