1Hashtbl(3) OCaml library Hashtbl(3)
2
3
4
6 Hashtbl - Hash tables and hash functions.
7
9 Module Hashtbl
10
12 Module Hashtbl
13 : sig end
14
15
16 Hash tables and hash functions.
17
18 Hash tables are hashed association tables, with in-place modification.
19
20
21 Alert unsynchronized_access. Unsynchronized accesses to hash tables
22 are a programming error.
23
24
25
26
27
28
29
30 Unsynchronized accesses
31
32 Unsynchronized accesses to a hash table may lead to an invalid hash ta‐
33 ble state. Thus, concurrent accesses to a hash tables must be synchro‐
34 nized (for instance with a Mutex.t ).
35
36 Generic interface
37 type ('a, 'b) t
38
39
40 The type of hash tables from type 'a to type 'b .
41
42
43
44 val create : ?random:bool -> int -> ('a, 'b) t
45
46
47 Hashtbl.create n creates a new, empty hash table, with initial size n .
48 For best results, n should be on the order of the expected number of
49 elements that will be in the table. The table grows as needed, so n is
50 just an initial guess.
51
52 The optional ~random parameter (a boolean) controls whether the inter‐
53 nal organization of the hash table is randomized at each execution of
54 Hashtbl.create or deterministic over all executions.
55
56 A hash table that is created with ~random set to false uses a fixed
57 hash function ( Hashtbl.hash ) to distribute keys among buckets. As a
58 consequence, collisions between keys happen deterministically. In
59 Web-facing applications or other security-sensitive applications, the
60 deterministic collision patterns can be exploited by a malicious user
61 to create a denial-of-service attack: the attacker sends input crafted
62 to create many collisions in the table, slowing the application down.
63
64 A hash table that is created with ~random set to true uses the seeded
65 hash function Hashtbl.seeded_hash with a seed that is randomly chosen
66 at hash table creation time. In effect, the hash function used is ran‐
67 domly selected among 2^{30} different hash functions. All these hash
68 functions have different collision patterns, rendering ineffective the
69 denial-of-service attack described above. However, because of random‐
70 ization, enumerating all elements of the hash table using Hashtbl.fold
71 or Hashtbl.iter is no longer deterministic: elements are enumerated in
72 different orders at different runs of the program.
73
74 If no ~random parameter is given, hash tables are created in non-random
75 mode by default. This default can be changed either programmatically
76 by calling Hashtbl.randomize or by setting the R flag in the OCAMLRUN‐
77 PARAM environment variable.
78
79
80 Before4.00.0 the ~random parameter was not present and all hash tables
81 were created in non-randomized mode.
82
83
84
85
86 val clear : ('a, 'b) t -> unit
87
88 Empty a hash table. Use reset instead of clear to shrink the size of
89 the bucket table to its initial size.
90
91
92
93 val reset : ('a, 'b) t -> unit
94
95 Empty a hash table and shrink the size of the bucket table to its ini‐
96 tial size.
97
98
99 Since 4.00.0
100
101
102
103 val copy : ('a, 'b) t -> ('a, 'b) t
104
105 Return a copy of the given hashtable.
106
107
108
109 val add : ('a, 'b) t -> 'a -> 'b -> unit
110
111
112 Hashtbl.add tbl key data adds a binding of key to data in table tbl .
113 Previous bindings for key are not removed, but simply hidden. That is,
114 after performing Hashtbl.remove tbl key , the previous binding for key
115 , if any, is restored. (Same behavior as with association lists.)
116
117
118
119 val find : ('a, 'b) t -> 'a -> 'b
120
121
122 Hashtbl.find tbl x returns the current binding of x in tbl , or raises
123 Not_found if no such binding exists.
124
125
126
127 val find_opt : ('a, 'b) t -> 'a -> 'b option
128
129
130 Hashtbl.find_opt tbl x returns the current binding of x in tbl , or
131 None if no such binding exists.
132
133
134 Since 4.05
135
136
137
138 val find_all : ('a, 'b) t -> 'a -> 'b list
139
140
141 Hashtbl.find_all tbl x returns the list of all data associated with x
142 in tbl . The current binding is returned first, then the previous
143 bindings, in reverse order of introduction in the table.
144
145
146
147 val mem : ('a, 'b) t -> 'a -> bool
148
149
150 Hashtbl.mem tbl x checks if x is bound in tbl .
151
152
153
154 val remove : ('a, 'b) t -> 'a -> unit
155
156
157 Hashtbl.remove tbl x removes the current binding of x in tbl , restor‐
158 ing the previous binding if it exists. It does nothing if x is not
159 bound in tbl .
160
161
162
163 val replace : ('a, 'b) t -> 'a -> 'b -> unit
164
165
166 Hashtbl.replace tbl key data replaces the current binding of key in tbl
167 by a binding of key to data . If key is unbound in tbl , a binding of
168 key to data is added to tbl . This is functionally equivalent to
169 Hashtbl.remove tbl key followed by Hashtbl.add tbl key data .
170
171
172
173 val iter : ('a -> 'b -> unit) -> ('a, 'b) t -> unit
174
175
176 Hashtbl.iter f tbl applies f to all bindings in table tbl . f receives
177 the key as first argument, and the associated value as second argument.
178 Each binding is presented exactly once to f .
179
180 The order in which the bindings are passed to f is unspecified. How‐
181 ever, if the table contains several bindings for the same key, they are
182 passed to f in reverse order of introduction, that is, the most recent
183 binding is passed first.
184
185 If the hash table was created in non-randomized mode, the order in
186 which the bindings are enumerated is reproducible between successive
187 runs of the program, and even between minor versions of OCaml. For
188 randomized hash tables, the order of enumeration is entirely random.
189
190 The behavior is not specified if the hash table is modified by f during
191 the iteration.
192
193
194
195 val filter_map_inplace : ('a -> 'b -> 'b option) -> ('a, 'b) t -> unit
196
197
198 Hashtbl.filter_map_inplace f tbl applies f to all bindings in table tbl
199 and update each binding depending on the result of f . If f returns
200 None , the binding is discarded. If it returns Some new_val , the
201 binding is update to associate the key to new_val .
202
203 Other comments for Hashtbl.iter apply as well.
204
205
206 Since 4.03.0
207
208
209
210 val fold : ('a -> 'b -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c
211
212
213 Hashtbl.fold f tbl init computes (f kN dN ... (f k1 d1 init)...) ,
214 where k1 ... kN are the keys of all bindings in tbl , and d1 ... dN are
215 the associated values. Each binding is presented exactly once to f .
216
217 The order in which the bindings are passed to f is unspecified. How‐
218 ever, if the table contains several bindings for the same key, they are
219 passed to f in reverse order of introduction, that is, the most recent
220 binding is passed first.
221
222 If the hash table was created in non-randomized mode, the order in
223 which the bindings are enumerated is reproducible between successive
224 runs of the program, and even between minor versions of OCaml. For
225 randomized hash tables, the order of enumeration is entirely random.
226
227 The behavior is not specified if the hash table is modified by f during
228 the iteration.
229
230
231
232 val length : ('a, 'b) t -> int
233
234
235 Hashtbl.length tbl returns the number of bindings in tbl . It takes
236 constant time. Multiple bindings are counted once each, so
237 Hashtbl.length gives the number of times Hashtbl.iter calls its first
238 argument.
239
240
241
242 val randomize : unit -> unit
243
244 After a call to Hashtbl.randomize() , hash tables are created in ran‐
245 domized mode by default: Hashtbl.create returns randomized hash tables,
246 unless the ~random:false optional parameter is given. The same effect
247 can be achieved by setting the R parameter in the OCAMLRUNPARAM envi‐
248 ronment variable.
249
250 It is recommended that applications or Web frameworks that need to pro‐
251 tect themselves against the denial-of-service attack described in
252 Hashtbl.create call Hashtbl.randomize() at initialization time before
253 any domains are created.
254
255 Note that once Hashtbl.randomize() was called, there is no way to re‐
256 vert to the non-randomized default behavior of Hashtbl.create . This
257 is intentional. Non-randomized hash tables can still be created using
258 Hashtbl.create ~random:false .
259
260
261 Since 4.00.0
262
263
264
265 val is_randomized : unit -> bool
266
267 Return true if the tables are currently created in randomized mode by
268 default, false otherwise.
269
270
271 Since 4.03.0
272
273
274
275 val rebuild : ?random:bool -> ('a, 'b) t -> ('a, 'b) t
276
277 Return a copy of the given hashtable. Unlike Hashtbl.copy ,
278 Hashtbl.rebuild h re-hashes all the (key, value) entries of the origi‐
279 nal table h . The returned hash table is randomized if h was random‐
280 ized, or the optional random parameter is true, or if the default is to
281 create randomized hash tables; see Hashtbl.create for more information.
282
283
284 Hashtbl.rebuild can safely be used to import a hash table built by an
285 old version of the Hashtbl module, then marshaled to persistent stor‐
286 age. After unmarshaling, apply Hashtbl.rebuild to produce a hash table
287 for the current version of the Hashtbl module.
288
289
290 Since 4.12.0
291
292
293 type statistics = {
294 num_bindings : int ; (* Number of bindings present in the table.
295 Same value as returned by Hashtbl.length .
296 *)
297 num_buckets : int ; (* Number of buckets in the table.
298 *)
299 max_bucket_length : int ; (* Maximal number of bindings per bucket.
300 *)
301 bucket_histogram : int array ; (* Histogram of bucket sizes. This
302 array histo has length max_bucket_length + 1 . The value of histo.(i)
303 is the number of buckets whose size is i .
304 *)
305 }
306
307
308 Since 4.00.0
309
310
311
312 val stats : ('a, 'b) t -> statistics
313
314
315 Hashtbl.stats tbl returns statistics about the table tbl : number of
316 buckets, size of the biggest bucket, distribution of buckets by size.
317
318
319 Since 4.00.0
320
321
322
323
324 Hash tables and Sequences
325 val to_seq : ('a, 'b) t -> ('a * 'b) Seq.t
326
327 Iterate on the whole table. The order in which the bindings appear in
328 the sequence is unspecified. However, if the table contains several
329 bindings for the same key, they appear in reversed order of introduc‐
330 tion, that is, the most recent binding appears first.
331
332 The behavior is not specified if the hash table is modified during the
333 iteration.
334
335
336 Since 4.07
337
338
339
340 val to_seq_keys : ('a, 'b) t -> 'a Seq.t
341
342 Same as Seq.map fst (to_seq m)
343
344
345
346 Since 4.07
347
348
349
350 val to_seq_values : ('a, 'b) t -> 'b Seq.t
351
352 Same as Seq.map snd (to_seq m)
353
354
355
356 Since 4.07
357
358
359
360 val add_seq : ('a, 'b) t -> ('a * 'b) Seq.t -> unit
361
362 Add the given bindings to the table, using Hashtbl.add
363
364
365
366 Since 4.07
367
368
369
370 val replace_seq : ('a, 'b) t -> ('a * 'b) Seq.t -> unit
371
372 Add the given bindings to the table, using Hashtbl.replace
373
374
375
376 Since 4.07
377
378
379
380 val of_seq : ('a * 'b) Seq.t -> ('a, 'b) t
381
382 Build a table from the given bindings. The bindings are added in the
383 same order they appear in the sequence, using Hashtbl.replace_seq ,
384 which means that if two pairs have the same key, only the latest one
385 will appear in the table.
386
387
388 Since 4.07
389
390
391
392
393 Functorial interface
394 The functorial interface allows the use of specific comparison and hash
395 functions, either for performance/security concerns, or because keys
396 are not hashable/comparable with the polymorphic builtins.
397
398 For instance, one might want to specialize a table for integer keys:
399 module IntHash =
400 struct
401 type t = int
402 let equal i j = i=j
403 let hash i = i land max_int
404 end
405
406 module IntHashtbl = Hashtbl.Make(IntHash)
407
408 let h = IntHashtbl.create 17 in
409 IntHashtbl.add h 12 "hello"
410
411
412 This creates a new module IntHashtbl , with a new type 'a
413 IntHashtbl.t of tables from int to 'a . In this example, h contains
414 string values so its type is string IntHashtbl.t .
415
416 Note that the new type 'a IntHashtbl.t is not compatible with the type
417 ('a,'b) Hashtbl.t of the generic interface. For example, Hashtbl.length
418 h would not type-check, you must use IntHashtbl.length .
419
420 module type HashedType = sig end
421
422
423 The input signature of the functor Hashtbl.Make .
424
425
426 module type S = sig end
427
428
429 The output signature of the functor Hashtbl.Make .
430
431
432 module Make : functor (H : HashedType) -> sig end
433
434
435 Functor building an implementation of the hashtable structure. The
436 functor Hashtbl.Make returns a structure containing a type key of keys
437 and a type 'a t of hash tables associating data of type 'a to keys of
438 type key . The operations perform similarly to those of the generic
439 interface, but use the hashing and equality functions specified in the
440 functor argument H instead of generic equality and hashing. Since the
441 hash function is not seeded, the create operation of the result struc‐
442 ture always returns non-randomized hash tables.
443
444
445 module type SeededHashedType = sig end
446
447
448 The input signature of the functor Hashtbl.MakeSeeded .
449
450
451 Since 4.00.0
452
453
454 module type SeededS = sig end
455
456
457 The output signature of the functor Hashtbl.MakeSeeded .
458
459
460 Since 4.00.0
461
462
463 module MakeSeeded : functor (H : SeededHashedType) -> sig end
464
465
466 Functor building an implementation of the hashtable structure. The
467 functor Hashtbl.MakeSeeded returns a structure containing a type key of
468 keys and a type 'a t of hash tables associating data of type 'a to keys
469 of type key . The operations perform similarly to those of the generic
470 interface, but use the seeded hashing and equality functions specified
471 in the functor argument H instead of generic equality and hashing. The
472 create operation of the result structure supports the ~random optional
473 parameter and returns randomized hash tables if ~random:true is passed
474 or if randomization is globally on (see Hashtbl.randomize ).
475
476
477 Since 4.00.0
478
479
480
481
482 The polymorphic hash functions
483 val hash : 'a -> int
484
485
486 Hashtbl.hash x associates a nonnegative integer to any value of any
487 type. It is guaranteed that if x = y or Stdlib.compare x y = 0 , then
488 hash x = hash y . Moreover, hash always terminates, even on cyclic
489 structures.
490
491
492
493 val seeded_hash : int -> 'a -> int
494
495 A variant of Hashtbl.hash that is further parameterized by an integer
496 seed.
497
498
499 Since 4.00.0
500
501
502
503 val hash_param : int -> int -> 'a -> int
504
505
506 Hashtbl.hash_param meaningful total x computes a hash value for x ,
507 with the same properties as for hash . The two extra integer parameters
508 meaningful and total give more precise control over hashing. Hashing
509 performs a breadth-first, left-to-right traversal of the structure x ,
510 stopping after meaningful meaningful nodes were encountered, or total
511 nodes (meaningful or not) were encountered. If total as specified by
512 the user exceeds a certain value, currently 256, then it is capped to
513 that value. Meaningful nodes are: integers; floating-point numbers;
514 strings; characters; booleans; and constant constructors. Larger values
515 of meaningful and total means that more nodes are taken into account to
516 compute the final hash value, and therefore collisions are less likely
517 to happen. However, hashing takes longer. The parameters meaningful
518 and total govern the tradeoff between accuracy and speed. As default
519 choices, Hashtbl.hash and Hashtbl.seeded_hash take meaningful = 10 and
520 total = 100 .
521
522
523
524 val seeded_hash_param : int -> int -> int -> 'a -> int
525
526 A variant of Hashtbl.hash_param that is further parameterized by an in‐
527 teger seed. Usage: Hashtbl.seeded_hash_param meaningful total seed x .
528
529
530 Since 4.00.0
531
532
533
534
535
536OCamldoc 2023-07-20 Hashtbl(3)