1Hashtbl(3) OCaml library Hashtbl(3)
2
3
4
6 Hashtbl - Hash tables and hash functions.
7
9 Module Hashtbl
10
12 Module Hashtbl
13 : sig end
14
15
16 Hash tables and hash functions.
17
18 Hash tables are hashed association tables, with in-place modification.
19
20
21
22
23
24
25
26 Generic interface
27 type ('a, 'b) t
28
29
30 The type of hash tables from type 'a to type 'b .
31
32
33
34 val create : ?random:bool -> int -> ('a, 'b) t
35
36
37 Hashtbl.create n creates a new, empty hash table, with initial size n .
38 For best results, n should be on the order of the expected number of
39 elements that will be in the table. The table grows as needed, so n is
40 just an initial guess.
41
42 The optional ~ random parameter (a boolean) controls whether the inter‐
43 nal organization of the hash table is randomized at each execution of
44 Hashtbl.create or deterministic over all executions.
45
46 A hash table that is created with ~ random set to false uses a fixed
47 hash function ( Hashtbl.hash ) to distribute keys among buckets. As a
48 consequence, collisions between keys happen deterministically. In
49 Web-facing applications or other security-sensitive applications, the
50 deterministic collision patterns can be exploited by a malicious user
51 to create a denial-of-service attack: the attacker sends input crafted
52 to create many collisions in the table, slowing the application down.
53
54 A hash table that is created with ~ random set to true uses the seeded
55 hash function Hashtbl.seeded_hash with a seed that is randomly chosen
56 at hash table creation time. In effect, the hash function used is ran‐
57 domly selected among 2^{30} different hash functions. All these hash
58 functions have different collision patterns, rendering ineffective the
59 denial-of-service attack described above. However, because of random‐
60 ization, enumerating all elements of the hash table using Hashtbl.fold
61 or Hashtbl.iter is no longer deterministic: elements are enumerated in
62 different orders at different runs of the program.
63
64 If no ~ random parameter is given, hash tables are created in non-ran‐
65 dom mode by default. This default can be changed either programmati‐
66 cally by calling Hashtbl.randomize or by setting the R flag in the
67 OCAMLRUNPARAM environment variable.
68
69
70 Before4.00.0 the ~ random parameter was not present and all hash tables
71 were created in non-randomized mode.
72
73
74
75
76 val clear : ('a, 'b) t -> unit
77
78 Empty a hash table. Use reset instead of clear to shrink the size of
79 the bucket table to its initial size.
80
81
82
83 val reset : ('a, 'b) t -> unit
84
85 Empty a hash table and shrink the size of the bucket table to its ini‐
86 tial size.
87
88
89 Since 4.00.0
90
91
92
93 val copy : ('a, 'b) t -> ('a, 'b) t
94
95 Return a copy of the given hashtable.
96
97
98
99 val add : ('a, 'b) t -> 'a -> 'b -> unit
100
101
102 Hashtbl.add tbl key data adds a binding of key to data in table tbl .
103 Previous bindings for key are not removed, but simply hidden. That is,
104 after performing Hashtbl.remove tbl key , the previous binding for key
105 , if any, is restored. (Same behavior as with association lists.)
106
107
108
109 val find : ('a, 'b) t -> 'a -> 'b
110
111
112 Hashtbl.find tbl x returns the current binding of x in tbl , or raises
113 Not_found if no such binding exists.
114
115
116
117 val find_opt : ('a, 'b) t -> 'a -> 'b option
118
119
120 Hashtbl.find_opt tbl x returns the current binding of x in tbl , or
121 None if no such binding exists.
122
123
124 Since 4.05
125
126
127
128 val find_all : ('a, 'b) t -> 'a -> 'b list
129
130
131 Hashtbl.find_all tbl x returns the list of all data associated with x
132 in tbl . The current binding is returned first, then the previous
133 bindings, in reverse order of introduction in the table.
134
135
136
137 val mem : ('a, 'b) t -> 'a -> bool
138
139
140 Hashtbl.mem tbl x checks if x is bound in tbl .
141
142
143
144 val remove : ('a, 'b) t -> 'a -> unit
145
146
147 Hashtbl.remove tbl x removes the current binding of x in tbl , restor‐
148 ing the previous binding if it exists. It does nothing if x is not
149 bound in tbl .
150
151
152
153 val replace : ('a, 'b) t -> 'a -> 'b -> unit
154
155
156 Hashtbl.replace tbl key data replaces the current binding of key in tbl
157 by a binding of key to data . If key is unbound in tbl , a binding of
158 key to data is added to tbl . This is functionally equivalent to
159 Hashtbl.remove tbl key followed by Hashtbl.add tbl key data .
160
161
162
163 val iter : ('a -> 'b -> unit) -> ('a, 'b) t -> unit
164
165
166 Hashtbl.iter f tbl applies f to all bindings in table tbl . f receives
167 the key as first argument, and the associated value as second argument.
168 Each binding is presented exactly once to f .
169
170 The order in which the bindings are passed to f is unspecified. How‐
171 ever, if the table contains several bindings for the same key, they are
172 passed to f in reverse order of introduction, that is, the most recent
173 binding is passed first.
174
175 If the hash table was created in non-randomized mode, the order in
176 which the bindings are enumerated is reproducible between successive
177 runs of the program, and even between minor versions of OCaml. For
178 randomized hash tables, the order of enumeration is entirely random.
179
180 The behavior is not defined if the hash table is modified by f during
181 the iteration.
182
183
184
185 val filter_map_inplace : ('a -> 'b -> 'b option) -> ('a, 'b) t -> unit
186
187
188 Hashtbl.filter_map_inplace f tbl applies f to all bindings in table tbl
189 and update each binding depending on the result of f . If f returns
190 None , the binding is discarded. If it returns Some new_val , the
191 binding is update to associate the key to new_val .
192
193 Other comments for Hashtbl.iter apply as well.
194
195
196 Since 4.03.0
197
198
199
200 val fold : ('a -> 'b -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c
201
202
203 Hashtbl.fold f tbl init computes (f kN dN ... (f k1 d1 init)...) ,
204 where k1 ... kN are the keys of all bindings in tbl , and d1 ... dN are
205 the associated values. Each binding is presented exactly once to f .
206
207 The order in which the bindings are passed to f is unspecified. How‐
208 ever, if the table contains several bindings for the same key, they are
209 passed to f in reverse order of introduction, that is, the most recent
210 binding is passed first.
211
212 If the hash table was created in non-randomized mode, the order in
213 which the bindings are enumerated is reproducible between successive
214 runs of the program, and even between minor versions of OCaml. For
215 randomized hash tables, the order of enumeration is entirely random.
216
217 The behavior is not defined if the hash table is modified by f during
218 the iteration.
219
220
221
222 val length : ('a, 'b) t -> int
223
224
225 Hashtbl.length tbl returns the number of bindings in tbl . It takes
226 constant time. Multiple bindings are counted once each, so
227 Hashtbl.length gives the number of times Hashtbl.iter calls its first
228 argument.
229
230
231
232 val randomize : unit -> unit
233
234 After a call to Hashtbl.randomize() , hash tables are created in ran‐
235 domized mode by default: Hashtbl.create returns randomized hash tables,
236 unless the ~random:false optional parameter is given. The same effect
237 can be achieved by setting the R parameter in the OCAMLRUNPARAM envi‐
238 ronment variable.
239
240 It is recommended that applications or Web frameworks that need to pro‐
241 tect themselves against the denial-of-service attack described in
242 Hashtbl.create call Hashtbl.randomize() at initialization time.
243
244 Note that once Hashtbl.randomize() was called, there is no way to re‐
245 vert to the non-randomized default behavior of Hashtbl.create . This
246 is intentional. Non-randomized hash tables can still be created using
247 Hashtbl.create ~random:false .
248
249
250 Since 4.00.0
251
252
253
254 val is_randomized : unit -> bool
255
256 Return true if the tables are currently created in randomized mode by
257 default, false otherwise.
258
259
260 Since 4.03.0
261
262
263
264 val rebuild : ?random:bool -> ('a, 'b) t -> ('a, 'b) t
265
266 Return a copy of the given hashtable. Unlike Hashtbl.copy ,
267 Hashtbl.rebuild h re-hashes all the (key, value) entries of the origi‐
268 nal table h . The returned hash table is randomized if h was random‐
269 ized, or the optional random parameter is true, or if the default is to
270 create randomized hash tables; see Hashtbl.create for more information.
271
272
273 Hashtbl.rebuild can safely be used to import a hash table built by an
274 old version of the Hashtbl module, then marshaled to persistent stor‐
275 age. After unmarshaling, apply Hashtbl.rebuild to produce a hash table
276 for the current version of the Hashtbl module.
277
278
279 Since 4.12.0
280
281
282 type statistics = {
283 num_bindings : int ; (* Number of bindings present in the table.
284 Same value as returned by Hashtbl.length .
285 *)
286 num_buckets : int ; (* Number of buckets in the table.
287 *)
288 max_bucket_length : int ; (* Maximal number of bindings per bucket.
289 *)
290 bucket_histogram : int array ; (* Histogram of bucket sizes. This
291 array histo has length max_bucket_length + 1 . The value of histo.(i)
292 is the number of buckets whose size is i .
293 *)
294 }
295
296
297 Since 4.00.0
298
299
300
301 val stats : ('a, 'b) t -> statistics
302
303
304 Hashtbl.stats tbl returns statistics about the table tbl : number of
305 buckets, size of the biggest bucket, distribution of buckets by size.
306
307
308 Since 4.00.0
309
310
311
312
313 Hash tables and Sequences
314 val to_seq : ('a, 'b) t -> ('a * 'b) Seq.t
315
316 Iterate on the whole table. The order in which the bindings appear in
317 the sequence is unspecified. However, if the table contains several
318 bindings for the same key, they appear in reversed order of introduc‐
319 tion, that is, the most recent binding appears first.
320
321 The behavior is not defined if the hash table is modified during the
322 iteration.
323
324
325 Since 4.07
326
327
328
329 val to_seq_keys : ('a, 'b) t -> 'a Seq.t
330
331 Same as Seq.map fst (to_seq m)
332
333
334
335 Since 4.07
336
337
338
339 val to_seq_values : ('a, 'b) t -> 'b Seq.t
340
341 Same as Seq.map snd (to_seq m)
342
343
344
345 Since 4.07
346
347
348
349 val add_seq : ('a, 'b) t -> ('a * 'b) Seq.t -> unit
350
351 Add the given bindings to the table, using Hashtbl.add
352
353
354
355 Since 4.07
356
357
358
359 val replace_seq : ('a, 'b) t -> ('a * 'b) Seq.t -> unit
360
361 Add the given bindings to the table, using Hashtbl.replace
362
363
364
365 Since 4.07
366
367
368
369 val of_seq : ('a * 'b) Seq.t -> ('a, 'b) t
370
371 Build a table from the given bindings. The bindings are added in the
372 same order they appear in the sequence, using Hashtbl.replace_seq ,
373 which means that if two pairs have the same key, only the latest one
374 will appear in the table.
375
376
377 Since 4.07
378
379
380
381
382 Functorial interface
383 The functorial interface allows the use of specific comparison and hash
384 functions, either for performance/security concerns, or because keys
385 are not hashable/comparable with the polymorphic builtins.
386
387 For instance, one might want to specialize a table for integer keys:
388 module IntHash =
389 struct
390 type t = int
391 let equal i j = i=j
392 let hash i = i land max_int
393 end
394
395 module IntHashtbl = Hashtbl.Make(IntHash)
396
397 let h = IntHashtbl.create 17 in
398 IntHashtbl.add h 12 "hello"
399
400
401 This creates a new module IntHashtbl , with a new type 'a
402 IntHashtbl.t of tables from int to 'a . In this example, h contains
403 string values so its type is string IntHashtbl.t .
404
405 Note that the new type 'a IntHashtbl.t is not compatible with the type
406 ('a,'b) Hashtbl.t of the generic interface. For example, Hashtbl.length
407 h would not type-check, you must use IntHashtbl.length .
408
409 module type HashedType = sig end
410
411
412 The input signature of the functor Hashtbl.Make .
413
414
415 module type S = sig end
416
417
418 The output signature of the functor Hashtbl.Make .
419
420
421 module Make : functor (H : HashedType) -> sig end
422
423
424 Functor building an implementation of the hashtable structure. The
425 functor Hashtbl.Make returns a structure containing a type key of keys
426 and a type 'a t of hash tables associating data of type 'a to keys of
427 type key . The operations perform similarly to those of the generic
428 interface, but use the hashing and equality functions specified in the
429 functor argument H instead of generic equality and hashing. Since the
430 hash function is not seeded, the create operation of the result struc‐
431 ture always returns non-randomized hash tables.
432
433
434 module type SeededHashedType = sig end
435
436
437 The input signature of the functor Hashtbl.MakeSeeded .
438
439
440 Since 4.00.0
441
442
443 module type SeededS = sig end
444
445
446 The output signature of the functor Hashtbl.MakeSeeded .
447
448
449 Since 4.00.0
450
451
452 module MakeSeeded : functor (H : SeededHashedType) -> sig end
453
454
455 Functor building an implementation of the hashtable structure. The
456 functor Hashtbl.MakeSeeded returns a structure containing a type key of
457 keys and a type 'a t of hash tables associating data of type 'a to keys
458 of type key . The operations perform similarly to those of the generic
459 interface, but use the seeded hashing and equality functions specified
460 in the functor argument H instead of generic equality and hashing. The
461 create operation of the result structure supports the ~ random optional
462 parameter and returns randomized hash tables if ~random:true is passed
463 or if randomization is globally on (see Hashtbl.randomize ).
464
465
466 Since 4.00.0
467
468
469
470
471 The polymorphic hash functions
472 val hash : 'a -> int
473
474
475 Hashtbl.hash x associates a nonnegative integer to any value of any
476 type. It is guaranteed that if x = y or Stdlib.compare x y = 0 , then
477 hash x = hash y . Moreover, hash always terminates, even on cyclic
478 structures.
479
480
481
482 val seeded_hash : int -> 'a -> int
483
484 A variant of Hashtbl.hash that is further parameterized by an integer
485 seed.
486
487
488 Since 4.00.0
489
490
491
492 val hash_param : int -> int -> 'a -> int
493
494
495 Hashtbl.hash_param meaningful total x computes a hash value for x ,
496 with the same properties as for hash . The two extra integer parameters
497 meaningful and total give more precise control over hashing. Hashing
498 performs a breadth-first, left-to-right traversal of the structure x ,
499 stopping after meaningful meaningful nodes were encountered, or total
500 nodes (meaningful or not) were encountered. If total as specified by
501 the user exceeds a certain value, currently 256, then it is capped to
502 that value. Meaningful nodes are: integers; floating-point numbers;
503 strings; characters; booleans; and constant constructors. Larger values
504 of meaningful and total means that more nodes are taken into account to
505 compute the final hash value, and therefore collisions are less likely
506 to happen. However, hashing takes longer. The parameters meaningful
507 and total govern the tradeoff between accuracy and speed. As default
508 choices, Hashtbl.hash and Hashtbl.seeded_hash take meaningful = 10 and
509 total = 100 .
510
511
512
513 val seeded_hash_param : int -> int -> int -> 'a -> int
514
515 A variant of Hashtbl.hash_param that is further parameterized by an in‐
516 teger seed. Usage: Hashtbl.seeded_hash_param meaningful total seed x .
517
518
519 Since 4.00.0
520
521
522
523
524
525OCamldoc 2022-02-04 Hashtbl(3)