JudyHS(3)                 Library Functions Manual                 JudyHS(3)

NAME
       JudyHS macros - C library for creating and accessing a dynamic array,
       using an array-of-bytes of Length as an Index and a word as a Value.

SYNOPSIS
       cc [flags] sourcefiles -lJudy

       #include <Judy.h>

       Word_t  * PValue;                           // JudyHS array element
       int       Rc_int;                           // return flag
       Word_t    Rc_word;                          // full word return value
       Pvoid_t   PJHSArray = (Pvoid_t) NULL;       // initialize JudyHS array
       uint8_t * Index;                            // array-of-bytes pointer
       Word_t    Length;                           // number of bytes in Index

       JHSI( PValue,  PJHSArray, Index, Length);   // JudyHSIns()
       JHSD( Rc_int,  PJHSArray, Index, Length);   // JudyHSDel()
       JHSG( PValue,  PJHSArray, Index, Length);   // JudyHSGet()
       JHSFA(Rc_word, PJHSArray);                  // JudyHSFreeArray()

DESCRIPTION
       A JudyHS array is the equivalent of an array of word-sized
       value/pointers. An Index is a pointer to an array-of-bytes of a
       specified length: Length. Unlike JudySL(3), which uses a
       null-terminated string as its Index, JudyHS lets an Index contain any
       byte value (specifically the null character). This new addition
       (May 2004) to Judy arrays is a hybrid using the best capabilities of
       hashing and Judy methods. JudyHS does not have a poor performance case
       where knowledge of the hash algorithm can be used to degrade the
       performance.

       Since JudyHS is based on a hash method, Indexes are not stored in any
       particular order. Therefore the JudyHSFirst(), JudyHSNext(),
       JudyHSPrev() and JudyHSLast() neighbor search functions are not
       practical. The Length of each array-of-bytes can be from 0 to the
       limits of malloc() (about 2GB).

       The hallmark of JudyHS is speed with scalability, and its memory
       efficiency is also excellent. The speed is very competitive with the
       best hashing methods. The memory efficiency is similar to a linked
       list of the same Indexes and Values. JudyHS is designed to scale from
       0 to billions of Indexes.

       A JudyHS array is allocated with a NULL pointer:

           Pvoid_t PJHSArray = (Pvoid_t) NULL;

       Because the macro forms of the API have a simpler error handling
       interface than the equivalent functions, they are the preferred way
       to use JudyHS.

   JHSI(PValue, PJHSArray, Index, Length) // JudyHSIns()
       Given a pointer to a JudyHS array (PJHSArray), insert an Index string
       of length: Length and a Value into the JudyHS array: PJHSArray. If
       the Index is successfully inserted, the Value is initialized to 0. If
       the Index was already present, the Value is not modified.

       Return PValue pointing to Value. Your program should use this pointer
       to read or modify the Value, for example:

           Value = *PValue;
           *PValue = 1234;

       Note: JHSI() and JHSD() can reorganize the JudyHS array. Therefore,
       pointers returned from previous JudyHS calls become invalid and must
       be re-acquired (using JHSG()).

   JHSD(Rc_int, PJHSArray, Index, Length) // JudyHSDel()
       Given a pointer to a JudyHS array (PJHSArray), delete the specified
       Index along with the Value from the JudyHS array.

       Return Rc_int set to 1 if successfully removed from the array. Return
       Rc_int set to 0 if Index was not present.

   JHSG(PValue, PJHSArray, Index, Length) // JudyHSGet()
       Given a pointer to a JudyHS array (PJHSArray), find Value associated
       with Index.

       Return PValue pointing to Index's Value. Return PValue set to NULL if
       the Index was not present.

   JHSFA(Rc_word, PJHSArray) // JudyHSFreeArray()
       Given a pointer to a JudyHS array (PJHSArray), free the entire array.

       Return Rc_word set to the number of bytes freed and PJHSArray set to
       NULL.

EXAMPLE
       Show how to program with the JudyHS macros. This program prints
       duplicate lines and their line numbers from stdin.

       #include <unistd.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <string.h>
       #include <Judy.h>

       // Compiled:
       // cc -O PrintDupLines.c -lJudy -o PrintDupLines

       #define MAXLINE 1000000                 /* max fgets length of line */
       uint8_t Index[MAXLINE];                 // string to check

       int     // Usage:  PrintDupLines < file
       main(void)
       {
           Pvoid_t   PJArray = (Pvoid_t) NULL; // Judy array.
           PWord_t   PValue;                   // Judy array element pointer.
           Word_t    Bytes;                    // size of JudyHS array.
           Word_t    LineNumb = 0;             // current line number
           Word_t    Dups = 0;                 // number of duplicate lines

           while (fgets((char *)Index, MAXLINE, stdin) != (char *)NULL)
           {
               LineNumb++;                     // line number

               // store string into array
               JHSI(PValue, PJArray, Index, strlen((char *)Index));
               if (PValue == PJERR)            // insert failed; see ERRORS in Judy(3)
               {
                   fprintf(stderr, "Out of memory -- exit\n");
                   exit(1);
               }
               if (*PValue != 0)               // nonzero: line was seen before
               {
                   Dups++;
                   printf("Duplicate lines %lu:%lu:%s", *PValue, LineNumb,
                          (char *)Index);
               }
               else
               {
                   *PValue = LineNumb;         // store Line number
               }
           }
           printf("%lu Duplicates, free JudyHS array of %lu Lines\n",
                           Dups, LineNumb - Dups);
           JHSFA(Bytes, PJArray);              // free JudyHS array
           printf("JudyHSFreeArray() free'ed %lu bytes of memory\n", Bytes);
           return (0);
       }

AUTHOR
       JudyHS was invented and implemented by Doug Baskins after retiring
       from Hewlett-Packard.

SEE ALSO
       Judy(3), Judy1(3), JudyL(3), JudySL(3),
       malloc(),
       the Judy website, http://judy.sourceforge.net, for further information
       and Application Notes.