mlpack_lsh(1) General Commands Manual mlpack_lsh(1)

NAME
mlpack_lsh - all k-approximate-nearest-neighbor search with LSH

SYNOPSIS
mlpack_lsh [-h] [-v]

DESCRIPTION
This program will calculate the k approximate nearest neighbors of a
set of points using locality-sensitive hashing (LSH). You may specify
a separate set of reference points and query points, or just a
reference set, which will be used as both the reference and query set.

For example, the following will return 5 neighbors from the data for
each point in 'input.csv' and store the distances in 'distances.csv'
and the neighbors in the file 'neighbors.csv':

$ mlpack_lsh -k 5 -r input.csv -d distances.csv -n neighbors.csv

The output files are organized such that the entry at row i and
column j of the neighbors output file is the index of the point in the
reference set that is the i'th nearest neighbor of the point in the
query set with index j. The entry at row i and column j of the
distances output file is the distance between those two points.

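As a minimal sketch of that layout, the shell snippet below writes two
small, made-up output files (k = 2, three query points; the file names
and values are illustrative only and were not produced by mlpack_lsh)
and reads one entry from each with awk. It assumes rows and columns are
counted from 0, which the text above does not state explicitly.

```shell
# Hypothetical sample output: 2 rows (neighbor rank) x 3 columns (query index).
cat > sample_neighbors.csv <<'EOF'
2,0,0
1,2,1
EOF
cat > sample_distances.csv <<'EOF'
0.5,0.7,0.5
0.7,0.9,0.9
EOF

# lookup FILE I J: print the entry at row I, column J.
# I and J are 0-based here (an assumption); awk itself counts from 1.
lookup() {
  awk -F, -v i="$2" -v j="$3" 'NR == i + 1 { print $(j + 1) }' "$1"
}

i=0   # neighbor rank
j=1   # query point index
echo "neighbor $i of query $j: reference point $(lookup sample_neighbors.csv "$i" "$j")," \
     "distance $(lookup sample_distances.csv "$i" "$j")"
```

Reading a whole column j gives every neighbor found for query point j,
ordered by rank.
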
Because LSH builds its hash tables from random projections, results
may differ from run to run. To make runs repeatable, set the random
seed with the --seed option.

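One way to check repeatability is to run the same search twice with the
same nonzero seed and compare the neighbor files. The sketch below
assumes 'input.csv' exists; the run helper and the output file names
are hypothetical, and the helper only echoes each command when the
program is not installed.

```shell
# run CMD ARGS...: echo the command, then execute it only if CMD is installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

seed=42   # any nonzero value; per the --seed option, 0 seeds from the clock

run mlpack_lsh -k 5 -r input.csv -n neighbors_a.csv -d distances_a.csv --seed "$seed"
run mlpack_lsh -k 5 -r input.csv -n neighbors_b.csv -d distances_b.csv --seed "$seed"

# With the same seed the two neighbor files are expected to be identical.
if [ -f neighbors_a.csv ] && [ -f neighbors_b.csv ]; then
  if cmp -s neighbors_a.csv neighbors_b.csv; then
    echo "runs match"
  else
    echo "runs differ"
  fi
fi
```
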
OPTIONAL INPUT OPTIONS
--bucket_size (-B) [int]
The size of a bucket in the second level hash. Default value 500.

--hash_width (-H) [double]
The hash width for the first-level hashing in the LSH preprocessing.
By default, the LSH class automatically estimates a hash width for
its use. Default value 0.

--help (-h)
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--input_model_file (-m) [string]
File to load LSH model from. (Cannot be specified with
--reference_file.) Default value ''.

--k (-k) [int]
Number of nearest neighbors to find. Default value 0.

--num_probes (-T) [int]
Number of additional probes for multiprobe LSH; if 0, traditional LSH
is used. Default value 0.

--projections (-K) [int]
The number of hash functions for each table. Default value 10.

--query_file (-q) [string]
File containing query points (optional). Default value ''.

--reference_file (-r) [string]
File containing the reference dataset. Default value ''.

--second_hash_size (-S) [int]
The size of the second level hash table. Default value 99901.

--seed (-s) [int]
Random seed. If 0, 'std::time(NULL)' is used. Default value 0.

--tables (-L) [int]
The number of hash tables to be used. Default value 30.

--true_neighbors_file (-t) [string]
File of true neighbors to compute recall with (the recall is printed
when -v is specified). Default value ''.

--verbose (-v)
Display informational messages and the full list of parameters and
timers at the end of execution.

--version (-V)
Display the version of mlpack.

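The options above can be combined to see how the number of hash tables
affects recall. This is only a sketch: it assumes the mlpack_knn
program from the same mlpack distribution is installed to compute exact
neighbors (with -k, -r, and -n meaning the same as for mlpack_lsh), and
the run helper and all file names are hypothetical. More tables
generally raise recall at the cost of time and memory; the recall is
printed because -v is given.

```shell
# run CMD ARGS...: echo the command, then execute it only if CMD is installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Exact 5-nearest neighbors, used as ground truth (assumes mlpack_knn is available).
run mlpack_knn -k 5 -r input.csv -n true_neighbors.csv

# Same search with a varying number of tables (-L); -K and --seed are held fixed
# so that only the table count changes between runs.
for tables in 10 30 60; do
  run mlpack_lsh -k 5 -r input.csv -t true_neighbors.csv \
      -L "$tables" -K 10 --seed 1 -v -n "neighbors_L$tables.csv"
done
```
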
OPTIONAL OUTPUT OPTIONS
--distances_file (-d) [string]
File to output distances into. Default value ''.

--neighbors_file (-n) [string]
File to output neighbors into. Default value ''.

--output_model_file (-M) [string]
File to save LSH model to. Default value ''.

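A saved model lets later runs skip rebuilding the hash tables. The
following sketch saves a model during one search and then reuses it for
a separate query set. The file names, including the '.xml' extension
chosen for the model, and the run helper are assumptions; note that the
second command passes -m without -r, since the two cannot be combined.

```shell
# run CMD ARGS...: echo the command, then execute it only if CMD is installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# First run: build the hash tables from the reference set, search, and save the model.
run mlpack_lsh -r input.csv -k 5 -n neighbors.csv -d distances.csv -M lsh_model.xml

# Later run: load the saved model and search it with a separate query set.
run mlpack_lsh -m lsh_model.xml -q queries.csv -k 5 \
    -n query_neighbors.csv -d query_distances.csv
```
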
ADDITIONAL INFORMATION
For further information, including relevant papers, citations, and
theory, consult the documentation found at http://www.mlpack.org or
included with your distribution of mlpack.

mlpack_lsh(1)