mlpack_lsh(1) User Commands mlpack_lsh(1)

NAME
mlpack_lsh - k-approximate-nearest-neighbor search with LSH

SYNOPSIS
mlpack_lsh [-B int] [-H double] [-m unknown] [-k int] [-T int] [-K int] [-q string] [-r string] [-S int] [-s int] [-L int] [-t string] [-V bool] [-d string] [-n string] [-M unknown] [-h -v]

DESCRIPTION
This program calculates the k approximate nearest neighbors of a set
of points using locality-sensitive hashing (LSH). You may specify
separate sets of reference points and query points, or just a
reference set, which is then used as both the reference and query set.

For example, the following command returns 5 neighbors for each point
in 'input.csv', storing the distances in 'distances.csv' and the
neighbors in 'neighbors.csv':

$ mlpack_lsh --k 5 --reference_file input.csv --distances_file distances.csv \
  --neighbors_file neighbors.csv

The output is organized such that the entry at row i and column j of
the neighbors output is the index of the point in the reference set
that is the j'th nearest neighbor of the point in the query set with
index i. The entry at row i and column j of the distances output is
the distance between those two points.
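
The layout described above can be illustrated with a small Python sketch.
It uses exact brute-force search rather than LSH, and the function name
knn_layout and the sample points are hypothetical; only the row/column
convention is taken from this page.

```python
import math

def knn_layout(reference, queries, k):
    """Exact k-NN by brute force, returned in the row-i / column-j layout."""
    neighbors, distances = [], []
    for q in queries:
        # Distance from this query point to every reference point, nearest first.
        scored = sorted((math.dist(q, r), idx) for idx, r in enumerate(reference))
        neighbors.append([idx for _, idx in scored[:k]])
        distances.append([d for d, _ in scored[:k]])
    return neighbors, distances

# Hypothetical sample data: four reference points and two query points.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
queries = [(0.0, 0.1), (4.0, 4.0)]
neighbors, distances = knn_layout(reference, queries, k=2)

# neighbors[i][j] is the reference index of the j'th nearest neighbor of query i;
# distances[i][j] is the distance between that same pair of points.
for i, (n_row, d_row) in enumerate(zip(neighbors, distances)):
    print(i, n_row, [round(d, 3) for d in d_row])
```

Both matrices have one row per query point and k columns, so the same
(i, j) position in each refers to the same pair of points.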

Because this is an approximate nearest neighbor search that relies on
random hash functions, results may differ from run to run. To make
runs reproducible, set the random seed to a fixed nonzero value with
the '--seed (-s)' parameter.
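
The effect of a fixed seed can be sketched in Python. This only
illustrates the general principle that a seeded generator yields the
same random projections on every run; it does not reproduce mlpack's
random number generator, and draw_projections is a hypothetical helper.

```python
import random

def draw_projections(seed, count, dim):
    """Draw `count` Gaussian projection vectors of length `dim` from a seeded generator."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]

same_a = draw_projections(seed=42, count=3, dim=2)
same_b = draw_projections(seed=42, count=3, dim=2)
other = draw_projections(seed=7, count=3, dim=2)

print(same_a == same_b)  # compares two draws made with the same seed
print(same_a == other)   # compares draws made with different seeds
```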

This program also has many other parameters to control its
functionality; see the parameter-specific documentation for more
information.

OPTIONAL INPUT OPTIONS
--bucket_size (-B) [int]
The size of a bucket in the second-level hash. Default value 500.

--hash_width (-H) [double]
The hash width for the first-level hashing in the LSH preprocessing.
By default, the LSH class automatically estimates a hash width for
its use. Default value 0.

--help (-h) [bool]
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--input_model_file (-m) [unknown]
Input LSH model. Default value ''.

--k (-k) [int]
Number of nearest neighbors to find. Default value 0.

--num_probes (-T) [int]
Number of additional probes for multiprobe LSH; if 0, traditional LSH
is used. Default value 0.

--projections (-K) [int]
The number of hash functions for each table. Default value 10.

--query_file (-q) [string]
Matrix containing query points (optional). Default value ''.

--reference_file (-r) [string]
Matrix containing the reference dataset. Default value ''.

--second_hash_size (-S) [int]
The size of the second-level hash table. Default value 99901.

--seed (-s) [int]
Random seed. If 0, 'std::time(NULL)' is used. Default value 0.

--tables (-L) [int]
The number of hash tables to be used. Default value 30.

--true_neighbors_file (-t) [string]
Matrix of true neighbors to compute recall with (the recall is
printed when -v is specified). Default value ''.

--verbose (-v) [bool]
Display informational messages and the full list of parameters and
timers at the end of execution.

--version (-V) [bool]
Display the version of mlpack.
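
How the hash width (-H), the number of projections per table (-K), and
the number of tables (-L) interact can be sketched with a toy index in
Python. This is a minimal illustration that assumes the common p-stable
scheme h(x) = floor((a . x + b) / w); it is not mlpack's implementation,
and it omits the second-level hash, the bucket size limit, and
multiprobe querying. The class TinyLSH and the sample data are
hypothetical.

```python
import math
import random
from collections import defaultdict

class TinyLSH:
    """Toy LSH index: `tables` hash tables, each keyed by `projections` hash values."""

    def __init__(self, reference, tables, projections, hash_width, seed):
        rng = random.Random(seed)
        dim = len(reference[0])
        self.reference = reference
        self.width = hash_width
        # One (direction, offset) pair per hash function, grouped by table.
        self.funcs = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, hash_width))
             for _ in range(projections)]
            for _ in range(tables)
        ]
        self.buckets = [defaultdict(list) for _ in range(tables)]
        for idx, point in enumerate(reference):
            for t in range(tables):
                self.buckets[t][self._key(t, point)].append(idx)

    def _key(self, table, point):
        # h(x) = floor((a . x + b) / w), concatenated over the table's projections.
        return tuple(
            math.floor((sum(a_i * x_i for a_i, x_i in zip(a, point)) + b) / self.width)
            for a, b in self.funcs[table]
        )

    def query(self, point, k):
        # Candidates are all reference points sharing a bucket with the query in any table.
        candidates = set()
        for t, table in enumerate(self.buckets):
            candidates.update(table.get(self._key(t, point), []))
        # Rank the candidates by true distance and keep at most k of them.
        ranked = sorted(candidates, key=lambda idx: math.dist(point, self.reference[idx]))
        return ranked[:k]

# Hypothetical sample data: two well-separated clusters.
reference = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (9.0, 9.0), (9.1, 9.0)]
index = TinyLSH(reference, tables=30, projections=2, hash_width=4.0, seed=1)
print(index.query((0.05, 0.05), k=2))
```

In this sketch, more projections per table make each bucket more
selective, more tables give each true neighbor more chances to collide
with the query, and a larger hash width makes buckets coarser. A query
may return fewer than k results when too few candidates collide.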

OPTIONAL OUTPUT OPTIONS
--distances_file (-d) [string]
Matrix to output distances into. Default value ''.

--neighbors_file (-n) [string]
Matrix to output neighbors into. Default value ''.

--output_model_file (-M) [unknown]
Output for trained LSH model. Default value ''.
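
The recall reported with '--true_neighbors_file (-t)' compares the
neighbors found by LSH against a matrix of true neighbors. A minimal
Python sketch follows, assuming recall is defined as the fraction of
true neighbors that also appear among the found neighbors of the same
query point. The function name recall and the sample matrices are
hypothetical, and the exact formula mlpack uses may differ.

```python
def recall(found, truth):
    """Fraction of true neighbors that also appear in the found neighbors, per query."""
    if len(found) != len(truth):
        raise ValueError("found and truth must have one row per query point")
    # Count, for each query, how many true neighbors were recovered.
    hits = sum(len(set(f_row) & set(t_row)) for f_row, t_row in zip(found, truth))
    total = sum(len(t_row) for t_row in truth)
    return hits / total if total else 0.0

# Hypothetical neighbor matrices: one row per query point, one column per neighbor.
truth = [[0, 1, 2], [3, 4, 5]]
found = [[0, 2, 7], [3, 4, 5]]
print(recall(found, truth))
```

Here five of the six true neighbors are recovered, so the recall is 5/6.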

ADDITIONAL INFORMATION
For further information, including relevant papers, citations, and
theory, consult the documentation found at http://www.mlpack.org or
included with your distribution of mlpack.

mlpack-3.0.4 21 February 2019 mlpack_lsh(1)