mlpack_approx_kfn(1)             User Commands            mlpack_approx_kfn(1)

NAME

       mlpack_approx_kfn - approximate furthest neighbor search

SYNOPSIS

       mlpack_approx_kfn [-a string] [-e bool] [-x string] [-m unknown] [-k int] [-p int] [-t int] [-q string] [-r string] [-V bool] [-d string] [-n string] [-M unknown] [-h -v]

DESCRIPTION

       This program implements two strategies for approximate furthest
       neighbor search. These strategies are:

              ·  The 'qdafn' algorithm from "Approximate Furthest Neighbor in
                 High Dimensions" by R. Pagh, F. Silvestri, J. Sivertsen, and
                 M. Skala, in Similarity Search and Applications 2015 (SISAP).

              ·  The 'DrusillaSelect' algorithm ('ds') from "Fast approximate
                 furthest neighbors with data-dependent candidate selection",
                 by R.R. Curtin and A.B. Gardner, in Similarity Search and
                 Applications 2016 (SISAP).

       These two strategies give approximate results for the furthest neighbor
       search problem and can be used as fast replacements for other furthest
       neighbor techniques such as those found in the mlpack_kfn program. Note
       that typically, the 'ds' algorithm requires far fewer tables and
       projections than the 'qdafn' algorithm.

       Specify a reference set (the set to search in) with '--reference_file
       (-r)', specify a query set with '--query_file (-q)', and specify
       algorithm parameters with '--num_tables (-t)' and '--num_projections
       (-p)'; if these are omitted, default values are used. The algorithm to
       use, either 'ds' (the default) or 'qdafn', may be specified with
       '--algorithm (-a)'. Also specify the number of neighbors to search for
       with '--k (-k)'.

       If no query set is specified, the reference set will be used as the
       query set. The '--output_model_file (-M)' output parameter may be used
       to store the built model, and an input model may be loaded instead of
       specifying a reference set with the '--input_model_file (-m)' option.

       Results for each query point can be stored with the '--neighbors_file
       (-n)' and '--distances_file (-d)' output parameters. Each row of these
       output matrices corresponds to one query point and holds the k
       neighbor indices or distances for that point.

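       The row layout described above can be inspected with standard tools.
       The following is a minimal sketch, not output of the program: the two
       CSV files are written by hand with made-up values for 3 query points
       and k=2, and it assumes that column j of the neighbors file pairs with
       column j of the distances file.

```shell
#!/bin/sh
# Sketch only: the values below are made up for illustration and are not
# real program output. They stand for 3 query points searched with k=2.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
printf '7,2\n7,5\n1,7\n' > neighbors.csv
printf '9.5,8.1\n7.2,6.9\n8.8,8.0\n' > distances.csv

k=2
# Join the two files row by row, then print each neighbor index next to
# its distance. Row N of both files belongs to query point N.
summary=$(paste -d, neighbors.csv distances.csv | awk -F, -v k="$k" '{
    printf "query row %d:", NR
    for (j = 1; j <= k; j++)
        printf " (neighbor %s, distance %s)", $j, $(j + k)
    printf "\n"
}')
echo "$summary"
```
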
       For example, to find the 5 approximate furthest neighbors with
       'reference_set.csv' as the reference set and 'query_set.csv' as the
       query set using DrusillaSelect, storing the furthest neighbor indices
       to 'neighbors.csv' and the furthest neighbor distances to
       'distances.csv', one could call

       $ mlpack_approx_kfn --query_file query_set.csv \
             --reference_file reference_set.csv --k 5 --algorithm ds \
             --neighbors_file neighbors.csv --distances_file distances.csv

       and to perform approximate all-furthest-neighbors search with k=1 on
       the set 'data.csv', storing only the furthest neighbor distances to
       'distances.csv', one could call

       $ mlpack_approx_kfn --reference_file data.csv --k 1 \
             --distances_file distances.csv

       A trained model can be re-used. If a model has been previously saved to
       'model.bin', then we may find 3 approximate furthest neighbors on a
       query set 'new_query_set.csv' using that model and store the furthest
       neighbor indices into 'neighbors.csv' by calling

       $ mlpack_approx_kfn --input_model_file model.bin \
             --query_file new_query_set.csv --k 3 \
             --neighbors_file neighbors.csv

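       The save-then-reuse workflow can be sketched end to end as follows.
       This is an illustration under stated assumptions rather than a
       reference script: the helper functions run and make_points are
       hypothetical, the data is synthetic, and the input layout (one point
       per row) is assumed. Only flags documented on this page are passed to
       mlpack_approx_kfn.

```shell
#!/bin/sh
# Sketch only: file names and the synthetic data are placeholders.
set -eu

# run CMD ARGS...: echo the command and execute it only if CMD is installed,
# so the script also works as a dry run on machines without mlpack.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# make_points N SEED: print N pseudo-random 3-dimensional points as CSV,
# one point per row (assumed input layout).
make_points() {
    awk -v n="$1" -v seed="$2" 'BEGIN {
        srand(seed)
        for (i = 0; i < n; i++)
            printf "%.4f,%.4f,%.4f\n", rand(), rand(), rand()
    }'
}

workdir=$(mktemp -d)
cd "$workdir"
make_points 200 1 > reference_set.csv
make_points 10 2 > new_query_set.csv

# Step 1: build the model from the reference set, run one search, and
# save the built model with --output_model_file.
run mlpack_approx_kfn --reference_file reference_set.csv --k 1 \
    --distances_file distances.csv --output_model_file model.bin

# Step 2: load the saved model in place of the reference set and search
# it with a new query set.
run mlpack_approx_kfn --input_model_file model.bin \
    --query_file new_query_set.csv --k 3 --neighbors_file neighbors.csv
```

       On a machine without mlpack installed, the script only prints the two
       commands, so it can also serve as a dry run.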

OPTIONAL INPUT OPTIONS

       --algorithm (-a) [string]
              Algorithm to use: 'ds' or 'qdafn'. Default value 'ds'.

       --calculate_error (-e) [bool]
              If set, calculate the average distance error for the first
              furthest neighbor only.

       --exact_distances_file (-x) [string]
              Matrix containing exact distances to furthest neighbors; this
              can be used to avoid explicit calculation when
              --calculate_error is set. Default value ''.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Get help on a specific module or option. Default value ''.

       --input_model_file (-m) [unknown]
              File containing input model. Default value ''.

       --k (-k) [int]
              Number of furthest neighbors to search for. Default value 0.

       --num_projections (-p) [int]
              Number of projections to use in each hash table. Default value
              5.

       --num_tables (-t) [int]
              Number of hash tables to use. Default value 5.

       --query_file (-q) [string]
              Matrix containing query points. Default value ''.

       --reference_file (-r) [string]
              Matrix containing the reference dataset. Default value ''.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters
              and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

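       As a sketch of how '--calculate_error (-e)' and
       '--exact_distances_file (-x)' might be combined, the script below
       computes exact distances once with mlpack_kfn and reuses them when
       measuring both algorithms. Several things here are assumptions and not
       statements from this page: the mlpack_kfn flag names (assumed to mirror
       the ones documented here), the hypothetical helpers run and
       make_points, the synthetic data, and the use of '--verbose (-v)' on the
       assumption that the error is reported as an informational message.

```shell
#!/bin/sh
# Sketch only: file names and the synthetic data are placeholders.
set -eu

# run CMD ARGS...: echo the command and execute it only if CMD is installed,
# so the script also works as a dry run on machines without mlpack.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# make_points N SEED: print N pseudo-random 3-dimensional points as CSV,
# one point per row (assumed input layout).
make_points() {
    awk -v n="$1" -v seed="$2" 'BEGIN {
        srand(seed)
        for (i = 0; i < n; i++)
            printf "%.4f,%.4f,%.4f\n", rand(), rand(), rand()
    }'
}

workdir=$(mktemp -d)
cd "$workdir"
make_points 200 1 > reference_set.csv
make_points 10 2 > query_set.csv

# Exact distances to the single furthest neighbor of each query point.
# The mlpack_kfn flags are assumed to mirror the ones documented here.
run mlpack_kfn --reference_file reference_set.csv \
    --query_file query_set.csv --k 1 --distances_file exact_distances.csv

# Reuse the exact distances when measuring the error of each algorithm.
for algorithm in ds qdafn; do
    run mlpack_approx_kfn --reference_file reference_set.csv \
        --query_file query_set.csv --k 1 --algorithm "$algorithm" \
        --calculate_error --exact_distances_file exact_distances.csv \
        --verbose
done
```

       Because 'ds' typically needs far fewer tables and projections than
       'qdafn', '--num_tables (-t)' and '--num_projections (-p)' could also
       be varied per algorithm inside the loop.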

OPTIONAL OUTPUT OPTIONS

       --distances_file (-d) [string]
              Matrix to save furthest neighbor distances to. Default value ''.

       --neighbors_file (-n) [string]
              Matrix to save neighbor indices to. Default value ''.

       --output_model_file (-M) [unknown]
              File to save output model to. Default value ''.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and
       theory, consult the documentation found at http://www.mlpack.org or
       included with your distribution of mlpack.

mlpack-3.0.4                   21 February 2019           mlpack_approx_kfn(1)