mlpack_krann(1)             General Commands Manual            mlpack_krann(1)

NAME

       mlpack_krann - k-rank-approximate-nearest-neighbors (krann)

SYNOPSIS

       mlpack_krann [-h] [-v]

DESCRIPTION

       This program will calculate the k rank-approximate-nearest-neighbors of
       a set of points. You may specify a separate set of reference points and
       query points, or just a reference set, which will be used as both the
       reference and query set. You must specify the rank approximation (in %)
       and may optionally specify the success probability.

       For example, the following will return 5 neighbors from the top 0.1% of
       the data (with probability 0.95) for each point in 'input.csv' and
       store the distances in 'distances.csv' and the neighbors in
       'neighbors.csv':

       $ mlpack_krann -k 5 -r input.csv -d distances.csv -n neighbors.csv --tau 0.1

       Note that tau must be set such that the number of points in the
       corresponding percentile of the data is greater than k. Thus, if we
       choose tau = 0.1 with a dataset of 1000 points and k = 5, then we are
       attempting to choose 5 nearest neighbors out of the closest 1 point;
       this is invalid and the program will terminate with an error message.
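
       The constraint on tau can be checked by hand before running the
       program. The following is a minimal Python sketch of that arithmetic,
       not the program's own validation code: the function names are
       hypothetical, and the strict "greater than k" comparison is taken from
       the wording above. How the program rounds the percentile count is not
       stated here, so treat borderline cases with care.

```python
def points_in_percentile(n_points: int, tau: float) -> float:
    """Number of reference points covered by the top tau percent of the data."""
    return n_points * tau / 100.0


def tau_is_valid(n_points: int, k: int, tau: float) -> bool:
    """True when the top-tau percentile holds more than k points.

    Assumption: the comparison is strict, following the man page wording.
    """
    return points_in_percentile(n_points, tau) > k


if __name__ == "__main__":
    # The invalid case from the text: 1000 points, k = 5, tau = 0.1.
    print(points_in_percentile(1000, 0.1), tau_is_valid(1000, 5, 0.1))
    # The default tau of 5 on the same data covers 50 points.
    print(points_in_percentile(1000, 5.0), tau_is_valid(1000, 5, 5.0))
```

       With 1000 points, tau = 0.1 covers about one point, so k = 5 is
       rejected, while the default tau of 5 covers about 50 points.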

       The output files are organized such that row i and column j in the
       neighbors output file corresponds to the index of the point in the
       reference set which is the i'th nearest neighbor from the point in the
       query set with index j. Row i and column j in the distances output
       file corresponds to the distance between those two points.
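
       The layout described above can be illustrated with a short Python
       sketch. It assumes comma-separated files without a header and 0-based
       indices for both i and j; the file contents below are made-up toy data
       (k = 2, three query points), and the helper names are hypothetical.

```python
import csv
import io

# Toy stand-ins for 'neighbors.csv' and 'distances.csv' (invented values).
# Row i holds the i'th nearest neighbor; column j is the query point.
NEIGHBORS_CSV = "1,0,1\n2,2,0\n"
DISTANCES_CSV = "0.5,0.5,0.7\n0.9,0.7,0.9\n"


def read_matrix(text, cast):
    """Parse CSV text into a list of rows, converting every cell with cast."""
    reader = csv.reader(io.StringIO(text))
    return [[cast(cell) for cell in row] for row in reader if row]


def neighbor_of(neighbors, distances, i, j):
    """Return (reference index, distance) for the i'th neighbor of query j."""
    return neighbors[i][j], distances[i][j]


neighbors = read_matrix(NEIGHBORS_CSV, int)
distances = read_matrix(DISTANCES_CSV, float)

if __name__ == "__main__":
    for j in range(len(neighbors[0])):
        print(j, [neighbor_of(neighbors, distances, i, j)
                  for i in range(len(neighbors))])
```

       To read real output, replace the io.StringIO wrappers with open() on
       the files passed to --neighbors_file and --distances_file.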

OPTIONAL INPUT OPTIONS

       --alpha (-a) [double]
              The desired success probability. Default value 0.95.

       --first_leaf_exact (-X)
              The flag to trigger sampling only after exactly exploring the
              first leaf.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option. Default value ''.

       --input_model_file (-m) [string]
              File containing pre-trained kNN model. Default value ''.

       --k (-k) [int]
              Number of nearest neighbors to find. Default value 0.

       --leaf_size (-l) [int]
              Leaf size for tree building (used for kd-trees, UB trees, R
              trees, R* trees, X trees, Hilbert R trees, R+ trees, R++ trees,
              and octrees). Default value 20.

       --naive (-N)
              If true, sampling will be done without using a tree.

       --query_file (-q) [string]
              File containing query points (optional). Default value ''.

       --random_basis (-R)
              Before tree-building, project the data onto a random orthogonal
              basis.

       --reference_file (-r) [string]
              File containing the reference dataset. Default value ''.

       --sample_at_leaves (-L)
              The flag to trigger sampling at leaves.

       --seed [int]
              Random seed (if 0, std::time(NULL) is used). Default value 0.

       --single_mode (-s)
              If true, single-tree search is used (as opposed to dual-tree
              search).

       --single_sample_limit (-S) [int]
              The limit on the maximum number of samples (and hence the
              largest node you can approximate). Default value 20.

       --tau (-T) [double]
              The allowed rank-error in terms of the percentile of the data.
              Default value 5.

       --tree_type (-t) [string]
              Type of tree to use: 'kd', 'ub', 'cover', 'r', 'x', 'r-star',
              'hilbert-r', 'r-plus', 'r-plus-plus', 'oct'. Default value 'kd'.

       --verbose (-v)
              Display informational messages and the full list of parameters
              and timers at the end of execution.

       --version (-V)
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --distances_file (-d) [string]
              File to output distances into. Default value ''.

       --neighbors_file (-n) [string]
              File to output neighbors into. Default value ''.

       --output_model_file (-M) [string]
              If specified, the kNN model will be saved to the given file.
              Default value ''.
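
       When the program is driven from a script, the options listed above can
       be assembled into an argument list. The following Python sketch only
       builds and prints the command line; it does not run anything. The
       function name and its keyword arguments are hypothetical, and only
       long option names documented on this page are used.

```python
import shlex

# Tree types accepted by --tree_type, as listed on this page.
TREE_TYPES = {"kd", "ub", "cover", "r", "x", "r-star",
              "hilbert-r", "r-plus", "r-plus-plus", "oct"}


def build_krann_command(reference_file, k, tau=5.0, alpha=0.95,
                        tree_type="kd", query_file=None,
                        distances_file=None, neighbors_file=None):
    """Assemble an argument list for mlpack_krann from documented options."""
    if tree_type not in TREE_TYPES:
        raise ValueError("unknown tree type: " + tree_type)
    argv = ["mlpack_krann",
            "--reference_file", reference_file,
            "--k", str(k),
            "--tau", str(tau),
            "--alpha", str(alpha),
            "--tree_type", tree_type]
    # Optional files are appended only when they were given.
    for flag, value in (("--query_file", query_file),
                        ("--distances_file", distances_file),
                        ("--neighbors_file", neighbors_file)):
        if value:
            argv += [flag, value]
    return argv


if __name__ == "__main__":
    cmd = build_krann_command("input.csv", 5, tau=0.1,
                              distances_file="distances.csv",
                              neighbors_file="neighbors.csv")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

       The returned list can be handed to subprocess.run() on a system where
       mlpack_krann is installed.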

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and
       theory, consult the documentation found at http://www.mlpack.org or
       included with your distribution of mlpack.

                                                                 mlpack_krann(1)