mlpack_nca(1)               General Commands Manual              mlpack_nca(1)

NAME
       mlpack_nca - neighborhood components analysis (nca)

SYNOPSIS
       mlpack_nca [-h] [-v]

DESCRIPTION
       This program implements Neighborhood Components Analysis, both a
       linear dimensionality reduction technique and a distance learning
       technique.  The method seeks to improve k-nearest-neighbor
       classification on a dataset by scaling the dimensions.  The method is
       nonparametric and does not require a value of k.  It works by using
       stochastic ("soft") neighbor assignments and optimizing over the
       gradient of the accuracy of those assignments.

       To work, this algorithm needs labeled data.  The labels can be given
       as the last row of the input dataset (--input_file), or alternatively
       in a separate file (--labels_file).

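       As a sketch (the file names here are hypothetical placeholders), a
       minimal invocation with labels in a separate file might look like:

       ```shell
       # Learn a distance metric on dataset.csv using labels from
       # labels.csv, and write the learned transformation matrix to
       # output.csv.  File names are placeholders.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv
       ```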
       This implementation of NCA uses stochastic gradient descent,
       mini-batch stochastic gradient descent, or the L-BFGS optimizer.
       These optimizers do not guarantee global convergence for a nonconvex
       objective function (NCA's objective function is nonconvex), so the
       final results could depend on the random seed or other optimizer
       parameters.

       Stochastic gradient descent, specified by --optimizer "sgd", depends
       primarily on two parameters: the step size (--step_size) and the
       maximum number of iterations (--max_iterations).  In addition, a
       normalized starting point can be used (--normalize), which is
       necessary if many warnings of the form 'Denominator of p_i is 0!' are
       given.  Tuning the step size can be tedious.  In general, the step
       size is too large if the objective is not mostly uniformly
       decreasing, or if zero-valued denominator warnings are being issued.
       The step size is too small if the objective is changing very slowly.
       Setting the termination condition is easy once a good step size is
       found: either increase the maximum number of iterations to a large
       value and allow SGD to find a minimum, or set the maximum number of
       iterations to 0 (allowing infinite iterations) and set the tolerance
       (--tolerance) to define the maximum allowed difference between
       objectives for SGD to terminate.  Be careful: setting the tolerance
       instead of the maximum number of iterations can take a very long time
       and may never converge due to the properties of the SGD optimizer.
       Note that a single iteration of SGD refers to a single point, so to
       take a single pass over the dataset, set --max_iterations equal to
       the number of points in the dataset.

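       For example, a sketch of an SGD run (the file names, step size, and
       dataset size of 10000 points are illustrative assumptions, not
       recommendations):

       ```shell
       # SGD with a smaller step size, a normalized starting point, and a
       # fixed seed for reproducibility.  One pass over a dataset of 10000
       # points corresponds to --max_iterations 10000.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv \
           --optimizer sgd --step_size 0.005 --normalize \
           --max_iterations 10000 --seed 1 -v
       ```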
       The mini-batch SGD optimizer, specified by --optimizer
       "minibatch-sgd", has the same parameters as SGD, but the batch size
       may also be specified with the --batch_size (-b) option.  Each
       iteration of mini-batch SGD refers to a single mini-batch.

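       A sketch of a mini-batch SGD run (file names and the batch size are
       illustrative; the default batch size is 50):

       ```shell
       # Mini-batch SGD with batches of 32 points per iteration.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv \
           --optimizer minibatch-sgd --batch_size 32
       ```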
       The L-BFGS optimizer, specified by --optimizer "lbfgs", uses a
       backtracking line search algorithm to minimize a function.  The
       following parameters are used by L-BFGS: --num_basis (specifies the
       number of memory points used by L-BFGS), --max_iterations,
       --armijo_constant, --wolfe, --tolerance (the optimization is
       terminated when the gradient norm is below this value),
       --max_line_search_trials, --min_step, and --max_step (the latter two
       refer to the line search routine).  For more details on the L-BFGS
       optimizer, consult either the mlpack L-BFGS documentation (in
       lbfgs.hpp) or the vast set of published literature on L-BFGS.

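       A sketch of an L-BFGS run (file names and the parameter values are
       illustrative assumptions):

       ```shell
       # L-BFGS with 10 memory points and a looser gradient-norm tolerance
       # than the default of 1e-07.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv \
           --optimizer lbfgs --num_basis 10 --tolerance 1e-5
       ```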
       By default, the SGD optimizer is used.

REQUIRED INPUT OPTIONS
       --input_file (-i) [string]
              Input dataset to run NCA on.

OPTIONAL INPUT OPTIONS
       --armijo_constant (-A) [double]
              Armijo constant for L-BFGS.  Default value 0.0001.

       --batch_size (-b) [int]
              Batch size for mini-batch SGD.  Default value 50.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --labels_file (-l) [string]
              File of labels for input dataset.  Default value ''.

       --linear_scan (-L)
              Don't shuffle the order in which data points are visited for
              SGD or mini-batch SGD.

       --max_iterations (-n) [int]
              Maximum number of iterations for SGD or L-BFGS (0 indicates no
              limit).  Default value 500000.

       --max_line_search_trials (-T) [int]
              Maximum number of line search trials for L-BFGS.  Default
              value 50.

       --max_step (-M) [double]
              Maximum step of line search for L-BFGS.  Default value 1e+20.

       --min_step (-m) [double]
              Minimum step of line search for L-BFGS.  Default value 1e-20.

       --normalize (-N)
              Use a normalized starting point for optimization.  This is
              useful when points are far apart, or when SGD is returning
              NaN.

       --num_basis (-B) [int]
              Number of memory points to be stored for L-BFGS.  Default
              value 5.

       --optimizer (-O) [string]
              Optimizer to use: 'sgd', 'minibatch-sgd', or 'lbfgs'.  Default
              value 'sgd'.

       --seed (-s) [int]
              Random seed.  If 0, 'std::time(NULL)' is used.  Default value
              0.

       --step_size (-a) [double]
              Step size for stochastic gradient descent (alpha).  Default
              value 0.01.

       --tolerance (-t) [double]
              Maximum tolerance for termination of SGD or L-BFGS.  Default
              value 1e-07.

       --verbose (-v)
              Display informational messages and the full list of parameters
              and timers at the end of execution.

       --version (-V)
              Display the version of mlpack.

       --wolfe (-w) [double]
              Wolfe condition parameter for L-BFGS.  Default value 0.9.

OPTIONAL OUTPUT OPTIONS
       --output_file (-o) [string]
              Output file for learned distance matrix.  Default value ''.

ADDITIONAL INFORMATION
       For further information, including relevant papers, citations, and
       theory, consult the documentation found at http://www.mlpack.org or
       included with your distribution of mlpack.