mlpack_nca(1)               General Commands Manual              mlpack_nca(1)

NAME
       mlpack_nca - neighborhood components analysis (nca)

SYNOPSIS
       mlpack_nca [-h] [-v]

DESCRIPTION
       This program implements Neighborhood Components Analysis, both a
       linear dimensionality reduction technique and a distance learning
       technique.  The method seeks to improve k-nearest-neighbor
       classification on a dataset by scaling the dimensions.  The method is
       nonparametric and does not require a value of k.  It works by using
       stochastic ("soft") neighbor assignments and optimizing over the
       gradient of the accuracy of those assignments.

       To work, this algorithm needs labeled data.  The labels can be given
       as the last row of the input dataset (--input_file), or alternatively
       in a separate file (--labels_file).

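       As a sketch (the file names here are hypothetical placeholders), a
       minimal invocation with labels in a separate file might look like:

       ```shell
       # Learn a distance metric on dataset.csv using labels from
       # labels.csv, and write the learned transformation matrix to
       # output.csv.  File names are placeholders.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv
       ```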
       This implementation of NCA uses stochastic gradient descent,
       mini-batch stochastic gradient descent, or the L-BFGS optimizer.
       These optimizers do not guarantee global convergence for a nonconvex
       objective function (NCA's objective function is nonconvex), so the
       final results could depend on the random seed or other optimizer
       parameters.

       Stochastic gradient descent, specified by --optimizer "sgd", depends
       primarily on two parameters: the step size (--step_size) and the
       maximum number of iterations (--max_iterations).  In addition, a
       normalized starting point can be used (--normalize), which is
       necessary if many warnings of the form 'Denominator of p_i is 0!' are
       given.  Tuning the step size can be tedious.  In general, the step
       size is too large if the objective is not mostly uniformly
       decreasing, or if zero-valued denominator warnings are being issued.
       The step size is too small if the objective is changing very slowly.
       Setting the termination condition is easy once a good step size is
       found: either increase the maximum number of iterations to a large
       value and allow SGD to find a minimum, or set the maximum number of
       iterations to 0 (allowing infinite iterations) and set the tolerance
       (--tolerance) to define the maximum allowed difference between
       objectives for SGD to terminate.  Be careful: setting the tolerance
       instead of the maximum number of iterations can take a very long time
       and may never converge due to the properties of the SGD optimizer.
       Note that a single iteration of SGD refers to a single point, so to
       take a single pass over the dataset, set --max_iterations equal to
       the number of points in the dataset.

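       For example, a sketch of an SGD run (the file names, step size, and
       dataset size of 10000 points are illustrative assumptions, not
       recommendations):

       ```shell
       # SGD with a smaller step size, a normalized starting point, and a
       # fixed seed for reproducibility.  One pass over a dataset of 10000
       # points corresponds to --max_iterations 10000.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv \
           --optimizer sgd --step_size 0.005 --normalize \
           --max_iterations 10000 --seed 1 -v
       ```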
       The mini-batch SGD optimizer, specified by --optimizer
       "minibatch-sgd", has the same parameters as SGD, but the batch size
       may also be specified with the --batch_size (-b) option.  Each
       iteration of mini-batch SGD refers to a single mini-batch.

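       A sketch of a mini-batch SGD run (file names and the batch size are
       illustrative; the default batch size is 50):

       ```shell
       # Mini-batch SGD with batches of 32 points per iteration.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv \
           --optimizer minibatch-sgd --batch_size 32
       ```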
       The L-BFGS optimizer, specified by --optimizer "lbfgs", uses a
       backtracking line search algorithm to minimize a function.  The
       following parameters are used by L-BFGS: --num_basis (specifies the
       number of memory points used by L-BFGS), --max_iterations,
       --armijo_constant, --wolfe, --tolerance (the optimization is
       terminated when the gradient norm is below this value),
       --max_line_search_trials, --min_step, and --max_step (the latter two
       refer to the line search routine).  For more details on the L-BFGS
       optimizer, consult either the mlpack L-BFGS documentation (in
       lbfgs.hpp) or the vast set of published literature on L-BFGS.

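       A sketch of an L-BFGS run (file names and the parameter values are
       illustrative assumptions):

       ```shell
       # L-BFGS with 10 memory points and a looser gradient-norm tolerance
       # than the default of 1e-07.
       mlpack_nca -i dataset.csv -l labels.csv -o output.csv \
           --optimizer lbfgs --num_basis 10 --tolerance 1e-5
       ```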
       By default, the SGD optimizer is used.

REQUIRED INPUT OPTIONS
       --input_file (-i) [string]
              Input dataset to run NCA on.

OPTIONAL INPUT OPTIONS
       --armijo_constant (-A) [double]
              Armijo constant for L-BFGS.  Default value 0.0001.

       --batch_size (-b) [int]
              Batch size for mini-batch SGD.  Default value 50.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --labels_file (-l) [string]
              File of labels for input dataset.  Default value ''.

       --linear_scan (-L)
              Don't shuffle the order in which data points are visited for
              SGD or mini-batch SGD.

       --max_iterations (-n) [int]
              Maximum number of iterations for SGD or L-BFGS (0 indicates no
              limit).  Default value 500000.

       --max_line_search_trials (-T) [int]
              Maximum number of line search trials for L-BFGS.  Default
              value 50.

       --max_step (-M) [double]
              Maximum step of line search for L-BFGS.  Default value 1e+20.

       --min_step (-m) [double]
              Minimum step of line search for L-BFGS.  Default value 1e-20.

       --normalize (-N)
              Use a normalized starting point for optimization.  This is
              useful when points are far apart, or when SGD is returning
              NaN.

       --num_basis (-B) [int]
              Number of memory points to be stored for L-BFGS.  Default
              value 5.

       --optimizer (-O) [string]
              Optimizer to use: 'sgd', 'minibatch-sgd', or 'lbfgs'.  Default
              value 'sgd'.

       --seed (-s) [int]
              Random seed.  If 0, 'std::time(NULL)' is used.  Default value
              0.

       --step_size (-a) [double]
              Step size for stochastic gradient descent (alpha).  Default
              value 0.01.

       --tolerance (-t) [double]
              Maximum tolerance for termination of SGD or L-BFGS.  Default
              value 1e-07.

       --verbose (-v)
              Display informational messages and the full list of parameters
              and timers at the end of execution.

       --version (-V)
              Display the version of mlpack.

       --wolfe (-w) [double]
              Wolfe condition parameter for L-BFGS.  Default value 0.9.

OPTIONAL OUTPUT OPTIONS
       --output_file (-o) [string]
              Output file for learned distance matrix.  Default value ''.

ADDITIONAL INFORMATION
       For further information, including relevant papers, citations, and
       theory, consult the documentation found at http://www.mlpack.org or
       included with your distribution of mlpack.