mlpack_kernel_pca(1) General Commands Manual mlpack_kernel_pca(1)

mlpack_kernel_pca - kernel principal components analysis

mlpack_kernel_pca [-h] [-v]

This program performs Kernel Principal Components Analysis (KPCA) on
the specified dataset with the specified kernel. This will transform
the data onto the kernel principal components, and optionally reduce
the dimensionality by ignoring the kernel principal components with the
smallest eigenvalues.

For the case where a linear kernel is used, this reduces to regular
PCA.
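
The reduction to regular PCA can be checked numerically. The sketch below is illustrative only and is not mlpack's implementation: it assumes the standard KPCA formulation in which the kernel (Gram) matrix is double-centered before eigendecomposition, and it uses a plain power iteration to compare the largest eigenvalue of the centered linear-kernel Gram matrix with that of the PCA scatter matrix. All function names and the sample points are hypothetical.

```python
import math

def linear_kernel(x, y):
    # K(x, y) = x^T y
    return sum(a * b for a, b in zip(x, y))

def center_gram(K):
    # Double-center a symmetric Gram matrix: subtract row and column
    # means and add back the overall mean.
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def top_eigenvalue(A, iterations=500):
    # Largest eigenvalue of a symmetric positive semidefinite matrix,
    # estimated by power iteration and a final Rayleigh quotient.
    n = len(A)
    v = [1.0 + i for i in range(n)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, Av))

# Hypothetical sample data: five points in two dimensions.
points = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.5], [1.0, -1.5], [-1.0, 0.0]]
n, d = len(points), len(points[0])

# PCA side: scatter matrix of the mean-centered data (d x d).
means = [sum(p[k] for p in points) / n for k in range(d)]
centered = [[p[k] - means[k] for k in range(d)] for p in points]
scatter = [[sum(c[a] * c[b] for c in centered) for b in range(d)]
           for a in range(d)]

# KPCA side: centered Gram matrix under the linear kernel (n x n).
gram = center_gram([[linear_kernel(x, y) for y in points] for x in points])

pca_top = top_eigenvalue(scatter)
kpca_top = top_eigenvalue(gram)
print(pca_top, kpca_top)  # the two values agree up to rounding
```

The two matrices have different sizes but share their nonzero eigenvalues, which is why the leading components coincide.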

For example, the following will perform KPCA on the 'input.csv' file
using the Gaussian kernel and store the transformed data in the
'transformed.csv' file.

$ mlpack_kernel_pca -i input.csv -k gaussian -o transformed.csv

The kernels that are supported are listed below:

· 'linear': the standard linear dot product (same as normal PCA):
  K(x, y) = x^T y

· 'gaussian': a Gaussian kernel; requires bandwidth:
  K(x, y) = exp(-(|| x - y || ^ 2) / (2 * (bandwidth ^ 2)))

· 'polynomial': polynomial kernel; requires offset and degree:
  K(x, y) = (x^T y + offset) ^ degree

· 'hyptan': hyperbolic tangent kernel; requires scale and offset:
  K(x, y) = tanh(scale * (x^T y) + offset)

· 'laplacian': Laplacian kernel; requires bandwidth:
  K(x, y) = exp(-(|| x - y ||) / bandwidth)

· 'epanechnikov': Epanechnikov kernel; requires bandwidth:
  K(x, y) = max(0, 1 - || x - y ||^2 / bandwidth^2)

· 'cosine': cosine distance:
  K(x, y) = 1 - (x^T y) / (|| x || * || y ||)
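
The formulas above can be transcribed directly. The following is a minimal sketch in plain Python, written from the formulas as listed here rather than from mlpack's source; the function names are hypothetical, and the default parameter values mirror the documented option defaults (bandwidth 1, degree 1, offset 0, scale 1).

```python
import math

def dot(x, y):
    # x^T y
    return sum(a * b for a, b in zip(x, y))

def dist(x, y):
    # Euclidean distance || x - y ||
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def linear(x, y):
    return dot(x, y)

def gaussian(x, y, bandwidth=1.0):
    return math.exp(-dist(x, y) ** 2 / (2 * bandwidth ** 2))

def polynomial(x, y, offset=0.0, degree=1.0):
    # A negative base with a non-integer degree is not handled here.
    return (dot(x, y) + offset) ** degree

def hyptan(x, y, scale=1.0, offset=0.0):
    return math.tanh(scale * dot(x, y) + offset)

def laplacian(x, y, bandwidth=1.0):
    return math.exp(-dist(x, y) / bandwidth)

def epanechnikov(x, y, bandwidth=1.0):
    return max(0.0, 1 - dist(x, y) ** 2 / bandwidth ** 2)

def cosine(x, y):
    # Undefined when either vector has zero norm.
    return 1 - dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

x, y = [1.0, 2.0], [3.0, 4.0]
print(linear(x, y))                          # x^T y = 1*3 + 2*4
print(gaussian(x, y, bandwidth=2.0))         # exp(-8 / 8)
print(polynomial(x, y, offset=1, degree=2))  # (11 + 1) ^ 2
```

Note that the Epanechnikov kernel is exactly zero once the distance between two points exceeds the bandwidth, so the default bandwidth of 1 may need adjusting to the scale of the data.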

The parameters for each of the kernels should be specified with the
options --bandwidth, --kernel_scale, --offset, or --degree (or a
combination of those options).

Optionally, the Nyström method ("Using the Nystroem method to speed up
kernel machines", 2001) can be used to approximate the kernel matrix by
specifying the --nystroem_method (-n) option. This approach works by
using a subset of the data as a basis to reconstruct the kernel matrix.
The sampling scheme for the Nyström method is specified with the
--sampling parameter and can be one of: kmeans, random, ordered.
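
The idea of reconstructing the kernel matrix from a subset can be sketched as follows. This is a simplified illustration of the general Nyström approximation K ≈ C W^-1 C^T, not mlpack's implementation: it is restricted to exactly two landmark points so that W can be inverted in closed form, and it picks the first two points as landmarks, which is only an assumption about what an 'ordered' scheme might look like. All names and the sample points are hypothetical.

```python
import math

def gaussian(x, y, bandwidth=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * bandwidth ** 2))

def nystroem_two_landmarks(points, landmark_idx, kernel):
    # Approximate the full n x n kernel matrix as C W^-1 C^T, where
    # C holds kernel values between all points and the two landmarks
    # and W holds kernel values among the landmarks themselves.
    i0, i1 = landmark_idx
    landmarks = [points[i0], points[i1]]
    C = [[kernel(p, l) for l in landmarks] for p in points]  # n x 2
    a = kernel(landmarks[0], landmarks[0])
    b = kernel(landmarks[0], landmarks[1])
    d = kernel(landmarks[1], landmarks[1])
    det = a * d - b * b  # W is symmetric, so its off-diagonals are equal
    W_inv = [[d / det, -b / det], [-b / det, a / det]]
    CW = [[row[0] * W_inv[0][0] + row[1] * W_inv[1][0],
           row[0] * W_inv[0][1] + row[1] * W_inv[1][1]] for row in C]
    return [[cw[0] * cj[0] + cw[1] * cj[1] for cj in C] for cw in CW]

# Hypothetical sample data.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
landmark_idx = (0, 1)  # first two points (assumed 'ordered'-style choice)

exact = [[gaussian(x, y) for y in points] for x in points]
approx = nystroem_two_landmarks(points, landmark_idx, gaussian)

worst = max(abs(approx[i][j] - exact[i][j])
            for i in range(len(points)) for j in range(len(points)))
print("largest absolute error:", worst)
```

Rows and columns belonging to the landmark points are reproduced exactly; the remaining entries are approximated, and the error generally shrinks as more landmarks are used or as they are chosen to cover the data better.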

--input_file (-i) [string]
Input dataset to perform KPCA on.

--kernel (-k) [string]
The kernel to use; see the above documentation for the list of
usable kernels.

--bandwidth (-b) [double]
Bandwidth, for 'gaussian', 'laplacian', and 'epanechnikov' kernels.
Default value 1.

--center (-c)
If set, the transformed data will be centered about the origin.

--degree (-D) [double]
Degree of polynomial, for 'polynomial' kernel. Default value 1.

--help (-h)
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--kernel_scale (-S) [double]
Scale, for 'hyptan' kernel. Default value 1.

--new_dimensionality (-d) [int]
If not 0, reduce the dimensionality of the output dataset by ignoring
the dimensions with the smallest eigenvalues. Default value 0.

--nystroem_method (-n)
If set, the Nyström method will be used.

--offset (-O) [double]
Offset, for 'hyptan' and 'polynomial' kernels. Default value 0.

--sampling (-s) [string]
Sampling scheme to use for the Nyström method: 'kmeans', 'random',
'ordered'. Default value 'kmeans'.

--verbose (-v)
Display informational messages and the full list of parameters
and timers at the end of execution.

--version (-V)
Display the version of mlpack.

--output_file (-o) [string]
File to save modified dataset to. Default value ''.

For further information, including relevant papers, citations, and
theory, consult the documentation found at http://www.mlpack.org or
included with your distribution of mlpack.