mlpack_dbscan(1)          General Commands Manual         mlpack_dbscan(1)

mlpack_dbscan - DBSCAN clustering

mlpack_dbscan [-h] [-v]

This program implements the DBSCAN algorithm for clustering using
accelerated tree-based range search. The type of tree that is used may
be parameterized, or brute-force range search may be used instead.

The input dataset to be clustered may be specified with the
--input_file option, the radius of each range search may be specified
with the --epsilon option, and the minimum number of points in a
cluster may be specified with the --min_size option.
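
To make the roles of --epsilon and --min_size concrete, the following
is a minimal brute-force DBSCAN sketch in Python. It is not mlpack's
implementation. The function names, the use of -1 as a noise label, and
the choice to count a point as a member of its own neighborhood are all
assumptions made for this illustration.

```python
from math import dist

NOISE = -1  # label chosen for this sketch only


def range_search(points, i, epsilon):
    """Brute-force range search: indices of points within epsilon of points[i]."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= epsilon]


def dbscan(points, epsilon=1.0, min_size=5):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = range_search(points, i, epsilon)
        if len(neighbors) < min_size:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = range_search(points, j, epsilon)
            if len(more) >= min_size:
                queue.extend(more)  # j is dense enough to keep expanding
        cluster += 1
    return labels
```

A larger epsilon merges nearby groups into one cluster, while a larger
min_size turns sparse groups into noise.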

The output of the clustering may be saved with the --assignments_file
or --centroids_file option; --assignments_file will save the cluster
assignment of each point, and --centroids_file will save the centroid
of each cluster.
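
The relationship between the two outputs can be sketched in Python:
given the points and their assignments, each centroid is taken here as
the mean of the points assigned to that cluster. The function name and
the -1 noise label are hypothetical, and how mlpack itself labels or
treats unclustered points is not specified on this page.

```python
def centroids_from_assignments(points, assignments, noise_label=-1):
    """Map each cluster label to the mean of the points assigned to it."""
    sums = {}
    counts = {}
    for p, label in zip(points, assignments):
        if label == noise_label:
            continue  # assumption: noise points belong to no cluster
        if label not in sums:
            sums[label] = [0.0] * len(p)
            counts[label] = 0
        for d, x in enumerate(p):
            sums[label][d] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}
```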

The range search may be controlled with the --tree_type, --single_mode,
and --naive parameters. The --tree_type parameter controls the type of
tree used for range search; it can take the following values: 'kd',
'r', 'r-star', 'x', 'hilbert-r', 'r-plus', 'r-plus-plus', 'cover',
'ball'. The --single_mode option forces single-tree search (as opposed
to the default dual-tree search), which can be useful when the RAM
usage of batch search is too high. The --naive option forces
brute-force range search.
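
As a rough illustration of what tree-based range search buys over the
brute-force search selected by --naive, the following Python sketch
builds a simple kd-tree and answers one range query at a time, skipping
subtrees that the query radius cannot reach. It is a teaching sketch
under its own assumptions (hypothetical function names, one point per
node) and does not reflect how mlpack builds or traverses its trees.

```python
from math import dist


def build_kd_tree(points, indices=None, depth=0):
    """Recursively split point indices on alternating coordinates."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return {
        "index": indices[mid],
        "axis": axis,
        "left": build_kd_tree(points, indices[:mid], depth + 1),
        "right": build_kd_tree(points, indices[mid + 1:], depth + 1),
    }


def tree_range_search(node, points, query, epsilon, found=None):
    """Collect indices of points within epsilon of query, pruning far subtrees."""
    if found is None:
        found = []
    if node is None:
        return found
    p = points[node["index"]]
    if dist(p, query) <= epsilon:
        found.append(node["index"])
    diff = query[node["axis"]] - p[node["axis"]]
    # The left subtree holds coordinates <= p's and the right holds >= p's,
    # so a side is skipped when the query ball cannot cross the split plane.
    if diff <= epsilon:
        tree_range_search(node["left"], points, query, epsilon, found)
    if diff >= -epsilon:
        tree_range_search(node["right"], points, query, epsilon, found)
    return found
```

The tree returns the same neighbors as a brute-force scan; only the
number of distance computations differs.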

An example usage to run DBSCAN on the dataset in input.csv with a
radius of 0.5 and a minimum cluster size of 5 is given below:

$ mlpack_dbscan -i input.csv -e 0.5 -m 5
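
When the program is driven from a script, the same command line can be
assembled programmatically. The helper below is a hypothetical
convenience wrapper, not part of mlpack; it uses only the short options
documented on this page and returns an argument list without running
anything.

```python
def build_dbscan_command(input_file, epsilon=1.0, min_size=5,
                         assignments_file=None, centroids_file=None,
                         tree_type=None, single_mode=False, naive=False):
    """Build an argument list for mlpack_dbscan from the documented options."""
    cmd = ["mlpack_dbscan", "-i", input_file,
           "-e", str(epsilon), "-m", str(min_size)]
    if assignments_file:
        cmd += ["-a", assignments_file]
    if centroids_file:
        cmd += ["-C", centroids_file]
    if tree_type:
        cmd += ["-t", tree_type]
    if single_mode:
        cmd.append("-S")
    if naive:
        cmd.append("-N")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True),
assuming mlpack_dbscan is installed and on the PATH.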

--input_file (-i) [string]
       Input dataset to cluster.

--epsilon (-e) [double]
       Radius of each range search. Default value 1.

--help (-h)
       Default help info.

--info [string]
       Get help on a specific module or option. Default value ''.

--min_size (-m) [int]
       Minimum number of points for a cluster. Default value 5.

--naive (-N)
       If set, brute-force range search (not tree-based) will be used.

--single_mode (-S)
       If set, single-tree range search (not dual-tree) will be used.

--tree_type (-t) [string]
       If using single-tree or dual-tree search, the type of tree to
       use ('kd', 'r', 'r-star', 'x', 'hilbert-r', 'r-plus',
       'r-plus-plus', 'cover', 'ball'). Default value 'kd'.

--verbose (-v)
       Display informational messages and the full list of parameters
       and timers at the end of execution.

--version (-V)
       Display the version of mlpack.

--assignments_file (-a) [string]
       Output file for assignments of each point. Default value ''.

--centroids_file (-C) [string]
       File to save output centroids to. Default value ''.

For further information, including relevant papers, citations, and
theory, consult the documentation found at http://www.mlpack.org or
included with your distribution of mlpack.