mlpack_dbscan(1)                User Commands                mlpack_dbscan(1)

NAME
       mlpack_dbscan - DBSCAN clustering

SYNOPSIS
       mlpack_dbscan -i string [-e double] [-m int] [-N bool] [-S bool]
       [-t string] [-V bool] [-a string] [-C string] [-h -v]

DESCRIPTION
       This program implements the DBSCAN algorithm for clustering using
       accelerated tree-based range search. The type of tree used is
       configurable, and brute-force range search may be used instead.

       The input dataset to be clustered may be specified with the
       '--input_file (-i)' parameter; the radius of each range search may be
       specified with the '--epsilon (-e)' parameter, and the minimum number
       of points in a cluster may be specified with the '--min_size (-m)'
       parameter.
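
       As a rough illustration of how the radius and the minimum size
       interact, the following Python sketch runs DBSCAN with a brute-force
       range search. It is not mlpack's implementation: the function names,
       the NOISE label of -1, and the rule that a point is a core point when
       at least min_size points (itself included) lie within epsilon are
       assumptions made for this example.

```python
import math
from collections import deque

NOISE = -1  # label used in this sketch only; mlpack's output may differ


def range_search(points, index, epsilon):
    """Brute-force range search: indices of points within epsilon of points[index]."""
    return [j for j, q in enumerate(points)
            if math.dist(points[index], q) <= epsilon]


def dbscan(points, epsilon=1.0, min_size=5):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = range_search(points, i, epsilon)
        if len(neighbors) < min_size:
            labels[i] = NOISE              # may become a border point later
            continue
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = range_search(points, j, epsilon)
            if len(j_neighbors) >= min_size:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels
```

       With epsilon 0.5 and min_size 5, a tight group of five points forms
       one cluster, while an isolated point is left labeled as noise.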

       The '--assignments_file (-a)' and '--centroids_file (-C)' output
       parameters may be used to save the output of the clustering.
       '--assignments_file (-a)' contains the cluster assignments of each
       point, and '--centroids_file (-C)' contains the centroids of each
       cluster.
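
       To make the relationship between the two outputs concrete, the
       sketch below derives centroids from a list of points and their
       cluster assignments. It assumes that a centroid is the arithmetic
       mean of the points in a cluster; cluster_centroids and the
       noise_label value of -1 are hypothetical names chosen for this
       example and do not describe the file format mlpack writes.

```python
from collections import defaultdict


def cluster_centroids(points, assignments, noise_label=-1):
    """Map each cluster label to the mean of the points assigned to it."""
    sums = {}
    counts = defaultdict(int)
    for point, label in zip(points, assignments):
        if label == noise_label:
            continue                       # noise points belong to no cluster
        if label not in sums:
            sums[label] = [0.0] * len(point)
        for dim, value in enumerate(point):
            sums[label][dim] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}
```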

       The range search may be controlled with the '--tree_type (-t)',
       '--single_mode (-S)', and '--naive (-N)' parameters. '--tree_type
       (-t)' controls the type of tree used for range search and accepts
       one of the following values: 'kd', 'r', 'r-star', 'x', 'hilbert-r',
       'r-plus', 'r-plus-plus', 'cover', 'ball'. The '--single_mode (-S)'
       parameter forces single-tree search (as opposed to the default
       dual-tree search), and '--naive (-N)' forces brute-force range
       search.

       An example usage to run DBSCAN on the dataset in 'input.csv' with a
       radius of 0.5 and a minimum cluster size of 5 is given below:

       $ mlpack_dbscan --input_file input.csv --epsilon 0.5 --min_size 5
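
       The same call can also be assembled from a script. The sketch below
       builds the argument list using only long options documented on this
       page; build_dbscan_command is a hypothetical helper, not part of
       mlpack, and the output file name is a placeholder.

```python
# Tree types listed in this page for --tree_type.
TREE_TYPES = ("kd", "r", "r-star", "x", "hilbert-r", "r-plus",
              "r-plus-plus", "cover", "ball")


def build_dbscan_command(input_file, epsilon=1.0, min_size=5, tree_type="kd",
                         assignments_file=None, centroids_file=None):
    """Return the mlpack_dbscan argument list (hypothetical helper)."""
    if tree_type not in TREE_TYPES:
        raise ValueError(f"unknown tree type: {tree_type!r}")
    cmd = ["mlpack_dbscan",
           "--input_file", input_file,
           "--epsilon", str(epsilon),
           "--min_size", str(min_size),
           "--tree_type", tree_type]
    if assignments_file:
        cmd += ["--assignments_file", assignments_file]
    if centroids_file:
        cmd += ["--centroids_file", centroids_file]
    return cmd


# Same settings as the example above, plus an assignments output file.
command = build_dbscan_command("input.csv", epsilon=0.5, min_size=5,
                               assignments_file="assignments.csv")
print(" ".join(command))
```

       Passing the list to subprocess.run(command, check=True) would run
       the clustering, assuming mlpack_dbscan is installed and on the PATH.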

REQUIRED INPUT OPTIONS
       --input_file (-i) [string]
              Input dataset to cluster.

OPTIONAL INPUT OPTIONS
       --epsilon (-e) [double]
              Radius of each range search. Default value 1.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Get help on a specific module or option. Default value ''.

       --min_size (-m) [int]
              Minimum number of points for a cluster. Default value 5.

       --naive (-N) [bool]
              If set, brute-force range search (not tree-based) will be used.

       --single_mode (-S) [bool]
              If set, single-tree range search (not dual-tree) will be used.

       --tree_type (-t) [string]
              If using single-tree or dual-tree search, the type of tree to
              use ('kd', 'r', 'r-star', 'x', 'hilbert-r', 'r-plus',
              'r-plus-plus', 'cover', 'ball'). Default value 'kd'.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters
              and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS
       --assignments_file (-a) [string]
              Output matrix for assignments of each point. Default value ''.

       --centroids_file (-C) [string]
              Matrix to save output centroids to. Default value ''.

ADDITIONAL INFORMATION
       For further information, including relevant papers, citations, and
       theory, consult the documentation found at http://www.mlpack.org or
       included with your distribution of mlpack.

mlpack-3.0.4                    21 February 2019              mlpack_dbscan(1)