mlpack_decision_stump(1) User Commands mlpack_decision_stump(1)

mlpack_decision_stump - decision stump

mlpack_decision_stump [-b int] [-m unknown] [-l string] [-T string] [-t string] [-V bool] [-M unknown] [-p string] [-h -v]

This program implements a decision stump, which is a single-level
decision tree. The decision stump will split on one dimension of the
input data, and will split into multiple buckets. The dimension and bins
are selected by maximizing the information gain of the split. Optionally,
the minimum number of training points in each bin can be specified with
the '--bucket_size (-b)' parameter.
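
The selection rule described above can be illustrated with a small Python
sketch. It is not mlpack's implementation: the binning strategy
(consecutive runs of bucket_size sorted points) and all function names are
assumptions made only for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def bin_dimension(values, labels, bucket_size):
    """Assumed binning: sort points by value, then cut them into consecutive
    bins of bucket_size points; a short final bin is merged into its neighbor
    so that every bin holds at least bucket_size points."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    ordered = [label for _, label in pairs]
    bins = [ordered[i:i + bucket_size] for i in range(0, len(ordered), bucket_size)]
    if len(bins) > 1 and len(bins[-1]) < bucket_size:
        tail = bins.pop()
        bins[-1].extend(tail)
    return bins

def information_gain(labels, bins):
    """Entropy of all labels minus the size-weighted entropy of the bins."""
    n = len(labels)
    return entropy(labels) - sum(len(b) / n * entropy(b) for b in bins)

def best_dimension(data, labels, bucket_size):
    """data holds one list of values per dimension. Returns the index of the
    dimension with the highest information gain, plus all gains."""
    gains = [information_gain(labels, bin_dimension(dim, labels, bucket_size))
             for dim in data]
    return max(range(len(data)), key=gains.__getitem__), gains

# Dimension 0 separates the two classes cleanly; dimension 1 interleaves them.
data = [[1, 2, 3, 4, 5, 6, 7, 8],
        [5, 1, 7, 3, 8, 2, 6, 4]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best, gains = best_dimension(data, labels, bucket_size=4)
print(best, gains)
```

In this toy dataset the first dimension yields two pure bins, so it has
the larger gain and is the one chosen for the split.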

The decision stump is parameterized by a splitting dimension and a vector
of values that denote the splitting values of each bin.
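
As a rough illustration of this parameterization, the following Python
sketch classifies points with a stump. The Stump class, its field names,
and the convention that each splitting value is the lower bound of its bin
are assumptions for the example, not a description of mlpack's model
format.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List

@dataclass
class Stump:
    split_dimension: int        # index of the dimension the stump splits on
    split_values: List[float]   # assumed: ascending lower bound of each bin
    bin_labels: List[int]       # assumed: one predicted label per bin

    def classify(self, point):
        """Return the label of the bin containing the point's value in the
        splitting dimension. Values below the first bound fall in bin 0."""
        value = point[self.split_dimension]
        index = bisect_right(self.split_values, value) - 1
        return self.bin_labels[max(index, 0)]

stump = Stump(split_dimension=1,
              split_values=[0.0, 2.5, 7.0],
              bin_labels=[0, 1, 0])

# Only the value in dimension 1 influences the prediction.
points = [[9.0, 1.0], [9.0, 2.5], [9.0, 8.0], [9.0, -3.0]]
predictions = [stump.classify(p) for p in points]
print(predictions)  # [0, 1, 0, 0]
```

A binary search suffices here because the splitting values are kept in
ascending order, so each lookup costs time logarithmic in the number of
bins.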

This program enables several applications: a decision stump may be
trained or loaded, and then that decision stump may be used to classify
a given set of test points. The decision stump may also be saved to a
file for later use.

To train a decision stump, training data should be passed with the
'--training_file (-t)' parameter, and the corresponding labels should
be passed with the '--labels_file (-l)' option. If '--labels_file (-l)'
is not specified, the labels are assumed to be the last dimension of the
training dataset. The '--bucket_size (-b)' parameter controls the minimum
number of training points in each decision stump bucket.

For classifying a test set, a decision stump may be loaded with the
'--input_model_file (-m)' parameter (useful when a stump has already
been trained), and a test set may be specified with the
'--test_file (-T)' parameter. The predicted labels can be saved with the
'--predictions_file (-p)' output parameter.

Because decision stumps are trained in batch, retraining does not make
sense, and thus it is not possible to pass both '--training_file (-t)'
and '--input_model_file (-m)'; instead, simply build a new decision
stump with the training data.

After training, a decision stump can be saved with the
'--output_model_file (-M)' output parameter. That stump may later be
re-used in subsequent calls to this program (or others).
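
The train-then-classify workflow described above can be sketched as a small
Python helper that assembles the command line. The helper name
build_stump_command and the file names (train.csv, stump.bin, test.csv,
predictions.csv) are hypothetical; only the option names come from this
page. The helper just builds the argument list, so running the result
still requires an installed mlpack_decision_stump.

```python
import shlex

def build_stump_command(training_file=None, labels_file=None,
                        input_model_file=None, test_file=None,
                        predictions_file=None, output_model_file=None,
                        bucket_size=None):
    """Assemble an argument list for mlpack_decision_stump.

    Enforces the one documented restriction: a training file and an
    input model cannot be passed together.
    """
    if training_file is not None and input_model_file is not None:
        raise ValueError(
            "pass either --training_file or --input_model_file, not both")
    options = [
        ("--training_file", training_file),
        ("--labels_file", labels_file),
        ("--input_model_file", input_model_file),
        ("--test_file", test_file),
        ("--predictions_file", predictions_file),
        ("--output_model_file", output_model_file),
        ("--bucket_size", bucket_size),
    ]
    argv = ["mlpack_decision_stump"]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    return argv

# Step 1: train on a labeled dataset and save the stump (hypothetical files).
train_cmd = build_stump_command(training_file="train.csv",
                                output_model_file="stump.bin",
                                bucket_size=10)

# Step 2: load the saved stump and classify a test set.
classify_cmd = build_stump_command(input_model_file="stump.bin",
                                   test_file="test.csv",
                                   predictions_file="predictions.csv")

print(shlex.join(train_cmd))
print(shlex.join(classify_cmd))
```

Each list could be handed to subprocess.run once mlpack is installed;
splitting the work into two invocations mirrors the rule that a stump is
either trained or loaded in a single call, never both.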

--bucket_size (-b) [int]
The minimum number of training points in each decision stump
bucket. Default value 6.

--help (-h) [bool]
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--input_model_file (-m) [unknown]
Decision stump model to load. Default value ''.

--labels_file (-l) [string]
Labels for the training set. If not specified, the labels are
assumed to be the last dimension of the training data. Default
value ''.

--test_file (-T) [string]
A dataset to calculate predictions for. Default value ''.

--training_file (-t) [string]
The dataset to train on. Default value ''.

--verbose (-v) [bool]
Display informational messages and the full list of parameters
and timers at the end of execution.

--version (-V) [bool]
Display the version of mlpack.

--output_model_file (-M) [unknown]
Output decision stump model to save. Default value ''.

--predictions_file (-p) [string]
The output matrix that will hold the predicted labels for the
test set. Default value ''.

For further information, including relevant papers, citations, and
theory, consult the documentation found at http://www.mlpack.org or
included with your distribution of mlpack.

mlpack-3.0.4 21 February 2019 mlpack_decision_stump(1)