mlpack_decision_tree(1)          User Commands         mlpack_decision_tree(1)

NAME

       mlpack_decision_tree - decision tree

SYNOPSIS

       mlpack_decision_tree [-m unknown] [-l string] [-g double] [-n int] [-e bool] [-T string] [-L string] [-t string] [-V bool] [-w string] [-M unknown] [-p string] [-P string] [-h -v]

DESCRIPTION

       Train and evaluate using a decision tree. Given a dataset containing
       numeric or categorical features, and associated labels for each point
       in the dataset, this program can train a decision tree on that data.

       The training set and associated labels are specified with the
       '--training_file (-t)' and '--labels_file (-l)' parameters,
       respectively. The labels should be in the range [0, num_classes - 1].
       Optionally, if '--labels_file (-l)' is not specified, the labels are
       assumed to be the last dimension of the training dataset.

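       The two label conventions above can be illustrated with a minimal
       shell sketch. It assumes a headerless, comma-separated file whose last
       column holds integer class labels. The helper names 'split_labels' and
       'check_labels' are hypothetical and are not part of mlpack.

```shell
# Hypothetical helper: split the last CSV column into a separate labels file.
# Assumes a headerless, comma-separated file with at least two columns.
split_labels() {
  # $1: combined CSV (features..., label); $2: features output; $3: labels output
  awk -F, -v feats="$2" -v labs="$3" '{
    line = $1
    for (i = 2; i < NF; i++) line = line "," $i
    print line > feats
    print $NF > labs
  }' "$1"
}

# Hypothetical helper: succeed only if every line of the labels file is a
# non-negative integer in the range [0, num_classes - 1].
check_labels() {
  # $1: labels file (one label per line); $2: number of classes
  awk -v k="$2" '
    $1 !~ /^[0-9]+$/ || $1 + 0 >= k { bad = 1 }
    END { exit bad + 0 }' "$1"
}
```

       With these definitions, 'split_labels all.csv data.csv labels.csv'
       would produce files suitable for '--training_file (-t)' and
       '--labels_file (-l)', and 'check_labels labels.csv 3' would return a
       non-zero status if any label fell outside [0, 2].
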
       When a model is trained, the '--output_model_file (-M)' output
       parameter may be used to save the trained model. A model may be loaded
       for predictions with the '--input_model_file (-m)' parameter. The
       '--input_model_file (-m)' parameter may not be specified when the
       '--training_file (-t)' parameter is specified. The
       '--minimum_leaf_size (-n)' parameter specifies the minimum number of
       training points that must fall into each leaf. The
       '--minimum_gain_split (-g)' parameter specifies the minimum gain
       needed for a node to split. If '--print_training_error (-e)' is
       specified, the training error will be printed.

       Test data may be specified with the '--test_file (-T)' parameter, and
       if performance numbers are desired for that test set, labels may be
       specified with the '--test_labels_file (-L)' parameter. Predictions for
       each test point may be saved via the '--predictions_file (-p)' output
       parameter. Class probabilities for each prediction may be saved with
       the '--probabilities_file (-P)' output parameter.

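       When predictions have been saved but the true test labels are kept in
       a separate file, accuracy can also be computed after the fact. The
       following is a minimal sketch, assuming both files hold one integer
       label per line and have the same number of lines. The helper name
       'accuracy' is hypothetical and is not part of mlpack.

```shell
# Hypothetical helper: print the fraction of test points whose predicted
# label equals the true label, to four decimal places.
accuracy() {
  # $1: predictions file; $2: true labels file (one integer label per line)
  paste -d, "$1" "$2" | awk -F, '
    {
      total++
      # Count a match only when both fields are present and numerically equal.
      if ($1 != "" && $2 != "" && $1 + 0 == $2 + 0) correct++
    }
    END {
      if (total > 0) printf "%.4f\n", correct / total
      else print "nan"
    }'
}
```

       For instance, 'accuracy predictions.csv test_labels.csv' prints a
       single number in [0, 1]; the test error is one minus that value.
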
       For example, to train a decision tree with a minimum leaf size of 20
       and a minimum gain of 0.001 on the dataset contained in 'data.arff'
       with labels 'labels.csv', saving the output model to 'tree.bin' and
       printing the training error, one could call

       $ mlpack_decision_tree --training_file data.arff --labels_file labels.csv
       --output_model_file tree.bin --minimum_leaf_size 20
       --minimum_gain_split 0.001 --print_training_error

       Then, to use that model to classify points in 'test_set.arff' and print
       the test error given the labels 'test_labels.csv', while saving the
       predictions for each point to 'predictions.csv', one could call

       $ mlpack_decision_tree --input_model_file tree.bin --test_file
       test_set.arff --test_labels_file test_labels.csv --predictions_file
       predictions.csv

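       To compare several values of '--minimum_leaf_size (-n)', the training
       call can be repeated once per candidate value. The following is a
       minimal sketch that only prints the commands (a dry run) rather than
       executing them. The function name 'sweep_leaf_sizes' and the
       'tree_<n>.bin' file naming are hypothetical; only flags documented on
       this page are used.

```shell
# Hypothetical helper: print one mlpack_decision_tree training command per
# candidate leaf size. Nothing is executed; pipe the output to `sh` to run it.
# Paths are assumed to contain no whitespace.
sweep_leaf_sizes() {
  # $1: training file; $2: training labels file
  # remaining arguments: candidate values for --minimum_leaf_size
  train=$1
  labels=$2
  shift 2
  for n in "$@"; do
    echo "mlpack_decision_tree --training_file $train --labels_file $labels" \
         "--minimum_leaf_size ${n} --output_model_file tree_${n}.bin" \
         "--print_training_error"
  done
}
```

       Calling 'sweep_leaf_sizes data.arff labels.csv 5 20 50' prints three
       commands, each saving its model to a distinct file so the models can
       later be evaluated with '--input_model_file (-m)' and '--test_file
       (-T)'.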

OPTIONAL INPUT OPTIONS

       --help (-h) [bool]
              Default help info.

       --info [string]
              Get help on a specific module or option. Default value ''.

       --input_model_file (-m) [unknown]
              Pre-trained decision tree, to be used with test points. Default
              value ''.

       --labels_file (-l) [string]
              Training labels. Default value ''.

       --minimum_gain_split (-g) [double]
              Minimum gain for node splitting. Default value 1e-07.

       --minimum_leaf_size (-n) [int]
              Minimum number of points in a leaf. Default value 20.

       --print_training_error (-e) [bool]
              Print the training error.

       --test_file (-T) [string]
              Testing dataset (may be categorical). Default value ''.

       --test_labels_file (-L) [string]
              Test point labels, if accuracy calculation is desired. Default
              value ''.

       --training_file (-t) [string]
              Training dataset (may be categorical). Default value ''.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters
              and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

       --weights_file (-w) [string]
              The weight of labels. Default value ''.

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [unknown]
              Output for trained decision tree. Default value ''.

       --predictions_file (-p) [string]
              Class predictions for each test point. Default value ''.

       --probabilities_file (-P) [string]
              Class probabilities for each test point. Default value ''.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and
       theory, consult the documentation found at http://www.mlpack.org or
       included with your distribution of mlpack.



mlpack-3.0.4                   21 February 2019        mlpack_decision_tree(1)