DBRVSTATDIFF(1)       User Contributed Perl Documentation      DBRVSTATDIFF(1)

NAME

       dbrvstatdiff - evaluate statistical differences between two random
       variables

SYNOPSIS

           dbrvstatdiff [-f format] [-c ConfRating]
               [-h HypothesizedDifference] m1c sd1c n1c m2c sd2c n2c

       OR

           dbrvstatdiff [-f format] [-c ConfRating] m1c n1c m2c n2c

DESCRIPTION

       Produce statistics on the difference of sets of random variables.  If a
       hypothesized difference is given (with "-h"), it also performs a
       Student's t-test.

       Random variables are specified by:

       "m1c", "m2c"
           The column names of means of random variables.

       "sd1c", "sd2c"
           The column names of standard deviations of random variables.

       "n1c", "n2c"
           The column names of sample counts for each random variable.

       These values can be computed with dbcolstats.
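
       As an illustration of the three inputs each random variable needs, the
       following Python sketch derives a mean, a standard deviation, and a
       count from raw values.  It is not part of Fsdb: the function name
       summarize is hypothetical, and the sketch assumes the sample (n - 1)
       form of the standard deviation.

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation, count) for a list of numbers.

    Hypothetical helper for illustration; assumes the n - 1 (sample) form
    of the standard deviation.
    """
    return statistics.mean(values), statistics.stdev(values), len(values)

# Trial values for case "a" in the third sample on this page.
mean_a, sd_a, n_a = summarize([1, 1.1, 0.9, 1, 1.1])
print("%.5g %.5g %d" % (mean_a, sd_a, n_a))
```

       The three results correspond to the m1c, sd1c, and n1c columns for one
       random variable.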

       Creates up to twelve new columns:

       "diff"
           The difference RV2 - RV1.

       "diff_pct"
           The percentage difference, (RV2 - RV1) / RV1.

       "diff_conf_{half,low,high}" and "diff_conf_pct_{half,low,high}"
           The half-width of the confidence interval and its low and high
           bounds, for the absolute and relative (percentage) differences.

       "t_test"
           The t-test value for the given hypothesized difference.

       "t_test_result"
           Whether the hypothesis is rejected at the given confidence level:
           either "rejected" or "not-rejected".

       "t_test_break"
           The hypothesized value that is the break-even point for the t-test.

       "t_test_break_pct"
           The break-even point as a percentage of m1c.

       Confidence intervals are not printed if standard deviations are not
       provided.  Confidence intervals assume normal distributions with common
       variances.
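
       The arithmetic behind the difference and confidence columns can be
       sketched in Python as follows.  This is an illustration, not the
       program's source: it assumes that "common variances" means the usual
       pooled-variance estimate, the function name diff_with_interval is
       hypothetical, and the Student's t critical value is passed in by the
       caller rather than computed.

```python
import math

def diff_with_interval(m1, sd1, n1, m2, sd2, n2, t_crit):
    """Difference RV2 - RV1 with a pooled-variance confidence half-width.

    t_crit is the two-sided Student's t critical value for n1 + n2 - 2
    degrees of freedom; supplying it by hand is an assumption of this
    sketch.
    """
    diff = m2 - m1
    dof = n1 + n2 - 2
    # Pooled estimate of the variance assumed common to both variables.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    half = t_crit * std_err
    return {
        "diff": diff,
        "diff_pct": 100.0 * diff / m1,
        "diff_conf_half": half,
        "diff_conf_low": diff - half,
        "diff_conf_high": diff + half,
        "diff_conf_pct_half": 100.0 * half / m1,
    }

# First sample on this page: RV1 = (0.17, 0.0020, 5), RV2 = (0.22, 0.0010, 4).
# 2.365 is the tabulated two-sided t value for 7 degrees of freedom at 95%.
result = diff_with_interval(0.17, 0.0020, 5, 0.22, 0.0010, 4, t_crit=2.365)
for name, value in result.items():
    print(name, "%.5g" % value)
```

       Under these assumptions the printed values should agree with the
       corresponding columns of the first sample output in SAMPLE USAGE.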

       T-tests are only computed if a hypothesized difference is provided.
       Hypothesized differences should be preceded by "<=", ">=", or "=".
       T-tests assume normal distributions with common variances.
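
       A minimal Python sketch of a two-sample t statistic under the
       common-variance assumption is shown here for the "=0" hypothesis.  The
       function name pooled_t and the constant T_CRIT are hypothetical, the
       critical value is taken from a t table instead of being computed, and
       only the two-sided case is covered.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2, hypothesized=0.0):
    """Two-sample Student's t statistic assuming a common variance."""
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return ((m2 - m1) - hypothesized) / std_err, dof

# Third sample on this page: RV1 is case b, RV2 is case a (copylast_ columns).
t_value, dof = pooled_t(1.98, 0.083666, 5, 1.02, 0.083666, 5)

# For "=0" the test is two-sided: reject when |t| exceeds the critical value.
# 2.306 is the tabulated two-sided value for 8 degrees of freedom at 95%.
T_CRIT = 2.306
verdict = "rejected" if abs(t_value) > T_CRIT else "not-rejected"
print("%.5g" % t_value, dof, verdict)
```

       With these inputs the statistic is far outside the critical range, so
       the hypothesis of no difference is rejected.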

OPTIONS

       -c FRACTION or --confidence FRACTION
           Specify FRACTION for the confidence interval.  Defaults to 0.95 for
           a 95% confidence level (alpha = 0.05).

       -f FORMAT or --format FORMAT
           Specify a printf(3)-style format for output statistics.  Defaults
           to "%.5g".
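
       To see what the default format does to a value, the short Python
       sketch below applies "%.5g" to a few numbers.  It relies on Python's %
       operator handling the %g conversion the way printf(3) does; the name
       DEFAULT_FORMAT is only illustrative.

```python
# "%.5g" keeps five significant digits and drops trailing zeros.
DEFAULT_FORMAT = "%.5g"

for value in (29.411764705882355, 0.0026137577, 0.05):
    print(DEFAULT_FORMAT % value)
```

       A fixed-point format such as "%.2f" could be passed with -f instead
       when a constant number of decimal places is preferred.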

       -h DIFF or --hypothesis DIFF
           Specify the hypothesized difference as "DIFF", where "DIFF" is
           something like "<=0" or ">=0".

       This module also supports the standard fsdb options:

       -d  Enable debugging output.

       -i or --input InputSource
           Read from InputSource, typically a file name, or "-" for standard
           input, or (if in Perl) an IO::Handle, Fsdb::IO, or
           Fsdb::BoundedQueue object.

       -o or --output OutputDestination
           Write to OutputDestination, typically a file name, or "-" for
           standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or
           Fsdb::BoundedQueue object.

       --autorun or --noautorun
           By default, programs process automatically, but Fsdb::Filter
           objects in Perl do not run until you invoke the run() method.  The
           "--(no)autorun" option controls that behavior within Perl.

       --help
           Show help.

       --man
           Show full manual.

SAMPLE USAGE

   Input:
           #fsdb title mean2 stddev2 n2 mean1 stddev1 n1
           example6.12 0.17 0.0020 5 0.22 0.0010 4

   Command:
           cat data.fsdb | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1

   Output:
           #fsdb title mean2 stddev2 n2 mean1 stddev1 n1 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high
           example6.12 0.17    0.0020  5       0.22    0.0010  4       0.05    29.412  0.0026138       0.047386        0.052614        1.5375  27.874  30.949
           #  | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1

   Input 2:
       (example 7.10 from Scheaffer and McClave):

           #fsdb title n1 x1 sd1 n2 x2 sd2
           example7.10 9 35.22 24.44 9 31.56 20.03

   Command 2:
           dbrvstatdiff -h '<=0' x2 sd2 n2 x1 sd1 n1

   Output 2:
           #fsdb title n1 x1 sd1 n2 x2 sd2 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result
           example7.10 9 35.22 24.44 9 31.56 20.03 3.66 0.11597 4.7125 -1.0525 8.3725 0.14932 -0.033348 0.26529 1.6465 not-rejected
           #  | /global/us/edu/ucla/cs/ficus/users/johnh/BIN/DB/dbrvstatdiff -h <=0 x2 sd2 n2 x1 sd1 n1

   Case 3:
       A common use case is to have one file with a set of trials from two
       experiments, and to use dbrvstatdiff to see if they are different.

       Input 3:

           #fsdb case trial value
           a  1  1
           a  2  1.1
           a  3  0.9
           a  4  1
           a  5  1.1
           b  1  2
           b  2  2.1
           b  3  1.9
           b  4  2
           b  5  1.9

   Command 3:
           cat two_trial.fsdb |
               dbmultistats -k case value |
               dbcolcopylast mean stddev n |
               dbrow '_case eq "b"' |
               dbrvstatdiff -h '=0' mean stddev n copylast_mean copylast_stddev copylast_n |
               dblistize

       Output 3:

               #fsdb -R C case mean stddev pct_rsd conf_range conf_low conf_high conf_pct sum sum_squared min max n copylast_mean copylast_stddev copylast_n diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result t_test_break t_test_break_pct
               case: b
               mean: 1.98
               stddev: 0.083666
               pct_rsd: 4.2256
               conf_range: 0.10387
               conf_low: 1.8761
               conf_high: 2.0839
               conf_pct: 0.95
               sum: 9.9
               sum_squared: 19.63
               min: 1.9
               max: 2.1
               n: 5
               copylast_mean: 1.02
               copylast_stddev: 0.083666
               copylast_n: 5
               diff: -0.96
               diff_pct: -48.485
               diff_conf_half: 0.12202
               diff_conf_low: -1.082
               diff_conf_high: -0.83798
               diff_conf_pct_half: 6.1627
               diff_conf_pct_low: -54.648
               diff_conf_pct_high: -42.322
               t_test: -18.142
               t_test_result: rejected
               t_test_break: -1.082
               t_test_break_pct: -54.648

               #  | dbmultistats -k case value
               #   | dbcolcopylast mean stddev n
               #   | dbrow _case eq "b"
               #   | dbrvstatdiff -h =0 mean stddev n copylast_mean copylast_stddev copylast_n
               #   | dbfilealter -R C

       (The hypothesis that the two cases have equal means is rejected, so
       one cannot say that they are statistically equal.)

SEE ALSO

       Fsdb.  dbcolstats.  dbcolcopylast.

AUTHOR and COPYRIGHT

       Copyright (C) 1991-2018 by John Heidemann <johnh@isi.edu>

       This program is distributed under terms of the GNU General Public
       License, version 2.  See the file COPYING with the distribution for
       details.

perl v5.28.1                      2019-02-02                   DBRVSTATDIFF(1)