GMTREGRESS(1)                         GMT                        GMTREGRESS(1)

NAME
gmtregress - Linear regression of 1-D data sets

SYNOPSIS
gmtregress [ table ] [ -Amin/max/inc ] [ -Clevel ] [ -Ex|y|o|r ] [
-Fflags ] [ -N1|2|r|w ] [ -S[r] ] [ -Tmin/max/inc | -Tn ] [
-W[w][x][y][r] ] [ -V[level] ] [ -aflags ] [ -bbinary ] [ -dnodata ] [
-eregexp ] [ -ggaps ] [ -hheaders ] [ -iflags ] [ -oflags ]

Note: No space is allowed between the option flag and the associated
arguments.

DESCRIPTION
gmtregress reads one or more data tables [or stdin] and determines the
best linear regression model y = a + b*x for each segment using the
chosen parameters. The user may specify which data and model components
should be reported. By default, the model will be evaluated at the
input points, but alternatively you can specify an equidistant range
over which to evaluate the model, or turn off evaluation completely.
Instead of determining the best fit, we can scan all possible
regression lines (for a range of slope angles) and examine how the
chosen misfit measure varies with slope. This is particularly useful
when analyzing data with many outliers. Note: If you need to work with
log10 of x or y, you can apply that transformation during read by using
the -i option.
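
As an illustration of the model y = a + b*x, the following is a minimal
Python sketch of an ordinary least-squares fit (vertical misfit, L-2
norm). It is not gmtregress's own code; the function name fit_line and
the sample data are made up for this example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    xmean = sum(x) / n
    ymean = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - xmean) ** 2 for xi in x)
    sxy = sum((xi - xmean) * (yi - ymean) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope
    a = ymean - b * xmean    # intercept: the line passes through the means
    return a, b


# Hypothetical data lying exactly on y = 1 + 2*x
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

The fitted line always passes through (xmean, ymean), so the intercept
follows directly from the slope.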

REQUIRED ARGUMENTS
None

OPTIONAL ARGUMENTS
table
One or more ASCII (or binary, see -bi[ncols][type]) data table
file(s) holding a number of data columns. If no tables are given
then we read from standard input. The first two columns are
expected to contain the required x and y data. Depending on
your -W and -E settings we may expect an additional 1-3 columns
with error estimates of one or both of the data coordinates, and
even their correlation.

-Amin/max/inc
Instead of determining a best-fit regression we explore the full
range of regressions. Examine all possible regression lines
with slope angles between min and max, using steps of inc
degrees [-90/+90/1]. For each slope the optimum intercept is
determined based on your regression type (-E) and misfit norm
(-N) settings. For each segment we report the four columns
angle, E (the misfit), slope, and intercept for the range of
specified angles. The best model parameters within this range
are written into the segment header and reported in verbose
mode (-V).
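
The scan described for -A can be sketched in Python as follows. This
is a simplified illustration that assumes vertical misfit (-Ey) and the
L-2 norm (-N2), for which the optimum intercept at a fixed slope is the
mean of y - slope*x. The function name scan_slopes and the data are
hypothetical, and the scan stays away from +/-90 degrees, where the
slope is infinite.

```python
import math


def scan_slopes(x, y, amin, amax, ainc):
    """Return (angle, misfit, slope, intercept) rows for each slope angle."""
    n = len(x)
    steps = int(round((amax - amin) / ainc))
    rows = []
    for k in range(steps + 1):
        angle = amin + k * ainc
        slope = math.tan(math.radians(angle))
        # For the L-2 norm the best intercept at a fixed slope is the mean offset
        intercept = sum(yi - slope * xi for xi, yi in zip(x, y)) / n
        # Misfit E: mean of the squared vertical residuals
        misfit = sum((yi - intercept - slope * xi) ** 2
                     for xi, yi in zip(x, y)) / n
        rows.append((angle, misfit, slope, intercept))
    return rows


# Hypothetical data on the line y = 0.5 + x (slope angle 45 degrees)
x = [0.0, 1.0, 2.0, 3.0]
y = [0.5, 1.5, 2.5, 3.5]
rows = scan_slopes(x, y, -80.0, 80.0, 1.0)
best = min(rows, key=lambda row: row[1])
print(best)
```

Plotting the misfit column against angle shows how sharply the minimum
is defined, which is the purpose of the scan.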

-Clevel
Set the confidence level (in %) to use for the optional
calculation of confidence bands on the regression [95]. This is
only used if -F includes the output column c.

-Ex|y|o|r
Type of linear regression, i.e., select the type of misfit we
should calculate. Choose from x (regress x on y; i.e., the
misfit is measured horizontally from data point to regression
line), y (regress y on x; i.e., the misfit is measured
vertically [Default]), o (orthogonal regression; i.e., the
misfit is measured from data point orthogonally to nearest
point on the line), or r (Reduced Major Axis regression; i.e.,
the misfit is the product of both vertical and horizontal
misfits) [y].
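
To make the four misfit definitions concrete, here is a small Python
sketch that evaluates each of them for one data point and one candidate
line y = a + b*x. It follows the wording above literally; the exact
scaling used internally by gmtregress (in particular for the Reduced
Major Axis case) is not specified here, so treat the r entry as an
assumption. The function name misfits is hypothetical.

```python
import math


def misfits(x, y, a, b):
    """Misfit of point (x, y) from the line y = a + b*x for each -E type."""
    ey = y - (a + b * x)                # y: vertical distance to the line
    ex = x - (y - a) / b                # x: horizontal distance (needs b != 0)
    eo = ey / math.sqrt(1.0 + b * b)    # o: perpendicular distance to the line
    er = abs(ex * ey)                   # r: product of horizontal and vertical
    return {"x": ex, "y": ey, "o": eo, "r": er}


# Hypothetical point (2, 5) and line y = 1 + x
m = misfits(2.0, 5.0, 1.0, 1.0)
print(m)
```

For a line with unit slope the horizontal and vertical misfits have the
same magnitude, and the orthogonal misfit is smaller by a factor of
sqrt(2).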

-Fflags
Append a combination of the columns you wish returned; the
output order will match the order specified. Choose from x
(observed x), y (observed y), m (model prediction), r (residual
= data minus model), c (symmetrical confidence interval on the
regression; see -C for specifying the level), z (standardized
residuals or so-called z-scores) and w (outlier weights 0 or 1;
for -Nw these are the Reweighted Least Squares weights)
[xymrczw]. As an alternative to evaluating the model, just give
-Fp and we instead write a single record with the model
parameters npoints xmean ymean angle misfit slope intercept
sigma_slope sigma_intercept.
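
A script that consumes the single -Fp record can map the values to the
field names listed above. The Python sketch below assumes the record is
whitespace-separated text; the helper parse_fp_record and the sample
record are invented for illustration and are not real gmtregress
output.

```python
# Field order of the -Fp record as listed in the option description
FIELDS = ["npoints", "xmean", "ymean", "angle", "misfit",
          "slope", "intercept", "sigma_slope", "sigma_intercept"]


def parse_fp_record(line):
    """Split one -Fp text record into a dict keyed by field name."""
    values = [float(token) for token in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))


# Made-up record for a perfect fit of y = 1 + 2*x through four points
record = parse_fp_record("4 1.5 4 63.4349488229 0 2 1 0 0")
print(record["slope"], record["intercept"])
```

Counting from zero, slope is field 5, which is the column that -o5
selects when only the slope is wanted.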

-N1|2|r|w
Selects the norm to use for the misfit calculation. Choose
among 1 (L-1 measure; the mean of the absolute residuals), 2
(Least-squares; the mean of the squared residuals), r (LMS; the
least median of the squared residuals), or w (RLS; Reweighted
Least Squares: the mean of the squared residuals after outliers
identified via LMS have been removed) [Default is 2].
Traditional regression uses L-2 while L-1 and in particular LMS
are more robust in how they handle outliers. As alluded to, RLS
implies an initial LMS regression which is then used to identify
outliers in the data, assign these a zero weight, and then redo
the regression using an L-2 norm.
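
The three basic misfit measures, and the outlier flagging step that RLS
builds on, can be sketched in Python as below. The misfit functions
follow the definitions above. The function rls_weights uses the robust
scale estimate and 2.5 cutoff commonly given in Rousseeuw and Leroy
(1987); whether gmtregress uses exactly these constants is an
assumption. All names and the residuals (taken to come from an LMS fit)
are hypothetical.

```python
import math
import statistics


def l1_misfit(residuals):
    """-N1: mean of the absolute residuals."""
    return sum(abs(r) for r in residuals) / len(residuals)


def l2_misfit(residuals):
    """-N2: mean of the squared residuals."""
    return sum(r * r for r in residuals) / len(residuals)


def lms_misfit(residuals):
    """-Nr: median of the squared residuals."""
    return statistics.median([r * r for r in residuals])


def rls_weights(residuals, nparams=2, cutoff=2.5):
    """0/1 weights: flag residuals far from zero relative to a robust scale."""
    n = len(residuals)
    # Robust scale from the median squared residual (assumed constants)
    scale = 1.4826 * (1.0 + 5.0 / (n - nparams)) * math.sqrt(lms_misfit(residuals))
    return [1 if abs(r) <= cutoff * scale else 0 for r in residuals]


# Hypothetical residuals with one gross outlier
residuals = [0.1, -0.2, 0.1, 0.0, 10.0]
print(l1_misfit(residuals), l2_misfit(residuals), lms_misfit(residuals))
print(rls_weights(residuals))
```

The single outlier dominates the L-2 misfit, inflates the L-1 misfit
less, and leaves the LMS misfit untouched, which is why LMS is the
starting point for identifying outliers.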

-S[r]
Restricts which records will be output. By default all data
records will be output in the format specified by -F. Use -S to
exclude data points identified as outliers by the regression.
Alternatively, use -Sr to reverse this and only output the
outlier records.

-Tmin/max/inc | -Tn
Evaluate the best-fit regression model at the equidistant points
implied by the arguments. If -Tn is given instead we will reset
min and max to the extreme x-values for each segment and
determine inc so that there are exactly n output values for each
segment. To skip the model evaluation entirely, simply provide
-T0.
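
The two ways of choosing evaluation points can be sketched in Python as
follows. The functions equidistant and n_points are hypothetical
helpers that mirror the description of -Tmin/max/inc and -Tn; they are
not part of GMT.

```python
def equidistant(xmin, xmax, inc):
    """Points xmin, xmin+inc, ..., up to xmax (as in -Tmin/max/inc)."""
    count = int(round((xmax - xmin) / inc)) + 1
    return [xmin + k * inc for k in range(count)]


def n_points(xs, n):
    """Exactly n points spanning the extreme x-values (as in -Tn)."""
    if n < 2:
        raise ValueError("need at least two points to span a range")
    lo, hi = min(xs), max(xs)
    inc = (hi - lo) / (n - 1)   # n points leave n - 1 intervals
    return [lo + k * inc for k in range(n)]


print(equidistant(0.0, 10.0, 2.5))
print(n_points([3.0, 1.0, 9.0], 5))
```

Note that n output values imply an increment of (max - min)/(n - 1),
not (max - min)/n.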

-W[w][x][y][r]
Specifies weighted regression and which weights will be
provided. Append x if giving 1-sigma uncertainties in the
x-observations, y if giving 1-sigma uncertainties in y, and r if
giving correlations between x and y observations, in the order
these columns appear in the input (after the two required and
leading x, y columns). Giving both x and y (and optionally r)
implies an orthogonal regression, otherwise giving x requires
-Ex and y requires -Ey. We convert uncertainties in x and y to
regression weights via the relationship weight = 1/sigma. Use
-Ww if the input columns hold precomputed weights instead.
Note: residuals with respect to the regression line will be
scaled by the given weights. Most norms will then square this
weighted residual (-N1 is the only exception).
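
The effect of the weights can be illustrated with a short Python sketch
of a weighted least-squares fit. It assumes the simplest case, 1-sigma
uncertainties in y only with vertical misfit and the L-2 norm, so each
residual is scaled by weight = 1/sigma and then squared. The function
weighted_fit and the data are made up for this example.

```python
def weighted_fit(x, y, sigma_y):
    """Fit y = a + b*x by minimizing the sum of (residual/sigma)^2."""
    # Scaling residuals by w = 1/sigma and squaring gives weights w^2
    w2 = [(1.0 / s) ** 2 for s in sigma_y]
    sw = sum(w2)
    xmean = sum(w * xi for w, xi in zip(w2, x)) / sw
    ymean = sum(w * yi for w, yi in zip(w2, y)) / sw
    sxx = sum(w * (xi - xmean) ** 2 for w, xi in zip(w2, x))
    sxy = sum(w * (xi - xmean) * (yi - ymean)
              for w, xi, yi in zip(w2, x, y))
    b = sxy / sxx
    a = ymean - b * xmean
    return a, b


# Hypothetical data on y = 1 + 2*x, plus one wild point with a huge sigma
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 100.0]
sigma = [1.0, 1.0, 1.0, 1.0e6]
a, b = weighted_fit(x, y, sigma)
print(a, b)
```

Because the last point carries a very large uncertainty, its weight is
negligible and the fit recovers the line through the first three
points.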

-V[level]
Select verbosity level [c].

-acol=name[...]
Set aspatial column associations col=name.

-bi[ncols][type]
Select native binary input.

-bo[ncols][type]
Select native binary output [Default is same as input].

-d[i|o]nodata
Replace input columns that equal nodata with NaN and do the
reverse on output.

-e[~]"pattern" | -e[~]/regexp/[i]
Only accept data records that match the given pattern.

-g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u]
Determine data gaps and line breaks.

-h[i|o][n][+c][+d][+rremark][+ttitle]
Skip or produce header record(s).

-icols[+l][+sscale][+ooffset][,...]
Select input columns and transformations (0 is first column).

-ocols[,...]
Select output columns (0 is first column).

-^ or just -
Print a short message about the syntax of the command, then
exit (NOTE: on Windows just use -).

-+ or just +
Print an extensive usage (help) message, including the
explanation of any module-specific option (but not the GMT
common options), then exit.

-? or no arguments
Print a complete usage (help) message, including the explanation
of all options, then exit.

ASCII FORMAT PRECISION
The ASCII output formats of numerical data are controlled by parameters
in your gmt.conf file. Longitude and latitude are formatted according
to FORMAT_GEO_OUT, absolute time is under the control of
FORMAT_DATE_OUT and FORMAT_CLOCK_OUT, whereas general floating point
values are formatted according to FORMAT_FLOAT_OUT. Be aware that the
format in effect can lead to loss of precision in ASCII output, which
can lead to various problems downstream. If you find the output is not
written with enough precision, consider switching to binary output (-bo
if available) or specify more decimals using the FORMAT_FLOAT_OUT
setting.
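
The loss of precision can be demonstrated with printf-style formats,
which Python's % operator also understands. The two format strings
below are hypothetical FORMAT_FLOAT_OUT choices picked for this
illustration, and the value is made up.

```python
value = 1234.56789012

# Six significant digits: the text no longer round-trips to the same number
coarse = "%.6g" % value
# Twelve significant digits: enough to recover this particular value
fine = "%.12g" % value

print(coarse, float(coarse) == value)
print(fine, float(fine) == value)
```

Reading the coarse text back yields a different number than was
written, which is the downstream problem the paragraph above warns
about.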

EXAMPLES
To do a standard least-squares regression on the x-y data in points.txt
and return x, y, and model prediction with 99% confidence intervals,
try

gmt regress points.txt -Fxymc -C99 > points_regressed.txt

To just get the slope for the above regression, try

slope=`gmt regress points.txt -Fp -o5`

To do a reweighted least-squares regression on the data rough.txt and
return x, y, model prediction and the RLS weights, try

gmt regress rough.txt -Nw -Fxymw > points_regressed.txt

To do an orthogonal least-squares regression on the data crazy.txt but
first take the logarithm of both x and y, then return x, y, model
prediction and the standardized residuals (z-scores), try

gmt regress crazy.txt -Eo -Fxymz -i0-1+l > points_regressed.txt

To examine how the orthogonal LMS misfits vary with angle between 0 and
90 in steps of 0.2 degrees for the data in points.txt, try

gmt regress points.txt -A0/90/0.2 -Eo -Nr > points_analysis.txt

REFERENCES
Draper, N. R., and H. Smith, 1998, Applied regression analysis, 3rd
ed., 736 pp., John Wiley and Sons, New York.

Rousseeuw, P. J., and A. M. Leroy, 1987, Robust regression and outlier
detection, 329 pp., John Wiley and Sons, New York.

York, D., N. M. Evensen, M. L. Martinez, and J. De Basabe Delgado,
2004, Unified equations for the slope, intercept, and standard errors
of the best straight line, Am. J. Phys., 72(3), 367-375.

SEE ALSO
gmt, trend1d, trend2d

COPYRIGHT
2019, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe

5.4.5                           Feb 24, 2019                    GMTREGRESS(1)