1SPECTRUM1D(1) GMT SPECTRUM1D(1)
2
3
4
6 spectrum1d - Compute auto- [and cross- ] spectra from one [or two]
7 time-series
8
10 spectrum1d [ table ] -Ssegment_size] [ -C[xycnpago] ] [ -Ddt ] [
11 -L[h|m] ] [ -N[name_stem ] ] [ -T ] [ -W ] [ -bbinary ] [ -dnodata ]
12 [ -eregexp ] [ -fflags ] [ -ggaps ] [ -hheaders ] [ -iflags ]
13
14 Note: No space is allowed between the option flag and the associated
15 arguments.
16
18 spectrum1d reads X [and Y] values from the first [and second] columns
19 on standard input [or x[y]file]. These values are treated as timeseries
20 X(t) [Y(t)] sampled at equal intervals spaced dt units apart. There may
21 be any number of lines of input. spectrum1d will create file[s] con‐
22 taining auto- [and cross- ] spectral density estimates by Welch's
23 method of ensemble averaging of multiple overlapped windows, using
24 standard error estimates from Bendat and Piersol.
25
26 The output files have 3 columns: f or w, p, and e. f or w is the fre‐
27 quency or wavelength, p is the spectral density estimate, and e is the
28 one standard deviation error bar size. These files are named based on
29 name_stem. If the -C option is used, up to eight files are created;
30 otherwise only one (xpower) is written. The files (which are ASCII
31 unless -bo is set) are as follows:
32
33 name_stem.xpower
34 Power spectral density of X(t). Units of X * X * dt.
35
36 name_stem.ypower
37 Power spectral density of Y(t). Units of Y * Y * dt.
38
39 name_stem.cpower
40 Power spectral density of the coherent output. Units same as
41 ypower.
42
43 name_stem.npower
44 Power spectral density of the noise output. Units same as
45 ypower.
46
47 name_stem.gain
48 Gain spectrum, or modulus of the transfer function. Units of (Y
49 / X).
50
51 name_stem.phase
52 Phase spectrum, or phase of the transfer function. Units are
53 radians.
54
55 name_stem.admit
56 Admittance spectrum, or real part of the transfer function.
57 Units of (Y / X).
58
59 name_stem.coh
60 (Squared) coherency spectrum, or linear correlation coefficient
61 as a function of frequency. Dimensionless number in [0, 1]. The
62 Signal-to-Noise-Ratio (SNR) is coh / (1 - coh). SNR = 1 when coh
63 = 0.5.
64
65 In addition, a single file with all of the above as individual columns
66 will be written to stdout (unless disabled via -T).
67
69 -Ssegment_size]
70 segment_size is a radix-2 number of samples per window for
71 ensemble averaging. The smallest frequency estimated is
72 1.0/(segment_size * dt), while the largest is 1.0/(2 * dt). One
73 standard error in power spectral density is approximately 1.0 /
74 sqrt(n_data / segment_size), so if segment_size = 256, you need
75 25,600 data to get a one standard error bar of 10%. Cross-spec‐
76 tral error bars are larger and more complicated, being a func‐
77 tion also of the coherency.
78
80 table One or more ASCII (or binary, see -bi) files holding X(t) [Y(t)]
81 samples in the first 1 [or 2] columns. If no files are speci‐
82 fied, spectrum1d will read from standard input.
83
84 -C[xycnpago]
85 Read the first two columns of input as samples of two
86 time-series, X(t) and Y(t). Consider Y(t) to be the output and
87 X(t) the input in a linear system with noise. Estimate the opti‐
88 mum frequency response function by least squares, such that the
89 noise output is minimized and the coherent output and the noise
90 output are uncorrelated. Optionally specify up to 8 letters
91 from the set { x y c n p a g o } in any order to create only
92 those output files instead of the default [all]. x = xpower, y =
93 ypower, c = cpower, n = npower, p = phase, a = admit, g = gain,
94 o = coh.
95
96 -Ddt dt Set the spacing between samples in the time-series [Default =
97 1].
98
99 -L Leave trend alone. By default, a linear trend will be removed
100 prior to the transform. Alternatively, append m to just remove
101 the mean value or h to remove the mid-value.
102
103 -N[name_stem]
104 Supply an alternate name stem to be used for output files
105 [Default = "spectrum"]. If -N is given with no argument then we
106 disable the writing of individual output files and instead write
107 a single table to standard output.
108
109 -V[level] (more ...)
110 Select verbosity level [c].
111
112 -T Disable the writing of a single composite results file to std‐
113 out.
114
115 -W Write Wavelength rather than frequency in column 1 of the output
116 file[s] [Default = frequency, (cycles / dt)].
117
118 -bi[ncols][t] (more ...)
119 Select native binary input. [Default is 2 input columns].
120
121 -bo[ncols][type] (more ...)
122 Select native binary output. [Default is 2 output columns].
123
124 -d[i|o]nodata (more ...)
125 Replace input columns that equal nodata with NaN and do the
126 reverse on output.
127
128 -e[~]"pattern" | -e[~]/regexp/[i] (more ...)
129 Only accept data records that match the given pattern.
130
131 -f[i|o]colinfo (more ...)
132 Specify data types of input and/or output columns.
133
134 -g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u] (more ...)
135 Determine data gaps and line breaks.
136
137 -h[i|o][n][+c][+d][+rremark][+rtitle] (more ...)
138 Skip or produce header record(s).
139
140 -icols[+l][+sscale][+ooffset][,...] (more ...)
141 Select input columns and transformations (0 is first column).
142
143 -^ or just -
144 Print a short message about the syntax of the command, then
145 exits (NOTE: on Windows just use -).
146
147 -+ or just +
148 Print an extensive usage (help) message, including the explana‐
149 tion of any module-specific option (but not the GMT common
150 options), then exits.
151
152 -? or no arguments
153 Print a complete usage (help) message, including the explanation
154 of all options, then exits.
155
157 The ASCII output formats of numerical data are controlled by parameters
158 in your gmt.conf file. Longitude and latitude are formatted according
159 to FORMAT_GEO_OUT, absolute time is under the control of FOR‐
160 MAT_DATE_OUT and FORMAT_CLOCK_OUT, whereas general floating point val‐
161 ues are formatted according to FORMAT_FLOAT_OUT. Be aware that the for‐
162 mat in effect can lead to loss of precision in ASCII output, which can
163 lead to various problems downstream. If you find the output is not
164 written with enough precision, consider switching to binary output (-bo
165 if available) or specify more decimals using the FORMAT_FLOAT_OUT set‐
166 ting.
167
169 Suppose data.g is gravity data in mGal, sampled every 1.5 km. To write
170 its power spectrum, in mGal**2-km, to the file data.xpower, use
171
172 gmt spectrum1d data.g -S256 -D1.5 -Ndata
173
174 Suppose in addition to data.g you have data.t, which is topography in
175 meters sampled at the same points as data.g. To estimate various fea‐
176 tures of the transfer function, considering data.t as input and data.g
177 as output, use
178
179 paste data.t data.g | gmt spectrum1d -S256 -D1.5 -Ndata -C > results.txt
180
182 The output of spectrum1d is in units of power spectral density, and so
183 to get units of data-squared you must divide by delta_t, where delta_t
184 is the sample spacing. (There may be a factor of 2 pi somewhere, also.
185 If you want to be sure of the normalization, you can determine a scale
186 factor from Parseval's theorem: the sum of the squares of your input
187 data should equal the sum of the squares of the outputs from spec‐
188 trum1d, if you are simply trying to get a periodogram. [See below.])
189
190 Suppose we simply take a data set, x(t), and compute the discrete
191 Fourier transform (DFT) of the entire data set in one go. Call this
192 X(f). Then suppose we form X(f) times the complex conjugate of X(f).
193
194 P_raw(f) = X(f) * X'(f), where the ' indicates complex conjugation.
195
196 P_raw is called the periodogram. The sum of the samples of the peri‐
197 odogram equals the sum of the samples of the squares of x(t), by Parse‐
198 val's theorem. (If you use a DFT subroutine on a computer, usually the
199 sum of P_raw equals the sum of x-squared, times M, where M is the num‐
200 ber of samples in x(t).)
201
202 Each estimate of X(f) is now formed by a weighted linear combination of
203 all of the x(t) values. (The weights are sometimes called "twiddle fac‐
204 tors" in the DFT literature.) So, no matter what the probability dis‐
205 tribution for the x(t) values is, the probability distribution for the
206 X(f) values approaches [complex] Gaussian, by the Central Limit Theo‐
207 rem. This means that the probability distribution for P_raw(f)
208 approaches chi-squared with two degrees of freedom. That reduces to an
209 exponential distribution, and the variance of the estimate of P_raw is
210 proportional to the square of the mean, that is, the expected value of
211 P_raw.
212
213 In practice if we form P_raw, the estimates are hopelessly noisy. Thus
214 P_raw is not useful, and we need to do some kind of smoothing or aver‐
215 aging to get a useful estimate, P_useful(f).
216
217 There are several different ways to do this in the literature. One is
218 to form P_raw and then smooth it. Another is to form the auto-covari‐
219 ance function of x(t), smooth, taper and shape it, and then take the
220 Fourier transform of the smoothed, tapered and shaped auto-covariance.
221 Another is to form a parametric model for the auto-correlation struc‐
222 ture in x(t), then compute the spectrum of that model. This last
223 approach is what is done in what is called the "maximum entropy" or
224 "Berg" or "Box-Jenkins" or "ARMA" or "ARIMA" methods.
225
226 Welch's method is a tried-and-true method. In his method, you choose a
227 segment length, -SN, so that estimates will be made from segments of
228 length N. The frequency samples (in cycles per delta_t unit) of your
229 P_useful will then be at k /(N * delta_t), where k is an integer, and
230 you will get N samples (since the spectrum is an even function of f,
231 only N/2 of them are really useful). If the length of your entire data
232 set, x(t), is M samples long, then the variance in your P_useful will
233 decrease in proportion to N/M. Thus you need to choose N << M to get
234 very low noise and high confidence in P_useful. There is a trade-off
235 here; see below.
236
237 There is an additional reduction in variance in that Welch's method
238 uses a Von Hann spectral window on each sample of length N. This
239 reduces side lobe leakage and has the effect of smoothing the (N seg‐
240 ment) periodogram as if the X(f) had been convolved with [1/4, 1/2,
241 1/4] prior to forming P_useful. But this slightly widens the spectral
242 bandwidth of each estimate, because the estimate at frequency sample k
243 is now a little correlated with the estimate at frequency sample k+1.
244 (Of course this would also happen if you simply formed P_raw and then
245 smoothed it.)
246
247 Finally, Welch's method also uses overlapped processing. Since the Von
248 Hann window is large in the middle and tapers to near zero at the ends,
249 only the middle of the segment of length N contributes much to its
250 estimate. Therefore in taking the next segment of data, we move ahead
251 in the x(t) sequence only N/2 points. In this way, the next segment
252 gets large weight where the segments on either side of it will get lit‐
253 tle weight, and vice versa. This doubles the smoothing effect and
254 ensures that (if N << M) nearly every point in x(t) contributes with
255 nearly equal weight in the final answer.
256
257 Welch's method of spectral estimation has been widely used and widely
258 studied. It is very reliable and its statistical properties are well
259 understood. It is highly recommended in such textbooks as "Random Data:
260 Analysis and Measurement Procedures" by Bendat and Piersol.
261
262 In all problems of estimating parameters from data, there is a classic
263 trade-off between resolution and variance. If you want to try to
264 squeeze more resolution out of your data set, then you have to be will‐
265 ing to accept more noise in the estimates. The same trade-off is evi‐
266 dent here in Welch's method. If you want to have very low noise in the
267 spectral estimates, then you have to choose N << M, and this means that
268 you get only N samples of the spectrum, and the longest period that you
269 can resolve is only N * delta_t. So you see that reducing the noise
270 lowers the number of spectral samples and lowers the longest period.
271 Conversely, if you choose N approaching M, then you approach the peri‐
272 odogram with its very bad statistical properties, but you get lots of
273 samples and a large fundamental period.
274
275 The other spectral estimation methods also can do a good job. Welch's
276 method was selected because the way it works, how one can code it, and
277 its effects on statistical distributions, resolution, side-lobe leak‐
278 age, bias, variance, etc. are all easily understood. Some of the other
279 methods (e.g. Maximum Entropy) tend to hide where some of these
280 trade-offs are happening inside a "black box".
281
283 gmt, grdfft
284
286 Bendat, J. S., and A. G. Piersol, 1986, Random Data, 2nd revised ed.,
287 John Wiley & Sons.
288
289 Welch, P. D., 1967, The use of Fast Fourier Transform for the estima‐
290 tion of power spectra: a method based on time averaging over short,
291 modified periodograms, IEEE Transactions on Audio and Electroacoustics,
292 Vol AU-15, No 2.
293
295 2019, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe
296
297
298
299
3005.4.5 Feb 24, 2019 SPECTRUM1D(1)