SPECTRUM1D(1)                         GMT                        SPECTRUM1D(1)

NAME

       spectrum1d - Compute auto- [and cross-] spectra from one [or two]
       time-series

SYNOPSIS

       spectrum1d [ table ] -Ssegment_size [ -C[xycnpago] ] [ -Ddt ]
       [ -L[h|m] ] [ -N[name_stem] ] [ -T ] [ -V[level] ] [ -W ]
       [ -bbinary ] [ -dnodata ] [ -eregexp ] [ -fflags ] [ -ggaps ]
       [ -hheaders ] [ -iflags ]

       Note: No space is allowed between the option flag and the associated
       arguments.

DESCRIPTION

       spectrum1d reads X [and Y] values from the first [and second] columns
       on standard input [or x[y]file]. These values are treated as
       time-series X(t) [Y(t)] sampled at equal intervals spaced dt units
       apart. There may be any number of lines of input. spectrum1d will
       create file[s] containing auto- [and cross-] spectral density
       estimates by Welch's method of ensemble averaging of multiple
       overlapped windows, using standard error estimates from Bendat and
       Piersol.

       The output files have 3 columns: f or w, p, and e. f or w is the
       frequency or wavelength, p is the spectral density estimate, and e is
       the one standard deviation error bar size. These files are named based
       on name_stem. If the -C option is used, up to eight files are created;
       otherwise only one (xpower) is written. The files (which are ASCII
       unless -bo is set) are as follows:

       name_stem.xpower
              Power spectral density of X(t). Units of X * X * dt.

       name_stem.ypower
              Power spectral density of Y(t). Units of Y * Y * dt.

       name_stem.cpower
              Power spectral density of the coherent output. Units same as
              ypower.

       name_stem.npower
              Power spectral density of the noise output. Units same as
              ypower.

       name_stem.gain
              Gain spectrum, or modulus of the transfer function. Units of
              (Y / X).

       name_stem.phase
              Phase spectrum, or phase of the transfer function. Units are
              radians.

       name_stem.admit
              Admittance spectrum, or real part of the transfer function.
              Units of (Y / X).

       name_stem.coh
              (Squared) coherency spectrum, or linear correlation coefficient
              as a function of frequency. Dimensionless number in [0, 1]. The
              Signal-to-Noise-Ratio (SNR) is coh / (1 - coh). SNR = 1 when
              coh = 0.5.

       In addition, a single file with all of the above as individual columns
       will be written to stdout (unless disabled via -T).

REQUIRED ARGUMENTS

       -Ssegment_size
              segment_size is a radix-2 number of samples per window for
              ensemble averaging. The smallest frequency estimated is
              1.0/(segment_size * dt), while the largest is 1.0/(2 * dt). One
              standard error in power spectral density is approximately
              1.0 / sqrt(n_data / segment_size), so if segment_size = 256,
              you need 25,600 data to get a one standard error bar of 10%.
              Cross-spectral error bars are larger and more complicated,
              being a function also of the coherency.
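
              For a quick sanity check, the same arithmetic can be reproduced
              on the command line with gmt math (the numbers below simply
              repeat the illustration above, with an assumed dt of 1.5; they
              are not defaults):

                     gmt math -Q 25600 256 DIV SQRT INV =
                     gmt math -Q 256 1.5 MUL INV =

              The first command prints 0.1 (the ~10% one-standard-error bar
              for 25,600 points with segment_size = 256); the second prints
              the smallest estimated frequency, 1/(256 * 1.5).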

OPTIONAL ARGUMENTS

       table  One or more ASCII (or binary, see -bi) files holding X(t)
              [Y(t)] samples in the first 1 [or 2] columns. If no files are
              specified, spectrum1d will read from standard input.

       -C[xycnpago]
              Read the first two columns of input as samples of two
              time-series, X(t) and Y(t). Consider Y(t) to be the output and
              X(t) the input in a linear system with noise. Estimate the
              optimum frequency response function by least squares, such that
              the noise output is minimized and the coherent output and the
              noise output are uncorrelated. Optionally specify up to 8
              letters from the set { x y c n p a g o } in any order to create
              only those output files instead of the default [all]. x =
              xpower, y = ypower, c = cpower, n = npower, p = phase, a =
              admit, g = gain, o = coh.

       -Ddt   Set the spacing between samples in the time-series [Default =
              1].

       -L     Leave trend alone. By default, a linear trend will be removed
              prior to the transform. Alternatively, append m to just remove
              the mean value or h to remove the mid-value.

       -N[name_stem]
              Supply an alternate name stem to be used for output files
              [Default = "spectrum"]. If -N is given with no argument then we
              disable the writing of individual output files and instead
              write a single table to standard output.

       -V[level] (more ...)
              Select verbosity level [c].

       -T     Disable the writing of a single composite results file to
              stdout.

       -W     Write wavelength rather than frequency in column 1 of the
              output file[s] [Default = frequency, (cycles / dt)].

       -bi[ncols][t] (more ...)
              Select native binary input. [Default is 2 input columns].

       -bo[ncols][type] (more ...)
              Select native binary output. [Default is 2 output columns].

       -d[i|o]nodata (more ...)
              Replace input columns that equal nodata with NaN and do the
              reverse on output.

       -e[~]"pattern" | -e[~]/regexp/[i] (more ...)
              Only accept data records that match the given pattern.

       -f[i|o]colinfo (more ...)
              Specify data types of input and/or output columns.

       -g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u] (more ...)
              Determine data gaps and line breaks.

       -h[i|o][n][+c][+d][+rremark][+ttitle] (more ...)
              Skip or produce header record(s).

       -icols[+l][+sscale][+ooffset][,...] (more ...)
              Select input columns and transformations (0 is first column).

       -^ or just -
              Print a short message about the syntax of the command, then
              exits (NOTE: on Windows just use -).

       -+ or just +
              Print an extensive usage (help) message, including the
              explanation of any module-specific option (but not the GMT
              common options), then exits.

       -? or no arguments
              Print a complete usage (help) message, including the
              explanation of all options, then exits.

ASCII FORMAT PRECISION

       The ASCII output formats of numerical data are controlled by
       parameters in your gmt.conf file. Longitude and latitude are formatted
       according to FORMAT_GEO_OUT, absolute time is under the control of
       FORMAT_DATE_OUT and FORMAT_CLOCK_OUT, whereas general floating point
       values are formatted according to FORMAT_FLOAT_OUT. Be aware that the
       format in effect can lead to loss of precision in ASCII output, which
       can lead to various problems downstream. If you find the output is not
       written with enough precision, consider switching to binary output
       (-bo if available) or specify more decimals using the FORMAT_FLOAT_OUT
       setting.

EXAMPLES

       Suppose data.g is gravity data in mGal, sampled every 1.5 km. To write
       its power spectrum, in mGal**2-km, to the file data.xpower, use

              gmt spectrum1d data.g -S256 -D1.5 -Ndata

       Suppose in addition to data.g you have data.t, which is topography in
       meters sampled at the same points as data.g. To estimate various
       features of the transfer function, considering data.t as input and
       data.g as output, use

              paste data.t data.g | gmt spectrum1d -S256 -D1.5 -Ndata -C > results.txt
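
       Building on the same data, and assuming the goal is the classic
       admittance analysis of gravity versus topography, a command along
       these lines would create only the admittance and coherency files
       (data.admit and data.coh), report wavelength rather than frequency in
       column 1, and suppress the composite table on stdout:

              paste data.t data.g | gmt spectrum1d -S256 -D1.5 -Ndata -Cao -W -T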

TUTORIAL

       The output of spectrum1d is in units of power spectral density, and so
       to get units of data-squared you must divide by delta_t, where delta_t
       is the sample spacing. (There may be a factor of 2 pi somewhere, also.
       If you want to be sure of the normalization, you can determine a scale
       factor from Parseval's theorem: the sum of the squares of your input
       data should equal the sum of the squares of the outputs from
       spectrum1d, if you are simply trying to get a periodogram. [See
       below.])
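
       A minimal sketch of such a check, assuming the input series is in a
       one-column file data.g and the corresponding spectrum is in
       data.xpower (awk is used only to form the two sums; their ratio
       indicates the normalization in effect):

              awk '{ss += $1*$1} END {print ss}' data.g
              awk '{sp += $2} END {print sp}' data.xpower
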
       Suppose we simply take a data set, x(t), and compute the discrete
       Fourier transform (DFT) of the entire data set in one go. Call this
       X(f). Then suppose we form X(f) times the complex conjugate of X(f).

       P_raw(f) = X(f) * X'(f), where the ' indicates complex conjugation.

       P_raw is called the periodogram. The sum of the samples of the
       periodogram equals the sum of the samples of the squares of x(t), by
       Parseval's theorem. (If you use a DFT subroutine on a computer,
       usually the sum of P_raw equals the sum of x-squared, times M, where M
       is the number of samples in x(t).)

       Each estimate of X(f) is now formed by a weighted linear combination
       of all of the x(t) values. (The weights are sometimes called "twiddle
       factors" in the DFT literature.) So, no matter what the probability
       distribution for the x(t) values is, the probability distribution for
       the X(f) values approaches [complex] Gaussian, by the Central Limit
       Theorem. This means that the probability distribution for P_raw(f)
       approaches chi-squared with two degrees of freedom. That reduces to an
       exponential distribution, and the variance of the estimate of P_raw is
       proportional to the square of the mean, that is, the expected value of
       P_raw.

       In practice if we form P_raw, the estimates are hopelessly noisy. Thus
       P_raw is not useful, and we need to do some kind of smoothing or
       averaging to get a useful estimate, P_useful(f).

       There are several different ways to do this in the literature. One is
       to form P_raw and then smooth it. Another is to form the
       auto-covariance function of x(t), smooth, taper and shape it, and then
       take the Fourier transform of the smoothed, tapered and shaped
       auto-covariance. Another is to form a parametric model for the
       auto-correlation structure in x(t), then compute the spectrum of that
       model. This last approach is what is done in what is called the
       "maximum entropy" or "Berg" or "Box-Jenkins" or "ARMA" or "ARIMA"
       methods.

       Welch's method is a tried-and-true method. In his method, you choose a
       segment length, -SN, so that estimates will be made from segments of
       length N. The frequency samples (in cycles per delta_t unit) of your
       P_useful will then be at k / (N * delta_t), where k is an integer, and
       you will get N samples (since the spectrum is an even function of f,
       only N/2 of them are really useful). If the length of your entire data
       set, x(t), is M samples long, then the variance in your P_useful will
       decrease in proportion to N/M. Thus you need to choose N << M to get
       very low noise and high confidence in P_useful. There is a trade-off
       here; see below.
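
       As a concrete illustration (the numbers are arbitrary): with -S256 and
       delta_t = 1, P_useful is sampled at frequencies k/256 cycles per
       sample for k = 1, ..., 128, and a record of M = 8192 points reduces
       the variance to roughly N/M = 256/8192 = 1/32 of that of the raw
       periodogram.
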
       There is an additional reduction in variance in that Welch's method
       uses a Von Hann spectral window on each sample of length N. This
       reduces side lobe leakage and has the effect of smoothing the (N
       segment) periodogram as if the X(f) had been convolved with [1/4, 1/2,
       1/4] prior to forming P_useful. But this slightly widens the spectral
       bandwidth of each estimate, because the estimate at frequency sample k
       is now a little correlated with the estimate at frequency sample k+1.
       (Of course this would also happen if you simply formed P_raw and then
       smoothed it.)
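
       (For reference, the Von Hann taper applied to each length-N segment is
       the standard raised cosine, commonly written

              w(j) = 0.5 * (1 - cos(2 * pi * j / N)),  j = 0, 1, ..., N-1,

       although conventions differ slightly in the denominator; it is this
       taper whose transform produces the [1/4, 1/2, 1/4] smoothing of
       adjacent frequency estimates mentioned above.)
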
       Finally, Welch's method also uses overlapped processing. Since the Von
       Hann window is large in the middle and tapers to near zero at the
       ends, only the middle of the segment of length N contributes much to
       its estimate. Therefore in taking the next segment of data, we move
       ahead in the x(t) sequence only N/2 points. In this way, the next
       segment gets large weight where the segments on either side of it will
       get little weight, and vice versa. This doubles the smoothing effect
       and ensures that (if N << M) nearly every point in x(t) contributes
       with nearly equal weight in the final answer.
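
       (In terms of segment counts: with 50% overlap the segments start at
       samples 0, N/2, N, 3N/2, ..., so a record of M points is cut into
       roughly 2M/N - 1 segments, about twice as many as would be obtained
       without overlap.)
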
       Welch's method of spectral estimation has been widely used and widely
       studied. It is very reliable and its statistical properties are well
       understood. It is highly recommended in such textbooks as "Random
       Data: Analysis and Measurement Procedures" by Bendat and Piersol.

       In all problems of estimating parameters from data, there is a classic
       trade-off between resolution and variance. If you want to try to
       squeeze more resolution out of your data set, then you have to be
       willing to accept more noise in the estimates. The same trade-off is
       evident here in Welch's method. If you want to have very low noise in
       the spectral estimates, then you have to choose N << M, and this means
       that you get only N samples of the spectrum, and the longest period
       that you can resolve is only N * delta_t. So you see that reducing the
       noise lowers the number of spectral samples and lowers the longest
       period. Conversely, if you choose N approaching M, then you approach
       the periodogram with its very bad statistical properties, but you get
       lots of samples and a large fundamental period.
275       The other spectral estimation methods also can do a good  job.  Welch's
276       method  was selected because the way it works, how one can code it, and
277       its effects on statistical distributions, resolution,  side-lobe  leak‐
278       age,  bias, variance, etc. are all easily understood. Some of the other
279       methods (e.g. Maximum  Entropy)  tend  to  hide  where  some  of  these
280       trade-offs are happening inside a "black box".
281

SEE ALSO

       gmt, grdfft

REFERENCES

       Bendat, J. S., and A. G. Piersol, 1986, Random Data, 2nd revised ed.,
       John Wiley & Sons.

       Welch, P. D., 1967, The use of Fast Fourier Transform for the
       estimation of power spectra: a method based on time averaging over
       short, modified periodograms, IEEE Transactions on Audio and
       Electroacoustics, Vol. AU-15, No. 2.

COPYRIGHT

       2019, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe

5.4.5                            Feb 24, 2019                    SPECTRUM1D(1)