1r.fill.stats(1)             GRASS GIS User's Manual            r.fill.stats(1)
2
3
4

NAME

6       r.fill.stats   -  Rapidly fills ’no data’ cells (NULLs) of a raster map
7       with interpolated values (IDW).
8

KEYWORDS

10       raster, surface, interpolation, IDW, no-data filling
11

SYNOPSIS

13       r.fill.stats
14       r.fill.stats --help
15       r.fill.stats [-mkwus] input=name output=name  [uncertainty=name]   dis‐
16       tance=value  mode=name   [minimum=value]   [maximum=value]  power=value
17       cells=value  [--overwrite]  [--help]  [--verbose]  [--quiet]  [--ui]
18
19   Flags:
20       -m
21           Interpret distance as map units, not number of cells
22
23       -k
24           Keep (preserve) original cell values
25           By default original values are smoothed
26
27       -w
28           Just print the spatial weights matrix
29
30       -u
31           Just print estimated memory usage
32
33       -s
34           Single precision floating point output
35
36       --overwrite
37           Allow output files to overwrite existing files
38
39       --help
40           Print usage summary
41
42       --verbose
43           Verbose module output
44
45       --quiet
46           Quiet module output
47
48       --ui
49           Force launching GUI dialog
50
51   Parameters:
52       input=name [required]
53           Raster map with data gaps to fill
54
55       output=name [required]
56           Name of result output map
57
58       uncertainty=name
59           Name of uncertainty output map
60
61       distance=value [required]
62           Distance threshold (default: in cells) for interpolation
63           Default: 3
64
65       mode=name [required]
66           Statistic for interpolated cell values
67           Options: wmean, mean, median, mode
68           Default: wmean
69
70       minimum=value
71           Minimum input data value to include in interpolation
72
73       maximum=value
74           Maximum input data value to include in interpolation
75
76       power=value [required]
77           Power coefficient for IDW interpolation
78           Default: 2.0
79
80       cells=value [required]
81           Minimum number of data cells within search radius
82           Default: 8
83

DESCRIPTION

85       r.fill.stats is a module for fast gap filling and  interpolation  (with
86       smoothing) of dense raster data.
87
88       The  r.fill.stats  module is capable of quickly filling small data gaps
89       in large and high-resolution raster maps. It’s primary  purpose  is  to
90       improve  high-resolution,  rasterized sensor data (such as Lidar data).
91       As a rule of thumb, there should be at least  as  many  data  cells  as
92       there  are  "no data" (NULL) cells in the input raster map. Several in‐
93       terpolation modes are available. By default, all values  of  the  input
94       raster  map  will  be  replaced with locally interpolated values in the
95       output raster map. This is the equivalent of running a low-pass smooth‐
96       ing  filter  on  the interpolated data and is often desirable, owing to
97       noisy nature of high-resolution sensor data. With dense data and  small
98       gaps,  this  should  result in only slight deviations from the original
99       data and produce smooth output. Alternatively, setting the -k flag will
100       disable the low-pass filter and preserve the original cell values.
101
102       Available gap filling modes:
103
104           •   spatially weighted mean (default)
105
106           •   simple mean
107
108           •   simple median
109
110           •   simple mode
111
112       The  spatially  weighted  mean  is  equivalent  to  an Inverse Distance
113       Weighting (IDW; see also r.surf.idw) interpolation. The simple mean  is
114       equivalent  to  a simple low-pass filter.  Median and mode replacements
115       can also be achieved using r.neighbors.
116
117       Note that r.fill.stats is highly optimized for fast processing of large
118       raster  datasets  with  a  small interpolation distance threshold (i.e.
119       closing small gaps). As a trade-off for speed and a small  memory  foot
120       print,  some  spatial  accuracy is lost due to the rounding of the dis‐
121       tance threshold to the closest approximation in input raster cells  and
122       the use of a matrix of precomputed weights at cell resolution (see fur‐
123       ther down for details). Note also that processing  time  will  increase
124       exponentially with higher distance settings. Cells outside the distance
125       threshold will not be interpolated, preserving the edges of the  origi‐
126       nal data extent.
127
128       This  module is not well suited for interpolating sparse input data and
129       closing large gaps. For such  purposes  more  appropriate  methods  are
130       available, such as v.surf.bspline, v.surf.idw or v.surf.rst.
131
132       Applications  where the properties of r.fill.stats are advantageous in‐
133       clude the processing of high resolution, close range scanning  and  re‐
134       mote  sensing  datasets.  Such datasets commonly feature densely spaced
135       measurements that have some gaps  after  rasterization,  due  to  blind
136       spots,  instrument  failures, and misalignments between the GIS’ raster
137       cell grids and the original  measurement  locations.  In  these  cases,
138       r.fill.stats  should  typically  be  run using the "weighted mean" (de‐
139       fault) method and with a small distance setting (the default is to  use
140       a search radius of just three cells).
141
142       The  images below show a gradiometer dataset with gaps and its interpo‐
143       lated equivalent, produced using the spatially weighted  mean  operator
144       (mode="wmean").
145
146       In  addition, r.fill.stats can be useful for raster map generalization.
147       Often, this involves removing small clumps  of  categorized  cells  and
148       then  filling  the resulting data gaps without "inventing" any new val‐
149       ues. In such cases, the "mode"  or  "median"  interpolators  should  be
150       used.
151
152   Usage
153       The  most  critical  user-provided  settings  are the interpolation/gap
154       filling  method  (mode)  and  the  maximum  distance,   up   to   which
155       r.fill.stats  will scan for cells with values (distance).  The distance
156       can be expressed as a number of cells (default) or in the current GRASS
157       location’s  units  (if  the -m flag is given). The latter are typically
158       meters, but can be any other units of a planar coordinate system.
159
160       Note that proper handling of geodetic coordinates  (lat/lon)  and  dis‐
161       tances  is  currently  not  implemented. For lat/lon data, the distance
162       should therefore be specified in cells and usage of r.fill.stats should
163       be restricted to small areas to avoid large inaccuracies that may arise
164       from the different extents of cells along the latitudinal and  longitu‐
165       dinal axes.
166
167       Distances  specified in map units (-m flag) will be approximated as ac‐
168       curately as the current region’s cell resolution settings  allow.   The
169       program will warn if the distance cannot be expressed as whole cells at
170       the current region’s resolution. In such case, the number of  cells  in
171       the search window will be rounded up. Due to the rounding effect intro‐
172       duced by using cells as spatial units, the actual maximum distance con‐
173       sidered  by  the interpolation may be up to half a cell diagonal larger
174       than the one specified by the user.
175
176       The interpolator type "wmean" uses  a  precomputed  matrix  of  spatial
177       weights  to  speed up computation. This matrix can be examined (printed
178       to the screen) before running the  interpolation,  by  setting  the  -w
179       flag.  In  mode "wmean", the power option has the usual meaning in IDW:
180       higher values mean that cell values in the neighborhood lose their  in‐
181       fluence  on  the  cell  to be interpolated more rapidly with increasing
182       distance from the latter. Another way of explaining this effect  is  to
183       state  that larger "power" settings result in more localized interpola‐
184       tion, smaller ones in more globalized interpolation.  The default  set‐
185       ting is power=2.0.
186
187       The  interpolators  "mean", "median" and "mode" are calculated from all
188       cell values within the search radius. No spatial weighting  is  applied
189       for  these  methods.  The  "mode"  of  the input data may not always be
190       unique. In such case, the mode will be  the  smallest  value  with  the
191       highest frequency.
192
193       Often,  input  data will contain spurious extreme measurements (spikes,
194       outliers, noise) caused by the limits of device  sensitivity,  hardware
195       defects,  environmental  influences, etc. If the normal, valid range of
196       input data is known beforehand, then the minimum  and  maximum  options
197       can  be  used  to  exclude  those input cells that have values below or
198       above that range, respectively. This  will  prevent  the  influence  of
199       spikes and outliers from spreading through the interpolation.
200
201       Unless the -k (keep) flag is given, data cells of the input map will be
202       replaced with interpolated values instead of  preserving  them  in  the
203       output. In modes "wmean" and "mean", this results in a smoothing effect
204       that includes the original data (see below)!
205
206       Besides the result of the interpolation/gap filling,  a  second  output
207       map  can  be  specified  via the uncertainty option. The cell values in
208       this map represent a simple measure of how  much  uncertainty  was  in‐
209       volved in interpolating each cell value of the primary output map, with
210       "0" being the lowest and "1" being the theoretic  highest  uncertainty.
211       Uncertainty  is  measured  by  summing up all cells in the neighborhood
212       (defined by the search radius distance) that contain a value in the in‐
213       put  map,  multiplied  by their weights, and dividing the result by the
214       sum of all weights in the neighborhood.   For  mode=wmean,  this  means
215       that  interpolated output cells that were computed from many nearby in‐
216       put cells have very low uncertainty  and  vice  versa.  For  all  other
217       modes,  all weights in the neighborhood are constant "1" and the uncer‐
218       tainty measure is a simple measure of how many input  data  cells  were
219       present in the search window.
220
221   Smoothing
222       The  r.fill.stats  module  uses  the  interpolated values to adjust the
223       original values and create a smooth surface, which is akin to running a
224       low-pass  filter on the data. Since most high-resolution sensor data is
225       noisy, this is normally a desired effect and results in output that  is
226       more  suitable  for  deriving secondary products, such as slope, aspect
227       and curvature maps. Larger settings for the  search  radius  (distance)
228       will  result in a stronger smoothing. In practice, some experimentation
229       with different settings for distance and power  might  be  required  to
230       achieve  good  results. In some cases (e.g. when dealing with low-noise
231       or classified data), it might be desirable to turn off  data  smoothing
232       by  setting the -k (keep) flag. This will ensure that the original cell
233       data is copied through to the result map without alteration.
234
235        Effect of smoothing the original data: The top row shows a  gap-filled
236       surface  computed from a rasterized Lidar point cloud (using mode=wmean
237       and power=2), and the derived  slope,  aspect,  and  profile  curvature
238       maps.  The  smoothing  effect is clearly visible.  The bottom row shows
239       the effect of setting the -k flag: Preserving the original cell  values
240       in  the interpolated output produces and unsmoothed, noisy surface, and
241       likewise noisy derivative maps.  The effect can be seen in  the  illus‐
242       tration  above: Slope, aspect, and profile curvature are computed using
243       the r.slope.aspect module, which uses a window  (kernel)  for  computa‐
244       tions that considers only the immediate neighborhood of each cell. When
245       performed on noisy data, such local operations result in equally  noisy
246       derivatives  if the original data is preserved (by setting the -k flag)
247       and no smoothing is performed.
248
249       (Note that the effects of noisy data can also be avoided by using  mod‐
250       ules  that are not restricted to minimal kernel sizes. For example, as‐
251       pect and other  morphometric  parameters  can  be  computed  using  the
252       r.param.scale  module  which operates with variable-size cell neighbor‐
253       hoods.)
254
255   Spatial weighting scheme
256       The key to getting good gap filling results is to understand  the  spa‐
257       tial weighting scheme used in mode "wmean". The weights are precomputed
258       and assigned per cell within the search window centered on the location
259       at  which  to  interpolate/gap fill all cells within the user-specified
260       distance.
261
262       The illustration below shows  the  precomputed  weights  matrix  for  a
263       search distance of four cells from the center cell:
264       000.00 000.01 000.04 000.07 000.09 000.07 000.04 000.01 000.00
265       000.01 000.06 000.13 000.19 000.22 000.19 000.13 000.06 000.01
266       000.04 000.13 000.25 000.37 000.42 000.37 000.25 000.13 000.04
267       000.07 000.19 000.37 000.56 000.68 000.56 000.37 000.19 000.07
268       000.09 000.22 000.42 000.68 001.00 000.68 000.42 000.22 000.09
269       000.07 000.19 000.37 000.56 000.68 000.56 000.37 000.19 000.07
270       000.04 000.13 000.25 000.37 000.42 000.37 000.25 000.13 000.04
271       000.01 000.06 000.13 000.19 000.22 000.19 000.13 000.06 000.01
272       000.00 000.01 000.04 000.07 000.09 000.07 000.04 000.01 000.00
273       Note  that  the weights in such a small window drop rapidly for the de‐
274       fault setting of power=2.
275
276       If the distance is given in map units (flag -m), then the search window
277       can  be  modeled  more  accurately  as a circle. The illustration below
278       shows the precomputed weights for a distance in map units that  is  ap‐
279       proximately equivalent to four cells from the center cell:
280       ...... ...... ...... 000.00 000.00 000.00 ...... ...... ......
281       ...... 000.00 000.02 000.06 000.09 000.06 000.02 000.00 ......
282       ...... 000.02 000.11 000.22 000.28 000.22 000.11 000.02 ......
283       000.00 000.07 000.22 000.44 000.58 000.44 000.22 000.07 000.00
284       000.00 000.09 000.28 000.58 001.00 000.58 000.28 000.09 000.00
285       000.00 000.07 000.22 000.44 000.58 000.44 000.22 000.07 000.00
286       ...... 000.02 000.11 000.22 000.28 000.22 000.11 000.02 ......
287       ...... 000.00 000.02 000.06 000.09 000.06 000.02 000.00 ......
288       ...... ...... ...... 000.00 000.00 000.00 ...... ...... ......
289
290       When  using  a  small  search radius, cells must also be set to a small
291       value. Otherwise, there may not be enough cells with  data  within  the
292       search radius to support interpolation.
293

NOTES

295       The  straight-line metric used for converting distances in map units to
296       cell numbers is only adequate for planar coordinate systems. Using this
297       module with lat/lon input data will likely give inaccurate results, es‐
298       pecially when interpolating over large geographical areas.
299
300       If the distance is set to a relatively  large  value,  processing  time
301       will  quickly approach and eventually exceed that of point-based inter‐
302       polation modules such as v.surf.rst.
303
304       This module can handle cells with different X and Y resolutions.   How‐
305       ever,  note  that  the weight matrix will be skewed in such cases, with
306       higher weights occurring close to the center and along  the  axis  with
307       the  higher resolution. This is because weights on the lower resolution
308       axis are  less  accurately  calculated.  The  skewing  effect  will  be
309       stronger  if  the  difference  between  the  X and Y axis resolution is
310       greater and a larger "power" setting is chosen. This  property  of  the
311       weights  matrix  directly reflects the higher information density along
312       the higher resolution axis.
313
314       Note on printing the weights matrix (using the  -w  flag):  the  matrix
315       cannot be printed if it is too large.
316
317       The  memory  estimate  provided  by the -u flag is a lower limit on the
318       amount of RAM that will be needed.
319
320       If the -s flag is set, floating point type output will be  saved  as  a
321       "single  precision"  raster map, saving ~50% disk space compared to the
322       default "double precision" output.
323

EXAMPLES

325   Gap-filling of a dataset using spatially weighted mean (IDW)
326       Gap-fill a dataset using spatially weighted mean (IDW)  and  a  maximum
327       search  radius  of  3.0  map units; also produce uncertainty estimation
328       map:
329       r.fill.stats input=measurements output=result dist=3.0 -m mode=wmean uncertainty=uncert_map
330       Run a fast low-pass filter (replacement all cells with  mean  value  of
331       neighboring cells) on the input map:
332       r.fill.stats input=measurements output=result dist=10 mode=mean
333       Fill data gaps in a categorized raster map; preserve existing data:
334       r.fill.stats input=categories output=result dist=100 -m mode=mode -k
335
336   Lidar point cloud example
337       Inspect  the  point density and determine the extent of the point cloud
338       using the r.in.lidar module:
339       r.in.lidar -e input=points.las output=density method=n resolution=5 class_filter=2
340       Based on the result, set computational region extent and desired  reso‐
341       lution:
342       g.region -pa raster=density res=1
343       Import the point cloud as raster using binning:
344       r.in.lidar input=points.las output=ground_raw method=mean class_filter=2
345       Check that there are more non-NULL cells than NULL ("no data") cells:
346       r.univar map=ground_raw
347       total null and non-null cells: 2340900
348       total null cells: 639184
349       ...
350       Fill in the NULL cells using the default 3-cell search radius:
351       r.fill.stats input=ground output=ground_filled uncertainty=uncertainty distance=3 mode=wmean power=2.0 cells=8
352
353         Binning of Lidar and resulting ground surface with filled gaps.  Note
354       the remaining NULL cells  (white)  in  the  resulting  ground  surface.
355       These are areas with a lack of cells with values in close proximity.
356
357   Outlier removal and gap-filling of SRTM elevation data
358       In  this  example,  the SRTM elevation map in the North Carolina sample
359       dataset location is filtered for outlier elevation values; missing pix‐
360       els are then re-interpolated to obtain a complete elevation map:
361       g.region raster=elev_srtm_30m -p
362       d.mon wx0
363       d.histogram elev_srtm_30m
364       # remove SRTM outliers, i.e. SRTM below 50m (esp. lakes), leading to no data areas
365       r.mapcalc "elev_srtm_30m_filt = if(elev_srtm_30m < 50.0, null(), elev_srtm_30m)"
366       d.histogram elev_srtm_30m_filt
367       d.rast elev_srtm_30m_filt
368       # using the IDW method to fill these holes in DEM without low-pass filter
369       # increase distance to gap-fill larger holes
370       r.fill.stats -k input=elev_srtm_30m_filt output=elev_srtm_30m_idw distance=100
371       d.histogram elev_srtm_30m_idw
372       d.rast elev_srtm_30m_idw
373       r.mapcalc "diff_orig_idw = elev_srtm_30m - elev_srtm_30m_idw"
374       r.colors diff_orig_idw color=differences
375       r.univar -e diff_orig_idw
376       d.erase
377       d.rast diff_orig_idw
378       d.legend diff_orig_idw
379

SEE ALSO

381         r.fillnulls,  r.neighbors,  r.surf.idw,  v.surf.bspline,  v.surf.idw,
382       v.surf.rst
383
384       Inverse Distance Weighting in Wikipedia
385

AUTHOR

387       Benjamin Ducke
388

SOURCE CODE

390       Available at: r.fill.stats source code (history)
391
392       Accessed: Mon Jun 20 16:46:03 2022
393
394       Main index | Raster index | Topics index | Keywords index  |  Graphical
395       index | Full index
396
397       © 2003-2022 GRASS Development Team, GRASS GIS 8.2.0 Reference Manual
398
399
400
401GRASS 8.2.0                                                    r.fill.stats(1)
Impressum