1r.fill.stats(1) GRASS GIS User's Manual r.fill.stats(1)
2
3
4
6 r.fill.stats - Rapidly fills ’no data’ cells (NULLs) of a raster map
7 with interpolated values (IDW).
8
10 raster, surface, interpolation, IDW, no-data filling
11
13 r.fill.stats
14 r.fill.stats --help
15 r.fill.stats [-mkwus] input=name output=name [uncertainty=name] dis‐
16 tance=value mode=name [minimum=value] [maximum=value] power=value
17 cells=value [--overwrite] [--help] [--verbose] [--quiet] [--ui]
18
19 Flags:
20 -m
21 Interpret distance as map units, not number of cells
22
23 -k
24 Keep (preserve) original cell values
25 By default original values are smoothed
26
27 -w
28 Just print the spatial weights matrix
29
30 -u
31 Just print estimated memory usage
32
33 -s
34 Single precision floating point output
35
36 --overwrite
37 Allow output files to overwrite existing files
38
39 --help
40 Print usage summary
41
42 --verbose
43 Verbose module output
44
45 --quiet
46 Quiet module output
47
48 --ui
49 Force launching GUI dialog
50
51 Parameters:
52 input=name [required]
53 Raster map with data gaps to fill
54
55 output=name [required]
56 Name of result output map
57
58 uncertainty=name
59 Name of uncertainty output map
60
61 distance=value [required]
62 Distance threshold (default: in cells) for interpolation
63 Default: 3
64
65 mode=name [required]
66 Statistic for interpolated cell values
67 Options: wmean, mean, median, mode
68 Default: wmean
69
70 minimum=value
71 Minimum input data value to include in interpolation
72
73 maximum=value
74 Maximum input data value to include in interpolation
75
76 power=value [required]
77 Power coefficient for IDW interpolation
78 Default: 2.0
79
80 cells=value [required]
81 Minimum number of data cells within search radius
82 Default: 8
83
85 r.fill.stats is a module for fast gap filling and interpolation (with
86 smoothing) of dense raster data.
87
88 The r.fill.stats module is capable of quickly filling small data gaps
89 in large and high-resolution raster maps. It’s primary purpose is to
90 improve high-resolution, rasterized sensor data (such as Lidar data).
91 As a rule of thumb, there should be at least as many data cells as
92 there are "no data" (NULL) cells in the input raster map. Several
93 interpolation modes are available. By default, all values of the input
94 raster map will be replaced with locally interpolated values in the
95 output raster map. This is the equivalent of running a low-pass smooth‐
96 ing filter on the interpolated data and is often desirable, owing to
97 noisy nature of high-resolution sensor data. With dense data and small
98 gaps, this should result in only slight deviations from the original
99 data and produce smooth output. Alternatively, setting the -k flag will
100 disable the low-pass filter and preserve the original cell values.
101
102 Available gap filling modes:
103
104 · spatially weighted mean (default)
105
106 · simple mean
107
108 · simple median
109
110 · simple mode
111
112 The spatially weighted mean is equivalent to an Inverse Distance
113 Weighting (IDW; see also r.surf.idw) interpolation. The simple mean is
114 equivalent to a simple low-pass filter. Median and mode replacements
115 can also be achieved using r.neighbors.
116
117 Note that r.fill.stats is highly optimized for fast processing of large
118 raster datasets with a small interpolation distance threshold (i.e.
119 closing small gaps). As a trade-off for speed and a small memory foot
120 print, some spatial accuracy is lost due to the rounding of the dis‐
121 tance threshold to the closest approximation in input raster cells and
122 the use of a matrix of precomputed weights at cell resolution (see fur‐
123 ther down for details). Note also that processing time will increase
124 exponentially with higher distance settings. Cells outside the distance
125 threshold will not be interpolated, preserving the edges of the origi‐
126 nal data extent.
127
128 This module is not well suited for interpolating sparse input data and
129 closing large gaps. For such purposes more appropriate methods are
130 available, such as v.surf.idw or v.surf.rst.
131
132 Applications where the properties of r.fill.stats are advantageous
133 include the processing of high resolution, close range scanning and
134 remote sensing datasets. Such datasets commonly feature densely spaced
135 measurements that have some gaps after rasterization, due to blind
136 spots, instrument failures, and misalignments between the GIS’ raster
137 cell grids and the original measurement locations. In these cases,
138 r.fill.stats should typically be run using the "weighted mean"
139 (default) method and with a small distance setting (the default is to
140 use a search radius of just three cells).
141
142 The images below show a gradiometer dataset with gaps and its interpo‐
143 lated equivalent, produced using the spatially weighted mean operator
144 (mode="wmean").
145
146 In addition, r.fill.stats can be useful for raster map generalization.
147 Often, this involves removing small clumps of categorized cells and
148 then filling the resulting data gaps without "inventing" any new val‐
149 ues. In such cases, the "mode" or "median" interpolators should be
150 used.
151
152 Usage
153 The most critical user-provided settings are the interpolation/gap
154 filling method (mode) and the maximum distance, up to which
155 r.fill.stats will scan for cells with values (distance). The distance
156 can be expressed as a number of cells (default) or in the current GRASS
157 location’s units (if the -m flag is given). The latter are typically
158 meters, but can be any other units of a planar coordinate system.
159
160 Note that proper handling of geodetic coordinates (lat/lon) and dis‐
161 tances is currently not implemented. For lat/lon data, the distance
162 should therefore be specified in cells and usage of r.fill.stats should
163 be restricted to small areas to avoid large inaccuracies that may arise
164 from the different extents of cells along the latitudinal and longitu‐
165 dinal axes.
166
167 Distances specified in map units (-m flag) will be approximated as
168 accurately as the current region’s cell resolution settings allow. The
169 program will warn if the distance cannot be expressed as whole cells at
170 the current region’s resolution. In such case, the number of cells in
171 the search window will be rounded up. Due to the rounding effect intro‐
172 duced by using cells as spatial units, the actual maximum distance con‐
173 sidered by the interpolation may be up to half a cell diagonal larger
174 than the one specified by the user.
175
176 The interpolator type "wmean" uses a precomputed matrix of spatial
177 weights to speed up computation. This matrix can be examined (printed
178 to the screen) before running the interpolation, by setting the -w
179 flag. In mode "wmean", the power option has the usual meaning in IDW:
180 higher values mean that cell values in the neighborhood lose their
181 influence on the cell to be interpolated more rapidly with increasing
182 distance from the latter. Another way of explaining this effect is to
183 state that larger "power" settings result in more localized interpola‐
184 tion, smaller ones in more globalized interpolation. The default set‐
185 ting is power=2.0.
186
187 The interpolators "mean", "median" and "mode" are calculated from all
188 cell values within the search radius. No spatial weighting is applied
189 for these methods. The "mode" of the input data may not always be
190 unique. In such case, the mode will be the smallest value with the
191 highest frequency.
192
193 Often, input data will contain spurious extreme measurements (spikes,
194 outliers, noise) caused by the limits of device sensitivity, hardware
195 defects, environmental influences, etc. If the normal, valid range of
196 input data is known beforehand, then the minimum and maximum options
197 can be used to exclude those input cells that have values below or
198 above that range, respectively. This will prevent the influence of
199 spikes and outliers from spreading through the interpolation.
200
201 Unless the -k (keep) flag is given, data cells of the input map will be
202 replaced with interpolated values instead of preserving them in the
203 output. In modes "wmean" and "mean", this results in a smoothing effect
204 that includes the original data (see below)!
205
206 Besides the result of the interpolation/gap filling, a second output
207 map can be specified via the uncertainty option. The cell values in
208 this map represent a simple measure of how much uncertainty was
209 involved in interpolating each cell value of the primary output map,
210 with "0" being the lowest and "1" being the theoretic highest uncer‐
211 tainty. Uncertainty is measured by summing up all cells in the neigh‐
212 borhood (defined by the search radius distance) that contain a value in
213 the input map, multiplied by their weights, and dividing the result by
214 the sum of all weights in the neighborhood. For mode=wmean, this means
215 that interpolated output cells that were computed from many nearby
216 input cells have very low uncertainty and vice versa. For all other
217 modes, all weights in the neighborhood are constant "1" and the uncer‐
218 tainty measure is a simple measure of how many input data cells were
219 present in the search window.
220
221 Smoothing
222 The r.fill.stats module uses the interpolated values to adjust the
223 original values and create a smooth surface, which is akin to running a
224 low-pass filter on the data. Since most high-resolution sensor data is
225 noisy, this is normally a desired effect and results in output that is
226 more suitable for deriving secondary products, such as slope, aspect
227 and curvature maps. Larger settings for the search radius (distance)
228 will result in a stronger smoothing. In practice, some experimentation
229 with different settings for distance and power might be required to
230 achieve good results. In some cases (e.g. when dealing with low-noise
231 or classified data), it might be desirable to turn off data smoothing
232 by setting the -k (keep) flag. This will ensure that the original cell
233 data is copied through to the result map without alteration.
234
235 Effect of smoothing the original data: The top row shows a gap-filled
236 surface computed from a rasterized Lidar point cloud (using mode=wmean
237 and power=2), and the derived slope, aspect, and profile curvature
238 maps. The smoothing effect is clearly visible. The bottom row shows
239 the effect of setting the -k flag: Preserving the original cell values
240 in the interpolated output produces and unsmoothed, noisy surface, and
241 likewise noisy derivative maps. The effect can be seen in the illus‐
242 tration above: Slope, aspect, and profile curvature are computed using
243 the r.slope.aspect module, which uses a window (kernel) for computa‐
244 tions that considers only the immediate neighborhood of each cell. When
245 performed on noisy data, such local operations result in equally noisy
246 derivatives if the original data is preserved (by setting the -k flag)
247 and no smoothing is performed.
248
249 (Note that the effects of noisy data can also be avoided by using mod‐
250 ules that are not restricted to minimal kernel sizes. For example,
251 aspect and other morphometric parameters can be computed using the
252 r.param.scale module which operates with variable-size cell neighbor‐
253 hoods.)
254
255 Spatial weighting scheme
256 The key to getting good gap filling results is to understand the spa‐
257 tial weighting scheme used in mode "wmean". The weights are precomputed
258 and assigned per cell within the search window centered on the location
259 at which to interpolate/gap fill all cells within the user-specified
260 distance.
261
262 The illustration below shows the precomputed weights matrix for a
263 search distance of four cells from the center cell:
264 000.00 000.01 000.04 000.07 000.09 000.07 000.04 000.01 000.00
265 000.01 000.06 000.13 000.19 000.22 000.19 000.13 000.06 000.01
266 000.04 000.13 000.25 000.37 000.42 000.37 000.25 000.13 000.04
267 000.07 000.19 000.37 000.56 000.68 000.56 000.37 000.19 000.07
268 000.09 000.22 000.42 000.68 001.00 000.68 000.42 000.22 000.09
269 000.07 000.19 000.37 000.56 000.68 000.56 000.37 000.19 000.07
270 000.04 000.13 000.25 000.37 000.42 000.37 000.25 000.13 000.04
271 000.01 000.06 000.13 000.19 000.22 000.19 000.13 000.06 000.01
272 000.00 000.01 000.04 000.07 000.09 000.07 000.04 000.01 000.00
273 Note that the weights in such a small window drop rapidly for the
274 default setting of power=2.
275
276 If the distance is given in map units (flag -m), then the search window
277 can be modeled more accurately as a circle. The illustration below
278 shows the precomputed weights for a distance in map units that is
279 approximately equivalent to four cells from the center cell:
280 ...... ...... ...... 000.00 000.00 000.00 ...... ...... ......
281 ...... 000.00 000.02 000.06 000.09 000.06 000.02 000.00 ......
282 ...... 000.02 000.11 000.22 000.28 000.22 000.11 000.02 ......
283 000.00 000.07 000.22 000.44 000.58 000.44 000.22 000.07 000.00
284 000.00 000.09 000.28 000.58 001.00 000.58 000.28 000.09 000.00
285 000.00 000.07 000.22 000.44 000.58 000.44 000.22 000.07 000.00
286 ...... 000.02 000.11 000.22 000.28 000.22 000.11 000.02 ......
287 ...... 000.00 000.02 000.06 000.09 000.06 000.02 000.00 ......
288 ...... ...... ...... 000.00 000.00 000.00 ...... ...... ......
289
290 When using a small search radius, cells must also be set to a small
291 value. Otherwise, there may not be enough cells with data within the
292 search radius to support interpolation.
293
295 The straight-line metric used for converting distances in map units to
296 cell numbers is only adequate for planar coordinate systems. Using this
297 module with lat/lon input data will likely give inaccurate results,
298 especially when interpolating over large geographical areas.
299
300 If the distance is set to a relatively large value, processing time
301 will quickly approach and eventually exceed that of point-based inter‐
302 polation modules such as v.surf.rst.
303
304 This module can handle cells with different X and Y resolutions. How‐
305 ever, note that the weight matrix will be skewed in such cases, with
306 higher weights occurring close to the center and along the axis with
307 the higher resolution. This is because weights on the lower resolution
308 axis are less accurately calculated. The skewing effect will be
309 stronger if the difference between the X and Y axis resolution is
310 greater and a larger "power" setting is chosen. This property of the
311 weights matrix directly reflects the higher information density along
312 the higher resolution axis.
313
314 Note on printing the weights matrix (using the -w flag): the matrix
315 cannot be printed if it is too large.
316
317 The memory estimate provided by the -u flag is a lower limit on the
318 amount of RAM that will be needed.
319
320 If the -s flag is set, floating point type output will be saved as a
321 "single precision" raster map, saving ~50% disk space compared to the
322 default "double precision" output.
323
325 Gap-filling of a dataset using spatially weighted mean (IDW)
326 Gap-fill a dataset using spatially weighted mean (IDW) and a maximum
327 search radius of 3.0 map units; also produce uncertainty estimation
328 map:
329 r.fill.stats input=measurements output=result dist=3.0 -m mode=wmean uncertainty=uncert_map
330 Run a fast low-pass filter (replacement all cells with mean value of
331 neighboring cells) on the input map:
332 r.fill.stats input=measurements output=result dist=10 mode=mean
333 Fill data gaps in a categorized raster map; preserve existing data:
334 r.fill.stats input=categories output=result dist=100 -m mode=mode -k
335
336 Lidar point cloud example
337 Inspect the point density and determine the extent of the point cloud
338 using the r.in.lidar module:
339 r.in.lidar -e input=points.las output=density method=n resolution=5 class_filter=2
340 Based on the result, set computational region extent and desired reso‐
341 lution:
342 g.region -pa raster=density res=1
343 Import the point cloud as raster using binning:
344 r.in.lidar input=points.las output=ground_raw method=mean class_filter=2
345 Check that there are more non-NULL cells than NULL ("no data") cells:
346 r.univar map=ground_raw
347 total null and non-null cells: 2340900
348 total null cells: 639184
349 ...
350 Fill in the NULL cells using the default 3-cell search radius:
351 r.fill.stats input=ground output=ground_filled uncertainty=uncertainty distance=3 mode=wmean power=2.0 cells=8
352
353 Binning of Lidar and resulting ground surface with filled gaps. Note
354 the remaining NULL cells (white) in the resulting ground surface.
355 These are areas with a lack of cells with values in close proximity.
356
357 Outlier removal and gap-filling of SRTM elevation data
358 In this example, the SRTM elevation map in the North Carolina sample
359 dataset location is filtered for outlier elevation values; missing pix‐
360 els are then re-interpolated to obtain a complete elevation map:
361 g.region raster=elev_srtm_30m -p
362 d.mon wx0
363 d.histogram elev_srtm_30m
364 # remove SRTM outliers, i.e. SRTM below 50m (esp. lakes), leading to no data areas
365 r.mapcalc "elev_srtm_30m_filt = if(elev_srtm_30m < 50.0, null(), elev_srtm_30m)"
366 d.histogram elev_srtm_30m_filt
367 d.rast elev_srtm_30m_filt
368 # using the IDW method to fill these holes in DEM without low-pass filter
369 # increase distance to gap-fill larger holes
370 r.fill.stats -k input=elev_srtm_30m_filt output=elev_srtm_30m_idw distance=100
371 d.histogram elev_srtm_30m_idw
372 d.rast elev_srtm_30m_idw
373 r.mapcalc "diff_orig_idw = elev_srtm_30m - elev_srtm_30m_idw"
374 r.colors diff_orig_idw color=differences
375 r.univar -e diff_orig_idw
376 d.erase
377 d.rast diff_orig_idw
378 d.legend diff_orig_idw
379
381 r.fillnulls, r.neighbors, r.surf.idw, v.surf.idw, v.surf.rst
382
383 Inverse Distance Weighting in Wikipedia
384
386 Benjamin Ducke
387
389 Available at: r.fill.stats source code (history)
390
391 Main index | Raster index | Topics index | Keywords index | Graphical
392 index | Full index
393
394 © 2003-2020 GRASS Development Team, GRASS GIS 7.8.5 Reference Manual
395
396
397
398GRASS 7.8.5 r.fill.stats(1)