i.cluster(1)                  Grass User's Manual                 i.cluster(1)

NAME
i.cluster - Generates spectral signatures for land cover types in an
image using a clustering algorithm.
The resulting signature file is used as input for i.maxlik, to generate
an unsupervised image classification.

KEYWORDS
imagery, classification, signatures

SYNOPSIS
i.cluster
i.cluster --help
i.cluster group=name subgroup=name signaturefile=name classes=integer
    [seed=name] [sample=row_interval,col_interval] [iterations=integer]
    [convergence=float] [separation=float] [min_size=integer]
    [reportfile=name] [--overwrite] [--help] [--verbose] [--quiet]
    [--ui]

Flags:
--overwrite
    Allow output files to overwrite existing files

--help
    Print usage summary

--verbose
    Verbose module output

--quiet
    Quiet module output

--ui
    Force launching GUI dialog

Parameters:
group=name [required]
    Name of input imagery group

subgroup=name [required]
    Name of input imagery subgroup

signaturefile=name [required]
    Name for output file containing result signatures

classes=integer [required]
    Initial number of classes
    Options: 1-255

seed=name
    Name of file containing initial signatures

sample=row_interval,col_interval
    Sampling intervals (by row and col); default: ~10,000 pixels

iterations=integer
    Maximum number of iterations
    Default: 30

convergence=float
    Percent convergence
    Options: 0-100
    Default: 98.0

separation=float
    Cluster separation
    Default: 0.0

min_size=integer
    Minimum number of pixels in a class
    Default: 17

reportfile=name
    Name for output file containing final report

DESCRIPTION
i.cluster performs the first pass in the two-pass unsupervised
classification of imagery, while the GRASS module i.maxlik executes the
second pass. Both commands must be run to complete the unsupervised
classification.

i.cluster is a clustering algorithm (a modification of the k-means
clustering algorithm) that reads through the (raster) imagery data and
builds pixel clusters based on the spectral reflectances of the pixels
(see Figure). The pixel clusters are imagery categories that can be
related to land cover types on the ground. The spectral distributions
of the clusters (e.g., land cover spectral signatures) are influenced
by six parameters set by the user. One of these parameters is the
initial number of clusters to be discriminated.

Fig.: Land use/land cover clustering of LANDSAT scene (simplified)

i.cluster starts by generating spectral signatures for this number of
clusters and "attempts" to end up with this number of clusters during
the clustering process. The resulting number of clusters and their
spectral distributions, however, are also influenced by the range of
the spectral values (category values) in the image files and the other
parameters set by the user. These parameters are: the minimum cluster
size, the minimum cluster separation, the percent convergence, the
maximum number of iterations, and the row and column sampling
intervals.

The cluster spectral signatures that result are composed of cluster
means and covariance matrices. These cluster means and covariance
matrices are used in the second pass (i.maxlik) to classify the image.
The resulting clusters, or spectral classes, can be related to land
cover types on the ground.

The user has to specify the name of the group, the name of the
subgroup, the name of a file to contain the resulting signatures, the
initial number of clusters to be discriminated, and optionally other
parameters (see below). The group should contain the imagery files that
the user wishes to classify. The subgroup is a subset of this group and
should contain only the imagery band files that the user wishes to
classify. Note that this subgroup must contain more than one band file.
The user must create a group and subgroup by running the GRASS program
i.group before running i.cluster. The purpose of the group and subgroup
is to collect map layers for classification or analysis. The
signaturefile is the file to contain the resulting signatures, which
can be used as input for i.maxlik. The classes value is the initial
number of clusters to be discriminated; any parameter values left
unspecified are set to their default values.
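
As an illustration of what a signature made of cluster means and a
covariance matrix holds, the following minimal Python sketch computes
both for the pixels of a single cluster. It is not taken from the
i.cluster source: the function name signature(), the sample values, and
the use of the sample covariance (division by n - 1) are assumptions
made for this example only.

```python
from typing import List, Sequence, Tuple


def signature(pixels: Sequence[Sequence[float]]) -> Tuple[List[float], List[List[float]]]:
    """Return (band means, covariance matrix) for the pixels of one cluster.

    Each pixel is a sequence with one value per band. The covariance is the
    sample covariance (divides by n - 1); this normalisation is an assumption
    of the sketch, not a statement about i.cluster.
    """
    n = len(pixels)
    if n < 2:
        raise ValueError("need at least two pixels to estimate a covariance")
    nbands = len(pixels[0])
    means = [sum(p[b] for p in pixels) / n for b in range(nbands)]
    cov = [
        [
            sum((p[i] - means[i]) * (p[j] - means[j]) for p in pixels) / (n - 1)
            for j in range(nbands)
        ]
        for i in range(nbands)
    ]
    return means, cov


# Hypothetical two-band cluster (values invented for the illustration).
cluster = [(10.0, 40.0), (12.0, 44.0), (14.0, 48.0)]
means, cov = signature(cluster)
print(means)
print(cov)
```

For the three hypothetical pixels the means are 12 and 44 and the
covariance matrix is [[4, 8], [8, 16]].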

Parameters:
group=name
    The name of the group which contains the imagery files that the
    user wishes to classify.

subgroup=name
    The name of the subset of the group specified in the group option,
    which must contain only imagery band files and more than one band
    file. The user must create a group and a subgroup by running the
    GRASS program i.group before running i.cluster.

signaturefile=name
    The name assigned to the output signature file, which contains the
    signatures of the classes and can be used as the input file for the
    GRASS program i.maxlik for an unsupervised classification.

classes=value
    The number of clusters that will initially be identified in the
    clustering process before the iterations begin.

seed=name
    The name of a seed signature file (optional). The seed signatures
    are signatures that contain cluster means and covariance matrices
    which were calculated prior to the current run of i.cluster. They
    may be acquired from a previous run of i.cluster or from a
    supervised classification signature training site section (e.g.,
    using the signature file output by g.gui.iclass). The purpose of
    seed signatures is to optimize the cluster decision boundaries
    (means) for the number of clusters specified.

sample=row_interval,col_interval
    These numbers are optional, with default values based on the size
    of the data set such that the total number of pixels to be
    processed is approximately 10,000 (allowing for rounding).

iterations=value
    This parameter sets the maximum number of iterations, which should
    be greater than the number of iterations expected to be needed to
    reach the desired percent convergence. The default value is 30. If
    the number of iterations reaches the maximum designated by the
    user, the user may want to rerun i.cluster with a higher number of
    iterations (see reportfile).
    Default: 30

convergence=value
    A high percent convergence is the point at which cluster means
    become stable during the iteration process. The default value is
    98.0 percent. When clusters are being created, their means
    constantly change as pixels are assigned to them and the means are
    recalculated to include the new pixel. After all clusters have
    been created, i.cluster begins iterations that change cluster means
    by maximizing the distances between them. As these means shift, a
    higher and higher convergence is approached. Because means will
    never become totally static, a percent convergence and a maximum
    number of iterations are supplied to stop the iterative process.
    The percent convergence should be reached before the maximum number
    of iterations. If the maximum number of iterations is reached, it
    is probable that the desired percent convergence was not reached.
    The number of iterations is reported in the cluster statistics in
    the report file (see reportfile).
    Default: 98.0

separation=value
    This is the minimum separation below which clusters will be merged
    in the iteration process. The default value is 0.0. This is an
    image-specific number (a "magic" number) that depends on the image
    data being classified and the number of final clusters that are
    acceptable. Its determination requires experimentation. Note that
    as the minimum class (or cluster) separation is increased, the
    maximum number of iterations should also be increased to achieve
    this separation with a high percentage of convergence (see
    convergence).
    Default: 0.0

min_size=value
    This is the minimum number of pixels that will be used to define a
    cluster, and is therefore the minimum number of pixels for which
    means and covariance matrices will be calculated.
    Default: 17

reportfile=name
    The reportfile is an optional output file which contains the
    result, i.e., the statistics for each cluster. Also included are
    the resulting percent convergence for the clusters, the number of
    iterations that were required to achieve the convergence, and the
    separability matrix.

NOTES
Sampling method
i.cluster does not cluster all pixels, but only a sample (see parameter
sample). The result of that clustering is not that all pixels are
assigned to a given cluster; essentially, only signatures which are
representative of a given cluster are generated. When i.cluster is run
on the same data with the same number of classes but with different
sample sizes, slightly different signatures for each cluster are likely
to be obtained at each run.
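
The effect of the sample option on the number of pixels that are
clustered can be estimated with simple arithmetic. The Python sketch
below is an illustration only: it assumes that sampling starts at the
first row and column, and the helper intervals_for_target() is a
hypothetical way of choosing intervals for a target of about 10,000
pixels, not necessarily the rule used inside i.cluster.

```python
import math


def sampled_pixel_count(rows: int, cols: int, row_interval: int, col_interval: int) -> int:
    """Pixels visited when every row_interval-th row and col_interval-th
    column is taken, starting at the first row and column (an assumption)."""
    sampled_rows = len(range(0, rows, row_interval))
    sampled_cols = len(range(0, cols, col_interval))
    return sampled_rows * sampled_cols


def intervals_for_target(rows: int, cols: int, target: int = 10_000):
    """Hypothetical helper: aim for about sqrt(target) samples per axis."""
    per_axis = math.isqrt(target)            # 100 for the 10,000 pixel target
    return max(1, rows // per_axis), max(1, cols // per_axis)


# Region sizes below are invented for the illustration.
for rows, cols in [(1500, 2000), (475, 527)]:
    row_interval, col_interval = intervals_for_target(rows, cols)
    count = sampled_pixel_count(rows, cols, row_interval, col_interval)
    print(rows, cols, row_interval, col_interval, count)
```

For a region of 1500 x 2000 cells the sketch picks intervals of 15 and
20 and visits exactly 10,000 pixels; for 475 x 527 cells it picks 4 and
5 and visits 12,614 pixels, because the intervals are whole numbers.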

Algorithm used for i.cluster
The algorithm uses input parameters set by the user: the initial number
of clusters, the minimum distance between clusters, the desired
correspondence between iterations, and the minimum size for each
cluster. It also asks whether all pixels are to be clustered or only
every "x"th row and "y"th column (sampling), and for the maximum number
of iterations to be carried out.

In the 1st pass, initial cluster means for each band are defined by
giving the first cluster a value equal to the band mean minus its
standard deviation, and the last cluster a value equal to the band mean
plus its standard deviation, with all other cluster means equally
spaced in between. Each pixel is then assigned to the class which it is
closest to, distance being measured as Euclidean distance. All clusters
separated by less than the user-specified minimum distance are then
merged. If a cluster has fewer than the user-specified minimum number
of pixels, all those pixels are reassigned to the next nearest cluster.
New cluster means are calculated for each band as the average of raster
pixel values in that band for all pixels present in that cluster.
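
The seeding and assignment steps of the 1st pass can be sketched in
Python as follows. This is a simplified reading of the description
above, not the i.cluster implementation: the function names, the pixel
values, the use of the population standard deviation, and the handling
of a single class are assumptions, and merging of close or small
clusters is left out.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Euclidean distance between two pixels given as band-value sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_means(pixels, classes):
    """Spread the initial cluster means evenly from (band mean - std) to
    (band mean + std) in every band."""
    bands = list(zip(*pixels))                  # one tuple of values per band
    centers = []
    for k in range(classes):
        # t runs from 0 (first cluster) to 1 (last cluster); a single
        # class is placed at the band mean (assumption of this sketch).
        t = k / (classes - 1) if classes > 1 else 0.5
        centers.append([mean(b) - pstdev(b) + t * 2.0 * pstdev(b) for b in bands])
    return centers


def nearest_cluster(pixel, centers):
    """Index of the cluster mean closest to the pixel."""
    return min(range(len(centers)), key=lambda k: euclidean(pixel, centers[k]))


# Hypothetical two-band sample (values invented for the illustration).
pixels = [(2.0, 10.0), (4.0, 20.0), (4.0, 20.0), (4.0, 20.0),
          (5.0, 25.0), (5.0, 25.0), (7.0, 35.0), (9.0, 45.0)]
centers = initial_means(pixels, classes=3)
labels = [nearest_cluster(p, centers) for p in pixels]
print(centers)
print(labels)
```

With these values the band means are 5 and 25 and the standard
deviations 2 and 10, so the three initial cluster means are (3, 15),
(5, 25) and (7, 35).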

In the 2nd pass, pixels are again reassigned to clusters based on the
new cluster means. The cluster means are then recalculated. This
process is repeated until the correspondence between iterations reaches
a user-specified level or until the specified maximum number of
iterations is reached, whichever comes first.
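
The stopping rule of the 2nd pass can be illustrated with a short
Python sketch. It assumes that the correspondence between iterations is
measured as the percentage of pixels that keep their cluster from one
iteration to the next; this measure, the function name
iterate_clusters() and the sample data are assumptions of the example,
and the merging of close or small clusters is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two pixels given as band-value sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def iterate_clusters(pixels, centers, convergence=98.0, iterations=30):
    """Reassign pixels and recompute means until the percentage of pixels
    that did not change cluster reaches `convergence`, or until
    `iterations` passes have been made, whichever comes first."""
    centers = [list(c) for c in centers]        # work on a copy
    labels = [None] * len(pixels)
    stable = 0.0
    used = 0
    for used in range(1, iterations + 1):
        new_labels = [
            min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
            for p in pixels
        ]
        unchanged = sum(old == new for old, new in zip(labels, new_labels))
        stable = 100.0 * unchanged / len(pixels)
        labels = new_labels
        # Recompute each mean; an empty cluster keeps its previous mean.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(band) / len(members) for band in zip(*members)]
        if stable >= convergence:
            break
    return centers, stable, used


# Hypothetical two-band sample with two well separated groups.
pixels = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centers, stable, used = iterate_clusters(pixels, [[0, 0], [5, 5]])
print(final_centers, stable, used)
```

In this example the assignments no longer change in the second
iteration, so the loop stops after two iterations with 100 percent of
the pixels stable, well before the limit of 30 iterations.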

EXAMPLE
Preparing the statistics for unsupervised classification of a LANDSAT
subscene in North Carolina:
    g.region raster=lsat7_2002_10 -p
    # store VIZ, NIR, MIR into group/subgroup (leaving out TIR)
    i.group group=lsat7_2002 subgroup=lsat7_2002 \
      input=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_70
    # generate signature file and report
    i.cluster group=lsat7_2002 subgroup=lsat7_2002 \
      signaturefile=sig_cluster_lsat2002 \
      classes=10 reportfile=rep_clust_lsat2002.txt
To complete the unsupervised classification, i.maxlik is subsequently
used. See example in its manual page.

SEE ALSO
· Image classification wiki page

· Historical reference: see also the GRASS GIS 4 Image Processing
  manual (PDF)

· Wikipedia article on k-means clustering (note that i.cluster uses a
  modification of the k-means clustering algorithm)

g.gui.iclass, i.group, i.gensig, i.maxlik, i.segment, i.smap, r.kappa

AUTHORS
Michael Shapiro, U.S. Army Construction Engineering Research Laboratory
Tao Wen, University of Illinois at Urbana-Champaign, Illinois

Last changed: $Date: 2018-03-04 13:06:43 +0100 (Sun, 04 Mar 2018) $

SOURCE CODE
Available at: i.cluster source code (history)

© 2003-2019 GRASS Development Team, GRASS GIS 7.4.4 Reference Manual

GRASS 7.4.4                                                       i.cluster(1)