NCCOPY(1)                    UNIDATA UTILITIES                    NCCOPY(1)



NAME
       nccopy - Copy a netCDF file, optionally changing format, compression,
       or chunking in the output.

SYNOPSIS
       nccopy [-k kind_name] [-kind_code] [-d n] [-s] [-c chunkspec]
       [-u] [-w] [-[v|V] var1,...] [-[g|G] grp1,...] [-m bufsize]
       [-h chunk_cache] [-e cache_elems] [-r] infile outfile

DESCRIPTION
       The nccopy utility copies an input netCDF file in any supported
       format variant to an output netCDF file, optionally converting the
       output to any compatible netCDF format variant, compressing the
       data, or rechunking the data.  For example, if built with the
       netCDF-3 library, a netCDF classic file may be copied to a netCDF
       64-bit offset file, permitting larger variables.  If built with the
       netCDF-4 library, a netCDF classic file may also be copied to a
       netCDF-4 file or to a netCDF-4 classic model file, permitting data
       compression, efficient schema changes, larger variable sizes, and
       use of other netCDF-4 features.

       If no output format is specified with either -k kind_name or
       -kind_code, the output will use the same format as the input, unless
       the input is classic or 64-bit offset and either chunking or
       compression is specified, in which case the output will be netCDF-4
       classic model format.  Attempting a format conversion that is not
       possible results in an error.  For example, an attempt to copy a
       netCDF-4 file that uses features of the enhanced model, such as
       groups or variable-length strings, to any of the netCDF formats
       that use the classic model will result in an error.

       nccopy also serves as an example of a generic netCDF-4 program, with
       its ability to read any valid netCDF file and handle nested groups,
       strings, and user-defined types, including arbitrarily nested
       compound types, variable-length types, and data of any valid
       netCDF-4 type.

       If DAP support was enabled when nccopy was built, the file name may
       specify a DAP URL.  This may be used to convert data on DAP servers
       to local netCDF files.

OPTIONS
       -k kind_name
              Use the format name to specify the kind of file to be
              created and, by inference, the data model (i.e. netCDF-3
              (classic) or netCDF-4 (enhanced)).  The possible arguments
              are:

                  'nc3' or 'classic' => netCDF classic format

                  'nc6' or '64-bit offset' => netCDF 64-bit offset format

                  'nc4' or 'netCDF-4' => netCDF-4 format (enhanced data
                  model)

                  'nc7' or 'netCDF-4 classic model' => netCDF-4 classic
                  model format

              Note: The old format numbers '1', '2', '3', '4', equivalent
              to the format names 'nc3', 'nc6', 'nc4', and 'nc7'
              respectively, are still accepted but deprecated, due to easy
              confusion between format numbers and format names.

       [-kind_code]
              Use a numeric format code (instead of a format name) to
              specify the kind of file to be created and, by inference,
              the data model (i.e. netCDF-3 (classic) versus netCDF-4
              (enhanced)).  The numeric codes are:

                  3 => netCDF classic format

                  6 => netCDF 64-bit offset format

                  4 => netCDF-4 format (enhanced data model)

                  7 => netCDF-4 classic model format

              The numeric code "7" is used because "7 = 3 + 4", denoting
              the format that combines the netCDF-3 data model (for
              compatibility) with the netCDF-4 storage format (for
              performance).  Credit is due to NCO for the use of these
              numeric codes instead of the old and confusing format
              numbers.

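              For instance, the same netCDF-4 output could be requested
              either by format name or by numeric code (the file names
              here are illustrative):

```shell
# Request netCDF-4 output by format name:
nccopy -k nc4 input.nc output.nc

# Equivalent request using the numeric format code:
nccopy -4 input.nc output.nc
```
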
       -d n   For netCDF-4 output, including netCDF-4 classic model,
              specify the deflation level (level of compression) for
              variable data in the output.  0 corresponds to no
              compression and 9 to maximum compression, with higher levels
              of compression requiring marginally more time to compress or
              uncompress than lower levels.  The compression achieved may
              also depend on the output chunking parameters.  If this
              option is specified for a classic format or 64-bit offset
              format input file, it is not necessary to also specify that
              the output should be netCDF-4 classic model, as that will be
              the default.  If this option is not specified and the input
              file has compressed variables, the compression will still be
              preserved in the output, using the same chunking as in the
              input by default.

              Note that nccopy requires all variables to be compressed
              using the same compression level, but the API has no such
              restriction.  With a program you can customize compression
              for each variable independently.

       -s     For netCDF-4 output, including netCDF-4 classic model,
              specify shuffling of variable data bytes before compression
              or after decompression.  Shuffling refers to interlacing the
              bytes in a chunk so that the first bytes of all values are
              contiguous in storage, followed by all the second bytes, and
              so on, which often improves compression.  This option is
              ignored unless a nonzero deflation level is specified.
              Using -d0 to specify no deflation on input data that has
              been compressed and shuffled turns off both compression and
              shuffling in the output.

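              As a sketch of combining these options (file names
              illustrative), shuffling takes effect here because a nonzero
              deflation level accompanies it:

```shell
# Compress all variables at deflation level 2 with byte shuffling;
# classic-format input would yield netCDF-4 classic model output:
nccopy -d2 -s input.nc compressed.nc
```
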
       -u     Convert any unlimited-size dimensions in the input to fixed-
              size dimensions in the output.  This can speed up variable-
              at-a-time access, but slow down record-at-a-time access to
              multiple variables along an unlimited dimension.

       -w     Keep the output in memory (as a diskless netCDF file) until
              the output is closed, at which time the output file is
              written to disk.  This can greatly speed up operations such
              as converting an unlimited dimension to fixed size (the -u
              option), chunking, rechunking, or compressing the input.  It
              requires that available memory is large enough to hold the
              output file.  This option may provide a larger speedup than
              careful tuning of the -m, -h, or -e options, and it is
              certainly a lot simpler.

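              A hypothetical invocation combining -u and -w (file names
              illustrative) might look like:

```shell
# Convert the unlimited dimension to fixed size, building the output
# in memory and writing it to disk only on close:
nccopy -u -w records.nc fixed.nc
```
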
       -c chunkspec
              For netCDF-4 output, including netCDF-4 classic model,
              specify chunking (multidimensional tiling) for variable data
              in the output.  This is useful to specify the units of disk
              access, compression, or other filters such as checksums.
              Changing the chunking in a netCDF file can also greatly
              speed up access, by choosing chunk shapes that are
              appropriate for the most common access patterns.

              The chunkspec argument is a string of comma-separated
              associations, each specifying a dimension name, a '/'
              character, and optionally the corresponding chunk length for
              that dimension.  No blanks should appear in the chunkspec
              string, except possibly escaped blanks that are part of a
              dimension name.  A chunkspec names at least one dimension,
              and may omit dimensions which are not to be chunked or for
              which the default chunk length is desired.  If a dimension
              name is followed by a '/' character but no subsequent chunk
              length, the actual dimension length is assumed.  If copying
              a classic model file to a netCDF-4 output file and not
              naming all dimensions in the chunkspec, unnamed dimensions
              will also use the actual dimension length for the chunk
              length.  An example of a chunkspec for variables that use
              'm' and 'n' dimensions might be 'm/100,n/200' to specify 100
              by 200 chunks.  To see the chunking resulting from copying
              with a chunkspec, use the '-s' option of ncdump on the
              output file.

              The chunkspec '/' that omits all dimension names and
              corresponding chunk lengths specifies that no chunking is to
              occur in the output, so it can be used to unchunk all the
              chunked variables.

              As an I/O optimization, nccopy has a threshold for the
              minimum size of non-record variables that get chunked,
              currently 8192 bytes.  In the future, use of this threshold
              and its size may be made settable with an option.

              Note that nccopy requires variables that share a dimension
              to also share the chunk size associated with that dimension,
              but the programming interface has no such restriction.  If
              you need to customize chunking for variables independently,
              you will need to use the library API in a custom utility
              program.

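              For example, chunking could be removed entirely with the
              bare '/' chunkspec (file names illustrative):

```shell
# Unchunk all chunked variables, writing them contiguously:
nccopy -c / chunked.nc unchunked.nc
```
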
       -v var1,...
              The output will include data values for the specified
              variables, in addition to the declarations of all
              dimensions, variables, and attributes.  One or more
              variables must be specified by name in the comma-delimited
              list following this option.  The list must be a single
              argument to the command, and hence cannot contain unescaped
              blanks or other white space characters.  The named variables
              must be valid netCDF variables in the input file.  A
              variable within a group in a netCDF-4 file may be specified
              with an absolute path name, such as '/GroupA/GroupA2/var'.
              Use of a relative path name such as 'var' or 'grp/var'
              specifies all matching variable names in the file.  The
              default, without this option, is to include data values for
              all variables in the output.

       -V var1,...
              The output will include only the specified variables,
              together with all dimensions and global or group attributes.
              One or more variables must be specified by name in the
              comma-delimited list following this option.  The list must
              be a single argument to the command, and hence cannot
              contain unescaped blanks or other white space characters.
              The named variables must be valid netCDF variables in the
              input file.  A variable within a group in a netCDF-4 file
              may be specified with an absolute path name, such as
              '/GroupA/GroupA2/var'.  Use of a relative path name such as
              'var' or 'grp/var' specifies all matching variable names in
              the file.  The default, without this option, is to include
              all variables in the output.

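              As an illustration of the difference, with hypothetical
              variable names:

```shell
# Declarations of everything, but data for only two variables:
nccopy -v temperature,pressure in.nc subset_data.nc

# Only the two variables themselves, plus dimensions and global
# or group attributes:
nccopy -V temperature,pressure in.nc subset_vars.nc
```
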
       -g grp1,...
              The output will include data values only for the specified
              groups.  One or more groups must be specified by name in the
              comma-delimited list following this option.  The list must
              be a single argument to the command.  The named groups must
              be valid netCDF groups in the input file.  The default,
              without this option, is to include data values for all
              groups in the output.

       -G grp1,...
              The output will include only the specified groups.  One or
              more groups must be specified by name in the comma-delimited
              list following this option.  The list must be a single
              argument to the command.  The named groups must be valid
              netCDF groups in the input file.  The default, without this
              option, is to include all groups in the output.

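              With a hypothetical group name, the two options differ
              analogously to -v and -V:

```shell
# Keep all group declarations, but data only for /GroupA:
nccopy -g /GroupA in.nc groupa_data.nc

# Keep only /GroupA itself in the output:
nccopy -G /GroupA in.nc groupa_only.nc
```
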
       -m bufsize
              An integer or floating-point number that specifies the size,
              in bytes, of the copy buffer used to copy large variables.
              A suffix of K, M, G, or T multiplies the copy buffer size by
              one thousand, million, billion, or trillion, respectively.
              The default is 5 Mbytes, but it will be increased if
              necessary to hold at least one chunk of netCDF-4 chunked
              variables in the input file.  You may want to specify a
              value larger than the default for copying large files over
              high-latency networks.  Using the '-w' option may provide
              better performance, if the output fits in memory.

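              For example, a larger copy buffer can be requested with a
              size suffix (the value and file names are illustrative, not
              a recommendation):

```shell
# Use a 100-megabyte copy buffer instead of the 5-Mbyte default:
nccopy -m 100M remote_input.nc local_copy.nc
```
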
       -h chunk_cache
              For netCDF-4 output, including netCDF-4 classic model, an
              integer or floating-point number that specifies the size, in
              bytes, of the chunk cache allocated for each chunked
              variable.  This is not a property of the file, but merely a
              performance tuning parameter for avoiding compressing or
              decompressing the same data multiple times while copying and
              changing chunk shapes.  A suffix of K, M, G, or T multiplies
              the chunk cache size by one thousand, million, billion, or
              trillion, respectively.  The default is 4.194304 Mbytes (or
              whatever was specified for the configure-time constant
              CHUNK_CACHE_SIZE when the netCDF library was built).
              Ideally, the nccopy utility would accept only one memory
              buffer size and divide it optimally between a copy buffer
              and chunk cache, but no general algorithm for computing the
              optimum chunk cache size has been implemented yet.  Using
              the '-w' option may provide better performance, if the
              output fits in memory.

       -e cache_elems
              For netCDF-4 output, including netCDF-4 classic model,
              specifies the number of chunks that the chunk cache can
              hold.  A suffix of K, M, G, or T multiplies the number of
              chunks that can be held in the cache by one thousand,
              million, billion, or trillion, respectively.  This is not a
              property of the file, but merely a performance tuning
              parameter for avoiding compressing or decompressing the same
              data multiple times while copying and changing chunk shapes.
              The default is 1009 (or whatever was specified for the
              configure-time constant CHUNK_CACHE_NELEMS when the netCDF
              library was built).  Ideally, the nccopy utility would
              determine an optimum value for this parameter, but no
              general algorithm for computing the optimum number of chunk
              cache elements has been implemented yet.

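              A sketch of tuning both cache parameters while rechunking
              (the values and file names are illustrative, not
              recommendations):

```shell
# Allow a 64-megabyte chunk cache holding up to 2000 chunks per
# chunked variable while changing chunk shapes along 'time':
nccopy -h 64M -e 2K -c time/1000 in.nc rechunked.nc
```
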
       -r     Read a netCDF classic or 64-bit offset input file into a
              diskless netCDF file in memory before copying.  Requires
              that the input file be small enough to fit into memory.  For
              nccopy, this does not seem to provide any significant
              speedup, so it may not be a useful option.

EXAMPLES
       Make a copy of foo1.nc, a netCDF file of any type, to foo2.nc, a
       netCDF file of the same type:

           nccopy foo1.nc foo2.nc

       Note that the above copy will not be as fast as using cp or another
       simple copy utility, because the file is copied using only the
       netCDF API.  If the input file has extra bytes after the end of the
       netCDF data, those will not be copied, because they are not
       accessible through the netCDF interface.  If the original file was
       generated in "no fill" mode, so that fill values are not stored as
       padding for data alignment, the output file may have different
       padding bytes.

       Convert a netCDF-4 classic model file, compressed.nc, that uses
       compression, to a netCDF-3 classic file, classic.nc:

           nccopy -k classic compressed.nc classic.nc

       Note that 'nc3' could be used instead of 'classic'.

       Download the variable 'time_bnds' and its associated attributes
       from an OPeNDAP server and copy the result to a netCDF file named
       'tb.nc':

           nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc

       Note that URLs that name specific variables as command-line
       arguments should generally be quoted, to avoid the shell
       interpreting special characters such as '?'.

       Compress all the variables in the input file foo.nc, a netCDF file
       of any type, to the output file bar.nc:

           nccopy -d1 foo.nc bar.nc

       If foo.nc was a classic or 64-bit offset netCDF file, bar.nc will
       be a netCDF-4 classic model netCDF file, because the classic and
       64-bit offset format variants don't support compression.  If foo.nc
       was a netCDF-4 file with some variables compressed using various
       deflation levels, the output will also be a netCDF-4 file of the
       same type, but all the variables, including any uncompressed
       variables in the input, will now use deflation level 1.

       Assume the input data includes gridded variables that use time,
       lat, and lon dimensions, with 1000 times by 1000 latitudes by 1000
       longitudes, and that the time dimension varies most slowly.  Also
       assume that users want quick access to data at all times for a
       small set of lat-lon points.  Accessing data for 1000 times would
       typically require accessing 1000 disk blocks, which may be slow.

       Reorganizing the data into chunks on disk that have all the times
       in each chunk for a few lat and lon coordinates would greatly speed
       up such access.  To chunk the data in the input file slow.nc, a
       netCDF file of any type, to the output file fast.nc, you could use:

           nccopy -c time/1000,lat/40,lon/40 slow.nc fast.nc

       to specify data chunks of 1000 times, 40 latitudes, and 40
       longitudes.  If you had enough memory to contain the output file,
       you could speed up the rechunking operation significantly by
       creating the output in memory before writing it to disk on close:

           nccopy -w -c time/1000,lat/40,lon/40 slow.nc fast.nc

SEE ALSO
       ncdump(1), ncgen(1), netcdf(3)



Release 4.2                      2012-03-08                       NCCOPY(1)