NCCOPY(1)                      UNIDATA UTILITIES                     NCCOPY(1)

NAME

       nccopy - Copy a netCDF file, optionally changing format, compression,
       or chunking in the output.

SYNOPSIS

       nccopy [-k kind_name] [-kind_code] [-d n] [-s] [-c chunkspec]
              [-u] [-w] [-[v|V] var1,...] [-[g|G] grp1,...] [-m bufsize]
              [-h chunk_cache] [-e cache_elems] [-r] infile outfile

DESCRIPTION

       The nccopy utility copies an input netCDF file in any supported format
       variant to an output netCDF file, optionally converting the output to
       any compatible netCDF format variant, compressing the data, or
       rechunking the data.  For example, if built with the netCDF-3 library,
       a netCDF classic file may be copied to a netCDF 64-bit offset file,
       permitting larger variables.  If built with the netCDF-4 library, a
       netCDF classic file may be copied to a netCDF-4 file or to a netCDF-4
       classic model file as well, permitting data compression, efficient
       schema changes, larger variable sizes, and use of other netCDF-4
       features.

       If no output format is specified, with either -k kind_name or
       -kind_code, then the output will use the same format as the input,
       unless the input is classic or 64-bit offset and either chunking or
       compression is specified, in which case the output will be netCDF-4
       classic model format.  Attempting some kinds of format conversion will
       result in an error, if the conversion is not possible.  For example,
       an attempt to copy a netCDF-4 file that uses features of the enhanced
       model, such as groups or variable-length strings, to any of the other
       kinds of netCDF formats that use the classic model will result in an
       error.

       nccopy also serves as an example of a generic netCDF-4 program, with
       its ability to read any valid netCDF file and handle nested groups,
       strings, and user-defined types, including arbitrarily nested compound
       types, variable-length types, and data of any valid netCDF-4 type.

       If DAP support was enabled when nccopy was built, the input file name
       may be a DAP URL.  This may be used to convert data on DAP servers to
       local netCDF files.

OPTIONS

       -k kind_name
              Use format name to specify the kind of file to be created and,
              by inference, the data model (i.e. netCDF-3 (classic) or
              netCDF-4 (enhanced)).  The possible arguments are:

                     'nc3' or 'classic' => netCDF classic format

                     'nc6' or '64-bit offset' => netCDF 64-bit offset format

                     'nc4' or 'netCDF-4' => netCDF-4 format (enhanced data
                     model)

                     'nc7' or 'netCDF-4 classic model' => netCDF-4 classic
                     model format

              Note: The old format numbers '1', '2', '3', '4', equivalent to
              the format names 'nc3', 'nc6', 'nc4', and 'nc7' respectively,
              are still accepted but deprecated, due to easy confusion
              between format numbers and format names.

       -kind_code
              Use format numeric code (instead of format name) to specify the
              kind of file to be created and, by inference, the data model
              (i.e. netCDF-3 (classic) versus netCDF-4 (enhanced)).  The
              numeric codes are:

                     3 => netCDF classic format

                     6 => netCDF 64-bit offset format

                     4 => netCDF-4 format (enhanced data model)

                     7 => netCDF-4 classic model format

              The numeric code "7" is used because "7=3+4", specifying the
              format that combines the netCDF-3 data model, for
              compatibility, with the netCDF-4 storage format, for
              performance.  Credit is due to NCO for use of these numeric
              codes instead of the old and confusing format numbers.
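
              As a sketch of how the format names and numeric codes line up,
              the shell fragment below maps a kind name to the equivalent
              numeric option.  The helper kind_flag is hypothetical (it is
              not part of nccopy), and in.nc and out.nc are placeholder file
              names.

```shell
# kind_flag is a hypothetical helper, not part of nccopy: it prints the
# numeric-code option that corresponds to a -k kind name.
kind_flag() {
    case "$1" in
        nc3|classic)                  echo "-3" ;;
        nc6|"64-bit offset")          echo "-6" ;;
        nc4|netCDF-4)                 echo "-4" ;;
        nc7|"netCDF-4 classic model") echo "-7" ;;
        *) echo "unknown kind: $1" >&2; return 1 ;;
    esac
}

# Print two command lines that request the same output format.
echo "nccopy -k nc7 in.nc out.nc"
echo "nccopy $(kind_flag nc7) in.nc out.nc"
```

              The commands are only printed here, so the fragment can be
              tried without any netCDF files at hand.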

       -d n
              For netCDF-4 output, including netCDF-4 classic model, specify
              deflation level (level of compression) for variable data output.
              0 corresponds to no compression and 9 to maximum compression,
              with higher levels of compression requiring marginally more time
              to compress or uncompress than lower levels.  Compression
              achieved may also depend on output chunking parameters.  If this
              option is specified for a classic format or 64-bit offset format
              input file, it is not necessary to also specify that the output
              should be netCDF-4 classic model, as that will be the default.
              If this option is not specified and the input file has
              compressed variables, the compression will still be preserved in
              the output, using the same chunking as in the input by default.

              Note that nccopy requires all variables to be compressed using
              the same compression level, but the API has no such restriction.
              In a custom program that uses the API, you can set the
              compression for each variable independently.

       -s     For netCDF-4 output, including netCDF-4 classic model, specify
              shuffling of variable data bytes before compression or after
              decompression.  Shuffling refers to interlacing of bytes in a
              chunk so that the first bytes of all values are contiguous in
              storage, followed by all the second bytes, and so on, which
              often improves compression.  This option is ignored unless a
              non-zero deflation level is specified.  Using -d0 to specify no
              deflation on input data that has been compressed and shuffled
              turns off both compression and shuffling in the output.
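
              The interaction between -d and -s can be sketched as follows.
              The helper deflate_opts is hypothetical, and in.nc and out.nc
              are placeholder file names; it adds -s only for a non-zero
              level, because -s is ignored otherwise.

```shell
# deflate_opts is a hypothetical helper, not part of nccopy: it checks
# that the deflation level is a single digit 0-9 and prints the options.
deflate_opts() {
    case "$1" in
        0)     echo "-d0" ;;        # no deflation, so shuffling is pointless
        [1-9]) echo "-d$1 -s" ;;    # deflate and shuffle
        *) echo "deflation level must be 0-9: $1" >&2; return 1 ;;
    esac
}

# Print the resulting command line instead of running it.
echo "nccopy $(deflate_opts 5) in.nc out.nc"
```

              Whether shuffling actually helps depends on the data, so it is
              worth comparing output sizes with and without -s.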

       -u     Convert any unlimited size dimensions in the input to fixed size
              dimensions in the output.  This can speed up variable-at-a-time
              access, but slow down record-at-a-time access to multiple
              variables along an unlimited dimension.

       -w     Keep output in memory (as a diskless netCDF file) until output
              is closed, at which time the output file is written to disk.
              This can greatly speed up operations such as converting an
              unlimited dimension to fixed size (-u option), chunking,
              rechunking, or compressing the input.  It requires that
              available memory is large enough to hold the output file.  This
              option may provide a larger speedup than careful tuning of the
              -m, -h, or -e options, and it's certainly a lot simpler.

       -c chunkspec
              For netCDF-4 output, including netCDF-4 classic model, specify
              chunking (multidimensional tiling) for variable data in the
              output.  This is useful to specify the units of disk access,
              compression, or other filters such as checksums.  Changing the
              chunking in a netCDF file can also greatly speed up access, by
              choosing chunk shapes that are appropriate for the most common
              access patterns.

              The chunkspec argument is a string of comma-separated
              associations, each specifying a dimension name, a '/'
              character, and optionally the corresponding chunk length for
              that dimension.  No blanks should appear in the chunkspec
              string, except possibly escaped blanks that are part of a
              dimension name.  A chunkspec names at least one dimension, and
              may omit dimensions which are not to be chunked or for which the
              default chunk length is desired.  If a dimension name is
              followed by a '/' character but no subsequent chunk length, the
              actual dimension length is assumed.  If copying a classic model
              file to a netCDF-4 output file and not naming all dimensions in
              the chunkspec, unnamed dimensions will also use the actual
              dimension length for the chunk length.  An example of a
              chunkspec for variables that use 'm' and 'n' dimensions might be
              'm/100,n/200' to specify 100 by 200 chunks.  To see the chunking
              resulting from copying with a chunkspec, use the '-s' option of
              ncdump on the output file.

              The chunkspec '/' that omits all dimension names and
              corresponding chunk lengths specifies that no chunking is to
              occur in the output, so it can be used to unchunk all the
              chunked variables.

              As an I/O optimization, nccopy has a threshold for the minimum
              size of non-record variables that get chunked, currently 8192
              bytes.  In the future, use of this threshold and its size may be
              settable in an option.

              Note that nccopy requires variables that share a dimension to
              also share the chunk size associated with that dimension, but
              the programming interface has no such restriction.  If you need
              to customize chunking for variables independently, you will need
              to use the library API in a custom utility program.
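
              The chunkspec syntax can be illustrated with a small shell
              sketch.  The helper chunkspec is hypothetical; it only joins
              "name/length" pairs with commas, and a pair such as 'lon/' with
              no length asks for the actual dimension length.  The file names
              slow.nc and fast.nc are placeholders.

```shell
# chunkspec is a hypothetical helper, not part of nccopy: it joins its
# arguments with commas.  The body runs in a subshell so that the IFS
# change does not leak into the caller.
chunkspec() (
    IFS=,
    echo "$*"
)

spec=$(chunkspec time/1000 lat/40 lon/)

# Print the copy command, then the ncdump command that would show the
# resulting chunking (-h: header only, -s: special attributes).
echo "nccopy -c $spec slow.nc fast.nc"
echo "ncdump -h -s fast.nc"
```

              Because the spec must be a single argument without unescaped
              blanks, building it in a variable and passing it as one word
              avoids quoting mistakes.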

       -v var1,...
              The output will include data values for the specified variables,
              in addition to the declarations of all dimensions, variables,
              and attributes.  One or more variables must be specified by name
              in the comma-delimited list following this option.  The list
              must be a single argument to the command, hence cannot contain
              unescaped blanks or other white space characters.  The named
              variables must be valid netCDF variables in the input-file.  A
              variable within a group in a netCDF-4 file may be specified with
              an absolute path name, such as '/GroupA/GroupA2/var'.  Use of a
              relative path name such as 'var' or 'grp/var' specifies all
              matching variable names in the file.  The default, without this
              option, is to include data values for all variables in the
              output.

       -V var1,...
              The output will include the specified variables only but all
              dimensions and global or group attributes.  One or more
              variables must be specified by name in the comma-delimited list
              following this option.  The list must be a single argument to
              the command, hence cannot contain unescaped blanks or other
              white space characters.  The named variables must be valid
              netCDF variables in the input-file.  A variable within a group
              in a netCDF-4 file may be specified with an absolute path name,
              such as '/GroupA/GroupA2/var'.  Use of a relative path name such
              as 'var' or 'grp/var' specifies all matching variable names in
              the file.  The default, without this option, is to include all
              variables in the output.

       -g grp1,...
              The output will include data values only for the specified
              groups.  One or more groups must be specified by name in the
              comma-delimited list following this option.  The list must be a
              single argument to the command.  The named groups must be valid
              netCDF groups in the input-file.  The default, without this
              option, is to include data values for all groups in the output.

       -G grp1,...
              The output will include only the specified groups.  One or more
              groups must be specified by name in the comma-delimited list
              following this option.  The list must be a single argument to
              the command.  The named groups must be valid netCDF groups in
              the input-file.  The default, without this option, is to include
              all groups in the output.

       -m bufsize
              An integer or floating-point number that specifies the size, in
              bytes, of the copy buffer used to copy large variables.  A
              suffix of K, M, G, or T multiplies the copy buffer size by one
              thousand, million, billion, or trillion, respectively.  The
              default is 5 Mbytes, but will be increased if necessary to hold
              at least one chunk of netCDF-4 chunked variables in the input
              file.  You may want to specify a value larger than the default
              for copying large files over high-latency networks.  Using the
              '-w' option may provide better performance, if the output fits
              in memory.
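
              The suffix arithmetic described above can be sketched in shell.
              The helper to_bytes is hypothetical and handles integer values
              only, although nccopy itself also accepts floating-point
              numbers; note that the multipliers are decimal (powers of one
              thousand), as stated above.

```shell
# to_bytes is a hypothetical helper, not part of nccopy: it expands a
# size such as 5M into bytes using decimal multipliers.  Integer input
# only.  The body runs in a subshell to keep its variables local.
to_bytes() (
    num=${1%[KMGT]}              # strip one trailing suffix letter, if any
    case "$1" in
        *K) mult=1000 ;;
        *M) mult=1000000 ;;
        *G) mult=1000000000 ;;
        *T) mult=1000000000000 ;;
        *)  mult=1 ;;
    esac
    echo $((num * mult))
)

echo "-m 5M  means $(to_bytes 5M) bytes"
echo "-m 64K means $(to_bytes 64K) bytes"
```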

       -h chunk_cache
              For netCDF-4 output, including netCDF-4 classic model, an
              integer or floating-point number that specifies the size in
              bytes of chunk cache allocated for each chunked variable.  This
              is not a property of the file, but merely a performance tuning
              parameter for avoiding compressing or decompressing the same
              data multiple times while copying and changing chunk shapes.  A
              suffix of K, M, G, or T multiplies the chunk cache size by one
              thousand, million, billion, or trillion, respectively.  The
              default is 4.194304 Mbytes (or whatever was specified for the
              configure-time constant CHUNK_CACHE_SIZE when the netCDF library
              was built).  Ideally, the nccopy utility should accept only one
              memory buffer size and divide it optimally between a copy buffer
              and chunk cache, but no general algorithm for computing the
              optimum chunk cache size has been implemented yet.  Using the
              '-w' option may provide better performance, if the output fits
              in memory.

       -e cache_elems
              For netCDF-4 output, including netCDF-4 classic model, specifies
              number of chunks that the chunk cache can hold.  A suffix of K,
              M, G, or T multiplies the number of chunks that can be held in
              the cache by one thousand, million, billion, or trillion,
              respectively.  This is not a property of the file, but merely a
              performance tuning parameter for avoiding compressing or
              decompressing the same data multiple times while copying and
              changing chunk shapes.  The default is 1009 (or whatever was
              specified for the configure-time constant CHUNK_CACHE_NELEMS
              when the netCDF library was built).  Ideally, the nccopy utility
              should determine an optimum value for this parameter, but no
              general algorithm for computing the optimum number of chunk
              cache elements has been implemented yet.

       -r     Read netCDF classic or 64-bit offset input file into a diskless
              netCDF file in memory before copying.  Requires that input file
              be small enough to fit into memory.  For nccopy, this doesn't
              seem to provide any significant speedup, so may not be a useful
              option.

EXAMPLES

       Make a copy of foo1.nc, a netCDF file of any type, to foo2.nc, a netCDF
       file of the same type:

              nccopy foo1.nc foo2.nc

       Note that the above copy will not be as fast as use of cp or other
       simple copy utility, because the file is copied using only the netCDF
       API.  If the input file has extra bytes after the end of the netCDF
       data, those will not be copied, because they are not accessible through
       the netCDF interface.  If the original file was generated in "No fill"
       mode so that fill values are not stored for padding for data alignment,
       the output file may have different padding bytes.

       Convert a netCDF-4 classic model file, compressed.nc, that uses
       compression, to a netCDF-3 file classic.nc:

              nccopy -k classic compressed.nc classic.nc

       Note that 'nc3' could be used instead of 'classic'.

       Download the variable 'time_bnds' and its associated attributes from an
       OPeNDAP server and copy the result to a netCDF file named 'tb.nc':

              nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc

       Note that URLs that name specific variables as command-line arguments
       should generally be quoted, to avoid the shell interpreting special
       characters such as '?'.

       Compress all the variables in the input file foo.nc, a netCDF file of
       any type, to the output file bar.nc:

              nccopy -d1 foo.nc bar.nc

       If foo.nc was a classic or 64-bit offset netCDF file, bar.nc will be a
       netCDF-4 classic model netCDF file, because the classic and 64-bit
       offset format variants don't support compression.  If foo.nc was a
       netCDF-4 file with some variables compressed using various deflation
       levels, the output will also be a netCDF-4 file of the same type, but
       all the variables, including any uncompressed variables in the input,
       will now use deflation level 1.

       Assume the input data includes gridded variables that use time, lat,
       lon dimensions, with 1000 times by 1000 latitudes by 1000 longitudes,
       and that the time dimension varies most slowly.  Also assume that users
       want quick access to data at all times for a small set of lat-lon
       points.  Accessing data for 1000 times would typically require
       accessing 1000 disk blocks, which may be slow.

       Reorganizing the data into chunks on disk that have all the time in
       each chunk for a few lat and lon coordinates would greatly speed up
       such access.  To chunk the data in the input file slow.nc, a netCDF
       file of any type, to the output file fast.nc, you could use:

              nccopy -c time/1000,lat/40,lon/40 slow.nc fast.nc

       to specify data chunks of 1000 times, 40 latitudes, and 40 longitudes.
       If you had enough memory to contain the output file, you could speed up
       the rechunking operation significantly by creating the output in memory
       before writing it to disk on close:

              nccopy -w -c time/1000,lat/40,lon/40 slow.nc fast.nc
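
       To get a feel for what this chunkspec implies, the arithmetic can be
       sketched in shell.  The dimension and chunk lengths come from the
       example above; the 4-byte value size is an assumption (for instance,
       a float variable) and is not stated in the example.

```shell
# Dimension lengths and chunk lengths from the example above.
ntime=1000; nlat=1000; nlon=1000
ctime=1000; clat=40;   clon=40
value_size=4    # assumption: 4-byte values such as float

chunk_values=$((ctime * clat * clon))
chunk_bytes=$((chunk_values * value_size))

# Ceiling division gives the number of chunks along each dimension.
nchunks=$(( ((ntime + ctime - 1) / ctime) \
          * ((nlat  + clat  - 1) / clat) \
          * ((nlon  + clon  - 1) / clon) ))

echo "values per chunk:    $chunk_values"
echo "bytes per chunk:     $chunk_bytes"
echo "chunks per variable: $nchunks"
```

       Under that assumption each chunk holds 6.4 million bytes, which is
       larger than the default chunk cache of 4.194304 Mbytes quoted under
       -h, so trying a larger -h value (or -w, if memory allows) may be
       worthwhile for this kind of rechunking.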

SEE ALSO

       ncdump(1), ncgen(1), netcdf(3)

Release 4.2                       2012-03-08                         NCCOPY(1)