ncgen(1) - f14

NCGEN(1)                       UNIDATA UTILITIES                      NCGEN(1)

NAME

       ncgen - From a CDL file generate a netCDF-3 file, a netCDF-4 file or a
       C program

SYNOPSIS

       ncgen [-b] [-c] [-f] [-k file_format] [-l output_language] [-n] [-o
              netcdf_filename] [-x] input_file

DESCRIPTION

       ncgen generates either a netCDF-3 (i.e. classic) binary .nc file, a
       netCDF-4 (i.e. enhanced) binary .nc file, or a file in some source
       language that, when executed, will construct the corresponding binary
       .nc file.  The input to ncgen is a description of a netCDF file in a
       small language known as CDL (network Common Data form Language),
       described below.  If no options are specified in invoking ncgen, it
       merely checks the syntax of the input CDL file, producing error
       messages for any violations of CDL syntax.  Other options can be used,
       for example, to create the corresponding netCDF file, or to generate a
       C program that uses the netCDF C interface to create the netCDF file.

       Note that this version of ncgen was originally called ncgen4.  The
       older ncgen program has been renamed to ncgen3.

       ncgen may be used with the companion program ncdump to perform some
       simple operations on netCDF files.  For example, to rename a dimension
       in a netCDF file, use ncdump to get a CDL version of the netCDF file,
       edit the CDL file to change the name of the dimension, and use ncgen
       to generate the corresponding netCDF file from the edited CDL file.

OPTIONS

       -b     Create a (binary) netCDF file.  If the -o option is absent, a
              default file name will be constructed from the netCDF name
              (specified after the netcdf keyword in the input) by appending
              the `.nc' extension.  If a file already exists with the
              specified name, it will be overwritten.

       -c     Generate C source code that will create a netCDF file matching
              the netCDF specification.  The C source code is written to
              standard output; equivalent to -lc.

       -f     Generate FORTRAN 77 source code that will create a netCDF file
              matching the netCDF specification.  The source code is written
              to standard output; equivalent to -lf77.

       -o netcdf_file
              Name for the binary netCDF file created.  If this option is
              specified, it implies the "-b" option.  (This option is
              necessary because netCDF files cannot be written directly to
              standard output, since standard output is not seekable.)

       -k file_format
              The -k flag specifies the format of the file to be created and,
              by inference, the data model accepted by ncgen (i.e. netcdf-3
              (classic) versus netcdf-4).  The possible arguments are as
              follows.

                     '1', 'classic' => netcdf classic file format, netcdf-3
                     type model.

                     '2', '64-bit-offset', '64-bit offset' => netcdf 64 bit
                     classic file format, netcdf-3 type model.

                     '3', 'hdf5', 'netCDF-4', 'enhanced' => netcdf-4 file
                     format, netcdf-4 type model.

                     '4', 'hdf5-nc3', 'netCDF-4 classic model', 'enhanced-nc3'
                     => netcdf-4 file format, netcdf-3 type model.

              If no -k is specified then it defaults to -k1 (i.e. classic).
              Note also that -v is accepted to mean the same thing as -k for
              backward compatibility, but -k is preferred, to match the
              corresponding ncdump option.

       -x     Don't initialize data with fill values.  This can greatly speed
              up creation of large netCDF files, but later attempts to read
              unwritten data from the generated file will not be easily
              detectable.

       -l output_language
              The -l flag specifies the output language to use when generating
              source code that will create or define a netCDF file matching
              the netCDF specification.  The output is written to standard
              output.  The currently supported languages have the following
              flags.

                     'c|C' => C language output.

                     'f77|fortran77' => FORTRAN 77 language output; note that
                     currently only the classic model is supported.

                     'j|java' => (experimental) Java language output; targets
                     the existing Unidata Java interface, which means that
                     only the classic model is supported.
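
       The -k argument list can be read as a lookup table from argument to a
       (file format, data model) pair.  The Python sketch below restates that
       table only; it is not ncgen's own option handling.  The names K_FORMATS
       and resolve_k are hypothetical, and exact, case-sensitive matching is
       an assumption, since this page does not say how arguments are compared.

```python
# Lookup table built from the -k argument list in OPTIONS.
# Each accepted -k argument maps to (file format, data model).
K_FORMATS = {}
for aliases, file_format, model in [
    (("1", "classic"), "netcdf classic", "netcdf-3"),
    (("2", "64-bit-offset", "64-bit offset"), "netcdf 64 bit classic", "netcdf-3"),
    (("3", "hdf5", "netCDF-4", "enhanced"), "netcdf-4", "netcdf-4"),
    (("4", "hdf5-nc3", "netCDF-4 classic model", "enhanced-nc3"), "netcdf-4", "netcdf-3"),
]:
    for alias in aliases:
        K_FORMATS[alias] = (file_format, model)


def resolve_k(arg=None):
    """Return (file format, data model); an absent -k behaves like -k1."""
    # Assumption: exact string match; unknown arguments raise KeyError.
    return K_FORMATS["1" if arg is None else arg]
```

       For example, resolve_k("hdf5-nc3") gives the netcdf-4 file format with
       the netcdf-3 type model, and resolve_k() gives the classic default.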

EXAMPLES

       Check the syntax of the CDL file `foo.cdl':

              ncgen foo.cdl

       From the CDL file `foo.cdl', generate an equivalent binary netCDF file
       named `x.nc':

              ncgen -o x.nc foo.cdl

       From the CDL file `foo.cdl', generate a C program containing the netCDF
       function invocations necessary to create an equivalent binary netCDF
       file named `x.nc':

              ncgen -c -o x.nc foo.cdl

USAGE

   CDL Syntax Overview
       Below is an example of CDL syntax, describing a netCDF file with
       several named dimensions (lat, lon, and time), variables (Z, t, p, rh,
       lat, lon, time), variable attributes (units, long_name, valid_range,
       _FillValue), and some data.  CDL keywords are in boldface.  (This
       example is intended to illustrate the syntax; a real CDL file would
       have a more complete set of attributes so that the data would be more
       completely self-describing.)

              netcdf foo {  // an example netCDF specification in CDL

              types:
                  ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
                  opaque(11) opaque_t;
                  int(*) vlen_t;

              dimensions:
                   lat = 10, lon = 5, time = unlimited ;

              variables:
                   long    lat(lat), lon(lon), time(time);
                   float   Z(time,lat,lon), t(time,lat,lon);
                   double  p(time,lat,lon);
                   long    rh(time,lat,lon);

                   string  country(time,lat,lon);
                   ubyte   tag;

                   // variable attributes
                   lat:long_name = "latitude";
                   lat:units = "degrees_north";
                   lon:long_name = "longitude";
                   lon:units = "degrees_east";
                   time:units = "seconds since 1992-1-1 00:00:00";

                   // typed variable attributes
                   string Z:units = "geopotential meters";
                   float Z:valid_range = 0., 5000.;
                   double p:_FillValue = -9999.;
                   long rh:_FillValue = -1;
                   vlen_t :globalatt = {17, 18, 19};
              data:
                   lat   = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
                   lon   = -140, -118, -96, -84, -52;
              group g {
              types:
                  compound cmpd_t { vlen_t f1; enum_t f2;};
              } // group g
              group h {
              variables:
                   /g/cmpd_t  compoundvar;
              data:
                      compoundvar = { {3,4,5}, Stratus } ;
              } // group h
              }

       All CDL statements are terminated by a semicolon.  Spaces, tabs, and
       newlines can be used freely for readability.  Comments may follow the
       characters `//' on any line.

       A CDL description consists of four optional parts: types, dimensions,
       variables, and data, beginning with the keywords types:, dimensions:,
       variables:, and data:, respectively.  The variables part may contain
       variable declarations and attribute assignments.  All sections may
       contain global attribute assignments.

       In addition, after the data: section, the user may define a series of
       groups (see the example above).  Groups themselves can contain types,
       dimensions, variables, data, and other (nested) groups.

       The netCDF type section declares the user-defined types.  These may be
       constructed using any of the following types: enum, vlen, opaque, or
       compound.

       A netCDF dimension is used to define the shape of one or more of the
       multidimensional variables contained in the netCDF file.  A netCDF
       dimension has a name and a size.  A dimension can have unlimited size,
       which means a variable using this dimension can grow to any length in
       that dimension.

       A variable represents a multidimensional array of values of the same
       type.  A variable has a name, a data type, and a shape described by its
       list of dimensions.  Each variable may also have associated attributes
       (see below) as well as data values.  The name, data type, and shape of
       a variable are specified by its declaration in the variable section of
       a CDL description.  A variable may have the same name as a dimension;
       by convention such a variable is one-dimensional and contains
       coordinates of the dimension it names.  Dimensions need not have
       corresponding variables.

       A netCDF attribute contains information about a netCDF variable or
       about the whole netCDF dataset.  Attributes are used to specify such
       properties as units, special values, maximum and minimum valid values,
       scaling factors, offsets, and parameters.  Attribute information is
       represented by single values or arrays of values.  For example, "units"
       is an attribute represented by a character array such as "celsius".  An
       attribute has an associated variable, a name, a data type, a length,
       and a value.  In contrast to variables that are intended for data,
       attributes are intended for metadata (data about data).  Unlike in
       netCDF-3, attribute types can be any user-defined type as well as the
       usual built-in types.

       In CDL, an attribute is designated by a type, a variable, a ':', and
       then an attribute name.  The type is optional and, if missing, it will
       be inferred from the values assigned to the attribute.  It is possible
       to assign global attributes not associated with any variable to the
       netCDF as a whole by omitting the variable name in the attribute
       declaration.  Notice that there is a potential ambiguity in a
       specification such as

              x : a = ...

       In this situation, x could be either a type for a global attribute, or
       the variable name for an attribute.  Since there could be both a type
       named x and a variable named x, there is an ambiguity.  The rule is
       that in this situation, x will be interpreted as a type if possible,
       and otherwise as a variable.
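
       The type-before-variable rule can be sketched in a few lines of
       Python.  This only illustrates the rule as stated; classify_prefix and
       its two set arguments are hypothetical names, not part of ncgen.

```python
def classify_prefix(name, type_names, variable_names):
    """Decide how `name` in `name : attr = ...` is interpreted.

    A known type takes precedence over a variable of the same name.
    """
    if name in type_names:
        return "type"        # global attribute with an explicit type
    if name in variable_names:
        return "variable"    # attribute attached to that variable
    raise ValueError(f"{name!r} is neither a known type nor a variable")
```

       With both a type and a variable named x declared, the call
       classify_prefix("x", {"x"}, {"x"}) returns "type".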

       If not specified, the data type of an attribute in CDL is derived from
       the type of the value(s) assigned to it.  The length of an attribute is
       the number of data values assigned to it, or the number of characters
       in the character string assigned to it.  Multiple values are assigned
       to non-character attributes by separating the values with commas.  All
       values assigned to an attribute must be of the same type.

       The names for CDL dimensions, variables, and attributes must begin with
       an alphabetic character or `_', and subsequent characters may be
       alphanumeric or `_' or `-'.
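
       The naming rule translates directly into a regular expression.  The
       Python sketch below assumes that "alphabetic" and "alphanumeric" mean
       ASCII letters and digits, which this page does not state; CDL_NAME and
       is_valid_cdl_name are hypothetical names.

```python
import re

# Assumption: "alphabetic" and "alphanumeric" are read as ASCII only.
# First character: letter or `_'.  Later characters: letter, digit, `_', `-'.
CDL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*\Z")


def is_valid_cdl_name(name):
    """Return True if `name` satisfies the naming rule for CDL objects."""
    return CDL_NAME.match(name) is not None
```

       Under these assumptions `_FillValue' and `valid-range' pass, while a
       name that starts with a digit is rejected.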

       The optional data section of a CDL specification is where netCDF
       variables may be initialized.  The syntax of an initialization is
       simple: a variable name, an equals sign, and a comma-delimited list of
       constants (possibly separated by spaces, tabs and newlines) terminated
       with a semicolon.  For multi-dimensional arrays, the last dimension
       varies fastest.  Thus row-major order rather than column-major order
       is used for matrices.  If fewer values are supplied than are needed to
       fill a variable, it is extended with a type-dependent `fill value',
       which can be overridden by supplying a value for a distinguished
       variable attribute named `_FillValue'.  The types of constants need
       not match the type declared for a variable; coercions are done to
       convert integers to floating point, for example.  The constant `_' can
       be used to designate the fill value for a variable.
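
       The ordering and padding rules can be made concrete with a small
       Python sketch.  It assumes fixed-size dimensions and takes the fill
       value as a parameter rather than guessing any type's default; lay_out
       and flat_index are hypothetical helper names.

```python
from math import prod


def lay_out(values, shape, fill):
    """Pad a flat list of values to prod(shape) entries using `fill`."""
    size = prod(shape)
    if len(values) > size:
        raise ValueError("more values than the variable can hold")
    return list(values) + [fill] * (size - len(values))


def flat_index(index, shape):
    """Offset of a multi-dimensional index when the last dimension varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset
```

       For a 2 by 3 variable given only four values, lay_out([1, 2, 3, 4],
       (2, 3), -1) pads the last two slots with -1, and element (1, 0) is
       found at offset flat_index((1, 0), (2, 3)), which is 3.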

   Primitive Data Types
              char      characters
              byte      8-bit data
              short     16-bit signed integers
              int       32-bit signed integers
              long      (synonymous with int)
              int64     64-bit signed integers
              float     IEEE single precision floating point (32 bits)
              real      (synonymous with float)
              double    IEEE double precision floating point (64 bits)
              ubyte     unsigned 8-bit data
              ushort    16-bit unsigned integers
              uint      32-bit unsigned integers
              uint64    64-bit unsigned integers
              string    arbitrary length strings

       CDL supports a superset of the primitive data types of C.  The names
       for the primitive data types are reserved words in CDL, so the names of
       variables, dimensions, and attributes must not be primitive type names.
       In declarations, type names may be specified in either upper or lower
       case.

       Bytes differ from characters in that they are intended to hold a full
       eight bits of data, and the zero byte has no special significance, as
       it does for character data.  ncgen converts byte declarations to char
       declarations in the output C code and to the nonstandard BYTE
       declaration in output Fortran code.

       Shorts can hold values between -32768 and 32767.  ncgen converts short
       declarations to short declarations in the output C code and to the
       nonstandard INTEGER*2 declaration in output Fortran code.

       Ints can hold values between -2147483648 and 2147483647.  ncgen
       converts int declarations to int declarations in the output C code and
       to INTEGER declarations in output Fortran code.  long is accepted as a
       synonym for int in CDL declarations, but is deprecated since there are
       now platforms with 64-bit representations for C longs.

       Int64 can hold values between -9223372036854775808 and
       9223372036854775807.  ncgen converts int64 declarations to long long
       declarations in the output C code.

       Floats can hold values between about -3.4e+38 and 3.4e+38.  Their
       external representation is as 32-bit IEEE normalized single-precision
       floating point numbers.  ncgen converts float declarations to float
       declarations in the output C code and to REAL declarations in output
       Fortran code.  real is accepted as a synonym for float in CDL
       declarations.

       Doubles can hold values between about -1.7e+308 and 1.7e+308.  Their
       external representation is as 64-bit IEEE standard normalized
       double-precision floating point numbers.  ncgen converts double
       declarations to double declarations in the output C code and to DOUBLE
       PRECISION declarations in output Fortran code.

       The unsigned counterparts of the above integer types are mapped to the
       corresponding unsigned C types.  Their ranges are suitably modified to
       start at zero.
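
       The integer ranges follow from the bit widths in the type table.  The
       Python sketch below derives them rather than hard-coding them; byte is
       left out because this page describes it only as "8-bit data".  The
       names SIGNED_BITS, UNSIGNED_BITS and value_range are hypothetical.

```python
# Bit widths taken from the Primitive Data Types table.
SIGNED_BITS = {"short": 16, "int": 32, "int64": 64}
UNSIGNED_BITS = {"ubyte": 8, "ushort": 16, "uint": 32, "uint64": 64}


def value_range(type_name):
    """Return (min, max) for a CDL integer type name."""
    if type_name in SIGNED_BITS:
        bits = SIGNED_BITS[type_name]
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    bits = UNSIGNED_BITS[type_name]   # KeyError for anything else
    return 0, 2 ** bits - 1
```

       value_range("short") reproduces the -32768 to 32767 range quoted for
       shorts, and the unsigned types start at zero as described.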

   CDL Constants
       Constants assigned to attributes or variables may be of any of the
       basic netCDF types.  The syntax for constants is similar to C syntax,
       except that type suffixes must be appended to shorts and floats to
       distinguish them from longs and doubles.

       A byte constant is represented by a single character or multiple
       character escape sequence enclosed in single quotes.  For example,

               'a'      // ASCII `a'
               '\0'     // a zero byte
               '\n'     // ASCII newline character
               '\33'    // ASCII escape character (33 octal)
               '\x2b'   // ASCII plus (2b hex)
               '\377'   // 377 octal = 255 decimal, non-ASCII

       Character constants are enclosed in double quotes.  A character array
       may be represented as a string enclosed in double quotes.  The usual C
       string escape conventions are honored.  For example

              "a"              // ASCII `a'
              "Two\nlines\n"   // a 10-character string with two embedded newlines
              "a bell:\007"    // a string containing an ASCII bell

       Note that the netCDF character array "a" would fit in a one-element
       variable, since no terminating NULL character is assumed.  However, a
       zero byte in a character array is interpreted as the end of the
       significant characters by the ncdump program, following the C
       convention.  Therefore, a NULL byte should not be embedded in a
       character string unless at the end: use the byte data type instead for
       byte arrays that contain the zero byte.

       short integer constants are intended for representing 16-bit signed
       quantities.  The form of a short constant is an integer constant with
       an `s' or `S' appended.  If a short constant begins with `0', it is
       interpreted as octal, except that if it begins with `0x', it is
       interpreted as a hexadecimal constant.  For example:

              -2s      // a short -2
              0123s    // octal
              0x7ffs   // hexadecimal

       int integer constants are intended for representing 32-bit signed
       quantities.  The form of an int constant is an ordinary integer
       constant, although it is acceptable to append an optional `l' or `L'
       (again, deprecated).  If an int constant begins with `0', it is
       interpreted as octal, except that if it begins with `0x', it is
       interpreted as a hexadecimal constant (but see opaque constants
       below).  Examples of valid int constants include:

              -2
              1234567890L
              0123     // octal
              0x7ff    // hexadecimal

       int64 integer constants are intended for representing 64-bit signed
       quantities.  The form of an int64 constant is an integer constant with
       an `ll' or `LL' appended.  If an int64 constant begins with `0', it is
       interpreted as octal, except that if it begins with `0x', it is
       interpreted as a hexadecimal constant.  For example:

              -2ll     // an int64 -2
              0123LL   // octal
              0x7ffLL  // hexadecimal

       Floating point constants of type float are appropriate for
       representing floating point data with about seven significant digits
       of precision.  The form of a float constant is the same as a C
       floating point constant with an `f' or `F' appended.  For example the
       following are all acceptable float constants:

              -2.0f
              3.14159265358979f   // will be truncated to less precision
              1.f

       Floating point constants of type double are appropriate for
       representing floating point data with about sixteen significant digits
       of precision.  The form of a double constant is the same as a C
       floating point constant.  An optional `d' or `D' may be appended.  For
       example the following are all acceptable double constants:

              -2.0
              3.141592653589793
              1.0e-20
              1.d

       Unsigned integer constants can be created by placing the character
       'U' or 'u' between the constant and any trailing size specifier.  Thus
       one could say 10U, 100us, 100000ul, or 1000000ull, for example.
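
       The suffix and radix rules for integer constants can be collected into
       one small classifier.  The Python sketch below covers only what is
       described above (optional `u', then optional `s', `l' or `ll', with
       leading `0' for octal and `0x' for hexadecimal); it does not model
       opaque constants or range checking, and it is not ncgen's lexer.  The
       name parse_int_constant is hypothetical.

```python
import re

# sign, digits (hex or decimal/octal), optional unsigned marker, optional size
_INT_CONST = re.compile(
    r"([+-]?)(0[xX][0-9a-fA-F]+|[0-9]+)([uU]?)(ll|LL|[sSlL]?)\Z")
_SIZE_TO_TYPE = {"s": "short", "": "int", "l": "int", "ll": "int64"}


def parse_int_constant(text):
    """Return (type name, value) for a CDL integer constant."""
    m = _INT_CONST.match(text)
    if m is None:
        raise ValueError(f"not an integer constant: {text!r}")
    sign, digits, unsigned, size = m.groups()
    if digits[:2].lower() == "0x":
        value = int(digits, 16)          # `0x' prefix: hexadecimal
    elif digits.startswith("0") and len(digits) > 1:
        value = int(digits, 8)           # leading `0': octal
    else:
        value = int(digits, 10)
    if sign == "-":
        value = -value
    type_name = _SIZE_TO_TYPE[size.lower()]
    return ("u" + type_name if unsigned else type_name), value
```

       For instance, parse_int_constant("0x7ffLL") yields an int64 with value
       2047, and parse_int_constant("100us") yields a ushort with value 100.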

       String constants are, like character constants, represented using
       double quotes.  This represents a potential ambiguity since a
       multi-character string may also indicate a dimensioned character
       value.  Disambiguation usually occurs by context, but care should be
       taken to specify the string type to ensure the proper choice.

       Opaque constants are represented as sequences of hexadecimal digits
       preceded by 0X or 0x: 0xaa34ffff, for example.  These constants can
       still be used as integer constants and will be either truncated or
       extended as necessary.

   Compound Constant Expressions
       In order to assign values to variables (or attributes) whose type is a
       user-defined type, the constant notation has been extended to include
       sequences of constants enclosed in curly brackets (e.g. "{"..."}").
       Such a constant is called a compound constant, and compound constants
       can be nested.

       Given a type "T(*) vlen_t", where T is some other arbitrary base type,
       constants for this should be specified as follows.

           vlen_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2m};

       The values tij are assumed to be constants of type T.

       Given a type "compound cmpd_t {T1 f1; T2 f2...Tn fn}", where the Ti are
       other arbitrary base types, constants for this should be specified as
       follows.

           cmpd_t var[2] = {t11,t12,...t1n}, {t21,t22,...t2n};

       The values tij are assumed to be constants of type Tj, the type of
       field fj.  If fields are missing, then they will be set using any
       specified or default fill value for the field's base type.

       The general set of rules for using braces is defined in the Specifying
       Datalists section below.

   Scoping Rules
       With the addition of groups, the name space for defined objects is no
       longer flat.  References (names) of any type, dimension, or variable
       may be prefixed with the absolute path specifying a specific
       declaration.  Thus one might say

           variables:
               /g1/g2/t1 v1;

       The type being referenced (t1) is the one within group g2, which in
       turn is nested in group g1.  The similarity of this notation to Unix
       file paths is deliberate, and one can consider groups as a form of
       directory structure.

       When a name is not prefixed, scope rules are applied to locate the
       specified declaration.  Currently, there are three rules: one for
       dimensions, one for types and enumeration constants, and one for all
       others.

       1. When an unprefixed name of a dimension is used (as in a variable
              declaration), ncgen first looks in the immediately enclosing
              group for the dimension.  If it is not found there, then it
              looks in the group enclosing this group.  This continues up the
              group hierarchy until the dimension is found, or there are no
              more groups to search.

       2. When an unprefixed name of a type or an enumeration constant is
              used, ncgen searches the group tree using a pre-order
              depth-first search.  This essentially means that it will find
              the matching declaration that precedes the reference textually
              in the CDL file and that is "highest" in the group hierarchy.

       3. For all other names, only the immediately enclosing group is
              searched.

       One final note: forward references are not allowed.  This means that
       specifying, for example, /g1/g2/t1 will fail if this reference occurs
       before g1 and/or g2 are defined.
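
       The dimension rule (search the enclosing group, then each group above
       it) is a simple walk up the group hierarchy.  The Python sketch below
       illustrates that walk only; the Group class and find_dimension are
       hypothetical stand-ins, not ncgen data structures.

```python
class Group:
    """Hypothetical stand-in for a CDL group; not an ncgen data structure."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # None for the root group
        self.dimensions = {}      # dimension name -> size


def find_dimension(group, name):
    """Search `group`, then each enclosing group, for dimension `name`."""
    while group is not None:
        if name in group.dimensions:
            return group, group.dimensions[name]
        group = group.parent
    raise KeyError(f"dimension {name!r} not found in any enclosing group")
```

       A dimension declared in the root group is therefore visible from any
       nested group, unless a nearer group declares one with the same name.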

   Special Attributes
       Special, virtual, attributes can be specified to provide
       performance-related information about the file format and about
       variable properties.  The file must be a netCDF-4 file for these to
       take effect.

       These special virtual attributes are not actually part of the file;
       they are merely a convenient way to set miscellaneous properties of
       the data in CDL.

       The special attributes currently supported are as follows: `_Format',
       `_Fletcher32', `_ChunkSizes', `_Endianness', `_DeflateLevel',
       `_Shuffle', and `_Storage'.

       `_Format' is a global attribute specifying the netCDF format variant.
       Its value must be a single string matching one of `classic',
       `64-bit offset', `netCDF-4', or `netCDF-4 classic model'.

       The rest of the special attributes are all variable attributes.
       Essentially all of them map to some corresponding `nc_def_var_XXX'
       function as defined in the netCDF-4 API.  `_Fletcher32' sets the
       `fletcher32' property for a variable.  `_Endianness' is either
       `little' or `big', depending on how the variable is stored when first
       written.  `_DeflateLevel' is an integer between 0 and 9 inclusive if
       compression has been specified for the variable.  `_Shuffle' is 1 if
       use of the shuffle filter is specified for the variable.  `_Storage'
       is `contiguous' or `chunked'.  `_ChunkSizes' is a list of chunk sizes
       for each dimension of the variable.
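
       The value constraints listed for the special attributes can be checked
       mechanically.  The Python sketch below covers only the attributes
       whose allowed values are spelled out above; requiring positive chunk
       sizes is an added assumption, and check_special_attribute is a
       hypothetical name rather than anything ncgen provides.

```python
def check_special_attribute(name, value, rank=None):
    """Return True if `value` is acceptable for special attribute `name`.

    `rank` is the number of dimensions of the variable; it is consulted
    only for _ChunkSizes (one chunk size per dimension).
    """
    if name == "_Format":
        return value in ("classic", "64-bit offset", "netCDF-4",
                         "netCDF-4 classic model")
    if name == "_Endianness":
        return value in ("little", "big")
    if name == "_DeflateLevel":
        return isinstance(value, int) and 0 <= value <= 9
    if name == "_Storage":
        return value in ("contiguous", "chunked")
    if name == "_ChunkSizes":
        sizes = list(value)
        # Assumption: chunk sizes must be positive integers.
        return (rank is None or len(sizes) == rank) and all(
            isinstance(s, int) and s > 0 for s in sizes)
    raise ValueError(f"no check sketched for {name!r}")
```

       For a variable declared over (time,lat,lon), a `_ChunkSizes' list
       would need three entries to pass this check with rank=3.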

   Specifying Datalists
       Specifying datalists for variables in the `data:' section can be
       somewhat complicated.  There are some rules that must be followed to
       ensure that datalists are parsed correctly by ncgen.

       1. The top level is automatically assumed to be a list of items,
              so it should not be inside {...}.

       2. Instances of UNLIMITED dimensions (other than the first dimension)
              must be surrounded by {...} in order to specify the size.

       3. Instances of vlens must be surrounded by {...} in order to
              specify the size.

       4. Compound instances must be embedded in {...}.

       5. Non-scalar fields of compound instances must be embedded in {...}.

       6. Datalists associated with attributes are implicitly a vector (i.e.,
              a list) of values of the type of the attribute and the above
              rules must apply with that in mind.

       7. No other use of braces is allowed.

       Note that one consequence of these rules is that arrays of values
       cannot have subarrays within braces.  Thus, given, for example, int
       var(d1)(d2)...(dn), a datalist for this variable must be a single list
       of integers, where the number of integers is no more than
       D=d1*d2*...*dn; note that the list can be shorter than D, in which
       case fill values will be used to pad the list.

       Rule 6 about attribute datalists has the following consequence.  If
       the type of the attribute is a compound (or vlen) type, and if the
       number of entries in the list is one, then the compound instances must
       be enclosed in braces.

BUGS

       The programs generated by ncgen when using the -c flag use
       initialization statements to store data in variables, and will fail to
       produce compilable programs if you try to use them for large datasets,
       since the resulting statements may exceed the line length or number of
       continuation statements permitted by the compiler.

       The CDL syntax makes it easy to assign what looks like an array of
       variable-length strings to a netCDF variable, but the strings may
       simply be concatenated into a single array of characters.  Specific
       use of the string type specifier may solve the problem.

Printed: 119-6-22        $Date: 2010/04/04 19:39:52 $                 NCGEN(1)