ncgen(1) - f37

1NCGEN(1)                       UNIDATA UTILITIES                      NCGEN(1)
2
3
4

NAME

6       ncgen  - From a CDL file generate a netCDF-3 file, a netCDF-4 file or a
7       C program
8

SYNOPSIS

10       ncgen [-format_code] [-1|3|4|5|6|7] [-b] [-c] [-d] [-D debuglevel] [-f]
11              [-h]  [-H]  [-k format_name] [-l b|c|f77|java] [-L loglevel] [-M
12              name] [-n] [-N datasetname] [-o netcdf_filename] [-P] [-x]
13

DESCRIPTION

15       ncgen generates either a netCDF-3 (i.e. classic)  binary  .nc  file,  a
16       netCDF-4  (i.e. enhanced) binary .nc file or a file in some source lan‐
17       guage that when executed will construct the  corresponding  binary  .nc
18       file.   The input to ncgen is a description of a netCDF file in a small
19       language known as CDL (network Common Data  form  Language),  described
20       below.   Input  is  read from standard input if no input_file is speci‐
21       fied.  If no options are specified in invoking ncgen, it merely  checks
22       the syntax of the input CDL file, producing error messages for any vio‐
23       lations of CDL syntax.  Other options can be used, for example, to cre‐
24       ate the corresponding netCDF file, or to generate a C program that uses
25       the netCDF C interface to create the netCDF file.
26
27       Note that this version of ncgen was originally called ncgen4.  The old‐
28       er ncgen program has been renamed to ncgen3.
29
30       ncgen  may  be  used  with the companion program ncdump to perform some
31       simple operations on netCDF files.  For example, to rename a  dimension
32       in  a  netCDF file, use ncdump to get a CDL version of the netCDF file,
33       edit the CDL file to change the name of the dimension,  and  use  ncgen
34       to generate the corresponding netCDF file from the edited CDL file.
35

OPTIONS

37       -1|3|4|5|6|7
38              Alternate method to specify the format.
39
40                     3 => netcdf classic format
41
42                     4 => netCDF-4 format (enhanced data model)
43
44                     5 => netcdf 5 format
45
46                     6 => netCDF 64-bit format
47
48                     7 => netCDF-4 classic model format (3+4 == 7)
49       See the -k flag.
50
51       -b     Create  a  (binary)  netCDF file.  If the -o option is absent, a
52              default file name will be constructed from the basename  of  the
53              CDL file, with any suffix replaced by the `.nc' extension.  If a
54              file already exists with the specified name, it  will  be  over‐
55              written.
56
57       -c     Generate  C  source code that will create a netCDF file matching
58              the netCDF specification.  The C source code is written to stan‐
59              dard output; equivalent to -lc.
60
61       -d     Same as -D1.
62
63       -D debuglevel
64              Set the level of debug output.
65
66       -f     Generate  FORTRAN  77 source code that will create a netCDF file
67              matching the netCDF specification.  The source code  is  written
68              to standard output; equivalent to -lf77.
69
70       -h     Output help information.
71
72       -H     Output the header only; ignore the data section.
73
74       -k format_name
75
76       -format_code
77              The  -k flag specifies the format of the file to be created and,
78              by inference, the data model accepted by  ncgen  (i.e.  netcdf-3
79              (classic) versus netcdf-4 vs netcdf-5). As a shortcut, a numeric
80              format_code may be specified instead.  The possible  format_name
81              values for the -k option are:
82
83                     'classic' or 'nc3' => netCDF classic format
84
85                     '64-bit offset' or 'nc6' => netCDF 64-bit format
86
87                     '64-bit data' or 'nc5' => netCDF-5 (64-bit data) format
88
89                     'netCDF-4'  or  'nc4'  =>  netCDF-4 format (enhanced data
90                     model)
91
92                     'netCDF-4 classic model' or  'nc7'  =>  netCDF-4  classic
93                     model format
94       Accepted   format_code  numeric  arguments,  just  shortcuts  for  for‐
95       mat_names, are:
96
97                     3 => netcdf classic format
98
99                     5 => netcdf 5 format
100
101                     6 => netCDF 64-bit format
102
103                     4 => netCDF-4 format (enhanced data model)
104
105                     7 => netCDF-4 classic model format
106       The numeric code "7" is used because "7=3+4", a mnemonic for the format
107       that  uses  the netCDF-3 data model for compatibility with the netCDF-4
108       storage format for performance. Credit is due to NCO for use  of  these
109       numeric codes instead of the old and confusing format numbers.
110
111       Note:  The old version format numbers '1', '2', '3', '4', equivalent to
112       the format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively,  are  also
113       still  accepted  but  deprecated,  due to easy confusion between format
114       numbers and format names. Various old format name aliases are also  ac‐
115       cepted  but  deprecated,  e.g. 'hdf5', 'enhanced-nc3', etc.  Also, note
116       that -v is accepted to mean the same thing as -k for backward  compati‐
117       bility.
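
       The name and code lists above can be read as a pair of lookup tables.
       The following Python sketch is illustrative only and is not part of
       ncgen: the function and table names are hypothetical, and it assumes
       that numeric format_code flags (such as -7) and arguments to -k (such
       as the deprecated -k 3) are resolved separately.

```python
# Illustrative lookup tables built from the lists above; not ncgen source code.

# Canonical format names, keyed by the short 'ncN' alias.
FORMATS = {
    "nc3": "classic",
    "nc6": "64-bit offset",
    "nc5": "64-bit data",
    "nc4": "netCDF-4",
    "nc7": "netCDF-4 classic model",
}

# Deprecated old-style format numbers, accepted as -k arguments.
DEPRECATED_K_NUMBERS = {"1": "nc3", "2": "nc6", "3": "nc4", "4": "nc7"}


def format_from_code(code: int) -> str:
    """Resolve a numeric format_code flag such as -3 or -7."""
    alias = f"nc{code}"
    if alias not in FORMATS:
        raise ValueError(f"unknown format_code: {code}")
    return FORMATS[alias]


def format_from_k(value: str) -> str:
    """Resolve a -k argument: a short alias, a long name, or an old number."""
    if value in FORMATS:
        return FORMATS[value]
    if value in FORMATS.values():
        return value
    if value in DEPRECATED_K_NUMBERS:
        return FORMATS[DEPRECATED_K_NUMBERS[value]]
    raise ValueError(f"unknown format name: {value!r}")
```

       Under these assumptions, format_from_code(3) yields the classic format
       while format_from_k("3") yields netCDF-4, which is the kind of
       confusion the note above warns about.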
118
119       -l b|c|f77|java
120              The -l flag specifies the output language to use when generating
121              source code that will create or define a  netCDF  file  matching
122              the  netCDF  specification.   The  output is written to standard
123              output.  The currently supported languages  have  the  following
124              flags.
125
126                     c|C => C language output.
127
128                     f77|fortran77 => FORTRAN 77 language output;
129                            note  that  currently only the classic model is
130                            supported.
131
132                     j|java => (experimental) Java language output;
133                            targets the  existing  Unidata  Java  interface,
134                            which  means  that  only the classic model is
135                            supported.
136

Choosing the output format

138       The choice of output format is determined by three factors.
139
140       The -k flag.
141
142       The _Format attribute (see below).
143
144       The occurrence of CDF-5 (64-bit data) or
145              netcdf-4 constructs in the input CDL.  The term "netCDF-4
146              constructs" means constructs from the enhanced data model, not just
147              special performance-related attributes such as
148               _ChunkSizes, _DeflateLevel, _Endianness, etc.  The term  "CDF-5
149              constructs" means extended unsigned integer types allowed in the
150              64-bit data model.
151
152       Note that there is an ambiguity between the netCDF-4 case and the CDF-5
153       case if only an unsigned type is seen in the input.
154
155       The rules are as follows, in order of application.
156
157       1.     If either Fortran or Java output is specified, then -k flag val‐
158              ue of 1 (classic model) will be used.  Conflicts with the use of
159              enhanced constructs in the CDL will report an error.
160
161       2.     If  both  the  -k  flag and _Format attribute are specified, the
162              _Format attribute will be ignored.  If no -k flag is specified, and a
163              _Format  attribute  value  is  specified, then the -k flag value
164              will be set to that of the _Format attribute.  Otherwise the  -k
165              flag is undefined.
166
167       3.     If  the -k option is defined and is consistent with the CDL, nc‐
168              gen will output a file in the requested form, else an error will
169              be reported.
170
171       4.     If  the -k flag is undefined, and if there are CDF-5 constructs,
172              only, in the CDL, a -k flag value of 5 (64-bit data model)  will
173              be used.  If there are true netCDF-4 constructs in the CDL, a -k
174              flag value of 3 (enhanced model) will be used.
175
176       5.     If special performance-related attributes are specified  in  the
177              CDL, a -k flag value of 4 (netCDF-4 classic model) will be used.
178
179       6.     Otherwise ncgen will set the -k flag to 1 (classic model).
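
       The six rules above can be sketched as a small decision function.
       This Python sketch is illustrative and is not how ncgen is
       implemented: the function names, the use of format names instead of
       numeric -k values, and in particular the is_consistent() check are
       assumptions, since the text does not spell out what "consistent with
       the CDL" means.

```python
from typing import Optional

CLASSIC = "classic"
CDF5 = "64-bit data"
NETCDF4 = "netCDF-4"
NETCDF4_CLASSIC = "netCDF-4 classic model"


def is_consistent(fmt: str, has_cdf5: bool, has_netcdf4: bool) -> bool:
    """Assumed check: the format must be able to hold the constructs used."""
    if fmt == NETCDF4:
        return True
    if fmt == CDF5:
        return not has_netcdf4
    # Classic-model formats accept neither kind of extended construct.
    return not (has_cdf5 or has_netcdf4)


def choose_format(k_flag: Optional[str] = None,
                  format_attr: Optional[str] = None,
                  language: Optional[str] = None,
                  has_cdf5: bool = False,
                  has_netcdf4: bool = False,
                  has_special_attrs: bool = False) -> str:
    # Rule 1: Fortran or Java output forces the classic model.
    if language in ("f77", "java"):
        if not is_consistent(CLASSIC, has_cdf5, has_netcdf4):
            raise ValueError("CDL constructs conflict with Fortran/Java output")
        return CLASSIC
    # Rule 2: -k wins over _Format; _Format is used only when -k is absent.
    requested = k_flag if k_flag is not None else format_attr
    # Rule 3: a defined format must be consistent with the CDL.
    if requested is not None:
        if not is_consistent(requested, has_cdf5, has_netcdf4):
            raise ValueError(f"format {requested!r} is inconsistent with the CDL")
        return requested
    # Rule 4: infer the format from the constructs that were seen.
    if has_netcdf4:
        return NETCDF4
    if has_cdf5:
        return CDF5
    # Rule 5: special performance-related attributes alone.
    if has_special_attrs:
        return NETCDF4_CLASSIC
    # Rule 6: default.
    return CLASSIC
```

       For example, under these assumptions choose_format(k_flag="classic",
       format_attr="netCDF-4") returns "classic", because rule 2 ignores the
       _Format attribute when -k is given.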
180
181       -L loglevel
182
183       -M name
184              Specify the name for the main function for C, F77, or Java.
185
186       -n
187
188       -N datasetname
189
190       -o netcdf_file
191              Name of the file to pass to calls to "nc_create()".  If this op‐
192              tion is specified it implies (in the absence of any explicit  -l
193              flag)  the "-b" option.  This option is necessary because netCDF
194              files cannot be written directly to standard output, since stan‐
195              dard output is not seekable.
196
197       -P     Use NC_DISKLESS mode to create the file totally in memory before
198              persisting it to disk.
199
200       -W maxwholevarsize
201              Set the whole-variable size threshold: if the total  number  of
202              elements in a variable is less than maxwholevarsize, then update
203              the variable using a single nc_put_var call.  Requires that the
204              variable has no unlimited dimensions.
205
206       -x     Don't  initialize data with fill values.  This can speed up cre‐
207              ation of large netCDF files greatly, but later attempts to  read
208              unwritten  data  from  the generated file will not be easily de‐
209              tectable.
210
211

EXAMPLES

213       Check the syntax of the CDL file `foo.cdl':
214
215              ncgen foo.cdl
216
217       From the CDL file `foo.cdl', generate an equivalent binary netCDF  file
218       named `x.nc':
219
220              ncgen -o x.nc foo.cdl
221
222       From the CDL file `foo.cdl', generate a C program containing the netCDF
223       function invocations necessary to create an  equivalent  binary  netCDF
224       file named `x.nc':
225
226              ncgen -lc foo.cdl >x.c
227

USAGE

229   CDL Syntax Overview
230       Below is an example of CDL syntax, describing a netCDF file with sever‐
231       al named dimensions (lat, lon, and time), variables (Z, t, p, rh,  lat,
232       lon,  time), variable attributes (units, long_name, valid_range, _Fill‐
233       Value), and some data.  CDL keywords are in boldface.  (This example is
234       intended  to  illustrate  the syntax; a real CDL file would have a more
235       complete set of attributes so that the data would  be  more  completely
236       self-describing.)
237              netcdf foo {  // an example netCDF specification in CDL
238
239              types:
240                  ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
241                  opaque(11) opaque_t;
242                  int(*) vlen_t;
243
244              dimensions:
245                   lat = 10, lon = 5, time = unlimited ;
246
247              variables:
248                   long    lat(lat), lon(lon), time(time);
249                   float   Z(time,lat,lon), t(time,lat,lon);
250                   double  p(time,lat,lon);
251                   long    rh(time,lat,lon);
252
253                   string  country(time,lat,lon);
254                   ubyte   tag;
255
256                   // variable attributes
257                   lat:long_name = "latitude";
258                   lat:units = "degrees_north";
259                   lon:long_name = "longitude";
260                   lon:units = "degrees_east";
261                   time:units = "seconds since 1992-1-1 00:00:00";
262
263                   // typed variable attributes
264                   string Z:units = "geopotential meters";
265                   float Z:valid_range = 0., 5000.;
266                   double p:_FillValue = -9999.;
267                   long rh:_FillValue = -1;
268                   vlen_t :globalatt = {17, 18, 19};
269              data:
270                   lat   = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
271                   lon   = -140, -118, -96, -84, -52;
272              group: g {
273              types:
274                  compound cmpd_t { vlen_t f1; enum_t f2;};
275              } // group g
276              group: h {
277              variables:
278                   /g/cmpd_t  compoundvar;
279              data:
280                      compoundvar = { {3,4,5}, enum_t.Stratus } ;
281              } // group h
282              }
283
284       All  CDL  statements  are terminated by a semicolon.  Spaces, tabs, and
285       newlines can be used freely for readability.  Comments may  follow  the
286       characters `//' on any line.
287
288       A  CDL  description consists of four optional parts: types, dimensions,
289       variables, and data, beginning with the keywords `types:', `dimensions:',
290       `variables:',  and `data:', respectively.  Note two things: (1) the
291       keyword includes the trailing colon, so there must not be any space
292       before the colon character, and (2) the keywords are required to be lower
293       case.
294
295       The variables: section may contain variable declarations and  attribute
296       assignments.  All sections may contain global attribute assignments.
297
298       In  addition,  after the data: section, the user may define a series of
299       groups (see the example above).  Groups themselves can  contain  types,
300       dimensions, variables, data, and other (nested) groups.
301
302       The  netCDF  types: section declares the user defined types.  These may
303       be constructed using any of the following types: enum, vlen, opaque, or
304       compound.
305
306       A  netCDF  dimension  is used to define the shape of one or more of the
307       multidimensional variables contained in the netCDF file.  A netCDF
308       dimension has a name and a size.  A dimension can have unlimited
309       size, which means a variable using  this  dimension  can  grow  to  any
310       length in that dimension.
311
312       A  variable  represents  a multidimensional array of values of the same
313       type.  A variable has a name, a data type, and a shape described by its
314       list  of dimensions.  Each variable may also have associated attributes
315       (see below) as well as data values.  The name, data type, and shape  of
316       a  variable are specified by its declaration in the variable section of
317       a CDL description.  A variable may have the same name as  a  dimension;
318       by  convention  such a variable is one-dimensional and contains coordi‐
319       nates of the dimension it names.  Dimensions need not have  correspond‐
320       ing variables.
321
322       A  netCDF  attribute  contains  information  about a netCDF variable or
323       about the whole netCDF dataset.  Attributes are used  to  specify  such
324       properties  as units, special values, maximum and minimum valid values,
325       scaling factors, offsets, and  parameters.   Attribute  information  is
326       represented by single values or arrays of values.  For example, "units"
327       is an attribute represented by a character array such as "celsius".  An
328       attribute  has  an  associated variable, a name, a data type, a length,
329       and a value.  In contrast to variables that are intended for data,  at‐
330       tributes are intended for metadata (data about data).  Unlike netCDF-3,
331       attribute types can be any user defined  type  as  well  as  the  usual
332       built-in types.
333
334       In  CDL,  an attribute is designated by a type, a variable, a ':', and
335       then an attribute name.  The type is optional and if missing,  it  will
336       be  inferred from the values assigned to the attribute.  It is possible
337       to assign global attributes not associated with  any  variable  to  the
338       netCDF as a whole by omitting the variable name in the attribute decla‐
339       ration.  Notice that there is a potential ambiguity in a  specification
340       such as
341       x : a = ...
342       In  this situation, x could be either a type for a global attribute, or
343       the variable name for an attribute. Since there could both  be  a  type
344       named  x  and  a  variable named x, there is an ambiguity.  The rule is
345       that in this situation, x will be interpreted as a  type  if  possible,
346       and otherwise as a variable.
347
348       If  not specified, the data type of an attribute in CDL is derived from
349       the type of the value(s) assigned to it.  The length of an attribute is
350       the  number  of data values assigned to it, or the number of characters
351       in the character string assigned to it.  Multiple values  are  assigned
352       to  non-character attributes by separating the values with commas.  All
353       values assigned to an attribute must be of the same type.
354
355       The names for CDL dimensions, variables, attributes, types, and  groups
356       may  contain  any  non-control utf-8 character except the forward slash
357       character (`/').  However, certain characters must be escaped if they are
358       used  in  a name, where the escape character is the backward slash `\'.
359       In particular, if the leading character of the name is a digit  (0-9),
360       then  it  must  be  preceded by the escape character.  In addition, the
361       characters ` !"#$%&()*,:;<=>?[]^`´{}|~\' must be escaped if they  occur
362       anywhere  in a name.  Note also that attribute names that begin with an
363       underscore (`_') are reserved for the use of Unidata and should not  be
364       used in user defined attributes.
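
       The escaping rules above can be sketched in a few lines of Python.
       This is an illustrative helper, not something ncgen provides: the
       function name is hypothetical, the set of special characters is
       copied as it appears in the paragraph above, and control characters
       are not checked.

```python
# Characters that must be escaped anywhere in a name, copied from the text
# above as printed (including the acute accent U+00B4).
SPECIAL = set(' !"#$%&()*,:;<=>?[]^`\u00b4{}|~\\')


def escape_cdl_name(name: str) -> str:
    """Backslash-escape a name for use in CDL (illustrative sketch)."""
    if "/" in name:
        raise ValueError("the forward slash is not allowed in CDL names")
    out = []
    for i, ch in enumerate(name):
        # Escape a leading digit and any special character.
        if ch in SPECIAL or (i == 0 and ch in "0123456789"):
            out.append("\\")
        out.append(ch)
    return "".join(out)
```

       With this sketch, the name `2m temp' becomes `\2m\ temp', while a
       plain name such as `lat' is returned unchanged.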
365
366       Note  also  that  the words `variables', `dimensions', `data', `group',
367       and `types' are legal CDL names, but be careful that there is  a  space
368       between  them and any following colon character when used as a variable
369       name.  This is mostly an issue with attribute declarations.  For  exam‐
370       ple, consider this.
371
372
373               netcdf ... {
374               ...
375               variables:
376                  int dimensions;
377                      dimensions: attribute=0 ; // this will cause an error
378                      dimensions : attribute=0 ; // this is ok.
379                   ...
380               }
381
382       The optional data: section of a CDL specification is where netCDF vari‐
383       ables may be initialized.  The syntax of an initialization is simple: a
384       variable  name, an equals sign, and a comma-delimited list of constants
385       (possibly separated by spaces, tabs and  newlines)  terminated  with  a
386       semicolon.   For  multi-dimensional  arrays,  the last dimension varies
387       fastest.  Thus row-order rather than column order is used for matrices.
388       If  fewer values are supplied than are needed to fill a variable, it is
389       extended with a type-dependent `fill value', which can be overridden by
390       supplying  a value for a distinguished variable attribute named `_Fill‐
391       Value'.  The types of constants need not match the type declared for  a
392       variable; coercions are done to convert integers to floating point, for
393       example.  The constant `_' can be used to designate the fill value  for
394       a  variable.   If the type of the variable is explicitly `string', then
395       the special constant `NIL` can be used to represent a nil string, which
396       is not the same as a zero length string.
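
       The ordering and fill behavior described above can be illustrated
       with a short Python sketch.  It is not ncgen code: the function name
       is hypothetical, and it assumes a variable whose dimensions all have
       fixed sizes.

```python
from itertools import product


def layout_row_major(values, shape, fill):
    """Map index tuples to values, last dimension varying fastest.

    Missing trailing values are replaced by the fill value.
    """
    total = 1
    for n in shape:
        total *= n
    if len(values) > total:
        raise ValueError("more values than the variable can hold")
    padded = list(values) + [fill] * (total - len(values))
    # itertools.product advances the last index fastest (row-major order).
    indices = product(*(range(n) for n in shape))
    return {idx: padded[i] for i, idx in enumerate(indices)}
```

       For a 2 x 3 variable given the four values 1, 2, 3, 4 and a fill
       value of -1, the first row receives 1, 2, 3 and the second row
       receives 4, -1, -1.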
397
398   Primitive Data Types
399              char      characters
400              byte      8-bit data
401              short     16-bit signed integers
402              int       32-bit signed integers
403              long      (synonymous with int)
404              int64     64-bit signed integers
405              float     IEEE single precision floating point (32 bits)
406              real      (synonymous with float)
407              double    IEEE double precision floating point (64 bits)
408              ubyte     unsigned 8-bit data
409              ushort    16-bit unsigned integers
410              uint      32-bit unsigned integers
411              uint64    64-bit unsigned integers
412              string    arbitrary length strings
413
414       CDL  supports  a  superset of the primitive data types of C.  The names
415       for the primitive data types are reserved words in CDL, so the names of
416       variables, dimensions, and attributes must not be primitive type names.
417       In declarations, type names may be specified in either upper  or  lower
418       case.
419
420       Bytes are intended to hold a full eight bits of data, and the zero byte
421       has no special significance, as it may for character data.  ncgen
422       converts byte declarations to char declarations in the output C code and
423       to the nonstandard BYTE declaration in output Fortran code.
424
425       Shorts can hold values between -32768 and 32767.  ncgen converts  short
426       declarations to short declarations in the output C code and to the non‐
427       standard INTEGER*2 declaration in output Fortran code.
428
429       Ints can hold values between -2147483648 and  2147483647.   ncgen  con‐
430       verts  int declarations to int declarations in the output C code and to
431       INTEGER declarations in output Fortran code.  long  is  accepted  as  a
432       synonym  for int in CDL declarations, but is deprecated since there are
433       now platforms with 64-bit representations for C longs.
434
435       Int64   can    hold    values    between    -9223372036854775808    and
436       9223372036854775807.   ncgen  converts  int64 declarations to long long
437       declarations in the output C code.
438
439       Floats can hold values between about -3.4e+38 and 3.4e+38.  Their
440       external representation is as 32-bit IEEE normalized single-precision
441       floating point numbers.  ncgen converts float declarations to float
442       declarations in the output C code and to REAL declarations in output
443       Fortran code.  real is accepted as a synonym for float in CDL declarations.
444
445       Doubles can hold values between about -1.7e+308 and 1.7e+308.  Their
446       external representation is as 64-bit IEEE standard normalized
447       double-precision floating point numbers.  ncgen converts double
448       declarations to double declarations in the output C code and to DOUBLE
449       PRECISION declarations in output Fortran code.
450
451       The unsigned counterparts of the above integer types are mapped to  the
452       corresponding  unsigned C types.  Their ranges are suitably modified to
453       start at zero.
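
       The integer ranges quoted above follow directly from the bit widths.
       The Python sketch below (the function name is hypothetical) computes
       them, assuming two's-complement signed types.

```python
def int_range(bits: int, signed: bool = True):
    """Return (min, max) for an integer type of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# Widths of the CDL integer types as listed in the table above.
CDL_INT_TYPES = {
    "short": (16, True), "int": (32, True), "int64": (64, True),
    "ubyte": (8, False), "ushort": (16, False),
    "uint": (32, False), "uint64": (64, False),
}
```

       For instance, int_range(16) gives the short range of -32768 to 32767,
       and int_range(8, signed=False) gives the ubyte range of 0 to 255.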
454
455       The technical interpretation of the char type is that it is an unsigned
456       8-bit  value. The encoding of the 256 possible values is unspecified by
457       default. A variable of char type may be marked with an "_Encoding"
458       attribute to indicate the character set to be used: US-ASCII, ISO-8859-1,
459       etc.  Note that specifying the encoding of UTF-8 is equivalent to
460       specifying US-ASCII.  This is because multi-byte UTF-8 characters cannot be
461       stored in an 8-bit character. The only legal single byte  UTF-8  values
462       are  by  definition the 7-bit US-ASCII encoding with the top bit set to
463       zero.
464
465       Strings are assumed by default to be encoded using  UTF-8.   Note  that
466       this  means  that  multi-byte  UTF-8  encodings  may  be present in the
467       string, so it is possible that the number of distinct UTF-8  characters
468       in a string is smaller than the number of 8-bit bytes used to store the
469       string.
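
       The difference between character count and byte count can be seen
       with a short Python sketch.  The helper name is hypothetical, and
       Python's len() counts Unicode code points, which is used here as a
       stand-in for "characters".

```python
def utf8_lengths(text: str):
    """Return (number of characters, number of UTF-8 bytes) for a string."""
    return len(text), len(text.encode("utf-8"))
```

       A pure ASCII string such as "abc" has three characters and three
       bytes, whereas "caf\u00e9" (with an accented final letter) has four
       characters but needs five bytes.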
470
471   CDL Constants
472       Constants assigned to attributes or variables may be of any of the  ba‐
473       sic netCDF types.  The syntax for constants is similar to C syntax, ex‐
474       cept that type suffixes must be appended to shorts and floats  to  dis‐
475       tinguish them from longs and doubles.
476
477       A  byte  constant  is represented by an integer constant with a `b' (or
478       `B') appended.  In the old netCDF-2 API, byte constants could  also  be
479       represented  using single characters or standard C character escape
480       sequences such as `a' or `\n'.  This is still supported for backward
481       compatibility,  but  deprecated  to make the distinction clear between the
482       numeric byte type and the textual char type.   Example  byte  constants
483       include:
484               0b             // a zero byte
485               -1b            // -1 as an 8-bit byte
486               255b           // also -1 as a signed 8-bit byte
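
       The last two examples denote the same stored value because 255 and
       -1 share one 8-bit pattern.  A minimal Python sketch of that
       arithmetic, assuming a two's-complement reading of the byte (the
       function name is hypothetical):

```python
def as_signed_byte(value: int) -> int:
    """Interpret an integer in -128..255 as a signed 8-bit quantity."""
    if not -128 <= value <= 255:
        raise ValueError("value does not fit in 8 bits")
    # Values above 127 wrap around into the negative range.
    return value - 256 if value > 127 else value
```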
487
488       short  integer  constants  are  intended for representing 16-bit signed
489       quantities.  The form of a short constant is an integer  constant  with
490       an `s' or `S' appended.  If a short constant begins with `0', it is in‐
491       terpreted as octal, except that if it begins with `0x',  it  is  inter‐
492       preted as a hexadecimal constant.  For example:
493              -2s  // a short -2
494              0123s     // octal
495              0x7ffs  //hexadecimal
496
497       int integer constants are intended for representing 32-bit signed quan‐
498       tities.  The form of an int constant is an ordinary  integer  constant,
499       although  it  is  acceptable  to  optionally append a single `l' or `L'
500       (again, deprecated). Be careful, though, the L suffix is interpreted as
501       a  32 bit integer, and never as a 64 bit integer. This can be confusing
502       since the C long type can ambiguously be either 32 bit or 64 bit.
503
504       If an int constant begins with `0', it is interpreted as octal,  except
505       that  if  it  begins with `0x', it is interpreted as a hexadecimal con‐
506       stant (but see opaque constants below).  Examples  of  valid  int  con‐
507       stants include:
508              -2
509              1234567890L
510              0123      // octal
511              0x7ff          // hexadecimal
512
513       int64  integer  constants  are  intended for representing 64-bit signed
514       quantities.  The form of an int64 constant is an integer constant  with
515       an  `ll' or `LL' appended.  If an int64 constant begins with `0', it is
516       interpreted as octal, except that if it begins with `0x', it is  inter‐
517       preted as a hexadecimal constant.  For example:
518              -2ll // an int64 -2
519              0123LL    // octal
520              0x7ffLL  //hexadecimal
521
522       Floating point constants of type float are appropriate for representing
523       floating point data with about seven significant digits  of  precision.
524       The form of a float constant is the same as a C floating point constant
525       with an `f' or `F' appended.  For example the following are all accept‐
526       able float constants:
527              -2.0f
528              3.14159265358979f   // will be truncated to less precision
529              1.f
530
531
532       Floating  point constants of type double are appropriate for represent‐
533       ing floating point data with about sixteen significant digits of preci‐
534       sion.   The form of a double constant is the same as a C floating point
535       constant.  An optional `d' or `D' may be  appended.   For  example  the
536       following are all acceptable double constants:
537              -2.0
538              3.141592653589793
539              1.0e-20
540              1.d
541
542       Unsigned  integer  constants  can be created by appending the character
543       'U' or 'u' between the constant and any trailing size specifier, or im‐
544       mediately  at  the  end of the size specifier.  Thus one could say 10U,
545       100su, 100000ul, or 1000000llu, for example.
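
       The suffix rules for numeric constants described in this section can
       be summarized by a small classifier.  This Python sketch is an
       approximation and is not the ncgen lexer: the names are hypothetical,
       and it assumes that in a hexadecimal constant the letters a-f are
       always read as digits, so the `b', `f' and `d' suffixes are only
       recognized on decimal or floating point constants.

```python
import re

_HEX = re.compile(r"^[+-]?0[xX][0-9a-fA-F]+(?P<suffix>[uUsSlL]*)$")
_DEC = re.compile(r"^[+-]?[0-9]+(?P<suffix>[uUbBsSlL]*)$")
_FLT = re.compile(
    r"^[+-]?(?:[0-9]+\.[0-9]*(?:[eE][+-]?[0-9]+)?"
    r"|\.[0-9]+(?:[eE][+-]?[0-9]+)?"
    r"|[0-9]+[eE][+-]?[0-9]+)(?P<suffix>[fFdD]?)$"
)
# Size suffix -> CDL type; no suffix or a single 'l' both mean int.
_SIZES = {"b": "byte", "s": "short", "": "int", "l": "int", "ll": "int64"}


def classify_constant(text: str) -> str:
    """Return the CDL type name implied by a numeric constant's suffix."""
    m = _HEX.match(text) or _DEC.match(text)
    if m:
        suffix = m.group("suffix").lower()
        unsigned = suffix.startswith("u") or suffix.endswith("u")
        if unsigned:
            # Drop exactly one 'u', from the front or from the back.
            size = suffix[1:] if suffix.startswith("u") else suffix[:-1]
        else:
            size = suffix
        if size not in _SIZES:
            raise ValueError(f"unrecognized suffix in {text!r}")
        name = _SIZES[size]
        return "u" + name if unsigned else name
    m = _FLT.match(text)
    if m:
        return "float" if m.group("suffix").lower() == "f" else "double"
    raise ValueError(f"not a numeric constant: {text!r}")
```

       Under these assumptions, classify_constant("0x7ffs") returns "short",
       classify_constant("1000000llu") returns "uint64", and
       classify_constant("1.d") returns "double".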
546
547       Single character constants may be enclosed in single quotes.  If a  se‐
548       quence of one or more characters is enclosed in double quotes, then its
549       interpretation must be inferred from the context.  If  the  dataset  is
550       created using the netCDF classic model, then all such constants are in‐
551       terpreted as a character array, so each character in  the  constant  is
552       interpreted as if it were a single character.  If the dataset is netCDF
553       extended, then the constant may be interpreted as for the classic model
554       or  as a true string (see below) depending on the type of the attribute
555       or variable in which the string is contained.
556
557       The interpretation of char constants is that  those  that  are  in  the
558       printable  ASCII  range  ('  '..'~')  are  assumed to be encoded as the
559       1-byte subset of UTF-8, which is equivalent to US-ASCII.  In all cases,
560       the  usual  C  string  escape conventions are honored for values from 0
561       thru 127. Values greater than 127 are allowed, but  their  encoding  is
562       undefined.  For netCDF extended, the use of the char type is deprecated
563       in favor of the string type.
564
565       Some character constant examples are as follows.
566               'a'      // ASCII `a'
567               "a"      // equivalent to 'a'
568               "Two\nlines\n"     // a 10-character string with two embedded newlines
569               "a bell:\007" // a string containing an ASCII bell
570       Note that the netCDF character array "a" would  fit  in  a  one-element
571       variable,  since  no terminating NULL character is assumed.  However, a
572       zero byte in a character array is interpreted as the end of the signif‐
573       icant  characters  by  the  ncdump program, following the C convention.
574       Therefore, a NULL byte should not be embedded in a character string un‐
575       less  at  the  end: use the byte data type instead for byte arrays that
576       contain the zero byte.
577
578       String constants are, like character constants, represented using
579       double quotes. This represents a potential ambiguity since a
580       multi-character string may also indicate a dimensioned character value.
581       Disambiguation usually occurs by context, but care should be taken to
582       specify the string type to ensure the proper choice.  String constants are
583       assumed to always be UTF-8 encoded. This specifically means that the
584       string constant may actually contain multi-byte UTF-8 characters.   The
585       special  constant `NIL` can be used to represent a nil string, which is
586       not the same as a zero length string.
587
588       Opaque constants are represented as  sequences  of  hexadecimal  digits
589       preceded  by  0X  or  0x: 0xaa34ffff, for example.  These constants can
590       still be used as integer constants and will be either truncated or  ex‐
591       tended as necessary.
592
593   Compound Constant Expressions
594       In  order  to  assign values to variables (or attributes) whose type is
595       a user-defined type, the constant notation has been extended to include
596       sequences  of  constants  enclosed  in curly brackets (e.g. "{"..."}").
597       Such a constant is called a compound constant, and  compound  constants
598       can be nested.
599
600       Given  a type "T(*) vlen_t", where T is some other arbitrary base type,
601       constants for this should be specified as follows.
602           vlen_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2m};
603       The values tij are assumed to be constants of type T.
604
605       Given a type "compound cmpd_t {T1 f1; T2 f2...Tn fn}", where the Ti are
606       other  arbitrary  base types, constants for this should be specified as
607       follows.
608           cmpd_t var[2] = {t11,t12,...t1n}, {t21,t22,...t2n};
609       The values tij are assumed to be constants of type Tj.  If any  fields
610       are  missing, then they will be set using any specified or default fill
611       value for the field's base type.
612
613       The general set of rules for using braces is defined in the Specifying
614       Datalists section below.
615
616   Scoping Rules
617       With  the  addition of groups, the name space for defined objects is no
618       longer flat. References (names) of any type, dimension, or variable may
619       be  prefixed  with the absolute path specifying a specific declaration.
620       Thus one might say
621           variables:
622               /g1/g2/t1 v1;
623       The type being referenced (t1) is the one within  group  g2,  which  in
624       turn  is  nested  in group g1.  The similarity of this notation to Unix
625       file paths is deliberate, and one can consider groups as a form of  di‐
626       rectory structure.
627
628       When a name is not prefixed, scope rules are applied  to  locate  the
629       specified declaration. Currently, there are three rules: one for dimen‐
630       sions, one for types and enumeration constants, and one for all others.
631
632       1. When an unprefixed name of a dimension is used (as in a variable
633              declaration), ncgen first looks in the immediately enclosing
634              group for the dimension.  If it is not found there, then it
635              looks in the group enclosing this group.  This continues up
636              the group hierarchy until the dimension is found, or there
637              are no more groups to search.
638
639       2. When an unprefixed name of a type  or  an  enumeration  constant  is
640              used,  ncgen  searches  the  group tree using a pre-order depth-
641              first search. This essentially  means  that  it  will  find  the
642              matching  declaration  that  precedes the reference textually in
643              the CDL file and that is "highest" in the group hierarchy.
644
645       3. For all  other  names,  only  the  immediately  enclosing  group  is
646              searched.
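
       Rule 1 above (dimension lookup) can be pictured with a short Python
       sketch.  This is only an illustration of the search order described
       in the text, not ncgen's implementation; the Group class and the
       find_dimension function are hypothetical names invented for the
       example.

```python
class Group:
    """Hypothetical stand-in for a CDL group; not an ncgen data structure."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # enclosing group, or None for the root
        self.dimensions = {}      # dimension name -> size


def find_dimension(group, name):
    """Search the enclosing groups, innermost first, for a dimension.

    Returns (defining_group, size), or None if no enclosing group
    declares the dimension.
    """
    while group is not None:
        if name in group.dimensions:
            return group, group.dimensions[name]
        group = group.parent      # move up one level in the hierarchy
    return None


root = Group("/")
g1 = Group("g1", parent=root)
g2 = Group("g2", parent=g1)
root.dimensions["time"] = 10
g2.dimensions["x"] = 3

# "time" is not declared in g2 or g1, so the search ends in the root group.
owner, size = find_dimension(g2, "time")
```

       Under rule 3, by contrast, only the first iteration of the loop
       would run, since just the immediately enclosing group is searched.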
647
648       One  final  note.  Forward references are not allowed.  This means that
649       specifying, for example, /g1/g2/t1 will fail if this  reference  occurs
650       before g1 and/or g2 are defined.
651
652   Specifying Enumeration Constants
653       References  to  enumeration  constants (in data lists) can be ambiguous
654       since the same enumeration constant name can be defined  in  more  than
655       one  enumeration.  If  a CDL file specifies an ambiguous constant, then
656       ncgen will signal an error. Such constants can be disambiguated in  two
657       ways.
658
659       1.     Prefix the enumeration constant with the name of the enumeration
660              separated by a dot: enum.econst, for example.
661
662       2.     If case one is not sufficient to  disambiguate  the  enumeration
663              constant, then one must specify the precise enumeration type us‐
664              ing a group path: /g1/g2/enum.econst, for example.
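
       The first disambiguation method can be illustrated with a minimal
       Python sketch.  It assumes the enumerations are held in a plain
       dictionary; the function name, the dictionary layout, and the
       sample enumerations are all hypothetical, and group paths (the
       second method) are deliberately left out.

```python
def resolve_enum_constant(ref, enums):
    """Resolve an enumeration constant reference to its value.

    enums maps an enumeration name to a {constant name: value} dict.
    ref is either a bare constant name or "enumname.constname".
    """
    if "." in ref:
        # Qualified form: look in exactly one enumeration.
        enum_name, const_name = ref.split(".", 1)
        return enums[enum_name][const_name]
    # Bare form: acceptable only if exactly one enumeration defines it.
    matches = [consts[ref] for consts in enums.values() if ref in consts]
    if len(matches) != 1:
        raise ValueError("ambiguous or unknown enumeration constant: " + ref)
    return matches[0]


enums = {
    "color_t": {"RED": 0, "GREEN": 1},
    "light_t": {"RED": 5, "AMBER": 6},
}

green = resolve_enum_constant("GREEN", enums)          # defined only once
light_red = resolve_enum_constant("light_t.RED", enums)  # qualified by name
```

       A bare reference to RED is rejected in this sketch because two
       enumerations define it, mirroring the error described above.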
665
666   Special Attributes
667       Special virtual attributes can be  specified  to  provide  performance-
668       related  information  about  the file format and about variable proper‐
669       ties.  The file must be a netCDF-4 file for these to take effect.
670
671       These special virtual attributes are not actually part of the file;
672       they are merely a convenient way to set miscellaneous properties of the
673       data in CDL.
674
675       The special attributes currently supported are as  follows:  `_Format',
676       `_Fletcher32', `_ChunkSizes', `_Endianness',  `_DeflateLevel', `_Shuf‐
677       fle', and `_Storage'.
678
679       `_Format' is a global attribute specifying the netCDF  format  variant.
680       Its  value  must  be a single string matching one of `classic', `64-bit
681       offset', `64-bit data', `netCDF-4', or `netCDF-4 classic model'.
682
683       The rest of the special attributes are all variable attributes.  Essen‐
684       tially  all of them map to some corresponding `nc_def_var_XXX' function
685       as defined in the netCDF-4 API.  For the attributes that are essential‐
686       ly  boolean (_Fletcher32, _Shuffle, and _NOFILL), the value true can be
687       specified by using the strings `true' or `1', or by using  the  integer
688       1.  The value false expects either `false', `0', or the integer 0.  The
689       actions associated with these attributes are as follows.
690
691       1. `_Fletcher32' sets the `fletcher32' property for a variable.
692
693       2. `_Endianness' is either `little' or  `big',  depending  on  how  the
694          variable is stored when first written.
695
696       3. `_DeflateLevel'  is an integer between 0 and 9 inclusive if compres‐
697          sion has been specified for the variable.
698
699       4. `_Shuffle' specifies whether the shuffle filter should be used.
700
701       5. `_Storage' is `contiguous', `compact', or `chunked'.
702
703       6. `_ChunkSizes' is a list of chunk sizes for  each  dimension  of  the
704          variable.
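
       The accepted spellings for the boolean special attributes described
       above can be summarized in a small Python sketch.  The function
       name is hypothetical, and exact (case-sensitive) matching of the
       strings is an assumption of the sketch; the text above does not
       say how ncgen treats other capitalizations.

```python
def parse_special_bool(value):
    """Map a boolean special-attribute value to True or False.

    Accepts the strings 'true'/'1' or the integer 1 for true, and the
    strings 'false'/'0' or the integer 0 for false.  Matching is exact,
    which is an assumption of this sketch.
    """
    if value in ("true", "1", 1):
        return True
    if value in ("false", "0", 0):
        return False
    raise ValueError("not a boolean attribute value: %r" % (value,))


shuffle_on = parse_special_bool("true")   # string form
fletcher_off = parse_special_bool(0)      # integer form
```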
705
706       Note  that  attributes  such  as "add_offset" or "scale_factor" have no
707       special meaning to ncgen.  These attributes are currently  conventions,
708       handled  above the library layer by other utility packages, for example
709       NCO.
710
711   Specifying Datalists
712       Specifying datalists for variables in the `data:` section can be  some‐
713       what  complicated. There are some rules that must be followed to ensure
714       that datalists are parsed correctly by ncgen.
715
716       First, the top level is automatically assumed to be a list of items, so
717       it  should  not  be inside {...}.  That means that if the variable is a
718       scalar, there will be a single top-level element and if the variable is
719       an  array, there will be N top-level elements.  For each element of the
720       top level list, the following rules should be applied.
721
722       1. Instances of UNLIMITED dimensions (other than the  first  dimension)
723          must be surrounded by {...} in order to specify the size.
724
725       2. Compound instances must be embedded in {...}.
726
727       3. Non-scalar fields of compound instances must be embedded in {...}.
728
729       4. Instances  of  vlens must be surrounded by {...} in order to specify
730          the size.
731
732       5. Datalists associated with attributes are implicitly a vector (i.e.,
733          a list) of values of the type of the attribute, and the above rules
734          apply with that in mind.
735
736       6. No other use of braces is allowed.
737
738       Note that one consequence of these rules is that arrays of values  can‐
739       not   have   subarrays  within  braces.   Consider,  for  example,  int
740       var(d1,d2,...,dn), where none of d2...dn are unlimited.   A  datalist
741       for  this  variable must be a single list of integers, where the number
742       of integers is no more than D=d1*d2*...dn; note that the list can hold
743       fewer than D values, in which case fill values will be used to pad  the
744       list.
745
746       Rule 5, about attribute datalists, has the following  consequence.   If
747       the type of the attribute is a compound (or vlen) type, and if the
748       number of entries in the list is one, then the compound instance  must
749       be enclosed in braces.
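
       The padding of a short array datalist described above can be
       sketched in Python as follows.  The function name is hypothetical,
       the fill value is passed in rather than taken from the netCDF
       defaults, and raising an error for an over-long list is an
       assumption of the sketch, not documented ncgen behavior.

```python
from math import prod


def fill_datalist(values, dims, fill_value):
    """Pad a flat datalist with fill_value up to D = d1*d2*...*dn items."""
    total = prod(dims)                      # D, the number of elements
    if len(values) > total:
        # Assumption: treat too many values as an error in this sketch.
        raise ValueError("datalist has more than %d values" % total)
    return list(values) + [fill_value] * (total - len(values))


# A 2 x 3 variable given only three values; -1 is a placeholder fill value.
padded = fill_datalist([1, 2, 3], (2, 3), -1)
```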
750
751   Specifying Character Datalists
752       Specifying datalists for variables of type char also has some complica‐
753       tions. Consider, for example:
754              dimensions: u=UNLIMITED; d1=1; d2=2; d3=3;
755                          d4=4; d5=5; u2=UNLIMITED;
756              variables: char var(d4,d5);
757              datalist: var="1", "two", "three";
758
759       We have twenty elements of var to fill (d5 X  d4)  and  we  have  three
760       strings  of  length  1,  3,  5.  How do we assign the characters in the
761       strings to the twenty elements?
762
763       This is challenging because it is desirable to mimic the original ncgen
764       (ncgen3).  The core algorithm is notionally as follows.
765
766       1. Assume  we  have a set of dimensions D1..Dn, where D1 may optionally
767          be an Unlimited dimension.  It is assumed that the sizes of  the  Di
768          are all known (including unlimited dimensions).
769
770       2. Given  a  sequence of string or character constants C1..Cm, our goal
771          is to construct a single string whose length is the product  of  D1
772          thru Dn.  Note that for purposes of this algorithm, character
773          constants are treated as strings of size 1.
774
775       3. Construct Dx = product of D1 thru D(n-1).
776
777       4. For each constant Ci, add fill characters  as  needed  so  that  its
778          length is a multiple of Dn.
779
780       5. Concatenate the modified C1..Cm to produce string S.
781
782       6. Add fill characters to S to make its length be a multiple of Dn.
783
784       7. If S is longer than Dx * Dn, then truncate it and generate a
785          warning.
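
       The steps above can be followed in a short Python sketch.  It is a
       notional rendering of the listed steps only, not the code ncgen
       uses; the function names are hypothetical, and the fill character
       (a zero byte by default here) is an assumption made for the
       example.

```python
def pad_to_multiple(s, n, fill):
    """Append fill characters until len(s) is a multiple of n."""
    remainder = len(s) % n
    if remainder:
        s += fill * (n - remainder)
    return s


def build_char_data(constants, dims, fill="\0"):
    """Combine string constants for a char variable with sizes dims.

    Returns (S, truncated), where truncated is True if S had to be cut
    down to Dx * Dn characters.
    """
    dn = dims[-1]
    dx = 1
    for d in dims[:-1]:                      # Dx = D1 * ... * D(n-1)
        dx *= d
    padded = [pad_to_multiple(c, dn, fill) for c in constants]
    s = "".join(padded)                      # concatenate into S
    s = pad_to_multiple(s, dn, fill)         # S is a multiple of Dn
    truncated = len(s) > dx * dn
    if truncated:
        s = s[: dx * dn]                     # a warning would be issued here
    return s, truncated


# char var(d4,d5) with d4=4, d5=5, and the constants "1", "two", "three".
s, truncated = build_char_data(["1", "two", "three"], (4, 5))
```

       For the example variable, each constant is padded to five
       characters, so S occupies the first three of the four rows.  The
       steps as listed do not pad S out to Dx * Dn, and the sketch stops
       where they do.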
786
787       There are three other cases of note.
788
789       1. If there is only a single, unlimited dimension, then all of the con‐
790          stants  are concatenated and fill characters are added to the end of
791          the resulting string to make its length be that of the unlimited di‐
792          mension.  If the length is larger than the unlimited dimension, then
793          it is truncated with a warning.
794
795       2. For the case of a character typed vlen, "char(*) vlen_t" for example,
796          we simply concatenate all the constants with no filling at all.
797
798       3. For  the  case of a character typed attribute, we simply concatenate
799          all the constants.
800
801       In netCDF-4, dimensions other than the  first  can  be  unlimited.   Of
802       course by the rules above, the interior unlimited instances must be de‐
803       limited by {...}. For example:
804            variables: char var(u,u2);
805            datalist: var={"1", "two"}, {"three"};
806       In this case u will have the effective length of two.  Within each  in‐
807       stance of u2, the rules above will apply, leading to this.
808            datalist: var={"1","t","w","o"}, {"t","h","r","e","e"};
809       The  effective  size  of u2 will be the max of the two instance lengths
810       (five in this case) and the shorter will be padded to produce this.
811            datalist: var={"1","t","w","o","\0"}, {"t","h","r","e","e"};
812
813       Consider an even more complicated case.
814            variables: char var(u,u2,u3);
815            datalist: var={{"1", "two"}}, {{"three"},{"four","xy"}};
816       In this case u again will have the effective length of two.  The u2 di‐
817       mension  will  have a size of max(1,2) = 2.  Within each instance of u2,
818       the rules above will apply, leading to this.
819            datalist: var={{"1","t","w","o"}}, {{"t","h","r","e","e"},{"f","o","u","r","x","y"}};
820       The  effective  size  of  u3  will  be the max of the instance lengths
821       (six in this case) and the shorter ones will be padded to produce this.
822            datalist: var={{"1","t","w","o"," "," "}}, {{"t","h","r","e","e"," "},{"f","o","u","r","x","y"}};
823       Note however that the first instance of u holds only one instance  of
824       u2, fewer than the size of u2 (two), so a filler instance of u2  must
825       be added, producing this.
826            datalist: var={{"1","t","w","o"," "," "},{" "," "," "," "," "," "}}, {{"t","h","r","e","e"," "},{"f","o","u","r","x","y"}};
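
       The three-dimensional example above can be reproduced with a small
       Python sketch.  It only mirrors the walk-through in the text for a
       variable of shape (u,u2,u3); the function name is hypothetical and
       the fill character is a parameter, set to a blank here to match
       the example.

```python
def expand_unlimited(instances, fill):
    """Pad nested char data for var(u,u2,u3) to a rectangular shape.

    instances is a list (one entry per u) of lists (one entry per u2)
    of lists of string constants.  Returns a list of rows, each a list
    of equal-length strings.
    """
    # Within each u2 instance, concatenate the constants.
    rows = [["".join(cell) for cell in inst] for inst in instances]
    u2 = max(len(row) for row in rows)                  # size of u2
    u3 = max(len(cell) for row in rows for cell in row)  # size of u3
    return [
        # Pad each string to u3, then add filler u2 instances as needed.
        [cell.ljust(u3, fill) for cell in row] + [fill * u3] * (u2 - len(row))
        for row in rows
    ]


data = [[["1", "two"]], [["three"], ["four", "xy"]]]
result = expand_unlimited(data, " ")
```

       Here u is 2 (the number of top-level instances), u2 is 2, and u3
       is 6, and the first row gains one filler instance of u2.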
827
828

BUGS

830       The  programs generated by ncgen when using the -c flag use initializa‐
831       tion statements to store data in variables, and will  fail  to  produce
832       compilable  programs  if  you try to use them for large datasets, since
833       the resulting statements may exceed the line length or number  of  con‐
834       tinuation statements permitted by the compiler.
835
836       The  CDL  syntax  makes  it  easy to assign what looks like an array of
837       variable-length strings to a netCDF variable, but the strings may  sim‐
838       ply be concatenated into a single array of characters.  Specific use of
839       the string type specifier may solve the problem.
840
841

Identifiers and Keywords

843       Under certain conditions, some keywords can be used as identifiers.
844
845       1.     If a type keyword is not a type supported by the format  of  the
846              .cdl  file,  then it can be used as an identifier. So, for exam‐
847              ple, when translating a .cdl  file  as  a  netCDF-3  file,  then
848              "string" or "uint64" can be used as identifiers.
849
850       2.     The  keyword  "data" can be used as an identifier because it can
851              be tested in a context sensitive fashion to see if "data"  is  a
852              keyword versus an identifier.
853
854

CDL Grammar

856       The file ncgen.y is the definitive grammar for CDL, but a stripped down
857       version is included here for completeness.
858              ncdesc: NETCDF
859                      datasetid
860                      rootgroup
861                      ;
862
863              datasetid: DATASETID
864
865              rootgroup: '{'
866                         groupbody
867                         subgrouplist
868                         '}';
869
870              groupbody:
871                        attrdecllist
872                        typesection
873                        dimsection
874                        vasection
875                        datasection
876                        ;
877
878              subgrouplist:
879                     /*empty*/
880                   | subgrouplist namedgroup
881                   ;
882
883              namedgroup: GROUP ident '{'
884                          groupbody
885                          subgrouplist
886                          '}'
887                          attrdecllist
888                          ;
889
890              typesection:    /* empty */
891                              | TYPES
892                              | TYPES typedecls
893                              ;
894
895              typedecls:
896                     type_or_attr_decl
897                   | typedecls type_or_attr_decl
898                   ;
899
900              typename: ident ;
901
902              type_or_attr_decl:
903                     typedecl
904                   | attrdecl ';'
905                   ;
906
907              typedecl:
908                     enumdecl optsemicolon
909                   | compounddecl optsemicolon
910                   | vlendecl optsemicolon
911                   | opaquedecl optsemicolon
912                   ;
913
914              optsemicolon:
915                     /*empty*/
916                   | ';'
917                   ;
918
919              enumdecl: primtype ENUM typename '{' enumidlist '}' ;
920
921              enumidlist:   enumid
922                       | enumidlist ',' enumid
923                       ;
924
925              enumid: ident '=' constint ;
926
927              opaquedecl: OPAQUE '(' INT_CONST ')' typename ;
928
929              vlendecl: typeref '(' '*' ')' typename ;
930
931              compounddecl: COMPOUND typename '{' fields '}' ;
932
933              fields:   field ';'
934                   | fields field ';'
935                   ;
936
937              field: typeref fieldlist ;
938
939              primtype:         CHAR_K
940                              | BYTE_K
941                              | SHORT_K
942                              | INT_K
943                              | FLOAT_K
944                              | DOUBLE_K
945                              | UBYTE_K
946                              | USHORT_K
947                              | UINT_K
948                              | INT64_K
949                              | UINT64_K
950                              ;
951
952              dimsection:     /* empty */
953                              | DIMENSIONS
954                              | DIMENSIONS dimdecls
955                              ;
956
957              dimdecls:       dim_or_attr_decl ';'
958                              | dimdecls dim_or_attr_decl ';'
959                              ;
960
961              dim_or_attr_decl: dimdeclist  | attrdecl  ;
962
963              dimdeclist:     dimdecl
964                              | dimdeclist ',' dimdecl
965                              ;
966
967              dimdecl:
968                     dimd '=' UINT_CONST
969                   | dimd '=' INT_CONST
970                   | dimd '=' DOUBLE_CONST
971                   | dimd '=' NC_UNLIMITED_K
972                   ;
973
974              dimd:           ident ;
975
976              vasection:      /* empty */
977                              | VARIABLES
978                              | VARIABLES vadecls
979                              ;
980
981              vadecls:        vadecl_or_attr ';'
982                              | vadecls vadecl_or_attr ';'
983                              ;
984
985              vadecl_or_attr: vardecl  | attrdecl  ;
986
987              vardecl:        typeref varlist ;
988
989              varlist:      varspec
990                          | varlist ',' varspec
991                          ;
992
993              varspec:        ident dimspec ;
994
995              dimspec:        /* empty */
996                              | '(' dimlist ')'
997                              ;
998
999              dimlist:        dimref
1000                              | dimlist ',' dimref
1001                              ;
1002
1003              dimref: path ;
1004
1005              fieldlist:
1006                     fieldspec
1007                   | fieldlist ',' fieldspec
1008                   ;
1009
1010              fieldspec: ident fielddimspec ;
1011
1012              fielddimspec:     /* empty */
1013                              | '(' fielddimlist ')'
1014                              ;
1015
1016              fielddimlist:
1017                     fielddim
1018                   | fielddimlist ',' fielddim
1019                   ;
1020
1021              fielddim:
1022                     UINT_CONST
1023                   | INT_CONST
1024                   ;
1025
1026              /* Use this when referencing defined objects */
1027              varref: type_var_ref ;
1028
1029              typeref: type_var_ref       ;
1030
1031              type_var_ref:
1032                     path
1033                   | primtype
1034                   ;
1035
1036              /* Use this for all attribute decls */
1037              /* Watch out; this is right recursive */
1038              attrdecllist: /*empty*/  | attrdecl ';' attrdecllist  ;
1039
1040              attrdecl:
1041                     ':' ident '=' datalist
1042                   | typeref type_var_ref ':' ident '=' datalist
1043                   | type_var_ref ':' ident '=' datalist
1044                   | type_var_ref ':' _FILLVALUE '=' datalist
1045                   | typeref type_var_ref ':' _FILLVALUE '=' datalist
1046                   | type_var_ref ':' _STORAGE '=' conststring
1047                   | type_var_ref ':' _CHUNKSIZES '=' intlist
1048                   | type_var_ref ':' _FLETCHER32 '=' constbool
1049                   | type_var_ref ':' _DEFLATELEVEL '=' constint
1050                   | type_var_ref ':' _SHUFFLE '=' constbool
1051                   | type_var_ref ':' _ENDIANNESS '=' conststring
1052                   | type_var_ref ':' _NOFILL '=' constbool
1053                   | ':' _FORMAT '=' conststring
1054                   ;
1055
1056              path:
1057                     ident
1058                   | PATH
1059                   ;
1060
1061              datasection:    /* empty */
1062                              | DATA
1063                              | DATA datadecls
1064                              ;
1065
1066              datadecls:
1067                     datadecl ';'
1068                   | datadecls datadecl ';'
1069                   ;
1070
1071              datadecl: varref '=' datalist ;
1072              datalist:
1073                     datalist0
1074                   | datalist1
1075                   ;
1076
1077              datalist0:
1078                   /*empty*/
1079                   ;
1080
1081              /* Must have at least 1 element */
1082              datalist1:
1083                     dataitem
1084                   | datalist ',' dataitem
1085                   ;
1086
1087              dataitem:
1088                     constdata
1089                   | '{' datalist '}'
1090                   ;
1091
1092              constdata:
1093                     simpleconstant
1094                   | OPAQUESTRING
1095                   | FILLMARKER
1096                   | NIL
1097                   | econstref
1098                   | function
1099                   ;
1100
1101              econstref: path ;
1102
1103              function: ident '(' arglist ')' ;
1104
1105              arglist:
1106                     simpleconstant
1107                   | arglist ',' simpleconstant
1108                   ;
1109
1110              simpleconstant:
1111                     CHAR_CONST /* never used apparently*/
1112                   | BYTE_CONST
1113                   | SHORT_CONST
1114                   | INT_CONST
1115                   | INT64_CONST
1116                   | UBYTE_CONST
1117                   | USHORT_CONST
1118                   | UINT_CONST
1119                   | UINT64_CONST
1120                   | FLOAT_CONST
1121                   | DOUBLE_CONST
1122                   | TERMSTRING
1123                   ;
1124
1125              intlist:
1126                     constint
1127                   | intlist ',' constint
1128                   ;
1129
1130              constint:
1131                     INT_CONST
1132                   | UINT_CONST
1133                   | INT64_CONST
1134                   | UINT64_CONST
1135                   ;
1136
1137              conststring: TERMSTRING ;
1138
1139              constbool:
1140                     conststring
1141                   | constint
1142                   ;
1143
1144              /* Push all idents thru here for tracking */
1145              ident: IDENT ;
1146
1147
1148
1149Printed: 123-2-6         $Date: 2010/04/29 16:38:55 $                 NCGEN(1)