NCGEN(1)                     UNIDATA UTILITIES                     NCGEN(1)

6 ncgen - From a CDL file generate a netCDF-3 file, a netCDF-4 file or a
7 C program
8
10 ncgen [-format_code] [-1|3|4|5|6|7] [-b] [-c] [-d] [-D debuglevel] [-f]
11 [-h] [-H] [-k format_name] [-l b|c|f77|java] [-L loglevel] [-M
12 name] [-n] [-N datasetname] [-o netcdf_filename] [-P] [-x]
13
15 ncgen generates either a netCDF-3 (i.e. classic) binary .nc file, a
16 netCDF-4 (i.e. enhanced) binary .nc file or a file in some source lan‐
17 guage that when executed will construct the corresponding binary .nc
18 file. The input to ncgen is a description of a netCDF file in a small
19 language known as CDL (network Common Data form Language), described
20 below. Input is read from standard input if no input_file is speci‐
21 fied. If no options are specified in invoking ncgen, it merely checks
22 the syntax of the input CDL file, producing error messages for any vio‐
23 lations of CDL syntax. Other options can be used, for example, to cre‐
24 ate the corresponding netCDF file, or to generate a C program that uses
25 the netCDF C interface to create the netCDF file.
26
27 Note that this version of ncgen was originally called ncgen4. The old‐
28 er ncgen program has been renamed to ncgen3.
29
ncgen may be used with the companion program ncdump to perform some
simple operations on netCDF files. For example, to rename a dimension
in a netCDF file, use ncdump to get a CDL version of the netCDF file,
edit the CDL file to change the name of the dimension, and use ncgen
to generate the corresponding netCDF file from the edited CDL file.
35
37 -1|3|4|5|6|7
38 Alternate method to specify the format.
39
40 3 => netcdf classic format
41
42 4 => netCDF-4 format (enhanced data model)
43
44 5 => netcdf 5 format
45
46 6 => netCDF 64-bit format
47
48 7 => netCDF-4 classic model format (3+4 == 7)
49 See the -k flag.
50
51 -b Create a (binary) netCDF file. If the -o option is absent, a
52 default file name will be constructed from the basename of the
53 CDL file, with any suffix replaced by the `.nc' extension. If a
54 file already exists with the specified name, it will be over‐
55 written.
56
57 -c Generate C source code that will create a netCDF file matching
58 the netCDF specification. The C source code is written to stan‐
59 dard output; equivalent to -lc.
60
61 -d Same as -D1.
62
63 -D debuglevel
64 Set the level of debug output.
65
66 -f Generate FORTRAN 77 source code that will create a netCDF file
67 matching the netCDF specification. The source code is written
68 to standard output; equivalent to -lf77.
69
70 -h Output help information.
71
72 -H Output the header only; ignore the data section.
73
74 -k format_name
75
76 -format_code
77 The -k flag specifies the format of the file to be created and,
78 by inference, the data model accepted by ncgen (i.e. netcdf-3
79 (classic) versus netcdf-4 vs netcdf-5). As a shortcut, a numeric
80 format_code may be specified instead. The possible format_name
81 values for the -k option are:
82
83 'classic' or 'nc3' => netCDF classic format
84
85 '64-bit offset' or 'nc6' => netCDF 64-bit format
86
'64-bit data' or 'nc5' => netCDF-5 (64-bit data) format

'netCDF-4' or 'nc4' => netCDF-4 format (enhanced data model)
91
92 'netCDF-4 classic model' or 'nc7' => netCDF-4 classic
93 model format
94 Accepted format_code numeric arguments, just shortcuts for for‐
95 mat_names, are:
96
97 3 => netcdf classic format
98
99 5 => netcdf 5 format
100
101 6 => netCDF 64-bit format
102
103 4 => netCDF-4 format (enhanced data model)
104
105 7 => netCDF-4 classic model format
106 The numeric code "7" is used because "7=3+4", a mnemonic for the format
107 that uses the netCDF-3 data model for compatibility with the netCDF-4
108 storage format for performance. Credit is due to NCO for use of these
109 numeric codes instead of the old and confusing format numbers.
110
111 Note: The old version format numbers '1', '2', '3', '4', equivalent to
112 the format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively, are also
113 still accepted but deprecated, due to easy confusion between format
114 numbers and format names. Various old format name aliases are also ac‐
115 cepted but deprecated, e.g. 'hdf5', 'enhanced-nc3', etc. Also, note
116 that -v is accepted to mean the same thing as -k for backward compati‐
117 bility.
118
119 -l b|c|f77|java
120 The -l flag specifies the output language to use when generating
121 source code that will create or define a netCDF file matching
122 the netCDF specification. The output is written to standard
123 output. The currently supported languages have the following
124 flags.
125
'c|C' => C language output.

'f77|fortran77' => FORTRAN 77 language output; note that currently
only the classic model is supported.

'j|java' => (experimental) Java language output; targets the existing
Unidata Java interface, which means that only the classic model is
supported.
136
The choice of output format is determined by three factors.

1. The -k flag.

2. The _Format attribute (see below).

3. The occurrence of CDF-5 (64-bit data) or netCDF-4 constructs in the
   input CDL. The term "netCDF-4 constructs" means constructs from the
   enhanced data model, not just special performance-related
   attributes such as _ChunkSizes, _DeflateLevel, _Endianness, etc.
   The term "CDF-5 constructs" means the extended unsigned integer
   types allowed in the 64-bit data model.

Note that there is an ambiguity between the netCDF-4 case and the CDF-5
case if only an unsigned type is seen in the input.
154
155 The rules are as follows, in order of application.
156
1. If either Fortran or Java output is specified, then a -k flag value
   of 'nc3' (classic model) will be used. Conflicts with the use of
   enhanced constructs in the CDL will report an error.

2. If both the -k flag and the _Format attribute are specified, the
   _Format attribute will be ignored. If no -k flag is specified, and
   a _Format attribute value is specified, then the -k flag value will
   be set to that of the _Format attribute. Otherwise the -k flag is
   undefined.

3. If the -k flag is defined and is consistent with the CDL, ncgen
   will output a file in the requested form, else an error will be
   reported.

4. If the -k flag is undefined, and if the only non-classic constructs
   in the CDL are CDF-5 constructs, a -k flag value of 'nc5' (64-bit
   data model) will be used. If there are true netCDF-4 constructs in
   the CDL, a -k flag value of 'nc4' (enhanced model) will be used.

5. If special performance-related attributes are specified in the CDL,
   a -k flag value of 'nc7' (netCDF-4 classic model) will be used.

6. Otherwise ncgen will set the -k flag to 'nc3' (classic model).
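
The six rules above can be read as a small decision procedure. The
following Python sketch is only an illustration of that reading and is
not taken from the ncgen sources. The function names, the boolean
inputs, and in particular the consistent() check (which formats can
hold which constructs) are assumptions made for this example; has_nc4
is assumed to mean constructs beyond plain unsigned integer types.

```python
from typing import Optional

# Format names as listed for the -k option.
CLASSIC = "classic"
OFFSET64 = "64-bit offset"
DATA64 = "64-bit data"
NETCDF4 = "netCDF-4"
NETCDF4_CLASSIC = "netCDF-4 classic model"


def consistent(fmt: str, has_nc4: bool, has_cdf5: bool) -> bool:
    """Assumed check: can `fmt` hold the constructs seen in the CDL?"""
    if fmt == NETCDF4:
        return True
    if fmt == DATA64:
        return not has_nc4
    # classic, 64-bit offset, and netCDF-4 classic model
    return not (has_nc4 or has_cdf5)


def choose_format(lang: Optional[str] = None,
                  k_flag: Optional[str] = None,
                  format_attr: Optional[str] = None,
                  has_nc4: bool = False,
                  has_cdf5: bool = False,
                  has_special_attrs: bool = False) -> str:
    # Rule 1: Fortran and Java output force the classic model.
    if lang in ("f77", "java"):
        k_flag = CLASSIC
    # Rule 2: an explicit -k wins; _Format is only a fallback.
    elif k_flag is None:
        k_flag = format_attr
    # Rule 3: a defined format must be consistent with the CDL.
    if k_flag is not None:
        if not consistent(k_flag, has_nc4, has_cdf5):
            raise ValueError(f"CDL is not consistent with {k_flag!r}")
        return k_flag
    # Rule 4: infer the format from the constructs that were used.
    if has_nc4:
        return NETCDF4
    if has_cdf5:
        return DATA64
    # Rule 5: only performance-related attributes were seen.
    if has_special_attrs:
        return NETCDF4_CLASSIC
    # Rule 6: default.
    return CLASSIC
```

For instance, choose_format(k_flag="classic", format_attr="netCDF-4")
yields "classic", because the -k flag takes precedence over _Format.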
180
181 -L loglevel
182
183 -M name
184 Specify the name for the main function for C, F77, or Java.
185
186 -n
187
188 -N datasetname
189
190 -o netcdf_file
191 Name of the file to pass to calls to "nc_create()". If this op‐
192 tion is specified it implies (in the absence of any explicit -l
193 flag) the "-b" option. This option is necessary because netCDF
194 files cannot be written directly to standard output, since stan‐
195 dard output is not seekable.
196
197 -P Use NC_DISKLESS mode to create the file totally in memory before
198 persisting it to disk.
199
-W maxwholevarsize
Set the whole-variable size threshold: if the total number of elements
in a variable is less than maxwholevarsize, then update the variable
using a single nc_put_var call. Requires that the variable has no
unlimited dimensions.
205
206 -x Don't initialize data with fill values. This can speed up cre‐
207 ation of large netCDF files greatly, but later attempts to read
208 unwritten data from the generated file will not be easily de‐
209 tectable.
210
211
213 Check the syntax of the CDL file `foo.cdl':
214
215 ncgen foo.cdl
216
217 From the CDL file `foo.cdl', generate an equivalent binary netCDF file
218 named `x.nc':
219
220 ncgen -o x.nc foo.cdl
221
222 From the CDL file `foo.cdl', generate a C program containing the netCDF
223 function invocations necessary to create an equivalent binary netCDF
224 file named `x.nc':
225
226 ncgen -lc foo.cdl >x.c
227
229 CDL Syntax Overview
230 Below is an example of CDL syntax, describing a netCDF file with sever‐
231 al named dimensions (lat, lon, and time), variables (Z, t, p, rh, lat,
232 lon, time), variable attributes (units, long_name, valid_range, _Fill‐
233 Value), and some data. CDL keywords are in boldface. (This example is
234 intended to illustrate the syntax; a real CDL file would have a more
235 complete set of attributes so that the data would be more completely
236 self-describing.)

    netcdf foo {  // an example netCDF specification in CDL

    types:
        ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
        opaque(11) opaque_t;
        int(*) vlen_t;

    dimensions:
        lat = 10, lon = 5, time = unlimited ;

    variables:
        long    lat(lat), lon(lon), time(time);
        float   Z(time,lat,lon), t(time,lat,lon);
        double  p(time,lat,lon);
        long    rh(time,lat,lon);

        string  country(time,lat,lon);
        ubyte   tag;

        // variable attributes
        lat:long_name = "latitude";
        lat:units = "degrees_north";
        lon:long_name = "longitude";
        lon:units = "degrees_east";
        time:units = "seconds since 1992-1-1 00:00:00";

        // typed variable attributes
        string Z:units = "geopotential meters";
        float Z:valid_range = 0., 5000.;
        double p:_FillValue = -9999.;
        long rh:_FillValue = -1;
        vlen_t :globalatt = {17, 18, 19};
    data:
        lat   = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
        lon   = -140, -118, -96, -84, -52;
    group: g {
    types:
        compound cmpd_t { vlen_t f1; enum_t f2;};
    } // group g
    group: h {
    variables:
        /g/cmpd_t  compoundvar;
    data:
        compoundvar = { {3,4,5}, enum_t.Stratus } ;
    } // group h
    }
283
284 All CDL statements are terminated by a semicolon. Spaces, tabs, and
285 newlines can be used freely for readability. Comments may follow the
286 characters `//' on any line.
287
A CDL description consists of four optional parts: types, dimensions,
variables, and data, beginning with the keywords `types:',
`dimensions:', `variables:', and `data:', respectively. Note several
things: (1) each keyword includes the trailing colon, so there must not
be any space before the colon character, and (2) the keywords are
required to be lower case.
294
295 The variables: section may contain variable declarations and attribute
296 assignments. All sections may contain global attribute assignments.
297
298 In addition, after the data: section, the user may define a series of
299 groups (see the example above). Groups themselves can contain types,
300 dimensions, variables, data, and other (nested) groups.
301
302 The netCDF types: section declares the user defined types. These may
303 be constructed using any of the following types: enum, vlen, opaque, or
304 compound.
305
A netCDF dimension is used to define the shape of one or more of the
multidimensional variables contained in the netCDF file. A netCDF
dimension has a name and a size. A dimension can have unlimited size,
which means a variable using this dimension can grow to any length in
that dimension.
311
312 A variable represents a multidimensional array of values of the same
313 type. A variable has a name, a data type, and a shape described by its
314 list of dimensions. Each variable may also have associated attributes
315 (see below) as well as data values. The name, data type, and shape of
316 a variable are specified by its declaration in the variable section of
317 a CDL description. A variable may have the same name as a dimension;
318 by convention such a variable is one-dimensional and contains coordi‐
319 nates of the dimension it names. Dimensions need not have correspond‐
320 ing variables.
321
322 A netCDF attribute contains information about a netCDF variable or
323 about the whole netCDF dataset. Attributes are used to specify such
324 properties as units, special values, maximum and minimum valid values,
325 scaling factors, offsets, and parameters. Attribute information is
326 represented by single values or arrays of values. For example, "units"
327 is an attribute represented by a character array such as "celsius". An
328 attribute has an associated variable, a name, a data type, a length,
329 and a value. In contrast to variables that are intended for data, at‐
330 tributes are intended for metadata (data about data). Unlike netCDF-3,
331 attribute types can be any user defined type as well as the usual
332 built-in types.
333
In CDL, an attribute is designated by a type, a variable, a ':', and
then an attribute name. The type is optional and if missing, it will
be inferred from the values assigned to the attribute. It is possible
to assign global attributes not associated with any variable to the
netCDF as a whole by omitting the variable name in the attribute
declaration. Notice that there is a potential ambiguity in a
specification such as
    x : a = ...
In this situation, x could be either a type for a global attribute, or
the variable name for an attribute. Since there could be both a type
named x and a variable named x, there is an ambiguity. The rule is
that in this situation, x will be interpreted as a type if possible,
and otherwise as a variable.
347
348 If not specified, the data type of an attribute in CDL is derived from
349 the type of the value(s) assigned to it. The length of an attribute is
350 the number of data values assigned to it, or the number of characters
351 in the character string assigned to it. Multiple values are assigned
352 to non-character attributes by separating the values with commas. All
353 values assigned to an attribute must be of the same type.
354
The names for CDL dimensions, variables, attributes, types, and groups
may contain any non-control UTF-8 character except the forward slash
character (`/'). However, certain characters must be escaped if they
are used in a name, where the escape character is the backward slash
`\'. In particular, if the leading character of the name is a digit
(0-9), then it must be preceded by the escape character. In addition,
the characters ` !"#$%&()*,:;<=>?[]^`´{}|~\' must be escaped if they
occur anywhere in a name. Note also that attribute names that begin
with an underscore (`_') are reserved for the use of Unidata and should
not be used in user defined attributes.
365
366 Note also that the words `variables', `dimensions', `data', `group',
367 and `types' are legal CDL names, but be careful that there is a space
368 between them and any following colon character when used as a variable
369 name. This is mostly an issue with attribute declarations. For exam‐
370 ple, consider this.
371
372
    netcdf ... {
    ...
    variables:
        int dimensions;
        dimensions: attribute=0 ;  // this will cause an error
        dimensions : attribute=0 ; // this is ok.
    ...
    }
381
The optional data: section of a CDL specification is where netCDF
variables may be initialized. The syntax of an initialization is
simple: a variable name, an equals sign, and a comma-delimited list of
constants (possibly separated by spaces, tabs and newlines) terminated
with a semicolon. For multi-dimensional arrays, the last dimension
varies fastest. Thus row-major order rather than column-major order is
used for matrices. If fewer values are supplied than are needed to fill
a variable, it is extended with a type-dependent `fill value', which
can be overridden by supplying a value for a distinguished variable
attribute named `_FillValue'. The types of constants need not match the
type declared for a variable; coercions are done to convert integers to
floating point, for example. The constant `_' can be used to designate
the fill value for a variable. If the type of the variable is
explicitly `string', then the special constant `NIL' can be used to
represent a nil string, which is not the same as a zero length string.
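
To make "the last dimension varies fastest" concrete, the following
Python sketch computes the position of one array element within the
flat list of constants in a data: section. The function name
flat_offset is hypothetical and chosen only for this illustration.

```python
from typing import Sequence


def flat_offset(index: Sequence[int], shape: Sequence[int]) -> int:
    """Position of element `index` in a row-major list for `shape`."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for size {n}")
        # Each step to the right makes the earlier dimensions vary
        # more slowly by a factor of the current dimension size.
        offset = offset * n + i
    return offset
```

For a variable declared with shape (2,3) and the datalist
1, 2, 3, 4, 5, 6, element (1,0) is at offset 3 and therefore holds the
value 4.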
397
398 Primitive Data Types
399 char characters
400 byte 8-bit data
401 short 16-bit signed integers
402 int 32-bit signed integers
403 long (synonymous with int)
404 int64 64-bit signed integers
405 float IEEE single precision floating point (32 bits)
406 real (synonymous with float)
407 double IEEE double precision floating point (64 bits)
408 ubyte unsigned 8-bit data
409 ushort 16-bit unsigned integers
410 uint 32-bit unsigned integers
411 uint64 64-bit unsigned integers
412 string arbitrary length strings
413
414 CDL supports a superset of the primitive data types of C. The names
415 for the primitive data types are reserved words in CDL, so the names of
416 variables, dimensions, and attributes must not be primitive type names.
417 In declarations, type names may be specified in either upper or lower
418 case.
419
Bytes are intended to hold a full eight bits of data, and the zero byte
has no special significance, as it may for character data. ncgen
converts byte declarations to char declarations in the output C code
and to the nonstandard BYTE declaration in output Fortran code.
424
425 Shorts can hold values between -32768 and 32767. ncgen converts short
426 declarations to short declarations in the output C code and to the non‐
427 standard INTEGER*2 declaration in output Fortran code.
428
429 Ints can hold values between -2147483648 and 2147483647. ncgen con‐
430 verts int declarations to int declarations in the output C code and to
431 INTEGER declarations in output Fortran code. long is accepted as a
432 synonym for int in CDL declarations, but is deprecated since there are
433 now platforms with 64-bit representations for C longs.
434
Int64s can hold values between -9223372036854775808 and
9223372036854775807. ncgen converts int64 declarations to long long
declarations in the output C code.

Floats can hold values between about -3.4e+38 and 3.4e+38. Their
external representation is as 32-bit IEEE normalized single-precision
floating point numbers. ncgen converts float declarations to float
declarations in the output C code and to REAL declarations in output
Fortran code. real is accepted as a synonym for float in CDL
declarations.

Doubles can hold values between about -1.7e+308 and 1.7e+308. Their
external representation is as 64-bit IEEE standard normalized
double-precision floating point numbers. ncgen converts double
declarations to double declarations in the output C code and to DOUBLE
PRECISION declarations in output Fortran code.
450
451 The unsigned counterparts of the above integer types are mapped to the
452 corresponding unsigned C types. Their ranges are suitably modified to
453 start at zero.
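
The ranges quoted above follow directly from the bit widths in the
type table. The short Python sketch below derives them; the helper
name int_range and the table CDL_INTEGER_TYPES are made up for this
illustration, and byte is treated as signed here only because the byte
constant examples in this page describe it that way.

```python
from typing import Tuple


def int_range(bits: int, signed: bool = True) -> Tuple[int, int]:
    """Smallest and largest value of a two's-complement integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# CDL integer type name -> (width in bits, signed?)
CDL_INTEGER_TYPES = {
    "byte": (8, True), "short": (16, True),
    "int": (32, True), "int64": (64, True),
    "ubyte": (8, False), "ushort": (16, False),
    "uint": (32, False), "uint64": (64, False),
}

if __name__ == "__main__":
    for name, (bits, signed) in CDL_INTEGER_TYPES.items():
        low, high = int_range(bits, signed)
        print(f"{name:7s} {low} .. {high}")
```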
454
The technical interpretation of the char type is that it is an unsigned
8-bit value. The encoding of the 256 possible values is unspecified by
default. A variable of char type may be marked with an "_Encoding"
attribute to indicate the character set to be used: US-ASCII,
ISO-8859-1, etc. Note that specifying the encoding of UTF-8 is
equivalent to specifying US-ASCII. This is because multi-byte UTF-8
characters cannot be stored in an 8-bit character. The only legal
single byte UTF-8 values are by definition the 7-bit US-ASCII encoding
with the top bit set to zero.
464
465 Strings are assumed by default to be encoded using UTF-8. Note that
466 this means that multi-byte UTF-8 encodings may be present in the
467 string, so it is possible that the number of distinct UTF-8 characters
468 in a string is smaller than the number of 8-bit bytes used to store the
469 string.
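
The difference between character count and byte count can be checked
with a few lines of Python. This is only an illustration of UTF-8
itself, not of anything ncgen does internally; the function name is
hypothetical.

```python
from typing import Tuple


def char_and_byte_counts(text: str) -> Tuple[int, int]:
    """Number of characters (code points) and of UTF-8 bytes in text."""
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    # "\u00fc" is a u with diaeresis, which needs two bytes in UTF-8.
    print(char_and_byte_counts("Z\u00fcrich"))
    print(char_and_byte_counts("abc"))
```

A pure US-ASCII string has equal counts, while a string containing the
two-byte character above has one more byte than it has characters.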
470
471 CDL Constants
472 Constants assigned to attributes or variables may be of any of the ba‐
473 sic netCDF types. The syntax for constants is similar to C syntax, ex‐
474 cept that type suffixes must be appended to shorts and floats to dis‐
475 tinguish them from longs and doubles.
476
A byte constant is represented by an integer constant with a `b' (or
`B') appended. In the old netCDF-2 API, byte constants could also be
represented using single characters or standard C character escape
sequences such as 'a' or '\0'. This is still supported for backward
compatibility, but deprecated to make the distinction clear between the
numeric byte type and the textual char type. Example byte constants
include:
    0b      // a zero byte
    -1b     // -1 as an 8-bit byte
    255b    // also -1 as a signed 8-bit byte
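
The last example works because 255 and -1 share the same 8-bit
pattern. A minimal Python sketch of that two's-complement
reinterpretation follows; the helper name as_signed_byte is an
assumption for this example.

```python
def as_signed_byte(value: int) -> int:
    """Interpret the low 8 bits of value as a signed byte."""
    value &= 0xFF                      # keep only the 8 stored bits
    return value - 256 if value >= 128 else value


if __name__ == "__main__":
    for v in (0, -1, 255):
        print(v, "->", as_signed_byte(v))
```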
487
488 short integer constants are intended for representing 16-bit signed
489 quantities. The form of a short constant is an integer constant with
490 an `s' or `S' appended. If a short constant begins with `0', it is in‐
491 terpreted as octal, except that if it begins with `0x', it is inter‐
492 preted as a hexadecimal constant. For example:
493 -2s // a short -2
494 0123s // octal
495 0x7ffs //hexadecimal
496
int integer constants are intended for representing 32-bit signed
quantities. The form of an int constant is an ordinary integer
constant, although it is acceptable to optionally append a single `l'
or `L' (again, deprecated). Be careful, though: the L suffix is
interpreted as a 32 bit integer, and never as a 64 bit integer. This
can be confusing since the C long type can ambiguously be either 32 bit
or 64 bit.
503
504 If an int constant begins with `0', it is interpreted as octal, except
505 that if it begins with `0x', it is interpreted as a hexadecimal con‐
506 stant (but see opaque constants below). Examples of valid int con‐
507 stants include:
508 -2
509 1234567890L
510 0123 // octal
511 0x7ff // hexadecimal
512
513 int64 integer constants are intended for representing 64-bit signed
514 quantities. The form of an int64 constant is an integer constant with
515 an `ll' or `LL' appended. If an int64 constant begins with `0', it is
516 interpreted as octal, except that if it begins with `0x', it is inter‐
517 preted as a hexadecimal constant. For example:
    -2ll     // an int64 -2
    0123LL   // octal
    0x7ffLL  // hexadecimal
521
522 Floating point constants of type float are appropriate for representing
523 floating point data with about seven significant digits of precision.
524 The form of a float constant is the same as a C floating point constant
525 with an `f' or `F' appended. For example the following are all accept‐
526 able float constants:
527 -2.0f
528 3.14159265358979f // will be truncated to less precision
529 1.f
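
The loss of precision in the second example can be reproduced by
rounding a value to IEEE single precision. The Python sketch below
uses the standard struct module for this; the helper name to_float32
is hypothetical, and the conversion rounds to the nearest representable
value rather than literally truncating digits.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to IEEE single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


if __name__ == "__main__":
    original = 3.14159265358979
    stored = to_float32(original)
    print(original, "->", stored)
```

Values such as -2.0 and 1.0 are exactly representable and survive
unchanged, while 3.14159265358979 keeps only about seven significant
digits.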
530
531
532 Floating point constants of type double are appropriate for represent‐
533 ing floating point data with about sixteen significant digits of preci‐
534 sion. The form of a double constant is the same as a C floating point
535 constant. An optional `d' or `D' may be appended. For example the
536 following are all acceptable double constants:
537 -2.0
538 3.141592653589793
539 1.0e-20
540 1.d
541
542 Unsigned integer constants can be created by appending the character
543 'U' or 'u' between the constant and any trailing size specifier, or im‐
544 mediately at the end of the size specifier. Thus one could say 10U,
545 100su, 100000ul, or 1000000llu, for example.
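
The suffix rules for integer constants described in this section can
be summarized in a small classifier. The Python sketch below is an
illustration only and is not ncgen's lexer. The function name
parse_int_constant is an assumption, range checking is omitted, and a
hexadecimal constant ending in `b' is read here as having a final hex
digit rather than a byte suffix.

```python
import re
from typing import Tuple

# sign, digits (hex or decimal/octal), optional U, optional size
# suffix, optional trailing U
_INT_CONST = re.compile(
    r"^([+-]?)(0[xX][0-9a-fA-F]+|[0-9]+)([uU]?)(ll|LL|[bBsSlL])?([uU]?)$"
)
_SIZE_TO_TYPE = {"b": "byte", "s": "short", "": "int",
                 "l": "int", "ll": "int64"}


def parse_int_constant(text: str) -> Tuple[str, int]:
    """Return (CDL type name, value) for an integer constant."""
    m = _INT_CONST.match(text.strip())
    if m is None or (m.group(3) and m.group(5)):
        raise ValueError(f"not an integer constant: {text!r}")
    sign, digits, u1, size, u2 = m.groups()
    size = (size or "").lower()
    if digits[:2].lower() == "0x":
        value = int(digits, 16)        # hexadecimal
    elif len(digits) > 1 and digits[0] == "0":
        value = int(digits, 8)         # leading 0 means octal
    else:
        value = int(digits)
    if sign == "-":
        value = -value
    type_name = _SIZE_TO_TYPE[size]
    if u1 or u2:
        type_name = "u" + type_name    # e.g. ushort, uint, uint64
    return type_name, value
```

With this sketch, parse_int_constant("100su") gives ("ushort", 100)
and parse_int_constant("0123s") gives ("short", 83).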
546
Single character constants may be enclosed in single quotes. If a
sequence of one or more characters is enclosed in double quotes, then
its interpretation must be inferred from the context. If the dataset is
created using the netCDF classic model, then all such constants are
interpreted as a character array, so each character in the constant is
interpreted as if it were a single character. If the dataset uses the
netCDF enhanced model, then the constant may be interpreted as for the
classic model or as a true string (see below) depending on the type of
the attribute or variable to which the constant is assigned.

The interpretation of char constants is that those that are in the
printable ASCII range (' '..'~') are assumed to be encoded as the
1-byte subset of UTF-8, which is equivalent to US-ASCII. In all cases,
the usual C string escape conventions are honored for values from 0
through 127. Values greater than 127 are allowed, but their encoding is
undefined. For the netCDF enhanced model, the use of the char type is
deprecated in favor of the string type.
564
565 Some character constant examples are as follows.
566 'a' // ASCII `a'
567 "a" // equivalent to 'a'
568 "Two\nlines\n" // a 10-character string with two embedded newlines
569 "a bell:\007" // a string containing an ASCII bell
570 Note that the netCDF character array "a" would fit in a one-element
571 variable, since no terminating NULL character is assumed. However, a
572 zero byte in a character array is interpreted as the end of the signif‐
573 icant characters by the ncdump program, following the C convention.
574 Therefore, a NULL byte should not be embedded in a character string un‐
575 less at the end: use the byte data type instead for byte arrays that
576 contain the zero byte.
577
String constants are, like character constants, represented using
double quotes. This represents a potential ambiguity since a
multi-character string may also indicate a dimensioned character value.
Disambiguation usually occurs by context, but care should be taken to
specify the string type to ensure the proper choice. String constants
are assumed to always be UTF-8 encoded. This specifically means that
the string constant may actually contain multi-byte UTF-8 characters.
The special constant `NIL' can be used to represent a nil string, which
is not the same as a zero length string.
587
588 Opaque constants are represented as sequences of hexadecimal digits
589 preceded by 0X or 0x: 0xaa34ffff, for example. These constants can
590 still be used as integer constants and will be either truncated or ex‐
591 tended as necessary.
592
593 Compound Constant Expressions
In order to assign values to variables (or attributes) whose type is a
user-defined type, the constant notation has been extended to include
sequences of constants enclosed in curly brackets (e.g. "{"..."}").
Such a constant is called a compound constant, and compound constants
can be nested.
599
600 Given a type "T(*) vlen_t", where T is some other arbitrary base type,
601 constants for this should be specified as follows.
602 vlen_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2m};
603 The values tij, are assumed to be constants of type T.
604
Given a type "compound cmpd_t {T1 f1; T2 f2...Tn fn}", where the Ti are
other arbitrary base types, constants for this should be specified as
follows.
    cmpd_t var[2] = {t11,t12,...t1n}, {t21,t22,...t2n};
The values tij are assumed to be constants of type Tj. If fields are
missing, then they will be set using any specified or default fill
value for the field's base type.
612
613 The general set of rules for using braces are defined in the Specifying
614 Datalists section below.
615
616 Scoping Rules
617 With the addition of groups, the name space for defined objects is no
618 longer flat. References (names) of any type, dimension, or variable may
619 be prefixed with the absolute path specifying a specific declaration.
620 Thus one might say
621 variables:
622 /g1/g2/t1 v1;
623 The type being referenced (t1) is the one within group g2, which in
624 turn is nested in group g1. The similarity of this notation to Unix
625 file paths is deliberate, and one can consider groups as a form of di‐
626 rectory structure.
627
When a name is not prefixed, then scope rules are applied to locate the
specified declaration. Currently, there are three rules: one for
dimensions, one for types and enumeration constants, and one for all
others.

1. When an unprefixed name of a dimension is used (as in a variable
   declaration), ncgen first looks in the immediately enclosing group
   for the dimension. If it is not found there, then it looks in the
   group enclosing this group. This continues up the group hierarchy
   until the dimension is found, or there are no more groups to
   search.

2. When an unprefixed name of a type or an enumeration constant is
   used, ncgen searches the group tree using a pre-order depth-first
   search. This essentially means that it will find the matching
   declaration that precedes the reference textually in the CDL file
   and that is "highest" in the group hierarchy.

3. For all other names, only the immediately enclosing group is
   searched.
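
The dimension rule (rule 1) amounts to walking from the enclosing
group up toward the root group. The Python sketch below illustrates
that walk with a toy Group class; the class, its fields, and the
function name find_dimension are all hypothetical and do not reflect
ncgen's internal data structures.

```python
from typing import Dict, Optional


class Group:
    """Toy model of a CDL group: a name, a parent, and dimensions."""

    def __init__(self, name: str, parent: Optional["Group"] = None):
        self.name = name
        self.parent = parent
        self.dimensions: Dict[str, int] = {}


def find_dimension(group: Optional[Group], name: str) -> Optional[int]:
    """Look up an unprefixed dimension name, innermost group first."""
    while group is not None:
        if name in group.dimensions:
            return group.dimensions[name]
        group = group.parent           # try the enclosing group next
    return None                        # no more groups to search


if __name__ == "__main__":
    root = Group("/")
    root.dimensions["lat"] = 10
    g1 = Group("g1", root)
    g1.dimensions["lat"] = 4           # shadows the root-level lat
    g2 = Group("g2", g1)
    print(find_dimension(g2, "lat"))   # found in g1, the nearest group
```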
647
648 One final note. Forward references are not allowed. This means that
649 specifying, for example, /g1/g2/t1 will fail if this reference occurs
650 before g1 and/or g2 are defined.
651
652 Specifying Enumeration Constants
References to enumeration constants (in data lists) can be ambiguous
since the same enumeration constant name can be defined in more than
one enumeration. If a CDL file specifies an ambiguous constant, then
ncgen will signal an error. Such constants can be disambiguated in two
ways.
658
659 1. Prefix the enumeration constant with the name of the enumeration
660 separated by a dot: enum.econst, for example.
661
662 2. If case one is not sufficient to disambiguate the enumeration
663 constant, then one must specify the precise enumeration type us‐
664 ing a group path: /g1/g2/enum.econst, for example.
665
666 Special Attributes
667 Special, virtual, attributes can be specified to provide performance-
668 related information about the file format and about variable proper‐
669 ties. The file must be a netCDF-4 file for these to take effect.
670
These special virtual attributes are not actually part of the file;
they are merely a convenient way to set miscellaneous properties of the
data in CDL.

The special attributes currently supported are as follows: `_Format',
`_Fletcher32', `_ChunkSizes', `_Endianness', `_DeflateLevel',
`_Shuffle', and `_Storage'.
678
679 `_Format' is a global attribute specifying the netCDF format variant.
680 Its value must be a single string matching one of `classic', `64-bit
681 offset', `64-bit data', `netCDF-4', or `netCDF-4 classic model'.
682
The rest of the special attributes are all variable attributes.
Essentially all of them map to some corresponding `nc_def_var_XXX'
function as defined in the netCDF-4 API. For the attributes that are
essentially boolean (_Fletcher32, _Shuffle, and _NOFILL), the value
true can be specified by using the strings `true' or `1', or by using
the integer 1. The value false expects either `false', `0', or the
integer 0. The actions associated with these attributes are as follows.

1. `_Fletcher32' sets the `fletcher32' property for a variable.

2. `_Endianness' is either `little' or `big', depending on how the
   variable is stored when first written.

3. `_DeflateLevel' is an integer between 0 and 9 inclusive if
   compression has been specified for the variable.

4. `_Shuffle' specifies if the shuffle filter should be used.

5. `_Storage' is `contiguous', `compact', or `chunked'.

6. `_ChunkSizes' is a list of chunk sizes for each dimension of the
   variable.
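
The accepted values listed above can be expressed as a simple
validator. The Python sketch below covers only the value constraints
stated in this section; the function names are hypothetical, and the
checks are an illustration rather than the validation that ncgen
itself performs.

```python
from typing import Union


def parse_special_bool(value: Union[str, int]) -> bool:
    """Accept `true'/`1'/1 as true and `false'/`0'/0 as false."""
    if isinstance(value, str):
        if value in ("true", "1"):
            return True
        if value in ("false", "0"):
            return False
    elif isinstance(value, int) and value in (0, 1):
        return bool(value)
    raise ValueError(f"not a boolean attribute value: {value!r}")


def check_special_attribute(name: str, value):
    """Return the normalized value, or raise ValueError if invalid."""
    if name in ("_Fletcher32", "_Shuffle"):
        return parse_special_bool(value)
    if name == "_Endianness" and value in ("little", "big"):
        return value
    if name == "_Storage" and value in ("contiguous", "compact",
                                        "chunked"):
        return value
    if (name == "_DeflateLevel" and isinstance(value, int)
            and not isinstance(value, bool) and 0 <= value <= 9):
        return value
    raise ValueError(f"invalid value {value!r} for {name}")
```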
705
706 Note that attributes such as "add_offset" or "scale_factor" have no
707 special meaning to ncgen. These attributes are currently conventions,
708 handled above the library layer by other utility packages, for example
709 NCO.
710
711 Specifying Datalists
712 Specifying datalists for variables in the `data:' section can be
713 somewhat complicated. There are some rules that must be followed to
714 ensure that datalists are parsed correctly by ncgen.
715
716 First, the top level is automatically assumed to be a list of items, so
717 it should not be inside {...}. That means that if the variable is a
718 scalar, there will be a single top-level element and if the variable is
719 an array, there will be N top-level elements. For each element of the
720 top level list, the following rules should be applied.
721
722 1. Instances of UNLIMITED dimensions (other than the first dimension)
723 must be surrounded by {...} in order to specify the size.
724
725 2. Compound instances must be embedded in {...}.
726
727 3. Non-scalar fields of compound instances must be embedded in {...}.
728
729 4. Instances of vlens must be surrounded by {...} in order to specify
730 the size.
731
732 5. Datalists associated with attributes are implicitly a vector (i.e.,
733 a list) of values of the type of the attribute, and the above rules
734 apply with that in mind.
735
736 6. No other use of braces is allowed.
737
738 Note that one consequence of these rules is that arrays of values
739 cannot have subarrays within braces. Consider, for example, int
740 var(d1,d2,...,dn), where none of d2...dn are unlimited. A datalist
741 for this variable must be a single list of integers, where the number
742 of integers is no more than D=d1*d2*...*dn; note that the list can
743 have fewer than D values, in which case fill values will be used to
744 pad the list.
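
The padding behavior just described can be sketched in Python as
follows. The function name pad_datalist and the fill value used in the
example are placeholders chosen for illustration; the actual fill value
depends on the variable's type and any _FillValue attribute.

```python
from math import prod


def pad_datalist(values, dims, fill):
    """Pad a flat datalist with fill values up to D = d1*d2*...*dn.

    Hypothetical helper: `values` is the flat list written in CDL,
    `dims` holds the (known) dimension sizes, and `fill` is the fill
    value to append. More than D values is treated as an error here.
    """
    total = prod(dims)
    if len(values) > total:
        raise ValueError(f"{len(values)} values given, at most {total} allowed")
    return list(values) + [fill] * (total - len(values))


# A 2 x 3 variable with only three values supplied; -999 is a placeholder fill.
print(pad_datalist([1, 2, 3], [2, 3], fill=-999))
# prints [1, 2, 3, -999, -999, -999]
```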
745
746 The rule about attribute datalists has the following consequence. If
747 the type of the attribute is a compound (or vlen) type, and if the
748 number of entries in the list is one, then the compound instances must
749 be enclosed in braces.
750
751 Specifying Character Datalists
752 Specifying datalists for variables of type char also has some
753 complications. Consider, for example:
754 dimensions: u=UNLIMITED; d1=1; d2=2; d3=3;
755 d4=4; d5=5; u2=UNLIMITED;
756 variables: char var(d4,d5);
757 datalist: var="1", "two", "three";
758
759 We have twenty elements of var to fill (d5 X d4) and we have three
760 strings of length 1, 3, 5. How do we assign the characters in the
761 strings to the twenty elements?
762
763 This is challenging because it is desirable to mimic the original ncgen
764 (ncgen3). The core algorithm is notionally as follows.
765
766 1. Assume we have a set of dimensions D1..Dn, where D1 may optionally
767 be an Unlimited dimension. It is assumed that the sizes of the Di
768 are all known (including unlimited dimensions).
769
770 2. Given a sequence of string or character constants C1..Cm, our goal
771 is to construct a single string whose length is the cross product of
772 D1 thru Dn. Note that for purposes of this algorithm, character
773 constants are treated as strings of size 1.
774
775 3. Construct Dx = cross product of D1 thru D(n-1).
776
777 4. For each constant Ci, add fill characters as needed so that its
778 length is a multiple of Dn.
779
780 5. Concatenate the modified C1..Cm to produce string S.
781
782 6. Add fill characters to S to make its length be a multiple of Dn.
783
784 7. If S is longer than Dx * Dn, then truncate it and generate a
785 warning.
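
The listed steps can be sketched in Python as shown below. This is only
an illustration of the notional algorithm, not ncgen's implementation.
The function name build_char_data is hypothetical, and the fill
character is assumed to be "\0"; it is exposed as a parameter because
that choice is an assumption.

```python
import warnings
from math import prod


def build_char_data(constants, dims, fill="\0"):
    """Combine string constants C1..Cm for a char variable with
    dimension sizes D1..Dn (all sizes assumed known)."""
    dn = dims[-1]
    dx = prod(dims[:-1])              # step 3: product of D1..D(n-1)

    padded = []
    for c in constants:               # step 4: pad each Ci to a multiple of Dn
        remainder = len(c) % dn
        if remainder:
            c = c + fill * (dn - remainder)
        padded.append(c)

    s = "".join(padded)               # step 5: concatenate into S

    remainder = len(s) % dn           # step 6: pad S to a multiple of Dn
    if remainder:
        s += fill * (dn - remainder)

    limit = dx * dn                   # final step: truncate with a warning
    if len(s) > limit:
        warnings.warn(f"character data truncated to {limit} characters")
        s = s[:limit]
    return s


# The example above: char var(d4,d5) with d4=4, d5=5.
s = build_char_data(["1", "two", "three"], [4, 5])
print(repr(s))   # prints '1\x00\x00\x00\x00two\x00\x00three'
```

Under these assumptions each constant occupies its own row of d5
characters, so the three strings fill three of the four rows of var.
The listed steps do not say how the remaining row is filled, and the
sketch leaves it out.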
786
787 There are three other cases of note.
788
789 1. If there is only a single, unlimited dimension, then all of the con‐
790 stants are concatenated and fill characters are added to the end of
791 the resulting string to make its length be that of the unlimited di‐
792 mension. If the length is larger than the unlimited dimension, then
793 it is truncated with a warning.
794
795 2. For the case of character typed vlen, "char(*) vlen_t" for example,
796 we simply concatenate all the constants with no filling at all.
797
798 3. For the case of a character typed attribute, we simply concatenate
799 all the constants.
800
801 In netcdf-4, dimensions other than the first can be unlimited. Of
802 course by the rules above, the interior unlimited instances must be
803 delimited by {...}. For example:
804 variables: char var(u,u2);
805 datalist: var={"1", "two"}, {"three"};
806 In this case u will have the effective length of two. Within each in‐
807 stance of u2, the rules above will apply, leading to this.
808 datalist: var={"1","t","w","o"}, {"t","h","r","e","e"};
809 The effective size of u2 will be the max of the two instance lengths
810 (five in this case) and the shorter will be padded to produce this.
811 datalist: var={"1","t","w","o","\0"}, {"t","h","r","e","e"};
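
The two-dimensional case above can be sketched in Python. The function
name pad_inner_unlimited is hypothetical, and the "\0" fill character
follows the example just shown.

```python
def pad_inner_unlimited(instances, fill="\0"):
    """Each element of `instances` is one brace-delimited instance of the
    inner unlimited dimension, given as a list of string constants."""
    # Within an instance the constants are simply concatenated.
    rows = ["".join(constants) for constants in instances]
    # The effective size of the inner dimension is the longest instance.
    width = max((len(row) for row in rows), default=0)
    # Shorter instances are padded with the fill character.
    return [row.ljust(width, fill) for row in rows]


rows = pad_inner_unlimited([["1", "two"], ["three"]])
print(len(rows), [len(row) for row in rows])   # prints 2 [5, 5]
```

Here the outer dimension u has effective length two and u2 has
effective size five, matching the padded datalist above.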
812
813 Consider an even more complicated case.
814 variables: char var(u,u2,u3);
815 datalist: var={{"1", "two"}}, {{"three"},{"four","xy"}};
816 In this case u again will have the effective length of two. The u2
817 dimension will have a size = max(1,2) = 2. Within each instance of u3,
818 the rules above will apply, leading to this.
819 datalist: var={{"1","t","w","o"}}, {{"t","h","r","e","e"},{"f","o","u","r","x","y"}};
820 The effective size of u3 will be the max of the instance lengths (six
821 in this case) and the shorter ones will be padded to produce this.
822 datalist: var={{"1","t","w","o"," "," "}}, {{"t","h","r","e","e"," "},{"f","o","u","r","x","y"}};
823 Note however that the first instance of u2 is less than the max length
824 of u2, so we need to add a filler for another instance of u2, producing
825 this.
826 datalist: var={{"1","t","w","o"," "," "},{" "," "," "," "," "," "}}, {{"t","h","r","e","e"," "},{"f","o","u","r","x","y"}};
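
The three-dimensional case adds one step: besides padding the innermost
instances, missing instances of the middle dimension must be filled in.
A Python sketch follows, with the hypothetical name pad_nested_unlimited
and a blank fill character as in the example above.

```python
def pad_nested_unlimited(outer, fill=" "):
    """`outer` has one entry per instance of u; each entry is a list of
    u3 instances, and each u3 instance is a list of string constants."""
    # Concatenate the constants inside every innermost instance.
    joined = [["".join(inst) for inst in u2_list] for u2_list in outer]
    # Effective sizes: u2 is the largest instance count, u3 the longest string.
    u2_size = max((len(u2_list) for u2_list in joined), default=0)
    u3_size = max((len(s) for u2_list in joined for s in u2_list), default=0)

    result = []
    for u2_list in joined:
        padded = [s.ljust(u3_size, fill) for s in u2_list]
        # Add filler instances where this entry has fewer than u2_size.
        padded += [fill * u3_size] * (u2_size - len(padded))
        result.append(padded)
    return result


data = pad_nested_unlimited([[["1", "two"]], [["three"], ["four", "xy"]]])
print(data)
# prints [['1two  ', '      '], ['three ', 'fourxy']]
```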
827
828
830 The programs generated by ncgen when using the -c flag use initializa‐
831 tion statements to store data in variables, and will fail to produce
832 compilable programs if you try to use them for large datasets, since
833 the resulting statements may exceed the line length or number of con‐
834 tinuation statements permitted by the compiler.
835
836 The CDL syntax makes it easy to assign what looks like an array of
837 variable-length strings to a netCDF variable, but the strings may sim‐
838 ply be concatenated into a single array of characters. Specific use of
839 the string type specifier may solve the problem.
840
841
843 Under certain conditions, some keywords can be used as identifiers.
844
845 1. If a type keyword is not a type supported by the format of the
846 .cdl file, then it can be used as an identifier. So, for exam‐
847 ple, when translating a .cdl file as a netCDF-3 file, then
848 "string" or "uint64" can be used as identifiers.
849
850 2. The keyword "data" can be used as an identifier because it can
851 be tested in a context sensitive fashion to see if "data" is a
852 keyword versus an identifier.
853
854
856 The file ncgen.y is the definitive grammar for CDL, but a stripped down
857 version is included here for completeness.
858 ncdesc: NETCDF
859 datasetid
860 rootgroup
861 ;
862
863 datasetid: DATASETID
864
865 rootgroup: '{'
866 groupbody
867 subgrouplist
868 '}';
869
870 groupbody:
871 attrdecllist
872 typesection
873 dimsection
874 vasection
875 datasection
876 ;
877
878 subgrouplist:
879 /*empty*/
880 | subgrouplist namedgroup
881 ;
882
883 namedgroup: GROUP ident '{'
884 groupbody
885 subgrouplist
886 '}'
887 attrdecllist
888 ;
889
890 typesection: /* empty */
891 | TYPES
892 | TYPES typedecls
893 ;
894
895 typedecls:
896 type_or_attr_decl
897 | typedecls type_or_attr_decl
898 ;
899
900 typename: ident ;
901
902 type_or_attr_decl:
903 typedecl
904 | attrdecl ';'
905 ;
906
907 typedecl:
908 enumdecl optsemicolon
909 | compounddecl optsemicolon
910 | vlendecl optsemicolon
911 | opaquedecl optsemicolon
912 ;
913
914 optsemicolon:
915 /*empty*/
916 | ';'
917 ;
918
919 enumdecl: primtype ENUM typename '{' enumidlist '}' ;
920
921 enumidlist: enumid
922 | enumidlist ',' enumid
923 ;
924
925 enumid: ident '=' constint ;
926
927 opaquedecl: OPAQUE '(' INT_CONST ')' typename ;
928
929 vlendecl: typeref '(' '*' ')' typename ;
930
931 compounddecl: COMPOUND typename '{' fields '}' ;
932
933 fields: field ';'
934 | fields field ';'
935 ;
936
937 field: typeref fieldlist ;
938
939 primtype: CHAR_K
940 | BYTE_K
941 | SHORT_K
942 | INT_K
943 | FLOAT_K
944 | DOUBLE_K
945 | UBYTE_K
946 | USHORT_K
947 | UINT_K
948 | INT64_K
949 | UINT64_K
950 ;
951
952 dimsection: /* empty */
953 | DIMENSIONS
954 | DIMENSIONS dimdecls
955 ;
956
957 dimdecls: dim_or_attr_decl ';'
958 | dimdecls dim_or_attr_decl ';'
959 ;
960
961 dim_or_attr_decl: dimdeclist | attrdecl ;
962
963 dimdeclist: dimdecl
964 | dimdeclist ',' dimdecl
965 ;
966
967 dimdecl:
968 dimd '=' UINT_CONST
969 | dimd '=' INT_CONST
970 | dimd '=' DOUBLE_CONST
971 | dimd '=' NC_UNLIMITED_K
972 ;
973
974 dimd: ident ;
975
976 vasection: /* empty */
977 | VARIABLES
978 | VARIABLES vadecls
979 ;
980
981 vadecls: vadecl_or_attr ';'
982 | vadecls vadecl_or_attr ';'
983 ;
984
985 vadecl_or_attr: vardecl | attrdecl ;
986
987 vardecl: typeref varlist ;
988
989 varlist: varspec
990 | varlist ',' varspec
991 ;
992
993 varspec: ident dimspec ;
994
995 dimspec: /* empty */
996 | '(' dimlist ')'
997 ;
998
999 dimlist: dimref
1000 | dimlist ',' dimref
1001 ;
1002
1003 dimref: path ;
1004
1005 fieldlist:
1006 fieldspec
1007 | fieldlist ',' fieldspec
1008 ;
1009
1010 fieldspec: ident fielddimspec ;
1011
1012 fielddimspec: /* empty */
1013 | '(' fielddimlist ')'
1014 ;
1015
1016 fielddimlist:
1017 fielddim
1018 | fielddimlist ',' fielddim
1019 ;
1020
1021 fielddim:
1022 UINT_CONST
1023 | INT_CONST
1024 ;
1025
1026 /* Use this when referencing defined objects */
1027 varref: type_var_ref ;
1028
1029 typeref: type_var_ref ;
1030
1031 type_var_ref:
1032 path
1033 | primtype
1034 ;
1035
1036 /* Use this for all attribute decls */
1037 /* Watch out; this is right recursive */
1038 attrdecllist: /*empty*/ | attrdecl ';' attrdecllist ;
1039
1040 attrdecl:
1041 ':' ident '=' datalist
1042 | typeref type_var_ref ':' ident '=' datalist
1043 | type_var_ref ':' ident '=' datalist
1044 | type_var_ref ':' _FILLVALUE '=' datalist
1045 | typeref type_var_ref ':' _FILLVALUE '=' datalist
1046 | type_var_ref ':' _STORAGE '=' conststring
1047 | type_var_ref ':' _CHUNKSIZES '=' intlist
1048 | type_var_ref ':' _FLETCHER32 '=' constbool
1049 | type_var_ref ':' _DEFLATELEVEL '=' constint
1050 | type_var_ref ':' _SHUFFLE '=' constbool
1051 | type_var_ref ':' _ENDIANNESS '=' conststring
1052 | type_var_ref ':' _NOFILL '=' constbool
1053 | ':' _FORMAT '=' conststring
1054 ;
1055
1056 path:
1057 ident
1058 | PATH
1059 ;
1060
1061 datasection: /* empty */
1062 | DATA
1063 | DATA datadecls
1064 ;
1065
1066 datadecls:
1067 datadecl ';'
1068 | datadecls datadecl ';'
1069 ;
1070
1071 datadecl: varref '=' datalist ;
1072 datalist:
1073 datalist0
1074 | datalist1
1075 ;
1076
1077 datalist0:
1078 /*empty*/
1079 ;
1080
1081 /* Must have at least 1 element */
1082 datalist1:
1083 dataitem
1084 | datalist ',' dataitem
1085 ;
1086
1087 dataitem:
1088 constdata
1089 | '{' datalist '}'
1090 ;
1091
1092 constdata:
1093 simpleconstant
1094 | OPAQUESTRING
1095 | FILLMARKER
1096 | NIL
1097 | econstref
1098 | function
1099 ;
1100
1101 econstref: path ;
1102
1103 function: ident '(' arglist ')' ;
1104
1105 arglist:
1106 simpleconstant
1107 | arglist ',' simpleconstant
1108 ;
1109
1110 simpleconstant:
1111 CHAR_CONST /* never used apparently*/
1112 | BYTE_CONST
1113 | SHORT_CONST
1114 | INT_CONST
1115 | INT64_CONST
1116 | UBYTE_CONST
1117 | USHORT_CONST
1118 | UINT_CONST
1119 | UINT64_CONST
1120 | FLOAT_CONST
1121 | DOUBLE_CONST
1122 | TERMSTRING
1123 ;
1124
1125 intlist:
1126 constint
1127 | intlist ',' constint
1128 ;
1129
1130 constint:
1131 INT_CONST
1132 | UINT_CONST
1133 | INT64_CONST
1134 | UINT64_CONST
1135 ;
1136
1137 conststring: TERMSTRING ;
1138
1139 constbool:
1140 conststring
1141 | constint
1142 ;
1143
1144 /* Push all idents thru here for tracking */
1145 ident: IDENT ;
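
As a rough illustration of the datalist and dataitem rules in the
grammar above, the following Python sketch parses a brace-nested
datalist into nested lists. It is a deliberate simplification and not
the ncgen parser: it recognizes only numbers and double-quoted strings
as constants, does not interpret escape sequences, and all function
names are hypothetical.

```python
import re

# One token: a double-quoted string, a number, or one of { } ,
TOKEN = re.compile(
    r'\s*(?:(?P<str>"(?:[^"\\]|\\.)*")'
    r'|(?P<num>[-+]?\d+(?:\.\d*)?)'
    r'|(?P<punct>[{},]))'
)


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse_datalist(tokens, pos=0):
    """datalist: empty, or dataitems separated by commas."""
    items = []
    if pos >= len(tokens) or tokens[pos] == ("punct", "}"):
        return items, pos                      # the empty datalist
    while True:
        item, pos = parse_dataitem(tokens, pos)
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ("punct", ","):
            pos += 1
            continue
        return items, pos


def parse_dataitem(tokens, pos):
    """dataitem: a constant, or '{' datalist '}'."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of datalist")
    kind, text = tokens[pos]
    if (kind, text) == ("punct", "{"):
        inner, pos = parse_datalist(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ("punct", "}"):
            raise ValueError("missing '}'")
        return inner, pos + 1
    if kind == "str":
        return text[1:-1], pos + 1             # strip the quotes only
    if kind == "num":
        return (float(text) if "." in text else int(text)), pos + 1
    raise ValueError(f"unexpected token {text!r}")


def parse(text):
    tokens = tokenize(text)
    items, pos = parse_datalist(tokens)
    if pos != len(tokens):
        raise ValueError("trailing input after datalist")
    return items


print(parse('{"1", "two"}, {"three"}'))   # prints [['1', 'two'], ['three']]
```

The nested lists mirror the braces in the CDL text, so the top level is
always a plain list, as the datalist rules require.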
1146
1147
1148
1149Printed: 123-2-6 $Date: 2010/04/29 16:38:55 $ NCGEN(1)