1NCGEN(1) UNIDATA UTILITIES NCGEN(1)
2
3
4
6 ncgen - From a CDL file generate a netCDF-3 file, a netCDF-4 file or a
7 C program
8
10 ncgen [-format_code] [-1|3|4|5|6|7] [-b] [-B buffersize] [-c] [-d] [-D
11 debuglevel] [-f] [-h] [-H] [-k format_name] [-l b|c|f77|java]
12 [-L loglevel] [-M name] [-n] [-N datasetname] [-o
13 netcdf_filename] [-P] [-x] [input_file]
14
16 ncgen generates either a netCDF-3 (i.e. classic) binary .nc file, a
17 netCDF-4 (i.e. enhanced) binary .nc file or a file in some source lan‐
18 guage that when executed will construct the corresponding binary .nc
19 file. The input to ncgen is a description of a netCDF file in a small
20 language known as CDL (network Common Data form Language), described
21 below. Input is read from standard input if no input_file is speci‐
22 fied. If no options are specified in invoking ncgen, it merely checks
23 the syntax of the input CDL file, producing error messages for any vio‐
24 lations of CDL syntax. Other options can be used, for example, to cre‐
25 ate the corresponding netCDF file, or to generate a C program that uses
26 the netCDF C interface to create the netCDF file.
27
28 Note that this version of ncgen was originally called ncgen4. The old‐
29 er ncgen program has been renamed to ncgen3.
30
31 ncgen may be used with the companion program ncdump to perform some
32 simple operations on netCDF files. For example, to rename a dimension
33 in a netCDF file, use ncdump to get a CDL version of the netCDF file,
34 edit the CDL file to change the name of the dimension, and use ncgen
35 to generate the corresponding netCDF file from the edited CDL file.
36
38 -1|3|4|5|6|7
39 Alternate method to specify the format.
40
41 3 => netcdf classic format
42
43 4 => netCDF-4 format (enhanced data model)
44
45 5 => netcdf 5 format
46
47 6 => netCDF 64-bit format
48
49 7 => netCDF-4 classic model format (3+4 == 7)
50 See the -k flag.
51
52 -b Create a (binary) netCDF file. If the -o option is absent, a
53 default file name will be constructed from the basename of the
54 CDL file, with any suffix replaced by the `.nc' extension. If a
55 file already exists with the specified name, it will be over‐
56 written.
57
58 -B buffersize
59 Specify the internal iterator buffer size.
60
61 -c Generate C source code that will create a netCDF file matching
62 the netCDF specification. The C source code is written to stan‐
63 dard output; equivalent to -lc.
64
65 -d Same as -D1.
66
67 -D debuglevel
68 Set the level of debug output.
69
70 -f Generate FORTRAN 77 source code that will create a netCDF file
71 matching the netCDF specification. The source code is written
72 to standard output; equivalent to -lf77.
73
74 -h Output help information.
75
76 -H Output the header only; ignore the data section.
77
78 -k format_name
79
80 -format_code
81 The -k flag specifies the format of the file to be created and,
82 by inference, the data model accepted by ncgen (i.e. netcdf-3
83 (classic) versus netcdf-4 vs netcdf-5). As a shortcut, a numeric
84 format_code may be specified instead. The possible format_name
85 values for the -k option are:
86
87 'classic' or 'nc3' => netCDF classic format
88
89 '64-bit offset' or 'nc6' => netCDF 64-bit format
90
91 '64-bit data' or 'nc5' => netCDF-5 (64-bit data) format
92
93 'netCDF-4' or 'nc4' => netCDF-4 format (enhanced data
94 model)
95
96 'netCDF-4 classic model' or 'nc7' => netCDF-4 classic
97 model format
98 Accepted format_number arguments, just shortcuts for format_names, are:
99
100 3 => netcdf classic format
101
102 5 => netcdf 5 format
103
104 6 => netCDF 64-bit format
105
106 4 => netCDF-4 format (enhanced data model)
107
108 7 => netCDF-4 classic model format
109 The numeric code "7" is used because "7=3+4", a mnemonic for the format
110 that uses the netCDF-3 data model for compatibility with the netCDF-4
111 storage format for performance. Credit is due to NCO for use of these
112 numeric codes instead of the old and confusing format numbers.
113
114 Note: The old version format numbers '1', '2', '3', '4', equivalent to
115 the format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively, are also
116 still accepted but deprecated, due to easy confusion between format
117 numbers and format names. Various old format name aliases are also ac‐
118 cepted but deprecated, e.g. 'hdf5', 'enhanced-nc3', etc. Also, note
119 that -v is accepted to mean the same thing as -k for backward compati‐
120 bility.
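The name and number tables above can be collected into a single lookup. The following Python sketch is illustrative only: `normalize_kind` is a hypothetical helper, not part of ncgen, and it models only the values listed on this page (other deprecated aliases such as 'hdf5' are left out).

```python
# Hypothetical helper: maps -k format_name values (as listed above) to the
# numeric format codes 3, 4, 5, 6 and 7.
FORMAT_NAMES = {
    "classic": 3, "nc3": 3,
    "64-bit offset": 6, "nc6": 6,
    "64-bit data": 5, "nc5": 5,
    "netCDF-4": 4, "nc4": 4,
    "netCDF-4 classic model": 7, "nc7": 7,
}

# Deprecated format numbers accepted by -k, per the note above.
DEPRECATED_NUMBERS = {"1": "nc3", "2": "nc6", "3": "nc4", "4": "nc7"}


def normalize_kind(k_value):
    """Return the numeric format code for a -k argument."""
    name = DEPRECATED_NUMBERS.get(k_value, k_value)
    if name not in FORMAT_NAMES:
        raise ValueError(f"unknown format name: {k_value!r}")
    return FORMAT_NAMES[name]
```

Read this way, the tables show why the old numbers are confusing: as described on this page, `-k 3` (deprecated number) selects netCDF-4, whereas the format_code flag `-3` selects the classic format.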
121
122 -l b|c|f77|java
123 The -l flag specifies the output language to use when generating
124 source code that will create or define a netCDF file matching
125 the netCDF specification. The output is written to standard
126 output. The currently supported languages have the following
127 flags.
128
129 c|C => C language output.
130
131 f77|fortran77 => FORTRAN 77 language output;
132 note that currently only the classic model is
133 supported.
134
135 j|java => (experimental) Java language output;
136 targets the existing Unidata Java interface,
137 which means that only the classic model is
138 supported.
139
141 The choice of output format is determined by three factors.
142
143 -k flag.
144
145 _Format attribute (see below).
146
147 Occurrence of CDF-5 (64-bit data) or
148 netCDF-4 constructs in the input CDL. The term "netCDF-4
149 constructs" means constructs from the enhanced data model, not just
150 special performance-related attributes such as
151 _ChunkSizes, _DeflateLevel, _Endianness, etc. The term "CDF-5
152 constructs" means extended unsigned integer types allowed in the
153 64-bit data model.
154
155 Note that there is an ambiguity between the netCDF-4 case and the CDF-5
156 case if only an unsigned type is seen in the input.
157
158 The rules are as follows, in order of application.
159
160 1. If either Fortran or Java output is specified, then a -k flag
161 value of nc3 (classic model) will be used. Any use of enhanced
162 constructs in the CDL conflicts with this and is reported as an error.
163
164 2. If both the -k flag and _Format attribute are specified, the
165 _Format attribute will be ignored. If no -k flag is specified, and a
166 _Format attribute value is specified, then the -k flag value
167 will be set to that of the _Format attribute. Otherwise the -k
168 flag is undefined.
169
170 3. If the -k option is defined and is consistent with the CDL, nc‐
171 gen will output a file in the requested form, else an error will
172 be reported.
173
174 4. If the -k flag is undefined, and if there are only CDF-5
175 constructs in the CDL, a -k flag value of nc5 (64-bit data model) will
176 be used. If there are true netCDF-4 constructs in the CDL, a -k
177 flag value of nc4 (enhanced model) will be used.
178
179 5. If special performance-related attributes are specified in the
180 CDL, a -k flag value of nc7 (netCDF-4 classic model) will be used.
181
182 6. Otherwise ncgen will set the -k flag to nc3 (classic model).
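The six rules can be read as a small decision procedure. The Python sketch below is a simplified model for illustration, not ncgen's actual code: `choose_format` and its parameters are hypothetical names, and the consistency check of rule 3 is reduced to one assumed condition (enhanced constructs require the netCDF-4 format).

```python
def choose_format(k_flag=None, format_attr=None, language=None,
                  has_nc4_constructs=False, has_cdf5_constructs=False,
                  has_special_attrs=False):
    """Return the format name selected by the rules above (simplified)."""
    # Rule 1: Fortran and Java output support only the classic model.
    if language in ("f77", "java"):
        if has_nc4_constructs:
            raise ValueError("enhanced constructs conflict with classic model")
        return "classic"
    # Rule 2: -k wins over _Format; _Format is used only when -k is absent.
    kind = k_flag if k_flag is not None else format_attr
    # Rule 3: an explicit choice must be consistent with the CDL.
    if kind is not None:
        # Assumption: consistency is modeled only as "enhanced constructs
        # need netCDF-4"; the real check covers more cases.
        if has_nc4_constructs and kind != "netCDF-4":
            raise ValueError(f"CDL is not consistent with format {kind!r}")
        return kind
    # Rule 4: infer the format from the constructs used in the CDL.
    if has_nc4_constructs:
        return "netCDF-4"
    if has_cdf5_constructs:
        return "64-bit data"
    # Rule 5: special performance-related attributes.
    if has_special_attrs:
        return "netCDF-4 classic model"
    # Rule 6: default.
    return "classic"
```

For example, `choose_format(k_flag="classic", format_attr="netCDF-4")` yields `"classic"`, because the `_Format` attribute is ignored when -k is given.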
183
184 -L loglevel
185
186 -M name
187 Specify the name for the main function for C, F77, or Java.
188
189 -n
190
191 -N datasetname
192
193 -o netcdf_file
194 Name of the file to pass to calls to "nc_create()". If this op‐
195 tion is specified it implies (in the absence of any explicit -l
196 flag) the "-b" option. This option is necessary because netCDF
197 files cannot be written directly to standard output, since stan‐
198 dard output is not seekable.
199
200 -P Use NC_DISKLESS mode to create the file totally in memory before
201 persisting it to disk.
202
203 -x Don't initialize data with fill values. This can speed up cre‐
204 ation of large netCDF files greatly, but later attempts to read
205 unwritten data from the generated file will not be easily de‐
206 tectable.
207
208
210 Check the syntax of the CDL file `foo.cdl':
211
212 ncgen foo.cdl
213
214 From the CDL file `foo.cdl', generate an equivalent binary netCDF file
215 named `x.nc':
216
217 ncgen -o x.nc foo.cdl
218
219 From the CDL file `foo.cdl', generate a C program containing the netCDF
220 function invocations necessary to create an equivalent binary netCDF
221 file named `x.nc':
222
223 ncgen -lc -o x.nc foo.cdl >x.c
224
226 CDL Syntax Overview
227 Below is an example of CDL syntax, describing a netCDF file with sever‐
228 al named dimensions (lat, lon, and time), variables (Z, t, p, rh, lat,
229 lon, time), variable attributes (units, long_name, valid_range, _Fill‐
230 Value), and some data. CDL keywords are in boldface. (This example is
231 intended to illustrate the syntax; a real CDL file would have a more
232 complete set of attributes so that the data would be more completely
233 self-describing.)
234 netcdf foo { // an example netCDF specification in CDL
235
236 types:
237 ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
238 opaque(11) opaque_t;
239 int(*) vlen_t;
240
241 dimensions:
242 lat = 10, lon = 5, time = unlimited ;
243
244 variables:
245 long lat(lat), lon(lon), time(time);
246 float Z(time,lat,lon), t(time,lat,lon);
247 double p(time,lat,lon);
248 long rh(time,lat,lon);
249
250 string country(time,lat,lon);
251 ubyte tag;
252
253 // variable attributes
254 lat:long_name = "latitude";
255 lat:units = "degrees_north";
256 lon:long_name = "longitude";
257 lon:units = "degrees_east";
258 time:units = "seconds since 1992-1-1 00:00:00";
259
260 // typed variable attributes
261 string Z:units = "geopotential meters";
262 float Z:valid_range = 0., 5000.;
263 double p:_FillValue = -9999.;
264 long rh:_FillValue = -1;
265 vlen_t :globalatt = {17, 18, 19};
266 data:
267 lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
268 lon = -140, -118, -96, -84, -52;
269 group: g {
270 types:
271 compound cmpd_t { vlen_t f1; enum_t f2;};
272 } // group g
273 group: h {
274 variables:
275 /g/cmpd_t compoundvar;
276 data:
277 compoundvar = { {3,4,5}, enum_t.Stratus } ;
278 } // group h
279 }
280
281 All CDL statements are terminated by a semicolon. Spaces, tabs, and
282 newlines can be used freely for readability. Comments may follow the
283 characters `//' on any line.
284
285 A CDL description consists of four optional parts: types, dimensions,
286 variables, data, beginning with the keyword `types:', `dimensions:',
287 `variables:', and `data:', respectively. Note several things: (1) the
288 keyword includes the trailing colon, so there must not be any space be‐
289 fore the colon character, and (2) the keywords are required to be lower
290 case.
291
292 The variables: section may contain variable declarations and attribute
293 assignments. All sections may contain global attribute assignments.
294
295 In addition, after the data: section, the user may define a series of
296 groups (see the example above). Groups themselves can contain types,
297 dimensions, variables, data, and other (nested) groups.
298
299 The netCDF types: section declares the user defined types. These may
300 be constructed using any of the following types: enum, vlen, opaque, or
301 compound.
302
303 A netCDF dimension is used to define the shape of one or more of the
304 multidimensional variables contained in the netCDF file. A netCDF di‐
305 mension has a name and a size. A dimension can have the unlimited
306 size, which means a variable using this dimension can grow to any
307 length in that dimension.
308
309 A variable represents a multidimensional array of values of the same
310 type. A variable has a name, a data type, and a shape described by its
311 list of dimensions. Each variable may also have associated attributes
312 (see below) as well as data values. The name, data type, and shape of
313 a variable are specified by its declaration in the variable section of
314 a CDL description. A variable may have the same name as a dimension;
315 by convention such a variable is one-dimensional and contains coordi‐
316 nates of the dimension it names. Dimensions need not have correspond‐
317 ing variables.
318
319 A netCDF attribute contains information about a netCDF variable or
320 about the whole netCDF dataset. Attributes are used to specify such
321 properties as units, special values, maximum and minimum valid values,
322 scaling factors, offsets, and parameters. Attribute information is
323 represented by single values or arrays of values. For example, "units"
324 is an attribute represented by a character array such as "celsius". An
325 attribute has an associated variable, a name, a data type, a length,
326 and a value. In contrast to variables that are intended for data, at‐
327 tributes are intended for metadata (data about data). Unlike netCDF-3,
328 attribute types can be any user defined type as well as the usual
329 built-in types.
330
331 In CDL, an attribute is designated by a type, a variable, a ':', and
332 then an attribute name. The type is optional and if missing, it will
333 be inferred from the values assigned to the attribute. It is possible
334 to assign global attributes not associated with any variable to the
335 netCDF as a whole by omitting the variable name in the attribute decla‐
336 ration. Notice that there is a potential ambiguity in a specification
337 such as
338 x : a = ...
339 In this situation, x could be either a type for a global attribute, or
340 the variable name for an attribute. Since there could both be a type
341 named x and a variable named x, there is an ambiguity. The rule is
342 that in this situation, x will be interpreted as a type if possible,
343 and otherwise as a variable.
344
345 If not specified, the data type of an attribute in CDL is derived from
346 the type of the value(s) assigned to it. The length of an attribute is
347 the number of data values assigned to it, or the number of characters
348 in the character string assigned to it. Multiple values are assigned
349 to non-character attributes by separating the values with commas. All
350 values assigned to an attribute must be of the same type.
351
352 The names for CDL dimensions, variables, attributes, types, and groups
353 may contain any non-control utf-8 character except the forward slash
354 character (`/'). However, certain characters must be escaped if they are
355 used in a name, where the escape character is the backward slash `\'.
356 In particular, if the leading character of the name is a digit (0-9),
357 then it must be preceded by the escape character. In addition, the
358 characters ` !"#$%&()*,:;<=>?[]^`´{}|~\' must be escaped if they occur
359 anywhere in a name. Note also that attribute names that begin with an
360 underscore (`_') are reserved for the use of Unidata and should not be
361 used in user defined attributes.
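The escaping rule can be sketched in a few lines of Python. This is an illustration under assumptions, not ncgen's implementation: `escape_cdl_name` is a hypothetical helper, and the accent-like quote in the character list above is assumed to stand for a plain apostrophe, so both characters are escaped here.

```python
# Characters that, per the text above, must be escaped anywhere in a name.
# Assumption: the accent-like quote in the rendered list is a plain
# apostrophe in the original, so both forms are included.
SPECIAL = set(" !\"#$%&()*,:;<=>?[]^`'´{}|~\\")


def escape_cdl_name(name):
    """Return name with CDL escapes added (hypothetical helper)."""
    if "/" in name:
        raise ValueError("the forward slash is not allowed in a CDL name")
    out = []
    for i, ch in enumerate(name):
        # A leading digit and any special character get a backslash.
        if ch in SPECIAL or (i == 0 and ch in "0123456789"):
            out.append("\\")
        out.append(ch)
    return "".join(out)
```

With this sketch, a variable called `2m temp` would be written in CDL as `\2m\ temp`.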
362
363 Note also that the words `variables', `dimensions', `data', `group', and
364 `types' are legal CDL names, but be careful that there is a space be‐
365 tween them and any following colon character when used as a variable
366 name. This is mostly an issue with attribute declarations. For exam‐
367 ple, consider this.
368
369
370 netcdf ... {
371 ...
372 variables:
373 int dimensions;
374 dimensions: attribute=0 ; // this will cause an error
375 dimensions : attribute=0 ; // this is ok.
376 ...
377 }
378
379 The optional data: section of a CDL specification is where netCDF vari‐
380 ables may be initialized. The syntax of an initialization is simple: a
381 variable name, an equals sign, and a comma-delimited list of constants
382 (possibly separated by spaces, tabs and newlines) terminated with a
383 semicolon. For multi-dimensional arrays, the last dimension varies
384 fastest. Thus row-major rather than column-major order is used for matrices.
385 If fewer values are supplied than are needed to fill a variable, it is
386 extended with a type-dependent `fill value', which can be overridden by
387 supplying a value for a distinguished variable attribute named `_Fill‐
388 Value'. The types of constants need not match the type declared for a
389 variable; coercions are done to convert integers to floating point, for
390 example. The constant `_' can be used to designate the fill value for
391 a variable. If the type of the variable is explicitly `string', then
392 the special constant `NIL` can be used to represent a nil string, which
393 is not the same as a zero length string.
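The ordering and padding rules for a fixed-size variable can be illustrated as follows. This Python sketch is only a model of the description above; `flat_index` and `pad_with_fill` are hypothetical names, and unlimited dimensions are not handled.

```python
def flat_index(indices, shape):
    """Position of an element in the data list (last dimension fastest)."""
    offset = 0
    for idx, size in zip(indices, shape):
        offset = offset * size + idx
    return offset


def pad_with_fill(values, shape, fill):
    """Extend a short data list with the fill value up to the full size."""
    total = 1
    for size in shape:
        total *= size
    if len(values) > total:
        raise ValueError("more values than the variable can hold")
    return list(values) + [fill] * (total - len(values))
```

For a variable declared as `v(2,3)`, element `(1,0)` is the fourth value in the list, and a list of three values is padded with three fill values.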
394
395 Primitive Data Types
396 char characters
397 byte 8-bit data
398 short 16-bit signed integers
399 int 32-bit signed integers
400 long (synonymous with int)
401 int64 64-bit signed integers
402 float IEEE single precision floating point (32 bits)
403 real (synonymous with float)
404 double IEEE double precision floating point (64 bits)
405 ubyte unsigned 8-bit data
406 ushort 16-bit unsigned integers
407 uint 32-bit unsigned integers
408 uint64 64-bit unsigned integers
409 string arbitrary length strings
410
411 CDL supports a superset of the primitive data types of C. The names
412 for the primitive data types are reserved words in CDL, so the names of
413 variables, dimensions, and attributes must not be primitive type names.
414 In declarations, type names may be specified in either upper or lower
415 case.
416
417 Bytes are intended to hold a full eight bits of data, and the zero byte
418 has no special significance, as it may for character data. ncgen
419 converts byte declarations to char declarations in the output C code and
420 to the nonstandard BYTE declaration in output Fortran code.
421
422 Shorts can hold values between -32768 and 32767. ncgen converts short
423 declarations to short declarations in the output C code and to the non‐
424 standard INTEGER*2 declaration in output Fortran code.
425
426 Ints can hold values between -2147483648 and 2147483647. ncgen con‐
427 verts int declarations to int declarations in the output C code and to
428 INTEGER declarations in output Fortran code. long is accepted as a
429 synonym for int in CDL declarations, but is deprecated since there are
430 now platforms with 64-bit representations for C longs.
431
432 Int64 can hold values between -9223372036854775808 and
433 9223372036854775807. ncgen converts int64 declarations to long long
434 declarations in the output C code.
435
436 Floats can hold values between about -3.4e+38 and 3.4e+38. Their
437 external representation is as 32-bit IEEE normalized single-precision
438 floating point numbers. ncgen converts float declarations to float
439 declarations in the output C code and to REAL declarations in output Fortran
440 code. real is accepted as a synonym for float in CDL declarations.
441
442 Doubles can hold values between about -1.7e+308 and 1.7e+308. Their
443 external representation is as 64-bit IEEE standard normalized
444 double-precision floating point numbers. ncgen converts double declarations to
445 double declarations in the output C code and to DOUBLE PRECISION
446 declarations in output Fortran code.
447
448 The unsigned counterparts of the above integer types are mapped to the
449 corresponding unsigned C types. Their ranges are suitably modified to
450 start at zero.
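The integer ranges quoted above follow directly from the bit widths. As a quick check, a minimal Python sketch (the helper name `int_range` is illustrative only):

```python
def int_range(bits, signed=True):
    """Return (min, max) for a two's-complement integer of the given width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1
```

`int_range(16)` gives the short range, `int_range(32)` the int range, `int_range(64)` the int64 range, and `int_range(8, signed=False)` the ubyte range of 0 to 255.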
451
452 The technical interpretation of the char type is that it is an unsigned
453 8-bit value. The encoding of the 256 possible values is unspecified by
454 default. A variable of char type may be marked with an "_Encoding" at‐
455 tribute to indicate the character set to be used: US-ASCII, ISO-8859-1,
456 etc. Note that specifying the encoding of UTF-8 is equivalent to
457 specifying US-ASCII. This is because multi-byte UTF-8 characters cannot be
458 stored in an 8-bit character. The only legal single byte UTF-8 values
459 are by definition the 7-bit US-ASCII encoding with the top bit set to
460 zero.
461
462 Strings are assumed by default to be encoded using UTF-8. Note that
463 this means that multi-byte UTF-8 encodings may be present in the
464 string, so it is possible that the number of distinct UTF-8 characters
465 in a string is smaller than the number of 8-bit bytes used to store the
466 string.
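The difference between character count and byte count is easy to see in any language with UTF-8 support. A small Python illustration (the helper name `utf8_lengths` is made up for this example):

```python
def utf8_lengths(text):
    """Return (number of characters, number of bytes when UTF-8 encoded)."""
    return len(text), len(text.encode("utf-8"))


# "Z\u00fcrich" contains one two-byte character (u with diaeresis).
chars, nbytes = utf8_lengths("Z\u00fcrich")

# Pure US-ASCII text is byte-for-byte identical in UTF-8.
same = "abc".encode("utf-8") == "abc".encode("ascii")
```

Here `chars` is 6 while `nbytes` is 7, and `same` is true, which is why a UTF-8 encoding on a char variable adds nothing over US-ASCII.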
467
468 CDL Constants
469 Constants assigned to attributes or variables may be of any of the ba‐
470 sic netCDF types. The syntax for constants is similar to C syntax, ex‐
471 cept that type suffixes must be appended to shorts and floats to dis‐
472 tinguish them from longs and doubles.
473
474 A byte constant is represented by an integer constant with a `b' (or
475 `B') appended. In the old netCDF-2 API, byte constants could also be
476 represented using single characters or standard C character escape
477 sequences such as 'a' or '\0'. This is still supported for backward
478 compatibility, but deprecated to make the distinction clear between the
479 numeric byte type and the textual char type. Example byte constants
480 include:
481 0b // a zero byte
482 -1b // -1 as an 8-bit byte
483 255b // also -1 as a signed 8-bit byte
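The last example relies on two's-complement wrap-around: 255 does not fit in a signed 8-bit byte and so denotes the same bit pattern as -1. A minimal Python sketch of that arithmetic (the helper name is hypothetical):

```python
def as_signed_byte(value):
    """Reinterpret an integer as a signed 8-bit two's-complement value."""
    return ((value + 128) % 256) - 128
```

With this, `as_signed_byte(255)` and `as_signed_byte(-1)` both give -1, and `as_signed_byte(128)` gives -128.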
484
485 short integer constants are intended for representing 16-bit signed
486 quantities. The form of a short constant is an integer constant with
487 an `s' or `S' appended. If a short constant begins with `0', it is in‐
488 terpreted as octal, except that if it begins with `0x', it is inter‐
489 preted as a hexadecimal constant. For example:
490 -2s // a short -2
491 0123s // octal
492 0x7ffs //hexadecimal
493
494 int integer constants are intended for representing 32-bit signed quan‐
495 tities. The form of an int constant is an ordinary integer constant,
496 although it is acceptable to optionally append a single `l' or `L'
497 (again, deprecated). Be careful, though, the L suffix is interpreted as
498 a 32 bit integer, and never as a 64 bit integer. This can be confusing
499 since the C long type can ambiguously be either 32 bit or 64 bit.
500
501 If an int constant begins with `0', it is interpreted as octal, except
502 that if it begins with `0x', it is interpreted as a hexadecimal con‐
503 stant (but see opaque constants below). Examples of valid int con‐
504 stants include:
505 -2
506 1234567890L
507 0123 // octal
508 0x7ff // hexadecimal
509
510 int64 integer constants are intended for representing 64-bit signed
511 quantities. The form of an int64 constant is an integer constant with
512 an `ll' or `LL' appended. If an int64 constant begins with `0', it is
513 interpreted as octal, except that if it begins with `0x', it is inter‐
514 preted as a hexadecimal constant. For example:
515 -2ll // an int64 -2
516 0123LL // octal
517 0x7ffLL //hexadecimal
518
519 Floating point constants of type float are appropriate for representing
520 floating point data with about seven significant digits of precision.
521 The form of a float constant is the same as a C floating point constant
522 with an `f' or `F' appended. For example the following are all accept‐
523 able float constants:
524 -2.0f
525 3.14159265358979f // will be truncated to less precision
526 1.f
527
528
529 Floating point constants of type double are appropriate for represent‐
530 ing floating point data with about sixteen significant digits of preci‐
531 sion. The form of a double constant is the same as a C floating point
532 constant. An optional `d' or `D' may be appended. For example the
533 following are all acceptable double constants:
534 -2.0
535 3.141592653589793
536 1.0e-20
537 1.d
538
539 Unsigned integer constants can be created by placing the character
540 'U' or 'u' between the constant and any trailing size specifier, or
541 immediately after the size specifier. Thus one could say 10U,
542 100su, 100000ul, or 1000000llu, for example.
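The suffix conventions for integer constants can be summarized in a small classifier. The Python sketch below is a simplified reading of the rules in this section, not ncgen's lexer: `classify_int_constant` is a hypothetical name, single-character byte constants are not handled, and in a hexadecimal constant a trailing `b` is taken as a hex digit rather than a byte suffix.

```python
import re

# sign, digits (hex or decimal/octal), optional u, optional size, optional u
_INT_CONST = re.compile(
    r"^([+-]?)(0[xX][0-9a-fA-F]+|[0-9]+)([uU]?)(ll|LL|[bBsSlL])?([uU]?)$"
)
_SIZE_TYPES = {"b": "byte", "s": "short", "": "int", "l": "int", "ll": "int64"}


def classify_int_constant(text):
    """Return (CDL type name, integer value) for an integer constant."""
    m = _INT_CONST.match(text)
    if m is None:
        raise ValueError(f"not an integer constant: {text!r}")
    sign, digits, u_before, size, u_after = m.groups()
    if u_before and u_after:
        raise ValueError("at most one unsigned marker is allowed")
    if digits[:2].lower() == "0x":
        value = int(digits, 16)          # hexadecimal
    elif len(digits) > 1 and digits[0] == "0":
        value = int(digits, 8)           # leading zero means octal
    else:
        value = int(digits, 10)
    if sign == "-":
        value = -value
    type_name = _SIZE_TYPES[(size or "").lower()]
    if u_before or u_after:
        type_name = "u" + type_name      # e.g. short -> ushort
    return type_name, value
```

For instance, `classify_int_constant("0123s")` returns `("short", 83)` and `classify_int_constant("1000000llu")` returns `("uint64", 1000000)`.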
543
544 Single character constants may be enclosed in single quotes. If a se‐
545 quence of one or more characters is enclosed in double quotes, then its
546 interpretation must be inferred from the context. If the dataset is
547 created using the netCDF classic model, then all such constants are
548 interpreted as a character array, so each character in the constant is
549 interpreted as if it were a single character. If the dataset uses the netCDF
550 enhanced model, then the constant may be interpreted as for the classic model
551 or as a true string (see below) depending on the type of the attribute
552 or variable to which the constant is assigned.
553
554 The interpretation of char constants is that those that are in the
555 printable ASCII range (' '..'~') are assumed to be encoded as the
556 1-byte subset of UTF-8, which is equivalent to US-ASCII. In all cases,
557 the usual C string escape conventions are honored for values from 0
558 thru 127. Values greater than 127 are allowed, but their encoding is
559 undefined. For the netCDF enhanced model, the use of the char type is deprecated
560 in favor of the string type.
561
562 Some character constant examples are as follows.
563 'a' // ASCII `a'
564 "a" // equivalent to 'a'
565 "Two\nlines\n" // a 10-character string with two embedded newlines
566 "a bell:\007" // a string containing an ASCII bell
567 Note that the netCDF character array "a" would fit in a one-element
568 variable, since no terminating NULL character is assumed. However, a
569 zero byte in a character array is interpreted as the end of the signif‐
570 icant characters by the ncdump program, following the C convention.
571 Therefore, a NULL byte should not be embedded in a character string un‐
572 less at the end: use the byte data type instead for byte arrays that
573 contain the zero byte.
574
575 String constants are, like character constants, represented using dou‐
576 ble quotes. This represents a potential ambiguity since a multi-charac‐
577 ter string may also indicate a dimensioned character value. Disambigua‐
578 tion usually occurs by context, but care should be taken to specify
579 the string type to ensure the proper choice. String constants are
580 assumed to always be UTF-8 encoded. This specifically means that the
581 string constant may actually contain multi-byte UTF-8 characters. The
582 special constant `NIL` can be used to represent a nil string, which is
583 not the same as a zero length string.
584
585 Opaque constants are represented as sequences of hexadecimal digits
586 preceded by 0X or 0x: 0xaa34ffff, for example. These constants can
587 still be used as integer constants and will be either truncated or ex‐
588 tended as necessary.
589
590 Compound Constant Expressions
591 In order to assign values to variables (or attributes) whose type is
592 user-defined type, the constant notation has been extended to include
593 sequences of constants enclosed in curly brackets (e.g. "{"..."}").
594 Such a constant is called a compound constant, and compound constants
595 can be nested.
596
597 Given a type "T(*) vlen_t", where T is some other arbitrary base type,
598 constants for this should be specified as follows.
599 vlen_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2m};
600 The values tij are assumed to be constants of type T.
601
602 Given a type "compound cmpd_t {T1 f1; T2 f2...Tn fn}", where the Ti are
603 other arbitrary base types, constants for this should be specified as
604 follows.
605 cmpd_t var[2] = {t11,t12,...t1n}, {t21,t22,...t2n};
606 The values tij are assumed to be constants of type Tj. If the fields
607 are missing, then they will be set using any specified or default fill
608 value for the field's base type.
609
610 The general set of rules for using braces is defined in the Specifying
611 Datalists section below.
612
613 Scoping Rules
614 With the addition of groups, the name space for defined objects is no
615 longer flat. References (names) of any type, dimension, or variable may
616 be prefixed with the absolute path specifying a specific declaration.
617 Thus one might say
618 variables:
619 /g1/g2/t1 v1;
620 The type being referenced (t1) is the one within group g2, which in
621 turn is nested in group g1. The similarity of this notation to Unix
622 file paths is deliberate, and one can consider groups as a form of di‐
623 rectory structure.
624
625 When a name is not prefixed, then scope rules are applied to locate the
626 specified declaration. Currently, there are three rules: one for dimen‐
627 sions, one for types and enumeration constants, and one for all others.
628
629 1. When an unprefixed name of a dimension is used (as in a variable
630 declaration), ncgen first looks in the immediately enclosing group
631 for the dimension. If it is not found there, then it looks in
632 the group enclosing this group. This continues up the group hi‐
633 erarchy until the dimension is found, or there are no more
634 groups to search.
635
636 2. When an unprefixed name of a type or an enumeration constant is
637 used, ncgen searches the group tree using a pre-order depth-
638 first search. This essentially means that it will find the
639 matching declaration that precedes the reference textually in
640 the cdl file and that is "highest" in the group hierarchy.
641
642 3. For all other names, only the immediately enclosing group is
643 searched.
644
645 One final note. Forward references are not allowed. This means that
646 specifying, for example, /g1/g2/t1 will fail if this reference occurs
647 before g1 and/or g2 are defined.
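The dimension rule (search the enclosing group, then each enclosing group in turn) can be sketched as a walk up a parent chain. The Python below is a toy model for illustration; the `Group` class and `find_dimension` function are hypothetical and do not reflect ncgen's internal data structures.

```python
class Group:
    """Toy model of a CDL group holding only dimensions."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.dims = {}

    def path(self):
        """Absolute path of this group; the root group is '/'."""
        if self.parent is None:
            return "/"
        prefix = self.parent.path()
        return prefix + self.name if prefix == "/" else prefix + "/" + self.name


def find_dimension(group, dim_name):
    """Look in the enclosing group, then each ancestor, for a dimension."""
    while group is not None:
        if dim_name in group.dims:
            return group.path(), group.dims[dim_name]
        group = group.parent
    raise KeyError(dim_name)


root = Group("root")
root.dims["time"] = 10
g1 = Group("g1", root)
g1.dims["lat"] = 5
g2 = Group("g2", g1)
```

From inside `g2`, the name `lat` resolves to the dimension declared in `/g1` and `time` to the one in the root group; a declaration of `time` inside `g2` itself would take precedence over both.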
648
649 Specifying Enumeration Constants
650 References to Enumeration constants (in data lists) can be ambiguous
651 since the same enumeration constant name can be defined in more than
652 one enumeration. If a CDL file specifies an ambiguous constant, then
653 ncgen will signal an error. Such constants can be disambiguated in two
654 ways.
655
656 1. Prefix the enumeration constant with the name of the enumeration
657 separated by a dot: enum.econst, for example.
658
659 2. If case one is not sufficient to disambiguate the enumeration
660 constant, then one must specify the precise enumeration type us‐
661 ing a group path: /g1/g2/enum.econst, for example.
662
663 Special Attributes
664 Special, virtual, attributes can be specified to provide performance-
665 related information about the file format and about variable proper‐
666 ties. The file must be a netCDF-4 file for these to take effect.
667
668 These special virtual attributes are not actually part of the file;
669 they are merely a convenient way to set miscellaneous properties of the
670 data in CDL.
671
672 The special attributes currently supported are as follows: `_Format',
673 `_Fletcher32', `_ChunkSizes', `_Endianness', `_DeflateLevel',
674 `_Shuffle', and `_Storage'.
675
676 `_Format' is a global attribute specifying the netCDF format variant.
677 Its value must be a single string matching one of `classic', `64-bit
678 offset', `64-bit data', `netCDF-4', or `netCDF-4 classic model'.
679
680 The rest of the special attributes are all variable attributes.
681 Essentially all of them map to some corresponding `nc_def_var_XXX' function
682 as defined in the netCDF-4 API. For the attributes that are essentially
683 boolean (_Fletcher32, _Shuffle, and _NOFILL), the value true can be
684 specified by using the strings `true' or `1', or by using the integer
685 1. The value false expects either `false', `0', or the integer 0. The
686 actions associated with these attributes are as follows.
687
688 1. `_Fletcher32' sets the `fletcher32' property for a variable.
689
690 2. `_Endianness' is either `little' or `big', depending on how the
691 variable is stored when first written.
692
693 3. `_DeflateLevel' is an integer between 0 and 9 inclusive if
694 compression has been specified for the variable.
695
696 4. `_Shuffle' specifies if the shuffle filter should be used.
697
698 5. `_Storage' is `contiguous' or `chunked'.
699
700 6. `_ChunkSizes' is a list of chunk sizes for each dimension of the
701 variable.
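The accepted spellings for the boolean special attributes can be captured in a short validator. This Python sketch is an illustration only: `parse_bool_attr` is a hypothetical helper, and it assumes the strings are matched exactly as written above (lower case), which this page does not state explicitly.

```python
def parse_bool_attr(value):
    """Interpret a boolean special-attribute value as described above."""
    if isinstance(value, str):
        if value in ("true", "1"):
            return True
        if value in ("false", "0"):
            return False
    elif isinstance(value, int) and value in (0, 1):
        return bool(value)
    raise ValueError(f"not a boolean attribute value: {value!r}")
```

So `parse_bool_attr("true")`, `parse_bool_attr("1")` and `parse_bool_attr(1)` all yield true, while any other value is rejected.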
702
703 Note that attributes such as "add_offset" or "scale_factor" have no
704 special meaning to ncgen. These attributes are currently conventions,
705 handled above the library layer by other utility packages, for example
706 NCO.
707
708 Specifying Datalists
709 Specifying datalists for variables in the `data:` section can be some‐
710 what complicated. There are some rules that must be followed to ensure
711 that datalists are parsed correctly by ncgen.
712
713 First, the top level is automatically assumed to be a list of items, so
714 it should not be inside {...}. That means that if the variable is a
715 scalar, there will be a single top-level element and if the variable is
716 an array, there will be N top-level elements. For each element of the
717 top level list, the following rules should be applied.
718
719 1. Instances of UNLIMITED dimensions (other than the first dimension)
720 must be surrounded by {...} in order to specify the size.
721
722 2. Compound instances must be embedded in {...}.
723
724 3. Non-scalar fields of compound instances must be embedded in {...}.
725
726 4. Instances of vlens must be surrounded by {...} in order to specify
727 the size.
728
729 5. No other use of braces is allowed.
730
731 Datalists associated with attributes are implicitly a vector (i.e., a
732 list) of values of the type of the attribute and the above rules must
733 apply with that in mind.
734
735 Note that one consequence of these rules is that arrays of values
736 cannot have subarrays within braces. Consider, for example, int
737 var(d1,d2,...,dn), where none of d2...dn are unlimited. A datalist
738 for this variable must be a single list of integers, where the number
739 of integers is no more than D=d1*d2*...*dn; note that the list can
740 contain fewer than D values, in which case fill values will be used to
741 pad the list.
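The padding behaviour for a fixed-size array can be sketched as follows. This is a Python illustration only: the function name pad_datalist is hypothetical, and raising an error for an over-long list is an assumption, since the text only states that the list must not exceed D values.

```python
from math import prod


def pad_datalist(values, dims, fill):
    """Return a flat list of exactly D = prod(dims) items, padded with fill."""
    total = prod(dims)
    if len(values) > total:
        # Assumption: an over-long datalist is rejected.
        raise ValueError(f"{len(values)} values given, at most {total} allowed")
    return list(values) + [fill] * (total - len(values))


# int var(d1,d2) with d1=2, d2=3 and only five values supplied.
# -2147483647 is the library's default int fill value (NC_FILL_INT).
print(pad_datalist([1, 2, 3, 4, 5], (2, 3), -2147483647))
# [1, 2, 3, 4, 5, -2147483647]
```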
742
743 The rule about attribute datalists has the following consequence. If
744 the type of the attribute is a compound (or vlen) type, and if the
745 number of entries in the list is one, then the compound instance must
746 be enclosed in braces.
747
748 Specifying Character Datalists
749 Specifying datalists for variables of type char also has some
750 complications. Consider, for example:
751 dimensions: u=UNLIMITED; d1=1; d2=2; d3=3;
752 d4=4; d5=5; u2=UNLIMITED;
753 variables: char var(d4,d5);
754 datalist: var="1", "two", "three";
755
756 We have twenty elements of var to fill (d5 X d4) and we have three
757 strings of length 1, 3, 5. How do we assign the characters in the
758 strings to the twenty elements?
759
760 This is challenging because it is desirable to mimic the original ncgen
761 (ncgen3). The core algorithm is notionally as follows.
762
763 1. Assume we have a set of dimensions D1..Dn, where D1 may optionally
764 be an Unlimited dimension. It is assumed that the sizes of the Di
765 are all known (including unlimited dimensions).
766
767 2. Given a sequence of string or character constants C1..Cm, our goal
768 is to construct a single string whose length is the product of the
769 sizes D1 through Dn. Note that for purposes of this algorithm,
770 character constants are treated as strings of size 1.
771
772 3. Construct Dx = product of the sizes D1 through D(n-1).
773
774 4. For each constant Ci, add fill characters as needed so that its
775 length is a multiple of Dn.
776
777 5. Concatenate the modified C1..Cm to produce string S.
778
779 6. Add fill characters to S to make its length be a multiple of Dn.
780
781 7. If S is longer than Dx * Dn, then truncate it and generate a
782 warning.
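The core algorithm can be sketched in Python as shown below. This is a reading of the steps above, not ncgen's implementation: the function name build_char_data is hypothetical, the fill character is assumed to be "\0", and the final padding of S up to Dx * Dn is an assumption about how the unassigned elements end up filled. The special cases described after this sketch are not handled.

```python
from math import prod


def build_char_data(constants, dims, fill="\0"):
    """Build the character data for a char variable with dimension sizes dims."""
    dn = dims[-1]
    dx = prod(dims[:-1])  # product of all but the last size; 1 if only one
    # Pad each constant so that its length is a multiple of Dn.
    padded = [c + fill * (-len(c) % dn) for c in constants]
    s = "".join(padded)
    # Pad S to a multiple of Dn (already true after the per-constant padding).
    s += fill * (-len(s) % dn)
    # Truncate when S is longer than Dx * Dn; the caller would warn.
    truncated = len(s) > dx * dn
    if truncated:
        s = s[: dx * dn]
    # Assumption: elements not covered by S are left as fill characters.
    s += fill * (dx * dn - len(s))
    return s, truncated


# char var(d4,d5) with d4=4, d5=5 and the constants "1", "two", "three".
data, truncated = build_char_data(["1", "two", "three"], (4, 5))
rows = [data[i:i + 5] for i in range(0, len(data), 5)]
print(rows)
# ['1\x00\x00\x00\x00', 'two\x00\x00', 'three', '\x00\x00\x00\x00\x00']
```

Under these assumptions each constant starts a new row of length d5, which answers the question posed by the twenty-element example: three rows receive the strings and the fourth is entirely fill.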
783
784 There are three other cases of note.
785
786 1. If there is only a single, unlimited dimension, then all of the con‐
787 stants are concatenated and fill characters are added to the end of
788 the resulting string to make its length be that of the unlimited di‐
789 mension. If the length is larger than the unlimited dimension, then
790 it is truncated with a warning.
791
792 2. For the case of character typed vlen, "char(*) vlen_t" for example,
793 we simply concatenate all the constants with no filling at all.
794
795 3. For the case of a character typed attribute, we simply concatenate
796 all the constants.
797
798 In netcdf-4, dimensions other than the first can be unlimited. Of
799 course by the rules above, the interior unlimited instances must be
800 delimited by {...}. For example:
801 variables: char var(u,u2);
802 datalist: var={"1", "two"}, {"three"};
803 In this case u will have the effective length of two. Within each
804 instance of u2, the rules above will apply, leading to this:
805 datalist: var={"1","t","w","o"}, {"t","h","r","e","e"};
806 The effective size of u2 will be the max of the two instance lengths
807 (five in this case) and the shorter will be padded to produce this.
808 datalist: var={"1","t","w","o","\0"}, {"t","h","r","e","e"};
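The two-dimensional case can be sketched in Python as follows. The function name layout_2d is hypothetical and the fill character "\0" is taken from the listing above; this only illustrates how the effective sizes are derived and is not ncgen's code.

```python
def layout_2d(instances, fill="\0"):
    """Return (u, u2, rows) for char var(u,u2) given braced instances."""
    # Within each braced instance the constants are simply concatenated.
    rows = ["".join(consts) for consts in instances]
    # u2 is the longest instance; shorter ones are padded with fill.
    u2 = max((len(r) for r in rows), default=0)
    return len(rows), u2, [r + fill * (u2 - len(r)) for r in rows]


# var={"1", "two"}, {"three"};
print(layout_2d([["1", "two"], ["three"]]))
# (2, 5, ['1two\x00', 'three'])
```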
809
810 Consider an even more complicated case.
811 variables: char var(u,u2,u3);
812 datalist: var={{"1", "two"}}, {{"three"},{"four","xy"}};
813 In this case u again will have the effective length of two. The u2
814 dimension will have a size = max(1,2) = 2. Within each instance of u3,
815 the rules above will apply, leading to this:
816 datalist: var={{"1","t","w","o"}}, {{"t","h","r","e","e"},{"f","o","u","r","x","y"}};
817 The effective size of u3 will be the max of the instance lengths
818 (six in this case) and the shorter ones will be padded to produce this:
819 datalist: var={{"1","t","w","o"," "," "}}, {{"t","h","r","e","e"," "},{"f","o","u","r","x","y"}};
820 Note however that the first instance of u holds fewer u2 instances
821 than the max length of u2, so we need to add a filler for another
822 instance of u2, producing this:
823 datalist: var={{"1","t","w","o"," "," "},{" "," "," "," "," "," "}}, {{"t","h","r","e","e"," "},{"f","o","u","r","x","y"}};
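The three-dimensional case can be sketched the same way. This Python illustration uses the hypothetical name layout_3d and takes the fill character as a parameter; a blank is passed in the example only to match the listing above.

```python
def layout_3d(data, fill="\0"):
    """Return ((u, u2, u3), cells) for char var(u,u2,u3)."""
    # Innermost braces: concatenate the constants of each u3 instance.
    joined = [["".join(consts) for consts in outer] for outer in data]
    u = len(joined)
    u2 = max((len(outer) for outer in joined), default=0)
    u3 = max((len(s) for outer in joined for s in outer), default=0)
    blank = fill * u3
    cells = [
        # Pad each string to u3, then add filler instances up to u2.
        [s + fill * (u3 - len(s)) for s in outer] + [blank] * (u2 - len(outer))
        for outer in joined
    ]
    return (u, u2, u3), cells


# var={{"1", "two"}}, {{"three"},{"four","xy"}};
sizes, cells = layout_3d([[["1", "two"]], [["three"], ["four", "xy"]]], fill=" ")
print(sizes)   # (2, 2, 6)
print(cells)   # [['1two  ', '      '], ['three ', 'fourxy']]
```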
824
825
827 The programs generated by ncgen when using the -c flag use initializa‐
828 tion statements to store data in variables, and will fail to produce
829 compilable programs if you try to use them for large datasets, since
830 the resulting statements may exceed the line length or number of con‐
831 tinuation statements permitted by the compiler.
832
833 The CDL syntax makes it easy to assign what looks like an array of
834 variable-length strings to a netCDF variable, but the strings may
835 simply be concatenated into a single array of characters. Specific use
836 of the string type specifier may solve the problem.
837
838
840 The file ncgen.y is the definitive grammar for CDL, but a stripped down
841 version is included here for completeness.
842 ncdesc: NETCDF
843 datasetid
844 rootgroup
845 ;
846
847 datasetid: DATASETID
848
849 rootgroup: '{'
850 groupbody
851 subgrouplist
852 '}';
853
854 groupbody:
855 attrdecllist
856 typesection
857 dimsection
858 vasection
859 datasection
860 ;
861
862 subgrouplist:
863 /*empty*/
864 | subgrouplist namedgroup
865 ;
866
867 namedgroup: GROUP ident '{'
868 groupbody
869 subgrouplist
870 '}'
871 attrdecllist
872 ;
873
874 typesection: /* empty */
875 | TYPES
876 | TYPES typedecls
877 ;
878
879 typedecls:
880 type_or_attr_decl
881 | typedecls type_or_attr_decl
882 ;
883
884 typename: ident ;
885
886 type_or_attr_decl:
887 typedecl
888 | attrdecl ';'
889 ;
890
891 typedecl:
892 enumdecl optsemicolon
893 | compounddecl optsemicolon
894 | vlendecl optsemicolon
895 | opaquedecl optsemicolon
896 ;
897
898 optsemicolon:
899 /*empty*/
900 | ';'
901 ;
902
903 enumdecl: primtype ENUM typename '{' enumidlist '}' ;
904
905 enumidlist: enumid
906 | enumidlist ',' enumid
907 ;
908
909 enumid: ident '=' constint ;
910
911 opaquedecl: OPAQUE '(' INT_CONST ')' typename ;
912
913 vlendecl: typeref '(' '*' ')' typename ;
914
915 compounddecl: COMPOUND typename '{' fields '}' ;
916
917 fields: field ';'
918 | fields field ';'
919 ;
920
921 field: typeref fieldlist ;
922
923 primtype: CHAR_K
924 | BYTE_K
925 | SHORT_K
926 | INT_K
927 | FLOAT_K
928 | DOUBLE_K
929 | UBYTE_K
930 | USHORT_K
931 | UINT_K
932 | INT64_K
933 | UINT64_K
934 ;
935
936 dimsection: /* empty */
937 | DIMENSIONS
938 | DIMENSIONS dimdecls
939 ;
940
941 dimdecls: dim_or_attr_decl ';'
942 | dimdecls dim_or_attr_decl ';'
943 ;
944
945 dim_or_attr_decl: dimdeclist | attrdecl ;
946
947 dimdeclist: dimdecl
948 | dimdeclist ',' dimdecl
949 ;
950
951 dimdecl:
952 dimd '=' UINT_CONST
953 | dimd '=' INT_CONST
954 | dimd '=' DOUBLE_CONST
955 | dimd '=' NC_UNLIMITED_K
956 ;
957
958 dimd: ident ;
959
960 vasection: /* empty */
961 | VARIABLES
962 | VARIABLES vadecls
963 ;
964
965 vadecls: vadecl_or_attr ';'
966 | vadecls vadecl_or_attr ';'
967 ;
968
969 vadecl_or_attr: vardecl | attrdecl ;
970
971 vardecl: typeref varlist ;
972
973 varlist: varspec
974 | varlist ',' varspec
975 ;
976
977 varspec: ident dimspec ;
978
979 dimspec: /* empty */
980 | '(' dimlist ')'
981 ;
982
983 dimlist: dimref
984 | dimlist ',' dimref
985 ;
986
987 dimref: path ;
988
989 fieldlist:
990 fieldspec
991 | fieldlist ',' fieldspec
992 ;
993
994 fieldspec: ident fielddimspec ;
995
996 fielddimspec: /* empty */
997 | '(' fielddimlist ')'
998 ;
999
1000 fielddimlist:
1001 fielddim
1002 | fielddimlist ',' fielddim
1003 ;
1004
1005 fielddim:
1006 UINT_CONST
1007 | INT_CONST
1008 ;
1009
1010 /* Use this when referencing defined objects */
1011 varref: type_var_ref ;
1012
1013 typeref: type_var_ref ;
1014
1015 type_var_ref:
1016 path
1017 | primtype
1018 ;
1019
1020 /* Use this for all attribute decls */
1021 /* Watch out; this is right recursive */
1022 attrdecllist: /*empty*/ | attrdecl ';' attrdecllist ;
1023
1024 attrdecl:
1025 ':' ident '=' datalist
1026 | typeref type_var_ref ':' ident '=' datalist
1027 | type_var_ref ':' ident '=' datalist
1028 | type_var_ref ':' _FILLVALUE '=' datalist
1029 | typeref type_var_ref ':' _FILLVALUE '=' datalist
1030 | type_var_ref ':' _STORAGE '=' conststring
1031 | type_var_ref ':' _CHUNKSIZES '=' intlist
1032 | type_var_ref ':' _FLETCHER32 '=' constbool
1033 | type_var_ref ':' _DEFLATELEVEL '=' constint
1034 | type_var_ref ':' _SHUFFLE '=' constbool
1035 | type_var_ref ':' _ENDIANNESS '=' conststring
1036 | type_var_ref ':' _NOFILL '=' constbool
1037 | ':' _FORMAT '=' conststring
1038 ;
1039
1040 path:
1041 ident
1042 | PATH
1043 ;
1044
1045 datasection: /* empty */
1046 | DATA
1047 | DATA datadecls
1048 ;
1049
1050 datadecls:
1051 datadecl ';'
1052 | datadecls datadecl ';'
1053 ;
1054
1055 datadecl: varref '=' datalist ;
1056 datalist:
1057 datalist0
1058 | datalist1
1059 ;
1060
1061 datalist0:
1062 /*empty*/
1063 ;
1064
1065 /* Must have at least 1 element */
1066 datalist1:
1067 dataitem
1068 | datalist ',' dataitem
1069 ;
1070
1071 dataitem:
1072 constdata
1073 | '{' datalist '}'
1074 ;
1075
1076 constdata:
1077 simpleconstant
1078 | OPAQUESTRING
1079 | FILLMARKER
1080 | NIL
1081 | econstref
1082 | function
1083 ;
1084
1085 econstref: path ;
1086
1087 function: ident '(' arglist ')' ;
1088
1089 arglist:
1090 simpleconstant
1091 | arglist ',' simpleconstant
1092 ;
1093
1094 simpleconstant:
1095 CHAR_CONST /* never used apparently*/
1096 | BYTE_CONST
1097 | SHORT_CONST
1098 | INT_CONST
1099 | INT64_CONST
1100 | UBYTE_CONST
1101 | USHORT_CONST
1102 | UINT_CONST
1103 | UINT64_CONST
1104 | FLOAT_CONST
1105 | DOUBLE_CONST
1106 | TERMSTRING
1107 ;
1108
1109 intlist:
1110 constint
1111 | intlist ',' constint
1112 ;
1113
1114 constint:
1115 INT_CONST
1116 | UINT_CONST
1117 | INT64_CONST
1118 | UINT64_CONST
1119 ;
1120
1121 conststring: TERMSTRING ;
1122
1123 constbool:
1124 conststring
1125 | constint
1126 ;
1127
1128 /* Push all idents thru here for tracking */
1129 ident: IDENT ;
1130
1131
1132
1133Printed: 119-12-31 $Date: 2010/04/29 16:38:55 $ NCGEN(1)