NCGEN(1)                    UNIDATA UTILITIES                   NCGEN(1)

NAME
ncgen - From a CDL file generate a netCDF-3 file, a netCDF-4 file or a
C program

SYNOPSIS
ncgen [-b] [-c] [-f] [-k file format] [-l output language] [-n] [-o
netcdf_filename] [-x] input_file

DESCRIPTION
ncgen generates either a netCDF-3 (i.e. classic) binary .nc file, a
netCDF-4 (i.e. enhanced) binary .nc file or a file in some source
language that when executed will construct the corresponding binary .nc
file. The input to ncgen is a description of a netCDF file in a small
language known as CDL (network Common Data form Language), described
below. If no options are specified in invoking ncgen, it merely checks
the syntax of the input CDL file, producing error messages for any
violations of CDL syntax. Other options can be used, for example, to
create the corresponding netCDF file, or to generate a C program that
uses the netCDF C interface to create the netCDF file.

Note that this version of ncgen was originally called ncgen4. The older
ncgen program has been renamed to ncgen3.

ncgen may be used with the companion program ncdump to perform some
simple operations on netCDF files. For example, to rename a dimension
in a netCDF file, use ncdump to get a CDL version of the netCDF file,
edit the CDL file to change the name of the dimension, and use ncgen
to generate the corresponding netCDF file from the edited CDL file.

OPTIONS
-b  Create a (binary) netCDF file. If the -o option is absent, a
    default file name will be constructed from the netCDF name
    (specified after the netcdf keyword in the input) by appending
    the `.nc' extension. If a file already exists with the specified
    name, it will be overwritten.

-c  Generate C source code that will create a netCDF file matching
    the netCDF specification. The C source code is written to
    standard output; equivalent to -lc.

-f  Generate FORTRAN 77 source code that will create a netCDF file
    matching the netCDF specification. The source code is written
    to standard output; equivalent to -lf77.

-o netcdf_file
    Name for the binary netCDF file created. If this option is
    specified, it implies the "-b" option. (This option is necessary
    because netCDF files cannot be written directly to standard
    output, since standard output is not seekable.)

-k file_format
    The -k flag specifies the format of the file to be created and,
    by inference, the data model accepted by ncgen (i.e. netcdf-3
    (classic) versus netcdf-4). The possible arguments are as
    follows.

        '1', 'classic' => netcdf classic file format, netcdf-3
        type model.

        '2', '64-bit-offset', '64-bit offset' => netcdf 64 bit
        classic file format, netcdf-3 type model.

        '3', 'hdf5', 'netCDF-4', 'enhanced' => netcdf-4 file
        format, netcdf-4 type model.

        '4', 'hdf5-nc3', 'netCDF-4 classic model', 'enhanced-nc3'
        => netcdf-4 file format, netcdf-3 type model.

    If no -k is specified then it defaults to -k1 (i.e. classic).
    Note also that -v is accepted to mean the same thing as -k for
    backward compatibility, but -k is preferred, to match the
    corresponding ncdump option.

-x  Don't initialize data with fill values. This can speed up
    creation of large netCDF files greatly, but later attempts to
    read unwritten data from the generated file will not be easily
    detectable.

-l output_language
    The -l flag specifies the output language to use when generating
    source code that will create or define a netCDF file matching
    the netCDF specification. The output is written to standard
    output. The currently supported languages have the following
    flags.

        `c|C' => C language output.

        `f77|fortran77' => FORTRAN 77 language output; note that
        currently only the classic model is supported.

        `j|java' => (experimental) Java language output; targets
        the existing Unidata Java interface, which means that only
        the classic model is supported.

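The -k arguments listed above can be read as a small lookup table. The
following Python sketch is only an illustration of that table; the
names KIND_TABLE and resolve_kind are hypothetical, exact-match
comparison of the argument is an assumption, and none of this is taken
from the ncgen source.

```python
# Illustrative only: a lookup table for the documented -k arguments.
# KIND_TABLE and resolve_kind are hypothetical names, not part of ncgen.
KIND_TABLE = {
    ("1", "classic"):
        ("netcdf classic", "netcdf-3"),
    ("2", "64-bit-offset", "64-bit offset"):
        ("netcdf 64 bit classic", "netcdf-3"),
    ("3", "hdf5", "netCDF-4", "enhanced"):
        ("netcdf-4", "netcdf-4"),
    ("4", "hdf5-nc3", "netCDF-4 classic model", "enhanced-nc3"):
        ("netcdf-4", "netcdf-3"),
}


def resolve_kind(argument=None):
    """Return (file format, type model) for a -k argument.

    With no argument the documented default, -k1 (classic), is used.
    Matching is exact; whether ncgen ignores case is not assumed here.
    """
    if argument is None:
        argument = "1"
    for aliases, result in KIND_TABLE.items():
        if argument in aliases:
            return result
    raise ValueError("unrecognized -k argument: %r" % (argument,))
```

For instance, resolve_kind("hdf5-nc3") yields the netcdf-4 file format
paired with the netcdf-3 type model, which is the combination to pick
when a netCDF-4 file must stay readable through the classic data model.
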
EXAMPLES
Check the syntax of the CDL file `foo.cdl':

    ncgen foo.cdl

From the CDL file `foo.cdl', generate an equivalent binary netCDF file
named `x.nc':

    ncgen -o x.nc foo.cdl

From the CDL file `foo.cdl', generate a C program containing the netCDF
function invocations necessary to create an equivalent binary netCDF
file named `x.nc':

    ncgen -c -o x.nc foo.cdl

USAGE
CDL Syntax Overview
Below is an example of CDL syntax, describing a netCDF file with
several named dimensions (lat, lon, and time), variables (Z, t, p, rh,
lat, lon, time), variable attributes (units, long_name, valid_range,
_FillValue), and some data. CDL keywords are in boldface. (This example
is intended to illustrate the syntax; a real CDL file would have a more
complete set of attributes so that the data would be more completely
self-describing.)
    netcdf foo {  // an example netCDF specification in CDL

    types:
        ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
        opaque(11) opaque_t;
        int(*) vlen_t;

    dimensions:
        lat = 10, lon = 5, time = unlimited ;

    variables:
        long    lat(lat), lon(lon), time(time);
        float   Z(time,lat,lon), t(time,lat,lon);
        double  p(time,lat,lon);
        long    rh(time,lat,lon);

        string  country(time,lat,lon);
        ubyte   tag;

        // variable attributes
        lat:long_name = "latitude";
        lat:units = "degrees_north";
        lon:long_name = "longitude";
        lon:units = "degrees_east";
        time:units = "seconds since 1992-1-1 00:00:00";

        // typed variable attributes
        string Z:units = "geopotential meters";
        float Z:valid_range = 0., 5000.;
        double p:_FillValue = -9999.;
        long rh:_FillValue = -1;
        vlen_t :globalatt = {17, 18, 19};
    data:
        lat   = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
        lon   = -140, -118, -96, -84, -52;
    group g {
    types:
        compound cmpd_t { vlen_t f1; enum_t f2;};
    } // group g
    group h {
    variables:
        /g/cmpd_t  compoundvar;
    data:
        compoundvar = { {3,4,5}, Stratus } ;
    } // group h
    }

All CDL statements are terminated by a semicolon. Spaces, tabs, and
newlines can be used freely for readability. Comments may follow the
characters `//' on any line.

A CDL description consists of four optional parts: types, dimensions,
variables, and data, beginning with the keywords types:, dimensions:,
variables:, and data:, respectively. The variable part may contain
variable declarations and attribute assignments. All sections may
contain global attribute assignments.

In addition, after the data: section, the user may define a series of
groups (see the example above). Groups themselves can contain types,
dimensions, variables, data, and other (nested) groups.

The netCDF type section declares the user defined types. These may be
constructed using any of the following types: enum, vlen, opaque, or
compound.

A netCDF dimension is used to define the shape of one or more of the
multidimensional variables contained in the netCDF file. A netCDF
dimension has a name and a size. A dimension can have the unlimited
size, which means a variable using this dimension can grow to any
length in that dimension.

A variable represents a multidimensional array of values of the same
type. A variable has a name, a data type, and a shape described by its
list of dimensions. Each variable may also have associated attributes
(see below) as well as data values. The name, data type, and shape of
a variable are specified by its declaration in the variable section of
a CDL description. A variable may have the same name as a dimension;
by convention such a variable is one-dimensional and contains
coordinates of the dimension it names. Dimensions need not have
corresponding variables.

A netCDF attribute contains information about a netCDF variable or
about the whole netCDF dataset. Attributes are used to specify such
properties as units, special values, maximum and minimum valid values,
scaling factors, offsets, and parameters. Attribute information is
represented by single values or arrays of values. For example, "units"
is an attribute represented by a character array such as "celsius". An
attribute has an associated variable, a name, a data type, a length,
and a value. In contrast to variables that are intended for data,
attributes are intended for metadata (data about data). Unlike in
netCDF-3, attribute types can be any user defined type as well as the
usual built-in types.

In CDL, an attribute is designated by a type, a variable, a ':', and
then an attribute name. The type is optional and if missing, it will
be inferred from the values assigned to the attribute. It is possible
to assign global attributes not associated with any variable to the
netCDF as a whole by omitting the variable name in the attribute
declaration. Notice that there is a potential ambiguity in a
specification such as
    x : a = ...
In this situation, x could be either a type for a global attribute, or
the variable name for an attribute. Since there could both be a type
named x and a variable named x, there is an ambiguity. The rule is
that in this situation, x will be interpreted as a type if possible,
and otherwise as a variable.

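The tie-breaking rule for `x : a = ...' can be sketched in a few lines
of Python. This is a minimal illustration of the stated rule only; the
function name and the use of plain sets for the known type and variable
names are assumptions, not a description of how ncgen parses CDL.

```python
def interpret_prefix(name, type_names, variable_names):
    """Decide how the name before ':' is read in 'name : attr = ...'.

    Follows the documented rule: interpret the name as a type if
    possible, otherwise as a variable. All names here are hypothetical.
    """
    if name in type_names:
        # A type named x wins: this is a typed global attribute.
        return "type"
    if name in variable_names:
        # No such type: this is an attribute of the variable x.
        return "variable"
    raise LookupError("%r is neither a known type nor a variable" % name)
```

When both a type and a variable are called x, the function returns
"type", so an attribute on the variable x would need an explicit type
in front of it to be read as intended.
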
If not specified, the data type of an attribute in CDL is derived from
the type of the value(s) assigned to it. The length of an attribute is
the number of data values assigned to it, or the number of characters
in the character string assigned to it. Multiple values are assigned
to non-character attributes by separating the values with commas. All
values assigned to an attribute must be of the same type.

The names for CDL dimensions, variables, and attributes must begin with
an alphabetic character or `_', and subsequent characters may be
alphanumeric or `_' or `-'.

The optional data section of a CDL specification is where netCDF
variables may be initialized. The syntax of an initialization is
simple: a variable name, an equals sign, and a comma-delimited list of
constants (possibly separated by spaces, tabs and newlines) terminated
with a semicolon. For multi-dimensional arrays, the last dimension
varies fastest. Thus row-major order rather than column-major order is
used for matrices. If fewer values are supplied than are needed to fill
a variable, it is extended with a type-dependent `fill value', which
can be overridden by supplying a value for a distinguished variable
attribute named `_FillValue'. The types of constants need not match the
type declared for a variable; coercions are done to convert integers to
floating point, for example. The constant `_' can be used to designate
the fill value for a variable.

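The ordering and padding rules above can be made concrete with a short
Python sketch. It is an illustration under stated assumptions: the
helper names are hypothetical, the fill value of -1 is simply borrowed
from the rh:_FillValue attribute in the example file, and the code
models only fixed-size dimensions.

```python
from math import prod


def pad_datalist(values, shape, fill_value):
    """Extend a flat datalist with fill values up to the variable size."""
    total = prod(shape)
    if len(values) > total:
        raise ValueError("too many values for the declared shape")
    return list(values) + [fill_value] * (total - len(values))


def flat_index(indices, shape):
    """Position of an element in the flat list; last dimension fastest."""
    position = 0
    for index, size in zip(indices, shape):
        position = position * size + index
    return position


# A hypothetical 2 x 3 variable initialized with only four constants.
shape = (2, 3)
data = pad_datalist([1, 2, 3, 4], shape, fill_value=-1)
```

Here the element at indices (1, 0) is the fourth constant in the list,
and the last two positions of the second row hold the fill value.
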
Primitive Data Types
    char    characters
    byte    8-bit data
    short   16-bit signed integers
    int     32-bit signed integers
    long    (synonymous with int)
    int64   64-bit signed integers
    float   IEEE single precision floating point (32 bits)
    real    (synonymous with float)
    double  IEEE double precision floating point (64 bits)
    ubyte   unsigned 8-bit data
    ushort  16-bit unsigned integers
    uint    32-bit unsigned integers
    uint64  64-bit unsigned integers
    string  arbitrary length strings

CDL supports a superset of the primitive data types of C. The names
for the primitive data types are reserved words in CDL, so the names of
variables, dimensions, and attributes must not be primitive type names.
In declarations, type names may be specified in either upper or lower
case.

Bytes differ from characters in that they are intended to hold a full
eight bits of data, and the zero byte has no special significance, as
it does for character data. ncgen converts byte declarations to char
declarations in the output C code and to the nonstandard BYTE
declaration in output Fortran code.

Shorts can hold values between -32768 and 32767. ncgen converts short
declarations to short declarations in the output C code and to the
nonstandard INTEGER*2 declaration in output Fortran code.

Ints can hold values between -2147483648 and 2147483647. ncgen
converts int declarations to int declarations in the output C code and
to INTEGER declarations in output Fortran code. long is accepted as a
synonym for int in CDL declarations, but is deprecated since there are
now platforms with 64-bit representations for C longs.

Int64 can hold values between -9223372036854775808 and
9223372036854775807. ncgen converts int64 declarations to long long
declarations in the output C code.

Floats can hold values between about -3.4e+38 and 3.4e+38. Their
external representation is as 32-bit IEEE normalized single-precision
floating point numbers. ncgen converts float declarations to float
declarations in the output C code and to REAL declarations in output
Fortran code. real is accepted as a synonym for float in CDL
declarations.

Doubles can hold values between about -1.7e+308 and 1.7e+308. Their
external representation is as 64-bit IEEE standard normalized
double-precision floating point numbers. ncgen converts double
declarations to double declarations in the output C code and to DOUBLE
PRECISION declarations in output Fortran code.

The unsigned counterparts of the above integer types are mapped to the
corresponding unsigned C types. Their ranges are suitably modified to
start at zero.

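The ranges quoted above follow directly from the bit widths. The Python
sketch below derives them as a cross-check; the function names are
hypothetical, and the floating point limits use the standard IEEE 754
mantissa and exponent sizes rather than anything specific to ncgen.

```python
def signed_range(bits):
    """Smallest and largest value of a two's-complement integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Range of the unsigned counterpart, which starts at zero."""
    return 0, 2 ** bits - 1


def ieee_max(mantissa_bits, max_exponent):
    """Largest finite IEEE 754 value for the given format parameters."""
    return (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent


SHORT_RANGE = signed_range(16)   # CDL short
INT_RANGE = signed_range(32)     # CDL int (and the deprecated long)
INT64_RANGE = signed_range(64)   # CDL int64
USHORT_RANGE = unsigned_range(16)
FLOAT_MAX = ieee_max(23, 127)    # single precision, roughly 3.4e+38
DOUBLE_MAX = ieee_max(52, 1023)  # double precision, roughly 1.8e+308
```

The integer results match the limits given in the text exactly, while
the floating point limits agree with the rounded figures quoted there.
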
CDL Constants
Constants assigned to attributes or variables may be of any of the
basic netCDF types. The syntax for constants is similar to C syntax,
except that type suffixes must be appended to shorts and floats to
distinguish them from longs and doubles.

A byte constant is represented by a single character or multiple
character escape sequence enclosed in single quotes. For example,
    'a'      // ASCII `a'
    '\0'     // a zero byte
    '\n'     // ASCII newline character
    '\33'    // ASCII escape character (33 octal)
    '\x2b'   // ASCII plus (2b hex)
    '\377'   // 377 octal = 255 decimal, non-ASCII

Character constants are enclosed in double quotes. A character array
may be represented as a string enclosed in double quotes. The usual C
string escape conventions are honored. For example
    "a"               // ASCII `a'
    "Two\nlines\n"    // a 10-character string with two embedded newlines
    "a bell:\007"     // a string containing an ASCII bell
Note that the netCDF character array "a" would fit in a one-element
variable, since no terminating NUL character is assumed. However, a
zero byte in a character array is interpreted as the end of the
significant characters by the ncdump program, following the C
convention. Therefore, a NUL byte should not be embedded in a character
string unless at the end: use the byte data type instead for byte
arrays that contain the zero byte.

short integer constants are intended for representing 16-bit signed
quantities. The form of a short constant is an integer constant with
an `s' or `S' appended. If a short constant begins with `0', it is
interpreted as octal, except that if it begins with `0x', it is
interpreted as a hexadecimal constant. For example:
    -2s      // a short -2
    0123s    // octal
    0x7ffs   // hexadecimal

int integer constants are intended for representing 32-bit signed
quantities. The form of an int constant is an ordinary integer
constant, although it is acceptable to append an optional `l' or `L'
(again, deprecated). If an int constant begins with `0', it is
interpreted as octal, except that if it begins with `0x', it is
interpreted as a hexadecimal constant (but see opaque constants below).
Examples of valid int constants include:
    -2
    1234567890L
    0123     // octal
    0x7ff    // hexadecimal

int64 integer constants are intended for representing 64-bit signed
quantities. The form of an int64 constant is an integer constant with
an `ll' or `LL' appended. If an int64 constant begins with `0', it is
interpreted as octal, except that if it begins with `0x', it is
interpreted as a hexadecimal constant. For example:
    -2ll      // an int64 -2
    0123LL    // octal
    0x7ffLL   // hexadecimal

Floating point constants of type float are appropriate for representing
floating point data with about seven significant digits of precision.
The form of a float constant is the same as a C floating point constant
with an `f' or `F' appended. For example the following are all
acceptable float constants:
    -2.0f
    3.14159265358979f   // will be truncated to less precision
    1.f

Floating point constants of type double are appropriate for
representing floating point data with about sixteen significant digits
of precision. The form of a double constant is the same as a C floating
point constant. An optional `d' or `D' may be appended. For example the
following are all acceptable double constants:
    -2.0
    3.141592653589793
    1.0e-20
    1.d

Unsigned integer constants can be created by appending the character
'U' or 'u' between the constant and any trailing size specifier. Thus
one could say 10U, 100us, 100000ul, or 1000000ull, for example.

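The suffix and radix rules for integer constants described above can be
summarized in a small classifier. The Python sketch below is only an
approximation written from this text: the names are hypothetical, the
regular expression is an assumption about the accepted spelling, and
byte, floating point, and opaque constants are deliberately left out.

```python
import re

# Optional minus sign, hex or decimal/octal digits, optional unsigned
# marker, optional size suffix. This pattern is an assumption drawn
# from the prose above, not ncgen's actual lexer.
_INT_CONSTANT = re.compile(
    r"^(?P<sign>-?)(?P<digits>0[xX][0-9a-fA-F]+|[0-9]+)"
    r"(?P<unsigned>[uU]?)(?P<size>ll|LL|[sSlL]?)$"
)

_SIZE_TO_TYPE = {"s": "short", "": "int", "l": "int", "ll": "int64"}


def parse_int_constant(text):
    """Return (CDL type name, value) for an integer constant."""
    match = _INT_CONSTANT.match(text)
    if match is None:
        raise ValueError("not an integer constant: %r" % text)
    digits = match.group("digits")
    if digits[:2].lower() == "0x":
        magnitude = int(digits, 16)      # leading 0x: hexadecimal
    elif len(digits) > 1 and digits.startswith("0"):
        magnitude = int(digits, 8)       # leading 0: octal
    else:
        magnitude = int(digits, 10)
    value = -magnitude if match.group("sign") else magnitude
    type_name = _SIZE_TO_TYPE[match.group("size").lower()]
    if match.group("unsigned"):
        type_name = "u" + type_name      # e.g. 100us is a ushort
    return type_name, value
```

For example, parse_int_constant("0123s") classifies the constant as a
short written in octal, while parse_int_constant("1000000ull") reports
a uint64.
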
String constants are, like character constants, represented using
double quotes. This represents a potential ambiguity since a
multi-character string may also indicate a dimensioned character value.
Disambiguation usually occurs by context, but care should be taken to
specify the string type to ensure the proper choice.

Opaque constants are represented as sequences of hexadecimal digits
preceded by 0X or 0x: 0xaa34ffff, for example. These constants can
still be used as integer constants and will be either truncated or
extended as necessary.

Compound Constant Expressions
In order to assign values to variables (or attributes) whose type is a
user-defined type, the constant notation has been extended to include
sequences of constants enclosed in curly brackets (e.g. "{"..."}").
Such a constant is called a compound constant, and compound constants
can be nested.

Given a type "T(*) vlen_t", where T is some other arbitrary base type,
constants for this should be specified as follows.
    vlen_t var[2] = {t11,t12,...t1N}, {t21,t22,...t2m};
The values tij are assumed to be constants of type T.

Given a type "compound cmpd_t {T1 f1; T2 f2...Tn fn}", where the Ti are
other arbitrary base types, constants for this should be specified as
follows.
    cmpd_t var[2] = {t11,t12,...t1n}, {t21,t22,...t2n};
The values tij are assumed to be constants of type Tj. If the fields
are missing, then they will be set using any specified or default fill
value for the field's base type.

The general set of rules for using braces is defined in the Specifying
Datalists section below.

Scoping Rules
With the addition of groups, the name space for defined objects is no
longer flat. References (names) of any type, dimension, or variable may
be prefixed with the absolute path specifying a specific declaration.
Thus one might say
    variables:
        /g1/g2/t1 v1;
The type being referenced (t1) is the one within group g2, which in
turn is nested in group g1. The similarity of this notation to Unix
file paths is deliberate, and one can consider groups as a form of
directory structure.

When a name is not prefixed, then scope rules are applied to locate
the specified declaration. Currently, there are three rules: one for
dimensions, one for types and enumeration constants, and one for all
others.

1.  When an unprefixed name of a dimension is used (as in a variable
    declaration), ncgen first looks in the immediately enclosing
    group for the dimension. If it is not found there, then it
    looks in the group enclosing this group. This continues up the
    group hierarchy until the dimension is found, or there are no
    more groups to search.

2.  When an unprefixed name of a type or an enumeration constant is
    used, ncgen searches the group tree using a pre-order depth-first
    search. This essentially means that it will find the matching
    declaration that precedes the reference textually in the cdl
    file and that is "highest" in the group hierarchy.

3.  For all other names, only the immediately enclosing group is
    searched.

One final note. Forward references are not allowed. This means that
specifying, for example, /g1/g2/t1 will fail if this reference occurs
before g1 and/or g2 are defined.

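The three lookup rules for unprefixed names can be modeled on a toy
group tree. The following Python sketch is a simplified illustration:
the Group class and the three functions are hypothetical, the type
search ignores the textual-order and forward-reference constraints, and
nothing here is taken from the ncgen implementation.

```python
class Group:
    """A toy CDL group holding the names declared directly inside it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []       # nested groups, in textual order
        self.dimensions = set()
        self.types = set()
        self.variables = set()
        if parent is not None:
            parent.children.append(self)


def find_dimension(group, name):
    """Dimensions: search the enclosing group, then walk up to the root."""
    while group is not None:
        if name in group.dimensions:
            return group
        group = group.parent
    return None


def find_type(root, name):
    """Types: pre-order depth-first search starting from the root group."""
    if name in root.types:
        return root
    for child in root.children:
        found = find_type(child, name)
        if found is not None:
            return found
    return None


def find_other(group, name):
    """All other names: only the immediately enclosing group is searched."""
    return group if name in group.variables else None
```

With a root group that declares the dimension time and a nested group
h, find_dimension(h, "time") returns the root, whereas find_other(h,
"tag") returns None for a variable declared only in the root.
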
Special Attributes
Special, virtual, attributes can be specified to provide
performance-related information about the file format and about
variable properties. The file must be a netCDF-4 file for these to take
effect.

These special virtual attributes are not actually part of the file;
they are merely a convenient way to set miscellaneous properties of the
data in CDL.

The special attributes currently supported are as follows: `_Format',
`_Fletcher32', `_ChunkSizes', `_Endianness', `_DeflateLevel',
`_Shuffle', and `_Storage'.

`_Format' is a global attribute specifying the netCDF format variant.
Its value must be a single string matching one of `classic', `64-bit
offset', `netCDF-4', or `netCDF-4 classic model'.

The rest of the special attributes are all variable attributes.
Essentially all of them map to some corresponding `nc_def_var_XXX'
function as defined in the netCDF-4 API. `_Fletcher32' sets the
`fletcher32' property for a variable. `_Endianness' is either `little'
or `big', depending on how the variable is stored when first written.
`_DeflateLevel' is an integer between 0 and 9 inclusive if compression
has been specified for the variable. `_Shuffle' is 1 if use of the
shuffle filter is specified for the variable. `_Storage' is
`contiguous' or `chunked'. `_ChunkSizes' is a list of chunk sizes for
each dimension of the variable.

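The value constraints stated above for some of the special attributes
can be collected into a simple checker. This Python sketch covers only
the attributes whose allowed values are spelled out in the text; the
names are hypothetical and the checks are an illustration, not a
statement of what ncgen itself validates.

```python
# Allowed string values as listed in the text above.
_ALLOWED_STRINGS = {
    "_Format": {"classic", "64-bit offset", "netCDF-4",
                "netCDF-4 classic model"},
    "_Endianness": {"little", "big"},
    "_Storage": {"contiguous", "chunked"},
}


def check_special_attribute(name, value):
    """Return True if value fits the documented constraint for name.

    Only _Format, _Endianness, _Storage and _DeflateLevel are covered;
    any other attribute name raises KeyError in this sketch.
    """
    if name in _ALLOWED_STRINGS:
        return value in _ALLOWED_STRINGS[name]
    if name == "_DeflateLevel":
        # An integer between 0 and 9 inclusive (booleans excluded).
        is_int = isinstance(value, int) and not isinstance(value, bool)
        return is_int and 0 <= value <= 9
    raise KeyError("no documented value constraint for %r" % name)
```

A call such as check_special_attribute("_DeflateLevel", 12) returns
False, since the documented range stops at 9.
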
Specifying Datalists
Specifying datalists for variables in the `data:' section can be
somewhat complicated. There are some rules that must be followed to
ensure that datalists are parsed correctly by ncgen.

1.  The top level is automatically assumed to be a list of items,
    so it should not be inside {...}.

2.  Instances of UNLIMITED dimensions (other than the first dimension)
    must be surrounded by {...} in order to specify the size.

3.  Instances of vlens must be surrounded by {...} in order to
    specify the size.

4.  Compound instances must be embedded in {...}.

5.  Non-scalar fields of compound instances must be embedded in {...}.

6.  Datalists associated with attributes are implicitly a vector (i.e.,
    a list) of values of the type of the attribute and the above
    rules must apply with that in mind.

7.  No other use of braces is allowed.

Note that one consequence of these rules is that arrays of values
cannot have subarrays within braces. Thus, given, for example, int
var(d1)(d2)...(dn), a datalist for this variable must be a single list
of integers, where the number of integers is no more than
D=d1*d2*...*dn; note that the list can hold fewer than D values, in
which case fill values will be used to pad the list.

Rule 6 about attribute datalists has the following consequence. If the
type of the attribute is a compound (or vlen) type, and if the number
of entries in the list is one, then the compound instances must be
enclosed in braces.

BUGS
The programs generated by ncgen when using the -c flag use
initialization statements to store data in variables, and may fail to
compile for large datasets, since the resulting statements may exceed
the line length or number of continuation statements permitted by the
compiler.

The CDL syntax makes it easy to assign what looks like an array of
variable-length strings to a netCDF variable, but the strings may
simply be concatenated into a single array of characters. Specific use
of the string type specifier may solve the problem.

Printed: 119-6-22       $Date: 2010/04/04 19:39:52 $       NCGEN(1)