h5import(1)                 General Commands Manual                h5import(1)

NAME

       h5import - Imports data into an existing or new HDF5 file.

SYNOPSIS

       h5import infile -d dim_list [ -p pathname ] [ -t input_class ]
       [ -s input_size ] [infile ...] -o outfile

       h5import infile -dims dim_list [ -path pathname ] [ -type input_class ]
       [ -size input_size ] [infile ...] -outfile outfile

       h5import infile -c config_file [infile ...] -outfile outfile

       h5import -h

       h5import -help

DESCRIPTION

       h5import converts data from one or more ASCII or binary files, infile,
       into the same number of HDF5 datasets in the existing or new HDF5 file,
       outfile. Data conversion is performed in accordance with the
       user-specified type and storage properties given for each input file.

       The primary objective of h5import is to import floating point or
       integer data. The utility's design allows for future versions that
       accept ASCII text files and store the contents as a compact array of
       one-dimensional strings, but that capability is not implemented in HDF5
       Release 1.6.

       Input data and options

       Input data can be provided in one of the following forms:

       *      As an ASCII, or plain-text, file containing either floating
              point or integer data

       *      As a binary file containing either 32-bit or 64-bit native
              floating point data

       *      As a binary file containing native integer data, signed or
              unsigned and 8-bit, 16-bit, 32-bit, or 64-bit

       *      As an ASCII, or plain-text, file containing text data. (This
              feature is not implemented in HDF5 Release 1.6.)

       Each input file, infile, contains a single n-dimensional array of
       values of one of the above types expressed in the order of
       fastest-changing dimensions first.
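
       A binary input file of the kind described above can be produced in many
       ways. The following is a minimal Python sketch, not part of h5import,
       that packs a flat list of values as native 64-bit floating point
       numbers. The function names are hypothetical, and the caller is assumed
       to supply the values already arranged in the element order that
       h5import expects.

```python
import array
import io


def native_float64_bytes(values):
    """Pack values as native 64-bit floating point numbers."""
    data = array.array("d", values)  # "d" is the platform's C double
    if data.itemsize != 8:
        raise RuntimeError("C double is not 64 bits wide on this platform")
    return data.tobytes()


def write_binary_input(stream, values):
    """Write values to a binary stream in the order given."""
    stream.write(native_float64_bytes(values))


# Six values for a 2 x 3 array; the ordering is the caller's responsibility.
buffer = io.BytesIO()
write_binary_input(buffer, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(len(buffer.getvalue()), "bytes written")
```

       A file holding these bytes corresponds to an input class of FP with an
       input size of 64.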

       Floating point data in an ASCII input file must be expressed in fixed
       notation (e.g., 323.56). h5import is designed to accept scientific
       notation (e.g., 3.23E+02) in an ASCII file, but that is not implemented
       in HDF5 Release 1.6.
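
       Programs that generate ASCII input should therefore avoid scientific
       notation. As an illustration only, the Python sketch below formats
       values in fixed notation; Python's default float formatting would
       otherwise print a value such as 0.0000001 as 1e-07. The helper names
       are hypothetical, and separating values with spaces and newlines is an
       assumption of this sketch rather than something this page specifies.

```python
def format_fixed(value, digits=6):
    """Format a float in fixed notation with the given number of decimals."""
    return format(value, f".{digits}f")


def text_input_lines(values, per_line=4, digits=6):
    """Return whitespace-separated fixed-notation text, per_line values a line."""
    rows = []
    for start in range(0, len(values), per_line):
        row = values[start:start + per_line]
        rows.append(" ".join(format_fixed(v, digits) for v in row))
    return "\n".join(rows) + "\n"


# Small and large magnitudes both stay in fixed notation.
sample = text_input_lines([323.56, 0.0000001, 12345678.9], per_line=2, digits=8)
print(sample, end="")
```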

       Each input file can be associated with options specifying the datatype
       and storage properties. These options can be specified either as
       command line arguments or in a configuration file. Note that exactly
       one of these approaches must be used with a single input file.

       Command line arguments, best used with simple input files, can be used
       to specify the class, size, and dimensions of the input data and a path
       identifying the output dataset.

       The recommended means of specifying input data options is in a
       configuration file; this is also the only means of specifying advanced
       storage features. See further discussion of the configuration file in
       the FILES section below.

       The only required option for input data is dimension sizes; defaults
       are available for all others.

       h5import will accept up to 30 input files in a single call. Other
       considerations, such as the maximum length of a command line, may
       impose a more stringent limitation.

       Output data and options

       The name of the output file is specified following the -o or -outfile
       option in outfile. The data from each input file is stored as a
       separate dataset in this output file. outfile may be an existing file.
       If it does not yet exist, h5import will create it.

       Output dataset information and storage properties can be specified only
       by means of a configuration file.

       Dataset path
              If the groups in the path leading to the dataset do not exist,
              h5import will create them. If no group is specified, the
              dataset will be created as a member of the root group. If no
              dataset name is specified, the default name is dataset1 for the
              first input dataset, dataset2 for the second input dataset,
              dataset3 for the third input dataset, etc. h5import does not
              overwrite a pre-existing dataset of the specified or default
              name. When an existing dataset of a conflicting name is
              encountered, h5import quits with an error; the current input
              file and any subsequent input files are not processed.

       Output type
              Datatype parameters for output data

       Output data class
              Signed or unsigned integer or floating point

       Output data size
              8-, 16-, 32-, or 64-bit integer; 32- or 64-bit floating point

       Output architecture
              IEEE, STD, or NATIVE (default). Other architectures are included
              in the h5import design but are not implemented in this release.

       Output byte order
              Little- or big-endian. Relevant only if output architecture is
              IEEE, UNIX, or STD; fixed for other architectures.

       Dataset layout and storage properties
              Denote how raw data is to be organized on the disk. If none of
              the following are specified, the default configuration is
              contiguous layout with no compression.

       Layout
              Contiguous (default) or chunked

       External storage
              Allows raw data to be stored in a non-HDF5 file or in an
              external HDF5 file. Requires contiguous layout.

       Compressed
              Sets the type of compression and the level to which the dataset
              must be compressed. Requires chunked layout.

       Extendible
              Allows the dimensions of the dataset to increase over time
              and/or to be unlimited. Requires chunked layout.

       Compressed and extendible
              Requires chunked layout.

FILES

       A configuration file is specified with the -c config_file option:
              h5import infile -c config_file [infile ...] -outfile outfile

       The configuration file is an ASCII file and must be organized as
       "Configuration_Keyword Value" pairs, with one pair on each line. For
       example, the line indicating that the input data class (configuration
       keyword INPUT-CLASS) is floating point in a text file (value TEXTFP)
       would appear as follows:
              INPUT-CLASS TEXTFP

       A configuration file may have the following keywords, each followed by
       one of the following defined values. One entry for each of the first
       two keywords, RANK and DIMENSION-SIZES, is required; all other keywords
       are optional.
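
       The "Keyword Value" layout is simple enough to generate from a script.
       The following Python sketch is one hypothetical way to do so; the
       function name and its keyword-argument convention are inventions of
       this example, not part of h5import. It always emits the two required
       entries and refuses input in which the number of dimension sizes does
       not match RANK, as the DIMENSION-SIZES entry in this section requires.

```python
def render_config(rank, dimension_sizes, **optional):
    """Return configuration file text with one "Keyword Value" pair per line.

    Extra keyword arguments use underscores in place of hyphens, so
    input_class="TEXTFP" becomes the line "INPUT-CLASS TEXTFP".
    """
    if len(dimension_sizes) != rank:
        raise ValueError("number of dimension sizes must equal RANK")
    lines = [
        f"RANK {rank}",
        "DIMENSION-SIZES " + " ".join(str(n) for n in dimension_sizes),
    ]
    for name, value in optional.items():
        keyword = name.upper().replace("_", "-")
        if isinstance(value, (list, tuple)):
            # Multi-valued entries are written as space-separated integers.
            value = " ".join(str(v) for v in value)
        lines.append(f"{keyword} {value}")
    return "\n".join(lines) + "\n"


config_text = render_config(3, [5, 2, 4], input_class="TEXTFP", output_size=64)
print(config_text, end="")
```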

       RANK rank
              An integer specifying the number of dimensions in the dataset.
              Example: RANK 4 for a 4-dimensional dataset.

       DIMENSION-SIZES dim_sizes
              Sizes of the dataset dimensions. (Required) A string of
              space-separated integers specifying the sizes of the dimensions
              in the dataset. The number of sizes in this entry must match the
              value in the RANK entry. The fastest-changing dimension must be
              listed first. Example: DIMENSION-SIZES 4 3 4 38 for a 38x4x3x4
              dataset.

       PATH path
              Path of the output dataset. The full HDF5 pathname identifying
              the output dataset relative to the root group within the output
              file; i.e., path is a string consisting of optional group names,
              each followed by a slash, and ending with a dataset name. If the
              groups in the path do not exist, they will be created. If PATH
              is not specified, the output dataset is stored as a member of
              the root group and the default dataset name is dataset1 for the
              first input dataset, dataset2 for the second input dataset,
              dataset3 for the third input dataset, etc. Note that h5import
              does not overwrite a pre-existing dataset of the specified or
              default name. When an existing dataset of a conflicting name is
              encountered, h5import quits with an error; the current input
              file and any subsequent input files are not processed. Example:
              The configuration file entry "PATH grp1/grp2/dataset1" indicates
              that the output dataset dataset1 will be written in the group
              grp2/ which is in the group grp1/, a member of the root group in
              the output file.

       INPUT-CLASS {TEXTIN|TEXTUIN|TEXTFP|TEXTFPE|IN|UIN|FP|STR}
              A string denoting the type of input data.
              TEXTIN Input is signed integer data in an ASCII file.
              TEXTUIN Input is unsigned integer data in an ASCII file.
              TEXTFP Input is floating point data in fixed notation (e.g.,
              325.34) in an ASCII file.
              TEXTFPE Input is floating point data in scientific notation
              (e.g., 3.2534E+02) in an ASCII file. (Not implemented in this
              release.)
              IN Input is signed integer data in a binary file.
              UIN Input is unsigned integer data in a binary file.
              FP Input is floating point data in a binary file. (Default)
              STR Input is character data in an ASCII file. With this value,
              the configuration keywords RANK, DIMENSION-SIZES, OUTPUT-CLASS,
              OUTPUT-SIZE, OUTPUT-ARCHITECTURE, and OUTPUT-BYTE-ORDER will be
              ignored. (Not implemented in this release.)

       INPUT-SIZE {8|16|32|64}
              An integer denoting the size of the input data, in bits. For
              signed and unsigned integer data (TEXTIN, TEXTUIN, IN, or UIN),
              any of 8, 16, 32, or 64 may be used. The default is 32. For
              floating point data (TEXTFP, TEXTFPE, or FP), either 32 or 64
              may be specified. The default is 32.

       OUTPUT-CLASS {IN|UIN|FP|STR}
              A string denoting the type of output data.
              IN Output is signed integer data. (Default if INPUT-CLASS is IN
              or TEXTIN)
              UIN Output is unsigned integer data. (Default if INPUT-CLASS is
              UIN or TEXTUIN)
              FP Output is floating point data. (Default if INPUT-CLASS is not
              specified or is FP, TEXTFP, or TEXTFPE)
              STR Output is character data, to be written as a 1-dimensional
              array of strings. (Default if INPUT-CLASS is STR) (Not
              implemented in this release.)

       OUTPUT-SIZE {8|16|32|64}
              An integer denoting the size of the output data, in bits. For
              signed and unsigned integer data (IN or UIN), any of the four
              sizes is valid. The default is the same as INPUT-SIZE, else 32.
              For floating point data (FP), either 32 or 64 may be specified.
              The default is the same as INPUT-SIZE, else 32.

       OUTPUT-ARCHITECTURE {NATIVE|STD|IEEE|INTEL|CRAY|MIPS|ALPHA|UNIX}
              A string denoting the type of output architecture. See the
              "Predefined Atomic Types" section in the "HDF5 Datatypes"
              chapter of the HDF5 User's Guide for a discussion of these
              architectures. INTEL, CRAY, MIPS, ALPHA, and UNIX are not
              implemented in this release. (Default: NATIVE)

       OUTPUT-BYTE-ORDER {BE|LE}
              A string denoting the output byte order. This entry is ignored
              if OUTPUT-ARCHITECTURE is not specified or is not one of IEEE,
              UNIX, or STD.
              BE Big-endian. (Default)
              LE Little-endian.
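
       The defaults described above for OUTPUT-CLASS and OUTPUT-SIZE depend on
       the input settings. The Python sketch below restates those rules as a
       lookup; it is an illustration of the documented defaults under the
       assumption that they apply exactly as written, not a description of
       h5import's internal logic. The function and table names are
       hypothetical.

```python
# Default OUTPUT-CLASS for each INPUT-CLASS, as listed under OUTPUT-CLASS.
OUTPUT_CLASS_FOR_INPUT = {
    "IN": "IN", "TEXTIN": "IN",
    "UIN": "UIN", "TEXTUIN": "UIN",
    "FP": "FP", "TEXTFP": "FP", "TEXTFPE": "FP",
    "STR": "STR",
}


def effective_output_type(input_class=None, input_size=None,
                          output_class=None, output_size=None):
    """Return (class, size) of the output data after applying the defaults."""
    if input_class is None:
        input_class = "FP"        # documented default INPUT-CLASS
    if input_size is None:
        input_size = 32           # documented default INPUT-SIZE
    if output_class is None:
        output_class = OUTPUT_CLASS_FOR_INPUT[input_class]
    if output_size is None:
        output_size = input_size  # "the same as INPUT-SIZE, else 32"
    if output_size not in (8, 16, 32, 64):
        raise ValueError("size must be 8, 16, 32, or 64 bits")
    if output_class == "FP" and output_size not in (32, 64):
        raise ValueError("floating point output must be 32 or 64 bits")
    return output_class, output_size


print(effective_output_type())
print(effective_output_type("TEXTUIN", 16))
```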

       The following options are disabled by default, making the default
       storage properties no chunking, no compression, no external storage,
       and no extendible dimensions.

       CHUNKED-DIMENSION-SIZES chunk_dims
              Dimension sizes of the chunk for chunked output data. A string
              of space-separated integers specifying the dimension sizes of
              the chunk for chunked output data. The number of dimensions must
              correspond to the value of RANK. The presence of this field
              indicates that the output dataset is to be stored in chunked
              layout; if this configuration field is absent, the dataset will
              be stored in contiguous layout.

       COMPRESSION-TYPE {GZIP}
              Type of compression to be used with chunked storage. Requires
              that CHUNKED-DIMENSION-SIZES be specified. GZIP is gzip
              compression. Other compression algorithms are not implemented
              in this release of h5import.

       COMPRESSION-PARAM [1-9]
              Compression level. Required if COMPRESSION-TYPE is specified.
              Gzip compression levels: 1 will result in the fastest
              compression while 9 will result in the best compression ratio.
              (The default gzip compression level is 6; not all compression
              methods will have a default level.)

       EXTERNAL-STORAGE external_file
              Name of an external file in which to create the output dataset.
              A string specifying the name of an external file. Cannot be used
              with CHUNKED-DIMENSION-SIZES, COMPRESSION-TYPE, or
              MAXIMUM-DIMENSIONS.

       MAXIMUM-DIMENSIONS max_dims
              Maximum sizes of all dimensions. Requires that
              CHUNKED-DIMENSION-SIZES be specified. A string of
              space-separated integers specifying the maximum size of each
              dimension of the output dataset. A value of -1 for any dimension
              implies unlimited size for that particular dimension. The number
              of dimensions must correspond to the value of RANK.
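
       The storage keywords above constrain one another. As a rough aid, the
       following Python sketch checks a configuration, held as a dictionary,
       against the rules stated in this section. The function name and the
       dictionary representation (dimension lists as lists of integers,
       COMPRESSION-PARAM as an integer) are assumptions of this example;
       h5import itself may report such problems differently.

```python
def check_storage_options(config):
    """Return a list of problems found in a dict of keyword -> value."""
    problems = []
    rank = config.get("RANK")
    chunked = "CHUNKED-DIMENSION-SIZES" in config

    # Dimension lists must have as many entries as RANK.
    for key in ("CHUNKED-DIMENSION-SIZES", "MAXIMUM-DIMENSIONS"):
        if key in config and len(config[key]) != rank:
            problems.append(f"{key} must list {rank} values to match RANK")

    # Compression and maximum dimensions both require chunked layout.
    for key in ("COMPRESSION-TYPE", "MAXIMUM-DIMENSIONS"):
        if key in config and not chunked:
            problems.append(f"{key} requires CHUNKED-DIMENSION-SIZES")

    if "COMPRESSION-TYPE" in config:
        if config["COMPRESSION-TYPE"] != "GZIP":
            problems.append("only GZIP compression is implemented")
        level = config.get("COMPRESSION-PARAM")
        if level is None or not 1 <= level <= 9:
            problems.append("COMPRESSION-PARAM must be an integer from 1 to 9")

    # External storage excludes the chunk-based features.
    if "EXTERNAL-STORAGE" in config:
        for key in ("CHUNKED-DIMENSION-SIZES", "COMPRESSION-TYPE",
                    "MAXIMUM-DIMENSIONS"):
            if key in config:
                problems.append(f"EXTERNAL-STORAGE cannot be combined with {key}")
    return problems


ok = check_storage_options({
    "RANK": 5,
    "CHUNKED-DIMENSION-SIZES": [2, 2, 2, 2, 2],
    "COMPRESSION-TYPE": "GZIP",
    "COMPRESSION-PARAM": 7,
})
print(ok)
```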

OPTIONS

       -h[elp]
              Prints the h5import usage summary.

       infile(s)
              Name of the input file(s).

       -d[ims] dim_list
              Input data dimensions. dim_list is a string of comma-separated
              numbers with no spaces describing the dimensions of the input
              data. For example, a 50 x 100 2-dimensional array would be
              specified as -dims 50,100. Required argument: if no
              configuration file is used, this command-line argument is
              mandatory.

       -p[ath] pathname
              pathname is a string consisting of one or more strings separated
              by slashes (/) specifying the path of the dataset in the output
              file. If the groups in the path do not exist, they will be
              created. Optional argument: if not specified, the default path
              is dataset1 for the first input dataset, dataset2 for the second
              input dataset, dataset3 for the third input dataset, etc.
              h5import does not overwrite a pre-existing dataset of the
              specified or default name. When an existing dataset of a
              conflicting name is encountered, h5import quits with an error;
              the current input file and any subsequent input files are not
              processed.

       -t[ype] input_class
              input_class specifies the class of the input data and determines
              the class of the output data. Valid values are as defined for
              the INPUT-CLASS keyword in the FILES section above. Optional
              argument: if not specified, the default value is FP.

       -s[ize] input_size
              input_size specifies the size in bits of the input data and
              determines the size of the output data. Valid values for signed
              or unsigned integers are 8, 16, 32, and 64. Valid values for
              floating point data are 32 and 64. Optional argument: if not
              specified, the default value is 32.

       -c config_file
              config_file specifies a configuration file. This argument
              replaces all other arguments except infile and -o outfile.

       outfile
              Name of the HDF5 output file.

NOTES

       If the -c config_file option is used with an input file, no other
       argument can be used with that input file. If the -c config_file option
       is not used with an input data file, the -d dim_list argument (or -dims
       dim_list) must be used and any combination of the remaining options may
       be used. Although only the -dims argument is required, any arguments
       used must appear in exactly the order shown in the SYNOPSIS.
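
       Because argument order matters, scripts that call h5import may prefer
       to assemble the command line in one place. The Python sketch below
       builds an argument list in the order given in the SYNOPSIS for a single
       input file described by command-line options. The function name is
       hypothetical, and rejecting the STR class is a choice of this sketch
       (that class is documented as not implemented).

```python
FLOAT_CLASSES = {"TEXTFP", "TEXTFPE", "FP"}
INTEGER_CLASSES = {"TEXTIN", "TEXTUIN", "IN", "UIN"}


def import_args(infile, dims, outfile, pathname=None,
                input_class=None, input_size=None):
    """Return an h5import argument list in SYNOPSIS order."""
    if not dims:
        raise ValueError("-dims is required when no configuration file is used")
    effective_class = input_class or "FP"  # documented default class
    if effective_class not in FLOAT_CLASSES | INTEGER_CLASSES:
        raise ValueError(f"unsupported input class: {effective_class}")
    valid_sizes = (32, 64) if effective_class in FLOAT_CLASSES else (8, 16, 32, 64)
    if input_size is not None and input_size not in valid_sizes:
        raise ValueError(f"size {input_size} is not valid for {effective_class}")

    # dim_list is comma-separated with no spaces.
    args = ["h5import", infile, "-dims", ",".join(str(n) for n in dims)]
    if pathname is not None:
        args += ["-path", pathname]
    if input_class is not None:
        args += ["-type", input_class]
    if input_size is not None:
        args += ["-size", str(input_size)]
    args += ["-o", outfile]
    return args


command = import_args("infile", [20, 50], "out2",
                      pathname="bin1/dset1", input_class="FP", input_size=64)
print(" ".join(command))
```

       Where h5import is installed, a list built this way could be passed to
       subprocess.run.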

EXAMPLES

       Using command-line arguments:

       This command creates a file out1 containing a single 2x3x4 32-bit
       integer dataset. Since no pathname is specified, the dataset is stored
       in out1 as /dataset1.
              h5import infile -dims 2,3,4 -type TEXTIN -size 32 -o out1

       This command creates a file out2 containing a single 20x50 64-bit
       floating point dataset. The dataset is stored in out2 as /bin1/dset1.
              h5import infile -dims 20,50 -path bin1/dset1 -type FP -size 64 -o out2

       Sample configuration files:

       The first configuration file specifies the following:

       o      The input data is a 5x2x4 floating point array in an ASCII file.

       o      The output dataset will be saved in chunked layout, with chunk
              dimension sizes of 2x2x2.

       o      The output datatype will be 64-bit floating point,
              little-endian, IEEE.

       o      The output dataset will be stored in outfile at
              /work/h5/pkamat/First-set.

       o      The maximum dimension sizes of the output dataset will be
              8x8x(unlimited).

              PATH work/h5/pkamat/First-set
              INPUT-CLASS TEXTFP
              RANK 3
              DIMENSION-SIZES 5 2 4
              OUTPUT-CLASS FP
              OUTPUT-SIZE 64
              OUTPUT-ARCHITECTURE IEEE
              OUTPUT-BYTE-ORDER LE
              CHUNKED-DIMENSION-SIZES 2 2 2
              MAXIMUM-DIMENSIONS 8 8 -1
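
       A configuration file such as the one above can be read back into a
       dictionary for inspection before it is handed to h5import. The Python
       sketch below is one hypothetical way to do that. Skipping blank lines
       and converting the listed keywords to integers are choices of this
       sketch; how h5import itself treats blank or malformed lines is not
       covered here.

```python
LIST_KEYWORDS = {"DIMENSION-SIZES", "CHUNKED-DIMENSION-SIZES", "MAXIMUM-DIMENSIONS"}
INT_KEYWORDS = {"RANK", "INPUT-SIZE", "OUTPUT-SIZE", "COMPRESSION-PARAM"}


def parse_config(text):
    """Parse "Keyword Value" lines into a dict of keyword -> value."""
    config = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # this sketch ignores blank lines
        keyword, values = fields[0], fields[1:]
        if not values:
            raise ValueError(f"{keyword} has no value")
        if keyword in LIST_KEYWORDS:
            config[keyword] = [int(v) for v in values]
        elif keyword in INT_KEYWORDS:
            config[keyword] = int(values[0])
        else:
            config[keyword] = " ".join(values)
    return config


SAMPLE = """\
PATH work/h5/pkamat/First-set
INPUT-CLASS TEXTFP
RANK 3
DIMENSION-SIZES 5 2 4
OUTPUT-CLASS FP
OUTPUT-SIZE 64
OUTPUT-ARCHITECTURE IEEE
OUTPUT-BYTE-ORDER LE
CHUNKED-DIMENSION-SIZES 2 2 2
MAXIMUM-DIMENSIONS 8 8 -1
"""

first = parse_config(SAMPLE)
# The number of dimension sizes must match RANK.
print(len(first["DIMENSION-SIZES"]) == first["RANK"])
```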

       The next configuration file specifies the following:

       o      The input data is a 6x3x5x2x4 integer array in a binary file.

       o      The output dataset will be saved in chunked layout, with chunk
              dimension sizes of 2x2x2x2x2.

       o      The output datatype will be 32-bit integer in NATIVE format (as
              the output architecture is not specified).

       o      The output dataset will be compressed using gzip compression
              with a compression level of 7.

       o      The output dataset will be stored in outfile at /Second-set.

              PATH Second-set
              INPUT-CLASS IN
              RANK 5
              DIMENSION-SIZES 6 3 5 2 4
              OUTPUT-CLASS IN
              OUTPUT-SIZE 32
              CHUNKED-DIMENSION-SIZES 2 2 2 2 2
              COMPRESSION-TYPE GZIP
              COMPRESSION-PARAM 7

SEE ALSO

       h5dump(1), h5ls(1), h5diff(1), h5repart(1), gif2h5(1), h52gif(1),
       h5perf(1)