gd_getdata(3)                       GETDATA                      gd_getdata(3)

NAME

       gd_getdata - retrieve data from a Dirfile database

SYNOPSIS

       #include <getdata.h>

       size_t gd_getdata(DIRFILE *dirfile, const char *field_code, off_t
              first_frame, off_t first_sample, size_t num_frames, size_t
              num_samples, gd_type_t return_type, void *data_out);

DESCRIPTION

       The gd_getdata() function queries a dirfile(5) database specified by
       dirfile for the field field_code.  It fetches num_frames frames plus
       num_samples samples from this field, starting first_sample samples past
       frame first_frame.  The data is converted to the data type specified by
       return_type, and stored in the user-supplied buffer data_out.

       The field_code may contain one of the representation suffixes listed in
       dirfile-format(5).  If it does, gd_getdata() will compute the
       appropriate complex norm before returning the data.

       The dirfile argument must point to a valid DIRFILE object previously
       created by a call to gd_open(3).  The argument data_out must point to a
       valid memory location of sufficient size to hold all data requested.

       Unless using GD_HERE (see below), the first sample returned will be

              first_frame * samples_per_frame + first_sample

       as measured from the start of the dirfile, where samples_per_frame is
       the number of samples per frame as returned by gd_spf(3).  The number
       of samples fetched is, similarly,

              num_frames * samples_per_frame + num_samples.

       Although calling gd_getdata() using both samples and frames is
       possible, the function is typically called with either num_samples and
       first_sample, or num_frames and first_frame, equal to zero.

       Instead of explicitly specifying the origin of the read, the caller may
       pass the special symbol GD_HERE as first_frame.  This will result in
       the read occurring at the current position of the I/O pointer for the
       field (see GetData I/O Pointers below for a discussion of field I/O
       pointers).  In this case, the value of first_sample is ignored.

       When reading a SINDIR field, return_type must be GD_STRING.  For all
       other field types, the return_type argument should be one of the
       following symbols, which indicates the desired return type of the data:

              GD_UINT8
                      unsigned 8-bit integer

              GD_INT8 signed (two's complement) 8-bit integer

              GD_UINT16
                      unsigned 16-bit integer

              GD_INT16
                      signed (two's complement) 16-bit integer

              GD_UINT32
                      unsigned 32-bit integer

              GD_INT32
                      signed (two's complement) 32-bit integer

              GD_UINT64
                      unsigned 64-bit integer

              GD_INT64
                      signed (two's complement) 64-bit integer

              GD_FLOAT32
                      IEEE-754 standard 32-bit single precision floating point
                      number

              GD_FLOAT64
                      IEEE-754 standard 64-bit double precision floating point
                      number

              GD_COMPLEX64
                      C99-conformant 64-bit single precision complex number

              GD_COMPLEX128
                      C99-conformant 128-bit double precision complex number

              GD_NULL the null type: the database is queried as usual, but no
                      data is returned.  In this case, data_out is ignored and
                      may be NULL.

       The return type of the data need not be the same as the type of the
       data stored in the database.  Type conversion will be performed as
       necessary to return the requested type.  If the field_code does not
       indicate a representation, but conversion from a complex value to a
       purely real one is required, only the real portion of the requested
       vector will be returned.
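
       The complex-to-real rule can be pictured with the following C sketch.
       It is an illustrative model only, not the library's conversion code,
       and the function name complex_to_real() is hypothetical.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Illustrative model only: keep the real part of each complex sample, as
 * described for conversion to a purely real type when field_code carries
 * no representation suffix. */
void complex_to_real(const double complex *in, double *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = creal(in[i]); /* the imaginary part is discarded */
}
```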

       Upon successful completion, the I/O pointer of the field will be on the
       sample immediately following the last sample returned, if possible.  On
       error, the position of the I/O pointer is not specified, and may not
       even be well defined.

   Behaviour While Reading Specific Field Types
       MPLEX: Reading an MPLEX field typically requires GetData to read data
              before the range returned in order to determine the value of the
              first sample returned.  This can become expensive if the
              encoding of the underlying RAW data does not support seeking
              backwards (which is true of most compression encodings).  How
              much preceding data GetData searches for the initial value of
              the returned data can be adjusted, or the lookback disabled
              completely, using gd_mplex_lookback(3).  If the initial value of
              the field is not found in the data searched, GetData will fill
              the returned vector, up to the next available sample of the
              multiplexed field, with zero for integer return types, or
              IEEE-754-conforming NaN (not-a-number) for floating point return
              types, as it does when providing data before the
              beginning-of-field.

              GetData caches the value of the last sample from every MPLEX it
              reads so that a subsequent read of the field starting from the
              following sample (either through an explicit starting sample
              given by the caller or else implicitly using GD_HERE) will not
              need to scan the field backwards.  This cache is invalidated if
              a different return type is used, or if an intervening operation
              moves the field's I/O pointer.

       SINDIR:
              The only allowed return_type when reading SINDIR data is
              GD_STRING.  The data_out argument should be of type
              const char **, and be large enough to hold one pointer for each
              sample requested.  It will be filled with pointers to read-only
              string data.  The caller should not free the returned string
              pointers.  For convenience when allocating buffers, the
              GD_STRING constant has the property:
              GD_SIZE(GD_STRING) == sizeof(const char *).  On samples where
              the index vector is out of range of the SARRAY, and also on
              samples before the index vector's frame offset, the value stored
              in data_out will be the NULL pointer.
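
              A minimal sketch of sizing such a buffer in plain C follows.
              The helper name alloc_string_buffer() is hypothetical; it relies
              only on the stated property that each element is one
              const char * wide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one string pointer per requested sample,
 * matching the element size stated for GD_STRING.  calloc() zero-fills the
 * buffer, which on common platforms makes every entry a null pointer.  The
 * caller frees the buffer itself, never the strings stored in it. */
const char **alloc_string_buffer(size_t num_samples)
{
    assert(num_samples > 0);
    return calloc(num_samples, sizeof(const char *));
}
```

              The buffer returned would be passed as data_out and released
              with free() once the caller has finished with the pointers.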

       PHASE: A forward-shifted PHASE field will always encounter the
              end-of-field marker before its input field does.  This has
              ramifications when reading streaming data with gd_getdata() and
              using gd_nframes(3) to gauge field lengths (that is: a
              forward-shifted PHASE field always has less data in it than
              gd_nframes(3) implies that it does).  As with any other field,
              gd_getdata() will return a short count whenever a read from a
              PHASE field encounters the end-of-field marker.

              Backward-shifted PHASE fields do not suffer from this problem,
              since gd_getdata() pads reads past the beginning-of-field marker
              with NaN or zero as appropriate.  Database creators who wish to
              use the PHASE field type with streaming data are encouraged to
              work around this limitation by only using backward-shifted PHASE
              fields, by writing RAW data at the maximal frame lag, and then
              back-shifting all data which should have been written earlier.
              Another possible work-around is to write systematically less
              data to the reference RAW field in proportion to the maximal
              forward phase shift.  This method will work with applications
              which respect the database size reported by gd_nframes(3)
              resulting in these applications effectively ignoring all frames
              past the frame containing the maximally forward-shifted PHASE
              field's end-of-field marker.
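
              The effect of the shift on the end-of-field can be sketched as
              below.  This is an illustrative model that assumes output sample
              n of a PHASE field is taken from input sample n + shift, with a
              positive shift being a forward shift; the helper name
              phase_eof() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model, assuming output sample n of a PHASE field comes from
 * input sample n + shift (shift > 0 is a forward shift).  Returns the
 * sample number of the shifted field's end-of-field. */
int64_t phase_eof(int64_t input_eof, int64_t shift)
{
    assert(input_eof >= 0);
    int64_t eof = input_eof - shift; /* forward shift ends earlier */
    return eof > 0 ? eof : 0;
}
```

              Under that assumption, an input holding 1000 samples yields only
              995 samples through a PHASE field shifted forward by 5, which is
              why a read sized from gd_nframes(3) comes back short.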

       WINDOW:
              The samples of a WINDOW for which the field conditional is false
              will be filled with either zero for integer return types, or
              IEEE-754-conforming NaN (not-a-number) for floating point return
              types.
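
              The fill rule can be modelled in C as follows.  This sketch is
              not the library's implementation: the function names are
              hypothetical, and the keep array stands in for the result of
              evaluating the field conditional on each sample.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only: copy in[i] where keep[i] is non-zero and fill
 * the other samples with NaN, as described for floating point types. */
void window_fill_double(const double *in, const int *keep, double *out,
                        size_t n)
{
    assert(in != NULL && keep != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = keep[i] ? in[i] : NAN;
}

/* Same model for an integer return type: rejected samples become zero. */
void window_fill_int32(const int32_t *in, const int *keep, int32_t *out,
                       size_t n)
{
    assert(in != NULL && keep != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = keep[i] ? in[i] : 0;
}
```

              A caller reading a WINDOW as an integer type therefore cannot
              tell a rejected sample from a genuine zero, whereas isnan()
              identifies rejected samples in floating point data.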

RETURN VALUE

       In all cases, gd_getdata() returns the number of samples (not bytes)
       successfully read from the database.  If the end-of-field is
       encountered before the requested number of samples have been read, a
       short count will result.  This is not an error.

       Requests for data before the beginning-of-field marker, which may have
       been shifted from frame zero by a PHASE field or /FRAMEOFFSET
       directive, will result in the data being padded at the front by NaN or
       zero, depending on whether the return type is of floating point or
       integral type.
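
       How much of a request falls into the padded region can be worked out
       with a sketch like the one below.  The helper leading_pad_count() is
       hypothetical, and bof stands for the sample number of the
       beginning-of-field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of leading samples of a request starting at
 * sample start that lie before the beginning-of-field at sample bof. */
int64_t leading_pad_count(int64_t start, int64_t requested, int64_t bof)
{
    assert(requested >= 0);
    if (start >= bof)
        return 0; /* the request begins inside the field */
    int64_t before = bof - start;
    return before < requested ? before : requested;
}
```

       With the beginning-of-field at sample 30, a request for 100 samples
       from sample 0 has 30 padded samples at the front, followed by data.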

       On error, this function returns zero and stores a negative-valued error
       code in the DIRFILE object which may be retrieved by a subsequent call
       to gd_error(3).  Possible error codes are:

       GD_E_ALLOC
               The library was unable to allocate memory.

       GD_E_BAD_CODE
               The field specified by field_code, or one of the fields it uses
               for input, was not found in the database.

       GD_E_BAD_DIRFILE
               An invalid dirfile was supplied.

       GD_E_BAD_SCALAR
               A scalar field used in the definition of the field was not
               found, or was not of scalar type.

       GD_E_BAD_TYPE
               An invalid return_type was specified.

       GD_E_DIMENSION
               The supplied field_code referred to a CONST, CARRAY, or STRING
               field.  The caller should use gd_get_constant(3), or
               gd_get_string(3) instead.  Or, a scalar field was found where a
               vector field was expected in the definition of field_code or
               one of its inputs.

       GD_E_DOMAIN
               An immediate read was attempted using GD_HERE, but the I/O
               pointer of the field was not well defined because two or more
               of the field's inputs did not agree as to the location of the
               I/O pointer.

       GD_E_INTERNAL_ERROR
               An internal error occurred in the library while trying to
               perform the task.  This indicates a bug in the library.  Please
               report the incident to the maintainer.

       GD_E_IO An error occurred while trying to open or read from a file on
               disk containing a raw field or LINTERP table.

       GD_E_LUT
               A LINTERP table was malformed.

       GD_E_RANGE
               An attempt was made to read data outside the addressable
               Dirfile range (more than 2**63 samples past the start of the
               dirfile).

       GD_E_RECURSE_LEVEL
               Too many levels of recursion were encountered while trying to
               resolve field_code.  This usually indicates a circular
               dependency in field specification in the dirfile.

       GD_E_UNKNOWN_ENCODING
               The encoding scheme of a RAW field could not be determined.
               This may also indicate that the binary file associated with the
               RAW field could not be found.

       GD_E_UNSUPPORTED
               Reading from dirfiles with the encoding scheme of the specified
               dirfile is not supported by the library.  See
               dirfile-encoding(5) for details on dirfile encoding schemes.

       A descriptive error string for the error may be obtained by calling
       gd_error_string(3).

NOTES

       To save memory, gd_getdata() uses the memory pointed to by data_out as
       scratch space while computing derived fields.  As a result, if an error
       is encountered during the computation, the contents of this memory
       buffer are unspecified, and may have been modified by this call, even
       though gd_getdata() will report zero samples returned on error.

       Reading slim-compressed data (see dirfile-encoding(5)) may cause
       unexpected memory usage.  This is because slimlib internally caches
       open decompressed files as they are read, and GetData doesn't close
       data files between gd_getdata() calls for efficiency's sake.  Memory
       used by this internal slimlib buffer can be reclaimed by calling
       gd_raw_close(3) on fields when finished reading them.

       When operating on a platform whose size_t is N-bytes wide, a single
       call of gd_getdata() will never return more than (2**(N-1) - 1)
       samples.  The request will be truncated at (2**(N-M) - 1) samples,
       where M is the size, in bytes, of the largest data type used to
       calculate the returned field.  If a larger request is specified, less
       data than requested will be returned, without raising an error.  This
       limit is imposed even when return_type is GD_NULL or when reading from
       the INDEX field (i.e., even when no actual I/O or calculation occurs).
       In all cases, the actual amount of data is returned.

GETDATA I/O POINTERS

       This is a general discussion of field I/O pointers in the GetData
       library, and contains information not directly applicable to
       gd_getdata().

       Every RAW field in an open Dirfile has an I/O pointer which indicates
       the library's current read and write position in the field.  These I/O
       pointers are useful when performing sequential reads or writes on
       Dirfile fields (see GD_HERE in the description above).  The value of
       the I/O pointer of a field is reported by gd_tell(3).

       Derived fields have virtual I/O pointers arising from the I/O pointers
       of their input fields.  These virtual I/O pointers may be valid (when
       all input fields agree on their position in the dirfile) or invalid
       (when the input fields are not in agreement).  The I/O pointer of some
       derived fields is always invalid.  The usual reason for this is the
       derived field simultaneously reading from two different places in the
       same RAW field.  For example, given the following Dirfile metadata
       specification:

              a RAW UINT8 1
              b PHASE a 1
              c LINCOM 2 a 1 0 b 1 0

       the derived field c never has a valid I/O pointer, since any particular
       sample of c ultimately involves reading from more than one place in the
       RAW field a.  Attempting to perform sequential reads or writes (with
       GD_HERE) on a derived field when its I/O pointer is invalid will result
       in an error (specifically, GD_E_DOMAIN).
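
       The validity rule for virtual I/O pointers can be expressed as a short
       C sketch.  It is a model of the rule as stated above, not the library's
       internal logic, and the function name virtual_io_pointer() is
       hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model: a derived field's virtual I/O pointer is valid only
 * when every input reports the same position.  On success the agreed
 * position is stored in *pos_out. */
bool virtual_io_pointer(const int64_t *input_pos, size_t n_inputs,
                        int64_t *pos_out)
{
    assert(input_pos != NULL && n_inputs > 0 && pos_out != NULL);
    for (size_t i = 1; i < n_inputs; i++)
        if (input_pos[i] != input_pos[0])
            return false; /* inputs disagree: pointer is invalid */
    *pos_out = input_pos[0];
    return true;
}
```

       For the field c above, the two inputs always refer to positions in the
       RAW field a that differ by the PHASE shift, so a check of this kind
       can never succeed.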

       The implicit INDEX field has an effective I/O pointer that mostly
       behaves like a true RAW field I/O pointer, except that it permits
       simultaneous reads from multiple locations.  So, given the following
       metadata specification:

              d PHASE INDEX 1
              e LINCOM 2 INDEX 1 0 d 1 0

       the I/O pointer of the derived field e will always be valid, unlike the
       similarly defined c above.  The virtual I/O pointer of a derived field
       will change in response to movement of the RAW I/O pointers underlying
       the derived field's inputs, and vice versa: moving the I/O pointer of a
       derived field will move the I/O pointer of the RAW fields from which it
       ultimately derives.  As a result, the I/O pointer of any particular
       field may move in unexpected ways if multiple fields are manipulated at
       the same time.

       When a Dirfile is first opened, the I/O pointer of every RAW field is
       set to the beginning-of-field (the value returned by gd_bof(3)), as is
       the I/O pointer of any newly-created RAW field.

       The following library calls cause I/O pointers to move:

       gd_getdata() and gd_putdata(3)
              These functions move the I/O pointer of affected fields to the
              sample immediately following the last sample read or written,
              both when performed at an absolutely specified position and when
              called for a sequential read or write using GD_HERE.  When
              reading a derived field which simultaneously reads from more
              than one place in a RAW field (such as c above), the position of
              that RAW field's I/O pointer is unspecified (that is: it is not
              specified which input field is read first).

       gd_seek(3)
              This function is used to manipulate I/O pointers directly.

       gd_flush(3) and gd_raw_close(3)
              These functions set the I/O pointer of any RAW field which is
              closed back to the beginning-of-field.

       calls which result in modifications to raw data files:
              this may happen when calling any of: gd_alter_encoding(3),
              gd_alter_endianness(3), gd_alter_frameoffset(3),
              gd_alter_entry(3), gd_alter_raw(3), gd_alter_spec(3),
              gd_malter_spec(3), gd_move(3), or gd_rename(3); these functions
              close affected RAW fields before making changes to the raw data
              files, and so reset the corresponding I/O pointers to the
              beginning-of-field.

       In general, when these calls fail, the I/O pointers of affected fields
       may be anything, even out-of-bounds or invalid.  After an error, the
       caller should issue an explicit gd_seek(3) to reposition I/O pointers
       before attempting further sequential operations.

HISTORY

       The function getdata() appeared in GetData-0.3.0.

       The GD_COMPLEX64 and GD_COMPLEX128 data types appeared in
       GetData-0.6.0.

       In GetData-0.7.0, this function was renamed to gd_getdata().

       The GD_HERE symbol used for sequential reads appeared in GetData-0.8.0.

       The GD_STRING data type appeared in GetData-0.10.0.

SEE ALSO

       GD_SIZE(3), gd_error(3), gd_error_string(3), gd_get_constant(3),
       gd_get_string(3), gd_mplex_lookback(3), gd_nframes(3), gd_open(3),
       gd_raw_close(3), gd_seek(3), gd_spf(3), gd_putdata(3), dirfile(5),
       dirfile-encoding(5)

Version 0.10.0                 25 December 2016                  gd_getdata(3)