9gd_getdata(3) GETDATA gd_getdata(3)
10
11
12
13 NAME
14 gd_getdata — retrieve data from a Dirfile database
15
16
17 SYNOPSIS
18 #include <getdata.h>
19
20 size_t gd_getdata(DIRFILE *dirfile, const char *field_code, off_t
21 first_frame, off_t first_sample, size_t num_frames, size_t
22 num_samples, gd_type_t return_type, void *data_out);
23
24
25 DESCRIPTION
26 The gd_getdata() function queries a dirfile(5) database specified by
27 dirfile for the field field_code. It fetches num_frames frames plus
28 num_samples samples from this field, starting first_sample samples past
29 frame first_frame. The data is converted to the data type specified by
30 return_type, and stored in the user-supplied buffer data_out.
31
32 The field_code may contain one of the representation suffixes listed in
33 dirfile-format(5). If it does, gd_getdata() will compute the appropri‐
34 ate complex norm before returning the data.
35
36 The dirfile argument must point to a valid DIRFILE object previously
37 created by a call to gd_open(3). The argument data_out must point to a
38 valid memory location of sufficient size to hold all data requested.
39
40 Unless using GD_HERE (see below), the first sample returned will be
41
42 first_frame * samples_per_frame + first_sample
43
44 as measured from the start of the dirfile, where samples_per_frame is
45 the number of samples per frame as returned by gd_spf(3). The number
46 of samples fetched is, similarly,
47
48 num_frames * samples_per_frame + num_samples.
49
50 Although calling gd_getdata() using both samples and frames is possi‐
51 ble, the function is typically called with either num_samples and
52 first_sample, or num_frames and first_frame, equal to zero.
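
The offset and count arithmetic described above can be sketched in plain C. The helper names below are hypothetical and are not part of GetData; spf stands for the value that gd_spf(3) would return for the field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GetData): absolute index, counted from
 * the start of the dirfile, of the first sample of a read. */
int64_t first_sample_index(int64_t first_frame, int64_t first_sample,
                           unsigned int spf)
{
    assert(spf > 0);            /* spf: samples per frame of the field */
    return first_frame * (int64_t)spf + first_sample;
}

/* Hypothetical helper: total number of samples requested by a call. */
uint64_t samples_requested(uint64_t num_frames, uint64_t num_samples,
                           unsigned int spf)
{
    assert(spf > 0);
    return num_frames * spf + num_samples;
}
```

For a field with 20 samples per frame, a read starting 3 samples past frame 10 begins at sample 203, and a request for 2 frames plus 5 samples covers 45 samples. Multiplying the requested sample count by the size of the return type gives the number of bytes that data_out must be able to hold.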
53
54 Instead of explicitly specifying the origin of the read, the caller may
55 pass the special symbol GD_HERE as first_frame. This will result in
56 the read occurring at the current position of the I/O pointer for the
57 field (see GetData I/O Pointers below for a discussion of field I/O
58 pointers). In this case, the value of first_sample is ignored.
59
60 When reading a SINDIR field, return_type must be GD_STRING. For all
61 other field types, the return_type argument should be one of the fol‐
62 lowing symbols, which indicates the desired return type of the data:
63
64 GD_UINT8
65 unsigned 8-bit integer
66
67 GD_INT8 signed (two's complement) 8-bit integer
68
69 GD_UINT16
70 unsigned 16-bit integer
71
72 GD_INT16
73 signed (two's complement) 16-bit integer
74
75 GD_UINT32
76 unsigned 32-bit integer
77
78 GD_INT32
79 signed (two's complement) 32-bit integer
80
81 GD_UINT64
82 unsigned 64-bit integer
83
84 GD_INT64
85 signed (two's complement) 64-bit integer
86
87 GD_FLOAT32
88 IEEE-754 standard 32-bit single precision floating point
89 number
90
91 GD_FLOAT64
92 IEEE-754 standard 64-bit double precision floating point
93 number
94
95 GD_COMPLEX64
96 C99-conformant 64-bit single precision complex number
97
98 GD_COMPLEX128
99 C99-conformant 128-bit double precision complex number
100
101 GD_NULL the null type: the database is queried as usual, but no
102 data is returned. In this case, data_out is ignored and
103 may be NULL.
104
105 The return type of the data need not be the same as the type of the da‐
106 ta stored in the database. Type conversion will be performed as neces‐
107 sary to return the requested type. If the field_code does not indicate
108 a representation, but conversion from a complex value to a purely real
109 one is required, only the real portion of the requested vector will be
110 returned.
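
The default complex-to-real conversion described above amounts to discarding the imaginary part. The following is a model of that behaviour only, not GetData code; it assumes the complex input is held as interleaved (real, imaginary) pairs of doubles, which is the layout C99 specifies for double complex.

```c
#include <assert.h>
#include <stddef.h>

/* Model only (hypothetical helper, not part of GetData).
 * in:  n complex values as interleaved (real, imaginary) double pairs
 * out: n real values; the imaginary parts are dropped */
void real_part_only(const double *in, double *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = in[2 * i];     /* element 2*i is the real part */
}
```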
111
112 Upon successful completion, the I/O pointer of the field will be on the
113 sample immediately following the last sample returned, if possible. On
114 error, the position of the I/O pointer is not specified, and may not
115 even be well defined.
116
117
118 Behaviour While Reading Specific Field Types
119 MPLEX: Reading an MPLEX field typically requires GetData to read data
120 before the range returned in order to determine the value of the
121 first sample returned. This can become expensive if the encod‐
122 ing of the underlying RAW data does not support seeking back‐
123 wards (which is true of most compression encodings). How much
124 preceding data GetData searches for the initial value of the re‐
125 turned data can be adjusted, or the lookback disabled complete‐
126 ly, using gd_mplex_lookback(3). If the initial value of the
127 field is not found in the data searched, GetData will fill the
128 returned vector, up to the next available sample of the multi‐
129 plexed field, with zero for integer return types, or
130 IEEE-754-conforming NaN (not-a-number) for floating point return
131 types, as it does when providing data before the beginning-of-
132 field.
133
134 GetData caches the value of the last sample from every MPLEX it
135 reads so that a subsequent read of the field starting from the
136 following sample (either through an explicit starting sample
137 given by the caller or else implicitly using GD_HERE) will not
138 need to scan the field backwards. This cache is invalidated if
139 a different return type is used, or if an intervening operation
140 moves the field's I/O pointer.
141
142
143 SINDIR:
144 The only allowed return_type when reading SINDIR data is
145 GD_STRING. The data_out argument should be of type const char **,
146 and be large enough to hold one pointer for each sample request‐
147 ed. It will be filled with pointers to read-only string data.
148 The caller should not free the returned string pointers. For
149 convenience when allocating buffers, the GD_STRING constant has
150 the property: GD_SIZE(GD_STRING) == sizeof(const char *). On
151 samples where the index vector is out of range of the SARRAY,
152 and also on samples before the index vector's frame offset, the
153 value stored in data_out will be the NULL pointer.
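
The shape of the SINDIR output buffer can be illustrated with a small stand-alone model. This is a sketch of the lookup semantics described above, not GetData code: the function name and the use of long for the index vector are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Model only (hypothetical helper, not part of GetData): for each sample,
 * the index vector selects one string from the SARRAY.  Indices outside
 * the SARRAY yield NULL.  The output holds borrowed pointers to read-only
 * strings, so the caller frees nothing but the pointer array itself. */
void sindir_lookup(const char *const *sarray, size_t sarray_len,
                   const long *index, size_t n, const char **out)
{
    assert(sarray != NULL && index != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (index[i] >= 0 && (size_t)index[i] < sarray_len)
            out[i] = sarray[index[i]];
        else
            out[i] = NULL;      /* out of range: NULL pointer */
    }
}
```

A caller processing real SINDIR data should therefore check each returned pointer for NULL before dereferencing it.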
154
155
156 PHASE: A forward-shifted PHASE field will always encounter the end-of-
157 field marker before its input field does. This has ramifica‐
158 tions when reading streaming data with gd_getdata() and using
159 gd_nframes(3) to gauge field lengths (that is: a forward-shifted
160 PHASE field always has less data in it than gd_nframes(3) im‐
161 plies that it does). As with any other field, gd_getdata() will
162 return a short count whenever a read from a PHASE field encoun‐
163 ters the end-of-field marker.
164
165 Backward-shifted PHASE fields do not suffer from this problem,
166 since gd_getdata() pads reads past the beginning-of-field marker
167 with NaN or zero as appropriate. Database creators who wish to
168 use the PHASE field type with streaming data are encouraged to
169 work around this limitation by only using backward-shifted PHASE
170 fields, by writing RAW data at the maximal frame lag, and then
171 back-shifting all data which should have been written earlier.
172 Another possible work-around is to write systematically less da‐
173 ta to the reference RAW field in proportion to the maximal for‐
174 ward phase shift. This method will work with applications which
175 respect the database size reported by gd_nframes(3) resulting in
176 these applications effectively ignoring all frames past the
177 frame containing the maximally forward-shifted PHASE field's
178 end-of-field marker.
179
180
181 WINDOW:
182 The samples of a WINDOW for which the field conditional is false
183 will be filled with either zero for integer return types, or
184 IEEE-754-conforming NaN (not-a-number) for floating point return
185 types.
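
The WINDOW fill rule can be modelled as follows. This is an illustration of the rule stated above rather than GetData code; the function names and the cond array, which is nonzero where the field conditional holds, are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Model only (not part of GetData): integer return type, zero fill. */
void window_int32(const int32_t *in, const unsigned char *cond,
                  int32_t *out, size_t n)
{
    assert(in != NULL && cond != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = cond[i] ? in[i] : 0;
}

/* Model only (not part of GetData): floating point return type, NaN fill. */
void window_float64(const double *in, const unsigned char *cond,
                    double *out, size_t n)
{
    assert(in != NULL && cond != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = cond[i] ? in[i] : NAN;
}
```

With an integer return type a filled sample cannot be told apart from a genuine zero, so a floating point return type may be preferable when that distinction matters.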
186
187
188 RETURN VALUE
189 In all cases, gd_getdata() returns the number of samples (not bytes)
190 successfully read from the database. If the end-of-field is encoun‐
191 tered before the requested number of samples have been read, a short
192 count will result. This is not an error.
193
194 Requests for data before the beginning-of-field marker, which may have
195 been shifted from frame zero by a PHASE field or /FRAMEOFFSET direc‐
196 tive, will result in the data being padded at the front by NaN or
197 zero, depending on whether the return type is of floating point or in‐
198 tegral type.
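
The front padding described above can be modelled for a floating point return type as follows. This is a sketch, not GetData code: pad_front, bof and start are hypothetical names, with bof standing for the sample index of the beginning-of-field marker and start for the first sample requested.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Model only (not part of GetData): fill the leading part of a request
 * that lies before the beginning-of-field with NaN.  Returns the number
 * of leading slots padded; the remaining slots are left untouched for
 * real data. */
size_t pad_front(double *out, size_t n, int64_t bof, int64_t start)
{
    size_t pad = 0;

    assert(out != NULL || n == 0);
    if (start < bof) {
        uint64_t gap = (uint64_t)(bof - start);
        pad = gap < n ? (size_t)gap : n;
    }
    for (size_t i = 0; i < pad; i++)
        out[i] = NAN;
    return pad;
}
```

For an integral return type the same logic applies with zero in place of NaN.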
199
200 On error, this function returns zero and stores a negative-valued error
201 code in the DIRFILE object which may be retrieved by a subsequent call
202 to gd_error(3). Possible error codes are:
203
204 GD_E_ALLOC
205 The library was unable to allocate memory.
206
207 GD_E_BAD_CODE
208 The field specified by field_code, or one of the fields it uses
209 for input, was not found in the database.
210
211 GD_E_BAD_DIRFILE
212 An invalid dirfile was supplied.
213
214 GD_E_BAD_SCALAR
215 A scalar field used in the definition of the field was not
216 found, or was not of scalar type.
217
218 GD_E_BAD_TYPE
219 An invalid return_type was specified.
220
221 GD_E_DIMENSION
222 The supplied field_code referred to a CONST, CARRAY, or STRING
223 field. The caller should use gd_get_constant(3) or
224 gd_get_string(3) instead. This error also occurs when a scalar
225 field was found where a vector field was expected in the
226 definition of field_code or one of its inputs.
227
228 GD_E_DOMAIN
229 An immediate read was attempted using GD_HERE, but the I/O
230 pointer of the field was not well defined because two or more
231 of the field's inputs did not agree as to the location of the
232 I/O pointer.
233
234 GD_E_INTERNAL_ERROR
235 An internal error occurred in the library while trying to per‐
236 form the task. This indicates a bug in the library. Please
237 report the incident to the maintainer.
238
239 GD_E_IO An error occurred while trying to open or read from a file on
240 disk containing a raw field or LINTERP table.
241
242 GD_E_LUT
243 A LINTERP table was malformed.
244
245 GD_E_RANGE
246 An attempt was made to read data outside the addressable
247 Dirfile range (more than 2**63 samples past the start of the
248 dirfile).
249
250 GD_E_RECURSE_LEVEL
251 Too many levels of recursion were encountered while trying to
252 resolve field_code. This usually indicates a circular depen‐
253 dency in field specification in the dirfile.
254
255 GD_E_UNKNOWN_ENCODING
256 The encoding scheme of a RAW field could not be determined.
257 This may also indicate that the binary file associated with the
258 RAW field could not be found.
259
260 GD_E_UNSUPPORTED
261 Reading from dirfiles with the encoding scheme of the specified
262 dirfile is not supported by the library. See dirfile-encod‐
263 ing(5) for details on dirfile encoding schemes.
264
265 A descriptive error string for the error may be obtained by calling
266 gd_error_string(3).
267
268
269 NOTES
270 To save memory, gd_getdata() uses the memory pointed to by data_out as
271 scratch space while computing derived fields. As a result, if an error
272 is encountered during the computation, the contents of this memory buf‐
273 fer are unspecified, and may have been modified by this call, even
274 though gd_getdata() will report zero samples returned on error.
275
276 Reading slim-compressed data (see dirfile-encoding(5)) may cause unex‐
277 pected memory usage. This is because slimlib internally caches open
278 decompressed files as they are read, and GetData doesn't close data
279 files between gd_getdata() calls for efficiency's sake. Memory used by
280 this internal slimlib buffer can be reclaimed by calling
281 gd_raw_close(3) on fields when finished reading them.
282
283 When operating on a platform whose size_t is N bits wide, a single
284 call of gd_getdata() will never return more than (2**(N-1) - 1) sam‐
285 ples. The request will be truncated at (2**(N-M) - 1) samples, where M
286 is the size, in bytes, of the largest data type used to calculate the
287 returned field. If a larger request is specified, less data than re‐
288 quested will be returned, without raising an error. This limit is im‐
289 posed even when return_type is GD_NULL or when reading from the INDEX
290 field (i.e., even when no actual I/O or calculation occurs). In all
291 cases, the actual amount of data is returned.
292
293
294 GETDATA I/O POINTERS
295 This is a general discussion of field I/O pointers in the GetData li‐
296 brary, and contains information not directly applicable to
297 gd_getdata().
298
299 Every RAW field in an open Dirfile has an I/O pointer which indicates
300 the library's current read and write position in the field. These I/O
301 pointers are useful when performing sequential reads or writes on
302 Dirfile fields (see GD_HERE in the description above). The value of
303 the I/O pointer of a field is reported by gd_tell(3).
304
305 Derived fields have virtual I/O pointers arising from the I/O pointers
306 of their input fields. These virtual I/O pointers may be valid (when
307 all input fields agree on their position in the dirfile) or invalid
308 (when the input fields are not in agreement). The I/O pointer of some
309 derived fields is always invalid. The usual reason for this is the de‐
310 rived field simultaneously reading from two different places in the
311 same RAW field. For example, given the following Dirfile metadata
312 specification:
313
314 a RAW UINT8 1
315 b PHASE a 1
316 c LINCOM 2 a 1 0 b 1 0
317
318 the derived field c never has a valid I/O pointer, since any particular
319 sample of c ultimately involves reading from more than one place in the
320 RAW field a. Attempting to perform sequential reads or writes (with
321 GD_HERE) on a derived field when its I/O pointer is invalid will result
322 in an error (specifically, GD_E_DOMAIN).
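
The validity rule for virtual I/O pointers can be modelled in a few lines. This is a sketch of the rule as stated above, not the library's implementation; virtual_io_pointer is a hypothetical name, and each element of input_pos stands for the position one input wants in the underlying RAW field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model only (not part of GetData): a derived field's virtual I/O pointer
 * is valid when every input agrees on the position.  Returns the shared
 * position, or -1 when the inputs disagree, which is the situation in
 * which a GD_HERE read fails with GD_E_DOMAIN. */
int64_t virtual_io_pointer(const int64_t *input_pos, size_t n_inputs)
{
    assert(input_pos != NULL && n_inputs > 0);
    for (size_t i = 1; i < n_inputs; i++)
        if (input_pos[i] != input_pos[0])
            return -1;
    return input_pos[0];
}
```

For the field c above, one input reads a at some sample and the other, through b, reads a one sample away, so the two positions never agree and the model always reports an invalid pointer.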
323
324 The implicit INDEX field has an effective I/O pointer that mostly be‐
325 haves like a true RAW field I/O pointer, except that it permits simul‐
326 taneous reads from multiple locations. So, given the following metada‐
327 ta specification:
328
329 d PHASE INDEX 1
330 e LINCOM 2 INDEX 1 0 d 1 0
331
332 the I/O pointer of the derived field e will always be valid, unlike the
333 similarly defined c above. The virtual I/O pointer of a derived field
334 will change in response to movement of the RAW I/O pointers underlying
335 the derived field's inputs, and vice versa: moving the I/O pointer of a
336 derived field will move the I/O pointer of the RAW fields from which it
337 ultimately derives. As a result, the I/O pointer of any particular
338 field may move in unexpected ways if multiple fields are manipulated at
339 the same time.
340
341 When a Dirfile is first opened, the I/O pointer of every RAW field is
342 set to the beginning-of-field (the value returned by gd_bof(3)), as is
343 the I/O pointer of any newly-created RAW field.
344
345 The following library calls cause I/O pointers to move:
346
347 gd_getdata() and gd_putdata(3)
348 These functions move the I/O pointer of affected fields to the
349 sample immediately following the last sample read or written,
350 both when performed at an absolutely specified position and when
351 called for a sequential read or write using GD_HERE. When read‐
352 ing a derived field which simultaneously reads from more than
353 one place in a RAW field (such as c above), the position of that
354 RAW field's I/O pointer is unspecified (that is: it is not spec‐
355 ified which input field is read first).
356
357 gd_seek(3)
358 This function is used to manipulate I/O pointers directly.
359
360 gd_flush(3) and gd_raw_close(3)
361 These functions set the I/O pointer of any RAW field which is
362 closed back to the beginning-of-field.
363
364 calls which result in modifications to raw data files:
365 this may happen when calling any of: gd_alter_encoding(3),
366 gd_alter_endianness(3), gd_alter_frameoffset(3),
367 gd_alter_entry(3), gd_alter_raw(3), gd_alter_spec(3),
368 gd_malter_spec(3), gd_move(3), or gd_rename(3); these functions
369 close affected RAW fields before making changes to the raw data
370 files, and so reset the corresponding I/O pointers to the begin‐
371 ning-of-field.
372
373
374 In general, when these calls fail, the I/O pointers of affected fields
375 may be anything, even out-of-bounds or invalid. After an error, the
376 caller should issue an explicit gd_seek(3) to reposition I/O pointers
377 before attempting further sequential operations.
378
379
380 HISTORY
381 The function getdata() appeared in GetData-0.3.0.
382
383 The GD_COMPLEX64 and GD_COMPLEX128 data types appeared in GetDa‐
384 ta-0.6.0.
385
386 In GetData-0.7.0, this function was renamed to gd_getdata().
387
388 The GD_HERE symbol used for sequential reads appeared in GetData-0.8.0.
389
390 The GD_STRING data type appeared in GetData-0.10.0.
391
392
393 SEE ALSO
394 GD_SIZE(3), gd_error(3), gd_error_string(3), gd_get_constant(3),
395 gd_get_string(3), gd_mplex_lookback(3), gd_nframes(3), gd_open(3),
396 gd_raw_close(3), gd_seek(3), gd_spf(3), gd_putdata(3), dirfile(5),
397 dirfile-encoding(5)
398
399
400
401Version 0.10.0 25 December 2016 gd_getdata(3)