funcolumnselect(3) SAORD Documentation funcolumnselect(3)

FunColumnSelect - select Funtools columns

    #include <funtools.h>

    int FunColumnSelect(Fun fun, int size, char *plist,
                        char *name1, char *type1, char *mode1, int offset1,
                        char *name2, char *type2, char *mode2, int offset2,
                        ...,
                        NULL)

    int FunColumnSelectArr(Fun fun, int size, char *plist,
                           char **names, char **types, char **modes,
                           int *offsets, int nargs);

The FunColumnSelect() routine is used to select the columns from a
Funtools binary table extension or raw event file for processing. This
routine allows you to specify how columns in a file are to be read into
a user record structure or written from a user record structure to an
output FITS file.

The first argument is the Fun handle associated with this set of
columns. The second argument specifies the size of the user record
structure into which columns will be read. Typically, the sizeof()
operator is used to specify the size of a record structure. The third
argument allows you to specify keyword directives for the selection and
is described in more detail below.

Following the first three required arguments is a variable length list
of column specifications. Each column specification consists of four
arguments:

· name: the name of the column

· type: the data type of the column as it will be stored in the user
  record struct (not the data type of the input file). The following
  basic data types are recognized:

    · A: ASCII characters

    · B: unsigned 8-bit char

    · I: signed 16-bit int

    · U: unsigned 16-bit int (not standard FITS)

    · J: signed 32-bit int

    · V: unsigned 32-bit int (not standard FITS)

    · E: 32-bit float

    · D: 64-bit float

  The syntax used is similar to that which defines the TFORM parameter
  in FITS binary tables. That is, a numeric repeat value can precede
  the type character, so that "10I" means a vector of 10 short ints,
  "E" means a single precision float, etc. Note that the column value
  from the input file will be converted to the specified data type as
  the data is read by FunTableRowGet().

  [ A short digression regarding bit-fields: Special attention is
  required when reading or writing the FITS bit-field type ("X").
  Bit-fields almost always have a numeric repeat value preceding the
  'X' specification. Usually this value is a multiple of 8 so that
  bit-fields fit into an integral number of bytes. In all cases, the
  byte size of the bit-field B is (N+7)/8 (using integer division),
  where N is the numeric repeat value.

  A bit-field is most easily declared in the user struct as an array
  of type char of size B as defined above. In this case, bytes are
  simply moved from the file to the user space. If, instead, a short
  or int scalar or array is used, then the algorithm for reading the
  bit-field into the user space depends on the size of the data type
  used along with the value of the repeat value. That is, if the user
  data size is equal to the byte size of the bit-field, then the data
  is simply moved (possibly with endian-based byte-swapping) from one
  to the other. If, on the other hand, the data storage is larger than
  the bit-field size, then a data type cast conversion is performed to
  move parts of the bit-field into elements of the array. Examples
  will help make this clear:

    · If the file contains a 16X bit-field and user space specifies a
      2B char array[2], then the bit-field is moved directly into the
      char array.

    · If the file contains a 16X bit-field and user space specifies a
      1I scalar short int, then the bit-field is moved directly into
      the short int.

    · If the file contains a 16X bit-field and user space specifies a
      1J scalar int, then the bit-field is type-cast to unsigned int
      before being moved (use of unsigned avoids possible sign
      extension).

    · If the file contains a 16X bit-field and user space specifies a
      2J int array[2], then the bit-field is handled as 2 chars, each
      of which is type-cast to unsigned int before being moved (use of
      unsigned avoids possible sign extension).

    · If the file contains a 16X bit-field and user space specifies a
      1B char, then the bit-field is treated as a char, i.e.,
      truncation will occur.

    · If the file contains a 16X bit-field and user space specifies a
      4J int array[4], then the results are undetermined.

  For all user data types larger than char, the bit-field is
  byte-swapped as necessary to convert to native format, so that bits
  in the resulting data in user space can be tested, masked, etc. in
  the same way regardless of platform.]

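  The byte-size rule and the widening behaviour described in this
  digression can be sketched in plain C. This is an illustration only,
  not Funtools code: the helper names bitfield_bytes(),
  bits16_from_fits() and test_bit() are hypothetical, and the sketch
  assumes the two bytes of a 16X field are stored most-significant
  byte first, as FITS data are.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed for an NX bit-field, i.e. (N+7)/8
   using integer division. */
size_t bitfield_bytes(unsigned int nbits)
{
    return ((size_t)nbits + 7u) / 8u;
}

/* Hypothetical helper: assemble a 16X field from its two file bytes
   (assumed big-endian) into an unsigned int. Building the value
   arithmetically gives the same result on any host, and the unsigned
   types avoid sign extension when the top bit is set. */
unsigned int bits16_from_fits(const unsigned char bytes[2])
{
    return ((unsigned int)bytes[0] << 8) | (unsigned int)bytes[1];
}

/* Hypothetical helper: test bit number `bit` (0 = least significant). */
int test_bit(unsigned int value, unsigned int bit)
{
    return (int)((value >> bit) & 1u);
}
```

  Once the value is in native format, a mask test such as
  test_bit(value, 15) behaves the same on big-endian and little-endian
  machines, which is the property the digression describes.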
  In addition to setting data type and size, the type specification
  allows a few ancillary parameters to be set, using the full syntax
  for type:

      [@][n]<type>[[['B']poff]][:[tlmin[:tlmax[:binsiz]]]]

  The special character "@" can be prepended to this specification to
  indicate that the data element is a pointer in the user record,
  rather than an array stored within the record.

  The [n] value is an integer that specifies the number of elements
  that are in this column (default is 1). TLMIN, TLMAX, and BINSIZ
  values also can be specified for this column after the type,
  separated by colons. If only one such number is specified, it is
  assumed to be TLMAX, and TLMIN and BINSIZ are set to 1.

  The [poff] value can be used to specify the offset into an array.
  By default, this offset value is set to zero and the data specified
  starts at the beginning of the array. The offset usually is
  specified in terms of the data type of the column. Thus an offset
  specification of [5] means a 20-byte offset if the data type is a
  32-bit integer, and a 40-byte offset for a double. If you want to
  specify a byte offset instead of an offset tied to the column data
  type, precede the offset value with 'B', e.g. [B6] means a 6-byte
  offset, regardless of the column data type.

  The [poff] is especially useful in conjunction with the pointer @
  specification, since it allows the data element to be stored
  anywhere in the allocated array. For example, a specification such
  as "@I[2]" specifies the third (i.e., starting from 0) element in
  the array pointed to by the pointer value. A value of "@2I[4]"
  specifies the fifth and sixth values in the array. For example,
  consider the following specification:

      typedef struct EvStruct{
        short x[4], *atp, *atb;
      } *Event, EventRec;
      /* set up the (hardwired) columns */
      FunColumnSelect( fun, sizeof(EventRec), NULL,
                       "2i",    "2I ",     "w", FUN_OFFSET(Event, x),
                       "2i2",   "2I[2]",   "w", FUN_OFFSET(Event, x),
                       "at2p",  "@2I",     "w", FUN_OFFSET(Event, atp),
                       "at2p4", "@2I[4]",  "w", FUN_OFFSET(Event, atp),
                       "atp9",  "@I[9]",   "w", FUN_OFFSET(Event, atp),
                       "atb20", "@I[B20]", "w", FUN_OFFSET(Event, atb),
                       NULL);

  Here we have specified the following columns:

    · 2i: two short ints in an array which is stored as part of the
      record

    · 2i2: the 3rd and 4th elements of an array which is stored as
      part of the record

    · an array of at least 10 elements, not stored in the record but
      allocated elsewhere, and used by three different columns:

        · at2p: 2 short ints which are the first 2 elements of the
          allocated array

        · at2p4: 2 short ints which are the 5th and 6th elements of
          the allocated array

        · atp9: a short int which is the 10th element of the
          allocated array

    · atb20: a short int which is at byte offset 20 of another
      allocated array

  In this way, several columns can be specified, all of which are in
  a single array. NB: it is the programmer's responsibility to ensure
  that specification of a positive value for poff does not point past
  the end of valid data.

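  To make the type syntax and the [poff] arithmetic concrete, here is a
  minimal sketch of a parser for such strings. It is not the Funtools
  parser: the TypeSpec struct and the parse_type_spec() and type_size()
  names are hypothetical, only the basic types listed above are handled
  (not "X"), and treating BINSIZ as 1 when two numbers are given is an
  assumption made for this sketch.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of a type string; not a Funtools structure. */
typedef struct {
    int    is_pointer;    /* 1 if the string began with '@' */
    int    repeat;        /* element count n (default 1) */
    char   type;          /* type character: A B I U J V E D */
    long   byte_offset;   /* poff converted to bytes (default 0) */
    int    has_limits;    /* 1 if tlmin/tlmax/binsiz were given */
    double tlmin, tlmax, binsiz;
} TypeSpec;

/* Bytes per element for the basic types listed in this page. */
static int type_size(char t)
{
    switch (t) {
    case 'A': case 'B':           return 1;
    case 'I': case 'U':           return 2;
    case 'J': case 'V': case 'E': return 4;
    case 'D':                     return 8;
    default:                      return 0;   /* not a basic type */
    }
}

/* Parse `spec` into `out`. Returns 0 on success, -1 if malformed. */
int parse_type_spec(const char *spec, TypeSpec *out)
{
    const char *p = spec;
    char *end;
    int size;

    memset(out, 0, sizeof(*out));
    out->repeat = 1;

    if (*p == '@') { out->is_pointer = 1; p++; }
    if (isdigit((unsigned char)*p)) {
        out->repeat = (int)strtol(p, &end, 10);
        p = end;
    }
    if ((size = type_size(*p)) == 0) return -1;
    out->type = *p++;

    if (*p == '[') {                       /* [poff] or [Bpoff] */
        int in_bytes = 0;
        long poff;
        if (*++p == 'B') { in_bytes = 1; p++; }
        poff = strtol(p, &end, 10);
        if (end == p || *end != ']') return -1;
        /* element offsets scale with the type size, byte offsets do not */
        out->byte_offset = in_bytes ? poff : poff * size;
        p = end + 1;
    }

    if (*p == ':') {                       /* :tlmin:tlmax:binsiz */
        double v[3];
        int n = 0;
        while (*p == ':' && n < 3) {
            v[n] = strtod(++p, &end);
            if (end == p) return -1;
            p = end;
            n++;
        }
        out->has_limits = 1;
        /* a single number is TLMAX, with TLMIN and BINSIZ set to 1 */
        out->tlmin  = (n == 1) ? 1.0  : v[0];
        out->tlmax  = (n == 1) ? v[0] : v[1];
        out->binsiz = (n == 3) ? v[2] : 1.0;  /* assumed default */
    }
    return (*p == '\0') ? 0 : -1;
}
```

  With this sketch, "@2I[4]" parses as a pointer column of 2 shorts at
  byte offset 8, "J[5]" gives byte offset 20, "D[5]" gives 40, and
  "@I[B20]" gives 20 regardless of the element size.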
· read/write mode: "r" means that the column is read from an input
  file into user space by FunTableRowGet(), "w" means that the column
  is written to an output file. Both can be specified at the same
  time.

· offset: the offset into the user data to store this column.
  Typically, the macro FUN_OFFSET(recname, colname) is used to define
  the offset into a record structure.

When all column arguments have been specified, a final NULL argument
must be added to signal the end of the column selection list.

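The offset argument is simply a byte distance from the start of the
user record to the member that holds the column. The sketch below
illustrates that idea with the standard offsetof() macro, which takes
the struct type (EvRec) where the FUN_OFFSET() calls in this page are
given the pointer typedef (Ev). The column_address() helper is
hypothetical and is not part of Funtools.

```c
#include <stddef.h>

typedef struct evstruct {
    int status;
    float pi, pha;
    double energy;
} *Ev, EvRec;

/* Hypothetical helper: address of the column stored `offset` bytes
   into the record at `rec`. */
void *column_address(void *rec, size_t offset)
{
    return (char *)rec + offset;
}
```

For example, storing through
(float *)column_address(&ev, offsetof(EvRec, pha)) changes ev.pha and
leaves the other members untouched.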
As an alternative to the varargs FunColumnSelect() routine, a
non-varargs routine called FunColumnSelectArr() also is available. The
first three arguments (fun, size, plist) of this routine are the same
as in FunColumnSelect(). Instead of a variable argument list, however,
FunColumnSelectArr() takes 5 additional arguments. The first 4 array
arguments contain the names, types, modes, and offsets, respectively,
of the columns being selected. The final argument is the number of
columns that are contained in these arrays. It is the user's
responsibility to free string space allocated in these arrays.

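As a rough sketch of how the four arrays might be prepared, the code
below fills them for a four-column record. The array names, the NCOL
constant and the fill_offsets() helper are hypothetical; the
FunColumnSelectArr() call itself is shown only in a comment, since it
needs an open Fun handle. String literals are used here, so in this
particular sketch there is no string space to free afterwards.

```c
#include <stddef.h>

typedef struct evstruct {
    int status;
    float pi, pha;
    double energy;
} *Ev, EvRec;

#define NCOL 4

/* Parallel arrays: entry i of each array describes column i. */
char *names[NCOL] = { "status", "pi", "pha", "energy" };
char *types[NCOL] = { "J",      "E",  "E",   "D"      };
char *modes[NCOL] = { "r",      "r",  "r",   "r"      };

/* Fill offsets[] (length NCOL) and return the number of columns. */
int fill_offsets(int *offsets)
{
    offsets[0] = (int)offsetof(EvRec, status);
    offsets[1] = (int)offsetof(EvRec, pi);
    offsets[2] = (int)offsetof(EvRec, pha);
    offsets[3] = (int)offsetof(EvRec, energy);
    return NCOL;
}

/* With an open Fun handle `fun`, the arrays would then be passed as:
 *
 *   int offsets[NCOL];
 *   int nargs = fill_offsets(offsets);
 *   FunColumnSelectArr(fun, sizeof(EvRec), NULL,
 *                      names, types, modes, offsets, nargs);
 */
```

This form is convenient when the column list is only known at run
time, for instance when it is built from user input.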
Consider the following example:

    typedef struct evstruct{
      int status;
      float pi, pha, *phas;
      double energy;
    } *Ev, EvRec;

    FunColumnSelect(fun, sizeof(EvRec), NULL,
                    "status", "J",   "r", FUN_OFFSET(Ev, status),
                    "pi",     "E",   "r", FUN_OFFSET(Ev, pi),
                    "pha",    "E",   "r", FUN_OFFSET(Ev, pha),
                    "phas",   "@9E", "r", FUN_OFFSET(Ev, phas),
                    NULL);

Each time a row is read into the Ev struct, the "status" column is
converted to an int data type (regardless of its data type in the file)
and stored in the status value of the struct. Similarly, "pi" and
"pha", and the phas vector are all stored as floats. Note that the "@"
sign indicates that the "phas" vector is a pointer to a 9 element
array, rather than an array allocated in the struct itself. The row
record can then be processed as required:

    /* get rows -- let routine allocate the row array */
    while( (ebuf = (Ev)FunTableRowGet(fun, NULL, MAXROW, NULL, &got)) ){
      /* process all rows */
      for(i=0; i<got; i++){
        /* point to the i'th row */
        ev = ebuf+i;
        ev->pi = (ev->pi+.5);
        ev->pha = (ev->pi-.5);
      }
    }

FunColumnSelect() can also be called to define "writable" columns in
order to generate a FITS Binary Table, without reference to any input
columns. For example, the following will generate a 4-column FITS
binary table when FunTableRowPut() is used to write Ev records:

    typedef struct evstruct{
      int status;
      float pi, pha;
      double energy;
    } *Ev, EvRec;

    FunColumnSelect(fun, sizeof(EvRec), NULL,
                    "status", "J", "w", FUN_OFFSET(Ev, status),
                    "pi",     "E", "w", FUN_OFFSET(Ev, pi),
                    "pha",    "E", "w", FUN_OFFSET(Ev, pha),
                    "energy", "D", "w", FUN_OFFSET(Ev, energy),
                    NULL);

All columns are declared to be write-only, so presumably the column
data is being generated or read from some other source.

In addition, FunColumnSelect() can be called to define both "readable"
and "writable" columns. In this case, the "read" columns are associated
with an input file, while the "write" columns are associated with the
output file. Of course, columns can be specified as both "readable" and
"writable", in which case they are read from input and written
(possibly with modified data values) to the output. The
FunColumnSelect() call itself is made by passing the input Funtools
handle, and it is assumed that the output file has been opened using
this input handle as its Funtools reference handle.

Consider the following example:

    typedef struct evstruct{
      int status;
      float pi, pha, *phas;
      double energy;
    } *Ev, EvRec;

    FunColumnSelect(fun, sizeof(EvRec), NULL,
                    "status", "J",   "r",  FUN_OFFSET(Ev, status),
                    "pi",     "E",   "rw", FUN_OFFSET(Ev, pi),
                    "pha",    "E",   "rw", FUN_OFFSET(Ev, pha),
                    "phas",   "@9E", "rw", FUN_OFFSET(Ev, phas),
                    "energy", "D",   "w",  FUN_OFFSET(Ev, energy),
                    NULL);

As in the "read" example above, each time a row is read into the Ev
struct, the "status" column is converted to an int data type
(regardless of its data type in the file) and stored in the status
value of the struct. Similarly, "pi" and "pha", and the phas vector are
all stored as floats. Since the "pi", "pha", and "phas" variables are
declared as "writable" as well as "readable", they also will be written
to the output file. Note, however, that the "status" variable is
declared as "readable" only, and hence it will not be written to an
output file. Finally, the "energy" column is declared as "writable"
only, meaning it will not be read from the input file. In this case, it
can be assumed that "energy" will be calculated in the program before
being output along with the other values.

In these simple cases, only the columns specified as "writable" will be
output using FunTableRowPut(). However, it often is the case that you
want to merge the user columns back in with the input columns, even in
cases where not all of the input column names are explicitly read or
even known. For this important case, the merge=[type] keyword is
provided in the plist string.

The merge=[type] keyword tells Funtools to merge the columns from the
input file with user columns on output. It is normally used when an
input and output file are opened and the input file provides the
Funtools reference handle for the output file. In this case, each time
FunTableRowGet() is called, the raw input rows are saved in a special
buffer. If FunTableRowPut() then is called (before another call to
FunTableRowGet()), the contents of the raw input rows are merged with
the user rows according to the value of type as follows:

· update: add new user columns, and update value of existing ones
  (maintaining the input data type)

· replace: add new user columns, and replace the data type and value
  of existing ones. (Note that if tlmin/tlmax values are not specified
  in the replacing column, but are specified in the original column
  being replaced, then the original tlmin/tlmax values are used in the
  replacing column.)

· append: only add new columns, do not "replace" or "update" existing
  ones

Consider the example above. If merge=update is specified in the plist
string, then "energy" will be added to the input columns, and the
values of "pi", "pha", and "phas" will be taken from the user space
(i.e., the values will be updated from the original values, if they
were changed by the program). The data type for "pi", "pha", and "phas"
will be the same as in the original file. If merge=replace is
specified, both the data type and value of these three input columns
will be changed to the data type and value in the user structure. If
merge=append is specified, none of these three columns will be updated,
and only the "energy" column will be added. Note that in all cases,
"status" will be written from the input data, not from the user record,
since it was specified as read-only.

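The merge rules above can be summarized as a small decision function.
This is only a model of the behaviour described in this page, not code
taken from Funtools: the MergeType, Source and ColumnSource types and
the merge_source() function are hypothetical names introduced for the
illustration.

```c
typedef enum { MERGE_UPDATE, MERGE_REPLACE, MERGE_APPEND } MergeType;
typedef enum { SRC_NONE, SRC_INPUT, SRC_USER } Source;

/* Where an output column takes its value and its data type from. */
typedef struct {
    Source value;
    Source type;
} ColumnSource;

/* in_input:      non-zero if the column exists in the input file
 * user_writable: non-zero if the user selected it with mode "w" or "rw"
 */
ColumnSource merge_source(MergeType merge, int in_input, int user_writable)
{
    ColumnSource r = { SRC_NONE, SRC_NONE };

    if (!in_input) {
        /* a new user column is added in every merge mode */
        if (user_writable) {
            r.value = SRC_USER;
            r.type  = SRC_USER;
        }
        return r;
    }

    /* existing input columns default to the raw input row */
    r.value = SRC_INPUT;
    r.type  = SRC_INPUT;
    if (user_writable) {
        if (merge == MERGE_UPDATE) {
            r.value = SRC_USER;        /* keep the input data type */
        } else if (merge == MERGE_REPLACE) {
            r.value = SRC_USER;
            r.type  = SRC_USER;
        }
        /* MERGE_APPEND leaves existing columns untouched */
    }
    return r;
}
```

Applied to the example, "energy" (new, writable) always comes from the
user record, "pi" (existing, "rw") depends on the merge type, and
"status" (existing, read-only) always comes from the input data.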
Standard applications will call FunColumnSelect() to define user
columns. However, if this routine is not called, the default behavior
is to transfer all input columns into user space. For this purpose a
default record structure is defined such that each data element is
properly aligned on a valid data type boundary. This mechanism is used
by programs such as fundisp and funtable to process columns without
needing to know the specific names of those columns. It is not
anticipated that users will need such capabilities (contact us if you
do!).

By default, FunColumnSelect() reads/writes rows to/from an "array of
structs", where each struct contains the column values for a single row
of the table. This means that the returned values for a given column
are not contiguous. You can set up the IO to return a "struct of
arrays" so that each of the returned columns is contiguous by
specifying org=structofarrays (abbreviation: org=soa) in the plist.
(The default case is org=arrayofstructs or org=aos.)

For example, the default setup to retrieve rows from a table would be
to define a record structure for a single event and then call
FunColumnSelect() as follows:

    typedef struct evstruct{
      short region;
      double x, y;
      int pi, pha;
      double time;
    } *Ev, EvRec;

    got = FunColumnSelect(fun, sizeof(EvRec), NULL,
                          "x",    "D:10:10", mode, FUN_OFFSET(Ev, x),
                          "y",    "D:10:10", mode, FUN_OFFSET(Ev, y),
                          "pi",   "J",       mode, FUN_OFFSET(Ev, pi),
                          "pha",  "J",       mode, FUN_OFFSET(Ev, pha),
                          "time", "1D",      mode, FUN_OFFSET(Ev, time),
                          NULL);

Subsequently, each call to FunTableRowGet() will return an array of
structs, one for each returned row. If instead you wanted to read
columns into contiguous arrays, you specify org=soa:

    typedef struct aevstruct{
      short region[MAXROW];
      double x[MAXROW], y[MAXROW];
      int pi[MAXROW], pha[MAXROW];
      double time[MAXROW];
    } *AEv, AEvRec;

    got = FunColumnSelect(fun, sizeof(AEvRec), "org=soa",
                          "x",    "D:10:10", mode, FUN_OFFSET(AEv, x),
                          "y",    "D:10:10", mode, FUN_OFFSET(AEv, y),
                          "pi",   "J",       mode, FUN_OFFSET(AEv, pi),
                          "pha",  "J",       mode, FUN_OFFSET(AEv, pha),
                          "time", "1D",      mode, FUN_OFFSET(AEv, time),
                          NULL);

Note that, apart from the record type passed to sizeof() and
FUN_OFFSET(), the only modification to the call is in the plist string.

Of course, instead of using statically allocated arrays, you also can
specify dynamically allocated pointers:

    /* pointers to arrays of columns (used in struct of arrays) */
    typedef struct pevstruct{
      short *region;
      double *x, *y;
      int *pi, *pha;
      double *time;
    } *PEv, PEvRec;

    got = FunColumnSelect(fun, sizeof(PEvRec), "org=structofarrays",
                          "$region", "@I",       mode, FUN_OFFSET(PEv, region),
                          "x",       "@D:10:10", mode, FUN_OFFSET(PEv, x),
                          "y",       "@D:10:10", mode, FUN_OFFSET(PEv, y),
                          "pi",      "@J",       mode, FUN_OFFSET(PEv, pi),
                          "pha",     "@J",       mode, FUN_OFFSET(PEv, pha),
                          "time",    "@1D",      mode, FUN_OFFSET(PEv, time),
                          NULL);

Here, the actual storage space is allocated either by the user or by
the FunColumnSelect() call.

In all of the above cases, the same call is made to retrieve rows,
e.g.:

    buf = (void *)FunTableRowGet(fun, NULL, MAXROW, NULL, &got);

However, the individual data elements are accessed differently. For
the default case of an "array of structs", the individual row records
are accessed using:

    for(i=0; i<got; i++){
      ev = (Ev)buf+i;
      fprintf(stdout, "%.2f\t%.2f\t%d\t%d\t%21.8f\n",
              ev->x, ev->y, ev->pi, ev->pha, ev->time);
    }

For a struct of arrays or a struct of array pointers, we have a single
struct through which we access individual columns and rows using:

    aev = (AEv)buf;
    for(i=0; i<got; i++){
      fprintf(stdout, "%.2f\t%.2f\t%d\t%d\t%21.8f\n",
              aev->x[i], aev->y[i], aev->pi[i], aev->pha[i],
              aev->time[i]);
    }

Support for struct of arrays in the FunTableRowPut() call is handled
analogously.

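The difference between the two organizations can be seen without
Funtools at all. The following stand-alone sketch uses a reduced,
two-column record; the RowRec and ColRec types and the helper
functions are hypothetical and exist only to contrast the memory
layouts.

```c
#define MAXROW 4

/* Array of structs: one record per row, columns interleaved. */
typedef struct {
    double x;
    int pi;
} RowRec;

/* Struct of arrays: one record in total, each column contiguous. */
typedef struct {
    double x[MAXROW];
    int pi[MAXROW];
} ColRec;

/* Copy up to MAXROW rows from the row-major to the column-major layout. */
void aos_to_soa(const RowRec *rows, int got, ColRec *cols)
{
    int i;
    for (i = 0; i < got && i < MAXROW; i++) {
        cols->x[i]  = rows[i].x;
        cols->pi[i] = rows[i].pi;
    }
}

/* Sum the x column: strided access through the row records. */
double sum_x_aos(const RowRec *rows, int got)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < got; i++)
        sum += rows[i].x;
    return sum;
}

/* Sum the x column: contiguous access through a single array. */
double sum_x_soa(const ColRec *cols, int got)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < got; i++)
        sum += cols->x[i];
    return sum;
}
```

Both sums give the same result; the struct of arrays form simply keeps
each column in one contiguous block, which suits code that processes
one column at a time.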
See the evread example code and evmerge example code for working
examples of how FunColumnSelect() is used.

See funtools(n) for a list of Funtools help pages

version 1.4.0 August 15, 2007 funcolumnselect(3)