funcalc(1)                   SAORD Documentation                  funcalc(1)

NAME
funcalc - Funtools calculator (for binary tables)

SYNOPSIS
funcalc [-n] [-a argstr] [-e expr] [-f file] [-l link] [-p prog]
        <iname> [oname [columns]]

OPTIONS
    -a argstr   # user arguments to pass to the compiled program
    -e expr     # funcalc expression
    -f file     # file containing funcalc expression
    -l libs     # libs to add to link command
    -n          # output generated code instead of compiling and executing
    -p prog     # generate named program, no execution
    -u          # die if any variable is undeclared (don't auto-declare)

DESCRIPTION
funcalc is a calculator program that allows arbitrary expressions to be
constructed, compiled, and executed on columns in a Funtools table
(FITS binary table or raw event file). It works by integrating
user-supplied expression(s) into a template C program, then compiling
and executing the program. funcalc expressions are C statements,
although some important simplifications (such as automatic declaration
of variables) are supported.

funcalc expressions can be specified in three ways: on the command line
using the -e [expression] switch, in a file using the -f [file] switch,
or from stdin (if neither -e nor -f is specified). Of course, a file
containing funcalc expressions can be read from stdin.

Each invocation of funcalc requires an input Funtools table file to be
specified as the first command line argument. The output Funtools table
file is the optional second argument. It is needed only if an output
FITS file is being created (i.e., in cases where the funcalc expression
only prints values, no output file is needed). If input and output
files are both specified, an optional third argument can specify the
list of columns to activate (using FunColumnActivate()). Note that
funcalc determines whether or not to generate code for writing an
output file based on the presence or absence of an output file
argument.

A funcalc expression executes on each row of a table and consists of
one or more C statements that operate on the columns of that row
(possibly using temporary variables). Within an expression, reference
is made to a column of the current row using the C struct syntax
cur->[colname], e.g. cur->x, cur->pha, etc. Local scalar variables can
be defined using C declarations at the very beginning of the
expression, or else they can be defined automatically by funcalc (to be
of type double). Thus, for example, a swap of columns x and y in a
table can be performed using either of the following equivalent funcalc
expressions:

    double temp;
    temp = cur->x;
    cur->x = cur->y;
    cur->y = temp;

or:

    temp = cur->x;
    cur->x = cur->y;
    cur->y = temp;

When this expression is executed using a command such as:

    funcalc -f swap.expr itest.ev otest.ev

the resulting file will have values of the x and y columns swapped.

By default, the data type of the variable for a column is the same as
the data type of the column as stored in the file. This can be changed
by appending ":[dtype]" to the first reference to that column. In the
example above, to force x and y to be output as doubles, specify the
type 'D' explicitly:

    temp = cur->x:D;
    cur->x = cur->y:D;
    cur->y = temp;

Data type specifiers follow standard FITS table syntax for defining
columns using TFORM:

·   A: ASCII characters

·   B: unsigned 8-bit char

·   I: signed 16-bit int

·   U: unsigned 16-bit int (not standard FITS)

·   J: signed 32-bit int

·   V: unsigned 32-bit int (not standard FITS)

·   E: 32-bit float

·   D: 64-bit float

·   X: bits (treated as an array of chars)

Note that only the first reference to a column should contain the
explicit data type specifier.
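
To make the sizes behind these type codes concrete, the following C
sketch maps each code to a fixed-width C type and computes the byte
size of a column with a repeat count. It is an illustration only: the
function names tform_elem_size and tform_column_bytes are hypothetical,
the choice of <stdint.h> types is an assumption, and the mapping that
funcalc uses internally may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of one element for a type code from the list above.
 * Returns 0 for codes this sketch does not handle ('X' is counted in
 * bits, so it has no per-element byte size). */
size_t tform_elem_size(char code)
{
    switch (code) {
    case 'A': return sizeof(char);      /* ASCII character */
    case 'B': return sizeof(uint8_t);   /* unsigned 8-bit char */
    case 'I': return sizeof(int16_t);   /* signed 16-bit int */
    case 'U': return sizeof(uint16_t);  /* unsigned 16-bit int */
    case 'J': return sizeof(int32_t);   /* signed 32-bit int */
    case 'V': return sizeof(uint32_t);  /* unsigned 32-bit int */
    case 'E': return sizeof(float);     /* single precision float */
    case 'D': return sizeof(double);    /* double precision float */
    default:  return 0;
    }
}

/* Total bytes for a column such as "10I" (count = 10, code = 'I').
 * Bit columns are packed 8 per byte; rounding up is an assumption. */
size_t tform_column_bytes(size_t count, char code)
{
    if (code == 'X')
        return (count + 7) / 8;
    return count * tform_elem_size(code);
}
```

With these definitions, tform_column_bytes(10, 'I') gives 20 bytes and
tform_column_bytes(16, 'X') gives 2 bytes.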

Of course, it is important to handle the data type of the columns
correctly. One of the most frequent causes of error in funcalc
programming is the implicit use of the wrong data type for a column in
an expression. For example, the calculation:

    dx = (cur->x - cur->y)/(cur->x + cur->y);

usually needs to be performed using floating point arithmetic. In cases
where the x and y columns are integers, this can be done by reading the
columns as doubles using an explicit type specification:

    dx = (cur->x:D - cur->y:D)/(cur->x + cur->y);

Alternatively, it can be done using C type-casting in the expression:

    dx = ((double)cur->x - (double)cur->y)/((double)cur->x + (double)cur->y);
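
The difference between the integer and floating point forms can be seen
in plain C, outside of funcalc. The two helper functions below are
hypothetical names used only for this sketch; they take the column
values as ordinary int arguments.

```c
/* Integer operands: the division truncates toward zero before the
 * result is converted to double. */
double ratio_int(int x, int y)
{
    return (x - y) / (x + y);
}

/* Casting the operands to double first keeps the fractional part. */
double ratio_double(int x, int y)
{
    return ((double)x - (double)y) / ((double)x + (double)y);
}
```

For x = 3 and y = 1, ratio_int() yields 0.0 while ratio_double() yields
0.5, which is why integer columns should be read as doubles or cast.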

In addition to accessing columns in the current row, reference also can
be made to the previous row using prev->[colname], and to the next row
using next->[colname]. Note that if prev->[colname] is specified in the
funcalc expression, the very first row is not processed. If
next->[colname] is specified in the funcalc expression, the very last
row is not processed. In this way, prev and next are guaranteed always
to point to valid rows. For example, to print out the values of the
current x column and the previous y column, use the C fprintf function
in a funcalc expression:

    fprintf(stdout, "%d %d\n", cur->x, prev->y);
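
The effect of skipping the first row can be mimicked with a small
stand-alone loop. This is a sketch under stated assumptions, not the
code that funcalc generates: the Row struct and the function
print_with_prev are hypothetical, and the rows live in a plain array
rather than a Funtools table.

```c
#include <stdio.h>

typedef struct { int x, y; } Row;   /* hypothetical row record */

/* Print cur->x and prev->y for every row that has a predecessor.
 * Returns the number of rows processed. */
int print_with_prev(const Row *rows, int nrow, FILE *out)
{
    int n = 0;
    for (int i = 1; i < nrow; i++) {    /* row 0 has no prev: skipped */
        const Row *cur = &rows[i];
        const Row *prev = &rows[i - 1];
        fprintf(out, "%d %d\n", cur->x, prev->y);
        n++;
    }
    return n;
}
```

A table of three rows therefore produces two output lines, and a table
of one row produces none.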

New columns can be specified using the same cur->[colname] syntax by
appending the column type (and optional tlmin/tlmax/binsiz specifiers),
separated by colons. For example, cur->avg:D will define a new column
of type double. Type specifiers are the same as those used above to
specify new data types for existing columns.

For example, to create and output a new column that is the average
value of the x and y columns, a new "avg" column can be defined:

    cur->avg:D = (cur->x + cur->y)/2.0

Note that the final ';' is not required for single-line expressions.

As with FITS TFORM data type specification, the column data type
specifier can be preceded by a numeric count to define an array, e.g.,
"10I" means a vector of 10 short ints, "2E" means two single precision
floats, etc. A new column only needs to be defined once in a funcalc
expression, after which it can be used without re-specifying the type.
This includes reference to elements of a column array:

    cur->avg[0]:2D = (cur->x + cur->y)/2.0;
    cur->avg[1] = (cur->x - cur->y)/2.0;

The 'X' (bits) data type is treated as a char array of dimension
(numeric_count/8), i.e., 16X is processed as a 2-byte char array. Each
8-bit array element is accessed separately:

    cur->stat[0]:16X = 1;
    cur->stat[1] = 2;

Here, a 16-bit column is created with the MSB set to 1 and the LSB set
to 2.
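
As a rough illustration of the resulting bit pattern, the following C
sketch combines the two bytes into one 16-bit value, taking element 0
as the most significant byte as described above. The function name
bits16_from_bytes is hypothetical and is not part of funcalc.

```c
#include <stdint.h>

/* Combine two bytes into a 16-bit value, first byte most significant. */
uint16_t bits16_from_bytes(const unsigned char stat[2])
{
    return (uint16_t)(((unsigned)stat[0] << 8) | stat[1]);
}
```

With stat[0] = 1 and stat[1] = 2 the combined value is 0x0102.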

By default, all processed rows are written to the specified output
file. If you want to skip writing certain rows, simply execute the C
"continue" statement at the end of the funcalc expression, since the
writing of the row is performed immediately after the expression is
executed. For example, to skip writing rows whose average is the same
as the current x value:

    cur->avg[0]:2D = (cur->x + cur->y)/2.0;
    cur->avg[1] = (cur->x - cur->y)/2.0;
    if( cur->avg[0] == cur->x )
      continue;
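
Why "continue" suppresses the write can be seen in a stand-alone loop
of the same shape: the expression runs first, the write step follows
it, and "continue" jumps past both. The sketch below is an assumption
about the loop layout based on the description above, not the actual
template; Row and filter_rows are hypothetical, and copying into an
array stands in for writing a row.

```c
typedef struct { int x, y; double avg[2]; } Row;  /* hypothetical record */

/* Copy rows from in to out, skipping rows whose average equals x.
 * Returns the number of rows copied. */
int filter_rows(const Row *in, int nrow, Row *out)
{
    int nout = 0;
    for (int i = 0; i < nrow; i++) {
        Row cur = in[i];
        /* user expression */
        cur.avg[0] = (cur.x + cur.y) / 2.0;
        cur.avg[1] = (cur.x - cur.y) / 2.0;
        if (cur.avg[0] == cur.x)
            continue;               /* next row; the write is skipped */
        /* write step, placed immediately after the expression */
        out[nout++] = cur;
    }
    return nout;
}
```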

If no output file argument is specified on the funcalc command line, no
output file is opened and no rows are written. This is useful in
expressions that simply print output results instead of generating a
new file:

    fpv = (cur->av3:D-cur->av1:D)/(cur->av1+cur->av2:D+cur->av3);
    fbv = cur->av2/(cur->av1+cur->av2+cur->av3);
    fpu = ((double)cur->au3-cur->au1)/((double)cur->au1+cur->au2+cur->au3);
    fbu = cur->au2/(double)(cur->au1+cur->au2+cur->au3);
    fprintf(stdout, "%f\t%f\t%f\t%f\n", fpv, fbv, fpu, fbu);

In the above example, we use both explicit type specification (for "av"
columns) and type casting (for "au" columns) to ensure that all
operations are performed in double precision.

When an output file is specified, the selected input table is processed
and output rows are copied to the output file. Note that the output
file can be specified as "stdout" in order to write the output rows to
the standard output. If the output file argument is passed, an
optional third argument also can be passed to specify which columns to
process.

In a FITS binary table, it sometimes is desirable to copy all of the
other FITS extensions to the output file as well. This can be done by
appending a '+' sign to the name of the extension in the input file
name. See funtable for a related example.

funcalc works by integrating the user-specified expression into a
template C program called tabcalc.c. The completed program then is
compiled and executed. Variable declarations that begin the funcalc
expression are placed in the local declaration section of the template
main program. All other lines are placed in the template main program's
inner processing loop. Other details of program generation are handled
automatically. For example, column specifiers are analyzed to build a C
struct for processing rows, which is passed to FunColumnSelect() and
used in FunTableRowGet(). If an unknown variable is used in the
expression, resulting in a compilation error, the program build is
retried after defining the unknown variable to be of type double.

Normally, funcalc expression code is added to the funcalc row
processing loop. It is possible to add code to other parts of the
program by placing this code inside special directives of the form:

    [directive name]
      ... code goes here ...
    end

The directives are:

·   global  add code and declarations in global space, before the main
    routine

·   local   add declarations (and code) just after the local
    declarations in main

·   before  add code just before entering the main row processing loop

·   after   add code just after exiting the main row processing loop

Thus, the following funcalc expression will declare global variables
and make subroutine calls just before and just after the main
processing loop:

    global
      double v1, v2;
      double init(void);
      double finish(double v);
    end
    before
      v1 = init();
    end
    ... process rows, with calculations using v1 ...
    after
      v2 = finish(v1);
      if( v2 < 0.0 ){
        fprintf(stderr, "processing failed %g -> %g\n", v1, v2);
        exit(1);
      }
    end

Routines such as init() and finish() above are passed to the generated
program for linking using the -l [link directives ...] switch. The
string specified by this switch will be added to the link line used to
build the program (before the funtools library). For example, assuming
that init() and finish() are in the library libmysubs.a in the
/opt/special/lib directory, use:

    funcalc -l "-L/opt/special/lib -lmysubs" ...
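
For orientation, the source behind such a library might look like the
following C sketch. Everything in it is hypothetical: the file name
mysubs.c, the starting value, and the convention that a negative return
from finish() signals failure are assumptions chosen to match the
prototypes and the error check in the expression above.

```c
/* mysubs.c: hypothetical contents of libmysubs.a */

/* Called once before the row loop; returns a starting value. */
double init(void)
{
    return 1.0;
}

/* Called once after the row loop; a negative result signals failure. */
double finish(double v)
{
    if (v <= 0.0)
        return -1.0;
    return v * 2.0;
}
```

Such a file could be compiled with "cc -c mysubs.c" and archived with
"ar rcs libmysubs.a mysubs.o" before being named in the -l switch.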

User arguments can be passed to a compiled funcalc program using a
string argument to the "-a" switch. The string should contain all of
the user arguments. For example, to pass the integers 1 and 2, use:

    funcalc -a "1 2" ...

The arguments are stored in an internal array and are accessed as
strings via the ARGV(n) macro. For example, consider the following
expression:

    local
      int pmin, pmax;
    end

    before
      pmin=atoi(ARGV(0));
      pmax=atoi(ARGV(1));
    end

    if( (cur->pha >= pmin) && (cur->pha <= pmax) )
      fprintf(stderr, "%d %d %d\n", cur->x, cur->y, cur->pha);

This expression will print out x, y, and pha values for all rows in
which the pha value is between the two user-input values:

    funcalc -a '1 12' -f foo snr.ev'[cir 512 512 .1]'
    512 512 6
    512 512 8
    512 512 5
    512 512 5
    512 512 8

    funcalc -a '5 6' -f foo snr.ev'[cir 512 512 .1]'
    512 512 6
    512 512 5
    512 512 5

Note that it is the user's responsibility to ensure that the correct
number of arguments is passed. The ARGV(n) macro returns NULL if a
requested argument is outside the limits of the actual number of args,
which usually results in a SEGV if the value is processed blindly. To
check the argument count, use the ARGC macro:

    local
      long int seed=1;
      double limit=0.8;
    end

    before
      if( ARGC >= 1 ) seed = atol(ARGV(0));
      if( ARGC >= 2 ) limit = atof(ARGV(1));
      srand48(seed);
    end

    if ( drand48() > limit ) continue;
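
The guarded access pattern can also be tried outside of funcalc. In the
sketch below, ARGC and ARGV(n) are stand-in definitions written for
this example only; they imitate the behavior described above (NULL for
an out-of-range index) and are not the definitions used in the funcalc
template. The names demo_args, demo_nargs, and arg_or_default are
hypothetical.

```c
#include <stdlib.h>

/* Stand-ins for the argument storage: one user argument, "42". */
static const char *demo_args[] = { "42" };
static const int demo_nargs = 1;

#define ARGC     (demo_nargs)
#define ARGV(n)  (((n) >= 0 && (n) < demo_nargs) ? demo_args[(n)] : NULL)

/* Return argument n parsed as a double, or fallback if it is absent. */
double arg_or_default(int n, double fallback)
{
    const char *s = ARGV(n);
    return s ? atof(s) : fallback;
}
```

Checking the pointer before calling atof() avoids the crash that a
blind atof(NULL) would usually cause.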

The macro WRITE_ROW expands to the FunTableRowPut() call that writes
the current row. It can be used to write the row more than once. In
addition, the macro NROW expands to the row number currently being
processed. Use of these two macros is shown in the following example:

    if( cur->pha:I == cur->pi:I ) continue;
    a = cur->pha;
    cur->pha = cur->pi;
    cur->pi = a;
    cur->AVG:E = (cur->pha+cur->pi)/2.0;
    cur->NR:I = NROW;
    if( NROW < 10 ) WRITE_ROW;
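
A stand-alone imitation of this example shows how an explicit write
adds a second copy of a row. It is only a sketch: NROW and WRITE_ROW
are redefined here as simple stand-ins, NROW is assumed to start at 1,
and copying into an array replaces the real FunTableRowPut() call. The
Row struct and swap_and_write are hypothetical names.

```c
typedef struct { short pha, pi; float avg; short nr; } Row;

#define NROW       (i + 1)               /* assumed 1-based row number */
#define WRITE_ROW  (out[nout++] = cur)   /* stand-in for the real write */

/* Swap pha and pi on each row where they differ and copy the row to
 * out; out must have room for 2*n rows. Returns rows copied. */
int swap_and_write(const Row *in, int n, Row *out)
{
    int nout = 0;
    for (int i = 0; i < n; i++) {
        Row cur = in[i];
        if (cur.pha == cur.pi) continue;
        short a = cur.pha;
        cur.pha = cur.pi;
        cur.pi = a;
        cur.avg = (cur.pha + cur.pi) / 2.0f;
        cur.nr = (short)NROW;
        if (NROW < 10) WRITE_ROW;   /* extra copy for early rows */
        WRITE_ROW;                  /* write that follows the expression */
    }
    return nout;
}
```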

If the -p [prog] switch is specified, the expression is not executed.
Rather, the generated executable is saved with the specified program
name for later use.

If the -n switch is specified, the expression is not executed. Rather,
the generated code is written to stdout. This is especially useful if
you want to generate a skeleton file and add your own code, or if you
need to check compilation errors. Note that the comment at the start of
the output gives the compiler command needed to build the program on
that platform. (The command can change from platform to platform
because of the use of different libraries, compiler switches, etc.)

As mentioned previously, funcalc will declare a scalar variable
automatically (as a double) if that variable has been used but not
declared. This facility is implemented using a sed script named
funcalc.sed, which processes the compiler output to sense an undeclared
variable error. This script has been seeded with the appropriate error
information for gcc, and for cc on Solaris, DecAlpha, and SGI
platforms. If you find that automatic declaration of scalars is not
working on your platform, check this sed script; it might be necessary
to add to or edit some of the error messages it senses.

In order to keep the lexical analysis of funcalc expressions
(reasonably) simple, we chose to accept some limitations on how
accurately C comments, spaces, and new-lines are placed in the
generated program. In particular, comments associated with local
variables declared at the beginning of an expression (i.e., not in a
local...end block) will usually end up in the inner loop, not with the
local declarations:

    /* this comment will end up in the wrong place (i.e., inner loop) */
    double a; /* also in wrong place */
    /* this will be in the right place (inner loop) */
    if( cur->x:D == cur->y:D ) continue; /* also in right place */
    a = cur->x;
    cur->x = cur->y;
    cur->y = a;
    cur->avg:E = (cur->x+cur->y)/2.0;

Similarly, spaces and new-lines sometimes are omitted or added in a
seemingly arbitrary manner. Of course, none of these stylistic
blemishes affect the correctness of the generated code.

Because funcalc must analyze the user expression using the data file(s)
passed on the command line, the input file(s) must be opened and read
twice: once during program generation and once during execution. As a
result, it is not possible to use stdin for the input file: funcalc
cannot be used as a filter. We will consider removing this restriction
at a later time.

Along with C comments, funcalc expressions can have one-line internal
comments that are not passed on to the generated C program. These
internal comments start with the # character and continue up to the
new-line:

    double a; # this is not passed to the generated C file
    # nor is this
    a = cur->x;
    cur->x = cur->y;
    cur->y = a;
    /* this comment is passed to the C file */
    cur->avg:E = (cur->x+cur->y)/2.0;
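
The effect of an internal comment on a single line can be sketched in a
few lines of C. This is a simplified illustration and not the lexer
that funcalc uses: the function name strip_internal_comment is
hypothetical, and the sketch does not treat a '#' inside a string
literal specially.

```c
#include <string.h>

/* Truncate line, in place, at the first '#' character (if any). */
void strip_internal_comment(char *line)
{
    char *hash = strchr(line, '#');
    if (hash)
        *hash = '\0';
}
```

Applied to the first line of the example above, this leaves only
"double a; " for the generated program; lines without a '#' are left
unchanged.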

As previously mentioned, input columns normally are identified by their
being used within the inner event loop. There are rare cases where you
might want to read a column and process it outside the main loop. For
example, qsort might use a column in its sort comparison routine that
is not processed inside the inner loop (and therefore not implicitly
specified as a column to be read). To ensure that such a column is
read by the event loop, use the explicit keyword. The arguments to
this keyword specify columns that should be read into the input record
structure even though they are not mentioned in the inner loop. For
example:

    explicit pi pha

will ensure that the pi and pha columns are read for each row, even if
they are not processed in the inner event loop. The explicit statement
can be placed anywhere.

Finally, note that funcalc currently works on expressions involving
FITS binary tables and raw event files. We will consider adding support
for image expressions at a later point, if there is demand for such
support from the community.

SEE ALSO
See funtools(n) for a list of Funtools help pages

version 1.4.2                 January 2, 2008                   funcalc(1)