funcalc(1)

funcalc(1)                    SAORD Documentation                   funcalc(1)

NAME

       funcalc - Funtools calculator (for binary tables)

SYNOPSIS

       funcalc [-n] [-u] [-a argstr] [-e expr] [-f file] [-l link] [-p prog]
       <iname> [oname [columns]]

OPTIONS

         -a argstr    # user arguments to pass to the compiled program
         -e expr      # funcalc expression
         -f file      # file containing funcalc expression
         -l libs      # libs to add to link command
         -n           # output generated code instead of compiling and executing
         -p prog      # generate named program, no execution
         -u           # die if any variable is undeclared (don't auto-declare)

DESCRIPTION

       funcalc is a calculator program that allows arbitrary expressions to be
       constructed, compiled, and executed on columns in a Funtools table
       (FITS binary table or raw event file). It works by integrating user-
       supplied expression(s) into a template C program, then compiling and
       executing the program. funcalc expressions are C statements, although
       some important simplifications (such as automatic declaration of vari‐
       ables) are supported.

       funcalc expressions can be specified in three ways: on the command line
       using the -e [expression] switch, in a file using the -f [file] switch,
       or from stdin (if neither -e nor -f is specified). Of course a file
       containing funcalc expressions can be read from stdin.

       Each invocation of funcalc requires an input Funtools table file to be
       specified as the first command line argument.  The output Funtools ta‐
       ble file is the second optional argument. It is needed only if an out‐
       put FITS file is being created (i.e., in cases where the funcalc
       expression only prints values, no output file is needed). If input and
       output files are both specified, a third optional argument can specify
       the list of columns to activate (using FunColumnActivate()).  Note that
       funcalc determines whether or not to generate code for writing an out‐
       put file based on the presence or absence of an output file argument.

       A funcalc expression executes on each row of a table and consists of
       one or more C statements that operate on the columns of that row (pos‐
       sibly using temporary variables).  Within an expression, reference is
       made to a column of the current row using the C struct syntax
       cur->[colname], e.g. cur->x, cur->pha, etc.  Local scalar variables can
       be defined using C declarations at the very beginning of the expression,
       or else they can be defined automatically by funcalc (to be of type
       double). Thus, for example, a swap of columns x and y in a table can be
       performed using either of the following equivalent funcalc expressions:

         double temp;
         temp = cur->x;
         cur->x = cur->y;
         cur->y = temp;

       or:

         temp = cur->x;
         cur->x = cur->y;
         cur->y = temp;

       When this expression is executed using a command such as:

         funcalc -f swap.expr itest.ev otest.ev

       the resulting file will have values of the x and y columns swapped.

       By default, the data type of the variable for a column is the same as
       the data type of the column as stored in the file. This can be changed
       by appending ":[dtype]" to the first reference to that column. In the
       example above, to force x and y to be output as doubles, specify the
       type 'D' explicitly:

         temp = cur->x:D;
         cur->x = cur->y:D;
         cur->y = temp;

       Data type specifiers follow standard FITS table syntax for defining
       columns using TFORM:

       ·   A: ASCII characters

       ·   B: unsigned 8-bit char

       ·   I: signed 16-bit int

       ·   U: unsigned 16-bit int (not standard FITS)

       ·   J: signed 32-bit int

       ·   V: unsigned 32-bit int (not standard FITS)

       ·   E: 32-bit float

       ·   D: 64-bit float

       ·   X: bits (treated as an array of chars)

       Note that only the first reference to a column should contain the
       explicit data type specifier.

       Of course, it is important to handle the data type of the columns cor‐
       rectly.  One of the most frequent causes of error in funcalc programming
       is the implicit use of the wrong data type for a column in an
       expression.  For example, the calculation:

         dx = (cur->x - cur->y)/(cur->x + cur->y);

       usually needs to be performed using floating point arithmetic. In cases
       where the x and y columns are integers, this can be done by reading the
       columns as doubles using an explicit type specification:

         dx = (cur->x:D - cur->y:D)/(cur->x + cur->y);

       Alternatively, it can be done using C type-casting in the expression:

         dx = ((double)cur->x - (double)cur->y)/((double)cur->x + (double)cur->y);
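
       The effect of the wrong data type can be seen in plain C, outside of
       funcalc. The sketch below is only an illustration: the function names
       ratio_int() and ratio_double() are hypothetical, and the int arguments
       stand in for integer x and y columns.

```c
#include <assert.h>

/* Integer version: both operands of '/' are ints, so the quotient is
 * truncated toward zero before it is converted to double. */
double ratio_int(int x, int y)
{
    assert(x + y != 0);          /* caller must avoid division by zero */
    return (x - y) / (x + y);
}

/* Double version: the casts force floating point arithmetic, which is
 * what reading the columns with ":D" is meant to achieve. */
double ratio_double(int x, int y)
{
    assert(x + y != 0);
    return ((double)x - (double)y) / ((double)x + (double)y);
}
```

       With x = 3 and y = 1, the integer version yields 0 while the double
       version yields 0.5.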

       In addition to accessing columns in the current row, reference also can
       be made to the previous row using prev->[colname], and to the next row
       using next->[colname].  Note that if prev->[colname] is specified in
       the funcalc expression, the very first row is not processed.  If
       next->[colname] is specified in the funcalc expression, the very last
       row is not processed. In this way, prev and next are guaranteed always
       to point to valid rows.  For example, to print out the values of the
       current x column and the previous y column, use the C fprintf function
       in a funcalc expression:

         fprintf(stdout, "%d %d\n", cur->x, prev->y);
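
       The row-skipping rule can be pictured as a loop whose bounds are
       narrowed so that prev and next always exist. This is a sketch of the
       idea only, not the code funcalc generates; the Row struct and the
       visit_with_neighbors() function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row layout, standing in for the struct funcalc builds. */
typedef struct { int x; int y; } Row;

/* Visit only rows that have both a previous and a next row, so the first
 * and last rows are skipped. Returns the number of rows visited and stores
 * the sum of cur->x + prev->y + next->x in *sum as a stand-in for work. */
int visit_with_neighbors(const Row *rows, int nrows, long *sum)
{
    int i, visited = 0;

    assert(rows != NULL && sum != NULL);
    *sum = 0;
    for (i = 1; i < nrows - 1; i++) {
        const Row *prev = &rows[i - 1];
        const Row *cur  = &rows[i];
        const Row *next = &rows[i + 1];
        *sum += cur->x + prev->y + next->x;
        visited++;
    }
    return visited;
}
```

       For a four-row table, only the two middle rows are visited.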

       New columns can be specified using the same cur->[colname] syntax by
       appending the column type (and optional tlmin/tlmax/binsiz specifiers),
       separated by colons. For example, cur->avg:D will define a new column
       of type double. Type specifiers are the same as those used above to
       specify new data types for existing columns.

       For example, to create and output a new column that is the average
       value of the x and y columns, a new "avg" column can be defined:

         cur->avg:D = (cur->x + cur->y)/2.0

       Note that the final ';' is not required for single-line expressions.

       As with FITS TFORM data type specification, the column data type speci‐
       fier can be preceded by a numeric count to define an array, e.g., "10I"
       means a vector of 10 short ints, "2E" means two single precision
       floats, etc.  A new column only needs to be defined once in a funcalc
       expression, after which it can be used without re-specifying the type.
       This includes reference to elements of a column array:

         cur->avg[0]:2D = (cur->x + cur->y)/2.0;
         cur->avg[1] = (cur->x - cur->y)/2.0;

       The 'X' (bits) data type is treated as a char array of dimension
       (numeric_count/8), i.e., 16X is processed as a 2-byte char array. Each
       8-bit array element is accessed separately:

         cur->stat[0]:16X  = 1;
         cur->stat[1]      = 2;

       Here, a 16-bit column is created with the MSB set to 1 and the LSB
       set to 2.
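
       To make the byte order concrete, the following plain C sketch combines
       such a 2-byte array into a single 16-bit value, treating element 0 as
       the most significant byte as described above. The helper pack16() is
       hypothetical and is not part of funcalc or Funtools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine a 2-element char array (as used for a 16X column) into one
 * 16-bit value, with element 0 as the most significant byte. */
uint16_t pack16(const unsigned char stat[2])
{
    assert(stat != NULL);
    return (uint16_t)(((unsigned)stat[0] << 8) | stat[1]);
}
```

       With stat[0] = 1 and stat[1] = 2, the combined value is 0x0102.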

       By default, all processed rows are written to the specified output
       file. If you want to skip writing certain rows, simply execute the C
       "continue" statement at the end of the funcalc expression, since the
       writing of the row is performed immediately after the expression is
       executed. For example, to skip writing rows whose average is the same
       as the current x value:

         cur->avg[0]:2D = (cur->x + cur->y)/2.0;
         cur->avg[1] = (cur->x - cur->y)/2.0;
         if( cur->avg[0] == cur->x )
           continue;
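
       The reason "continue" suppresses the write is that the write step sits
       after the expression inside the same loop body. The sketch below
       imitates that arrangement in plain C; the Row struct, filter_rows(),
       and the array copy standing in for the write are all assumptions made
       for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row layout with the new 2D "avg" column included. */
typedef struct { int x; int y; double avg[2]; } Row;

/* Run the expression on each input row and "write" the row by copying it
 * to out, which must have room for nrows rows. Returns the number of rows
 * written. */
int filter_rows(const Row *in, int nrows, Row *out)
{
    int i, nout = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < nrows; i++) {
        Row cur = in[i];

        /* user expression */
        cur.avg[0] = (cur.x + cur.y) / 2.0;
        cur.avg[1] = (cur.x - cur.y) / 2.0;
        if (cur.avg[0] == cur.x)
            continue;            /* jumps past the write step below */

        /* write step, reached only when the expression falls through */
        out[nout++] = cur;
    }
    return nout;
}
```

       Rows with x equal to y have an average equal to x, so they never reach
       the write step.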

       If no output file argument is specified on the funcalc command line, no
       output file is opened and no rows are written. This is useful in
       expressions that simply print output results instead of generating a
       new file:

         fpv = (cur->av3:D-cur->av1:D)/(cur->av1+cur->av2:D+cur->av3);
         fbv =  cur->av2/(cur->av1+cur->av2+cur->av3);
         fpu = ((double)cur->au3-cur->au1)/((double)cur->au1+cur->au2+cur->au3);
         fbu =  cur->au2/(double)(cur->au1+cur->au2+cur->au3);
         fprintf(stdout, "%f\t%f\t%f\t%f\n", fpv, fbv, fpu, fbu);

       In the above example, we use both explicit type specification (for "av"
       columns) and type casting (for "au" columns) to ensure that all opera‐
       tions are performed in double precision.

       When an output file is specified, the selected input table is processed
       and output rows are copied to the output file.  Note that the output
       file can be specified as "stdout" in order to write the output rows to
       the standard output.  If the output file argument is passed, an
       optional third argument also can be passed to specify which columns to
       process.

       In a FITS binary table, it sometimes is desirable to copy all of the
       other FITS extensions to the output file as well. This can be done by
       appending a '+' sign to the name of the extension in the input file
       name. See funtable for a related example.

       funcalc works by integrating the user-specified expression into a tem‐
       plate C program called tabcalc.c.  The completed program then is com‐
       piled and executed. Variable declarations that begin the funcalc
       expression are placed in the local declaration section of the template
       main program.  All other lines are placed in the template main pro‐
       gram's inner processing loop. Other details of program generation are
       handled automatically. For example, column specifiers are analyzed to
       build a C struct for processing rows, which is passed to FunColumnSe‐
       lect() and used in FunTableRowGet().  If an unknown variable is used in
       the expression, resulting in a compilation error, the program build is
       retried after defining the unknown variable to be of type double.

       Normally, funcalc expression code is added to the funcalc row
       processing loop. It is possible to add code to other parts of the
       program by placing this code inside special directives of the form:

         [directive name]
           ... code goes here ...
         end

       The directives are:

       ·   global: add code and declarations in global space, before the main
           routine

       ·   local: add declarations (and code) just after the local declarations
           in main

       ·   before: add code just before entering the main row processing loop

       ·   after: add code just after exiting the main row processing loop

       Thus, the following funcalc expression will declare global variables
       and make subroutine calls just before and just after the main process‐
       ing loop:

         global
           double v1, v2;
           double init(void);
           double finish(double v);
         end
         before
           v1  = init();
         end
         ... process rows, with calculations using v1 ...
         after
           v2 = finish(v1);
           if( v2 < 0.0 ){
             fprintf(stderr, "processing failed %g -> %g\n", v1, v2);
             exit(1);
           }
         end
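
       As a rough picture of where each directive's code lands, the plain C
       sketch below mimics the layout of a generated program. It is not the
       actual tabcalc.c template: run_template() stands in for the template's
       main routine, the bodies of init() and finish() are hypothetical stubs,
       and the row loop only accumulates a value.

```c
#include <assert.h>

/* global: code and declarations placed before the main routine */
static double v1, v2;
static double init(void)        { return 1.0; }      /* hypothetical stub */
static double finish(double v)  { return v * 2.0; }  /* hypothetical stub */

/* Stand-in for the template's main routine. */
double run_template(int nrows)
{
    /* local: declarations at the top of the main routine */
    int i;
    double total = 0.0;

    assert(nrows >= 0);

    /* before: runs once, ahead of the row processing loop */
    v1 = init();

    for (i = 0; i < nrows; i++) {
        /* expression body: runs once per row */
        total += v1;
    }

    /* after: runs once, after the row processing loop exits */
    v2 = finish(total);
    return v2;
}
```

       The point of the layout is that "before" and "after" code executes
       once per run, while the expression body executes once per row.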

       Routines such as init() and finish() above are passed to the generated
       program for linking using the -l [link directives ...] switch. The
       string specified by this switch will be added to the link line used to
       build the program (before the funtools library). For example, assuming
       that init() and finish() are in the library libmysubs.a in the
       /opt/special/lib directory, use:

         funcalc  -l "-L/opt/special/lib -lmysubs" ...

       User arguments can be passed to a compiled funcalc program using a
       string argument to the "-a" switch.  The string should contain all of
       the user arguments. For example, to pass the integers 1 and 2, use:

         funcalc -a "1 2" ...

       The arguments are stored in an internal array and are accessed as
       strings via the ARGV(n) macro.  For example, consider the following
       expression:

         local
           int pmin, pmax;
         end

         before
           pmin=atoi(ARGV(0));
           pmax=atoi(ARGV(1));
         end

         if( (cur->pha >= pmin) && (cur->pha <= pmax) )
           fprintf(stderr, "%d %d %d\n", cur->x, cur->y, cur->pha);

       This expression will print out x, y, and pha values for all rows in
       which the pha value is between the two user-input values:

         funcalc -a '1 12' -f foo snr.ev'[cir 512 512 .1]'
         512 512 6
         512 512 8
         512 512 5
         512 512 5
         512 512 8

         funcalc -a '5 6' -f foo snr.ev'[cir 512 512 .1]'
         512 512 6
         512 512 5
         512 512 5

       Note that it is the user's responsibility to ensure that the correct
       number of arguments is passed. The ARGV(n) macro returns NULL if a
       requested argument is outside the limits of the actual number of args,
       usually resulting in a SEGV if processed blindly.  To check the argu‐
       ment count, use the ARGC macro:

         local
           long int seed=1;
           double limit=0.8;
         end

         before
           if( ARGC >= 1 ) seed = atol(ARGV(0));
           if( ARGC >= 2 ) limit = atof(ARGV(1));
           srand48(seed);
         end

         if ( drand48() > limit ) continue;
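
       The behavior described for ARGV(n) and ARGC can be imitated in plain C
       as follows. This is a sketch under assumptions, not the funcalc
       implementation: the macros here are local stand-ins that happen to
       share the names, and MAXARGS, split_args(), and arg_int() are
       hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXARGS 16                      /* hypothetical limit */
static char *args[MAXARGS];             /* internal argument array */
static int   nargs = 0;

/* Stand-ins for the macros: ARGV(n) yields NULL when n is out of range. */
#define ARGC    (nargs)
#define ARGV(n) (((n) >= 0 && (n) < nargs) ? args[(n)] : NULL)

/* Split a writable argument string on blanks and tabs. The tokens point
 * into buf, so buf must outlive any use of ARGV(). */
void split_args(char *buf)
{
    char *tok;

    assert(buf != NULL);
    nargs = 0;
    for (tok = strtok(buf, " \t"); tok != NULL && nargs < MAXARGS;
         tok = strtok(NULL, " \t"))
        args[nargs++] = tok;
}

/* Read argument n as an int, falling back to dflt when it is missing,
 * instead of passing a NULL pointer to atoi(). */
int arg_int(int n, int dflt)
{
    return (ARGV(n) != NULL) ? atoi(ARGV(n)) : dflt;
}
```

       Checking for NULL (or comparing against ARGC) before conversion is what
       avoids the SEGV mentioned above.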

       The macro WRITE_ROW expands to the FunTableRowPut() call that writes
       the current row. It can be used to write the row more than once.  In
       addition, the macro NROW expands to the row number currently being pro‐
       cessed. Use of these two macros is shown in the following example:

         if( cur->pha:I == cur->pi:I ) continue;
         a = cur->pha;
         cur->pha = cur->pi;
         cur->pi = a;
         cur->AVG:E  = (cur->pha+cur->pi)/2.0;
         cur->NR:I = NROW;
         if( NROW < 10 ) WRITE_ROW;

       If the -p [prog] switch is specified, the expression is not executed.
       Rather, the generated executable is saved with the specified program
       name for later use.

       If the -n switch is specified, the expression is not executed. Rather,
       the generated code is written to stdout. This is especially useful if
       you want to generate a skeleton file and add your own code, or if you
       need to check compilation errors. Note that the comment at the start of
       the output gives the compiler command needed to build the program on
       that platform. (The command can change from platform to platform
       because of the use of different libraries, compiler switches, etc.)

       As mentioned previously, funcalc will declare a scalar variable auto‐
       matically (as a double) if that variable has been used but not
       declared.  This facility is implemented using a sed script named fun‐
       calc.sed, which processes the compiler output to sense an undeclared
       variable error.  This script has been seeded with the appropriate error
       information for gcc, and for cc on Solaris, DecAlpha, and SGI plat‐
       forms. If you find that automatic declaration of scalars is not working
       on your platform, check this sed script; it might be necessary to add
       to or edit some of the error messages it senses.

       In order to keep the lexical analysis of funcalc expressions (reason‐
       ably) simple, we chose to accept some limitations on how accurately C
       comments, spaces, and new-lines are placed in the generated program. In
       particular, comments associated with local variables declared at the
       beginning of an expression (i.e., not in a local...end block) will usu‐
       ally end up in the inner loop, not with the local declarations:

         /* this comment will end up in the wrong place (i.e., inner loop) */
         double a; /* also in wrong place */
         /* this will be in the right place (inner loop) */
         if( cur->x:D == cur->y:D ) continue; /* also in right place */
         a = cur->x;
         cur->x = cur->y;
         cur->y = a;
         cur->avg:E  = (cur->x+cur->y)/2.0;

       Similarly, spaces and new-lines sometimes are omitted or added in a
       seemingly arbitrary manner. Of course, none of these stylistic blem‐
       ishes affect the correctness of the generated code.

       Because funcalc must analyze the user expression using the data file(s)
       passed on the command line, the input file(s) must be opened and read
       twice: once during program generation and once during execution. As a
       result, it is not possible to use stdin for the input file: funcalc
       cannot be used as a filter. We will consider removing this restriction
       at a later time.

       Along with C comments, funcalc expressions can have one-line internal
       comments that are not passed on to the generated C program. These
       internal comments start with the # character and continue up to the
       new-line:

         double a; # this is not passed to the generated C file
         # nor is this
         a = cur->x;
         cur->x = cur->y;
         cur->y = a;
         /* this comment is passed to the C file */
         cur->avg:E  = (cur->x+cur->y)/2.0;
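
       The rule for internal comments (drop everything from # up to the
       new-line) can be sketched in plain C as shown below. This is a naive
       illustration, not funcalc's own lexer: strip_hash_comments() is a
       hypothetical helper, and it makes no attempt to recognize a # that
       appears inside a string literal.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src to dst, dropping everything from '#' up to (but not including)
 * the new-line. dst must hold at least as many bytes as src, including
 * the terminating NUL. Returns the length of the result. */
size_t strip_hash_comments(const char *src, char *dst)
{
    int in_comment = 0;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        if (*src == '\n')
            in_comment = 0;      /* a comment ends at the new-line */
        else if (*src == '#')
            in_comment = 1;      /* a comment starts at '#' */
        if (!in_comment)
            dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

       C comments are left untouched by this helper, matching the statement
       above that they are passed through to the generated program.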

       As previously mentioned, input columns normally are identified by their
       being used within the inner event loop. There are rare cases where you
       might want to read a column and process it outside the main loop. For
       example, qsort might use a column in its sort comparison routine that
       is not processed inside the inner loop (and therefore not implicitly
       specified as a column to be read).  To ensure that such a column is
       read by the event loop, use the explicit keyword.  The arguments to
       this keyword specify columns that should be read into the input record
       structure even though they are not mentioned in the inner loop. For
       example:

         explicit pi pha

       will ensure that the pi and pha columns are read for each row, even if
       they are not processed in the inner event loop. The explicit statement
       can be placed anywhere.

       Finally, note that funcalc currently works on expressions involving
       FITS binary tables and raw event files. We will consider adding support
       for image expressions at a later point, if there is demand for such
       support from the community.