funjoin(1)                    SAORD Documentation                   funjoin(1)

NAME
funjoin - join two or more FITS binary tables on specified columns

SYNOPSIS
funjoin [switches] <ifile1> <ifile2> ... <ifilen> <ofile>

OPTIONS
    -a cols                 # columns to activate in all files
    -a1 cols ... -an cols   # columns to activate in each file
    -b 'c1:bv1,c2:bv2'      # blank values for common columns in all files
    -bn 'c1:bv1,c2:bv2'     # blank values for columns in specific files
    -j col                  # column to join in all files
    -j1 col ... -jn col     # column to join in each file
    -m min                  # min matches to output a row
    -M max                  # max matches to output a row
    -s                      # add 'JFILES' status column
    -S col                  # add col as status column
    -t tol                  # tolerance for joining numeric cols [2 files only]

DESCRIPTION
funjoin joins rows from two or more (up to 32) FITS Binary Table files,
based on the values of specified join columns in each file. NB: each
join column must have an index file associated with it. These index
files are generated using the funindex program.

The first argument to the program specifies the first input FITS table
or raw event file. If "stdin" is specified, data are read from the
standard input. Subsequent arguments specify additional event files
and tables to join. The last argument is the output FITS file.

NB: Do not use Funtools Bracket Notation to specify FITS extensions and
row filters when running funjoin or you will get wrong results. Rows
are accessed and joined using the index files directly, and this
bypasses all filtering.

The join columns are specified using the -j col switch (which specifies
a column name to use for all files) or with -j1 col1, -j2 col2, ... -jn
coln switches (which specify a column name to use for each file). A
join column must be specified for each file. If both -j col and -jn
coln are specified for a given file, then the latter is used. Join
columns must be either of type string or of type numeric; it is illegal
to mix numeric and string columns in a given join. For example, to join
three files using the same key column for each file, use:

    funjoin -j key in1.fits in2.fits in3.fits out.fits

A different key can be specified for the third file in this way:

    funjoin -j key -j3 otherkey in1.fits in2.fits in3.fits out.fits

The -a "cols" switch (and its -a1 "cols1", -a2 "cols2" counterparts) can
be used to specify columns to activate (i.e. write to the output file)
for each input file. By default, all columns are output.

If two or more columns from separate files have the same name, the
second (and subsequent) columns are renamed to have an underscore and a
numeric value appended.

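The renaming rule can be sketched in a few lines of Python. This is
only an illustration, not funjoin's code: rename_duplicates is a
hypothetical helper, and it assumes the appended number counts
occurrences of the name (the example later in this page, with AKEY,
AKEY_2 and AKEY_3, is equally consistent with the number being the
file position).

```python
def rename_duplicates(columns_per_file):
    """Return output column names, suffixing repeated names with _2, _3, ...

    Assumption (not confirmed by the funjoin documentation): the suffix is
    the running count of how many times the name has been seen so far.
    """
    seen = {}
    out = []
    for cols in columns_per_file:
        for name in cols:
            seen[name] = seen.get(name, 0) + 1
            out.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return out


names = rename_duplicates([
    ["AKEY", "KEY", "A"],
    ["AKEY", "KEY", "C"],
    ["AKEY", "KEY", "F", "G"],
])
print(" ".join(names))
```
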
The -m min and -M max switches specify the minimum and maximum number
of joins required to write out a row. The default minimum is 0 joins
(i.e. all rows are written out) and the default maximum is 63 (the
maximum number of possible joins with a limit of 32 input files). For
example, to write out only those rows in which exactly two files have
columns that match (i.e. one join):

    funjoin -j key -m 1 -M 1 in1.fits in2.fits in3.fits ... out.fits

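One way to read the -m/-M test is sketched below in Python. The
function keep_row is hypothetical, and it assumes that the join count
for a row is the number of files contributing to it minus one, which
is how the "exactly two files ... (i.e. one join)" example above reads.

```python
def keep_row(files_matched, min_joins=0, max_joins=63):
    """Decide whether an output row passes the -m/-M limits.

    Assumption: joins = number of contributing files - 1, so a row made
    from a single file has 0 joins and a row made from two files has 1.
    The defaults mirror the documented defaults for -m and -M.
    """
    joins = files_matched - 1
    return min_joins <= joins <= max_joins


# With "-m 1 -M 1", only rows built from exactly two files are kept.
print(keep_row(1, 1, 1), keep_row(2, 1, 1), keep_row(3, 1, 1))
```
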
A given row can have the requisite number of joins without all of the
files being joined (e.g. three files are being joined but only two have
a given join key value). In this case, all of the columns of the
non-joined file are written out, by default, using blanks (zeros or
NULLs). The -b 'c1:bv1,c2:bv2' and -b1 'c1:bv1,c2:bv2' -b2
'c1:bv1,c2:bv2' ... switches can be used to set the blank value for
columns common to all files and/or columns in a specified file,
respectively. Each blank value string contains a comma-separated list
of column:blank_val specifiers. For floating point values (single or
double), a case-insensitive string value of "nan" means that the IEEE
NaN (not-a-number) should be used. Thus, for example:

    funjoin -b "AKEY:???" -b1 "A:-1" -b3 "G:NaN,E:-1,F:-100" ...

means that a non-joined AKEY column in any file will contain the string
"???", the non-joined A column of file 1 will contain a value of -1,
the non-joined G column of file 3 will contain IEEE NaNs, while the
non-joined E and F columns of the same file will contain values -1 and
-100, respectively. Where both common and specific blank values are
specified for the same column, the specific blank value is used.

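The blank value rules can be illustrated with a short Python sketch.
The helpers parse_blanks and blank_for are hypothetical names, values
are kept as strings apart from the "nan" case, and nothing here
reflects how funjoin itself stores or converts blanks.

```python
import math


def parse_blanks(spec):
    """Split a 'c1:bv1,c2:bv2' string into a {column: blank_value} dict."""
    blanks = {}
    for item in spec.split(","):
        col, _, val = item.partition(":")
        blanks[col] = val
    return blanks


def blank_for(col, common, specific, is_float=False):
    """Pick the blank for a column; a file-specific value beats a common one."""
    val = specific.get(col, common.get(col))
    if val is None:
        return None  # no blank given: the default (zero or NULL) applies
    if is_float and val.lower() == "nan":
        return math.nan  # case-insensitive "nan" means IEEE NaN
    return val


common = parse_blanks("AKEY:???")          # as in: -b "AKEY:???"
file1 = parse_blanks("AKEY:XXX,A:255")     # as in: -b1 "AKEY:XXX,A:255"
file3 = parse_blanks("G:NaN,E:-1,F:-100")  # as in: -b3 "G:NaN,E:-1,F:-100"

print(blank_for("AKEY", common, file1))    # specific value wins
print(blank_for("AKEY", common, file3))    # falls back to the common value
print(blank_for("G", common, file3, is_float=True))
```
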
To distinguish which files are non-blank components of a given row, the
-s (status) switch can be used to add a bitmask column named "JFILES"
to the output file. In this column, a bit is set for each non-blank
file composing the given row, with bit 0 corresponding to the first
file, bit 1 to the second file, and so on. The file names themselves
are stored in the FITS header as parameters named JFILE1, JFILE2, etc.
The -S col switch allows you to change the name of the status column
from the default "JFILES".

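Decoding the status bitmask is a matter of testing each bit. The
Python sketch below uses a hypothetical helper, files_in_row, and
returns 1-based file numbers so that they line up with the JFILE1,
JFILE2, ... header parameters.

```python
def files_in_row(jfiles, nfiles):
    """Return the 1-based numbers of the input files flagged in a JFILES value.

    Bit 0 stands for the first file, bit 1 for the second, and so on.
    """
    return [i + 1 for i in range(nfiles) if jfiles & (1 << i)]


for value in (7, 5, 2):
    print(value, files_in_row(value, 3))
```

For three input files, a value of 7 means all three files contributed
to the row, 5 means the first and third, and 2 means only the second.
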
A join between rows is the Cartesian product of all rows in one file
having a given join column value with all rows in a second file having
the same value for its join column and so on. Thus, if file1 has 2 rows
with join column value 100, file2 has 3 rows with the same value, and
file3 has 4 such rows, then the join results in 2*3*4=24 rows being
output.

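The row count in this example can be checked with itertools.product
from the Python standard library. The row labels are made up for
illustration; only the counts (2, 3 and 4 rows sharing the join value
100) come from the text above.

```python
from itertools import product

# Rows in each file whose join column equals 100 (labels are illustrative).
file1_rows = ["f1r1", "f1r2"]
file2_rows = ["f2r1", "f2r2", "f2r3"]
file3_rows = ["f3r1", "f3r2", "f3r3", "f3r4"]

# Every combination of one row per file becomes one output row.
joined = list(product(file1_rows, file2_rows, file3_rows))
print(len(joined))
```
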
The join algorithm directly processes the index file associated with
the join column of each file. The smallest of the current join column
values across all files is selected as a base, and this value is used
to join equal-valued columns in the other files. In this way, the index
files are traversed exactly once.

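The single-pass traversal described above resembles a multi-way merge
over sorted inputs. The Python sketch below shows that idea under
stated assumptions: each index is modeled as a list of (key,
row_number) pairs already sorted by key, and merge_join is a
hypothetical name. It is not the funjoin implementation and does not
read real index files.

```python
def merge_join(indexes):
    """Walk several sorted indexes once, grouping rows by equal key.

    indexes: one list per file of (key, row_number) pairs, sorted by key.
    Yields (key, groups), where groups[f] lists the row numbers in file f
    that carry that key (an empty list if file f has no such row).
    """
    pos = [0] * len(indexes)
    while any(p < len(ix) for p, ix in zip(pos, indexes)):
        # The smallest current key across the files becomes the base value.
        base = min(ix[p][0] for p, ix in zip(pos, indexes) if p < len(ix))
        groups = []
        for f, ix in enumerate(indexes):
            start = pos[f]
            # Consume every entry in this file equal to the base value.
            while pos[f] < len(ix) and ix[pos[f]][0] == base:
                pos[f] += 1
            groups.append([row for _, row in ix[start:pos[f]]])
        yield base, groups


a = [(1, 0), (3, 1), (3, 2)]
b = [(3, 0), (4, 1)]
for key, groups in merge_join([a, b]):
    print(key, groups)
```

Each position counter only moves forward, so every index entry is
visited once, which matches the "traversed exactly once" property.
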
The -t tol switch specifies a tolerance value for numeric columns. At
present, a tolerance value can join only two files at a time. (A
completely different algorithm is required to join more than two files
using a tolerance, something we might consider implementing in the
future.)

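The page does not define how the tolerance is applied. The Python
sketch below assumes the simplest reading, that two numeric keys match
when their absolute difference is at most tol, and shows a two-file
match over sorted keys. tolerance_join is a hypothetical name and the
matching rule is an assumption, not documented funjoin behavior.

```python
def tolerance_join(left, right, tol):
    """Pair up entries of two sorted key lists that lie within tol.

    Assumption: keys a and b match when abs(a - b) <= tol.
    Returns (i, j) index pairs into left and right.
    """
    pairs = []
    lo = 0
    for i, a in enumerate(left):
        # Skip right-hand keys that are too small to match this or any
        # later left-hand key (left is sorted, so the bound only grows).
        while lo < len(right) and right[lo] < a - tol:
            lo += 1
        j = lo
        while j < len(right) and right[j] <= a + tol:
            pairs.append((i, j))
            j += 1
    return pairs


pairs = tolerance_join([1.0, 2.0, 5.0], [1.1, 2.4, 4.95], 0.2)
print(pairs)
```
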
The following example shows many of the features of funjoin. The input
files t1.fits, t2.fits, and t3.fits contain the following columns:

    [sh] fundisp t1.fits
           AKEY    KEY      A      B
    ----------- ------ ------ ------
            aaa      0      0      1
            bbb      1      3      4
            ccc      2      6      7
            ddd      3      9     10
            eee      4     12     13
            fff      5     15     16
            ggg      6     18     19
            hhh      7     21     22

    [sh] fundisp t2.fits
           AKEY    KEY      C      D
    ----------- ------ ------ ------
            iii      8     24     25
            ggg      6     18     19
            eee      4     12     13
            ccc      2      6      7
            aaa      0      0      1

    [sh] fundisp t3.fits
            AKEY    KEY        E        F           G
    ------------ ------ -------- -------- -----------
             ggg      6       18       19      100.10
             jjj      9       27       28      200.20
             aaa      0        0        1      300.30
             ddd      3        9       10      400.40

Given these input files, the following funjoin command:

    funjoin -s -a1 "-B" -a2 "-D" -a3 "-E" \
        -b "AKEY:???" -b1 "AKEY:XXX,A:255" -b3 "G:NaN,E:-1,F:-100" \
        -j key t1.fits t2.fits t3.fits foo.fits

will join the files on the KEY column, outputting all columns except B
(in t1.fits), D (in t2.fits) and E (in t3.fits), and setting blank
values for AKEY (globally, but overridden for t1.fits), A (in t1.fits),
and G, E, and F (in t3.fits). A JFILES column will be output to flag
which files were used in each row:

            AKEY    KEY      A       AKEY_2  KEY_2      C       AKEY_3  KEY_3        F           G   JFILES
    ------------ ------ ------ ------------ ------ ------ ------------ ------ -------- ----------- --------
             aaa      0      0          aaa      0      0          aaa      0        1      300.30        7
             bbb      1      3          ???      0      0          ???      0     -100         nan        1
             ccc      2      6          ccc      2      6          ???      0     -100         nan        3
             ddd      3      9          ???      0      0          ddd      3       10      400.40        5
             eee      4     12          eee      4     12          ???      0     -100         nan        3
             fff      5     15          ???      0      0          ???      0     -100         nan        1
             ggg      6     18          ggg      6     18          ggg      6       19      100.10        7
             hhh      7     21          ???      0      0          ???      0     -100         nan        1
             XXX      0    255          iii      8     24          ???      0     -100         nan        2
             XXX      0    255          ???      0      0          jjj      9       28      200.20        4

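The JFILES column of this output can be cross-checked from the KEY
values of the three input tables. The Python sketch below rebuilds
only the bitmask per key (bit 0 for t1.fits, bit 1 for t2.fits, bit 2
for t3.fits); it performs no FITS I/O and is not part of funjoin.

```python
# KEY column values as listed by fundisp for each input table.
t1_keys = [0, 1, 2, 3, 4, 5, 6, 7]
t2_keys = [8, 6, 4, 2, 0]
t3_keys = [6, 9, 0, 3]

# Set one bit per file for every key value that file contains.
jfiles = {}
for bit, keys in enumerate([t1_keys, t2_keys, t3_keys]):
    for key in keys:
        jfiles[key] = jfiles.get(key, 0) | (1 << bit)

for key in sorted(jfiles):
    print(key, jfiles[key])
```

The resulting values (7, 1, 3, 5, 3, 1, 7, 1, 2, 4 for keys 0 through
9) agree with the JFILES column shown above, including the two rows
whose key appears only in t2.fits or t3.fits.
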
SEE ALSO
See funtools(n) for a list of Funtools help pages

version 1.4.2                  January 2, 2008                   funjoin(1)