Cflow(3) User Contributed Perl Documentation Cflow(3)

Cflow::find - find "interesting" flows in raw IP flow files

    use Cflow;

    Cflow::verbose(1);
    Cflow::find(\&wanted, <*.flows*>);

    sub wanted { ... }

or:

    Cflow::find(\&wanted, \&perfile, <*.flows*>);

    sub perfile {
        my $fname = shift;
        ...
    }

This module implements an API for processing IP flow accounting
information which has been collected from routers and written into flow
files by one of the various flow collectors listed below.

It was originally conceived and written for use by FlowScan:

    http://net.doit.wisc.edu/~plonka/FlowScan/

This package is of little use on its own. It requires input in the
form of time-stamped raw flow files produced by other software
packages. These "flow sources" either snoop a local ethernet (via
libpcap) or collect flow information from IP routers that are
configured to export said information. The following flow sources are
supported:

argus by Carter Bullard:
    http://www.qosient.com/argus/

flow-tools by Mark Fullmer (with NetFlow v1, v5, v6, or v7):
    http://www.splintered.net/sw/flow-tools/

CAIDA's cflowd (with NetFlow v5):
    http://www.caida.org/tools/measurement/cflowd/
    http://net.doit.wisc.edu/~plonka/cflowd/

lfapd by Steve Premeau (with LFAPv4):
    http://www.nmops.org/

Cflow::find() will iterate across all the flows in the specified files.
It will call your wanted() function once per flow record. If the file
name argument passed to find() is specified as "-", flows will be read
from standard input.

The wanted() function does whatever you write it to do. For instance,
it could simply print interesting flows or it might maintain byte,
packet, and flow counters which could be written to a database after
the find subroutine completes.
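
As a minimal sketch of the counter-keeping approach described above, the
wanted() function below accumulates per-source totals in a hash. The
%totals hash, the choice to key on the source address, and the made-up
records are assumptions for illustration only; with the module installed,
the stand-in loop would be replaced by a call to Cflow::find().

```perl
use strict;
use warnings;

# Hypothetical accumulator: per-source flow, packet, and byte totals.
my %totals;

sub wanted {
    my $t = $totals{$Cflow::srcip} ||= { flows => 0, pkts => 0, bytes => 0 };
    $t->{flows}++;
    $t->{pkts}  += $Cflow::pkts;
    $t->{bytes} += $Cflow::bytes;
    return 1;    # non-zero, so every flow counts as "wanted"
}

# With Cflow installed, the real driver would be:
#   Cflow::find(\&wanted, <*.flows*>);
# Here a few made-up records stand in for the flow files.
my @fake_flows = (
    [ '10.0.0.1', 3, 180 ],
    [ '10.0.0.2', 1, 40 ],
    [ '10.0.0.1', 2, 1500 ],
);
for my $rec (@fake_flows) {
    local ($Cflow::srcip, $Cflow::pkts, $Cflow::bytes) = @$rec;
    wanted();
}

# Report once the scan is complete (this is where a database write could go).
for my $ip (sort keys %totals) {
    printf "%-15s flows=%d pkts=%d bytes=%d\n",
        $ip, @{ $totals{$ip} }{qw(flows pkts bytes)};
}
```

Keeping the totals in a hash and reporting only after the scan finishes
means wanted() itself stays cheap, which matters because it runs once per
flow record.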

Within your wanted() function, tests on the "current" flow can be
performed using the following variables:

$Cflow::unix_secs
    secs since epoch (deprecated)

$Cflow::exporter
    Exporter IP Address as a host-ordered "long"

$Cflow::exporterip
    Exporter IP Address as dotted-decimal string

$Cflow::localtime
    $Cflow::unix_secs interpreted as localtime with this strftime(3)
    format:

        %Y/%m/%d %H:%M:%S

$Cflow::srcaddr
    Source IP Address as a host-ordered "long"

$Cflow::srcip
    Source IP Address as a dotted-decimal string

$Cflow::dstaddr
    Destination IP Address as a host-ordered "long"

$Cflow::dstip
    Destination IP Address as a dotted-decimal string

$Cflow::input_if
    Input interface index

$Cflow::output_if
    Output interface index

$Cflow::srcport
    TCP/UDP src port number or equivalent

$Cflow::dstport
    TCP/UDP dst port number or equivalent

$Cflow::ICMPType
    high byte of $Cflow::dstport

    Undefined if the current flow is not an ICMP flow.

$Cflow::ICMPCode
    low byte of $Cflow::dstport

    Undefined if the current flow is not an ICMP flow.

$Cflow::ICMPTypeCode
    symbolic representation of $Cflow::dstport

    The value is the type-specific ICMP code, if any, followed by the
    ICMP type. E.g.

        ECHO
        HOST_UNREACH

    Undefined if the current flow is not an ICMP flow.

$Cflow::pkts
    Packets sent in Duration

$Cflow::bytes
    Octets sent in Duration

$Cflow::nexthop
    Next hop router's IP Address as a host-ordered "long"

$Cflow::nexthopip
    Next hop router's IP Address as a dotted-decimal string

$Cflow::startime
    secs since epoch at start of flow

$Cflow::start_msecs
    fractional portion of startime (in milliseconds)

    This will be zero unless the source is flow-tools or argus.

$Cflow::endtime
    secs since epoch at last packet of flow

$Cflow::end_msecs
    fractional portion of endtime (in milliseconds)

    This will be zero unless the source is flow-tools or argus.

$Cflow::protocol
    IP protocol number (as is specified in /etc/protocols, i.e.
    1=ICMP, 6=TCP, 17=UDP, etc.)

$Cflow::tos
    IP Type-of-Service

$Cflow::tcp_flags
    bitwise OR of all TCP flags that were set within packets in the
    flow; 0x10 for non-TCP flows

$Cflow::TCPFlags
    symbolic representation of $Cflow::tcp_flags. The value will be a
    bitwise-or expression. E.g.

        PUSH|SYN|FIN|ACK

    Undefined if the current flow is not a TCP flow.

$Cflow::raw
    the entire "packed" flow record as read from the input file

    This is useful when the "wanted" subroutine wants to write the flow
    to another FILEHANDLE. E.g.:

        syswrite(FILEHANDLE, $Cflow::raw, length $Cflow::raw)

    Note that if you're using a modern version of perl that supports
    PerlIO Layers, be sure that FILEHANDLE is using something
    appropriate like the ":bytes" layer. This can be activated on open,
    or with:

        binmode(FILEHANDLE, ":bytes");

    This will prevent the external LANG setting from causing perl to do
    such things as interpreting your raw flow records as UTF-8
    characters and corrupting the record.

$Cflow::reraw
    the entire "re-packed" flow record formatted like $Cflow::raw.

    This is useful when the "wanted" subroutine wants to write a
    modified flow to another FILEHANDLE. E.g.:

        $srcaddr = my_encode($srcaddr);
        $dstaddr = my_encode($dstaddr);
        syswrite(FILEHANDLE, $Cflow::reraw, length $Cflow::raw)

    These flow variables are packed into $Cflow::reraw:

        $Cflow::index, $Cflow::exporter,
        $Cflow::srcaddr, $Cflow::dstaddr,
        $Cflow::input_if, $Cflow::output_if,
        $Cflow::srcport, $Cflow::dstport,
        $Cflow::pkts, $Cflow::bytes,
        $Cflow::nexthop,
        $Cflow::startime, $Cflow::endtime,
        $Cflow::protocol, $Cflow::tos,
        $Cflow::src_as, $Cflow::dst_as,
        $Cflow::src_mask, $Cflow::dst_mask,
        $Cflow::tcp_flags,
        $Cflow::engine_type, $Cflow::engine_id

    Note that if you're using a modern version of perl that supports
    PerlIO Layers, be sure that FILEHANDLE is using something
    appropriate like the ":bytes" layer. This can be activated on open,
    or with:

        binmode(FILEHANDLE, ":bytes");

    This will prevent the external LANG setting from causing perl to do
    such things as interpreting your raw flow records as UTF-8
    characters and corrupting the record.

$Cflow::Bps
    the minimum bytes per second for the current flow

$Cflow::pps
    the minimum packets per second for the current flow
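
The relationship between $Cflow::dstport and the ICMP type and code
variables listed above can be sketched in plain Perl. The helper name
icmp_type_code() is hypothetical and not part of this module; inside
wanted() you would normally just read $Cflow::ICMPType and
$Cflow::ICMPCode.

```perl
use strict;
use warnings;

# Split a 16-bit port value into (type, code) as the variable list
# describes: the high byte is the ICMP type, the low byte the ICMP code.
# icmp_type_code() is an illustrative helper, not a Cflow function.
sub icmp_type_code {
    my ($dstport) = @_;
    return (($dstport >> 8) & 0xff, $dstport & 0xff);
}

# 0x0301 carries type 3 (destination unreachable), code 1 (host unreachable).
my ($type, $code) = icmp_type_code(0x0301);
printf "type=%d code=%d\n", $type, $code;
```

Since the ICMP variables are undefined for non-ICMP flows, a test such as
"$Cflow::protocol == 1" belongs in front of any use of them.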

The following variables are undefined if using NetFlow v1 (which does
not contain the requisite information):

$Cflow::src_as
    originating or peer AS of source address

$Cflow::dst_as
    originating or peer AS of destination address

The following variables are undefined if using NetFlow v1 or LFAPv4
(which do not contain the requisite information):

$Cflow::src_mask
    source address prefix mask bits

$Cflow::dst_mask
    destination address prefix mask bits

$Cflow::engine_type
    type of flow switching engine

$Cflow::engine_id
    ID of the flow switching engine
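
To illustrate how a host-ordered "long" address and a prefix mask length
fit together, here is a small sketch in plain Perl. The helpers
long_to_dotted() and network_prefix() are hypothetical names introduced
for this example; they take the kind of values found in $Cflow::srcaddr
and $Cflow::src_mask but are not provided by this module.

```perl
use strict;
use warnings;

# Render a 32-bit address held as an integer in dotted-decimal form.
sub long_to_dotted {
    my ($addr) = @_;
    return join '.', unpack 'C4', pack 'N', $addr;
}

# Apply a prefix length (mask bits) to an address and return "network/bits".
sub network_prefix {
    my ($addr, $bits) = @_;
    my $mask = $bits ? (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF : 0;
    return sprintf '%s/%d', long_to_dotted($addr & $mask), $bits;
}

my $addr = 0xC0A80A2A;    # 192.168.10.42 as an integer
print long_to_dotted($addr), "\n";
print network_prefix($addr, 24), "\n";
```

Because the mask variables are undefined for NetFlow v1 and LFAPv4
sources, a caller would want to check "defined $Cflow::src_mask" before
passing it to something like network_prefix().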

Optionally, a reference to a perfile() function can be passed to
Cflow::find as the argument following the reference to the wanted()
function. This perfile() function will be called once for each flow
file. The argument to the perfile() function will be the name of the
flow file which is about to be processed. The purpose of the perfile()
function is to allow you to periodically report the progress of
Cflow::find() and to provide an opportunity to periodically reclaim
storage used by data objects that may have been allocated or maintained
by the wanted() function. For instance, when counting the number of
active host IP addresses in each time-stamped flow file, perfile() can
reset the counter to zero and clear the search tree or hash used to
remember those IP addresses.
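
The active-host example can be sketched as follows. This is an
illustration under stated assumptions: the %active hash, the
flush_report() helper, and the made-up file names and flows are all
hypothetical, and the stand-in loop takes the place of a real
Cflow::find(\&wanted, \&perfile, ...) call so that the sketch runs
without the module.

```perl
use strict;
use warnings;

my %active;     # hosts seen in the file currently being processed
my $current;    # name of that file
my @report;     # one summary line per finished file

# Record the count for the file just finished, then release the hash.
sub flush_report {
    push @report, sprintf('%s: %d active hosts', $current, scalar(keys %active))
        if defined $current;
    %active = ();
}

sub perfile {
    my $fname = shift;
    flush_report();    # called before $fname is processed
    $current = $fname;
}

sub wanted {
    $active{$Cflow::srcip} = 1;
    $active{$Cflow::dstip} = 1;
    return 1;
}

# Stand-in for: Cflow::find(\&wanted, \&perfile, <*.flows*>);
my @fake_files = (
    [ 'a.flows' => [ [ '10.0.0.1', '10.0.0.2' ], [ '10.0.0.1', '10.0.0.3' ] ] ],
    [ 'b.flows' => [ [ '10.0.0.9', '10.0.0.2' ] ] ],
);
for my $file (@fake_files) {
    my ($name, $flows) = @$file;
    perfile($name);
    for my $flow (@$flows) {
        local ($Cflow::srcip, $Cflow::dstip) = @$flow;
        wanted();
    }
}
flush_report();    # the last file is not followed by another perfile() call
print "$_\n" for @report;
```

Because perfile() runs before each file is processed, the totals for the
final file have to be flushed explicitly after the scan returns.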

Since Cflow is an Exporter, you can request that all those scalar flow
variables be exported (so that you need not use the "Cflow::" prefix):

    use Cflow qw(:flowvars);

Also, you can request that the symbolic names for the TCP flags, ICMP
types, and/or ICMP codes be exported:

    use Cflow qw(:tcpflags :icmptypes :icmpcodes);

The tcpflags are:

    $TH_FIN $TH_SYN $TH_RST $TH_PUSH $TH_ACK $TH_URG

The icmptypes are:

    $ICMP_ECHOREPLY $ICMP_DEST_UNREACH $ICMP_SOURCE_QUENCH
    $ICMP_REDIRECT $ICMP_ECHO $ICMP_TIME_EXCEEDED
    $ICMP_PARAMETERPROB $ICMP_TIMESTAMP $ICMP_TIMESTAMPREPLY
    $ICMP_INFO_REQUEST $ICMP_INFO_REPLY $ICMP_ADDRESS
    $ICMP_ADDRESSREPLY

The icmpcodes are:

    $ICMP_NET_UNREACH $ICMP_HOST_UNREACH $ICMP_PROT_UNREACH
    $ICMP_PORT_UNREACH $ICMP_FRAG_NEEDED $ICMP_SR_FAILED
    $ICMP_NET_UNKNOWN $ICMP_HOST_UNKNOWN $ICMP_HOST_ISOLATED
    $ICMP_NET_ANO $ICMP_HOST_ANO $ICMP_NET_UNR_TOS
    $ICMP_HOST_UNR_TOS $ICMP_PKT_FILTERED $ICMP_PREC_VIOLATION
    $ICMP_PREC_CUTOFF $ICMP_UNREACH $ICMP_REDIR_NET
    $ICMP_REDIR_HOST $ICMP_REDIR_NETTOS $ICMP_REDIR_HOSTTOS
    $ICMP_EXC_TTL $ICMP_EXC_FRAGTIME

Please note that the names above are not necessarily exactly the same
as the names of the flags, types, and codes as set in the values of the
aforementioned $Cflow::TCPFlags and $Cflow::ICMPTypeCode flow variables.
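
A typical use of the flag names is a bit test against the numeric flags
value. The sketch below defines local stand-ins for the :tcpflags
variables so that it runs without the module; the values shown are the
standard TCP header flag bits and are assumed, not confirmed, to match
what Cflow exports. The helper syn_without_ack() is hypothetical.

```perl
use strict;
use warnings;

# Stand-ins for the variables exported by :tcpflags. These are the
# standard TCP header flag bits, assumed to match the exported values.
our ($TH_FIN, $TH_SYN, $TH_RST, $TH_PUSH, $TH_ACK, $TH_URG) =
    (0x01, 0x02, 0x04, 0x08, 0x10, 0x20);

# True if SYN was set in some packet of the flow but ACK never was.
sub syn_without_ack {
    my ($tcp_flags) = @_;
    return (($tcp_flags & $TH_SYN) && !($tcp_flags & $TH_ACK)) ? 1 : 0;
}

print syn_without_ack($TH_SYN), "\n";
print syn_without_ack($TH_SYN | $TH_ACK), "\n";
```

In a real wanted() function the argument would be $Cflow::tcp_flags, and
the test should be preceded by a check that the flow is TCP, since the
flags value is documented as 0x10 for non-TCP flows.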

Lastly, as is usually the case for modules, the subroutine names can be
imported, and a minimum version of Cflow can be specified:

    use Cflow qw(:flowvars find verbose 1.031);

Cflow::find() returns a "hit-ratio". This hit-ratio is a string
formatted similarly to that of the value of a perl hash when taken in a
scalar context. This hit-ratio indicates ((# of "wanted" flows) / (#
of scanned flows)). A flow is considered to have been "wanted" if your
wanted() function returns non-zero.
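
If the hit-ratio needs to be used numerically, it can be split apart. The
sketch below assumes the string has the form "wanted/scanned" (for
example "3/8"), which is what the description above suggests but is not
spelled out in detail; parse_hit_ratio() is a hypothetical helper, and
the literal string stands in for the return value of Cflow::find().

```perl
use strict;
use warnings;

# Split an assumed "wanted/scanned" string into its two counts.
# Returns an empty list if the string does not have that form.
sub parse_hit_ratio {
    my ($ratio) = @_;
    my ($wanted, $scanned) = $ratio =~ m{^(\d+)/(\d+)$}
        or return;
    return ($wanted, $scanned);
}

# In real use: my $ratio = Cflow::find(\&wanted, @files);
my $ratio = '3/8';
my ($wanted, $scanned) = parse_hit_ratio($ratio);
printf "%d of %d flows wanted (%.1f%%)\n",
    $wanted, $scanned, $scanned ? 100 * $wanted / $scanned : 0;
```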

Cflow::verbose() takes a single scalar boolean argument which indicates
whether or not you wish warning messages to be generated to STDERR when
"problems" occur. Verbose mode is set by default.

Here's a complete example with a sample wanted function. It will print
all UDP flows that involve either a source or destination port of 31337
and a port on the other end that is unreserved (1024 or greater):

    use Cflow qw(:flowvars find verbose);

    my $udp = getprotobyname('udp');
    verbose(0);
    find(\&wanted, @ARGV ? @ARGV : <*.flows*>);

    sub wanted {
        # Skip flows where either end uses a reserved port.
        return if ($srcport < 1024 || $dstport < 1024);
        return unless (($srcport == 31337 || $dstport == 31337) &&
                       $udp == $protocol);

        printf("%s %15.15s.%-5hu %15.15s.%-5hu %2hu %10u %10u\n",
               $localtime,
               $srcip,
               $srcport,
               $dstip,
               $dstport,
               $protocol,
               $pkts,
               $bytes);
    }

Here's an example which demonstrates a technique which can be used to
pass arbitrary arguments to your wanted function by passing a reference
to an anonymous subroutine as the wanted() function argument to
Cflow::find():

    sub wanted {
        my @params = @_;
        # ...
    }

    Cflow::find(sub { wanted(@params) }, @files);

Argus uses a bidirectional flow model. This means that some argus
flows represent packets not only in the forward direction (from
"source" to "destination"), but also in the reverse direction (from the
so-called "destination" to the "source"). However, this module uses a
unidirectional flow model, and therefore splits some argus flows into
two unidirectional flows for the purpose of reporting.

Currently, using this module's API there is no way to determine if two
subsequently reported unidirectional flows were really a single argus
flow. This may be addressed in a future release of this package.

Furthermore, for argus flows which represent bidirectional ICMP
traffic, this module presumes that all the reverse packets were
ECHOREPLYs (sic). This is sometimes incorrect as described here:

    http://www.theorygroup.com/Archive/Argus/2002/msg00016.html

and will be fixed in a future release of this package.

Timestamps ($startime and $endtime) are sometimes reported incorrectly
for bidirectional argus flows that represent only one packet in each
direction. This will be fixed in a future release.

Argus flows sometimes contain information which does not map directly
to the flow variables presented by this module. For the time being,
this information is simply not accessible through this module's API.
This may be addressed in a future release.

Lastly, argus flows produced from observed traffic on a local ethernet
do not contain enough information to meaningfully set the values of all
this module's flow variables. For instance, the next-hop and
input/output ifIndex numbers are missing. For the time being, all
argus flows accessed through this module's API will have both the
$input_if and $output_if as 42. Although 42 is the answer to life,
the universe, and everything, in this context, it is just an arbitrary
number. It is important that $output_if is non-zero, however, since
existing FlowScan reports interpret an $output_if value of zero to mean
that the traffic represented by that flow was not forwarded (i.e.
dropped). For similar reasons, the $nexthopip for all argus flows is
reported as "127.0.0.1".

Currently, only NetFlow version 5 is supported when reading
cflowd-format raw flow files.

When built with support for flow-tools and attempting to read a cflowd
format raw flow file from standard input, you'll get the error:

    open "-": No such file or directory

For the time being, the workaround is to write the content to a file
and read it directly from there rather than from standard input.
(This happens because we can't close and re-open file descriptor zero
after determining that the content was not in flow-tools format.)
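
One way to apply the workaround from within a script is to spool the
input to a temporary file first. This is a sketch under stated
assumptions: spool_to_tempfile() is a hypothetical helper built on the
core File::Temp module, and the in-memory handle stands in for STDIN so
that the example runs without the module or piped input.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy everything from the given handle into a temporary file and return
# the file's name. UNLINK => 1 removes the file when the program exits.
sub spool_to_tempfile {
    my ($in) = @_;
    binmode($in);
    my ($out, $fname) = tempfile('cflowXXXXXX', TMPDIR => 1, UNLINK => 1);
    binmode($out);
    my $buf;
    while (my $n = read($in, $buf, 65536)) {
        print {$out} $buf or die "write to $fname failed: $!";
    }
    close($out) or die "close $fname failed: $!";
    return $fname;
}

# Intended use (requires Cflow):
#   my $fname = spool_to_tempfile(\*STDIN);
#   Cflow::find(\&wanted, $fname);

# Demonstration with an in-memory handle in place of STDIN.
my $data = "\x00\x01raw bytes";
open my $demo, '<', \$data or die "cannot open in-memory handle: $!";
my $fname = spool_to_tempfile($demo);
printf "%d bytes spooled to a temporary file\n", -s $fname;
```

Both handles are put into binary mode so that the flow records are
copied byte for byte.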

When built with support for flow-tools and using verbose mode,
Cflow::find will generate warnings if you process a cflowd format raw
flow file. This happens because it will first attempt to open the file
as a flow-tools format raw flow file (which will produce a warning
message), and then revert to handling it as a cflowd format raw flow
file.

Likewise, when built with support for argus and attempting to read a
cflowd format raw flow file from standard input, you'll get this
warning message:

    not Argus-2.0 data stream.

This is because argus (as of argus-2.0.4) doesn't seem to have a mode
in which such warning messages are suppressed.

The $Cflow::raw flow variable contains the flow record in cflowd
format, even if it was read from a raw flow file produced by flow-tools
or argus. Because cflowd discards the fractional portion of the flow
start and end time, only the whole seconds portion of these times will
be retained. (That is, the raw record in $Cflow::raw does not contain
the $start_msecs and $end_msecs, so using $Cflow::raw to convert to
cflowd format is a lossy operation.)

When used with cflowd, Cflow::find() will generate warnings if the flow
data file is "invalid" as far as it's concerned. To avoid this, you
must be using Cisco version 5 flow-export and configure cflowd so that
it saves all flow-export data. This is the default behavior when
cflowd produces time-stamped raw flow files after being patched as
described here:

    http://net.doit.wisc.edu/~plonka/cflowd/

The interface presented by this package is a blatant ripoff of
File::Find.

Dave Plonka <plonka@doit.wisc.edu>

Copyright (C) 1998-2005 Dave Plonka. This program is free software;
you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation;
either version 2 of the License, or (at your option) any later version.

The version number is the module file RCS revision number ($Revision:
1.53 $) with the minor number printed right justified with leading
zeroes to 3 decimal places. For instance, RCS revision 1.1 would yield
a package version number of 1.001.

This is so that revision 1.10 (which is version 1.010), for example,
will test greater than revision 1.2 (which is version 1.002) when you
want to require a minimum version of this module.
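
The mapping from RCS revision to package version can be sketched with
sprintf. The helper rcs_to_version() is a hypothetical name used only
for this illustration; it is not how the module itself necessarily
computes its version.

```perl
use strict;
use warnings;

# Convert an RCS revision such as "1.10" into a version such as "1.010"
# by zero-padding the minor number to three digits.
sub rcs_to_version {
    my ($revision) = @_;
    my ($major, $minor) = $revision =~ /^(\d+)\.(\d+)$/
        or die "unexpected revision format: $revision";
    return sprintf '%d.%03d', $major, $minor;
}

for my $rev (qw(1.1 1.2 1.10 1.53)) {
    printf "%-5s -> %s\n", $rev, rcs_to_version($rev);
}
```

With the padding in place, a plain numeric comparison orders the
versions the same way the revisions are ordered, which is what a
minimum-version request relies on.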

perl(1), Socket, Net::Netmask, Net::Patricia.

perl v5.8.8 2005-09-28 Cflow(3)