MK-TCP-MODEL(1)       User Contributed Perl Documentation      MK-TCP-MODEL(1)

mk-tcp-model - Transform tcpdump into metrics that permit performance
and scalability modeling.

Usage: mk-tcp-model [OPTION...] [FILE]

mk-tcp-model parses and analyzes tcpdump files. With no FILE, or when
FILE is -, it reads standard input.

Dump TCP requests and responses to a file, capturing only the packet
headers to avoid dropped packets, and ignoring any packets without a
payload (such as ack-only packets). Capture port 3306 (MySQL database
traffic). Note that to avoid line breaking in terminals and man pages,
the TCP filtering expression that follows has a line break at the end
of the second line; you should omit this from your tcpdump command.

    tcpdump -s 384 -i any -nnq -tttt \
      'tcp port 3306 and (((ip[2:2] - ((ip[0]&0xf)<<2))
       - ((tcp[12]&0xf0)>>2)) != 0)' \
      > /path/to/tcp-file.txt

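The arithmetic in the filter expression above can be written out step by
step. The following Python sketch is not part of mk-tcp-model; the
function names tcp_payload_length and make_packet are made up for this
illustration, and it assumes a raw IPv4 packet that begins at the IP
header. It computes the same quantity the filter tests: the IP total
length, minus the IP header length, minus the TCP header length.

```python
import struct

def tcp_payload_length(packet: bytes) -> int:
    """Return the TCP payload size using the same offsets as the filter."""
    total_length = struct.unpack("!H", packet[2:4])[0]  # ip[2:2]
    ip_header_len = (packet[0] & 0x0F) << 2             # (ip[0]&0xf)<<2
    tcp = packet[ip_header_len:]                        # tcp[...] starts here
    tcp_header_len = (tcp[12] & 0xF0) >> 2              # (tcp[12]&0xf0)>>2
    return total_length - ip_header_len - tcp_header_len

def make_packet(payload: bytes) -> bytes:
    """Build a minimal IPv4+TCP packet for the demo (checksums left at 0)."""
    total = 20 + 20 + len(payload)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total, 0, 0, 64, 6, 0,
                     bytes([127, 0, 0, 1]), bytes([127, 0, 0, 1]))
    tcp = struct.pack("!HHIIBBHHH", 42167, 3306, 0, 0, 5 << 4, 0x18,
                      65535, 0, 0)
    return ip + tcp + payload

# An ack-only packet has no payload, so the filter would discard it.
ack_only = tcp_payload_length(make_packet(b""))
with_query = tcp_payload_length(make_packet(b"SELECT 1"))
```

A packet is kept by the filter only when this value is nonzero.
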
Extract individual response times, sorted by end time:

    mk-tcp-model /path/to/tcp-file.txt > requests.txt

Sort the result by arrival time, for input to the next step:

    sort -n -k1,1 requests.txt > sorted.txt

Slice the result into 10-second intervals and emit throughput,
concurrency, and response time metrics for each interval:

    mk-tcp-model --type=requests --run-time=10 sorted.txt > sliced.txt

Transform the result for modeling with Aspersa's usl tool, discarding
the first and last line of each file if you specify multiple files (the
first and last line are normally incomplete observation periods and are
aberrant):

    for f in sliced.txt; do
       tail -n +2 "$f" | head -n -1 | awk '{print $2, $3, $7/$4}'
    done > usl-input.txt

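For readers less familiar with awk, the same transformation can be
sketched in Python. The column positions follow the column list
documented for "--type=requests" output ($2 is concurrency, $3 is
throughput, $7 is weighted_time, $4 is arrivals). The function name
usl_rows is hypothetical, the sample numbers are made up, and skipping
intervals with zero arrivals is an assumption of this sketch rather
than something the shell pipeline does.

```python
def usl_rows(lines):
    """Drop the first and last interval, then emit
    (concurrency, throughput, weighted_time / arrivals) per interval."""
    rows = [line.split() for line in lines if line.strip()]
    result = []
    for fields in rows[1:-1]:             # like tail -n +2 | head -n -1
        concurrency = float(fields[1])    # awk $2
        throughput = float(fields[2])     # awk $3
        arrivals = float(fields[3])       # awk $4
        weighted_time = float(fields[6])  # awk $7
        if arrivals == 0:
            continue                      # assumption: skip empty intervals
        result.append((concurrency, throughput, weighted_time / arrivals))
    return result

# Made-up sample in the 11-column layout of --type=requests output.
sliced = [
    "100 0.4 50.0 200 190 3.0 4.0 4.0 0.001 0.050 4.0",
    "110 2.0 200.0 2000 2000 9.5 20.0 20.0 0.002 0.040 10.0",
    "120 0.3 30.0 60 70 1.0 0.6 0.6 0.001 0.030 2.0",
]
rows = usl_rows(sliced)
```
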
The following section is included to inform users about the potential
risks, whether known or unknown, of using this tool. The two main
categories of risks are those created by the nature of the tool (e.g.
read-only tools vs. read-write tools) and those created by bugs.

mk-tcp-model merely reads and transforms its input, printing it to the
output. It should be very low risk.

At the time of this release, we know of no bugs that could cause
serious harm to users.

The authoritative source for updated information is always the online
issue tracking system. Issues that affect this tool will be marked as
such. You can see a list of such issues at the following URL:
<http://www.maatkit.org/bugs/mk-tcp-model>.

See also "BUGS" for more information on filing bugs and getting help.

This tool recognizes requests and responses in a TCP stream, and
extracts the "conversations". You can use it to capture the response
times of individual queries to a database, for example. It expects the
TCP input to be in the following format, which should result from the
sample shown in the SYNOPSIS:

    <date> <time.microseconds> IP <IP.port> > <IP.port>: <junk>

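As an illustration of that line format, the following Python sketch
splits one line into its fields. It is not the tool's actual parser;
the regular expression and the name parse_line are assumptions made for
this example. The sample line is the one shown later in this page for
"--type" tcpdump, joined onto a single line.

```python
import re
from datetime import datetime

# <date> <time.microseconds> IP <IP.port> > <IP.port>: <junk>
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(\S+)\.(\d+) > (\S+)\.(\d+): (.*)$"
)

def parse_line(line):
    """Return the fields of one tcpdump line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    date, time, src_ip, src_port, dst_ip, dst_port, rest = match.groups()
    return {
        "ts": datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f"),
        "src_ip": src_ip,
        "src_port": int(src_port),
        "dst_ip": dst_ip,
        "dst_port": int(dst_port),
        "rest": rest,  # the <junk> part, unused here
    }

packet = parse_line(
    "2009-04-12 09:50:16.804849 IP 127.0.0.1.42167 > 127.0.0.1.3306: tcp 37"
)
```
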
The tool watches for "incoming" packets to the port you specify with
the "--watch-server" option. This begins a request. If multiple
inbound packets follow each other, then by default the last inbound
packet seen determines the time at which the request is assumed to
begin. This is logical if one assumes that a server must receive the
whole SQL statement before beginning execution, for example.

When the first outbound packet is seen, the server is considered to
have responded to the request. The tool might see an inbound packet,
but never see a response. This can happen when the kernel drops
packets, for example. As a result, the tool never prints a request
unless it sees the response to it. However, the tool actually does not
print any request until it sees the "last" outbound packet. It
determines this by waiting for either another inbound packet, or EOF,
and then considers the previous inbound/outbound pair to be complete.
As a result, the tool prints requests in a relatively random order.
Most types of analysis require processing in either arrival or
completion order. Therefore, the second type of processing this tool
can do requires that you sort the output from the first stage and
supply it as input.

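The pairing rules described above can be sketched as a small state
machine. This is a simplified illustration under stated assumptions,
not the tool's implementation: the function name extract_requests is
hypothetical, each packet is reduced to a (timestamp, client, inbound)
tuple, and state is kept per client address.

```python
def extract_requests(packets):
    """Pair inbound and outbound packets per client.

    Returns (id, start, end, elapsed, client) tuples in the order in
    which the requests are found to be complete.
    """
    open_requests = {}  # client -> {"id", "start", "end"}
    finished = []
    next_id = 0

    def flush(client):
        request = open_requests.pop(client, None)
        # A request that never got a response is silently dropped.
        if request is not None and request["end"] is not None:
            finished.append((request["id"], request["start"], request["end"],
                             request["end"] - request["start"], client))

    for ts, client, inbound in packets:
        request = open_requests.get(client)
        if inbound:
            if request is not None and request["end"] is None:
                request["start"] = ts  # the last inbound packet wins
            else:
                flush(client)          # the previous pair is now complete
                open_requests[client] = {"id": next_id, "start": ts,
                                         "end": None}
                next_id += 1
        elif request is not None and request["end"] is None:
            request["end"] = ts        # the first outbound packet
    for client in list(open_requests):
        flush(client)                  # EOF completes what is left
    return finished

packets = [
    (0.0, "a", True), (0.1, "a", True), (0.2, "b", True), (0.3, "b", False),
    (0.4, "b", True), (0.5, "a", False), (0.6, "a", False), (0.7, "b", False),
]
finished = extract_requests(packets)
```

In this made-up trace the request with ID 1 is emitted before the
request with ID 0, which shows why the output has to be sorted before
the second stage.
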
The second type of processing is selected with the "--type" option set
to "requests". In this mode, the tool reads a group of requests and
aggregates them, then emits the aggregated metrics.

In the default mode (parsing tcpdump output), requests are printed out
one per line, in the following format:

    <id> <start> <end> <elapsed> <IP:port>

The ID is an incrementing number, assigned in arrival order in the
original TCP traffic. The start and end timestamps, and the elapsed
time, can be customized with the "--start-end" option.

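A consumer of this output could read each line into a small record and
restore arrival order by sorting on the ID, which is what the
"sort -n -k1,1" step in the SYNOPSIS does. The sketch below assumes the
timestamps are plain numbers of seconds; the sample lines are made up,
and the names Request and parse_request are hypothetical.

```python
from typing import NamedTuple

class Request(NamedTuple):
    id: int
    start: float
    end: float
    elapsed: float
    client: str  # the <IP:port> column

def parse_request(line: str) -> Request:
    """Parse one '<id> <start> <end> <elapsed> <IP:port>' line."""
    req_id, start, end, elapsed, client = line.split()
    return Request(int(req_id), float(start), float(end),
                   float(elapsed), client)

# Made-up sample, in the order the tool happened to print it.
lines = [
    "1 10.200000 10.300000 0.100000 127.0.0.1:42168",
    "0 10.100000 10.500000 0.400000 127.0.0.1:42167",
]
by_arrival = sorted(map(parse_request, lines), key=lambda r: r.id)
```
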
In "--type=requests" mode, the tool prints out one line per time
interval as defined by "--run-time", with the following columns: ts,
concurrency, throughput, arrivals, completions, busy_time,
weighted_time, sum_time, variance_mean, quantile_time, obs_time. A
detailed explanation follows:

ts
    The timestamp that defines the beginning of the interval.

concurrency
    The average number of requests resident in the server during the
    interval.

throughput
    The number of arrivals per second during the interval.

arrivals
    The number of arrivals during the interval.

completions
    The number of completions during the interval.

busy_time
    The total amount of time during which at least one request was
    resident in the server during the interval.

weighted_time
    The total response time of all the requests resident in the server
    during the interval, including requests that neither arrived nor
    completed during the interval.

sum_time
    The total response time of all the requests that arrived in the
    interval.

variance_mean
    The variance-to-mean ratio (index of dispersion) of the response
    times of the requests that arrived in the interval.

quantile_time
    The Nth percentile response time for all the requests that arrived
    in the interval. See also "--quantile".

obs_time
    The length of the observation time window. This will usually be
    the same as the interval length, except for the first and last
    intervals in a file, which might have a shorter observation time.

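To make the column definitions concrete, the following Python sketch
computes them for one interval from a list of (start, end) pairs. It is
one plausible reading of the definitions above, not the tool's code,
and the tool's results might differ. In particular, the sketch assumes
that weighted_time counts only the part of each request that falls
inside the interval, that concurrency is weighted_time divided by the
interval length, that the variance is the population variance, and that
the quantile is picked by a simple rank rule. The name interval_metrics
is hypothetical.

```python
def interval_metrics(requests, t0, length, quantile=0.99):
    """Summarize (start, end) pairs for the interval [t0, t0 + length)."""
    t1 = t0 + length
    # Response times of the requests that arrived in the interval.
    times = sorted(end - start for start, end in requests
                   if t0 <= start < t1)
    completions = sum(1 for _, end in requests if t0 <= end < t1)
    # The part of each request's residence that lies inside the interval.
    overlaps = sorted((max(start, t0), min(end, t1))
                      for start, end in requests if start < t1 and end > t0)
    weighted_time = sum(b - a for a, b in overlaps)
    # busy_time is the length of the union of those spans.
    busy_time, busy_until = 0, t0
    for a, b in overlaps:
        if b > busy_until:
            busy_time += b - max(a, busy_until)
            busy_until = b
    arrivals = len(times)
    sum_time = sum(times)
    mean = sum_time / arrivals if arrivals else 0.0
    variance = (sum((t - mean) ** 2 for t in times) / arrivals
                if arrivals else 0.0)
    rank = min(arrivals - 1, int(quantile * arrivals))
    return {
        "concurrency": weighted_time / length,
        "throughput": arrivals / length,
        "arrivals": arrivals,
        "completions": completions,
        "busy_time": busy_time,
        "weighted_time": weighted_time,
        "sum_time": sum_time,
        "variance_mean": variance / mean if mean else 0.0,
        "quantile_time": times[rank] if arrivals else 0.0,
        "obs_time": length,
    }

# Four made-up requests; the last one completes after the interval ends.
metrics = interval_metrics([(0, 4), (1, 2), (5, 6), (9, 12)], t0=0, length=10)
```
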
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.

--config
    type: Array

    Read this comma-separated list of config files; if specified, this
    must be the first option on the command line.

--help
    Show help and exit.

--progress
    type: array; default: time,30

    Print progress reports to STDERR. The value is a comma-separated
    list with two parts. The first part can be percentage, time, or
    iterations; the second part specifies how often an update should be
    printed, in percentage, seconds, or number of iterations.

--quantile
    type: float

    The percentile for the last column when "--type" is "requests"
    (default .99).

--run-time
    type: float

    The size of the aggregation interval in seconds when "--type" is
    "requests" (default 1). Fractional values are permitted.

--start-end
    type: Array; default: ts,end

    Define how the arrival and completion timestamps of a query, and
    thus its response time (elapsed time), are computed. Recall that
    there may be multiple inbound and outbound packets per request and
    response, and refer to the following ASCII diagram. Suppose that a
    client sends a series of three inbound (I) packets to the server,
    which computes the result and then sends two outbound (O) packets
    back:

        I  I   I ..................... O    O
        |<---->|<---response time----->|<-->|
        ts0    ts                      end  end1

    By default, the query is considered to arrive at time ts, and
    complete at time end. However, this might not be what you want.
    Perhaps you do not want to consider the query to have completed
    until time end1. You can accomplish this by setting this option to
    "ts,end1".

--type
    type: string

    The type of input to parse (default tcpdump). The permitted types
    are:

    tcpdump
        The parser expects the input to be formatted with the following
        options: "-x -n -q -tttt". For example, if you want to capture
        output from your local machine, you can do something like the
        following (the port must come last on FreeBSD):

            tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 \
              > mysql.tcp.txt
            mk-query-digest --type tcpdump mysql.tcp.txt

        The other tcpdump parameters, such as -s, -c, and -i, are up to
        you. Just make sure the output looks like this (there is a
        line break in the first line to avoid man-page problems):

            2009-04-12 09:50:16.804849 IP 127.0.0.1.42167
                   > 127.0.0.1.3306: tcp 37

        All MySQL servers running on port 3306 are automatically
        detected in the tcpdump output. Therefore, if the tcpdump
        output contains packets from multiple servers on port 3306 (for
        example, 10.0.0.1:3306, 10.0.0.2:3306, etc.), all
        packets/queries from all these servers will be analyzed
        together as if they were one server.

        If you're analyzing traffic for a protocol that is not running
        on port 3306, see "--watch-server".

--version
    Show version and exit.

--watch-server
    type: string; default: 10.10.10.10:3306

    This option tells mk-tcp-model which server IP address and port
    (such as "10.0.0.1:3306") to watch when parsing tcpdump for
    "--type" tcpdump. If you don't specify it, the tool watches all
    servers by looking for any IP address using port 3306. If you're
    watching a server with a non-standard port, this won't work, so you
    must specify the IP address and port to watch.

    Currently, IP address filtering isn't implemented; so even though
    you must specify the option in IP:port form, it ignores the IP and
    only looks at the port number.

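The effect of the "--start-end" option described above can be shown
with a few lines of Python. This sketch only mirrors the ASCII diagram;
it is not the tool's code. The function name response_time is
hypothetical, and accepting "ts0" as a start point is an assumption
based on the label in the diagram (this page only shows "ts,end" and
"ts,end1" as values).

```python
def response_time(inbound, outbound, start_end=("ts", "end")):
    """Return (start, end, elapsed) for one request/response pair.

    inbound and outbound are the packet timestamps in arrival order.
    """
    points = {
        "ts0": inbound[0],    # first inbound packet
        "ts": inbound[-1],    # last inbound packet (default start)
        "end": outbound[0],   # first outbound packet (default end)
        "end1": outbound[-1], # last outbound packet
    }
    start = points[start_end[0]]
    end = points[start_end[1]]
    return start, end, end - start

# Three inbound and two outbound packets, as in the diagram (made-up times).
inbound = [10.0, 10.5, 11.0]
outbound = [14.0, 15.0]
default = response_time(inbound, outbound)                    # ts,end
until_last = response_time(inbound, outbound, ("ts", "end1")) # ts,end1
```

With these timestamps, the default setting yields an elapsed time of 3
seconds, and "ts,end1" yields 4 seconds.
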
You can download Maatkit from Google Code at
<http://code.google.com/p/maatkit/>, or you can get any of the tools
easily with a command like the following:

    wget http://www.maatkit.org/get/toolname
    or
    wget http://www.maatkit.org/trunk/toolname

Where "toolname" can be replaced with the name (or fragment of a name)
of any of the Maatkit tools. Once downloaded, they're ready to run; no
installation is needed. The first URL gets the latest released version
of the tool, and the second gets the latest trunk code from Subversion.

The environment variable "MKDEBUG" enables verbose debugging output in
all of the Maatkit tools:

    MKDEBUG=1 mk-....

You need Perl and some core packages that ought to be installed in any
reasonably new version of Perl.

For a list of known bugs, see
<http://www.maatkit.org/bugs/mk-tcp-model>.

Please use Google Code Issues and Groups to report bugs or request
support: <http://code.google.com/p/maatkit/>. You can also join
#maatkit on Freenode to discuss Maatkit.

Please include the complete command line used to reproduce the problem
you are seeing, the version of all MySQL servers involved, the complete
output of the tool when run with "--version", and if possible,
debugging output produced by running with the "MKDEBUG=1" environment
variable.

This program is copyright 2011 Baron Schwartz. Feedback and
improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, version 2; OR the Perl Artistic License. On
UNIX and similar systems, you can issue `man perlgpl' or `man
perlartistic' to read these licenses.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

Baron Schwartz

This tool is part of Maatkit, a toolkit for power users of MySQL.
Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
primary code contributors. Both are employed by Percona. Financial
support for Maatkit development is primarily provided by Percona and
its clients.

This manual page documents Ver 1.0.1 Distrib 7540 $Revision: 7536 $.

perl v5.28.1                      2011-06-08                   MK-TCP-MODEL(1)