mk-tcp-model(1p)

MK-TCP-MODEL(1)       User Contributed Perl Documentation      MK-TCP-MODEL(1)

NAME

       mk-tcp-model - Transform tcpdump into metrics that permit performance
       and scalability modeling.

SYNOPSIS

       Usage: mk-tcp-model [OPTION...] [FILE]

       mk-tcp-model parses and analyzes tcpdump files.  With no FILE, or when
       FILE is -, it reads standard input.

       Dump TCP requests and responses to a file, capturing only the packet
       headers to avoid dropped packets, and ignoring any packets without a
       payload (such as ack-only packets).  Capture port 3306 (MySQL database
       traffic).  Note that to avoid line breaking in terminals and man pages,
       the TCP filtering expression that follows has a line break at the end
       of the second line; you should omit this from your tcpdump command.

        tcpdump -s 384 -i any -nnq -tttt \
               'tcp port 3306 and (((ip[2:2] - ((ip[0]&0xf)<<2))
             - ((tcp[12]&0xf0)>>2)) != 0)' \
          > /path/to/tcp-file.txt

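       The filter expression above keeps only packets with a non-empty TCP
       payload: it subtracts the IP header length and the TCP header length
       from the IP total length and compares the result to zero.  The
       following Python sketch repeats that arithmetic on raw header bytes.
       The names tcp_payload_length and make_packet, and the sample packets,
       are illustrative only; they are not part of mk-tcp-model or tcpdump.

```python
def tcp_payload_length(ip_packet: bytes) -> int:
    """Mirror the filter: ip[2:2] - ((ip[0]&0xf)<<2) - ((tcp[12]&0xf0)>>2)."""
    total_length = int.from_bytes(ip_packet[2:4], "big")  # ip[2:2]: IPv4 total length
    ip_header_len = (ip_packet[0] & 0x0F) << 2             # IHL in 32-bit words, times 4
    tcp = ip_packet[ip_header_len:]                        # TCP header follows the IP header
    tcp_header_len = (tcp[12] & 0xF0) >> 2                 # data offset (high nibble), times 4
    return total_length - ip_header_len - tcp_header_len


def make_packet(payload: bytes) -> bytes:
    """Build a minimal, hypothetical IPv4+TCP packet (20-byte headers, no options)."""
    ip_header = bytearray(20)
    ip_header[0] = 0x45                                      # version 4, IHL 5 words
    ip_header[2:4] = (40 + len(payload)).to_bytes(2, "big")  # total length in bytes
    tcp_header = bytearray(20)
    tcp_header[12] = 0x50                                    # data offset 5 words
    return bytes(ip_header) + bytes(tcp_header) + payload


ack_only = make_packet(b"")          # would be dropped by the filter
query = make_packet(b"SELECT 1")     # would be captured by the filter
print(tcp_payload_length(ack_only), tcp_payload_length(query))
```

       A packet is captured only when the computed length is nonzero, which
       is why ack-only packets never reach the output file.
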
       Extract individual response times, sorted by end time:

        mk-tcp-model /path/to/tcp-file.txt > requests.txt

       Sort the result by arrival time, for input to the next step:

        sort -n -k1,1 requests.txt > sorted.txt

       Slice the result into 10-second intervals and emit throughput,
       concurrency, and response time metrics for each interval:

        mk-tcp-model --type=requests --run-time=10 sorted.txt > sliced.txt

       Transform the result for modeling with Aspersa's usl tool, discarding
       the first and last lines of each file if you specify multiple files
       (the first and last lines are normally incomplete observation periods
       and are aberrant):

        for f in sliced.txt; do
           tail -n +2 "$f" | head -n -1 | awk '{print $2, $3, $7/$4}'
        done > usl-input.txt

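       The last step selects concurrency (column 2), throughput (column 3),
       and weighted_time divided by arrivals (column 7 over column 4) from
       each interval.  A rough Python equivalent is sketched here for
       readers who want to see the column mapping spelled out.  The function
       name usl_rows and the sample numbers are made up for illustration,
       and skipping intervals with zero arrivals is a choice of this sketch
       (awk would stop with a division-by-zero error instead).

```python
def usl_rows(sliced_lines):
    """Drop the first and last interval, then return
    (concurrency, throughput, weighted_time / arrivals) per remaining line."""
    rows = []
    for line in sliced_lines[1:-1]:        # same effect as tail -n +2 | head -n -1
        cols = line.split()
        concurrency = float(cols[1])       # awk $2
        throughput = float(cols[2])        # awk $3
        arrivals = float(cols[3])          # awk $4
        weighted_time = float(cols[6])     # awk $7
        if arrivals == 0:                  # sketch-only choice: skip empty intervals
            continue
        rows.append((concurrency, throughput, weighted_time / arrivals))
    return rows


# Hypothetical sliced output: ts, concurrency, throughput, arrivals, completions,
# busy_time, weighted_time, sum_time, variance_mean, quantile_time, obs_time
sample = [
    "1301957860 0.50 10.0 5 5 0.4 0.25 0.2 0.001 0.05 0.5",
    "1301957870 2.00 100.0 1000 1000 9.0 20.0 19.5 0.002 0.04 10",
    "1301957880 4.00 180.0 1800 1800 9.5 45.0 44.0 0.003 0.06 10",
    "1301957890 1.00 20.0 60 60 2.0 3.0 2.9 0.001 0.05 3",
]
for row in usl_rows(sample):
    print(*row)
```
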

RISKS

       The following section is included to inform users about the potential
       risks, whether known or unknown, of using this tool.  The two main
       categories of risks are those created by the nature of the tool (e.g.
       read-only tools vs. read-write tools) and those created by bugs.

       mk-tcp-model merely reads and transforms its input, printing it to the
       output.  It should be very low risk.

       At the time of this release, we know of no bugs that could cause
       serious harm to users.

       The authoritative source for updated information is always the online
       issue tracking system.  Issues that affect this tool will be marked as
       such.  You can see a list of such issues at the following URL:
       <http://www.maatkit.org/bugs/mk-tcp-model>.

       See also "BUGS" for more information on filing bugs and getting help.

DESCRIPTION

       This tool recognizes requests and responses in a TCP stream, and
       extracts the "conversations".  You can use it to capture the response
       times of individual queries to a database, for example.  It expects the
       TCP input to be in the following format, which should result from the
       sample shown in the SYNOPSIS:

        <date> <time.microseconds> IP <IP.port> > <IP.port>: <junk>

       The tool watches for "incoming" packets to the port you specify with
       the "--watch-server" option.  This begins a request.  If multiple
       inbound packets follow each other, then by default the last inbound
       packet seen determines the time at which the request is assumed to
       begin.  This is logical if one assumes that a server must receive the
       whole SQL statement before beginning execution, for example.

       When the first outbound packet is seen, the server is considered to
       have responded to the request.  The tool might see an inbound packet,
       but never see a response.  This can happen when the kernel drops
       packets, for example.  As a result, the tool never prints a request
       unless it sees the response to it.  However, the tool actually does not
       print any request until it sees the "last" outbound packet.  It
       determines this by waiting for either another inbound packet, or EOF,
       and then considers the previous inbound/outbound pair to be complete.
       As a result, the tool prints requests in a relatively random order.
       Most types of analysis require processing in either arrival or
       completion order.  Therefore, the second type of processing this tool
       can do requires that you sort the output from the first stage and
       supply it as input.

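       The pairing rules described above can be sketched in Python as
       follows.  This is only an illustration of the documented behavior,
       not the tool's actual code: the names parse_line and pair_requests
       are hypothetical, and tracking state per client address is an
       assumption of the sketch.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(\S+)\.(\d+) > (\S+)\.(\d+): "
)
EPOCH = datetime(1970, 1, 1)


def parse_line(line):
    """Return (seconds, src_ip, src_port, dst_ip, dst_port), or None if no match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    seconds = (stamp - EPOCH).total_seconds()   # timezone-independent offset
    return seconds, m.group(2), int(m.group(3)), m.group(4), int(m.group(5))


def pair_requests(lines, watch_port=3306):
    """Return (client, start, end) for every request that received a response."""
    pending = {}   # client "ip:port" -> {"start": seconds, "end": seconds or None}
    done = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, src, sport, dst, dport = parsed
        if dport == watch_port:                      # inbound: client -> server
            client = f"{src}:{sport}"
            state = pending.get(client)
            if state and state["end"] is not None:   # previous pair is now complete
                done.append((client, state["start"], state["end"]))
                state = None
            if state is None:
                pending[client] = {"start": ts, "end": None}
            else:
                state["start"] = ts                  # a later inbound packet moves the start
        elif sport == watch_port:                    # outbound: server -> client
            state = pending.get(f"{dst}:{dport}")
            if state and state["end"] is None:
                state["end"] = ts                    # first outbound packet ends the request
    for client, state in pending.items():            # EOF: flush answered requests only
        if state["end"] is not None:
            done.append((client, state["start"], state["end"]))
    return done


capture = [
    "2009-04-12 09:50:16.000000 IP 10.0.0.5.42167 > 10.0.0.1.3306: tcp 37",
    "2009-04-12 09:50:16.100000 IP 10.0.0.5.42167 > 10.0.0.1.3306: tcp 20",
    "2009-04-12 09:50:16.350000 IP 10.0.0.1.3306 > 10.0.0.5.42167: tcp 11",
    "2009-04-12 09:50:16.400000 IP 10.0.0.1.3306 > 10.0.0.5.42167: tcp 11",
    "2009-04-12 09:50:17.000000 IP 10.0.0.6.5000 > 10.0.0.1.3306: tcp 5",
]
for client, start, end in pair_requests(capture):
    print(client, f"{end - start:.6f}")
```

       In the sample capture, the second client never receives a response,
       so only the first client's request is reported.
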
       The second type of processing is selected with the "--type" option set
       to "requests".  In this mode, the tool reads a group of requests and
       aggregates them, then emits the aggregated metrics.

OUTPUT

       In the default mode (parsing tcpdump output), requests are printed out
       one per line, in the following format:

        <id> <start> <end> <elapsed> <IP:port>

       The ID is an incrementing number, assigned in arrival order in the
       original TCP traffic.  The start and end timestamps, and the elapsed
       time, can be customized with the "--start-end" option.

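       Because the ID is assigned in arrival order, sorting on the first
       column restores arrival order.  The Python sketch below reads lines
       in this format and sorts them that way.  It assumes the timestamps
       and elapsed time are printed as plain decimal seconds; the Request
       type, the function names, and the sample values are hypothetical.

```python
from typing import NamedTuple


class Request(NamedTuple):
    id: int
    start: float
    end: float
    elapsed: float
    client: str


def parse_request(line):
    """Split one '<id> <start> <end> <elapsed> <IP:port>' line into a Request."""
    rid, start, end, elapsed, client = line.split()
    return Request(int(rid), float(start), float(end), float(elapsed), client)


def in_arrival_order(lines):
    """Return the requests sorted by ID, which is the arrival order."""
    return sorted((parse_request(l) for l in lines if l.strip()),
                  key=lambda r: r.id)


# Made-up sample lines, deliberately out of arrival order.
sample_requests = [
    "2 10.50 10.75 0.25 10.0.0.5:42167",
    "0 10.00 12.00 2.00 10.0.0.6:5000",
    "1 10.25 10.50 0.25 10.0.0.7:5001",
]
for req in in_arrival_order(sample_requests):
    print(req.id, req.elapsed, req.client)
```
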
       In "--type=requests" mode, the tool prints out one line per time
       interval as defined by "--run-time", with the following columns: ts,
       concurrency, throughput, arrivals, completions, busy_time,
       weighted_time, sum_time, variance_mean, quantile_time, obs_time.  A
       detailed explanation follows:

       ts  The timestamp that defines the beginning of the interval.

       concurrency
           The average number of requests resident in the server during the
           interval.

       throughput
           The number of arrivals per second during the interval.

       arrivals
           The number of arrivals during the interval.

       completions
           The number of completions during the interval.

       busy_time
           The total amount of time during which at least one request was
           resident in the server during the interval.

       weighted_time
           The total response time of all the requests resident in the server
           during the interval, including requests that neither arrived nor
           completed during the interval.

       sum_time
           The total response time of all the requests that arrived in the
           interval.

       variance_mean
           The variance-to-mean ratio (index of dispersion) of the response
           times of the requests that arrived in the interval.

       quantile_time
           The Nth percentile response time for all the requests that arrived
           in the interval.  See also "--quantile".

       obs_time
           The length of the observation time window.  This will usually be
           the same as the interval length, except for the first and last
           intervals in a file, which might have a shorter observation time.

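       To make some of these definitions concrete, the following Python
       sketch computes a subset of the columns for one interval.  It is one
       plausible reading of the definitions above, not the tool's own
       arithmetic.  In particular, it assumes that weighted_time counts only
       the time each request spent inside the interval, that concurrency is
       weighted_time divided by the interval length, and that the variance
       is the population variance.  The function name summarize_interval is
       hypothetical.

```python
from statistics import mean, pvariance


def summarize_interval(requests, t0, length):
    """requests: list of (start, end) pairs in seconds; interval is [t0, t0 + length)."""
    t1 = t0 + length
    arrived = [(s, e) for s, e in requests if t0 <= s < t1]
    completions = sum(1 for s, e in requests if t0 <= e < t1)
    # Time each request was resident inside the interval, clipped at both edges.
    resident = sum(max(0.0, min(e, t1) - max(s, t0)) for s, e in requests)
    elapsed = [e - s for s, e in arrived]
    avg = mean(elapsed) if elapsed else 0.0
    return {
        "arrivals": len(arrived),
        "completions": completions,
        "throughput": len(arrived) / length,       # arrivals per second
        "weighted_time": resident,
        "concurrency": resident / length,          # average requests in the server
        "sum_time": sum(elapsed),
        "variance_mean": pvariance(elapsed) / avg if avg > 0 else 0.0,
    }


# Made-up requests around a 10-second interval starting at t = 0.
example = [(-2.0, 1.0), (1.0, 3.0), (4.0, 8.0), (9.0, 12.0)]
for name, value in summarize_interval(example, 0.0, 10.0).items():
    print(name, value)
```

       The first request arrived before the interval and the last one
       completes after it, so both contribute to weighted_time and
       concurrency without being counted as both an arrival and a
       completion.
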

OPTIONS

       This tool accepts additional command-line arguments.  Refer to the
       "SYNOPSIS" and usage information for details.

       --config
           type: Array

           Read this comma-separated list of config files; if specified, this
           must be the first option on the command line.

       --help
           Show help and exit.

       --progress
           type: array; default: time,30

           Print progress reports to STDERR.  The value is a comma-separated
           list with two parts.  The first part can be percentage, time, or
           iterations; the second part specifies how often an update should be
           printed, in percentage, seconds, or number of iterations.

       --quantile
           type: float

           The percentile for the last column when "--type" is "requests"
           (default .99).

       --run-time
           type: float

           The size of the aggregation interval in seconds when "--type" is
           "requests" (default 1).  Fractional values are permitted.

       --start-end
           type: Array; default: ts,end

           Define how the arrival and completion timestamps of a query, and
           thus its response time (elapsed time), are computed.  Recall that
           there may be multiple inbound and outbound packets per request and
           response, and refer to the following ASCII diagram.  Suppose that a
           client sends a series of three inbound (I) packets to the server,
           which computes the result and then sends two outbound (O) packets
           back:

             I I    I ..................... O    O
             |<---->|<---response time----->|<-->|
             ts0    ts                      end  end1

           By default, the query is considered to arrive at time ts, and
           complete at time end.  However, this might not be what you want.
           Perhaps you do not want to consider the query to have completed
           until time end1.  You can accomplish this by setting this option to
           "ts,end1".

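       The effect of the two values named above can be sketched in Python.
       The function name response_time is hypothetical, and the packet times
       are invented.  The sketch maps all four labels from the diagram, but
       only "ts,end" and "ts,end1" are named in this manual as option
       values, so treat any other combination as an assumption.

```python
def response_time(inbound, outbound, start_end="ts,end"):
    """inbound/outbound: packet times in seconds, in the order they were seen.
    Returns (start, end, elapsed) for the chosen pair of labels."""
    marks = {
        "ts0": inbound[0],      # first inbound packet
        "ts": inbound[-1],      # last inbound packet
        "end": outbound[0],     # first outbound packet
        "end1": outbound[-1],   # last outbound packet
    }
    start_name, end_name = start_end.split(",")
    start, end = marks[start_name], marks[end_name]
    return start, end, end - start


inbound = [0.0, 0.25, 0.5]    # three inbound packets (I I I)
outbound = [2.5, 3.0]         # two outbound packets (O O)
print(response_time(inbound, outbound))              # default: ts,end
print(response_time(inbound, outbound, "ts,end1"))   # wait for the last outbound packet
```
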
       --type
           type: string

           The type of input to parse (default tcpdump).  The permitted types
           are

           tcpdump
               The parser expects the input to be formatted with the following
               options: "-x -n -q -tttt".  For example, if you want to capture
               output from your local machine, you can do something like the
               following (the port must come last on FreeBSD):

                 tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 \
                   > mysql.tcp.txt
                 mk-tcp-model --type tcpdump mysql.tcp.txt

               The other tcpdump parameters, such as -s, -c, and -i, are up to
               you.  Just make sure the output looks like this (there is a
               line break in the first line to avoid man-page problems):

                 2009-04-12 09:50:16.804849 IP 127.0.0.1.42167
                        > 127.0.0.1.3306: tcp 37

               All MySQL servers running on port 3306 are automatically
               detected in the tcpdump output.  Therefore, if the tcpdump
               output contains packets from multiple servers on port 3306 (for
               example, 10.0.0.1:3306, 10.0.0.2:3306, etc.), all
               packets/queries from all these servers will be analyzed
               together as if they were one server.

               If you're analyzing traffic for a protocol that is not running
               on port 3306, see "--watch-server".

       --version
           Show version and exit.

       --watch-server
           type: string; default: 10.10.10.10:3306

           This option tells mk-tcp-model which server IP address and port
           (such as "10.0.0.1:3306") to watch when parsing tcpdump for
           "--type" tcpdump.  If you don't specify it, the tool watches all
           servers by looking for any IP address using port 3306.  If you're
           watching a server with a non-standard port, this won't work, so you
           must specify the IP address and port to watch.

           Currently, IP address filtering isn't implemented, so even though
           you must specify the option in IP:port form, the tool ignores the
           IP and only looks at the port number.

DOWNLOADING

       You can download Maatkit from Google Code at
       <http://code.google.com/p/maatkit/>, or you can get any of the tools
       easily with a command like the following:

          wget http://www.maatkit.org/get/toolname
          or
          wget http://www.maatkit.org/trunk/toolname

       Where "toolname" can be replaced with the name (or fragment of a name)
       of any of the Maatkit tools.  Once downloaded, they're ready to run; no
       installation is needed.  The first URL gets the latest released version
       of the tool, and the second gets the latest trunk code from Subversion.

ENVIRONMENT

       The environment variable "MKDEBUG" enables verbose debugging output in
       all of the Maatkit tools:

          MKDEBUG=1 mk-....

SYSTEM REQUIREMENTS

       You need Perl and some core packages that ought to be installed in any
       reasonably new version of Perl.

BUGS

       For a list of known bugs see
       <http://www.maatkit.org/bugs/mk-tcp-model>.

       Please use Google Code Issues and Groups to report bugs or request
       support: <http://code.google.com/p/maatkit/>.  You can also join
       #maatkit on Freenode to discuss Maatkit.

       Please include the complete command-line used to reproduce the problem
       you are seeing, the version of all MySQL servers involved, the complete
       output of the tool when run with "--version", and if possible,
       debugging output produced by running with the "MKDEBUG=1" environment
       variable.

COPYRIGHT, LICENSE AND WARRANTY

       This program is copyright 2011 Baron Schwartz.  Feedback and
       improvements are welcome.

       THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation, version 2; OR the Perl Artistic License.  On
       UNIX and similar systems, you can issue `man perlgpl' or `man
       perlartistic' to read these licenses.

       You should have received a copy of the GNU General Public License along
       with this program; if not, write to the Free Software Foundation, Inc.,
       59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.

AUTHOR

       Baron Schwartz

ABOUT MAATKIT

       This tool is part of Maatkit, a toolkit for power users of MySQL.
       Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the
       primary code contributors.  Both are employed by Percona.  Financial
       support for Maatkit development is primarily provided by Percona and
       its clients.

VERSION

       This manual page documents Ver 1.0.1 Distrib 7540 $Revision: 7536 $.

perl v5.32.0                      2020-07-28                   MK-TCP-MODEL(1)