fiologparser_hist.py(1)     General Commands Manual    fiologparser_hist.py(1)

NAME

       fiologparser_hist.py - Calculate statistics from fio histograms

SYNOPSIS

       fiologparser_hist.py [options] [clat_hist_files]...

DESCRIPTION

       fiologparser_hist.py is a utility for converting *_clat_hist* files
       generated by fio into a CSV of latency statistics: minimum, average,
       and maximum latency, plus the 50th, 90th, 95th, and 99th percentiles.

EXAMPLES

       $ fiologparser_hist.py *_clat_hist*
       end-time, samples, min, avg, median, 90%, 95%, 99%, max
       1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
       2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
       4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
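
       Because the output is plain CSV with a header row, it can be
       post-processed with standard tools. The following is a minimal sketch,
       not part of fiologparser_hist.py, that parses the sample output above
       with Python's csv module. The names SAMPLE_OUTPUT and parse_stats are
       illustrative assumptions.

```python
import csv
import io

# Sample output copied from the example above.
SAMPLE_OUTPUT = """\
end-time, samples, min, avg, median, 90%, 95%, 99%, max
1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
"""


def parse_stats(text):
    """Return one dict per reported interval, with every value as a float."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = parse_stats(SAMPLE_OUTPUT)
for row in rows:
    print(int(row["end-time"]), int(row["samples"]), row["median"])
```

       In practice the text would come from the tool's standard output or a
       saved file rather than an inline string.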

OPTIONS

       --help Print these options.

       --buff_size=int
              Number of samples to buffer into numpy at a time. Default is
              10,000. This can be adjusted to help performance.

       --max_latency=int
              Number of seconds of data to process at a time. Defaults to 20
              seconds, in order to handle the 17 second upper bound on latency
              in histograms reported by fio. This should be increased if fio
              has been run with a larger maximum latency. Lowering this when a
              lower maximum latency is known can improve performance. See
              NOTES for more details.

       -i, --interval=int
              Interval at which statistics are reported. Defaults to 1000 ms.
              This should be set to at least the value of log_hist_msec as
              given to fio.

       -d, --divisor=int
              Divide statistics by this value. Defaults to 1. Useful if you
              want to convert latencies from milliseconds to seconds
              (divisor=1000).

       --warn Enables warning messages printed to stderr, useful for
              debugging.

       --group_nr=int
              Set this to the value of FIO_IO_U_PLAT_GROUP_NR as defined in
              stat.h if fio has been recompiled. Defaults to 19, the current
              value used in fio. See NOTES for more details.

NOTES

       end-times are calculated to be uniform increments of the --interval
       value given, regardless of when histogram samples are reported. Of
       note:

              Intervals with no samples are omitted. In the example above this
              means "no statistics from 2 to 3 seconds" and "39 samples
              influenced the statistics of the interval from 3 to 4 seconds".

              Intervals with a single sample will have the same value for all
              statistics.

       The number of samples is unweighted, corresponding to the total number
       of samples which have any effect whatsoever on the interval.
       Min statistics are computed using the value of the lower boundary of
       the first bin (in increasing bin order) with non-zero samples in it.
       Similarly for max, we take the upper boundary of the last bin with
       non-zero samples in it. This is semantically identical to taking the
       0th and 100th percentiles with a 50% bin-width buffer (because
       percentiles are computed using mid-points of the bins). This enforces
       the following nice properties:

              min <= 50th <= 90th <= 95th <= 99th <= max

              min and max are strict lower and upper bounds on the actual min
              / max seen by fio (and reported in *_clat.* with averaging
              turned off).
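
       As an illustration of the min / max rule, the sketch below picks the
       lower edge of the first non-empty bin and the upper edge of the last
       non-empty bin. The function name and the bin edges are hypothetical
       and do not reflect fio's real bin boundaries.

```python
import numpy as np


def hist_min_max(counts, lower_edges, upper_edges):
    """Return (min, max) bounds from one histogram, or None if it is empty.

    counts      -- number of samples in each bin, in increasing bin order
    lower_edges -- lower latency boundary of each bin
    upper_edges -- upper latency boundary of each bin
    """
    nonzero = np.flatnonzero(np.asarray(counts))  # indices of bins holding samples
    if nonzero.size == 0:
        return None  # an empty histogram has no min or max
    return lower_edges[nonzero[0]], upper_edges[nonzero[-1]]


# Hypothetical 4-bin histogram with edges 0, 10, 20, 30, 40.
lower = [0, 10, 20, 30]
upper = [10, 20, 30, 40]
print(hist_min_max([0, 3, 5, 0], lower, upper))  # (10, 30)
```

       Because the bounds come from bin edges rather than mid-points, they
       bracket any percentile computed from the same bins.
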
       Average statistics use a standard weighted arithmetic mean.

       Percentile statistics are computed using the weighted percentile method
       as described here:

              https://en.wikipedia.org/wiki/Percentile#Weighted_percentile

       See the weights() method for details on how weights are computed for
       individual samples. In process_interval() we further multiply by the
       height of each bin to get weighted histograms.
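
       A minimal sketch of both statistics follows. It uses the
       cumulative-weight form of the weighted percentile with linear
       interpolation, which is one common reading of the method linked
       above. The function names and sample data are illustrative, and the
       tool's own implementation may differ in detail.

```python
import numpy as np


def weighted_average(values, weights):
    """Weighted arithmetic mean of the given values."""
    return np.average(values, weights=weights)


def weighted_percentile(percs, values, weights):
    """Weighted percentiles via linear interpolation of the weighted CDF."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Percent rank of each value, centred on its own weight.
    cdf = 100.0 * (weights.cumsum() - weights / 2.0) / weights.sum()
    return np.interp(percs, cdf, values)


# Hypothetical bin mid-points and their weights.
mids = [100.0, 200.0, 300.0]
wts = [1.0, 1.0, 1.0]
print(weighted_average(mids, wts))
print(weighted_percentile([50, 90, 95, 99], mids, wts))
```

       Requesting the percentiles in increasing order returns a
       non-decreasing array, since np.interp is monotonic when both the CDF
       and the sorted values are.
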
       We convert the files given on the command line, which are assumed to
       be fio histogram files. An individual histogram file can contain the
       histograms for multiple r/w directions (notably when --rw=randrw).
       This is accounted for by tracking each r/w direction separately. In
       the statistics reported we ultimately merge *all* histograms
       (regardless of r/w direction).

       The value of *_GROUP_NR in stat.h (and *_BITS) determines how many
       latency bins fio outputs when histogramming is enabled. Namely for the
       current default of GROUP_NR=19, we get 1,216 bins with a maximum
       latency of approximately 17 seconds. For certain applications this may
       not be sufficient. With GROUP_NR=24 we have 1,536 bins, giving us a
       maximum latency of 541 seconds (~ 9 minutes). If you expect your
       application to experience latencies greater than 17 seconds, you will
       need to recompile fio with a larger GROUP_NR, e.g. with:

              sed -i.bak 's/^#define FIO_IO_U_PLAT_GROUP_NR 19$/#define FIO_IO_U_PLAT_GROUP_NR 24/g' stat.h
              make fio

       Quick reference table for the max latency corresponding to a sampling
       of values for GROUP_NR:

              GROUP_NR | # bins | max latency bin value
              19       | 1216   | 16.9 sec
              20       | 1280   | 33.8 sec
              21       | 1344   | 67.6 sec
              22       | 1408   | 2  min, 15 sec
              23       | 1472   | 4  min, 32 sec
              24       | 1536   | 9  min, 4  sec
              25       | 1600   | 18 min, 8  sec
              26       | 1664   | 36 min, 16 sec
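
       The "# bins" column is consistent with every group contributing 64
       bins, that is 2**6. The sketch below reproduces that column on the
       assumption that the *_BITS constant mentioned above equals 6. The
       function name is hypothetical, and the max latency column is not
       modelled here.

```python
def num_bins(group_nr, bits=6):
    """Histogram bin count for a GROUP_NR, assuming 2**bits bins per group."""
    return group_nr * (1 << bits)


for group_nr in range(19, 27):
    print(group_nr, num_bins(group_nr))
```
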
       At present this program automatically detects the number of histogram
       bins in the log files, and adjusts the bin latency values accordingly.
       In particular if you use the --log_hist_coarseness parameter of fio,
       you get output files with a number of bins according to the following
       table (note that the first row is identical to the table above):

              coarse \ GROUP_NR
                        19     20     21     22     23     24     25     26
                 -------------------------------------------------------
                0  [[ 1216,  1280,  1344,  1408,  1472,  1536,  1600,  1664],
                1   [  608,   640,   672,   704,   736,   768,   800,   832],
                2   [  304,   320,   336,   352,   368,   384,   400,   416],
                3   [  152,   160,   168,   176,   184,   192,   200,   208],
                4   [   76,    80,    84,    88,    92,    96,   100,   104],
                5   [   38,    40,    42,    44,    46,    48,    50,    52],
                6   [   19,    20,    21,    22,    23,    24,    25,    26],
                7   [  N/A,    10,   N/A,    11,   N/A,    12,   N/A,    13],
                8   [  N/A,     5,   N/A,   N/A,   N/A,     6,   N/A,   N/A]]

       For other values of GROUP_NR and coarseness, this table can be computed
       like this:

              import numpy as np

              bins = [1216, 1280, 1344, 1408, 1472, 1536, 1600, 1664]
              max_coarse = 8
              # Halve the bin count per coarseness level; NaN where it does not divide evenly.
              fncn = lambda z: [z / 2**x if z % 2**x == 0 else np.nan for x in range(max_coarse + 1)]
              np.transpose([fncn(z) for z in bins])
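
       Read in the other direction, the same table lets a bin count observed
       in a log file be mapped back to a coarseness and GROUP_NR. The
       following is a hedged sketch of such a lookup, not the detection code
       used by the tool. The function name is hypothetical, and it assumes
       64 bins per group at coarseness 0.

```python
def guess_group_nr(num_cols, group_nrs=range(19, 27), max_coarse=8, bits=6):
    """Return every (coarseness, GROUP_NR) pair that yields num_cols bins."""
    matches = []
    for group_nr in group_nrs:
        full = group_nr * (1 << bits)  # bin count at coarseness 0
        for coarse in range(max_coarse + 1):
            # Each coarseness level halves the bin count; skip uneven splits.
            if full % 2**coarse == 0 and full // 2**coarse == num_cols:
                matches.append((coarse, group_nr))
    return matches


print(guess_group_nr(1216))  # [(0, 19)]
print(guess_group_nr(96))    # [(4, 24)]
```

       An empty list means the bin count matches no entry of the table for
       the GROUP_NR values searched.
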
       If you have not adjusted GROUP_NR for your (high latency) application,
       then you will see the percentiles computed by this tool max out at the
       max latency bin value as in the first table above, and in this plot
       (where GROUP_NR=19 and thus we see a max latency of ~16.7 seconds in
       the red line):

              https://www.cronburg.com/fio/max_latency_bin_value_bug.png

       The motivation, design decisions, and implementation process are
       described in further detail here:

              https://www.cronburg.com/fio/cloud-latency-problem-measurement/

AUTHOR

       fiologparser_hist.py and this manual page were written by Karl Cronburg
       <karl.cronburg@gmail.com>.

REPORTING BUGS

       Report bugs to the fio mailing list <fio@vger.kernel.org>.

                                August 18, 2016        fiologparser_hist.py(1)