fiologparser_hist.py(1)

fiologparser_hist.py(1)     General Commands Manual    fiologparser_hist.py(1)

NAME

       fiologparser_hist.py - Calculate statistics from fio histograms

SYNOPSIS

       fiologparser_hist.py [options] [clat_hist_files]...

DESCRIPTION

       fiologparser_hist.py is a utility for converting *_clat_hist* files
       generated by fio into a CSV of latency statistics including minimum,
       average, maximum latency, and selectable percentiles.

EXAMPLES

       $ fiologparser_hist.py *_clat_hist*
       end-time, samples, min, avg, median, 90%, 95%, 99%, max
       1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
       2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
       4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
       ...

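       The CSV output shown above can be post-processed with a few lines of
       Python. The following is a minimal sketch and not part of
       fiologparser_hist.py: the helper name parse_stats is hypothetical, and
       it assumes the default column layout from the example (no 'dir'
       column, every field numeric).

```python
import csv
import io

def parse_stats(text):
    """Parse fiologparser_hist.py CSV output into a list of dicts.

    Hypothetical helper for illustration: column names come from the
    header row, and every value is converted to float.
    """
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    return [dict(zip(header, map(float, row))) for row in reader if row]

# Sample rows copied from the example output above.
sample = """end-time, samples, min, avg, median, 90%, 95%, 99%, max
1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
"""

rows = parse_stats(sample)
for row in rows:
    print(int(row["end-time"]), row["median"], row["99%"])
```

       Keying each row by the header names keeps the sketch usable when a
       different --percentiles list changes the set of columns.
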

OPTIONS

       --help Print these options.

       --buff_size=int
              Number of samples to buffer into numpy at a time. Default is
              10,000. This can be adjusted to help performance.

       --max_latency=int
              Number of seconds of data to process at a time. Defaults to 20
              seconds, in order to handle the 17 second upper bound on latency
              in histograms reported by fio. This should be increased if fio
              has been run with a larger maximum latency. Lowering this when a
              lower maximum latency is known can improve performance. See
              NOTES for more details.

       -i, --interval=int
              Interval at which statistics are reported. Defaults to 1000 ms.
              This should be set to at least the value of log_hist_msec as
              given to fio.

       --noweight
              Do not perform weighting of samples between output intervals.
              Default is False.

       -d, --divisor=int
              Divide statistics by this value. Defaults to 1. Useful if you
              want to convert latencies from milliseconds to seconds
              (divisor=1000).

       --warn Enables warning messages printed to stderr, useful for
              debugging.

       --group_nr=int
              Set this to the value of FIO_IO_U_PLAT_GROUP_NR as defined in
              stat.h if fio has been recompiled. Defaults to 19, the current
              value used in fio. See NOTES for more details.

       --percentiles=str
              Pass a comma- or colon-separated list of percentiles to print.
              The default is "90.0:95.0:99.0", but the min, median (50%) and
              max percentiles are always printed.

       --usbin
              Indicates to the parser that histogram bin latency values are
              in microseconds. The default is nanoseconds, but histogram logs
              from fio versions <= 2.99 are in microseconds.

       --directions=str
              By default, the histogram bins of all directions (e.g. read and
              write) are combined, producing one 'mixed' result. To produce
              independent directional results, pass some combination of the
              'rwtm' characters, e.g. --directions=rwtm. A 'dir' column is
              added indicating the result direction for a row.

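       The list format accepted by --percentiles (comma or colon separated)
       can be illustrated with a short sketch. This is not the tool's own
       parsing code: parse_percentiles is a hypothetical name, and the range
       check is an assumption added for illustration.

```python
import re

def parse_percentiles(spec="90.0:95.0:99.0"):
    """Split a comma- or colon-separated percentile list into floats.

    Hypothetical helper; the default mirrors the documented default of
    the --percentiles option.
    """
    values = [float(tok) for tok in re.split(r"[,:]", spec) if tok.strip()]
    for value in values:
        # Assumption: percentiles outside [0, 100] are rejected.
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"percentile out of range: {value}")
    return values

print(parse_percentiles())                 # [90.0, 95.0, 99.0]
print(parse_percentiles("50,99.9:99.99"))  # [50.0, 99.9, 99.99]
```
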

NOTES

       End-times are calculated to be uniform increments of the --interval
       value given, regardless of when histogram samples are reported. Of
       note:

              Intervals with no samples are omitted. In the example above this
              means "no statistics from 2 to 3 seconds" and "39 samples
              influenced the statistics of the interval from 3 to 4 seconds".

              Intervals with a single sample will have the same value for all
              statistics.

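       The two points above can be pictured with a deliberately simplified
       sketch. It assumes each sample carries a single timestamp and counts
       toward exactly one interval, which ignores the weighting of samples
       that straddle interval boundaries; interval_end_times is a
       hypothetical name and this is not how the tool itself is implemented.

```python
import math
from collections import Counter

def interval_end_times(sample_times_ms, interval_ms=1000):
    """Group sample timestamps under uniform interval end-times.

    Simplifying assumption: a sample at time t belongs to the interval
    whose end-time is the first multiple of interval_ms that is >= t.
    Intervals that receive no samples never appear in the result.
    """
    counts = Counter(
        math.ceil(t / interval_ms) * interval_ms for t in sample_times_ms
    )
    return sorted(counts.items())

# No sample falls between 2000 ms and 3000 ms, so there is no 3000 row.
print(interval_end_times([250, 900, 1500, 3200, 3999]))
# [(1000, 2), (2000, 1), (4000, 2)]
```
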
       The number of samples is unweighted, corresponding to the total number
       of samples which have any effect whatsoever on the interval.

       Min statistics are computed using the value of the lower boundary of
       the first bin (in increasing bin order) with non-zero samples in it.
       Similarly for max, we take the upper boundary of the last bin with
       non-zero samples in it. This is semantically identical to taking the
       0th and 100th percentiles with a 50% bin-width buffer (because
       percentiles are computed using mid-points of the bins). This enforces
       the following nice properties:

              min <= 50th <= 90th <= 95th <= 99th <= max

              min and max are strict lower and upper bounds on the actual min
              / max seen by fio (and reported in *_clat.* with averaging
              turned off).

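       The min / max rule described above can be sketched as follows. The
       bin edges and counts are made-up illustration values rather than real
       fio bin boundaries, and hist_min_max is a hypothetical name.

```python
def hist_min_max(lower_edges, upper_edges, counts):
    """Return (min, max) for one histogram.

    min is the lower edge of the first non-empty bin and max is the
    upper edge of the last non-empty bin, as described in the text.
    Returns None when every bin is empty.
    """
    nonzero = [i for i, count in enumerate(counts) if count > 0]
    if not nonzero:
        return None
    return lower_edges[nonzero[0]], upper_edges[nonzero[-1]]

# Illustrative bins: [0,10), [10,20), [20,30), [30,40), [40,50)
lower = [0, 10, 20, 30, 40]
upper = [10, 20, 30, 40, 50]
counts = [0, 3, 0, 5, 0]

lo, hi = hist_min_max(lower, upper, counts)
print(lo, hi)  # 10 40
```

       Because percentiles are computed from bin mid-points (15 and 35 for
       the two non-empty bins in this example), every percentile falls
       between the reported min and max.
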
       Average statistics use a standard weighted arithmetic mean.

       When the --noweight option is false (the default), percentile
       statistics are computed using the weighted percentile method as
       described here:
       https://en.wikipedia.org/wiki/Percentile#Weighted_percentile. See the
       weights() method for details on how weights are computed for individual
       samples. In process_interval() we further multiply by the height of
       each bin to get weighted histograms.

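       As an illustration of the weighted mean and weighted percentile
       mentioned above, the following sketch uses one common formulation of
       the weighted percentile: mid-point percent ranks with linear
       interpolation between neighbouring values. It is an assumption that
       this matches the tool's computation in every detail; the function
       names and input values are hypothetical, and at least one weight is
       assumed to be positive.

```python
def weighted_mean(values, weights):
    """Standard weighted arithmetic mean."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

def weighted_percentile(values, weights, pct):
    """Weighted percentile via mid-point percent ranks.

    For sorted values v_k with weights w_k and running sums S_k, the
    percent rank of v_k is 100 * (S_k - w_k / 2) / S_N. Percentiles
    between two ranks are linearly interpolated; percentiles outside
    the first or last rank clamp to the first or last value.
    """
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    ranks, running = [], 0.0
    for _, w in pairs:
        running += w
        ranks.append(100.0 * (running - w / 2.0) / total)
    if pct <= ranks[0]:
        return pairs[0][0]
    if pct >= ranks[-1]:
        return pairs[-1][0]
    for k in range(len(pairs) - 1):
        if ranks[k] <= pct <= ranks[k + 1]:
            frac = (pct - ranks[k]) / (ranks[k + 1] - ranks[k])
            return pairs[k][0] + frac * (pairs[k + 1][0] - pairs[k][0])

# Bin mid-points as values, (weighted) bin heights as weights.
mids = [15, 25, 35]
heights = [1, 2, 1]
print(weighted_mean(mids, heights))            # 25.0
print(weighted_percentile(mids, heights, 50))  # 25.0
```
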
       We convert files given on the command line, assumed to be fio histogram
       files. An individual histogram file can contain the histograms for
       multiple different r/w directions (notably when --rw=randrw). This is
       accounted for by tracking each r/w direction separately. In the
       statistics reported we ultimately merge *all* histograms (regardless
       of r/w direction), unless --directions is given.

       The value of *_GROUP_NR in stat.h (and *_BITS) determines how many
       latency bins fio outputs when histogramming is enabled. With the
       current default of GROUP_NR=19, we get 1,216 bins with a maximum
       latency of approximately 17 seconds. For certain applications this may
       not be sufficient. With GROUP_NR=24 we have 1,536 bins, giving us a
       maximum latency of 541 seconds (~ 9 minutes). If you expect your
       application to experience latencies greater than 17 seconds, you will
       need to recompile fio with a larger GROUP_NR, e.g. with:

              sed -i.bak 's/^#define FIO_IO_U_PLAT_GROUP_NR 19/#define FIO_IO_U_PLAT_GROUP_NR 24/g' stat.h
              make fio

       Quick reference table for the max latency corresponding to a sampling
       of values for GROUP_NR:

              GROUP_NR | # bins | max latency bin value
              19       | 1216   | 16.9 sec
              20       | 1280   | 33.8 sec
              21       | 1344   | 67.6 sec
              22       | 1408   | 2  min, 15 sec
              23       | 1472   | 4  min, 32 sec
              24       | 1536   | 9  min, 4  sec
              25       | 1600   | 18 min, 8  sec
              26       | 1664   | 36 min, 16 sec

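       The "# bins" column of the table above follows directly from GROUP_NR.
       The sketch below assumes FIO_IO_U_PLAT_BITS is 6 (64 bins per group),
       which is consistent with the table since 1216 / 19 = 64; num_bins is
       a hypothetical name.

```python
# Assumed value of FIO_IO_U_PLAT_BITS; consistent with 1216 / 19 == 64.
PLAT_BITS = 6

def num_bins(group_nr, plat_bits=PLAT_BITS):
    """Number of histogram bins fio outputs for a given GROUP_NR."""
    return group_nr * (1 << plat_bits)

for group_nr in range(19, 27):
    print(group_nr, num_bins(group_nr))
```
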
       At present this program automatically detects the number of histogram
       bins in the log files, and adjusts the bin latency values accordingly.
       In particular if you use the --log_hist_coarseness parameter of fio,
       you get output files with a number of bins according to the following
       table (note that the first row is identical to the table above):

              coarse \ GROUP_NR
                      19     20    21     22     23     24     25     26
                 -------------------------------------------------------
                0  [[ 1216,  1280,  1344,  1408,  1472,  1536,  1600,  1664],
                1   [  608,   640,   672,   704,   736,   768,   800,   832],
                2   [  304,   320,   336,   352,   368,   384,   400,   416],
                3   [  152,   160,   168,   176,   184,   192,   200,   208],
                4   [   76,    80,    84,    88,    92,    96,   100,   104],
                5   [   38,    40,    42,    44,    46,    48,    50,    52],
                6   [   19,    20,    21,    22,    23,    24,    25,    26],
                7   [  N/A,    10,   N/A,    11,   N/A,    12,   N/A,    13],
                8   [  N/A,     5,   N/A,   N/A,   N/A,     6,   N/A,   N/A]]

       For other values of GROUP_NR and coarseness, this table can be computed
       like this:

              import numpy as np

              bins = [1216,1280,1344,1408,1472,1536,1600,1664]
              max_coarse = 8
              # Halve each bin count once per coarseness level; NaN where it no longer divides evenly.
              fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else np.nan, range(max_coarse + 1)))
              # Transpose so rows are coarseness levels and columns are GROUP_NR values.
              np.transpose(list(map(fncn, bins)))

       If you have not adjusted GROUP_NR for your (high latency) application,
       then you will see the percentiles computed by this tool max out at the
       max latency bin value as in the first table above, and in this plot
       (where GROUP_NR=19 and thus we see a max latency of ~16.7 seconds in
       the red line):

              https://www.cronburg.com/fio/max_latency_bin_value_bug.png

       The motivation, design decisions, and implementation process are
       described in further detail here:

              https://www.cronburg.com/fio/cloud-latency-problem-measurement/

AUTHOR

       fiologparser_hist.py and this manual page were written by Karl Cronburg
       <karl.cronburg@gmail.com>.

REPORTING BUGS

       Report bugs to the fio mailing list <fio@vger.kernel.org>.

                                August 18, 2016        fiologparser_hist.py(1)