RRD_PDPCALC(1)                      rrdtool                     RRD_PDPCALC(1)

NAME
rrd_pdpcalc - PDP inner calculation logics with an example by Tianpeng
Xia

DESCRIPTION
This article explains in detail, and with a worked example, how PDPs
(primary data points) are calculated.

Fundamental knowledge
If you have not yet read the tutorials or man pages, either on the
official site or by other authors, I strongly encourage you to do so.
As said in the description, this article explains only how a PDP is
calculated, not what a PDP is. Please read the following materials to
get a basic understanding of PDPs:

<http://rrdtool.vandenbogaerdt.nl/process.php> - By Alex van den
Bogaerdt. This article explains PDPs clearly and in great detail.
However, its "Normalize interval" section does not describe the
normalization process correctly (that is, it disagrees with the
official behavior, which I confirmed with @oetiker himself). The flaw
is easy to see in the bar charts and is discussed in the "Calculation
logics" section.

<https://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html> - The manual
page for "rrdcreate" on the official site. Its "The HEARTBEAT and the
STEP" section reveals what happens under the hood during PDP
calculation.

The text graph by Don Baarda vividly explains how UNKNOWN data are
produced and how the heartbeat value influences sampling.
Unfortunately, it does not give a clear method by which PDPs are
calculated.

<https://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html> - Another
detailed official tutorial by Alex van den Bogaerdt. Similarly, it only
provides examples in which the data arrive evenly, exactly on the step
boundaries.

If you do not like doing experiments, or do not care that much about
the inner mechanics, you can stop here and turn to more practical
topics such as graph export or the command manuals. But if, like me,
you care just as much about the calculation logic, please read on.

Calculation logics
Here begins the core part of this article. In this section, I present
two versions of the calculation method, one by Alex van den Bogaerdt
and the other by @oetiker.

To keep the explanation ASCII-friendly, I will explain both versions
with the chart below instead of a real image.

    |
    |    (v1)
    |   _____                                             (v4)  (v5)
    |  |     |                        (v3)              ___________
    |  |     |            _____________________________|        |  |
    |  |     |           |                             |        |  |
    |  |     |           |                             |        |  |
    |  |     |   (v2)    |                             |        |  |
    |  |     |___________|                             |        |  |
    ------------------------------------------------------------------>
    0  1     3           7                             17       20 21

The X axis shows time slots (each second is one slot) and the Y axis
shows the value.

Let's make everything a little clearer:

- The step is 5.

- Each PDP is calculated only when a value arrives at or after the last
  slot of that PDP. For instance, the last slot of the PDP from 16 to 20
  is 20.

- The heartbeat is 20, so the sample covering the entire 7-17 period is
  not discarded.

- At second 3, the first value comes in as v1, and so on.

- Second 0 is the origin, and it does not count as a sample.
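
The setup above can be sketched in Python. This is only a minimal model
of the chart, not rrdtool code: the function name slot_values and the
numbers standing in for v1..v5 are assumptions made up for
illustration. It shows how each update fills every slot since the
previous update.

```python
def slot_values(updates, origin=0):
    """Map each one-second slot to the value of the update that covers it.

    An update arriving at second t covers every slot after the previous
    update (or the origin) up to and including t.
    """
    slots = {}
    last = origin
    for t, value in updates:
        for slot in range(last + 1, t + 1):
            slots[slot] = value
        last = t
    return slots


# Hypothetical numbers standing in for v1..v5 in the chart.
v1, v2, v3, v4, v5 = 8, 1, 6, 7, 7
chart = [(3, v1), (7, v2), (17, v3), (20, v4), (21, v5)]
slots = slot_values(chart)

# Slots 1 to 10: three slots of v1, four of v2, then v3.
print([slots[s] for s in range(1, 11)])  # [8, 8, 8, 1, 1, 1, 1, 6, 6, 6]
```

With these stand-in values, slots 1-3 carry v1, slots 4-7 carry v2 and
slots 8-17 carry v3. The origin, second 0, gets no value of its own.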

Bogaerdt version
As can be seen on this page,
<http://rrdtool.vandenbogaerdt.nl/process.php>, after all the primary
data are transformed to rates (except for GAUGE, of course), they have
to go through a normalization process if they are not distributed
exactly according to the step or, in the words of the author, on
well-defined boundaries in time.

What does that mean? Basically, if the known values (as opposed to
unknown ones) cover at least 50% of the slots in a period, then a PDP
is calculated from them.
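
One way to read the 50% rule is sketched below in Python. The function
name pdp_from_slots is hypothetical, and averaging over the known slots
only is my assumption; the text above only says that a PDP is
calculated when at least half of the slots are known.

```python
def pdp_from_slots(window):
    """Return the PDP for one period, or None when too much is unknown.

    window holds one entry per slot; None marks an unknown slot.
    Assumption: the average is taken over the known slots only.
    """
    known = [v for v in window if v is not None]
    if not known or len(known) * 2 < len(window):
        return None  # fewer than 50% of the slots are known
    return sum(known) / len(known)


print(pdp_from_slots([4, 4, 4, None, None]))     # 3 of 5 known: 4.0
print(pdp_from_slots([4, None, None, None, 2]))  # 2 of 5 known: None
```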

This version seems to work well until we reach the bar chart part.

Applied to the ASCII bar chart, it gives the following results.

From second 1 on, the PDP of each period (1-5, 6-10, ...) is computed
by averaging all the values within it. So:

- the PDP from 1 to 5 is (v1*3+v2*2)/5

- the PDP from 6 to 10 is (v2*2+v3*3)/5

- the PDP from 11 to 15 is (v3*5)/5, since the values in slots 11, 12,
  13, 14 and 15 are all the same, namely v3

- ...
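
The per-period averaging listed above can be checked with a short
Python sketch. The function name per_step_pdps and the numbers chosen
for v1..v3 are assumptions for illustration; the slot layout follows
the chart.

```python
def per_step_pdps(slots, step, last_slot):
    """Average each period on its own, one PDP per step."""
    pdps = {}
    for end in range(step, last_slot + 1, step):
        window = [slots[s] for s in range(end - step + 1, end + 1)]
        pdps[end] = sum(window) / step
    return pdps


# Hypothetical values for v1..v3, laid out slot by slot as in the chart.
v1, v2, v3 = 8, 1, 6
slots = {s: v1 for s in range(1, 4)}        # slots 1-3
slots.update({s: v2 for s in range(4, 8)})  # slots 4-7
slots.update({s: v3 for s in range(8, 18)}) # slots 8-17

# Keys are the last slot of each PDP.
print(per_step_pdps(slots, step=5, last_slot=15))
# {5: 5.2, 10: 4.0, 15: 6.0}
```

Here 5.2 is (v1*3+v2*2)/5, 4.0 is (v2*2+v3*3)/5 and 6.0 is (v3*5)/5.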

The official version (also the @oetiker version)
Using the same chart, this version gives the following:

- the PDP from 1 to 5 is (v1*3+v2*2)/5

- the PDPs from 6 to 10 and from 11 to 15 are the SAME, namely
  (v2*2+v3*8)/10

- ...

A comparison and some explanation
The two versions above disagree on the PDPs from 6 to 10 and from 11 to
15.

Why is that?

The difference stems from the way each version groups slots when it
calculates PDP(6-10) and PDP(11-15).

Let's discuss this in more detail using the bar chart above.

Bogaerdt's version

PDPs are always computed individually, no matter how the values arrive.

For example, the value at slot 17 arrives after the last slot of
PDP(11-15), and the value immediately before it arrived at slot 7. All
the slots after 7, up to and including 17, are assigned v3. Since each
PDP is computed individually, PDP(6-10) is (v2*2+v3*3)/5 while
PDP(11-15) is (v3*5)/5.

The official version

PDPs are always computed over all the steps that the next update
completes, be it 1 step, 2 steps or n steps; in other words, several
PDPs may be computed together.

For example, the update at slot 17 spans PDP(6-10) and PDP(11-15),
because the immediately preceding value is at slot 7, which lies
between 6 and 10, and 17 is after 15. PDP(1-5) and PDP(16-20) are not
included: the update at slot 7 has already triggered the calculation of
PDP(1-5), and the update at slot 17 comes before slot 20, the last slot
of PDP(16-20).

That is why PDP(6-10) and PDP(11-15) have the same value,
(v2*2+v3*8)/10, which is the average over all ten slots from 6 to 15.
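
The grouping rule of the official version can be sketched in Python as
follows. This models only the description in this article, not the
rrdtool source. The function name spanned_pdps and the numbers for
v1..v4 are assumptions, and step boundaries are assumed to fall on
multiples of the step.

```python
def spanned_pdps(slots, step, prev_update, update):
    """PDPs completed by one update; they all share a single average.

    prev_update and update are the arrival seconds of two consecutive
    samples. Assumes step boundaries fall on multiples of step.
    """
    first_slot = prev_update // step * step + 1  # first slot still pending
    last_slot = update // step * step            # end of last completed step
    if last_slot < first_slot:
        return {}                                # no step completed yet
    window = [slots[s] for s in range(first_slot, last_slot + 1)]
    average = sum(window) / len(window)
    ends = range(first_slot + step - 1, last_slot + 1, step)
    return {end: average for end in ends}


# Hypothetical values for v1..v4, laid out slot by slot as in the chart.
v1, v2, v3, v4 = 8, 1, 6, 7
slots = {}
for first, last, value in [(1, 3, v1), (4, 7, v2), (8, 17, v3), (18, 20, v4)]:
    for s in range(first, last + 1):
        slots[s] = value

print(spanned_pdps(slots, 5, prev_update=3, update=7))    # {5: 5.2}
print(spanned_pdps(slots, 5, prev_update=7, update=17))   # {10: 5.0, 15: 5.0}
print(spanned_pdps(slots, 5, prev_update=17, update=20))  # {20: 6.6}
```

The second call is the interesting one: the update at 17 completes two
steps at once, so both receive (v2*2+v3*8)/10, which is 5.0 with these
numbers. Averaging the same slots one step at a time would give 4.0 and
6.0 instead.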

If you are still confused, don't worry: an example is here to help you.

Let's get our hands dirty with some commands:

    # Create an RRD with a 5-second step, one GAUGE data source
    # (heartbeat 20, range 0-100) and an AVERAGE RRA of 10 rows
    rrdtool create target.rrd --start 1000000000 --step 5 DS:mem:GAUGE:20:0:100 RRA:AVERAGE:0.5:1:10
    # Feed in samples as timestamp:value pairs
    rrdtool update target.rrd 1000000003:8 1000000006:1 1000000017:6 \
                              1000000020:7 1000000021:7 1000000022:4 \
                              1000000023:3 1000000036:1 1000000037:2 \
                              1000000038:3 1000000039:3 1000000042:5
    # Read the stored PDPs back
    rrdtool fetch target.rrd AVERAGE --start 1000000000 --end 1000000045

The code above contains three commands: create, update and fetch. We
first create a new RRD file, then feed in some data, and finally fetch
all the PDPs from the RRD.

Focus on single steps
To make the explanation detailed, the calculation of each PDP is
worked through below.

Below is the output of the commands above:

    1000000005: 5.2000000000e+00
    1000000010: 5.5000000000e+00
    1000000015: 5.5000000000e+00
    1000000020: 6.6000000000e+00
    1000000025: 1.7333333333e+00
    1000000030: 1.7333333333e+00
    1000000035: 1.7333333333e+00
    1000000040: 2.8000000000e+00
    1000000045: nan
    1000000050: nan

NOTE: 1000000005 means the PDP from 1000000001 to 1000000005, and so
on. For concision and readability, we use only the last two digits
below, so 05 denotes 1000000005. We choose GAUGE as the data source
type because its values are treated as rates as they are, so no
additional transformation is needed; see this article
<http://rrdtool.vandenbogaerdt.nl/process.php> for details.

05: 5.2 = (8*3+1*2)/5

10: 5.5 = (1*1+6*9)/10

15: the same as the previous one

20: 6.6 = (6*2+7*3)/5

25: 1.73333 = (7+4+3+1*12)/15

...

45: nan, as the last value arrives at 42, which does not trigger the
calculation of PDP(41-45)

50: nan; why this unknown PDP is shown is explained in this article
<https://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html>
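
The arithmetic above can be replayed with a small Python sketch of the
official rule as this article describes it. It is a simplified model,
not rrdtool's implementation: the function name official_pdps is
hypothetical, timestamps are given as offsets from 1000000000, the
origin is assumed to sit on a step boundary, and every gap between
updates is assumed to stay within the heartbeat, so no slot is unknown.

```python
def official_pdps(updates, step, origin=0):
    """Replay updates and return {last slot of PDP: value}.

    Simplified model: all steps completed by one update share the
    average of every slot they contain. Unknown data is not handled.
    """
    pdps = {}
    pending = []  # per-slot values not yet turned into a PDP
    last = origin
    for t, value in updates:
        pending.extend([value] * (t - last))  # slots last+1 .. t
        last = t
        n = len(pending) // step              # whole steps now complete
        if n:
            window = pending[:n * step]
            average = sum(window) / len(window)
            first_end = t - len(pending) + step  # last slot of first PDP
            for k in range(n):
                pdps[first_end + k * step] = average
            pending = pending[n * step:]      # keep the unfinished step
    return pdps


# The samples fed to "rrdtool update", as (offset, value) pairs.
updates = [(3, 8), (6, 1), (17, 6), (20, 7), (21, 7), (22, 4),
           (23, 3), (36, 1), (37, 2), (38, 3), (39, 3), (42, 5)]

for end, value in sorted(official_pdps(updates, step=5).items()):
    print(f"{end:02d}: {value:.10e}")
```

Under these assumptions the sketch prints the same eight known values
as the fetch output, from 05 through 40. Slots 41 and 42 stay pending
because no update arrives at or after 45, which corresponds to the nan
reported for 45.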

All that said, I hope you now have a clear understanding of the inner
calculation "magic" for PDPs.

Other References
• A great PowerShell script for generating ASCII bar charts:
  <https://gallery.technet.microsoft.com/scriptcenter/Sample-Script-to-Generate-59c80d4c>

• <https://stackoverflow.com/questions/18924450/rrd-wrong-values>

1.8.0                             2022-03-14                    RRD_PDPCALC(1)