PMLOGREWRITE(1)             General Commands Manual            PMLOGREWRITE(1)

NAME
       pmlogrewrite - rewrite Performance Co-Pilot archives

SYNOPSIS
       $PCP_BINADM_DIR/pmlogrewrite [-Cdiqsvw] [-c config] inlog [outlog]

DESCRIPTION
       pmlogrewrite reads a set of Performance Co-Pilot (PCP) archive logs
       identified by inlog and creates a PCP archive log in outlog. Under
       normal usage, the -c option will be used to nominate a configuration
       file or files that contain specifications (see the REWRITING RULES
       SYNTAX section below) that describe how the data and metadata from
       inlog should be transformed to produce outlog.

       The typical uses for pmlogrewrite would be to accommodate the evolution
       of Performance Metric Domain Agents (PMDAs) where the names, metadata
       and semantics of metrics and their associated instance domains may
       change over time, e.g. promoting the type of a metric from a 32-bit to
       a 64-bit integer, or renaming a group of metrics. Refer to the EXAMPLES
       section for some additional use cases.

       pmlogrewrite is most useful where PMDA changes, or errors in the
       production environment, result in archives that cannot be combined with
       pmlogextract(1). By pre-processing the archives with pmlogrewrite the
       resulting archives may be able to be merged with pmlogextract(1).
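
       The pre-processing step described above might be scripted along the
       following lines. This is a minimal sketch rather than anything shipped
       with PCP: the function name rewrite_then_merge, the rules file and the
       archive names are hypothetical, and it assumes the usual pmlogextract(1)
       calling convention of one or more input archives followed by the output
       archive.

```shell
# Sketch: rewrite each input archive with the same rules, then merge the
# rewritten copies. All file and archive names are hypothetical.
rewrite_then_merge() {
    rules=$1 merged=$2
    shift 2
    rewritten=""
    for inlog in "$@"; do
        outlog="$inlog.rewritten"
        # stop at the first failure; pmlogrewrite exits non-zero on error
        pmlogrewrite -c "$rules" "$inlog" "$outlog" || return 1
        rewritten="$rewritten $outlog"
    done
    # word splitting of $rewritten is intended (names without spaces assumed)
    pmlogextract $rewritten "$merged"
}
```

       A call such as "rewrite_then_merge rules.conf merged 20240101 20240102"
       would leave the original archives untouched and write the rewritten
       copies alongside them.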

       The input inlog must be a set of PCP archive logs created by
       pmlogger(1), or possibly one of the tools that read and create PCP
       archives, e.g. pmlogextract(1) and pmlogreduce(1). inlog is a
       comma-separated list of names, each of which may be the base name of an
       archive or the name of a directory containing one or more archives.

       If no -c option is specified, then the default behavior simply creates
       outlog as a copy of inlog. This is a little more complicated than
       cat(1), as each PCP archive is made up of several physical files.

       While pmlogrewrite may be used to repair some data consistency issues
       in PCP archives, there is also a class of repair tasks that cannot be
       handled by pmlogrewrite; pmloglabel(1) may be a useful tool in these
       cases.

COMMAND LINE OPTIONS
       The command line options for pmlogrewrite are as follows:

       -C     Parse the rewriting rules and quit. outlog is not created.
              When -C is specified, this also sets -v and -w so that all
              warnings and verbose messages are displayed as config is parsed.

       -c config
              If config is a file or symbolic link, read and parse rewriting
              rules from there. If config is a directory, then all of the
              files or symbolic links in that directory (excluding those
              beginning with a period ``.'') will be used to provide the
              rewriting rules. Multiple -c options are allowed.

       -d     Desperate mode. Normally if a fatal error occurs, all trace of
              the partially written PCP archive outlog is removed. With the
              -d option, the partially created outlog archive log is not
              removed.

       -i     Rather than creating outlog, inlog is rewritten in place when
              the -i option is used. A new archive is created using temporary
              file names and then renamed to inlog in such a way that if any
              errors (not warnings) are encountered, inlog remains unaltered.

       -q     Quick mode, where if there are no rewriting actions to be
              performed (none of the global data, instance domains or metrics
              from inlog will be changed), then pmlogrewrite will exit (with
              status 0, so success) immediately after parsing the
              configuration file(s) and outlog is not created.

       -s     When the ``units'' of a metric are changed, if the dimension in
              terms of space, time and count is unaltered, then the scaling
              factor is being changed, e.g. BYTE to KBYTE, or MSEC^-1 to
              USEC^-1, or the composite MBYTE.SEC^-1 to KBYTE.USEC^-1. The
              motivation may be (a) that the original metadata was wrong but
              the values in inlog are correct, or (b) the metadata is changing
              so the values need to change as well. The default pmlogrewrite
              behaviour matches case (a). If case (b) applies, then use the -s
              option and the values of all the metrics with a scale factor
              change in each result will be rescaled. For finer control over
              value rescaling refer to the RESCALE option for the UNITS clause
              of the metric rewriting rule described below.

       -v     Increase verbosity of diagnostic output.

       -w     Emit warnings. Normally pmlogrewrite remains silent for any
              warning that is not fatal and it is expected that for a
              particular archive, some (or indeed, all) of the rewriting
              specifications may not apply. For example, changes to a PMDA may
              be captured in a set of rewriting rules, but a single archive may
              not contain all of the modified metrics nor all of the modified
              instance domains and/or instances. Because these cases are
              expected, they do not prevent pmlogrewrite executing, and rules
              that do not apply to inlog are silently ignored by default.
              Similarly, some rewriting rules may involve no change because
              the metadata in inlog already matches the intent of the
              rewriting rule to correct data from a previous version of a
              PMDA. The -w flag forces warnings to be emitted for all of these
              cases.

       The argument outlog is required in all cases, except when -i is
       specified.
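
       Two common combinations of the options above can be wrapped as shown
       below. This is only a sketch: the function names check_rules and
       rewrite_in_place are hypothetical, as are any rules file and archive
       names passed to them, and combining -i with -q and -w is an assumption
       based on the option descriptions rather than a documented recipe.

```shell
# Sketch only: the rules file and archive names passed in are hypothetical.

# Parse the rules and report problems without creating outlog.
# -C also turns on -v and -w.
check_rules() {
    pmlogrewrite -C -c "$1" "$2" "$3"
}

# Rewrite inlog in place, exiting early when the rules would change
# nothing (-q) and reporting rules that do not apply (-w).
rewrite_in_place() {
    pmlogrewrite -i -q -w -c "$1" "$2"
}
```

       For example, "check_rules rules.conf inlog outlog" validates the rules
       first, and "rewrite_in_place rules.conf inlog" then applies them.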

REWRITING RULES SYNTAX
       A configuration file contains zero or more rewriting rules as defined
       below.

       Keywords and special punctuation characters are shown below in bold
       italic font and are case-insensitive, so METRIC, metric and Metric
       are all equivalent in rewriting rules.

       The character ``#'' introduces a comment and the remainder of the line
       is ignored. Otherwise the input is relatively free format with
       optional white space (spaces, tabs or newlines) between lexical items
       in the rules.

       A global rewriting rule has the form:

           GLOBAL { globalspec ... }

       where globalspec is zero or more of the following clauses:

       HOSTNAME -> hostname

              Modifies the label records in the outlog PCP archive, so that
              the metrics will appear to have been collected from the host
              hostname.

       TIME -> delta

              Both metric values and the instance domain metadata in a PCP
              archive carry timestamps. This clause forces all the
              timestamps to be adjusted by delta, where delta is an optional
              sign ``+'' (the default) or ``-'', an optional number of hours
              followed by a colon ``:'', an optional number of minutes
              followed by a colon ``:'', a number of seconds, and an optional
              fraction of seconds following a period ``.''. The simplest
              example would be ``30'' to increase the timestamps by 30
              seconds. A more complex example would be ``-23:59:59.999'' to
              move the timestamps backwards by one millisecond less than one
              day.

       TZ -> "timezone"

              Modifies the label records in the outlog PCP archive, so that
              the metrics will appear to have been collected from a host with
              a local timezone of timezone. timezone must be enclosed in
              quotes, and should conform to the valid timezone syntax rules
              for the local platform.
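
       To make the delta syntax of the TIME clause concrete, the sketch below
       converts a delta string into a signed number of seconds. It is an
       illustration of the syntax as described above, not the parser used by
       pmlogrewrite; the function name delta_to_seconds is hypothetical, and
       treating a value with a single colon as minutes:seconds is an
       assumption.

```shell
# Sketch: convert a TIME delta such as 30 or -23:59:59.999 into seconds.
# Assumption: with one colon the fields are minutes:seconds.
delta_to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        s = $0; sign = 1
        if (substr(s, 1, 1) == "-")      { sign = -1; s = substr(s, 2) }
        else if (substr(s, 1, 1) == "+") { s = substr(s, 2) }
        n = split(s, f, ":")
        secs = f[n] + 0                       # seconds, with any fraction
        if (n >= 2) secs += 60 * f[n - 1]     # minutes
        if (n >= 3) secs += 3600 * f[n - 2]   # hours
        printf "%.3f\n", sign * secs
    }'
}
```

       With this sketch, "delta_to_seconds 30" prints 30.000 and
       "delta_to_seconds -23:59:59.999" prints -86399.999, which is one
       millisecond short of a full day (86400 seconds).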

       An indom rewriting rule modifies an instance domain and has the form:

           INDOM domain.serial { indomspec ... }

       where domain and serial identify one or more existing instance domains
       from inlog - typically domain would be an integer in the range 1 to 510
       and serial would be an integer in the range 0 to 4194303.

       As a special case serial could be an asterisk ``*'' which means the
       rule applies to every instance domain with a domain number of domain.

       If a designated instance domain is not in inlog the rule has no effect.

       The indomspec is zero or more of the following clauses:

       INAME "oldname" -> "newname"

              The instance identified by the external instance name oldname
              is renamed to newname. Both oldname and newname must be
              enclosed in quotes.

              As a special case, the new name may be the keyword DELETE (with
              no quotes), and then the instance oldname will be expunged from
              outlog which removes it from the instance domain metadata and
              removes all values of this instance for all the associated
              metrics.

              If the instance names contain any embedded spaces then special
              care needs to be taken in respect of the PCP instance naming
              rule that treats the leading non-space part of the instance
              name as the unique portion of the name for the purposes of
              matching and ensuring uniqueness within an instance domain;
              refer to pmdaInstance(3) for a discussion of this issue.

              As an illustration, consider the hypothetical instance domain
              for a metric which contains 2 instances with the following
              names:
                  red
                  eek urk

              Then some possible INAME clauses might be:

              "eek" -> "yellow like a flower"
                     Acceptable, oldname "eek" matches the "eek urk"
                     instance.

              "red" -> "eek"
                     Error, newname "eek" matches the existing "eek urk"
                     instance.

              "eek urk" -> "red of another hue"
                     Error, newname "red of another hue" matches the
                     existing "red" instance.

       INDOM -> newdomain.newserial

              Modifies the metadata for the instance domain and every metric
              associated with the instance domain. As a special case,
              newserial could be an asterisk ``*'' which means use serial
              from the indom rewriting rule, although this is most useful
              when serial is also an asterisk. So for example:
                  indom 29.* { indom -> 109.* }
              will move all instance domains from domain 29 to domain 109.

       INDOM -> DUPLICATE newdomain.newserial

              A special case of the previous INDOM clause where the instance
              domain is a duplicate copy of the domain.serial instance domain
              from the indom rewriting rule, and then any mapping rules are
              applied to the copied newdomain.newserial instance domain.
              This is useful when a PMDA is split and the same instance
              domain needs to be replicated for domain domain and domain
              newdomain. So for example if the metrics foo.one and foo.two
              are both defined over instance domain 12.34, and foo.two is
              moved to another PMDA using domain 27, then the following
              rewriting rules could be used:
                  indom 12.34 { indom -> duplicate 27.34 }
                  metric foo.two { indom -> 27.34 pmid -> 27.*.* }

       INST oldid -> newid

              The instance identified by the internal instance identifier
              oldid is renumbered to newid. Both oldid and newid are
              integers in the range 0 to 2^31-1.

              As a special case, newid may be the keyword DELETE and then the
              instance oldid will be expunged from outlog which removes it
              from the instance domain metadata and removes all values of
              this instance for all the associated metrics.
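
       The naming rule behind the INAME illustration above (only the leading
       non-space part of an instance name counts for matching and uniqueness)
       can be sketched in a few lines of shell. The function names unique_part
       and same_instance are hypothetical, and the code models the rule as it
       is described here rather than the pmlogrewrite implementation.

```shell
# Sketch of the naming rule: only the text before the first space is
# treated as the unique part of an instance name.
unique_part() {
    printf '%s\n' "${1%% *}"
}

# Succeeds (status 0) if the two names would be treated as the same
# instance under that rule.
same_instance() {
    [ "$(unique_part "$1")" = "$(unique_part "$2")" ]
}
```

       Under this sketch, same_instance "eek" "eek urk" succeeds, which is why
       "eek" is accepted as an oldname, and same_instance "red of another hue"
       "red" also succeeds, which is why that newname is rejected.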

       A metric rewriting rule has the form:

           METRIC metricid { metricspec ... }

       where metricid identifies one or more existing metrics from inlog using
       either a metric name, or the internal encoding for a metric's PMID as
       domain.cluster.item. In the latter case, typically domain would be an
       integer in the range 1 to 510, cluster would be an integer in the range
       0 to 4095, and item would be an integer in the range 0 to 1023.

       As special cases item could be an asterisk ``*'' which means the rule
       applies to every metric with a domain number of domain and a cluster
       number of cluster, or cluster could be an asterisk which means the rule
       applies to every metric with a domain number of domain and an item
       number of item, or both cluster and item could be asterisks, and the
       rule applies to every metric with a domain number of domain.

       If a designated metric is not in inlog the rule has no effect.
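
       The numeric form of metricid, including the asterisk special cases, can
       be checked before a rules file is written. The following sketch tests a
       domain.cluster.item string against the typical ranges quoted above; the
       function name valid_pmid_spec is hypothetical and the check is not
       performed this way by pmlogrewrite itself.

```shell
# Sketch: check a metricid given as domain.cluster.item against the typical
# ranges quoted above; cluster and item may each be "*".
valid_pmid_spec() {
    printf '%s\n' "$1" | awk -F'[.]' '
    function in_range(v, lo, hi) {
        return (v ~ /^[0-9]+$/) && (v + 0 >= lo) && (v + 0 <= hi)
    }
    NF != 3                             { exit 1 }
    !in_range($1, 1, 510)               { exit 1 }
    $2 != "*" && !in_range($2, 0, 4095) { exit 1 }
    $3 != "*" && !in_range($3, 0, 1023) { exit 1 }
    '
}
```

       The function returns status 0 for strings such as '60.8.*' or '30.*.*'
       (quoted to stop the shell expanding the asterisks) and a non-zero
       status for strings such as '60.0.1024'.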

       The metricspec is zero or more of the following clauses:

       DELETE

              The metric is completely removed from outlog, both the metadata
              and all values in results are expunged.

       INDOM -> newdomain.newserial [ pick ]

              Modifies the metadata to change the instance domain for this
              metric. The new instance domain must exist in outlog.

              The optional pick clause may be used to select one input value,
              or compute an aggregate value from the instances in an input
              result, or assign an internal instance identifier to a single
              output value. If no pick clause is specified, the default
              behaviour is to copy all input values from each input result to
              an output result, however if the input instance domain is
              singular (indom PM_INDOM_NULL) then the one output value must
              be assigned an internal instance identifier, which is 0 by
              default, unless overridden by an INST or INAME clause as
              defined below.

              The choices for pick are as follows:

              OUTPUT FIRST
                          choose the value of the first instance from each
                          input result

              OUTPUT LAST choose the value of the last instance from each
                          input result

              OUTPUT INST instid
                          choose the value of the instance with internal
                          instance identifier instid from each result; the
                          sequence of rewriting rules ensures the OUTPUT
                          processing happens before instance identifier
                          renumbering from any associated indom rule, so
                          instid should be one of the internal instance
                          identifiers that appears in inlog

              OUTPUT INAME "name"
                          choose the value of the instance with name for its
                          external instance name from each result; the
                          sequence of rewriting rules ensures the OUTPUT
                          processing happens before instance renaming from
                          any associated indom rule, so name should be one of
                          the external instance names that appears in inlog

              OUTPUT MIN  choose the smallest value in each result (metric
                          type must be numeric and output instance will be 0
                          for a non-singular instance domain)

              OUTPUT MAX  choose the largest value in each result (metric
                          type must be numeric and output instance will be 0
                          for a non-singular instance domain)

              OUTPUT SUM  choose the sum of all values in each result (metric
                          type must be numeric and output instance will be 0
                          for a non-singular instance domain)

              OUTPUT AVG  choose the average of all values in each result
                          (metric type must be numeric and output instance
                          will be 0 for a non-singular instance domain)

              If the input instance domain is singular (indom PM_INDOM_NULL)
              then independent of any pick specifications, there is at most
              one value in each input result and so FIRST, LAST, MIN, MAX,
              SUM and AVG are all equivalent and the output instance
              identifier will be 0.

              In general it is an error to specify a rewriting action for the
              same metadata or result values more than once, e.g. more than
              one INDOM clause for the same instance domain. The one
              exception is the possible interaction between the INDOM clauses
              in the indom and metric rules. For example the metric
              sample.bin is defined over the instance domain 29.2 in inlog
              and the following is acceptable (albeit redundant):
                  indom 29.* { indom -> 109.* }
                  metric sample.bin { indom -> 109.2 }
              However the following is an error, because the instance domain
              for sample.bin has two conflicting definitions:
                  indom 29.* { indom -> 109.* }
                  metric sample.bin { indom -> 123.2 }

       INDOM -> NULL [ pick ]

              The metric (which must have been previously defined over an
              instance domain) is being modified to be a singular metric.
              This involves a metadata change and collapsing all results for
              this metric so that multiple values become one value.

              The optional pick part of the clause defines how the one value
              for each result should be calculated and follows the same rules
              as described for the non-NULL INDOM case above.

              In the absence of pick, the default is OUTPUT FIRST.

       NAME -> newname

              Renames the metric in the PCP archive's metadata that supports
              the Performance Metrics Name Space (PMNS). newname should not
              match any existing name in the archive's PMNS and must follow
              the syntactic rules for valid metric names as outlined in
              pmns(5).

       PMID -> newdomain.newcluster.newitem

              Modifies the metadata and results to renumber the metric's
              PMID. As special cases, newcluster could be an asterisk ``*''
              which means use cluster from the metric rewriting rule and/or
              newitem could be an asterisk which means use item from the
              metric rewriting rule. This is most useful when cluster and/or
              item is also an asterisk. So for example:
                  metric 30.*.* { pmid -> 123.*.* }
              will move all metrics from domain 30 to domain 123.

       SEM -> newsem

              Change the semantics of the metric. newsem should be the XXX
              part of the name of one of the PM_SEM_XXX macros defined in
              <pcp/pmapi.h> or pmLookupDesc(3), e.g. COUNTER for
              PM_SEM_COUNTER.

              No data value rewriting is performed as a result of the SEM
              clause, so the usefulness is limited to cases where a version
              of the associated PMDA was exporting incorrect semantics for
              the metric. pmlogreduce(1) may provide an alternative in cases
              where re-computation of result values is desired.

       TYPE -> newtype

              Change the type of the metric which alters the metadata and may
              change the encoding of values in results. newtype should be
              the XXX part of the name of one of the PM_TYPE_XXX macros
              defined in <pcp/pmapi.h> or pmLookupDesc(3), e.g. FLOAT for
              PM_TYPE_FLOAT.

              Type conversion is only supported for cases where the old and
              new metric type is numeric, so PM_TYPE_STRING,
              PM_TYPE_AGGREGATE and PM_TYPE_EVENT are not allowed. Even for
              the numeric cases, some conversions may produce run-time
              errors, e.g. integer overflow, or attempting to rewrite a
              negative value into an unsigned type.

       TYPE IF oldtype -> newtype

              The same as the preceding TYPE clause, except the type of the
              metric is only changed to newtype if the type of the metric in
              inlog is oldtype.

              This is useful in cases where the type of metricid in inlog may
              be platform dependent and so more than one type rewriting rule
              is required.

       UNITS -> newunits [ RESCALE ]

              newunits is six values separated by commas. The first 3 values
              describe the dimension of the metric along the dimensions of
              space, time and count; these are integer values, usually 0, 1
              or -1. The remaining 3 values describe the scale of the
              metric's values in the dimensions of space, time and count.
              Space scale values should be 0 (if the space dimension is 0),
              else the XXX part of the name of one of the PM_SPACE_XXX
              macros, e.g. KBYTE for PM_SPACE_KBYTE. Time scale values
              should be 0 (if the time dimension is 0), else the XXX part of
              the name of one of the PM_TIME_XXX macros, e.g. SEC for
              PM_TIME_SEC. Count scale values should be 0 (if the count
              dimension is 0), else ONE for PM_COUNT_ONE.

              The PM_SPACE_XXX, PM_TIME_XXX and PM_COUNT_XXX macros are
              defined in <pcp/pmapi.h> or pmLookupDesc(3).

              When the scale is changed (but the dimension is unaltered) the
              optional keyword RESCALE may be used to choose value rescaling
              as per the -s command line option, but applied to just this
              metric.
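
       As an illustration of the UNITS clause, the sketch below writes a rule
       that changes the space scale of a metric from BYTE to KBYTE with
       RESCALE, and shows the arithmetic that rescaling implies. The metric
       name foo.size is hypothetical, as is the helper bytes_to_kbytes, and
       the factor of 1024 assumes the PCP convention that one KBYTE is 1024
       bytes.

```shell
# Sketch: a rule that changes the space scale to KBYTE and asks for the
# stored values to be rescaled. The metric name foo.size is hypothetical.
rules=$(mktemp)
cat > "$rules" <<'EOF'
metric foo.size {
    # dimension: space=1 time=0 count=0; scale: KBYTE, 0, 0
    units -> 1,0,0,KBYTE,0,0 rescale
}
EOF

# With RESCALE (or -s) a value recorded in bytes would be divided by 1024
# to express it in KBYTE (assumed factor); without it the stored value is
# left as it is and only the metadata changes.
bytes_to_kbytes() {
    awk -v v="$1" 'BEGIN { printf "%g\n", v / 1024 }'
}
```

       Omitting the rescale keyword from the rule corresponds to case (a) in
       the description of the -s option, where the metadata was wrong but the
       stored values were already correct.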

       When changing the domain number for a metric or instance domain, the
       new domain number will usually match an existing PMDA's domain number.
       If this is not the case, then the new domain number should not be
       randomly chosen; consult $PCP_VAR_DIR/pmns/stdpmid for domain numbers
       that are already assigned to PMDAs.

EXAMPLES
       To promote the values of the per-disk IOPS metrics to 64-bit to allow
       aggregation over a long time period for capacity planning, or because
       the PMDA has changed to export 64-bit counters and we want to convert
       old archives so they can be processed alongside new archives, the
       following rules could be used.
           metric disk.dev.read { type -> U64 }
           metric disk.dev.write { type -> U64 }
           metric disk.dev.total { type -> U64 }

       The instances associated with the load average metric kernel.all.load
       could be renamed and renumbered by the rules below.
           # for the Linux PMDA, the kernel.all.load metric is defined
           # over instance domain 60.2
           indom 60.2 {
               inst 1 -> 60 iname "1 minute" -> "60 second"
               inst 5 -> 300 iname "5 minute" -> "300 second"
               inst 15 -> 900 iname "15 minute" -> "900 second"
           }

       If we decide to split the ``proc'' metrics out of the Linux PMDA, this
       will involve changing the domain number for the PMID of these metrics
       and the associated instance domains. The rules below would rewrite an
       old archive to match the changes after the PMDA split.
           # all Linux proc metrics are in 7 clusters
           metric 60.8.* { pmid -> 123.*.* }
           metric 60.9.* { pmid -> 123.*.* }
           metric 60.13.* { pmid -> 123.*.* }
           metric 60.24.* { pmid -> 123.*.* }
           metric 60.31.* { pmid -> 123.*.* }
           metric 60.32.* { pmid -> 123.*.* }
           metric 60.51.* { pmid -> 123.*.* }
           # only one instance domain for Linux proc metrics
           indom 60.9 { indom -> 123.0 }

       If the metric foo.count_em was exported as a native ``long'' then it
       could be a 32-bit integer on some platforms and a 64-bit integer on
       other platforms. Subsequent investigations show the value is in fact
       unsigned, so the following rules could be used.
           metric foo.count_em {
               type if 32 -> U32
               type if 64 -> U64
           }
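
       Rules such as those above are often applied to a whole directory of
       older archives. One possible way to do that is sketched below; the
       function name rewrite_all is hypothetical, archives are recognised
       here simply by their .meta file, and the use of -i with -q is an
       assumption about a convenient combination rather than a documented
       recipe.

```shell
# Sketch: apply one rules file, in place, to every archive found in a
# directory. An archive is recognised here by its .meta file.
rewrite_all() {
    rules=$1 dir=$2 failed=0
    for meta in "$dir"/*.meta; do
        [ -e "$meta" ] || continue      # glob did not match: no archives
        base=${meta%.meta}
        if ! pmlogrewrite -i -q -c "$rules" "$base"; then
            echo "rewrite failed for $base" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

       The function carries on after a failure so that one damaged archive
       does not block the rest, and returns a non-zero status if any rewrite
       failed.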

FILES
       For each of the inlog and outlog archive logs, several physical files
       are used.
       archive.meta
              metadata (metric descriptions, instance domains, etc.) for
              the archive log
       archive.0
              initial volume of metrics values (subsequent volumes have
              suffixes 1, 2, ...).
       archive.index
              temporal index to support rapid random access to the other
              files in the archive log.
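
       Because an archive is a group of files sharing one base name, scripts
       that move or back up archives need to collect all of them. The sketch
       below lists the files for a given base name using the naming scheme
       above; the function name archive_files is hypothetical.

```shell
# Sketch: list the physical files that belong to one archive base name,
# following the naming scheme above (.meta, .index, numbered volumes).
archive_files() {
    base=$1
    for f in "$base.meta" "$base.index" "$base".[0-9]*; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

       Names that do not exist on disk are skipped, so the output is empty
       when the base name does not correspond to an archive.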

PCP ENVIRONMENT
       Environment variables with the prefix PCP_ are used to parameterize the
       file and directory names used by PCP. On each installation, the file
       /etc/pcp.conf contains the local values for these variables. The
       $PCP_CONF variable may be used to specify an alternative configuration
       file, as described in pcp.conf(5).
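
       A script that needs the full path to pmlogrewrite can look up
       PCP_BINADM_DIR in the configuration file. The sketch below assumes the
       file holds plain NAME=value lines, as described in pcp.conf(5); the
       function name pcp_conf_value is hypothetical.

```shell
# Sketch: look up one PCP_ variable in a pcp.conf style file of NAME=value
# lines. Uses the file given as $2, else $PCP_CONF, else /etc/pcp.conf.
pcp_conf_value() {
    conf=${2:-${PCP_CONF:-/etc/pcp.conf}}
    # print the value of the last matching assignment, if any
    sed -n "s/^$1=//p" "$conf" | tail -n 1
}
```

       For example, "$(pcp_conf_value PCP_BINADM_DIR)/pmlogrewrite" would
       expand to the installed location of the command on a system where the
       configuration file defines that variable.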

SEE ALSO
       PCPIntro(1), pmdaInstance(3), pmdumplog(1), pmlogger(1),
       pmlogextract(1), pmloglabel(1), pmlogreduce(1), pmLookupDesc(3),
       pmns(5), pcp.conf(5) and pcp.env(5).

DIAGNOSTICS
       All error conditions detected by pmlogrewrite are reported on stderr
       with textual (if sometimes terse) explanation.

       Should the input archive log be corrupted (this can happen if the
       pmlogger instance writing the log suddenly dies), then pmlogrewrite
       will detect and report the position of the corruption in the file, and
       any subsequent information from that archive log will not be processed.

       If any error is detected, pmlogrewrite will exit with a non-zero
       status.

Performance Co-Pilot                                          PMLOGREWRITE(1)