PMLOGEXTRACT(1)             General Commands Manual            PMLOGEXTRACT(1)

NAME

       pmlogextract - reduce, extract, concatenate and merge Performance
       Co-Pilot archives

SYNOPSIS

       $PCP_BINADM_DIR/pmlogextract [-dfwz] [-c configfile] [-n pmnsfile]
       [-S starttime] [-s samples] [-T endtime] [-v volsamples] [-Z timezone]
       input [...] output

DESCRIPTION

       pmlogextract reads one or more Performance Co-Pilot (PCP) archive logs
       identified by input and creates a temporally merged and/or reduced PCP
       archive log in output.  The nature of merging is controlled by the
       number of input archive logs, while the nature of data reduction is
       controlled by the command line arguments.  The input(s) must be PCP
       archive logs created by pmlogger(1) with performance data collected
       from the same host, but usually over different time periods and
       possibly (although not usually) with different performance metrics
       being logged.

       If only one input is specified, then the default behavior simply
       copies the input PCP archive log into the output PCP archive log.
       When two or more PCP archive logs are specified as input, the logs are
       merged (or concatenated) and written to output.

       In the output archive log a ``mark'' record will be inserted at a time
       just past the end of each of the input archive logs to indicate a
       possible temporal discontinuity between the end of one input archive
       log and the start of the next input archive log.  See the MARK RECORDS
       section below for more information.  There is no ``mark'' record after
       the end of the last (in temporal order) of the input archive logs.

COMMAND LINE OPTIONS

       The command line options for pmlogextract are as follows:

       -c configfile
              Extract only the metrics specified in configfile from the input
              PCP archive log(s).  The configfile syntax accepted by
              pmlogextract is explained in more detail in the CONFIGURATION
              FILE SYNTAX section.

       -d     Desperate mode.  Normally if a fatal error occurs, all trace of
              the partially written PCP archive output is removed.  With the
              -d option, the output archive log is not removed.

       -f     For most common uses, all of the input archive logs will have
              been collected in the same timezone.  If this is not the case,
              pmlogextract must choose one of the timezones from the input
              archive logs to be used as the timezone for the output archive
              log.  The default is to use the timezone from the last input
              archive log.  The -f option forces the timezone from the first
              input archive log to be used.

       -n pmnsfile
              Normally pmlogextract operates on the Performance Metrics Name
              Space (PMNS) from input; however, if the -n option is specified,
              an alternative local PMNS is loaded from the file pmnsfile.

       -S starttime
              Define the start of a time window to restrict the samples
              retrieved or specify a ``natural'' alignment of the output
              sample times; refer to PCPIntro(1).  See also the -w option.

       -s samples
              The argument samples defines the number of samples to be written
              to output.  If samples is 0 or -s is not specified, pmlogextract
              will sample until the end of the PCP archive log, or the end of
              the time window as specified by -T, whichever comes first.  The
              -s option will override the -T option if it occurs sooner.

       -T endtime
              Define the termination of a time window to restrict the samples
              retrieved or specify a ``natural'' alignment of the output
              sample times; refer to PCPIntro(1).  See also the -w option.

       -v volsamples
              The output archive log is potentially a multi-volume data set,
              and the -v option causes pmlogextract to start a new volume
              after volsamples log records have been written to the archive
              log.

       -w     Where -S and -T specify a time window within the same day, the
              -w flag will cause the data within the time window to be
              extracted for every day in the archive log.  For example, the
              options -w -S @11:00 -T @15:00 specify that pmlogextract should
              include archive log records only for the periods from 11am to
              3pm on each day.  When -w is used, the output archive log will
              contain ``mark'' records to indicate the temporal discontinuity
              between the end of one time window and the start of the next.

       -Z timezone
              Use timezone when displaying the date and time.  Timezone is in
              the format of the environment variable TZ as described in
              environ(5).

       -z     Use the local timezone of the host from the input archive logs.
              The default is to initially use the timezone of the local host.
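
       The interaction of -w, -S, -T and -s described above can be pictured
       with a small Python sketch.  This is an illustration only, not the
       pmlogextract implementation: the function names, the sample
       timestamps, and the choice to treat both window boundaries as
       inclusive are assumptions made for the example.

```python
from datetime import datetime, time


def in_daily_window(ts, start, end):
    """True if the time of day of ts lies within [start, end] (assumed inclusive)."""
    return start <= ts.time() <= end


def select_records(timestamps, start, end, max_samples=0):
    """Keep timestamps inside the daily window, as -w -S start -T end would.

    max_samples mimics -s: 0 means no limit, otherwise stop once that
    many records have been kept, even if the window has not ended.
    """
    kept = []
    for ts in timestamps:
        if max_samples and len(kept) >= max_samples:
            break
        if in_daily_window(ts, start, end):
            kept.append(ts)
    return kept


# Hypothetical record timestamps spanning two days.
records = [
    datetime(2024, 1, 1, 10, 30),
    datetime(2024, 1, 1, 11, 0),
    datetime(2024, 1, 1, 14, 59),
    datetime(2024, 1, 1, 15, 30),
    datetime(2024, 1, 2, 12, 0),
]

# Equivalent in spirit to: -w -S @11:00 -T @15:00
window = select_records(records, time(11, 0), time(15, 0))
```

       With the sample data, window holds the two records from the first
       day that fall between 11am and 3pm, plus the midday record from the
       second day.  Passing max_samples=2 stops after the first two, which
       mirrors -s taking effect before -T.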

CONFIGURATION FILE SYNTAX

       The configfile contains metrics of interest, listed one per line.
       Instances may also be specified, but they are optional.  The format
       for each metric name is

               metric [[instance[,instance...]]]

       where metric may be a leaf or a non-leaf node in the Performance
       Metrics Namespace (PMNS, see pmns(4)).  If a metric refers to a
       non-leaf node in the PMNS, pmlogextract will recursively descend the
       PMNS and include all metrics corresponding to descendant leaf nodes.
       Instances are optional, and may be specified as a list of one or more
       space (or comma) separated names, numbers or strings.  Elements in the
       list that are numbers are assumed to be external instance identifiers
       - see pmGetInDom(3) for more information.  If no instances are given,
       then the logging specification is applied to all instances of the
       associated metric(s).
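
       The line format above can be sketched as a small Python parser.  This
       is a hedged illustration rather than the parser pmlogextract uses:
       the function name parse_config_line is hypothetical, and treating
       lines beginning with # as comments is an assumption based on the
       example configfile, not on a stated rule.

```python
import re


def parse_config_line(line):
    """Return (metric, instances) for one configfile line.

    Blank lines and lines starting with '#' return None (assumed
    comment convention).
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    # A metric name, optionally followed by a bracketed instance list.
    match = re.fullmatch(r"(\S+?)\s*(?:\[(.*)\])?", text)
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    metric, inst_text = match.group(1), match.group(2)
    instances = []
    if inst_text:
        # Instances are separated by spaces or commas; a double-quoted
        # string is kept whole and its quotes are dropped.
        for quoted, bare in re.findall(r'"([^"]*)"|([^\s,"]+)', inst_text):
            instances.append(quoted or bare)
    return metric, instances
```

       For instance, parse_config_line('disk.dev ["dks0d1"]') yields the
       metric name disk.dev with the single instance dks0d1, while a line
       holding only kernel.all.cpu yields that metric with an empty instance
       list, meaning all instances.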

CONFIGURATION FILE EXAMPLE

       This is an example of a valid configfile:

               #
               # config file for pmlogextract
               #

               kernel.all.cpu
               kernel.percpu.cpu.sys ["cpu0","cpu1"]
               disk.dev ["dks0d1"]

MARK RECORDS

       When more than one input archive log contributes performance data to
       the output archive log, then ``mark'' records are inserted to indicate
       a possible discontinuity in the performance data.

       A ``mark'' record contains a timestamp and no performance data and is
       used to indicate that there is a time period in the PCP archive log
       where we do not know the values of any performance metrics, because
       there was no pmlogger(1) collecting performance data during this
       period.  Since these periods are often associated with the restart of
       a service or pmcd(1) or a system, there may be considerable doubt as
       to the continuity of performance data across this time period.

       The rationale behind ``mark'' records may be demonstrated with an
       example.  Consider one input archive log that starts at 00:10 and ends
       at 09:15 on the same day, and another input archive log that starts at
       09:20 on the same day and ends at 00:10 the following morning.  This
       would be a very common case for archives managed and rotated by
       pmlogger_check(1) and pmlogger_daily(1).

       The output archive log would contain:

       00:10.000   first record from first input archive log
       ...
       09:15.000   last record from first input archive log
       09:15.001   <mark record>
       09:20.000   first record from second input archive log
       ...
       00:10.000   last record from second input archive log

       The time period where the performance data is missing starts just
       after 09:15 and ends just before 09:20.  When the output archive log
       is processed with any of the PCP reporting tools, the ``mark'' record
       is used to indicate a period of missing data.  For example, in the
       archive above, if one were reporting the average I/O rate at 30 minute
       intervals, aligned on the hour, then there would be data for the
       intervals ending at 09:00 and 10:00 but no data reported for the
       interval ending at 09:30, as this spans a ``mark'' record.

       The presence of ``mark'' records in a PCP archive log can be
       established using pmdumplog(1), where a timestamp and the annotation
       <mark> are used to indicate a ``mark'' record.
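
       The example above can be modelled in a few lines of Python.  This is
       a sketch under stated assumptions, not the pmlogextract algorithm: it
       assumes the input archives do not overlap in time, represents each
       record as a hypothetical (milliseconds, payload) tuple, and places
       the mark one millisecond after the last record, as in the 09:15.001
       entry shown above.

```python
MARK = "<mark>"


def ms(hours, minutes):
    """Milliseconds since midnight of the first day."""
    return (hours * 60 + minutes) * 60 * 1000


def merge_with_marks(archives):
    """Concatenate non-overlapping archives in temporal order.

    A mark record is added just past the end of every archive except
    the last one (in temporal order).
    """
    ordered = sorted(archives, key=lambda records: records[0][0])
    merged = []
    for index, records in enumerate(ordered):
        merged.extend(records)
        if index < len(ordered) - 1:
            merged.append((records[-1][0] + 1, MARK))
    return merged


def interval_has_data(merged, start, end):
    """An interval is reportable only if no mark record falls inside it."""
    return not any(start <= t <= end and payload == MARK for t, payload in merged)


# The two archives from the example; payloads are placeholders.
first = [(ms(0, 10), "first"), (ms(9, 15), "last")]
second = [(ms(9, 20), "first"), (ms(24, 10), "last")]
merged = merge_with_marks([second, first])
```

       With these inputs the mark lands between the 09:15 and 09:20 records,
       so interval_has_data() is true for the intervals ending at 09:00 and
       10:00 and false for the interval ending at 09:30.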

FILES

       For each of the input and output archive logs, several physical files
       are used.

       archive.meta
                 metadata (metric descriptions, instance domains, etc.) for
                 the archive log
       archive.0 initial volume of metrics values (subsequent volumes have
                 suffixes 1, 2, ...)
       archive.index
                 temporal index to support rapid random access to the other
                 files in the archive log.

PCP ENVIRONMENT

       Environment variables with the prefix PCP_ are used to parameterize
       the file and directory names used by PCP.  On each installation, the
       file /etc/pcp.conf contains the local values for these variables.  The
       $PCP_CONF variable may be used to specify an alternative configuration
       file, as described in pcp.conf(4).

SEE ALSO

       PCPIntro(1), pmdumplog(1), pmlc(1), pmlogger(1), pmlogreduce(1),
       pcp.conf(4) and pcp.env(4).

DIAGNOSTICS

       All error conditions detected by pmlogextract are reported on stderr
       with a textual (if sometimes terse) explanation.

       Should one of the input archive logs be corrupted (this can happen if
       the pmlogger instance writing the log suddenly dies), then
       pmlogextract will detect and report the position of the corruption in
       the file, and any subsequent information from that archive log will
       not be processed.

       If any error is detected, pmlogextract will exit with a non-zero
       status.

CAVEATS

       The preamble metrics (pmcd.pmlogger.archive, pmcd.pmlogger.host, and
       pmcd.pmlogger.port), which are automatically recorded by pmlogger at
       the start of the archive, may not be present in the archive output by
       pmlogextract.  These metrics are only relevant while the archive is
       being created, and have no significance once recording has finished.



Performance Co-Pilot                  SGI                      PMLOGEXTRACT(1)