PMIE(1)                     General Commands Manual                    PMIE(1)

NAME

       pmie - inference engine for performance metrics

SYNOPSIS

       pmie [-bCdeFfHPqvVWxXz?] [-a archive] [-A align] [-c config] [-h host]
       [-l logfile] [-j stompfile] [-n pmnsfile] [-O origin] [-S starttime]
       [-t interval] [-T endtime] [-U username] [-Z timezone] [filename ...]

DESCRIPTION

15       pmie accepts a collection of arithmetic, logical, and rule  expressions
16       to  be  evaluated  at  specified  frequencies.   The  base data for the
17       expressions consists of performance metrics values delivered  in  real-
18       time  from  any  host running the Performance Metrics Collection Daemon
19       (PMCD), or using historical data from Performance  Co-Pilot  (PCP)  ar‐
20       chive logs.
21
       As well as computing arithmetic and logical values, pmie can execute
       actions (raise popup alarms, write system log messages, and launch
       programs) in response to specified conditions.  Such actions are useful
       for detecting, monitoring and correcting performance-related problems.
26
27       The expressions to be evaluated are read from configuration files spec‐
28       ified  by  one or more filename arguments.  In the absence of any file‐
29       name, expressions are read from standard input.
30
31       Output from pmie is directed to standard output and standard  error  as
32       follows:
33
34       stdout
35            Expression values printed in the verbose -v mode and the output of
36            print actions.
37
38       stderr
39            Error and warning messages for any syntactic or semantic  problems
40            during expression parsing, and any semantic or performance metrics
41            availability problems during expression evaluation.
42

OPTIONS

44       The available command line options are:
45
       -a archive, --archive=archive
            archive is a comma-separated list of names, each of which may be
            the base name of an archive or the name of a directory containing
            one or more archives written by pmlogger(1).  Multiple instances
            of the -a flag may appear on the command line to specify a list of
            sets of archives.  In this case, it is required that only one set
            of archives be present for any one host.  Also, any explicit host
            names occurring in a pmie expression must match the host name
            recorded in one of the archive labels.  In the case of multiple
            sets of archives, timestamps recorded in the archives are used to
            ensure temporal consistency.
57
58       -A align, --align=align
59            Force  the  initial time window to be aligned on the boundary of a
60            natural time unit align.  Refer  to  PCPIntro(1)  for  a  complete
61            description of the syntax for align.
62
63       -b, --buffer
64            Output  will  be  line buffered and standard output is attached to
65            standard error.  This is most useful for background  execution  in
66            conjunction  with the -l option.  The -b option is always used for
67            pmie instances launched from pmie_check(1).
68
69       -c config, --config=config
70            An alternative to specifying filename at the end  of  the  command
71            line.
72
73       -C, --check
74            Parse  the  configuration  file(s)  and exit before performing any
75            evaluations.  Any errors in the configuration file are reported.
76
77       -d, --interact
78            Normally pmie would be launched as a  non-interactive  process  to
79            monitor  and  manage  the performance of one or more hosts.  Given
80            the -d flag however, execution is interactive and the user is pre‐
81            sented  with a menu of options.  Interactive mode is useful mainly
82            for debugging new expressions.
83
       -e, --timestamp
            When used with -V, -v or -W, this option forces timestamps to be
            reported with each expression.  The timestamps are in ctime(3)
            format, enclosed in parentheses, and appear after the expression
            name and before the expression value, e.g.
                 expr_1 (Tue Feb  6 19:55:10 2001): 12
90
       -f, --foreground
            If the -l option is specified and there is no -a option (i.e.
            real-time monitoring) then pmie is run as a daemon in the
            background (in all other cases foreground is the default).  The -f
            (and -F, see below) options force pmie to be run in the
            foreground, independent of any other options.
97
       -F, --systemd
            Like -f, the -F option runs pmie in the foreground, but also does
            some housekeeping (such as creating a pid file, changing user id
            and notifying systemd(1) when pmie has started or is shutting
            down).  This is intended for use when pmie is launched from
            systemd(1) and the daemonizing has already been done.  The -f and
            -F options are mutually exclusive.
105
106       -h host, --host=host
107            By default performance data is fetched from  the  local  host  (in
108            real-time mode) or the host for the first named set of archives on
109            the command line (in archive mode).  The host  argument  overrides
110            this  default.  It does not override hosts explicitly named in the
111            expressions being evaluated.  The host argument is interpreted  as
112            a  connection  specification for pmNewContext, and is later mapped
113            to the remote pmcd's self-reported host name  for  reporting  pur‐
114            poses.   See  also  the  %h  vs.  %c  substitutions in rule action
115            strings below.
116
117       -l logfile, --logfile=logfile
118            Standard error is sent to logfile.
119
       -j stompfile
            An alternative STOMP protocol configuration is loaded from
            stompfile.  If this option is not used, and the stomp action is
            used in any rule, the default location
            $PCP_SYSCONF_DIR/pmie/config/stomp will be used.
125
126       -n pmnsfile, --namespace=pmnsfile
127            An  alternative  Performance  Metrics  Name Space (PMNS) is loaded
128            from the file pmnsfile.
129
       -O origin, --origin=origin
            Specify the origin of the time window.  See PCPIntro(1) for a
            complete description of this option.
133
134       -P, --primary
135            Identifies  this as the primary pmie instance for a host.  See the
136            ``AUTOMATIC RESTART'' section below for further details.
137
138       -q, --quiet
139            Suppresses diagnostic messages that would be printed  to  standard
140            output  by  default, especially the "evaluator exiting" message as
141            this can confuse scripts.
142
       -S starttime, --start=starttime
            Specify the starttime of the time window.  See PCPIntro(1) for a
            complete description of this option.
146
147       -t interval, --interval=interval
148            The interval argument follows the syntax described in PCPIntro(1),
149            and in the simplest form may be an unsigned integer  (the  implied
150            units  in  this case are seconds).  The value is used to determine
151            the sample interval for expressions that  do  not  explicitly  set
152            their  sample  interval  using  the  pmie variable delta described
153            below.  The default is 10.0 seconds.
154
       -T endtime, --finish=endtime
            Specify the endtime of the time window.  See PCPIntro(1) for a
            complete description of this option.
158
159       -U username, --username=username
160            User  account under which to run pmie.  The default is the current
161            user account for interactive use.   When  run  as  a  daemon,  the
162            unprivileged "pcp" account is used in current versions of PCP, but
163            in older versions the  superuser  account  ("root")  was  used  by
164            default.
165
       -v   Unless one of the verbose options -V, -v or -W appears on the
            command line, expressions are evaluated silently; the only output
            is the result of any actions being executed.  In the verbose
            mode, specified using the -v flag, the value of each expression is
            printed as it is evaluated.  The values are in canonical units:
            bytes in the dimension of ``space'', seconds in the dimension of
            ``time'' and events in the dimension of ``count''.  See
            pmLookupDesc(3) for details of the supported dimension and scaling
            mechanisms for performance metrics.  The verbose mode is useful
            for monitoring the value of given expressions, evaluating derived
            performance metrics, passing these values on to other tools for
            further processing, and debugging new expressions.
178
179       -V, --verbose
180            This option has the same effect as the -v option, except that  the
181            name  of the host and instance (if applicable) are printed as well
182            as expression values.
183
184       -W   This option has the same effect as the -V option described  above,
185            except  that  for boolean expressions, only those names and values
186            that make the expression true are printed.   These  are  the  same
187            names  and values accessible to rule actions as the %h, %i, %c and
188            %v bindings, as described below.
189
190       -x, --secret-agent
191            Execute in domain agent mode.  This mode is used within  the  Per‐
192            formance  Co-Pilot  product  to derive values for summary metrics,
193            see pmdasummary(1).  Only restricted functionality is available in
194            this mode (expressions with actions may not be used).
195
196       -X, --secret-applet
197            Run in secret applet mode (thin client).
198
199       -z, --hostzone
200            Change  the reporting timezone to the timezone of the host that is
201            the source of the performance metrics, as  identified  via  either
202            the  -h  option  or  the first named set of archives (as described
203            above for the -a option).
204
205       -Z timezone, --timezone=timezone
206            Change the reporting timezone to timezone in  the  format  of  the
207            environment variable TZ as described in environ(7).
208
209       -?, --help
210            Display usage message and exit.
211

EXAMPLES

       The following example expressions demonstrate some of the capabilities
       of the inference engine.

       The directory $PCP_DEMOS_DIR/pmie contains a number of other annotated
       examples of pmie expressions.

       The variable delta controls expression evaluation frequency.  Specify
       that subsequent expressions be evaluated once a second, until further
       notice:

            delta = 1 sec;

       If the total context switch rate exceeds 10000 per second per CPU, then
       display an alarm notifier:

            kernel.all.pswitch / hinv.ncpu > 10000 count/sec
            -> alarm "high context switch rate %v";

       If the high context switch rate is sustained for 10 consecutive
       samples, then launch top(1) in an xterm(1) window to monitor processes,
       but do this at most once every 5 minutes:

            all_sample (
                kernel.all.pswitch @0..9 > 10 Kcount/sec * hinv.ncpu
            ) -> shell 5 min "xterm -e 'top'";

       The following rules are evaluated once every 20 seconds:

            delta = 20 sec;

       If any disk is performing more than 60 I/Os per second, then print a
       message identifying the busy disk to standard output and launch
       dkvis(1):

            some_inst (
                disk.dev.total > 60 count/sec
            ) -> print "busy disks:" " %i" &
                 shell 5 min "dkvis";

       Refine the preceding rule to apply only between the hours of 9am and
       5pm, and to require 3 of 4 consecutive samples to exceed the threshold
       before executing the action:

            $hour >= 9 && $hour <= 17 &&
            some_inst (
              75 %_sample (
                disk.dev.total @0..3 > 60 count/sec
              )
            ) -> print "disks busy for 20 sec:" " [%h]%i";

       The following two rules are evaluated once every 10 minutes:

            delta = 10 min;

       If either the / or the /usr filesystem is more than 95% full, display
       an alarm popup, but not if it has already been displayed during the
       last 4 hours:

            filesys.free #'/dev/root' /
                filesys.capacity #'/dev/root' < 0.05
            -> alarm 4 hour "root filesystem (almost) full";

            filesys.free #'/dev/usr' /
                filesys.capacity #'/dev/usr' < 0.05
            -> alarm 4 hour "/usr filesystem (almost) full";

       The following rule requires a machine that supports the lmsensors
       metrics.  If the machine environment temperature rises more than 2
       degrees over a 10 minute interval, write an entry in the system log:

            lmsensors.coretemp_isa.temp1 @0 - lmsensors.coretemp_isa.temp1 @1 > 2
            -> alarm "temperature rising fast" &
               syslog "machine room temperature rise alarm";

       And something interesting if you have performance problems with your
       Oracle database:

            // back to 30sec evaluations
            delta = 30 sec;
            sid = "ptg1";       # $ORACLE_SID setting
            lid = "223";        # latch ID from v$latch
            lru = "#'$sid/$lid cache buffers lru chain'";
            host = ":moomba.melbourne.sgi.com";
            gets = "oracle.latch.gets $host $lru";
            total = "oracle.latch.gets $host $lru +
                     oracle.latch.misses $host $lru +
                     oracle.latch.immisses $host $lru";

            $total > 100 && $gets / $total < 0.2
            -> alarm "high lru latch contention in database $sid";

       The following ruleset will emit exactly one message depending on the
       availability and value of the 1-minute load average.

            delta = 1 minute;
            ruleset
                 kernel.all.load #'1 minute' > 10 * hinv.ncpu ->
                     print "extreme load average %v"
            else kernel.all.load #'1 minute' > 2 * hinv.ncpu ->
                     print "moderate load average %v"
            unknown ->
                     print "load average unavailable"
            otherwise ->
                     print "load average OK"
            ;

       The following rule will emit a message when some filesystem is more
       than 75% full and is filling at a rate that if sustained would fill the
       filesystem to 100% in less than 30 minutes.

            some_inst (
                100 * filesys.used / filesys.capacity > 75 &&
                filesys.used + 30min * (rate filesys.used) > filesys.capacity
            ) -> print "filesystem will be full within 30 mins:" " %i";

       If the metric mypmda.errors counts errors then the following rule will
       emit a message if the rate of errors exceeds 1 per second provided the
       error count is less than 100.

            mypmda.errors > 1 && instant mypmda.errors < 100
            -> print "high error rate: %v";

QUICK START

336       The pmie specification language is powerful and large.
337
338       To expedite rapid development of pmie rules, the pmieconf(1) tool  pro‐
339       vides a facility for generating a pmie configuration file from a set of
340       generalized pmie rules.  The supplied set of rules covers a wide  range
341       of performance scenarios.
342
343       The  Performance  Co-Pilot  User's and Administrator's Guide provides a
344       detailed tutorial-style chapter covering pmie.
345

EXPRESSION SYNTAX

347       This description is terse  and  informal.   For  a  more  comprehensive
348       description  see  the  Performance  Co-Pilot User's and Administrator's
349       Guide.
350
351       A pmie specification is a sequence of semicolon terminated expressions.
352
353       Basic operators are modeled on the arithmetic, relational  and  Boolean
354       operators  of  the  C  programming  language.   Precedence rules are as
355       expected, although the use of  parentheses  is  encouraged  to  enhance
356       readability and remove ambiguity.
357
358       Operands are performance metric names (see PMNS(5)) and the normal lit‐
359       eral constants.
360
361       Operands involving performance metrics may produce sets of values, as a
362       result  of  enumeration in the dimensions of hosts, instances and time.
363       Special qualifiers may appear after a performance metric name to define
364       the enumeration in each dimension.  For example,
365
366           kernel.percpu.cpu.user :foo :bar #cpu0 @0..2
367
368       defines 6 values corresponding to the time spent executing in user mode
369       on CPU 0 on the hosts ``foo'' and ``bar'' over the last  3  consecutive
370       samples.   The  default  interpretation  in  the absence of : (host), #
371       (instance) and @ (time) qualifiers is all instances at the most  recent
372       sample time for the default source of PCP performance metrics.
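
The enumeration above can be sketched in Python as a cross product of the three dimensions.  This is only an illustrative model, not how pmie is implemented; the list names are hypothetical and simply mirror the qualifiers in the example expression.

```python
from itertools import product

# Hypothetical qualifier lists mirroring the example expression above.
hosts = ["foo", "bar"]      # :foo :bar
instances = ["cpu0"]        # #cpu0
samples = [0, 1, 2]         # @0..2 (0 is the most recent sample)

# One value is produced for every (host, instance, sample) combination.
value_slots = list(product(hosts, instances, samples))

print(len(value_slots))     # 2 hosts x 1 instance x 3 samples
```

Adding a second instance qualifier would double the number of values, since each dimension multiplies the size of the set.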
373
       Host and instance names that do not follow the rules for variables in
       programming languages, i.e. an alphabetic character optionally followed
       by alphanumerics, should be enclosed in single quotes.
377
378       Expression  evaluation  follows  the law of ``least surprises''.  Where
379       performance metrics have the semantics of a counter, pmie will automat‐
380       ically  convert  to  a rate based upon consecutive samples and the time
381       interval between these samples.  All numeric expressions are  evaluated
382       in  double  precision, and where appropriate, automatically scaled into
383       canonical units of ``bytes'', ``seconds'' and ``counts''.
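
As a rough sketch of the counter-to-rate conversion described above, the following Python function divides the change in a counter by the elapsed time between two consecutive samples.  The function name and the sample values are hypothetical, and details such as counter wrap handling are deliberately left out.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Rate of change per second between two consecutive counter samples."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_value - prev_value) / elapsed

# Hypothetical samples of a context switch counter taken 10 seconds apart.
rate = counter_rate(prev_value=1_000_000, prev_time=0.0,
                    curr_value=1_120_000, curr_time=10.0)
print(rate)   # switches per second over the interval
```

Because two samples are needed, no rate is available on the first evaluation; this is one of the situations in which a pmie expression has the value unknown.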
384
385       A rule is a special form of expression that specifies  a  condition  or
386       logical expression, a special operator (->) and actions to be performed
387       when the condition is found to be true.
388
389       The following table summarizes the basic pmie operators:
390
         ┌────────────────┬────────────────────────────────────────────────┐
         │   Operators    │                  Explanation                   │
         ├────────────────┼────────────────────────────────────────────────┤
         │+ - * /         │ Arithmetic                                     │
         │< <= == >= > != │ Relational (value comparison)                  │
         │! && ||         │ Boolean                                        │
         │->              │ Rule                                           │
         │rising          │ Boolean, false to true transition              │
         │falling         │ Boolean, true to false transition              │
         │rate            │ Explicit rate conversion (rarely required)     │
         │instant         │ No automatic rate conversion (rarely required) │
         └────────────────┴────────────────────────────────────────────────┘
403       All operators are supported for  numeric-valued  operands  and  expres‐
404       sions.   For  string-valued  operands,  namely literal string constants
405       enclosed in double quotes  or  metrics  with  a  data  type  of  string
406       (PM_TYPE_STRING), only the operators == and != are supported.
407
408       The  rate and instant operators are the logical inverse of one another,
409       so an arithmetic expression expr is equal to rate  instant  expr.   The
410       more  useful  cases  involve  using  rate  with  a metric that is not a
411       counter to determine the rate of change over time  or  instant  with  a
412       metric  that is a counter to determine if the current value is above or
413       below some threshold.
414
415       Aggregate operators may be used to aggregate  or  summarize  along  one
416       dimension  of  a set-valued expression.  The following aggregate opera‐
417       tors map from a logical expression to a  logical  expression  of  lower
418       dimension.
419
         ┌─────────────────────────┬─────────────┬──────────────────────────┐
         │       Operators         │    Type     │       Explanation        │
         ├─────────────────────────┼─────────────┼──────────────────────────┤
         │some_inst                │ Existential │ True if at least one set │
         │some_host                │             │ member is true in the    │
         │some_sample              │             │ associated dimension     │
         ├─────────────────────────┼─────────────┼──────────────────────────┤
         │all_inst                 │ Universal   │ True if all set members  │
         │all_host                 │             │ are true in the          │
         │all_sample               │             │ associated dimension     │
         ├─────────────────────────┼─────────────┼──────────────────────────┤
         │N%_inst                  │ Percentile  │ True if at least N       │
         │N%_host                  │             │ percent of set members   │
         │N%_sample                │             │ are true in the          │
         │                         │             │ associated dimension     │
         └─────────────────────────┴─────────────┴──────────────────────────┘
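
The three families of logical aggregate operators can be modelled in Python over a list of truth values taken along one dimension.  This is a simplified sketch: the function names are hypothetical, and unknown values (see BOOLEAN EXPRESSIONS) are not modelled.

```python
def some(values):
    """some_inst / some_host / some_sample: at least one member is true."""
    return any(values)

def every(values):
    """all_inst / all_host / all_sample: every member is true."""
    return all(values)

def percent(n, values):
    """N%_inst / N%_host / N%_sample: at least n percent of members are true."""
    values = list(values)
    true_count = sum(1 for v in values if v)
    # Integer comparison avoids floating point rounding at the boundary.
    return 100 * true_count >= n * len(values)

# Four consecutive samples of a threshold test, three of which are true.
samples = [True, True, False, True]
print(some(samples), every(samples), percent(75, samples))
```

With four samples, a 75% threshold is met by exactly three true members, which is how the ``3 of 4 consecutive samples'' rule in the EXAMPLES section is expressed.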
436       The following instantial operators may be used to  filter  or  limit  a
437       set-valued  logical expression, based on regular expression matching of
438       instance names.  The logical expression must be  a  set  involving  the
439       dimension  of instances, and the regular expression is of the form used
440       by egrep(1) or the Extended Regular Expressions of regcomp(3).
441
              ┌─────────────┬──────────────────────────────────────────┐
              │ Operators   │               Explanation                │
              ├─────────────┼──────────────────────────────────────────┤
              │match_inst   │ For each value of the logical expression │
              │             │ that is ``true'', the result is ``true'' │
              │             │ if the associated instance name matches  │
              │             │ the regular expression.  Otherwise the   │
              │             │ result is ``false''.                     │
              ├─────────────┼──────────────────────────────────────────┤
              │nomatch_inst │ For each value of the logical expression │
              │             │ that is ``true'', the result is ``true'' │
              │             │ if the associated instance name does not │
              │             │ match the regular expression.  Otherwise │
              │             │ the result is ``false''.                 │
              └─────────────┴──────────────────────────────────────────┘

       For example, the expression below will be ``true'' for disks attached
       to controllers 2 or 3 performing more than 20 operations per second:

            match_inst "^dks[23]d" disk.dev.total > 20;
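
The filtering behaviour of match_inst and nomatch_inst can be sketched in Python as follows.  This is a model only: the function names and the instance names are hypothetical, and Python's re module is not identical to the POSIX Extended Regular Expressions used by pmie, although the simple pattern from the example behaves the same way in both.

```python
import re

def match_inst(pattern, truth_by_instance):
    """Keep a true value only if its instance name matches the pattern."""
    regex = re.compile(pattern)
    return {name: bool(value and regex.search(name))
            for name, value in truth_by_instance.items()}

def nomatch_inst(pattern, truth_by_instance):
    """Keep a true value only if its instance name does not match."""
    regex = re.compile(pattern)
    return {name: bool(value and not regex.search(name))
            for name, value in truth_by_instance.items()}

# Hypothetical result of "disk.dev.total > 20" for four disk instances.
busy = {"dks1d2": True, "dks2d1": True, "dks2d7": False, "dks3d4": True}

print(match_inst("^dks[23]d", busy))
print(nomatch_inst("^dks[23]d", busy))
```

Note that a value that is already false stays false under both operators, so the two results are not complements of each other.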
460
461       The  following aggregate operators map from an arithmetic expression to
462       an arithmetic expression of lower dimension.
463
          ┌─────────────────────────┬───────────┬──────────────────────────┐
          │       Operators         │   Type    │       Explanation        │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │min_inst                 │ Extrema   │ Minimum value across all │
          │min_host                 │           │ set members in the       │
          │min_sample               │           │ associated dimension     │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │max_inst                 │ Extrema   │ Maximum value across all │
          │max_host                 │           │ set members in the       │
          │max_sample               │           │ associated dimension     │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │sum_inst                 │ Aggregate │ Sum of values across all │
          │sum_host                 │           │ set members in the       │
          │sum_sample               │           │ associated dimension     │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │avg_inst                 │ Aggregate │ Average value across all │
          │avg_host                 │           │ set members in the       │
          │avg_sample               │           │ associated dimension     │
          └─────────────────────────┴───────────┴──────────────────────────┘
483       The aggregate operators count_inst,  count_host  and  count_sample  map
484       from  a  logical expression to an arithmetic expression of lower dimen‐
485       sion by counting the number of set members for which the expression  is
486       true in the associated dimension.
487
488       For action rules, the following actions are defined:
489
                ┌──────────┬────────────────────────────────────────┐
                │Operators │              Explanation               │
                ├──────────┼────────────────────────────────────────┤
                │alarm     │ Raise a visible alarm with xconfirm(1) │
                │print     │ Display on standard output             │
                │shell     │ Execute with sh(1)                     │
                │stomp     │ Send a STOMP message to a JMS server   │
                │syslog    │ Append a message to system log file    │
                └──────────┴────────────────────────────────────────┘
       Multiple actions may be separated by the & and | operators to specify
       respectively sequential execution (both actions are executed) and
       alternate execution (the second action will only be executed if the
       execution of the first action returns a non-zero error status).
503
504       Arguments to actions are an optional suppression time, and then one  or
505       more  expressions (a string is an expression in this context).  Strings
506       appearing as arguments to an action may include the  following  special
507       selectors that will be replaced at the time the action is executed.
508
509       %h  Host  name(s)  that  make the left-most top-level expression in the
510           condition true.
511
512       %c  Connection specification string(s) or files for a PCP tool to reach
513           the  hosts or archives that make the left-most top-level expression
514           in the condition true.
515
516       %i  Instance(s) that make the left-most  top-level  expression  in  the
517           condition true.
518
519       %v  One  value from the left-most top-level expression in the condition
520           for each host and instance pair that makes the condition true.
521
522       Note that expansion of the special selectors is done by  repeating  the
523       whole  argument  once  for each unique binding to any of the qualifying
524       special selectors.  For example if a rule were true for the host mumble
525       with  instances  grunt and snort, and for host fumble the instance puff
526       makes the rule true, then the action
527            ...
528            -> shell myscript "Warning: %h:%i busy ";
529       will execute myscript with the argument string  "Warning:  mumble:grunt
530       busy Warning: mumble:snort busy Warning: fumble:puff busy".
531
532       By comparison, if the action
533            ...
534            -> shell myscript "Warning! busy:" " %h:%i";
535       were executed under the same circumstances, then myscript would be exe‐
536       cuted with  the  argument  string  "Warning!  busy:  mumble:grunt  mum‐
537       ble:snort fumble:puff".
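
The two expansions above can be reproduced with a small Python model.  This is a sketch of the behaviour as described in this section, not pmie's actual code: the function names and the shape of the bindings list are hypothetical, and only the selectors present in a binding can be expanded.

```python
import re

SELECTOR = re.compile(r"%[hciv]")

def expand_argument(arg, bindings):
    """Repeat arg once per unique binding of the selectors it contains."""
    used = SELECTOR.findall(arg)            # e.g. ["%h", "%i"]
    if not used:
        return arg                          # constant argument, emitted once
    keys = [selector[1] for selector in used]
    unique = []
    for binding in bindings:
        combo = tuple(binding[key] for key in keys)
        if combo not in unique:
            unique.append(combo)
    pieces = []
    for combo in unique:
        values = iter(combo)
        # Substitute the selectors left to right with this binding's values.
        pieces.append(SELECTOR.sub(lambda match: next(values), arg))
    return "".join(pieces)

def expand_action(args, bindings):
    """Concatenate the expansion of every argument of an action."""
    return "".join(expand_argument(arg, bindings) for arg in args)

# Host and instance pairs that make the rule true in the example above.
bindings = [
    {"h": "mumble", "i": "grunt"},
    {"h": "mumble", "i": "snort"},
    {"h": "fumble", "i": "puff"},
]

print(expand_action(["Warning: %h:%i busy "], bindings))
print(expand_action(["Warning! busy:", " %h:%i"], bindings))
```

The first call repeats the whole ``Warning:'' text three times, while the second emits the constant first argument once and repeats only the short second argument.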
538
       The semantics of the expansion of the special selectors leads to a
       common usage pattern in an action, where the first argument is a
       constant (contains no special selectors), the second argument contains
       the desired special selectors with minimal separator characters, and an
       optional third argument provides a constant postscript (e.g. to
       terminate any argument quoting from the first argument).  If necessary,
       post-processing (e.g. in myscript) can provide the necessary
       enumeration over each unique expansion of the string containing just
       the special selectors.
547
       For complex conditions, the bindings to these selectors are not
       obvious.  It is strongly recommended that pmie be used in the debugging
       mode (specify the -W command line option in particular) during rule
       development.
552

BOOLEAN EXPRESSIONS

554       pmie  expressions that have the semantics of a Boolean, e.g.  foo.bar >
555       10 or some_inst ( my.table < 0 ) are assigned the values true or  false
556       or unknown.  A value is unknown if one or more of the underlying metric
557       values is unavailable, e.g.  pmcd(1) on the host cannot  be  contacted,
558       the  metric  is  not in the PCP archive, no values are currently avail‐
559       able, insufficient values have been fetched to allow a  rate  converted
560       value  to  be  computed  or  insufficient  values  have been fetched to
561       instantiate the required number of samples in the temporal domain.
562
563       Boolean operators follow the normal rules of Kleene logic (aka 3-valued
564       logic) when combining values that include unknown:
565
566                      ┌────────────┬───────────────────────────┐
567                      │            │             B             │
568                      │  A and B   ├─────────┬───────┬─────────┤
569                      │            │  true   │ false │ unknown │
570                      ├──┬─────────┼─────────┼───────┼─────────┤
571                      │  │  true   │  true   │ false │ unknown │
572                      │  ├─────────┼─────────┼───────┼─────────┤
573                      │A │  false  │  false  │ false │  false  │
574                      │  ├─────────┼─────────┼───────┼─────────┤
575                      │  │ unknown │ unknown │ false │ unknown │
576                      └──┴─────────┴─────────┴───────┴─────────┘
577                      ┌────────────┬──────────────────────────┐
578                      │            │            B             │
579                      │  A or B    ├──────┬─────────┬─────────┤
580                      │            │ true │  false  │ unknown │
581                      ├──┬─────────┼──────┼─────────┼─────────┤
582                      │  │  true   │ true │  true   │  true   │
583                      │  ├─────────┼──────┼─────────┼─────────┤
584                      │A │  false  │ true │  false  │ unknown │
585                      │  ├─────────┼──────┼─────────┼─────────┤
586                      │  │ unknown │ true │ unknown │ unknown │
587                      └──┴─────────┴──────┴─────────┴─────────┘
588                                 ┌────────┬─────────┐
589                                 │   A    │  not A  │
590                                 ├────────┼─────────┤
591                                 │  true  │  false  │
592                                 ├────────┼─────────┤
593                                 │ false  │  true   │
594                                 ├────────┼─────────┤
595                                 │unknown │ unknown │
596                                 └────────┴─────────┘
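
          The Kleene truth tables above can be modelled in a few lines.  The
          following is a minimal Python sketch, not pmie source code: it assumes
          None stands in for unknown, and the helper names k_and, k_or and k_not
          are hypothetical.

```python
from typing import Optional

# True, False, or None standing in for pmie's "unknown" (an assumption of this sketch).
Tri = Optional[bool]


def k_and(a: Tri, b: Tri) -> Tri:
    """Kleene conjunction: any false operand decides the result."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a: Tri, b: Tri) -> Tri:
    """Kleene disjunction: any true operand decides the result."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def k_not(a: Tri) -> Tri:
    """Kleene negation: unknown stays unknown."""
    return None if a is None else not a
```

          For example, k_and(None, False) is False because a false operand
          decides a conjunction whatever the other operand is, whereas
          k_and(None, True) remains unknown.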

RULESETS

598       The  ruleset  clause  is used to define a set of rules and actions that
599       are evaluated in order until some action is executed,  at  which  point
600       the  remaining rules and actions are skipped until the ruleset is again
601       scheduled for evaluation.  The keyword else is used to separate  rules.
602       After  one  or  more  regular rules (with a predicate and an action), a
603       ruleset may include an optional
604            unknown -> action
605       clause, optionally followed by an
606            otherwise -> action
607       clause.
608
609       If all of the predicates in the rules evaluate to unknown and an unknown
610       clause has been specified, then the action associated with the unknown
611       clause will be executed.
612
613       If no rule predicate is true and the unknown action is either not spec‐
614       ified  or not executed and an otherwise clause has been specified, then
615       the action associated with the otherwise clause will be executed.
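
          The selection order described above can be sketched as follows.  This
          is a simplified Python model, not pmie source code: it assumes each
          predicate has already been evaluated to True, False or None (for
          unknown) and that every action runs successfully.  The function
          evaluate_ruleset and its return labels are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# (already-evaluated predicate, action); None models an unknown predicate.
Rule = Tuple[Optional[bool], Callable[[], None]]


def evaluate_ruleset(
    rules: List[Rule],
    unknown_action: Optional[Callable[[], None]] = None,
    otherwise_action: Optional[Callable[[], None]] = None,
) -> str:
    """Run at most one action and report which clause fired."""
    # Regular rules are tried in order; the first true predicate wins
    # and the remaining rules are skipped.
    for index, (value, action) in enumerate(rules):
        if value is True:
            action()
            return f"rule {index}"
    # The unknown clause fires only when every predicate is unknown.
    if unknown_action is not None and all(value is None for value, _ in rules):
        unknown_action()
        return "unknown"
    # The otherwise clause catches everything that is left.
    if otherwise_action is not None:
        otherwise_action()
        return "otherwise"
    return "none"
```

          With predicates (None, False), an unknown clause and an otherwise
          clause, this model selects the otherwise clause, because not all of
          the predicates are unknown.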
616

SCALE FACTORS

618       Scale factors may be appended to arithmetic expressions and force  lin‐
619       ear  scaling of the value to canonical units.  Simple scale factors are
620       constructed from the keywords: nanosecond, nanosec, nsec,  microsecond,
621       microsec,  usec, millisecond, millisec, msec, second, sec, minute, min,
622       hour, byte, Kbyte, Mbyte, Gbyte, Tbyte, count, Kcount and  Mcount,  and
623       the operator /, for example ``Kbytes / hour''.
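
          The arithmetic behind a scale factor can be illustrated with a short
          Python sketch.  It is not pmie source code and rests on assumptions
          that are not stated above: that the canonical units are bytes and
          seconds, and that Kbyte, Mbyte, Gbyte and Tbyte are powers of 1024.
          The UNITS table (a subset of the keywords) and the function
          to_canonical are hypothetical.

```python
from typing import Optional

# Multipliers to the assumed canonical units (seconds and bytes).
UNITS = {
    "nsec": 1e-9, "usec": 1e-6, "msec": 1e-3,
    "sec": 1.0, "min": 60.0, "hour": 3600.0,
    "byte": 1.0, "Kbyte": 1024.0, "Mbyte": 1024.0 ** 2,
    "Gbyte": 1024.0 ** 3, "Tbyte": 1024.0 ** 4,
}


def to_canonical(value: float, unit: str, per: Optional[str] = None) -> float:
    """Scale `value unit` or `value unit / per` linearly to canonical units."""
    scaled = value * UNITS[unit]
    if per is not None:
        scaled /= UNITS[per]
    return scaled
```

          Under these assumptions, to_canonical(2, "Mbyte", "sec") gives
          2097152 bytes per second, and to_canonical(90, "min") gives 5400
          seconds.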
624

MACROS

626       Macros are defined using expressions of the form:
627
628            name = constexpr;
629
630       where name follows the normal rules for variables in programming
631       languages, i.e. an alphabetic character optionally followed by
632       alphanumerics.  constexpr must be a constant expression, either a string
633       (enclosed in double quotes) or an arithmetic expression optionally
634       followed by a scale factor.
635
636       Macros are expanded when their name, prefixed by a dollar ($), appears
637       in an expression, and macros may be nested within a constexpr string.
638
639       The following reserved macro names are understood.
640
641       minute    Current minute of the hour.
642
643       hour      Current hour of the day, in the range 0 to 23.
644
645       day       Current day of the month, in the range 1 to 31.
646
647       month     Current month of the year, in the range  0  (January)  to  11
648                 (December).
649
650       year      Current year.
651
652       day_of_week
653                 Current day of the week, in the range 0 (Sunday) to 6 (Satur‐
654                 day).
655
656       delta     Sample interval in effect for this expression.
657
658       Dates and times are presented in the reporting time zone (see  descrip‐
659       tion of -Z and -z command line options above).
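
          The ranges of the reserved macros are easy to get wrong, in particular
          the zero-based month and the Sunday-based day of the week.  The Python
          sketch below is an illustration, not pmie source code: it computes
          values with the ranges listed above from a timestamp that is assumed
          to be already expressed in the reporting time zone.  The function
          reserved_macros is hypothetical, and treating year as the full
          calendar year is an assumption.

```python
from datetime import datetime


def reserved_macros(now: datetime, delta_seconds: float) -> dict:
    """Map a timestamp to values using the ranges of pmie's reserved macros."""
    return {
        "minute": now.minute,
        "hour": now.hour,                      # 0 to 23
        "day": now.day,                        # 1 to 31
        "month": now.month - 1,                # 0 (January) to 11 (December)
        "year": now.year,                      # assumed: full calendar year
        "day_of_week": now.isoweekday() % 7,   # 0 (Sunday) to 6 (Saturday)
        "delta": delta_seconds,                # sample interval in seconds
    }
```

          For Sunday 7 January 2024 this yields month 0 and day_of_week 0, so a
          predicate that tests for January must compare the month against 0,
          not 1.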
660

AUTOMATIC RESTART

662       It is often useful for pmie processes to be started and stopped when the
663       local host is booted or shut down, or restarted when they are detected
664       as no longer running (for example, after exiting unexpectedly).
665       Refer to pmie_check(1) for details on automating this process.
666
667       Optionally, each system running pmcd(1) may also be configured to run a
668       ``primary''   pmie   instance.   This  pmie  instance  is  launched  by
669       $PCP_RC_DIR/pmie,     and     is     affected     by     the      files
670       $PCP_SYSCONF_DIR/pmie/control,   $PCP_SYSCONF_DIR/pmie/control.d   (use
671       chkconfig(8), systemctl(1) or  similar  platform-specific  commands  to
672       activate  or  disable  the primary pmie instance) and $PCP_VAR_DIR/con‐
673       fig/pmie/config.default (the default initial configuration file for the
674       primary pmie).
675
676       The primary pmie instance is identified by the -P option.  There may be
677       at most one ``primary'' pmie instance on each system.  The primary pmie
678       instance  (if  any)  must be running on the same host as the pmcd(1) to
679       which it connects (if any), so the  -h  and  -P  options  are  mutually
680       exclusive.
681

EVENT MONITORING

683       It  is common for production systems to be monitored in a central loca‐
684       tion.  Traditionally on UNIX systems this has  been  performed  by  the
685       system  log  facilities  -  see logger(1), and syslogd(1).  On Windows,
686       communication with the system event log is handled by pcp-eventlog(1).
687
688       pmie fits into this model when rules use the syslog action.  Note  that
689       if  the  action  string  begins with -p (priority) and/or -t (tag) then
690       these are extracted from the string and treated in the same way  as  in
691       logger(1) and pcp-eventlog(1).
692
693       However, it is common to have other event monitoring frameworks as well,
694       into which you may wish to incorporate performance events from pmie.
695       You can often use the shell action to send events to these frameworks,
696       as they usually provide their own program for injecting events into the
697       framework from external sources.
698
699       A final option is to use the stomp (Streaming Text Oriented Messaging
700       Protocol) action, which allows pmie to connect to a central JMS (Java
701       Message Service) server and send events to the PMIE topic.  Tools can
702       be written to extract these text messages and present them to
703       operations people (via desktop popup windows, etc).  Use of the stomp
704       action requires a stomp configuration file to be set up, which specifies
705       the location of the JMS server host, port number, and username/password.
706
707       The format of this file is as follows:
708
709            host=messages.sgi.com   # this is the JMS server (required)
710            port=61616              # and it's listening here (required)
711            timeout=2               # seconds to wait for server (optional)
712            username=joe            # (required)
713            password=j03ST0MP       # (required)
714            topic=PMIE              # JMS topic for pmie messages (optional)
715
716       The timeout value specifies the time (in seconds) that pmie should wait
717       for acknowledgements from the JMS server after sending  a  message  (as
718       required  by the STOMP protocol).  Note that on startup, pmie will wait
719       indefinitely for a connection, and will not begin rule evaluation until
720       that initial connection has been established.  Should the connection to
721       the JMS server be lost at any time while pmie  is  running,  pmie  will
722       attempt  to  reconnect on each subsequent truthful evaluation of a rule
723       with a stomp action, but not more than once per  minute.   This  is  to
724       avoid contributing to network congestion.  In this situation, where the
725       STOMP connection to the JMS server has been severed, the  stomp  action
726       will return a non-zero error value.
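
          To check a stomp configuration file before handing it to pmie, a small
          validator can be useful.  The Python sketch below is not part of PCP:
          it assumes that # starts a comment anywhere on a line (so values
          containing # are not supported) and that the topic defaults to PMIE
          when omitted.  The function parse_stomp_config is hypothetical.

```python
REQUIRED_KEYS = ("host", "port", "username", "password")


def parse_stomp_config(text: str) -> dict:
    """Parse key=value lines and verify that the required keys are present."""
    settings = {"topic": "PMIE"}  # assumed default for the optional topic
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"expected key=value, got: {raw_line!r}")
        settings[key.strip()] = value.strip()
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    settings["port"] = int(settings["port"])
    if "timeout" in settings:
        settings["timeout"] = int(settings["timeout"])  # seconds
    return settings
```

          A file that omits username or password is rejected with a ValueError
          naming the missing keys, which is quicker to diagnose than a failed
          connection at pmie startup.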
727

BUGS

729       The  lexical  scanner and parser will attempt to recover after an error
730       in the input expressions.  Parsing resumes after skipping input  up  to
731       the next semi-colon (;), however during this skipping process the scan‐
732       ner is ignorant of comments and strings, so an embedded semi-colon  may
733       cause  parsing  to  resume  at  an  unexpected place.  This behavior is
734       largely benign, as until the initial syntax error  is  corrected,  pmie
735       will not attempt any expression evaluation.
736

FILES

738       $PCP_DEMOS_DIR/pmie/*
739            annotated example rules
740
741       $PCP_VAR_DIR/pmns/*
742            default PMNS specification files
743
744       $PCP_TMP_DIR/pmie
745            pmie  maintains  files  in  this directory to identify the running
746            pmie instances  and  to  export  runtime  information  about  each
747            instance  - this data forms the basis of the pmcd.pmie performance
748            metrics
749
750       $PCP_PMIECONTROL_PATH
751            the default set of pmie instances to start at boot time - refer to
752            pmie_check(1) for details
753

PCP ENVIRONMENT

755       Environment variables with the prefix PCP_ are used to parameterize the
756       file and directory names used by PCP.  On each installation,  the  file
757       /etc/pcp.conf  contains  the  local  values  for  these variables.  The
758       $PCP_CONF variable may be used to specify an alternative  configuration
759       file, as described in pcp.conf(5).
760
761       When  executing  shell  actions, pmie overrides two variables - IFS and
762       PATH - in the environment of the child process.  IFS is set to  "\t\n".
763       The  PATH  is  set to a combination of a default path for all platforms
764       ("/usr/sbin:/sbin:/usr/bin:/usr/sbin") and several configurable  compo‐
765       nents.   These  are  (in this order): $PCP_BIN_DIR, $PCP_BINADM_DIR and
766       $PCP_PLATFORM_PATHS.
767
768       When executing  popup  alarm  actions,  pmie  will  use  the  value  of
769       $PCP_XCONFIRM_PROG  as the visual notification program to run.  This is
770       typically set to pmconfirm(1), a cross-platform dialog box.
771

UNIX SEE ALSO

773       logger(1).
774

WINDOWS SEE ALSO

776       pcp-eventlog(1).
777

SEE ALSO

779       PCPIntro(1),   pmcd(1),   pmconfirm(1),   pmdumplog(1),    pmieconf(1),
780       pmie_check(1),  pminfo(1), pmlogger(1), pmval(1), systemd(1), PMAPI(3),
781       pcp.conf(5), pcp.env(5) and PMNS(5).
782

USER GUIDE

784       For a more complete description of the pmie language, refer to the Per‐
785       formance  Co-Pilot  Users  and Administrators Guide.  This is available
786       online from:
787           https://pcp.readthedocs.io/en/latest/UAG/PerformanceMetricsInferenceEngine.html
788
789
790
791Performance Co-Pilot                  PCP                              PMIE(1)