PMIE(1)                     General Commands Manual                    PMIE(1)

NAME

       pmie - inference engine for performance metrics

SYNOPSIS

       pmie  [-bCdeFfPqvVWxXz?]   [-a  archive]  [-A  align] [-c config] [-h
       host] [-l logfile] [-j stompfile] [-n pmnsfile] [-O origin] [-S  start‐
       time]  [-t interval] [-T endtime] [-U username] [-Z timezone] [filename
       ...]

DESCRIPTION

       pmie accepts a collection of arithmetic, logical, and rule  expressions
       to  be  evaluated  at specified frequencies.  The base data for the ex‐
       pressions consists of performance metric values delivered in  real-time
       from any host running the Performance Metrics Collection Daemon (PMCD),
       or historical data from Performance Co-Pilot (PCP) archive logs.

       As well as computing arithmetic and logical values,  pmie  can  execute
       actions  (popup alarms, write system log messages, and launch programs)
       in response to specified conditions.  Such actions are extremely useful
       in detecting, monitoring and correcting performance-related problems.

       The expressions to be evaluated are read from configuration files spec‐
       ified by one or more filename arguments.  In the absence of  any  file‐
       name, expressions are read from standard input.

       Output  from  pmie is directed to standard output and standard error as
       follows:

       stdout
            Expression values printed in the verbose -v mode and the output of
            print actions.

       stderr
            Error  and warning messages for any syntactic or semantic problems
            during expression parsing, and any semantic or performance metrics
            availability problems during expression evaluation.

OPTIONS

       The available command line options are:

       -a archive, --archive=archive
            archive is a comma-separated list of names, each of which
            may be the base name of an archive or the name of a directory con‐
            taining one or more archives written by pmlogger(1).  Multiple in‐
            stances of the -a flag may appear on the command line to specify a
            list  of sets of archives.  In this case, it is required that only
            one set of archives be present for any one host.   Also,  any  ex‐
            plicit  host  names  occurring in a pmie expression must match the
            host name recorded in one of the archive labels.  In the  case  of
            multiple sets of archives, timestamps recorded in the archives are
            used to ensure temporal consistency.

       -A align, --align=align
            Force the initial time window to be aligned on the boundary  of  a
            natural  time unit align.  Refer to PCPIntro(1) for a complete de‐
            scription of the syntax for align.

       -b, --buffer
            Output will be line buffered and standard output  is  attached  to
            standard  error.   This is most useful for background execution in
            conjunction with the -l option.  The -b option is always used  for
            pmie instances launched from pmie_check(1).

       -c config, --config=config
            Read expressions from the configuration file config.  This  is  an
            alternative to specifying filename at the end of the command line.

       -C, --check
            Parse the configuration file(s) and  exit  before  performing  any
            evaluations.  Any errors in the configuration file are reported.

       -d, --interact
            Normally  pmie  would  be launched as a non-interactive process to
            monitor and manage the performance of one or  more  hosts.   Given
            the -d flag however, execution is interactive and the user is pre‐
            sented with a menu of options.  Interactive mode is useful  mainly
            for debugging new expressions.

       -e, --timestamp
            When  used  with -V, -v or -W, this option forces timestamps to be
            reported with each expression.  The  timestamps  are  in  ctime(3)
            format,  enclosed  in  parentheses and appear after the expression
            name and before the expression value, e.g.
                 expr_1 (Tue Feb  6 19:55:10 2001): 12

       -f, --foreground
            If the -l option is specified and there is no -a option (i.e. real-
            time  monitoring)  then  pmie is run as a daemon in the background
            (in all other cases foreground is the default).  The -f  (and  -F,
            see  below)  options force pmie to be run in the foreground, inde‐
            pendent of any other options.

       -F, --systemd
            Like -f, the -F option runs pmie in the foreground, but also  does
            some housekeeping (such as creating a pid file, changing the user
            id and notifying systemd(1) when pmie has started or is shutting
            down).  This is intended for use when pmie is launched from
            systemd(1) and the daemonizing has already been done.  The -f and
            -F options are mutually exclusive.

       -h host, --host=host
            By  default  performance  data  is fetched from the local host (in
            real-time mode) or the host for the first named set of archives on
            the  command  line (in archive mode).  The host argument overrides
            this default.  It does not override hosts explicitly named in  the
            expressions  being evaluated.  The host argument is interpreted as
            a connection specification for pmNewContext, and is  later  mapped
            to  the  remote  pmcd's self-reported host name for reporting pur‐
            poses.  See also the  %h  vs.  %c  substitutions  in  rule  action
            strings below.

       -l logfile, --logfile=logfile
            Standard error is sent to logfile.

       -j stompfile
            An  alternative STOMP protocol configuration is loaded from stomp‐
            file.  If this option is not used, and the stomp action is used in
            any  rule, the default location $PCP_SYSCONF_DIR/pmie/config/stomp
            will be used.

       -n pmnsfile, --namespace=pmnsfile
            An alternative Performance Metrics Name  Space  (PMNS)  is  loaded
            from the file pmnsfile.

       -O origin, --origin=origin
            Specify the origin of the time window.  See PCPIntro(1) for a com‐
            plete description of this option.

       -P, --primary
            Identifies this as the primary pmie instance for a host.  See  the
            ``AUTOMATIC RESTART'' section below for further details.

       -q, --quiet
            Suppresses  diagnostic  messages that would be printed to standard
            output by default, especially the "evaluator exiting"  message  as
            this can confuse scripts.

       -S starttime, --start=starttime
            Specify the starttime of the time window.  See PCPIntro(1)  for  a
            complete description of this option.

       -t interval, --interval=interval
            The interval argument follows the syntax described in PCPIntro(1),
            and  in  the simplest form may be an unsigned integer (the implied
            units in this case are seconds).  The value is used  to  determine
            the  sample  interval  for  expressions that do not explicitly set
            their sample interval using the pmie variable delta described  be‐
            low.  The default is 10.0 seconds.

       -T endtime, --finish=endtime
            Specify the endtime of the time window.  See PCPIntro(1) for a com‐
            plete description of this option.

       -U username, --username=username
            User account under which to run pmie.  The default is the  current
            user  account  for interactive use.  When run as a daemon, the un‐
            privileged "pcp" account is used in current versions of  PCP,  but
            in  older  versions the superuser account ("root") was used by de‐
            fault.

       -v   Unless one of the verbose options -V, -v or -W appears on the com‐
            mand  line, expressions are evaluated silently; the only output is
            the result of any actions being executed.  In  the  verbose  mode,
            specified  using  the  -v  flag,  the  value of each expression is
            printed as it is evaluated.  The values are  in  canonical  units:
            bytes  in  the dimension of ``space'', seconds in the dimension of
            ``time'' and events  in  the  dimension  of  ``count''.   See  pm‐
            LookupDesc(3)  for  details of the supported dimension and scaling
            mechanisms for performance metrics.  The verbose mode is useful in
            monitoring the value of given expressions, evaluating derived per‐
            formance metrics, passing these values on to other tools for  fur‐
            ther processing and in debugging new expressions.

       -V, --verbose
            This  option has the same effect as the -v option, except that the
            name of the host and instance (if applicable) are printed as  well
            as expression values.

       -W   This  option has the same effect as the -V option described above,
            except that for boolean expressions, only those names  and  values
            that  make  the  expression  true are printed.  These are the same
            names and values accessible to rule actions as the %h, %i, %c  and
            %v bindings, as described below.

       -x, --secret-agent
            Execute  in  domain agent mode.  This mode is used within the Per‐
            formance Co-Pilot product to derive values  for  summary  metrics,
            see pmdasummary(1).  Only restricted functionality is available in
            this mode (expressions with actions may not be used).

       -X, --secret-applet
            Run in secret applet mode (thin client).

       -z, --hostzone
            Change the reporting timezone to the timezone of the host that  is
            the  source  of  the performance metrics, as identified via either
            the -h option or the first named set  of  archives  (as  described
            above for the -a option).

       -Z timezone, --timezone=timezone
            Change the reporting timezone to timezone in the format of the en‐
            vironment variable TZ as described in environ(7).

       -?, --help
            Display usage message and exit.
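       As an illustration of how the -t interval interacts with the pmie vari‐
       able delta, consider the following configuration sketch.  The thresholds
       and messages are arbitrary and chosen only for illustration; the metric
       names are those used in the EXAMPLES section.

            // no delta assigned yet: evaluated at the -t interval
            kernel.all.load #'1 minute' > 5
            -> print "load average above 5: %v";

            delta = 1 min;
            // evaluated once a minute, independent of -t
            filesys.free #'/dev/root' /
                filesys.capacity #'/dev/root' < 0.05
            -> print "root filesystem (almost) full";

       Only the first rule is affected by -t; the second takes its sample in‐
       terval from the preceding assignment to delta.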

EXAMPLES

       The following example expressions demonstrate some of the  capabilities
       of the inference engine.

       The  directory $PCP_DEMOS_DIR/pmie contains a number of other annotated
       examples of pmie expressions.

       The variable delta controls expression evaluation  frequency.   Specify
       that  subsequent  expressions be evaluated once a second, until further
       notice:

            delta = 1 sec;

       If the total context switch rate exceeds 10000 per second per CPU, then
       display an alarm notifier:

            kernel.all.pswitch / hinv.ncpu > 10000 count/sec
            -> alarm "high context switch rate %v";

       If  the  high  context switch rate is sustained for 10 consecutive sam‐
       ples, then launch top(1) in an xterm(1) window  to  monitor  processes,
       but do this at most once every 5 minutes:

            all_sample (
                kernel.all.pswitch @0..9 > 10 Kcount/sec * hinv.ncpu
            ) -> shell 5 min "xterm -e 'top'";

       The following rules are evaluated once every 20 seconds:

            delta = 20 sec;

       If  any  disk  is performing more than 60 I/Os per second, then print a
       message identifying  the  busy  disk  to  standard  output  and  launch
       dkvis(1):

            some_inst (
                disk.dev.total > 60 count/sec
            ) -> print "busy disks:" " %i" &
                 shell 5 min "dkvis";

       Refine  the  preceding  rule to apply only between the hours of 9am and
       5pm, and to require 3 of 4 consecutive samples to exceed the  threshold
       before executing the action:

            $hour >= 9 && $hour <= 17 &&
            some_inst (
              75 %_sample (
                disk.dev.total @0..3 > 60 count/sec
              )
            ) -> print "disks busy for 20 sec:" " [%h]%i";

       The following two rules are evaluated once every 10 minutes:

            delta = 10 min;

       If  either  the / or the /usr filesystem is more than 95% full, display
       an alarm popup, but not if it has already  been  displayed  during  the
       last 4 hours:

            filesys.free #'/dev/root' /
                filesys.capacity #'/dev/root' < 0.05
            -> alarm 4 hour "root filesystem (almost) full";

            filesys.free #'/dev/usr' /
                filesys.capacity #'/dev/usr' < 0.05
            -> alarm 4 hour "/usr filesystem (almost) full";

       The  following rule requires a machine that supports the lmsensors met‐
       rics.  If the machine environment temperature rises more than 2 degrees
       over a 10 minute interval, write an entry in the system log:

            lmsensors.coretemp_isa.temp1 @0 - lmsensors.coretemp_isa.temp1 @1 > 2
            -> alarm "temperature rising fast" &
               syslog "machine room temperature rise alarm";

       And  something  interesting  if you have performance problems with your
       Oracle database:

            // back to 30sec evaluations
            delta = 30 sec;
            sid = "ptg1";       # $ORACLE_SID setting
            lid = "223";        # latch ID from v$latch
            lru = "#'$sid/$lid cache buffers lru chain'";
            host = ":moomba.melbourne.sgi.com";
            gets = "oracle.latch.gets $host $lru";
            total = "oracle.latch.gets $host $lru +
                     oracle.latch.misses $host $lru +
                     oracle.latch.immisses $host $lru";

            $total > 100 && $gets / $total < 0.2
            -> alarm "high lru latch contention in database $sid";

       The following ruleset will emit exactly one message  depending  on  the
       availability and value of the 1-minute load average.

            delta = 1 minute;
            ruleset
                 kernel.all.load #'1 minute' > 10 * hinv.ncpu ->
                     print "extreme load average %v"
            else kernel.all.load #'1 minute' > 2 * hinv.ncpu ->
                     print "moderate load average %v"
            unknown ->
                     print "load average unavailable"
            otherwise ->
                     print "load average OK"
            ;

       The  following  rule  will  emit a message when some filesystem is more
       than 75% full and is filling at a rate that if sustained would fill the
       filesystem to 100% in less than 30 minutes.

            some_inst (
                100 * filesys.used / filesys.capacity > 75 &&
                filesys.used + 30min * (rate filesys.used) > filesys.capacity
            ) -> print "filesystem will be full within 30 mins:" " %i";

       If  the metric mypmda.errors counts errors then the following rule will
       emit a message if the rate of errors exceeds 1 per second provided  the
       error count is less than 100.

            mypmda.errors > 1 && instant mypmda.errors < 100
            -> print "high error rate: %v";

QUICK START

       The pmie specification language is powerful and large.

       To  expedite rapid development of pmie rules, the pmieconf(1) tool pro‐
       vides a facility for generating a pmie configuration file from a set of
       generalized  pmie rules.  The supplied set of rules covers a wide range
       of performance scenarios.

       The Performance Co-Pilot User's and Administrator's  Guide  provides  a
       detailed tutorial-style chapter covering pmie.

EXPRESSION SYNTAX

       This  description  is terse and informal.  For a more comprehensive de‐
       scription see  the  Performance  Co-Pilot  User's  and  Administrator's
       Guide.

       A pmie specification is a sequence of semicolon-terminated expressions.

       Basic  operators  are modeled on the arithmetic, relational and Boolean
       operators of the C programming language.  Precedence rules are  as  ex‐
       pected,  although the use of parentheses is encouraged to enhance read‐
       ability and remove ambiguity.

       Operands are performance metric names (see PMNS(5)) and the normal lit‐
       eral constants.

       Operands involving performance metrics may produce sets of values, as a
       result of enumeration in the dimensions of hosts, instances  and  time.
       Special qualifiers may appear after a performance metric name to define
       the enumeration in each dimension.  For example,

           kernel.percpu.cpu.user :foo :bar #cpu0 @0..2

       defines 6 values corresponding to the time spent executing in user mode
       on  CPU  0 on the hosts ``foo'' and ``bar'' over the last 3 consecutive
       samples.  The default interpretation in the absence of : (host), # (in‐
       stance)  and  @  (time)  qualifiers is all instances at the most recent
       sample time for the default source of PCP performance metrics.

       Host and instance names that do not follow the rules for  variables  in
       programming languages, i.e. alphabetic optionally followed by alphanu‐
       merics, should be enclosed in single quotes.

       Expression evaluation follows the law of  ``least  surprises''.   Where
       performance metrics have the semantics of a counter, pmie will automat‐
       ically convert to a rate based upon consecutive samples  and  the  time
       interval  between these samples.  All numeric expressions are evaluated
       in double precision, and where appropriate, automatically  scaled  into
       canonical units of ``bytes'', ``seconds'' and ``counts''.

       A  rule  is  a special form of expression that specifies a condition or
       logical expression, a special operator (->) and actions to be performed
       when the condition is found to be true.

       The following table summarizes the basic pmie operators:

         ┌────────────────┬────────────────────────────────────────────────┐
         │   Operators    │                  Explanation                   │
         ├────────────────┼────────────────────────────────────────────────┤
         │+ - * /         │ Arithmetic                                     │
         │< <= == >= > != │ Relational (value comparison)                  │
         │! && ||         │ Boolean                                        │
         │->              │ Rule                                           │
         │rising          │ Boolean, false to true transition              │
         │falling         │ Boolean, true to false transition              │
         │rate            │ Explicit rate conversion (rarely required)     │
         │instant         │ No automatic rate conversion (rarely required) │
         └────────────────┴────────────────────────────────────────────────┘
       All  operators  are  supported  for numeric-valued operands and expres‐
       sions.  For string-valued operands, namely literal string constants en‐
       closed  in  double  quotes  or  metrics  with  a  data  type  of string
       (PM_TYPE_STRING), only the operators == and != are supported.

       The rate and instant operators are the logical inverse of one  another,
       so  an  arithmetic  expression expr is equal to rate instant expr.  The
       more useful cases involve using rate  with  a  metric  that  is  not  a
       counter  to  determine  the  rate of change over time or instant with a
       metric that is a counter to determine if the current value is above  or
       below some threshold.

       Aggregate operators may be used to aggregate or summarize along one di‐
       mension of a set-valued expression.  The following aggregate  operators
       map  from  a logical expression to a logical expression of lower dimen‐
       sion.

         ┌─────────────────────────┬─────────────┬──────────────────────────┐
         │       Operators         │    Type     │       Explanation        │
         ├─────────────────────────┼─────────────┼──────────────────────────┤
         │some_inst                │ Existential │ True if at least one set │
         │some_host                │             │ member is true in the    │
         │some_sample              │             │ associated dimension     │
         ├─────────────────────────┼─────────────┼──────────────────────────┤
         │all_inst                 │ Universal   │ True if all set members  │
         │all_host                 │             │ are true in the associ‐  │
         │all_sample               │             │ ated dimension           │
         ├─────────────────────────┼─────────────┼──────────────────────────┤
         │N%_inst                  │ Percentile  │ True if at least N per‐  │
         │N%_host                  │             │ cent of set members are  │
         │N%_sample                │             │ true in the associated   │
         │                         │             │ dimension                │
         └─────────────────────────┴─────────────┴──────────────────────────┘
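       As a sketch of how the logical aggregate operators  are  written,  the
       first rule below is true if the condition holds on at least one of the
       hosts foo and bar, and the second if it holds for at least half of the
       disk instances.  The host names, thresholds and messages are illustra‐
       tive assumptions only.

            some_host (
                kernel.all.load :foo :bar #'1 minute' > 5
            ) -> print "loaded hosts:" " %h";

            50 %_inst (
                disk.dev.total > 60 count/sec
            ) -> print "at least half of the disks are busy";
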
       The  following  instantial  operators  may be used to filter or limit a
       set-valued logical expression, based on regular expression matching  of
       instance names.  The logical expression must be a set involving the di‐
       mension of instances, and the regular expression is of the form used by
       egrep(1) or the Extended Regular Expressions of regcomp(3).

              ┌─────────────┬──────────────────────────────────────────┐
              │ Operators   │               Explanation                │
              ├─────────────┼──────────────────────────────────────────┤
              │match_inst   │ For each value of the logical expression │
              │             │ that is ``true'', the result is ``true'' │
              │             │ if the associated instance name matches  │
              │             │ the regular expression.  Otherwise the   │
              │             │ result is ``false''.                     │
              ├─────────────┼──────────────────────────────────────────┤
              │nomatch_inst │ For each value of the logical expression │
              │             │ that is ``true'', the result is ``true'' │
              │             │ if the associated instance name does not │
              │             │ match the regular expression.  Otherwise │
              │             │ the result is ``false''.                 │
              └─────────────┴──────────────────────────────────────────┘
       For  example,  the expression below will be ``true'' for disks attached
       to controllers 2 or 3 performing more than 20 operations per second:
            match_inst "^dks[23]d" disk.dev.total > 20;

       The following aggregate operators map from an arithmetic expression  to
       an arithmetic expression of lower dimension.

          ┌─────────────────────────┬───────────┬──────────────────────────┐
          │       Operators         │   Type    │       Explanation        │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │min_inst                 │ Extrema   │ Minimum value across all │
          │min_host                 │           │ set members in the asso‐ │
          │min_sample               │           │ ciated dimension         │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │max_inst                 │ Extrema   │ Maximum value across all │
          │max_host                 │           │ set members in the asso‐ │
          │max_sample               │           │ ciated dimension         │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │sum_inst                 │ Aggregate │ Sum of values across all │
          │sum_host                 │           │ set members in the asso‐ │
          │sum_sample               │           │ ciated dimension         │
          ├─────────────────────────┼───────────┼──────────────────────────┤
          │avg_inst                 │ Aggregate │ Average value across all │
          │avg_host                 │           │ set members in the asso‐ │
          │avg_sample               │           │ ciated dimension         │
          └─────────────────────────┴───────────┴──────────────────────────┘
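       A minimal sketch of the arithmetic aggregates follows.  The first rule
       averages the (rate converted) user mode CPU time across all CPU  in‐
       stances, the second takes the maximum of the last 5 samples of the
       1-minute load average.  The thresholds and messages are assumptions
       chosen for illustration.

            avg_inst ( kernel.percpu.cpu.user ) > 0.75
            -> print "average user mode CPU utilization above 0.75";

            max_sample ( kernel.all.load #'1 minute' @0..4 ) > 10
            -> print "1-minute load average peaked above 10";
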
       The  aggregate  operators  count_inst,  count_host and count_sample map
       from a logical expression to an arithmetic expression of  lower  dimen‐
       sion  by counting the number of set members for which the expression is
       true in the associated dimension.

       For action rules, the following actions are defined:

                ┌──────────┬────────────────────────────────────────┐
                │ Actions  │              Explanation               │
                ├──────────┼────────────────────────────────────────┤
                │alarm     │ Raise a visible alarm with xconfirm(1) │
                │print     │ Display on standard output             │
                │shell     │ Execute with sh(1)                     │
                │stomp     │ Send a STOMP message to a JMS server   │
                │syslog    │ Append a message to system log file    │
                └──────────┴────────────────────────────────────────┘
       Multiple actions may be separated by the & and | operators  to  specify
       respectively  sequential  execution (both actions are executed) and al‐
       ternate execution (the second action will only be executed if the  exe‐
       cution of the first action returns a non-zero error status).

       Arguments  to actions are an optional suppression time, and then one or
       more expressions (a string is an expression in this context).   Strings
       appearing  as  arguments to an action may include the following special
       selectors that will be replaced at the time the action is executed.

       %h  Host name(s) that make the left-most top-level  expression  in  the
           condition true.

       %c  Connection specification string(s) or files for a PCP tool to reach
           the hosts or archives that make the left-most top-level  expression
           in the condition true.

       %i  Instance(s)  that  make  the  left-most top-level expression in the
           condition true.

       %v  One value from the left-most top-level expression in the  condition
           for each host and instance pair that makes the condition true.

       Note  that  expansion of the special selectors is done by repeating the
       whole argument once for each unique binding to any  of  the  qualifying
       special selectors.  For example if a rule were true for the host mumble
       with instances grunt and snort, and for host fumble the  instance  puff
       makes the rule true, then the action
            ...
            -> shell myscript "Warning: %h:%i busy ";
       will  execute  myscript with the argument string "Warning: mumble:grunt
       busy Warning: mumble:snort busy Warning: fumble:puff busy".

       By comparison, if the action
            ...
            -> shell myscript "Warning! busy:" " %h:%i";
       were executed under the same circumstances, then myscript would be exe‐
       cuted  with  the  argument  string  "Warning!  busy:  mumble:grunt mum‐
       ble:snort fumble:puff".

       The semantics of the expansion of the special selectors leads to a com‐
       mon  usage pattern in an action, where one argument is a constant (con‐
       tains no special selectors), the second argument contains  the  desired
       special  selectors  with  minimal separator characters, and an optional
       third argument provides a constant postscript (e.g.  to  terminate  any
       argument quoting from the first argument).  If necessary, post-process‐
       ing (e.g. in myscript) can provide the necessary enumeration over each
       unique expansion of the string containing just the special selectors.

       For complex conditions, the bindings to these selectors are not obvious.
       It is strongly recommended that pmie be  used  in  the  debugging  mode
       (specify the -W command line option in particular) during rule develop‐
       ment.
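       The three-argument usage pattern described above might be written as in
       the following sketch.  The script name myscript is hypothetical, as  in
       the earlier examples, and the threshold is arbitrary.

            some_inst (
                disk.dev.total > 60 count/sec
            ) -> shell 5 min "myscript 'busy disks:" " %h:%i" "'";

       The first argument is constant and opens a single-quoted  string,  the
       second argument is repeated once for each host and instance pair  that
       makes the condition true, and the third argument closes the quoting, so
       that myscript would be expected to receive all of the expanded text as
       a single argument.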

BOOLEAN EXPRESSIONS

553       pmie expressions that have the semantics of a Boolean, e.g.  foo.bar  >
554       10  or some_inst ( my.table < 0 ) are assigned the values true or false
555       or unknown.  A value is unknown if one or more of the underlying metric
556       values  is  unavailable, e.g.  pmcd(1) on the host cannot be contacted,
557       the metric is not in the PCP archive, no values  are  currently  avail‐
558       able,  insufficient  values have been fetched to allow a rate converted
559       value to be computed or insufficient values have been  fetched  to  in‐
560       stantiate the required number of samples in the temporal domain.
561
562       Boolean operators follow the normal rules of Kleene logic (aka 3-valued
563       logic) when combining values that include unknown:
564
565                      ┌────────────┬───────────────────────────┐
566                      │            │             B             │
567                      │  A and B   ├─────────┬───────┬─────────┤
568                      │            │  true   │ false │ unknown │
569                      ├──┬─────────┼─────────┼───────┼─────────┤
570                      │  │  true   │  true   │ false │ unknown │
571                      │  ├─────────┼─────────┼───────┼─────────┤
572                      │A │  false  │  false  │ false │  false  │
573                      │  ├─────────┼─────────┼───────┼─────────┤
574                      │  │ unknown │ unknown │ false │ unknown │
575                      └──┴─────────┴─────────┴───────┴─────────┘
576                      ┌────────────┬──────────────────────────┐
577                      │            │            B             │
578                      │  A or B    ├──────┬─────────┬─────────┤
579                      │            │ true │  false  │ unknown │
580                      ├──┬─────────┼──────┼─────────┼─────────┤
581                      │  │  true   │ true │  true   │  true   │
582                      │  ├─────────┼──────┼─────────┼─────────┤
583                      │A │  false  │ true │  false  │ unknown │
584                      │  ├─────────┼──────┼─────────┼─────────┤
585                      │  │ unknown │ true │ unknown │ unknown │
586                      └──┴─────────┴──────┴─────────┴─────────┘
587                                 ┌────────┬─────────┐
588                                 │   A    │  not A  │
589                                 ├────────┼─────────┤
590                                 │  true  │  false  │
591                                 ├────────┼─────────┤
592                                 │ false  │  true   │
593                                 ├────────┼─────────┤
594                                 │unknown │ unknown │
595                                 └────────┴─────────┘
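
       The truth tables above can be modelled in a few lines.  The following
       is a minimal Python sketch, not pmie's implementation: it assumes that
       None stands in for unknown, and the names k_and, k_or and k_not are
       hypothetical.

```python
from typing import Optional

# None models pmie's "unknown"; True and False are ordinary booleans.
Tri = Optional[bool]


def k_and(a: Tri, b: Tri) -> Tri:
    # A definite false decides the result even if the other side is unknown.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a: Tri, b: Tri) -> Tri:
    # A definite true decides the result even if the other side is unknown.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def k_not(a: Tri) -> Tri:
    # Negating unknown leaves it unknown.
    return None if a is None else (not a)
```

       The point to note is that unknown only propagates when the other
       operand cannot settle the answer on its own.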

RULESETS

597       The ruleset clause is used to define a set of rules  and  actions  that
598       are  evaluated  in  order until some action is executed, at which point
599       the remaining rules and actions are skipped until the ruleset is  again
600       scheduled  for evaluation.  The keyword else is used to separate rules.
601       After one or more regular rules (with a predicate  and  an  action),  a
602       ruleset may include an optional
603            unknown -> action
604       clause, optionally followed by a
605            otherwise -> action
606       clause.
607
608       If all of the predicates in the rules evaluate to unknown and an
609       unknown clause has been specified, then the action associated with the
610       unknown clause will be executed.
611
612       If no rule predicate is true and the unknown action is either not spec‐
613       ified or not executed and an otherwise clause has been specified,  then
614       the action associated with the otherwise clause will be executed.
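
       The selection order described above can be sketched as follows.  This
       is an illustrative Python model under stated assumptions, not pmie
       code: predicates are callables returning True, False or None (for
       unknown), actions are represented only by a label, and every selected
       action is assumed to count as executed.  The name evaluate_ruleset is
       hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A rule is (predicate, action label); the predicate returns True, False,
# or None to model "unknown".
Rule = Tuple[Callable[[], Optional[bool]], str]


def evaluate_ruleset(
    rules: List[Rule],
    unknown_action: Optional[str] = None,
    otherwise_action: Optional[str] = None,
) -> Optional[str]:
    """Return the label of the action that would run, or None if none runs."""
    results = []
    for predicate, action in rules:
        value = predicate()
        if value is True:
            # First true predicate wins; remaining rules are skipped.
            return action
        results.append(value)
    # No predicate was true.  The unknown clause applies only when every
    # predicate was unknown.
    if results and all(v is None for v in results) and unknown_action is not None:
        return unknown_action
    # Otherwise fall through to the otherwise clause, if one was given.
    return otherwise_action
```

       Note that a mix of false and unknown predicates bypasses the unknown
       clause and falls through to the otherwise clause.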
615

SCALE FACTORS

617       Scale  factors may be appended to arithmetic expressions and force lin‐
618       ear scaling of the value to canonical units.  Simple scale factors  are
619       constructed  from the keywords: nanosecond, nanosec, nsec, microsecond,
620       microsec, usec, millisecond, millisec, msec, second, sec, minute,  min,
621       hour,  byte,  Kbyte, Mbyte, Gbyte, Tbyte, count, Kcount and Mcount, and
622       the operator /, for example ``Kbytes / hour''.
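
       As a worked illustration of linear scaling, the Python sketch below
       converts a value carrying a space/time scale factor into bytes per
       second.  It rests on two assumptions that are not stated in this
       section: that the canonical units are bytes and seconds, and that a
       Kbyte is 1024 bytes.  The table and function names are hypothetical.

```python
# Assumed multipliers relative to the assumed canonical units
# (bytes for space, seconds for time).
BYTES = {
    "byte": 1,
    "Kbyte": 1024,
    "Mbyte": 1024 ** 2,
    "Gbyte": 1024 ** 3,
    "Tbyte": 1024 ** 4,
}
SECONDS = {
    "nsec": 1e-9,
    "usec": 1e-6,
    "msec": 1e-3,
    "sec": 1.0,
    "min": 60.0,
    "hour": 3600.0,
}


def to_bytes_per_second(value: float, space: str, time: str) -> float:
    """Scale `value` expressed in space/time units to bytes per second."""
    return value * BYTES[space] / SECONDS[time]
```

       Under these assumptions, to_bytes_per_second(7200, "Kbyte", "hour")
       evaluates to 2048.0.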
623

MACROS

625       Macros are defined using expressions of the form:
626
627            name = constexpr;
628
629       Where name follows the normal rules for variables in programming
630       languages, i.e. an alphabetic character optionally followed by
631       alphanumerics.  constexpr must be a constant expression, either a
632       string (enclosed in double quotes) or an arithmetic expression
633       optionally followed by a scale factor.
634
635       Macros are expanded when their name, prefixed by a dollar sign ($),
636       appears in an expression, and macros may be nested within a constexpr string.
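
       The effect of $name expansion, including nesting inside strings, can
       be pictured with the Python sketch below.  It assumes expansion behaves
       like repeated textual substitution and that names may contain
       underscores; neither is confirmed by this page, and the expand helper
       and the sample macro names are hypothetical.

```python
import re

# Assumed shape of a macro reference: "$" followed by a letter and then
# letters, digits or underscores.
MACRO_REF = re.compile(r"\$([A-Za-z][A-Za-z0-9_]*)")


def expand(text: str, macros: dict, max_depth: int = 16) -> str:
    """Replace $name references until nothing changes."""
    for _ in range(max_depth):
        # Unknown names are left in place rather than dropped.
        expanded = MACRO_REF.sub(
            lambda m: macros.get(m.group(1), m.group(0)), text
        )
        if expanded == text:
            return expanded
        text = expanded
    raise ValueError("macro expansion did not settle; recursive definition?")
```

       For example, with macros {"limit": "0.9", "msg": "busy above $limit"},
       expanding "$msg" yields "busy above 0.9".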
637
638       The following reserved macro names are understood.
639
640       minute    Current minute of the hour.
641
642       hour      Current hour of the day, in the range 0 to 23.
643
644       day       Current day of the month, in the range 1 to 31.
645
646       month     Current  month  of  the  year, in the range 0 (January) to 11
647                 (December).
648
649       year      Current year.
650
651       day_of_week
652                 Current day of the week, in the range 0 (Sunday) to 6 (Satur‐
653                 day).
654
655       delta     Sample interval in effect for this expression.
656
657       Dates  and times are presented in the reporting time zone (see descrip‐
658       tion of -Z and -z command line options above).
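
       The ranges listed above differ from those used by some date
       libraries, which is easy to get wrong when comparing values.  The
       Python sketch below derives the same fields from a datetime; the
       function name is hypothetical, and the four-digit year and the use of
       seconds for delta are assumptions.

```python
from datetime import datetime


def reserved_macros(now: datetime, delta_seconds: float) -> dict:
    """Build values matching the ranges documented for the reserved macros."""
    return {
        "minute": now.minute,
        "hour": now.hour,                        # 0 to 23
        "day": now.day,                          # 1 to 31
        "month": now.month - 1,                  # datetime is 1..12; here 0..11
        "year": now.year,                        # assumed to be the full year
        "day_of_week": (now.weekday() + 1) % 7,  # datetime has Monday=0; here Sunday=0
        "delta": delta_seconds,                  # units assumed to be seconds
    }
```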
659

AUTOMATIC RESTART

661       It is often useful for pmie processes to be started  and  stopped  when
662       the local host is booted or shut down, or when they have been detected
663       as no longer running (when they have unexpectedly exited for some  rea‐
664       son).  Refer to pmie_check(1) for details on automating this process.
665
666       Optionally, each system running pmcd(1) may also be configured to run a
667       ``primary''  pmie  instance.   This  pmie  instance  is   launched   by
668       $PCP_RC_DIR/pmie,      and     is     affected     by     the     files
669       $PCP_SYSCONF_DIR/pmie/control,   $PCP_SYSCONF_DIR/pmie/control.d   (use
670       chkconfig(8), systemctl(1) or similar platform-specific commands to ac‐
671       tivate or disable the  primary  pmie  instance)  and  $PCP_VAR_DIR/con‐
672       fig/pmie/config.default (the default initial configuration file for the
673       primary pmie).
674
675       The primary pmie instance is identified by the -P option.  There may be
676       at most one ``primary'' pmie instance on each system.  The primary pmie
677       instance (if any) must be running on the same host as  the  pmcd(1)  to
678       which  it  connects (if any), so the -h and -P options are mutually ex‐
679       clusive.
680

EVENT MONITORING

682       It is common for production systems to be monitored in a central
683       location.  Traditionally on UNIX systems this has been performed by the
684       system log facilities - see logger(1) and syslogd(1).  On Windows,
685       communication with the system event log is handled by pcp-eventlog(1).
686
687       pmie  fits into this model when rules use the syslog action.  Note that
688       if the action string begins with -p (priority)  and/or  -t  (tag)  then
689       these  are  extracted from the string and treated in the same way as in
690       logger(1) and pcp-eventlog(1).
691
692       However, it is also common to have other event monitoring frameworks
693       into which you may wish to incorporate performance events from pmie.
694       You can often use the shell action to send events to these frameworks,
695       as they usually provide their own program for injecting events into the
696       framework from external sources.
697
698       A final option is to use the stomp (Streaming Text Oriented Messaging
699       Protocol) action, which allows pmie to connect to a central JMS (Java
700       Message Service) server and send events to the PMIE topic.  Tools can
701       be written to extract these text messages and present them to
702       operations people (via desktop popup windows, etc).  Use of the stomp
703       action requires a stomp configuration file to be set up, which
704       specifies the JMS server host, port number, and username/password.
705
706       The format of this file is as follows:
707
708            host=messages.sgi.com   # this is the JMS server (required)
709            port=61616              # and the port it is listening on (required)
710            timeout=2               # seconds to wait for server (optional)
711            username=joe            # (required)
712            password=j03ST0MP       # (required)
713            topic=PMIE              # JMS topic for pmie messages (optional)
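
       For tools that need to read the same file, the format shown above can
       be parsed along the following lines.  This is a hedged Python sketch,
       not pmie's own parser: it assumes "#" always starts a comment, that
       the topic defaults to PMIE when omitted, and that port and timeout are
       integers.  The function name parse_stomp_config is hypothetical.

```python
REQUIRED = ("host", "port", "username", "password")
DEFAULTS = {"topic": "PMIE"}  # assumed default when topic is omitted


def parse_stomp_config(text: str) -> dict:
    """Parse key=value lines with optional trailing # comments."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {raw!r}")
        settings[key.strip()] = value.strip()
    missing = [k for k in REQUIRED if k not in settings]
    if missing:
        raise ValueError("missing required setting(s): " + ", ".join(missing))
    settings["port"] = int(settings["port"])
    if "timeout" in settings:
        settings["timeout"] = int(settings["timeout"])
    return settings
```

       One consequence of the comment handling in this sketch is that a
       password containing "#" would be truncated.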
714
715       The timeout value specifies the time (in seconds) that pmie should wait
716       for acknowledgements from the JMS server after sending a message (as
717       required by the STOMP protocol).  Note that on startup, pmie will wait
718       indefinitely for a connection, and will not begin rule evaluation until
719       that initial connection has been established.  Should the connection to
720       the JMS server be lost at any time while pmie is running, pmie will
721       attempt to reconnect each time a rule with a stomp action subsequently
722       evaluates to true, but not more than once per minute, to avoid
723       contributing to network congestion.  While the STOMP connection to the
724       JMS server remains severed, the stomp action will return a non-zero
725       error value.
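
       The once-per-minute limit on reconnection attempts described above
       amounts to a simple rate limiter.  The Python sketch below shows one
       way to express it; the class name and the injectable clock are
       hypothetical, and this is not a description of pmie's internals.

```python
import time


class ReconnectThrottle:
    """Allow at most one reconnection attempt per min_interval seconds."""

    def __init__(self, min_interval: float = 60.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_attempt = None    # no attempt made yet

    def may_attempt(self) -> bool:
        now = self.clock()
        if (
            self.last_attempt is not None
            and now - self.last_attempt < self.min_interval
        ):
            return False            # too soon; the action reports failure
        self.last_attempt = now
        return True
```

       A caller would check may_attempt() each time a rule with a stomp
       action fires while disconnected, and treat a False result as a failed
       action.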
726

BUGS

728       The lexical scanner and parser will attempt to recover after an error
729       in the input expressions.  Parsing resumes after skipping input up to
730       the next semi-colon (;).  However, during this skipping process the
731       scanner is ignorant of comments and strings, so an embedded semi-colon
732       may cause parsing to resume at an unexpected place.  This behavior is
733       largely benign because, until the initial syntax error is corrected,
734       pmie will not attempt any expression evaluation.
735

FILES

737       $PCP_DEMOS_DIR/pmie/*
738            annotated example rules
739
740       $PCP_VAR_DIR/pmns/*
741            default PMNS specification files
742
743       $PCP_TMP_DIR/pmie
744            pmie maintains files in this directory  to  identify  the  running
745            pmie  instances  and  to export runtime information about each in‐
746            stance - this data forms the basis of  the  pmcd.pmie  performance
747            metrics
748
749       $PCP_PMIECONTROL_PATH
750            the default set of pmie instances to start at boot time - refer to
751            pmie_check(1) for details
752

PCP ENVIRONMENT

754       Environment variables with the prefix PCP_ are used to parameterize the
755       file  and  directory names used by PCP.  On each installation, the file
756       /etc/pcp.conf contains the  local  values  for  these  variables.   The
757       $PCP_CONF  variable may be used to specify an alternative configuration
758       file, as described in pcp.conf(5).
759
760       When executing shell actions, pmie overrides two variables  -  IFS  and
761       PATH  - in the environment of the child process.  IFS is set to "\t\n".
762       The PATH is set to a combination of a default path  for  all  platforms
763       ("/usr/sbin:/sbin:/usr/bin:/bin")  and several configurable components.
764       These are (in this order): $PCP_BIN_DIR, $PCP_BINADM_DIR and $PCP_PLAT‐
765       FORM_PATHS.
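
       To make the resulting child environment concrete, the Python sketch
       below assembles IFS and PATH as described above.  The order of the
       three configurable components follows the text; placing the default
       path first and joining with ":" are assumptions, and the function name
       is hypothetical.

```python
import os

DEFAULT_PATH = "/usr/sbin:/sbin:/usr/bin:/bin"
COMPONENTS = ("PCP_BIN_DIR", "PCP_BINADM_DIR", "PCP_PLATFORM_PATHS")


def child_environment(pcp_env: dict) -> dict:
    """Return a copy of the environment with IFS and PATH overridden."""
    parts = [DEFAULT_PATH]  # assumed to come before the PCP components
    for name in COMPONENTS:
        value = pcp_env.get(name)
        if value:           # skip components that are unset or empty
            parts.append(value)
    env = dict(os.environ)
    env["IFS"] = "\t\n"
    env["PATH"] = ":".join(parts)
    return env
```

       Because PATH is replaced rather than extended, programs launched from
       shell actions should be given absolute paths if they live elsewhere.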
766
767       When  executing  popup  alarm  actions,  pmie  will  use  the  value of
768       $PCP_XCONFIRM_PROG as the visual notification program to run.  This  is
769       typically set to pmconfirm(1), a cross-platform dialog box.
770

UNIX SEE ALSO

772       logger(1).
773

WINDOWS SEE ALSO

775       pcp-eventlog(1).
776

SEE ALSO

778       PCPIntro(1),    pmcd(1),   pmconfirm(1),   pmdumplog(1),   pmieconf(1),
779       pmie_check(1), pminfo(1), pmlogger(1), pmval(1), systemd(1),  PMAPI(3),
780       pcp.conf(5), pcp.env(5) and PMNS(5).
781

USER GUIDE

783       For a more complete description of the pmie language, refer to the Per‐
784       formance Co-Pilot Users and Administrators Guide.   This  is  available
785       online from:
786           https://pcp.readthedocs.io/en/latest/UAG/PerformanceMetricsInferenceEngine.html
787
788
789
790Performance Co-Pilot                  PCP                              PMIE(1)