STAPPROBES(3stap)                                            STAPPROBES(3stap)

NAME

       stapprobes - systemtap probe points

DESCRIPTION

       The  following sections enumerate the variety of probe points supported
       by the systemtap translator, and some of the additional aliases defined
       by  standard  tapset  scripts.  Many are individually documented in the
       3stap manual section, with the probe:: prefix.

SYNTAX

              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }


       A probe declaration may list multiple comma-separated probe  points  in
       order  to  attach  a handler to all of the named events.  Normally, the
       handler statements are run whenever any of the events occurs.  Depending
       on the type of probe point, the handler statements may refer to context
       variables (denoted with a dollar-sign prefix  like  $foo)  to  read  or
       write state.  This may include function parameters for function probes,
       or local variables for statement probes.

       The syntax of a single probe point is a general dotted-symbol sequence.
       This  allows  a  breakdown  of the event namespace into parts, somewhat
       like the Domain Name System does on the Internet.  Each component iden‐
       tifier may be parametrized by a string or number literal, with a syntax
       like a function call.  A component may include a "*" character, to  ex‐
       pand  to  a  set of matching probe points.  It may also include "**" to
       match multiple sequential components at once.  Probe  aliases  likewise
       expand to other probe points.

       Probe  aliases  can be given on their own, or with a suffix. The suffix
       attaches to the underlying probe point that the alias is  expanded  to.
       For example,

              syscall.read.return.maxactive(10)

       expands to

              kernel.function("sys_read").return.maxactive(10)

       with the component maxactive(10) being recognized as a suffix.

       Normally,  each  and  every  probe  point  resulting from wildcard- and
       alias-expansion must be resolved to some low-level system  instrumenta‐
       tion  facility  (e.g.,  a kprobe address, marker, or a timer configura‐
       tion), otherwise the elaboration phase will fail.

       However, a probe point may be followed by a "?" character, to  indicate
       that it is optional, and that no error should result if it fails to re‐
       solve.  Optionalness passes down through all levels  of  alias/wildcard
       expansion.  Alternatively, a probe point may be followed by a "!" char‐
       acter, to indicate that it is both optional and sufficient.  (Think
       vaguely of the Prolog cut operator.)  If it does resolve, then no fur‐
       ther probe points in the same comma-separated list  will  be  resolved.
       Therefore, the "!" sufficiency mark only makes sense in a list of
       probe point alternatives.

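       As a rough sketch of how the two marks combine, the script below tries
       one function name and falls back to another.  Both function names are
       placeholders invented for illustration, not names from any particular
       kernel; the timer probe is only there so that the script still contains
       a resolvable probe if neither alternative exists.

              # Use the first alternative if it resolves ("!"); otherwise
              # try the second, which is merely optional ("?").
              probe kernel.function("hypothetical_new_name") !,
                    kernel.function("hypothetical_old_name") ?
              {
                  println(pp())
              }

              probe timer.s(5) { exit() }
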
       Additionally, a probe point may be followed by an "if (expr)" condition,
       in order to enable or disable the probe point on the fly.  If "expr" is
       false when the probe point is hit, the whole probe body, including any
       alias bodies, is skipped.  The condition is stacked up through all
       levels of alias/wildcard expansion, so the final condition is the
       logical AND of the conditions of all expanded aliases and wildcards.
       The expressions are necessarily restricted to global variables.

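       A minimal sketch of a probe point condition follows.  The global
       variable name "tracing" and the five-second toggle period are arbitrary
       choices made for this example.

              global tracing = 0

              # Flip the condition every five seconds.
              probe timer.s(5) { tracing = !tracing }

              # The handler body runs only while tracing is nonzero.
              probe syscall.open if (tracing)
              {
                  printf("%s opened %s\n", execname(), filename)
              }
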
       These  are  all  syntactically valid probe points.  (They may still be
       semantically invalid, depending on the contents of the tapsets, and the
       versions of kernel/user software installed.)


              kernel.function("foo").return
              process("/bin/vi").statement(0x2222)
              end
              syscall.*
              syscall.*.return.maxactive(10)
              syscall.{open,close}
              sys**open
              kernel.function("no_such_function") ?
              module("awol").function("no_such_function") !
              signal.*? if (switch)
              kprobe.function("foo")


       Probes may be broadly classified into "synchronous" and "asynchronous".
       A "synchronous" event is deemed to occur when any processor executes an
       instruction  matched  by  the specification.  This gives these probes a
       reference point (instruction address) from which more  contextual  data
       may  be  available.  Other families of probe points refer to "asynchro‐
       nous" events such as timers/counters rolling over, where  there  is  no
       fixed  reference point that is related.  Each probe point specification
       may match multiple locations (for example, using wildcards or aliases),
       and all of them are then probed.  A probe declaration may also contain
       several comma-separated specifications, all of which are probed.

       Brace expansion is a mechanism which allows a list of probe  points  to
       be generated. It is very similar to shell expansion. A component may be
       surrounded by a pair of curly braces to indicate that  the  comma-sepa‐
       rated  sequence of one or more subcomponents will each constitute a new
       probe point. The braces may be arbitrarily nested. The ordering of  ex‐
       panded results is based on product order.

       The question mark (?) and exclamation mark (!) indicators and probe
       point conditions may not be placed in any expansion that comes before
       the last component.

       The following is an example of brace expansion.


              syscall.{write,read}
              # Expands to
              syscall.write, syscall.read

              {kernel,module("nfs")}.function("nfs*")!
              # Expands to
              kernel.function("nfs*")!, module("nfs").function("nfs*")!

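       Used in a complete probe declaration, brace expansion might look like
       the sketch below.  The choice of the read and write system calls is
       only illustrative.

              # Equivalent to listing syscall.read.return and
              # syscall.write.return separately.
              probe syscall.{read,write}.return
              {
                  printf("%s %s -> %s\n", execname(), name, retstr)
              }
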

DWARF DEBUGINFO

       Resolving some probe points requires DWARF debuginfo or "debug symbols"
       for the specific program being instrumented.  For some others, DWARF is
       automatically  synthesized  on  the  fly from source code header files.
       For others, it is not needed at all.  Since a systemtap script may  use
       any mixture of probe points together, the union of their DWARF require‐
       ments has to be met on the computer where  script  compilation  occurs.
       (See the --use-server option and the stap-server(8) man page for infor‐
       mation about the remote compilation facility, which  allows  these  re‐
       quirements to be met on a different machine.)

       The following table lists many of the available probe point families,
       classified by their need for DWARF debuginfo for the specific program
       being probed.


       DWARF                          NON-DWARF                    SYMBOL-TABLE

       kernel.function, .statement    kernel.mark                  kernel.function*
       module.function, .statement    process.mark, process.plt    module.function*
       process.function, .statement   begin, end, error, never     process.function*
       process.mark*                  timer
       .function.callee               perf
       python2, python3               procfs
                                      kernel.statement.absolute
       AUTO-GENERATED-DWARF           kernel.data
                                      kprobe.function
       kernel.trace                   process.statement.absolute
                                      process.begin, .end
                                      netfilter
                                      java


       Probe types marked with an asterisk (*) are fallbacks, where systemtap
       can sometimes infer subset or substitute information.  In general, the
       more symbolic / debugging information is available, the higher the
       quality of probing.

ON-THE-FLY ARMING

       The following types of probe points may be armed/disarmed on-the-fly to
       save overhead during uninteresting times.  Arming conditions may also
       be added to other types of probes, but will be treated  as  a  wrapping
       conditional and won't benefit from overhead savings.


       DISARMABLE                                exceptions
       kernel.function, kernel.statement
       module.function, module.statement
       process.*.function, process.*.statement
       process.*.plt, process.*.mark
       timer.*                                   timer.profile
       java

PROBE POINT FAMILIES

   BEGIN/END/ERROR
       The  probe  points begin and end are defined by the translator to refer
       to the time of session startup and shutdown.  All  "begin"  probe  han‐
       dlers  are  run,  in  some sequence, during the startup of the session.
       All global variables will have been initialized prior  to  this  point.
       All  "end" probes are run, in some sequence, during the normal shutdown
       of a session, such as in the aftermath of an exit() function call, or
       an interruption from the user.  In the case of an error-triggered shut‐
       down, "end" probes are not run.  There are no target  variables  avail‐
       able in either context.

       If the order of execution among "begin" or "end" probes is significant,
       then an optional sequence number may be provided:


              begin(N)
              end(N)


       The number N may be positive or negative.  The probe handlers  are  run
       in  increasing  order, and the order between handlers with the same se‐
       quence number is unspecified.  When "begin" or "end" are given  without
       a sequence, they are effectively sequence zero.

       The error probe point is similar to the end probe, except that each
       such probe handler is run when the session ends after errors have oc‐
       curred.  In such cases, "end" probes are skipped, but each "error"
       probe is still attempted.  This kind of probe can be used to clean up
       or emit a "final gasp".  It may also be numerically parametrized to set
       a sequence.

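       The following sketch puts the three probe points together.  The
       variable name "started" and the ten-second run time are assumptions
       made for the example.

              global started

              # Sequence -1 runs before the unnumbered (sequence 0) begin.
              probe begin(-1) { started = gettimeofday_s() }
              probe begin     { println("session started") }

              # End the session normally after ten seconds.
              probe timer.s(10) { exit() }

              # Run on normal shutdown only.
              probe end
              {
                  printf("ran for %d s\n", gettimeofday_s() - started)
              }

              # Run instead of "end" when the session ends after an error.
              probe error { println("session ended after an error") }
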
   NEVER
       The probe point never is specially defined by the  translator  to  mean
       "never".  Its probe handler is never run, though its statements are an‐
       alyzed for symbol / type correctness as usual.  This probe point may be
       useful in conjunction with optional probes.


   SYSCALL and ND_SYSCALL
       The  syscall.* and nd_syscall.*  aliases define several hundred probes,
       too many to detail here.  They are of the general form:


              syscall.NAME
              nd_syscall.NAME
              syscall.NAME.return
              nd_syscall.NAME.return


       Generally, a pair of probes is defined for each normal system call as
       listed  in  the  syscalls(2) manual page, one for entry and one for re‐
       turn.  Those system calls that never return do not have a corresponding
       .return probe.  The nd_* family of probes is about the same, except
       that it uses non-DWARF based searching mechanisms, which may result in
       a lower quality of symbolic context data (parameters), and may miss
       some system calls.  You may want to try them first, in case kernel
       debugging information is not immediately available.

       Each probe alias provides a variety of variables.  Looking at the
       tapset source code is the most reliable way to find out which ones.
       Generally, each variable listed in the standard manual page is made
       available as a script-level variable, so syscall.open exposes filename,
       flags, and mode.  In addition, a standard suite of variables is
       available at most aliases:

       argstr A  pretty-printed  form  of  the  entire  argument list, without
              parentheses.

       name   The name of the system call.

       retstr For return probes, a pretty-printed form of the system-call  re‐
              sult.

       As  usual  for  probe aliases, these variables are all initialized once
       from the underlying $context variables, so that later changes to  $con‐
       text  variables are not automatically reflected.  Not all probe aliases
       obey all of these general guidelines.   Please  report  any  bothersome
       ones you encounter as a bug.  Note that on some kernel/userspace archi‐
       tecture combinations (e.g., 32-bit userspace on 64-bit kernel), the un‐
       derlying $context variables may need explicit sign extension / masking.
       When this is an issue, consider using the tapset-provided variables in‐
       stead of raw $context variables.

       If debuginfo availability is a problem, you may try using the non-DWARF
       syscall probe aliases instead.  Use the nd_syscall. prefix instead of
       syscall.  The same context variables are available, as far as possible.


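       As an illustration of the standard variables, the sketch below prints
       name and argstr on entry and retstr on return.  It assumes that the
       syscall.open alias resolves on the running kernel; any other system
       call alias could be substituted.

              probe syscall.open
              {
                  printf("%s(%d): %s(%s)\n", execname(), pid(), name, argstr)
              }

              probe syscall.open.return
              {
                  printf("%s(%d): %s = %s\n", execname(), pid(), name, retstr)
              }
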
   TIMERS
       There  are  two  main types of timer probes: "jiffies" timer probes and
       time interval timer probes.

       Intervals defined by the standard kernel "jiffies" timer may be used to
       trigger  probe  handlers  asynchronously.  Two probe point variants are
       supported by the translator:


              timer.jiffies(N)
              timer.jiffies(N).randomize(M)


       The probe handler is run every N  jiffies  (a  kernel-defined  unit  of
       time,  typically between 1 and 60 ms).  If the "randomize" component is
       given, a linearly distributed random value in  the  range  [-M..+M]  is
       added to N every time the handler is run.  N is restricted to a reason‐
       able range (1 to around a million), and M is restricted to  be  smaller
       than  N.  There are no target variables provided in either context.  It
       is possible for such probes to be run concurrently on a multi-processor
       computer.

       Alternatively,  intervals may be specified in units of time.  There are
       two probe point variants similar to the jiffies timer:


              timer.ms(N)
              timer.ms(N).randomize(M)


       Here, N and M are specified in milliseconds, but the full  options  for
       units   are   seconds  (s/sec),  milliseconds  (ms/msec),  microseconds
       (us/usec), nanoseconds (ns/nsec), and hertz (hz).  Randomization is not
       supported for hertz timers.

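       A small sketch of interval timers follows.  The intervals and the
       counter name "ticks" are arbitrary values chosen for the example.

              global ticks

              # Fires about every 200 ms, with up to 50 ms of random
              # variation in either direction.
              probe timer.ms(200).randomize(50) { ticks++ }

              # Fires once per second and reports the count so far.
              probe timer.s(1) { printf("%d randomized ticks\n", ticks) }

              # Stop after ten seconds.
              probe timer.s(10) { exit() }
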
       The  actual resolution of the timers depends on the target kernel.  For
       kernels prior to 2.6.17, timers are limited to jiffies  resolution,  so
       intervals are rounded up to the nearest jiffies interval.  From 2.6.17
       onward, the implementation uses hrtimers for tighter precision, though
       the actual resolution is architecture-dependent.  In either case, if
       the "randomize" component is given, the random value is added to the
       interval before any rounding occurs.

       Profiling  timers  are also available to provide probes that execute on
       all CPUs at the rate of the system tick (CONFIG_HZ) or at a given  fre‐
       quency  (hz).  On  some  kernels, this is a one-concurrent-user-only or
       disabled facility, resulting in error -16 (EBUSY) during  probe  regis‐
       tration.


              timer.profile.tick
              timer.profile.freq.hz(N)


       Full  context information of the interrupted process is available, mak‐
       ing this probe suitable for a time-based sampling profiler.

       It is recommended to use the tapset  probe  timer.profile  rather  than
       timer.profile.tick.  This probe point behaves identically to timer.pro‐
       file.tick when the underlying functionality  is  available,  and  falls
       back  to  using perf.sw.cpu_clock on some recent kernels which lack the
       corresponding profile timer facility.

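       A time-based sampling profiler of the kind described above could be
       sketched as follows.  The array name "samples", the five-second run
       time, and the ten-entry report limit are assumptions made for the
       example.

              global samples

              # Count one sample per profile tick, keyed by the name of
              # the interrupted executable.
              probe timer.profile { samples[execname()]++ }

              # After five seconds, print the ten busiest names and stop.
              probe timer.s(5)
              {
                  foreach (comm in samples- limit 10)
                      printf("%-20s %d\n", comm, samples[comm])
                  exit()
              }
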
       Profiling timers with specified frequencies are  only  accurate  up  to
       around 100 Hz.  You may need to provide a larger value to achieve the
       desired rate.

       Note that if a timer probe is set to fire at a very high rate and the
       probe body is complex, subsequent timer probes can get skipped, since
       the time for them to run has already passed.  Normally systemtap
       reports missed probes, but it will not report these skipped probes.


   DWARF
       This family of probe points uses symbolic debugging information for the
       target kernel/module/program, as may be found  in  unstripped  executa‐
       bles,  or  the  separate  debuginfo  packages.  They allow placement of
       probes logically into the execution path  of  the  target  program,  by
       specifying a set of points in the source or object code.  When a match‐
       ing statement executes on any processor, the probe handler  is  run  in
       that context.

       Probe points in the DWARF family can be identified by the target kernel
       module (or user process), source file, line number, function  name,  or
       some combination of these.

       Here is a list of DWARF probe points currently supported:

              kernel.function(PATTERN)
              kernel.function(PATTERN).call
              kernel.function(PATTERN).callee(PATTERN)
              kernel.function(PATTERN).callee(PATTERN).return
              kernel.function(PATTERN).callee(PATTERN).call
              kernel.function(PATTERN).callees(DEPTH)
              kernel.function(PATTERN).return
              kernel.function(PATTERN).inline
              kernel.function(PATTERN).label(LPATTERN)
              module(MPATTERN).function(PATTERN)
              module(MPATTERN).function(PATTERN).call
              module(MPATTERN).function(PATTERN).callee(PATTERN)
              module(MPATTERN).function(PATTERN).callee(PATTERN).return
              module(MPATTERN).function(PATTERN).callee(PATTERN).call
              module(MPATTERN).function(PATTERN).callees(DEPTH)
              module(MPATTERN).function(PATTERN).return
              module(MPATTERN).function(PATTERN).inline
              module(MPATTERN).function(PATTERN).label(LPATTERN)
              kernel.statement(PATTERN)
              kernel.statement(PATTERN).nearest
              kernel.statement(ADDRESS).absolute
              module(MPATTERN).statement(PATTERN)
              process("PATH").function("NAME")
              process("PATH").statement("*@FILE.c:123")
              process("PATH").library("PATH").function("NAME")
              process("PATH").library("PATH").statement("*@FILE.c:123")
              process("PATH").library("PATH").statement("*@FILE.c:123").nearest
              process("PATH").function("*").return
              process("PATH").function("myfun").label("foo")
              process("PATH").function("foo").callee("bar")
              process("PATH").function("foo").callee("bar").return
              process("PATH").function("foo").callee("bar").call
              process("PATH").function("foo").callees(DEPTH)
              process(PID).function("NAME")
              process(PID).function("myfun").label("foo")
              process(PID).plt("NAME")
              process(PID).plt("NAME").return
              process(PID).statement("*@FILE.c:123")
              process(PID).statement("*@FILE.c:123").nearest
              process(PID).statement(ADDRESS).absolute

       (See  the  USER-SPACE section below for more information on the process
       probes.)

       The list above includes multiple variants and modifiers  which  provide
       additional functionality or filters. They are:

              .function
                     Places  a probe near the beginning of the named function,
                     so that parameters are available as context variables.

              .return
                     Places a probe at the moment after the  return  from  the
                     named  function,  so the return value is available as the
                     "$return" context variable.

              .inline
                     Filters the results to include only instances of  inlined
                     functions.  Note  that  inlined  functions do not have an
                     identifiable return point, so .return is not supported on
                     .inline probes.

              .call  Filters the results to include only non-inlined functions
                     (the opposite set of .inline).

              .exported
                     Filters the results to include only exported functions.

              .statement
                     Places a probe at the exact spot,  exposing  those  local
                     variables that are visible there.

              .statement.nearest
                     Places  a  probe at the nearest available line number for
                     each line number given in the statement.

              .callee
                     Places a probe  on  the  callee  function  given  in  the
                     .callee  modifier,  where  the  callee must be a function
                     called by the target function given in .function. The ad‐
                     vantage  of  doing  this over directly probing the callee
                     function is that this probe point is run  only  when  the
                     callee  is  called  from  the  target  function  (add the
                     -DSTAP_CALLEE_MATCHALL directive to  override  this  when
                     calling stap(1)).

                     Note  that only callees that can be statically determined
                     are  available.   For  example,  calls  through  function
                     pointers are not available.  Additionally, calls to func‐
                     tions located in other objects (e.g.  libraries) are  not
                     available (instead use another probe point). This feature
                     will only work for code compiled with GCC 4.7+.

              .callees
                     Shortcut for .callee("*"), which places a  probe  on  all
                     callees of the function.

              .callees(DEPTH)
                     Recursively places probes on callees.  For example,
                     .callees(2) will probe both callees of the target func‐
                     tion and callees of those callees, .callees(3) goes one
                     level deeper, and so on.  A callee probe at depth N is
                     only triggered when the N callers in the callstack match
                     those that were statically determined during analysis
                     (this may also be overridden using
                     -DSTAP_CALLEE_MATCHALL).

       In the above list of probe points, MPATTERN stands for a string literal
       that aims to identify the loaded kernel module of interest. For in-tree
       kernel modules, the name suffices (e.g. "btrfs"). The name may also in‐
       clude  the  "*", "[]", and "?" wildcards to match multiple in-tree mod‐
       ules. Out-of-tree modules are also supported by specifying the full
       path to the .ko file; wildcards are not supported in that case.  The
       file must follow the convention of being named <module_name>.ko
       (characters ',' and '-' are replaced by '_').

       LPATTERN  stands  for  a source program label. It may also contain "*",
       "[]", and "?" wildcards. PATTERN stands for a string literal that  aims
       to identify a point in the program.  It is made up of three parts:

       ·   The first part is the name of a function, as would appear in the nm
           program's output.  This part may use the "*"  and  "?"  wildcarding
           operators to match multiple names.

       ·   The  second part is optional and begins with the "@" character.  It
           is followed by the path to the source file containing the function,
           which may include a wildcard pattern, such as mm/slab*.  If it does
           not match as is, an implicit "*/" is optionally  added  before  the
           pattern, so that a script need only name the last few components of
           a possibly long source directory path.

       ·   Finally, the third part is optional if the file name part was  giv‐
           en, and identifies the line number in the source file preceded by a
           ":" or a "+".  The line number is assumed to be  an  absolute  line
           number if preceded by a ":", or relative to the declaration line of
           the function if preceded by a "+".  All the lines in  the  function
           can  be  matched  with  ":*".   A range of lines x through y can be
           matched with ":x-y". Ranges and specific lines can be  mixed  using
           commas, e.g. ":x,y-z".

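       The three parts of PATTERN can be combined as in the sketch below.  The
       function names, file names, and line numbers are illustrative only and
       may not exist in a given kernel, so each probe point is marked optional
       and a timer probe ends the session.

              # Function name part only, with a wildcard.
              probe kernel.function("vfs_*") ? { println(pp()) }

              # Function name plus source file pattern.
              probe kernel.function("kmalloc*@mm/slab*") ? { println(pp()) }

              # Any function, restricted to a range of absolute line numbers.
              probe kernel.statement("*@fs/open.c:100-120") ?
              {
                  println(pp())
              }

              # Third line after the declaration line of the function.
              probe kernel.statement("do_sys_open@fs/open.c+3") ?
              {
                  println(pp())
              }

              probe timer.s(5) { exit() }
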
       As an alternative, PATTERN may be a numeric constant, indicating an ad‐
       dress.  Such an address may be found from symbol tables of  the  appro‐
       priate  kernel  /  module  object  file.   It is verified against known
       statement code boundaries, and will be relocated for use at run time.

       In guru mode only, absolute kernel-space  addresses  may  be  specified
       with the ".absolute" suffix.  Such an address is considered already re‐
       located, as if it came from /proc/kallsyms, so  it  cannot  be  checked
       against statement/instruction boundaries.

   CONTEXT VARIABLES
       Many  of  the  source-level context variables, such as function parame‐
       ters, locals, globals visible in the compilation unit, may  be  visible
       to probe handlers.  Handlers may refer to these variables by prefixing
       their names with "$" within the scripts.  In addition, a special syntax
       allows  limited  traversal  of  structures, pointers, and arrays.  More
       syntax allows pretty-printing of individual variables or their  groups.
       See also @cast.  Note that variables may be inaccessible because they
       are paged out, or for a few other reasons.  See also
       error::fault(7stap).


       $var   refers  to  an in-scope variable "var".  If it's an integer-like
              type, it will be cast to a 64-bit int for systemtap script  use.
              String-like  pointers (char *) may be copied to systemtap string
              values using the kernel_string or user_string functions.

       @var("varname")
              an alternative syntax for $varname

       @var("varname@src/file.c")
              refers to the global (either file local  or  external)  variable
              varname defined when the file src/file.c was compiled. The CU in
              which the variable is resolved is the first CU in the module  of
              the probe point which matches the given file name at the end and
              has the shortest file name path (e.g. given
              @var("foo@bar/baz.c") and CUs with file name paths
              src/sub/module/bar/baz.c and src/bar/baz.c, the second CU will
              be chosen to resolve the (file) global variable foo).

       $var->field
              traversal via a structure's or a pointer's field.  This
              generalized  indirection operator may be repeated to follow more
              levels.  Note that the "." operator is not used for plain
              structure members; "->" serves both purposes.  (This is because
              "." is reserved for string concatenation.)  Also note that for
              direct dereferencing of a $var pointer,
              {kernel,user}_{char,int,...}($var) should be used.  (Refer to
              stapfuncs(5) for more details.)

       $return
              is available in return probes only for functions  that  are  de‐
              clared  with  a return value, which can be determined using @de‐
              fined($return).

       $var[N]
              indexes into an array.  The index may be given as a literal
              number or as an arbitrary numeric expression.

       A  number  of  operators  exist for such basic context variable expres‐
       sions:

       $$vars expands to a character string that is equivalent to

              sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
                      parm1, ..., parmN, var1, ..., varN)

              for each variable in scope at the probe point.  Some values  may
              be printed as =?  if their run-time location cannot be found.

       $$locals
              expands to a subset of $$vars for only local variables.

       $$parms
              expands to a subset of $$vars for only function parameters.

       $$return
              is available in return probes only.  It expands to a string that
              is equivalent to sprintf("return=%x",  $return)  if  the  probed
              function has a return value, or else an empty string.

       & $EXPR
              expands to the address of the given context variable expression,
              if it is addressable.

       @defined($EXPR)
              expands to 1 if the given context variable expression is
              resolvable, and to 0 otherwise, for use in conditionals such as

              @defined($foo->bar) ? $foo->bar : 0


       $EXPR$ expands to a string with all of $EXPR's members, equivalent to

              sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
                       $EXPR->a, $EXPR->b)


       $EXPR$$
              expands to a string with all of $EXPR's members and submembers,
              equivalent to

              sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
                      $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])


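       The sketch below combines several of these operators.  It assumes a
       kernel with debuginfo in which vfs_read has parameters named file and
       count, and in which struct file may or may not have an f_inode member;
       these names are assumptions about the target kernel, not guarantees.

              probe kernel.function("vfs_read")
              {
                  # All function parameters as one pretty-printed string.
                  printf("%s: %s\n", pp(), $$parms)

                  # Guard a member access that may not resolve everywhere.
                  ino = @defined($file->f_inode) ? $file->f_inode->i_ino : 0
                  printf("inode %d, count %d\n", ino, $count)

                  # One hit is enough for a demonstration.
                  exit()
              }
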
605   MORE ON RETURN PROBES
606       For the kernel ".return" probes, only a certain fixed number of returns
607       may  be  outstanding.  The default is a relatively small number, on the
608       order of a few times the number of physical CPUs.   If  many  different
609       threads  concurrently call the same blocking function, such as futex(2)
610       or read(2), this limit could  be  exceeded,  and  skipped  "kretprobes"
611       would be reported by "stap -t".  To work around this, specify a
612
613              probe FOO.return.maxactive(NNN)
614
615       suffix,  with  a  large  enough  NNN to cover all expected concurrently
616       blocked threads.  Alternately, use the
617
618              stap -DKRETACTIVE=NNNN
619
620       stap command line macro setting to override the default for  all  ".re‐
621       turn" probes.
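
       For instance, a minimal sketch (the function name vfs_read and the
       value 256 are illustrative assumptions, not recommendations) that
       raises the limit for one probe and counts the returns it observes:

              global returns
              probe kernel.function("vfs_read").return.maxactive(256) {
                  returns <<< 1
              }
              probe end {
                  printf("%d vfs_read returns observed\n", @count(returns))
              }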
622
623
624       For ".return" probes, context variables other than the "$return" may be
625       accessible, as a convenience for a script programmer wishing to  access
626       function  parameters.   These values are snapshots taken at the time of
627       function entry.  (Local variables within the function are not generally
628       accessible,  since  those variables did not exist in allocated/initial‐
629       ized form at the  snapshot  moment.)   These  entry-snapshot  variables
630       should be accessed via @entry($var).
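
       A brief sketch of this, assuming that vfs_read has a parameter named
       count on the running kernel:

              probe kernel.function("vfs_read").return {
                  # @entry($count) is the value saved at function entry
                  printf("requested=%d returned=%d\n",
                         @entry($count), $return)
              }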
631
632       In  addition,  arbitrary  entry-time  expressions can also be saved for
633       ".return" probes using the @entry(expr) operator.  For example, one can
634       compute the elapsed time of a function:
635
636              probe kernel.function("do_filp_open").return {
637                  println( get_timeofday_us() - @entry(get_timeofday_us()) )
638              }
639
640
641
642       The following table summarizes how values related to a function parame‐
643       ter context variable, a pointer named addr, may be accessed from a .re‐
644       turn probe.
645
646       at-entry value   past-exit value
647
648       $addr            not available
649       $addr->x->y      @cast(@entry($addr),"struct zz")->x->y
650       $addr[0]         {kernel,user}_{char,int,...}(& $addr[0])
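
       As a sketch of the last two rows, assuming the vfs_read parameters
       named file and buf, a return probe might read a structure field and
       the first byte of a user-space buffer like this:

              probe kernel.function("vfs_read").return {
                  if ($return > 0)
                      printf("flags=%x first=%d\n",
                             @cast(@entry($file), "struct file")->f_flags,
                             user_char(@entry($buf)))
              }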
651
652
653
654   DWARFLESS
655       In the absence of debugging information, the entry and exit points of
656       kernel and module functions can be probed using the "kprobe" family of
657       probes.  However, these do not permit looking up the arguments or local
658       variables of the function.  The following constructs are supported:
659
660              kprobe.function(FUNCTION)
661              kprobe.function(FUNCTION).call
662              kprobe.function(FUNCTION).return
663              kprobe.module(NAME).function(FUNCTION)
664              kprobe.module(NAME).function(FUNCTION).call
665              kprobe.module(NAME).function(FUNCTION).return
666              kprobe.statement(ADDRESS).absolute
667
668
669       Probes of type function are recommended for kernel  functions,  whereas
670       probes  of  type  module  are  recommended for probing functions of the
671       specified module.  In case the absolute address of a kernel  or  module
672       function is known, statement probes can be utilized.
673
674       Note that FUNCTION and module NAME parameters must not contain wildcards,
675       or the probe will not be registered.  Also, statement probes may only be
676       run in guru mode.
677
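       A minimal sketch of a DWARF-less probe follows.  The function name
       vfs_read is an assumption about the running kernel; the script only
       counts calls per process name, since no arguments are available:

              global calls
              probe kprobe.function("vfs_read") {
                  calls[execname()]++
              }
              probe timer.s(5) {
                  # print the ten busiest callers, then stop
                  foreach (name in calls- limit 10)
                      printf("%-16s %d\n", name, calls[name])
                  exit()
              }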
678
679
680   USER-SPACE
681       Support for user-space probing is available for kernels that are
682       configured with the utrace extensions, or that have the uprobes
683       facility (Linux 3.5 and later).  (Various kernel build configuration
684       options need to be enabled; systemtap will advise if these are missing.)
685
686
687       There are several forms.  First, a non-symbolic probe point:
688
689              process(PID).statement(ADDRESS).absolute
690
691       is analogous to kernel.statement(ADDRESS).absolute in that both use raw
692       (unverified)  virtual  addresses and provide no $variables.  The target
693       PID parameter must identify a running process, and ADDRESS should iden‐
694       tify  a valid instruction address.  All threads of that process will be
695       probed.
696
697       Second, non-symbolic user-kernel interface events handled by utrace may
698       be probed:
699
700              process(PID).begin
701              process("FULLPATH").begin
702              process.begin
703              process(PID).thread.begin
704              process("FULLPATH").thread.begin
705              process.thread.begin
706              process(PID).end
707              process("FULLPATH").end
708              process.end
709              process(PID).thread.end
710              process("FULLPATH").thread.end
711              process.thread.end
712              process(PID).syscall
713              process("FULLPATH").syscall
714              process.syscall
715              process(PID).syscall.return
716              process("FULLPATH").syscall.return
717              process.syscall.return
718              process(PID).insn
719              process("FULLPATH").insn
720              process(PID).insn.block
721              process("FULLPATH").insn.block
722
723
724
725       A process.begin probe gets called when a new process described by PID or
726       FULLPATH gets created.  In addition, it is called once from the context
727       of each preexisting process, at systemtap script startup.  This is use‐
728       ful to track live processes.  A process.thread.begin probe gets  called
729       when  a  new  thread  described  by  PID  or  FULLPATH gets created.  A
730       process.end probe gets called when a process described by PID or FULLPATH
731       dies.   A  process.thread.end probe gets called when a thread described
732       by PID or FULLPATH dies.  A process.syscall probe gets  called  when  a
733       thread  described  by  PID or FULLPATH makes a system call.  The system
734       call number is available in the  $syscall  context  variable,  and  the
735       first 6 arguments of the system call are available in the $argN (e.g.
736       $arg1, $arg2, ...) context variables.  A process.syscall.return probe
737       gets  called  when a thread described by PID or FULLPATH returns from a
738       system call.  The system call number is available in the $syscall  con‐
739       text  variable, and the return value of the system call is available in
740       the $return context variable.  A process.insn probe gets called for ev‐
741       ery single-stepped instruction of the process described by PID or FULL‐
742       PATH.  A process.insn.block probe gets called for  every  block-stepped
743       instruction of the process described by PID or FULLPATH.
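
       For example, a small sketch (the path /bin/ls is only an assumed
       target) that logs process lifetimes and the system calls of one
       program:

              probe process.begin {
                  printf("%s[%d] began\n", execname(), pid())
              }
              probe process.end {
                  printf("%s[%d] ended\n", execname(), pid())
              }
              probe process("/bin/ls").syscall {
                  printf("syscall %d, arg1=%x\n", $syscall, $arg1)
              }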
744
745
746       If  a  process  probe  is specified without a PID or FULLPATH, all user
747       threads will be probed.  However, if systemtap was invoked with the  -c
748       or  -x options, then process probes are restricted to the process hier‐
749       archy associated with the target process.  If a process  probe  is  un‐
750       specified (i.e. without a PID or FULLPATH), but with the -c option, the
751       PATH of the -c cmd will be heuristically filled into the process  PATH.
752       In  that  case,  only  command parameters are allowed in the -c command
753       (i.e. no command substitution allowed and  no  occurrences  of  any  of
754       these characters: '|&;<>(){}').
755
756
757       Third,  symbolic  static  instrumentation  compiled  into  programs and
758       shared libraries may be probed:
759
760              process("PATH").mark("LABEL")
761              process("PATH").provider("PROVIDER").mark("LABEL")
762              process(PID).mark("LABEL")
763              process(PID).provider("PROVIDER").mark("LABEL")
764
765
766       A .mark probe gets called via a static probe which is  defined  in  the
767       application by STAP_PROBE1(PROVIDER,LABEL,arg1), one of a family of
768       macros defined in sys/sdt.h.  The PROVIDER is an arbitrary application
769       identifier, LABEL is the marker site identifier, and arg1 is the integer-typed
770       argument.  STAP_PROBE1 is used for probes with 1 argument,  STAP_PROBE2
771       is  used  for probes with 2 arguments, and so on.  The arguments of the
772       probe are available in the context variables $arg1, $arg2, ...  An  al‐
773       ternative to using the STAP_PROBE macros is to use the dtrace script to
774       create  custom  macros.   Additionally,  the   variables   $$name   and
775       $$provider  are  available  as  parts  of  the  probe  point name.  The
776       sys/sdt.h macro  names  DTRACE_PROBE*  are  available  as  aliases  for
777       STAP_PROBE*.
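
       On the script side, such a marker can be handled as sketched below.
       The program path and the marker name request_start are hypothetical:

              probe process("/usr/bin/myapp").mark("request_start") {
                  printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
              }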
778
779
780       Finally,  full  symbolic source-level probes in user-space programs and
781       shared libraries are supported.  These are  exactly  analogous  to  the
782       symbolic DWARF-based kernel/module probes described above.  They expose
783       the same sorts of context $variables  for  function  parameters,  local
784       variables, and so on.
785
786              process("PATH").function("NAME")
787              process("PATH").statement("*@FILE.c:123")
788              process("PATH").plt("NAME")
789              process("PATH").library("PATH").plt("NAME")
790              process("PATH").library("PATH").function("NAME")
791              process("PATH").library("PATH").statement("*@FILE.c:123")
792              process("PATH").function("*").return
793              process("PATH").function("myfun").label("foo")
794              process("PATH").function("foo").callee("bar")
795              process("PATH").plt("NAME").return
796              process(PID).function("NAME")
797              process(PID).statement("*@FILE.c:123")
798              process(PID).plt("NAME")
799
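       As an illustration, the following sketch traces entry to and return
       from main in an assumed target, /bin/ls.  It requires debugging
       information for that program:

              probe process("/bin/ls").function("main") {
                  printf("main(%s)\n", $$parms)
              }
              probe process("/bin/ls").function("main").return {
                  printf("main returned %d\n", $return)
              }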
800
801
802       Note  that for all process probes, PATH names refer to executables that
803       are searched the same way shells do: relative to the working  directory
804       if they contain a "/" character, otherwise in $PATH.  If PATH names re‐
805       fer to scripts, the actual interpreters (specified in the script in the
806       first line after the #! characters) are probed.
807
808
809       Tapset   process   probes   placed   in  the  special  directory  $pre‐
810       fix/share/systemtap/tapset/PATH/ with relative paths  will  have  their
811       process  parameter  prefixed with the location of the tapset. For exam‐
812       ple,
813
814
815              process("foo").function("NAME")
816
817
818       expands to
819
820              process("/usr/bin/foo").function("NAME")
821
822
823
824       when placed in $prefix/share/systemtap/tapset/PATH/usr/bin/
825
826
827       If PATH is a process component parameter referring to shared  libraries
828       then  all  processes that map it at runtime would be selected for prob‐
829       ing.  If PATH is a library component parameter referring to shared  li‐
830       braries  then  the  process specified by the process component would be
831       selected.  Note that the PATH pattern in a library component  will  al‐
832       ways  apply  to  libraries  statically  determined  to be in use by the
833       process. However, you may also specify the full  path  to  any  library
834       file even if not statically needed by the process.
835
836
837       A  .plt  probe will probe functions in the program linkage table corre‐
838       sponding to the rest of the probe point.  .plt can be  specified  as  a
839       shorthand for .plt("*").  The symbol name is available as a $$name con‐
840       text variable; function arguments are not  available,  since  PLTs  are
841       processed without debuginfo.  A .plt.return probe places a probe at the
842       moment after the return from the named function.
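
       A short sketch, assuming /bin/ls as the target, that lists the PLT
       entries the program calls through:

              probe process("/bin/ls").plt {
                  printf("plt call: %s\n", $$name)
              }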
843
844
845       If the PATH string contains wildcards as in  the  MPATTERN  case,  then
846       standard  globbing  is  performed  to find all matching paths.  In this
847       case, the $PATH environment variable is not used.
848
849
850       If systemtap was invoked with the -c or -x options, then process probes
851       are  restricted  to  the  process  hierarchy associated with the target
852       process.
853
854
855   JAVA
856       Support for probing Java methods is available using Byteman as a  back‐
857       end.  Byteman  is  an instrumentation tool from the JBoss project which
858       systemtap can use to monitor invocations for a specific method or  line
859       in a Java program.
860
861       Systemtap  does so by generating a Byteman script listing the probes to
862       instrument and then invoking the Byteman bminstall utility.
863
864       This Java instrumentation support is currently a prototype feature with
865       major  limitations.   Moreover,  Java  probing  currently does not work
866       across users; the stap script must run (with  appropriate  permissions)
867       under  the  same  user as the Java process being probed. (Thus a stap
868       script under root currently cannot probe Java methods in a non-root-us‐
869       er Java process.)
870
871
872       The  first  probe type refers to Java processes by the name of the Java
873       process:
874
875              java("PNAME").class("CLASSNAME").method("PATTERN")
876              java("PNAME").class("CLASSNAME").method("PATTERN").return
877
878       The PNAME argument must name a pre-existing JVM process that is
879       identifiable via a jps listing.
880
881       The  PATTERN  parameter  specifies  the signature of the Java method to
882       probe. The signature must consist of the exact name of the method, fol‐
883       lowed  by  a bracketed list of the types of the arguments, for instance
884       "myMethod(int,double,Foo)". Wildcards are not supported.
885
886       The probe can be set to trigger at a specific line within the method by
887       appending  a  line number with colon, just as in other types of probes:
888       "myMethod(int,double,Foo):245".
889
890       The CLASSNAME parameter identifies the Java class  the  method  belongs
891       to,  either  with or without the package qualification. By default, the
892       probe only triggers on descendants of the class that  do  not  override
893       the  method  definition  of  the original class. However, CLASSNAME can
894       take an optional caret prefix, as in ^org.my.MyClass,  which  specifies
895       that  the  probe should also trigger on all descendants of MyClass that
896       override the original method. For instance, every method with signature
897       foo(int) in program org.my.MyApp can be probed at once using
898
899              java("org.my.MyApp").class("^java.lang.Object").method("foo(int)")
900
901
902       The  second  probe type works analogously, but refers to Java processes
903       by PID:
904
905              java(PID).class("CLASSNAME").method("PATTERN")
906              java(PID).class("CLASSNAME").method("PATTERN").return
907
908       (PIDs for an already running process can be obtained using  the  jps(1)
909       utility.)
910
911       Context  variables  defined  within  java  probes include $arg1 through
912       $arg10 (for up to the first 10 arguments of a method),  represented  as
913       character-pointers  for  the  toString()  form of each actual argument.
914       The arg1 through arg10 script variables provide access to these as  or‐
915       dinary strings, fetched via user_string_warn().
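
       For instance, reusing the illustrative names org.my.MyApp and
       org.my.MyClass from above, a probe could print the first argument of
       every foo(int) invocation:

              probe java("org.my.MyApp").class("^org.my.MyClass").method("foo(int)") {
                  printf("foo called with %s\n", arg1)
              }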
916
917       Prior  to systemtap version 3.1, $arg1 through $arg10 could contain ei‐
918       ther integers or character pointers, depending on the types of the  ob‐
919       jects  being  passed to each particular java method.  This previous be‐
920       haviour may be invoked with the stap --compatible=3.0 flag.
921
922
923   PROCFS
924       These probe points allow procfs "files" in  /proc/systemtap/MODNAME  to
925       be  created,  read  and written using a permission that may be modified
926       using the proper umask value. Default permissions  are  0400  for  read
927       probes,  and  0200 for write probes. If both a read and write probe are
928       being used on the same file, a default permission of 0600 will be used.
929       Using procfs.umask(0040).read would result in a 0404 permission set for
930       the file.  (MODNAME is the name of  the  systemtap  module).  The  proc
931       filesystem is a pseudo-filesystem which is used as an interface to ker‐
932       nel data structures. There are several probe point  variants  supported
933       by the translator:
934
935
936              procfs("PATH").read
937              procfs("PATH").umask(UMASK).read
938              procfs("PATH").read.maxsize(MAXSIZE)
939              procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
940              procfs("PATH").write
941              procfs("PATH").umask(UMASK).write
942              procfs.read
943              procfs.umask(UMASK).read
944              procfs.read.maxsize(MAXSIZE)
945              procfs.umask(UMASK).read.maxsize(MAXSIZE)
946              procfs.write
947              procfs.umask(UMASK).write
948
949
950       PATH is the file name (relative to /proc/systemtap/MODNAME) to be
951       created.  If no PATH is specified (as in the last six variants above),
952       PATH  defaults to "command". The file name "__stdin" is used internally
953       by systemtap for input probes and should not be  used  as  a  PATH  for
954       procfs probes; see the input probe section below.
955
956       When  a  user  reads  /proc/systemtap/MODNAME/PATH,  the  corresponding
957       procfs read probe is triggered.  The string data to be read  should  be
958       assigned to a variable named $value, like this:
959
960
961              procfs("PATH").read { $value = "100\n" }
962
963
964       When a user writes into /proc/systemtap/MODNAME/PATH, the corresponding
965       procfs write probe is triggered.  The data the user wrote is  available
966       in the string variable named $value, like this:
967
968
969              procfs("PATH").write { printf("user wrote: %s", $value) }
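
       Putting the two together, here is a sketch of a counter that can be
       read and set through one procfs file.  The file name "hits" and the
       probed function vfs_read are illustrative assumptions:

              global hits
              probe kprobe.function("vfs_read") { hits++ }
              probe procfs("hits").read {
                  $value = sprintf("%d\n", hits)
              }
              probe procfs("hits").write {
                  # parse the written text as a base-10 number
                  hits = strtol($value, 10)
              }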
970
971
972       MAXSIZE  is the size of the procfs read buffer.  Specifying MAXSIZE al‐
973       lows larger procfs output.  If no MAXSIZE is specified, the procfs read
974       buffer  defaults to STP_PROCFS_BUFSIZE (which defaults to MAXSTRINGLEN,
975       the maximum length of a string).  If setting the  procfs  read  buffers
976       for  more  than  one  file is needed, it may be easiest to override the
977       STP_PROCFS_BUFSIZE definition.  Here's an example of using MAXSIZE:
978
979
980              procfs.read.maxsize(1024) {
981                  $value = "long string..."
982                  $value .= "another long string..."
983                  $value .= "another long string..."
984                  $value .= "another long string..."
985              }
986
987
988
989   INPUT
990       These probe points make input from stdin available to the script during
991       runtime.   The translator currently supports two variants of this fami‐
992       ly:
993
994              input.char
995              input.line
996
997
998       input.char is triggered each time a character is read from  stdin.  The
999       current  character  is  available  in  the  string variable named char.
1000       There is no newline buffering; the next character is read from stdin as
1001       soon as it becomes available.
1002
1003       input.line causes all characters read from stdin to be buffered until a
1004       newline is read, at which point the probe will be triggered.  The  cur‐
1005       rent  line of characters (including the newline) is made available in a
1006       string variable named line.  Note that no more than MAXSTRINGLEN  char‐
1007       acters will be buffered. Any additional characters will not be included
1008       in line.
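
       A minimal sketch of a line-driven script; the command word "quit" is
       an arbitrary choice for this example:

              probe input.line {
                  # line still carries its trailing newline
                  if (line == "quit\n")
                      exit()
                  else
                      printf("read: %s", line)
              }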
1009
1010
1011       Input probes are aliases for procfs("__stdin").write.  Systemtap recon‐
1012       figures  stdin if the presence of this procfs probe is detected, there‐
1013       fore "__stdin" should not be used as a path argument for procfs probes.
1014       Additionally,  input  probes will not work with the -F and --remote op‐
1015       tions.
1016
1017
1018   NETFILTER HOOKS
1019       These probe points allow observation of network packets using the  net‐
1020       filter  mechanism. A netfilter probe in systemtap corresponds to a net‐
1021       filter hook function in the original netfilter probes API. It is proba‐
1022       bly  more  convenient  to use tapset::netfilter(3stap), which wraps the
1023       primitive netfilter hooks and does the work of extracting useful infor‐
1024       mation from the context variables.
1025
1026
1027       There are several probe point variants supported by the translator:
1028
1029
1030              netfilter.hook("HOOKNAME").pf("PROTOCOL_F")
1031              netfilter.pf("PROTOCOL_F").hook("HOOKNAME")
1032              netfilter.hook("HOOKNAME").pf("PROTOCOL_F").priority("PRIORITY")
1033              netfilter.pf("PROTOCOL_F").hook("HOOKNAME").priority("PRIORITY")
1034
1035
1036
1037       PROTOCOL_F  is  the protocol family to listen for, currently one of NF‐
1038       PROTO_IPV4, NFPROTO_IPV6, NFPROTO_ARP, or NFPROTO_BRIDGE.
1039
1040
1041       HOOKNAME is the point, or 'hook', in the protocol stack at which to in‐
1042       tercept  the  packet. The available hook names for each protocol family
1043       are taken from the kernel header files <linux/netfilter_ipv4.h>,  <lin‐
1044       ux/netfilter_ipv6.h>,    <linux/netfilter_arp.h>   and   <linux/netfil‐
1045       ter_bridge.h>. For instance, allowable hook names for NFPROTO_IPV4  are
1046       NF_INET_PRE_ROUTING,   NF_INET_LOCAL_IN,  NF_INET_FORWARD,  NF_INET_LO‐
1047       CAL_OUT, and NF_INET_POST_ROUTING.
1048
1049
1050       PRIORITY is an integer priority giving the order  in  which  the  probe
1051       point  should  be  triggered relative to any other netfilter hook func‐
1052       tions which trigger on the same packet. Hook functions execute on  each
1053       packet  in order from smallest priority number to largest priority num‐
1054       ber. If no PRIORITY is specified (as in the first two probe point vari‐
1055       ants above), PRIORITY defaults to "0".
1056
1057       There are a number of predefined priority names of the form NF_IP_PRI_*
1058       and NF_IP6_PRI_* which are defined in the  kernel  header  files  <lin‐
1059       ux/netfilter_ipv4.h>  and  <linux/netfilter_ipv6.h>  respectively.  The
1060       script is permitted to use these instead of specifying an integer  pri‐
1061       ority.  (The  probe points for NFPROTO_ARP and NFPROTO_BRIDGE currently
1062       do not expose any named hook priorities to the script  writer.)   Thus,
1063       allowable ways to specify the priority include:
1064
1065
1066              priority("255")
1067              priority("NF_IP_PRI_SELINUX_LAST")
1068
1069
1070       A script using guru mode is permitted to specify any identifier or num‐
1071       ber as the parameter for hook, pf, and priority. This feature should be
1072       used  with  caution,  as  the parameter is inserted verbatim into the C
1073       code generated by systemtap.
1074
1075       The netfilter probe points define the following context variables:
1076
1077       $hooknum
1078              The hook number.
1079
1080       $skb   The address of the sk_buff struct representing the  packet.  See
1081              <linux/skbuff.h>  for  details on how to use this struct, or al‐
1082              ternatively use the tapset tapset::netfilter(3stap) for easy ac‐
1083              cess to key information.
1084
1085
1086       $in    The  address  of  the net_device struct representing the network
1087              device on which the packet was received (if any). May  be  0  if
1088              the device is unknown or undefined at that stage in the protocol
1089              stack.
1090
1091
1092       $out   The address of the net_device struct  representing  the  network
1093              device  on  which  the packet will be sent (if any). May be 0 if
1094              the device is unknown or undefined at that stage in the protocol
1095              stack.
1096
1097
1098       $verdict
1099              (Guru mode only.) Assigning one of the verdict values defined in
1100              <linux/netfilter.h> to this variable alters the further progress
1101              of the packet through the protocol stack. For instance, the fol‐
1102              lowing guru mode script forces all ipv6 network  packets  to  be
1103              dropped:
1104
1105
1106              probe netfilter.pf("NFPROTO_IPV6").hook("NF_IP6_PRE_ROUTING") {
1107                $verdict = 0 /* nf_drop */
1108              }
1109
1110
1111              For  convenience,  unlike  the  primitive probe points discussed
1112              here, the probes defined in tapset::netfilter(3stap) export  the
1113              lowercase  names  of the verdict constants (e.g. NF_DROP becomes
1114              nf_drop) as local variables.
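
       As a hedged example that uses only the primitive probe point, the
       sketch below counts IPv4 packets seen at the pre-routing hook and
       reports the running total every five seconds (the interval is
       arbitrary):

              global packets
              probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_PRE_ROUTING") {
                  packets++
              }
              probe timer.s(5) {
                  printf("%d IPv4 packets at pre-routing\n", packets)
              }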
1115
1116
1117   KERNEL TRACEPOINTS
1118       This family of probe points hooks up to static probing tracepoints  in‐
1119       serted  into the kernel or modules.  As with markers, these tracepoints
1120       are special macro calls inserted by kernel developers to  make  probing
1121       faster and more reliable than with DWARF-based probes, and DWARF debug‐
1122       ging information is not required  to  probe  tracepoints.   Tracepoints
1123       have an extra advantage of more strongly-typed parameters than markers.
1124
1125       Tracepoint probes look like: kernel.trace("name").  The tracepoint name
1126       string, which may contain the usual  wildcard  characters,  is  matched
1127       against  the  names  defined by the kernel developers in the tracepoint
1128       header files.  To restrict the search to specific subsystems (e.g.
1129       sched or ext3), the following syntax can be used:
1130       kernel.trace("system:name").  The tracepoint system string may also
1131       contain the usual wildcard characters.
1132
1133       The  handler  associated with a tracepoint-based probe may read the op‐
1134       tional parameters specified at the macro call site.   These  are  named
1135       according  to  the  declaration by the tracepoint author.  For example,
1136       the tracepoint probe  kernel.trace("sched:sched_switch")  provides  the
1137       parameters  $prev and $next.  If the parameter is a complex type, as in
1138       a struct pointer, then a script can access fields with the same  syntax
1139       as DWARF $target variables.  Also, tracepoint parameters cannot be mod‐
1140       ified, but in guru-mode a script may modify fields of parameters.
1141
1142       The subsystem and name of the tracepoint are available in $$system  and
1143       $$name  and a string of name=value pairs for all parameters of the tra‐
1144       cepoint is available in $$vars or $$parms.
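
       For example, a sketch that reports context switches.  It assumes that
       the task structures passed as $prev and $next have a pid field:

              probe kernel.trace("sched:sched_switch") {
                  printf("%s:%s %d -> %d\n", $$system, $$name,
                         $prev->pid, $next->pid)
              }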
1145
1146
1147   KERNEL MARKERS (OBSOLETE)
1148       This family of probe points hooks up to an older style of static  prob‐
1149       ing  markers inserted into older kernels or modules.  These markers are
1150       special STAP_MARK macro calls inserted by  kernel  developers  to  make
1151       probing  faster  and  more reliable than with DWARF-based probes.  Fur‐
1152       ther, DWARF debugging information is not required to probe markers.
1153
1154       Marker probe points begin with kernel.  The next part names the  marker
1155       itself:  mark("name").   The  marker name string, which may contain the
1156       usual wildcard characters, is matched against the names  given  to  the
1157       marker  macros when the kernel and/or module was compiled.    Optional‐
1158       ly, you can specify format("format").   Specifying  the  marker  format
1159       string  allows  differentiation  between two markers with the same name
1160       but different marker format strings.
1161
1162       The handler associated with a marker-based probe may read the  optional
1163       parameters  specified  at  the  macro call site.  These are named $arg1
1164       through $argNN, where NN is the number of parameters  supplied  by  the
1165       macro.  Number and string parameters are passed in a type-safe manner.
1166
1167       The marker format string associated with a marker is available in
1168       $format.  The marker name string is likewise available in $name.
1169
1170
1171   HARDWARE BREAKPOINTS
1172       This family of probes is used to set hardware watchpoints for a given
1173       (global) kernel symbol.  The probes take three components as inputs:
1174
1175       1. The virtual address or name of the kernel symbol to be traced is
1176       supplied as the argument to this class of probes.  (Only probes for
1177       data segment variables are supported.  Probing local variables of a
1178       function cannot be done.)
1179
1180       2. Nature of access to be probed: (a) a .write probe is triggered when
1181       a write happens at the specified address or symbol name; (b) a .rw probe
1182       is triggered when either a read or a write happens.
1183
1184       3.   .length (optional) Users have the option of specifying the address
1185       interval to be probed using  "length"  constructs.  The  user-specified
1186       length  gets  approximated  to the closest possible address length that
1187       the architecture can support. If the specified length exceeds the  lim‐
1188       its imposed by architecture, an error message is flagged and probe reg‐
1189       istration fails.  Wherever 'length' is not  specified,  the  translator
1190       requests  a  hardware  breakpoint probe of length 1. It should be noted
1191       that the "length" construct is not valid with symbol names.
1192
1193       The following constructs are supported:
1194
1195              probe kernel.data(ADDRESS).write
1196              probe kernel.data(ADDRESS).rw
1197              probe kernel.data(ADDRESS).length(LEN).write
1198              probe kernel.data(ADDRESS).length(LEN).rw
1199              probe kernel.data("SYMBOL_NAME").write
1200              probe kernel.data("SYMBOL_NAME").rw
1201
1202
1203       This set of probes makes use of the debug registers of the processor,
1204       which are a scarce resource (4 on x86, 1 on powerpc).  The script
1205       translation flags a warning if a user requests more hardware breakpoint
1206       probes than the limits set by the architecture.  For example, a pass-2
1207       warning is issued when an input script requests 5 hardware breakpoint
1208       probes on an x86 system, while the x86 architecture supports a maximum
1209       of 4 breakpoints.  Users are cautioned to set probes judiciously.
1210
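       A minimal sketch, assuming the running kernel has a global data
       symbol named pid_max:

              probe kernel.data("pid_max").write {
                  printf("pid_max written by %s[%d]\n", execname(), pid())
              }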
1211
1212   PERF
1213       This family of probe points interfaces to the kernel "perf  event"  in‐
1214       frastructure for controlling hardware performance counters.  The events
1215       being attached to are described by the "type" and "config" fields of the
1216       perf_event_attr  structure,  and are sampled at an interval governed by
1217       the "sample_period" and "sample_freq" fields.
1218
1219       These fields are made available to systemtap scripts using the  follow‐
1220       ing syntax:
1221
1222              probe perf.type(NN).config(MM).sample(XX)
1223              probe perf.type(NN).config(MM).hz(XX)
1224              probe perf.type(NN).config(MM)
1225              probe perf.type(NN).config(MM).process("PROC")
1226              probe perf.type(NN).config(MM).counter("COUNTER")
1227              probe perf.type(NN).config(MM).process("PROC").counter("NAME")
1228
1229       The systemtap probe handler is called once per XX increments of the
1230       underlying performance counter when using .sample, or at a frequency
1231       of XX hertz when using .hz.  When neither is specified, the default
1232       is to sample once per 1000000 counts.  The valid type/config values
1233       are described by the perf_event_open(2) system call and the
1234       linux/perf_event.h file.  Invalid combinations or exhausted hardware
1235       counter resources result in errors during systemtap script startup.
1236       Systemtap does not sanity-check the values: it merely passes them
1237       through to the kernel for error- and safety-checking.  By default the
1238       perf event probe is systemwide; specifying .process binds the probe
1239       to a specific task.  If the process name is omitted, it is inferred
1240       from the stap -c argument.  A perf event can be read on demand using
1241       .counter.  The body of the perf probe handler is not invoked for a
1242       .counter probe; instead, the counter is read in a user-space probe
1243       via:
1244
1245              probe process("PROC").statement("func@file") { stat <<< @perf("NAME") }
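
           A sampling probe of this kind might be sketched as follows.  The
           values type 0 and config 0 are assumed to select
           PERF_TYPE_HARDWARE and PERF_COUNT_HW_CPU_CYCLES as defined in
           linux/perf_event.h; the five-second run time and the report
           layout are arbitrary choices for illustration.

```systemtap
global hits

# Systemwide sampling: the handler runs once per 1000000 CPU cycles.
# type(0)   is assumed to be PERF_TYPE_HARDWARE
# config(0) is assumed to be PERF_COUNT_HW_CPU_CYCLES
probe perf.type(0).config(0).sample(1000000) {
  hits[execname()] <<< 1
}

# Stop the session after five seconds.
probe timer.s(5) { exit() }

# Report the ten process names that were sampled most often.
probe end {
  foreach (name in hits- limit 10)
    printf("%-20s %d\n", name, @count(hits[name]))
}
```

           Replacing .sample(1000000) with .hz(100) would request roughly
           100 samples per second instead of a fixed event count.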
1246
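           The on-demand .counter form could be used along these lines.  The
           target /bin/ls, the probed function main and the counter name
           "cyc" are placeholders chosen for this sketch; probing a function
           by name assumes that debuginfo for the target is installed.

```systemtap
global cycles

# Declare a per-process counter named "cyc".  The handler body is
# never invoked for a .counter probe, so it is left empty.
probe perf.type(0).config(0).process("/bin/ls").counter("cyc") { }

# Read the counter on demand from a user-space probe in the same process.
probe process("/bin/ls").function("main") {
  cycles <<< @perf("cyc")
}

probe end {
  if (@count(cycles))
    printf("reads=%d avg=%d\n", @count(cycles), @avg(cycles))
}
```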
1247
1248
1249   PYTHON
1250       Support for probing python 2 and python 3 functions is available with
1251       the help of an extra python support module.  Note that the debuginfo for
1252       the version of python being probed is required.  To run a python script
1253       with the extra python support module, add the '-m HelperSDT' option
1254       to your python command, like this:
1255
1256              stap foo.stp -c "python -m HelperSDT foo.py"
1257
1258       Python probes look like the following:
1259
1260              python2.module("MPATTERN").function("PATTERN")
1261              python2.module("MPATTERN").function("PATTERN").call
1262              python2.module("MPATTERN").function("PATTERN").return
1263              python3.module("MPATTERN").function("PATTERN")
1264              python3.module("MPATTERN").function("PATTERN").call
1265              python3.module("MPATTERN").function("PATTERN").return
1266
1267       The  list  above includes multiple variants and modifiers which provide
1268       additional functionality or filters. They are:
1269
1270              .function
1271                     Places a probe at the beginning of the named function  by
1272                     default,  unless  modified  by  PATTERN.  Parameters  are
1273                     available as context variables.
1274
1275              .call  Places a probe at the beginning of  the  named  function.
1276                     Parameters are available as context variables.
1277
1278              .return
1279                     Places  a  probe at the moment before the return from the
1280                     named function. Parameters and local/global python  vari‐
1281                     ables are available as context variables.
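
           To illustrate the .call and .return modifiers, here is a minimal
           sketch that traces entry to and exit from one function.  The
           module name "foo" and function name "bar" are hypothetical; the
           handlers avoid context variables so that nothing is assumed
           about the target's parameters.

```systemtap
# Hypothetical target: a script foo.py that defines a function bar.
probe python3.module("foo").function("bar").call {
  printf("-> bar\n")
}

probe python3.module("foo").function("bar").return {
  printf("<- bar\n")
}
```

           Following the invocation shown earlier, such a script would
           presumably be run with the python command wrapped in
           "-m HelperSDT", for example through the stap -c option.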
1282
1283       PATTERN  stands  for  a string literal that aims to identify a point in
1284       the python program.  It is made up of three parts:
1285
1286       ·   The first part is the name of a  function  (e.g.  "foo")  or  class
1287           method  (e.g.  "bar.baz").  This part may use the "*" and "?" wild‐
1288           carding operators to match multiple names.
1289
1290       ·   The second part is optional and begins with the "@" character.   It
1291           is followed by the path to the source file containing the function,
1292           which may include a wildcard pattern. The python path  is  searched
1293           for a matching filename.
1294
1295       ·   The third part is optional, requires the file name part, and
1296           identifies the line number in the source file, preceded by a
1297           ":"  or  a  "+".  The line number is assumed to be an absolute line
1298           number if preceded by a ":", or relative to the declaration line of
1299           the  function  if preceded by a "+".  All the lines in the function
1300           can be matched with ":*".  A range of lines  x  through  y  can  be
1301           matched  with  ":x-y". Ranges and specific lines can be mixed using
1302           commas, e.g. ":x,y-z".
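
           The three parts of PATTERN can be combined as in the sketch
           below.  All module, file, class and function names (foo, foo.py,
           bar, baz, get_*) and the line numbers are invented for
           illustration; only the pattern syntax follows the description
           above.

```systemtap
# Part 1 only: every function whose name starts with "get_".
probe python3.module("foo").function("get_*") {
  printf("get_* function entered\n")
}

# Part 1 only: method baz of class bar.
probe python3.module("foo").function("bar.baz") {
  printf("bar.baz entered\n")
}

# Parts 1-3: absolute line 42 of foo.py, in whichever function contains it.
probe python3.module("foo").function("*@foo.py:42") {
  printf("reached foo.py line 42\n")
}

# Parts 1-3: the third line after the declaration line of bar.
probe python3.module("foo").function("bar@foo.py+3") {
  printf("reached bar+3\n")
}

# Parts 1-3: line 10 and lines 20 through 25 of foo.py, within bar.
probe python3.module("foo").function("bar@foo.py:10,20-25") {
  printf("reached a selected line in bar\n")
}
```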
1303
1304       In the above list of probe points, MPATTERN stands for the name of the
1305       python module or script of interest.  This part may use the "*" and
1306       "?" wildcarding operators to match multiple names.  The python path
1307       is searched for a matching filename.
1308
1309
1310

EXAMPLES

1312       Here are some example probe points and the events they refer to.
1313
1314       begin, end, end
1315              refers  to  the  startup and normal shutdown of the session.  In
1316              this case, the handler would run once during startup  and  twice
1317              during shutdown.
1318
1319       timer.jiffies(1000).randomize(200)
1320              refers to a periodic interrupt, every 1000 +/- 200 jiffies.
1321
1322       kernel.function("*init*"), kernel.function("*exit*")
1323              refers  to  all  kernel  functions  with "init" or "exit" in the
1324              name.
1325
1326       kernel.function("*@kernel/time.c:240")
1327              refers to any functions within the "kernel/time.c" file that
1328              span line 240.  Note that this does not probe the statement at
1329              that line number; use the kernel.statement probe for that.
1330
1331       kernel.trace("sched_*")
1332              refers to all scheduler-related tracepoints in the kernel
1333              (strictly, all those whose names begin with "sched_").
1334
1335       kernel.mark("getuid")
1336              refers  to  an obsolete STAP_MARK(getuid, ...) macro call in the
1337              kernel.
1338
1339       module("usb*").function("*sync*").return
1340              refers to the moment of return from all functions with "sync" in
1341              the name in any of the USB drivers.
1342
1343       kernel.statement(0xc0044852)
1344              refers  to  the  first  byte of the statement whose compiled in‐
1345              structions include the given address in the kernel.
1346
1347       kernel.statement("*@kernel/time.c:296")
1348              refers to the statement of line 296 within "kernel/time.c".
1349
1350       kernel.statement("bio_init@fs/bio.c+3")
1351              refers to the statement at line bio_init+3 within "fs/bio.c".
1352
1353       kernel.data("pid_max").write
1354              refers to a hardware breakpoint of type "write" set on pid_max.
1355
1356       syscall.*.return
1357              refers to the group of probe aliases with any name in the second
1358              position.
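
           Several of these probe points can be combined into one short
           script.  The following is a sketch only: it assumes the syscall
           tapset aliases define a "name" variable in their return probes,
           and the five-second duration and top-five report are arbitrary.

```systemtap
global returns

# Runs once at session startup.
probe begin { printf("counting syscall returns for 5 seconds\n") }

# Wildcard in the second component: matches every syscall alias.
probe syscall.*.return { returns[name] <<< 1 }

# End the session after five seconds.
probe timer.s(5) { exit() }

# Runs once at normal shutdown: print the five most frequent syscalls.
probe end {
  foreach (sc in returns- limit 5)
    printf("%-24s %d\n", sc, @count(returns[sc]))
}
```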
1359
1360