STAPPROBES(3stap)                                            STAPPROBES(3stap)

NAME

       stapprobes - systemtap probe points

DESCRIPTION

11       The  following sections enumerate the variety of probe points supported
12       by the systemtap translator, and some of the additional aliases defined
13       by  standard  tapset  scripts.  Many are individually documented in the
14       3stap manual section, with the probe:: prefix.
15
16

SYNTAX

18              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }
19
20
       A probe declaration may list multiple comma-separated probe points in
       order to attach a handler to all of the named events.  Normally, the
       handler statements are run whenever any of the events occur.  Depending
       on the type of probe point, the handler statements may refer to context
       variables (denoted with a dollar-sign prefix like $foo) to read or
       write state.  This may include function parameters for function probes,
       or local variables for statement probes.
28
29       The syntax of a single probe point is a general dotted-symbol sequence.
30       This  allows  a  breakdown  of the event namespace into parts, somewhat
31       like the Domain Name System does on the Internet.  Each component iden‐
32       tifier may be parametrized by a string or number literal, with a syntax
33       like a function call.  A component may include a "*" character, to  ex‐
34       pand  to  a  set of matching probe points.  It may also include "**" to
35       match multiple sequential components at once.  Probe  aliases  likewise
36       expand to other probe points.
37
38       Probe  aliases  can be given on their own, or with a suffix. The suffix
39       attaches to the underlying probe point that the alias is  expanded  to.
40       For example,
41
42              syscall.read.return.maxactive(10)
43
44       expands to
45
46              kernel.function("sys_read").return.maxactive(10)
47
48       with the component maxactive(10) being recognized as a suffix.
49
50       Normally,  each  and  every  probe  point  resulting from wildcard- and
51       alias-expansion must be resolved to some low-level system  instrumenta‐
52       tion  facility  (e.g.,  a kprobe address, marker, or a timer configura‐
53       tion), otherwise the elaboration phase will fail.
54
55       However, a probe point may be followed by a "?" character, to  indicate
56       that it is optional, and that no error should result if it fails to re‐
57       solve.  Optionalness passes down through all levels  of  alias/wildcard
58       expansion.  Alternately, a probe point may be followed by a "!" charac‐
59       ter, to indicate that it  is  both  optional  and  sufficient.   (Think
60       vaguely  of  the Prolog cut operator.) If it does resolve, then no fur‐
61       ther probe points in the same comma-separated list  will  be  resolved.
62       Therefore,  the  "!"   sufficiency  mark  only makes sense in a list of
63       probe point alternatives.
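
       As a minimal sketch of these marks (the function names below are
       illustrative assumptions and may not exist on a given kernel), a
       handler can list a preferred probe point followed by a fallback:

              # Try the DWARF-based probe first; "!" stops resolution of
              # the rest of the list if it resolves.  Otherwise fall back
              # to the kprobe variant.
              probe kernel.function("vfs_read") !,
                    kprobe.function("vfs_read")
              {
                  printf("%s entered vfs_read\n", execname())
              }

              # An optional probe: no error results if it fails to resolve.
              probe kernel.function("no_such_function") ?
              {
                  printf("unexpected hit\n")
              }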
64
       Additionally, a probe point may be followed by an "if (expr)" clause,
       in order to enable/disable the probe point on-the-fly.  If "expr" is
       false when the probe point is hit, the whole probe body, including the
       body of any alias, is skipped.  The condition is stacked up through all
       levels of alias/wildcard expansion, so the final condition is the
       logical AND of the conditions from every level of expansion.  The
       expressions are necessarily restricted to global variables.
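
       A short sketch of a probe point condition, assuming a hypothetical
       global named tracing that another probe toggles:

              global tracing = 0

              # Flip the flag every five seconds.
              probe timer.s(5) { tracing = !tracing }

              # The handler body is skipped while tracing is 0.
              probe syscall.read if (tracing)
              {
                  printf("%s: read(%s)\n", execname(), argstr)
              }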
73
       These are all syntactically valid probe points.  (Whether each one is
       also semantically valid depends on the contents of the tapsets and on
       the versions of kernel/user software installed.)
77
78
79              kernel.function("foo").return
80              process("/bin/vi").statement(0x2222)
81              end
82              syscall.*
83              syscall.*.return.maxactive(10)
84              syscall.{open,close}
85              sys**open
86              kernel.function("no_such_function") ?
87              module("awol").function("no_such_function") !
88              signal.*? if (switch)
89              kprobe.function("foo")
90
91
92       Probes may be broadly classified into "synchronous" and "asynchronous".
93       A "synchronous" event is deemed to occur when any processor executes an
94       instruction  matched  by  the specification.  This gives these probes a
95       reference point (instruction address) from which more  contextual  data
96       may  be  available.  Other families of probe points refer to "asynchro‐
97       nous" events such as timers/counters rolling over, where  there  is  no
       fixed reference point that is related.  Each probe point specification
       may match multiple locations (for example, using wildcards or aliases),
       and all of them are then probed.  A probe declaration may also contain
       several comma-separated specifications, all of which are probed.
102
103       Brace expansion is a mechanism which allows a list of probe  points  to
104       be generated. It is very similar to shell expansion. A component may be
105       surrounded by a pair of curly braces to indicate that  the  comma-sepa‐
106       rated  sequence of one or more subcomponents will each constitute a new
107       probe point. The braces may be arbitrarily nested. The ordering of  ex‐
108       panded results is based on product order.
109
       The question mark (?) and exclamation mark (!) indicators and probe
       point conditions may not be placed in any expansions that come before
       the last component.
113
114       The following is an example of brace expansion.
115
116
117              syscall.{write,read}
118              # Expands to
119              syscall.write, syscall.read
120
121              {kernel,module("nfs")}.function("nfs*")!
122              # Expands to
123              kernel.function("nfs*")!, module("nfs").function("nfs*")!
124
125
126

DWARF DEBUGINFO

128       Resolving some probe points requires DWARF debuginfo or "debug symbols"
129       for the specific program being instrumented.  For some others, DWARF is
130       automatically  synthesized  on  the  fly from source code header files.
131       For others, it is not needed at all.  Since a systemtap script may  use
132       any mixture of probe points together, the union of their DWARF require‐
133       ments has to be met on the computer where  script  compilation  occurs.
134       (See the --use-server option and the stap-server(8) man page for infor‐
135       mation about the remote compilation facility, which  allows  these  re‐
136       quirements to be met on a different machine.)
137
       The following table lists many of the available probe point families,
       classifying them by their need for DWARF debuginfo for the specific
       program being probed.
141
142
143       DWARF                          NON-DWARF                    SYMBOL-TABLE
144
145       kernel.function, .statement    kernel.mark                  kernel.function*
146       module.function, .statement    process.mark, process.plt    module.function*
147       process.function, .statement   begin, end, error, never     process.function*
148       process.mark*                  timer
149       .function.callee               perf
150       python2, python3               procfs
151                                      kernel.statement.absolute
152       AUTO-GENERATED-DWARF           kernel.data
153                                      kprobe.function
154       kernel.trace                   process.statement.absolute
155                                      process.begin, .end
156                                      netfilter
157                                      java
158
159
       Probe types marked with an asterisk (*) are fallbacks, where systemtap
       can sometimes infer subset or substitute information.  In general, the
       more symbolic / debugging information is available, the higher the
       quality of probing will be.
164
165
166

ON-THE-FLY ARMING

168       The following types of probe points may be armed/disarmed on-the-fly to
169       save  overheads during uninteresting times.  Arming conditions may also
170       be added to other types of probes, but will be treated  as  a  wrapping
171       conditional and won't benefit from overhead savings.
172
173
174       DISARMABLE                                exceptions
175       kernel.function, kernel.statement
176       module.function, module.statement
177       process.*.function, process.*.statement
178       process.*.plt, process.*.mark
179       timer.                                    timer.profile
180       java
181
182

PROBE POINT FAMILIES

   BEGIN/END/ERROR
185       The  probe  points begin and end are defined by the translator to refer
186       to the time of session startup and shutdown.  All  "begin"  probe  han‐
187       dlers  are  run,  in  some sequence, during the startup of the session.
188       All global variables will have been initialized prior  to  this  point.
189       All  "end" probes are run, in some sequence, during the normal shutdown
190       of a session, such as in the aftermath of an exit () function call,  or
191       an interruption from the user.  In the case of an error-triggered shut‐
192       down, "end" probes are not run.  There are no target  variables  avail‐
193       able in either context.
194
195       If the order of execution among "begin" or "end" probes is significant,
196       then an optional sequence number may be provided:
197
198
199              begin(N)
200              end(N)
201
202
203       The number N may be positive or negative.  The probe handlers  are  run
204       in  increasing  order, and the order between handlers with the same se‐
205       quence number is unspecified.  When "begin" or "end" are given  without
206       a sequence, they are effectively sequence zero.
207
       The error probe point is similar to the end probe, except that each
       such probe handler is run when the session ends after errors have
       occurred.  In such cases, "end" probes are skipped, but each "error"
       probe is still attempted.  This kind of probe can be used to clean up
       or emit a "final gasp".  It may also be numerically parametrized to set
       a sequence.
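
       The ordering rules can be sketched as follows; the variable name
       start is an assumption made for this example:

              global start

              # Sequence -1 runs before the unnumbered (sequence 0) probes.
              probe begin(-1) { start = gettimeofday_s() }
              probe begin     { printf("session started\n") }

              # Run on normal shutdown only.
              probe end   { printf("ran for %d s\n", gettimeofday_s() - start) }

              # Run instead of "end" when the session ends after an error.
              probe error { printf("session ended after an error\n") }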
214
215
   NEVER
217       The probe point never is specially defined by the  translator  to  mean
218       "never".  Its probe handler is never run, though its statements are an‐
219       alyzed for symbol / type correctness as usual.  This probe point may be
220       useful in conjunction with optional probes.
221
222
   SYSCALL and ND_SYSCALL
224       The  syscall.* and nd_syscall.*  aliases define several hundred probes,
225       too many to detail here.  They are of the general form:
226
227
228              syscall.NAME
229              nd_syscall.NAME
230              syscall.NAME.return
231              nd_syscall.NAME.return
232
233
       Generally, a pair of probes is defined for each normal system call as
       listed in the syscalls(2) manual page, one for entry and one for
       return.  Those system calls that never return do not have a
       corresponding .return probe.  The nd_* family of probes is about the
       same, except that it uses non-DWARF based searching mechanisms, which
       may result in a lower quality of symbolic context data (parameters),
       and may miss some system calls.  You may want to try them first, in
       case kernel debugging information is not immediately available.
242
       Each probe alias provides a variety of variables.  Looking at the
       tapset source code is the most reliable way to find out which ones.
       Generally, each variable listed in the standard manual page is made
       available as a script-level variable, so syscall.open exposes filename,
       flags, and mode.  In addition, a standard suite of variables is
       available at most aliases:
248
249       argstr A  pretty-printed  form  of  the  entire  argument list, without
250              parentheses.
251
252       name   The name of the system call.
253
254       retval For return probes, the raw numeric system-call result.
255
256       retstr For return probes, a pretty-printed string form of  the  system-
257              call result.
258
259       As  usual  for  probe aliases, these variables are all initialized once
260       from the underlying $context variables, so that later changes to  $con‐
261       text  variables are not automatically reflected.  Not all probe aliases
262       obey all of these general guidelines.   Please  report  any  bothersome
263       ones you encounter as a bug.  Note that on some kernel/userspace archi‐
264       tecture combinations (e.g., 32-bit userspace on 64-bit kernel), the un‐
265       derlying $context variables may need explicit sign extension / masking.
266       When this is an issue, consider using the tapset-provided variables in‐
267       stead of raw $context variables.
268
269       If debuginfo availability is a problem, you may try using the non-DWARF
270       syscall probe aliases instead.  Use the nd_syscall.  prefix instead  of
271       syscall.  The same context variables are available, as far as possible.
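
       For illustration, a sketch that prints the standard variables for one
       system call (read is chosen arbitrarily):

              probe syscall.read
              {
                  printf("%s[%d] %s(%s)\n", execname(), pid(), name, argstr)
              }

              probe syscall.read.return
              {
                  printf("%s[%d] %s = %s\n", execname(), pid(), name, retstr)
              }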
272
273
   TIMERS
275       There  are  two  main types of timer probes: "jiffies" timer probes and
276       time interval timer probes.
277
278       Intervals defined by the standard kernel "jiffies" timer may be used to
279       trigger  probe  handlers  asynchronously.  Two probe point variants are
280       supported by the translator:
281
282
283              timer.jiffies(N)
284              timer.jiffies(N).randomize(M)
285
286
287       The probe handler is run every N  jiffies  (a  kernel-defined  unit  of
288       time,  typically between 1 and 60 ms).  If the "randomize" component is
289       given, a linearly distributed random value in  the  range  [-M..+M]  is
290       added to N every time the handler is run.  N is restricted to a reason‐
291       able range (1 to around a million), and M is restricted to  be  smaller
292       than  N.  There are no target variables provided in either context.  It
293       is possible for such probes to be run concurrently on a multi-processor
294       computer.
295
296       Alternatively,  intervals may be specified in units of time.  There are
297       two probe point variants similar to the jiffies timer:
298
299
300              timer.ms(N)
301              timer.ms(N).randomize(M)
302
303
304       Here, N and M are specified in milliseconds, but the full  options  for
305       units   are   seconds  (s/sec),  milliseconds  (ms/msec),  microseconds
306       (us/usec), nanoseconds (ns/nsec), and hertz (hz).  Randomization is not
307       supported for hertz timers.
308
309       The  actual resolution of the timers depends on the target kernel.  For
310       kernels prior to 2.6.17, timers are limited to jiffies  resolution,  so
311       intervals  are  rounded  up  to  the  nearest  jiffies interval.  After
312       2.6.17, the implementation uses hrtimers for tighter precision,  though
313       the  actual  resolution will be arch-dependent.  In either case, if the
314       "randomize" component is given, then the random value will be added  to
315       the interval before any rounding occurs.
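
       A few timer declarations, as a sketch of the variants described above
       (the intervals are arbitrary):

              # Every 100 jiffies.
              probe timer.jiffies(100) { printf("jiffies tick\n") }

              # Every 500 ms, plus a random offset in [-100..+100] ms.
              probe timer.ms(500).randomize(100) { printf("ms tick\n") }

              # Ten times per second; randomize is not allowed here.
              probe timer.hz(10) { printf("hz tick\n") }

              # End the session after 30 seconds.
              probe timer.s(30) { exit() }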
316
317       Profiling  timers  are also available to provide probes that execute on
318       all CPUs at the rate of the system tick (CONFIG_HZ) or at a given  fre‐
319       quency  (hz).  On  some  kernels, this is a one-concurrent-user-only or
320       disabled facility, resulting in error -16 (EBUSY) during  probe  regis‐
321       tration.
322
323
324              timer.profile.tick
325              timer.profile.freq.hz(N)
326
327
328       Full  context information of the interrupted process is available, mak‐
329       ing this probe suitable for a time-based sampling profiler.
330
331       It is recommended to use the tapset  probe  timer.profile  rather  than
332       timer.profile.tick.  This probe point behaves identically to timer.pro‐
333       file.tick when the underlying functionality  is  available,  and  falls
334       back  to  using perf.sw.cpu_clock on some recent kernels which lack the
335       corresponding profile timer facility.
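
       A minimal sampling-profiler sketch built on timer.profile; the array
       name samples and the ten-second run time are assumptions of this
       example:

              global samples

              # Count one sample for whichever process was interrupted.
              probe timer.profile { samples[execname()]++ }

              # After ten seconds, print the ten busiest names and stop.
              probe timer.s(10)
              {
                  foreach (name in samples- limit 10)
                      printf("%-16s %d\n", name, samples[name])
                  exit()
              }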
336
337       Profiling timers with specified frequencies are  only  accurate  up  to
338       around  100  hz.  You may need to provide a larger value to achieve the
339       desired rate.
340
341       Note that if a timer probe is set to fire at a very high  rate  and  if
342       the  probe  body  is  complex, succeeding timer probes can get skipped,
343       since the time for them to run has already passed.  Normally  systemtap
344       reports missed probes, but it will not report these skipped probes.
345
346
   DWARF
348       This family of probe points uses symbolic debugging information for the
349       target kernel/module/program, as may be found  in  unstripped  executa‐
350       bles,  or  the  separate  debuginfo  packages.  They allow placement of
351       probes logically into the execution path  of  the  target  program,  by
352       specifying a set of points in the source or object code.  When a match‐
353       ing statement executes on any processor, the probe handler  is  run  in
354       that context.
355
356       Probe points in the DWARF family can be identified by the target kernel
357       module (or user process), source file, line number, function  name,  or
358       some combination of these.
359
360       Here is a list of DWARF probe points currently supported:
361
362              kernel.function(PATTERN)
363              kernel.function(PATTERN).call
364              kernel.function(PATTERN).callee(PATTERN)
365              kernel.function(PATTERN).callee(PATTERN).return
366              kernel.function(PATTERN).callee(PATTERN).call
367              kernel.function(PATTERN).callees(DEPTH)
368              kernel.function(PATTERN).return
369              kernel.function(PATTERN).inline
370              kernel.function(PATTERN).label(LPATTERN)
371              module(MPATTERN).function(PATTERN)
372              module(MPATTERN).function(PATTERN).call
373              module(MPATTERN).function(PATTERN).callee(PATTERN)
374              module(MPATTERN).function(PATTERN).callee(PATTERN).return
375              module(MPATTERN).function(PATTERN).callee(PATTERN).call
376              module(MPATTERN).function(PATTERN).callees(DEPTH)
377              module(MPATTERN).function(PATTERN).return
378              module(MPATTERN).function(PATTERN).inline
379              module(MPATTERN).function(PATTERN).label(LPATTERN)
380              kernel.statement(PATTERN)
381              kernel.statement(PATTERN).nearest
382              kernel.statement(ADDRESS).absolute
383              module(MPATTERN).statement(PATTERN)
384              process("PATH").function("NAME")
385              process("PATH").statement("*@FILE.c:123")
386              process("PATH").library("PATH").function("NAME")
387              process("PATH").library("PATH").statement("*@FILE.c:123")
388              process("PATH").library("PATH").statement("*@FILE.c:123").nearest
389              process("PATH").function("*").return
390              process("PATH").function("myfun").label("foo")
391              process("PATH").function("foo").callee("bar")
392              process("PATH").function("foo").callee("bar").return
393              process("PATH").function("foo").callee("bar").call
394              process("PATH").function("foo").callees(DEPTH)
395              process(PID).function("NAME")
396              process(PID).function("myfun").label("foo")
397              process(PID).plt("NAME")
398              process(PID).plt("NAME").return
399              process(PID).statement("*@FILE.c:123")
400              process(PID).statement("*@FILE.c:123").nearest
401              process(PID).statement(ADDRESS).absolute
402
403       (See  the  USER-SPACE section below for more information on the process
404       probes.)
405
406       The list above includes multiple variants and modifiers  which  provide
407       additional functionality or filters. They are:
408
409              .function
410                     Places  a probe near the beginning of the named function,
411                     so that parameters are available as context variables.
412
413              .return
414                     Places a probe at the moment after the  return  from  the
415                     named  function,  so the return value is available as the
416                     "$return" context variable.
417
418              .inline
419                     Filters the results to include only instances of  inlined
420                     functions.  Note  that  inlined  functions do not have an
421                     identifiable return point, so .return is not supported on
422                     .inline probes.
423
              .call  Filters the results to include only non-inlined functions
                     (the opposite set of .inline).
426
427              .exported
428                     Filters the results to include only exported functions.
429
430              .statement
431                     Places a probe at the exact spot,  exposing  those  local
432                     variables that are visible there.
433
434              .statement.nearest
435                     Places  a  probe at the nearest available line number for
436                     each line number given in the statement.
437
438              .callee
439                     Places a probe  on  the  callee  function  given  in  the
440                     .callee  modifier,  where  the  callee must be a function
441                     called by the target function given in .function. The ad‐
442                     vantage  of  doing  this over directly probing the callee
443                     function is that this probe point is run  only  when  the
444                     callee  is  called  from  the  target  function  (add the
445                     -DSTAP_CALLEE_MATCHALL directive to  override  this  when
446                     calling stap(1)).
447
448                     Note  that only callees that can be statically determined
449                     are  available.   For  example,  calls  through  function
450                     pointers are not available.  Additionally, calls to func‐
451                     tions located in other objects (e.g.  libraries) are  not
452                     available (instead use another probe point). This feature
453                     will only work for code compiled with GCC 4.7+.
454
455              .callees
456                     Shortcut for .callee("*"), which places a  probe  on  all
457                     callees of the function.
458
459              .callees(DEPTH)
460                     Recursively   places  probes  on  callees.  For  example,
461                     .callees(2) will probe both callees of the  target  func‐
462                     tion,   as   well   as  callees  of  those  callees.  And
463                     .callees(3) goes one level deeper, etc...  A callee probe
464                     at  depth  N  is only triggered when the N callers in the
465                     callstack match those  that  were  statically  determined
466                     during  analysis  (this  also  may  be  overridden  using
467                     -DSTAP_CALLEE_MATCHALL).
468
       In the above list of probe points, MPATTERN stands for a string literal
       that aims to identify the loaded kernel module of interest.  For
       in-tree kernel modules, the name suffices (e.g. "btrfs").  The name may
       also include the "*", "[]", and "?" wildcards to match multiple in-tree
       modules.  Out-of-tree modules are also supported by specifying the full
       path to the ko file; wildcards are not supported in that case.  The
       file must follow the convention of being named <module_name>.ko
       (characters ',' and '-' are replaced by '_').
477
478       LPATTERN  stands  for  a source program label. It may also contain "*",
479       "[]", and "?" wildcards. PATTERN stands for a string literal that  aims
480       to identify a point in the program.  It is made up of three parts:
481
482       ·   The first part is the name of a function, as would appear in the nm
483           program's output.  This part may use the "*"  and  "?"  wildcarding
484           operators to match multiple names.
485
486       ·   The  second part is optional and begins with the "@" character.  It
487           is followed by the path to the source file containing the function,
488           which may include a wildcard pattern, such as mm/slab*.  If it does
489           not match as is, an implicit "*/" is optionally  added  before  the
490           pattern, so that a script need only name the last few components of
491           a possibly long source directory path.
492
493       ·   Finally, the third part is optional if the file name part was  giv‐
494           en, and identifies the line number in the source file preceded by a
495           ":" or a "+".  The line number is assumed to be  an  absolute  line
496           number if preceded by a ":", or relative to the declaration line of
497           the function if preceded by a "+".  All the lines in  the  function
498           can  be  matched  with  ":*".   A range of lines x through y can be
499           matched with ":x-y". Ranges and specific lines can be  mixed  using
500           commas, e.g. ":x,y-z".
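
       The three parts of PATTERN can be combined as in the following sketch.
       The function and file names are assumptions about the target kernel
       source and may need adjusting before the probes resolve:

              # Function name only, with a wildcard.
              probe kernel.function("vfs_*")
              { printf("%s\n", pp()) }

              # Function name plus source file.
              probe kernel.function("vfs_read@fs/read_write.c")
              { printf("%s\n", pp()) }

              # Every line of one function.
              probe kernel.statement("vfs_read@fs/read_write.c:*")
              { printf("%s\n", pp()) }

              # A line given relative to the function's declaration line.
              probe kernel.statement("vfs_read@fs/read_write.c:+1")
              { printf("%s\n", pp()) }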
501
502       As an alternative, PATTERN may be a numeric constant, indicating an ad‐
503       dress.  Such an address may be found from symbol tables of  the  appro‐
504       priate  kernel  /  module  object  file.   It is verified against known
505       statement code boundaries, and will be relocated for use at run time.
506
507       In guru mode only, absolute kernel-space  addresses  may  be  specified
508       with the ".absolute" suffix.  Such an address is considered already re‐
509       located, as if it came from /proc/kallsyms, so  it  cannot  be  checked
510       against statement/instruction boundaries.
511
   CONTEXT VARIABLES
513       Many  of  the  source-level context variables, such as function parame‐
514       ters, locals, globals visible in the compilation unit, may  be  visible
515       to  probe  handlers.   They  may  refer to these variables by prefixing
516       their name with "$" within the scripts.  In addition, a special  syntax
517       allows  limited  traversal  of  structures, pointers, and arrays.  More
518       syntax allows pretty-printing of individual variables or their  groups.
       See also @cast.  Note that variables may be inaccessible because they
       are paged out, or for a few other reasons.  See also
       error::fault(7stap).
522
523
524       $var   refers  to  an in-scope variable "var".  If it's an integer-like
525              type, it will be cast to a 64-bit int for systemtap script  use.
526              String-like  pointers (char *) may be copied to systemtap string
527              values using the kernel_string or user_string functions.
528
529       @var("varname")
530              an alternative syntax for $varname
531
       @var("varname@src/file.c")
              refers to the global (either file local or external) variable
              varname defined when the file src/file.c was compiled.  The CU
              in which the variable is resolved is the first CU in the module
              of the probe point which matches the given file name at the end
              and has the shortest file name path (e.g. given
              @var("foo@bar/baz.c") and CUs with file name paths
              src/sub/module/bar/baz.c and src/bar/baz.c, the second CU will
              be chosen to resolve the (file) global variable foo).
541
       $var->field
              traversal via a structure's or a pointer's field.  This
              generalized indirection operator may be repeated to follow more
              levels.  Note that the . operator is not used for plain
              structure members, only -> for both purposes.  (This is because
              "." is reserved for string concatenation.)  Also note that for
              direct dereferencing of a $var pointer,
              {kernel,user}_{char,int,...}($var) should be used.  (Refer to
              stapfuncs(5) for more details.)
549
550       $return
551              is available in return probes only for functions  that  are  de‐
552              clared  with  a return value, which can be determined using @de‐
553              fined($return).
554
       $var[N]
              indexes into an array.  The index may be given as a literal
              number or even an arbitrary numeric expression.
558
559       A  number  of  operators  exist for such basic context variable expres‐
560       sions:
561
562       $$vars expands to a character string that is equivalent to
563
564              sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
565                      parm1, ..., parmN, var1, ..., varN)
566
567              for each variable in scope at the probe point.  Some values  may
568              be printed as =?  if their run-time location cannot be found.
569
570       $$locals
571              expands to a subset of $$vars for only local variables.
572
573       $$parms
574              expands to a subset of $$vars for only function parameters.
575
576       $$return
577              is available in return probes only.  It expands to a string that
578              is equivalent to sprintf("return=%x",  $return)  if  the  probed
579              function has a return value, or else an empty string.
580
581       & $EXPR
582              expands to the address of the given context variable expression,
583              if it is addressable.
584
       @defined($EXPR)
              expands to 1 if the given context variable expression is
              resolvable and to 0 otherwise, for use in conditionals such as
588
589              @defined($foo->bar) ? $foo->bar : 0
590
591
592       $EXPR$ expands to a string with all of $EXPR's members, equivalent to
593
594              sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
595                       $EXPR->a, $EXPR->b)
596
597
       $EXPR$$
              expands to a string with all of $EXPR's members and submembers,
              equivalent to
601
602              sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
603                      $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
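
       As a sketch of several of these forms used together, assuming the
       target kernel's vfs_read takes parameters named file and count (this
       varies between kernel versions):

              probe kernel.function("vfs_read")
              {
                  # All parameters, pretty-printed in one string.
                  printf("%s\n", $$parms)

                  # A single integer parameter.
                  printf("count=%d\n", $count)

                  # Follow fields with ->, guarded by @defined, and copy
                  # the resulting char * into a script string.
                  if (@defined($file->f_path->dentry->d_name->name))
                      printf("name=%s\n",
                             kernel_string($file->f_path->dentry->d_name->name))

                  # Members of the structure that $file points to.
                  printf("%s\n", $file$)
              }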
604
605
606
   MORE ON RETURN PROBES
608       For the kernel ".return" probes, only a certain fixed number of returns
609       may  be  outstanding.  The default is a relatively small number, on the
610       order of a few times the number of physical CPUs.   If  many  different
611       threads  concurrently call the same blocking function, such as futex(2)
612       or read(2), this limit could  be  exceeded,  and  skipped  "kretprobes"
613       would be reported by "stap -t".  To work around this, specify a
614
615              probe FOO.return.maxactive(NNN)
616
617       suffix,  with  a  large  enough  NNN to cover all expected concurrently
618       blocked threads.  Alternatively, use the
619
620              stap -DKRETACTIVE=NNNN
621
622       stap command line macro setting to override the default for  all  ".re‐
623       turn" probes.
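
       A minimal sketch of the per-probe suffix, reusing the syscall.read
       alias from the SYNTAX section.  The value 256 and the 10-second run
       time are arbitrary assumptions chosen for illustration.

```systemtap
global returns

# Allow up to 256 outstanding read(2) returns for this probe
# (256 is an assumed value, not a recommendation).
probe syscall.read.return.maxactive(256) {
    returns++
}

# Report the total and stop after 10 seconds.
probe timer.s(10) {
    printf("read(2) returns seen: %d\n", returns)
    exit()
}
```

       Running the script with stap -t should then show whether any
       kretprobes were still skipped.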
624
625
626       For ".return" probes, context variables other than the "$return" may be
627       accessible, as a convenience for a script programmer wishing to  access
628       function  parameters.   These values are snapshots taken at the time of
629       function entry.  (Local variables within the function are not generally
630       accessible,  since  those variables did not exist in allocated/initial‐
631       ized form at the  snapshot  moment.)   These  entry-snapshot  variables
632       should be accessed via @entry($var).
633
634       In  addition,  arbitrary  entry-time  expressions can also be saved for
635       ".return" probes using the @entry(expr) operator.  For example, one can
636       compute the elapsed time of a function:
637
638              probe kernel.function("do_filp_open").return {
639                  println( get_timeofday_us() - @entry(get_timeofday_us()) )
640              }
641
642
643
644       The following table summarizes how values related to a function parame‐
645       ter context variable, a pointer named addr, may be accessed from a .re‐
646       turn probe.
647
648       at-entry value   past-exit value
649
650       $addr            not available
651       $addr->x->y      @cast(@entry($addr),"struct zz")->x->y
652       $addr[0]         {kernel,user}_{char,int,...}(& $addr[0])
653
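       The entry-snapshot access described above can be sketched as follows.
       The example assumes kernel debuginfo and a vfs_read() function with a
       parameter named count; neither is guaranteed on every kernel.

```systemtap
# Sketch only: vfs_read and its "count" parameter are assumptions.
probe kernel.function("vfs_read").return {
    # @entry($count) is the value snapshotted at function entry;
    # $return is the value at function exit.
    printf("%s requested %d bytes, vfs_read returned %d\n",
           execname(), @entry($count), $return)
}
```
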
654
655
656   DWARFLESS
657       In the absence of debugging information, the entry and exit points of
658       kernel and module functions can be probed using the "kprobe" family of
659       probes.  However, these do not permit looking up the arguments or local
660       variables of the function.  The following constructs are supported:
661
662              kprobe.function(FUNCTION)
663              kprobe.function(FUNCTION).call
664              kprobe.function(FUNCTION).return
665              kprobe.module(NAME).function(FUNCTION)
666              kprobe.module(NAME).function(FUNCTION).call
667              kprobe.module(NAME).function(FUNCTION).return
668              kprobe.statement(ADDRESS).absolute
669
670
671       Probes of type function are recommended for kernel  functions,  whereas
672       probes  of  type  module  are  recommended for probing functions of the
673       specified module.  If the absolute address of a kernel or module
674       function is known, a statement probe can be used instead.
675
676       Note that FUNCTION and module NAME values must not contain wildcards, or
677       the probe will not be registered.  Also, statement probes can be used
678       only in guru mode.
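
       Because kprobe probes expose no $variables, a handler is limited to
       context that does not depend on debuginfo.  The sketch below simply
       counts calls; the function name vfs_read is an assumption used for
       illustration.

```systemtap
global calls

# No debuginfo is needed, but no $parameters are available either.
probe kprobe.function("vfs_read") {
    calls++
}

# Report the count and stop after 5 seconds.
probe timer.s(5) {
    printf("vfs_read calls in 5s: %d\n", calls)
    exit()
}
```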
679
680
681
682   USER-SPACE
683       Support for user-space probing is available for kernels that are
684       configured with the utrace extensions, or that have the uprobes facility
685       introduced in Linux 3.5.  (Various kernel build configuration options
686       need to be enabled; systemtap will advise if these are missing.)
687
688
689       There are several forms.  First, a non-symbolic probe point:
690
691              process(PID).statement(ADDRESS).absolute
692
693       is analogous to kernel.statement(ADDRESS).absolute in that both use raw
694       (unverified)  virtual  addresses and provide no $variables.  The target
695       PID parameter must identify a running process, and ADDRESS should iden‐
696       tify  a valid instruction address.  All threads of that process will be
697       probed.
698
699       Second, non-symbolic user-kernel interface events handled by utrace may
700       be probed:
701
702              process(PID).begin
703              process("FULLPATH").begin
704              process.begin
705              process(PID).thread.begin
706              process("FULLPATH").thread.begin
707              process.thread.begin
708              process(PID).end
709              process("FULLPATH").end
710              process.end
711              process(PID).thread.end
712              process("FULLPATH").thread.end
713              process.thread.end
714              process(PID).syscall
715              process("FULLPATH").syscall
716              process.syscall
717              process(PID).syscall.return
718              process("FULLPATH").syscall.return
719              process.syscall.return
720              process(PID).insn
721              process("FULLPATH").insn
722              process(PID).insn.block
723              process("FULLPATH").insn.block
724
725
726
727       A process.begin probe gets called when a new process described by PID or
728       FULLPATH gets created.  In addition, it is called once from the context
729       of each preexisting process, at systemtap script startup.  This is use‐
730       ful to track live processes.  A process.thread.begin probe gets  called
731       when  a  new  thread  described  by  PID  or  FULLPATH gets created.  A
732       process.end probe gets called when a process described by PID or FULLPATH
733       dies.   A  process.thread.end probe gets called when a thread described
734       by PID or FULLPATH dies.  A process.syscall probe gets  called  when  a
735       thread  described  by  PID or FULLPATH makes a system call.  The system
736       call number is available in the  $syscall  context  variable,  and  the
737       first 6 arguments of the system call are available in the $argN (e.g.
738       $arg1, $arg2, ...) context variables.  A process.syscall.return  probe
739       gets  called  when a thread described by PID or FULLPATH returns from a
740       system call.  The system call number is available in the $syscall  con‐
741       text  variable, and the return value of the system call is available in
742       the $return context variable.  A process.insn probe gets called for ev‐
743       ery single-stepped instruction of the process described by PID or FULL‐
744       PATH.  A process.insn.block probe gets called for  every  block-stepped
745       instruction of the process described by PID or FULLPATH.
746
747
748       If  a  process  probe  is specified without a PID or FULLPATH, all user
749       threads will be probed.  However, if systemtap was invoked with the  -c
750       or -x options, then process probes are restricted to the process
751       hierarchy associated with the target process.  If a process probe is
752       given without a PID or FULLPATH but stap is run with the -c option, the
753       path of the -c command is heuristically filled in as the process PATH.
754       In that case, the -c command may contain only command parameters
755       (i.e. no command substitution and no occurrences of any of these
756       characters: '|&;<>(){}').
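
       The utrace-style events above can be combined in one script.  The
       following is a sketch, not a definitive tracer; the file name
       trace.stp used in the usage note is hypothetical.

```systemtap
# Process lifetime events.
probe process.begin { printf("begin %s(%d)\n", execname(), pid()) }
probe process.end   { printf("end   %s(%d)\n", execname(), pid()) }

# System call entry: number and first argument.
probe process.syscall {
    printf("%s(%d) syscall %d arg1=0x%x\n",
           execname(), pid(), $syscall, $arg1)
}

# System call exit: number and return value.
probe process.syscall.return {
    printf("%s(%d) syscall %d returned %d\n",
           execname(), pid(), $syscall, $return)
}
```

       Invoked as stap trace.stp -c "ls /", the probes would be restricted to
       the process hierarchy of the ls command rather than all user threads.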
757
758
759       Third,  symbolic  static  instrumentation  compiled  into  programs and
760       shared libraries may be probed:
761
762              process("PATH").mark("LABEL")
763              process("PATH").provider("PROVIDER").mark("LABEL")
764              process(PID).mark("LABEL")
765              process(PID).provider("PROVIDER").mark("LABEL")
766
767
768       A .mark probe gets called via a static probe which is  defined  in  the
769       application by STAP_PROBE1(PROVIDER,LABEL,arg1), one of the macros
770       defined in sys/sdt.h.  PROVIDER is an arbitrary application identifier,
771       LABEL is the marker site identifier, and arg1 is an integer-typed
772       argument.  STAP_PROBE1 is used for probes with 1 argument, STAP_PROBE2
773       is  used  for probes with 2 arguments, and so on.  The arguments of the
774       probe are available in the context variables $arg1, $arg2, ...  An  al‐
775       ternative to using the STAP_PROBE macros is to use the dtrace script to
776       create  custom  macros.   Additionally,  the   variables   $$name   and
777       $$provider  are  available  as  parts  of  the  probe  point name.  The
778       sys/sdt.h macro  names  DTRACE_PROBE*  are  available  as  aliases  for
779       STAP_PROBE*.
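
       On the script side, a marker compiled in with STAP_PROBE1 could be
       consumed roughly as follows.  The executable path /usr/bin/myapp and
       the marker name request_start are hypothetical.

```systemtap
# Hypothetical application and marker; substitute real names.
probe process("/usr/bin/myapp").mark("request_start") {
    # $$provider and $$name come from the probe point name;
    # $arg1 is the single argument passed to STAP_PROBE1.
    printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
}
```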
780
781
782       Finally,  full  symbolic source-level probes in user-space programs and
783       shared libraries are supported.  These are  exactly  analogous  to  the
784       symbolic DWARF-based kernel/module probes described above.  They expose
785       the same sorts of context $variables  for  function  parameters,  local
786       variables, and so on.
787
788              process("PATH").function("NAME")
789              process("PATH").statement("*@FILE.c:123")
790              process("PATH").plt("NAME")
791              process("PATH").library("PATH").plt("NAME")
792              process("PATH").library("PATH").function("NAME")
793              process("PATH").library("PATH").statement("*@FILE.c:123")
794              process("PATH").function("*").return
795              process("PATH").function("myfun").label("foo")
796              process("PATH").function("foo").callee("bar")
797              process("PATH").plt("NAME").return
798              process(PID).function("NAME")
799              process(PID).statement("*@FILE.c:123")
800              process(PID).plt("NAME")
801
802
803
804       Note  that for all process probes, PATH names refer to executables that
805       are searched for the same way shells do: relative to the working directory
806       if they contain a "/" character, otherwise in $PATH.  If PATH names re‐
807       fer to scripts, the actual interpreters (specified in the script in the
808       first line after the #! characters) are probed.
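
       A short sketch of a symbolic user-space probe pair.  It assumes that
       an ls executable is found in $PATH and that its debuginfo is
       installed, so that the parameters of main are visible.

```systemtap
# "ls" contains no "/" character, so it is looked up in $PATH.
probe process("ls").function("main") {
    printf("main(%s)\n", $$parms)
}

probe process("ls").function("main").return {
    # $$return expands to "return=..." for functions with a return value.
    printf("main %s\n", $$return)
}
```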
809
810
811       Tapset   process   probes   placed   in  the  special  directory  $pre‐
812       fix/share/systemtap/tapset/PATH/ with relative paths  will  have  their
813       process  parameter  prefixed with the location of the tapset. For exam‐
814       ple,
815
816
817              process("foo").function("NAME")
818
819
820       expands to
821
822              process("/usr/bin/foo").function("NAME")
823
824
825
826       when placed in $prefix/share/systemtap/tapset/PATH/usr/bin/
827
828
829       If PATH is a process component parameter referring to shared  libraries
830       then  all  processes that map it at runtime would be selected for prob‐
831       ing.  If PATH is a library component parameter referring to shared  li‐
832       braries  then  the  process specified by the process component would be
833       selected.  Note that the PATH pattern in a library component  will  al‐
834       ways  apply  to  libraries  statically  determined  to be in use by the
835       process. However, you may also specify the full  path  to  any  library
836       file even if not statically needed by the process.
837
838
839       A  .plt  probe will probe functions in the program linkage table corre‐
840       sponding to the rest of the probe point.  .plt can be  specified  as  a
841       shorthand for .plt("*").  The symbol name is available as a $$name con‐
842       text variable; function arguments are not  available,  since  PLTs  are
843       processed without debuginfo.  A .plt.return probe places a probe at the
844       moment after the return from the named function.
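
       Since .plt probes need no debuginfo, they lend themselves to simple
       call counting.  The sketch below assumes an ls executable in $PATH;
       the limit of 10 entries is arbitrary.

```systemtap
global calls

# .plt is shorthand for .plt("*"); $$name holds the symbol name.
probe process("ls").plt {
    calls[$$name]++
}

# Print the ten most frequently called PLT entries at exit.
probe end {
    foreach (name in calls- limit 10)
        printf("%-24s %d\n", name, calls[name])
}
```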
845
846
847       If the PATH string contains wildcards as in  the  MPATTERN  case,  then
848       standard  globbing  is  performed  to find all matching paths.  In this
849       case, the $PATH environment variable is not used.
850
851
852       If systemtap was invoked with the -c or -x options, then process probes
853       are  restricted  to  the  process  hierarchy associated with the target
854       process.
855
856
857   JAVA
858       Support for probing Java methods is available using Byteman as a  back‐
859       end.  Byteman  is  an instrumentation tool from the JBoss project which
860       systemtap can use to monitor invocations for a specific method or  line
861       in a Java program.
862
863       Systemtap  does so by generating a Byteman script listing the probes to
864       instrument and then invoking the Byteman bminstall utility.
865
866       This Java instrumentation support is currently a prototype feature with
867       major  limitations.   Moreover,  Java  probing  currently does not work
868       across users; the stap script must run (with  appropriate  permissions)
869       under the same user as the Java process being probed.  (Thus a stap
870       script under root currently cannot probe Java methods in a non-root-us‐
871       er Java process.)
872
873
874       The  first  probe type refers to Java processes by the name of the Java
875       process:
876
877              java("PNAME").class("CLASSNAME").method("PATTERN")
878              java("PNAME").class("CLASSNAME").method("PATTERN").return
879
880       The PNAME argument must name an already running JVM process that is
881       identifiable via a jps listing.
882
883       The  PATTERN  parameter  specifies  the signature of the Java method to
884       probe. The signature must consist of the exact name of the method, fol‐
885       lowed  by  a bracketed list of the types of the arguments, for instance
886       "myMethod(int,double,Foo)". Wildcards are not supported.
887
888       The probe can be set to trigger at a specific line within the method by
889       appending a colon and a line number, just as in other types of probes:
890       "myMethod(int,double,Foo):245".
891
892       The CLASSNAME parameter identifies the Java class  the  method  belongs
893       to,  either  with or without the package qualification. By default, the
894       probe only triggers on descendants of the class that  do  not  override
895       the  method  definition  of  the original class. However, CLASSNAME can
896       take an optional caret prefix, as in ^org.my.MyClass,  which  specifies
897       that  the  probe should also trigger on all descendants of MyClass that
898       override the original method. For instance, every method with signature
899       foo(int) in program org.my.MyApp can be probed at once using
900
901              java("org.my.MyApp").class("^java.lang.Object").method("foo(int)")
902
903
904       The  second  probe type works analogously, but refers to Java processes
905       by PID:
906
907              java(PID).class("CLASSNAME").method("PATTERN")
908              java(PID).class("CLASSNAME").method("PATTERN").return
909
910       (PIDs for an already running process can be obtained using  the  jps(1)
911       utility.)
912
913       Context  variables  defined  within  java  probes include $arg1 through
914       $arg10 (for up to the first 10 arguments of a method),  represented  as
915       character-pointers  for  the  toString()  form of each actual argument.
916       The arg1 through arg10 script variables provide access to these as  or‐
917       dinary strings, fetched via user_string_warn().
918
919       Prior  to systemtap version 3.1, $arg1 through $arg10 could contain ei‐
920       ther integers or character pointers, depending on the types of the  ob‐
921       jects  being  passed to each particular java method.  This previous be‐
922       haviour may be invoked with the stap --compatible=3.0 flag.
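
       A sketch of a Java entry and return probe pair, reusing the
       illustrative names org.my.MyApp and foo(int) from above.  The class
       name org.my.MyClass is likewise a placeholder.

```systemtap
# Placeholder program, class, and method names.
probe java("org.my.MyApp").class("org.my.MyClass").method("foo(int)") {
    # arg1 is the toString() form of the first argument, as a string.
    printf("foo called with %s\n", arg1)
}

probe java("org.my.MyApp").class("org.my.MyClass").method("foo(int)").return {
    println("foo returned")
}
```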
923
924
925   PROCFS
926       These probe points allow procfs "files" in  /proc/systemtap/MODNAME  to
927       be  created,  read  and written using a permission that may be modified
928       using the proper umask value. Default permissions  are  0400  for  read
929       probes,  and  0200 for write probes. If both a read and write probe are
930       being used on the same file, a default permission of 0600 will be used.
931       Using procfs.umask(0040).read would result in a 0404 permission set for
932       the file.  (MODNAME is the name of  the  systemtap  module).  The  proc
933       filesystem is a pseudo-filesystem which is used as an interface to ker‐
934       nel data structures. There are several probe point  variants  supported
935       by the translator:
936
937
938              procfs("PATH").read
939              procfs("PATH").umask(UMASK).read
940              procfs("PATH").read.maxsize(MAXSIZE)
941              procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
942              procfs("PATH").write
943              procfs("PATH").umask(UMASK).write
944              procfs.read
945              procfs.umask(UMASK).read
946              procfs.read.maxsize(MAXSIZE)
947              procfs.umask(UMASK).read.maxsize(MAXSIZE)
948              procfs.write
949              procfs.umask(UMASK).write
950
951
952       Note  that  there  are a few differences when procfs probes are used in
953       the stapbpf runtime.  FIFO special  files  are  used  instead  of  proc
954       filesystem  files.   These  files are created in /var/tmp/systemtap-US‐
955       ER/MODNAME.  (USER is the name of the user).  Additionally, users  can‐
956       not create both read and write probes on the same file.
957
958       PATH   is   the  file  name  (relative  to  /proc/systemtap/MODNAME  or
959       /var/tmp/systemtap-USER/MODNAME) to be created.  If no PATH is specified
960       (as in the last six variants above), PATH defaults to "command".
961       The file name "__stdin" is  used  internally  by  systemtap  for  input
962       probes  and should not be used as a PATH for procfs probes; see the in‐
963       put probe section below.
964
965       When a user  reads  /proc/systemtap/MODNAME/PATH  (normal  runtime)  or
966       /var/tmp/systemtap-USER/MODNAME/PATH (stapbpf runtime), the corresponding
967       procfs read probe is triggered.  The string data to be read  should  be
968       assigned to a variable named $value, like this:
969
970
971              procfs("PATH").read { $value = "100\n" }
972
973
974       When  a  user writes into /proc/systemtap/MODNAME/PATH (normal runtime)
975       or /var/tmp/systemtap-USER/MODNAME/PATH (stapbpf runtime), the corresponding
976       procfs  write probe is triggered.  The data the user wrote is available
977       in the string variable named $value, like this:
978
979
980              procfs("PATH").write { printf("user wrote: %s", $value) }
981
982
983       MAXSIZE is the size of the procfs read buffer.  Specifying MAXSIZE  al‐
984       lows larger procfs output.  If no MAXSIZE is specified, the procfs read
985       buffer defaults to STP_PROCFS_BUFSIZE (which defaults to  MAXSTRINGLEN,
986       the  maximum  length  of a string).  If setting the procfs read buffers
987       for more than one file is needed, it may be  easiest  to  override  the
988       STP_PROCFS_BUFSIZE definition.  Here's an example of using MAXSIZE:
989
990
991              procfs.read.maxsize(1024) {
992                  $value = "long string..."
993                  $value .= "another long string..."
994                  $value .= "another long string..."
995                  $value .= "another long string..."
996              }
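
       Read and write probes on the same file can be combined to expose a
       script variable.  This is a sketch for the normal (non-stapbpf)
       runtime; the file name ticks is hypothetical.

```systemtap
global ticks

# Something to observe: a counter incremented once per second.
probe timer.s(1) { ticks++ }

# Reading /proc/systemtap/MODNAME/ticks returns the current value.
probe procfs("ticks").read {
    $value = sprintf("%d\n", ticks)
}

# Writing a decimal number to the same file resets the counter.
probe procfs("ticks").write {
    ticks = strtol($value, 10)
}
```

       With both probes present, the file would get the default 0600
       permission described above.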
997
998
999
1000   INPUT
1001       These probe points make input from stdin available to the script during
1002       runtime.  The translator currently supports two variants of this  fami‐
1003       ly:
1004
1005              input.char
1006              input.line
1007
1008
1009       input.char  is  triggered each time a character is read from stdin. The
1010       current character is available  in  the  string  variable  named  char.
1011       There is no newline buffering; the next character is read from stdin as
1012       soon as it becomes available.
1013
1014       input.line causes all characters read from stdin to be buffered until a
1015       newline  is  read, at which point the probe will be triggered. The cur‐
1016       rent line of characters (including the newline) is made available in  a
1017       string  variable named line.  Note that no more than MAXSTRINGLEN char‐
1018       acters will be buffered. Any additional characters will not be included
1019       in line.
1020
1021
1022       Input probes are aliases for procfs("__stdin").write.  Systemtap
1023       reconfigures stdin if it detects this procfs probe; therefore
1024       "__stdin" should not be used as a path argument for procfs probes.
1025       Additionally, input probes will not work with the -F and  --remote  op‐
1026       tions.
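
       A small sketch of line-oriented input handling.  The "quit" command
       is an invented convention for this example only.

```systemtap
global lines

probe input.line {
    # The line variable includes the trailing newline.
    if (line == "quit\n")
        exit()
    lines++
    printf("%d: %s", lines, line)
}
```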
1027
1028
1029   NETFILTER HOOKS
1030       These  probe points allow observation of network packets using the net‐
1031       filter mechanism. A netfilter probe in systemtap corresponds to a  net‐
1032       filter hook function in the original netfilter probes API. It is proba‐
1033       bly more convenient to use tapset::netfilter(3stap),  which  wraps  the
1034       primitive netfilter hooks and does the work of extracting useful infor‐
1035       mation from the context variables.
1036
1037
1038       There are several probe point variants supported by the translator:
1039
1040
1041              netfilter.hook("HOOKNAME").pf("PROTOCOL_F")
1042              netfilter.pf("PROTOCOL_F").hook("HOOKNAME")
1043              netfilter.hook("HOOKNAME").pf("PROTOCOL_F").priority("PRIORITY")
1044              netfilter.pf("PROTOCOL_F").hook("HOOKNAME").priority("PRIORITY")
1045
1046
1047
1048       PROTOCOL_F is the protocol family to listen for, currently one  of  NF‐
1049       PROTO_IPV4, NFPROTO_IPV6, NFPROTO_ARP, or NFPROTO_BRIDGE.
1050
1051
1052       HOOKNAME is the point, or 'hook', in the protocol stack at which to in‐
1053       tercept the packet. The available hook names for each  protocol  family
1054       are  taken from the kernel header files <linux/netfilter_ipv4.h>, <lin‐
1055       ux/netfilter_ipv6.h>,   <linux/netfilter_arp.h>   and    <linux/netfil‐
1056       ter_bridge.h>.  For instance, allowable hook names for NFPROTO_IPV4 are
1057       NF_INET_PRE_ROUTING,  NF_INET_LOCAL_IN,  NF_INET_FORWARD,   NF_INET_LO‐
1058       CAL_OUT, and NF_INET_POST_ROUTING.
1059
1060
1061       PRIORITY  is  an  integer  priority giving the order in which the probe
1062       point should be triggered relative to any other  netfilter  hook  func‐
1063       tions  which trigger on the same packet. Hook functions execute on each
1064       packet in order from smallest priority number to largest priority  num‐
1065       ber. If no PRIORITY is specified (as in the first two probe point vari‐
1066       ants above), PRIORITY defaults to "0".
1067
1068       There are a number of predefined priority names of the form NF_IP_PRI_*
1069       and  NF_IP6_PRI_*  which  are  defined in the kernel header files <lin‐
1070       ux/netfilter_ipv4.h>  and  <linux/netfilter_ipv6.h>  respectively.  The
1071       script  is permitted to use these instead of specifying an integer pri‐
1072       ority. (The probe points for NFPROTO_ARP and  NFPROTO_BRIDGE  currently
1073       do  not  expose any named hook priorities to the script writer.)  Thus,
1074       allowable ways to specify the priority include:
1075
1076
1077              priority("255")
1078              priority("NF_IP_PRI_SELINUX_LAST")
1079
1080
1081       A script using guru mode is permitted to specify any identifier or num‐
1082       ber as the parameter for hook, pf, and priority. This feature should be
1083       used with caution, as the parameter is inserted  verbatim  into  the  C
1084       code generated by systemtap.
1085
1086       The netfilter probe points define the following context variables:
1087
1088       $hooknum
1089              The hook number.
1090
1091       $skb   The  address  of the sk_buff struct representing the packet. See
1092              <linux/skbuff.h> for details on how to use this struct,  or  al‐
1093              ternatively use the tapset tapset::netfilter(3stap) for easy ac‐
1094              cess to key information.
1095
1096
1097       $in    The address of the net_device struct  representing  the  network
1098              device  on  which  the packet was received (if any). May be 0 if
1099              the device is unknown or undefined at that stage in the protocol
1100              stack.
1101
1102
1103       $out   The  address  of  the net_device struct representing the network
1104              device on which the packet will be sent (if any). May  be  0  if
1105              the device is unknown or undefined at that stage in the protocol
1106              stack.
1107
1108
1109       $verdict
1110              (Guru mode only.) Assigning one of the verdict values defined in
1111              <linux/netfilter.h> to this variable alters the further progress
1112              of the packet through the protocol stack. For instance, the fol‐
1113              lowing  guru  mode  script forces all ipv6 network packets to be
1114              dropped:
1115
1116
1117              probe netfilter.pf("NFPROTO_IPV6").hook("NF_IP6_PRE_ROUTING") {
1118                $verdict = 0 /* nf_drop */
1119              }
1120
1121
1122              For convenience, unlike the  primitive  probe  points  discussed
1123              here,  the probes defined in tapset::netfilter(3stap) export the
1124              lowercase names of the verdict constants (e.g.  NF_DROP  becomes
1125              nf_drop) as local variables.
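
       A read-only sketch that prints the context variables for incoming
       IPv4 packets.  The hook and priority names are taken from the lists
       above; the output format is an arbitrary choice.

```systemtap
probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_LOCAL_IN")
          .priority("NF_IP_PRI_SELINUX_LAST") {
    # $in or $out may be 0 at this stage of the protocol stack.
    printf("hook=%d skb=%p in=%p out=%p\n", $hooknum, $skb, $in, $out)
}
```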
1126
1127
1128   KERNEL TRACEPOINTS
1129       This  family of probe points hooks up to static probing tracepoints in‐
1130       serted into the kernel or modules.  As with markers, these  tracepoints
1131       are  special  macro calls inserted by kernel developers to make probing
1132       faster and more reliable than with DWARF-based probes, and DWARF debug‐
1133       ging  information  is  not  required to probe tracepoints.  Tracepoints
1134       have an extra advantage of more strongly-typed parameters than markers.
1135
1136       Tracepoint probes look like: kernel.trace("name").  The tracepoint name
1137       string,  which  may  contain  the usual wildcard characters, is matched
1138       against the names defined by the kernel developers  in  the  tracepoint
1139       header files.  To restrict the search to specific subsystems (e.g.
1140       sched or ext3), the following syntax can be used:
1141       kernel.trace("system:name").  The tracepoint system string may also
1142       contain the usual wildcard characters.
1143
1144       The handler associated with a tracepoint-based probe may read  the  op‐
1145       tional  parameters  specified  at the macro call site.  These are named
1146       according to the declaration by the tracepoint  author.   For  example,
1147       the  tracepoint  probe  kernel.trace("sched:sched_switch") provides the
1148       parameters $prev and $next.  If the parameter is a complex type, as  in
1149       a  struct pointer, then a script can access fields with the same syntax
1150       as DWARF $target variables.  Also, tracepoint parameters cannot be mod‐
1151       ified, but in guru-mode a script may modify fields of parameters.
1152
1153       The  subsystem and name of the tracepoint are available in $$system and
1154       $$name and a string of name=value pairs for all parameters of the  tra‐
1155       cepoint is available in $$vars or $$parms.
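
       A sketch using the sched_switch tracepoint mentioned above.  It
       assumes that $prev and $next point to task structures with a pid
       field, which depends on the kernel's tracepoint declaration.

```systemtap
probe kernel.trace("sched:sched_switch") {
    # $$system and $$name identify the tracepoint that fired.
    printf("%s:%s pid %d -> pid %d\n",
           $$system, $$name, $prev->pid, $next->pid)
}
```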
1156
1157
1158   KERNEL MARKERS (OBSOLETE)
1159       This  family of probe points hooks up to an older style of static prob‐
1160       ing markers inserted into older kernels or modules.  These markers  are
1161       special  STAP_MARK  macro  calls  inserted by kernel developers to make
1162       probing faster and more reliable than with  DWARF-based  probes.   Fur‐
1163       ther, DWARF debugging information is not required to probe markers.
1164
1165       Marker  probe points begin with kernel.  The next part names the marker
1166       itself: mark("name").  The marker name string, which  may  contain  the
1167       usual  wildcard  characters,  is matched against the names given to the
1168       marker macros when the kernel and/or module was compiled.  Optionally,
1169       you can specify format("format").  Specifying the marker format
1170       string allows differentiation between two markers with  the  same  name
1171       but different marker format strings.
1172
1173       The  handler associated with a marker-based probe may read the optional
1174       parameters specified at the macro call site.   These  are  named  $arg1
1175       through  $argNN,  where  NN is the number of parameters supplied by the
1176       macro.  Number and string parameters are passed in a type-safe manner.
1177
1178       The marker format string associated with a marker is available in
1179       $format, and the marker name string is available in $name.
1180
1181
1182   HARDWARE BREAKPOINTS
1183       This family of probes is used to set hardware watchpoints for a given
1184       (global) kernel symbol.  The probes take three components as inputs:
1185
1186       1. The virtual address or name of the kernel symbol to be traced,
1187       supplied as the argument to this class of probes.  (Only data segment
1188       variables can be probed; local variables of a function cannot.)
1189
1190       2. The nature of the access to be probed:
1191       a. A .write probe is triggered when a write happens at the specified
1192       address or symbol name.
1193       b. A .rw probe is triggered when either a read or a write happens.
1194
1195       3. An optional .length component, which specifies the address interval
1196       to be probed.  The user-specified length is approximated to the closest
1197       address length that the architecture can support.  If the specified
1198       length exceeds the limits imposed by the architecture, an error message
1199       is reported and probe registration fails.  Where no length is
1200       specified, the translator requests a hardware breakpoint probe of
1201       length 1.  Note that the .length component is not valid with symbol
1202       names.
1203
1204       The following constructs are supported:
1205
1206              probe kernel.data(ADDRESS).write
1207              probe kernel.data(ADDRESS).rw
1208              probe kernel.data(ADDRESS).length(LEN).write
1209              probe kernel.data(ADDRESS).length(LEN).rw
1210              probe kernel.data("SYMBOL_NAME").write
1211              probe kernel.data("SYMBOL_NAME").rw
1212
1213
1214       This set of probes makes use of the debug registers of the processor,
1215       which are a scarce resource (4 on x86, 1 on powerpc).  The translator
1216       issues a warning if a script requests more hardware breakpoint probes
1217       than the architecture supports.  For example, a pass-2 warning is
1218       issued when a script requests 5 hardware breakpoint probes on an x86
1219       system, since the x86 architecture supports a maximum of 4
1220       breakpoints.  Users are cautioned to set probes judiciously.
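
       A sketch of a write watchpoint on a global kernel variable.  The
       symbol name pid_max is an assumption; it is a data-segment global in
       many kernels but may be absent or laid out differently in others.

```systemtap
# pid_max is an assumed symbol name; substitute a real global variable.
probe kernel.data("pid_max").write {
    printf("pid_max written by %s(%d)\n", execname(), pid())
    print_backtrace()
}
```

       This consumes one of the few available debug registers for as long as
       the script runs.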
1221
1222
1223   PERF
1224       This  family  of probe points interfaces to the kernel "perf event" in‐
1225       frastructure for controlling hardware performance counters.  The events
1226       being attached to are described by the "type" and "config" fields of the
1227       perf_event_attr structure, and are sampled at an interval  governed  by
1228       the "sample_period" and "sample_freq" fields.
1229
1230       These  fields are made available to systemtap scripts using the follow‐
1231       ing syntax:
1232
1233              probe perf.type(NN).config(MM).sample(XX)
1234              probe perf.type(NN).config(MM).hz(XX)
1235              probe perf.type(NN).config(MM)
1236              probe perf.type(NN).config(MM).process("PROC")
1237              probe perf.type(NN).config(MM).counter("COUNTER")
1238              probe perf.type(NN).config(MM).process("PROC").counter("NAME")
1239
1240       The systemtap probe handler is called once per XX increments of the un‐
1241       derlying  performance counter when using the .sample field or at a fre‐
1242       quency in hertz when using the .hz field. When not specified,  the  de‐
1243       fault  behavior is to sample at a count of 1000000.  The range of valid
1244       type/config is described by the perf_event_open(2) system call,  and/or
1245       the  linux/perf_event.h  file.  Invalid combinations or exhausted hard‐
1246       ware counter resources result in errors during systemtap script  start‐
1247       up.   Systemtap does not sanity-check the values: it merely passes them
1248       through to the kernel for error- and safety-checking.  By  default  the
1249       perf event probe is systemwide unless .process is specified, which will
1250       bind the probe to a specific task.  If the name is omitted then  it  is
1251       inferred  from  the stap -c argument.   A perf event can be read on de‐
1252       mand using .counter.  The body of the perf probe handler  will  not  be
1253       invoked  for  a  .counter probe; instead, the counter is read in a user
1254       space probe via:
1255
1256          process("PROC").statement("func@file") {stat <<< @perf("NAME")}
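
       The sampling form can be sketched as follows.  This is an
       illustration rather than a recommended profiler: it assumes that
       type 0 and config 0 select PERF_TYPE_HARDWARE and
       PERF_COUNT_HW_CPU_CYCLES as defined in linux/perf_event.h, and that
       the hardware counter is available on the machine.  The array name
       "hits" and the 5-second run time are arbitrary choices.

```systemtap
# Sketch only: count CPU-cycle samples per executable name, systemwide.
# Assumption: type(0)/config(0) = PERF_TYPE_HARDWARE/PERF_COUNT_HW_CPU_CYCLES.
global hits

# The handler runs once per 1000000 increments of the cycle counter.
probe perf.type(0).config(0).sample(1000000)
{
  hits[execname()]++
}

# Stop the session after five seconds.
probe timer.s(5)
{
  exit()
}

# Print the ten executables with the most samples, highest first.
probe end
{
  foreach (name in hits- limit 10)
    printf("%-16s %d\n", name, hits[name])
}
```

       Replacing .sample(1000000) with .hz(XX) would instead request a
       sampling frequency, as described above.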
1257
1258
1259
1260   PYTHON
1261       Support for probing python 2 and python 3 functions is  available  with
1262       the help of an extra python support module. Note that the debuginfo for
1263       the version of python being probed is required. To run a python  script
1264       with  the  extra python support module you'd add the '-m HelperSDT' op‐
1265       tion to your python command, like this:
1266
1267              stap foo.stp -c "python -m HelperSDT foo.py"
1268
1269       Python probes look like the following:
1270
1271              python2.module("MPATTERN").function("PATTERN")
1272              python2.module("MPATTERN").function("PATTERN").call
1273              python2.module("MPATTERN").function("PATTERN").return
1274              python3.module("MPATTERN").function("PATTERN")
1275              python3.module("MPATTERN").function("PATTERN").call
1276              python3.module("MPATTERN").function("PATTERN").return
1277
1278       The list above includes multiple variants and modifiers  which  provide
1279       additional functionality or filters. They are:
1280
1281              .function
1282                     Places  a probe at the beginning of the named function by
1283                     default,  unless  modified  by  PATTERN.  Parameters  are
1284                     available as context variables.
1285
1286              .call  Places  a  probe  at the beginning of the named function.
1287                     Parameters are available as context variables.
1288
1289              .return
1290                     Places a probe at the moment before the return  from  the
1291                     named  function. Parameters and local/global python vari‐
1292                     ables are available as context variables.
1293
1294       PATTERN stands for a string literal that aims to identify  a  point  in
1295       the python program.  It is made up of three parts:
1296
1297       ·   The  first  part  is  the  name of a function (e.g. "foo") or class
1298           method (e.g. "bar.baz"). This part may use the "*"  and  "?"  wild‐
1299           carding operators to match multiple names.
1300
1301       ·   The  second part is optional and begins with the "@" character.  It
1302           is followed by the path to the source file containing the function,
1303           which  may  include a wildcard pattern. The python path is searched
1304           for a matching filename.
1305
1306       ·   Finally, the third part is optional if the file name part was  giv‐
1307           en, and identifies the line number in the source file preceded by a
1308           ":" or a "+".  The line number is assumed to be  an  absolute  line
1309           number if preceded by a ":", or relative to the declaration line of
1310           the function if preceded by a "+".  All the lines in  the  function
1311           can  be  matched  with  ":*".   A range of lines x through y can be
1312           matched with ":x-y". Ranges and specific lines can be  mixed  using
1313           commas, e.g. ":x,y-z".
1314
1315       In  the above list of probe points, MPATTERN stands for a python module
1316       or script name that names the python module of interest. This part  may
1317       use  the "*" and "?" wildcarding operators to match multiple names. The
1318       python path is searched for a matching filename.
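
       A minimal sketch of a python probe script is shown below.  The
       module name "foo" matches the foo.py script from the command line
       shown earlier in this section; the function name "bar" is
       hypothetical and stands for any function defined in that script.

```systemtap
# Sketch only: trace entry to and return from a python 3 function.
# "foo" is the module (foo.py); "bar" is a hypothetical function in it.
probe python3.module("foo").function("bar").call
{
  println("bar called")
}

probe python3.module("foo").function("bar").return
{
  println("bar returning")
}
```

       Such a script would be run with the HelperSDT module loaded into the
       python interpreter, in the manner of the stap invocation shown at
       the start of this section.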
1319
1320
1321

EXAMPLES

1323       Here are some example probe points, defining the associated events.
1324
1325       begin, end, end
1326              refers to the startup and normal shutdown of  the  session.   In
1327              this  case,  the handler would run once during startup and twice
1328              during shutdown.
1329
1330       timer.jiffies(1000).randomize(200)
1331              refers to a periodic interrupt, every 1000 +/- 200 jiffies.
1332
1333       kernel.function("*init*"), kernel.function("*exit*")
1334              refers to all kernel functions with  "init"  or  "exit"  in  the
1335              name.
1336
1337       kernel.function("*@kernel/time.c:240")
1338              refers  to  any  functions  within the "kernel/time.c" file that
1339              span line 240.   Note that this is not a probe at the  statement
1340              at that line number.  Use the kernel.statement probe instead.
1341
1342       kernel.trace("sched_*")
1343              refers  to  all scheduler-related (strictly, all "sched_"-prefixed)
1344              tracepoints in the kernel.
1345
1346       kernel.mark("getuid")
1347              refers to an obsolete STAP_MARK(getuid, ...) macro call  in  the
1348              kernel.
1349
1350       module("usb*").function("*sync*").return
1351              refers to the moment of return from all functions with "sync" in
1352              the name in any of the USB drivers.
1353
1354       kernel.statement(0xc0044852)
1355              refers to the first byte of the  statement  whose  compiled  in‐
1356              structions include the given address in the kernel.
1357
1358       kernel.statement("*@kernel/time.c:296")
1359              refers to the statement of line 296 within "kernel/time.c".
1360
1361       kernel.statement("bio_init@fs/bio.c+3")
1362              refers to the statement at line bio_init+3 within "fs/bio.c".
1363
1364       kernel.data("pid_max").write
1365              refers to a hardware breakpoint of type "write" set on pid_max.
1366
1367       syscall.*.return
1368              refers  to the group of probe aliases with any name in the second
1369              position.
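
       To show how such probe points fit into a complete script, the sketch
       below attaches one handler to syscall.*.return and tallies returns
       per system call.  It relies on the "name" variable that the syscall
       tapset aliases define; the array name "returns" and the 5-second
       run time are arbitrary choices.

```systemtap
# Sketch only: count system call returns by name for five seconds.
global returns

# One handler is attached to every alias matched by the wildcard.
probe syscall.*.return
{
  # "name" is set by the syscall tapset to the system call's name.
  returns[name]++
}

probe timer.s(5)
{
  exit()
}

# Print the ten most frequent system calls, highest count first.
probe end
{
  foreach (n in returns- limit 10)
    printf("%-20s %d\n", n, returns[n])
}
```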
1370
1371

SEE ALSO

1373       stap(1),
1374       probe::*(3stap),
1375       tapset::*(3stap)
1376
1377
1378
1379
1380                                                             STAPPROBES(3stap)