stapprobes(3stap)

STAPPROBES(3stap)                                            STAPPROBES(3stap)

NAME

       stapprobes - systemtap probe points

DESCRIPTION

11       The  following sections enumerate the variety of probe points supported
12       by the systemtap translator, and some of the additional aliases defined
13       by  standard  tapset  scripts.  Many are individually documented in the
14       3stap manual section, with the probe:: prefix.
15
16       The general probe point  syntax  is  a  dotted-symbol  sequence.   This
17       allows a breakdown of the event namespace into parts, somewhat like the
18       Domain Name System does on the Internet.  Each component identifier may
19       be  parametrized  by  a  string or number literal, with a syntax like a
20       function call.  A component may include a "*" character, to expand to a
21       set of matching probe points.  It may also include "**" to match multi‐
22       ple sequential components at once.  Probe aliases  likewise  expand  to
23       other  probe  points.  Each and every resulting probe point is normally
24       resolved to some low-level system  instrumentation  facility  (e.g.,  a
25       kprobe address, marker, or a timer configuration), otherwise the elabo‐
26       ration phase will fail.
27
28       However, a probe point may be followed by a "?" character, to  indicate
29       that  it  is  optional,  and that no error should result if it fails to
30       resolve.  Optionalness passes down through all levels of alias/wildcard
31       expansion.  Alternately, a probe point may be followed by a "!" charac‐
32       ter, to indicate that it  is  both  optional  and  sufficient.   (Think
33       vaguely  of  the Prolog cut operator.) If it does resolve, then no fur‐
34       ther probe points in the same comma-separated list  will  be  resolved.
35       Therefore,  the  "!"   sufficiency  mark  only makes sense in a list of
36       probe point alternatives.
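
       As a minimal sketch of how the two marks combine in a list of
       alternatives, the script below prefers one function and falls back to
       another.  The function names do_sys_open and sys_open are assumptions
       chosen for illustration; whether either resolves depends on the
       running kernel.

```systemtap
# Try the first alternative; because of "!", the second one is
# resolved only if the first fails to resolve.
probe kernel.function("do_sys_open") !,
      kernel.function("sys_open") ?
{
    printf("%s is opening a file\n", execname())
}
```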
37
       Additionally, a probe point may be followed by an "if (expr)" clause,
       in order to enable or disable the probe point on the fly.  If "expr"
       is false when the probe point is hit, the whole probe body, including
       the bodies of any aliases, is skipped.  The condition is carried
       through all levels of alias/wildcard expansion, so the final condition
       is the logical AND of the conditions at every level of expansion.
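
       A possible use of a probe condition is sketched below.  The global
       variable name "enabled" and the five-second interval are arbitrary
       choices for illustration, and syscall.open is assumed to resolve on
       the target kernel.

```systemtap
global enabled = 0

# Flip the switch every five seconds.
probe timer.s(5) { enabled = !enabled }

# The handler body is skipped whenever "enabled" is 0.
probe syscall.open if (enabled)
{
    printf("%s %s(%s)\n", execname(), name, argstr)
}
```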
45
       These are all syntactically valid probe points.  (Whether each one is
       also semantically valid depends on the contents of the tapsets and on
       the versions of kernel/user software installed.)
49
50              kernel.function("foo").return
51              process("/bin/vi").statement(0x2222)
52              end
53              syscall.*
54              sys**open
55              kernel.function("no_such_function") ?
56              module("awol").function("no_such_function") !
57              signal.*? if (switch)
58              kprobe.function("foo")
59
60       Probes may be broadly classified into "synchronous" and "asynchronous".
61       A "synchronous" event is deemed to occur when any processor executes an
62       instruction matched by the specification.  This gives  these  probes  a
63       reference  point  (instruction address) from which more contextual data
64       may be available.  Other families of probe points refer  to  "asynchro‐
65       nous"  events  such  as timers/counters rolling over, where there is no
66       fixed reference point that is related.  Each probe point  specification
67       may match multiple locations (for example, using wildcards or aliases),
       and all of them are then probed.  A probe declaration may also contain
69       several comma-separated specifications, all of which are probed.
70
71

DWARF DEBUGINFO

73       Resolving some probe points requires DWARF debuginfo or "debug symbols"
74       for the specific part being instrumented.  For some  others,  DWARF  is
75       automatically  synthesized  on  the  fly from source code header files.
76       For others, it is not needed at all.  Since a systemtap script may  use
77       any mixture of probe points together, the union of their DWARF require‐
78       ments has to be met on the computer where  script  compilation  occurs.
79       (See the --use-server option and the stap-server(8) man page for infor‐
80       mation about the remote compilation facility, which  allows  these  re‐
81       quirements to be met on a different machine.)
82
       The following table lists many of the available probe point families,
       classified by their need for DWARF debuginfo.
85
86
87       DWARF                          AUTO-DWARF     NON-DWARF
88
89       kernel.function, .statement    kernel.trace   kernel.mark
90       module.function, .statement                   process.mark
91       process.function, .statement                  begin, end, error, never
92       process.mark (backup)                         timer
93                                                     perf
94                                                     procfs
95                                                     kernel.statement.absolute
96                                                     kernel.data
97                                                     kprobe.function
98                                                     process.statement.absolute
99                                                     process.begin, .end, .error
100
101

PROBE POINT FAMILIES

103   BEGIN/END/ERROR
104       The probe points begin and end are defined by the translator  to  refer
105       to  the  time  of session startup and shutdown.  All "begin" probe han‐
106       dlers are run, in some sequence, during the  startup  of  the  session.
107       All  global  variables  will have been initialized prior to this point.
108       All "end" probes are run, in some sequence, during the normal  shutdown
109       of  a session, such as in the aftermath of an exit () function call, or
110       an interruption from the user.  In the case of an error-triggered shut‐
111       down,  "end"  probes are not run.  There are no target variables avail‐
112       able in either context.
113
114       If the order of execution among "begin" or "end" probes is significant,
115       then an optional sequence number may be provided:
116
117              begin(N)
118              end(N)
119
120       The  number  N may be positive or negative.  The probe handlers are run
121       in increasing order, and the order between handlers with the  same  se‐
122       quence  number is unspecified.  When "begin" or "end" are given without
123       a sequence, they are effectively sequence zero.
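
       The ordering rule can be illustrated with a short sketch.  The
       sequence numbers -1 and 10 are arbitrary; only their relative order
       matters.

```systemtap
probe begin(-1) { println("runs first") }
probe begin     { println("runs second (sequence zero)") }
probe begin(10) { println("runs third"); exit() }
probe end       { println("runs during normal shutdown") }
```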
124
       The error probe point is similar to the end probe, except that each
       such probe handler is run when the session ends after errors have
       occurred.  In such cases, "end" probes are skipped, but each "error"
128       probe  is  still attempted.  This kind of probe can be used to clean up
129       or emit a "final gasp".  It may also be numerically parametrized to set
130       a sequence.
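
       The following sketch shows one way to observe the difference between
       "end" and "error" probes.  It assumes the tapset function error() is
       available to raise a run-time error, and that the default error limit
       is in effect so that the first error ends the session.

```systemtap
# Raise an error immediately so that the session shuts down abnormally.
probe begin { error("simulated failure") }

# Expected to be skipped, since the shutdown is error-triggered.
probe end   { println("end: normal shutdown") }

# Expected to run as the "final gasp".
probe error { println("error: cleaning up after a failure") }
```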
131
132
133   NEVER
134       The  probe  point  never is specially defined by the translator to mean
135       "never".  Its probe handler is never run, though its statements are an‐
136       alyzed for symbol / type correctness as usual.  This probe point may be
137       useful in conjunction with optional probes.
138
139
140   SYSCALL
       The syscall.* aliases define several hundred probes, too many to
       summarize here.  They take the following forms:
143
144              syscall.NAME
145              syscall.NAME.return
146
147       Generally, two probes are defined for each normal system call as listed
148       in the syscalls(2) manual page, one  for  entry  and  one  for  return.
149       Those  system  calls that never return do not have a corresponding .re‐
150       turn probe.
151
       Each probe alias provides a variety of variables.  Looking at the
       tapset source code is the most reliable way to find out which ones.
       Generally, each variable listed in the standard manual page is made
       available as a script-level variable, so syscall.open exposes
       filename, flags, and mode.  In addition, a standard suite of variables
       is available at most aliases:
157
158       argstr A pretty-printed form  of  the  entire  argument  list,  without
159              parentheses.
160
161       name   The name of the system call.
162
163       retstr For  return probes, a pretty-printed form of the system-call re‐
164              sult.
165
166       As usual for probe aliases, these variables are all simply  initialized
167       once  from  the underlying $context variables, so that later changes to
168       $context variables are not  automatically  reflected.   Not  all  probe
169       aliases  obey all of these general guidelines.  Please report any both‐
170       ersome ones you encounter as a bug.
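
       A minimal sketch of the entry/return pairing and the standard
       variables follows.  It assumes that syscall.open resolves on the
       target kernel; execname() and pid() are tapset functions.

```systemtap
# Entry: name and argstr are set by the alias.
probe syscall.open
{
    printf("%s(%d) %s(%s)\n", execname(), pid(), name, argstr)
}

# Return: retstr holds the pretty-printed result.
probe syscall.open.return
{
    printf("%s(%d) %s = %s\n", execname(), pid(), name, retstr)
}
```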
171
172
173
174   TIMERS
175       Intervals defined by the standard kernel "jiffies" timer may be used to
176       trigger  probe  handlers  asynchronously.  Two probe point variants are
177       supported by the translator:
178
179              timer.jiffies(N)
180              timer.jiffies(N).randomize(M)
181
182       The probe handler is run every N  jiffies  (a  kernel-defined  unit  of
183       time,  typically between 1 and 60 ms).  If the "randomize" component is
184       given, a linearly distributed random value in  the  range  [-M..+M]  is
185       added to N every time the handler is run.  N is restricted to a reason‐
186       able range (1 to around a million), and M is restricted to  be  smaller
187       than  N.  There are no target variables provided in either context.  It
188       is possible for such probes to be run concurrently on a multi-processor
189       computer.
190
191       Alternatively,  intervals may be specified in units of time.  There are
192       two probe point variants similar to the jiffies timer:
193
194              timer.ms(N)
195              timer.ms(N).randomize(M)
196
197       Here, N and M are specified in milliseconds, but the full  options  for
198       units   are   seconds  (s/sec),  milliseconds  (ms/msec),  microseconds
199       (us/usec), nanoseconds (ns/nsec), and hertz (hz).  Randomization is not
200       supported for hertz timers.
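
       As an illustration of the time-unit variants, the sketch below
       combines a randomized millisecond timer with a plain seconds timer.
       The intervals and the variable name "fires" are arbitrary.

```systemtap
global fires

# Roughly every 100 ms, jittered by up to 20 ms either way.
probe timer.ms(100).randomize(20) { fires++ }

# Report and stop after five seconds.
probe timer.s(5)
{
    printf("randomized timer fired %d times\n", fires)
    exit()
}
```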
201
202       The  actual resolution of the timers depends on the target kernel.  For
203       kernels prior to 2.6.17, timers are limited to jiffies  resolution,  so
204       intervals  are  rounded  up  to  the  nearest  jiffies interval.  After
205       2.6.17, the implementation uses hrtimers for tighter precision,  though
206       the  actual  resolution will be arch-dependent.  In either case, if the
207       "randomize" component is given, then the random value will be added  to
208       the interval before any rounding occurs.
209
210       Profiling  timers  are also available to provide probes that execute on
211       all CPUs at the rate of the system tick (CONFIG_HZ).  This probe  takes
212       no parameters.
213
214              timer.profile
215
216       Full  context information of the interrupted process is available, mak‐
217       ing this probe suitable for a time-based sampling profiler.
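
       A very small sampling profiler might look like the sketch below.  It
       only counts samples per executable name; the ten-second run time and
       the top-ten cutoff are arbitrary choices.

```systemtap
global samples

# One sample per CPU per system tick, attributed to the current process.
probe timer.profile { samples[execname()]++ }

probe timer.s(10)
{
    # Print the ten most frequently sampled names, highest count first.
    foreach (name in samples- limit 10)
        printf("%-20s %d\n", name, samples[name])
    exit()
}
```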
218
219
220   DWARF
221       This family of probe points uses symbolic debugging information for the
222       target  kernel/module/program,  as  may be found in unstripped executa‐
223       bles, or the separate debuginfo  packages.   They  allow  placement  of
224       probes  logically  into  the  execution  path of the target program, by
225       specifying a set of points in the source or object code.  When a match‐
226       ing  statement  executes  on any processor, the probe handler is run in
227       that context.
228
       Points in the kernel are identified by module, source file, line
       number, function name, or some combination of these.
231
232       Here is a list of probe point families currently supported.  The .func‐
233       tion variant places a probe near the beginning of the  named  function,
234       so  that  parameters  are  available as context variables.  The .return
235       variant places a probe at the moment after the return  from  the  named
236       function,  so  the  return  value is available as the "$return" context
237       variable.  The .inline modifier for .function filters  the  results  to
238       include  only  instances  of inlined functions.  The .call modifier se‐
239       lects the opposite subset.  Inline functions do not have  an  identifi‐
240       able  return  point, so .return is not supported on .inline probes. The
241       .statement variant places a probe at the exact spot, exposing those lo‐
242       cal variables that are visible there.
243
244              kernel.function(PATTERN)
245              kernel.function(PATTERN).call
246              kernel.function(PATTERN).return
247              kernel.function(PATTERN).inline
248              kernel.function(PATTERN).label(LPATTERN)
249              module(MPATTERN).function(PATTERN)
250              module(MPATTERN).function(PATTERN).call
251              module(MPATTERN).function(PATTERN).return
252              module(MPATTERN).function(PATTERN).inline
253              module(MPATTERN).function(PATTERN).label(LPATTERN)
254              kernel.statement(PATTERN)
255              kernel.statement(ADDRESS).absolute
256              module(MPATTERN).statement(PATTERN)
257              process("PATH").function("NAME")
258              process("PATH").statement("*@FILE.c:123")
259              process("PATH").library("PATH").function("NAME")
260              process("PATH").library("PATH").statement("*@FILE.c:123")
261              process("PATH").function("*").return
262              process("PATH").function("myfun").label("foo")
263              process(PID).statement(ADDRESS).absolute
264
265       (See  the  USER-SPACE section below for more information on the process
266       probes.)
267
       In the above list, MPATTERN stands for a string literal that aims to
       identify the loaded kernel module of interest, and LPATTERN stands for
       a source program label.  Both MPATTERN and LPATTERN may include the
       "*", "[]", and "?" wildcards.  PATTERN stands for a string literal that
       aims to identify a point in the program.  It is made up of three parts:
273
274       ·   The first part is the name of a function, as would appear in the nm
275           program's  output.   This  part may use the "*" and "?" wildcarding
276           operators to match multiple names.
277
278       ·   The second part is optional and begins with the "@" character.   It
279           is followed by the path to the source file containing the function,
280           which may include a wildcard pattern, such as mm/slab*.  If it does
281           not  match  as  is, an implicit "*/" is optionally added before the
282           pattern, so that a script need only name the last few components of
283           a possibly long source directory path.
284
285       ·   Finally,  the third part is optional if the file name part was giv‐
286           en, and identifies the line number in the source file preceded by a
287           ":"  or  a  "+".  The line number is assumed to be an absolute line
288           number if preceded by a ":", or relative to the entry of the  func‐
289           tion  if  preceded  by a "+".  All the lines in the function can be
290           matched with ":*".  A range of lines x through  y  can  be  matched
291           with ":x-y".
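
       The parts of PATTERN can be combined as in the following sketch.  The
       function and file names (vfs_read, fs/read_write.c) are assumptions
       about the target kernel source, and DWARF debuginfo is required for
       both probes.  pp() is a tapset function returning the resolved probe
       point.

```systemtap
# Function part with a wildcard, plus a source file part.
probe kernel.function("vfs_*@fs/read_write.c").call
{
    printf("%s\n", pp())
}

# All three parts: every line of one function in that file.
probe kernel.statement("vfs_read@fs/read_write.c:*")
{
    printf("%s\n", pp())
}
```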
292
293       As an alternative, PATTERN may be a numeric constant, indicating an ad‐
294       dress.  Such an address may be found from symbol tables of  the  appro‐
295       priate  kernel  /  module  object  file.   It is verified against known
296       statement code boundaries, and will be relocated for use at run time.
297
298       In guru mode only, absolute kernel-space  addresses  may  be  specified
299       with the ".absolute" suffix.  Such an address is considered already re‐
300       located, as if it came from /proc/kallsyms, so  it  cannot  be  checked
301       against statement/instruction boundaries.
302
303
304   CONTEXT VARIABLES
305       Many  of  the  source-level context variables, such as function parame‐
306       ters, locals, globals visible in the compilation unit, may  be  visible
307       to  probe  handlers.   They  may  refer to these variables by prefixing
308       their name with "$" within the scripts.  In addition, a special  syntax
309       allows  limited  traversal  of  structures, pointers, and arrays.  More
310       syntax allows pretty-printing of individual variables or their  groups.
311       See also @cast.
312
313
314       $var   refers  to  an in-scope variable "var".  If it's an integer-like
315              type, it will be cast to a 64-bit int for systemtap script  use.
316              String-like  pointers (char *) may be copied to systemtap string
317              values using the kernel_string or user_string functions.
318
       $var->field
              traverses a structure's or a pointer's field.  This generalized
              indirection operator may be repeated to follow more levels.
              Note that the "." operator is not used for plain structure
              members; "->" serves both purposes.  (This is because "." is
              reserved for string concatenation.)
324
325       $return
326              is  available  in  return probes only for functions that are de‐
327              clared with a return value.
328
329       $var[N]
              indexes into an array.  The index may be given as a literal
              number or as an arbitrary numeric expression.
332
333       A  number  of  operators  exist for such basic context variable expres‐
334       sions:
335
336       $$vars expands to a character string that is equivalent to
337              sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
338                      parm1, ..., parmN, var1, ..., varN)
339       for each variable in scope at the probe  point.   Some  values  may  be
340       printed as =?  if their run-time location cannot be found.
341
342       $$locals
343              expands to a subset of $$vars for only local variables.
344
345       $$parms
346              expands to a subset of $$vars for only function parameters.
347
348       $$return
349              is available in return probes only.  It expands to a string that
350              is equivalent to sprintf("return=%x",  $return)  if  the  probed
351              function has a return value, or else an empty string.
352
353       & $EXPR
354              expands to the address of the given context variable expression,
355              if it is addressable.
356
357       @defined($EXPR)
              expands to 1 if the given context variable expression is
              resolvable and to 0 otherwise, for use in conditionals such as
360              @defined($foo->bar) ? $foo->bar : 0
361
362       $EXPR$ expands to a string with all of $EXPR's members, equivalent to
363              sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
364                       $EXPR->a, $EXPR->b)
365
366       $EXPR$$
              expands to a string with all of $EXPR's members and submembers,
              equivalent to
369              sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
370                      $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
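
       Several of these operators are combined in the sketch below.  The
       probed function vfs_read, its parameter name "file", and the field
       f_flags are assumptions about the target kernel; the @defined guard
       keeps the script usable if the field cannot be resolved.

```systemtap
probe kernel.function("vfs_read")
{
    # All function parameters, as a single formatted string.
    printf("%s\n", $$parms)

    # Guard a field access that may not resolve on every kernel.
    if (@defined($file->f_flags))
        printf("f_flags=0x%x\n", $file->f_flags)
}

probe kernel.function("vfs_read").return
{
    # "return=..." if the function has a return value.
    printf("%s\n", $$return)
}
```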
371
372
373       For ".return" probes, context variables other than the "$return"  value
374       itself  are  only  available for the function call parameters.  The ex‐
375       pressions evaluate to the entry-time values of those  variables,  since
376       that is when a snapshot is taken.  Other local variables are not gener‐
377       ally accessible, since by the time a ".return" probe hits,  the  probed
378       function will have already returned.
379
380       Arbitrary entry-time expressions can also be saved for ".return" probes
381       using the @entry(expr) operator.  For  example,  one  can  compute  the
382       elapsed time of a function:
383              probe kernel.function("do_filp_open").return {
384                  println( get_timeofday_us() - @entry(get_timeofday_us()) )
385              }
386
387
388
389   DWARFLESS
       In the absence of debugging information, the entry and exit points of
       kernel and module functions can be probed using the "kprobe" family of
       probes.  However, these do not permit looking up the arguments or local
       variables of the function.  The following constructs are supported:
394              kprobe.function(FUNCTION)
395              kprobe.function(FUNCTION).return
396              kprobe.module(NAME).function(FUNCTION)
397              kprobe.module(NAME).function(FUNCTION).return
              kprobe.statement(ADDRESS).absolute
399
400       Probes of type function are recommended for kernel  functions,  whereas
401       probes  of  type  module  are  recommended for probing functions of the
402       specified module.  In case the absolute address of a kernel  or  module
403       function is known, statement probes can be utilized.
404
       Note that FUNCTION and module NAME values must not contain wildcards,
       or the probe will not be registered.  Also, statement probes may be
       used only in guru mode.
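
       A minimal sketch of a DWARF-less probe pair follows.  The function
       name vfs_read is an assumption about the target kernel.  Since
       arguments and local variables cannot be looked up, the handlers only
       count events.

```systemtap
global entries, returns

# Exact function names only; wildcards are not accepted here.
probe kprobe.function("vfs_read")        { entries++ }
probe kprobe.function("vfs_read").return { returns++ }

probe timer.s(5)
{
    printf("vfs_read: %d entries, %d returns\n", entries, returns)
    exit()
}
```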
408
409
410
411   USER-SPACE
412       Support  for  user-space probing is available for kernels that are con‐
413       figured with the utrace extensions.  See
414              http://people.redhat.com/roland/utrace/
415
416       There are several forms.  First, a non-symbolic probe point:
417              process(PID).statement(ADDRESS).absolute
418       is analogous to kernel.statement(ADDRESS).absolute in that both use raw
419       (unverified)  virtual  addresses and provide no $variables.  The target
420       PID parameter must identify a running process, and ADDRESS should iden‐
421       tify  a valid instruction address.  All threads of that process will be
422       probed.
423
424       Second, non-symbolic user-kernel interface events handled by utrace may
425       be probed:
426              process(PID).begin
427              process("FULLPATH").begin
428              process.begin
429              process(PID).thread.begin
430              process("FULLPATH").thread.begin
431              process.thread.begin
432              process(PID).end
433              process("FULLPATH").end
434              process.end
435              process(PID).thread.end
436              process("FULLPATH").thread.end
437              process.thread.end
438              process(PID).syscall
439              process("FULLPATH").syscall
440              process.syscall
441              process(PID).syscall.return
442              process("FULLPATH").syscall.return
443              process.syscall.return
444              process(PID).insn
445              process("FULLPATH").insn
446              process(PID).insn.block
447              process("FULLPATH").insn.block
448
       These probes are called under the following circumstances:

       .begin
              is called when a new process described by PID or FULLPATH is
              created.

       .thread.begin
              is called when a new thread described by PID or FULLPATH is
              created.

       .end
              is called when a process described by PID or FULLPATH dies.

       .thread.end
              is called when a thread described by PID or FULLPATH dies.

       .syscall
              is called when a thread described by PID or FULLPATH makes a
              system call.  The system call number is available in the
              $syscall context variable, and the first 6 arguments of the
              system call are available in the $argN context variables
              (e.g. $arg1, $arg2, ...).

       .syscall.return
              is called when a thread described by PID or FULLPATH returns
              from a system call.  The system call number is available in the
              $syscall context variable, and the return value of the system
              call is available in the $return context variable.

       .insn
              is called for every single-stepped instruction of the process
              described by PID or FULLPATH.

       .insn.block
              is called for every block-stepped instruction of the process
              described by PID or FULLPATH.
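
       The process lifecycle and system call events can be traced as in the
       sketch below.  The path /bin/ls is only an example target; pid() is a
       tapset function.

```systemtap
probe process("/bin/ls").begin
{
    printf("ls started, pid %d\n", pid())
}

probe process("/bin/ls").syscall
{
    printf("syscall %d, arg1=0x%x\n", $syscall, $arg1)
}

probe process("/bin/ls").syscall.return
{
    printf("syscall %d returned %d\n", $syscall, $return)
}

probe process("/bin/ls").end
{
    printf("ls exited, pid %d\n", pid())
}
```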
465
466       If a process probe is specified without a PID  or  FULLPATH,  all  user
467       threads  will be probed.  However, if systemtap was invoked with the -c
468       or -x options, then process probes are restricted to the process  hier‐
469       archy associated with the target process.  If a process probe is speci‐
470       fied without a PID or FULLPATH, but with the -c option, the PATH of the
471       -c cmd will be heuristically filled into the process PATH.
472
473
474       Third,  symbolic  static  instrumentation  compiled  into  programs and
475       shared libraries may be probed:
476              process("PATH").mark("LABEL")
477              process("PATH").provider("PROVIDER").mark("LABEL")
478
       A .mark probe gets called via a static probe which is defined in the
       application by STAP_PROBE1(PROVIDER,LABEL,arg1), a macro defined in
       sdt.h.  PROVIDER is an application handle, LABEL corresponds to the
       .mark argument, and arg1 is the argument.  STAP_PROBE1 is used for
       probes with 1 argument, STAP_PROBE2 is used for probes with 2
       arguments, and so on.  The arguments of the probe are available in the
       context variables $arg1, $arg2, ...  An alternative to using the
       STAP_PROBE macros is to use the dtrace script to create custom macros.
       Additionally, the variables $$name and $$provider are available as
       parts of the probe point name.
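
       On the script side, a mark probe might be used as sketched below.  The
       executable path, the provider name "myapp", and the label
       "request_start" are all hypothetical; the sketch assumes the
       application defines a one-argument static probe with those names and
       an integer argument.

```systemtap
# Hypothetical application, provider and label names.
probe process("/usr/local/bin/myapp").provider("myapp").mark("request_start")
{
    printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
}
```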
489
490
491       Finally,  full  symbolic source-level probes in user-space programs and
492       shared libraries are supported.  These are  exactly  analogous  to  the
493       symbolic  DWARF-based  kernel/module probes described above, and expose
494       similar contextual $variables.
495              process("PATH").function("NAME")
496              process("PATH").statement("*@FILE.c:123")
497              process("PATH").library("PATH").function("NAME")
498              process("PATH").library("PATH").statement("*@FILE.c:123")
499              process("PATH").function("*").return
500              process("PATH").function("myfun").label("foo")
501
502
503       Note that for all process probes, PATH names refer to executables  that
504       are  searched the same way shells do: relative to the working directory
505       if they contain a "/" character, otherwise in $PATH.  If PATH names re‐
506       fer to scripts, the actual interpreters (specified in the script in the
507       first line after the #! characters) are probed.  If PATH is  a  process
508       component  parameter  referring  to shared libraries then all processes
509       that map it at runtime would be selected for probing.  If PATH is a li‐
510       brary  component  parameter  referring  to  shared  libraries  then the
511       process specified by the process component would be selected.   If  the
512       PATH  string  contains wildcards as in the MPATTERN case, then standard
513       globbing is performed to find all matching paths.  In  this  case,  the
514       $PATH environment variable is not used.
515
516
517       If systemtap was invoked with the -c or -x options, then process probes
518       are restricted to the process  hierarchy  associated  with  the  target
519       process.
520
521
522   PROCFS
       These probe points allow procfs "files" in /proc/systemtap/MODNAME to
       be created, read and written, where MODNAME is the name of the
       systemtap module.  (The proc filesystem is a pseudo-filesystem which is
       used as an interface to kernel data structures.)  File permissions may
       be modified using a umask value.  Default permissions are 0400 for read
       probes, and 0200 for write probes.  If both a read and a write probe
       are being used on the same file, a default permission of 0600 will be
       used.  Using procfs.umask(0040).read would result in a 0404 permission
       set for the file.  There are several probe point variants supported by
       the translator:
533
534              procfs("PATH").read
535              procfs("PATH").umask(UMASK).read
536              procfs("PATH").read.maxsize(MAXSIZE)
              procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
538              procfs("PATH").write
539              procfs("PATH").umask(UMASK).write
540              procfs.read
541              procfs.umask(UMASK).read
542              procfs.read.maxsize(MAXSIZE)
543              procfs.umask(UMASK).read.maxsize(MAXSIZE)
544              procfs.write
545              procfs.umask(UMASK).write
546
       PATH is the file name (relative to /proc/systemtap/MODNAME) to be
       created.  If no PATH is specified (as in the last six variants above),
       PATH defaults to "command".
550
551       When  a  user  reads  /proc/systemtap/MODNAME/PATH,  the  corresponding
552       procfs  read  probe is triggered.  The string data to be read should be
553       assigned to a variable named $value, like this:
554
              probe procfs("PATH").read { $value = "100\n" }
556
557       When a user writes into /proc/systemtap/MODNAME/PATH, the corresponding
558       procfs  write probe is triggered.  The data the user wrote is available
559       in the string variable named $value, like this:
560
              probe procfs("PATH").write { printf("user wrote: %s", $value) }
562
563       MAXSIZE is the size of the procfs read buffer.  Specifying MAXSIZE  al‐
564       lows larger procfs output.  If no MAXSIZE is specified, the procfs read
565       buffer defaults to STP_PROCFS_BUFSIZE (which defaults to  MAXSTRINGLEN,
566       the  maximum  length  of a string).  If setting the procfs read buffers
567       for more than one file is needed, it may be  easiest  to  override  the
568       STP_PROCFS_BUFSIZE definition.  Here's an example of using MAXSIZE:
569
              probe procfs.read.maxsize(1024) {
                  # Build up output longer than the default buffer allows.
                  $value = "long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
              }
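
       Read and write probes on the same file can be paired to expose a
       tunable value, as in the following sketch.  The file name "threshold"
       and the global variable are arbitrary; strtol() is a tapset function
       that converts a string to a number in the given base.

```systemtap
global threshold = 100

# Reading the file reports the current value.
probe procfs("threshold").read
{
    $value = sprintf("%d\n", threshold)
}

# Writing a decimal number to the file updates the value.
probe procfs("threshold").write
{
    threshold = strtol($value, 10)
}
```

       With both probes present, the file would be created as
       /proc/systemtap/MODNAME/threshold with the default 0600 permission
       described above.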
576
577
578   MARKERS
579       This family of probe points hooks up to static probing markers inserted
580       into the kernel or modules.  These markers are special macro calls  in‐
581       serted  by  kernel  developers to make probing faster and more reliable
582       than with DWARF-based probes.  Further, DWARF debugging information  is
583       not required to probe markers.
584
       Marker probe points begin with kernel.  The next part names the marker
       itself: mark("name").  The marker name string, which may contain the
       usual wildcard characters, is matched against the names given to the
       marker macros when the kernel and/or module was compiled.  Optionally,
       you can specify format("format").  Specifying the marker format string
       allows differentiation between two markers with the same name but
       different marker format strings.
592
593       The  handler associated with a marker-based probe may read the optional
594       parameters specified at the macro call site.   These  are  named  $arg1
595       through  $argNN,  where  NN is the number of parameters supplied by the
596       macro.  Number and string parameters are passed in a type-safe manner.
597
598       The marker format string associated with a marker is available in $for‐
599       mat, and the marker name string is available in $name.
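
          As a minimal sketch of these context variables, the following probe
          prints the name and format string of every marker that fires.  The
          wildcard pattern and the output layout are illustrative choices, and
          the script resolves only on kernels that were built with markers.

```systemtap
# Sketch only: matches every marker compiled into the kernel.
probe kernel.mark("*") {
    # $name is the marker name, $format its format string.
    printf("%s: %s\n", $name, $format)
}
```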
600
601
602   TRACEPOINTS
603       This  family of probe points hooks up to static probing tracepoints in‐
604       serted into the kernel or modules.  As with markers, these  tracepoints
605       are  special  macro calls inserted by kernel developers to make probing
606       faster and more reliable than with DWARF-based probes, and DWARF debug‐
607       ging  information  is  not  required to probe tracepoints.  Tracepoints
608       have an extra advantage of more strongly-typed parameters than markers.
609
610       Tracepoint probes begin with kernel.  The next part  names  the  trace‐
611       point  itself:  trace("name").   The  tracepoint name string, which may
612       contain the usual wildcard characters, is matched against the names de‐
613       fined by the kernel developers in the tracepoint header files.
614
615       The  handler  associated with a tracepoint-based probe may read the op‐
616       tional parameters specified at the macro call site.   These  are  named
617       according  to  the  declaration by the tracepoint author.  For example,
618       the tracepoint probe kernel.trace("sched_switch") provides the  parame‐
619       ters  $rq, $prev, and $next.  If a parameter is a complex type, such as
620       a struct pointer, then a script can access fields with the same  syntax
621       as DWARF $target variables.  Tracepoint parameters themselves cannot be
622       modified, but in guru mode a script may modify fields of parameters.
623
624       The name of the tracepoint is available in  $$name,  and  a  string  of
625       name=value  pairs  for all parameters of the tracepoint is available in
626       $$vars or $$parms.
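
          A minimal sketch of a tracepoint probe follows.  It assumes that the
          running kernel's sched_switch tracepoint declares $prev and $next as
          task_struct pointers, as in the example above; the five-second timer
          is an illustrative way to end the session.

```systemtap
# Sketch only: parameter names depend on the kernel's tracepoint declaration.
probe kernel.trace("sched_switch") {
    # $$name is the tracepoint name; fields are read like DWARF $target variables.
    printf("%s: pid %d -> pid %d\n", $$name, $prev->pid, $next->pid)
}

# Stop after five seconds so the output stays short.
probe timer.s(5) {
    exit()
}
```

          When the parameter names are not known in advance, printing $$parms
          in the handler shows them as name=value pairs.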
627
628
629   HARDWARE BREAKPOINTS
630       This family of probes is used to set hardware watchpoints for  a  given
631       (global) kernel symbol.  The probes take three components as inputs:
632
633       1. The virtual address or name of the kernel symbol to be traced is
634       supplied as the argument to this class of probes.  Only variables in
635       the data segment are supported; local variables of a function cannot
636       be probed.
637
638       2. The nature of the access to be probed: a .write probe is triggered
639       when a write happens at the specified address or symbol name, and a
640       .rw probe is triggered when either a read or a write happens.
641
642       3. Optionally, .length(LEN) specifies the size of the address interval
643       to be probed.  The user-specified length is approximated to the
644       closest length that the architecture can support.  If the specified
645       length exceeds the limits imposed by the architecture, an error is
646       reported and probe registration fails.  When no length is given, the
647       translator requests a hardware breakpoint probe of length 1.  Note
648       that the length component is not valid with symbol names; it applies
649       only to numeric addresses.
650
651       The following constructs are supported:
652              probe kernel.data(ADDRESS).write
653              probe kernel.data(ADDRESS).rw
654              probe kernel.data(ADDRESS).length(LEN).write
655              probe kernel.data(ADDRESS).length(LEN).rw
656              probe kernel.data("SYMBOL_NAME").write
657              probe kernel.data("SYMBOL_NAME").rw
658
659       This set of probes makes use of the debug registers of the processor,
660       which are a scarce resource (4 on x86, 1 on powerpc).  The script
661       translation flags a warning if a user requests more hardware breakpoint
662       probes than the architecture supports.  For example, a pass-2 warning
663       is issued when an input script requests 5 hardware breakpoint probes
664       on an x86 system, since the x86 architecture supports a maximum of 4
665       breakpoints.  Users are cautioned to set probes judiciously.
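
          As a minimal sketch of the symbol-name form, the following script
          reports each write to the kernel variable pid_max.  It uses the
          tapset functions execname() and pid(); the message text is an
          illustrative choice.

```systemtap
# Sketch only: uses one of the processor's scarce debug registers.
probe kernel.data("pid_max").write {
    # Report which process was running when the write was detected.
    printf("pid_max written by %s (pid %d)\n", execname(), pid())
}
```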
666
667

EXAMPLES

669       Here are some example probe points, defining the associated events.
670
671       begin, end, end
672              refers to the startup and normal shutdown of  the  session.   In
673              this  case,  the handler would run once during startup and twice
674              during shutdown.
675
676       timer.jiffies(1000).randomize(200)
677              refers to a periodic interrupt, every 1000 +/- 200 jiffies.
678
679       kernel.function("*init*"), kernel.function("*exit*")
680              refers to all kernel functions with  "init"  or  "exit"  in  the
681              name.
682
683       kernel.function("*@kernel/sched.c:240")
684              refers  to  any  functions within the "kernel/sched.c" file that
685              span line 240.  Note that this is not a probe  at  the  statement
686              at that line number; use the kernel.statement probe for that.
687
688       kernel.mark("getuid")
689              refers to an STAP_MARK(getuid, ...) macro call in the kernel.
690
691       module("usb*").function("*sync*").return
692              refers to the moment of return from all functions with "sync" in
693              the name in any of the USB drivers.
694
695       kernel.statement(0xc0044852)
696              refers to the first byte of the  statement  whose  compiled  in‐
697              structions include the given address in the kernel.
698
699       kernel.statement("*@kernel/sched.c:2917")
700              refers to the statement of line 2917 within "kernel/sched.c".
701
702       kernel.statement("bio_init@fs/bio.c+3")
703              refers to the statement at line bio_init+3 within "fs/bio.c".
704
705       kernel.data("pid_max").write
706              refers to a hardware breakpoint of type "write" set on pid_max.
707
708       syscall.*.return
709              refers to the group of probe aliases with any name in the second
710              position.
711
712
713   PERF
714       This prototype family of probe points interfaces to  the  kernel  "perf
715       event"  infrastructure  for  controlling hardware performance counters.
716       The events being attached to are described by the "type" and  "config"
717       fields of the perf_event_attr structure, and are sampled at an interval
718       governed by the "sample_period" field.
719
720       These fields are made available to systemtap scripts using the  follow‐
721       ing syntax:
722              probe perf.type(NN).config(MM).sample(XX)
723              probe perf.type(NN).config(MM)
724       The systemtap probe handler is called once per XX increments of the un‐
725       derlying performance counter.  The default sampling count  is  1000000.
726       The  range  of valid type/config is described by the perf_event_open(2)
727       system call, and/or the linux/perf_event.h file.  Invalid  combinations
728       or exhausted hardware counter resources result in errors during system‐
729       tap script startup.  Systemtap does not  sanity-check  the  values:  it
730       merely  passes  them through to the kernel for error- and safety-check‐
731       ing.
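
          A minimal sketch of a perf probe follows.  It assumes that type 0 is
          PERF_TYPE_HARDWARE and config 0 is PERF_COUNT_HW_CPU_CYCLES, as
          defined in linux/perf_event.h; the global name samples and the
          report layout are hypothetical.

```systemtap
# Sketch only: counts samples per process name.
global samples

# Fires once per 1000000 increments of the CPU cycle counter.
probe perf.type(0).config(0).sample(1000000) {
    samples[execname()]++
}

# At shutdown, print the ten process names with the most samples.
probe end {
    foreach (name in samples- limit 10)
        printf("%-20s %d\n", name, samples[name])
}
```

          The script runs until interrupted; if the kernel rejects the
          type/config pair, the error appears at script startup as described
          above.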
732
733
