STAPPROBES(3stap)                                            STAPPROBES(3stap)

NAME
stapprobes - systemtap probe points

DESCRIPTION
11 The following sections enumerate the variety of probe points supported
12 by the systemtap translator, and some of the additional aliases defined
13 by standard tapset scripts. Many are individually documented in the
14 3stap manual section, with the probe:: prefix.

SYNTAX
18 probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }
19
20
A probe declaration may list multiple comma-separated probe points in
order to attach a handler to all of the named events. Normally, the
handler statements are run whenever any of the events occur. Depending
on the type of probe point, the handler statements may refer to context
variables (denoted with a dollar-sign prefix like $foo) to read or
write state. This may include function parameters for function probes,
or local variables for statement probes.
28
29 The syntax of a single probe point is a general dotted-symbol sequence.
30 This allows a breakdown of the event namespace into parts, somewhat
31 like the Domain Name System does on the Internet. Each component iden‐
32 tifier may be parametrized by a string or number literal, with a syntax
33 like a function call. A component may include a "*" character, to ex‐
34 pand to a set of matching probe points. It may also include "**" to
35 match multiple sequential components at once. Probe aliases likewise
36 expand to other probe points.
37
38 Probe aliases can be given on their own, or with a suffix. The suffix
39 attaches to the underlying probe point that the alias is expanded to.
40 For example,
41
42 syscall.read.return.maxactive(10)
43
44 expands to
45
46 kernel.function("sys_read").return.maxactive(10)
47
48 with the component maxactive(10) being recognized as a suffix.
49
50 Normally, each and every probe point resulting from wildcard- and
51 alias-expansion must be resolved to some low-level system instrumenta‐
52 tion facility (e.g., a kprobe address, marker, or a timer configura‐
53 tion), otherwise the elaboration phase will fail.
54
55 However, a probe point may be followed by a "?" character, to indicate
56 that it is optional, and that no error should result if it fails to re‐
57 solve. Optionalness passes down through all levels of alias/wildcard
58 expansion. Alternately, a probe point may be followed by a "!" charac‐
59 ter, to indicate that it is both optional and sufficient. (Think
60 vaguely of the Prolog cut operator.) If it does resolve, then no fur‐
61 ther probe points in the same comma-separated list will be resolved.
62 Therefore, the "!" sufficiency mark only makes sense in a list of
63 probe point alternatives.
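
As a minimal sketch of the "!" sufficiency mark, the declaration below
lists two alternatives for the same event. The function names are
assumptions chosen for illustration; which of them exists depends on
the kernel version.

    # If ksys_read resolves, vfs_read is not probed at all.
    # Otherwise the handler is attached to vfs_read instead.
    probe kernel.function("ksys_read") !,
          kernel.function("vfs_read")
    {
        printf("%s[%d] entered %s\n", execname(), pid(), ppfunc())
    }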
64
Additionally, a probe point may be followed by an "if (expr)" clause,
in order to enable/disable the probe point on-the-fly. If "expr" is
false when the probe point is hit, the whole probe body, including the
bodies of any aliases, is skipped. The condition is stacked up through
all levels of alias/wildcard expansion, so the final condition becomes
the logical-and of the conditions of all expanded aliases/wildcards.
The expressions are necessarily restricted to global variables.
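
A probe point condition might be used as sketched below. The global
variable name "enabled" and the five-second interval are arbitrary
choices for this example.

    global enabled = 0

    # The handler body is skipped whenever "enabled" is zero.
    probe syscall.read if (enabled)
    {
        printf("%s: %s(%s)\n", execname(), name, argstr)
    }

    # Flip the switch every five seconds.
    probe timer.s(5)
    {
        enabled = !enabled
    }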
73
These are all syntactically valid probe points. (Whether they are
semantically valid depends on the contents of the tapsets, and the
versions of kernel/user software installed.)
77
78
79 kernel.function("foo").return
80 process("/bin/vi").statement(0x2222)
81 end
82 syscall.*
83 syscall.*.return.maxactive(10)
84 syscall.{open,close}
85 sys**open
86 kernel.function("no_such_function") ?
87 module("awol").function("no_such_function") !
88 signal.*? if (switch)
89 kprobe.function("foo")
90
91
92 Probes may be broadly classified into "synchronous" and "asynchronous".
93 A "synchronous" event is deemed to occur when any processor executes an
94 instruction matched by the specification. This gives these probes a
95 reference point (instruction address) from which more contextual data
96 may be available. Other families of probe points refer to "asynchro‐
97 nous" events such as timers/counters rolling over, where there is no
fixed reference point that is related. Each probe point specification
may match multiple locations (for example, using wildcards or aliases),
and all of them are then probed. A probe declaration may also contain
several comma-separated specifications, all of which are probed.
102
103 Brace expansion is a mechanism which allows a list of probe points to
104 be generated. It is very similar to shell expansion. A component may be
105 surrounded by a pair of curly braces to indicate that the comma-sepa‐
106 rated sequence of one or more subcomponents will each constitute a new
107 probe point. The braces may be arbitrarily nested. The ordering of ex‐
108 panded results is based on product order.
109
The question mark (?) and exclamation mark (!) indicators and probe
point conditions may only follow the last component; they may not be
placed inside any expansion that precedes it.
113
114 The following is an example of brace expansion.
115
116
117 syscall.{write,read}
118 # Expands to
119 syscall.write, syscall.read
120
121 {kernel,module("nfs")}.function("nfs*")!
122 # Expands to
123 kernel.function("nfs*")!, module("nfs").function("nfs*")!
124

DWARF DEBUGINFO
128 Resolving some probe points requires DWARF debuginfo or "debug symbols"
129 for the specific program being instrumented. For some others, DWARF is
130 automatically synthesized on the fly from source code header files.
131 For others, it is not needed at all. Since a systemtap script may use
132 any mixture of probe points together, the union of their DWARF require‐
133 ments has to be met on the computer where script compilation occurs.
134 (See the --use-server option and the stap-server(8) man page for infor‐
135 mation about the remote compilation facility, which allows these re‐
136 quirements to be met on a different machine.)
137
The following table lists many of the available probe point families,
classifying them by their need for DWARF debuginfo for the specific
program being probed.
141
142
DWARF                          NON-DWARF                    SYMBOL-TABLE

kernel.function, .statement    kernel.mark                  kernel.function*
module.function, .statement    process.mark, process.plt    module.function*
process.function, .statement   begin, end, error, never     process.function*
process.mark*                  timer
.function.callee               perf
python2, python3               procfs
                               kernel.statement.absolute
AUTO-GENERATED-DWARF           kernel.data
                               kprobe.function
kernel.trace                   process.statement.absolute
                               process.begin, .end
                               netfilter
                               java
158
159
Probe types marked with an asterisk (*) are fallbacks, where systemtap
can sometimes infer subset or substitute information. In general, the
more symbolic / debugging information is available, the higher the
quality of probing.
164

ON-THE-FLY ARMING
168 The following types of probe points may be armed/disarmed on-the-fly to
169 save overheads during uninteresting times. Arming conditions may also
170 be added to other types of probes, but will be treated as a wrapping
171 conditional and won't benefit from overhead savings.
172
173
DISARMABLE                                exceptions
kernel.function, kernel.statement
module.function, module.statement
process.*.function, process.*.statement
process.*.plt, process.*.mark
timer.*                                   timer.profile
java

PROBE TYPES
184 BEGIN/END/ERROR
185 The probe points begin and end are defined by the translator to refer
186 to the time of session startup and shutdown. All "begin" probe han‐
187 dlers are run, in some sequence, during the startup of the session.
188 All global variables will have been initialized prior to this point.
All "end" probes are run, in some sequence, during the normal shutdown
of a session, such as in the aftermath of an exit() function call, or
an interruption from the user. In the case of an error-triggered
shutdown, "end" probes are not run. There are no target variables
available in either context.
194
195 If the order of execution among "begin" or "end" probes is significant,
196 then an optional sequence number may be provided:
197
198
199 begin(N)
200 end(N)
201
202
203 The number N may be positive or negative. The probe handlers are run
204 in increasing order, and the order between handlers with the same se‐
205 quence number is unspecified. When "begin" or "end" are given without
206 a sequence, they are effectively sequence zero.
207
The error probe point is similar to the end probe, except that each
such probe handler is run when the session ends after errors have
occurred. In such cases, "end" probes are skipped, but each "error"
probe is still attempted. This kind of probe can be used to clean up
or emit a "final gasp". It may also be numerically parametrized to set
a sequence.
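
The ordering rules can be sketched with a short script. The messages
are illustrative only.

    probe begin(-1) { println("begin: runs first") }
    probe begin     { println("begin: sequence zero") }
    probe begin(1)  { println("begin: runs last"); exit() }

    # Runs on a normal shutdown, here the one triggered by exit().
    probe end       { println("end: normal shutdown") }

    # Runs instead of the "end" probe if the session ends after errors.
    probe error     { println("error: final gasp") }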
214
215
216 NEVER
217 The probe point never is specially defined by the translator to mean
218 "never". Its probe handler is never run, though its statements are an‐
219 alyzed for symbol / type correctness as usual. This probe point may be
220 useful in conjunction with optional probes.
221
222
223 SYSCALL and ND_SYSCALL
224 The syscall.* and nd_syscall.* aliases define several hundred probes,
225 too many to detail here. They are of the general form:
226
227
228 syscall.NAME
229 nd_syscall.NAME
230 syscall.NAME.return
231 nd_syscall.NAME.return
232
233
Generally, a pair of probes is defined for each normal system call as
listed in the syscalls(2) manual page, one for entry and one for
return. Those system calls that never return do not have a
corresponding .return probe. The nd_* family of probes is about the
same, except that it uses non-DWARF based searching mechanisms, which
may result in a lower quality of symbolic context data (parameters),
and may miss some system calls. You may want to try them first, in
case kernel debugging information is not immediately available.
242
Each probe alias provides a variety of variables. Looking at the tapset
source code is the most reliable way to find out which ones. Generally,
each variable listed in the standard manual page is made available as a
script-level variable, so syscall.open exposes filename, flags, and
mode. In addition, a standard suite of variables is available at most
aliases:
248
249 argstr A pretty-printed form of the entire argument list, without
250 parentheses.
251
252 name The name of the system call.
253
254 retval For return probes, the raw numeric system-call result.
255
256 retstr For return probes, a pretty-printed string form of the system-
257 call result.
258
259 As usual for probe aliases, these variables are all initialized once
260 from the underlying $context variables, so that later changes to $con‐
261 text variables are not automatically reflected. Not all probe aliases
262 obey all of these general guidelines. Please report any bothersome
263 ones you encounter as a bug. Note that on some kernel/userspace archi‐
264 tecture combinations (e.g., 32-bit userspace on 64-bit kernel), the un‐
265 derlying $context variables may need explicit sign extension / masking.
266 When this is an issue, consider using the tapset-provided variables in‐
267 stead of raw $context variables.
268
269 If debuginfo availability is a problem, you may try using the non-DWARF
270 syscall probe aliases instead. Use the nd_syscall. prefix instead of
271 syscall. The same context variables are available, as far as possible.
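
The standard variables might be used as in the following sketch, which
traces read(2) calls. Replacing the syscall. prefix with nd_syscall.
gives the non-DWARF variant of the same script.

    probe syscall.read
    {
        # name and argstr are set by the alias at entry.
        printf("%s[%d] %s(%s)\n", execname(), pid(), name, argstr)
    }

    probe syscall.read.return
    {
        # retstr is the pretty-printed result.
        printf("%s[%d] %s = %s\n", execname(), pid(), name, retstr)
    }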
272
273
274 TIMERS
275 There are two main types of timer probes: "jiffies" timer probes and
276 time interval timer probes.
277
278 Intervals defined by the standard kernel "jiffies" timer may be used to
279 trigger probe handlers asynchronously. Two probe point variants are
280 supported by the translator:
281
282
283 timer.jiffies(N)
284 timer.jiffies(N).randomize(M)
285
286
The probe handler is run every N jiffies (a kernel-defined unit of
time, typically between 1 and 60 ms). If the "randomize" component is
given, a uniformly distributed random value in the range [-M..+M] is
added to N every time the handler is run. N is restricted to a
reasonable range (1 to around a million), and M is restricted to be
smaller than N. There are no target variables provided in either
context. It is possible for such probes to be run concurrently on a
multi-processor computer.
295
296 Alternatively, intervals may be specified in units of time. There are
297 two probe point variants similar to the jiffies timer:
298
299
300 timer.ms(N)
301 timer.ms(N).randomize(M)
302
303
304 Here, N and M are specified in milliseconds, but the full options for
305 units are seconds (s/sec), milliseconds (ms/msec), microseconds
306 (us/usec), nanoseconds (ns/nsec), and hertz (hz). Randomization is not
307 supported for hertz timers.
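
A minimal sketch combining several interval timers follows. The
intervals and the variable name "ticks" are arbitrary.

    global ticks

    # Fires roughly every 100 ms, varied by up to 20 ms either way.
    probe timer.ms(100).randomize(20) { ticks++ }

    # Fires twice per second; hertz timers cannot be randomized.
    probe timer.hz(2) { printf("ticks so far: %d\n", ticks) }

    # Ends the session after ten seconds.
    probe timer.s(10) { exit() }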
308
309 The actual resolution of the timers depends on the target kernel. For
310 kernels prior to 2.6.17, timers are limited to jiffies resolution, so
311 intervals are rounded up to the nearest jiffies interval. After
312 2.6.17, the implementation uses hrtimers for tighter precision, though
313 the actual resolution will be arch-dependent. In either case, if the
314 "randomize" component is given, then the random value will be added to
315 the interval before any rounding occurs.
316
317 Profiling timers are also available to provide probes that execute on
318 all CPUs at the rate of the system tick (CONFIG_HZ) or at a given fre‐
319 quency (hz). On some kernels, this is a one-concurrent-user-only or
320 disabled facility, resulting in error -16 (EBUSY) during probe regis‐
321 tration.
322
323
324 timer.profile.tick
325 timer.profile.freq.hz(N)
326
327
328 Full context information of the interrupted process is available, mak‐
329 ing this probe suitable for a time-based sampling profiler.
330
331 It is recommended to use the tapset probe timer.profile rather than
332 timer.profile.tick. This probe point behaves identically to timer.pro‐
333 file.tick when the underlying functionality is available, and falls
334 back to using perf.sw.cpu_clock on some recent kernels which lack the
335 corresponding profile timer facility.
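
As an illustration, a very small sampling profiler could be built on
timer.profile as sketched here. The ten-second run time and the limit
of ten output lines are arbitrary choices.

    global samples

    # Count one sample for whichever program was interrupted.
    probe timer.profile
    {
        samples[execname()]++
    }

    # Print the ten most frequently sampled programs, then stop.
    probe timer.s(10)
    {
        foreach (name in samples- limit 10)
            printf("%-20s %d\n", name, samples[name])
        exit()
    }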
336
337 Profiling timers with specified frequencies are only accurate up to
338 around 100 hz. You may need to provide a larger value to achieve the
339 desired rate.
340
341 Note that if a timer probe is set to fire at a very high rate and if
342 the probe body is complex, succeeding timer probes can get skipped,
343 since the time for them to run has already passed. Normally systemtap
344 reports missed probes, but it will not report these skipped probes.
345
346
347 DWARF
348 This family of probe points uses symbolic debugging information for the
349 target kernel/module/program, as may be found in unstripped executa‐
350 bles, or the separate debuginfo packages. They allow placement of
351 probes logically into the execution path of the target program, by
352 specifying a set of points in the source or object code. When a match‐
353 ing statement executes on any processor, the probe handler is run in
354 that context.
355
356 Probe points in the DWARF family can be identified by the target kernel
357 module (or user process), source file, line number, function name, or
358 some combination of these.
359
360 Here is a list of DWARF probe points currently supported:
361
362 kernel.function(PATTERN)
363 kernel.function(PATTERN).call
364 kernel.function(PATTERN).callee(PATTERN)
365 kernel.function(PATTERN).callee(PATTERN).return
366 kernel.function(PATTERN).callee(PATTERN).call
367 kernel.function(PATTERN).callees(DEPTH)
368 kernel.function(PATTERN).return
369 kernel.function(PATTERN).inline
370 kernel.function(PATTERN).label(LPATTERN)
371 module(MPATTERN).function(PATTERN)
372 module(MPATTERN).function(PATTERN).call
373 module(MPATTERN).function(PATTERN).callee(PATTERN)
374 module(MPATTERN).function(PATTERN).callee(PATTERN).return
375 module(MPATTERN).function(PATTERN).callee(PATTERN).call
376 module(MPATTERN).function(PATTERN).callees(DEPTH)
377 module(MPATTERN).function(PATTERN).return
378 module(MPATTERN).function(PATTERN).inline
379 module(MPATTERN).function(PATTERN).label(LPATTERN)
380 kernel.statement(PATTERN)
381 kernel.statement(PATTERN).nearest
382 kernel.statement(ADDRESS).absolute
383 module(MPATTERN).statement(PATTERN)
384 process("PATH").function("NAME")
385 process("PATH").statement("*@FILE.c:123")
386 process("PATH").library("PATH").function("NAME")
387 process("PATH").library("PATH").statement("*@FILE.c:123")
388 process("PATH").library("PATH").statement("*@FILE.c:123").nearest
389 process("PATH").function("*").return
390 process("PATH").function("myfun").label("foo")
391 process("PATH").function("foo").callee("bar")
392 process("PATH").function("foo").callee("bar").return
393 process("PATH").function("foo").callee("bar").call
394 process("PATH").function("foo").callees(DEPTH)
395 process(PID).function("NAME")
396 process(PID).function("myfun").label("foo")
397 process(PID).plt("NAME")
398 process(PID).plt("NAME").return
399 process(PID).statement("*@FILE.c:123")
400 process(PID).statement("*@FILE.c:123").nearest
401 process(PID).statement(ADDRESS).absolute
402
403 (See the USER-SPACE section below for more information on the process
404 probes.)
405
406 The list above includes multiple variants and modifiers which provide
407 additional functionality or filters. They are:
408
409 .function
410 Places a probe near the beginning of the named function,
411 so that parameters are available as context variables.
412
413 .return
414 Places a probe at the moment after the return from the
415 named function, so the return value is available as the
416 "$return" context variable.
417
418 .inline
419 Filters the results to include only instances of inlined
420 functions. Note that inlined functions do not have an
421 identifiable return point, so .return is not supported on
422 .inline probes.
423
.call  Filters the results to include only non-inlined functions
(the opposite set of .inline).
426
427 .exported
428 Filters the results to include only exported functions.
429
430 .statement
431 Places a probe at the exact spot, exposing those local
432 variables that are visible there.
433
434 .statement.nearest
435 Places a probe at the nearest available line number for
436 each line number given in the statement.
437
438 .callee
439 Places a probe on the callee function given in the
440 .callee modifier, where the callee must be a function
441 called by the target function given in .function. The ad‐
442 vantage of doing this over directly probing the callee
443 function is that this probe point is run only when the
444 callee is called from the target function (add the
445 -DSTAP_CALLEE_MATCHALL directive to override this when
446 calling stap(1)).
447
448 Note that only callees that can be statically determined
449 are available. For example, calls through function
450 pointers are not available. Additionally, calls to func‐
451 tions located in other objects (e.g. libraries) are not
452 available (instead use another probe point). This feature
453 will only work for code compiled with GCC 4.7+.
454
455 .callees
456 Shortcut for .callee("*"), which places a probe on all
457 callees of the function.
458
459 .callees(DEPTH)
460 Recursively places probes on callees. For example,
461 .callees(2) will probe both callees of the target func‐
462 tion, as well as callees of those callees. And
463 .callees(3) goes one level deeper, etc... A callee probe
464 at depth N is only triggered when the N callers in the
465 callstack match those that were statically determined
466 during analysis (this also may be overridden using
467 -DSTAP_CALLEE_MATCHALL).
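
A sketch of the callee variants follows. The program path and the
function names are hypothetical; the probes resolve only if the program
was built with debuginfo by a suitable compiler, as noted above.

    # Fires only when parse_header is called from handle_request.
    probe process("/usr/bin/myapp").function("handle_request")
              .callee("parse_header")
    {
        printf("parse_header called from handle_request\n")
    }

    # Fires for the callees of handle_request and for their callees.
    probe process("/usr/bin/myapp").function("handle_request").callees(2)
    {
        printf("callee: %s\n", ppfunc())
    }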
468
In the above list of probe points, MPATTERN stands for a string literal
that aims to identify the loaded kernel module of interest. For in-tree
kernel modules, the name suffices (e.g. "btrfs"). The name may also
include the "*", "[]", and "?" wildcards to match multiple in-tree
modules. Out-of-tree modules are also supported by specifying the full
path to the .ko file; wildcards are not supported in that case. The
file must follow the convention of being named <module_name>.ko
(characters ',' and '-' are replaced by '_').
477
478 LPATTERN stands for a source program label. It may also contain "*",
479 "[]", and "?" wildcards. PATTERN stands for a string literal that aims
480 to identify a point in the program. It is made up of three parts:
481
482 · The first part is the name of a function, as would appear in the nm
483 program's output. This part may use the "*" and "?" wildcarding
484 operators to match multiple names.
485
486 · The second part is optional and begins with the "@" character. It
487 is followed by the path to the source file containing the function,
488 which may include a wildcard pattern, such as mm/slab*. If it does
489 not match as is, an implicit "*/" is optionally added before the
490 pattern, so that a script need only name the last few components of
491 a possibly long source directory path.
492
493 · Finally, the third part is optional if the file name part was giv‐
494 en, and identifies the line number in the source file preceded by a
495 ":" or a "+". The line number is assumed to be an absolute line
496 number if preceded by a ":", or relative to the declaration line of
497 the function if preceded by a "+". All the lines in the function
498 can be matched with ":*". A range of lines x through y can be
499 matched with ":x-y". Ranges and specific lines can be mixed using
500 commas, e.g. ":x,y-z".
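
The three parts of PATTERN can be sketched as follows. The function and
file names are assumptions about the kernel source being probed and may
not resolve on every kernel.

    global hits

    # Part one only: every function whose name starts with "vfs_".
    probe kernel.function("vfs_*").call,
    # Parts one and two: a function in a file named by its trailing path.
          kernel.function("vfs_read@fs/read_write.c"),
    # All three parts: every probeable line of that function.
          kernel.statement("vfs_read@fs/read_write.c:*")
    {
        hits[pp()]++
    }

    # Report the twenty busiest probe points after five seconds.
    probe timer.s(5)
    {
        foreach (p in hits- limit 20)
            printf("%6d %s\n", hits[p], p)
        exit()
    }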
501
502 As an alternative, PATTERN may be a numeric constant, indicating an ad‐
503 dress. Such an address may be found from symbol tables of the appro‐
504 priate kernel / module object file. It is verified against known
505 statement code boundaries, and will be relocated for use at run time.
506
507 In guru mode only, absolute kernel-space addresses may be specified
508 with the ".absolute" suffix. Such an address is considered already re‐
509 located, as if it came from /proc/kallsyms, so it cannot be checked
510 against statement/instruction boundaries.
511
512 CONTEXT VARIABLES
513 Many of the source-level context variables, such as function parame‐
514 ters, locals, globals visible in the compilation unit, may be visible
515 to probe handlers. They may refer to these variables by prefixing
516 their name with "$" within the scripts. In addition, a special syntax
517 allows limited traversal of structures, pointers, and arrays. More
518 syntax allows pretty-printing of individual variables or their groups.
519 See also @cast. Note that variables may be inaccessible due to them
520 being paged out, or for a few other reasons. See also man er‐
521 ror::fault(7stap).
522
523
524 $var refers to an in-scope variable "var". If it's an integer-like
525 type, it will be cast to a 64-bit int for systemtap script use.
526 String-like pointers (char *) may be copied to systemtap string
527 values using the kernel_string or user_string functions.
528
529 @var("varname")
530 an alternative syntax for $varname
531
@var("varname@src/file.c")
refers to the global (either file local or external) variable
varname defined when the file src/file.c was compiled. The CU in
which the variable is resolved is the first CU in the module of
the probe point which matches the given file name at the end and
has the shortest file name path (e.g. given
@var("foo@bar/baz.c") and CUs with file name paths
src/sub/module/bar/baz.c and src/bar/baz.c, the second CU will be
chosen to resolve the (file) global variable foo).
541
$var->field traversal via a structure's or a pointer's field. This
generalized indirection operator may be repeated to follow more
levels. Note that the . operator is not used for plain structure
members; -> serves both purposes. (This is because "." is
reserved for string concatenation.) Also note that to dereference
a $var pointer directly, {kernel,user}_{char,int,...}($var)
should be used. (Refer to stapfuncs(5) for more details.)
549
550 $return
551 is available in return probes only for functions that are de‐
552 clared with a return value, which can be determined using @de‐
553 fined($return).
554
$var[N]
indexes into an array. The index may be given as a literal number
or even an arbitrary numeric expression.
558
559 A number of operators exist for such basic context variable expres‐
560 sions:
561
562 $$vars expands to a character string that is equivalent to
563
564 sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
565 parm1, ..., parmN, var1, ..., varN)
566
567 for each variable in scope at the probe point. Some values may
568 be printed as =? if their run-time location cannot be found.
569
570 $$locals
571 expands to a subset of $$vars for only local variables.
572
573 $$parms
574 expands to a subset of $$vars for only function parameters.
575
576 $$return
577 is available in return probes only. It expands to a string that
578 is equivalent to sprintf("return=%x", $return) if the probed
579 function has a return value, or else an empty string.
580
581 & $EXPR
582 expands to the address of the given context variable expression,
583 if it is addressable.
584
@defined($EXPR)
expands to 1 if the given context variable expression is
resolvable and to 0 otherwise, for use in conditionals such as
588
589 @defined($foo->bar) ? $foo->bar : 0
590
591
592 $EXPR$ expands to a string with all of $EXPR's members, equivalent to
593
594 sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
595 $EXPR->a, $EXPR->b)
596
597
$EXPR$$
expands to a string with all of $EXPR's members and submembers,
equivalent to
601
602 sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
603 $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
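
Several of these forms might be combined as sketched below. The probed
function and the structure fields are assumptions about the target
kernel; they differ between kernel versions.

    probe kernel.function("vfs_read")
    {
        # All parameters as one pretty-printed string.
        printf("parms: %s\n", $$parms)

        # An integer-like parameter is read as a 64-bit number.
        printf("count: %d\n", $count)

        # Follow fields with ->, guarded in case they do not resolve.
        if (@defined($file->f_path->dentry->d_name->name))
            printf("name: %s\n",
                   kernel_string($file->f_path->dentry->d_name->name))

        # Pretty-print the members of the structure $file points to.
        printf("file: %s\n", $file$)
    }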
604
605
606
607 MORE ON RETURN PROBES
608 For the kernel ".return" probes, only a certain fixed number of returns
609 may be outstanding. The default is a relatively small number, on the
610 order of a few times the number of physical CPUs. If many different
611 threads concurrently call the same blocking function, such as futex(2)
612 or read(2), this limit could be exceeded, and skipped "kretprobes"
613 would be reported by "stap -t". To work around this, specify a
614
615 probe FOO.return.maxactive(NNN)
616
617 suffix, with a large enough NNN to cover all expected concurrently
618 blocked threads. Alternately, use the
619
620 stap -DKRETACTIVE=NNNN
621
622 stap command line macro setting to override the default for all ".re‐
623 turn" probes.
624
625
626 For ".return" probes, context variables other than the "$return" may be
627 accessible, as a convenience for a script programmer wishing to access
628 function parameters. These values are snapshots taken at the time of
629 function entry. (Local variables within the function are not generally
630 accessible, since those variables did not exist in allocated/initial‐
631 ized form at the snapshot moment.) These entry-snapshot variables
632 should be accessed via @entry($var).
633
634 In addition, arbitrary entry-time expressions can also be saved for
635 ".return" probes using the @entry(expr) operator. For example, one can
636 compute the elapsed time of a function:
637
638 probe kernel.function("do_filp_open").return {
639 println( get_timeofday_us() - @entry(get_timeofday_us()) )
640 }
641
642
643
644 The following table summarizes how values related to a function parame‐
645 ter context variable, a pointer named addr, may be accessed from a .re‐
646 turn probe.
647
at-entry value     past-exit value

$addr              not available
$addr->x->y        @cast(@entry($addr),"struct zz")->x->y
$addr[0]           {kernel,user}_{char,int,...}(& $addr[0])
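
The following sketch shows both kinds of access from one .return
probe. The probed function, its parameter names, and the f_flags field
are assumptions about the target kernel.

    probe kernel.function("vfs_read").return
    {
        printf("requested=%d returned=%d flags=0x%x\n",
               # Value of the parameter as saved at function entry.
               @entry($count),
               # Return value of the function.
               $return,
               # Entry-time pointer, dereferenced at exit time.
               @cast(@entry($file), "struct file", "kernel")->f_flags)
    }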
653
654
655
656 DWARFLESS
In the absence of debugging information, entry & exit points of kernel
& module functions can be probed using the "kprobe" family of probes.
However, these do not permit looking up the arguments / local variables
of the function. The following constructs are supported:
661
662 kprobe.function(FUNCTION)
663 kprobe.function(FUNCTION).call
664 kprobe.function(FUNCTION).return
665 kprobe.module(NAME).function(FUNCTION)
666 kprobe.module(NAME).function(FUNCTION).call
667 kprobe.module(NAME).function(FUNCTION).return
668 kprobe.statement(ADDRESS).absolute
669
670
671 Probes of type function are recommended for kernel functions, whereas
672 probes of type module are recommended for probing functions of the
673 specified module. In case the absolute address of a kernel or module
674 function is known, statement probes can be utilized.
675
Note that FUNCTION and module NAME values must not contain wildcards,
or the probe will not be registered. Also, statement probes may be run
only in guru mode.
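
A minimal kprobe-based sketch follows. It only counts events, since no
arguments are available. The function name is an assumption and must
match a kernel symbol exactly.

    global entries, returns

    probe kprobe.function("vfs_read")        { entries++ }
    probe kprobe.function("vfs_read").return { returns++ }

    # Report the totals after five seconds and stop.
    probe timer.s(5)
    {
        printf("entries=%d returns=%d\n", entries, returns)
        exit()
    }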
679
680
681
682 USER-SPACE
Support for user-space probing is available for kernels that are
configured with the utrace extensions, or that have the uprobes
facility introduced in Linux 3.5. (Various kernel build configuration
options need to be enabled; systemtap will advise if these are
missing.)
687
688
689 There are several forms. First, a non-symbolic probe point:
690
691 process(PID).statement(ADDRESS).absolute
692
693 is analogous to kernel.statement(ADDRESS).absolute in that both use raw
694 (unverified) virtual addresses and provide no $variables. The target
695 PID parameter must identify a running process, and ADDRESS should iden‐
696 tify a valid instruction address. All threads of that process will be
697 probed.
698
699 Second, non-symbolic user-kernel interface events handled by utrace may
700 be probed:
701
702 process(PID).begin
703 process("FULLPATH").begin
704 process.begin
705 process(PID).thread.begin
706 process("FULLPATH").thread.begin
707 process.thread.begin
708 process(PID).end
709 process("FULLPATH").end
710 process.end
711 process(PID).thread.end
712 process("FULLPATH").thread.end
713 process.thread.end
714 process(PID).syscall
715 process("FULLPATH").syscall
716 process.syscall
717 process(PID).syscall.return
718 process("FULLPATH").syscall.return
719 process.syscall.return
720 process(PID).insn
721 process("FULLPATH").insn
722 process(PID).insn.block
723 process("FULLPATH").insn.block
724
725
726
A process.begin probe gets called when a new process described by PID
or FULLPATH gets created. In addition, it is called once from the
context of each preexisting process, at systemtap script startup. This
is useful to track live processes. A process.thread.begin probe gets
called when a new thread described by PID or FULLPATH gets created. A
process.end probe gets called when a process described by PID or
FULLPATH dies. A process.thread.end probe gets called when a thread
described by PID or FULLPATH dies.

A process.syscall probe gets called when a thread described by PID or
FULLPATH makes a system call. The system call number is available in
the $syscall context variable, and the first 6 arguments of the system
call are available in the $argN context variables (e.g. $arg1, $arg2,
...). A process.syscall.return probe gets called when a thread
described by PID or FULLPATH returns from a system call. The system
call number is available in the $syscall context variable, and the
return value of the system call is available in the $return context
variable.

A process.insn probe gets called for every single-stepped instruction
of the process described by PID or FULLPATH. A process.insn.block probe
gets called for every block-stepped instruction of the process
described by PID or FULLPATH.
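
These events might be traced as in the sketch below. It names no PID or
FULLPATH, so it is meant to be run with a target, for example with
stap's -c option and a command of your choice.

    probe process.begin
    {
        printf("%s[%d] begin\n", execname(), pid())
    }

    probe process.syscall
    {
        printf("%s[%d] syscall %d, arg1=0x%x\n",
               execname(), pid(), $syscall, $arg1)
    }

    probe process.syscall.return
    {
        printf("%s[%d] syscall %d returned %d\n",
               execname(), pid(), $syscall, $return)
    }

    probe process.end
    {
        printf("%s[%d] end\n", execname(), pid())
    }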
746
747
748 If a process probe is specified without a PID or FULLPATH, all user
749 threads will be probed. However, if systemtap was invoked with the -c
750 or -x options, then process probes are restricted to the process hier‐
751 archy associated with the target process. If a process probe is un‐
752 specified (i.e. without a PID or FULLPATH), but with the -c option, the
753 PATH of the -c cmd will be heuristically filled into the process PATH.
754 In that case, only command parameters are allowed in the -c command
755 (i.e. no command substitution allowed and no occurrences of any of
756 these characters: '|&;<>(){}').
757
758
759 Third, symbolic static instrumentation compiled into programs and
760 shared libraries may be probed:
761
762 process("PATH").mark("LABEL")
763 process("PATH").provider("PROVIDER").mark("LABEL")
764 process(PID).mark("LABEL")
765 process(PID).provider("PROVIDER").mark("LABEL")
766
767
768 A .mark probe gets called via a static probe which is defined in the
769 application by STAP_PROBE1(PROVIDER,LABEL,arg1), one of the macros de‐
770 fined in sys/sdt.h. The PROVIDER is an arbitrary application identifi‐
771 er, LABEL is the marker site identifier, and arg1 is the integer-typed
772 argument. STAP_PROBE1 is used for probes with 1 argument, STAP_PROBE2
773 is used for probes with 2 arguments, and so on. The arguments of the
774 probe are available in the context variables $arg1, $arg2, ... An al‐
775 ternative to using the STAP_PROBE macros is to use the dtrace script to
776 create custom macros. Additionally, the variables $$name and
777 $$provider are available as parts of the probe point name. The
778 sys/sdt.h macro names DTRACE_PROBE* are available as aliases for
779 STAP_PROBE*.
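
A hedged sketch of a .mark probe follows. The executable path, the provider name "myprovider" and the label "request_start" are hypothetical; they assume the application contains a call such as STAP_PROBE1(myprovider, request_start, id).

```systemtap
# Fires each time the (hypothetical) static marker is hit.
# $arg1 is the first marker argument; $$provider and $$name
# identify the marker that triggered the handler.
probe process("/usr/bin/myapp").provider("myprovider").mark("request_start") {
  printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
}
```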
780
781
782 Finally, full symbolic source-level probes in user-space programs and
783 shared libraries are supported. These are exactly analogous to the
784 symbolic DWARF-based kernel/module probes described above. They expose
785 the same sorts of context $variables for function parameters, local
786 variables, and so on.
787
788 process("PATH").function("NAME")
789 process("PATH").statement("*@FILE.c:123")
790 process("PATH").plt("NAME")
791 process("PATH").library("PATH").plt("NAME")
792 process("PATH").library("PATH").function("NAME")
793 process("PATH").library("PATH").statement("*@FILE.c:123")
794 process("PATH").function("*").return
795 process("PATH").function("myfun").label("foo")
796 process("PATH").function("foo").callee("bar")
797 process("PATH").plt("NAME").return
798 process(PID).function("NAME")
799 process(PID).statement("*@FILE.c:123")
800 process(PID).plt("NAME")
801
802
803
804 Note that for all process probes, PATH names refer to executables that
805 are searched the same way shells do: relative to the working directory
806 if they contain a "/" character, otherwise in $PATH. If PATH names re‐
807 fer to scripts, the actual interpreters (specified in the script in the
808 first line after the #! characters) are probed.
809
810
811 Tapset process probes placed in the special directory $pre‐
812 fix/share/systemtap/tapset/PATH/ with relative paths will have their
813 process parameter prefixed with the location of the tapset. For exam‐
814 ple,
815
816
817 process("foo").function("NAME")
818
819
820 expands to
821
822 process("/usr/bin/foo").function("NAME")
823
824
825
826 when placed in $prefix/share/systemtap/tapset/PATH/usr/bin/
827
828
829 If PATH is a process component parameter referring to shared libraries
830 then all processes that map it at runtime would be selected for prob‐
831 ing. If PATH is a library component parameter referring to shared li‐
832 braries then the process specified by the process component would be
833 selected. Note that the PATH pattern in a library component will al‐
834 ways apply to libraries statically determined to be in use by the
835 process. However, you may also specify the full path to any library
836 file even if not statically needed by the process.
837
838
839 A .plt probe will probe functions in the program linkage table corre‐
840 sponding to the rest of the probe point. .plt can be specified as a
841 shorthand for .plt("*"). The symbol name is available as a $$name con‐
842 text variable; function arguments are not available, since PLTs are
843 processed without debuginfo. A .plt.return probe places a probe at the
844 moment after the return from the named function.
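
The following sketch counts calls made through the program linkage table, using only the $$name context variable described above. The path /bin/ls is an illustrative assumption.

```systemtap
global calls

# .plt is shorthand for .plt("*"): every PLT entry of the target.
probe process("/bin/ls").plt {
  calls[$$name]++
}

# Print the counts in descending order when the session ends.
probe end {
  foreach (name in calls-)
    printf("%-24s %d\n", name, calls[name])
}
```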
845
846
847 If the PATH string contains wildcards as in the MPATTERN case, then
848 standard globbing is performed to find all matching paths. In this
849 case, the $PATH environment variable is not used.
850
851
852 If systemtap was invoked with the -c or -x options, then process probes
853 are restricted to the process hierarchy associated with the target
854 process.
855
856
857 JAVA
858 Support for probing Java methods is available using Byteman as a back‐
859 end. Byteman is an instrumentation tool from the JBoss project which
860 systemtap can use to monitor invocations for a specific method or line
861 in a Java program.
862
863 Systemtap does so by generating a Byteman script listing the probes to
864 instrument and then invoking the Byteman bminstall utility.
865
866 This Java instrumentation support is currently a prototype feature with
867 major limitations. Moreover, Java probing currently does not work
868 across users; the stap script must run (with appropriate permissions)
869 under the same user as the Java process being probed. (Thus a stap
870 script under root currently cannot probe Java methods in a non-root-us‐
871 er Java process.)
872
873
874 The first probe type refers to Java processes by the name of the Java
875 process:
876
877 java("PNAME").class("CLASSNAME").method("PATTERN")
878 java("PNAME").class("CLASSNAME").method("PATTERN").return
879
880 The PNAME argument must be a pre-existing jvm pid, and be identifiable
881 via a jps listing.
882
883 The PATTERN parameter specifies the signature of the Java method to
884 probe. The signature must consist of the exact name of the method, fol‐
885 lowed by a bracketed list of the types of the arguments, for instance
886 "myMethod(int,double,Foo)". Wildcards are not supported.
887
888 The probe can be set to trigger at a specific line within the method by
889 appending a line number with colon, just as in other types of probes:
890 "myMethod(int,double,Foo):245".
891
892 The CLASSNAME parameter identifies the Java class the method belongs
893 to, either with or without the package qualification. By default, the
894 probe only triggers on descendants of the class that do not override
895 the method definition of the original class. However, CLASSNAME can
896 take an optional caret prefix, as in ^org.my.MyClass, which specifies
897 that the probe should also trigger on all descendants of MyClass that
898 override the original method. For instance, every method with signature
899 foo(int) in program org.my.MyApp can be probed at once using
900
901 java("org.my.MyApp").class("^java.lang.Object").method("foo(int)")
902
903
904 The second probe type works analogously, but refers to Java processes
905 by PID:
906
907 java(PID).class("CLASSNAME").method("PATTERN")
908 java(PID).class("CLASSNAME").method("PATTERN").return
909
910 (PIDs for an already running process can be obtained using the jps(1)
911 utility.)
912
913 Context variables defined within java probes include $arg1 through
914 $arg10 (for up to the first 10 arguments of a method), represented as
915 character-pointers for the toString() form of each actual argument.
916 The arg1 through arg10 script variables provide access to these as or‐
917 dinary strings, fetched via user_string_warn().
918
919 Prior to systemtap version 3.1, $arg1 through $arg10 could contain ei‐
920 ther integers or character pointers, depending on the types of the ob‐
921 jects being passed to each particular java method. This previous be‐
922 haviour may be invoked with the stap --compatible=3.0 flag.
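
As an illustration only, a Java method probe might look like the sketch below. The process name org.my.MyApp, the class org.my.MyClass and the method signature are taken from the examples in the text or assumed for illustration; arg1 through arg3 are the string-valued script variables described above.

```systemtap
# Entry probe: print the toString() form of the first three arguments.
probe java("org.my.MyApp").class("org.my.MyClass").method("myMethod(int,double,Foo)") {
  printf("myMethod(%s, %s, %s)\n", arg1, arg2, arg3)
}

# Return probe for the same method.
probe java("org.my.MyApp").class("org.my.MyClass").method("myMethod(int,double,Foo)").return {
  printf("myMethod returned\n")
}
```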
923
924
925 PROCFS
926 These probe points allow procfs "files" in /proc/systemtap/MODNAME to
927 be created, read and written using a permission that may be modified
928 using the proper umask value. Default permissions are 0400 for read
929 probes, and 0200 for write probes. If both a read and write probe are
930 being used on the same file, a default permission of 0600 will be used.
931 Using procfs.umask(0040).read would result in a 0404 permission set for
932 the file. (MODNAME is the name of the systemtap module). The proc
933 filesystem is a pseudo-filesystem which is used as an interface to ker‐
934 nel data structures. There are several probe point variants supported
935 by the translator:
936
937
938 procfs("PATH").read
939 procfs("PATH").umask(UMASK).read
940 procfs("PATH").read.maxsize(MAXSIZE)
941 procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
942 procfs("PATH").write
943 procfs("PATH").umask(UMASK).write
944 procfs.read
945 procfs.umask(UMASK).read
946 procfs.read.maxsize(MAXSIZE)
947 procfs.umask(UMASK).read.maxsize(MAXSIZE)
948 procfs.write
949 procfs.umask(UMASK).write
950
951
952 Note that there are a few differences when procfs probes are used in
953 the stapbpf runtime. FIFO special files are used instead of proc
954 filesystem files. These files are created in /var/tmp/systemtap-US‐
955 ER/MODNAME. (USER is the name of the user). Additionally, users can‐
956 not create both read and write probes on the same file.
957
958 PATH is the file name (relative to /proc/systemtap/MODNAME or
959 /var/tmp/systemtap-USER/MODNAME) to be created. If no PATH is speci‐
960 fied (as in the last six variants above), PATH defaults to "command".
961 The file name "__stdin" is used internally by systemtap for input
962 probes and should not be used as a PATH for procfs probes; see the in‐
963 put probe section below.
964
965 When a user reads /proc/systemtap/MODNAME/PATH (normal runtime) or
966 /var/tmp/systemtap-USER/MODNAME (stapbpf runtime), the corresponding
967 procfs read probe is triggered. The string data to be read should be
968 assigned to a variable named $value, like this:
969
970
971 procfs("PATH").read { $value = "100\n" }
972
973
974 When a user writes into /proc/systemtap/MODNAME/PATH (normal runtime)
975 or /var/tmp/systemtap-USER/MODNAME (stapbpf runtime), the corresponding
976 procfs write probe is triggered. The data the user wrote is available
977 in the string variable named $value, like this:
978
979
980 procfs("PATH").write { printf("user wrote: %s", $value) }
981
982
983 MAXSIZE is the size of the procfs read buffer. Specifying MAXSIZE al‐
984 lows larger procfs output. If no MAXSIZE is specified, the procfs read
985 buffer defaults to STP_PROCFS_BUFSIZE (which defaults to MAXSTRINGLEN,
986 the maximum length of a string). If setting the procfs read buffers
987 for more than one file is needed, it may be easiest to override the
988 STP_PROCFS_BUFSIZE definition. Here's an example of using MAXSIZE:
989
990
991 procfs.read.maxsize(1024) {
992 $value = "long string..."
993 $value .= "another long string..."
994 $value .= "another long string..."
995 $value .= "another long string..."
996 }
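
A read probe and a write probe on the same file can be combined to expose a tunable value, as in the following sketch. The file name "threshold" and the global variable are hypothetical; strtol() is the standard tapset string-to-number function. As noted above, this combination is not available in the stapbpf runtime.

```systemtap
global threshold = 100

# Reading /proc/systemtap/MODNAME/threshold returns the current value.
probe procfs("threshold").read {
  $value = sprintf("%d\n", threshold)
}

# Writing a decimal number to the same file updates it.
probe procfs("threshold").write {
  threshold = strtol($value, 10)
}
```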
997
998
999
1000 INPUT
1001 These probe points make input from stdin available to the script during
1002 runtime. The translator currently supports two variants of this fami‐
1003 ly:
1004
1005 input.char
1006 input.line
1007
1008
1009 input.char is triggered each time a character is read from stdin. The
1010 current character is available in the string variable named char.
1011 There is no newline buffering; the next character is read from stdin as
1012 soon as it becomes available.
1013
1014 input.line causes all characters read from stdin to be buffered until a
1015 newline is read, at which point the probe will be triggered. The cur‐
1016 rent line of characters (including the newline) is made available in a
1017 string variable named line. Note that no more than MAXSTRINGLEN char‐
1018 acters will be buffered. Any additional characters will not be included
1019 in line.
1020
1021
1022 Input probes are aliases for procfs("__stdin").write. Systemtap recon‐
1023 figures stdin if the presence of this procfs probe is detected, there‐
1024 fore "__stdin" should not be used as a path argument for procfs probes.
1025 Additionally, input probes will not work with the -F and --remote op‐
1026 tions.
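
A minimal sketch of an input.line handler, assuming the script is run interactively: it echoes each line and exits when the user types "quit".

```systemtap
probe input.line {
  # line includes the trailing newline
  if (line == "quit\n")
    exit()
  printf("got: %s", line)
}
```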
1027
1028
1029 NETFILTER HOOKS
1030 These probe points allow observation of network packets using the net‐
1031 filter mechanism. A netfilter probe in systemtap corresponds to a net‐
1032 filter hook function in the original netfilter probes API. It is proba‐
1033 bly more convenient to use tapset::netfilter(3stap), which wraps the
1034 primitive netfilter hooks and does the work of extracting useful infor‐
1035 mation from the context variables.
1036
1037
1038 There are several probe point variants supported by the translator:
1039
1040
1041 netfilter.hook("HOOKNAME").pf("PROTOCOL_F")
1042 netfilter.pf("PROTOCOL_F").hook("HOOKNAME")
1043 netfilter.hook("HOOKNAME").pf("PROTOCOL_F").priority("PRIORITY")
1044 netfilter.pf("PROTOCOL_F").hook("HOOKNAME").priority("PRIORITY")
1045
1046
1047
1048 PROTOCOL_F is the protocol family to listen for, currently one of NF‐
1049 PROTO_IPV4, NFPROTO_IPV6, NFPROTO_ARP, or NFPROTO_BRIDGE.
1050
1051
1052 HOOKNAME is the point, or 'hook', in the protocol stack at which to in‐
1053 tercept the packet. The available hook names for each protocol family
1054 are taken from the kernel header files <linux/netfilter_ipv4.h>, <lin‐
1055 ux/netfilter_ipv6.h>, <linux/netfilter_arp.h> and <linux/netfil‐
1056 ter_bridge.h>. For instance, allowable hook names for NFPROTO_IPV4 are
1057 NF_INET_PRE_ROUTING, NF_INET_LOCAL_IN, NF_INET_FORWARD, NF_INET_LO‐
1058 CAL_OUT, and NF_INET_POST_ROUTING.
1059
1060
1061 PRIORITY is an integer priority giving the order in which the probe
1062 point should be triggered relative to any other netfilter hook func‐
1063 tions which trigger on the same packet. Hook functions execute on each
1064 packet in order from smallest priority number to largest priority num‐
1065 ber. If no PRIORITY is specified (as in the first two probe point vari‐
1066 ants above), PRIORITY defaults to "0".
1067
1068 There are a number of predefined priority names of the form NF_IP_PRI_*
1069 and NF_IP6_PRI_* which are defined in the kernel header files <lin‐
1070 ux/netfilter_ipv4.h> and <linux/netfilter_ipv6.h> respectively. The
1071 script is permitted to use these instead of specifying an integer pri‐
1072 ority. (The probe points for NFPROTO_ARP and NFPROTO_BRIDGE currently
1073 do not expose any named hook priorities to the script writer.) Thus,
1074 allowable ways to specify the priority include:
1075
1076
1077 priority("255")
1078 priority("NF_IP_PRI_SELINUX_LAST")
1079
1080
1081 A script using guru mode is permitted to specify any identifier or num‐
1082 ber as the parameter for hook, pf, and priority. This feature should be
1083 used with caution, as the parameter is inserted verbatim into the C
1084 code generated by systemtap.
1085
1086 The netfilter probe points define the following context variables:
1087
1088 $hooknum
1089 The hook number.
1090
1091 $skb The address of the sk_buff struct representing the packet. See
1092 <linux/skbuff.h> for details on how to use this struct, or al‐
1093 ternatively use the tapset tapset::netfilter(3stap) for easy ac‐
1094 cess to key information.
1095
1096
1097 $in The address of the net_device struct representing the network
1098 device on which the packet was received (if any). May be 0 if
1099 the device is unknown or undefined at that stage in the protocol
1100 stack.
1101
1102
1103 $out The address of the net_device struct representing the network
1104 device on which the packet will be sent (if any). May be 0 if
1105 the device is unknown or undefined at that stage in the protocol
1106 stack.
1107
1108
1109 $verdict
1110 (Guru mode only.) Assigning one of the verdict values defined in
1111 <linux/netfilter.h> to this variable alters the further progress
1112 of the packet through the protocol stack. For instance, the fol‐
1113 lowing guru mode script forces all ipv6 network packets to be
1114 dropped:
1115
1116
1117 probe netfilter.pf("NFPROTO_IPV6").hook("NF_IP6_PRE_ROUTING") {
1118 $verdict = 0 /* nf_drop */
1119 }
1120
1121
1122 For convenience, unlike the primitive probe points discussed
1123 here, the probes defined in tapset::netfilter(3stap) export the
1124 lowercase names of the verdict constants (e.g. NF_DROP becomes
1125 nf_drop) as local variables.
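
For pure observation, no guru mode is needed. The sketch below counts IPv4 packets seen at the pre-routing hook and reports the total every five seconds; the variable name and the reporting interval are arbitrary choices for illustration.

```systemtap
global packets

# Count every IPv4 packet arriving at the pre-routing hook.
probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_PRE_ROUTING") {
  packets++
}

# Report and reset the counter periodically.
probe timer.s(5) {
  printf("%d IPv4 packets in the last 5 seconds\n", packets)
  packets = 0
}
```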
1126
1127
1128 KERNEL TRACEPOINTS
1129 This family of probe points hooks up to static probing tracepoints in‐
1130 serted into the kernel or modules. As with markers, these tracepoints
1131 are special macro calls inserted by kernel developers to make probing
1132 faster and more reliable than with DWARF-based probes, and DWARF debug‐
1133 ging information is not required to probe tracepoints. Tracepoints
1134 have an extra advantage of more strongly-typed parameters than markers.
1135
1136 Tracepoint probes look like: kernel.trace("name"). The tracepoint name
1137 string, which may contain the usual wildcard characters, is matched
1138 against the names defined by the kernel developers in the tracepoint
1139 header files. To restrict the search to specific subsystems (e.g.
1140 sched, ext3, etc...), the following syntax can be used: ker‐
1141 nel.trace("system:name"). The tracepoint system string may also con‐
1142 tain the usual wildcard characters.
1143
1144 The handler associated with a tracepoint-based probe may read the op‐
1145 tional parameters specified at the macro call site. These are named
1146 according to the declaration by the tracepoint author. For example,
1147 the tracepoint probe kernel.trace("sched:sched_switch") provides the
1148 parameters $prev and $next. If the parameter is a complex type, as in
1149 a struct pointer, then a script can access fields with the same syntax
1150 as DWARF $target variables. Also, tracepoint parameters cannot be mod‐
1151 ified, but in guru-mode a script may modify fields of parameters.
1152
1153 The subsystem and name of the tracepoint are available in $$system and
1154 $$name and a string of name=value pairs for all parameters of the tra‐
1155 cepoint is available in $$vars or $$parms.
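
As a sketch of the parameter access described above, the following handler prints the process IDs involved in each context switch. It assumes the $prev and $next parameters are task_struct pointers with a pid field, as in mainline kernels.

```systemtap
probe kernel.trace("sched:sched_switch") {
  printf("%s:%s %d -> %d\n", $$system, $$name, $prev->pid, $next->pid)
}
```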
1156
1157
1158 KERNEL MARKERS (OBSOLETE)
1159 This family of probe points hooks up to an older style of static prob‐
1160 ing markers inserted into older kernels or modules. These markers are
1161 special STAP_MARK macro calls inserted by kernel developers to make
1162 probing faster and more reliable than with DWARF-based probes. Fur‐
1163 ther, DWARF debugging information is not required to probe markers.
1164
1165 Marker probe points begin with kernel. The next part names the marker
1166 itself: mark("name"). The marker name string, which may contain the
1167 usual wildcard characters, is matched against the names given to the
1168 marker macros when the kernel and/or module was compiled. Optional‐
1169 ly, you can specify format("format"). Specifying the marker format
1170 string allows differentiation between two markers with the same name
1171 but different marker format strings.
1172
1173 The handler associated with a marker-based probe may read the optional
1174 parameters specified at the macro call site. These are named $arg1
1175 through $argNN, where NN is the number of parameters supplied by the
1176 macro. Number and string parameters are passed in a type-safe manner.
1177
1178 The marker format string associated with a marker is available in $for‐
1179 mat, and the marker name string is available in $name.
1180
1181
1182 HARDWARE BREAKPOINTS
1183 This family of probes is used to set hardware watchpoints for a given
1184 (global) kernel symbol. The probes take three components as inputs:
1185
1186 1. The virtual address / name of the kernel symbol to be traced is sup‐
1187 plied as argument to this class of probes. (Only probes for data seg‐
1188 ment variables are supported. Probing local variables of a function
1189 cannot be done.)
1190
1191 2. Nature of access to be probed: a. A .write probe gets triggered when
1192 a write happens at the specified address/symbol name. b. An .rw probe
1193 is triggered when either a read or write happens.
1194
1195 3. .length (optional) Users have the option of specifying the address
1196 interval to be probed using "length" constructs. The user-specified
1197 length gets approximated to the closest possible address length that
1198 the architecture can support. If the specified length exceeds the lim‐
1199 its imposed by architecture, an error message is flagged and probe reg‐
1200 istration fails. Wherever 'length' is not specified, the translator
1201 requests a hardware breakpoint probe of length 1. It should be noted
1202 that the "length" construct is not valid with symbol names.
1203
1204 The following constructs are supported:
1205
1206 probe kernel.data(ADDRESS).write
1207 probe kernel.data(ADDRESS).rw
1208 probe kernel.data(ADDRESS).length(LEN).write
1209 probe kernel.data(ADDRESS).length(LEN).rw
1210 probe kernel.data("SYMBOL_NAME").write
1211 probe kernel.data("SYMBOL_NAME").rw
1212
1213
1214 This set of probes makes use of the debug registers of the processor,
1215 which are a scarce resource (4 on x86, 1 on powerpc). The script
1216 translation flags a warning if a user requests more hardware breakpoint
1217 probes than the limits set by architecture. For example, a pass-2 warn‐
1218 ing is issued when an input script requests 5 hardware breakpoint
1219 probes on an x86 system while x86 architecture supports a maximum of 4
1220 breakpoints. Users are cautioned to set probes judiciously.
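
A minimal sketch of a hardware breakpoint probe, using the pid_max kernel symbol as the watched variable. It reports which process was running when the variable was written.

```systemtap
probe kernel.data("pid_max").write {
  printf("pid_max written by %s(%d)\n", execname(), pid())
}
```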
1221
1222
1223 PERF
1224 This family of probe points interfaces to the kernel "perf event" in‐
1225 frastructure for controlling hardware performance counters. The events
1226 being attached to are described by the "type" and "config" fields of the
1227 perf_event_attr structure, and are sampled at an interval governed by
1228 the "sample_period" and "sample_freq" fields.
1229
1230 These fields are made available to systemtap scripts using the follow‐
1231 ing syntax:
1232
1233 probe perf.type(NN).config(MM).sample(XX)
1234 probe perf.type(NN).config(MM).hz(XX)
1235 probe perf.type(NN).config(MM)
1236 probe perf.type(NN).config(MM).process("PROC")
1237 probe perf.type(NN).config(MM).counter("COUNTER")
1238 probe perf.type(NN).config(MM).process("PROC").counter("NAME")
1239
1240 The systemtap probe handler is called once per XX increments of the un‐
1241 derlying performance counter when using the .sample field or at a fre‐
1242 quency in hertz when using the .hz field. When not specified, the de‐
1243 fault behavior is to sample at a count of 1000000. The range of valid
1244 type/config is described by the perf_event_open(2) system call, and/or
1245 the linux/perf_event.h file. Invalid combinations or exhausted hard‐
1246 ware counter resources result in errors during systemtap script start‐
1247 up. Systemtap does not sanity-check the values: it merely passes them
1248 through to the kernel for error- and safety-checking. By default the
1249 perf event probe is systemwide unless .process is specified, which will
1250 bind the probe to a specific task. If the name is omitted then it is
1251 inferred from the stap -c argument. A perf event can be read on de‐
1252 mand using .counter. The body of the perf probe handler will not be
1253 invoked for a .counter probe; instead, the counter is read in a user
1254 space probe via:
1255
1256 process("PROC").statement("func@file") {stat <<< @perf("NAME")}
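
As a hedged example of a sampling perf probe, the sketch below profiles by CPU cycles. It assumes type 0 (PERF_TYPE_HARDWARE) and config 0 (PERF_COUNT_HW_CPU_CYCLES) as defined in linux/perf_event.h; whether the event is available depends on the hardware.

```systemtap
global samples

# Fire once per 1000000 CPU cycles and attribute the sample
# to whichever program is currently running.
probe perf.type(0).config(0).sample(1000000) {
  samples[execname()]++
}

# Print the ten most frequently sampled programs at exit.
probe end {
  foreach (name in samples- limit 10)
    printf("%-20s %d\n", name, samples[name])
}
```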
1257
1258
1259
1260 PYTHON
1261 Support for probing python 2 and python 3 functions is available with
1262 the help of an extra python support module. Note that the debuginfo for
1263 the version of python being probed is required. To run a python script
1264 with the extra python support module you'd add the '-m HelperSDT' op‐
1265 tion to your python command, like this:
1266
1267 stap foo.stp -c "python -m HelperSDT foo.py"
1268
1269 Python probes look like the following:
1270
1271 python2.module("MPATTERN").function("PATTERN")
1272 python2.module("MPATTERN").function("PATTERN").call
1273 python2.module("MPATTERN").function("PATTERN").return
1274 python3.module("MPATTERN").function("PATTERN")
1275 python3.module("MPATTERN").function("PATTERN").call
1276 python3.module("MPATTERN").function("PATTERN").return
1277
1278 The list above includes multiple variants and modifiers which provide
1279 additional functionality or filters. They are:
1280
1281 .function
1282 Places a probe at the beginning of the named function by
1283 default, unless modified by PATTERN. Parameters are
1284 available as context variables.
1285
1286 .call Places a probe at the beginning of the named function.
1287 Parameters are available as context variables.
1288
1289 .return
1290 Places a probe at the moment before the return from the
1291 named function. Parameters and local/global python vari‐
1292 ables are available as context variables.
1293
1294 PATTERN stands for a string literal that aims to identify a point in
1295 the python program. It is made up of three parts:
1296
1297 · The first part is the name of a function (e.g. "foo") or class
1298 method (e.g. "bar.baz"). This part may use the "*" and "?" wild‐
1299 carding operators to match multiple names.
1300
1301 · The second part is optional and begins with the "@" character. It
1302 is followed by the path to the source file containing the function,
1303 which may include a wildcard pattern. The python path is searched
1304 for a matching filename.
1305
1306 · Finally, the third part is optional if the file name part was giv‐
1307 en, and identifies the line number in the source file preceded by a
1308 ":" or a "+". The line number is assumed to be an absolute line
1309 number if preceded by a ":", or relative to the declaration line of
1310 the function if preceded by a "+". All the lines in the function
1311 can be matched with ":*". A range of lines x through y can be
1312 matched with ":x-y". Ranges and specific lines can be mixed using
1313 commas, e.g. ":x,y-z".
1314
1315 In the above list of probe points, MPATTERN stands for a python module
1316 or script name that names the python module of interest. This part may
1317 use the "*" and "?" wildcarding operators to match multiple names. The
1318 python path is searched for a matching filename.
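
A sketch of python probes, assuming a hypothetical script foo.py that defines a function bar and is launched with the HelperSDT module as shown earlier in this section:

```systemtap
# Fires on entry to bar() in module foo.
probe python3.module("foo").function("bar").call {
  printf("entering bar\n")
}

# Fires just before bar() returns.
probe python3.module("foo").function("bar").return {
  printf("leaving bar\n")
}
```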
1319
1320
1321
1323 Here are some example probe points, defining the associated events.
1324
1325 begin, end, end
1326 refers to the startup and normal shutdown of the session. In
1327 this case, the handler would run once during startup and twice
1328 during shutdown.
1329
1330 timer.jiffies(1000).randomize(200)
1331 refers to a periodic interrupt, every 1000 +/- 200 jiffies.
1332
1333 kernel.function("*init*"), kernel.function("*exit*")
1334 refers to all kernel functions with "init" or "exit" in the
1335 name.
1336
1337 kernel.function("*@kernel/time.c:240")
1338 refers to any functions within the "kernel/time.c" file that
1339 span line 240. Note that this is not a probe at the statement
1340 at that line number. Use the kernel.statement probe instead.
1341
1342 kernel.trace("sched_*")
1343 refers to all scheduler-related (really, prefixed) tracepoints
1344 in the kernel.
1345
1346 kernel.mark("getuid")
1347 refers to an obsolete STAP_MARK(getuid, ...) macro call in the
1348 kernel.
1349
1350 module("usb*").function("*sync*").return
1351 refers to the moment of return from all functions with "sync" in
1352 the name in any of the USB drivers.
1353
1354 kernel.statement(0xc0044852)
1355 refers to the first byte of the statement whose compiled in‐
1356 structions include the given address in the kernel.
1357
1358 kernel.statement("*@kernel/time.c:296")
1359 refers to the statement of line 296 within "kernel/time.c".
1360
1361 kernel.statement("bio_init@fs/bio.c+3")
1362 refers to the statement at line bio_init+3 within "fs/bio.c".
1363
1364 kernel.data("pid_max").write
1365 refers to a hardware breakpoint of type "write" set on pid_max
1366
1367 syscall.*.return
1368 refers to the group of probe aliases with any name in the second
1369 position
1370
1371
1373 stap(1),
1374 probe::*(3stap),
1375 tapset::*(3stap)
1376
1377
1378
1379
1380 STAPPROBES(3stap)