STAPPROBES(3stap)                                            STAPPROBES(3stap)

NAME
stapprobes - systemtap probe points

DESCRIPTION
11 The following sections enumerate the variety of probe points supported
12 by the systemtap translator, and some of the additional aliases defined
13 by standard tapset scripts. Many are individually documented in the
14 3stap manual section, with the probe:: prefix.
15
16 The general probe point syntax is a dotted-symbol sequence. This
17 allows a breakdown of the event namespace into parts, somewhat like the
18 Domain Name System does on the Internet. Each component identifier may
19 be parametrized by a string or number literal, with a syntax like a
20 function call. A component may include a "*" character, to expand to a
21 set of matching probe points. It may also include "**" to match multi‐
22 ple sequential components at once. Probe aliases likewise expand to
23 other probe points. Each and every resulting probe point is normally
24 resolved to some low-level system instrumentation facility (e.g., a
25 kprobe address, marker, or a timer configuration), otherwise the elabo‐
26 ration phase will fail.
27
28 However, a probe point may be followed by a "?" character, to indicate
29 that it is optional, and that no error should result if it fails to
30 resolve. Optionalness passes down through all levels of alias/wildcard
31 expansion. Alternately, a probe point may be followed by a "!" charac‐
32 ter, to indicate that it is both optional and sufficient. (Think
33 vaguely of the Prolog cut operator.) If it does resolve, then no fur‐
34 ther probe points in the same comma-separated list will be resolved.
35 Therefore, the "!" sufficiency mark only makes sense in a list of
36 probe point alternatives.
37
Additionally, a probe point may be followed by an "if (expr)" clause,
in order to enable or disable the probe point on the fly. If "expr" is
false when the probe point is hit, the whole probe body, including the
bodies of any aliases, is skipped. The condition is stacked up through
all levels of alias/wildcard expansion, so the final condition is the
logical AND of the conditions of all expanded aliases and wildcards.
45
These are all syntactically valid probe points. (Whether they are also
semantically valid depends on the contents of the tapsets and on the
versions of kernel/user software installed.)
49
50 kernel.function("foo").return
51 process("/bin/vi").statement(0x2222)
52 end
53 syscall.*
54 sys**open
55 kernel.function("no_such_function") ?
56 module("awol").function("no_such_function") !
57 signal.*? if (switch)
58 kprobe.function("foo")
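
As a minimal sketch of how the "?", "!" and "if" suffixes look in complete
probe declarations, consider the script below. The kernel function name
vfs_read and the global variable enabled are assumptions chosen for
illustration; whether each probe point resolves depends on the installed
kernel and debuginfo.

```systemtap
global enabled = 1   # hypothetical switch; another probe could toggle it

# Sufficiency: try the DWARF-based probe first. The kprobe alternative
# is resolved only if the first one fails to resolve.
probe kernel.function("vfs_read") !,
      kprobe.function("vfs_read")
{
  printf("hit %s\n", pp())
}

# Optional: no error results if this function does not exist.
probe kernel.function("no_such_function") ?
{
  printf("unexpected hit at %s\n", pp())
}

# Conditional: the handler is skipped while enabled is 0.
probe timer.s(1) if (enabled)
{
  printf("still enabled\n")
}
```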
59
Probes may be broadly classified into "synchronous" and "asynchronous".
A "synchronous" event is deemed to occur when any processor executes an
instruction matched by the specification. This gives these probes a
reference point (instruction address) from which more contextual data
may be available. Other families of probe points refer to "asynchronous"
events, such as timers or counters rolling over, for which there is no
fixed reference point. Each probe point specification may match multiple
locations (for example, using wildcards or aliases), and all of them are
then probed. A probe declaration may also contain several
comma-separated specifications, all of which are probed.
70
71
73 Resolving some probe points requires DWARF debuginfo or "debug symbols"
74 for the specific part being instrumented. For some others, DWARF is
75 automatically synthesized on the fly from source code header files.
76 For others, it is not needed at all. Since a systemtap script may use
77 any mixture of probe points together, the union of their DWARF require‐
78 ments has to be met on the computer where script compilation occurs.
79 (See the --use-server option and the stap-server(8) man page for infor‐
80 mation about the remote compilation facility, which allows these re‐
81 quirements to be met on a different machine.)
82
The following table lists many of the available probe point families,
classified by their need for DWARF debuginfo.

DWARF                          AUTO-DWARF     NON-DWARF

kernel.function, .statement    kernel.trace   kernel.mark
module.function, .statement                   process.mark
process.function, .statement                  begin, end, error, never
process.mark (backup)                         timer
                                              perf
                                              procfs
                                              kernel.statement.absolute
                                              kernel.data
                                              kprobe.function
                                              process.statement.absolute
                                              process.begin, .end, .error
100
101
103 BEGIN/END/ERROR
104 The probe points begin and end are defined by the translator to refer
105 to the time of session startup and shutdown. All "begin" probe han‐
106 dlers are run, in some sequence, during the startup of the session.
107 All global variables will have been initialized prior to this point.
108 All "end" probes are run, in some sequence, during the normal shutdown
109 of a session, such as in the aftermath of an exit () function call, or
110 an interruption from the user. In the case of an error-triggered shut‐
111 down, "end" probes are not run. There are no target variables avail‐
112 able in either context.
113
114 If the order of execution among "begin" or "end" probes is significant,
115 then an optional sequence number may be provided:
116
117 begin(N)
118 end(N)
119
120 The number N may be positive or negative. The probe handlers are run
121 in increasing order, and the order between handlers with the same se‐
122 quence number is unspecified. When "begin" or "end" are given without
123 a sequence, they are effectively sequence zero.
124
The error probe point is similar to the end probe, except that each
such probe handler is run when the session ends after errors have
occurred. In such cases, "end" probes are skipped, but each "error"
probe is still attempted. This kind of probe can be used to clean up
or emit a "final gasp". It may also be numerically parametrized to set
a sequence.
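
The sequencing rules above can be sketched with a short script. The
messages are illustrative only; the ordering shown follows from the
rule that handlers run in increasing sequence order, with an
unparametrized probe counting as sequence zero.

```systemtap
probe begin(-1) { println("begin(-1): runs before the other begin probes") }
probe begin     { println("begin: effectively sequence zero") }
probe begin(1)  { println("begin(1): runs last, then requests shutdown"); exit() }

probe end(-1)   { println("end(-1): runs first during normal shutdown") }
probe end       { println("end: runs after end(-1)") }

# Attempted instead of the end probes if the session ends after errors.
probe error     { println("error: final gasp") }
```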
131
132
133 NEVER
134 The probe point never is specially defined by the translator to mean
135 "never". Its probe handler is never run, though its statements are an‐
136 alyzed for symbol / type correctness as usual. This probe point may be
137 useful in conjunction with optional probes.
138
139
140 SYSCALL
The syscall.* aliases define several hundred probes, too many to
summarize here. They take the following forms:
143
144 syscall.NAME
145 syscall.NAME.return
146
147 Generally, two probes are defined for each normal system call as listed
148 in the syscalls(2) manual page, one for entry and one for return.
149 Those system calls that never return do not have a corresponding .re‐
150 turn probe.
151
Each probe alias provides a variety of variables. Looking at the tapset
source code is the most reliable way to find out which ones. Generally,
each argument listed in the standard manual page is made available as a
script-level variable, so syscall.open exposes filename, flags, and
mode. In addition, a standard suite of variables is available at most
aliases:
157
158 argstr A pretty-printed form of the entire argument list, without
159 parentheses.
160
161 name The name of the system call.
162
163 retstr For return probes, a pretty-printed form of the system-call re‐
164 sult.
165
166 As usual for probe aliases, these variables are all simply initialized
167 once from the underlying $context variables, so that later changes to
168 $context variables are not automatically reflected. Not all probe
169 aliases obey all of these general guidelines. Please report any both‐
170 ersome ones you encounter as a bug.
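
A brief sketch of the standard variables in use follows. It assumes
that the syscall.open alias resolves on the target kernel; on some
systems programs may open files through a different system call, in
which case this script would print little or nothing.

```systemtap
# Entry probe: name and argstr are set by the alias.
probe syscall.open
{
  printf("%s[%d] %s(%s)\n", execname(), pid(), name, argstr)
}

# Return probe: retstr is a pretty-printed form of the result.
probe syscall.open.return
{
  printf("%s[%d] %s -> %s\n", execname(), pid(), name, retstr)
}
```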
171
172
173
174 TIMERS
175 Intervals defined by the standard kernel "jiffies" timer may be used to
176 trigger probe handlers asynchronously. Two probe point variants are
177 supported by the translator:
178
179 timer.jiffies(N)
180 timer.jiffies(N).randomize(M)
181
The probe handler is run every N jiffies (a kernel-defined unit of
time, typically between 1 and 60 ms). If the "randomize" component is
given, a uniformly distributed random value in the range [-M..+M] is
added to N every time the handler is run. N is restricted to a
reasonable range (1 to around a million), and M is restricted to be
smaller than N. There are no target variables provided in either
context. Such probes may run concurrently on a multi-processor
computer.
190
191 Alternatively, intervals may be specified in units of time. There are
192 two probe point variants similar to the jiffies timer:
193
194 timer.ms(N)
195 timer.ms(N).randomize(M)
196
197 Here, N and M are specified in milliseconds, but the full options for
198 units are seconds (s/sec), milliseconds (ms/msec), microseconds
199 (us/usec), nanoseconds (ns/nsec), and hertz (hz). Randomization is not
200 supported for hertz timers.
201
202 The actual resolution of the timers depends on the target kernel. For
203 kernels prior to 2.6.17, timers are limited to jiffies resolution, so
204 intervals are rounded up to the nearest jiffies interval. After
205 2.6.17, the implementation uses hrtimers for tighter precision, though
206 the actual resolution will be arch-dependent. In either case, if the
207 "randomize" component is given, then the random value will be added to
208 the interval before any rounding occurs.
209
210 Profiling timers are also available to provide probes that execute on
211 all CPUs at the rate of the system tick (CONFIG_HZ). This probe takes
212 no parameters.
213
214 timer.profile
215
216 Full context information of the interrupted process is available, mak‐
217 ing this probe suitable for a time-based sampling profiler.
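
The timer variants can be combined in one script, as in the following
sketch of a very small sampling profiler. The interval values, the
five-second run time and the variable names are arbitrary choices made
for illustration.

```systemtap
global ticks, samples

# Fires roughly every 100 ms, with up to +/- 20 ms of random variation.
probe timer.ms(100).randomize(20) { ticks++ }

# Fires on every CPU at the system tick rate; count samples per program.
probe timer.profile { samples[execname()]++ }

# After five seconds, report the five most frequently sampled names.
probe timer.s(5)
{
  printf("%d randomized ticks\n", ticks)
  foreach (name in samples- limit 5)
    printf("%-16s %d samples\n", name, samples[name])
  exit()
}
```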
218
219
220 DWARF
221 This family of probe points uses symbolic debugging information for the
222 target kernel/module/program, as may be found in unstripped executa‐
223 bles, or the separate debuginfo packages. They allow placement of
224 probes logically into the execution path of the target program, by
225 specifying a set of points in the source or object code. When a match‐
226 ing statement executes on any processor, the probe handler is run in
227 that context.
228
Points in a kernel are identified by module, source file, line number,
function name, or some combination of these.
231
232 Here is a list of probe point families currently supported. The .func‐
233 tion variant places a probe near the beginning of the named function,
234 so that parameters are available as context variables. The .return
235 variant places a probe at the moment after the return from the named
236 function, so the return value is available as the "$return" context
237 variable. The .inline modifier for .function filters the results to
238 include only instances of inlined functions. The .call modifier se‐
239 lects the opposite subset. Inline functions do not have an identifi‐
240 able return point, so .return is not supported on .inline probes. The
241 .statement variant places a probe at the exact spot, exposing those lo‐
242 cal variables that are visible there.
243
244 kernel.function(PATTERN)
245 kernel.function(PATTERN).call
246 kernel.function(PATTERN).return
247 kernel.function(PATTERN).inline
248 kernel.function(PATTERN).label(LPATTERN)
249 module(MPATTERN).function(PATTERN)
250 module(MPATTERN).function(PATTERN).call
251 module(MPATTERN).function(PATTERN).return
252 module(MPATTERN).function(PATTERN).inline
253 module(MPATTERN).function(PATTERN).label(LPATTERN)
254 kernel.statement(PATTERN)
255 kernel.statement(ADDRESS).absolute
256 module(MPATTERN).statement(PATTERN)
257 process("PATH").function("NAME")
258 process("PATH").statement("*@FILE.c:123")
259 process("PATH").library("PATH").function("NAME")
260 process("PATH").library("PATH").statement("*@FILE.c:123")
261 process("PATH").function("*").return
262 process("PATH").function("myfun").label("foo")
263 process(PID).statement(ADDRESS).absolute
264
265 (See the USER-SPACE section below for more information on the process
266 probes.)
267
In the above list, MPATTERN stands for a string literal that aims to
identify the loaded kernel module of interest and LPATTERN stands for a
source program label. Both MPATTERN and LPATTERN may include the "*",
"[]", and "?" wildcards. PATTERN stands for a string literal that aims
to identify a point in the program. It is made up of three parts:
273
274 · The first part is the name of a function, as would appear in the nm
275 program's output. This part may use the "*" and "?" wildcarding
276 operators to match multiple names.
277
278 · The second part is optional and begins with the "@" character. It
279 is followed by the path to the source file containing the function,
280 which may include a wildcard pattern, such as mm/slab*. If it does
281 not match as is, an implicit "*/" is optionally added before the
282 pattern, so that a script need only name the last few components of
283 a possibly long source directory path.
284
285 · Finally, the third part is optional if the file name part was giv‐
286 en, and identifies the line number in the source file preceded by a
287 ":" or a "+". The line number is assumed to be an absolute line
288 number if preceded by a ":", or relative to the entry of the func‐
289 tion if preceded by a "+". All the lines in the function can be
290 matched with ":*". A range of lines x through y can be matched
291 with ":x-y".
292
293 As an alternative, PATTERN may be a numeric constant, indicating an ad‐
294 dress. Such an address may be found from symbol tables of the appro‐
295 priate kernel / module object file. It is verified against known
296 statement code boundaries, and will be relocated for use at run time.
297
298 In guru mode only, absolute kernel-space addresses may be specified
299 with the ".absolute" suffix. Such an address is considered already re‐
300 located, as if it came from /proc/kallsyms, so it cannot be checked
301 against statement/instruction boundaries.
302
303
304 CONTEXT VARIABLES
305 Many of the source-level context variables, such as function parame‐
306 ters, locals, globals visible in the compilation unit, may be visible
307 to probe handlers. They may refer to these variables by prefixing
308 their name with "$" within the scripts. In addition, a special syntax
309 allows limited traversal of structures, pointers, and arrays. More
310 syntax allows pretty-printing of individual variables or their groups.
311 See also @cast.
312
313
314 $var refers to an in-scope variable "var". If it's an integer-like
315 type, it will be cast to a 64-bit int for systemtap script use.
316 String-like pointers (char *) may be copied to systemtap string
317 values using the kernel_string or user_string functions.
318
319 $var->field traversal via a structure's or a pointer's field. This
320 generalized indirection operator may be repeated to follow more
321 levels. Note that the . operator is not used for plain struc‐
322 ture members, only -> for both purposes. (This is because "."
323 is reserved for string concatenation.)
324
325 $return
326 is available in return probes only for functions that are de‐
327 clared with a return value.
328
$var[N]
indexes into an array. The index may be given as a literal number or
even as an arbitrary numeric expression.
332
333 A number of operators exist for such basic context variable expres‐
334 sions:
335
336 $$vars expands to a character string that is equivalent to
337 sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
338 parm1, ..., parmN, var1, ..., varN)
339 for each variable in scope at the probe point. Some values may be
340 printed as =? if their run-time location cannot be found.
341
342 $$locals
343 expands to a subset of $$vars for only local variables.
344
345 $$parms
346 expands to a subset of $$vars for only function parameters.
347
348 $$return
349 is available in return probes only. It expands to a string that
350 is equivalent to sprintf("return=%x", $return) if the probed
351 function has a return value, or else an empty string.
352
353 & $EXPR
354 expands to the address of the given context variable expression,
355 if it is addressable.
356
@defined($EXPR)
expands to 1 if the given context variable expression is resolvable,
or to 0 otherwise, for use in conditionals such as
@defined($foo->bar) ? $foo->bar : 0
361
362 $EXPR$ expands to a string with all of $EXPR's members, equivalent to
363 sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
364 $EXPR->a, $EXPR->b)
365
$EXPR$$
expands to a string with all of $EXPR's members and submembers,
equivalent to
sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
$EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
371
372
373 For ".return" probes, context variables other than the "$return" value
374 itself are only available for the function call parameters. The ex‐
375 pressions evaluate to the entry-time values of those variables, since
376 that is when a snapshot is taken. Other local variables are not gener‐
377 ally accessible, since by the time a ".return" probe hits, the probed
378 function will have already returned.
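
The following sketch shows several of these expressions together. It
assumes a kernel with debuginfo in which vfs_read exists and has a
parameter named count; both names are assumptions about the target
kernel, not guarantees.

```systemtap
probe kernel.function("vfs_read").call
{
  # $$parms is a pretty-printed string of all function parameters.
  printf("%s %s\n", pp(), $$parms)
}

probe kernel.function("vfs_read").return
{
  # In a .return probe, $count yields the entry-time parameter value.
  # @defined guards against kernels where the name does not resolve.
  requested = @defined($count) ? $count : -1
  printf("%s requested=%d returned=%d\n", pp(), requested, $return)
}
```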
379
Arbitrary entry-time expressions can also be saved for ".return" probes
using the @entry(expr) operator. For example, one can compute the
elapsed time of a function:

    # Print the microseconds elapsed between function entry and return.
    probe kernel.function("do_filp_open").return {
      println( gettimeofday_us() - @entry(gettimeofday_us()) )
    }
386
387
388
389 DWARFLESS
In the absence of debugging information, the entry and exit points of
kernel and module functions can be probed using the "kprobe" family of
probes. However, these do not permit looking up the arguments or local
variables of the function. The following constructs are supported:
kprobe.function(FUNCTION)
kprobe.function(FUNCTION).return
kprobe.module(NAME).function(FUNCTION)
kprobe.module(NAME).function(FUNCTION).return
kprobe.statement(ADDRESS).absolute
399
Probes of type function are recommended for kernel functions, whereas
probes of type module are recommended for probing functions of the
specified module. If the absolute address of a kernel or module
function is known, statement probes can be used.

Note that FUNCTION and module NAME strings must not contain wildcards,
or the probe will not be registered. Also, statement probes may be used
only in guru mode.
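
A minimal sketch of a DWARF-less entry/return pair follows. The
function name vfs_read is an assumption about the target kernel. Since
no debuginfo is used, the handlers rely only on general context
functions rather than on $variables.

```systemtap
probe kprobe.function("vfs_read")
{
  printf("%s[%d] entered %s\n", execname(), pid(), pp())
}

probe kprobe.function("vfs_read").return
{
  printf("%s[%d] left %s\n", execname(), pid(), pp())
}
```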
408
409
410
411 USER-SPACE
412 Support for user-space probing is available for kernels that are con‐
413 figured with the utrace extensions. See
414 http://people.redhat.com/roland/utrace/
415
416 There are several forms. First, a non-symbolic probe point:
417 process(PID).statement(ADDRESS).absolute
418 is analogous to kernel.statement(ADDRESS).absolute in that both use raw
419 (unverified) virtual addresses and provide no $variables. The target
420 PID parameter must identify a running process, and ADDRESS should iden‐
421 tify a valid instruction address. All threads of that process will be
422 probed.
423
424 Second, non-symbolic user-kernel interface events handled by utrace may
425 be probed:
426 process(PID).begin
427 process("FULLPATH").begin
428 process.begin
429 process(PID).thread.begin
430 process("FULLPATH").thread.begin
431 process.thread.begin
432 process(PID).end
433 process("FULLPATH").end
434 process.end
435 process(PID).thread.end
436 process("FULLPATH").thread.end
437 process.thread.end
438 process(PID).syscall
439 process("FULLPATH").syscall
440 process.syscall
441 process(PID).syscall.return
442 process("FULLPATH").syscall.return
443 process.syscall.return
444 process(PID).insn
445 process("FULLPATH").insn
446 process(PID).insn.block
447 process("FULLPATH").insn.block
448
A .begin probe gets called when a new process described by PID or
FULLPATH gets created. A .thread.begin probe gets called when a new
thread described by PID or FULLPATH gets created. A .end probe gets
called when a process described by PID or FULLPATH dies. A .thread.end
probe gets called when a thread described by PID or FULLPATH dies. A
.syscall probe gets called when a thread described by PID or FULLPATH
makes a system call. The system call number is available in the
$syscall context variable, and the first 6 arguments of the system call
are available in the $argN context variables ($arg1, $arg2, ...). A
.syscall.return probe gets called when a thread described by PID or
FULLPATH returns from a system call. The system call number is
available in the $syscall context variable, and the return value of the
system call is available in the $return context variable. A .insn probe
gets called for every single-stepped instruction of the process
described by PID or FULLPATH. A .insn.block probe gets called for every
block-stepped instruction of the process described by PID or FULLPATH.
465
466 If a process probe is specified without a PID or FULLPATH, all user
467 threads will be probed. However, if systemtap was invoked with the -c
468 or -x options, then process probes are restricted to the process hier‐
469 archy associated with the target process. If a process probe is speci‐
470 fied without a PID or FULLPATH, but with the -c option, the PATH of the
471 -c cmd will be heuristically filled into the process PATH.
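
The utrace-based events can be sketched as follows. No PID or FULLPATH
is given, so the script is intended to be run with the -c or -x option
to restrict it to one process hierarchy; the output formats are
arbitrary.

```systemtap
probe process.begin
{
  printf("%s[%d] started\n", execname(), pid())
}

probe process.syscall
{
  printf("%s[%d] syscall %d arg1=0x%x\n", execname(), pid(), $syscall, $arg1)
}

probe process.syscall.return
{
  printf("%s[%d] syscall %d returned %d\n", execname(), pid(), $syscall, $return)
}

probe process.end
{
  printf("%s[%d] exited\n", execname(), pid())
}
```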
472
473
474 Third, symbolic static instrumentation compiled into programs and
475 shared libraries may be probed:
476 process("PATH").mark("LABEL")
477 process("PATH").provider("PROVIDER").mark("LABEL")
478
A .mark probe gets called via a static probe which is defined in the
application by STAP_PROBE1(PROVIDER,LABEL,arg1), a macro defined in
sdt.h. PROVIDER is an application handle, LABEL corresponds to the
.mark argument, and arg1 is the argument. STAP_PROBE1 is used for
probes with 1 argument, STAP_PROBE2 is used for probes with 2
arguments, and so on. The arguments of the probe are available in the
context variables $arg1, $arg2, ... An alternative to using the
STAP_PROBE macros is to use the dtrace script to create custom macros.
Additionally, the variables $$name and $$provider are available as
parts of the probe point name.
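
On the script side, a static mark might be consumed as in the sketch
below. The executable path, the provider name "myapp" and the label
"request_start" are all hypothetical; they stand for whatever the
application passed to its STAP_PROBE1 call, and the single argument is
assumed to be an integer.

```systemtap
# Hypothetical application, provider and label names.
probe process("/usr/local/bin/myapp").provider("myapp").mark("request_start")
{
  printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
}
```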
489
490
491 Finally, full symbolic source-level probes in user-space programs and
492 shared libraries are supported. These are exactly analogous to the
493 symbolic DWARF-based kernel/module probes described above, and expose
494 similar contextual $variables.
495 process("PATH").function("NAME")
496 process("PATH").statement("*@FILE.c:123")
497 process("PATH").library("PATH").function("NAME")
498 process("PATH").library("PATH").statement("*@FILE.c:123")
499 process("PATH").function("*").return
500 process("PATH").function("myfun").label("foo")
501
502
503 Note that for all process probes, PATH names refer to executables that
504 are searched the same way shells do: relative to the working directory
505 if they contain a "/" character, otherwise in $PATH. If PATH names re‐
506 fer to scripts, the actual interpreters (specified in the script in the
507 first line after the #! characters) are probed. If PATH is a process
508 component parameter referring to shared libraries then all processes
509 that map it at runtime would be selected for probing. If PATH is a li‐
510 brary component parameter referring to shared libraries then the
511 process specified by the process component would be selected. If the
512 PATH string contains wildcards as in the MPATTERN case, then standard
513 globbing is performed to find all matching paths. In this case, the
514 $PATH environment variable is not used.
515
516
521
522 PROCFS
These probe points allow procfs "files" in /proc/systemtap/MODNAME to
be created, read and written, with permissions that may be adjusted
using a umask value. (MODNAME is the name of the systemtap module.)
Default permissions are 0400 for read probes, and 0200 for write
probes. If both a read and a write probe are used on the same file, a
default permission of 0600 will be used. Using procfs.umask(0040).read
would result in a 0404 permission set for the file. The proc filesystem
is a pseudo-filesystem which is used as an interface to kernel data
structures. There are several probe point variants supported by the
translator:
533
534 procfs("PATH").read
535 procfs("PATH").umask(UMASK).read
536 procfs("PATH").read.maxsize(MAXSIZE)
procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
538 procfs("PATH").write
539 procfs("PATH").umask(UMASK).write
540 procfs.read
541 procfs.umask(UMASK).read
542 procfs.read.maxsize(MAXSIZE)
543 procfs.umask(UMASK).read.maxsize(MAXSIZE)
544 procfs.write
545 procfs.umask(UMASK).write
546
PATH is the file name (relative to /proc/systemtap/MODNAME) to be
created. If no PATH is specified (as in the last six variants above),
PATH defaults to "command".
550
551 When a user reads /proc/systemtap/MODNAME/PATH, the corresponding
552 procfs read probe is triggered. The string data to be read should be
553 assigned to a variable named $value, like this:
554
555 procfs("PATH").read { $value = "100\n" }
556
557 When a user writes into /proc/systemtap/MODNAME/PATH, the corresponding
558 procfs write probe is triggered. The data the user wrote is available
559 in the string variable named $value, like this:
560
561 procfs("PATH").write { printf("user wrote: %s", $value) }
562
563 MAXSIZE is the size of the procfs read buffer. Specifying MAXSIZE al‐
564 lows larger procfs output. If no MAXSIZE is specified, the procfs read
565 buffer defaults to STP_PROCFS_BUFSIZE (which defaults to MAXSTRINGLEN,
566 the maximum length of a string). If setting the procfs read buffers
567 for more than one file is needed, it may be easiest to override the
568 STP_PROCFS_BUFSIZE definition. Here's an example of using MAXSIZE:
569
570 procfs.read.maxsize(1024) {
571 $value = "long string..."
572 $value .= "another long string..."
573 $value .= "another long string..."
574 $value .= "another long string..."
575 }
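
Read and write probes on the same file can be combined to expose a
tunable, as in this sketch. The file name "threshold" and the global
variable are hypothetical. With both probes present, the file would
get the default 0600 permission described above.

```systemtap
global threshold = 100   # hypothetical tunable

# Reading the file reports the current value.
probe procfs("threshold").read
{
  $value = sprintf("%d\n", threshold)
}

# Writing a decimal number to the file updates the value.
probe procfs("threshold").write
{
  threshold = strtol($value, 10)
}
```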
576
577
578 MARKERS
579 This family of probe points hooks up to static probing markers inserted
580 into the kernel or modules. These markers are special macro calls in‐
581 serted by kernel developers to make probing faster and more reliable
582 than with DWARF-based probes. Further, DWARF debugging information is
583 not required to probe markers.
584
Marker probe points begin with kernel. The next part names the marker
itself: mark("name"). The marker name string, which may contain the
usual wildcard characters, is matched against the names given to the
marker macros when the kernel and/or module was compiled. Optionally,
you can specify format("format"). Specifying the marker format string
allows differentiation between two markers with the same name but
different marker format strings.
592
593 The handler associated with a marker-based probe may read the optional
594 parameters specified at the macro call site. These are named $arg1
595 through $argNN, where NN is the number of parameters supplied by the
596 macro. Number and string parameters are passed in a type-safe manner.
597
The marker format string associated with a marker is available in
$format, and the marker name string is available in $name.
600
601
602 TRACEPOINTS
603 This family of probe points hooks up to static probing tracepoints in‐
604 serted into the kernel or modules. As with markers, these tracepoints
605 are special macro calls inserted by kernel developers to make probing
606 faster and more reliable than with DWARF-based probes, and DWARF debug‐
607 ging information is not required to probe tracepoints. Tracepoints
608 have an extra advantage of more strongly-typed parameters than markers.
609
610 Tracepoint probes begin with kernel. The next part names the trace‐
611 point itself: trace("name"). The tracepoint name string, which may
612 contain the usual wildcard characters, is matched against the names de‐
613 fined by the kernel developers in the tracepoint header files.
614
615 The handler associated with a tracepoint-based probe may read the op‐
616 tional parameters specified at the macro call site. These are named
617 according to the declaration by the tracepoint author. For example,
618 the tracepoint probe kernel.trace("sched_switch") provides the parame‐
619 ters $rq, $prev, and $next. If the parameter is a complex type, as in
620 a struct pointer, then a script can access fields with the same syntax
621 as DWARF $target variables. Also, tracepoint parameters cannot be mod‐
622 ified, but in guru-mode a script may modify fields of parameters.
623
624 The name of the tracepoint is available in $$name, and a string of
625 name=value pairs for all parameters of the tracepoint is available in
626 $$vars or $$parms.
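
A tracepoint handler might look like the following sketch. It assumes
that the sched_switch tracepoint exists on the target kernel and that
its $prev and $next parameters point to task structures with a pid
field.

```systemtap
probe kernel.trace("sched_switch")
{
  # $$name is the tracepoint name; -> traverses the struct pointers.
  printf("%s: pid %d -> pid %d\n", $$name, $prev->pid, $next->pid)
}
```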
627
628
629 HARDWARE BREAKPOINTS
This family of probes is used to set hardware watchpoints for a given
(global) kernel symbol. The probes take three components as inputs:

1. The virtual address or name of the kernel symbol to be traced is
supplied as the argument to this class of probes. (Only data segment
variables can be probed. Local variables of a function cannot be
probed.)

2. The nature of the access to be probed: a .write probe gets
triggered when a write happens at the specified address or symbol
name; a .rw probe is triggered when either a read or a write happens.

3. .length (optional). Users have the option of specifying the address
interval to be probed using the "length" construct. The user-specified
length gets approximated to the closest possible address length that
the architecture can support. If the specified length exceeds the
limits imposed by the architecture, an error message is flagged and
probe registration fails. Wherever "length" is not specified, the
translator requests a hardware breakpoint probe of length 1. Note that
the "length" construct is not valid with symbol names.

The following constructs are supported:
probe kernel.data(ADDRESS).write
probe kernel.data(ADDRESS).rw
probe kernel.data(ADDRESS).length(LEN).write
probe kernel.data(ADDRESS).length(LEN).rw
probe kernel.data("SYMBOL_NAME").write
probe kernel.data("SYMBOL_NAME").rw
658
This set of probes makes use of the debug registers of the processor,
which are a scarce resource (4 on x86, 1 on powerpc). The script
translation flags a warning if a user requests more hardware breakpoint
probes than the limit set by the architecture. For example, a pass-2
warning is issued when an input script requests 5 hardware breakpoint
probes on an x86 system, since the x86 architecture supports a maximum
of 4 breakpoints. Users are cautioned to set probes judiciously.
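
As a small sketch, a single watchpoint on a global kernel variable
could be written as below. The symbol pid_max is assumed to be present
in the target kernel, and the handler only reports who triggered the
write.

```systemtap
probe kernel.data("pid_max").write
{
  printf("pid_max written by %s[%d]\n", execname(), pid())
}
```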
666

EXAMPLES
Here are some example probe points, defining the associated events.
670
671 begin, end, end
672 refers to the startup and normal shutdown of the session. In
673 this case, the handler would run once during startup and twice
674 during shutdown.
675
676 timer.jiffies(1000).randomize(200)
677 refers to a periodic interrupt, every 1000 +/- 200 jiffies.
678
679 kernel.function("*init*"), kernel.function("*exit*")
680 refers to all kernel functions with "init" or "exit" in the
681 name.
682
683 kernel.function("*@kernel/sched.c:240")
684 refers to any functions within the "kernel/sched.c" file that
685 span line 240. Note that this is not a probe at the statement
686 at that line number. Use the kernel.statement probe instead.
687
688 kernel.mark("getuid")
689 refers to an STAP_MARK(getuid, ...) macro call in the kernel.
690
691 module("usb*").function("*sync*").return
692 refers to the moment of return from all functions with "sync" in
693 the name in any of the USB drivers.
694
695 kernel.statement(0xc0044852)
696 refers to the first byte of the statement whose compiled in‐
697 structions include the given address in the kernel.
698
699 kernel.statement("*@kernel/sched.c:2917")
700 refers to the statement of line 2917 within "kernel/sched.c".
701
702 kernel.statement("bio_init@fs/bio.c+3")
703 refers to the statement at line bio_init+3 within "fs/bio.c".
704
kernel.data("pid_max").write
refers to a hardware breakpoint of type "write" set on pid_max.

syscall.*.return
refers to the group of probe aliases with any name in the second
position.
711
712
713 PERF
This prototype family of probe points interfaces to the kernel "perf
event" infrastructure for controlling hardware performance counters.
The events being attached to are described by the "type" and "config"
fields of the perf_event_attr structure, and are sampled at an interval
governed by the "sample_period" field.
719
720 These fields are made available to systemtap scripts using the follow‐
721 ing syntax:
722 probe perf.type(NN).config(MM).sample(XX)
723 probe perf.type(NN).config(MM)
724 The systemtap probe handler is called once per XX increments of the un‐
725 derlying performance counter. The default sampling count is 1000000.
726 The range of valid type/config is described by the perf_event_open(2)
727 system call, and/or the linux/perf_event.h file. Invalid combinations
728 or exhausted hardware counter resources result in errors during system‐
729 tap script startup. Systemtap does not sanity-check the values: it
730 merely passes them through to the kernel for error- and safety-check‐
731 ing.
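
A perf probe could be used as in the sketch below. The values type 0
and config 0 are assumed here to select the hardware CPU-cycles counter
as defined in linux/perf_event.h; the sample count simply restates the
default, and the five-second run time is arbitrary.

```systemtap
global samples

# Assumed: type 0 = PERF_TYPE_HARDWARE, config 0 = CPU cycles.
probe perf.type(0).config(0).sample(1000000)
{
  samples[execname()]++
}

# Report the five most frequently sampled program names, then stop.
probe timer.s(5)
{
  foreach (name in samples- limit 5)
    printf("%-16s %d samples\n", name, samples[name])
  exit()
}
```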
732
733
SEE ALSO
stap(1), probe::*(3stap), tapset::*(3stap)