STAP(1)                     General Commands Manual                    STAP(1)


       stap - systemtap script translator/driver


       stap [ OPTIONS ] FILENAME [ ARGUMENTS ]
       stap [ OPTIONS ] - [ ARGUMENTS ]
       stap [ OPTIONS ] -e SCRIPT [ ARGUMENTS ]
       stap [ OPTIONS ] -l PROBE [ ARGUMENTS ]
       stap [ OPTIONS ] -L PROBE [ ARGUMENTS ]
       stap [ OPTIONS ] --dump-probe-types
       stap [ OPTIONS ] --dump-probe-aliases
       stap [ OPTIONS ] --dump-functions


       The stap program is the front-end to the Systemtap tool. It accepts
       probing instructions written in a simple domain-specific language,
       translates those instructions into C code, compiles this C code, and
       loads the resulting module into a running Linux kernel or a DynInst
       user-space mutator, to perform the requested system trace/probe
       functions. You can supply the script in a named file (FILENAME), from
       standard input (use - instead of FILENAME), or from the command line
       (using -e SCRIPT). The program runs until the user interrupts it, the
       script voluntarily invokes the exit() function, or a sufficient number
       of soft errors accumulates.

       The language, which is described in the SCRIPT LANGUAGE section below,
       is strictly typed, expressive, declaration free, procedural,
       prototyping-friendly, and inspired by awk and C. It allows source code
       points or events in the system to be associated with handlers, which
       are subroutines that are executed synchronously. It is somewhat
       similar conceptually to "breakpoint command lists" in the gdb
       debugger.


       systemtap comes with a variety of educational, documentation and
       reference resources. They are available online and/or packaged for
       offline use. For online documentation, see the project web site,
       https://sourceware.org/systemtap/

       ┌──────────────────────────┬──────────────────────────────────────────────────────┐
       │man pages                 │                                                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stap (this page)          │ language syntax, concepts, operation, options        │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapprobes                │ probe points and their $context variables            │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapref                   │ quick reference to language syntax                   │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stappaths                 │ list of directories, including books & references    │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stap-prep                 │ program to install auxiliary dependencies like       │
       │                          │ kernel debuginfo                                     │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │tapset::*                 │ generated list of tapsets                            │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │probe::*                  │ generated list of tapset probe aliases               │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │function::*               │ generated list of tapset functions                   │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │macro::*                  │ generated list of tapset macros                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapvars                  │ some of the tapset global variables                  │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │staprun, stapdyn, stapbpf │ programs for executing compiled systemtap scripts    │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │systemtap                 │ initscript, boot-time probing                        │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stap-server               │ compilation server                                   │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapex                    │ a few very basic script examples                     │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │books                     │                                                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Beginner's Guide          │ tutorial book, language essentials, examples         │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Tutorial                  │ shorter tutorial, exercises                          │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Language Reference        │ detailed language manual, covers statistics/analysis │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Tapset Reference          │ the tapset man pages, reformatted into a book        │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │references                │                                                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │example scripts           │ over a hundred directly usable sysadmin tools, toys, │
       │                          │ hacks to learn from                                  │
       └──────────────────────────┴──────────────────────────────────────────────────────┘


       The systemtap translator supports the following options. Any other
       option prints a list of supported options. Options may be given on the
       command line, as usual. If the file $SYSTEMTAP_DIR/rc exists, options
       are also loaded from there and interpreted first. ($SYSTEMTAP_DIR
       defaults to $HOME/.systemtap if unset.)

       In some cases, the default value of an option depends on the particular
       system configuration and thus can't be mentioned here directly. In
       some of those cases running "stap --help" might display the default.

       -      Use standard input instead of a given FILENAME as probe language
              input, unless -e SCRIPT is given.

       -h --help
              Show help message.

       -V --version
              Show version message.

       -p NUM Stop after pass NUM. The passes are numbered 1-5: parse,
              elaborate, translate, compile, run. See the PROCESSING section
              for details.

       -v     Increase verbosity for all passes. Produce a larger volume of
              informative (?) output each time the option is repeated.

       --vp ABCDE
              Increase verbosity on a per-pass basis. For example, "--vp 002"
              adds 2 units of verbosity to pass 3 only. The combination
              "-v --vp 00004" adds 1 unit of verbosity for all passes, and 4
              more for pass 5.
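
              The arithmetic behind -v and --vp can be sketched in a few
              lines of Python. This is only an illustrative model of the
              rule described above, not stap's own code: the function name
              pass_verbosity is hypothetical, and rejecting more than five
              digits is an assumption.

```python
# Illustrative model only; not part of systemtap.
PASSES = ("parse", "elaborate", "translate", "compile", "run")

def pass_verbosity(v_count=0, vp=""):
    """Return the verbosity level of each of the five passes.

    v_count is the number of times -v was given; vp is the digit
    string given to --vp (missing trailing digits count as 0).
    """
    # Assumption: anything other than up to five decimal digits is an error.
    if len(vp) > len(PASSES) or not all(c in "0123456789" for c in vp):
        raise ValueError("--vp expects up to five decimal digits")
    levels = [v_count] * len(PASSES)      # -v applies to every pass
    for i, digit in enumerate(vp):
        levels[i] += int(digit)           # --vp adds per-pass units
    return dict(zip(PASSES, levels))

print(pass_verbosity(vp="002"))
print(pass_verbosity(v_count=1, vp="00004"))
```

              Under this model "--vp 002" raises only the translate pass to
              2, and "-v --vp 00004" yields 1 for passes 1-4 and 5 for the
              run pass.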

       -k     Keep the temporary directory after all processing. This may be
              useful in order to examine the generated C code, or to reuse the
              compiled kernel object.

       -g     Guru mode. Enable parsing of unsafe expert-level constructs
              like embedded C.

       -P     Prologue-searching mode. This is equivalent to
              --prologue-searching=always. Activate heuristics to work around
              incorrect debugging information for function parameter $context
              variables.

       -u     Unoptimized mode. Disable unused code elision and many other
              optimizations during elaboration / translation.

       -w     Suppressed warnings mode. Disables all warning messages.

       -W     Treat all warnings as errors.

       -b     Use bulk mode (percpu files) for kernel-to-user data transfer.
              Use the stap-merge program to multiplex them back together
              later.

       -i --interactive
              Interactive mode. Enable an interface to build the systemtap
              script incrementally and interactively.

       -t     Collect timing information on the number of times each probe
              executes and the average amount of time spent in each
              probe-point. Also shows the derivation for each probe-point.

       -s NUM Use NUM megabyte buffers for kernel-to-user data transfer. On a
              multiprocessor in bulk mode, this is a per-processor amount.

       -I DIR Add the given directory to the list of tapset search
              directories. See the description of pass 2 for details.

       -D NAME=VALUE
              Add the given C preprocessor directive to the module Makefile.
              These can be used to override limit parameters described below.

       -B NAME=VALUE
              In kernel-runtime mode, add the given make directive to the
              kernel module build's make invocation. These can be used to add
              or override kconfig options. For example, use

              -B CONFIG_DEBUG_INFO=y

              to add debugging information.

       -B FLAG
              In dyninst-runtime mode, add the given parameter to the compiler
              CFLAGS used for building the dyninst shared library. For
              example, use

              -B -g

              to add debugging information.

       -a ARCH
              Use a cross-compilation mode for the given target architecture.
              This requires access to the cross-compiler and the kernel build
              tree, and goes along with the

              -B CROSS_COMPILE=arch-tool-prefix-
              and
              -r /build/tree

              options.

       --modinfo NAME=VALUE
              Add the name/value pair as a MODULE_INFO macro call to the
              generated module. This may be useful to inform or override
              various module-related checks in the kernel.

       -G NAME=VALUE
              Sets the value of global variable NAME to VALUE when staprun is
              invoked. This applies to scalar variables declared global in
              the script/tapset.

       -R DIR Look for the systemtap runtime sources in the given directory.
              Your DIR default can be seen using "stap --help".

       -r /DIR
              Build for the kernel in the given build tree. Can also be set
              with the SYSTEMTAP_RELEASE environment variable.

       -r RELEASE
              Build for the kernel in build tree /lib/modules/RELEASE/build.
              Can also be set with the SYSTEMTAP_RELEASE environment variable.

       -m MODULE
              Use the given name for the generated kernel object module,
              instead of a unique randomized name. The generated kernel object
              module is copied to the current directory.

       -d MODULE
              Add symbol/unwind information for the given module into the
              kernel object module. This may enable symbolic tracebacks from
              those modules/programs, even if they do not have an explicit
              probe placed into them.

       --ldd  Add symbol/unwind information for all user-space shared
              libraries suspected by ldd to be necessary for user-space
              binaries being probed or listed with the -d option. Caution:
              this can make the probe modules considerably larger. Note that
              this option does not deal with kernel-space modules: see
              instead --all-modules below.

       --all-modules
              Equivalent to specifying "-dkernel" and a "-d" for each kernel
              module that is currently loaded. Caution: this can make the
              probe modules considerably larger.

       -o FILE
              Send standard output to named file. In bulk mode, percpu files
              will start with FILE_ (FILE_cpu with -F) followed by the cpu
              number. This supports strftime(3) formats for FILE.
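
              As a rough illustration of the naming rule for -o, the
              following Python sketch expands a strftime pattern and derives
              per-CPU names. It is a model under stated assumptions, not
              stap's implementation: the helper output_names is
              hypothetical, and numbering CPUs contiguously from 0 is an
              assumption.

```python
import time

# Illustrative model only; not part of systemtap.
def output_names(file_pattern, when, cpus=None, background=False):
    """Expand FILE and, for bulk mode, derive the per-CPU file names.

    cpus=None models non-bulk mode (a single output file);
    background=True models the -F case, which uses "FILE_cpu".
    """
    base = time.strftime(file_pattern, when)
    if cpus is None:
        return [base]
    infix = "_cpu" if background else "_"
    # Assumption: CPU numbers run from 0 to cpus - 1.
    return [f"{base}{infix}{cpu}" for cpu in range(cpus)]

when = time.strptime("2024-03-05 14:30:00", "%Y-%m-%d %H:%M:%S")
print(output_names("trace-%Y%m%d.log", when))            # single file
print(output_names("trace-%H%M", when, cpus=2))          # bulk mode
print(output_names("trace-%H%M", when, cpus=2, background=True))
```

              A date-stamped pattern such as "trace-%Y%m%d.log" keeps
              successive runs from overwriting each other's output.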

       -c CMD Start the probes, run CMD, and exit when CMD finishes. This
              also has the effect of setting target() to the pid of the
              command run.

       -x PID Sets target() to PID. This allows scripts to be written that
              filter on a specific process. Scripts run independently of the
              PID's lifespan.

       -e SCRIPT
              Run the given SCRIPT specified on the command line.

       -E SCRIPT
              Run the given SCRIPT in addition to the main script, whether
              that was specified through -e or as a script file. This option
              can be repeated to run multiple scripts, and can be used in
              listing mode (-l/-L).

       -l PROBE
              Instead of running a probe script, just list all available probe
              points matching the given single probe point. The pattern may
              include wildcards and aliases, but not comma-separated multiple
              probe points. The process result code will indicate failure if
              there are no matches.

              % stap -e 'probe syscall.* { }'
              [...]
              % stap -l 'syscall.*'
              syscall.accept
              [...]
              syscall.writev

       -L PROBE
              Similar to "-l", but list matching probe points plus their
              available context variables. When -v is set with -L, the output
              includes duplicate probe points which are distinguished by their
              PC address.

              % stap -L 'process("/lib64/libpython*.so.*").mark("*")'
              process("/usr/lib64/libpython2.7.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
              process("/usr/lib64/libpython2.7.so.1.0").mark("function__return") $arg1:long $arg2:long $arg3:long
              process("/usr/lib64/libpython3.6m.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
              process("/usr/lib64/libpython3.6m.so.1.0").mark("function__return") $arg1:long $arg2:long $arg3:long
              process("/usr/lib64/libpython3.6m.so.1.0").mark("gc__done") $arg1:long
              process("/usr/lib64/libpython3.6m.so.1.0").mark("gc__start") $arg1:long
              process("/usr/lib64/libpython3.6m.so.1.0").mark("line") $arg1:long $arg2:long $arg3:long

       -F     Without the -o option, load the module and start probes, then
              detach from the module leaving the probes running. With the -o
              option, run staprun in background as a daemon and show its pid.

       -S size[,N]
              Sets the maximum size of an output file and the maximum number
              of output files. If the size of the output file would exceed
              size, systemtap switches output to the next file. If the number
              of output files exceeds N, systemtap removes the oldest output
              file. You can omit the second argument.
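
              The rotation rule of -S can be pictured with a small
              simulation. The Python sketch below only models the behavior
              described above; the function rotate and its unit-less sizes
              are hypothetical, and the real switching point inside staprun
              may differ in detail.

```python
from collections import deque

# Illustrative model only; not part of systemtap.
def rotate(record_sizes, max_size, max_files=None):
    """Simulate -S size[,N]; return the sizes of the files left on disk."""
    files = deque([0])                    # file sizes, oldest first
    for size in record_sizes:
        # Switch files when this write would push the current one past
        # the limit (an empty file always accepts the write).
        if files[-1] > 0 and files[-1] + size > max_size:
            files.append(0)
            if max_files is not None and len(files) > max_files:
                files.popleft()           # remove the oldest output file
        files[-1] += size
    return list(files)

print(rotate([4, 4, 4, 4, 4], max_size=10))               # N omitted
print(rotate([4, 4, 4, 4, 4], max_size=10, max_files=2))  # keep 2 files
```

              With N omitted every file is kept; with N set to 2 the first
              file is removed as soon as a third one is started, so total
              disk use stays bounded by roughly size times N.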

       -T TIMEOUT
              Exit the script after TIMEOUT seconds.

       --skip-badvars
              Ignore unresolvable or run-time-inaccessible context variables
              and substitute with 0, without errors.

       --prologue-searching[=WHEN]
              Prologue-searching mode. Activate heuristics to work around
              incorrect debugging information for function parameter $context
              variables. WHEN can be either "never", "always", or "auto" (i.e.
              enabled by heuristic). If WHEN is missing, then "always" is
              assumed. If the option is missing, then "auto" is assumed.
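
              The two defaults (option absent versus WHEN absent) are easy
              to confuse, so here is a minimal Python sketch of the rule.
              It is a model, not stap's option parser: resolve_when is a
              hypothetical helper, and letting the last occurrence win is
              an assumption.

```python
# Illustrative model only; not part of systemtap.
def resolve_when(args, option="--prologue-searching"):
    """Return the effective WHEN value for `option` in an argument list."""
    when = "auto"                         # option not given at all
    for arg in args:
        if arg == option:
            when = "always"               # option given without =WHEN
        elif arg.startswith(option + "="):
            when = arg.split("=", 1)[1]
            if when not in ("never", "always", "auto"):
                raise ValueError(f"bad WHEN value: {when!r}")
        elif arg == "-P" and option == "--prologue-searching":
            when = "always"               # -P is the documented shorthand
    return when

print(resolve_when([]))
print(resolve_when(["--prologue-searching"]))
print(resolve_when(["--prologue-searching=never"]))
```

              The same pattern of defaults is documented for --color, which
              can be modelled by passing option="--color".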

       --suppress-handler-errors
              Wrap all probe handlers into something like this

              try { ... } catch { next }

              block, which causes any runtime errors to be quietly suppressed.
              Suppressed errors do not count against MAXERRORS limits. In
              this mode, the MAXSKIPPED limits are also suppressed, so that
              many errors and skipped probes may be accumulated during a
              script's runtime. Any overall counts will still be reported at
              shutdown.

       --compatible VERSION
              Suppress recent script language or tapset changes which are
              incompatible with the given older version of systemtap. This
              may be useful if a much older systemtap script fails to run.
              See the DEPRECATION section for more details.

       --check-version
              This option is used to check if the active script has any
              constructs that may be systemtap version specific. See the
              DEPRECATION section for more details.

       --clean-cache
              This option prunes stale entries from the cache directory. This
              is normally done automatically after successful runs, but this
              option will trigger the cleanup manually and then exit. See the
              CACHING section for more details about cache limits.

       --color[=WHEN], --colour[=WHEN]
              This option controls coloring of error messages. WHEN can be
              either "never", "always", or "auto" (i.e. enable only if at a
              terminal). If WHEN is missing, then "always" is assumed. If the
              option is missing, then "auto" is assumed.

              Colors can be modified using the SYSTEMTAP_COLORS environment
              variable. The format must be of the form
              key1=val1:key2=val2:key3=val3 ...etc. Valid keys are "error",
              "warning", "source", "caret", and "token". Values constitute
              Select Graphic Rendition (SGR) parameter(s). Consult the
              documentation of your terminal for the SGRs it supports. As an
              example, the default colors would be expressed as
              error=01;31:warning=00;33:source=00;34:caret=01:token=01. If
              SYSTEMTAP_COLORS is absent, the default colors will be used. If
              it is empty or invalid, coloring is turned off.
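
              To make the SYSTEMTAP_COLORS format concrete, the following
              Python sketch parses such a string and shows how an SGR value
              turns into a terminal escape sequence. It is an approximation
              for illustration: parse_colors and paint are hypothetical
              names, and how stap treats keys omitted from a user-supplied
              string is not modelled here (only the given keys are
              returned).

```python
# Illustrative model only; not part of systemtap.
DEFAULT_COLORS = "error=01;31:warning=00;33:source=00;34:caret=01:token=01"
VALID_KEYS = {"error", "warning", "source", "caret", "token"}

def parse_colors(value=None):
    """Return a key -> SGR mapping, or None if coloring is turned off."""
    if value is None:                     # variable absent: use defaults
        value = DEFAULT_COLORS
    if not value:                         # empty: coloring is turned off
        return None
    colors = {}
    for item in value.split(":"):
        key, sep, sgr = item.partition("=")
        if not sep or key not in VALID_KEYS or not sgr:
            return None                   # invalid: coloring is turned off
        colors[key] = sgr
    return colors

def paint(text, sgr):
    """Wrap text in an SGR escape sequence followed by a reset."""
    return f"\033[{sgr}m{text}\033[0m"

colors = parse_colors()
print(paint("error:", colors["error"]), "example message")
```

              On most terminals the default "01;31" renders as bold red;
              exporting SYSTEMTAP_COLORS="error=01;35" would, under this
              model, change only the error color.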

       --disable-cache
              This option disables all use of the cache directory. No files
              will be either read from or written to the cache.

       --poison-cache
              This option treats files in the cache directory as invalid. No
              files will be read from the cache, but resulting files from this
              run will still be written to the cache. This is meant as a
              troubleshooting aid when stap's caching seems to be
              misbehaving. If it helped, there is probably a bug in systemtap
              that the developers would like you to report.

       --privilege[=stapusr | =stapsys | =stapdev]
              This option instructs stap to examine the script looking for
              constructs which are not allowed for the specified privilege
              level (see UNPRIVILEGED USERS). Compilation fails if any such
              constructs are used. If stapusr or stapsys is specified when
              using a compile server (see --use-server), the server will
              examine the script and, if compilation succeeds, the server will
              cryptographically sign the resulting kernel module, certifying
              that it is safe for use by users at the specified privilege
              level.

              If --privilege has not been specified, -pN has not been
              specified with N < 5, and the invoking user is not root, and is
              not a member of the group stapdev, then stap will automatically
              add the appropriate --privilege option to the options already
              specified.

       --unprivileged
              This option is equivalent to --privilege=stapusr.

       --use-server[=HOSTNAME[:PORT] | =IP_ADDRESS[:PORT] | =CERT_SERIAL]
              Specify compile-server(s) to be used for compilation and/or in
              conjunction with --list-servers and --trust-servers (see below)
              for listing. If no argument is supplied, then the default in
              unprivileged mode (see --privilege) is to select compatible
              servers which are trusted as SSL peers and as module signers and
              currently online. Otherwise the default is to select compatible
              servers which are trusted as SSL peers and currently online.
              --use-server may be specified more than once, in which case a
              list of servers is accumulated in the order specified. Servers
              may be specified by host name, ip address, or by certificate
              serial number (obtained using --list-servers). The last form is
              most commonly used when adding or revoking trust in a server
              (see --trust-servers below). If a server is specified by host
              name or ip address, then an optional port number may be
              specified. This is useful for accessing servers which are not on
              the local network or to specify a particular server.

              IP addresses may be IPv4 or IPv6 addresses.

              If a particular IPv6 address is link local and exists on more
              than one interface, the intended interface may be specified by
              appending a percent sign (%) followed by the intended interface
              name to the address. For example,
              "fe80::5eff:35ff:fe07:55ca%eth0".

              In order to specify a port number with an IPv6 address, it is
              necessary to enclose the IPv6 address in square brackets ([]) in
              order to separate the port number from the rest of the address.
              For example, "[fe80::5eff:35ff:fe07:55ca]:5000" or
              "[fe80::5eff:35ff:fe07:55ca%eth0]:5000".
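
              The bracket and interface rules for server addresses can be
              summarized in a short Python sketch. The helper server_spec
              is hypothetical and only assembles the argument string; the
              host name build.example.com is a placeholder, and nothing
              here contacts a server.

```python
import ipaddress

# Illustrative model only; not part of systemtap.
def server_spec(host, port=None, interface=None):
    """Build a --use-server argument from a host name or IP address."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_ipv6 = False                   # not an IP literal: a host name
    if interface is not None:
        if not is_ipv6:
            raise ValueError("an interface suffix only applies to IPv6")
        host = f"{host}%{interface}"      # link-local interface selector
    if port is None:
        return host
    # IPv6 needs brackets so the port's colon is unambiguous.
    return f"[{host}]:{port}" if is_ipv6 else f"{host}:{port}"

print("--use-server=" + server_spec("fe80::5eff:35ff:fe07:55ca",
                                    port=5000, interface="eth0"))
print("--use-server=" + server_spec("build.example.com", port=5000))
```

              The first call reproduces the bracketed form shown above; the
              second shows that host names and IPv4 addresses take the port
              without brackets.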

              If --use-server has not been specified, -pN has not been
              specified with N < 5, and the invoking user is not root, is not
              a member of the group stapdev, but is a member of the group
              stapusr, then stap will automatically add --use-server to the
              options already specified.

       --use-server-on-error[=yes|=no]
              Instructs stap to retry compilation of a script using a compile
              server if compilation on the local host fails in a manner which
              suggests that it might succeed using a server. If this option
              is not specified, the default is no. If no argument is
              provided, then the default is yes. Compilation will be retried
              for certain types of errors (e.g. insufficient data or
              resources) which may not occur during re-compilation by a
              compile server. Compile servers will be selected automatically
              for the re-compilation attempt as if --use-server was specified
              with no arguments.

       --list-servers[=SERVERS]
              Display the status of the requested SERVERS, where SERVERS is a
              comma-separated list of server attributes. The list of
              attributes is combined to filter the list of servers displayed.
              Supported attributes are:

              all    specifies all known servers (trusted SSL peers, trusted
                     module signers, online servers).

              specified
                     specifies servers specified using --use-server.

              online filters the output by retaining information about servers
                     which are currently online.

              trusted
                     filters the output by retaining information about servers
                     which are trusted as SSL peers.

              signer filters the output by retaining information about servers
                     which are trusted as module signers (see --privilege).

              compatible
                     filters the output by retaining information about servers
                     which are compatible with the current kernel release and
                     architecture.

              If no argument is provided, then the default is "specified". If
              no servers were specified using --use-server, then the default
              servers for --use-server are listed.

              Note that --list-servers uses the avahi-daemon service to detect
              online servers. If this service is not available, then
              --list-servers will fail to detect any online servers. In order
              for --list-servers to detect servers listening on IPv6
              addresses, the avahi-daemon configuration file
              /etc/avahi/avahi-daemon.conf must contain an active
              "use-ipv6=yes" line. The service must be restarted after adding
              this line in order for IPv6 to be enabled.

       --trust-servers[=TRUST_SPEC]
              Grant or revoke trust in the compile-servers specified using
              --use-server, as directed by TRUST_SPEC, where TRUST_SPEC is a
              comma-separated list specifying the trust which is to be granted
              or revoked. Supported elements are:

              ssl    trust the specified servers as SSL peers.

              signer trust the specified servers as module signers (see
                     --privilege). Only root can specify signer.

              all-users
                     grant trust as an ssl peer for all users on the local
                     host. The default is to grant trust as an ssl peer for
                     the current user only. Trust as a module signer is always
                     granted for all users. Only root can specify all-users.

              revoke revoke the specified trust. The default is to grant it.

              no-prompt
                     do not prompt the user for confirmation before carrying
                     out the requested action. The default is to prompt the
                     user for confirmation.

              If no argument is provided, then the default is ssl. If no
              servers were specified using --use-server, then no trust will be
              granted or revoked.

              Unless no-prompt has been specified, the user will be prompted
              to confirm the trust to be granted or revoked before the
              operation is performed.

       --dump-probe-types
              Dumps a list of supported probe types and exits. If
              --privilege=stapusr is also specified, the list will be limited
              to probe types available to unprivileged users.

       --dump-probe-aliases
              Dumps a list of all probe aliases found in library files and
              exits.

       --dump-functions
              Dumps a list of all the public functions found in library files
              and exits. Also includes their parameters and types. A function
              of type 'unknown' indicates a function that does not return a
              value. Note that not all function/parameter types may be
              resolved (these are also shown by 'unknown'). This feature is
              very memory-intensive and thus may not work properly with
              --use-server if the target server imposes an rlimit on process
              memory (i.e. through the ~stap-server/.systemtap/rc
              configuration file, see stap-server(8)).

       --remote URL
              Set the execution target to the given host. This option may be
              repeated to target multiple execution targets. Passes 1-4 are
              completed locally as normal to build the script, and then pass 5
              will copy the module to the target and run it. Acceptable URL
              forms include:

              [USER@]HOSTNAME, ssh://[USER@]HOSTNAME
                     This mode uses ssh, optionally using a username not
                     matching your own. If a custom ssh_config file is in use,
                     add SendEnv LANG to retain internationalization
                     functionality.

              libvirt://DOMAIN, libvirt://DOMAIN/LIBVIRT_URI
                     This mode uses stapvirt to execute the script on a domain
                     managed by libvirt. Optionally, LIBVIRT_URI may be
                     specified to connect to a specific driver and/or a remote
                     host. For example, to connect to the local privileged
                     QEMU driver, use:

                     --remote libvirt://MyDomain/qemu:///system

                     See the page at <http://libvirt.org/uri.html> for
                     supported URIs. Also see stapvirt(1) for more information
                     on how to prepare the domain for stap probing.

              unix:PATH
                     This mode connects to a UNIX socket. This can be used
                     with a QEMU virtio-serial port for executing scripts
                     inside a running virtual machine.

              direct://
                     Special loopback mode to run on the local host.

       --remote-prefix
              Prefix each line of remote output with "N: ", where N is the
              index of the remote execution target from which the given line
              originated.
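
              The effect of --remote-prefix on merged output can be
              pictured as follows. This Python sketch is a simplified model
              with a hypothetical function name; it assumes that targets
              are indexed from 0 in the order their --remote options were
              given, and it does not model the interleaving of live output.

```python
# Illustrative model only; not part of systemtap.
def prefix_remote_output(outputs):
    """Yield every output line tagged with the index of its target.

    outputs holds one block of text per --remote target.
    Assumption: the index N starts at 0.
    """
    for index, text in enumerate(outputs):
        for line in text.splitlines():
            yield f"{index}: {line}"

sample = ["read 12\nwrite 3", "read 7"]   # made-up output of two targets
for line in prefix_remote_output(sample):
    print(line)
```

              Tagged lines can then be separated again per target with
              ordinary text tools, for example by matching the prefix.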

       --download-debuginfo[=OPTION]
              Enable, disable or set a timeout for the automatic debuginfo
              downloading feature offered by abrt as specified by OPTION,
              where OPTION is one of the following:

              yes    enable automatic downloading of debuginfo with no
                     timeout. This is the same as not providing an OPTION
                     value to --download-debuginfo.

              no     explicitly disable automatic downloading of debuginfo.
                     This is the same as not using the option at all.

              ask    show abrt output, and ask before continuing download. No
                     timeout will be set.

              <timeout>
                     specify a timeout as a positive number to stop the
                     download if it is taking longer than <timeout> seconds.
622       --rlimit-as=NUM
623              Specify  the  maximum  size of the process's virtual memory (ad‐
624              dress space), in bytes.
627       --rlimit-cpu=NUM
628              Specify the CPU time limit, in seconds.
631       --rlimit-nproc=NUM
632              Specify the maximum number of processes that can be created.
635       --rlimit-stack=NUM
636              Specify the maximum size of the process stack, in bytes.
639       --rlimit-fsize=NUM
640              Specify the maximum size of files that the process  may  create,
641              in bytes.
644       --sysroot=DIR
645              Specify  sysroot  directory where target files (executables, li‐
646              braries, etc.)  are located.  With -r RELEASE, the sysroot  will
647              be searched for the appropriate kernel build directory.  With -r
648              /DIR, however, the sysroot will not be used to find  the  kernel
649              build.
652       --sysenv=VAR=VALUE
653              Provide an alternate value for an environment variable where the
654              value on a remote system differs.  Path  variables  (e.g.  PATH,
655              LD_LIBRARY_PATH)  are  assumed  to  be relative to the directory
656              provided by --sysroot, if provided.
659       --suppress-time-limits
660              Disable -DSTP_OVERLOAD related options as  well  as  -DMAXACTION
661              and -DMAXTRYLOCK.  This option requires guru mode.
664       --runtime=MODE
665              Set  the  pass-5  runtime  mode.   Valid options are kernel (de‐
666              fault), dyninst and bpf.  See ALTERNATE RUNTIMES below for  more
667              information.
670       --dyninst
671              Shorthand for --runtime=dyninst.
674       --bpf  Shorthand for --runtime=bpf.
677       --save-uprobes
678              On machines that require SystemTap to build its own uprobes mod‐
679              ule (kernels prior to version 3.5), this option  instructs  Sys‐
680              temTap to also save a copy of the module in the current directo‐
681              ry (creating a new "uprobes" directory first).
684       --target-namespaces=PID
685              Allow for a set of target namespaces to  be  set  based  on  the
686              namespaces  the  given  PID  is  in. This is for namespace-aware
687              tapset functions. If the target namespaces are not set, they
688              default to the stap process's namespaces.
691       --monitor=INTERVAL
692              Enables  an  interface  to  display status information about the
693              module (uptime, module name, invoker uid, memory sizes, global
694              variables,  list  of  probes with their statistics). An optional
695              argument INTERVAL can be supplied to set  the  refresh  rate  in
696              seconds  of the status window. The module can also be controlled
697              by a list of commands using the following keys:
699              c      Resets all global variables to their  initial  values  or
700                     zeroes them if they did not have an initial value.
702              s      Rotates the attribute used to sort the list of probes.
704              t      Brings up a prompt to allow toggling (on/off) of probes by
705                     index. Probe points are still affected  by  their  condi‐
706                     tions.
708              r      Resumes the script by toggling on all probes.
710              p      Pauses the script by toggling off all probes.
712              x      Hides/shows  the status window. This allows for more out‐
713                     put to be seen.
715              navigation-keys
716                     The navigation keys can be used to scroll up and down the
717                     windows.
719              Tab    Toggle scrolling between status and output windows.
722       --example
723              This option is used to run example scripts without having to en‐
724              ter the entire path to the script. Example scripts can be  found
725              in the directory specified in the stappaths(7) manual page.
728       --no-global-var-display
729              This  option  is used to disable the automatic logging of unused
730              global variables at the end of a stap session.


734       Any additional arguments on the command line are passed to  the  script
735       parser for substitution.  See below.


739       The  systemtap script language resembles awk and C.  There are two main
740       outermost constructs: probes and functions.  Within  these,  statements
741       and expressions use C-like operator syntax and precedence.
745       Whitespace is ignored.  Three forms of comments are supported:
746              # ... shell style, to the end of line, except for $# and @#
747              // ... C++ style, to the end of line
748              /* ... C style ... */
749       Literals  are either strings enclosed in double-quotes (passing through
750       the usual C escape codes with backslashes,  and  with  adjacent  string
751       literals  glued together, also as in C), or integers (in decimal, hexa‐
752       decimal, or octal, using the same notation as in C).  All  strings  are
753       limited  in length to some reasonable value (a few hundred bytes).  In‐
754       tegers are 64-bit signed quantities, although the parser  also  accepts
755       (and wraps around) values above positive 2**63.
757       In  addition, script arguments given at the end of the command line may
758       be inserted.  Use $1 ... $<NN> for insertion unquoted, @1 ... @<NN> for
759       insertion as a string literal.  The number of arguments may be accessed
760       through $# (as an unquoted number) or through @# (as a quoted  number).
761       These  may be used at any place a token may begin, including within the
762       preprocessing stage.  Reference to an argument number beyond  what  was
763       actually given is an error.
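
       As a minimal sketch of argument substitution, the following script
       prints the argument count and its first two arguments.  The file
       name args.stp and the argument values are hypothetical; the script
       assumes at least two arguments, the second of which is numeric.

```systemtap
# Hypothetical script, invoked for example as: stap args.stp hello 42
probe begin {
  printf("%d arguments given\n", $#)        # count, as an unquoted number
  printf("first, as a string: %s\n", @1)    # inserted as a string literal
  printf("second, as a number: %d\n", $2)   # inserted unquoted
  exit()
}
```

       Running the same script with fewer than two arguments should fail
       during parsing, since $2 would then refer to an argument that was
       not given.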
767       A  simple  conditional preprocessing stage is run as a part of parsing.
768       The general form is similar to the cond ? exp1 : exp2 ternary operator:
770              %( CONDITION %? TRUE-TOKENS %)
771              %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)
773       The CONDITION is either an expression whose format is determined by its
774       first keyword, or a comparison between two string literals or two
775       numeric literals.  Several such CONDITIONs may also be combined using
776       || (alternative) and && (conjunction).  Parentheses are not yet
777       supported, so keep in mind that && takes precedence
778       over ||.
780       If  the  first part is the identifier kernel_vr or kernel_v to refer to
781       the kernel  version  number,  with  ("2.6.13-1.322FC3smp")  or  without
782       ("2.6.13")  the release code suffix, then the second part is one of the
783       six standard numeric comparison operators <, <=, ==, !=, >, and >=, and
784       the  third part is a string literal that contains an RPM-style version-
785       release value.  The condition is satisfied if the version of the target
786       kernel (as optionally overridden by the -r option) compares to the given
787       version string as the operator requires.  The comparison uses the glibc
788       function  strverscmp.  As a special case, if the operator is for simple
789       equality (==), or inequality (!=), and  the  third  part  contains  any
790       wildcard  characters (* or ? or [), then the expression is treated as a
791       wildcard (mis)match as evaluated by fnmatch.
793       If, on the other hand, the first part is the identifier arch  to  refer
794       to  the  processor  architecture  (as  named by the kernel build system
795       ARCH/SUBARCH), then the second part is one of the two string comparison
796       operators == or !=, and the third part is a string literal for matching
797       it.  This comparison is a wildcard (mis)match.
799       Similarly, if the first part is an identifier like CONFIG_something  to
800       refer  to  a kernel configuration option, then the second part is == or
801       !=, and the third part is a string literal for matching the value (com‐
802       monly  "y"  or "m").  Nonexistent or unset kernel configuration options
803       are represented by the empty string.  This comparison is also  a  wild‐
804       card (mis)match.
806       If the first part is the identifier systemtap_v, the test refers to the
807       systemtap compatibility  version,  which  may  be  overridden  for  old
808       scripts  with  the --compatible flag.  The comparison operator is as is
809       for kernel_v and the right operand is a version string.  See  also  the
810       DEPRECATION section below.
812       If  the  first  part  is  the  identifier systemtap_privilege, the test
813       refers to the privilege level that the  systemtap  script  is  compiled
814       with.  Here the second part is == or !=, and the third part is a string
815       literal, either "stapusr" or "stapsys" or "stapdev".
817       If the first part is the identifier guru_mode, the test checks  whether
818       the systemtap script is compiled in guru mode.  Here the second part
819       is == or !=, and the third part is a number, either 1 or 0.
821       If the first part is the identifier runtime, the  test  refers  to  the
822       systemtap  runtime mode. See ALTERNATE RUNTIMES below for more informa‐
823       tion on runtimes.  The second part is one of the two string  comparison
824       operators == or !=, and the third part is a string literal for matching
825       it.  This comparison is a wildcard (mis)match.
827       Otherwise, the CONDITION is expected to be  a  comparison  between  two
828       string  literals  or two numeric literals.  In this case, the arguments
829       are the only variables usable.
831       The TRUE-TOKENS and FALSE-TOKENS are zero or more general parser tokens
832       (possibly  including  nested preprocessor conditionals), and are passed
833       into the input stream if the condition is true or false.  For  example,
834       the  following code induces a parse error unless the target kernel ver‐
835       sion is newer than 2.6.5:
837              %( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
839       The following code might adapt to hypothetical kernel version drift:
841              probe kernel.function (
842                %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
843                   %( kernel_vr == "2.6.13*smp" %? "do_page_fault" %:
844                      UNSUPPORTED %) %)
845              ) { /* ... */ }
847              %( arch == "ia64" %?
848                 probe syscall.vliw = kernel.function("vliw_widget") {}
849              %)
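
       The condition forms described above can be combined in one script.
       The following is a sketch only: the particular architecture names,
       kernel version and configuration option are illustrative choices,
       not requirements.

```systemtap
probe begin {
  # String comparison on the processor architecture.
  %( arch == "x86_64" %? printf("x86_64\n") %: printf("other architecture\n") %)

  # Kernel configuration option; unset options compare equal to "".
  %( CONFIG_UPROBES == "y" %? printf("uprobes are built in\n") %)

  # && binds tighter than ||, and parentheses are not available, so
  # this reads as: (kernel_v >= "3.5" && arch == "x86_64") || arch == "arm64"
  %( kernel_v >= "3.5" && arch == "x86_64" || arch == "arm64" %?
     printf("combined condition matched\n") %)

  # Numeric comparison on the argument count.
  %( $# > 0 %? printf("first argument: %s\n", @1) %: printf("no arguments\n") %)

  exit()
}
```

       Because the conditionals are resolved while parsing, only the
       selected tokens reach the translator; the other branch costs
       nothing at run time.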
854       The preprocessor also supports a simple macro facility, run as a  sepa‐
855       rate pass before conditional preprocessing.
857       Macros are defined using the following construct:
859              @define NAME %( BODY %)
860              @define NAME(PARAM_1, PARAM_2, ...) %( BODY %)
862       Macros, and parameters inside a macro body, are both invoked by prefix‐
863       ing the macro name with an @ symbol:
865              @define foo %( x %)
866              @define add(a,b) %( ((@a)+(@b)) %)
868                 @foo = @add(2,2)
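
       As a further sketch, a parameterized macro can be used inside a
       complete script.  The macro name square is hypothetical.  Since
       expansion works on tokens, parenthesizing each parameter and the
       whole body keeps the result correct when an argument is itself an
       expression.

```systemtap
# Hypothetical macro: the parentheses guard against precedence surprises.
@define square(x) %( ((@x) * (@x)) %)

probe begin {
  printf("%d\n", @square(3 + 1))   # expands to ((3 + 1) * (3 + 1))
  exit()
}
```

       With the parentheses in place this should print 16; without them
       the expansion would be 3 + 1 * 3 + 1, which is a different value.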
871       Macro expansion is currently performed in a separate pass before condi‐
872       tional  compilation.  Therefore,  both TRUE- and FALSE-tokens in condi‐
873       tional expressions will be macroexpanded regardless of how  the  condi‐
874       tion is evaluated. This can sometimes lead to errors:
876              // The following results in a conflict:
877              %( CONFIG_UTRACE == "y" %?
878                  @define foo %( process.syscall %)
879              %:
880                  @define foo %( **ERROR** %)
881              %)
883              // The following works properly as expected:
884              @define foo %(
885                %( CONFIG_UTRACE == "y" %? process.syscall %: **ERROR** %)
886              %)
888       The first example is incorrect because both @defines are evaluated in a
889       pass prior to the conditional being evaluated.
891       Normally, a macro definition is local to the file it occurs  in.  Thus,
892       defining  a macro in a tapset does not make it available to the user of
893       the tapset. Publicly available library macros can be defined by
894       including .stpm files on the tapset search path. These files may only
895       contain @define constructs, which become visible across all tapsets and
896       user  scripts. Optionally, within the .stpm files, a public macro defi‐
897       nition can be surrounded by a  preprocessor  conditional  as  described
898       above.
902       Tapsets or guru-mode user scripts can access header file constant
903       tokens, typically macros, using the built-in @const() operator.  The
904       relevant header file can be included either via the tapset library,
905       or using a top-level guru mode embedded-C construct.  Using @const()
906       causes the appropriate embedded-C pragma comments to be set.
908              @const("STP_SKIP_BADVARS")
913       Identifiers  for  variables and functions are an alphanumeric sequence,
914       and may include _ and $ characters.  They may not start  with  a  plain
915       digit,  as  in  C.   Each  variable is by default local to the probe or
916       function statement block within which it is  mentioned,  and  therefore
917       its scope and lifetime are limited to a particular probe or function
918       invocation.
920       Scalar variables are implicitly typed as either string or integer.  As‐
921       sociative  arrays  also  have a string or integer value, and a tuple of
922       strings and/or integers serving as a key.  Here are a few basic expres‐
923       sions.
925              var1 = 5
926              var2 = "bar"
927              array1 [pid()] = "name"     # single numeric key
928              array2 ["foo",4,i++] += 5   # vector of string/num/num keys
929              if (["hello",5,4] in array2) println ("yes")  # membership test
932       The  translator  performs  type inference on all identifiers, including
933       array indexes and function parameters.  Inconsistent  type-related  use
934       of identifiers signals an error.
936       Variables  may  be declared global, so that they are shared amongst all
937       probes and functions and live as long as the entire systemtap  session.
938       There  is  one  namespace for all global variables, regardless of which
939       script file they are found within.  Concurrent access to global
940       variables is automatically protected with locks; see the SAFETY AND
941       SECURITY section for more details.  A global declaration may be written at
942       the outermost level anywhere, not within a block of code.  Global vari‐
943       ables which are written but never read will be displayed  automatically
944       at  session  shutdown.   The  translator  will infer for each its value
945       type, and if it is used as an array, its key types.  Optionally, scalar
946       globals  may  be initialized with a string or number literal.  The fol‐
947       lowing declaration marks variables as global.
949              global var1, var2, var3=4
952       Global variables can also be set as module options, either by  using
953       the  -G  option of stap, or by first compiling the module with stap -p4
954       and then setting the variables on the command line when calling staprun
955       on the generated module.  See staprun(8) for
956       more information.
958       The scope of a global variable may be  limited  to  a  tapset  or  user
959       script file using the private keyword. The global keyword is optional when
960       defining a private global variable. The following declaration marks var1
961       and var2 as private globals.
963              private global var1=2
964              private var2
967       Arrays  are  limited  in  size by the MAXMAPENTRIES variable -- see the
968       SAFETY AND SECURITY section for details.  Optionally, global arrays may
969       be  declared  with a maximum size in brackets, overriding MAXMAPENTRIES
970       for that array only.  Note that this doesn't indicate the type of  keys
971       for the array, just the size.
973              global tiny_array[10], normal_array, big_array[50000]
976       Arrays may be configured for wrapping using the '%' suffix.  This caus‐
977       es older elements to be overwritten if more elements are inserted  than
978       the  array  can  hold.  This  works for both associative and statistics
979       typed arrays.
981              global wrapped_array1%[10], wrapped_array2%
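
       A minimal sketch of a size-limited, wrapping array follows.  The
       array name recent and the sizes used are arbitrary illustrative
       choices.

```systemtap
global recent%[4]    # holds at most 4 entries; older ones are overwritten

probe begin {
  # Insert more elements than the array can hold.
  for (i = 0; i < 10; i++)
    recent[i] = i * i

  # Iterate in ascending key order and show what survived.
  foreach (k+ in recent)
    printf("recent[%d] = %d\n", k, recent[k])
  exit()
}
```

       Only the most recently inserted entries should remain.  Without
       the % suffix, inserting the fifth element would instead be a
       run-time error.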
985       Many types of probe points provide context variables,  which  are  run-
986       time  values, safely extracted from the kernel or userspace program be‐
987       ing probed.  These are prefixed with  the  $  character.   The  CONTEXT
988       VARIABLES section in stapprobes(3stap) lists what is available for each
989       type of probe point.  These context variables become normal  string  or
990       numeric  scalars  once they are stored in normal script variables.  See
991       the TYPECASTING section below on how to turn them back into typed
992       pointers for further processing as context variables.
996       Statements enable procedural control flow.  They may occur within func‐
997       tions and probe handlers.  The total number of statements  executed  in
998       response to any single probe event is limited to some number defined by
999       the MAXACTION macro in the translated C code, and is in the  neighbour‐
1000       hood of 1000.
1002       EXP    Execute  the string- or integer-valued expression and throw away
1003              the value.
1005       { STMT1 STMT2 ... }
1006              Execute each statement in sequence in  this  block.   Note  that
1007              separators  or  terminators  are generally not necessary between
1008              statements.
1010       ;      Null statement, do nothing.  It is useful as an optional separa‐
1011              tor  between statements to improve syntax-error detection and to
1012              handle certain grammar ambiguities.
1014       if (EXP) STMT1 [ else STMT2 ]
1015              Compare integer-valued EXP to zero.  Execute the first  (non-ze‐
1016              ro) or second STMT (zero).
1018       while (EXP) STMT
1019              While integer-valued EXP evaluates to non-zero, execute STMT.
1021       for (EXP1; EXP2; EXP3) STMT
1022              Execute EXP1 as initialization.  While EXP2 is non-zero, execute
1023              STMT, then the iteration expression EXP3.
1025       foreach (VAR in ARRAY [ limit EXP ]) STMT
1026              Loop over each element of the named global array, assigning the
1027              current key to VAR.  The array may not be modified within the
1028              statement.  By adding a single + or - operator after the VAR  or
1029              the ARRAY identifier, the iteration will proceed in a sorted or‐
1030              der, by ascending or descending index or value.   If  the  array
1031              contains statistics aggregates, adding the desired @operator be‐
1032              tween the ARRAY identifier and the + or - will specify the sort‐
1033              ing  aggregate  function.   See the STATISTICS section below for
1034              the ones available.  Default is @count.  Using the optional lim‐
1035              it  keyword  limits  the number of loop iterations to EXP times.
1036              EXP is evaluated once at the beginning of the loop.
1038       foreach ([VAR1, VAR2, ...] in ARRAY [ limit EXP ]) STMT
1039              Same as above, used when the array is indexed with  a  tuple  of
1040              keys.   A sorting suffix may be used on at most one VAR or ARRAY
1041              identifier.
1043       foreach ([VAR1, VAR2, ...] in ARRAY [INDEX1, INDEX2, ...] [  limit  EXP
1044       ]) STMT
1045              Same  as  above, where iterations are limited to elements in the
1046              array where the keys match the index values specified. The  sym‐
1047              bol  *  can be used to specify an index and will be treated as a
1048              wildcard.
1050       foreach (VAR0 = VAR in ARRAY [ limit EXP ]) STMT
1051              This variant of foreach saves the current value into VAR0 on each
1052              iteration, so VAR0 is the same as ARRAY[VAR].  This also works
1053              with a tuple of keys.  Sorting suffixes on VAR0  have  the  same
1054              effect as on ARRAY.
1056       foreach (VAR0 = VAR in ARRAY [INDEX1, INDEX2, ...] [ limit EXP ]) STMT
1057              Same  as  above, where iterations are limited to elements in the
1058              array where the keys match the index values specified. The  sym‐
1059              bol  *  can be used to specify an index and will be treated as a
1060              wildcard.
1062       break, continue
1063              Exit or iterate the innermost nesting  loop  (while  or  for  or
1064              foreach) statement.
1066       return EXP
1067              Return  EXP  value  from  enclosing function.  If the function's
1068              value is not taken anywhere, then  a  return  statement  is  not
1069              needed, and the function will have a special "unknown" type with
1070              no return value.
1072       next   Return now from enclosing probe  handler.   This  is  especially
1073              useful  in  probe aliases that apply event filtering predicates.
1074              When used in functions, the execution will be immediately trans‐
1075              ferred to the next overloaded function.
1077       try { STMT1 } catch { STMT2 }
1078              Run  the  statements  in the first block.  Upon any run-time er‐
1079              rors, abort STMT1 and start  executing  STMT2.   Any  errors  in
1080              STMT2 will propagate to outer try/catch blocks, if any.
1082       try { STMT1 } catch(VAR) { STMT2 }
1083              Same  as  above,  plus  assign  the  error message to the string
1084              scalar variable VAR.
1086       delete ARRAY[INDEX1, INDEX2, ...]
1087              Remove from ARRAY the element specified by the index tuple.   If
1088              the  index  tuple  contains  a  * in place of an index, the * is
1089              treated as a wildcard and all elements with keys that match  the
1090              index  tuple  will  be  removed  from  ARRAY.  The value will no
1091              longer be available, and subsequent iterations will  not  report
1092              the  element.  It is not an error to delete an element that does
1093              not exist.
1095       delete ARRAY
1096              Remove all elements from ARRAY.
1098       delete SCALAR
1099              Removes the value of SCALAR.  Integers and strings  are  cleared
1100              to 0 and "" respectively, while statistics are reset to the ini‐
1101              tial empty state.
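
       Several of the statements above can be seen together in the
       following sketch, which counts read() system calls per process
       for five seconds.  The variable names are arbitrary, "bash" is
       just an example key, and the error() tapset function is used only
       to trigger the catch block.

```systemtap
global hits    # hits[process name, pid] = number of read() calls seen

probe syscall.read { hits[execname(), pid()]++ }

probe timer.s(5) {
  # Five busiest (name, pid) pairs, sorted by descending value.
  foreach ([name, p] in hits- limit 5)
    printf("%-16s %6d %d\n", name, p, hits[name, p])

  # Only entries whose first key is "bash"; * matches any pid.
  foreach ([name, p] in hits["bash", *])
    printf("bash pid %d: %d reads\n", p, hits[name, p])

  # VAR0 = ... form: n receives the value stored under each key tuple.
  foreach (n = [name, p] in hits)
    total += n
  printf("total reads: %d\n", total)

  try {
    error("deliberate failure")      # raises a run-time error
  } catch (msg) {
    printf("caught: %s\n", msg)      # msg holds the error message
  }

  delete hits["bash", *]             # remove every entry keyed by "bash"
  delete hits                        # then clear the whole array
  exit()
}
```

       The local variable total starts at zero on each probe invocation,
       so it needs no explicit initialization.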
1105       Systemtap supports a number of operators that  have  the  same  general
1106       syntax,  semantics, and precedence as in C and awk.  Arithmetic is per‐
1107       formed as per typical C rules for signed integers.  Division by zero or
1108       overflow is detected and results in an error.
1110       binary numeric operators
1111              * / % + - >> << & ^ | && ||
1113       binary string operators
1114              .  (string concatenation)
1116       numeric assignment operators
1117              = *= /= %= += -= >>= <<= &= ^= |=
1119       string assignment operators
1120              = .=
1122       unary numeric operators
1123              + - ! ~ ++ --
1125       binary numeric, string comparison or regex matching operators
1126              < > <= >= == != =~ !~
1128       ternary operator
1129              cond ? exp1 : exp2
1131       grouping operator
1132              ( exp )
1134       function call
1135              fn ([ arg1, arg2, ... ])
1137       array membership check
1138              exp in array
1139              [exp1, exp2, ...] in array
1140              [*, *, ... ] in array
1144       The scripting language supports regular expression matching.  The basic
1145       syntax is as follows:
1147              exp =~ regex
1148              exp !~ regex
1150       (The first operand must be an expression evaluating to  a  string;  the
1151       second  operand  must  be  a  string literal containing a syntactically
1152       valid regular expression.)
1154       The regular expression syntax supports most of the  features  of  POSIX
1155       Extended  Regular  Expressions,  except  for subexpression reuse ("\1")
1156       functionality.
1158       After a successful match, the contents of the matched string and subex‐
1159       pressions  can  be  extracted  using the matched() and ngroups() tapset
1160       functions as follows:
1162              if ("an example string" =~ "str(ing)") {
1163                matched(0) // -> returns "string", the matched substring
1164                matched(1) // -> returns "ing", the 1st matched subexpression
1165                ngroups()  // -> returns 2, the number of matched groups
1166              }
1169   PROBES
1170       The main construct in the scripting language identifies probes.  Probes
1171       associate abstract events with a statement block ("probe handler") that
1172       is to be executed when any of those events occur.  The  general  syntax
1173       is as follows:
1175              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }
1176              probe PROBEPOINT [, PROBEPOINT] if (CONDITION) { [STMT ...] }
1179       Events  are specified in a special syntax called "probe points".  There
1180       are several varieties of probe points defined by  the  translator,  and
1181       tapset scripts may define further ones using aliases.  Probe points may
1182       be wildcarded, grouped, or listed in preference sequences, or  declared
1183       optional.   More details on probe point syntax and semantics are listed
1184       on the stapprobes(3stap) manual page.
1186       The probe handler is interpreted relative to the context of each event.
1187       For  events associated with kernel code, this context may include vari‐
1188       ables defined in the source code at that spot.   These  "context  vari‐
1189       ables"  are  presented  to the script as variables whose names are pre‐
1190       fixed with "$".  They may be accessed only  if  the  kernel's  compiler
1191       preserved  them despite optimization.  This is the same constraint that
1192       a debugger user faces when working with optimized code.   In  addition,
1193       the objects must exist in paged-in memory at the moment of the
1194       systemtap probe handler's execution, because systemtap must not cause
1195       (and therefore suppresses) any additional paging.  Some probe types
1196       have very little context.  See the stapprobes(3stap) man page for the
1197       kinds of context variables available at each kind of probe point.
1199       Probes  may be decorated with an arming condition, consisting of a sim‐
1200       ple boolean expression on read-only  global  script  variables.   While
1201       disarmed (inactive, condition evaluates to false), some probe types re‐
1202       duce or eliminate their run-time overheads.  When an  arming  condition
1203       evaluates  to  true, probes will be soon re-armed, and their probe han‐
1204       dlers will start getting called as the events fire.  (Some  events  may
1205       be  lost  during  the arming interval.  If this is unacceptable, do not
1206       use arming conditions for those probes.)  Example of the syntax:
1208              probe timer.us(TIMER) if (enabled) {
1209              }
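
       A complete sketch of an arming condition follows.  The global
       variable name enabled and the timer intervals are illustrative
       choices.

```systemtap
global enabled = 0

# Flip the flag every two seconds.
probe timer.s(2) { enabled = !enabled }

# This handler runs only while the condition holds.
probe timer.ms(500) if (enabled) {
  printf("tick\n")
}

# End the session after ten seconds.
probe timer.s(10) { exit() }
```

       The output should show bursts of ticks separated by quiet
       periods.  The exact number of ticks per burst may vary, since
       re-arming is not instantaneous.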
1212       New probe points may be defined using "aliases".  Probe  point  aliases
1213       look similar to probe definitions, but instead of activating a probe at
1214       the given point, an alias just defines a new probe point name for
1215       an existing one. There are two types of alias, the prologue style
1216       and the epilogue style which are identified by "=" and "+=" respective‐
1217       ly.
1219       For a prologue-style alias, the statement block that follows the alias
1220       definition is implicitly added as a prologue to any probe  that  refers
1221       to  the  alias.  For an epilogue-style alias, the statement block
1222       that follows the alias definition is implicitly added as an epilogue to
1223       any probe that refers to the alias.  For example:
1225              probe syscall.read = kernel.function("sys_read") {
1226                fildes = $fd
1227                if (execname() == "init") next  # skip rest of probe
1228              }
1230       defines   a   new   probe   point   syscall.read,   which   expands  to
1231       kernel.function("sys_read"), with the given statement  as  a  prologue,
1232       which  is  useful to predefine some variables for the alias user and/or
1233       to skip probe processing entirely based on some conditions.  And
1235              probe syscall.read += kernel.function("sys_read") {
1236                if (tracethis) println ($fd)
1237              }
1239       defines a new probe point with the  given  statement  as  an  epilogue,
1240       which  is  useful to take actions based upon variables set or left over
1241       by the alias user.  Please note that in each case,  the  statements
1242       in  the  alias  handler block are treated ordinarily, so that variables
1243       assigned there constitute mere initialization, not  a  macro  substitu‐
1244       tion.
1246       An alias is used just like a built-in probe type.
1248              probe syscall.read {
1249                printf("reading fd=%d\n", fildes)
1250                if (fildes > 10) tracethis = 1
1251              }
1256       Systemtap  scripts  may  define  subroutines to factor out common work.
1257       Functions take any number of scalar (integer or string) arguments,  and
1258       must  return  a single scalar (integer or string).  An example function
1259       declaration looks like this:
1261              function thisfn (arg1, arg2) {
1262                 return arg1 + arg2
1263              }
1265       Note the general absence of type declarations, which  are  instead  in‐
1266       ferred  by  the translator.  However, if desired, a function definition
1267       may include explicit type declarations for its return value and/or  its
1268       arguments.   This  is  especially helpful for embedded-C functions.  In
1269       the following example, the type inference engine need only infer the
1270       type of arg2 (a string).
1272              function thatfn:string (arg1:long, arg2) {
1273                 return sprint(arg1) . arg2
1274              }
1276       Functions  may  call  others  or  themselves recursively, up to a fixed
1277       nesting limit.  This limit is defined by the MAXNESTING  macro  in  the
1278       translated C code and is in the neighbourhood of 10.
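           As a minimal sketch of recursion within that limit (the function name
           fact is illustrative, not part of any tapset), the  following  script
           recurses five levels deep, which stays below the limit quoted above:

                  function fact:long (n:long) {
                     if (n <= 1) return 1
                     return n * fact(n - 1)
                  }
                  probe oneshot { println(fact(5)) }

           This is expected to print 120.  A much deeper recursion would instead
           be reported as a run-time error once the nesting limit is exceeded.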
1280       Functions  may  be  marked  private  using the private keyword to limit
1281       their scope to the tapset or user script file they are defined  in.  An
1282       example definition of a private function follows:
1284              private function three:long () { return 3 }
1287       Functions  terminating  without  reaching  an explicit return statement
1288       will return an implicit 0 or "", determined by type inference.
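           For instance, in this sketch (the function name label is hypothetical)
           the return type is inferred as string from the one  explicit  return,
           so the fall-through path yields "":

                  function label (flag) {
                     if (flag) return "set"
                  }
                  probe oneshot { printf("[%s] [%s]\n", label(1), label(0)) }

           Under the rule above this should print [set] [].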
1290       Functions may be overloaded during both runtime and compile time.
1292       Runtime overloading allows the executed function to be  selected  while
1293       the module is running based on runtime conditions and is achieved using
1294       the "next" statement in script functions and STAP_NEXT macro for embed‐
1295       ded-C functions. For example,
1298              function f() { if (condition) next; print("first function") }
1299              function f() %{ STAP_NEXT; print("second function") %}
1300              function f() { print("third function") }
1303       During a function call f(), execution will  transfer  to  the  third
1304       function if condition evaluates to true, and print "third  function".
1305       Note that the second function always passes control on via STAP_NEXT.
1307       Parameter overloading allows the function to be executed to be selected
1308       at compile time based on the number of arguments provided to the
1309       function call. For example,
1312              function g() { print("first function") }
1313              function g(x) { print("second function") }
1314              g() -> "first function"
1315              g(1) -> "second function"
1318       Note  that  runtime overloading does not occur in the above example, as
1319       exactly one function will be resolved for the function call. The use of
1320       a  next statement inside a function while no more overloads remain will
1321       trigger a runtime exception.  Runtime overloading will only occur if the
1322       functions have the same arity; functions with the same name but differ‐
1323       ent numbers of parameters are completely unrelated.
1325       Execution order is determined by a priority value which may  be  speci‐
1326       fied.   If no explicit priority is specified, user script functions are
1327       given a higher priority than library functions. User  script  functions
1328       and  library functions are assigned a default priority value of 0 and 1
1329       respectively.  Functions with the same priority are executed in  decla‐
1330       ration order. For example,
1333              function f():3 { if (condition) next; print("first function") }
1334              function f():1 { if (condition) next; print("second function") }
1335              function f():2 { print("third function") }
1338       Since the second function has the highest priority (the lowest  value),
1339       it  is  executed first.  The first function is never executed, as there
1340       are no "next" statements in the third function to transfer execution.
1344       A set of function names is treated specially by the translator.   These
1345       functions format values for printing to the standard  systemtap  output
1346       stream in a more convenient way (note that data generated in the kernel
1347       module needs to be transferred to  user  space  in  order  to  be
1348       printed).
1350       The  sprint*  variants return the formatted string instead of  printing
1351       it.
1353       print, sprint
1354              Print one or more values of any type, concatenated directly  to‐
1355              gether.
1357       println, sprintln
1358              Print values like print and sprint, but also append a newline.
1360       printd, sprintd
1361              Take  a string delimiter and two or more values of any type, and
1362              print the values with the delimiter interposed.   The  delimiter
1363              must be a literal string constant.
1365       printdln, sprintdln
1366              Print  values with a delimiter like printd and sprintd, but also
1367              append a newline.
1369       printf, sprintf
1370              Take a formatting string and a number of values of corresponding
1371              types,  and print them all.  The format must be a literal string
1372              constant.
1374       The printf formatting directives are similar to those of C, except that
1375       they are fully type-checked by the translator:
1377              %b     Writes a binary blob of the value given, instead of ASCII
1378                     text.  The width specifier determines the number of bytes
1379                     to  write;  valid specifiers are %b %1b %2b %4b %8b.  De‐
1380                     fault (%b) is 8 bytes.
1382              %c     Character.
1384              %d,%i  Signed decimal.
1386              %m     Safely reads kernel (without #) or user (with  #)  memory
1387                     at  the given address, outputs its content.  The optional
1388                     precision specifier (not field width) determines the num‐
1389                     ber  of bytes to read - default is 1 byte.  %10.4m prints
1390                     4 bytes of  the  memory  in  a  10-character-wide  field.
1391                     Note, on some architectures user memory can still be read
1392                     without #.
1394              %M     Same as %m, but outputs in hexadecimal.  The minimal size
1395                     of  output  is  double the optional precision specifier -
1396                     default is 1 byte (2 hex chars).  %10.4M prints  4  bytes
1397                     of the memory as 8 hexadecimal characters in a 10-charac‐
1398                     ter-wide field.   %.*M hex-dumps a given number of  bytes
1399                     from a given buffer.
1401              %o     Unsigned octal.
1403              %p     Unsigned pointer address.
1405              %s     String.
1407              %u     Unsigned decimal.
1409              %x     Unsigned hex value, in all lower-case.
1411              %X     Unsigned hex value, in all upper-case.
1413              %%     Writes a %.
1415       The  # flag selects the alternate forms.  For octal, this prefixes a 0.
1416       For hex, this prefixes 0x or 0X, depending on  case.   For  characters,
1417       this  escapes non-printing values with either C-like escapes or raw oc‐
1418       tal.  In the case of %#m/%#M, this safely accesses  user  space  memory
1419       rather than kernel space memory.
1421       Examples:
1423              a = "alice", b = "bob", p = 0x1234abcd, i = 123, j = -1, id[a] = 1234, id[b] = 4567
1424              print("hello")
1425                                        Prints: hello
1426              println(b)
1427                                        Prints: bob\n
1428              println(a . " is " . sprint(16))
1429                                        Prints: alice is 16
1430              foreach (name in id)  printdln("|", strlen(name), name, id[name])
1431                                        Prints: 5|alice|1234\n3|bob|4567\n
1432              printf("%c is %s; %x or %X or %p; %d or %u\n",97,a,p,p,p,j,j)
1433                                        Prints: a is alice; 1234abcd or 1234ABCD or 0x1234abcd; -1 or 18446744073709551615\n
1434              printf("2 bytes of kernel buffer at address %p: %.2m", p, p)
1435                                        Prints: 2 bytes of kernel buffer at address 0x1234abcd: <binary data>
1436              printf("%4b", p)
1437                                        Prints (these values as binary data): 0x1234abcd
1438              printf("%#o %#x %#X\n", 1, 2, 3)
1439                                        Prints: 01 0x2 0X3
1440              printf("%#c %#c %#c\n", 0, 9, 42)
1441                                        Prints: \000 \t *
1446       It  is  often  desirable to collect statistics in a way that avoids the
1447       penalty of repeatedly taking exclusive locks on the global variables in
1448       which those numbers are stored.  Systemtap provides a solution using  a
1449       special operator to accumulate values, and several pseudo-functions  to
1450       extract the statistical aggregates.
1452       The  aggregation operator is <<<, and resembles an assignment, or a C++
1453       output-streaming operation.  The left operand specifies a scalar or ar‐
1454       ray-index  lvalue, which must be declared global.  The right operand is
1455       a numeric expression.  The meaning is intuitive: add the  given  number
1456       to the pile of numbers to compute statistics of.  (The specific list of
1457       statistics to gather is given separately, by the extraction functions.)
1459              foo <<< 1
1460              stats[pid()] <<< memsize
1463       The extraction functions are also special.  For each  appearance  of  a
1464       distinct  extraction  function  operating  on  a  given identifier, the
1465       translator arranges to compute a set of  statistics  that  satisfy  it.
1466       The statistics system is thereby "on-demand".  Each execution of an ex‐
1467       traction function causes the aggregation to be computed for that moment
1468       across all processors.
1470       Here  is the set of extractor functions.  The first argument of each is
1471       the same style of lvalue used on the left hand side of  the  accumulate
1472       operation.   The  @count(v), @sum(v), @min(v), @max(v), @avg(v), @vari‐
1473       ance(v[, b]) extractor functions compute the number/total/minimum/maxi‐
1474       mum/average/variance  of  all accumulated values.  The resulting values
1475       are all simple integers.  Arrays containing aggregates  may  be  sorted
1476       and iterated.  See the foreach construct above.
1478       Variance  uses  Welford's online algorithm.  The calculations are based
1479       on integer arithmetic, and so may suffer from low precision  and  over‐
1480       flow.  To improve this, @variance(v[, b]) accepts an optional parameter
1481       b, the bit-shift, ranging from 0 (default) to 62, for internal scaling.
1482       Only one value of bit-shift may be used with a given global variable. A
1483       larger bitshift value increases precision, but increases the likelihood
1484       of overflow.
1487              $ stap -e \
1488              > 'global x probe oneshot { for(i=1;i<=5;i++) x<<<i println(@variance(x)) }'
1489              12
1490              $ stap -e \
1491              > 'global x probe oneshot { for(i=1;i<=5;i++) x<<<i println(@variance(x,1)) }'
1492              2
1493              $ python3 -c 'import statistics; print(statistics.variance([1, 2, 3, 4, 5]))'
1494              2.5
1495              $
1498       Overflow  (from internal multiplication of large numbers) may occur and
1499       may cause a negative variance result.  Consider normalizing your  input
1500       data.   Adding  or  subtracting  a fixed value from all variance inputs
1501       preserves the original variance.  Dividing the  variance  inputs  by  a
1502       fixed value shrinks the original variance by that value squared.
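           A sketch of such normalization, assuming the inputs cluster  around  a
           known baseline (the names raw, shifted and base, and the bit-shift  of
           8, are illustrative choices):

                  global raw, shifted
                  probe oneshot {
                    base = 1000000
                    for (i = 1; i <= 5; i++) {
                      raw <<< base + i       // large inputs
                      shifted <<< i          // same spread, baseline removed
                    }
                    println(@variance(raw, 8))
                    println(@variance(shifted, 8))
                  }

           Both aggregates describe the same spread, but the  second  keeps  the
           intermediate products small and so is less exposed to overflow.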
1506       Histograms  are  also  available, but are more complicated because they
1507       have a vector rather than scalar value.   @hist_linear(v,start,stop,in‐
1508       terval)  represents  a  linear histogram from "start" to "stop" (inclu‐
1509       sive) by increments of "interval".  The interval must be positive. Sim‐
1510       ilarly,  @hist_log(v) represents a base-2 logarithmic histogram. Print‐
1511       ing a histogram with the print family of functions renders a  histogram
1512       object as a tabular "ASCII art" bar chart.
1515              probe timer.profile {
1516                x[1] <<< pid()
1517                x[2] <<< uid()
1518                y <<< tid()
1519              }
1520              global x // an array containing aggregates
1521              global y // a scalar
1522              probe end {
1523                foreach ([i] in x @count+) {
1524                   printf ("x[%d]: avg %d = sum %d / count %d\n",
1525                           i, @avg(x[i]), @sum(x[i]), @count(x[i]))
1526                   println (@hist_log(x[i]))
1527                }
1528                println ("y:")
1529                println (@hist_log(y))
1530              }
1533       The  counts  of  each histogram bucket may be individually accessed via
1534       the [index] operator.  Each bucket is addressed from 1 through  N  (for
1535       each natural bucket).  In addition bucket #0 counts all the samples be‐
1536       neath the start value, and bucket #N+1 counts all the samples above the
1537       stop value.  Histogram buckets (including the two out-of-range buckets)
1538       may also be iterated with foreach.
1541              global x
1542              probe oneshot {
1543                x <<< -100
1544                x <<< 1
1545                x <<< 2
1546                x <<< 3
1547                x <<< 100
1548                foreach (bucket in @hist_linear(x,1,3,1))
1549                  // expecting   1 out-of-range-low bucket
1550                  //             3 payload buckets
1551                  //             1 out-of-range-high bucket
1552                  printf("bucket %d count %d\n",
1553                         bucket, @hist_linear(x,1,3,1)[bucket])
1554              }
1559       Once a pointer (see the CONTEXT VARIABLES section of stapprobes(3stap))
1560       has been saved into a script integer variable, the translator loses the
1561       type information necessary to access members from that pointer.   Using
1562       the  @cast()  operator tells the translator how to interpret the number
1563       as a typed pointer.
1565              @cast(p, "type_name"[, "module"])->member
1568       This will interpret p as a pointer to a  struct/union  named  type_name
1569       and  dereference  the member value.  Further ->subfield expressions may
1570       be appended to dereference more levels.  Note that for direct  derefer‐
1571       encing of a pointer, {kernel,user}_{char,int,...}($p) should  be  used.
1572       (Refer to stapfuncs(5) for more details.)  NOTE: the same dereferencing
1573       operator -> is used to refer to both direct  containment  and  pointer
1574       indirection.  Systemtap automatically determines which.   The  optional
1575       module  tells  the  translator where to look for information about that
1576       type.  Multiple modules may be specified as a list with  :  separators.
1577       If  the  module  is  not specified, it will default either to the probe
1578       module for dwarf probes, or to "kernel" for  functions  and  all  other
1579       probe types.
1581       The  translator  can create its own module with type information from a
1582       header surrounded by angle brackets, in case normal  debuginfo  is  not
1583       available.   For kernel headers, prefix it with "kernel" to use the ap‐
1584       propriate build system.  All other headers are built with  default  GCC
1585       parameters  into  a  user module.  Multiple headers may be specified in
1586       sequence to resolve a codependency.
1588              @cast(tv, "timeval", "<sys/time.h>")->tv_sec
1589              @cast(task, "task_struct", "kernel<linux/sched.h>")->tgid
1590              @cast(task, "task_struct",
1591                    "kernel<linux/sched.h><linux/fs_struct.h>")->fs->umask
1593       Values acquired by @cast may be pretty-printed by the $ and  $$  suffix
1594       operators,  the  same way as described in the CONTEXT VARIABLES section
1595       of the stapprobes(3stap) manual page.
1598       When in guru mode, the translator will also allow scripts to assign new
1599       values to members of typecasted pointers.
1601       Typecasting  is also useful in the case of void* members whose type may
1602       be determinable at runtime.
1604              probe foo {
1605                if ($var->type == 1) {
1606                  value = @cast($var->data, "type1")->bar
1607                } else {
1608                  value = @cast($var->data, "type2")->baz
1609                }
1610                print(value)
1611              }
1616       When in guru mode, the translator accepts embedded C code  in  the  top
1617       level  of the script.  Such code is enclosed between %{ and %} markers,
1618       and is transcribed verbatim, without analysis, in some  sequence,  into
1619       the  top  level  of the generated C code.  At the outermost level, this
1620       may be useful to add #include instructions, and any  auxiliary  defini‐
1621       tions for use by other embedded code.
1623       Another  place  where embedded code is permitted is as a function body.
1624       In this case, the script language body is replaced entirely by a  piece
1625       of C code enclosed again between %{ and %} markers.  This C code may do
1626       anything reasonable and safe.  There are a number of  undocumented  but
1627       complex safety constraints on atomicity, concurrency, resource consump‐
1628       tion, and run time limits, so this is an advanced technique.
1630       The memory locations set aside for input and  output  values  are  made
1631       available  to it using macros STAP_ARG_* and STAP_RETVALUE.  Errors may
1632       be signalled with STAP_ERROR. Output may be written  with  STAP_PRINTF.
1633       The  function  may  return early with STAP_RETURN.  Here are some exam‐
1634       ples:
1636              function integer_ops (val) %{
1637                STAP_PRINTF("%d\n", STAP_ARG_val);
1638                STAP_RETVALUE = STAP_ARG_val + 1;
1639                if (STAP_RETVALUE == 4)
1640                    STAP_ERROR("wrong guess: %d", (int) STAP_RETVALUE);
1641                if (STAP_RETVALUE == 3)
1642                    STAP_RETURN(0);
1643                STAP_RETVALUE ++;
1644              %}
1645              function string_ops (val) %{
1646                strlcpy (STAP_RETVALUE, STAP_ARG_val, MAXSTRINGLEN);
1647                strlcat (STAP_RETVALUE, "one", MAXSTRINGLEN);
1648                if (strcmp (STAP_RETVALUE, "three-two-one"))
1649                    STAP_RETURN("parameter should be three-two-");
1650              %}
1651              function no_ops () %{
1652                  STAP_RETURN(); /* function inferred with no return value */
1653              %}
1655       The function argument and return value types have to be inferred by the
1656       translator  from  the  call  sites  in order for this to work. The user
1657       should examine C code generated for ordinary script-language  functions
1658       in order to write compatible embedded-C ones.
1660       The  last  place  where  embedded code is permitted is as an expression
1661       rvalue.  In this case, the C code enclosed between %{ and %} markers is
1662       interpreted  as  an  ordinary  expression value.  It is assumed to be a
1663       normal 64-bit signed number, unless the marker /* string */ is  includ‐
1664       ed, in which case it's treated as a string.
1666              function add_one (val) {
1667                return val + %{ 1 %}
1668              }
1669              function add_string_two (val) {
1670                return val . %{ /* string */ "two" %}
1671              }
1674       The  embedded-C  code  may  contain  markers to assert optimization and
1675       safety properties.
1677       /* pure */
1678              means that the C code has no side effects and may be elided  en‐
1679              tirely if its value is not used by script code.
1681       /* stable */
1682              means  that  the  C code always has the same value (in any given
1683              probe handler invocation), so repeated calls may be automatical‐
1684              ly replaced by memoized values.  Such functions must take no pa‐
1685              rameters, and also be pure.
1687       /* unprivileged */
1688              means that the C code is so safe that  even  unprivileged  users
1689              are permitted to use it.
1691       /* myproc-unprivileged */
1692              means  that  the  C code is so safe that even unprivileged users
1693              are permitted to use it, provided that the target of the current
1694              probe is within the user's own process.
1696       /* guru */
1697              means  that  the  C code is so unsafe that a systemtap user must
1698              specify -g (guru mode) to use this.  (Tapsets are permitted  and
1699              presumed to call them safely.)
1701       /* unmangled */
1702              in an embedded-C function, means that the legacy (pre-1.8) argu‐
1703              ment access syntax should be made available inside the function.
1704              Hence, in addition to STAP_ARG_foo and STAP_RETVALUE one can use
1705              THIS->foo and THIS->__retvalue respectively inside the function.
1706              This  is useful for quickly migrating code written for SystemTap
1707              version 1.7 and earlier.
1709       /* unmodified-fnargs */
1710              in an embedded-C function, means that the function arguments are
1711              not modified inside the function body.
1713       /* string */
1714              in  embedded-C  expressions  only, means that the expression has
1715              const char * type and should be treated as a string  value,  in‐
1716              stead of the default long numeric.
1718       Script-level global variables may be accessed in  embedded-C  functions
1719       and  blocks.   To  read  or  write  the  global  variable   var,   the
1720       /* pragma:read:var */ or /* pragma:write:var */ marker  must  first  be
1721       placed in the embedded-C function  or  block.   This  provides  the
1722       STAP_GLOBAL_GET_* and STAP_GLOBAL_SET_* macros  to  allow  reading  and
1723       writing, respectively. For example:
1725              global var
1726              global var2[100]
1727              function increment() %{
1728                  /* pragma:read:var */ /* pragma:write:var */
1729                  /* pragma:read:var2 */ /* pragma:write:var2 */
1730                  STAP_GLOBAL_SET_var(STAP_GLOBAL_GET_var()+1); //var++
1731                  STAP_GLOBAL_SET_var2(1, 1, STAP_GLOBAL_GET_var2(1, 1)+1); //var2[1,1]++
1732              %}
1734       Variables may be read and set in both embedded-C functions and  expres‐
1735       sions.   Strings returned from embedded-C code are decayed to pointers.
1736       Variables must also be assigned at script level to allow for  type  in‐
1737       ference.  Map assignment does not return the value written, so chaining
1738       does not work.
1741   BUILT-INS
1742       A  set  of  builtin probe point aliases is provided by the scripts
1743       installed in the directory specified in the stappaths(7) manual  page.
1744       The functions are described in the stapprobes(3stap) manual page.
1748       Integers can be dereferenced from pointers  saved  as  script  integer
1749       variables  using  the  @kderef()  or @uderef() operators.  @kderef() is
1750       used for kernel space addresses and @uderef() is used  for  user  space
1751       addresses.
1753              @kderef(SIZE, addr)
1754              @uderef(SIZE, addr)
1756       This  will  interpret addr as a kernel/user address and read SIZE bytes
1757       starting at that address.  SIZE should be either 1, 2, 4 or 8 bytes.
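           A usage sketch, assuming kernel debuginfo is available and  that  the
           probed function has a struct file * parameter named file  (the  probe
           point and field are illustrative):

                  probe kernel.function("vfs_read") {
                    addr = &$file->f_flags     // address saved as an integer
                    printf("f_flags=%#x\n", @kderef(4, addr))
                    exit()
                  }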
1761       The value stored within a register can be accessed using  the  @kregis‐
1762       ter() or @uregister() operators.  @kregister() is used for kernel space
1763       registers and @uregister() is used for user space registers. The regis‐
1764       ter of interest is specified using its DWARF number.
1766              @kregister(0)
1767              @uregister(5)


1771       The translator begins pass 1 by parsing the given input script, and all
1772       scripts  (files  named  *.stp)  found  in  a  tapset  directory.    The
1773       directories listed with -I are processed in sequence, each processed in
1774       "guru mode".  For each directory, a number of subdirectories  are  also
1775       searched.   These  subdirectories  are derived from the selected kernel
1776       version (the -R option), in order to allow more kernel-version-specific
1777       scripts  to  override  less  specific  ones.  For example, for a kernel
1778       version 2.6.12-23.FC3 the following  patterns  would  be  searched,  in
1779       sequence:  2.6.12-23.FC3/*.stp,  2.6.12/*.stp,  2.6/*.stp,  and finally
1780       *.stp.  Stopping the translator after pass 1 causes  it  to  print  the
1781       parse trees.
1784       In  pass 2, the translator analyzes the input script to resolve symbols
1785       and types.  References to variables, functions, and probe aliases  that
1786       are unresolved internally are satisfied by searching through the parsed
1787       tapset script files.  If any tapset script file is selected because  it
1788       defines  an  unresolved symbol, then the entirety of that file is added
1789       to the translator's resolution queue.  This process iterates until  all
1790       symbols are resolved and a subset of tapset script files is selected.
1792       Next,  all  probe  point  descriptions  are  validated against the wide
1793       variety supported by the translator.  Probe points that refer  to  code
1794       locations  ("synchronous  probe points") require the appropriate kernel
1795       debugging  information  to  be  installed.   In  the  associated  probe
1796       handlers,  target-side variables (whose names begin with "$") are found
1797       and have their run-time locations decoded.
1799       Next,  all  probes  and  functions  are   analyzed   for   optimization
1800       opportunities, in order to remove variables, expressions, and functions
1801       that have no useful value and no side-effect.  Embedded-C functions are
1802       assumed  to  have  side-effects  unless  they  include the magic string
1803       /* pure */.  Since this optimization can hide latent code  errors  such
1804       as  type  mismatches or invalid $context variables, it sometimes may be
1805       useful to disable the optimizations with the -u option.
1807       Finally, all variable, function, parameter, array, and index types  are
1808       inferred   from   context   (literals  and  operators).   Stopping  the
1809       translator after pass 2 causes it to list all  the  probes,  functions,
1810       and  variables,  along  with  all  inferred types.  Any inconsistent or
1811       unresolved types cause an error.
1814       In pass 3, the translator writes C code that represents the actions  of
1815       all  selected script files, and creates a Makefile to build that into a
1816       kernel object.  These files are  placed  into  a  temporary  directory.
1817       Stopping  the  translator at this point causes it to print the contents
1818       of the C file.
1821       In pass 4, the translator invokes the  Linux  kernel  build  system  to
1822       create  the  actual  kernel object file.  This involves running make in
1823       the temporary directory, and requires  a  kernel  module  build  system
1824       (headers,  config  and  Makefiles)  to  be  installed in the usual spot
1825       /lib/modules/VERSION/build.  Stopping the translator after  pass  4  is
1826       the  last  chance before running the kernel object.  This may be useful
1827       if you want to archive the file.
1830       In pass 5, the  translator  invokes  the  systemtap  auxiliary  program
1831       staprun  for  the  given  kernel  object.   This  program  arranges  to
1832       load the module then communicates with it, copying trace data from  the
1833       kernel  into temporary files, until the user sends an interrupt signal.
1834       Any run-time error encountered by the probe handlers, such  as  running
1835       out  of  memory, division by zero, exceeding nesting or runtime limits,
1836       results in a soft error indication.  Soft errors in excess of MAXERRORS
1837       block  all  subsequent  probes  (except  error-handling  probes),   and
1838       terminate the session.  Finally, staprun unloads the module, and cleans
1839       up.
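           The pass at which the translator stops can be chosen with the -p  op‐
           tion.  As a sketch, assuming the script below is saved  as  hello.stp
           (a hypothetical file name):

                  # stap -p1 hello.stp    parse only, print the parse tree
                  # stap -p2 hello.stp    also resolve symbols and types
                  # stap -p3 hello.stp    also print the generated C code
                  # stap -p4 hello.stp    also build the kernel object
                  # stap hello.stp        run all five passes
                  probe begin { println("hello"); exit() }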
1843       One  should  avoid  killing the stap process forcibly, for example with
1844       SIGKILL, because the stapio  process  (a  child  process  of  the  stap
1845       process)  and  the loaded module may be left running on the system.  If
1846       this happens, send SIGTERM or SIGINT to any remaining stapio processes,
1847       then use rmmod to unload the systemtap module.


1852       See the stapex(3stap) manual page for a brief collection of samples, or
1853       a   large   set   of   installed   samples    under    the    systemtap
1854       documentation/testsuite  directories.   See  stappaths(7stap)  for  the
1855       likely location of these on the system.


1859       The systemtap translator caches the pass  3  output  (the  generated  C
1860       code)  and  the  pass  4  output (the compiled kernel module) if pass 4
1861       completes successfully.  This cached  output  is  reused  if  the  same
1862       script  is  translated  again  assuming the same conditions exist (same
1863       kernel version, same systemtap version, etc.).  Cached files are stored
1864       in  the  $SYSTEMTAP_DIR/cache  directory.  The  cache can be limited by
1865       having the file cache_mb_limit placed in  the  cache  directory  (shown
1866       above)  containing  only an ASCII integer representing how many MiB the
1867       cache should not exceed. In the absence of this file, a default will be
1868       created  with  the limit set to 256MiB.  This is a 'soft' limit in that
1869       the cache will be cleaned after a new entry is added if the cache clean
1870       interval  is  exceeded,  so the total cache size may temporarily exceed
1871       this  limit.  This  interval  can  be  specified  by  having  the  file
1872       cache_clean_interval_s  placed  in  the  cache  directory (shown above)
1873       containing only an ASCII integer representing the interval in  seconds.
1874       In  the  absence  of  this  file,  a  default  will be created with the
1875       interval set to 300 s.


1879       Systemtap may be used as a powerful administrative tool.  It can expose
1880       kernel   internal   data   structures   and  potentially  private  user
1881       information.  (In dyninst runtime mode, this is not the case,  see  the
1882       ALTERNATE RUNTIMES section below.)
1884       The  translator  asserts many safety constraints during compilation and
1885       more during run-time.  It aims to ensure that no  handler  routine  can
1886       run   for   very   long,  allocate  boundless  memory,  perform  unsafe
1887       operations, or unintentionally interfere  with  the  system.   Uses  of
1888       script   global   variables  are  automatically  read/write  locked  as
1889       appropriate,  to  protect  against  manipulation  by  concurrent  probe
1890       handlers.   (Deadlocks  are detected with timeouts.  Use the -t flag to
1891       receive reports of  excessive  lock  contention.)   Experimenting  with
1892       scripts  is  therefore  generally safe.  The guru-mode -g option allows
1893       administrators to bypass most safety measures, which  permits  invasive
1894       or  state-changing  operations  and embedded-C code, and so increases the
1895       risk of upset.  By default, overload prevention is turned  on  for  all
1896       modules.   If  you  would  like to disable overload processing, use the
1897       --suppress-time-limits option.
1899       Errors that are caught at run time normally result in  a  clean  script
1900       shutdown  and  a  pass-5  error message.  The --suppress-handler-errors
1901       option lets scripts tolerate soft errors without shutting down.
1906       For the normal linux-kernel-module runtime, to run the  kernel  objects
1907       systemtap builds, a user must be one of the following:
1909       ·   the root user;
1911       ·   a member of the stapdev and stapusr groups;
1913       ·   a member of the stapsys and stapusr groups; or
1915       ·   a member of the stapusr group.
1917       The root user or a user who is a member of both the stapdev and stapusr
1918       groups can build and run any systemtap script.
1920       A user who is a member of both the stapsys and stapusr groups can  only
1921       use pre-built modules under the following conditions:
1923       ·   The module has been signed by a trusted signer. Trusted signers are
1924           normally systemtap compile-servers  which  sign  modules  when  the
1925           --privilege   option   is   specified   by   the  client.  See  the
1926           stap-server(8) manual page for more information.
1928       ·   The  module  was  built  using  the  --privilege=stapsys   or   the
1929           --privilege=stapusr options.
1931       Members  of only the stapusr group can only use pre-built modules under
1932       the following conditions:
1934       ·   The  module  is  located  in   the   /lib/modules/VERSION/systemtap
1935           directory.   This  directory must be owned by root and not be world
1936           writable.
1938       or
1940       ·   The module has been signed by a trusted signer. Trusted signers are
1941           normally  systemtap  compile-servers  which  sign  modules when the
1942           --privilege  option  is  specified   by   the   client.   See   the
1943           stap-server(8) manual page for more information.
1945       ·   The module was built using the --privilege=stapusr option.
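       The group rules above can be summarised by a small helper.  This is
       only a sketch: the function name stap_tier and its output words are
       hypothetical, and it merely restates the membership rules listed in
       this section; it does not check module signatures or locations.

```shell
# Print the privilege tier implied by a uid and a list of group names.
# $1: numeric user id; $2: space-separated group names (as from `id -nG`).
stap_tier() {
    tier_uid="$1"
    tier_groups=" $2 "
    if [ "$tier_uid" -eq 0 ]; then
        echo "root"                        # may build and run any script
        return
    fi
    case "$tier_groups" in
        *" stapusr "*) ;;                  # stapusr is required for every tier
        *) echo "none"; return ;;
    esac
    case "$tier_groups" in
        *" stapdev "*) echo "stapdev" ;;   # may build and run any script
        *" stapsys "*) echo "stapsys" ;;   # pre-built, signed modules only
        *)             echo "stapusr" ;;   # most restricted tier
    esac
}

# Report the tier of the current user.
stap_tier "$(id -u)" "$(id -nG)"
```

       A user in several of these groups is reported at the most capable
       tier, since stapdev together with stapusr already permits building
       and running any script.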
1947       The kernel modules generated by the stap program are run by the staprun
1948       program.  The latter is a part of the Systemtap package,  dedicated  to
1949       module  loading and unloading (but only in the white zone), and kernel-
1950       to-user data transfer.  Since staprun does not perform  any  additional
1951       security  checks  on the kernel objects it is given, it would be unwise
1952       for a system administrator to add untrusted users  to  the  stapdev  or
1953       stapusr groups.
1957       If  the  current  system has SecureBoot turned on in the UEFI firmware,
1958       all kernel modules must be signed.  (Some kernels may  allow  disabling
1959       SecureBoot  long  after  booting  with  a key sequence such as SysRq-X,
1960       making it unnecessary to sign modules.)  The systemtap  compile  server
1961       can  sign  modules with a MOK (Machine Owner Key) that it has in common
1962       with a client system. See the following wiki page for more details:
1964              https://sourceware.org/systemtap/wiki/SecureBoot
1966       Some kernels do not let systemtap  guess  whether  module  signing
1967       is  in  effect.   On  such machines, set the SYSTEMTAP_SIGN environment
1968       variable to any value while running stap.
1972       Many resource use limits are set by macros in  the  generated  C  code.
1973       These may be overridden with -D flags.  A selection of these is as fol‐
1974       lows:
1976       MAXNESTING
1977              Maximum number of nested function calls.  Default determined  by
1978              script  analysis,  with  a  bonus  10  slots added for recursive
1979              scripts.
1981       MAXSTRINGLEN
1982              Maximum length of strings, default 128.
1984       MAXTRYLOCK
1985              Maximum number of iterations to wait for locks on  global  vari‐
1986              ables before declaring possible deadlock and skipping the probe,
1987              default 1000.
1989       MAXACTION
1990              Maximum number of statements to execute during any single  probe
1991              hit  (with  interrupts  disabled),  default 1000.  Note that for
1992              straight-through probe handlers lacking loops or recursion,  due
1993              to optimization, this parameter may be interpreted too conserva‐
1994              tively.
1996       MAXACTION_INTERRUPTIBLE
1997              Maximum number of statements to execute during any single  probe
1998              hit which is executed with interrupts enabled (such as begin/end
1999              probes), default (MAXACTION * 10).
2001       MAXBACKTRACE
2002              Maximum number of stack frames that will be  processed  by  the
2003              stap  runtime unwinder as produced by the backtrace functions in
2004              the [u]context-unwind.stp tapsets, default 20.
2006       MAXMAPENTRIES
2007              Maximum number of rows in any single global array, default 2048.
2008              Individual arrays may be declared with a larger or smaller limit
2009              instead:
2011              global big[10000],little[5]
2013              or denoted with % to make them wrap-around (replace old entries)
2014              automatically, as in
2016              global big%
2018              or both.
2020       MAPHASHBIAS
2021              The  number of powers-of-two to add or subtract from the natural
2022              size of the hash table backing each  global  associative  array.
2023              Default  is  0.  Try small positive numbers to get extra perfor‐
2024              mance at the cost  of  more  memory  consumption,  because  that
2025              should reduce hash table collisions.  Try small negative numbers
2026              for the opposite tradeoff.
2028       MAXERRORS
2029              Maximum number of soft errors before an exit is  triggered,  de‐
2030              fault  0, which means that the first error will exit the script.
2031              Note that with the --suppress-handler-errors option, this  limit
2032              is not enforced.
2034       MAXSKIPPED
2035              Maximum  number  of  skipped probes before an exit is triggered,
2036              default 100.  Running systemtap with -t (timing) mode gives more
2037              details about skipped probes.  With the default
2038              -DINTERRUPTIBLE=1 setting, probes skipped due to reentrancy are
2039              not accumulated against this limit.  Note that with the
2040              --suppress-handler-errors option, this limit is not enforced.
2042       MINSTACKSPACE
2043              Minimum number of free kernel stack bytes required in  order  to
2044              run  a probe handler, default 1024.  This number should be large
2045              enough for the probe handler's own needs, plus a safety margin.
2047       MAXUPROBES
2048              Maximum number of concurrently armed user-space probes
2049              (uprobes), default somewhat larger than the number of user-space
2050              probe points named in the script.  This pool needs to be  poten‐
2051              tially  large  because individual uprobe objects (about 64 bytes
2052              each) are allocated for each process for each  matching  script-
2053              level probe.
2055       STP_MAXMEMORY
2056              Maximum  amount of memory (in kilobytes) that the systemtap mod‐
2057              ule should use, default unlimited.  The memory size includes the
2058              size  of  the  module  itself,  plus any additional allocations.
2059              This only tracks direct allocations by  the  systemtap  runtime.
2060              This does not track indirect allocations (as done by
2061              kprobes/uprobes/etc. internals).
2063       STP_OVERLOAD_THRESHOLD, STP_OVERLOAD_INTERVAL
2064              Maximum number of machine cycles spent in probes on any cpu  per
2065              given interval, before an overload condition is declared and the
2066              script shut down.  The defaults are 500 million and  1  billion,
2067              so as to limit stap script cpu consumption at around 50%.
2069       STP_PROCFS_BUFSIZE
2070              Size  of  procfs  probe  read  buffers  (in bytes).  Defaults to
2071              MAXSTRINGLEN.  This value can be overridden on a per-procfs file
2072              basis using the procfs read probe .maxsize(MAXSIZE) parameter.
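       The following sketch shows how the -D overrides described above might
       be passed, and works through the arithmetic behind the roughly 50%
       cpu figure.  The override values are illustrative, not
       recommendations, and STP_OVERLOAD_THRESHOLD and STP_OVERLOAD_INTERVAL
       are assumed here to be the macro names behind the overload defaults
       of 500 million and 1 billion cycles.

```shell
# Default overload limits, in machine cycles per cpu (assumed macro names).
threshold=500000000    # STP_OVERLOAD_THRESHOLD
interval=1000000000    # STP_OVERLOAD_INTERVAL
echo "probe cpu budget: $(( threshold * 100 / interval ))%"

# Illustrative overrides: longer strings, more statements per probe hit,
# larger global arrays.
limits="-DMAXSTRINGLEN=256 -DMAXACTION=5000 -DMAXMAPENTRIES=4096"

# Run a trivial script with those limits if stap is installed.
if command -v stap >/dev/null 2>&1; then
    # $limits is left unquoted on purpose so each -D flag is a separate word.
    stap $limits -e 'probe begin { println("limits applied"); exit() }'
fi
```

       Raising a limit trades safety margin for capacity, so it is usually
       better to raise only the limit that a script has actually hit.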
2074       With  scripts that contain probes on any interrupt path, it is possible
2075       that those interrupts may occur in the middle of another probe handler.
2076       The  probe  in  the  interrupt handler would be skipped in this case to
2077       avoid reentrance.  To work around this issue, execute stap with the op‐
2078       tion -DINTERRUPTIBLE=0 to mask interrupts throughout the probe handler.
2079       This does add some extra overhead to the probes,  but  it  may  prevent
2080       reentrance  for  common problem cases.  However, probes in NMI handlers
2081       and in the callpath of the stap runtime may still  be  skipped  due  to
2082       reentrance.
2085       In case something goes wrong with stap or staprun after a probe has al‐
2086       ready started running, one may safely kill both user processes, and re‐
2087       move the active probe kernel module with rmmod.  Any pending trace mes‐
2088       sages may be lost.


2092       Systemtap exposes kernel internal data structures and potentially
2093       private user information. Because of this, use of systemtap's full
2094       capabilities is restricted to root and to users who are members of the
2095       groups stapdev and stapusr.
2097       However, a restricted set of systemtap's features can be made available
2098       to trusted, unprivileged users. These users are members  of  the  group
2099       stapusr  only,  or  members  of  the groups stapusr and stapsys.  These
2100       users can load systemtap modules which have been compiled and certified
2101       by  a trusted systemtap compile-server. See the descriptions of the op‐
2102       tions --privilege and --use-server. See README.unprivileged in the sys‐
2103       temtap  source  code for information about setting up a trusted compile
2104       server.
2106       The restrictions enforced when --privilege=stapsys is specified are de‐
2107       signed to prevent unprivileged users from:
2109              ·   harming the system maliciously.
2111       The restrictions enforced when --privilege=stapusr is specified are de‐
2112       signed to prevent unprivileged users from:
2114              ·   harming the system maliciously.
2116              ·   gaining access to information which would  not  normally  be
2117                  available to an unprivileged user.
2119              ·   disrupting the performance of processes owned by other users
2120                  of the system.  Some overhead to the system  in  general  is
2121                  unavoidable  since  the  unprivileged  user's probes will be
2122                  triggered at the appropriate times. What we  would  like  to
2123                  avoid  is  targeted interruption of another user's processes
2124                  which would not normally be possible by an unprivileged  us‐
2125                  er.
2129       A member of the groups stapusr and stapsys may use all probe points.
2131       A member of only the group stapusr may use only the following probes:
2133              ·   begin, begin(n)
2135              ·   end, end(n)
2137              ·   error(n)
2139              ·   never
2141              ·   process.*, where the target process is owned by the user.
2143              ·   timer.{jiffies,s,sec,ms,msec,us,usec,ns,nsec}(n)*
2145              ·   timer.hz(n)
2149       The  following  scripting  language features are unavailable to all un‐
2150       privileged users:
2153              ·   any feature enabled by the Guru Mode (-g) option.
2155              ·   embedded C code.
2159       The following runtime restrictions are  placed  upon  all  unprivileged
2160       users:
2162              ·   Only the default runtime code (see -R) may be used.
2164       Additional restrictions are placed on members of only the group
2165       stapusr:
2167              ·   Probing of processes owned by other users is not permitted.
2169              ·   Access of kernel memory (read and write) is not permitted.
2173       Some command line options provide access to features which must not  be
2174       available to any unprivileged user:
2177              ·   -g may not be specified.
2179              ·   The  following options may not be used by the compile-server
2180                  client:
2182                      -a, -B, -D, -I, -r, -R
2187       The following environment variables must not be set for any
2188       unprivileged user:
2190              SYSTEMTAP_RUNTIME
2191              SYSTEMTAP_TAPSET
2197       In  general,  tapset  functions  are  only available for members of the
2198       group stapusr when they do not gather information that an ordinary pro‐
2199       gram running with that user's privileges would be denied access to.
2201       There  are  two  categories of unprivileged tapset functions. The first
2202       category consists of utility functions that are unconditionally  avail‐
2203       able to all users; these include such things as:
2205              cpu:long ()
2206              exit ()
2207              str_replace:string (prnt_str:string, srch_str:string, rplc_str:string)
2210       The second category consists of so-called myproc-unprivileged functions
2211       that can only gather information within their  own  processes.  Scripts
2212       that  wish  to  use  these functions must test the result of the tapset
2213       function is_myproc and only call these functions if the  result  is  1.
2214       The  script  will exit immediately if any of these functions are called
2215       by an unprivileged user within a probe within a process  which  is  not
2216       owned by that user. Examples of myproc-unprivileged functions include:
2218              print_usyms (stk:string)
2219              user_int:long (addr:long)
2220              usymname:string (addr:long)
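       A minimal sketch of the is_myproc guard described above follows.  The
       file name myproc-demo.stp and the program ./myprog are placeholders,
       and the use of uaddr() to obtain the current user-space address is an
       assumption about the tapset rather than something stated in this
       page.

```shell
# Write a small script that calls a myproc-unprivileged function only
# after is_myproc() has confirmed the probed process belongs to the user.
cat > myproc-demo.stp <<'EOF'
probe process.function("main") {
    # Guard: usymname() may only be used within our own processes.
    if (is_myproc())
        printf("main at %s\n", usymname(uaddr()))
}
EOF

# Hypothetical invocation; ./myprog stands for a program you own:
#   stap --privilege=stapusr -c ./myprog myproc-demo.stp
```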
2223       A  compile  error  is  triggered when any function not in either of the
2224       above categories is used by members of only the group stapusr.
2226       No other built-in tapset functions may be used by members of  only  the
2227       group stapusr.


2231       As  described above, systemtap's default runtime mode involves building
2232       and loading kernel modules, with various security tradeoffs  presented.
2233       Systemtap  now  includes  two new prototype backends: --runtime=dyninst
2234       and --runtime=bpf.
2236       --runtime=dyninst uses Dyninst to instrument a user's own processes  at
2237       runtime. This backend does not use kernel modules, and does not require
2238       root privileges, but is restricted with respect to the kinds of  probes
2239       and other constructs that a script may use. dyninst runtime operates in
2240       target-attach mode, so it does require a -c COMMAND or -x PID  process.
2241       For example:
2243              stap --runtime=dyninst -c 'stap -V' \
2244                   -e 'probe process.function("main")
2245                       { println("hi from dyninst!") }'
2248       It may be necessary to disable a conflicting selinux check with
2250              # setsebool allow_execstack 1
2253       --runtime=bpf  compiles  the  user script into extended Berkeley Packet
2254       Filter (eBPF) programs instead of a kernel module.  eBPF  programs  are
2255       verified by the kernel for safety and are executed by an in-kernel vir‐
2256       tual machine.  This runtime is in an early  stage  of  development  and
2257       currently  lacks  support for a number of features available in the de‐
2258       fault runtime. Please see the stapbpf(8) man page for more information.


2262       The systemtap translator generally returns with a success code of 0  if
2263       the  requested  script  was processed and executed successfully through
2264       the requested pass.  Otherwise, errors may be printed to stderr  and  a
2265       failure  code is returned.  Use -v or -vp N to increase (global or per-
2266       pass) verbosity to identify the source of the trouble.
2268       In listings mode (-l and -L), error messages are  normally  suppressed.
2269       A  success  code  of  0  is returned if at least one matching probe was
2270       found.
2272       A script executing in pass 5 that is interrupted with ^C  /  SIGINT  is
2273       considered to be successful.
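       Because listing mode returns 0 only when at least one probe point
       matches, its exit status can be tested directly from a shell script.
       The helper name probe_exists below is hypothetical, and
       kernel.function("vfs_read") is just an example pattern.

```shell
# Return success if at least one probe point matches the pattern in $1.
probe_exists() {
    stap -l "$1" >/dev/null 2>&1
}

if probe_exists 'kernel.function("vfs_read")'; then
    echo "probe point available"
else
    echo "no matching probe, or stap could not run"
fi
```

       Note that a non-zero status does not distinguish "no match" from
       other failures, such as stap being absent; rerun with -v to see why.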


2277       Over  time, some features of the script language and the tapset library
2278       may undergo incompatible changes, so that a script written  against  an
2279       old  version  of  systemtap  may no longer run.  In these cases, it may
2280       help to run systemtap with the --compatible  VERSION  flag,  specifying
2281       the   last   known   working   version.   Running  systemtap  with  the
2282       --check-version flag will output a warning if any possibly incompatible
2283       elements have been parsed.  Historical details of deprecations  may  be
2284       found in the NEWS file.
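       As a sketch, the two flags can be combined in a small wrapper.  The
       function name run_compat, the version 4.0, and the file name
       old-script.stp are placeholders chosen for illustration.

```shell
# Run a script as if under an older systemtap release, and ask for
# warnings about constructs that may have changed since then.
# $1: last known working version; $2: script file.
run_compat() {
    stap --compatible "$1" --check-version "$2"
}

# Hypothetical usage:
#   run_compat 4.0 old-script.stp
```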
2286       The purpose of the deprecation facility is to improve the experience of
2287       scripts written for newer versions of systemtap (by adding better
2288       alternatives and removing conflicting or messy older alternatives), while
2289       at the same time permitting scripts written for older versions of
2290       systemtap to continue running.  Deprecation is thus intended as a service
2291       to users (and an inconvenience to systemtap's developers), rather than the
2292       other way around.
2294       Please note that underscore-prefixed identifiers in the tapset sometimes
2295       undergo changes for which compatibility is difficult to preserve,
2296       even with the deprecation mechanisms.  Avoid relying on  these  in
2297       your  scripts;  instead  propose  them for promotion to non-underscored
2298       status.


2303       Important files and their corresponding paths can be found in the
2304       stappaths(7) manual page.


2308       stapprobes(3stap),
2309       function::*(3stap),
2310       probe::*(3stap),
2311       tapset::*(3stap),
2312       stappaths(7),
2313       staprun(8),
2314       stapdyn(8),
2315       systemtap(8),
2316       stapvars(3stap),
2317       stapex(3stap),
2318       stap-server(8),
2319       stap-prep(1),
2320       stapref(1),
2321       awk(1),
2322       gdb(1)


2326       Use the Bugzilla link of the project web  page  or  our  mailing  list.
2327       http://sourceware.org/systemtap/, <systemtap@sourceware.org>.
2329       error::reporting(7stap),
2330       https://sourceware.org/systemtap/wiki/HowToReportBugs