STAP(1)                     General Commands Manual                    STAP(1)


NAME
       stap - systemtap script translator/driver


SYNOPSIS
       stap [ OPTIONS ] FILENAME [ ARGUMENTS ]
       stap [ OPTIONS ] - [ ARGUMENTS ]
       stap [ OPTIONS ] -e SCRIPT [ ARGUMENTS ]
       stap [ OPTIONS ] -l PROBE [ ARGUMENTS ]
       stap [ OPTIONS ] -L PROBE [ ARGUMENTS ]
       stap [ OPTIONS ] --dump-probe-types
       stap [ OPTIONS ] --dump-probe-aliases
       stap [ OPTIONS ] --dump-functions


DESCRIPTION
       The stap program is the front-end to the Systemtap tool.  It accepts
       probing instructions written in a simple domain-specific language,
       translates those instructions into C code, compiles this C code, and
       loads the resulting module into a running Linux kernel or a DynInst
       user-space mutator, to perform the requested system trace/probe
       functions.  You can supply the script in a named file (FILENAME), from
       standard input (use - instead of FILENAME), or from the command line
       (using -e SCRIPT).  The program runs until it is interrupted by the
       user, the script voluntarily invokes the exit() function, or a
       sufficient number of soft errors accumulates.

       The language, which is described in the SCRIPT LANGUAGE section below,
       is strictly typed, expressive, declaration free, procedural,
       prototyping-friendly, and inspired by awk and C.  It allows source code
       points or events in the system to be associated with handlers, which
       are subroutines that are executed synchronously.  It is somewhat
       similar conceptually to "breakpoint command lists" in the gdb debugger.


DOCUMENTATION
       Systemtap comes with a variety of educational, documentation, and
       reference resources.  They are available online and/or packaged for
       offline use.  For online documentation, see the project web site,
       https://sourceware.org/systemtap/
       ┌──────────────────────────┬──────────────────────────────────────────────────────┐
       │man pages                 │                                                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stap (this page)          │ language syntax, concepts, operation, options        │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapprobes                │ probe points and their $context variables            │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapref                   │ quick reference to language syntax                   │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stappaths                 │ list of directories, including books & references    │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stap-prep                 │ program to install auxiliary dependencies like       │
       │                          │ kernel debuginfo                                     │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │tapset::*                 │ generated list of tapsets                            │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │probe::*                  │ generated list of tapset probe aliases               │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │function::*               │ generated list of tapset functions                   │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │macro::*                  │ generated list of tapset macros                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapvars                  │ some of the tapset global variables                  │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │staprun, stapdyn, stapbpf │ programs for executing compiled systemtap scripts    │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │systemtap                 │ initscript, boot-time probing                        │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stap-server               │ compilation server                                   │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │stapex                    │ a few very basic script examples                     │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │books                     │                                                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Beginner's Guide          │ tutorial book, language essentials, examples         │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Tutorial                  │ shorter tutorial, exercises                          │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Language Reference        │ detailed language manual, covers statistics/analysis │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │Tapset Reference          │ the tapset man pages, reformatted into a book        │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │references                │                                                      │
       ├──────────────────────────┼──────────────────────────────────────────────────────┤
       │example scripts           │ over a hundred directly usable sysadmin tools, toys, │
       │                          │ hacks to learn from                                  │
       └──────────────────────────┴──────────────────────────────────────────────────────┘


OPTIONS
       The systemtap translator supports the following options.  Any other
       option prints a list of supported options.  Options may be given on the
       command line, as usual.  If the file $SYSTEMTAP_DIR/rc exists, options
       are also loaded from there and interpreted first.  ($SYSTEMTAP_DIR
       defaults to $HOME/.systemtap if unset.)

       In some cases, the default value of an option depends on the particular
       system configuration and thus can't be mentioned here directly.  In
       some of those cases, running "stap --help" might display the default.
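
       The rc file lookup described above can be sketched in shell.  This is
       only an illustration of the defaulting rule; the function name
       stap_rc_path is hypothetical and is not part of systemtap.

```shell
# Resolve the options file as the text describes: $SYSTEMTAP_DIR/rc,
# where SYSTEMTAP_DIR falls back to $HOME/.systemtap only when unset.
# (stap_rc_path is a hypothetical helper name, not a systemtap command.)
stap_rc_path() {
    printf '%s\n' "${SYSTEMTAP_DIR-$HOME/.systemtap}/rc"
}

rc=$(stap_rc_path)
if [ -f "$rc" ]; then
    echo "options would also be loaded from $rc"
else
    echo "no rc file at $rc"
fi
```

       The ${VAR-default} form is used rather than ${VAR:-default} because
       the text specifies the fallback for the unset case only.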
111       -      Use standard input instead of a given FILENAME as probe language
112              input, unless -e SCRIPT is given.
114       -h --help
115              Show help message.
117       -V --version
118              Show version message.
120       -p NUM Stop after pass NUM.  The passes are numbered 1-5: parse, elabo‐
121              rate, translate, compile, run.  See the PROCESSING  section  for
122              details.
       -v     Increase verbosity for all passes.  Produce a larger volume of
              informative (?) output each time the option is repeated.
127       --vp ABCDE
128              Increase verbosity on a per-pass basis.  For example, "--vp 002"
129              adds  2  units  of  verbosity  to  pass 3 only.  The combination
130              "-v --vp 00004" adds 1 unit of verbosity for all passes,  and  4
131              more for pass 5.
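
              As a rough illustration of how -v and --vp combine, the shell
              sketch below adds a count of -v flags to each digit of the
              --vp argument.  The helper name per_pass_verbosity is
              hypothetical, and treating missing trailing digits as 0 is
              inferred from the "--vp 002" example above.

```shell
# per_pass_verbosity V_COUNT VP_DIGITS
# Prints the effective verbosity for passes 1-5, space separated.
# V_COUNT is the number of -v flags; VP_DIGITS is the --vp argument.
# (Hypothetical helper for illustration; not part of systemtap.)
per_pass_verbosity() {
    base=$1
    digits=$2
    pass=1
    out=""
    while [ "$pass" -le 5 ]; do
        # Digit for this pass; positions beyond the string count as 0.
        d=$(printf '%s' "$digits" | cut -c"$pass")
        out="$out$((base + ${d:-0})) "
        pass=$((pass + 1))
    done
    printf '%s\n' "${out% }"
}

per_pass_verbosity 0 002      # --vp 002
per_pass_verbosity 1 00004    # -v --vp 00004
```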
133       -k     Keep  the temporary directory after all processing.  This may be
134              useful in order to examine the generated C code, or to reuse the
135              compiled kernel object.
137       -g     Guru  mode.   Enable  parsing  of unsafe expert-level constructs
138              like embedded C.
140       -P     Prologue-searching  mode.   This   is   equivalent   to   --pro‐
141              logue-searching=always.   Activate heuristics to work around in‐
142              correct debugging information for  function  parameter  $context
143              variables.
145       -u     Unoptimized  mode.   Disable  unused code elision and many other
146              optimizations during elaboration / translation.
148       -w     Suppressed warnings mode.  Disables all warning messages.
150       -W     Treat all warnings as errors.
152       -b     Use bulk mode (percpu files) for kernel-to-user  data  transfer.
153              Use  the stap-merge program to multiplex them back together lat‐
154              er.
156       -i --interactive
157              Interactive mode. Enable an interface  to  build  the  systemtap
158              script incrementally and interactively.
       -t     Collect timing information on the number of times each probe
              executes and the average amount of time spent in each probe
              point.  Also show the derivation for each probe point.
164       -s NUM Use NUM megabyte buffers for kernel-to-user data transfer.  On a
165              multiprocessor in bulk mode, this is a per-processor amount.
       -I DIR Add the given directory to the tapset search path.  See the
              description of pass 2 for details.
170       -D NAME=VALUE
171              Add  the  given C preprocessor directive to the module Makefile.
172              These can be used to override limit parameters described below.
174       -B NAME=VALUE
175              In kernel-runtime mode, add the given make directive to the ker‐
176              nel module build's make invocation.  These can be used to add or
177              override kconfig options.  For example, use
179              -B CONFIG_DEBUG_INFO=y
181              to add debugging information.
183       -B FLAG
184              In dyninst-runtime mode, add the given parameter to the compiler
185              CFLAGS  used for building the dyninst shared library.  For exam‐
186              ple, use
188              -B -g
190              to add debugging information.
192       -a ARCH
193              Use a cross-compilation mode for the given target  architecture.
194              This  requires access to the cross-compiler and the kernel build
195              tree, and goes along with the
197              -B CROSS_COMPILE=arch-tool-prefix-
198              and
199              -r /build/tree
201              options.
203       --modinfo NAME=VALUE
204              Add the name/value pair as a MODULE_INFO macro call to the  gen‐
205              erated module.  This may be useful to inform or override various
206              module-related checks in the kernel.
208       -G NAME=VALUE
209              Sets the value of global variable NAME to VALUE when staprun  is
210              invoked.   This  applies  to scalar variables declared global in
211              the script/tapset.
213       -R DIR Look for the systemtap runtime sources in the  given  directory.
214              Your DIR default can be seen using "stap --help".
216       -r /DIR
217              Build  for  kernel in given build tree. Can also be set with the
218              SYSTEMTAP_RELEASE environment variable.
220       -r RELEASE
221              Build for kernel in build tree /lib/modules/RELEASE/build.   Can
222              also be set with the SYSTEMTAP_RELEASE environment variable.
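
              The two forms of -r differ only in whether the argument starts
              with a slash.  A minimal sketch of that rule follows; the
              helper name kernel_build_tree is hypothetical.

```shell
# kernel_build_tree ARG
# Maps a -r argument to the build tree it selects, per the text:
#   /DIR     -> the directory itself
#   RELEASE  -> /lib/modules/RELEASE/build
# (Hypothetical helper for illustration; not part of systemtap.)
kernel_build_tree() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "/lib/modules/$1/build" ;;
    esac
}

kernel_build_tree "$(uname -r)"   # running kernel's release
kernel_build_tree /build/tree     # explicit build tree
```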
224       -m MODULE
225              Use  the  given name for the generated kernel object module, in‐
226              stead of a unique randomized name.  The generated kernel  object
227              module is copied to the current directory.
229       -d MODULE
230              Add symbol/unwind information for the given module into the ker‐
231              nel object module.  This may  enable  symbolic  tracebacks  from
232              those  modules/programs,  even  if  they do not have an explicit
233              probe placed into them.
235       --ldd  Add symbol/unwind information  for  all  user-space  shared  li‐
236              braries suspected by ldd to be necessary for user-space binaries
237              being probed or listed with the -d option.   Caution:  this  can
238              make  the probe modules considerably larger.  Note that this op‐
239              tion does  not  deal  with  kernel-space  modules:  see  instead
240              --all-modules below.
242       --all-modules
243              Equivalent  to  specifying "-dkernel" and a "-d" for each kernel
244              module that is currently loaded.  Caution:  this  can  make  the
245              probe modules considerably larger.
247       -o FILE
248              Send  standard  output to named file. In bulk mode, percpu files
249              will start with FILE_ (FILE_cpu with -F)  followed  by  the  cpu
250              number.  This supports strftime(3) formats for FILE.
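
              The naming rule for bulk-mode output files can be sketched as
              follows.  This is an approximation under two assumptions: that
              date(1) expands the pattern the same way stap's strftime(3)
              handling would, and that CPU numbers start at 0.  The helper
              name output_names is hypothetical.

```shell
# output_names PATTERN NCPUS [daemon]
# Prints the per-CPU file names that bulk mode would be expected to use:
# FILE_<cpu>, or FILE_cpu<cpu> when the third argument is "daemon" (-F).
# (Hypothetical helper; CPU numbering from 0 is an assumption.)
output_names() {
    base=$(date +"$1")    # expand strftime-style conversions in PATTERN
    sep=_
    if [ "${3:-}" = daemon ]; then
        sep=_cpu
    fi
    cpu=0
    while [ "$cpu" -lt "$2" ]; do
        printf '%s%s%s\n' "$base" "$sep" "$cpu"
        cpu=$((cpu + 1))
    done
}

output_names 'trace-%Y%m%d.log' 2
output_names 'trace-%Y%m%d.log' 2 daemon
```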
       -c CMD Start the probes, run CMD, and exit when CMD finishes.  This
              also has the effect of setting target() to the pid of the
              command that was run.

       -x PID Set target() to PID.  This allows scripts to be written that
              filter on a specific process.  Scripts run independently of the
              PID's lifespan.
260       -e SCRIPT
261              Run the given SCRIPT specified on the command line.
       -E SCRIPT
              Run the given SCRIPT in addition to the main script, whether the
              main script is specified through -e or as a script file.  This
              option can be repeated to run multiple scripts, and can be used
              in listing mode (-l/-L).
269       -l PROBE
270              Instead of running a probe script, just list all available probe
271              points  matching  the given single probe point.  The pattern may
272              include wildcards and aliases, but not comma-separated  multiple
273              probe  points.  The process result code will indicate failure if
274              there are no matches.
276              % stap -e 'probe syscall.* { }'
277              [...]
278              % stap -l 'syscall.*'
279              syscall.accept
280              [...]
281              syscall.writev
284       -L PROBE
285              Similar to "-l", but  list  matching  probe  points  plus  their
286              available context variables.
288              % stap -L 'process("/lib64/libpython*.so.*").mark("*")'
289              process("/usr/lib64/libpython2.7.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
290              process("/usr/lib64/libpython2.7.so.1.0").mark("function__return") $arg1:long $arg2:long $arg3:long
291              process("/usr/lib64/libpython3.6m.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
292              process("/usr/lib64/libpython3.6m.so.1.0").mark("function__return") $arg1:long $arg2:long $arg3:long
293              process("/usr/lib64/libpython3.6m.so.1.0").mark("gc__done") $arg1:long
294              process("/usr/lib64/libpython3.6m.so.1.0").mark("gc__start") $arg1:long
295              process("/usr/lib64/libpython3.6m.so.1.0").mark("line") $arg1:long $arg2:long $arg3:long
298       -F     Without  -o  option,  load  module and start probes, then detach
299              from the module leaving the probes running.  With -o option, run
300              staprun in background as a daemon and show its pid.
       -S size[,N]
              Set the maximum size of an output file and the maximum number of
              output files.  When an output file would exceed size, systemtap
              switches output to the next file.  When the number of output
              files exceeds N, systemtap removes the oldest output file.  The
              second argument may be omitted.
309       -T TIMEOUT
310              Exit the script after TIMEOUT seconds.
312       --skip-badvars
313              Ignore  unresolvable  or run-time-inaccessible context variables
314              and substitute with 0, without errors.
317       --prologue-searching[=WHEN]
318              Prologue-searching mode. Activate heuristics to work around  in‐
319              correct debugging information  for  function  parameter $context
320              variables. WHEN can be either "never", "always", or "auto" (i.e.
321              enabled  by heuristic). If WHEN is missing, then "always" is as‐
322              sumed. If the option is missing, then "auto" is assumed.
325       --suppress-handler-errors
326              Wrap all probe handlers into something like this
328              try { ... } catch { next }
330              block, which causes any runtime errors to be quietly suppressed.
331              Suppressed  errors  do  not  count against MAXERRORS limits.  In
332              this mode, the MAXSKIPPED limits are also  suppressed,  so  that
333              many  errors  and  skipped  probes  may  be accumulated during a
334              script's runtime.  Any overall counts will still be reported  at
335              shutdown.
338       --compatible VERSION
339              Suppress  recent script language or tapset changes which are in‐
340              compatible with given older version of systemtap.  This  may  be
341              useful  if  a much older systemtap script fails to run.  See the
342              DEPRECATION section for more details.
345       --check-version
346              This option is used to check if the active script has  any  con‐
347              structs  that may be systemtap version specific.  See the DEPRE‐
348              CATION section for more details.
351       --clean-cache
352              This option prunes stale entries from the cache directory.  This
353              is  normally  done automatically after successful runs, but this
354              option will trigger the cleanup manually and then exit.  See the
355              CACHING section for more details about cache limits.
358       --color[=WHEN], --colour[=WHEN]
359              This option controls coloring of error messages. WHEN can be ei‐
360              ther "never", "always", or "auto" (i.e. enable only if at a ter‐
361              minal). If WHEN is missing, then "always" is assumed. If the op‐
362              tion is missing, then "auto" is assumed.
364              Colors can be modified using  the  SYSTEMTAP_COLORS  environment
365              variable.     The     format     must    be    of    the    form
366              key1=val1:key2=val2:key3=val3 ...etc.  Valid keys  are  "error",
367              "warning",  "source",  "caret",  and "token".  Values constitute
368              Select Graphic Rendition (SGR) parameter(s). Consult  the  docu‐
369              mentation of your terminal for the SGRs it supports. As an exam‐
370              ple,   the   default    colors    would    be    expressed    as
371              error=01;31:warning=00;33:source=00;34:caret=01:token=01.     If
372              SYSTEMTAP_COLORS is absent, the default colors will be used.  If
373              it is empty or invalid, coloring is turned off.
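
              To make the SYSTEMTAP_COLORS format concrete, the sketch below
              splits a value into its key and SGR parts.  It only
              demonstrates the key1=val1:key2=val2 layout described above;
              the helper name describe_colors is hypothetical and performs
              no validation of keys.

```shell
# describe_colors SPEC
# Splits a SYSTEMTAP_COLORS-style string on ':' and prints each key with
# its SGR parameter string.
# (Hypothetical helper for illustration; not part of systemtap.)
describe_colors() {
    spec=$1
    while [ -n "$spec" ]; do
        pair=${spec%%:*}            # text up to the first ':'
        case $spec in
            *:*) spec=${spec#*:} ;; # drop the pair just consumed
            *)   spec= ;;
        esac
        printf '%s uses SGR %s\n' "${pair%%=*}" "${pair#*=}"
    done
}

# The default colors, as given in the text.
describe_colors 'error=01;31:warning=00;33:source=00;34:caret=01:token=01'
```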
376       --disable-cache
377              This  option  disables all use of the cache directory.  No files
378              will be either read from or written to the cache.
       --poison-cache
              This option treats files in the cache directory as invalid.  No
              files will be read from the cache, but resulting files from this
              run will still be written to the cache.  This is meant as a
              troubleshooting aid when stap's cached behavior seems to be
              misbehaving.  If it helped, there is probably a bug in systemtap
              that the developers would like you to report.
       --privilege[=stapusr | =stapsys | =stapdev]
              This option instructs stap to examine the script looking for
              constructs which are not allowed for the specified privilege
              level (see UNPRIVILEGED USERS).  Compilation fails if any such
              constructs are used.  If stapusr or stapsys is specified when
              using a compile server (see --use-server), the server will
              examine the script and, if compilation succeeds, the server will
              cryptographically sign the resulting kernel module, certifying
              that it is safe for use by users at the specified privilege
              level.

              If --privilege has not been specified, -pN has not been
              specified with N < 5, and the invoking user is not root and is
              not a member of the group stapdev, then stap will automatically
              add the appropriate --privilege option to the options already
              specified.
408       --unprivileged
409              This option is equivalent to --privilege=stapusr.
412       --use-server[=HOSTNAME[:PORT] | =IP_ADDRESS[:PORT] | =CERT_SERIAL]
413              Specify  compile-server(s)  to be used for compilation and/or in
414              conjunction with --list-servers and --trust-servers (see  below)
415              for listing. If no argument is supplied, then the default in un‐
416              privileged  mode  (see  --privilege)  is  to  select  compatible
417              servers which are trusted as SSL peers and as module signers and
418              currently online. Otherwise the default is to select  compatible
419              servers  which  are  trusted  as SSL peers and currently online.
420              --use-server may be specified more than once, in  which  case  a
421              list  of  servers is accumulated in the order specified. Servers
422              may be specified by host name, ip address, or by certificate se‐
423              rial number (obtained using --list-servers).  The latter is most
424              commonly used when adding or revoking trust  in  a  server  (see
425              --trust-servers below). If a server is specified by host name or
426              ip address, then an optional port number may be specified.  This
427              is  useful for accessing servers which are not on the local net‐
428              work or to specify a particular server.
430              IP addresses may be IPv4 or IPv6 addresses.
432              If a particular IPv6 address is link local and  exists  on  more
433              than  one  interface, the intended interface may be specified by
434              appending the address with a percent sign (%)  followed  by  the
435              intended        interface        name.        For       example,
436              "fe80::5eff:35ff:fe07:55ca%eth0".
438              In order to specify a port number with an IPv6  address,  it  is
439              necessary to enclose the IPv6 address in square brackets ([]) in
440              order to separate the port number from the rest of the  address.
441              For      example,      "[fe80::5eff:35ff:fe07:55ca]:5000"     or
442              "[fe80::5eff:35ff:fe07:55ca%eth0]:5000".
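
              The bracket rule for IPv6 addresses with ports can be sketched
              as a small shell helper that builds the option string.  The
              name use_server_arg is hypothetical, and detecting an IPv6
              address by the presence of a colon is a simplification.

```shell
# use_server_arg HOST [PORT]
# Prints a --use-server option for HOST, wrapping IPv6 addresses in
# square brackets when a port is appended, as the text requires.
# (Hypothetical helper; "contains a colon" is a simplified IPv6 test.)
use_server_arg() {
    host=$1
    port=${2:-}
    if [ -n "$port" ]; then
        case $host in
            *:*) host="[$host]" ;;
        esac
        printf '%s\n' "--use-server=$host:$port"
    else
        printf '%s\n' "--use-server=$host"
    fi
}

use_server_arg fe80::5eff:35ff:fe07:55ca%eth0 5000
use_server_arg fe80::5eff:35ff:fe07:55ca%eth0
```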
              If --use-server has not been specified, -pN has not been
              specified with N < 5, and the invoking user is not root, is not
              a member of the group stapdev, but is a member of the group
              stapusr, then stap will automatically add --use-server to the
              options already specified.
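
              The conditions under which --use-server is added automatically
              can be written out as a predicate.  This is a sketch of the
              rule as stated, not of stap's implementation; the function
              name and its argument convention are hypothetical.

```shell
# would_add_use_server UID "GROUP LIST" HAS_USE_SERVER LAST_PASS
# Succeeds when, per the text, --use-server would be added automatically.
#   HAS_USE_SERVER is "yes" or "no"; LAST_PASS is the N of -pN (5 if none).
# (Hypothetical helper for illustration; not part of systemtap.)
would_add_use_server() {
    uid=$1
    groups=" $2 "
    has_opt=$3
    last_pass=$4
    [ "$has_opt" = no ] || return 1          # option already given
    [ "$last_pass" -ge 5 ] || return 1       # -pN with N < 5 was given
    [ "$uid" -ne 0 ] || return 1             # invoking user is root
    case $groups in *" stapdev "*) return 1 ;; esac
    case $groups in *" stapusr "*) return 0 ;; esac
    return 1
}

if would_add_use_server "$(id -u)" "$(id -Gn)" no 5; then
    echo "--use-server would be added for this user"
else
    echo "--use-server would not be added for this user"
fi
```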
451       --use-server-on-error[=yes|=no]
452              Instructs stap to retry compilation of a script using a  compile
453              server  if compilation on the local host fails in a manner which
454              suggests that it might succeed using a server.  If  this  option
455              is  not specified, the default is no.  If no argument is provid‐
456              ed, then the default is yes. Compilation  will  be  retried  for
457              certain  types  of  errors (e.g. insufficient data or resources)
458              which may not occur during re-compilation by a  compile  server.
459              Compile servers will be selected automatically for the re-compi‐
460              lation attempt as if --use-server was specified  with  no  argu‐
461              ments.
464       --list-servers[=SERVERS]
465              Display  the status of the requested SERVERS, where SERVERS is a
466              comma-separated list of  server  attributes.  The  list  of  at‐
467              tributes  is  combined  to filter the list of servers displayed.
468              Supported attributes are:
470              all    specifies all known servers (trusted SSL  peers,  trusted
471                     module signers, online servers).
473              specified
474                     specifies servers specified using --use-server.
476              online filters the output by retaining information about servers
477                     which are currently online.
479              trusted
480                     filters the output by retaining information about servers
481                     which are trusted as SSL peers.
483              signer filters the output by retaining information about servers
484                     which are trusted as module signers (see --privilege).
486              compatible
487                     filters the output by retaining information about servers
488                     which  are compatible with the current kernel release and
489                     architecture.
491              If no argument is provided, then the default is  specified.   If
492              no  servers  were specified using --use-server, then the default
493              servers for --use-server are listed.
495              Note that --list-servers uses the avahi-daemon service to detect
496              online   servers.   If  this  service  is  not  available,  then
497              --list-servers will fail to detect any online servers. In  order
498              for  --list-servers to detect servers listening on IPv6 address‐
499              es, the avahi-daemon  configuration  file  /etc/avahi/avahi-dae‐
500              mon.conf must contain an active "use-ipv6=yes" line. The service
501              must be restarted after adding this line in order for IPv6 to be
502              enabled.
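
              One way to check for the active "use-ipv6=yes" line before
              restarting the service is sketched below.  The helper name
              avahi_ipv6_enabled is hypothetical; the pattern assumes that a
              commented-out line begins with "#" and should not count.

```shell
# avahi_ipv6_enabled CONFIG_FILE
# Succeeds if CONFIG_FILE contains an uncommented use-ipv6=yes line.
# (Hypothetical helper for illustration; not part of systemtap or avahi.)
avahi_ipv6_enabled() {
    grep -Eq '^[[:space:]]*use-ipv6[[:space:]]*=[[:space:]]*yes[[:space:]]*$' "$1"
}

conf=/etc/avahi/avahi-daemon.conf
if [ -r "$conf" ] && avahi_ipv6_enabled "$conf"; then
    echo "IPv6 is enabled in $conf"
else
    echo "no active use-ipv6=yes line found in $conf"
fi
```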
       --trust-servers[=TRUST_SPEC]
              Grant or revoke trust in the compile servers specified using
              --use-server, as directed by TRUST_SPEC.  TRUST_SPEC is a
              comma-separated list specifying the trust which is to be granted
              or revoked.  Supported elements are:
511              ssl    trust the specified servers as SSL peers.
513              signer trust  the  specified  servers  as  module  signers  (see
514                     --privilege).  Only root can specify signer.
516              all-users
517                     grant  trust  as  an  ssl peer for all users on the local
518                     host. The default is to grant trust as an  ssl  peer  for
519                     the current user only. Trust as a module signer is always
520                     granted for all users. Only root can specify all-users.
522              revoke revoke the specified trust. The default is to grant it.
524              no-prompt
525                     do not prompt the user for confirmation  before  carrying
526                     out  the  requested  action. The default is to prompt the
527                     user for confirmation.
529              If no argument is provided, then the  default  is  ssl.   If  no
530              servers were specified using --use-server, then no trust will be
531              granted or revoked.
533              Unless no-prompt has been specified, the user will  be  prompted
534              to  confirm the trust to be granted or revoked before the opera‐
535              tion is performed.
538       --dump-probe-types
539              Dumps a list of supported probe types  and  exits.  If  --privi‐
540              lege=stapusr  is  also  specified,  the  list will be limited to
541              probe types available to unprivileged users.
544       --dump-probe-aliases
545              Dumps a list of all probe aliases found in library files and ex‐
546              its.
       --dump-functions
              Dumps a list of all the public functions found in library files
              and exits.  Also includes their parameters and types.  A function
              of type 'unknown' indicates a function that does not return a
              value.  Note that not all function/parameter types may be
              resolved (these are also shown as 'unknown').  This feature is
              very memory-intensive and thus may not work properly with
              --use-server if the target server imposes an rlimit on process
              memory (i.e. through the ~stap-server/.systemtap/rc
              configuration file, see stap-server(8)).
561       --remote URL
562              Set  the execution target to the given host.  This option may be
563              repeated to target multiple execution targets.  Passes  1-4  are
564              completed locally as normal to build the script, and then pass 5
565              will copy the module to the target and run it.   Acceptable  URL
566              forms include:
568              [USER@]HOSTNAME, ssh://[USER@]HOSTNAME
569                     This  mode  uses  ssh,  optionally  using  a username not
570                     matching your own. If a custom ssh_config file is in use,
571                     add SendEnv LANG to retain internationalization function‐
572                     ality.
574              libvirt://DOMAIN, libvirt://DOMAIN/LIBVIRT_URI
575                     This mode uses stapvirt to execute the script on a domain
576                     managed by libvirt. Optionally, LIBVIRT_URI may be speci‐
577                     fied to connect to a  specific  driver  and/or  a  remote
578                     host. For example, to connect to the local privileged QE‐
579                     MU driver, use:
581                     --remote libvirt://MyDomain/qemu:///system
583                     See the page at  <http://libvirt.org/uri.html>  for  sup‐
584                     ported URIs. Also see stapvirt(1) for more information on
585                     how to prepare the domain for stap probing.
587              unix:PATH
588                     This mode connects to a UNIX socket.  This  can  be  used
589                     with  a QEMU virtio-serial port for executing scripts in‐
590                     side a running virtual machine.
592              direct://
593                     Special loopback mode to run on the local host.
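
              The URL forms listed above can be told apart by prefix.  The
              following sketch classifies a --remote argument accordingly;
              the helper name remote_kind and its output labels are
              hypothetical, and anything without a recognized prefix is
              treated as the bare [USER@]HOSTNAME form.

```shell
# remote_kind URL
# Prints which transport a --remote URL selects, based on the forms
# listed in the text.
# (Hypothetical helper and labels; not part of systemtap.)
remote_kind() {
    case $1 in
        ssh://*)     echo ssh ;;
        libvirt://*) echo libvirt ;;
        unix:*)      echo unix-socket ;;
        direct://)   echo local-loopback ;;
        *)           echo ssh ;;    # bare [USER@]HOSTNAME
    esac
}

remote_kind libvirt://MyDomain/qemu:///system
remote_kind user@host.example.org
```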
595       --remote-prefix
596              Prefix each line of remote output with "N: ", where N is the in‐
597              dex  of  the  remote  execution target from which the given line
598              originated.
601       --download-debuginfo[=OPTION]
602              Enable, disable or set a timeout  for  the  automatic  debuginfo
603              downloading  feature  offered  by  abrt  as specified by OPTION,
604              where OPTION is one of the following:
606              yes    enable automatic downloading of debuginfo with  no  time‐
607                     out. This is the same as not providing an OPTION value to
608                     --download-debuginfo
610              no     explicitly disable automatic  downloading  of  debuginfo.
611                     This is the same as not using the option at all.
613              ask    show  abrt output, and ask before continuing download. No
614                     timeout will be set.
616              <timeout>
617                     specify a timeout as a positive number to stop the  down‐
618                     load if it is taking longer than <timeout> seconds.
620       --rlimit-as=NUM
621              Specify  the  maximum  size of the process's virtual memory (ad‐
622              dress space), in bytes.
625       --rlimit-cpu=NUM
626              Specify the CPU time limit, in seconds.
629       --rlimit-nproc=NUM
630              Specify the maximum number of processes that can be created.
633       --rlimit-stack=NUM
634              Specify the maximum size of the process stack, in bytes.
637       --rlimit-fsize=NUM
638              Specify the maximum size of files that the process  may  create,
639              in bytes.
642       --sysroot=DIR
643              Specify  sysroot  directory where target files (executables, li‐
644              braries, etc.)  are located.  With -r RELEASE, the sysroot  will
645              be searched for the appropriate kernel build directory.  With -r
646              /DIR, however, the sysroot will not be used to find  the  kernel
647              build.
650       --sysenv=VAR=VALUE
651              Provide an alternate value for an environment variable where the
652              value on a remote system differs.  Path  variables  (e.g.  PATH,
653              LD_LIBRARY_PATH)  are  assumed  to  be relative to the directory
654              provided by --sysroot, if provided.
657       --suppress-time-limits
658              Disable -DSTP_OVERLOAD related options as  well  as  -DMAXACTION
659              and -DMAXTRYLOCK.  This option requires guru mode.
662       --runtime=MODE
663              Set  the  pass-5  runtime  mode.   Valid options are kernel (de‐
664              fault), dyninst and bpf.  See ALTERNATE RUNTIMES below for  more
665              information.
668       --dyninst
669              Shorthand for --runtime=dyninst.
672       --bpf  Shorthand for --runtime=bpf.
675       --save-uprobes
676              On machines that require SystemTap to build its own uprobes mod‐
677              ule (kernels prior to version 3.5), this option  instructs  Sys‐
678              temTap to also save a copy of the module in the current directo‐
679              ry (creating a new "uprobes" directory first).
682       --target-namespaces=PID
683              Allow for a set of target namespaces to  be  set  based  on  the
684              namespaces  the  given  PID  is  in. This is for namespace-aware
685              tapset functions. If the target namespaces are not set, the
686              target defaults to the namespaces of the stap process.
689       --monitor=INTERVAL
690              Enables  an  interface  to  display status information about the
691              module (uptime, module name, invoker uid, memory sizes, global
692              variables,  list  of  probes with their statistics). An optional
693              argument INTERVAL can be supplied to set  the  refresh  rate  in
694              seconds  of the status window. The module can also be controlled
695              by a list of commands using the following keys:
697              c      Resets all global variables to their  initial  values  or
698                     zeroes them if they did not have an initial value.
700              s      Rotates the attribute used to sort the list of probes.
702              t      Brings up a prompt to allow toggling (on/off) of probes by
703                     index. Probe points are still affected  by  their  condi‐
704                     tions.
706              r      Resumes the script by toggling on all probes.
708              p      Pauses the script by toggling off all probes.
710              x      Hides/shows  the status window. This allows for more out‐
711                     put to be seen.
713              navigation-keys
714                     The navigation keys can be used to scroll up and down the
715                     windows.
717              Tab    Toggle scrolling between status and output windows.
720       --example
721              This option is used to run example scripts without having to en‐
722              ter the entire path to the script. Example scripts can be  found
723              in the directory specified in the stappaths(7) manual page.


727       Any  additional  arguments on the command line are passed to the script
728       parser for substitution.  See below.


732       The systemtap script language resembles awk and C.  There are two  main
733       outermost  constructs:  probes and functions.  Within these, statements
734       and expressions use C-like operator syntax and precedence.
738       Whitespace is ignored.  Three forms of comments are supported:
739              # ... shell style, to the end of line, except for $# and @#
740              // ... C++ style, to the end of line
741              /* ... C style ... */
742       Literals are either strings enclosed in double-quotes (passing  through
743       the  usual  C  escape  codes with backslashes, and with adjacent string
744       literals glued together, also as in C), or integers (in decimal,  hexa‐
745       decimal,  or  octal, using the same notation as in C).  All strings are
746       limited in length to some reasonable value (a few hundred bytes).   In‐
747       tegers  are  64-bit signed quantities, although the parser also accepts
748       (and wraps around) values above positive 2**63.
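
As a minimal sketch of the literal forms described above (the values are arbitrary and chosen only for illustration), the following script glues adjacent string literals and writes one integer in three notations:

```systemtap
// Sketch: string literal gluing and integer notations.
probe begin {
  // Adjacent string literals are concatenated by the parser, as in C.
  println("adjacent " "literals " "are glued")
  // The same value written in decimal, hexadecimal, and octal.
  printf("%d %d %d\n", 255, 0xff, 0377)
  exit()
}
```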
750       In addition, script arguments given at the end of the command line  may
751       be inserted.  Use $1 ... $<NN> for insertion unquoted, @1 ... @<NN> for
752       insertion as a string literal.  The number of arguments may be accessed
753       through  $# (as an unquoted number) or through @# (as a quoted number).
754       These may be used at any place a token may begin, including within  the
755       preprocessing  stage.   Reference to an argument number beyond what was
756       actually given is an error.
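
A short sketch of argument insertion follows. The file name args.stp and the two arguments are hypothetical; the script assumes it is invoked with a number first and a word second.

```systemtap
// Hypothetical invocation: stap args.stp 42 hello
probe begin {
  // The first argument is pasted unquoted (a number here);
  // the second is pasted as a string literal.
  printf("%d arguments: first=%d second=%s\n", $#, $1, @2)
  exit()
}
```

Running this sketch with fewer than two arguments is expected to fail at parse time, since it refers to an argument that was not given.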
760       A simple conditional preprocessing stage is run as a part  of  parsing.
761       The general form is similar to the cond ? exp1 : exp2 ternary operator:
763              %( CONDITION %? TRUE-TOKENS %)
764              %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)
766       The CONDITION is either an expression whose format is determined by its
767       first keyword, or a comparison of two string literals, or a comparison
768       of two numeric literals.  Several such CONDITIONs may also be combined
769       using || (alternative) and && (conjunction).  Parentheses are not
770       supported yet, so keep in mind that && takes precedence over ||.
773       If the first part is the identifier kernel_vr or kernel_v to  refer  to
774       the  kernel  version  number,  with  ("2.6.13-1.322FC3smp")  or without
775       ("2.6.13") the release code suffix, then the second part is one of  the
776       six standard numeric comparison operators <, <=, ==, !=, >, and >=, and
777       the third part is a string literal that contains an RPM-style  version-
778       release value.  The condition is deemed satisfied if the version of the
779       target kernel (as optionally overridden by the -r option)  compares  to
780       the  given  version  string.   The comparison is performed by the glibc
781       function strverscmp.  As a special case, if the operator is for  simple
782       equality  (==),  or  inequality  (!=),  and the third part contains any
783       wildcard characters (* or ? or [), then the expression is treated as  a
784       wildcard (mis)match as evaluated by fnmatch.
786       If,  on  the other hand, the first part is the identifier arch to refer
787       to the processor architecture (as named  by  the  kernel  build  system
788       ARCH/SUBARCH), then the second part is one of the two string comparison
789       operators == or !=, and the third part is a string literal for matching
790       it.  This comparison is a wildcard (mis)match.
792       Similarly,  if the first part is an identifier like CONFIG_something to
793       refer to a kernel configuration option, then the second part is  ==  or
794       !=, and the third part is a string literal for matching the value (com‐
795       monly "y" or "m").  Nonexistent or unset kernel  configuration  options
796       are  represented  by the empty string.  This comparison is also a wild‐
797       card (mis)match.
799       If the first part is the identifier systemtap_v, the test refers to the
800       systemtap  compatibility  version,  which  may  be  overridden  for old
801       scripts with the --compatible flag.  The comparison operators are the
802       same as for kernel_v and the right operand is a version string.  See
803       also the DEPRECATION section below.
805       If the first part  is  the  identifier  systemtap_privilege,  the  test
806       refers  to  the  privilege  level that the systemtap script is compiled
807       with. Here the second part is == or !=, and the third part is a  string
808       literal, either "stapusr" or "stapsys" or "stapdev".
810       If the first part is the identifier guru_mode, the test refers to
811       whether the systemtap script is compiled in guru mode.  Here the second
812       part is == or !=, and the third part is a number, either 1 or 0.
814       If  the  first  part  is the identifier runtime, the test refers to the
815       systemtap runtime mode. See ALTERNATE RUNTIMES below for more  informa‐
816       tion  on runtimes.  The second part is one of the two string comparison
817       operators == or !=, and the third part is a string literal for matching
818       it.  This comparison is a wildcard (mis)match.
820       Otherwise,  the  CONDITION  is  expected to be a comparison between two
821       string literals or two numeric literals.  In this case,  the  arguments
822       are the only variables usable.
824       The TRUE-TOKENS and FALSE-TOKENS are zero or more general parser tokens
825       (possibly including nested preprocessor conditionals), and  are  passed
826       into  the input stream if the condition is true or false.  For example,
827       the following code induces a parse error unless the target kernel  ver‐
828       sion is newer than 2.6.5:
830              %( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
832       The following code might adapt to hypothetical kernel version drift:
834              probe kernel.function (
835                %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
836                   %( kernel_vr == "2.6.13*smp" %? "do_page_fault" %:
837                      UNSUPPORTED %) %)
838              ) { /* ... */ }
840              %( arch == "ia64" %?
841                 probe syscall.vliw = kernel.function("vliw_widget") {}
842              %)
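
The other condition keywords can be used in the same way. The sketch below is illustrative only; it selects one println per test while the script is parsed, using the runtime, guru_mode, and script-argument forms described above.

```systemtap
probe begin {
  // The runtime and guru_mode tests are resolved while parsing.
  %( runtime == "kernel" %? println("kernel runtime") %: println("alternate runtime") %)
  %( guru_mode == 1 %? println("guru mode") %: println("not guru mode") %)
  // Script arguments are the only variables usable in a plain comparison.
  %( $# > 0 %? println("arguments were given") %: println("no arguments") %)
  exit()
}
```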
847       The  preprocessor also supports a simple macro facility, run as a sepa‐
848       rate pass before conditional preprocessing.
850       Macros are defined using the following construct:
852              @define NAME %( BODY %)
853              @define NAME(PARAM_1, PARAM_2, ...) %( BODY %)
855       Macros, and parameters inside a macro body, are both invoked by prefix‐
856       ing the macro name with an @ symbol:
858              @define foo %( x %)
859              @define add(a,b) %( ((@a)+(@b)) %)
861                 @foo = @add(2,2)
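
A slightly larger sketch, using hypothetical macro names greet and times, shows a parameterized macro and a parameterless one inside a complete script:

```systemtap
// greet and times are illustrative names, not library macros.
@define greet(name) %( printf("hello, %s\n", @name) %)
@define times %( 3 %)

probe begin {
  // Both macros are expanded before the probe body is parsed.
  for (i = 0; i < @times; i++)
    @greet("world")
  exit()
}
```

Macro names are best chosen so that they do not collide with built-in @ operators such as @count or @cast.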
864       Macro expansion is currently performed in a separate pass before condi‐
865       tional compilation. Therefore, both TRUE- and  FALSE-tokens  in  condi‐
866       tional  expressions  will be macroexpanded regardless of how the condi‐
867       tion is evaluated. This can sometimes lead to errors:
869              // The following results in a conflict:
870              %( CONFIG_UTRACE == "y" %?
871                  @define foo %( process.syscall %)
872              %:
873                  @define foo %( **ERROR** %)
874              %)
876              // The following works properly as expected:
877              @define foo %(
878                %( CONFIG_UTRACE == "y" %? process.syscall %: **ERROR** %)
879              %)
881       The first example is incorrect because both @defines are evaluated in a
882       pass prior to the conditional being evaluated.
884       Normally,  a  macro definition is local to the file it occurs in. Thus,
885       defining a macro in a tapset does not make it available to the user  of
886       the tapset.  Publicly available library macros can be defined by
887       including .stpm files on the tapset search path.  These files may only
888       contain @define constructs, which become visible across all tapsets and
889       user scripts. Optionally, within the .stpm files, a public macro  defi‐
890       nition  can  be  surrounded  by a preprocessor conditional as described
891       above.
895       Tapsets or guru-mode user scripts can access header file constant
896       tokens, typically macros, using the built-in @const() operator.  The
897       respective header file can be included either via the tapset library,
898       or using a top-level guru mode embedded-C construct.  The operator also
899       sets the appropriate embedded-C pragma comments.
901              @const("STP_SKIP_BADVARS")
906       Identifiers for variables and functions are an  alphanumeric  sequence,
907       and  may  include  _ and $ characters.  They may not start with a plain
908       digit, as in C.  Each variable is by default  local  to  the  probe  or
909       function  statement  block  within which it is mentioned, and therefore
910       its scope and lifetime is limited to a particular probe or function in‐
911       vocation.
913       Scalar variables are implicitly typed as either string or integer.  As‐
914       sociative arrays also have a string or integer value, and  a  tuple  of
915       strings and/or integers serving as a key.  Here are a few basic expres‐
916       sions.
918              var1 = 5
919              var2 = "bar"
920              array1 [pid()] = "name"     # single numeric key
921              array2 ["foo",4,i++] += 5   # vector of string/num/num keys
922              if (["hello",5,4] in array2) println ("yes")  # membership test
925       The translator performs type inference on  all  identifiers,  including
926       array  indexes  and function parameters.  Inconsistent type-related use
927       of identifiers signals an error.
929       Variables may be declared global, so that they are shared  amongst  all
930       probes  and functions and live as long as the entire systemtap session.
931       There is one namespace for all global variables,  regardless  of  which
932       script  file  they are found within.  Concurrent access to global vari‐
933       ables is automatically protected with locks, see the SAFETY AND SECURI‐
934       TY  section  for  more details.  A global declaration may be written at
935       the outermost level anywhere, not within a block of code.  Global vari‐
936       ables  which are written but never read will be displayed automatically
937       at session shutdown.  The translator will  infer  for  each  its  value
938       type, and if it is used as an array, its key types.  Optionally, scalar
939       globals may be initialized with a string or number literal.   The  fol‐
940       lowing declaration marks variables as global.
942              global var1, var2, var3=4
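
The automatic display of written-but-never-read globals can be tried with a sketch like the one below. The variable name last_comm is arbitrary, and the exact format of the shutdown output may differ between versions.

```systemtap
global last_comm

// The global is only ever assigned, never read by the script.
probe timer.ms(100) { last_comm = execname() }

// Stop after one second; last_comm should then be displayed automatically.
probe timer.s(1) { exit() }
```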
945       Global variables can also be set as module options, either by using
946       the -G option, or by first compiling the module using stap -p4 and then
947       setting the variables on the command line when calling staprun on the
948       generated module.  See staprun(8) for more information.
951       The scope of a global variable may be limited to a tapset or user
952       script file using the private keyword.  The global keyword is optional
953       when defining a private global variable.  The following declaration
954       marks var1 and var2 as private globals.
956              private global var1=2
957              private var2
960       Arrays are limited in size by the MAXMAPENTRIES  variable  --  see  the
961       SAFETY AND SECURITY section for details.  Optionally, global arrays may
962       be declared with a maximum size in brackets,  overriding  MAXMAPENTRIES
963       for  that array only.  Note that this doesn't indicate the type of keys
964       for the array, just the size.
966              global tiny_array[10], normal_array, big_array[50000]
969       Arrays may be configured for wrapping using the '%' suffix.  This caus‐
970       es  older elements to be overwritten if more elements are inserted than
971       the array can hold. This works  for  both  associative  and  statistics
972       typed arrays.
974              global wrapped_array1%[10], wrapped_array2%
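
As a sketch of wrapping behaviour, the script below inserts ten entries into an array declared to hold three. The names recent and n are illustrative; under the description above, only the most recently inserted entries should remain when the end probe runs.

```systemtap
global recent%[3]
global n

probe timer.ms(10) {
  // Each insertion beyond the third should overwrite an older element.
  recent[n] = gettimeofday_us()
  if (++n >= 10) exit()
}

probe end {
  // Iterate in ascending key order over whatever survived.
  foreach (i in recent+)
    printf("%d -> %d\n", i, recent[i])
}
```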
978       Many  types  of  probe points provide context variables, which are run-
979       time values, safely extracted from the kernel or userspace program  be‐
980       ing  probed.   These  are  prefixed  with the $ character.  The CONTEXT
981       VARIABLES section in stapprobes(3stap) lists what is available for each
982       type  of  probe point.  These context variables become normal string or
983       numeric scalars once they are stored in normal script  variables.   See
984       the TYPECASTING section below on how to turn them back into typed
985       pointers for further processing as context variables.
989       Statements enable procedural control flow.  They may occur within func‐
990       tions  and  probe handlers.  The total number of statements executed in
991       response to any single probe event is limited to some number defined by
992       the  MAXACTION macro in the translated C code, and is in the neighbour‐
993       hood of 1000.
995       EXP    Execute the string- or integer-valued expression and throw  away
996              the value.
998       { STMT1 STMT2 ... }
999              Execute  each  statement  in  sequence in this block.  Note that
1000              separators or terminators are generally  not  necessary  between
1001              statements.
1003       ;      Null statement, do nothing.  It is useful as an optional separa‐
1004              tor between statements to improve syntax-error detection and  to
1005              handle certain grammar ambiguities.
1007       if (EXP) STMT1 [ else STMT2 ]
1008              Compare  integer-valued EXP to zero.  Execute the first (non-ze‐
1009              ro) or second STMT (zero).
1011       while (EXP) STMT
1012              While integer-valued EXP evaluates to non-zero, execute STMT.
1014       for (EXP1; EXP2; EXP3) STMT
1015              Execute EXP1 as initialization.  While EXP2 is non-zero, execute
1016              STMT, then the iteration expression EXP3.
1018       foreach (VAR in ARRAY [ limit EXP ]) STMT
1019              Loop over each element of the named global array, assigning the
1020              current key to VAR.  The array may not be modified within the
1021              statement.   By adding a single + or - operator after the VAR or
1022              the ARRAY identifier, the iteration will proceed in a sorted or‐
1023              der,  by  ascending  or descending index or value.  If the array
1024              contains statistics aggregates, adding the desired @operator be‐
1025              tween the ARRAY identifier and the + or - will specify the sort‐
1026              ing aggregate function.  See the STATISTICS  section  below  for
1027              the ones available.  Default is @count.  Using the optional lim‐
1028              it keyword limits the number of loop iterations  to  EXP  times.
1029              EXP is evaluated once at the beginning of the loop.
1031       foreach ([VAR1, VAR2, ...] in ARRAY [ limit EXP ]) STMT
1032              Same  as  above,  used when the array is indexed with a tuple of
1033              keys.  A sorting suffix may be used on at most one VAR or  ARRAY
1034              identifier.
1036       foreach  ([VAR1,  VAR2, ...] in ARRAY [INDEX1, INDEX2, ...] [ limit EXP
1037       ]) STMT
1038              Same as above, where iterations are limited to elements  in  the
1039              array  where the keys match the index values specified. The sym‐
1040              bol * can be used to specify an index and will be treated  as  a
1041              wildcard.
1043       foreach (VAR0 = VAR in ARRAY [ limit EXP ]) STMT
1044              This variant of foreach saves the current value into VAR0 on
1045              each iteration, so VAR0 is the same as ARRAY[VAR].  This also
1046              works with a tuple of keys.  Sorting suffixes on VAR0 have the
1047              same effect as on ARRAY.
1049       foreach (VAR0 = VAR in ARRAY [INDEX1, INDEX2, ...] [ limit EXP ]) STMT
1050              Same as above, where iterations are limited to elements  in  the
1051              array  where the keys match the index values specified. The sym‐
1052              bol * can be used to specify an index and will be treated  as  a
1053              wildcard.
1055       break, continue
1056              Exit  or  iterate  the  innermost  nesting loop (while or for or
1057              foreach) statement.
1059       return EXP
1060              Return EXP value from enclosing  function.   If  the  function's
1061              value  is  not  taken  anywhere,  then a return statement is not
1062              needed, and the function will have a special "unknown" type with
1063              no return value.
1065       next   Return  now  from  enclosing  probe handler.  This is especially
1066              useful in probe aliases that apply event  filtering  predicates.
1067              When used in functions, the execution will be immediately trans‐
1068              ferred to the next overloaded function.
1070       try { STMT1 } catch { STMT2 }
1071              Run the statements in the first block.  Upon  any  run-time  er‐
1072              rors,  abort  STMT1  and  start  executing STMT2.  Any errors in
1073              STMT2 will propagate to outer try/catch blocks, if any.
1075       try { STMT1 } catch(VAR) { STMT2 }
1076              Same as above, plus assign  the  error  message  to  the  string
1077              scalar variable VAR.
1079       delete ARRAY[INDEX1, INDEX2, ...]
1080              Remove  from ARRAY the element specified by the index tuple.  If
1081              the index tuple contains a * in place of  an  index,  the  *  is
1082              treated  as a wildcard and all elements with keys that match the
1083              index tuple will be removed  from  ARRAY.   The  value  will  no
1084              longer  be  available, and subsequent iterations will not report
1085              the element.  It is not an error to delete an element that  does
1086              not exist.
1088       delete ARRAY
1089              Remove all elements from ARRAY.
1091       delete SCALAR
1092              Removes  the  value of SCALAR.  Integers and strings are cleared
1093              to 0 and "" respectively, while statistics are reset to the ini‐
1094              tial empty state.
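
Several of the statements above can be combined in one small script. This is a sketch with made-up data: the array name bytes and its contents are illustrative, and error() is the tapset function that raises a run-time error.

```systemtap
global bytes

probe begin {
  bytes["a", 1] = 30
  bytes["a", 2] = 10
  bytes["b", 1] = 20

  // Descending by value, at most two iterations.
  foreach ([name, id] in bytes- limit 2)
    printf("%s/%d = %d\n", name, id, bytes[name, id])

  // Only elements whose first key is "a"; v receives each value.
  foreach (v = [name, id] in bytes["a", *])
    printf("%s/%d = %d\n", name, id, v)

  // error() raises a run-time error, which the catch block receives.
  try {
    error("something failed")
  } catch (msg) {
    printf("caught: %s\n", msg)
  }

  // Wildcard delete, then clear whatever is left.
  delete bytes["a", *]
  foreach ([name, id] in bytes)
    printf("remaining: %s/%d\n", name, id)
  delete bytes

  exit()
}
```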
1098       Systemtap  supports  a  number  of operators that have the same general
1099       syntax, semantics, and precedence as in C and awk.  Arithmetic is  per‐
1100       formed as per typical C rules for signed integers.  Division by zero or
1101       overflow is detected and results in an error.
1103       binary numeric operators
1104              * / % + - >> << & ^ | && ||
1106       binary string operators
1107              .  (string concatenation)
1109       numeric assignment operators
1110              = *= /= %= += -= >>= <<= &= ^= |=
1112       string assignment operators
1113              = .=
1115       unary numeric operators
1116              + - ! ~ ++ --
1118       binary numeric, string comparison or regex matching operators
1119              < > <= >= == != =~ !~
1121       ternary operator
1122              cond ? exp1 : exp2
1124       grouping operator
1125              ( exp )
1127       function call
1128              fn ([ arg1, arg2, ... ])
1130       array membership check
1131              exp in array
1132              [exp1, exp2, ...] in array
1133              [*, *, ... ] in array
1137       The scripting language supports regular expression matching.  The basic
1138       syntax is as follows:
1140              exp =~ regex
1141              exp !~ regex
1143       (The  first  operand  must be an expression evaluating to a string; the
1144       second operand must be a  string  literal  containing  a  syntactically
1145       valid regular expression.)
1147       The  regular  expression  syntax supports most of the features of POSIX
1148       Extended Regular Expressions, except  for  subexpression  reuse  ("\1")
1149       functionality.
1151       After a successful match, the contents of the matched string and subex‐
1152       pressions can be extracted using the  matched()  and  ngroups()  tapset
1153       functions as follows:
1155              if ("an example string" =~ "str(ing)") {
1156                matched(0) // -> returns "string", the matched substring
1157                matched(1) // -> returns "ing", the 1st matched subexpression
1158                ngroups()  // -> returns 2, the number of matched groups
1159              }
1162   PROBES
1163       The main construct in the scripting language identifies probes.  Probes
1164       associate abstract events with a statement block ("probe handler") that
1165       is  to  be executed when any of those events occur.  The general syntax
1166       is as follows:
1168              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }
1169              probe PROBEPOINT [, PROBEPOINT] if (CONDITION) { [STMT ...] }
1172       Events are specified in a special syntax called "probe points".   There
1173       are  several  varieties  of probe points defined by the translator, and
1174       tapset scripts may define further ones using aliases.  Probe points may
1175       be  wildcarded, grouped, or listed in preference sequences, or declared
1176       optional.  More details on probe point syntax and semantics are  listed
1177       on the stapprobes(3stap) manual page.
1179       The probe handler is interpreted relative to the context of each event.
1180       For events associated with kernel code, this context may include  vari‐
1181       ables  defined  in  the source code at that spot.  These "context vari‐
1182       ables" are presented to the script as variables whose  names  are  pre‐
1183       fixed  with  "$".   They  may be accessed only if the kernel's compiler
1184       preserved them despite optimization.  This is the same constraint  that
1185       a  debugger  user faces when working with optimized code.  In addition,
1186       the objects must exist in paged-in memory at the moment of the  system‐
1187       tap  probe  handler's execution, because systemtap must not cause (sup‐
1188       presses) any additional paging.  Some probe types have very little con‐
1189       text.   See the stapprobes(3stap) man pages to see the kinds of context
1190       variables available at each kind of probe point.
1192       Probes may be decorated with an arming condition, consisting of a  sim‐
1193       ple  boolean  expression  on  read-only global script variables.  While
1194       disarmed (inactive, condition evaluates to false), some probe types re‐
1195       duce  or  eliminate their run-time overheads.  When an arming condition
1196       evaluates to true, probes will be soon re-armed, and their  probe  han‐
1197       dlers  will  start getting called as the events fire.  (Some events may
1198       be lost during the arming interval.  If this is  unacceptable,  do  not
1199       use arming conditions for those probes.)  Example of the syntax:
1201              probe timer.us(TIMER) if (enabled) {
1202              }
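
A fuller sketch of an arming condition follows. It assumes the syscall tapset is available; the globals enabled and reads are illustrative names. The condition only reads enabled, which a separate timer probe toggles.

```systemtap
global enabled, reads

// Toggle the arming condition every two seconds.
probe timer.s(2) { enabled = !enabled }

// Counted only while the condition evaluates to true.
probe syscall.read if (enabled) { reads++ }

probe timer.s(10) {
  printf("reads while armed: %d\n", reads)
  exit()
}
```

As noted above, some read events near each toggle may be missed while the probe is being re-armed.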
1205       New probe points may be defined using "aliases".  Probe point aliases
1206       look similar to probe definitions, but instead of activating a probe at
1207       the given point, they just define a new probe point name as an alias to
1208       an existing one.  There are two types of alias, the prologue style and
1209       the epilogue style, which are identified by "=" and "+=" respectively.
1212       For a prologue style alias, the statement block that follows the alias
1213       definition is implicitly added as a prologue to any probe that refers
1214       to the alias.  For an epilogue style alias, the statement block is
1215       instead added as an epilogue to any probe that refers to the alias.
1216       For example:
1218              probe syscall.read = kernel.function("sys_read") {
1219                fildes = $fd
1220                if (execname() == "init") next  # skip rest of probe
1221              }
1223       defines  a   new   probe   point   syscall.read,   which   expands   to
1224       kernel.function("sys_read"),  with  the  given statement as a prologue,
1225       which is useful to predefine some variables for the alias  user  and/or
1226       to skip probe processing entirely based on some conditions.  And
1228              probe syscall.read += kernel.function("sys_read") {
1229                if (tracethis) println ($fd)
1230              }
1232       defines  a  new  probe  point  with the given statement as an epilogue,
1233       which is useful to take actions based upon variables set or  left  over
1234       by the alias user.  Please note that in each case, the statements
1235       in the alias handler block are treated ordinarily,  so  that  variables
1236       assigned  there  constitute  mere initialization, not a macro substitu‐
1237       tion.
1239       An alias is used just like a built-in probe type.
1241              probe syscall.read {
1242                printf("reading fd=%d\n", fildes)
1243                if (fildes > 10) tracethis = 1
1244              }
1249       Systemtap scripts may define subroutines to  factor  out  common  work.
1250       Functions  take any number of scalar (integer or string) arguments, and
1251       must return a single scalar (integer or string).  An  example  function
1252       declaration looks like this:
1254              function thisfn (arg1, arg2) {
1255                 return arg1 + arg2
1256              }
1258       Note  the  general  absence of type declarations, which are instead in‐
1259       ferred by the translator.  However, if desired, a  function  definition
1260       may  include explicit type declarations for its return value and/or its
1261       arguments.  This is especially helpful for  embedded-C  functions.   In
1262       the following example, the type inference engine need only infer the
1263       type of arg2 (a string).
1265              function thatfn:string (arg1:long, arg2) {
1266                 return sprint(arg1) . arg2
1267              }
1269       Functions may call others or themselves  recursively,  up  to  a  fixed
1270       nesting  limit.   This  limit is defined by the MAXNESTING macro in the
1271       translated C code and is in the neighbourhood of 10.
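
A small sketch of recursion within the nesting limit: the function name fact is illustrative, and the argument is kept small so that the recursion depth stays well below the limit mentioned above.

```systemtap
function fact:long (n:long) {
  if (n <= 1) return 1
  return n * fact(n - 1)
}

probe begin {
  // Five nested calls, comfortably below the nesting limit.
  printf("%d\n", fact(5))
  exit()
}
```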
1273       Functions may be marked private using  the  private  keyword  to  limit
1274       their  scope  to the tapset or user script file they are defined in. An
1275       example definition of a private function follows:
1277              private function three:long () { return 3 }
1280       Functions terminating without reaching  an  explicit  return  statement
1281       will return an implicit 0 or "", determined by type inference.
1283       Functions may be overloaded during both runtime and compile time.
1285       Runtime  overloading  allows the executed function to be selected while
1286       the module is running, based on runtime conditions.  It is achieved using
1287       the "next" statement in script functions and the STAP_NEXT macro in
1288       embedded-C functions. For example,
1291              function f() { if (condition) next; print("first function") }
1292              function f() %{ STAP_NEXT; print("second function") %}
1293              function f() { print("third function") }
1296       During a call to f(), execution transfers to the third function if
1297       condition evaluates to true, and that function prints "third function".
1298       Note that the second function unconditionally passes control on.
1300       Parameter overloading allows the function to be executed to be selected
1301       at  compile time based on the number of arguments provided in the
1302       function call. For example,
1305              function g() { print("first function") }
1306              function g(x) { print("second function") }
1307              g() -> "first function"
1308              g(1) -> "second function"
1311       Note that runtime overloading does not occur in the above  example,  as
1312       exactly  one function will be resolved for the function call.  The use of
1313       a next statement inside a function while no more overloads remain  will
1314       trigger a runtime exception.  Runtime overloading will only occur if the
1315       functions have the same arity; functions with the same name but a
1316       different number of parameters are completely unrelated.
1318       Execution  order  is determined by a priority value which may be speci‐
1319       fied.  If no explicit priority is specified, user script functions  are
1320       given  a  higher priority than library functions. User script functions
1321       and library functions are assigned a default priority value of 0 and  1
1322       respectively.   Functions with the same priority are executed in decla‐
1323       ration order. For example,
1326              function f():3 { if (condition) next; print("first function") }
1327              function f():1 { if (condition) next; print("second function") }
1328              function f():2 { print("third function") }
1331       Since the second function has the highest priority (the lowest value),
1332       it is executed first.  The first function is never executed, as there
1333       are no "next" statements in the third function to transfer execution.
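           The ordering rules above can be modelled outside of systemtap.  The
           following Python sketch is only an illustration, not the translator's
           implementation: the names Next, call_overloaded and make_f are
           hypothetical, and it assumes that a lower priority value runs first
           and that ties keep declaration order, as the text describes.

```python
class Next(Exception):
    """Raised by a candidate to hand control to the next overload (models `next`)."""


def call_overloaded(candidates):
    """Run overloads of one function in priority order.

    `candidates` is a list of (priority, body) pairs in declaration order.
    A lower priority value runs earlier; ties keep declaration order
    because Python's sort is stable.
    """
    for _priority, body in sorted(candidates, key=lambda c: c[0]):
        try:
            return body()
        except Next:
            continue
    # Models the runtime exception raised when no overloads remain.
    raise RuntimeError("next used with no overloads remaining")


def make_f(condition):
    """Build the three overloads of f() from the priority example."""
    def first():
        if condition:
            raise Next
        return "first function"

    def second():
        if condition:
            raise Next
        return "second function"

    def third():
        return "third function"

    return [(3, first), (1, second), (2, third)]


print(call_overloaded(make_f(condition=True)))   # third function
print(call_overloaded(make_f(condition=False)))  # second function
```

           In both cases the first function (priority 3) is never reached,
           because the third function never hands control on.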
1337       There is a set of function names that are  specially  treated  by  the
1338       translator.   They format values for printing to the standard systemtap
1339       output stream in a more convenient way (note that data generated in the
1340       kernel  module  needs to be transferred to user-space in order to be
1341       printed).
1343       The sprint* variants return the formatted string instead of  printing
1344       it.
1346       print, sprint
1347              Print  one or more values of any type, concatenated directly to‐
1348              gether.
1350       println, sprintln
1351              Print values like print and sprint, but also append a newline.
1353       printd, sprintd
1354              Take a string delimiter and two or more values of any type,  and
1355              print  the  values with the delimiter interposed.  The delimiter
1356              must be a literal string constant.
1358       printdln, sprintdln
1359              Print values with a delimiter like printd and sprintd, but  also
1360              append a newline.
1362       printf, sprintf
1363              Take a formatting string and a number of values of corresponding
1364              types, and print them all.  The format must be a literal  string
1365              constant.
1367       The  printf  formatting  directives are similar to those of C, except
1368       that they are fully type-checked by the translator:
1370              %b     Writes a binary blob of the value given, instead of ASCII
1371                     text.  The width specifier determines the number of bytes
1372                     to write; valid specifiers are %b %1b %2b %4b  %8b.   De‐
1373                     fault (%b) is 8 bytes.
1375              %c     Character.
1377              %d,%i  Signed decimal.
1379              %m     Safely  reads  kernel (without #) or user (with #) memory
1380                     at the given address, outputs its content.  The  optional
1381                     precision specifier (not field width) determines the num‐
1382                     ber of bytes to read - default is 1 byte.  %10.4m  prints
1383                     4  bytes  of  the  memory  in  a 10-character-wide field.
1384                     Note, on some architectures user memory can still be read
1385                     without #.
1387              %M     Same as %m, but outputs in hexadecimal.  The minimal size
1388                     of output is double the optional  precision  specifier  -
1389                     default  is  1 byte (2 hex chars).  %10.4M prints 4 bytes
1390                     of the memory as 8 hexadecimal characters in a 10-charac‐
1391                     ter-wide  field.   %.*M hex-dumps a given number of bytes
1392                     from a given buffer.
1394              %o     Unsigned octal.
1396              %p     Unsigned pointer address.
1398              %s     String.
1400              %u     Unsigned decimal.
1402              %x     Unsigned hex value, in all lower-case.
1404              %X     Unsigned hex value, in all upper-case.
1406              %%     Writes a %.
1408       The # flag selects the alternate forms.  For octal, this prefixes a  0.
1409       For  hex,  this  prefixes 0x or 0X, depending on case.  For characters,
1410       this escapes non-printing values with either C-like escapes or raw  oc‐
1411       tal.   In  the  case of %#m/%#M, this safely accesses user space memory
1412       rather than kernel space memory.
1414       Examples:
1416              a = "alice", b = "bob", p = 0x1234abcd, i = 123, j = -1, id[a] = 1234, id[b] = 4567
1417              print("hello")
1418                                        Prints: hello
1419              println(b)
1420                                        Prints: bob\n
1421              println(a . " is " . sprint(16))
1422                                        Prints: alice is 16
1423              foreach (name in id)  printdln("|", strlen(name), name, id[name])
1424                                        Prints: 5|alice|1234\n3|bob|4567\n
1425              printf("%c is %s; %x or %X or %p; %d or %u\n",97,a,p,p,p,j,j)
1426                                        Prints: a is alice; 1234abcd or 1234ABCD or 0x1234abcd; -1 or 18446744073709551615\n
1427              printf("2 bytes of kernel buffer at address %p: %.2m", p, p)
1428                                        Prints: 2 bytes of kernel buffer at address 0x1234abcd: <binary data>
1429              printf("%4b", p)
1430                                        Prints (these values as binary data): 0x1234abcd
1431              printf("%#o %#x %#X\n", 1, 2, 3)
1432                                        Prints: 01 0x2 0X3
1433              printf("%#c %#c %#c\n", 0, 9, 42)
1434                                        Prints: \000 \t *
1439       It is often desirable to collect statistics in a way  that  avoids  the
1440       penalties  of  repeatedly taking exclusive locks on the global variables
1441       those numbers are being put into.  Systemtap provides a solution using a
1442       special operator to accumulate values, and several pseudo-functions to
1443       extract the statistical aggregates.
1445       The aggregation operator is <<<, and resembles an assignment, or a  C++
1446       output-streaming operation.  The left operand specifies a scalar or ar‐
1447       ray-index lvalue, which must be declared global.  The right operand  is
1448       a  numeric  expression.  The meaning is intuitive: add the given number
1449       to the pile of numbers to compute statistics of.  (The specific list of
1450       statistics to gather is given separately, by the extraction functions.)
1452              foo <<< 1
1453              stats[pid()] <<< memsize
1456       The  extraction  functions  are also special.  For each appearance of a
1457       distinct extraction function  operating  on  a  given  identifier,  the
1458       translator  arranges  to  compute  a set of statistics that satisfy it.
1459       The statistics system is thereby "on-demand".  Each execution of an ex‐
1460       traction function causes the aggregation to be computed for that moment
1461       across all processors.
1463       Here is the set of extractor functions.  The first argument of each  is
1464       the  same  style of lvalue used on the left hand side of the accumulate
1465       operation.  The @count(v), @sum(v), @min(v), @max(v),  @avg(v),  @vari‐
1466       ance(v[, b]) extractor functions compute the number/total/minimum/maxi‐
1467       mum/average/variance of all accumulated values.  The  resulting  values
1468       are  all  simple  integers.  Arrays containing aggregates may be sorted
1469       and iterated.  See the foreach construct above.
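           As an illustration of what the extractor functions report, the
           following Python sketch models a single aggregate.  It is not how the
           runtime stores data: the class name Aggregate and its methods are
           hypothetical, and truncating integer division for the average is an
           assumption based on the statement that results are simple integers.

```python
class Aggregate:
    """Running statistics for one aggregate variable (models `v <<< n`)."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.minimum = None
        self.maximum = None

    def add(self, value):
        """Accumulate one integer sample, like `v <<< value`."""
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    def avg(self):
        """Integer average, truncated toward zero (assumed C-style division)."""
        quotient = abs(self.total) // self.count
        return quotient if self.total >= 0 else -quotient


x = Aggregate()
for sample in (3, 9, 4):
    x.add(sample)

# Analogous to @count(x), @sum(x), @min(x), @max(x), @avg(x)
print(x.count, x.total, x.minimum, x.maximum, x.avg())  # 3 16 3 9 5
```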
1471       Variance uses Welford's online algorithm.  The calculations  are  based
1472       on  integer  arithmetic, and so may suffer from low precision and over‐
1473       flow.  To improve this, @variance(v[, b]) accepts an optional parameter
1474       b, the bit-shift, ranging from 0 (default) to 62, for internal scaling.
1475       Only one bit-shift value may be used with a given global variable.   A
1476       larger  bit-shift  value increases precision, but also increases the
1477       likelihood of overflow.
1480              $ stap -e \
1481              > 'global x probe oneshot { for(i=1;i<=5;i++) x<<<i println(@variance(x)) }'
1482              12
1483              $ stap -e \
1484              > 'global x probe oneshot { for(i=1;i<=5;i++) x<<<i println(@variance(x,1)) }'
1485              2
1486              $ python3 -c 'import statistics; print(statistics.variance([1, 2, 3, 4, 5]))'
1487              2.5
1488              $
1491       Overflow (from internal multiplication of large numbers) may occur  and
1492       may  cause a negative variance result.  Consider normalizing your input
1493       data.  Adding or subtracting a fixed value  from  all  variance  inputs
1494       preserves  the  original  variance.   Dividing the variance inputs by a
1495       fixed value shrinks the original variance by that value squared.
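           The two normalization rules can be checked with ordinary arithmetic.
           This Python sketch uses the standard statistics module with exact
           fractions rather than systemtap's integer algorithm, so it only
           demonstrates the mathematical identities; the sample data is made up.

```python
from fractions import Fraction
import statistics

data = [10, 20, 30, 40, 50]
base = statistics.variance(data)

# Adding a fixed value to every input leaves the variance unchanged.
shifted = statistics.variance([v + 1000 for v in data])

# Dividing every input by 10 shrinks the variance by 10 squared.
scaled = statistics.variance([Fraction(v, 10) for v in data])

print(shifted == base)          # True
print(scaled * 10**2 == base)   # True
```

           To recover the variance of the original data after dividing the
           inputs by a fixed value, multiply the result by that value squared.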
1499       Histograms are also available, but are more  complicated  because  they
1500       have  a vector rather than scalar value.  @hist_linear(v,start,stop,in‐
1501       terval) represents a linear histogram from "start" to "stop" by  incre‐
1502       ments  of  "interval".   The  interval  must  be  positive.  Similarly,
1503       @hist_log(v) represents a base-2 logarithmic histogram. Printing a his‐
1504       togram with the print family of functions renders a histogram object as
1505       a tabular "ASCII art" bar chart.
1507              probe timer.profile {
1508                x[1] <<< pid()
1509                x[2] <<< uid()
1510                y <<< tid()
1511              }
1512              global x // an array containing aggregates
1513              global y // a scalar
1514              probe end {
1515                foreach ([i] in x @count+) {
1516                   printf ("x[%d]: avg %d = sum %d / count %d\n",
1517                           i, @avg(x[i]), @sum(x[i]), @count(x[i]))
1518                   println (@hist_log(x[i]))
1519                }
1520                println ("y:")
1521                println (@hist_log(y))
1522              }
1527       Once a pointer (see the CONTEXT VARIABLES section of stapprobes(3stap))
1528       has been saved into a script integer variable, the translator loses the
1529       type information necessary to access members from that pointer.   Using
1530       the  @cast()  operator tells the translator how to interpret the number
1531       as a typed pointer.
1533              @cast(p, "type_name"[, "module"])->member
1536       This will interpret p as a pointer to a  struct/union  named  type_name
1537       and  dereference  the member value.  Further ->subfield expressions may
1538       be appended to dereference more levels. Note that for  direct  derefer‐
1539       encing  of  a  pointer {kernel,user}_{char,int,...}($p) should be used.
1540       (Refer to stapfuncs(5) for more details.)   NOTE: the same  dereferenc‐
1541       ing  operator -> is used to refer to both direct containment or pointer
1542       indirection.  Systemtap automatically determines which.   The  optional
1543       module  tells  the  translator where to look for information about that
1544       type.  Multiple modules may be specified as a list with  :  separators.
1545       If  the  module  is  not specified, it will default either to the probe
1546       module for dwarf probes, or to "kernel" for  functions  and  all  other
1547       probe types.
1549       The  translator  can create its own module with type information from a
1550       header surrounded by angle brackets, in case normal  debuginfo  is  not
1551       available.   For kernel headers, prefix it with "kernel" to use the ap‐
1552       propriate build system.  All other headers are built with  default  GCC
1553       parameters  into  a  user module.  Multiple headers may be specified in
1554       sequence to resolve a codependency.
1556              @cast(tv, "timeval", "<sys/time.h>")->tv_sec
1557              @cast(task, "task_struct", "kernel<linux/sched.h>")->tgid
1558              @cast(task, "task_struct",
1559                    "kernel<linux/sched.h><linux/fs_struct.h>")->fs->umask
1561       Values acquired by @cast may be pretty-printed by the $ and  $$  suffix
1562       operators,  the  same way as described in the CONTEXT VARIABLES section
1563       of the stapprobes(3stap) manual page.
1566       When in guru mode, the translator will also allow scripts to assign new
1567       values to members of typecasted pointers.
1569       Typecasting  is also useful in the case of void* members whose type may
1570       be determinable at runtime.
1572              probe foo {
1573                if ($var->type == 1) {
1574                  value = @cast($var->data, "type1")->bar
1575                } else {
1576                  value = @cast($var->data, "type2")->baz
1577                }
1578                print(value)
1579              }
1584       When in guru mode, the translator accepts embedded C code  in  the  top
1585       level  of the script.  Such code is enclosed between %{ and %} markers,
1586       and is transcribed verbatim, without analysis, in some  sequence,  into
1587       the  top  level  of the generated C code.  At the outermost level, this
1588       may be useful to add #include instructions, and any  auxiliary  defini‐
1589       tions for use by other embedded code.
1591       Another  place  where embedded code is permitted is as a function body.
1592       In this case, the script language body is replaced entirely by a  piece
1593       of C code enclosed again between %{ and %} markers.  This C code may do
1594       anything reasonable and safe.  There are a number of  undocumented  but
1595       complex safety constraints on atomicity, concurrency, resource consump‐
1596       tion, and run time limits, so this is an advanced technique.
1598       The memory locations set aside for input and  output  values  are  made
1599       available  to it using macros STAP_ARG_* and STAP_RETVALUE.  Errors may
1600       be signalled with STAP_ERROR. Output may be written  with  STAP_PRINTF.
1601       The  function  may  return early with STAP_RETURN.  Here are some exam‐
1602       ples:
1604              function integer_ops (val) %{
1605                STAP_PRINTF("%d\n", STAP_ARG_val);
1606                STAP_RETVALUE = STAP_ARG_val + 1;
1607                if (STAP_RETVALUE == 4)
1608                    STAP_ERROR("wrong guess: %d", (int) STAP_RETVALUE);
1609                if (STAP_RETVALUE == 3)
1610                    STAP_RETURN(0);
1611                STAP_RETVALUE ++;
1612              %}
1613              function string_ops (val) %{
1614                strlcpy (STAP_RETVALUE, STAP_ARG_val, MAXSTRINGLEN);
1615                strlcat (STAP_RETVALUE, "one", MAXSTRINGLEN);
1616                if (strcmp (STAP_RETVALUE, "three-two-one"))
1617                    STAP_RETURN("parameter should be three-two-");
1618              %}
1619              function no_ops () %{
1620                  STAP_RETURN(); /* function inferred with no return value */
1621              %}
1623       The function argument and return value types have to be inferred by the
1624       translator  from  the  call  sites  in order for this to work. The user
1625       should examine C code generated for ordinary script-language  functions
1626       in order to write compatible embedded-C ones.
1628       The  last  place  where  embedded code is permitted is as an expression
1629       rvalue.  In this case, the C code enclosed between %{ and %} markers is
1630       interpreted  as  an  ordinary  expression value.  It is assumed to be a
1631       normal 64-bit signed number, unless the marker /* string */ is  includ‐
1632       ed, in which case it's treated as a string.
1634              function add_one (val) {
1635                return val + %{ 1 %}
1636              }
1637              function add_string_two (val) {
1638                return val . %{ /* string */ "two" %}
1639              }
1642       The  embedded-C  code  may  contain  markers to assert optimization and
1643       safety properties.
1645       /* pure */
1646              means that the C code has no side effects and may be elided  en‐
1647              tirely if its value is not used by script code.
1649       /* stable */
1650              means  that  the  C code always has the same value (in any given
1651              probe handler invocation), so repeated calls may be automatical‐
1652              ly replaced by memoized values.  Such functions must take no pa‐
1653              rameters, and also be pure.
1655       /* unprivileged */
1656              means that the C code is so safe that  even  unprivileged  users
1657              are permitted to use it.
1659       /* myproc-unprivileged */
1660              means  that  the  C code is so safe that even unprivileged users
1661              are permitted to use it, provided that the target of the current
1662              probe is within the user's own process.
1664       /* guru */
1665              means  that  the  C code is so unsafe that a systemtap user must
1666              specify -g (guru mode) to use this.  (Tapsets are permitted  and
1667              presumed to call them safely.)
1669       /* unmangled */
1670              in an embedded-C function, means that the legacy (pre-1.8) argu‐
1671              ment access syntax should be made available inside the function.
1672              Hence, in addition to STAP_ARG_foo and STAP_RETVALUE one can use
1673              THIS->foo and THIS->__retvalue respectively inside the function.
1674              This  is useful for quickly migrating code written for SystemTap
1675              version 1.7 and earlier.
1677       /* unmodified-fnargs */
1678              in an embedded-C function, means that the function arguments are
1679              not modified inside the function body.
1681       /* string */
1682              in  embedded-C  expressions  only, means that the expression has
1683              const char * type and should be treated as a string  value,  in‐
1684              stead of the default long numeric.
1686       Script  level  global variables may be accessed in embedded-C functions
1687       and blocks. To read or write the global variable var, the
1688       /* pragma:read:var */ or /* pragma:write:var */ marker must first be
1689       placed in the embedded-C function or block. This provides the
1690       STAP_GLOBAL_GET_* and STAP_GLOBAL_SET_* macros to allow reading and
1691       writing, respectively. For example:
1693              global var
1694              global var2[100]
1695              function increment() %{
1696                  /* pragma:read:var */ /* pragma:write:var */
1697                  /* pragma:read:var2 */ /* pragma:write:var2 */
1698                  STAP_GLOBAL_SET_var(STAP_GLOBAL_GET_var()+1); //var++
1699                  STAP_GLOBAL_SET_var2(1, 1, STAP_GLOBAL_GET_var2(1, 1)+1); //var2[1,1]++
1700              %}
1702       Variables may be read and set in both embedded-C functions and  expres‐
1703       sions.   Strings returned from embedded-C code are decayed to pointers.
1704       Variables must also be assigned at script level to allow for  type  in‐
1705       ference.  Map assignment does not return the value written, so chaining
1706       does not work.
1709   BUILT-INS
1710       A set of builtin probe point aliases are provided by  the  scripts  in‐
1711       stalled  in  the  directory  specified in the stappaths(7) manual page.
1712       The functions are described in the stapprobes(3stap) manual page.
1716       Integers can be dereferenced from pointers saved  as  script  integer
1717       variables  using  the  @kderef()  or @uderef() operators.  @kderef() is
1718       used for kernel space addresses and @uderef() is used  for  user  space
1719       addresses.
1721              @kderef(SIZE, addr)
1722              @uderef(SIZE, addr)
1724       This  will  interpret addr as a kernel/user address and read SIZE bytes
1725       starting at that address.  SIZE should be either 1, 2, 4 or 8 bytes.
1729       The value stored within a register can be accessed using  the  @kregis‐
1730       ter() or @uregister() operators.  @kregister() is used for kernel space
1731       registers and @uregister() is used for user space registers. The regis‐
1732       ter of interest is specified using its DWARF number.
1734              @kregister(0)
1735              @uregister(5)


1739       The translator begins pass 1 by parsing the given input script, and all
1740       scripts  (files  named  *.stp)  found  in  a  tapset  directory.    The
1741       directories listed with -I are processed in sequence, each processed in
1742       "guru mode".  For each directory, a number of subdirectories  are  also
1743       searched.   These  subdirectories  are derived from the selected kernel
1744       version (the -R option), in order to allow more kernel-version-specific
1745       scripts  to  override  less  specific  ones.  For example, for a kernel
1746       version 2.6.12-23.FC3 the following  patterns  would  be  searched,  in
1747       sequence:  2.6.12-23.FC3/*.stp,  2.6.12/*.stp,  2.6/*.stp,  and finally
1748       *.stp.  Stopping the translator after pass 1 causes  it  to  print  the
1749       parse trees.
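           The version-specific search order can be sketched as follows.  This
           Python function is a hypothetical illustration that reproduces the
           example above; how it splits the release string (at the first "-",
           then at the first two dot-separated components) is an assumption and
           may differ from what the translator actually does.

```python
def tapset_patterns(kernel_release):
    """Return tapset glob patterns from most to least version-specific."""
    base = kernel_release.split("-", 1)[0]        # e.g. "2.6.12"
    major_minor = ".".join(base.split(".")[:2])   # e.g. "2.6"
    patterns = []
    for prefix in (kernel_release, base, major_minor):
        pattern = prefix + "/*.stp"
        if pattern not in patterns:               # skip duplicate prefixes
            patterns.append(pattern)
    patterns.append("*.stp")                      # least specific, searched last
    return patterns


print(tapset_patterns("2.6.12-23.FC3"))
# ['2.6.12-23.FC3/*.stp', '2.6.12/*.stp', '2.6/*.stp', '*.stp']
```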
1752       In  pass 2, the translator analyzes the input script to resolve symbols
1753       and types.  References to variables, functions, and probe aliases  that
1754       are unresolved internally are satisfied by searching through the parsed
1755       tapset script files.  If any tapset script file is selected because  it
1756       defines  an  unresolved symbol, then the entirety of that file is added
1757       to the translator's resolution queue.  This process iterates until  all
1758       symbols are resolved and a subset of tapset script files is selected.
1760       Next,  all  probe  point  descriptions  are  validated against the wide
1761       variety supported by the translator.  Probe points that refer  to  code
1762       locations  ("synchronous  probe points") require the appropriate kernel
1763       debugging  information  to  be  installed.   In  the  associated  probe
1764       handlers,  target-side variables (whose names begin with "$") are found
1765       and have their run-time locations decoded.
1767       Next,  all  probes  and  functions  are   analyzed   for   optimization
1768       opportunities, in order to remove variables, expressions, and functions
1769       that have no useful value and no side-effect.  Embedded-C functions are
1770       assumed  to  have  side-effects  unless  they  include the magic string
1771       /* pure */.  Since this optimization can hide latent code  errors  such
1772       as  type  mismatches or invalid $context variables, it sometimes may be
1773       useful to disable the optimizations with the -u option.
1775       Finally, all variable, function, parameter, array, and index types  are
1776       inferred   from   context   (literals  and  operators).   Stopping  the
1777       translator after pass 2 causes it to list all  the  probes,  functions,
1778       and  variables,  along  with  all  inferred types.  Any inconsistent or
1779       unresolved types cause an error.
1782       In pass 3, the translator writes C code that represents the actions  of
1783       all  selected script files, and creates a Makefile to build that into a
1784       kernel object.  These files are  placed  into  a  temporary  directory.
1785       Stopping  the  translator at this point causes it to print the contents
1786       of the C file.
1789       In pass 4, the translator invokes the  Linux  kernel  build  system  to
1790       create  the  actual  kernel object file.  This involves running make in
1791       the temporary directory, and requires  a  kernel  module  build  system
1792       (headers,  config  and  Makefiles)  to  be  installed in the usual spot
1793       /lib/modules/VERSION/build.  Stopping the translator after  pass  4  is
1794       the  last  chance before running the kernel object.  This may be useful
1795       if you want to archive the file.
1798       In pass 5, the  translator  invokes  the  systemtap  auxiliary  program
1799       staprun  for the given kernel object.  This program arranges to
1800       load the module, then communicates with it, copying trace data from the
1801       kernel  into temporary files, until the user sends an interrupt signal.
1802       Any run-time error encountered by the probe handlers, such  as  running
1803       out  of  memory, division by zero, or exceeding nesting or runtime limits,
1804       results in a soft error indication.  Soft errors in excess of MAXERRORS
1805       block  all  subsequent  probes  (except error-handling probes), and
1806       terminate the session.  Finally, staprun unloads the module, and cleans
1807       up.
1811       One  should  avoid  killing the stap process forcibly, for example with
1812       SIGKILL, because the stapio  process  (a  child  process  of  the  stap
1813       process)  and  the loaded module may be left running on the system.  If
1814       this happens, send SIGTERM or SIGINT to any remaining stapio processes,
1815       then use rmmod to unload the systemtap module.


1820       See the stapex(3stap) manual page for a brief collection of samples, or
1821       a   large   set   of   installed   samples    under    the    systemtap
1822       documentation/testsuite  directories.   See  stappaths(7stap)  for  the
1823       likely location of these on the system.


1827       The systemtap translator caches the pass  3  output  (the  generated  C
1828       code)  and  the  pass  4  output (the compiled kernel module) if pass 4
1829       completes successfully.  This cached  output  is  reused  if  the  same
1830       script  is  translated  again  assuming the same conditions exist (same
1831       kernel version, same systemtap version, etc.).  Cached files are stored
1832       in  the  $SYSTEMTAP_DIR/cache  directory.  The  cache can be limited by
1833       having the file cache_mb_limit placed in  the  cache  directory  (shown
1834       above)  containing  only an ASCII integer representing how many MiB the
1835       cache should not exceed. In the absence of this file, a default will be
1836       created  with  the limit set to 256MiB.  This is a 'soft' limit in that
1837       the cache will be cleaned after a new entry is added if the cache clean
1838       interval  is  exceeded,  so the total cache size may temporarily exceed
1839       this  limit.  This  interval  can  be  specified  by  having  the  file
1840       cache_clean_interval_s  placed  in  the  cache  directory (shown above)
1841       containing only an ASCII integer representing the interval in  seconds.
1842       In  the  absence  of  this  file,  a  default  will be created with the
1843       interval set to 300 s.
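           A minimal sketch of creating the two control files is shown below.
           The helper names write_cache_limits and default_cache_dir are
           hypothetical, and the fallback of ~/.systemtap when SYSTEMTAP_DIR is
           unset is an assumption.  The demonstration writes into a scratch
           directory rather than a real cache.

```python
import os
import tempfile
from pathlib import Path


def default_cache_dir():
    """Return $SYSTEMTAP_DIR/cache (assumes ~/.systemtap when unset)."""
    base = os.environ.get("SYSTEMTAP_DIR", os.path.expanduser("~/.systemtap"))
    return Path(base) / "cache"


def write_cache_limits(cache_dir, mb_limit, clean_interval_s):
    """Write the size limit (MiB) and clean interval (seconds) files."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Each file contains only an ASCII integer.
    (cache_dir / "cache_mb_limit").write_text(str(int(mb_limit)), encoding="ascii")
    (cache_dir / "cache_clean_interval_s").write_text(
        str(int(clean_interval_s)), encoding="ascii")


# Demonstrate against a scratch directory instead of the real cache.
with tempfile.TemporaryDirectory() as scratch:
    demo_dir = Path(scratch) / "cache"
    write_cache_limits(demo_dir, mb_limit=512, clean_interval_s=600)
    print(sorted(p.name for p in demo_dir.iterdir()))
    # ['cache_clean_interval_s', 'cache_mb_limit']
```

           To apply the same settings to a real cache, pass default_cache_dir()
           as the first argument.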


1847       Systemtap may be used as a powerful administrative tool.  It can expose
1848       kernel   internal   data   structures   and  potentially  private  user
1849       information.  (In dyninst runtime mode, this is not the case,  see  the
1850       ALTERNATE RUNTIMES section below.)
1852       The  translator  asserts many safety constraints during compilation and
1853       more during run-time.  It aims to ensure that no  handler  routine  can
1854       run   for   very   long,  allocate  boundless  memory,  perform  unsafe
1855       operations, or unintentionally interfere with the  system.   Uses  of
1856       script   global   variables  are  automatically  read/write  locked  as
1857       appropriate,  to  protect  against  manipulation  by  concurrent  probe
1858       handlers.   (Deadlocks  are detected with timeouts.  Use the -t flag to
1859       receive reports of  excessive  lock  contention.)   Experimenting  with
1860       scripts  is  therefore  generally safe.  The guru-mode -g option allows
1861       administrators to bypass most safety measures, which  permits  invasive
1862       or  state-changing  operations, embedded-C code, and increases the risk
1863       of upset.  By  default,  overload  prevention  is  turned  on  for  all
1864       modules.   If  you  would  like to disable overload processing, use the
1865       --suppress-time-limits option.
1867       Errors that are caught at run time normally result in  a  clean  script
1868       shutdown  and  a  pass-5  error message.  The --suppress-handler-errors
1869       option lets scripts tolerate soft errors without shutting down.
1874       For the normal linux-kernel-module runtime, to run the  kernel  objects
1875       systemtap builds, a user must be one of the following:
1877       ·   the root user;
1879       ·   a member of the stapdev and stapusr groups;
1881       ·   a member of the stapsys and stapusr groups; or
1883       ·   a member of the stapusr group.
1885       The root user or a user who is a member of both the stapdev and stapusr
1886       groups can build and run any systemtap script.
1888       A user who is a member of both the stapsys and stapusr groups can  only
1889       use pre-built modules under the following conditions:
1891       ·   The module has been signed by a trusted signer. Trusted signers are
1892           normally systemtap compile-servers  which  sign  modules  when  the
1893           --privilege   option   is   specified   by   the  client.  See  the
1894           stap-server(8) manual page for more information.
1896       ·   The  module  was  built  using  the  --privilege=stapsys   or   the
1897           --privilege=stapusr options.
1899       Members  of only the stapusr group can only use pre-built modules under
1900       the following conditions:
1902       ·   The  module  is  located  in   the   /lib/modules/VERSION/systemtap
1903           directory.   This  directory must be owned by root and not be world
1904           writable.
1906       or
1908       ·   The module has been signed by a trusted signer. Trusted signers are
1909           normally  systemtap  compile-servers  which  sign  modules when the
1910           --privilege  option  is  specified   by   the   client.   See   the
1911           stap-server(8) manual page for more information.
1913       ·   The module was built using the --privilege=stapusr option.
1915       The kernel modules generated by the stap program are run by the staprun
1916       program.  The latter is a part of the Systemtap package,  dedicated  to
1917       module  loading and unloading (but only in the white zone), and kernel-
1918       to-user data transfer.  Since staprun does not perform  any  additional
1919       security  checks  on the kernel objects it is given, it would be unwise
1920       for a system administrator to add untrusted users  to  the  stapdev  or
1921       stapusr groups.
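       The group rules above can be summarized as a small decision procedure.
       The following is a minimal sketch only: the function name and the
       labels it prints are illustrative and are not part of systemtap.

```shell
# Sketch: classify what a user may do with the kernel-module runtime,
# following the group rules described in this section.
# stap_privilege_level and its output labels are hypothetical names.
stap_privilege_level() {
    # $1: numeric user id (0 for root); remaining args: the user's groups
    uid=$1; shift
    has_dev=0; has_sys=0; has_usr=0
    for g in "$@"; do
        case $g in
            stapdev) has_dev=1 ;;
            stapsys) has_sys=1 ;;
            stapusr) has_usr=1 ;;
        esac
    done
    if [ "$uid" -eq 0 ]; then
        echo "root: build and run any script"
    elif [ "$has_usr" -eq 1 ] && [ "$has_dev" -eq 1 ]; then
        echo "stapdev: build and run any script"
    elif [ "$has_usr" -eq 1 ] && [ "$has_sys" -eq 1 ]; then
        echo "stapsys: signed pre-built modules only"
    elif [ "$has_usr" -eq 1 ]; then
        echo "stapusr: trusted-directory or signed pre-built modules only"
    else
        echo "none: cannot run systemtap kernel modules"
    fi
}

# Usage: classify the current user.
# stap_privilege_level "$(id -u)" $(id -Gn)
```

       Note that membership in stapdev or stapsys without stapusr grants
       nothing in this sketch, which mirrors the list above.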
1925       If  the  current  system has SecureBoot turned on in the UEFI firmware,
1926       all kernel modules must be signed.  (Some kernels may  allow  disabling
1927       SecureBoot  long  after  booting  with  a key sequence such as SysRq-X,
1928       making it unnecessary to sign modules.)  The systemtap  compile  server
1929       can  sign  modules with a MOK (Machine Owner Key) that it has in common
1930       with a client system. See the following wiki page for more details:
1932              https://sourceware.org/systemtap/wiki/SecureBoot
1934       Some kernels do not let systemtap guess whether module signing
1935       is  in  effect.   On  such machines, set the SYSTEMTAP_SIGN environment
1936       variable to any value while running stap.
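       A minimal sketch of setting the variable for a single run follows.
       The wrapper name stap_with_signing is hypothetical, and the value 1
       is arbitrary, since any value is accepted.

```shell
# Sketch: run stap with SYSTEMTAP_SIGN set for this invocation only.
# stap_with_signing is a hypothetical helper, not a systemtap command.
stap_with_signing() {
    # The value is arbitrary; the text above says any value will do.
    SYSTEMTAP_SIGN=1 stap "$@"
}

# Usage (script.stp is a placeholder file name):
# stap_with_signing -v script.stp
```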
1940       Many resource use limits are set by macros in  the  generated  C  code.
1941       These may be overridden with -D flags.  A selection of these is as fol‐
1942       lows:
1944       MAXNESTING
1945              Maximum number of nested function calls.  Default determined  by
1946              script  analysis,  with  a  bonus  10  slots added for recursive
1947              scripts.
1949       MAXSTRINGLEN
1950              Maximum length of strings, default 128.
1952       MAXTRYLOCK
1953              Maximum number of iterations to wait for locks on  global  vari‐
1954              ables before declaring possible deadlock and skipping the probe,
1955              default 1000.
1957       MAXACTION
1958              Maximum number of statements to execute during any single  probe
1959              hit  (with  interrupts  disabled),  default 1000.  Note that for
1960              straight-through probe handlers lacking loops or recursion,  due
1961              to optimization, this parameter may be interpreted too conserva‐
1962              tively.
1964       MAXACTION_INTERRUPTIBLE
1965              Maximum number of statements to execute during any single  probe
1966              hit which is executed with interrupts enabled (such as begin/end
1967              probes), default (MAXACTION * 10).
1969       MAXBACKTRACE
1970              Maximum number of stack frames that will be processed by  the
1971              stap  runtime unwinder as produced by the backtrace functions in
1972              the [u]context-unwind.stp tapsets, default 20.
1974       MAXMAPENTRIES
1975              Maximum number of rows in any single global array, default 2048.
1976              Individual arrays may be declared with a larger or smaller limit
1977              instead:
1979              global big[10000],little[5]
1981              or denoted with % to make them wrap-around (replace old entries)
1982              automatically, as in
1984              global big%
1986              or both.
1988       MAPHASHBIAS
1989              The  number of powers-of-two to add or subtract from the natural
1990              size of the hash table backing each  global  associative  array.
1991              Default  is  0.  Try small positive numbers to get extra perfor‐
1992              mance at the cost  of  more  memory  consumption,  because  that
1993              should reduce hash table collisions.  Try small negative numbers
1994              for the opposite tradeoff.
1996       MAXERRORS
1997              Maximum number of soft errors before an exit is  triggered,  de‐
1998              fault  0, which means that the first error will exit the script.
1999              Note that with the --suppress-handler-errors option, this  limit
2000              is not enforced.
2002       MAXSKIPPED
2003              Maximum  number  of  skipped probes before an exit is triggered,
2004              default 100.  Running systemtap with -t (timing) mode gives more
2005              details  about  skipped  probes.   With the default -DINTERRUPT‐
2006              IBLE=1 setting, probes skipped due to reentrancy are not accumu‐
2007              lated  against  this  limit.  Note that with the --suppress-han‐
2008              dler-errors option, this limit is not enforced.
2010       MINSTACKSPACE
2011              Minimum number of free kernel stack bytes required in  order  to
2012              run  a probe handler, default 1024.  This number should be large
2013              enough for the probe handler's own needs, plus a safety margin.
2015       MAXUPROBES
2016              Maximum number of  concurrently  armed  user-space  probes  (up‐
2017              robes),  default  somewhat  larger than the number of user-space
2018              probe points named in the script.  This pool needs to be  poten‐
2019              tially  large  because individual uprobe objects (about 64 bytes
2020              each) are allocated for each process for each  matching  script-
2021              level probe.
2023       STP_MAXMEMORY
2024              Maximum  amount of memory (in kilobytes) that the systemtap mod‐
2025              ule should use, default unlimited.  The memory size includes the
2026              size  of  the  module  itself,  plus any additional allocations.
2027              This only tracks direct allocations by  the  systemtap  runtime.
2028              This does not track indirect allocations (as done by kprobes/up‐
2029              robes/etc. internals).
2031       STP_OVERLOAD_THRESHOLD, STP_OVERLOAD_INTERVAL
2032              Maximum number of machine cycles spent in probes on any cpu  per
2033              given interval, before an overload condition is declared and the
2034              script shut down.  The defaults are 500 million and  1  billion,
2035              so as to limit stap script cpu consumption at around 50%.
2037       STP_PROCFS_BUFSIZE
2038              Size  of  procfs  probe  read  buffers  (in bytes).  Defaults to
2039              MAXSTRINGLEN.  This value can be overridden on a per-procfs file
2040              basis using the procfs read probe .maxsize(MAXSIZE) parameter.
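       As a rough illustration of overriding these macros, the sketch below
       builds the -D options from NAME=VALUE pairs and works out the CPU
       share implied by the default overload limits.  The helper names and
       the values in the usage comment are made up for the example; only
       the -D flag and the macro names come from this page.

```shell
# Sketch: turn NAME=VALUE pairs into -D options for stap.
# stap_define_flags is a hypothetical helper name.
stap_define_flags() {
    flags=
    for pair in "$@"; do
        # Append "-DNAME=VALUE", separated by single spaces.
        flags="${flags:+$flags }-D$pair"
    done
    printf '%s\n' "$flags"
}

# Sketch: percentage of cpu time allowed by an overload threshold
# (cycles spent in probes) over an interval (cycles), rounded down.
overload_percent() {
    echo $(( 100 * $1 / $2 ))
}

# Usage (script.stp and the values are placeholders):
# stap $(stap_define_flags MAXACTION=5000 MAXSTRINGLEN=256) script.stp
# overload_percent 500000000 1000000000
```

       With the defaults of 500 million and 1 billion, overload_percent
       prints 50, matching the "around 50%" figure given above.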
2042       With  scripts that contain probes on any interrupt path, it is possible
2043       that those interrupts may occur in the middle of another probe handler.
2044       The  probe  in  the  interrupt handler would be skipped in this case to
2045       avoid reentrance.  To work around this issue, execute stap with the op‐
2046       tion -DINTERRUPTIBLE=0 to mask interrupts throughout the probe handler.
2047       This does add some extra overhead to the probes,  but  it  may  prevent
2048       reentrance  for  common problem cases.  However, probes in NMI handlers
2049       and in the callpath of the stap runtime may still  be  skipped  due  to
2050       reentrance.
2053       In case something goes wrong with stap or staprun after a probe has al‐
2054       ready started running, one may safely kill both user processes, and re‐
2055       move the active probe kernel module with rmmod.  Any pending trace mes‐
2056       sages may be lost.
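       The cleanup described above might look like the following sketch.  It
       assumes that systemtap modules carry the default "stap_" name prefix,
       which does not hold for modules named explicitly; the helper name
       list_stap_modules is hypothetical.  The destructive steps are left as
       comments.

```shell
# Sketch: list loaded modules whose names begin with "stap_".
# Assumption: modules use the default "stap_" name prefix.
list_stap_modules() {
    # $1: a file in /proc/modules format (defaults to /proc/modules);
    # the first field of each line is the module name.
    awk '$1 ~ /^stap_/ { print $1 }' "${1:-/proc/modules}"
}

# Cleanup, to be run as root after checking the list by hand:
# pkill -x stap; pkill -x staprun
# for m in $(list_stap_modules); do rmmod "$m"; done
```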


2060       Systemtap exposes kernel internal data structures and potentially  pri‐
2061       vate  user  information. Because of this, use of systemtap's full capa‐
2062       bilities is restricted to root and to users who  are  members  of  the
2063       groups stapdev and stapusr.
2065       However, a restricted set of systemtap's features can be made available
2066       to trusted, unprivileged users. These users are members  of  the  group
2067       stapusr  only,  or  members  of  the groups stapusr and stapsys.  These
2068       users can load systemtap modules which have been compiled and certified
2069       by  a trusted systemtap compile-server. See the descriptions of the op‐
2070       tions --privilege and --use-server. See README.unprivileged in the sys‐
2071       temtap  source  code for information about setting up a trusted compile
2072       server.
2074       The restrictions enforced when --privilege=stapsys is specified are de‐
2075       signed to prevent unprivileged users from:
2077              ·   harming the system maliciously.
2079       The restrictions enforced when --privilege=stapusr is specified are de‐
2080       signed to prevent unprivileged users from:
2082              ·   harming the system maliciously.
2084              ·   gaining access to information which would  not  normally  be
2085                  available to an unprivileged user.
2087              ·   disrupting the performance of processes owned by other users
2088                  of the system.  Some overhead to the system  in  general  is
2089                  unavoidable  since  the  unprivileged  user's probes will be
2090                  triggered at the appropriate times. What we  would  like  to
2091                  avoid  is  targeted interruption of another user's processes
2092                  which would not normally be possible by an unprivileged  us‐
2093                  er.
2097       A member of the groups stapusr and stapsys may use all probe points.
2099       A member of only the group stapusr may use only the following probes:
2101              ·   begin, begin(n)
2103              ·   end, end(n)
2105              ·   error(n)
2107              ·   never
2109              ·   process.*, where the target process is owned by the user.
2111              ·   timer.{jiffies,s,sec,ms,msec,us,usec,ns,nsec}(n)*
2113              ·   timer.hz(n)
2117       The  following  scripting  language features are unavailable to all un‐
2118       privileged users:
2121              ·   any feature enabled by the Guru Mode (-g) option.
2123              ·   embedded C code.
2127       The following runtime restrictions are  placed  upon  all  unprivileged
2128       users:
2130              ·   Only the default runtime code (see -R) may be used.
2132       Additional  restrictions  are  placed on members of only the group sta‐
2133       pusr:
2135              ·   Probing of processes owned by other users is not permitted.
2137              ·   Access of kernel memory (read and write) is not permitted.
2141       Some command line options provide access to features which must not  be
2142       available to any unprivileged user:
2145              ·   -g may not be specified.
2147              ·   The  following options may not be used by the compile-server
2148                  client:
2150                      -a, -B, -D, -I, -r, -R
2155       The following environment variables must not be set by any  unprivi‐
2156       leged user:
2158              SYSTEMTAP_RUNTIME
2159              SYSTEMTAP_TAPSET
2165       In  general,  tapset  functions  are  only available for members of the
2166       group stapusr when they do not gather information that an ordinary pro‐
2167       gram running with that user's privileges would be denied access to.
2169       There  are  two  categories of unprivileged tapset functions. The first
2170       category consists of utility functions that are unconditionally  avail‐
2171       able to all users; these include such things as:
2173              cpu:long ()
2174              exit ()
2175              str_replace:string (prnt_str:string, srch_str:string, rplc_str:string)
2178       The second category consists of so-called myproc-unprivileged functions
2179       that can only gather information within their  own  processes.  Scripts
2180       that  wish  to  use  these functions must test the result of the tapset
2181       function is_myproc and only call these functions if the  result  is  1.
2182       The  script  will exit immediately if any of these functions are called
2183       by an unprivileged user within a probe within a process  which  is  not
2184       owned by that user. Examples of myproc-unprivileged functions include:
2186              print_usyms (stk:string)
2187              user_int:long (addr:long)
2188              usymname:string (addr:long)
2191       A  compile  error  is  triggered when any function not in either of the
2192       above categories is used by members of only the group stapusr.
2194       No other built-in tapset functions may be used by members of  only  the
2195       group stapusr.
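       A minimal sketch of the is_myproc guard follows.  It assumes a target
       program with a function named main, started through -c; the variable
       and helper names are hypothetical.  The tapset function uaddr(), used
       here to obtain the current user-space address, is not listed above
       and is assumed to belong to the myproc-unprivileged category.

```shell
# Sketch: a script that calls a myproc-unprivileged function only after
# is_myproc() confirms the probe fired in one of the user's own processes.
# myproc_script and run_myproc_probe are hypothetical names.
myproc_script='probe process.function("main") {
    if (is_myproc())    # guard: skip processes owned by other users
        println(usymname(uaddr()))
}'

run_myproc_probe() {
    # $1: command to run and probe, passed to stap -c
    stap --privilege=stapusr -c "$1" -e "$myproc_script"
}

# Usage (./myprog is a placeholder):
# run_myproc_probe ./myprog
```

       Without the guard, the script would exit immediately if the probe
       fired in a process that the user does not own.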


2199       As  described above, systemtap's default runtime mode involves building
2200       and loading kernel modules, with various security tradeoffs  presented.
2201       Systemtap  now  includes  two new prototype backends: --runtime=dyninst
2202       and --runtime=bpf.
2204       --runtime=dyninst uses Dyninst to instrument a user's own processes  at
2205       runtime. This backend does not use kernel modules, and does not require
2206       root privileges, but is restricted with respect to the kinds of  probes
2207       and other constructs that a script may use.  The dyninst runtime operates in
2208       target-attach mode, so it requires a -c COMMAND or -x PID process.
2209       For example:
2211              stap --runtime=dyninst -c 'stap -V' \
2212                   -e 'probe process.function("main")
2213                       { println("hi from dyninst!") }'
2216       It may be necessary to disable a conflicting SELinux check with
2218              # setsebool allow_execstack 1
2221       --runtime=bpf  compiles  the  user script into extended Berkeley Packet
2222       Filter (eBPF) programs instead of a kernel module.  eBPF  programs  are
2223       verified by the kernel for safety and are executed by an in-kernel vir‐
2224       tual machine.  This runtime is in an early  stage  of  development  and
2225       currently  lacks  support for a number of features available in the de‐
2226       fault runtime. Please see the stapbpf(8) man page for more information.


2230       The systemtap translator generally returns with a success code of 0  if
2231       the  requested  script  was processed and executed successfully through
2232       the requested pass.  Otherwise, errors may be printed to stderr  and  a
2233       failure  code is returned.  Use -v or -vp N to increase (global or per-
2234       pass) verbosity to identify the source of the trouble.
2236       In listings mode (-l and -L), error messages are  normally  suppressed.
2237       A  success  code  of  0  is returned if at least one matching probe was
2238       found.
2240       A script executing in pass 5 that is interrupted with ^C  /  SIGINT  is
2241       considered to be successful.
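       A shell wrapper can act on these exit codes.  The sketch below uses
       hypothetical helper names; the probe point in the usage comment is
       only an example.

```shell
# Sketch: test whether a probe point matches anything, using listing mode.
# probe_exists and run_script are hypothetical helper names.
probe_exists() {
    # In -l mode, exit status 0 means at least one matching probe was found.
    stap -l "$1" >/dev/null 2>&1
}

# Sketch: run stap and report on its exit status.
run_script() {
    stap "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "stap succeeded (this includes runs ended with ^C)"
    else
        echo "stap failed with status $rc; retry with -v or -vp N" >&2
    fi
    return "$rc"
}

# Usage (script.stp is a placeholder):
# probe_exists 'kernel.function("vfs_read")' && run_script script.stp
```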


2245       Over  time, some features of the script language and the tapset library
2246       may undergo incompatible changes, so that a script written  against  an
2247       old  version  of  systemtap  may no longer run.  In these cases, it may
2248       help to run systemtap with the --compatible  VERSION  flag,  specifying
2249       the   last   known   working   version.   Running  systemtap  with  the
2250       --check-version flag will output a warning if any possibly incompatible
2251       elements have been parsed.  Historical details of deprecations may be found
2252       in the NEWS file.
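       The two flags can be wrapped as in the sketch below.  The helper
       names, the version number and the script name in the usage comments
       are placeholders.

```shell
# Sketch: run an older script against the last version known to work.
# run_compat and check_compat are hypothetical helper names.
run_compat() {
    # $1: last known working systemtap version; rest: normal stap arguments
    version=$1; shift
    stap --compatible "$version" "$@"
}

# Sketch: ask stap to warn about possibly incompatible elements.
check_compat() {
    stap --check-version "$@"
}

# Usage (2.9 and old_script.stp are placeholders):
# check_compat old_script.stp
# run_compat 2.9 old_script.stp
```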
2254       The purpose of the deprecation facility is to improve the experience of
2255       scripts  written  for newer versions of systemtap (by adding better al‐
2256       ternatives and removing conflicting or messy older alternatives), while
2257       at  the same time permitting scripts written for older versions of sys‐
2258       temtap to continue running.  Deprecation is thus intended as a service to
2259       users (and an inconvenience to systemtap's developers), rather than the
2260       other way around.
2262       Please note that underscore-prefixed identifiers in  the  tapset  some‐
2263       times undergo changes for which compatibility is difficult to preserve,
2264       even with the deprecation mechanisms.  Avoid relying on  these  in
2265       your  scripts;  instead  propose  them for promotion to non-underscored
2266       status.


2271       Important files and their corresponding paths are listed in the
2272              stappaths(7) manual page.


2276       stapprobes(3stap),
2277       function::*(3stap),
2278       probe::*(3stap),
2279       tapset::*(3stap),
2280       stappaths(7),
2281       staprun(8),
2282       stapdyn(8),
2283       systemtap(8),
2284       stapvars(3stap),
2285       stapex(3stap),
2286       stap-server(8),
2287       stap-prep(1),
2288       stapref(1),
2289       awk(1),
2290       gdb(1)


2294       Use the Bugzilla link of the project web  page  or  our  mailing  list.
2295       http://sourceware.org/systemtap/, <systemtap@sourceware.org>.
2297       error::reporting(7stap),
2298       https://sourceware.org/systemtap/wiki/HowToReportBugs
2302                                                                       STAP(1)