STAP(1)                     General Commands Manual                    STAP(1)

NAME

       stap - systemtap script translator/driver

SYNOPSIS

       stap [ OPTIONS ] FILENAME [ ARGUMENTS ]
       stap [ OPTIONS ] - [ ARGUMENTS ]
       stap [ OPTIONS ] -e SCRIPT [ ARGUMENTS ]
       stap [ OPTIONS ] -l PROBE [ ARGUMENTS ]
       stap [ OPTIONS ] -L PROBE [ ARGUMENTS ]

DESCRIPTION

       The stap program is the front-end to the Systemtap tool.  It accepts
       probing instructions (written in a simple scripting language),
       translates those instructions into C code, compiles this C code, and
       loads the resulting kernel module into a running Linux kernel to
       perform the requested system trace/probe functions.  You can supply the
       script in a named file, from standard input, or from the command line.
       The program runs until it is interrupted by the user, the script
       voluntarily invokes the exit() function, or enough soft errors
       accumulate.

       The language, which is described in a later section, is strictly typed,
       declaration free, procedural, and inspired by awk.  It allows source
       code points or events in the kernel to be associated with handlers,
       which are subroutines that are executed synchronously.  It is somewhat
       similar conceptually to "breakpoint command lists" in the gdb debugger.

OPTIONS

       The systemtap translator supports the following options.  Any other
       option prints a list of supported options.

       -h --help
              Show help message.

       -V --version
              Show version message.

       -p NUM Stop after pass NUM.  The passes are numbered 1-5: parse,
              elaborate, translate, compile, run.  See the PROCESSING section
              for details.

       -v     Increase verbosity for all passes.  Produce a larger volume of
              informative (?) output each time the option is repeated.

       --vp ABCDE
              Increase verbosity on a per-pass basis.  For example, "--vp 002"
              adds 2 units of verbosity to pass 3 only.  The combination
              "-v --vp 00004" adds 1 unit of verbosity for all passes, and 4
              more for pass 5.

       -k     Keep the temporary directory after all processing.  This may be
              useful in order to examine the generated C code, or to reuse the
              compiled kernel object.

       -g     Guru mode.  Enable parsing of unsafe expert-level constructs
              like embedded C.

       -P     Prologue-searching mode.  Activate heuristics to work around
              incorrect debugging information for $target variables.

       -u     Unoptimized mode.  Disable unused code elision during
              elaboration.

       -w     Suppressed warnings mode.  Disables all warning messages.

       -b     Use bulk mode (percpu files) for kernel-to-user data transfer.

       -t     Collect timing information on the number of times each probe
              executes and the average amount of time spent in each probe
              point.  Also show the derivation for each probe point.

       -sNUM  Use NUM megabyte buffers for kernel-to-user data transfer.  On a
              multiprocessor in bulk mode, this is a per-processor amount.

       -I DIR Add the given directory to the tapset search directory.  See the
              description of pass 2 for details.

       -D NAME=VALUE
              Add the given C preprocessor directive to the module Makefile.
              These can be used to override limit parameters described below.

       -B NAME=VALUE
              Add the given make directive to the kernel module build's make
              invocation.  These can be used to add or override kconfig
              options.

       -G NAME=VALUE
              Sets the value of global variable NAME to VALUE when staprun is
              invoked.  This applies to scalar variables declared global in
              the script/tapset.

       -R DIR Look for the systemtap runtime sources in the given directory.

       -r /DIR
              Build for kernel in given build tree.  Can also be set with the
              SYSTEMTAP_RELEASE environment variable.

       -r RELEASE
              Build for kernel in build tree /lib/modules/RELEASE/build.  Can
              also be set with the SYSTEMTAP_RELEASE environment variable.

       -m MODULE
              Use the given name for the generated kernel object module,
              instead of a unique randomized name.  The generated kernel
              object module is copied to the current directory.

       -d MODULE
              Add symbol/unwind information for the given module into the
              kernel object module.  This may enable symbolic tracebacks from
              those modules/programs, even if they do not have an explicit
              probe placed into them.

       --ldd  Add symbol/unwind information for all shared libraries suspected
              by ldd to be necessary for user-space binaries being probed or
              listed with the -d option.  Caution: this can make the probe
              modules considerably larger.

       --all-modules
              Equivalent to specifying "-dkernel" and a "-d" for each kernel
              module that is currently loaded.  Caution: this can make the
              probe modules considerably larger.

       -o FILE
              Send standard output to named file.  In bulk mode, percpu files
              will start with FILE_ (FILE_cpu with -F) followed by the cpu
              number.  This supports strftime(3) formats for FILE.

       -c CMD Start the probes, run CMD, and exit when CMD finishes.  This
              also has the effect of setting target() to the pid of the
              command run.

       -x PID Sets target() to PID.  This allows scripts to be written that
              filter on a specific process.

       -l PROBE
              Instead of running a probe script, just list all available probe
              points matching the given single probe point.  The pattern may
              include wildcards and aliases, but not comma-separated multiple
              probe points.  The process result code will indicate failure if
              there are no matches.

       -L PROBE
              Similar to "-l", but list probe points and script-level local
              variables.

       -F     Without -o option, load module and start probes, then detach
              from the module leaving the probes running.  With -o option, run
              staprun in background as a daemon and show its pid.

       -S size[,N]
              Sets the maximum size of each output file and the maximum number
              of output files.  If an output file would exceed size, systemtap
              switches output to the next file.  If the number of output
              files exceeds N, systemtap removes the oldest output file.  The
              second argument may be omitted.

       --skip-badvars
              Ignore out-of-context variables and substitute the literal 0 for
              them.

       --compatible VERSION
              Suppress recent script language or tapset changes which are
              incompatible with the given older version of systemtap.  This
              may be useful if a much older systemtap script fails to run.
              See the DEPRECATION section for more details.

       --check-version
              This option is used to check if the active script has any
              constructs that may be systemtap version specific.  See the
              DEPRECATION section for more details.

       --clean-cache
              This option prunes stale entries from the cache directory.  This
              is normally done automatically after successful runs, but this
              option will trigger the cleanup manually and then exit.  See the
              CACHING section for more details about cache limits.

       --disable-cache
              This option disables all use of the cache directory.  No files
              will be either read from or written to the cache.

       --poison-cache
              This option treats files in the cache directory as invalid.  No
              files will be read from the cache, but resulting files from this
              run will still be written to the cache.  This is meant as a
              troubleshooting aid when stap's caching seems to be
              misbehaving.

       --unprivileged
              This option instructs stap to examine the script looking for
              constructs which are not allowed for unprivileged users (see
              UNPRIVILEGED USERS).  Compilation fails if any such constructs
              are used.  If this option is specified when using a compile
              server (see --use-server), the server will examine the script
              and, if compilation succeeds, the server will cryptographically
              sign the resulting kernel module, certifying that it is safe for
              use by unprivileged users.

              If --unprivileged has not been specified, -pN has not been
              specified with N < 5, and the invoking user is not root, is not
              a member of the group stapdev, but is a member of the group
              stapusr, then stap will automatically add --unprivileged to the
              options already specified.

       --use-server [HOSTNAME[:PORT] | IP_ADDRESS[:PORT] | CERT_SERIAL]
              Specify compile-server(s) to be used for compilation and/or in
              conjunction with --list-servers and --trust-servers (see below).
              If no argument is supplied, then the default in unprivileged
              mode (see --unprivileged) is to select compatible servers which
              are trusted as SSL peers and as module signers and currently
              online.  Otherwise the default is to select compatible servers
              which are trusted as SSL peers and currently online.
              --use-server may be specified more than once, in which case a
              list of servers is accumulated in the order specified.  Servers
              may be specified by host name, IP address, or by certificate
              serial number (obtained using --list-servers).  The latter is
              most commonly used when revoking trust in a server (see
              --trust-servers below).  If a server is specified by host name
              or IP address, then an optional port number may be specified.
              This is useful for accessing servers which are not on the local
              network or to specify a particular server.

              If --use-server has not been specified, -pN has not been
              specified with N < 5, and the invoking user is not root, is not
              a member of the group stapdev, but is a member of the group
              stapusr, then stap will automatically add --use-server to the
              options already specified.

       --use-server-on-error [yes|no]
              Instructs stap to retry compilation of a script using a compile
              server if compilation on the local host fails in a manner which
              suggests that it might succeed using a server.  If this option
              is not specified, the default is no.  If no argument is
              provided, then the default is yes.  Compilation will be retried
              for certain types of errors (e.g. insufficient data or
              resources) which may not occur during re-compilation by a
              compile server.  Compile servers will be selected automatically
              for the re-compilation attempt as if --use-server was specified
              with no arguments.

       --list-servers [SERVERS]
              Display the status of the requested SERVERS, where SERVERS is a
              comma-separated list of server attributes.  The list of
              attributes is combined to filter the list of servers displayed.
              Supported attributes are:

              all    specifies all known servers (trusted SSL peers, trusted
                     module signers, online servers).

              specified
                     specifies servers specified using --use-server.

              online filters the output by retaining information about servers
                     which are currently online.

              trusted
                     filters the output by retaining information about servers
                     which are trusted as SSL peers.

              signer filters the output by retaining information about servers
                     which are trusted as module signers (see --unprivileged).

              compatible
                     filters the output by retaining information about servers
                     which are compatible with the current kernel release and
                     architecture.

              If no argument is provided, then the default is specified.  If
              no servers were specified using --use-server, then the default
              servers for --use-server are listed.

       --trust-servers [TRUST_SPEC]
              Grant or revoke trust in the compile-servers specified using
              --use-server, as described by TRUST_SPEC, where TRUST_SPEC is a
              comma-separated list specifying the trust which is to be granted
              or revoked.  Supported elements are:

              ssl    trust the specified servers as SSL peers.

              signer trust the specified servers as module signers (see
                     --unprivileged).  Only root can specify signer.

              all-users
                     grant trust as an ssl peer for all users on the local
                     host.  The default is to grant trust as an ssl peer for
                     the current user only.  Trust as a module signer is
                     always granted for all users.  Only root can specify
                     all-users.

              revoke revoke the specified trust.  The default is to grant it.

              no-prompt
                     do not prompt the user for confirmation before carrying
                     out the requested action.  The default is to prompt the
                     user for confirmation.

              If no argument is provided, then the default is ssl.  If no
              servers were specified using --use-server, then no trust will be
              granted or revoked.

              Unless no-prompt has been specified, the user will be prompted
              to confirm the trust to be granted or revoked before the
              operation is performed.

       --remote [USER@]HOSTNAME
              Set the execution target to the specified ssh host, optionally
              using a username not matching your own.  This option may be
              repeated to target multiple execution targets.  Passes 1-4 are
              completed locally as normal to build the script, and then pass 5
              will copy the module to the target and run it.  If a custom
              ssh_config file is in use, add SendEnv LANG to retain
              internationalization functionality.

ARGUMENTS

       Any additional arguments on the command line are passed to the script
       parser for substitution.  See below.

SCRIPT LANGUAGE

       The systemtap script language resembles awk.  There are two main
       outermost constructs: probes and functions.  Within these, statements
       and expressions use C-like operator syntax and precedence.

   GENERAL SYNTAX
       Whitespace is ignored.  Three forms of comments are supported:
              # ... shell style, to the end of line, except for $# and @#
              // ... C++ style, to the end of line
              /* ... C style ... */
       Literals are either strings enclosed in double-quotes (passing through
       the usual C escape codes with backslashes), or integers (in decimal,
       hexadecimal, or octal, using the same notation as in C).  All strings
       are limited in length to some reasonable value (a few hundred bytes).
       Integers are 64-bit signed quantities, although the parser also accepts
       (and wraps around) values above positive 2**63.

       In addition, script arguments given at the end of the command line may
       be inserted.  Use $1 ... $<NN> for insertion unquoted, @1 ... @<NN> for
       insertion as a string literal.  The number of arguments may be accessed
       through $# (as an unquoted number) or through @# (as a quoted number).
       These may be used at any place a token may begin, including within the
       preprocessing stage.  Reference to an argument number beyond what was
       actually given is an error.
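
       As an illustration of argument substitution, the following sketch
       assumes a hypothetical script file args.stp invoked as
       "stap args.stp 42 myprog".  The file name, the argument values and
       the variable names are made up for this example.  It relies on the
       begin probe point and the printf function, which are described in
       other systemtap manual pages rather than in the text above.

```systemtap
# Hypothetical invocation: stap args.stp 42 myprog
probe begin {
  # $# is pasted as an unquoted number: the count of script arguments.
  printf("%d arguments given\n", $#)

  # $1 is pasted unquoted, so here it is parsed as the integer literal 42.
  threshold = $1

  # @2 is pasted as a string literal, here "myprog".
  progname = @2

  printf("threshold=%d progname=%s\n", threshold, progname)
  exit()
}
```

       Running the same script with fewer than two arguments should fail
       during parsing, because $1 or @2 would then refer to an argument that
       was not given.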

   PREPROCESSING
       A simple conditional preprocessing stage is run as a part of parsing.
       The general form is similar to the cond ? exp1 : exp2 ternary operator:
              %( CONDITION %? TRUE-TOKENS %)
              %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)
       The CONDITION is either an expression whose format is determined by its
       first keyword, a comparison of string literals, or a comparison of
       numeric literals.  It may also be composed of several such CONDITIONs
       joined by || (alternative) and && (conjunction).  However, parentheses
       are not supported yet, so it is important to remember that conjunction
       takes precedence over alternative.

       If the first part is the identifier kernel_vr or kernel_v, referring to
       the kernel version number with ("2.6.13-1.322FC3smp") or without
       ("2.6.13") the release code suffix respectively, then the second part
       is one of the six standard numeric comparison operators <, <=, ==, !=,
       >, and >=, and the third part is a string literal that contains an
       RPM-style version-release value.  The condition is deemed satisfied if
       the version of the target kernel (as optionally overridden by the -r
       option) compares to the given version string as the operator
       requires.  The comparison is performed by the glibc function
       strverscmp.  As a special case, if the operator is for simple equality
       (==), or inequality (!=), and the third part contains any wildcard
       characters (* or ? or [), then the expression is treated as a wildcard
       (mis)match as evaluated by fnmatch.

       If, on the other hand, the first part is the identifier arch to refer
       to the processor architecture (as named by the kernel build system
       ARCH/SUBARCH), then the second part is one of the two string comparison
       operators == or !=, and the third part is a string literal for matching
       it.  This comparison is a wildcard (mis)match.

       Similarly, if the first part is an identifier like CONFIG_something to
       refer to a kernel configuration option, then the second part is == or
       !=, and the third part is a string literal for matching the value
       (commonly "y" or "m").  Nonexistent or unset kernel configuration
       options are represented by the empty string.  This comparison is also
       a wildcard (mis)match.

       If the first part is the identifier systemtap_v, the test refers to the
       systemtap compatibility version, which may be overridden for old
       scripts with the --compatible flag.  The comparison operator is as for
       kernel_v and the right operand is a version string.  See also the
       DEPRECATION section below.

       Otherwise, the CONDITION is expected to be a comparison between two
       string literals or two numeric literals.  In this case, the arguments
       are the only variables usable.

       The TRUE-TOKENS and FALSE-TOKENS are zero or more general parser tokens
       (possibly including nested preprocessor conditionals), and are passed
       into the input stream if the condition is true or false, respectively.
       For example, the following code induces a parse error unless the target
       kernel version is newer than 2.6.5:
              %( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
       The following code might adapt to hypothetical kernel version drift:
              probe kernel.function (
                %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
                   %( kernel_vr == "2.6.13*smp" %? "do_page_fault" %:
                      UNSUPPORTED %) %)
              ) { /* ... */ }

              %( arch == "ia64" %?
                 probe syscall.vliw = kernel.function("vliw_widget") {}
              %)
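
       The remaining condition forms can be sketched in the same way.  The
       example below is only an illustration: CONFIG_SMP is used as a
       familiar kernel configuration option, the messages are invented, and
       the begin probe point is described in other systemtap manual pages
       rather than in the text above.

```systemtap
probe begin {
  # Kernel configuration test: an unset option compares as the empty
  # string, so the FALSE-TOKENS branch is taken in that case.
  %( CONFIG_SMP == "y" %?
     println("SMP kernel")
  %:
     println("uniprocessor kernel, or CONFIG_SMP unset")
  %)

  # Numeric literal comparison: script arguments such as $# are the
  # only variables usable in this form.
  %( $# >= 1 %?
     println("script arguments were given")
  %:
     println("no script arguments")
  %)

  exit()
}
```

       Because the selection happens during parsing, only one println call
       from each conditional reaches the later passes.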

   VARIABLES
       Identifiers for variables and functions are an alphanumeric sequence,
       and may include "_" and "$" characters.  They may not start with a
       plain digit, as in C.  Each variable is by default local to the probe
       or function statement block within which it is mentioned, and therefore
       its scope and lifetime are limited to a particular probe or function
       invocation.

       Scalar variables are implicitly typed as either string or integer.
       Associative arrays also have a string or integer value, and a tuple of
       strings and/or integers serving as a key.  Here are a few basic
       expressions.
              var1 = 5
              var2 = "bar"
              array1 [pid()] = "name"     # single numeric key
              array2 ["foo",4,i++] += 5   # vector of string/num/num keys
              if (["hello",5,4] in array2) println ("yes")  # membership test

       The translator performs type inference on all identifiers, including
       array indexes and function parameters.  Inconsistent type-related use
       of identifiers signals an error.
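
       A minimal sketch of how type inference plays out follows.  The
       variable names are invented for the example, and the begin probe
       point is described in other systemtap manual pages rather than in the
       text above.

```systemtap
global hits   # inferred from use below: integer key, integer value

probe begin {
  total = 5               # total is inferred to be an integer
  label = "bar"           # label is inferred to be a string
  hits[pid()] = total     # fixes the key and value types of hits
  println(label . " ok")
  println(hits[pid()])

  # Uncommenting the next line should make the translator reject the
  # script, because total would be used as both integer and string:
  # total = "five"

  exit()
}
```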

       Variables may be declared global, so that they are shared amongst all
       probes and live as long as the entire systemtap session.  There is one
       namespace for all global variables, regardless of which script file
       they are found within.  Concurrent access to global variables is
       automatically protected with locks; see the SAFETY AND SECURITY section
       for more details.  A global declaration may be written anywhere at the
       outermost level, but not within a block of code.  Global variables
       which are written but never read will be displayed automatically at
       session shutdown.  The translator will infer for each its value type,
       and if it is used as an array, its key types.  Optionally, scalar
       globals may be initialized with a string or number literal.  The
       following declaration marks variables as global.
              global var1, var2, var3=4

       Global variables can also be set as module options.  One can do this
       either by using the -G option, or by first compiling the module with
       stap -p4 and then setting the global variables on the command line
       when calling staprun on the generated module.  See staprun(8) for more
       information.
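
       The following sketch shows scalar globals whose initial values are
       meant to be overridden from the command line.  The file name
       globals.stp, the variable names and the values are hypothetical, and
       the exact form of the staprun command line should be checked against
       staprun(8).

```systemtap
# Hypothetical invocation with -G:
#   stap -G threshold=10 -G tag=demo globals.stp
# Hypothetical two-step alternative (see staprun(8) for the details):
#   stap -p4 -m globals globals.stp
#   staprun globals.ko threshold=10 tag=demo

global threshold = 3      # integer scalar with a default value
global tag = "default"    # string scalar with a default value

probe begin {
  # Prints the defaults unless they were overridden at startup.
  printf("threshold=%d tag=%s\n", threshold, tag)
  exit()
}
```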

       Arrays are limited in size by the MAXMAPENTRIES variable -- see the
       SAFETY AND SECURITY section for details.  Optionally, global arrays may
       be declared with a maximum size in brackets, overriding MAXMAPENTRIES
       for that array only.  Note that this doesn't indicate the type of keys
       for the array, just the size.
              global tiny_array[10], normal_array, big_array[50000]

   STATEMENTS
       Statements enable procedural control flow.  They may occur within
       functions and probe handlers.  The total number of statements executed
       in response to any single probe event is limited to some number defined
       by a macro in the translated C code, and is in the neighbourhood of
       1000.

       EXP    Execute the string- or integer-valued expression and throw away
              the value.

       { STMT1 STMT2 ... }
              Execute each statement in sequence in this block.  Note that
              separators or terminators are generally not necessary between
              statements.

       ;      Null statement, do nothing.  It is useful as an optional
              separator between statements to improve syntax-error detection
              and to handle certain grammar ambiguities.

       if (EXP) STMT1 [ else STMT2 ]
              Compare integer-valued EXP to zero.  Execute the first
              (non-zero) or second STMT (zero).

       while (EXP) STMT
              While integer-valued EXP evaluates to non-zero, execute STMT.

       for (EXP1; EXP2; EXP3) STMT
              Execute EXP1 as initialization.  While EXP2 is non-zero, execute
              STMT, then the iteration expression EXP3.

       foreach (VAR in ARRAY [ limit EXP ]) STMT
              Loop over each element of the named global array, assigning
              current key to VAR.  The array may not be modified within the
              statement.  By adding a single + or - operator after the VAR or
              the ARRAY identifier, the iteration will proceed in a sorted
              order, by ascending or descending index or value.  Using the
              optional limit keyword limits the number of loop iterations to
              EXP times.  EXP is evaluated once at the beginning of the loop.

       foreach ([VAR1, VAR2, ...] in ARRAY [ limit EXP ]) STMT
              Same as above, used when the array is indexed with a tuple of
              keys.  A sorting suffix may be used on at most one VAR or ARRAY
              identifier.

       foreach (VALUE = VAR in ARRAY [ limit EXP ]) STMT
              This variant of foreach saves current value into VALUE on each
              iteration, so it is the same as ARRAY[VAR].  This also works
              with a tuple of keys.  Sorting suffixes on VALUE have the same
              effect as on ARRAY.

       break, continue
              Exit or iterate the innermost nesting loop (while or for or
              foreach) statement.

       return EXP
              Return EXP value from enclosing function.  If the function's
              value is not taken anywhere, then a return statement is not
              needed, and the function will have a special "unknown" type with
              no return value.

       next   Return now from enclosing probe handler.  This is especially
              useful in probe aliases that apply event filtering predicates.

       try { STMT1 } catch { STMT2 }
              Run the statements in the first block.  Upon any run-time
              errors, abort STMT1 and start executing STMT2.  Any errors in
              STMT2 will propagate to outer try/catch blocks, if any.

       try { STMT1 } catch(VAR) { STMT2 }
              Same as above, plus assign the error message to the string
              scalar variable VAR.

       delete ARRAY[INDEX1, INDEX2, ...]
              Remove from ARRAY the element specified by the index tuple.  The
              value will no longer be available, and subsequent iterations
              will not report the element.  It is not an error to delete an
              element that does not exist.

       delete ARRAY
              Remove all elements from ARRAY.

       delete SCALAR
              Removes the value of SCALAR.  Integers and strings are cleared
              to 0 and "" respectively, while statistics are reset to the
              initial empty state.
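
       Several of the statements above can be combined in one short sketch.
       The array contents and variable names are invented, the array is
       filled by hand so that no kernel probe point is needed, and the begin
       probe point and printf function are described in other systemtap
       manual pages rather than in the text above.

```systemtap
global reads   # process name -> number of events (made-up data)

probe begin {
  reads["init"] = 3
  reads["bash"] = 12
  reads["sshd"] = 7

  # The "-" after the array name sorts by descending value, and
  # "limit 2" stops the loop after the two largest entries.
  foreach (name in reads- limit 2)
    printf("%s %d\n", name, reads[name])

  # The VALUE = VAR form saves the current value on each iteration.
  foreach (n = name in reads)
    printf("%s %d\n", name, n)

  # A run-time error inside try transfers control to the catch block.
  try {
    zero = 0
    println(1 / zero)
  } catch (msg) {
    println("caught: " . msg)
  }

  delete reads["init"]   # remove one element
  delete reads           # remove all remaining elements
  exit()
}
```

       The order of the second, unsorted loop is not specified by the text
       above, so a sorting suffix is the safer choice whenever the order of
       the output matters.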

   EXPRESSIONS
       Systemtap supports a number of operators that have the same general
       syntax, semantics, and precedence as in C and awk.  Arithmetic is
       performed as per typical C rules for signed integers.  Division by zero
       or overflow is detected and results in an error.

       binary numeric operators
              * / % + - >> << & ^ | && ||

       binary string operators
              .  (string concatenation)

       numeric assignment operators
              = *= /= %= += -= >>= <<= &= ^= |=

       string assignment operators
              = .=

       unary numeric operators
              + - ! ~ ++ --

       binary numeric or string comparison operators
              < > <= >= == !=

       ternary operator
              cond ? exp1 : exp2

       grouping operator
              ( exp )

       function call
              fn ([ arg1, arg2, ... ])

       array membership check
              exp in array
              [exp1, exp2, ...] in array
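
       A few of these operators are exercised in the sketch below.  All
       names and values are invented for the example, and the begin probe
       point and printf function are described in other systemtap manual
       pages rather than in the text above.

```systemtap
global seen

probe begin {
  a = 7
  b = 2

  # Integer arithmetic follows C rules: 7/2 is 3, 7%2 is 1, 7<<2 is 28.
  printf("%d %d %d\n", a / b, a % b, a << b)

  s = "sys" . "tap"     # "." concatenates strings
  s .= "!"              # string append-assignment
  println(s)

  # The ternary operator selects one of two string values here.
  println(a > b ? "a is larger" : "a is not larger")

  # Comparison operators also apply to strings.
  if (s == "systap!") println("strings are equal")

  # Membership test with a tuple of keys.
  seen["x", 1] = 1
  if (["x", 1] in seen) println("found")

  exit()
}
```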
600
601
602   PROBES
603       The main construct in the scripting language identifies probes.  Probes
604       associate abstract events with a statement block ("probe handler") that
605       is  to  be executed when any of those events occur.  The general syntax
606       is as follows:
607              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }
608
609       Events are specified in a special syntax called "probe points".   There
610       are  several  varieties  of probe points defined by the translator, and
611       tapset scripts may define further ones using aliases.  These are listed
612       in the stapprobes(3stap) manual pages.
613
614       The probe handler is interpreted relative to the context of each event.
615       For events associated with kernel code, this context may include  vari‐
616       ables  defined  in  the  source code at that spot.  These "target vari‐
617       ables" are presented to the script as variables whose  names  are  pre‐
618       fixed  with  "$".   They  may be accessed only if the kernel's compiler
619       preserved them despite optimization.  This is the same constraint  that
620       a  debugger  user  faces  when working with optimized code.  Some other
621       events have very little context.  See the stapprobes(3stap)  man  pages
622       to  see  the kinds of context variables available at each kind of probe
623       point.
624
625       New probe points may be defined using "aliases".  Probe  point  aliases
626       look similar to probe definitions, but instead of activating a probe at
627       the given point, an alias just defines a new probe point name  for  an
628       existing one.  There are two types of alias: the  prologue  style  and
629       the epilogue style, which  are  identified  by  "="  and  "+=",  respec‐
630       tively.
631
632       For a prologue-style alias, the statement block that follows the  alias
633       definition is implicitly added as a prologue to any probe  that  refers
634       to the alias.  For an epilogue-style alias, the  statement  block  that
635       follows the alias definition is implicitly added  as  an  epilogue  to
636       any probe that refers to the alias.  For example:
637
638              probe syscall.read = kernel.function("sys_read") {
639                fildes = $fd
640                if (execname() == "init") next  # skip rest of probe
641              }
642       defines   a   new   probe   point   syscall.read,   which   expands  to
643       kernel.function("sys_read"), with the given statement  as  a  prologue,
644       which  is  useful to predefine some variables for the alias user and/or
645       to skip probe processing entirely based on some conditions.  And
646              probe syscall.read += kernel.function("sys_read") {
647                if (tracethis) println ($fd)
648              }
649       defines a new probe point with the  given  statement  as  an  epilogue,
650       which  is  useful to take actions based upon variables set or left over
651       by the alias user.  Please note that in each  case,  the  statements
652       in  the  alias  handler block are treated ordinarily, so that variables
653       assigned there constitute mere initialization, not  a  macro  substitu‐
654       tion.
655
656       An alias is used just like a built-in probe type.
657              probe syscall.read {
658                printf("reading fd=%d\n", fildes)
659                if (fildes > 10) tracethis = 1
660              }
661
662
663   FUNCTIONS
664       Systemtap  scripts  may  define  subroutines to factor out common work.
665       Functions take any number of scalar (integer or string) arguments,  and
666       must  return  a single scalar (integer or string).  An example function
667       declaration looks like this:
668              function thisfn (arg1, arg2) {
669                 return arg1 + arg2
670              }
671       Note the general absence of type declarations, which  are  instead  in‐
672       ferred  by  the translator.  However, if desired, a function definition
673       may include explicit type declarations for its return value and/or  its
674       arguments.   This  is  especially helpful for embedded-C functions.  In
675       the following example, the type inference engine need only  infer  the
676       type of arg2 (a string).
677              function thatfn:string (arg1:long, arg2) {
678                 return sprint(arg1) . arg2
679              }
680       Functions  may  call  others  or  themselves recursively, up to a fixed
681       nesting limit.  This limit is defined by a macro in  the  translated  C
682       code and is in the neighbourhood of 10.
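
       A minimal sketch of a recursive function follows.  The name fact
       is hypothetical; the recursion depth here (5) stays  well  within
       the nesting limit mentioned above.

```systemtap
# Hypothetical helper: factorial computed by recursion.
function fact (n) {
  if (n <= 1) return 1
  return n * fact(n - 1)
}

probe begin {
  printf("5! = %d\n", fact(5))   # prints: 5! = 120
  exit()
}
```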
683
684
685   PRINTING
686       There  is  a  set  of  function names that are specially treated by the
687       translator.  They format values for printing to the standard  systemtap
688       output  stream  in  a more convenient way.  The sprint* variants return
689       the formatted string instead of printing it.
690
691       print, sprint
692              Print one or more values of any type, concatenated directly  to‐
693              gether.
694
695       println, sprintln
696              Print values like print and sprint, but also append a newline.
697
698       printd, sprintd
699              Take  a string delimiter and two or more values of any type, and
700              print the values with the delimiter interposed.   The  delimiter
701              must be a literal string constant.
702
703       printdln, sprintdln
704              Print  values with a delimiter like printd and sprintd, but also
705              append a newline.
706
707       printf, sprintf
708              Take a formatting string and a number of values of corresponding
709              types,  and print them all.  The format must be a literal string
710              constant.
711
712       The printf formatting directives are similar to those of C, except that
713       they are fully type-checked by the translator:
714
715              %b     Writes a binary blob of the value given, instead of ASCII
716                     text.  The width specifier determines the number of bytes
717                     to  write;  valid specifiers are %b %1b %2b %4b %8b.  De‐
718                     fault (%b) is 8 bytes.
719
720              %c     Character.
721
722              %d,%i  Signed decimal.
723
724              %m     Safely reads kernel memory at the given address,  outputs
725                     its content.  The precision specifier determines the num‐
726                     ber of bytes to read.  Default is 1 byte.
727
728              %M     Same as %m, but outputs in hexadecimal.  The minimal size
729                     of output is double the precision specifier.
730
731              %o     Unsigned octal.
732
733              %p     Unsigned pointer address.
734
735              %s     String.
736
737              %u     Unsigned decimal.
738
739              %x     Unsigned hex value, in all lower-case.
740
741              %X     Unsigned hex value, in all upper-case.
742
743              %%     Writes a %.
744
745       Examples:
746                   a = "alice", b = "bob", p = 0x1234abcd, i = 123, j = -1, id[a] = 1234, id[b] = 4567
747                   print("hello")
748                        Prints: hello
749                   println(b)
750                        Prints: bob\n
751                   println(a . " is " . sprint(16))
752                        Prints: alice is 16\n
753                   foreach (name in id)  printdln("|", strlen(name), name, id[name])
754                        Prints: 5|alice|1234\n3|bob|4567\n
755                   printf("%c is %s; %x or %X or %p; %d or %u\n",97,a,p,p,p,j,j)
756                        Prints: a is alice; 1234abcd or 1234ABCD or 0x1234abcd; -1 or 18446744073709551615\n
757                   printf("2 bytes of kernel buffer at address %p: %2m", p, p)
758                        Prints: 2 bytes of kernel buffer at address 0x1234abcd: <binary data>
759                   printf("%4b", p)
760                        Prints (these values as binary data): 0x1234abcd
761
762
763   STATISTICS
764       It  is  often  desirable to collect statistics in a way that avoids the
765       penalties of repeatedly taking exclusive locks on the global variables those
766       numbers are being put into.  Systemtap provides a solution using a spe‐
767       cial operator to accumulate values, and several pseudo-functions to ex‐
768       tract the statistical aggregates.
769
770       The  aggregation operator is <<<, and resembles an assignment, or a C++
771       output-streaming operation.  The left operand specifies a scalar or ar‐
772       ray-index  lvalue, which must be declared global.  The right operand is
773       a numeric expression.  The meaning is intuitive: add the  given  number
774       to the pile of numbers to compute statistics of.  (The specific list of
775       statistics to gather is given separately, by the extraction functions.)
776                  foo <<< 1
777                  stats[pid()] <<< memsize
778
779       The extraction functions are also special.  For each  appearance  of  a
780       distinct  extraction  function  operating  on  a  given identifier, the
781       translator arranges to compute a set of  statistics  that  satisfy  it.
782       The statistics system is thereby "on-demand".  Each execution of an ex‐
783       traction function causes the aggregation to be computed for that moment
784       across all processors.
785
786       Here  is the set of extractor functions.  The first argument of each is
787       the same style of lvalue used on the left hand side of  the  accumulate
788       operation.  The @count(v), @sum(v), @min(v), @max(v), @avg(v) extractor
789       functions compute the number/total/minimum/maximum/average of all accu‐
790       mulated values.  The resulting values are all simple integers.
791
792       Histograms  are  also  available, but are more complicated because they
793       have a vector rather than scalar value.   @hist_linear(v,start,stop,in‐
794       terval)  represents a linear histogram from "start" to "stop" by incre‐
795       ments  of  "interval".   The  interval  must  be  positive.  Similarly,
796       @hist_log(v) represents a base-2 logarithmic histogram. Printing a his‐
797       togram with the print family of functions renders a histogram object as
798       a tabular "ASCII art" bar chart.
799              probe foo {
800                x <<< $value
801              }
802              probe end {
803                printf ("avg %d = sum %d / count %d\n",
804                        @avg(x), @sum(x), @count(x))
805                print (@hist_log(x))
806              }
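
       The fragment above can be expanded into a  self-contained  sketch
       along  these  lines.   The  variable name lat, the timer periods,
       and the use of the randint() tapset function as a stand-in source
       of samples are assumptions made for illustration.

```systemtap
global lat   # hypothetical name; aggregates must be declared global

probe timer.ms(100) {
  # Feed one sample into the aggregate: a pseudo-random number in 0..99.
  lat <<< randint(100)
}

probe timer.s(5) {
  exit()
}

probe end {
  printf("count %d, min %d, max %d, avg %d\n",
         @count(lat), @min(lat), @max(lat), @avg(lat))
  # Linear histogram from 0 to 100 in buckets of 10.
  print(@hist_linear(lat, 0, 100, 10))
}
```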
807
808
809   TYPECASTING
810       Once  a  pointer  has  been  saved  into a script integer variable, the
811       translator loses the type information necessary to access members  from
812       that  pointer.   Using the @cast() operator tells the translator how to
813       read a pointer.
814              @cast(p, "type_name"[, "module"])->member
815
816       This will interpret p as a pointer to a  struct/union  named  type_name
817       and  dereference  the member value.  Further ->subfield expressions may
818       be appended to dereference more levels.   NOTE: the same  dereferencing
819       operator  -> is used to refer to both direct containment or pointer in‐
820       direction.  Systemtap automatically  determines  which.   The  optional
821       module  tells  the  translator where to look for information about that
822       type.  Multiple modules may be specified as a list with  :  separators.
823       If  the  module  is  not specified, it will default either to the probe
824       module for dwarf probes, or to "kernel" for  functions  and  all  other
825       probe types.
826
827       The  translator  can create its own module with type information from a
828       header surrounded by angle brackets, in case normal  debuginfo  is  not
829       available.   For kernel headers, prefix it with "kernel" to use the ap‐
830       propriate build system.  All other headers are built with  default  GCC
831       parameters  into  a  user module.  Multiple headers may be specified in
832       sequence to resolve a codependency.
833              @cast(tv, "timeval", "<sys/time.h>")->tv_sec
834              @cast(task, "task_struct", "kernel<linux/sched.h>")->tgid
835              @cast(task, "task_struct",
836                    "kernel<linux/sched.h><linux/fs_struct.h>")->fs->umask
837       Values acquired by @cast may be pretty-printed by  the  $  and  $$
838       suffix  operators,  the  same way as described in the CONTEXT VARIABLES
839       section of the stapprobes(3stap) manual page.
840
841
842       When in guru mode, the translator will also allow scripts to assign new
843       values to members of typecasted pointers.
844
845       Typecasting  is also useful in the case of void* members whose type may
846       be determinable at runtime.
847              probe foo {
848                if ($var->type == 1) {
849                  value = @cast($var->data, "type1")->bar
850                } else {
851                  value = @cast($var->data, "type2")->baz
852                }
853                print(value)
854              }
855
856
857   EMBEDDED C
858       When in guru mode, the translator accepts embedded code in the  script.
859       Such  code  is  enclosed  between %{ and %} markers, and is transcribed
860       verbatim, without analysis, in some  sequence,  into  the  generated  C
861       code.   At  the outermost level, this may be useful to add #include in‐
862       structions, and any auxiliary definitions for  use  by  other  embedded
863       code.
864
865       Another  place  where embedded code is permitted is as a function body.
866       In this case, the script language body is replaced entirely by a  piece
867       of C code enclosed again between %{ and %} markers.  This C code may do
868       anything reasonable and safe.  There are a number of  undocumented  but
869       complex safety constraints on atomicity, concurrency, resource consump‐
870       tion, and run time limits, so this is an advanced technique.
871
872       The memory locations set aside for input and  output  values  are  made
873       available to it using a macro THIS.  Here are some examples:
874              function add_one (val) %{
875                THIS->__retvalue = THIS->val + 1;
876              %}
877              function add_one_str (val) %{
878                strlcpy (THIS->__retvalue, THIS->val, MAXSTRINGLEN);
879                strlcat (THIS->__retvalue, "one", MAXSTRINGLEN);
880              %}
881       The function argument and return value types have to be inferred by the
882       translator from the call sites in order for this  to  work.   The  user
883       should  examine C code generated for ordinary script-language functions
884       in order to write compatible embedded-C ones.
885
886       The last place where embedded code is permitted  is  as  an  expression
887       rvalue.  In this case, the C code enclosed between %{ and %} markers is
888       interpreted as an ordinary expression value.  It is  assumed  to  be  a
889       normal  64-bit signed number, unless the marker /* string */ is includ‐
890       ed, in which case it's treated as a string.
891              function add_one (val) {
892                return val + %{ 1 %}
893              }
894              function add_string_two (val) {
895                return val . %{ /* string */ "two" %}
896              }
897
898       The embedded-C code may contain  markers  to  assert  optimization  and
899       safety properties.
900
901       /* pure */
902              means  that the C code has no side effects and may be elided en‐
903              tirely if its value is not used by script code.
904
905       /* unprivileged */
906              means that the C code is so safe that  even  unprivileged  users
907              are permitted to use it.
908
909       /* myproc-unprivileged */
910              means  that  the  C code is so safe that even unprivileged users
911              are permitted to use it, provided that the target of the current
912              probe is within the user's own process.
913
914       /* guru */
915              means  that  the  C code is so unsafe that a systemtap user must
916              specify -g (guru mode) to use this.
917
918       /* string */
919              in embedded-C expressions only, means that  the  expression  has
920              const  char  * type and should be treated as a string value, in‐
921              stead of the default long numeric.
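
       As a sketch of how such a marker is placed,  the  following  hypo‐
       thetical  function  twice  is  tagged  /* pure */.   It follows the
       THIS-> convention shown earlier in this section and, because  it
       contains embedded C, is assumed to be run with -g.

```systemtap
# Run with guru mode, for example: stap -g twice.stp
# "twice" is a hypothetical name used only for this illustration.
function twice:long (val:long) %{ /* pure */
  /* No side effects: the call may be elided if its value is unused. */
  THIS->__retvalue = THIS->val * 2;
%}

probe begin {
  printf("%d\n", twice(21))
  exit()
}
```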
922
923
924   BUILT-INS
925       A set of builtin functions and probe point aliases are provided by  the
926       scripts installed in the directory specified in the stappaths(7)  manu‐
927       al page.  The functions are described in the stapfuncs(3stap) and stap‐
928       probes(3stap) manual pages.
929
930

PROCESSING

932       The translator begins pass 1 by parsing the given input script, and all
933       scripts (files named *.stp) found in a tapset directory.  The  directo‐
934       ries  listed with -I are processed in sequence, each processed in "guru
935       mode".  For  each  directory,  a  number  of  subdirectories  are  also
936       searched.   These  subdirectories  are derived from the selected kernel
937       version (the -R option), in order to allow more kernel-version-specific
938       scripts to override less specific ones.  For example, for a kernel ver‐
939       sion 2.6.12-23.FC3 the following patterns would  be  searched,  in  se‐
940       quence: 2.6.12-23.FC3/*.stp, 2.6.12/*.stp, 2.6/*.stp, and finally *.stp.
941       Stopping the translator after pass 1  causes  it  to  print  the  parse
942       trees.
943
944
945       In  pass 2, the translator analyzes the input script to resolve symbols
946       and types.  References to variables, functions, and probe aliases  that
947       are unresolved internally are satisfied by searching through the parsed
948       tapset scripts.  If any tapset script is selected because it defines an
949       unresolved  symbol,  then  the  entirety of that script is added to the
950       translator's resolution queue.  This process iterates until all symbols
951       are resolved and a subset of tapset scripts is selected.
952
953       Next, all probe point descriptions are validated against the wide vari‐
954       ety supported by the translator.  Probe points that refer to code loca‐
955       tions  ("synchronous  probe points") require the appropriate kernel de‐
956       bugging information to be installed.  In the associated probe handlers,
957       target-side  variables  (whose names begin with "$") are found and have
958       their run-time locations decoded.
959
960       Next, all probes and functions are analyzed for optimization opportuni‐
961       ties,  in  order  to  remove variables, expressions, and functions that
962       have no useful value and no side-effect.  Embedded-C functions are  as‐
963       sumed  to  have  side-effects  unless  they  include  the  magic string
964       /* pure */.  Since this optimization can hide latent code  errors  such
965       as  type  mismatches  or invalid $target variables, it sometimes may be
966       useful to disable the optimizations with the -u option.
967
968       Finally, all variable, function, parameter, array, and index types  are
969       inferred  from context (literals and operators).  Stopping the transla‐
970       tor after pass 2 causes it to list all the probes, functions, and vari‐
971       ables,  along  with all inferred types.  Any inconsistent or unresolved
972       types cause an error.
973
974
975       In pass 3, the translator writes C code that represents the actions  of
976       all  selected script files, and creates a Makefile to build that into a
977       kernel object.  These files are  placed  into  a  temporary  directory.
978       Stopping  the  translator at this point causes it to print the contents
979       of the C file.
980
981
982       In pass 4, the translator invokes the Linux kernel build system to cre‐
983       ate  the  actual kernel object file.  This involves running make in the
984       temporary directory, and requires a kernel module build  system  (head‐
985       ers,  config and Makefiles) to be installed in the usual spot /lib/mod‐
986       ules/VERSION/build.  Stopping the translator after pass 4 is  the  last
987       chance  before  running  the  kernel object.  This may be useful if you
988       want to archive the file.
989
990
991       In pass 5, the  translator  invokes  the  systemtap  auxiliary  program
992       staprun for the given kernel object.  This  program  arranges  to
993       load the module then communicates with it, copying trace data from  the
994       kernel  into temporary files, until the user sends an interrupt signal.
995       Any run-time error encountered by the probe handlers, such  as  running
996       out  of  memory, division by zero, exceeding nesting or runtime limits,
997       results in a soft error indication.  Soft errors in excess of MAXERRORS
998       block  all  subsequent probes (except error-handling probes), and ter‐
999       minate the session.  Finally, staprun unloads the  module,  and  cleans
1000       up.
1001
1002
1003   ABNORMAL TERMINATION
1004       One  should  avoid  killing the stap process forcibly, for example with
1005       SIGKILL, because the stapio  process  (a  child  process  of  the  stap
1006       process)  and  the loaded module may be left running on the system.  If
1007       this happens, send SIGTERM or SIGINT to any remaining stapio processes,
1008       then use rmmod to unload the systemtap module.
1009
1010
1011

EXAMPLES

1013       See the stapex(3stap) manual page for a collection of samples.
1014
1015

CACHING

1017       The  systemtap  translator  caches  the  pass 3 output (the generated C
1018       code) and the pass 4 output (the compiled kernel module) if pass 4 com‐
1019       pletes  successfully.   This cached output is reused if the same script
1020       is translated again assuming the same  conditions  exist  (same  kernel
1021       version, same systemtap version, etc.).  Cached files are stored in the
1022       $SYSTEMTAP_DIR/cache directory. The cache can be limited by having  the
1023       file  cache_mb_limit  placed  in the cache directory (shown above) con‐
1024       taining only an ASCII integer  representing  how  many  MiB  the  cache
1025       should  not  exceed. Note that this is a 'soft' limit in that the cache
1026       will be cleaned after a new entry is added, so the total cache size may
1027       temporarily  exceed  this limit. In the absence of this file, a default
1028       will be created with the limit set to 64MiB.
1029
1030

SAFETY AND SECURITY

1032       Systemtap is an administrative tool.  It exposes kernel  internal  data
1033       structures and potentially private user information.
1034
1035       To actually run the kernel objects it builds, a user must be one of the
1036       following:
1037
1038       ·   the root user;
1039
1040       ·   a member of the stapdev and stapusr groups; or
1041
1042       ·   a member of the stapusr group.
1043
1044       The root user or a user who is a member of both the stapdev and stapusr
1045       groups  can build and run any systemtap script.  Members of the stapusr
1046       group can only use pre-built modules under the following conditions:
1047
1048       ·   The module is located in the /lib/modules/VERSION/systemtap  direc‐
1049           tory.   This  directory  must  be  owned  by  root and not be world
1050           writable.
1051
1052       ·   The module has been signed by a trusted signer. Trusted signers are
1053           normally  systemtap  compile-servers  which  sign  modules when the
1054           --unprivileged  option  is  specified  by  the  client.   See   the
1055           stap-server(8) manual page for more information.
1056
1057       The  kernel  modules  generated  by stap program are run by the staprun
1058       program.  The latter is a part of the Systemtap package,  dedicated  to
1059       module  loading and unloading (but only in the white zone), and kernel-
1060       to-user data transfer.  Since staprun does not perform  any  additional
1061       security  checks  on the kernel objects it is given, it would be unwise
1062       for a system administrator to add untrusted users  to  the  stapdev  or
1063       stapusr groups.
1064
1065       The  translator  asserts certain safety constraints.  It aims to ensure
1066       that no handler routine can run for very long, allocate memory, perform
1067       unsafe  operations,  or  unintentionally  interfere  with  the  kernel.
1068       Uses of script global variables are automatically read/write locked  as
1069       appropriate,  to  protect against manipulation by concurrent probe han‐
1070       dlers.  (Deadlocks are detected with timeouts.  Use the -t flag to  re‐
1071       ceive  reports  of  excessive  lock contention.)  Use of guru mode con‐
1072       structs such as embedded C can violate these  constraints,  leading  to
1073       kernel crash or data corruption.
1074
1075       The  resource  use  limits  are  set by macros in the generated C code.
1076       These may be overridden with the -D flag.  A selection of these  is  as
1077       follows:
1078
1079       MAXNESTING
1080              Maximum  number of nested function calls.  Default determined by
1081              script analysis, with a  bonus  10  slots  added  for  recursive
1082              scripts.
1083
1084       MAXSTRINGLEN
1085              Maximum length of strings, default 128.
1086
1087       MAXTRYLOCK
1088              Maximum  number  of iterations to wait for locks on global vari‐
1089              ables before declaring possible deadlock and skipping the probe,
1090              default 1000.
1091
1092       MAXACTION
1093              Maximum  number of statements to execute during any single probe
1094              hit (with interrupts disabled), default 1000.
1095
1096       MAXACTION_INTERRUPTIBLE
1097              Maximum number of statements to execute during any single  probe
1098              hit which is executed with interrupts enabled (such as begin/end
1099              probes), default (MAXACTION * 10).
1100
1101       MAXMAPENTRIES
1102              Maximum number of rows in any  single  global  array,  de‐
1103              fault  2048.  Individual arrays may be declared with a larger or
1104              smaller limit instead:
1105              global big[10000],little[5]
1106
1107       MAXERRORS
1108              Maximum number of soft errors before an exit is  triggered,  de‐
1109              fault 0, which means that the first error will exit the script.
1110
1111       MAXSKIPPED
1112              Maximum  number  of  skipped probes before an exit is triggered,
1113              default 100.  Running systemtap with -t (timing) mode gives more
1114              details  about  skipped  probes.   With the default -DINTERRUPT‐
1115              IBLE=1 setting, probes skipped due to reentrancy are not accumu‐
1116              lated against this limit.
1117
1118       MINSTACKSPACE
1119              Minimum  number  of free kernel stack bytes required in order to
1120              run a probe handler, default 1024.  This number should be  large
1121              enough for the probe handler's own needs, plus a safety margin.
1122
1123       MAXUPROBES
1124              Maximum  number  of  concurrently  armed  user-space probes (up‐
1125              robes), default somewhat larger than the  number  of  user-space
1126              probe  points named in the script.  This pool needs to be poten‐
1127              tially large because individual uprobe objects (about  64  bytes
1128              each)  are  allocated for each process for each matching script-
1129              level probe.
1130
1131       STP_MAXMEMORY
1132              Maximum amount of memory (in kilobytes) that the systemtap  mod‐
1133              ule should use, default unlimited.  The memory size includes the
1134              size of the module  itself,  plus  any  additional  allocations.
1135              This  only  tracks  direct allocations by the systemtap runtime.
1136              This does not track indirect allocations (as done by kprobes/up‐
1137              robes/etc. internals).
1138
1139       TASK_FINDER_VMA_ENTRY_ITEMS
1140              Maximum  number  of  VMA  pages that will be tracked at runtime.
1141              This might get  exhausted  for  system  wide  probes  inspecting
1142              shared  library  variables  and/or  user backtraces. Defaults to
1143              1536.
1144
1145       STP_PROCFS_BUFSIZE
1146              Size of procfs probe  read  buffers  (in  bytes).   Defaults  to
1147              MAXSTRINGLEN.  This value can be overridden on a per-procfs file
1148              basis using the procfs read probe .maxsize(MAXSIZE) parameter.
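
       To illustrate how these limits are overridden, here is a sketch of
       a  script that deliberately raises soft errors through the error()
       tapset function.  The file name limits.stp, the variable ticks and
       the  chosen  -D  values  are  arbitrary assumptions.  With a higher
       MAXERRORS the session should tolerate the first few errors  before
       exiting.

```systemtap
# Suggested invocation (values chosen arbitrarily):
#   stap -DMAXERRORS=2 -DMAXSTRINGLEN=256 limits.stp
global ticks

probe timer.s(1) {
  ticks++
  # error() reports a soft error and aborts this handler.
  error(sprintf("simulated failure %d", ticks))
}

# Error-handling probes still run once the error limit is exceeded.
probe error {
  printf("timer fired %d times before the session ended\n", ticks)
}
```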
1149
1150       With scripts that contain probes on any interrupt path, it is  possible
1151       that those interrupts may occur in the middle of another probe handler.
1152       The probe in the interrupt handler would be skipped  in  this  case  to
1153       avoid reentrance.  To work around this issue, execute stap with the op‐
1154       tion -DINTERRUPTIBLE=0 to mask interrupts throughout the probe handler.
1155       This  does  add  some  extra overhead to the probes, but it may prevent
1156       reentrance for common problem cases.  However, probes in  NMI  handlers
1157       and  in  the  callpath  of the stap runtime may still be skipped due to
1158       reentrance.
1159
1160
1161       Multiple scripts can write data into a  relay  buffer  concurrently.  A
1162       host  script  provides  an  interface for accessing its relay buffer to
1163       guest scripts.  Then, the output of the guests is merged into the  out‐
1164       put of the host.  To run a script as a host, execute stap with -DRELAY‐
1165       HOST[=name] option. The name identifies your host script among  several
1166       hosts.   While  running the host, execute stap with -DRELAYGUEST[=name]
1167       to add a guest script to the host.  Note that you  must  unload  guests
1168       before  unloading  a  host.  If  there are some guests connected to the
1169       host, unloading the host will fail.
1170
1171
1172       In case something goes wrong with stap or staprun after a probe has al‐
1173       ready started running, one may safely kill both user processes, and re‐
1174       move the active probe kernel module with rmmod.  Any pending trace mes‐
1175       sages may be lost.
1176
1177
1178       In  addition to the methods outlined above, the generated kernel module
1179       also uses overload processing to make sure that probes  can't  run  for
1180       too   long.    If  more  than  STP_OVERLOAD_THRESHOLD  cycles  (default
1181       500000000) have been spent in all the probes on a single cpu during the
1182       last STP_OVERLOAD_INTERVAL cycles (default 1000000000), the probes have
1183       overloaded the system and an exit is triggered.
1184
1185       By default, overload processing is turned on for all modules.   If  you
1186       would  like  to disable overload processing, define STP_NO_OVERLOAD (or
1187       its alias STAP_NO_OVERLOAD).
1188
1189

UNPRIVILEGED USERS

1191       Systemtap exposes kernel internal data structures and potentially  pri‐
1192       vate  user  information. Because of this, use of systemtap's full capa‐
1193       bilities is restricted to root and to users who  are  members  of  the
1194       groups stapdev and stapusr.
1195
1196       However, a restricted set of systemtap's features can be made available
1197       to trusted, unprivileged users. These users are members  of  the  group
1198       stapusr  only.  These  users can load systemtap modules which have been
1199       compiled and certified by a trusted systemtap compile-server.  See  the
1200       descriptions  of  the  options  --unprivileged  and  --use-server.  See
1201       README.unprivileged in the systemtap source code for information  about
1202       setting up a trusted compile server.
1203
1204       The restrictions enforced when --unprivileged is specified are designed
1205       to prevent unprivileged users from:
1206
1207              ·   harming the system maliciously.
1208
1209              ·   gaining access to information which would  not  normally  be
1210                  available to an unprivileged user.
1211
1212              ·   disrupting the performance of processes owned by other users
1213                  of the system.  Some overhead to the system  in  general  is
1214                  unavoidable  since  the  unprivileged  user's probes will be
1215                  triggered at the appropriate times. What we  would  like  to
1216                  avoid  is  targeted interruption of another user's processes
1217                  which would not normally be possible by an unprivileged  us‐
1218                  er.
1219
1220
1221   PROBE RESTRICTIONS
1222       An unprivileged user may only use the following probes:
1223
1224
1225              ·   begin, begin(n)
1226
1227              ·   end, end(n)
1228
1229              ·   error(n)
1230
1231              ·   never
1232
1233              ·   process.*, where the target process is owned by the user.
1234
1235              ·   timer.{jiffies,s,sec,ms,msec,us,usec,ns,nsec}(n)*
1236
1237              ·   timer.hz(n)
1238
1239
1240   SCRIPTING LANGUAGE RESTRICTIONS
1241       The  following  scripting language features are unavailable to unprivi‐
1242       leged users:
1243
1244
1245              ·   any feature enabled by the Guru Mode (-g) option.
1246
1247              ·   embedded C code.
1248
1249
1250   RUNTIME RESTRICTIONS
1251       The following runtime restrictions are placed upon unprivileged users:
1252
1253
1254              ·   Only the default runtime code (see -R) may be used.
1255
1256              ·   Probing of processes owned by other users is not permitted.
1257
              ·   Access to kernel memory (read or write) is not permitted.
1259
1260
1261   COMMAND LINE OPTION RESTRICTIONS
1262       Some command line options provide access to features which must not  be
1263       available to unprivileged users:
1264
1265
1266              ·   -g may not be specified.
1267
1268              ·   The  following options may not be used by the compile-server
1269                  client:
1270                      -a, -B, -D, -I, -r, -R
1271
1272
1273   ENVIRONMENT RESTRICTIONS
1274       The following environment variables must not be set:
1275
1276              SYSTEMTAP_RUNTIME
1277              SYSTEMTAP_TAPSET
1278              SYSTEMTAP_DEBUGINFO_PATH
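
       A wrapper script could check the variables listed above before
       invoking stap.  The following is a minimal sketch; the function name
       is hypothetical and not part of systemtap.

```shell
# Return 0 if none of the restricted variables is set, 1 otherwise.
stap_env_is_clean() {
    for name in SYSTEMTAP_RUNTIME SYSTEMTAP_TAPSET SYSTEMTAP_DEBUGINFO_PATH; do
        # ${var+x} expands to "x" only when var is set (even if empty).
        if eval "[ -n \"\${$name+x}\" ]"; then
            echo "unset $name before running stap --unprivileged" >&2
            return 1
        fi
    done
    return 0
}
```

       The check treats a variable that is set but empty as set, which is
       the stricter reading of "must not be set".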
1279
1280
1281   TAPSET RESTRICTIONS
1282       The following built-in tapset functions are  unconditionally  available
1283       to unprivileged users:
1284
1285              _ehostunreach:long ()
1286              _enetunreach:long ()
1287              _icmp_dest_unreach:long ()
1288              _icmp_exc_fragtime:long ()
1289              _icmp_prot_unreach:long ()
1290              _icmp_time_exceeded:long ()
1291              _MM_ANONPAGES:long()
1292              _MM_FILEPAGES:long()
1293              _net_rx_drop:long ()
1294              _rtn_broadcast:long ()
1295              _rtn_multicast:long ()
1296              _rtn_unspec:long ()
1297              _sys_pipe2_flag_str:string (f:long)
1298              AF_INET:long()
1299              cpu:long ()
1300              cputime_to_msecs:long (cputime:long)
1301              egid:long ()
1302              error (msg:string)
1303              euid:long ()
1304              execname:string ()
1305              exit ()
1306              get_cycles:long ()
1307              gettimeofday_ns:long ()
1308              GFP_KERNEL:long()
1309              gid:long ()
1310              HZ:long ()
1311              is_myproc:long ()
1312              isdigit:long(str:string)
1313              isinstr:long(s1:string,s2:string)
1314              jiffies:long ()
1315              log (msg:string)
1316              mem_page_size:long ()
1317              module_name:string ()
1318              pexecname:string ()
1319              pgrp:long ()
1320              pid:long ()
1321              pn:string ()
1322              pp:string ()
1323              ppid:long ()
1324              randint:long(n:long)
1325              registers_valid:long ()
1326              sid:long ()
1327              str_replace:string (prnt_str:string, srch_str:string, rplc_str:string)
1328              stringat:long(str:string, pos:long)
1329              strlen:long(s:string)
1330              strtol:long(str:string, base:long)
1331              substr:string(str:string,start:long, length:long)
1332              target:long ()
1333              task_utime:long ()
1334              task_stime:long ()
1335              text_str:string(input:string)
1336              text_strn:string(input:string, len:long, quoted:long)
1337              tid:long ()
1338              tokenize:string(input:string, delim:string)
              tz_gmtoff ()
              tz_name ()
1341              uid:long ()
1342              user_mode:long ()
1343              warn (msg:string)
1344
1345       The  following  built-in tapset functions are available to unprivileged
1346       users within their own processes. Scripts written by unprivileged users
1347       must  test  the  result  of the tapset function is_myproc and only call
1348       these functions if the result is 1. The script will exit immediately if
1349       any of these functions is called by an unprivileged user within a probe
1350       within a process which is not owned by that user.
1351
1352              _utrace_syscall_nr:long ()
1353              _utrace_syscall_arg:long (n:long)
1354              _utrace_syscall_return:long ()
1355              print_ubacktrace ()
1356              print_ubacktrace_brief ()
1357              print_ustack(stk:string)
1358              sprint_ubacktrace:string ()
1359              uaddr:long ()
1360              ubacktrace:string ()
1361              umodname:string (addr:long)
1362              user_char:long (addr:long)
1363              user_char_warn:long (addr:long)
1364              user_int:long (addr:long)
1365              user_int_warn:long (addr:long)
1366              user_int16:long (addr:long)
1367              user_int32:long (addr:long)
1368              user_int64:long (addr:long)
1369              user_int8:long (addr:long)
1370              user_long:long (addr:long)
1371              user_long_warn:long (addr:long)
1372              user_short:long (addr:long)
1373              user_short_warn:long (addr:long)
1374              user_string_quoted:string (addr:long)
1375              user_string_n_quoted:string (addr:long, n:long)
1376              user_string_n_warn:string (addr:long, n:long)
1377              user_string_n2:string (addr:long, n:long, err_msg:string)
1378              user_string_warn:string (addr:long)
1379              user_string2:string (addr:long, err_msg:string)
1380              user_uint16:long (addr:long)
1381              user_uint32:long (addr:long)
1382              user_uint8:long (addr:long)
1383              user_ushort:long (addr:long)
1384              user_ushort_warn:long (addr:long)
1385              usymdata:string (addr: long)
1386              usymname:string (addr: long)
1387
1388       No other built-in tapset functions may be used by unprivileged users.
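
       The is_myproc guard described above might look as follows in a script.
       This is a sketch only: the shell function name is hypothetical, and
       the choice of the process.syscall probe point and of uaddr() as the
       guarded function is an assumption made for illustration.  The function
       prints the script and does not run stap.

```shell
# Print a small systemtap script on stdout; nothing is executed here.
emit_guarded_script() {
    cat <<'EOF'
probe process.syscall {
  # Only touch user-space context when the probed process is our own.
  if (is_myproc())
    printf("%s[%d] at %p\n", execname(), pid(), uaddr())
}
EOF
}
```

       The output could then be passed to stap on standard input, for
       example with "emit_guarded_script | stap --unprivileged -".  Depending
       on the installation, --use-server may also be required.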
1389
1390
1391

EXIT STATUS

       The systemtap translator generally returns with a success code of 0 if
       the requested script was processed and executed successfully through
       the requested pass.  Otherwise, errors may be printed to stderr and a
       failure code is returned.  Use -v or --vp N to increase (global or
       per-pass) verbosity to identify the source of the trouble.
1398
       In listing mode (-l and -L), error messages are normally suppressed.
       A success code of 0 is returned if at least one matching probe was
       found.
1402
1403       A script executing in pass 5 that is interrupted with ^C  /  SIGINT  is
1404       considered to be successful.
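
       The exit status conventions above lend themselves to simple shell
       checks.  The sketch below is illustrative: the STAP variable is a
       hypothetical override used only by this example (stap itself does not
       read it), and both function names are made up.

```shell
# Command used to invoke systemtap; hypothetical override for this sketch.
STAP=${STAP:-stap}

# Succeeds (status 0) when at least one probe matches the given pattern.
probe_exists() {
    "$STAP" -l "$1" >/dev/null 2>&1
}

# Run stap with the given arguments and report based on the exit status.
run_script() {
    if "$STAP" "$@"; then
        echo "stap succeeded"
    else
        status=$?
        echo "stap failed with status $status; retry with -v for details" >&2
        return "$status"
    fi
}
```

       Because a pass-5 run interrupted with ^C counts as successful,
       run_script reports success in that case as well.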
1405
1406

DEPRECATION

       Over time, some features of the script language and the tapset library
       may undergo incompatible changes, so that a script written against an
       old version of systemtap may no longer run.  In these cases, it may
       help to run systemtap with the --compatible VERSION flag, specifying
       the last known working version of systemtap.  Running systemtap with
       the --check-version flag will output a warning if any possibly
       incompatible elements have been parsed.  Below is a table of recently
       deprecated tapset functions and syntax elements that require the given
       --compatible flag to use:
1417
1418       --compatible=1.2
1419              (none yet)
1420
1421       --compatible=1.3
1422              The  tapset  alias  'syscall.compat_pselect7a' was misnamed.  It
1423              should have been 'syscall.compat_pselect7' (without the trailing
1424              'a').  Starting in release 1.4, the old name will be deprecated.
1425
1426       --compatible=1.4
1427              In the 'syscall.add_key' probe, the 'description_auddr' variable
1428              has been deprecated in  favor  of  the  new  'description_uaddr'
1429              variable.
1430
              In the 'syscall.fgetxattr', 'syscall.fsetxattr',
              'syscall.getxattr', 'syscall.lgetxattr', 'syscall.lremovexattr',
              'nd_syscall.fgetxattr', 'nd_syscall.fremovexattr',
              'nd_syscall.fsetxattr', 'nd_syscall.getxattr', and
              'nd_syscall.lremovexattr' probes, the 'name2' variable has been
              deprecated in favor of the new 'name_str' variable.
1437
              In the 'nd_syscall.accept' probe, the 'flag_str' variable has
              been deprecated in favor of the new 'flags_str' variable.

              In the 'nd_syscall.dup' probe, the 'old_fd' variable has been
              deprecated in favor of the new 'oldfd' variable.
1443
1444              The tapset alias 'nd_syscall.compat_pselect7a' was misnamed.  It
1445              should   have  been  'nd_syscall.compat_pselect7'  (without  the
1446              trailing 'a').
1447
1448              The tapset function 'cpuid' is deprecated in favor of the better
1449              known 'cpu'.
1450
1451              In the i386 'syscall.sigaltstack' probe, the 'ussp' variable has
1452              been deprecated in favor of the new 'uss_uaddr' variable.
1453
              In the ia64 'syscall.sigaltstack' probe, the 'ss_uaddr' and
              'oss_uaddr' variables have been deprecated in favor of the new
              'uss_uaddr' and 'uoss_uaddr' variables.
1457
1458              The powerpc tapset alias 'syscall.compat_sysctl' was  deprecated
1459              and renamed 'syscall.sysctl32'.
1460
1461              In  the  x86_64  'syscall.sigaltstack'  probe,  the 'regs_uaddr'
1462              variable has been deprecated in favor of the  new  'regs'  vari‐
1463              able.
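
       One way to apply the --compatible flag from a wrapper is to retry a
       failing script once with the last version known to work.  This is a
       sketch under assumptions: the STAP variable and the function name are
       hypothetical, and a failure for any other reason will also trigger
       the retry.

```shell
STAP=${STAP:-stap}   # hypothetical override for this sketch

# Try the script as-is; on failure, retry once with --compatible set to the
# last version known to work (first argument).
run_with_fallback() {
    last_good=$1
    shift
    "$STAP" "$@" && return 0
    echo "retrying with --compatible=$last_good" >&2
    "$STAP" --compatible="$last_good" "$@"
}
```

       For example, "run_with_fallback 1.3 old_script.stp" would first run
       the script normally and then, if that fails, with --compatible=1.3.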
1464
1465
1466

FILES

       Important files and their corresponding paths are listed in the
       stappaths(7) manual page.
1470
1471

SEE ALSO

1473       stapprobes(3stap),  stapfuncs(3stap),  stappaths(7),  staprun(8), stap‐
1474       vars(3stap), stapex(3stap), stap-server(8), awk(1), gdb(1)
1475
1476

BUGS

       Report bugs using the Bugzilla link on the project web page or via the
       mailing list: http://sourceware.org/systemtap/, <systemtap@sourceware.org>.
1480
1481
1482
1483                                                                       STAP(1)