BPFTRACE(8)                 System Manager's Manual                BPFTRACE(8)

NAME
       BPFtrace - the eBPF tracing language & frontend

SYNOPSIS
       bpftrace [OPTIONS] FILE
       bpftrace [OPTIONS] -e 'program code'

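       As a minimal sketch of the FILE form, a program can be saved to a
       script and passed by name. The file name syscount.bt is hypothetical;
       the program body is the syscall-counting example used elsewhere on
       this page:

           /* syscount.bt (hypothetical name): count syscalls by process */
           tracepoint:raw_syscalls:sys_enter
           {
                   @[comm] = count();
           }

       It would then be run as root with "bpftrace syscount.bt". The counts
       are printed when bpftrace exits, for example on Ctrl-C.
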
DESCRIPTION
       BPFtrace is a high-level tracing language for Linux enhanced Berkeley
       Packet Filter (eBPF) available in recent Linux kernels (4.x).

       BPFtrace uses:

       ·   LLVM as a backend to compile scripts to BPF-bytecode

       ·   BCC for interacting with the Linux BPF system

       As well as the existing Linux tracing capabilities:

       ┌─────────┬─────────────┬──────────────┐
       │         │ kernel      │ userland     │
       ├─────────┼─────────────┼──────────────┤
       │ static  │ tracepoints │ USDT* probes │
       ├─────────┼─────────────┼──────────────┤
       │ dynamic │ kprobes     │ uprobes      │
       └─────────┴─────────────┴──────────────┘
       *USDT = user-level statically defined tracing

       The BPFtrace language is inspired by awk and C, and predecessor tracers
       such as DTrace and SystemTap.

       See EXAMPLES and ONELINERS if you are impatient.
       See PROBE TYPES and BUILTINS (variables/functions) for the bpftrace
       language elements.

OPTIONS
       -l [searchterm]
              List probes.

       -e 'PROGRAM'
              Execute PROGRAM.

       -p PID Enable USDT probes on PID. Will terminate bpftrace on PID
              termination. Note this is not a global PID filter on probes.

       -c CMD Helper to run CMD. Equivalent to manually running CMD and then
              passing its PID to -p. This is useful to ensure you've traced
              at least the duration of CMD's execution.

       -v     Verbose messages.

       -d     Debug info on dry run.

       -dd    Verbose debug info on dry run.

EXAMPLES
       bpftrace -l '*sleep*'
              List probes containing "sleep".

       bpftrace -e 'kprobe:do_nanosleep { printf("PID %d sleeping\n", pid); }'
              Trace processes calling sleep.

       bpftrace -c 'sleep 5' -e 'kprobe:do_nanosleep { printf("PID %d sleeping\n", pid); }'
              Run "sleep 5" in a new process and then trace processes calling
              sleep.

       bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm]=count(); }'
              Count syscalls by process name.

ONELINERS
       For brevity, just the actual BPF code is shown below.
       Usage: bpftrace -e 'bpf-code'

       New processes with arguments:
              tracepoint:syscalls:sys_enter_execve { join(args->argv); }

       Files opened by process:
              tracepoint:syscalls:sys_enter_open { printf("%s %s\n", comm,
                  str(args->filename)); }

       Syscall count by program:
              tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }

       Syscall count by syscall:
              tracepoint:syscalls:sys_enter_* { @[probe] = count(); }

       Syscall count by process:
              tracepoint:raw_syscalls:sys_enter { @[pid, comm] = count(); }

       Read bytes by process:
              tracepoint:syscalls:sys_exit_read /args->ret/ { @[comm] =
                  sum(args->ret); }

       Read size distribution by process:
              tracepoint:syscalls:sys_exit_read { @[comm] = hist(args->ret); }

       Disk size by process:
              tracepoint:block:block_rq_issue { printf("%d %s %d\n", pid,
                  comm, args->bytes); }

       Pages paged in by process:
              software:major-faults:1 { @[comm] = count(); }

       Page faults by process:
              software:faults:1 { @[comm] = count(); }

       Profile user-level stacks at 99 Hertz, for PID 189:
              profile:hz:99 /pid == 189/ { @[ustack] = count(); }

PROBE TYPES
   KPROBES
       Attach a BPFtrace script to a kernel function, to be executed when that
       function is called:

           kprobe:vfs_read { ... }

   UPROBES
       Attach script to a userland function:

           uprobe:/bin/bash:readline { ... }

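       A function's return can be traced in a similar way. The sketch below
       assumes the uretprobe probe type, which is not described on this
       page, and that /bin/bash exports readline on the target system:

           /* Print each line returned by bash's readline() */
           uretprobe:/bin/bash:readline { printf("%s\n", str(retval)); }
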
   TRACEPOINTS
       Attach script to a statically defined tracepoint in the kernel:

           tracepoint:sched:sched_switch { ... }

       Tracepoints are intended to remain stable between kernel versions,
       unlike kprobes.

   SOFTWARE
       Attach script to kernel software events, executing once per the
       provided count, or per a default count if none is given:

           software:faults:100
           software:faults:

   HARDWARE
       Attach script to hardware events (PMCs), executing once per the
       provided count, or per a default count if none is given:

           hardware:cache-references:1000000
           hardware:cache-references:

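       A possible use, assuming the cache-misses event is available on the
       running hardware (it may not be inside virtual machines): sample one
       in every million cache misses and count the samples by process.

           hardware:cache-misses:1000000 { @[comm, pid] = count(); }
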
   PROFILE
       Run the script on all CPUs at specified time intervals:

           profile:hz:99 { ... }

           profile:s:1 { ... }

           profile:ms:20 { ... }

           profile:us:1500 { ... }

   INTERVAL
       Run the script once per interval, for printing interval output:

           interval:s:1 { ... }

           interval:ms:20 { ... }

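       For example, an interval probe can be paired with another probe to
       print a per-second summary. A sketch, using the print() and clear()
       functions listed under BUILTINS:

           /* Count syscalls by process name */
           tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }

           /* Once per second: print the counts, then reset them */
           interval:s:1 { print(@); clear(@); }
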
   MULTIPLE ATTACHMENT POINTS
       A single probe can be attached to multiple events:

           kprobe:vfs_read,kprobe:vfs_write { ... }

   WILDCARDS
       Some probe types allow wildcards to be used when attaching a probe:

           kprobe:vfs_* { ... }

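       With several functions attached to one action, the func variable
       (see BUILTINS) tells them apart. A sketch that counts calls to every
       kernel function whose name starts with vfs_:

           kprobe:vfs_* { @[func] = count(); }

       The set of matching functions depends on the running kernel.
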
   PREDICATES
       Define conditions for which a probe should be executed:

           kprobe:sys_open / uid == 0 / { ... }

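       Predicates are typically built from the variables listed under
       BUILTINS. As a sketch, the following runs the action only for reads
       issued by processes named "bash" (the process name is an arbitrary
       choice):

           kprobe:vfs_read /comm == "bash"/ { printf("read by %s\n", comm); }
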
BUILTINS
       The following variables and functions are available for use in bpftrace
       scripts:

   VARIABLES
       pid    Process ID (kernel tgid)

       tid    Thread ID (kernel pid)

       cgroup Cgroup ID of the current process

       uid    User ID

       gid    Group ID

       nsecs  Nanosecond timestamp

       cpu    Processor ID

       comm   Process name

       kstack Kernel stack trace

       ustack User stack trace

       arg0, arg1, ... etc.
              Arguments to the function being traced

       retval Return value from function being traced

       func   Name of the function currently being traced

       probe  Full name of the probe

       curtask
              Current task_struct as a u64.

       rand   Random number of type u32.

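       Several of these variables are often combined. A sketch that times
       vfs_read() per thread with nsecs and tid, assuming the kretprobe probe
       type (not described on this page) and the hist() and delete()
       functions listed below; the map names @start and @ns are arbitrary:

           /* On entry, save a timestamp keyed by thread ID */
           kprobe:vfs_read { @start[tid] = nsecs; }

           /* On return, record the elapsed time and drop the saved entry */
           kretprobe:vfs_read /@start[tid]/
           {
                   @ns[comm] = hist(nsecs - @start[tid]);
                   delete(@start[tid]);
           }
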
   FUNCTIONS
       hist(int n)
              Produce a log2 histogram of values of n

       lhist(int n, int min, int max, int step)
              Produce a linear histogram of values of n

       count()
              Count the number of times this function is called

       sum(int n)
              Sum this value

       min(int n)
              Record the minimum value seen

       max(int n)
              Record the maximum value seen

       avg(int n)
              Average this value

       stats(int n)
              Return the count, average, and total for this value

       delete(@x)
              Delete the map element passed in as an argument

       str(char *s)
              Returns the string pointed to by s

       printf(char *fmt, ...)
              Print formatted to stdout

       print(@x[, int top [, int div]])
              Print a map, with optional top entry count and divisor

       clear(@x)
              Delete all key/values from a map

       sym(void *p)
              Resolve kernel address to a symbol name

       usym(void *p)
              Resolve user space address to a symbol name

       kaddr(char *name)
              Resolve kernel symbol name to an address

       uaddr(char *name)
              Resolve user space symbol name to an address

       reg(char *name)
              Returns the value stored in the named register

       join(char *arr[])
              Prints the string array

       time(char *fmt)
              Print the current time

       system(char *fmt)
              Execute shell command

       exit() Quit bpftrace

       kstack([StackMode mode, ][int level])
              Kernel stack trace

       ustack([StackMode mode, ][int level])
              User stack trace

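       As an illustration of the histogram arguments, the sketch below
       buckets vfs_read() return values from 0 to 2000 in steps of 200. It
       assumes the kretprobe probe type (not described on this page); the
       range and the map name @bytes are arbitrary:

           kretprobe:vfs_read { @bytes = lhist(retval, 0, 2000, 200); }
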
FURTHER READING
       The official documentation can be found here:
       https://github.com/iovisor/bpftrace/blob/master/docs

HISTORY
       The first official talk by Alastair on bpftrace happened at the Tracing
       Summit in Edinburgh, Oct 25th 2018.

AUTHOR
       Created by Alastair Robertson.
       Manpage by Stephan Schuberth.

SEE ALSO
       man -k bcc, after having installed the bpfcc-tools package under
       Ubuntu.

CONTRIBUTING
       Prior to contributing new tools, read the official checklist at:
       https://github.com/iovisor/bpftrace/blob/master/CONTRIBUTING-TOOLS.md

October 2018                                                       BPFTRACE(8)