BPFTRACE(8)              System Manager's Manual              BPFTRACE(8)

NAME
bpftrace - the eBPF tracing language & frontend

SYNOPSIS
bpftrace [OPTIONS] FILE
bpftrace [OPTIONS] -e 'program code'

DESCRIPTION
bpftrace is a high-level tracing language for Linux enhanced Berkeley
Packet Filter (eBPF) available in recent Linux kernels (4.x).

bpftrace uses:

· LLVM as a backend to compile scripts to BPF bytecode

· BCC for interacting with the Linux BPF system

As well as the existing Linux tracing capabilities:

┌─────────┬─────────────┬──────────────┐
│         │ kernel      │ userland     │
├─────────┼─────────────┼──────────────┤
│ static  │ tracepoints │ USDT* probes │
├─────────┼─────────────┼──────────────┤
│ dynamic │ kprobes     │ uprobes      │
└─────────┴─────────────┴──────────────┘
*USDT = user-level statically defined tracing

The bpftrace language is inspired by awk and C, and predecessor tracers
such as DTrace and SystemTap.

See EXAMPLES and ONELINERS if you are impatient.
See PROBE TYPES and BUILTINS (variables/functions) for the bpftrace
language elements.

OPTIONS
-l [searchterm]
    List probes.

-e 'PROGRAM'
    Execute PROGRAM.

-p PID
    Enable USDT probes on PID. bpftrace terminates when PID terminates.
    Note that this is not a global PID filter on probes.

-c CMD
    Helper to run CMD. Equivalent to manually running CMD and then
    passing its PID to -p. This is useful to ensure you have traced at
    least the duration of CMD's execution.

--unsafe
    Enable unsafe builtin functions. By default, bpftrace runs in safe
    mode. Safe mode ensures programs cannot modify system state. Unsafe
    builtin functions are marked as such in BUILTINS (functions).

--btf
    Force BTF data processing if it's available. By default it's
    enabled only if the user does not specify any types/includes.

-v
    Verbose messages.

-d
    Debug info on dry run.

-dd
    Verbose debug info on dry run.

EXAMPLES
bpftrace -l '*sleep*'
    List probes containing "sleep".

bpftrace -e 'kprobe:do_nanosleep { printf("PID %d sleeping\n", pid); }'
    Trace processes calling sleep.

bpftrace -c 'sleep 5' -e 'kprobe:do_nanosleep { printf("PID %d sleeping\n", pid); }'
    Run "sleep 5" in a new process and then trace processes calling
    sleep.

bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm]=count(); }'
    Count syscalls by process name.

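The examples above all pass the program with -e. As a minimal sketch of
the FILE form shown in the synopsis, the snippet below saves a program
to a file and runs it. The file name sleepers.bt is hypothetical, the
interval probe that calls exit() after 10 seconds is added here only so
the run ends on its own, and the guard assumes bpftrace must be run as
root.

```shell
# Save a bpftrace program to a file (sleepers.bt is a hypothetical name).
cat > sleepers.bt <<'EOF'
kprobe:do_nanosleep { printf("PID %d sleeping\n", pid); }
interval:s:10 { exit(); }
EOF

# Run it only where bpftrace is installed and we are root (assumption).
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    bpftrace sleepers.bt
fi
```

Keeping the program in a file avoids shell quoting issues once a
program grows beyond one line.
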
ONELINERS
For brevity, only the BPF code is shown below.
Usage: bpftrace -e 'bpf-code'

New processes with arguments:
    tracepoint:syscalls:sys_enter_execve { join(args->argv); }

Files opened by process:
    tracepoint:syscalls:sys_enter_open { printf("%s %s\n", comm, str(args->filename)); }

Syscall count by program:
    tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }

Syscall count by syscall:
    tracepoint:syscalls:sys_enter_* { @[probe] = count(); }

Syscall count by process:
    tracepoint:raw_syscalls:sys_enter { @[pid, comm] = count(); }

Read bytes by process:
    tracepoint:syscalls:sys_exit_read /args->ret/ { @[comm] = sum(args->ret); }

Read size distribution by process:
    tracepoint:syscalls:sys_exit_read { @[comm] = hist(args->ret); }

Disk size by process:
    tracepoint:block:block_rq_issue { printf("%d %s %d\n", pid, comm, args->bytes); }

Pages paged in by process:
    software:major-faults:1 { @[comm] = count(); }

Page faults by process:
    software:faults:1 { @[comm] = count(); }

Profile user-level stacks at 99 Hertz, for PID 189:
    profile:hz:99 /pid == 189/ { @[ustack] = count(); }

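The one-liners above each use a single probe. As a hedged sketch of how
two probes can cooperate through a map, the program below stores a
timestamp when read(2) is entered and builds a latency histogram when it
returns. The file name readlat.bt and the map names @start and @ns are
hypothetical, and the 5 second exit() is added only to bound the run.

```shell
# Write a two-probe latency program (readlat.bt is a hypothetical name).
cat > readlat.bt <<'EOF'
// Record the entry timestamp per thread.
tracepoint:syscalls:sys_enter_read { @start[tid] = nsecs; }

// On return, add the elapsed nanoseconds to a per-process histogram,
// but only for threads whose entry was seen.
tracepoint:syscalls:sys_exit_read /@start[tid]/ {
    @ns[comm] = hist(nsecs - @start[tid]);
    delete(@start[tid]);
}

// Stop after 5 seconds.
interval:s:5 { exit(); }
EOF

# Run it only where bpftrace is installed and we are root (assumption).
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    bpftrace readlat.bt
fi
```

Keying @start by tid rather than pid keeps concurrent reads from
different threads of one process from overwriting each other's
timestamps.
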
PROBE TYPES
KPROBES
Attach a bpftrace script to a kernel function, to be executed when that
function is called:

    kprobe:vfs_read { ... }

UPROBES
Attach a script to a userland function:

    uprobe:/bin/bash:readline { ... }

TRACEPOINTS
Attach a script to a statically defined tracepoint in the kernel:

    tracepoint:sched:sched_switch { ... }

Tracepoints are guaranteed to be stable between kernel versions, unlike
kprobes.

SOFTWARE
Attach a script to kernel software events, executing once every provided
count or, if no count is given, once every default count:

    software:faults:100
    software:faults:

HARDWARE
Attach a script to hardware events (PMCs), executing once every provided
count or, if no count is given, once every default count:

    hardware:cache-references:1000000
    hardware:cache-references:

PROFILE
Run the script on all CPUs at specified time intervals:

    profile:hz:99 { ... }

    profile:s:1 { ... }

    profile:ms:20 { ... }

    profile:us:1500 { ... }

INTERVAL
Run the script once per interval, for printing interval output:

    interval:s:1 { ... }

    interval:ms:20 { ... }

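As a small sketch of interval output, the program below counts syscalls
in one probe and prints and resets that count once per second in an
interval probe. The file name syscallrate.bt and the map name @syscalls
are hypothetical, and the second interval probe that exits after 10
seconds is added only so the run ends on its own.

```shell
# Write a per-second syscall rate program (hypothetical file name).
cat > syscallrate.bt <<'EOF'
// Count every syscall into a single map.
tracepoint:raw_syscalls:sys_enter { @syscalls = count(); }

// Once per second, print the count and reset it.
interval:s:1 { print(@syscalls); clear(@syscalls); }

// Stop after 10 seconds.
interval:s:10 { exit(); }
EOF

# Run it only where bpftrace is installed and we are root (assumption).
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    bpftrace syscallrate.bt
fi
```
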
MULTIPLE ATTACHMENT POINTS
A single probe can be attached to multiple events:

    kprobe:vfs_read,kprobe:vfs_write { ... }

WILDCARDS
Some probe types allow wildcards to be used when attaching a probe:

    kprobe:vfs_* { ... }

PREDICATES
Define conditions for which a probe should be executed:

    kprobe:sys_open / uid == 0 / { ... }

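Since -p is not a global PID filter, a predicate is one way to restrict
probes to a single process. The sketch below combines multiple
attachment points with a predicate. The PID 189 is only a placeholder
borrowed from the ONELINERS section, the file name pidfilter.bt is
hypothetical, and the 5 second exit() is added to bound the run.

```shell
# Write a program that counts vfs_read/vfs_write calls for one PID only.
cat > pidfilter.bt <<'EOF'
// The predicate applies to both attachment points.
kprobe:vfs_read,kprobe:vfs_write /pid == 189/ {
    @calls[probe, comm] = count();
}

// Stop after 5 seconds.
interval:s:5 { exit(); }
EOF

# Run it only where bpftrace is installed and we are root (assumption).
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    bpftrace pidfilter.bt
fi
```
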
BUILTINS
The following variables and functions are available for use in bpftrace
scripts:

VARIABLES
pid     Process ID (kernel tgid)

tid     Thread ID (kernel pid)

cgroup  Cgroup ID of the current process

uid     User ID

gid     Group ID

nsecs   Nanosecond timestamp

cpu     Processor ID

comm    Process name

kstack  Kernel stack trace

ustack  User stack trace

arg0, arg1, ... etc.
        Arguments to the function being traced

retval  Return value from function being traced

func    Name of the function currently being traced

probe   Full name of the probe

curtask Current task_struct as a u64.

rand    Random number of type u32.

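As an illustration of how several of these variables can be combined as
map keys and values, the sketch below sums the byte count requested from
vfs_read per process and thread. It assumes that the third argument of
vfs_read (arg2) is the requested size, which depends on the kernel
version. The file name readreq.bt and the map name @requested are
hypothetical.

```shell
# Write a program keyed on builtin variables (hypothetical file name).
cat > readreq.bt <<'EOF'
// arg2 is assumed to be the requested byte count of vfs_read.
kprobe:vfs_read {
    @requested[pid, tid, comm] = sum(arg2);
}

// Stop after 5 seconds.
interval:s:5 { exit(); }
EOF

# Run it only where bpftrace is installed and we are root (assumption).
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    bpftrace readreq.bt
fi
```
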
FUNCTIONS
hist(int n)
    Produce a log2 histogram of values of n

lhist(int n, int min, int max, int step)
    Produce a linear histogram of values of n

count()
    Count the number of times this function is called

sum(int n)
    Sum this value

min(int n)
    Record the minimum value seen

max(int n)
    Record the maximum value seen

avg(int n)
    Average this value

stats(int n)
    Return the count, average, and total for this value

delete(@x)
    Delete the map element passed in as an argument

str(char *s)
    Returns the string pointed to by s

printf(char *fmt, ...)
    Print formatted to stdout

print(@x[, int top [, int div]])
    Print a map, with optional top entry count and divisor

clear(@x)
    Delete all key/values from a map

ksym(void *p)
    Resolve kernel address

usym(void *p)
    Resolve user space address

kaddr(char *name)
    Resolve kernel symbol name

uaddr(char *name)
    Resolve user space symbol name

reg(char *name)
    Returns the value stored in the named register

join(char *arr[])
    Prints the string array

time(char *fmt)
    Print the current time

cat(char *filename)
    Print file content

ntop([int af, ]int|char[4|16] addr)
    Convert IP address data to text

system(char *fmt) (unsafe)
    Execute shell command

exit()
    Quit bpftrace

kstack([StackMode mode, ][int level])
    Kernel stack trace

ustack([StackMode mode, ][int level])
    User stack trace

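A few of the functions above can be combined as in the following
sketch, which builds a linear histogram of read sizes with lhist(),
sums bytes per process with sum(), and prints only the top three
processes with print(). The bucket range (0 to 4096 in steps of 512),
the file name readsizes.bt, and the map names @size and @total are
illustrative choices, not values taken from this manual.

```shell
# Write a program combining lhist(), sum(), time(), print() and clear().
cat > readsizes.bt <<'EOF'
// Only successful reads that returned data.
tracepoint:syscalls:sys_exit_read /args->ret > 0/ {
    @size = lhist(args->ret, 0, 4096, 512);
    @total[comm] = sum(args->ret);
}

// After 5 seconds: print a timestamp and the top 3 readers, then quit.
interval:s:5 {
    time("%H:%M:%S\n");
    print(@total, 3);
    clear(@total);
    exit();
}
EOF

# Run it only where bpftrace is installed and we are root (assumption).
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    bpftrace readsizes.bt
fi
```

Clearing @total before exit() avoids the full map being printed a
second time when bpftrace exits, while @size is still printed then.
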
FURTHER READING
The official documentation can be found here:
https://github.com/iovisor/bpftrace/blob/master/docs

HISTORY
The first official talk by Alastair on bpftrace happened at the Tracing
Summit in Edinburgh, Oct 25th 2018.

AUTHOR
Created by Alastair Robertson.
Manpage by Stephan Schuberth.

SEE ALSO
man -k bcc, after having installed the bpfcc-tools package under
Ubuntu.

CONTRIBUTING
Prior to contributing new tools, read the official checklist at:
https://github.com/iovisor/bpftrace/blob/master/CONTRIBUTING-TOOLS.md

October 2018                                                  BPFTRACE(8)