profile(8)                  System Manager's Manual                 profile(8)

NAME

       profile - Profile CPU usage by sampling stack traces. Uses Linux
       eBPF/bcc.

SYNOPSIS

       profile [-adfh] [-p PID | -L TID] [-U | -K] [-F FREQUENCY | -c COUNT]
       [--stack-storage-size COUNT] [-C CPU] [--cgroupmap CGROUPMAP]
       [--mntnsmap MAPPATH] [duration]

DESCRIPTION

       This is a CPU profiler. It works by taking samples of stack traces at
       timed intervals. It will help you understand and quantify CPU usage:
       which code is executing, and by how much, including both user-level and
       kernel code.

       By default this samples at 49 Hertz (samples per second), across all
       CPUs. This frequency can be tuned using a command line option (-F). The
       reason for 49, and not 50, is to avoid lock-step sampling: sampling in
       step with periodic activity that runs at a round-number rate, which
       would skew the profile.

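       The effect of lock-step sampling can be illustrated with a small
       simulation. This is a sketch under stated assumptions, not part of the
       tool: the periodic task (busy for the first 1 ms of every 10 ms) and
       the function name busy_hit_fraction are hypothetical. The task's true
       CPU share is 10%.

```python
from fractions import Fraction


def busy_hit_fraction(sample_hz, seconds=10, period_ms=10, busy_ms=1):
    """Fraction of samples that land inside the busy window of a periodic task.

    The task is hypothetical: it is busy for the first `busy_ms` of every
    `period_ms`. Exact rational arithmetic avoids floating-point drift.
    """
    period = Fraction(period_ms, 1000)
    busy = Fraction(busy_ms, 1000)
    total = sample_hz * seconds
    hits = 0
    for i in range(total):
        t = Fraction(i, sample_hz)   # time of the i-th sample, in seconds
        if t % period < busy:        # sample falls inside the busy window
            hits += 1
    return Fraction(hits, total)


# 50 Hz divides evenly into the 10 ms period, so every sample lands at the
# same phase of the task. 49 Hz drifts across the period instead.
print("50 Hz:", float(busy_hit_fraction(50)))
print("49 Hz:", float(busy_hit_fraction(49)))
```

       In this model the 50 Hertz profile attributes every sample to the
       task, while the 49 Hertz profile reports 5 in 49 samples, close to the
       true 10%.
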
       This is also an efficient profiler, as stack traces are frequency
       counted in kernel context, rather than passing each stack to user space
       for frequency counting there. Only the unique stacks and counts are
       passed to user space at the end of the profile, greatly reducing the
       kernel<->user transfer.
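
       The saving from frequency counting can be sketched in user space. The
       example below is only an illustration: the stacks and their counts are
       made up, and count_stacks is a hypothetical stand-in for the counting
       that the tool does in kernel context.

```python
from collections import Counter


def count_stacks(samples):
    """Frequency-count identical stacks (a user-space stand-in for the
    counting that the profiler performs in kernel context)."""
    return Counter(tuple(stack) for stack in samples)


# Hypothetical samples: each stack is listed from root to leaf.
samples = (
    [["main", "parse", "read_input"]] * 120
    + [["main", "compute", "hash"]] * 300
    + [["main", "compute", "sort"]] * 70
)

counts = count_stacks(samples)
print(f"samples taken:        {len(samples)}")
print(f"unique stacks passed: {len(counts)}")
for stack, n in counts.most_common():
    print(f"{n:6d}  {';'.join(stack)}")
```

       Here 490 samples collapse to 3 stack and count pairs, which is all
       that would need to cross from kernel to user space.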

REQUIREMENTS

       CONFIG_BPF and bcc.

       This also requires Linux 4.9+ (BPF_PROG_TYPE_PERF_EVENT support). See
       tools/old for an older version that may work on Linux 4.6 - 4.8.

OPTIONS

       -h     Print usage message.

       -p PID Profile only the processes with the given PIDs (one or more,
              comma separated; filtered in-kernel).

       -L TID Profile only the threads with the given TIDs (one or more,
              comma separated; filtered in-kernel).

       -F frequency
              Frequency at which to sample stacks, in Hertz.

       -c count
              Sample one stack in every count events.

       -f     Print output in folded stack format.

       -d     Include an output delimiter between kernel and user stacks
              (either "--", or, in folded mode, "-").

       -U     Show stacks from user space only (no kernel space stacks).

       -K     Show stacks from kernel space only (no user space stacks).

       -I     Include CPU idle stacks (by default these are excluded).

       --stack-storage-size COUNT
              The maximum number of unique stack traces that the kernel will
              count (default 16384). If the sampled count exceeds this, a
              warning will be printed.

       -C cpu Collect stacks only from the specified CPU.

       --cgroupmap MAPPATH
              Profile cgroups in this BPF map only (filtered in-kernel).

       duration
              Duration to trace, in seconds.

EXAMPLES

       Profile (sample) stack traces system-wide at 49 Hertz (samples per
       second) until Ctrl-C:
              # profile

       Profile for 5 seconds only:
              # profile 5

       Profile at 99 Hertz for 5 seconds only:
              # profile -F 99 5

       Profile 1 in a million events for 5 seconds only:
              # profile -c 1000000 5

       Profile process with PID 181 only:
              # profile -p 181

       Profile thread with TID 181 only:
              # profile -L 181

       Profile for 5 seconds and output in folded stack format (suitable as
       input for flame graphs), including a delimiter between kernel and user
       stacks:
              # profile -df 5

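       Folded output can also be summarized directly. The sketch below
       assumes each folded line is a semicolon-separated list of frames (root
       first), then a space, then the sample count; the sample lines and the
       names parse_folded and leaf_totals are hypothetical. With -d, the "-"
       delimiter appears as an ordinary frame.

```python
from collections import Counter


def parse_folded(lines):
    """Parse folded-stack lines of the assumed form 'f1;f2;...;fN COUNT'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last space so frame names may contain spaces.
        stack, _, count = line.rpartition(" ")
        counts[tuple(stack.split(";"))] += int(count)
    return counts


def leaf_totals(counts):
    """Sum sample counts by leaf (last) frame, i.e. the on-CPU function."""
    totals = Counter()
    for stack, n in counts.items():
        totals[stack[-1]] += n
    return totals


# Hypothetical folded output, as might be saved from "profile -df 5".
folded = """\
myapp;main;compute;hash 31
myapp;main;compute;sort 7
myapp;main;read_input;-;sys_read;vfs_read 12
myapp;main;compute;hash 4
"""

counts = parse_folded(folded.splitlines())
for name, n in leaf_totals(counts).most_common():
    print(f"{n:6d}  {name}")
```
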
       Profile kernel stacks only:
              # profile -K

       Profile a set of cgroups only (see special_filtering.md from bcc
       sources for more details):
              # profile --cgroupmap /sys/fs/bpf/test01

DEBUGGING

       See "[unknown]" frames with bogus addresses? This can happen for
       different reasons. Your best approach is to get Linux perf to work
       first, and then to try this tool. For example, run "perf record -F 49
       -a -g -- sleep 1; perf script", and check for unknown frames there.

       The most common reason for "[unknown]" frames is that the target
       software has not been compiled with frame pointers, and so we can't use
       that simple method for walking the stack. The fix in that case is to
       use software that does have frame pointers, e.g., gcc
       -fno-omit-frame-pointer, or Java's -XX:+PreserveFramePointer.

       Another reason for "[unknown]" frames is JIT compilers, which don't use
       a traditional symbol table. The fix in that case is to populate a
       /tmp/perf-PID.map file with the symbols, which this tool should read.
       How you do this depends on the runtime (Java, Node.js).

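       As a rough sketch of what such a map file contains: the Linux perf
       map convention is one symbol per line, as "START SIZE symbolname",
       with START and SIZE in hexadecimal without a 0x prefix. Real runtimes
       emit this themselves (usually through an agent or a runtime option);
       the addresses, symbol names, and helper names below are hypothetical.

```python
def perf_map_path(pid):
    """Path of the perf map file for a given process ID."""
    return f"/tmp/perf-{pid}.map"


def format_perf_map(symbols):
    """Render (start_address, size, name) tuples as perf map lines.

    Each line is 'START SIZE name', with START and SIZE in hex, no 0x.
    """
    return "".join(
        f"{start:x} {size:x} {name}\n" for start, size, name in sorted(symbols)
    )


# Hypothetical JIT-compiled functions: (start address, size in bytes, name).
jit_symbols = [
    (0x7F3A10002000, 0x1C0, "LazyCompile:*render app.js:42"),
    (0x7F3A10001000, 0x80, "Stub:CEntryStub"),
]

print(perf_map_path(181))
print(format_perf_map(jit_symbols), end="")
```
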
       If you seem to have unrelated samples in the output, check for other
       sampling or tracing tools that may be running. The current version of
       this tool can include their events if profiling happened concurrently.
       Those samples may be filtered in a future version.

OVERHEAD

       This is an efficient profiler, as stack traces are frequency counted in
       kernel context, and only the unique stacks and their counts are passed
       to user space. Contrast this with the current "perf record -F 99 -a"
       method of profiling, which writes each sample to user space (via a ring
       buffer), and then to the file system (perf.data), which must be
       post-processed.

       This uses perf_event_open to set up a timer which is instrumented by
       BPF, and for efficiency it does not initialize the perf ring buffer, so
       the redundant perf samples are not collected.

       It's expected that the overhead while sampling at 49 Hertz (the
       default), across all CPUs, should be negligible. If you increase the
       sample rate, the overhead might begin to be measurable.
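
       To get a feel for how the sample rate scales the work done, the number
       of timer samples can be estimated as frequency x CPUs x duration. This
       is back-of-the-envelope arithmetic only: the 8-CPU system is
       hypothetical, and the estimate says nothing about the cost of each
       sample.

```python
def expected_samples(frequency_hz, cpus, duration_s):
    """Upper bound on samples taken: one per CPU per timer tick.

    Fewer stacks are reported in practice, since idle samples are
    excluded by default.
    """
    return frequency_hz * cpus * duration_s


# Hypothetical 8-CPU system, profiled for 5 seconds.
for hz in (49, 99, 999):
    print(f"{hz:4d} Hz -> up to {expected_samples(hz, 8, 5)} samples")
```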

SOURCE

       This is from bcc.

              https://github.com/iovisor/bcc

       Also look in the bcc distribution for a companion _examples.txt file
       containing example usage, output, and commentary for this tool.

OS

       Linux

STABILITY

       Unstable - in development.

AUTHOR

       Brendan Gregg

SEE ALSO

       offcputime(8)

USER COMMANDS                     2020-03-18                        profile(8)