PERF-BENCH(1) perf Manual PERF-BENCH(1)

NAME
       perf-bench - General framework for benchmark suites

SYNOPSIS
       perf bench [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION
       This perf bench command is a general framework for benchmark suites.

COMMON OPTIONS
       -r, --repeat=
           Specify number of times to repeat the run (default 10).

       -f, --format=
           Specify format style. Currently available format styles are:

       default
           Default style. This is mainly for human reading.

               % perf bench sched pipe                      # with no style specified
               (executing 1000000 pipe operations between two tasks)
                       Total time:5.855 sec
                               5.855061 usecs/op
                               170792 ops/sec

       simple
           This simple style is friendly for automated processing by scripts.

               % perf bench --format=simple sched pipe      # specified simple
               5.988

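The simple style can be consumed directly from a script. The following Python sketch assumes that the simple output is a single number giving the total time in seconds, as in the sample run above. The helper names parse_simple_output and run_simple are hypothetical and not part of perf.

```python
import subprocess


def parse_simple_output(text: str) -> float:
    """Return the first numeric line of simple-style output as seconds.

    Assumption: the simple style prints one number (total time in seconds),
    as in the sample run shown in this manual.
    """
    for line in text.splitlines():
        try:
            return float(line.strip())
        except ValueError:
            continue  # skip blank or non-numeric lines
    raise ValueError("no numeric value found in perf bench output")


def run_simple(subsystem: str, suite: str, *options: str) -> float:
    """Run `perf bench --format=simple <subsystem> <suite> [options]`."""
    cmd = ["perf", "bench", "--format=simple", subsystem, suite, *options]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_simple_output(result.stdout)
```

With perf installed, run_simple("sched", "pipe") would run the suite and return the reported time as a float.
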
SUBSYSTEM
       sched
           Scheduler and IPC mechanisms.

       syscall
           System call performance (throughput).

       mem
           Memory access performance.

       numa
           NUMA scheduling and MM benchmarks.

       futex
           Futex stressing benchmarks.

       epoll
           Eventpoll (epoll) stressing benchmarks.

       internals
           Benchmark internal perf functionality.

       uprobe
           Benchmark overhead of uprobe + BPF.

       all
           All benchmark subsystems.

SUITES FOR sched
       messaging
           Suite for evaluating performance of scheduler and IPC mechanisms.
           Based on hackbench by Rusty Russell.

           Options of messaging
               -p, --pipe
                   Use pipe() instead of socketpair().

               -t, --thread
                   Use multiple threads instead of multiple processes.

               -g, --group=
                   Specify number of groups.

               -l, --nr_loops=
                   Specify number of loops.

           Example of messaging
               % perf bench sched messaging                 # run with default options
               (20 sender and receiver processes per group)
               (10 groups == 400 processes run)

                       Total time:0.308 sec

               % perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
               (20 sender and receiver threads per group)
               (20 groups == 800 threads run)

                       Total time:0.582 sec

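The task counts in the sample output follow from the group count. As a small worked check in Python, assuming (as the sample output suggests) 20 senders plus 20 receivers per group; messaging_task_count is a hypothetical helper name:

```python
def messaging_task_count(groups: int, per_side: int = 20) -> int:
    """Number of tasks: `per_side` senders plus `per_side` receivers per group.

    Assumption: the per-group figure of 20 is taken from the sample output.
    """
    return groups * 2 * per_side


print(messaging_task_count(10))  # 400, matches "10 groups == 400 processes run"
print(messaging_task_count(20))  # 800, matches "20 groups == 800 threads run"
```
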
       pipe
           Suite for pipe() system call. Based on pipe-test-1m.c by Ingo
           Molnar.

           Options of pipe
               -l, --loop=
                   Specify number of loops.

           Example of pipe
               % perf bench sched pipe
               (executing 1000000 pipe operations between two tasks)

                       Total time:8.091 sec
                               8.091833 usecs/op
                               123581 ops/sec

               % perf bench sched pipe -l 1000              # loop 1000
               (executing 1000 pipe operations between two tasks)

                       Total time:0.016 sec
                               16.948000 usecs/op
                               59004 ops/sec

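The two derived figures in the pipe output are consistent with simple arithmetic on the total time and the loop count. The Python sketch below reproduces the sample values. It assumes that ops/sec is truncated to an integer and uses the unrounded total time implied by the usecs/op line; pipe_metrics is a hypothetical helper name, not perf code.

```python
from typing import Tuple


def pipe_metrics(total_sec: float, loops: int) -> Tuple[float, int]:
    """Derive (usecs/op, ops/sec) from a total time and an operation count."""
    usecs_per_op = total_sec * 1_000_000 / loops
    ops_per_sec = int(loops / total_sec)  # assumption: truncated, not rounded
    return usecs_per_op, ops_per_sec


# Default run: 1000000 operations in 8.091833 seconds.
print(pipe_metrics(8.091833, 1_000_000))
# Short run: 1000 operations in 0.016948 seconds (printed as 0.016 sec).
print(pipe_metrics(0.016948, 1000))
```
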
SUITES FOR syscall
       basic
           Suite for evaluating performance of core system call throughput
           (both usecs/op and ops/sec metrics). It uses a single thread
           that repeatedly calls getppid(2), a simple syscall whose result
           is not cached by glibc.

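The shape of such a measurement loop can be sketched in Python. This is only an illustration of the idea, not the perf implementation: interpreter overhead dominates the timing here, so the numbers are not comparable with perf bench results. The function name bench_getppid and its default loop count are assumptions.

```python
import os
import time
from typing import Tuple


def bench_getppid(loops: int = 100_000) -> Tuple[float, float]:
    """Time `loops` calls to os.getppid() in a single thread.

    Returns (usecs per call, calls per second).
    """
    start = time.perf_counter()
    for _ in range(loops):
        os.getppid()
    elapsed = time.perf_counter() - start
    return elapsed * 1_000_000 / loops, loops / elapsed


if __name__ == "__main__":
    usecs_per_op, ops_per_sec = bench_getppid()
    print(f"{usecs_per_op:.6f} usecs/op")
    print(f"{ops_per_sec:.0f} ops/sec")
```
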
SUITES FOR mem
       memcpy
           Suite for evaluating performance of simple memory copy in various
           ways.

           Options of memcpy
               -s, --size
                   Specify size of memory to copy (default: 1MB). Available units
                   are B, KB, MB, GB and TB (case insensitive).

               -f, --function
                   Specify the function used to copy (default: default).
                   Available functions depend on the architecture. On x86-64,
                   x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

               -l, --nr_loops
                   Repeat memcpy invocation this number of times.

               -c, --cycles
                   Use perf’s cpu-cycles event instead of the gettimeofday() syscall.

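The size argument combines a number with one of the listed units. The Python sketch below shows one way such strings could be interpreted. It assumes binary multiples (1 KB = 1024 bytes) and requires an explicit unit; both are assumptions of this sketch rather than statements about perf, and parse_size is a hypothetical helper name.

```python
import re

# Assumption: units are binary multiples (KB = 2**10 bytes, and so on).
_UNITS = {"B": 1, "KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40}


def parse_size(text: str) -> int:
    """Convert a size string such as '1MB' or '512kb' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"malformed size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper()  # units are case insensitive
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(number) * _UNITS[unit]


print(parse_size("1MB"))  # 1048576 under the binary-multiple assumption
```
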
       memset
           Suite for evaluating performance of simple memory set in
           various ways.

           Options of memset
               -s, --size
                   Specify size of memory to set (default: 1MB). Available units
                   are B, KB, MB, GB and TB (case insensitive).

               -f, --function
                   Specify the function used to set (default: default).
                   Available functions depend on the architecture. On x86-64,
                   x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.

               -l, --nr_loops
                   Repeat memset invocation this number of times.

               -c, --cycles
                   Use perf’s cpu-cycles event instead of the gettimeofday() syscall.

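The size and loop-count options together determine how many bytes a run touches, and wall-clock timing turns that into a throughput figure. The Python sketch below illustrates that relationship for a whole-buffer copy. It does not reproduce the architecture-specific routines or the cycles-based mode, and bench_copy and its defaults are assumptions made for illustration.

```python
import time


def bench_copy(size: int = 1 << 20, nr_loops: int = 100) -> float:
    """Copy a `size`-byte buffer `nr_loops` times; return bytes per second."""
    src = bytearray(size)
    dst = bytearray(size)
    start = time.perf_counter()
    for _ in range(nr_loops):
        dst[:] = src  # whole-buffer copy into the existing destination
    elapsed = time.perf_counter() - start
    return size * nr_loops / elapsed


if __name__ == "__main__":
    rate = bench_copy()
    print(f"{rate / (1 << 20):.1f} MB/sec")
```
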
SUITES FOR numa
       mem
           Suite for evaluating NUMA workloads.

SUITES FOR futex
       hash
           Suite for evaluating hash tables.

       wake
           Suite for evaluating wake calls.

       wake-parallel
           Suite for evaluating parallel wake calls.

       requeue
           Suite for evaluating requeue calls.

       lock-pi
           Suite for evaluating futex lock_pi calls.

SUITES FOR epoll
       wait
           Suite for evaluating concurrent epoll_wait calls.

       ctl
           Suite for evaluating multiple epoll_ctl calls.

SUITES FOR internals
       synthesize
           Suite for evaluating perf’s event synthesis performance.

SEE ALSO
       perf(1)

perf 11/28/2023 PERF-BENCH(1)