PERF-BENCH(1)                     perf Manual                    PERF-BENCH(1)

NAME

       perf-bench - General framework for benchmark suites

SYNOPSIS

       perf bench [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION

       The perf bench command is a general framework for benchmark suites.

COMMON OPTIONS

       -r, --repeat=
           Specify the number of times to repeat the run (default 10).

       -f, --format=
           Specify the format style. The currently available format styles
           are:

       default
           Default style, intended mainly for human reading.

           % perf bench sched pipe                      # with no style specified
           (executing 1000000 pipe operations between two tasks)
                   Total time:5.855 sec
                           5.855061 usecs/op
                           170792 ops/sec

       simple
           Simple style, intended for automated processing by scripts.

           % perf bench --format=simple sched pipe      # specified simple
           5.988
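
       As an illustration of the scripted use the simple style is meant for,
       the sketch below runs the pipe suite and converts the result to a
       number. It assumes that perf is installed, that the value is written
       to standard output, and that it is the total time in seconds; the
       function names are hypothetical and not part of perf.

```python
import subprocess


def parse_simple(output):
    """Return the value printed by --format=simple as a float.

    Assumption: the value is the last non-empty line of the output.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    return float(lines[-1])


def run_pipe_simple():
    """Run 'perf bench --format=simple sched pipe' (requires perf)."""
    result = subprocess.run(
        ["perf", "bench", "--format=simple", "sched", "pipe"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_simple(result.stdout)
```

       Keeping the parsing separate from the perf invocation lets the parser
       be checked against captured output on machines without perf.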

SUBSYSTEM

       sched
           Scheduler and IPC mechanisms.

       syscall
           System call performance (throughput).

       mem
           Memory access performance.

       numa
           NUMA scheduling and MM benchmarks.

       futex
           Futex stressing benchmarks.

       epoll
           Eventpoll (epoll) stressing benchmarks.

       internals
           Benchmark internal perf functionality.

       uprobe
           Benchmark overhead of uprobe + BPF.

       all
           All benchmark subsystems.

   SUITES FOR sched
       messaging
           Suite for evaluating performance of scheduler and IPC mechanisms.
           Based on hackbench by Rusty Russell.

       Options of messaging
           -p, --pipe
               Use pipe() instead of socketpair().

           -t, --thread
               Use multiple threads instead of multiple processes.

           -g, --group=
               Specify the number of groups.

           -l, --nr_loops=
               Specify the number of loops.

       Example of messaging
               % perf bench sched messaging                 # run with default options
               (20 sender and receiver processes per group)
               (10 groups == 400 processes run)

                     Total time:0.308 sec

               % perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
               (20 sender and receiver threads per group)
               (20 groups == 800 threads run)

                     Total time:0.582 sec

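       The task counts in the example output follow from the group count.
       The output suggests 20 senders plus 20 receivers per group, which
       matches 10 groups == 400 and 20 groups == 800. A minimal sketch of that
       arithmetic; the per-group figure is inferred from the example output,
       not taken from the perf source, and the names are hypothetical.

```python
# Assumption inferred from the example output: each group holds
# 20 senders and 20 receivers.
TASKS_PER_ROLE = 20


def total_tasks(groups):
    """Return the number of processes or threads started for `groups` groups."""
    return groups * 2 * TASKS_PER_ROLE
```
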
       pipe
           Suite for the pipe() system call. Based on pipe-test-1m.c by Ingo
           Molnar.

       Options of pipe
           -l, --loop=
               Specify the number of loops.

       Example of pipe
               % perf bench sched pipe
               (executing 1000000 pipe operations between two tasks)

                       Total time:8.091 sec
                               8.091833 usecs/op
                               123581 ops/sec

               % perf bench sched pipe -l 1000              # loop 1000
               (executing 1000 pipe operations between two tasks)

                       Total time:0.016 sec
                               16.948000 usecs/op
                               59004 ops/sec

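       The usecs/op and ops/sec figures in the example are consistent with
       dividing the elapsed time by the number of operations and truncating
       the rate to an integer. The sketch below reproduces them. It assumes
       unrounded elapsed times of 8.091833 s and 0.016948 s, which are
       inferred from the usecs/op lines; the function name is hypothetical and
       this is not necessarily how perf computes the values internally.

```python
def pipe_metrics(total_sec, ops):
    """Return (usecs_per_op, ops_per_sec) for `ops` operations in `total_sec`."""
    usecs_per_op = total_sec * 1_000_000 / ops
    # Truncate rather than round, matching the integers in the example output.
    ops_per_sec = int(ops / total_sec)
    return usecs_per_op, ops_per_sec


if __name__ == "__main__":
    # Elapsed times inferred from the example output above.
    for total_sec, ops in [(8.091833, 1_000_000), (0.016948, 1000)]:
        usecs, rate = pipe_metrics(total_sec, ops)
        print(f"{usecs:.6f} usecs/op, {rate} ops/sec")
```
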
   SUITES FOR syscall
       basic
           Suite for evaluating performance of core system call throughput
           (both usecs/op and ops/sec metrics). It uses a single thread that
           repeatedly calls getppid(2), a simple syscall whose result is not
           cached by glibc.

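       The measurement idea can be sketched as a timed loop around
       getppid(2). This is an illustration only, not the perf implementation:
       in Python the interpreter overhead dominates the syscall cost, so the
       numbers are not comparable with perf bench output. The function name
       and the default loop count are hypothetical.

```python
import os
import time


def time_getppid(loops=100_000):
    """Call os.getppid() `loops` times; return (usecs_per_op, ops_per_sec)."""
    start = time.perf_counter()
    for _ in range(loops):
        os.getppid()
    elapsed = time.perf_counter() - start
    usecs_per_op = elapsed * 1_000_000 / loops
    ops_per_sec = int(loops / elapsed)
    return usecs_per_op, ops_per_sec


if __name__ == "__main__":
    usecs, rate = time_getppid()
    print(f"{usecs:.6f} usecs/op, {rate} ops/sec")
```
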
   SUITES FOR mem
       memcpy
           Suite for evaluating performance of simple memory copy in various
           ways.

       Options of memcpy
           -s, --size
               Specify size of memory to copy (default: 1MB). Available units
               are B, KB, MB, GB and TB (case insensitive).

           -f, --function
               Specify function to copy (default: default). Available
               functions depend on the architecture. On x86-64,
               x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

           -l, --nr_loops
               Repeat memcpy invocation this number of times.

           -c, --cycles
               Use perf’s cpu-cycles event instead of gettimeofday syscall.

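       To illustrate the size syntax, the sketch below converts a size string
       with one of the listed units into a byte count. It assumes binary
       multiples (1 KB = 1024 bytes) and treats a bare number as bytes; both
       are assumptions rather than documented behaviour, and the function name
       is hypothetical.

```python
import re

# Assumption: units are binary multiples of a byte.
_UNITS = {"B": 1, "KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40}


def parse_size(text):
    """Convert a string such as '1MB' or '4kb' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # assumption: no unit means bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(number) * _UNITS[unit]
```
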
       memset
           Suite for evaluating performance of simple memory set in various
           ways.

       Options of memset
           -s, --size
               Specify size of memory to set (default: 1MB). Available units
               are B, KB, MB, GB and TB (case insensitive).

           -f, --function
               Specify function to set (default: default). Available functions
               depend on the architecture. On x86-64, x86-64-unrolled,
               x86-64-stosq and x86-64-stosb are supported.

           -l, --nr_loops
               Repeat memset invocation this number of times.

           -c, --cycles
               Use perf’s cpu-cycles event instead of gettimeofday syscall.

   SUITES FOR numa
       mem
           Suite for evaluating NUMA workloads.

   SUITES FOR futex
       hash
           Suite for evaluating hash tables.

       wake
           Suite for evaluating wake calls.

       wake-parallel
           Suite for evaluating parallel wake calls.

       requeue
           Suite for evaluating requeue calls.

       lock-pi
           Suite for evaluating futex lock_pi calls.

   SUITES FOR epoll
       wait
           Suite for evaluating concurrent epoll_wait calls.

       ctl
           Suite for evaluating multiple epoll_ctl calls.

   SUITES FOR internals
       synthesize
           Suite for evaluating perf’s event synthesis performance.

SEE ALSO

       perf(1)

perf                              11/28/2023                     PERF-BENCH(1)