llvm-exegesis-7(1)

LLVM-EXEGESIS(1)                     LLVM                     LLVM-EXEGESIS(1)

NAME

       llvm-exegesis - LLVM Machine Instruction Benchmark

SYNOPSIS

       llvm-exegesis [options]

DESCRIPTION

       llvm-exegesis is a benchmarking tool that uses information available in
       LLVM to measure host machine instruction characteristics such as latency
       or port decomposition.

       Given an LLVM opcode name and a benchmarking mode, llvm-exegesis
       generates a code snippet that makes execution as serial (resp. as
       parallel) as possible, so that the latency (resp. uop decomposition) of
       the instruction can be measured. The code snippet is JIT-compiled and
       executed on the host subtarget. The time taken (resp. resource usage) is
       measured using hardware performance counters. The result is printed as
       YAML to the standard output.

       The main goal of this tool is to automatically (in)validate LLVM's
       TableGen scheduling models. To that end, it also provides analysis of
       the results.

EXAMPLES: BENCHMARKING

       Assume you have an X86-64 machine. To measure the latency of a single
       instruction, run:

          $ llvm-exegesis -mode=latency -opcode-name=ADD64rr

       Measuring the uop decomposition of an instruction works similarly:

          $ llvm-exegesis -mode=uops -opcode-name=ADD64rr

       The output is a YAML document (the default is to write to stdout, but
       you can redirect the output to a file using -benchmarks-file):

          ---
          key:
            opcode_name:     ADD64rr
            mode:            latency
            config:          ''
          cpu_name:        haswell
          llvm_triple:     x86_64-unknown-linux-gnu
          num_repetitions: 10000
          measurements:
            - { key: latency, value: 1.0058, debug_string: '' }
          error:           ''
          info:            'explicit self cycles, selecting one aliasing configuration.
          Snippet:
          ADD64rr R8, R8, R10
          '
          ...

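       As an illustration only, the measurements in a YAML document like the
       one above could be pulled out with a short Python sketch. The function
       name extract_measurements and the regular expression are assumptions
       made for this example; the pattern only covers the one-line
       "{ key: ..., value: ..., ... }" layout shown above and is not a YAML
       parser.

```python
import re

# Matches one flow-style measurement entry such as
#   - { key: latency, value: 1.0058, debug_string: '' }
# Deliberately naive: it only handles the layout shown in the example output.
MEASUREMENT_RE = re.compile(
    r"-\s*\{\s*key:\s*(?P<key>[^,]+?)\s*,\s*value:\s*(?P<value>[-+0-9.eE]+)\s*,"
)


def extract_measurements(yaml_text):
    """Return a list of (key, value) pairs found in benchmark output text."""
    return [
        (m.group("key"), float(m.group("value")))
        for m in MEASUREMENT_RE.finditer(yaml_text)
    ]


# Abridged copy of the example output (the multi-line info field is omitted).
SAMPLE = """\
---
key:
  opcode_name:     ADD64rr
  mode:            latency
  config:          ''
cpu_name:        haswell
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 1.0058, debug_string: '' }
error:           ''
...
"""

if __name__ == "__main__":
    print(extract_measurements(SAMPLE))
```

       For anything beyond a quick check, a real YAML library is likely the
       more robust choice.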
       To measure the latency of all instructions for the host architecture,
       run:

          #!/bin/bash
          # Highest opcode index: INSTRUCTION_LIST_END minus one.
          readonly INSTRUCTIONS=$(($(grep INSTRUCTION_LIST_END build/lib/Target/X86/X86GenInstrInfo.inc | cut -f2 -d=) - 1))
          # Benchmark each opcode index and print from the first '---' line on.
          for INSTRUCTION in $(seq 1 "${INSTRUCTIONS}"); do
            ./build/bin/llvm-exegesis -mode=latency -opcode-index="${INSTRUCTION}" | sed -n '/---/,$p'
          done

       FIXME: Provide an llvm-exegesis option to test all instructions.
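       When only a handful of opcodes are of interest, the command lines can
       also be assembled from a script. The sketch below is a hypothetical
       helper (build_command is not part of llvm-exegesis); it uses only the
       -mode, -opcode-name and -benchmarks-file options documented on this
       page and merely builds the argument lists without running anything.

```python
import shlex


def build_command(opcode_name, mode="latency", benchmarks_file=None,
                  binary="llvm-exegesis"):
    """Build the argument list for one llvm-exegesis benchmarking run."""
    if mode not in ("latency", "uops"):
        raise ValueError("benchmarking mode must be 'latency' or 'uops'")
    args = [binary, "-mode=" + mode, "-opcode-name=" + opcode_name]
    if benchmarks_file is not None:
        args.append("-benchmarks-file=" + benchmarks_file)
    return args


if __name__ == "__main__":
    # Opcode names taken from the examples on this page; the selection itself
    # is arbitrary.
    for opcode in ["ADD64rr", "ADD32rr"]:
        print(shlex.join(build_command(opcode)))
```

       Each returned list can be handed to subprocess.run() to execute the
       benchmark on a machine where llvm-exegesis is installed.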

EXAMPLES: ANALYSIS

       Assuming you have a set of benchmarked instructions (either latency or
       uops) as YAML in file /tmp/benchmarks.yaml, you can analyze the results
       using the following command:

          $ llvm-exegesis -mode=analysis \
                -benchmarks-file=/tmp/benchmarks.yaml \
                -analysis-clusters-output-file=/tmp/clusters.csv \
                -analysis-inconsistencies-output-file=/tmp/inconsistencies.html

       This will group the instructions into clusters with the same
       performance characteristics. The clusters will be written out to
       /tmp/clusters.csv in the following format:

          cluster_id,opcode_name,config,sched_class
          ...
          2,ADD32ri8_DB,,WriteALU,1.00
          2,ADD32ri_DB,,WriteALU,1.01
          2,ADD32rr,,WriteALU,1.01
          2,ADD32rr_DB,,WriteALU,1.00
          2,ADD32rr_REV,,WriteALU,1.00
          2,ADD64i32,,WriteALU,1.01
          2,ADD64ri32,,WriteALU,1.01
          2,MOVSX64rr32,,BSWAP32r_BSWAP64r_MOVSX64rr32,1.00
          2,VPADDQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.02
          2,VPSUBQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.01
          2,ADD64ri8,,WriteALU,1.00
          2,SETBr,,WriteSETCC,1.01
          ...

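       The clusters CSV is easy to post-process. As a sketch under stated
       assumptions, the Python snippet below reads rows in the format shown
       above and records which cluster ids each sched class appears in. The
       function name clusters_by_sched_class is made up for this example, the
       code only relies on the first four columns, and it is not the check
       llvm-exegesis itself performs.

```python
import csv
import io
from collections import defaultdict


def clusters_by_sched_class(csv_text):
    """Map each sched class to the set of cluster ids it appears in."""
    result = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip the header and anything too short to be a data row.
        if len(row) < 4 or row[0] == "cluster_id":
            continue
        # Fields beyond the fourth (the trailing value column) are ignored.
        cluster_id, _opcode_name, _config, sched_class = row[:4]
        result[sched_class].add(cluster_id)
    return dict(result)


# A few rows copied from the example output above.
SAMPLE = """\
cluster_id,opcode_name,config,sched_class
2,ADD32rr,,WriteALU,1.01
2,ADD64ri8,,WriteALU,1.00
2,SETBr,,WriteSETCC,1.01
"""

if __name__ == "__main__":
    for sched_class, cluster_ids in sorted(clusters_by_sched_class(SAMPLE).items()):
        print(sched_class, sorted(cluster_ids))
```

       A sched class that shows up in more than one cluster may be worth a
       closer look in the inconsistencies report.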
       llvm-exegesis will also analyze the clusters to point out
       inconsistencies in the scheduling information. The output is an HTML
       file. For example, /tmp/inconsistencies.html will contain messages like
       the following: [image]

       Note that the scheduling class names will be resolved only when
       llvm-exegesis is compiled in debug mode; otherwise only the class id
       will be shown. This does not invalidate any of the analysis results,
       though.

OPTIONS

       -help  Print a summary of command line options.

       -opcode-index=<LLVM opcode index>
              Specify the opcode to measure, by index. Either opcode-index or
              opcode-name must be set.

       -opcode-name=<LLVM opcode name>
              Specify the opcode to measure, by name. Either opcode-index or
              opcode-name must be set.

       -mode=[latency|uops|analysis]
              Specify the run mode.

       -num-repetitions=<Number of repetitions>
              Specify the number of repetitions of the asm snippet. Higher
              values lead to more accurate measurements but lengthen the
              benchmark.

       -benchmarks-file=</path/to/file>
              File to read (analysis mode) or write (latency/uops modes)
              benchmark results. "-" uses stdin/stdout.

       -analysis-clusters-output-file=</path/to/file>
              If provided, write the analysis clusters as CSV to this file.
              "-" prints to stdout.

       -analysis-inconsistencies-output-file=</path/to/file>
              If non-empty, write inconsistencies found during analysis to
              this file. "-" prints to stdout.

       -analysis-numpoints=<dbscan numPoints parameter>
              Specify the numPoints parameter to be used for DBSCAN
              clustering (analysis mode).

       -analysis-epsilon=<dbscan epsilon parameter>
              Specify the epsilon parameter to be used for DBSCAN clustering
              (analysis mode).

       -ignore-invalid-sched-class=false
              If set, ignore instructions that do not have a sched class
              (class idx = 0).
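       To give a feel for the two DBSCAN parameters mentioned above (numPoints
       and epsilon), the sketch below applies the textbook DBSCAN notion of a
       core point to one-dimensional values. It is an illustration only: the
       function core_points is hypothetical, and whether llvm-exegesis counts
       neighbors in exactly this way is an assumption that has not been
       checked against its implementation.

```python
def core_points(values, epsilon, num_points):
    """Return the values whose epsilon-neighborhood holds >= num_points values.

    The neighborhood of a value includes the value itself, following the
    textbook DBSCAN definition.
    """
    return [
        v for v in values
        if sum(1 for other in values if abs(other - v) <= epsilon) >= num_points
    ]


if __name__ == "__main__":
    # The first four numbers resemble the values in the clusters CSV example;
    # 3.0 is a made-up outlier added for illustration.
    measurements = [1.00, 1.01, 1.01, 1.02, 3.0]
    print(core_points(measurements, epsilon=0.1, num_points=3))
```

       In this picture, a larger epsilon merges more measurements into one
       neighborhood, while a larger numPoints demands more agreeing
       measurements before a dense region is recognized.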

EXIT STATUS

       llvm-exegesis returns 0 on success. Otherwise, an error message is
       printed to standard error, and the tool returns a non-zero value.
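       A calling script can rely on the exit status described above. The
       following Python sketch shows one way to do so; run_and_report is a
       hypothetical helper, and a stand-in command is used so that the
       example runs without llvm-exegesis installed.

```python
import subprocess
import sys


def run_and_report(args):
    """Run a command; return (succeeded, stderr_text) based on its exit status."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode == 0, completed.stderr


if __name__ == "__main__":
    # Stand-in command that exits with a non-zero status. Replace it with,
    # for example, ["llvm-exegesis", "-mode=latency", "-opcode-name=ADD64rr"].
    ok, err = run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(ok)
```

       On failure, the second element of the returned pair holds whatever the
       command wrote to standard error.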

AUTHOR

       Maintained by The LLVM Team (http://llvm.org/).

COPYRIGHT

       2003-2023, LLVM Project

7                                 2023-07-20                  LLVM-EXEGESIS(1)