UMR(1) User Manuals UMR(1)



umr - AMDGPU Userspace Register Debugger

umr is a tool to read and display, as well as write to AMDGPU device
MMIO, PCIE, SMC, and DIDT registers via userspace. It can autodetect
and scan AMDGPU devices (SI and up).

--database-path, -dbp <path>
Specify a database path for register, ip, and asic model data.

--gpu, -g <asicname>(@<instance> | =<pcidevice>)
Select a gpu by ASIC name and either the instance number or the
PCI bus identifier. For instance, "raven1@1" would pick the
raven1 device in the 2nd DRI instance slot. Similarly,
"raven1=0000:06:00.0" would pick a raven1 device with the PCI
bus address '0000:06:00.0'.
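
The two selector forms accepted by --gpu can be illustrated with a small
sketch. This is not umr's own parser; the names GpuSelector and
parse_gpu_selector are hypothetical, and the instance number is assumed to
be decimal.

```python
from typing import NamedTuple, Optional


class GpuSelector(NamedTuple):
    asic: str                 # ASIC name, e.g. "raven1"
    instance: Optional[int]   # set for the '<asicname>@<instance>' form
    pci: Optional[str]        # set for the '<asicname>=<pcidevice>' form


def parse_gpu_selector(text: str) -> GpuSelector:
    """Split a --gpu argument into its ASIC name and selector part.

    Assumption: the instance number is written in decimal.
    """
    if "@" in text:
        asic, _, inst = text.partition("@")
        return GpuSelector(asic, int(inst), None)
    if "=" in text:
        asic, _, pci = text.partition("=")
        return GpuSelector(asic, None, pci)
    raise ValueError(f"expected '@<instance>' or '=<pcidevice>' in {text!r}")


if __name__ == "__main__":
    print(parse_gpu_selector("raven1@1"))
    print(parse_gpu_selector("raven1=0000:06:00.0"))
```

Both example strings are the ones used in the --gpu description above.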

--instance, -i <number>
Pick a device instance to work with. Defaults to the 0'th device.
The instance refers to a directory under /sys/kernel/debug/dri/
where 0 is the first card probed.

--force, -f <number>
Force a PCIE Device ID in hex or by asic name. This is used in
case the amdgpu driver is not yet loaded or a display is not yet
attached. A '.' prefix will specify a virtual device which is
handy for looking up register decodings for a device not present
in the system, for instance, '.vega10'.

--pci <device>
Force a specific PCI device using the domain:bus:slot.function
format in hex. This is useful when more than one GPU is available.
If the amdgpu driver is loaded the corresponding instance
will be automatically detected.
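
As an illustration of the domain:bus:slot.function format, the following
sketch splits such a string into its four hexadecimal fields. The function
name parse_pci_address is hypothetical and the code is independent of umr
itself.

```python
import re
from typing import Tuple

# Four hexadecimal fields in the form domain:bus:slot.function
_PCI_RE = re.compile(
    r"^([0-9a-fA-F]+):([0-9a-fA-F]+):([0-9a-fA-F]+)\.([0-9a-fA-F]+)$"
)


def parse_pci_address(text: str) -> Tuple[int, int, int, int]:
    """Return (domain, bus, slot, function) parsed as hex integers."""
    match = _PCI_RE.match(text)
    if match is None:
        raise ValueError(f"not a domain:bus:slot.function address: {text!r}")
    domain, bus, slot, function = (int(field, 16) for field in match.groups())
    return domain, bus, slot, function


if __name__ == "__main__":
    print(parse_pci_address("0000:06:00.0"))
```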

--gfxoff, -go <0 | 1>
Turn on or off GFXOFF on select hardware. A non-zero value enables
the GFXOFF feature and a zero value disables it.

--vm_partition, -vmp <-1, 0...n>
Select a VM partition for all GPUVM accesses. Default is -1
which refers to the 0'th instance of the VM hub which is not the
same as specifying '0'. Values above -1 are for ASICs with multiple
IP instances.

--option, -O <string>[,<string>,...]
Specify options to the tool. Multiple options can be specified
as comma separated strings. Options take effect only for commands
that follow them, so specify them before commands such as --update
or --force.

quiet
Disable various informative but not required (for functionality)
outputs.

read_smc
Enable scanning of SMC registers.

bits
Enables displaying bitfields for scanned blocks.

bitsfull
Enables displaying bitfields using their entire path for
scanned blocks.

empty_log
Empties the MMIO log after reading it.

follow
Causes the --logscan command to repeatedly produce output
without exiting.

no_follow_ib
Instruct the --ring-stream command to not attempt to follow
IBs pointed to by the packets in the ring.

use_pci
Enable PCI access for MMIO instead of using debugfs. Used
by the --read, --scan, --top, --write, and --writebit commands.
Does not currently support multiple instances of the same GPU
(PCI device ID). Note that access to non-MMIO registers might
be disabled when using this flag.

use_colour
Enable colour output for --top command, scales from blue,
green, yellow, to red. Also accepted is 'use_color'.

no_kernel
Disable using kernel files to access the device. Implies
'use_pci'. This is meant to be used only if the KMD is hung or
otherwise not working correctly. Using it on live systems
may result in race conditions.

verbose
Enable verbose diagnostics (used in --vram).

halt_waves
Halt/resume all waves while reading wave status.

disasm_early_term
Terminate shader disassembly when first s_endpgm is hit.
This is required for older UMDs (or non-mesa UMDs) that don't use
the quintuple 0xBF9F0000 to signal the true end of a shader.

no_disasm
Disable shader disassembler logic (still outputs text just
doesn't use LLVM to decode). Useful if the linked llvm-dev doesn't
support the hardware being debugged. Avoids segfaults/asserts.

disasm_anyways
Enable shader disassembly in --waves even if the rings
aren't halted.

wave64
Enable full 64 wave disassembly.

full_shader
Enable full shader disassembly in --waves when '-O bits' is
used and the shader is found in a gfx or compute ring.

no_fold_vm_decode
Disable folding of PDEs when VM decoding multiple pages of
memory. By default, when subsequent pages are decoded, PDEs that
match previous pages are omitted to cut down on the verbosity of
the output. This option disables this and prints the full chain
of PDEs for every page decoded.

no_scan_waves
Disable scanning wave data during --ring-stream output.

force_asic_file
Force using a database .asic file matching in pci.did instead
of IP discovery.


--bank, -b <se> <sh> <instance>
Select a GRBM se/sh/instance bank in decimal. Can use 'x' to
denote a broadcast selection.

--sbank, -sb <me> <pipe> <queue> [vmid]
Select a SRBM me/pipe/queue bank in decimal. VMID is optional
(default: 0).

--cbank, -cb <context_reg_bank>
Select a context register bank (value is multiplied by 0x1000).
Used for context registers in the range 0xA000..0xAFFF.

--config, -c
Print out configuration data read from kernel driver.

--enumerate, -e
Enumerate all AMDGPU supported devices.

--list-blocks, -lb
List all blocks attached to the asic that have been detected.

--list-regs, -lr <string>
List all registers in an IP block (can use '-O bits' to list
bitfields).


--lookup, -lu <address_or_regname> <number>
Look up an MMIO register by address and bitfield decode the
value specified (with 0x prefix) or by register name. The register
name string must include the ipname, e.g., uvd6.mmUVD_CONTEXT_ID.

--write, -w <string> <number>
Write a value specified in hex to a register specified with a
complete register path in the form <asicname.ipname.regname>.
For example, fiji.uvd6.mmUVD_CGC_GATE. The value of asicname
and/or ipname can be * to simplify scripting. This command can
be used multiple times to write to multiple registers in a single
invocation.

--writebit, -wb <string> <number>
Write a value specified in hex to a register bitfield specified
with a complete register path as in the --write command.

--read, -r <string>
Read a value from a register specified by a register path to
stdout. This command uses the same syntax as the --write command
but also allows * for the regname field to read an entire
block. Additionally, a * can be appended to a register name to
read any register that contains a partial match. For instance,
"*.vcn10.ADDR*" would read any register from the 'VCN10' block
which contains 'ADDR' in the name.
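
The wildcard rules described for --read can be sketched as a small matching
function. This is only one reading of the text above, not umr's
implementation: the function name path_matches is hypothetical, comparison
is assumed to be case sensitive, and the register names in the usage lines
(other than fiji.uvd6.mmUVD_CGC_GATE) are made up for illustration.

```python
def path_matches(pattern: str, path: str) -> bool:
    """Check an 'asicname.ipname.regname' path against a --read style pattern.

    Rules taken from the description above:
      * asicname and ipname may each be '*' to match anything
      * regname '*' selects every register in the block
      * a regname ending in '*' matches any register containing that text
    Assumption: names are compared exactly, with no case folding.
    """
    p_asic, p_ip, p_reg = pattern.split(".")
    asic, ip, reg = path.split(".")
    if p_asic != "*" and p_asic != asic:
        return False
    if p_ip != "*" and p_ip != ip:
        return False
    if p_reg == "*":
        return True
    if p_reg.endswith("*"):
        return p_reg[:-1] in reg
    return p_reg == reg


if __name__ == "__main__":
    # Hypothetical register names, used only to exercise the rules.
    print(path_matches("*.vcn10.ADDR*", "raven1.vcn10.mmFOO_ADDR_LO"))
    print(path_matches("*.vcn10.ADDR*", "raven1.vcn10.mmFOO_DATA"))
    print(path_matches("fiji.uvd6.*", "fiji.uvd6.mmUVD_CGC_GATE"))
```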

--scan, -s <string>
Scan and print an IP block by name, for example, uvd6 or
carrizo.uvd6. Can be used multiple times in a single invocation.


--top, -t
Summarize GPU utilization. Can select a SE block with --bank.
Relevant options that apply are: use_colour and use_pci.

--waves, -wa [ <ring_name> | <vmid>@<addr>.<size> ]
Print out information about any active CU waves. Note that if
GFX power gating is enabled this command may result in a GPU
hang. It's unlikely unless you're invoking it very rapidly.
Unlike the wave count reading in --top this command will operate
regardless of whether GFX PG is enabled or not. Can use '-O bits'
to decode the wave bitfields. An optional ring name can be
specified (default: gfx) to search for pointers to active shaders
to find extra debugging information. Alternatively, an IB can be
specified by a vmid, address, and size (in hex bytes) triplet.

--profiler, -prof [pixel= | vertex= | compute=]<nsamples> [ring]
Capture 'nsamples' samples of wave data. Optionally specify a
ring to use when searching for IBs that point to shaders. Defaults
to 'gfx'. Additionally, a shader type can be selected to profile
only shaders of that type.


VMIDs are specified in umr as 16 bit numbers where the lower 8 bits
indicate the hardware VMID and the upper 8 bits indicate which VM
space to use.

0 - GFX hub

1 - MM hub

2 - VC0 hub

3 - VC1 hub


For instance, 0x107 would specify the 7'th VMID on the MM hub.
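
The 16 bit VMID layout described above can be packed and unpacked with two
shifts. The helper names encode_vmid and decode_vmid are hypothetical and
only illustrate the arithmetic.

```python
from typing import Tuple

# Hub numbers as listed above (upper 8 bits of the 16 bit VMID).
HUBS = {0: "GFX hub", 1: "MM hub", 2: "VC0 hub", 3: "VC1 hub"}


def encode_vmid(hub: int, hw_vmid: int) -> int:
    """Pack a hub number and a hardware VMID into one 16 bit value."""
    if not 0 <= hub <= 0xFF or not 0 <= hw_vmid <= 0xFF:
        raise ValueError("hub and hardware VMID must each fit in 8 bits")
    return (hub << 8) | hw_vmid


def decode_vmid(vmid: int) -> Tuple[int, int]:
    """Return (hub, hardware VMID) from a 16 bit umr VMID."""
    return (vmid >> 8) & 0xFF, vmid & 0xFF


if __name__ == "__main__":
    value = encode_vmid(1, 7)
    hub, hw = decode_vmid(value)
    print(hex(value), HUBS[hub], hw)
```

Encoding hub 1 with hardware VMID 7 gives 0x107, matching the example in
the text.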



--vm-decode, -vm vmid@<address> <num_of_pages>
Decode page mappings at a specified address (in hex) from the
VMID specified. The VMID can be specified in hexadecimal (with
leading '0x') or in decimal. Implies '-O verbose' for the duration
of the command so does not require it to be manually specified.

--vm-read, -vr [vmid@]<address> <size>
Read 'size' bytes (in hex) from the address specified (in
hexadecimal) from VRAM to stdout. Optionally specify the VMID (in
decimal or in hex with a 0x prefix) treating the address as a
virtual address instead. Can use 'use_pci' to directly access
VRAM.

--vm-write, -vw [vmid@]<address> <size>
Write 'size' bytes (in hex) to the address specified (in
hexadecimal) to VRAM from stdin.

--vm-write-word, -vww [vmid@]<address> <data>
Write a 32-bit word 'data' (in hex) to a given address (in hex)
in host machine order.

--vm-disasm, -vdis [<vmid>@]<address> <size>
Disassemble 'size' bytes (in hex) from a given address (in hex).
The size can be specified as zero to have umr try and compute
the shader size.


--ring-stream, -RS <string>[range]
Read the contents of the ring named by the string
amdgpu_ring_<string>, i.e. without the amdgpu_ring prefix. By
default it reads and prints the entire ring. A range is optional
and has the format '[start:end]'. The starting and ending
address are non-negative integers or the '.' (dot) symbol, which
indicates the rptr when on the left side and wptr when on the
right side of the range. For instance, "-RS gfx" prints the entire
gfx ring, "-RS gfx[0:16]" prints the contents from 0 to 16
inclusively, and "-RS gfx[.]" or "-RS gfx[.:.]" prints the range
[rptr,wptr]. When one of the range limits is a number while the
other is the dot, '.', then the number indicates the relative
range before or after the corresponding ring pointer. For instance,
"-RS sdma0[16:.]" prints [wptr-16, wptr] words of the
SDMA0 ring, and "-RS sdma1[.:32]" prints [rptr, rptr+32] double-words
of the SDMA1 ring. The contents of the ring are always
interpreted, if they can be interpreted.
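
The range rules above can be summarised in a short sketch that turns a ring
argument into an inclusive (start, end) pair. It is not umr's code: the
function name resolve_ring_range is hypothetical, numbers are assumed to be
decimal, and wrap-around at the end of the ring is deliberately ignored.

```python
import re
from typing import Tuple

# Ring name followed by an optional bracketed range, e.g. "gfx[0:16]"
_ARG_RE = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<range>[^\[\]]*)\])?$")


def resolve_ring_range(arg: str, rptr: int, wptr: int,
                       ring_words: int) -> Tuple[str, int, int]:
    """Return (ring name, start, end) with inclusive bounds.

    Assumptions: decimal numbers, no wrap-around handling, and a ring of
    ring_words entries indexed from 0.
    """
    match = _ARG_RE.match(arg)
    if match is None:
        raise ValueError(f"malformed ring argument: {arg!r}")
    name, rng = match.group("name"), match.group("range")
    if rng is None:                      # no range: the entire ring
        return name, 0, ring_words - 1
    if rng == ".":                       # "[.]" is shorthand for "[.:.]"
        return name, rptr, wptr
    left, sep, right = rng.partition(":")
    if not sep:
        raise ValueError(f"range must look like start:end, got {rng!r}")
    if left == "." and right == ".":
        return name, rptr, wptr
    if left == ".":                      # "[.:n]" is rptr .. rptr+n
        return name, rptr, rptr + int(right)
    if right == ".":                     # "[n:.]" is wptr-n .. wptr
        return name, wptr - int(left), wptr
    return name, int(left), int(right)   # two absolute positions


if __name__ == "__main__":
    for example in ("gfx", "gfx[0:16]", "gfx[.]", "sdma0[16:.]", "sdma1[.:32]"):
        print(example, resolve_ring_range(example, rptr=8, wptr=40, ring_words=1024))
```

The rptr, wptr, and ring size passed in the usage lines are arbitrary values
chosen for illustration.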

--dump-ib, -di [vmid@]address length [pm]
Dump an IB packet at an address with an optional VMID. The
length is specified in bytes. The type of decoder <pm> is optional
and defaults to PM4 packets. Can specify '3' for SDMA
packets, '2' for MES packets.

--dump-ib-file, -df filename [pm]
Dump an IB stored in a file as a series of hexadecimal DWORDS
one per line. Optionally supply a PM type, can specify '2' for
MES packets, '3' for SDMA IBs, or '4' for PM4 IBs. The default
is PM4.
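
The file layout expected by --dump-ib-file (one hexadecimal DWORD per line)
can be checked before handing a file to umr. The sketch below is a
standalone validator, not part of umr; the name parse_ib_lines is
hypothetical, and accepting an optional 0x prefix and blank lines is an
assumption of this sketch rather than documented umr behaviour.

```python
from typing import Iterable, List


def parse_ib_lines(lines: Iterable[str]) -> List[int]:
    """Parse one hexadecimal DWORD per line into a list of integers.

    Assumptions: blank lines are skipped and a leading '0x' is tolerated.
    """
    words: List[int] = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        value = int(text, 16)  # raises ValueError on non-hex text
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"line {number}: {text!r} does not fit in 32 bits")
        words.append(value)
    return words


if __name__ == "__main__":
    sample = ["C0001000\n", "0x00000001\n", "\n"]
    print([f"{word:08X}" for word in parse_ib_lines(sample)])
```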

--header-dump, -hd [HEADER_DUMP_reg]
Dump the contents of the HEADER_DUMP buffer and decode the opcode
into a human readable string.

--print-cpc, -cpc
Dump CPC register data.

--print-sdma, -sdma
Dump SDMA register data.

--logscan, -ls
Read and display contents of the MMIO register log. Usually
specified with '-O bits,follow,empty_log' to enable continual
dumping of the trace log.


--power, -p
Read clocks, temperature, and GPU load at runtime. Use the
'use_colour' option to colourize the output.

--clock-scan, -cs [clock]
Scan the current hierarchy value of each clock. By default, lists
the hierarchy values of all clocks; otherwise lists only the
specified clock, e.g. sclk.

--clock-manual, -cm [clock] [value]
Set the value of the corresponding clock. Use the -cs command to
check the hierarchy values of a clock and then use -cm to set
the clock.

--clock-high, -ch
Set power_dpm_force_performance_level to high.

--clock-low, -cl
Set power_dpm_force_performance_level to low.

--clock-auto, -ca
Set power_dpm_force_performance_level to auto.

--ppt-read, -pptr [ppt_field_name]
Read powerplay table value and print it to stdout. This command
prints all the powerplay table information, or only the field
named by ppt_field_name if one is given.

--gpu-metrics, -gm
Print the GPU metrics table for the device.


- The "Waves" field in the DRM section of --top only works if GFX PG
has been disabled. Otherwise, GPU hangs occur frequently. When PG is
enabled it will read a constant 0.


UMR_LOGGER
Directory to output "umr.log" file when capturing samples with the
--top command.

UMR_DATABASE_PATH
Should be set to the top directory of the database tree used for
register, IP, and ASIC model data.


${CMAKE_INSTALL_PREFIX}/share/bash-completion/completions/umr contains
completion for bash shells. You'd normally source this file in your
~/.bashrc.

${CMAKE_INSTALL_PREFIX}/share/umr/database contains database files for
ASICs, IPs, and registers. UMR_DATABASE_PATH is usually set to point
to here.



AMD (c) 2022 February 2022 UMR(1)