UMR(1) User Manuals UMR(1)

umr - AMDGPU Userspace Register Debugger

umr is a tool to read and display, as well as write to, AMDGPU device
MMIO, PCIE, SMC, and DIDT registers from userspace. It can autodetect
and scan AMDGPU devices (SI and up).

--database-path, -dbp <path>
       Specify a database path for register, IP, and ASIC model data.

--gpu, -g <asicname>(@<instance> | =<pcidevice>)
       Select a GPU by ASIC name and either the instance number or the
       PCI bus identifier. For instance, "raven1@1" would pick the
       raven1 device in the 2nd DRI instance slot. Similarly,
       "raven1=0000:06:00.0" would pick a raven1 device with the PCI
       bus address '0000:06:00.0'.

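The two selector forms accepted by --gpu can be composed mechanically.
The sketch below is illustrative only: the helper names are
hypothetical, and the zero-padded field widths of the PCI address follow
the usual Linux convention rather than anything umr is stated to
require.

```python
def pci_address(domain, bus, slot, function):
    """Format a PCI address as domain:bus:slot.function in hex.

    The 4/2/2/1 digit widths are an assumption (common Linux style).
    """
    return f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}"


def gpu_selector(asic, instance=None, pci=None):
    """Build a --gpu argument from an ASIC name plus exactly one of
    a DRI instance number or a PCI bus address."""
    if (instance is None) == (pci is None):
        raise ValueError("give exactly one of instance or pci")
    if instance is not None:
        return f"{asic}@{instance}"
    return f"{asic}={pci}"


print(gpu_selector("raven1", instance=1))
print(gpu_selector("raven1", pci=pci_address(0, 6, 0, 0)))
```

The two printed strings correspond to the "raven1@1" and
"raven1=0000:06:00.0" examples given for --gpu.
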
--instance, -i <number>
       Pick a device instance to work with. Defaults to the 0'th
       device. The instance refers to a directory under
       /sys/kernel/debug/dri/ where 0 is the first card probed.

--force, -f <number>
       Force a PCIE Device ID in hex or by ASIC name. This is used in
       case the amdgpu driver is not yet loaded or a display is not yet
       attached. A '.' prefix specifies a virtual device, which is
       handy for looking up register decodings for a device not present
       in the system, for instance, '.vega10'.

--pci <device>
       Force a specific PCI device using the domain:bus:slot.function
       format in hex. This is useful when more than one GPU is
       available. If the amdgpu driver is loaded, the corresponding
       instance will be automatically detected.

--gfxoff, -go <0 | 1>
       Turn GFXOFF on or off on select hardware. A non-zero value
       enables the GFXOFF feature and a zero value disables it.

--vm-partition, -vmp <-1, 0...n>
       Select a VM partition for all GPUVM accesses. The default is -1,
       which refers to the 0'th instance of the VM hub. This is not the
       same as specifying '0'. Values above -1 are for ASICs with
       multiple IP instances.

--vgpr-granularity, -vgpr <-1, 0...n>
       Specify the VGPR size granularity as a power of 2, e.g., '2'
       means 4 DWORDs per increment.

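The power-of-2 relationship for --vgpr-granularity can be written out as
a one-line calculation. This is a minimal sketch with a hypothetical
helper name. The value -1 is deliberately rejected because its meaning
is not described here.

```python
def vgpr_dwords_per_increment(granularity):
    """Return the DWORDs per VGPR increment for a granularity >= 0.

    The granularity is an exponent: 2 ** granularity DWORDs.
    """
    if granularity < 0:
        raise ValueError("only non-negative granularities are modeled")
    return 1 << granularity


for g in range(4):
    print(g, vgpr_dwords_per_increment(g))
```
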
--option, -O <string>[,<string>,...]
       Specify options to the tool. Multiple options can be given as
       comma-separated strings. Options should be specified before
       commands such as --update or --force (among others) so that
       they are in effect when those commands run.

       quiet
              Disable various informative outputs that are not required
              for functionality.

       read_smc
              Enable scanning of SMC registers.

       bits
              Enable displaying bitfields for scanned blocks.

       bitsfull
              Enable displaying bitfields using their entire path for
              scanned blocks.

       empty_log
              Empty the MMIO log after reading it.

       follow
              Cause the --logscan command to repeatedly produce output
              without exiting.

       no_follow_ib
              Instruct the --ring-stream command to not attempt to
              follow IBs pointed to by the packets in the ring.

       use_pci
              Enable PCI access for MMIO instead of using debugfs. Used
              by the --read, --scan, --top, --write, and --writebit
              commands. Does not currently support multiple instances
              of the same GPU (PCI device ID). Note that access to
              non-MMIO registers might be disabled when using this
              flag.

       use_colour
              Enable colour output for the --top command, scaling from
              blue, green, yellow, to red. 'use_color' is also
              accepted.

       no_kernel
              Disable using kernel files to access the device. Implies
              'use_pci'. This is meant to be used only if the KMD is
              hung or otherwise not working correctly. Using it on live
              systems may result in race conditions.

       verbose
              Enable verbose diagnostics (used in --vram).

       halt_waves
              Halt/resume all waves while reading wave status.

       disasm_early_term
              Terminate shader disassembly when the first s_endpgm is
              hit. This is required for older UMDs (or non-mesa UMDs)
              that don't use the quintuple 0xBF9F0000 to signal the
              true end of a shader.

       no_disasm
              Disable shader disassembler logic (still outputs text,
              but doesn't use LLVM to decode). Useful if the linked
              llvm-dev doesn't support the hardware being debugged.
              Avoids segfaults/asserts.

       disasm_anyways
              Enable shader disassembly in --waves even if the rings
              aren't halted.

       wave64
              Enable full 64 wave disassembly.

       full_shader
              Enable full shader disassembly in --waves when '-O bits'
              is used and the shader is found in a gfx or compute ring.

       no_fold_vm_decode
              Disable folding of PDEs when VM decoding multiple pages
              of memory. By default, when subsequent pages are decoded,
              PDEs that match previous pages are omitted to cut down
              on the verbosity of the output. This option disables
              that and prints the full chain of PDEs for every page
              decoded.

       no_scan_waves
              Disable scanning wave data during --ring-stream output.

       force_asic_file
              Force using a database .asic file matched in pci.did
              instead of IP discovery.

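Because options should precede the commands they affect, a wrapper
script can place the -O argument first when building the command line.
The helper below is a hypothetical sketch that only assembles the
argument list and does not run umr.

```python
def umr_argv(options, command):
    """Assemble an argument list with -O placed before the command.

    options: option names such as "bits" or "follow"
    command: the command and its arguments, e.g. ["--logscan"]
    """
    argv = ["umr"]
    if options:
        # Multiple options are joined into one comma-separated string.
        argv += ["-O", ",".join(options)]
    return argv + list(command)


print(umr_argv(["bits", "follow", "empty_log"], ["--logscan"]))
```

The resulting list can be handed to subprocess.run() unchanged, which
avoids any shell quoting concerns.
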
--bank, -b <se> <sh> <instance>
       Select a GRBM se/sh/instance bank in decimal. Can use 'x' to
       denote a broadcast selection.

--sbank, -sb <me> <pipe> <queue> [vmid]
       Select an SRBM me/pipe/queue bank in decimal. VMID is optional
       (default: 0).

--cbank, -cb <context_reg_bank>
       Select a context register bank (value is multiplied by 0x1000).
       Used for context registers in the range 0xA000..0xAFFF.

--config, -c
       Print out configuration data read from the kernel driver.

--enumerate, -e
       Enumerate all AMDGPU supported devices.

--list-blocks, -lb
       List all blocks attached to the ASIC that have been detected.

--list-regs, -lr <string>
       List all registers in an IP block (can use '-O bits' to list
       bitfields).

--lookup, -lu <address_or_regname> <number>
       Look up an MMIO register, either by address (with 0x prefix) or
       by register name, and bitfield decode the value specified. The
       register name string must include the ipname, e.g.,
       uvd6.mmUVD_CONTEXT_ID.

--write, -w <string> <number>
       Write a value specified in hex to a register specified with a
       complete register path in the form <asicname.ipname.regname>.
       For example, fiji.uvd6.mmUVD_CGC_GATE. The value of asicname
       and/or ipname can be * to simplify scripting. This command can
       be used multiple times to write to multiple registers in a
       single invocation.

--writebit, -wb <string> <number>
       Write a value specified in hex to a register bitfield specified
       with a complete register path as in the --write command.

--read, -r <string>
       Read a value from a register specified by a register path and
       print it to stdout. This command uses the same syntax as the
       --write command but also allows * for the regname field to read
       an entire block. Additionally, a * can be appended to a register
       name to read any register that contains a partial match. For
       instance, "*.vcn10.ADDR*" would read any register from the
       'VCN10' block which contains 'ADDR' in the name.

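The wildcard rules for --read can be approximated with a small model.
This is a rough sketch of the rules as described above, not umr's
actual matcher: it compares names case-sensitively, and the register
names in the example are made-up placeholders.

```python
def matches(pattern, path):
    """Approximate --read matching for asicname.ipname.regname paths.

    '*' matches any asicname or ipname; a regname of '*' matches the
    whole block; a trailing '*' on a regname matches any register
    whose name contains the text before the '*'.
    """
    p_asic, p_ip, p_reg = pattern.split(".", 2)
    asic, ip, reg = path.split(".", 2)
    if p_asic != "*" and p_asic != asic:
        return False
    if p_ip != "*" and p_ip != ip:
        return False
    if p_reg == "*":
        return True
    if p_reg.endswith("*"):
        return p_reg[:-1] in reg
    return p_reg == reg


# Hypothetical register names, used only to exercise the model.
registers = [
    "raven1.vcn10.mmEXAMPLE_ADDR_LO",
    "raven1.vcn10.mmEXAMPLE_CTRL",
    "raven1.gfx90.mmEXAMPLE_ADDR_HI",
]
print([r for r in registers if matches("*.vcn10.ADDR*", r)])
```
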
--scan, -s <string>
       Scan and print an IP block by name, for example, uvd6 or
       carrizo.uvd6. Can be used multiple times in a single invocation.

--top, -t
       Summarize GPU utilization. Can select an SE block with --bank.
       Relevant options are 'use_colour' and 'use_pci'.

--waves, -wa [ <ring_name> | <vmid>@<addr>.<size> ]
       Print out information about any active CU waves. Note that if
       GFX power gating is enabled this command may result in a GPU
       hang, although that is unlikely unless it is invoked very
       rapidly. Unlike the wave count reading in --top, this command
       operates regardless of whether GFX PG is enabled. Can use
       '-O bits' to decode the wave bitfields. An optional ring name
       can be specified (default: gfx) to search for pointers to active
       shaders to find extra debugging information. Alternatively, an
       IB can be specified by a vmid, address, and size (in hex bytes)
       triplet.

--profiler, -prof [pixel= | vertex= | compute=]<nsamples> [ring]
       Capture 'nsamples' samples of wave data. Optionally specify a
       ring to use when searching for IBs that point to shaders
       (default: 'gfx'). Additionally, a shader type can be selected
       to profile only that type of shader.

VMIDs are specified in umr as 16 bit numbers where the lower 8 bits
indicate the hardware VMID and the upper 8 bits indicate which VM
space to use.

       0 - GFX hub
       1 - MM hub
       2 - VC0 hub
       3 - VC1 hub

For instance, 0x107 would specify the 7'th VMID on the MM hub.

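The 16 bit VMID layout can be packed and unpacked with two shifts. The
sketch below follows the hub numbering listed above. The function and
dictionary names are hypothetical.

```python
# Hub numbers as listed above (upper 8 bits of the 16 bit VMID).
HUBS = {"gfx": 0, "mm": 1, "vc0": 2, "vc1": 3}


def encode_vmid(hub, hw_vmid):
    """Pack a hub name and a hardware VMID into umr's 16 bit form."""
    if not 0 <= hw_vmid <= 0xFF:
        raise ValueError("hardware VMID must fit in 8 bits")
    return (HUBS[hub] << 8) | hw_vmid


def decode_vmid(value):
    """Split a 16 bit VMID into (hub number, hardware VMID)."""
    return (value >> 8) & 0xFF, value & 0xFF


print(hex(encode_vmid("mm", 7)))  # 0x107
print(decode_vmid(0x107))         # (1, 7)
```
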
--vm-decode, -vm vmid@<address> <num_of_pages>
       Decode page mappings at a specified address (in hex) from the
       VMID specified. The VMID can be specified in hexadecimal (with
       leading '0x') or in decimal. Implies '-O verbose' for the
       duration of the command, so it does not need to be specified
       manually.

--vm-read, -vr [vmid@]<address> <size>
       Read 'size' bytes (in hex) from the address specified (in
       hexadecimal) from VRAM to stdout. Optionally specify the VMID
       (in decimal or in hex with a 0x prefix) to treat the address as
       a virtual address instead. Can use 'use_pci' to directly access
       VRAM.

--vm-write, -vw [vmid@]<address> <size>
       Write 'size' bytes (in hex) from stdin to the address specified
       (in hexadecimal) in VRAM.

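Since --vm-write takes its data from stdin and its size in hex, a script
has to prepare both together. The following is a sketch under
assumptions: the helper name is hypothetical, the words are packed
little-endian, and the target string is passed through as supplied
because the exact accepted address spelling is not detailed here.

```python
import struct


def vm_write_request(target, words):
    """Return (argv, payload) for a --vm-write of 32-bit words.

    target: caller-supplied "[vmid@]<address>" string, used as-is
    words:  32-bit values, packed little-endian (an assumption)
    """
    payload = struct.pack(f"<{len(words)}I", *words)
    # The size argument is the byte count written in hex.
    argv = ["umr", "--vm-write", target, format(len(payload), "x")]
    return argv, payload


argv, payload = vm_write_request("0x107@0x100000", [0] * 8)
print(argv, len(payload))
# To run it: subprocess.run(argv, input=payload, check=True)
```

Eight words give a 32 byte payload, so the size argument becomes "20".
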
--vm-write-word, -vww [vmid@]<address> <data>
       Write a 32-bit word 'data' (in hex) to a given address (in hex)
       in host machine order.

--vm-disasm, -vdis [<vmid>@]<address> <size>
       Disassemble 'size' bytes (in hex) from a given address (in hex).
       The size can be specified as zero to have umr try to compute
       the shader size.

--ring-stream, -RS <string>[range]
       Read the contents of the ring named by the string
       amdgpu_ring_<string>, i.e. without the amdgpu_ring prefix. By
       default it reads and prints the entire ring. A range is
       optional and has the format '[start:end]'. The starting and
       ending addresses are non-negative integers or the '.' (dot)
       symbol, which indicates the rptr when on the left side and the
       wptr when on the right side of the range. For instance,
       "-RS gfx" prints the entire gfx ring, "-RS gfx[0:16]" prints the
       contents from 0 to 16 inclusively, and "-RS gfx[.]" or
       "-RS gfx[.:.]" prints the range [rptr,wptr]. When one of the
       range limits is a number while the other is the dot, '.', the
       number indicates the relative range before or after the
       corresponding ring pointer. For instance, "-RS sdma0[16:.]"
       prints [wptr-16, wptr] words of the SDMA0 ring, and
       "-RS sdma1[.:32]" prints [rptr, rptr+32] double-words of the
       SDMA1 ring. The contents of the ring are always interpreted, if
       they can be interpreted.

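The range rules for --ring-stream can be summarized as a small resolver.
This is a sketch of the rules as described above, with a hypothetical
function name. It treats the numbers as decimal and does not model ring
wraparound, both of which are assumptions.

```python
def resolve_range(spec, rptr, wptr):
    """Resolve the text inside '[...]' to an inclusive (start, end).

    '.' on the left means rptr and on the right means wptr. A number
    paired with a dot is relative to the pointer named by the dot.
    """
    if spec == ".":
        spec = ".:."  # "[.]" is shorthand for "[.:.]"
    start, end = spec.split(":")
    if start == "." and end == ".":
        return rptr, wptr
    if start == ".":
        return rptr, rptr + int(end)    # e.g. "[.:32]"
    if end == ".":
        return wptr - int(start), wptr  # e.g. "[16:.]"
    return int(start), int(end)         # e.g. "[0:16]"


for spec in ("0:16", ".", "16:.", ".:32"):
    print(spec, resolve_range(spec, rptr=8, wptr=40))
```

With rptr=8 and wptr=40, "16:." resolves to (24, 40) and ".:32"
resolves to (8, 40).
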
--dump-ib, -di [vmid@]address length [pm]
       Dump an IB packet at an address with an optional VMID. The
       length is specified in bytes. The type of decoder <pm> is
       optional and defaults to PM4 packets. Can specify '3' for SDMA
       packets or '2' for MES packets.

--dump-ib-file, -df filename [pm]
       Dump an IB stored in a file as a series of hexadecimal DWORDs,
       one per line. Optionally supply a PM type: '2' for MES packets,
       '3' for SDMA IBs, or '4' for PM4 IBs. The default is PM4.

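The input expected by --dump-ib-file is plain text with one hexadecimal
DWORD per line, which is easy to generate from captured values. The
helper below is a hypothetical sketch. Writing bare eight-digit
lowercase hex without a 0x prefix is an assumption, and the DWORD values
are arbitrary placeholders rather than a meaningful packet stream.

```python
def ib_file_text(dwords):
    """Render 32-bit values as one hexadecimal DWORD per line."""
    return "".join(f"{d & 0xFFFFFFFF:08x}\n" for d in dwords)


# Placeholder values only; a real IB would come from a capture.
text = ib_file_text([0xC0001000, 0x00000000, 0x12345678])
print(text, end="")

# To save it for use with --dump-ib-file:
# with open("ib.txt", "w") as f:
#     f.write(text)
```
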
--header-dump, -hd [HEADER_DUMP_reg]
       Dump the contents of the HEADER_DUMP buffer and decode the
       opcode into a human readable string.

--print-cpc, -cpc
       Dump CPC register data.

--print-sdma, -sdma
       Dump SDMA register data.

--logscan, -ls
       Read and display contents of the MMIO register log. Usually
       specified with '-O bits,follow,empty_log' to enable continual
       dumping of the trace log.

--power, -p
       Read the clocks, temperature, and GPU load at runtime. Use the
       'use_colour' option to colourize output.

--clock-scan, -cs [clock]
       Scan the current hierarchy value of each clock. By default,
       lists the hierarchy values of all clocks. Otherwise, lists only
       the specified clock, e.g. sclk.

--clock-manual, -cm [clock] [value]
       Set the value of the corresponding clock. Use the -cs command
       to check the hierarchy values of a clock and then use -cm to
       set the clock.

--clock-high, -ch
       Set power_dpm_force_performance_level to high.

--clock-low, -cl
       Set power_dpm_force_performance_level to low.

--clock-auto, -ca
       Set power_dpm_force_performance_level to auto.

--ppt-read, -pptr [ppt_field_name]
       Read powerplay table values and print them to stdout. Prints
       either the entire powerplay table or only the field named by
       the given string.

--gpu-metrics, -gm
       Print the GPU metrics table for the device.

- The "Waves" field in the DRM section of --top only works if GFX PG
  has been disabled. Otherwise, GPU hangs occur frequently. When PG is
  enabled it will read a constant 0.

UMR_LOGGER
       Directory in which to write the "umr.log" file when capturing
       samples with the --top command.

UMR_DATABASE_PATH
       Should be set to the top directory of the database tree used for
       register, IP, and ASIC model data.

${CMAKE_INSTALL_PREFIX}/share/bash-completion/completions/umr contains
completion for bash shells. You'd normally source this file in your
~/.bashrc.

${CMAKE_INSTALL_PREFIX}/share/umr/database contains database files for
ASICs, IPs, and registers. UMR_DATABASE_PATH is usually set to point
here.

AMD (c) 2022 February 2022 UMR(1)