UMR(1) User Manuals UMR(1)

umr - AMDGPU Userspace Register Debugger

umr is a tool to read and display, as well as write to, AMDGPU device
MMIO, PCIE, SMC, and DIDT registers from userspace. It can autodetect
and scan AMDGPU devices (SI and up).

--database-path, -dbp <path>
    Specify a database path for register, IP, and ASIC model data.

--gpu, -g <asicname>(@<instance> | =<pcidevice>)
    Select a GPU by ASIC name and either the instance number or the
    PCI bus identifier. For instance, "raven1@1" would pick the
    raven1 device in the 2nd DRI instance slot. Similarly,
    "raven1=0000:06:00.0" would pick a raven1 device with the PCI
    bus address '0000:06:00.0'.
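The two --gpu selector forms can be illustrated with a short Python sketch. It is not umr's own parser: GpuSelector and parse_gpu_selector are hypothetical names, used here only to show how an argument splits into an ASIC name plus either an instance number or a PCI address.

```python
from typing import NamedTuple, Optional


class GpuSelector(NamedTuple):
    asic: str
    instance: Optional[int]  # set for the "<asicname>@<instance>" form
    pci: Optional[str]       # set for the "<asicname>=<pcidevice>" form


def parse_gpu_selector(text: str) -> GpuSelector:
    """Split a --gpu style argument (hypothetical helper, not part of umr)."""
    if "@" in text:
        asic, instance = text.split("@", 1)
        return GpuSelector(asic, int(instance), None)
    if "=" in text:
        asic, pci = text.split("=", 1)
        return GpuSelector(asic, None, pci)
    raise ValueError(f"expected name@instance or name=pcidevice, got {text!r}")


if __name__ == "__main__":
    print(parse_gpu_selector("raven1@1"))
    print(parse_gpu_selector("raven1=0000:06:00.0"))
```

Note that instance 1 is the second DRI slot, because instances are counted from 0.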

--instance, -i <number>
    Pick a device instance to work with. Defaults to device 0. The
    instance refers to a directory under /sys/kernel/debug/dri/,
    where 0 is the first card probed.

--force, -f <number>
    Force a PCIE device ID in hex or by ASIC name. This is used in
    case the amdgpu driver is not yet loaded or a display is not yet
    attached. A '.' prefix specifies a virtual device, which is
    handy for looking up register decodings for a device not present
    in the system, for instance, '.vega10'.

--pci <device>
    Force a specific PCI device using the domain:bus:slot.function
    format in hex. This is useful when more than one GPU is
    available. If the amdgpu driver is loaded, the corresponding
    instance is detected automatically.
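As a rough illustration of the domain:bus:slot.function format, the Python sketch below splits an address into its four hexadecimal fields. The field widths (4, 2, 2, and 1 digits) follow the conventional Linux notation and are an assumption here, as is the helper name parse_pci_address; umr itself may accept other spellings.

```python
import re

# Conventional Linux PCI address layout, e.g. "0000:06:00.0" (assumed widths).
_PCI_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_pci_address(text):
    """Return (domain, bus, slot, function) as integers (hypothetical helper)."""
    match = _PCI_RE.match(text)
    if match is None:
        raise ValueError(f"not a domain:bus:slot.function address: {text!r}")
    return tuple(int(field, 16) for field in match.groups())


if __name__ == "__main__":
    print(parse_pci_address("0000:06:00.0"))
```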

--gfxoff, -go <0 | 1>
    Turn GFXOFF on or off on select hardware. A non-zero value
    enables the GFXOFF feature and a zero value disables it.

--vm_partition, -vmp <-1, 0...n>
    Select a VM partition for all GPUVM accesses. The default is -1,
    which refers to the 0th instance of the VM hub; this is not the
    same as specifying '0'. Values above -1 are for ASICs with
    multiple IP instances.

--option, -O <string>[,<string>,...]
    Specify options to the tool. Multiple options can be given as
    comma-separated strings. Options should be specified before the
    commands they affect (--update and --force, among others) so
    that they are in effect when those commands run.

    quiet
        Disable various informative outputs that are not required
        for functionality.

    read_smc
        Enable scanning of SMC registers.

    bits
        Enable displaying bitfields for scanned blocks.

    bitsfull
        Enable displaying bitfields using their entire path for
        scanned blocks.

    empty_log
        Empty the MMIO log after reading it.

    follow
        Cause the --logscan command to repeatedly produce output
        without exiting.

    no_follow_ib
        Instruct the --ring-stream command to not attempt to follow
        IBs pointed to by the packets in the ring.

    use_pci
        Enable PCI access for MMIO instead of using debugfs. Used by
        the --read, --scan, --top, --write, and --writebit commands.
        Does not currently support multiple instances of the same
        GPU (PCI device ID). Note that access to non-MMIO registers
        might be disabled when using this flag.

    use_colour
        Enable colour output for the --top command; the scale runs
        from blue through green and yellow to red. 'use_color' is
        also accepted.

    no_kernel
        Disable using kernel files to access the device. Implies
        'use_pci'. This is meant to be used only if the KMD is hung
        or otherwise not working correctly. Using it on live systems
        may result in race conditions.

    verbose
        Enable verbose diagnostics (used in --vram).

    halt_waves
        Halt/resume all waves while reading wave status.

    disasm_early_term
        Terminate shader disassembly when the first s_endpgm is hit.
        This is required for older UMDs (or non-mesa UMDs) that
        don't use the quintuple 0xBF9F0000 to signal the true end of
        a shader.

    no_disasm
        Disable shader disassembler logic (text is still output, but
        LLVM is not used to decode). Useful if the linked llvm-dev
        doesn't support the hardware being debugged. Avoids
        segfaults/asserts.

    disasm_anyways
        Enable shader disassembly in --waves even if the rings
        aren't halted.

    wave64
        Enable full 64 wave disassembly.

    full_shader
        Enable full shader disassembly in --waves when '-O bits' is
        used and the shader is found in a gfx or compute ring.

    no_fold_vm_decode
        Disable folding of PDEs when VM decoding multiple pages of
        memory. By default, when subsequent pages are decoded, PDEs
        that match previous pages are omitted to cut down on the
        verbosity of the output. This option disables that behaviour
        and prints the full chain of PDEs for every page decoded.

    no_scan_waves
        Disable scanning wave data during --ring-stream output.


--bank, -b <se> <sh> <instance>
    Select a GRBM se/sh/instance bank in decimal. Can use 'x' to
    denote a broadcast selection.

--sbank, -sb <me> <pipe> <queue> [vmid]
    Select a SRBM me/pipe/queue bank in decimal. VMID is optional
    (default: 0).

--cbank, -cb <context_reg_bank>
    Select a context register bank (value is multiplied by 0x1000).
    Used for context registers in the range 0xA000..0xAFFF.


--config, -c
    Print out configuration data read from the kernel driver.

--enumerate, -e
    Enumerate all AMDGPU supported devices.

--list-blocks, -lb
    List all blocks attached to the ASIC that have been detected.

--list-regs, -lr <string>
    List all registers in an IP block (can use '-O bits' to list
    bitfields).


--lookup, -lu <address_or_regname> <number>
    Look up an MMIO register by address (with 0x prefix) or by
    register name, and decode the bitfields of the value specified.
    The register name string must include the ipname, e.g.,
    uvd6.mmUVD_CONTEXT_ID.

--write, -w <string> <number>
    Write a value specified in hex to a register specified with a
    complete register path in the form <asicname.ipname.regname>.
    For example, fiji.uvd6.mmUVD_CGC_GATE. The value of asicname
    and/or ipname can be * to simplify scripting. This command can
    be used multiple times to write to multiple registers in a
    single invocation.

--writebit, -wb <string> <number>
    Write a value specified in hex to a register bitfield specified
    with a complete register path as in the --write command.

--read, -r <string>
    Read a value from a register specified by a register path to
    stdout. This command uses the same syntax as the --write
    command but also allows * for the regname field to read an
    entire block. Additionally, a * can be appended to a register
    name to read any register that contains a partial match. For
    instance, "*.vcn10.ADDR*" would read any register from the
    'VCN10' block which contains 'ADDR' in the name.
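The wildcard rules for register paths can be sketched in Python as follows. This is only an illustration of the behaviour described above, not umr's implementation: reg_matches is a hypothetical helper, the comparison is assumed to be case-sensitive, and register names such as mmEXAMPLE_ADDR_LO are made up for the example.

```python
def reg_matches(pattern, asic, ip, reg):
    """Check a register against an <asicname.ipname.regname> pattern.

    Rules taken from the text: asicname and ipname may be "*", regname may
    be "*" to select a whole block, and a trailing "*" on regname selects
    any register whose name contains the given text.
    """
    p_asic, p_ip, p_reg = pattern.split(".", 2)
    if p_asic != "*" and p_asic != asic:
        return False
    if p_ip != "*" and p_ip != ip:
        return False
    if p_reg == "*":
        return True                 # whole block
    if p_reg.endswith("*"):
        return p_reg[:-1] in reg    # partial (substring) match
    return p_reg == reg             # exact register name


if __name__ == "__main__":
    # Hypothetical register names, for illustration only.
    for name in ("mmEXAMPLE_ADDR_LO", "mmEXAMPLE_DATA"):
        print(name, reg_matches("*.vcn10.ADDR*", "raven1", "vcn10", name))
```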

--scan, -s <string>
    Scan and print an IP block by name, for example, uvd6 or
    carrizo.uvd6. Can be used multiple times in a single invocation.


--top, -t
    Summarize GPU utilization. Can select a SE block with --bank.
    Relevant options that apply are: use_colour and use_pci.

--waves, -wa [ <ring_name> | <vmid>@<addr>.<size> ]
    Print out information about any active CU waves. Note that if
    GFX power gating is enabled this command may result in a GPU
    hang, although this is unlikely unless it is invoked very
    rapidly. Unlike the wave count reading in --top, this command
    operates regardless of whether GFX PG is enabled. Use '-O bits'
    to decode the wave bitfields. An optional ring name can be
    specified (default: gfx) to search for pointers to active
    shaders to find extra debugging information. Alternatively, an
    IB can be specified by a vmid, address, and size (in hex bytes)
    triplet.

--profiler, -prof [pixel= | vertex= | compute=]<nsamples> [ring]
    Capture 'nsamples' samples of wave data. Optionally specify a
    ring to use when searching for IBs that point to shaders;
    defaults to 'gfx'. Additionally, a shader type can be selected
    to profile only shaders of that type.


VMIDs are specified in umr as 16-bit numbers where the lower 8 bits
indicate the hardware VMID and the upper 8 bits indicate which VM
space to use:

    0 - GFX hub
    1 - MM hub
    2 - VC0 hub
    3 - VC1 hub

For instance, 0x107 would specify VMID 7 on the MM hub.
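The 16-bit VMID layout can be made concrete with a small Python sketch. The helper names encode_vmid and decode_vmid and the HUBS table keys are assumptions made for this example; only the bit layout and the hub numbers come from the text above.

```python
# Hub numbers as listed above; the dictionary keys are illustrative labels.
HUBS = {"gfx": 0, "mm": 1, "vc0": 2, "vc1": 3}


def encode_vmid(hub, hw_vmid):
    """Pack a hub selector (upper 8 bits) and hardware VMID (lower 8 bits)."""
    if not 0 <= hw_vmid <= 0xFF:
        raise ValueError("hardware VMID must fit in 8 bits")
    return (HUBS[hub] << 8) | hw_vmid


def decode_vmid(vmid):
    """Return (hub_number, hardware_vmid) for a 16-bit umr VMID."""
    return (vmid >> 8) & 0xFF, vmid & 0xFF


if __name__ == "__main__":
    print(hex(encode_vmid("mm", 7)))  # prints 0x107
    print(decode_vmid(0x107))         # prints (1, 7)
```

A plain hardware VMID such as 7, with the upper byte left at zero, therefore selects the GFX hub.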


--vm-decode, -vm vmid@<address> <num_of_pages>
    Decode page mappings at a specified address (in hex) from the
    VMID specified. The VMID can be specified in hexadecimal (with
    leading '0x') or in decimal. Implies '-O verbose' for the
    duration of the command, so it does not need to be specified
    manually.

--vm-read, -vr [vmid@]<address> <size>
    Read 'size' bytes (in hex) from the address specified (in
    hexadecimal) from VRAM to stdout. Optionally specify the VMID
    (in decimal or in hex with a 0x prefix) to treat the address as
    a virtual address instead. Can use 'use_pci' to directly access
    VRAM.

--vm-write, -vw [vmid@]<address> <size>
    Write 'size' bytes (in hex) from stdin to the address specified
    (in hexadecimal) in VRAM.

--vm-write-word, -vww [vmid@]<address> <data>
    Write a 32-bit word 'data' (in hex) to a given address (in hex)
    in host machine byte order.

--vm-disasm, -vdis [<vmid>@]<address> <size>
    Disassemble 'size' bytes (in hex) from a given address (in hex).
    The size can be specified as zero to have umr try to compute
    the shader size.


--ring-stream, -RS <string>[range]
    Read the contents of the ring amdgpu_ring_<string>; the name is
    given without the amdgpu_ring_ prefix. By default it reads and
    prints the entire ring. A range is optional and has the format
    '[start:end]'. The starting and ending addresses are
    non-negative integers or the '.' (dot) symbol, which indicates
    the rptr when on the left side and the wptr when on the right
    side of the range. For instance, "-RS gfx" prints the entire gfx
    ring, "-RS gfx[0:16]" prints the contents from 0 to 16
    inclusively, and "-RS gfx[.]" or "-RS gfx[.:.]" prints the range
    [rptr, wptr]. When one of the range limits is a number while the
    other is the dot, '.', the number indicates the relative range
    before or after the corresponding ring pointer. For instance,
    "-RS sdma0[16:.]" prints the [wptr-16, wptr] double-words of the
    SDMA0 ring, and "-RS sdma1[.:32]" prints the [rptr, rptr+32]
    double-words of the SDMA1 ring. The ring contents are always
    decoded when they can be interpreted.
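The range rules above are easy to misread, so here is a minimal Python sketch of how a range could be resolved into inclusive start and end offsets. It is an illustration only: resolve_ring_range is a hypothetical helper, the rptr, wptr, and ring_size values are supplied by the caller, treating "the entire ring" as offsets 0 to ring_size-1 is an assumption, and ring wraparound is deliberately ignored.

```python
def resolve_ring_range(spec, rptr, wptr, ring_size):
    """Resolve "", "[a:b]", "[.]", "[.:.]", "[n:.]" or "[.:n]" to (start, end)."""
    if spec == "":
        return 0, ring_size - 1          # assumed meaning of "entire ring"
    if not (spec.startswith("[") and spec.endswith("]")):
        raise ValueError(f"malformed range: {spec!r}")
    body = spec[1:-1]
    if body == ".":
        start, end = ".", "."            # "[.]" is shorthand for "[.:.]"
    else:
        start, end = body.split(":", 1)  # raises ValueError if ":" is missing
    if start == "." and end == ".":
        return rptr, wptr
    if start == ".":
        return rptr, rptr + int(end)     # n double-words after rptr
    if end == ".":
        return wptr - int(start), wptr   # n double-words before wptr
    return int(start), int(end)          # absolute, inclusive


if __name__ == "__main__":
    for spec in ("", "[0:16]", "[.]", "[16:.]", "[.:32]"):
        print(repr(spec), resolve_ring_range(spec, rptr=100, wptr=200, ring_size=1024))
```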

--dump-ib, -di [vmid@]address length [pm]
    Dump an IB packet at an address with an optional VMID. The
    length is specified in bytes. The decoder type <pm> is optional
    and defaults to PM4 packets; specify '3' for SDMA packets.

--dump-ib-file, -df filename [pm]
    Dump an IB stored in a file as a series of hexadecimal DWORDs,
    one per line. Optionally supply a PM type: '3' for SDMA IBs or
    '4' for PM4 IBs. The default is PM4.
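A file in the "one hexadecimal DWORD per line" layout could be produced with a Python sketch like the one below. The helper format_ib_dwords is hypothetical, the sample values are arbitrary rather than a real packet, and writing bare 8-digit hex without a 0x prefix is an assumption; check what your umr build accepts before relying on it.

```python
def format_ib_dwords(dwords):
    """Render 32-bit words as text, one hexadecimal DWORD per line."""
    lines = []
    for value in dwords:
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"not a 32-bit value: {value!r}")
        lines.append(f"{value:08X}")   # assumed style: no 0x prefix
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Arbitrary sample data, not a meaningful PM4 or SDMA packet.
    sample = [0xC0032200, 0x00000001, 0xDEADBEEF]
    print(format_ib_dwords(sample), end="")
```

The resulting text can be saved to a file and passed as the filename argument.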

--header-dump, -hd [HEADER_DUMP_reg]
    Dump the contents of the HEADER_DUMP buffer and decode the
    opcode into a human readable string.

--logscan, -ls
    Read and display contents of the MMIO register log. Usually
    specified with '-O bits,follow,empty_log' to enable continual
    dumping of the trace log.


--power, -p
    Read the clocks, temperature, and GPU load at runtime. Use the
    'use_colour' option to colourize the output.

--clock-scan, -cs [clock]
    Scan the current hierarchy values of each clock. By default,
    the hierarchy values of all clocks are listed; otherwise, only
    the specified clock is listed, e.g. sclk.

--clock-manual, -cm [clock] [value]
    Set the value of the corresponding clock. Use the -cs command
    to check the hierarchy values of a clock, then use -cm with a
    value to set the clock.

--clock-high, -ch
    Set power_dpm_force_performance_level to high.

--clock-low, -cl
    Set power_dpm_force_performance_level to low.

--clock-auto, -ca
    Set power_dpm_force_performance_level to auto.

--ppt_read, -pptr [ppt_field_name]
    Read powerplay table values and print them to stdout. This
    command prints either all of the powerplay table information or
    the field matching the given string.

--gpu_metrics, -gm
    Print the GPU metrics table for the device.


- The "Waves" field in the DRM section of --top only works if GFX PG
  has been disabled. Otherwise, GPU hangs occur frequently. When PG is
  enabled, the field reads a constant 0.


UMR_LOGGER
    Directory in which to write the "umr.log" file when capturing
    samples with the --top command.

UMR_DATABASE_PATH
    Root directory of the database tree used for register, IP, and
    ASIC model data.


AMD (c) 2020 January 2020 UMR(1)