UMR(1)                           User Manuals                           UMR(1)

NAME

       umr - AMDGPU Userspace Register Debugger

DESCRIPTION

       umr is a tool to read, display, and write AMDGPU device MMIO, PCIE,
       SMC, and DIDT registers from userspace.  It can autodetect and scan
       AMDGPU devices (SI and up).

Device Selection

       --database-path, -dbp <path>
              Specify a database path for register, IP, and ASIC model data.

       --gpu, -g <asicname>(@<instance> | =<pcidevice>)
              Select a GPU by ASIC name and either the instance number or the
              PCI bus identifier.  For instance, "raven1@1" would pick the
              raven1 device in the 2nd DRI instance slot.  Similarly,
              "raven1=0000:06:00.0" would pick a raven1 device with the PCI
              bus address '0000:06:00.0'.
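       The two selector forms accepted by --gpu can be told apart by their
       separator.  The following is a minimal Python sketch of that split,
       not umr's own parser; the names GpuSelector and parse_gpu_selector
       are hypothetical, and treating the instance as a decimal integer is
       an assumption.

```python
from typing import NamedTuple, Optional


class GpuSelector(NamedTuple):
    asic: str
    instance: Optional[int]  # set for "<asicname>@<instance>"
    pci: Optional[str]       # set for "<asicname>=<pcidevice>"


def parse_gpu_selector(text: str) -> GpuSelector:
    """Split a --gpu argument into ASIC name plus instance or PCI address."""
    if "@" in text:
        asic, _, inst = text.partition("@")
        # Assumption: the instance number is written in decimal.
        return GpuSelector(asic, int(inst, 10), None)
    if "=" in text:
        asic, _, pci = text.partition("=")
        return GpuSelector(asic, None, pci)
    raise ValueError("expected <asicname>@<instance> or <asicname>=<pcidevice>")


if __name__ == "__main__":
    print(parse_gpu_selector("raven1@1"))
    print(parse_gpu_selector("raven1=0000:06:00.0"))
```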

       --instance, -i <number>
              Pick a device instance to work with.  Defaults to the 0'th
              device.  The instance refers to a directory under
              /sys/kernel/debug/dri/ where 0 is the first card probed.

       --force, -f <number>
              Force a PCIE Device ID in hex or by ASIC name.  This is used in
              case the amdgpu driver is not yet loaded or a display is not yet
              attached.  A '.' prefix will specify a virtual device, which is
              handy for looking up register decodings for a device not present
              in the system, for instance, '.vega10'.

       --pci <device>
              Force a specific PCI device using the domain:bus:slot.function
              format in hex.  This is useful when more than one GPU is
              available.  If the amdgpu driver is loaded, the corresponding
              instance will be automatically detected.

       --gfxoff, -go <0 | 1>
              Turn GFXOFF on or off on select hardware.  A non-zero value
              enables the GFXOFF feature and a zero value disables it.

       --vm_partition, -vmp <-1, 0...n>
              Select a VM partition for all GPUVM accesses.  The default is
              -1, which refers to the 0'th instance of the VM hub; this is not
              the same as specifying '0'.  Values above -1 are for ASICs with
              multiple IP instances.

       --option, -O <string>[,<string>,...]
              Specify options to the tool.  Multiple options can be specified
              as comma-separated strings.  Options should be specified before
              commands such as --update or --force (among others) so that
              they take effect for those commands.

              quiet
                   Disable various informative outputs that are not required
                   for functionality.

              read_smc
                   Enable scanning of SMC registers.

              bits
                   Enable displaying bitfields for scanned blocks.

              bitsfull
                   Enable displaying bitfields using their entire path for
                   scanned blocks.

              empty_log
                   Empty the MMIO log after reading it.

              follow
                   Cause the --logscan command to repeatedly produce output
                   without exiting.

              no_follow_ib
                   Instruct the --ring-stream command to not attempt to follow
                   IBs pointed to by the packets in the ring.

              use_pci
                   Enable PCI access for MMIO instead of using debugfs.  Used
                   by the --read, --scan, --top, --write, and --writebit
                   commands.  Does not currently support multiple instances of
                   the same GPU (PCI device ID).  Note that access to non-MMIO
                   registers might be disabled when using this flag.

              use_colour
                   Enable colour output for the --top command, which scales
                   from blue, green, yellow, to red.  Also accepted is
                   'use_color'.

              no_kernel
                   Disable using kernel files to access the device.  Implies
                   'use_pci'.  This is meant to be used only if the KMD is
                   hung or otherwise not working correctly.  Using it on live
                   systems may result in race conditions.

              verbose
                   Enable verbose diagnostics (used in --vram).

              halt_waves
                   Halt/resume all waves while reading wave status.

              disasm_early_term
                   Terminate shader disassembly when the first s_endpgm is
                   hit.  This is required for older UMDs (or non-mesa UMDs)
                   that don't use the quintuple 0xBF9F0000 to signal the true
                   end of a shader.

              no_disasm
                   Disable shader disassembler logic (still outputs text, just
                   doesn't use LLVM to decode).  Useful if the linked llvm-dev
                   doesn't support the hardware being debugged.  Avoids
                   segfaults/asserts.

              disasm_anyways
                   Enable shader disassembly in --waves even if the rings
                   aren't halted.

              wave64
                   Enable full 64 wave disassembly.

              full_shader
                   Enable full shader disassembly in --waves when '-O bits' is
                   used and the shader is found in a gfx or compute ring.

              no_fold_vm_decode
                   Disable folding of PDEs when VM decoding multiple pages of
                   memory.  By default, when subsequent pages are decoded,
                   PDEs that match previous pages are omitted to cut down on
                   the verbosity of the output.  This option disables this and
                   prints the full chain of PDEs for every page decoded.

              no_scan_waves
                   Disable scanning wave data during --ring-stream output.

              force_asic_file
                   Force using a database .asic file matching in pci.did
                   instead of IP discovery.

Bank Selection

       --bank, -b <se> <sh> <instance>
              Select a GRBM se/sh/instance bank in decimal.  Can use 'x' to
              denote a broadcast selection.

       --sbank, -sb <me> <pipe> <queue> [vmid]
              Select an SRBM me/pipe/queue bank in decimal.  VMID is optional
              (default: 0).

       --cbank, -cb <context_reg_bank>
              Select a context register bank (value is multiplied by 0x1000).
              Used for context registers in the range 0xA000..0xAFFF.

Device Information

       --config, -c
              Print out configuration data read from the kernel driver.

       --enumerate, -e
              Enumerate all AMDGPU supported devices.

       --list-blocks, -lb
              List all blocks attached to the ASIC that have been detected.

       --list-regs, -lr <string>
              List all registers in an IP block (can use '-O bits' to list
              bitfields).

Register Access

       --lookup, -lu <address_or_regname> <number>
              Look up an MMIO register by address (with 0x prefix) or by
              register name and bitfield decode the value specified.  The
              register name string must include the ipname, e.g.,
              uvd6.mmUVD_CONTEXT_ID.

       --write, -w <string> <number>
              Write a value specified in hex to a register specified with a
              complete register path in the form <asicname.ipname.regname>.
              For example, fiji.uvd6.mmUVD_CGC_GATE.  The value of asicname
              and/or ipname can be * to simplify scripting.  This command can
              be used multiple times to write to multiple registers in a
              single invocation.

       --writebit, -wb <string> <number>
              Write a value specified in hex to a register bitfield specified
              with a complete register path as in the --write command.

       --read, -r <string>
              Read a value from a register specified by a register path to
              stdout.  This command uses the same syntax as the --write
              command but also allows * for the regname field to read an
              entire block.  Additionally, a * can be appended to a register
              name to read any register that contains a partial match.  For
              instance, "*.vcn10.ADDR*" would read any register from the
              'VCN10' block which contains 'ADDR' in the name.
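       The wildcard rules for --read paths can be illustrated with a small
       matcher.  This is a Python sketch of the rules as worded above, not
       umr's implementation; the function names are hypothetical, and the
       case-sensitive comparison is an assumption (umr may compare names
       differently).

```python
def _field_matches(pattern: str, name: str) -> bool:
    """asicname and ipname match exactly, or anything when the pattern is '*'."""
    return pattern == "*" or pattern == name


def _reg_matches(pattern: str, regname: str) -> bool:
    """regname: '*' matches all, 'TEXT*' matches names containing TEXT."""
    if pattern == "*":
        return True
    if pattern.endswith("*"):
        return pattern[:-1] in regname
    return pattern == regname


def path_matches(pattern: str, asic: str, ip: str, reg: str) -> bool:
    """Check an 'asicname.ipname.regname' pattern against one register."""
    p_asic, p_ip, p_reg = pattern.split(".", 2)
    return (_field_matches(p_asic, asic)
            and _field_matches(p_ip, ip)
            and _reg_matches(p_reg, reg))


if __name__ == "__main__":
    print(path_matches("*.uvd6.*", "fiji", "uvd6", "mmUVD_CGC_GATE"))
```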

       --scan, -s <string>
              Scan and print an IP block by name, for example, uvd6 or
              carrizo.uvd6.  Can be used multiple times in a single
              invocation.

Device Utilization

       --top, -t
              Summarize GPU utilization.  Can select an SE block with --bank.
              Relevant options that apply are: use_colour and use_pci.

       --waves, -wa [ <ring_name> | <vmid>@<addr>.<size> ]
              Print out information about any active CU waves.  Note that if
              GFX power gating is enabled this command may result in a GPU
              hang, although that is unlikely unless you're invoking it very
              rapidly.  Unlike the wave count reading in --top, this command
              will operate regardless of whether GFX PG is enabled or not.
              Can use '-O bits' to decode the wave bitfields.  An optional
              ring name can be specified (default: gfx) to search for
              pointers to active shaders to find extra debugging information.
              Alternatively, an IB can be specified by a vmid, address, and
              size (in hex bytes) triplet.

       --profiler, -prof [pixel= | vertex= | compute=]<nsamples> [ring]
              Capture 'nsamples' samples of wave data.  Optionally specify a
              ring to use when searching for IBs that point to shaders;
              defaults to 'gfx'.  Additionally, a shader type can be selected
              to profile only that type of shader.

Virtual Memory Access

       VMIDs are specified in umr as 16-bit numbers where the lower 8 bits
       indicate the hardware VMID and the upper 8 bits indicate which VM
       space to use.

       0 - GFX hub

       1 - MM hub

       2 - VC0 hub

       3 - VC1 hub

       For instance, 0x107 would specify the 7'th VMID on the MM hub.
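       The 16-bit VMID layout described above can be packed and unpacked
       with simple shifts and masks.  A minimal Python sketch follows; the
       helper names encode_vmid and decode_vmid are hypothetical and only
       illustrate the bit layout.

```python
# Hub numbers as listed in this section.
HUBS = {0: "GFX hub", 1: "MM hub", 2: "VC0 hub", 3: "VC1 hub"}


def encode_vmid(hub: int, hw_vmid: int) -> int:
    """Pack the VM space (upper 8 bits) and hardware VMID (lower 8 bits)."""
    if not 0 <= hub <= 0xFF or not 0 <= hw_vmid <= 0xFF:
        raise ValueError("hub and hw_vmid must each fit in 8 bits")
    return (hub << 8) | hw_vmid


def decode_vmid(vmid: int) -> tuple:
    """Return (hub, hw_vmid) from a 16-bit umr VMID value."""
    return (vmid >> 8) & 0xFF, vmid & 0xFF


if __name__ == "__main__":
    hub, hw = decode_vmid(0x107)
    print(f"0x107 -> VMID {hw} on the {HUBS[hub]}")
```

       The encoded value can be passed wherever a vmid is accepted, for
       example as the vmid part of a --vm-read address.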

       --vm-decode, -vm vmid@<address> <num_of_pages>
              Decode page mappings at a specified address (in hex) from the
              VMID specified.  The VMID can be specified in hexadecimal (with
              leading '0x') or in decimal.  Implies '-O verbose' for the
              duration of the command, so it does not need to be specified
              manually.

       --vm-read, -vr [vmid@]<address> <size>
              Read 'size' bytes (in hex) from the address specified (in
              hexadecimal) from VRAM to stdout.  Optionally specify the VMID
              (in decimal or in hex with a 0x prefix) to treat the address as
              a virtual address instead.  Can use 'use_pci' to directly
              access VRAM.

       --vm-write, -vw [vmid@]<address> <size>
              Write 'size' bytes (in hex) from stdin to the address specified
              (in hexadecimal) in VRAM.

       --vm-write-word, -vww [vmid@]<address> <data>
              Write a 32-bit word 'data' (in hex) to a given address (in hex)
              in host machine order.

       --vm-disasm, -vdis [<vmid>@]<address> <size>
              Disassemble 'size' bytes (in hex) from a given address (in hex).
              The size can be specified as zero to have umr try to compute
              the shader size.

Ring and PM4 Decoding

       --ring-stream, -RS <string>[range]
              Read the contents of the ring file amdgpu_ring_<string>, where
              <string> is the ring name without the amdgpu_ring_ prefix.  By
              default it reads and prints the entire ring.  A range is
              optional and has the format '[start:end]'.  The starting and
              ending addresses are non-negative integers or the '.' (dot)
              symbol, which indicates the rptr when on the left side and the
              wptr when on the right side of the range.  For instance,
              "-RS gfx" prints the entire gfx ring, "-RS gfx[0:16]" prints
              the contents from 0 to 16 inclusively, and "-RS gfx[.]" or
              "-RS gfx[.:.]" prints the range [rptr, wptr].  When one of the
              range limits is a number while the other is the dot, '.', the
              number indicates the relative range before or after the
              corresponding ring pointer.  For instance, "-RS sdma0[16:.]"
              prints [wptr-16, wptr] words of the SDMA0 ring, and
              "-RS sdma1[.:32]" prints [rptr, rptr+32] double-words of the
              SDMA1 ring.  The contents of the ring are always interpreted,
              if they can be interpreted.
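       The range rules above can be sketched as a small resolver.  This
       Python example only mirrors the cases listed in the text and is not
       umr's parser: the function name is hypothetical, numbers are assumed
       to be decimal, and wrap-around at the end of the ring is not handled.

```python
import re
from typing import Optional, Tuple

# "<name>" optionally followed by "[<spec>]".
_RING_ARG = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<spec>[^\[\]]*)\])?$")


def resolve_ring_range(arg: str, rptr: int, wptr: int
                       ) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Return (ring_name, (first, last)), or (ring_name, None) for the whole ring."""
    m = _RING_ARG.match(arg)
    if not m:
        raise ValueError(f"malformed ring argument: {arg!r}")
    name, spec = m.group("name"), m.group("spec")
    if spec is None:
        return name, None                      # no range: entire ring
    if spec == ".":
        start, end = ".", "."                  # "[.]" is shorthand for "[.:.]"
    else:
        start, sep, end = spec.partition(":")
        if not sep or not start or not end:
            raise ValueError(f"malformed range: [{spec}]")
    if start == "." and end == ".":
        return name, (rptr, wptr)
    if start == ".":
        return name, (rptr, rptr + int(end))   # "[.:N]" is rptr .. rptr+N
    if end == ".":
        return name, (wptr - int(start), wptr) # "[N:.]" is wptr-N .. wptr
    return name, (int(start), int(end))        # absolute, inclusive


if __name__ == "__main__":
    for a in ("gfx", "gfx[0:16]", "gfx[.]", "sdma0[16:.]", "sdma1[.:32]"):
        print(a, "->", resolve_ring_range(a, rptr=100, wptr=200))
```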

       --dump-ib, -di [vmid@]address length [pm]
              Dump an IB packet at an address with an optional VMID.  The
              length is specified in bytes.  The type of decoder <pm> is
              optional and defaults to PM4 packets.  Can specify '3' for SDMA
              packets or '2' for MES packets.

       --dump-ib-file, -df filename [pm]
              Dump an IB stored in a file as a series of hexadecimal DWORDs,
              one per line.  Optionally supply a PM type: '2' for MES
              packets, '3' for SDMA IBs, or '4' for PM4 IBs.  The default is
              PM4.
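       An input file for --dump-ib-file holds one hexadecimal DWORD per
       line.  The Python sketch below writes such text from a list of
       32-bit values.  The helper name is hypothetical, and writing bare
       8-digit hex without a 0x prefix is an assumption, since this page
       does not say which spellings umr accepts.  The sample values are
       placeholders, not a valid packet.

```python
from typing import Iterable


def format_ib_dwords(words: Iterable[int]) -> str:
    """Render 32-bit values as hexadecimal text, one DWORD per line."""
    lines = []
    for w in words:
        if not 0 <= w <= 0xFFFFFFFF:
            raise ValueError(f"value does not fit in 32 bits: {w:#x}")
        # Assumption: bare 8-digit uppercase hex, no "0x" prefix.
        lines.append(f"{w:08X}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder data only; a real IB would come from a capture.
    print(format_ib_dwords([0x00000001, 0xDEADBEEF]), end="")
```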

       --header-dump, -hd [HEADER_DUMP_reg]
              Dump the contents of the HEADER_DUMP buffer and decode the
              opcode into a human-readable string.

       --print-cpc, -cpc
              Dump CPC register data.

       --print-sdma, -sdma
              Dump SDMA register data.

       --logscan, -ls
              Read and display the contents of the MMIO register log.
              Usually specified with '-O bits,follow,empty_log' to enable
              continual dumping of the trace log.

Power and Clock

       --power, -p
              Read clocks, temperature, and GPU load at runtime.  Use the
              'use_colour' option to colourize the output.

       --clock-scan, -cs [clock]
              Scan the current hierarchy values of each clock.  By default,
              the hierarchy values of all clocks are listed; otherwise only
              the specified clock is listed, e.g. sclk.

       --clock-manual, -cm [clock] [value]
              Set the value of the corresponding clock.  Use the -cs command
              to check the hierarchy values of a clock, then use -cm with a
              value to set the clock.

       --clock-high, -ch
              Set power_dpm_force_performance_level to high.

       --clock-low, -cl
              Set power_dpm_force_performance_level to low.

       --clock-auto, -ca
              Set power_dpm_force_performance_level to auto.

       --ppt-read, -pptr [ppt_field_name]
              Read powerplay table values and print them to stdout.  This
              command prints all of the powerplay table information, or only
              the field named by ppt_field_name if one is given.

       --gpu-metrics, -gm
              Print the GPU metrics table for the device.

Notes

       - The "Waves" field in the DRM section of --top only works if GFX PG
         has been disabled.  Otherwise, GPU hangs occur frequently.  When PG
         is enabled it will read a constant 0.

Environmental Variables

       UMR_LOGGER
           Directory in which to write the "umr.log" file when capturing
           samples with the --top command.

       UMR_DATABASE_PATH
           Should be set to the top directory of the database tree used for
           register, IP, and ASIC model data.

FILES

       ${CMAKE_INSTALL_PREFIX}/share/bash-completion/completions/umr contains
       completion for bash shells.  You'd normally source this file in your
       ~/.bashrc.

       ${CMAKE_INSTALL_PREFIX}/share/umr/database contains database files for
       ASICs, IPs, and registers.  UMR_DATABASE_PATH is usually set to point
       to here.

AMD (c) 2022                     February 2022                          UMR(1)