UMR(1)                           User Manuals                           UMR(1)

NAME

       umr - AMDGPU Userspace Register Debugger

DESCRIPTION

       umr is a tool to read and display, as well as write to, AMDGPU device
       MMIO, PCIE, SMC, and DIDT registers from userspace. It can autodetect
       and scan AMDGPU devices (SI and up).

Device Selection

       --gpu, -g <asicname>(@<instance> | =<pcidevice>)
              Select a GPU by ASIC name and either the instance number or the
              PCI bus identifier. For instance, "raven1@1" would pick the
              raven1 device in the 2nd DRI instance slot. Similarly,
              "raven1=0000:06:00.0" would pick a raven1 device with the PCI
              bus address '0000:06:00.0'.
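
The two selector forms accepted by --gpu can be illustrated with a short Python sketch. This is not umr's own parser; the names GpuSelector and parse_gpu_selector are hypothetical, and the strict PCI address pattern is an assumption based on the example above.

```python
import re
from typing import NamedTuple, Optional


class GpuSelector(NamedTuple):
    asic: str
    instance: Optional[int]  # set for the "<asicname>@<instance>" form
    pci: Optional[str]       # set for the "<asicname>=<pcidevice>" form


# Assumed shape of a PCI address: domain:bus:slot.function in hex.
_PCI_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def parse_gpu_selector(text: str) -> GpuSelector:
    """Split a --gpu argument into ASIC name plus instance or PCI address."""
    if "@" in text:
        asic, _, inst = text.partition("@")
        if not asic or not inst.isdecimal():
            raise ValueError(f"bad instance selector: {text!r}")
        return GpuSelector(asic, int(inst), None)
    if "=" in text:
        asic, _, pci = text.partition("=")
        if not asic or not _PCI_RE.match(pci):
            raise ValueError(f"bad PCI selector: {text!r}")
        return GpuSelector(asic, None, pci)
    raise ValueError(f"expected '@<instance>' or '=<pcidevice>': {text!r}")
```

Calling parse_gpu_selector("raven1@1") yields the ASIC name "raven1" with instance 1, and parse_gpu_selector("raven1=0000:06:00.0") yields the same ASIC name with the PCI address instead.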

       --instance, -i <number>
              Pick a device instance to work with. Defaults to device 0. The
              instance refers to a directory under /sys/kernel/debug/dri/,
              where 0 is the first card probed.

       --force, -f <number>
              Force a PCIE Device ID in hex or by ASIC name. This is used in
              case the amdgpu driver is not yet loaded or a display is not yet
              attached. A '@' prefix will specify a path to an NPI script, for
              instance, '@/home/user/device.npi'. A '.' prefix will specify a
              virtual device, which is handy for looking up register decodings
              for a device not present in the system, for instance, '.vega10'.

       --pci <device>
              Force a specific PCI device using the domain:bus:slot.function
              format in hex. This is useful when more than one GPU is
              available. If the amdgpu driver is loaded, the corresponding
              instance will be automatically detected.

       --update, -u <filename>
              Specify an update file to add, change, or delete registers from
              the register database. Can also use a '@' prefix to specify
              update commands on the command line. For instance, '@add reg
              raven1.gfx91.mmFoo 0x1234' would add a gfx mmio register. Useful
              for adding registers that are not included in the kernel
              headers.

       --gfxoff, -go <0 | 1>
              Turn GFXOFF on or off on select hardware. A non-zero value
              enables the GFXOFF feature and a zero value disables it.

       --option, -O <string>[,<string>,...]
              Specify options to the tool. Multiple options can be specified
              as comma separated strings. Options should be specified before
              commands such as --update or --force (among others) so that they
              are in effect when those commands run.

              quiet
                   Disable various informative outputs that are not required
                   for functionality.

              read_smc
                   Enable scanning of SMC registers.

              bits
                   Enable displaying bitfields for scanned blocks.

              bitsfull
                   Enable displaying bitfields using their entire path for
                   scanned blocks.

              empty_log
                   Empty the MMIO log after reading it.

              follow
                   Cause the --logscan command to repeatedly produce output
                   without exiting.

              no_follow_ib
                   Instruct the --ring command to not attempt to follow IBs
                   pointed to by the packets in the ring.

              named
                   Cause --read to print out the register name with the
                   register value.

              many
                   Allow matching regname openly (used with --read); implies
                   "named". For instance, "*.dce100.CRTC" would match any
                   register that contains the fragment "CRTC" in the name.

              use_pci
                   Enable PCI access for MMIO instead of using debugfs. Used
                   by the --read, --scan, --top, --write, and --writebit
                   commands. Does not currently support multiple instances of
                   the same GPU (PCI device ID). Note that access to non-MMIO
                   registers might be disabled when using this flag.

              use_colour
                   Enable colour output for the --top command; scales from
                   blue, green, yellow, to red. Also accepted is 'use_color'.

              no_kernel
                   Disable using kernel files to access the device. Implies
                   'use_pci'. This is meant to be used only if the KMD is hung
                   or otherwise not working correctly. Using it on live
                   systems may result in race conditions.

              verbose
                   Enable verbose diagnostics (used in --vram).

              halt_waves
                   Halt/resume all waves while reading wave status.

              disasm_early_term
                   Terminate shader disassembly when the first s_endpgm is
                   hit. This is required for older UMDs (or non-Mesa UMDs)
                   that don't use the quintuple 0xBF9F0000 to signal the true
                   end of a shader.

              no_disasm
                   Disable shader disassembler logic (still outputs text, but
                   doesn't use LLVM to decode). Useful if the linked llvm-dev
                   doesn't support the hardware being debugged. Avoids
                   segfaults/asserts.

              disasm_anyways
                   Enable shader disassembly in --waves even if the rings
                   aren't halted.

              wave64
                   Enable full 64 wave disassembly.

              full_shader
                   Enable full shader disassembly in --waves when '-O bits' is
                   used and the shader is found in a gfx or compute ring.

              no_fold_vm_decode
                   Disable folding of PDEs when VM decoding multiple pages of
                   memory. By default, when subsequent pages are decoded, PDEs
                   that match previous pages are omitted to cut down on the
                   verbosity of the output. This option disables that folding
                   and prints the full chain of PDEs for every page decoded.

              no_scan_waves
                   Disable scanning wave data during --ring output.

Bank Selection

       --bank, -b <se> <sh> <instance>
              Select a GRBM se/sh/instance bank in decimal. Can use 'x' to
              denote a broadcast selection.

       --sbank, -sb <me> <pipe> <queue> [vmid]
              Select a SRBM me/pipe/queue bank in decimal. VMID is optional
              (default: 0).

       --cbank, -cb <context_reg_bank>
              Select a context register bank (value is multiplied by 0x1000).
              Used for context registers in the range 0xA000..0xAFFF.

Device Information

       --config, -c
              Print out configuration data read from the kernel driver.

       --enumerate, -e
              Enumerate all AMDGPU supported devices.

       --list-blocks, -lb
              List all blocks attached to the ASIC that have been detected.

       --list-regs, -lr <string>
              List all registers in an IP block (can use '-O bits' to list
              bitfields).

Register Access

       --lookup, -lu <address_or_regname> <number>
              Look up an MMIO register by address (with 0x prefix) or by
              register name and bitfield decode the value specified. The
              register name string must include the ipname, e.g.,
              uvd6.mmUVD_CONTEXT_ID.

       --write, -w <string> <number>
              Write a value specified in hex to a register specified with a
              complete register path in the form <asicname.ipname.regname>.
              For example, fiji.uvd6.mmUVD_CGC_GATE. The value of asicname
              and/or ipname can be * to simplify scripting. This command can
              be used multiple times to write to multiple registers in a
              single invocation.

       --writebit, -wb <string> <number>
              Write a value specified in hex to a register bitfield specified
              with a complete register path as in the --write command.

       --read, -r <string>
              Read a value from a register specified by a register path to
              stdout. This command uses the same syntax as the --write
              command but also allows * for the regname field to read an
              entire block. Additionally, a * can be appended to a register
              name to read any register that contains a partial match. For
              instance, "*.vcn10.ADDR*" would read any register from the
              'VCN10' block which contains 'ADDR' in the name.
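
The wildcard rules described for --write and --read can be sketched in Python as follows. This is only an illustration of the rules as stated above, not umr's matching code; the function name path_matches and the register names in the usage note are hypothetical, and case-sensitive comparison is an assumption.

```python
def path_matches(pattern: str, asic: str, ip: str, reg: str) -> bool:
    """Check an asicname.ipname.regname pattern against one register."""
    # Exactly three dot-separated fields are expected; fewer raises ValueError.
    p_asic, p_ip, p_reg = pattern.split(".", 2)

    # '*' in the asicname or ipname field matches anything.
    if p_asic != "*" and p_asic != asic:
        return False
    if p_ip != "*" and p_ip != ip:
        return False

    # A bare '*' regname selects the entire block (read only).
    if p_reg == "*":
        return True
    # A trailing '*' selects any register whose name contains the fragment.
    if p_reg.endswith("*"):
        return p_reg[:-1] in reg
    # Otherwise the register name must match exactly.
    return p_reg == reg
```

With a made-up register named mmEXAMPLE_ADDR_LO in the vcn10 block, path_matches("*.vcn10.ADDR*", "raven1", "vcn10", "mmEXAMPLE_ADDR_LO") is true, while the same pattern does not match a register in a different block.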

       --scan, -s <string>
              Scan and print an IP block by name, for example, uvd6 or
              carrizo.uvd6. Can be used multiple times in a single invocation.

Device Utilization

       --top, -t
              Summarize GPU utilization. Can select an SE block with --bank.
              Relevant options that apply are: use_colour and use_pci.

       --waves, -wa [ <ring_name> | <vmid>@<addr>.<size> ]
              Print out information about any active CU waves. Note that if
              GFX power gating is enabled this command may result in a GPU
              hang. A hang is unlikely unless the command is invoked very
              rapidly. Unlike the wave count reading in --top, this command
              will operate regardless of whether GFX PG is enabled or not. Can
              use '-O bits' to decode the wave bitfields. An optional ring
              name can be specified (default: gfx) to search for pointers to
              active shaders to find extra debugging information.
              Alternatively, an IB can be specified by a vmid, address, and
              size (in hex bytes) triplet.

       --profiler, -prof [pixel= | vertex= | compute=]<nsamples> [ring]
              Capture 'nsamples' samples of wave data. Optionally specify a
              ring to use when searching for IBs that point to shaders.
              Defaults to 'gfx'. Additionally, a shader type (pixel, vertex,
              or compute) can be selected to profile only shaders of that
              type.

Virtual Memory Access

       VMIDs are specified in umr as 16-bit numbers where the lower 8 bits
       indicate the hardware VMID and the upper 8 bits indicate which VM
       space to use:

       0 - GFX hub

       1 - MM hub

       2 - VC0 hub

       3 - VC1 hub

       For instance, 0x107 would specify VMID 7 on the MM hub.
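
The 16-bit VMID layout above can be checked with a small Python sketch. The helper names make_vmid and split_vmid and the HUBS table are hypothetical and are not part of umr; they only restate the bit layout described here.

```python
from typing import Tuple

# Hub selector values as listed above (upper 8 bits of the 16-bit VMID).
HUBS = {0: "GFX hub", 1: "MM hub", 2: "VC0 hub", 3: "VC1 hub"}


def make_vmid(hub: int, hw_vmid: int) -> int:
    """Pack a hub selector and a hardware VMID into one 16-bit value."""
    if not 0 <= hub <= 0xFF or not 0 <= hw_vmid <= 0xFF:
        raise ValueError("hub and hw_vmid must each fit in 8 bits")
    return (hub << 8) | hw_vmid


def split_vmid(vmid: int) -> Tuple[int, int]:
    """Return (hub, hardware VMID) from a 16-bit umr VMID value."""
    return (vmid >> 8) & 0xFF, vmid & 0xFF
```

For example, make_vmid(1, 7) gives 0x107, and split_vmid(0x107) gives hub 1 (the MM hub) and hardware VMID 7.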

       --vm-decode, -vm vmid@<address> <num_of_pages>
              Decode page mappings at a specified address (in hex) from the
              VMID specified. The VMID can be specified in hexadecimal (with
              leading '0x') or in decimal. Implies '-O verbose' for the
              duration of the command, so it does not need to be specified
              manually.

       --vm-read, -vr [vmid@]<address> <size>
              Read 'size' bytes (in hex) from the address specified (in
              hexadecimal) from VRAM to stdout. Optionally specify the VMID
              (in decimal or in hex with a 0x prefix) to treat the address as
              a virtual address instead. Can use 'use_pci' to directly access
              VRAM.

       --vm-write, -vw [vmid@]<address> <size>
              Write 'size' bytes (in hex) to the address specified (in
              hexadecimal) to VRAM from stdin.

       --vm-write-word, -vww [vmid@]<address> <data>
              Write a 32-bit word 'data' (in hex) to a given address (in hex)
              in host machine byte order.

       --vm-disasm, -vdis [<vmid>@]<address> <size>
              Disassemble 'size' bytes (in hex) from a given address (in hex).
              The size can be specified as zero to have umr try to compute
              the shader size.

Ring and PM4 Decoding

       --ring, -R <string>[range]
              Read the contents of the ring named amdgpu_ring_<string>; the
              string is given without the amdgpu_ring_ prefix. By default it
              reads and prints the entire ring. A range is optional and has
              the format '[start:end]'. The starting and ending addresses are
              non-negative integers or the '.' (dot) symbol, which indicates
              the rptr when on the left side and the wptr when on the right
              side of the range. For instance, "-R gfx" prints the entire gfx
              ring, "-R gfx[0:16]" prints the contents from 0 to 16 inclusive,
              and "-R gfx[.]" or "-R gfx[.:.]" prints the range [rptr,wptr].
              When one of the range limits is a number while the other is the
              dot, '.', the number indicates the relative range before or
              after the corresponding ring pointer. For instance,
              "-R sdma0[16:.]" prints [wptr-16, wptr] words of the SDMA0 ring,
              and "-R sdma1[.:32]" prints [rptr, rptr+32] double-words of the
              SDMA1 ring. The contents of the ring are always decoded when
              they can be interpreted.
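
The range rules for --ring can be summarized in a Python sketch. This is an illustration of the rules as written above, not umr's implementation. The function name resolve_ring_range is hypothetical, the ring_size parameter (ring length in words) is an assumption needed to express "the entire ring", range numbers are assumed to be decimal, and wrap-around at the end of the ring is deliberately ignored.

```python
import re
from typing import Tuple

# "<name>" optionally followed by "[...]"; the name excludes the amdgpu_ring_ prefix.
_RING_ARG_RE = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<range>[^\[\]]*)\])?$")


def resolve_ring_range(arg: str, rptr: int, wptr: int,
                       ring_size: int) -> Tuple[str, int, int]:
    """Return (ring name, start, end) for a --ring argument, end inclusive."""
    m = _RING_ARG_RE.match(arg)
    if m is None:
        raise ValueError(f"malformed ring argument: {arg!r}")
    name, rng = m.group("name"), m.group("range")

    if rng is None:                      # "-R gfx": the entire ring
        return name, 0, ring_size - 1
    if rng == ".":                       # "-R gfx[.]": [rptr, wptr]
        return name, rptr, wptr

    left, sep, right = rng.partition(":")
    if not sep:
        raise ValueError(f"expected '[start:end]' or '[.]': {arg!r}")
    if left == "." and right == ".":     # "-R gfx[.:.]": [rptr, wptr]
        return name, rptr, wptr
    if right == ".":                     # "[N:.]": N words before wptr
        return name, wptr - int(left), wptr
    if left == ".":                      # "[.:N]": N words after rptr
        return name, rptr, rptr + int(right)
    return name, int(left), int(right)   # "[A:B]": absolute, inclusive
```

With rptr at 8 and wptr at 40, the argument "sdma0[16:.]" resolves to words 24 through 40, and "sdma1[.:32]" resolves to words 8 through 40.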

       --dump-ib, -di [vmid@]address length [pm]
              Dump an IB packet at an address with an optional VMID. The
              length is specified in bytes. The type of decoder <pm> is
              optional and defaults to PM4 packets. Can specify '3' for SDMA
              packets.

       --dump-ib-file, -df filename [pm]
              Dump an IB stored in a file as a series of hexadecimal DWORDS,
              one per line. Optionally supply a PM type; can specify '3' for
              SDMA IBs or '4' for PM4 IBs. The default is PM4.

       --header-dump, -hd [HEADER_DUMP_reg]
              Dump the contents of the HEADER_DUMP buffer and decode the
              opcode into a human readable string.

       --logscan, -ls
              Read and display contents of the MMIO register log. Usually
              specified with '-O bits,follow,empty_log' to enable continual
              dumping of the trace log.

Notes

       - The "Waves" field in the DRM section of --top only works if GFX PG
       has been disabled. Otherwise, GPU hangs occur frequently. When PG is
       enabled it will read a constant 0.

Environmental Variables

       UMR_LOGGER
           Directory in which to write the "umr.log" file when capturing
           samples with the --top command.

AMD (c) 2020                     January 2020                           UMR(1)