UMR(1)                           User Manuals                           UMR(1)

NAME

       umr - AMDGPU Userspace Register Debugger

DESCRIPTION

       umr is a userspace tool that reads, displays, and writes AMDGPU device
       MMIO, PCIE, SMC, and DIDT registers.  It can autodetect and scan
       AMDGPU devices (SI and up).

Device Selection

       --database-path, -dbp <path>
              Specify a database path for register, IP, and ASIC model data.

       --gpu, -g <asicname>(@<instance> | =<pcidevice>)
              Select a GPU by ASIC name and either the instance number or the
              PCI bus identifier.  For instance, "raven1@1" would pick the
              raven1 device in the 2nd DRI instance slot.  Similarly,
              "raven1=0000:06:00.0" would pick a raven1 device with the PCI
              bus address '0000:06:00.0'.

       --instance, -i <number>
              Pick a device instance to work with.  Defaults to instance 0.
              The instance refers to a directory under
              /sys/kernel/debug/dri/ where 0 is the first card probed.

       --force, -f <number>
              Force a PCIE Device ID in hex or by ASIC name.  This is used in
              case the amdgpu driver is not yet loaded or a display is not yet
              attached.  A '.' prefix specifies a virtual device, which is
              handy for looking up register decodings for a device not present
              in the system, for instance, '.vega10'.

       --pci <device>
              Force a specific PCI device using the domain:bus:slot.function
              format in hex.  This is useful when more than one GPU is
              available.  If the amdgpu driver is loaded, the corresponding
              instance will be automatically detected.

       --gfxoff, -go <0 | 1>
              Turn GFXOFF on or off on select hardware.  A non-zero value
              enables the GFXOFF feature and a zero value disables it.

       --vm_partition, -vmp <-1, 0...n>
              Select a VM partition for all GPUVM accesses.  Default is -1,
              which refers to the 0'th instance of the VM hub; this is not the
              same as specifying '0'.  Values above -1 are for ASICs with
              multiple IP instances.

       --option, -O <string>[,<string>,...]
              Specify options to the tool.  Multiple options can be specified
              as comma-separated strings.  Options should be specified before
              --update or --force commands (among others) so that they take
              effect.

              quiet
                   Disable informative outputs that are not required for
                   functionality.

              read_smc
                   Enable scanning of SMC registers.

              bits
                   Enable displaying bitfields for scanned blocks.

              bitsfull
                   Enable displaying bitfields using their entire path for
                   scanned blocks.

              empty_log
                   Empty the MMIO log after reading it.

              follow
                   Cause the --logscan command to repeatedly produce output
                   without exiting.

              no_follow_ib
                   Instruct the --ring-stream command to not attempt to
                   follow IBs pointed to by the packets in the ring.

              use_pci
                   Enable PCI access for MMIO instead of using debugfs.
                   Used by the --read, --scan, --top, --write, and
                   --writebit commands.  Does not currently support
                   multiple instances of the same GPU (PCI device ID).
                   Note that access to non-MMIO registers might be disabled
                   when using this flag.

              use_colour
                   Enable colour output for the --top command; the scale
                   runs from blue through green and yellow to red.
                   'use_color' is also accepted.

              no_kernel
                   Disable using kernel files to access the device.
                   Implies 'use_pci'.  This is meant to be used only if the
                   KMD is hung or otherwise not working correctly.  Using
                   it on live systems may result in race conditions.

              verbose
                   Enable verbose diagnostics (used in --vram).

              halt_waves
                   Halt/resume all waves while reading wave status.

              disasm_early_term
                   Terminate shader disassembly when the first s_endpgm is
                   hit.  This is required for older UMDs (or non-mesa UMDs)
                   that don't use the quintuple 0xBF9F0000 to signal the
                   true end of a shader.

              no_disasm
                   Disable shader disassembler logic (text is still output,
                   but LLVM is not used to decode it).  Useful if the
                   linked llvm-dev doesn't support the hardware being
                   debugged.  Avoids segfaults/asserts.

              disasm_anyways
                   Enable shader disassembly in --waves even if the rings
                   aren't halted.

              wave64
                   Enable full 64 wave disassembly.

              full_shader
                   Enable full shader disassembly in --waves when '-O bits'
                   is used and the shader is found in a gfx or compute ring.

              no_fold_vm_decode
                   Disable folding of PDEs when VM decoding multiple pages
                   of memory.  By default, when subsequent pages are
                   decoded, PDEs that match previous pages are omitted to
                   cut down on the verbosity of the output.  This option
                   disables that behaviour and prints the full chain of
                   PDEs for every page decoded.

              no_scan_waves
                   Disable scanning wave data during --ring-stream output.
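
       The comma-separated form of the -O argument, the 'use_color' alias,
       and the fact that 'no_kernel' implies 'use_pci' can be illustrated
       with a short Python sketch.  This is not umr's own parser:
       parse_options, KNOWN_OPTIONS, ALIASES, and IMPLIES are hypothetical
       names, and the option names are simply copied from the list above.

```python
# Illustrative sketch only; umr's real option handling may differ.
KNOWN_OPTIONS = {
    "quiet", "read_smc", "bits", "bitsfull", "empty_log", "follow",
    "no_follow_ib", "use_pci", "use_colour", "no_kernel", "verbose",
    "halt_waves", "disasm_early_term", "no_disasm", "disasm_anyways",
    "wave64", "full_shader", "no_fold_vm_decode", "no_scan_waves",
}
ALIASES = {"use_color": "use_colour"}   # alternate spelling accepted
IMPLIES = {"no_kernel": {"use_pci"}}    # no_kernel implies use_pci


def parse_options(arg):
    """Turn a string such as 'bits,follow' into a set of option names."""
    enabled = set()
    for name in arg.split(","):
        name = name.strip()
        if not name:
            continue                      # tolerate stray commas
        name = ALIASES.get(name, name)    # normalise the spelling
        if name not in KNOWN_OPTIONS:
            raise ValueError("unknown option: " + name)
        enabled.add(name)
        enabled |= IMPLIES.get(name, set())
    return enabled


if __name__ == "__main__":
    print(sorted(parse_options("bits,follow,empty_log")))
    print(sorted(parse_options("no_kernel,use_color")))
```

       Keeping the alias and implication tables as data makes it easy to
       see at a glance which options pull in others.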

Bank Selection

       --bank, -b <se> <sh> <instance>
              Select a GRBM se/sh/instance bank in decimal.  Use 'x' to
              denote a broadcast selection.

       --sbank, -sb <me> <pipe> <queue> [vmid]
              Select an SRBM me/pipe/queue bank in decimal.  VMID is
              optional (default: 0).

       --cbank, -cb <context_reg_bank>
              Select a context register bank (value is multiplied by
              0x1000).  Used for context registers in the range
              0xA000..0xAFFF.

Device Information

       --config, -c
              Print out configuration data read from the kernel driver.

       --enumerate, -e
              Enumerate all AMDGPU supported devices.

       --list-blocks, -lb
              List all blocks attached to the ASIC that have been detected.

       --list-regs, -lr <string>
              List all registers in an IP block (can use '-O bits' to list
              bitfields).

Register Access

       --lookup, -lu <address_or_regname> <number>
              Look up an MMIO register by address (with 0x prefix) or by
              register name and decode the bitfields of the value specified.
              The register name string must include the ipname, e.g.,
              uvd6.mmUVD_CONTEXT_ID.

       --write, -w <string> <number>
              Write a value specified in hex to a register specified with a
              complete register path in the form
              <asicname.ipname.regname>.  For example,
              fiji.uvd6.mmUVD_CGC_GATE.  The value of asicname and/or ipname
              can be * to simplify scripting.  This command can be used
              multiple times to write to multiple registers in a single
              invocation.

       --writebit, -wb <string> <number>
              Write a value specified in hex to a register bitfield specified
              with a complete register path as in the --write command.

       --read, -r <string>
              Read a value from a register specified by a register path and
              print it to stdout.  This command uses the same syntax as the
              --write command but also allows * for the regname field to read
              an entire block.  Additionally, a * can be appended to a
              register name to read any register that contains a partial
              match.  For instance, "*.vcn10.ADDR*" would read any register
              from the 'VCN10' block which contains 'ADDR' in the name.

       --scan, -s <string>
              Scan and print an IP block by name, for example, uvd6 or
              carrizo.uvd6.  Can be used multiple times in a single
              invocation.
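
       The register path matching described for --read can be sketched in
       Python as follows.  This is only an approximation of the documented
       behaviour, not umr's implementation: path_matches is a hypothetical
       helper, matching is assumed to be case-sensitive, and the register
       name mmEXAMPLE_ADDR_LO is invented for illustration.

```python
# Illustrative sketch of <asicname.ipname.regname> matching; the real
# tool may differ (for example in case handling).
def path_matches(pattern, asic, ip, reg):
    """Return True if the register asic.ip.reg is selected by pattern."""
    p_asic, p_ip, p_reg = pattern.split(".", 2)

    # asicname and ipname may each be '*' to match anything.
    if p_asic != "*" and p_asic != asic:
        return False
    if p_ip != "*" and p_ip != ip:
        return False

    if p_reg == "*":
        return True                  # read the entire block
    if p_reg.endswith("*"):
        return p_reg[:-1] in reg     # partial match anywhere in the name
    return p_reg == reg              # exact register name


if __name__ == "__main__":
    # mmEXAMPLE_ADDR_LO is a made-up register name.
    print(path_matches("*.vcn10.ADDR*", "raven1", "vcn10", "mmEXAMPLE_ADDR_LO"))
    print(path_matches("fiji.uvd6.mmUVD_CGC_GATE", "fiji", "uvd6", "mmUVD_CGC_GATE"))
```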

Device Utilization

       --top, -t
              Summarize GPU utilization.  Can select a SE block with --bank.
              Relevant options are use_colour and use_pci.

       --waves, -wa [ <ring_name> | <vmid>@<addr>.<size> ]
              Print out information about any active CU waves.  Note that if
              GFX power gating is enabled this command may result in a GPU
              hang, although a hang is unlikely unless the command is invoked
              very rapidly.  Unlike the wave count reading in --top, this
              command will operate regardless of whether GFX PG is enabled or
              not.  Can use '-O bits' to decode the wave bitfields.  An
              optional ring name can be specified (default: gfx) to search
              for pointers to active shaders to find extra debugging
              information.  Alternatively, an IB can be specified by a vmid,
              address, and size (in hex bytes) triplet.

       --profiler, -prof [pixel= | vertex= | compute=]<nsamples> [ring]
              Capture 'nsamples' samples of wave data.  Optionally specify a
              ring to use when searching for IBs that point to shaders;
              defaults to 'gfx'.  Additionally, a shader type (pixel, vertex,
              or compute) can be selected to profile only that type of
              shader.

Virtual Memory Access

       VMIDs are specified in umr as 16 bit numbers where the lower 8 bits
       indicate the hardware VMID and the upper 8 bits indicate which VM
       space to use.

       0 - GFX hub
       1 - MM hub
       2 - VC0 hub
       3 - VC1 hub

       For instance, 0x107 would specify VMID 7 on the MM hub.

       --vm-decode, -vm <vmid>@<address> <num_of_pages>
              Decode page mappings at a specified address (in hex) from the
              VMID specified.  The VMID can be specified in hexadecimal (with
              leading '0x') or in decimal.  Implies '-O verbose' for the
              duration of the command, so it does not need to be specified
              manually.

       --vm-read, -vr [vmid@]<address> <size>
              Read 'size' bytes (in hex) from the address specified (in
              hexadecimal) from VRAM to stdout.  Optionally specify the VMID
              (in decimal or in hex with a 0x prefix) to treat the address as
              a virtual address instead.  Can use 'use_pci' to directly
              access VRAM.

       --vm-write, -vw [vmid@]<address> <size>
              Write 'size' bytes (in hex) to the address specified (in
              hexadecimal) to VRAM from stdin.

       --vm-write-word, -vww [vmid@]<address> <data>
              Write a 32-bit word 'data' (in hex) to a given address (in hex)
              in host machine byte order.

       --vm-disasm, -vdis [<vmid>@]<address> <size>
              Disassemble 'size' bytes (in hex) from a given address (in
              hex).  The size can be specified as zero to have umr try to
              compute the shader size.
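
       The 16 bit VMID layout and the [vmid@]<address> argument form used
       by the commands above can be illustrated with a small Python sketch.
       It only restates the rules given in this section; split_vmid,
       make_vmid, and parse_vm_address are hypothetical helper names and
       are not part of umr.

```python
# Illustrative sketch of the VMID encoding described in this section.
HUBS = {0: "GFX", 1: "MM", 2: "VC0", 3: "VC1"}


def split_vmid(value):
    """Split a 16 bit umr VMID into (hub index, hardware VMID)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("VMID must fit in 16 bits")
    return (value >> 8) & 0xFF, value & 0xFF


def make_vmid(hub, hw_vmid):
    """Combine a hub index and a hardware VMID into one 16 bit value."""
    return ((hub & 0xFF) << 8) | (hw_vmid & 0xFF)


def parse_vm_address(arg):
    """Parse '[vmid@]address' into (vmid or None, address).

    The VMID is decimal unless it has a 0x prefix; the address is hex.
    """
    vmid = None
    addr_text = arg
    if "@" in arg:
        vmid_text, addr_text = arg.split("@", 1)
        if vmid_text.lower().startswith("0x"):
            vmid = int(vmid_text, 16)
        else:
            vmid = int(vmid_text, 10)
    return vmid, int(addr_text, 16)   # int(..., 16) accepts an optional 0x


if __name__ == "__main__":
    hub, hw = split_vmid(0x107)
    print(HUBS[hub], hw)
    print(parse_vm_address("0x107@0x1000"))
```

       A result of None for the VMID corresponds to the case where no VMID
       is given and the address is not treated as a virtual address.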

Ring and PM4 Decoding

       --ring-stream, -RS <string>[range]
              Read the contents of the ring amdgpu_ring_<string>; that is,
              <string> is the ring name without the amdgpu_ring_ prefix.  By
              default it reads and prints the entire ring.  A range is
              optional and has the format '[start:end]'.  The starting and
              ending addresses are non-negative integers or the '.' (dot)
              symbol, which indicates the rptr when on the left side and the
              wptr when on the right side of the range.  For instance,
              "-RS gfx" prints the entire gfx ring, "-RS gfx[0:16]" prints
              the contents from 0 to 16 inclusive, and "-RS gfx[.]" or
              "-RS gfx[.:.]" prints the range [rptr, wptr].  When one of the
              range limits is a number while the other is the dot, '.', the
              number indicates the relative range before or after the
              corresponding ring pointer.  For instance, "-RS sdma0[16:.]"
              prints [wptr-16, wptr] words of the SDMA0 ring, and
              "-RS sdma1[.:32]" prints [rptr, rptr+32] double-words of the
              SDMA1 ring.  The contents of the ring are always interpreted
              when possible.

       --dump-ib, -di [vmid@]address length [pm]
              Dump an IB packet at an address with an optional VMID.  The
              length is specified in bytes.  The type of decoder <pm> is
              optional and defaults to PM4 packets.  Specify '3' for SDMA
              packets.

       --dump-ib-file, -df filename [pm]
              Dump an IB stored in a file as a series of hexadecimal DWORDS,
              one per line.  Optionally supply a PM type: '3' for SDMA IBs or
              '4' for PM4 IBs.  The default is PM4.

       --header-dump, -hd [HEADER_DUMP_reg]
              Dump the contents of the HEADER_DUMP buffer and decode the
              opcode into a human readable string.

       --logscan, -ls
              Read and display the contents of the MMIO register log.
              Usually specified with '-O bits,follow,empty_log' to enable
              continual dumping of the trace log.
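
       The range rules given for --ring-stream can be written out as a
       small Python sketch.  It is a reading of the text above rather than
       umr's code: resolve_ring_range is a hypothetical helper, numbers are
       assumed to be decimal, the whole ring is assumed to be
       [0, ring_size-1], and wrap-around at the end of the ring is not
       modelled.

```python
import re

# Ring name followed by an optional '[...]' range, e.g. 'gfx[.:32]'.
_SPEC = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<range>[^\[\]]*)\])?$")


def _count(text):
    """Parse a non-negative decimal integer (assumed number format)."""
    if not text.isdigit():
        raise ValueError("expected a non-negative integer: %r" % text)
    return int(text)


def resolve_ring_range(spec, rptr, wptr, ring_size):
    """Return (ring name, first word, last word) for a range spec."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError("malformed ring spec: %r" % spec)
    name, rng = m.group("name"), m.group("range")

    if rng is None:                      # no range: the entire ring
        return name, 0, ring_size - 1
    if rng == ".":                       # '[.]' is shorthand for '[.:.]'
        left, right = ".", "."
    else:
        left, sep, right = rng.partition(":")
        if not sep:
            raise ValueError("range must look like [start:end]")

    if left == "." and right == ".":
        return name, rptr, wptr          # [rptr, wptr]
    if left == ".":
        return name, rptr, rptr + _count(right)   # [rptr, rptr+N]
    if right == ".":
        return name, wptr - _count(left), wptr    # [wptr-N, wptr]
    return name, _count(left), _count(right)      # absolute [start, end]


if __name__ == "__main__":
    for spec in ("gfx", "gfx[0:16]", "gfx[.]", "sdma0[16:.]", "sdma1[.:32]"):
        print(spec, resolve_ring_range(spec, rptr=8, wptr=48, ring_size=64))
```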

Power and Clock

       --power, -p
              Read clocks, temperature, and GPU load at runtime.  Use the
              'use_colour' option to colourize the output.

       --clock-scan, -cs [clock]
              Scan the current hierarchy values of each clock.  By default,
              the hierarchy values of all clocks are listed; otherwise only
              the specified clock is listed, e.g., sclk.

       --clock-manual, -cm [clock] [value]
              Set the value of the corresponding clock.  Use the -cs command
              to check the hierarchy values of a clock and then use -cm to
              set the clock.

       --clock-high, -ch
              Set power_dpm_force_performance_level to high.

       --clock-low, -cl
              Set power_dpm_force_performance_level to low.

       --clock-auto, -ca
              Set power_dpm_force_performance_level to auto.

       --ppt_read, -pptr [ppt_field_name]
              Read powerplay table values and print them to stdout.  This
              command prints either all of the powerplay table information
              or the field named by the given string.

       --gpu_metrics, -gm
              Print the GPU metrics table for the device.
Notes

       - The "Waves" field in the DRM section of --top only works if GFX PG
         has been disabled.  Otherwise, GPU hangs occur frequently.  When PG
         is enabled it will read a constant 0.

Environmental Variables

       UMR_LOGGER
           Directory in which to write the "umr.log" file when capturing
           samples with the --top command.

       UMR_DATABASE_PATH
           Root directory of the database tree used for register, IP, and
           ASIC model data.
AMD (c) 2020                     January 2020                           UMR(1)