B Debugger Implementation
B.1 C Header File
github.com/riscv/riscv-debug-spec contains instructions for generating a C header file that defines macros for every field in every register/abstract command mentioned in this document.
B.2 External Debugger Implementation
This section details how an external debugger might use the described debug interface to perform some common operations on RISC-V cores using the JTAG DTM described in Section 6.1. All these examples assume a 32-bit core but it should be easy to adapt the examples to 64- or 128-bit cores.
To keep the examples readable, they all assume that everything succeeds, and that they complete faster than the debugger can perform the next access. This will be the case in a typical JTAG setup. However, the debugger must always check the sticky error status bits after performing a sequence of actions. If it sees any that are set, then it should attempt the same actions again, possibly while adding in some delay, or explicit checks for status bits.
B.2.1 Debug Module Interface Access
To read an arbitrary Debug Module register, select dmi, and scan in a value with op set to 1, and address set to the desired register address. In Update-DR the operation will start, and in Capture-DR its results will be captured into data. If the operation didn’t complete in time, op will be 3 and the value in data must be ignored. The busy condition must be cleared by writing dmireset in dtmcs, and then the second scan must be performed again. This process must be repeated until op returns 0. In later operations the debugger should allow for more time between Update-DR and Capture-DR.
To write an arbitrary Debug Bus register, select dmi, and scan in a value with op set to 2, and address and data set to the desired register address and data respectively. From then on everything happens exactly as with a read, except that a write is performed instead of the read.
It should almost never be necessary to scan IR, avoiding a big part of the inefficiency in typical JTAG use.
B.2.2 Checking for Halted Harts
A user will want to know as quickly as possible when a hart is halted
(e.g. due to a breakpoint). To efficiently determine which harts are
halted when there are many harts, the debugger uses the haltsum
registers. Assuming the maximum number of harts exist, first it checks haltsum3 .
For each bit set there, it writes hartsel, and checks haltsum2. This process repeats
through haltsum1 and haltsum0. Depending on how many harts exist, the process should
start at one of the lower haltsum registers.
B.2.3 Halting
To halt one or more harts, the debugger selects them, sets haltreq, and then waits for allhalted to indicate the harts are halted. Then it can clear haltreq to 0, or leave it high to catch a hart that resets while halted.
B.2.4 Running
First, the debugger should restore any registers that it has overwritten. Then it can let the selected harts run by setting resumereq. Once allresumeack is set, the debugger knows the selected harts have resumed. Harts might halt very quickly after resuming (e.g. by hitting a software breakpoint) so the debugger cannot use allhalted/anyhalted to check whether the hart resumed.
B.2.5 Single Step
Using the hardware single step feature is almost the same as regular running. The debugger just sets in before letting the hart run. The hart behaves exactly as in the running case, except that interrupts may be disabled (depending on stepie) and it only fetches and executes a single instruction before re-entering Debug Mode.
B.2.6 Accessing Registers
B.2.6.1 Using Abstract Command
Read s0 using abstract command:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | command | aarsize, transfer, regno = 0x1008 | Read s0 |
| Read | data0 | - | Returns value that was in s0 |
Write mstatus using abstract command:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | data0 | new value | |
| Write | command | aarsize, transfer, write, regno = 0x300 | Write mstatus |
B.2.6.2 Using Program Buffer
Abstract commands are only required to support GPR access. To access non-GPR registers, the debugger can use the Program Buffer to move values to/from a GPR, and then access the GPR value using an abstract command.
Write mstatus using program buffer:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | csrw s0, MSTATUS | |
| Write | progbuf1 | ebreak | |
| Write | data0 | new value | |
| Write | command | aarsize, postexec, transfer, write, regno = 0x1008 | Write s0, then execute program buffer |
Read f1 using program buffer:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | {fmv.x.s s0, f1} | |
| Write | progbuf1 | ebreak | |
| Write | command | postexec | Execute program buffer |
| Write | command | transfer, regno = 0x1008 | read s0 |
| Read | data0 | - | Returns the value that was in f1 |
B.2.7 Reading Memory
B.2.7.1 Using System Bus Access
With system bus access, addresses are physical system bus addresses.
Read a word from memory using system bus access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | sbcs | sbaccess, sbreadonaddr | Setup |
| Write | sbaddress0 | address | |
| Read | sbdata0 | - | Value read from memory |
Read block of memory using system bus access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | sbcs | sbaccess, sbreadonaddr, sbreadondata, sbautoincrement | Turn on autoread and autoincrement |
| Write | sbaddress0 | address | Writing address triggers read and increment |
| Read | sbdata0 | - | Value read from memory |
| Read | sbdata0 | - | Next value read from memory |
| … | … | … | … |
| Write | sbcs | 0 | Disable autoread |
| Read | sbdata0 | - | Get last value read from memory. |
B.2.7.2 Using Program Buffer
Memory can be accessed through the Program Buffer by having the hart perform loads/stores. Whether the addresses are physical or virtual depends on the system configuration.
Read a word from memory using program buffer:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | lw s0, 0(s0) | |
| Write | progbuf1 | ebreak | |
| Write | data0 | address | |
| Write | command | transfer, write, postexec, regno = 0x1008 | Write s0, then execute program buffer |
| Write | command | regno = 0x1008 | Read s0 |
| Read | data0 | - | Value read from memory |
Read block of memory using program buffer:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | lw s1, 0(s0) | |
| Write | progbuf1 | addi s0, s1, 4 | |
| Write | progbuf2 | ebreak | |
| Write | data0 | address | |
| Write | command | transfer, write, postexec, regno = 0x1008 | Write s0, then execute program buffer |
| Write | command | postexec, regno = 0x1009 | Read s1, then execute program buffer |
| Write | abstractauto | autoexecdata[0] | Set autoexecdata[0] |
| Read | data0 | - | Get value read from memory, then execute program buffer |
| Read | data0 | - | Get next value read from memory, then execute program buffer |
| … | … | … | … |
| Write | abstractauto | 0 | Clear autoexecdata[0] |
| Read | data0 | - | Get last value read from memory. |
B.2.7.3 Using Abstract Memory Access
Abstract memory accesses act as if they are performed by the hart, although the actual implementation may differ.
Read a word from memory using abstract memory access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | data1 | address | |
| Write | command | cmdtype=2, aamsize | |
| Read | data0 | - | Value read from memory |
Read block of memory using abstract memory access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | abstractauto | 1 | Re-execute the command when data0 is accessed |
| Write | data1 | address | |
| Write | command | cmdtype=2, aamsize, aampostincrement | |
| Read | data0 | - | Read value, and trigger reading of next address |
| … | … | … | … |
| Write | abstractauto | 0 | Disable auto-exec |
| Read | data0 | - | Get last value read from memory. |
B.2.8 Writing Memory
B.2.8.1 Using System Bus Access
With system bus access, addresses are physical system bus addresses.
Write a word to memory using system bus access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | sbcs | sbaccess | Configure access size |
| Write | sbaddress0 | address | |
| Write | sbdata0 | value |
Write a block of memory using system bus access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | sbcs | sbaccess, sbautoincrement | Turn on autoincrement |
| Write | sbaddress0 | address | |
| Write | sbdata0 | value0 | |
| Write | sbdata0 | value1 | |
| … | … | … | … |
| Write | sbdata0 | valueN |
B.2.8.2 Using Program Buffer
Through the Program Buffer, the hart performs the memory accesses. Addresses are physical or virtual (depending on and other system configuration).
Write a word to memory using program buffer:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | sw s1, 0(s0) | |
| Write | progbuf1 | ebreak | |
| Write | data0 | address | |
| Write | command | transfer, write, regno = 0x1008 | Write s0 |
| Write | data0 | value | |
| Write | command | transfer, write, postexec, regno = 0x1009 | Write s1, then execute program buffer |
Write block of memory using program buffer:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | sw s1, 0(s0) | |
| Write | progbuf1 | addi s0, s1, 4 | |
| Write | progbuf2 | ebreak | |
| Write | data0 | address | |
| Write | command | transfer, write, regno = 0x1008 | Write s0 |
| Write | data0 | value0 | |
| Write | command | transfer, write, postexec, regno = 0x1009 | Write s1, then execute program buffer |
| Write | abstractauto | autoexecdata[0] | Set autoexecdata[0] |
| Write | data0 | value1 | |
| … | … | … | … |
| Write | data0 | valueN | |
| Write | abstractauto | 0 | Clear autoexecdata[0] |
B.2.8.3 Using Abstract Memory Access
Abstract memory accesses act as if they are performed by the hart, although the actual implementation may differ.
Write a word to memory using abstract memory access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | data1 | address | |
| Write | data0 | value | |
| Write | command | cmdtype=2, aamsize=2, write=1 |
Write a block of memory using abstract memory access:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | data1 | address | |
| Write | data0 | value0 | |
| Write | command | cmdtype=2, aamsize, write, aampostincrement | |
| Write | abstractauto | 1 | Re-execute the command when data0 is accessed |
| Write | data0 | value1 | |
| Write | data0 | value2 | |
| … | … | … | … |
| Write | data0 | valueN | |
| Write | abstractauto | 0 | Disable auto-exec |
B.2.9 Triggers
A debugger can use hardware triggers to halt a hart when a certain event occurs. Below are some examples, but as there is no requirement on the number of features of the triggers implemented by a hart, these examples might not be applicable to all implementations. When a debugger wants to set a trigger, it writes the desired configuration, and then reads back to see if that configuration is supported. All examples assume XLEN=32.
Enter Debug Mode when the instruction at 0x80001234 is executed, to be used as an instruction breakpoint in ROM:
| tdata1 | 0x6980105c | type=6, dmode=1, action=1, select=0, match=0, m=1, s=1, u=1, vs=1, vu=1, execute=1 | | tdata2 | 0x80001234 | address |
Enter Debug Mode when performing a load at address 0x80007f80 in M-mode or S-mode or U-mode:
| tdata1 | 0x68001059 | type=6, dmode=1, action=1, select=0, match=0, m=1, s=1, u=1, load=1 | | tdata2 | 0x80007f80 | address |
Enter Debug Mode when storing to an address between 0x80007c80 and 0x80007cef (inclusive) in VS-mode or VU-mode when hgatp.VMID=1:
| tdata1 0 | 0x69801902 | type=6, dmode=1, action=1, chain=1, select=0, match=2, vs=1, vu=1, store=1 | | tdata2 0 | 0x80007c80 | start address (inclusive) | | textra32 0 | 0x03000000 | mhselect=6, mhvalue=0 | | tdata1 1 | 0x69801182 | type=6, dmode=1, action=1, select=0, match=3, vs=1, vu=1, store=1 | | tdata2 1 | 0x80007cf0 | end address (exclusive) | | textra32 1 | 0x03000000 | mhselect=6, mhvalue=0 |
Enter Debug Mode when storing to an address between 0x81230000 and 0x8123ffff (inclusive):
| tdata1 | 0x698010da | type=6, dmode=1, action=1, select=0, match=1, m=1, s=1, u=1, vs=1, vu=1, store=1 | | tdata2 | 0x81237fff | 16 upper bits to match exactly, then 0, then all ones. |
Enter Debug Mode when loading from an address between 0x86753090 and 0x8675309f or between 0x96753090 and 0x9675309f (inclusive):
| tdata1 0 | 0x69801a59 | type=6, dmode=1, action=1, chain=1, match=4, m=1, s=1, u=1, vs=1, vu=1, load=1 | | tdata2 0 | 0xfff03090 | Mask for low half, then match for low half | | tdata1 1 | 0x698012d9 | type=6, dmode=1, action=1, match=5, m=1, s=1, u=1, vs=1, vu=1, load=1 | | tdata2 1 | 0xefff8675 | Mask for high half, then match for high half |
B.2.10 Handling Exceptions
Generally the debugger can avoid exceptions by being careful with the programs it writes. Sometimes they are unavoidable though, e.g. if the user asks to access memory or a CSR that is not implemented. A typical debugger will not know enough about the hardware platform to know what’s going to happen, and must attempt the access to determine the outcome.
When an exception occurs while executing the Program Buffer, command becomes set. The debugger can check this field to see whether a program encountered an exception. If there was an exception, it’s left to the debugger to know what must have caused it.
B.2.11 Quick Access
There are a variety of instructions to transfer data between GPRs and
the data registers. They are either loads/stores or CSR reads/writes.
The specific addresses also vary. This is all specified in hartinfo. The
examples here use the pseudo-op transfer dest, src to represent all
these options.
Halt the hart for a minimum amount of time to perform a single memory write:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | transfer arg2, s0 | Save s0 |
| Write | progbuf1 | transfer s0, arg0 | Read first argument (address) |
| Write | progbuf2 | transfer arg0, s1 | Save s1 |
| Write | progbuf3 | transfer s1, arg1 | Read second argument (data) |
| Write | progbuf4 | sw s1, 0(s0) | |
| Write | progbuf5 | transfer s1, arg0 | Restore s1 |
| Write | progbuf6 | transfer s0, arg2 | Restore s0 |
| Write | progbuf7 | ebreak | |
| Write | data0 | address | |
| Write | data1 | data | |
| Write | command | 0x10000000 | Perform quick access |
This shows an example of setting the m bit in to enable a hardware breakpoint in M-mode. Similar quick access instructions could have been used previously to configure the trigger that is being enabled here:
| Op | Address | Value | Comment |
|---|---|---|---|
| Write | progbuf0 | transfer arg0, s0 | Save s0 |
| Write | progbuf1 | li s0, (1 \<\< 6) | Form the mask for m bit |
| Write | progbuf2 | csrrs x0, [tdata1](./trigger#csr-tdata1), s0 | Apply the mask to mcontrol |
| Write | progbuf3 | transfer s0, arg2 | Restore s0 |
| Write | progbuf4 | ebreak | |
| Write | command | 0x10000000 | Perform quick access |
B.3 Native Debugger Implementation
The spec contains a few features to aid in writing a native debugger. This section describes how some common tasks might be achieved.
B.3.1 Single Step
Single step is straightforward if the OS or a debug stub runs in M-Mode while the
program being debugged runs in a less privileged mode. When a step is required,
the OS or debug stub writes count=1, action=0,
m=0 before returning control to the lower user program with an
mret instruction.
Stepping code running in the same privilege mode as the debugger is more complicated, depending on what other debug features are implemented.
If hardware implements mpte and mte, then stepping through non-trap code which doesn’t allow for nested interrupts is also straightforward.
If hardware automatically prevents action=0 triggers from matching when entering a trap handler as described in Section 5.4, then a carefully written trap handler can ensure that interrupts are disabled whenever the icount trigger must not match.
If neither of these features exist, then single step is doable, but tricky to get right. To single step, the debug stub would execute something like:
li t0, count=4, action=0, m=1
csrw tdata1, t0 /* Write the trigger. */
lw t0, 8(sp) /* Restore t0, count decrements to 3 */
lw sp, 0(sp) /* Restore sp, count decrements to 2 */
mret /* Return to program being debugged. count decrements to 1 */
There is an additional problem with using icount to single step. An instruction may cause an exception into a more privileged mode where the trigger is not enabled. The exception handler might address the cause of the exception, and then restart the instruction. Examples of this include page faults, FPU instructions when the FPU is not yet enabled, and interrupts. When a user is single stepping through such code, they will have to step twice to get past the restarted instruction. The first time the exception handler runs, and the second time the instruction actually executes. That is confusing and usually undesirable.
To help users out, debuggers should detect when a single step restarted an instruction, and then step again. This way the users see the expected behavior of stepping over the instruction. Ideally the debugger would notify the user that an exception handler executed the first time.
The debugger should perform this extra step when the PC doesn’t change during a regular step.
It is safe to perform an extra step when the PC changes, because every RISC-V instruction either changes the PC or has side effects when repeated, but never both.
To avoid an infinite loop if the exception handler does not address the cause of the exception, the debugger must execute no more than a single extra step.