8 "Ss" Supervisor Extensions

note

This chapter is currently being restructured. Its contents are normative, but the presentation might appear disjoint.

8.1 "Ssqosid" Extension for Quality-of-Service (QoS) Identifiers, Version 1.0

Quality of Service (QoS) is defined as the minimal end-to-end performance guaranteed in advance by a service level agreement (SLA) to a workload. Performance metrics might include measures such as instructions per cycle (IPC), latency of service, etc.

When multiple workloads execute concurrently on modern processors (equipped with large core counts, multiple cache hierarchies, and multiple memory controllers), the performance of any given workload becomes less deterministic, or even non-deterministic, due to shared resource contention.

To manage performance variability, system software needs resource allocation and monitoring capabilities. These capabilities allow for the reservation of resources like cache and bandwidth, thus meeting individual performance targets while minimizing interference. For resource management, hardware should provide monitoring features that allow system software to profile workload resource consumption and allocate resources accordingly.

To facilitate this, the QoS Identifiers extension (Ssqosid) introduces the srmcfg register, which configures a hart with two identifiers: a Resource Control ID (RCID) and a Monitoring Counter ID (MCID). These identifiers accompany each request issued by the hart to shared resource controllers.

Additional metadata, like the nature of the memory access and the ID of the originating supervisor domain, can accompany RCID and MCID. Resource controllers may use this metadata for differentiated service such as a different capacity allocation for code storage vs. data storage. Resource controllers can use this data for security policies such as not exposing statistics of one security domain to another.

These identifiers are crucial for the RISC-V Capacity and Bandwidth Controller QoS Register Interface (CBQRI) specification, which provides methods for setting resource usage limits and monitoring resource consumption. The RCID controls resource allocations, while the MCID is used for tracking resource usage.

note

The Ssqosid extension does not require that S-mode be implemented.

8.1.1 Supervisor Resource Management Configuration (srmcfg) register

The srmcfg register is an SXLEN-bit read/write register used to configure a Resource Control ID (RCID) and a Monitoring Counter ID (MCID). Both RCID and MCID are WARL fields. The register is formatted as shown in SRMCFG64 when SXLEN=64 and SRMCFG32 when SXLEN=32.

The RCID and MCID accompany each request made by the hart to shared resource controllers. The RCID is used to determine the resource allocations (e.g., cache occupancy limits, memory bandwidth limits, etc.) to enforce. The MCID is used to identify a counter to monitor resource usage.

[Figure SRMCFG64: srmcfg register layout when SXLEN=64]

[Figure SRMCFG32: srmcfg register layout when SXLEN=32]
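
As an illustration of how software might compose and decode an srmcfg value, the following is a minimal sketch rather than a normative definition. The field positions (RCID in bits 11:0, MCID in bits 27:16) are an assumption made here because the layout figures are not reproduced in this text, and the helper names srmcfg_pack, srmcfg_rcid, and srmcfg_mcid are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED field positions (the layout figures are not reproduced here):
 * RCID in bits 11:0 and MCID in bits 27:16; all other bits reserved. */
#define SRMCFG_RCID_SHIFT 0
#define SRMCFG_MCID_SHIFT 16
#define SRMCFG_ID_MASK    UINT64_C(0xFFF)

/* Build an srmcfg value from an RCID and an MCID. */
static uint64_t srmcfg_pack(uint32_t rcid, uint32_t mcid)
{
    assert(rcid <= SRMCFG_ID_MASK && mcid <= SRMCFG_ID_MASK);
    return ((uint64_t)rcid << SRMCFG_RCID_SHIFT) |
           ((uint64_t)mcid << SRMCFG_MCID_SHIFT);
}

/* Extract the RCID field from an srmcfg value. */
static uint32_t srmcfg_rcid(uint64_t srmcfg)
{
    return (uint32_t)((srmcfg >> SRMCFG_RCID_SHIFT) & SRMCFG_ID_MASK);
}

/* Extract the MCID field from an srmcfg value. */
static uint32_t srmcfg_mcid(uint64_t srmcfg)
{
    return (uint32_t)((srmcfg >> SRMCFG_MCID_SHIFT) & SRMCFG_ID_MASK);
}
```

Because RCID and MCID are WARL fields, an implementation may support fewer bits than the sketch assumes; software would typically read srmcfg back after writing it to learn which value was actually retained.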

The RCID and MCID configured in the srmcfg CSR apply to all privilege modes of software execution on that hart by default, but this behavior may be overridden by future extensions.

If extension Smstateen is implemented together with Ssqosid, then Ssqosid also requires the SRMCFG bit in mstateen0 to be implemented. If mstateen0.SRMCFG is 0, attempts to access srmcfg in privilege modes less privileged than M-mode raise an illegal-instruction exception. If mstateen0.SRMCFG is 1 or if extension Smstateen is not implemented, attempts to access srmcfg when V=1 raise a virtual-instruction exception.
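
The access rules above can be summarized as a small decision function. This is a sketch under stated assumptions, not a reference model: the type and function names are hypothetical, and the U-mode (V=0) case is not stated in the paragraph above but is assumed here to follow the general rule that a supervisor-level CSR is inaccessible from U-mode.

```c
#include <assert.h>
#include <stdbool.h>

enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };
enum access_result { ACCESS_OK, ILLEGAL_INSTRUCTION, VIRTUAL_INSTRUCTION };

/* Outcome of an attempt to access srmcfg.
 * mode: nominal privilege mode; v: virtualization mode (V=1 means VS/VU). */
static enum access_result srmcfg_access(enum priv mode, bool v,
                                        bool smstateen_implemented,
                                        bool mstateen0_srmcfg)
{
    assert(!(mode == PRIV_M && v)); /* V is always 0 in M-mode */

    if (mode == PRIV_M)
        return ACCESS_OK;

    /* mstateen0.SRMCFG=0 blocks every mode below M-mode. */
    if (smstateen_implemented && !mstateen0_srmcfg)
        return ILLEGAL_INSTRUCTION;

    /* Otherwise, any access with V=1 is a virtual-instruction exception. */
    if (v)
        return VIRTUAL_INSTRUCTION;

    /* ASSUMPTION: supervisor-level CSR, so U-mode access is illegal. */
    if (mode == PRIV_U)
        return ILLEGAL_INSTRUCTION;

    return ACCESS_OK;
}
```

Note the ordering: the mstateen0 check takes effect before the V=1 check, so a guest access with mstateen0.SRMCFG=0 reports an illegal-instruction exception rather than a virtual-instruction exception.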

note

A reset value of 0 is suggested for the RCID field matching resource controllers' default behavior of associating all capacity with RCID=0. The MCID reset value does not affect functionality and may be implementation-defined.

Typically, fewer bits are allocated for RCID (e.g., to support tens of RCIDs) than for MCID (e.g., to support hundreds of MCIDs). A common RCID is usually used to group apps or VMs, pooling resource allocations to meet collective SLAs. If an SLA breach occurs, unique MCIDs enable granular monitoring, aiding decisions on resource adjustment, associating a different RCID with a subset of members, or migrating members to other machines. The larger pool of MCIDs speeds up this analysis.

The RCID and MCID in srmcfg apply across all privilege levels on the hart. Typically, higher-privilege modes don’t modify srmcfg, as they often serve lower-privileged tasks. If differentiation is needed, higher privilege code can update srmcfg and restore it before returning to a lower privilege level.

In VM environments, hypervisors usually manage resource allocations, keeping the Guest OS out of QoS flows. If needed, the hypervisor can virtualize srmcfg CSR for a VM using the virtual-instruction exceptions triggered upon Guest access. If the direct selection of RCID and MCID by the VM becomes common and emulation overhead is an issue, future extensions may allow VS-mode to use a selector for a hypervisor-configured set of CSRs holding RCID and MCID values designated for that Guest OS use.

During context switches, the supervisor may choose to execute with the srmcfg of the outgoing context to attribute the execution to it. Prior to restoring the new context, it switches to the new VM’s srmcfg. The supervisor can also use a separate configuration for execution that is not to be attributed to either context.

8.2 Ssu64xl Extension for UXLEN=64 Support, Version 1.0

If the Ssu64xl extension is implemented, then sstatus.UXL must be capable of holding the value 2 (i.e., UXLEN=64 must be supported).

8.3 Ssccptr Extension for Main Memory Page-Table Reads, Version 1.0

If the Ssccptr extension is implemented, then main memory regions with both the cacheability and coherence PMAs must support hardware page-table reads.

8.4 Sstvecd Extension for Direct Trap Vectoring, Version 1.0

If the Sstvecd extension is implemented, then stvec.MODE must be capable of holding the value 0 (Direct). Furthermore, when stvec.MODE=Direct, stvec.BASE must be capable of holding any valid four-byte-aligned address.
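
For illustration, a direct-mode stvec value might be composed as in the sketch below. It relies on the standard stvec layout (MODE in bits 1:0, BASE in the remaining upper bits); the helper names are hypothetical and a 64-bit SXLEN is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define STVEC_MODE_MASK   UINT64_C(0x3)
#define STVEC_MODE_DIRECT UINT64_C(0)

/* Compose an stvec value that selects Direct mode for the given handler.
 * The handler address must be four-byte aligned so that it does not
 * overlap the MODE field. */
static uint64_t stvec_direct(uint64_t handler_addr)
{
    assert((handler_addr & STVEC_MODE_MASK) == 0);
    return handler_addr | STVEC_MODE_DIRECT;
}

/* Recover the trap handler base address from an stvec value. */
static uint64_t stvec_base(uint64_t stvec)
{
    return stvec & ~STVEC_MODE_MASK;
}
```

With Sstvecd implemented, a value produced this way for any valid four-byte-aligned address is expected to read back unchanged from stvec.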

8.5 Sstvala Extension for Trap Value Reporting, Version 1.0

If the Sstvala extension is implemented, then stval must be written with the faulting virtual address for load, store, and instruction page-fault, access-fault, and misaligned exceptions, and for breakpoint exceptions that are defined to write an address to stval, other than those caused by execution of the EBREAK or C.EBREAK instructions. For virtual-instruction and illegal-instruction exceptions, stval must be written with the faulting instruction.

8.6 Sscounterenw Extension for Counter-Enable Writability, Version 1.0

If the Sscounterenw extension is implemented, then for any hpmcounter that is not read-only zero, the corresponding bit in scounteren must be writable.

8.7 Ssstrict Extension for Extension Conformance, Version 1.0

If the Ssstrict extension is implemented, then no non-conforming extensions are present. Furthermore, attempts to execute unimplemented opcodes or access unimplemented CSRs in the standard or reserved encoding spaces raise an illegal-instruction exception that results in a contained trap to the supervisor-mode trap handler.

8.8 "Sstc" Extension for Supervisor-mode Timer Interrupts, Version 1.0

The current Privileged architecture specification defines a hardware mechanism only for generating machine-mode timer interrupts (based on the mtime and mtimecmp registers). As a result, timer services for S-mode/HS-mode (and for VS-mode) must all be provided by M-mode, via SBI calls from S/HS-mode up to M-mode (or VS-mode calls to HS-mode and then to M-mode). M-mode software then multiplexes these multiple logical timers onto its one physical M-mode timer facility, and the M-mode timer interrupt handler passes timer interrupts back down to the appropriate lower privilege mode.

This extension provides supervisor mode with its own CSR-based timer interrupt facility, in the form of its own stimecmp register, which it can manage directly to provide its own timer service. This eliminates the large overheads of emulating S/HS-mode timers and timer interrupt generation in M-mode. Further, this extension adds a similar facility to the Hypervisor extension for VS-mode.

The extension name is "Sstc" ('Ss' for Privileged arch and Supervisor-level extensions, and 'tc' for timecmp). This extension adds the S-level stimecmp CSR and the VS-level vstimecmp CSR. It also adds the STCE bit to the menvcfg and henvcfg CSRs.
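
As a rough sketch of how supervisor software might reason about stimecmp, the helpers below model the pending condition and the computation of a new deadline. The comparison rule (the interrupt is pending whenever time is greater than or equal to stimecmp, compared as unsigned values) comes from the definition of stimecmp and is not restated in this overview; the function names and the saturating policy are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Supervisor timer interrupt is pending when time >= stimecmp (unsigned). */
static bool stip_pending(uint64_t time, uint64_t stimecmp)
{
    return time >= stimecmp;
}

/* Compute the stimecmp value for a deadline `delta` ticks after `now`.
 * Saturates at the maximum value instead of wrapping, since a wrapped
 * deadline would make the interrupt pending immediately. */
static uint64_t next_deadline(uint64_t now, uint64_t delta)
{
    assert(delta > 0); /* a zero delta would already be pending */
    return (delta > UINT64_MAX - now) ? UINT64_MAX : now + delta;
}
```

Writing the all-ones value to stimecmp is one way software could effectively park the timer, since time is not expected to reach that value in practice.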

8.9 "Sscofpmf" Extension for Count Overflow and Mode-Based Filtering, Version 1.0

The current Privileged specification defines mhpmevent CSRs to select and control event counting by the associated hpmcounter CSRs, but provides no standardization of any fields within these CSRs. For at least Linux-class rich-OS systems, it is desirable to standardize certain basic features that are broadly desired (they have come up on RISC-V lists for over a year and have been the subject of past proposals). Doing so enables standard upstream software support and eliminates the need for implementations to provide their own custom software support.

This extension accomplishes exactly this within the existing mhpmevent CSRs, and correspondingly avoids the unnecessary creation of whole new sets of CSRs beyond just one new CSR.

This extension sticks to addressing two basic well-understood needs that have been requested by various people. To make it easy to understand the deltas from the current Priv 1.11/1.12 specs, this is written as the actual exact changes to be made to existing paragraphs of Priv spec text (or additional paragraphs within the existing text).

The extension name is "Sscofpmf" ('Ss' for Privileged arch and Supervisor-level extensions, and 'cofpmf' for Count OverFlow and Privilege Mode Filtering).

Note that the new count overflow interrupt will be treated as a standard local interrupt that is assigned to bit 13 in the mip/mie/sip/sie registers.

8.9.1 Count Overflow Control

The following bits are added to mhpmevent:

Bit   Field   Description
63    OF      Overflow status and interrupt disable bit that is set when counter overflows
62    MINH    If set, then counting of events in M-mode is inhibited
61    SINH    If set, then counting of events in S/HS-mode is inhibited
60    UINH    If set, then counting of events in U-mode is inhibited
59    VSINH   If set, then counting of events in VS-mode is inhibited
58    VUINH   If set, then counting of events in VU-mode is inhibited
57    WPRI    Reserved
56    WPRI    Reserved

For each xINH bit, if the associated privilege mode is not implemented, the bit is read-only zero.

Each of the five xINH bits, when set, inhibits counting of events while in privilege mode x. All-zeroes for these bits results in counting of events in all modes.
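
The bit assignments and the xINH filtering rule can be expressed as the following sketch. The macro names, the exec_mode enumeration, and the counting_inhibited helper are hypothetical names chosen for illustration; only the bit positions come from the field list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MHPMEVENT_OF    (UINT64_C(1) << 63)
#define MHPMEVENT_MINH  (UINT64_C(1) << 62)
#define MHPMEVENT_SINH  (UINT64_C(1) << 61)
#define MHPMEVENT_UINH  (UINT64_C(1) << 60)
#define MHPMEVENT_VSINH (UINT64_C(1) << 59)
#define MHPMEVENT_VUINH (UINT64_C(1) << 58)

/* Hypothetical enumeration of the modes that have an xINH bit. */
enum exec_mode { MODE_M, MODE_S, MODE_U, MODE_VS, MODE_VU, MODE_COUNT };

/* Returns true if events are not counted while executing in `mode`. */
static bool counting_inhibited(uint64_t mhpmevent, enum exec_mode mode)
{
    static const uint64_t inh_bit[MODE_COUNT] = {
        [MODE_M]  = MHPMEVENT_MINH,
        [MODE_S]  = MHPMEVENT_SINH,
        [MODE_U]  = MHPMEVENT_UINH,
        [MODE_VS] = MHPMEVENT_VSINH,
        [MODE_VU] = MHPMEVENT_VUINH,
    };
    assert((unsigned)mode < (unsigned)MODE_COUNT);
    return (mhpmevent & inh_bit[mode]) != 0;
}
```

For example, a profiler interested only in kernel activity on a non-virtualized system might set MINH and UINH, leaving SINH clear.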

The OF bit is set when the corresponding hpmcounter overflows, and remains set until written by software. Since hpmcounter values are unsigned values, overflow is defined as unsigned overflow of the implemented counter bits. Note that there is no loss of information after an overflow since the counter wraps around and keeps counting while the sticky OF bit remains set.

If supervisor mode is implemented, the 32-bit scountovf register contains read-only shadow copies of the OF bits in all 29 mhpmevent registers.

If an hpmcounter overflows while the associated OF bit is zero, then a "count overflow interrupt request" is generated. If the OF bit is one, then no interrupt request is generated. Consequently the OF bit also functions as a count overflow interrupt disable for the associated hpmcounter.

Count overflow never results from writes to the mhpmcountern or mhpmeventn registers, only from hardware increments of counter registers.
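
The interaction between counter wrap-around, the sticky OF bit, and interrupt request generation can be modelled in software as follows. This is a simplified behavioural sketch, not a description of any implementation; the hpm_counter structure and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one hpmcounter and its OF bit. */
struct hpm_counter {
    uint64_t value;  /* implemented counter bits */
    unsigned width;  /* number of implemented bits, 1..64 */
    bool of;         /* sticky overflow bit from mhpmevent */
};

/* Software write: changes the value but never counts as an overflow. */
static void hpm_write(struct hpm_counter *c, uint64_t value)
{
    uint64_t mask = (c->width == 64) ? UINT64_MAX
                                     : ((UINT64_C(1) << c->width) - 1);
    c->value = value & mask;
}

/* One hardware increment. Returns true if a count overflow interrupt
 * request is generated, which happens only on an unsigned wrap-around
 * while OF is zero. OF is left set after any overflow. */
static bool hpm_increment(struct hpm_counter *c)
{
    assert(c->width >= 1 && c->width <= 64);
    uint64_t mask = (c->width == 64) ? UINT64_MAX
                                     : ((UINT64_C(1) << c->width) - 1);
    c->value = (c->value + 1) & mask;
    if (c->value != 0)
        return false;      /* no wrap-around, no overflow */
    bool request = !c->of; /* OF=1 suppresses the request */
    c->of = true;
    return request;
}
```

In this model a second wrap-around with OF still set continues counting but produces no further request, which matches the role of OF as an interrupt disable.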

This count-overflow-interrupt-request signal is treated as a standard local interrupt that corresponds to bit 13 in the mip/mie/sip/sie registers. The mip/sip LCOFIP and mie/sie LCOFIE bits are, respectively, the interrupt-pending and interrupt-enable bits for this interrupt. ('LCOFI' represents 'Local Count Overflow Interrupt'.)

Generation of a count-overflow-interrupt request by an hpmcounter sets the associated OF bit. When an OF bit is set, it eventually, but not necessarily immediately, sets the LCOFIP bit in the mip/sip registers. The LCOFIP bit is cleared by software before servicing the count overflow interrupt resulting from one or more count overflows. The mideleg register controls the delegation of this interrupt to S-mode versus M-mode.

note

There are not separate overflow status and overflow interrupt enable bits. In practice, enabling overflow interrupt generation (by clearing the OF bit) is done in conjunction with initializing the counter to a starting value. Once a counter has overflowed, it and the OF bit must be reinitialized before another overflow interrupt can be generated.
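
A common use of this pairing is sampling: software preloads the counter so that it overflows after a chosen number of events and clears OF at the same time. The sketch below shows the arithmetic under the assumption of a counter with `width` implemented bits; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MHPMEVENT_OF (UINT64_C(1) << 63)

/* Starting value that makes a `width`-bit counter wrap to zero after
 * exactly `period` increments. */
static uint64_t hpm_start_value(unsigned width, uint64_t period)
{
    assert(width >= 1 && width <= 64);
    uint64_t mask = (width == 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1);
    assert(period >= 1 && period <= mask);
    return (UINT64_C(0) - period) & mask;
}

/* mhpmevent value with OF cleared, which re-enables the overflow
 * interrupt for the associated counter. */
static uint64_t mhpmevent_arm(uint64_t mhpmevent)
{
    return mhpmevent & ~MHPMEVENT_OF;
}
```

For a 4-bit counter and a period of 3, the starting value is 0xD, so the third increment wraps the counter to zero and raises the overflow.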

note

Software can distinguish newly overflowed counters (yet to be serviced by an overflow interrupt handler) from overflowed counters that have already been serviced or that are configured to not generate an interrupt on overflow, by maintaining a bit mask reflecting which counters are active and due to eventually overflow.

8.9.2 Supervisor Count Overflow (scountovf) Register

This extension adds the scountovf CSR, a 32-bit read-only register that contains shadow copies of the OF bits in the 29 mhpmevent CSRs (mhpmevent3 through mhpmevent31), where scountovf bit X corresponds to mhpmeventX.

This register enables supervisor-level overflow interrupt handler software to quickly and easily determine which counter(s) have overflowed (without needing to make an execution environment call or series of calls ultimately up to M-mode).

Read access to bit X is subject to the same mcounteren (or mcounteren and hcounteren) CSRs that mediate access to the hpmcounter CSRs by S-mode (or VS-mode). In M-mode, scountovf bit X is always readable. In S/HS-mode, scountovf bit X is readable when mcounteren bit X is set, and otherwise reads as zero. Similarly, in VS mode, scountovf bit X is readable when mcounteren bit X and hcounteren bit X are both set, and otherwise reads as zero.
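
The visibility rules above amount to masking the OF bits with the relevant counter-enable registers. The following sketch models the value each mode would read; the reader enumeration and function name are hypothetical, and bits 2:0 are assumed to read as zero because no mhpmevent0 through mhpmevent2 registers exist.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 31:3 carry OF shadows; bits 2:0 are ASSUMED to read as zero. */
#define SCOUNTOVF_VALID UINT32_C(0xFFFFFFF8)

enum reader { READ_M, READ_S, READ_VS };

/* Value observed when reading scountovf, given the raw OF bits (bit X
 * is the OF bit of mhpmeventX) and the counter-enable CSR contents. */
static uint32_t scountovf_read(uint32_t of_bits, uint32_t mcounteren,
                               uint32_t hcounteren, enum reader who)
{
    uint32_t visible = of_bits & SCOUNTOVF_VALID;
    switch (who) {
    case READ_M:
        return visible;                           /* always readable */
    case READ_S:
        return visible & mcounteren;              /* gated by mcounteren */
    case READ_VS:
        return visible & mcounteren & hcounteren; /* gated by both */
    }
    assert(!"unknown reader");
    return 0;
}
```

A bit that reads as zero in S-mode or VS-mode therefore does not prove that the counter has not overflowed; it may simply be hidden by the enable registers.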

8.10 "Ssdbltrp" Double Trap Extension, Version 1.0

The Ssdbltrp extension addresses a double trap (see machine-double-trap) at privilege modes lower than M. It enables HS-mode to invoke a critical error handler in a virtual machine on a double trap in VS-mode. It also allows M-mode to invoke a critical error handler in the OS/Hypervisor on a double trap in S/HS-mode.

The Ssdbltrp extension adds the menvcfg.DTE (See sec:menvcfg) and the sstatus.SDT fields (See sstatus). If the hypervisor extension is additionally implemented, then the extension adds the henvcfg.DTE (See sec:henvcfg) and the vsstatus.SDT fields (See vsstatus).

See supv-double-trap for the operational details.