8 "Ss" Supervisor Extensions
This chapter is currently being restructured. Its contents are normative, but the presentation might appear disjoint.
8.1 "Ssqosid" Extension for Quality-of-Service (QoS) Identifiers, Version 1.0
Quality of Service (QoS) is defined as the minimal end-to-end performance guaranteed in advance by a service level agreement (SLA) to a workload. Performance metrics might include measures such as instructions per cycle (IPC), latency of service, etc.
When multiple workloads execute concurrently on modern processors equipped with large core counts, multiple cache hierarchies, and multiple memory controllers, the performance of any given workload becomes less deterministic, or even non-deterministic, due to shared resource contention.
To manage performance variability, system software needs resource allocation and monitoring capabilities. These capabilities allow for the reservation of resources like cache and bandwidth, thus meeting individual performance targets while minimizing interference. For resource management, hardware should provide monitoring features that allow system software to profile workload resource consumption and allocate resources accordingly.
To facilitate this, the QoS Identifiers extension (Ssqosid) introduces the
srmcfg register, which configures a hart with two identifiers: a Resource
Control ID (RCID) and a Monitoring Counter ID (MCID). These identifiers
accompany each request issued by the hart to shared resource controllers.
Additional metadata, like the nature of the memory access and the ID of the
originating supervisor domain, can accompany RCID and MCID. Resource
controllers may use this metadata for differentiated service such as a different
capacity allocation for code storage vs. data storage. Resource controllers can
use this data for security policies such as not exposing statistics of one
security domain to another.
These identifiers are crucial for the RISC-V Capacity and Bandwidth Controller
QoS Register Interface (CBQRI) specification, which provides methods for
setting resource usage limits and monitoring resource consumption. The RCID
controls resource allocations, while the MCID is used for tracking resource
usage.
The Ssqosid extension does not require that S-mode be implemented.
8.1.1 Supervisor Resource Management Configuration (srmcfg) register
The srmcfg register is an SXLEN-bit read/write register used to configure a
Resource Control ID (RCID) and a Monitoring Counter ID (MCID). Both RCID
and MCID are WARL fields. The register is formatted as shown in SRMCFG64
when SXLEN=64 and SRMCFG32 when SXLEN=32.
The RCID and MCID accompany each request made by the hart to shared resource
controllers. The RCID is used to determine the resource allocations (e.g.,
cache occupancy limits, memory bandwidth limits, etc.) to enforce. The MCID
is used to identify a counter to monitor resource usage.
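As a minimal sketch of how system software might compose and decode an srmcfg value: the field positions used below (RCID in bits 11:0, MCID in bits 27:16) are an assumption taken from the register-format figures, which are not reproduced in this text, and the helper names are hypothetical. Because both fields are WARL, software would normally read the CSR back after writing it to learn which values were actually retained.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (not shown in this text): RCID = bits 11:0, MCID = bits 27:16. */
#define SRMCFG_RCID_SHIFT 0
#define SRMCFG_RCID_MASK  UINT64_C(0xFFF)
#define SRMCFG_MCID_SHIFT 16
#define SRMCFG_MCID_MASK  UINT64_C(0xFFF)

/* Hypothetical helper: build an srmcfg value from an RCID and an MCID. */
static inline uint64_t srmcfg_pack(uint32_t rcid, uint32_t mcid)
{
    assert(rcid <= SRMCFG_RCID_MASK && mcid <= SRMCFG_MCID_MASK);
    return ((uint64_t)rcid << SRMCFG_RCID_SHIFT) |
           ((uint64_t)mcid << SRMCFG_MCID_SHIFT);
}

/* Hypothetical helpers: extract each identifier from an srmcfg value. */
static inline uint32_t srmcfg_rcid(uint64_t v)
{
    return (uint32_t)((v >> SRMCFG_RCID_SHIFT) & SRMCFG_RCID_MASK);
}

static inline uint32_t srmcfg_mcid(uint64_t v)
{
    return (uint32_t)((v >> SRMCFG_MCID_SHIFT) & SRMCFG_MCID_MASK);
}
```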
The RCID and MCID configured in the srmcfg CSR apply to all privilege
modes of software execution on that hart by default, but this behavior may be
overridden by future extensions.
If extension Smstateen is implemented together with Ssqosid, then Ssqosid also
requires the SRMCFG bit in mstateen0 to be implemented.
If mstateen0.SRMCFG is 0, attempts to access srmcfg in privilege modes
less privileged than M-mode raise an illegal-instruction exception.
If mstateen0.SRMCFG is 1 or if extension Smstateen is not implemented,
attempts to access srmcfg when V=1 raise a virtual-instruction exception.
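The access rules above can be summarized as a small decision function. This is an illustrative model, not normative text: the type and function names are hypothetical, and the final U-mode case is an assumption based on the usual CSR privilege check for a supervisor-level CSR rather than on anything stated in this section.

```c
#include <assert.h>
#include <stdbool.h>

enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };
enum access_result { ACCESS_OK, ILLEGAL_INSTRUCTION, VIRTUAL_INSTRUCTION };

/* Model of an srmcfg access from privilege level `priv` with virtualization
 * mode `v`. `has_smstateen` says whether Smstateen is implemented and
 * `mstateen0_srmcfg` is the value of mstateen0.SRMCFG when it is. */
static inline enum access_result
srmcfg_access(enum priv priv, bool v, bool has_smstateen, bool mstateen0_srmcfg)
{
    assert(!(priv == PRIV_M && v)); /* V is always 0 in M-mode */

    if (priv == PRIV_M)
        return ACCESS_OK;
    if (has_smstateen && !mstateen0_srmcfg)
        return ILLEGAL_INSTRUCTION;   /* blocked below M-mode */
    if (v)
        return VIRTUAL_INSTRUCTION;   /* VS-mode or VU-mode */
    /* Assumption: ordinary CSR privilege check for a supervisor-level CSR. */
    return priv == PRIV_S ? ACCESS_OK : ILLEGAL_INSTRUCTION;
}
```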
A reset value of 0 is suggested for the RCID field, matching resource
controllers' default behavior of associating all capacity with RCID=0. The
MCID reset value does not affect functionality and may be
implementation-defined.
Typically, fewer bits are allocated for RCID (e.g., to support tens of RCIDs)
than for MCID (e.g., to support hundreds of MCIDs). A common RCID is usually
used to group apps or VMs, pooling resource allocations to meet collective SLAs.
If an SLA breach occurs, unique MCIDs enable granular monitoring, aiding
decisions on resource adjustment, associating a different RCID with a subset
of members, or migrating members to other machines. The larger pool of MCIDs
speeds up this analysis.
The RCID and MCID in srmcfg apply across all privilege levels on the hart.
Typically, higher-privilege modes don’t modify srmcfg, as they often serve
lower-privileged tasks. If differentiation is needed, higher privilege code can
update srmcfg and restore it before returning to a lower privilege level.
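A minimal sketch of the update-and-restore pattern described above, written so that it runs on any host: the variable srmcfg_csr stands in for the real CSR, srmcfg_swap models an atomic CSR read/write, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hart's srmcfg CSR; real code would use CSR
 * read/write instructions instead of a variable. */
static uint64_t srmcfg_csr;

/* Models an atomic CSR swap: write new_val, return the previous value. */
static inline uint64_t srmcfg_swap(uint64_t new_val)
{
    uint64_t old = srmcfg_csr;
    srmcfg_csr = new_val;
    return old;
}

/* Run work() under the higher-privilege code's own identifiers, then
 * restore the interrupted context's identifiers before returning. */
static inline void run_with_srmcfg(uint64_t own_cfg, void (*work)(void))
{
    assert(work != 0);
    uint64_t saved = srmcfg_swap(own_cfg);
    work();
    (void)srmcfg_swap(saved);
}

/* Example workload that records which identifiers it ran under. */
static uint64_t observed_cfg;
static void sample_work(void) { observed_cfg = srmcfg_csr; }
```

Calling run_with_srmcfg(own_cfg, sample_work) attributes only the body of sample_work to own_cfg; everything before and after the call remains attributed to the interrupted context.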
In VM environments, hypervisors usually manage resource allocations, keeping
the Guest OS out of QoS flows. If needed, the hypervisor can virtualize
srmcfg CSR for a VM using the virtual-instruction exceptions triggered upon
Guest access. If the direct selection of RCID and MCID by the VM becomes
common and emulation overhead is an issue, future extensions may allow VS-mode
to use a selector for a hypervisor-configured set of CSRs holding RCID and
MCID values designated for that Guest OS use.
During context switches, the supervisor may choose to execute with the srmcfg
of the outgoing context to attribute the execution to it. Prior to restoring
the new context, it switches to the new VM’s srmcfg. The supervisor can also
use a separate configuration for execution not to be attributed to either
context.
8.2 Ssu64xl Extension for UXLEN=64 Support, Version 1.0
If the Ssu64xl extension is implemented, then sstatus.UXL must be capable of
holding the value 2 (i.e., UXLEN=64 must be supported).
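For illustration, the following sketch decodes the UXL field from an sstatus value. It assumes SXLEN=64, where UXL occupies sstatus bits 33:32, and the standard XL encoding (1 for 32, 2 for 64, 3 for 128); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SSTATUS_UXL_SHIFT 32
#define SSTATUS_UXL_MASK  UINT64_C(3)

/* Extract the raw UXL encoding from an sstatus value (SXLEN=64). */
static inline unsigned sstatus_uxl(uint64_t sstatus)
{
    return (unsigned)((sstatus >> SSTATUS_UXL_SHIFT) & SSTATUS_UXL_MASK);
}

/* Convert an XL encoding to a width in bits: 1 -> 32, 2 -> 64, 3 -> 128.
 * Returns 0 for the reserved encoding 0. */
static inline unsigned xl_to_xlen(unsigned xl)
{
    assert(xl <= 3);
    return xl == 0 ? 0u : 16u << xl;
}
```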
8.3 Ssccptr Extension for Main Memory Page-Table Reads, Version 1.0
If the Ssccptr extension is implemented, then main memory regions with both the cacheability and coherence PMAs must support hardware page-table reads.
8.4 Sstvecd Extension for Direct Trap Vectoring, Version 1.0
If the Sstvecd extension is implemented, then stvec.MODE must be capable of
holding the value 0 (Direct).
Furthermore, when stvec.MODE=Direct, stvec.BASE must be capable of holding
any valid four-byte-aligned address.
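As a hedged sketch of what this requirement means for software, the helpers below compose and decode an stvec value for Direct mode, assuming the standard stvec layout (MODE in bits 1:0, BASE in the remaining upper bits). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STVEC_MODE_MASK   UINT64_C(3)
#define STVEC_MODE_DIRECT UINT64_C(0)

/* Build an stvec value that sends all traps to handler_addr (Direct mode).
 * The handler address must be four-byte aligned because its low two bits
 * are occupied by the MODE field. */
static inline uint64_t stvec_direct(uint64_t handler_addr)
{
    assert((handler_addr & STVEC_MODE_MASK) == 0);
    return handler_addr | STVEC_MODE_DIRECT;
}

/* Recover the trap handler base address and the mode from an stvec value. */
static inline uint64_t stvec_base_addr(uint64_t stvec)
{
    return stvec & ~STVEC_MODE_MASK;
}

static inline unsigned stvec_mode(uint64_t stvec)
{
    return (unsigned)(stvec & STVEC_MODE_MASK);
}
```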
8.5 Sstvala Extension for Trap Value Reporting, Version 1.0
If the Sstvala extension is implemented, then stval must be written with the
faulting virtual address for load, store, and instruction page-fault,
access-fault, and misaligned exceptions, and for breakpoint exceptions that
are defined to write an address to stval, other than those caused by execution
of the EBREAK or C.EBREAK instructions.
For virtual-instruction and illegal-instruction exceptions, stval must be
written with the faulting instruction.
8.6 Sscounterenw Extension for Counter-Enable Writability, Version 1.0
If the Sscounterenw extension is implemented, then for any hpmcounter that
is not read-only zero, the corresponding bit in scounteren must be writable.
8.7 Ssstrict Extension for Extension Conformance, Version 1.0
If the Ssstrict extension is implemented, then no non-conforming extensions are present. Furthermore, attempts to execute unimplemented opcodes or access unimplemented CSRs in the standard or reserved encoding spaces raise an illegal-instruction exception that results in a contained trap to the supervisor-mode trap handler.
8.8 "Sstc" Extension for Supervisor-mode Timer Interrupts, Version 1.0
The current Privileged arch specification only defines a hardware mechanism for generating machine-mode timer interrupts (based on the mtime and mtimecmp registers). As a result, timer services for S-mode/HS-mode (and for VS-mode) all have to be provided by M-mode, via SBI calls from S/HS-mode up to M-mode (or VS-mode calls to HS-mode and then to M-mode). M-mode software then multiplexes these multiple logical timers onto its one physical M-mode timer facility, and the M-mode timer interrupt handler passes timer interrupts back down to the appropriate lower privilege mode.
This extension serves to provide supervisor mode with its own CSR-based timer
interrupt facility that it can directly manage to provide its own timer service
(in the form of having its own stimecmp register) - thus eliminating the large
overheads for emulating S/HS-mode timers and timer interrupt generation up in
M-mode. Further, this extension adds a similar facility to the Hypervisor
extension for VS-mode.
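This section does not restate the comparison rule, so the sketch below relies on an assumption drawn from the Sstc definition: the supervisor timer interrupt is pending whenever time is greater than or equal to stimecmp, with both treated as unsigned 64-bit values. The helper names are hypothetical, and saturating the deadline on overflow is a design choice of this sketch, not a requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed Sstc rule: pending when time >= stimecmp (unsigned compare). */
static inline bool stip_pending(uint64_t time, uint64_t stimecmp)
{
    return time >= stimecmp;
}

/* Compute the stimecmp value for a timer that fires delta_ticks from now.
 * Saturates at UINT64_MAX so that a wrapped sum cannot produce a deadline
 * in the past and fire immediately. */
static inline uint64_t stimecmp_for_delay(uint64_t now, uint64_t delta_ticks)
{
    uint64_t deadline = now + delta_ticks;
    return deadline < now ? UINT64_MAX : deadline;
}
```

With a facility like this, S-mode can program its next deadline by writing stimecmp directly instead of making an SBI call up to M-mode.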
The extension name is "Sstc" ('Ss' for Privileged arch and Supervisor-level
extensions, and 'tc' for timecmp). This extension adds the S-level stimecmp
CSR (see stimecmp) and the VS-level vstimecmp CSR (see vstimecmp). It also
adds the STCE bit to the menvcfg (see sec:menvcfg) and henvcfg
(see sec:henvcfg) CSRs.
8.9 "Sscofpmf" Extension for Count Overflow and Mode-Based Filtering, Version 1.0
The current Privileged specification defines mhpmevent CSRs to select and control event counting by the associated hpmcounter CSRs, but provides no standardization of any fields within these CSRs. For at least Linux-class rich-OS systems it is desirable to standardize certain basic features that are broadly desired (and have come up over the past year plus on RISC-V lists, as well as have been the subject of past proposals). This enables there to be standard upstream software support that eliminates the need for implementations to provide their own custom software support.
This extension serves to accomplish exactly this within the existing mhpmevent CSRs (and correspondingly avoids the unnecessary creation of whole new sets of CSRs, beyond just one new CSR).
This extension sticks to addressing two basic well-understood needs that have been requested by various people. To make it easy to understand the deltas from the current Priv 1.11/1.12 specs, this is written as the actual exact changes to be made to existing paragraphs of Priv spec text (or additional paragraphs within the existing text).
The extension name is "Sscofpmf" ('Ss' for Privileged arch and Supervisor-level extensions, and 'cofpmf' for Count OverFlow and Privilege Mode Filtering).
Note that the new count overflow interrupt will be treated as a standard local interrupt that is assigned to bit 13 in the mip/mie/sip/sie registers.
8.9.1 Count Overflow Control
The following bits are added to mhpmevent:
| 63 | 62 | 61 | 60 | 59 | 58 | 57 | 56 |
|---|---|---|---|---|---|---|---|
| OF | MINH | SINH | UINH | VSINH | VUINH | WPRI | WPRI |
| Field | Description |
|---|---|
| OF | Overflow status and interrupt disable bit that is set when counter overflows |
| MINH | If set, then counting of events in M-mode is inhibited |
| SINH | If set, then counting of events in S/HS-mode is inhibited |
| UINH | If set, then counting of events in U-mode is inhibited |
| VSINH | If set, then counting of events in VS-mode is inhibited |
| VUINH | If set, then counting of events in VU-mode is inhibited |
| WPRI | Reserved |
| WPRI | Reserved |
For each xINH bit, if the associated privilege mode is not implemented,
the bit is read-only zero.
Each of the five xINH bits, when set, inhibits counting of events while in
privilege mode x. All-zeroes for these bits results in counting of events in
all modes.
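The bit positions in the table above translate directly into masks. The following sketch shows one way software might test whether a given mhpmevent value inhibits counting in a particular mode; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the mhpmevent table above. */
#define MHPMEVENT_OF    (UINT64_C(1) << 63)
#define MHPMEVENT_MINH  (UINT64_C(1) << 62)
#define MHPMEVENT_SINH  (UINT64_C(1) << 61)
#define MHPMEVENT_UINH  (UINT64_C(1) << 60)
#define MHPMEVENT_VSINH (UINT64_C(1) << 59)
#define MHPMEVENT_VUINH (UINT64_C(1) << 58)

enum hart_mode { MODE_M, MODE_S, MODE_U, MODE_VS, MODE_VU };

/* Return true if the xINH bit for `mode` is set in `mhpmevent`. */
static inline bool counting_inhibited(uint64_t mhpmevent, enum hart_mode mode)
{
    switch (mode) {
    case MODE_M:  return (mhpmevent & MHPMEVENT_MINH)  != 0;
    case MODE_S:  return (mhpmevent & MHPMEVENT_SINH)  != 0; /* S/HS-mode */
    case MODE_U:  return (mhpmevent & MHPMEVENT_UINH)  != 0;
    case MODE_VS: return (mhpmevent & MHPMEVENT_VSINH) != 0;
    case MODE_VU: return (mhpmevent & MHPMEVENT_VUINH) != 0;
    }
    return false;
}
```

For example, a profiler that wants to count only U-mode events would set MINH, SINH, VSINH, and VUINH and leave UINH clear.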
The OF bit is set when the corresponding hpmcounter overflows, and remains set until written by software. Since hpmcounter values are unsigned values, overflow is defined as unsigned overflow of the implemented counter bits. Note that there is no loss of information after an overflow since the counter wraps around and keeps counting while the sticky OF bit remains set.
If supervisor mode is implemented, the 32-bit scountovf register contains read-only shadow copies of the OF bits in all 29 mhpmevent registers.
If an hpmcounter overflows while the associated OF bit is zero, then a "count overflow interrupt request" is generated. If the OF bit is one, then no interrupt request is generated. Consequently the OF bit also functions as a count overflow interrupt disable for the associated hpmcounter.
Count overflow never results from writes to the mhpmcounterN or mhpmeventN registers, only from hardware increments of counter registers.
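The interaction between wrap-around, the sticky OF bit, and interrupt generation can be modelled in a few lines. This is an illustrative behavioral model only: the struct, its width field (the number of implemented counter bits), and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MHPMEVENT_OF (UINT64_C(1) << 63)

struct hpm {
    uint64_t counter; /* value of the hpmcounter */
    uint64_t event;   /* value of the matching mhpmevent register */
    unsigned width;   /* implemented counter bits, 1..64 */
};

/* Model one hardware increment. Returns true if a count overflow interrupt
 * request is generated, which happens only when the counter wraps while OF
 * is zero. OF is set on every wrap and stays set until software clears it. */
static inline bool hpm_increment(struct hpm *c)
{
    assert(c->width >= 1 && c->width <= 64);
    uint64_t mask = c->width == 64 ? UINT64_MAX
                                   : (UINT64_C(1) << c->width) - 1;
    bool wrapped = (c->counter & mask) == mask;

    c->counter = (c->counter + 1) & mask; /* unsigned wrap, keeps counting */
    if (!wrapped)
        return false;

    bool request = (c->event & MHPMEVENT_OF) == 0;
    c->event |= MHPMEVENT_OF;
    return request;
}
```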
This count-overflow-interrupt-request signal is treated as a standard local
interrupt that corresponds to bit 13 in the mip/mie/sip/sie registers.
The mip/sip LCOFIP and mie/sie LCOFIE bits are, respectively, the
interrupt-pending and interrupt-enable bits for this interrupt.
('LCOFI' represents 'Local Count Overflow Interrupt'.)
Generation of a count-overflow-interrupt request by an hpmcounter sets the
associated OF bit.
When an OF bit is set, it eventually, but not necessarily immediately, sets
the LCOFIP bit in the mip/sip registers.
The LCOFIP bit is cleared by software before servicing the count overflow
interrupt resulting from one or more count overflows. The mideleg register
controls the delegation of this interrupt to S-mode versus M-mode.
There are no separate overflow status and overflow interrupt enable bits. In practice, enabling overflow interrupt generation (by clearing the OF bit) is done in conjunction with initializing the counter to a starting value. Once a counter has overflowed, it and the OF bit must be reinitialized before another overflow interrupt can be generated.
Software can distinguish newly overflowed counters (yet to be serviced by an overflow interrupt handler) from overflowed counters that have already been serviced or that are configured to not generate an interrupt on overflow, by maintaining a bit mask reflecting which counters are active and due to eventually overflow.
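One possible shape for that bookkeeping is sketched below, assuming the handler has access to a snapshot of the OF bits (for example from the scountovf register) indexed by counter number. The armed mask and the function name are hypothetical; a real handler would also reinitialize each counter and clear its OF bit before marking it armed again.

```c
#include <assert.h>
#include <stdint.h>

/* Given a snapshot of the OF bits and the software-maintained mask of
 * counters that are armed (active and expected to overflow), return the
 * counters that have newly overflowed. Those counters are removed from
 * the armed mask so that they are not reported again until software
 * re-arms them. */
static inline uint32_t newly_overflowed(uint32_t of_bits, uint32_t *armed)
{
    assert(armed != 0);
    uint32_t fresh = of_bits & *armed;
    *armed &= ~fresh;
    return fresh;
}
```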
8.9.2 Supervisor Count Overflow (scountovf) Register
This extension adds the scountovf CSR,
a 32-bit read-only register that contains shadow copies of
the OF bits in the 29 mhpmevent CSRs (mhpmevent3 through mhpmevent31), where
scountovf bit X corresponds to mhpmeventX.
This register enables supervisor-level overflow interrupt handler software to quickly and easily determine which counter(s) have overflowed (without needing to make an execution environment call or series of calls ultimately up to M-mode).
Read access to bit X is subject to the same mcounteren (or mcounteren and hcounteren) CSRs that mediate access to the hpmcounter CSRs by S-mode (or VS-mode). In M-mode, scountovf bit X is always readable. In S/HS-mode, scountovf bit X is readable when mcounteren bit X is set, and otherwise reads as zero. Similarly, in VS-mode, scountovf bit X is readable when mcounteren bit X and hcounteren bit X are both set, and otherwise reads as zero.
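The read-masking rules above reduce to a bitwise AND per mode. The following is a behavioral sketch with hypothetical names, where of_bits holds the OF bit of mhpmeventX at bit position X.

```c
#include <assert.h>
#include <stdint.h>

/* Only bits 3..31 correspond to mhpmevent3..mhpmevent31. */
#define SCOUNTOVF_VALID UINT32_C(0xFFFFFFF8)

enum sovf_mode { SOVF_M, SOVF_HS, SOVF_VS };

/* Value observed when scountovf is read from the given mode. */
static inline uint32_t scountovf_read(uint32_t of_bits, enum sovf_mode mode,
                                      uint32_t mcounteren, uint32_t hcounteren)
{
    uint32_t visible = of_bits & SCOUNTOVF_VALID;

    switch (mode) {
    case SOVF_M:  return visible;                           /* always readable */
    case SOVF_HS: return visible & mcounteren;              /* S/HS-mode */
    case SOVF_VS: return visible & mcounteren & hcounteren; /* VS-mode */
    }
    return 0;
}
```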
8.10 "Ssdbltrp" Double Trap Extension, Version 1.0
The Ssdbltrp extension addresses a double trap (see machine-double-trap) at privilege modes lower than M. It enables HS-mode to invoke a critical error handler in a virtual machine on a double trap in VS-mode. It also allows M-mode to invoke a critical error handler in the OS/Hypervisor on a double trap in S/HS-mode.
The Ssdbltrp extension adds the menvcfg.DTE (See sec:menvcfg) and the
sstatus.SDT fields (See sstatus). If the hypervisor extension is
additionally implemented, then the extension adds the henvcfg.DTE (See
sec:henvcfg) and the vsstatus.SDT fields (See vsstatus).
See supv-double-trap for the operational details.