
target/riscv/pmp: fix non-translated page size address checks w/ MPU

Message ID 20220909152258.2568942-1-leon@is.currently.online
State New
Series target/riscv/pmp: fix non-translated page size address checks w/ MPU

Commit Message

Leon Schuermann Sept. 9, 2022, 3:22 p.m. UTC
From: Leon Schuermann <leon@is.currently.online>

This commit fixes PMP address access checks with non-page-aligned PMP
regions on harts with the MPU enabled. Without this change, the presence
of an MPU in the virtual CPU model would influence the PMP address
check behavior whenever the access size is unknown (`size == 0`),
regardless of whether virtual memory has actually been enabled by the
guest.

The RISC-V Privileged Spec Version 20211203[1] states in Section 4.3.1,
Addressing and Memory Protection, that "[...] [w]hen Sv32 virtual
memory mode is selected in the MODE field of the satp register,
supervisor virtual addresses are translated into supervisor physical
addresses via a two-level page table. The 20-bit VPN is translated
into a 22-bit physical page number (PPN), while the 12-bit page offset
is untranslated. The resulting supervisor-level physical addresses are
then checked using any physical memory protection structures (Sections
3.7), before being directly converted to machine-level physical
addresses. [...]" and "[...] [w]hen the value of satp.MODE is Bare,
the 32-bit virtual address is translated (unmodified) into a 32-bit
physical address [...]". Other modes such as Sv39, Sv48 and Sv57 are
said to behave similarly in this regard.

From this specification it can be inferred that any access made when
virtual memory is disabled, which is the case when satp.MODE is set to
"Bare" (0), should behave identically with respect to PMP checks as if
no MPU were present in the system at all. The current implementation,
however, degrades any PMP address check of unknown access size (which
appears to be the case at least for instruction fetches) to page
granularity, based solely on the fact that the hart has MPU support
enabled. This causes systems that rely on 4-byte-aligned PMP regions
to incur access faults that do not occur with the MPU disabled,
independent of any runtime guest configuration.
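
For illustration, a minimal standalone sketch (not QEMU code; the
PAGE_* constants are simplified stand-ins for the TARGET_PAGE_* macros)
of the size that the current implementation checks against when
`size == 0`:

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096u
    #define PAGE_MASK (~(PAGE_SIZE - 1))   /* stand-in for TARGET_PAGE_MASK */

    int main(void)
    {
        uint32_t addr = 0x80000100;        /* fetch address inside a page */
        uint32_t pmp_region_size = 4;      /* 4-byte-aligned PMP region   */

        /* With size == 0 and MPU support enabled, the current code checks
         * all bytes from addr to the end of the page: */
        uint32_t checked_size = -(addr | PAGE_MASK);

        printf("checked %u bytes, PMP region covers %u bytes\n",
               checked_size, pmp_region_size);
        /* Prints "checked 3840 bytes, PMP region covers 4 bytes": the
         * check spans far beyond the 4-byte region, so it fails even
         * though the actual fetch would stay inside the region. */
        return 0;
    }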

While there possibly are other unhandled edge cases in which
page-granularity access checks might not be appropriate, this commit
appears to be a strict improvement over the current implementation's
behavior. It has been tested using Tock OS, but not with other
systems (e.g., Linux) yet.

[1]: https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf

Signed-off-by: Leon Schuermann <leon@is.currently.online>
---

This patch is a resubmission to include all maintainers of the
modified files and the main QEMU mailing list, as determined through
the `get_maintainer.pl` script.

Also, one particular example of an additional edge case not handled by
this patch is a hart operating in M-mode. Given that virtual memory
through {Sv32,Sv39,Sv48,Sv57} only applies to S-mode and U-mode
accesses, enabling virtual memory in the satp CSR should not have any
effect on the behavior of memory accesses w.r.t. PMP checks for harts
operating in M-mode.

I'm going to defer adding this additional check, as I'd first appreciate
some feedback on whether my reasoning here is correct at all.

Thanks!

-Leon

---
 target/riscv/pmp.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)


base-commit: 61fd710b8da8aedcea9b4f197283dc38638e4b60

Comments

Alistair Francis Sept. 19, 2022, 11:28 p.m. UTC | #1
On Sat, Sep 10, 2022 at 1:24 AM <leon@is.currently.online> wrote:
>
> From: Leon Schuermann <leon@is.currently.online>
>
> This commit fixes PMP address access checks with non page-aligned PMP
> regions on harts with MPU enabled. Without this change, the presence
> of an MPU in the virtual CPU model would influence the PMP address
> check behavior when an access size was unknown (`size == 0`),
> regardless of whether virtual memory has actually been enabled by the
> guest.
>
> The RISC-V Privileged Spec Version 20211203[1] states in 4.3.1
> Addressing and Memory Protection that "[...]  [w]hen Sv32 virtual
> memory mode is selected in the MODE field of the satp register,
> supervisor virtual addresses are translated into supervisor physical
> addresses via a two-level page table. The 20-bit VPN is translated
> into a 22-bit physical page number (PPN), while the 12-bit page offset
> is untranslated. The resulting supervisor-level physical addresses are
> then checked using any physical memory protection structures (Sections
> 3.7), before being directly converted to machine-level physical
> addresses. [...]" and "[...] [w]hen the value of satp.MODE is Bare,
> the 32-bit virtual address is translated (unmodified) into a 32-bit
> physical address [...]". Other modes such as Sv39, Sv48 and Sv57 are
> said to behave similar in this regard.
>
> From this specification it can be inferred that any access made when
> virtual memory is disabled, which is the case when satp.MODE is set to
> "Bare" (0), should behave identically with respect to PMP checks as if
> no MPU were present in the system at all. The current implementation,
> however, degrades any PMP address checks of unknown access size (which
> seems to be the case for instruction fetches at least) to be of
> page-granularity, just based on the fact that the hart has MPU support
> enabled. This causes systems that rely on 4-byte aligned PMP regions
> to incur access faults, which are not occurring with the MPU disabled,
> independent of any runtime guest configuration.
>
> While there possibly are other unhandled edge cases in which
> page-granularity access checks might not be appropriate, this commit
> appears to be a strict improvement over the current implementation's
> behavior. It has been tested using Tock OS, but not with other
> systems (e.g., Linux) yet.
>
> [1]: https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf
>
> Signed-off-by: Leon Schuermann <leon@is.currently.online>

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>

> ---
>
> This patch is a resubmission to include all maintainers of the
> modified files and main QEMU mailing list, as determined through the
> `get_maintainer.pl` script.
>
> Also, one particular example of an additional edge case not handled
> through this patch might be a hart operating in M-mode. Given that
> virtual memory through {Sv32,Sv39,Sv48,Sv57} is only supported for
> S-mode and U-mode respectively, enabling virtual memory in the satp
> CSR should not have any effect on the behavior of memory accesses
> w.r.t. PMP checks for harts operating in M-mode.
>
> I'm going to defer adding this additional check, as I'd appreciate some
> feedback as to whether my reasoning is correct here at all first.
>
> Thanks!
>
> -Leon
>
> ---
>  target/riscv/pmp.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index ea2b67d947..48f64a4aef 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -300,6 +300,7 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
>      int i = 0;
>      int ret = -1;
>      int pmp_size = 0;
> +    uint64_t satp_mode;
>      target_ulong s = 0;
>      target_ulong e = 0;
>
> @@ -310,10 +311,17 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
>      }
>
>      if (size == 0) {
> -        if (riscv_feature(env, RISCV_FEATURE_MMU)) {
> +        if (riscv_cpu_mxl(env) == MXL_RV32) {
> +            satp_mode = SATP32_MODE;
> +        } else {
> +            satp_mode = SATP64_MODE;
> +        }
> +
> +        if (riscv_feature(env, RISCV_FEATURE_MMU)
> +            && get_field(env->satp, satp_mode)) {
>              /*
> -             * If size is unknown (0), assume that all bytes
> -             * from addr to the end of the page will be accessed.
> +             * If size is unknown (0) and virtual memory is enabled, assume that
> +             * all bytes from addr to the end of the page will be accessed.
>               */
>              pmp_size = -(addr | TARGET_PAGE_MASK);

I'm not sure if we need this at all.

This function is only called from get_physical_address_pmp() which
then calculates the maximum size using pmp_is_range_in_tlb().

I suspect that we could just use sizeof(target_ulong) as the fallback
for every time size == 0. Then pmp_is_range_in_tlb() will set the
tlb_size to the maximum possible size of the PMP region.

As a plus, we would remove some macros as well, so what about (untested)?

    if (size == 0) {
        if (riscv_cpu_mxl(env) == MXL_RV32) {
            pmp_size = 4;
        } else {
            pmp_size = 8;
        }
    } else {
        pmp_size = size;
    }

Alistair

>          } else {
>
> base-commit: 61fd710b8da8aedcea9b4f197283dc38638e4b60
> --
> 2.36.0
>
>
Leon Schuermann Oct. 19, 2022, 9:29 p.m. UTC | #2
Alistair Francis <alistair23@gmail.com> writes:
>> @@ -310,10 +311,17 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
>>      }
>>
>>      if (size == 0) {
>> -        if (riscv_feature(env, RISCV_FEATURE_MMU)) {
>> +        if (riscv_cpu_mxl(env) == MXL_RV32) {
>> +            satp_mode = SATP32_MODE;
>> +        } else {
>> +            satp_mode = SATP64_MODE;
>> +        }
>> +
>> +        if (riscv_feature(env, RISCV_FEATURE_MMU)
>> +            && get_field(env->satp, satp_mode)) {
>>              /*
>> -             * If size is unknown (0), assume that all bytes
>> -             * from addr to the end of the page will be accessed.
>> +             * If size is unknown (0) and virtual memory is enabled, assume that
>> +             * all bytes from addr to the end of the page will be accessed.
>>               */
>>              pmp_size = -(addr | TARGET_PAGE_MASK);
>
> I'm not sure if we need this at all.
>
> This function is only called from get_physical_address_pmp() which
> then calculates the maximum size using pmp_is_range_in_tlb().

I'm by no means an expert on QEMU and the TCG, so I've spun up GDB to
trace down why exactly this function is called with a `size = 0`
argument. It seems that there are, generally, two code paths to this
function for instruction fetching:

1. From `get_page_addr_code`: this will invoke `tlb_fill` with
   `size = 0` to check whether an entire page is accessible and can be
   translated given the current MMU / PMP configuration. In my
   particular example, it may rightfully fail then. `get_page_addr_code`
   can handle this and will subsequently cause an MMU protection check
   to be run for each instruction translated.

2. From `riscv_tr_translate_insn` through `cpu_lduw_code`, which will
   execute `tlb_fill` with `size = 2` (to try and decode a compressed
   instruction), assuming that the above check failed.

So far, so good. In this context, it actually makes sense for
`pmp_hart_has_privs` to interpret `size = 0` to mean whether the entire
page is allowed to be accessed.

> I suspect that we could just use sizeof(target_ulong) as the fallback
> for every time size == 0. Then pmp_is_range_in_tlb() will set the
> tlb_size to the maximum possible size of the PMP region.

Given the above, I don't think that this is correct either. The PMP
check would pass even for non-page sized regions, but the entire page
would be accessible through the TCG's TLB, as a consequence of
`get_page_addr_code`.


In the current implementation, `get_page_addr_code_hostp` calls
`tlb_fill`, which in turn invokes the RISC-V TCG op `tlb_fill` with the
parameter `probe = false`. This raises a PMP exception in the
CPU, whereas `get_page_addr_code` would seem to expect a failing
`tlb_fill` to be side-effect free, such that the MMU protection checks
can be re-run per instruction in the TCG code generation phase.

I think that this is sufficient evidence to conclude that my initial
patch is actually incorrect; however, I am unsure how this issue can be
solved properly. One approach which seems to work is to change
`get_page_addr_code_hostp` to use a non-faulting page-table read
instead:

@@ -1510,11 +1510,15 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
     uintptr_t mmu_idx = cpu_mmu_index(env, true);
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
+    CPUState *cs = env_cpu(env);
+    CPUClass *cc = CPU_GET_CLASS(cs);
     void *p;
 
     if (unlikely(!tlb_hit(entry->addr_code, addr))) {
         if (!VICTIM_TLB_HIT(addr_code, addr)) {
-            tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
+            // Nonfaulting page-table read:
+            cc->tcg_ops->tlb_fill(cs, addr, 0, MMU_INST_FETCH, mmu_idx, true,
+                                  0);
             index = tlb_index(env, mmu_idx, addr);
             entry = tlb_entry(env, mmu_idx, addr);
 

However, given that this touches the generic TCG implementation, I cannot
judge whether it is correct or has any unintended side effects for
other targets. If it is correct, I'd be happy to send a proper patch.

-Leon
Alistair Francis Oct. 25, 2022, 2:19 a.m. UTC | #3
On Thu, Oct 20, 2022 at 7:29 AM Leon Schuermann
<leon@is.currently.online> wrote:
>
> Alistair Francis <alistair23@gmail.com> writes:
> >> @@ -310,10 +311,17 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
> >>      }
> >>
> >>      if (size == 0) {
> >> -        if (riscv_feature(env, RISCV_FEATURE_MMU)) {
> >> +        if (riscv_cpu_mxl(env) == MXL_RV32) {
> >> +            satp_mode = SATP32_MODE;
> >> +        } else {
> >> +            satp_mode = SATP64_MODE;
> >> +        }
> >> +
> >> +        if (riscv_feature(env, RISCV_FEATURE_MMU)
> >> +            && get_field(env->satp, satp_mode)) {
> >>              /*
> >> -             * If size is unknown (0), assume that all bytes
> >> -             * from addr to the end of the page will be accessed.
> >> +             * If size is unknown (0) and virtual memory is enabled, assume that
> >> +             * all bytes from addr to the end of the page will be accessed.
> >>               */
> >>              pmp_size = -(addr | TARGET_PAGE_MASK);
> >
> > I'm not sure if we need this at all.
> >
> > This function is only called from get_physical_address_pmp() which
> > then calculates the maximum size using pmp_is_range_in_tlb().
>
> I'm by no means an expert on QEMU and the TCG, so I've spun up GDB to
> trace down why exactly this function is called with a `size = 0`
> argument. It seems that there are, generally, two code paths to this
> function for instruction fetching:
>
> 1. From `get_page_addr_code`: this will invoke `tlb_fill` with
>    `size = 0` to check whether an entire page is accessible and can be
>    translated given the current MMU / PMP configuration. In my
>    particular example, it may rightfully fail then. `get_page_addr_code`
>    can handle this and will subsequently cause an MMU protection check
>    to be run for each instruction translated.
>
> 2. From `riscv_tr_translate_insn` through `cpu_lduw_code`, which will
>    execute `tlb_fill` with `size = 2` (to try and decode a compressed
>    instruction), assuming that the above check failed.
>
> So far, so good. In this context, it actually makes sense for
> `pmp_hart_has_privs` to interpret `size = 0` to mean whether the entire
> page is allowed to be accessed.
>
> > I suspect that we could just use sizeof(target_ulong) as the fallback
> > for every time size == 0. Then pmp_is_range_in_tlb() will set the
> > tlb_size to the maximum possible size of the PMP region.
>
> Given the above, I don't think that this is correct either. The PMP
> check would pass even for non-page sized regions, but the entire page
> would be accessible through the TCG's TLB, as a consequence of
> `get_page_addr_code`.

If we pass a size smaller than the page, it won't be cached in the
TLB, so that should be ok.

A few PMP improvements have gone into
https://github.com/alistair23/qemu/tree/riscv-to-apply.next. It might
be worth checking to see if that fixes the issue you are seeing.
Otherwise I think defaulting to the xlen should be ok.
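
For intuition, a rough sketch of the clamping idea (illustrative only,
not the actual pmp_is_range_in_tlb()/get_physical_address_pmp() code):
when the PMP region around the access does not cover the whole page, the
resulting TLB entry size stays below the page size, so — per the above —
the translation is not cached at page granularity and later accesses are
re-checked against the PMP.

    /* Illustrative only; assumes pmp_start <= addr <= pmp_end. */
    static uint64_t clamp_to_pmp_region(uint64_t addr,
                                        uint64_t pmp_start,
                                        uint64_t pmp_end,   /* inclusive */
                                        uint64_t page_size)
    {
        uint64_t page_start = addr & ~(page_size - 1);

        if (pmp_start <= page_start &&
            pmp_end >= page_start + page_size - 1) {
            /* Whole page inside one PMP region: page-sized TLB entry. */
            return page_size;
        }
        /* Region smaller than the page: only a sub-page entry, which is
         * not cached, so each later access repeats the PMP check. */
        return pmp_end - addr + 1;
    }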

Alistair

>
>
> In the current implementation, `get_page_addr_code_hostp` calls
> `tlb_fill`, which in turn invokes the RISC-V TCG op `tlb_fill` with the
> parameter `probe = false`. This in turn raises a PMP exception in the
> CPU, whereas `get_page_addr_code` would seem to expect this a failing
> `tlb_fill` to be side-effect free, such that the MMU protection checks
> can be re-run per instruction in the TCG code generation phase.
>
> I think that this is sufficient evidence to conclude that my initial
> patch is actually incorrect, however I am unsure as to how this issue
> can be solved proper. One approach which seems to work is to change
> `get_page_addr_code_hostp` to use a non-faulting page-table read
> instead:
>
> @@ -1510,11 +1510,15 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
>      uintptr_t mmu_idx = cpu_mmu_index(env, true);
>      uintptr_t index = tlb_index(env, mmu_idx, addr);
>      CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
> +    CPUState *cs = env_cpu(env);
> +    CPUClass *cc = CPU_GET_CLASS(cs);
>      void *p;
>
>      if (unlikely(!tlb_hit(entry->addr_code, addr))) {
>          if (!VICTIM_TLB_HIT(addr_code, addr)) {
> -            tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
> +            // Nonfaulting page-table read:
> +            cc->tcg_ops->tlb_fill(cs, addr, 0, MMU_INST_FETCH, mmu_idx, true,
> +                                  0);
>              index = tlb_index(env, mmu_idx, addr);
>              entry = tlb_entry(env, mmu_idx, addr);
>
>
> However, given this touches the generic TCG implementation, I cannot
> judge whether this is correct or has any unintended side effects for
> other targets. If this is correct, I'd be happy to send a proper patch.
>
> -Leon

Patch

diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
index ea2b67d947..48f64a4aef 100644
--- a/target/riscv/pmp.c
+++ b/target/riscv/pmp.c
@@ -300,6 +300,7 @@  bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
     int i = 0;
     int ret = -1;
     int pmp_size = 0;
+    uint64_t satp_mode;
     target_ulong s = 0;
     target_ulong e = 0;
 
@@ -310,10 +311,17 @@  bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
     }
 
     if (size == 0) {
-        if (riscv_feature(env, RISCV_FEATURE_MMU)) {
+        if (riscv_cpu_mxl(env) == MXL_RV32) {
+            satp_mode = SATP32_MODE;
+        } else {
+            satp_mode = SATP64_MODE;
+        }
+
+        if (riscv_feature(env, RISCV_FEATURE_MMU)
+            && get_field(env->satp, satp_mode)) {
             /*
-             * If size is unknown (0), assume that all bytes
-             * from addr to the end of the page will be accessed.
+             * If size is unknown (0) and virtual memory is enabled, assume that
+             * all bytes from addr to the end of the page will be accessed.
              */
             pmp_size = -(addr | TARGET_PAGE_MASK);
         } else {