diff mbox series

[6/6] accel/tcg: Retain TLB_INVALID_MASK in the address when TLB is re-filled

Message ID 20230413090122.65228-7-liweiwei@iscas.ac.cn
State New
Headers show
Series target/riscv: Fix PMP related problem | expand

Commit Message

Weiwei Li April 13, 2023, 9:01 a.m. UTC
When a PMP entry overlaps part of the page, we set tlb_size to 1, and
this makes the address be set with TLB_INVALID_MASK so that the page is
un-cached. However, if we clear TLB_INVALID_MASK when the TLB is re-filled,
then the TLB host address will be cached, and the following instructions can
use this host address directly, which may lead to bypassing the PMP-related
check.

Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
---
 accel/tcg/cputlb.c | 7 -------
 1 file changed, 7 deletions(-)

Comments

Daniel Henrique Barboza April 17, 2023, 4:25 p.m. UTC | #1
On 4/13/23 06:01, Weiwei Li wrote:
> When PMP entry overlap part of the page, we'll set the tlb_size to 1, and
> this will make the address set with TLB_INVALID_MASK to make the page
> un-cached. However, if we clear TLB_INVALID_MASK when TLB is re-filled, then
> the TLB host address will be cached, and the following instructions can use
> this host address directly which may lead to the bypass of PMP related check.
> 
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> ---

For this commit I believe it's worth mentioning that it partially reverts
commit c3c8bf579b431b6b ("accel/tcg: Suppress auto-invalidate in
probe_access_internal"), which was made to handle a quirk present in s390x
code.

At first glance this patch seems benign, but we must make sure that no other
assumptions were made about this particular behavior in probe_access_internal().


Thanks,

Daniel

>   accel/tcg/cputlb.c | 7 -------
>   1 file changed, 7 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index e984a98dc4..d0bf996405 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1563,13 +1563,6 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
>               /* TLB resize via tlb_fill may have moved the entry.  */
>               index = tlb_index(env, mmu_idx, addr);
>               entry = tlb_entry(env, mmu_idx, addr);
> -
> -            /*
> -             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> -             * to force the next access through tlb_fill.  We've just
> -             * called tlb_fill, so we know that this entry *is* valid.
> -             */
> -            flags &= ~TLB_INVALID_MASK;
>           }
>           tlb_addr = tlb_read_ofs(entry, elt_ofs);
>       }
Weiwei Li April 18, 2023, 12:48 a.m. UTC | #2
On 2023/4/18 00:25, Daniel Henrique Barboza wrote:
>
>
> On 4/13/23 06:01, Weiwei Li wrote:
>> When PMP entry overlap part of the page, we'll set the tlb_size to 1, 
>> and
>> this will make the address set with TLB_INVALID_MASK to make the page
>> un-cached. However, if we clear TLB_INVALID_MASK when TLB is 
>> re-filled, then
>> the TLB host address will be cached, and the following instructions 
>> can use
>> this host address directly which may lead to the bypass of PMP 
>> related check.
>>
>> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
>> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
>> ---
>
> For this commit I believe it's worth mentioning that it's partially 
> reverting
> commit c3c8bf579b431b6b ("accel/tcg: Suppress auto-invalidate in
> probe_access_internal") that was made to handle a particularity/quirk 
> that was
> present in s390x code.
>
> At first glance this patch seems benign but we must make sure that no 
> other
> assumptions were made with this particular change in 
> probe_access_internal().

I think this change introduces no externally visible functional change except
that we always walk the page table (tlb_fill) for memory accesses to that
page. And this is needed for pages that are partially overlapped by a PMP
region.

Regards,

Weiwei Li


>
>
> Thanks,
>
> Daniel
>
>>   accel/tcg/cputlb.c | 7 -------
>>   1 file changed, 7 deletions(-)
>>
>> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
>> index e984a98dc4..d0bf996405 100644
>> --- a/accel/tcg/cputlb.c
>> +++ b/accel/tcg/cputlb.c
>> @@ -1563,13 +1563,6 @@ static int probe_access_internal(CPUArchState 
>> *env, target_ulong addr,
>>               /* TLB resize via tlb_fill may have moved the entry.  */
>>               index = tlb_index(env, mmu_idx, addr);
>>               entry = tlb_entry(env, mmu_idx, addr);
>> -
>> -            /*
>> -             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK 
>> immediately,
>> -             * to force the next access through tlb_fill. We've just
>> -             * called tlb_fill, so we know that this entry *is* valid.
>> -             */
>> -            flags &= ~TLB_INVALID_MASK;
>>           }
>>           tlb_addr = tlb_read_ofs(entry, elt_ofs);
>>       }
Richard Henderson April 18, 2023, 7:18 a.m. UTC | #3
On 4/17/23 18:25, Daniel Henrique Barboza wrote:
> 
> 
> On 4/13/23 06:01, Weiwei Li wrote:
>> When PMP entry overlap part of the page, we'll set the tlb_size to 1, and
>> this will make the address set with TLB_INVALID_MASK to make the page
>> un-cached. However, if we clear TLB_INVALID_MASK when TLB is re-filled, then
>> the TLB host address will be cached, and the following instructions can use
>> this host address directly which may lead to the bypass of PMP related check.
>>
>> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
>> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
>> ---
> 
> For this commit I believe it's worth mentioning that it's partially reverting
> commit c3c8bf579b431b6b ("accel/tcg: Suppress auto-invalidate in
> probe_access_internal") that was made to handle a particularity/quirk that was
> present in s390x code.
> 
> At first glance this patch seems benign but we must make sure that no other
> assumptions were made with this particular change in probe_access_internal().
> 
> 
> Thanks,
> 
> Daniel
> 
>>   accel/tcg/cputlb.c | 7 -------
>>   1 file changed, 7 deletions(-)
>>
>> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
>> index e984a98dc4..d0bf996405 100644
>> --- a/accel/tcg/cputlb.c
>> +++ b/accel/tcg/cputlb.c
>> @@ -1563,13 +1563,6 @@ static int probe_access_internal(CPUArchState *env, target_ulong 
>> addr,
>>               /* TLB resize via tlb_fill may have moved the entry.  */
>>               index = tlb_index(env, mmu_idx, addr);
>>               entry = tlb_entry(env, mmu_idx, addr);
>> -
>> -            /*
>> -             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
>> -             * to force the next access through tlb_fill.  We've just
>> -             * called tlb_fill, so we know that this entry *is* valid.
>> -             */
>> -            flags &= ~TLB_INVALID_MASK;


I missed the original patch, but this is definitely wrong.

Clearing this bit locally (!) is correct because we want to inform the caller of 
probe_access_* that the access is valid.  We know that it is valid because we have just 
queried tlb_fill (and thus for riscv, PMP).

Clearing the bit locally does *not* cause the tlb entry to be cached -- the INVALID bit is 
still set within the tlb entry.  The next access will again go through tlb_fill.

What is the original problem you are seeing?  The commit message does not say.


r~
Richard Henderson April 18, 2023, 7:36 a.m. UTC | #4
On 4/18/23 09:18, Richard Henderson wrote:
>>> -            /*
>>> -             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
>>> -             * to force the next access through tlb_fill.  We've just
>>> -             * called tlb_fill, so we know that this entry *is* valid.
>>> -             */
>>> -            flags &= ~TLB_INVALID_MASK;
> 
> 
> I missed the original patch, but this is definitely wrong.
> 
> Clearing this bit locally (!) is correct because we want to inform the caller of 
> probe_access_* that the access is valid.  We know that it is valid because we have just 
> queried tlb_fill (and thus for riscv, PMP).
> 
> Clearing the bit locally does *not* cause the tlb entry to be cached -- the INVALID bit is 
> still set within the tlb entry.  The next access will again go through tlb_fill.
> 
> What is the original problem you are seeing?  The commit message does not say.

 From https://lore.kernel.org/qemu-devel/3ace9e9e-91cf-36e6-a18f-494fd44dffab@iscas.ac.cn/
I see that it is a problem with execution.

By eye, it appears that get_page_addr_code_hostp needs adjustment, e.g.

     (void)probe_access_internal(env, addr, 1, MMU_INST_FETCH,
                                 cpu_mmu_index(env, true), false, &p, &full, 0);
     if (p == NULL) {
         return -1;
     }
+   if (full->lg_page_size < TARGET_PAGE_BITS) {
+       return -1;
+   }
     if (hostp) {
         *hostp = p;
     }

It seems like we could do slightly better than this, perhaps by single-stepping through 
such a page, but surely this edge case is so uncommon as to not make it worthwhile to 
consider.


r~
Weiwei Li April 18, 2023, 8:18 a.m. UTC | #5
On 2023/4/18 15:36, Richard Henderson wrote:
> On 4/18/23 09:18, Richard Henderson wrote:
>>>> -            /*
>>>> -             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK 
>>>> immediately,
>>>> -             * to force the next access through tlb_fill. We've just
>>>> -             * called tlb_fill, so we know that this entry *is* 
>>>> valid.
>>>> -             */
>>>> -            flags &= ~TLB_INVALID_MASK;
>>
>>
>> I missed the original patch, but this is definitely wrong.
>>
>> Clearing this bit locally (!) is correct because we want to inform 
>> the caller of probe_access_* that the access is valid. We know that 
>> it is valid because we have just queried tlb_fill (and thus for 
>> riscv, PMP).
>>
>> Clearing the bit locally does *not* cause the tlb entry to be cached 
>> -- the INVALID bit is still set within the tlb entry. The next access 
>> will again go through tlb_fill.
>>
>> What is the original problem you are seeing?  The commit message does 
>> not say.
>
> From 
> https://lore.kernel.org/qemu-devel/3ace9e9e-91cf-36e6-a18f-494fd44dffab@iscas.ac.cn/
> I see that it is a problem with execution.
Yeah. I found this problem in the PMP check for instruction fetch.
>
> By eye, it appears that get_page_addr_code_hostp needs adjustment, e.g.
>
>     (void)probe_access_internal(env, addr, 1, MMU_INST_FETCH,
>                                 cpu_mmu_index(env, true), false, &p, 
> &full, 0);
>     if (p == NULL) {
>         return -1;
>     }
> +   if (full->lg_page_size < TARGET_PAGE_BITS) {
> +       return -1;
> +   }
>     if (hostp) {
>         *hostp = p;
>     }
>
> It seems like we could do slightly better than this, perhaps by 
> single-stepping through such a page, but surely this edge case is so 
> uncommon as to not make it worthwhile to consider.

OK. I'll update and test it later.

Regards,

Weiwei Li

>
>
> r~

Patch

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e984a98dc4..d0bf996405 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1563,13 +1563,6 @@  static int probe_access_internal(CPUArchState *env, target_ulong addr,
             /* TLB resize via tlb_fill may have moved the entry.  */
             index = tlb_index(env, mmu_idx, addr);
             entry = tlb_entry(env, mmu_idx, addr);
-
-            /*
-             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
-             * to force the next access through tlb_fill.  We've just
-             * called tlb_fill, so we know that this entry *is* valid.
-             */
-            flags &= ~TLB_INVALID_MASK;
         }
         tlb_addr = tlb_read_ofs(entry, elt_ofs);
     }