diff mbox series

[v1,3/8] riscv: hugetlb: Convert set_huge_pte_at() to take vma

Message ID 20230921162007.1630149-4-ryan.roberts@arm.com
State New
Headers show
Series Fix set_huge_pte_at() panic on arm64 | expand

Commit Message

Ryan Roberts Sept. 21, 2023, 4:20 p.m. UTC
In order to fix a bug, arm64 needs access to the vma inside it's
implementation of set_huge_pte_at(). Provide for this by converting the
mm parameter to be a vma. Any implementations that require the mm can
access it via vma->vm_mm.

This commit makes the required riscv modifications. Separate commits
update the other arches and core code, before the actual bug is fixed in
arm64.

No behavioral changes intended.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/riscv/include/asm/hugetlb.h | 2 +-
 arch/riscv/mm/hugetlbpage.c      | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

Comments

Alexandre Ghiti Sept. 22, 2023, 7:54 a.m. UTC | #1
Hi Ryan,

On 21/09/2023 18:20, Ryan Roberts wrote:
> In order to fix a bug, arm64 needs access to the vma inside it's
> implementation of set_huge_pte_at(). Provide for this by converting the
> mm parameter to be a vma. Any implementations that require the mm can
> access it via vma->vm_mm.
>
> This commit makes the required riscv modifications. Separate commits
> update the other arches and core code, before the actual bug is fixed in
> arm64.
>
> No behavioral changes intended.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>   arch/riscv/include/asm/hugetlb.h | 2 +-
>   arch/riscv/mm/hugetlbpage.c      | 3 ++-
>   2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
> index 34e24f078cc1..be1ac8582bc2 100644
> --- a/arch/riscv/include/asm/hugetlb.h
> +++ b/arch/riscv/include/asm/hugetlb.h
> @@ -17,7 +17,7 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>   		    pte_t *ptep, unsigned long sz);
>   
>   #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
> -void set_huge_pte_at(struct mm_struct *mm,
> +void set_huge_pte_at(struct vm_area_struct *vma,
>   		     unsigned long addr, pte_t *ptep, pte_t pte);
>   
>   #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 96225a8533ad..7cdbf0960772 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -177,11 +177,12 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
>   	return entry;
>   }
>   
> -void set_huge_pte_at(struct mm_struct *mm,
> +void set_huge_pte_at(struct vm_area_struct *vma,
>   		     unsigned long addr,
>   		     pte_t *ptep,
>   		     pte_t pte)
>   {
> +	struct mm_struct *mm = vma->vm_mm;
>   	int i, pte_num;
>   
>   	if (!pte_napot(pte)) {


You can add:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

I realize that we may have the same issue with our contig pte 
implementation (called napot in riscv) as we don't handle swap/migration 
entries at all. So I guess we need something similar, and I'll implement 
it (unless you want to do it of course, but I guess it's easier for me 
to test). One (maybe stupid) question though: wouldn't it be possible to 
extract the contig pte size from the value of ptep instead of using a vma?

Thanks,

Alex
Ryan Roberts Sept. 22, 2023, 8:36 a.m. UTC | #2
On 22/09/2023 08:54, Alexandre Ghiti wrote:
> Hi Ryan,
> 
> On 21/09/2023 18:20, Ryan Roberts wrote:
>> In order to fix a bug, arm64 needs access to the vma inside it's
>> implementation of set_huge_pte_at(). Provide for this by converting the
>> mm parameter to be a vma. Any implementations that require the mm can
>> access it via vma->vm_mm.
>>
>> This commit makes the required riscv modifications. Separate commits
>> update the other arches and core code, before the actual bug is fixed in
>> arm64.
>>
>> No behavioral changes intended.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>   arch/riscv/include/asm/hugetlb.h | 2 +-
>>   arch/riscv/mm/hugetlbpage.c      | 3 ++-
>>   2 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
>> index 34e24f078cc1..be1ac8582bc2 100644
>> --- a/arch/riscv/include/asm/hugetlb.h
>> +++ b/arch/riscv/include/asm/hugetlb.h
>> @@ -17,7 +17,7 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>>               pte_t *ptep, unsigned long sz);
>>     #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
>> -void set_huge_pte_at(struct mm_struct *mm,
>> +void set_huge_pte_at(struct vm_area_struct *vma,
>>                unsigned long addr, pte_t *ptep, pte_t pte);
>>     #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
>> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
>> index 96225a8533ad..7cdbf0960772 100644
>> --- a/arch/riscv/mm/hugetlbpage.c
>> +++ b/arch/riscv/mm/hugetlbpage.c
>> @@ -177,11 +177,12 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int
>> shift, vm_flags_t flags)
>>       return entry;
>>   }
>>   -void set_huge_pte_at(struct mm_struct *mm,
>> +void set_huge_pte_at(struct vm_area_struct *vma,
>>                unsigned long addr,
>>                pte_t *ptep,
>>                pte_t pte)
>>   {
>> +    struct mm_struct *mm = vma->vm_mm;
>>       int i, pte_num;
>>         if (!pte_napot(pte)) {
> 
> 
> You can add:
> 
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks!

> 
> I realize that we may have the same issue with our contig pte implementation
> (called napot in riscv) as we don't handle swap/migration entries at all. So I
> guess we need something similar, and I'll implement it (unless you want to do it
> of course, but I guess it's easier for me to test). 

Yes -I'll leave you to do the riscv part.

> One (maybe stupid) question
> though: wouldn't it be possible to extract the contig pte size from the value of
> ptep instead of using a vma?

Not for arm64: We support contpmd, pmd and contpte entries as backing for the
logical huge pte, depending on size. So without the size, we can't distinguish
between a coincidentally-aligned pmd entry vs a contpmd entry (which is just a
fixed size block of pmd entries).

Discussion with Christophe on the powerpc patch triggered some thinking; There
is theoretical problem with my current approach because there is one call site
in the core code that calls set_huge_pte_at(&init_mm). I've changed that to:

  struct vm_area_struct vma = TLB_FLUSH_VMA(&init_mm, 0);
  set_huge_pte_at(&vma);

knowing that this will never actually get called for arm64 because we return
PAGE_SIZE for arch_vmap_pte_range_map_size() and all other arches just take the
mm and ignore the rest of the vma. So it's safe, but fragile.

But it looks like riscv overrides arch_vmap_pte_range_map_size() and therefore
the call will be made there. And if riscv also needs to determine the size from
the vma, then bang.

So I'm going to rework it to continue to pass the mm in, but also add a size
parameter. Then it's totally safe. Will post a v2 later today.

Thanks,
Ryan

> 
> Thanks,
> 
> Alex
>
diff mbox series

Patch

diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
index 34e24f078cc1..be1ac8582bc2 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -17,7 +17,7 @@  void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 		    pte_t *ptep, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
-void set_huge_pte_at(struct mm_struct *mm,
+void set_huge_pte_at(struct vm_area_struct *vma,
 		     unsigned long addr, pte_t *ptep, pte_t pte);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 96225a8533ad..7cdbf0960772 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -177,11 +177,12 @@  pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
 	return entry;
 }
 
-void set_huge_pte_at(struct mm_struct *mm,
+void set_huge_pte_at(struct vm_area_struct *vma,
 		     unsigned long addr,
 		     pte_t *ptep,
 		     pte_t pte)
 {
+	struct mm_struct *mm = vma->vm_mm;
 	int i, pte_num;
 
 	if (!pte_napot(pte)) {