[RFC,v2,11/20] powerpc/mm: Complement huge_pte_alloc() for all non HUGEPD setups

Message ID 59a1390923c40b0b83ae062e3041873292186577.1715971869.git.christophe.leroy@csgroup.eu (mailing list archive)
State Superseded
Series Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)

Commit Message

Christophe Leroy May 17, 2024, 7 p.m. UTC
huge_pte_alloc() for non-HUGEPD targets is reserved for 8xx at the
moment. In order to convert other targets for non-HUGEPD, complement
huge_pte_alloc() to support any standard cont-PxD setup.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/hugetlbpage.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)
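
To make the size checks in this patch concrete, here is a minimal user-space sketch, not kernel code, of the order in which sz is compared against the level sizes on a non-8xx target. The EX_* sizes and the ex_* names are hypothetical and chosen only for illustration; the real PMD_SIZE, PUD_SIZE and PGDIR_SIZE depend on the platform and configuration, and the allocation steps (pud_alloc(), pmd_alloc(), pte_alloc_huge()) are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical level sizes, for illustration only. */
#define EX_PMD_SIZE   (UINT64_C(1) << 21)	/* 2 MiB */
#define EX_PUD_SIZE   (UINT64_C(1) << 30)	/* 1 GiB */
#define EX_PGDIR_SIZE (UINT64_C(1) << 39)	/* 512 GiB */

enum ex_level { EX_PTE, EX_PMD, EX_PUD, EX_P4D };

/*
 * Round addr down to a multiple of sz, mirroring "addr &= ~(sz - 1)".
 * The mask only works when sz is a power of two.
 */
static uint64_t ex_align_down(uint64_t addr, uint64_t sz)
{
	assert(sz != 0 && (sz & (sz - 1)) == 0);
	return addr & ~(sz - 1);
}

/*
 * Level whose entry would be handed back for a huge page of size sz,
 * following the order of the size checks in the patch (non-8xx case).
 */
static enum ex_level ex_entry_level(uint64_t sz)
{
	if (sz >= EX_PGDIR_SIZE)
		return EX_P4D;
	if (sz >= EX_PUD_SIZE)
		return EX_PUD;
	if (sz < EX_PMD_SIZE)
		return EX_PTE;	/* pte_alloc_huge() path in the patch */
	return EX_PMD;
}
```

With these assumed sizes, a 16 MiB page would land on a PMD entry and a 1 GiB page on a PUD entry; different level sizes would shift those boundaries.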

Comments

Oscar Salvador May 25, 2024, 4:29 a.m. UTC | #1
On Fri, May 17, 2024 at 09:00:05PM +0200, Christophe Leroy wrote:
> huge_pte_alloc() for non-HUGEPD targets is reserved for 8xx at the
> moment. In order to convert other targets for non-HUGEPD, complement
> huge_pte_alloc() to support any standard cont-PxD setup.
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/mm/hugetlbpage.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 42b12e1ec851..f8aefa1e7363 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -195,11 +195,34 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
>  pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
>  		      unsigned long addr, unsigned long sz)
>  {
> -	pmd_t *pmd = pmd_off(mm, addr);
> +	pgd_t *pgd;
> +	p4d_t *p4d;
> +	pud_t *pud;
> +	pmd_t *pmd;
> +
> +	addr &= ~(sz - 1);
> +	pgd = pgd_offset(mm, addr);
> +
> +	p4d = p4d_offset(pgd, addr);
> +	if (sz >= PGDIR_SIZE)
> +		return (pte_t *)p4d;
> +
> +	pud = pud_alloc(mm, p4d, addr);
> +	if (!pud)
> +		return NULL;
> +	if (sz >= PUD_SIZE)
> +		return (pte_t *)pud;
> +
> +	pmd = pmd_alloc(mm, pud, addr);
> +	if (!pmd)
> +		return NULL;
>  
>  	if (sz < PMD_SIZE)
>  		return pte_alloc_huge(mm, pmd, addr, sz);
>  
> +	if (!IS_ENABLED(CONFIG_PPC_8xx))
> +		return (pte_t *)pmd;

So only 8xx has cont-PMD for hugepages?

> +
>  	if (sz != SZ_8M)
>  		return NULL;

Since this function is the core for allocating huge pages, I think it would
benefit from a comment at the top explaining the possible layouts,
e.g. who can have cont-{P4D,PUD,PMD}, etc.
A brief explanation of the possible scheme for all powerpc platforms.

That would help people looking into this in the future.

Christophe Leroy May 25, 2024, 6:44 a.m. UTC | #2
On 25/05/2024 at 06:29, Oscar Salvador wrote:
> On Fri, May 17, 2024 at 09:00:05PM +0200, Christophe Leroy wrote:
>> huge_pte_alloc() for non-HUGEPD targets is reserved for 8xx at the
>> moment. In order to convert other targets for non-HUGEPD, complement
>> huge_pte_alloc() to support any standard cont-PxD setup.
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>>   arch/powerpc/mm/hugetlbpage.c | 25 ++++++++++++++++++++++++-
>>   1 file changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
>> index 42b12e1ec851..f8aefa1e7363 100644
>> --- a/arch/powerpc/mm/hugetlbpage.c
>> +++ b/arch/powerpc/mm/hugetlbpage.c
>> @@ -195,11 +195,34 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
>>   pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
>>   		      unsigned long addr, unsigned long sz)
>>   {
>> -	pmd_t *pmd = pmd_off(mm, addr);
>> +	pgd_t *pgd;
>> +	p4d_t *p4d;
>> +	pud_t *pud;
>> +	pmd_t *pmd;
>> +
>> +	addr &= ~(sz - 1);
>> +	pgd = pgd_offset(mm, addr);
>> +
>> +	p4d = p4d_offset(pgd, addr);
>> +	if (sz >= PGDIR_SIZE)
>> +		return (pte_t *)p4d;
>> +
>> +	pud = pud_alloc(mm, p4d, addr);
>> +	if (!pud)
>> +		return NULL;
>> +	if (sz >= PUD_SIZE)
>> +		return (pte_t *)pud;
>> +
>> +	pmd = pmd_alloc(mm, pud, addr);
>> +	if (!pmd)
>> +		return NULL;
>>   
>>   	if (sz < PMD_SIZE)
>>   		return pte_alloc_huge(mm, pmd, addr, sz);
>>   
>> +	if (!IS_ENABLED(CONFIG_PPC_8xx))
>> +		return (pte_t *)pmd;
> 
> So only 8xx has cont-PMD for hugepages?

No, all have cont-PMD but only 8xx handles pages greater than PMD_SIZE 
as cont-PTE instead of cont-PMD.

> 
>> +
>>   	if (sz != SZ_8M)
>>   		return NULL;
> 
> Since this function is the core for allocating huge pages, I think it would
> benefit from a comment at the top explaining the possible layouts,
> e.g. who can have cont-{P4D,PUD,PMD}, etc.
> A brief explanation of the possible scheme for all powerpc platforms.

All is standard except 8xx, let's just have a comment for 8xx.

> 
> That would help people looking into this in the future.

Oscar Salvador May 25, 2024, 10:33 a.m. UTC | #3
On Sat, May 25, 2024 at 06:44:06AM +0000, Christophe Leroy wrote:
> No, all have cont-PMD but only 8xx handles pages greater than PMD_SIZE 
> as cont-PTE instead of cont-PMD.

Yes, sorry, I managed to confuse myself. It is obvious from the code.
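
The point settled above can be sketched as a small user-space model of the tail of huge_pte_alloc(), that is, the part that runs once the PMD has been obtained. This is not kernel code: the enum, the ex_tail() helper and its parameters are hypothetical names introduced for illustration, and pmd_size is passed in because the real PMD_SIZE depends on the platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_SZ_8M (UINT64_C(8) << 20)

enum ex_handling {
	EX_UNSUPPORTED,		/* the patch returns NULL */
	EX_PTE_TABLE,		/* sz < PMD_SIZE: pte_alloc_huge() path */
	EX_PMD_ENTRY,		/* non-8xx: the PMD entry itself is returned */
	EX_CONT_PTE_8XX,	/* 8xx: 8M page handled as cont-PTE */
};

/* Model of the checks that follow pmd_alloc() in the patch. */
static enum ex_handling ex_tail(bool is_8xx, uint64_t pmd_size, uint64_t sz)
{
	assert(pmd_size != 0);

	if (sz < pmd_size)
		return EX_PTE_TABLE;
	if (!is_8xx)
		return EX_PMD_ENTRY;
	if (sz != EX_SZ_8M)
		return EX_UNSUPPORTED;
	return EX_CONT_PTE_8XX;
}
```

The only branch that differs between 8xx and the other targets is the one for sz >= PMD_SIZE, which matches the reply in comment #2.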

Patch

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 42b12e1ec851..f8aefa1e7363 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -195,11 +195,34 @@  pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, unsigned long sz)
 {
-	pmd_t *pmd = pmd_off(mm, addr);
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	addr &= ~(sz - 1);
+	pgd = pgd_offset(mm, addr);
+
+	p4d = p4d_offset(pgd, addr);
+	if (sz >= PGDIR_SIZE)
+		return (pte_t *)p4d;
+
+	pud = pud_alloc(mm, p4d, addr);
+	if (!pud)
+		return NULL;
+	if (sz >= PUD_SIZE)
+		return (pte_t *)pud;
+
+	pmd = pmd_alloc(mm, pud, addr);
+	if (!pmd)
+		return NULL;
 
 	if (sz < PMD_SIZE)
 		return pte_alloc_huge(mm, pmd, addr, sz);
 
+	if (!IS_ENABLED(CONFIG_PPC_8xx))
+		return (pte_t *)pmd;
+
 	if (sz != SZ_8M)
 		return NULL;
 	if (!pte_alloc_huge(mm, pmd, addr, sz))