diff mbox

[v3] powerpc: Fix PTE page address mismatch in pgtable ctor/dtor

Message ID 1386425193-24015-1-git-send-email-hong.pham@windriver.com (mailing list archive)
State Accepted, archived
Commit cf77ee54362a245f9a01f240adce03a06c05eb68
Headers show

Commit Message

Hong H. Pham Dec. 7, 2013, 2:06 p.m. UTC
From: "Hong H. Pham" <hong.pham@windriver.com>

In pte_alloc_one(), pgtable_page_ctor() is passed an address that has
not been converted by page_address() to the newly allocated PTE page.

When the PTE is freed, __pte_free_tlb() calls pgtable_page_dtor()
with an address to the PTE page that has been converted by page_address().
The mismatch in the PTE's page address causes pgtable_page_dtor() to access
invalid memory, so resources for that PTE (such as the page lock) is not
properly cleaned up.

On PPC32, only SMP kernels are affected.

On PPC64, only SMP kernels with 4K page size are affected.

This bug was introduced by commit d614bb041209fd7cb5e4b35e11a7b2f6ee8f62b8
"powerpc: Move the pte free routines from common header".

On a preempt-rt kernel, a spinlock is dynamically allocated for each
PTE in pgtable_page_ctor().  When the PTE is freed, calling
pgtable_page_dtor() with a mismatched page address causes a memory leak,
as the pointer to the PTE's spinlock is bogus.

On mainline, there isn't any immediately obvious symptoms, but the
problem still exists here.

Fixes: d614bb041209fd7c "powerpc: Move the pte free routes from common header"
Cc: Paul Mackerras <paulus@samba.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-stable <stable@vger.kernel.org> # v3.10+
Signed-off-by: Hong H. Pham <hong.pham@windriver.com>
---
 arch/powerpc/include/asm/pgalloc-32.h | 6 ++----
 arch/powerpc/include/asm/pgalloc-64.h | 6 ++----
 2 files changed, 4 insertions(+), 8 deletions(-)

Comments

Benjamin Herrenschmidt Dec. 7, 2013, 8:27 p.m. UTC | #1
On Sat, 2013-12-07 at 09:06 -0500, Hong H. Pham wrote:

> diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
> index 27b2386..842846c 100644
> --- a/arch/powerpc/include/asm/pgalloc-32.h
> +++ b/arch/powerpc/include/asm/pgalloc-32.h
> @@ -84,10 +84,8 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
>  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
>  				  unsigned long address)
>  {
> -	struct page *page = page_address(table);
> -
>  	tlb_flush_pgtable(tlb, address);
> -	pgtable_page_dtor(page);
> -	pgtable_free_tlb(tlb, page, 0);
> +	pgtable_page_dtor(table);
> +	pgtable_free_tlb(tlb, page_address(table), 0);
>  }

Ok so your description of the problem confused me a bit, but I see that
in the !64K page, pgtable_t is already a struct page so yes, the
page_address() call here is bogus.

However, I also noticed that in the 64k page case, we don't call the dto
at all. Is that a problem ?

Also, Aneesh, shouldn't we just fix the disconnect here and have
pgtable_t always be the same type ? The way this is now is confusing
and error prone...

>  #endif /* _ASM_POWERPC_PGALLOC_32_H */
> diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
> index f65e27b..256d6f8 100644
> --- a/arch/powerpc/include/asm/pgalloc-64.h
> +++ b/arch/powerpc/include/asm/pgalloc-64.h
> @@ -144,11 +144,9 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
>  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
>  				  unsigned long address)
>  {
> -	struct page *page = page_address(table);
> -
>  	tlb_flush_pgtable(tlb, address);
> -	pgtable_page_dtor(page);
> -	pgtable_free_tlb(tlb, page, 0);
> +	pgtable_page_dtor(table);
> +	pgtable_free_tlb(tlb, page_address(table), 0);
>  }
>  
>  #else /* if CONFIG_PPC_64K_PAGES */

Ben.
Benjamin Herrenschmidt Dec. 7, 2013, 8:32 p.m. UTC | #2
On Sun, 2013-12-08 at 07:27 +1100, Benjamin Herrenschmidt wrote:
> On Sat, 2013-12-07 at 09:06 -0500, Hong H. Pham wrote:
> 
> > diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
> > index 27b2386..842846c 100644
> > --- a/arch/powerpc/include/asm/pgalloc-32.h
> > +++ b/arch/powerpc/include/asm/pgalloc-32.h
> > @@ -84,10 +84,8 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
> >  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
> >  				  unsigned long address)
> >  {
> > -	struct page *page = page_address(table);
> > -
> >  	tlb_flush_pgtable(tlb, address);
> > -	pgtable_page_dtor(page);
> > -	pgtable_free_tlb(tlb, page, 0);
> > +	pgtable_page_dtor(table);
> > +	pgtable_free_tlb(tlb, page_address(table), 0);
> >  }
> 
> Ok so your description of the problem confused me a bit, but I see that
> in the !64K page, pgtable_t is already a struct page so yes, the
> page_address() call here is bogus.
> 
> However, I also noticed that in the 64k page case, we don't call the dto
> at all. Is that a problem ?

Actually we do, just elsewhere... ignore the above.

> Also, Aneesh, shouldn't we just fix the disconnect here and have
> pgtable_t always be the same type ? The way this is now is confusing
> and error prone...
> 
> >  #endif /* _ASM_POWERPC_PGALLOC_32_H */
> > diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
> > index f65e27b..256d6f8 100644
> > --- a/arch/powerpc/include/asm/pgalloc-64.h
> > +++ b/arch/powerpc/include/asm/pgalloc-64.h
> > @@ -144,11 +144,9 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
> >  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
> >  				  unsigned long address)
> >  {
> > -	struct page *page = page_address(table);
> > -
> >  	tlb_flush_pgtable(tlb, address);
> > -	pgtable_page_dtor(page);
> > -	pgtable_free_tlb(tlb, page, 0);
> > +	pgtable_page_dtor(table);
> > +	pgtable_free_tlb(tlb, page_address(table), 0);
> >  }
> >  
> >  #else /* if CONFIG_PPC_64K_PAGES */
> 
> Ben.
> 
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
Aneesh Kumar K.V Dec. 9, 2013, 8:41 a.m. UTC | #3
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Sat, 2013-12-07 at 09:06 -0500, Hong H. Pham wrote:
>
>> diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
>> index 27b2386..842846c 100644
>> --- a/arch/powerpc/include/asm/pgalloc-32.h
>> +++ b/arch/powerpc/include/asm/pgalloc-32.h
>> @@ -84,10 +84,8 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
>>  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
>>  				  unsigned long address)
>>  {
>> -	struct page *page = page_address(table);
>> -
>>  	tlb_flush_pgtable(tlb, address);
>> -	pgtable_page_dtor(page);
>> -	pgtable_free_tlb(tlb, page, 0);
>> +	pgtable_page_dtor(table);
>> +	pgtable_free_tlb(tlb, page_address(table), 0);
>>  }
>
> Ok so your description of the problem confused me a bit, but I see that
> in the !64K page, pgtable_t is already a struct page so yes, the
> page_address() call here is bogus.
>
> However, I also noticed that in the 64k page case, we don't call the dto
> at all. Is that a problem ?
>
> Also, Aneesh, shouldn't we just fix the disconnect here and have
> pgtable_t always be the same type ? The way this is now is confusing
> and error prone...

With pte page fragments that may not be possible right ?. With PTE fragments,
we share the page allocated with multiple pmd entries 

5c1f6ee9a31cbdac90bbb8ae1ba4475031ac74b4 should have more details

>
>>  #endif /* _ASM_POWERPC_PGALLOC_32_H */
>> diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
>> index f65e27b..256d6f8 100644
>> --- a/arch/powerpc/include/asm/pgalloc-64.h
>> +++ b/arch/powerpc/include/asm/pgalloc-64.h
>> @@ -144,11 +144,9 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
>>  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
>>  				  unsigned long address)
>>  {
>> -	struct page *page = page_address(table);
>> -
>>  	tlb_flush_pgtable(tlb, address);
>> -	pgtable_page_dtor(page);
>> -	pgtable_free_tlb(tlb, page, 0);
>> +	pgtable_page_dtor(table);
>> +	pgtable_free_tlb(tlb, page_address(table), 0);
>>  }
>>  
>>  #else /* if CONFIG_PPC_64K_PAGES */
>
> Ben.

-aneesh
Aneesh Kumar K.V Dec. 9, 2013, 8:42 a.m. UTC | #4
"Hong H. Pham" <hong.pham@windriver.com> writes:

> From: "Hong H. Pham" <hong.pham@windriver.com>
>
> In pte_alloc_one(), pgtable_page_ctor() is passed an address that has
> not been converted by page_address() to the newly allocated PTE page.
>
> When the PTE is freed, __pte_free_tlb() calls pgtable_page_dtor()
> with an address to the PTE page that has been converted by page_address().
> The mismatch in the PTE's page address causes pgtable_page_dtor() to access
> invalid memory, so resources for that PTE (such as the page lock) is not
> properly cleaned up.
>
> On PPC32, only SMP kernels are affected.
>
> On PPC64, only SMP kernels with 4K page size are affected.
>
> This bug was introduced by commit d614bb041209fd7cb5e4b35e11a7b2f6ee8f62b8
> "powerpc: Move the pte free routines from common header".
>
> On a preempt-rt kernel, a spinlock is dynamically allocated for each
> PTE in pgtable_page_ctor().  When the PTE is freed, calling
> pgtable_page_dtor() with a mismatched page address causes a memory leak,
> as the pointer to the PTE's spinlock is bogus.
>
> On mainline, there isn't any immediately obvious symptoms, but the
> problem still exists here.
>
> Fixes: d614bb041209fd7c "powerpc: Move the pte free routes from common header"
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: linux-stable <stable@vger.kernel.org> # v3.10+
> Signed-off-by: Hong H. Pham <hong.pham@windriver.com>


Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>


> ---
>  arch/powerpc/include/asm/pgalloc-32.h | 6 ++----
>  arch/powerpc/include/asm/pgalloc-64.h | 6 ++----
>  2 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
> index 27b2386..842846c 100644
> --- a/arch/powerpc/include/asm/pgalloc-32.h
> +++ b/arch/powerpc/include/asm/pgalloc-32.h
> @@ -84,10 +84,8 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
>  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
>  				  unsigned long address)
>  {
> -	struct page *page = page_address(table);
> -
>  	tlb_flush_pgtable(tlb, address);
> -	pgtable_page_dtor(page);
> -	pgtable_free_tlb(tlb, page, 0);
> +	pgtable_page_dtor(table);
> +	pgtable_free_tlb(tlb, page_address(table), 0);
>  }
>  #endif /* _ASM_POWERPC_PGALLOC_32_H */
> diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
> index f65e27b..256d6f8 100644
> --- a/arch/powerpc/include/asm/pgalloc-64.h
> +++ b/arch/powerpc/include/asm/pgalloc-64.h
> @@ -144,11 +144,9 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb,
>  static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
>  				  unsigned long address)
>  {
> -	struct page *page = page_address(table);
> -
>  	tlb_flush_pgtable(tlb, address);
> -	pgtable_page_dtor(page);
> -	pgtable_free_tlb(tlb, page, 0);
> +	pgtable_page_dtor(table);
> +	pgtable_free_tlb(tlb, page_address(table), 0);
>  }
>
>  #else /* if CONFIG_PPC_64K_PAGES */
> -- 
> 1.8.3.2
Sebastian Andrzej Siewior Dec. 15, 2013, 12:59 p.m. UTC | #5
* Hong H. Pham | 2013-12-07 09:06:33 [-0500]:

>On PPC32, only SMP kernels are affected.
>
>On PPC64, only SMP kernels with 4K page size are affected.

$ uname -a
Linux mpc8536-1 3.12.1-rt3-00281-g9de268d #76 SMP PREEMPT RT Fri Nov 22 16:53:05 CET 2013 ppc GNU/Linux
$ uptime
 22:01:10 up 22 days, 21:01,  1 user,  load average: 443.08, 563.59, 586.20

This is from a mpc8536 box. The high load comes from a hackbench that was
running for quite some time. Are Book-E (CONFIG_PPC_BOOK3E_MMU without
CONFIG_PPC_BOOK3E set) not affected or is this bug not present if a SMP
kernel is booted on a UP machine?

Sebastian
diff mbox

Patch

diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
index 27b2386..842846c 100644
--- a/arch/powerpc/include/asm/pgalloc-32.h
+++ b/arch/powerpc/include/asm/pgalloc-32.h
@@ -84,10 +84,8 @@  static inline void pgtable_free_tlb(struct mmu_gather *tlb,
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 				  unsigned long address)
 {
-	struct page *page = page_address(table);
-
 	tlb_flush_pgtable(tlb, address);
-	pgtable_page_dtor(page);
-	pgtable_free_tlb(tlb, page, 0);
+	pgtable_page_dtor(table);
+	pgtable_free_tlb(tlb, page_address(table), 0);
 }
 #endif /* _ASM_POWERPC_PGALLOC_32_H */
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index f65e27b..256d6f8 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -144,11 +144,9 @@  static inline void pgtable_free_tlb(struct mmu_gather *tlb,
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 				  unsigned long address)
 {
-	struct page *page = page_address(table);
-
 	tlb_flush_pgtable(tlb, address);
-	pgtable_page_dtor(page);
-	pgtable_free_tlb(tlb, page, 0);
+	pgtable_page_dtor(table);
+	pgtable_free_tlb(tlb, page_address(table), 0);
 }
 
 #else /* if CONFIG_PPC_64K_PAGES */