powerpc/mm/hash: Clear the invalid slot information correctly

Message ID 1455813884-8283-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive)
State Superseded

Commit Message

Aneesh Kumar K.V Feb. 18, 2016, 4:44 p.m. UTC
We can get a hash pte fault with 4k base page size and find the pte
already inserted with 64K base page size. In that case we need to clear
the existing slot information from the old pte. Fix this correctly.

With THP, we also clear the slot information for all the 64K hash ptes
mapping that 16MB page. They are all invalid now. This makes sure we
don't find the slot valid when we fault with 4k base page size. Finding
the slot valid should not result in any wrong behavior, because we check
the hash page table again for validity, but we can avoid that check
completely.

Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c       |  2 +-
 arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
 arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
 3 files changed, 16 insertions(+), 5 deletions(-)

Comments

Anshuman Khandual Feb. 19, 2016, 5:53 a.m. UTC | #1
On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
> We can get a hash pte fault with 4k base page size and find the pte
> already inserted with 64K base page size. In that case we need to clear

Can you please elaborate on this ? What are those situations when we
have 64K base page size on the PTE but we had inserted HPTE with base
page size as 4K ?

> the existing slot information from the old pte. Fix this correctly
> 
> With THP, we also clear the slot information with respect to all
> the 64K hash pte mapping that 16MB page. They are all invalid
> now. This make sure we don't find the slot valid when we fault with
> 4k base page size. Finding the slot valid should not result in any wrong
> behavior because we do check again in hash page table for the validity.
> But we can avoid that check completely.

Makes sense.

> 
> Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hash64_4k.c       |  2 +-
>  arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
>  arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
>  3 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
> index e7c04542ba62..e3e76b929f33 100644
> --- a/arch/powerpc/mm/hash64_4k.c
> +++ b/arch/powerpc/mm/hash64_4k.c
> @@ -106,7 +106,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1

This change is not relevant here. Should be a separate patch.

>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 0762c1e08c88..b3895720edb0 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 */
>  	if (!(old_pte & _PAGE_COMBO)) {
>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> -		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
> +		/*
> +		 * clear the old slot details from the old and new pte.
> +		 * On hash insert failure we use old pte value and we don't
> +		 * want slot information there if we have a insert failure.
> +		 */
> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
> +		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);

But why do we need to clear the bits on new_pte as well?

>  		goto htab_insert_hpte;
>  	}
>  	/*
> @@ -182,7 +188,7 @@ repeat:
>  		}
>  	}
>  	/*
> -	 * Hypervisor failure. Restore old pmd and return -1
> +	 * Hypervisor failure. Restore old pte and return -1

This change is not relevant here. Should be a separate patch.


>  	 * similar to __hash_page_*
>  	 */
>  	if (unlikely(slot == -2)) {
> @@ -305,7 +311,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*

Ditto.
Michael Ellerman Feb. 19, 2016, 10:37 a.m. UTC | #2
On Fri, 2016-02-19 at 11:23 +0530, Anshuman Khandual wrote:
> On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
> > We can get a hash pte fault with 4k base page size and find the pte
> > already inserted with 64K base page size. In that case we need to clear
...
>
> > diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
> > index e7c04542ba62..e3e76b929f33 100644
> > --- a/arch/powerpc/mm/hash64_4k.c
> > +++ b/arch/powerpc/mm/hash64_4k.c
> > @@ -106,7 +106,7 @@ repeat:
> >  			}
> >  		}
> >  		/*
> > -		 * Hypervisor failure. Restore old pmd and return -1
> > +		 * Hypervisor failure. Restore old pte and return -1
>
> This change is not relevant here. Should be a separate patch.

Yeah.

If it was -rc1 then I would probably let it go, but this will land in rc6 so
the fixes need to be tight.

cheers
Balbir Singh Feb. 19, 2016, 11:30 a.m. UTC | #3
On 19/02/16 03:44, Aneesh Kumar K.V wrote:
> We can get a hash pte fault with 4k base page size and find the pte
> already inserted with 64K base page size. In that case we need to clear
> the existing slot information from the old pte. Fix this correctly
>
> With THP, we also clear the slot information with respect to all
> the 64K hash pte mapping that 16MB page. They are all invalid
> now. This make sure we don't find the slot valid when we fault with
> 4k base page size. Finding the slot valid should not result in any wrong
> behavior because we do check again in hash page table for the validity.
> But we can avoid that check completely.
>
> Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hash64_4k.c       |  2 +-
>  arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
>  arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
>  3 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
> index e7c04542ba62..e3e76b929f33 100644
> --- a/arch/powerpc/mm/hash64_4k.c
> +++ b/arch/powerpc/mm/hash64_4k.c
> @@ -106,7 +106,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 0762c1e08c88..b3895720edb0 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 */
>  	if (!(old_pte & _PAGE_COMBO)) {
>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> -		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
> +		/*
> +		 * clear the old slot details from the old and new pte.
> +		 * On hash insert failure we use old pte value and we don't
> +		 * want slot information there if we have a insert failure.
> +		 */
> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
> +		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>  		goto htab_insert_hpte;
>  	}
>  	/*
> @@ -182,7 +188,7 @@ repeat:
>  		}
>  	}
>  	/*
> -	 * Hypervisor failure. Restore old pmd and return -1
> +	 * Hypervisor failure. Restore old pte and return -1
>  	 * similar to __hash_page_*
>  	 */
>  	if (unlikely(slot == -2)) {
> @@ -305,7 +311,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
> index 49b152b0f926..8424f46c2bf7 100644
> --- a/arch/powerpc/mm/hugepage-hash64.c
> +++ b/arch/powerpc/mm/hugepage-hash64.c
> @@ -78,9 +78,14 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>  		 * base page size. This is because demote_segment won't flush
>  		 * hash page table entries.
>  		 */
> -		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO))
> +		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO)) {
>  			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K,
>  					    ssize, flags);
> +			/*
> +			 * clear the old slot information 
> +			 */
Redundant comment. Could it say something more useful, such as why we clear it?
> +			memset(hpte_slot_array, 0, PTE_FRAG_SIZE);
> +		}
>  	}
>  
>  	valid = hpte_valid(hpte_slot_array, index);
Aneesh Kumar K.V Feb. 20, 2016, 2:32 p.m. UTC | #4
Michael Ellerman <mpe@ellerman.id.au> writes:

> On Fri, 2016-02-19 at 11:23 +0530, Anshuman Khandual wrote:
>> On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
>> > We can get a hash pte fault with 4k base page size and find the pte
>> > already inserted with 64K base page size. In that case we need to clear
> ...
>>
>> > diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
>> > index e7c04542ba62..e3e76b929f33 100644
>> > --- a/arch/powerpc/mm/hash64_4k.c
>> > +++ b/arch/powerpc/mm/hash64_4k.c
>> > @@ -106,7 +106,7 @@ repeat:
>> >  			}
>> >  		}
>> >  		/*
>> > -		 * Hypervisor failure. Restore old pmd and return -1
>> > +		 * Hypervisor failure. Restore old pte and return -1
>>
>> This change is not relevant here. Should be a separate patch.
>
> Yeah.
>
> If it was -rc1 then I would probably let it go, but this will land in rc6 so
> the fixes need to be tight.
>

Do you want me to send an update with those changes dropped?

-aneesh
Aneesh Kumar K.V Feb. 20, 2016, 2:34 p.m. UTC | #5
Balbir Singh <bsingharora@gmail.com> writes:

...........
............

>> diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
>> index 49b152b0f926..8424f46c2bf7 100644
>> --- a/arch/powerpc/mm/hugepage-hash64.c
>> +++ b/arch/powerpc/mm/hugepage-hash64.c
>> @@ -78,9 +78,14 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>>  		 * base page size. This is because demote_segment won't flush
>>  		 * hash page table entries.
>>  		 */
>> -		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO))
>> +		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO)) {
>>  			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K,
>>  					    ssize, flags);
>> +			/*
>> +			 * clear the old slot information 
>> +			 */
> Redundant comment, something more useful? why clear it?
>> +			memset(hpte_slot_array, 0, PTE_FRAG_SIZE);
>> +		}
>>  	}
>>  

This is explained in the commit message:


 With THP, we also clear the slot information with respect to all
 the 64K hash pte mapping that 16MB page. They are all invalid
 now. This make sure we don't find the slot valid when we fault with
 4k base page size. Finding the slot valid should not result in any wrong
 behavior because we do check again in hash page table for the validity.
 But we can avoid that check completely.



>>  	valid = hpte_valid(hpte_slot_array, index);
Aneesh Kumar K.V Feb. 20, 2016, 3:13 p.m. UTC | #6
Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
>> We can get a hash pte fault with 4k base page size and find the pte
>> already inserted with 64K base page size. In that case we need to clear
>
> Can you please elaborate on this ? What are those situations when we
> have 64K base page size on the PTE but we had inserted HPTE with base
> page size as 4K ?

When we demote a segment.

>
>> the existing slot information from the old pte. Fix this correctly
>> 
>> With THP, we also clear the slot information with respect to all
>> the 64K hash pte mapping that 16MB page. They are all invalid
>> now. This make sure we don't find the slot valid when we fault with
>> 4k base page size. Finding the slot valid should not result in any wrong
>> behavior because we do check again in hash page table for the validity.
>> But we can avoid that check completely.
>
> Makes sense.
>
>> 
>> Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/hash64_4k.c       |  2 +-
>>  arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
>>  arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
>>  3 files changed, 16 insertions(+), 5 deletions(-)
>> 
>> diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
>> index e7c04542ba62..e3e76b929f33 100644
>> --- a/arch/powerpc/mm/hash64_4k.c
>> +++ b/arch/powerpc/mm/hash64_4k.c
>> @@ -106,7 +106,7 @@ repeat:
>>  			}
>>  		}
>>  		/*
>> -		 * Hypervisor failure. Restore old pmd and return -1
>> +		 * Hypervisor failure. Restore old pte and return -1
>
> This change is not relevant here. Should be a separate patch.
>
>>  		 * similar to __hash_page_*
>>  		 */
>>  		if (unlikely(slot == -2)) {
>> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
>> index 0762c1e08c88..b3895720edb0 100644
>> --- a/arch/powerpc/mm/hash64_64k.c
>> +++ b/arch/powerpc/mm/hash64_64k.c
>> @@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  	 */
>>  	if (!(old_pte & _PAGE_COMBO)) {
>>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
>> -		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
>> +		/*
>> +		 * clear the old slot details from the old and new pte.
>> +		 * On hash insert failure we use old pte value and we don't
>> +		 * want slot information there if we have a insert failure.
>> +		 */
>> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>> +		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>
> But why we need clear the bits on new_pte as well ?

We use new_pte when updating the actual pte towards the end of that function.


>
>>  		goto htab_insert_hpte;
>>  	}
>>  	/*
>> @@ -182,7 +188,7 @@ repeat:
>>  		}
>>  	}
>>  	/*
>> -	 * Hypervisor failure. Restore old pmd and return -1
>> +	 * Hypervisor failure. Restore old pte and return -1
>
> This change is not relevant here. Should be a separate patch.
>
>
>>  	 * similar to __hash_page_*
>>  	 */
>>  	if (unlikely(slot == -2)) {
>> @@ -305,7 +311,7 @@ repeat:
>>  			}
>>  		}
>>  		/*
>> -		 * Hypervisor failure. Restore old pmd and return -1
>> +		 * Hypervisor failure. Restore old pte and return -1
>>  		 * similar to __hash_page_*
>
> Ditto.

-aneesh

Patch

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index e7c04542ba62..e3e76b929f33 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -106,7 +106,7 @@  repeat:
 			}
 		}
 		/*
-		 * Hypervisor failure. Restore old pmd and return -1
+		 * Hypervisor failure. Restore old pte and return -1
 		 * similar to __hash_page_*
 		 */
 		if (unlikely(slot == -2)) {
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 0762c1e08c88..b3895720edb0 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -111,7 +111,13 @@  int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	if (!(old_pte & _PAGE_COMBO)) {
 		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
-		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
+		/*
+		 * clear the old slot details from the old and new pte.
+		 * On hash insert failure we use old pte value and we don't
+		 * want slot information there if we have a insert failure.
+		 */
+		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
+		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
 		goto htab_insert_hpte;
 	}
 	/*
@@ -182,7 +188,7 @@  repeat:
 		}
 	}
 	/*
-	 * Hypervisor failure. Restore old pmd and return -1
+	 * Hypervisor failure. Restore old pte and return -1
 	 * similar to __hash_page_*
 	 */
 	if (unlikely(slot == -2)) {
@@ -305,7 +311,7 @@  repeat:
 			}
 		}
 		/*
-		 * Hypervisor failure. Restore old pmd and return -1
+		 * Hypervisor failure. Restore old pte and return -1
 		 * similar to __hash_page_*
 		 */
 		if (unlikely(slot == -2)) {
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 49b152b0f926..8424f46c2bf7 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -78,9 +78,14 @@  int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 		 * base page size. This is because demote_segment won't flush
 		 * hash page table entries.
 		 */
-		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO))
+		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO)) {
 			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K,
 					    ssize, flags);
+			/*
+			 * clear the old slot information 
+			 */
+			memset(hpte_slot_array, 0, PTE_FRAG_SIZE);
+		}
 	}
 
 	valid = hpte_valid(hpte_slot_array, index);