Message ID | 1455813884-8283-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
> We can get a hash pte fault with 4k base page size and find the pte
> already inserted with 64K base page size. In that case we need to clear

Can you please elaborate on this? What are those situations when we
have 64K base page size on the PTE but we had inserted HPTE with base
page size as 4K?

> the existing slot information from the old pte. Fix this correctly
>
> With THP, we also clear the slot information with respect to all
> the 64K hash pte mapping that 16MB page. They are all invalid
> now. This make sure we don't find the slot valid when we fault with
> 4k base page size. Finding the slot valid should not result in any wrong
> behavior because we do check again in hash page table for the validity.
> But we can avoid that check completely.

Makes sense.

>
> Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hash64_4k.c       |  2 +-
>  arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
>  arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
>  3 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
> index e7c04542ba62..e3e76b929f33 100644
> --- a/arch/powerpc/mm/hash64_4k.c
> +++ b/arch/powerpc/mm/hash64_4k.c
> @@ -106,7 +106,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1

This change is not relevant here. Should be a separate patch.

>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 0762c1e08c88..b3895720edb0 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 */
>  	if (!(old_pte & _PAGE_COMBO)) {
>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> -		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
> +		/*
> +		 * clear the old slot details from the old and new pte.
> +		 * On hash insert failure we use old pte value and we don't
> +		 * want slot information there if we have a insert failure.
> +		 */
> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
> +		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);

But why do we need to clear the bits on new_pte as well?

>  		goto htab_insert_hpte;
>  	}
>  	/*
> @@ -182,7 +188,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1

This change is not relevant here. Should be a separate patch.

>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> @@ -305,7 +311,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*

Ditto.
On Fri, 2016-02-19 at 11:23 +0530, Anshuman Khandual wrote:
> On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
> > We can get a hash pte fault with 4k base page size and find the pte
> > already inserted with 64K base page size. In that case we need to clear

...

> >
> > diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
> > index e7c04542ba62..e3e76b929f33 100644
> > --- a/arch/powerpc/mm/hash64_4k.c
> > +++ b/arch/powerpc/mm/hash64_4k.c
> > @@ -106,7 +106,7 @@ repeat:
> >  			}
> >  		}
> >  		/*
> > -		 * Hypervisor failure. Restore old pmd and return -1
> > +		 * Hypervisor failure. Restore old pte and return -1
>
> This change is not relevant here. Should be a separate patch.

Yeah.

If it was -rc1 then I would probably let it go, but this will land in rc6 so
the fixes need to be tight.

cheers
On 19/02/16 03:44, Aneesh Kumar K.V wrote:
> We can get a hash pte fault with 4k base page size and find the pte
> already inserted with 64K base page size. In that case we need to clear
> the existing slot information from the old pte. Fix this correctly
>
> With THP, we also clear the slot information with respect to all
> the 64K hash pte mapping that 16MB page. They are all invalid
> now. This make sure we don't find the slot valid when we fault with
> 4k base page size. Finding the slot valid should not result in any wrong
> behavior because we do check again in hash page table for the validity.
> But we can avoid that check completely.
>
> Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hash64_4k.c       |  2 +-
>  arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
>  arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
>  3 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
> index e7c04542ba62..e3e76b929f33 100644
> --- a/arch/powerpc/mm/hash64_4k.c
> +++ b/arch/powerpc/mm/hash64_4k.c
> @@ -106,7 +106,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 0762c1e08c88..b3895720edb0 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 */
>  	if (!(old_pte & _PAGE_COMBO)) {
>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> -		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
> +		/*
> +		 * clear the old slot details from the old and new pte.
> +		 * On hash insert failure we use old pte value and we don't
> +		 * want slot information there if we have a insert failure.
> +		 */
> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
> +		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>  		goto htab_insert_hpte;
>  	}
>  	/*
> @@ -182,7 +188,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> @@ -305,7 +311,7 @@ repeat:
>  			}
>  		}
>  		/*
> -		 * Hypervisor failure. Restore old pmd and return -1
> +		 * Hypervisor failure. Restore old pte and return -1
>  		 * similar to __hash_page_*
>  		 */
>  		if (unlikely(slot == -2)) {
> diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
> index 49b152b0f926..8424f46c2bf7 100644
> --- a/arch/powerpc/mm/hugepage-hash64.c
> +++ b/arch/powerpc/mm/hugepage-hash64.c
> @@ -78,9 +78,14 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>  		 * base page size. This is because demote_segment won't flush
>  		 * hash page table entries.
>  		 */
> -		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO))
> +		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO)) {
>  			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K,
>  					    ssize, flags);
> +			/*
> +			 * clear the old slot information
> +			 */

Redundant comment, something more useful? why clear it?

> +			memset(hpte_slot_array, 0, PTE_FRAG_SIZE);
> +		}
>  	}
>
>  	valid = hpte_valid(hpte_slot_array, index);
Michael Ellerman <mpe@ellerman.id.au> writes:

> On Fri, 2016-02-19 at 11:23 +0530, Anshuman Khandual wrote:
>> On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
>> > We can get a hash pte fault with 4k base page size and find the pte
>> > already inserted with 64K base page size. In that case we need to clear
> ...
>>
>> > diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
>> > index e7c04542ba62..e3e76b929f33 100644
>> > --- a/arch/powerpc/mm/hash64_4k.c
>> > +++ b/arch/powerpc/mm/hash64_4k.c
>> > @@ -106,7 +106,7 @@ repeat:
>> >  			}
>> >  		}
>> >  		/*
>> > -		 * Hypervisor failure. Restore old pmd and return -1
>> > +		 * Hypervisor failure. Restore old pte and return -1
>>
>> This change is not relevant here. Should be a separate patch.
>
> Yeah.
>
> If it was -rc1 then I would probably let it go, but this will land in rc6 so
> the fixes need to be tight.
>

You want me to do an update with those changes dropped?

-aneesh
Balbir Singh <bsingharora@gmail.com> writes:

...........
............
>> diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
>> index 49b152b0f926..8424f46c2bf7 100644
>> --- a/arch/powerpc/mm/hugepage-hash64.c
>> +++ b/arch/powerpc/mm/hugepage-hash64.c
>> @@ -78,9 +78,14 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>>  		 * base page size. This is because demote_segment won't flush
>>  		 * hash page table entries.
>>  		 */
>> -		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO))
>> +		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO)) {
>>  			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K,
>>  					    ssize, flags);
>> +			/*
>> +			 * clear the old slot information
>> +			 */
> Redundant comment, something more useful? why clear it?
>> +			memset(hpte_slot_array, 0, PTE_FRAG_SIZE);
>> +		}
>>  	}
>>

Explained in the commit message.

With THP, we also clear the slot information with respect to all
the 64K hash ptes mapping that 16MB page. They are all invalid
now. This makes sure we don't find the slot valid when we fault with
4k base page size. Finding the slot valid should not result in any wrong
behavior because we do check again in the hash page table for validity.
But we can avoid that check completely.

>>  	valid = hpte_valid(hpte_slot_array, index);
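The reasoning above can be sketched with a toy model of the per-hugepage slot array. This is only an illustration, not the kernel code: the array size, the position of the valid bit, and every demo_* name are assumptions made up for the sketch. It shows that a slot recorded while the page was hashed with 64K base page size still reads as valid after the flush unless the whole array is zeroed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: the real array size (PTE_FRAG_SIZE) and the
 * encoding of each slot byte are defined by the kernel, not here. */
#define DEMO_SLOT_ARRAY_SIZE	4096
#define DEMO_SLOT_VALID		0x1

/* A 16MB page covers 16MB / 64K = 256 hash ptes with 64K base page size. */
static_assert(DEMO_SLOT_ARRAY_SIZE >= 256, "array too small for this model");

/* Record that a hash pte was inserted for this index (hidx is 4 bits). */
static inline void demo_mark_valid(unsigned char *slots, int index,
				   unsigned int hidx)
{
	slots[index] = (unsigned char)((hidx << 4) | DEMO_SLOT_VALID);
}

/* Nonzero if the array claims a hash pte exists for this index. */
static inline int demo_slot_valid(const unsigned char *slots, int index)
{
	return slots[index] & DEMO_SLOT_VALID;
}

/*
 * Model of a 4K fault after demotion: a 64K-era slot was recorded, the
 * old hash ptes are flushed, and the array is optionally cleared.
 * Returns what a later validity check on the same index would see.
 */
static inline int demo_refault_after_demote(int clear_array)
{
	static unsigned char slots[DEMO_SLOT_ARRAY_SIZE];

	memset(slots, 0, sizeof(slots));
	demo_mark_valid(slots, 5, 0x3);		/* stale 64K-era entry */

	if (clear_array)
		memset(slots, 0, sizeof(slots));	/* what the patch adds */

	return demo_slot_valid(slots, 5);
}
```

With clear_array set to 0 the stale entry is still reported as valid, which is the case the commit message says is harmless but forces a second lookup in the hash page table; with clear_array set to 1 the check is skipped entirely.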
Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> On 02/18/2016 10:14 PM, Aneesh Kumar K.V wrote:
>> We can get a hash pte fault with 4k base page size and find the pte
>> already inserted with 64K base page size. In that case we need to clear
>
> Can you please elaborate on this? What are those situations when we
> have 64K base page size on the PTE but we had inserted HPTE with base
> page size as 4K?

When we demote a segment.

>
>> the existing slot information from the old pte. Fix this correctly
>>
>> With THP, we also clear the slot information with respect to all
>> the 64K hash pte mapping that 16MB page. They are all invalid
>> now. This make sure we don't find the slot valid when we fault with
>> 4k base page size. Finding the slot valid should not result in any wrong
>> behavior because we do check again in hash page table for the validity.
>> But we can avoid that check completely.
>
> Makes sense.
>
>>
>> Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/hash64_4k.c       |  2 +-
>>  arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
>>  arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
>>  3 files changed, 16 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
>> index e7c04542ba62..e3e76b929f33 100644
>> --- a/arch/powerpc/mm/hash64_4k.c
>> +++ b/arch/powerpc/mm/hash64_4k.c
>> @@ -106,7 +106,7 @@ repeat:
>>  			}
>>  		}
>>  		/*
>> -		 * Hypervisor failure. Restore old pmd and return -1
>> +		 * Hypervisor failure. Restore old pte and return -1
>
> This change is not relevant here. Should be a separate patch.
>
>>  		 * similar to __hash_page_*
>>  		 */
>>  		if (unlikely(slot == -2)) {
>> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
>> index 0762c1e08c88..b3895720edb0 100644
>> --- a/arch/powerpc/mm/hash64_64k.c
>> +++ b/arch/powerpc/mm/hash64_64k.c
>> @@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  	 */
>>  	if (!(old_pte & _PAGE_COMBO)) {
>>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
>> -		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
>> +		/*
>> +		 * clear the old slot details from the old and new pte.
>> +		 * On hash insert failure we use old pte value and we don't
>> +		 * want slot information there if we have a insert failure.
>> +		 */
>> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>> +		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>
> But why do we need to clear the bits on new_pte as well?

We use new_pte when updating the actual pte towards the end of that
function.

>
>>  		goto htab_insert_hpte;
>>  	}
>>  	/*
>> @@ -182,7 +188,7 @@ repeat:
>>  			}
>>  		}
>>  		/*
>> -		 * Hypervisor failure. Restore old pmd and return -1
>> +		 * Hypervisor failure. Restore old pte and return -1
>
> This change is not relevant here. Should be a separate patch.
>
>
>>  		 * similar to __hash_page_*
>>  		 */
>>  		if (unlikely(slot == -2)) {
>> @@ -305,7 +311,7 @@ repeat:
>>  			}
>>  		}
>>  		/*
>> -		 * Hypervisor failure. Restore old pmd and return -1
>> +		 * Hypervisor failure. Restore old pte and return -1
>>  		 * similar to __hash_page_*
>
> Ditto.

-aneesh
diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index e7c04542ba62..e3e76b929f33 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -106,7 +106,7 @@ repeat:
 			}
 		}
 		/*
-		 * Hypervisor failure. Restore old pmd and return -1
+		 * Hypervisor failure. Restore old pte and return -1
 		 * similar to __hash_page_*
 		 */
 		if (unlikely(slot == -2)) {
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 0762c1e08c88..b3895720edb0 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -111,7 +111,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	if (!(old_pte & _PAGE_COMBO)) {
 		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
-		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
+		/*
+		 * clear the old slot details from the old and new pte.
+		 * On hash insert failure we use old pte value and we don't
+		 * want slot information there if we have a insert failure.
+		 */
+		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
+		new_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
 		goto htab_insert_hpte;
 	}
 	/*
@@ -182,7 +188,7 @@ repeat:
 			}
 		}
 		/*
-		 * Hypervisor failure. Restore old pmd and return -1
+		 * Hypervisor failure. Restore old pte and return -1
 		 * similar to __hash_page_*
 		 */
 		if (unlikely(slot == -2)) {
@@ -305,7 +311,7 @@ repeat:
 			}
 		}
 		/*
-		 * Hypervisor failure. Restore old pmd and return -1
+		 * Hypervisor failure. Restore old pte and return -1
 		 * similar to __hash_page_*
 		 */
 		if (unlikely(slot == -2)) {
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 49b152b0f926..8424f46c2bf7 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -78,9 +78,14 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 		 * base page size. This is because demote_segment won't flush
 		 * hash page table entries.
 		 */
-		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO))
+		if ((old_pmd & _PAGE_HASHPTE) && !(old_pmd & _PAGE_COMBO)) {
 			flush_hash_hugepage(vsid, ea, pmdp, MMU_PAGE_64K,
 					    ssize, flags);
+			/*
+			 * clear the old slot information
+			 */
+			memset(hpte_slot_array, 0, PTE_FRAG_SIZE);
+		}
 	}
 
 	valid = hpte_valid(hpte_slot_array, index);
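The hash64_64k.c hunk above swaps an unparenthesized mask for a parenthesized one. As a small sketch of why that matters in C: unary `~` binds tighter than `|`, so the old expression inverts only the first flag and ORs the other two back into the mask. The flag values below are hypothetical, chosen only so the bits are distinct; the real _PAGE_* definitions live in the powerpc headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, non-overlapping flag values for illustration only. */
#define DEMO_PAGE_PRESENT	0x0001ULL
#define DEMO_PAGE_HASHPTE	0x0400ULL
#define DEMO_PAGE_F_GIX		0x7000ULL
#define DEMO_PAGE_F_SECOND	0x8000ULL

/*
 * Because '~' applies to DEMO_PAGE_HASHPTE alone, OR-ing in the other
 * two flags changes nothing: the mask is simply ~DEMO_PAGE_HASHPTE.
 */
static_assert((~DEMO_PAGE_HASHPTE | DEMO_PAGE_F_GIX | DEMO_PAGE_F_SECOND)
	      == ~DEMO_PAGE_HASHPTE,
	      "unparenthesized mask only clears HASHPTE");

/* Shape of the removed line: clears HASHPTE, keeps GIX and SECOND. */
static inline uint64_t demo_clear_slot_old(uint64_t pte)
{
	pte &= ~DEMO_PAGE_HASHPTE | DEMO_PAGE_F_GIX | DEMO_PAGE_F_SECOND;
	return pte;
}

/* Shape of the added line: clears all three fields. */
static inline uint64_t demo_clear_slot_new(uint64_t pte)
{
	pte &= ~(DEMO_PAGE_HASHPTE | DEMO_PAGE_F_GIX | DEMO_PAGE_F_SECOND);
	return pte;
}
```

For an input of 0xD401 (present, hashed, group index 5, secondary bit set, under the assumed layout), the old form leaves 0xD001 and the new form leaves 0x0001.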
We can get a hash pte fault with 4k base page size and find the pte
already inserted with 64K base page size. In that case we need to clear
the existing slot information from the old pte. Fix this correctly

With THP, we also clear the slot information with respect to all
the 64K hash pte mapping that 16MB page. They are all invalid
now. This make sure we don't find the slot valid when we fault with
4k base page size. Finding the slot valid should not result in any wrong
behavior because we do check again in hash page table for the validity.
But we can avoid that check completely.

Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C")

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c       |  2 +-
 arch/powerpc/mm/hash64_64k.c      | 12 +++++++++---
 arch/powerpc/mm/hugepage-hash64.c |  7 ++++++-
 3 files changed, 16 insertions(+), 5 deletions(-)
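The patch clears the slot bits from both old_pte and new_pte. Going by the thread, old_pte is the value restored on a hash insert failure and new_pte is the value written back at the end of the function. The following is a rough model of that flow under stated assumptions: the flag values, the demo_* names, and the simplified control flow are all hypothetical and are not taken from __hash_page_4K itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define DEMO_PAGE_ACCESSED	0x00100ULL
#define DEMO_PAGE_HASHPTE	0x00400ULL
#define DEMO_PAGE_BUSY		0x00800ULL
#define DEMO_PAGE_F_GIX		0x07000ULL
#define DEMO_PAGE_F_SECOND	0x08000ULL
#define DEMO_PAGE_COMBO		0x40000ULL
#define DEMO_SLOT_BITS \
	(DEMO_PAGE_HASHPTE | DEMO_PAGE_F_GIX | DEMO_PAGE_F_SECOND)

static_assert((DEMO_SLOT_BITS & DEMO_PAGE_COMBO) == 0,
	      "model assumes COMBO does not overlap the slot bits");

/*
 * Returns the value that ends up in the Linux pte after a 4K fault on
 * a pte previously hashed with 64K base page size.
 */
static inline uint64_t demo_hash_4k(uint64_t old_pte, bool clear_new,
				    bool insert_ok)
{
	/* new_pte starts as a copy of old_pte, stale slot bits included. */
	uint64_t new_pte = old_pte | DEMO_PAGE_BUSY | DEMO_PAGE_ACCESSED;

	if (!(old_pte & DEMO_PAGE_COMBO)) {
		/* The 64K hash pte has been flushed at this point. */
		old_pte &= ~DEMO_SLOT_BITS;
		if (clear_new)
			new_pte &= ~DEMO_SLOT_BITS;
	}

	if (!insert_ok)
		return old_pte;		/* failure path restores old_pte */

	/* Success path: publish new_pte as a hashed combo page. */
	return (new_pte & ~DEMO_PAGE_BUSY) | DEMO_PAGE_HASHPTE | DEMO_PAGE_COMBO;
}
```

In this model, skipping the new_pte clear lets the 64K-era group index bits leak into the published pte, while clearing both leaves a clean value on the success path and on the failure path.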