Message ID | 1442300408-26490-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Accepted |

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> If we had secondary hash flag set, we ended up modifying hash value in
> the updatepp code path. Hence with a failed updatepp we will be using
> a wrong hash value for the following hash insert. Fix this by
> recomputing hash before insert.

Without this patch we can end up with using wrong slot number in linux
pte. That can result in us missing an hash pte update or invalidate
which can cause memory corruption or even machine check ?

-aneesh

On Wed, 2015-09-16 at 08:53 +0530, Aneesh Kumar K.V wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>
> > If we had secondary hash flag set, we ended up modifying hash value in
> > the updatepp code path. Hence with a failed updatepp we will be using
> > a wrong hash value for the following hash insert. Fix this by
> > recomputing hash before insert.
>
> Without this patch we can end up with using wrong slot number in linux
> pte. That can result in us missing an hash pte update or invalidate
> which can cause memory corruption or even machine check ?

Thanks. When did this break? Always? If so this should go to stable?

cheers

On Tue, Sep 15, 2015 at 12:30:08PM +0530, Aneesh Kumar K.V wrote:
> If we had secondary hash flag set, we ended up modifying hash value in
> the updatepp code path. Hence with a failed updatepp we will be using
> a wrong hash value for the following hash insert. Fix this by
> recomputing hash before insert.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Paul Mackerras <paulus@samba.org>

Michael Ellerman <mpe@ellerman.id.au> writes:

> On Wed, 2015-09-16 at 08:53 +0530, Aneesh Kumar K.V wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>>
>> > If we had secondary hash flag set, we ended up modifying hash value in
>> > the updatepp code path. Hence with a failed updatepp we will be using
>> > a wrong hash value for the following hash insert. Fix this by
>> > recomputing hash before insert.
>>
>> Without this patch we can end up with using wrong slot number in linux
>> pte. That can result in us missing an hash pte update or invalidate
>> which can cause memory corruption or even machine check ?
>
> Thanks. When did this break? Always? If so this should go to stable?
>

IIUC we have this issue with initial support for THP (6d492ecc6489113968ec269be1cf88942d4a5d29)
" powerpc/THP: Add code to handle HPTE faults for hugepages". So yes
this should got to stable.

-aneesh

On Wed, 2015-09-16 at 11:27 +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>
> > On Wed, 2015-09-16 at 08:53 +0530, Aneesh Kumar K.V wrote:
> >> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> >>
> >> > If we had secondary hash flag set, we ended up modifying hash value in
> >> > the updatepp code path. Hence with a failed updatepp we will be using
> >> > a wrong hash value for the following hash insert. Fix this by
> >> > recomputing hash before insert.
> >>
> >> Without this patch we can end up with using wrong slot number in linux
> >> pte. That can result in us missing an hash pte update or invalidate
> >> which can cause memory corruption or even machine check ?
> >
> > Thanks. When did this break? Always? If so this should go to stable?
> >
>
> IIUC we have this issue with initial support for THP (6d492ecc6489113968ec269be1cf88942d4a5d29)
> " powerpc/THP: Add code to handle HPTE faults for hugepages". So yes
> this should got to stable.

Thanks. And that went into 3.11.

You haven't actually seen any crashes that are definitely linked to this though
am I right? You just found it by code inspection?

cheers

Michael Ellerman <mpe@ellerman.id.au> writes:

> On Wed, 2015-09-16 at 11:27 +0530, Aneesh Kumar K.V wrote:
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>>
>> > On Wed, 2015-09-16 at 08:53 +0530, Aneesh Kumar K.V wrote:
>> >> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>> >>
>> >> > If we had secondary hash flag set, we ended up modifying hash value in
>> >> > the updatepp code path. Hence with a failed updatepp we will be using
>> >> > a wrong hash value for the following hash insert. Fix this by
>> >> > recomputing hash before insert.
>> >>
>> >> Without this patch we can end up with using wrong slot number in linux
>> >> pte. That can result in us missing an hash pte update or invalidate
>> >> which can cause memory corruption or even machine check ?
>> >
>> > Thanks. When did this break? Always? If so this should go to stable?
>> >
>>
>> IIUC we have this issue with initial support for THP (6d492ecc6489113968ec269be1cf88942d4a5d29)
>> " powerpc/THP: Add code to handle HPTE faults for hugepages". So yes
>> this should got to stable.
>
> Thanks. And that went into 3.11.
>
> You haven't actually seen any crashes that are definitely linked to this though
> am I right? You just found it by code inspection?
>

I am still not sure, why we haven't seen crashes. One of the possibility
is that we removed that slot because we ran out of free space soon
enough and everything went back normal. Yes I found this by code
inspection.

-aneesh

On Tue, 2015-15-09 at 07:00:08 UTC, "Aneesh Kumar K.V" wrote:
> If we had secondary hash flag set, we ended up modifying hash value in
> the updatepp code path. Hence with a failed updatepp we will be using
> a wrong hash value for the following hash insert. Fix this by
> recomputing hash before insert.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Reviewed-by: Paul Mackerras <paulus@samba.org>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/36b35d5d807b7e57aff7d08e

cheers

diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 43dafb9d6a46..4d87122cf6a7 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -85,7 +85,6 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 	BUG_ON(index >= 4096);
 
 	vpn = hpt_vpn(ea, vsid, ssize);
-	hash = hpt_hash(vpn, shift, ssize);
 	hpte_slot_array = get_hpte_slot_array(pmdp);
 	if (psize == MMU_PAGE_4K) {
 		/*
@@ -101,6 +100,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 		valid = hpte_valid(hpte_slot_array, index);
 	if (valid) {
 		/* update the hpte bits */
+		hash = hpt_hash(vpn, shift, ssize);
 		hidx = hpte_hash_index(hpte_slot_array, index);
 		if (hidx & _PTEIDX_SECONDARY)
 			hash = ~hash;
@@ -126,6 +126,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 	if (!valid) {
 		unsigned long hpte_group;
 
+		hash = hpt_hash(vpn, shift, ssize);
 		/* insert new entry */
 		pa = pmd_pfn(__pmd(old_pmd)) << PAGE_SHIFT;
 		new_pmd |= _PAGE_HASHPTE;

If we had secondary hash flag set, we ended up modifying hash value in
the updatepp code path. Hence with a failed updatepp we will be using
a wrong hash value for the following hash insert. Fix this by
recomputing hash before insert.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugepage-hash64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
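
The control flow described in the commit message can be modelled outside the kernel. The sketch below is only an illustration, not kernel code: `toy_hpt_hash()`, `TOY_PTEIDX_SECONDARY`, and both `insert_hash_*` functions are hypothetical stand-ins, and the update is assumed to fail every time so that the insert path is always reached. Each function returns the hash value the insert path would end up using.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's _PTEIDX_SECONDARY flag. */
#define TOY_PTEIDX_SECONDARY 0x8u

/* Toy stand-in for hpt_hash(); the real one also takes shift and ssize. */
static uint64_t toy_hpt_hash(uint64_t vpn)
{
	return vpn ^ (vpn >> 7);
}

/* Pre-patch flow: hash is computed once, before the update path. */
static uint64_t insert_hash_prepatch(uint64_t vpn, unsigned int hidx)
{
	uint64_t hash = toy_hpt_hash(vpn);

	/* Update path: switch to the secondary hash if the slot lives there. */
	if (hidx & TOY_PTEIDX_SECONDARY)
		hash = ~hash;

	/* Assume the update failed; the insert path reuses 'hash' as-is. */
	return hash;
}

/* Post-patch flow: hash is recomputed right before the insert. */
static uint64_t insert_hash_postpatch(uint64_t vpn, unsigned int hidx)
{
	uint64_t hash = toy_hpt_hash(vpn);

	if (hidx & TOY_PTEIDX_SECONDARY)
		hash = ~hash;

	/* Assume the update failed; recompute before inserting. */
	hash = toy_hpt_hash(vpn);
	return hash;
}
```

With the secondary flag set, `insert_hash_prepatch()` returns the inverted hash, so in this model the insert would start from the secondary group while treating it as the primary one. `insert_hash_postpatch()` returns the primary hash regardless of the flag.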