Message ID | 1371033004-15864-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
On Wed, Jun 12, 2013 at 04:00:04PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> Book3E uses the hugepd at PMD level and don't encode pte directly
> at the pmd level. So it will find the lower bits of pmd set
> and the pmd_bad check throws error. Infact the current code
> will never take the free_hugepd_range call at all because it will
> clear the pmd if it find a hugepd pointer.
>
> Reported-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hugetlbpage.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)

Thanks; this fixes the error for me.

-Scott

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> Book3E uses the hugepd at PMD level and don't encode pte directly
> at the pmd level. So it will find the lower bits of pmd set
> and the pmd_bad check throws error. Infact the current code
> will never take the free_hugepd_range call at all because it will
> clear the pmd if it find a hugepd pointer.
>
> Reported-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Ben,

This is a regression introduced by e2b3d202d1dba8f3546ed28224ce485bc50010be
"powerpc: Switch 16GB and 16MB explicit hugepages to a different page
table format" and should go upstream in 3.10.

$ git describe --contains e2b3d202d1dba8f3546ed28224ce485bc50010be
v3.10-rc1~121^2~15

Without this patch, we leak hugepd with all subarchs using the old huge
page directory format.

-aneesh

Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> Book3E uses the hugepd at PMD level and don't encode pte directly
> at the pmd level. So it will find the lower bits of pmd set
> and the pmd_bad check throws error. Infact the current code
> will never take the free_hugepd_range call at all because it will
> clear the pmd if it find a hugepd pointer.
>

Please explain what changes you are making. Currently you are only
describing what the issue is.

Also include the SHA1 which caused the regression (ie
e2b3d202d1dba8f3546ed28224ce485bc50010be "powerpc: Switch 16GB and 16MB
explicit hugepages to a different page table format")

Mikey

> Reported-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hugetlbpage.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index f2f01fd..0d3d3ee 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -536,19 +536,26 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
>  	do {
>  		pmd = pmd_offset(pud, addr);
>  		next = pmd_addr_end(addr, end);
> -		if (pmd_none_or_clear_bad(pmd))
> -			continue;
> +		if (!is_hugepd(pmd)) {
> +			/*
> +			 * if it is not hugepd pointer, we should already find
> +			 * it cleared.
> +			 */
> +			if (!pmd_none_or_clear_bad(pmd))
> +				WARN_ON(1);

How often are we going to hit this? Should this be a warn_on once or
even a bug_on?

Also just make it:
  WARN_ON(!pmd_none_or_clear_bad(pmd))

Mikey

> +		} else {
>  #ifdef CONFIG_PPC_FSL_BOOK3E
> -		/*
> -		 * Increment next by the size of the huge mapping since
> -		 * there may be more than one entry at this level for a
> -		 * single hugepage, but all of them point to
> -		 * the same kmem cache that holds the hugepte.
> -		 */
> -		next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> +			/*
> +			 * Increment next by the size of the huge mapping since
> +			 * there may be more than one entry at this level for a
> +			 * single hugepage, but all of them point to
> +			 * the same kmem cache that holds the hugepte.
> +			 */
> +			next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
>  #endif
> -		free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> -				  addr, next, floor, ceiling);
> +			free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> +					  addr, next, floor, ceiling);
> +		}
>  	} while (addr = next, addr != end);
>
>  	start &= PUD_MASK;
> --
> 1.8.1.2

Michael Neuling <mikey@neuling.org> writes:
> Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>
>> Book3E uses the hugepd at PMD level and don't encode pte directly
>> at the pmd level. So it will find the lower bits of pmd set
>> and the pmd_bad check throws error. Infact the current code
>> will never take the free_hugepd_range call at all because it will
>> clear the pmd if it find a hugepd pointer.
>>
>
> Please explain what changes you are making. Currently you are only
> describing what the issue is.

will do

>
> Also include the SHA1 which caused the regression (ie
> e2b3d202d1dba8f3546ed28224ce485bc50010be "powerpc: Switch 16GB and 16MB
> explicit hugepages to a different page table format")

will add

>
> Mikey
>
>> Reported-by: Scott Wood <scottwood@freescale.com>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/hugetlbpage.c | 29 ++++++++++++++++++-----------
>>  1 file changed, 18 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
>> index f2f01fd..0d3d3ee 100644
>> --- a/arch/powerpc/mm/hugetlbpage.c
>> +++ b/arch/powerpc/mm/hugetlbpage.c
>> @@ -536,19 +536,26 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
>>  	do {
>>  		pmd = pmd_offset(pud, addr);
>>  		next = pmd_addr_end(addr, end);
>> -		if (pmd_none_or_clear_bad(pmd))
>> -			continue;
>> +		if (!is_hugepd(pmd)) {
>> +			/*
>> +			 * if it is not hugepd pointer, we should already find
>> +			 * it cleared.
>> +			 */
>> +			if (!pmd_none_or_clear_bad(pmd))
>> +				WARN_ON(1);
>
> How often are we going to hit this? Should this be a warn_on once or
> even a bug_on?

It should never happen. But I was thinking killing the system may be a bit
too much, hence WARN_ON.

>
> Also just make it:
>   WARN_ON(!pmd_none_or_clear_bad(pmd))
>

will do

-aneesh

Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> Michael Neuling <mikey@neuling.org> writes:
>
> > Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> >
> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> >>
> >> Book3E uses the hugepd at PMD level and don't encode pte directly
> >> at the pmd level. So it will find the lower bits of pmd set
> >> and the pmd_bad check throws error. Infact the current code
> >> will never take the free_hugepd_range call at all because it will
> >> clear the pmd if it find a hugepd pointer.
> >>
> >
> > Please explain what changes you are making. Currently you are only
> > describing what the issue is.
>
> will do
>
> >
> > Also include the SHA1 which caused the regression (ie
> > e2b3d202d1dba8f3546ed28224ce485bc50010be "powerpc: Switch 16GB and 16MB
> > explicit hugepages to a different page table format")
>
> will add
>
> >
> > Mikey
> >
> >> Reported-by: Scott Wood <scottwood@freescale.com>
> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> >> ---
> >>  arch/powerpc/mm/hugetlbpage.c | 29 ++++++++++++++++++-----------
> >>  1 file changed, 18 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> >> index f2f01fd..0d3d3ee 100644
> >> --- a/arch/powerpc/mm/hugetlbpage.c
> >> +++ b/arch/powerpc/mm/hugetlbpage.c
> >> @@ -536,19 +536,26 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> >>  	do {
> >>  		pmd = pmd_offset(pud, addr);
> >>  		next = pmd_addr_end(addr, end);
> >> -		if (pmd_none_or_clear_bad(pmd))
> >> -			continue;
> >> +		if (!is_hugepd(pmd)) {
> >> +			/*
> >> +			 * if it is not hugepd pointer, we should already find
> >> +			 * it cleared.
> >> +			 */
> >> +			if (!pmd_none_or_clear_bad(pmd))
> >> +				WARN_ON(1);
> >
> > How often are we going to hit this? Should this be a warn_on once or
> > even a bug_on?
>
> It should never happen. But I was thinking killing the system may be a bit
> too much, hence WARN_ON.

Maybe WARN_ON_ONCE. If you do hit it once, you are going to hit it a lot?

Mikey

> >
> > Also just make it:
> >   WARN_ON(!pmd_none_or_clear_bad(pmd))
> >
>
> will do
>
> -aneesh
>

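The WARN_ON versus WARN_ON_ONCE question above can be illustrated with a small userspace sketch. Everything here is a hypothetical stand-in (mock_warn_on, mock_warn_on_once, count_warnings are not kernel names); the real macros are kernel-only and report per call site, which this model approximates with a single static flag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins for the kernel macros; illustration only. */
static int warnings_printed;

/* Models WARN_ON(): report every time the condition is true. */
static bool mock_warn_on(bool cond)
{
	if (cond) {
		warnings_printed++;
		fprintf(stderr, "WARNING: non-hugepd pmd was not cleared\n");
	}
	return cond;
}

/* Models WARN_ON_ONCE() for one call site: report only the first hit. */
static bool mock_warn_on_once(bool cond)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING (once): non-hugepd pmd was not cleared\n");
	}
	return cond;
}

/* Hit the warning n_bad_entries times; return how many reports were emitted. */
int count_warnings(int n_bad_entries, bool once)
{
	int before = warnings_printed;

	for (int i = 0; i < n_bad_entries; i++) {
		if (once)
			mock_warn_on_once(true);
		else
			mock_warn_on(true);
	}
	return warnings_printed - before;
}
```

In this model, a walk that trips the check on every entry produces one report per entry with the WARN_ON-style helper and a single report with the WARN_ON_ONCE-style helper, which is the log-flooding concern raised in the reply.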
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index f2f01fd..0d3d3ee 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -536,19 +536,26 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
 	do {
 		pmd = pmd_offset(pud, addr);
 		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
+		if (!is_hugepd(pmd)) {
+			/*
+			 * if it is not hugepd pointer, we should already find
+			 * it cleared.
+			 */
+			if (!pmd_none_or_clear_bad(pmd))
+				WARN_ON(1);
+		} else {
 #ifdef CONFIG_PPC_FSL_BOOK3E
-		/*
-		 * Increment next by the size of the huge mapping since
-		 * there may be more than one entry at this level for a
-		 * single hugepage, but all of them point to
-		 * the same kmem cache that holds the hugepte.
-		 */
-		next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
+			/*
+			 * Increment next by the size of the huge mapping since
+			 * there may be more than one entry at this level for a
+			 * single hugepage, but all of them point to
+			 * the same kmem cache that holds the hugepte.
+			 */
+			next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
 #endif
-		free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
-				  addr, next, floor, ceiling);
+			free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
+					  addr, next, floor, ceiling);
+		}
 	} while (addr = next, addr != end);
 
 	start &= PUD_MASK;
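
The control-flow change in the hunk can be modelled in userspace as a rough sketch. The three-state enum, struct walk_result, and the walk_old/walk_new helpers are assumptions made for illustration and do not exist in the kernel; real PMD entries are bit patterns, and the CONFIG_PPC_FSL_BOOK3E adjustment of next is left out. The model follows the commit message in treating a hugepd pointer as failing the pmd_bad check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PMD slot; not a kernel type. */
enum slot { SLOT_NONE, SLOT_HUGEPD, SLOT_TABLE };

struct walk_result {
	int freed;   /* entries handed to the free routine */
	int leaked;  /* hugepd entries cleared without being freed */
	int warned;  /* non-hugepd entries that were unexpectedly populated */
};

/*
 * Models pmd_none_or_clear_bad(): true for an empty slot, true (after
 * clearing) for a slot that looks bad, false for a regular table pointer.
 * Per the commit message, a hugepd pointer has low bits set and looks bad.
 */
static bool none_or_clear_bad(enum slot *s)
{
	if (*s == SLOT_NONE)
		return true;
	if (*s == SLOT_HUGEPD) {
		*s = SLOT_NONE;
		return true;
	}
	return false;
}

/* Pre-patch loop: the bad-check runs first, so hugepd slots never get freed. */
struct walk_result walk_old(enum slot *slots, size_t n)
{
	struct walk_result r = {0, 0, 0};

	for (size_t i = 0; i < n; i++) {
		bool was_hugepd = (slots[i] == SLOT_HUGEPD);

		if (none_or_clear_bad(&slots[i])) {
			if (was_hugepd)
				r.leaked++;
			continue;
		}
		r.freed++;  /* in this model only SLOT_TABLE reaches here */
		slots[i] = SLOT_NONE;
	}
	return r;
}

/* Post-patch loop: test for hugepd first, warn on anything else populated. */
struct walk_result walk_new(enum slot *slots, size_t n)
{
	struct walk_result r = {0, 0, 0};

	for (size_t i = 0; i < n; i++) {
		if (slots[i] != SLOT_HUGEPD) {
			if (!none_or_clear_bad(&slots[i]))
				r.warned++;
		} else {
			r.freed++;  /* stands in for free_hugepd_range() */
			slots[i] = SLOT_NONE;
		}
	}
	return r;
}
```

Running both walks over the same slots, for example { SLOT_NONE, SLOT_HUGEPD, SLOT_HUGEPD, SLOT_NONE }, shows the leak described in the thread: the old ordering clears both hugepd slots without freeing them, while the new ordering frees both.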