Message ID | 1454383281-156550-1-git-send-email-nitin.m.gupta@oracle.com
---|---
State | Superseded
Delegated to | David Miller
From: Nitin Gupta <nitin.m.gupta@oracle.com>
Date: Mon, 1 Feb 2016 19:21:21 -0800

> During hugepage unmap, TLB flush is currently issued
> at every PAGE_SIZE'd boundary, which is unnecessary. We
> now issue the flush at REAL_HPAGE_SIZE boundaries only.
>
> Without this patch, workloads which unmap a large hugepage-backed
> VMA region get CPU lockups due to excessive TLB
> flush calls.
>
> Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>

Thanks for finding this, but we'll need a few adjustments to your
patch.

First of all, you can't do the final TLB flush of each REAL_HPAGE_SIZE
entry until all of the PTEs that cover that region have been cleared.
Otherwise a TLB miss on any cpu can reload the entry after you've
flushed it.

Second, the stores should be done in-order and consecutively, in order
to optimize store buffer compression.

I would recommend clearing all of the PTEs and then executing the
two TLB and TSB flushes right afterwards as an independent operation,
not via pte_clear().
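Sketched concretely, the two-pass structure being recommended here looks roughly like the following. This is a minimal illustration only, not the code that was merged; sparc64_flush_huge_tlb_tsb() is a hypothetical placeholder for the real TLB and TSB flush primitives, which the message does not name.

pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
			      pte_t *ptep)
{
	unsigned long i, nptes = 1UL << HUGETLB_PAGE_ORDER;
	pte_t entry = *ptep;

	if (pte_present(entry))
		mm->context.huge_pte_count--;

	addr &= HPAGE_MASK;

	/*
	 * Pass 1: clear every PTE with plain, consecutive, in-order
	 * stores so the cpu can compress them in the store buffer.
	 * No flushing yet: a TLB miss on another cpu must not be able
	 * to reload a translation we have already shot down.
	 */
	for (i = 0; i < nptes; i++)
		ptep[i] = __pte(0UL);

	/*
	 * Pass 2: with all PTEs gone, flush each REAL_HPAGE_SIZE half
	 * exactly once, as an independent operation.
	 */
	sparc64_flush_huge_tlb_tsb(mm, addr);			/* hypothetical helper */
	sparc64_flush_huge_tlb_tsb(mm, addr + REAL_HPAGE_SIZE);	/* hypothetical helper */

	return entry;
}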
On 02/02/2016 01:25 PM, David Miller wrote:
> From: Nitin Gupta <nitin.m.gupta@oracle.com>
> Date: Mon, 1 Feb 2016 19:21:21 -0800
>
>> During hugepage unmap, TLB flush is currently issued
>> at every PAGE_SIZE'd boundary, which is unnecessary. We
>> now issue the flush at REAL_HPAGE_SIZE boundaries only.
>>
>> Without this patch, workloads which unmap a large hugepage-backed
>> VMA region get CPU lockups due to excessive TLB
>> flush calls.
>>
>> Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
>
> Thanks for finding this, but we'll need a few adjustments to your
> patch.
>
> First of all, you can't do the final TLB flush of each REAL_HPAGE_SIZE
> entry until all of the PTEs that cover that region have been cleared.
> Otherwise a TLB miss on any cpu can reload the entry after you've
> flushed it.
>
> Second, the stores should be done in-order and consecutively, in order
> to optimize store buffer compression.
>
> I would recommend clearing all of the PTEs and then executing the
> two TLB and TSB flushes right afterwards as an independent operation,
> not via pte_clear().

Thanks for the review. I've sent a v2 patch with the above changes.

Apart from this lockup during unmap, I'm also getting lockups during
map of a large hugepage-backed VMA region. I see the backtrace as:

  hugetlb_fault -> huge_pte_alloc -> __pte_alloc -> __raw_spin_trylock

I think last-level page table allocation can be completely avoided for
huge pages, allocating only down to the PMD level. This would at least
avoid looping over 1024 PTEs during map/unmap and save some memory.

Do you think this change would be worth doing?

Thanks,
Nitin
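The PMD-level idea floated above would look roughly like the sketch below: huge_pte_alloc() stops at the PMD and hands the PMD slot back cast to a pte_t *. This is purely illustrative of the proposal; the set/get/clear paths would need matching changes, and this is not the code under review in this thread.

pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr,
		      unsigned long sz)
{
	pgd_t *pgd = pgd_offset(mm, addr);
	pud_t *pud = pud_alloc(mm, pgd, addr);

	if (!pud)
		return NULL;

	/*
	 * Stop at the PMD: the huge mapping is recorded in the PMD
	 * slot itself, so no last-level table of 1024 PTEs is ever
	 * allocated, and map/unmap touch one entry instead of looping.
	 */
	return (pte_t *)pmd_alloc(mm, pud, addr);
}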
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 131eaf4..48927cb 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -193,16 +193,24 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep)
 {
 	pte_t entry;
-	int i;
+	int i, pte_count;
 
 	entry = *ptep;
 	if (pte_present(entry))
 		mm->context.huge_pte_count--;
 
 	addr &= HPAGE_MASK;
+	pte_count = 1 << HUGETLB_PAGE_ORDER;
 
-	for (i = 0; i < (1 << HUGETLB_PAGE_ORDER); i++) {
-		pte_clear(mm, addr, ptep);
+	/*
+	 * pte_clear issues TLB flush which is required
+	 * only for REAL_HPAGE_SIZE aligned addresses.
+	 */
+	pte_clear(mm, addr, ptep);
+	pte_clear(mm, addr + REAL_HPAGE_SIZE, ptep + pte_count / 2);
+	ptep++;
+	for (i = 1; i < pte_count; i++) {
+		*ptep = __pte(0UL);
 		addr += PAGE_SIZE;
 		ptep++;
 	}
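Applied to its context, the hunk above yields roughly the following function body (the return statement falls outside the quoted hunk and is inferred from the function's pte_t return type). This is the v1 shape that David's review objects to: the two pte_clear() calls issue their REAL_HPAGE_SIZE flushes while most of the 1024 PTEs covering the hugepage are still live.

pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
			      pte_t *ptep)
{
	pte_t entry;
	int i, pte_count;

	entry = *ptep;
	if (pte_present(entry))
		mm->context.huge_pte_count--;

	addr &= HPAGE_MASK;
	pte_count = 1 << HUGETLB_PAGE_ORDER;

	/*
	 * pte_clear issues TLB flush which is required
	 * only for REAL_HPAGE_SIZE aligned addresses.
	 */
	pte_clear(mm, addr, ptep);
	pte_clear(mm, addr + REAL_HPAGE_SIZE, ptep + pte_count / 2);
	ptep++;
	for (i = 1; i < pte_count; i++) {
		*ptep = __pte(0UL);
		addr += PAGE_SIZE;
		ptep++;
	}

	return entry;	/* tail inferred; not part of the quoted hunk */
}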
During hugepage unmap, TLB flush is currently issued
at every PAGE_SIZE'd boundary, which is unnecessary. We
now issue the flush at REAL_HPAGE_SIZE boundaries only.

Without this patch, workloads which unmap a large hugepage-backed
VMA region get CPU lockups due to excessive TLB
flush calls.

Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
---
 arch/sparc/mm/hugetlbpage.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
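For context on why only two flushes per hugepage suffice: on sparc64 of this era, an 8MB hugepage is backed by two 4MB "real" hugepage TTEs, so only two TLB entries exist per hugepage. The constants below are quoted from memory from arch/sparc/include/asm/page_64.h around v4.5 and should be verified against the actual tree.

/*
 * From arch/sparc/include/asm/page_64.h (hedged, from memory):
 * an HPAGE is two REAL_HPAGEs, so unmap needs exactly two flushes.
 */
#define HPAGE_SHIFT		23	/* 8MB hugepage as seen by hugetlbfs */
#define REAL_HPAGE_SHIFT	22	/* 4MB TTE actually loaded into the TLB */
#define HPAGE_SIZE		(_AC(1, UL) << HPAGE_SHIFT)
#define REAL_HPAGE_SIZE		(_AC(1, UL) << REAL_HPAGE_SHIFT)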