From patchwork Sun May 13 04:21:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 912510
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Cc: Nicholas Piggin
Subject: [PATCH 3/3] powerpc/64s/radix: optimise pte_update
Date: Sun, 13 May 2018 14:21:06 +1000
Message-Id: <20180513042106.15470-4-npiggin@gmail.com>
In-Reply-To: <20180513042106.15470-1-npiggin@gmail.com>
References: <20180513042106.15470-1-npiggin@gmail.com>
X-Mailer: git-send-email 2.17.0
List-Id: Linux on PowerPC Developers Mail List

Implementing pte_update with pte_xchg (which uses cmpxchg) is
inefficient. A single larx/stcx. works fine, no need for the less
efficient cmpxchg sequence.

Then remove the memory barriers from the operation. There is a
requirement for TLB flushing to load mm_cpumask after the store
that reduces pte permissions, which is moved into the TLB flush
code.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/book3s/64/radix.h | 25 +++++++++++-----------
 arch/powerpc/mm/mmu_context.c              |  6 ++++--
 arch/powerpc/mm/tlb-radix.c                | 11 +++++++++-
 3 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 45bf1e1b1d33..cc9437a542cc 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -127,20 +127,21 @@ extern void radix__mark_initmem_nx(void);
 static inline unsigned long __radix_pte_update(pte_t *ptep, unsigned long clr,
 					unsigned long set)
 {
-	pte_t pte;
-	unsigned long old_pte, new_pte;
-
-	do {
-		pte = READ_ONCE(*ptep);
-		old_pte = pte_val(pte);
-		new_pte = (old_pte | set) & ~clr;
-
-	} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
-
-	return old_pte;
+	__be64 old_be, tmp_be;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%3		# pte_update\n"
+	"	andc	%1,%0,%5	\n"
+	"	or	%1,%1,%4	\n"
+	"	stdcx.	%1,0,%3		\n"
+	"	bne-	1b"
+	: "=&r" (old_be), "=&r" (tmp_be), "=m" (*ptep)
+	: "r" (ptep), "r" (cpu_to_be64(set)), "r" (cpu_to_be64(clr))
+	: "cc" );
+
+	return be64_to_cpu(old_be);
 }
 
-
 static inline unsigned long radix__pte_update(struct mm_struct *mm,
 					unsigned long addr,
 					pte_t *ptep, unsigned long clr,
diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c
index 0ab297c4cfad..f84e14f23e50 100644
--- a/arch/powerpc/mm/mmu_context.c
+++ b/arch/powerpc/mm/mmu_context.c
@@ -57,8 +57,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	 * in switch_slb(), and/or the store of paca->mm_ctx_id in
 	 * copy_mm_to_paca().
 	 *
-	 * On the read side the barrier is in pte_xchg(), which orders
-	 * the store to the PTE vs the load of mm_cpumask.
+	 * On the other side, the barrier is in mm/tlb-radix.c for
+	 * radix which orders earlier stores to clear the PTEs vs
+	 * the load of mm_cpumask. And pte_xchg which does the same
+	 * thing for hash.
 	 *
 	 * This full barrier is needed by membarrier when switching
 	 * between processes after store to rq->curr, before user-space
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 55f93d66c8d2..b419702b1ba6 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -535,6 +535,11 @@ void radix__flush_tlb_mm(struct mm_struct *mm)
 		return;
 
 	preempt_disable();
+	/*
+	 * Order loads of mm_cpumask vs previous stores to clear ptes before
+	 * the invalidate. See barrier in switch_mm_irqs_off
+	 */
+	smp_mb();
 	if (!mm_is_thread_local(mm)) {
 		if (mm_is_singlethreaded(mm)) {
 			_tlbie_pid(pid, RIC_FLUSH_ALL);
@@ -560,6 +565,7 @@ void radix__flush_all_mm(struct mm_struct *mm)
 		return;
 
 	preempt_disable();
+	smp_mb(); /* see radix__flush_tlb_mm */
 	if (!mm_is_thread_local(mm)) {
 		_tlbie_pid(pid, RIC_FLUSH_ALL);
 		if (mm_is_singlethreaded(mm))
@@ -587,6 +593,7 @@ void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
 		return;
 
 	preempt_disable();
+	smp_mb(); /* see radix__flush_tlb_mm */
 	if (mm_is_thread_local(mm)) {
 		_tlbiel_va(vmaddr, pid, psize, RIC_FLUSH_TLB);
 	} else {
@@ -655,6 +662,7 @@ void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
 		return;
 
 	preempt_disable();
+	smp_mb(); /* see radix__flush_tlb_mm */
 	if (mm_is_thread_local(mm)) {
 		local = true;
 		full = (end == TLB_FLUSH_ALL ||
@@ -820,6 +828,7 @@ static inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
 		return;
 
 	preempt_disable();
+	smp_mb(); /* see radix__flush_tlb_mm */
 	if (mm_is_thread_local(mm)) {
 		local = true;
 		full = (end == TLB_FLUSH_ALL ||
@@ -882,7 +891,7 @@ void radix__flush_tlb_collapsed_pmd(struct mm_struct *mm, unsigned long addr)
 
 	/* Otherwise first do the PWC, then iterate the pages. */
 	preempt_disable();
-
+	smp_mb(); /* see radix__flush_tlb_mm */
 	if (mm_is_thread_local(mm)) {
 		_tlbiel_va_range(addr, end, pid, PAGE_SIZE, mmu_virtual_psize, true);
 	} else {
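
For readers following the __radix_pte_update change, the two shapes can
be modelled in userspace with C11 atomics. This is only a rough sketch
under stated assumptions, not kernel code: the function names
(pte_update_cmpxchg, pte_update_rmw, swab64, pte_sketch_selfcheck) are
made up for illustration, a plain uint64_t stands in for pte_t, and a
relaxed compare-exchange loop stands in for the ldarx/stdcx. pair. It
illustrates two things visible in the diff: the old loop computes
(old | set) & ~clr while the new asm computes (old & ~clr) | set, and
byte-swapping the masks once gives the same result as byte-swapping the
loaded value, which is why the asm can operate on the big-endian PTE
directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Old shape: read, compute, compare-and-swap, retry. The default
 * seq_cst ordering loosely models the full barriers in pte_xchg(). */
uint64_t pte_update_cmpxchg(_Atomic uint64_t *ptep, uint64_t clr, uint64_t set)
{
	uint64_t old_pte, new_pte;

	do {
		old_pte = atomic_load_explicit(ptep, memory_order_relaxed);
		new_pte = (old_pte | set) & ~clr;	/* set first, then clear */
	} while (!atomic_compare_exchange_strong(ptep, &old_pte, new_pte));

	return old_pte;
}

/* New shape: one relaxed read-modify-write with no ordering implied.
 * On failure the compare-exchange reloads old_pte, so the new value
 * is recomputed from fresh contents on every attempt. */
uint64_t pte_update_rmw(_Atomic uint64_t *ptep, uint64_t clr, uint64_t set)
{
	uint64_t old_pte = atomic_load_explicit(ptep, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(ptep, &old_pte,
			(old_pte & ~clr) | set,	/* clear first, then set */
			memory_order_relaxed, memory_order_relaxed))
		;

	return old_pte;
}

/* Portable 64-bit byte swap, standing in for cpu_to_be64() on a
 * little-endian host. */
uint64_t swab64(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (x & 0xff);
		x >>= 8;
	}
	return r;
}

/* Bitwise ops commute with byte swapping, so swapping the masks is
 * equivalent to swapping the PTE value itself. */
void pte_sketch_selfcheck(void)
{
	uint64_t old = 0x8000000000000105ULL, clr = 0x100, set = 0x2;

	assert(swab64((old & ~clr) | set) ==
	       ((swab64(old) & ~swab64(clr)) | swab64(set)));
}
```

The two functions agree whenever set and clr do not overlap. If a bit
appears in both masks, the old form leaves it clear and the new form
leaves it set, so the sketch assumes callers pass disjoint masks.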