From patchwork Tue Apr 4 13:42:22 2023
X-Patchwork-Submitter: Yair Podemsky
X-Patchwork-Id: 1764973
From: Yair Podemsky
To: linux@armlinux.org.uk, mpe@ellerman.id.au, npiggin@gmail.com,
 christophe.leroy@csgroup.eu, hca@linux.ibm.com, gor@linux.ibm.com,
 agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com,
 davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
 will@kernel.org, aneesh.kumar@linux.ibm.com, akpm@linux-foundation.org,
 peterz@infradead.org, arnd@arndb.de, keescook@chromium.org,
 paulmck@kernel.org, jpoimboe@kernel.org, samitolvanen@google.com,
 frederic@kernel.org, ardb@kernel.org, juerg.haefliger@canonical.com,
 rmk+kernel@armlinux.org.uk, geert+renesas@glider.be, tony@atomide.com,
 linus.walleij@linaro.org, sebastian.reichel@collabora.com,
 nick.hawkins@hpe.com, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 linux-s390@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-arch@vger.kernel.org, linux-mm@kvack.org, mtosatti@redhat.com,
 vschneid@redhat.com, dhildenb@redhat.com
Cc: ypodemsk@redhat.com, alougovs@redhat.com
Subject: [PATCH 1/3] arch: Introduce ARCH_HAS_CPUMASK_BITS
Date: Tue, 4 Apr 2023 16:42:22 +0300
Message-Id: <20230404134224.137038-2-ypodemsk@redhat.com>
In-Reply-To: <20230404134224.137038-1-ypodemsk@redhat.com>
References: <20230404134224.137038-1-ypodemsk@redhat.com>
X-Mailing-List: sparclinux@vger.kernel.org

Some architectures set and maintain the mm_cpumask bits when loading or
removing a process from a CPU.
This Kconfig option marks those architectures, allowing different
behavior between kernels that maintain the mm_cpumask and those that do
not.

Signed-off-by: Yair Podemsky
---
 arch/Kconfig         | 8 ++++++++
 arch/arm/Kconfig     | 1 +
 arch/powerpc/Kconfig | 1 +
 arch/s390/Kconfig    | 1 +
 arch/sparc/Kconfig   | 1 +
 arch/x86/Kconfig     | 1 +
 6 files changed, 13 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index e3511afbb7f2..ec5559779e9f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1434,6 +1434,14 @@ config ARCH_HAS_NONLEAF_PMD_YOUNG
 	  address translations. Page table walkers that clear the accessed bit
 	  may use this capability to reduce their search space.
 
+config ARCH_HAS_CPUMASK_BITS
+	bool
+	help
+	  Architectures that select this option set bits on the mm_cpumask
+	  to mark which cpus loaded the mm. The mask can then be used to
+	  control mm specific actions such as tlb_flush.
+
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e24a9820e12f..6111059a68a3 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -70,6 +70,7 @@ config ARM
 	select GENERIC_SCHED_CLOCK
 	select GENERIC_SMP_IDLE_THREAD
 	select HARDIRQS_SW_RESEND
+	select ARCH_HAS_CPUMASK_BITS
 	select HAVE_ARCH_AUDITSYSCALL if AEABI && !OABI_COMPAT
 	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index a6c4407d3ec8..2fd0160f4f8e 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -144,6 +144,7 @@ config PPC
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_HAS_CPUMASK_BITS
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_KEEP_MEMBLOCK
 	select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 9809c74e1240..b2de5ee07faf 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -86,6 +86,7 @@ config S390
 	select ARCH_HAS_SYSCALL_WRAPPER
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
 	select ARCH_HAS_VDSO_DATA
+	select ARCH_HAS_CPUMASK_BITS
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_INLINE_READ_LOCK
 	select ARCH_INLINE_READ_LOCK_BH
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 84437a4c6545..f9e0cf26d447 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -98,6 +98,7 @@ config SPARC64
 	select ARCH_HAS_PTE_SPECIAL
 	select PCI_DOMAINS if PCI
 	select ARCH_HAS_GIGANTIC_PAGE
+	select ARCH_HAS_CPUMASK_BITS
 	select HAVE_SOFTIRQ_ON_OWN_STACK
 	select HAVE_SETUP_PER_CPU_AREA
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a825bf031f49..d98dfdf9c6b4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -183,6 +183,7 @@ config X86
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
+	select ARCH_HAS_CPUMASK_BITS
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
 	select HAVE_ARCH_USERFAULTFD_WP if X86_64 && USERFAULTFD

From patchwork Tue Apr 4 13:42:23 2023
X-Patchwork-Submitter: Yair Podemsky
X-Patchwork-Id: 1764972
From: Yair Podemsky
Subject: [PATCH 2/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only
 to MM CPUs
Date: Tue, 4 Apr 2023 16:42:23 +0300
Message-Id: <20230404134224.137038-3-ypodemsk@redhat.com>
In-Reply-To: <20230404134224.137038-1-ypodemsk@redhat.com>
References: <20230404134224.137038-1-ypodemsk@redhat.com>
X-Mailing-List: sparclinux@vger.kernel.org

Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs
indiscriminately. This causes unnecessary work and delays that are
notable in real-time use cases and on isolated CPUs.
On systems with ARCH_HAS_CPUMASK_BITS, this patch limits the IPI to the
CPUs referencing the affected mm.
Signed-off-by: Yair Podemsky
Suggested-by: David Hildenbrand
---
 include/asm-generic/tlb.h |  4 ++--
 mm/khugepaged.c           |  4 ++--
 mm/mmu_gather.c           | 17 ++++++++++++-----
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index b46617207c93..0b6ba17cc8d3 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -222,7 +222,7 @@ extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
 #define tlb_needs_table_invalidate() (true)
 #endif
 
-void tlb_remove_table_sync_one(void);
+void tlb_remove_table_sync_one(struct mm_struct *mm);
 
 #else
 
@@ -230,7 +230,7 @@ void tlb_remove_table_sync_one(void);
 #error tlb_needs_table_invalidate() requires MMU_GATHER_RCU_TABLE_FREE
 #endif
 
-static inline void tlb_remove_table_sync_one(void) { }
+static inline void tlb_remove_table_sync_one(struct mm_struct *mm) { }
 
 #endif /* CONFIG_MMU_GATHER_RCU_TABLE_FREE */
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 92e6f56a932d..2b4e6ca1f38e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1070,7 +1070,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	_pmd = pmdp_collapse_flush(vma, address, pmd);
 	spin_unlock(pmd_ptl);
 	mmu_notifier_invalidate_range_end(&range);
-	tlb_remove_table_sync_one();
+	tlb_remove_table_sync_one(mm);
 
 	spin_lock(pte_ptl);
 	result = __collapse_huge_page_isolate(vma, address, pte, cc,
@@ -1427,7 +1427,7 @@ static void collapse_and_free_pmd(struct mm_struct *mm, struct vm_area_struct *v
 				addr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	pmd = pmdp_collapse_flush(vma, addr, pmdp);
-	tlb_remove_table_sync_one();
+	tlb_remove_table_sync_one(mm);
 	mmu_notifier_invalidate_range_end(&range);
 	mm_dec_nr_ptes(mm);
 	page_table_check_pte_clear_range(mm, addr, pmd);
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 2b93cf6ac9ae..5ea9be6fb87c 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -191,7 +191,13 @@ static void tlb_remove_table_smp_sync(void *arg)
 	/* Simply deliver the interrupt */
 }
 
-void tlb_remove_table_sync_one(void)
+#ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
+#define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
+#else
+#define REMOVE_TABLE_IPI_MASK NULL
+#endif /* CONFIG_ARCH_HAS_CPUMASK_BITS */
+
+void tlb_remove_table_sync_one(struct mm_struct *mm)
 {
 	/*
 	 * This isn't an RCU grace period and hence the page-tables cannot be
@@ -200,7 +206,8 @@ void tlb_remove_table_sync_one(void)
 	 * It is however sufficient for software page-table walkers that rely on
 	 * IRQ disabling.
 	 */
-	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
+	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
+			NULL, true);
 }
 
 static void tlb_remove_table_rcu(struct rcu_head *head)
@@ -237,9 +244,9 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }
 
-static void tlb_remove_table_one(void *table)
+static void tlb_remove_table_one(struct mm_struct *mm, void *table)
 {
-	tlb_remove_table_sync_one();
+	tlb_remove_table_sync_one(mm);
 	__tlb_remove_table(table);
 }
 
@@ -262,7 +269,7 @@ void tlb_remove_table(struct mmu_gather *tlb, void *table)
 		*batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
 		if (*batch == NULL) {
 			tlb_table_invalidate(tlb);
-			tlb_remove_table_one(table);
+			tlb_remove_table_one(tlb->mm, table);
 			return;
 		}
 		(*batch)->nr = 0;

From patchwork Tue Apr 4 13:42:24 2023
X-Patchwork-Submitter: Yair Podemsky
X-Patchwork-Id: 1764974
From: Yair Podemsky
Subject: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only
 to CPUs in kernel mode
Date: Tue, 4 Apr 2023 16:42:24 +0300
Message-Id: <20230404134224.137038-4-ypodemsk@redhat.com>
In-Reply-To: <20230404134224.137038-1-ypodemsk@redhat.com>
References: <20230404134224.137038-1-ypodemsk@redhat.com>
X-Mailing-List: sparclinux@vger.kernel.org

The tlb_remove_table_smp_sync IPI is used to ensure that the outdated
tlb page is not currently being accessed and can be cleared.
This is the case once all CPUs have left the lockless gup code section.
If they re-enter the page table walk, the pointers will be to the new
pages.
Therefore the IPI is only needed for CPUs in kernel mode.
Not sending the IPI to CPUs that are not in kernel mode reduces
latency.

Race condition considerations:
The context state check is vulnerable to race conditions between the
moment the context state is read and the moment the IPI is sent (or
not).

The scenarios are as follows.
case 1:
CPU-A                                             CPU-B

                                                  state == CONTEXT_KERNEL
int state = atomic_read(&ct->state);
                                                  Kernel-exit:
                                                  state == CONTEXT_USER
if ((state & CT_STATE_MASK) == CONTEXT_KERNEL)

In this case, the IPI will be sent to CPU-B even though it is no longer
in the kernel. The consequence is an unnecessary IPI handled by CPU-B,
which increases its latency.
Without this patch, that would have been the case every time.

case 2:
CPU-A                                             CPU-B

modify pagetables
tlb_flush (memory barrier)
                                                  state == CONTEXT_USER
int state = atomic_read(&ct->state);
                                                  Kernel-enter:
                                                  state == CONTEXT_KERNEL
                                                  READ(pagetable values)
if ((state & CT_STATE_MASK) == CONTEXT_USER)

In this case, the IPI will not be sent to CPU-B even though it has
returned to the kernel and may even be reading the pagetable.
However, since CPU-B entered the pagetable walk after the modification,
it is reading the new, safe values.

The only case in which this IPI is truly necessary is when CPU-B entered
the lockless gup code section before the pagetable modifications and has
yet to exit it, in which case it is still in the kernel.
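
The two cases above can be modelled in user space. What follows is a
minimal sketch, not kernel code: the cpu_state array, the numeric values
given to CT_STATE_MASK, CONTEXT_KERNEL and CONTEXT_USER, and the helper
names cpu_in_kernel_model() and select_ipi_targets() are assumptions made
for illustration only. It shows that the filter works on a snapshot of each
CPU's state, so the decision can be stale by the time it is acted upon,
which is exactly the window the two cases describe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's own definitions may differ. */
#define NR_CPUS_MODEL   4
#define CT_STATE_MASK   0x3
#define CONTEXT_KERNEL  0
#define CONTEXT_USER    2

/* Stand-in for the per-CPU context tracking state. */
static atomic_int cpu_state[NR_CPUS_MODEL];

/* True if the state read right now says the CPU is in kernel mode. */
static bool cpu_in_kernel_model(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

    int state = atomic_load(&cpu_state[cpu]);

    return (state & CT_STATE_MASK) == CONTEXT_KERNEL;
}

/*
 * Return a bitmask of the CPUs that would receive the IPI: those that
 * are set in mm_mask and were seen in kernel mode when sampled.
 */
static unsigned int select_ipi_targets(unsigned int mm_mask)
{
    unsigned int targets = 0;

    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
        if ((mm_mask & (1u << cpu)) && cpu_in_kernel_model(cpu))
            targets |= 1u << cpu;
    }
    return targets;
}
```

With CPUs 0 and 2 marked CONTEXT_KERNEL, CPUs 1 and 3 marked CONTEXT_USER,
and an mm that ran on CPUs 0 to 2 (mask 0x7), select_ipi_targets() selects
only CPUs 0 and 2. A CPU that changes state after it was sampled is still
handled according to the old snapshot.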
Signed-off-by: Yair Podemsky
---
 mm/mmu_gather.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 5ea9be6fb87c..731d955e152d 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -191,6 +192,20 @@ static void tlb_remove_table_smp_sync(void *arg)
 	/* Simply deliver the interrupt */
 }
 
+
+#ifdef CONFIG_CONTEXT_TRACKING
+static bool cpu_in_kernel(int cpu, void *info)
+{
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+	int state = atomic_read(&ct->state);
+	/* will return true only for cpus in kernel space */
+	return (state & CT_STATE_MASK) == CONTEXT_KERNEL;
+}
+#define CONTEXT_PREDICATE cpu_in_kernel
+#else
+#define CONTEXT_PREDICATE NULL
+#endif /* CONFIG_CONTEXT_TRACKING */
+
 #ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
 #define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
 #else
@@ -206,8 +221,8 @@ void tlb_remove_table_sync_one(struct mm_struct *mm)
 	 * It is however sufficient for software page-table walkers that rely on
 	 * IRQ disabling.
 	 */
-	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
-			NULL, true);
+	on_each_cpu_cond_mask(CONTEXT_PREDICATE, tlb_remove_table_smp_sync,
+			NULL, true, REMOVE_TABLE_IPI_MASK);
 }
 
 static void tlb_remove_table_rcu(struct rcu_head *head)
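
A note on predicates that mask a state word before comparing it: in C,
== binds more tightly than &, so the mask needs its own parentheses. The
following is a small stand-alone sketch of the difference. The numeric
values and the function names are assumptions for illustration, not the
kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's own definitions may differ. */
#define CT_STATE_MASK   0x3
#define CONTEXT_KERNEL  0
#define CONTEXT_USER    2

/*
 * How the compiler reads "state & CT_STATE_MASK == CONTEXT_KERNEL":
 * the comparison is evaluated first, then ANDed with state. With the
 * values above the comparison is 0, so the result is always false.
 */
static bool in_kernel_as_parsed(int state)
{
    return state & (CT_STATE_MASK == CONTEXT_KERNEL);
}

/* Intended meaning: mask first, then compare. */
static bool in_kernel(int state)
{
    return (state & CT_STATE_MASK) == CONTEXT_KERNEL;
}

/* Usage: exercises both forms; aborts if an expectation does not hold. */
static void precedence_demo(void)
{
    assert(in_kernel(CONTEXT_KERNEL));
    assert(!in_kernel(CONTEXT_USER));
    assert(!in_kernel_as_parsed(CONTEXT_KERNEL));
    assert(!in_kernel_as_parsed(CONTEXT_USER));
}
```

Under these assumed values the unparenthesized form never reports a CPU as
being in kernel mode, so a predicate written that way would suppress every
IPI instead of only those aimed at CPUs outside the kernel.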