From patchwork Mon Nov 14 02:31:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703391 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=OhZIEbYW; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YKV735pz23nM for ; Mon, 14 Nov 2022 13:34:06 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YKV4V0fz3f3J for ; Mon, 14 Nov 2022 13:34:06 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=OhZIEbYW; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::102f; helo=mail-pj1-x102f.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=OhZIEbYW; dkim-atps=neutral Received: from mail-pj1-x102f.google.com (mail-pj1-x102f.google.com [IPv6:2607:f8b0:4864:20::102f]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YH46Sjtz2ybK for ; Mon, 14 Nov 2022 13:32:00 +1100 (AEDT) Received: by mail-pj1-x102f.google.com with SMTP id q1-20020a17090a750100b002139ec1e999so9374651pjk.1 for ; Sun, 13 Nov 2022 18:32:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=x/MkpvgiQvZUS48v8RArx2HbNh9Khi3667nn4hCQEgU=; b=OhZIEbYW+V0/w/O2kAJlYDSvrU+rZVMe6mh3M1FdZezJUTChiuWHh9iqDh0b9j0GzC fF2gV5UFi0szLRamMYljJ6icAZZf4qyetFEeFndF393f/kNElJ95C7dhY9NdjMnqIR2Q iHW0mbF1w8RkX8Kc4KJ4X+d9MbglzlSLxde7oCTIhaqwWLyDILOFX1fCvL4zdFFnZES8 YaFEjVewBz3URxPQAlr8WcygX84WNW0WoJ7QxJGfkRzC16dViJ9I3Wkbps5O0fYb835c 20vFUK3kqOpgcmYbU8QwlH+BTFsWBgQHNpRp4WQdJJlfTKmwWSiw6BcvvhSDoD5C3ED/ vd6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=x/MkpvgiQvZUS48v8RArx2HbNh9Khi3667nn4hCQEgU=; b=3EmM8jFStmasgiWF1oWGLA4vINBlXJlUXyQJX8qeZ7dX4kz1tlplkUT4cqk9c/emxm lzzrw/Xy+fC0VTqgjFaJEE18U8IOS7hNIGuXQE4VScfZwi/rNCJEtewfdTgDhiwn3FJq eoRNSQvdIML1kPGNTDyVBrie2dOsCEkrjmTM3HLZjvh44XDtqp3+Di74ENK1d2otQ+u0 7JdMxXdLgqf6dHqt6bZkshdEF37thlv1ronOkquKqi+XzHyDYuQrZlWcayKMWrM6gd5Q t4wibWBOle8EDxWUwxTVH0Jkx/vHt706EWulhPU9C67o4vVZZLSAHj4lQ8X50OmPRtkt ZppA== X-Gm-Message-State: ANoB5pkzVr9dGrd3Y6fUggLTZ4FAaonZW93ouNFj1Ki0HVBHAZZyzST6 5AuAm8QuDjmJbDyXNOzAAPxRnot7pJo= X-Google-Smtp-Source: AA0mqf42OVVUP5tJzt1Au72st2AjHPCfix/D0XqFQrLWJ2dtbnVCcntOk1QDNzY3ITspEDXk86N+mA== X-Received: by 2002:a17:902:f609:b0:186:c958:6cd8 with SMTP id n9-20020a170902f60900b00186c9586cd8mr11669931plg.145.1668393117961; Sun, 13 Nov 2022 18:31:57 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.31.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:31:57 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 01/17] powerpc/qspinlock: powerpc qspinlock implementation Date: Mon, 14 Nov 2022 12:31:21 +1000 Message-Id: <20221114023137.2679627-3-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Add a powerpc specific implementation of queued spinlocks. This is the build framework with a very simple (non-queued) spinlock implementation to begin with. Later changes add queueing, and other features and optimisations one-at-a-time. It is done this way to more easily see how the queued spinlocks are built, and to make performance and correctness bisects more useful. Signed-off-by: Nicholas Piggin --- arch/powerpc/Kconfig | 1 - arch/powerpc/include/asm/qspinlock.h | 76 +++++++++------------- arch/powerpc/include/asm/qspinlock_types.h | 13 ++++ arch/powerpc/include/asm/spinlock_types.h | 2 +- arch/powerpc/lib/Makefile | 4 +- arch/powerpc/lib/qspinlock.c | 17 +++++ 6 files changed, 66 insertions(+), 47 deletions(-) create mode 100644 arch/powerpc/include/asm/qspinlock_types.h create mode 100644 arch/powerpc/lib/qspinlock.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 2ca5418457ed..1d5b4f280feb 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -155,7 +155,6 @@ config PPC select ARCH_USE_CMPXCHG_LOCKREF if PPC64 select ARCH_USE_MEMTEST select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS - select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT select ARCH_WANT_IPC_PARSE_VERSION select ARCH_WANT_IRQS_OFF_ACTIVATE_MM diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 39c1c7f80579..b1443aab2145 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -2,66 +2,54 @@ #ifndef _ASM_POWERPC_QSPINLOCK_H #define _ASM_POWERPC_QSPINLOCK_H -#include -#include +#include +#include +#include -#define _Q_PENDING_LOOPS (1 << 9) /* not tuned */ - -void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); -void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); -void __pv_queued_spin_unlock(struct qspinlock *lock); - -static __always_inline void queued_spin_lock(struct qspinlock *lock) +static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { - u32 val = 0; + return atomic_read(&lock->val); +} - if (likely(arch_atomic_try_cmpxchg_lock(&lock->val, &val, _Q_LOCKED_VAL))) - return; +static __always_inline int queued_spin_value_unlocked(struct qspinlock lock) +{ + return !atomic_read(&lock.val); +} - if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) || !is_shared_processor()) - queued_spin_lock_slowpath(lock, val); - else - __pv_queued_spin_lock_slowpath(lock, val); +static __always_inline int queued_spin_is_contended(struct qspinlock *lock) +{ + return 0; } -#define queued_spin_lock queued_spin_lock -static inline void queued_spin_unlock(struct qspinlock *lock) +static __always_inline int queued_spin_trylock(struct qspinlock *lock) { - if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) || !is_shared_processor()) - smp_store_release(&lock->locked, 0); - else - __pv_queued_spin_unlock(lock); + return atomic_cmpxchg_acquire(&lock->val, 0, 1) == 0; } -#define queued_spin_unlock queued_spin_unlock -#ifdef CONFIG_PARAVIRT_SPINLOCKS -#define SPIN_THRESHOLD (1<<15) /* not tuned */ +void queued_spin_lock_slowpath(struct qspinlock *lock); -static __always_inline void pv_wait(u8 *ptr, u8 val) +static __always_inline void queued_spin_lock(struct qspinlock *lock) { - if (*ptr != val) - return; - yield_to_any(); - /* - * We could pass in a CPU here if waiting in the queue and yield to - * the previous CPU in the queue. - */ + if (!queued_spin_trylock(lock)) + queued_spin_lock_slowpath(lock); } -static __always_inline void pv_kick(int cpu) +static inline void queued_spin_unlock(struct qspinlock *lock) { - prod_cpu(cpu); + atomic_set_release(&lock->val, 0); } -#endif +#define arch_spin_is_locked(l) queued_spin_is_locked(l) +#define arch_spin_is_contended(l) queued_spin_is_contended(l) +#define arch_spin_value_unlocked(l) queued_spin_value_unlocked(l) +#define arch_spin_lock(l) queued_spin_lock(l) +#define arch_spin_trylock(l) queued_spin_trylock(l) +#define arch_spin_unlock(l) queued_spin_unlock(l) -/* - * Queued spinlocks rely heavily on smp_cond_load_relaxed() to busy-wait, - * which was found to have performance problems if implemented with - * the preferred spin_begin()/spin_end() SMT priority pattern. Use the - * generic version instead. - */ - -#include +#ifdef CONFIG_PARAVIRT_SPINLOCKS +void pv_spinlocks_init(void); +#else +static inline void pv_spinlocks_init(void) { } +#endif #endif /* _ASM_POWERPC_QSPINLOCK_H */ diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h new file mode 100644 index 000000000000..59606bc0c774 --- /dev/null +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _ASM_POWERPC_QSPINLOCK_TYPES_H +#define _ASM_POWERPC_QSPINLOCK_TYPES_H + +#include + +typedef struct qspinlock { + atomic_t val; +} arch_spinlock_t; + +#define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) } + +#endif /* _ASM_POWERPC_QSPINLOCK_TYPES_H */ diff --git a/arch/powerpc/include/asm/spinlock_types.h b/arch/powerpc/include/asm/spinlock_types.h index d5f8a74ed2e8..40b01446cf75 100644 --- a/arch/powerpc/include/asm/spinlock_types.h +++ b/arch/powerpc/include/asm/spinlock_types.h @@ -7,7 +7,7 @@ #endif #ifdef CONFIG_PPC_QUEUED_SPINLOCKS -#include +#include #include #else #include diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 8560c912186d..b895cbf6a709 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -52,7 +52,9 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \ obj64-y += copypage_64.o copyuser_64.o mem_64.o hweight_64.o \ memcpy_64.o copy_mc_64.o -ifndef CONFIG_PPC_QUEUED_SPINLOCKS +ifdef CONFIG_PPC_QUEUED_SPINLOCKS +obj64-$(CONFIG_SMP) += qspinlock.o +else obj64-$(CONFIG_SMP) += locks.o endif diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c new file mode 100644 index 000000000000..1c669b5b4607 --- /dev/null +++ b/arch/powerpc/lib/qspinlock.c @@ -0,0 +1,17 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +#include +#include +#include + +void queued_spin_lock_slowpath(struct qspinlock *lock) +{ + while (!queued_spin_trylock(lock)) + cpu_relax(); +} +EXPORT_SYMBOL(queued_spin_lock_slowpath); + +#ifdef CONFIG_PARAVIRT_SPINLOCKS +void pv_spinlocks_init(void) +{ +} +#endif From patchwork Mon Nov 14 02:31:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703392 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=ZObpSHoX; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YLW4hQQz23nM for ; Mon, 14 Nov 2022 13:34:59 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YLW3VHNz3cMw for ; Mon, 14 Nov 2022 13:34:59 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=ZObpSHoX; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::42c; helo=mail-pf1-x42c.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=ZObpSHoX; dkim-atps=neutral Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YH86tc9z2ybK for ; Mon, 14 Nov 2022 13:32:04 +1100 (AEDT) Received: by mail-pf1-x42c.google.com with SMTP id g62so9773018pfb.10 for ; Sun, 13 Nov 2022 18:32:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UIhJaoXuWaovgk4L1YJAy7HFH+S6ZwNNpXO7ndwl8Sg=; b=ZObpSHoXE4TFYfxL80z0Q8FbW05RQLkMVa+c2QJnihVo7iZsBmn4RCpgh+UG4mSV26 90uY9CSbCXg6P/zyWLrVi3e3gV3lbPA1+3ceLMZoUwRZ6FkRBDFeA0+Z4t12NODgfbxk vdwC10MjcPStcHHAJBHlcnq8b1yWQ4Qr8TuO9PjuhPurUiB/sVt4hgjehHARbLUT0kC0 hakpxhRTGYgM2aNAM3PRvgWhUMOrw+Nyj2DlP8yizKWXd4mq9BxNyzLjLLR5YZp7A23x Nr/6IPjQEuK4X01p52FeDzRPXcqoebpOu0CaDbilwI0WBinjXpu5daHlX7CF1nK1IEHV wRlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UIhJaoXuWaovgk4L1YJAy7HFH+S6ZwNNpXO7ndwl8Sg=; b=27eC4HX+vtb+K+izABgcfvyZIoIvcfcJz6apSAehA23KuQktqIfgWDaej8OefwMn86 eB/PY9ZZX0+YaBHHTo5LKYYKWo2IGNkaV4gX1rp5LasbVdb4dQh9XKUnKKpCEsEsUah9 rS7J2+68Z/KsBTajTmM50lx5H5wg5vsVz6+K9HXApv3kj7Gt+qsdFDrOsJJyFxBt0r1t Fj02cnNpcXwzTTp7xpuqfhnusc/EbC2IjTR4O6f9l0zBZfFri5FQrO0C785jo5QVnQ5Z SANy8hileh4MwnItP0gy6EelHLbbR4MAazHhxq7j0SIVrzpmyd4+MMgSIyNekghPFOVv 8KtQ== X-Gm-Message-State: ANoB5pn9VHwJl+cy9TTD/DbFe5+GZmr7v+vi73S2gLeD0ntKaIyIFGDv ag9BI/kDckkfPexsCQDp8GN4fthIXj8= X-Google-Smtp-Source: AA0mqf7zLE7c4x1dHmEgO2eK7eUdL67ttTl+PxXigRIxAdNbYGP8b3b5VM0adSiek75V/ufy3gjaSw== X-Received: by 2002:a05:6a00:1f13:b0:567:546c:718b with SMTP id be19-20020a056a001f1300b00567546c718bmr12115782pfb.17.1668393121995; Sun, 13 Nov 2022 18:32:01 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.31.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:01 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 02/17] powerpc/qspinlock: add mcs queueing for contended waiters Date: Mon, 14 Nov 2022 12:31:22 +1000 Message-Id: <20221114023137.2679627-4-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" This forms the basis of the qspinlock slow path. Like generic qspinlocks and unlike the vanilla MCS algorithm, the lock owner does not participate in the queue, only waiters. The first waiter spins on the lock word, then when the lock is released it takes ownership and unqueues the next waiter. This is how qspinlocks can be implemented with the spinlock API -- lock owners don't need a node, only waiters do. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 10 +- arch/powerpc/include/asm/qspinlock_types.h | 21 +++ arch/powerpc/lib/qspinlock.c | 180 ++++++++++++++++++++- 3 files changed, 205 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index b1443aab2145..300c7d2ebe2e 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -18,12 +18,12 @@ static __always_inline int queued_spin_value_unlocked(struct qspinlock lock) static __always_inline int queued_spin_is_contended(struct qspinlock *lock) { - return 0; + return !!(atomic_read(&lock->val) & _Q_TAIL_CPU_MASK); } static __always_inline int queued_spin_trylock(struct qspinlock *lock) { - return atomic_cmpxchg_acquire(&lock->val, 0, 1) == 0; + return atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL) == 0; } void queued_spin_lock_slowpath(struct qspinlock *lock); @@ -36,7 +36,11 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock) static inline void queued_spin_unlock(struct qspinlock *lock) { - atomic_set_release(&lock->val, 0); + for (;;) { + int val = atomic_read(&lock->val); + if (atomic_cmpxchg_release(&lock->val, val, val & ~_Q_LOCKED_VAL) == val) + return; + } } #define arch_spin_is_locked(l) queued_spin_is_locked(l) diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index 59606bc0c774..9630e714c70d 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -10,4 +10,25 @@ typedef struct qspinlock { #define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) } +/* + * Bitfields in the atomic value: + * + * 0: locked bit + * 16-31: tail cpu (+1) + */ +#define _Q_SET_MASK(type) (((1U << _Q_ ## type ## _BITS) - 1)\ + << _Q_ ## type ## _OFFSET) +#define _Q_LOCKED_OFFSET 0 +#define _Q_LOCKED_BITS 1 +#define _Q_LOCKED_MASK _Q_SET_MASK(LOCKED) +#define _Q_LOCKED_VAL (1U << _Q_LOCKED_OFFSET) + +#define _Q_TAIL_CPU_OFFSET 16 +#define _Q_TAIL_CPU_BITS (32 - _Q_TAIL_CPU_OFFSET) +#define _Q_TAIL_CPU_MASK _Q_SET_MASK(TAIL_CPU) + +#if CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS) +#error "qspinlock does not support such large CONFIG_NR_CPUS" +#endif + #endif /* _ASM_POWERPC_QSPINLOCK_TYPES_H */ diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 1c669b5b4607..f3c3d5128bd5 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -1,12 +1,186 @@ // SPDX-License-Identifier: GPL-2.0-or-later +#include +#include +#include #include -#include +#include +#include #include -void queued_spin_lock_slowpath(struct qspinlock *lock) +#define MAX_NODES 4 + +struct qnode { + struct qnode *next; + struct qspinlock *lock; + u8 locked; /* 1 if lock acquired */ +}; + +struct qnodes { + int count; + struct qnode nodes[MAX_NODES]; +}; + +static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); + +static inline int encode_tail_cpu(int cpu) +{ + return (cpu + 1) << _Q_TAIL_CPU_OFFSET; +} + +static inline int decode_tail_cpu(int val) +{ + return (val >> _Q_TAIL_CPU_OFFSET) - 1; +} + +/* Take the lock by setting the bit, no other CPUs may concurrently lock it. */ +static __always_inline void set_locked(struct qspinlock *lock) +{ + atomic_or(_Q_LOCKED_VAL, &lock->val); + __atomic_acquire_fence(); +} + +/* Take lock, clearing tail, cmpxchg with val (which must not be locked) */ +static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, int val) +{ + int newval = _Q_LOCKED_VAL; + + BUG_ON(val & _Q_LOCKED_VAL); + + return atomic_cmpxchg_acquire(&lock->val, val, newval) == val; +} + +/* + * Publish our tail, replacing previous tail. Return previous value. + * + * This provides a release barrier for publishing node, this pairs with the + * acquire barrier in get_tail_qnode() when the next CPU finds this tail + * value. + */ +static __always_inline int publish_tail_cpu(struct qspinlock *lock, int tail) +{ + for (;;) { + int val = atomic_read(&lock->val); + int newval = (val & ~_Q_TAIL_CPU_MASK) | tail; + int old; + + old = atomic_cmpxchg_release(&lock->val, val, newval); + if (old == val) + return old; + } +} + +static struct qnode *get_tail_qnode(struct qspinlock *lock, int val) +{ + int cpu = decode_tail_cpu(val); + struct qnodes *qnodesp = per_cpu_ptr(&qnodes, cpu); + int idx; + + /* + * After publishing the new tail and finding a previous tail in the + * previous val (which is the control dependency), this barrier + * orders the release barrier in publish_tail_cpu performed by the + * last CPU, with subsequently looking at its qnode structures + * after the barrier. + */ + smp_acquire__after_ctrl_dep(); + + for (idx = 0; idx < MAX_NODES; idx++) { + struct qnode *qnode = &qnodesp->nodes[idx]; + if (qnode->lock == lock) + return qnode; + } + + BUG(); +} + +static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) { - while (!queued_spin_trylock(lock)) + struct qnodes *qnodesp; + struct qnode *next, *node; + int val, old, tail; + int idx; + + BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)); + + qnodesp = this_cpu_ptr(&qnodes); + if (unlikely(qnodesp->count >= MAX_NODES)) { + while (!queued_spin_trylock(lock)) + cpu_relax(); + return; + } + + idx = qnodesp->count++; + /* + * Ensure that we increment the head node->count before initialising + * the actual node. If the compiler is kind enough to reorder these + * stores, then an IRQ could overwrite our assignments. + */ + barrier(); + node = &qnodesp->nodes[idx]; + node->next = NULL; + node->lock = lock; + node->locked = 0; + + tail = encode_tail_cpu(smp_processor_id()); + + old = publish_tail_cpu(lock, tail); + + /* + * If there was a previous node; link it and wait until reaching the + * head of the waitqueue. + */ + if (old & _Q_TAIL_CPU_MASK) { + struct qnode *prev = get_tail_qnode(lock, old); + + /* Link @node into the waitqueue. */ + WRITE_ONCE(prev->next, node); + + /* Wait for mcs node lock to be released */ + while (!node->locked) + cpu_relax(); + + smp_rmb(); /* acquire barrier for the mcs lock */ + } + + /* We're at the head of the waitqueue, wait for the lock. */ + for (;;) { + val = atomic_read(&lock->val); + if (!(val & _Q_LOCKED_VAL)) + break; + + cpu_relax(); + } + + /* If we're the last queued, must clean up the tail. */ + if ((val & _Q_TAIL_CPU_MASK) == tail) { + if (trylock_clear_tail_cpu(lock, val)) + goto release; + /* Another waiter must have enqueued */ + } + + /* We must be the owner, just set the lock bit and acquire */ + set_locked(lock); + + /* contended path; must wait for next != NULL (MCS protocol) */ + while (!(next = READ_ONCE(node->next))) cpu_relax(); + + /* + * Unlock the next mcs waiter node. Release barrier is not required + * here because the acquirer is only accessing the lock word, and + * the acquire barrier we took the lock with orders that update vs + * this store to locked. The corresponding barrier is the smp_rmb() + * acquire barrier for mcs lock, above. + */ + WRITE_ONCE(next->locked, 1); + +release: + qnodesp->count--; /* release the node */ +} + +void queued_spin_lock_slowpath(struct qspinlock *lock) +{ + queued_spin_lock_mcs_queue(lock); } EXPORT_SYMBOL(queued_spin_lock_slowpath); From patchwork Mon Nov 14 02:31:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703393 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=fv34fExh; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YMX2nkKz23nK for ; Mon, 14 Nov 2022 13:35:52 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YMX0Xhcz3f3N for ; Mon, 14 Nov 2022 13:35:52 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=fv34fExh; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::532; helo=mail-pg1-x532.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=fv34fExh; dkim-atps=neutral Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHD20WSz3cHF for ; Mon, 14 Nov 2022 13:32:07 +1100 (AEDT) Received: by mail-pg1-x532.google.com with SMTP id h193so9102109pgc.10 for ; Sun, 13 Nov 2022 18:32:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0+uMYmD6JNF5IlbnvrwUfhRxoK/tQDZvS1D6p8gYOrI=; b=fv34fExhONbhqZjINv2kWqH1e1Xx7LRBv+lVfi3eLQ+RLDMWy4+vtcXG2Q70MfYfmi JsnGaM64QiW/T2VzB1ndgWpakyjKLzHsYIfUl8ncow973aj/QT/O7WaNARapp/39/6Qd nYb5k5T/JG+hpgkipgI6v5xq1Jlh+RBwib7xVzCgUpw4hUQnwhW2AzXp9U7KHrQgx1g1 B0ZEJZGcxR/qnCXUYjNx76sCAn5KCZBI7E1JxpgCu1lN+maJZOwoc2fix7lsuiHWtAjA mTpotk9tm1wCWiBXYOqUwObTlP7jg4di4WflZF/jYE6rFmT6s2pLf2fTPO4hnl6jKKuk Tdpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0+uMYmD6JNF5IlbnvrwUfhRxoK/tQDZvS1D6p8gYOrI=; b=fP4ZpdkbQSBd23wT9EQgiZ+mB8iSzYloMEBhAGoL2YMdFgqY3KH8dIkkQ01BiBmrEA tDbDJotrO148/snKXf6qnPRb2yTjwRbQ71NsGAnv8Xbm5KtHYtayh1z8g1WR8lrnF7fJ PeHFecTAIBE8gn/B9/CR4ljfGuoCMU0yP1O6ujGT+06qarARwwX4sKHd6ct7cdRcREL6 4Oc0GWcJVRj/KwSLGl2OdQbUdMRg/yG49eYv0hJHalkkjtul0CG1NbF3/C6WYHNHVfDM VSLYldjo/Tksfq5fBaUMDge5IeZOv09GrQuUJ2sWeDD6pFKC4CZfGAdN2Y7OejerSToA Y/Hg== X-Gm-Message-State: ANoB5pnCu3Fd6LlRnpjIrQBqrK/v7QwjiE5MwHZyVD3n+mkGKHKn6Nvs +Pkom9k1SpKMXkj71cjVHGzhBQJyLOw= X-Google-Smtp-Source: AA0mqf42CBsxzsI03lvPUUawAnBCyHIDJy3YprkmgNlN9XEarARMo7QxGrTDFscEgJ/iRzSerRrigQ== X-Received: by 2002:a65:6b8a:0:b0:470:1187:4dae with SMTP id d10-20020a656b8a000000b0047011874daemr10247481pgw.239.1668393125483; Sun, 13 Nov 2022 18:32:05 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:04 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 03/17] powerpc/qspinlock: use a half-word store to unlock to avoid larx/stcx. Date: Mon, 14 Nov 2022 12:31:23 +1000 Message-Id: <20221114023137.2679627-5-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" The first 16 bits of the lock are only modified by the owner, and other modifications always use atomic operations on the entire 32 bits, so unlocks can use plain stores on the 16 bits. This is the same kind of optimisation done by core qspinlock code. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 6 +----- arch/powerpc/include/asm/qspinlock_types.h | 19 +++++++++++++++++-- 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 300c7d2ebe2e..7bc254c55705 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -36,11 +36,7 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock) static inline void queued_spin_unlock(struct qspinlock *lock) { - for (;;) { - int val = atomic_read(&lock->val); - if (atomic_cmpxchg_release(&lock->val, val, val & ~_Q_LOCKED_VAL) == val) - return; - } + smp_store_release(&lock->locked, 0); } #define arch_spin_is_locked(l) queued_spin_is_locked(l) diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index 9630e714c70d..3425dab42576 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -3,12 +3,27 @@ #define _ASM_POWERPC_QSPINLOCK_TYPES_H #include +#include typedef struct qspinlock { - atomic_t val; + union { + atomic_t val; + +#ifdef __LITTLE_ENDIAN + struct { + u16 locked; + u8 reserved[2]; + }; +#else + struct { + u8 reserved[2]; + u16 locked; + }; +#endif + }; } arch_spinlock_t; -#define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) } +#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = ATOMIC_INIT(0) } } /* * Bitfields in the atomic value: From patchwork Mon Nov 14 02:31:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703394 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=bEDxlOkp; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YNX2T5Xz23nK for ; Mon, 14 Nov 2022 13:36:44 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YNX1FZ4z3f4K for ; Mon, 14 Nov 2022 13:36:44 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=bEDxlOkp; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::52a; helo=mail-pg1-x52a.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=bEDxlOkp; dkim-atps=neutral Received: from mail-pg1-x52a.google.com (mail-pg1-x52a.google.com [IPv6:2607:f8b0:4864:20::52a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHH6d7Zz3cLW for ; Mon, 14 Nov 2022 13:32:11 +1100 (AEDT) Received: by mail-pg1-x52a.google.com with SMTP id s196so9135476pgs.3 for ; Sun, 13 Nov 2022 18:32:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bnHfbjIsxhlF7AVP9pM40xHgiio3lOPgjwAjHTj0Fzs=; b=bEDxlOkpW765o+wt0skcWG4MNVl+ajTSiRCzs8maJwvZpgNhZCVasVj8OXvAEbMPSt /bol0n8/P8ev5HJIKvxAnUs377r2CM4ZLGzvcjdGHj5B6jLoFut1AYw7t8Puv2Vvt2v0 waE6NqdRpXyIozKMMn2n8G418ceYpqc8+XCB6QEQIE0Wh+s7kISFu3IFZP7p8DNsI9zZ oq9xa3cf7tXWQITZYkH3EAdH/8fVHht37IjPLzt0w/a9RdRAC9cnG5qcT1MnRRUufGvC 0NVq/xH9eBiZtMCOaft6t5VB7oWxLbQf3k2choAdT+XPXkVr9qAwLaq2t3Zw2vzFy7yF v9yg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bnHfbjIsxhlF7AVP9pM40xHgiio3lOPgjwAjHTj0Fzs=; b=EVqQN/B2mjzQ+vvZ/bjQoYfvCSBJ8eAXSFyBg4tL/563MSHmRjRi3D9m5L9ob7A25J wJlkKt5y91VeO6P3nRZkGYFecZY54dXPjAUB1WVnXvSBbinHOmvywhJ2KCqL+70fglR9 sdaFt779nPihwvXDu2iswDr4la8CxTd9Wa84K9WSHZcq6RwlRGBRv9OYdS6v6P84p5Qn Yq3qfron0Q3S7FM92Rm3fB9HXjmM7xI+I9xdrAvUF/gPMMP/Ho00fCjv+faEuQXtojjQ cb3jwk/MgO7AFsk3UfxjTOqu77/Jd26GeMyhlc17jooRtbJqZ4PgfW72bPhzx+/OZnmg loFw== X-Gm-Message-State: ANoB5pnag0TatlJOzX5edDeCJJtVqK6dRgellG4HdmKIORqBTUb0Sjii Yk71pV34oguY1sQGPp2ndPwam/2nas0= X-Google-Smtp-Source: AA0mqf5dxNZyG83brUKvgv0vkxEZkC3tPVRWBEBmJ4k3Sl1nWA4RMHVpjeXc8mB85SqeNzLNIFNPjg== X-Received: by 2002:a63:4402:0:b0:46a:e819:216c with SMTP id r2-20020a634402000000b0046ae819216cmr10278996pga.155.1668393129073; Sun, 13 Nov 2022 18:32:09 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:08 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 04/17] powerpc/qspinlock: convert atomic operations to assembly Date: Mon, 14 Nov 2022 12:31:24 +1000 Message-Id: <20221114023137.2679627-6-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" This uses more optimal ll/sc style access patterns (rather than cmpxchg), and also sets the EH=1 lock hint on those operations which acquire ownership of the lock. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 24 +++++-- arch/powerpc/include/asm/qspinlock_types.h | 6 +- arch/powerpc/lib/qspinlock.c | 81 +++++++++++++++------- 3 files changed, 77 insertions(+), 34 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 7bc254c55705..7d300e6883a8 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -2,28 +2,42 @@ #ifndef _ASM_POWERPC_QSPINLOCK_H #define _ASM_POWERPC_QSPINLOCK_H -#include #include #include static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { - return atomic_read(&lock->val); + return READ_ONCE(lock->val); } static __always_inline int queued_spin_value_unlocked(struct qspinlock lock) { - return !atomic_read(&lock.val); + return !lock.val; } static __always_inline int queued_spin_is_contended(struct qspinlock *lock) { - return !!(atomic_read(&lock->val) & _Q_TAIL_CPU_MASK); + return !!(READ_ONCE(lock->val) & _Q_TAIL_CPU_MASK); } static __always_inline int queued_spin_trylock(struct qspinlock *lock) { - return atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL) == 0; + u32 prev; + + asm volatile( +"1: lwarx %0,0,%1,%3 # queued_spin_trylock \n" +" cmpwi 0,%0,0 \n" +" bne- 2f \n" +" stwcx. %2,0,%1 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" +"2: \n" + : "=&r" (prev) + : "r" (&lock->val), "r" (_Q_LOCKED_VAL), + "i" (IS_ENABLED(CONFIG_PPC64)) + : "cr0", "memory"); + + return likely(prev == 0); } void queued_spin_lock_slowpath(struct qspinlock *lock); diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index 3425dab42576..210adf05b235 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -7,7 +7,7 @@ typedef struct qspinlock { union { - atomic_t val; + u32 val; #ifdef __LITTLE_ENDIAN struct { @@ -23,10 +23,10 @@ typedef struct qspinlock { }; } arch_spinlock_t; -#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = ATOMIC_INIT(0) } } +#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = 0 } } /* - * Bitfields in the atomic value: + * Bitfields in the lock word: * * 0: locked bit * 16-31: tail cpu (+1) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index f3c3d5128bd5..6c58c24af5a0 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -1,5 +1,4 @@ // SPDX-License-Identifier: GPL-2.0-or-later -#include #include #include #include @@ -22,31 +21,56 @@ struct qnodes { static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); -static inline int encode_tail_cpu(int cpu) +static inline u32 encode_tail_cpu(int cpu) { return (cpu + 1) << _Q_TAIL_CPU_OFFSET; } -static inline int decode_tail_cpu(int val) +static inline int decode_tail_cpu(u32 val) { return (val >> _Q_TAIL_CPU_OFFSET) - 1; } -/* Take the lock by setting the bit, no other CPUs may concurrently lock it. */ +/* Take the lock by setting the lock bit, no other CPUs will touch it. */ static __always_inline void set_locked(struct qspinlock *lock) { - atomic_or(_Q_LOCKED_VAL, &lock->val); - __atomic_acquire_fence(); + u32 prev, tmp; + + asm volatile( +"1: lwarx %0,0,%2,%4 # set_locked \n" +" or %1,%0,%3 \n" +" stwcx. %1,0,%2 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" + : "=&r" (prev), "=&r" (tmp) + : "r" (&lock->val), "i" (_Q_LOCKED_VAL), + "i" (IS_ENABLED(CONFIG_PPC64)) + : "cr0", "memory"); + + BUG_ON(prev & _Q_LOCKED_VAL); } -/* Take lock, clearing tail, cmpxchg with val (which must not be locked) */ -static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, int val) +/* Take lock, clearing tail, cmpxchg with old (which must not be locked) */ +static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, u32 old) { - int newval = _Q_LOCKED_VAL; - - BUG_ON(val & _Q_LOCKED_VAL); - - return atomic_cmpxchg_acquire(&lock->val, val, newval) == val; + u32 prev; + + BUG_ON(old & _Q_LOCKED_VAL); + + asm volatile( +"1: lwarx %0,0,%1,%4 # trylock_clear_tail_cpu \n" +" cmpw 0,%0,%2 \n" +" bne- 2f \n" +" stwcx. %3,0,%1 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" +"2: \n" + : "=&r" (prev) + : "r" (&lock->val), "r"(old), "r" (_Q_LOCKED_VAL), + "i" (IS_ENABLED(CONFIG_PPC64)) + : "cr0", "memory"); + + return likely(prev == old); } /* @@ -56,20 +80,25 @@ static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, int va * acquire barrier in get_tail_qnode() when the next CPU finds this tail * value. */ -static __always_inline int publish_tail_cpu(struct qspinlock *lock, int tail) +static __always_inline u32 publish_tail_cpu(struct qspinlock *lock, u32 tail) { - for (;;) { - int val = atomic_read(&lock->val); - int newval = (val & ~_Q_TAIL_CPU_MASK) | tail; - int old; - - old = atomic_cmpxchg_release(&lock->val, val, newval); - if (old == val) - return old; - } + u32 prev, tmp; + + asm volatile( +"\t" PPC_RELEASE_BARRIER " \n" +"1: lwarx %0,0,%2 # publish_tail_cpu \n" +" andc %1,%0,%4 \n" +" or %1,%1,%3 \n" +" stwcx. %1,0,%2 \n" +" bne- 1b \n" + : "=&r" (prev), "=&r"(tmp) + : "r" (&lock->val), "r" (tail), "r"(_Q_TAIL_CPU_MASK) + : "cr0", "memory"); + + return prev; } -static struct qnode *get_tail_qnode(struct qspinlock *lock, int val) +static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); struct qnodes *qnodesp = per_cpu_ptr(&qnodes, cpu); @@ -97,7 +126,7 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) { struct qnodes *qnodesp; struct qnode *next, *node; - int val, old, tail; + u32 val, old, tail; int idx; BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)); @@ -144,7 +173,7 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) /* We're at the head of the waitqueue, wait for the lock. */ for (;;) { - val = atomic_read(&lock->val); + val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) break; From patchwork Mon Nov 14 02:31:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703395 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=qmQ9y1B1; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YPj4XMjz23nK for ; Mon, 14 Nov 2022 13:37:45 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YPj2FnGz3dts for ; Mon, 14 Nov 2022 13:37:45 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=qmQ9y1B1; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::62a; helo=mail-pl1-x62a.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=qmQ9y1B1; dkim-atps=neutral Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHM58Bxz3cH9 for ; Mon, 14 Nov 2022 13:32:15 +1100 (AEDT) Received: by mail-pl1-x62a.google.com with SMTP id p21so8852321plr.7 for ; Sun, 13 Nov 2022 18:32:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=oiRMGs/hF879eTYlkCveIAkgMIO1cM/VpPsdSSJjIbY=; b=qmQ9y1B1poAn1t1H2NnZV8rpF70d51gSLPCmwvVh5m4cRdcVfRezWKmoNyRQskqeyl apewaco5SXUIYhGZtpcn//JCuTlfTU4ap34EBkgcv633rCW73itN1fDGSQX6ghw4K4L3 kWcCVWIie4Dx6RoJ4IIT2fz01Lqae1jBAk+9Jz9h1rHl4btLXFcF6TGrwGRINgqz38lb 74RFdFBRDDoRxtO/nIdta1Okl3YrRgw+niyTb+X/+QjvRG6Vt4VJ6ZYMV5D6ql9mvdsP 7HlZ7l5qq3bnTxmYBrn09qjjLSvsr71/87xVmtQw2yVV5jeVNgi7zn0QrE9Uw+Bl5jln 4grg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oiRMGs/hF879eTYlkCveIAkgMIO1cM/VpPsdSSJjIbY=; b=o96qwaxN3+1Yobh+tBHyUaydZFcZvWHn7msd23yMfmK05P49mNZm82PyZYdsHuy9Oc mi0YPzlA1t1Gndj9iCHJBPfx+1W4hBS6ectR2yB0jNIpyOstrEH2wQnELqHidhW2u8A6 ggtpXb5gfdbYPMVartAXn4URcF6wapMBbN7Cv58xpsT9mWfFRaXrXauDYPAQuiw6IU6g 5eLOj48O6J9xTaLseM9/ujxqbW5OyinGXglpYGXplqljCHcESDtW4An6fc/PPHptm4Vv QYOGj6aJiiXt4AXLr8aadbRAN9/yGf6knwSpzkvowNEx6LbZ6AVuXtyYAXnOMTzqgdf2 abvw== X-Gm-Message-State: ANoB5pn5VqhH9RlBtx2t3LNo7p+epPHDi0Z7d/GQ1OagQrziGhWcvQ76 GartfNr56xlvZIzZgUp0JdLAXeYSolk= X-Google-Smtp-Source: AA0mqf42pq7g1QwQviT4LOc2Mw1wAYoV+52j0IteCToMSUdXQrJKk5g7lvcZ4CAnaKciIRKHUv7mNg== X-Received: by 2002:a17:90b:896:b0:213:df24:ed80 with SMTP id bj22-20020a17090b089600b00213df24ed80mr11673935pjb.186.1668393132758; Sun, 13 Nov 2022 18:32:12 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:12 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 05/17] powerpc/qspinlock: allow new waiters to steal the lock before queueing Date: Mon, 14 Nov 2022 12:31:25 +1000 Message-Id: <20221114023137.2679627-7-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Allow new waiters a number of spins on the lock word before queueing, which particularly helps paravirt performance when physical CPUs are oversubscribed. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 159 ++++++++++++++++++++++++++++++----- 1 file changed, 140 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 6c58c24af5a0..872d4628a44d 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -19,8 +19,17 @@ struct qnodes { struct qnode nodes[MAX_NODES]; }; +/* Tuning parameters */ +static int steal_spins __read_mostly = (1<<5); +static bool maybe_stealers __read_mostly = true; + static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); +static __always_inline int get_steal_spins(void) +{ + return steal_spins; +} + static inline u32 encode_tail_cpu(int cpu) { return (cpu + 1) << _Q_TAIL_CPU_OFFSET; @@ -50,15 +59,14 @@ static __always_inline void set_locked(struct qspinlock *lock) BUG_ON(prev & _Q_LOCKED_VAL); } -/* Take lock, clearing tail, cmpxchg with old (which must not be locked) */ -static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, u32 old) +static __always_inline u32 __trylock_cmpxchg(struct qspinlock *lock, u32 old, u32 new) { u32 prev; BUG_ON(old & _Q_LOCKED_VAL); asm volatile( -"1: lwarx %0,0,%1,%4 # trylock_clear_tail_cpu \n" +"1: lwarx %0,0,%1,%4 # __trylock_cmpxchg \n" " cmpw 0,%0,%2 \n" " bne- 2f \n" " stwcx. %3,0,%1 \n" @@ -66,13 +74,27 @@ static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, u32 ol "\t" PPC_ACQUIRE_BARRIER " \n" "2: \n" : "=&r" (prev) - : "r" (&lock->val), "r"(old), "r" (_Q_LOCKED_VAL), + : "r" (&lock->val), "r"(old), "r" (new), "i" (IS_ENABLED(CONFIG_PPC64)) : "cr0", "memory"); return likely(prev == old); } +/* Take lock, clearing tail, cmpxchg with old (which must not be locked) */ +static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, u32 val) +{ + return __trylock_cmpxchg(lock, val, _Q_LOCKED_VAL); +} + +/* Take lock, preserving tail, cmpxchg with val (which must not be locked) */ +static __always_inline int trylock_with_tail_cpu(struct qspinlock *lock, u32 val) +{ + u32 newval = _Q_LOCKED_VAL | (val & _Q_TAIL_CPU_MASK); + + return __trylock_cmpxchg(lock, val, newval); +} + /* * Publish our tail, replacing previous tail. Return previous value. * @@ -122,6 +144,30 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } +static inline bool try_to_steal_lock(struct qspinlock *lock) +{ + int iters = 0; + + if (!maybe_stealers) + return false; + + /* Attempt to steal the lock */ + do { + u32 val = READ_ONCE(lock->val); + + if (unlikely(!(val & _Q_LOCKED_VAL))) { + if (trylock_with_tail_cpu(lock, val)) + return true; + } else { + cpu_relax(); + } + + iters++; + } while (iters < get_steal_spins()); + + return false; +} + static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) { struct qnodes *qnodesp; @@ -171,25 +217,49 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) smp_rmb(); /* acquire barrier for the mcs lock */ } - /* We're at the head of the waitqueue, wait for the lock. */ - for (;;) { - val = READ_ONCE(lock->val); - if (!(val & _Q_LOCKED_VAL)) - break; + if (!maybe_stealers) { + /* We're at the head of the waitqueue, wait for the lock. */ + for (;;) { + val = READ_ONCE(lock->val); + if (!(val & _Q_LOCKED_VAL)) + break; - cpu_relax(); - } + cpu_relax(); + } + + /* If we're the last queued, must clean up the tail. */ + if ((val & _Q_TAIL_CPU_MASK) == tail) { + if (trylock_clear_tail_cpu(lock, val)) + goto release; + /* Another waiter must have enqueued. */ + } + + /* We must be the owner, just set the lock bit and acquire */ + set_locked(lock); + } else { +again: + /* We're at the head of the waitqueue, wait for the lock. */ + for (;;) { + val = READ_ONCE(lock->val); + if (!(val & _Q_LOCKED_VAL)) + break; - /* If we're the last queued, must clean up the tail. */ - if ((val & _Q_TAIL_CPU_MASK) == tail) { - if (trylock_clear_tail_cpu(lock, val)) - goto release; - /* Another waiter must have enqueued */ + cpu_relax(); + } + + /* If we're the last queued, must clean up the tail. */ + if ((val & _Q_TAIL_CPU_MASK) == tail) { + if (trylock_clear_tail_cpu(lock, val)) + goto release; + /* Another waiter must have enqueued, or lock stolen. */ + } else { + if (trylock_with_tail_cpu(lock, val)) + goto unlock_next; + } + goto again; } - /* We must be the owner, just set the lock bit and acquire */ - set_locked(lock); - +unlock_next: /* contended path; must wait for next != NULL (MCS protocol) */ while (!(next = READ_ONCE(node->next))) cpu_relax(); @@ -209,6 +279,9 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) void queued_spin_lock_slowpath(struct qspinlock *lock) { + if (try_to_steal_lock(lock)) + return; + queued_spin_lock_mcs_queue(lock); } EXPORT_SYMBOL(queued_spin_lock_slowpath); @@ -218,3 +291,51 @@ void pv_spinlocks_init(void) { } #endif + +#include +static int steal_spins_set(void *data, u64 val) +{ + static DEFINE_MUTEX(lock); + + /* + * The lock slow path has a !maybe_stealers case that can assume + * the head of queue will not see concurrent waiters. That waiter + * is unsafe in the presence of stealers, so must keep them away + * from one another. + */ + + mutex_lock(&lock); + if (val && !steal_spins) { + maybe_stealers = true; + /* wait for queue head waiter to go away */ + synchronize_rcu(); + steal_spins = val; + } else if (!val && steal_spins) { + steal_spins = val; + /* wait for all possible stealers to go away */ + synchronize_rcu(); + maybe_stealers = false; + } else { + steal_spins = val; + } + mutex_unlock(&lock); + + return 0; +} + +static int steal_spins_get(void *data, u64 *val) +{ + *val = steal_spins; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_steal_spins, steal_spins_get, steal_spins_set, "%llu\n"); + +static __init int spinlock_debugfs_init(void) +{ + debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); + + return 0; +} +device_initcall(spinlock_debugfs_init); From patchwork Mon Nov 14 02:31:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703396 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=a3Ij+q0/; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YQk21XHz23nK for ; Mon, 14 Nov 2022 13:38:38 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YQk199cz3f67 for ; Mon, 14 Nov 2022 13:38:38 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=a3Ij+q0/; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::532; helo=mail-pg1-x532.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=a3Ij+q0/; dkim-atps=neutral Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHQ1sdPz3cKq for ; Mon, 14 Nov 2022 13:32:18 +1100 (AEDT) Received: by mail-pg1-x532.google.com with SMTP id o13so9113659pgu.7 for ; Sun, 13 Nov 2022 18:32:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cbFEo/vIKJd/N5UVT8EuUjOzcfNkXiu0/v34zJsyHtA=; b=a3Ij+q0/p4tkRHn2a5DRqDpIX7r9OXQ9wTklpo0UPn8R5XYaBmGN9AZ3mlqVGWV/V6 BUpoo8OZAu0ymo+vfs4Rvvfe+MpEkxHKIzuq3+v3cn8ArbjjJB4N0EU3H0lBbN/wFq7u J4Tx65/tcebHmkeM3VcokCG3F+qvm+CNkmGFYgw5w96dnCfejJ8RLkNEPyREdawhdB+i QQkyCVboK+EyS50KhntdGJkqbnnzZOzRwc05BmB5pKIE9lZsE7aTe7knvUbemYCUqjGp ZLfpS0Sh6qe2z4ugkG4TKRm5IB4ZcNUFwKZHu7yJaH1P9CJ3UMYuhW0Ba73HSj6+sgke g2jQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cbFEo/vIKJd/N5UVT8EuUjOzcfNkXiu0/v34zJsyHtA=; b=H/avTLAWaMQDuRlo10cs7/bIm4Grm5l4s7sdDDR89maWeOFArHRtUkrWuXYhskSZrx /lMpjADWY/oOKRf8CL6EgOGEfrfAi/qT2ykpqZdI/GMx4qcugoySnYYDIxPcX1z8ILYu b1ghQwm/2ECcisQEgWBvvlvpIbfrPdP4dLE/SuyQ6SJONp9eMjU7FoEurfcA/6Nmotnq 9OxR64QdH+5Vi6pm6a36WVMyFfvPFNaU5IMm68OdgyfyTY94g47gl5r2UuC3p7qpvHqa uFmt1lrBP/3OtQb/MT/jy2p/wr7YHVuI28wDuvBU6YDcmU6+MRmr32MtvVmcTvn3vQzr aisw== X-Gm-Message-State: ANoB5pl3+xJBAv9TGo9Xidy6zXeWiputLNz0kHiptmtoVQk7SEWZXijX T5L1J5ojBD5IKqhETACQg1weMHvNx2U= X-Google-Smtp-Source: AA0mqf5ScUcUMURaSBD3XABQ9gjy26EP0kZmsvzyu07exu0pPoK5kad9TB5QL0bE0Qwede8PimCv5Q== X-Received: by 2002:a62:19cf:0:b0:565:af12:c329 with SMTP id 198-20020a6219cf000000b00565af12c329mr12020242pfz.48.1668393136095; Sun, 13 Nov 2022 18:32:16 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:15 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 06/17] powerpc/qspinlock: theft prevention to control latency Date: Mon, 14 Nov 2022 12:31:26 +1000 Message-Id: <20221114023137.2679627-8-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Give the queue head the ability to stop stealers. After a number of spins without sucessfully acquiring the lock, the queue head employs this, which will assure it is the next owner. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock_types.h | 10 ++++- arch/powerpc/lib/qspinlock.c | 52 ++++++++++++++++++++++ 2 files changed, 60 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index 210adf05b235..8b20f5e22bba 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -29,7 +29,8 @@ typedef struct qspinlock { * Bitfields in the lock word: * * 0: locked bit - * 16-31: tail cpu (+1) + * 16: must queue bit + * 17-31: tail cpu (+1) */ #define _Q_SET_MASK(type) (((1U << _Q_ ## type ## _BITS) - 1)\ << _Q_ ## type ## _OFFSET) @@ -38,7 +39,12 @@ typedef struct qspinlock { #define _Q_LOCKED_MASK _Q_SET_MASK(LOCKED) #define _Q_LOCKED_VAL (1U << _Q_LOCKED_OFFSET) -#define _Q_TAIL_CPU_OFFSET 16 +#define _Q_MUST_Q_OFFSET 16 +#define _Q_MUST_Q_BITS 1 +#define _Q_MUST_Q_MASK _Q_SET_MASK(MUST_Q) +#define _Q_MUST_Q_VAL (1U << _Q_MUST_Q_OFFSET) + +#define _Q_TAIL_CPU_OFFSET 17 #define _Q_TAIL_CPU_BITS (32 - _Q_TAIL_CPU_OFFSET) #define _Q_TAIL_CPU_MASK _Q_SET_MASK(TAIL_CPU) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 872d4628a44d..8f437b0768a5 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -22,6 +22,7 @@ struct qnodes { /* Tuning parameters */ static int steal_spins __read_mostly = (1<<5); static bool maybe_stealers __read_mostly = true; +static int head_spins __read_mostly = (1<<8); static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -30,6 +31,11 @@ static __always_inline int get_steal_spins(void) return steal_spins; } +static __always_inline int get_head_spins(void) +{ + return head_spins; +} + static inline u32 encode_tail_cpu(int cpu) { return (cpu + 1) << _Q_TAIL_CPU_OFFSET; @@ -120,6 +126,22 @@ static __always_inline u32 publish_tail_cpu(struct qspinlock *lock, u32 tail) return prev; } +static __always_inline u32 set_mustq(struct qspinlock *lock) +{ + u32 prev; + + asm volatile( +"1: lwarx %0,0,%1 # set_mustq \n" +" or %0,%0,%2 \n" +" stwcx. %0,0,%1 \n" +" bne- 1b \n" + : "=&r" (prev) + : "r" (&lock->val), "r" (_Q_MUST_Q_VAL) + : "cr0", "memory"); + + return prev; +} + static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); @@ -155,6 +177,9 @@ static inline bool try_to_steal_lock(struct qspinlock *lock) do { u32 val = READ_ONCE(lock->val); + if (val & _Q_MUST_Q_VAL) + break; + if (unlikely(!(val & _Q_LOCKED_VAL))) { if (trylock_with_tail_cpu(lock, val)) return true; @@ -237,6 +262,9 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) /* We must be the owner, just set the lock bit and acquire */ set_locked(lock); } else { + int iters = 0; + bool mustq = false; + again: /* We're at the head of the waitqueue, wait for the lock. */ for (;;) { @@ -245,6 +273,13 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) break; cpu_relax(); + + iters++; + if (!mustq && iters >= get_head_spins()) { + mustq = true; + set_mustq(lock); + val |= _Q_MUST_Q_VAL; + } } /* If we're the last queued, must clean up the tail. */ @@ -332,9 +367,26 @@ static int steal_spins_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_steal_spins, steal_spins_get, steal_spins_set, "%llu\n"); +static int head_spins_set(void *data, u64 val) +{ + head_spins = val; + + return 0; +} + +static int head_spins_get(void *data, u64 *val) +{ + *val = head_spins; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_head_spins, head_spins_get, head_spins_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); + debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); return 0; } From patchwork Mon Nov 14 02:31:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703397 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=bk4LnQgy; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YRt4Y2Qz23nK for ; Mon, 14 Nov 2022 13:39:38 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YRt3Qwjz3cMc for ; Mon, 14 Nov 2022 13:39:38 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=bk4LnQgy; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::52c; helo=mail-pg1-x52c.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=bk4LnQgy; dkim-atps=neutral Received: from mail-pg1-x52c.google.com (mail-pg1-x52c.google.com [IPv6:2607:f8b0:4864:20::52c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHV4N8tz3cG9 for ; Mon, 14 Nov 2022 13:32:22 +1100 (AEDT) Received: by mail-pg1-x52c.google.com with SMTP id 62so777808pgb.13 for ; Sun, 13 Nov 2022 18:32:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iEryIjBYLek/I0PLOcZSk5+tDg99RI4q8ip4G/1s8F0=; b=bk4LnQgympgfjzNiIk2rzGZuUOw4oaYbPI0OBjfN8rppwpfqz2j6ZOkUDiV2zQ2HF4 NTLFKJNrQSCTmU0n+g/uyWV3igMdWh+31A86re8ZqCxMVet0kFeIrY7R3HkPMEj+JsFX PWqtrJULV53wp40znMgkzofkwD6QLIkz5f0cAXhOypbs74iujh2hvFm3zO1vj+oECVt4 H23T+uTx/Od1dQ9S88mZ6OM3uzbtfvHPTvv8ncYEmI/f/ej5x6QoyprpeCeOnhZG1Pyu YtCpOfrPv/3qKEiBmDvoH5doV9goWH84xRQnh700Qv5GhEthM8CAF7C5R8QM3/IUne4L RWBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iEryIjBYLek/I0PLOcZSk5+tDg99RI4q8ip4G/1s8F0=; b=w+Q324oakZNIFkUtGbfHhcnOuZ+GoplJ6r2AJK62Mmw6I2xVbG0rZN84HThrs7it+q 7akweuo7zEmNfIth1ZaCe4/TmGfXGjOXDLCgvMPjjmFq2OPlOfVdqgDuzA0C50zU35km RGi1b8miQYtChTZzrsNEHoFkWAW3jwl0Gqe+nNpj2v22XiQYKPWhOjxh99iTBFsI7oLj AEQ3bCq0zYEmQbJuHwRtb4ed2CUf9KBLv5sD/kIesq0X+YxwaZlJagB4J7/XMCeXgwzc ZBIplfIP4CsYzYW7d1GY1ogjaXfbLlJPqIKAwCSDWbYYOGaUOEP67lenhi62W/f80oiE yTIQ== X-Gm-Message-State: ANoB5pmvK/gDDRCTr3VAU3AXutpOyKxOsaV2vQsNAI6VOd8/0HzOIdAM Q8MA+qGhFMQoR/5BaonFvZ3Jb+DkFNU= X-Google-Smtp-Source: AA0mqf44sAXDihjUoLOA8vgVoQi2e8EgyiN37sVNJHg7/rPDYdzoEdTR3jj+U7YP7e0UIc3frClh1A== X-Received: by 2002:a63:4842:0:b0:42f:e191:2d35 with SMTP id x2-20020a634842000000b0042fe1912d35mr10311260pgk.1.1668393139941; Sun, 13 Nov 2022 18:32:19 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:19 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 07/17] powerpc/qspinlock: store owner CPU in lock word Date: Mon, 14 Nov 2022 12:31:27 +1000 Message-Id: <20221114023137.2679627-9-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Store the owner CPU number in the lock word so it may be yielded to, as powerpc's paravirtualised simple spinlocks do. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 9 ++++++++- arch/powerpc/include/asm/qspinlock_types.h | 10 ++++++++++ arch/powerpc/lib/qspinlock.c | 9 ++++++--- 3 files changed, 24 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 7d300e6883a8..3eff2d875bb6 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -20,8 +20,15 @@ static __always_inline int queued_spin_is_contended(struct qspinlock *lock) return !!(READ_ONCE(lock->val) & _Q_TAIL_CPU_MASK); } +static __always_inline u32 queued_spin_encode_locked_val(void) +{ + /* XXX: make this use lock value in paca like simple spinlocks? */ + return _Q_LOCKED_VAL | (smp_processor_id() << _Q_OWNER_CPU_OFFSET); +} + static __always_inline int queued_spin_trylock(struct qspinlock *lock) { + u32 new = queued_spin_encode_locked_val(); u32 prev; asm volatile( @@ -33,7 +40,7 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock) "\t" PPC_ACQUIRE_BARRIER " \n" "2: \n" : "=&r" (prev) - : "r" (&lock->val), "r" (_Q_LOCKED_VAL), + : "r" (&lock->val), "r" (new), "i" (IS_ENABLED(CONFIG_PPC64)) : "cr0", "memory"); diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index 8b20f5e22bba..35f9525381e6 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -29,6 +29,8 @@ typedef struct qspinlock { * Bitfields in the lock word: * * 0: locked bit + * 1-14: lock holder cpu + * 15: unused bit * 16: must queue bit * 17-31: tail cpu (+1) */ @@ -39,6 +41,14 @@ typedef struct qspinlock { #define _Q_LOCKED_MASK _Q_SET_MASK(LOCKED) #define _Q_LOCKED_VAL (1U << _Q_LOCKED_OFFSET) +#define _Q_OWNER_CPU_OFFSET 1 +#define _Q_OWNER_CPU_BITS 14 +#define _Q_OWNER_CPU_MASK _Q_SET_MASK(OWNER_CPU) + +#if CONFIG_NR_CPUS > (1U << _Q_OWNER_CPU_BITS) +#error "qspinlock does not support such large CONFIG_NR_CPUS" +#endif + #define _Q_MUST_Q_OFFSET 16 #define _Q_MUST_Q_BITS 1 #define _Q_MUST_Q_MASK _Q_SET_MASK(MUST_Q) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 8f437b0768a5..b25a52251cb3 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -49,6 +49,7 @@ static inline int decode_tail_cpu(u32 val) /* Take the lock by setting the lock bit, no other CPUs will touch it. */ static __always_inline void set_locked(struct qspinlock *lock) { + u32 new = queued_spin_encode_locked_val(); u32 prev, tmp; asm volatile( @@ -58,7 +59,7 @@ static __always_inline void set_locked(struct qspinlock *lock) " bne- 1b \n" "\t" PPC_ACQUIRE_BARRIER " \n" : "=&r" (prev), "=&r" (tmp) - : "r" (&lock->val), "i" (_Q_LOCKED_VAL), + : "r" (&lock->val), "r" (new), "i" (IS_ENABLED(CONFIG_PPC64)) : "cr0", "memory"); @@ -90,13 +91,15 @@ static __always_inline u32 __trylock_cmpxchg(struct qspinlock *lock, u32 old, u3 /* Take lock, clearing tail, cmpxchg with old (which must not be locked) */ static __always_inline int trylock_clear_tail_cpu(struct qspinlock *lock, u32 val) { - return __trylock_cmpxchg(lock, val, _Q_LOCKED_VAL); + u32 newval = queued_spin_encode_locked_val(); + + return __trylock_cmpxchg(lock, val, newval); } /* Take lock, preserving tail, cmpxchg with val (which must not be locked) */ static __always_inline int trylock_with_tail_cpu(struct qspinlock *lock, u32 val) { - u32 newval = _Q_LOCKED_VAL | (val & _Q_TAIL_CPU_MASK); + u32 newval = queued_spin_encode_locked_val() | (val & _Q_TAIL_CPU_MASK); return __trylock_cmpxchg(lock, val, newval); } From patchwork Mon Nov 14 02:31:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703398 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=avZzpUaa; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YSv1m1Kz23nN for ; Mon, 14 Nov 2022 13:40:31 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YSv0gv7z3f9L for ; Mon, 14 Nov 2022 13:40:31 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=avZzpUaa; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::42c; helo=mail-pf1-x42c.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=avZzpUaa; dkim-atps=neutral Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHX5GPtz3cMK for ; Mon, 14 Nov 2022 13:32:24 +1100 (AEDT) Received: by mail-pf1-x42c.google.com with SMTP id g62so9773517pfb.10 for ; Sun, 13 Nov 2022 18:32:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5VFKeV/LgpiRuyALCe3LPQB58m40bJVCcJtuPUhA1R4=; b=avZzpUaa90FsfsqCToWgSq8dieLaX7q1YnmrVegBWGEHoy+8SIGhxt1rT38mAz1V60 RzjCvkqRuaUUUrgP7QJKd0SLVV1LARM1FFVfi7wMGaqLo5Lh8lQcdP0VBakvsEgmdR5m kkPAV8QAqDiX26RgKlGOs3z1yyrxDOtPjZcg6UhhC6lU7chRrBj+svsfyPlJDnRfaGnU xq98me5XX74858pVXK3gfX8SkJd2tsWq6Bej2QhhyAos5Rf0ldvMdWvO6OnB1HCaRW8D 7fiHXBfVgnameXq0NJiIeusI2hZBpYk4DkPFu0//b3sP2/6TPRr9MwYbJK0EmK8b9Rye VKZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5VFKeV/LgpiRuyALCe3LPQB58m40bJVCcJtuPUhA1R4=; b=V2bOKPgO/vdKHWsMdcUOzmagxQ5xghi3z5Rg2YH0w9FRhpxAMRI4O8H81sIqcrvAP1 LBDjx76xMlM4AwsoXS2mScr2NCnBaRJQ3Pe6P3bb56bUPPstXFwP7xPDYs0yE5R/FLf7 eaJt5lDav4h19OEQt0ikSnyuKpcYtNn3jPikdMWlz3Hvhsr3+3YMz7WoHPk4rjTk0Xe5 2HVQmQAx8G9wYkEBrjgsBeu0PEzBmEgEeGnlwgG/nw8KPLQeVBG4PAsWUOpQhStKGTPl 46rzRAMsLCgg0of2SSgArrZ7bzTa/JYK6GpGavBDncV4CNzMBAxUBg0+1wLKiJ198cK4 4Tmg== X-Gm-Message-State: ANoB5pniCsT5ze2DKwzKZwDn05r9k4TiWLALVMdskaZuEXGzx2TISgsp FtRmuicMygZvF4fEt4ttilX7W3I8eBM= X-Google-Smtp-Source: AA0mqf6Ilj0kyL4Y4HNP+mTz0H6endp/00bKyKtN/uwjvKpqFCkLiTEzg1RkJRS0++caYxTM40D2gw== X-Received: by 2002:a05:6a00:26f7:b0:572:149c:e1ba with SMTP id p55-20020a056a0026f700b00572149ce1bamr3114729pfw.19.1668393143770; Sun, 13 Nov 2022 18:32:23 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:22 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 08/17] powerpc/qspinlock: paravirt yield to lock owner Date: Mon, 14 Nov 2022 12:31:28 +1000 Message-Id: <20221114023137.2679627-10-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Waiters spinning on the lock word should yield to the lock owner if the vCPU is preempted. This improves performance when the hypervisor has oversubscribed physical CPUs. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 101 ++++++++++++++++++++++++++++++----- 1 file changed, 88 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index b25a52251cb3..d81d72125034 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -5,6 +5,7 @@ #include #include #include +#include #define MAX_NODES 4 @@ -24,14 +25,16 @@ static int steal_spins __read_mostly = (1<<5); static bool maybe_stealers __read_mostly = true; static int head_spins __read_mostly = (1<<8); +static bool pv_yield_owner __read_mostly = true; + static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); -static __always_inline int get_steal_spins(void) +static __always_inline int get_steal_spins(bool paravirt) { return steal_spins; } -static __always_inline int get_head_spins(void) +static __always_inline int get_head_spins(bool paravirt) { return head_spins; } @@ -46,6 +49,11 @@ static inline int decode_tail_cpu(u32 val) return (val >> _Q_TAIL_CPU_OFFSET) - 1; } +static inline int get_owner_cpu(u32 val) +{ + return (val & _Q_OWNER_CPU_MASK) >> _Q_OWNER_CPU_OFFSET; +} + /* Take the lock by setting the lock bit, no other CPUs will touch it. */ static __always_inline void set_locked(struct qspinlock *lock) { @@ -169,7 +177,45 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } -static inline bool try_to_steal_lock(struct qspinlock *lock) +static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +{ + int owner; + u32 yield_count; + + BUG_ON(!(val & _Q_LOCKED_VAL)); + + if (!paravirt) + goto relax; + + if (!pv_yield_owner) + goto relax; + + owner = get_owner_cpu(val); + yield_count = yield_count_of(owner); + + if ((yield_count & 1) == 0) + goto relax; /* owner vcpu is running */ + + /* + * Read the lock word after sampling the yield count. On the other side + * there may a wmb because the yield count update is done by the + * hypervisor preemption and the value update by the OS, however this + * ordering might reduce the chance of out of order accesses and + * improve the heuristic. + */ + smp_rmb(); + + if (READ_ONCE(lock->val) == val) { + yield_to_preempted(owner, yield_count); + /* Don't relax if we yielded. Maybe we should? */ + return; + } +relax: + cpu_relax(); +} + + +static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { int iters = 0; @@ -187,16 +233,16 @@ static inline bool try_to_steal_lock(struct qspinlock *lock) if (trylock_with_tail_cpu(lock, val)) return true; } else { - cpu_relax(); + yield_to_locked_owner(lock, val, paravirt); } iters++; - } while (iters < get_steal_spins()); + } while (iters < get_steal_spins(paravirt)); return false; } -static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) +static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, bool paravirt) { struct qnodes *qnodesp; struct qnode *next, *node; @@ -252,7 +298,7 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) if (!(val & _Q_LOCKED_VAL)) break; - cpu_relax(); + yield_to_locked_owner(lock, val, paravirt); } /* If we're the last queued, must clean up the tail. */ @@ -275,10 +321,10 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) if (!(val & _Q_LOCKED_VAL)) break; - cpu_relax(); + yield_to_locked_owner(lock, val, paravirt); iters++; - if (!mustq && iters >= get_head_spins()) { + if (!mustq && iters >= get_head_spins(paravirt)) { mustq = true; set_mustq(lock); val |= _Q_MUST_Q_VAL; @@ -317,10 +363,20 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) void queued_spin_lock_slowpath(struct qspinlock *lock) { - if (try_to_steal_lock(lock)) - return; - - queued_spin_lock_mcs_queue(lock); + /* + * This looks funny, but it induces the compiler to inline both + * sides of the branch rather than share code as when the condition + * is passed as the paravirt argument to the functions. + */ + if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) && is_shared_processor()) { + if (try_to_steal_lock(lock, true)) + return; + queued_spin_lock_mcs_queue(lock, true); + } else { + if (try_to_steal_lock(lock, false)) + return; + queued_spin_lock_mcs_queue(lock, false); + } } EXPORT_SYMBOL(queued_spin_lock_slowpath); @@ -386,10 +442,29 @@ static int head_spins_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_head_spins, head_spins_get, head_spins_set, "%llu\n"); +static int pv_yield_owner_set(void *data, u64 val) +{ + pv_yield_owner = !!val; + + return 0; +} + +static int pv_yield_owner_get(void *data, u64 *val) +{ + *val = pv_yield_owner; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_owner, pv_yield_owner_get, pv_yield_owner_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); + if (is_shared_processor()) { + debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); + } return 0; } From patchwork Mon Nov 14 02:31:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703399 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=dTNP2CTe; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YTw00cpz23nN for ; Mon, 14 Nov 2022 13:41:23 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YTv4VDPz3f9m for ; Mon, 14 Nov 2022 13:41:23 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=dTNP2CTe; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::531; helo=mail-pg1-x531.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=dTNP2CTe; dkim-atps=neutral Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHf0bWMz3cLF for ; Mon, 14 Nov 2022 13:32:29 +1100 (AEDT) Received: by mail-pg1-x531.google.com with SMTP id q71so9115491pgq.8 for ; Sun, 13 Nov 2022 18:32:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=TxlZzY3oPQoAvpJ/JC2WvocxFOzKd/6CJ/9uPnl4FMo=; b=dTNP2CTeTqoouloFiOWtNOXRHE28GCBm0qfU+Rg9vPIYxhpLzEjLw1YFSaZp2qBtdG bUUFMDbPx4IZrCvUqlameZfC6w7JrW2rvdBht40Dz9MXEEAcqyr37cx0m6ygJkjdpqsY g0BfIizmOF30NCgzA8YvxRwLf7fuJNPvdIw/PcpArDxOaCgZAjZ1XpxFyKO1Cij4lHu0 jim1fDnrlbnQ4r/eCxX+Pq/p6nIw7X7nPZwTzVjiCp8hbMKNHDwBZAq7rksduhkVttN2 9wUXJ/x18yiTQCjy282HNS/UEHd/jbvp3XguJO73n7bZQAVUPYVybT6hVm+Zkt9agW7d qVyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TxlZzY3oPQoAvpJ/JC2WvocxFOzKd/6CJ/9uPnl4FMo=; b=54WhBHZzGdKNiFbmgz3jzfUEbwwR0F4bFq79xegA6KHy8//5gNZR0sVH5+67/EbBpn QL8mAv5lJZe9pHw23lc1lQ7SaNAIasvkR6y+2zbN7/F+cDzvgEOGm/LwwlCyHPfI/bzE 3Md12POBY/I+Cs/Our3fBVDY2tyS5Il473jVRibs0VypfkPWyRlrSQpRtO/FjAcTRCDP vFn3mJtTik4KT5OXrEG0l0pFPBrwaucT6qjfHDMZjZ0tb180Ih04FEOPJU+StN5R1rcL BObXiAnAFK5CUdKqatXPTBZTPXxLWb9ROw2G9by7YdscHhJVCx7ncn9mnJvA9KBxOO/y OksA== X-Gm-Message-State: ANoB5pktNs0EYLAZpH5cvW5asFtQe/RD2hI5iqMxnhYtqNxTJF4jIsvB RThewDgwNE3in2wk1kGEenk9Qr1PDaY= X-Google-Smtp-Source: AA0mqf4+SLWa/5FEWYE2/aktP90O+DZbCM+sSZdh7mTBhfL0l24VXJDeq5IxoTpm21JFCYc9EALtJQ== X-Received: by 2002:a63:e50e:0:b0:470:3a39:4c18 with SMTP id r14-20020a63e50e000000b004703a394c18mr9875865pgh.163.1668393147213; Sun, 13 Nov 2022 18:32:27 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:26 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 09/17] powerpc/qspinlock: implement option to yield to previous node Date: Mon, 14 Nov 2022 12:31:29 +1000 Message-Id: <20221114023137.2679627-11-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Queued waiters which are not at the head of the queue don't spin on the lock word but their qnode lock word, waiting for the previous queued CPU to release them. Add an option which allows these waiters to yield to the previous CPU if its vCPU is preempted. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 46 +++++++++++++++++++++++++++++++++++- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index d81d72125034..272467c99b90 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -26,6 +26,7 @@ static bool maybe_stealers __read_mostly = true; static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; +static bool pv_yield_prev __read_mostly = true; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -214,6 +215,32 @@ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 va cpu_relax(); } +static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) +{ + int prev_cpu = decode_tail_cpu(val); + u32 yield_count; + + if (!paravirt) + goto relax; + + if (!pv_yield_prev) + goto relax; + + yield_count = yield_count_of(prev_cpu); + if ((yield_count & 1) == 0) + goto relax; /* owner vcpu is running */ + + smp_rmb(); /* See yield_to_locked_owner comment */ + + if (!node->locked) { + yield_to_preempted(prev_cpu, yield_count); + return; + } + +relax: + cpu_relax(); +} + static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { @@ -286,7 +313,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* Wait for mcs node lock to be released */ while (!node->locked) - cpu_relax(); + yield_to_prev(lock, node, old, paravirt); smp_rmb(); /* acquire barrier for the mcs lock */ } @@ -458,12 +485,29 @@ static int pv_yield_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_owner, pv_yield_owner_get, pv_yield_owner_set, "%llu\n"); +static int pv_yield_prev_set(void *data, u64 val) +{ + pv_yield_prev = !!val; + + return 0; +} + +static int pv_yield_prev_get(void *data, u64 *val) +{ + *val = pv_yield_prev; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_prev, pv_yield_prev_get, pv_yield_prev_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); + debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); } return 0; From patchwork Mon Nov 14 02:31:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703400 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=DwY5klbp; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YVw43tGz23nN for ; Mon, 14 Nov 2022 13:42:16 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YVw2zG2z3f88 for ; Mon, 14 Nov 2022 13:42:16 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=DwY5klbp; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::52f; helo=mail-pg1-x52f.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=DwY5klbp; dkim-atps=neutral Received: from mail-pg1-x52f.google.com (mail-pg1-x52f.google.com [IPv6:2607:f8b0:4864:20::52f]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHj4d7Hz3cLF for ; Mon, 14 Nov 2022 13:32:33 +1100 (AEDT) Received: by mail-pg1-x52f.google.com with SMTP id n17so1488099pgh.9 for ; Sun, 13 Nov 2022 18:32:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=jESh/QHriL5jCEX0NFLvVT4M44gqscG0SJjoh++TX7Y=; b=DwY5klbpZLEBzGmZdDjNuPOmEBWeZUqeXjXnq0z657BFVPUEmTcS2hcAIRT9M7isiN 8TyRTTSYWqhQlSXpWhXKuz7ICaEvOx+9mx+Zcs5C6Qa0LGxuRkFWbAzzuVOhmV1tkWhi WJ/4PlS108t2QRbcbBYrDipo8QJHwOlzIHobZkK3lHnOSOAxQJq0ZRsRzI264MefSzkh i/RnPvjmy3HZhxTrDw4906ntVG3BkUBO8XJk73v8mJ0ooNb7ZuBycRHPeZZo65YmfxOd Bk78TLikD3V+lJF+Mm0qA72+bIlOOsat+r/mrJJLhiXQoWPKt9ceD0je94+BcYKOo+R6 fdOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jESh/QHriL5jCEX0NFLvVT4M44gqscG0SJjoh++TX7Y=; b=c3xZAyeXV621AE5GIGoFEvGcS+v2NlS5gcO6FxtpD7KzhxUIm/sNZdcYmjXu8bacRe sAhGW47ZqXM1QinPFNByop5PMRdynEmh7PvxbhZXUieH0jCUpEB9O2KeoKHDyOqPM+n2 cZ0SiPXm4GiHvhgifO2sHkjxnvOe0MJS9MHduD6U5glY8VfkVInWkKfmeBY4z/dtZzgc A0bAR7NLPY3xkiGPmwi6GIwggem/g5UeIGcVyD1g8R/rrF5K5dIKChji8exUeerYCttk XAo78JkepUqB8Vahbgxa39Xh6QPpKtYYNpy7Y9p/y3mqAA7E1NHt0m7k6lrDhwHIpjV4 kTTA== X-Gm-Message-State: ANoB5pmX2nSnFy4eDmx/3Eba4qQ1nVMnQVrV7M/skgcV59axKjvIlJWQ A8XmScpF79hXLua/23Tbv7Li2LqVz3I= X-Google-Smtp-Source: AA0mqf5vl4+61lH3MOkrizq95zaPuBQ/QMNwdQPixBBajVzkLC3WMNEPGmzsM6xfdKQlyCu/TGkqCA== X-Received: by 2002:a63:114b:0:b0:470:5b0d:b50e with SMTP id 11-20020a63114b000000b004705b0db50emr9826439pgr.488.1668393150927; Sun, 13 Nov 2022 18:32:30 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:30 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 10/17] powerpc/qspinlock: allow stealing when head of queue yields Date: Mon, 14 Nov 2022 12:31:30 +1000 Message-Id: <20221114023137.2679627-12-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" If the head of queue is preventing stealing but it finds the owner vCPU is preempted, it will yield its cycles to the owner which could cause it to become preempted. Add an option to re-allow stealers before yielding, and disallow them again after returning from the yield. Disable this option by default for now, i.e., no logical change. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 61 +++++++++++++++++++++++++++++++++--- 1 file changed, 57 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 272467c99b90..6b54b4628991 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -26,6 +26,7 @@ static bool maybe_stealers __read_mostly = true; static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; +static bool pv_yield_allow_steal __read_mostly = false; static bool pv_yield_prev __read_mostly = true; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -154,6 +155,22 @@ static __always_inline u32 set_mustq(struct qspinlock *lock) return prev; } +static __always_inline u32 clear_mustq(struct qspinlock *lock) +{ + u32 prev; + + asm volatile( +"1: lwarx %0,0,%1 # clear_mustq \n" +" andc %0,%0,%2 \n" +" stwcx. %0,0,%1 \n" +" bne- 1b \n" + : "=&r" (prev) + : "r" (&lock->val), "r" (_Q_MUST_Q_VAL) + : "cr0", "memory"); + + return prev; +} + static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); @@ -178,7 +195,7 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } -static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) { int owner; u32 yield_count; @@ -207,7 +224,11 @@ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 va smp_rmb(); if (READ_ONCE(lock->val) == val) { + if (mustq) + clear_mustq(lock); yield_to_preempted(owner, yield_count); + if (mustq) + set_mustq(lock); /* Don't relax if we yielded. Maybe we should? */ return; } @@ -215,6 +236,21 @@ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 va cpu_relax(); } +static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +{ + __yield_to_locked_owner(lock, val, paravirt, false); +} + +static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +{ + bool mustq = false; + + if ((val & _Q_MUST_Q_VAL) && pv_yield_allow_steal) + mustq = true; + + __yield_to_locked_owner(lock, val, paravirt, mustq); +} + static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); @@ -230,7 +266,7 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * if ((yield_count & 1) == 0) goto relax; /* owner vcpu is running */ - smp_rmb(); /* See yield_to_locked_owner comment */ + smp_rmb(); /* See __yield_to_locked_owner comment */ if (!node->locked) { yield_to_preempted(prev_cpu, yield_count); @@ -325,7 +361,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; - yield_to_locked_owner(lock, val, paravirt); + yield_head_to_locked_owner(lock, val, paravirt); } /* If we're the last queued, must clean up the tail. */ @@ -348,7 +384,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; - yield_to_locked_owner(lock, val, paravirt); + yield_head_to_locked_owner(lock, val, paravirt); iters++; if (!mustq && iters >= get_head_spins(paravirt)) { @@ -485,6 +521,22 @@ static int pv_yield_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_owner, pv_yield_owner_get, pv_yield_owner_set, "%llu\n"); +static int pv_yield_allow_steal_set(void *data, u64 val) +{ + pv_yield_allow_steal = !!val; + + return 0; +} + +static int pv_yield_allow_steal_get(void *data, u64 *val) +{ + *val = pv_yield_allow_steal; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_allow_steal, pv_yield_allow_steal_get, pv_yield_allow_steal_set, "%llu\n"); + static int pv_yield_prev_set(void *data, u64 val) { pv_yield_prev = !!val; @@ -507,6 +559,7 @@ static __init int spinlock_debugfs_init(void) debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); + debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); } From patchwork Mon Nov 14 02:31:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703401 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Gnn4CJw/; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YX45qw6z23nN for ; Mon, 14 Nov 2022 13:43:16 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YX44qqvz3f5R for ; Mon, 14 Nov 2022 13:43:16 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Gnn4CJw/; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::1030; helo=mail-pj1-x1030.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Gnn4CJw/; dkim-atps=neutral Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHm6bDCz3cMM for ; Mon, 14 Nov 2022 13:32:36 +1100 (AEDT) Received: by mail-pj1-x1030.google.com with SMTP id o7so9168370pjj.1 for ; Sun, 13 Nov 2022 18:32:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=x26+trCXBaWEOJg1x5+z1SBGuB50i/8pVdmP1HjVa5A=; b=Gnn4CJw/uZ2Y0Y+n29ftK3utksoVXq+Fwx/PYWFPbRsoAPBQ4TvjRR5+WRY35CyuqW sHtxIT6KTPBehno9S0Ln2n1qI95EKYrFAmmlOpk507xpG1RmeUqedv4eFfYAsSKVpM7I tswJO8cgixNbnRTWudLMwuDLDLqfb6vmQurQ4jcdcZX41TP5Z3mIXQucmeIr729RSig3 nB6pGRDcnkK0DtS/BLkOnwFwvgmEyazrMOEDhb6n5d1qBNfoik5+Dq/wv1hwB6WlZsRJ U+4pmjyBUlMbMnaPXut5ez74LzHHDEnsyVYSASX6l4yZRF6CB4HJSEf6q6jmluTG6ONi H9gg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=x26+trCXBaWEOJg1x5+z1SBGuB50i/8pVdmP1HjVa5A=; b=PjNdUJZc5gbTrPP1Y9nJxfOH+Wg4c5MSU8coaiIIznvle4x/VZwbS53ku9NRDtpv4a D3UcI42B7O+ZtJmK5atMC5MBu8k1x14aSFXHFyxovJkV3TnQhdozeBiUIwWMS9zVwWZC tULKQYvrLyYrgj4gLcAhxNe1Bu93VCg7xCe+SZrGJcUWfYq40iIZLmyL9qUPr4LeIFEE CtxMAfpzwsImuCDWlG0rD1a9ipl9eO/ukzt38zluae0J3Vravk2J2QjHm/Qy383nJw5n PZ7v2XQ3sX1HlK6Bsd0/HjCoKfW23rAdyVMToTeSanR0R6b7ljD1qNiuRCDddTugNMVv 7yQg== X-Gm-Message-State: ANoB5pnjWH70QeQGdnR2eSJhukSbwGljHe10DBvEtNKnW9zQ7JZVWbaa AaX/d+/ozQDQkQv2WTdVTcrCXMHF1cc= X-Google-Smtp-Source: AA0mqf5VW4cUe93amcdWfHY//kuO/QpwJ0Bh1tbHDpe2dVO8fuCqwnmaBjLOMeIxp++N5Vyt+rjkIg== X-Received: by 2002:a17:90a:c396:b0:205:ff5b:d27a with SMTP id h22-20020a17090ac39600b00205ff5bd27amr11460351pjt.225.1668393154403; Sun, 13 Nov 2022 18:32:34 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:33 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 11/17] powerpc/qspinlock: allow propagation of yield CPU down the queue Date: Mon, 14 Nov 2022 12:31:31 +1000 Message-Id: <20221114023137.2679627-13-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Having all CPUs poll the lock word for the owner CPU that should be yielded to defeats most of the purpose of using MCS queueing for scalability. Yet it may be desirable for queued waiters to to yield to a preempted owner. s390 addreses this problem by having queued waiters sample the lock word to find the owner much less frequently. In this approach, the waiters never sample it directly, but the queue head propagates the owner CPU back to the next waiter if it ever finds the owner has been preempted. Queued waiters then subsequently propagate the owner CPU back to the next waiter, and so on. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 82 ++++++++++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 6b54b4628991..f07843b4c497 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -12,6 +12,7 @@ struct qnode { struct qnode *next; struct qspinlock *lock; + int yield_cpu; u8 locked; /* 1 if lock acquired */ }; @@ -28,6 +29,7 @@ static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; static bool pv_yield_prev __read_mostly = true; +static bool pv_yield_propagate_owner __read_mostly = true; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -251,14 +253,67 @@ static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u __yield_to_locked_owner(lock, val, paravirt, mustq); } +static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int *set_yield_cpu, bool paravirt) +{ + struct qnode *next; + int owner; + + if (!paravirt) + return; + if (!pv_yield_propagate_owner) + return; + + owner = get_owner_cpu(val); + if (*set_yield_cpu == owner) + return; + + next = READ_ONCE(node->next); + if (!next) + return; + + if (vcpu_is_preempted(owner)) { + next->yield_cpu = owner; + *set_yield_cpu = owner; + } else if (*set_yield_cpu != -1) { + next->yield_cpu = owner; + *set_yield_cpu = owner; + } +} + static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); u32 yield_count; + int yield_cpu; if (!paravirt) goto relax; + if (!pv_yield_propagate_owner) + goto yield_prev; + + yield_cpu = READ_ONCE(node->yield_cpu); + if (yield_cpu == -1) { + /* Propagate back the -1 CPU */ + if (node->next && node->next->yield_cpu != -1) + node->next->yield_cpu = yield_cpu; + goto yield_prev; + } + + yield_count = yield_count_of(yield_cpu); + if ((yield_count & 1) == 0) + goto yield_prev; /* owner vcpu is running */ + + smp_rmb(); + + if (yield_cpu == node->yield_cpu) { + if (node->next && node->next->yield_cpu != yield_cpu) + node->next->yield_cpu = yield_cpu; + yield_to_preempted(yield_cpu, yield_count); + return; + } + +yield_prev: if (!pv_yield_prev) goto relax; @@ -331,6 +386,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b node = &qnodesp->nodes[idx]; node->next = NULL; node->lock = lock; + node->yield_cpu = -1; node->locked = 0; tail = encode_tail_cpu(smp_processor_id()); @@ -351,16 +407,23 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b while (!node->locked) yield_to_prev(lock, node, old, paravirt); + /* Clear out stale propagated yield_cpu */ + if (paravirt && pv_yield_propagate_owner && node->yield_cpu != -1) + node->yield_cpu = -1; + smp_rmb(); /* acquire barrier for the mcs lock */ } if (!maybe_stealers) { + int set_yield_cpu = -1; + /* We're at the head of the waitqueue, wait for the lock. */ for (;;) { val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) break; + propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); yield_head_to_locked_owner(lock, val, paravirt); } @@ -374,6 +437,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* We must be the owner, just set the lock bit and acquire */ set_locked(lock); } else { + int set_yield_cpu = -1; int iters = 0; bool mustq = false; @@ -384,6 +448,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; + propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); yield_head_to_locked_owner(lock, val, paravirt); iters++; @@ -553,6 +618,22 @@ static int pv_yield_prev_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_prev, pv_yield_prev_get, pv_yield_prev_set, "%llu\n"); +static int pv_yield_propagate_owner_set(void *data, u64 val) +{ + pv_yield_propagate_owner = !!val; + + return 0; +} + +static int pv_yield_propagate_owner_get(void *data, u64 *val) +{ + *val = pv_yield_propagate_owner; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_propagate_owner, pv_yield_propagate_owner_get, pv_yield_propagate_owner_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); @@ -561,6 +642,7 @@ static __init int spinlock_debugfs_init(void) debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); + debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner); } return 0; From patchwork Mon Nov 14 02:31:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703402 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=BLmIHG3x; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YY52NSJz23nN for ; Mon, 14 Nov 2022 13:44:09 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YY515bQz3f3k for ; Mon, 14 Nov 2022 13:44:09 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=BLmIHG3x; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::62a; helo=mail-pl1-x62a.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=BLmIHG3x; dkim-atps=neutral Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHq5d6Cz3cKv for ; Mon, 14 Nov 2022 13:32:39 +1100 (AEDT) Received: by mail-pl1-x62a.google.com with SMTP id y4so8873953plb.2 for ; Sun, 13 Nov 2022 18:32:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=XPvVAbxtafvGVHXmryCF6pKJZoEVQIoaHGDq1e3+zz8=; b=BLmIHG3x1wW1EAUaQ4rfo50EME5kOrVHPPTOITLwLiaaGUGCsVhIuPdvt2tuPeAiZt VoQEKQRGrdvXzKboaPrado0Z7MS/4HIRr8W859RxWOsYyLfvBkEyyoyeAGr+KuNATbjy 2Z79pWogSHrATuX1Tngb6npkKT4mjZslruPyZEjTDLJ3yXNp4U3zmDEYIQLTG4PdKkQy E3qho8NK87hd2T/At0OWy/h5fvYbSQORDm6ddqxCoHaPyu742WskSkhXYY8YUiGADoh0 Gw0zEOQYbJSDzprjf/hS7PRwjoeMEAMAlgiy9NRuDWavzUyvAl2BlTOdOXhKUEvG7wTp Z9qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XPvVAbxtafvGVHXmryCF6pKJZoEVQIoaHGDq1e3+zz8=; b=T1xBruEoSa0a76J5IQMKgsJdLV1a+kqlnG54orfFhxlK/stTjEi9GMhayseh7slMI/ iEKnnS/9MKNiZJ71kfLUt6TshMPUanmLWggwa2nVBiw/J85tYlVuHw2f38d0WnwXksE3 cfbx/GMcym+KZ5HZKr2yJeOH6chb5zDyq4jPqZj+oihagb4fFfg5ogbY3/7KPkZeTnDf OjfvwzLvNHpX89scPRq3msJHH5JDtaEzp45Nw1Irs3RDRdCSZNmEakOQnG/VsV4CtqAD lx5tQFRJLT2QLwDI4iXXGknyo7RiWTkxBWDu7HbU+XGSPfMk70tAwReJQ+OJgvWNWrMD ru5w== X-Gm-Message-State: ANoB5pk9D5KbP8L28bHNe9RpQ7fF4ee+ACId/yXPBcHbAChJM0JFu2AW Yu9Kzdw1Vg1vj7CkZMdER0coYvfECOA= X-Google-Smtp-Source: AA0mqf46R9fCkzszYDvpIeuaGYSwSAKttvsHB4zus00sWTKj3nxXSBbdDCXN2JpjROD1rxsqUHBg4g== X-Received: by 2002:a17:903:41d0:b0:188:5e78:be0 with SMTP id u16-20020a17090341d000b001885e780be0mr11808876ple.18.1668393157856; Sun, 13 Nov 2022 18:32:37 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:37 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 12/17] powerpc/qspinlock: add ability to prod new queue head CPU Date: Mon, 14 Nov 2022 12:31:32 +1000 Message-Id: <20221114023137.2679627-14-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" After the head of the queue acquires the lock, it releases the next waiter in the queue to become the new head. Add an option to prod the new head if its vCPU was preempted. This may only have an effect if queue waiters are yielding. Disable this option by default for now, i.e., no logical change. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index f07843b4c497..51123240da8e 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -12,6 +12,7 @@ struct qnode { struct qnode *next; struct qspinlock *lock; + int cpu; int yield_cpu; u8 locked; /* 1 if lock acquired */ }; @@ -30,6 +31,7 @@ static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; static bool pv_yield_prev __read_mostly = true; static bool pv_yield_propagate_owner __read_mostly = true; +static bool pv_prod_head __read_mostly = false; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -386,6 +388,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b node = &qnodesp->nodes[idx]; node->next = NULL; node->lock = lock; + node->cpu = smp_processor_id(); node->yield_cpu = -1; node->locked = 0; @@ -483,7 +486,14 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b * this store to locked. The corresponding barrier is the smp_rmb() * acquire barrier for mcs lock, above. */ - WRITE_ONCE(next->locked, 1); + if (paravirt && pv_prod_head) { + int next_cpu = next->cpu; + WRITE_ONCE(next->locked, 1); + if (vcpu_is_preempted(next_cpu)) + prod_cpu(next_cpu); + } else { + WRITE_ONCE(next->locked, 1); + } release: qnodesp->count--; /* release the node */ @@ -634,6 +644,22 @@ static int pv_yield_propagate_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_propagate_owner, pv_yield_propagate_owner_get, pv_yield_propagate_owner_set, "%llu\n"); +static int pv_prod_head_set(void *data, u64 val) +{ + pv_prod_head = !!val; + + return 0; +} + +static int pv_prod_head_get(void *data, u64 *val) +{ + *val = pv_prod_head; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_prod_head, pv_prod_head_get, pv_prod_head_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); @@ -643,6 +669,7 @@ static __init int spinlock_debugfs_init(void) debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner); + debugfs_create_file("qspl_pv_prod_head", 0600, arch_debugfs_dir, NULL, &fops_pv_prod_head); } return 0; From patchwork Mon Nov 14 02:31:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703403 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=NX2SqJJD; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YZ56kRLz23nN for ; Mon, 14 Nov 2022 13:45:01 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YZ55tkrz3fP9 for ; Mon, 14 Nov 2022 13:45:01 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=NX2SqJJD; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::531; helo=mail-pg1-x531.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=NX2SqJJD; dkim-atps=neutral Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHv2M78z3cGm for ; Mon, 14 Nov 2022 13:32:43 +1100 (AEDT) Received: by mail-pg1-x531.google.com with SMTP id v3so9126420pgh.4 for ; Sun, 13 Nov 2022 18:32:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/OV9ZfoAfAbhk+WQL7FfDGr/G3dfE6dz+V5jHibvfWI=; b=NX2SqJJD7xy/JntKpyZgm0LbLq1jhSFzNvqw/e70o/SFGydOTR8XTWfv0QlLNwrREd fNQXUbffEhCzLhUck8iUigZWU6fbQk51PWTKne4dKDIpmoP9jK6S06xbGxCYHRy2h+lv 0pZWXcucbOPt7l961hvAtherIcyT0TTM9Ku5o4zPBrZWNiSOfVpJWXix5szBNvr4zYgp 8IRCLTYK4UOBlyOCkFj6TcLb6Sj5qML18YMr1eXsBM23fUNCzf9ooo7IzacJTDHfbUia BabT/KSufRCAVp84CyNWg2+A2LTXM75q3ZPoxlKzMh0zG2FR+DIvCZVuIt+nOuY683Gx Kryg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/OV9ZfoAfAbhk+WQL7FfDGr/G3dfE6dz+V5jHibvfWI=; b=Be4Jy0RuWaVIxp5ZO26NKd+glCA+8OtCzC6bT5lP5KeVR3ZElhPH2BHjFgI/2891Rj AZb+xm3itwdQ9NZ/PTdWO4DH8vkhmgvo+0A2qQmYWKlBwz0jdgWASWj1TyJb3Wxyuxaa INbfFfP7siOrQ9pvBRg3sOQFw6roaPK8AZ7rngXUaG9GXq7QdgmApjuZ1Svol6FwlPxn q4VPzJ11J/5xeBvWBmND6jP4Q4TY3RCsItlDG7q7OeYqldlxFbkRjI2ChMzP/F/XiIfV iC26+mtmDFR1SLjA4yoOKwlp1DSZtV9lCSPi8GXfuNoM3l5/el8EQ0vX781PC6fgL+jz wwLA== X-Gm-Message-State: ANoB5pkDb20OgaqK1cyK5WwVAGvY5MGxL4F2Nrh4OSt3DPIfLHYotrKa cwShIMmke/be4w2Suj6WiBbt5bgOVpM= X-Google-Smtp-Source: AA0mqf60NBiitm4+zHDpOmUB8lmCrREWO10/++BmfRfjpNaZEgBJJfDvPqOvv9z3oo4JTMryrEVEbg== X-Received: by 2002:a05:6a00:a16:b0:56a:fd45:d054 with SMTP id p22-20020a056a000a1600b0056afd45d054mr12365881pfh.3.1668393161277; Sun, 13 Nov 2022 18:32:41 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:40 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 13/17] powerpc/qspinlock: trylock and initial lock attempt may steal Date: Mon, 14 Nov 2022 12:31:33 +1000 Message-Id: <20221114023137.2679627-15-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" This gives trylock slightly more strength, and it also gives most of the benefit of passing 'val' back through the slowpath without the complexity. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 44 +++++++++++++++++++++++++++- arch/powerpc/lib/qspinlock.c | 9 ++++++ 2 files changed, 52 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 3eff2d875bb6..56638175e49b 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -5,6 +5,15 @@ #include #include +/* + * The trylock itself may steal. This makes trylocks slightly stronger, and + * might make spin locks slightly more efficient when stealing. + * + * This is compile-time, so if true then there may always be stealers, so the + * nosteal paths become unused. + */ +#define _Q_SPIN_TRY_LOCK_STEAL 1 + static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { return READ_ONCE(lock->val); @@ -26,11 +35,12 @@ static __always_inline u32 queued_spin_encode_locked_val(void) return _Q_LOCKED_VAL | (smp_processor_id() << _Q_OWNER_CPU_OFFSET); } -static __always_inline int queued_spin_trylock(struct qspinlock *lock) +static __always_inline int __queued_spin_trylock_nosteal(struct qspinlock *lock) { u32 new = queued_spin_encode_locked_val(); u32 prev; + /* Trylock succeeds only when unlocked and no queued nodes */ asm volatile( "1: lwarx %0,0,%1,%3 # queued_spin_trylock \n" " cmpwi 0,%0,0 \n" @@ -47,6 +57,38 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock) return likely(prev == 0); } +static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock) +{ + u32 new = queued_spin_encode_locked_val(); + u32 prev, tmp; + + /* Trylock may get ahead of queued nodes if it finds unlocked */ + asm volatile( +"1: lwarx %0,0,%2,%5 # queued_spin_trylock \n" +" andc. %1,%0,%4 \n" +" bne- 2f \n" +" and %1,%0,%4 \n" +" or %1,%1,%3 \n" +" stwcx. %1,0,%2 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" +"2: \n" + : "=&r" (prev), "=&r" (tmp) + : "r" (&lock->val), "r" (new), "r" (_Q_TAIL_CPU_MASK), + "i" (IS_ENABLED(CONFIG_PPC64)) + : "cr0", "memory"); + + return likely(!(prev & ~_Q_TAIL_CPU_MASK)); +} + +static __always_inline int queued_spin_trylock(struct qspinlock *lock) +{ + if (!_Q_SPIN_TRY_LOCK_STEAL) + return __queued_spin_trylock_nosteal(lock); + else + return __queued_spin_trylock_steal(lock); +} + void queued_spin_lock_slowpath(struct qspinlock *lock); static __always_inline void queued_spin_lock(struct qspinlock *lock) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 51123240da8e..830a90a66f5f 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -24,7 +24,11 @@ struct qnodes { /* Tuning parameters */ static int steal_spins __read_mostly = (1<<5); +#if _Q_SPIN_TRY_LOCK_STEAL == 1 +static const bool maybe_stealers = true; +#else static bool maybe_stealers __read_mostly = true; +#endif static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; @@ -527,6 +531,10 @@ void pv_spinlocks_init(void) #include static int steal_spins_set(void *data, u64 val) { +#if _Q_SPIN_TRY_LOCK_STEAL == 1 + /* MAYBE_STEAL remains true */ + steal_spins = val; +#else static DEFINE_MUTEX(lock); /* @@ -551,6 +559,7 @@ static int steal_spins_set(void *data, u64 val) steal_spins = val; } mutex_unlock(&lock); +#endif return 0; } From patchwork Mon Nov 14 02:31:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703404 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=M8+88ofT; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9Yb61QJ6z23nM for ; Mon, 14 Nov 2022 13:45:54 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9Yb60fZQz3f6V for ; Mon, 14 Nov 2022 13:45:54 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=M8+88ofT; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::62b; helo=mail-pl1-x62b.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=M8+88ofT; dkim-atps=neutral Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YHz2fF1z3dvf for ; Mon, 14 Nov 2022 13:32:47 +1100 (AEDT) Received: by mail-pl1-x62b.google.com with SMTP id b21so8845248plc.9 for ; Sun, 13 Nov 2022 18:32:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HjpE3jVVn946ihSc1RuQRXSXhPe2XHAfeGF4cz4ifk8=; b=M8+88ofTGx4Z/sg6rbw0avEH+akdSAQdCN9evENO4RPy7/cPq/+e4G0VA1xwhEgVvu YN0/egmJhNx8S5e9ACklfNLWINL3cA3mdlx2iW0oLywa4vGG149GAmDV6xPnC0ddTAox ibH+L/FfZOXE1vn1v6oBZUr8sZsP8iPSrtCAIsOJzIr01dGtv6+ctXI12GN0PYUQM4bD DVZZyBI5JB6ba5M2PAIwfLnll+sbKdS8SCIVxccn0R7G5ARWHFk427sNedJKWN1UDHGx wfxWxjhVzDcVQyuPwxJBIY4BmG5NnBmWp6SfdazHM+te5f0U/85nj5EXT2LI8MMMTJfi A25A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HjpE3jVVn946ihSc1RuQRXSXhPe2XHAfeGF4cz4ifk8=; b=vj+v2kmTYtwoolHXyNIrmA9XciiSoBDvJ30uRtf334xGTioFNRZpB12lDMFH8ajocH KK6vKCmjX9thGRJoD0fkLU5psHa1n1lN+s9+Ui3YqfJUcXLg9UEJXckRkWocsAXzRcqS 57oUScnm/xy0K/NcoItnpdGMMfbu9BJhGmS2cg9bHcNWkwUwrE9chQNIM/4l2MMI8sF/ nL7Da8Pji8OV4y0yS+/INSfcAT7o5/Omht4RjLCeVSrfosr86pqxq7M2JDu3AXXzwpcr K83e5ZzScFxZLwnN3nVtI3OhnR0swKiJyZc4oCZ0UsTP/FaiEgohV0XRtclNz20R44cf LVlw== X-Gm-Message-State: ANoB5pl00YEOuLiLpXhpLY+PuXGq1ypl0i1V9uKca+Ph6iC/hq76dImP tTLE+ohLYP5hbTaAJsmwYQcIfFXpbN0= X-Google-Smtp-Source: AA0mqf6S3Vm6nq766rWXsZRNCLgnMPCLIrEuNQOnqbERoMBNXNPuSyas0BcOliwEsyZc/wdPe/26zw== X-Received: by 2002:a17:902:f64d:b0:186:9cce:c59 with SMTP id m13-20020a170902f64d00b001869cce0c59mr11751652plg.120.1668393164675; Sun, 13 Nov 2022 18:32:44 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:44 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 14/17] powerpc/qspinlock: use spin_begin/end API Date: Mon, 14 Nov 2022 12:31:34 +1000 Message-Id: <20221114023137.2679627-16-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Use the spin_begin/spin_cpu_relax/spin_end APIs in qspinlock, which helps to prevent threads issuing a lot of expensive priority nops which may not have much effect due to immediately executing low then medium priority. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 41 ++++++++++++++++++++++++++++++++---- 1 file changed, 37 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 830a90a66f5f..ea8886e2922b 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -203,6 +203,7 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } +/* Called inside spin_begin() */ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) { int owner; @@ -222,6 +223,8 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 if ((yield_count & 1) == 0) goto relax; /* owner vcpu is running */ + spin_end(); + /* * Read the lock word after sampling the yield count. On the other side * there may a wmb because the yield count update is done by the @@ -237,18 +240,22 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 yield_to_preempted(owner, yield_count); if (mustq) set_mustq(lock); + spin_begin(); /* Don't relax if we yielded. Maybe we should? */ return; } + spin_begin(); relax: - cpu_relax(); + spin_cpu_relax(); } +/* Called inside spin_begin() */ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { __yield_to_locked_owner(lock, val, paravirt, false); } +/* Called inside spin_begin() */ static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { bool mustq = false; @@ -286,6 +293,7 @@ static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int } } +/* Called inside spin_begin() */ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); @@ -310,14 +318,18 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * if ((yield_count & 1) == 0) goto yield_prev; /* owner vcpu is running */ + spin_end(); + smp_rmb(); if (yield_cpu == node->yield_cpu) { if (node->next && node->next->yield_cpu != yield_cpu) node->next->yield_cpu = yield_cpu; yield_to_preempted(yield_cpu, yield_count); + spin_begin(); return; } + spin_begin(); yield_prev: if (!pv_yield_prev) @@ -327,15 +339,19 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * if ((yield_count & 1) == 0) goto relax; /* owner vcpu is running */ + spin_end(); + smp_rmb(); /* See __yield_to_locked_owner comment */ if (!node->locked) { yield_to_preempted(prev_cpu, yield_count); + spin_begin(); return; } + spin_begin(); relax: - cpu_relax(); + spin_cpu_relax(); } @@ -347,6 +363,8 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav return false; /* Attempt to steal the lock */ + spin_begin(); + do { u32 val = READ_ONCE(lock->val); @@ -354,8 +372,10 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav break; if (unlikely(!(val & _Q_LOCKED_VAL))) { + spin_end(); if (trylock_with_tail_cpu(lock, val)) return true; + spin_begin(); } else { yield_to_locked_owner(lock, val, paravirt); } @@ -363,6 +383,8 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav iters++; } while (iters < get_steal_spins(paravirt)); + spin_end(); + return false; } @@ -411,8 +433,10 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b WRITE_ONCE(prev->next, node); /* Wait for mcs node lock to be released */ + spin_begin(); while (!node->locked) yield_to_prev(lock, node, old, paravirt); + spin_end(); /* Clear out stale propagated yield_cpu */ if (paravirt && pv_yield_propagate_owner && node->yield_cpu != -1) @@ -425,6 +449,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b int set_yield_cpu = -1; /* We're at the head of the waitqueue, wait for the lock. */ + spin_begin(); for (;;) { val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) @@ -433,6 +458,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); yield_head_to_locked_owner(lock, val, paravirt); } + spin_end(); /* If we're the last queued, must clean up the tail. */ if ((val & _Q_TAIL_CPU_MASK) == tail) { @@ -450,6 +476,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b again: /* We're at the head of the waitqueue, wait for the lock. */ + spin_begin(); for (;;) { val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) @@ -465,6 +492,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b val |= _Q_MUST_Q_VAL; } } + spin_end(); /* If we're the last queued, must clean up the tail. */ if ((val & _Q_TAIL_CPU_MASK) == tail) { @@ -480,8 +508,13 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b unlock_next: /* contended path; must wait for next != NULL (MCS protocol) */ - while (!(next = READ_ONCE(node->next))) - cpu_relax(); + next = READ_ONCE(node->next); + if (!next) { + spin_begin(); + while (!(next = READ_ONCE(node->next))) + cpu_relax(); + spin_end(); + } /* * Unlock the next mcs waiter node. Release barrier is not required From patchwork Mon Nov 14 02:31:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703405 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=mINaLtMb; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YcG5Jxxz23nM for ; Mon, 14 Nov 2022 13:46:54 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YcG3gNpz3dtr for ; Mon, 14 Nov 2022 13:46:54 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=mINaLtMb; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::532; helo=mail-pg1-x532.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=mINaLtMb; dkim-atps=neutral Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YJ11FP1z3dvh for ; Mon, 14 Nov 2022 13:32:49 +1100 (AEDT) Received: by mail-pg1-x532.google.com with SMTP id o13so9114354pgu.7 for ; Sun, 13 Nov 2022 18:32:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+sPE9WOX7gvql4UktV0topV8AA5mTabSZc33TsfO13k=; b=mINaLtMbmHmX22/C0SYToPyvSnSzk7Q0FBLRHGZeq3IaRrnL3sBkGvi36vJc+cjKBn lbqgQR1s0j9QWZXYrzYGMXlTG2FHEGfyf1Hato/8Oh36FFRPdAiw4FZF8A3tvGcQ6AfY 3YNHMq/rky+WJFvyoO7MWTgTuXIXIR0INPxVdWQPAIcbxPIzBpAczG8tCvy3Zb5T8A/q ImT0YrTZEQJ3a+Vzo8dsVbvY36sV3CdJM1SA5RYg20OLdOkjGb6Onvt26pkkx7KLHOVn /kW31mNaPtovVLHLIKmlNPcAYy5PsWA6eIDP8PegqMx+GcZMpaZj/tUOTibUYdBz3zX1 qUjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+sPE9WOX7gvql4UktV0topV8AA5mTabSZc33TsfO13k=; b=DrILsDVf6KEZl4Z5vC1HFq6KNytcYJ9PjMbFgLXIB5B3uXidpTcY7eOSYVbPLiq3xb ID223Wo2uN2Egd+VntPMgepWn7suHLtcUNX+RwlobD5mB/ru2U0GDTxvjPlwai80MpnZ 6MJry4G/DujKyGA1JS7ykjZbtZGJl6iNW8M1pl5UOczML2QsbDkS8zFD/btW8kCDRwcS WVdkJRsFABNlI9p+5gDX24CwtypdG/Jh0PwlpFIHfYq4DtcdDh2tS6TLcaDrKWfzmhU3 lraPMUVGIeKEZ4jhyNtzVSo0HUxlOcTKKV525WlXc3g2zsrwnBCjui5G8iof4tv/D3wZ u1Zg== X-Gm-Message-State: ANoB5pkOh1ql5H30sJFpj+32piSX4ILZ1CJG0DTuS9o7I2989AZXX7Bm 80Sp+K4ezD25ijlm4x9ll1ysKZf2r20= X-Google-Smtp-Source: AA0mqf4sc4anC0OcXCmB2ByBsNcpwW41UFX4/3voNTztJlITC50Y9oPSUQZQybZ6d7AyrXrrFtGO5A== X-Received: by 2002:a62:fb10:0:b0:56b:a4f6:e030 with SMTP id x16-20020a62fb10000000b0056ba4f6e030mr12226689pfm.85.1668393168085; Sun, 13 Nov 2022 18:32:48 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:47 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 15/17] powerpc/qspinlock: reduce remote node steal spins Date: Mon, 14 Nov 2022 12:31:35 +1000 Message-Id: <20221114023137.2679627-17-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Allow for a reduction in the number of times a CPU from a different node than the owner can attempt to steal the lock before queueing. This could bias the transfer behaviour of the lock across the machine and reduce NUMA crossings. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 43 +++++++++++++++++++++++++++++++++--- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index ea8886e2922b..a1c832a52d26 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -4,6 +4,7 @@ #include #include #include +#include #include #include @@ -24,6 +25,7 @@ struct qnodes { /* Tuning parameters */ static int steal_spins __read_mostly = (1<<5); +static int remote_steal_spins __read_mostly = (1<<2); #if _Q_SPIN_TRY_LOCK_STEAL == 1 static const bool maybe_stealers = true; #else @@ -44,6 +46,11 @@ static __always_inline int get_steal_spins(bool paravirt) return steal_spins; } +static __always_inline int get_remote_steal_spins(bool paravirt) +{ + return remote_steal_spins; +} + static __always_inline int get_head_spins(bool paravirt) { return head_spins; @@ -354,10 +361,24 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * spin_cpu_relax(); } +static __always_inline bool steal_break(u32 val, int iters, bool paravirt) +{ + if (iters >= get_steal_spins(paravirt)) + return true; + + if (IS_ENABLED(CONFIG_NUMA) && + (iters >= get_remote_steal_spins(paravirt))) { + int cpu = get_owner_cpu(val); + if (numa_node_id() != cpu_to_node(cpu)) + return true; + } + return false; +} static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { int iters = 0; + u32 val; if (!maybe_stealers) return false; @@ -366,8 +387,7 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav spin_begin(); do { - u32 val = READ_ONCE(lock->val); - + val = READ_ONCE(lock->val); if (val & _Q_MUST_Q_VAL) break; @@ -381,7 +401,7 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav } iters++; - } while (iters < get_steal_spins(paravirt)); + } while (!steal_break(val, iters, paravirt)); spin_end(); @@ -606,6 +626,22 @@ static int steal_spins_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_steal_spins, steal_spins_get, steal_spins_set, "%llu\n"); +static int remote_steal_spins_set(void *data, u64 val) +{ + remote_steal_spins = val; + + return 0; +} + +static int remote_steal_spins_get(void *data, u64 *val) +{ + *val = remote_steal_spins; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_remote_steal_spins, remote_steal_spins_get, remote_steal_spins_set, "%llu\n"); + static int head_spins_set(void *data, u64 val) { head_spins = val; @@ -705,6 +741,7 @@ DEFINE_SIMPLE_ATTRIBUTE(fops_pv_prod_head, pv_prod_head_get, pv_prod_head_set, " static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); + debugfs_create_file("qspl_remote_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_remote_steal_spins); debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); From patchwork Mon Nov 14 02:31:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703407 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=JpU77Xmy; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YdH64mMz23nM for ; Mon, 14 Nov 2022 13:47:47 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YdH3VLKz3fLr for ; Mon, 14 Nov 2022 13:47:47 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=JpU77Xmy; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::42f; helo=mail-pf1-x42f.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=JpU77Xmy; dkim-atps=neutral Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YJ62PJlz3dvk for ; Mon, 14 Nov 2022 13:32:54 +1100 (AEDT) Received: by mail-pf1-x42f.google.com with SMTP id z26so9807950pff.1 for ; Sun, 13 Nov 2022 18:32:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KiLPUfBsTeXHutdkrj2F8sriNTR5yL64ZGAMnwHkXeg=; b=JpU77Xmy/ykcKlY+DR7clpDhtW6PWmX4/4xrIQ7PwGQ9Ik/2sy9YRbNSc06ORACOoF zvDG1saBBsK0dNfjWQoSnq8KJEChgWu88oKrfWR3ilgPPzY+12BZp60Y71mJef/bwWPV xBTtko8Y8z9BmTlN3zQHq7/p2hjAdiPhCV8UpmVRYLd5xQRaYVdzCwAOAOvOXKGvfs5w czJKanxhp8D9EoQNW/zfXEcaHg12dPyE5acqY/N1CSuniCSgRXXj3Z6NG9P24JfaPvmj 6VRv6lSi7rIE0A5dUHU5sZZWSWNmkOOpuqCpZnDD55kJWCfnIMGVmhNlnTD9BCpx9+dQ KCuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KiLPUfBsTeXHutdkrj2F8sriNTR5yL64ZGAMnwHkXeg=; b=NI6qrv+IrbxmMscLoR9ezcOXFdi77AQ+BItJ/ozS1opJfGaT9n7sLE7FHTa3qgBkE3 0jPijFDBtcfo5eE0Q++IgUWNTX1ywLesJiq0wUgUOo9G86EG0EDha4pBirw/Pg99j0tp l053Om4Q0mEvTfVjlymPVV/RSbWBUz9EbVeyQatlfkIQ+yPNJv/iltzvp0LqzUNZcw48 Vby/fOmDiwojazWbTEN8aOKRitWLNBmQ/9i0HAv1JQiv0MZlkaRAlRM96pVQy0/TGuY2 1pa3nY4jVk7w/EfxqghM1WU+6VNvqf95xB7gVC+XR1mJGdl0fJ1lJRRRi2Iq/KXtqpg9 WfmA== X-Gm-Message-State: ANoB5pk8xsj8wFXuHUjnuVK4CqbkWbCwhI05cKQWySb4pQDRlnbPgKKq QHSbwNN3cFWIu86UA1AS4xt1XZb8jmY= X-Google-Smtp-Source: AA0mqf4BmrOkmcShiXHd0dTlm080k+WXiurao04gCpuixVaecSsWS2bMEfAlyk+B9SL0GG20LS30VA== X-Received: by 2002:a05:6a00:1799:b0:558:991a:6671 with SMTP id s25-20020a056a00179900b00558991a6671mr12267220pfg.58.1668393171752; Sun, 13 Nov 2022 18:32:51 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:51 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 16/17] powerpc/qspinlock: allow indefinite spinning on a preempted owner Date: Mon, 14 Nov 2022 12:31:36 +1000 Message-Id: <20221114023137.2679627-18-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Provide an option that holds off queueing indefinitely while the lock owner is preempted. This could reduce queueing latencies for very overcommitted vcpu situations. This is disabled by default. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 74 ++++++++++++++++++++++++++++-------- 1 file changed, 59 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index a1c832a52d26..7e6ab1f30d50 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -35,6 +35,7 @@ static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; +static bool pv_spin_on_preempted_owner __read_mostly = false; static bool pv_yield_prev __read_mostly = true; static bool pv_yield_propagate_owner __read_mostly = true; static bool pv_prod_head __read_mostly = false; @@ -210,11 +211,12 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } -/* Called inside spin_begin() */ -static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) +/* Called inside spin_begin(). Returns whether or not the vCPU was preempted. */ +static __always_inline bool __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) { int owner; u32 yield_count; + bool preempted = false; BUG_ON(!(val & _Q_LOCKED_VAL)); @@ -232,6 +234,8 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 spin_end(); + preempted = true; + /* * Read the lock word after sampling the yield count. On the other side * there may a wmb because the yield count update is done by the @@ -248,29 +252,32 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 if (mustq) set_mustq(lock); spin_begin(); + /* Don't relax if we yielded. Maybe we should? */ - return; + return preempted; } spin_begin(); relax: spin_cpu_relax(); + + return preempted; } -/* Called inside spin_begin() */ -static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +/* Called inside spin_begin(). Returns whether or not the vCPU was preempted. */ +static __always_inline bool yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { - __yield_to_locked_owner(lock, val, paravirt, false); + return __yield_to_locked_owner(lock, val, paravirt, false); } -/* Called inside spin_begin() */ -static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +/* Called inside spin_begin(). Returns whether or not the vCPU was preempted. */ +static __always_inline bool yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { bool mustq = false; if ((val & _Q_MUST_Q_VAL) && pv_yield_allow_steal) mustq = true; - __yield_to_locked_owner(lock, val, paravirt, mustq); + return __yield_to_locked_owner(lock, val, paravirt, mustq); } static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int *set_yield_cpu, bool paravirt) @@ -380,13 +387,16 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav int iters = 0; u32 val; - if (!maybe_stealers) + if (!maybe_stealers) { + /* XXX: should spin_on_preempted_owner do anything here? */ return false; + } /* Attempt to steal the lock */ spin_begin(); - do { + bool preempted = false; + val = READ_ONCE(lock->val); if (val & _Q_MUST_Q_VAL) break; @@ -397,10 +407,23 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav return true; spin_begin(); } else { - yield_to_locked_owner(lock, val, paravirt); + preempted = yield_to_locked_owner(lock, val, paravirt); } - iters++; + if (preempted) { + if (!pv_spin_on_preempted_owner) + iters++; + /* + * pv_spin_on_preempted_owner don't increase iters + * while the owner is preempted -- we won't interfere + * with it by definition. This could introduce some + * latency issue if we continually observe preempted + * owners, but hopefully that's a rare corner case of + * a badly oversubscribed system. + */ + } else { + iters++; + } } while (!steal_break(val, iters, paravirt)); spin_end(); @@ -503,9 +526,13 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b break; propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); - yield_head_to_locked_owner(lock, val, paravirt); + if (yield_head_to_locked_owner(lock, val, paravirt)) { + if (!pv_spin_on_preempted_owner) + iters++; + } else { + iters++; + } - iters++; if (!mustq && iters >= get_head_spins(paravirt)) { mustq = true; set_mustq(lock); @@ -690,6 +717,22 @@ static int pv_yield_allow_steal_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_allow_steal, pv_yield_allow_steal_get, pv_yield_allow_steal_set, "%llu\n"); +static int pv_spin_on_preempted_owner_set(void *data, u64 val) +{ + pv_spin_on_preempted_owner = !!val; + + return 0; +} + +static int pv_spin_on_preempted_owner_get(void *data, u64 *val) +{ + *val = pv_spin_on_preempted_owner; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_spin_on_preempted_owner, pv_spin_on_preempted_owner_get, pv_spin_on_preempted_owner_set, "%llu\n"); + static int pv_yield_prev_set(void *data, u64 val) { pv_yield_prev = !!val; @@ -746,6 +789,7 @@ static __init int spinlock_debugfs_init(void) if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); + debugfs_create_file("qspl_pv_spin_on_preempted_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_spin_on_preempted_owner); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner); debugfs_create_file("qspl_pv_prod_head", 0600, arch_debugfs_dir, NULL, &fops_pv_prod_head); From patchwork Mon Nov 14 02:31:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1703408 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=FJdIuUm2; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N9YfJ4HyKz23nM for ; Mon, 14 Nov 2022 13:48:40 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4N9YfJ1KsYz3fTm for ; Mon, 14 Nov 2022 13:48:40 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=FJdIuUm2; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::432; helo=mail-pf1-x432.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=FJdIuUm2; dkim-atps=neutral Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4N9YJB5NfNz3cF6 for ; Mon, 14 Nov 2022 13:32:58 +1100 (AEDT) Received: by mail-pf1-x432.google.com with SMTP id q9so9785542pfg.5 for ; Sun, 13 Nov 2022 18:32:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=26jftw046NIpGN8HtAXLwE20A5xp/EJwP7MS7N5sDXc=; b=FJdIuUm2Tknb8xDw3L/VMLrCf2Exrf3E0gGrVxsH1rUcCKWLrXQAbOL8w/PJg5eLdX Pyx8mNTBKOn9SykpJfCYED+CcoQ4IytYujf2xslMC7kfZh5Ueq4AlURPlDMhpEpUPyGm RFiNo+nZDSPRwTbpeUkscT+MSTeNR9ThU2caIYksYF5FRvXhDJ4mVwxHVRoAqRC1F2h9 UaXCHrX0oLlfD4y9MQvyW0npTj4X52514rq5yi9Sgf+ysdPMUf2/J6JK9pX3VnR4xb/0 7RmoGZERPD/ofgD8qoP+AxTdfhB5KP7BXBiw+vMonJhfoyMY76GJ1DjbePJ1hLpM61eT shyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=26jftw046NIpGN8HtAXLwE20A5xp/EJwP7MS7N5sDXc=; b=AqqkfsueZZ+S1yhHJ9wEHBBB0D6pY4cBdmibuw4b8G7WOinehxO0aDHJd4BMBfwTT+ fXXC+lF2DUqSiSU0RA1yLU6ISKcsMU98PlQL5KBfl0jdmcVQcfUhR6+eaq8mbC7m8idp hEhHKgdvtItY62aqCZ9wRUgM0cLoqHpFZXjOnixoaEg3WZqiZcgDp79jELqV1cAKI+Am 8AIsEyu255hPmvtXRxWVaBjplbW68CTmEComXwKIs4455LadIs9+T5AxkTV4FLCPJW0+ oHfF1PYRgdKhx20mjg0m6v4PXE6TF5KuDEkLqfMrp2vmR3OOgfAcIAq/iHeo/pBXpMRX 9PIA== X-Gm-Message-State: ANoB5pli+S7/yF6o/0sxfUnC7ih1S4os46tjLpxrO2gb/9wPnhW7Iw4b gCYXxa/nAc4KzskxuwykRPtboB17204= X-Google-Smtp-Source: AA0mqf4ZLdjdvAUkuMbv+t8waHCW+sa8wmHjE1JTFKSQ1hXbcuAF0/A5CWlH8jHYE/jI9IV4MCOIjQ== X-Received: by 2002:a05:6a00:a13:b0:56c:14a5:2245 with SMTP id p19-20020a056a000a1300b0056c14a52245mr12171715pfh.12.1668393175247; Sun, 13 Nov 2022 18:32:55 -0800 (PST) Received: from bobo.ibm.com (14-200-187-164.tpgi.com.au. [14.200.187.164]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00186616b8fbasm5973655plc.10.2022.11.13.18.32.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 18:32:54 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v2 17/17] powerpc/qspinlock: provide accounting and options for sleepy locks Date: Mon, 14 Nov 2022 12:31:37 +1000 Message-Id: <20221114023137.2679627-19-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221114023137.2679627-1-npiggin@gmail.com> References: <20221114023137.2679627-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Finding the owner or a queued waiter on a lock with a preempted vcpu is indicative of an oversubscribed guest causing the lock to get into trouble. Provide some options to detect this situation and have new CPUs avoid queueing for a longer time (more steal iterations) to minimise the problems caused by vcpu preemption on the queue. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock_types.h | 7 +- arch/powerpc/lib/qspinlock.c | 244 +++++++++++++++++++-- 2 files changed, 232 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index 35f9525381e6..4fbcc8a4230b 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -30,7 +30,7 @@ typedef struct qspinlock { * * 0: locked bit * 1-14: lock holder cpu - * 15: unused bit + * 15: lock owner or queuer vcpus observed to be preempted bit * 16: must queue bit * 17-31: tail cpu (+1) */ @@ -49,6 +49,11 @@ typedef struct qspinlock { #error "qspinlock does not support such large CONFIG_NR_CPUS" #endif +#define _Q_SLEEPY_OFFSET 15 +#define _Q_SLEEPY_BITS 1 +#define _Q_SLEEPY_MASK _Q_SET_MASK(SLEEPY_OWNER) +#define _Q_SLEEPY_VAL (1U << _Q_SLEEPY_OFFSET) + #define _Q_MUST_Q_OFFSET 16 #define _Q_MUST_Q_BITS 1 #define _Q_MUST_Q_MASK _Q_SET_MASK(MUST_Q) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 7e6ab1f30d50..36afdfde41aa 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include @@ -36,25 +37,56 @@ static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; static bool pv_spin_on_preempted_owner __read_mostly = false; +static bool pv_sleepy_lock __read_mostly = true; +static bool pv_sleepy_lock_sticky __read_mostly = false; +static u64 pv_sleepy_lock_interval_ns __read_mostly = 0; +static int pv_sleepy_lock_factor __read_mostly = 256; static bool pv_yield_prev __read_mostly = true; static bool pv_yield_propagate_owner __read_mostly = true; static bool pv_prod_head __read_mostly = false; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); +static DEFINE_PER_CPU_ALIGNED(u64, sleepy_lock_seen_clock); -static __always_inline int get_steal_spins(bool paravirt) +static __always_inline bool recently_sleepy(void) { - return steal_spins; + /* pv_sleepy_lock is true when this is called */ + if (pv_sleepy_lock_interval_ns) { + u64 seen = this_cpu_read(sleepy_lock_seen_clock); + + if (seen) { + u64 delta = sched_clock() - seen; + if (delta < pv_sleepy_lock_interval_ns) + return true; + this_cpu_write(sleepy_lock_seen_clock, 0); + } + } + + return false; } -static __always_inline int get_remote_steal_spins(bool paravirt) +static __always_inline int get_steal_spins(bool paravirt, bool sleepy) { - return remote_steal_spins; + if (paravirt && sleepy) + return steal_spins * pv_sleepy_lock_factor; + else + return steal_spins; } -static __always_inline int get_head_spins(bool paravirt) +static __always_inline int get_remote_steal_spins(bool paravirt, bool sleepy) { - return head_spins; + if (paravirt && sleepy) + return remote_steal_spins * pv_sleepy_lock_factor; + else + return remote_steal_spins; +} + +static __always_inline int get_head_spins(bool paravirt, bool sleepy) +{ + if (paravirt && sleepy) + return head_spins * pv_sleepy_lock_factor; + else + return head_spins; } static inline u32 encode_tail_cpu(int cpu) @@ -187,6 +219,56 @@ static __always_inline u32 clear_mustq(struct qspinlock *lock) return prev; } +static __always_inline bool try_set_sleepy(struct qspinlock *lock, u32 old) +{ + u32 prev; + u32 new = old | _Q_SLEEPY_VAL; + + BUG_ON(!(old & _Q_LOCKED_VAL)); + BUG_ON(old & _Q_SLEEPY_VAL); + + asm volatile( +"1: lwarx %0,0,%1 # try_set_sleepy \n" +" cmpw 0,%0,%2 \n" +" bne- 2f \n" +" stwcx. %3,0,%1 \n" +" bne- 1b \n" +"2: \n" + : "=&r" (prev) + : "r" (&lock->val), "r"(old), "r" (new) + : "cr0", "memory"); + + return likely(prev == old); +} + +static __always_inline void seen_sleepy_owner(struct qspinlock *lock, u32 val) +{ + if (pv_sleepy_lock) { + if (pv_sleepy_lock_interval_ns) + this_cpu_write(sleepy_lock_seen_clock, sched_clock()); + if (!(val & _Q_SLEEPY_VAL)) + try_set_sleepy(lock, val); + } +} + +static __always_inline void seen_sleepy_lock(void) +{ + if (pv_sleepy_lock && pv_sleepy_lock_interval_ns) + this_cpu_write(sleepy_lock_seen_clock, sched_clock()); +} + +static __always_inline void seen_sleepy_node(struct qspinlock *lock, u32 val) +{ + if (pv_sleepy_lock) { + if (pv_sleepy_lock_interval_ns) + this_cpu_write(sleepy_lock_seen_clock, sched_clock()); + if (val & _Q_LOCKED_VAL) { + if (!(val & _Q_SLEEPY_VAL)) + try_set_sleepy(lock, val); + } + } +} + static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); @@ -234,6 +316,7 @@ static __always_inline bool __yield_to_locked_owner(struct qspinlock *lock, u32 spin_end(); + seen_sleepy_owner(lock, val); preempted = true; /* @@ -308,11 +391,12 @@ static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int } /* Called inside spin_begin() */ -static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) +static __always_inline bool yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); u32 yield_count; int yield_cpu; + bool preempted = false; if (!paravirt) goto relax; @@ -334,6 +418,9 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * spin_end(); + preempted = true; + seen_sleepy_node(lock, val); + smp_rmb(); if (yield_cpu == node->yield_cpu) { @@ -341,7 +428,7 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * node->next->yield_cpu = yield_cpu; yield_to_preempted(yield_cpu, yield_count); spin_begin(); - return; + return preempted; } spin_begin(); @@ -355,26 +442,31 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * spin_end(); + preempted = true; + seen_sleepy_node(lock, val); + smp_rmb(); /* See __yield_to_locked_owner comment */ if (!node->locked) { yield_to_preempted(prev_cpu, yield_count); spin_begin(); - return; + return preempted; } spin_begin(); relax: spin_cpu_relax(); + + return preempted; } -static __always_inline bool steal_break(u32 val, int iters, bool paravirt) +static __always_inline bool steal_break(u32 val, int iters, bool paravirt, bool sleepy) { - if (iters >= get_steal_spins(paravirt)) + if (iters >= get_steal_spins(paravirt, sleepy)) return true; if (IS_ENABLED(CONFIG_NUMA) && - (iters >= get_remote_steal_spins(paravirt))) { + (iters >= get_remote_steal_spins(paravirt, sleepy))) { int cpu = get_owner_cpu(val); if (numa_node_id() != cpu_to_node(cpu)) return true; @@ -384,6 +476,8 @@ static __always_inline bool steal_break(u32 val, int iters, bool paravirt) static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { + bool seen_preempted = false; + bool sleepy = false; int iters = 0; u32 val; @@ -410,7 +504,25 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav preempted = yield_to_locked_owner(lock, val, paravirt); } + if (paravirt && pv_sleepy_lock) { + if (!sleepy) { + if (val & _Q_SLEEPY_VAL) { + seen_sleepy_lock(); + sleepy = true; + } else if (recently_sleepy()) { + sleepy = true; + } + } + if (pv_sleepy_lock_sticky && seen_preempted && + !(val & _Q_SLEEPY_VAL)) { + if (try_set_sleepy(lock, val)) + val |= _Q_SLEEPY_VAL; + } + } + if (preempted) { + seen_preempted = true; + sleepy = true; if (!pv_spin_on_preempted_owner) iters++; /* @@ -424,7 +536,7 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav } else { iters++; } - } while (!steal_break(val, iters, paravirt)); + } while (!steal_break(val, iters, paravirt, sleepy)); spin_end(); @@ -436,6 +548,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b struct qnodes *qnodesp; struct qnode *next, *node; u32 val, old, tail; + bool seen_preempted = false; int idx; BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)); @@ -477,8 +590,10 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* Wait for mcs node lock to be released */ spin_begin(); - while (!node->locked) - yield_to_prev(lock, node, old, paravirt); + while (!node->locked) { + if (yield_to_prev(lock, node, old, paravirt)) + seen_preempted = true; + } spin_end(); /* Clear out stale propagated yield_cpu */ @@ -499,7 +614,8 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b break; propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); - yield_head_to_locked_owner(lock, val, paravirt); + if (yield_head_to_locked_owner(lock, val, paravirt)) + seen_preempted = true; } spin_end(); @@ -515,7 +631,9 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b } else { int set_yield_cpu = -1; int iters = 0; + bool sleepy = false; bool mustq = false; + bool preempted; again: /* We're at the head of the waitqueue, wait for the lock. */ @@ -525,15 +643,37 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; + if (paravirt && pv_sleepy_lock) { + if (!sleepy) { + if (val & _Q_SLEEPY_VAL) { + seen_sleepy_lock(); + sleepy = true; + } else if (recently_sleepy()) { + sleepy = true; + } + } + if (pv_sleepy_lock_sticky && seen_preempted && + !(val & _Q_SLEEPY_VAL)) { + if (try_set_sleepy(lock, val)) + val |= _Q_SLEEPY_VAL; + } + } + propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); - if (yield_head_to_locked_owner(lock, val, paravirt)) { + preempted = yield_head_to_locked_owner(lock, val, paravirt); + if (preempted) + seen_preempted = true; + + if (paravirt && preempted) { + sleepy = true; + if (!pv_spin_on_preempted_owner) iters++; } else { iters++; } - if (!mustq && iters >= get_head_spins(paravirt)) { + if (!mustq && iters >= get_head_spins(paravirt, sleepy)) { mustq = true; set_mustq(lock); val |= _Q_MUST_Q_VAL; @@ -733,6 +873,70 @@ static int pv_spin_on_preempted_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_spin_on_preempted_owner, pv_spin_on_preempted_owner_get, pv_spin_on_preempted_owner_set, "%llu\n"); +static int pv_sleepy_lock_set(void *data, u64 val) +{ + pv_sleepy_lock = !!val; + + return 0; +} + +static int pv_sleepy_lock_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock, pv_sleepy_lock_get, pv_sleepy_lock_set, "%llu\n"); + +static int pv_sleepy_lock_sticky_set(void *data, u64 val) +{ + pv_sleepy_lock_sticky = !!val; + + return 0; +} + +static int pv_sleepy_lock_sticky_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock_sticky; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock_sticky, pv_sleepy_lock_sticky_get, pv_sleepy_lock_sticky_set, "%llu\n"); + +static int pv_sleepy_lock_interval_ns_set(void *data, u64 val) +{ + pv_sleepy_lock_interval_ns = val; + + return 0; +} + +static int pv_sleepy_lock_interval_ns_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock_interval_ns; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock_interval_ns, pv_sleepy_lock_interval_ns_get, pv_sleepy_lock_interval_ns_set, "%llu\n"); + +static int pv_sleepy_lock_factor_set(void *data, u64 val) +{ + pv_sleepy_lock_factor = val; + + return 0; +} + +static int pv_sleepy_lock_factor_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock_factor; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock_factor, pv_sleepy_lock_factor_get, pv_sleepy_lock_factor_set, "%llu\n"); + static int pv_yield_prev_set(void *data, u64 val) { pv_yield_prev = !!val; @@ -790,6 +994,10 @@ static __init int spinlock_debugfs_init(void) debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_spin_on_preempted_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_spin_on_preempted_owner); + debugfs_create_file("qspl_pv_sleepy_lock", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock); + debugfs_create_file("qspl_pv_sleepy_lock_sticky", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock_sticky); + debugfs_create_file("qspl_pv_sleepy_lock_interval_ns", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock_interval_ns); + debugfs_create_file("qspl_pv_sleepy_lock_factor", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock_factor); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner); debugfs_create_file("qspl_pv_prod_head", 0600, arch_debugfs_dir, NULL, &fops_pv_prod_head);