From patchwork Mon May 29 02:22:23 2017
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 768005
From: Nicholas Piggin
To: Linus Torvalds
Cc: linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-kernel@vger.kernel.org, Nicholas Piggin
Subject: [PATCH v2] spin loop primitives for busy waiting
Date: Mon, 29 May 2017 12:22:23 +1000
Message-Id: <20170529022223.14793-1-npiggin@gmail.com>
List-Id: Linux on PowerPC Developers Mail List

Current busy-wait loops are implemented by repeatedly calling
cpu_relax(), which gives the architecture a low-latency hook to reduce
power consumption and/or SMT resource contention.

This poses some difficulties for powerpc, which has SMT priority setting
instructions (priorities determine how ifetch cycles are apportioned).
powerpc's cpu_relax() is implemented by setting a low priority and then
setting normal priority. This has several problems:

 - Changing thread priority can have some execution cost and a potential
   impact on other threads in the core. It's inefficient to execute
   these instructions every time around a busy-wait loop.

 - Depending on implementation details, a `low ; medium` sequence may
   have little if any effect. Some software with a similar pattern
   actually inserts a lot of nops in between, in order to cause a few
   fetch cycles at the low priority.

 - The busy-wait loop itself still runs with regular priority. This
   might only be a few fetch cycles, but if there are several threads
   running such loops, they could cause a noticeable impact on a
   non-idle thread.

Implement spin_begin and spin_end primitives that can be used around
busy-wait loops, which default to no-ops.
Also add spin_cpu_relax, which defaults to cpu_relax.

This will allow architectures to hook the entry and exit of busy-wait
loops, and will allow powerpc to set low SMT priority at entry and
normal priority at exit.

Suggested-by: Linus Torvalds
Signed-off-by: Nicholas Piggin
---

Since last time:
- Fixed spin_do_cond with initial test as suggested by Linus.
- Renamed it to spin_until_cond, which reads a little better.

 include/linux/processor.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 include/linux/processor.h

diff --git a/include/linux/processor.h b/include/linux/processor.h
new file mode 100644
index 000000000000..da0c5e56ca02
--- /dev/null
+++ b/include/linux/processor.h
@@ -0,0 +1,70 @@
+/* Misc low level processor primitives */
+#ifndef _LINUX_PROCESSOR_H
+#define _LINUX_PROCESSOR_H
+
+#include <asm/processor.h>
+
+/*
+ * spin_begin is used before beginning a busy-wait loop, and must be paired
+ * with spin_end when the loop is exited. spin_cpu_relax must be called
+ * within the loop.
+ *
+ * The loop body should be as small and fast as possible, on the order of
+ * tens of instructions/cycles as a guide. It should avoid calling
+ * cpu_relax, or any "spin" or sleep type of primitive including nested uses
+ * of these primitives. It should not lock or take any other resource.
+ * Violations of these guidelines will not cause a bug, but may cause
+ * suboptimal performance.
+ *
+ * These loops are optimized to be used where wait times are expected to be
+ * less than the cost of a context switch (and associated overhead).
+ *
+ * Detection of resource owner and decision to spin or sleep or guest-yield
+ * (e.g., spin lock holder vcpu preempted, or mutex owner not on CPU) can be
+ * tested within the loop body.
+ */
+#ifndef spin_begin
+#define spin_begin()
+#endif
+
+#ifndef spin_cpu_relax
+#define spin_cpu_relax() cpu_relax()
+#endif
+
+/*
+ * spin_cpu_yield may be called to yield (undirected) to the hypervisor if
+ * necessary. This should be used if the wait is expected to take longer
+ * than context switch overhead, but we can't sleep or do a directed yield.
+ */
+#ifndef spin_cpu_yield
+#define spin_cpu_yield() cpu_relax_yield()
+#endif
+
+#ifndef spin_end
+#define spin_end()
+#endif
+
+/*
+ * spin_until_cond can be used to wait for a condition to become true. It
+ * may be expected that the first iteration will be true in the common case
+ * (no spinning), so that callers should not require a first "likely" test
+ * for the uncontended case before using this primitive.
+ *
+ * Usage and implementation guidelines are the same as for the spin_begin
+ * primitives, above.
+ */
+#ifndef spin_until_cond
+#define spin_until_cond(cond)					\
+do {								\
+	if (unlikely(!(cond))) {				\
+		spin_begin();					\
+		do {						\
+			spin_cpu_relax();			\
+		} while (!(cond));				\
+		spin_end();					\
+	}							\
+} while (0)
+
+#endif
+
+#endif /* _LINUX_PROCESSOR_H */
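
For illustration only (not part of the patch): a minimal user-space sketch
of how the primitives are meant to pair up. It assumes a GCC/Clang compiler
for __builtin_expect. The counting hooks, ready(), reset(),
wait_with_helper() and wait_open_coded() are all hypothetical names made up
for this sketch; a real arch override would change SMT priority rather than
bump counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's unlikely(); assumes a GCC/Clang builtin. */
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Counters make the hooks observable; a real arch hook would not count. */
int spin_begin_calls, spin_end_calls, relax_calls;

/* Hypothetical arch overrides (e.g. lower priority on begin, restore on end). */
#define spin_begin()		((void)spin_begin_calls++)
#define spin_end()		((void)spin_end_calls++)
#define spin_cpu_relax()	((void)relax_calls++)

/* Same shape as the spin_until_cond() in the patch. */
#define spin_until_cond(cond)					\
do {								\
	if (unlikely(!(cond))) {				\
		spin_begin();					\
		do {						\
			spin_cpu_relax();			\
		} while (!(cond));				\
		spin_end();					\
	}							\
} while (0)

/* Hypothetical condition: true once polls_left has counted down to zero. */
int polls_left;

bool ready(void)
{
	if (polls_left == 0)
		return true;
	polls_left--;
	return false;
}

void reset(int polls)
{
	polls_left = polls;
	spin_begin_calls = spin_end_calls = relax_calls = 0;
}

/* Wait using the helper; returns how many times spin_cpu_relax() ran. */
int wait_with_helper(int polls)
{
	reset(polls);
	spin_until_cond(ready());
	assert(spin_begin_calls == spin_end_calls);
	return relax_calls;
}

/*
 * Open-coded loop that gives up after max_spins, as a caller might do
 * before sleeping. Returns true if the condition was met while spinning.
 */
bool wait_open_coded(int polls, int max_spins)
{
	bool ok = true;

	reset(polls);
	spin_begin();
	while (!ready()) {
		if (relax_calls == max_spins) {
			ok = false;	/* caller would sleep or yield here */
			break;
		}
		spin_cpu_relax();
	}
	spin_end();	/* paired with spin_begin() on every exit path */
	return ok;
}
```

In this sketch wait_with_helper(0) never touches the hooks, which is the
uncontended fast path the spin_until_cond comment describes, while the
open-coded variant shows spin_end() being reached on both the success and
the give-up path.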