From patchwork Tue Apr 17 09:11:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 899155 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40QKKc2S9Lz9s0x for ; Tue, 17 Apr 2018 19:13:08 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 40QKKc1Dr8zF1xg for ; Tue, 17 Apr 2018 19:13:08 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (ozlabs.org [IPv6:2401:3900:2:1::2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 40QKHr61llzF1pY for ; Tue, 17 Apr 2018 19:11:36 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPSA id 40QKHr3Cdgz9s0x; Tue, 17 Apr 2018 19:11:36 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au From: Alistair Popple To: linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au Subject: [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range Date: Tue, 17 Apr 2018 19:11:28 +1000 Message-Id: <20180417091129.23069-1-alistair@popple.id.au> X-Mailer: git-send-email 2.11.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Alistair Popple , mhairgrove@nvidia.com, arbab@linux.ibm.com Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" The NPU has a limited number of address translation shootdown (ATSD) registers and the GPU has limited bandwidth to process ATSDs. This can result in contention of ATSD registers leading to soft lockups on some threads, particularly when invalidating a large address range in pnv_npu2_mn_invalidate_range(). At some threshold it becomes more efficient to flush the entire GPU TLB for the given MM context (PID) than individually flushing each address in the range. This patch will result in ranges greater than 2MB being converted from 32+ ATSDs into a single ATSD which will flush the TLB for the given PID on each GPU. Signed-off-by: Alistair Popple Acked-by: Balbir Singh Tested-by: Balbir Singh --- arch/powerpc/platforms/powernv/npu-dma.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c index 94801d8e7894..dc34662e9df9 100644 --- a/arch/powerpc/platforms/powernv/npu-dma.c +++ b/arch/powerpc/platforms/powernv/npu-dma.c @@ -40,6 +40,13 @@ DEFINE_SPINLOCK(npu_context_lock); /* + * When an address shootdown range exceeds this threshold we invalidate the + * entire TLB on the GPU for the given PID rather than each specific address in + * the range. + */ +#define ATSD_THRESHOLD (2*1024*1024) + +/* * Other types of TCE cache invalidation are not functional in the * hardware. */ @@ -675,11 +682,19 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn, struct npu_context *npu_context = mn_to_npu_context(mn); unsigned long address; - for (address = start; address < end; address += PAGE_SIZE) - mmio_invalidate(npu_context, 1, address, false); + if (end - start > ATSD_THRESHOLD) { + /* + * Just invalidate the entire PID if the address range is too + * large. + */ + mmio_invalidate(npu_context, 0, 0, true); + } else { + for (address = start; address < end; address += PAGE_SIZE) + mmio_invalidate(npu_context, 1, address, false); - /* Do the flush only on the final addess == end */ - mmio_invalidate(npu_context, 1, address, true); + /* Do the flush only on the final addess == end */ + mmio_invalidate(npu_context, 1, address, true); + } } static const struct mmu_notifier_ops nv_nmmu_notifier_ops = { From patchwork Tue Apr 17 09:11:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 899157 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40QKMp1FmNz9s15 for ; Tue, 17 Apr 2018 19:15:02 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 40QKMn5k8LzF1wX for ; Tue, 17 Apr 2018 19:15:01 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 40QKHs2ND4zF1pY for ; Tue, 17 Apr 2018 19:11:37 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPSA id 40QKHr6zqPz9s15; Tue, 17 Apr 2018 19:11:36 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=popple.id.au From: Alistair Popple To: linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au Subject: [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold Date: Tue, 17 Apr 2018 19:11:29 +1000 Message-Id: <20180417091129.23069-2-alistair@popple.id.au> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20180417091129.23069-1-alistair@popple.id.au> References: <20180417091129.23069-1-alistair@popple.id.au> X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Alistair Popple , mhairgrove@nvidia.com, arbab@linux.ibm.com Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" The threshold at which it becomes more efficient to coalesce a range of ATSDs into a single per-PID ATSD is currently not well understood due to a lack of real-world work loads. This patch adds a debugfs parameter allowing the threshold to be altered at runtime in order to aid future development and refinement of the value. Signed-off-by: Alistair Popple Acked-by: Balbir Singh --- arch/powerpc/platforms/powernv/npu-dma.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c index dc34662e9df9..a765bf576c14 100644 --- a/arch/powerpc/platforms/powernv/npu-dma.c +++ b/arch/powerpc/platforms/powernv/npu-dma.c @@ -17,7 +17,9 @@ #include #include #include +#include +#include #include #include #include @@ -44,7 +46,8 @@ DEFINE_SPINLOCK(npu_context_lock); * entire TLB on the GPU for the given PID rather than each specific address in * the range. */ -#define ATSD_THRESHOLD (2*1024*1024) +static uint64_t atsd_threshold = 2 * 1024 * 1024; +static struct dentry *atsd_threshold_dentry; /* * Other types of TCE cache invalidation are not functional in the @@ -682,7 +685,7 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn, struct npu_context *npu_context = mn_to_npu_context(mn); unsigned long address; - if (end - start > ATSD_THRESHOLD) { + if (end - start > atsd_threshold) { /* * Just invalidate the entire PID if the address range is too * large. @@ -956,6 +959,11 @@ int pnv_npu2_init(struct pnv_phb *phb) static int npu_index; uint64_t rc = 0; + if (!atsd_threshold_dentry) { + atsd_threshold_dentry = debugfs_create_x64("atsd_threshold", + 0600, powerpc_debugfs_root, &atsd_threshold); + } + phb->npu.nmmu_flush = of_property_read_bool(phb->hose->dn, "ibm,nmmu-flush"); for_each_child_of_node(phb->hose->dn, dn) {