From patchwork Mon Dec 8 08:17:55 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ian Munsie X-Patchwork-Id: 418632 X-Patchwork-Delegate: michael@ellerman.id.au Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id EBB471400DD for ; Mon, 8 Dec 2014 19:19:30 +1100 (AEDT) Received: from ozlabs.org (ozlabs.org [103.22.144.67]) by lists.ozlabs.org (Postfix) with ESMTP id D9F961A0CCF for ; Mon, 8 Dec 2014 19:19:30 +1100 (AEDT) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (ozlabs.org [IPv6:2401:3900:2:1::2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 26B921A06AF for ; Mon, 8 Dec 2014 19:18:41 +1100 (AEDT) Received: by ozlabs.org (Postfix) id 127D81400EA; Mon, 8 Dec 2014 19:18:41 +1100 (AEDT) Delivered-To: linuxppc-dev@ozlabs.org Received: from e23smtp09.au.ibm.com (e23smtp09.au.ibm.com [202.81.31.142]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id EA0EE1400DD for ; Mon, 8 Dec 2014 19:18:40 +1100 (AEDT) Received: from /spool/local by e23smtp09.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 8 Dec 2014 18:18:40 +1000 Received: from d23dlp02.au.ibm.com (202.81.31.213) by e23smtp09.au.ibm.com (202.81.31.206) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 8 Dec 2014 18:18:37 +1000 Received: from d23relay08.au.ibm.com (d23relay08.au.ibm.com [9.185.71.33]) by d23dlp02.au.ibm.com (Postfix) with ESMTP id A0B5B2BB0061; Mon, 8 Dec 2014 19:18:34 +1100 (EST) Received: from d23av02.au.ibm.com (d23av02.au.ibm.com [9.190.235.138]) by d23relay08.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id sB88IYaq33816610; Mon, 8 Dec 2014 19:18:34 +1100 Received: from d23av02.au.ibm.com (localhost [127.0.0.1]) by d23av02.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id sB88IWw3010370; Mon, 8 Dec 2014 19:18:34 +1100 Received: from ozlabs.au.ibm.com (ozlabs.au.ibm.com [9.192.253.14]) by d23av02.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id sB88IWOH010363; Mon, 8 Dec 2014 19:18:32 +1100 Received: from dukhat.ozlabs.ibm.com (haven.au.ibm.com [9.192.253.15]) (using TLSv1.2 with cipher AES128-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.au.ibm.com (Postfix) with ESMTPSA id 803CDA00A6; Mon, 8 Dec 2014 19:18:32 +1100 (AEDT) From: "Ian Munsie" To: mpe Subject: [PATCH 1/7] CXL: Change contexts_lock to a mutex to fix sleep while atomic bug Date: Mon, 8 Dec 2014 19:17:55 +1100 Message-Id: <1418026681-14787-1-git-send-email-imunsie@au.ibm.com> X-Mailer: git-send-email 2.1.3 X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14120808-0033-0000-0000-000000B05030 Cc: cbe-oss-dev , mikey , arnd , "Aneesh Kumar K.V" , greg , linux-kernel , linuxppc-dev , anton , imunsie , jk X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Ian Munsie We had a known sleep while atomic bug if a CXL device was forcefully unbound while it was in use. This could occur as a result of EEH, or manually induced with something like this while the device was in use: echo 0000:01:00.0 > /sys/bus/pci/drivers/cxl-pci/unbind The issue was that in this code path we iterated over each context and forcefully detached it with the contexts_lock spin lock held, however the detach also needed to take the spu_mutex, and call schedule. This patch changes the contexts_lock to a mutex so that we are not in atomic context while doing the detach, thereby avoiding the sleep while atomic. Also delete the related TODO comment, which suggested an alternate solution which turned out to not be workable. Signed-off-by: Ian Munsie --- drivers/misc/cxl/context.c | 15 ++++++++------- drivers/misc/cxl/cxl.h | 2 +- drivers/misc/cxl/native.c | 7 ------- drivers/misc/cxl/pci.c | 2 +- drivers/misc/cxl/sysfs.c | 10 +++++----- 5 files changed, 15 insertions(+), 21 deletions(-) diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c index cca4721..4aa31a3 100644 --- a/drivers/misc/cxl/context.c +++ b/drivers/misc/cxl/context.c @@ -82,12 +82,12 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master) * Allocating IDR! We better make sure everything's setup that * dereferences from it. */ + mutex_lock(&afu->contexts_lock); idr_preload(GFP_KERNEL); - spin_lock(&afu->contexts_lock); i = idr_alloc(&ctx->afu->contexts_idr, ctx, 0, ctx->afu->num_procs, GFP_NOWAIT); - spin_unlock(&afu->contexts_lock); idr_preload_end(); + mutex_unlock(&afu->contexts_lock); if (i < 0) return i; @@ -168,21 +168,22 @@ void cxl_context_detach_all(struct cxl_afu *afu) struct cxl_context *ctx; int tmp; - rcu_read_lock(); - idr_for_each_entry(&afu->contexts_idr, ctx, tmp) + mutex_lock(&afu->contexts_lock); + idr_for_each_entry(&afu->contexts_idr, ctx, tmp) { /* * Anything done in here needs to be setup before the IDR is * created and torn down after the IDR removed */ __detach_context(ctx); - rcu_read_unlock(); + } + mutex_unlock(&afu->contexts_lock); } void cxl_context_free(struct cxl_context *ctx) { - spin_lock(&ctx->afu->contexts_lock); + mutex_lock(&ctx->afu->contexts_lock); idr_remove(&ctx->afu->contexts_idr, ctx->pe); - spin_unlock(&ctx->afu->contexts_lock); + mutex_unlock(&ctx->afu->contexts_lock); synchronize_rcu(); free_page((u64)ctx->sstp); diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index b5b6bda..7c05239 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -351,7 +351,7 @@ struct cxl_afu { struct device *chardev_s, *chardev_m, *chardev_d; struct idr contexts_idr; struct dentry *debugfs; - spinlock_t contexts_lock; + struct mutex contexts_lock; struct mutex spa_mutex; spinlock_t afu_cntl_lock; diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c index 9a5a442..1001cf4 100644 --- a/drivers/misc/cxl/native.c +++ b/drivers/misc/cxl/native.c @@ -610,13 +610,6 @@ static inline int detach_process_native_dedicated(struct cxl_context *ctx) return 0; } -/* - * TODO: handle case when this is called inside a rcu_read_lock() which may - * happen when we unbind the driver (ie. cxl_context_detach_all()) . Terminate - * & remove use a mutex lock and schedule which will not good with lock held. - * May need to write do_process_element_cmd() that handles outstanding page - * faults synchronously. - */ static inline int detach_process_native_afu_directed(struct cxl_context *ctx) { if (!ctx->pe_inserted) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 10c98ab..0f2cc9f 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -502,7 +502,7 @@ static struct cxl_afu *cxl_alloc_afu(struct cxl *adapter, int slice) afu->dev.release = cxl_release_afu; afu->slice = slice; idr_init(&afu->contexts_idr); - spin_lock_init(&afu->contexts_lock); + mutex_init(&afu->contexts_lock); spin_lock_init(&afu->afu_cntl_lock); mutex_init(&afu->spa_mutex); diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c index ce7ec06..461bdbd 100644 --- a/drivers/misc/cxl/sysfs.c +++ b/drivers/misc/cxl/sysfs.c @@ -121,7 +121,7 @@ static ssize_t reset_store_afu(struct device *device, int rc; /* Not safe to reset if it is currently in use */ - spin_lock(&afu->contexts_lock); + mutex_lock(&afu->contexts_lock); if (!idr_is_empty(&afu->contexts_idr)) { rc = -EBUSY; goto err; @@ -132,7 +132,7 @@ static ssize_t reset_store_afu(struct device *device, rc = count; err: - spin_unlock(&afu->contexts_lock); + mutex_unlock(&afu->contexts_lock); return rc; } @@ -247,7 +247,7 @@ static ssize_t mode_store(struct device *device, struct device_attribute *attr, int rc = -EBUSY; /* can't change this if we have a user */ - spin_lock(&afu->contexts_lock); + mutex_lock(&afu->contexts_lock); if (!idr_is_empty(&afu->contexts_idr)) goto err; @@ -271,7 +271,7 @@ static ssize_t mode_store(struct device *device, struct device_attribute *attr, afu->current_mode = 0; afu->num_procs = 0; - spin_unlock(&afu->contexts_lock); + mutex_unlock(&afu->contexts_lock); if ((rc = _cxl_afu_deactivate_mode(afu, old_mode))) return rc; @@ -280,7 +280,7 @@ static ssize_t mode_store(struct device *device, struct device_attribute *attr, return count; err: - spin_unlock(&afu->contexts_lock); + mutex_unlock(&afu->contexts_lock); return rc; }