From patchwork Wed Sep 12 08:50:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andy Whitcroft X-Patchwork-Id: 968899 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 429Fqh1Jm1z9sBn; Wed, 12 Sep 2018 18:50:56 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1g00rJ-0002IX-Cv; Wed, 12 Sep 2018 08:50:49 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1g00rH-0002IF-W1 for kernel-team@lists.ubuntu.com; Wed, 12 Sep 2018 08:50:47 +0000 Received: from 1.general.apw.uk.vpn ([10.172.192.78] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1g00rH-0006ZO-NA; Wed, 12 Sep 2018 08:50:47 +0000 From: Andy Whitcroft To: kernel-team@lists.ubuntu.com Subject: [bionic/master-next 1/1] UBUNTU: SAUCE: vfio -- release device lock before userspace requests Date: Wed, 12 Sep 2018 09:50:46 +0100 Message-Id: <20180912085046.3401-2-apw@canonical.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180912085046.3401-1-apw@canonical.com> References: <20180912085046.3401-1-apw@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andy Whitcroft MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" During a hotplug event vfio_pci_remove() will call vfio_del_group_dev() to release the device group. This may trigger a userspace request. Currently this userspace request is performed while holding the device lock. This leads userspace to deadlock against it while trying to perform the requested cleanup. Drop the device lock while the userspace request is in flight. After it completes reaquire the lock and revalidate the device as it may have been successfully removed by a concurrent operation. As the remove callback may now drop the lock also check and revalidation at the end of that operation. BugLink: http://bugs.launchpad.net/bugs/1792099 Signed-off-by: Andy Whitcroft Acked-by: Stefan Bader Acked-by: Colin Ian King --- drivers/base/dd.c | 7 +++++++ drivers/vfio/vfio.c | 23 ++++++++++++++++++++++- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 2c964f56dafe..37c01054521b 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -868,6 +868,13 @@ static void __device_release_driver(struct device *dev, struct device *parent) dev->bus->remove(dev); else if (drv->remove) drv->remove(dev); + /* + * A concurrent invocation of the same function might + * have released the driver successfully while this one + * was waiting, so check for that. + */ + if (dev->driver != drv) + return; device_links_driver_cleanup(dev); dma_deconfigure(dev); diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 2bc3705a99bd..6ae4d3f432fe 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -934,6 +934,10 @@ void *vfio_del_group_dev(struct device *dev) unsigned int i = 0; long ret; bool interrupted = false; + bool locked = true; + struct device_driver *drv; + + drv = dev->driver; /* * The group exists so long as we have a device reference. Get @@ -974,8 +978,11 @@ void *vfio_del_group_dev(struct device *dev) if (!device) break; - if (device->ops->request) + if (device->ops->request) { + device_unlock(dev); + locked = false; device->ops->request(device_data, i++); + } vfio_device_put(device); @@ -994,6 +1001,20 @@ void *vfio_del_group_dev(struct device *dev) current->comm, task_pid_nr(current)); } } + + if (!locked) { + device_lock(dev); + locked = true; + /* + * A concurrent operation may have released the driver + * successfully while we had dropped the lock, + * check for that. + */ + if (dev->driver != drv) { + vfio_group_put(group); + return NULL; + } + } } while (ret <= 0); /*