From patchwork Fri Oct 6 13:31:28 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822420 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r6n1hjpz9t3t for ; Sat, 7 Oct 2017 00:28:01 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752178AbdJFN17 (ORCPT ); Fri, 6 Oct 2017 09:27:59 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60072 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750950AbdJFN16 (ORCPT ); Fri, 6 Oct 2017 09:27:58 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B309D15A2; Fri, 6 Oct 2017 06:27:57 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D95CF3F578; Fri, 6 Oct 2017 06:27:52 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, 
okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 01/36] iommu: Keep track of processes and PASIDs Date: Fri, 6 Oct 2017 14:31:28 +0100 Message-Id: <20171006133203.22803-2-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org IOMMU drivers need a way to bind Linux processes to devices. This is used for Shared Virtual Memory (SVM), where devices support paging. In that mode, DMA can directly target virtual addresses of a process. Introduce boilerplate code for allocating process structures and binding them to devices. Four operations are added to IOMMU drivers: * process_alloc, process_free: to create an iommu_process structure and perform architecture-specific operations required to grab the process (for instance on ARM SMMU, pin down the CPU ASID). There is a single iommu_process structure per Linux process. * process_attach: attach a process to a device. The IOMMU driver checks that the device is capable of sharing an address space with this process, and writes the PASID table entry to install the process page directory. Some IOMMU drivers (e.g. ARM SMMU and virtio-iommu) will have a single PASID table per domain, for convenience. Others can implement it differently, but to help these drivers, process_attach and process_detach take a 'first' or 'last' parameter telling whether they need to install/remove the PASID entry or only send the required TLB invalidations. * process_detach: detach a process from a device. The IOMMU driver removes the PASID table entry and invalidates the IOTLBs. process_attach and process_detach operations are serialized with a spinlock. 
At the moment it is global, but if we try to optimize it, the core should at least prevent concurrent attach/detach on the same domain. (so multi-level PASID table code can allocate tables lazily without having to go through the io-pgtable concurrency nightmare). process_alloc can sleep, but process_free must not (because we'll have to call it from call_srcu.) At the moment we use an IDR for allocating PASIDs and retrieving contexts. We also use a single spinlock. These can be refined and optimized later (a custom allocator will be needed for top-down PASID allocation). Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Kconfig | 10 ++ drivers/iommu/Makefile | 1 + drivers/iommu/iommu-process.c | 225 ++++++++++++++++++++++++++++++++++++++++++ drivers/iommu/iommu.c | 1 + include/linux/iommu.h | 24 +++++ 5 files changed, 261 insertions(+) create mode 100644 drivers/iommu/iommu-process.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index f3a21343e636..1ea5c90e37be 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -74,6 +74,16 @@ config IOMMU_DMA select IOMMU_IOVA select NEED_SG_DMA_LENGTH +config IOMMU_PROCESS + bool "Process management API for the IOMMU" + select IOMMU_API + help + Enable process management for the IOMMU API. In systems that support + it, device drivers can bind processes to devices and share their page + tables using this API. + + If unsure, say N here. 
+ config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index b910aea813a1..a2832edbfaa2 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -1,6 +1,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o +obj-$(CONFIG_IOMMU_PROCESS) += iommu-process.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o diff --git a/drivers/iommu/iommu-process.c b/drivers/iommu/iommu-process.c new file mode 100644 index 000000000000..a7e5a1c94305 --- /dev/null +++ b/drivers/iommu/iommu-process.c @@ -0,0 +1,225 @@ +/* + * Track processes bound to devices + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Copyright (C) 2017 ARM Ltd. + * + * Author: Jean-Philippe Brucker + */ + +#include +#include +#include +#include + +/* Link between a domain and a process */ +struct iommu_context { + struct iommu_process *process; + struct iommu_domain *domain; + + struct list_head process_head; + struct list_head domain_head; + + /* Number of devices that use this context */ + refcount_t ref; +}; + +/* + * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is + * used for returning errors). 
In practice implementations will use at most 20 + * bits, which is the PCI limit. + */ +static DEFINE_IDR(iommu_process_idr); + +/* + * For the moment this is an all-purpose lock. It serializes + * access/modifications to contexts (process-domain links), access/modifications + * to the PASID IDR, and changes to process refcount as well. + */ +static DEFINE_SPINLOCK(iommu_process_lock); + +/* + * Allocate a iommu_process structure for the given task. + * + * Ideally we shouldn't need the domain parameter, since iommu_process is + * system-wide, but we use it to retrieve the driver's allocation ops and a + * PASID range. + */ +static struct iommu_process * +iommu_process_alloc(struct iommu_domain *domain, struct task_struct *task) +{ + int err; + int pasid; + struct iommu_process *process; + + if (WARN_ON(!domain->ops->process_alloc || !domain->ops->process_free)) + return ERR_PTR(-ENODEV); + + process = domain->ops->process_alloc(task); + if (IS_ERR(process)) + return process; + if (!process) + return ERR_PTR(-ENOMEM); + + process->pid = get_task_pid(task, PIDTYPE_PID); + process->release = domain->ops->process_free; + INIT_LIST_HEAD(&process->domains); + kref_init(&process->kref); + + if (!process->pid) { + err = -EINVAL; + goto err_free_process; + } + + idr_preload(GFP_KERNEL); + spin_lock(&iommu_process_lock); + pasid = idr_alloc_cyclic(&iommu_process_idr, process, domain->min_pasid, + domain->max_pasid + 1, GFP_ATOMIC); + process->pasid = pasid; + spin_unlock(&iommu_process_lock); + idr_preload_end(); + + if (pasid < 0) { + err = pasid; + goto err_put_pid; + } + + return process; + +err_put_pid: + put_pid(process->pid); + +err_free_process: + domain->ops->process_free(process); + + return ERR_PTR(err); +} + +static void iommu_process_release(struct kref *kref) +{ + struct iommu_process *process; + void (*release)(struct iommu_process *); + + assert_spin_locked(&iommu_process_lock); + + process = container_of(kref, struct iommu_process, kref); + release = 
process->release; + + WARN_ON(!list_empty(&process->domains)); + + idr_remove(&iommu_process_idr, process->pasid); + put_pid(process->pid); + release(process); +} + +/* + * Returns non-zero if a reference to the process was successfully taken. + * Returns zero if the process is being freed and should not be used. + */ +static int iommu_process_get_locked(struct iommu_process *process) +{ + assert_spin_locked(&iommu_process_lock); + + if (process) + return kref_get_unless_zero(&process->kref); + + return 0; +} + +static void iommu_process_put_locked(struct iommu_process *process) +{ + assert_spin_locked(&iommu_process_lock); + + kref_put(&process->kref, iommu_process_release); +} + +static int iommu_process_attach(struct iommu_domain *domain, struct device *dev, + struct iommu_process *process) +{ + int err; + int pasid = process->pasid; + struct iommu_context *context; + + if (WARN_ON(!domain->ops->process_attach || !domain->ops->process_detach)) + return -ENODEV; + + if (pasid > domain->max_pasid || pasid < domain->min_pasid) + return -ENOSPC; + + context = kzalloc(sizeof(*context), GFP_KERNEL); + if (!context) + return -ENOMEM; + + context->process = process; + context->domain = domain; + refcount_set(&context->ref, 1); + + spin_lock(&iommu_process_lock); + err = domain->ops->process_attach(domain, dev, process, true); + if (err) { + kfree(context); + spin_unlock(&iommu_process_lock); + return err; + } + + list_add(&context->process_head, &process->domains); + list_add(&context->domain_head, &domain->processes); + spin_unlock(&iommu_process_lock); + + return 0; +} + +static void iommu_context_free(struct iommu_context *context) +{ + assert_spin_locked(&iommu_process_lock); + + if (WARN_ON(!context->process || !context->domain)) + return; + + list_del(&context->process_head); + list_del(&context->domain_head); + iommu_process_put_locked(context->process); + + kfree(context); +} + +/* Attach an existing context to the device */ +static int 
iommu_process_attach_locked(struct iommu_context *context, + struct device *dev) +{ + assert_spin_locked(&iommu_process_lock); + + refcount_inc(&context->ref); + return context->domain->ops->process_attach(context->domain, dev, + context->process, false); +} + +/* Detach device from context and release it if necessary */ +static void iommu_process_detach_locked(struct iommu_context *context, + struct device *dev) +{ + bool last = false; + struct iommu_domain *domain = context->domain; + + assert_spin_locked(&iommu_process_lock); + + if (refcount_dec_and_test(&context->ref)) + last = true; + + domain->ops->process_detach(domain, dev, context->process, last); + + if (last) + iommu_context_free(context); +} diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3de5c0bcb5cc..b2b34cf7c978 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1264,6 +1264,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus, domain->type = type; /* Assume all sizes by default; the driver may override this later */ domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; + INIT_LIST_HEAD(&domain->processes); return domain; } diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 41b8c5757859..3978dc094706 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -94,6 +94,19 @@ struct iommu_domain { void *handler_token; struct iommu_domain_geometry geometry; void *iova_cookie; + + unsigned int min_pasid, max_pasid; + struct list_head processes; +}; + +struct iommu_process { + struct pid *pid; + int pasid; + struct list_head domains; + struct kref kref; + + /* Release callback for this process */ + void (*release)(struct iommu_process *process); }; enum iommu_cap { @@ -164,6 +177,11 @@ struct iommu_resv_region { * @domain_free: free iommu domain * @attach_dev: attach device to an iommu domain * @detach_dev: detach device from an iommu domain + * @process_alloc: allocate iommu process + * @process_free: free iommu process + * 
@process_attach: attach iommu process to a domain + * @process_detach: detach iommu process from a domain. Remove PASID entry and + * flush associated TLB entries. * @map: map a physically contiguous memory region to an iommu domain * @unmap: unmap a physically contiguous memory region from an iommu domain * @map_sg: map a scatter-gather list of physically contiguous memory chunks @@ -197,6 +215,12 @@ struct iommu_ops { int (*attach_dev)(struct iommu_domain *domain, struct device *dev); void (*detach_dev)(struct iommu_domain *domain, struct device *dev); + struct iommu_process *(*process_alloc)(struct task_struct *task); + void (*process_free)(struct iommu_process *process); + int (*process_attach)(struct iommu_domain *domain, struct device *dev, + struct iommu_process *process, bool first); + void (*process_detach)(struct iommu_domain *domain, struct device *dev, + struct iommu_process *process, bool last); int (*map)(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); size_t (*unmap)(struct iommu_domain *domain, unsigned long iova, 
From patchwork Fri Oct 6 13:31:29 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822421 From: Jean-Philippe Brucker Subject: [RFCv2 PATCH 02/36] iommu: Add a process_exit callback for device drivers Date: Fri, 6 Oct 2017 14:31:29 +0100 Message-Id: <20171006133203.22803-3-jean-philippe.brucker@arm.com> In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> 
When a process exits, we need to ensure that devices attached to it stop issuing transactions with its PASID. Let device drivers register a callback to be notified on process exit. 
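For illustration only, the contract described above can be sketched in plain userspace C. This is not kernel code: the struct definitions are reduced stand-ins for the kernel types, set_exit_handler() only mirrors the two assignments that iommu_set_process_exit_handler() performs (without the device-to-domain lookup), and my_dev_state, my_exit_handler and notify_exit are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only what the sketch needs. */
struct iommu_domain;
struct device { int unused; };

#define IOMMU_PROCESS_EXIT_ALL	(-1)
typedef int (*iommu_process_exit_handler_t)(struct iommu_domain *,
					    struct device *dev,
					    int pasid, void *);

struct iommu_domain {
	iommu_process_exit_handler_t process_exit;
	void *process_exit_token;
};

/* Hypothetical driver state: which PASIDs the device may still use. */
#define MY_NR_PASIDS	8
struct my_dev_state {
	int active[MY_NR_PASIDS];
};

/* Hypothetical exit handler: stop using one PASID, or all of them. */
static int my_exit_handler(struct iommu_domain *domain, struct device *dev,
			   int pasid, void *token)
{
	struct my_dev_state *state = token;
	int i;

	(void)domain;
	(void)dev;

	if (pasid == IOMMU_PROCESS_EXIT_ALL) {
		for (i = 0; i < MY_NR_PASIDS; i++)
			state->active[i] = 0;
		return 0;
	}

	if (pasid < 0 || pasid >= MY_NR_PASIDS)
		return -1;	/* a real driver would return -EINVAL */

	state->active[pasid] = 0;
	return 0;
}

/* Mirrors what the setter stores on the domain, minus the domain lookup. */
static void set_exit_handler(struct iommu_domain *domain,
			     iommu_process_exit_handler_t handler, void *token)
{
	domain->process_exit = handler;
	domain->process_exit_token = token;
}

/* Stand-in for the core telling the driver that a PASID is going away. */
static int notify_exit(struct iommu_domain *domain, struct device *dev,
		       int pasid)
{
	if (!domain->process_exit)
		return 0;

	return domain->process_exit(domain, dev, pasid,
				    domain->process_exit_token);
}
```

The point of the sketch is the ordering: once the handler has returned for a given PASID (or for IOMMU_PROCESS_EXIT_ALL), the driver must treat that PASID as unusable.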
At the moment the callback is set on the domain like the fault handler, because we don't have a structure available for IOMMU masters. This can become problematic if different devices in a domain are managed by distinct device drivers (for example multiple devices in the same group). The problem is the same for the fault handler, so we'll probably fix them all at once. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-process.c | 31 +++++++++++++++++++++++++++++++ include/linux/iommu.h | 19 +++++++++++++++++++ 2 files changed, 50 insertions(+) diff --git a/drivers/iommu/iommu-process.c b/drivers/iommu/iommu-process.c index a7e5a1c94305..61ca0bd707c0 100644 --- a/drivers/iommu/iommu-process.c +++ b/drivers/iommu/iommu-process.c @@ -223,3 +223,34 @@ static void iommu_process_detach_locked(struct iommu_context *context, if (last) iommu_context_free(context); } + +/** + * iommu_set_process_exit_handler() - set a callback for stopping the use of + * PASID in a device. + * @dev: the device + * @handler: exit handler + * @token: user data, will be passed back to the exit handler + * + * Users of the bind/unbind API should call this function to set a + * device-specific callback telling them when a process is exiting. + * + * After the callback returns, the device must not issue any more transactions + * with the PASIDs given as argument to the handler. It can be a single PASID + * value or the special IOMMU_PROCESS_EXIT_ALL. + * + * The handler itself should return 0 on success, and an appropriate error code + * otherwise. 
+ */ +void iommu_set_process_exit_handler(struct device *dev, + iommu_process_exit_handler_t handler, + void *token) +{ + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + + if (WARN_ON(!domain)) + return; + + domain->process_exit = handler; + domain->process_exit_token = token; +} +EXPORT_SYMBOL_GPL(iommu_set_process_exit_handler); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 3978dc094706..8d74f9058f30 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -56,6 +56,11 @@ struct notifier_block; typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); +/* All processes are being detached from this device */ +#define IOMMU_PROCESS_EXIT_ALL (-1) +typedef int (*iommu_process_exit_handler_t)(struct iommu_domain *, struct device *dev, + int pasid, void *); + struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ dma_addr_t aperture_end; /* Last address that can be mapped */ @@ -92,6 +97,8 @@ struct iommu_domain { unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */ iommu_fault_handler_t handler; void *handler_token; + iommu_process_exit_handler_t process_exit; + void *process_exit_token; struct iommu_domain_geometry geometry; void *iova_cookie; @@ -722,4 +729,16 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode) #endif /* CONFIG_IOMMU_API */ +#ifdef CONFIG_IOMMU_PROCESS +extern void iommu_set_process_exit_handler(struct device *dev, + iommu_process_exit_handler_t cb, + void *token); +#else /* CONFIG_IOMMU_PROCESS */ +static inline void iommu_set_process_exit_handler(struct device *dev, + iommu_process_exit_handler_t cb, + void *token) +{ +} +#endif /* CONFIG_IOMMU_PROCESS */ + #endif /* __LINUX_IOMMU_H */ 
From patchwork Fri Oct 6 13:31:30 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822422 From: Jean-Philippe Brucker Subject: [RFCv2 PATCH 03/36] iommu/process: Add public function to search for a process Date: Fri, 6 Oct 2017 14:31:30 +0100 Message-Id: <20171006133203.22803-4-jean-philippe.brucker@arm.com> In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> 
The fault handler will need to find a process given its PASID. This is the reason we have an IDR for storing processes, so hook it up. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-process.c | 35 +++++++++++++++++++++++++++++++++++ include/linux/iommu.h | 12 ++++++++++++ 2 files changed, 47 insertions(+) diff --git a/drivers/iommu/iommu-process.c b/drivers/iommu/iommu-process.c index 61ca0bd707c0..8f4c98632d58 100644 --- a/drivers/iommu/iommu-process.c +++ b/drivers/iommu/iommu-process.c @@ -145,6 +145,41 @@ static void iommu_process_put_locked(struct iommu_process *process) kref_put(&process->kref, iommu_process_release); } +/** + * iommu_process_put - Put reference to process, freeing it if necessary. + */ +void iommu_process_put(struct iommu_process *process) +{ + spin_lock(&iommu_process_lock); + iommu_process_put_locked(process); + spin_unlock(&iommu_process_lock); +} +EXPORT_SYMBOL_GPL(iommu_process_put); + +/** + * iommu_process_find - Find process associated to the given PASID + * + * Returns the IOMMU process corresponding to this PASID, or NULL if not found. + * A reference to the iommu_process is kept, and must be released with + * iommu_process_put. 
+ */ +struct iommu_process *iommu_process_find(int pasid) +{ + struct iommu_process *process; + + spin_lock(&iommu_process_lock); + process = idr_find(&iommu_process_idr, pasid); + if (process) { + if (!iommu_process_get_locked(process)) + /* kref is 0, process is defunct */ + process = NULL; + } + spin_unlock(&iommu_process_lock); + + return process; +} +EXPORT_SYMBOL_GPL(iommu_process_find); + static int iommu_process_attach(struct iommu_domain *domain, struct device *dev, struct iommu_process *process) { diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 8d74f9058f30..e9528fcacab1 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -733,12 +733,24 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode) extern void iommu_set_process_exit_handler(struct device *dev, iommu_process_exit_handler_t cb, void *token); +extern struct iommu_process *iommu_process_find(int pasid); +extern void iommu_process_put(struct iommu_process *process); + #else /* CONFIG_IOMMU_PROCESS */ static inline void iommu_set_process_exit_handler(struct device *dev, iommu_process_exit_handler_t cb, void *token) { } + +static inline struct iommu_process *iommu_process_find(int pasid) +{ + return NULL; +} + +static inline void iommu_process_put(struct iommu_process *process) +{ +} #endif /* CONFIG_IOMMU_PROCESS */ #endif /* __LINUX_IOMMU_H */ 
From patchwork Fri Oct 6 13:31:31 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822423 From: Jean-Philippe Brucker Subject: [RFCv2 PATCH 04/36] iommu/process: Track process changes with an mmu_notifier Date: Fri, 6 Oct 2017 14:31:31 +0100 Message-Id: <20171006133203.22803-5-jean-philippe.brucker@arm.com> In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> 
When creating an iommu_process structure, register a notifier to be informed of changes to the virtual address space and to know when the process exits. Two new operations are added to the IOMMU driver: * process_invalidate when a range of addresses is unmapped, to let the IOMMU driver send TLB invalidations. * process_exit when the mm is released. It's a bit more involved in this case, as the IOMMU driver has to tell all device drivers to stop using this PASID, then clear the PASID table and invalidate TLBs. Adding the notifier to the mix complicates process release. In one case device drivers free the process explicitly by calling unbind (or detaching the device). In the other case the process could crash before unbind, in which case the release notifier has to do all the work. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-process.c | 165 ++++++++++++++++++++++++++++++++++++++++-- include/linux/iommu.h | 12 +++ 2 files changed, 170 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/iommu-process.c b/drivers/iommu/iommu-process.c index 8f4c98632d58..1ef3f55b962b 100644 --- a/drivers/iommu/iommu-process.c +++ b/drivers/iommu/iommu-process.c @@ -21,9 +21,14 @@ #include #include +#include #include +#include #include +/* FIXME: stub for the fault queue. Remove later. */ +#define iommu_fault_queue_flush(...) + /* Link between a domain and a process */ struct iommu_context { struct iommu_process *process; @@ -50,6 +55,8 @@ static DEFINE_IDR(iommu_process_idr); */ static DEFINE_SPINLOCK(iommu_process_lock); +static struct mmu_notifier_ops iommu_process_mmu_notfier; + /* * Allocate a iommu_process structure for the given task. 
* @@ -74,15 +81,21 @@ iommu_process_alloc(struct iommu_domain *domain, struct task_struct *task) return ERR_PTR(-ENOMEM); process->pid = get_task_pid(task, PIDTYPE_PID); + process->mm = get_task_mm(task); + process->notifier.ops = &iommu_process_mmu_notfier; process->release = domain->ops->process_free; INIT_LIST_HEAD(&process->domains); - kref_init(&process->kref); if (!process->pid) { err = -EINVAL; goto err_free_process; } + if (!process->mm) { + err = -EINVAL; + goto err_put_pid; + } + idr_preload(GFP_KERNEL); spin_lock(&iommu_process_lock); pasid = idr_alloc_cyclic(&iommu_process_idr, process, domain->min_pasid, @@ -93,11 +106,44 @@ iommu_process_alloc(struct iommu_domain *domain, struct task_struct *task) if (pasid < 0) { err = pasid; - goto err_put_pid; + goto err_put_mm; } + err = mmu_notifier_register(&process->notifier, process->mm); + if (err) + goto err_free_pasid; + + /* + * Now that the MMU notifier is valid, we can allow users to grab this + * process by setting a valid refcount. Before that it was accessible in + * the IDR but invalid. + * + * Users of the process structure obtain it with inc_not_zero, which + * provides a control dependency to ensure that they don't modify the + * structure if they didn't acquire the ref. So I think we need a write + * barrier here to pair with that control dependency (XXX probably + * nonsense.) + */ + smp_wmb(); + kref_init(&process->kref); + + /* A mm_count reference is kept by the notifier */ + mmput(process->mm); + return process; +err_free_pasid: + /* + * Even if the process is accessible from the IDR at this point, kref is + * 0 so no user could get a reference to it. Free it manually. 
+ */ + spin_lock(&iommu_process_lock); + idr_remove(&iommu_process_idr, process->pasid); + spin_unlock(&iommu_process_lock); + +err_put_mm: + mmput(process->mm); + err_put_pid: put_pid(process->pid); @@ -107,21 +153,46 @@ iommu_process_alloc(struct iommu_domain *domain, struct task_struct *task) return ERR_PTR(err); } -static void iommu_process_release(struct kref *kref) +static void iommu_process_free(struct rcu_head *rcu) { struct iommu_process *process; void (*release)(struct iommu_process *); + process = container_of(rcu, struct iommu_process, rcu); + release = process->release; + + release(process); +} + +static void iommu_process_release(struct kref *kref) +{ + struct iommu_process *process; + assert_spin_locked(&iommu_process_lock); process = container_of(kref, struct iommu_process, kref); - release = process->release; - WARN_ON(!list_empty(&process->domains)); idr_remove(&iommu_process_idr, process->pasid); put_pid(process->pid); - release(process); + + /* + * If we're being released from process exit, the notifier callback + * ->release has already been called. Otherwise we don't need to go + * through there, the process isn't attached to anything anymore. Hence + * no_release. + */ + mmu_notifier_unregister_no_release(&process->notifier, process->mm); + + /* + * We can't free the structure here, because ->release might be + * attempting to grab it concurrently. And in the other case, if the + * structure is being released from within ->release, then + * __mmu_notifier_release expects to still have a valid mn when + * returning. So free the structure when it's safe, after the RCU grace + * period elapsed. 
+ */ + mmu_notifier_call_srcu(&process->rcu, iommu_process_free); } /* @@ -187,7 +258,8 @@ static int iommu_process_attach(struct iommu_domain *domain, struct device *dev, int pasid = process->pasid; struct iommu_context *context; - if (WARN_ON(!domain->ops->process_attach || !domain->ops->process_detach)) + if (WARN_ON(!domain->ops->process_attach || !domain->ops->process_detach || + !domain->ops->process_exit || !domain->ops->process_invalidate)) return -ENODEV; if (pasid > domain->max_pasid || pasid < domain->min_pasid) @@ -259,6 +331,85 @@ static void iommu_process_detach_locked(struct iommu_context *context, iommu_context_free(context); } +/* + * Called when the process exits. Might race with unbind or any other function + * dropping the last reference to the process. As the mmu notifier doesn't hold + * any reference to the process when calling ->release, try to take a reference. + */ +static void iommu_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) +{ + struct iommu_context *context, *next; + struct iommu_process *process = container_of(mn, struct iommu_process, notifier); + + /* + * If the process is exiting then domains are still attached to the + * process. A few things need to be done before it is safe to release + * + * 1) Tell the IOMMU driver to stop using this PASID (and forward the + * message to attached device drivers. It can then clear the PASID + * table and invalidate relevant TLBs. + * + * 2) Drop all references to this process, by freeing the contexts. + */ + spin_lock(&iommu_process_lock); + if (!iommu_process_get_locked(process)) { + /* Someone's already taking care of it. 
*/ + spin_unlock(&iommu_process_lock); + return; + } + + list_for_each_entry_safe(context, next, &process->domains, process_head) { + context->domain->ops->process_exit(context->domain, process); + iommu_context_free(context); + } + spin_unlock(&iommu_process_lock); + + iommu_fault_queue_flush(NULL); + + /* + * We're now reasonably certain that no more fault is being handled for + * this process, since we just flushed them all out of the fault queue. + * Release the last reference to free the process. + */ + iommu_process_put(process); +} + +static void iommu_notifier_invalidate_range(struct mmu_notifier *mn, struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + struct iommu_context *context; + struct iommu_process *process = container_of(mn, struct iommu_process, notifier); + + spin_lock(&iommu_process_lock); + list_for_each_entry(context, &process->domains, process_head) { + context->domain->ops->process_invalidate(context->domain, + process, start, end - start); + } + spin_unlock(&iommu_process_lock); +} + +static int iommu_notifier_clear_flush_young(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, + unsigned long end) +{ + iommu_notifier_invalidate_range(mn, mm, start, end); + return 0; +} + +static void iommu_notifier_change_pte(struct mmu_notifier *mn, struct mm_struct *mm, + unsigned long address, pte_t pte) +{ + iommu_notifier_invalidate_range(mn, mm, address, address + PAGE_SIZE); +} + +static struct mmu_notifier_ops iommu_process_mmu_notfier = { + .release = iommu_notifier_release, + .clear_flush_young = iommu_notifier_clear_flush_young, + .change_pte = iommu_notifier_change_pte, + .invalidate_range = iommu_notifier_invalidate_range, +}; + /** * iommu_set_process_exit_handler() - set a callback for stopping the use of * PASID in a device. 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h index e9528fcacab1..42b818437fa1 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -25,6 +25,7 @@ #include #include #include +#include #define IOMMU_READ (1 << 0) #define IOMMU_WRITE (1 << 1) @@ -111,9 +112,13 @@ struct iommu_process { int pasid; struct list_head domains; struct kref kref; + struct mmu_notifier notifier; + struct mm_struct *mm; /* Release callback for this process */ void (*release)(struct iommu_process *process); + /* For postponed release */ + struct rcu_head rcu; }; enum iommu_cap { @@ -189,6 +194,9 @@ struct iommu_resv_region { * @process_attach: attach iommu process to a domain * @process_detach: detach iommu process from a domain. Remove PASID entry and * flush associated TLB entries. + * @process_invalidate: Invalidate a range of mappings for a process. + * @process_exit: A process is exiting. Stop using the PASID, remove PASID entry + * and flush associated TLB entries. * @map: map a physically contiguous memory region to an iommu domain * @unmap: unmap a physically contiguous memory region from an iommu domain * @map_sg: map a scatter-gather list of physically contiguous memory chunks @@ -228,6 +236,10 @@ struct iommu_ops { struct iommu_process *process, bool first); void (*process_detach)(struct iommu_domain *domain, struct device *dev, struct iommu_process *process, bool last); + void (*process_invalidate)(struct iommu_domain *domain, + struct iommu_process *process, + unsigned long iova, size_t size); + void (*process_exit)(struct iommu_domain *domain, struct iommu_process *process); int (*map)(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); size_t (*unmap)(struct iommu_domain *domain, unsigned long iova, From patchwork Fri Oct 6 13:31:32 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822424 Return-Path: 
X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r786cg7z9t3t for ; Sat, 7 Oct 2017 00:28:20 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752215AbdJFN2T (ORCPT ); Fri, 6 Oct 2017 09:28:19 -0400 Received: from foss.arm.com ([217.140.101.70]:60210 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751875AbdJFN2S (ORCPT ); Fri, 6 Oct 2017 09:28:18 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 188161650; Fri, 6 Oct 2017 06:28:18 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 3F1BA3F578; Fri, 6 Oct 2017 06:28:13 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 05/36] iommu/process: Bind and unbind process to and 
from devices Date: Fri, 6 Oct 2017 14:31:32 +0100 Message-Id: <20171006133203.22803-6-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add bind and unbind operations to the IOMMU API. Device drivers can use them to share process page tables with their device. iommu_process_bind_group is provided for VFIO's convenience, as it needs to provide a coherent interface on containers. Device drivers will most likely want to use iommu_process_bind_device, which doesn't bind the whole group. PASIDs are de facto shared between all devices in a group (because of hardware weaknesses), but we don't do anything about it at the API level. Making bind_device call bind_group is probably the wrong way around, because it requires more work on our side for no benefit. We'd have to replay all binds each time a device is hotplugged into a group. But when a device is hotplugged into a group, the device driver will have to do a bind before using its PASID anyway and we can reject inconsistencies at that point. Concurrent calls to iommu_process_bind_device for the same process are not supported at the moment (they'll race on process_alloc which will only succeed for the first one; the others will have to retry the bind). I also don't support calling bind() on a dying process, not sure if it matters. 
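The all-or-nothing behaviour of iommu_process_bind_group can be pictured with a small userspace model. This is only a sketch, not kernel code: struct model_dev, model_bind_device, model_unbind_device and model_bind_group are hypothetical stand-ins for the group device list and the bind/unbind calls, and the PASID value is a fixed placeholder rather than an IDR allocation. It assumes the intent is that a failed bind on one device undoes the bonds already created on the devices before it.

```c
/* Userspace model of the bind-all-or-roll-back loop; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one device of an IOMMU group. */
struct model_dev {
	int supports_pasid;	/* 0 makes the bind fail for this device */
	int bound;		/* number of live bonds on this device */
};

/* Stand-in for iommu_process_bind_device(): fails on incapable devices. */
static int model_bind_device(struct model_dev *dev, int *pasid)
{
	if (!dev->supports_pasid)
		return -ENODEV;

	dev->bound++;
	*pasid = 1;	/* placeholder; the patch allocates it from an IDR */
	return 0;
}

/* Stand-in for iommu_process_unbind_device(). */
static void model_unbind_device(struct model_dev *dev)
{
	assert(dev->bound > 0);
	dev->bound--;
}

/*
 * Bind every device in the group. If one bind fails, walk back over the
 * devices bound so far and undo them, so the group is never left half bound.
 */
static int model_bind_group(struct model_dev *devs, size_t n, int *pasid)
{
	size_t i;
	int ret = -ENODEV;	/* an empty group is an error, as in the patch */

	if (!pasid)
		return -EINVAL;

	for (i = 0; i < n; i++) {
		ret = model_bind_device(&devs[i], pasid);
		if (ret)
			break;
	}

	if (ret) {
		/* devs[i] failed (or n == 0); only devs[0..i-1] hold a bond */
		while (i-- > 0)
			model_unbind_device(&devs[i]);
	}

	return ret;
}
```

In this model a group with one incapable device ends up with no bond at all, which is the property VFIO would rely on when exposing a single PASID per container.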
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-process.c | 165 ++++++++++++++++++++++++++++++++++++++++++ drivers/iommu/iommu.c | 64 ++++++++++++++++ include/linux/iommu.h | 41 +++++++++++ 3 files changed, 270 insertions(+) diff --git a/drivers/iommu/iommu-process.c b/drivers/iommu/iommu-process.c index 1ef3f55b962b..dee7691e3791 100644 --- a/drivers/iommu/iommu-process.c +++ b/drivers/iommu/iommu-process.c @@ -411,6 +411,171 @@ static struct mmu_notifier_ops iommu_process_mmu_notfier = { }; /** + * iommu_process_bind_device - Bind a process address space to a device + * @dev: the device + * @task: the process to bind + * @pasid: valid address where the PASID will be stored + * @flags: bond properties (IOMMU_PROCESS_BIND_*) + * + * Create a bond between device and task, allowing the device to access the + * process address space using the returned PASID. + * + * On success, 0 is returned and @pasid contains a valid ID. Otherwise, an error + * is returned. + */ +int iommu_process_bind_device(struct device *dev, struct task_struct *task, + int *pasid, int flags) +{ + int err, i; + int nesting; + struct pid *pid; + struct iommu_domain *domain; + struct iommu_process *process; + struct iommu_context *cur_context; + struct iommu_context *context = NULL; + + domain = iommu_get_domain_for_dev(dev); + if (WARN_ON(!domain)) + return -EINVAL; + + if (!iommu_domain_get_attr(domain, DOMAIN_ATTR_NESTING, &nesting) && + nesting) + return -EINVAL; + + pid = get_task_pid(task, PIDTYPE_PID); + if (!pid) + return -EINVAL; + + /* If an iommu_process already exists, use it */ + spin_lock(&iommu_process_lock); + idr_for_each_entry(&iommu_process_idr, process, i) { + if (process->pid != pid) + continue; + + if (!iommu_process_get_locked(process)) { + /* Process is defunct, create a new one */ + process = NULL; + break; + } + + /* Great, is it also bound to this domain? 
*/ + list_for_each_entry(cur_context, &process->domains, + process_head) { + if (cur_context->domain != domain) + continue; + + context = cur_context; + *pasid = process->pasid; + + /* Splendid, tell the driver and increase the ref */ + err = iommu_process_attach_locked(context, dev); + if (err) + iommu_process_put_locked(process); + + break; + } + break; + } + spin_unlock(&iommu_process_lock); + put_pid(pid); + + if (context) + return err; + + if (!process) { + process = iommu_process_alloc(domain, task); + if (IS_ERR(process)) + return PTR_ERR(process); + } + + err = iommu_process_attach(domain, dev, process); + if (err) { + iommu_process_put(process); + return err; + } + + *pasid = process->pasid; + + return 0; +} +EXPORT_SYMBOL_GPL(iommu_process_bind_device); + +/** + * iommu_process_unbind_device - Remove a bond created with + * iommu_process_bind_device. + * + * @dev: the device + * @pasid: the pasid returned by bind + */ +int iommu_process_unbind_device(struct device *dev, int pasid) +{ + struct iommu_domain *domain; + struct iommu_process *process; + struct iommu_context *cur_context; + struct iommu_context *context = NULL; + + domain = iommu_get_domain_for_dev(dev); + if (WARN_ON(!domain)) + return -EINVAL; + + /* + * Caller stopped the device from issuing PASIDs, now make sure they are + * out of the fault queue. + */ + iommu_fault_queue_flush(dev); + + spin_lock(&iommu_process_lock); + process = idr_find(&iommu_process_idr, pasid); + if (!process) { + spin_unlock(&iommu_process_lock); + return -ESRCH; + } + + list_for_each_entry(cur_context, &process->domains, process_head) { + if (cur_context->domain == domain) { + context = cur_context; + break; + } + } + + if (context) + iommu_process_detach_locked(context, dev); + spin_unlock(&iommu_process_lock); + + return context ? 0 : -ESRCH; +} +EXPORT_SYMBOL_GPL(iommu_process_unbind_device); + +/* + * __iommu_process_unbind_dev_all - Detach all processes attached to this + * device. 
+ * + * When detaching @device from @domain, IOMMU drivers have to use this function. + */ +void __iommu_process_unbind_dev_all(struct iommu_domain *domain, struct device *dev) +{ + struct iommu_context *context, *next; + + /* Ask device driver to stop using all PASIDs */ + spin_lock(&iommu_process_lock); + if (domain->process_exit) { + list_for_each_entry(context, &domain->processes, domain_head) + domain->process_exit(domain, dev, + context->process->pasid, + domain->process_exit_token); + } + spin_unlock(&iommu_process_lock); + + iommu_fault_queue_flush(dev); + + spin_lock(&iommu_process_lock); + list_for_each_entry_safe(context, next, &domain->processes, domain_head) + iommu_process_detach_locked(context, dev); + spin_unlock(&iommu_process_lock); +} +EXPORT_SYMBOL_GPL(__iommu_process_unbind_dev_all); + +/** * iommu_set_process_exit_handler() - set a callback for stopping the use of * PASID in a device. * @dev: the device diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index b2b34cf7c978..f9cb89dd28f5 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1460,6 +1460,70 @@ void iommu_detach_group(struct iommu_domain *domain, struct iommu_group *group) } EXPORT_SYMBOL_GPL(iommu_detach_group); +/* + * iommu_process_bind_group - Share process address space with all devices in + * the group. + * @group: the iommu group + * @task: the process to bind + * @pasid: valid address where the PASID will be stored + * @flags: bond properties (IOMMU_PROCESS_BIND_*) + * + * Create a bond between group and process, allowing devices in the group to + * access the process address space using @pasid. + * + * On success, 0 is returned and @pasid contains a valid ID. Otherwise, an error + * is returned. 
+ */ +int iommu_process_bind_group(struct iommu_group *group, + struct task_struct *task, int *pasid, int flags) +{ + struct group_device *device; + int ret = -ENODEV; + + if (!pasid) + return -EINVAL; + + if (!group->domain) + return -EINVAL; + + mutex_lock(&group->mutex); + list_for_each_entry(device, &group->devices, list) { + ret = iommu_process_bind_device(device->dev, task, pasid, + flags); + if (ret) + break; + } + + if (ret) { + list_for_each_entry_continue_reverse(device, &group->devices, list) + iommu_process_unbind_device(device->dev, *pasid); + } + mutex_unlock(&group->mutex); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_process_bind_group); + +/** + * iommu_process_unbind_group - Remove a bond created with + * iommu_process_bind_group + * + * @group: the group + * @pasid: the pasid returned by bind + */ +int iommu_process_unbind_group(struct iommu_group *group, int pasid) +{ + struct group_device *device; + + mutex_lock(&group->mutex); + list_for_each_entry(device, &group->devices, list) + iommu_process_unbind_device(device->dev, pasid); + mutex_unlock(&group->mutex); + + return 0; +} +EXPORT_SYMBOL_GPL(iommu_process_unbind_group); + phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova) { if (unlikely(domain->ops->iova_to_phys == NULL)) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 42b818437fa1..e64c2711ea8d 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -454,6 +454,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, void iommu_fwspec_free(struct device *dev); int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids); const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode); +extern int iommu_process_bind_group(struct iommu_group *group, + struct task_struct *task, int *pasid, + int flags); +extern int iommu_process_unbind_group(struct iommu_group *group, int pasid); #else /* CONFIG_IOMMU_API */ @@ -739,6 +743,19 @@ const struct 
iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode) return NULL; } +static inline int iommu_process_bind_group(struct iommu_group *group, + struct task_struct *task, int *pasid, + int flags) +{ + return -ENODEV; +} + +static inline int iommu_process_unbind_group(struct iommu_group *group, + int pasid) +{ + return -ENODEV; +} + #endif /* CONFIG_IOMMU_API */ #ifdef CONFIG_IOMMU_PROCESS @@ -747,6 +764,12 @@ extern void iommu_set_process_exit_handler(struct device *dev, void *token); extern struct iommu_process *iommu_process_find(int pasid); extern void iommu_process_put(struct iommu_process *process); +extern int iommu_process_bind_device(struct device *dev, + struct task_struct *task, int *pasid, + int flags); +extern int iommu_process_unbind_device(struct device *dev, int pasid); +extern void __iommu_process_unbind_dev_all(struct iommu_domain *domain, + struct device *dev); #else /* CONFIG_IOMMU_PROCESS */ static inline void iommu_set_process_exit_handler(struct device *dev, @@ -763,6 +786,24 @@ static inline struct iommu_process *iommu_process_find(int pasid) static inline void iommu_process_put(struct iommu_process *process) { } + +static inline int iommu_process_bind_device(struct device *dev, + struct task_struct *task, + int *pasid, int flags) +{ + return -ENODEV; +} + +static inline int iommu_process_unbind_device(struct device *dev, int pasid) +{ + return -ENODEV; +} + +static inline void __iommu_process_unbind_dev_all(struct iommu_domain *domain, + struct device *dev) +{ +} + #endif /* CONFIG_IOMMU_PROCESS */ #endif /* __LINUX_IOMMU_H */ From patchwork Fri Oct 6 13:31:33 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822426 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org 
(client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r7G1lHWz9t3t for ; Sat, 7 Oct 2017 00:28:26 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752226AbdJFN2Y (ORCPT ); Fri, 6 Oct 2017 09:28:24 -0400 Received: from foss.arm.com ([217.140.101.70]:60266 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752200AbdJFN2X (ORCPT ); Fri, 6 Oct 2017 09:28:23 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 307B81435; Fri, 6 Oct 2017 06:28:23 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 55CD53F578; Fri, 6 Oct 2017 06:28:18 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 06/36] iommu: Extend fault reporting Date: Fri, 6 Oct 2017 14:31:33 +0100 Message-Id: <20171006133203.22803-7-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: 
<20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org

A number of new users will need additional information in the IOMMU fault
report, such as PASID and/or PRI group. Pass a new iommu_fault structure to
the driver callbacks.

For the moment add the new API in parallel, with an "ext" prefix, to let
users move to the new API at their own pace. I think it would be nice to use
a single API though. There are only 4 device drivers using it, and receiving
an iommu_fault instead of iova/flags wouldn't hurt them much.

For the same reason as the process_exit handler, set_fault_handler is done
on a device rather than a domain (although for the moment stored in the
domain). Even when multiple heterogeneous devices are in the same IOMMU
group, each of their drivers might want to register a fault handler. At the
moment they'll race to set the handler, and the winning driver will receive
fault reports from other devices.

The new registering function also takes flags as arguments, giving future
users a way to specify at which point of the fault process they want to be
called.

Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu.c | 42 ++++++++++++++++++++++++++++++++++++++++++ include/linux/iommu.h | 18 ++++++++++++++++++ 2 files changed, 60 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index f9cb89dd28f5..ee956b5fc301 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1234,6 +1234,8 @@ EXPORT_SYMBOL_GPL(iommu_capable); * This function should be used by IOMMU users which want to be notified * whenever an IOMMU fault happens. * + * Note that new users should use iommu_set_ext_fault_handler instead. + * * The fault handler itself should return 0 on success, and an appropriate * error code otherwise.
*/ @@ -1243,11 +1245,44 @@ void iommu_set_fault_handler(struct iommu_domain *domain, { BUG_ON(!domain); + if (WARN_ON(domain->ext_handler)) + return; + domain->handler = handler; domain->handler_token = token; } EXPORT_SYMBOL_GPL(iommu_set_fault_handler); +/** + * iommu_set_ext_fault_handler() - set a fault handler for a device + * @dev: the device + * @handler: fault handler + * @token: user data, will be passed back to the fault handler + * @flags: IOMMU_FAULT_HANDLER_* parameters. + * + * This function should be used by IOMMU users which want to be notified + * whenever an IOMMU fault happens. + * + * The fault handler itself should return 0 on success, and an appropriate + * error code otherwise. + */ +void iommu_set_ext_fault_handler(struct device *dev, + iommu_ext_fault_handler_t handler, + void *token, int flags) +{ + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + + if (WARN_ON(!domain)) + return; + + if (WARN_ON(domain->handler || domain->ext_handler)) + return; + + domain->ext_handler = handler; + domain->handler_token = token; +} +EXPORT_SYMBOL_GPL(iommu_set_ext_fault_handler); + static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus, unsigned type) { @@ -1787,6 +1822,10 @@ int report_iommu_fault(struct iommu_domain *domain, struct device *dev, unsigned long iova, int flags) { int ret = -ENOSYS; + struct iommu_fault fault = { + .address = iova, + .flags = flags, + }; /* * if upper layers showed interest and installed a fault handler, @@ -1795,6 +1834,9 @@ int report_iommu_fault(struct iommu_domain *domain, struct device *dev, if (domain->handler) ret = domain->handler(domain, dev, iova, flags, domain->handler_token); + else if (domain->ext_handler) + ret = domain->ext_handler(domain, dev, &fault, + domain->handler_token); trace_io_page_fault(dev, iova, flags); return ret; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index e64c2711ea8d..ea4eaf585eb4 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h 
@@ -57,6 +57,14 @@ struct notifier_block; typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); +struct iommu_fault { + unsigned long address; + unsigned int flags; +}; + +typedef int (*iommu_ext_fault_handler_t)(struct iommu_domain *, struct device *, + struct iommu_fault *, void *); + /* All process are being detached from this device */ #define IOMMU_PROCESS_EXIT_ALL (-1) typedef int (*iommu_process_exit_handler_t)(struct iommu_domain *, struct device *dev, @@ -97,6 +105,7 @@ struct iommu_domain { const struct iommu_ops *ops; unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */ iommu_fault_handler_t handler; + iommu_ext_fault_handler_t ext_handler; void *handler_token; iommu_process_exit_handler_t process_exit; void *process_exit_token; @@ -352,6 +361,9 @@ extern size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long io extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova); extern void iommu_set_fault_handler(struct iommu_domain *domain, iommu_fault_handler_t handler, void *token); +extern void iommu_set_ext_fault_handler(struct device *dev, + iommu_ext_fault_handler_t handler, + void *token, int flags); extern void iommu_get_resv_regions(struct device *dev, struct list_head *list); extern void iommu_put_resv_regions(struct device *dev, struct list_head *list); @@ -566,6 +578,12 @@ static inline void iommu_set_fault_handler(struct iommu_domain *domain, { } +static inline void iommu_set_ext_fault_handler(struct device *dev, + iommu_ext_fault_handler_t handler, void *token, + int flags) +{ +} + static inline void iommu_get_resv_regions(struct device *dev, struct list_head *list) { From patchwork Fri Oct 6 13:31:34 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822428 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r7P5dDdz9t34 for ; Sat, 7 Oct 2017 00:28:33 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752249AbdJFN23 (ORCPT ); Fri, 6 Oct 2017 09:28:29 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60296 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751981AbdJFN22 (ORCPT ); Fri, 6 Oct 2017 09:28:28 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6A89E15A2; Fri, 6 Oct 2017 06:28:28 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 6DF813F578; Fri, 6 Oct 2017 06:28:23 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 07/36] iommu: Add a fault handler Date: Fri, 6 Oct 2017 14:31:34 +0100 Message-Id: 
<20171006133203.22803-8-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org

Some systems allow devices to do paging, for example systems supporting
PCI's PRI extension or ARM SMMU's stall model. As more IOMMU drivers are
adding support for page faults, we see a number of patterns that are common
to all implementations. Let's try to unify some of the generic code.

Add boilerplate code to handle device page requests:

* An IOMMU driver instantiates a fault workqueue if necessary, using
  iommu_fault_queue_init and iommu_fault_queue_destroy.

* When it receives a fault report, presumably in an IRQ handler, the IOMMU
  driver reports the fault using handle_iommu_fault (as opposed to the
  current report_iommu_fault).

* Then depending on the domain configuration, we either immediately forward
  it to a device driver, or submit it to the fault queue, to be handled in a
  thread.

* When the fault corresponds to a process context, call the mm fault handler
  on it (in the next patch).

* Once the fault is handled, it is completed. This is either done
  automatically by the mm wrapper, or manually by a device driver (e.g.
  VFIO).

A new operation, fault_response, is added to IOMMU drivers. It takes the
same fault context passed to handle_iommu_fault and a status, allowing the
driver to complete the fault, for instance by sending a PRG Response in PCI
PRI.
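The grouping rule used by the fault queue can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the code in io-pgfault.c: the MODEL_* flags and status values, struct model_fault, struct model_queue and model_report_fault are all hypothetical names, the group is handled synchronously instead of being pushed to a workqueue, and faults are matched on the group id only (the patch also compares the device). It shows three properties: non-last faults of a group are parked, the last fault pulls in its parked siblings, and the first error is sticky so the group gets exactly one response.

```c
/* Userspace model of fault-group queueing; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_FAULT_GROUP	(1 << 0)	/* fault belongs to a group */
#define MODEL_FAULT_LAST	(1 << 1)	/* last fault of its group */
#define MODEL_MAX_PENDING	16

/* Hypothetical status values; the real ones live in include/linux/iommu.h */
enum { MODEL_HANDLED = 0, MODEL_INVALID = 1, MODEL_IGNORE = 2 };

struct model_fault {
	unsigned int flags;
	unsigned int id;	/* group identifier */
	int valid_addr;		/* whether paging in this address would succeed */
};

struct model_queue {
	struct model_fault pending[MODEL_MAX_PENDING];	/* parked partial faults */
	size_t nr_pending;
	int responses;		/* number of responses sent to the device */
};

static int model_handle_single(const struct model_fault *f)
{
	return f->valid_addr ? MODEL_HANDLED : MODEL_INVALID;
}

static int model_report_fault(struct model_queue *q, const struct model_fault *f)
{
	int status = MODEL_HANDLED;
	size_t i, kept = 0;

	assert(q && f);

	/* Non-last request of a group: park it until the last one arrives */
	if ((f->flags & MODEL_FAULT_GROUP) && !(f->flags & MODEL_FAULT_LAST)) {
		if (q->nr_pending == MODEL_MAX_PENDING)
			return -ENOMEM;
		q->pending[q->nr_pending++] = *f;
		return MODEL_IGNORE;
	}

	if (f->flags & MODEL_FAULT_GROUP) {
		/* Pull the parked faults of this group, keep the others */
		for (i = 0; i < q->nr_pending; i++) {
			if (q->pending[i].id != f->id) {
				q->pending[kept++] = q->pending[i];
				continue;
			}
			/* Errors are sticky: stop handling after a failure */
			if (status == MODEL_HANDLED)
				status = model_handle_single(&q->pending[i]);
		}
		q->nr_pending = kept;
	}

	if (status == MODEL_HANDLED)
		status = model_handle_single(f);

	/* One response for the whole group (or for the lone fault) */
	q->responses++;
	return status;
}
```

Keeping a single response per group mirrors what a PRG Response does in PCI PRI, where the device expects one completion for the whole page request group rather than one per request.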
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Kconfig | 9 ++ drivers/iommu/Makefile | 1 + drivers/iommu/io-pgfault.c | 330 ++++++++++++++++++++++++++++++++++++++++++ drivers/iommu/iommu-process.c | 3 - include/linux/iommu.h | 102 ++++++++++++- 5 files changed, 440 insertions(+), 5 deletions(-) create mode 100644 drivers/iommu/io-pgfault.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 1ea5c90e37be..a34d268d8ed3 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -84,6 +84,15 @@ config IOMMU_PROCESS If unsure, say N here. +config IOMMU_FAULT + bool "Fault handler for the IOMMU API" + select IOMMU_API + help + Enable the generic fault handler for the IOMMU API, that handles + recoverable page faults or inject them into guests. + + If unsure, say N here. + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index a2832edbfaa2..c34cbea482f0 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o obj-$(CONFIG_IOMMU_PROCESS) += iommu-process.o +obj-$(CONFIG_IOMMU_FAULT) += io-pgfault.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c new file mode 100644 index 000000000000..f31bc24534b0 --- /dev/null +++ b/drivers/iommu/io-pgfault.c @@ -0,0 +1,330 @@ +/* + * Handle device page faults + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Copyright (C) 2017 ARM Ltd. + * + * Author: Jean-Philippe Brucker + */ + +#include +#include +#include +#include + +static struct workqueue_struct *iommu_fault_queue; +static DECLARE_RWSEM(iommu_fault_queue_sem); +static refcount_t iommu_fault_queue_refs = REFCOUNT_INIT(0); +static BLOCKING_NOTIFIER_HEAD(iommu_fault_queue_flush_notifiers); + +/* Used to store incomplete fault groups */ +static LIST_HEAD(iommu_partial_faults); +static DEFINE_SPINLOCK(iommu_partial_faults_lock); + +struct iommu_fault_context { + struct iommu_domain *domain; + struct device *dev; + struct iommu_fault params; + struct list_head head; +}; + +struct iommu_fault_group { + struct list_head faults; + struct work_struct work; +}; + +/* + * iommu_fault_finish - Finish handling a fault + * + * Send a response if necessary and pass on the sanitized status code + */ +static int iommu_fault_finish(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *fault, int status) +{ + /* + * There is no "handling" an unrecoverable fault, so the only valid + * return values are 0 or an error. + */ + if (!(fault->flags & IOMMU_FAULT_RECOVERABLE)) + return status > 0 ? 0 : status; + + /* Device driver took ownership of the fault and will complete it later */ + if (status == IOMMU_FAULT_STATUS_IGNORE) + return 0; + + /* + * There was an internal error with handling the recoverable fault (e.g. + * OOM or no handler). Try to complete the fault if possible. 
+ */ + if (status <= 0) + status = IOMMU_FAULT_STATUS_INVALID; + + if (WARN_ON(!domain->ops->fault_response)) + /* + * The IOMMU driver shouldn't have submitted recoverable faults + * if it cannot receive a response. + */ + return -EINVAL; + + return domain->ops->fault_response(domain, dev, fault, status); +} + +static int iommu_fault_handle_single(struct iommu_fault_context *fault) +{ + /* TODO */ + return -ENODEV; +} + +static void iommu_fault_handle_group(struct work_struct *work) +{ + struct iommu_fault_group *group; + struct iommu_fault_context *fault, *next; + int status = IOMMU_FAULT_STATUS_HANDLED; + + group = container_of(work, struct iommu_fault_group, work); + + list_for_each_entry_safe(fault, next, &group->faults, head) { + struct iommu_fault *params = &fault->params; + /* + * Errors are sticky: don't handle subsequent faults in the + * group if there is an error. + */ + if (status == IOMMU_FAULT_STATUS_HANDLED) + status = iommu_fault_handle_single(fault); + + if (params->flags & IOMMU_FAULT_LAST || + !(params->flags & IOMMU_FAULT_GROUP)) { + iommu_fault_finish(fault->domain, fault->dev, + &fault->params, status); + } + + kfree(fault); + } + + kfree(group); +} + +static int iommu_queue_fault(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *params) +{ + struct iommu_fault_group *group; + struct iommu_fault_context *fault = kzalloc(sizeof(*fault), GFP_KERNEL); + + /* + * FIXME There is a race here, with queue_register. The last IOMMU + * driver has to ensure no fault is reported anymore before + * unregistering, so that doesn't matter. But you could have an IOMMU + * device that didn't register to the fault queue and is still reporting + * faults while the last queue user disappears. It really shouldn't get + * here, but it currently does if there is a blocking handler. 
+ */ + if (!iommu_fault_queue) + return -ENOSYS; + + if (!fault) + return -ENOMEM; + + fault->dev = dev; + fault->domain = domain; + fault->params = *params; + + if ((params->flags & IOMMU_FAULT_LAST) || !(params->flags & IOMMU_FAULT_GROUP)) { + group = kzalloc(sizeof(*group), GFP_KERNEL); + if (!group) { + kfree(fault); + return -ENOMEM; + } + + INIT_LIST_HEAD(&group->faults); + list_add(&fault->head, &group->faults); + INIT_WORK(&group->work, iommu_fault_handle_group); + } else { + /* Non-last request of a group. Postpone until the last one */ + spin_lock(&iommu_partial_faults_lock); + list_add(&fault->head, &iommu_partial_faults); + spin_unlock(&iommu_partial_faults_lock); + + return IOMMU_FAULT_STATUS_IGNORE; + } + + if (params->flags & IOMMU_FAULT_GROUP) { + struct iommu_fault_context *cur, *next; + + /* See if we have pending faults for this group */ + spin_lock(&iommu_partial_faults_lock); + list_for_each_entry_safe(cur, next, &iommu_partial_faults, head) { + if (cur->params.id == params->id && cur->dev == dev) { + list_del(&cur->head); + /* Insert *before* the last fault */ + list_add(&cur->head, &group->faults); + } + } + spin_unlock(&iommu_partial_faults_lock); + } + + queue_work(iommu_fault_queue, &group->work); + + /* Postpone the fault completion */ + return IOMMU_FAULT_STATUS_IGNORE; +} + +/** + * handle_iommu_fault - Handle fault in device driver or mm + * + * If the device driver expressed interest in handling the fault, report it through + * the domain handler. If the fault is recoverable, try to page in the address. + */ +int handle_iommu_fault(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *fault) +{ + int ret = -ENOSYS; + + /* + * if upper layers showed interest and installed a fault handler, + * invoke it.
+ */ + if (domain->ext_handler) { + ret = domain->ext_handler(domain, dev, fault, + domain->handler_token); + + if (ret != IOMMU_FAULT_STATUS_NONE) + return iommu_fault_finish(domain, dev, fault, ret); + } else if (domain->handler && !(fault->flags & + (IOMMU_FAULT_RECOVERABLE | IOMMU_FAULT_PASID))) { + /* Fall back to the old method if possible */ + ret = domain->handler(domain, dev, fault->address, + fault->flags, domain->handler_token); + if (ret) + return ret; + } + + /* If the handler is blocking, handle fault in the workqueue */ + if (fault->flags & IOMMU_FAULT_RECOVERABLE) + ret = iommu_queue_fault(domain, dev, fault); + + return iommu_fault_finish(domain, dev, fault, ret); +} +EXPORT_SYMBOL_GPL(handle_iommu_fault); + +/** + * iommu_fault_response - Complete a recoverable fault + * @domain: iommu domain passed to the handler + * @dev: device passed to the handler + * @fault: fault passed to the handler + * @status: action to perform + * + * An atomic handler that took ownership of the fault (by returning + * IOMMU_FAULT_STATUS_IGNORE) must complete the fault by calling this function. + */ +int iommu_fault_response(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *fault, enum iommu_fault_status status) +{ + /* No response is needed for unrecoverable faults... */ + if (!(fault->flags & IOMMU_FAULT_RECOVERABLE)) + return -EINVAL; + + /* Ignore is certainly the wrong thing to do at this point */ + if (WARN_ON(status == IOMMU_FAULT_STATUS_IGNORE || + status == IOMMU_FAULT_STATUS_NONE)) + status = IOMMU_FAULT_STATUS_INVALID; + + return iommu_fault_finish(domain, dev, fault, status); +} +EXPORT_SYMBOL_GPL(iommu_fault_response); + +/** + * iommu_fault_queue_register - register an IOMMU driver to the global fault + * queue + * + * @flush_notifier: a notifier block that is called before the fault queue is + * flushed. The IOMMU driver should commit all faults that are pending in its + * low-level queues at the time of the call, into the fault queue.
The notifier + * takes a device pointer as argument, hinting what endpoint is causing the + * flush. When the device is NULL, all faults should be committed. + */ +int iommu_fault_queue_register(struct notifier_block *flush_notifier) +{ + /* + * The WQ is unordered because the low-level handler enqueues faults by + * group. PRI requests within a group have to be ordered, but once + * that's dealt with, the high-level function can handle groups out of + * order. + */ + down_write(&iommu_fault_queue_sem); + if (!iommu_fault_queue) { + iommu_fault_queue = alloc_workqueue("iommu_fault_queue", + WQ_UNBOUND, 0); + if (iommu_fault_queue) + refcount_set(&iommu_fault_queue_refs, 1); + } else { + refcount_inc(&iommu_fault_queue_refs); + } + up_write(&iommu_fault_queue_sem); + + if (!iommu_fault_queue) + return -ENOMEM; + + if (flush_notifier) + blocking_notifier_chain_register(&iommu_fault_queue_flush_notifiers, + flush_notifier); + + return 0; +} +EXPORT_SYMBOL_GPL(iommu_fault_queue_register); + +/** + * iommu_fault_queue_flush - Ensure that all queued faults have been processed. + * @dev: the endpoint whose faults need to be flushed. If NULL, flush all + * pending faults. + * + * Users must call this function when releasing a PASID, to ensure that all + * pending faults affecting this PASID have been handled, and won't affect the + * address space of a subsequent process that reuses this PASID. + */ +void iommu_fault_queue_flush(struct device *dev) +{ + blocking_notifier_call_chain(&iommu_fault_queue_flush_notifiers, 0, dev); + + down_read(&iommu_fault_queue_sem); + /* + * Don't flush the partial faults list. All PRGs with the PASID are + * complete and have been submitted to the queue. + */ + if (iommu_fault_queue) + flush_workqueue(iommu_fault_queue); + up_read(&iommu_fault_queue_sem); +} +EXPORT_SYMBOL_GPL(iommu_fault_queue_flush); + +/** + * iommu_fault_queue_unregister - Unregister an IOMMU driver from the global + * fault queue. 
+ * + * @flush_notifier: same parameter as iommu_fault_queue_register + */ +void iommu_fault_queue_unregister(struct notifier_block *flush_notifier) +{ + down_write(&iommu_fault_queue_sem); + if (refcount_dec_and_test(&iommu_fault_queue_refs)) { + destroy_workqueue(iommu_fault_queue); + iommu_fault_queue = NULL; + } + up_write(&iommu_fault_queue_sem); + + if (flush_notifier) + blocking_notifier_chain_unregister(&iommu_fault_queue_flush_notifiers, + flush_notifier); +} +EXPORT_SYMBOL_GPL(iommu_fault_queue_unregister); diff --git a/drivers/iommu/iommu-process.c b/drivers/iommu/iommu-process.c index dee7691e3791..092240708b78 100644 --- a/drivers/iommu/iommu-process.c +++ b/drivers/iommu/iommu-process.c @@ -26,9 +26,6 @@ #include #include -/* FIXME: stub for the fault queue. Remove later. */ -#define iommu_fault_queue_flush(...) - /* Link between a domain and a process */ struct iommu_context { struct iommu_process *process; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ea4eaf585eb4..37fafaf07ee2 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -51,15 +51,69 @@ struct iommu_domain; struct notifier_block; /* iommu fault flags */ -#define IOMMU_FAULT_READ 0x0 -#define IOMMU_FAULT_WRITE 0x1 +#define IOMMU_FAULT_READ (1 << 0) +#define IOMMU_FAULT_WRITE (1 << 1) +#define IOMMU_FAULT_EXEC (1 << 2) +#define IOMMU_FAULT_PRIV (1 << 3) +/* + * If a fault is recoverable, then it *must* be completed, once handled, with + * iommu_fault_response. + */ +#define IOMMU_FAULT_RECOVERABLE (1 << 4) +/* The PASID field is valid */ +#define IOMMU_FAULT_PASID (1 << 5) +/* Fault is part of a group (PCI PRG) */ +#define IOMMU_FAULT_GROUP (1 << 6) +/* Fault is last of its group */ +#define IOMMU_FAULT_LAST (1 << 7) + +/** + * enum iommu_fault_status - Return status of fault handlers, telling the IOMMU + * driver how to proceed with the fault. + * + * @IOMMU_FAULT_STATUS_NONE: Fault was not handled. Call the next handler, or + * terminate. 
+ * @IOMMU_FAULT_STATUS_FAILURE: General error. Drop all subsequent faults from + * this device if possible. This is "Response Failure" in PCI PRI. + * @IOMMU_FAULT_STATUS_INVALID: Could not handle this fault, don't retry the + * access. This is "Invalid Request" in PCI PRI. + * @IOMMU_FAULT_STATUS_HANDLED: Fault has been handled and the page tables + * populated, retry the access. + * @IOMMU_FAULT_STATUS_IGNORE: Stop processing the fault, and do not send a + * reply to the device. + * + * For unrecoverable faults, the only valid status is IOMMU_FAULT_STATUS_NONE + * For a recoverable fault, if no one handled the fault, treat as + * IOMMU_FAULT_STATUS_INVALID. + */ +enum iommu_fault_status { + IOMMU_FAULT_STATUS_NONE = 0, + IOMMU_FAULT_STATUS_FAILURE, + IOMMU_FAULT_STATUS_INVALID, + IOMMU_FAULT_STATUS_HANDLED, + IOMMU_FAULT_STATUS_IGNORE, +}; typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); struct iommu_fault { + /* Faulting address */ unsigned long address; + /* Fault flags */ unsigned int flags; + /* Process address space ID (if IOMMU_FAULT_PASID is present) */ + u32 pasid; + /* + * For PCI PRI, 'id' is the PRG. For others, it's a tag identifying a + * single fault. + */ + unsigned int id; + /* + * IOMMU vendor-specific things. This cannot be a private pointer + * because the fault report might leave the kernel and into a guest. 
+ */ + u64 iommu_data; }; typedef int (*iommu_ext_fault_handler_t)(struct iommu_domain *, struct device *, @@ -228,6 +282,7 @@ struct iommu_resv_region { * @domain_set_windows: Set the number of windows for a domain * @domain_get_windows: Return the number of windows for a domain * @of_xlate: add OF master IDs to iommu grouping + * @fault_response: complete a recoverable fault * @pgsize_bitmap: bitmap of all possible supported page sizes */ struct iommu_ops { @@ -287,6 +342,10 @@ struct iommu_ops { int (*of_xlate)(struct device *dev, struct of_phandle_args *args); bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev); + int (*fault_response)(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *fault, + enum iommu_fault_status status); + unsigned long pgsize_bitmap; }; @@ -824,4 +883,43 @@ static inline void __iommu_process_unbind_dev_all(struct iommu_domain *domain, #endif /* CONFIG_IOMMU_PROCESS */ +#ifdef CONFIG_IOMMU_FAULT +extern int handle_iommu_fault(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *fault); +extern int iommu_fault_response(struct iommu_domain *domain, struct device *dev, + struct iommu_fault *fault, + enum iommu_fault_status status); +extern int iommu_fault_queue_register(struct notifier_block *flush_notifier); +extern void iommu_fault_queue_flush(struct device *dev); +extern void iommu_fault_queue_unregister(struct notifier_block *flush_notifier); +#else /* CONFIG_IOMMU_FAULT */ +static inline int handle_iommu_fault(struct iommu_domain *domain, + struct device *dev, + struct iommu_fault *fault) +{ + return -ENODEV; +} + +static inline int iommu_fault_response(struct iommu_domain *domain, + struct device *dev, + struct iommu_fault *fault, + enum iommu_fault_status status) +{ + return -ENODEV; +} + +static inline int iommu_fault_queue_register(struct notifier_block *flush_notifier) +{ + return -ENODEV; +} + +static inline void iommu_fault_queue_flush(struct device *dev) +{ +} +
+static inline void iommu_fault_queue_unregister(struct notifier_block *flush_notifier) +{ +} +#endif /* CONFIG_IOMMU_FAULT */ + #endif /* __LINUX_IOMMU_H */ From patchwork Fri Oct 6 13:31:35 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822429 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r7S6NyXz9t34 for ; Sat, 7 Oct 2017 00:28:36 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752263AbdJFN2e (ORCPT ); Fri, 6 Oct 2017 09:28:34 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60340 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752020AbdJFN2d (ORCPT ); Fri, 6 Oct 2017 09:28:33 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 812A815BE; Fri, 6 Oct 2017 06:28:33 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A7F043F578; Fri, 6 Oct 2017 06:28:28 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, 
alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 08/36] iommu/fault: Handle mm faults Date: Fri, 6 Oct 2017 14:31:35 +0100 Message-Id: <20171006133203.22803-9-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When a recoverable page fault is handled by the fault workqueue, find the associated process and call handle_mm_fault. In theory, we don't even need to take a reference to the iommu_process, because any release of the structure is preceded by a flush of the queue. I don't feel comfortable removing the pinning at the moment, though. 
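As an illustration only, the permission check that the handler in this patch performs before calling handle_mm_fault can be modelled in userspace as below. This is a minimal sketch, not the kernel code: the SK_* names and sk_access_allowed() are hypothetical, the fault bits mirror IOMMU_FAULT_READ/WRITE/EXEC from this series, and the SK_VM_* bits only stand in for the VM_READ/VM_WRITE/VM_EXEC flags of a vma.

```c
#include <stdbool.h>

/* Fault flags, same bit positions as IOMMU_FAULT_* in this series */
#define SK_FAULT_READ	(1u << 0)
#define SK_FAULT_WRITE	(1u << 1)
#define SK_FAULT_EXEC	(1u << 2)

/* Hypothetical stand-ins for the vma permission bits */
#define SK_VM_READ	0x1u
#define SK_VM_WRITE	0x2u
#define SK_VM_EXEC	0x4u

/*
 * Translate the fault flags into the permissions the access requires,
 * then check them against the permissions of the mapping. Returns true
 * when the mapping grants every requested permission.
 */
static bool sk_access_allowed(unsigned int fault_flags, unsigned int vm_flags)
{
	unsigned int access = 0;

	if (fault_flags & SK_FAULT_READ)
		access |= SK_VM_READ;
	if (fault_flags & SK_FAULT_WRITE)
		access |= SK_VM_WRITE;
	if (fault_flags & SK_FAULT_EXEC)
		access |= SK_VM_EXEC;

	/* Any requested bit missing from the mapping is an access fault */
	return !(access & ~vm_flags);
}
```

In this model a write fault on a read-only mapping is rejected, which corresponds to the handler returning IOMMU_FAULT_STATUS_INVALID without calling handle_mm_fault.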
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/io-pgfault.c | 83 ++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 81 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index f31bc24534b0..532bdb9ce519 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -21,6 +21,7 @@ #include #include +#include #include #include @@ -83,8 +84,86 @@ static int iommu_fault_finish(struct iommu_domain *domain, struct device *dev, static int iommu_fault_handle_single(struct iommu_fault_context *fault) { - /* TODO */ - return -ENODEV; + struct mm_struct *mm; + struct vm_area_struct *vma; + struct iommu_process *process; + int ret = IOMMU_FAULT_STATUS_INVALID; + unsigned int access_flags = 0; + unsigned int fault_flags = FAULT_FLAG_REMOTE; + struct iommu_fault *params = &fault->params; + + if (!(params->flags & IOMMU_FAULT_PASID)) + return ret; + + process = iommu_process_find(params->pasid); + if (!process) + return ret; + + if ((params->flags & (IOMMU_FAULT_LAST | IOMMU_FAULT_READ | + IOMMU_FAULT_WRITE)) == IOMMU_FAULT_LAST) { + /* Special case: PASID Stop Marker doesn't require a response */ + ret = IOMMU_FAULT_STATUS_IGNORE; + goto out_put_process; + } + + mm = process->mm; + if (!mmget_not_zero(mm)) { + /* Process is dead */ + goto out_put_process; + } + + down_read(&mm->mmap_sem); + + vma = find_extend_vma(mm, params->address); + if (!vma) + /* Unmapped area */ + goto out_put_mm; + + if (params->flags & IOMMU_FAULT_READ) + access_flags |= VM_READ; + + if (params->flags & IOMMU_FAULT_WRITE) { + access_flags |= VM_WRITE; + fault_flags |= FAULT_FLAG_WRITE; + } + + if (params->flags & IOMMU_FAULT_EXEC) { + access_flags |= VM_EXEC; + fault_flags |= FAULT_FLAG_INSTRUCTION; + } + + if (!(params->flags & IOMMU_FAULT_PRIV)) + fault_flags |= FAULT_FLAG_USER; + + if (access_flags & ~vma->vm_flags) + /* Access fault */ + goto out_put_mm; + + ret = handle_mm_fault(vma, params->address, fault_flags); + 
ret = ret & VM_FAULT_ERROR ? IOMMU_FAULT_STATUS_INVALID : + IOMMU_FAULT_STATUS_HANDLED; + +out_put_mm: + up_read(&mm->mmap_sem); + + /* + * Here's a fun scenario: the process exits while we're handling the + * fault on its mm. Since we're the last mm_user, mmput will call + * mm_exit immediately. exit_mm releases the mmu notifier, which calls + * iommu_notifier_release, which has to flush the fault queue that we're + * executing on... It's actually easy to reproduce with a DMA engine, + * and I did observe a lockdep splat. Therefore move the release of the + * mm to another thread, if we're the last user. + * + * mmput_async was removed in 4.14, and added back in 4.15(?) + * https://patchwork.kernel.org/patch/9952257/ + */ + mmput_async(mm); + +out_put_process: + iommu_process_put(process); + + return ret; } static void iommu_fault_handle_group(struct work_struct *work) From patchwork Fri Oct 6 13:31:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822430 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r7Y54vnz9t34 for ; Sat, 7 Oct 2017 00:28:41 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752278AbdJFN2j (ORCPT ); Fri, 6 Oct 2017 09:28:39 -0400 Received: from foss.arm.com ([217.140.101.70]:60382 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752292AbdJFN2i (ORCPT ); Fri, 6 Oct 2017 09:28:38 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 
977EF15BF; Fri, 6 Oct 2017 06:28:38 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id BE8E93F578; Fri, 6 Oct 2017 06:28:33 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 09/36] iommu/fault: Allow blocking fault handlers Date: Fri, 6 Oct 2017 14:31:36 +0100 Message-Id: <20171006133203.22803-10-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Allow device drivers to register their fault handler at various stages of the handling path, by adding flags to iommu_set_ext_fault_handler. Since we now have a fault workqueue, it is quite easy to call their handler from thread context instead of from the IRQ handler. A driver can request to be called in both blocking and non-blocking context, so it can filter faults early and only execute the blocking code for some of them. Add the IOMMU_FAULT_ATOMIC fault flag to tell the driver where we're calling it from.
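To make the two calling contexts concrete, here is a minimal userspace model of the dispatch described above. It is a sketch under stated assumptions rather than the code in this patch: every sk_* and SK_* name is hypothetical, the handler signature is simplified, and only the bit values of the handler flags and of the ATOMIC fault flag follow the definitions added to include/linux/iommu.h.

```c
/* Same bit values as IOMMU_FAULT_HANDLER_* and IOMMU_FAULT_ATOMIC */
#define SK_HANDLER_ATOMIC	(1 << 0)
#define SK_HANDLER_BLOCKING	(1 << 1)
#define SK_FAULT_ATOMIC		(1 << 8)

/* Same ordering as enum iommu_fault_status */
enum sk_status { SK_NONE = 0, SK_FAILURE, SK_INVALID, SK_HANDLED, SK_IGNORE };

/* Simplified handler type: fault flags and the driver's private token */
typedef int (*sk_handler_t)(unsigned int fault_flags, void *token);

struct sk_domain {
	sk_handler_t handler;
	void *token;
	int handler_flags;
};

/* Registration: with no flags, the handler is only called atomically */
static void sk_set_handler(struct sk_domain *d, sk_handler_t h, void *token,
			   int flags)
{
	if (!flags)
		flags = SK_HANDLER_ATOMIC;
	d->handler = h;
	d->token = token;
	d->handler_flags = flags;
}

/* Models the IRQ path: tag the fault so the driver knows it cannot sleep */
static int sk_report_atomic(struct sk_domain *d, unsigned int fault_flags)
{
	if (!(d->handler_flags & SK_HANDLER_ATOMIC))
		return SK_NONE;
	return d->handler(fault_flags | SK_FAULT_ATOMIC, d->token);
}

/* Models the workqueue path: the ATOMIC tag is cleared before the call */
static int sk_report_blocking(struct sk_domain *d, unsigned int fault_flags)
{
	if (!(d->handler_flags & SK_HANDLER_BLOCKING))
		return SK_NONE;
	return d->handler(fault_flags & ~SK_FAULT_ATOMIC, d->token);
}

/* Example driver handler: pass in IRQ context, terminate in thread context */
static int sk_example_handler(unsigned int fault_flags, void *token)
{
	int *calls = token;

	(*calls)++;
	if (fault_flags & SK_FAULT_ATOMIC)
		return SK_NONE;		/* let the thread context deal with it */
	return SK_INVALID;		/* sleeping allowed here; terminate the fault */
}
```

A driver registering sk_example_handler with SK_HANDLER_ATOMIC | SK_HANDLER_BLOCKING is called once from each path; in this model the first call returns SK_NONE and the second returns SK_INVALID.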
Signed-off-by: Jean-Philippe Brucker --- Rob, would this do what you want? The MSM driver can register its handler with ATOMIC | BLOCKING flags. When called in IRQ context, it can ignore the fault by returning IOMMU_FAULT_STATUS_NONE, or drop it by returning IOMMU_FAULT_STATUS_INVALID. When called in thread context, it can sleep and then return IOMMU_FAULT_STATUS_INVALID to terminate the fault. --- drivers/iommu/io-pgfault.c | 16 ++++++++++++++-- drivers/iommu/iommu.c | 12 +++++++++--- include/linux/iommu.h | 20 +++++++++++++++++++- 3 files changed, 42 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 532bdb9ce519..3ec8179f58b5 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -91,6 +91,14 @@ static int iommu_fault_handle_single(struct iommu_fault_context *fault) unsigned int access_flags = 0; unsigned int fault_flags = FAULT_FLAG_REMOTE; struct iommu_fault *params = &fault->params; + struct iommu_domain *domain = fault->domain; + + if (domain->handler_flags & IOMMU_FAULT_HANDLER_BLOCKING) { + ret = domain->ext_handler(domain, fault->dev, &fault->params, + domain->handler_token); + if (ret != IOMMU_FAULT_STATUS_NONE) + return ret; + } if (!(params->flags & IOMMU_FAULT_PASID)) return ret; @@ -274,7 +282,8 @@ int handle_iommu_fault(struct iommu_domain *domain, struct device *dev, * if upper layers showed interest and installed a fault handler, * invoke it. 
*/ - if (domain->ext_handler) { + if (domain->handler_flags & IOMMU_FAULT_HANDLER_ATOMIC) { + fault->flags |= IOMMU_FAULT_ATOMIC; ret = domain->ext_handler(domain, dev, fault, domain->handler_token); @@ -290,8 +299,11 @@ int handle_iommu_fault(struct iommu_domain *domain, struct device *dev, } /* If the handler is blocking, handle fault in the workqueue */ - if (fault->flags & IOMMU_FAULT_RECOVERABLE) + if ((fault->flags & IOMMU_FAULT_RECOVERABLE) || + (domain->handler_flags & IOMMU_FAULT_HANDLER_BLOCKING)) { + fault->flags &= ~IOMMU_FAULT_ATOMIC; ret = iommu_queue_fault(domain, dev, fault); + } return iommu_fault_finish(domain, dev, fault, ret); } diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index ee956b5fc301..c189648ab7b4 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1258,7 +1258,9 @@ EXPORT_SYMBOL_GPL(iommu_set_fault_handler); * @dev: the device * @handler: fault handler * @token: user data, will be passed back to the fault handler - * @flags: IOMMU_FAULT_HANDLER_* parameters. + * @flags: IOMMU_FAULT_HANDLER_* parameters. Allows the driver to tell when it + * wants to be notified. By default the handler will only be called from atomic + * context. * * This function should be used by IOMMU users which want to be notified * whenever an IOMMU fault happens. 
@@ -1275,11 +1277,15 @@ void iommu_set_ext_fault_handler(struct device *dev, if (WARN_ON(!domain)) return; + if (!flags) + flags |= IOMMU_FAULT_HANDLER_ATOMIC; + if (WARN_ON(domain->handler || domain->ext_handler)) return; domain->ext_handler = handler; domain->handler_token = token; + domain->handler_flags = flags; } EXPORT_SYMBOL_GPL(iommu_set_ext_fault_handler); @@ -1824,7 +1830,7 @@ int report_iommu_fault(struct iommu_domain *domain, struct device *dev, int ret = -ENOSYS; struct iommu_fault fault = { .address = iova, - .flags = flags, + .flags = flags | IOMMU_FAULT_ATOMIC, }; /* @@ -1834,7 +1840,7 @@ int report_iommu_fault(struct iommu_domain *domain, struct device *dev, if (domain->handler) ret = domain->handler(domain, dev, iova, flags, domain->handler_token); - else if (domain->ext_handler) + else if (domain->handler_flags & IOMMU_FAULT_HANDLER_ATOMIC) ret = domain->ext_handler(domain, dev, &fault, domain->handler_token); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 37fafaf07ee2..a6d417785c7b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -66,6 +66,8 @@ struct notifier_block; #define IOMMU_FAULT_GROUP (1 << 6) /* Fault is last of its group */ #define IOMMU_FAULT_LAST (1 << 7) +/* The fault handler is being called from atomic context */ +#define IOMMU_FAULT_ATOMIC (1 << 8) /** * enum iommu_fault_status - Return status of fault handlers, telling the IOMMU @@ -97,6 +99,21 @@ enum iommu_fault_status { typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); +/* + * IOMMU_FAULT_HANDLER_ATOMIC: Notify device driver from within atomic context + * (IRQ handler). The callback is not allowed to sleep. If the fault is + * recoverable, the driver must either return a fault status telling the IOMMU + * driver how to complete the fault (FAILURE, INVALID, HANDLED) or complete the + * fault later with iommu_fault_response. 
+ */ +#define IOMMU_FAULT_HANDLER_ATOMIC (1 << 0) +/* + * IOMMU_FAULT_HANDLER_BLOCKING: Notify device driver from a thread. If the fault + * is recoverable, the driver must return a fault status telling the IOMMU + * driver how to complete the fault (FAILURE, INVALID, HANDLED) + */ +#define IOMMU_FAULT_HANDLER_BLOCKING (1 << 1) + struct iommu_fault { /* Faulting address */ unsigned long address; @@ -161,6 +178,7 @@ struct iommu_domain { iommu_fault_handler_t handler; iommu_ext_fault_handler_t ext_handler; void *handler_token; + int handler_flags; iommu_process_exit_handler_t process_exit; void *process_exit_token; struct iommu_domain_geometry geometry; @@ -633,7 +651,7 @@ static inline phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_ad } static inline void iommu_set_fault_handler(struct iommu_domain *domain, - iommu_fault_handler_t handler, void *token) + iommu_fault_handler_t handler, void *token, int flags) { } From patchwork Fri Oct 6 13:31:37 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822431 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r7g4VcBz9t34 for ; Sat, 7 Oct 2017 00:28:47 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752295AbdJFN2p (ORCPT ); Fri, 6 Oct 2017 09:28:45 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60422 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752103AbdJFN2o (ORCPT ); Fri, 6 Oct 2017 09:28:44 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com 
(unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id ADDD2165D; Fri, 6 Oct 2017 06:28:43 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D4E263F578; Fri, 6 Oct 2017 06:28:38 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 10/36] vfio: Add support for Shared Virtual Memory Date: Fri, 6 Oct 2017 14:31:37 +0100 Message-Id: <20171006133203.22803-11-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add two new ioctls for VFIO containers. VFIO_DEVICE_BIND_PROCESS creates a bond between a container and a process address space, identified by a device-specific ID named PASID. This allows the device to target DMA transactions at the process virtual addresses without a need for mapping and unmapping buffers explicitly in the IOMMU.
The process page tables are shared with the IOMMU, and mechanisms such as PCI ATS/PRI may be used to handle faults. VFIO_DEVICE_UNBIND_PROCESS removes a bond identified by a PASID. Signed-off-by: Jean-Philippe Brucker --- drivers/vfio/vfio_iommu_type1.c | 243 +++++++++++++++++++++++++++++++++++++++- include/uapi/linux/vfio.h | 69 ++++++++++++ 2 files changed, 311 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 92155cce926d..4bfb92273cb5 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include @@ -60,6 +61,7 @@ MODULE_PARM_DESC(disable_hugepages, struct vfio_iommu { struct list_head domain_list; + struct list_head process_list; struct vfio_domain *external_domain; /* domain for external user */ struct mutex lock; struct rb_root dma_list; @@ -92,6 +94,12 @@ struct vfio_group { struct list_head next; }; +struct vfio_process { + int pasid; + struct pid *pid; + struct list_head next; +}; + /* * Guest RAM pinning working set or DMA target */ @@ -1114,6 +1122,25 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu, return 0; } +static int vfio_iommu_replay_bind(struct vfio_iommu *iommu, struct vfio_group *group) +{ + int ret; + u32 pasid; + struct vfio_process *vfio_process; + + list_for_each_entry(vfio_process, &iommu->process_list, next) { + struct task_struct *task = get_pid_task(vfio_process->pid, + PIDTYPE_PID); + + ret = iommu_process_bind_group(group->iommu_group, task, &pasid, 0); + put_task_struct(task); + if (ret) + return ret; + } + + return 0; +} + /* * We change our unmap behavior slightly depending on whether the IOMMU * supports fine-grained superpages.
IOMMUs like AMD-Vi will use a superpage @@ -1301,8 +1328,9 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, list_add(&group->next, &d->group_list); iommu_domain_free(domain->domain); kfree(domain); + ret = vfio_iommu_replay_bind(iommu, group); mutex_unlock(&iommu->lock); - return 0; + return ret; } ret = iommu_attach_group(domain->domain, iommu_group); @@ -1318,6 +1346,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, if (ret) goto out_detach; + ret = vfio_iommu_replay_bind(iommu, group); + if (ret) + goto out_detach; + if (resv_msi) { ret = iommu_get_msi_cookie(domain->domain, resv_msi_base); if (ret) @@ -1349,6 +1381,21 @@ static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu) vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node)); } +static void vfio_iommu_unbind_all(struct vfio_iommu *iommu) +{ + struct vfio_process *process, *process_tmp; + + list_for_each_entry_safe(process, process_tmp, &iommu->process_list, next) { + /* + * No need to unbind manually, iommu_detach_group should + * do it for us. 
+ */ + put_pid(process->pid); + kfree(process); + } + INIT_LIST_HEAD(&iommu->process_list); +} + static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu) { struct rb_node *n, *p; @@ -1438,6 +1485,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, vfio_iommu_unmap_unpin_all(iommu); else vfio_iommu_unmap_unpin_reaccount(iommu); + vfio_iommu_unbind_all(iommu); } iommu_domain_free(domain->domain); list_del(&domain->next); @@ -1472,6 +1520,7 @@ static void *vfio_iommu_type1_open(unsigned long arg) } INIT_LIST_HEAD(&iommu->domain_list); + INIT_LIST_HEAD(&iommu->process_list); iommu->dma_list = RB_ROOT; mutex_init(&iommu->lock); BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier); @@ -1506,6 +1555,7 @@ static void vfio_iommu_type1_release(void *iommu_data) kfree(iommu->external_domain); } + vfio_iommu_unbind_all(iommu); vfio_iommu_unmap_unpin_all(iommu); list_for_each_entry_safe(domain, domain_tmp, @@ -1534,6 +1584,159 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu) return ret; } +static long vfio_iommu_type1_bind_process(struct vfio_iommu *iommu, + void __user *arg, + struct vfio_iommu_type1_bind *bind) +{ + struct vfio_iommu_type1_bind_process params; + struct vfio_process *vfio_process; + struct vfio_domain *domain; + struct task_struct *task; + struct vfio_group *group; + struct mm_struct *mm; + unsigned long minsz; + struct pid *pid; + int ret; + + minsz = sizeof(*bind) + sizeof(params); + if (bind->argsz < minsz) + return -EINVAL; + + arg += sizeof(*bind); + ret = copy_from_user(&params, arg, sizeof(params)); + if (ret) + return -EFAULT; + + if (params.flags & ~VFIO_IOMMU_BIND_PID) + return -EINVAL; + + if (params.flags & VFIO_IOMMU_BIND_PID) { + pid_t vpid; + + minsz += sizeof(pid_t); + if (bind->argsz < minsz) + return -EINVAL; + + ret = copy_from_user(&vpid, arg + sizeof(params), sizeof(pid_t)); + if (ret) + return -EFAULT; + + rcu_read_lock(); + task = find_task_by_vpid(vpid); + if (task) + get_task_struct(task); +
rcu_read_unlock(); + if (!task) + return -ESRCH; + + /* Ensure current has RW access on the mm */ + mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS); + if (!mm || IS_ERR(mm)) { + put_task_struct(task); + return IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH; + } + mmput(mm); + } else { + get_task_struct(current); + task = current; + } + + pid = get_task_pid(task, PIDTYPE_PID); + mutex_lock(&iommu->lock); + list_for_each_entry(vfio_process, &iommu->process_list, next) { + if (vfio_process->pid != pid) + continue; + + params.pasid = vfio_process->pasid; + + mutex_unlock(&iommu->lock); + put_pid(pid); + put_task_struct(task); + return copy_to_user(arg, &params, sizeof(params)) ? + -EFAULT : 0; + } + + vfio_process = kzalloc(sizeof(*vfio_process), GFP_KERNEL); + if (!vfio_process) { + mutex_unlock(&iommu->lock); + put_pid(pid); + put_task_struct(task); + return -ENOMEM; + } + + list_for_each_entry(domain, &iommu->domain_list, next) { + list_for_each_entry(group, &domain->group_list, next) { + ret = iommu_process_bind_group(group->iommu_group, task, + &params.pasid, 0); + if (ret) + break; + } + if (ret) + break; + } + + if (!ret) { + vfio_process->pid = pid; + vfio_process->pasid = params.pasid; + list_add(&vfio_process->next, &iommu->process_list); + } + + mutex_unlock(&iommu->lock); + + put_task_struct(task); + + if (ret) + kfree(vfio_process); + else + ret = copy_to_user(arg, &params, sizeof(params)) ?
+ -EFAULT : 0; + + return ret; +} + +static long vfio_iommu_type1_unbind_process(struct vfio_iommu *iommu, + void __user *arg, + struct vfio_iommu_type1_bind *bind) +{ + int ret = -EINVAL; + unsigned long minsz; + struct vfio_process *process; + struct vfio_group *group; + struct vfio_domain *domain; + struct vfio_iommu_type1_bind_process params; + + minsz = sizeof(*bind) + sizeof(params); + if (bind->argsz < minsz) + return -EINVAL; + + arg += sizeof(*bind); + ret = copy_from_user(&params, arg, sizeof(params)); + if (ret) + return -EFAULT; + + if (params.flags) + return -EINVAL; + + mutex_lock(&iommu->lock); + list_for_each_entry(process, &iommu->process_list, next) { + if (process->pasid != params.pasid) + continue; + + list_for_each_entry(domain, &iommu->domain_list, next) + list_for_each_entry(group, &domain->group_list, next) + iommu_process_unbind_group(group->iommu_group, + process->pasid); + + put_pid(process->pid); + list_del(&process->next); + kfree(process); + break; + } + mutex_unlock(&iommu->lock); + + return ret; +} + static long vfio_iommu_type1_ioctl(void *iommu_data, unsigned int cmd, unsigned long arg) { @@ -1604,6 +1807,44 @@ static long vfio_iommu_type1_ioctl(void *iommu_data, return copy_to_user((void __user *)arg, &unmap, minsz) ?
-EFAULT : 0; + + } else if (cmd == VFIO_IOMMU_BIND) { + struct vfio_iommu_type1_bind bind; + + minsz = offsetofend(struct vfio_iommu_type1_bind, mode); + + if (copy_from_user(&bind, (void __user *)arg, minsz)) + return -EFAULT; + + if (bind.argsz < minsz) + return -EINVAL; + + switch (bind.mode) { + case VFIO_IOMMU_BIND_PROCESS: + return vfio_iommu_type1_bind_process(iommu, (void *)arg, + &bind); + default: + return -EINVAL; + } + + } else if (cmd == VFIO_IOMMU_UNBIND) { + struct vfio_iommu_type1_bind bind; + + minsz = offsetofend(struct vfio_iommu_type1_bind, mode); + + if (copy_from_user(&bind, (void __user *)arg, minsz)) + return -EFAULT; + + if (bind.argsz < minsz) + return -EINVAL; + + switch (bind.mode) { + case VFIO_IOMMU_BIND_PROCESS: + return vfio_iommu_type1_unbind_process(iommu, (void *)arg, + &bind); + default: + return -EINVAL; + } } return -ENOTTY; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index ae461050661a..6da8321c33dc 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -565,6 +565,75 @@ struct vfio_iommu_type1_dma_unmap { #define VFIO_IOMMU_ENABLE _IO(VFIO_TYPE, VFIO_BASE + 15) #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16) +/* + * Allocate a PASID for a local process, and use it to attach this process to + * devices in the container. Devices can then tag their DMA traffic with the + * returned @pasid to perform transactions on the associated virtual address + * space. Mapping and unmapping of buffers is performed by standard functions + * such as mmap and malloc. + * + * If flag is VFIO_IOMMU_BIND_PID, bind to a process different from the calling + * one. data contains the pid of that process, a s32. Given that the caller owns + * the device, setting this flag grants the caller read and write permissions on + * the entire address space of foreign process described by @pid. 
Therefore, + * permission to perform the bind operation on a foreign process is governed by + * the ptrace access mode PTRACE_MODE_ATTACH_REALCREDS check. See man ptrace(2) + * for more information. + * + * On success, VFIO writes a Process Address Space ID (PASID) into @pasid. This + * ID is unique to a process and can be used on all devices in the container. + * + * On fork, the child inherits the device fd and can use the bonds setup by its + * parent. Consequently, the child has R/W access on the address spaces bound by + * its parent. After an execv, the device fd is closed and the child doesn't + * have access to the address space anymore. + */ +struct vfio_iommu_type1_bind_process { + __u32 flags; +#define VFIO_IOMMU_BIND_PID (1 << 0) + __u32 pasid; + __u8 data[]; +}; + +/* + * Only mode supported at the moment is VFIO_IOMMU_BIND_PROCESS, which takes + * vfio_iommu_type1_bind_process in data. + */ +struct vfio_iommu_type1_bind { + __u32 argsz; + __u32 mode; +#define VFIO_IOMMU_BIND_PROCESS (1 << 0) + __u8 data[]; +}; + +/* + * VFIO_IOMMU_BIND - _IOWR(VFIO_TYPE, VFIO_BASE + 22, struct vfio_iommu_bind) + * + * Manage address spaces of devices in this container. Initially a TYPE1 + * container can only have one address space, managed with + * VFIO_IOMMU_MAP/UNMAP_DMA. + * + * An IOMMU of type VFIO_TYPE1_NESTING_IOMMU can be managed by both MAP/UNMAP + * and BIND ioctls at the same time. MAP/UNMAP acts on the stage-2 (host) page + * tables, and BIND manages the stage-1 (guest) page tables. Other types of + * IOMMU may allow MAP/UNMAP and BIND to coexist, where MAP/UNMAP controls + * non-PASID traffic and BIND controls PASID traffic. But this depends on the + * underlying IOMMU architecture and isn't guaranteed. + * + * Availability of this feature depends on the device, its bus, the underlying + * IOMMU and the CPU architecture. + * + * returns: 0 on success, -errno on failure. 
+ */ +#define VFIO_IOMMU_BIND _IO(VFIO_TYPE, VFIO_BASE + 22) + +/* + * VFIO_IOMMU_UNBIND - _IOWR(VFIO_TYPE, VFIO_BASE + 23, struct vfio_iommu_bind) + * + * Undo what was done by the corresponding VFIO_IOMMU_BIND ioctl. + */ +#define VFIO_IOMMU_UNBIND _IO(VFIO_TYPE, VFIO_BASE + 23) + /* -------- Additional API for SPAPR TCE (Server POWERPC) IOMMU -------- */ /* From patchwork Fri Oct 6 13:31:38 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822438 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8L6PWZz9t3m for ; Sat, 7 Oct 2017 00:29:22 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752411AbdJFN2w (ORCPT ); Fri, 6 Oct 2017 09:28:52 -0400 Received: from foss.arm.com ([217.140.101.70]:60486 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752404AbdJFN2t (ORCPT ); Fri, 6 Oct 2017 09:28:49 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C4CDD164F; Fri, 6 Oct 2017 06:28:48 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id EB74F3F578; Fri, 6 Oct 2017 06:28:43 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, 
will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 11/36] iommu/arm-smmu-v3: Link domains and devices Date: Fri, 6 Oct 2017 14:31:38 +0100 Message-Id: <20171006133203.22803-12-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When removing a mapping from a domain, we need to send an invalidation to all devices that might have stored it in their Address Translation Cache (ATC). In addition, with SVM, we'll need to invalidate context descriptors of all devices attached to a live domain. Maintain a list of devices in each domain, protected by a spinlock. It is updated every time we attach a device to a domain or detach it from one. It needs to be a spinlock because we'll invalidate ATC entries from within hardirq-safe contexts, but it may be possible to relax the read side with RCU later.
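For illustration only, the bookkeeping described above can be sketched in plain userspace C. This is not the driver code: every sketch_* name is hypothetical, a C11 atomic flag stands in for spinlock_t, a hand-rolled circular list stands in for struct list_head, and interrupt masking (the irqsave part) is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct list_head. */
struct sketch_list {
	struct sketch_list *prev, *next;
};

struct sketch_domain {
	struct sketch_list devices;	/* head of the attached-device list */
	atomic_bool devices_lock;	/* toy spinlock guarding 'devices' */
};

struct sketch_master {
	struct sketch_domain *domain;	/* NULL while detached */
	struct sketch_list list;	/* entry in domain->devices */
};

static void sketch_lock(atomic_bool *lock)
{
	/* Spin until we are the one flipping the flag from false to true. */
	while (atomic_exchange_explicit(lock, true, memory_order_acquire))
		;
}

static void sketch_unlock(atomic_bool *lock)
{
	atomic_store_explicit(lock, false, memory_order_release);
}

static void sketch_domain_init(struct sketch_domain *d)
{
	d->devices.prev = d->devices.next = &d->devices;
	atomic_init(&d->devices_lock, false);
}

/* Record the domain in the master, then link the master into the list. */
static void sketch_attach(struct sketch_domain *d, struct sketch_master *m)
{
	assert(m->domain == NULL);	/* caller must detach first */
	m->domain = d;

	sketch_lock(&d->devices_lock);
	m->list.next = d->devices.next;
	m->list.prev = &d->devices;
	d->devices.next->prev = &m->list;
	d->devices.next = &m->list;
	sketch_unlock(&d->devices_lock);
}

/* Unlink the master under the lock, then forget the domain. */
static void sketch_detach(struct sketch_master *m)
{
	struct sketch_domain *d = m->domain;

	if (!d)
		return;

	sketch_lock(&d->devices_lock);
	m->list.prev->next = m->list.next;
	m->list.next->prev = m->list.prev;
	sketch_unlock(&d->devices_lock);

	m->domain = NULL;
}

/* Walk the list as an invalidation would; return how many devices it hit. */
static size_t sketch_invalidate_all(struct sketch_domain *d)
{
	size_t n = 0;
	struct sketch_list *pos;

	sketch_lock(&d->devices_lock);
	for (pos = d->devices.next; pos != &d->devices; pos = pos->next)
		n++;	/* a real driver would issue one command per device here */
	sketch_unlock(&d->devices_lock);

	return n;
}
```

The walk holds the lock for its whole duration, which is the property the invalidation path relies on: a device cannot leave the list while commands are being issued for it.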
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 22a6b08ef014..ecc424b15749 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -644,6 +644,11 @@ struct arm_smmu_device { struct arm_smmu_master_data { struct arm_smmu_device *smmu; struct arm_smmu_strtab_ent ste; + + struct arm_smmu_domain *domain; + struct list_head list; /* domain->devices */ + + struct device *dev; }; /* SMMU private data for an IOMMU domain */ @@ -667,6 +672,9 @@ struct arm_smmu_domain { }; struct iommu_domain domain; + + struct list_head devices; + spinlock_t devices_lock; }; struct arm_smmu_option_prop { @@ -1461,6 +1469,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) } mutex_init(&smmu_domain->init_mutex); + INIT_LIST_HEAD(&smmu_domain->devices); + spin_lock_init(&smmu_domain->devices_lock); + return &smmu_domain->domain; } @@ -1666,7 +1677,17 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec) static void arm_smmu_detach_dev(struct device *dev) { + unsigned long flags; struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv; + struct arm_smmu_domain *smmu_domain = master->domain; + + if (smmu_domain) { + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_del(&master->list); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + master->domain = NULL; + } master->ste.assigned = false; arm_smmu_install_ste_for_dev(dev->iommu_fwspec); @@ -1675,6 +1696,7 @@ static void arm_smmu_detach_dev(struct device *dev) static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) { int ret = 0; + unsigned long flags; struct arm_smmu_device *smmu; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_master_data *master; @@ -1710,6 +1732,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, 
struct device *dev) } ste->assigned = true; + master->domain = smmu_domain; + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_add(&master->list, &smmu_domain->devices); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS) { ste->s1_cfg = NULL; @@ -1820,6 +1847,7 @@ static int arm_smmu_add_device(struct device *dev) return -ENOMEM; master->smmu = smmu; + master->dev = dev; fwspec->iommu_priv = master; } From patchwork Fri Oct 6 13:31:39 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822433 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r7x4pYfz9t3m for ; Sat, 7 Oct 2017 00:29:01 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752440AbdJFN2z (ORCPT ); Fri, 6 Oct 2017 09:28:55 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60524 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752406AbdJFN2y (ORCPT ); Fri, 6 Oct 2017 09:28:54 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DB33D1684; Fri, 6 Oct 2017 06:28:53 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0E05A3F578; Fri, 6 Oct 2017 06:28:48 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, 
devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 12/36] dt-bindings: document stall and PASID properties for IOMMU masters Date: Fri, 6 Oct 2017 14:31:39 +0100 Message-Id: <20171006133203.22803-13-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org On ARM systems, some platform devices behind an IOMMU may support stall and PASID features. Stall is the ability to recover from page faults and PASID offers multiple process address spaces to the device. Together they allow a device to do paging. Let the firmware tell us when a device supports stall and PASID. Signed-off-by: Jean-Philippe Brucker --- Documentation/devicetree/bindings/iommu/iommu.txt | 24 +++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt index 5a8b4624defc..c589b75f7277 100644 --- a/Documentation/devicetree/bindings/iommu/iommu.txt +++ b/Documentation/devicetree/bindings/iommu/iommu.txt @@ -86,6 +86,30 @@ have a means to turn off translation.
But it is invalid in such cases to disable the IOMMU's device tree node in the first place because it would prevent any driver from properly setting up the translations. +Optional properties: +-------------------- +- dma-can-stall: When present, the master can wait for a transaction to + complete for an indefinite amount of time. Upon translation fault some + IOMMUs, instead of aborting the translation immediately, may first + notify the driver and keep the transaction in flight. This allows the OS + to inspect the fault and, for example, make physical pages resident + before updating the mappings and completing the transaction. Such an + IOMMU accepts a limited number of simultaneous stalled transactions + before having to either put back-pressure on the master, or abort new + faulting transactions. + + Firmware has to opt in to stalling, because most buses and masters don't + support it. In particular it isn't compatible with PCI, where + transactions have to complete before a time limit. More generally it + won't work in systems and masters that haven't been designed for + stalling. For example the OS, in order to handle a stalled transaction, + may attempt to retrieve pages from secondary storage in a stalled + domain, leading to a deadlock. + +- pasid-bits: Some masters support multiple address spaces for DMA, by + tagging DMA transactions with an address space identifier. By default, + this is 0, which means that the device only has one address space.
+ Notes: ====== From patchwork Fri Oct 6 13:31:40 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822434 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r831krVz9t34 for ; Sat, 7 Oct 2017 00:29:07 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752312AbdJFN3C (ORCPT ); Fri, 6 Oct 2017 09:29:02 -0400 Received: from foss.arm.com ([217.140.101.70]:60568 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752270AbdJFN3A (ORCPT ); Fri, 6 Oct 2017 09:29:00 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 066A01688; Fri, 6 Oct 2017 06:28:59 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 249123F578; Fri, 6 Oct 2017 06:28:54 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, 
okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 13/36] iommu/of: Add stall and pasid properties to iommu_fwspec Date: Fri, 6 Oct 2017 14:31:40 +0100 Message-Id: <20171006133203.22803-14-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add stall and pasid properties to iommu_fwspec, and fill them when dma-can-stall and pasid-bits properties are present in the device tree. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/of_iommu.c | 10 ++++++++++ include/linux/iommu.h | 2 ++ 2 files changed, 12 insertions(+) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 50947ebb6d17..345286bfdbfc 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -204,6 +204,16 @@ const struct iommu_ops *of_iommu_configure(struct device *dev, if (err) break; } + + if (!err && dev->iommu_fwspec) { + const __be32 *prop; + if (of_get_property(master_np, "dma-can-stall", NULL)) + dev->iommu_fwspec->can_stall = true; + + prop = of_get_property(master_np, "pasid-bits", NULL); + if (prop) + dev->iommu_fwspec->num_pasid_bits = be32_to_cpu(*prop); + } } /* diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a6d417785c7b..2eb65d4724bb 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -535,6 +535,8 @@ struct iommu_fwspec { struct fwnode_handle *iommu_fwnode; void *iommu_priv; unsigned int num_ids; + unsigned int num_pasid_bits; + bool can_stall; u32 ids[1]; }; From patchwork Fri Oct 6 13:31:41 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 
822435 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r870BQgz9t34 for ; Sat, 7 Oct 2017 00:29:11 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752446AbdJFN3G (ORCPT ); Fri, 6 Oct 2017 09:29:06 -0400 Received: from foss.arm.com ([217.140.101.70]:60612 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752406AbdJFN3E (ORCPT ); Fri, 6 Oct 2017 09:29:04 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1D1E615BE; Fri, 6 Oct 2017 06:29:04 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 43DF63F578; Fri, 6 Oct 2017 06:28:59 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 14/36] iommu/arm-smmu-v3: Add 
support for Substream IDs Date: Fri, 6 Oct 2017 14:31:41 +0100 Message-Id: <20171006133203.22803-15-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org

At the moment, the SMMUv3 driver offers only one stage-1 or stage-2 address space to each device. SMMUv3 allows associating multiple address spaces with each device. In addition to the Stream ID (SID), which identifies a device, we can now have Substream IDs (SSID) identifying an address space. In PCIe lingo, SID is called Requester ID (RID) and SSID is called Process Address-Space ID (PASID).

Prepare the driver for SSID support, by adding context descriptor tables in STEs (previously a single static context descriptor). A complete stage-1 walk is now performed like this by the SMMU:

       Stream tables          Ctx. tables          Page tables
        +--------+   ,------->+-------+   ,------->+-------+
        :        :   |        :       :   |        :       :
        +--------+   |        +-------+   |        +-------+
   SID->|  STE   |---'  SSID->|  CD   |---'  IOVA->|  PTE  |--> IPA
        +--------+            +-------+            +-------+
        :        :            :       :            :       :
        +--------+            +-------+            +-------+

SSIDs are allocated by the core. Note that we only implement one level of context descriptor table for now, but as with stream and page tables, an SSID can be split to target multiple levels of tables.

In all stream table entries, we set S1DSS=SSID0 mode, which forces all traffic lacking an SSID to be routed to context descriptor 0.
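As a rough illustration of the first two steps of that walk (SID to STE, then SSID to CD), here is a small software model in C. It is a sketch under stated assumptions, not SMMU or driver code: the sketch_* types and the function are hypothetical, both tables are flat single-level arrays, and the only S1DSS behaviour modelled is the one described above, where traffic without an SSID is steered to context descriptor 0. Whatever else the hardware does for SSID-tagged traffic is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified context descriptor: one per address space. */
struct sketch_cd {
	bool valid;
	uint16_t asid;
	uint64_t ttbr;		/* base of the stage-1 page tables */
};

/* Hypothetical, simplified stream table entry: one per device (SID). */
struct sketch_ste {
	bool valid;
	struct sketch_cd *cd_table;	/* single-level CD table */
	size_t num_contexts;		/* number of entries in cd_table */
};

/*
 * Resolve (SID, optional SSID) to a context descriptor.
 * Returns NULL when any step of the walk would fault in this model.
 */
static const struct sketch_cd *
sketch_lookup_cd(const struct sketch_ste *strtab, size_t num_sids,
		 uint32_t sid, bool ssid_valid, uint32_t ssid)
{
	const struct sketch_ste *ste;
	const struct sketch_cd *cd;

	if (sid >= num_sids)
		return NULL;

	ste = &strtab[sid];
	if (!ste->valid || !ste->cd_table)
		return NULL;

	/* S1DSS=SSID0 as described above: no SSID means use CD 0. */
	if (!ssid_valid)
		ssid = 0;

	if (ssid >= ste->num_contexts)
		return NULL;

	cd = &ste->cd_table[ssid];
	return cd->valid ? cd : NULL;
}
```

The third step, translating an IOVA through the page tables that the selected CD points to, is unchanged by this patch and is therefore not part of the sketch.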
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 228 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 188 insertions(+), 40 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index ecc424b15749..37061e1cbae4 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -244,6 +244,12 @@ #define STRTAB_STE_0_S1CDMAX_SHIFT 59 #define STRTAB_STE_0_S1CDMAX_MASK 0x1fUL +#define STRTAB_STE_1_S1DSS_SHIFT 0 +#define STRTAB_STE_1_S1DSS_MASK 0x3UL +#define STRTAB_STE_1_S1DSS_TERMINATE (0x0 << STRTAB_STE_1_S1DSS_SHIFT) +#define STRTAB_STE_1_S1DSS_BYPASS (0x1 << STRTAB_STE_1_S1DSS_SHIFT) +#define STRTAB_STE_1_S1DSS_SSID0 (0x2 << STRTAB_STE_1_S1DSS_SHIFT) + #define STRTAB_STE_1_S1C_CACHE_NC 0UL #define STRTAB_STE_1_S1C_CACHE_WBRA 1UL #define STRTAB_STE_1_S1C_CACHE_WT 2UL @@ -349,10 +355,14 @@ #define CMDQ_0_OP_MASK 0xffUL #define CMDQ_0_SSV (1UL << 11) +#define CMDQ_PREFETCH_0_SSID_SHIFT 12 +#define CMDQ_PREFETCH_0_SSID_MASK 0xfffffUL #define CMDQ_PREFETCH_0_SID_SHIFT 32 #define CMDQ_PREFETCH_1_SIZE_SHIFT 0 #define CMDQ_PREFETCH_1_ADDR_MASK ~0xfffUL +#define CMDQ_CFGI_0_SSID_SHIFT 12 +#define CMDQ_CFGI_0_SSID_MASK 0xfffffUL #define CMDQ_CFGI_0_SID_SHIFT 32 #define CMDQ_CFGI_0_SID_MASK 0xffffffffUL #define CMDQ_CFGI_1_LEAF (1UL << 0) @@ -469,14 +479,18 @@ struct arm_smmu_cmdq_ent { #define CMDQ_OP_PREFETCH_CFG 0x1 struct { u32 sid; + u32 ssid; u8 size; u64 addr; } prefetch; #define CMDQ_OP_CFGI_STE 0x3 #define CMDQ_OP_CFGI_ALL 0x4 + #define CMDQ_OP_CFGI_CD 0x5 + #define CMDQ_OP_CFGI_CD_ALL 0x6 struct { u32 sid; + u32 ssid; union { bool leaf; u8 span; @@ -546,16 +560,20 @@ struct arm_smmu_strtab_l1_desc { dma_addr_t l2ptr_dma; }; +struct arm_smmu_ctx_desc { + u16 asid; + u64 ttbr; + u64 tcr; + u64 mair; +}; + struct arm_smmu_s1_cfg { __le64 *cdptr; dma_addr_t cdptr_dma; - struct arm_smmu_ctx_desc { - u16 asid; - u64 ttbr; - u64 tcr; - u64 mair; - } cd; + size_t num_contexts; + + struct 
arm_smmu_ctx_desc cd; /* Default context (SSID0) */ }; struct arm_smmu_s2_cfg { @@ -649,6 +667,8 @@ struct arm_smmu_master_data { struct list_head list; /* domain->devices */ struct device *dev; + + size_t num_ssids; }; /* SMMU private data for an IOMMU domain */ @@ -840,14 +860,22 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) case CMDQ_OP_TLBI_NSNH_ALL: break; case CMDQ_OP_PREFETCH_CFG: + cmd[0] |= ent->substream_valid ? CMDQ_0_SSV : 0; cmd[0] |= (u64)ent->prefetch.sid << CMDQ_PREFETCH_0_SID_SHIFT; + cmd[0] |= ent->prefetch.ssid << CMDQ_PREFETCH_0_SSID_SHIFT; cmd[1] |= ent->prefetch.size << CMDQ_PREFETCH_1_SIZE_SHIFT; cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK; break; + case CMDQ_OP_CFGI_CD: + cmd[0] |= ent->cfgi.ssid << CMDQ_CFGI_0_SSID_SHIFT; + /* pass through */ case CMDQ_OP_CFGI_STE: cmd[0] |= (u64)ent->cfgi.sid << CMDQ_CFGI_0_SID_SHIFT; cmd[1] |= ent->cfgi.leaf ? CMDQ_CFGI_1_LEAF : 0; break; + case CMDQ_OP_CFGI_CD_ALL: + cmd[0] |= (u64)ent->cfgi.sid << CMDQ_CFGI_0_SID_SHIFT; + break; case CMDQ_OP_CFGI_ALL: /* Cover the entire SID range */ cmd[1] |= CMDQ_CFGI_1_RANGE_MASK << CMDQ_CFGI_1_RANGE_SHIFT; @@ -972,6 +1000,35 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu, } /* Context descriptor manipulation functions */ +static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, u32 ssid) +{ + size_t i; + unsigned long flags; + struct arm_smmu_master_data *master; + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_cmdq_ent cmd = { + .opcode = CMDQ_OP_CFGI_CD, + .cfgi = { + .ssid = ssid, + .leaf = true, + }, + }; + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, list) { + struct iommu_fwspec *fwspec = master->dev->iommu_fwspec; + + for (i = 0; i < fwspec->num_ids; i++) { + cmd.cfgi.sid = fwspec->ids[i]; + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + } + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + 
cmd.opcode = CMDQ_OP_CMD_SYNC; + arm_smmu_cmdq_issue_cmd(smmu, &cmd); +} + static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) { u64 val = 0; @@ -990,33 +1047,116 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) return val; } -static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu, - struct arm_smmu_s1_cfg *cfg) +static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, + u32 ssid, struct arm_smmu_ctx_desc *cd) { u64 val; + bool cd_live; + __u64 *cdptr = (__u64 *)smmu_domain->s1_cfg.cdptr + ssid * CTXDESC_CD_DWORDS; /* - * We don't need to issue any invalidation here, as we'll invalidate - * the STE when installing the new entry anyway. + * This function handles the following cases: + * + * (1) Install primary CD, for normal DMA traffic (SSID = 0). In this + * case, invalidation is performed when installing the STE. + * (2) Install a secondary CD, for SID+SSID traffic, followed by an + * invalidation. + * (3) Update ASID of primary CD. This is allowed by atomically writing + * the first 64 bits of the CD, followed by invalidation of the old + * entry and mappings. + * (4) Remove a secondary CD and invalidate it. */ - val = arm_smmu_cpu_tcr_to_cd(cfg->cd.tcr) | + + val = le64_to_cpu(cdptr[0]); + cd_live = !!(val & CTXDESC_CD_0_V); + + if (!cd) { + /* (4) */ + cdptr[0] = 0; + if (ssid) + arm_smmu_sync_cd(smmu_domain, ssid); + return; + } + + if (cd_live) { + /* (3) */ + val &= ~(CTXDESC_CD_0_ASID_MASK << CTXDESC_CD_0_ASID_SHIFT); + val |= (u64)cd->asid << CTXDESC_CD_0_ASID_SHIFT; + + cdptr[0] = cpu_to_le64(val); + /* + * Until CD+TLB invalidation, both ASIDs may be used for tagging + * this substream's traffic + */ + + } else { + /* (1) and (2) */ + cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK + << CTXDESC_CD_1_TTB0_SHIFT); + cdptr[2] = 0; + cdptr[3] = cpu_to_le64(cd->mair << CTXDESC_CD_3_MAIR_SHIFT); + + if (ssid) + /* + * STE is live, and the SMMU might fetch this CD at any + * time. 
Ensure it observes the rest of the CD before we + * enable it. + */ + arm_smmu_sync_cd(smmu_domain, ssid); + + val = arm_smmu_cpu_tcr_to_cd(cd->tcr) | #ifdef __BIG_ENDIAN - CTXDESC_CD_0_ENDI | + CTXDESC_CD_0_ENDI | #endif - CTXDESC_CD_0_R | CTXDESC_CD_0_A | CTXDESC_CD_0_ASET_PRIVATE | - CTXDESC_CD_0_AA64 | (u64)cfg->cd.asid << CTXDESC_CD_0_ASID_SHIFT | - CTXDESC_CD_0_V; + CTXDESC_CD_0_R | CTXDESC_CD_0_A | + CTXDESC_CD_0_ASET_PRIVATE | + CTXDESC_CD_0_AA64 | + (u64)cd->asid << CTXDESC_CD_0_ASID_SHIFT | + CTXDESC_CD_0_V; + + /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ + if (smmu_domain->smmu->features & ARM_SMMU_FEAT_STALL_FORCE) + val |= CTXDESC_CD_0_S; + + cdptr[0] = cpu_to_le64(val); + } + + if (ssid || cd_live) + arm_smmu_sync_cd(smmu_domain, ssid); +} + +static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) +{ + int num_ssids; + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; + + if (WARN_ON(smmu_domain->stage != ARM_SMMU_DOMAIN_S1)) + return -EINVAL; + + num_ssids = cfg->num_contexts; + + cfg->cdptr = dmam_alloc_coherent(smmu->dev, + num_ssids * (CTXDESC_CD_DWORDS << 3), + &cfg->cdptr_dma, + GFP_KERNEL | __GFP_ZERO); + if (!cfg->cdptr) + return -ENOMEM; - /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ - if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE) - val |= CTXDESC_CD_0_S; + return 0; +} - cfg->cdptr[0] = cpu_to_le64(val); +static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) +{ + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; - val = cfg->cd.ttbr & CTXDESC_CD_1_TTB0_MASK << CTXDESC_CD_1_TTB0_SHIFT; - cfg->cdptr[1] = cpu_to_le64(val); + if (WARN_ON(smmu_domain->stage != ARM_SMMU_DOMAIN_S1)) + return; - cfg->cdptr[3] = cpu_to_le64(cfg->cd.mair << CTXDESC_CD_3_MAIR_SHIFT); + dmam_free_coherent(smmu->dev, + cfg->num_contexts * (CTXDESC_CD_DWORDS << 3), + cfg->cdptr, cfg->cdptr_dma); } /* Stream table manipulation 
functions */ @@ -1115,8 +1255,12 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, } if (ste->s1_cfg) { + unsigned int s1cdmax = ilog2(ste->s1_cfg->num_contexts); + BUG_ON(ste_live); + dst[1] = cpu_to_le64( + STRTAB_STE_1_S1DSS_SSID0 | STRTAB_STE_1_S1C_CACHE_WBRA << STRTAB_STE_1_S1CIR_SHIFT | STRTAB_STE_1_S1C_CACHE_WBRA @@ -1133,6 +1277,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, val |= (ste->s1_cfg->cdptr_dma & STRTAB_STE_0_S1CTXPTR_MASK << STRTAB_STE_0_S1CTXPTR_SHIFT) | + (u64)(s1cdmax & STRTAB_STE_0_S1CDMAX_MASK) + << STRTAB_STE_0_S1CDMAX_SHIFT | + STRTAB_STE_0_S1FMT_LINEAR | STRTAB_STE_0_CFG_S1_TRANS; } @@ -1501,16 +1648,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) iommu_put_dma_cookie(domain); free_io_pgtable_ops(smmu_domain->pgtbl_ops); - /* Free the CD and ASID, if we allocated them */ + /* Free the CD table and ASID, if we allocated them */ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; - - if (cfg->cdptr) { - dmam_free_coherent(smmu_domain->smmu->dev, - CTXDESC_CD_DWORDS << 3, - cfg->cdptr, - cfg->cdptr_dma); - + if (cfg->num_contexts) { + arm_smmu_free_cd_tables(smmu_domain); arm_smmu_bitmap_free(smmu->asid_map, cfg->cd.asid); } } else { @@ -1534,14 +1676,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (asid < 0) return asid; - cfg->cdptr = dmam_alloc_coherent(smmu->dev, CTXDESC_CD_DWORDS << 3, - &cfg->cdptr_dma, - GFP_KERNEL | __GFP_ZERO); - if (!cfg->cdptr) { - dev_warn(smmu->dev, "failed to allocate context descriptor\n"); - ret = -ENOMEM; + ret = arm_smmu_alloc_cd_tables(smmu_domain); + if (ret) goto out_free_asid; - } cfg->cd.asid = (u16)asid; cfg->cd.ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0]; @@ -1571,7 +1708,8 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, return 0; } -static int arm_smmu_domain_finalise(struct iommu_domain *domain) +static 
int arm_smmu_domain_finalise(struct iommu_domain *domain, + struct arm_smmu_master_data *master) { int ret; unsigned long ias, oas; @@ -1600,6 +1738,12 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain) oas = smmu->ias; fmt = ARM_64_LPAE_S1; finalise_stage_fn = arm_smmu_domain_finalise_s1; + + if (master->num_ssids) { + domain->min_pasid = 1; + domain->max_pasid = master->num_ssids - 1; + smmu_domain->s1_cfg.num_contexts = master->num_ssids; + } break; case ARM_SMMU_DOMAIN_NESTED: case ARM_SMMU_DOMAIN_S2: @@ -1717,7 +1861,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) if (!smmu_domain->smmu) { smmu_domain->smmu = smmu; - ret = arm_smmu_domain_finalise(domain); + ret = arm_smmu_domain_finalise(domain, master); if (ret) { smmu_domain->smmu = NULL; goto out_unlock; @@ -1744,7 +1888,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { ste->s1_cfg = &smmu_domain->s1_cfg; ste->s2_cfg = NULL; - arm_smmu_write_ctx_desc(smmu, ste->s1_cfg); + arm_smmu_write_ctx_desc(smmu_domain, 0, &ste->s1_cfg->cd); } else { ste->s1_cfg = NULL; ste->s2_cfg = &smmu_domain->s2_cfg; @@ -1866,6 +2010,10 @@ static int arm_smmu_add_device(struct device *dev) } } + if (smmu->ssid_bits) + master->num_ssids = 1 << min(smmu->ssid_bits, + fwspec->num_pasid_bits); + group = iommu_group_get_for_dev(dev); if (!IS_ERR(group)) { iommu_group_put(group); From patchwork Fri Oct 6 13:31:42 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822436 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from 
vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8C42hzz9t3t for ; Sat, 7 Oct 2017 00:29:15 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752452AbdJFN3L (ORCPT ); Fri, 6 Oct 2017 09:29:11 -0400 Received: from foss.arm.com ([217.140.101.70]:60672 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752450AbdJFN3J (ORCPT ); Fri, 6 Oct 2017 09:29:09 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 33801168F; Fri, 6 Oct 2017 06:29:09 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5A8103F578; Fri, 6 Oct 2017 06:29:04 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 15/36] iommu/arm-smmu-v3: Add second level of context descriptor table Date: Fri, 6 Oct 2017 14:31:42 +0100 Message-Id: <20171006133203.22803-16-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: 
linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org The SMMU can support up to 20 bits of SSID. Add a second level of context descriptor tables to accommodate this. Devices that support more than 1024 SSIDs now have a table of 1024 L1 entries (8kB), pointing to tables of 1024 context descriptors (64kB), allocated on demand. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 198 +++++++++++++++++++++++++++++++++++++++----- 1 file changed, 179 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 37061e1cbae4..c444f9e83b91 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -239,6 +239,8 @@ #define STRTAB_STE_0_S1FMT_SHIFT 4 #define STRTAB_STE_0_S1FMT_LINEAR (0UL << STRTAB_STE_0_S1FMT_SHIFT) +#define STRTAB_STE_0_S1FMT_4K_L2 (1UL << STRTAB_STE_0_S1FMT_SHIFT) +#define STRTAB_STE_0_S1FMT_64K_L2 (2UL << STRTAB_STE_0_S1FMT_SHIFT) #define STRTAB_STE_0_S1CTXPTR_SHIFT 6 #define STRTAB_STE_0_S1CTXPTR_MASK 0x3ffffffffffUL #define STRTAB_STE_0_S1CDMAX_SHIFT 59 @@ -287,7 +289,21 @@ #define STRTAB_STE_3_S2TTB_SHIFT 4 #define STRTAB_STE_3_S2TTB_MASK 0xfffffffffffUL -/* Context descriptor (stage-1 only) */ +/* + * Context descriptor + * + * Linear: when at most 1024 SSIDs are supported + * 2lvl: at most 1024 L1 entries, + * 1024 lazy entries per table. 
+ */ +#define CTXDESC_SPLIT 10 +#define CTXDESC_NUM_L2_ENTRIES (1 << CTXDESC_SPLIT) + +#define CTXDESC_L1_DESC_DWORD 1 +#define CTXDESC_L1_DESC_VALID 1 +#define CTXDESC_L1_DESC_L2PTR_SHIFT 12 +#define CTXDESC_L1_DESC_L2PTR_MASK 0xfffffffffUL + #define CTXDESC_CD_DWORDS 8 #define CTXDESC_CD_0_TCR_T0SZ_SHIFT 0 #define ARM64_TCR_T0SZ_SHIFT 0 @@ -567,9 +583,24 @@ struct arm_smmu_ctx_desc { u64 mair; }; -struct arm_smmu_s1_cfg { +struct arm_smmu_cd_table { __le64 *cdptr; dma_addr_t cdptr_dma; +}; + +struct arm_smmu_s1_cfg { + bool linear; + + union { + struct arm_smmu_cd_table table; + struct { + __le64 *ptr; + dma_addr_t ptr_dma; + size_t num_entries; + + struct arm_smmu_cd_table *tables; + } l1; + }; size_t num_contexts; @@ -1000,7 +1031,8 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu, } /* Context descriptor manipulation functions */ -static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, u32 ssid) +static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, u32 ssid, + bool leaf) { size_t i; unsigned long flags; @@ -1010,7 +1042,7 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, u32 ssid) .opcode = CMDQ_OP_CFGI_CD, .cfgi = { .ssid = ssid, - .leaf = true, + .leaf = leaf, }, }; @@ -1029,6 +1061,69 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, u32 ssid) arm_smmu_cmdq_issue_cmd(smmu, &cmd); } +static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu, + struct arm_smmu_cd_table *desc, + size_t num_entries) +{ + size_t size = num_entries * (CTXDESC_CD_DWORDS << 3); + + desc->cdptr = dmam_alloc_coherent(smmu->dev, size, &desc->cdptr_dma, + GFP_ATOMIC | __GFP_ZERO); + if (!desc->cdptr) + return -ENOMEM; + + return 0; +} + +static void arm_smmu_free_cd_leaf_table(struct arm_smmu_device *smmu, + struct arm_smmu_cd_table *desc, + size_t num_entries) +{ + size_t size = num_entries * (CTXDESC_CD_DWORDS << 3); + + dmam_free_coherent(smmu->dev, size, desc->cdptr, desc->cdptr_dma); +} + 
+static void arm_smmu_write_cd_l1_desc(__le64 *dst, + struct arm_smmu_cd_table *table) +{ + u64 val = (table->cdptr_dma & CTXDESC_L1_DESC_L2PTR_MASK + << CTXDESC_L1_DESC_L2PTR_SHIFT) | CTXDESC_L1_DESC_VALID; + + *dst = cpu_to_le64(val); +} + +static __u64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, u32 ssid) +{ + unsigned long idx; + struct arm_smmu_cd_table *l1_desc; + struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; + + if (cfg->linear) + return cfg->table.cdptr + ssid * CTXDESC_CD_DWORDS; + + idx = ssid >> CTXDESC_SPLIT; + if (idx >= cfg->l1.num_entries) + return NULL; + + l1_desc = &cfg->l1.tables[idx]; + if (!l1_desc->cdptr) { + __le64 *l1ptr = cfg->l1.ptr + idx * CTXDESC_L1_DESC_DWORD; + + if (arm_smmu_alloc_cd_leaf_table(smmu_domain->smmu, l1_desc, + CTXDESC_NUM_L2_ENTRIES)) + return NULL; + + arm_smmu_write_cd_l1_desc(l1ptr, l1_desc); + /* An invalid L1 entry is allowed to be cached */ + arm_smmu_sync_cd(smmu_domain, idx << CTXDESC_SPLIT, false); + } + + idx = ssid & (CTXDESC_NUM_L2_ENTRIES - 1); + + return l1_desc->cdptr + idx * CTXDESC_CD_DWORDS; +} + static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) { u64 val = 0; @@ -1052,7 +1147,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, { u64 val; bool cd_live; - __u64 *cdptr = (__u64 *)smmu_domain->s1_cfg.cdptr + ssid * CTXDESC_CD_DWORDS; + __u64 *cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid); /* * This function handles the following cases: @@ -1067,6 +1162,9 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, * (4) Remove a secondary CD and invalidate it. 
*/ + if (WARN_ON(!cdptr)) + return; + val = le64_to_cpu(cdptr[0]); cd_live = !!(val & CTXDESC_CD_0_V); @@ -1074,7 +1172,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, /* (4) */ cdptr[0] = 0; if (ssid) - arm_smmu_sync_cd(smmu_domain, ssid); + arm_smmu_sync_cd(smmu_domain, ssid, true); return; } @@ -1102,7 +1200,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, * time. Ensure it observes the rest of the CD before we * enable it. */ - arm_smmu_sync_cd(smmu_domain, ssid); + arm_smmu_sync_cd(smmu_domain, ssid, true); val = arm_smmu_cpu_tcr_to_cd(cd->tcr) | #ifdef __BIG_ENDIAN @@ -1122,12 +1220,15 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, } if (ssid || cd_live) - arm_smmu_sync_cd(smmu_domain, ssid); + arm_smmu_sync_cd(smmu_domain, ssid, true); } static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) { + int ret; int num_ssids; + size_t num_leaf_entries, size = 0; + struct arm_smmu_cd_table *leaf_table; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; @@ -1135,28 +1236,80 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) return -EINVAL; num_ssids = cfg->num_contexts; + if (num_ssids <= CTXDESC_NUM_L2_ENTRIES) { + /* Fits in a single table */ + cfg->linear = true; + num_leaf_entries = num_ssids; + leaf_table = &cfg->table; + } else { + /* + * SSID[S1CDmax-1:10] indexes 1st-level table, SSID[9:0] indexes + * 2nd-level + */ + cfg->linear = false; + cfg->l1.num_entries = num_ssids / CTXDESC_NUM_L2_ENTRIES; - cfg->cdptr = dmam_alloc_coherent(smmu->dev, - num_ssids * (CTXDESC_CD_DWORDS << 3), - &cfg->cdptr_dma, - GFP_KERNEL | __GFP_ZERO); - if (!cfg->cdptr) - return -ENOMEM; + cfg->l1.tables = devm_kzalloc(smmu->dev, + sizeof(struct arm_smmu_cd_table) * + cfg->l1.num_entries, GFP_KERNEL); + if (!cfg->l1.tables) + return -ENOMEM; + + size = cfg->l1.num_entries * (CTXDESC_L1_DESC_DWORD << 3); 
+ cfg->l1.ptr = dmam_alloc_coherent(smmu->dev, size, + &cfg->l1.ptr_dma, + GFP_KERNEL | __GFP_ZERO); + if (!cfg->l1.ptr) { + devm_kfree(smmu->dev, cfg->l1.tables); + return -ENOMEM; + } + + num_leaf_entries = CTXDESC_NUM_L2_ENTRIES; + leaf_table = cfg->l1.tables; + } + + ret = arm_smmu_alloc_cd_leaf_table(smmu, leaf_table, num_leaf_entries); + if (ret) { + if (!cfg->linear) { + dmam_free_coherent(smmu->dev, size, cfg->l1.ptr, + cfg->l1.ptr_dma); + devm_kfree(smmu->dev, cfg->l1.tables); + } + + return ret; + } + + if (!cfg->linear) + arm_smmu_write_cd_l1_desc(cfg->l1.ptr, leaf_table); return 0; } static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) { + size_t i, size; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; if (WARN_ON(smmu_domain->stage != ARM_SMMU_DOMAIN_S1)) return; - dmam_free_coherent(smmu->dev, - cfg->num_contexts * (CTXDESC_CD_DWORDS << 3), - cfg->cdptr, cfg->cdptr_dma); + if (cfg->linear) { + arm_smmu_free_cd_leaf_table(smmu, &cfg->table, cfg->num_contexts); + } else { + for (i = 0; i < cfg->l1.num_entries; i++) { + struct arm_smmu_cd_table *desc = &cfg->l1.tables[i]; + + if (!desc->cdptr) + continue; + + arm_smmu_free_cd_leaf_table(smmu, desc, + CTXDESC_NUM_L2_ENTRIES); + } + + size = cfg->l1.num_entries * (CTXDESC_L1_DESC_DWORD << 3); + dmam_free_coherent(smmu->dev, size, cfg->l1.ptr, cfg->l1.ptr_dma); + } } /* Stream table manipulation functions */ @@ -1255,10 +1408,16 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, } if (ste->s1_cfg) { + dma_addr_t s1ctxptr; unsigned int s1cdmax = ilog2(ste->s1_cfg->num_contexts); BUG_ON(ste_live); + if (ste->s1_cfg->linear) + s1ctxptr = ste->s1_cfg->table.cdptr_dma; + else + s1ctxptr = ste->s1_cfg->l1.ptr_dma; + dst[1] = cpu_to_le64( STRTAB_STE_1_S1DSS_SSID0 | STRTAB_STE_1_S1C_CACHE_WBRA @@ -1275,11 +1434,12 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, !(smmu->features & 
ARM_SMMU_FEAT_STALL_FORCE)) dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); - val |= (ste->s1_cfg->cdptr_dma & STRTAB_STE_0_S1CTXPTR_MASK + val |= (s1ctxptr & STRTAB_STE_0_S1CTXPTR_MASK << STRTAB_STE_0_S1CTXPTR_SHIFT) | (u64)(s1cdmax & STRTAB_STE_0_S1CDMAX_MASK) << STRTAB_STE_0_S1CDMAX_SHIFT | - STRTAB_STE_0_S1FMT_LINEAR | + (ste->s1_cfg->linear ? STRTAB_STE_0_S1FMT_LINEAR : + STRTAB_STE_0_S1FMT_64K_L2) | STRTAB_STE_0_CFG_S1_TRANS; } From patchwork Fri Oct 6 13:31:43 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822437 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8H2vSpz9t34 for ; Sat, 7 Oct 2017 00:29:19 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752365AbdJFN3R (ORCPT ); Fri, 6 Oct 2017 09:29:17 -0400 Received: from foss.arm.com ([217.140.101.70]:60698 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752459AbdJFN3O (ORCPT ); Fri, 6 Oct 2017 09:29:14 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4A00A165D; Fri, 6 Oct 2017 06:29:14 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 70F013F578; Fri, 6 Oct 2017 06:29:09 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: 
joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 16/36] iommu/arm-smmu-v3: Add support for VHE Date: Fri, 6 Oct 2017 14:31:43 +0100 Message-Id: <20171006133203.22803-17-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org The ARMv8.1 extensions added the Virtualization Host Extensions (VHE), which allow a host kernel to run at EL2. When using normal DMA, device and CPU address spaces are orthogonal, and do not need to implement the same capabilities, so VHE hasn't been in use on the SMMU side until now. With shared address spaces however, ASIDs are shared between MMU and SMMU, and broadcast TLB invalidations issued by a CPU are taken into account by the SMMU. TLB entries on both sides need to have the same exception level in order to be shot down with a single invalidation. When the CPU is using VHE, enable VHE in the SMMU and for all streams. Normal DMA mappings will need to use TLBI_EL2 commands instead of TLBI_NH, but shouldn't be otherwise affected by this change. 
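For illustration only, and not part of the patch: a minimal standalone sketch of the stage-1 opcode choice described above. The opcode values are the ones this patch defines; the helper names s1_tlbi_asid_op() and s1_tlbi_va_op() are hypothetical, and the boolean argument stands in for the driver's ARM_SMMU_FEAT_E2H feature test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Command opcodes, with the values defined in this patch. */
#define CMDQ_OP_TLBI_NH_ASID	0x11
#define CMDQ_OP_TLBI_NH_VA	0x12
#define CMDQ_OP_TLBI_EL2_ASID	0x21
#define CMDQ_OP_TLBI_EL2_VA	0x22

/*
 * Hypothetical helper: opcode for a stage-1 invalidation by ASID.
 * With E2H enabled, entries are tagged EL2, so the EL2 form is needed.
 */
static inline uint8_t s1_tlbi_asid_op(bool e2h)
{
	return e2h ? CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID;
}

/* Hypothetical helper: opcode for a stage-1 invalidation by VA. */
static inline uint8_t s1_tlbi_va_op(bool e2h)
{
	return e2h ? CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
}
```

In the driver itself the same choice is made inline in arm_smmu_tlb_inv_context() and arm_smmu_tlb_inv_range_nosync(); factoring it into helpers is only one possible way to keep the two sites consistent.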
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index c444f9e83b91..27376e1193c1 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -22,6 +22,7 @@ #include #include +#include #include #include #include @@ -516,6 +517,8 @@ struct arm_smmu_cmdq_ent { #define CMDQ_OP_TLBI_NH_ASID 0x11 #define CMDQ_OP_TLBI_NH_VA 0x12 #define CMDQ_OP_TLBI_EL2_ALL 0x20 + #define CMDQ_OP_TLBI_EL2_ASID 0x21 + #define CMDQ_OP_TLBI_EL2_VA 0x22 #define CMDQ_OP_TLBI_S12_VMALL 0x28 #define CMDQ_OP_TLBI_S2_IPA 0x2a #define CMDQ_OP_TLBI_NSNH_ALL 0x30 @@ -655,6 +658,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_STALLS (1 << 11) #define ARM_SMMU_FEAT_HYP (1 << 12) #define ARM_SMMU_FEAT_STALL_FORCE (1 << 13) +#define ARM_SMMU_FEAT_E2H (1 << 14) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -912,6 +916,7 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) cmd[1] |= CMDQ_CFGI_1_RANGE_MASK << CMDQ_CFGI_1_RANGE_SHIFT; break; case CMDQ_OP_TLBI_NH_VA: + case CMDQ_OP_TLBI_EL2_VA: cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT; cmd[1] |= ent->tlbi.leaf ? CMDQ_TLBI_1_LEAF : 0; cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK; @@ -927,6 +932,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) case CMDQ_OP_TLBI_S12_VMALL: cmd[0] |= (u64)ent->tlbi.vmid << CMDQ_TLBI_0_VMID_SHIFT; break; + case CMDQ_OP_TLBI_EL2_ASID: + cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT; + break; case CMDQ_OP_PRI_RESP: cmd[0] |= ent->substream_valid ? 
CMDQ_0_SSV : 0; cmd[0] |= ent->pri.ssid << CMDQ_PRI_0_SSID_SHIFT; @@ -1428,7 +1436,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, #ifdef CONFIG_PCI_ATS STRTAB_STE_1_EATS_TRANS << STRTAB_STE_1_EATS_SHIFT | #endif - STRTAB_STE_1_STRW_NSEL1 << STRTAB_STE_1_STRW_SHIFT); + (smmu->features & ARM_SMMU_FEAT_E2H ? + STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1) << + STRTAB_STE_1_STRW_SHIFT); if (smmu->features & ARM_SMMU_FEAT_STALLS && !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) @@ -1694,7 +1704,8 @@ static void arm_smmu_tlb_inv_context(void *cookie) struct arm_smmu_cmdq_ent cmd; if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - cmd.opcode = CMDQ_OP_TLBI_NH_ASID; + cmd.opcode = smmu->features & ARM_SMMU_FEAT_E2H ? + CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID; cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; cmd.tlbi.vmid = 0; } else { @@ -1719,7 +1730,8 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, }; if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - cmd.opcode = CMDQ_OP_TLBI_NH_VA; + cmd.opcode = smmu->features & ARM_SMMU_FEAT_E2H ? 
+ CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA; cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; } else { cmd.opcode = CMDQ_OP_TLBI_S2_IPA; @@ -2718,7 +2730,11 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) writel_relaxed(reg, smmu->base + ARM_SMMU_CR1); /* CR2 (random crap) */ - reg = CR2_PTM | CR2_RECINVSID | CR2_E2H; + reg = CR2_PTM | CR2_RECINVSID; + + if (smmu->features & ARM_SMMU_FEAT_E2H) + reg |= CR2_E2H; + writel_relaxed(reg, smmu->base + ARM_SMMU_CR2); /* Stream table */ @@ -2868,8 +2884,11 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR0_MSI) smmu->features |= ARM_SMMU_FEAT_MSI; - if (reg & IDR0_HYP) + if (reg & IDR0_HYP) { smmu->features |= ARM_SMMU_FEAT_HYP; + if (cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN)) + smmu->features |= ARM_SMMU_FEAT_E2H; + } /* * The coherency feature as set by FW is used in preference to the ID From patchwork Fri Oct 6 13:31:44 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822439 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8N4TmJz9t3t for ; Sat, 7 Oct 2017 00:29:24 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752247AbdJFN3V (ORCPT ); Fri, 6 Oct 2017 09:29:21 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60752 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752054AbdJFN3T (ORCPT ); Fri, 6 Oct 2017 09:29:19 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by 
usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 60BF5169F; Fri, 6 Oct 2017 06:29:19 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 876E43F578; Fri, 6 Oct 2017 06:29:14 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 17/36] iommu/arm-smmu-v3: Support broadcast TLB maintenance Date: Fri, 6 Oct 2017 14:31:44 +0100 Message-Id: <20171006133203.22803-18-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org The SMMUv3 can handle invalidation targeted at TLB entries with shared ASIDs. If the implementation supports broadcast TLB maintenance, enable it and keep track of it in a feature bit. 
The SMMU will then take into account the following CPU instructions for ASIDs in the shared set: * TLBI VAE1IS(ASID, VA) * TLBI ASIDE1IS(ASID) Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 27376e1193c1..b23f69aa242e 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -64,6 +64,7 @@ #define IDR0_ASID16 (1 << 12) #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) +#define IDR0_BTM (1 << 5) #define IDR0_COHACC (1 << 4) #define IDR0_TTF_SHIFT 2 #define IDR0_TTF_MASK 0x3 @@ -659,6 +660,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_HYP (1 << 12) #define ARM_SMMU_FEAT_STALL_FORCE (1 << 13) #define ARM_SMMU_FEAT_E2H (1 << 14) +#define ARM_SMMU_FEAT_BTM (1 << 15) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -2730,11 +2732,14 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) writel_relaxed(reg, smmu->base + ARM_SMMU_CR1); /* CR2 (random crap) */ - reg = CR2_PTM | CR2_RECINVSID; + reg = CR2_RECINVSID; if (smmu->features & ARM_SMMU_FEAT_E2H) reg |= CR2_E2H; + if (!(smmu->features & ARM_SMMU_FEAT_BTM)) + reg |= CR2_PTM; + writel_relaxed(reg, smmu->base + ARM_SMMU_CR2); /* Stream table */ @@ -2837,6 +2842,7 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) { u32 reg; bool coherent = smmu->features & ARM_SMMU_FEAT_COHERENCY; + bool vhe = cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN); /* IDR0 */ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR0); @@ -2886,11 +2892,20 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR0_HYP) { smmu->features |= ARM_SMMU_FEAT_HYP; - if (cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN)) + if (vhe) smmu->features |= ARM_SMMU_FEAT_E2H; } /* + * If the CPU is using VHE, but the SMMU doesn't support it, the SMMU + * will create TLB entries for NH-EL1 world and will miss the + 
* broadcasted TLB invalidations that target EL2-E2H world. Don't enable + * BTM in that case. + */ + if (reg & IDR0_BTM && (!vhe || reg & IDR0_HYP)) + smmu->features |= ARM_SMMU_FEAT_BTM; + + /* * The coherency feature as set by FW is used in preference to the ID * register, but warn on mismatch. */ From patchwork Fri Oct 6 13:31:45 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822440 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8R1zrlz9t3m for ; Sat, 7 Oct 2017 00:29:27 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752392AbdJFN30 (ORCPT ); Fri, 6 Oct 2017 09:29:26 -0400 Received: from foss.arm.com ([217.140.101.70]:60788 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752353AbdJFN3Y (ORCPT ); Fri, 6 Oct 2017 09:29:24 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7752215BE; Fri, 6 Oct 2017 06:29:24 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 9E1ED3F578; Fri, 6 Oct 2017 06:29:19 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, 
hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 18/36] iommu/arm-smmu-v3: Add SVM feature checking Date: Fri, 6 Oct 2017 14:31:45 +0100 Message-Id: <20171006133203.22803-19-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Aggregate all sanity-checks for sharing CPU page tables with the SMMU under a single ARM_SMMU_FEAT_SVM bit. For PCIe SVM, users also need to check FEAT_ATS and FEAT_PRI. For platform SVM, they will most likely have to check FEAT_STALLS and FEAT_BTM. 
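For illustration only, and not part of the patch: a minimal standalone sketch of the CPU-side size checks that this bit aggregates. It assumes the architectural layout of ID_AA64MMFR0_EL1, with PARange in bits [3:0] and ASIDBits in bits [7:4]; the helper names and the raw-register argument are hypothetical, whereas the driver reads the sanitised register through the cpufeature API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode ID_AA64MMFR0_EL1.PARange (assumed to be
 * bits [3:0]) into a physical address size in bits. Returns 0 for an
 * encoding this sketch does not know about.
 */
static unsigned int cpu_parange_bits(uint64_t mmfr0)
{
	static const unsigned int bits[] = { 32, 36, 40, 42, 44, 48 };
	unsigned int fld = mmfr0 & 0xf;

	return fld < sizeof(bits) / sizeof(bits[0]) ? bits[fld] : 0;
}

/*
 * Hypothetical helper: decode ID_AA64MMFR0_EL1.ASIDBits (assumed to be
 * bits [7:4]). Any non-zero value is treated as 16-bit ASIDs.
 */
static unsigned int cpu_asid_bits(uint64_t mmfr0)
{
	return ((mmfr0 >> 4) & 0xf) ? 16 : 8;
}

/*
 * The SMMU must output addresses at least as wide as the CPU's, and
 * must support ASIDs at least as wide as the CPU's.
 */
static bool smmu_sizes_compatible(unsigned int smmu_oas,
				  unsigned int smmu_asid_bits,
				  uint64_t mmfr0)
{
	unsigned int cpu_pa = cpu_parange_bits(mmfr0);

	if (!cpu_pa || smmu_oas < cpu_pa)
		return false;

	return smmu_asid_bits >= cpu_asid_bits(mmfr0);
}
```

For example, with mmfr0 = 0x25 (48-bit PA, 16-bit ASIDs), an SMMU with a 48-bit OAS and 16-bit ASIDs passes, while one with a 44-bit OAS or 8-bit ASIDs does not.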
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 60 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index b23f69aa242e..96347aad605f 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -661,6 +661,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_STALL_FORCE (1 << 13) #define ARM_SMMU_FEAT_E2H (1 << 14) #define ARM_SMMU_FEAT_BTM (1 << 15) +#define ARM_SMMU_FEAT_SVM (1 << 16) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -2838,6 +2839,62 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) return 0; } +static bool arm_smmu_supports_svm(struct arm_smmu_device *smmu) +{ + unsigned long reg, fld; + unsigned long oas; + unsigned long asid_bits; + + u32 feat_mask = ARM_SMMU_FEAT_BTM | ARM_SMMU_FEAT_COHERENCY; + + if ((smmu->features & feat_mask) != feat_mask) + return false; + + if (!(smmu->pgsize_bitmap & PAGE_SIZE)) + return false; + + /* + * Get the smallest PA size of all CPUs (sanitized by cpufeature). We're + * not even pretending to support AArch32 here. + */ + reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); + fld = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_PARANGE_SHIFT); + switch (fld) { + case 0x0: + oas = 32; + break; + case 0x1: + oas = 36; + break; + case 0x2: + oas = 40; + break; + case 0x3: + oas = 42; + break; + case 0x4: + oas = 44; + break; + case 0x5: + oas = 48; + break; + default: + return false; + } + + /* abort if MMU outputs addresses greater than what we support. */ + if (smmu->oas < oas) + return false; + + /* We can support bigger ASIDs than the CPU, but not smaller */ + fld = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_ASID_SHIFT); + asid_bits = fld ? 
16 : 8; + if (smmu->asid_bits < asid_bits) + return false; + + return true; +} + static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) { u32 reg; @@ -3032,6 +3089,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) smmu->ias = max(smmu->ias, smmu->oas); + if (arm_smmu_supports_svm(smmu)) + smmu->features |= ARM_SMMU_FEAT_SVM; + dev_info(smmu->dev, "ias %lu-bit, oas %lu-bit (features 0x%08x)\n", smmu->ias, smmu->oas, smmu->features); return 0; From patchwork Fri Oct 6 13:31:46 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822441 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8X46MMz9t4P for ; Sat, 7 Oct 2017 00:29:32 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752254AbdJFN3a (ORCPT ); Fri, 6 Oct 2017 09:29:30 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60838 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752120AbdJFN33 (ORCPT ); Fri, 6 Oct 2017 09:29:29 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8DD661435; Fri, 6 Oct 2017 06:29:29 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B4B243F578; Fri, 6 Oct 2017 06:29:24 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, 
devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 19/36] arm64: mm: Pin down ASIDs for sharing contexts with devices Date: Fri, 6 Oct 2017 14:31:46 +0100 Message-Id: <20171006133203.22803-20-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org In order to enable address space sharing with the IOMMU, we introduce functions mm_context_get and mm_context_put, that pin down a context and ensure that its ASID won't be modified willy-nilly after a rollover. Pinning is necessary because, once a device is using an ASID, it needs a valid and unique one at all times, whether the associated task is running or not. Without pinning, we would need to notify the IOMMU when we're about to use a new ASID for a task. Things would get messy when a new task is assigned a shared ASID. Consider the following scenario: 1. Task t1 is running on CPUx with shared ASID (1, 1) 2. Task t2 is scheduled on CPUx, gets ASID (1, 2) 3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1) We would now have to immediately generate a new ASID for t1, notify the IOMMU, and finally enable task tn. 
We are holding the lock during all that time, since we can't afford having another CPU trigger a rollover. It gets needlessly complicated, and all we wanted to do was schedule poor task tn, that has no business with the IOMMU. By letting the IOMMU pin tasks when needed, we avoid stalling the slow path, and let the pinning fail when we're out of potential ASIDs. After a rollover, we assume that there is at least one more ASID than number of CPUs. So we can use (NR_ASIDS - NR_CPUS - 1) as a hard limit for the number of ASIDs we can afford to share with the IOMMU. Since multiple IOMMUs could pin the same context, we need to keep track of the number of references. Add a refcount value in mm_context_t for this purpose. Signed-off-by: Jean-Philippe Brucker --- arch/arm64/include/asm/mmu.h | 1 + arch/arm64/include/asm/mmu_context.h | 11 ++++- arch/arm64/mm/context.c | 80 +++++++++++++++++++++++++++++++++++- 3 files changed, 90 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 0d34bf0a89c7..3e687fc49825 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -20,6 +20,7 @@ typedef struct { atomic64_t id; + unsigned long refcount; void *vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 3257895a9b5e..52c2f8e04a18 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -154,7 +154,13 @@ static inline void cpu_replace_ttbr1(pgd_t *pgd) #define destroy_context(mm) do { } while(0) void check_and_switch_context(struct mm_struct *mm, unsigned int cpu); -#define init_new_context(tsk,mm) ({ atomic64_set(&(mm)->context.id, 0); 0; }) +static inline int +init_new_context(struct task_struct *tsk, struct mm_struct *mm) +{ + atomic64_set(&mm->context.id, 0); + mm->context.refcount = 0; + return 0; +} /* * This is called when "tsk" is about to enter lazy TLB mode. 
@@ -226,6 +232,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next, void verify_cpu_asid_bits(void); +unsigned long mm_context_get(struct mm_struct *mm); +void mm_context_put(struct mm_struct *mm); + #endif /* !__ASSEMBLY__ */ #endif /* !__ASM_MMU_CONTEXT_H */ diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index ab9f5f0fb2c7..a15c90083a57 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -37,6 +37,10 @@ static DEFINE_PER_CPU(atomic64_t, active_asids); static DEFINE_PER_CPU(u64, reserved_asids); static cpumask_t tlb_flush_pending; +static unsigned long max_pinned_asids; +static unsigned long nr_pinned_asids; +static unsigned long *pinned_asid_map; + #define ASID_MASK (~GENMASK(asid_bits - 1, 0)) #define ASID_FIRST_VERSION (1UL << asid_bits) #define NUM_USER_ASIDS ASID_FIRST_VERSION @@ -92,7 +96,7 @@ static void flush_context(unsigned int cpu) u64 asid; /* Update the list of reserved ASIDs and the ASID bitmap. */ - bitmap_clear(asid_map, 0, NUM_USER_ASIDS); + bitmap_copy(asid_map, pinned_asid_map, NUM_USER_ASIDS); set_reserved_asid_bits(); @@ -154,6 +158,10 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu) if (asid != 0) { u64 newasid = generation | (asid & ~ASID_MASK); + /* That ASID is pinned for us, we're good to go. */ + if (mm->context.refcount) + return newasid; + /* * If our current ASID was active during a rollover, we * can continue to use it and this was just a false alarm. 
@@ -235,6 +243,63 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu) cpu_switch_mm(mm->pgd, mm); } +unsigned long mm_context_get(struct mm_struct *mm) +{ + unsigned long flags; + u64 asid; + + raw_spin_lock_irqsave(&cpu_asid_lock, flags); + + asid = atomic64_read(&mm->context.id); + + if (mm->context.refcount) { + mm->context.refcount++; + asid &= ~ASID_MASK; + goto out_unlock; + } + + if (nr_pinned_asids >= max_pinned_asids) { + asid = 0; + goto out_unlock; + } + + if (((asid ^ atomic64_read(&asid_generation)) >> asid_bits)) { + /* + * We went through one or more rollovers since that ASID was + * used. Ensure that it is still valid, or generate a new one. + * The cpu argument isn't used by new_context. + */ + asid = new_context(mm, 0); + atomic64_set(&mm->context.id, asid); + } + + asid &= ~ASID_MASK; + + nr_pinned_asids++; + __set_bit(asid, pinned_asid_map); + mm->context.refcount++; + +out_unlock: + raw_spin_unlock_irqrestore(&cpu_asid_lock, flags); + + return asid; +} + +void mm_context_put(struct mm_struct *mm) +{ + unsigned long flags; + u64 asid = atomic64_read(&mm->context.id) & ~ASID_MASK; + + raw_spin_lock_irqsave(&cpu_asid_lock, flags); + + if (--mm->context.refcount == 0) { + __clear_bit(asid, pinned_asid_map); + nr_pinned_asids--; + } + + raw_spin_unlock_irqrestore(&cpu_asid_lock, flags); +} + static int asids_init(void) { asid_bits = get_cpu_asid_bits(); @@ -252,6 +317,19 @@ static int asids_init(void) set_reserved_asid_bits(); + pinned_asid_map = kzalloc(BITS_TO_LONGS(NUM_USER_ASIDS) + * sizeof(*pinned_asid_map), GFP_KERNEL); + if (!pinned_asid_map) + panic("Failed to allocate pinned bitmap\n"); + + /* + * We assume that an ASID is always available after a rollover. This + * means that even if all CPUs have a reserved ASID, there still is at + * least one slot available in the asid_bitmap. 
+ */ + max_pinned_asids = NUM_USER_ASIDS - num_possible_cpus() - 2; + nr_pinned_asids = 0; + pr_info("ASID allocator initialised with %lu entries\n", NUM_USER_ASIDS); return 0; } From patchwork Fri Oct 6 13:31:47 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822442 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8d31wSz9t41 for ; Sat, 7 Oct 2017 00:29:37 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752460AbdJFN3f (ORCPT ); Fri, 6 Oct 2017 09:29:35 -0400 Received: from foss.arm.com ([217.140.101.70]:60880 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752459AbdJFN3f (ORCPT ); Fri, 6 Oct 2017 09:29:35 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A462715A2; Fri, 6 Oct 2017 06:29:34 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id CB2603F578; Fri, 6 Oct 2017 06:29:29 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, 
alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 20/36] iommu/arm-smmu-v3: Track ASID state Date: Fri, 6 Oct 2017 14:31:47 +0100 Message-Id: <20171006133203.22803-21-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org At the moment each SMMU has an 8- or 16-bit ASID set and allocates one ASID per device via a bitmap. ASIDs are used to differentiate address spaces in SMMU TLB entries. With SVM, where process address spaces are shared with the SMMU, we need to use CPU ASIDs in SMMU contexts, to ensure that broadcast TLB invalidations reach the right IOTLB entries. When binding a process address space to a device, we become slaves to the arch ASID allocator. We have to use whatever ASID it gives us. If a domain is currently using it, then we'll either abort or steal that ASID. To make matters worse, tasks are global, while domains are per-SMMU. SMMU ASIDs can be aliased across different SMMUs, but the CPU ASID space is unique across the whole system. Introduce an IDR for SMMU ASID allocation. It lets us keep information about an ASID, for instance which domain it is assigned to or how many devices are using it. 
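To make the bookkeeping described above more concrete, here is a small userspace sketch of per-ASID state tracking. It is only an illustration under stated assumptions: a fixed array stands in for the kernel IDR, all `ex_` names are hypothetical, and the "abort" policy (returning -EEXIST when a private domain already owns the requested ASID) is just one of the two options mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define EX_NUM_ASIDS 256	/* 8-bit ASID space, for illustration only */

/* Per-ASID state: either owned by one private domain, or shared. */
struct ex_asid_state {
	const void *domain;	/* non-NULL when a private domain owns it */
	unsigned long refs;	/* number of shared (process) users */
	int in_use;
};

/* A plain array stands in for the kernel IDR in this sketch. */
struct ex_asid_table {
	struct ex_asid_state slot[EX_NUM_ASIDS];
};

/* Private domains may take any free ASID. Returns the ASID or -ENOSPC. */
int ex_asid_alloc_private(struct ex_asid_table *t, const void *domain)
{
	assert(domain != NULL);
	for (int asid = 0; asid < EX_NUM_ASIDS; asid++) {
		if (!t->slot[asid].in_use) {
			t->slot[asid].in_use = 1;
			t->slot[asid].domain = domain;
			t->slot[asid].refs = 0;
			return asid;
		}
	}
	return -ENOSPC;
}

/*
 * Shared contexts must use the exact ASID chosen by the CPU allocator.
 * Abort with -EEXIST if a private domain already owns that ASID.
 */
int ex_asid_share(struct ex_asid_table *t, int asid)
{
	if (asid < 0 || asid >= EX_NUM_ASIDS)
		return -EINVAL;

	struct ex_asid_state *s = &t->slot[asid];

	if (s->in_use && s->domain)
		return -EEXIST;
	if (!s->in_use) {
		s->in_use = 1;
		s->domain = NULL;
		s->refs = 0;
	}
	s->refs++;
	return 0;
}

/* Drop one shared reference and free the slot when the last user leaves. */
void ex_asid_unshare(struct ex_asid_table *t, int asid)
{
	if (asid < 0 || asid >= EX_NUM_ASIDS)
		return;

	struct ex_asid_state *s = &t->slot[asid];

	if (!s->in_use || s->domain || s->refs == 0)
		return;
	if (--s->refs == 0)
		s->in_use = 0;
}
```

Keeping a small state record per ASID, rather than a single bit, is what lets the allocator tell a privately owned ASID from a shared one and count how many users a shared ASID has.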
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 53 +++++++++++++++++++++++++++++++++++++-------- 1 file changed, 44 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 96347aad605f..71fc3a2c8a95 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -640,6 +640,10 @@ struct arm_smmu_strtab_cfg { u32 strtab_base_cfg; }; +struct arm_smmu_asid_state { + struct arm_smmu_domain *domain; +}; + /* An SMMUv3 instance */ struct arm_smmu_device { struct device *dev; @@ -681,7 +685,8 @@ struct arm_smmu_device { #define ARM_SMMU_MAX_ASIDS (1 << 16) unsigned int asid_bits; - DECLARE_BITMAP(asid_map, ARM_SMMU_MAX_ASIDS); + struct idr asid_idr; + spinlock_t asid_lock; #define ARM_SMMU_MAX_VMIDS (1 << 16) unsigned int vmid_bits; @@ -1828,7 +1833,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; if (cfg->num_contexts) { arm_smmu_free_cd_tables(smmu_domain); - arm_smmu_bitmap_free(smmu->asid_map, cfg->cd.asid); + + spin_lock(&smmu->asid_lock); + kfree(idr_find(&smmu->asid_idr, cfg->cd.asid)); + idr_remove(&smmu->asid_idr, cfg->cd.asid); + spin_unlock(&smmu->asid_lock); } } else { struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; @@ -1844,25 +1853,48 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, { int ret; int asid; + struct arm_smmu_asid_state *asid_state; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; - asid = arm_smmu_bitmap_alloc(smmu->asid_map, smmu->asid_bits); - if (asid < 0) - return asid; - ret = arm_smmu_alloc_cd_tables(smmu_domain); if (ret) - goto out_free_asid; + return ret; + + asid_state = kzalloc(sizeof(*asid_state), GFP_KERNEL); + if (!asid_state) { + ret = -ENOMEM; + goto out_free_tables; + } + + asid_state->domain = smmu_domain; + + idr_preload(GFP_KERNEL); + spin_lock(&smmu->asid_lock); + asid = 
idr_alloc_cyclic(&smmu->asid_idr, asid_state, 0, + 1 << smmu->asid_bits, GFP_ATOMIC); cfg->cd.asid = (u16)asid; cfg->cd.ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0]; cfg->cd.tcr = pgtbl_cfg->arm_lpae_s1_cfg.tcr; cfg->cd.mair = pgtbl_cfg->arm_lpae_s1_cfg.mair[0]; + + spin_unlock(&smmu->asid_lock); + idr_preload_end(); + + if (asid < 0) { + ret = asid; + goto out_free_asid_state; + } + return 0; -out_free_asid: - arm_smmu_bitmap_free(smmu->asid_map, asid); +out_free_asid_state: + kfree(asid_state); + +out_free_tables: + arm_smmu_free_cd_tables(smmu_domain); + return ret; } @@ -2506,6 +2538,9 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu) { int ret; + spin_lock_init(&smmu->asid_lock); + idr_init(&smmu->asid_idr); + ret = arm_smmu_init_queues(smmu); if (ret) return ret; From patchwork Fri Oct 6 13:31:48 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822443 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r8k5l5tz9t3t for ; Sat, 7 Oct 2017 00:29:42 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752228AbdJFN3k (ORCPT ); Fri, 6 Oct 2017 09:29:40 -0400 Received: from foss.arm.com ([217.140.101.70]:60930 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752165AbdJFN3k (ORCPT ); Fri, 6 Oct 2017 09:29:40 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BB4241610; Fri, 6 Oct 2017 06:29:39 -0700 (PDT) Received: from 
e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E18523F578; Fri, 6 Oct 2017 06:29:34 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 21/36] iommu/arm-smmu-v3: Implement process operations Date: Fri, 6 Oct 2017 14:31:48 +0100 Message-Id: <20171006133203.22803-22-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Hook process operations to support PASID and page table sharing with the SMMUv3: * process_alloc pins down the process ASID and initializes the context descriptor fields. * process_free releases the ASID. * process_attach checks device capabilities and writes the context descriptor. More work is required to ensure that the process' ASID isn't being used for io-pgtables. * process_detach clears the context descriptor and sends required invalidations. * process_invalidate sends required invalidations. 
* process_exit stops use of the PASID, clears the context descriptor and performs required invalidations. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 207 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 207 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 71fc3a2c8a95..c86a1182c137 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -37,6 +38,7 @@ #include #include #include +#include #include @@ -642,6 +644,7 @@ struct arm_smmu_strtab_cfg { struct arm_smmu_asid_state { struct arm_smmu_domain *domain; + unsigned long refs; }; /* An SMMUv3 instance */ @@ -712,6 +715,9 @@ struct arm_smmu_master_data { struct device *dev; size_t num_ssids; + bool can_fault; + /* Number of processes attached */ + int processes; }; /* SMMU private data for an IOMMU domain */ @@ -740,6 +746,11 @@ struct arm_smmu_domain { spinlock_t devices_lock; }; +struct arm_smmu_process { + struct iommu_process process; + struct arm_smmu_ctx_desc ctx_desc; +}; + struct arm_smmu_option_prop { u32 opt; const char *prop; @@ -766,6 +777,11 @@ static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) return container_of(dom, struct arm_smmu_domain, domain); } +static struct arm_smmu_process *to_smmu_process(struct iommu_process *process) +{ + return container_of(process, struct arm_smmu_process, process); +} + static void parse_driver_options(struct arm_smmu_device *smmu) { int i = 0; @@ -2032,6 +2048,13 @@ static void arm_smmu_detach_dev(struct device *dev) struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv; struct arm_smmu_domain *smmu_domain = master->domain; + /* + * Core is preventing concurrent calls between attach and bind, so this + * read only races with process_exit (FIXME). 
+ */ + if (master->processes) + __iommu_process_unbind_dev_all(&smmu_domain->domain, dev); + if (smmu_domain) { spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_del(&master->list); @@ -2143,6 +2166,184 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova) return ops->iova_to_phys(ops, iova); } +static int arm_smmu_process_init_pgtable(struct arm_smmu_process *smmu_process, + struct mm_struct *mm) +{ + int asid; + + asid = mm_context_get(mm); + if (!asid) + return -ENOSPC; + + smmu_process->ctx_desc.asid = asid; + /* TODO: init the rest */ + + return 0; +} + +static struct iommu_process *arm_smmu_process_alloc(struct task_struct *task) +{ + int ret; + struct mm_struct *mm; + struct arm_smmu_process *smmu_process; + + smmu_process = kzalloc(sizeof(*smmu_process), GFP_KERNEL); + + mm = get_task_mm(task); + if (!mm) { + kfree(smmu_process); + return NULL; + } + + ret = arm_smmu_process_init_pgtable(smmu_process, mm); + mmput(mm); + if (ret) { + kfree(smmu_process); + return NULL; + } + + return &smmu_process->process; +} + +static void arm_smmu_process_free(struct iommu_process *process) +{ + struct arm_smmu_process *smmu_process = to_smmu_process(process); + + /* Unpin ASID */ + mm_context_put(process->mm); + + kfree(smmu_process); +} + +static int arm_smmu_process_share(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_process *smmu_process) +{ + int asid, ret; + struct arm_smmu_asid_state *asid_state; + struct arm_smmu_device *smmu = smmu_domain->smmu; + + asid = smmu_process->ctx_desc.asid; + + asid_state = idr_find(&smmu->asid_idr, asid); + if (asid_state && asid_state->domain) { + return -EEXIST; + } else if (asid_state) { + asid_state->refs++; + return 0; + } + + asid_state = kzalloc(sizeof(*asid_state), GFP_ATOMIC); + if (!asid_state) + return -ENOMEM; + + asid_state->refs = 1; + + ret = idr_alloc(&smmu->asid_idr, asid_state, asid, asid + 1, GFP_ATOMIC); + return ret < 0 ? 
ret : 0; +} + +static int arm_smmu_process_attach(struct iommu_domain *domain, + struct device *dev, + struct iommu_process *process, bool first) +{ + int ret; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_process *smmu_process = to_smmu_process(process); + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv; + + if (!(smmu->features & ARM_SMMU_FEAT_SVM)) + return -ENODEV; + + /* TODO: process->no_pasid */ + if (process->pasid >= master->num_ssids) + return -ENODEV; + + /* TODO: process->no_need_for_pri_ill_pin_everything */ + if (!master->can_fault) + return -ENODEV; + + master->processes++; + + if (!first) + return 0; + + spin_lock(&smmu->asid_lock); + ret = arm_smmu_process_share(smmu_domain, smmu_process); + spin_unlock(&smmu->asid_lock); + if (ret) + return ret; + + arm_smmu_write_ctx_desc(smmu_domain, process->pasid, &smmu_process->ctx_desc); + + return 0; +} + +static void arm_smmu_process_detach(struct iommu_domain *domain, + struct device *dev, + struct iommu_process *process, bool last) +{ + struct arm_smmu_asid_state *asid_state; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_process *smmu_process = to_smmu_process(process); + struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv; + struct arm_smmu_device *smmu = smmu_domain->smmu; + + master->processes--; + + if (last) { + spin_lock(&smmu->asid_lock); + asid_state = idr_find(&smmu->asid_idr, smmu_process->ctx_desc.asid); + if (--asid_state->refs == 0) { + idr_remove(&smmu->asid_idr, smmu_process->ctx_desc.asid); + kfree(asid_state); + } + spin_unlock(&smmu->asid_lock); + + arm_smmu_write_ctx_desc(smmu_domain, process->pasid, NULL); + } + + /* TODO: Invalidate ATC. */ + /* TODO: Invalidate all mappings if last and not DVM. 
*/ +} + +static void arm_smmu_process_invalidate(struct iommu_domain *domain, + struct iommu_process *process, + unsigned long iova, size_t size) +{ + /* + * TODO: Invalidate ATC. + * TODO: Invalidate mapping if not DVM + */ +} + +static void arm_smmu_process_exit(struct iommu_domain *domain, + struct iommu_process *process) +{ + struct arm_smmu_master_data *master; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + + if (!domain->process_exit) + return; + + spin_lock(&smmu_domain->devices_lock); + list_for_each_entry(master, &smmu_domain->devices, list) { + if (!master->processes) + continue; + + master->processes--; + domain->process_exit(domain, master->dev, process->pasid, + domain->process_exit_token); + + /* TODO: inval ATC */ + } + spin_unlock(&smmu_domain->devices_lock); + + arm_smmu_write_ctx_desc(smmu_domain, process->pasid, NULL); + + /* TODO: Invalidate all mappings if not DVM */ +} + static struct platform_driver arm_smmu_driver; static int arm_smmu_match_node(struct device *dev, void *data) @@ -2351,6 +2552,12 @@ static struct iommu_ops arm_smmu_ops = { .domain_alloc = arm_smmu_domain_alloc, .domain_free = arm_smmu_domain_free, .attach_dev = arm_smmu_attach_dev, + .process_alloc = arm_smmu_process_alloc, + .process_free = arm_smmu_process_free, + .process_attach = arm_smmu_process_attach, + .process_detach = arm_smmu_process_detach, + .process_invalidate = arm_smmu_process_invalidate, + .process_exit = arm_smmu_process_exit, .map = arm_smmu_map, .unmap = arm_smmu_unmap, .map_sg = default_iommu_map_sg, From patchwork Fri Oct 6 13:31:49 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822449 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; 
envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9J6H4mz9t34 for ; Sat, 7 Oct 2017 00:30:12 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752187AbdJFN3q (ORCPT ); Fri, 6 Oct 2017 09:29:46 -0400 Received: from foss.arm.com ([217.140.101.70]:60962 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751827AbdJFN3p (ORCPT ); Fri, 6 Oct 2017 09:29:45 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D1D731650; Fri, 6 Oct 2017 06:29:44 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0490C3F578; Fri, 6 Oct 2017 06:29:39 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 22/36] iommu/io-pgtable-arm: Factor out ARM LPAE register defines Date: Fri, 6 Oct 2017 14:31:49 +0100 Message-Id: <20171006133203.22803-23-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: 
<20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org For SVM, we'll need to extract CPU page table information and mirror it in the substream setup. Move relevant defines to a common header. Fix TCR_SZ_MASK while we're at it. Signed-off-by: Jean-Philippe Brucker --- MAINTAINERS | 1 + drivers/iommu/io-pgtable-arm.c | 48 +----------------------------- drivers/iommu/io-pgtable-arm.h | 67 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 69 insertions(+), 47 deletions(-) create mode 100644 drivers/iommu/io-pgtable-arm.h diff --git a/MAINTAINERS b/MAINTAINERS index 65b0c88d5ee0..cff90315c2ec 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1089,6 +1089,7 @@ S: Maintained F: drivers/iommu/arm-smmu.c F: drivers/iommu/arm-smmu-v3.c F: drivers/iommu/io-pgtable-arm.c +F: drivers/iommu/io-pgtable-arm.h F: drivers/iommu/io-pgtable-arm-v7s.c ARM SUB-ARCHITECTURES diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index e8018a308868..443234a564a6 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -31,6 +31,7 @@ #include #include "io-pgtable.h" +#include "io-pgtable-arm.h" #define ARM_LPAE_MAX_ADDR_BITS 48 #define ARM_LPAE_S2_MAX_CONCAT_PAGES 16 @@ -118,53 +119,6 @@ #define ARM_LPAE_PTE_MEMATTR_DEV (((arm_lpae_iopte)0x1) << 2) /* Register bits */ -#define ARM_32_LPAE_TCR_EAE (1 << 31) -#define ARM_64_LPAE_S2_TCR_RES1 (1 << 31) - -#define ARM_LPAE_TCR_EPD1 (1 << 23) - -#define ARM_LPAE_TCR_TG0_4K (0 << 14) -#define ARM_LPAE_TCR_TG0_64K (1 << 14) -#define ARM_LPAE_TCR_TG0_16K (2 << 14) - -#define ARM_LPAE_TCR_SH0_SHIFT 12 -#define ARM_LPAE_TCR_SH0_MASK 0x3 -#define ARM_LPAE_TCR_SH_NS 0 -#define ARM_LPAE_TCR_SH_OS 2 -#define ARM_LPAE_TCR_SH_IS 3 - -#define ARM_LPAE_TCR_ORGN0_SHIFT 10 -#define ARM_LPAE_TCR_IRGN0_SHIFT 8 -#define ARM_LPAE_TCR_RGN_MASK 0x3 -#define ARM_LPAE_TCR_RGN_NC 0 -#define 
ARM_LPAE_TCR_RGN_WBWA 1 -#define ARM_LPAE_TCR_RGN_WT 2 -#define ARM_LPAE_TCR_RGN_WB 3 - -#define ARM_LPAE_TCR_SL0_SHIFT 6 -#define ARM_LPAE_TCR_SL0_MASK 0x3 - -#define ARM_LPAE_TCR_T0SZ_SHIFT 0 -#define ARM_LPAE_TCR_SZ_MASK 0xf - -#define ARM_LPAE_TCR_PS_SHIFT 16 -#define ARM_LPAE_TCR_PS_MASK 0x7 - -#define ARM_LPAE_TCR_IPS_SHIFT 32 -#define ARM_LPAE_TCR_IPS_MASK 0x7 - -#define ARM_LPAE_TCR_PS_32_BIT 0x0ULL -#define ARM_LPAE_TCR_PS_36_BIT 0x1ULL -#define ARM_LPAE_TCR_PS_40_BIT 0x2ULL -#define ARM_LPAE_TCR_PS_42_BIT 0x3ULL -#define ARM_LPAE_TCR_PS_44_BIT 0x4ULL -#define ARM_LPAE_TCR_PS_48_BIT 0x5ULL - -#define ARM_LPAE_MAIR_ATTR_SHIFT(n) ((n) << 3) -#define ARM_LPAE_MAIR_ATTR_MASK 0xff -#define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 -#define ARM_LPAE_MAIR_ATTR_NC 0x44 -#define ARM_LPAE_MAIR_ATTR_WBRWA 0xff #define ARM_LPAE_MAIR_ATTR_IDX_NC 0 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE 1 #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2 diff --git a/drivers/iommu/io-pgtable-arm.h b/drivers/iommu/io-pgtable-arm.h new file mode 100644 index 000000000000..cb31314971ac --- /dev/null +++ b/drivers/iommu/io-pgtable-arm.h @@ -0,0 +1,67 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . 
+ * + * Copyright (C) 2017 ARM Limited + */ +#ifndef __IO_PGTABLE_ARM_H +#define __IO_PGTABLE_ARM_H + +#define ARM_32_LPAE_TCR_EAE (1 << 31) +#define ARM_64_LPAE_S2_TCR_RES1 (1 << 31) + +#define ARM_LPAE_TCR_EPD1 (1 << 23) + +#define ARM_LPAE_TCR_TG0_4K (0 << 14) +#define ARM_LPAE_TCR_TG0_64K (1 << 14) +#define ARM_LPAE_TCR_TG0_16K (2 << 14) + +#define ARM_LPAE_TCR_SH0_SHIFT 12 +#define ARM_LPAE_TCR_SH0_MASK 0x3 +#define ARM_LPAE_TCR_SH_NS 0 +#define ARM_LPAE_TCR_SH_OS 2 +#define ARM_LPAE_TCR_SH_IS 3 + +#define ARM_LPAE_TCR_ORGN0_SHIFT 10 +#define ARM_LPAE_TCR_IRGN0_SHIFT 8 +#define ARM_LPAE_TCR_RGN_MASK 0x3 +#define ARM_LPAE_TCR_RGN_NC 0 +#define ARM_LPAE_TCR_RGN_WBWA 1 +#define ARM_LPAE_TCR_RGN_WT 2 +#define ARM_LPAE_TCR_RGN_WB 3 + +#define ARM_LPAE_TCR_SL0_SHIFT 6 +#define ARM_LPAE_TCR_SL0_MASK 0x3 + +#define ARM_LPAE_TCR_T0SZ_SHIFT 0 +#define ARM_LPAE_TCR_SZ_MASK 0x3f + +#define ARM_LPAE_TCR_PS_SHIFT 16 +#define ARM_LPAE_TCR_PS_MASK 0x7 + +#define ARM_LPAE_TCR_IPS_SHIFT 32 +#define ARM_LPAE_TCR_IPS_MASK 0x7 + +#define ARM_LPAE_TCR_PS_32_BIT 0x0ULL +#define ARM_LPAE_TCR_PS_36_BIT 0x1ULL +#define ARM_LPAE_TCR_PS_40_BIT 0x2ULL +#define ARM_LPAE_TCR_PS_42_BIT 0x3ULL +#define ARM_LPAE_TCR_PS_44_BIT 0x4ULL +#define ARM_LPAE_TCR_PS_48_BIT 0x5ULL + +#define ARM_LPAE_MAIR_ATTR_SHIFT(n) ((n) << 3) +#define ARM_LPAE_MAIR_ATTR_MASK 0xff +#define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 +#define ARM_LPAE_MAIR_ATTR_NC 0x44 +#define ARM_LPAE_MAIR_ATTR_WBRWA 0xff + +#endif /* __IO_PGTABLE_ARM_H */ From patchwork Fri Oct 6 13:31:50 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822444 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; 
receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r905bb2z9t4P for ; Sat, 7 Oct 2017 00:29:56 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752490AbdJFN3v (ORCPT ); Fri, 6 Oct 2017 09:29:51 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:32788 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752258AbdJFN3u (ORCPT ); Fri, 6 Oct 2017 09:29:50 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E9781165D; Fri, 6 Oct 2017 06:29:49 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 1B09D3F578; Fri, 6 Oct 2017 06:29:44 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 23/36] iommu/arm-smmu-v3: Share process page tables Date: Fri, 6 Oct 2017 14:31:50 +0100 Message-Id: <20171006133203.22803-24-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: 
<20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Copy the content of TCR, MAIR and TTBR of a given task into a context descriptor. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index c86a1182c137..293f260782c2 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -43,6 +43,7 @@ #include #include "io-pgtable.h" +#include "io-pgtable-arm.h" /* MMIO registers */ #define ARM_SMMU_IDR0 0x0 @@ -2170,13 +2171,46 @@ static int arm_smmu_process_init_pgtable(struct arm_smmu_process *smmu_process, struct mm_struct *mm) { int asid; + unsigned long tcr; + unsigned long reg, par; + struct arm_smmu_ctx_desc *cfg = &smmu_process->ctx_desc; asid = mm_context_get(mm); if (!asid) return -ENOSPC; - smmu_process->ctx_desc.asid = asid; - /* TODO: init the rest */ + tcr = TCR_T0SZ(VA_BITS) | TCR_IRGN0_WBWA | TCR_ORGN0_WBWA | + TCR_SH0_INNER | ARM_LPAE_TCR_EPD1; + + switch (PAGE_SIZE) { + case SZ_4K: + tcr |= TCR_TG0_4K; + break; + case SZ_16K: + tcr |= TCR_TG0_16K; + break; + case SZ_64K: + tcr |= TCR_TG0_64K; + break; + default: + WARN_ON(1); + return -EFAULT; + } + + reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); + par = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_PARANGE_SHIFT); + tcr |= par << ARM_LPAE_TCR_IPS_SHIFT; + + tcr |= TCR_TBI0; + + cfg->asid = asid; + cfg->ttbr = virt_to_phys(mm->pgd); + /* + * MAIR value is pretty much constant and global, so we can just get it + * from the current CPU register + */ + cfg->mair = read_sysreg(mair_el1); + cfg->tcr = tcr; return 0; } From patchwork Fri Oct 6 13:31:51 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker 
X-Patchwork-Id: 822445 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9261p6z9t34 for ; Sat, 7 Oct 2017 00:29:58 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752503AbdJFN35 (ORCPT ); Fri, 6 Oct 2017 09:29:57 -0400 Received: from foss.arm.com ([217.140.101.70]:32832 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752376AbdJFN3z (ORCPT ); Fri, 6 Oct 2017 09:29:55 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0B94415BE; Fri, 6 Oct 2017 06:29:55 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 32B643F578; Fri, 6 Oct 2017 06:29:50 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 24/36] 
iommu/arm-smmu-v3: Steal private ASID from a domain Date: Fri, 6 Oct 2017 14:31:51 +0100 Message-Id: <20171006133203.22803-25-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org The SMMU only has one ASID space, so the process allocator competes with the domain allocator for ASIDs. Process ASIDs are allocated by the arch allocator and shared with CPUs, whereas domain ASIDs are private to the SMMU, and not affected by broadcast TLB invalidations. When the process allocator pins an mm_context and gets an ASID that is already in use by the SMMU, it belongs to a domain. At the moment we simply abort the bind, but we can try one step further. Attempt to assign a new private ASID to the domain, and steal the old one for our process. Use the smmu-wide ASID lock to prevent racing with attach_dev over the foreign domain. We now need to also take this lock when modifying entry 0 of the context table. Concurrent modifications of a given context table used to be prevented by group->mutex but in this patch we modify the CD of another group. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 53 +++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 49 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 293f260782c2..e89e6d1263d9 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1731,7 +1731,7 @@ static void arm_smmu_tlb_inv_context(void *cookie) if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { cmd.opcode = smmu->features & ARM_SMMU_FEAT_E2H ? 
CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID; - cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; + cmd.tlbi.asid = READ_ONCE(smmu_domain->s1_cfg.cd.asid); cmd.tlbi.vmid = 0; } else { cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; @@ -1757,7 +1757,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { cmd.opcode = smmu->features & ARM_SMMU_FEAT_E2H ? CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA; - cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; + cmd.tlbi.asid = READ_ONCE(smmu_domain->s1_cfg.cd.asid); } else { cmd.opcode = CMDQ_OP_TLBI_S2_IPA; cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; @@ -2119,7 +2119,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { ste->s1_cfg = &smmu_domain->s1_cfg; ste->s2_cfg = NULL; + spin_lock(&smmu->asid_lock); arm_smmu_write_ctx_desc(smmu_domain, 0, &ste->s1_cfg->cd); + spin_unlock(&smmu->asid_lock); } else { ste->s1_cfg = NULL; ste->s2_cfg = &smmu_domain->s2_cfg; @@ -2253,14 +2255,57 @@ static int arm_smmu_process_share(struct arm_smmu_domain *smmu_domain, struct arm_smmu_process *smmu_process) { int asid, ret; - struct arm_smmu_asid_state *asid_state; + struct arm_smmu_asid_state *asid_state, *new_state; struct arm_smmu_device *smmu = smmu_domain->smmu; asid = smmu_process->ctx_desc.asid; asid_state = idr_find(&smmu->asid_idr, asid); if (asid_state && asid_state->domain) { - return -EEXIST; + struct arm_smmu_domain *smmu_domain = asid_state->domain; + struct arm_smmu_cmdq_ent cmd = { + .opcode = smmu->features & ARM_SMMU_FEAT_E2H ? 
+ CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID, + }; + + new_state = kzalloc(sizeof(*new_state), GFP_ATOMIC); + if (!new_state) + return -ENOMEM; + + new_state->domain = smmu_domain; + + ret = idr_alloc_cyclic(&smmu->asid_idr, new_state, 0, + 1 << smmu->asid_bits, GFP_ATOMIC); + if (ret < 0) { + kfree(new_state); + return ret; + } + + /* + * Race with unmap; TLB invalidations will start targeting the + * new ASID, which isn't assigned yet. We'll do an + * invalidate-all on the old ASID later, so it doesn't matter. + */ + WRITE_ONCE(smmu_domain->s1_cfg.cd.asid, ret); + + /* + * Update ASID and invalidate CD in all associated masters. + * There will be some overlapping between use of both ASIDs, + * until we invalidate the TLB. + */ + arm_smmu_write_ctx_desc(smmu_domain, 0, &smmu_domain->s1_cfg.cd); + + /* Invalidate TLB entries previously associated with that domain */ + cmd.tlbi.asid = asid; + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + cmd.opcode = CMDQ_OP_CMD_SYNC; + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + + asid_state->domain = NULL; + asid_state->refs = 1; + + return 0; + } else if (asid_state) { asid_state->refs++; return 0; From patchwork Fri Oct 6 13:31:52 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822446 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9B0yjJz9t34 for ; Sat, 7 Oct 2017 00:30:06 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752509AbdJFNaC (ORCPT ); Fri, 6 Oct 2017 09:30:02 -0400 Received: from foss.arm.com ([217.140.101.70]:32882 "EHLO 
foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752376AbdJFNaA (ORCPT ); Fri, 6 Oct 2017 09:30:00 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2B9651435; Fri, 6 Oct 2017 06:30:00 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 490A73F578; Fri, 6 Oct 2017 06:29:55 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 25/36] iommu/arm-smmu-v3: Use shared ASID set Date: Fri, 6 Oct 2017 14:31:52 +0100 Message-Id: <20171006133203.22803-26-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org We now have two exclusive sets of ASIDs: private and shared. SMMUv3 allows for contexts to take part in distributed TLB maintenance via the ASET bit. When this bit is 0 for a given context, TLB entries tagged with its ASID are invalidated by broadcast TLB maintenance. 
Set ASET=0 for process contexts. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index e89e6d1263d9..b7355630526a 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1240,7 +1240,8 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, CTXDESC_CD_0_ENDI | #endif CTXDESC_CD_0_R | CTXDESC_CD_0_A | - CTXDESC_CD_0_ASET_PRIVATE | + (ssid ? CTXDESC_CD_0_ASET_SHARED : + CTXDESC_CD_0_ASET_PRIVATE) | CTXDESC_CD_0_AA64 | (u64)cd->asid << CTXDESC_CD_0_ASID_SHIFT | CTXDESC_CD_0_V; From patchwork Fri Oct 6 13:31:53 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822448 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9H4RPsz9t3t for ; Sat, 7 Oct 2017 00:30:11 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752335AbdJFNaI (ORCPT ); Fri, 6 Oct 2017 09:30:08 -0400 Received: from foss.arm.com ([217.140.101.70]:32922 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752376AbdJFNaF (ORCPT ); Fri, 6 Oct 2017 09:30:05 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4233E1684; Fri, 6 Oct 2017 06:30:05 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 68EF73F578; Fri, 
6 Oct 2017 06:30:00 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 26/36] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update Date: Fri, 6 Oct 2017 14:31:53 +0100 Message-Id: <20171006133203.22803-27-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org If the SMMU supports it and the kernel was built with HTTU support, enable hardware update of access and dirty flags. This is essential for shared page tables, to reduce the number of access faults on the fault queue. 
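For illustration only (this is not part of the patch): a minimal user-space sketch of the probing rule applied in arm_smmu_device_hw_probe() by the diff that follows. The bit values are copied from this patch. The helper name httu_features() and the kernel_httu parameter, which stands in for IS_ENABLED(CONFIG_ARM64_HW_AFDBM), are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined by this patch. */
#define IDR0_HD			(1u << 7)
#define IDR0_HA			(1u << 6)
#define ARM_SMMU_FEAT_HA	(1u << 17)
#define ARM_SMMU_FEAT_HD	(1u << 18)

/*
 * Hypothetical helper: derive the HTTU feature bits from an IDR0 value.
 * 'kernel_httu' stands in for IS_ENABLED(CONFIG_ARM64_HW_AFDBM).
 */
static uint32_t httu_features(uint32_t idr0, bool kernel_httu)
{
	uint32_t features = 0;

	/* Either IDR0 bit implies hardware access flag updates. */
	if (kernel_httu && (idr0 & (IDR0_HA | IDR0_HD))) {
		features |= ARM_SMMU_FEAT_HA;
		/* Dirty flag updates are only recorded on top of HA. */
		if (idr0 & IDR0_HD)
			features |= ARM_SMMU_FEAT_HD;
	}

	return features;
}
```

In this sketch HD is never reported without HA, and neither is reported when the kernel was built without HTTU support, which matches the condition in the hunk below.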
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index b7355630526a..2b2e2be03de7 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -67,6 +67,8 @@ #define IDR0_ASID16 (1 << 12) #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) +#define IDR0_HD (1 << 7) +#define IDR0_HA (1 << 6) #define IDR0_BTM (1 << 5) #define IDR0_COHACC (1 << 4) #define IDR0_TTF_SHIFT 2 @@ -342,7 +344,16 @@ #define ARM64_TCR_TBI0_SHIFT 37 #define ARM64_TCR_TBI0_MASK 0x1UL +#define ARM64_TCR_HA_SHIFT 39 +#define ARM64_TCR_HA_MASK 0x1UL +#define ARM64_TCR_HD_SHIFT 40 +#define ARM64_TCR_HD_MASK 0x1UL + #define CTXDESC_CD_0_AA64 (1UL << 41) +#define CTXDESC_CD_0_TCR_HD_SHIFT 42 +#define CTXDESC_CD_0_TCR_HA_SHIFT 43 +#define CTXDESC_CD_0_HD (1UL << CTXDESC_CD_0_TCR_HD_SHIFT) +#define CTXDESC_CD_0_HA (1UL << CTXDESC_CD_0_TCR_HA_SHIFT) #define CTXDESC_CD_0_S (1UL << 44) #define CTXDESC_CD_0_R (1UL << 45) #define CTXDESC_CD_0_A (1UL << 46) @@ -670,6 +681,8 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_E2H (1 << 14) #define ARM_SMMU_FEAT_BTM (1 << 15) #define ARM_SMMU_FEAT_SVM (1 << 16) +#define ARM_SMMU_FEAT_HA (1 << 17) +#define ARM_SMMU_FEAT_HD (1 << 18) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -1157,7 +1170,7 @@ static __u64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, u32 ssid) return l1_desc->cdptr + idx * CTXDESC_CD_DWORDS; } -static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) +static u64 arm_smmu_cpu_tcr_to_cd(struct arm_smmu_device *smmu, u64 tcr) { u64 val = 0; @@ -1172,6 +1185,12 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) val |= ARM_SMMU_TCR2CD(tcr, IPS); val |= ARM_SMMU_TCR2CD(tcr, TBI0); + if (smmu->features & ARM_SMMU_FEAT_HA) + val |= ARM_SMMU_TCR2CD(tcr, HA); + + if (smmu->features & ARM_SMMU_FEAT_HD) + val |= ARM_SMMU_TCR2CD(tcr, HD); + return val; } 
@@ -1235,7 +1254,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, */ arm_smmu_sync_cd(smmu_domain, ssid, true); - val = arm_smmu_cpu_tcr_to_cd(cd->tcr) | + val = arm_smmu_cpu_tcr_to_cd(smmu_domain->smmu, cd->tcr) | #ifdef __BIG_ENDIAN CTXDESC_CD_0_ENDI | #endif @@ -2203,8 +2222,7 @@ static int arm_smmu_process_init_pgtable(struct arm_smmu_process *smmu_process, reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); par = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_PARANGE_SHIFT); tcr |= par << ARM_LPAE_TCR_IPS_SHIFT; - - tcr |= TCR_TBI0; + tcr |= TCR_TBI0 | TCR_HA | TCR_HD; cfg->asid = asid; cfg->ttbr = virt_to_phys(mm->pgd); @@ -3275,6 +3293,12 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) smmu->features |= ARM_SMMU_FEAT_E2H; } + if (IS_ENABLED(CONFIG_ARM64_HW_AFDBM) && (reg & (IDR0_HA | IDR0_HD))) { + smmu->features |= ARM_SMMU_FEAT_HA; + if (reg & IDR0_HD) + smmu->features |= ARM_SMMU_FEAT_HD; + } + /* * If the CPU is using VHE, but the SMMU doesn't support it, the SMMU * will create TLB entries for NH-EL1 world and will miss the From patchwork Fri Oct 6 13:31:54 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822450 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9M2QPYz9t3t for ; Sat, 7 Oct 2017 00:30:15 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752376AbdJFNaM (ORCPT ); Fri, 6 Oct 2017 09:30:12 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:32960 "EHLO foss.arm.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751925AbdJFNaK (ORCPT ); Fri, 6 Oct 2017 09:30:10 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 594D415BF; Fri, 6 Oct 2017 06:30:10 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7F9573F578; Fri, 6 Oct 2017 06:30:05 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 27/36] iommu/arm-smmu-v3: Register fault workqueue Date: Fri, 6 Oct 2017 14:31:54 +0100 Message-Id: <20171006133203.22803-28-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com> References: <20171006133203.22803-1-jean-philippe.brucker@arm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When using PRI or Stall, the PRI or event handler enqueues faults into the core fault queue. Register it based on the SMMU features. When the core stops using a PASID, it notifies the SMMU to flush all instances of this PASID from the PRI queue. 
Add a way to flush the PRI and event queues. The PRI and event threads now take a spinlock while processing the queue. The flush handler takes this lock to inspect the queue state. We avoid livelock, where the SMMU adds faults to the queue faster than we can consume them, by incrementing a 'batch' number on every cycle so the flush handler only has to wait a complete cycle (two batch increments). Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 104 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 102 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 2b2e2be03de7..7d68c6aecb14 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -570,6 +570,10 @@ struct arm_smmu_queue { u32 __iomem *prod_reg; u32 __iomem *cons_reg; + + /* Event and PRI */ + u64 batch; + wait_queue_head_t wq; }; struct arm_smmu_cmdq { @@ -716,6 +720,9 @@ struct arm_smmu_device { /* IOMMU core code handle */ struct iommu_device iommu; + + /* Notifier for the fault queue */ + struct notifier_block faultq_nb; }; /* SMMU private data for each master */ @@ -1568,19 +1575,27 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) { int i; + int num_handled = 0; struct arm_smmu_device *smmu = dev; struct arm_smmu_queue *q = &smmu->evtq.q; + size_t queue_size = 1 << q->max_n_shift; u64 evt[EVTQ_ENT_DWORDS]; + spin_lock(&q->wq.lock); do { while (!queue_remove_raw(q, evt)) { u8 id = evt[0] >> EVTQ_0_ID_SHIFT & EVTQ_0_ID_MASK; + if (++num_handled == queue_size) { + q->batch++; + wake_up_locked(&q->wq); + num_handled = 0; + } + dev_info(smmu->dev, "event 0x%02x received:\n", id); for (i = 0; i < ARRAY_SIZE(evt); ++i) dev_info(smmu->dev, "\t0x%016llx\n", (unsigned long long)evt[i]); - } /* @@ -1593,6 +1608,11 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) /* Sync our overflow flag, as we believe we're up to
speed */ q->cons = Q_OVF(q, q->prod) | Q_WRP(q, q->cons) | Q_IDX(q, q->cons); + + q->batch++; + wake_up_locked(&q->wq); + spin_unlock(&q->wq.lock); + return IRQ_HANDLED; } @@ -1636,13 +1656,24 @@ static void arm_smmu_handle_ppr(struct arm_smmu_device *smmu, u64 *evt) static irqreturn_t arm_smmu_priq_thread(int irq, void *dev) { + int num_handled = 0; struct arm_smmu_device *smmu = dev; struct arm_smmu_queue *q = &smmu->priq.q; + size_t queue_size = 1 << q->max_n_shift; u64 evt[PRIQ_ENT_DWORDS]; + spin_lock(&q->wq.lock); do { - while (!queue_remove_raw(q, evt)) + while (!queue_remove_raw(q, evt)) { + spin_unlock(&q->wq.lock); arm_smmu_handle_ppr(smmu, evt); + spin_lock(&q->wq.lock); + if (++num_handled == queue_size) { + q->batch++; + wake_up_locked(&q->wq); + num_handled = 0; + } + } if (queue_sync_prod(q) == -EOVERFLOW) dev_err(smmu->dev, "PRIQ overflow detected -- requests lost\n"); @@ -1650,9 +1681,65 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev) /* Sync our overflow flag, as we believe we're up to speed */ q->cons = Q_OVF(q, q->prod) | Q_WRP(q, q->cons) | Q_IDX(q, q->cons); + + q->batch++; + wake_up_locked(&q->wq); + spin_unlock(&q->wq.lock); + return IRQ_HANDLED; } +/* + * arm_smmu_flush_queue - wait until all events/PRIs currently in the queue have + * been consumed. + * + * Wait until the queue thread finished a batch, or until the queue is empty. + * Note that we don't handle overflows on q->batch. If it occurs, just wait for + * the queue to be empty. 
+ */ +static int arm_smmu_flush_queue(struct arm_smmu_device *smmu, + struct arm_smmu_queue *q, const char *name) +{ + int ret; + u64 batch; + + spin_lock(&q->wq.lock); + if (queue_sync_prod(q) == -EOVERFLOW) + dev_err(smmu->dev, "%s overflow detected -- requests lost\n", name); + + batch = q->batch; + ret = wait_event_interruptible_locked(q->wq, queue_empty(q) || + q->batch >= batch + 2); + spin_unlock(&q->wq.lock); + + return ret; +} + +static int arm_smmu_flush_queues(struct notifier_block *nb, + unsigned long action, void *data) +{ + struct arm_smmu_device *smmu = container_of(nb, struct arm_smmu_device, + faultq_nb); + struct device *dev = data; + struct arm_smmu_master_data *master = NULL; + + if (dev) + master = dev->iommu_fwspec->iommu_priv; + + if (master) { + /* TODO: add support for PRI and Stall */ + return 0; + } + + /* No target device, flush all queues. */ + if (smmu->features & ARM_SMMU_FEAT_STALLS) + arm_smmu_flush_queue(smmu, &smmu->evtq.q, "evtq"); + if (smmu->features & ARM_SMMU_FEAT_PRI) + arm_smmu_flush_queue(smmu, &smmu->priq.q, "priq"); + + return 0; +} + static irqreturn_t arm_smmu_cmdq_sync_handler(int irq, void *dev) { /* We don't actually use CMD_SYNC interrupts for anything */ @@ -2697,6 +2784,10 @@ static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu, << Q_BASE_LOG2SIZE_SHIFT; q->prod = q->cons = 0; + + init_waitqueue_head(&q->wq); + q->batch = 0; + return 0; } @@ -3594,6 +3685,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev) if (ret) return ret; + if (smmu->features & (ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_PRI)) { + smmu->faultq_nb.notifier_call = arm_smmu_flush_queues; + ret = iommu_fault_queue_register(&smmu->faultq_nb); + if (ret) + return ret; + } + /* And we're up. Go go go! 
*/ ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, "smmu3.%pa", &ioaddr); @@ -3636,6 +3734,8 @@ static int arm_smmu_device_remove(struct platform_device *pdev) { struct arm_smmu_device *smmu = platform_get_drvdata(pdev); + iommu_fault_queue_unregister(&smmu->faultq_nb); + arm_smmu_device_disable(smmu); return 0; From patchwork Fri Oct 6 13:31:55 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822451 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9Q1fdVz9t3t for ; Sat, 7 Oct 2017 00:30:18 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752292AbdJFNaQ (ORCPT ); Fri, 6 Oct 2017 09:30:16 -0400 Received: from foss.arm.com ([217.140.101.70]:32998 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752215AbdJFNaP (ORCPT ); Fri, 6 Oct 2017 09:30:15 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6FBB21610; Fri, 6 Oct 2017 06:30:15 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 96A833F578; Fri, 6 Oct 2017 06:30:10 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, 
 lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com,
 rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com,
 bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com,
 liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com,
 gabriele.paoloni@huawei.com, nwatters@codeaurora.org,
 okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org,
 jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com,
 robdclark@gmail.com
Subject: [RFCv2 PATCH 28/36] iommu/arm-smmu-v3: Maintain a SID->device structure
Date: Fri, 6 Oct 2017 14:31:55 +0100
Message-Id: <20171006133203.22803-29-jean-philippe.brucker@arm.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

When handling faults from the event or PRI queue, we need to find the
struct device associated with a SID. Add an rb_tree, keyed by SID, to keep
track of the streams owned by each master.

Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 104 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 104 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 7d68c6aecb14..4e915e649643 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -721,10 +721,19 @@ struct arm_smmu_device { /* IOMMU core code handle */ struct iommu_device iommu; + struct rb_root streams; + struct mutex streams_mutex; + /* Notifier for the fault queue */ struct notifier_block faultq_nb; }; +struct arm_smmu_stream { + u32 id; + struct arm_smmu_master_data *master; + struct rb_node node; +}; + /* SMMU private data for each master */ struct arm_smmu_master_data { struct arm_smmu_device *smmu; @@ -732,6 +741,7 @@ struct arm_smmu_master_data { struct arm_smmu_domain *domain; struct list_head list; /* domain->devices */ + struct arm_smmu_stream *streams; struct device *dev; @@ -1571,6 +1581,31 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return 0; } +static struct arm_smmu_master_data * +arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) +{ + struct rb_node *node; + struct arm_smmu_stream *stream; + struct arm_smmu_master_data *master = NULL; + + mutex_lock(&smmu->streams_mutex); + node = smmu->streams.rb_node; + while (node) { + stream = rb_entry(node, struct arm_smmu_stream, node); + if (stream->id < sid) { + node = node->rb_right; + } else if (stream->id > sid) { + node = node->rb_left; + } else { + master = stream->master; + break; + } + } + mutex_unlock(&smmu->streams_mutex); + + return master; +} + /* IRQ and event handlers */ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) { @@ -2555,6 +2590,71 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid) return sid < limit; } +static int arm_smmu_insert_master(struct arm_smmu_device *smmu, + struct arm_smmu_master_data *master) +{ + int i; + int ret = 0; + struct arm_smmu_stream 
*new_stream, *cur_stream; + struct rb_node **new_node, *parent_node = NULL; + struct iommu_fwspec *fwspec = master->dev->iommu_fwspec; + + master->streams = kcalloc(fwspec->num_ids, + sizeof(struct arm_smmu_stream), GFP_KERNEL); + if (!master->streams) + return -ENOMEM; + + mutex_lock(&smmu->streams_mutex); + for (i = 0; i < fwspec->num_ids && !ret; i++) { + new_stream = &master->streams[i]; + new_stream->id = fwspec->ids[i]; + new_stream->master = master; + + new_node = &(smmu->streams.rb_node); + while (*new_node) { + cur_stream = rb_entry(*new_node, struct arm_smmu_stream, + node); + parent_node = *new_node; + if (cur_stream->id > new_stream->id) { + new_node = &((*new_node)->rb_left); + } else if (cur_stream->id < new_stream->id) { + new_node = &((*new_node)->rb_right); + } else { + dev_warn(master->dev, + "stream %u already in tree\n", + cur_stream->id); + ret = -EINVAL; + break; + } + } + + if (!ret) { + rb_link_node(&new_stream->node, parent_node, new_node); + rb_insert_color(&new_stream->node, &smmu->streams); + } + } + mutex_unlock(&smmu->streams_mutex); + + return ret; +} + +static void arm_smmu_remove_master(struct arm_smmu_device *smmu, + struct arm_smmu_master_data *master) +{ + int i; + struct iommu_fwspec *fwspec = master->dev->iommu_fwspec; + + if (!master->streams) + return; + + mutex_lock(&smmu->streams_mutex); + for (i = 0; i < fwspec->num_ids; i++) + rb_erase(&master->streams[i].node, &smmu->streams); + mutex_unlock(&smmu->streams_mutex); + + kfree(master->streams); +} + static struct iommu_ops arm_smmu_ops; static int arm_smmu_add_device(struct device *dev) @@ -2609,6 +2709,7 @@ static int arm_smmu_add_device(struct device *dev) group = iommu_group_get_for_dev(dev); if (!IS_ERR(group)) { + arm_smmu_insert_master(smmu, master); iommu_group_put(group); iommu_device_link(&smmu->iommu, dev); } @@ -2629,6 +2730,7 @@ static void arm_smmu_remove_device(struct device *dev) smmu = master->smmu; if (master && master->ste.assigned) 
arm_smmu_detach_dev(dev); + arm_smmu_remove_master(smmu, master); iommu_group_remove_device(dev); iommu_device_unlink(&smmu->iommu, dev); kfree(master); @@ -2936,6 +3038,8 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu) spin_lock_init(&smmu->asid_lock); idr_init(&smmu->asid_idr); + mutex_init(&smmu->streams_mutex); + smmu->streams = RB_ROOT; ret = arm_smmu_init_queues(smmu); if (ret) From patchwork Fri Oct 6 13:31:56 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822452 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9X1NHSz9t34 for ; Sat, 7 Oct 2017 00:30:24 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752525AbdJFNaV (ORCPT ); Fri, 6 Oct 2017 09:30:21 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:33046 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751961AbdJFNaU (ORCPT ); Fri, 6 Oct 2017 09:30:20 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8661215A2; Fri, 6 Oct 2017 06:30:20 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AD3183F578; Fri, 6 Oct 2017 06:30:15 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, 
 robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com,
 will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org,
 sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org,
 robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com,
 tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com,
 xieyisheng1@huawei.com, gabriele.paoloni@huawei.com,
 nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com,
 dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com,
 ashok.raj@intel.com, robdclark@gmail.com
Subject: [RFCv2 PATCH 29/36] iommu/arm-smmu-v3: Add stall support for platform devices
Date: Fri, 6 Oct 2017 14:31:56 +0100
Message-Id: <20171006133203.22803-30-jean-philippe.brucker@arm.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

The SMMU provides a Stall model for handling page faults in platform
devices. It is similar to PCI PRI, but doesn't require devices to have
their own translation cache. Instead, faulting transactions are parked and
the OS is given a chance to fix the page tables and retry the transaction.

Enable stall for devices that support it (opt-in by firmware). When an
event corresponds to a translation error, call the IOMMU fault handler. If
the fault is recoverable, it will call us back to terminate or continue
the stall.

Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 176 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 172 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 4e915e649643..48a1da0934b4 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -418,6 +418,15 @@ #define CMDQ_PRI_1_RESP_FAIL (1UL << CMDQ_PRI_1_RESP_SHIFT) #define CMDQ_PRI_1_RESP_SUCC (2UL << CMDQ_PRI_1_RESP_SHIFT) +#define CMDQ_RESUME_0_SID_SHIFT 32 +#define CMDQ_RESUME_0_SID_MASK 0xffffffffUL +#define CMDQ_RESUME_0_ACTION_SHIFT 12 +#define CMDQ_RESUME_0_ACTION_TERM (0UL << CMDQ_RESUME_0_ACTION_SHIFT) +#define CMDQ_RESUME_0_ACTION_RETRY (1UL << CMDQ_RESUME_0_ACTION_SHIFT) +#define CMDQ_RESUME_0_ACTION_ABORT (2UL << CMDQ_RESUME_0_ACTION_SHIFT) +#define CMDQ_RESUME_1_STAG_SHIFT 0 +#define CMDQ_RESUME_1_STAG_MASK 0xffffUL + #define CMDQ_SYNC_0_CS_SHIFT 12 #define CMDQ_SYNC_0_CS_NONE (0UL << CMDQ_SYNC_0_CS_SHIFT) #define CMDQ_SYNC_0_CS_SEV (2UL << CMDQ_SYNC_0_CS_SHIFT) @@ -429,6 +438,31 @@ #define EVTQ_0_ID_SHIFT 0 #define EVTQ_0_ID_MASK 0xffUL +#define EVT_ID_TRANSLATION_FAULT 0x10 +#define EVT_ID_ADDR_SIZE_FAULT 0x11 +#define EVT_ID_ACCESS_FAULT 0x12 +#define EVT_ID_PERMISSION_FAULT 0x13 + +#define EVTQ_0_SSV (1UL << 11) +#define EVTQ_0_SSID_SHIFT 12 +#define EVTQ_0_SSID_MASK 0xfffffUL +#define EVTQ_0_SID_SHIFT 32 +#define EVTQ_0_SID_MASK 0xffffffffUL +#define EVTQ_1_STAG_SHIFT 0 +#define EVTQ_1_STAG_MASK 0xffffUL +#define EVTQ_1_STALL (1UL << 31) +#define EVTQ_1_PRIV (1UL << 33) +#define EVTQ_1_EXEC (1UL << 34) +#define EVTQ_1_READ (1UL << 35) +#define EVTQ_1_S2 (1UL << 39) +#define EVTQ_1_CLASS_SHIFT 40 +#define EVTQ_1_CLASS_MASK 0x3UL +#define EVTQ_1_TT_READ (1UL << 44) +#define EVTQ_2_ADDR_SHIFT 0 +#define EVTQ_2_ADDR_MASK 0xffffffffffffffffUL +#define EVTQ_3_IPA_SHIFT 12 +#define EVTQ_3_IPA_MASK 0xffffffffffUL + /* PRI queue */ #define PRIQ_ENT_DWORDS 2 #define PRIQ_MAX_SZ_SHIFT 8 @@ -456,6 
+490,9 @@ #define MSI_IOVA_BASE 0x8000000 #define MSI_IOVA_LENGTH 0x100000 +/* Flags for iommu_data in iommu_fault */ +#define ARM_SMMU_FAULT_STALL (1 << 0) + /* Until ACPICA headers cover IORT rev. C */ #ifndef ACPI_IORT_SMMU_HISILICON_HI161X #define ACPI_IORT_SMMU_HISILICON_HI161X 0x1 @@ -552,6 +589,13 @@ struct arm_smmu_cmdq_ent { enum pri_resp resp; } pri; + #define CMDQ_OP_RESUME 0x44 + struct { + u32 sid; + u16 stag; + enum iommu_fault_status resp; + } resume; + #define CMDQ_OP_CMD_SYNC 0x46 }; }; @@ -625,6 +669,7 @@ struct arm_smmu_s1_cfg { }; size_t num_contexts; + bool can_stall; struct arm_smmu_ctx_desc cd; /* Default context (SSID0) */ }; @@ -646,6 +691,8 @@ struct arm_smmu_strtab_ent { bool assigned; struct arm_smmu_s1_cfg *s1_cfg; struct arm_smmu_s2_cfg *s2_cfg; + + bool can_stall; }; struct arm_smmu_strtab_cfg { @@ -1009,6 +1056,21 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) return -EINVAL; } break; + case CMDQ_OP_RESUME: + switch (ent->resume.resp) { + case IOMMU_FAULT_STATUS_FAILURE: + case IOMMU_FAULT_STATUS_INVALID: + cmd[0] |= CMDQ_RESUME_0_ACTION_ABORT; + break; + case IOMMU_FAULT_STATUS_HANDLED: + cmd[0] |= CMDQ_RESUME_0_ACTION_RETRY; + break; + default: + return -EINVAL; + } + cmd[0] |= (u64)ent->resume.sid << CMDQ_RESUME_0_SID_SHIFT; + cmd[1] |= ent->resume.stag << CMDQ_RESUME_1_STAG_SHIFT; + break; case CMDQ_OP_CMD_SYNC: cmd[0] |= CMDQ_SYNC_0_CS_SEV; break; @@ -1093,6 +1155,32 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu, spin_unlock_irqrestore(&smmu->cmdq.lock, flags); } +static int arm_smmu_fault_response(struct iommu_domain *domain, + struct device *dev, + struct iommu_fault *fault, + enum iommu_fault_status resp) +{ + int sid = dev->iommu_fwspec->ids[0]; + struct arm_smmu_cmdq_ent cmd = {0}; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + + if (fault->iommu_data & ARM_SMMU_FAULT_STALL) { + cmd.opcode = CMDQ_OP_RESUME; + cmd.resume.sid = sid; + cmd.resume.stag = 
fault->id; + cmd.resume.resp = resp; + } else { + /* TODO: put PRI response here */ + return -EINVAL; + } + + arm_smmu_cmdq_issue_cmd(smmu_domain->smmu, &cmd); + cmd.opcode = CMDQ_OP_CMD_SYNC; + arm_smmu_cmdq_issue_cmd(smmu_domain->smmu, &cmd); + + return 0; +} + /* Context descriptor manipulation functions */ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, u32 ssid, bool leaf) @@ -1283,7 +1371,8 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, CTXDESC_CD_0_V; /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ - if (smmu_domain->smmu->features & ARM_SMMU_FEAT_STALL_FORCE) + if ((smmu_domain->smmu->features & ARM_SMMU_FEAT_STALL_FORCE) || + (ssid && smmu_domain->s1_cfg.can_stall)) val |= CTXDESC_CD_0_S; cdptr[0] = cpu_to_le64(val); @@ -1503,7 +1592,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, STRTAB_STE_1_STRW_SHIFT); if (smmu->features & ARM_SMMU_FEAT_STALLS && - !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) + !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE) && + !ste->can_stall) dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); val |= (s1ctxptr & STRTAB_STE_0_S1CTXPTR_MASK @@ -1606,10 +1696,72 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) return master; } +static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) +{ + struct iommu_domain *domain; + struct arm_smmu_master_data *master; + u8 type = evt[0] >> EVTQ_0_ID_SHIFT & EVTQ_0_ID_MASK; + u32 sid = evt[0] >> EVTQ_0_SID_SHIFT & EVTQ_0_SID_MASK; + + struct iommu_fault fault = { + .id = evt[1] >> EVTQ_1_STAG_SHIFT & EVTQ_1_STAG_MASK, + .address = evt[2] >> EVTQ_2_ADDR_SHIFT & EVTQ_2_ADDR_MASK, + .iommu_data = ARM_SMMU_FAULT_STALL, + }; + + switch (type) { + case EVT_ID_TRANSLATION_FAULT: + case EVT_ID_ADDR_SIZE_FAULT: + case EVT_ID_ACCESS_FAULT: + case EVT_ID_PERMISSION_FAULT: + break; + default: + return -EFAULT; + } + + /* Stage-2 is always pinned at the moment */ + if (evt[1] & EVTQ_1_S2) + return -EFAULT; + + 
master = arm_smmu_find_master(smmu, sid); + if (!master) + return -EINVAL; + + /* + * The domain is valid until the fault returns, because detach() flushes + * the fault queue. + */ + domain = iommu_get_domain_for_dev(master->dev); + if (!domain) + return -EINVAL; + + if (evt[1] & EVTQ_1_STALL) + fault.flags |= IOMMU_FAULT_RECOVERABLE; + + if (evt[1] & EVTQ_1_READ) + fault.flags |= IOMMU_FAULT_READ; + else + fault.flags |= IOMMU_FAULT_WRITE; + + if (evt[1] & EVTQ_1_EXEC) + fault.flags |= IOMMU_FAULT_EXEC; + + if (evt[1] & EVTQ_1_PRIV) + fault.flags |= IOMMU_FAULT_PRIV; + + if (evt[0] & EVTQ_0_SSV) { + fault.flags |= IOMMU_FAULT_PASID; + fault.pasid = evt[0] >> EVTQ_0_SSID_SHIFT & EVTQ_0_SSID_MASK; + } + + /* Report to device driver or populate the page tables */ + return handle_iommu_fault(domain, master->dev, &fault); +} + /* IRQ and event handlers */ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) { - int i; + int i, ret; int num_handled = 0; struct arm_smmu_device *smmu = dev; struct arm_smmu_queue *q = &smmu->evtq.q; @@ -1621,12 +1773,19 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) while (!queue_remove_raw(q, evt)) { u8 id = evt[0] >> EVTQ_0_ID_SHIFT & EVTQ_0_ID_MASK; + spin_unlock(&q->wq.lock); + ret = arm_smmu_handle_evt(smmu, evt); + spin_lock(&q->wq.lock); + if (++num_handled == queue_size) { q->batch++; wake_up_locked(&q->wq); num_handled = 0; } + if (!ret) + continue; + dev_info(smmu->dev, "event 0x%02x received:\n", id); for (i = 0; i < ARRAY_SIZE(evt); ++i) dev_info(smmu->dev, "\t0x%016llx\n", @@ -1762,7 +1921,9 @@ static int arm_smmu_flush_queues(struct notifier_block *nb, master = dev->iommu_fwspec->iommu_priv; if (master) { - /* TODO: add support for PRI and Stall */ + if (master->ste.can_stall) + arm_smmu_flush_queue(smmu, &smmu->evtq.q, "evtq"); + /* TODO: add support for PRI */ return 0; } @@ -2110,6 +2271,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, domain->max_pasid = master->num_ssids - 1; 
smmu_domain->s1_cfg.num_contexts = master->num_ssids; } + smmu_domain->s1_cfg.can_stall = master->ste.can_stall; break; case ARM_SMMU_DOMAIN_NESTED: case ARM_SMMU_DOMAIN_S2: @@ -2707,6 +2869,11 @@ static int arm_smmu_add_device(struct device *dev) master->num_ssids = 1 << min(smmu->ssid_bits, fwspec->num_pasid_bits); + if (fwspec->can_stall && smmu->features & ARM_SMMU_FEAT_STALLS) { + master->can_fault = true; + master->ste.can_stall = true; + } + group = iommu_group_get_for_dev(dev); if (!IS_ERR(group)) { arm_smmu_insert_master(smmu, master); @@ -2845,6 +3012,7 @@ static struct iommu_ops arm_smmu_ops = { .process_detach = arm_smmu_process_detach, .process_invalidate = arm_smmu_process_invalidate, .process_exit = arm_smmu_process_exit, + .fault_response = arm_smmu_fault_response, .map = arm_smmu_map, .unmap = arm_smmu_unmap, .map_sg = default_iommu_map_sg, From patchwork Fri Oct 6 13:31:57 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822453 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9h3VMvz9t3t for ; Sat, 7 Oct 2017 00:30:32 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752527AbdJFNa1 (ORCPT ); Fri, 6 Oct 2017 09:30:27 -0400 Received: from foss.arm.com ([217.140.101.70]:33084 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752361AbdJFNa0 (ORCPT ); Fri, 6 Oct 2017 09:30:26 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9E4A71435; Fri, 
 6 Oct 2017 06:30:25 -0700 (PDT)
Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C3CA13F578;
 Fri, 6 Oct 2017 06:30:20 -0700 (PDT)
From: Jean-Philippe Brucker
To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org,
 linux-acpi@vger.kernel.org, devicetree@vger.kernel.org,
 iommu@lists.linux-foundation.org
Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com,
 catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com,
 hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net,
 lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com,
 alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com,
 thunder.leizhen@huawei.com, xieyisheng1@huawei.com,
 gabriele.paoloni@huawei.com, nwatters@codeaurora.org,
 okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org,
 jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com,
 robdclark@gmail.com
Subject: [RFCv2 PATCH 30/36] ACPI/IORT: Check ATS capability in root complex nodes
Date: Fri, 6 Oct 2017 14:31:57 +0100
Message-Id: <20171006133203.22803-31-jean-philippe.brucker@arm.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

The root complex node in IORT has a bit telling whether it supports ATS or
not. Store this bit in the IOMMU fwspec when setting up a device, so it
can be accessed later by an IOMMU driver.

Use the negative version of the bit (IOMMU_FWSPEC_PCI_NO_ATS) for now,
because it's not clear if/how the bit needs to be integrated in other
firmware descriptions. The SMMU has a feature bit telling if it supports
ATS, which might be sufficient in most systems for deciding whether or not
we should enable the ATS capability in endpoints.

Signed-off-by: Jean-Philippe Brucker --- drivers/acpi/arm64/iort.c | 11 +++++++++++ include/linux/iommu.h | 4 ++++ 2 files changed, 15 insertions(+) diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index 9565d572f8dd..b94eac8ed21e 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -768,6 +768,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size) dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset); } +static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node) +{ + struct acpi_iort_root_complex *pci_rc; + + pci_rc = (struct acpi_iort_root_complex *)node->node_data; + return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED; +} + /** * iort_iommu_configure - Set-up IOMMU configuration for a device. * @@ -803,6 +811,9 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev) info.node = node; err = pci_for_each_dma_alias(to_pci_dev(dev), iort_pci_iommu_init, &info); + + if (!err && !iort_pci_rc_supports_ats(node)) + dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS; } else { int i = 0; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 2eb65d4724bb..661031aed0c4 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -534,12 +534,16 @@ struct iommu_fwspec { const struct iommu_ops *ops; struct fwnode_handle *iommu_fwnode; void *iommu_priv; + u32 flags; unsigned int num_ids; unsigned int num_pasid_bits; bool can_stall; u32 ids[1]; }; +/* Firmware disabled ATS in the root complex */ +#define IOMMU_FWSPEC_PCI_NO_ATS (1 << 0) + int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, const struct iommu_ops *ops); void iommu_fwspec_free(struct device *dev); From patchwork Fri Oct 6 13:31:58 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822454 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7r9n5pGyz9t3m for ; Sat, 7 Oct 2017 00:30:37 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752351AbdJFNaf (ORCPT ); Fri, 6 Oct 2017 09:30:35 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:33132 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752295AbdJFNab (ORCPT ); Fri, 6 Oct 2017 09:30:31 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B3858164F; Fri, 6 Oct 2017 06:30:30 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DA56A3F578; Fri, 6 Oct 2017 06:30:25 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org, okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org, jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com, robdclark@gmail.com Subject: [RFCv2 PATCH 31/36] iommu/arm-smmu-v3: Add support for PCI ATS Date: Fri, 6 Oct 2017 14:31:58 +0100 Message-Id: 
 <20171006133203.22803-32-jean-philippe.brucker@arm.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). Enable Address Translation Service (ATS) for devices that support
it and send them invalidation requests whenever we invalidate the IOTLBs.

Range calculation
-----------------

The invalidation packet itself is a bit awkward: the range must be
naturally aligned, which means that the start address is a multiple of the
range size. In addition, the size must be a power-of-two number of 4k
pages. We have a few options to enforce this constraint:

(1) Find the smallest naturally aligned region that covers the requested
    range. This is simple to compute and only takes one ATC_INV, but it
    will spill on lots of neighbouring ATC entries.

(2) Align the start address to the region size (rounded up to a power of
    two), and send a second invalidation for the next range of the same
    size. Still not great, but reduces spilling.

(3) Cover the range exactly with the smallest number of naturally aligned
    regions. This would be interesting to implement but, as with (2),
    requires multiple ATC_INV commands.

As I suspect ATC invalidation packets will be a very scarce resource, I'll
go with option (1) for now, and only send one big invalidation. We can
move to (2), which is both easier to read and more gentle with the ATC,
once we've observed on real systems that we can send multiple smaller
Invalidation Requests for roughly the same price as a single big one.

Note that with io-pgtable, the unmap function is called for each page, so
this doesn't matter. The problem shows up when sharing page tables with
the MMU.

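For readers who want to try option (1) outside the kernel, here is a
minimal user-space sketch of the same computation. It is only an
illustration under stated assumptions: atc_cover_range(), struct atc_range
and fls64_sketch() are hypothetical names made up for this example (the
driver uses fls_long() and fills an arm_smmu_cmdq_ent instead), and the
4 KiB invalidation granule is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATC_PAGE_SHIFT 12	/* ATC invalidations work on 4 KiB pages */

/* Hypothetical result type: one naturally aligned invalidation range. */
struct atc_range {
	uint64_t addr;			/* start address, aligned to the span */
	unsigned int log2_pages;	/* span is (1 << log2_pages) pages */
};

/* Position of the most significant set bit, counting from 1; 0 if x == 0. */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/*
 * Option (1): smallest naturally aligned power-of-two region that covers
 * [iova, iova + size). The most significant bit that differs between the
 * first and last page number gives the required span.
 */
static struct atc_range atc_cover_range(uint64_t iova, size_t size)
{
	struct atc_range r;
	uint64_t first, last, mask;

	assert(size != 0);	/* size 0 means "invalidate all" in the driver */

	first = iova >> ATC_PAGE_SHIFT;
	last = (iova + size - 1) >> ATC_PAGE_SHIFT;

	r.log2_pages = fls64_sketch(first ^ last);
	assert(r.log2_pages < 64);	/* keep the shift below well defined */

	mask = ((uint64_t)1 << r.log2_pages) - 1;
	r.addr = (first & ~mask) << ATC_PAGE_SHIFT;
	return r;
}
```

With this sketch, pages [8; 11] map to a single 4-page range starting at
page 8, while pages [7; 10] grow to the 16-page range [0; 15], which is the
spilling that option (1) accepts in exchange for a single ATC_INV.
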
Timeout
-------

ATC invalidation is allowed to take up to 90 seconds, according to the
PCIe spec, so it is possible to hit the SMMU command queue timeout during
normal operations.

Some SMMU implementations will raise a CERROR_ATC_INV_SYNC when a CMD_SYNC
fails because of an ATC invalidation. Some will just abort the CMD_SYNC.
Others might let CMD_SYNC complete and have an asynchronous IMPDEF
mechanism to record the error. When we receive a CERROR_ATC_INV_SYNC, we
could retry sending all ATC_INV commands since the last successful
CMD_SYNC. When a CMD_SYNC fails without CERROR_ATC_INV_SYNC, we could
retry sending *all* commands since the last successful CMD_SYNC.

We cannot afford to wait 90 seconds in iommu_unmap, let alone MMU
notifiers. So we'd have to introduce a more clever system if this timeout
becomes a problem, like keeping hold of mappings and invalidating in the
background. Implementing safe delayed invalidations is a very complex
problem and deserves a series of its own. We'll assess whether more work
is needed to properly handle ATC invalidation timeouts once this code runs
on real hardware.

Misc
----

I didn't put ATC and TLB invalidations in the same functions for three
reasons:

* TLB invalidation by range is batched and committed with a single sync.
  Batching ATC invalidation is inconvenient because endpoints limit the
  number of inflight invalidations. We'd have to count the number of
  invalidations queued and send a sync periodically. In addition, I
  suspect we always need a sync between TLB and ATC invalidation for the
  same page.

* Doing ATC invalidation outside tlb_inv_range also allows us to send
  fewer requests, since TLB invalidations are done per page or block,
  while ATC invalidations target IOVA ranges.

* TLB invalidation by context is performed when freeing the domain, at
  which point there isn't any device attached anymore.

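To make the first point concrete, the bookkeeping that batching would
require could look roughly like the sketch below. This is not what the
patch does (it issues a sync after each device's invalidations) and every
name in it is hypothetical: struct atc_batch, atc_batch_queue_inv() and
atc_batch_sync() only stand in for queuing an ATC_INV and a CMD_SYNC, and
max_inflight is assumed to come from something like the endpoint's ATS
queue depth.

```c
#include <assert.h>

/* Hypothetical per-endpoint bookkeeping; not part of the driver. */
struct atc_batch {
	unsigned int inflight;		/* invalidations queued since last sync */
	unsigned int max_inflight;	/* assumed endpoint limit */
	unsigned int syncs;		/* syncs issued so far (for illustration) */
};

/* Stand-in for issuing a CMD_SYNC and waiting for it to complete. */
static void atc_batch_sync(struct atc_batch *b)
{
	if (!b->inflight)
		return;		/* nothing pending, no sync needed */

	b->syncs++;
	b->inflight = 0;
}

/* Stand-in for queuing one ATC_INV; syncs first if the endpoint is full. */
static void atc_batch_queue_inv(struct atc_batch *b)
{
	assert(b->max_inflight > 0);

	if (b->inflight == b->max_inflight)
		atc_batch_sync(b);

	b->inflight++;
}
```

A caller would queue its invalidations through atc_batch_queue_inv() and
finish with one atc_batch_sync(), so a device never has more than
max_inflight requests outstanding.
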
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 238 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 228 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 48a1da0934b4..d03bec4d4b82 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -109,6 +110,7 @@ #define IDR5_OAS_48_BIT (5 << IDR5_OAS_SHIFT) #define ARM_SMMU_CR0 0x20 +#define CR0_ATSCHK (1 << 4) #define CR0_CMDQEN (1 << 3) #define CR0_EVTQEN (1 << 2) #define CR0_PRIQEN (1 << 1) @@ -382,6 +384,7 @@ #define CMDQ_ERR_CERROR_NONE_IDX 0 #define CMDQ_ERR_CERROR_ILL_IDX 1 #define CMDQ_ERR_CERROR_ABT_IDX 2 +#define CMDQ_ERR_CERROR_ATC_INV_IDX 3 #define CMDQ_0_OP_SHIFT 0 #define CMDQ_0_OP_MASK 0xffUL @@ -407,6 +410,15 @@ #define CMDQ_TLBI_1_VA_MASK ~0xfffUL #define CMDQ_TLBI_1_IPA_MASK 0xfffffffff000UL +#define CMDQ_ATC_0_SSID_SHIFT 12 +#define CMDQ_ATC_0_SSID_MASK 0xfffffUL +#define CMDQ_ATC_0_SID_SHIFT 32 +#define CMDQ_ATC_0_SID_MASK 0xffffffffUL +#define CMDQ_ATC_0_GLOBAL (1UL << 9) +#define CMDQ_ATC_1_SIZE_SHIFT 0 +#define CMDQ_ATC_1_SIZE_MASK 0x3fUL +#define CMDQ_ATC_1_ADDR_MASK ~0xfffUL + #define CMDQ_PRI_0_SSID_SHIFT 12 #define CMDQ_PRI_0_SSID_MASK 0xfffffUL #define CMDQ_PRI_0_SID_SHIFT 32 @@ -507,6 +519,11 @@ module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); MODULE_PARM_DESC(disable_bypass, "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); +static bool disable_ats_check; +module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO); +MODULE_PARM_DESC(disable_ats_check, + "By default, the SMMU checks whether each incoming transaction marked as translated is allowed by the stream configuration. 
This option disables the check."); + enum pri_resp { PRI_RESP_DENY, PRI_RESP_FAIL, @@ -581,6 +598,16 @@ struct arm_smmu_cmdq_ent { u64 addr; } tlbi; + #define CMDQ_OP_ATC_INV 0x40 + #define ATC_INV_SIZE_ALL 52 + struct { + u32 sid; + u32 ssid; + u64 addr; + u8 size; + bool global; + } atc; + #define CMDQ_OP_PRI_RESP 0x41 struct { u32 sid; @@ -1037,6 +1064,14 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) case CMDQ_OP_TLBI_EL2_ASID: cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT; break; + case CMDQ_OP_ATC_INV: + cmd[0] |= ent->substream_valid ? CMDQ_0_SSV : 0; + cmd[0] |= ent->atc.global ? CMDQ_ATC_0_GLOBAL : 0; + cmd[0] |= ent->atc.ssid << CMDQ_ATC_0_SSID_SHIFT; + cmd[0] |= (u64)ent->atc.sid << CMDQ_ATC_0_SID_SHIFT; + cmd[1] |= ent->atc.size << CMDQ_ATC_1_SIZE_SHIFT; + cmd[1] |= ent->atc.addr & CMDQ_ATC_1_ADDR_MASK; + break; case CMDQ_OP_PRI_RESP: cmd[0] |= ent->substream_valid ? CMDQ_0_SSV : 0; cmd[0] |= ent->pri.ssid << CMDQ_PRI_0_SSID_SHIFT; @@ -1087,6 +1122,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu) [CMDQ_ERR_CERROR_NONE_IDX] = "No error", [CMDQ_ERR_CERROR_ILL_IDX] = "Illegal command", [CMDQ_ERR_CERROR_ABT_IDX] = "Abort on command fetch", + [CMDQ_ERR_CERROR_ATC_INV_IDX] = "ATC invalidate timeout", }; int i; @@ -1106,6 +1142,14 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu) dev_err(smmu->dev, "retrying command fetch\n"); case CMDQ_ERR_CERROR_NONE_IDX: return; + case CMDQ_ERR_CERROR_ATC_INV_IDX: + /* + * ATC Invalidation Completion timeout. CONS is still pointing + * at the CMD_SYNC. Attempt to complete other pending commands + * by repeating the CMD_SYNC, though we might well end up back + * here since the ATC invalidation may still be pending. 
+ */ + return; case CMDQ_ERR_CERROR_ILL_IDX: /* Fallthrough */ default: @@ -1584,9 +1628,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, STRTAB_STE_1_S1C_CACHE_WBRA << STRTAB_STE_1_S1COR_SHIFT | STRTAB_STE_1_S1C_SH_ISH << STRTAB_STE_1_S1CSH_SHIFT | -#ifdef CONFIG_PCI_ATS - STRTAB_STE_1_EATS_TRANS << STRTAB_STE_1_EATS_SHIFT | -#endif (smmu->features & ARM_SMMU_FEAT_E2H ? STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1) << STRTAB_STE_1_STRW_SHIFT); @@ -1623,6 +1664,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, val |= STRTAB_STE_0_CFG_S2_TRANS; } + if (IS_ENABLED(CONFIG_PCI_ATS)) + dst[1] |= cpu_to_le64(STRTAB_STE_1_EATS_TRANS + << STRTAB_STE_1_EATS_SHIFT); + arm_smmu_sync_ste_for_sid(smmu, sid); dst[0] = cpu_to_le64(val); arm_smmu_sync_ste_for_sid(smmu, sid); @@ -2078,6 +2123,106 @@ static const struct iommu_gather_ops arm_smmu_gather_ops = { .tlb_sync = arm_smmu_tlb_sync, }; +static bool arm_smmu_master_has_ats(struct arm_smmu_master_data *master) +{ + return dev_is_pci(master->dev) && to_pci_dev(master->dev)->ats_enabled; +} + +static void +arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size, + struct arm_smmu_cmdq_ent *cmd) +{ + size_t log2_span; + size_t span_mask; + /* ATC invalidates are always on 4096 bytes pages */ + size_t inval_grain_shift = 12; + unsigned long page_start, page_end; + + *cmd = (struct arm_smmu_cmdq_ent) { + .opcode = CMDQ_OP_ATC_INV, + .substream_valid = !!ssid, + .atc.ssid = ssid, + }; + + if (!size) { + cmd->atc.size = ATC_INV_SIZE_ALL; + return; + } + + page_start = iova >> inval_grain_shift; + page_end = (iova + size - 1) >> inval_grain_shift; + + /* + * Find the smallest power of two that covers the range. Most + * significant differing bit between start and end address indicates the + * required span, ie. fls(start ^ end). For example: + * + * We want to invalidate pages [8; 11]. 
This is already the ideal range: + * x = 0b1000 ^ 0b1011 = 0b11 + * span = 1 << fls(x) = 4 + * + * To invalidate pages [7; 10], we need to invalidate [0; 15]: + * x = 0b0111 ^ 0b1010 = 0b1101 + * span = 1 << fls(x) = 16 + */ + log2_span = fls_long(page_start ^ page_end); + span_mask = (1ULL << log2_span) - 1; + + page_start &= ~span_mask; + + cmd->atc.addr = page_start << inval_grain_shift; + cmd->atc.size = log2_span; +} + +static int arm_smmu_atc_inv_master(struct arm_smmu_master_data *master, + struct arm_smmu_cmdq_ent *cmd) +{ + int i; + struct iommu_fwspec *fwspec = master->dev->iommu_fwspec; + struct arm_smmu_cmdq_ent sync_cmd = { + .opcode = CMDQ_OP_CMD_SYNC, + }; + + if (!arm_smmu_master_has_ats(master)) + return 0; + + for (i = 0; i < fwspec->num_ids; i++) { + cmd->atc.sid = fwspec->ids[i]; + arm_smmu_cmdq_issue_cmd(master->smmu, cmd); + } + + arm_smmu_cmdq_issue_cmd(master->smmu, &sync_cmd); + + return 0; +} + +static int arm_smmu_atc_inv_master_all(struct arm_smmu_master_data *master, + int ssid) +{ + struct arm_smmu_cmdq_ent cmd; + + arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd); + return arm_smmu_atc_inv_master(master, &cmd); +} + +static size_t +arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, + unsigned long iova, size_t size) +{ + unsigned long flags; + struct arm_smmu_cmdq_ent cmd; + struct arm_smmu_master_data *master; + + arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd); + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, list) + arm_smmu_atc_inv_master(master, &cmd); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + return size; +} + /* IOMMU API */ static bool arm_smmu_capable(enum iommu_cap cap) { @@ -2361,6 +2506,8 @@ static void arm_smmu_detach_dev(struct device *dev) __iommu_process_unbind_dev_all(&smmu_domain->domain, dev); if (smmu_domain) { + arm_smmu_atc_inv_master_all(master, 0); + spin_lock_irqsave(&smmu_domain->devices_lock, flags); 
list_del(&master->list); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); @@ -2451,12 +2598,19 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size) { - struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; + int ret; + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops; if (!ops) return 0; - return ops->unmap(ops, iova, size); + ret = ops->unmap(ops, iova, size); + + if (ret && smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS) + ret = arm_smmu_atc_inv_domain(smmu_domain, 0, iova, size); + + return ret; } static phys_addr_t @@ -2752,6 +2906,48 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid) return sid < limit; } +static int arm_smmu_enable_ats(struct arm_smmu_master_data *master) +{ + int ret; + size_t stu; + struct pci_dev *pdev; + struct arm_smmu_device *smmu = master->smmu; + struct iommu_fwspec *fwspec = master->dev->iommu_fwspec; + + if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev) || + (fwspec->flags & IOMMU_FWSPEC_PCI_NO_ATS)) + return -ENOSYS; + + pdev = to_pci_dev(master->dev); + + /* Smallest Translation Unit: log2 of the smallest supported granule */ + stu = __ffs(smmu->pgsize_bitmap); + + ret = pci_enable_ats(pdev, stu); + if (ret) + return ret; + + dev_dbg(&pdev->dev, "enabled ATS (STU=%zu, QDEP=%d)\n", stu, + pci_ats_queue_depth(pdev)); + + return 0; +} + +static void arm_smmu_disable_ats(struct arm_smmu_master_data *master) +{ + struct pci_dev *pdev; + + if (!dev_is_pci(master->dev)) + return; + + pdev = to_pci_dev(master->dev); + + if (!pdev->ats_enabled) + return; + + pci_disable_ats(pdev); +} + static int arm_smmu_insert_master(struct arm_smmu_device *smmu, struct arm_smmu_master_data *master) { @@ -2874,14 +3070,24 @@ static int arm_smmu_add_device(struct device *dev) master->ste.can_stall = true; } + 
arm_smmu_enable_ats(master); + group = iommu_group_get_for_dev(dev); - if (!IS_ERR(group)) { - arm_smmu_insert_master(smmu, master); - iommu_group_put(group); - iommu_device_link(&smmu->iommu, dev); + if (IS_ERR(group)) { + ret = PTR_ERR(group); + goto err_disable_ats; } - return PTR_ERR_OR_ZERO(group); + iommu_group_put(group); + arm_smmu_insert_master(smmu, master); + iommu_device_link(&smmu->iommu, dev); + + return 0; + +err_disable_ats: + arm_smmu_disable_ats(master); + + return ret; } static void arm_smmu_remove_device(struct device *dev) @@ -2898,6 +3104,8 @@ static void arm_smmu_remove_device(struct device *dev) if (master && master->ste.assigned) arm_smmu_detach_dev(dev); arm_smmu_remove_master(smmu, master); + arm_smmu_disable_ats(master); + iommu_group_remove_device(dev); iommu_device_unlink(&smmu->iommu, dev); kfree(master); @@ -3515,6 +3723,16 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) } } + if (smmu->features & ARM_SMMU_FEAT_ATS && !disable_ats_check) { + enables |= CR0_ATSCHK; + ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, + ARM_SMMU_CR0ACK); + if (ret) { + dev_err(smmu->dev, "failed to enable ATS check\n"); + return ret; + } + } + ret = arm_smmu_setup_irqs(smmu); if (ret) { dev_err(smmu->dev, "failed to setup irqs\n");
From patchwork Fri Oct 6 13:31:59 2017
X-Patchwork-Id: 822455
From: Jean-Philippe Brucker
Subject: [RFCv2 PATCH 32/36] iommu/arm-smmu-v3: Hook ATC invalidation to process ops
Date: Fri, 6 Oct 2017 14:31:59 +0100
Message-Id: <20171006133203.22803-33-jean-philippe.brucker@arm.com>
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>

The core calls us when a process is modified.
Perform the required ATC invalidations.

Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/arm-smmu-v3.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index d03bec4d4b82..f591f1974228 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -2839,7 +2839,7 @@ static void arm_smmu_process_detach(struct iommu_domain *domain, arm_smmu_write_ctx_desc(smmu_domain, process->pasid, NULL); } - /* TODO: Invalidate ATC. */ + arm_smmu_atc_inv_master_all(master, process->pasid); /* TODO: Invalidate all mappings if last and not DVM. */ } @@ -2847,8 +2847,9 @@ static void arm_smmu_process_invalidate(struct iommu_domain *domain, struct iommu_process *process, unsigned long iova, size_t size) { + arm_smmu_atc_inv_domain(to_smmu_domain(domain), process->pasid, + iova, size); /* - * TODO: Invalidate ATC. * TODO: Invalidate mapping if not DVM */ } @@ -2871,7 +2872,7 @@ static void arm_smmu_process_exit(struct iommu_domain *domain, domain->process_exit(domain, master->dev, process->pasid, domain->process_exit_token); - /* TODO: inval ATC */ + arm_smmu_atc_inv_master_all(master, process->pasid); } spin_unlock(&smmu_domain->devices_lock);
From patchwork Fri Oct 6 13:32:00 2017
X-Patchwork-Id: 822456
From: Jean-Philippe Brucker
Subject: [RFCv2 PATCH 33/36] iommu/arm-smmu-v3: Disable tagged pointers
Date: Fri, 6 Oct 2017 14:32:00 +0100
Message-Id: <20171006133203.22803-34-jean-philippe.brucker@arm.com>
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>

The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the MMU mask out bits [63:56] of an address, allowing a userspace application to
store data in its pointers. This option is incompatible with PCI ATS.

If TBI is enabled in the SMMU and userspace triggers DMA transactions on tagged pointers, the endpoint might create ATC entries for addresses that include a tag. Software would then have to send ATC invalidation packets for each of the 255 possible aliases of an address, or just wipe the whole address space. Neither is a viable option, so disable TBI.

The impact of this change is unclear, since there are very few users of tagged pointers, and even fewer that also use SVM. But the requirement introduced by this patch doesn't seem excessive: a userspace application using both tagged pointers and SVM should now sanitize addresses (clear the tag) before using them for device DMA.

Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/arm-smmu-v3.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index f591f1974228..f008b4617cd4 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1332,7 +1332,6 @@ static u64 arm_smmu_cpu_tcr_to_cd(struct arm_smmu_device *smmu, u64 tcr) val |= ARM_SMMU_TCR2CD(tcr, EPD0); val |= ARM_SMMU_TCR2CD(tcr, EPD1); val |= ARM_SMMU_TCR2CD(tcr, IPS); - val |= ARM_SMMU_TCR2CD(tcr, TBI0); if (smmu->features & ARM_SMMU_FEAT_HA) val |= ARM_SMMU_TCR2CD(tcr, HA);
From patchwork Fri Oct 6 13:32:01 2017
X-Patchwork-Id: 822457
From: Jean-Philippe Brucker
Subject: [RFCv2 PATCH 34/36] PCI: Make "PRG Response PASID Required" handling common
Date: Fri, 6 Oct 2017 14:32:01 +0100
Message-Id: <20171006133203.22803-35-jean-philippe.brucker@arm.com>
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>

The PASID ECN to the PCIe spec added a bit in the PRI status register that
allows a Function to declare whether a PRG Response should contain the PASID prefix or not. Move the helper that accesses it from amd_iommu into the PCI subsystem, renaming it to be consistent with the current spec (PRPR - PRG Response PASID Required).

Signed-off-by: Jean-Philippe Brucker
Acked-by: Bjorn Helgaas
---
 drivers/iommu/amd_iommu.c | 19 +------------------
 drivers/pci/ats.c | 17 +++++++++++++++++
 include/linux/pci-ats.h | 8 ++++++++
 include/uapi/linux/pci_regs.h | 1 +
 4 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 51f8215877f5..45036a253d63 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -2039,23 +2039,6 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev) return ret; } -/* FIXME: Move this to PCI code */ -#define PCI_PRI_TLP_OFF (1 << 15) - -static bool pci_pri_tlp_required(struct pci_dev *pdev) -{ - u16 status; - int pos; - - pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI); - if (!pos) - return false; - - pci_read_config_word(pdev, pos + PCI_PRI_STATUS, &status); - - return (status & PCI_PRI_TLP_OFF) ?
true : false; -} - /* * If a device is not yet associated with a domain, this function * assigns it visible for the hardware @@ -2084,7 +2067,7 @@ static int attach_device(struct device *dev, dev_data->ats.enabled = true; dev_data->ats.qdep = pci_ats_queue_depth(pdev); - dev_data->pri_tlp = pci_pri_tlp_required(pdev); + dev_data->pri_tlp = pci_prg_resp_requires_prefix(pdev); } } else if (amd_iommu_iotlb_sup && pci_enable_ats(pdev, PAGE_SHIFT) == 0) { diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index ad8ddbbbf245..f95e42df728b 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -389,3 +389,20 @@ int pci_max_pasids(struct pci_dev *pdev) } EXPORT_SYMBOL_GPL(pci_max_pasids); #endif /* CONFIG_PCI_PASID */ + +#if defined(CONFIG_PCI_PASID) && defined(CONFIG_PCI_PRI) +bool pci_prg_resp_requires_prefix(struct pci_dev *pdev) +{ + u16 status; + int pos; + + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI); + if (!pos) + return false; + + pci_read_config_word(pdev, pos + PCI_PRI_STATUS, &status); + + return !!(status & PCI_PRI_STATUS_PRPR); +} +EXPORT_SYMBOL_GPL(pci_prg_resp_requires_prefix); +#endif /* CONFIG_PCI_PASID && CONFIG_PCI_PRI */ diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h index 782fb8e0755f..367ea9448441 100644 --- a/include/linux/pci-ats.h +++ b/include/linux/pci-ats.h @@ -67,5 +67,13 @@ static inline int pci_max_pasids(struct pci_dev *pdev) #endif /* CONFIG_PCI_PASID */ +#if defined(CONFIG_PCI_PASID) && defined(CONFIG_PCI_PRI) +bool pci_prg_resp_requires_prefix(struct pci_dev *pdev); +#else +static inline bool pci_prg_resp_requires_prefix(struct pci_dev *pdev) +{ + return false; +} +#endif /* CONFIG_PCI_PASID && CONFIG_PCI_PRI */ #endif /* LINUX_PCI_ATS_H*/ diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index f8d58045926f..a0eeb16a2bfe 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -862,6 +862,7 @@ #define PCI_PRI_STATUS_RF 0x001 /* Response Failure */ #define 
PCI_PRI_STATUS_UPRGI 0x002 /* Unexpected PRG index */ #define PCI_PRI_STATUS_STOPPED 0x100 /* PRI Stopped */ +#define PCI_PRI_STATUS_PRPR 0x8000 /* PRG Response requires PASID prefix */ #define PCI_PRI_MAX_REQ 0x08 /* PRI max reqs supported */ #define PCI_PRI_ALLOC_REQ 0x0c /* PRI max reqs allowed */ #define PCI_EXT_CAP_PRI_SIZEOF 16
From patchwork Fri Oct 6 13:32:02 2017
X-Patchwork-Id: 822458
From: Jean-Philippe Brucker
Subject: [RFCv2 PATCH 35/36] iommu/arm-smmu-v3: Add support for PRI
Date: Fri, 6 Oct 2017 14:32:02 +0100
Message-Id: <20171006133203.22803-36-jean-philippe.brucker@arm.com>
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>

For PCI devices that support it, enable the PRI capability and handle PRI Page Requests with the generic fault handler.

Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/arm-smmu-v3.c | 176 ++++++++++++++++++++++++++++++--------------
 1 file changed, 122 insertions(+), 54 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index f008b4617cd4..852714f35010 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -272,6 +272,7 @@ #define STRTAB_STE_1_S1COR_SHIFT 4 #define STRTAB_STE_1_S1CSH_SHIFT 6 +#define STRTAB_STE_1_PPAR (1UL << 18) #define STRTAB_STE_1_S1STALLD (1UL << 27) #define STRTAB_STE_1_EATS_ABT 0UL @@ -426,9 +427,9 @@ #define CMDQ_PRI_1_GRPID_SHIFT 0 #define CMDQ_PRI_1_GRPID_MASK 0x1ffUL #define CMDQ_PRI_1_RESP_SHIFT 12 -#define CMDQ_PRI_1_RESP_DENY (0UL << CMDQ_PRI_1_RESP_SHIFT) -#define CMDQ_PRI_1_RESP_FAIL (1UL << CMDQ_PRI_1_RESP_SHIFT) -#define CMDQ_PRI_1_RESP_SUCC (2UL << CMDQ_PRI_1_RESP_SHIFT) +#define CMDQ_PRI_1_RESP_FAILURE (0UL << CMDQ_PRI_1_RESP_SHIFT) +#define CMDQ_PRI_1_RESP_INVALID (1UL << CMDQ_PRI_1_RESP_SHIFT) +#define CMDQ_PRI_1_RESP_SUCCESS (2UL << CMDQ_PRI_1_RESP_SHIFT) #define CMDQ_RESUME_0_SID_SHIFT 32 #define CMDQ_RESUME_0_SID_MASK 0xffffffffUL @@ -504,6 +505,7 @@ /* Flags for iommu_data in iommu_fault */ #define ARM_SMMU_FAULT_STALL (1 << 0) +#define ARM_SMMU_FAULT_RESP_PASID (1 << 1) /* Until ACPICA headers cover IORT rev. C */ #ifndef ACPI_IORT_SMMU_HISILICON_HI161X @@ -524,12 +526,6 @@ module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO); MODULE_PARM_DESC(disable_ats_check, "By default, the SMMU checks whether each incoming transaction marked as translated is allowed by the stream configuration.
This option disables the check."); -enum pri_resp { - PRI_RESP_DENY, - PRI_RESP_FAIL, - PRI_RESP_SUCC, -}; - enum arm_smmu_msi_index { EVTQ_MSI_INDEX, GERROR_MSI_INDEX, @@ -613,7 +609,7 @@ struct arm_smmu_cmdq_ent { u32 sid; u32 ssid; u16 grpid; - enum pri_resp resp; + enum iommu_fault_status resp; } pri; #define CMDQ_OP_RESUME 0x44 @@ -720,6 +716,7 @@ struct arm_smmu_strtab_ent { struct arm_smmu_s2_cfg *s2_cfg; bool can_stall; + bool prg_resp_needs_ssid; }; struct arm_smmu_strtab_cfg { @@ -1078,14 +1075,14 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) cmd[0] |= (u64)ent->pri.sid << CMDQ_PRI_0_SID_SHIFT; cmd[1] |= ent->pri.grpid << CMDQ_PRI_1_GRPID_SHIFT; switch (ent->pri.resp) { - case PRI_RESP_DENY: - cmd[1] |= CMDQ_PRI_1_RESP_DENY; + case IOMMU_FAULT_STATUS_FAILURE: + cmd[1] |= CMDQ_PRI_1_RESP_FAILURE; break; - case PRI_RESP_FAIL: - cmd[1] |= CMDQ_PRI_1_RESP_FAIL; + case IOMMU_FAULT_STATUS_INVALID: + cmd[1] |= CMDQ_PRI_1_RESP_INVALID; break; - case PRI_RESP_SUCC: - cmd[1] |= CMDQ_PRI_1_RESP_SUCC; + case IOMMU_FAULT_STATUS_HANDLED: + cmd[1] |= CMDQ_PRI_1_RESP_SUCCESS; break; default: return -EINVAL; @@ -1214,8 +1211,13 @@ static int arm_smmu_fault_response(struct iommu_domain *domain, cmd.resume.stag = fault->id; cmd.resume.resp = resp; } else { - /* TODO: put PRI response here */ - return -EINVAL; + cmd.opcode = CMDQ_OP_PRI_RESP; + cmd.substream_valid = fault->iommu_data & + ARM_SMMU_FAULT_RESP_PASID; + cmd.pri.sid = sid; + cmd.pri.ssid = fault->pasid; + cmd.pri.grpid = fault->id; + cmd.pri.resp = resp; } arm_smmu_cmdq_issue_cmd(smmu_domain->smmu, &cmd); @@ -1631,6 +1633,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1) << STRTAB_STE_1_STRW_SHIFT); + if (ste->prg_resp_needs_ssid) + dst[1] |= STRTAB_STE_1_PPAR; + if (smmu->features & ARM_SMMU_FEAT_STALLS && !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE) && !ste->can_stall) @@ -1856,40 +1861,42 @@ 
static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) static void arm_smmu_handle_ppr(struct arm_smmu_device *smmu, u64 *evt) { - u32 sid, ssid; - u16 grpid; - bool ssv, last; - - sid = evt[0] >> PRIQ_0_SID_SHIFT & PRIQ_0_SID_MASK; - ssv = evt[0] & PRIQ_0_SSID_V; - ssid = ssv ? evt[0] >> PRIQ_0_SSID_SHIFT & PRIQ_0_SSID_MASK : 0; - last = evt[0] & PRIQ_0_PRG_LAST; - grpid = evt[1] >> PRIQ_1_PRG_IDX_SHIFT & PRIQ_1_PRG_IDX_MASK; - - dev_info(smmu->dev, "unexpected PRI request received:\n"); - dev_info(smmu->dev, - "\tsid 0x%08x.0x%05x: [%u%s] %sprivileged %s%s%s access at iova 0x%016llx\n", - sid, ssid, grpid, last ? "L" : "", - evt[0] & PRIQ_0_PERM_PRIV ? "" : "un", - evt[0] & PRIQ_0_PERM_READ ? "R" : "", - evt[0] & PRIQ_0_PERM_WRITE ? "W" : "", - evt[0] & PRIQ_0_PERM_EXEC ? "X" : "", - evt[1] & PRIQ_1_ADDR_MASK << PRIQ_1_ADDR_SHIFT); + u32 sid = evt[0] >> PRIQ_0_SID_SHIFT & PRIQ_0_SID_MASK; - if (last) { - struct arm_smmu_cmdq_ent cmd = { - .opcode = CMDQ_OP_PRI_RESP, - .substream_valid = ssv, - .pri = { - .sid = sid, - .ssid = ssid, - .grpid = grpid, - .resp = PRI_RESP_DENY, - }, - }; + struct arm_smmu_master_data *master; + struct iommu_domain *domain; + struct iommu_fault fault = { + .pasid = evt[0] >> PRIQ_0_SSID_SHIFT & PRIQ_0_SSID_MASK, + .id = evt[1] >> PRIQ_1_PRG_IDX_SHIFT & PRIQ_1_PRG_IDX_MASK, + .address = evt[1] & PRIQ_1_ADDR_MASK << PRIQ_1_ADDR_SHIFT, + .flags = IOMMU_FAULT_GROUP | IOMMU_FAULT_RECOVERABLE, + }; - arm_smmu_cmdq_issue_cmd(smmu, &cmd); - } + if (evt[0] & PRIQ_0_SSID_V) + fault.flags |= IOMMU_FAULT_PASID; + if (evt[0] & PRIQ_0_PRG_LAST) + fault.flags |= IOMMU_FAULT_LAST; + if (evt[0] & PRIQ_0_PERM_READ) + fault.flags |= IOMMU_FAULT_READ; + if (evt[0] & PRIQ_0_PERM_WRITE) + fault.flags |= IOMMU_FAULT_WRITE; + if (evt[0] & PRIQ_0_PERM_EXEC) + fault.flags |= IOMMU_FAULT_EXEC; + if (evt[0] & PRIQ_0_PERM_PRIV) + fault.flags |= IOMMU_FAULT_PRIV; + + master = arm_smmu_find_master(smmu, sid); + if (WARN_ON(!master)) + return; + + if 
(fault.flags & IOMMU_FAULT_PASID && master->ste.prg_resp_needs_ssid) + fault.iommu_data |= ARM_SMMU_FAULT_RESP_PASID; + + domain = iommu_get_domain_for_dev(master->dev); + if (WARN_ON(!domain)) + return; + + handle_iommu_fault(domain, master->dev, &fault); } static irqreturn_t arm_smmu_priq_thread(int irq, void *dev) @@ -1967,7 +1974,8 @@ static int arm_smmu_flush_queues(struct notifier_block *nb, if (master) { if (master->ste.can_stall) arm_smmu_flush_queue(smmu, &smmu->evtq.q, "evtq"); - /* TODO: add support for PRI */ + else if (master->can_fault) + arm_smmu_flush_queue(smmu, &smmu->priq.q, "priq"); return 0; } @@ -2933,6 +2941,46 @@ static int arm_smmu_enable_ats(struct arm_smmu_master_data *master) return 0; } +static int arm_smmu_enable_pri(struct arm_smmu_master_data *master) +{ + int ret, pos; + struct pci_dev *pdev; + /* + * TODO: find a good inflight PPR number. We should divide the PRI queue + * by the number of PRI-capable devices, but it's impossible to know + * about current and future (hotplugged) devices. So we're at risk of + * dropping PPRs (and leaking pending requests in the FQ). 
+ */ + size_t max_inflight_pprs = 16; + struct arm_smmu_device *smmu = master->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_PRI) || !dev_is_pci(master->dev)) + return -ENOSYS; + + pdev = to_pci_dev(master->dev); + + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI); + if (!pos) + return -ENOSYS; + + ret = pci_reset_pri(pdev); + if (ret) + return ret; + + ret = pci_enable_pri(pdev, max_inflight_pprs); + if (ret) { + dev_err(master->dev, "cannot enable PRI: %d\n", ret); + return ret; + } + + master->can_fault = true; + master->ste.prg_resp_needs_ssid = pci_prg_resp_requires_prefix(pdev); + + dev_dbg(master->dev, "enabled PRI\n"); + + return 0; +} + static void arm_smmu_disable_ats(struct arm_smmu_master_data *master) { struct pci_dev *pdev; @@ -2948,6 +2996,22 @@ static void arm_smmu_disable_ats(struct arm_smmu_master_data *master) pci_disable_ats(pdev); } +static void arm_smmu_disable_pri(struct arm_smmu_master_data *master) +{ + struct pci_dev *pdev; + + if (!dev_is_pci(master->dev)) + return; + + pdev = to_pci_dev(master->dev); + + if (!pdev->pri_enabled) + return; + + pci_disable_pri(pdev); + master->can_fault = false; +} + static int arm_smmu_insert_master(struct arm_smmu_device *smmu, struct arm_smmu_master_data *master) { @@ -3070,12 +3134,13 @@ static int arm_smmu_add_device(struct device *dev) master->ste.can_stall = true; } - arm_smmu_enable_ats(master); + if (!arm_smmu_enable_ats(master)) + arm_smmu_enable_pri(master); group = iommu_group_get_for_dev(dev); if (IS_ERR(group)) { ret = PTR_ERR(group); - goto err_disable_ats; + goto err_disable_pri; } iommu_group_put(group); @@ -3084,7 +3149,8 @@ static int arm_smmu_add_device(struct device *dev) return 0; -err_disable_ats: +err_disable_pri: + arm_smmu_disable_pri(master); arm_smmu_disable_ats(master); return ret; @@ -3104,6 +3170,8 @@ static void arm_smmu_remove_device(struct device *dev) if (master && master->ste.assigned) arm_smmu_detach_dev(dev); arm_smmu_remove_master(smmu, master); + +
arm_smmu_disable_pri(master); arm_smmu_disable_ats(master); iommu_group_remove_device(dev); From patchwork Fri Oct 6 13:32:03 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 822459 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3y7rBC5R9tz9t34 for ; Sat, 7 Oct 2017 00:30:59 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752569AbdJFNa5 (ORCPT ); Fri, 6 Oct 2017 09:30:57 -0400 Received: from foss.arm.com ([217.140.101.70]:33350 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752369AbdJFNa4 (ORCPT ); Fri, 6 Oct 2017 09:30:56 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3028B169F; Fri, 6 Oct 2017 06:30:56 -0700 (PDT) Received: from e106794-lin.cambridge.arm.com (e106794-lin.cambridge.arm.com [10.1.211.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 570163F578; Fri, 6 Oct 2017 06:30:51 -0700 (PDT) From: Jean-Philippe Brucker To: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com, will.deacon@arm.com, lorenzo.pieralisi@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, alex.williamson@redhat.com, tn@semihalf.com, liubo95@huawei.com, thunder.leizhen@huawei.com, 
	xieyisheng1@huawei.com, gabriele.paoloni@huawei.com, nwatters@codeaurora.org,
	okaya@codeaurora.org, rfranz@cavium.com, dwmw2@infradead.org,
	jacob.jun.pan@linux.intel.com, yi.l.liu@intel.com, ashok.raj@intel.com,
	robdclark@gmail.com
Subject: [RFCv2 PATCH 36/36] iommu/arm-smmu-v3: Add support for PCI PASID
Date: Fri, 6 Oct 2017 14:32:03 +0100
Message-Id: <20171006133203.22803-37-jean-philippe.brucker@arm.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
References: <20171006133203.22803-1-jean-philippe.brucker@arm.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

Enable PASID for PCI devices that support it.

Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/arm-smmu-v3.c | 52 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 852714f35010..42c8378624ed 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -3012,6 +3012,50 @@ static void arm_smmu_disable_pri(struct arm_smmu_master_data *master)
 	master->can_fault = false;
 }
 
+static int arm_smmu_enable_pasid(struct arm_smmu_master_data *master)
+{
+	int ret;
+	int features;
+	int num_ssids;
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(master->dev))
+		return -ENOSYS;
+
+	pdev = to_pci_dev(master->dev);
+
+	features = pci_pasid_features(pdev);
+	if (features < 0)
+		return -ENOSYS;
+
+	num_ssids = pci_max_pasids(pdev);
+
+	dev_dbg(&pdev->dev, "device supports %#x SSIDs [%s%s]\n", num_ssids,
+		(features & PCI_PASID_CAP_EXEC) ? "x" : "",
+		(features & PCI_PASID_CAP_PRIV) ? "p" : "");
+
+	num_ssids = clamp_val(num_ssids, 1, 1 << master->smmu->ssid_bits);
+	num_ssids = rounddown_pow_of_two(num_ssids);
+
+	ret = pci_enable_pasid(pdev, features);
+	return ret ? ret : num_ssids;
+}
+
+static void arm_smmu_disable_pasid(struct arm_smmu_master_data *master)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(master->dev))
+		return;
+
+	pdev = to_pci_dev(master->dev);
+
+	if (!pdev->pasid_enabled)
+		return;
+
+	pci_disable_pasid(pdev);
+}
+
 static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
 				  struct arm_smmu_master_data *master)
 {
@@ -3134,6 +3178,11 @@ static int arm_smmu_add_device(struct device *dev)
 		master->ste.can_stall = true;
 	}
 
+	/* PASID must be enabled before ATS */
+	ret = arm_smmu_enable_pasid(master);
+	if (ret > 0)
+		master->num_ssids = ret;
+
 	if (!arm_smmu_enable_ats(master))
 		arm_smmu_enable_pri(master);
 
@@ -3152,6 +3201,7 @@ static int arm_smmu_add_device(struct device *dev)
 err_disable_pri:
 	arm_smmu_disable_pri(master);
 	arm_smmu_disable_ats(master);
+	arm_smmu_disable_pasid(master);
 
 	return ret;
 }
@@ -3172,7 +3222,9 @@ static void arm_smmu_remove_device(struct device *dev)
 	arm_smmu_remove_master(smmu, master);
 
 	arm_smmu_disable_pri(master);
+	/* PASID must be disabled after ATS */
 	arm_smmu_disable_ats(master);
+	arm_smmu_disable_pasid(master);
 
 	iommu_group_remove_device(dev);
 	iommu_device_unlink(&smmu->iommu, dev);