From patchwork Mon Nov 20 23:50:31 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Manoj Iyer X-Patchwork-Id: 839843 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 3yglpk1HKsz9t2t; Tue, 21 Nov 2017 10:50:54 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1eGvpw-0004uO-6Z; Mon, 20 Nov 2017 23:50:48 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1eGvpu-0004tt-RK for kernel-team@lists.ubuntu.com; Mon, 20 Nov 2017 23:50:46 +0000 Received: from cpe-24-28-20-247.austin.res.rr.com ([24.28.20.247] helo=canonical.com) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1eGvpu-0003U5-Ca for kernel-team@lists.ubuntu.com; Mon, 20 Nov 2017 23:50:46 +0000 From: Manoj Iyer To: kernel-team@lists.ubuntu.com Subject: [PATCH 1/2] vfio/pci: Virtualize Maximum Payload Size Date: Mon, 20 Nov 2017 17:50:31 -0600 Message-Id: <20171120235032.1178-2-manoj.iyer@canonical.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20171120235032.1178-1-manoj.iyer@canonical.com> References: <20171120235032.1178-1-manoj.iyer@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Alex Williamson With virtual PCI-Express chipsets, we now see userspace/guest drivers trying to match the physical MPS setting to a virtual downstream port. Of course a lone physical device surrounded by virtual interconnects cannot make a correct decision for a proper MPS setting. Instead, let's virtualize the MPS control register so that writes through to hardware are disallowed. Userspace drivers like QEMU assume they can write anything to the device and we'll filter out anything dangerous. Since mismatched MPS can lead to AER and other faults, let's add it to the kernel side rather than relying on userspace virtualization to handle it. BugLink: https://launchpad.net/bugs/1732804 Signed-off-by: Alex Williamson Reviewed-by: Eric Auger (cherry picked from linux-next commit 523184972b282cd9ca17a76f6ca4742394856818) Signed-off-by: Manoj Iyer --- drivers/vfio/pci/vfio_pci_config.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index 5628fe114347..91335e6de88a 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -849,11 +849,13 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm) /* * Allow writes to device control fields, except devctl_phantom, - * which could confuse IOMMU, and the ARI bit in devctl2, which + * which could confuse IOMMU, MPS, which can break communication + * with other physical devices, and the ARI bit in devctl2, which * is set at probe time. FLR gets virtualized via our writefn. */ p_setw(perm, PCI_EXP_DEVCTL, - PCI_EXP_DEVCTL_BCR_FLR, ~PCI_EXP_DEVCTL_PHANTOM); + PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD, + ~PCI_EXP_DEVCTL_PHANTOM); p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI); return 0; } From patchwork Mon Nov 20 23:50:32 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Manoj Iyer X-Patchwork-Id: 839844 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 3yglpk0Yjcz9t2m; Tue, 21 Nov 2017 10:50:54 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1eGvpx-0004vC-AT; Mon, 20 Nov 2017 23:50:49 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1eGvpw-0004uk-Iu for kernel-team@lists.ubuntu.com; Mon, 20 Nov 2017 23:50:48 +0000 Received: from cpe-24-28-20-247.austin.res.rr.com ([24.28.20.247] helo=canonical.com) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1eGvpw-0003U8-4I for kernel-team@lists.ubuntu.com; Mon, 20 Nov 2017 23:50:48 +0000 From: Manoj Iyer To: kernel-team@lists.ubuntu.com Subject: [PATCH 2/2] vfio/pci: Virtualize Maximum Read Request Size Date: Mon, 20 Nov 2017 17:50:32 -0600 Message-Id: <20171120235032.1178-3-manoj.iyer@canonical.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20171120235032.1178-1-manoj.iyer@canonical.com> References: <20171120235032.1178-1-manoj.iyer@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Alex Williamson MRRS defines the maximum read request size a device is allowed to make. Drivers will often increase this to allow more data transfer with a single request. Completions to this request are bound by the MPS setting for the bus. Aside from device quirks (none known), it doesn't seem to make sense to set an MRRS value less than MPS, yet this is a likely scenario given that user drivers do not have a system-wide view of the PCI topology. Virtualize MRRS such that the user can set MRRS >= MPS, but use MPS as the floor value that we'll write to hardware. BugLink: https://launchpad.net/bugs/1732804 Signed-off-by: Alex Williamson (cherry picked from linux-next commit cf0d53ba4947aad6e471491d5b20a567cbe92e56) Signed-off-by: Manoj Iyer --- drivers/vfio/pci/vfio_pci_config.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index 91335e6de88a..115a36f6f403 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -808,6 +808,7 @@ static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos, { __le16 *ctrl = (__le16 *)(vdev->vconfig + pos - offset + PCI_EXP_DEVCTL); + int readrq = le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ; count = vfio_default_config_write(vdev, pos, count, perm, offset, val); if (count < 0) @@ -833,6 +834,27 @@ static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos, pci_try_reset_function(vdev->pdev); } + /* + * MPS is virtualized to the user, writes do not change the physical + * register since determining a proper MPS value requires a system wide + * device view. The MRRS is largely independent of MPS, but since the + * user does not have that system-wide view, they might set a safe, but + * inefficiently low value. Here we allow writes through to hardware, + * but we set the floor to the physical device MPS setting, so that + * we can at least use full TLPs, as defined by the MPS value. + * + * NB, if any devices actually depend on an artificially low MRRS + * setting, this will need to be revisited, perhaps with a quirk + * though pcie_set_readrq(). + */ + if (readrq != (le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ)) { + readrq = 128 << + ((le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ) >> 12); + readrq = max(readrq, pcie_get_mps(vdev->pdev)); + + pcie_set_readrq(vdev->pdev, readrq); + } + return count; } @@ -851,11 +873,12 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm) * Allow writes to device control fields, except devctl_phantom, * which could confuse IOMMU, MPS, which can break communication * with other physical devices, and the ARI bit in devctl2, which - * is set at probe time. FLR gets virtualized via our writefn. + * is set at probe time. FLR and MRRS get virtualized via our + * writefn. */ p_setw(perm, PCI_EXP_DEVCTL, - PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD, - ~PCI_EXP_DEVCTL_PHANTOM); + PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD | + PCI_EXP_DEVCTL_READRQ, ~PCI_EXP_DEVCTL_PHANTOM); p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI); return 0; }