Message ID | 20180116234040.12154-3-manoj.iyer@canonical.com |
---|---|
State | New |
Headers | show |
Series | [1/2] vfio/pci: Virtualize Maximum Payload Size | expand |
Acked-by: Paolo Pisati <paolo.pisati@canonical.com> On Wed, Jan 17, 2018 at 12:40 AM, Manoj Iyer <manoj.iyer@canonical.com> wrote: > From: Alex Williamson <alex.williamson@redhat.com> > > MRRS defines the maximum read request size a device is allowed to > make. Drivers will often increase this to allow more data transfer > with a single request. Completions to this request are bound by the > MPS setting for the bus. Aside from device quirks (none known), it > doesn't seem to make sense to set an MRRS value less than MPS, yet > this is a likely scenario given that user drivers do not have a > system-wide view of the PCI topology. Virtualize MRRS such that the > user can set MRRS >= MPS, but use MPS as the floor value that we'll > write to hardware. > > BugLink: https://launchpad.net/bugs/1732804 > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > (cherry picked from commit cf0d53ba4947aad6e471491d5b20a567cbe92e56) > Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com> > --- > drivers/vfio/pci/vfio_pci_config.c | 29 ++++++++++++++++++++++++++--- > 1 file changed, 26 insertions(+), 3 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c > index 91335e6de88a..115a36f6f403 100644 > --- a/drivers/vfio/pci/vfio_pci_config.c > +++ b/drivers/vfio/pci/vfio_pci_config.c > @@ -808,6 +808,7 @@ static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos, > { > __le16 *ctrl = (__le16 *)(vdev->vconfig + pos - > offset + PCI_EXP_DEVCTL); > + int readrq = le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ; > > count = vfio_default_config_write(vdev, pos, count, perm, offset, val); > if (count < 0) > @@ -833,6 +834,27 @@ static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos, > pci_try_reset_function(vdev->pdev); > } > > + /* > + * MPS is virtualized to the user, writes do not change the physical > + * register since determining a proper MPS value requires a system wide > + * device view. The MRRS is largely independent of MPS, but since the > + * user does not have that system-wide view, they might set a safe, but > + * inefficiently low value. Here we allow writes through to hardware, > + * but we set the floor to the physical device MPS setting, so that > + * we can at least use full TLPs, as defined by the MPS value. > + * > + * NB, if any devices actually depend on an artificially low MRRS > + * setting, this will need to be revisited, perhaps with a quirk > + * though pcie_set_readrq(). > + */ > + if (readrq != (le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ)) { > + readrq = 128 << > + ((le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ) >> 12); > + readrq = max(readrq, pcie_get_mps(vdev->pdev)); > + > + pcie_set_readrq(vdev->pdev, readrq); > + } > + > return count; > } > > @@ -851,11 +873,12 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm) > * Allow writes to device control fields, except devctl_phantom, > * which could confuse IOMMU, MPS, which can break communication > * with other physical devices, and the ARI bit in devctl2, which > - * is set at probe time. FLR gets virtualized via our writefn. > + * is set at probe time. FLR and MRRS get virtualized via our > + * writefn. > */ > p_setw(perm, PCI_EXP_DEVCTL, > - PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD, > - ~PCI_EXP_DEVCTL_PHANTOM); > + PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD | > + PCI_EXP_DEVCTL_READRQ, ~PCI_EXP_DEVCTL_PHANTOM); > p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI); > return 0; > } > -- > 2.14.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index 91335e6de88a..115a36f6f403 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -808,6 +808,7 @@ static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos, { __le16 *ctrl = (__le16 *)(vdev->vconfig + pos - offset + PCI_EXP_DEVCTL); + int readrq = le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ; count = vfio_default_config_write(vdev, pos, count, perm, offset, val); if (count < 0) @@ -833,6 +834,27 @@ static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos, pci_try_reset_function(vdev->pdev); } + /* + * MPS is virtualized to the user, writes do not change the physical + * register since determining a proper MPS value requires a system wide + * device view. The MRRS is largely independent of MPS, but since the + * user does not have that system-wide view, they might set a safe, but + * inefficiently low value. Here we allow writes through to hardware, + * but we set the floor to the physical device MPS setting, so that + * we can at least use full TLPs, as defined by the MPS value. + * + * NB, if any devices actually depend on an artificially low MRRS + * setting, this will need to be revisited, perhaps with a quirk + * though pcie_set_readrq(). + */ + if (readrq != (le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ)) { + readrq = 128 << + ((le16_to_cpu(*ctrl) & PCI_EXP_DEVCTL_READRQ) >> 12); + readrq = max(readrq, pcie_get_mps(vdev->pdev)); + + pcie_set_readrq(vdev->pdev, readrq); + } + return count; } @@ -851,11 +873,12 @@ static int __init init_pci_cap_exp_perm(struct perm_bits *perm) * Allow writes to device control fields, except devctl_phantom, * which could confuse IOMMU, MPS, which can break communication * with other physical devices, and the ARI bit in devctl2, which - * is set at probe time. FLR gets virtualized via our writefn. + * is set at probe time. FLR and MRRS get virtualized via our + * writefn. */ p_setw(perm, PCI_EXP_DEVCTL, - PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD, - ~PCI_EXP_DEVCTL_PHANTOM); + PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD | + PCI_EXP_DEVCTL_READRQ, ~PCI_EXP_DEVCTL_PHANTOM); p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI); return 0; }