diff mbox series

[v3,2/2] PCI: vmd: Disable MSI-X remapping when possible

Message ID 20210206033502.103964-3-jonathan.derrick@intel.com
State New
Headers show
Series VMD MSI Remapping Bypass | expand

Commit Message

Jon Derrick Feb. 6, 2021, 3:35 a.m. UTC
VMD will retransmit child device MSI-X using its own MSI-X table and
requester-id. This limits the number of MSI-X available to the whole
child device domain to the number of VMD MSI-X interrupts.

Some VMD devices have a mode where this remapping can be disabled,
allowing child device interrupts to bypass processing with the VMD MSI-X
domain interrupt handler and going straight the child device interrupt
handler, allowing for better performance and scaling. The requester-id
still gets changed to the VMD endpoint's requester-id, and the interrupt
remapping handlers have been updated to properly set IRTE for child
device interrupts to the VMD endpoint's context.

Some VMD platforms have existing production BIOS which rely on MSI-X
remapping and won't explicitly program the MSI-X remapping bit. This
re-enables MSI-X remapping on unload.

Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/controller/vmd.c | 61 +++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 12 deletions(-)

Comments

Krzysztof Wilczyński Feb. 8, 2021, 1:11 p.m. UTC | #1
Hi Jon,

Thank you for all the work here!

Just a number of suggestions, mainly nitpicks, so feel free to ignore
these, of course.

[...]
> +#define VMCFG_MSI_RMP_DIS	0x2
[...]

What about calling this VMCONFIG_MSI_REMAP so that is more
self-explanatory (it also shares some similarity with the
PCI_REG_VMCONFIG defintition).

[...]
> +	VMD_FEAT_BYPASS_MSI_REMAP		= (1 << 4),
[...]

Following on the naming that included "HAS" to indicate a feature (or
support for thereof), perhaps we could name this as, for example:

	VMD_FEAT_CAN_BYPASS_MSI_REMAP

What do you think?

[...] 
> +static void vmd_enable_msi_remapping(struct vmd_dev *vmd, bool enable)
> +{
> +	u16 reg;
> +
> +	pci_read_config_word(vmd->dev, PCI_REG_VMCONFIG, &reg);
> +	reg = enable ? (reg & ~VMCFG_MSI_RMP_DIS) : (reg | VMCFG_MSI_RMP_DIS);
> +	pci_write_config_word(vmd->dev, PCI_REG_VMCONFIG, reg);
> +}

I wonder if calling this function vmd_set_msi_remapping() would be more
aligned with what it does, since it turns the MSI remapping support on
and off, so to speak, as needed.  Do you think this would be OK to do?

[...]
> +		/*
> +		 * Override the irq domain bus token so the domain can be
> +		 * distinguished from a regular PCI/MSI domain.
> +		 */

It would be "IRQ" here.

Reviewed-by: Krzysztof Wilczyński <kw@linux.com>

Krzysztof
Jon Derrick Feb. 8, 2021, 4:30 p.m. UTC | #2
Hi Krzysztof,

On Mon, 2021-02-08 at 14:11 +0100, Krzysztof Wilczyński wrote:
> Hi Jon,
> 
> Thank you for all the work here!
> 
> Just a number of suggestions, mainly nitpicks, so feel free to ignore
> these, of course.
> 
> [...]
> > +#define VMCFG_MSI_RMP_DIS	0x2
> [...]
> 
> What about calling this VMCONFIG_MSI_REMAP so that is more
> self-explanatory (it also shares some similarity with the
> PCI_REG_VMCONFIG defintition).
> 
> [...]
> > +	VMD_FEAT_BYPASS_MSI_REMAP		= (1 << 4),
> [...]
> 
> Following on the naming that included "HAS" to indicate a feature (or
> support for thereof), perhaps we could name this as, for example:
> 
> 	VMD_FEAT_CAN_BYPASS_MSI_REMAP
> 
> What do you think?
Sure

> 
> [...] 
> > +static void vmd_enable_msi_remapping(struct vmd_dev *vmd, bool enable)
> > +{
> > +	u16 reg;
> > +
> > +	pci_read_config_word(vmd->dev, PCI_REG_VMCONFIG, &reg);
> > +	reg = enable ? (reg & ~VMCFG_MSI_RMP_DIS) : (reg | VMCFG_MSI_RMP_DIS);
> > +	pci_write_config_word(vmd->dev, PCI_REG_VMCONFIG, reg);
> > +}
> 
> I wonder if calling this function vmd_set_msi_remapping() would be more
> aligned with what it does, since it turns the MSI remapping support on
> and off, so to speak, as needed.  Do you think this would be OK to do?
> 
Yes that makes sense

> [...]
> > +		/*
> > +		 * Override the irq domain bus token so the domain can be
> > +		 * distinguished from a regular PCI/MSI domain.
> > +		 */
> 
> It would be "IRQ" here.
> 
No problem!


> Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
> 
Thanks!

> Krzysztof
diff mbox series

Patch

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 5e80f28f0119..2a585d1b3349 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -28,6 +28,7 @@ 
 #define BUS_RESTRICT_CAP(vmcap)	(vmcap & 0x1)
 #define PCI_REG_VMCONFIG	0x44
 #define BUS_RESTRICT_CFG(vmcfg)	((vmcfg >> 8) & 0x3)
+#define VMCFG_MSI_RMP_DIS	0x2
 #define PCI_REG_VMLOCK		0x70
 #define MB2_SHADOW_EN(vmlock)	(vmlock & 0x2)
 
@@ -59,6 +60,13 @@  enum vmd_features {
 	 * be used for MSI remapping
 	 */
 	VMD_FEAT_OFFSET_FIRST_VECTOR		= (1 << 3),
+
+	/*
+	 * Device can bypass remapping MSI-X transactions into its MSI-X table,
+	 * avoiding the requirement of a VMD MSI domain for child device
+	 * interrupt handling.
+	 */
+	VMD_FEAT_BYPASS_MSI_REMAP		= (1 << 4),
 };
 
 /*
@@ -306,6 +314,15 @@  static struct msi_domain_info vmd_msi_domain_info = {
 	.chip		= &vmd_msi_controller,
 };
 
+static void vmd_enable_msi_remapping(struct vmd_dev *vmd, bool enable)
+{
+	u16 reg;
+
+	pci_read_config_word(vmd->dev, PCI_REG_VMCONFIG, &reg);
+	reg = enable ? (reg & ~VMCFG_MSI_RMP_DIS) : (reg | VMCFG_MSI_RMP_DIS);
+	pci_write_config_word(vmd->dev, PCI_REG_VMCONFIG, reg);
+}
+
 static int vmd_create_irq_domain(struct vmd_dev *vmd)
 {
 	struct fwnode_handle *fn;
@@ -325,6 +342,13 @@  static int vmd_create_irq_domain(struct vmd_dev *vmd)
 
 static void vmd_remove_irq_domain(struct vmd_dev *vmd)
 {
+	/*
+	 * Some production BIOS won't enable remapping between soft reboots.
+	 * Ensure remapping is restored before unloading the driver.
+	 */
+	if (!vmd->msix_count)
+		vmd_enable_msi_remapping(vmd, true);
+
 	if (vmd->irq_domain) {
 		struct fwnode_handle *fn = vmd->irq_domain->fwnode;
 
@@ -679,15 +703,31 @@  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 
 	sd->node = pcibus_to_node(vmd->dev->bus);
 
-	ret = vmd_create_irq_domain(vmd);
-	if (ret)
-		return ret;
-
 	/*
-	 * Override the irq domain bus token so the domain can be distinguished
-	 * from a regular PCI/MSI domain.
+	 * Currently MSI remapping must be enabled in guest passthrough mode
+	 * due to some missing interrupt remapping plumbing. This is probably
+	 * acceptable because the guest is usually CPU-limited and MSI
+	 * remapping doesn't become a performance bottleneck.
 	 */
-	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
+	if (!(features & VMD_FEAT_BYPASS_MSI_REMAP) || offset[0] || offset[1]) {
+		ret = vmd_alloc_irqs(vmd);
+		if (ret)
+			return ret;
+
+		vmd_enable_msi_remapping(vmd, true);
+
+		ret = vmd_create_irq_domain(vmd);
+		if (ret)
+			return ret;
+
+		/*
+		 * Override the irq domain bus token so the domain can be
+		 * distinguished from a regular PCI/MSI domain.
+		 */
+		irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
+	} else {
+		vmd_enable_msi_remapping(vmd, false);
+	}
 
 	pci_add_resource(&resources, &vmd->resources[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
@@ -753,10 +793,6 @@  static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	if (features & VMD_FEAT_OFFSET_FIRST_VECTOR)
 		vmd->first_vec = 1;
 
-	err = vmd_alloc_irqs(vmd);
-	if (err)
-		return err;
-
 	spin_lock_init(&vmd->cfg_lock);
 	pci_set_drvdata(dev, vmd);
 	err = vmd_enable_domain(vmd, features);
@@ -825,7 +861,8 @@  static const struct pci_device_id vmd_ids[] = {
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP,},
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0),
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW |
-				VMD_FEAT_HAS_BUS_RESTRICTIONS,},
+				VMD_FEAT_HAS_BUS_RESTRICTIONS |
+				VMD_FEAT_BYPASS_MSI_REMAP,},
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x467f),
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
 				VMD_FEAT_HAS_BUS_RESTRICTIONS |