diff mbox series

iommu/vt-d: Cure VF irqdomain hickup

Message ID 87d00imlop.fsf@nanos.tec.linutronix.de
State New
Headers show
Series iommu/vt-d: Cure VF irqdomain hickup | expand

Commit Message

Thomas Gleixner Nov. 12, 2020, 7:15 p.m. UTC
The recent changes to store the MSI irqdomain pointer in struct device
missed that Intel DMAR does not register virtual function devices.  Due to
that a VF device gets the plain PCI-MSI domain assigned and then issues
compat MSI messages which get caught by the interrupt remapping unit.

Cure that by inheriting the irq domain from the physical function
device.

That's a temporary workaround. The correct fix is to inherit the irq domain
from the bus, but that's a larger effort which needs quite some other
changes to the way how x86 manages PCI and MSI domains.

Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/iommu/intel/dmar.c |   19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

Comments

Thomas Gleixner Nov. 12, 2020, 9:34 p.m. UTC | #1
On Thu, Nov 12 2020 at 20:15, Thomas Gleixner wrote:
> The recent changes to store the MSI irqdomain pointer in struct device
> missed that Intel DMAR does not register virtual function devices.  Due to
> that a VF device gets the plain PCI-MSI domain assigned and then issues
> compat MSI messages which get caught by the interrupt remapping unit.
>
> Cure that by inheriting the irq domain from the physical function
> device.
>
> That's a temporary workaround. The correct fix is to inherit the irq domain
> from the bus, but that's a larger effort which needs quite some other
> changes to the way how x86 manages PCI and MSI domains.

Bah, that's not really going to work with the way how irq remapping
works on x86 because at least Intel/DMAR can have more than one DMAR
unit on a bus.

So the alternative solution would be to assign the domain per device,
but the current ordering creates a hen and egg problem. Looking the
domain up in pci_set_msi_domain() does not work because at that point
the device is not registered in the IOMMU. That happens from
device_add().

Marc, is there any problem to reorder the calls in pci_device_add():

      device_add();
      pci_set_msi_domain();

That would allow to add a irq_find_matching_fwspec() based lookup to
pci_msi_get_device_domain().

Though I'm not yet convinced that the outcome would be less horrible
than the hack in the DMAR driver when I'm taking all the other horrors
of x86 (including XEN) into account :)

Thanks,

        tglx
Lu Baolu Nov. 13, 2020, 7:20 a.m. UTC | #2
Hi Thomas,

On 2020/11/13 3:15, Thomas Gleixner wrote:
> The recent changes to store the MSI irqdomain pointer in struct device
> missed that Intel DMAR does not register virtual function devices.  Due to
> that a VF device gets the plain PCI-MSI domain assigned and then issues
> compat MSI messages which get caught by the interrupt remapping unit.
> 
> Cure that by inheriting the irq domain from the physical function
> device.
> 
> That's a temporary workaround. The correct fix is to inherit the irq domain
> from the bus, but that's a larger effort which needs quite some other
> changes to the way how x86 manages PCI and MSI domains.
> 
> Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   drivers/iommu/intel/dmar.c |   19 ++++++++++++++++++-
>   1 file changed, 18 insertions(+), 1 deletion(-)
> 
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -333,6 +333,11 @@ static void  dmar_pci_bus_del_dev(struct
>   	dmar_iommu_notify_scope_dev(info);
>   }
>   
> +static inline void vf_inherit_msi_domain(struct pci_dev *pdev)
> +{
> +	dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
> +}
> +
>   static int dmar_pci_bus_notifier(struct notifier_block *nb,
>   				 unsigned long action, void *data)
>   {
> @@ -342,8 +347,20 @@ static int dmar_pci_bus_notifier(struct
>   	/* Only care about add/remove events for physical functions.
>   	 * For VFs we actually do the lookup based on the corresponding
>   	 * PF in device_to_iommu() anyway. */
> -	if (pdev->is_virtfn)
> +	if (pdev->is_virtfn) {
> +		/*
> +		 * Note: This is a horrible hack and needs to be cleaned
> +		 * up by assigning the domain to the bus, but that's too
> +		 * big of a change for post rc3.
> +		 *
> +		 * Ensure that the VF device inherits the irq domain of the
> +		 * PF device:
> +		 */
> +		if (action == BUS_NOTIFY_ADD_DEVICE)
> +			vf_inherit_msi_domain(pdev);
>   		return NOTIFY_DONE;
> +	}
> +
>   	if (action != BUS_NOTIFY_ADD_DEVICE &&
>   	    action != BUS_NOTIFY_REMOVED_DEVICE)
>   		return NOTIFY_DONE;

We also encountered this problem in internal testing. This patch can
solve the problem.

Acked-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu
Marc Zyngier Nov. 13, 2020, 9:19 a.m. UTC | #3
On 2020-11-12 21:34, Thomas Gleixner wrote:
> On Thu, Nov 12 2020 at 20:15, Thomas Gleixner wrote:
>> The recent changes to store the MSI irqdomain pointer in struct device
>> missed that Intel DMAR does not register virtual function devices.  
>> Due to
>> that a VF device gets the plain PCI-MSI domain assigned and then 
>> issues
>> compat MSI messages which get caught by the interrupt remapping unit.
>> 
>> Cure that by inheriting the irq domain from the physical function
>> device.
>> 
>> That's a temporary workaround. The correct fix is to inherit the irq 
>> domain
>> from the bus, but that's a larger effort which needs quite some other
>> changes to the way how x86 manages PCI and MSI domains.
> 
> Bah, that's not really going to work with the way how irq remapping
> works on x86 because at least Intel/DMAR can have more than one DMAR
> unit on a bus.
> 
> So the alternative solution would be to assign the domain per device,
> but the current ordering creates a hen and egg problem. Looking the
> domain up in pci_set_msi_domain() does not work because at that point
> the device is not registered in the IOMMU. That happens from
> device_add().
> 
> Marc, is there any problem to reorder the calls in pci_device_add():
> 
>       device_add();
>       pci_set_msi_domain();

I *think* it works as long as we keep the "match_driver = false" hack.
Otherwise, we risk binding to a driver early, and game over.

> That would allow to add a irq_find_matching_fwspec() based lookup to
> pci_msi_get_device_domain().

Just so that I understand the issue: is the core of the problem that
there is no 1:1 mapping between a PCI bus and a DMAR unit, and no
firmware topology information to indicate which one to pick?

> Though I'm not yet convinced that the outcome would be less horrible
> than the hack in the DMAR driver when I'm taking all the other horrors
> of x86 (including XEN) into account :)

I tried to follow the notifier into the DMAR driver, ended up in the
IRQ remapping code, and lost the will to live. I have a question though:

In the bus notifier callback, you end-up in dmar_pci_bus_add_dev(),
which calls intel_irq_remap_add_device(), which tries to set the
MSI domain. Why isn't that enough? Are we still missing any information
at that stage?

Thanks,

         M.
Thomas Gleixner Nov. 13, 2020, 1:52 p.m. UTC | #4
On Fri, Nov 13 2020 at 09:19, Marc Zyngier wrote:
> On 2020-11-12 21:34, Thomas Gleixner wrote:
>> That would allow to add a irq_find_matching_fwspec() based lookup to
>> pci_msi_get_device_domain().
>
> Just so that I understand the issue: is the core of the problem that
> there is no 1:1 mapping between a PCI bus and a DMAR unit, and no
> firmware topology information to indicate which one to pick?

Yes, we don't have a 1:1 mapping and there is some info, but that's all
a horrible mess.

>> Though I'm not yet convinced that the outcome would be less horrible
>> than the hack in the DMAR driver when I'm taking all the other horrors
>> of x86 (including XEN) into account :)
>
> I tried to follow the notifier into the DMAR driver, ended up in the
> IRQ remapping code, and lost the will to live.

Please just don't look at that and stay alive :)

> I have a question though:
>
> In the bus notifier callback, you end-up in dmar_pci_bus_add_dev(),
> which calls intel_irq_remap_add_device(), which tries to set the MSI
> domain. Why isn't that enough? Are we still missing any information at
> that stage?

That works, but this code is not reached for VF devices ... See the
patch which cures that.

If we want to get rid of that mess we'd need to rewrite the DMAR IOMMU
device registration completely. I'll leave it as is for now. My will to
live is more important :)

Thanks

        tglx
Geert Uytterhoeven Nov. 16, 2020, 9:47 a.m. UTC | #5
Hi Thomas,

On Thu, Nov 12, 2020 at 8:16 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> The recent changes to store the MSI irqdomain pointer in struct device
> missed that Intel DMAR does not register virtual function devices.  Due to
> that a VF device gets the plain PCI-MSI domain assigned and then issues
> compat MSI messages which get caught by the interrupt remapping unit.
>
> Cure that by inheriting the irq domain from the physical function
> device.
>
> That's a temporary workaround. The correct fix is to inherit the irq domain
> from the bus, but that's a larger effort which needs quite some other
> changes to the way how x86 manages PCI and MSI domains.
>
> Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/iommu/intel/dmar.c |   19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -333,6 +333,11 @@ static void  dmar_pci_bus_del_dev(struct
>         dmar_iommu_notify_scope_dev(info);
>  }
>
> +static inline void vf_inherit_msi_domain(struct pci_dev *pdev)
> +{
> +       dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));

If CONFIG_PCI_ATS is not set:

    error: 'struct pci_dev' has no member named 'physfn'

http://kisskb.ellerman.id.au/kisskb/buildresult/14400927/

Gr{oetje,eeting}s,

                        Geert
Thomas Gleixner Nov. 16, 2020, 12:50 p.m. UTC | #6
Geert,

On Mon, Nov 16 2020 at 10:47, Geert Uytterhoeven wrote:
> On Thu, Nov 12, 2020 at 8:16 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> The recent changes to store the MSI irqdomain pointer in struct device
>> missed that Intel DMAR does not register virtual function devices.  Due to
>> that a VF device gets the plain PCI-MSI domain assigned and then issues
>> compat MSI messages which get caught by the interrupt remapping unit.
>>
>> Cure that by inheriting the irq domain from the physical function
>> device.
>>
>> That's a temporary workaround. The correct fix is to inherit the irq domain
>> from the bus, but that's a larger effort which needs quite some other
>> changes to the way how x86 manages PCI and MSI domains.
>>
>> Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
>> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  drivers/iommu/intel/dmar.c |   19 ++++++++++++++++++-
>>  1 file changed, 18 insertions(+), 1 deletion(-)
>>
>> --- a/drivers/iommu/intel/dmar.c
>> +++ b/drivers/iommu/intel/dmar.c
>> @@ -333,6 +333,11 @@ static void  dmar_pci_bus_del_dev(struct
>>         dmar_iommu_notify_scope_dev(info);
>>  }
>>
>> +static inline void vf_inherit_msi_domain(struct pci_dev *pdev)
>> +{
>> +       dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
>
> If CONFIG_PCI_ATS is not set:
>
>     error: 'struct pci_dev' has no member named 'physfn'
>

thanks for pointing that out. Yet moar ifdeffery, oh well...

Thanks,

        tglx
Lu Baolu Nov. 16, 2020, 12:50 p.m. UTC | #7
On 2020/11/16 17:47, Geert Uytterhoeven wrote:
> Hi Thomas,
> 
> On Thu, Nov 12, 2020 at 8:16 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> The recent changes to store the MSI irqdomain pointer in struct device
>> missed that Intel DMAR does not register virtual function devices.  Due to
>> that a VF device gets the plain PCI-MSI domain assigned and then issues
>> compat MSI messages which get caught by the interrupt remapping unit.
>>
>> Cure that by inheriting the irq domain from the physical function
>> device.
>>
>> That's a temporary workaround. The correct fix is to inherit the irq domain
>> from the bus, but that's a larger effort which needs quite some other
>> changes to the way how x86 manages PCI and MSI domains.
>>
>> Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
>> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>   drivers/iommu/intel/dmar.c |   19 ++++++++++++++++++-
>>   1 file changed, 18 insertions(+), 1 deletion(-)
>>
>> --- a/drivers/iommu/intel/dmar.c
>> +++ b/drivers/iommu/intel/dmar.c
>> @@ -333,6 +333,11 @@ static void  dmar_pci_bus_del_dev(struct
>>          dmar_iommu_notify_scope_dev(info);
>>   }
>>
>> +static inline void vf_inherit_msi_domain(struct pci_dev *pdev)
>> +{
>> +       dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
> 
> If CONFIG_PCI_ATS is not set:
> 
>      error: 'struct pci_dev' has no member named 'physfn'
> 
> http://kisskb.ellerman.id.au/kisskb/buildresult/14400927/

Maybe pci_physfn() helper should be used here.

Best regards,
baolu
Jason Gunthorpe Nov. 16, 2020, 11:22 p.m. UTC | #8
On Thu, Nov 12, 2020 at 08:15:02PM +0100, Thomas Gleixner wrote:
> The recent changes to store the MSI irqdomain pointer in struct device
> missed that Intel DMAR does not register virtual function devices.  Due to
> that a VF device gets the plain PCI-MSI domain assigned and then issues
> compat MSI messages which get caught by the interrupt remapping unit.
> 
> Cure that by inheriting the irq domain from the physical function
> device.
> 
> That's a temporary workaround. The correct fix is to inherit the irq domain
> from the bus, but that's a larger effort which needs quite some other
> changes to the way how x86 manages PCI and MSI domains.
> 
> Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/iommu/intel/dmar.c |   19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)

Our QA says it solves the issue:

Tested-by: Itay Aveksis <itayav@nvidia.com>

Thanks,
Jason
diff mbox series

Patch

--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -333,6 +333,11 @@  static void  dmar_pci_bus_del_dev(struct
 	dmar_iommu_notify_scope_dev(info);
 }
 
+static inline void vf_inherit_msi_domain(struct pci_dev *pdev)
+{
+	dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
+}
+
 static int dmar_pci_bus_notifier(struct notifier_block *nb,
 				 unsigned long action, void *data)
 {
@@ -342,8 +347,20 @@  static int dmar_pci_bus_notifier(struct
 	/* Only care about add/remove events for physical functions.
 	 * For VFs we actually do the lookup based on the corresponding
 	 * PF in device_to_iommu() anyway. */
-	if (pdev->is_virtfn)
+	if (pdev->is_virtfn) {
+		/*
+		 * Note: This is a horrible hack and needs to be cleaned
+		 * up by assigning the domain to the bus, but that's too
+		 * big of a change for post rc3.
+		 *
+		 * Ensure that the VF device inherits the irq domain of the
+		 * PF device:
+		 */
+		if (action == BUS_NOTIFY_ADD_DEVICE)
+			vf_inherit_msi_domain(pdev);
 		return NOTIFY_DONE;
+	}
+
 	if (action != BUS_NOTIFY_ADD_DEVICE &&
 	    action != BUS_NOTIFY_REMOVED_DEVICE)
 		return NOTIFY_DONE;