diff mbox series

[RFC] PCI/MSI: Only mask all MSI-X entries when MSI-X is used

Message ID 20211210161025.3287927-1-sr@denx.de
State New
Headers show
Series [RFC] PCI/MSI: Only mask all MSI-X entries when MSI-X is used | expand

Commit Message

Stefan Roese Dec. 10, 2021, 4:10 p.m. UTC
This patch moves the masking of the MSI-X entries to a later stage in
msix_capability_init(), which is not reached on platforms not
supporting MSI-X. Without this, MSI interrupts from a NVMe drive are not
received at all on this ZynqMP based platform, only supporting legacy
and MSI interrupts.

Background:
This patch fixes a problem on our ZynqMP based system working with newer
NVMe drives which support MSI & MSI-X. Running v5.4 all is fine and
these drives correctly configure an MSI interrupt and this IRQ is
received just fine in the ZynqMP rootport. But when updating to v5.10
or later (I also tested with v5.15 and v5.16-rc4) the MSI interrupt
gets assigned but no interrupts are received by the NVMe driver at all.

Note: The ZynqMP PCIe rootport driver only supports legacy and MSI
interrupts, not MSI-X (yet).

I've debugged the MSI integration of the ZynqMP PCIe rootport driver
(pcie-xilinx-nwl.c) and found no issues there. Also the MSI framework
in the Kernel did not reveal any problems - at least for me. Looking
a bit deeper into the lspci output, I found an interesting difference
between v5.4 and v5.10 (or later).

v5.4:
04:00.0 Non-Volatile memory controller: Marvell Technology Group Ltd. Device 1321 (rev 02) (prog-if 02 [NVM Express])
        ...
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
		Address: 00000000fd480000  Data: 0004
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Endpoint, MSI 00
	...
	Capabilities: [b0] MSI-X: Enable- Count=67 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00003000
	...

v5.10:
04:00.0 Non-Volatile memory controller: Marvell Technology Group Ltd. Device 1321 (rev 02) (prog-if 02 [NVM Express])
        ...
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
                Address: 00000000fd480000  Data: 0004
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
        ...
        Capabilities: [b0] MSI-X: Enable- Count=67 Masked+
                Vector table: BAR=0 offset=00002000
                PBA: BAR=0 offset=00003000
        ...

So the only difference here being the "Masked+" compared to the
"Masked-" in the working v5.4 Kernel. Testing in this area has shown,
that the root cause for the masked bit being set was the call to
msix_mask_all() in msix_capability_init(). Without this, all works just
fine and the MSI interrupts are received again by the NVMe driver.

BTW: I've also tested this problem with the latest version of Thomas's
PCI/MSI Spring cleaning on top of v5.16-rc4. No change - the masked bit
is still set and the MSI interrupt are note received by the NVMe driver.

I'm open to other ideas to fix this issue. So please review and comment.

Fixes: aa8092c1d1f1 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Marek Vasut <marex@denx.de>
---
 drivers/pci/msi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Thomas Gleixner Dec. 11, 2021, 10:17 a.m. UTC | #1
Stefan,

On Fri, Dec 10 2021 at 17:10, Stefan Roese wrote:
> I've debugged the MSI integration of the ZynqMP PCIe rootport driver
> (pcie-xilinx-nwl.c) and found no issues there. Also the MSI framework
> in the Kernel did not reveal any problems - at least for me. Looking
> a bit deeper into the lspci output, I found an interesting difference
> between v5.4 and v5.10 (or later).
>
> v5.4:
> 04:00.0 Non-Volatile memory controller: Marvell Technology Group Ltd. Device 1321 (rev 02) (prog-if 02 [NVM Express])
>         ...
> 	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
> 		Address: 00000000fd480000  Data: 0004
> 		Masking: 00000000  Pending: 00000000
> 	Capabilities: [70] Express (v2) Endpoint, MSI 00
> 	...
> 	Capabilities: [b0] MSI-X: Enable- Count=67 Masked-
> 		Vector table: BAR=0 offset=00002000
> 		PBA: BAR=0 offset=00003000
> 	...
>
> v5.10:
> 04:00.0 Non-Volatile memory controller: Marvell Technology Group Ltd. Device 1321 (rev 02) (prog-if 02 [NVM Express])
>         ...
>         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
>                 Address: 00000000fd480000  Data: 0004
>                 Masking: 00000000  Pending: 00000000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
>         ...
>         Capabilities: [b0] MSI-X: Enable- Count=67 Masked+
>                 Vector table: BAR=0 offset=00002000
>                 PBA: BAR=0 offset=00003000
>         ...
>
> So the only difference here being the "Masked+" compared to the
> "Masked-" in the working v5.4 Kernel. Testing in this area has shown,
> that the root cause for the masked bit being set was the call to
> msix_mask_all() in msix_capability_init(). Without this, all works just
> fine and the MSI interrupts are received again by the NVMe driver.

Not really. The Masked+ in the capabilities entry has nothing to do with
the entries in the table being masked. The Masked+ reflects the
PCI_MSIX_FLAGS_MASKALL bit in the MSI-X control register.

That is set early on and not cleared in the error handling path. The
error handling just clears the MSIX_FLAGS_ENABLE bit.

Can you try the patch below?

It might still be that this Marvell part really combines the per entry
mask bits from MSI-X with MSI, then we need both.

Thanks,

        tglx
---
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -777,7 +777,7 @@ static int msix_capability_init(struct p
 	free_msi_irqs(dev);
 
 out_disable:
-	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
 
 	return ret;
 }
Stefan Roese Dec. 11, 2021, 1:58 p.m. UTC | #2
Hi Thomas,

On 12/11/21 11:17, Thomas Gleixner wrote:
> Stefan,
> 
> On Fri, Dec 10 2021 at 17:10, Stefan Roese wrote:
>> I've debugged the MSI integration of the ZynqMP PCIe rootport driver
>> (pcie-xilinx-nwl.c) and found no issues there. Also the MSI framework
>> in the Kernel did not reveal any problems - at least for me. Looking
>> a bit deeper into the lspci output, I found an interesting difference
>> between v5.4 and v5.10 (or later).
>>
>> v5.4:
>> 04:00.0 Non-Volatile memory controller: Marvell Technology Group Ltd. Device 1321 (rev 02) (prog-if 02 [NVM Express])
>>          ...
>> 	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
>> 		Address: 00000000fd480000  Data: 0004
>> 		Masking: 00000000  Pending: 00000000
>> 	Capabilities: [70] Express (v2) Endpoint, MSI 00
>> 	...
>> 	Capabilities: [b0] MSI-X: Enable- Count=67 Masked-
>> 		Vector table: BAR=0 offset=00002000
>> 		PBA: BAR=0 offset=00003000
>> 	...
>>
>> v5.10:
>> 04:00.0 Non-Volatile memory controller: Marvell Technology Group Ltd. Device 1321 (rev 02) (prog-if 02 [NVM Express])
>>          ...
>>          Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
>>                  Address: 00000000fd480000  Data: 0004
>>                  Masking: 00000000  Pending: 00000000
>>          Capabilities: [70] Express (v2) Endpoint, MSI 00
>>          ...
>>          Capabilities: [b0] MSI-X: Enable- Count=67 Masked+
>>                  Vector table: BAR=0 offset=00002000
>>                  PBA: BAR=0 offset=00003000
>>          ...
>>
>> So the only difference here being the "Masked+" compared to the
>> "Masked-" in the working v5.4 Kernel. Testing in this area has shown,
>> that the root cause for the masked bit being set was the call to
>> msix_mask_all() in msix_capability_init(). Without this, all works just
>> fine and the MSI interrupts are received again by the NVMe driver.
> 
> Not really. The Masked+ in the capabilities entry has nothing to do with
> the entries in the table being masked. The Masked+ reflects the
> PCI_MSIX_FLAGS_MASKALL bit in the MSI-X control register.
> 
> That is set early on and not cleared in the error handling path. The
> error handling just clears the MSIX_FLAGS_ENABLE bit.
> 
> Can you try the patch below?

Sure, please see below.

> It might still be that this Marvell part really combines the per entry
> mask bits from MSI-X with MSI, then we need both.

With your patch applied only (mine not), the Masked+ is gone but still
the MSI interrupts are not received in the system. So you seem to have
guessed correctly, that we need both changes.

How to continue? Should I integrate your patch into mine and send a new
version? Or will you send it separately to the list for integration?

Thanks,
Stefan

> Thanks,
> 
>          tglx
> ---
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -777,7 +777,7 @@ static int msix_capability_init(struct p
>   	free_msi_irqs(dev);
>   
>   out_disable:
> -	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> +	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
>   
>   	return ret;
>   }
> 

Viele Grüße,
Stefan Roese
Thomas Gleixner Dec. 11, 2021, 9:02 p.m. UTC | #3
Stefan,

On Sat, Dec 11 2021 at 14:58, Stefan Roese wrote:
> On 12/11/21 11:17, Thomas Gleixner wrote:
>> Can you try the patch below?
>
> Sure, please see below.
>
>> It might still be that this Marvell part really combines the per entry
>> mask bits from MSI-X with MSI, then we need both.
>
> With your patch applied only (mine not), the Masked+ is gone but still
> the MSI interrupts are not received in the system. So you seem to have
> guessed correctly, that we need both changes.

Groan. How is that device specification compliant?

Vector Control for MSI-X Table Entries
--------------------------------------

"00: Mask bit:  When this bit is set, the function is prohibited from
                sending a message using this MSI-X Table entry.
                ....
                This bit’s state after reset is 1 (entry is masked)."

So how can that work in the first place if that device is PCI
specification compliant? Seems that PCI/SIG compliance program is just
another rubberstamping nonsense.

Can someone who has access to that group please ask them what their
specification compliance stuff is actualy testing?

Sure, that went unnoticed so far on that marvelous device because the
kernel was missing a defense line, but sigh...

> How to continue? Should I integrate your patch into mine and send a new
> version? Or will you send it separately to the list for integration?

Your patch is incomplete. The function can fail later on, which results
in the same problem, no?

So we need something like the below.

Just to satisfy my curiosity:

  The device supports obviously MSI-X, which is preferred over MSI.

  So why is the MSI-X initialization failing in the first place on this
  platform?

Thanks,

        tglx
---
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -722,9 +722,6 @@ static int msix_capability_init(struct p
 		goto out_disable;
 	}
 
-	/* Ensure that all table entries are masked. */
-	msix_mask_all(base, tsize);
-
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
 		goto out_disable;
@@ -751,6 +748,9 @@ static int msix_capability_init(struct p
 	/* Set MSI-X enabled bits and unmask the function */
 	pci_intx_for_msi(dev, 0);
 	dev->msix_enabled = 1;
+
+	/* Ensure that all table entries are masked. */
+	msix_mask_all(base, tsize);
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 
 	pcibios_free_irq(dev);
@@ -777,7 +777,7 @@ static int msix_capability_init(struct p
 	free_msi_irqs(dev);
 
 out_disable:
-	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
 
 	return ret;
 }
Stefan Roese Dec. 14, 2021, 11:10 a.m. UTC | #4
Hi Thomas,

On 12/11/21 22:02, Thomas Gleixner wrote:
> Stefan,
> 
> On Sat, Dec 11 2021 at 14:58, Stefan Roese wrote:
>> On 12/11/21 11:17, Thomas Gleixner wrote:
>>> Can you try the patch below?
>>
>> Sure, please see below.
>>
>>> It might still be that this Marvell part really combines the per entry
>>> mask bits from MSI-X with MSI, then we need both.
>>
>> With your patch applied only (mine not), the Masked+ is gone but still
>> the MSI interrupts are not received in the system. So you seem to have
>> guessed correctly, that we need both changes.
> 
> Groan. How is that device specification compliant?
> 
> Vector Control for MSI-X Table Entries
> --------------------------------------
> 
> "00: Mask bit:  When this bit is set, the function is prohibited from
>                  sending a message using this MSI-X Table entry.
>                  ....
>                  This bit’s state after reset is 1 (entry is masked)."
> 
> So how can that work in the first place if that device is PCI
> specification compliant? Seems that PCI/SIG compliance program is just
> another rubberstamping nonsense.
> 
> Can someone who has access to that group please ask them what their
> specification compliance stuff is actualy testing?
> 
> Sure, that went unnoticed so far on that marvelous device because the
> kernel was missing a defense line, but sigh...
> 
>> How to continue? Should I integrate your patch into mine and send a new
>> version? Or will you send it separately to the list for integration?
> 
> Your patch is incomplete. The function can fail later on, which results
> in the same problem, no?

Yes, agreed.

> So we need something like the below.

The patch below works fine on my ZynqMP platform. MSI interrupts are now
received okay.

> Just to satisfy my curiosity:
> 
>    The device supports obviously MSI-X, which is preferred over MSI.

I would gladly use MSI-X interrupts, if easily possible. But...

>    So why is the MSI-X initialization failing in the first place on this
>    platform?

... the ZyqnMP PCIe rootport driver only support legacy and MSI
interrupts but not MSI-X (yet) [1].

Thanks,
Stefan

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/pci/controller/pcie-xilinx-nwl.c

> Thanks,
> 
>          tglx
> ---
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -722,9 +722,6 @@ static int msix_capability_init(struct p
>   		goto out_disable;
>   	}
>   
> -	/* Ensure that all table entries are masked. */
> -	msix_mask_all(base, tsize);
> -
>   	ret = msix_setup_entries(dev, base, entries, nvec, affd);
>   	if (ret)
>   		goto out_disable;
> @@ -751,6 +748,9 @@ static int msix_capability_init(struct p
>   	/* Set MSI-X enabled bits and unmask the function */
>   	pci_intx_for_msi(dev, 0);
>   	dev->msix_enabled = 1;
> +
> +	/* Ensure that all table entries are masked. */
> +	msix_mask_all(base, tsize);
>   	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
>   
>   	pcibios_free_irq(dev);
> @@ -777,7 +777,7 @@ static int msix_capability_init(struct p
>   	free_msi_irqs(dev);
>   
>   out_disable:
> -	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> +	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
>   
>   	return ret;
>   }
> 

Viele Grüße,
Stefan Roese
diff mbox series

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a7a1c7411348..25b659dd5e2b 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -825,9 +825,6 @@  static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 		goto out_disable;
 	}
 
-	/* Ensure that all table entries are masked. */
-	msix_mask_all(base, tsize);
-
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
 		goto out_disable;
@@ -836,6 +833,9 @@  static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	if (ret)
 		goto out_avail;
 
+	/* Ensure that all table entries are masked. */
+	msix_mask_all(base, tsize);
+
 	/* Check if all MSI entries honor device restrictions */
 	ret = msi_verify_entries(dev);
 	if (ret)