PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

Message ID 20180730212146.31909-1-mr.nuke.me@gmail.com
State Changes Requested
Delegated to: Bjorn Helgaas
Headers show
Series
  • PCI/MSI: Don't touch MSI bits when the PCI device is disconnected
Related show

Commit Message

Alexandru Gagniuc July 30, 2018, 9:21 p.m.
When a PCI device is gone, we don't want to send IO to it if we can
avoid it. We expose functionality via the irq_chip structure. As
users of that structure may not know about the underlying PCI device,
it's our responsibility to guard against removed devices.

irq_write_msi_msg is already guarded. pci_msi_(un)mask_irq are not.
Guard them for completeness.

For example, surprise removal of a PCIe device triggers teardown. This
touches the irq_chips ops some point to disable the interrupts. I/O
generated here can crash the system on machines with buggy firmware.
Not triggering the IO in the first place eliminates the problem.

Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
---

There's another patch by Lukas Wunner that is needed (not yet published)
in order to fully block IO on SURPRISE!!! removal. The existing code only
sets the PCI_DEV_DISCONNECTED bit in an unreasonably narrow set of
circumstances. Lukas' patch fixes that.

However, this change is otherwise fully independent, and enjoy!

 drivers/pci/msi.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Alexandru Gagniuc Aug. 29, 2018, 4:59 p.m. | #1
Should I resubmit this rebased on 4.19-rc*, or just leave this patch as is?

Alex

On 07/30/2018 04:21 PM, Alexandru Gagniuc wrote:
> When a PCI device is gone, we don't want to send IO to it if we can
> avoid it. We expose functionality via the irq_chip structure. As
> users of that structure may not know about the underlying PCI device,
> it's our responsibility to guard against removed devices.
> 
> irq_write_msi_msg is already guarded. pci_msi_(un)mask_irq are not.
> Guard them for completeness.
> 
> For example, surprise removal of a PCIe device triggers teardown. This
> touches the irq_chips ops some point to disable the interrupts. I/O
> generated here can crash the system on machines with buggy firmware.
> Not triggering the IO in the first place eliminates the problem.
> 
> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
> ---
> 
> There's another patch by Lukas Wunner that is needed (not yet published)
> in order to fully block IO on SURPRISE!!! removal. The existing code only
> sets the PCI_DEV_DISCONNECTED bit in an unreasonably narrow set of
> circumstances. Lukas' patch fixes that.
> 
> However, this change is otherwise fully independent, and enjoy!
> 
>   drivers/pci/msi.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 4d88afdfc843..5f47b5cb0401 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
>   {
>   	struct msi_desc *desc = irq_data_get_msi_desc(data);
>   
> +	if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc)))
> +		return;
> +
>   	if (desc->msi_attrib.is_msix) {
>   		msix_mask_irq(desc, flag);
>   		readl(desc->mask_base);		/* Flush write to device */
>
Bjorn Helgaas Sept. 12, 2018, 9:28 p.m. | #2
On Mon, Jul 30, 2018 at 04:21:44PM -0500, Alexandru Gagniuc wrote:
> When a PCI device is gone, we don't want to send IO to it if we can
> avoid it. We expose functionality via the irq_chip structure. As
> users of that structure may not know about the underlying PCI device,
> it's our responsibility to guard against removed devices.

I'm pretty ambivalent about pci_dev_is_disconnected() in general, but
I think I'll take this, given a couple minor changelog clarifications:

> irq_write_msi_msg is already guarded. pci_msi_(un)mask_irq are not.
> Guard them for completeness.

By the irq_write_msi_msg() guard, I guess you mean this path:

  pci_msi_domain_write_msg       # irq_chip.irq_write_msi_msg
    __pci_write_msi_msg
      if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev))
        /* don't touch */

pci_msi_(un)mask_irq() may be irq_chip.irq_mask, .irq_unmask, etc
pointers.  So these are parallel because they're all irq_chip function
pointers, but the changelog isn't (yet) parallel because it uses the
irq_chip pointer name for .irq_write_msi_msg but not for mask/unmask.

> For example, surprise removal of a PCIe device triggers teardown. This
> touches the irq_chips ops some point to disable the interrupts. I/O
> generated here can crash the system on machines with buggy firmware.
> Not triggering the IO in the first place eliminates the problem.

It doesn't eliminate the problem completely because .irq_mask() and
.irq_unmask() may be called for reasons other than surprise removal,
and if a surprise removal happens after the pci_dev_is_disconnected()
check but before the readl(), we will still generate I/O to a device
that's gone.  I'd be OK if you said it "reduces" the problem.

One reason I'm ambivalent about pci_dev_is_disconnected() is that in
cases like this, it turns a reproducible problem into a very
hard-to-reproduce problem, which reduces the likelihood that the buggy
firmware will be fixed.

Do you have information about known platforms with this buggy firmware
and the signature of the crash?  If you do, it's always nice to be
able to connect a patch with the user-visible problem it fixes.

> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
> ---
> 
> There's another patch by Lukas Wunner that is needed (not yet published)
> in order to fully block IO on SURPRISE!!! removal. The existing code only
> sets the PCI_DEV_DISCONNECTED bit in an unreasonably narrow set of
> circumstances. Lukas' patch fixes that.
> 
> However, this change is otherwise fully independent, and enjoy!
> 
>  drivers/pci/msi.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 4d88afdfc843..5f47b5cb0401 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
>  {
>  	struct msi_desc *desc = irq_data_get_msi_desc(data);
>  
> +	if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc)))
> +		return;
> +
>  	if (desc->msi_attrib.is_msix) {
>  		msix_mask_irq(desc, flag);
>  		readl(desc->mask_base);		/* Flush write to device */
> -- 
> 2.17.1
>
Alex_Gagniuc@Dellteam.com Sept. 18, 2018, 1:07 p.m. | #3
On 9/12/2018 4:28 PM, Bjorn Helgaas wrote:
> On Mon, Jul 30, 2018 at 04:21:44PM -0500, Alexandru Gagniuc wrote:
>> When a PCI device is gone, we don't want to send IO to it if we can
>> avoid it. We expose functionality via the irq_chip structure. As
>> users of that structure may not know about the underlying PCI device,
>> it's our responsibility to guard against removed devices.
> 
> I'm pretty ambivalent about pci_dev_is_disconnected() in general, but
> I think I'll take this, given a couple minor changelog clarifications:
> 
>> irq_write_msi_msg is already guarded. pci_msi_(un)mask_irq are not.
>> Guard them for completeness.
> 
> By the irq_write_msi_msg() guard, I guess you mean this path:
> 
>    pci_msi_domain_write_msg       # irq_chip.irq_write_msi_msg
>      __pci_write_msi_msg
>        if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev))
>          /* don't touch */

Yes!

> pci_msi_(un)mask_irq() may be irq_chip.irq_mask, .irq_unmask, etc
> pointers.  So these are parallel because they're all irq_chip function
> pointers, but the changelog isn't (yet) parallel because it uses the
> irq_chip pointer name for .irq_write_msi_msg but not for mask/unmask

Good catch! I'll get this corrected.


>> For example, surprise removal of a PCIe device triggers teardown. This
>> touches the irq_chips ops some point to disable the interrupts. I/O
>> generated here can crash the system on machines with buggy firmware.
>> Not triggering the IO in the first place eliminates the problem.
> 
> It doesn't eliminate the problem completely because .irq_mask() and
> .irq_unmask() may be called for reasons other than surprise removal,
> and if a surprise removal happens after the pci_dev_is_disconnected()
> check but before the readl(), we will still generate I/O to a device
> that's gone.  I'd be OK if you said it "reduces" the problem.

That sounds reasonable.

> One reason I'm ambivalent about pci_dev_is_disconnected() is that in
> cases like this, it turns a reproducible problem into a very
> hard-to-reproduce problem, which reduces the likelihood that the buggy
> firmware will be fixed.

If it manages to turn this into 99.999% territory, I'll be much happier. 
I'd love to give you an academically correct solution, but I just don't 
see how, given how firmware-first philosophy is written.

> Do you have information about known platforms with this buggy firmware
> and the signature of the crash?  If you do, it's always nice to be
> able to connect a patch with the user-visible problem it fixes.

 From what I've heard, it won't be fixed. The number of changes needed 
would require re-qualifying the firmware. I'm told that's very hard to 
do on platforms that are shipping. I can reword this to say 
"firmware-first" instead of "buggy" since they are mostly synonymous.

Alex

>> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
>> ---
>>
>> There's another patch by Lukas Wunner that is needed (not yet published)
>> in order to fully block IO on SURPRISE!!! removal. The existing code only
>> sets the PCI_DEV_DISCONNECTED bit in an unreasonably narrow set of
>> circumstances. Lukas' patch fixes that.
>>
>> However, this change is otherwise fully independent, and enjoy!
>>
>>   drivers/pci/msi.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>> index 4d88afdfc843..5f47b5cb0401 100644
>> --- a/drivers/pci/msi.c
>> +++ b/drivers/pci/msi.c
>> @@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
>>   {
>>   	struct msi_desc *desc = irq_data_get_msi_desc(data);
>>   
>> +	if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc)))
>> +		return;
>> +
>>   	if (desc->msi_attrib.is_msix) {
>>   		msix_mask_irq(desc, flag);
>>   		readl(desc->mask_base);		/* Flush write to device */
>> -- 
>> 2.17.1
>>
>

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 4d88afdfc843..5f47b5cb0401 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -227,6 +227,9 @@  static void msi_set_mask_bit(struct irq_data *data, u32 flag)
 {
 	struct msi_desc *desc = irq_data_get_msi_desc(data);
 
+	if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc)))
+		return;
+
 	if (desc->msi_attrib.is_msix) {
 		msix_mask_irq(desc, flag);
 		readl(desc->mask_base);		/* Flush write to device */