PCI: Clear PCI_STATUS when setting up the device

Message ID 20220517043738.2308499-1-kai.heng.feng@canonical.com
State New

Commit Message

Kai-Heng Feng May 17, 2022, 4:37 a.m. UTC
We are seeing the Master Abort bit set on an Intel I350 Ethernet device and
its root port right after boot; it was probably set during the BIOS phase:

00:06.0 PCI bridge [0604]: Intel Corporation Device [8086:464d] (rev 05) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-

6e:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
        Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-

6e:00.1 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
        Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-

And the Master Abort bit is cleared after S3.

Since no functional impact has been found, clear PCI_STATUS during device
setup so that the error state starts fresh.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215989
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/pci/probe.c | 3 +++
 1 file changed, 3 insertions(+)
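
For readers mapping the lspci flags above onto register bits, here is a
minimal user-space sketch. The PCI_STATUS_* values are the ones defined in
include/uapi/linux/pci_regs.h; everything prefixed hyp_/HYP_ is a
hypothetical name made up for this illustration and is not part of the
patch. The raw value is inferred from the flags shown (Cap+ and <MAbort+
set, DEVSEL=fast encoded as 0), not read from the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_STATUS_CAP_LIST          0x0010 /* lspci "Cap"     */
#define PCI_STATUS_PARITY            0x0100 /* lspci "ParErr"  */
#define PCI_STATUS_SIG_TARGET_ABORT  0x0800 /* lspci ">TAbort" */
#define PCI_STATUS_REC_TARGET_ABORT  0x1000 /* lspci "<TAbort" */
#define PCI_STATUS_REC_MASTER_ABORT  0x2000 /* lspci "<MAbort" */
#define PCI_STATUS_SIG_SYSTEM_ERROR  0x4000 /* lspci ">SERR"   */
#define PCI_STATUS_DETECTED_PARITY   0x8000 /* lspci "<PERR"   */

/* Hypothetical: Status value implied by the lspci lines in this report. */
#define HYP_REPORTED_STATUS \
	(PCI_STATUS_CAP_LIST | PCI_STATUS_REC_MASTER_ABORT)

/* Hypothetical: all latched abort/error flags in the Status register. */
#define HYP_STATUS_ERROR_BITS \
	(PCI_STATUS_PARITY | PCI_STATUS_SIG_TARGET_ABORT | \
	 PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_REC_MASTER_ABORT | \
	 PCI_STATUS_SIG_SYSTEM_ERROR | PCI_STATUS_DETECTED_PARITY)

/* Hypothetical helper: nonzero if a Received Master Abort is latched. */
static inline int hyp_master_abort_latched(uint16_t status)
{
	return (status & PCI_STATUS_REC_MASTER_ABORT) != 0;
}

/* Hypothetical helper: keep only the abort/error flags of a Status value. */
static inline uint16_t hyp_error_bits(uint16_t status)
{
	return (uint16_t)(status & HYP_STATUS_ERROR_BITS);
}

/* Small self-check tying the helpers to the values in this report. */
static inline void hyp_example(void)
{
	assert(hyp_master_abort_latched(HYP_REPORTED_STATUS));
	assert(hyp_error_bits(HYP_REPORTED_STATUS) ==
	       PCI_STATUS_REC_MASTER_ABORT);
}
```

With these flags, <MAbort is the only error bit set on all three devices,
which matches the description above.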

Comments

Bjorn Helgaas July 5, 2022, 9 p.m. UTC | #1
On Tue, May 17, 2022 at 12:37:38PM +0800, Kai-Heng Feng wrote:
> We are seeing Master Abort bit is set on Intel I350 ethernet device and its
> root port right after boot, probably happened during BIOS phase:
> 
> 00:06.0 PCI bridge [0604]: Intel Corporation Device [8086:464d] (rev 05) (prog-if 00 [Normal decode])
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 
> 6e:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
>         Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 
> 6e:00.1 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
>         Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 
> And the Master Abort bit is cleared after S3.
> 
> Since there's no functional impact found, clear the PCI_STATUS to treat
> it anew at setting up.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215989
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

Applied to pci/err for v5.20, thanks!

> ---
>  drivers/pci/probe.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 17a969942d370..414f659dc8735 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1890,6 +1890,9 @@ int pci_setup_device(struct pci_dev *dev)
>  
>  	dev->broken_intx_masking = pci_intx_mask_broken(dev);
>  
> +	/* Clear errors left from system firmware */
> +	pci_write_config_word(dev, PCI_STATUS, 0xffff);
> +
>  	switch (dev->hdr_type) {		    /* header type */
>  	case PCI_HEADER_TYPE_NORMAL:		    /* standard header */
>  		if (class == PCI_CLASS_BRIDGE_PCI)
> -- 
> 2.34.1
>
Bjorn Helgaas Nov. 4, 2022, 3:53 p.m. UTC | #2
On Tue, May 17, 2022 at 12:37:38PM +0800, Kai-Heng Feng wrote:
> We are seeing Master Abort bit is set on Intel I350 ethernet device and its
> root port right after boot, probably happened during BIOS phase:
> 
> 00:06.0 PCI bridge [0604]: Intel Corporation Device [8086:464d] (rev 05) (prog-if 00 [Normal decode])
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 
> 6e:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
>         Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 
> 6e:00.1 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
>         Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 
> And the Master Abort bit is cleared after S3.
> 
> Since there's no functional impact found, clear the PCI_STATUS to treat
> it anew at setting up.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215989
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

This patch appeared in v6.0 as 6cd514e58f12 ("PCI: Clear PCI_STATUS
when setting up device").  Christophe reported in the bugzilla that it
causes boot failures:

> --- Comment #3 from Christophe Fergeau (cfergeau@redhat.com) ---
> This commit
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cd514e58f12b211d638dbf6f791fa18d854f09c
> references this issue.
> This commit causes boot failures when trying to start linux guests with Apple's
> hypervisor framework (for example using https://github.com/evansm7/vftool ).
> After reverting it, I can successfully boot 6.1-rc kernels on my macOS12 x86_64
> macbook. With this commit, the VM fails to start, additional details in
> https://bugzilla.redhat.com/show_bug.cgi?id=2137803


> ---
>  drivers/pci/probe.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 17a969942d370..414f659dc8735 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1890,6 +1890,9 @@ int pci_setup_device(struct pci_dev *dev)
>  
>  	dev->broken_intx_masking = pci_intx_mask_broken(dev);
>  
> +	/* Clear errors left from system firmware */
> +	pci_write_config_word(dev, PCI_STATUS, 0xffff);
> +
>  	switch (dev->hdr_type) {		    /* header type */
>  	case PCI_HEADER_TYPE_NORMAL:		    /* standard header */
>  		if (class == PCI_CLASS_BRIDGE_PCI)
> -- 
> 2.34.1
>
Bjorn Helgaas Nov. 7, 2022, 9:37 p.m. UTC | #3
On Fri, Nov 04, 2022 at 10:53:39AM -0500, Bjorn Helgaas wrote:
> On Tue, May 17, 2022 at 12:37:38PM +0800, Kai-Heng Feng wrote:
> > We are seeing Master Abort bit is set on Intel I350 ethernet device and its
> > root port right after boot, probably happened during BIOS phase:
> > 
> > 00:06.0 PCI bridge [0604]: Intel Corporation Device [8086:464d] (rev 05) (prog-if 00 [Normal decode])
> >         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > 
> > 6e:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
> >         Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
> >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > 
> > 6e:00.1 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
> >         Subsystem: Intel Corporation Ethernet Server Adapter I350-T2 [8086:00a2]
> >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > 
> > And the Master Abort bit is cleared after S3.
> > 
> > Since there's no functional impact found, clear the PCI_STATUS to treat
> > it anew at setting up.
> > 
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215989
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> 
> This patch appeared in v6.0 as 6cd514e58f12 ("PCI: Clear PCI_STATUS
> when setting up device").  Christophe reported in the bugzilla that it
> causes boot failures:
> 
> > --- Comment #3 from Christophe Fergeau (cfergeau@redhat.com) ---
> > This commit
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cd514e58f12b211d638dbf6f791fa18d854f09c
> > references this issue.
> > This commit causes boot failures when trying to start linux guests with Apple's
> > hypervisor framework (for example using https://github.com/evansm7/vftool ).
> > After reverting it, I can successfully boot 6.1-rc kernels on my macOS12 x86_64
> > macbook. With this commit, the VM fails to start, additional details in
> > https://bugzilla.redhat.com/show_bug.cgi?id=2137803

I queued up a revert for v6.2.  Obviously I would prefer if we could
figure out how to clear PCI_STATUS while still letting the guests
boot, but I have no idea how to debug the boot failures.

  commit 44e985938e85 ("Revert "PCI: Clear PCI_STATUS when setting up device"")
  Author: Bjorn Helgaas <bhelgaas@google.com>
  Date:   Mon Nov 7 15:31:08 2022 -0600

      Revert "PCI: Clear PCI_STATUS when setting up device"
      
      This reverts commit 6cd514e58f12b211d638dbf6f791fa18d854f09c.
      
      Christophe Fergeau reported that 6cd514e58f12 ("PCI: Clear PCI_STATUS when
      setting up device") causes boot failures when trying to start linux guests
      with Apple's virtualization framework (for example using
      https://developer.apple.com/documentation/virtualization/running_linux_in_a_virtual_machine?language=objc)
      
      6cd514e58f12 only solved a cosmetic problem, so revert it to fix the boot
      failures.
      
      Link: https://bugzilla.redhat.com/show_bug.cgi?id=2137803
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

  diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
  index b66fa42c4b1f..1d6f7b502020 100644
  --- a/drivers/pci/probe.c
  +++ b/drivers/pci/probe.c
  @@ -1891,9 +1891,6 @@ int pci_setup_device(struct pci_dev *dev)
   
   	dev->broken_intx_masking = pci_intx_mask_broken(dev);
   
  -	/* Clear errors left from system firmware */
  -	pci_write_config_word(dev, PCI_STATUS, 0xffff);
  -
   	switch (dev->hdr_type) {		    /* header type */
   	case PCI_HEADER_TYPE_NORMAL:		    /* standard header */
   		if (class == PCI_CLASS_BRIDGE_PCI)

> > ---
> >  drivers/pci/probe.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 17a969942d370..414f659dc8735 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1890,6 +1890,9 @@ int pci_setup_device(struct pci_dev *dev)
> >  
> >  	dev->broken_intx_masking = pci_intx_mask_broken(dev);
> >  
> > +	/* Clear errors left from system firmware */
> > +	pci_write_config_word(dev, PCI_STATUS, 0xffff);
> > +
> >  	switch (dev->hdr_type) {		    /* header type */
> >  	case PCI_HEADER_TYPE_NORMAL:		    /* standard header */
> >  		if (class == PCI_CLASS_BRIDGE_PCI)
> > -- 
> > 2.34.1
> >

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 17a969942d370..414f659dc8735 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1890,6 +1890,9 @@  int pci_setup_device(struct pci_dev *dev)
 
 	dev->broken_intx_masking = pci_intx_mask_broken(dev);
 
+	/* Clear errors left from system firmware */
+	pci_write_config_word(dev, PCI_STATUS, 0xffff);
+
 	switch (dev->hdr_type) {		    /* header type */
 	case PCI_HEADER_TYPE_NORMAL:		    /* standard header */
 		if (class == PCI_CLASS_BRIDGE_PCI)
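
Why writing 0xffff clears the errors: on a device that follows the PCI
specification, the error bits in the Status register are write-1-to-clear,
and the remaining bits are read-only and ignore writes. The sketch below
models that behaviour in user space. It is an illustration under that
assumption only: the hyp_/HYP_ names are hypothetical, and it does not model
emulated devices. The thread does not establish why the write broke guests
under Apple's virtualization framework.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_STATUS_CAP_LIST          0x0010 /* read-only */
#define PCI_STATUS_PARITY            0x0100
#define PCI_STATUS_SIG_TARGET_ABORT  0x0800
#define PCI_STATUS_REC_TARGET_ABORT  0x1000
#define PCI_STATUS_REC_MASTER_ABORT  0x2000
#define PCI_STATUS_SIG_SYSTEM_ERROR  0x4000
#define PCI_STATUS_DETECTED_PARITY   0x8000

/* Hypothetical: the write-1-to-clear bits of a spec-conforming device. */
#define HYP_STATUS_RW1C_MASK \
	(PCI_STATUS_PARITY | PCI_STATUS_SIG_TARGET_ABORT | \
	 PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_REC_MASTER_ABORT | \
	 PCI_STATUS_SIG_SYSTEM_ERROR | PCI_STATUS_DETECTED_PARITY)

/*
 * Hypothetical model: the Status value a conforming device holds after a
 * 16-bit config write. A 1 written to an RW1C bit clears it, a 0 leaves
 * it alone, and read-only bits are unaffected either way.
 */
static inline uint16_t hyp_status_after_write(uint16_t current,
					       uint16_t written)
{
	return (uint16_t)(current & ~(written & HYP_STATUS_RW1C_MASK));
}

/* Self-check using the Status value implied by the lspci output. */
static inline void hyp_example(void)
{
	uint16_t status = PCI_STATUS_CAP_LIST | PCI_STATUS_REC_MASTER_ABORT;

	/* The patch writes 0xffff: only Cap survives. */
	assert(hyp_status_after_write(status, 0xffff) == PCI_STATUS_CAP_LIST);
	/* Writing 0 changes nothing. */
	assert(hyp_status_after_write(status, 0) == status);
}
```

In this model the write in pci_setup_device() leaves the read-only
capability and timing bits untouched and drops only the latched error flags.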