[SRU,Artful,Bionic,Cosmic,1/1] powerpc/eeh: Fix enabling bridge MMIO windows

Message ID 267ae9c8a14f88afb486a16cca80782db4453fee.1526496085.git.joseph.salisbury@canonical.com
State New
Series powerpc/eeh: Fix enabling bridge MMIO windows

Commit Message

Joseph Salisbury May 23, 2018, 4:18 p.m. UTC
From: Michael Neuling <mikey@neuling.org>

BugLink: http://bugs.launchpad.net/bugs/1771344

On boot we save the configuration space of PCIe bridges. We do this so
that when we get an EEH event and everything gets reset, we can restore
it.

Unfortunately we save this state before we've enabled the MMIO space
on the bridges. Hence if we have to reset the bridge, MMIO is not
enabled when we come back, and we end up taking a PE freeze when the
driver starts accessing the device again.

This patch forces the memory/MMIO and bus mastering on when restoring
bridges on EEH. Ideally we'd do this correctly by saving the
configuration space writes later, but that will have to come later in
a larger EEH rewrite. For now we have this simple fix.
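
As a rough illustration of the fix described above, the sketch below
computes the value written back to the command register. The bit values
match the kernel's pci_regs.h; the helper names and the sample saved
value are hypothetical and exist only for this example, they are not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND_MEMORY 0x2 /* respond to memory space accesses */
#define PCI_COMMAND_MASTER 0x4 /* allow bus mastering */

/*
 * Hypothetical helper (not in the kernel): take the dword saved at boot
 * and force memory decoding and bus mastering on before it is restored.
 */
uint32_t bridge_restore_command(uint32_t saved)
{
	return saved | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER;
}

/* Walk through one case: a snapshot taken before either bit was set. */
void bridge_restore_example(void)
{
	uint32_t saved = 0x00100000; /* assumed value, illustration only */

	assert((saved & PCI_COMMAND_MEMORY) == 0);
	assert(bridge_restore_command(saved) == 0x00100006);
	/* Bits that were already set are left as they were. */
	assert(bridge_restore_command(0x00100006) == 0x00100006);
}
```

Because the bits are OR-ed in, a snapshot that already had them set is
written back unchanged, so the forced enable only matters for state
captured too early.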

The original bug can be triggered on a boston machine by doing:
  echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound
On boston, this PHB has a PCIe switch on it. Without this patch,
you'll see two EEH events: one expected and one that is the failure we
are fixing here. The second EEH event causes anything under the PHB to
disappear (e.g. the i40e Ethernet device).

With this patch, only 1 EEH event occurs and devices properly recover.

Fixes: 652defed4875 ("powerpc/eeh: Check PCIe link after reset")
Cc: stable@vger.kernel.org # v3.11+
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 13a83eac373c49c0a081cbcd137e79210fe78acd)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
---
 arch/powerpc/kernel/eeh_pe.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Stefan Bader June 4, 2018, 8:08 p.m. UTC | #1
On 23.05.2018 09:18, Joseph Salisbury wrote:
Acked-by: Stefan Bader <stefan.bader@canonical.com>

Kleber Sacilotto de Souza June 5, 2018, 7:43 p.m. UTC | #2
On 05/23/18 09:18, Joseph Salisbury wrote:
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>


Patch

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 2d4956e..ee5a67d 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -807,7 +807,8 @@  static void eeh_restore_bridge_bars(struct eeh_dev *edev)
 	eeh_ops->write_config(pdn, 15*4, 4, edev->config_space[15]);
 
 	/* PCI Command: 0x4 */
-	eeh_ops->write_config(pdn, PCI_COMMAND, 4, edev->config_space[1]);
+	eeh_ops->write_config(pdn, PCI_COMMAND, 4, edev->config_space[1] |
+			      PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
 
 	/* Check the PCIe link is ready */
 	eeh_bridge_check_link(edev);
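
The hunk above writes 4 bytes at PCI_COMMAND from config_space[1]. A
minimal sketch of how that saved dword maps onto the registers follows,
assuming config_space[] holds consecutive 32-bit words of configuration
space (which is what the `15*4` offset used for index 15 suggests). The
helper names are hypothetical; the register offsets match pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets as defined in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND 0x04 /* 16-bit command register */
#define PCI_STATUS  0x06 /* 16-bit status register */

/* Hypothetical helper: byte offset of the saved dword at a given index. */
unsigned int config_dword_offset(unsigned int index)
{
	return index * 4;
}

/* The low half of the dword at offset 0x4 is the command register. */
uint16_t dword_command(uint32_t dword)
{
	return (uint16_t)(dword & 0xffff);
}

/* The high half of the same dword is the status register. */
uint16_t dword_status(uint32_t dword)
{
	return (uint16_t)(dword >> 16);
}
```

Since PCI_COMMAND_MEMORY and PCI_COMMAND_MASTER are bits 1 and 2, the
OR in the patch changes only the command half of the dword and leaves
the status half as it was saved.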