phb4: Use link if degraded

Message ID 20170915002739.20717-1-mikey@neuling.org
State Accepted

Commit Message

Michael Neuling Sept. 15, 2017, 12:27 a.m. UTC
In the recent change:
 3f936bae97 phb4: Retrain link if degraded
We retrain if the link is degraded.  We do 3 retries to get an optimal
link.

Unfortunately if the last retry fails, we mark the PHB as bad and
don't use it. Hence that PHB is lost even though it actually trained
(just degraded).

This fixes the problem by printing an error message (as below) but
still marking the PHB as good.

  [    7.179320404,3] PHB#0005[0:5]: LINK: Link degraded
  [    8.387346665,3] PHB#0005[0:5]: LINK: Link degraded
  [   10.078409137,3] PHB#0005[0:5]: LINK: Link degraded
  [   11.281477269,3] PHB#0005[0:5]: LINK: Link degraded
  [   11.283123885,3] PHB#0005[0:5]: LINK: Degraded but no more retries

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 hw/phb4.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Russell Currey Sept. 15, 2017, 2:22 a.m. UTC | #1
On Fri, 2017-09-15 at 10:27 +1000, Michael Neuling wrote:
> In the recent change:
>  3f936bae97 phb4: Retrain link if degraded
> We retrain if the link is degraded.  We do 3 retries to get an optimal
> link.
> 
> Unfortunately if the last retry fails, we mark the PHB as bad and
> don't use it. Hence that PHB is lost even though it actually trained
> (just degraded).
> 
> This fixes the problem by printing an error message (as below) but
> still marking the PHB as good.
> 
>   [    7.179320404,3] PHB#0005[0:5]: LINK: Link degraded
>   [    8.387346665,3] PHB#0005[0:5]: LINK: Link degraded
>   [   10.078409137,3] PHB#0005[0:5]: LINK: Link degraded
>   [   11.281477269,3] PHB#0005[0:5]: LINK: Link degraded
>   [   11.283123885,3] PHB#0005[0:5]: LINK: Degraded but no more retries
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>

Acked-by: Russell Currey <ruscur@russell.cc>

Stewart Smith Sept. 15, 2017, 8:19 a.m. UTC | #2
Michael Neuling <mikey@neuling.org> writes:
> In the recent change:
>  3f936bae97 phb4: Retrain link if degraded
> We retrain if the link is degraded.  We do 3 retries to get an optimal
> link.
>
> Unfortunately if the last retry fails, we mark the PHB as bad and
> don't use it. Hence that PHB is lost even though it actually trained
> (just degraded).
>
> This fixes the problem by printing an error message (as below) but
> still marking the PHB as good.
>
>   [    7.179320404,3] PHB#0005[0:5]: LINK: Link degraded
>   [    8.387346665,3] PHB#0005[0:5]: LINK: Link degraded
>   [   10.078409137,3] PHB#0005[0:5]: LINK: Link degraded
>   [   11.281477269,3] PHB#0005[0:5]: LINK: Link degraded
>   [   11.283123885,3] PHB#0005[0:5]: LINK: Degraded but no more retries
>
> Signed-off-by: Michael Neuling <mikey@neuling.org>

Awesome.

Merged to master as of 4a2b8317fd3fee38fdce9b6d6e1639752335204a

Pushing to op-build now, so should be easy for others to pick up

Patch

diff --git a/hw/phb4.c b/hw/phb4.c
index c51e57ce2d..620961b12d 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -2622,7 +2622,13 @@  static int64_t phb4_poll_link(struct pci_slot *slot)
 			PHBDBG(p, "LINK: Link is stable\n");
 			if (!phb4_link_optimal(slot)) {
 				PHBERR(p, "LINK: Link degraded\n");
-				return phb4_retry_state(slot);
+				if (slot->link_retries)
+					return phb4_retry_state(slot);
+				/*
+				 * Link is degraded but no more retries, so
+				 * settle for what we have :-(
+				 */
+				PHBERR(p, "LINK: Degraded but no more retries\n");
 			}
 			pci_slot_set_state(slot, PHB4_SLOT_NORMAL);
 			return OPAL_SUCCESS;
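
The control flow this hunk introduces can be modelled outside of skiboot. The following is only a minimal sketch, not the firmware code: struct model_slot, model_poll_link() and model_train_degraded() are hypothetical names, and the assumption that each retry consumes one count from link_retries is mine (the patch only shows that the counter is tested before phb4_retry_state() is called).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the OPAL return codes and slot state. */
enum model_result { MODEL_RETRY, MODEL_SUCCESS };

struct model_slot {
	int link_retries;	/* retries left; the commit message says 3 */
	bool usable;		/* set once the slot is marked normal */
};

/* One pass through the "link is stable" branch touched by the patch. */
static enum model_result model_poll_link(struct model_slot *slot, bool optimal)
{
	assert(slot != NULL);

	if (!optimal) {
		fprintf(stderr, "LINK: Link degraded\n");
		if (slot->link_retries) {
			/* Assumption: a retry consumes one attempt. */
			slot->link_retries--;
			return MODEL_RETRY;
		}
		/* Out of retries: settle for the degraded link. */
		fprintf(stderr, "LINK: Degraded but no more retries\n");
	}
	slot->usable = true;
	return MODEL_SUCCESS;
}

/* Poll a link that never trains optimally; returns the number of polls. */
static int model_train_degraded(struct model_slot *slot)
{
	int polls = 1;

	while (model_poll_link(slot, false) == MODEL_RETRY)
		polls++;
	return polls;
}
```

Starting from link_retries = 3, model_train_degraded() polls four times before the slot is marked usable, which lines up with the four "Link degraded" messages followed by one "Degraded but no more retries" message in the log from the commit message. A link that trains optimally on the first poll returns success without touching the retry counter.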