diff mbox series

[net] net: phy: make phy_error() report which PHY has failed

Message ID E1ihCLZ-0001Vo-Nw@rmk-PC.armlinux.org.uk
State Rejected
Delegated to: David Miller
Headers show
Series [net] net: phy: make phy_error() report which PHY has failed | expand

Commit Message

Russell King (Oracle) Dec. 17, 2019, 12:53 p.m. UTC
phy_error() is called from phy_interrupt() or phy_state_machine(), and
uses WARN_ON() to print a backtrace. The backtrace is not useful when
reporting a PHY error.

However, a system may contain multiple ethernet PHYs, and phy_error()
gives no clue which one caused the problem.

Replace WARN_ON() with a call to phydev_err() so that we can see which
PHY had an error, and also inform the user that we are halting the PHY.

Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
There is another related problem in this area. If an error is detected
while the PHY is running, phy_error() moves to PHY_HALTED state. If we
try to take the network device down, then:

void phy_stop(struct phy_device *phydev)
{
        if (!phy_is_started(phydev)) {
                WARN(1, "called from state %s\n",
                     phy_state_to_str(phydev->state));
                return;
        }

triggers, and we never do any of the phy_stop() cleanup. I'm not sure
what the best way to solve this is - introducing a PHY_ERROR state may
be a solution, but I think we want some phy_is_started() sites to
return true for it and others to return false.

Heiner - you introduced the above warning, could you look at improving
this case so we don't print a warning and taint the kernel when taking
a network device down after phy_error() please?

Thanks.

 drivers/net/phy/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Heiner Kallweit Dec. 17, 2019, 9:41 p.m. UTC | #1
On 17.12.2019 13:53, Russell King wrote:
> phy_error() is called from phy_interrupt() or phy_state_machine(), and
> uses WARN_ON() to print a backtrace. The backtrace is not useful when
> reporting a PHY error.
> 
> However, a system may contain multiple ethernet PHYs, and phy_error()
> gives no clue which one caused the problem.
> 
> Replace WARN_ON() with a call to phydev_err() so that we can see which
> PHY had an error, and also inform the user that we are halting the PHY.
> 
> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
> There is another related problem in this area. If an error is detected
> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
> try to take the network device down, then:
> 
> void phy_stop(struct phy_device *phydev)
> {
>         if (!phy_is_started(phydev)) {
>                 WARN(1, "called from state %s\n",
>                      phy_state_to_str(phydev->state));
>                 return;
>         }
> 
> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
> what the best way to solve this is - introducing a PHY_ERROR state may
> be a solution, but I think we want some phy_is_started() sites to
> return true for it and others to return false.
> 
> Heiner - you introduced the above warning, could you look at improving
> this case so we don't print a warning and taint the kernel when taking
> a network device down after phy_error() please?
> 
I think we need both types of information:
- the affected PHY device
- the stack trace to see where the issue was triggered

In general it's not fully clear yet what's the appropriate reaction
after a PHY error. Few reasons for PHY errors I see:
- MDIO bus may not be accessible, e.g. because parent device is in
  power-down mode
- Frequently polling is used to determine end of a MDIO operation.
  If timeout for polling is too low, then we may end up with an
  -ETIMEDOUT.

In case of singular timeouts they may be acceptable or not.
- If we miss a single PHY status poll, then this may be acceptable.
- But if e.g. a relevant setting in config_init fails, then this
  may not be acceptable.

The current behavior has been existing for the last 15 years,
and I'm just aware of one issue with PHY errors. The case I've
seen was triggered by timeouts, and adjusting the timeouts
fixed it: c3b084c24c8a (net: fec: Adjust ENET MDIO timeouts)

> Thanks.
> 
>  drivers/net/phy/phy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 49300fb59757..06fbca959383 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -663,7 +663,7 @@ void phy_stop_machine(struct phy_device *phydev)
>   */
>  static void phy_error(struct phy_device *phydev)
>  {
> -	WARN_ON(1);
> +	phydev_err(phydev, "Error detected, halting PHY\n");
>  
>  	mutex_lock(&phydev->lock);
>  	phydev->state = PHY_HALTED;
>
Russell King (Oracle) Dec. 17, 2019, 11:34 p.m. UTC | #2
On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
> On 17.12.2019 13:53, Russell King wrote:
> > phy_error() is called from phy_interrupt() or phy_state_machine(), and
> > uses WARN_ON() to print a backtrace. The backtrace is not useful when
> > reporting a PHY error.
> > 
> > However, a system may contain multiple ethernet PHYs, and phy_error()
> > gives no clue which one caused the problem.
> > 
> > Replace WARN_ON() with a call to phydev_err() so that we can see which
> > PHY had an error, and also inform the user that we are halting the PHY.
> > 
> > Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > ---
> > There is another related problem in this area. If an error is detected
> > while the PHY is running, phy_error() moves to PHY_HALTED state. If we
> > try to take the network device down, then:
> > 
> > void phy_stop(struct phy_device *phydev)
> > {
> >         if (!phy_is_started(phydev)) {
> >                 WARN(1, "called from state %s\n",
> >                      phy_state_to_str(phydev->state));
> >                 return;
> >         }
> > 
> > triggers, and we never do any of the phy_stop() cleanup. I'm not sure
> > what the best way to solve this is - introducing a PHY_ERROR state may
> > be a solution, but I think we want some phy_is_started() sites to
> > return true for it and others to return false.
> > 
> > Heiner - you introduced the above warning, could you look at improving
> > this case so we don't print a warning and taint the kernel when taking
> > a network device down after phy_error() please?
> > 
> I think we need both types of information:
> - the affected PHY device
> - the stack trace to see where the issue was triggered

Can you please explain why the stack trace is useful.  For the paths
that are reachable, all it tells you is whether it was reached via
the interrupt or the workqueue.

If it's via the interrupt, the rest of the backtrace beyond that is
irrelevant.  If it's the workqueue, the backtrace doesn't go back
very far, and doesn't tell you what operation triggered it.

If it's important to see where or why phy_error() was called, there
are much better ways of doing that, notably passing a string into
phy_error() to describe the actual error itself.  That would convey
way more useful information than the backtrace does.

I have been faced with these backtraces, and they have not been at
all useful for diagnosing the problem.
Heiner Kallweit Dec. 18, 2019, 8:54 p.m. UTC | #3
On 18.12.2019 00:34, Russell King - ARM Linux admin wrote:
> On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
>> On 17.12.2019 13:53, Russell King wrote:
>>> phy_error() is called from phy_interrupt() or phy_state_machine(), and
>>> uses WARN_ON() to print a backtrace. The backtrace is not useful when
>>> reporting a PHY error.
>>>
>>> However, a system may contain multiple ethernet PHYs, and phy_error()
>>> gives no clue which one caused the problem.
>>>
>>> Replace WARN_ON() with a call to phydev_err() so that we can see which
>>> PHY had an error, and also inform the user that we are halting the PHY.
>>>
>>> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
>>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
>>> ---
>>> There is another related problem in this area. If an error is detected
>>> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
>>> try to take the network device down, then:
>>>
>>> void phy_stop(struct phy_device *phydev)
>>> {
>>>         if (!phy_is_started(phydev)) {
>>>                 WARN(1, "called from state %s\n",
>>>                      phy_state_to_str(phydev->state));
>>>                 return;
>>>         }
>>>
>>> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
>>> what the best way to solve this is - introducing a PHY_ERROR state may
>>> be a solution, but I think we want some phy_is_started() sites to
>>> return true for it and others to return false.
>>>
>>> Heiner - you introduced the above warning, could you look at improving
>>> this case so we don't print a warning and taint the kernel when taking
>>> a network device down after phy_error() please?
>>>
>> I think we need both types of information:
>> - the affected PHY device
>> - the stack trace to see where the issue was triggered
> 
> Can you please explain why the stack trace is useful.  For the paths
> that are reachable, all it tells you is whether it was reached via
> the interrupt or the workqueue.
> 
> If it's via the interrupt, the rest of the backtrace beyond that is
> irrelevant.  If it's the workqueue, the backtrace doesn't go back
> very far, and doesn't tell you what operation triggered it.
> 
> If it's important to see where or why phy_error() was called, there
> are much better ways of doing that, notably passing a string into
> phy_error() to describe the actual error itself.  That would convey
> way more useful information than the backtrace does.
> 
> I have been faced with these backtraces, and they have not been at
> all useful for diagnosing the problem.
> 
"The problem" comes in two flavors:
1. The problem that caused the PHY error
2. The problem caused by the PHY error (if we decide to not
   always switch to HALTED state)

We can't do much for case 1, maybe we could add an errno argument
to phy_error(). To facilitate analyzing case 2 we'd need to change
code pieces like the following.

case a:
err = f1();
case b:
err = f2();

if (err)
	phy_error()

For my understanding: What caused the PHY error in your case(s)?
Which info would have been useful for analyzing the error?
Russell King (Oracle) Dec. 18, 2019, 10:09 p.m. UTC | #4
On Wed, Dec 18, 2019 at 09:54:32PM +0100, Heiner Kallweit wrote:
> On 18.12.2019 00:34, Russell King - ARM Linux admin wrote:
> > On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
> >> On 17.12.2019 13:53, Russell King wrote:
> >>> phy_error() is called from phy_interrupt() or phy_state_machine(), and
> >>> uses WARN_ON() to print a backtrace. The backtrace is not useful when
> >>> reporting a PHY error.
> >>>
> >>> However, a system may contain multiple ethernet PHYs, and phy_error()
> >>> gives no clue which one caused the problem.
> >>>
> >>> Replace WARN_ON() with a call to phydev_err() so that we can see which
> >>> PHY had an error, and also inform the user that we are halting the PHY.
> >>>
> >>> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> >>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> >>> ---
> >>> There is another related problem in this area. If an error is detected
> >>> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
> >>> try to take the network device down, then:
> >>>
> >>> void phy_stop(struct phy_device *phydev)
> >>> {
> >>>         if (!phy_is_started(phydev)) {
> >>>                 WARN(1, "called from state %s\n",
> >>>                      phy_state_to_str(phydev->state));
> >>>                 return;
> >>>         }
> >>>
> >>> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
> >>> what the best way to solve this is - introducing a PHY_ERROR state may
> >>> be a solution, but I think we want some phy_is_started() sites to
> >>> return true for it and others to return false.
> >>>
> >>> Heiner - you introduced the above warning, could you look at improving
> >>> this case so we don't print a warning and taint the kernel when taking
> >>> a network device down after phy_error() please?
> >>>
> >> I think we need both types of information:
> >> - the affected PHY device
> >> - the stack trace to see where the issue was triggered
> > 
> > Can you please explain why the stack trace is useful.  For the paths
> > that are reachable, all it tells you is whether it was reached via
> > the interrupt or the workqueue.
> > 
> > If it's via the interrupt, the rest of the backtrace beyond that is
> > irrelevant.  If it's the workqueue, the backtrace doesn't go back
> > very far, and doesn't tell you what operation triggered it.
> > 
> > If it's important to see where or why phy_error() was called, there
> > are much better ways of doing that, notably passing a string into
> > phy_error() to describe the actual error itself.  That would convey
> > way more useful information than the backtrace does.
> > 
> > I have been faced with these backtraces, and they have not been at
> > all useful for diagnosing the problem.
> > 
> "The problem" comes in two flavors:
> 1. The problem that caused the PHY error
> 2. The problem caused by the PHY error (if we decide to not
>    always switch to HALTED state)
> 
> We can't do much for case 1, maybe we could add an errno argument
> to phy_error(). To facilitate analyzing case 2 we'd need to change
> code pieces like the following.
> 
> case a:
> err = f1();
> case b:
> err = f2();
> 
> if (err)
> 	phy_error()
> 
> For my understanding: What caused the PHY error in your case(s)?
> Which info would have been useful for analyzing the error?

Errors reading/writing from the PHY.

The problem with a backtrace from phy_error() is it doesn't tell you
where the error actually occurred, it only tells you where the error
is reported - which is one of two different paths at the moment.
That can be achieved with much more elegance and simplicity by
passing a string into phy_error() to describe the call site if that's
even relevant.

I would say, however, that knowing where the error occurred would be
far better information.
Heiner Kallweit Dec. 19, 2019, 7:10 a.m. UTC | #5
On 18.12.2019 23:09, Russell King - ARM Linux admin wrote:
> On Wed, Dec 18, 2019 at 09:54:32PM +0100, Heiner Kallweit wrote:
>> On 18.12.2019 00:34, Russell King - ARM Linux admin wrote:
>>> On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
>>>> On 17.12.2019 13:53, Russell King wrote:
>>>>> phy_error() is called from phy_interrupt() or phy_state_machine(), and
>>>>> uses WARN_ON() to print a backtrace. The backtrace is not useful when
>>>>> reporting a PHY error.
>>>>>
>>>>> However, a system may contain multiple ethernet PHYs, and phy_error()
>>>>> gives no clue which one caused the problem.
>>>>>
>>>>> Replace WARN_ON() with a call to phydev_err() so that we can see which
>>>>> PHY had an error, and also inform the user that we are halting the PHY.
>>>>>
>>>>> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
>>>>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
>>>>> ---
>>>>> There is another related problem in this area. If an error is detected
>>>>> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
>>>>> try to take the network device down, then:
>>>>>
>>>>> void phy_stop(struct phy_device *phydev)
>>>>> {
>>>>>         if (!phy_is_started(phydev)) {
>>>>>                 WARN(1, "called from state %s\n",
>>>>>                      phy_state_to_str(phydev->state));
>>>>>                 return;
>>>>>         }
>>>>>
>>>>> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
>>>>> what the best way to solve this is - introducing a PHY_ERROR state may
>>>>> be a solution, but I think we want some phy_is_started() sites to
>>>>> return true for it and others to return false.
>>>>>
>>>>> Heiner - you introduced the above warning, could you look at improving
>>>>> this case so we don't print a warning and taint the kernel when taking
>>>>> a network device down after phy_error() please?
>>>>>
>>>> I think we need both types of information:
>>>> - the affected PHY device
>>>> - the stack trace to see where the issue was triggered
>>>
>>> Can you please explain why the stack trace is useful.  For the paths
>>> that are reachable, all it tells you is whether it was reached via
>>> the interrupt or the workqueue.
>>>
>>> If it's via the interrupt, the rest of the backtrace beyond that is
>>> irrelevant.  If it's the workqueue, the backtrace doesn't go back
>>> very far, and doesn't tell you what operation triggered it.
>>>
>>> If it's important to see where or why phy_error() was called, there
>>> are much better ways of doing that, notably passing a string into
>>> phy_error() to describe the actual error itself.  That would convey
>>> way more useful information than the backtrace does.
>>>
>>> I have been faced with these backtraces, and they have not been at
>>> all useful for diagnosing the problem.
>>>
>> "The problem" comes in two flavors:
>> 1. The problem that caused the PHY error
>> 2. The problem caused by the PHY error (if we decide to not
>>    always switch to HALTED state)
>>
>> We can't do much for case 1, maybe we could add an errno argument
>> to phy_error(). To facilitate analyzing case 2 we'd need to change
>> code pieces like the following.
>>
>> case a:
>> err = f1();
>> case b:
>> err = f2();
>>
>> if (err)
>> 	phy_error()
>>
>> For my understanding: What caused the PHY error in your case(s)?
>> Which info would have been useful for analyzing the error?
> 
> Errors reading/writing from the PHY.
> 
> The problem with a backtrace from phy_error() is it doesn't tell you
> where the error actually occurred, it only tells you where the error
> is reported - which is one of two different paths at the moment.
> That can be achieved with much more elegance and simplicity by
> passing a string into phy_error() to describe the call site if that's
> even relevant.
> 
> I would say, however, that knowing where the error occurred would be
> far better information.
> 
AFAICS PHY errors are typically forwarded MDIO access errors.
PHY driver callback implementations could add own error sources,
but from what I've seen they don't. Instead of the backtrace in
phy_error() we could add a WARN_ONCE() to __mdiobus_read/write.
Then the printed call chain should be more useful.
If somebody wants to analyze in more detail, he can switch on
MDIO access tracing.
Russell King (Oracle) Dec. 19, 2019, 5:06 p.m. UTC | #6
On Thu, Dec 19, 2019 at 08:10:21AM +0100, Heiner Kallweit wrote:
> On 18.12.2019 23:09, Russell King - ARM Linux admin wrote:
> > On Wed, Dec 18, 2019 at 09:54:32PM +0100, Heiner Kallweit wrote:
> >> On 18.12.2019 00:34, Russell King - ARM Linux admin wrote:
> >>> On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
> >>>> On 17.12.2019 13:53, Russell King wrote:
> >>>>> phy_error() is called from phy_interrupt() or phy_state_machine(), and
> >>>>> uses WARN_ON() to print a backtrace. The backtrace is not useful when
> >>>>> reporting a PHY error.
> >>>>>
> >>>>> However, a system may contain multiple ethernet PHYs, and phy_error()
> >>>>> gives no clue which one caused the problem.
> >>>>>
> >>>>> Replace WARN_ON() with a call to phydev_err() so that we can see which
> >>>>> PHY had an error, and also inform the user that we are halting the PHY.
> >>>>>
> >>>>> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> >>>>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> >>>>> ---
> >>>>> There is another related problem in this area. If an error is detected
> >>>>> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
> >>>>> try to take the network device down, then:
> >>>>>
> >>>>> void phy_stop(struct phy_device *phydev)
> >>>>> {
> >>>>>         if (!phy_is_started(phydev)) {
> >>>>>                 WARN(1, "called from state %s\n",
> >>>>>                      phy_state_to_str(phydev->state));
> >>>>>                 return;
> >>>>>         }
> >>>>>
> >>>>> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
> >>>>> what the best way to solve this is - introducing a PHY_ERROR state may
> >>>>> be a solution, but I think we want some phy_is_started() sites to
> >>>>> return true for it and others to return false.
> >>>>>
> >>>>> Heiner - you introduced the above warning, could you look at improving
> >>>>> this case so we don't print a warning and taint the kernel when taking
> >>>>> a network device down after phy_error() please?
> >>>>>
> >>>> I think we need both types of information:
> >>>> - the affected PHY device
> >>>> - the stack trace to see where the issue was triggered
> >>>
> >>> Can you please explain why the stack trace is useful.  For the paths
> >>> that are reachable, all it tells you is whether it was reached via
> >>> the interrupt or the workqueue.
> >>>
> >>> If it's via the interrupt, the rest of the backtrace beyond that is
> >>> irrelevant.  If it's the workqueue, the backtrace doesn't go back
> >>> very far, and doesn't tell you what operation triggered it.
> >>>
> >>> If it's important to see where or why phy_error() was called, there
> >>> are much better ways of doing that, notably passing a string into
> >>> phy_error() to describe the actual error itself.  That would convey
> >>> way more useful information than the backtrace does.
> >>>
> >>> I have been faced with these backtraces, and they have not been at
> >>> all useful for diagnosing the problem.
> >>>
> >> "The problem" comes in two flavors:
> >> 1. The problem that caused the PHY error
> >> 2. The problem caused by the PHY error (if we decide to not
> >>    always switch to HALTED state)
> >>
> >> We can't do much for case 1, maybe we could add an errno argument
> >> to phy_error(). To facilitate analyzing case 2 we'd need to change
> >> code pieces like the following.
> >>
> >> case a:
> >> err = f1();
> >> case b:
> >> err = f2();
> >>
> >> if (err)
> >> 	phy_error()
> >>
> >> For my understanding: What caused the PHY error in your case(s)?
> >> Which info would have been useful for analyzing the error?
> > 
> > Errors reading/writing from the PHY.
> > 
> > The problem with a backtrace from phy_error() is it doesn't tell you
> > where the error actually occurred, it only tells you where the error
> > is reported - which is one of two different paths at the moment.
> > That can be achieved with much more elegance and simplicity by
> > passing a string into phy_error() to describe the call site if that's
> > even relevant.
> > 
> > I would say, however, that knowing where the error occurred would be
> > far better information.
> > 
> AFAICS PHY errors are typically forwarded MDIO access errors.
> PHY driver callback implementations could add own error sources,
> but from what I've seen they don't. Instead of the backtrace in
> phy_error() we could add a WARN_ONCE() to __mdiobus_read/write.
> Then the printed call chain should be more useful.
> If somebody wants to analyze in more detail, he can switch on
> MDIO access tracing.

I'm still not clear why you're so keen to trigger a kernel warning
on one of these events.

Errors may _legitimately_ occur when trying to read/write a PHY. For
example, it would be completely mad for the kernel to WARN and taint
itself just because you've unplugged a SFP module just at the time
that phylib is trying to poll the PHY on-board, and that caused an
failure to read/write the PHY. You just need the right timing to
trigger this.

When a SFP module is unplugged the three contacts that comprise the
I2C bus (used for communicating with a PHY that may be there) and
the pin that identifies that the module is present all break at about
the same point in time (give or take some minor tolerances) so there
is no way to definitively say "yes, the PHY is still present, we can
talk to it" by testing something.
David Miller Dec. 19, 2019, 8:50 p.m. UTC | #7
From: Russell King <rmk+kernel@armlinux.org.uk>
Date: Tue, 17 Dec 2019 12:53:05 +0000

> phy_error() is called from phy_interrupt() or phy_state_machine(), and
> uses WARN_ON() to print a backtrace. The backtrace is not useful when
> reporting a PHY error.
> 
> However, a system may contain multiple ethernet PHYs, and phy_error()
> gives no clue which one caused the problem.
> 
> Replace WARN_ON() with a call to phydev_err() so that we can see which
> PHY had an error, and also inform the user that we are halting the PHY.
> 
> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

I think I agree with Heiner that it is valuable to know whether the
error occurred from the interrupt handler or the state machine (and
if the state machine, where that got called from).

So I totally disagree with removing the backtrace, sorry.
Andrew Lunn Dec. 19, 2019, 9:05 p.m. UTC | #8
> I think I agree with Heiner that it is valuable to know whether the
> error occurred from the interrupt handler or the state machine (and
> if the state machine, where that got called from).
> 
> So I totally disagree with removing the backtrace, sorry.

Russell does have a point about the backtrace not giving an indication
of which phy experienced the error. So adding the phydev_err() call,
which will prefix the print with an identifier for the PHY, is a good
idea. So we should add that, and keep the WARN().

      Andrew
Russell King (Oracle) Dec. 19, 2019, 9:45 p.m. UTC | #9
On Thu, Dec 19, 2019 at 12:50:10PM -0800, David Miller wrote:
> From: Russell King <rmk+kernel@armlinux.org.uk>
> Date: Tue, 17 Dec 2019 12:53:05 +0000
> 
> > phy_error() is called from phy_interrupt() or phy_state_machine(), and
> > uses WARN_ON() to print a backtrace. The backtrace is not useful when
> > reporting a PHY error.
> > 
> > However, a system may contain multiple ethernet PHYs, and phy_error()
> > gives no clue which one caused the problem.
> > 
> > Replace WARN_ON() with a call to phydev_err() so that we can see which
> > PHY had an error, and also inform the user that we are halting the PHY.
> > 
> > Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> 
> I think I agree with Heiner that it is valuable to know whether the
> error occurred from the interrupt handler or the state machine (and
> if the state machine, where that got called from).

Would you accept, then, passing a string to indicate where phy_error()
was called from, which would do the same job without tainting the
kernel for something that becomes a _normal_ event when removing a
SFP?
David Miller Dec. 19, 2019, 10:14 p.m. UTC | #10
From: Andrew Lunn <andrew@lunn.ch>
Date: Thu, 19 Dec 2019 22:05:03 +0100

>> I think I agree with Heiner that it is valuable to know whether the
>> error occurred from the interrupt handler or the state machine (and
>> if the state machine, where that got called from).
>> 
>> So I totally disagree with removing the backtrace, sorry.
> 
> Russell does have a point about the backtrace not giving an indication
> of which phy experienced the error. So adding the phydev_err() call,
> which will prefix the print with an identifier for the PHY, is a good
> idea. So we should add that, and keep the WARN().

Agreed.
Andrew Lunn Dec. 20, 2019, 9:18 a.m. UTC | #11
On Thu, Dec 19, 2019 at 09:45:37PM +0000, Russell King - ARM Linux admin wrote:
> On Thu, Dec 19, 2019 at 12:50:10PM -0800, David Miller wrote:
> > From: Russell King <rmk+kernel@armlinux.org.uk>
> > Date: Tue, 17 Dec 2019 12:53:05 +0000
> > 
> > > phy_error() is called from phy_interrupt() or phy_state_machine(), and
> > > uses WARN_ON() to print a backtrace. The backtrace is not useful when
> > > reporting a PHY error.
> > > 
> > > However, a system may contain multiple ethernet PHYs, and phy_error()
> > > gives no clue which one caused the problem.
> > > 
> > > Replace WARN_ON() with a call to phydev_err() so that we can see which
> > > PHY had an error, and also inform the user that we are halting the PHY.
> > > 
> > > Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > 
> > I think I agree with Heiner that it is valuable to know whether the
> > error occurred from the interrupt handler or the state machine (and
> > if the state machine, where that got called from).
> 
> Would you accept, then, passing a string to indicate where phy_error()
> was called from, which would do the same job without tainting the
> kernel for something that becomes a _normal_ event when removing a
> SFP?

I'm actually wondering what purpose phy_error has, and if the SFP is
using it correctly.

It is currently used when the phy driver interrupts handling returns
an error, or phy_start_aneg returns an error.

In general, with a traditional MDIO bus, you cannot tell when the
device has disappeared. Writes just disappear into the bit bucket, and
reads return 0xffff. If an error is returned by the MDIO bus driver,
it is because the mechanisms surrounding the bus have returned an
error. A needed clock has been turned off, the system is too busy to
service the interrupt in a timely manor etc. An error indicates a bug
of some sort, bad power management, or timers too short.

SFPs, with their MDIO bus tunnelled over i2c are however a little
different. i2c does allow you to know the device has gone, and return
-EIO.

We might want to consider the question, should the MDIO over i2c code
actually be tossing this EIO error, and returning 0xffff on read, no
error on write? It makes the emulation more true? Only return an error
for power management or timeouts, similar to traditional MDIO? Maybe
print a rate limited warning when dropping an EIO error?

Phylink is going to notice the PHY has disappeared soon anyway, so is
there any value in returning an error when the module has been
unplugged?

      Andrew
Florian Fainelli Dec. 20, 2019, 6:46 p.m. UTC | #12
On 12/19/19 9:06 AM, Russell King - ARM Linux admin wrote:
> On Thu, Dec 19, 2019 at 08:10:21AM +0100, Heiner Kallweit wrote:
>> On 18.12.2019 23:09, Russell King - ARM Linux admin wrote:
>>> On Wed, Dec 18, 2019 at 09:54:32PM +0100, Heiner Kallweit wrote:
>>>> On 18.12.2019 00:34, Russell King - ARM Linux admin wrote:
>>>>> On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
>>>>>> On 17.12.2019 13:53, Russell King wrote:
>>>>>>> phy_error() is called from phy_interrupt() or phy_state_machine(), and
>>>>>>> uses WARN_ON() to print a backtrace. The backtrace is not useful when
>>>>>>> reporting a PHY error.
>>>>>>>
>>>>>>> However, a system may contain multiple ethernet PHYs, and phy_error()
>>>>>>> gives no clue which one caused the problem.
>>>>>>>
>>>>>>> Replace WARN_ON() with a call to phydev_err() so that we can see which
>>>>>>> PHY had an error, and also inform the user that we are halting the PHY.
>>>>>>>
>>>>>>> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
>>>>>>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
>>>>>>> ---
>>>>>>> There is another related problem in this area. If an error is detected
>>>>>>> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
>>>>>>> try to take the network device down, then:
>>>>>>>
>>>>>>> void phy_stop(struct phy_device *phydev)
>>>>>>> {
>>>>>>>         if (!phy_is_started(phydev)) {
>>>>>>>                 WARN(1, "called from state %s\n",
>>>>>>>                      phy_state_to_str(phydev->state));
>>>>>>>                 return;
>>>>>>>         }
>>>>>>>
>>>>>>> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
>>>>>>> what the best way to solve this is - introducing a PHY_ERROR state may
>>>>>>> be a solution, but I think we want some phy_is_started() sites to
>>>>>>> return true for it and others to return false.
>>>>>>>
>>>>>>> Heiner - you introduced the above warning, could you look at improving
>>>>>>> this case so we don't print a warning and taint the kernel when taking
>>>>>>> a network device down after phy_error() please?
>>>>>>>
>>>>>> I think we need both types of information:
>>>>>> - the affected PHY device
>>>>>> - the stack trace to see where the issue was triggered
>>>>>
>>>>> Can you please explain why the stack trace is useful.  For the paths
>>>>> that are reachable, all it tells you is whether it was reached via
>>>>> the interrupt or the workqueue.
>>>>>
>>>>> If it's via the interrupt, the rest of the backtrace beyond that is
>>>>> irrelevant.  If it's the workqueue, the backtrace doesn't go back
>>>>> very far, and doesn't tell you what operation triggered it.
>>>>>
>>>>> If it's important to see where or why phy_error() was called, there
>>>>> are much better ways of doing that, notably passing a string into
>>>>> phy_error() to describe the actual error itself.  That would convey
>>>>> way more useful information than the backtrace does.
>>>>>
>>>>> I have been faced with these backtraces, and they have not been at
>>>>> all useful for diagnosing the problem.
>>>>>
>>>> "The problem" comes in two flavors:
>>>> 1. The problem that caused the PHY error
>>>> 2. The problem caused by the PHY error (if we decide to not
>>>>    always switch to HALTED state)
>>>>
>>>> We can't do much for case 1, maybe we could add an errno argument
>>>> to phy_error(). To facilitate analyzing case 2 we'd need to change
>>>> code pieces like the following.
>>>>
>>>> case a:
>>>> err = f1();
>>>> case b:
>>>> err = f2();
>>>>
>>>> if (err)
>>>> 	phy_error()
>>>>
>>>> For my understanding: What caused the PHY error in your case(s)?
>>>> Which info would have been useful for analyzing the error?
>>>
>>> Errors reading/writing from the PHY.
>>>
>>> The problem with a backtrace from phy_error() is it doesn't tell you
>>> where the error actually occurred, it only tells you where the error
>>> is reported - which is one of two different paths at the moment.
>>> That can be achieved with much more elegance and simplicity by
>>> passing a string into phy_error() to describe the call site if that's
>>> even relevant.
>>>
>>> I would say, however, that knowing where the error occurred would be
>>> far better information.
>>>
>> AFAICS PHY errors are typically forwarded MDIO access errors.
>> PHY driver callback implementations could add own error sources,
>> but from what I've seen they don't. Instead of the backtrace in
>> phy_error() we could add a WARN_ONCE() to __mdiobus_read/write.
>> Then the printed call chain should be more useful.
>> If somebody wants to analyze in more detail, he can switch on
>> MDIO access tracing.
> 
> I'm still not clear why you're so keen to trigger a kernel warning
> on one of these events.
> 
> Errors may _legitimately_ occur when trying to read/write a PHY. For
> example, it would be completely mad for the kernel to WARN and taint
> itself just because you've unplugged a SFP module just at the time
> that phylib is trying to poll the PHY on-board, and that caused an
> failure to read/write the PHY. You just need the right timing to
> trigger this.
> 
> When a SFP module is unplugged the three contacts that comprise the
> I2C bus (used for communicating with a PHY that may be there) and
> the pin that identifies that the module is present all break at about
> the same point in time (give or take some minor tolerances) so there
> is no way to definitively say "yes, the PHY is still present, we can
> talk to it" by testing something.
> 

For instance, here is a resume failure because of incorrect pinmuxing,
you can see that the piece of useful information is not from the stack
trace, but right under:

[   39.637976] ------------[ cut here ]------------
[   39.637995] WARNING: CPU: 0 PID: 29 at drivers/net/phy/phy.c:657
phy_error+0x34/0x6c
[   39.637998] Modules linked in:
[   39.638006] CPU: 0 PID: 29 Comm: kworker/0:1 Not tainted 5.5.0-rc2 #23
[   39.638007] Hardware name: Broadcom STB (Flattened Device Tree)
[   39.638013] Workqueue: events_power_efficient phy_state_machine
[   39.638015] Backtrace:
[   39.638021] [<4020df50>] (dump_backtrace) from [<4020e254>]
(show_stack+0x20/0x24)
[   39.638023]  r7:41ca90c8 r6:00000000 r5:60000153 r4:41ca90c8
[   39.638030] [<4020e234>] (show_stack) from [<40b98404>]
(dump_stack+0xb8/0xe4)
[   39.638035] [<40b9834c>] (dump_stack) from [<40226220>]
(__warn+0xec/0x104)
[   39.638038]  r10:ed7ab605 r9:4080ba38 r8:00000291 r7:00000009
r6:40da88ec r5:00000000
[   39.638039]  r4:00000000 r3:189dae12
[   39.638043] [<40226134>] (__warn) from [<402262f4>]
(warn_slowpath_fmt+0xbc/0xc4)
[   39.638045]  r9:00000009 r8:4080ba38 r7:00000291 r6:40da88ec
r5:00000000 r4:e9232000
[   39.638049] [<4022623c>] (warn_slowpath_fmt) from [<4080ba38>]
(phy_error+0x34/0x6c)
[   39.638051]  r9:41cb14b0 r8:fffffffb r7:e80d3400 r6:00000003
r5:e80d36dc r4:e80d3400
[   39.638055] [<4080ba04>] (phy_error) from [<4080ccc0>]
(phy_state_machine+0x114/0x1c0)
[   39.638056]  r5:e80d36dc r4:e80d36b0
[   39.638060] [<4080cbac>] (phy_state_machine) from [<40243ab4>]
(process_one_work+0x240/0x57c)
[   39.638063]  r8:00000000 r7:ed7ab600 r6:ed7a8040 r5:e90ff880 r4:e80d36b0
[   39.638065] [<40243874>] (process_one_work) from [<402442b0>]
(worker_thread+0x58/0x5f4)
[   39.638068]  r10:e9232000 r9:ed7a8058 r8:41c03d00 r7:00000008
r6:e90ff894 r5:ed7a8040
[   39.638069]  r4:e90ff880
[   39.638073] [<40244258>] (worker_thread) from [<4024b050>]
(kthread+0x148/0x174)
[   39.638076]  r10:e914be74 r9:40244258 r8:e90ff880 r7:e9232000
r6:00000000 r5:e9204e40
[   39.638077]  r4:e91940c0
[   39.638081] [<4024af08>] (kthread) from [<402010ac>]
(ret_from_fork+0x14/0x28)
[   39.638083] Exception stack(0xe9233fb0 to 0xe9233ff8)
[   39.638085] 3fa0:                                     00000000
00000000 00000000 00000000
[   39.638087] 3fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[   39.638089] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   39.638092]  r10:00000000 r9:00000000 r8:00000000 r7:00000000
r6:00000000 r5:4024af08
[   39.638093]  r4:e9204e40
[   39.638095] ---[ end trace 10eeee4f71649fc9 ]---
[   39.639134] PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0x64
returns -5
[   39.639139] PM: Device unimac-mdio-0:01 failed to resume: error -5
Heiner Kallweit Dec. 20, 2019, 10:28 p.m. UTC | #13
On 20.12.2019 19:46, Florian Fainelli wrote:
> On 12/19/19 9:06 AM, Russell King - ARM Linux admin wrote:
>> On Thu, Dec 19, 2019 at 08:10:21AM +0100, Heiner Kallweit wrote:
>>> On 18.12.2019 23:09, Russell King - ARM Linux admin wrote:
>>>> On Wed, Dec 18, 2019 at 09:54:32PM +0100, Heiner Kallweit wrote:
>>>>> On 18.12.2019 00:34, Russell King - ARM Linux admin wrote:
>>>>>> On Tue, Dec 17, 2019 at 10:41:34PM +0100, Heiner Kallweit wrote:
>>>>>>> On 17.12.2019 13:53, Russell King wrote:
>>>>>>>> phy_error() is called from phy_interrupt() or phy_state_machine(), and
>>>>>>>> uses WARN_ON() to print a backtrace. The backtrace is not useful when
>>>>>>>> reporting a PHY error.
>>>>>>>>
>>>>>>>> However, a system may contain multiple ethernet PHYs, and phy_error()
>>>>>>>> gives no clue which one caused the problem.
>>>>>>>>
>>>>>>>> Replace WARN_ON() with a call to phydev_err() so that we can see which
>>>>>>>> PHY had an error, and also inform the user that we are halting the PHY.
>>>>>>>>
>>>>>>>> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
>>>>>>>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
>>>>>>>> ---
>>>>>>>> There is another related problem in this area. If an error is detected
>>>>>>>> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
>>>>>>>> try to take the network device down, then:
>>>>>>>>
>>>>>>>> void phy_stop(struct phy_device *phydev)
>>>>>>>> {
>>>>>>>>         if (!phy_is_started(phydev)) {
>>>>>>>>                 WARN(1, "called from state %s\n",
>>>>>>>>                      phy_state_to_str(phydev->state));
>>>>>>>>                 return;
>>>>>>>>         }
>>>>>>>>
>>>>>>>> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
>>>>>>>> what the best way to solve this is - introducing a PHY_ERROR state may
>>>>>>>> be a solution, but I think we want some phy_is_started() sites to
>>>>>>>> return true for it and others to return false.
>>>>>>>>
>>>>>>>> Heiner - you introduced the above warning, could you look at improving
>>>>>>>> this case so we don't print a warning and taint the kernel when taking
>>>>>>>> a network device down after phy_error() please?
>>>>>>>>
>>>>>>> I think we need both types of information:
>>>>>>> - the affected PHY device
>>>>>>> - the stack trace to see where the issue was triggered
>>>>>>
>>>>>> Can you please explain why the stack trace is useful.  For the paths
>>>>>> that are reachable, all it tells you is whether it was reached via
>>>>>> the interrupt or the workqueue.
>>>>>>
>>>>>> If it's via the interrupt, the rest of the backtrace beyond that is
>>>>>> irrelevant.  If it's the workqueue, the backtrace doesn't go back
>>>>>> very far, and doesn't tell you what operation triggered it.
>>>>>>
>>>>>> If it's important to see where or why phy_error() was called, there
>>>>>> are much better ways of doing that, notably passing a string into
>>>>>> phy_error() to describe the actual error itself.  That would convey
>>>>>> way more useful information than the backtrace does.
>>>>>>
>>>>>> I have been faced with these backtraces, and they have not been at
>>>>>> all useful for diagnosing the problem.
>>>>>>
>>>>> "The problem" comes in two flavors:
>>>>> 1. The problem that caused the PHY error
>>>>> 2. The problem caused by the PHY error (if we decide to not
>>>>>    always switch to HALTED state)
>>>>>
>>>>> We can't do much for case 1, maybe we could add an errno argument
>>>>> to phy_error(). To facilitate analyzing case 2 we'd need to change
>>>>> code pieces like the following.
>>>>>
>>>>> case a:
>>>>> err = f1();
>>>>> case b:
>>>>> err = f2();
>>>>>
>>>>> if (err)
>>>>> 	phy_error()
>>>>>
>>>>> For my understanding: What caused the PHY error in your case(s)?
>>>>> Which info would have been useful for analyzing the error?
>>>>
>>>> Errors reading/writing from the PHY.
>>>>
>>>> The problem with a backtrace from phy_error() is it doesn't tell you
>>>> where the error actually occurred, it only tells you where the error
>>>> is reported - which is one of two different paths at the moment.
>>>> That can be achieved with much more elegance and simplicity by
>>>> passing a string into phy_error() to describe the call site if that's
>>>> even relevant.
>>>>
>>>> I would say, however, that knowing where the error occurred would be
>>>> far better information.
>>>>
>>> AFAICS PHY errors are typically forwarded MDIO access errors.
>>> PHY driver callback implementations could add own error sources,
>>> but from what I've seen they don't. Instead of the backtrace in
>>> phy_error() we could add a WARN_ONCE() to __mdiobus_read/write.
>>> Then the printed call chain should be more useful.
>>> If somebody wants to analyze in more detail, he can switch on
>>> MDIO access tracing.
>>
>> I'm still not clear why you're so keen to trigger a kernel warning
>> on one of these events.
>>
>> Errors may _legitimately_ occur when trying to read/write a PHY. For
>> example, it would be completely mad for the kernel to WARN and taint
>> itself just because you've unplugged a SFP module just at the time
>> that phylib is trying to poll the PHY on-board, and that caused an
>> failure to read/write the PHY. You just need the right timing to
>> trigger this.
>>

When the trace was added we didn't consider the fact that there may
me legitimate errors. And sure, WARNing just because a SFP has been
removed is inappropriate. Having said that I'm fine with removing the
trace. Then phy_error() should just be extended to also print the
errno of the failed operation and give a hint which operation failed.

>> When a SFP module is unplugged the three contacts that comprise the
>> I2C bus (used for communicating with a PHY that may be there) and
>> the pin that identifies that the module is present all break at about
>> the same point in time (give or take some minor tolerances) so there
>> is no way to definitively say "yes, the PHY is still present, we can
>> talk to it" by testing something.
>>
> 
> For instance, here is a resume failure because of incorrect pinmuxing,
> you can see that the piece of useful information is not from the stack
> trace, but right under:
> 
> [   39.637976] ------------[ cut here ]------------
> [   39.637995] WARNING: CPU: 0 PID: 29 at drivers/net/phy/phy.c:657
> phy_error+0x34/0x6c
> [   39.637998] Modules linked in:
> [   39.638006] CPU: 0 PID: 29 Comm: kworker/0:1 Not tainted 5.5.0-rc2 #23
> [   39.638007] Hardware name: Broadcom STB (Flattened Device Tree)
> [   39.638013] Workqueue: events_power_efficient phy_state_machine
> [   39.638015] Backtrace:
> [   39.638021] [<4020df50>] (dump_backtrace) from [<4020e254>]
> (show_stack+0x20/0x24)
> [   39.638023]  r7:41ca90c8 r6:00000000 r5:60000153 r4:41ca90c8
> [   39.638030] [<4020e234>] (show_stack) from [<40b98404>]
> (dump_stack+0xb8/0xe4)
> [   39.638035] [<40b9834c>] (dump_stack) from [<40226220>]
> (__warn+0xec/0x104)
> [   39.638038]  r10:ed7ab605 r9:4080ba38 r8:00000291 r7:00000009
> r6:40da88ec r5:00000000
> [   39.638039]  r4:00000000 r3:189dae12
> [   39.638043] [<40226134>] (__warn) from [<402262f4>]
> (warn_slowpath_fmt+0xbc/0xc4)
> [   39.638045]  r9:00000009 r8:4080ba38 r7:00000291 r6:40da88ec
> r5:00000000 r4:e9232000
> [   39.638049] [<4022623c>] (warn_slowpath_fmt) from [<4080ba38>]
> (phy_error+0x34/0x6c)
> [   39.638051]  r9:41cb14b0 r8:fffffffb r7:e80d3400 r6:00000003
> r5:e80d36dc r4:e80d3400
> [   39.638055] [<4080ba04>] (phy_error) from [<4080ccc0>]
> (phy_state_machine+0x114/0x1c0)
> [   39.638056]  r5:e80d36dc r4:e80d36b0
> [   39.638060] [<4080cbac>] (phy_state_machine) from [<40243ab4>]
> (process_one_work+0x240/0x57c)
> [   39.638063]  r8:00000000 r7:ed7ab600 r6:ed7a8040 r5:e90ff880 r4:e80d36b0
> [   39.638065] [<40243874>] (process_one_work) from [<402442b0>]
> (worker_thread+0x58/0x5f4)
> [   39.638068]  r10:e9232000 r9:ed7a8058 r8:41c03d00 r7:00000008
> r6:e90ff894 r5:ed7a8040
> [   39.638069]  r4:e90ff880
> [   39.638073] [<40244258>] (worker_thread) from [<4024b050>]
> (kthread+0x148/0x174)
> [   39.638076]  r10:e914be74 r9:40244258 r8:e90ff880 r7:e9232000
> r6:00000000 r5:e9204e40
> [   39.638077]  r4:e91940c0
> [   39.638081] [<4024af08>] (kthread) from [<402010ac>]
> (ret_from_fork+0x14/0x28)
> [   39.638083] Exception stack(0xe9233fb0 to 0xe9233ff8)
> [   39.638085] 3fa0:                                     00000000
> 00000000 00000000 00000000
> [   39.638087] 3fc0: 00000000 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000
> [   39.638089] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [   39.638092]  r10:00000000 r9:00000000 r8:00000000 r7:00000000
> r6:00000000 r5:4024af08
> [   39.638093]  r4:e9204e40
> [   39.638095] ---[ end trace 10eeee4f71649fc9 ]---
> [   39.639134] PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0x64
> returns -5
> [   39.639139] PM: Device unimac-mdio-0:01 failed to resume: error -5
> 
>
diff mbox series

Patch

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 49300fb59757..06fbca959383 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -663,7 +663,7 @@  void phy_stop_machine(struct phy_device *phydev)
  */
 static void phy_error(struct phy_device *phydev)
 {
-	WARN_ON(1);
+	phydev_err(phydev, "Error detected, halting PHY\n");
 
 	mutex_lock(&phydev->lock);
 	phydev->state = PHY_HALTED;