diff mbox

net: phy: Ensure the state machine is called when phy is UP

Message ID 1460750172-7796-1-git-send-email-alexandre.belloni@free-electrons.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Alexandre Belloni April 15, 2016, 7:56 p.m. UTC
Commit d5c3d84657db ("net: phy: Avoid polling PHY with
PHY_IGNORE_INTERRUPTS") removed the last polling done on the phy. Since
then, the last actual poll done on the phy happens PHY_STATE_TIME seconds
(that is actually one second) after registering the phy. If the interface
is not UP by that time, any previous IRQ indicating the link is up is
ignored. Moreover, nothing will start the autonegociation so the phy will
simply change from READY to UP and never actually go to RUNNING.

The one second delay explains why the issue is not seen when booting from
NFS or when the interface is configured at boot time.

To solve that, ensure the state machine is called as soon as the state
changes from READY to UP.

Fixes: d5c3d84657db ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
---
 drivers/net/phy/phy.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Florian Fainelli April 15, 2016, 8:10 p.m. UTC | #1
On 15/04/16 12:56, Alexandre Belloni wrote:
> Commit d5c3d84657db ("net: phy: Avoid polling PHY with
> PHY_IGNORE_INTERRUPTS") removed the last polling done on the phy. Since
> then, the last actual poll done on the phy happens PHY_STATE_TIME seconds
> (that is actually one second) after registering the phy. If the interface
> is not UP by that time, any previous IRQ indicating the link is up is
> ignored. Moreover, nothing will start the autonegociation so the phy will
> simply change from READY to UP and never actually go to RUNNING.

What do you mean by that? phy_start() will start auto-negotiation.

> 
> The one second delay explains why the issue is not seen when booting from
> NFS or when the interface is configured at boot time.
> 
> To solve that, ensure the state machine is called as soon as the state
> changes from READY to UP.

The fix may be good, but I would like to see which driver are you
observing this with? Also, having a capture of the PHY state machine
with debug prints enabled could help us figure out the sequence of
events leading to what you observed.

Assuming you might be using the macb driver, I see a race condition in
how macb_probe() registers for its MDIO bus and probes for the PHY,
after calling register_netdev(), which is something that is not good,
because as soon as register_netdev() is called, an in-kernel notifier
can start opening the device for use before you have returned...

> 
> Fixes: d5c3d84657db ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS")
> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
> ---
>  drivers/net/phy/phy.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 5590b9c182c9..25f6bfd1c8fd 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -787,6 +787,9 @@ void phy_start(struct phy_device *phydev)
>  		break;
>  	case PHY_READY:
>  		phydev->state = PHY_UP;
> +		cancel_delayed_work_sync(&phydev->state_queue);
> +		queue_delayed_work(system_power_efficient_wq,
> +				   &phydev->state_queue, 0);
>  		break;
>  	case PHY_HALTED:
>  		/* make sure interrupts are re-enabled for the PHY */
>
Alexandre Belloni April 15, 2016, 8:56 p.m. UTC | #2
On 15/04/2016 at 13:10:12 -0700, Florian Fainelli wrote :
> On 15/04/16 12:56, Alexandre Belloni wrote:
> > Commit d5c3d84657db ("net: phy: Avoid polling PHY with
> > PHY_IGNORE_INTERRUPTS") removed the last polling done on the phy. Since
> > then, the last actual poll done on the phy happens PHY_STATE_TIME seconds
> > (that is actually one second) after registering the phy. If the interface
> > is not UP by that time, any previous IRQ indicating the link is up is
> > ignored. Moreover, nothing will start the autonegociation so the phy will
> > simply change from READY to UP and never actually go to RUNNING.
> 
> What do you mean by that? phy_start() will start auto-negotiation.
> 

In my case, it doesn't because it switches the state from PHY_READY to
PHY_UP but phy_state_machine() is never called afterwards.

> > The one second delay explains why the issue is not seen when booting from
> > NFS or when the interface is configured at boot time.
> > 
> > To solve that, ensure the state machine is called as soon as the state
> > changes from READY to UP.
> 
> The fix may be good, but I would like to see which driver are you
> observing this with? Also, having a capture of the PHY state machine
> with debug prints enabled could help us figure out the sequence of
> events leading to what you observed.
> 

I'm using a macb with a Micrel KSZ8081.

Trace without my patch:
libphy: MACB_mii_bus: probed
macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
[...]
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
[...]
# ifconfig eth0 up
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready


With my patch:
libphy: MACB_mii_bus: probed
macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
[...]
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
[...]
# ifconfig eth0 up
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change UP -> AN
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change AN -> NOLINK
macb f8020000.ethernet eth0: link up (100/Full)
Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change CHANGELINK -> RUNNING
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready



> Assuming you might be using the macb driver, I see a race condition in
> how macb_probe() registers for its MDIO bus and probes for the PHY,
> after calling register_netdev(), which is something that is not good,
> because as soon as register_netdev() is called, an in-kernel notifier
> can start opening the device for use before you have returned...
> 

Well, I'm not sure  I'm running into that because phy_start() is only called
once I open the interface from userspace.
Andrew Lunn April 15, 2016, 10:05 p.m. UTC | #3
> Trace without my patch:
> libphy: MACB_mii_bus: probed
> macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> [...]
> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY

Are there some state changes before this? How is it getting to state
READY? It would expect it to start in DOWN, from when the phy device
was created in phy_device_create().

       Andrew
Alexandre Belloni April 15, 2016, 10:17 p.m. UTC | #4
On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote :
> > Trace without my patch:
> > libphy: MACB_mii_bus: probed
> > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
> > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
> > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> > [...]
> > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> 
> Are there some state changes before this? How is it getting to state
> READY? It would expect it to start in DOWN, from when the phy device
> was created in phy_device_create().
> 

No other changes. I forgot to mention that this is when booting with a
cable plugged in. Unplugging and replugging the cable makes the link
detection work fine even without the patch.
Andrew Lunn April 15, 2016, 10:30 p.m. UTC | #5
On Sat, Apr 16, 2016 at 12:17:11AM +0200, Alexandre Belloni wrote:
> On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote :
> > > Trace without my patch:
> > > libphy: MACB_mii_bus: probed
> > > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
> > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
> > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> > > [...]
> > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> > 
> > Are there some state changes before this? How is it getting to state
> > READY? It would expect it to start in DOWN, from when the phy device
> > was created in phy_device_create().
> > 
> 
> No other changes. I forgot to mention that this is when booting with a
> cable plugged in. Unplugging and replugging the cable makes the link
> detection work fine even without the patch.

Are you tftpbooting? I.e. has the boot loader already done an auto
negotiation?

I've looked at the code and i still don't see how it gets to READY.
What i do see is that when you connect the phy to the MAC, the
interrupt handler is installed. So maybe there are some PHY interrupts
before the interface is opened? Could you put a print in
phy_interrupt().

	Andrew
Alexandre Belloni April 15, 2016, 10:45 p.m. UTC | #6
On 16/04/2016 at 00:30:26 +0200, Andrew Lunn wrote :
> On Sat, Apr 16, 2016 at 12:17:11AM +0200, Alexandre Belloni wrote:
> > On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote :
> > > > Trace without my patch:
> > > > libphy: MACB_mii_bus: probed
> > > > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
> > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
> > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> > > > [...]
> > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
> > > 
> > > Are there some state changes before this? How is it getting to state
> > > READY? It would expect it to start in DOWN, from when the phy device
> > > was created in phy_device_create().
> > > 
> > 
> > No other changes. I forgot to mention that this is when booting with a
> > cable plugged in. Unplugging and replugging the cable makes the link
> > detection work fine even without the patch.
> 
> Are you tftpbooting? I.e. has the boot loader already done an auto
> negotiation?
> 

Yes.

> I've looked at the code and i still don't see how it gets to READY.
> What i do see is that when you connect the phy to the MAC, the
> interrupt handler is installed. So maybe there are some PHY interrupts
> before the interface is opened? Could you put a print in
> phy_interrupt().
> 

That is indeed the case, and I'm not sure why because
99f81afc139c6edd14d77a91ee91685a414a1c66 is trying to disable AN at
boot.
Nicolas Ferre Sept. 19, 2016, 1:15 p.m. UTC | #7
Hi all,

I come back to this thread to re-start the conversation as I still have
the issue...


Le 16/04/2016 à 00:45, Alexandre Belloni a écrit :
> On 16/04/2016 at 00:30:26 +0200, Andrew Lunn wrote :
>> On Sat, Apr 16, 2016 at 12:17:11AM +0200, Alexandre Belloni wrote:
>>> On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote :
>>>>> Trace without my patch:
>>>>> libphy: MACB_mii_bus: probed
>>>>> macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05)
>>>>> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171)
>>>>> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
>>>>> [...]
>>>>> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY
>>>>
>>>> Are there some state changes before this? How is it getting to state
>>>> READY? It would expect it to start in DOWN, from when the phy device
>>>> was created in phy_device_create().
>>>>
>>>
>>> No other changes. I forgot to mention that this is when booting with a
>>> cable plugged in. Unplugging and replugging the cable makes the link
>>> detection work fine even without the patch.
>>
>> Are you tftpbooting? I.e. has the boot loader already done an auto
>> negotiation?
>>
> 
> Yes.

Yes indeed: this is my use-case: load the kernel from U-Boot using tftp
and having the rootfs in NAND flash so, no NFS rootfs for me.

>> I've looked at the code and i still don't see how it gets to READY.
>> What i do see is that when you connect the phy to the MAC, the
>> interrupt handler is installed. So maybe there are some PHY interrupts
>> before the interface is opened? Could you put a print in
>> phy_interrupt().
>>
> 
> That is indeed the case, and I'm not sure why because
> 99f81afc139c6edd14d77a91ee91685a414a1c66 is trying to disable AN at
> boot.

I don't know what happens to the phy, but this patch does fix the issue
for me:
Tested-by: Nicolas Ferre <nicolas.ferre@atmel.com>

The other alternative that I'm considering seriously as I'm still
struggled with this is to simply remove the phy IRQ from my board DT.

Best regards,
diff mbox

Patch

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 5590b9c182c9..25f6bfd1c8fd 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -787,6 +787,9 @@  void phy_start(struct phy_device *phydev)
 		break;
 	case PHY_READY:
 		phydev->state = PHY_UP;
+		cancel_delayed_work_sync(&phydev->state_queue);
+		queue_delayed_work(system_power_efficient_wq,
+				   &phydev->state_queue, 0);
 		break;
 	case PHY_HALTED:
 		/* make sure interrupts are re-enabled for the PHY */