Message ID | 1460750172-7796-1-git-send-email-alexandre.belloni@free-electrons.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On 15/04/16 12:56, Alexandre Belloni wrote: > Commit d5c3d84657db ("net: phy: Avoid polling PHY with > PHY_IGNORE_INTERRUPTS") removed the last polling done on the phy. Since > then, the last actual poll done on the phy happens PHY_STATE_TIME seconds > (that is actually one second) after registering the phy. If the interface > is not UP by that time, any previous IRQ indicating the link is up is > ignored. Moreover, nothing will start the autonegociation so the phy will > simply change from READY to UP and never actually go to RUNNING. What do you mean by that? phy_start() will start auto-negotiation. > > The one second delay explains why the issue is not seen when booting from > NFS or when the interface is configured at boot time. > > To solve that, ensure the state machine is called as soon as the state > changes from READY to UP. The fix may be good, but I would like to see which driver are you observing this with? Also, having a capture of the PHY state machine with debug prints enabled could help us figure out the sequence of events leading to what you observed. Assuming you might be using the macb driver, I see a race condition in how macb_probe() registers for its MDIO bus and probes for the PHY, after calling register_netdev(), which is something that is not good, because as soon as register_netdev() is called, an in-kernel notifier can start opening the device for use before you have returned... > > Fixes: d5c3d84657db ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") > Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> > --- > drivers/net/phy/phy.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 5590b9c182c9..25f6bfd1c8fd 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -787,6 +787,9 @@ void phy_start(struct phy_device *phydev) > break; > case PHY_READY: > phydev->state = PHY_UP; > + cancel_delayed_work_sync(&phydev->state_queue); > + queue_delayed_work(system_power_efficient_wq, > + &phydev->state_queue, 0); > break; > case PHY_HALTED: > /* make sure interrupts are re-enabled for the PHY */ >
On 15/04/2016 at 13:10:12 -0700, Florian Fainelli wrote : > On 15/04/16 12:56, Alexandre Belloni wrote: > > Commit d5c3d84657db ("net: phy: Avoid polling PHY with > > PHY_IGNORE_INTERRUPTS") removed the last polling done on the phy. Since > > then, the last actual poll done on the phy happens PHY_STATE_TIME seconds > > (that is actually one second) after registering the phy. If the interface > > is not UP by that time, any previous IRQ indicating the link is up is > > ignored. Moreover, nothing will start the autonegociation so the phy will > > simply change from READY to UP and never actually go to RUNNING. > > What do you mean by that? phy_start() will start auto-negotiation. > In my case, it doesn't because it switches the state from PHY_READY to PHY_UP but phy_state_machine() is never called afterwards. > > The one second delay explains why the issue is not seen when booting from > > NFS or when the interface is configured at boot time. > > > > To solve that, ensure the state machine is called as soon as the state > > changes from READY to UP. > > The fix may be good, but I would like to see which driver are you > observing this with? Also, having a capture of the PHY state machine > with debug prints enabled could help us figure out the sequence of > events leading to what you observed. > I'm using a macb with a Micrel KSZ8081. Trace without my patch: libphy: MACB_mii_bus: probed macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY [...] Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY [...] # ifconfig eth0 up IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready With my patch: libphy: MACB_mii_bus: probed macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY [...] Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY [...] # ifconfig eth0 up IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change UP -> AN Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change AN -> NOLINK macb f8020000.ethernet eth0: link up (100/Full) Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change CHANGELINK -> RUNNING IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > Assuming you might be using the macb driver, I see a race condition in > how macb_probe() registers for its MDIO bus and probes for the PHY, > after calling register_netdev(), which is something that is not good, > because as soon as register_netdev() is called, an in-kernel notifier > can start opening the device for use before you have returned... > Well, I'm not sure I'm running into that because phy_start() is only called once I open the interface from userspace.
> Trace without my patch: > libphy: MACB_mii_bus: probed > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > [...] > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY Are there some state changes before this? How is it getting to state READY? It would expect it to start in DOWN, from when the phy device was created in phy_device_create(). Andrew
On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote : > > Trace without my patch: > > libphy: MACB_mii_bus: probed > > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > > [...] > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > > Are there some state changes before this? How is it getting to state > READY? It would expect it to start in DOWN, from when the phy device > was created in phy_device_create(). > No other changes. I forgot to mention that this is when booting with a cable plugged in. Unplugging and replugging the cable makes the link detection work fine even without the patch.
On Sat, Apr 16, 2016 at 12:17:11AM +0200, Alexandre Belloni wrote: > On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote : > > > Trace without my patch: > > > libphy: MACB_mii_bus: probed > > > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > > > [...] > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > > > > Are there some state changes before this? How is it getting to state > > READY? It would expect it to start in DOWN, from when the phy device > > was created in phy_device_create(). > > > > No other changes. I forgot to mention that this is when booting with a > cable plugged in. Unplugging and replugging the cable makes the link > detection work fine even without the patch. Are you tftpbooting? I.e. has the boot loader already done an auto negotiation? I've looked at the code and i still don't see how it gets to READY. What i do see is that when you connect the phy to the MAC, the interrupt handler is installed. So maybe there are some PHY interrupts before the interface is opened? Could you put a print in phy_interrupt(). Andrew
On 16/04/2016 at 00:30:26 +0200, Andrew Lunn wrote : > On Sat, Apr 16, 2016 at 12:17:11AM +0200, Alexandre Belloni wrote: > > On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote : > > > > Trace without my patch: > > > > libphy: MACB_mii_bus: probed > > > > macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) > > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) > > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > > > > [...] > > > > Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY > > > > > > Are there some state changes before this? How is it getting to state > > > READY? It would expect it to start in DOWN, from when the phy device > > > was created in phy_device_create(). > > > > > > > No other changes. I forgot to mention that this is when booting with a > > cable plugged in. Unplugging and replugging the cable makes the link > > detection work fine even without the patch. > > Are you tftpbooting? I.e. has the boot loader already done an auto > negotiation? > Yes. > I've looked at the code and i still don't see how it gets to READY. > What i do see is that when you connect the phy to the MAC, the > interrupt handler is installed. So maybe there are some PHY interrupts > before the interface is opened? Could you put a print in > phy_interrupt(). > That is indeed the case, and I'm not sure why because 99f81afc139c6edd14d77a91ee91685a414a1c66 is trying to disable AN at boot.
Hi all, I come back to this thread to re-start the conversation as I still have the issue... Le 16/04/2016 à 00:45, Alexandre Belloni a écrit : > On 16/04/2016 at 00:30:26 +0200, Andrew Lunn wrote : >> On Sat, Apr 16, 2016 at 12:17:11AM +0200, Alexandre Belloni wrote: >>> On 16/04/2016 at 00:05:08 +0200, Andrew Lunn wrote : >>>>> Trace without my patch: >>>>> libphy: MACB_mii_bus: probed >>>>> macb f8020000.ethernet eth0: Cadence GEM rev 0x00020120 at 0xf8020000 irq 27 (fc:c2:3d:0c:6e:05) >>>>> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: attached PHY driver [Micrel KSZ8081 or KSZ8091] (mii_bus:phy_addr=f8020000.etherne:01, irq=171) >>>>> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY >>>>> [...] >>>>> Micrel KSZ8081 or KSZ8091 f8020000.etherne:01: PHY state change READY -> READY >>>> >>>> Are there some state changes before this? How is it getting to state >>>> READY? It would expect it to start in DOWN, from when the phy device >>>> was created in phy_device_create(). >>>> >>> >>> No other changes. I forgot to mention that this is when booting with a >>> cable plugged in. Unplugging and replugging the cable makes the link >>> detection work fine even without the patch. >> >> Are you tftpbooting? I.e. has the boot loader already done an auto >> negotiation? >> > > Yes. Yes indeed: this is my use-case: load the kernel from U-Boot using tftp and having the rootfs in NAND flash so, no NFS rootfs for me. >> I've looked at the code and i still don't see how it gets to READY. >> What i do see is that when you connect the phy to the MAC, the >> interrupt handler is installed. So maybe there are some PHY interrupts >> before the interface is opened? Could you put a print in >> phy_interrupt(). >> > > That is indeed the case, and I'm not sure why because > 99f81afc139c6edd14d77a91ee91685a414a1c66 is trying to disable AN at > boot. I don't know what happens to the phy, but this patch does fix the issue for me: Tested-by: Nicolas Ferre <nicolas.ferre@atmel.com> The other alternative that I'm considering seriously as I'm still struggled with this is to simply remove the phy IRQ from my board DT. Best regards,
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 5590b9c182c9..25f6bfd1c8fd 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -787,6 +787,9 @@ void phy_start(struct phy_device *phydev) break; case PHY_READY: phydev->state = PHY_UP; + cancel_delayed_work_sync(&phydev->state_queue); + queue_delayed_work(system_power_efficient_wq, + &phydev->state_queue, 0); break; case PHY_HALTED: /* make sure interrupts are re-enabled for the PHY */
Commit d5c3d84657db ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") removed the last polling done on the phy. Since then, the last actual poll done on the phy happens PHY_STATE_TIME seconds (that is actually one second) after registering the phy. If the interface is not UP by that time, any previous IRQ indicating the link is up is ignored. Moreover, nothing will start the autonegociation so the phy will simply change from READY to UP and never actually go to RUNNING. The one second delay explains why the issue is not seen when booting from NFS or when the interface is configured at boot time. To solve that, ensure the state machine is called as soon as the state changes from READY to UP. Fixes: d5c3d84657db ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> --- drivers/net/phy/phy.c | 3 +++ 1 file changed, 3 insertions(+)