diff mbox

[v3,1/2] net: phy: Fix PHY AN done state machine for interrupt driven PHYs

Message ID 078917ad-77a1-a722-1688-6203ae183a46@ti.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Roger Quadros March 27, 2017, 11:59 a.m. UTC
The Ethernet link on an interrupt driven PHY was not coming up if the
Ethernet cable was plugged before the Ethernet interface was brought up.

The PHY state machine seems to be stuck from RUNNING to AN state
with no new interrupts from the PHY. So it doesn't know when the
PHY Auto-negotiation has been completed and doesn't transition to RUNNING
state with ANEG done thus netif_carrier_on() is never called.

NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
advertisement parameters didn't change.

Fix this by scheduling the PHY state machine in phy_start_aneg().
There is no way of knowing in phy.c whether auto-negotiation was
restarted or not by the PHY driver so we just wait for the next
poll/interrupt to update the PHY state machine.

Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
Cc: stable <stable@vger.kernel.org> # v4.9+
Signed-off-by: Roger Quadros <rogerq@ti.com>
---
v3: Fix typo in commit message

 drivers/net/phy/phy.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Roger Quadros March 28, 2017, 10:05 a.m. UTC | #1
+Andrew Davis & Sekhar.

Hi,

Andrew Davis posted a few comments offline which I'm clarifying here.

On 27/03/17 14:59, Roger Quadros wrote:
> The Ethernet link on an interrupt driven PHY was not coming up if the
> Ethernet cable was plugged before the Ethernet interface was brought up.
> 
> The PHY state machine seems to be stuck from RUNNING to AN state
> with no new interrupts from the PHY. So it doesn't know when the
> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
> state with ANEG done thus netif_carrier_on() is never called.
> 
> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
> advertisement parameters didn't change.

Is phy->config_aneg expected to *always* restart auto-negotiation even if
advertisement parameters didn't change?
If so then we'll need to fix genphy_config_aneg().

> 
> Fix this by scheduling the PHY state machine in phy_start_aneg().
> There is no way of knowing in phy.c whether auto-negotiation was
> restarted or not by the PHY driver so we just wait for the next
> poll/interrupt to update the PHY state machine.
> 
> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
> Cc: stable <stable@vger.kernel.org> # v4.9+
> Signed-off-by: Roger Quadros <rogerq@ti.com>
> ---
> v3: Fix typo in commit message
> 
>  drivers/net/phy/phy.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 1be69d8..49dedf8 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev)
>  
>  out_unlock:
>  	mutex_unlock(&phydev->lock);
> +	if (!err && phy_interrupt_is_valid(phydev))
> +		queue_delayed_work(system_power_efficient_wq,
> +				   &phydev->state_queue, HZ);
> +
>  	return err;
>  }
>  EXPORT_SYMBOL(phy_start_aneg);
> 

There is still room for optimization for interrupt driven PHYs as I still
see a delay of 1 second between "ifconfig ethx up" and link status coming up
if cable was already plugged in. In fact if Auto-negotiation was already completed
and not required to be restarted, the PHY state machine should have move from
AN to RUNNING instantly without expecting a PHY interrupt.

How can we get rid of the unnecessary delay in the case where auto-negotiation
is not restarted?
Should we check for phy_aneg_done() immediately after issuing a phy_start_aneg()
in phy_state_machine() and switch from PHY_AN to PHY_RUNNING?

This should avoid the need to re-schedule the state machine in phy_start_angeg().

cheers,
-roger
Madalin Bucur March 30, 2017, 12:59 p.m. UTC | #2
On March 27, 2017 2:59 PM, Roger Quadros wrote:
> The Ethernet link on an interrupt driven PHY was not coming up if the
> Ethernet cable was plugged before the Ethernet interface was brought up.
> 
> The PHY state machine seems to be stuck from RUNNING to AN state
> with no new interrupts from the PHY. So it doesn't know when the
> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
> state with ANEG done thus netif_carrier_on() is never called.
> 
> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
> advertisement parameters didn't change.
> 
> Fix this by scheduling the PHY state machine in phy_start_aneg().
> There is no way of knowing in phy.c whether auto-negotiation was
> restarted or not by the PHY driver so we just wait for the next
> poll/interrupt to update the PHY state machine.
> 
> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and
> not polling.")
> Cc: stable <stable@vger.kernel.org> # v4.9+
> Signed-off-by: Roger Quadros <rogerq@ti.com>
> ---
> v3: Fix typo in commit message
> 
>  drivers/net/phy/phy.c | 4 ++++
>  1 file changed, 4 insertions(+)

Tested-by: Madalin Bucur <madalin.bucur@nxp.com>
Florian Fainelli March 30, 2017, 8:02 p.m. UTC | #3
On 03/27/2017 04:59 AM, Roger Quadros wrote:
> The Ethernet link on an interrupt driven PHY was not coming up if the
> Ethernet cable was plugged before the Ethernet interface was brought up.
> 
> The PHY state machine seems to be stuck from RUNNING to AN state
> with no new interrupts from the PHY. So it doesn't know when the
> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
> state with ANEG done thus netif_carrier_on() is never called.
> 
> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
> advertisement parameters didn't change.
> 
> Fix this by scheduling the PHY state machine in phy_start_aneg().
> There is no way of knowing in phy.c whether auto-negotiation was
> restarted or not by the PHY driver so we just wait for the next
> poll/interrupt to update the PHY state machine.
> 
> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
> Cc: stable <stable@vger.kernel.org> # v4.9+
> Signed-off-by: Roger Quadros <rogerq@ti.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Roger Quadros March 31, 2017, 9:19 a.m. UTC | #4
Florian,

On 30/03/17 23:02, Florian Fainelli wrote:
> On 03/27/2017 04:59 AM, Roger Quadros wrote:
>> The Ethernet link on an interrupt driven PHY was not coming up if the
>> Ethernet cable was plugged before the Ethernet interface was brought up.
>>
>> The PHY state machine seems to be stuck from RUNNING to AN state
>> with no new interrupts from the PHY. So it doesn't know when the
>> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
>> state with ANEG done thus netif_carrier_on() is never called.
>>
>> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
>> advertisement parameters didn't change.
>>
>> Fix this by scheduling the PHY state machine in phy_start_aneg().
>> There is no way of knowing in phy.c whether auto-negotiation was
>> restarted or not by the PHY driver so we just wait for the next
>> poll/interrupt to update the PHY state machine.
>>
>> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
>> Cc: stable <stable@vger.kernel.org> # v4.9+
>> Signed-off-by: Roger Quadros <rogerq@ti.com>
> 
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> 
Thanks for the review, but there are a still few unanswered questions in the parallel thread.
Can you please clarify those first before this patch gets picked? Thanks.

cheers,
-roger
Roger Quadros April 11, 2017, 11:17 a.m. UTC | #5
Hi,

On 28/03/17 13:05, Roger Quadros wrote:
> +Andrew Davis & Sekhar.
> 
> Hi,
> 
> Andrew Davis posted a few comments offline which I'm clarifying here.
> 
> On 27/03/17 14:59, Roger Quadros wrote:
>> The Ethernet link on an interrupt driven PHY was not coming up if the
>> Ethernet cable was plugged before the Ethernet interface was brought up.
>>
>> The PHY state machine seems to be stuck from RUNNING to AN state
>> with no new interrupts from the PHY. So it doesn't know when the
>> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
>> state with ANEG done thus netif_carrier_on() is never called.
>>
>> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
>> advertisement parameters didn't change.
> 
> Is phy->config_aneg expected to *always* restart auto-negotiation even if
> advertisement parameters didn't change?
> If so then we'll need to fix genphy_config_aneg().
> 
>>
>> Fix this by scheduling the PHY state machine in phy_start_aneg().
>> There is no way of knowing in phy.c whether auto-negotiation was
>> restarted or not by the PHY driver so we just wait for the next
>> poll/interrupt to update the PHY state machine.
>>
>> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
>> Cc: stable <stable@vger.kernel.org> # v4.9+
>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>> ---
>> v3: Fix typo in commit message
>>
>>  drivers/net/phy/phy.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>> index 1be69d8..49dedf8 100644
>> --- a/drivers/net/phy/phy.c
>> +++ b/drivers/net/phy/phy.c
>> @@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev)
>>  
>>  out_unlock:
>>  	mutex_unlock(&phydev->lock);
>> +	if (!err && phy_interrupt_is_valid(phydev))
>> +		queue_delayed_work(system_power_efficient_wq,
>> +				   &phydev->state_queue, HZ);
>> +
>>  	return err;
>>  }
>>  EXPORT_SYMBOL(phy_start_aneg);
>>
> 
> There is still room for optimization for interrupt driven PHYs as I still
> see a delay of 1 second between "ifconfig ethx up" and link status coming up
> if cable was already plugged in. In fact if Auto-negotiation was already completed
> and not required to be restarted, the PHY state machine should have move from
> AN to RUNNING instantly without expecting a PHY interrupt.
> 
> How can we get rid of the unnecessary delay in the case where auto-negotiation
> is not restarted?
> Should we check for phy_aneg_done() immediately after issuing a phy_start_aneg()
> in phy_state_machine() and switch from PHY_AN to PHY_RUNNING?
> 
> This should avoid the need to re-schedule the state machine in phy_start_angeg().

Any comments on my questions?

cheers,
-roger
Florian Fainelli April 19, 2017, 4:28 p.m. UTC | #6
On 04/11/2017 04:17 AM, Roger Quadros wrote:
> Hi,
> 
> On 28/03/17 13:05, Roger Quadros wrote:
>> +Andrew Davis & Sekhar.
>>
>> Hi,
>>
>> Andrew Davis posted a few comments offline which I'm clarifying here.
>>
>> On 27/03/17 14:59, Roger Quadros wrote:
>>> The Ethernet link on an interrupt driven PHY was not coming up if the
>>> Ethernet cable was plugged before the Ethernet interface was brought up.
>>>
>>> The PHY state machine seems to be stuck from RUNNING to AN state
>>> with no new interrupts from the PHY. So it doesn't know when the
>>> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
>>> state with ANEG done thus netif_carrier_on() is never called.
>>>
>>> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
>>> advertisement parameters didn't change.
>>
>> Is phy->config_aneg expected to *always* restart auto-negotiation even if
>> advertisement parameters didn't change?
>> If so then we'll need to fix genphy_config_aneg().
>>
>>>
>>> Fix this by scheduling the PHY state machine in phy_start_aneg().
>>> There is no way of knowing in phy.c whether auto-negotiation was
>>> restarted or not by the PHY driver so we just wait for the next
>>> poll/interrupt to update the PHY state machine.
>>>
>>> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
>>> Cc: stable <stable@vger.kernel.org> # v4.9+
>>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>>> ---
>>> v3: Fix typo in commit message
>>>
>>>  drivers/net/phy/phy.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>>> index 1be69d8..49dedf8 100644
>>> --- a/drivers/net/phy/phy.c
>>> +++ b/drivers/net/phy/phy.c
>>> @@ -630,6 +630,10 @@ int phy_start_aneg(struct phy_device *phydev)
>>>  
>>>  out_unlock:
>>>  	mutex_unlock(&phydev->lock);
>>> +	if (!err && phy_interrupt_is_valid(phydev))
>>> +		queue_delayed_work(system_power_efficient_wq,
>>> +				   &phydev->state_queue, HZ);
>>> +
>>>  	return err;
>>>  }
>>>  EXPORT_SYMBOL(phy_start_aneg);
>>>
>>
>> There is still room for optimization for interrupt driven PHYs as I still
>> see a delay of 1 second between "ifconfig ethx up" and link status coming up
>> if cable was already plugged in. In fact if Auto-negotiation was already completed
>> and not required to be restarted, the PHY state machine should have move from
>> AN to RUNNING instantly without expecting a PHY interrupt.
>>
>> How can we get rid of the unnecessary delay in the case where auto-negotiation
>> is not restarted?
>> Should we check for phy_aneg_done() immediately after issuing a phy_start_aneg()
>> in phy_state_machine() and switch from PHY_AN to PHY_RUNNING?

That sounds like a good idea yes. It seems to me like Alexander's patch
actually takes care of that:

http://patchwork.ozlabs.org/patch/752288/

Let's try to merge threads/recipients so we can shoot for a fix to be
included soon.

Thanks!

>>
>> This should avoid the need to re-schedule the state machine in phy_start_angeg().
> 
> Any comments on my questions?
> 
> cheers,
> -roger
>
diff mbox

Patch

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 1be69d8..49dedf8 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -630,6 +630,10 @@  int phy_start_aneg(struct phy_device *phydev)
 
 out_unlock:
 	mutex_unlock(&phydev->lock);
+	if (!err && phy_interrupt_is_valid(phydev))
+		queue_delayed_work(system_power_efficient_wq,
+				   &phydev->state_queue, HZ);
+
 	return err;
 }
 EXPORT_SYMBOL(phy_start_aneg);