ARC: [hsdk] Use rgmii-id mode for ethernet phy
diff mbox series

Message ID 20190514173941.20046-1-tpiepho@impinj.com
State New
Headers show
Series
  • ARC: [hsdk] Use rgmii-id mode for ethernet phy
Related show

Commit Message

Trent Piepho May 14, 2019, 5:40 p.m. UTC
If internal delays are desired on the RGMII link, "rgmii-id" should be
used as the phy-mode rather than "rgmii" .

This dts has properties to set the delay values, but they are ignored.
I suspect this is a mistake.

While the driver should disable delay based on the current DT, it does
not, and instead leaves the PHY in the pin strapping default.  Which is
usually to have delays very close to the unused values the hsdk DT.
Which is why the phy would work even if the delays in the DT are
ignored.

Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
---
 arch/arc/boot/dts/hsdk.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Alexey Brodkin May 14, 2019, 6:22 p.m. UTC | #1
Hi Trent,

> -----Original Message-----
> From: Trent Piepho <tpiepho@impinj.com>
> Sent: Tuesday, May 14, 2019 8:40 PM
> To: linux-snps-arc@lists.infradead.org
> Cc: Trent Piepho <tpiepho@impinj.com>; Alexey Brodkin <abrodkin@synopsys.com>; Vineet Gupta
> <vgupta@synopsys.com>
> Subject: [PATCH] ARC: [hsdk] Use rgmii-id mode for ethernet phy
> 
> If internal delays are desired on the RGMII link, "rgmii-id" should be
> used as the phy-mode rather than "rgmii" .
> 
> This dts has properties to set the delay values, but they are ignored.
> I suspect this is a mistake.
> 
> While the driver should disable delay based on the current DT, it does
> not, and instead leaves the PHY in the pin strapping default.  Which is
> usually to have delays very close to the unused values the hsdk DT.
> Which is why the phy would work even if the delays in the DT are
> ignored.

Thanks for this patch!

Indeed there might very well be something incomplete in that .dts
as I didn't know all those details.

I did check and Micrel KSZ9031 Gigabit PHY on HSDK supports on-chip delay.
That's what its datasheet says:
------------------->8------------------
RGMII Timing Supports On-Chip Delay According to RGMII Version 2.0,
with Programming Options for External Delay and Making Adjustments
and Corrections to TX and RX Timing Paths.
------------------->8------------------

And with proposed change I don't see any regressions so far, which is good.

Still a couple of questions:
 1. How did you spot this problem?
 2. With some Ethernet cards (especially 1Gb ones) on the other end we do see
    from time to time auto-negotiation takes that much time that
    udhcpc fails to get a lease because 3 discovery packets sent in
    a row get lost since link is not established (i.e. > 5 seconds).
    Do you think if it has something to do with that particular issue?
    Unfortunately I cannot reproduce this behavior right now so cannot
    verify it myself.

-Alexey
Trent Piepho May 14, 2019, 7:04 p.m. UTC | #2
On Tue, 2019-05-14 at 18:22 +0000, Alexey Brodkin wrote:
> > Subject: [PATCH] ARC: [hsdk] Use rgmii-id mode for ethernet phy
> > 
> > If internal delays are desired on the RGMII link, "rgmii-id" should be
> > used as the phy-mode rather than "rgmii" .
> > 
> > This dts has properties to set the delay values, but they are ignored.
> > I suspect this is a mistake.
> > 
> > While the driver should disable delay based on the current DT, it does
> > not, and instead leaves the PHY in the pin strapping default.  Which is
> > usually to have delays very close to the unused values the hsdk DT.
> > Which is why the phy would work even if the delays in the DT are
> > ignored.
> 
> Thanks for this patch!
> 
> Indeed there might very well be something incomplete in that .dts
> as I didn't know all those details.
> 
> I did check and Micrel KSZ9031 Gigabit PHY on HSDK supports on-chip delay.
> That's what its datasheet says:

Hmm, I was under the impression this board used a TI phy!  The phy DT
node contains these properties:

ti,rx-internal-delay = <DP83867_RGMIIDCTL_2_00_NS>;
ti,tx-internal-delay = <DP83867_RGMIIDCTL_2_00_NS>;
ti,fifo-depth = <DP83867_PHYCR_FIFO_DEPTH_4_B_NIB>;

These are for a dp83867 PHY of course, not a Micrel KSZ9031.  The
micrel driver would just ignore these properties.

> ------------------->8------------------
> RGMII Timing Supports On-Chip Delay According to RGMII Version 2.0,
> with Programming Options for External Delay and Making Adjustments
> and Corrections to TX and RX Timing Paths.
> ------------------->8------------------
> 
> And with proposed change I don't see any regressions so far, which is good.

I suspect the micrel driver will simply ignore the rgmii vs rgmii-id. 
But I have not worked on that driver for a long time and don't know for
sure.

> Still a couple of questions:
>  1. How did you spot this problem?

Found bug in dp83867 phy, grepped all dts files for properties or
symbolic constants that relate to that phy to look for board that might
be affected.

Since dts files almost always use run-time detection of phy type based
on the phy's id registers, I can't search for the "compatible" in the
dts to find boards using the phy.  So perhaps this board is a false
positive, and the real bug is the dts file was written for the dp83867
phy but the board really uses a different phy.

>  2. With some Ethernet cards (especially 1Gb ones) on the other end we do see
>     from time to time auto-negotiation takes that much time that
>     udhcpc fails to get a lease because 3 discovery packets sent in
>     a row get lost since link is not established (i.e. > 5 seconds).
>     Do you think if it has something to do with that particular issue?
>     Unfortunately I cannot reproduce this behavior right now so cannot
>     verify it myself.

Unlikely.  I don't believe the data portion of the RGMII link to the
MAC, which is what these clock skew settings affect, has any effect on
auto-negotiation.  And indeed I can program entirely non-functional
values for delay and see A/N proceed as normal.

If the delay is off slightly, you will see TX or RX (depending on which
delay) packet error rate go up.  On my board, I see about a 50 ps range
over which the error rates goes from virtually 0 to 100%.  There is
about a 200 ps range over which the error rate is 0.  I imagine that
over a temperature range, that 200 ps range is smaller.

I did recently diagnose a problem where A/N was several seconds slower
than expected.  It had to do with the link partner being an Intel PHY
which had "ultra low power mode" enabled.  This is some feature to
reduce power when there is no link.  If the link to the intel phy was
not up, i.e. a direct connect from my device (powered off) to a PC, not
through a switch, then link up would first A/N to 10 mbps mode, not
work, then drop, then come up as 1000 mbps and work.  This took about
7-8 seconds instead of about 3 seconds.  With a switch interposed
between the devices, the Intel PHY does not see a down link (the switch
is on), so this doesn't happen.  Probably not your problem, as I could
only see this in u-boot by the time Linux has booted the phy will have
activated the link and gotten past this screwy 10 mbps thing.
Alexey Brodkin May 15, 2019, 4:21 p.m. UTC | #3
Hi Trent,

> -----Original Message-----
> From: Trent Piepho <tpiepho@impinj.com>
> Sent: Tuesday, May 14, 2019 10:05 PM
> To: Alexey Brodkin <abrodkin@synopsys.com>
> Cc: Vineet.Gupta1@synopsys.com; Eugeniy.Paltsev@synopsys.com; linux-snps-arc@lists.infradead.org
> Subject: Re: [PATCH] ARC: [hsdk] Use rgmii-id mode for ethernet phy
> 
> On Tue, 2019-05-14 at 18:22 +0000, Alexey Brodkin wrote:
> > > Subject: [PATCH] ARC: [hsdk] Use rgmii-id mode for ethernet phy
> > >
> > > If internal delays are desired on the RGMII link, "rgmii-id" should be
> > > used as the phy-mode rather than "rgmii" .
> > >
> > > This dts has properties to set the delay values, but they are ignored.
> > > I suspect this is a mistake.
> > >
> > > While the driver should disable delay based on the current DT, it does
> > > not, and instead leaves the PHY in the pin strapping default.  Which is
> > > usually to have delays very close to the unused values the hsdk DT.
> > > Which is why the phy would work even if the delays in the DT are
> > > ignored.
> >
> > Thanks for this patch!
> >
> > Indeed there might very well be something incomplete in that .dts
> > as I didn't know all those details.
> >
> > I did check and Micrel KSZ9031 Gigabit PHY on HSDK supports on-chip delay.
> > That's what its datasheet says:
> 
> Hmm, I was under the impression this board used a TI phy!  The phy DT
> node contains these properties:
> 
> ti,rx-internal-delay = <DP83867_RGMIIDCTL_2_00_NS>;
> ti,tx-internal-delay = <DP83867_RGMIIDCTL_2_00_NS>;
> ti,fifo-depth = <DP83867_PHYCR_FIFO_DEPTH_4_B_NIB>;

Got those removed, see explanation here:
http://lists.infradead.org/pipermail/linux-snps-arc/2019-May/005754.html

But we have another boards where DP83865 PHY is used,
these are AXS101 & AXS103 which share the same base-board .dtsi,
see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arc/boot/dts/axs10x_mb.dtsi#n75

Even though it's not immediately clear there's a TI PHY as there's
no PHY node at all but see what we have in the bootlog:
| NatSemi DP83865 stmmac-0:01: attached PHY driver [NatSemi DP83865] ...

I guess I need to add PHY node and use suggested by you "rgmii-id", right?

> I did recently diagnose a problem where A/N was several seconds slower
> than expected.  It had to do with the link partner being an Intel PHY
> which had "ultra low power mode" enabled.  This is some feature to
> reduce power when there is no link.  If the link to the intel phy was
> not up, i.e. a direct connect from my device (powered off) to a PC, not
> through a switch, then link up would first A/N to 10 mbps mode, not
> work, then drop, then come up as 1000 mbps and work.  This took about
> 7-8 seconds instead of about 3 seconds.  With a switch interposed
> between the devices, the Intel PHY does not see a down link (the switch
> is on), so this doesn't happen.  Probably not your problem, as I could
> only see this in u-boot by the time Linux has booted the phy will have
> activated the link and gotten past this screwy 10 mbps thing.

Hm, that's interesting... I think at least on some of our machines we do
have Intel controllers and most probably Intel PHYs as well so that might
very well be the case. Do you know if that "ultra low power mode" could be
somehow easily disabled?

-Alexey
Trent Piepho May 15, 2019, 6:16 p.m. UTC | #4
On Wed, 2019-05-15 at 16:21 +0000, Alexey Brodkin wrote:
> > -----Original Message-----
> > From: Trent Piepho <tpiepho@impinj.com>
> > Sent: Tuesday, May 14, 2019 10:05 PM
> > To: Alexey Brodkin <abrodkin@synopsys.com>
> > Cc: Vineet.Gupta1@synopsys.com; Eugeniy.Paltsev@synopsys.com; linux-snps-arc@lists.infradead.org
> > Subject: Re: [PATCH] ARC: [hsdk] Use rgmii-id mode for ethernet phy
> > 
> > On Tue, 2019-05-14 at 18:22 +0000, Alexey Brodkin wrote:
> > > > Subject: [PATCH] ARC: [hsdk] Use rgmii-id mode for ethernet phy
> > > > 
> > > > If internal delays are desired on the RGMII link, "rgmii-id" should be
> > > > used as the phy-mode rather than "rgmii" .
> > > > 
> > > > This dts has properties to set the delay values, but they are ignored.
> > > > I suspect this is a mistake.
> > > > 
> But we have another boards where DP83865 PHY is used,
> these are AXS101 & AXS103 which share the same base-board .dtsi,
> see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arc/boot/dts/axs10x_mb.dtsi#n75
> 
> Even though it's not immediately clear there's a TI PHY as there's
> no PHY node at all but see what we have in the bootlog:
> > NatSemi DP83865 stmmac-0:01: attached PHY driver [NatSemi DP83865] ...
> 
> I guess I need to add PHY node and use suggested by you "rgmii-id", right?

The dp83865 is a different driver than the dp83867.  My check of it
didn't find any evidence of it doing anything w.r.t. rgmii clock skew. 
A quick check of dp83865 datasheet shows it's a very different device
than the dp83867.

> > work, then drop, then come up as 1000 mbps and work.  This took about
> > 7-8 seconds instead of about 3 seconds.  With a switch interposed
> > between the devices, the Intel PHY does not see a down link (the switch
> > is on), so this doesn't happen.  Probably not your problem, as I could
> > only see this in u-boot by the time Linux has booted the phy will have
> > activated the link and gotten past this screwy 10 mbps thing.
> 
> Hm, that's interesting... I think at least on some of our machines we do
> have Intel controllers and most probably Intel PHYs as well so that might
> very well be the case. Do you know if that "ultra low power mode" could be
> somehow easily disabled?

If the machine with the intel phy is running windows, there is a driver
option (device manager, properties, advanced..) for it.  Driver version
12.15.22.6 from 4/5/2016 does not have a control, while version
12.17.10.6 from 4/3/2018 does have the control.  Those are my only two
data points.  I don't know how to control this feature from Linux. 
Never used an Intel phy on an embedded device.
Alexey Brodkin May 15, 2019, 6:21 p.m. UTC | #5
Hi Trent,

[snip]

> > Even though it's not immediately clear there's a TI PHY as there's
> > no PHY node at all but see what we have in the bootlog:
> > > NatSemi DP83865 stmmac-0:01: attached PHY driver [NatSemi DP83865] ...
> >
> > I guess I need to add PHY node and use suggested by you "rgmii-id", right?
> 
> The dp83865 is a different driver than the dp83867.  My check of it
> didn't find any evidence of it doing anything w.r.t. rgmii clock skew.
> A quick check of dp83865 datasheet shows it's a very different device
> than the dp83867.

Ok so I'll just add a generic PHY node to AXS10x motherboard.

> > > work, then drop, then come up as 1000 mbps and work.  This took about
> > > 7-8 seconds instead of about 3 seconds.  With a switch interposed
> > > between the devices, the Intel PHY does not see a down link (the switch
> > > is on), so this doesn't happen.  Probably not your problem, as I could
> > > only see this in u-boot by the time Linux has booted the phy will have
> > > activated the link and gotten past this screwy 10 mbps thing.
> >
> > Hm, that's interesting... I think at least on some of our machines we do
> > have Intel controllers and most probably Intel PHYs as well so that might
> > very well be the case. Do you know if that "ultra low power mode" could be
> > somehow easily disabled?
> 
> If the machine with the intel phy is running windows, there is a driver
> option (device manager, properties, advanced..) for it.  Driver version
> 12.15.22.6 from 4/5/2016 does not have a control, while version
> 12.17.10.6 from 4/3/2018 does have the control.  Those are my only two
> data points.  I don't know how to control this feature from Linux.
> Never used an Intel phy on an embedded device.

Thanks for this info. Our servers are running Linux so we'll see.
Though I just understood that we connect our boards via a switch typically
so it shouldn't be a problem of "ultra low-power" anything.

Anyways thanks for your input.

-Alexey

Patch
diff mbox series

diff --git a/arch/arc/boot/dts/hsdk.dts b/arch/arc/boot/dts/hsdk.dts
index 7425bb0f2d1b..d77b27894ab6 100644
--- a/arch/arc/boot/dts/hsdk.dts
+++ b/arch/arc/boot/dts/hsdk.dts
@@ -185,7 +185,7 @@ 
 			reg = <0x8000 0x2000>;
 			interrupts = <10>;
 			interrupt-names = "macirq";
-			phy-mode = "rgmii";
+			phy-mode = "rgmii-id";
 			snps,pbl = <32>;
 			clocks = <&gmacclk>;
 			clock-names = "stmmaceth";