diff mbox

[OpenWrt-Devel] ar71xx: fix kernel Oops in at803x_link_change_notify

Message ID 2194528.0hAsuVtonU@bentobox
State Not Applicable
Headers show

Commit Message

Sven Eckelmann July 3, 2015, 10:29 a.m. UTC
On Wednesday 01 July 2015 16:34:07 Christian Lamparter wrote:
> On Tuesday, June 30, 2015 05:35:12 PM Sven Eckelmann wrote:
> > r45954 ("ar71xx: fix 100/10mbps ethernet link issues on mynet range
> > extender") introduced a pdata based modification of the tx_clk_dly. But it
> > was not checked if pdata actually existed. This caused a page fault on all
> > devices which didn't have at803x_platform_data specified for an at803x
> > based device.
> > 
> Ah finally. I tried to contact you several times before. 
> (RFC back in May)

Yes, missed the Cc mail "[RFT/RFC] ar71xx: fix 100/10mbps ethernet link issues
on mynet range extender" and "Re: [OpenWrt-Devel] [PATCH] ar71xx: fix
100/10mbps ethernet link issues on mynet range extender". Sorry about that -
it is my fault because I have not yet figured out how to force google to
handle Cc'ed/To'ed/Bcc'ed mails special and not to put them in my "Mailing
List - read a lot later" folder just because the original thread is in this
folder. Marek gave me a list of workarounds which I promise to try out.

> Sven, I've seen you did quite a bit of work on the OM5P-AN
> with the same F1E-PHY. I went through the code from Atheros'
> original driver SDK v17 (which was part of the GPL source
> for the mynet). Apart from tx delay handling, everything
> else matches perfectly, so the tx/rx delay code should be
> enabled by default for this chip on all platforms without
> needing any platform data overwrites.

Interesting, I don't have the SDK and I am only using OpenWrt. So I cannot say
anything about the Atheros driver. But sounds like you are right. At least all
recent devices I had in my hand required this change.

I also saw your fixup_rgmii_tx_delay delay change. I would ack when you submit
a change like this. Just don't forget that the pll_1000 has also to be changed
(especially when you move the default values to at803x.c):



I don't have the equipment to check the actual signals (as you said in an
earlier mail) - only some people which complained very loud when their long
cable setup didn't work. :)

Maybe there are some non atheros SoC's out there which would have problems
when you would change the default behavior of the AT8035.

> I'm interested to hear from any other devices which have 
> a AT803x. Also, please let me know where to get the GPL
> tars for the devices.

I have forwarded your mail to the person which is handling the actual
firmware builds of the OM5P-AN. He will contact you later and provide
the sources.

Kind regards,
	Sven

Comments

Christian Lamparter July 6, 2015, 11:26 a.m. UTC | #1
On Friday, July 03, 2015 12:29:32 PM Sven Eckelmann wrote:
> > Sven, I've seen you did quite a bit of work on the OM5P-AN
> > with the same F1E-PHY. I went through the code from Atheros'
> > original driver SDK v17 (which was part of the GPL source
> > for the mynet). Apart from tx delay handling, everything
> > else matches perfectly, so the tx/rx delay code should be
> > enabled by default for this chip on all platforms without
> > needing any platform data overwrites.
> 
> Interesting, I don't have the SDK and I am only using OpenWrt. So I cannot say
> anything about the Atheros driver. But sounds like you are right. At least all
> recent devices I had in my hand required this change.
Oh, I figured you have access to some of Qualcomm Atheros' SDKs
and most importantly: some documentations from Qualcomm Atheros? 
I hoped you could confirm or disconfirm based on the documents.
 
> I also saw your fixup_rgmii_tx_delay delay change. I would ack when you submit
> a change like this. Just don't forget that the pll_1000 has also to be changed
> (especially when you move the default values to at803x.c):
> 
> --- a/target/linux/ar71xx/files/arch/mips/ath79/mach-om5p.c
> +++ b/target/linux/ar71xx/files/arch/mips/ath79/mach-om5p.c
> @@ -154,7 +154,8 @@ static struct i2c_board_info om5pan_i2c_devs[] __initdata = {
>  static struct at803x_platform_data om5p_an_at803x_data = {
>  	.disable_smarteee = 1,
>  	.enable_rgmii_rx_delay = 1,
> -	.enable_rgmii_tx_delay = 1,
> +	.enable_rgmii_tx_delay = 0,
> +	.fixup_rgmii_tx_delay = 1,
>  };
>  
>  static struct mdio_board_info om5p_an_mdio0_info[] = {
> @@ -201,7 +202,7 @@ static void __init om5p_an_setup(void)
>  	ath79_eth0_data.phy_if_mode = PHY_INTERFACE_MODE_RGMII;
>  	ath79_eth0_data.mii_bus_dev = &ath79_mdio0_device.dev;
>  	ath79_eth0_data.phy_mask = BIT(7);
> -	ath79_eth0_pll_data.pll_1000 = 0x02000000;
> +	ath79_eth0_pll_data.pll_1000 = 0x0e000000;
>  	ath79_eth0_pll_data.pll_100 = 0x00000101;
>  	ath79_eth0_pll_data.pll_10 = 0x00001313;
>  	ath79_register_eth(0);
> 
> 
> I don't have the equipment to check the actual signals (as you said in an
> earlier mail) - only some people which complained very loud when their long
> cable setup didn't work. :)
Long cable... That's a good point, I think I never tested the AT8035 with
anything beyond a 5m / 16 feet cable.  

> Maybe there are some non atheros SoC's out there which would have problems
> when you would change the default behavior of the AT8035.
I've been looking around for devices with an AT8035. And I found a few gems.

<http://lists.infradead.org/pipermail/linux-arm-kernel/2013-September/200978.html>
<https://forum.openwrt.org/viewtopic.php?id=42896><https://dev.openwrt.org/ticket/13203>
...
(The AT8035 is also used in some of the new HomePlug AV2 equipment) 

As far as I can tell, the defaults only seem to work for the 100mbps.
This makes sense, since the F1E has different PLLs and rx/tx delay
settings. Fixing this "globally" might actually be a good thing. At
least I'll give it a try.
> > I'm interested to hear from any other devices which have 
> > a AT803x. Also, please let me know where to get the GPL
> > tars for the devices.
> 
> I have forwarded your mail to the person which is handling the actual
> firmware builds of the OM5P-AN. He will contact you later and provide
> the sources.
Oh, that might be helpful yes. I can also post the sources from Western
Digital's S17_SSDK [The Ethernet driver SDK is part of their GPL.tar.gz].

Thanks,
  Christian
Chris July 14, 2015, 10:35 p.m. UTC | #2
Not sure if this helps, but the recent Cisco Meraki AP drop includes a
newish PHY driver for the at8033/8035.

The GPL source is at
http://dl.meraki.net/linux/meraki-firmware-sources-r23-20150601.tar.bz2 and
a copy of the actual file can be found at
https://github.com/riptidewave93/meraki-linux/blob/linux-3.4-r23-20150601/drivers/net/ethernet/atheros/ag7240/phys/athr_ar8033_phy.c

On Mon, Jul 6, 2015 at 6:26 AM, Christian Lamparter <chunkeey@googlemail.com
> wrote:

> On Friday, July 03, 2015 12:29:32 PM Sven Eckelmann wrote:
> > > Sven, I've seen you did quite a bit of work on the OM5P-AN
> > > with the same F1E-PHY. I went through the code from Atheros'
> > > original driver SDK v17 (which was part of the GPL source
> > > for the mynet). Apart from tx delay handling, everything
> > > else matches perfectly, so the tx/rx delay code should be
> > > enabled by default for this chip on all platforms without
> > > needing any platform data overwrites.
> >
> > Interesting, I don't have the SDK and I am only using OpenWrt. So I
> cannot say
> > anything about the Atheros driver. But sounds like you are right. At
> least all
> > recent devices I had in my hand required this change.
> Oh, I figured you have access to some of Qualcomm Atheros' SDKs
> and most importantly: some documentations from Qualcomm Atheros?
> I hoped you could confirm or disconfirm based on the documents.
>
> > I also saw your fixup_rgmii_tx_delay delay change. I would ack when you
> submit
> > a change like this. Just don't forget that the pll_1000 has also to be
> changed
> > (especially when you move the default values to at803x.c):
> >
> > --- a/target/linux/ar71xx/files/arch/mips/ath79/mach-om5p.c
> > +++ b/target/linux/ar71xx/files/arch/mips/ath79/mach-om5p.c
> > @@ -154,7 +154,8 @@ static struct i2c_board_info om5pan_i2c_devs[]
> __initdata = {
> >  static struct at803x_platform_data om5p_an_at803x_data = {
> >       .disable_smarteee = 1,
> >       .enable_rgmii_rx_delay = 1,
> > -     .enable_rgmii_tx_delay = 1,
> > +     .enable_rgmii_tx_delay = 0,
> > +     .fixup_rgmii_tx_delay = 1,
> >  };
> >
> >  static struct mdio_board_info om5p_an_mdio0_info[] = {
> > @@ -201,7 +202,7 @@ static void __init om5p_an_setup(void)
> >       ath79_eth0_data.phy_if_mode = PHY_INTERFACE_MODE_RGMII;
> >       ath79_eth0_data.mii_bus_dev = &ath79_mdio0_device.dev;
> >       ath79_eth0_data.phy_mask = BIT(7);
> > -     ath79_eth0_pll_data.pll_1000 = 0x02000000;
> > +     ath79_eth0_pll_data.pll_1000 = 0x0e000000;
> >       ath79_eth0_pll_data.pll_100 = 0x00000101;
> >       ath79_eth0_pll_data.pll_10 = 0x00001313;
> >       ath79_register_eth(0);
> >
> >
> > I don't have the equipment to check the actual signals (as you said in an
> > earlier mail) - only some people which complained very loud when their
> long
> > cable setup didn't work. :)
> Long cable... That's a good point, I think I never tested the AT8035 with
> anything beyond a 5m / 16 feet cable.
>
> > Maybe there are some non atheros SoC's out there which would have
> problems
> > when you would change the default behavior of the AT8035.
> I've been looking around for devices with an AT8035. And I found a few
> gems.
>
> <
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-September/200978.html
> >
> <https://forum.openwrt.org/viewtopic.php?id=42896><
> https://dev.openwrt.org/ticket/13203>
> ...
> (The AT8035 is also used in some of the new HomePlug AV2 equipment)
>
> As far as I can tell, the defaults only seem to work for the 100mbps.
> This makes sense, since the F1E has different PLLs and rx/tx delay
> settings. Fixing this "globally" might actually be a good thing. At
> least I'll give it a try.
> > > I'm interested to hear from any other devices which have
> > > a AT803x. Also, please let me know where to get the GPL
> > > tars for the devices.
> >
> > I have forwarded your mail to the person which is handling the actual
> > firmware builds of the OM5P-AN. He will contact you later and provide
> > the sources.
> Oh, that might be helpful yes. I can also post the sources from Western
> Digital's S17_SSDK [The Ethernet driver SDK is part of their GPL.tar.gz].
>
> Thanks,
>   Christian
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
>
Christian Lamparter July 14, 2015, 11:56 p.m. UTC | #3
On Tuesday, July 14, 2015 05:35:13 PM Chris Blake wrote:
> Not sure if this helps, but the recent Cisco Meraki AP drop includes a
> newish PHY driver for the at8033/8035.

> The GPL source is at
> http://dl.meraki.net/linux/meraki-firmware-sources-r23-20150601.tar.bz2 and
> a copy of the actual file can be found at
> https://github.com/riptidewave93/meraki-linux/blob/linux-3.4-r23-20150601/drivers/net/ethernet/atheros/ag7240/phys/athr_ar8033_phy.c

Thank you both!

For the record AR8033 seems to be a different/new chip with new quirks.
Nevertheless this SDK contains the phy code for the AR8035 as well. It's
just in a different file: athrf1_phy.c [0] (8035's SmartEEE mentioned
in line 242).

The rx/tx delay switching is done in athr_phy_speed as well (line 333+).
What's more: This is very much the same code I have seen in Western
Digital's source too, which was much older. I think at this point it's
save to assume that the AR8035 code was stable and this chip always
needed the special tx/rx delay treatment. I'll make the changes and
post them here (probably next week) and probably on linux-net (as this
isn't device specific behavior and I think it should be in the mainline
as well). 
 
Regards,
  Christian

[0] <https://github.com/riptidewave93/meraki-linux/blob/linux-3.4-r23-20150601/drivers/net/ethernet/atheros/ag7240/phys/athrf1_phy.c>
Sven Eckelmann July 20, 2015, 9:13 a.m. UTC | #4
On Monday 06 July 2015 13:26:21 Christian Lamparter wrote:
> On Friday, July 03, 2015 12:29:32 PM Sven Eckelmann wrote:
> > > Sven, I've seen you did quite a bit of work on the OM5P-AN
> > > with the same F1E-PHY. I went through the code from Atheros'
> > > original driver SDK v17 (which was part of the GPL source
> > > for the mynet). Apart from tx delay handling, everything
> > > else matches perfectly, so the tx/rx delay code should be
> > > enabled by default for this chip on all platforms without
> > > needing any platform data overwrites.
> > 
> > Interesting, I don't have the SDK and I am only using OpenWrt. So I cannot say
> > anything about the Atheros driver. But sounds like you are right. At least all
> > recent devices I had in my hand required this change.
> Oh, I figured you have access to some of Qualcomm Atheros' SDKs
> and most importantly: some documentations from Qualcomm Atheros? 
> I hoped you could confirm or disconfirm based on the documents.

No, we don't have access to them. The tarball from Marek is the thing we
have. And it doesn't contain information from Atheros beside the stuff which
is already in OpenWrt.

> > I don't have the equipment to check the actual signals (as you said in an
> > earlier mail) - only some people which complained very loud when their long
> > cable setup didn't work. 
> Long cable... That's a good point, I think I never tested the AT8035 with
> anything beyond a 5m / 16 feet cable.  

Especially problematic devices seem to be (active/passive) POE switches + long
cables.

Kind regards,
	Sven
Christian Lamparter July 20, 2015, 11:24 a.m. UTC | #5
On 7/20/15, Sven Eckelmann <sven@open-mesh.com> wrote:
> On Monday 06 July 2015 13:26:21 Christian Lamparter wrote:
>> Oh, I figured you have access to some of Qualcomm Atheros' SDKs
>> and most importantly: some documentations from Qualcomm Atheros?
>> I hoped you could confirm or disconfirm based on the documents.
>
> No, we don't have access to them. The tarball from Marek is the thing we
> have. And it doesn't contain information from Atheros beside the stuff
> which is already in OpenWrt.
Ok, I can leave it there. In the mean time: I'm looking at what Chris send me
about the AT8035 in his QCA9557/8 platform. Apparently, it's the same chip
it says AT8035-A on the chip (and it shares the same phyid 0x004dd072).
However, the Atheros' SDK calls it AT8033 too.

Thing is, this chip no longer needs any workarounds, but it might need some
extra init code now. (Currently, it isn't working yet, but we'll try
to get it to work.)

Regards,
    Christian
diff mbox

Patch

--- a/target/linux/ar71xx/files/arch/mips/ath79/mach-om5p.c
+++ b/target/linux/ar71xx/files/arch/mips/ath79/mach-om5p.c
@@ -154,7 +154,8 @@  static struct i2c_board_info om5pan_i2c_devs[] __initdata = {
 static struct at803x_platform_data om5p_an_at803x_data = {
 	.disable_smarteee = 1,
 	.enable_rgmii_rx_delay = 1,
-	.enable_rgmii_tx_delay = 1,
+	.enable_rgmii_tx_delay = 0,
+	.fixup_rgmii_tx_delay = 1,
 };
 
 static struct mdio_board_info om5p_an_mdio0_info[] = {
@@ -201,7 +202,7 @@  static void __init om5p_an_setup(void)
 	ath79_eth0_data.phy_if_mode = PHY_INTERFACE_MODE_RGMII;
 	ath79_eth0_data.mii_bus_dev = &ath79_mdio0_device.dev;
 	ath79_eth0_data.phy_mask = BIT(7);
-	ath79_eth0_pll_data.pll_1000 = 0x02000000;
+	ath79_eth0_pll_data.pll_1000 = 0x0e000000;
 	ath79_eth0_pll_data.pll_100 = 0x00000101;
 	ath79_eth0_pll_data.pll_10 = 0x00001313;
 	ath79_register_eth(0);