diff mbox

[OpenWrt-Devel,RFC] ag71xx: Enable flow control for builtin switch port

Message ID 1460981742-17048-1-git-send-email-sven@open-mesh.com
State RFC
Headers show

Commit Message

Sven Eckelmann April 18, 2016, 12:15 p.m. UTC
From: Sven Eckelmann <sven.eckelmann@open-mesh.com>

The port connected to the internal switch tends to drop a lot of packets
when a lot of data is transferred over it. This is especially visible when
IP fragmentation happens for large UDP and ICMP packets. An easy test for
this is

    # works
    ping 192.168.1.23
    # doesn't work
    ping -s 65507 192.168.1.23

But enabling flow control on ports without the builtin switch seems to
break the connection completely in some situations. This was for example
detected when a QCA955x with AR98533 on GMAC1 was started with an active
link on this port. So only enable it for SoC ethernet devices with
switch attached.

This closes #19498

Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
---
This is only an RFC because Felix reverted a similar change in r27034.
Unfortunately, the revert commit doesn't contain enough information to
find out if this problem would also happen when only activating flow
control on the switch port port or when only enabling TX and not RX
flow control (or the other way around).

 .../ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c  | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Conn O'Griofa April 19, 2016, 3:04 p.m. UTC | #1
On Mon, Apr 18, 2016 at 1:15 PM, Sven Eckelmann <
sven.eckelmann@open-mesh.com> wrote:

> From: Sven Eckelmann <sven.eckelmann@open-mesh.com>
>
> The port connected to the internal switch tends to drop a lot of packets
> when a lot of data is transferred over it. This is especially visible when
> IP fragmentation happens for large UDP and ICMP packets. An easy test for
> this is
>
>     # works
>     ping 192.168.1.23
>     # doesn't work
>     ping -s 65507 192.168.1.23
>
> But enabling flow control on ports without the builtin switch seems to
> break the connection completely in some situations. This was for example
> detected when a QCA955x with AR98533 on GMAC1 was started with an active
> link on this port. So only enable it for SoC ethernet devices with
> switch attached.
>
> This closes #19498
>
> Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
> ---
> This is only an RFC because Felix reverted a similar change in r27034.
> Unfortunately, the revert commit doesn't contain enough information to
> find out if this problem would also happen when only activating flow
> control on the switch port port or when only enabling TX and not RX
> flow control (or the other way around).
>
>  .../ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c  | 6
> +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git
> a/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
> b/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
> index 0832059..208b1d6 100644
> ---
> a/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
> +++
> b/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
> @@ -455,7 +455,11 @@ static void ag71xx_hw_setup(struct ag71xx *ag)
>         struct ag71xx_platform_data *pdata = ag71xx_get_pdata(ag);
>
>         /* setup MAC configuration registers */
> -       ag71xx_wr(ag, AG71XX_REG_MAC_CFG1, MAC_CFG1_INIT);
> +       if (pdata->is_ar724x && pdata->switch_data)
> +               ag71xx_wr(ag, AG71XX_REG_MAC_CFG1,
> +                         MAC_CFG1_INIT | MAC_CFG1_TFC | MAC_CFG1_RFC);
> +       else
> +               ag71xx_wr(ag, AG71XX_REG_MAC_CFG1, MAC_CFG1_INIT);
>
>         ag71xx_sb(ag, AG71XX_REG_MAC_CFG2,
>                   MAC_CFG2_PAD_CRC_EN | MAC_CFG2_LEN_CHECK);
> --
> 2.8.0.rc3
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
>

It looks like you may have solved the original issue; this patch is working
fine on WR-842ND v1. More details:

* When I tried reverting r27034, the WAN port stopped functioning
correctly, showing a tx timeout  error - but the LAN ports still worked.
With your patch, all ports work correctly.
* Your testcase (ping -s 65507 from/to a wired client between LAN ports)
usually results in 20%-40% packet loss. With your patch (as with the
original), it's now typically 0% packet loss between LAN ports.

Conn
Sven Eckelmann April 19, 2016, 3:09 p.m. UTC | #2
On Tuesday 19 April 2016 16:04:48 Conn O'Griofa wrote:
[...]
> It looks like you may have solved the original issue; this patch is working
> fine on WR-842ND v1. More details:
> 
> * When I tried reverting r27034, the WAN port stopped functioning
> correctly, showing a tx timeout  error - but the LAN ports still worked.
> With your patch, all ports work correctly.
> * Your testcase (ping -s 65507 from/to a wired client between LAN ports)
> usually results in 20%-40% packet loss. With your patch (as with the
> original), it's now typically 0% packet loss between LAN ports.

There might be more bugs hidden in the hardware. Maybe some SoCs have problems 
with it or some setups with external switch.

But thanks a lot for your feedback.

Kind regards,
	Sven
Conn O'Griofa April 19, 2016, 5:24 p.m. UTC | #3
On Tue, Apr 19, 2016 at 4:09 PM, Sven Eckelmann <
sven.eckelmann@open-mesh.com> wrote:

> On Tuesday 19 April 2016 16:04:48 Conn O'Griofa wrote:
> [...]
> > It looks like you may have solved the original issue; this patch is
> working
> > fine on WR-842ND v1. More details:
> >
> > * When I tried reverting r27034, the WAN port stopped functioning
> > correctly, showing a tx timeout  error - but the LAN ports still worked.
> > With your patch, all ports work correctly.
> > * Your testcase (ping -s 65507 from/to a wired client between LAN ports)
> > usually results in 20%-40% packet loss. With your patch (as with the
> > original), it's now typically 0% packet loss between LAN ports.
>
> There might be more bugs hidden in the hardware. Maybe some SoCs have
> problems
> with it or some setups with external switch.
>
> But thanks a lot for your feedback.
>
> Kind regards,
>         Sven


Of course - but I would urge other SoCs to be tested (or at least, find a
way to enable it only on compatible chipsets). I'm currently using this
router as a dumb AP to serve 2.4GHz clients, and your patch greatly
improves connection quality due to the packet loss issue being cleared up.
I'll continue using your patch & will report if any issue arises related to
it.

Conn
diff mbox

Patch

diff --git a/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c b/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
index 0832059..208b1d6 100644
--- a/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
+++ b/target/linux/ar71xx/files/drivers/net/ethernet/atheros/ag71xx/ag71xx_main.c
@@ -455,7 +455,11 @@  static void ag71xx_hw_setup(struct ag71xx *ag)
 	struct ag71xx_platform_data *pdata = ag71xx_get_pdata(ag);
 
 	/* setup MAC configuration registers */
-	ag71xx_wr(ag, AG71XX_REG_MAC_CFG1, MAC_CFG1_INIT);
+	if (pdata->is_ar724x && pdata->switch_data)
+		ag71xx_wr(ag, AG71XX_REG_MAC_CFG1,
+			  MAC_CFG1_INIT | MAC_CFG1_TFC | MAC_CFG1_RFC);
+	else
+		ag71xx_wr(ag, AG71XX_REG_MAC_CFG1, MAC_CFG1_INIT);
 
 	ag71xx_sb(ag, AG71XX_REG_MAC_CFG2,
 		  MAC_CFG2_PAD_CRC_EN | MAC_CFG2_LEN_CHECK);