diff mbox series

[OpenWrt-Devel] ramips: gsw_mt7621: disable PORT 5 MAC RX/TX flow control by default

Message ID 20200211101741.17350-1-ynezz@true.cz
State Accepted
Delegated to: Petr Štetiar
Headers show
Series [OpenWrt-Devel] ramips: gsw_mt7621: disable PORT 5 MAC RX/TX flow control by default | expand

Commit Message

Petr Štetiar Feb. 11, 2020, 10:17 a.m. UTC
Looking at the current upstream driver implementation, it seems like the
TX/RX flow control is enabled only if the flow control pause option is
resolved from the device/link partner advertisements (or otherwise set).

On the other hand, our current in-tree driver force enables TX/RX
flow control by default, thus possibly leading to TX timeouts if the
other end sends pause frames (which are not properly handled?):

 WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x1ac/0x324
 NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out

Disabling the flow control on PORT 5 MAC seems to fix this issues as the
pause frames are then filtered out. While at it, I'm removing the if
condition completely as suggested, since this code is run only on mt7621
SoC, so there is no need to check for the silicon revisions.

Ref: https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
Ref: https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/50000/12
Suggested-by: Felix Fietkau <nbd@nbd.name>
Reported-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Petr Štetiar <ynezz@true.cz>
---
 .../drivers/net/ethernet/mediatek/gsw_mt7621.c       | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

Comments

Rosen Penev Feb. 11, 2020, 10:56 a.m. UTC | #1
Sent from my iPhone

> On Feb 11, 2020, at 2:17 AM, Petr Štetiar <ynezz@true.cz> wrote:
> 
> Looking at the current upstream driver implementation, it seems like the
> TX/RX flow control is enabled only if the flow control pause option is
> resolved from the device/link partner advertisements (or otherwise set).
> 
> On the other hand, our current in-tree driver force enables TX/RX
> flow control by default, thus possibly leading to TX timeouts if the
> other end sends pause frames (which are not properly handled?):
> 
> WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x1ac/0x324
> NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
> 
> Disabling the flow control on PORT 5 MAC seems to fix this issues as the
> pause frames are then filtered out. While at it, I'm removing the if
> condition completely as suggested, since this code is run only on mt7621
> SoC, so there is no need to check for the silicon revisions.
Sounds good.
> 
> Ref: https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
> Ref: https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/50000/12
> Suggested-by: Felix Fietkau <nbd@nbd.name>
> Reported-by: Rosen Penev <rosenp@gmail.com>
> Signed-off-by: Petr Štetiar <ynezz@true.cz>
> ---
> .../drivers/net/ethernet/mediatek/gsw_mt7621.c       | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
> index 89be23900738..232bcd8cf4ea 100644
> --- a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
> +++ b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
> @@ -98,15 +98,9 @@ static void mt7621_hw_init(struct mt7620_gsw *gsw, struct device_node *np)
>    mt7530_mdio_w32(gsw, 0x7000, 0x3);
>    usleep_range(10, 20);
> 
> -    if ((rt_sysc_r32(SYSC_REG_CHIP_REV_ID) & 0xFFFF) == 0x0101) {
> -        /* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
> -        mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
> -        mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
> -    } else {
> -        /* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
> -        mtk_switch_w32(gsw, 0x2305e33b, GSW_REG_MAC_P0_MCR);
> -        mt7530_mdio_w32(gsw, 0x3600, 0x5e33b);
> -    }
> +    /* (GE1, Force 1000M/FD, FC OFF, MAX_RX_LENGTH 1536) */
> +    mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
> +    mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
> 
>    /* (GE2, Link down) */
>    mtk_switch_w32(gsw, 0x8000, GSW_REG_MAC_P1_MCR);
Thomas Richard via openwrt-devel Feb. 11, 2020, 3:59 p.m. UTC | #2
The sender domain has a DMARC Reject/Quarantine policy which disallows
sending mailing list messages using the original "From" header.

To mitigate this problem, the original message has been wrapped
automatically by the mailing list software.
Hi Petr,

Quoting Petr Štetiar <ynezz@true.cz>:

> Looking at the current upstream driver implementation, it seems like the
> TX/RX flow control is enabled only if the flow control pause option is
> resolved from the device/link partner advertisements (or otherwise set).
>

With upstream you mean mainline 5.5 kernel?
In the DTS the CPU port is defined as a fixed-link with pause enabled.
So the pause bits are always resolved/set.

I can't tell if the mainline kernel driver encounters the same issue.
I don't use my device daily.

I still wonder if there is any race condition in the TX path/handling.
I see in the mainline kernel version they are using the same lock in  
tx_map and
tx_poll. And also checking the DMA owner bit in the descriptor.

So I think that the following is happening.

The hardware can't send the frames fast enough because of the PAUSE frames,
maybe to a slower device? The CPU is filling the tx ring faster then the
hardware sending it and eventually CPU overtakes the DMA head. Which causes an
issue/race/deadlock.

By disabling the PAUSE/flow control, it is less likely that is overtake is
going happen. But may result in more packet loss.

> On the other hand, our current in-tree driver force enables TX/RX
> flow control by default, thus possibly leading to TX timeouts if the
> other end sends pause frames (which are not properly handled?):
>
> WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:320  
> dev_watchdog+0x1ac/0x324
> NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
>
> Disabling the flow control on PORT 5 MAC seems to fix this issues as the
> pause frames are then filtered out. While at it, I'm removing the if
> condition completely as suggested, since this code is run only on mt7621
> SoC, so there is no need to check for the silicon revisions.
>

Port 6 is connected to the first GMAC of the SOC, not port 5.
So it should be PORT 6 in your description also you change the first GMAC
settings, maybe mention that too.

Greats,

René

> Ref:  
> https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
> Ref:  
> https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/50000/12
> Suggested-by: Felix Fietkau <nbd@nbd.name>
> Reported-by: Rosen Penev <rosenp@gmail.com>
> Signed-off-by: Petr Štetiar <ynezz@true.cz>
> ---
> .../drivers/net/ethernet/mediatek/gsw_mt7621.c       | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git  
> a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c  
> b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
> index 89be23900738..232bcd8cf4ea 100644
> ---  
> a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
> +++  
> b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
> @@ -98,15 +98,9 @@ static void mt7621_hw_init(struct mt7620_gsw  
> *gsw, struct device_node *np)
> 	mt7530_mdio_w32(gsw, 0x7000, 0x3);
> 	usleep_range(10, 20);
>
> -	if ((rt_sysc_r32(SYSC_REG_CHIP_REV_ID) & 0xFFFF) == 0x0101) {
> -		/* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
> -		mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
> -		mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
> -	} else {
> -		/* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
> -		mtk_switch_w32(gsw, 0x2305e33b, GSW_REG_MAC_P0_MCR);
> -		mt7530_mdio_w32(gsw, 0x3600, 0x5e33b);
> -	}
> +	/* (GE1, Force 1000M/FD, FC OFF, MAX_RX_LENGTH 1536) */
> +	mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
> +	mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
>
> 	/* (GE2, Link down) */
> 	mtk_switch_w32(gsw, 0x8000, GSW_REG_MAC_P1_MCR);
>
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Petr Štetiar Feb. 11, 2020, 7:50 p.m. UTC | #3
René van Dorst via openwrt-devel <openwrt-devel@lists.openwrt.org> [2020-02-11 15:59:52]:

[adding Kristian to the Cc: loop]

> > Looking at the current upstream driver implementation, it seems like the
> > TX/RX flow control is enabled only if the flow control pause option is
> > resolved from the device/link partner advertisements (or otherwise set).
> > 
> 
> With upstream you mean mainline 5.5 kernel?

Yes, 5.6-rc1

> In the DTS the CPU port is defined as a fixed-link with pause enabled.
> So the pause bits are always resolved/set.

I see[1] port@6 with fixed-link, without pause property.

> The hardware can't send the frames fast enough because of the PAUSE frames,
> maybe to a slower device? The CPU is filling the tx ring faster then the
> hardware sending it and eventually CPU overtakes the DMA head. Which causes an
> issue/race/deadlock.

Probably, I didn't tried to reproduce it or planning to do so.

> > Disabling the flow control on PORT 5 MAC seems to fix this issues as the
> > pause frames are then filtered out. While at it, I'm removing the if
> > condition completely as suggested, since this code is run only on mt7621
> > SoC, so there is no need to check for the silicon revisions.
> > 
> 
> Port 6 is connected to the first GMAC of the SOC, not port 5.
> So it should be PORT 6 in your description also 

Ok, I took that "PMCR_P5/PORT 5 MAC Control Register" from
MT7621_ProgrammingGuide_GSW_v01.pdf. Couldnt find anything about P6, it's
quite confusing.

1. https://elixir.bootlin.com/linux/v5.6-rc1/source/drivers/staging/mt7621-dts/mt7621.dtsi#L489

-- ynezz
Thomas Richard via openwrt-devel Feb. 11, 2020, 9:23 p.m. UTC | #4
The sender domain has a DMARC Reject/Quarantine policy which disallows
sending mailing list messages using the original "From" header.

To mitigate this problem, the original message has been wrapped
automatically by the mailing list software.
Quoting Petr Štetiar <ynezz@true.cz>:

> René van Dorst via openwrt-devel <openwrt-devel@lists.openwrt.org>  
> [2020-02-11 15:59:52]:
>
> [adding Kristian to the Cc: loop]
>
>> > Looking at the current upstream driver implementation, it seems like the
>> > TX/RX flow control is enabled only if the flow control pause option is
>> > resolved from the device/link partner advertisements (or otherwise set).
>> >
>>
>> With upstream you mean mainline 5.5 kernel?
>
> Yes, 5.6-rc1
>
>> In the DTS the CPU port is defined as a fixed-link with pause enabled.
>> So the pause bits are always resolved/set.
>
> I see[1] port@6 with fixed-link, without pause property.

Here is more information I wrote [1] about the mt7530 switch and it  
modes and also
the mt7621 example below. I used the pause because AFAIK there was/is  
no pause issue.

>
>> The hardware can't send the frames fast enough because of the PAUSE frames,
>> maybe to a slower device? The CPU is filling the tx ring faster then the
>> hardware sending it and eventually CPU overtakes the DMA head.  
>> Which causes an
>> issue/race/deadlock.
>
> Probably, I didn't tried to reproduce it or planning to do so.
>

I hope to reproduce it quick by reducing the TX ring size from 4096 to 256.

>> > Disabling the flow control on PORT 5 MAC seems to fix this issues as the
>> > pause frames are then filtered out. While at it, I'm removing the if
>> > condition completely as suggested, since this code is run only on mt7621
>> > SoC, so there is no need to check for the silicon revisions.
>> >
>>
>> Port 6 is connected to the first GMAC of the SOC, not port 5.
>> So it should be PORT 6 in your description also
>
> Ok, I took that "PMCR_P5/PORT 5 MAC Control Register" from
> MT7621_ProgrammingGuide_GSW_v01.pdf. Couldnt find anything about P6, it's
> quite confusing.

I totally understand that. Lack of complete documentation is an issue.

P6 = MAC6 in the documentation. AFAIK all the MACs do have the same register
layout. So MAC6 = PORT 6 and is connected to the first GMAC of the SOC.
PORT 5 = MAC5 and can be used as 2nd CPU port or connect to an external phy.
So MAC5, 2nd GMAC of the SOC and an external PHY share the same RGMII bus.
Depending on GPIO/PORT settings they are connected to the BUS. See  
also mt7530.txt
kernel doc for more info.

>
> 1.  
> https://elixir.bootlin.com/linux/v5.6-rc1/source/drivers/staging/mt7621-dts/mt7621.dtsi#L489
>
> -- ynezz
>
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/mailman/listinfo/openwrt-devel

[1]:  
https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/net/dsa/mt7530.txt#L38

Greats,

René
Thomas Richard via openwrt-devel Feb. 12, 2020, 12:04 a.m. UTC | #5
The sender domain has a DMARC Reject/Quarantine policy which disallows
sending mailing list messages using the original "From" header.

To mitigate this problem, the original message has been wrapped
automatically by the mailing list software.
Quoting René van Dorst <opensource@vdorst.com>:

> Quoting Petr Štetiar <ynezz@true.cz>:
>
>> René van Dorst via openwrt-devel <openwrt-devel@lists.openwrt.org>  
>> [2020-02-11 15:59:52]:
>>
>> [adding Kristian to the Cc: loop]
>>
>>>> Looking at the current upstream driver implementation, it seems like the
>>>> TX/RX flow control is enabled only if the flow control pause option is
>>>> resolved from the device/link partner advertisements (or otherwise set).
>>>>
>>>
>>> With upstream you mean mainline 5.5 kernel?
>>
>> Yes, 5.6-rc1
>>
>>> In the DTS the CPU port is defined as a fixed-link with pause enabled.
>>> So the pause bits are always resolved/set.
>>
>> I see[1] port@6 with fixed-link, without pause property.
>
> Here is more information I wrote [1] about the mt7530 switch and it  
> modes and also
> the mt7621 example below. I used the pause because AFAIK there  
> was/is no pause issue.
>
>>
>>> The hardware can't send the frames fast enough because of the PAUSE frames,
>>> maybe to a slower device? The CPU is filling the tx ring faster then the
>>> hardware sending it and eventually CPU overtakes the DMA head.  
>>> Which causes an
>>> issue/race/deadlock.
>>
>> Probably, I didn't tried to reproduce it or planning to do so.
>>
>
> I hope to reproduce it quick by reducing the TX ring size from 4096 to 256.
>

Hi Petr,

I thing I looking in the right direction.
With this patch, reducing the tx_ring size to 4 * 4 = 16 packets, I can easily
trigger a transmit timeout.
Just create some traffic.


rene@pc-rene:~/dev/openwrt$ git diff
diff --git  
a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c  
b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index c6f4d90193b3..09d5e75ec0b0 100644
---  
a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++  
b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1658,7 +1658,7 @@ static int fe_probe(struct platform_device *pdev)
         priv->msg_enable = netif_msg_init(fe_msg_level,  
FE_DEFAULT_MSG_ENABLE);
         priv->rx_ring.frag_size = fe_max_frag_size(ETH_DATA_LEN);
         priv->rx_ring.rx_buf_size = fe_max_buf_size(priv->rx_ring.frag_size);
-       priv->tx_ring.tx_ring_size = NUM_DMA_DESC;
+       priv->tx_ring.tx_ring_size = 4;
         priv->rx_ring.rx_ring_size = NUM_DMA_DESC;
         INIT_WORK(&priv->pending_work, fe_pending_work);
         u64_stats_init(&priv->hw_stats->syncp);

Greats,

René

>>>> Disabling the flow control on PORT 5 MAC seems to fix this issues as the
>>>> pause frames are then filtered out. While at it, I'm removing the if
>>>> condition completely as suggested, since this code is run only on mt7621
>>>> SoC, so there is no need to check for the silicon revisions.
>>>>
>>>
>>> Port 6 is connected to the first GMAC of the SOC, not port 5.
>>> So it should be PORT 6 in your description also
>>
>> Ok, I took that "PMCR_P5/PORT 5 MAC Control Register" from
>> MT7621_ProgrammingGuide_GSW_v01.pdf. Couldnt find anything about P6, it's
>> quite confusing.
>
> I totally understand that. Lack of complete documentation is an issue.
>
> P6 = MAC6 in the documentation. AFAIK all the MACs do have the same register
> layout. So MAC6 = PORT 6 and is connected to the first GMAC of the SOC.
> PORT 5 = MAC5 and can be used as 2nd CPU port or connect to an external phy.
> So MAC5, 2nd GMAC of the SOC and an external PHY share the same RGMII bus.
> Depending on GPIO/PORT settings they are connected to the BUS. See  
> also mt7530.txt
> kernel doc for more info.
>
>>
>> 1.  
>> https://elixir.bootlin.com/linux/v5.6-rc1/source/drivers/staging/mt7621-dts/mt7621.dtsi#L489
>>
>> -- ynezz
>>
>> _______________________________________________
>> openwrt-devel mailing list
>> openwrt-devel@lists.openwrt.org
>> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
>
> [1]:  
> https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/net/dsa/mt7530.txt#L38
>
> Greats,
>
> René
Thomas Richard via openwrt-devel Feb. 13, 2020, 3:14 p.m. UTC | #6
The sender domain has a DMARC Reject/Quarantine policy which disallows
sending mailing list messages using the original "From" header.

To mitigate this problem, the original message has been wrapped
automatically by the mailing list software.
Quoting René van Dorst <opensource@vdorst.com>:

> Quoting René van Dorst <opensource@vdorst.com>:
>
>> Quoting Petr Štetiar <ynezz@true.cz>:
>>
>>> René van Dorst via openwrt-devel <openwrt-devel@lists.openwrt.org>  
>>> [2020-02-11 15:59:52]:
>>>
>>> [adding Kristian to the Cc: loop]
>>>
>>>>> Looking at the current upstream driver implementation, it seems like the
>>>>> TX/RX flow control is enabled only if the flow control pause option is
>>>>> resolved from the device/link partner advertisements (or otherwise set).
>>>>>
>>>>
>>>> With upstream you mean mainline 5.5 kernel?
>>>
>>> Yes, 5.6-rc1
>>>
>>>> In the DTS the CPU port is defined as a fixed-link with pause enabled.
>>>> So the pause bits are always resolved/set.
>>>
>>> I see[1] port@6 with fixed-link, without pause property.
>>
>> Here is more information I wrote [1] about the mt7530 switch and it  
>> modes and also
>> the mt7621 example below. I used the pause because AFAIK there  
>> was/is no pause issue.
>>
>>>
>>>> The hardware can't send the frames fast enough because of the  
>>>> PAUSE frames,
>>>> maybe to a slower device? The CPU is filling the tx ring faster then the
>>>> hardware sending it and eventually CPU overtakes the DMA head.  
>>>> Which causes an
>>>> issue/race/deadlock.
>>>
>>> Probably, I didn't tried to reproduce it or planning to do so.
>>>
>>
>> I hope to reproduce it quick by reducing the TX ring size from 4096 to 256.
>>
>
> Hi Petr,
>
> I thing I looking in the right direction.
> With this patch, reducing the tx_ring size to 4 * 4 = 16 packets, I  
> can easily
> trigger a transmit timeout.
> Just create some traffic.
>

Hi Petr,

Although it triggers the timeout, it seems the driver/functions don't expected
such small dma sizes. With some extra debug code added, I see that only the
first 2 entries are filled and a timeout hits. By increasing the size  
to 8*4=32
packets the drivers works as expected.
I tried to flood the device with packets. Iperf3 (LAN->WAN) through the device
with and in parallel iperf3 from device to LAN. After more then +500GBytes it
was still going.

So I am not sure about my theory.

Maybe we could add some extra debug code in the timeout().
FE can set the flag "DMA DONE" for every packet that was send.
Debug code should show that bit for every entry.
Maybe this gives a bit more info.

Greats,

René

>
> rene@pc-rene:~/dev/openwrt$ git diff
> diff --git  
> a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c  
> b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index c6f4d90193b3..09d5e75ec0b0 100644
> ---  
> a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++  
> b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1658,7 +1658,7 @@ static int fe_probe(struct platform_device *pdev)
>         priv->msg_enable = netif_msg_init(fe_msg_level,  
> FE_DEFAULT_MSG_ENABLE);
>         priv->rx_ring.frag_size = fe_max_frag_size(ETH_DATA_LEN);
>         priv->rx_ring.rx_buf_size = fe_max_buf_size(priv->rx_ring.frag_size);
> -       priv->tx_ring.tx_ring_size = NUM_DMA_DESC;
> +       priv->tx_ring.tx_ring_size = 4;
>         priv->rx_ring.rx_ring_size = NUM_DMA_DESC;
>         INIT_WORK(&priv->pending_work, fe_pending_work);
>         u64_stats_init(&priv->hw_stats->syncp);
>
> Greats,
>
> René
>
>>>>> Disabling the flow control on PORT 5 MAC seems to fix this issues as the
>>>>> pause frames are then filtered out. While at it, I'm removing the if
>>>>> condition completely as suggested, since this code is run only on mt7621
>>>>> SoC, so there is no need to check for the silicon revisions.
>>>>>
>>>>
>>>> Port 6 is connected to the first GMAC of the SOC, not port 5.
>>>> So it should be PORT 6 in your description also
>>>
>>> Ok, I took that "PMCR_P5/PORT 5 MAC Control Register" from
>>> MT7621_ProgrammingGuide_GSW_v01.pdf. Couldnt find anything about P6, it's
>>> quite confusing.
>>
>> I totally understand that. Lack of complete documentation is an issue.
>>
>> P6 = MAC6 in the documentation. AFAIK all the MACs do have the same register
>> layout. So MAC6 = PORT 6 and is connected to the first GMAC of the SOC.
>> PORT 5 = MAC5 and can be used as 2nd CPU port or connect to an external phy.
>> So MAC5, 2nd GMAC of the SOC and an external PHY share the same RGMII bus.
>> Depending on GPIO/PORT settings they are connected to the BUS. See  
>> also mt7530.txt
>> kernel doc for more info.
>>
>>>
>>> 1.  
>>> https://elixir.bootlin.com/linux/v5.6-rc1/source/drivers/staging/mt7621-dts/mt7621.dtsi#L489
>>>
>>> -- ynezz
>>>
>>> _______________________________________________
>>> openwrt-devel mailing list
>>> openwrt-devel@lists.openwrt.org
>>> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
>>
>> [1]:  
>> https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/net/dsa/mt7530.txt#L38
>>
>> Greats,
>>
>> René
Petr Štetiar Feb. 13, 2020, 3:43 p.m. UTC | #7
René van Dorst via openwrt-devel <openwrt-devel@lists.openwrt.org> [2020-02-13 15:14:09]:

Hi,

> Maybe we could add some extra debug code in the timeout().  FE can set the
> flag "DMA DONE" for every packet that was send.  Debug code should show that
> bit for every entry.  Maybe this gives a bit more info.

I'm all in to fix this properly as that proposed patch still feels like
workaround to me.

There seems to be a lot of users interested/affected/able to test it on the
forum[1], so it would be really nice from you if you could find some spare
time to fix it.

I mean prepare some debug/fix patch, post it on that forum topic, wait for the
response, iterate...

Thanks for taking care!

1. https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573

-- ynezz
Kristian Evensen Feb. 14, 2020, 10:13 a.m. UTC | #8
Hi everyone,

I am sorry for my late reply to this thread. My email provider flagged
it as spam, so I only saw the conversation now. It seems that you have
reached a conclusion on how to proceed, but I thought I should anyway
share my notes/observations on this issue (in case they can be
useful).

My employer has a large number of Mediatek-based (mt7620 and mt7621)
routers in production. Most routers have a minimum of two internet
connections - one fixed and one using mobile broadband. Some time in
2017 we started receiving reports from a few customers that the switch
would stop working. The link was up, but no data would go through.
Looking at the logs, we could always see the "TX timeout" error
message and we started to look for a cause.

We quickly ruled out any kind of crash, as the LTE was still up and
wifi worked fine. After getting a few of these reports, we started
looking for things that were common between the different
installations. We struggled to find any, there were all sorts of
devices connected to the different ports on the routers. The only
thing the different cases had in common, was that the problem
disappeared when whatever was connected to the WAN port was
disconnected. However, again, the equipment that provided the fixed
connection came from all sorts of vendors.

After scratching our heads for a while and not getting anywhere, I
asked here on the mailing list and was told that restarting networking
should at least make the switch works fine again. We added a watchdog
doing exactly that when the TX timeout message would appear.
Restarting networking improved the situation considerably, but the
switch would still sometimes get stuck and never recover.

This triggered us to make a second attempt at recreating the problem.
Our test was the same as what Rene described. We assumed the problem
had something to do with sending large amounts of traffic and at a
high speed, so we used iperf3 as a traffic generator and sent traffic
between different machines connected to the switch. One of these
machines were quite unstable and prone to crash, and we noticed that
whenever that machine would crash the TX timeout issue would trigger
and no traffic would pass through the switch.

A normal packet capture didn't reveal anything interesting, but
connecting a network tap did. Looking at the packets captured from the
tap, we could see a flood of pause frames from the crashed machine.
When this flood occurred, the switch stopped transmitting packets on
all the ports and not just the one that the crashed machine was
connected to. This caught us by surprise, but doing some research it
seems to be a common behavior among "normal" switches. Also, if we
waited long enough, the switch would never recover.

After discovering that pause frames seemed to be at least one trigger
for TX timeout, we added support to the driver for enabling/disabling
flow control on each of the ports + an init script that does the
disabling. Since we deployed this change on our routers, we have not
had a single report about switches that stop working. We do sometimes
still see the "TX timeout" error, but it is no longer critical.

We never tried to disable flow control on the CPU-port only, which
seems like a more elegant approach than disabling each port
individually. I do agree that disabling pause frames is more a
work-around than a solution, but it has at least eradicated the
problem for us. I never got around to submitting our patch, but if
anyone would find it useful I can do it quite soon.

BR,
Kristian
diff mbox series

Patch

diff --git a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
index 89be23900738..232bcd8cf4ea 100644
--- a/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
+++ b/target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
@@ -98,15 +98,9 @@  static void mt7621_hw_init(struct mt7620_gsw *gsw, struct device_node *np)
 	mt7530_mdio_w32(gsw, 0x7000, 0x3);
 	usleep_range(10, 20);
 
-	if ((rt_sysc_r32(SYSC_REG_CHIP_REV_ID) & 0xFFFF) == 0x0101) {
-		/* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
-		mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
-		mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
-	} else {
-		/* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
-		mtk_switch_w32(gsw, 0x2305e33b, GSW_REG_MAC_P0_MCR);
-		mt7530_mdio_w32(gsw, 0x3600, 0x5e33b);
-	}
+	/* (GE1, Force 1000M/FD, FC OFF, MAX_RX_LENGTH 1536) */
+	mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
+	mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
 
 	/* (GE2, Link down) */
 	mtk_switch_w32(gsw, 0x8000, GSW_REG_MAC_P1_MCR);