diff mbox series

[v7] bmips: bcm6368-enetsw: Bump max MTU

Message ID 20240124-bcm6368-mtu-v7-1-694c86d8d435@linaro.org
State Accepted
Delegated to: Álvaro Fernández
Headers show
Series [v7] bmips: bcm6368-enetsw: Bump max MTU | expand

Commit Message

Linus Walleij Jan. 24, 2024, 8:23 a.m. UTC
The safe max frame size for this ethernet switch is 1532 bytes,
excluding the DSA TAG and extra VLAN header, so the maximum
outgoing frame is 1542 bytes.

The available overhead is needed when using the DSA switch with
a cascaded Marvell DSA switch, which is something that exist in
real products, in this case the Inteno XG6846.

Use defines at the top of the size for max MTU so it is clear how
we think about this, add comments.

We need to adjust the RX buffer size to fit the new max frame size,
which is 1542 when the DSA tag (6 bytes) and VLAN header (4 extra
bytes) is added.

We also drop this default MTU:

  #define ENETSW_TAG_SIZE (6 + VLAN_HLEN)
  ndev->mtu = ETH_DATA_LEN + ENETSW_TAG_SIZE;

in favor of just:

  ndev->mtu = ETH_DATA_LEN;

I don't know why the default MTU is trying to second guess the
overhead required by DSA and VLAN but the framework will also
try to bump the MTU for e.g. DSA tags, and the VLAN overhead is
not supposed to be included in the MTU, so this is clearly not
right.

Before this patch (on the lan1 DSA port in this case):
dsa_slave_change_mtu: master->max_mtu = 9724, dev->max_mtu = 10218, DSA overhead = 8
dsa_slave_change_mtu: master = extsw, dev = lan1
dsa_slave_change_mtu: master->max_mtu = 1510, dev->max_mtu = 9724, DSA overhead = 6
dsa_slave_change_mtu: master = eth0, dev = extsw
dsa_slave_change_mtu new_master_mtu 1514 > mtu_limit 1510
mdio_mux-0.1:00: nonfatal error -34 setting MTU to 1500 on port 0

My added debug prints before the nonfatal error: the first switch from the top
is the Marvell switch, the second in the bcm6368-enetsw with its 1510 limit.

After this patch the error is gone.

OpenWrt adds a VLAN to each port so we get VLAN tags on all frames. On this
setup we even have 4 more bytes left after the two DSA tags and VLAN so
we can go all the way up to 1532 as MTU.

Testing the new 1532 MTU:

    eth0             ext1              enp7s0
 .--------.     .-----------.  cable  .------.
 | enetsw | <-> | mv88e6152 | <-----> | host |
 `--------´     `-----------´         `------´

On the router we set the max MTU for test:
ifconfig eth0 mtu 1520
ifconfig br-wan mtu 1520
ifconfig ext1 mtu 1506

An MTU of 1506 on ext1 is a logic consequence of the above setup:
this is the max bytes actually transferred. The framing added will be:

- 18 bytes standard ethernet header
- 4 bytes VLAN header
- 6 bytes DSA tag for enetsw
- 8 bytes DSA tag for mv88e6152

Sum: 1506 + 18 + 4 + 6 + 8 = 1542 which is out max frame size.

Test pinging from host:
ping -s 1478 -M do 192.168.1.220
PING 192.168.1.220 (192.168.1.220) 1478(1506) bytes of data.
1486 bytes from 192.168.1.220: icmp_seq=1 ttl=64 time=0.696 ms
1486 bytes from 192.168.1.220: icmp_seq=2 ttl=64 time=0.615 ms

Test pinging from router:
PING 192.168.1.2 (192.168.1.2): 1478 data bytes
1486 bytes from 192.168.1.2: seq=0 ttl=64 time=0.931 ms
1486 bytes from 192.168.1.2: seq=1 ttl=64 time=0.810 ms

The max IP packet without headers is 1478, the outgoing ICMP packet is
1506 bytes. Then the DSA, VLAN and ethernet overhead is added.

Let us verify the contents of the resulting ethernet frame of 1542 bytes.

Ping packet on router side as viewed with tcpdump:

00:54:51.900869 AF Unknown (1429722180), length 1538:
        0x0000:  3d93 bcae c56b a83d 8874 0300 0004 8100  =....k.=.t......
        0x0010:  0000 dada 0000 c020 0fff 0800 4500 05e2  ............E...
        0x0020:  0000 4000 4001 b0ec c0a8 0102 c0a8 01dc  ..@.@...........
        0x0030:  0800 7628 00c3 0001 f5da 1d65 0000 0000  ..v(.......e....
        0x0040:  ce65 0a00 0000 0000 1011 1213 1415 1617  .e..............
        0x0050:  1819 1a1b 1c1d 1e1f 2021 2223 2425 2627  .........!"#$%&'
        0x0060:  2829 2a2b 2c2d 2e2f 3031 3233 3435 3637  ()*+,-./0123456
(...)

- 3d93 = First four bytes are the last two bytes of the destination
  ethernet address I don't know why the first four are missing,
  but it sure explains why the paket is 1538 bytes and not 1542
  which is the actual max frame size.
- bcae c56b a83b = source ethernet address
- 8874 0300 0004 = Broadcom enetsw DSA tag
- 8100 0000 = VLAN 802.1Q header
- dada 0000 c020 0fff = EDSA tag for the Marvell (outer) switch,
- 0800 is the ethertype (part of the EDSA tag technically)
- Next follows the contents of the ping packet as it appears if
  we dump it on the DSA interface such as tcpdump -i lan1
  etc, there we get the stripped out packet, 1506 bytes.
- At the end 4 bytes of FCS.

This clearly illustrates that the DSA tag is included in the MTU
which we set up in Linux, but the VLAN tag and ethernet headers and
checksum is not.

Cc: Álvaro Fernández Rojas <noltari@gmail.com>
Cc: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Paul Donald <newtwen@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v6->v7:
- Rebase on master.
- Perhaps we can merge this now?
ChangeLog v5->v6:
- Rewrite thoroughly after discussing with Jonas that the safe
  MTU is probably 1532 without DSA tag and extra VLAN header.
- Reword commit message and redo tests so it is crystal clear what
  is going on, as illustrated by tcpdump.
ChangeLog v4->v5:
- Drop the confusing ENETSW_MTU_OVERHEAD altogether after discussing
  with Jonas.
ChangeLog v3->v4:
- Adjust the RX buffer size and we can use the max "jumbo"
  frame size 2048.
ChangeLog v2->v3:
- Make a more believable case for the max MTU with tcpdump
  example.
ChangeLog v1->v2:
- Do some better research after help on IRC, did some ping tests.
---
 .../drivers/net/ethernet/broadcom/bcm6368-enetsw.c | 23 +++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)


---
base-commit: 1b7e62b20b1735fcdc498a35e005afcd775abcf4
change-id: 20240124-bcm6368-mtu-d3b27dca1ca8

Best regards,

Comments

Álvaro Fernández Rojas Jan. 24, 2024, 7:32 p.m. UTC | #1
Hi Linus,

I tested this:
- Comtrend AR-5381u (bcm6328 with internal switch only).
- Netgear DGND3700v2 (bcm6362 with external BCM53125 switch).
And there were no regressions on any of them.

So I merged it @ 3cf1fe5508ee8a2a2c6e33a829bfb6c81381ccd0
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=3cf1fe5508ee8a2a2c6e33a829bfb6c81381ccd0
https://github.com/openwrt/openwrt/commit/3cf1fe5508ee8a2a2c6e33a829bfb6c81381ccd0

Thanks!

El mié, 24 ene 2024 a las 9:23, Linus Walleij
(<linus.walleij@linaro.org>) escribió:
>
> The safe max frame size for this ethernet switch is 1532 bytes,
> excluding the DSA TAG and extra VLAN header, so the maximum
> outgoing frame is 1542 bytes.
>
> The available overhead is needed when using the DSA switch with
> a cascaded Marvell DSA switch, which is something that exist in
> real products, in this case the Inteno XG6846.
>
> Use defines at the top of the size for max MTU so it is clear how
> we think about this, add comments.
>
> We need to adjust the RX buffer size to fit the new max frame size,
> which is 1542 when the DSA tag (6 bytes) and VLAN header (4 extra
> bytes) is added.
>
> We also drop this default MTU:
>
>   #define ENETSW_TAG_SIZE (6 + VLAN_HLEN)
>   ndev->mtu = ETH_DATA_LEN + ENETSW_TAG_SIZE;
>
> in favor of just:
>
>   ndev->mtu = ETH_DATA_LEN;
>
> I don't know why the default MTU is trying to second guess the
> overhead required by DSA and VLAN but the framework will also
> try to bump the MTU for e.g. DSA tags, and the VLAN overhead is
> not supposed to be included in the MTU, so this is clearly not
> right.
>
> Before this patch (on the lan1 DSA port in this case):
> dsa_slave_change_mtu: master->max_mtu = 9724, dev->max_mtu = 10218, DSA overhead = 8
> dsa_slave_change_mtu: master = extsw, dev = lan1
> dsa_slave_change_mtu: master->max_mtu = 1510, dev->max_mtu = 9724, DSA overhead = 6
> dsa_slave_change_mtu: master = eth0, dev = extsw
> dsa_slave_change_mtu new_master_mtu 1514 > mtu_limit 1510
> mdio_mux-0.1:00: nonfatal error -34 setting MTU to 1500 on port 0
>
> My added debug prints before the nonfatal error: the first switch from the top
> is the Marvell switch, the second in the bcm6368-enetsw with its 1510 limit.
>
> After this patch the error is gone.
>
> OpenWrt adds a VLAN to each port so we get VLAN tags on all frames. On this
> setup we even have 4 more bytes left after the two DSA tags and VLAN so
> we can go all the way up to 1532 as MTU.
>
> Testing the new 1532 MTU:
>
>     eth0             ext1              enp7s0
>  .--------.     .-----------.  cable  .------.
>  | enetsw | <-> | mv88e6152 | <-----> | host |
>  `--------´     `-----------´         `------´
>
> On the router we set the max MTU for test:
> ifconfig eth0 mtu 1520
> ifconfig br-wan mtu 1520
> ifconfig ext1 mtu 1506
>
> An MTU of 1506 on ext1 is a logic consequence of the above setup:
> this is the max bytes actually transferred. The framing added will be:
>
> - 18 bytes standard ethernet header
> - 4 bytes VLAN header
> - 6 bytes DSA tag for enetsw
> - 8 bytes DSA tag for mv88e6152
>
> Sum: 1506 + 18 + 4 + 6 + 8 = 1542 which is out max frame size.
>
> Test pinging from host:
> ping -s 1478 -M do 192.168.1.220
> PING 192.168.1.220 (192.168.1.220) 1478(1506) bytes of data.
> 1486 bytes from 192.168.1.220: icmp_seq=1 ttl=64 time=0.696 ms
> 1486 bytes from 192.168.1.220: icmp_seq=2 ttl=64 time=0.615 ms
>
> Test pinging from router:
> PING 192.168.1.2 (192.168.1.2): 1478 data bytes
> 1486 bytes from 192.168.1.2: seq=0 ttl=64 time=0.931 ms
> 1486 bytes from 192.168.1.2: seq=1 ttl=64 time=0.810 ms
>
> The max IP packet without headers is 1478, the outgoing ICMP packet is
> 1506 bytes. Then the DSA, VLAN and ethernet overhead is added.
>
> Let us verify the contents of the resulting ethernet frame of 1542 bytes.
>
> Ping packet on router side as viewed with tcpdump:
>
> 00:54:51.900869 AF Unknown (1429722180), length 1538:
>         0x0000:  3d93 bcae c56b a83d 8874 0300 0004 8100  =....k.=.t......
>         0x0010:  0000 dada 0000 c020 0fff 0800 4500 05e2  ............E...
>         0x0020:  0000 4000 4001 b0ec c0a8 0102 c0a8 01dc  ..@.@...........
>         0x0030:  0800 7628 00c3 0001 f5da 1d65 0000 0000  ..v(.......e....
>         0x0040:  ce65 0a00 0000 0000 1011 1213 1415 1617  .e..............
>         0x0050:  1819 1a1b 1c1d 1e1f 2021 2223 2425 2627  .........!"#$%&'
>         0x0060:  2829 2a2b 2c2d 2e2f 3031 3233 3435 3637  ()*+,-./0123456
> (...)
>
> - 3d93 = First four bytes are the last two bytes of the destination
>   ethernet address I don't know why the first four are missing,
>   but it sure explains why the paket is 1538 bytes and not 1542
>   which is the actual max frame size.
> - bcae c56b a83b = source ethernet address
> - 8874 0300 0004 = Broadcom enetsw DSA tag
> - 8100 0000 = VLAN 802.1Q header
> - dada 0000 c020 0fff = EDSA tag for the Marvell (outer) switch,
> - 0800 is the ethertype (part of the EDSA tag technically)
> - Next follows the contents of the ping packet as it appears if
>   we dump it on the DSA interface such as tcpdump -i lan1
>   etc, there we get the stripped out packet, 1506 bytes.
> - At the end 4 bytes of FCS.
>
> This clearly illustrates that the DSA tag is included in the MTU
> which we set up in Linux, but the VLAN tag and ethernet headers and
> checksum is not.
>
> Cc: Álvaro Fernández Rojas <noltari@gmail.com>
> Cc: Jonas Gorski <jonas.gorski@gmail.com>
> Tested-by: Paul Donald <newtwen@gmail.com>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> ChangeLog v6->v7:
> - Rebase on master.
> - Perhaps we can merge this now?
> ChangeLog v5->v6:
> - Rewrite thoroughly after discussing with Jonas that the safe
>   MTU is probably 1532 without DSA tag and extra VLAN header.
> - Reword commit message and redo tests so it is crystal clear what
>   is going on, as illustrated by tcpdump.
> ChangeLog v4->v5:
> - Drop the confusing ENETSW_MTU_OVERHEAD altogether after discussing
>   with Jonas.
> ChangeLog v3->v4:
> - Adjust the RX buffer size and we can use the max "jumbo"
>   frame size 2048.
> ChangeLog v2->v3:
> - Make a more believable case for the max MTU with tcpdump
>   example.
> ChangeLog v1->v2:
> - Do some better research after help on IRC, did some ping tests.
> ---
>  .../drivers/net/ethernet/broadcom/bcm6368-enetsw.c | 23 +++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c b/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c
> index 321e95dbbb3d..b72a788378fa 100644
> --- a/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c
> +++ b/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c
> @@ -22,10 +22,19 @@
>  #include <linux/reset.h>
>  #include <linux/version.h>
>
> -/* MTU */
> -#define ENETSW_TAG_SIZE                        (6 + VLAN_HLEN)
> -#define ENETSW_MTU_OVERHEAD            (VLAN_ETH_HLEN + VLAN_HLEN + \
> -                                        ENETSW_TAG_SIZE)
> +/* TODO: Bigger frames may work but we do not trust that they are safe on all
> + * platforms so more research is needed, a max frame size of 2048 has been
> + * tested. We use the safe frame size 1542 which is 1532 plus DSA and VLAN
> + * overhead.
> + */
> +#define ENETSW_MAX_FRAME               1542
> +#define ENETSW_DSA_TAG_SIZE            6
> +/* The MTU in Linux does not include ethernet or VLAN headers, but it DOES
> + * include the DSA overhead (the framework will increase the MTU to fit
> + * any DSA header).
> + */
> +#define ENETSW_MAX_MTU                 (ENETSW_MAX_FRAME - VLAN_ETH_HLEN - \
> +                                        VLAN_HLEN)
>  #define ENETSW_FRAG_SIZE(x)            (SKB_DATA_ALIGN(NET_SKB_PAD + x + \
>                                          SKB_DATA_ALIGN(sizeof(struct skb_shared_info))))
>
> @@ -1006,7 +1015,7 @@ static int bcm6368_enetsw_probe(struct platform_device *pdev)
>                 dev_info(dev, "random mac\n");
>         }
>
> -       priv->rx_buf_size = ALIGN(ndev->mtu + ENETSW_MTU_OVERHEAD,
> +       priv->rx_buf_size = ALIGN(ENETSW_MAX_FRAME,
>                                   ENETSW_DMA_MAXBURST * 4);
>
>         priv->rx_frag_size = ENETSW_FRAG_SIZE(priv->rx_buf_size);
> @@ -1066,8 +1075,8 @@ static int bcm6368_enetsw_probe(struct platform_device *pdev)
>         /* register netdevice */
>         ndev->netdev_ops = &bcm6368_enetsw_ops;
>         ndev->min_mtu = ETH_ZLEN;
> -       ndev->mtu = ETH_DATA_LEN + ENETSW_TAG_SIZE;
> -       ndev->max_mtu = ETH_DATA_LEN + ENETSW_TAG_SIZE;
> +       ndev->mtu = ETH_DATA_LEN;
> +       ndev->max_mtu = ENETSW_MAX_MTU;
>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(6,1,0)
>         netif_napi_add(ndev, &priv->napi, bcm6368_enetsw_poll);
>  #else
>
> ---
> base-commit: 1b7e62b20b1735fcdc498a35e005afcd775abcf4
> change-id: 20240124-bcm6368-mtu-d3b27dca1ca8
>
> Best regards,
> --
> Linus Walleij <linus.walleij@linaro.org>
>
Linus Walleij Jan. 25, 2024, 8:07 a.m. UTC | #2
On Wed, Jan 24, 2024 at 8:32 PM Álvaro Fernández Rojas
<noltari@gmail.com> wrote:

> I tested this:
> - Comtrend AR-5381u (bcm6328 with internal switch only).
> - Netgear DGND3700v2 (bcm6362 with external BCM53125 switch).
> And there were no regressions on any of them.
>
> So I merged it @ 3cf1fe5508ee8a2a2c6e33a829bfb6c81381ccd0
> https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=3cf1fe5508ee8a2a2c6e33a829bfb6c81381ccd0
> https://github.com/openwrt/openwrt/commit/3cf1fe5508ee8a2a2c6e33a829bfb6c81381ccd0

Thanks Álvaro!

Yours,
Linus Walleij
diff mbox series

Patch

diff --git a/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c b/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c
index 321e95dbbb3d..b72a788378fa 100644
--- a/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c
+++ b/target/linux/bmips/files/drivers/net/ethernet/broadcom/bcm6368-enetsw.c
@@ -22,10 +22,19 @@ 
 #include <linux/reset.h>
 #include <linux/version.h>
 
-/* MTU */
-#define ENETSW_TAG_SIZE			(6 + VLAN_HLEN)
-#define ENETSW_MTU_OVERHEAD		(VLAN_ETH_HLEN + VLAN_HLEN + \
-					 ENETSW_TAG_SIZE)
+/* TODO: Bigger frames may work but we do not trust that they are safe on all
+ * platforms so more research is needed, a max frame size of 2048 has been
+ * tested. We use the safe frame size 1542 which is 1532 plus DSA and VLAN
+ * overhead.
+ */
+#define ENETSW_MAX_FRAME		1542
+#define ENETSW_DSA_TAG_SIZE		6
+/* The MTU in Linux does not include ethernet or VLAN headers, but it DOES
+ * include the DSA overhead (the framework will increase the MTU to fit
+ * any DSA header).
+ */
+#define ENETSW_MAX_MTU			(ENETSW_MAX_FRAME - VLAN_ETH_HLEN - \
+					 VLAN_HLEN)
 #define ENETSW_FRAG_SIZE(x)		(SKB_DATA_ALIGN(NET_SKB_PAD + x + \
 					 SKB_DATA_ALIGN(sizeof(struct skb_shared_info))))
 
@@ -1006,7 +1015,7 @@  static int bcm6368_enetsw_probe(struct platform_device *pdev)
 		dev_info(dev, "random mac\n");
 	}
 
-	priv->rx_buf_size = ALIGN(ndev->mtu + ENETSW_MTU_OVERHEAD,
+	priv->rx_buf_size = ALIGN(ENETSW_MAX_FRAME,
 				  ENETSW_DMA_MAXBURST * 4);
 
 	priv->rx_frag_size = ENETSW_FRAG_SIZE(priv->rx_buf_size);
@@ -1066,8 +1075,8 @@  static int bcm6368_enetsw_probe(struct platform_device *pdev)
 	/* register netdevice */
 	ndev->netdev_ops = &bcm6368_enetsw_ops;
 	ndev->min_mtu = ETH_ZLEN;
-	ndev->mtu = ETH_DATA_LEN + ENETSW_TAG_SIZE;
-	ndev->max_mtu = ETH_DATA_LEN + ENETSW_TAG_SIZE;
+	ndev->mtu = ETH_DATA_LEN;
+	ndev->max_mtu = ENETSW_MAX_MTU;
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(6,1,0)
 	netif_napi_add(ndev, &priv->napi, bcm6368_enetsw_poll);
 #else