[22.03] mvebu: cortexa9: disable devices using broken mv88e6176 switch

Message ID 20221201095225.1567-1-ynezz@true.cz
State Accepted
Delegated to: Petr Štetiar
Series [22.03] mvebu: cortexa9: disable devices using broken mv88e6176 switch

Commit Message

Petr Štetiar Dec. 1, 2022, 9:52 a.m. UTC
Several users have reported that devices using the mv88e6176 switch are
seriously broken, with the switch basically behaving like a hub. Until
this is fixed, those devices should be disabled.

I've used the TOH with the "Switch 88E6176" filter, which provided me
with the following list of likely affected devices:

 * Linksys WRT1200AC v1/v2, WRT1900AC v1/v2
 * SolidRun ClearFog Pro
 * Turris Omnia

That device list more or less corresponds with the list of devices
mentioned in the linked bug reports.

References: https://github.com/openwrt/openwrt/issues/11077
Signed-off-by: Petr Štetiar <ynezz@true.cz>
---
 target/linux/mvebu/image/cortexa9.mk | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Uwe Kleine-König Dec. 1, 2022, 11:17 a.m. UTC | #1
On 12/1/22 10:52, Petr Štetiar wrote:
> Several users have reported, that devices using mv88e6176 switch are
> seriously broken, basically turning that switch into a hub. Until fixed
> those devices should be disabled.
> 
> I've used TOH with "Switch 88E6176" filter, which provided me with the
> following list of likely affected devices:
> 
>   * Linksys WRT1200AC v1/v2, WRT1900AC v1/v2
>   * SolidRun ClearFog Pro
>   * Turris Omnia
> 
> That device list more or less corresponds with the list of devices
> mentioned in the linked bug reports.
> 
> References: https://github.com/openwrt/openwrt/issues/11077
> Signed-off-by: Petr Štetiar <ynezz@true.cz>

I wonder if disabling is a sane thing to do here. People running 22.03.2 
won't be able to update (without building themselves). Isn't that worse 
than a slow network configuration?

Best regards
Uwe
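
For readers wondering what "building themselves" would involve, here is a
minimal sketch. It assumes that DEFAULT := n only drops a device from the
default (all-profiles) builds and that the device can still be selected
explicitly; that is my reading of the build system, not something stated in
this thread. The helper name device_config is hypothetical, and the symbol
naming should be checked against `make menuconfig`.

```shell
#!/bin/sh
# Print the .config lines that select exactly one device profile.
# Assumption: OpenWrt names the generated Kconfig symbols
# CONFIG_TARGET_<target>_<subtarget>_DEVICE_<device>.
device_config() {
    target=$1 subtarget=$2 device=$3
    printf 'CONFIG_TARGET_%s=y\n' "$target"
    printf 'CONFIG_TARGET_%s_%s=y\n' "$target" "$subtarget"
    printf 'CONFIG_TARGET_%s_%s_DEVICE_%s=y\n' "$target" "$subtarget" "$device"
}

# Usage inside an OpenWrt 22.03 checkout (not executed here):
#   device_config mvebu cortexa9 cznic_turris-omnia > .config
#   make defconfig      # expand the seed into a full configuration
#   make
```

The device name cznic_turris-omnia is taken from the patch below; the other
affected profiles would be selected the same way.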
Petr Štetiar Dec. 1, 2022, 11:52 a.m. UTC | #2
Uwe Kleine-König <uwe+openwrt@kleine-koenig.org> [2022-12-01 12:17:31]:

Hi Uwe,

> On 12/1/22 10:52, Petr Štetiar wrote:
> > Several users have reported, that devices using mv88e6176 switch are
> > seriously broken, basically turning that switch into a hub. Until fixed
> > those devices should be disabled.
> 
> I wonder if disabling is a sane thing to do here. 

IMO it's a serious issue, so we have just the following sane options:

 * fix it
 * remove support for those devices and make everyone aware

> People running 22.03.2 won't be able to update

Indeed, bummer, we're sorry about that, but the idea here is to make everyone
aware and prevent more users from upgrading those devices to the latest 22.03
release.

> Isn't that worse than a slow network configuration?

Quoting from https://github.com/openwrt/openwrt/issues/10997

 "After upgrade from 21.02 eth switch start to forward all incoming traffic to
  all ports.  I can see ALL traffic using tcpdump on computer connected to one
  port and all eth traffic leds are blinking always simultaneously.  Same
  configuration on 21.02 works ok - VLANs and individual traffic are isolated."

Cheers,

Petr
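
The symptom quoted from issue 10997 can be checked from any host plugged into
one switch port. The following is a rough sketch, not a procedure taken from
the bug report: flood_filter is a hypothetical helper, and the interface name
and MAC address in the usage comment are placeholders.

```shell
#!/bin/sh
# Build a pcap filter that matches only unicast frames which are neither
# sent by nor addressed to the given MAC address. On a working switch such
# frames should show up only briefly (unknown-unicast flooding before the
# destination is learned), not as a steady stream.
flood_filter() {
    printf 'not ether host %s and not ether broadcast and not ether multicast\n' "$1"
}

# Usage on the observing host (needs root, not executed here):
#   tcpdump -n -e -i eth0 "$(flood_filter 00:11:22:33:44:55)"
```

A sustained stream of other hosts' unicast traffic in that capture would
match the hub-like behaviour described above.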
Bjørn Mork Dec. 1, 2022, 1:19 p.m. UTC | #3
I assume KERNEL_PATCHVER in target/linux/mvebu/Makefile will be fixed in
master as well, given that 5.10 is unsupportable on this target?


Bjørn
Josef Schlehofer Dec. 1, 2022, 1:23 p.m. UTC | #4
On 01. 12. 22 14:19, Bjørn Mork wrote:

> I assume KERNEL_PATCHVER in target/linux/mvebu/Makefile will be fixed in
> master as well, given that 5.10 is unsupportable on this target?
>
AFAIK what should be done is to make kernel 5.15 the default kernel
for mvebu. Currently, it is only the testing kernel.
Christian Marangi Dec. 1, 2022, 1:33 p.m. UTC | #5
On Thu, Dec 01, 2022 at 02:23:08PM +0100, Josef Schlehofer wrote:
> On 01. 12. 22 14:19, Bjørn Mork wrote:
> 
> > I assume KERNEL_PATCHVER in target/linux/mvebu/Makefile will be fixed in
> > master as well, given that 5.10 is unsupportable on this target?
> > 
> AFAIK What should be done is to put the kernel 5.15 as the default kernel
> for mvebu. Currently, it is only as the testing kernel.
>

My 2 cents on this... A kernel upgrade is not viable for a stable
release. The problem here is simple...

VLAN handling is broken in 5.10 and got fixed in 5.15. With DSA it is
"easy enough" to check all the changes related to VLAN and the mvebu
switch and backport them to 5.10 (and, once the relevant patches are
found, even to send them to the stable mailing list and ask Greg to
backport them).

The problem is that we currently lack the manpower to bisect this, and
ideally, by disabling these targets, we will push the community to find
the problem.

It requires some time, but the fact that things are broken in 5.10 and
fixed in 5.15 makes this less hard to bisect... Someone can have some
fun building the intermediate kernels 5.11, 5.12, 5.13 until things
start to work, which narrows down the patches to check...

AFAIK there were many changes to the VLAN code, and they were related to
mvebu, so it just requires a user with the device and the time to
actually bisect this. Once we have identified the relevant commits, we
can backport them, add the patches to the mvebu target and re-enable
the affected devices.

> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
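
The bisection suggested above looks for the commit that fixed the problem, not
the one that broke it, so git's default good/bad terms read backwards. A
minimal sketch using git bisect's custom terms follows; bisect_fix_start is a
hypothetical helper, and it assumes the bug reproduces on an unpatched
mainline tree, which nobody in this thread has confirmed.

```shell
#!/bin/sh
# Start a bisection for a *fix*: the old state is "broken", the new state
# is "fixed". $1 is a revision known to be broken, $2 one known to be fixed.
bisect_fix_start() {
    git bisect start --term-old broken --term-new fixed &&
    git bisect broken "$1" &&
    git bisect fixed "$2"
}

# Usage in a mainline kernel clone (not executed here):
#   bisect_fix_start v5.10 v5.15
#   # build and boot the checked-out revision, test the switch, then run
#   # one of the following and repeat until git names the fixing commit:
#   git bisect fixed     # ports are isolated at this revision
#   git bisect broken    # traffic is still flooded at this revision
```

Building only the 5.11, 5.12 and 5.13 releases first, as proposed, would
shrink the range before starting the finer-grained bisection.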
Enrico Mioso Dec. 1, 2022, 1:33 p.m. UTC | #6
My 2 cents here.

My (quite possibly wrong) impression is that this follows the same way of thinking as commit db19efee951231b38573cffaadb15fad8f9c058d.
I can understand this is a strong way to convince people to care about specific issues. Still, this seems a bit rude to me.
On the other hand, I can understand that this problem can have serious security consequences, so both points of view make sense to me.

On Thu, Dec 01, 2022 at 12:52:29PM +0100, Petr Štetiar wrote:
> Uwe Kleine-König <uwe+openwrt@kleine-koenig.org> [2022-12-01 12:17:31]:
> 
> Hi Uwe,
> 
> > On 12/1/22 10:52, Petr Štetiar wrote:
> > > Several users have reported, that devices using mv88e6176 switch are
> > > seriously broken, basically turning that switch into a hub. Until fixed
> > > those devices should be disabled.
> > 
> > I wonder if disabling is a sane thing to do here. 
> 
> IMO it's a serious issue, so we've just following sane options:
> 
>  * fix it
>  * remove support for those devices and make everyone aware
> 
> > People running 22.03.2 won't be able to update
> 
> Indeed, bummer, we're sorry about that, but the idea here is to make everyone
> aware and prevent more users from upgrading those devices to latest 22.03
> release.
> 
> > Isn't that worse than a slow network configuration?
> 
> Quoting from https://github.com/openwrt/openwrt/issues/10997
> 
>  "After upgrade from 21.02 eth switch start to forward all incoming traffic to
>   all ports.  I can see ALL traffic using tcpdump on computer connected to one
>   port and all eth traffic leds are blinking always simultaneously.  Same
>   configuration on 21.02 works ok - VLANs and individual traffic are isolated."
> 
> Cheers,
> 
> Petr
Robert Marko Dec. 1, 2022, 1:39 p.m. UTC | #7
On Thu, 1 Dec 2022 at 14:34, Christian Marangi <ansuelsmth@gmail.com> wrote:
>
> On Thu, Dec 01, 2022 at 02:23:08PM +0100, Josef Schlehofer wrote:
> > On 01. 12. 22 14:19, Bjørn Mork wrote:
> >
> > > I assume KERNEL_PATCHVER in target/linux/mvebu/Makefile will be fixed in
> > > master as well, given that 5.10 is unsupportable on this target?
> > >
> > AFAIK What should be done is to put the kernel 5.15 as the default kernel
> > for mvebu. Currently, it is only as the testing kernel.
> >
>
> My 2 cent on this... A kernel upgrade is not viable for a stable
> release. The problem here is simple...
>
> Things related to VLAN are broken in 5.10 and got fixed in 5.15. DSA is
> ""easy enough"" to check all the changes related to VLAN and mvebu
> switch and backport them to 5.10. (even warning some kernel guy once the
> affected patch are found and sent to stable mailing list to ask greg to
> be backported)
>
> Problem is that we currently lack manpower to bisect this and ideally by
> disabling these target we will push the community on finding the
> problem.
>
> Require some time but the fact that things are broken on 5.10 and are
> fixed in 5.15 makes everything less hard to bisect... Someone can
> totally have some fun building intermediate kernel 5.11, 5.12, 5.13 once
> things starts to work so he can reduce the patch to check...
>
> AFAIK there were many changes to VLAN part and were totally related to
> mvebu so it just require some user with the device and time to actually
> bisect this. Once we have the affected commit we can totally backport
> them and put the patch for the mvebu target and we will reenable the
> affected devices.

My bet is on one of the patches that are not backports but rather
hacks/hotfixes carried only in OpenWrt.
Even if it was just an upstream change that fixed it, it's probably one
of the hundreds that don't look even remotely related.

Personally I don't have any of these devices and the switch is a rather
rare model.

Regards,
Robert
Christian Marangi Dec. 1, 2022, 1:43 p.m. UTC | #8
On Thu, Dec 01, 2022 at 02:39:56PM +0100, Robert Marko wrote:
> On Thu, 1 Dec 2022 at 14:34, Christian Marangi <ansuelsmth@gmail.com> wrote:
> >
> > On Thu, Dec 01, 2022 at 02:23:08PM +0100, Josef Schlehofer wrote:
> > > On 01. 12. 22 14:19, Bjørn Mork wrote:
> > >
> > > > I assume KERNEL_PATCHVER in target/linux/mvebu/Makefile will be fixed in
> > > > master as well, given that 5.10 is unsupportable on this target?
> > > >
> > > AFAIK What should be done is to put the kernel 5.15 as the default kernel
> > > for mvebu. Currently, it is only as the testing kernel.
> > >
> >
> > My 2 cent on this... A kernel upgrade is not viable for a stable
> > release. The problem here is simple...
> >
> > Things related to VLAN are broken in 5.10 and got fixed in 5.15. DSA is
> > ""easy enough"" to check all the changes related to VLAN and mvebu
> > switch and backport them to 5.10. (even warning some kernel guy once the
> > affected patch are found and sent to stable mailing list to ask greg to
> > be backported)
> >
> > Problem is that we currently lack manpower to bisect this and ideally by
> > disabling these target we will push the community on finding the
> > problem.
> >
> > Require some time but the fact that things are broken on 5.10 and are
> > fixed in 5.15 makes everything less hard to bisect... Someone can
> > totally have some fun building intermediate kernel 5.11, 5.12, 5.13 once
> > things starts to work so he can reduce the patch to check...
> >
> > AFAIK there were many changes to VLAN part and were totally related to
> > mvebu so it just require some user with the device and time to actually
> > bisect this. Once we have the affected commit we can totally backport
> > them and put the patch for the mvebu target and we will reenable the
> > affected devices.
> 
> My bet is on one of the patches that are not backports but rather hacks/hotfixes
> carried only in OpenWrt.
> Even if it was just an upstream change that fixed it, its probably one
> of the hundreds
> that dont look even remotely related.
> 
> Personally I dont have any of these devices and the switch is a rather
> rare model.
> 

That can also be a reason... But again, it's just a matter of having
fun starting from a clean slate and seeing what is broken...

The mvebu platform is fully supported upstream, so in theory someone can
just use a buildroot and compile a kernel... everything should work
right away.
Robert Marko Dec. 1, 2022, 1:49 p.m. UTC | #9
On Thu, 1 Dec 2022 at 14:43, Christian Marangi <ansuelsmth@gmail.com> wrote:
>
> On Thu, Dec 01, 2022 at 02:39:56PM +0100, Robert Marko wrote:
> > On Thu, 1 Dec 2022 at 14:34, Christian Marangi <ansuelsmth@gmail.com> wrote:
> > >
> > > On Thu, Dec 01, 2022 at 02:23:08PM +0100, Josef Schlehofer wrote:
> > > > On 01. 12. 22 14:19, Bjørn Mork wrote:
> > > >
> > > > > I assume KERNEL_PATCHVER in target/linux/mvebu/Makefile will be fixed in
> > > > > master as well, given that 5.10 is unsupportable on this target?
> > > > >
> > > > AFAIK What should be done is to put the kernel 5.15 as the default kernel
> > > > for mvebu. Currently, it is only as the testing kernel.
> > > >
> > >
> > > My 2 cent on this... A kernel upgrade is not viable for a stable
> > > release. The problem here is simple...
> > >
> > > Things related to VLAN are broken in 5.10 and got fixed in 5.15. DSA is
> > > ""easy enough"" to check all the changes related to VLAN and mvebu
> > > switch and backport them to 5.10. (even warning some kernel guy once the
> > > affected patch are found and sent to stable mailing list to ask greg to
> > > be backported)
> > >
> > > Problem is that we currently lack manpower to bisect this and ideally by
> > > disabling these target we will push the community on finding the
> > > problem.
> > >
> > > Require some time but the fact that things are broken on 5.10 and are
> > > fixed in 5.15 makes everything less hard to bisect... Someone can
> > > totally have some fun building intermediate kernel 5.11, 5.12, 5.13 once
> > > things starts to work so he can reduce the patch to check...
> > >
> > > AFAIK there were many changes to VLAN part and were totally related to
> > > mvebu so it just require some user with the device and time to actually
> > > bisect this. Once we have the affected commit we can totally backport
> > > them and put the patch for the mvebu target and we will reenable the
> > > affected devices.
> >
> > My bet is on one of the patches that are not backports but rather hacks/hotfixes
> > carried only in OpenWrt.
> > Even if it was just an upstream change that fixed it, its probably one
> > of the hundreds
> > that dont look even remotely related.
> >
> > Personally I dont have any of these devices and the switch is a rather
> > rare model.
> >
>
> That can also be a reason... But again it's just having fun with
> starting from a clear start and see what it's broken...
>
> mvebu platform is full supported upstream so in theory someone can just
> use a buildroot and compile a kernel... everything should work right
> away

Yeah, it shouldn't be "that" hard, it's just going to take time if you
have the HW.

Regards,
Robert
Klaus Kudielka Dec. 15, 2022, 6:13 p.m. UTC | #10
I believe this is the same bug I reported upstream in October '20 (but
nobody was able to reproduce):

https://bugzilla.kernel.org/show_bug.cgi?id=209487


Quoting myself (Comment 11):
----

The issue seems to be finally solved by acceptance of the "RX filtering in
DSA" patch series (thanks to Tobias Waldekranz & Vladimir Oltean).

https://lore.kernel.org/netdev/20210629140658.2510288-1-olteanv@gmail.com/

I tested this with 5.14-rc3 on a Turris Omnia. Traffic from a DSA switch
port, addressed to the bridge, is not flooded anymore to other DSA switch
ports - even if unicast flooding is turned on.
----


So, I guess this would be the series to backport.


Regards, Klaus
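
If someone wants to try that backport, one way to pull the series from lore
is sketched below. The b4 tool and the output directory are my own choice and
are not mentioned in the thread; lore_msgid is a hypothetical helper. Whether
the series applies to a 5.10 tree without further prerequisites is unknown.

```shell
#!/bin/sh
# Extract the Message-ID from a lore.kernel.org thread URL.
lore_msgid() {
    url=${1%/}                    # drop a trailing slash, if any
    printf '%s\n' "${url##*/}"    # keep only the last path component
}

# Usage (needs network access and b4, not executed here):
#   msgid=$(lore_msgid 'https://lore.kernel.org/netdev/20210629140658.2510288-1-olteanv@gmail.com/')
#   b4 am -o /tmp/rx-filtering "$msgid"   # download the series as an mbox
#   git am /tmp/rx-filtering/*.mbx        # try it on a 5.10 tree, expect conflicts
```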
Patch

diff --git a/target/linux/mvebu/image/cortexa9.mk b/target/linux/mvebu/image/cortexa9.mk
index d9738903fb04..ac83e62e8a5c 100644
--- a/target/linux/mvebu/image/cortexa9.mk
+++ b/target/linux/mvebu/image/cortexa9.mk
@@ -66,6 +66,7 @@  define Device/cznic_turris-omnia
   DEVICE_IMG_NAME = $$(2)
   SUPPORTED_DEVICES += armada-385-turris-omnia
   BOOT_SCRIPT := turris-omnia
+  DEFAULT := n
 endef
 TARGET_DEVICES += cznic_turris-omnia
 
@@ -114,6 +115,7 @@  define Device/linksys
   IMAGE/factory.img := append-kernel | pad-to $$$$(KERNEL_SIZE) | \
 	append-ubi | pad-to $$$$(PAGESIZE)
   KERNEL_SIZE := 6144k
+  DEFAULT := n
 endef
 
 define Device/linksys_wrt1200ac
@@ -194,6 +196,7 @@  define Device/linksys_wrt32x
   KERNEL_SIZE := 6144k
   KERNEL := kernel-bin | append-dtb
   SUPPORTED_DEVICES += armada-385-linksys-venom linksys,venom
+  DEFAULT := y
 endef
 TARGET_DEVICES += linksys_wrt32x
 
@@ -299,5 +302,6 @@  define Device/solidrun_clearfog-pro-a1
   UBOOT := clearfog-u-boot-spl.kwb
   BOOT_SCRIPT := clearfog
   SUPPORTED_DEVICES += armada-388-clearfog armada-388-clearfog-pro
+  DEFAULT := n
 endef
 TARGET_DEVICES += solidrun_clearfog-pro-a1