Message ID | E1gvNNd-0006YN-Rc@rmk-PC.armlinux.org.uk |
---|---|
State | Superseded |
Delegated to: | David Miller |
Headers | show |
Series | net: dsa: mv88e6xxx: fix IPv6 | expand |
On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: > Switches work by learning the MAC address for each attached station by > monitoring traffic from each station. When a station sends a packet, > the switch records which port the MAC address is connected to. > > With IPv4 networking, before communication commences with a neighbour, > an ARP packet is broadcasted to all stations asking for the MAC address > corresponding with the IPv4. The desired station responds with an ARP > reply, and the ARP reply causes the switch to learn which port the > station is connected to. > > With IPv6 networking, the situation is rather different. Rather than > broadcasting ARP packets, a "neighbour solicitation" is multicasted > rather than broadcasted. This multicast needs to reach the intended > station in order for the neighbour to be discovered. > > Once a neighbour has been discovered, and entered into the sending > stations neighbour cache, communication can restart at a point later > without sending a new neighbour solicitation, even if the entry in > the neighbour cache is marked as stale. This can be after the MAC > address has expired from the forwarding cache of the DSA switch - > when that occurs, there is a long pause in communication. > > Our DSA implementation for mv88e6xxx switches has defaulted to having > multicast and unicast flooding disabled. As per the above description, > this is fine for IPv4 networking, since the broadcasted ARP queries > will be sent to and received by all stations on the same network. > However, this breaks IPv6 very badly - blocking neighbour solicitations > and later causing connections to stall. > > The defaults that the Linux bridge code expect from bridges are that > unknown unicast frames and unknown multicast frames are flooded to > all stations, which is at odds to the defaults adopted by our DSA > implementation for mv88e6xxx switches. > > This commit enables by default flooding of both unknown unicast and > unknown multicast frames. This means that mv88e6xxx DSA switches now > behave as per the bridge(8) man page, and IPv6 works flawlessly through > such a switch. Note that there is the open question whether this affects the case where each port is used as a separate network interface: that case has not yet been tested. > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > --- > drivers/net/dsa/mv88e6xxx/chip.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > index b75a865a293d..eb5e3d88374f 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -2144,13 +2144,14 @@ static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port) > static int mv88e6xxx_setup_egress_floods(struct mv88e6xxx_chip *chip, int port) > { > struct dsa_switch *ds = chip->ds; > - bool flood; > > - /* Upstream ports flood frames with unknown unicast or multicast DA */ > - flood = dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port); > + /* Linux bridges are expected to flood unknown multicast and > + * unicast frames to all ports - as per the defaults specified > + * in the iproute2 bridge(8) man page. Not doing this causes > + * stalls and failures with IPv6 over Marvell bridges. */ > if (chip->info->ops->port_set_egress_floods) > return chip->info->ops->port_set_egress_floods(chip, port, > - flood, flood); > + true, true); > > return 0; > } > -- > 2.7.4 > >
On Sun, Feb 17, 2019 at 02:27:16PM +0000, Russell King - ARM Linux admin wrote: > On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: > > Switches work by learning the MAC address for each attached station by > > monitoring traffic from each station. When a station sends a packet, > > the switch records which port the MAC address is connected to. > > > > With IPv4 networking, before communication commences with a neighbour, > > an ARP packet is broadcasted to all stations asking for the MAC address > > corresponding with the IPv4. The desired station responds with an ARP > > reply, and the ARP reply causes the switch to learn which port the > > station is connected to. > > > > With IPv6 networking, the situation is rather different. Rather than > > broadcasting ARP packets, a "neighbour solicitation" is multicasted > > rather than broadcasted. This multicast needs to reach the intended > > station in order for the neighbour to be discovered. > > > > Once a neighbour has been discovered, and entered into the sending > > stations neighbour cache, communication can restart at a point later > > without sending a new neighbour solicitation, even if the entry in > > the neighbour cache is marked as stale. This can be after the MAC > > address has expired from the forwarding cache of the DSA switch - > > when that occurs, there is a long pause in communication. > > > > Our DSA implementation for mv88e6xxx switches has defaulted to having > > multicast and unicast flooding disabled. As per the above description, > > this is fine for IPv4 networking, since the broadcasted ARP queries > > will be sent to and received by all stations on the same network. > > However, this breaks IPv6 very badly - blocking neighbour solicitations > > and later causing connections to stall. > > > > The defaults that the Linux bridge code expect from bridges are that > > unknown unicast frames and unknown multicast frames are flooded to > > all stations, which is at odds to the defaults adopted by our DSA > > implementation for mv88e6xxx switches. > > > > This commit enables by default flooding of both unknown unicast and > > unknown multicast frames. This means that mv88e6xxx DSA switches now > > behave as per the bridge(8) man page, and IPv6 works flawlessly through > > such a switch. > > Note that there is the open question whether this affects the case where > each port is used as a separate network interface: that case has not yet > been tested. I've checked with a mv88e6131 on the clearfog gt8k board. lan1 connected to my lan with plenty of traffic on, and configured as part of a bridge. lan2 connected to the zii board, but not part of the bridge. Monitoring lan2 from the zii board shows no traffic that was received from lan1. So it looks fine. > > > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > > --- > > drivers/net/dsa/mv88e6xxx/chip.c | 9 +++++---- > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > index b75a865a293d..eb5e3d88374f 100644 > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > @@ -2144,13 +2144,14 @@ static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port) > > static int mv88e6xxx_setup_egress_floods(struct mv88e6xxx_chip *chip, int port) > > { > > struct dsa_switch *ds = chip->ds; > > - bool flood; > > > > - /* Upstream ports flood frames with unknown unicast or multicast DA */ > > - flood = dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port); > > + /* Linux bridges are expected to flood unknown multicast and > > + * unicast frames to all ports - as per the defaults specified > > + * in the iproute2 bridge(8) man page. Not doing this causes > > + * stalls and failures with IPv6 over Marvell bridges. */ > > if (chip->info->ops->port_set_egress_floods) > > return chip->info->ops->port_set_egress_floods(chip, port, > > - flood, flood); > > + true, true); > > > > return 0; > > } > > -- > > 2.7.4 > > > > > > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up > According to speedtest.net: 11.9Mbps down 500kbps up
On 2/17/2019 8:34 AM, Russell King - ARM Linux admin wrote: > On Sun, Feb 17, 2019 at 02:27:16PM +0000, Russell King - ARM Linux admin wrote: >> On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: >>> Switches work by learning the MAC address for each attached station by >>> monitoring traffic from each station. When a station sends a packet, >>> the switch records which port the MAC address is connected to. >>> >>> With IPv4 networking, before communication commences with a neighbour, >>> an ARP packet is broadcasted to all stations asking for the MAC address >>> corresponding with the IPv4. The desired station responds with an ARP >>> reply, and the ARP reply causes the switch to learn which port the >>> station is connected to. >>> >>> With IPv6 networking, the situation is rather different. Rather than >>> broadcasting ARP packets, a "neighbour solicitation" is multicasted >>> rather than broadcasted. This multicast needs to reach the intended >>> station in order for the neighbour to be discovered. >>> >>> Once a neighbour has been discovered, and entered into the sending >>> stations neighbour cache, communication can restart at a point later >>> without sending a new neighbour solicitation, even if the entry in >>> the neighbour cache is marked as stale. This can be after the MAC >>> address has expired from the forwarding cache of the DSA switch - >>> when that occurs, there is a long pause in communication. >>> >>> Our DSA implementation for mv88e6xxx switches has defaulted to having >>> multicast and unicast flooding disabled. As per the above description, >>> this is fine for IPv4 networking, since the broadcasted ARP queries >>> will be sent to and received by all stations on the same network. >>> However, this breaks IPv6 very badly - blocking neighbour solicitations >>> and later causing connections to stall. >>> >>> The defaults that the Linux bridge code expect from bridges are that >>> unknown unicast frames and unknown multicast frames are flooded to >>> all stations, which is at odds to the defaults adopted by our DSA >>> implementation for mv88e6xxx switches. >>> >>> This commit enables by default flooding of both unknown unicast and >>> unknown multicast frames. This means that mv88e6xxx DSA switches now >>> behave as per the bridge(8) man page, and IPv6 works flawlessly through >>> such a switch. >> >> Note that there is the open question whether this affects the case where >> each port is used as a separate network interface: that case has not yet >> been tested. > > I've checked with a mv88e6131 on the clearfog gt8k board. lan1 > connected to my lan with plenty of traffic on, and configured as > part of a bridge. lan2 connected to the zii board, but not part > of the bridge. Monitoring lan2 from the zii board shows no traffic > that was received from lan1. > > So it looks fine. With the current state whereby we do not have the necessary hooks to perform filtering on non-bridged/standalone ports, this is entirely fine indeed. In the future this is part of something I want to address because it is IMHO highly undesirable to have non-bridged ports be flooded with unknown multicast or unknown unicast for that matter because that makes them deviate from a standard NIC interface. Unknown unicast is not necessarily a low hanging fruit, but still, if we have switches capable of filtering, we might as well make use of that. Of course, one difficulty is that we must not break running tcpdump on those DSA slave network interfaces. > >> >>> >>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> >>> --- >>> drivers/net/dsa/mv88e6xxx/chip.c | 9 +++++---- >>> 1 file changed, 5 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c >>> index b75a865a293d..eb5e3d88374f 100644 >>> --- a/drivers/net/dsa/mv88e6xxx/chip.c >>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c >>> @@ -2144,13 +2144,14 @@ static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port) >>> static int mv88e6xxx_setup_egress_floods(struct mv88e6xxx_chip *chip, int port) >>> { >>> struct dsa_switch *ds = chip->ds; >>> - bool flood; >>> >>> - /* Upstream ports flood frames with unknown unicast or multicast DA */ >>> - flood = dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port); >>> + /* Linux bridges are expected to flood unknown multicast and >>> + * unicast frames to all ports - as per the defaults specified >>> + * in the iproute2 bridge(8) man page. Not doing this causes >>> + * stalls and failures with IPv6 over Marvell bridges. */ >>> if (chip->info->ops->port_set_egress_floods) >>> return chip->info->ops->port_set_egress_floods(chip, port, >>> - flood, flood); >>> + true, true); >>> >>> return 0; >>> } >>> -- >>> 2.7.4 >>> >>> >> >> -- >> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ >> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up >> According to speedtest.net: 11.9Mbps down 500kbps up >
On Sun, Feb 17, 2019 at 01:45:24PM -0800, Florian Fainelli wrote: > > > On 2/17/2019 8:34 AM, Russell King - ARM Linux admin wrote: > > On Sun, Feb 17, 2019 at 02:27:16PM +0000, Russell King - ARM Linux admin wrote: > >> On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: > >>> Switches work by learning the MAC address for each attached station by > >>> monitoring traffic from each station. When a station sends a packet, > >>> the switch records which port the MAC address is connected to. > >>> > >>> With IPv4 networking, before communication commences with a neighbour, > >>> an ARP packet is broadcasted to all stations asking for the MAC address > >>> corresponding with the IPv4. The desired station responds with an ARP > >>> reply, and the ARP reply causes the switch to learn which port the > >>> station is connected to. > >>> > >>> With IPv6 networking, the situation is rather different. Rather than > >>> broadcasting ARP packets, a "neighbour solicitation" is multicasted > >>> rather than broadcasted. This multicast needs to reach the intended > >>> station in order for the neighbour to be discovered. > >>> > >>> Once a neighbour has been discovered, and entered into the sending > >>> stations neighbour cache, communication can restart at a point later > >>> without sending a new neighbour solicitation, even if the entry in > >>> the neighbour cache is marked as stale. This can be after the MAC > >>> address has expired from the forwarding cache of the DSA switch - > >>> when that occurs, there is a long pause in communication. > >>> > >>> Our DSA implementation for mv88e6xxx switches has defaulted to having > >>> multicast and unicast flooding disabled. As per the above description, > >>> this is fine for IPv4 networking, since the broadcasted ARP queries > >>> will be sent to and received by all stations on the same network. > >>> However, this breaks IPv6 very badly - blocking neighbour solicitations > >>> and later causing connections to stall. > >>> > >>> The defaults that the Linux bridge code expect from bridges are that > >>> unknown unicast frames and unknown multicast frames are flooded to > >>> all stations, which is at odds to the defaults adopted by our DSA > >>> implementation for mv88e6xxx switches. > >>> > >>> This commit enables by default flooding of both unknown unicast and > >>> unknown multicast frames. This means that mv88e6xxx DSA switches now > >>> behave as per the bridge(8) man page, and IPv6 works flawlessly through > >>> such a switch. > >> > >> Note that there is the open question whether this affects the case where > >> each port is used as a separate network interface: that case has not yet > >> been tested. > > > > I've checked with a mv88e6131 on the clearfog gt8k board. lan1 > > connected to my lan with plenty of traffic on, and configured as > > part of a bridge. lan2 connected to the zii board, but not part > > of the bridge. Monitoring lan2 from the zii board shows no traffic > > that was received from lan1. > > > > So it looks fine. > > With the current state whereby we do not have the necessary hooks to > perform filtering on non-bridged/standalone ports, this is entirely fine > indeed. > > In the future this is part of something I want to address because it is > IMHO highly undesirable to have non-bridged ports be flooded with > unknown multicast or unknown unicast for that matter because that makes > them deviate from a standard NIC interface. Unknown unicast is not > necessarily a low hanging fruit, but still, if we have switches capable > of filtering, we might as well make use of that. Of course, one > difficulty is that we must not break running tcpdump on those DSA slave > network interfaces. Sorry, I think you have the wrong end of the stick. For a non-bridged port, I am seeing _no_ traffic apart from that explicitly sent out through that port. In other words, there are _no_ flooded frames coming out of the non-bridged port. This patch appears to have no material effect on non-bridged ports. > > > > >> > >>> > >>> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > >>> --- > >>> drivers/net/dsa/mv88e6xxx/chip.c | 9 +++++---- > >>> 1 file changed, 5 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > >>> index b75a865a293d..eb5e3d88374f 100644 > >>> --- a/drivers/net/dsa/mv88e6xxx/chip.c > >>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c > >>> @@ -2144,13 +2144,14 @@ static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port) > >>> static int mv88e6xxx_setup_egress_floods(struct mv88e6xxx_chip *chip, int port) > >>> { > >>> struct dsa_switch *ds = chip->ds; > >>> - bool flood; > >>> > >>> - /* Upstream ports flood frames with unknown unicast or multicast DA */ > >>> - flood = dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port); > >>> + /* Linux bridges are expected to flood unknown multicast and > >>> + * unicast frames to all ports - as per the defaults specified > >>> + * in the iproute2 bridge(8) man page. Not doing this causes > >>> + * stalls and failures with IPv6 over Marvell bridges. */ > >>> if (chip->info->ops->port_set_egress_floods) > >>> return chip->info->ops->port_set_egress_floods(chip, port, > >>> - flood, flood); > >>> + true, true); > >>> > >>> return 0; > >>> } > >>> -- > >>> 2.7.4 > >>> > >>> > >> > >> -- > >> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > >> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up > >> According to speedtest.net: 11.9Mbps down 500kbps up > > > > -- > Florian >
On 2/17/2019 1:58 PM, Russell King - ARM Linux admin wrote: > On Sun, Feb 17, 2019 at 01:45:24PM -0800, Florian Fainelli wrote: >> >> >> On 2/17/2019 8:34 AM, Russell King - ARM Linux admin wrote: >>> On Sun, Feb 17, 2019 at 02:27:16PM +0000, Russell King - ARM Linux admin wrote: >>>> On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: >>>>> Switches work by learning the MAC address for each attached station by >>>>> monitoring traffic from each station. When a station sends a packet, >>>>> the switch records which port the MAC address is connected to. >>>>> >>>>> With IPv4 networking, before communication commences with a neighbour, >>>>> an ARP packet is broadcasted to all stations asking for the MAC address >>>>> corresponding with the IPv4. The desired station responds with an ARP >>>>> reply, and the ARP reply causes the switch to learn which port the >>>>> station is connected to. >>>>> >>>>> With IPv6 networking, the situation is rather different. Rather than >>>>> broadcasting ARP packets, a "neighbour solicitation" is multicasted >>>>> rather than broadcasted. This multicast needs to reach the intended >>>>> station in order for the neighbour to be discovered. >>>>> >>>>> Once a neighbour has been discovered, and entered into the sending >>>>> stations neighbour cache, communication can restart at a point later >>>>> without sending a new neighbour solicitation, even if the entry in >>>>> the neighbour cache is marked as stale. This can be after the MAC >>>>> address has expired from the forwarding cache of the DSA switch - >>>>> when that occurs, there is a long pause in communication. >>>>> >>>>> Our DSA implementation for mv88e6xxx switches has defaulted to having >>>>> multicast and unicast flooding disabled. As per the above description, >>>>> this is fine for IPv4 networking, since the broadcasted ARP queries >>>>> will be sent to and received by all stations on the same network. >>>>> However, this breaks IPv6 very badly - blocking neighbour solicitations >>>>> and later causing connections to stall. >>>>> >>>>> The defaults that the Linux bridge code expect from bridges are that >>>>> unknown unicast frames and unknown multicast frames are flooded to >>>>> all stations, which is at odds to the defaults adopted by our DSA >>>>> implementation for mv88e6xxx switches. >>>>> >>>>> This commit enables by default flooding of both unknown unicast and >>>>> unknown multicast frames. This means that mv88e6xxx DSA switches now >>>>> behave as per the bridge(8) man page, and IPv6 works flawlessly through >>>>> such a switch. >>>> >>>> Note that there is the open question whether this affects the case where >>>> each port is used as a separate network interface: that case has not yet >>>> been tested. >>> >>> I've checked with a mv88e6131 on the clearfog gt8k board. lan1 >>> connected to my lan with plenty of traffic on, and configured as >>> part of a bridge. lan2 connected to the zii board, but not part >>> of the bridge. Monitoring lan2 from the zii board shows no traffic >>> that was received from lan1. >>> >>> So it looks fine. >> >> With the current state whereby we do not have the necessary hooks to >> perform filtering on non-bridged/standalone ports, this is entirely fine >> indeed. >> >> In the future this is part of something I want to address because it is >> IMHO highly undesirable to have non-bridged ports be flooded with >> unknown multicast or unknown unicast for that matter because that makes >> them deviate from a standard NIC interface. Unknown unicast is not >> necessarily a low hanging fruit, but still, if we have switches capable >> of filtering, we might as well make use of that. Of course, one >> difficulty is that we must not break running tcpdump on those DSA slave >> network interfaces. > > Sorry, I think you have the wrong end of the stick. > > For a non-bridged port, I am seeing _no_ traffic apart from that > explicitly sent out through that port. In other words, there are > _no_ flooded frames coming out of the non-bridged port. > > This patch appears to have no material effect on non-bridged ports. Presumably because that non-bridged port and the CPU port are part of the same domain with only those 2 ports and that is what we want. Now what happens if say you have a station that sends multicast traffic through that port to e.g.: 226.94.1.1, I bet that port happily sends that multicast traffic to the CPU port with no filtering what so ever and this ends-up being dropped in the network stack because there is a socket look up failure there. IMHO unless you have a receiver/server on that network interface on the DSA network interface and a matching socket you should not be receiving that multicast traffic and the switch should be filtering it. Since the network stack will call into ndo_set_rx_mode() for those cases, we really just need to make that multicast traffic known, instead of unknown to the switch.
On Sun, Feb 17, 2019 at 02:03:40PM -0800, Florian Fainelli wrote: > > > On 2/17/2019 1:58 PM, Russell King - ARM Linux admin wrote: > > On Sun, Feb 17, 2019 at 01:45:24PM -0800, Florian Fainelli wrote: > >> > >> > >> On 2/17/2019 8:34 AM, Russell King - ARM Linux admin wrote: > >>> On Sun, Feb 17, 2019 at 02:27:16PM +0000, Russell King - ARM Linux admin wrote: > >>>> On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: > >>>>> Switches work by learning the MAC address for each attached station by > >>>>> monitoring traffic from each station. When a station sends a packet, > >>>>> the switch records which port the MAC address is connected to. > >>>>> > >>>>> With IPv4 networking, before communication commences with a neighbour, > >>>>> an ARP packet is broadcasted to all stations asking for the MAC address > >>>>> corresponding with the IPv4. The desired station responds with an ARP > >>>>> reply, and the ARP reply causes the switch to learn which port the > >>>>> station is connected to. > >>>>> > >>>>> With IPv6 networking, the situation is rather different. Rather than > >>>>> broadcasting ARP packets, a "neighbour solicitation" is multicasted > >>>>> rather than broadcasted. This multicast needs to reach the intended > >>>>> station in order for the neighbour to be discovered. > >>>>> > >>>>> Once a neighbour has been discovered, and entered into the sending > >>>>> stations neighbour cache, communication can restart at a point later > >>>>> without sending a new neighbour solicitation, even if the entry in > >>>>> the neighbour cache is marked as stale. This can be after the MAC > >>>>> address has expired from the forwarding cache of the DSA switch - > >>>>> when that occurs, there is a long pause in communication. > >>>>> > >>>>> Our DSA implementation for mv88e6xxx switches has defaulted to having > >>>>> multicast and unicast flooding disabled. As per the above description, > >>>>> this is fine for IPv4 networking, since the broadcasted ARP queries > >>>>> will be sent to and received by all stations on the same network. > >>>>> However, this breaks IPv6 very badly - blocking neighbour solicitations > >>>>> and later causing connections to stall. > >>>>> > >>>>> The defaults that the Linux bridge code expect from bridges are that > >>>>> unknown unicast frames and unknown multicast frames are flooded to > >>>>> all stations, which is at odds to the defaults adopted by our DSA > >>>>> implementation for mv88e6xxx switches. > >>>>> > >>>>> This commit enables by default flooding of both unknown unicast and > >>>>> unknown multicast frames. This means that mv88e6xxx DSA switches now > >>>>> behave as per the bridge(8) man page, and IPv6 works flawlessly through > >>>>> such a switch. > >>>> > >>>> Note that there is the open question whether this affects the case where > >>>> each port is used as a separate network interface: that case has not yet > >>>> been tested. > >>> > >>> I've checked with a mv88e6131 on the clearfog gt8k board. lan1 > >>> connected to my lan with plenty of traffic on, and configured as > >>> part of a bridge. lan2 connected to the zii board, but not part > >>> of the bridge. Monitoring lan2 from the zii board shows no traffic > >>> that was received from lan1. > >>> > >>> So it looks fine. > >> > >> With the current state whereby we do not have the necessary hooks to > >> perform filtering on non-bridged/standalone ports, this is entirely fine > >> indeed. > >> > >> In the future this is part of something I want to address because it is > >> IMHO highly undesirable to have non-bridged ports be flooded with > >> unknown multicast or unknown unicast for that matter because that makes > >> them deviate from a standard NIC interface. Unknown unicast is not > >> necessarily a low hanging fruit, but still, if we have switches capable > >> of filtering, we might as well make use of that. Of course, one > >> difficulty is that we must not break running tcpdump on those DSA slave > >> network interfaces. > > > > Sorry, I think you have the wrong end of the stick. > > > > For a non-bridged port, I am seeing _no_ traffic apart from that > > explicitly sent out through that port. In other words, there are > > _no_ flooded frames coming out of the non-bridged port. > > > > This patch appears to have no material effect on non-bridged ports. > > Presumably because that non-bridged port and the CPU port are part of > the same domain with only those 2 ports and that is what we want. > > Now what happens if say you have a station that sends multicast traffic > through that port to e.g.: 226.94.1.1, I bet that port happily sends > that multicast traffic to the CPU port with no filtering what so ever > and this ends-up being dropped in the network stack because there is a > socket look up failure there. IMHO unless you have a receiver/server on > that network interface on the DSA network interface and a matching > socket you should not be receiving that multicast traffic and the switch > should be filtering it. Since the network stack will call into > ndo_set_rx_mode() for those cases, we really just need to make that > multicast traffic known, instead of unknown to the switch. If the port is not bridged, then it's operating as network interface, and traffic to/from that port needs to be routed to the CPU port so that it appears as it would do from a real network interface. Doing anything else makes breaks the idea that you can use a set of DSA ports as individual interfaces and run anything but IPv4 non-multicast over them.
On 2/17/2019 2:19 PM, Russell King - ARM Linux admin wrote: > On Sun, Feb 17, 2019 at 02:03:40PM -0800, Florian Fainelli wrote: >> >> >> On 2/17/2019 1:58 PM, Russell King - ARM Linux admin wrote: >>> On Sun, Feb 17, 2019 at 01:45:24PM -0800, Florian Fainelli wrote: >>>> >>>> >>>> On 2/17/2019 8:34 AM, Russell King - ARM Linux admin wrote: >>>>> On Sun, Feb 17, 2019 at 02:27:16PM +0000, Russell King - ARM Linux admin wrote: >>>>>> On Sun, Feb 17, 2019 at 02:25:17PM +0000, Russell King wrote: >>>>>>> Switches work by learning the MAC address for each attached station by >>>>>>> monitoring traffic from each station. When a station sends a packet, >>>>>>> the switch records which port the MAC address is connected to. >>>>>>> >>>>>>> With IPv4 networking, before communication commences with a neighbour, >>>>>>> an ARP packet is broadcasted to all stations asking for the MAC address >>>>>>> corresponding with the IPv4. The desired station responds with an ARP >>>>>>> reply, and the ARP reply causes the switch to learn which port the >>>>>>> station is connected to. >>>>>>> >>>>>>> With IPv6 networking, the situation is rather different. Rather than >>>>>>> broadcasting ARP packets, a "neighbour solicitation" is multicasted >>>>>>> rather than broadcasted. This multicast needs to reach the intended >>>>>>> station in order for the neighbour to be discovered. >>>>>>> >>>>>>> Once a neighbour has been discovered, and entered into the sending >>>>>>> stations neighbour cache, communication can restart at a point later >>>>>>> without sending a new neighbour solicitation, even if the entry in >>>>>>> the neighbour cache is marked as stale. This can be after the MAC >>>>>>> address has expired from the forwarding cache of the DSA switch - >>>>>>> when that occurs, there is a long pause in communication. >>>>>>> >>>>>>> Our DSA implementation for mv88e6xxx switches has defaulted to having >>>>>>> multicast and unicast flooding disabled. As per the above description, >>>>>>> this is fine for IPv4 networking, since the broadcasted ARP queries >>>>>>> will be sent to and received by all stations on the same network. >>>>>>> However, this breaks IPv6 very badly - blocking neighbour solicitations >>>>>>> and later causing connections to stall. >>>>>>> >>>>>>> The defaults that the Linux bridge code expect from bridges are that >>>>>>> unknown unicast frames and unknown multicast frames are flooded to >>>>>>> all stations, which is at odds to the defaults adopted by our DSA >>>>>>> implementation for mv88e6xxx switches. >>>>>>> >>>>>>> This commit enables by default flooding of both unknown unicast and >>>>>>> unknown multicast frames. This means that mv88e6xxx DSA switches now >>>>>>> behave as per the bridge(8) man page, and IPv6 works flawlessly through >>>>>>> such a switch. >>>>>> >>>>>> Note that there is the open question whether this affects the case where >>>>>> each port is used as a separate network interface: that case has not yet >>>>>> been tested. >>>>> >>>>> I've checked with a mv88e6131 on the clearfog gt8k board. lan1 >>>>> connected to my lan with plenty of traffic on, and configured as >>>>> part of a bridge. lan2 connected to the zii board, but not part >>>>> of the bridge. Monitoring lan2 from the zii board shows no traffic >>>>> that was received from lan1. >>>>> >>>>> So it looks fine. >>>> >>>> With the current state whereby we do not have the necessary hooks to >>>> perform filtering on non-bridged/standalone ports, this is entirely fine >>>> indeed. >>>> >>>> In the future this is part of something I want to address because it is >>>> IMHO highly undesirable to have non-bridged ports be flooded with >>>> unknown multicast or unknown unicast for that matter because that makes >>>> them deviate from a standard NIC interface. Unknown unicast is not >>>> necessarily a low hanging fruit, but still, if we have switches capable >>>> of filtering, we might as well make use of that. Of course, one >>>> difficulty is that we must not break running tcpdump on those DSA slave >>>> network interfaces. >>> >>> Sorry, I think you have the wrong end of the stick. >>> >>> For a non-bridged port, I am seeing _no_ traffic apart from that >>> explicitly sent out through that port. In other words, there are >>> _no_ flooded frames coming out of the non-bridged port. >>> >>> This patch appears to have no material effect on non-bridged ports. >> >> Presumably because that non-bridged port and the CPU port are part of >> the same domain with only those 2 ports and that is what we want. >> >> Now what happens if say you have a station that sends multicast traffic >> through that port to e.g.: 226.94.1.1, I bet that port happily sends >> that multicast traffic to the CPU port with no filtering what so ever >> and this ends-up being dropped in the network stack because there is a >> socket look up failure there. IMHO unless you have a receiver/server on >> that network interface on the DSA network interface and a matching >> socket you should not be receiving that multicast traffic and the switch >> should be filtering it. Since the network stack will call into >> ndo_set_rx_mode() for those cases, we really just need to make that >> multicast traffic known, instead of unknown to the switch. > > If the port is not bridged, then it's operating as network interface, > and traffic to/from that port needs to be routed to the CPU port so > that it appears as it would do from a real network interface. > Doing anything else makes breaks the idea that you can use a set > of DSA ports as individual interfaces and run anything but IPv4 > non-multicast over them. I am not proposing changing any of that just making use of the switch's ability to snoop management traffic (ARP, DHCP, MLD, etc.) and use of the network stack's hints in order to avoid flooding unknown multicast. This is starting to diverge a little bit from the original issue though.
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index b75a865a293d..eb5e3d88374f 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2144,13 +2144,14 @@ static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port) static int mv88e6xxx_setup_egress_floods(struct mv88e6xxx_chip *chip, int port) { struct dsa_switch *ds = chip->ds; - bool flood; - /* Upstream ports flood frames with unknown unicast or multicast DA */ - flood = dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port); + /* Linux bridges are expected to flood unknown multicast and + * unicast frames to all ports - as per the defaults specified + * in the iproute2 bridge(8) man page. Not doing this causes + * stalls and failures with IPv6 over Marvell bridges. */ if (chip->info->ops->port_set_egress_floods) return chip->info->ops->port_set_egress_floods(chip, port, - flood, flood); + true, true); return 0; }
Switches work by learning the MAC address for each attached station by monitoring traffic from each station. When a station sends a packet, the switch records which port the MAC address is connected to. With IPv4 networking, before communication commences with a neighbour, an ARP packet is broadcasted to all stations asking for the MAC address corresponding with the IPv4. The desired station responds with an ARP reply, and the ARP reply causes the switch to learn which port the station is connected to. With IPv6 networking, the situation is rather different. Rather than broadcasting ARP packets, a "neighbour solicitation" is multicasted rather than broadcasted. This multicast needs to reach the intended station in order for the neighbour to be discovered. Once a neighbour has been discovered, and entered into the sending stations neighbour cache, communication can restart at a point later without sending a new neighbour solicitation, even if the entry in the neighbour cache is marked as stale. This can be after the MAC address has expired from the forwarding cache of the DSA switch - when that occurs, there is a long pause in communication. Our DSA implementation for mv88e6xxx switches has defaulted to having multicast and unicast flooding disabled. As per the above description, this is fine for IPv4 networking, since the broadcasted ARP queries will be sent to and received by all stations on the same network. However, this breaks IPv6 very badly - blocking neighbour solicitations and later causing connections to stall. The defaults that the Linux bridge code expect from bridges are that unknown unicast frames and unknown multicast frames are flooded to all stations, which is at odds to the defaults adopted by our DSA implementation for mv88e6xxx switches. This commit enables by default flooding of both unknown unicast and unknown multicast frames. This means that mv88e6xxx DSA switches now behave as per the bridge(8) man page, and IPv6 works flawlessly through such a switch. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> --- drivers/net/dsa/mv88e6xxx/chip.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)