diff mbox series

[net-next,v3,3/3] net: dsa: enable flooding for bridge ports

Message ID E1gwR7T-0003iw-Cq@rmk-PC.armlinux.org.uk
State Superseded
Delegated to: David Miller
Headers show
Series net: dsa: mv88e6xxx: fix IPv6 | expand

Commit Message

Russell King (Oracle) Feb. 20, 2019, 12:36 p.m. UTC
Switches work by learning the MAC address for each attached station by
monitoring traffic from each station.  When a station sends a packet,
the switch records which port the MAC address is connected to.

With IPv4 networking, before communication commences with a neighbour,
an ARP packet is broadcasted to all stations asking for the MAC address
corresponding with the IPv4.  The desired station responds with an ARP
reply, and the ARP reply causes the switch to learn which port the
station is connected to.

With IPv6 networking, the situation is rather different.  Rather than
broadcasting ARP packets, a "neighbour solicitation" is multicasted
rather than broadcasted.  This multicast needs to reach the intended
station in order for the neighbour to be discovered.

Once a neighbour has been discovered, and entered into the sending
stations neighbour cache, communication can restart at a point later
without sending a new neighbour solicitation, even if the entry in
the neighbour cache is marked as stale.  This can be after the MAC
address has expired from the forwarding cache of the DSA switch -
when that occurs, there is a long pause in communication.

Our DSA implementation for mv88e6xxx switches disables flooding of
multicast and unicast frames for bridged ports.  As per the above
description, this is fine for IPv4 networking, since the broadcasted
ARP queries will be sent to and received by all stations on the same
network.  However, this breaks IPv6 very badly - blocking neighbour
solicitations and later causing connections to stall.

The defaults that the Linux bridge code expect from bridges are for
unknown unicast and unknown multicast frames to be flooded to all ports
on the bridge, which is at odds to the defaults adopted by our DSA
implementation for mv88e6xxx switches.

This commit enables by default flooding of both unknown unicast and
unknown multicast frames whenever a port is added to a bridge, and
disables the flooding when a port leaves the bridge.  This means that
mv88e6xxx DSA switches now behave as per the bridge(8) man page, and
IPv6 works flawlessly through such a switch.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 net/dsa/port.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Vivien Didelot Feb. 20, 2019, 5:23 p.m. UTC | #1
Hi Russell,

On Wed, 20 Feb 2019 12:36:59 +0000, Russell King <rmk+kernel@armlinux.org.uk> wrote:
> Switches work by learning the MAC address for each attached station by
> monitoring traffic from each station.  When a station sends a packet,
> the switch records which port the MAC address is connected to.
> 
> With IPv4 networking, before communication commences with a neighbour,
> an ARP packet is broadcasted to all stations asking for the MAC address
> corresponding with the IPv4.  The desired station responds with an ARP
> reply, and the ARP reply causes the switch to learn which port the
> station is connected to.
> 
> With IPv6 networking, the situation is rather different.  Rather than
> broadcasting ARP packets, a "neighbour solicitation" is multicasted
> rather than broadcasted.  This multicast needs to reach the intended
> station in order for the neighbour to be discovered.
> 
> Once a neighbour has been discovered, and entered into the sending
> stations neighbour cache, communication can restart at a point later
> without sending a new neighbour solicitation, even if the entry in
> the neighbour cache is marked as stale.  This can be after the MAC
> address has expired from the forwarding cache of the DSA switch -
> when that occurs, there is a long pause in communication.
> 
> Our DSA implementation for mv88e6xxx switches disables flooding of
> multicast and unicast frames for bridged ports.  As per the above
> description, this is fine for IPv4 networking, since the broadcasted
> ARP queries will be sent to and received by all stations on the same
> network.  However, this breaks IPv6 very badly - blocking neighbour
> solicitations and later causing connections to stall.
> 
> The defaults that the Linux bridge code expect from bridges are for
> unknown unicast and unknown multicast frames to be flooded to all ports
> on the bridge, which is at odds to the defaults adopted by our DSA
> implementation for mv88e6xxx switches.
> 
> This commit enables by default flooding of both unknown unicast and
> unknown multicast frames whenever a port is added to a bridge, and
> disables the flooding when a port leaves the bridge.  This means that
> mv88e6xxx DSA switches now behave as per the bridge(8) man page, and
> IPv6 works flawlessly through such a switch.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
>  net/dsa/port.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index b84d010fb165..9e7aab13957e 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -105,6 +105,11 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
>  	};
>  	int err;
>  
> +	/* Set the flooding mode before joining */

Note that as stated by the comment just below, the port has already joined
the bridge here.

> +	err = dsa_port_bridge_flags(dp, BR_FLOOD | BR_MCAST_FLOOD, NULL);
> +	if (err)
> +		return err;
> +
>  	/* Here the port is already bridged. Reflect the current configuration
>  	 * so that drivers can program their chips accordingly.
>  	 */
> @@ -113,8 +118,10 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
>  	err = dsa_port_notify(dp, DSA_NOTIFIER_BRIDGE_JOIN, &info);
>  
>  	/* The bridging is rolled back on error */
> -	if (err)
> +	if (err) {
> +		dsa_port_bridge_flags(dp, 0, NULL);
>  		dp->bridge_dev = NULL;
> +	}
>  
>  	return err;
>  }
> @@ -137,6 +144,9 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
>  	if (err)
>  		pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n");
>  
> +	/* Port is leaving the bridge, disable flooding */
> +	dsa_port_bridge_flags(dp, BR_LEARNING, NULL);
> +
>  	/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
>  	 * so allow it to be in BR_STATE_FORWARDING to be kept functional
>  	 */


This makes it clear that we must add this logic which sets the expected
default flags into the bridge code itself. But this can be done later.


Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Russell King (Oracle) Feb. 20, 2019, 5:33 p.m. UTC | #2
On Wed, Feb 20, 2019 at 12:23:55PM -0500, Vivien Didelot wrote:
> Hi Russell,
> 
> On Wed, 20 Feb 2019 12:36:59 +0000, Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > Switches work by learning the MAC address for each attached station by
> > monitoring traffic from each station.  When a station sends a packet,
> > the switch records which port the MAC address is connected to.
> > 
> > With IPv4 networking, before communication commences with a neighbour,
> > an ARP packet is broadcasted to all stations asking for the MAC address
> > corresponding with the IPv4.  The desired station responds with an ARP
> > reply, and the ARP reply causes the switch to learn which port the
> > station is connected to.
> > 
> > With IPv6 networking, the situation is rather different.  Rather than
> > broadcasting ARP packets, a "neighbour solicitation" is multicasted
> > rather than broadcasted.  This multicast needs to reach the intended
> > station in order for the neighbour to be discovered.
> > 
> > Once a neighbour has been discovered, and entered into the sending
> > stations neighbour cache, communication can restart at a point later
> > without sending a new neighbour solicitation, even if the entry in
> > the neighbour cache is marked as stale.  This can be after the MAC
> > address has expired from the forwarding cache of the DSA switch -
> > when that occurs, there is a long pause in communication.
> > 
> > Our DSA implementation for mv88e6xxx switches disables flooding of
> > multicast and unicast frames for bridged ports.  As per the above
> > description, this is fine for IPv4 networking, since the broadcasted
> > ARP queries will be sent to and received by all stations on the same
> > network.  However, this breaks IPv6 very badly - blocking neighbour
> > solicitations and later causing connections to stall.
> > 
> > The defaults that the Linux bridge code expect from bridges are for
> > unknown unicast and unknown multicast frames to be flooded to all ports
> > on the bridge, which is at odds to the defaults adopted by our DSA
> > implementation for mv88e6xxx switches.
> > 
> > This commit enables by default flooding of both unknown unicast and
> > unknown multicast frames whenever a port is added to a bridge, and
> > disables the flooding when a port leaves the bridge.  This means that
> > mv88e6xxx DSA switches now behave as per the bridge(8) man page, and
> > IPv6 works flawlessly through such a switch.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > ---
> >  net/dsa/port.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/dsa/port.c b/net/dsa/port.c
> > index b84d010fb165..9e7aab13957e 100644
> > --- a/net/dsa/port.c
> > +++ b/net/dsa/port.c
> > @@ -105,6 +105,11 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
> >  	};
> >  	int err;
> >  
> > +	/* Set the flooding mode before joining */
> 
> Note that as stated by the comment just below, the port has already joined
> the bridge here.

The software interface has joined at this point, but not the physical
port itself.  I actually find that the statement in the comment below
this code is the confusing statement.

> 
> > +	err = dsa_port_bridge_flags(dp, BR_FLOOD | BR_MCAST_FLOOD, NULL);
> > +	if (err)
> > +		return err;
> > +
> >  	/* Here the port is already bridged. Reflect the current configuration
> >  	 * so that drivers can program their chips accordingly.

This is the confusing statement: the switch port is not bridged at
this point since we haven't programmed the hardware to make that happen.
The Linux interface corresponding to the switch port is what has been
bridged.

> >  	 */
> > @@ -113,8 +118,10 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
> >  	err = dsa_port_notify(dp, DSA_NOTIFIER_BRIDGE_JOIN, &info);
> >  
> >  	/* The bridging is rolled back on error */
> > -	if (err)
> > +	if (err) {
> > +		dsa_port_bridge_flags(dp, 0, NULL);
> >  		dp->bridge_dev = NULL;
> > +	}
> >  
> >  	return err;
> >  }
> > @@ -137,6 +144,9 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
> >  	if (err)
> >  		pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n");
> >  
> > +	/* Port is leaving the bridge, disable flooding */
> > +	dsa_port_bridge_flags(dp, BR_LEARNING, NULL);

Hmm, that should've been 0 not BR_LEARNING since we are not dealing
with that flag yet.  I'll send a v4 later this evening.

> > +
> >  	/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
> >  	 * so allow it to be in BR_STATE_FORWARDING to be kept functional
> >  	 */
> 
> 
> This makes it clear that we must add this logic which sets the expected
> default flags into the bridge code itself. But this can be done later.

Note that the above is carefully chosen to ensure that the port is
configured for flooding whenever traffic may pass in bridge mode,
which I suggest is something that is kept in future.

> 
> 
> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
>
diff mbox series

Patch

diff --git a/net/dsa/port.c b/net/dsa/port.c
index b84d010fb165..9e7aab13957e 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -105,6 +105,11 @@  int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
 	};
 	int err;
 
+	/* Set the flooding mode before joining */
+	err = dsa_port_bridge_flags(dp, BR_FLOOD | BR_MCAST_FLOOD, NULL);
+	if (err)
+		return err;
+
 	/* Here the port is already bridged. Reflect the current configuration
 	 * so that drivers can program their chips accordingly.
 	 */
@@ -113,8 +118,10 @@  int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
 	err = dsa_port_notify(dp, DSA_NOTIFIER_BRIDGE_JOIN, &info);
 
 	/* The bridging is rolled back on error */
-	if (err)
+	if (err) {
+		dsa_port_bridge_flags(dp, 0, NULL);
 		dp->bridge_dev = NULL;
+	}
 
 	return err;
 }
@@ -137,6 +144,9 @@  void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
 	if (err)
 		pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n");
 
+	/* Port is leaving the bridge, disable flooding */
+	dsa_port_bridge_flags(dp, BR_LEARNING, NULL);
+
 	/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
 	 * so allow it to be in BR_STATE_FORWARDING to be kept functional
 	 */