diff mbox series

[net,1/2] net: mscc: ocelot: fix vlan_filtering when enslaving to bridge before link is up

Message ID 20191026180427.14039-2-olteanv@gmail.com
State Accepted
Delegated to: David Miller
Headers show
Series VLAN fixes for Ocelot switch | expand

Commit Message

Vladimir Oltean Oct. 26, 2019, 6:04 p.m. UTC
Background information: the driver operates the hardware in a mode where
a single VLAN can be transmitted as untagged on a particular egress
port. That is the "native VLAN on trunk port" use case. Its value is
held in port->vid.

Consider the following command sequence (no network manager, all
interfaces are down, debugging prints added by me):

$ ip link add dev br0 type bridge vlan_filtering 1
$ ip link set dev swp0 master br0

Kernel code path during last command:

br_add_slave -> ocelot_netdevice_port_event (NETDEV_CHANGEUPPER):
[   21.401901] ocelot_vlan_port_apply: port 0 vlan aware 0 pvid 0 vid 0

br_add_slave -> nbp_vlan_init -> switchdev_port_attr_set -> ocelot_port_attr_set (SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING):
[   21.413335] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 0 vid 0

br_add_slave -> nbp_vlan_init -> nbp_vlan_add -> br_switchdev_port_vlan_add -> switchdev_port_obj_add -> ocelot_port_obj_add -> ocelot_vlan_vid_add
[   21.667421] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 1

So far so good. The bridge has replaced the driver's default pvid used
in standalone mode (0) with its own default_pvid (1). The port's vid
(native VLAN) has also changed from 0 to 1.

$ ip link set dev swp0 up

[   31.722956] 8021q: adding VLAN 0 to HW filter on device swp0
do_setlink -> dev_change_flags -> vlan_vid_add -> ocelot_vlan_rx_add_vid -> ocelot_vlan_vid_add:
[   31.728700] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 0

The 8021q module uses the .ndo_vlan_rx_add_vid API on .ndo_open to make
ports be able to transmit and receive 802.1p-tagged traffic by default.
This API is supposed to offload a VLAN sub-interface, which for a switch
port means to add a VLAN that is not a pvid, and tagged on egress.

But the driver implementation of .ndo_vlan_rx_add_vid is wrong: it adds
back vid 0 as "egress untagged". Now back to the initial paragraph:
there is a single untagged VID that the driver keeps track of, and that
has just changed from 1 (the pvid) to 0. So this breaks the bridge
core's expectation, because it has changed vid 1 from untagged to
tagged, when what the user sees is.

$ bridge vlan
port    vlan ids
swp0     1 PVID Egress Untagged

br0      1 PVID Egress Untagged

But curiously, instead of manifesting itself as "untagged and
pvid-tagged traffic gets sent as tagged on egress", the bug:

- is hidden when vlan_filtering=0
- manifests as dropped traffic when vlan_filtering=1, due to this setting:

	if (port->vlan_aware && !port->vid)
		/* If port is vlan-aware and tagged, drop untagged and priority
		 * tagged frames.
		 */
		val |= ANA_PORT_DROP_CFG_DROP_UNTAGGED_ENA |
		       ANA_PORT_DROP_CFG_DROP_PRIO_S_TAGGED_ENA |
		       ANA_PORT_DROP_CFG_DROP_PRIO_C_TAGGED_ENA;

which would have made sense if it weren't for this bug. The setting's
intention was "this is a trunk port with no native VLAN, so don't accept
untagged traffic". So the driver was never expecting to set VLAN 0 as
the value of the native VLAN, 0 was just encoding for "invalid".

So the fix is to not send 802.1p traffic as untagged, because that would
change the port's native vlan to 0, unbeknownst to the bridge, and
trigger unexpected code paths in the driver.

Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Fixes: 7142529f1688 ("net: mscc: ocelot: add VLAN filtering")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/ethernet/mscc/ocelot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Florian Fainelli Oct. 26, 2019, 8:31 p.m. UTC | #1
On 10/26/2019 11:04 AM, Vladimir Oltean wrote:
> Background information: the driver operates the hardware in a mode where
> a single VLAN can be transmitted as untagged on a particular egress
> port. That is the "native VLAN on trunk port" use case. Its value is
> held in port->vid.
> 
> Consider the following command sequence (no network manager, all
> interfaces are down, debugging prints added by me):
> 
> $ ip link add dev br0 type bridge vlan_filtering 1
> $ ip link set dev swp0 master br0
> 
> Kernel code path during last command:
> 
> br_add_slave -> ocelot_netdevice_port_event (NETDEV_CHANGEUPPER):
> [   21.401901] ocelot_vlan_port_apply: port 0 vlan aware 0 pvid 0 vid 0
> 
> br_add_slave -> nbp_vlan_init -> switchdev_port_attr_set -> ocelot_port_attr_set (SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING):
> [   21.413335] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 0 vid 0
> 
> br_add_slave -> nbp_vlan_init -> nbp_vlan_add -> br_switchdev_port_vlan_add -> switchdev_port_obj_add -> ocelot_port_obj_add -> ocelot_vlan_vid_add
> [   21.667421] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 1
> 
> So far so good. The bridge has replaced the driver's default pvid used
> in standalone mode (0) with its own default_pvid (1). The port's vid
> (native VLAN) has also changed from 0 to 1.
> 
> $ ip link set dev swp0 up
> 
> [   31.722956] 8021q: adding VLAN 0 to HW filter on device swp0
> do_setlink -> dev_change_flags -> vlan_vid_add -> ocelot_vlan_rx_add_vid -> ocelot_vlan_vid_add:
> [   31.728700] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 0
> 
> The 8021q module uses the .ndo_vlan_rx_add_vid API on .ndo_open to make
> ports be able to transmit and receive 802.1p-tagged traffic by default.
> This API is supposed to offload a VLAN sub-interface, which for a switch
> port means to add a VLAN that is not a pvid, and tagged on egress.
> 
> But the driver implementation of .ndo_vlan_rx_add_vid is wrong: it adds
> back vid 0 as "egress untagged". Now back to the initial paragraph:
> there is a single untagged VID that the driver keeps track of, and that
> has just changed from 1 (the pvid) to 0. So this breaks the bridge
> core's expectation, because it has changed vid 1 from untagged to
> tagged, when what the user sees is.
> 
> $ bridge vlan
> port    vlan ids
> swp0     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> 
> But curiously, instead of manifesting itself as "untagged and
> pvid-tagged traffic gets sent as tagged on egress", the bug:
> 
> - is hidden when vlan_filtering=0
> - manifests as dropped traffic when vlan_filtering=1, due to this setting:
> 
> 	if (port->vlan_aware && !port->vid)
> 		/* If port is vlan-aware and tagged, drop untagged and priority
> 		 * tagged frames.
> 		 */
> 		val |= ANA_PORT_DROP_CFG_DROP_UNTAGGED_ENA |
> 		       ANA_PORT_DROP_CFG_DROP_PRIO_S_TAGGED_ENA |
> 		       ANA_PORT_DROP_CFG_DROP_PRIO_C_TAGGED_ENA;
> 
> which would have made sense if it weren't for this bug. The setting's
> intention was "this is a trunk port with no native VLAN, so don't accept
> untagged traffic". So the driver was never expecting to set VLAN 0 as
> the value of the native VLAN, 0 was just encoding for "invalid".
> 
> So the fix is to not send 802.1p traffic as untagged, because that would
> change the port's native vlan to 0, unbeknownst to the bridge, and
> trigger unexpected code paths in the driver.
> 
> Cc: Antoine Tenart <antoine.tenart@bootlin.com>
> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
> Fixes: 7142529f1688 ("net: mscc: ocelot: add VLAN filtering")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Nice explanation, thanks!
Alexandre Belloni Oct. 26, 2019, 11:03 p.m. UTC | #2
On 26/10/2019 21:04:26+0300, Vladimir Oltean wrote:
> Background information: the driver operates the hardware in a mode where
> a single VLAN can be transmitted as untagged on a particular egress
> port. That is the "native VLAN on trunk port" use case. Its value is
> held in port->vid.
> 
> Consider the following command sequence (no network manager, all
> interfaces are down, debugging prints added by me):
> 
> $ ip link add dev br0 type bridge vlan_filtering 1
> $ ip link set dev swp0 master br0
> 
> Kernel code path during last command:
> 
> br_add_slave -> ocelot_netdevice_port_event (NETDEV_CHANGEUPPER):
> [   21.401901] ocelot_vlan_port_apply: port 0 vlan aware 0 pvid 0 vid 0
> 
> br_add_slave -> nbp_vlan_init -> switchdev_port_attr_set -> ocelot_port_attr_set (SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING):
> [   21.413335] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 0 vid 0
> 
> br_add_slave -> nbp_vlan_init -> nbp_vlan_add -> br_switchdev_port_vlan_add -> switchdev_port_obj_add -> ocelot_port_obj_add -> ocelot_vlan_vid_add
> [   21.667421] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 1
> 
> So far so good. The bridge has replaced the driver's default pvid used
> in standalone mode (0) with its own default_pvid (1). The port's vid
> (native VLAN) has also changed from 0 to 1.
> 
> $ ip link set dev swp0 up
> 
> [   31.722956] 8021q: adding VLAN 0 to HW filter on device swp0
> do_setlink -> dev_change_flags -> vlan_vid_add -> ocelot_vlan_rx_add_vid -> ocelot_vlan_vid_add:
> [   31.728700] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 0
> 
> The 8021q module uses the .ndo_vlan_rx_add_vid API on .ndo_open to make
> ports be able to transmit and receive 802.1p-tagged traffic by default.
> This API is supposed to offload a VLAN sub-interface, which for a switch
> port means to add a VLAN that is not a pvid, and tagged on egress.
> 
> But the driver implementation of .ndo_vlan_rx_add_vid is wrong: it adds
> back vid 0 as "egress untagged". Now back to the initial paragraph:
> there is a single untagged VID that the driver keeps track of, and that
> has just changed from 1 (the pvid) to 0. So this breaks the bridge
> core's expectation, because it has changed vid 1 from untagged to
> tagged, when what the user sees is.
> 
> $ bridge vlan
> port    vlan ids
> swp0     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> 
> But curiously, instead of manifesting itself as "untagged and
> pvid-tagged traffic gets sent as tagged on egress", the bug:
> 
> - is hidden when vlan_filtering=0
> - manifests as dropped traffic when vlan_filtering=1, due to this setting:
> 
> 	if (port->vlan_aware && !port->vid)
> 		/* If port is vlan-aware and tagged, drop untagged and priority
> 		 * tagged frames.
> 		 */
> 		val |= ANA_PORT_DROP_CFG_DROP_UNTAGGED_ENA |
> 		       ANA_PORT_DROP_CFG_DROP_PRIO_S_TAGGED_ENA |
> 		       ANA_PORT_DROP_CFG_DROP_PRIO_C_TAGGED_ENA;
> 
> which would have made sense if it weren't for this bug. The setting's
> intention was "this is a trunk port with no native VLAN, so don't accept
> untagged traffic". So the driver was never expecting to set VLAN 0 as
> the value of the native VLAN, 0 was just encoding for "invalid".
> 
> So the fix is to not send 802.1p traffic as untagged, because that would
> change the port's native vlan to 0, unbeknownst to the bridge, and
> trigger unexpected code paths in the driver.
> 
> Cc: Antoine Tenart <antoine.tenart@bootlin.com>
> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
> Fixes: 7142529f1688 ("net: mscc: ocelot: add VLAN filtering")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

> ---
>  drivers/net/ethernet/mscc/ocelot.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
> index 7190fe4c1095..552252331e55 100644
> --- a/drivers/net/ethernet/mscc/ocelot.c
> +++ b/drivers/net/ethernet/mscc/ocelot.c
> @@ -915,7 +915,7 @@ static int ocelot_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
>  static int ocelot_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
>  				  u16 vid)
>  {
> -	return ocelot_vlan_vid_add(dev, vid, false, true);
> +	return ocelot_vlan_vid_add(dev, vid, false, false);
>  }
>  
>  static int ocelot_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
> -- 
> 2.17.1
>
Horatiu Vultur Oct. 28, 2019, 11:48 p.m. UTC | #3
The 10/26/2019 21:04, Vladimir Oltean wrote:
> Background information: the driver operates the hardware in a mode where
> a single VLAN can be transmitted as untagged on a particular egress
> port. That is the "native VLAN on trunk port" use case. Its value is
> held in port->vid.
> 
> Consider the following command sequence (no network manager, all
> interfaces are down, debugging prints added by me):
> 
> $ ip link add dev br0 type bridge vlan_filtering 1
> $ ip link set dev swp0 master br0
> 
> Kernel code path during last command:
> 
> br_add_slave -> ocelot_netdevice_port_event (NETDEV_CHANGEUPPER):
> [   21.401901] ocelot_vlan_port_apply: port 0 vlan aware 0 pvid 0 vid 0
> 
> br_add_slave -> nbp_vlan_init -> switchdev_port_attr_set -> ocelot_port_attr_set (SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING):
> [   21.413335] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 0 vid 0
> 
> br_add_slave -> nbp_vlan_init -> nbp_vlan_add -> br_switchdev_port_vlan_add -> switchdev_port_obj_add -> ocelot_port_obj_add -> ocelot_vlan_vid_add
> [   21.667421] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 1
> 
> So far so good. The bridge has replaced the driver's default pvid used
> in standalone mode (0) with its own default_pvid (1). The port's vid
> (native VLAN) has also changed from 0 to 1.
> 
> $ ip link set dev swp0 up
> 
> [   31.722956] 8021q: adding VLAN 0 to HW filter on device swp0
> do_setlink -> dev_change_flags -> vlan_vid_add -> ocelot_vlan_rx_add_vid -> ocelot_vlan_vid_add:
> [   31.728700] ocelot_vlan_port_apply: port 0 vlan aware 1 pvid 1 vid 0
> 
> The 8021q module uses the .ndo_vlan_rx_add_vid API on .ndo_open to make
> ports be able to transmit and receive 802.1p-tagged traffic by default.
> This API is supposed to offload a VLAN sub-interface, which for a switch
> port means to add a VLAN that is not a pvid, and tagged on egress.
> 
> But the driver implementation of .ndo_vlan_rx_add_vid is wrong: it adds
> back vid 0 as "egress untagged". Now back to the initial paragraph:
> there is a single untagged VID that the driver keeps track of, and that
> has just changed from 1 (the pvid) to 0. So this breaks the bridge
> core's expectation, because it has changed vid 1 from untagged to
> tagged, when what the user sees is.
> 
> $ bridge vlan
> port    vlan ids
> swp0     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> 
> But curiously, instead of manifesting itself as "untagged and
> pvid-tagged traffic gets sent as tagged on egress", the bug:
> 
> - is hidden when vlan_filtering=0
> - manifests as dropped traffic when vlan_filtering=1, due to this setting:
> 
> 	if (port->vlan_aware && !port->vid)
> 		/* If port is vlan-aware and tagged, drop untagged and priority
> 		 * tagged frames.
> 		 */
> 		val |= ANA_PORT_DROP_CFG_DROP_UNTAGGED_ENA |
> 		       ANA_PORT_DROP_CFG_DROP_PRIO_S_TAGGED_ENA |
> 		       ANA_PORT_DROP_CFG_DROP_PRIO_C_TAGGED_ENA;
> 
> which would have made sense if it weren't for this bug. The setting's
> intention was "this is a trunk port with no native VLAN, so don't accept
> untagged traffic". So the driver was never expecting to set VLAN 0 as
> the value of the native VLAN, 0 was just encoding for "invalid".
> 
> So the fix is to not send 802.1p traffic as untagged, because that would
> change the port's native vlan to 0, unbeknownst to the bridge, and
> trigger unexpected code paths in the driver.
> 
> Cc: Antoine Tenart <antoine.tenart@bootlin.com>
> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
> Fixes: 7142529f1688 ("net: mscc: ocelot: add VLAN filtering")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> ---
>  drivers/net/ethernet/mscc/ocelot.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
> index 7190fe4c1095..552252331e55 100644
> --- a/drivers/net/ethernet/mscc/ocelot.c
> +++ b/drivers/net/ethernet/mscc/ocelot.c
> @@ -915,7 +915,7 @@ static int ocelot_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
>  static int ocelot_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
>  				  u16 vid)
>  {
> -	return ocelot_vlan_vid_add(dev, vid, false, true);
> +	return ocelot_vlan_vid_add(dev, vid, false, false);
>  }
>  
>  static int ocelot_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
> -- 
> 2.17.1
>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 7190fe4c1095..552252331e55 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -915,7 +915,7 @@  static int ocelot_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
 static int ocelot_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 				  u16 vid)
 {
-	return ocelot_vlan_vid_add(dev, vid, false, true);
+	return ocelot_vlan_vid_add(dev, vid, false, false);
 }
 
 static int ocelot_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,