diff mbox

[RFC] net/core: fix skb handling on netif serves for both bridge and vlan

Message ID 1299149713-18740-1-git-send-email-dfeng@redhat.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Xiaotian Feng March 3, 2011, 10:55 a.m. UTC
Consider network topology as follows:

eth0  eth1
 |_____|
    |
  bond0 --- br0
    |
  vlan0 --- br1

bond0 serves for both br0 and vlan0, if a vlan tagged packet was sent
to br1 through bond0, bridge handling code is seeing the packet on bond0
and handing it off to my "legacy" bridge before vlan_tx_tag_present
and vlan_hwaccel_do_receive even haven't a chance to look at it.

Moving the vlan_tx_tag_present before bridge/macvlan handling code could
cure this.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tom Herbert <therbert@google.com>
---
 net/core/dev.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

Comments

Ben Hutchings March 3, 2011, 1:53 p.m. UTC | #1
On Thu, 2011-03-03 at 18:55 +0800, Xiaotian Feng wrote:
> Consider network topology as follows:
> 
> eth0  eth1
>  |_____|
>     |
>   bond0 --- br0
>     |
>   vlan0 --- br1
> 
> bond0 serves for both br0 and vlan0, if a vlan tagged packet was sent
> to br1 through bond0, bridge handling code is seeing the packet on bond0
> and handing it off to my "legacy" bridge before vlan_tx_tag_present
> and vlan_hwaccel_do_receive even haven't a chance to look at it.
[...]

This used to work if the underlying device (bond0 in your example)
implemented VLAN tag extraction, because the VLAN group would be checked
before the bridge.  But it never worked for devices without VLAN tag
extraction.  Perhaps we should just prevent this configuration.

Ben.
Nicolas de Pesloüan March 3, 2011, 9:42 p.m. UTC | #2
Le 03/03/2011 14:53, Ben Hutchings a écrit :
> On Thu, 2011-03-03 at 18:55 +0800, Xiaotian Feng wrote:
>> Consider network topology as follows:
>>
>> eth0  eth1
>>   |_____|
>>      |
>>    bond0 --- br0
>>      |
>>    vlan0 --- br1
>>
>> bond0 serves for both br0 and vlan0, if a vlan tagged packet was sent
>> to br1 through bond0, bridge handling code is seeing the packet on bond0
>> and handing it off to my "legacy" bridge before vlan_tx_tag_present
>> and vlan_hwaccel_do_receive even haven't a chance to look at it.
> [...]
>
> This used to work if the underlying device (bond0 in your example)
> implemented VLAN tag extraction, because the VLAN group would be checked
> before the bridge.  But it never worked for devices without VLAN tag
> extraction.  Perhaps we should just prevent this configuration.

If Jiri Pirko eventually move vlan processing to rx_handler, then the setup won't be possible, 
because bond0 would require two rx_handlers : one for bridge (-> br0) and one for vlan (-> vlan0).

Or we need several rx_handlers per net_device, if the above setup is a real one.

	Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko March 5, 2011, 10:36 a.m. UTC | #3
Thu, Mar 03, 2011 at 11:55:13AM CET, dfeng@redhat.com wrote:
>Consider network topology as follows:
>
>eth0  eth1
> |_____|
>    |
>  bond0 --- br0
>    |
>  vlan0 --- br1
>
>bond0 serves for both br0 and vlan0, if a vlan tagged packet was sent
>to br1 through bond0, bridge handling code is seeing the packet on bond0
>and handing it off to my "legacy" bridge before vlan_tx_tag_present
>and vlan_hwaccel_do_receive even haven't a chance to look at it.
>
>Moving the vlan_tx_tag_present before bridge/macvlan handling code could
>cure this.

Wouldn't this break "eth0 - br0 - br0.5"?

>
>Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
>Cc: "David S. Miller" <davem@davemloft.net>
>Cc: Eric Dumazet <eric.dumazet@gmail.com>
>Cc: Tom Herbert <therbert@google.com>
>---
> net/core/dev.c |   20 ++++++++++----------
> 1 files changed, 10 insertions(+), 10 deletions(-)
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 8ae6631..d2d12c2 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -3079,27 +3079,27 @@ static int __netif_receive_skb(struct sk_buff *skb)
> ncls:
> #endif
> 
>-	/* Handle special case of bridge or macvlan */
>-	rx_handler = rcu_dereference(skb->dev->rx_handler);
>-	if (rx_handler) {
>+	if (vlan_tx_tag_present(skb)) {
> 		if (pt_prev) {
> 			ret = deliver_skb(skb, pt_prev, orig_dev);
> 			pt_prev = NULL;
> 		}
>-		skb = rx_handler(skb);
>-		if (!skb)
>+		if (vlan_hwaccel_do_receive(&skb)) {
>+			ret = __netif_receive_skb(skb);
>+			goto out;
>+		} else if (unlikely(!skb))
> 			goto out;
> 	}
> 
>-	if (vlan_tx_tag_present(skb)) {
>+	/* Handle special case of bridge or macvlan */
>+	rx_handler = rcu_dereference(skb->dev->rx_handler);
>+	if (rx_handler) {
> 		if (pt_prev) {
> 			ret = deliver_skb(skb, pt_prev, orig_dev);
> 			pt_prev = NULL;
> 		}
>-		if (vlan_hwaccel_do_receive(&skb)) {
>-			ret = __netif_receive_skb(skb);
>-			goto out;
>-		} else if (unlikely(!skb))
>+		skb = rx_handler(skb);
>+		if (!skb)
> 			goto out;
> 	}
> 
>-- 
>1.7.1
>
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan March 5, 2011, 1:53 p.m. UTC | #4
Le 05/03/2011 11:36, Jiri Pirko a écrit :
> Thu, Mar 03, 2011 at 11:55:13AM CET, dfeng@redhat.com wrote:
>> Consider network topology as follows:
>>
>> eth0  eth1
>> |_____|
>>     |
>>   bond0 --- br0
>>     |
>>   vlan0 --- br1
>>
>> bond0 serves for both br0 and vlan0, if a vlan tagged packet was sent
>> to br1 through bond0, bridge handling code is seeing the packet on bond0
>> and handing it off to my "legacy" bridge before vlan_tx_tag_present
>> and vlan_hwaccel_do_receive even haven't a chance to look at it.
>>
>> Moving the vlan_tx_tag_present before bridge/macvlan handling code could
>> cure this.
>
> Wouldn't this break "eth0 - br0 - br0.5"?

I think it would. One more reason to build a single interface stacking framework...

	Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko March 5, 2011, 2:25 p.m. UTC | #5
Sat, Mar 05, 2011 at 02:53:47PM CET, nicolas.2p.debian@gmail.com wrote:
>Le 05/03/2011 11:36, Jiri Pirko a écrit :
>>Thu, Mar 03, 2011 at 11:55:13AM CET, dfeng@redhat.com wrote:
>>>Consider network topology as follows:
>>>
>>>eth0  eth1
>>>|_____|
>>>    |
>>>  bond0 --- br0
>>>    |
>>>  vlan0 --- br1
>>>
>>>bond0 serves for both br0 and vlan0, if a vlan tagged packet was sent
>>>to br1 through bond0, bridge handling code is seeing the packet on bond0
>>>and handing it off to my "legacy" bridge before vlan_tx_tag_present
>>>and vlan_hwaccel_do_receive even haven't a chance to look at it.
>>>
>>>Moving the vlan_tx_tag_present before bridge/macvlan handling code could
>>>cure this.
>>
>>Wouldn't this break "eth0 - br0 - br0.5"?
>
>I think it would. One more reason to build a single interface stacking framework...

I plan to deal with vlans later. This kind of topology would not be
possible after that :)
>
>	Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 8ae6631..d2d12c2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3079,27 +3079,27 @@  static int __netif_receive_skb(struct sk_buff *skb)
 ncls:
 #endif
 
-	/* Handle special case of bridge or macvlan */
-	rx_handler = rcu_dereference(skb->dev->rx_handler);
-	if (rx_handler) {
+	if (vlan_tx_tag_present(skb)) {
 		if (pt_prev) {
 			ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = NULL;
 		}
-		skb = rx_handler(skb);
-		if (!skb)
+		if (vlan_hwaccel_do_receive(&skb)) {
+			ret = __netif_receive_skb(skb);
+			goto out;
+		} else if (unlikely(!skb))
 			goto out;
 	}
 
-	if (vlan_tx_tag_present(skb)) {
+	/* Handle special case of bridge or macvlan */
+	rx_handler = rcu_dereference(skb->dev->rx_handler);
+	if (rx_handler) {
 		if (pt_prev) {
 			ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = NULL;
 		}
-		if (vlan_hwaccel_do_receive(&skb)) {
-			ret = __netif_receive_skb(skb);
-			goto out;
-		} else if (unlikely(!skb))
+		skb = rx_handler(skb);
+		if (!skb)
 			goto out;
 	}