[net-next-2.6] net: harmonize the call to ptype_all and ptype_base handlers.

Message ID 1299417916-14198-1-git-send-email-nicolas.2p.debian@free.fr
State RFC, archived
Delegated to: David Miller

Commit Message

Nicolas de Pesloüan March 6, 2011, 1:25 p.m. UTC
Until now, ptype_all and ptype_base delivery in __netif_receive_skb() has been
inconsistent.

- For ptype_all, we deliver to every device crossed while walking the
rx_handler path (inside the another_round loop), and there is no way to stop
wildcard delivery (no exact match logic).
- For ptype_base, we deliver to the lowest device (orig_dev) and to the highest
(skb->dev) and we can ask for exact match delivery.

This patch tries to fix this by:

1/ Doing exact match delivery for both ptype_all and ptype_base, while walking
   the rx_handler path.
2/ Doing wildcard match delivery at the end of __netif_receive_skb(), if not
   asked to do exact match delivery only.

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
---

This applies on top of the last batch of patches from Jiri Pirko.
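
For illustration only, the two delivery phases described above can be modelled
in userspace. This is a minimal sketch, not kernel code: struct model_dev,
struct model_ptype and model_receive() are hypothetical names, RCU and the
ptype_base hashing are left out, and deliver_exact is passed in by the caller
instead of being set while walking the rx_handler path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only the link to the next
 * device on the rx_handler path (e.g. eth0 -> bond0) is modelled. */
struct model_dev {
	const char *name;
	struct model_dev *upper;	/* NULL at the top of the path */
};

/* Hypothetical stand-in for struct packet_type. */
struct model_ptype {
	const struct model_dev *dev;	/* NULL means wildcard handler */
	int hits;			/* number of deliveries so far */
};

/*
 * Model of the proposed delivery order; returns the number of deliveries.
 * Assumption: deliver_exact is supplied by the caller, whereas in the patch
 * it is a local flag of __netif_receive_skb().
 */
int model_receive(struct model_dev *lowest, struct model_ptype *pt,
		  size_t n, bool deliver_exact)
{
	const struct model_dev *dev;
	int delivered = 0;
	size_t i;

	assert(lowest != NULL);

	/* Phase 1: exact match on every device crossed (another_round). */
	for (dev = lowest; dev; dev = dev->upper) {
		for (i = 0; i < n; i++) {
			if (pt[i].dev == dev) {
				pt[i].hits++;
				delivered++;
			}
		}
	}

	/* Phase 2: wildcard delivery, skipped when exact match only. */
	if (!deliver_exact) {
		for (i = 0; i < n; i++) {
			if (!pt[i].dev) {
				pt[i].hits++;
				delivered++;
			}
		}
	}

	return delivered;
}
```

With eth0 below bond0, one handler bound to each device and one wildcard
handler, a normal pass makes three deliveries and an exact-match-only pass
makes two (the wildcard handler is skipped).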
---
 net/core/dev.c |   32 ++++++++++++++++++++++++--------
 1 files changed, 24 insertions(+), 8 deletions(-)

Comments

Jiri Pirko March 7, 2011, 10:03 a.m. UTC | #1
I tend to like this patch. However, I'm not sure whether the 2 extra loops
will introduce noticeable overhead :/
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan March 7, 2011, 8:41 p.m. UTC | #2
On 07/03/2011 11:03, Jiri Pirko wrote:
> I tend to like this patch. However, I'm not sure whether the 2 extra loops
> will introduce noticeable overhead :/

I think ptype_all and ptype_base lists should only contain entries having ptype->dev == NULL.

The entries having ptype->dev != NULL should be on per net_device lists. The head of those lists 
could/should be in a ptype_all and a ptype_base property in net_device.

This would speed up the exact-match loops, because they would scan small (or empty) lists.
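
A rough userspace sketch of that layout, for illustration only. All sketch_*
names are hypothetical, only the ptype_all side is shown, and locking/RCU is
ignored. Registration picks the list from ptype->dev, so the exact-match scan
only ever walks the handlers bound to the device at hand.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dev;

/* Hypothetical stand-in for struct packet_type. */
struct sketch_ptype {
	struct sketch_ptype *next;
	struct sketch_dev *dev;		/* NULL means wildcard handler */
	int hits;			/* number of deliveries so far */
};

/* Hypothetical stand-in for struct net_device with a per-device list head. */
struct sketch_dev {
	struct sketch_ptype *ptype_all;	/* handlers bound to this device */
};

/* Global list: holds only entries with dev == NULL. */
struct sketch_ptype *sketch_ptype_all;

/* Register a handler on the per-device list or on the global list. */
void sketch_add_pack(struct sketch_ptype *pt)
{
	struct sketch_ptype **head;

	assert(pt != NULL);
	head = pt->dev ? &pt->dev->ptype_all : &sketch_ptype_all;
	pt->next = *head;
	*head = pt;
}

/* Deliver to every entry on one list; returns how many were scanned. */
int sketch_deliver_list(struct sketch_ptype *head)
{
	int scanned = 0;

	for (; head; head = head->next) {
		head->hits++;
		scanned++;
	}
	return scanned;
}
```

A device with no bound handler has an empty list, so its exact-match pass
costs nothing beyond the NULL check on the list head.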

I need to double check the possible impact of this proposal.

	Nicolas.
Jiri Pirko March 7, 2011, 9:12 p.m. UTC | #3
Mon, Mar 07, 2011 at 09:41:19PM CET, nicolas.2p.debian@gmail.com wrote:
>I think ptype_all and ptype_base lists should only contain entries having ptype->dev == NULL.
>
>The entries having ptype->dev != NULL should be on per net_device
>lists. The head of those lists could/should be in a ptype_all and a
>ptype_base property in net_device.
>
>This would speed up the exact-match loops, because they would scan small (or empty) lists.
>
>I need to double check the possible impact of this proposal.

At first glance, this makes sense to me.

>
>	Nicolas.
Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index c71bd18..a368223 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3117,8 +3117,6 @@  static int __netif_receive_skb(struct sk_buff *skb)
 {
 	struct packet_type *ptype, *pt_prev;
 	rx_handler_func_t *rx_handler;
-	struct net_device *orig_dev;
-	struct net_device *null_or_dev;
 	bool deliver_exact = false;
 	int ret = NET_RX_DROP;
 	__be16 type;
@@ -3134,7 +3132,6 @@  static int __netif_receive_skb(struct sk_buff *skb)
 
 	if (!skb->skb_iif)
 		skb->skb_iif = skb->dev->ifindex;
-	orig_dev = skb->dev;
 
 	skb_reset_network_header(skb);
 	skb_reset_transport_header(skb);
@@ -3156,7 +3153,17 @@  another_round:
 #endif
 
 	list_for_each_entry_rcu(ptype, &ptype_all, list) {
-		if (!ptype->dev || ptype->dev == skb->dev) {
+		if (ptype->dev == skb->dev) {
+			if (pt_prev)
+				ret = deliver_skb(skb, pt_prev);
+			pt_prev = ptype;
+		}
+	}
+
+	type = skb->protocol;
+	list_for_each_entry_rcu(ptype,
+			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
+		if (ptype->type == type && ptype->dev == skb->dev) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev);
 			pt_prev = ptype;
@@ -3205,20 +3212,29 @@  ncls:
 	vlan_on_bond_hook(skb);
 
 	/* deliver only exact match when indicated */
-	null_or_dev = deliver_exact ? skb->dev : NULL;
+	if (deliver_exact)
+		goto skip_wildcard_delivery;
+
+	list_for_each_entry_rcu(ptype, &ptype_all, list) {
+		if (!ptype->dev) {
+			if (pt_prev)
+				ret = deliver_skb(skb, pt_prev);
+			pt_prev = ptype;
+		}
+	}
 
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype,
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
-		if (ptype->type == type &&
-		    (ptype->dev == null_or_dev || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		if (ptype->type == type && !ptype->dev) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev);
 			pt_prev = ptype;
 		}
 	}
 
+skip_wildcard_delivery:
+
 	if (pt_prev) {
 		ret = pt_prev->func(skb, skb->dev, pt_prev);
 	} else {