
[v15,00/17] Provide a zero-copy method on KVM virtio-net.

Message ID 1289381008-5484-1-git-send-email-xiaohui.xin@intel.com
State RFC, archived
Delegated to: David Miller

Commit Message

Xin, Xiaohui Nov. 10, 2010, 9:23 a.m. UTC
From: Xin Xiaohui <xiaohui.xin@intel.com>

>2) The idea to key off of skb->dev in skb_release_data() is
>   fundamentally flawed since many actions can change skb->dev on you,
>   which will end up causing a leak of your external data areas.

How about this one? If the destructor_arg is not a good candidate,
then I have to add an apparent field in shinfo.

Thanks
Xiaohui

Comments

David Miller Nov. 10, 2010, 5:46 p.m. UTC | #1
From: xiaohui.xin@intel.com
Date: Wed, 10 Nov 2010 17:23:28 +0800

> From: Xin Xiaohui <xiaohui.xin@intel.com>
> 
>>2) The idea to key off of skb->dev in skb_release_data() is
>>   fundamentally flawed since many actions can change skb->dev on you,
>>   which will end up causing a leak of your external data areas.
> 
> How about this one? If the destructor_arg is not a good candidate,
> then I have to add an apparent field in shinfo.

If destructor_arg is actually a net_device pointer or similar,
you will need to take a reference count on it or similar.

Which means --> good bye performance especially on SMP.

You're going to be adding new serialization points and at
least two new atomics per packet.
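
The per-packet cost described above can be sketched in a small user-space
model. This is only an illustration, not kernel code: fake_dev, fake_pkt and
atomics_for_packets() are hypothetical names, and C11 atomics stand in for
the kernel's reference counting. Every packet that stores a pointer to a
refcounted object takes one atomic increment when it attaches and one atomic
decrement when it is released, all on the same shared counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a net_device. */
struct fake_dev {
	atomic_int refcnt;	/* shared by every CPU that touches the device */
};

/* Hypothetical stand-in for a packet that points back at its owner. */
struct fake_pkt {
	struct fake_dev *owner;
};

/* Counts atomic read-modify-write operations, for illustration only. */
static int atomic_ops;

static void fake_dev_hold(struct fake_dev *dev)
{
	atomic_fetch_add(&dev->refcnt, 1);
	atomic_ops++;
}

static void fake_dev_put(struct fake_dev *dev)
{
	atomic_fetch_sub(&dev->refcnt, 1);
	atomic_ops++;
}

/* Storing the pointer in the packet requires pinning the owner. */
static void fake_pkt_attach(struct fake_pkt *pkt, struct fake_dev *dev)
{
	fake_dev_hold(dev);
	pkt->owner = dev;
}

/* Releasing the packet drops the reference taken at attach time. */
static void fake_pkt_release(struct fake_pkt *pkt)
{
	fake_dev_put(pkt->owner);
	pkt->owner = NULL;
}

/* Returns the number of atomic operations needed to pass n packets. */
static int atomics_for_packets(int n)
{
	struct fake_dev dev;
	struct fake_pkt pkt;
	int i;

	atomic_init(&dev.refcnt, 1);
	atomic_ops = 0;
	for (i = 0; i < n; i++) {
		fake_pkt_attach(&pkt, &dev);
		fake_pkt_release(&pkt);
	}
	/* All per-packet references have been dropped again. */
	assert(atomic_load(&dev.refcnt) == 1);
	return atomic_ops;
}
```

In this model the count grows as two atomic operations per packet, and on
SMP each of them would contend for the cache line holding the counter.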
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xin, Xiaohui Nov. 11, 2010, 8:28 a.m. UTC | #2
>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>Sent: Thursday, November 11, 2010 1:47 AM
>To: Xin, Xiaohui
>Cc: netdev@vger.kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
>mst@redhat.com; mingo@elte.hu; herbert@gondor.apana.org.au; jdike@linux.intel.com
>Subject: Re: [PATCH v15 00/17] Provide a zero-copy method on KVM virtio-net.
>
>From: xiaohui.xin@intel.com
>Date: Wed, 10 Nov 2010 17:23:28 +0800
>
>> From: Xin Xiaohui <xiaohui.xin@intel.com>
>>
>>>2) The idea to key off of skb->dev in skb_release_data() is
>>>   fundamentally flawed since many actions can change skb->dev on you,
>>>   which will end up causing a leak of your external data areas.
>>
>> How about this one? If the destructor_arg is not a good candidate,
>> then I have to add an apparent field in shinfo.
>
>If destructor_arg is actually a net_device pointer or similar,
>you will need to take a reference count on it or similar.
>
Do you mean destructor_arg will be consumed by other users?
If that is the case, may I add a new structure member in shinfo?
Then only zero-copy will use it, and there is no need for the reference count.

>Which means --> good bye performance especially on SMP.
>
>You're going to be adding new serialization points and at
>least two new atomics per packet.
Xin, Xiaohui Nov. 17, 2010, 8:09 a.m. UTC | #3
>-----Original Message-----
>From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Xin,
>Xiaohui
>Sent: Thursday, November 11, 2010 4:28 PM
>To: David Miller
>Cc: netdev@vger.kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
>mst@redhat.com; mingo@elte.hu; herbert@gondor.apana.org.au; jdike@linux.intel.com
>Subject: RE: [PATCH v15 00/17] Provide a zero-copy method on KVM virtio-net.
>
>>-----Original Message-----
>>From: David Miller [mailto:davem@davemloft.net]
>>Sent: Thursday, November 11, 2010 1:47 AM
>>To: Xin, Xiaohui
>>Cc: netdev@vger.kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
>>mst@redhat.com; mingo@elte.hu; herbert@gondor.apana.org.au; jdike@linux.intel.com
>>Subject: Re: [PATCH v15 00/17] Provide a zero-copy method on KVM virtio-net.
>>
>>From: xiaohui.xin@intel.com
>>Date: Wed, 10 Nov 2010 17:23:28 +0800
>>
>>> From: Xin Xiaohui <xiaohui.xin@intel.com>
>>>
>>>>2) The idea to key off of skb->dev in skb_release_data() is
>>>>   fundamentally flawed since many actions can change skb->dev on you,
>>>>   which will end up causing a leak of your external data areas.
>>>
>>> How about this one? If the destructor_arg is not a good candidate,
>>> then I have to add an apparent field in shinfo.
>>
>>If destructor_arg is actually a net_device pointer or similar,
>>you will need to take a reference count on it or similar.
>>
>Do you mean destructor_arg will be consumed by other users?
>If that is the case, may I add a new structure member in shinfo?
>Then only zero-copy will use it, and there is no need for the reference count.
>
How about this? We really need somewhere to track the external data area,
so that if something goes wrong with it, we can still release the data area.
We think skb_release_data() is the right place to deal with it. If I
understand correctly, a reference count is needed because destructor_arg may
be used by others; in that case, how about adding a new structure member in
shinfo?

Thanks
Xiaohui 

>>Which means --> good bye performance especially on SMP.
>>
>>You're going to be adding new serialization points and at
>>least two new atomics per packet.

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 10ba67d..ad4636e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -199,14 +199,15 @@  struct skb_shared_info {
 	struct sk_buff	*frag_list;
 	struct skb_shared_hwtstamps hwtstamps;
 
+	/* Intermediate layers must ensure that destructor_arg
+	 * remains valid until skb destructor */
+	void *		destructor_arg;
+
 	/*
 	 * Warning : all fields before dataref are cleared in __alloc_skb()
 	 */
 	atomic_t	dataref;
 
-	/* Intermediate layers must ensure that destructor_arg
-	 * remains valid until skb destructor */
-	void *		destructor_arg;
 	/* must be last field, see pskb_expand_head() */
 	skb_frag_t	frags[MAX_SKB_FRAGS];
 };
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c83b421..eb040f4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -343,6 +343,13 @@  static void skb_release_data(struct sk_buff *skb)
 		if (skb_has_frags(skb))
 			skb_drop_fraglist(skb);
 
+		if (skb_shinfo(skb)->destructor_arg) {
+			struct skb_ext_page *ext_page =
+				skb_shinfo(skb)->destructor_arg;
+			if (ext_page->dtor)
+				ext_page->dtor(ext_page);
+		}
+
 		kfree(skb->head);
 	}
 }
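
The release path added by the net/core/skbuff.c hunk can be modelled in user
space as below. This is a hedged sketch, not the kernel implementation:
struct skb_ext_page is not defined in the hunks shown here, so the layout of
ext_page_model is an assumption (only the dtor member is implied by the
patch), and skb_model, shinfo_model, the released counter and demo_release()
are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed layout: only the dtor member is implied by the patch. */
struct ext_page_model {
	void (*dtor)(struct ext_page_model *);
	int released;		/* hypothetical: counts dtor invocations */
};

/* Minimal stand-ins for skb_shared_info and sk_buff. */
struct shinfo_model {
	void *destructor_arg;
};

struct skb_model {
	unsigned char *head;
	struct shinfo_model shinfo;
};

/* Example destructor: records that the external area was handed back. */
static void ext_page_model_dtor(struct ext_page_model *page)
{
	page->released++;
}

/* Mirrors the hunk: run the external destructor, then free the head. */
static void skb_model_release_data(struct skb_model *skb)
{
	if (skb->shinfo.destructor_arg) {
		struct ext_page_model *ext_page = skb->shinfo.destructor_arg;

		if (ext_page->dtor)
			ext_page->dtor(ext_page);
	}
	free(skb->head);
	skb->head = NULL;
}

/* Returns how many times the destructor ran for one released buffer. */
static int demo_release(int with_ext_page)
{
	struct ext_page_model page = { ext_page_model_dtor, 0 };
	struct skb_model skb;

	skb.head = malloc(64);
	skb.shinfo.destructor_arg = with_ext_page ? &page : NULL;
	skb_model_release_data(&skb);
	return page.released;
}
```

Note that the model, like the hunk, treats any non-NULL destructor_arg as an
external page descriptor, so it appears to rely on no other user storing a
different kind of pointer in that field.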