Patchwork [v3] xen-netfront: pull on receive skb may need to happen earlier

login
register
mail settings
Submitter Jan Beulich
Date July 17, 2013, 7:09 a.m.
Message ID <51E65F5102000078000E57ED@nat28.tlf.novell.com>
Download mbox | patch
Permalink /patch/259610/
State Accepted
Delegated to: David Miller
Headers show

Comments

Jan Beulich - July 17, 2013, 7:09 a.m.
Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
linear area is big enough on RX") xennet_fill_frags() may end up
filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
the fragment count subsequently via __pskb_pull_tail(). That's a
result of xennet_get_responses() allowing a maximum of one more slot to
be consumed (and intermediately transformed into a fragment) if the
head slot has a size less than or equal to RX_COPY_THRESHOLD.

Hence we need to adjust xennet_fill_frags() to pull earlier if we
reached the maximum fragment count - due to the described behavior of
xennet_get_responses() this guarantees that at least the first fragment
will get completely consumed, and hence the fragment count reduced.

In order to not needlessly call __pskb_pull_tail() twice, make the
original call conditional upon the pull target not having been reached
yet, and defer the newly added one as much as possible (an alternative
would have been to always call the function right before the call to
xennet_fill_frags(), but that would imply more frequent cases of
needing to call it twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org (3.6 onwards)
---
v3: Break up a "complex" assignment into two simple ones.

v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
    we go (skb->len needing intermediate updating was pointed out by
    Wei Liu and David Miller, shinfo->nr_frags needing updating before
    calling __pskb_pull_tail() was spotted out by Dion Kant).

---
 drivers/net/xen-netfront.c |   31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ian Campbell - July 17, 2013, 8:26 a.m.
On Wed, 2013-07-17 at 08:09 +0100, Jan Beulich wrote:
> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
> linear area is big enough on RX") xennet_fill_frags() may end up
> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
> the fragment count subsequently via __pskb_pull_tail(). That's a
> result of xennet_get_responses() allowing a maximum of one more slot to
> be consumed (and intermediately transformed into a fragment) if the
> head slot has a size less than or equal to RX_COPY_THRESHOLD.
> 
> Hence we need to adjust xennet_fill_frags() to pull earlier if we
> reached the maximum fragment count - due to the described behavior of
> xennet_get_responses() this guarantees that at least the first fragment
> will get completely consumed, and hence the fragment count reduced.
> 
> In order to not needlessly call __pskb_pull_tail() twice, make the
> original call conditional upon the pull target not having been reached
> yet, and defer the newly added one as much as possible (an alternative
> would have been to always call the function right before the call to
> xennet_fill_frags(), but that would imply more frequent cases of
> needing to call it twice).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Ian.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - July 17, 2013, 7:52 p.m.
From: Ian Campbell <Ian.Campbell@citrix.com>
Date: Wed, 17 Jul 2013 09:26:24 +0100

> On Wed, 2013-07-17 at 08:09 +0100, Jan Beulich wrote:
>> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
>> linear area is big enough on RX") xennet_fill_frags() may end up
>> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
>> the fragment count subsequently via __pskb_pull_tail(). That's a
>> result of xennet_get_responses() allowing a maximum of one more slot to
>> be consumed (and intermediately transformed into a fragment) if the
>> head slot has a size less than or equal to RX_COPY_THRESHOLD.
>> 
>> Hence we need to adjust xennet_fill_frags() to pull earlier if we
>> reached the maximum fragment count - due to the described behavior of
>> xennet_get_responses() this guarantees that at least the first fragment
>> will get completely consumed, and hence the fragment count reduced.
>> 
>> In order to not needlessly call __pskb_pull_tail() twice, make the
>> original call conditional upon the pull target not having been reached
>> yet, and defer the newly added one as much as possible (an alternative
>> would have been to always call the function right before the call to
>> xennet_fill_frags(), but that would imply more frequent cases of
>> needing to call it twice).
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Acked-by: Wei Liu <wei.liu2@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

--- 3.11-rc1/drivers/net/xen-netfront.c
+++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
@@ -286,8 +286,7 @@  no_skb:
 			break;
 		}
 
-		__skb_fill_page_desc(skb, 0, page, 0, 0);
-		skb_shinfo(skb)->nr_frags = 1;
+		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
 		__skb_queue_tail(&np->rx_batch, skb);
 	}
 
@@ -831,7 +830,6 @@  static RING_IDX xennet_fill_frags(struct
 				  struct sk_buff_head *list)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
-	int nr_frags = shinfo->nr_frags;
 	RING_IDX cons = np->rx.rsp_cons;
 	struct sk_buff *nskb;
 
@@ -840,19 +838,21 @@  static RING_IDX xennet_fill_frags(struct
 			RING_GET_RESPONSE(&np->rx, ++cons);
 		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
-		__skb_fill_page_desc(skb, nr_frags,
-				     skb_frag_page(nfrag),
-				     rx->offset, rx->status);
+		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		skb->data_len += rx->status;
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		}
+		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
+
+		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
+				rx->offset, rx->status, PAGE_SIZE);
 
 		skb_shinfo(nskb)->nr_frags = 0;
 		kfree_skb(nskb);
-
-		nr_frags++;
 	}
 
-	shinfo->nr_frags = nr_frags;
 	return cons;
 }
 
@@ -933,7 +933,8 @@  static int handle_incoming_queue(struct
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);
@@ -1019,16 +1020,10 @@  err:
 		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
 		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
 		skb->data_len = rx->status;
+		skb->len += rx->status;
 
 		i = xennet_fill_frags(np, skb, &tmpq);
 
-		/*
-                 * Truesize is the actual allocation size, even if the
-                 * allocation is only partially used.
-                 */
-		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
-		skb->len += skb->data_len;
-
 		if (rx->flags & XEN_NETRXF_csum_blank)
 			skb->ip_summed = CHECKSUM_PARTIAL;
 		else if (rx->flags & XEN_NETRXF_data_validated)