From patchwork Tue Nov 20 14:14:25 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ian Campbell X-Patchwork-Id: 200341 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id C65A02C009A for ; Wed, 21 Nov 2012 01:14:31 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752304Ab2KTOO3 (ORCPT ); Tue, 20 Nov 2012 09:14:29 -0500 Received: from smtp.eu.citrix.com ([46.33.159.39]:3397 "EHLO SMTP.EU.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751341Ab2KTOO2 (ORCPT ); Tue, 20 Nov 2012 09:14:28 -0500 X-IronPort-AV: E=Sophos;i="4.83,285,1352073600"; d="scan'208";a="15905766" Received: from lonpmailmx01.citrite.net ([10.30.203.162]) by LONPIPO01.EU.CITRIX.COM with ESMTP/TLS/RC4-MD5; 20 Nov 2012 14:14:27 +0000 Received: from [10.80.2.42] (10.80.2.42) by LONPMAILMX01.citrite.net (10.30.203.162) with Microsoft SMTP Server id 8.3.279.1; Tue, 20 Nov 2012 14:14:27 +0000 Message-ID: <1353420865.13542.44.camel@zakaz.uk.xensource.com> Subject: Re: [Xen-devel] [PATCH] xen/netfront: handle compound page fragments on transmit From: Ian Campbell To: Jan Beulich CC: Stefan Bader , Sander Eikelenboom , Eric Dumazet , "KonradRzeszutek Wilk" , "xen-devel@lists.xen.org" , ANNIE LI , "netdev@vger.kernel.org" Date: Tue, 20 Nov 2012 14:14:25 +0000 In-Reply-To: <50AB990602000078000A9F5B@nat28.tlf.novell.com> References: <1353403286.18229.159.camel@zakaz.uk.xensource.com> <1353411606-15940-1-git-send-email-ian.campbell@citrix.com> <50AB856D02000078000A9EFD@nat28.tlf.novell.com> <1353418516.13542.38.camel@zakaz.uk.xensource.com> <50AB990602000078000A9F5B@nat28.tlf.novell.com> Organization: Citrix Systems, Inc. X-Mailer: Evolution 3.4.3-1 MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Tue, 2012-11-20 at 13:51 +0000, Jan Beulich wrote: > >>> On 20.11.12 at 14:35, Ian Campbell wrote: > > On Tue, 2012-11-20 at 12:28 +0000, Jan Beulich wrote: > >> >>> On 20.11.12 at 12:40, Ian Campbell wrote: > >> > An SKB paged fragment can consist of a compound page with order > 0. > >> > However the netchannel protocol deals only in PAGE_SIZE frames. > >> > > >> > Handle this in xennet_make_frags by iterating over the frames which > >> > make up the page. > >> > > >> > This is the netfront equivalent to 6a8ed462f16b for netback. > >> > >> Wouldn't you need to be at least a little more conservative here > >> with respect to resource use: I realize that get_id_from_freelist() > >> return values were never checked, and failure of > >> gnttab_claim_grant_reference() was always dealt with via > >> BUG_ON(), but considering that netfront_tx_slot_available() > >> doesn't account for compound page fragments, I think this (lack > >> of) error handling needs improvement in the course of the > >> change here (regardless of - I think - someone having said that > >> usually the sum of all pages referenced from an skb's fragments > >> would not exceed MAX_SKB_FRAGS - "usually" just isn't enough > >> imo). > > > > I think it is more than "usually", it is derived from the number of > > pages needed to contain 64K of data which is the maximum size of the > > data associated with an skb (AIUI). > > > > Unwinding from failure in xennet_make_frags looks pretty tricky, > > Yes, I agree. > > > but how about this incremental patch: > > Looks good, but can probably be simplified quite a bit: > > > --- a/drivers/net/xen-netfront.c > > +++ b/drivers/net/xen-netfront.c > > @@ -505,6 +505,46 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev, > > np->tx.req_prod_pvt = prod; > > } > > > > +/* > > + * Count how many ring slots are required to send the frags of this > > + * skb. Each frag might be a compound page. > > + */ > > +static int xennet_count_skb_frag_pages(struct sk_buff *skb) > > +{ > > + int i, frags = skb_shinfo(skb)->nr_frags; > > + int pages = 0; > > + > > + for (i = 0; i < frags; i++) { > > + skb_frag_t *frag = skb_shinfo(skb)->frags + i; > > + unsigned long size = skb_frag_size(frag); > > + unsigned long offset = frag->page_offset; > > + > > + /* Skip unused frames from start of page */ > > + offset &= ~PAGE_MASK; > > + > > + while (size > 0) { > > + unsigned long bytes; > > + > > + BUG_ON(offset >= PAGE_SIZE); > > + > > + bytes = PAGE_SIZE - offset; > > + if (bytes > size) > > + bytes = size; > > + > > + offset += bytes; > > + size -= bytes; > > + > > + /* Next frame */ > > + if (offset == PAGE_SIZE && size) { > > + pages++; > > + offset = 0; > > + } > > + } > > Isn't the whole loop equivalent to > > pages = PFN_UP(offset + size); > > (at least as long as size is not zero)? Er, yes. Wood for the trees etc... I think using PFN_UP overcounts a bit since the data needed start in the first frame of a compound frame, but if you keep the /* Skip unused frames from start of page */ offset &= ~PAGE_MASK; I think that does the right thing > > @@ -517,12 +557,13 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) > > grant_ref_t ref; > > unsigned long mfn; > > int notify; > > - int frags = skb_shinfo(skb)->nr_frags; > > + int frags; > > unsigned int offset = offset_in_page(data); > > unsigned int len = skb_headlen(skb); > > unsigned long flags; > > > > - frags += DIV_ROUND_UP(offset + len, PAGE_SIZE); > > + frags = xennet_count_skb_frag_pages(skb) + > > + DIV_ROUND_UP(offset + len, PAGE_SIZE); > > if (unlikely(frags > MAX_SKB_FRAGS + 1)) { > > This condition would now need adjustment, though (because > "frags" is no longer what its name says). I think it already wasn't what the name says, since it included the skb head too. Perhaps "slots" would be a better name? --- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index a12b99a..b744875 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -505,6 +505,29 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev, np->tx.req_prod_pvt = prod; } +/* + * Count how many ring slots are required to send the frags of this + * skb. Each frag might be a compound page. + */ +static int xennet_count_skb_frag_slots(struct sk_buff *skb) +{ + int i, frags = skb_shinfo(skb)->nr_frags; + int pages = 0; + + for (i = 0; i < frags; i++) { + skb_frag_t *frag = skb_shinfo(skb)->frags + i; + unsigned long size = skb_frag_size(frag); + unsigned long offset = frag->page_offset; + + /* Skip unused frames from start of page */ + offset &= ~PAGE_MASK; + + pages += PFN_UP(offset + size); + } + + return pages; +} + static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) { unsigned short id; @@ -517,15 +540,16 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) grant_ref_t ref; unsigned long mfn; int notify; - int frags = skb_shinfo(skb)->nr_frags; + int slots; unsigned int offset = offset_in_page(data); unsigned int len = skb_headlen(skb); unsigned long flags; - frags += DIV_ROUND_UP(offset + len, PAGE_SIZE); - if (unlikely(frags > MAX_SKB_FRAGS + 1)) { - printk(KERN_ALERT "xennet: skb rides the rocket: %d frags\n", - frags); + slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) + + xennet_count_skb_frag_slots(skb); + if (unlikely(slots > MAX_SKB_FRAGS + 1)) { + printk(KERN_ALERT "xennet: skb rides the rocket: %d slots\n", + slots); dump_stack(); goto drop; } @@ -533,7 +557,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) spin_lock_irqsave(&np->tx_lock, flags); if (unlikely(!netif_carrier_ok(dev) || - (frags > 1 && !xennet_can_sg(dev)) || + (slots > 1 && !xennet_can_sg(dev)) || netif_needs_gso(skb, netif_skb_features(skb)))) { spin_unlock_irqrestore(&np->tx_lock, flags); goto drop;