diff mbox

[net] xen-netback: fix abuse of napi budget

Message ID 1386670600-5882-1-git-send-email-paul.durrant@citrix.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Paul Durrant Dec. 10, 2013, 10:16 a.m. UTC
netback seemed to be somewhat confused about the napi budget parameter and
basically ignored it. This patch fixes that, properly limiting the work done
in each poll.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/net/xen-netback/netback.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Ian Campbell Dec. 10, 2013, 10:25 a.m. UTC | #1
On Tue, 2013-12-10 at 10:16 +0000, Paul Durrant wrote:
> netback seemed to be somewhat confused about the napi budget parameter and
> basically ignored it. This patch fixes that, properly limiting the work done
> in each poll.

What do you mean "ignored", xenvif_tx_submit seems to be tracking and
testing work_done against the budget.

I suspect this change is probably worthwhile but it would be good to get
an accurate description of why, which I presume is because the tx
process is xenvif_tx_build_gops followed by, gnttab_batch_copy then
xenvif_tx_submit and that it is better to do the budget enforcement
earlier on.

How does this change impact the batching in gnttab_batch_copy and
therefore performance? Do we need to tweak the the NAPI budget to ensure
we are getting good batching? I suspect that netback is a bit unusual
among NIC drivers in that the rx path contains a fair bit of actual work
to do, so perhaps the NAPI defaults are not necessarily going to be the
best for it.

> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> ---
>  drivers/net/xen-netback/netback.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 43341b8..83b4e5b 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1351,14 +1351,15 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
>  	return false;
>  }
>  
> -static unsigned xenvif_tx_build_gops(struct xenvif *vif)
> +static unsigned xenvif_tx_build_gops(struct xenvif *vif, int budget)
>  {
>  	struct gnttab_copy *gop = vif->tx_copy_ops, *request_gop;
>  	struct sk_buff *skb;
>  	int ret;
>  
>  	while ((nr_pending_reqs(vif) + XEN_NETBK_LEGACY_SLOTS_MAX
> -		< MAX_PENDING_REQS)) {
> +		< MAX_PENDING_REQS) &&
> +	       (skb_queue_len(&vif->tx_queue) < budget)) {
>  		struct xen_netif_tx_request txreq;
>  		struct xen_netif_tx_request txfrags[XEN_NETBK_LEGACY_SLOTS_MAX];
>  		struct page *page;
> @@ -1520,14 +1521,13 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif)
>  }
>  
> 
> -static int xenvif_tx_submit(struct xenvif *vif, int budget)
> +static int xenvif_tx_submit(struct xenvif *vif)
>  {
>  	struct gnttab_copy *gop = vif->tx_copy_ops;
>  	struct sk_buff *skb;
>  	int work_done = 0;
>  
> -	while (work_done < budget &&
> -	       (skb = __skb_dequeue(&vif->tx_queue)) != NULL) {
> +	while ((skb = __skb_dequeue(&vif->tx_queue)) != NULL) {
>  		struct xen_netif_tx_request *txp;
>  		u16 pending_idx;
>  		unsigned data_len;
> @@ -1602,14 +1602,14 @@ int xenvif_tx_action(struct xenvif *vif, int budget)
>  	if (unlikely(!tx_work_todo(vif)))
>  		return 0;
>  
> -	nr_gops = xenvif_tx_build_gops(vif);
> +	nr_gops = xenvif_tx_build_gops(vif, budget);
>  
>  	if (nr_gops == 0)
>  		return 0;
>  
>  	gnttab_batch_copy(vif->tx_copy_ops, nr_gops);
>  
> -	work_done = xenvif_tx_submit(vif, nr_gops);
> +	work_done = xenvif_tx_submit(vif);
>  
>  	return work_done;
>  }


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Vrabel Dec. 10, 2013, 10:30 a.m. UTC | #2
On 10/12/13 10:25, Ian Campbell wrote:
> On Tue, 2013-12-10 at 10:16 +0000, Paul Durrant wrote:
>> netback seemed to be somewhat confused about the napi budget parameter and
>> basically ignored it. This patch fixes that, properly limiting the work done
>> in each poll.
> 
> What do you mean "ignored", xenvif_tx_submit seems to be tracking and
> testing work_done against the budget.

I have seen this warning in net_rx_action() trigger.

	WARN_ON_ONCE(work > weight);

Which means netback wasn't limiting the work done.

David
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Liu Dec. 10, 2013, 10:34 a.m. UTC | #3
On Tue, Dec 10, 2013 at 10:16:40AM +0000, Paul Durrant wrote:
> netback seemed to be somewhat confused about the napi budget parameter and
> basically ignored it. This patch fixes that, properly limiting the work done
> in each poll.
> 

After reading the code I think your "basically ignored it" means netback
will process the ring as much as possible, right? 

But overall the packets passed to network stack is still limited by
budget, if I'm not mistaken.

What's the impact on flow control if you more the check earlier?

Wei.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Liu Dec. 10, 2013, 10:37 a.m. UTC | #4
On Tue, Dec 10, 2013 at 10:30:13AM +0000, David Vrabel wrote:
> On 10/12/13 10:25, Ian Campbell wrote:
> > On Tue, 2013-12-10 at 10:16 +0000, Paul Durrant wrote:
> >> netback seemed to be somewhat confused about the napi budget parameter and
> >> basically ignored it. This patch fixes that, properly limiting the work done
> >> in each poll.
> > 
> > What do you mean "ignored", xenvif_tx_submit seems to be tracking and
> > testing work_done against the budget.
> 
> I have seen this warning in net_rx_action() trigger.
> 
> 	WARN_ON_ONCE(work > weight);
> 
> Which means netback wasn't limiting the work done.
> 

But in the original code work_done is returned by xenvif_tx_submit which
has guard against that situation, right?

Wei.

> David
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Liu Dec. 10, 2013, 10:44 a.m. UTC | #5
On Tue, Dec 10, 2013 at 10:37:36AM +0000, Wei Liu wrote:
> On Tue, Dec 10, 2013 at 10:30:13AM +0000, David Vrabel wrote:
> > On 10/12/13 10:25, Ian Campbell wrote:
> > > On Tue, 2013-12-10 at 10:16 +0000, Paul Durrant wrote:
> > >> netback seemed to be somewhat confused about the napi budget parameter and
> > >> basically ignored it. This patch fixes that, properly limiting the work done
> > >> in each poll.
> > > 
> > > What do you mean "ignored", xenvif_tx_submit seems to be tracking and
> > > testing work_done against the budget.
> > 
> > I have seen this warning in net_rx_action() trigger.
> > 
> > 	WARN_ON_ONCE(work > weight);
> > 
> > Which means netback wasn't limiting the work done.
> > 
> 
> But in the original code work_done is returned by xenvif_tx_submit which
> has guard against that situation, right?
> 

And now I think I spot a bug...

  work_done = xenvif_tx_submit(vif, nr_gops);

The second argument should really be "budget". :-(

Wei.

> Wei.
> 
> > David
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Durrant Dec. 10, 2013, 10:48 a.m. UTC | #6
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 10 December 2013 10:45
> To: David Vrabel
> Cc: Ian Campbell; Paul Durrant; xen-devel@lists.xen.org;
> netdev@vger.kernel.org; Wei Liu
> Subject: Re: [PATCH net] xen-netback: fix abuse of napi budget
> 
> On Tue, Dec 10, 2013 at 10:37:36AM +0000, Wei Liu wrote:
> > On Tue, Dec 10, 2013 at 10:30:13AM +0000, David Vrabel wrote:
> > > On 10/12/13 10:25, Ian Campbell wrote:
> > > > On Tue, 2013-12-10 at 10:16 +0000, Paul Durrant wrote:
> > > >> netback seemed to be somewhat confused about the napi budget
> parameter and
> > > >> basically ignored it. This patch fixes that, properly limiting the work
> done
> > > >> in each poll.
> > > >
> > > > What do you mean "ignored", xenvif_tx_submit seems to be tracking
> and
> > > > testing work_done against the budget.
> > >
> > > I have seen this warning in net_rx_action() trigger.
> > >
> > > 	WARN_ON_ONCE(work > weight);
> > >
> > > Which means netback wasn't limiting the work done.
> > >
> >
> > But in the original code work_done is returned by xenvif_tx_submit which
> > has guard against that situation, right?
> >
> 
> And now I think I spot a bug...
> 
>   work_done = xenvif_tx_submit(vif, nr_gops);
> 
> The second argument should really be "budget". :-(
> 

Yep - that's basically the problem.

  Paul

> Wei.
> 
> > Wei.
> >
> > > David
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 43341b8..83b4e5b 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1351,14 +1351,15 @@  static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
 	return false;
 }
 
-static unsigned xenvif_tx_build_gops(struct xenvif *vif)
+static unsigned xenvif_tx_build_gops(struct xenvif *vif, int budget)
 {
 	struct gnttab_copy *gop = vif->tx_copy_ops, *request_gop;
 	struct sk_buff *skb;
 	int ret;
 
 	while ((nr_pending_reqs(vif) + XEN_NETBK_LEGACY_SLOTS_MAX
-		< MAX_PENDING_REQS)) {
+		< MAX_PENDING_REQS) &&
+	       (skb_queue_len(&vif->tx_queue) < budget)) {
 		struct xen_netif_tx_request txreq;
 		struct xen_netif_tx_request txfrags[XEN_NETBK_LEGACY_SLOTS_MAX];
 		struct page *page;
@@ -1520,14 +1521,13 @@  static unsigned xenvif_tx_build_gops(struct xenvif *vif)
 }
 
 
-static int xenvif_tx_submit(struct xenvif *vif, int budget)
+static int xenvif_tx_submit(struct xenvif *vif)
 {
 	struct gnttab_copy *gop = vif->tx_copy_ops;
 	struct sk_buff *skb;
 	int work_done = 0;
 
-	while (work_done < budget &&
-	       (skb = __skb_dequeue(&vif->tx_queue)) != NULL) {
+	while ((skb = __skb_dequeue(&vif->tx_queue)) != NULL) {
 		struct xen_netif_tx_request *txp;
 		u16 pending_idx;
 		unsigned data_len;
@@ -1602,14 +1602,14 @@  int xenvif_tx_action(struct xenvif *vif, int budget)
 	if (unlikely(!tx_work_todo(vif)))
 		return 0;
 
-	nr_gops = xenvif_tx_build_gops(vif);
+	nr_gops = xenvif_tx_build_gops(vif, budget);
 
 	if (nr_gops == 0)
 		return 0;
 
 	gnttab_batch_copy(vif->tx_copy_ops, nr_gops);
 
-	work_done = xenvif_tx_submit(vif, nr_gops);
+	work_done = xenvif_tx_submit(vif);
 
 	return work_done;
 }