diff mbox

[net-next] xen-netback: stop vif thread spinning if frontend is unresponsive

Message ID 1389111929-37231-1-git-send-email-paul.durrant@citrix.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Paul Durrant Jan. 7, 2014, 4:25 p.m. UTC
The recent patch to improve guest receive side flow control (ca2f09f2) had a
slight flaw in the wait condition for the vif thread in that any remaining
skbs in the guest receive side netback internal queue would prevent the
thread from sleeping. An unresponsive frontend can lead to a permanently
non-empty internal queue and thus the thread will spin. In this case the
thread should really sleep until the frontend becomes responsive again.

This patch adds an extra flag to the vif which is set if the shared ring
is full and cleared when skbs are drained into the shared ring. Thus,
if the thread runs, finds the shared ring full and can make no progress the
flag remains set. If the flag remains set then the thread will sleep,
regardless of a non-empty queue, until the next event from the frontend.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/net/xen-netback/common.h  |    1 +
 drivers/net/xen-netback/netback.c |   12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

Comments

David Miller Jan. 7, 2014, 9:29 p.m. UTC | #1
From: Paul Durrant <paul.durrant@citrix.com>
Date: Tue, 7 Jan 2014 16:25:29 +0000

> @@ -477,6 +477,7 @@ static void xenvif_rx_action(struct xenvif *vif)
>  	unsigned long offset;
>  	struct skb_cb_overlay *sco;
>  	int need_to_notify = 0;
> +	int ring_full = 0;

Please use bool, false, and true.

>  
> -	if (!npo.copy_prod)
> +	if (!npo.copy_prod) {
> +		if (ring_full)
> +			vif->rx_queue_stopped = true;
>  		goto done;
> +	}
> +
> +	vif->rx_queue_stopped = false;

And then you can code this as:

	vif->rx_queue_stopped = (!npo.copy_prod && ring_full);
	if (!npo.copy_prod)
		goto done;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Durrant Jan. 8, 2014, 9:49 a.m. UTC | #2
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of David Miller
> Sent: 07 January 2014 21:30
> To: Paul Durrant
> Cc: netdev@vger.kernel.org; xen-devel@lists.xen.org; Wei Liu; Ian Campbell;
> David Vrabel
> Subject: Re: [PATCH net-next] xen-netback: stop vif thread spinning if
> frontend is unresponsive
> 
> From: Paul Durrant <paul.durrant@citrix.com>
> Date: Tue, 7 Jan 2014 16:25:29 +0000
> 
> > @@ -477,6 +477,7 @@ static void xenvif_rx_action(struct xenvif *vif)
> >  	unsigned long offset;
> >  	struct skb_cb_overlay *sco;
> >  	int need_to_notify = 0;
> > +	int ring_full = 0;
> 
> Please use bool, false, and true.
> 
> >
> > -	if (!npo.copy_prod)
> > +	if (!npo.copy_prod) {
> > +		if (ring_full)
> > +			vif->rx_queue_stopped = true;
> >  		goto done;
> > +	}
> > +
> > +	vif->rx_queue_stopped = false;
> 
> And then you can code this as:
> 
> 	vif->rx_queue_stopped = (!npo.copy_prod && ring_full);
> 	if (!npo.copy_prod)
> 		goto done;

Sure. I was just following style (of need_to_notify). If you prefer bool then I'll use that and also convert need_to_notify.

  Paul

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Jan. 8, 2014, 9:55 a.m. UTC | #3
> From: David Miller
...
> > -	if (!npo.copy_prod)
> > +	if (!npo.copy_prod) {
> > +		if (ring_full)
> > +			vif->rx_queue_stopped = true;
> >  		goto done;
> > +	}
> > +
> > +	vif->rx_queue_stopped = false;
> 
> And then you can code this as:
> 
> 	vif->rx_queue_stopped = (!npo.copy_prod && ring_full);
> 	if (!npo.copy_prod)
> 		goto done;

Which isn't quite the same...
1) It always writes vif->rx_queue_stopped, the old code could
   leave it unchanged.
2) If 'npo' is global then the compiler can't assume that 'vif'
   doesn't alias it so may have to re-read it following the
   write to 'vif->rx_queue_stopped'.

	David



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index c955fc3..4c76bcb 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -143,6 +143,7 @@  struct xenvif {
 	char rx_irq_name[IFNAMSIZ+4]; /* DEVNAME-rx */
 	struct xen_netif_rx_back_ring rx;
 	struct sk_buff_head rx_queue;
+	bool rx_queue_stopped;
 	/* Set when the RX interrupt is triggered by the frontend.
 	 * The worker thread may need to wake the queue.
 	 */
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4f81ac0..1c31ac5 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -477,6 +477,7 @@  static void xenvif_rx_action(struct xenvif *vif)
 	unsigned long offset;
 	struct skb_cb_overlay *sco;
 	int need_to_notify = 0;
+	int ring_full = 0;
 
 	struct netrx_pending_operations npo = {
 		.copy  = vif->grant_copy_op,
@@ -509,6 +510,7 @@  static void xenvif_rx_action(struct xenvif *vif)
 		if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
 			skb_queue_head(&vif->rx_queue, skb);
 			need_to_notify = 1;
+			ring_full = 1;
 			break;
 		}
 
@@ -521,8 +523,13 @@  static void xenvif_rx_action(struct xenvif *vif)
 
 	BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta));
 
-	if (!npo.copy_prod)
+	if (!npo.copy_prod) {
+		if (ring_full)
+			vif->rx_queue_stopped = true;
 		goto done;
+	}
+
+	vif->rx_queue_stopped = false;
 
 	BUG_ON(npo.copy_prod > MAX_GRANT_COPY_OPS);
 	gnttab_batch_copy(vif->grant_copy_op, npo.copy_prod);
@@ -1724,7 +1731,8 @@  static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif,
 
 static inline int rx_work_todo(struct xenvif *vif)
 {
-	return !skb_queue_empty(&vif->rx_queue) || vif->rx_event;
+	return (!skb_queue_empty(&vif->rx_queue) && !vif->rx_queue_stopped) ||
+		vif->rx_event;
 }
 
 static inline int tx_work_todo(struct xenvif *vif)