diff mbox

xen-netfront: Rework the fix for Rx stall during OOM and network stress

Message ID 1486493941-15696-1-git-send-email-vineethp@amazon.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Vineeth Remanan Pillai Feb. 7, 2017, 6:59 p.m. UTC
The commit 90c311b0eeea ("xen-netfront: Fix Rx stall during network
stress and OOM") caused the refill timer to be triggerred almost on
all invocations of xennet_alloc_rx_buffers for certain workloads.
This reworks the fix by reverting to the old behaviour and taking into
consideration the skb allocation failure. Refill timer is now triggered
on insufficient requests or skb allocation failure.

Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
Fixes: 90c311b0eeea (xen-netfront: Fix Rx stall during network stress and OOM)
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 drivers/net/xen-netfront.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Boris Ostrovsky Feb. 8, 2017, 3:08 p.m. UTC | #1
On 02/07/2017 01:59 PM, Vineeth Remanan Pillai wrote:
> The commit 90c311b0eeea ("xen-netfront: Fix Rx stall during network
> stress and OOM") caused the refill timer to be triggerred almost on
> all invocations of xennet_alloc_rx_buffers for certain workloads.
> This reworks the fix by reverting to the old behaviour and taking into
> consideration the skb allocation failure. Refill timer is now triggered
> on insufficient requests or skb allocation failure.
>
> Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
> Fixes: 90c311b0eeea (xen-netfront: Fix Rx stall during network stress and OOM)
> Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
David Miller Feb. 9, 2017, 9:34 p.m. UTC | #2
From: Vineeth Remanan Pillai <vineethp@amazon.com>
Date: Tue, 7 Feb 2017 18:59:01 +0000

> The commit 90c311b0eeea ("xen-netfront: Fix Rx stall during network
> stress and OOM") caused the refill timer to be triggerred almost on
> all invocations of xennet_alloc_rx_buffers for certain workloads.
> This reworks the fix by reverting to the old behaviour and taking into
> consideration the skb allocation failure. Refill timer is now triggered
> on insufficient requests or skb allocation failure.
> 
> Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
> Fixes: 90c311b0eeea (xen-netfront: Fix Rx stall during network stress and OOM)
> Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Applied.
diff mbox

Patch

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 8315fe7..9dba697 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -281,6 +281,7 @@  static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 {
 	RING_IDX req_prod = queue->rx.req_prod_pvt;
 	int notify;
+	int err = 0;
 
 	if (unlikely(!netif_carrier_ok(queue->info->netdev)))
 		return;
@@ -295,8 +296,10 @@  static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		struct xen_netif_rx_request *req;
 
 		skb = xennet_alloc_one_rx_buffer(queue);
-		if (!skb)
+		if (!skb) {
+			err = -ENOMEM;
 			break;
+		}
 
 		id = xennet_rxidx(req_prod);
 
@@ -320,8 +323,13 @@  static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 
 	queue->rx.req_prod_pvt = req_prod;
 
-	/* Not enough requests? Try again later. */
-	if (req_prod - queue->rx.sring->req_prod < NET_RX_SLOTS_MIN) {
+	/* Try again later if there are not enough requests or skb allocation
+	 * failed.
+	 * Enough requests is quantified as the sum of newly created slots and
+	 * the unconsumed slots at the backend.
+	 */
+	if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN ||
+	    unlikely(err)) {
 		mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10));
 		return;
 	}