From patchwork Tue Sep 1 14:05:33 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eugene Shatokhin X-Patchwork-Id: 512856 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 2C3DF14012C for ; Wed, 2 Sep 2015 00:06:36 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753291AbbIAOGS (ORCPT ); Tue, 1 Sep 2015 10:06:18 -0400 Received: from collab.rosalab.ru ([195.19.76.181]:60692 "EHLO collab.rosalab.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751610AbbIAOGR (ORCPT ); Tue, 1 Sep 2015 10:06:17 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by collab.rosalab.ru (Postfix) with ESMTP id 9C81629C0EA; Tue, 1 Sep 2015 17:06:14 +0300 (MSK) X-Virus-Scanned: amavisd-new at rosalab.ru Received: from collab.rosalab.ru ([127.0.0.1]) by localhost (collab.rosalab.ru [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mknTjFcZzNPl; Tue, 1 Sep 2015 17:06:14 +0300 (MSK) Received: from localhost.localdomain (37-144-214-38.broadband.corbina.ru [37.144.214.38]) by collab.rosalab.ru (Postfix) with ESMTPSA id 22A0529C0C7; Tue, 1 Sep 2015 17:06:14 +0300 (MSK) From: Eugene Shatokhin To: Oliver Neukum , =?UTF-8?q?Bj=C3=B8rn=20Mork?= , David Miller Cc: netdev@vger.kernel.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, Eugene Shatokhin Subject: [PATCH] usbnet: Fix a race between usbnet_stop() and the BH Date: Tue, 1 Sep 2015 17:05:33 +0300 Message-Id: <1441116333-5377-1-git-send-email-eugene.shatokhin@rosalab.ru> X-Mailer: git-send-email 2.3.2 In-Reply-To: <1441094336.3328.1.camel@suse.com> References: <1441094336.3328.1.camel@suse.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The race may happen when a device (e.g. YOTA 4G LTE Modem) is unplugged while the system is downloading a large file from the Net. Hardware breakpoints and Kprobes with delays were used to confirm that the race does actually happen. The race is on skb_queue ('next' pointer) between usbnet_stop() and rx_complete(), which, in turn, calls usbnet_bh(). Here is a part of the call stack with the code where the changes to the queue happen. The line numbers are for the kernel 4.1.0: *0 __skb_unlink (skbuff.h:1517) prev->next = next; *1 defer_bh (usbnet.c:430) spin_lock_irqsave(&list->lock, flags); old_state = entry->state; entry->state = state; __skb_unlink(skb, list); spin_unlock(&list->lock); spin_lock(&dev->done.lock); __skb_queue_tail(&dev->done, skb); if (dev->done.qlen == 1) tasklet_schedule(&dev->bh); spin_unlock_irqrestore(&dev->done.lock, flags); *2 rx_complete (usbnet.c:640) state = defer_bh(dev, skb, &dev->rxq, state); At the same time, the following code repeatedly checks if the queue is empty and reads these values concurrently with the above changes: *0 usbnet_terminate_urbs (usbnet.c:765) /* maybe wait for deletions to finish. */ while (!skb_queue_empty(&dev->rxq) && !skb_queue_empty(&dev->txq) && !skb_queue_empty(&dev->done)) { schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); set_current_state(TASK_UNINTERRUPTIBLE); netif_dbg(dev, ifdown, dev->net, "waited for %d urb completions\n", temp); } *1 usbnet_stop (usbnet.c:806) if (!(info->flags & FLAG_AVOID_UNLINK_URBS)) usbnet_terminate_urbs(dev); As a result, it is possible, for example, that the skb is removed from dev->rxq by __skb_unlink() before the check "!skb_queue_empty(&dev->rxq)" in usbnet_terminate_urbs() is made. It is also possible in this case that the skb is added to dev->done queue after "!skb_queue_empty(&dev->done)" is checked. So usbnet_terminate_urbs() may stop waiting and return while dev->done queue still has an item. Locking in defer_bh() and usbnet_terminate_urbs() was revisited to avoid this race. Signed-off-by: Eugene Shatokhin Reviewed-by: Bjørn Mork Acked-by: Oliver Neukum --- drivers/net/usb/usbnet.c | 39 ++++++++++++++++++++++++++++----------- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index e049857..b4cf107 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -428,12 +428,18 @@ static enum skb_state defer_bh(struct usbnet *dev, struct sk_buff *skb, old_state = entry->state; entry->state = state; __skb_unlink(skb, list); - spin_unlock(&list->lock); - spin_lock(&dev->done.lock); + + /* defer_bh() is never called with list == &dev->done. + * spin_lock_nested() tells lockdep that it is OK to take + * dev->done.lock here with list->lock held. + */ + spin_lock_nested(&dev->done.lock, SINGLE_DEPTH_NESTING); + __skb_queue_tail(&dev->done, skb); if (dev->done.qlen == 1) tasklet_schedule(&dev->bh); - spin_unlock_irqrestore(&dev->done.lock, flags); + spin_unlock(&dev->done.lock); + spin_unlock_irqrestore(&list->lock, flags); return old_state; } @@ -749,6 +755,20 @@ EXPORT_SYMBOL_GPL(usbnet_unlink_rx_urbs); /*-------------------------------------------------------------------------*/ +static void wait_skb_queue_empty(struct sk_buff_head *q) +{ + unsigned long flags; + + spin_lock_irqsave(&q->lock, flags); + while (!skb_queue_empty(q)) { + spin_unlock_irqrestore(&q->lock, flags); + schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); + set_current_state(TASK_UNINTERRUPTIBLE); + spin_lock_irqsave(&q->lock, flags); + } + spin_unlock_irqrestore(&q->lock, flags); +} + // precondition: never called in_interrupt static void usbnet_terminate_urbs(struct usbnet *dev) { @@ -762,14 +782,11 @@ static void usbnet_terminate_urbs(struct usbnet *dev) unlink_urbs(dev, &dev->rxq); /* maybe wait for deletions to finish. */ - while (!skb_queue_empty(&dev->rxq) - && !skb_queue_empty(&dev->txq) - && !skb_queue_empty(&dev->done)) { - schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); - set_current_state(TASK_UNINTERRUPTIBLE); - netif_dbg(dev, ifdown, dev->net, - "waited for %d urb completions\n", temp); - } + wait_skb_queue_empty(&dev->rxq); + wait_skb_queue_empty(&dev->txq); + wait_skb_queue_empty(&dev->done); + netif_dbg(dev, ifdown, dev->net, + "waited for %d urb completions\n", temp); set_current_state(TASK_RUNNING); remove_wait_queue(&dev->wait, &wait); }