From patchwork Wed Feb 7 01:17:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Grygorii Strashko X-Patchwork-Id: 870169 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=ti.com header.i=@ti.com header.b="piBFTtpv"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3zbk2g1kb1z9s1h for ; Wed, 7 Feb 2018 12:17:31 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754340AbeBGBRT (ORCPT ); Tue, 6 Feb 2018 20:17:19 -0500 Received: from fllnx210.ext.ti.com ([198.47.19.17]:32307 "EHLO fllnx210.ext.ti.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754326AbeBGBRR (ORCPT ); Tue, 6 Feb 2018 20:17:17 -0500 Received: from dlelxv90.itg.ti.com ([172.17.2.17]) by fllnx210.ext.ti.com (8.15.1/8.15.1) with ESMTP id w171HFM7031968; Tue, 6 Feb 2018 19:17:15 -0600 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ti.com; s=ti-com-17Q1; t=1517966235; bh=UvJ5cPMFPSnHTq69pqa8jNuFBG8oe9deqT2nnRD5u8g=; h=From:To:CC:Subject:Date; b=piBFTtpv9H90PZRRxuUR9vBWmrgnfKvDPQ7033sfWAhwRwPSD1Hy8G8/1U4+Tsc5h AN4EBWEgYbOdiZBQn7hSWebaJTfjzXCYCn5Glri4raAuAmDOIb5DYQJ7hLldElLIlY ijGqv1owcWMCMEVfLpxaWAkXm3MEeKZ3wQB25L18= Received: from DFLE108.ent.ti.com (dfle108.ent.ti.com [10.64.6.29]) by dlelxv90.itg.ti.com (8.14.3/8.13.8) with ESMTP id w171HE8Q030737; Tue, 6 Feb 2018 19:17:15 -0600 Received: from DFLE101.ent.ti.com (10.64.6.22) by DFLE108.ent.ti.com (10.64.6.29) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1261.35; Tue, 6 Feb 2018 19:17:14 -0600 Received: from dlep32.itg.ti.com (157.170.170.100) by DFLE101.ent.ti.com (10.64.6.22) with Microsoft SMTP Server (version=TLS1_0, cipher=TLS_RSA_WITH_AES_256_CBC_SHA) id 15.1.1261.35 via Frontend Transport; Tue, 6 Feb 2018 19:17:14 -0600 Received: from legion.dal.design.ti.com (legion.dal.design.ti.com [128.247.22.53]) by dlep32.itg.ti.com (8.14.3/8.13.8) with ESMTP id w171HEbW028686; Tue, 6 Feb 2018 19:17:14 -0600 Received: from localhost (uda0226610.dhcp.ti.com [128.247.59.147]) by legion.dal.design.ti.com (8.11.7p1+Sun/8.11.7) with ESMTP id w171HEx22713; Tue, 6 Feb 2018 19:17:14 -0600 (CST) From: Grygorii Strashko To: "David S. Miller" , CC: Sekhar Nori , , , Grygorii Strashko Subject: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout Date: Tue, 6 Feb 2018 19:17:06 -0600 Message-ID: <20180207011706.13393-1-grygorii.strashko@ti.com> X-Mailer: git-send-email 2.10.5 MIME-Version: 1.0 X-EXCLAIMER-MD-CONFIG: e1e8a2fd-e40a-4ac6-ac9b-f7e9cc9ee180 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org It was discovered that simple program which indefinitely sends 200b UDP packets and runs on TI AM574x SoC (SMP) under RT Kernel triggers network watchdog timeout in TI CPSW driver (<6 hours run). The network watchdog timeout is triggered due to race between cpsw_ndo_start_xmit() and cpsw_tx_handler() [NAPI] cpsw_ndo_start_xmit() if (unlikely(!cpdma_check_free_tx_desc(txch))) { txq = netdev_get_tx_queue(ndev, q_idx); netif_tx_stop_queue(txq); ^^ as per [1] barier has to be used after set_bit() otherwise new value might not be visible to other cpus } cpsw_tx_handler() if (unlikely(netif_tx_queue_stopped(txq))) netif_tx_wake_queue(txq); and when it happens ndev TX queue became disabled forever while driver's HW TX queue is empty. Fix this, by adding smp_mb__after_atomic() after netif_tx_stop_queue() calls and double check for free TX descriptors after stopping ndev TX queue - if there are free TX descriptors wake up ndev TX queue. [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html Signed-off-by: Grygorii Strashko Reviewed-by: Ivan Khoronzhuk --- drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index 10d7cbe..3805b13 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -1638,6 +1638,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb, q_idx = q_idx % cpsw->tx_ch_num; txch = cpsw->txv[q_idx].ch; + txq = netdev_get_tx_queue(ndev, q_idx); ret = cpsw_tx_packet_submit(priv, skb, txch); if (unlikely(ret != 0)) { cpsw_err(priv, tx_err, "desc submit failed\n"); @@ -1648,15 +1649,26 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb, * tell the kernel to stop sending us tx frames. */ if (unlikely(!cpdma_check_free_tx_desc(txch))) { - txq = netdev_get_tx_queue(ndev, q_idx); netif_tx_stop_queue(txq); + + /* Barrier, so that stop_queue visible to other cpus */ + smp_mb__after_atomic(); + + if (cpdma_check_free_tx_desc(txch)) + netif_tx_wake_queue(txq); } return NETDEV_TX_OK; fail: ndev->stats.tx_dropped++; - txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb)); netif_tx_stop_queue(txq); + + /* Barrier, so that stop_queue visible to other cpus */ + smp_mb__after_atomic(); + + if (cpdma_check_free_tx_desc(txch)) + netif_tx_wake_queue(txq); + return NETDEV_TX_BUSY; }