From patchwork Thu Jan 26 22:18:10 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Netanel Belgazal X-Patchwork-Id: 720404 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3v8bw63n3yz9sxS for ; Fri, 27 Jan 2017 09:20:38 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=annapurnalabs-com.20150623.gappssmtp.com header.i=@annapurnalabs-com.20150623.gappssmtp.com header.b="OUoYDGlp"; dkim-atps=neutral Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754245AbdAZWSv (ORCPT ); Thu, 26 Jan 2017 17:18:51 -0500 Received: from mail-wm0-f46.google.com ([74.125.82.46]:36076 "EHLO mail-wm0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754201AbdAZWSr (ORCPT ); Thu, 26 Jan 2017 17:18:47 -0500 Received: by mail-wm0-f46.google.com with SMTP id c85so92147752wmi.1 for ; Thu, 26 Jan 2017 14:18:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=annapurnalabs-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=mfo82YA98TpSeqQuRA6pP0p4ALQRseCcXHjiwVKAERE=; b=OUoYDGlpSq2kiGWecM4luxOaRE+xeXtb7W5L6jv2+VG16h/05BRos+4UJ6YhFPUV13 BY9xdBX0L326wSSbwrsVUD9ldlaMwWfgCm5xUnAzKql2DIXeJSYRXGtHJqy2a4/D8XOD aCwYV3v6mE3od4IpqSsJ6dnR9mgHQCX21FAb1xesVtAmIVTHxkmS6b6QpEV0D/pqKo+O QIghOsPhsMxdZl65sINjRZwTKi8Nn07kAhPIsBBQzh9Ssrt5SHARCep0qRQ6yxf+aN4k iNdwrA9aJw6qZJqIEw0vvQcQEmY7kvyUg9h+ButVZ0iiqYnT2ixryNT6GS4YUpULcalx cURA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=mfo82YA98TpSeqQuRA6pP0p4ALQRseCcXHjiwVKAERE=; b=bHlyU6JVldkkv/gic2Dlw50RvInyChy8Zna7nje+oVF+JsD820UMeT2I/L+TqzRGAj rrvZt5OWICcbD7ZSxBgLzDd343Js+X0zFsityfdawYn5XaRK6Wx7PtzkUml1L0nfarbm jx4mu9IckYmtSAIPdfEklTBgl96QN6G3arwF/ctOyIcVY67zcWPzrpfgTI3MuXaItS8K ubIcyW59y7Z7Hc6qFpzegQNRvhSYX0Igwqx/6arlGY7R353rUUzr5P7OEkEyosizv8J2 rpFLDZvC+9eZpwxn3d4OyxLNngAHmhjBtdeAleokO5GML/KzToPCHvNTtavmgGJoL0Fw pAeA== X-Gm-Message-State: AIkVDXJgIaBbnDfng7Sr12dr41eLr45aNZlnpVK84xwqQJrSa3G/jNCFWIyz3BypsPAWCf02 X-Received: by 10.28.29.141 with SMTP id d135mr563717wmd.108.1485469126154; Thu, 26 Jan 2017 14:18:46 -0800 (PST) Received: from u28f10e31dbcc580f6671.ant.amazon.com (bzq-84-108-251-164.cablep.bezeqint.net. [84.108.251.164]) by smtp.gmail.com with ESMTPSA id e16sm4690194wra.36.2017.01.26.14.18.44 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 26 Jan 2017 14:18:45 -0800 (PST) From: Netanel Belgazal To: linux-kernel@vger.kernel.org, davem@davemloft.net, netdev@vger.kernel.org Cc: Netanel Belgazal , dwmw@amazon.com, zorik@annapurnalabs.com, alex@annapurnalabs.com, saeed@annapurnalabs.com, msw@amazon.com, aliguori@amazon.com, nafea@annapurnalabs.com, eric.dumazet@gmail.com Subject: [PATCH V3 net-next 08/14] net/ena: fix potential access to freed memory during device reset Date: Fri, 27 Jan 2017 00:18:10 +0200 Message-Id: <1485469096-5271-9-git-send-email-netanel@annapurnalabs.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1485469096-5271-1-git-send-email-netanel@annapurnalabs.com> References: <1485469096-5271-1-git-send-email-netanel@annapurnalabs.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org If the ena driver detects that the device is not behave as expected, it tries to reset the device. The reset flow calls ena_down, which will frees all the resources the driver allocates and then it will reset the device. This flow can cause memory corruption if the device is still writes to the driver's memory space. To overcome this potential race, move the reset before the device resources are freed. Signed-off-by: Netanel Belgazal --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 56 +++++++++++++++++++++------- 1 file changed, 43 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index ea3c801..606fb5c 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -80,14 +80,18 @@ static void ena_tx_timeout(struct net_device *dev) { struct ena_adapter *adapter = netdev_priv(dev); + /* Change the state of the device to trigger reset + * Check that we are not in the middle or a trigger already + */ + + if (test_and_set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) + return; + u64_stats_update_begin(&adapter->syncp); adapter->dev_stats.tx_timeout++; u64_stats_update_end(&adapter->syncp); netif_err(adapter, tx_err, dev, "Transmit time out\n"); - - /* Change the state of the device to trigger reset */ - set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags); } static void update_rx_ring_mtu(struct ena_adapter *adapter, int mtu) @@ -1109,7 +1113,8 @@ static int ena_io_poll(struct napi_struct *napi, int budget) tx_budget = tx_ring->ring_size / ENA_TX_POLL_BUDGET_DIVIDER; - if (!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags)) { + if (!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags) || + test_bit(ENA_FLAG_TRIGGER_RESET, &tx_ring->adapter->flags)) { napi_complete_done(napi, 0); return 0; } @@ -1698,12 +1703,22 @@ static void ena_down(struct ena_adapter *adapter) adapter->dev_stats.interface_down++; u64_stats_update_end(&adapter->syncp); - /* After this point the napi handler won't enable the tx queue */ - ena_napi_disable_all(adapter); netif_carrier_off(adapter->netdev); netif_tx_disable(adapter->netdev); + /* After this point the napi handler won't enable the tx queue */ + ena_napi_disable_all(adapter); + /* After destroy the queue there won't be any new interrupts */ + + if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) { + int rc; + + rc = ena_com_dev_reset(adapter->ena_dev); + if (rc) + dev_err(&adapter->pdev->dev, "Device reset failed\n"); + } + ena_destroy_all_io_queues(adapter); ena_disable_io_intr_sync(adapter); @@ -2065,6 +2080,14 @@ static void ena_netpoll(struct net_device *netdev) struct ena_adapter *adapter = netdev_priv(netdev); int i; + /* Dont schedule NAPI if the driver is in the middle of reset + * or netdev is down. + */ + + if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags) || + test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) + return; + for (i = 0; i < adapter->num_queues; i++) napi_schedule(&adapter->ena_napi[i].napi); } @@ -2451,6 +2474,14 @@ static void ena_fw_reset_device(struct work_struct *work) bool dev_up, wd_state; int rc; + if (unlikely(!test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) { + dev_err(&pdev->dev, + "device reset schedule while reset bit is off\n"); + return; + } + + netif_carrier_off(netdev); + del_timer_sync(&adapter->timer_service); rtnl_lock(); @@ -2464,12 +2495,6 @@ static void ena_fw_reset_device(struct work_struct *work) */ ena_close(netdev); - rc = ena_com_dev_reset(ena_dev); - if (rc) { - dev_err(&pdev->dev, "Device reset failed\n"); - goto err; - } - ena_free_mgmnt_irq(adapter); ena_disable_msix(adapter); @@ -2482,6 +2507,8 @@ static void ena_fw_reset_device(struct work_struct *work) ena_com_mmio_reg_read_request_destroy(ena_dev); + clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags); + /* Finish with the destroy part. Start the init part */ rc = ena_device_init(ena_dev, adapter->pdev, &get_feat_ctx, &wd_state); @@ -2547,6 +2574,9 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter) if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags)) return; + if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) + return; + budget = ENA_MONITORED_TX_QUEUES; for (i = adapter->last_monitored_tx_qid; i < adapter->num_queues; i++) { @@ -2646,7 +2676,7 @@ static void ena_timer_service(unsigned long data) if (host_info) ena_update_host_info(host_info, adapter->netdev); - if (unlikely(test_and_clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) { + if (unlikely(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) { netif_err(adapter, drv, adapter->netdev, "Trigger reset is on\n"); ena_dump_stats_to_dmesg(adapter);