From patchwork Thu Jun 1 22:40:46 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Keller, Jacob E" X-Patchwork-Id: 770012 X-Patchwork-Delegate: jeffrey.t.kirsher@intel.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3wf2Pl0jGlz9sNJ for ; Fri, 2 Jun 2017 08:41:15 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by silver.osuosl.org (Postfix) with ESMTP id 85B0530C68; Thu, 1 Jun 2017 22:41:13 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from silver.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Zn3sZRoy6nw6; Thu, 1 Jun 2017 22:41:03 +0000 (UTC) Received: from ash.osuosl.org (ash.osuosl.org [140.211.166.34]) by silver.osuosl.org (Postfix) with ESMTP id E736E30C7D; Thu, 1 Jun 2017 22:41:00 +0000 (UTC) X-Original-To: intel-wired-lan@lists.osuosl.org Delivered-To: intel-wired-lan@lists.osuosl.org Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by ash.osuosl.org (Postfix) with ESMTP id 73E411C3EBE for ; Thu, 1 Jun 2017 22:40:57 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id 6D79C88B42 for ; Thu, 1 Jun 2017 22:40:57 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QwNNoPUQTSFu for ; Thu, 1 Jun 2017 22:40:56 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by fraxinus.osuosl.org (Postfix) with ESMTPS id C70A688B3C for ; Thu, 1 Jun 2017 22:40:56 +0000 (UTC) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Jun 2017 15:40:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos; i="5.39,281,1493708400"; d="scan'208"; a="1155582873" Received: from jekeller-desk.amr.corp.intel.com ([10.166.35.158]) by fmsmga001.fm.intel.com with ESMTP; 01 Jun 2017 15:40:56 -0700 From: Jacob Keller To: Intel Wired LAN Date: Thu, 1 Jun 2017 15:40:46 -0700 Message-Id: <20170601224051.6106-11-jacob.e.keller@intel.com> X-Mailer: git-send-email 2.13.0.311.g0339965c70d6 In-Reply-To: <20170601224051.6106-1-jacob.e.keller@intel.com> References: <20170601224051.6106-1-jacob.e.keller@intel.com> Subject: [Intel-wired-lan] [PATCH 10/15] fm10k: prepare_for_reset() when we lose PCIe Link X-BeenThere: intel-wired-lan@osuosl.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Intel Wired Ethernet Linux Kernel Driver Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-wired-lan-bounces@osuosl.org Sender: "Intel-wired-lan" If we lose PCIe link, such as when an unannounced PFLR event occurs, or when a device is surprise removed, we currently detach the device and close the netdev. This unfortunately leaves a lot of things still active, such as the msix_mbx_pf IRQ, and Tx/Rx resources. This can cause problems because the register reads will return potentially invalid values which may result in unknown driver behavior. Begin the process of resetting using fm10k_prepare_for_reset(), much in the same way as the suspend and resume cycle does. This will attempt to shutdown as much as possible, in order to prevent possible issues. Since the __FM10K_RESETTING state is long lived, we'll also stop waiting for it when we check to the fm10k_reset_subtask. This is important since otherwise it would deadlock with the fm10k_detach_subtask. Additionally, stop attempting to manage the mailbox subtask if we're detached/resetting, as there is nothing to do when we don't have a PCIe address. Overall this produces a much cleaner shutdown and recovery cycle for a PCIe surprise remove event. Signed-off-by: Jacob Keller --- drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 39 +++++++++++++++++++--------- 1 file changed, 27 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c index 6a7b4c5429ae..2d94a16f9613 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c @@ -141,8 +141,9 @@ static void fm10k_prepare_for_reset(struct fm10k_intfc *interface) /* put off any impending NetWatchDogTimeout */ netif_trans_update(netdev); - while (test_and_set_bit(__FM10K_RESETTING, interface->state)) - usleep_range(1000, 2000); + /* Nothing to do if a reset is already in progress */ + if (test_and_set_bit(__FM10K_RESETTING, interface->state)) + return; rtnl_lock(); @@ -168,6 +169,8 @@ static int fm10k_handle_reset(struct fm10k_intfc *interface) struct fm10k_hw *hw = &interface->hw; int err; + WARN_ON(!test_bit(__FM10K_RESETTING, interface->state)); + rtnl_lock(); pci_set_master(interface->pdev); @@ -247,27 +250,33 @@ static void fm10k_detach_subtask(struct fm10k_intfc *interface) u32 __iomem *hw_addr; u32 value; - /* do nothing if device is still present or hw_addr is set */ + /* do nothing if netdev is still present or hw_addr is set */ if (netif_device_present(netdev) || interface->hw.hw_addr) return; + /* We've lost the PCIe register space, and can no longer access the + * device. Shut everything except the detach subtask down and prepare + * to reset the device in case we recover. + */ + fm10k_prepare_for_reset(interface); + /* check the real address space to see if we've recovered */ hw_addr = READ_ONCE(interface->uc_addr); value = readl(hw_addr); if (~value) { + /* Restore the hardware address */ interface->hw.hw_addr = interface->uc_addr; + + /* PCIe link has been restored, and the device is active + * again. Restore everything and reset the device. + */ + fm10k_handle_reset(interface); + + /* Re-attach the netdev */ netif_device_attach(netdev); - set_bit(FM10K_FLAG_RESET_REQUESTED, interface->flags); netdev_warn(netdev, "PCIe link restored, device now attached\n"); return; } - - rtnl_lock(); - - if (netif_running(netdev)) - dev_close(netdev); - - rtnl_unlock(); } static void fm10k_reinit(struct fm10k_intfc *interface) @@ -360,6 +369,10 @@ static void fm10k_watchdog_update_host_state(struct fm10k_intfc *interface) **/ static void fm10k_mbx_subtask(struct fm10k_intfc *interface) { + /* If we're resetting, bail out */ + if (test_bit(__FM10K_RESETTING, interface->state)) + return; + /* process upstream mailbox and update device state */ fm10k_watchdog_update_host_state(interface); @@ -609,9 +622,11 @@ static void fm10k_service_task(struct work_struct *work) interface = container_of(work, struct fm10k_intfc, service_task); + /* Check whether we're detached first */ + fm10k_detach_subtask(interface); + /* tasks run even when interface is down */ fm10k_mbx_subtask(interface); - fm10k_detach_subtask(interface); fm10k_reset_subtask(interface); /* tasks only run when interface is up */