From patchwork Fri Jul 14 13:27:06 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Michael, Alice" X-Patchwork-Id: 788805 X-Patchwork-Delegate: jeffrey.t.kirsher@intel.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3x8QqP69pPz9s7F for ; Sat, 15 Jul 2017 07:31:29 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id 70CD08A42E; Fri, 14 Jul 2017 21:31:28 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ISj0Nvsn96Lb; Fri, 14 Jul 2017 21:31:26 +0000 (UTC) Received: from ash.osuosl.org (ash.osuosl.org [140.211.166.34]) by hemlock.osuosl.org (Postfix) with ESMTP id 2844F8A440; Fri, 14 Jul 2017 21:31:26 +0000 (UTC) X-Original-To: intel-wired-lan@lists.osuosl.org Delivered-To: intel-wired-lan@lists.osuosl.org Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by ash.osuosl.org (Postfix) with ESMTP id 570581C0683 for ; Fri, 14 Jul 2017 21:31:22 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id F17288A431 for ; Fri, 14 Jul 2017 21:31:21 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OFUTPmCyHuae for ; Fri, 14 Jul 2017 21:31:18 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by hemlock.osuosl.org (Postfix) with ESMTPS id 77D628A426 for ; Fri, 14 Jul 2017 21:31:17 +0000 (UTC) Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP; 14 Jul 2017 14:31:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.40,360,1496127600"; d="scan'208";a="125288635" Received: from unknown (HELO localhost.jf.intel.com) ([10.166.16.121]) by orsmga005.jf.intel.com with ESMTP; 14 Jul 2017 14:31:16 -0700 From: Alice Michael To: alice.michael@intel.com, intel-wired-lan@lists.osuosl.org Date: Fri, 14 Jul 2017 09:27:06 -0400 Message-Id: <20170714132709.52657-7-alice.michael@intel.com> X-Mailer: git-send-email 2.9.3 In-Reply-To: <20170714132709.52657-1-alice.michael@intel.com> References: <20170714132709.52657-1-alice.michael@intel.com> Subject: [Intel-wired-lan] [next PATCH S77-V3 07/10] i40e: shutdown all IRQs and disable MSI-X when suspended X-BeenThere: intel-wired-lan@osuosl.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Intel Wired Ethernet Linux Kernel Driver Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-wired-lan-bounces@osuosl.org Sender: "Intel-wired-lan" From: Jacob Keller On some platforms with a large number of CPUs, we will allocate many IRQ vectors. When hibernating, the system will attempt to migrate all of the vectors back to CPU0 when shutting down all the other CPUs. It is possible that we have so many vectors that it cannot re-assign them to CPU0. This is even more likely if we have many devices installed in one platform. The end result is failure to hibernate, as it is not possible to shutdown the CPUs. We can avoid this by disabling MSI-X and clearing our interrupt scheme when the device is suspended. A more ideal solution would be some method for the stack to properly handle this for all drivers, rather than on a case-by-case basis for each driver to fix itself. However, until this more ideal solution exists, we can do our part and shutdown our IRQs during suspend, which should allow systems with a large number of CPUs to safely suspend or hibernate. It may be worth investigating if we should shut down even further when we suspend as it may make the path cleaner, but this was the minimum fix for the hibernation issue mentioned here. Testing-hints: This affects systems with a large number of CPUs, and with multiple devices enabled. Without this change, those platforms are unable to hibernate at all. Signed-off-by: Jacob Keller Tested-by: Andrew Bowers --- drivers/net/ethernet/intel/i40e/i40e_main.c | 68 ++++++++++++++++++++++++- drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +- 2 files changed, 68 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 0d2e652..1ad014d 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -8340,6 +8340,57 @@ static int i40e_init_interrupt_scheme(struct i40e_pf *pf) return 0; } +#ifdef CONFIG_PM +/** + * i40e_restore_interrupt_scheme - Restore the interrupt scheme + * @pf: private board data structure + * + * Restore the interrupt scheme that was cleared when we suspended the + * device. This should be called during resume to re-allocate the q_vectors + * and reacquire IRQs. + */ +static int i40e_restore_interrupt_scheme(struct i40e_pf *pf) +{ + int err, i; + + /* We cleared the MSI and MSI-X flags when disabling the old interrupt + * scheme. We need to re-enabled them here in order to attempt to + * re-acquire the MSI or MSI-X vectors + */ + pf->flags |= (I40E_FLAG_MSIX_ENABLED | I40E_FLAG_MSI_ENABLED); + + err = i40e_init_interrupt_scheme(pf); + if (err) + return err; + + /* Now that we've re-acquired IRQs, we need to remap the vectors and + * rings together again. + */ + for (i = 0; i < pf->num_alloc_vsi; i++) { + if (pf->vsi[i]) { + err = i40e_vsi_alloc_q_vectors(pf->vsi[i]); + if (err) + goto err_unwind; + i40e_vsi_map_rings_to_vectors(pf->vsi[i]); + } + } + + err = i40e_setup_misc_vector(pf); + if (err) + goto err_unwind; + + return 0; + +err_unwind: + while (i--) { + if (pf->vsi[i]) + i40e_vsi_free_q_vectors(pf->vsi[i]); + } + + return err; +} +#endif /* CONFIG_PM */ + /** * i40e_setup_misc_vector - Setup the misc vector to handle non queue events * @pf: board private structure @@ -12063,7 +12114,12 @@ static int i40e_suspend(struct device *dev) wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0)); wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0)); - i40e_free_misc_vector(pf); + /* Clear the interrupt scheme and release our IRQs so that the system + * can safely hibernate even when there are a large number of CPUs. + * Otherwise hibernation might fail when mapping all the vectors back + * to CPU0. + */ + i40e_clear_interrupt_scheme(pf); return 0; } @@ -12076,11 +12132,21 @@ static int i40e_resume(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); struct i40e_pf *pf = pci_get_drvdata(pdev); + int err; /* If we're not suspended, then there is nothing to do */ if (!test_bit(__I40E_SUSPENDED, pf->state)) return 0; + /* We cleared the interrupt scheme when we suspended, so we need to + * restore it now to resume device functionality. + */ + err = i40e_restore_interrupt_scheme(pf); + if (err) { + dev_err(&pdev->dev, "Cannot restore interrupt scheme: %d\n", + err); + } + clear_bit(__I40E_DOWN, pf->state); i40e_reset_and_rebuild(pf, false, false); diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c index 29953c9..d8e17c2 100644 --- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c +++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c @@ -46,7 +46,7 @@ static const char i40evf_driver_string[] = #define DRV_VERSION_MAJOR 3 #define DRV_VERSION_MINOR 0 -#define DRV_VERSION_BUILD 0 +#define DRV_VERSION_BUILD 1 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \ __stringify(DRV_VERSION_MINOR) "." \ __stringify(DRV_VERSION_BUILD) \