From patchwork Tue Jun 12 13:27:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Mauro S. M. Rodrigues" X-Patchwork-Id: 928332 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 414rL41MgJz9s1b; Tue, 12 Jun 2018 23:28:12 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fSjLA-0005Zp-2V; Tue, 12 Jun 2018 13:28:04 +0000 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1fSjL8-0005Yy-K5 for kernel-team@lists.ubuntu.com; Tue, 12 Jun 2018 13:28:02 +0000 Received: from pps.filterd (m0098394.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w5CDRKfb054955 for ; Tue, 12 Jun 2018 09:28:01 -0400 Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152]) by mx0a-001b2d01.pphosted.com with ESMTP id 2jje8aub1t-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 12 Jun 2018 09:27:58 -0400 Received: from localhost by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 12 Jun 2018 07:27:46 -0600 Received: from b03cxnp08026.gho.boulder.ibm.com (9.17.130.18) by e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 12 Jun 2018 07:27:44 -0600 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w5CDRhDU10944802 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 12 Jun 2018 06:27:43 -0700 Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 25C32BE053; Tue, 12 Jun 2018 07:27:43 -0600 (MDT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E0EF3BE054; Tue, 12 Jun 2018 07:27:42 -0600 (MDT) Received: from localhost (unknown [9.18.239.186]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Tue, 12 Jun 2018 07:27:42 -0600 (MDT) From: "Mauro S. M. Rodrigues" To: kernel-team@lists.ubuntu.com Subject: [Bionic][PATCH v2] ixgbe/ixgbevf: Free IRQ when PCI error recovery removes the device Date: Tue, 12 Jun 2018 10:27:37 -0300 X-Mailer: git-send-email 2.7.4 X-TM-AS-GCONF: 00 x-cbid: 18061213-0016-0000-0000-000008F5780B X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00009175; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000265; SDB=6.01045893; UDB=6.00535603; IPR=6.00824871; MB=3.00021602; MTD=3.00000008; XFM=3.00000015; UTC=2018-06-12 13:27:45 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18061213-0017-0000-0000-00003F3F9B4C Message-Id: <1528810057-7417-1-git-send-email-maurosr@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-06-12_01:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=4 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1805220000 definitions=main-1806120154 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Mauro S M Rodrigues BugLink: http://bugs.launchpad.net/bugs/1776389 Since commit f7f37e7ff2b9 ("ixgbe: handle close/suspend race with netif_device_detach/present") ixgbe_close_suspend is called, from ixgbe_close, only if the device is present, i.e. if it isn't detached. That exposed a situation where IRQs weren't freed if a PCI error recovery system opts to remove the device. For such case the pci channel state is set to pci_channel_io_perm_failure and ixgbe_io_error_detected was returning PCI_ERS_RESULT_DISCONNECT before calling ixgbe_close_suspend consequentially not freeing IRQ and crashing when the remove handler calls pci_disable_device, hitting a BUG_ON at free_msi_irqs, which asserts that there is no non-free IRQ associated with the device to be removed: BUG_ON(irq_has_action(entry->irq + i)); The issue is fixed by calling the ixgbe_close_suspend before evaluate the pci channel state. Reported-by: Naresh Bannoth Reported-by: Abdul Haleem Signed-off-by: Mauro S M Rodrigues Reviewed-by: Alexander Duyck Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher (cherry picked from commit b212d815e77c72be921979119c715166cc8987b1) Signed-off-by: Mauro S. M. Rodrigues Acked-by: Stefan Bader Acked-by: Khalid Elmously --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +++--- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index a549c4870882..e1326c0603af 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -10910,14 +10910,14 @@ static pci_ers_result_t ixgbe_io_error_detected(struct pci_dev *pdev, rtnl_lock(); netif_device_detach(netdev); + if (netif_running(netdev)) + ixgbe_close_suspend(adapter); + if (state == pci_channel_io_perm_failure) { rtnl_unlock(); return PCI_ERS_RESULT_DISCONNECT; } - if (netif_running(netdev)) - ixgbe_close_suspend(adapter); - if (!test_and_set_bit(__IXGBE_DISABLED, &adapter->state)) pci_disable_device(pdev); rtnl_unlock(); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 1f4a69134ade..64859f8318c8 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -4241,14 +4241,14 @@ static pci_ers_result_t ixgbevf_io_error_detected(struct pci_dev *pdev, rtnl_lock(); netif_device_detach(netdev); + if (netif_running(netdev)) + ixgbevf_close_suspend(adapter); + if (state == pci_channel_io_perm_failure) { rtnl_unlock(); return PCI_ERS_RESULT_DISCONNECT; } - if (netif_running(netdev)) - ixgbevf_close_suspend(adapter); - if (!test_and_set_bit(__IXGBEVF_DISABLED, &adapter->state)) pci_disable_device(pdev); rtnl_unlock();