From patchwork Tue Jun 12 12:57:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Mauro S. M. Rodrigues" X-Patchwork-Id: 928306 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 414qgX0GzDz9s1B; Tue, 12 Jun 2018 22:58:16 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fSis9-0003FQ-Pm; Tue, 12 Jun 2018 12:58:05 +0000 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5] helo=mx0a-001b2d01.pphosted.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1fSis7-0003F9-A0 for kernel-team@lists.ubuntu.com; Tue, 12 Jun 2018 12:58:03 +0000 Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w5CCv45j027340 for ; Tue, 12 Jun 2018 08:58:02 -0400 Received: from e14.ny.us.ibm.com (e14.ny.us.ibm.com [129.33.205.204]) by mx0a-001b2d01.pphosted.com with ESMTP id 2jje3529f6-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 12 Jun 2018 08:58:02 -0400 Received: from localhost by e14.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 12 Jun 2018 08:58:01 -0400 Received: from b01cxnp22033.gho.pok.ibm.com (9.57.198.23) by e14.ny.us.ibm.com (146.89.104.201) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 12 Jun 2018 08:57:59 -0400 Received: from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com [9.57.199.111]) by b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w5CCvwkA64815156 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 12 Jun 2018 12:57:58 GMT Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E587BAC05F; Tue, 12 Jun 2018 08:59:20 -0400 (EDT) Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9CEE9AC059; Tue, 12 Jun 2018 08:59:20 -0400 (EDT) Received: from localhost (unknown [9.18.239.186]) by b01ledav006.gho.pok.ibm.com (Postfix) with ESMTP; Tue, 12 Jun 2018 08:59:20 -0400 (EDT) From: "Mauro S. M. Rodrigues" To: kernel-team@lists.ubuntu.com Subject: [PATCH] ixgbe/ixgbevf: Free IRQ when PCI error recovery removes the device Date: Tue, 12 Jun 2018 09:57:51 -0300 X-Mailer: git-send-email 2.7.4 X-TM-AS-GCONF: 00 x-cbid: 18061212-0052-0000-0000-000002FD75B3 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00009175; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000265; SDB=6.01045884; UDB=6.00535597; IPR=6.00824861; MB=3.00021602; MTD=3.00000008; XFM=3.00000015; UTC=2018-06-12 12:58:00 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18061212-0053-0000-0000-00005CFDD398 Message-Id: <1528808271-31533-1-git-send-email-maurosr@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-06-12_01:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=4 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1805220000 definitions=main-1806120150 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Mauro S M Rodrigues BugLink: http://bugs.launchpad.net/bugs/1776389 Since commit f7f37e7ff2b9 ("ixgbe: handle close/suspend race with netif_device_detach/present") ixgbe_close_suspend is called, from ixgbe_close, only if the device is present, i.e. if it isn't detached. That exposed a situation where IRQs weren't freed if a PCI error recovery system opts to remove the device. For such case the pci channel state is set to pci_channel_io_perm_failure and ixgbe_io_error_detected was returning PCI_ERS_RESULT_DISCONNECT before calling ixgbe_close_suspend consequentially not freeing IRQ and crashing when the remove handler calls pci_disable_device, hitting a BUG_ON at free_msi_irqs, which asserts that there is no non-free IRQ associated with the device to be removed: BUG_ON(irq_has_action(entry->irq + i)); The issue is fixed by calling the ixgbe_close_suspend before evaluate the pci channel state. Reported-by: Naresh Bannoth Reported-by: Abdul Haleem Signed-off-by: Mauro S M Rodrigues Reviewed-by: Alexander Duyck Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher (cherry picked from commit b212d815e77c72be921979119c715166cc8987b1) --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +++--- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index a549c4870882..e1326c0603af 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -10910,14 +10910,14 @@ static pci_ers_result_t ixgbe_io_error_detected(struct pci_dev *pdev, rtnl_lock(); netif_device_detach(netdev); + if (netif_running(netdev)) + ixgbe_close_suspend(adapter); + if (state == pci_channel_io_perm_failure) { rtnl_unlock(); return PCI_ERS_RESULT_DISCONNECT; } - if (netif_running(netdev)) - ixgbe_close_suspend(adapter); - if (!test_and_set_bit(__IXGBE_DISABLED, &adapter->state)) pci_disable_device(pdev); rtnl_unlock(); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 1f4a69134ade..64859f8318c8 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -4241,14 +4241,14 @@ static pci_ers_result_t ixgbevf_io_error_detected(struct pci_dev *pdev, rtnl_lock(); netif_device_detach(netdev); + if (netif_running(netdev)) + ixgbevf_close_suspend(adapter); + if (state == pci_channel_io_perm_failure) { rtnl_unlock(); return PCI_ERS_RESULT_DISCONNECT; } - if (netif_running(netdev)) - ixgbevf_close_suspend(adapter); - if (!test_and_set_bit(__IXGBEVF_DISABLED, &adapter->state)) pci_disable_device(pdev); rtnl_unlock();