From patchwork Tue Nov 8 17:57:47 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prarit Bhargava
X-Patchwork-Id: 692417
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 3tCxqT4s39z9t1d
 for ; Wed, 9 Nov 2016 04:57:57 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751668AbcKHR54 (ORCPT ); Tue, 8 Nov 2016 12:57:56 -0500
Received: from mx1.redhat.com ([209.132.183.28]:35308 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751427AbcKHR5y (ORCPT ); Tue, 8 Nov 2016 12:57:54 -0500
Received: from int-mx10.intmail.prod.int.phx2.redhat.com
 (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id C3F045A5D;
 Tue, 8 Nov 2016 17:57:53 +0000 (UTC)
Received: from praritdesktop.bos.redhat.com
 (prarit-guest.khw.lab.eng.bos.redhat.com [10.16.186.145])
 by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
 id uA8HvqFI008166; Tue, 8 Nov 2016 12:57:52 -0500
From: Prarit Bhargava
To: linux-pci@vger.kernel.org
Cc: Prarit Bhargava , alex.williamson@redhat.com, darcari@redhat.com,
 mstowe@redhat.com, bhelgaas@google.com, lukas@wunner.de,
 keith.busch@intel.com, mika.westerberg@linux.intel.com
Subject: [PATCH] pci: Only disable MSI/X and enable INTx if shutdown function
 has been called
Date: Tue, 8 Nov 2016 12:57:47 -0500
Message-Id: <1478627867-28795-1-git-send-email-prarit@redhat.com>
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.30]); Tue, 08 Nov 2016 17:57:54 +0000 (UTC)
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

Bjorn,

We have seen this at Red Hat on various drivers: nouveau, ahci, mei_me,
and pcieport (so far). A Google search for "unhandled irq 16" yields many
reports of similar behavior during shutdown, which suggests that this
problem is widespread.

I can cause this to happen on a "stable" system by adding a 3 second delay
in pci_device_shutdown(). The delay lets the number of spurious interrupts
exceed the 100000 limit, and the warning below is displayed, primarily for
the nouveau driver and occasionally for the other drivers mentioned.

A patch for this was proposed and rejected here for being too risky:

https://patchwork.kernel.org/patch/5990701/

I also originally posted a patch to resolve this here:

http://marc.info/?l=linux-pci&m=147705209308588&w=2

and several other patch suggestions were made. The problem with all of
these solutions is that there is some risk associated with them (kdump,
kvm, etc.) and they paper over the real issue: the PCI shutdown path
should not blindly switch to INTx for all devices.

I am reproposing the original suggested patch. There is some risk
associated with it, but I don't think it is any more or any less than with
the other patches, and the other patches seem to only apply band-aids to
the problem.

[Aside: Lukas Wunner asked why this always happens on IRQ 16 (even when
the legacy device says IRQ 32 in lspci). The PCI irq pins A, B, C, and D
are routed according to the ACPI _PRT table for the device. _In general_,
I have noted a consistent pattern for PCI irq pins such that

	irq pin A is IRQ 0x10 (16)
	irq pin B is IRQ 0x11 (17)
	irq pin C is IRQ 0x12 (18)
	irq pin D is IRQ 0x13 (19)

Since the device's IRQ is hooked up to pin A we're seeing the unhandled
interrupt on IRQ 16.]

I have tested this on various systems with KVM and kdump (and kdump on KVM)
and didn't see any issues.

NOTE: In my testing this resolves the problem with PCI based serial ports
cutting off their output during shutdown. Again, this can be tracked to
the PCI shutdown path switching between MSI & INTx independently of the
driver.

----8<----

The following unhandled IRQ warning is seen during shutdown:

irq 16: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/01/2016
 0000000000000000 ffff88041f803e70 ffffffff81333bd5 ffff88041cb78200
 ffff88041cb7829c ffff88041f803e98 ffffffff810d9465 ffff88041cb78200
 0000000000000000 0000000000000028 ffff88041f803ed0 ffffffff810d97bf
Call Trace:
 [] dump_stack+0x63/0x8e
 [] __report_bad_irq+0x35/0xd0
 [] note_interrupt+0x20f/0x260
 [] handle_irq_event_percpu+0x45/0x60
 [] handle_irq_event+0x2c/0x50
 [] handle_fasteoi_irq+0x8a/0x150
 [] handle_irq+0xab/0x130
 [] ? _local_bh_enable+0x21/0x50
 [] do_IRQ+0x4d/0xd0
 [] common_interrupt+0x82/0x82
 [] ? cpuidle_enter_state+0xc1/0x280
 [] ? cpuidle_enter_state+0xb4/0x280
 [] cpuidle_enter+0x17/0x20
 [] cpu_startup_entry+0x220/0x3a0
 [] rest_init+0x77/0x80
 [] start_kernel+0x495/0x4a2
 [] ? set_init_arg+0x55/0x55
 [] ? early_idt_handler_array+0x120/0x120
 [] x86_64_start_reservations+0x2a/0x2c
 [] x86_64_start_kernel+0x13d/0x14c

pci_device_shutdown() is called on each PCI device, and does

	if (drv && drv->shutdown)
		drv->shutdown(pci_dev);
	pci_msi_shutdown(pci_dev);
	pci_msix_shutdown(pci_dev);

The pci_msi_shutdown() and pci_msix_shutdown() functions both call
pci_intx_for_msi(), which enables the INTx interrupt without any
involvement from the driver. The problem is that the driver may not have a
shutdown function, in which case the device remains active. The driver
continues to operate the PCI device, and the device now signals its
interrupts as INTx. The driver, however, has not registered a handler for
INTx, so the interrupt line remains asserted, which leads to an unhandled
IRQ warning.

Signed-off-by: Prarit Bhargava
Cc: alex.williamson@redhat.com
Cc: darcari@redhat.com
Cc: mstowe@redhat.com
Cc: bhelgaas@google.com
Cc: lukas@wunner.de
Cc: keith.busch@intel.com
Cc: mika.westerberg@linux.intel.com
---
 drivers/pci/pci-driver.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 1ccce1cd6aca..87c35db5a564 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -461,10 +461,11 @@ static void pci_device_shutdown(struct device *dev)
 
 	pm_runtime_resume(dev);
 
-	if (drv && drv->shutdown)
+	if (drv && drv->shutdown) {
 		drv->shutdown(pci_dev);
-	pci_msi_shutdown(pci_dev);
-	pci_msix_shutdown(pci_dev);
+		pci_msi_shutdown(pci_dev);
+		pci_msix_shutdown(pci_dev);
+	}
 
 	/*
 	 * If this is a kexec reboot, turn off Bus Master bit on the