Patchwork [3.9,stable] iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets

Submitter Neil Horman
Date June 26, 2013, 1:26 p.m.
Message ID <1372253195-17053-1-git-send-email-nhorman@tuxdriver.com>
Permalink /patch/254738/
State Not Applicable

Comments

Neil Horman - June 26, 2013, 1:26 p.m.
A few years back Intel published a spec update:
http://www.intel.com/content/dam/doc/specification-update/5520-and-5500-chipset-ioh-specification-update.pdf

for the 5520 and 5500 chipsets. It contains an erratum (specifically erratum
53) noting that these chipsets can't properly do interrupt remapping, and
as a result Intel recommends that interrupt remapping be disabled in the BIOS.
While many vendors have a BIOS update to do exactly that, not all do, and of
course not all users update their BIOS to a level that corrects the problem.
As a result, occasionally interrupts can arrive at a CPU even after affinity
for that interrupt has been moved, leading to lost or spurious interrupts,
usually characterized by the message:
kernel: do_IRQ: 7.71 No irq handler for vector (irq -1)

There have been several incidents recently of people seeing this error, and
investigation has shown that they have systems whose BIOS level is such
that this feature was not properly turned off.  As such, it would be good to
give them a reminder that their systems are vulnerable to this problem.  For
details of those that reported the problem, please see:
https://bugzilla.redhat.com/show_bug.cgi?id=887006

[ Joerg: Removed CONFIG_IRQ_REMAP ifdef from early-quirks.c ]

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Prarit Bhargava <prarit@redhat.com>
CC: Don Zickus <dzickus@redhat.com>
CC: Don Dutile <ddutile@redhat.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Asit Mallick <asit.k.mallick@intel.com>
CC: David Woodhouse <dwmw2@infradead.org>
CC: linux-pci@vger.kernel.org
CC: Joerg Roedel <joro@8bytes.org>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Arkadiusz Miskiewicz <arekm@maven.pl>
CC: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
---
 arch/x86/include/asm/irq_remapping.h |  3 +++
 arch/x86/kernel/early-quirks.c       | 20 ++++++++++++++++++++
 drivers/iommu/intel_irq_remapping.c  | 10 ++++++++++
 drivers/iommu/irq_remapping.c        |  6 ++++++
 drivers/iommu/irq_remapping.h        |  2 ++
 5 files changed, 41 insertions(+)
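
The bracketed note about dropping the CONFIG_IRQ_REMAP ifdef from
early-quirks.c works because the header in the patch declares
set_irq_remapping_broken() when the option is enabled and provides an empty
static inline stub otherwise, so the caller compiles either way. A minimal
user-space sketch of that pattern follows; MODEL_IRQ_REMAP and every model_*
name are hypothetical stand-ins, not kernel symbols.

```c
/* MODEL_IRQ_REMAP is a hypothetical stand-in for CONFIG_IRQ_REMAP. */
#ifdef MODEL_IRQ_REMAP

static int model_broken;

/* Real implementation: record that remapping must not be used. */
static void model_set_broken(void)
{
	model_broken = 1;
}

static int model_is_broken(void)
{
	return model_broken;
}

#else /* !MODEL_IRQ_REMAP */

/* Stubs: same signatures, no effect, so callers need no #ifdef. */
static inline void model_set_broken(void) { }
static inline int model_is_broken(void) { return 0; }

#endif /* MODEL_IRQ_REMAP */

/*
 * Caller written once, with no conditional compilation. It mirrors the
 * shape of intel_remapping_check(): flag revision 0x13 and nothing else.
 */
static int model_quirk(unsigned char revision)
{
	if (revision == 0x13)
		model_set_broken();
	return model_is_broken();
}
```

Built without MODEL_IRQ_REMAP, model_quirk() always returns 0 because the
stub does nothing; built with it defined, a revision of 0x13 sets the flag.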
Greg KH - June 26, 2013, 5:26 p.m.
On Wed, Jun 26, 2013 at 09:26:35AM -0400, Neil Horman wrote:

Please let me know what the git commit id of the patch you are asking to
be applied is, in Linus's tree.  Otherwise I'll just assume you are
trying to get a patch into a stable branch that isn't in Linus's tree,
which isn't allowed, and I know you know better than to try to do that
:)

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman - June 26, 2013, 5:39 p.m.
On Wed, Jun 26, 2013 at 10:26:10AM -0700, Greg KH wrote:
> 
> Please let me know what the git commit id of the patch you are asking to
> be applied is, in Linus's tree.  Otherwise I'll just assume you are
> trying to get a patch into a stable branch that isn't in Linus's tree,
> which isn't allowed, and I know you know better than to try to do that
> :)
> 
Sorry, I normally do a cherry-pick -x, but I fat-fingered it in my rush :).  As
was previously noted, it's from commit 03bbcb2e7e292838bb0244f5a7816d194c911d62
Neil

> thanks,
> 
> greg k-h
> 

Patch

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 95fd352..b00bf09 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -23,11 +23,13 @@ 
 #define __X86_IRQ_REMAPPING_H
 
 #include <asm/io_apic.h>
+#include <linux/irq.h>
 
 #ifdef CONFIG_IRQ_REMAP
 
 extern void setup_irq_remapping_ops(void);
 extern int irq_remapping_supported(void);
+extern void set_irq_remapping_broken(void);
 extern int irq_remapping_prepare(void);
 extern int irq_remapping_enable(void);
 extern void irq_remapping_disable(void);
@@ -54,6 +56,7 @@  void irq_remap_modify_chip_defaults(struct irq_chip *chip);
 
 static inline void setup_irq_remapping_ops(void) { }
 static inline int irq_remapping_supported(void) { return 0; }
+static inline void set_irq_remapping_broken(void) { }
 static inline int irq_remapping_prepare(void) { return -ENODEV; }
 static inline int irq_remapping_enable(void) { return -ENODEV; }
 static inline void irq_remapping_disable(void) { }
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 3755ef4..94ab6b9 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -18,6 +18,7 @@ 
 #include <asm/apic.h>
 #include <asm/iommu.h>
 #include <asm/gart.h>
+#include <asm/irq_remapping.h>
 
 static void __init fix_hypertransport_config(int num, int slot, int func)
 {
@@ -192,6 +193,21 @@  static void __init ati_bugs_contd(int num, int slot, int func)
 }
 #endif
 
+static void __init intel_remapping_check(int num, int slot, int func)
+{
+	u8 revision;
+
+	revision = read_pci_config_byte(num, slot, func, PCI_REVISION_ID);
+
+	/*
+	 * Revision 0x13 of this chipset supports irq remapping
+	 * but has an erratum that breaks its behavior, flag it as such
+	 */
+	if (revision == 0x13)
+		set_irq_remapping_broken();
+
+}
+
 #define QFLAG_APPLY_ONCE 	0x1
 #define QFLAG_APPLIED		0x2
 #define QFLAG_DONE		(QFLAG_APPLY_ONCE|QFLAG_APPLIED)
@@ -221,6 +237,10 @@  static struct chipset early_qrk[] __initdata = {
 	  PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs },
 	{ PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
 	  PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs_contd },
+	{ PCI_VENDOR_ID_INTEL, 0x3403, PCI_CLASS_BRIDGE_HOST,
+	  PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
+	{ PCI_VENDOR_ID_INTEL, 0x3406, PCI_CLASS_BRIDGE_HOST,
+	  PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
 	{}
 };
 
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index f3b8f23..5b19b2d 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -524,6 +524,16 @@  static int __init intel_irq_remapping_supported(void)
 
 	if (disable_irq_remap)
 		return 0;
+	if (irq_remap_broken) {
+		WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
+			   "This system BIOS has enabled interrupt remapping\n"
+			   "on a chipset that contains an erratum making that\n"
+			   "feature unstable.  To maintain system stability\n"
+			   "interrupt remapping is being disabled.  Please\n"
+			   "contact your BIOS vendor for an update\n");
+		disable_irq_remap = 1;
+		return 0;
+	}
 
 	if (!dmar_ir_support())
 		return 0;
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 7c11ff3..dcfea4e 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -18,6 +18,7 @@ 
 int irq_remapping_enabled;
 
 int disable_irq_remap;
+int irq_remap_broken;
 int disable_sourceid_checking;
 int no_x2apic_optout;
 
@@ -210,6 +211,11 @@  void __init setup_irq_remapping_ops(void)
 #endif
 }
 
+void set_irq_remapping_broken(void)
+{
+	irq_remap_broken = 1;
+}
+
 int irq_remapping_supported(void)
 {
 	if (disable_irq_remap)
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index ecb6376..90c4dae 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -32,6 +32,7 @@  struct pci_dev;
 struct msi_msg;
 
 extern int disable_irq_remap;
+extern int irq_remap_broken;
 extern int disable_sourceid_checking;
 extern int no_x2apic_optout;
 extern int irq_remapping_enabled;
@@ -89,6 +90,7 @@  extern struct irq_remap_ops amd_iommu_irq_ops;
 
 #define irq_remapping_enabled 0
 #define disable_irq_remap     1
+#define irq_remap_broken      0
 
 #endif /* CONFIG_IRQ_REMAP */
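
Read as a whole, the patch has two halves: the early quirk sets a flag when it
finds device 0x3403 or 0x3406 at revision 0x13, and
intel_irq_remapping_supported() later consults that flag and disables
remapping. The following is a rough user-space model of that flow, not kernel
code. All model_* names are hypothetical, the PCI class match done by the real
early_qrk[] table is left out, and the hw_supports_ir parameter is an assumed
stand-in for the remaining hardware checks such as dmar_ir_support().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VENDOR_INTEL 0x8086

/* Hypothetical description of one PCI function seen during early boot. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t revision;
};

static bool model_irq_remap_broken;	/* models irq_remap_broken */
static bool model_disable_irq_remap;	/* models disable_irq_remap */

/* Device IDs of the two early_qrk[] entries the patch adds. */
static const uint16_t model_quirk_devices[] = { 0x3403, 0x3406 };

/* Models intel_remapping_check(): only revision 0x13 is flagged. */
static void model_remapping_check(const struct model_dev *dev)
{
	if (dev->revision == 0x13)
		model_irq_remap_broken = true;
}

/* Models the quirk table walk, matching on vendor and device only. */
static void model_run_early_quirks(const struct model_dev *devs, size_t n)
{
	const size_t nquirks =
		sizeof(model_quirk_devices) / sizeof(model_quirk_devices[0]);

	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (devs[i].vendor != MODEL_VENDOR_INTEL)
			continue;
		for (size_t j = 0; j < nquirks; j++) {
			if (devs[i].device == model_quirk_devices[j])
				model_remapping_check(&devs[i]);
		}
	}
}

/* Models the order of checks in intel_irq_remapping_supported(). */
static bool model_remapping_supported(bool hw_supports_ir)
{
	if (model_disable_irq_remap)
		return false;

	if (model_irq_remap_broken) {
		/* The kernel also emits a WARN_TAINT() message here. */
		model_disable_irq_remap = true;
		return false;
	}

	return hw_supports_ir;
}
```

In this model a 0x3403 device at revision 0x12 leaves remapping available,
while a 0x3406 device at revision 0x13 makes model_remapping_supported()
return false and latches model_disable_irq_remap, so later callers see
remapping as disabled too.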