diff mbox series

[1/1] x86/PCI: Disable exclusion of E820 reserved addresses in some cases

Message ID 20220216150121.9400-2-hdegoede@redhat.com
State New
Headers show
Series x86/PCI: Disable exclusion of E820 reserved addresses in some cases | expand

Commit Message

Hans de Goede Feb. 16, 2022, 3:01 p.m. UTC
Some fw has a bug where the PCI bridge window returned by the ACPI
resources partly overlaps with some other address range, causing issues.
To workaround this Linux excludes E820 reserved addresses when allocating
addresses from the PCI bridge window. 2 known examples of such fw bugs are:

1. The returned window contains addresses which map to system RAM,
see commit 4dc2287c1805 ("x86: avoid E820 regions when allocating
address space").

2. The Lenovo X1 carbon gen 2 BIOS has an overlap between an EFI/E820
reserved range and the ACPI provided PCI bridge window:
 efi: mem46: [MMIO] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB)
 BIOS-e820: [mem 0x00000000dceff000-0x00000000dfa0ffff] reserved
 pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
If Linux assigns the overlapping 0xdfa00000-0xdfa0ffff range to a PCI BAR
then the system fails to resume after a suspend.

Recently (2019) some systems have shown-up with EFI memmap MMIO entries
covering the entire ACPI provided PCI bridge window. These memmap entries
get converted into e820_table entries, causing all attempts to assign
memory to PCI BARs which have not been setup by the BIOS to fail.
For example see these dmesg snippets from a Lenovo IdeaPad 3 15IIL 81WE:
 efi: mem63: [MMIO] range=[0x0000000065400000-0x00000000cfffffff] (1708MB)
 BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
 pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
 pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
 pci 0000:00:15.0: BAR 0: failed to assign [mem size 0x00001000 64bit]

To fix this, check if the ACPI provided PCI bridge window is fully
contained within in EFI memmap MMIO region and in that case disable
the "exclude E820 reserved addresses" workaround, fixing the problem
of not being able to find free space for unassigned BARs.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2029207
BugLink: https://bugs.launchpad.net/bugs/1878279
BugLink: https://bugs.launchpad.net/bugs/1931715
BugLink: https://bugs.launchpad.net/bugs/1932069
BugLink: https://bugs.launchpad.net/bugs/1921649
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 arch/x86/include/asm/pci_x86.h |  8 +++++
 arch/x86/kernel/resource.c     |  4 +++
 arch/x86/pci/acpi.c            | 65 +++++++++++++++++++++++++++++++++-
 3 files changed, 76 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki Feb. 16, 2022, 3:47 p.m. UTC | #1
On Wed, Feb 16, 2022 at 4:02 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Some fw has a bug where the PCI bridge window returned by the ACPI
> resources partly overlaps with some other address range, causing issues.
> To workaround this Linux excludes E820 reserved addresses when allocating
> addresses from the PCI bridge window. 2 known examples of such fw bugs are:
>
> 1. The returned window contains addresses which map to system RAM,
> see commit 4dc2287c1805 ("x86: avoid E820 regions when allocating
> address space").
>
> 2. The Lenovo X1 carbon gen 2 BIOS has an overlap between an EFI/E820
> reserved range and the ACPI provided PCI bridge window:
>  efi: mem46: [MMIO] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB)
>  BIOS-e820: [mem 0x00000000dceff000-0x00000000dfa0ffff] reserved
>  pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
> If Linux assigns the overlapping 0xdfa00000-0xdfa0ffff range to a PCI BAR
> then the system fails to resume after a suspend.
>
> Recently (2019) some systems have shown-up with EFI memmap MMIO entries
> covering the entire ACPI provided PCI bridge window. These memmap entries
> get converted into e820_table entries, causing all attempts to assign
> memory to PCI BARs which have not been setup by the BIOS to fail.
> For example see these dmesg snippets from a Lenovo IdeaPad 3 15IIL 81WE:
>  efi: mem63: [MMIO] range=[0x0000000065400000-0x00000000cfffffff] (1708MB)
>  BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>  pci 0000:00:15.0: BAR 0: failed to assign [mem size 0x00001000 64bit]
>
> To fix this, check if the ACPI provided PCI bridge window is fully
> contained within in EFI memmap MMIO region and in that case disable
> the "exclude E820 reserved addresses" workaround, fixing the problem
> of not being able to find free space for unassigned BARs.
>
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2029207
> BugLink: https://bugs.launchpad.net/bugs/1878279
> BugLink: https://bugs.launchpad.net/bugs/1931715
> BugLink: https://bugs.launchpad.net/bugs/1932069
> BugLink: https://bugs.launchpad.net/bugs/1921649
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thanks a lot for working on this!

> ---
>  arch/x86/include/asm/pci_x86.h |  8 +++++
>  arch/x86/kernel/resource.c     |  4 +++
>  arch/x86/pci/acpi.c            | 65 +++++++++++++++++++++++++++++++++-
>  3 files changed, 76 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
> index 490411dba438..997379b022ee 100644
> --- a/arch/x86/include/asm/pci_x86.h
> +++ b/arch/x86/include/asm/pci_x86.h
> @@ -64,6 +64,8 @@ void pcibios_scan_specific_bus(int busn);
>
>  /* pci-irq.c */
>
> +struct pci_dev;
> +
>  struct irq_info {
>         u8 bus, devfn;                  /* Bus, device and function */
>         struct {
> @@ -232,3 +234,9 @@ static inline void mmio_config_writel(void __iomem *pos, u32 val)
>  # define x86_default_pci_init_irq      NULL
>  # define x86_default_pci_fixup_irqs    NULL
>  #endif
> +
> +#if defined CONFIG_PCI && defined CONFIG_ACPI
> +extern bool pci_use_e820;
> +#else
> +#define pci_use_e820 true
> +#endif
> diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
> index 9b9fb7882c20..e8dc9bc327bd 100644
> --- a/arch/x86/kernel/resource.c
> +++ b/arch/x86/kernel/resource.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include <linux/ioport.h>
>  #include <asm/e820/api.h>
> +#include <asm/pci_x86.h>
>
>  static void resource_clip(struct resource *res, resource_size_t start,
>                           resource_size_t end)
> @@ -28,6 +29,9 @@ static void remove_e820_regions(struct resource *avail)
>         int i;
>         struct e820_entry *entry;
>
> +       if (!pci_use_e820)
> +               return;
> +
>         for (i = 0; i < e820_table->nr_entries; i++) {
>                 entry = &e820_table->entries[i];
>
> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index 052f1d78a562..e4617df661a9 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -1,4 +1,5 @@
>  // SPDX-License-Identifier: GPL-2.0
> +#include <linux/efi.h>
>  #include <linux/pci.h>
>  #include <linux/acpi.h>
>  #include <linux/init.h>
> @@ -21,6 +22,7 @@ struct pci_root_info {
>
>  static bool pci_use_crs = true;
>  static bool pci_ignore_seg;
> +bool pci_use_e820 = true;
>
>  static int __init set_use_crs(const struct dmi_system_id *id)
>  {
> @@ -291,6 +293,60 @@ static bool resource_is_pcicfg_ioport(struct resource *res)
>                 res->start == 0xCF8 && res->end == 0xCFF;
>  }
>
> +/*
> + * Some fw has a bug where the PCI bridge window returned by the ACPI resources
> + * partly overlaps with some other address range, causing issues. To workaround
> + * this Linux excludes E820 reserved addresses when allocating addresses from
> + * the PCI bridge window. 2 known examples of such firmware bugs are:
> + *
> + * 1. The returned window contains addresses which map to system RAM, see
> + * commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address space").
> + *
> + * 2. The Lenovo X1 carbon gen 2 BIOS has an overlap between an EFI/E820
> + * reserved range and the ACPI provided PCI bridge window:
> + *  efi: mem46: [MMIO] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB)
> + *  BIOS-e820: [mem 0x00000000dceff000-0x00000000dfa0ffff] reserved
> + *  pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
> + * If Linux assigns the overlapping 0xdfa00000-0xdfa0ffff range to a PCI BAR
> + * then the system fails to resume after a suspend.
> + *
> + * Recently (2019) some systems have shown-up with EFI memmap MMIO entries
> + * covering the entire ACPI provided PCI bridge window. These memmap entries
> + * get converted into e820_table entries, causing all attempts to assign
> + * memory to PCI BARs which have not been setup by the BIOS to fail.
> + * For example see these dmesg snippets from a Lenovo IdeaPad 3 15IIL 81WE:
> + *  efi: mem63: [MMIO] range=[0x0000000065400000-0x00000000cfffffff] (1708MB)
> + *  BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
> + *  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
> + *  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
> + *  pci 0000:00:15.0: BAR 0: failed to assign [mem size 0x00001000 64bit]
> + *
> + * To code below checks if the ACPI provided PCI bridge window is fully
> + * contained within in EFI memmap MMIO region and in that case disables
> + * the "exclude E820 reserved addresses" workaround to avoid this issue.
> + */
> +static bool resource_is_efi_mmio_region(const struct resource *res)
> +{
> +       unsigned long long start, end;
> +       efi_memory_desc_t *md;
> +
> +       if (!efi_enabled(EFI_MEMMAP))
> +               return false;
> +
> +       for_each_efi_memory_desc(md) {
> +               if (md->type != EFI_MEMORY_MAPPED_IO)
> +                       continue;
> +
> +               start = md->phys_addr;
> +               end = start + (md->num_pages << EFI_PAGE_SHIFT) - 1;
> +
> +               if (res->start >= start && res->end <= end)
> +                       return true;
> +       }
> +
> +       return false;
> +}
> +
>  static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
>  {
>         struct acpi_device *device = ci->bridge;
> @@ -300,9 +356,16 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
>
>         status = acpi_pci_probe_root_resources(ci);
>         if (pci_use_crs) {
> -               resource_list_for_each_entry_safe(entry, tmp, &ci->resources)
> +               resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
>                         if (resource_is_pcicfg_ioport(entry->res))
>                                 resource_list_destroy_entry(entry);
> +                       if (resource_is_efi_mmio_region(entry->res)) {
> +                               dev_info(&device->dev,
> +                                       "host bridge window %pR is marked by EFI as MMIO\n",
> +                                       entry->res);
> +                               pci_use_e820 = false;
> +                       }
> +               }
>                 return status;
>         }
>
> --
> 2.33.1
>
Mika Westerberg Feb. 17, 2022, 8:47 a.m. UTC | #2
On Wed, Feb 16, 2022 at 04:01:21PM +0100, Hans de Goede wrote:
> Some fw has a bug where the PCI bridge window returned by the ACPI
> resources partly overlaps with some other address range, causing issues.
> To workaround this Linux excludes E820 reserved addresses when allocating
> addresses from the PCI bridge window. 2 known examples of such fw bugs are:
> 
> 1. The returned window contains addresses which map to system RAM,
> see commit 4dc2287c1805 ("x86: avoid E820 regions when allocating
> address space").
> 
> 2. The Lenovo X1 carbon gen 2 BIOS has an overlap between an EFI/E820
> reserved range and the ACPI provided PCI bridge window:
>  efi: mem46: [MMIO] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB)
>  BIOS-e820: [mem 0x00000000dceff000-0x00000000dfa0ffff] reserved
>  pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
> If Linux assigns the overlapping 0xdfa00000-0xdfa0ffff range to a PCI BAR
> then the system fails to resume after a suspend.
> 
> Recently (2019) some systems have shown-up with EFI memmap MMIO entries
> covering the entire ACPI provided PCI bridge window. These memmap entries
> get converted into e820_table entries, causing all attempts to assign
> memory to PCI BARs which have not been setup by the BIOS to fail.
> For example see these dmesg snippets from a Lenovo IdeaPad 3 15IIL 81WE:
>  efi: mem63: [MMIO] range=[0x0000000065400000-0x00000000cfffffff] (1708MB)
>  BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>  pci 0000:00:15.0: BAR 0: failed to assign [mem size 0x00001000 64bit]
> 
> To fix this, check if the ACPI provided PCI bridge window is fully
> contained within in EFI memmap MMIO region and in that case disable
> the "exclude E820 reserved addresses" workaround, fixing the problem
> of not being able to find free space for unassigned BARs.
> 
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2029207
> BugLink: https://bugs.launchpad.net/bugs/1878279
> BugLink: https://bugs.launchpad.net/bugs/1931715
> BugLink: https://bugs.launchpad.net/bugs/1932069
> BugLink: https://bugs.launchpad.net/bugs/1921649
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
diff mbox series

Patch

diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index 490411dba438..997379b022ee 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -64,6 +64,8 @@  void pcibios_scan_specific_bus(int busn);
 
 /* pci-irq.c */
 
+struct pci_dev;
+
 struct irq_info {
 	u8 bus, devfn;			/* Bus, device and function */
 	struct {
@@ -232,3 +234,9 @@  static inline void mmio_config_writel(void __iomem *pos, u32 val)
 # define x86_default_pci_init_irq	NULL
 # define x86_default_pci_fixup_irqs	NULL
 #endif
+
+#if defined CONFIG_PCI && defined CONFIG_ACPI
+extern bool pci_use_e820;
+#else
+#define pci_use_e820 true
+#endif
diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
index 9b9fb7882c20..e8dc9bc327bd 100644
--- a/arch/x86/kernel/resource.c
+++ b/arch/x86/kernel/resource.c
@@ -1,6 +1,7 @@ 
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/ioport.h>
 #include <asm/e820/api.h>
+#include <asm/pci_x86.h>
 
 static void resource_clip(struct resource *res, resource_size_t start,
 			  resource_size_t end)
@@ -28,6 +29,9 @@  static void remove_e820_regions(struct resource *avail)
 	int i;
 	struct e820_entry *entry;
 
+	if (!pci_use_e820)
+		return;
+
 	for (i = 0; i < e820_table->nr_entries; i++) {
 		entry = &e820_table->entries[i];
 
diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 052f1d78a562..e4617df661a9 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -1,4 +1,5 @@ 
 // SPDX-License-Identifier: GPL-2.0
+#include <linux/efi.h>
 #include <linux/pci.h>
 #include <linux/acpi.h>
 #include <linux/init.h>
@@ -21,6 +22,7 @@  struct pci_root_info {
 
 static bool pci_use_crs = true;
 static bool pci_ignore_seg;
+bool pci_use_e820 = true;
 
 static int __init set_use_crs(const struct dmi_system_id *id)
 {
@@ -291,6 +293,60 @@  static bool resource_is_pcicfg_ioport(struct resource *res)
 		res->start == 0xCF8 && res->end == 0xCFF;
 }
 
+/*
+ * Some fw has a bug where the PCI bridge window returned by the ACPI resources
+ * partly overlaps with some other address range, causing issues. To workaround
+ * this Linux excludes E820 reserved addresses when allocating addresses from
+ * the PCI bridge window. 2 known examples of such firmware bugs are:
+ *
+ * 1. The returned window contains addresses which map to system RAM, see
+ * commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address space").
+ *
+ * 2. The Lenovo X1 carbon gen 2 BIOS has an overlap between an EFI/E820
+ * reserved range and the ACPI provided PCI bridge window:
+ *  efi: mem46: [MMIO] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB)
+ *  BIOS-e820: [mem 0x00000000dceff000-0x00000000dfa0ffff] reserved
+ *  pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
+ * If Linux assigns the overlapping 0xdfa00000-0xdfa0ffff range to a PCI BAR
+ * then the system fails to resume after a suspend.
+ *
+ * Recently (2019) some systems have shown-up with EFI memmap MMIO entries
+ * covering the entire ACPI provided PCI bridge window. These memmap entries
+ * get converted into e820_table entries, causing all attempts to assign
+ * memory to PCI BARs which have not been setup by the BIOS to fail.
+ * For example see these dmesg snippets from a Lenovo IdeaPad 3 15IIL 81WE:
+ *  efi: mem63: [MMIO] range=[0x0000000065400000-0x00000000cfffffff] (1708MB)
+ *  BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
+ *  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
+ *  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
+ *  pci 0000:00:15.0: BAR 0: failed to assign [mem size 0x00001000 64bit]
+ *
+ * To code below checks if the ACPI provided PCI bridge window is fully
+ * contained within in EFI memmap MMIO region and in that case disables
+ * the "exclude E820 reserved addresses" workaround to avoid this issue.
+ */
+static bool resource_is_efi_mmio_region(const struct resource *res)
+{
+	unsigned long long start, end;
+	efi_memory_desc_t *md;
+
+	if (!efi_enabled(EFI_MEMMAP))
+		return false;
+
+	for_each_efi_memory_desc(md) {
+		if (md->type != EFI_MEMORY_MAPPED_IO)
+			continue;
+
+		start = md->phys_addr;
+		end = start + (md->num_pages << EFI_PAGE_SHIFT) - 1;
+
+		if (res->start >= start && res->end <= end)
+			return true;
+	}
+
+	return false;
+}
+
 static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
 {
 	struct acpi_device *device = ci->bridge;
@@ -300,9 +356,16 @@  static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
 
 	status = acpi_pci_probe_root_resources(ci);
 	if (pci_use_crs) {
-		resource_list_for_each_entry_safe(entry, tmp, &ci->resources)
+		resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
 			if (resource_is_pcicfg_ioport(entry->res))
 				resource_list_destroy_entry(entry);
+			if (resource_is_efi_mmio_region(entry->res)) {
+				dev_info(&device->dev,
+					"host bridge window %pR is marked by EFI as MMIO\n",
+					entry->res);
+				pci_use_e820 = false;
+			}
+		}
 		return status;
 	}