Message ID | 20181023173934.GA14918@bhelgaas-glaptop.roam.corp.google.com |
---|---|
State | Not Applicable |
Series | [GIT,PULL] PCI changes for v4.20 |

On Tue, Oct 23, 2018 at 10:39 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> PCI changes:

Pulled,

Linus

* Bjorn Helgaas <helgaas@kernel.org> wrote:

> PCI changes:
>
>   - Pay attention to device-specific _PXM node values (Jonathan Cameron)

There's a new boot regression, my AMD ThreadRipper system (MSI X399 SLI
PLUS (MS-7B09)) hangs during early bootup, and I have bisected it down to
this commit:

  bad7dcd94f39: ACPI/PCI: Pay attention to device-specific _PXM node values

Reverting it solves the hang.

Unfortunately there's no console output when it hangs, even with
earlyprintk. It just hangs after the "loading initrd" line.

Config is an Ubuntu-ish config with PROVE_LOCKING=y and a few other debug
options.

All my other testsystems boot fine with similar configs, so it's probably
something specific to this system.

Thanks,

Ingo

On Tue, Nov 13, 2018 at 08:17:12AM +0100, Ingo Molnar wrote:
>
> * Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> > PCI changes:
> >
> >   - Pay attention to device-specific _PXM node values (Jonathan Cameron)
>
> There's a new boot regression, my AMD ThreadRipper system (MSI X399 SLI
> PLUS (MS-7B09)) hangs during early bootup, and I have bisected it down to
> this commit:
>
>   bad7dcd94f39: ACPI/PCI: Pay attention to device-specific _PXM node values
>
> Reverting it solves the hang.
>
> Unfortunately there's no console output when it hangs, even with
> earlyprintk. It just hangs after the "loading initrd" line.
>
> Config is an Ubuntu-ish config with PROVE_LOCKING=y and a few other debug
> options.
>
> All my other testsystems boot fine with similar configs, so it's probably
> something specific to this system.

Lemme add Tom, he might have an idea.

On 11/13/2018 04:20 AM, Borislav Petkov wrote:
> On Tue, Nov 13, 2018 at 08:17:12AM +0100, Ingo Molnar wrote:
>>
>> * Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>>> PCI changes:
>>>
>>>   - Pay attention to device-specific _PXM node values (Jonathan Cameron)
>>
>> There's a new boot regression, my AMD ThreadRipper system (MSI X399 SLI
>> PLUS (MS-7B09)) hangs during early bootup, and I have bisected it down to
>> this commit:
>>
>>   bad7dcd94f39: ACPI/PCI: Pay attention to device-specific _PXM node values
>>
>> Reverting it solves the hang.
>>
>> Unfortunately there's no console output when it hangs, even with
>> earlyprintk. It just hangs after the "loading initrd" line.
>>
>> Config is an Ubuntu-ish config with PROVE_LOCKING=y and a few other debug
>> options.
>>
>> All my other testsystems boot fine with similar configs, so it's probably
>> something specific to this system.
>
> Lemme add Tom, he might have an idea.

I'm not seeing any issues on my EPYC system. Let me see if I can locate a
Threadripper system to test on.

It seems very strange that the commit in question would cause a hang so
early. Do you have a serial console hooked up for the earlyprintk? Is the
serial port set up in legacy mode (e.g. 0x3f8 as opposed to being an MMIO
device that would require a driver)?

Can you dump the ACPI tables / run them through iasl to see what the _PXM
values are in the DSDT table?

Thanks,
Tom

[+cc Martin, Rafael, Len, linux-acpi]

On Tue, Nov 13, 2018 at 11:20:04AM +0100, Borislav Petkov wrote:
> On Tue, Nov 13, 2018 at 08:17:12AM +0100, Ingo Molnar wrote:
> >
> > * Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > > PCI changes:
> > >
> > >   - Pay attention to device-specific _PXM node values (Jonathan Cameron)
> >
> > There's a new boot regression, my AMD ThreadRipper system (MSI X399 SLI
> > PLUS (MS-7B09)) hangs during early bootup, and I have bisected it down to
> > this commit:
> >
> >   bad7dcd94f39: ACPI/PCI: Pay attention to device-specific _PXM node values
> >
> > Reverting it solves the hang.
> >
> > Unfortunately there's no console output when it hangs, even with
> > earlyprintk. It just hangs after the "loading initrd" line.
> >
> > Config is an Ubuntu-ish config with PROVE_LOCKING=y and a few other debug
> > options.
> >
> > All my other testsystems boot fine with similar configs, so it's probably
> > something specific to this system.

Martin reported the same thing [1] (unfortunately the archive didn't
capture Martin's original emails, I think because they were multi-part
messages with attachments).

Looks like Martin might have a similar system:

  DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./X399 Taichi, BIOS P3.30 08/14/2018
  smpboot: CPU0: AMD Ryzen Threadripper 2950X 16-Core Processor (family: 0x17, model: 0x8, stepping: 0x2)

Given how painful this is to debug, I queued up a revert on my
for-linus branch until we figure out what sanity checks are needed to
make the original patch safe.

I would expect proximity information to be basically just a hint for
optimization, not a functional requirement, so it would be really
interesting to figure out why this causes such a catastrophic failure.
Maybe there's a way to improve that path as well so it would be more
robust or at least more debuggable.

Bjorn

[1] https://lore.kernel.org/linux-pci/20180912152140.3676-2-Jonathan.Cameron@huawei.com

On 11/13/2018 08:41 AM, Lendacky, Thomas wrote:
> On 11/13/2018 04:20 AM, Borislav Petkov wrote:
>> On Tue, Nov 13, 2018 at 08:17:12AM +0100, Ingo Molnar wrote:
>>>
>>> * Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>
>>>> PCI changes:
>>>>
>>>>   - Pay attention to device-specific _PXM node values (Jonathan Cameron)
>>>
>>> There's a new boot regression, my AMD ThreadRipper system (MSI X399 SLI
>>> PLUS (MS-7B09)) hangs during early bootup, and I have bisected it down to
>>> this commit:
>>>
>>>   bad7dcd94f39: ACPI/PCI: Pay attention to device-specific _PXM node values
>>>
>>> Reverting it solves the hang.
>>>
>>> Unfortunately there's no console output when it hangs, even with
>>> earlyprintk. It just hangs after the "loading initrd" line.
>>>
>>> Config is an Ubuntu-ish config with PROVE_LOCKING=y and a few other debug
>>> options.
>>>
>>> All my other testsystems boot fine with similar configs, so it's probably
>>> something specific to this system.
>>
>> Lemme add Tom, he might have an idea.
>
> I'm not seeing any issues on my EPYC system. Let me see if I can locate a
> Threadripper system to test on.

Based upon the link that Bjorn referenced in another email, I was able to
re-create the problem by having my EPYC system return early from
acpi_numa_init() with a -ENOENT (skipping the SRAT table).

This resulted in the following GPF:

[ 11.157840] general protection fault: 0000 [#1] SMP NOPTI
[ 11.158785] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc2-zp-linux #3
[ 11.158785] Hardware name: ******
[ 11.158785] RIP: 0010:get_partial_node.isra.76+0x33/0x2b0
[ 11.158785] Code: 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 c4 80 48 85 f6 48 89 7c 24 30 48 89 54 24 10 89 4c 24 0c 0f 84 d5 00 00 00 <48> 83 7e 08 00 0f 84 ca 00 00 00 48 89 f7 48 89 74 24 38 e8 95 5e
[ 11.158785] RSP: 0018:ffffc900001078b0 EFLAGS: 00010002
[ 11.158785] RAX: 0000000000000000 RBX: 0000000000000202 RCX: 00000000006080c0
[ 11.158785] RDX: ffff889ffdae7150 RSI: 4c7a584873359cf2 RDI: ffff888107c07000
[ 11.158785] RBP: ffffc90000107958 R08: ffff888107c07000 R09: 0000000000000001
[ 11.158785] R10: 00000000006080c0 R11: 0000000000000002 R12: ffff889ffdae7140
[ 11.158785] R13: ffff888107c07000 R14: ffff888107c07000 R15: 0000000000000002
[ 11.158785] FS: 0000000000000000(0000) GS:ffff889ffdac0000(0000) knlGS:0000000000000000
[ 11.158785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 11.158785] CR2: 0000000000000000 CR3: 00008014bc20a000 CR4: 00000000003406e0
[ 11.158785] Call Trace:
[ 11.158785] ? acpi_os_release_object+0xa/0x10
[ 11.158785] ? acpi_ds_result_pop+0xf8/0x10c
[ 11.158785] ? acpi_ds_create_operand+0x227/0x24e
[ 11.158785] ___slab_alloc+0x100/0x540
[ 11.158785] ? acpi_ds_create_operands+0x72/0xd7
[ 11.158785] ? alloc_desc+0x35/0x210
[ 11.158785] ? acpi_ns_check_object_type+0x123/0x1c0
[ 11.158785] ? alloc_desc+0x35/0x210
[ 11.158785] __slab_alloc+0x1c/0x33
[ 11.158785] kmem_cache_alloc_node_trace+0xac/0x210
[ 11.158785] alloc_desc+0x35/0x210
[ 11.158785] __irq_alloc_descs+0x1c4/0x230
[ 11.158785] __irq_domain_alloc_irqs+0x54/0x2e0
[ 11.158785] mp_map_pin_to_irq+0x2cf/0x330
[ 11.158785] acpi_register_gsi_ioapic+0x78/0x170
[ 11.158785] ? mmio_resource_enabled.part.0+0x60/0x60
[ 11.158785] acpi_pci_irq_enable+0xcd/0x280
[ 11.158785] ? mmio_resource_enabled.part.0+0x60/0x60
[ 11.158785] ? mmio_resource_enabled.part.0+0x60/0x60
[ 11.158785] do_pci_enable_device+0x5b/0x100
[ 11.158785] ? pci_bus_read_config_word+0x56/0x70
[ 11.158785] pci_enable_device_flags+0xe0/0x130
[ 11.158785] pci_enable_bridge+0x52/0x90
[ 11.158785] pci_enable_device_flags+0x8c/0x130
[ 11.158785] quirk_usb_early_handoff+0x63/0x6b0
[ 11.158785] ? bus_find_device+0x87/0xd0
[ 11.158785] ? mmio_resource_enabled.part.0+0x60/0x60
[ 11.158785] pci_fixup_device+0xe8/0x1a0
[ 11.158785] pci_apply_final_quirks+0x68/0x127
[ 11.158785] ? pci_proc_init+0x68/0x68
[ 11.158785] do_one_initcall+0x4b/0x1cb
[ 11.158785] ? init_setup+0x1b/0x28
[ 11.158785] kernel_init_freeable+0x1be/0x26b
[ 11.158785] ? loglevel+0x5b/0x5b
[ 11.158785] ? rest_init+0xb0/0xb0
[ 11.158785] kernel_init+0xa/0x110
[ 11.158785] ret_from_fork+0x22/0x40
[ 11.158785] Modules linked in:
[ 11.158785] ---[ end trace ba1c80a146740c8b ]---
[ 11.158785] RIP: 0010:get_partial_node.isra.76+0x33/0x2b0
[ 11.158785] Code: 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 c4 80 48 85 f6 48 89 7c 24 30 48 89 54 24 10 89 4c 24 0c 0f 84 d5 00 00 00 <48> 83 7e 08 00 0f 84 ca 00 00 00 48 89 f7 48 89 74 24 38 e8 95 5e
[ 11.158785] RSP: 0018:ffffc900001078b0 EFLAGS: 00010002
[ 11.158785] RAX: 0000000000000000 RBX: 0000000000000202 RCX: 00000000006080c0
[ 11.158785] RDX: ffff889ffdae7150 RSI: 4c7a584873359cf2 RDI: ffff888107c07000
[ 11.158785] RBP: ffffc90000107958 R08: ffff888107c07000 R09: 0000000000000001
[ 11.158785] R10: 00000000006080c0 R11: 0000000000000002 R12: ffff889ffdae7140
[ 11.158785] R13: ffff888107c07000 R14: ffff888107c07000 R15: 0000000000000002
[ 11.158785] FS: 0000000000000000(0000) GS:ffff889ffdac0000(0000) knlGS:0000000000000000
[ 11.158785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 11.158785] CR2: 0000000000000000 CR3: 00008014bc20a000 CR4: 00000000003406e0
[ 11.158785] Kernel panic - not syncing: Fatal exception
[ 11.158785] ---[ end Kernel panic - not syncing: Fatal exception ]---

In acpi_get_node(), if I replace "return acpi_map_pxm_to_node(pxm);" with
"return acpi_map_pxm_to_online_node(pxm);" then the system successfully
boots. I'm just not sure if that should be the proper approach or if
NUMA_NO_NODE should be returned if the _PXM value is outside the defined
entries.

I was also able to trigger this GPF by returning a bogus _PXM value on
the EPYC system that had a valid SRAT table. So it definitely would be
worth validating the PXM value before returning it.

Thanks,
Tom

>
> It seems very strange that the commit in question would cause a hang so
> early. Do you have a serial console hooked up for the earlyprintk? Is the
> serial port set up in legacy mode (e.g. 0x3f8 as opposed to being an MMIO
> device that would require a driver)?
>
> Can you dump the ACPI tables / run them through iasl to see what the _PXM
> values are in the DSDT table?
>
> Thanks,
> Tom

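Tom's second option (return NUMA_NO_NODE when a _PXM value does not correspond to a usable node) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's code and not the fix that was eventually applied: the tables, sizes, and every `model_*` name are hypothetical stand-ins for the SRAT-derived mapping and the online-node mask. Only the value of NUMA_NO_NODE (-1) mirrors the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE    (-1)  /* same value the kernel uses */
#define MODEL_MAX_PXM   8     /* hypothetical table size for this model */
#define MODEL_MAX_NODES 4     /* hypothetical node count for this model */

/* Hypothetical PXM-to-node table: PXM 0..2 are mapped, the rest are not. */
static const int model_pxm_to_node[MODEL_MAX_PXM] = {
    0, 1, 2, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

/* Hypothetical online mask: node 2 is mapped but has never come online. */
static const bool model_node_online[MODEL_MAX_NODES] = {
    true, true, false, false
};

/* Unchecked lookup: hands back whatever the table holds. */
int model_map_pxm_to_node(int pxm)
{
    if (pxm < 0 || pxm >= MODEL_MAX_PXM)
        return NUMA_NO_NODE;
    return model_pxm_to_node[pxm];
}

/*
 * Checked lookup: refuse any node that is out of range or not online,
 * so callers fall back to "no preference" instead of allocating
 * against a node whose data structures were never set up.
 */
int model_get_node_checked(int pxm)
{
    int node = model_map_pxm_to_node(pxm);

    if (node < 0 || node >= MODEL_MAX_NODES || !model_node_online[node])
        return NUMA_NO_NODE;
    return node;
}

/* Small self-check showing the difference between the two lookups. */
void model_self_check(void)
{
    assert(model_map_pxm_to_node(2) == 2);               /* unchecked: offline node leaks out */
    assert(model_get_node_checked(2) == NUMA_NO_NODE);   /* checked: rejected */
    assert(model_get_node_checked(1) == 1);              /* valid, online node passes through */
}
```

The other option Tom mentions, acpi_map_pxm_to_online_node(), differs in that it returns an online node rather than NUMA_NO_NODE; the model above does not attempt to reproduce that behaviour.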

* Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+cc Martin, Rafael, Len, linux-acpi]
>
> On Tue, Nov 13, 2018 at 11:20:04AM +0100, Borislav Petkov wrote:
> > On Tue, Nov 13, 2018 at 08:17:12AM +0100, Ingo Molnar wrote:
> > >
> > > * Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > > > PCI changes:
> > > >
> > > >   - Pay attention to device-specific _PXM node values (Jonathan Cameron)
> > >
> > > There's a new boot regression, my AMD ThreadRipper system (MSI X399 SLI
> > > PLUS (MS-7B09)) hangs during early bootup, and I have bisected it down to
> > > this commit:
> > >
> > >   bad7dcd94f39: ACPI/PCI: Pay attention to device-specific _PXM node values
> > >
> > > Reverting it solves the hang.
> > >
> > > Unfortunately there's no console output when it hangs, even with
> > > earlyprintk. It just hangs after the "loading initrd" line.
> > >
> > > Config is an Ubuntu-ish config with PROVE_LOCKING=y and a few other debug
> > > options.
> > >
> > > All my other testsystems boot fine with similar configs, so it's probably
> > > something specific to this system.
>
> Martin reported the same thing [1] (unfortunately the archive didn't
> capture Martin's original emails, I think because they were multi-part
> messages with attachments).
>
> Looks like Martin might have a similar system:
>
>   DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./X399 Taichi, BIOS P3.30 08/14/2018
>   smpboot: CPU0: AMD Ryzen Threadripper 2950X 16-Core Processor (family: 0x17, model: 0x8, stepping: 0x2)
>
> Given how painful this is to debug, I queued up a revert on my
> for-linus branch until we figure out what sanity checks are needed to
> make the original patch safe.

Thanks!

Took me about a day to bisect this, on this hard to bisect machine. :-/

> I would expect proximity information to be basically just a hint for
> optimization, not a functional requirement, so it would be really
> interesting to figure out why this causes such a catastrophic failure.
> Maybe there's a way to improve that path as well so it would be more
> robust or at least more debuggable.

Yeah.

Thanks,

Ingo