Message ID | 20230201043018.778499-2-chenhuacai@loongson.cn |
---|---|
State | New |
Series | PCI: Resolve Loongson's LS7A PCI problems |

On Wed, Feb 01, 2023 at 12:30:17PM +0800, Huacai Chen wrote:
> This patch has a long history.
>
> After cc27b735ad3a7557 ("PCI/portdrv: Turn off PCIe services during
> shutdown") we observe poweroff/reboot failures on systems with the
> LS7A chipset.
>
> We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER" in
> do_pci_disable_device(), poweroff/reboot works. The hardware engineer
> says that the root cause is that the CPU is still accessing PCIe
> devices during poweroff/reboot, and if we clear the Bus Master bit at
> this time, the PCIe controller doesn't forward requests to downstream
> devices and also does not send a TIMEOUT to the CPU, which causes the
> CPU to wait forever (hardware deadlock).
>
> To be clear, the sequence is like this:
>
>   - CPU issues MMIO read to device below Root Port
>
>   - LS7A Root Port fails to forward transaction to secondary bus
>     because of LS7A Bus Master defect
>
>   - CPU hangs waiting for response to MMIO read
>
> Then how is userspace able to use a device after the device is removed?
>
> To give more details, let's take the graphics driver (e.g. amdgpu) as
> an example. Userspace programs call printf() to display "shutting
> down xxx service" during shutdown/reboot, or the kernel calls printk()
> to display something during shutdown/reboot. These can happen at any
> time, even after we call pcie_port_device_remove() to disable the PCIe
> port on the graphics card.
>
> The call stack is: printk() --> call_console_drivers() --> con->write()
> --> vt_console_print() --> fbcon_putcs()
>
> This scenario happens because userspace programs (or the kernel itself)
> don't know whether a device is 'usable'; they just use it, at any time.
>
> This hardware behavior is a PCIe protocol violation (Bus Master should
> not be involved in CPU MMIO transactions), and it will be fixed in new
> revisions of hardware (add timeout mechanism for CPU read request,
> whether or not Bus Master bit is cleared).
>
> On some x86 platforms, radeon/amdgpu devices can cause similar problems
> [1][2].
>
> I previously added a quirk to solve the LS7A problem, but it looked
> ugly. After lengthy discussion, Bjorn Helgaas suggested simply removing
> the pci_disable_device() call from pcie_portdrv_shutdown(), and this
> patch does exactly that.
>
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
> [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/pcie/portdrv.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 2cc2e60bcb39..46fad0d813b2 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -501,7 +501,6 @@ static void pcie_port_device_remove(struct pci_dev *dev)
>  {
>  	device_for_each_child(&dev->dev, NULL, remove_iter);
>  	pci_free_irq_vectors(dev);
> -	pci_disable_device(dev);
>  }
>
>  /**
> @@ -727,6 +726,19 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
>  	}
>
>  	pcie_port_device_remove(dev);
> +
> +	pci_disable_device(dev);
> +}
> +
> +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> +{
> +	if (pci_bridge_d3_possible(dev)) {
> +		pm_runtime_forbid(&dev->dev);
> +		pm_runtime_get_noresume(&dev->dev);
> +		pm_runtime_dont_use_autosuspend(&dev->dev);
> +	}
> +
> +	pcie_port_device_remove(dev);

Thanks! I guess you verified that this actually *does* call all the
port service .remove() methods, right? aer_remove(), dpc_remove(),
etc?

I *assume* that happens via the device_unregister() done in
remove_iter(), but there's a LOT of code in the middle.

>  }
>
>  static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
> @@ -777,7 +789,7 @@ static struct pci_driver pcie_portdriver = {
>
>  	.probe		= pcie_portdrv_probe,
>  	.remove		= pcie_portdrv_remove,
> -	.shutdown	= pcie_portdrv_remove,
> +	.shutdown	= pcie_portdrv_shutdown,
>
>  	.err_handler	= &pcie_portdrv_err_handler,
>
> --
> 2.39.0
>

On Thu, Feb 2, 2023 at 2:17 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Wed, Feb 01, 2023 at 12:30:17PM +0800, Huacai Chen wrote:
> > +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> > +{
> > +	if (pci_bridge_d3_possible(dev)) {
> > +		pm_runtime_forbid(&dev->dev);
> > +		pm_runtime_get_noresume(&dev->dev);
> > +		pm_runtime_dont_use_autosuspend(&dev->dev);
> > +	}
> > +
> > +	pcie_port_device_remove(dev);
>
> Thanks! I guess you verified that this actually *does* call all the
> port service .remove() methods, right? aer_remove(), dpc_remove(),
> etc?

I have tested it, but aer_probe() and dpc_probe() don't get called at
boot, and neither do aer_remove() and dpc_remove() at poweroff. I
haven't found the root cause yet, but I will continue to investigate.

Huacai

>
> I *assume* that happens via the device_unregister() done in
> remove_iter(), but there's a LOT of code in the middle.

On Thu, Feb 02, 2023 at 09:27:03PM +0800, Huacai Chen wrote:
> On Thu, Feb 2, 2023 at 2:17 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Feb 01, 2023 at 12:30:17PM +0800, Huacai Chen wrote:
> > > +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> > > +{
> > > +	if (pci_bridge_d3_possible(dev)) {
> > > +		pm_runtime_forbid(&dev->dev);
> > > +		pm_runtime_get_noresume(&dev->dev);
> > > +		pm_runtime_dont_use_autosuspend(&dev->dev);
> > > +	}
> > > +
> > > +	pcie_port_device_remove(dev);
> >
> > Thanks! I guess you verified that this actually *does* call all the
> > port service .remove() methods, right? aer_remove(), dpc_remove(),
> > etc?
>
> I have tested it, but aer_probe() and dpc_probe() don't get called at
> boot, and neither do aer_remove() and dpc_remove() at poweroff. I
> haven't found the root cause yet, but I will continue to investigate.

We'll only call aer_probe() and dpc_probe() if the port supports those
services and the platform has granted us control of them. I don't
know if your platform does. It may support PCIe native hotplug
(pcie_hp_init()) or PME (pcie_pme_init()).

Bjorn

Hi, Bjorn,

On Fri, Feb 3, 2023 at 4:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Feb 02, 2023 at 09:27:03PM +0800, Huacai Chen wrote:
> > On Thu, Feb 2, 2023 at 2:17 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, Feb 01, 2023 at 12:30:17PM +0800, Huacai Chen wrote:
>
> > > > +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> > > > +{
> > > > +	if (pci_bridge_d3_possible(dev)) {
> > > > +		pm_runtime_forbid(&dev->dev);
> > > > +		pm_runtime_get_noresume(&dev->dev);
> > > > +		pm_runtime_dont_use_autosuspend(&dev->dev);
> > > > +	}
> > > > +
> > > > +	pcie_port_device_remove(dev);
> > >
> > > Thanks! I guess you verified that this actually *does* call all the
> > > port service .remove() methods, right? aer_remove(), dpc_remove(),
> > > etc?
> >
> > I have tested it, but aer_probe() and dpc_probe() don't get called at
> > boot, and neither do aer_remove() and dpc_remove() at poweroff. I
> > haven't found the root cause yet, but I will continue to investigate.
>
> We'll only call aer_probe() and dpc_probe() if the port supports those
> services and the platform has granted us control of them. I don't
> know if your platform does. It may support PCIe native hotplug
> (pcie_hp_init()) or PME (pcie_pme_init()).

When I boot the kernel with pcie_ports=native, I verified that
aer_remove() and pcie_pme_remove() are both called; DPC and hotplug
are not supported.

Huacai

>
> Bjorn

On Fri, Feb 03, 2023 at 12:00:37PM +0800, Huacai Chen wrote:
> Hi, Bjorn,
>
> On Fri, Feb 3, 2023 at 4:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Thu, Feb 02, 2023 at 09:27:03PM +0800, Huacai Chen wrote:
> > > On Thu, Feb 2, 2023 at 2:17 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Wed, Feb 01, 2023 at 12:30:17PM +0800, Huacai Chen wrote:
> >
> > > > > +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> > > > > +{
> > > > > +	if (pci_bridge_d3_possible(dev)) {
> > > > > +		pm_runtime_forbid(&dev->dev);
> > > > > +		pm_runtime_get_noresume(&dev->dev);
> > > > > +		pm_runtime_dont_use_autosuspend(&dev->dev);
> > > > > +	}
> > > > > +
> > > > > +	pcie_port_device_remove(dev);
> > > >
> > > > Thanks! I guess you verified that this actually *does* call all the
> > > > port service .remove() methods, right? aer_remove(), dpc_remove(),
> > > > etc?
> > >
> > > I have tested it, but aer_probe() and dpc_probe() don't get called at
> > > boot, and neither do aer_remove() and dpc_remove() at poweroff. I
> > > haven't found the root cause yet, but I will continue to investigate.
> >
> > We'll only call aer_probe() and dpc_probe() if the port supports those
> > services and the platform has granted us control of them. I don't
> > know if your platform does. It may support PCIe native hotplug
> > (pcie_hp_init()) or PME (pcie_pme_init()).
>
> When I boot the kernel with pcie_ports=native, I verified that
> aer_remove() and pcie_pme_remove() are both called; DPC and hotplug
> are not supported.

Great, thank you!

diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 2cc2e60bcb39..46fad0d813b2 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -501,7 +501,6 @@ static void pcie_port_device_remove(struct pci_dev *dev)
 {
 	device_for_each_child(&dev->dev, NULL, remove_iter);
 	pci_free_irq_vectors(dev);
-	pci_disable_device(dev);
 }
 
 /**
@@ -727,6 +726,19 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
 	}
 
 	pcie_port_device_remove(dev);
+
+	pci_disable_device(dev);
+}
+
+static void pcie_portdrv_shutdown(struct pci_dev *dev)
+{
+	if (pci_bridge_d3_possible(dev)) {
+		pm_runtime_forbid(&dev->dev);
+		pm_runtime_get_noresume(&dev->dev);
+		pm_runtime_dont_use_autosuspend(&dev->dev);
+	}
+
+	pcie_port_device_remove(dev);
 }
 
 static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
@@ -777,7 +789,7 @@ static struct pci_driver pcie_portdriver = {
 
 	.probe		= pcie_portdrv_probe,
 	.remove		= pcie_portdrv_remove,
-	.shutdown	= pcie_portdrv_remove,
+	.shutdown	= pcie_portdrv_shutdown,
 
 	.err_handler	= &pcie_portdrv_err_handler,
 

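
The split between the .remove and .shutdown paths in the diff above can be modelled in plain userspace C. This is only an illustrative sketch, not kernel code: struct mock_port and every mock_* function are hypothetical stand-ins for struct pci_dev and the portdrv functions, and the runtime-PM calls made under pci_bridge_d3_possible() are left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; tracks only what the model needs. */
struct mock_port {
	int services_removed;	/* times the port services were torn down */
	bool irq_vectors_freed;	/* set once the IRQ vectors are released */
	bool enabled;		/* cleared by the mock "disable device" step */
};

/*
 * Shared teardown, modelled on pcie_port_device_remove() after the patch:
 * services and IRQ vectors go away, but the device is NOT disabled here.
 */
static void mock_port_device_remove(struct mock_port *p)
{
	assert(p != NULL);
	p->services_removed++;
	p->irq_vectors_freed = true;
}

/* .remove path: shared teardown, then disable the device. */
static void mock_portdrv_remove(struct mock_port *p)
{
	mock_port_device_remove(p);
	p->enabled = false;	/* stands in for pci_disable_device() */
}

/* .shutdown path: shared teardown only; the device stays enabled. */
static void mock_portdrv_shutdown(struct mock_port *p)
{
	mock_port_device_remove(p);
}
```

Calling mock_portdrv_remove() and mock_portdrv_shutdown() on two freshly initialised ports leaves both with their services removed, but only the port that went through the remove path ends up disabled. That difference is the whole point of the patch: shutdown no longer clears Bus Master on the port.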
This patch has a long history.

After cc27b735ad3a7557 ("PCI/portdrv: Turn off PCIe services during
shutdown") we observe poweroff/reboot failures on systems with the
LS7A chipset.

We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER" in
do_pci_disable_device(), poweroff/reboot works. The hardware engineer
says that the root cause is that the CPU is still accessing PCIe
devices during poweroff/reboot, and if we clear the Bus Master bit at
this time, the PCIe controller doesn't forward requests to downstream
devices and also does not send a TIMEOUT to the CPU, which causes the
CPU to wait forever (hardware deadlock).

To be clear, the sequence is like this:

  - CPU issues MMIO read to device below Root Port

  - LS7A Root Port fails to forward transaction to secondary bus
    because of LS7A Bus Master defect

  - CPU hangs waiting for response to MMIO read

Then how is userspace able to use a device after the device is removed?

To give more details, let's take the graphics driver (e.g. amdgpu) as
an example. Userspace programs call printf() to display "shutting
down xxx service" during shutdown/reboot, or the kernel calls printk()
to display something during shutdown/reboot. These can happen at any
time, even after we call pcie_port_device_remove() to disable the PCIe
port on the graphics card.

The call stack is: printk() --> call_console_drivers() --> con->write()
--> vt_console_print() --> fbcon_putcs()

This scenario happens because userspace programs (or the kernel itself)
don't know whether a device is 'usable'; they just use it, at any time.

This hardware behavior is a PCIe protocol violation (Bus Master should
not be involved in CPU MMIO transactions), and it will be fixed in new
revisions of hardware (add timeout mechanism for CPU read request,
whether or not Bus Master bit is cleared).

On some x86 platforms, radeon/amdgpu devices can cause similar problems
[1][2].

I previously added a quirk to solve the LS7A problem, but it looked
ugly. After lengthy discussion, Bjorn Helgaas suggested simply removing
the pci_disable_device() call from pcie_portdrv_shutdown(), and this
patch does exactly that.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
[2] https://bugs.freedesktop.org/show_bug.cgi?id=98638

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/pcie/portdrv.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
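
The failure sequence described in the commit message can be sketched as a small userspace model. This is a rough illustration under stated assumptions, not a description of the real hardware or kernel code: PCI_COMMAND_MASTER uses the standard Bus Master Enable bit value (0x4), while clear_bus_master(), root_port_mmio_read(), the has_ls7a_defect flag and the mmio_result enum are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4	/* Bus Master Enable bit in the Command register */

/* Hypothetical outcome of a CPU MMIO read routed through a Root Port. */
enum mmio_result {
	MMIO_FORWARDED,		/* request reaches the downstream device */
	MMIO_NO_RESPONSE	/* not forwarded, no timeout: the CPU waits forever */
};

/* Clears Bus Master Enable, like the line quoted from do_pci_disable_device(). */
static uint16_t clear_bus_master(uint16_t pci_command)
{
	return (uint16_t)(pci_command & ~PCI_COMMAND_MASTER);
}

/*
 * Hypothetical model of the Root Port. On a compliant port, Bus Master
 * plays no part in CPU-initiated MMIO, so the read is always forwarded.
 * With the LS7A defect from the commit message, a cleared Bus Master bit
 * makes the port drop the request without signalling a timeout.
 */
static enum mmio_result root_port_mmio_read(uint16_t bridge_command,
					    bool has_ls7a_defect)
{
	bool bus_master = (bridge_command & PCI_COMMAND_MASTER) != 0;

	if (has_ls7a_defect && !bus_master)
		return MMIO_NO_RESPONSE;
	return MMIO_FORWARDED;
}
```

In this model, a console write that reaches the GPU after the port's Bus Master bit has been cleared returns MMIO_NO_RESPONSE only when has_ls7a_defect is set, which corresponds to the hang at poweroff/reboot. Leaving the bit set during shutdown, as the patch does, keeps the read on the MMIO_FORWARDED path.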