
[PATCHv2 3/5] PCI/ERR: Retain status from error notification

Message ID: 20210104230300.1277180-4-kbusch@kernel.org
State: New
Series: aer handling fixups

Commit Message

Keith Busch Jan. 4, 2021, 11:02 p.m. UTC
Overwriting the status collected from the frozen error_detected
notifications with the result of the link reset loses the NEED_RESET
result that drivers depend on to have their .slot_reset() callbacks
invoked. Retain this status so that subsequent error handling follows
the correct flow.

Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/pci/pcie/err.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
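
For context, the pci_ers_result_t value a driver returns from its ->error_detected()
callback is what steers the rest of the recovery: PCI_ERS_RESULT_NEED_RESET asks the
PCI core to broadcast .slot_reset() (and then .resume()) once the link has been
recovered. The possible answers look roughly like this (simplified from
include/linux/pci.h; the comments are paraphrased, not the kernel's own):

enum pci_ers_result {
        PCI_ERS_RESULT_NONE,            /* no result / no opinion */
        PCI_ERS_RESULT_CAN_RECOVER,     /* can recover without a reset */
        PCI_ERS_RESULT_NEED_RESET,      /* wants .slot_reset() to be called */
        PCI_ERS_RESULT_DISCONNECT,      /* device is lost, disconnect it */
        PCI_ERS_RESULT_RECOVERED,       /* device has recovered */
};

It is this NEED_RESET answer, merged into status by the report_frozen_detected
walk shown in the hunk below, that the old code then threw away.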

Comments

Williams, Dan J March 3, 2021, 5:34 a.m. UTC | #1
[ Add Sathya ]

On Mon, 2021-01-04 at 15:02 -0800, Keith Busch wrote:
> Overwriting the frozen detected status with the result of the link reset
> loses the NEED_RESET result that drivers are depending on for error
> handling to report the .slot_reset() callback. Retain this status so
> that subsequent error handling has the correct flow.
> 
> Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu>
> Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Just want to report that this fix might be a candidate for -stable.
Recent DPC error recovery testing on v5.10 shows that losing this
status prevents NVMe from restarting the queue after error recovery
with a failing signature like:

[  424.685179] pcie_do_recovery: pcieport 0000:97:02.0: AER: broadcast error_detected message
[  424.685185] nvme nvme0: frozen state error detected, reset controller
[  425.078620] pcie_do_recovery: pcieport 0000:97:02.0: AER: broadcast resume message
[  425.078674] pcieport 0000:97:02.0: AER: device recovery successful

...and then later:

[  751.146277] sysrq: Show Blocked State
[  751.146431] task:touch           state:D stack:    0 pid:11969 ppid: 11413 flags:0x00000000
[  751.146434] Call Trace:
[  751.146443]  __schedule+0x31b/0x890
[  751.146445]  schedule+0x3c/0xa0
[  751.146449]  blk_queue_enter+0x151/0x220
[  751.146454]  ? finish_wait+0x80/0x80
[  751.146455]  submit_bio_noacct+0x39a/0x410
[  751.146457]  submit_bio+0x42/0x150
[  751.146460]  ? bio_add_page+0x62/0x90
[  751.146461]  ? guard_bio_eod+0x25/0x70
[  751.146465]  submit_bh_wbc+0x16d/0x190
[  751.146469]  ext4_read_bh+0x47/0xa0
[  751.146472]  ext4_read_inode_bitmap+0x3cd/0x5f0

...because NVMe was only told to disable the controller, never to reset
it and resume the block queue.
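
For reference, the NVMe PCI driver hangs its recovery off exactly that slot_reset
step. A simplified, illustrative skeleton of its pci_error_handlers (not the
verbatim drivers/nvme/host/pci.c code) shows why the queues stay frozen when the
core stops after error_detected:

#include <linux/pci.h>

/* Illustrative skeleton only -- the real handlers live in drivers/nvme/host/pci.c */
static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
                                            pci_channel_state_t state)
{
        /*
         * frozen: I/O is quiesced and the controller disabled here; the
         * driver then asks the core for a reset.  Nothing restarts the
         * queues yet -- that is the job of .slot_reset()/.resume().
         */
        return PCI_ERS_RESULT_NEED_RESET;
}

static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
{
        /*
         * Link is usable again: schedule a controller reset, which
         * re-enables the device and unfreezes the block queues.
         */
        return PCI_ERS_RESULT_RECOVERED;
}

static void nvme_error_resume(struct pci_dev *pdev)
{
        /* recovery finished; normal I/O may continue */
}

static const struct pci_error_handlers nvme_err_handler = {
        .error_detected = nvme_error_detected,
        .slot_reset     = nvme_slot_reset,
        .resume         = nvme_error_resume,
};

If pcie_do_recovery() drops the NEED_RESET status, nvme_slot_reset() is never
called and the frozen queues block forever, which is the hang captured above.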

I have not tested it, but by code inspection I eventually found this
upstream fix.

Kuppuswamy, Sathyanarayanan March 3, 2021, 5:46 a.m. UTC | #2
On 3/2/21 9:34 PM, Williams, Dan J wrote:
> [ Add Sathya ]
>
> On Mon, 2021-01-04 at 15:02 -0800, Keith Busch wrote:
>> Overwriting the frozen detected status with the result of the link reset
>> loses the NEED_RESET result that drivers are depending on for error
>> handling to report the .slot_reset() callback. Retain this status so
>> that subsequent error handling has the correct flow.
>>
>> Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu>
>> Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
>> Signed-off-by: Keith Busch <kbusch@kernel.org>
> Just want to report that this fix might be a candidate for -stable.
Agree.

I think it can be merged into both the stable and mainline kernels.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Recent DPC error recovery testing on v5.10 shows that losing this
> status prevents NVME from restarting the queue after error recovery
> with a failing signature like:
>
> [  424.685179] pcie_do_recovery: pcieport 0000:97:02.0: AER: broadcast error_detected message
> [  424.685185] nvme nvme0: frozen state error detected, reset controller
> [  425.078620] pcie_do_recovery: pcieport 0000:97:02.0: AER: broadcast resume message
> [  425.078674] pcieport 0000:97:02.0: AER: device recovery successful
>
> ...and then later:
>
> [  751.146277] sysrq: Show Blocked State
> [  751.146431] task:touch           state:D stack:    0 pid:11969 ppid: 11413 flags:0x00000000
> [  751.146434] Call Trace:
> [  751.146443]  __schedule+0x31b/0x890
> [  751.146445]  schedule+0x3c/0xa0
> [  751.146449]  blk_queue_enter+0x151/0x220
> [  751.146454]  ? finish_wait+0x80/0x80
> [  751.146455]  submit_bio_noacct+0x39a/0x410
> [  751.146457]  submit_bio+0x42/0x150
> [  751.146460]  ? bio_add_page+0x62/0x90
> [  751.146461]  ? guard_bio_eod+0x25/0x70
> [  751.146465]  submit_bh_wbc+0x16d/0x190
> [  751.146469]  ext4_read_bh+0x47/0xa0
> [  751.146472]  ext4_read_inode_bitmap+0x3cd/0x5f0
>
> ...because NVME was only told to disable, never to reset and resume the
> block queue.
>
> I have not tested it, but by code inspection I eventually found this
> upstream fix.

Patch

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index a84f0bf4c1e2..b576aa890c76 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -198,8 +198,7 @@  pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	pci_dbg(bridge, "broadcast error_detected message\n");
 	if (state == pci_channel_io_frozen) {
 		pci_walk_bridge(bridge, report_frozen_detected, &status);
-		status = reset_subordinates(bridge);
-		if (status != PCI_ERS_RESULT_RECOVERED) {
+		if (reset_subordinates(bridge) != PCI_ERS_RESULT_RECOVERED) {
 			pci_warn(bridge, "subordinate device reset failed\n");
 			goto failed;
 		}
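
With the overwrite removed, status still holds the NEED_RESET answer collected
from the frozen error_detected broadcast, so the later stage of pcie_do_recovery()
can actually reach the slot_reset broadcast. Roughly (a simplified excerpt of the
follow-on code in err.c, not part of this hunk):

        if (status == PCI_ERS_RESULT_NEED_RESET) {
                /*
                 * Only reached when the drivers' NEED_RESET vote survives;
                 * this walk ends up calling the drivers' .slot_reset()
                 * callbacks, e.g. nvme_slot_reset() in the sketch above.
                 */
                status = PCI_ERS_RESULT_RECOVERED;
                pci_dbg(bridge, "broadcast slot_reset message\n");
                pci_walk_bridge(bridge, report_slot_reset, &status);
        }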