[net,v1] lan743x: prevent entire kernel HANG on open, for some platforms

Message ID: 20201112204741.12375-1-TheSven73@gmail.com
State: Superseded

Commit Message

Sven Van Asbroeck Nov. 12, 2020, 8:47 p.m. UTC
From: Sven Van Asbroeck <thesven73@gmail.com>

On arm imx6, when opening the chip's netdev, the whole Linux
kernel intermittently hangs/freezes.

This is caused by a bug in the driver code which tests whether PCIe
interrupts are working correctly, using the software interrupt:

1. open: enable the software interrupt
2. open: tell the chip to assert the software interrupt
3. open: wait for flag
4. ISR: acknowledge s/w interrupt, set flag
5. open: notice flag, disable the s/w interrupt, continue

Unfortunately the ISR only acknowledges the s/w interrupt, but
does not disable it. This will re-trigger the ISR in a tight
loop.

On some (lucky) platforms, open proceeds to disable the s/w
interrupt even while the ISR is 'spinning'. On arm imx6,
the spinning ISR does not allow open to proceed, resulting
in a hung Linux kernel.

Fix minimally by disabling the s/w interrupt in the ISR, which
will prevent it from spinning. This won't break anything because
the s/w interrupt is used as a one-shot interrupt.
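
The difference between the two ISR variants can be illustrated with a
small userspace toy model. This is only a sketch and not driver code:
struct toy_chip, the toy_* functions, the bit value and the dispatch cap
are all hypothetical. It also assumes that the request stays pending
after the acknowledge, as the re-triggering described above implies;
the real register semantics of the chip are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INT_BIT_SW_GP 0x1u /* hypothetical bit value, not the real one */
#define TOY_MAX_DISPATCH  1000 /* cap so the ack-only case terminates */

/* Hypothetical stand-in for the chip's interrupt state. */
struct toy_chip {
	uint32_t int_en;       /* enabled interrupt sources */
	uint32_t int_pending;  /* sources currently requesting service */
	int software_isr_flag; /* flag polled by open */
};

/* The interrupt line is asserted while an enabled source is pending. */
static bool toy_line_asserted(const struct toy_chip *c)
{
	return (c->int_pending & c->int_en) != 0;
}

/* Old behaviour: acknowledge and set the flag, leave the source enabled.
 * ASSUMPTION: the acknowledge does not clear the pending request here.
 */
static void toy_isr_ack_only(struct toy_chip *c)
{
	c->software_isr_flag = 1;
}

/* Patched behaviour: disable the source, then set the flag. */
static void toy_isr_disable(struct toy_chip *c)
{
	c->int_en &= ~TOY_INT_BIT_SW_GP;
	c->software_isr_flag = 1;
}

/* Runs steps 1 and 2, then dispatches the ISR for as long as the line
 * stays asserted. Returns the number of ISR invocations.
 */
static int toy_run_test(void (*isr)(struct toy_chip *))
{
	struct toy_chip c = { 0 };
	int calls = 0;

	assert(isr != NULL);

	c.int_en |= TOY_INT_BIT_SW_GP;      /* step 1: enable */
	c.int_pending |= TOY_INT_BIT_SW_GP; /* step 2: assert */

	while (toy_line_asserted(&c) && calls < TOY_MAX_DISPATCH) {
		isr(&c);
		calls++;
	}
	return calls;
}
```

In this model, toy_run_test(toy_isr_ack_only) only stops at the
TOY_MAX_DISPATCH cap, while toy_run_test(toy_isr_disable) returns after
a single ISR invocation, which is the one-shot behaviour the fix relies
on.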

Note that this is a minimal fix, overlooking many possible
cleanups, e.g.:
- lan743x_intr_software_isr() is completely redundant and reads
  INT_STS twice for no apparent reason
- disabling the s/w interrupt in lan743x_intr_test_isr() is now
  redundant, but harmless
- waiting on software_isr_flag can be converted from a sleeping
  poll loop to wait_event_timeout()

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # arm imx6 lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git # edbc21113bde

To: Jakub Kicinski <kuba@kernel.org>
To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: "David S. Miller" <davem@davemloft.net>
Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

 drivers/net/ethernet/microchip/lan743x_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Nov. 14, 2020, 11:28 p.m. UTC | #1
On Thu, 12 Nov 2020 15:47:41 -0500 Sven Van Asbroeck wrote:
> From: Sven Van Asbroeck <thesven73@gmail.com>
> 
> On arm imx6, when opening the chip's netdev, the whole Linux
> kernel intermittently hangs/freezes.
> 
> This is caused by a bug in the driver code which tests whether PCIe
> interrupts are working correctly, using the software interrupt:
> 
> 1. open: enable the software interrupt
> 2. open: tell the chip to assert the software interrupt
> 3. open: wait for flag
> 4. ISR: acknowledge s/w interrupt, set flag
> 5. open: notice flag, disable the s/w interrupt, continue
> 
> Unfortunately the ISR only acknowledges the s/w interrupt, but
> does not disable it. This will re-trigger the ISR in a tight
> loop.
> 
> On some (lucky) platforms, open proceeds to disable the s/w
> interrupt even while the ISR is 'spinning'. On arm imx6,
> the spinning ISR does not allow open to proceed, resulting
> in a hung Linux kernel.
> 
> Fix minimally by disabling the s/w interrupt in the ISR, which
> will prevent it from spinning. This won't break anything because
> the s/w interrupt is used as a one-shot interrupt.
> 
> Note that this is a minimal fix, overlooking many possible
> cleanups, e.g.:
> - lan743x_intr_software_isr() is completely redundant and reads
>   INT_STS twice for no apparent reason
> - disabling the s/w interrupt in lan743x_intr_test_isr() is now
>   redundant, but harmless
> - waiting on software_isr_flag can be converted from a sleeping
>   poll loop to wait_event_timeout()
> 
> Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
> Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # arm imx6 lan7430
> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>

Applied, thank you!
Patch

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 065e10bc98f2..f4a09e0f3ec5 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -148,7 +148,8 @@  static void lan743x_intr_software_isr(void *context)
 
 	int_sts = lan743x_csr_read(adapter, INT_STS);
 	if (int_sts & INT_BIT_SW_GP_) {
-		lan743x_csr_write(adapter, INT_STS, INT_BIT_SW_GP_);
+		/* disable the interrupt to prevent repeated re-triggering */
+		lan743x_csr_write(adapter, INT_EN_CLR, INT_BIT_SW_GP_);
 		intr->software_isr_flag = 1;
 	}
 }