diff mbox series

hw/p8-i2c: Fix deadlock in p9_i2c_bus_owner_change

Message ID 20171012140159.1808-1-anton@ozlabs.org
State Accepted
Headers show
Series hw/p8-i2c: Fix deadlock in p9_i2c_bus_owner_change | expand

Commit Message

Anton Blanchard Oct. 12, 2017, 2:01 p.m. UTC
From: Anton Blanchard <anton@samba.org>

When debugging a system where Linux was taking soft lockup errors, I
noticed two CPUs were stuck in OPAL:

CPU0
lock
p8_i2c_recover
opal_handle_interrupt

CPU1
sync_timer
cancel_timer
p9_i2c_bus_owner_change
occ_p9_interrupt
xive_source_interrupt
opal_handle_interrupt

p8_i2c_recover() is a timer, and is stuck trying to take master->lock.
p9_i2c_bus_owner_change() has taken master->lock, but then is stuck waiting
for all timers to complete. We deadlock.

Fix this by using cancel_timer_async(), as suggested by Oliver.

Fixes: 201fd50f208d ("hw/p8-i2c: Fix OCC locking")
Signed-off-by: Anton Blanchard <anton@samba.org>
---
 hw/p8-i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Stewart Smith Oct. 16, 2017, 8:03 a.m. UTC | #1
Anton Blanchard <anton@ozlabs.org> writes:
> From: Anton Blanchard <anton@samba.org>
>
> When debugging a system where Linux was taking soft lockup errors, I
> noticed two CPUs were stuck in OPAL:
>
> CPU0
> lock
> p8_i2c_recover
> opal_handle_interrupt
>
> CPU1
> sync_timer
> cancel_timer
> p9_i2c_bus_owner_change
> occ_p9_interrupt
> xive_source_interrupt
> opal_handle_interrupt
>
> p8_i2c_recover() is a timer, and is stuck trying to take master->lock.
> p9_i2c_bus_owner_change() has taken master->lock, but then is stuck waiting
> for all timers to complete. We deadlock.
>
> Fix this by using cancel_timer_async(), as suggested by Oliver.
>
> Fixes: 201fd50f208d ("hw/p8-i2c: Fix OCC locking")
> Signed-off-by: Anton Blanchard <anton@samba.org>

Thanks!

Merged to master as of 4466a6ec89bf39afeb98b92d03bbf7fc30c55afe

Added the Suggested-by for Oliver as the patches were exactly the same.
diff mbox series

Patch

diff --git a/hw/p8-i2c.c b/hw/p8-i2c.c
index ffbace93..0defe7b2 100644
--- a/hw/p8-i2c.c
+++ b/hw/p8-i2c.c
@@ -1307,7 +1307,7 @@  void p9_i2c_bus_owner_change(u32 chip_id)
 			goto done;
 
 		/* clear the existing wait timer */
-		cancel_timer(&master->recovery);
+		cancel_timer_async(&master->recovery);
 
 		/* re-start the request now that we own the master */
 		master->state = state_idle;