[2/2] core/lock: don't set bust_locks on lock error

Message ID 20180930034821.13788-3-npiggin@gmail.com
State Accepted
Series fix lock debugging corruptions


Context Check Description
snowpatch_ozlabs/make_check success Test make_check on branch master
snowpatch_ozlabs/apply_patch success master/apply_patch Successfully applied

Commit Message

Nicholas Piggin Sept. 30, 2018, 3:48 a.m.
bust_locks is a big hammer that guarantees a mess if it is set while
other threads are still running.
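
To illustrate why this is dangerous, here is a minimal sketch of a lock
whose acquire path is short-circuited by a global bust flag. All names
(demo_lock, demo_bust_locks, demo_try_lock, demo_unlock) are hypothetical
and are not skiboot's actual definitions. The sketch only assumes that a
set bust flag turns lock operations into no-ops.

```c
#include <stdbool.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for illustration; not skiboot's real types. */
struct demo_lock {
	atomic_flag held;
};

/* Plain bool is enough for this single-threaded illustration. */
static bool demo_bust_locks = false;

/* Returns true if the caller may enter the critical section. */
static bool demo_try_lock(struct demo_lock *l)
{
	/* With the bust flag set, every caller is let through. */
	if (demo_bust_locks)
		return true;
	/* test_and_set returns the old value: false means we took it. */
	return !atomic_flag_test_and_set(&l->held);
}

static void demo_unlock(struct demo_lock *l)
{
	/* A busted lock was never really taken, so there is nothing to drop. */
	if (demo_bust_locks)
		return;
	atomic_flag_clear(&l->held);
}
```

Once demo_bust_locks is set, a second demo_try_lock() on an already-held
lock also succeeds, so two running threads could be inside the same
critical section at once.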

I propose no longer setting it in the lock error paths. While debugging
the previous deadlock false positive, none of the error messages were
printed, and the in-memory console was totally garbled due to the lack
of locking.

I think it's generally better for debugging and system integrity to
keep locks held when lock errors occur. Lock busting should be used
carefully, just to allow messages to be printed out or the machine to be
restarted, probably when the whole system is single-threaded.
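
One way to picture "bust only when single-threaded" is a guard that
refuses to enable busting while any other thread may still be running.
This is only a sketch under stated assumptions: sketch_bust_locks,
sketch_bust_if_quiesced and the other_threads_running count are
hypothetical and do not correspond to existing skiboot APIs.

```c
#include <stdbool.h>

/* Hypothetical flag for illustration only; not skiboot's bust_locks. */
static bool sketch_bust_locks = false;

/*
 * Enable lock busting only if no other thread can still be running.
 * other_threads_running is an assumed count supplied by the caller.
 * Returns true if busting was enabled.
 */
static bool sketch_bust_if_quiesced(unsigned int other_threads_running)
{
	/* Other threads may hold locks: leave locking intact. */
	if (other_threads_running != 0)
		return false;
	sketch_bust_locks = true;
	return true;
}
```

With such a guard, an error path would keep its locks held unless the
caller can show that everything else has stopped.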

Skiboot is slowly working toward that being feasible with co-operative
debug APIs between firmware and host, but for the time being, it is
better that difficult lock crashes do not corrupt everything by
busting locks.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
 core/lock.c | 2 --
 1 file changed, 2 deletions(-)


diff --git a/core/lock.c b/core/lock.c
index fc051ca4..e7c60a39 100644
--- a/core/lock.c
+++ b/core/lock.c
@@ -34,8 +34,6 @@  bool bust_locks = true;
 static void __nomcount lock_error(struct lock *l, const char *reason, uint16_t err)
 {
-	bust_locks = true;
-
 	fprintf(stderr, "LOCK ERROR: %s @%p (state: 0x%016llx)\n",
 		reason, l, l->lock_val);
 	op_display(OP_FATAL, OP_MOD_LOCK, err);