[4/4] core/opal: Allow poller re-entry if OPAL was re-entered

Message ID 20180408064939.18879-5-npiggin@gmail.com
State Accepted
Series
  • next round of OPAL debugging improvements

Commit Message

Nicholas Piggin April 8, 2018, 6:49 a.m.
If an NMI interrupts the middle of running pollers and the OS
invokes pollers again (e.g., for console output), the poller
re-entrancy check will prevent the pollers from running and spam
the console with recursion warnings.

That check was designed to catch a poller calling opal_run_pollers.
OPAL re-entrancy is something different and is detected elsewhere.
Skip the poller recursion check if OPAL has been re-entered. This
is a best-effort attempt to cope with errors.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 core/opal.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
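
The behaviour described in the commit message can be illustrated with a
small standalone model. This is only a sketch, not skiboot code: struct
cpu_model, run_pollers_model() and the two example pollers are
hypothetical names, and it assumes in_opal_call counts the nesting depth
of OPAL calls on the CPU, as the "== 1" comparison in the patch suggests.
The warning rate limit, backtrace and disable_fast_reboot() call are left
out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for skiboot's per-CPU state; only the two
 * fields the patch looks at are modelled. */
struct cpu_model {
	int in_opal_call;	/* assumed: nesting depth of OPAL calls */
	bool in_poller;		/* set while pollers are running */
	int polls_run;		/* illustration only: counts poller passes */
};

typedef void (*poller_fn)(struct cpu_model *cpu);

/* Returns true if the pollers ran, false if the call was rejected
 * as poller recursion. */
static bool run_pollers_model(struct cpu_model *cpu, poller_fn poller)
{
	bool was_in_poller;

	/* Reject only a poller calling back into the pollers from the
	 * outermost OPAL call; a re-entered OPAL call is let through. */
	if (cpu->in_opal_call == 1 && cpu->in_poller)
		return false;

	was_in_poller = cpu->in_poller;
	cpu->in_poller = true;

	cpu->polls_run++;
	if (poller)
		poller(cpu);

	/* Restore rather than clear, so an interrupted outer poller run
	 * still sees in_poller set once the nested call returns. */
	cpu->in_poller = was_in_poller;
	return true;
}

/* A buggy poller that calls back into the pollers directly. */
static void recursive_poller(struct cpu_model *cpu)
{
	run_pollers_model(cpu, NULL);
}

/* Models an NMI arriving mid-poll: the OS re-enters OPAL, raising
 * in_opal_call to 2, and runs pollers again for console output. */
static void nmi_reentry_poller(struct cpu_model *cpu)
{
	cpu->in_opal_call++;
	run_pollers_model(cpu, NULL);
	cpu->in_opal_call--;
}
```

With in_opal_call at 1, running recursive_poller leaves polls_run at 1
because the inner call is rejected, while nmi_reentry_poller leaves it at
2 because the nested call is allowed. In both cases in_poller is false
again after the outermost call returns.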

Patch

diff --git a/core/opal.c b/core/opal.c
index f6922b26..3642fb04 100644
--- a/core/opal.c
+++ b/core/opal.c
@@ -546,18 +546,21 @@  void opal_del_poller(void (*poller)(void *data))
 
 void opal_run_pollers(void)
 {
-	struct opal_poll_entry *poll_ent;
 	static int pollers_with_lock_warnings = 0;
 	static int poller_recursion = 0;
+	struct opal_poll_entry *poll_ent;
+	bool was_in_poller;
 
-	/* Don't re-enter on this CPU */
-	if (this_cpu()->in_poller && poller_recursion < 16) {
+	/* Don't re-enter on this CPU, unless it was an OPAL re-entry */
+	if (this_cpu()->in_opal_call == 1 &&
+			this_cpu()->in_poller && poller_recursion < 16) {
 		/**
 		 * @fwts-label OPALPollerRecursion
 		 * @fwts-advice Recursion detected in opal_run_pollers(). This
 		 * indicates a bug in OPAL where a poller ended up running
 		 * pollers, which doesn't lead anywhere good.
 		 */
+		disable_fast_reboot("Poller recursion detected.");
 		prlog(PR_ERR, "OPAL: Poller recursion detected.\n");
 		backtrace();
 		poller_recursion++;
@@ -565,6 +568,7 @@  void opal_run_pollers(void)
 			prlog(PR_ERR, "OPAL: Squashing future poller recursion warnings (>16).\n");
 		return;
 	}
+	was_in_poller = this_cpu()->in_poller;
 	this_cpu()->in_poller = true;
 
 	if (!list_empty(&this_cpu()->locks_held) && pollers_with_lock_warnings < 64) {
@@ -598,7 +602,7 @@  void opal_run_pollers(void)
 		poll_ent->poller(poll_ent->data);
 
 	/* Disable poller flag */
-	this_cpu()->in_poller = false;
+	this_cpu()->in_poller = was_in_poller;
 
 	/* On debug builds, print max stack usage */
 	check_stacks();