
[v2] slw: improve error message for SLW timer stuck

Message ID 1473394188-19264-1-git-send-email-stewart@linux.vnet.ibm.com
State Accepted

Commit Message

Stewart Smith Sept. 9, 2016, 4:09 a.m. UTC
We still dump the registers, but only to the in-memory console buffer by default.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
---
Changes in V2:
 - update message to say you may have *had* jitter rather than have it.
---
 hw/slw.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)
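A rough sketch of why demoting the register dump from PR_ERR to PR_DEBUG keeps it in the in-memory console buffer only. It assumes two independent thresholds, one for the memory buffer and one for the live console. The enum values, threshold defaults and function names are hypothetical stand-ins for illustration, not skiboot's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical log levels; a lower value means more severe (syslog-style). */
enum log_level { LVL_ERR = 3, LVL_NOTICE = 5, LVL_DEBUG = 7 };

/*
 * Assumed thresholds for this sketch only: the in-memory buffer keeps
 * everything up to DEBUG, the live console shows NOTICE and more severe.
 */
static const int mem_threshold = LVL_DEBUG;
static const int con_threshold = LVL_NOTICE;

/* A message is stored in the memory buffer if it is at least this severe. */
static bool goes_to_memory(int level)
{
	return level <= mem_threshold;
}

/* A message is printed on the live console if it is at least this severe. */
static bool goes_to_console(int level)
{
	return level <= con_threshold;
}
```

Under these assumed defaults, an error-level message reaches both sinks, while a debug-level message such as the register dump reaches only the memory buffer.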

Comments

Stewart Smith Sept. 14, 2016, 4:10 a.m. UTC | #1
Stewart Smith <stewart@linux.vnet.ibm.com> writes:
> We still register dump, but only to in memory console buffer by default.
>
> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
> ---
> Changes in V2:
>  - update message to say you may have *had* jitter rather than have it.
> ---
>  hw/slw.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)

merged to master as of 81154ba9b2d418cd5f9eda3a6f89ca6631556510
and 5.3.x as of c70c825dc8ff1fb309aa678eebe10300fb3c832a

Patch

diff --git a/hw/slw.c b/hw/slw.c
index 74b9cd5477c5..87dc1c4964e7 100644
--- a/hw/slw.c
+++ b/hw/slw.c
@@ -1197,13 +1197,13 @@  static void slw_dump_timer_ffdc(void)
 	 * root-cause. OPAL/skiboot may be stuck on some operation that
 	 * requires SLW timer state machine (e.g. core powersaving)
 	 */
-	prlog(PR_ERR, "SLW: Register state:\n");
+	prlog(PR_DEBUG, "SLW: Register state:\n");
 
 	for (i = 0; i < ARRAY_SIZE(dump_regs); i++) {
 		uint32_t reg = dump_regs[i];
 		rc = xscom_read(slw_timer_chip, reg, &val);
 		if (rc) {
-			prlog(PR_ERR, "SLW: XSCOM error %lld reading"
+			prlog(PR_DEBUG, "SLW: XSCOM error %lld reading"
 			      " reg 0x%x\n", rc, reg);
 			break;
 		}
@@ -1250,7 +1250,25 @@  void slw_update_timer_expiry(uint64_t new_target)
 			if (!(gen & 1))
 				break;
 			if (tb_compare(now + msecs_to_tb(1), mftb()) == TB_ABEFOREB) {
-				prerror("SLW: Stuck with odd generation !\n");
+				/**
+				 * @fwts-label SLWTimerStuck
+				 * @fwts-advice The SLeep/Winkle Engine (SLW)
+				 * failed to increment the generation number
+				 * within our timeout period (it *should* have
+				 * done so within ~10us, not >1ms). OPAL uses
+				 * the SLW timer to schedule some operations,
+				 * but can fall back to the OPAL poller, which
+				 * does not affect functionality but runs
+				 * *much* less frequently.
+				 * This could have the effect of slow I2C
+				 * operations (for example). It may also mean
+				 * that you *had* an increase in jitter, due
+				 * to slow interactions with SLW.
+				 * This error may also occur if the machine
+				 * is connected to via soft FSI.
+				 */
+				prerror("SLW: timer stuck, falling back to OPAL pollers. You will likely have slower I2C and may have experienced increased jitter.\n");
+				prlog(PR_DEBUG, "SLW: Stuck with odd generation !\n");
 				slw_has_timer = false;
 				slw_dump_timer_ffdc();
 				return;
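The timeout path touched by the second hunk can be modelled roughly as follows: poll a generation counter until it becomes even, and give up once a deadline has passed, at which point the caller would fall back to the pollers. This is a simplified sketch, not the skiboot code. The callback types, wait_for_even_generation() and the fake_timer helpers are hypothetical, and it uses plain tick subtraction rather than tb_compare().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks standing in for the register read and the timebase. */
typedef uint64_t (*read_gen_fn)(void *ctx);
typedef uint64_t (*read_time_fn)(void *ctx);

/*
 * Returns true once the generation is even (update complete), or false
 * if it is still odd after more than timeout_ticks have elapsed.
 */
static bool wait_for_even_generation(read_gen_fn read_gen, read_time_fn now,
				     void *ctx, uint64_t timeout_ticks)
{
	uint64_t start = now(ctx);

	for (;;) {
		if (!(read_gen(ctx) & 1))
			return true;
		if (now(ctx) - start > timeout_ticks)
			return false;
	}
}

/* Fake device for experimenting: the generation turns even at even_after. */
struct fake_timer {
	uint64_t ticks;
	uint64_t even_after;
};

/* Each time read advances the fake clock by one tick. */
static uint64_t fake_now(void *ctx)
{
	struct fake_timer *t = ctx;

	return t->ticks++;
}

/* Odd generation until the fake clock reaches even_after, then even. */
static uint64_t fake_gen(void *ctx)
{
	struct fake_timer *t = ctx;

	return t->ticks >= t->even_after ? 2 : 1;
}
```

With a fake_timer whose even_after lies inside the timeout, the wait succeeds. With even_after set beyond it, the function returns false, which corresponds to the case where the patch prints the new message and clears slw_has_timer.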