
opal/xstop: Use nvram option to enable/disable sw checkstop.

Message ID 151351242343.5903.13165313011722996463.stgit@jupiter.in.ibm.com
State Superseded

Commit Message

Mahesh J Salgaonkar Dec. 17, 2017, 12:14 p.m. UTC
From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Add a mechanism to enable/disable the sw checkstop by looking at the nvram
option opal-sw-xstop=<enable/disable>.

For now, on p9 this patch disables the sw checkstop trigger unless it is
explicitly enabled through the nvram option 'opal-sw-xstop=enable'. This gives
the host kernel an opportunity to reach its panic path or xmon on
unrecoverable HMIs or MCEs, so that the issue can be debugged effectively.

To enable sw checkstop in opal, issue the following command:

# nvram -p ibm,skiboot --update-config opal-sw-xstop=enable

NOTE: This is a workaround patch that disables sw checkstop by default so that
the host kernel gains control for better checkstop debugging. Once most of
the checkstop issues are stabilized/resolved, revisit this patch to enable sw
checkstop by default on p9.

On p8 platforms it remains enabled by default unless explicitly disabled.

To disable sw checkstop on p8, issue the following command:

# nvram -p ibm,skiboot --update-config opal-sw-xstop=disable

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 hw/xscom.c |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
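
The per-generation defaults described in the commit message can be summarised as a small pure function. This is only a sketch outside of skiboot: `enum cpu_gen`, `GEN_P8`, `GEN_P9` and `sw_xstop_allowed()` are hypothetical names introduced for illustration, and the nvram value is passed in as a plain string (NULL when the option is unset) rather than looked up with `nvram_query_eq()`.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for skiboot's proc_gen values. */
enum cpu_gen { GEN_P8, GEN_P9 };

/*
 * Returns true when the sw checkstop should be triggered.
 * nvram_value is the value of opal-sw-xstop, or NULL if the option is unset.
 */
static bool sw_xstop_allowed(enum cpu_gen gen, const char *nvram_value)
{
	if (gen == GEN_P8) {
		/* p8: enabled unless explicitly disabled. */
		return !(nvram_value && strcmp(nvram_value, "disable") == 0);
	}

	/* p9 (and the default case): disabled unless explicitly enabled. */
	return nvram_value && strcmp(nvram_value, "enable") == 0;
}
```

With these assumptions, an unset option allows the checkstop on p8 and suppresses it on p9, and any value other than the exact strings "disable" (p8) or "enable" (p9) leaves the per-generation default in place.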

Comments

Balbir Singh Dec. 18, 2017, 2:07 a.m. UTC | #1
On Sun, Dec 17, 2017 at 11:14 PM, Mahesh J Salgaonkar
<mahesh@linux.vnet.ibm.com> wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> Add a mechanism to enable/disable sw checkstop by looking at nvram option
> opal-sw-xstop=<enable/disable>.
>
> For now this patch disables the sw checkstop trigger unless explicitly
> enabled through nvram option 'opal-sw-xstop=enable' for p9. This will allow
> an opportunity to get host kernel in panic path or xmon for unrecoverable
> HMIs or MCE, to be able to debug the issue effectively.
>
> To enable sw checkstop in opal issue following command:
>
> # nvram -p ibm,skiboot --update-config opal-sw-xstop=enable
>
> NOTE: This is a workaround patch to disable sw checkstop by default to gain
> control in host kernel for better checkstop debugging. Once we have most of
> the checkstop issues stabilized/resolved, revisit this patch to enable sw
> checkstop by default for p9.
>
> For p8 platform it will remain enabled by default unless explicitly disabled.
>
> To disable sw checkstop on p8 issue following command:
>
> # nvram -p ibm,skiboot --update-config opal-sw-xstop=disable
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  hw/xscom.c |   26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/hw/xscom.c b/hw/xscom.c
> index de5a27e..7b530a9 100644
> --- a/hw/xscom.c
> +++ b/hw/xscom.c
> @@ -24,6 +24,7 @@
>  #include <errorlog.h>
>  #include <opal-api.h>
>  #include <timebase.h>
> +#include <nvram.h>
>
>  /* Mask of bits to clear in HMER before an access */
>  #define HMER_CLR_MASK  (~(SPR_HMER_XSCOM_FAIL | \
> @@ -827,6 +828,31 @@ int64_t xscom_trigger_xstop(void)
>  {
>         int rc = OPAL_UNSUPPORTED;
>
> +       /*
> +        * Workaround until we iron out all checkstop issues at present.
> +        *
> +        * For p9:
> +        * By default do not trigger sw checkstop unless explicitly enabled
> +        * through nvram option 'opal-sw-xstop=enable'.
> +        *
> +        * For p8:
> +        * Keep it enabled by default unless explicitly disabled.
> +        *
> +        * NOTE: Once all checkstop issues are resolved/stabilized reverse
> +        * the logic to enable sw checkstop by default on p9.
> +        */
> +       switch (proc_gen) {
> +       case proc_gen_p8:
> +               if (nvram_query_eq("opal-sw-xstop", "disable"))
> +                       return rc;
> +               break;
> +       case proc_gen_p9:
> +       default:
> +               if (!nvram_query_eq("opal-sw-xstop", "enable"))
> +                       return rc;
> +               break;
> +       }

Can we get a prlog to indicate that we did not checkstop? I think it
will be useful when we debug. Ideally the OS will crash/reboot and tell
us from the stack.

> +
>         if (xstop_xscom.addr)
>                 rc = xscom_writeme(xstop_xscom.addr,
>                                 PPC_BIT(xstop_xscom.fir_bit));
>

Reviewed-by: Balbir Singh <bsingharora@gmail.com>
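
One possible shape for the log message requested in the review is sketched below. It is not part of the posted patch. To keep it compilable outside skiboot, `prlog()` is replaced by a local stub with the same call shape, the `PR_NOTICE` and `OPAL_*` values are defined locally as illustrative stand-ins, and `trigger_xstop_logged()`, `last_log` and the message text are hypothetical.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Local stand-ins so the sketch builds outside skiboot. */
#define PR_NOTICE		5
#define OPAL_SUCCESS		0
#define OPAL_UNSUPPORTED	(-7)

/* Keeps the most recent message so the behaviour can be checked. */
static char last_log[128];

/* Minimal stub with the call shape of skiboot's prlog(level, fmt, ...). */
static void prlog(int level, const char *fmt, ...)
{
	va_list ap;

	(void)level;
	va_start(ap, fmt);
	vsnprintf(last_log, sizeof(last_log), fmt, ap);
	va_end(ap);
	fputs(last_log, stderr);
}

/* xstop_enabled stands for the outcome of the nvram check in the patch. */
static int trigger_xstop_logged(bool xstop_enabled)
{
	if (!xstop_enabled) {
		/* Leave a trace that the checkstop was deliberately skipped. */
		prlog(PR_NOTICE,
		      "XSCOM: sw checkstop not triggered (opal-sw-xstop)\n");
		return OPAL_UNSUPPORTED;
	}

	/* The real function would write the FIR bit via xscom_writeme() here. */
	return OPAL_SUCCESS;
}
```

In the actual patch the message would sit in front of the two early `return rc;` statements, so that the skiboot log records why no checkstop occurred.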

Patch

diff --git a/hw/xscom.c b/hw/xscom.c
index de5a27e..7b530a9 100644
--- a/hw/xscom.c
+++ b/hw/xscom.c
@@ -24,6 +24,7 @@ 
 #include <errorlog.h>
 #include <opal-api.h>
 #include <timebase.h>
+#include <nvram.h>
 
 /* Mask of bits to clear in HMER before an access */
 #define HMER_CLR_MASK	(~(SPR_HMER_XSCOM_FAIL | \
@@ -827,6 +828,31 @@  int64_t xscom_trigger_xstop(void)
 {
 	int rc = OPAL_UNSUPPORTED;
 
+	/*
+	 * Workaround until we iron out all checkstop issues at present.
+	 *
+	 * For p9:
+	 * By default do not trigger sw checkstop unless explicitly enabled
+	 * through nvram option 'opal-sw-xstop=enable'.
+	 *
+	 * For p8:
+	 * Keep it enabled by default unless explicitly disabled.
+	 *
+	 * NOTE: Once all checkstop issues are resolved/stabilized reverse
+	 * the logic to enable sw checkstop by default on p9.
+	 */
+	switch (proc_gen) {
+	case proc_gen_p8:
+		if (nvram_query_eq("opal-sw-xstop", "disable"))
+			return rc;
+		break;
+	case proc_gen_p9:
+	default:
+		if (!nvram_query_eq("opal-sw-xstop", "enable"))
+			return rc;
+		break;
+	}
+
 	if (xstop_xscom.addr)
 		rc = xscom_writeme(xstop_xscom.addr,
 				PPC_BIT(xstop_xscom.fir_bit));