diff mbox series

[2/2] core/fast-reset: Do not enable by default

Message ID 20200217010224.26285-2-oohall@gmail.com
State Accepted
Headers show
Series [1/2] core/platform: Add an explicit fast-reboot type | expand

Checks

Context Check Description
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot-dco success Signed-off-by present
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot success Test snowpatch/job/snowpatch-skiboot on branch master
snowpatch_ozlabs/apply_patch success Successfully applied on branch master (53944d45087bd78bf7f9494a54e464c74322a83a)

Commit Message

Oliver O'Halloran Feb. 17, 2020, 1:02 a.m. UTC
Fast reboot started life as a debug hack and it escaped into the wild when
Stewart enabled it by default. There was some reasons for this, but the
main one is that a full reboot takes somewhere between one and five
minutes. For those of us who spend all day rebooting their POWER systems
this is great, but the utility for end users has always been pretty
questionable.

Rebooting a system should be a fairly infrequent activity in the field
with the main reasons for doing one being:

1) Kernel updates,
2) Misbehaving hardware

Although 1) can be performed by kexec we have found that it fails due to
2) occasionally. The reason for 2) is usually hardware getting itself
into a bad state. The universal fix for that type of hardware problem
is turning the hardware off and back on again so it's preferable that
a reboot actually does that.

This patch refactors the reboot handling OPAL calls so that fast-reboot
is only used by default when explicitly enabled, or manually invoked.
This allows developers to continue to use fast-reboot without expecting
users deal with its quirks (and understand how a "normal" reboot,
fast-reboot and MPIPL differ).

This has two user visible changes:

1. Full reboot is now the default. In order to get fast-reboot as the
   default the nvram option needs to be set:

	nvram -p ibm,skiboot --update-config fast-reset=1

2. The nvram option to force a fast-reboot even when some part of
   skiboot has called disable_fast_reboot() has changed from
   'fast-reset=im-feeling-lucky' to 'force-fast-reset=1' because
   it's impossible to actually use that 'feature' if fast-reboot is
   off by default.

	nvram -p ibm,skiboot --update-config force-fast-reset=1

Cc: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
While we're here should we change it so that the nvram option is
fast-reboot rather than fast-reset? It's called fast-reboot
everywhere *except* the nvram flag and it always screws me up.
---
 core/fast-reboot.c |  2 +-
 core/platform.c    | 43 ++++++++++++++++++++++++-------------------
 2 files changed, 25 insertions(+), 20 deletions(-)

Comments

Michael Neuling Feb. 17, 2020, 5:28 a.m. UTC | #1
> This has two user visible changes:
> 
> 1. Full reboot is now the default. 

*sniff*

You're the boss now so fair enough.

Mikey
Michael Ellerman Feb. 24, 2020, 2:08 a.m. UTC | #2
Oliver O'Halloran <oohall@gmail.com> writes:

> Fast reboot started life as a debug hack and it escaped into the wild when
> Stewart enabled it by default. There was some reasons for this, but the
> main one is that a full reboot takes somewhere between one and five
> minutes. For those of us who spend all day rebooting their POWER systems
> this is great, but the utility for end users has always been pretty
> questionable.
>
> Rebooting a system should be a fairly infrequent activity in the field
> with the main reasons for doing one being:
>
> 1) Kernel updates,
> 2) Misbehaving hardware
>
> Although 1) can be performed by kexec we have found that it fails due to
> 2) occasionally. The reason for 2) is usually hardware getting itself
> into a bad state. The universal fix for that type of hardware problem
> is turning the hardware off and back on again so it's preferable that
> a reboot actually does that.
>
> This patch refactors the reboot handling OPAL calls so that fast-reboot
> is only used by default when explicitly enabled, or manually invoked.
> This allows developers to continue to use fast-reboot without expecting
> users deal with its quirks (and understand how a "normal" reboot,
> fast-reboot and MPIPL differ).

+(infinite points)

cheers
diff mbox series

Patch

diff --git a/core/fast-reboot.c b/core/fast-reboot.c
index 410acfe637ce..b60ddf3c4322 100644
--- a/core/fast-reboot.c
+++ b/core/fast-reboot.c
@@ -118,7 +118,7 @@  void fast_reboot(void)
 	}
 
 	if (fast_reboot_disabled &&
-	    nvram_query_eq_dangerous("fast-reset", "im-feeling-lucky")) {
+	    nvram_query_eq_dangerous("force-fast-reset", "1")) {
 		/* Do fast reboot even if it's been disabled */
 		prlog(PR_NOTICE, "RESET: Ignoring fast reboot disabled: %s\n",
 				fast_reboot_disabled);
diff --git a/core/platform.c b/core/platform.c
index 99cf5c985399..853b4a4acc3f 100644
--- a/core/platform.c
+++ b/core/platform.c
@@ -48,25 +48,10 @@  static int64_t opal_cec_power_down(uint64_t request)
 }
 opal_call(OPAL_CEC_POWER_DOWN, opal_cec_power_down, 1);
 
-static int64_t opal_cec_reboot(void)
+static int64_t full_reboot(void)
 {
 	prlog(PR_NOTICE, "OPAL: Reboot request...\n");
 
-	opal_quiesce(QUIESCE_HOLD, -1);
-
-	if (proc_gen == proc_gen_p8) {
-		/*
-		 * Bugs in P8 mean fast reboot isn't 100% reliable when cores
-		 * are busy, so only attempt if explicitly *enabled*.
-		 */
-		if (nvram_query_eq_safe("fast-reset", "1"))
-			fast_reboot();
-
-	} else if (!nvram_query_eq_safe("fast-reset", "0")) {
-		/* Try fast-reset unless explicitly disabled */
-		fast_reboot();
-	}
-
 	console_complete_flush();
 
 	if (platform.cec_reboot)
@@ -74,6 +59,26 @@  static int64_t opal_cec_reboot(void)
 
 	return OPAL_SUCCESS;
 }
+
+static int64_t opal_cec_reboot(void)
+{
+	opal_quiesce(QUIESCE_HOLD, -1);
+
+	/*
+	 * Fast-reset was enabled by default for a long time in an attempt to
+	 * make it more stable by exercising it more frequently. This resulted
+	 * in a fair amount of pain due to mis-behaving hardware and confusion
+	 * about what a "reset" is supposed to do exactly. Additionally,
+	 * secure variables require a full reboot to work at all.
+	 *
+	 * Due to all that fast-reset should only be used if it's explicitly
+	 * enabled. It started life as a debug hack and should remain one.
+	 */
+	if (nvram_query_eq_safe("fast-reset", "1"))
+		fast_reboot();
+
+	return full_reboot();
+}
 opal_call(OPAL_CEC_REBOOT, opal_cec_reboot, 0);
 
 static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag)
@@ -105,15 +110,15 @@  static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag)
 		console_complete_flush();
 		return xscom_trigger_xstop();
 	case OPAL_REBOOT_FULL_IPL:
-		disable_fast_reboot("full IPL reboot requested");
-		return opal_cec_reboot();
+		prlog(PR_NOTICE, "Reboot: Full reboot requested");
+		return full_reboot();
 	case OPAL_REBOOT_MPIPL:
 		prlog(PR_NOTICE, "Reboot: OS reported error. Performing MPIPL\n");
 		console_complete_flush();
 		if (platform.terminate)
 			platform.terminate("OS reported error. Performing MPIPL\n");
 		else
-			opal_cec_reboot();
+			full_reboot();
 		for (;;);
 		break;
 	case OPAL_REBOOT_FAST: