[RFC,2/2] core/cpu: have opal_reinit_cpus clear MMU registers

Message ID 20171130155215.30574-3-npiggin@gmail.com
State New
Series
  • [RFC,1/2] head: POWER9 initialise MMU registers to boot state for fast-reboot

Commit Message

Nicholas Piggin Nov. 30, 2017, 3:52 p.m.
When called with an MMU argument, have opal_reinit_cpus zero PIDR and
LPID registers as well as flush TLBs.

During MMU initialization and over kexec, existing Linux kernels do
not clear PIDR. PIDR is not set until init is executed, which is well
after CPUs start running with relocation on.

This can result in CPUs incorrectly picking up and caching
translations (PWC and PTEs) after the kexec/boot process has done
its initial clearing out of TLBs.

PTCR cannot be cleared here, because opal_reinit_cpus is called with
relocation on. As a result, the new kexec kernel unfortunately always
boots with stale PID 0 translations, but in practice they get cleared
out before relocation is turned on.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 core/cpu.c | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)
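
Illustration only, not part of the patch: a minimal user-space sketch of
the control flow described in the commit message, under stated
assumptions. Every sketch_* / SKETCH_* name is hypothetical, the flag
values are placeholders rather than the real OPAL definitions, and the
SPRs and the global TLB flush are modelled as plain struct fields and a
counter. It only shows the ordering: clear the per-CPU MMU registers on
every CPU first, then flush once, and do so only when an MMU flag is
passed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OPAL MMU flags; values are placeholders. */
#define SKETCH_REINIT_MMU_HASH   (1ull << 2)
#define SKETCH_REINIT_MMU_RADIX  (1ull << 3)

#define SKETCH_NR_CPUS 4

/* Per-CPU registers that the patch zeroes, modelled as plain fields. */
struct sketch_cpu {
	uint64_t pid;
	uint64_t lpid;
	uint64_t amr;
	uint64_t iamr;
};

struct sketch_machine {
	bool is_p9;                  /* models proc_gen >= proc_gen_p9 */
	struct sketch_cpu cpus[SKETCH_NR_CPUS];
	unsigned int tlb_flushes;    /* counts modelled global TLB flushes */
};

/* Mirrors the per-CPU step: PID is only cleared on P9 and later. */
static void sketch_cleanup_mmu_one(const struct sketch_machine *m,
				   struct sketch_cpu *cpu)
{
	if (m->is_p9)
		cpu->pid = 0;
	cpu->lpid = 0;
	cpu->amr = 0;
	cpu->iamr = 0;
}

/* Clear the registers on every CPU first, then flush once. */
static void sketch_cleanup_mmu(struct sketch_machine *m)
{
	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		sketch_cleanup_mmu_one(m, &m->cpus[i]);

	/* Stand-in for the global TLB flush (P9 specific in the patch). */
	m->tlb_flushes++;
}

/* Returns true when the MMU cleanup ran, i.e. an MMU flag was passed. */
static bool sketch_reinit_cpus(struct sketch_machine *m, uint64_t flags)
{
	if (flags & (SKETCH_REINIT_MMU_HASH | SKETCH_REINIT_MMU_RADIX)) {
		sketch_cleanup_mmu(m);
		return true;
	}
	return false;
}
```

In this model, calling sketch_reinit_cpus() without either MMU flag
leaves the registers and the flush counter untouched, which matches the
change from unconditional to flag-gated cleanup in this patch.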

Patch

diff --git a/core/cpu.c b/core/cpu.c
index b94e04ef2..519a60515 100644
--- a/core/cpu.c
+++ b/core/cpu.c
@@ -1280,24 +1280,42 @@  void cpu_set_radix_mode(void)
 	cpu_change_all_hid0(&req);
 }
 
-static void cpu_cleanup_one(void *param __unused)
+static void cpu_cleanup_mmu_one(void *param __unused)
 {
+	if (proc_gen >= proc_gen_p9) {
+		mtspr(SPR_PID, 0);
+	}
+	mtspr(SPR_LPID, 0);
 	mtspr(SPR_AMR, 0);
 	mtspr(SPR_IAMR, 0);
 }
 
-static int64_t cpu_cleanup_all(void)
+/*
+ * Clean MMU registers and flush TLBs to prepare for a kexec (or similar
+ * environment).
+ */
+static int64_t cpu_cleanup_mmu(void)
 {
 	struct cpu_thread *cpu;
 
 	for_each_available_cpu(cpu) {
 		if (cpu == this_cpu()) {
-			cpu_cleanup_one(NULL);
+			cpu_cleanup_mmu_one(NULL);
 			continue;
 		}
-		cpu_wait_job(cpu_queue_job(cpu, "cpu_cleanup",
-					   cpu_cleanup_one, NULL), true);
+		cpu_wait_job(cpu_queue_job(cpu, "cpu_cleanup_mmu",
+					   cpu_cleanup_mmu_one, NULL), true);
 	}
+
+	/* Clean up the TLB. Once PID and LPID are cleared, we can flush
+	 * TLBs without translations being prefetched again, except for
+	 * PID 0. Unfortunately Linux calls this with the MMU on, so we
+	 * can't clear the MMU registers completely and flush everything.
+	 *
+	 * This is P9 specific for now.
+	 */
+	cleanup_global_tlb();
+
 	return OPAL_SUCCESS;
 }
 
@@ -1362,7 +1380,10 @@  static int64_t opal_reinit_cpus(uint64_t flags)
 	 * transitions. Ideally Linux should do it but doing it
 	 * here works around existing broken kernels.
 	 */
-	cpu_cleanup_all();
+	if (flags & (OPAL_REINIT_CPUS_MMU_HASH |
+		      OPAL_REINIT_CPUS_MMU_RADIX)) {
+		cpu_cleanup_mmu();
+	}
 
 	/* If HILE change via HID0 is supported ... */
 	if (hile_supported &&
@@ -1398,16 +1419,6 @@  static int64_t opal_reinit_cpus(uint64_t flags)
 		}
 	}
 
-	/* Cleanup the TLB. We do that unconditionally, this works
-	 * around issues where OSes fail to invalidate the PWC in Radix
-	 * mode for example. This only works on P9 and later, but we
-	 * also know we don't have a problem with Linux cleanups on
-	 * P8 so this isn't a problem. If we wanted to cleanup the
-	 * TLB on P8 as well, we'd have to use jobs to do it locally
-	 * on each CPU.
-	 */
-	 cleanup_global_tlb();
-
 	 /* Apply HID bits changes if any */
 	if (req.set_bits || req.clr_bits)
 		cpu_change_all_hid0(&req);