[v7,24/24] x86/tsc: Stop the HPET hardlockup detector if TSC becomes unstable

Message ID 20230301234753.28582-25-ricardo.neri-calderon@linux.intel.com (mailing list archive)
State Handled Elsewhere
Series x86: Implement an HPET-based hardlockup detector

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 24 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.

Commit Message

Ricardo Neri March 1, 2023, 11:47 p.m. UTC
The HPET-based hardlockup detector relies on the TSC to determine whether
an observed NMI was caused by the HPET timer. Hence, this detector cannot
be used once the TSC becomes unstable. A TSC that has been marked unstable
never becomes stable again. In that case, permanently stop the HPET-based
hardlockup detector.
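
To illustrate why the detector needs a stable TSC, here is a minimal
userspace sketch of a TSC-window check. It is not code from this series.
The names hld_model, tsc_next, tsc_error and is_hpet_nmi() are hypothetical,
and the sketch assumes only that the detector records the TSC value expected
at the next HPET expiry and attributes an NMI to the HPET when the current
TSC lies within a tolerance of that value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; the field names are assumptions for illustration. */
struct hld_model {
	uint64_t tsc_next;   /* TSC value expected at the next HPET expiry */
	uint64_t tsc_error;  /* tolerated deviation, in TSC cycles */
};

/* Return true if tsc_now is within tsc_error cycles of the expected expiry. */
static bool is_hpet_nmi(const struct hld_model *m, uint64_t tsc_now)
{
	uint64_t delta = tsc_now > m->tsc_next ? tsc_now - m->tsc_next
					       : m->tsc_next - tsc_now;

	return delta <= m->tsc_error;
}
```

If the TSC drifts or jumps, the value read in the NMI handler no longer
relates to the recorded expectation, so a check of this kind can neither
confirm nor rule out the HPET as the NMI source.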

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v6:
 * Do not switch to the perf-based NMI watchdog. Instead, only stop
   the HPET-based NMI watchdog if the TSC counter becomes unstable.

Changes since v5:
 * Relocated the declaration of hardlockup_detector_switch_to_perf() to
   x86/nmi.h. It does not depend on HPET.
 * Removed function stub. The shim hardlockup detector is always for x86.

Changes since v4:
 * Added a stub version of hardlockup_detector_switch_to_perf() for
   !CONFIG_HPET_TIMER. (lkp)
 * Reconfigure the whole lockup detector instead of unconditionally
   starting the perf-based hardlockup detector.

Changes since v3:
 * None

Changes since v2:
 * Introduced this patch.

Changes since v1:
 * N/A
---
 arch/x86/include/asm/nmi.h     |  6 ++++++
 arch/x86/kernel/tsc.c          |  3 +++
 arch/x86/kernel/watchdog_hld.c | 11 +++++++++++
 3 files changed, 20 insertions(+)

Patch

diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 5c5f1e56c404..4d0687a2b4ea 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -63,4 +63,10 @@  void stop_nmi(void);
 void restart_nmi(void);
 void local_touch_nmi(void);
 
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+extern void hardlockup_detector_mark_hpet_hld_unavailable(void);
+#else
+static inline void hardlockup_detector_mark_hpet_hld_unavailable(void) {}
+#endif
+
 #endif /* _ASM_X86_NMI_H */
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 344698852146..24f77efea569 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1191,6 +1191,9 @@  void mark_tsc_unstable(char *reason)
 
 	clocksource_mark_unstable(&clocksource_tsc_early);
 	clocksource_mark_unstable(&clocksource_tsc);
+
+	/* The HPET hardlockup detector depends on a stable TSC. */
+	hardlockup_detector_mark_hpet_hld_unavailable();
 }
 
 EXPORT_SYMBOL_GPL(mark_tsc_unstable);
diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c
index 33c22f6456a3..f5d79ce0e7a2 100644
--- a/arch/x86/kernel/watchdog_hld.c
+++ b/arch/x86/kernel/watchdog_hld.c
@@ -6,6 +6,8 @@ 
  * Copyright (C) Intel Corporation 2023
  */
 
+#define pr_fmt(fmt) "watchdog: " fmt
+
 #include <linux/nmi.h>
 #include <asm/hpet.h>
 
@@ -84,3 +86,12 @@  void watchdog_nmi_start(void)
 	if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET)
 		hardlockup_detector_hpet_start();
 }
+
+void hardlockup_detector_mark_hpet_hld_unavailable(void)
+{
+	if (detector_type != X86_HARDLOCKUP_DETECTOR_HPET)
+		return;
+
+	pr_warn("TSC is unstable. Stopping the HPET NMI watchdog.\n");
+	hardlockup_detector_mark_unavailable();
+}
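
The control flow added above can be modeled in userspace as a one-way
latch. This is only a sketch under stated assumptions: watchdog_model,
detector_kind, model_mark_hpet_unavailable() and model_can_start() are
hypothetical names, and the internals of hardlockup_detector_mark_unavailable()
are not shown in this patch, so the model simply assumes that it clears an
availability flag which nothing sets again.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the detector types used in the patch. */
enum detector_kind { DETECTOR_PERF, DETECTOR_HPET };

struct watchdog_model {
	enum detector_kind type;
	bool available;      /* one-way latch: cleared once, never set again */
};

/* Mirrors the guard in the patch: act only if the HPET detector is in use. */
static void model_mark_hpet_unavailable(struct watchdog_model *w)
{
	if (w->type != DETECTOR_HPET)
		return;

	fprintf(stderr, "watchdog: TSC is unstable. Stopping the HPET NMI watchdog.\n");
	w->available = false;
}

/* A start request succeeds only while the detector is still available. */
static bool model_can_start(const struct watchdog_model *w)
{
	return w->available;
}
```

Because the model offers no way to set the flag back to true, it reflects
the commit message's point that the stop is permanent, and a detector of
another type is left untouched.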