diff mbox series

[SRU,J/L,1/1] x86/xen/time: prefer tsc as clocksource when it is invariant

Message ID 470bb1be5bde7540d219748a77b0124ee8a0451a.1693009521.git.ubuntu@ip-172-31-11-177.us-west-2.compute.internal
State New
Headers show
Series Xen timekeeping performance improvement | expand

Commit Message

Krister Johansen Aug. 26, 2023, 12:57 a.m. UTC
BugLink: http://bugs.launchpad.net/bugs/2033122

Kvm elects to use tsc instead of kvm-clock when it can detect that the
TSC is invariant.

(As of commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
invariant TSC is exposed")).

Notable cloud vendors[1] and performance engineers[2] recommend that Xen
users preferentially select tsc over xen-clocksource due the performance
penalty incurred by the latter.  These articles are persuasive and
tailored to specific use cases.  In order to understand the tradeoffs
around this choice more fully, this author had to reference the
documented[3] complexities around the Xen configuration, as well as the
kernel's clocksource selection algorithm.  Many users may not attempt
this to correctly configure the right clock source in their guest.

The approach taken in the kvm-clock module spares users this confusion,
where possible.

Both the Intel SDM[4] and the Xen tsc documentation explain that marking
a tsc as invariant means that it should be considered stable by the OS
and is elibile to be used as a wall clock source.

In order to obtain better out-of-the-box performance, and reduce the
need for user tuning, follow kvm's approach and decrease the xen clock
rating so that tsc is preferable, if it is invariant, stable, and the
tsc will never be emulated.

[1] https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
[2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html
[3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html
[4] Intel 64 and IA-32 Architectures Sofware Developer's Manual Volume
    3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Code-reviewed-by: David Reaver <me@davidreaver.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221216162118.GB2633@templeofstupid.com
Signed-off-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit 99a7bcafbd0d04555074554573019096a8c10450)
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
---
 arch/x86/xen/time.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

Comments

Krister Johansen Aug. 26, 2023, 1 a.m. UTC | #1
On Fri, Aug 25, 2023 at 05:57:30PM -0700, Krister Johansen wrote:
> BugLink: http://bugs.launchpad.net/bugs/2033122
> 
> Kvm elects to use tsc instead of kvm-clock when it can detect that the
> TSC is invariant.
> 
> (As of commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> invariant TSC is exposed")).
> 
> Notable cloud vendors[1] and performance engineers[2] recommend that Xen
> users preferentially select tsc over xen-clocksource due the performance
> penalty incurred by the latter.  These articles are persuasive and
> tailored to specific use cases.  In order to understand the tradeoffs
> around this choice more fully, this author had to reference the
> documented[3] complexities around the Xen configuration, as well as the
> kernel's clocksource selection algorithm.  Many users may not attempt
> this to correctly configure the right clock source in their guest.
> 
> The approach taken in the kvm-clock module spares users this confusion,
> where possible.
> 
> Both the Intel SDM[4] and the Xen tsc documentation explain that marking
> a tsc as invariant means that it should be considered stable by the OS
> and is elibile to be used as a wall clock source.
> 
> In order to obtain better out-of-the-box performance, and reduce the
> need for user tuning, follow kvm's approach and decrease the xen clock
> rating so that tsc is preferable, if it is invariant, stable, and the
> tsc will never be emulated.
> 
> [1] https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
> [2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html
> [3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html
> [4] Intel 64 and IA-32 Architectures Sofware Developer's Manual Volume
>     3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC
> 
> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> Code-reviewed-by: David Reaver <me@davidreaver.com>
> Reviewed-by: Juergen Gross <jgross@suse.com>
> Link: https://lore.kernel.org/r/20221216162118.GB2633@templeofstupid.com
> Signed-off-by: Juergen Gross <jgross@suse.com>
> (cherry picked from commit 99a7bcafbd0d04555074554573019096a8c10450)

Apologies, this should be:

(cherry picked from commit caea091e48ed9d3951506507abf26e9918d08e35)

> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> ---
>  arch/x86/xen/time.c | 38 +++++++++++++++++++++++++++++++++++++-
>  1 file changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index cc11dd2e2f48..062120dd8b98 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> @@ -474,15 +474,51 @@ static void xen_setup_vsyscall_time_info(void)
>  	xen_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK;
>  }
>  
> +/*
> + * Check if it is possible to safely use the tsc as a clocksource.  This is
> + * only true if the hypervisor notifies the guest that its tsc is invariant,
> + * the tsc is stable, and the tsc instruction will never be emulated.
> + */
> +static int __init xen_tsc_safe_clocksource(void)
> +{
> +	u32 eax, ebx, ecx, edx;
> +
> +	if (!(boot_cpu_has(X86_FEATURE_CONSTANT_TSC)))
> +		return 0;
> +
> +	if (!(boot_cpu_has(X86_FEATURE_NONSTOP_TSC)))
> +		return 0;
> +
> +	if (check_tsc_unstable())
> +		return 0;
> +
> +	/* Leaf 4, sub-leaf 0 (0x40000x03) */
> +	cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
> +
> +	/* tsc_mode = no_emulate (2) */
> +	if (ebx != 2)
> +		return 0;
> +
> +	return 1;
> +}
> +
>  static void __init xen_time_init(void)
>  {
>  	struct pvclock_vcpu_time_info *pvti;
>  	int cpu = smp_processor_id();
>  	struct timespec64 tp;
>  
> -	/* As Dom0 is never moved, no penalty on using TSC there */
> +	/*
> +	 * As Dom0 is never moved, no penalty on using TSC there.
> +	 *
> +	 * If it is possible for the guest to determine that the tsc is a safe
> +	 * clocksource, then set xen_clocksource rating below that of the tsc
> +	 * so that the system prefers tsc instead.
> +	 */
>  	if (xen_initial_domain())
>  		xen_clocksource.rating = 275;
> +	else if (xen_tsc_safe_clocksource())
> +		xen_clocksource.rating = 299;
>  
>  	clocksource_register_hz(&xen_clocksource, NSEC_PER_SEC);
>  
> -- 
> 2.25.1
>
Thadeu Lima de Souza Cascardo Aug. 28, 2023, 10:49 a.m. UTC | #2
With the fixed up cherry-pick line,

Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
diff mbox series

Patch

diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index cc11dd2e2f48..062120dd8b98 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -474,15 +474,51 @@  static void xen_setup_vsyscall_time_info(void)
 	xen_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK;
 }
 
+/*
+ * Check if it is possible to safely use the tsc as a clocksource.  This is
+ * only true if the hypervisor notifies the guest that its tsc is invariant,
+ * the tsc is stable, and the tsc instruction will never be emulated.
+ */
+static int __init xen_tsc_safe_clocksource(void)
+{
+	u32 eax, ebx, ecx, edx;
+
+	if (!(boot_cpu_has(X86_FEATURE_CONSTANT_TSC)))
+		return 0;
+
+	if (!(boot_cpu_has(X86_FEATURE_NONSTOP_TSC)))
+		return 0;
+
+	if (check_tsc_unstable())
+		return 0;
+
+	/* Leaf 4, sub-leaf 0 (0x40000x03) */
+	cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
+
+	/* tsc_mode = no_emulate (2) */
+	if (ebx != 2)
+		return 0;
+
+	return 1;
+}
+
 static void __init xen_time_init(void)
 {
 	struct pvclock_vcpu_time_info *pvti;
 	int cpu = smp_processor_id();
 	struct timespec64 tp;
 
-	/* As Dom0 is never moved, no penalty on using TSC there */
+	/*
+	 * As Dom0 is never moved, no penalty on using TSC there.
+	 *
+	 * If it is possible for the guest to determine that the tsc is a safe
+	 * clocksource, then set xen_clocksource rating below that of the tsc
+	 * so that the system prefers tsc instead.
+	 */
 	if (xen_initial_domain())
 		xen_clocksource.rating = 275;
+	else if (xen_tsc_safe_clocksource())
+		xen_clocksource.rating = 299;
 
 	clocksource_register_hz(&xen_clocksource, NSEC_PER_SEC);