Patchwork [RFC,Quantal] Replace pv-ops fix for legacy Xen

login
register
mail settings
Submitter Stefan Bader
Date June 14, 2012, 4:46 p.m.
Message ID <1339692363-9195-1-git-send-email-stefan.bader@canonical.com>
Download mbox | patch
Permalink /patch/164962/
State New
Headers show

Comments

Stefan Bader - June 14, 2012, 4:46 p.m.
For hysterical reasons we carry the following patch for running
as paravirtualized guests on Xen (on Amazon EC2):

  commit b187be83798aded25e58cb40f52ffc9a1452879d
  Author: John Johansen <john.johansen@canonical.com>
  Date:   Tue Jul 27 06:06:07 2010 -0700

    UBUNTU: SAUCE: fix pv-ops for legacy Xen

A short (ok, maybe not) background:

Back in Lucid times there was a problem which caused paravirtualized
guests with pv-ops enabled to crash early on boot when running on
EC2 (not always but sometimes).

The reason was found later (Maverick) to be that some older versions
of the Xen hypervisor (not sure which exactly) would crash a PV guest
when that tried to write unexpected/-supported values into CR4. The
problematic one apparently OSXSAVE.

So the hack (taken from Fedora) put some code in to filter out that
bit before it was written into CR4 by a function that is part of the
pv-ops (iow it only is used on a paravirt guest).

During last UDS time, there happened to be an upstream discussion
about this because kernels that had this hack would cause problems
when running on newer Xen hypervisor versions (and of course CPUs
that actually support this feature).

That turned out to be a glibc problem where checks were not complete
and thus only looking at XSAVE but not OSXSAVE reported through CR4.
I believe that problem now has been fixed, but still it sounded like
the way this is handled could be improved.

In order to be save about EC2 we need to prevent OSXSAVE to be written
into CR4 on "older" Xen versions. But running on newer versions with
the right support and features it would be nice not to cripple
it all the time.

So this patch would replace the old hack. It hooks into the same function
which would disable XSAVE/OSXSAVE when it would be manually forced off
(by masking off the feature flags from the cpuid results).
The change will enforce this when running on a Xen hypervisor before
version 4 and only when running as a PV guest. I also added a small
change to the banner printing code, so it is possible to see when the
cpuid capping takes place.

I booted a Quantal kernel with that on two EC2 instances (and one
even happened to be on an old 3.0.3-rc5 version of Xen). Of course
neither of them seemed to claim the XSAVE feature in the first 
place... And it would likely require a lot of time to accidentally
hit a host that does, and none of my machines and home does either.

-Stefan


From 78d874169b045410c41c0f3f2135743aa49d2d28 Mon Sep 17 00:00:00 2001
From: Stefan Bader <stefan.bader@canonical.com>
Date: Mon, 4 Jun 2012 19:12:51 +0200
Subject: [PATCH 2/2] UBUNTU: SAUCE: Mask CR4 writes on older Xen hypervisors

Older Xen hypervisors (like RHEL5 versions found to be used
on Amazon's EC2) did have a bug which would crash the domain
when trying to write unsupported CR values.
Newer versions do handle this correctly. But some probes (in
particular one that tries to pass the OSXSAVE bit set) were
causing pv-ops kernels to crash early on boot when running on EC2.

We were using a patch that did always mask off OSXSAVE when
running under Xen PV but since this may limit performance for
hypervisor/HW combinations that do support the AVX instructions,
a slightly more targetted approach would be good.

This patch tries to mask off the XSAVE and OSXSAVE bits when
running as a PV guest on a host prior Xen 4.

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
---
 arch/x86/xen/enlighten.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)
Tim Gardner - June 14, 2012, 7:06 p.m.
On 06/14/2012 10:46 AM, Stefan Bader wrote:
> For hysterical reasons we carry the following patch for running
> as paravirtualized guests on Xen (on Amazon EC2):
> 
>   commit b187be83798aded25e58cb40f52ffc9a1452879d
>   Author: John Johansen <john.johansen@canonical.com>
>   Date:   Tue Jul 27 06:06:07 2010 -0700
> 
>     UBUNTU: SAUCE: fix pv-ops for legacy Xen
> 
> A short (ok, maybe not) background:
> 
> Back in Lucid times there was a problem which caused paravirtualized
> guests with pv-ops enabled to crash early on boot when running on
> EC2 (not always but sometimes).
> 
> The reason was found later (Maverick) to be that some older versions
> of the Xen hypervisor (not sure which exactly) would crash a PV guest
> when that tried to write unexpected/-supported values into CR4. The
> problematic one apparently OSXSAVE.
> 
> So the hack (taken from Fedora) put some code in to filter out that
> bit before it was written into CR4 by a function that is part of the
> pv-ops (iow it only is used on a paravirt guest).
> 
> During last UDS time, there happened to be an upstream discussion
> about this because kernels that had this hack would cause problems
> when running on newer Xen hypervisor versions (and of course CPUs
> that actually support this feature).
> 
> That turned out to be a glibc problem where checks were not complete
> and thus only looking at XSAVE but not OSXSAVE reported through CR4.
> I believe that problem now has been fixed, but still it sounded like
> the way this is handled could be improved.
> 
> In order to be save about EC2 we need to prevent OSXSAVE to be written
> into CR4 on "older" Xen versions. But running on newer versions with
> the right support and features it would be nice not to cripple
> it all the time.
> 
> So this patch would replace the old hack. It hooks into the same function
> which would disable XSAVE/OSXSAVE when it would be manually forced off
> (by masking off the feature flags from the cpuid results).
> The change will enforce this when running on a Xen hypervisor before
> version 4 and only when running as a PV guest. I also added a small
> change to the banner printing code, so it is possible to see when the
> cpuid capping takes place.
> 
> I booted a Quantal kernel with that on two EC2 instances (and one
> even happened to be on an old 3.0.3-rc5 version of Xen). Of course
> neither of them seemed to claim the XSAVE feature in the first 
> place... And it would likely require a lot of time to accidentally
> hit a host that does, and none of my machines and home does either.
> 
> -Stefan
> 
> 
> From 78d874169b045410c41c0f3f2135743aa49d2d28 Mon Sep 17 00:00:00 2001
> From: Stefan Bader <stefan.bader@canonical.com>
> Date: Mon, 4 Jun 2012 19:12:51 +0200
> Subject: [PATCH 2/2] UBUNTU: SAUCE: Mask CR4 writes on older Xen hypervisors
> 
> Older Xen hypervisors (like RHEL5 versions found to be used
> on Amazon's EC2) did have a bug which would crash the domain
> when trying to write unsupported CR values.
> Newer versions do handle this correctly. But some probes (in
> particular one that tries to pass the OSXSAVE bit set) were
> causing pv-ops kernels to crash early on boot when running on EC2.
> 
> We were using a patch that did always mask off OSXSAVE when
> running under Xen PV but since this may limit performance for
> hypervisor/HW combinations that do support the AVX instructions,
> a slightly more targetted approach would be good.
> 
> This patch tries to mask off the XSAVE and OSXSAVE bits when
> running as a PV guest on a host prior Xen 4.
> 
> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
> ---
>  arch/x86/xen/enlighten.c |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index e74df95..7eaa415 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -207,6 +207,8 @@ static void __init xen_banner(void)
>  	printk(KERN_INFO "Xen version: %d.%d%s%s\n",
>  	       version >> 16, version & 0xffff, extra.extraversion,
>  	       xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " (preserve-AD)" : "");
> +	if (xen_pv_domain() && ((version >> 16) < 4))
> +		printk(KERN_INFO "Forcing xsave off due to Xen version.\n");
>  }
>  
>  static __read_mostly unsigned int cpuid_leaf1_edx_mask = ~0;
> @@ -329,6 +331,7 @@ static bool __init xen_check_mwait(void)
>  }
>  static void __init xen_init_cpuid_mask(void)
>  {
> +	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
>  	unsigned int ax, bx, cx, dx;
>  	unsigned int xsave_mask;
>  
> @@ -353,6 +356,16 @@ static void __init xen_init_cpuid_mask(void)
>  	/* Xen will set CR4.OSXSAVE if supported and not disabled by force */
>  	if ((cx & xsave_mask) != xsave_mask)
>  		cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
> +	/*
> +	 * This seems not to work with some early Xen versions when
> +	 * the cpuid information claims support but trying to write
> +	 * this into CR4 causes the pv guest to crash.
> +	 * Since we don't know the exact version, assume all versions
> +	 * prior Xen 4.x to be broken.
> +	 */
> +	if (xen_pv_domain() && ((version >> 16) < 4))
> +		cpuid_leaf1_ecx_mask &= ~xsave_mask;
> +
>  	if (xen_check_mwait())
>  		cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32));
>  }

Just a nit, how about making the version check a documented macro
instead of open coding '(version >> 16) < 4)' twice ?

Otherwise, ACK.

rtg

Patch

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index e74df95..7eaa415 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -207,6 +207,8 @@  static void __init xen_banner(void)
 	printk(KERN_INFO "Xen version: %d.%d%s%s\n",
 	       version >> 16, version & 0xffff, extra.extraversion,
 	       xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " (preserve-AD)" : "");
+	if (xen_pv_domain() && ((version >> 16) < 4))
+		printk(KERN_INFO "Forcing xsave off due to Xen version.\n");
 }
 
 static __read_mostly unsigned int cpuid_leaf1_edx_mask = ~0;
@@ -329,6 +331,7 @@  static bool __init xen_check_mwait(void)
 }
 static void __init xen_init_cpuid_mask(void)
 {
+	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
 	unsigned int ax, bx, cx, dx;
 	unsigned int xsave_mask;
 
@@ -353,6 +356,16 @@  static void __init xen_init_cpuid_mask(void)
 	/* Xen will set CR4.OSXSAVE if supported and not disabled by force */
 	if ((cx & xsave_mask) != xsave_mask)
 		cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
+	/*
+	 * This seems not to work with some early Xen versions when
+	 * the cpuid information claims support but trying to write
+	 * this into CR4 causes the pv guest to crash.
+	 * Since we don't know the exact version, assume all versions
+	 * prior Xen 4.x to be broken.
+	 */
+	if (xen_pv_domain() && ((version >> 16) < 4))
+		cpuid_leaf1_ecx_mask &= ~xsave_mask;
+
 	if (xen_check_mwait())
 		cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32));
 }