[2/2] x86: Fix Opteron xlevels
diff mbox

Message ID 1429655891-198219-2-git-send-email-agraf@suse.de
State New
Headers show

Commit Message

Alexander Graf April 21, 2015, 10:38 p.m. UTC
The AMD Opteron family has different xlevel levels depending on the
generation. I looked up Gen1, Gen2 and Gen3 hardware and adapted the
levels according to real silicon.

The reason this came up is that there is a sanity check in KVM making
sure that SVM is only used when xlevel is high enough. Using real
hardware levels, they now are.

Reported-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - add compat handling for machine types < 2.4
---
 include/hw/i386/pc.h | 13 +++++++++++++
 target-i386/cpu.c    |  6 +++---
 2 files changed, 16 insertions(+), 3 deletions(-)

Comments

Eduardo Habkost April 22, 2015, 3:53 p.m. UTC | #1
On Wed, Apr 22, 2015 at 12:38:11AM +0200, Alexander Graf wrote:
> The AMD Opteron family has different xlevel levels depending on the
> generation. I looked up Gen1, Gen2 and Gen3 hardware and adapted the
> levels according to real silicon.
> 
> The reason this came up is that there is a sanity check in KVM making
> sure that SVM is only used when xlevel is high enough. Using real
> hardware levels, they now are.
> 

As noted in the reply to the previous patch: if you increase xlevel you are now
reporting new CPUID leaves as available. I still don't see any reason to start
reporting new CPUID leaves as available if they are returning useless data
(that doesn't match real hardware).

For reference, here are the new CPUID leaves you are introducing (and will
return all-zeroes):

For Opteron_G[123]:
* CPUID Fn8000_0009 Reserved
* CPUID Fn8000_000A SVM Revision and Feature Identification (supported by QEMU)
* CPUID Fn8000_00[18:0B] Reserved

For Opteron_G3:
* CPUID Fn8000_0019_EAX TLB 1GB Page Identifiers
* CPUID Fn8000_0019_EBX TLB 1GB Page Identifiers
* CPUID Fn8000_0019_E[D,C]X Reserved
* CPUID Fn8000_001A_EAX Performance Optimization Identifiers
* CPUID Fn8000_001A_E[D,C,B]X Reserved

The reserved bits look safe, but we can't be 100% sure unless we confirm that
real hardware is returning zeroes on all the reserved bits. If real hardware
return some data, we don't know what exactly that data means (and what
all-zeroes would mean) until AMD documents it.

For L1 and L2 1GB page TLB info, all-zeroes means "TLB disabled".

The Performance Optimization Identifier leaf will now be available, and will
return 0 on the following bits:

* "MOVU: MOVU SSE (multimedia) instructions are more efficient and should be
  preferred to SSE (multimedia) MOVL/MOVH. MOVUPS is more efficient than
  MOVLPS/MOVHPS. MOVUPD is more efficient than MOVLPD/MOVHP";
* "FP128: 128-bit SSE (multimedia) instructions are executed with full-width
  internal operations and pipelines rather than decomposing them into internal
  64-bit suboperations. This may impact how soft- ware performs instruction
  selection and scheduling."

There's a simple way to avoid exposing additional useless information to
guests: just return 0x8000000A on xlevel. 0x8000000A is the CPUID leaf you need
for SVM, the others are unrelated to SVM.

> Reported-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> 
> ---
> 
> v1 -> v2:
> 
>   - add compat handling for machine types < 2.4
> ---
>  include/hw/i386/pc.h | 13 +++++++++++++
>  target-i386/cpu.c    |  6 +++---
>  2 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 05ac5f9..c3cfe18 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -296,6 +296,19 @@ int e820_get_num_entries(void);
>  bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
>  
>  #define PC_COMPAT_2_3 \
> +        {\
> +            .driver   = "Opteron_G1-" TYPE_X86_CPU,\
> +            .property = "xlevel",\
> +            .value    = stringify(0x80000008),\
> +        },{\
> +            .driver   = "Opteron_G2-" TYPE_X86_CPU,\
> +            .property = "xlevel",\
> +            .value    = stringify(0x80000008),\
> +        },{\
> +            .driver   = "Opteron_G3-" TYPE_X86_CPU,\
> +            .property = "xlevel",\
> +            .value    = stringify(0x80000008),\
> +        }
>  
>  #define PC_COMPAT_2_0 \
>          PC_COMPAT_2_3, \
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 03b33cf..d1b1b8c 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -1234,7 +1234,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
>              CPUID_EXT2_MTRR | CPUID_EXT2_SYSCALL | CPUID_EXT2_APIC |
>              CPUID_EXT2_CX8 | CPUID_EXT2_MCE | CPUID_EXT2_PAE | CPUID_EXT2_MSR |
>              CPUID_EXT2_TSC | CPUID_EXT2_PSE | CPUID_EXT2_DE | CPUID_EXT2_FPU,
> -        .xlevel = 0x80000008,
> +        .xlevel = 0x80000018,
>          .model_id = "AMD Opteron 240 (Gen 1 Class Opteron)",
>      },
>      {
> @@ -1262,7 +1262,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
>              CPUID_EXT2_DE | CPUID_EXT2_FPU,
>          .features[FEAT_8000_0001_ECX] =
>              CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
> -        .xlevel = 0x80000008,
> +        .xlevel = 0x80000018,
>          .model_id = "AMD Opteron 22xx (Gen 2 Class Opteron)",
>      },
>      {
> @@ -1292,7 +1292,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
>          .features[FEAT_8000_0001_ECX] =
>              CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A |
>              CPUID_EXT3_ABM | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
> -        .xlevel = 0x80000008,
> +        .xlevel = 0x8000001A,
>          .model_id = "AMD Opteron 23xx (Gen 3 Class Opteron)",
>      },
>      {
> -- 
> 1.7.12.4
>
Alexander Graf April 22, 2015, 5:19 p.m. UTC | #2
> Am 22.04.2015 um 17:53 schrieb Eduardo Habkost <ehabkost@redhat.com>:
> 
>> On Wed, Apr 22, 2015 at 12:38:11AM +0200, Alexander Graf wrote:
>> The AMD Opteron family has different xlevel levels depending on the
>> generation. I looked up Gen1, Gen2 and Gen3 hardware and adapted the
>> levels according to real silicon.
>> 
>> The reason this came up is that there is a sanity check in KVM making
>> sure that SVM is only used when xlevel is high enough. Using real
>> hardware levels, they now are.
> 
> As noted in the reply to the previous patch: if you increase xlevel you are now
> reporting new CPUID leaves as available. I still don't see any reason to start
> reporting new CPUID leaves as available if they are returning useless data
> (that doesn't match real hardware).
> 
> For reference, here are the new CPUID leaves you are introducing (and will
> return all-zeroes):
> 
> For Opteron_G[123]:
> * CPUID Fn8000_0009 Reserved
> * CPUID Fn8000_000A SVM Revision and Feature Identification (supported by QEMU)
> * CPUID Fn8000_00[18:0B] Reserved
> 
> For Opteron_G3:
> * CPUID Fn8000_0019_EAX TLB 1GB Page Identifiers
> * CPUID Fn8000_0019_EBX TLB 1GB Page Identifiers
> * CPUID Fn8000_0019_E[D,C]X Reserved
> * CPUID Fn8000_001A_EAX Performance Optimization Identifiers
> * CPUID Fn8000_001A_E[D,C,B]X Reserved
> 
> The reserved bits look safe, but we can't be 100% sure unless we confirm that
> real hardware is returning zeroes on all the reserved bits. If real hardware
> return some data, we don't know what exactly that data means (and what
> all-zeroes would mean) until AMD documents it.
> 
> For L1 and L2 1GB page TLB info, all-zeroes means "TLB disabled".
> 
> The Performance Optimization Identifier leaf will now be available, and will
> return 0 on the following bits:
> 
> * "MOVU: MOVU SSE (multimedia) instructions are more efficient and should be
>  preferred to SSE (multimedia) MOVL/MOVH. MOVUPS is more efficient than
>  MOVLPS/MOVHPS. MOVUPD is more efficient than MOVLPD/MOVHP";
> * "FP128: 128-bit SSE (multimedia) instructions are executed with full-width
>  internal operations and pipelines rather than decomposing them into internal
>  64-bit suboperations. This may impact how soft- ware performs instruction
>  selection and scheduling."
> 
> There's a simple way to avoid exposing additional useless information to
> guests: just return 0x8000000A on xlevel. 0x8000000A is the CPUID leaf you need
> for SVM, the others are unrelated to SVM.

I don't think creating a more Frankenstein cpu definition is superior over unset bits in the new leafs. How about I add support for the new leafs and fill the bits with the same information as real hardware?


Alex
Eduardo Habkost April 22, 2015, 6:03 p.m. UTC | #3
On Wed, Apr 22, 2015 at 07:19:22PM +0200, Alexander Graf wrote:
> 
> 
> > Am 22.04.2015 um 17:53 schrieb Eduardo Habkost <ehabkost@redhat.com>:
> > 
> >> On Wed, Apr 22, 2015 at 12:38:11AM +0200, Alexander Graf wrote:
> >> The AMD Opteron family has different xlevel levels depending on the
> >> generation. I looked up Gen1, Gen2 and Gen3 hardware and adapted the
> >> levels according to real silicon.
> >> 
> >> The reason this came up is that there is a sanity check in KVM making
> >> sure that SVM is only used when xlevel is high enough. Using real
> >> hardware levels, they now are.
> > 
> > As noted in the reply to the previous patch: if you increase xlevel you are now
> > reporting new CPUID leaves as available. I still don't see any reason to start
> > reporting new CPUID leaves as available if they are returning useless data
> > (that doesn't match real hardware).
> > 
> > For reference, here are the new CPUID leaves you are introducing (and will
> > return all-zeroes):
> > 
> > For Opteron_G[123]:
> > * CPUID Fn8000_0009 Reserved
> > * CPUID Fn8000_000A SVM Revision and Feature Identification (supported by QEMU)
> > * CPUID Fn8000_00[18:0B] Reserved
> > 
> > For Opteron_G3:
> > * CPUID Fn8000_0019_EAX TLB 1GB Page Identifiers
> > * CPUID Fn8000_0019_EBX TLB 1GB Page Identifiers
> > * CPUID Fn8000_0019_E[D,C]X Reserved
> > * CPUID Fn8000_001A_EAX Performance Optimization Identifiers
> > * CPUID Fn8000_001A_E[D,C,B]X Reserved
> > 
> > The reserved bits look safe, but we can't be 100% sure unless we confirm that
> > real hardware is returning zeroes on all the reserved bits. If real hardware
> > return some data, we don't know what exactly that data means (and what
> > all-zeroes would mean) until AMD documents it.
> > 
> > For L1 and L2 1GB page TLB info, all-zeroes means "TLB disabled".
> > 
> > The Performance Optimization Identifier leaf will now be available, and will
> > return 0 on the following bits:
> > 
> > * "MOVU: MOVU SSE (multimedia) instructions are more efficient and should be
> >  preferred to SSE (multimedia) MOVL/MOVH. MOVUPS is more efficient than
> >  MOVLPS/MOVHPS. MOVUPD is more efficient than MOVLPD/MOVHP";
> > * "FP128: 128-bit SSE (multimedia) instructions are executed with full-width
> >  internal operations and pipelines rather than decomposing them into internal
> >  64-bit suboperations. This may impact how soft- ware performs instruction
> >  selection and scheduling."
> > 
> > There's a simple way to avoid exposing additional useless information to
> > guests: just return 0x8000000A on xlevel. 0x8000000A is the CPUID leaf you need
> > for SVM, the others are unrelated to SVM.
> 
> I don't think creating a more Frankenstein cpu definition is superior
> over unset bits in the new leafs. How about I add support for the new
> leafs and fill the bits with the same information as real hardware?

I don't see why xlevel=0x8000000A is "more Frankenstein". You seem to be
worrying more about the xlevel value (which doesn't have any meaning
except for "you can run CPUID with EAX=<value>") than the actual CPUID
contents, but I don't understand why.

Both options don't match real hardware (and can be called
"Frakensteins"). One option just hides information that don't match real
hardware, and the other exposes more mismatching information (that can
actually confuse guests).

If you volunteer to add support to the new leafs, I won't have
objections. To be honest, after the analysis above I am more bothered
about having those new leafs being added without anything mentioning
them in the cpu_x86_cpuid() code (even if just code comments), than
about what guests will do when they see them.

BTW, leaving the "Frankensteinness" discussion apart, we have an
additional issue with the higher xlevel: even if it doesn't break
anything (I don't think it will), it will cause us trouble on the day we
decide to change the contents of those new leaves to match real hardware
(because then we are going to make a guest-visible change that requires
compatibility code). But this is already an issue on Opteron_G[45]. :(

Finally, considering that both solutions are equally "Frankensteins" in
my opinion, but you are the one who spent time writing the code:

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

Patch
diff mbox

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 05ac5f9..c3cfe18 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -296,6 +296,19 @@  int e820_get_num_entries(void);
 bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 
 #define PC_COMPAT_2_3 \
+        {\
+            .driver   = "Opteron_G1-" TYPE_X86_CPU,\
+            .property = "xlevel",\
+            .value    = stringify(0x80000008),\
+        },{\
+            .driver   = "Opteron_G2-" TYPE_X86_CPU,\
+            .property = "xlevel",\
+            .value    = stringify(0x80000008),\
+        },{\
+            .driver   = "Opteron_G3-" TYPE_X86_CPU,\
+            .property = "xlevel",\
+            .value    = stringify(0x80000008),\
+        }
 
 #define PC_COMPAT_2_0 \
         PC_COMPAT_2_3, \
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 03b33cf..d1b1b8c 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1234,7 +1234,7 @@  static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_EXT2_MTRR | CPUID_EXT2_SYSCALL | CPUID_EXT2_APIC |
             CPUID_EXT2_CX8 | CPUID_EXT2_MCE | CPUID_EXT2_PAE | CPUID_EXT2_MSR |
             CPUID_EXT2_TSC | CPUID_EXT2_PSE | CPUID_EXT2_DE | CPUID_EXT2_FPU,
-        .xlevel = 0x80000008,
+        .xlevel = 0x80000018,
         .model_id = "AMD Opteron 240 (Gen 1 Class Opteron)",
     },
     {
@@ -1262,7 +1262,7 @@  static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_EXT2_DE | CPUID_EXT2_FPU,
         .features[FEAT_8000_0001_ECX] =
             CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
-        .xlevel = 0x80000008,
+        .xlevel = 0x80000018,
         .model_id = "AMD Opteron 22xx (Gen 2 Class Opteron)",
     },
     {
@@ -1292,7 +1292,7 @@  static X86CPUDefinition builtin_x86_defs[] = {
         .features[FEAT_8000_0001_ECX] =
             CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A |
             CPUID_EXT3_ABM | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
-        .xlevel = 0x80000008,
+        .xlevel = 0x8000001A,
         .model_id = "AMD Opteron 23xx (Gen 3 Class Opteron)",
     },
     {