[uq/master,1/2] x86: fix migration from pre-version 12

Message ID 1378386382-415-2-git-send-email-pbonzini@redhat.com
State New

Commit Message

Paolo Bonzini Sept. 5, 2013, 1:06 p.m. UTC
On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
and not restore anything.

Since FP and SSE data are always valid, set them in xstate_bv at reset
time.  In fact, that value is the same that KVM_GET_XSAVE returns on
pre-XSAVE hosts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target-i386/cpu.c | 1 +
 target-i386/cpu.h | 5 +++++
 2 files changed, 6 insertions(+)

Comments

Gleb Natapov Sept. 8, 2013, 11:40 a.m. UTC | #1
On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
> and not restore anything.
> 
XRSTOR restores FP/SSE state to reset state if no bits are set in
xstate_bv. This is what should happen on reset, no?

> Since FP and SSE data are always valid, set them in xstate_bv at reset
> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
> pre-XSAVE hosts.
It is needed for migration from a non-XSAVE host to an XSAVE host.

> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target-i386/cpu.c | 1 +
>  target-i386/cpu.h | 5 +++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index c36345e..ac83106 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
>      env->fpuc = 0x37f;
>  
>      env->mxcsr = 0x1f80;
> +    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
>  
>      env->pat = 0x0007040600070406ULL;
>      env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 5723eff..a153078 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -380,6 +380,11 @@
>  
>  #define MSR_VM_HSAVE_PA                 0xc0010117
>  
> +#define XSTATE_SUPPORTED		(XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
Supported by whom? By QEMU? We should filter unsupported bits from CPUID.0D then too.

> +#define XSTATE_FP			1
> +#define XSTATE_SSE			2
> +#define XSTATE_YMM			4
> +
>  /* CPUID feature words */
>  typedef enum FeatureWord {
>      FEAT_1_EDX,         /* CPUID[1].EDX */
> -- 
> 1.8.3.1
> 

--
			Gleb.
Paolo Bonzini Sept. 9, 2013, 8:31 a.m. UTC | #2
Il 08/09/2013 13:40, Gleb Natapov ha scritto:
> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
>> and not restore anything.
>>
> XRSTOR restores FP/SSE state to reset state if no bits are set in
> xstate_bv. This is what should happen on reset, no?

Yes. The problem happens on the migration destination when XSAVE data is
not transmitted.  FP/SSE data is transmitted and must be restored, but
xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
state.  The vcpu then loses the values that were set in the migration data.
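
To make the sequence concrete, here is a minimal sketch of what happens on
the destination. It is a toy model, not QEMU or KVM code: the toy_* names,
the struct, and the use of the x87 control word as the only FP field are
assumptions made for illustration. The restore step follows the XRSTOR rule
discussed above, where a component whose xstate_bv bit is clear goes back to
its reset value.

```c
#include <assert.h>
#include <stdint.h>

#define XSTATE_FP   1u
#define XSTATE_SSE  2u
#define FPUC_RESET  0x37fu   /* x87 control word after reset */

/* Hypothetical stand-in for the vcpu state held by userspace. */
struct toy_cpu {
    uint64_t xstate_bv;
    uint16_t fpuc;
};

/* Reset; reset_bv is 0 before the patch and XSTATE_FP|XSTATE_SSE after. */
static void toy_reset(struct toy_cpu *c, uint64_t reset_bv)
{
    c->fpuc = FPUC_RESET;
    c->xstate_bv = reset_bv;
}

/* Incoming stream without XSAVE data: carries fpuc, leaves xstate_bv alone. */
static void toy_load_old_stream(struct toy_cpu *c, uint16_t migrated_fpuc)
{
    c->fpuc = migrated_fpuc;
}

/* XRSTOR-like restore: a component whose bit is clear is reinitialized. */
static uint16_t toy_restore_fpuc(const struct toy_cpu *c)
{
    return (c->xstate_bv & XSTATE_FP) ? c->fpuc : FPUC_RESET;
}

/* Whole destination-side sequence; returns the fpuc the guest ends up with. */
static uint16_t toy_migrate(uint64_t reset_bv, uint16_t migrated_fpuc)
{
    struct toy_cpu c;

    /* Only the two legacy components are modeled here. */
    assert((reset_bv & ~(uint64_t)(XSTATE_FP | XSTATE_SSE)) == 0);
    toy_reset(&c, reset_bv);
    toy_load_old_stream(&c, migrated_fpuc);
    return toy_restore_fpuc(&c);
}
```

With reset_bv equal to 0 the migrated control word is replaced by the reset
value; with XSTATE_FP | XSTATE_SSE it survives, which is the effect the patch
is after.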

>> Since FP and SSE data are always valid, set them in xstate_bv at reset
>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
>> pre-XSAVE hosts.
> It is needed for migration between non xsave host to xsave host.

Yes, and this patch does the same for migration between non-XSAVE QEMU
and XSAVE QEMU.

In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
xstate_bv when XSAVE is not available.  Instead, it should reset the
FXSAVE data to processor-reset values (except for MXCSR which always
comes from XRSTOR data), i.e. to all-zeros except for the x87 control
and tag words.  It should also check reserved bits of MXCSR.

>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  target-i386/cpu.c | 1 +
>>  target-i386/cpu.h | 5 +++++
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
>> index c36345e..ac83106 100644
>> --- a/target-i386/cpu.c
>> +++ b/target-i386/cpu.c
>> @@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
>>      env->fpuc = 0x37f;
>>  
>>      env->mxcsr = 0x1f80;
>> +    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
>>  
>>      env->pat = 0x0007040600070406ULL;
>>      env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
>> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>> index 5723eff..a153078 100644
>> --- a/target-i386/cpu.h
>> +++ b/target-i386/cpu.h
>> @@ -380,6 +380,11 @@
>>  
>>  #define MSR_VM_HSAVE_PA                 0xc0010117
>>  
>> +#define XSTATE_SUPPORTED		(XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
> Supported by whom? By QEMU? We should filter unsupported bits from CPUID.0D then too.

Yes.  QEMU unmarshals information from the XSAVE region and back, so it
cannot support MPX or AVX-512 yet (even if KVM did).  Separate bug, though.

Paolo

> 
>> +#define XSTATE_FP			1
>> +#define XSTATE_SSE			2
>> +#define XSTATE_YMM			4
>> +
>>  /* CPUID feature words */
>>  typedef enum FeatureWord {
>>      FEAT_1_EDX,         /* CPUID[1].EDX */
>> -- 
>> 1.8.3.1
>>
> 
> --
> 			Gleb.
>
Gleb Natapov Sept. 9, 2013, 9:03 a.m. UTC | #3
On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
> Il 08/09/2013 13:40, Gleb Natapov ha scritto:
> > On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
> >> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
> >> and not restore anything.
> >>
> > XRSTOR restores FP/SSE state to reset state if no bits are set in
> > xstate_bv. This is what should happen on reset, no?
> 
> Yes. The problem happens on the migration destination when XSAVE data is
> not transmitted.  FP/SSE data is transmitted and must be restored, but
> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
> state.  The vcpu then loses the values that were set in the migration data.
> 
> >> Since FP and SSE data are always valid, set them in xstate_bv at reset
> >> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
> >> pre-XSAVE hosts.
> > It is needed for migration between non xsave host to xsave host.
> 
> Yes, and this patch does the same for migration between non-XSAVE QEMU
> and XSAVE QEMU.
> 
Can such migration happen? The commit that added xsave support
(f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.

> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
> xstate_bv when XSAVE is not available.  Instead, it should reset the
> FXSAVE data to processor-reset values (except for MXCSR which always
> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
> and tag words.  It should also check reserved bits of MXCSR.
> 
I do not see why.

> >>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >>  target-i386/cpu.c | 1 +
> >>  target-i386/cpu.h | 5 +++++
> >>  2 files changed, 6 insertions(+)
> >>
> >> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> >> index c36345e..ac83106 100644
> >> --- a/target-i386/cpu.c
> >> +++ b/target-i386/cpu.c
> >> @@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
> >>      env->fpuc = 0x37f;
> >>  
> >>      env->mxcsr = 0x1f80;
> >> +    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
> >>  
> >>      env->pat = 0x0007040600070406ULL;
> >>      env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
> >> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> >> index 5723eff..a153078 100644
> >> --- a/target-i386/cpu.h
> >> +++ b/target-i386/cpu.h
> >> @@ -380,6 +380,11 @@
> >>  
> >>  #define MSR_VM_HSAVE_PA                 0xc0010117
> >>  
> >> +#define XSTATE_SUPPORTED		(XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
> > > Supported by whom? By QEMU? We should filter unsupported bits from CPUID.0D then too.
> 
> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
> 
IMO this is the main issue here, not a separate bug. If we are going to let
the guest use CPU state that QEMU does not support, we are going to have a
bad time.

--
			Gleb.
Paolo Bonzini Sept. 9, 2013, 9:53 a.m. UTC | #4
Il 09/09/2013 11:03, Gleb Natapov ha scritto:
> On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
>> Il 08/09/2013 13:40, Gleb Natapov ha scritto:
>>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
>>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
>>>> and not restore anything.
>>>>
>>> XRSTOR restores FP/SSE state to reset state if no bits are set in
>>> xstate_bv. This is what should happen on reset, no?
>>
>> Yes. The problem happens on the migration destination when XSAVE data is
>> not transmitted.  FP/SSE data is transmitted and must be restored, but
>> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
>> state.  The vcpu then loses the values that were set in the migration data.
>>
>>>> Since FP and SSE data are always valid, set them in xstate_bv at reset
>>>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
>>>> pre-XSAVE hosts.
>>> It is needed for migration between non xsave host to xsave host.
>>
>> Yes, and this patch does the same for migration between non-XSAVE QEMU
>> and XSAVE QEMU.
>>
> Can such migration happen? The commit that added xsave support
> (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.

Yes, old->new migration can happen.  New->old of course cannot.

>> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
>> xstate_bv when XSAVE is not available.  Instead, it should reset the
>> FXSAVE data to processor-reset values (except for MXCSR which always
>> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
>> and tag words.  It should also check reserved bits of MXCSR.
>
> I do not see why.

Because otherwise it behaves in a subtly different manner for XSAVE and
non-XSAVE hosts.

>> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
>> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
>>
> IMO this is the main issue here, not separate bug. If we gonna let guest
> use CPU state QEMU does not support we gonna have a bad time.

We cannot force the guest not to use a feature; all we can do is hide
the CPUID bits so that a well-behaved guest will not use it.  QEMU does
hide CPUID bits for non-supported XSAVE states, except for "-cpu host".
So this will not be a problem except with "-cpu host".
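
As a rough sketch of what hiding means for the 0xD leaf: sub-leaf 0 reports
the valid XCR0 bits in EDX:EAX, so the filtering amounts to masking that
value with the set of components userspace can marshal. The XSTATE_* values
are the ones from the patch; toy_filter_xcr0_mask and toy_selfcheck are
hypothetical names, not existing QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define XSTATE_FP         1u
#define XSTATE_SSE        2u
#define XSTATE_YMM        4u
#define XSTATE_SUPPORTED  (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)

/*
 * CPUID.(EAX=0DH,ECX=0) reports the valid XCR0 bits in EDX:EAX.
 * Keep only the components that userspace knows how to marshal.
 */
static uint64_t toy_filter_xcr0_mask(uint32_t host_eax, uint32_t host_edx)
{
    uint64_t host_mask = ((uint64_t)host_edx << 32) | host_eax;

    return host_mask & XSTATE_SUPPORTED;
}

static void toy_selfcheck(void)
{
    /* Host with FP, SSE, YMM and the two MPX components (bits 3 and 4). */
    assert(toy_filter_xcr0_mask(0x1f, 0) == 0x7);
    /* Host without AVX: masking never adds a bit the host lacks. */
    assert(toy_filter_xcr0_mask(0x3, 0) == 0x3);
}
```

A real filter would also have to recompute the save-area sizes that the same
leaf reports; the sketch covers only the bitmask.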

Paolo
Gleb Natapov Sept. 9, 2013, 10:54 a.m. UTC | #5
On Mon, Sep 09, 2013 at 11:53:45AM +0200, Paolo Bonzini wrote:
> Il 09/09/2013 11:03, Gleb Natapov ha scritto:
> > On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
> >> Il 08/09/2013 13:40, Gleb Natapov ha scritto:
> >>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
> >>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
> >>>> and not restore anything.
> >>>>
> >>> XRSTOR restores FP/SSE state to reset state if no bits are set in
> >>> xstate_bv. This is what should happen on reset, no?
> >>
> >> Yes. The problem happens on the migration destination when XSAVE data is
> >> not transmitted.  FP/SSE data is transmitted and must be restored, but
> >> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
> >> state.  The vcpu then loses the values that were set in the migration data.
> >>
> >>>> Since FP and SSE data are always valid, set them in xstate_bv at reset
> >>>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
> >>>> pre-XSAVE hosts.
> >>> It is needed for migration between non xsave host to xsave host.
> >>
> >> Yes, and this patch does the same for migration between non-XSAVE QEMU
> >> and XSAVE QEMU.
> >>
> > Can such migration happen? The commit that added xsave support
> > (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.
> 
> Yes, old->new migration can happen.  New->old of course cannot.
> 
I see. I am fine with the patch, but please drop defines that are not
used in the patch itself.

> >> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
> >> xstate_bv when XSAVE is not available.  Instead, it should reset the
> >> FXSAVE data to processor-reset values (except for MXCSR which always
> >> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
> >> and tag words.  It should also check reserved bits of MXCSR.
> >
> > I do not see why.
> 
> Because otherwise it behaves in a subtly different manner for XSAVE and
> non-XSAVE hosts.
I do not see how. Can you elaborate?

> 
> >> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
> >> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
> >>
> > IMO this is the main issue here, not separate bug. If we gonna let guest
> > use CPU state QEMU does not support we gonna have a bad time.
> 
> We cannot force the guest not to use a feature; all we can do is hide
Of course we can't; this is true for other features too, but this is
the guest's problem.

> the CPUID bits so that a well-behaved guest will not use it.  QEMU does
> hide CPUID bits for non-supported XSAVE states, except for "-cpu host".
>  So this will not be a problem except with "-cpu host".
> 

--
			Gleb.
Gleb Natapov Sept. 9, 2013, 10:58 a.m. UTC | #6
On Mon, Sep 09, 2013 at 01:54:50PM +0300, Gleb Natapov wrote:
> On Mon, Sep 09, 2013 at 11:53:45AM +0200, Paolo Bonzini wrote:
> > Il 09/09/2013 11:03, Gleb Natapov ha scritto:
> > > On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
> > >> Il 08/09/2013 13:40, Gleb Natapov ha scritto:
> > >>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
> > >>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
> > >>>> and not restore anything.
> > >>>>
> > >>> XRSTOR restores FP/SSE state to reset state if no bits are set in
> > >>> xstate_bv. This is what should happen on reset, no?
> > >>
> > >> Yes. The problem happens on the migration destination when XSAVE data is
> > >> not transmitted.  FP/SSE data is transmitted and must be restored, but
> > >> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
> > >> state.  The vcpu then loses the values that were set in the migration data.
> > >>
> > >>>> Since FP and SSE data are always valid, set them in xstate_bv at reset
> > >>>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
> > >>>> pre-XSAVE hosts.
> > >>> It is needed for migration between non xsave host to xsave host.
> > >>
> > >> Yes, and this patch does the same for migration between non-XSAVE QEMU
> > >> and XSAVE QEMU.
> > >>
> > > Can such migration happen? The commit that added xsave support
> > > (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.
> > 
> > Yes, old->new migration can happen.  New->old of course cannot.
> > 
> I see. I am fine with the patch, but please drop defines that are not
> used in the patch itself.
> 
BTW, a migration question: will xstate_bv not be zeroed by the migration
code in the old->new case?
 
--
			Gleb.
Paolo Bonzini Sept. 9, 2013, 11:07 a.m. UTC | #7
Il 09/09/2013 12:54, Gleb Natapov ha scritto:
> On Mon, Sep 09, 2013 at 11:53:45AM +0200, Paolo Bonzini wrote:
>> Il 09/09/2013 11:03, Gleb Natapov ha scritto:
>>> On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
>>>> Il 08/09/2013 13:40, Gleb Natapov ha scritto:
>>>>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
>>>>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
>>>>>> and not restore anything.
>>>>>>
>>>>> XRSTOR restores FP/SSE state to reset state if no bits are set in
>>>>> xstate_bv. This is what should happen on reset, no?
>>>>
>>>> Yes. The problem happens on the migration destination when XSAVE data is
>>>> not transmitted.  FP/SSE data is transmitted and must be restored, but
>>>> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
>>>> state.  The vcpu then loses the values that were set in the migration data.
>>>>
>>>>>> Since FP and SSE data are always valid, set them in xstate_bv at reset
>>>>>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
>>>>>> pre-XSAVE hosts.
>>>>> It is needed for migration between non xsave host to xsave host.
>>>>
>>>> Yes, and this patch does the same for migration between non-XSAVE QEMU
>>>> and XSAVE QEMU.
>>>>
>>> Can such migration happen? The commit that added xsave support
>>> (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.
>>
>> Yes, old->new migration can happen.  New->old of course cannot.
>>
> I see. I am fine with the patch, but please drop defines that are not
> used in the patch itself.

Ok.

(For the "BTW" question: xstate_bv will not be zeroed, it will remain at
the default value).

>>>> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
>>>> xstate_bv when XSAVE is not available.  Instead, it should reset the
>>>> FXSAVE data to processor-reset values (except for MXCSR which always
>>>> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
>>>> and tag words.  It should also check reserved bits of MXCSR.
>>>
>>> I do not see why.
>>
>> Because otherwise it behaves in a subtly different manner for XSAVE and
>> non-XSAVE hosts.
> 
> I do not see how. Can you elaborate?

Suppose userspace calls KVM_SET_XSAVE with XSTATE_BV=0.

On an XSAVE host, when the guest FPU state is loaded KVM will do an
XRSTOR.  The XRSTOR will restore the FPU state to default values.

On a non-XSAVE host, when the guest FPU state is loaded KVM will do an
FXRSTOR.  The FXRSTOR will load the FPU state from the first 512 bytes of
the block that was passed to KVM_SET_XSAVE.

This is not a problem because userspace will usually pass to
KVM_SET_XSAVE only something that it got from KVM_GET_XSAVE, and
KVM_GET_XSAVE will never set XSTATE_BV=0.  However, KVM_SET_XSAVE is
supposed to emulate XSAVE/XRSTOR if it is not available, and it is
failing to emulate this detail.
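
A small model of the mismatch, under the same caveat that this is an
illustration and not the kernel code: toy_loaded_fpuc, toy_hosts_disagree
and toy_selfcheck are made-up names, and the x87 control word stands in for
the whole legacy area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FP   1u
#define FPUC_RESET  0x37fu   /* x87 control word after reset */

/*
 * Control word the guest ends up with after KVM_SET_XSAVE, given the
 * xstate_bv and the fpuc stored in the buffer passed by userspace.
 */
static uint16_t toy_loaded_fpuc(bool host_has_xsave, uint64_t xstate_bv,
                                uint16_t buf_fpuc)
{
    if (host_has_xsave) {
        /* XRSTOR path: a clear FP bit means "use the initial value". */
        return (xstate_bv & XSTATE_FP) ? buf_fpuc : FPUC_RESET;
    }
    /* Legacy restore path: the 512-byte area is loaded unconditionally. */
    return buf_fpuc;
}

/* True when the two kinds of host would load different values. */
static bool toy_hosts_disagree(uint64_t xstate_bv, uint16_t buf_fpuc)
{
    return toy_loaded_fpuc(true, xstate_bv, buf_fpuc) !=
           toy_loaded_fpuc(false, xstate_bv, buf_fpuc);
}

static void toy_selfcheck(void)
{
    /* XSTATE_BV=0 with non-reset data in the buffer: the hosts differ. */
    assert(toy_hosts_disagree(0, 0x27f));
    /* With the FP bit set, both hosts load the buffer contents. */
    assert(!toy_hosts_disagree(XSTATE_FP, 0x27f));
}
```

The two paths only diverge when XSTATE_BV has the FP bit clear and the buffer
holds something other than the reset value.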

>>>> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
>>>> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
>>>>
>>> IMO this is the main issue here, not separate bug. If we gonna let guest
>>> use CPU state QEMU does not support we gonna have a bad time.
>>
>> We cannot force the guest not to use a feature; all we can do is hide
> 
> Of course we can't, this is correct for other features too, but this is
> guest's problem.

Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
always be "fresh" as long as the guest obeys CPUID bits it receives, and
the CPUID bits that QEMU passes will never enable XSAVE data that QEMU
does not support.

Paolo
Gleb Natapov Sept. 9, 2013, 11:28 a.m. UTC | #8
On Mon, Sep 09, 2013 at 01:07:37PM +0200, Paolo Bonzini wrote:
> >>>> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
> >>>> xstate_bv when XSAVE is not available.  Instead, it should reset the
> >>>> FXSAVE data to processor-reset values (except for MXCSR which always
> >>>> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
> >>>> and tag words.  It should also check reserved bits of MXCSR.
> >>>
> >>> I do not see why.
> >>
> >> Because otherwise it behaves in a subtly different manner for XSAVE and
> >> non-XSAVE hosts.
> > 
> > I do not see how. Can you elaborate?
> 
> Suppose userspace calls KVM_SET_XSAVE with XSTATE_BV=0.
> 
> On an XSAVE host, when the guest FPU state is loaded KVM will do an
> XRSTOR.  The XRSTOR will restore the FPU state to default values.
> 
> On a non-XSAVE host, when the guest FPU state is loaded KVM will do an
> FXRSTR.  The FXRSTR will load the FPU state from the first 512 bytes of
> the block that was passed to KVM_SET_XSAVE.
> 
> This is not a problem because userspace will usually pass to
> KVM_SET_XSAVE only something that it got from KVM_GET_XSAVE, and
> KVM_GET_XSAVE will never set XSTATE_BV=0.  However, KVM_SET_XSAVE is
> supposed to emulate XSAVE/XRSTOR if it is not available, and it is
> failing to emulate this detail.
> 
You are trying to be bug-for-bug compatible :) XSTATE_BV can be zero only
if the FPU state is the reset state; otherwise the guest will not survive.
KVM_SET_XSAVE is not supposed to emulate XSAVE/XRSTOR; it is not an emulator
function. It would be better to outlaw a zero value for XSTATE_BV altogether,
but we cannot do that because current QEMU uses it.

> >>>> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
> >>>> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
> >>>>
> >>> IMO this is the main issue here, not separate bug. If we gonna let guest
> >>> use CPU state QEMU does not support we gonna have a bad time.
> >>
> >> We cannot force the guest not to use a feature; all we can do is hide
> > 
> > Of course we can't, this is correct for other features too, but this is
> > guest's problem.
> 
> Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
Which problem exactly? The problems I see are: 1. We do not support
MPX and AVX-512 (but this is probably not the problem you meant :)). 2. The
0D data is not consistent with the features. The guest may not expect that
and do stupid things.

> always be "fresh" as long as the guest obeys CPUID bits it receives, and
> the CPUID bits that QEMU passes will never enable XSAVE data that QEMU
> does not support.
> 

--
			Gleb.
Paolo Bonzini Sept. 9, 2013, 11:46 a.m. UTC | #9
Il 09/09/2013 13:28, Gleb Natapov ha scritto:
>> On an XSAVE host, when the guest FPU state is loaded KVM will do an
>> XRSTOR.  The XRSTOR will restore the FPU state to default values.
>>
>> On a non-XSAVE host, when the guest FPU state is loaded KVM will do an
>> FXRSTR.  The FXRSTR will load the FPU state from the first 512 bytes of
>> the block that was passed to KVM_SET_XSAVE.
>>
>> This is not a problem because userspace will usually pass to
>> KVM_SET_XSAVE only something that it got from KVM_GET_XSAVE, and
>> KVM_GET_XSAVE will never set XSTATE_BV=0.  However, KVM_SET_XSAVE is
>> supposed to emulate XSAVE/XRSTOR if it is not available, and it is
>> failing to emulate this detail.
>>
> You are trying to be bug to bug compatible :) XSTATE_BV can be zero only
> if FPU state is reset one, otherwise the guest will not survive.

Yes.

> KVM_SET_XSAVE
> is not suppose to emulate XSAVE/XRSTOR, it is not emulator function. It
> is better to outlaw zero value for XSTATE_BV at all, but we cannot do it
> because current QEMU uses it.

I agree it'd be better to forbid it.  If the mismatch in semantics does
not bother you, I won't fix it.  It slightly bothers me. :)

>>>>>> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
>>>>>> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
>>>>>>
>>>>> IMO this is the main issue here, not separate bug. If we gonna let guest
>>>>> use CPU state QEMU does not support we gonna have a bad time.
>>>>
>>>> We cannot force the guest not to use a feature; all we can do is hide
>>>
>>> Of course we can't, this is correct for other features too, but this is
>>> guest's problem.
>>
>> Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
> 
> Which problem exactly. The problems I see is that 1. We do not support
> MPX and AVX-512 (but this is probably not the problem you meant :)) 2. 0D
> data is not consistent with features. Guest may not expect it and do stupid
> things.

It is not a problem to unmarshal information out of KVM_GET_XSAVE data
(and back).  If the guest does stupid things, it's a bug in an
ill-behaving guest.

On the other hand, I agree that passthrough of host 0xD data is bad and
will fix it.

Paolo

>> always be "fresh" as long as the guest obeys CPUID bits it receives, and
>> the CPUID bits that QEMU passes will never enable XSAVE data that QEMU
>> does not support.
>>
> 
> --
> 			Gleb.
>
Gleb Natapov Sept. 9, 2013, noon UTC | #10
On Mon, Sep 09, 2013 at 01:46:49PM +0200, Paolo Bonzini wrote:
> >>>>>> Yes.  QEMU unmarshals information from the XSAVE region and back, so it
> >>>>>> cannot support MPX or AVX-512 yet (even if KVM were).  Separate bug, though.
> >>>>>>
> >>>>> IMO this is the main issue here, not separate bug. If we gonna let guest
> >>>>> use CPU state QEMU does not support we gonna have a bad time.
> >>>>
> >>>> We cannot force the guest not to use a feature; all we can do is hide
> >>>
> >>> Of course we can't, this is correct for other features too, but this is
> >>> guest's problem.
> >>
> >> Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
> > 
> > Which problem exactly. The problems I see is that 1. We do not support
> > MPX and AVX-512 (but this is probably not the problem you meant :)) 2. 0D
> > data is not consistent with features. Guest may not expect it and do stupid
> > things.
> 
> It is not a problem to unmarshal information out of KVM_GET_XSAVE data
> (and back).  If the guest does stupid things, it's a bug in an
> ill-behaving guest.
> 
You know I am first in line to blame the guest for everything :) (who needs
guests anyway), but in this case I didn't mean that the guest does something
illegal. If we advertise support for some XSAVE state in the 0D leaf, the
guest is within its rights to draw conclusions from that which we may not
expect. It may check the corresponding feature bit and crash if it is not
present, for instance.

> On the other hand, I agree that passthrough of host 0xD data is bad and
> will fix it.
> 
Thanks!

--
			Gleb.
Patch

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index c36345e..ac83106 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2386,6 +2386,7 @@  static void x86_cpu_reset(CPUState *s)
     env->fpuc = 0x37f;
 
     env->mxcsr = 0x1f80;
+    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
 
     env->pat = 0x0007040600070406ULL;
     env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 5723eff..a153078 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,6 +380,11 @@ 
 
 #define MSR_VM_HSAVE_PA                 0xc0010117
 
+#define XSTATE_SUPPORTED		(XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
+#define XSTATE_FP			1
+#define XSTATE_SSE			2
+#define XSTATE_YMM			4
+
 /* CPUID feature words */
 typedef enum FeatureWord {
     FEAT_1_EDX,         /* CPUID[1].EDX */