mbox series

[SRU,J,0/2] A general-proteciton exception during guest migration to unsupported PKRU machine

Message ID 20240316034344.17515-1-chengen.du@canonical.com
Headers show
Series A general-proteciton exception during guest migration to unsupported PKRU machine | expand

Message

Chengen Du March 16, 2024, 3:43 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2032164

SRU Justification:

[Impact]
When a host that supports PKRU initiates a guest that lacks PKRU support, the flag is enabled on the guest's fpstate.
This information is then passed to userspace through the vcpu ioctl KVM_GET_XSAVE.
However, a problem arises when the user opts to migrate the mentioned guest to another machine that does not support PKRU.
In this scenario, the new host attempts to restore the guest's fpu registers.
Nevertheless, due to the absence of PKRU support on the new host, a general-protection exception takes place, leading to a guest crash.

[Fix]
The problem is resolved by the following upstream commit:
18164f66e6c5 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
8647c52e9504 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}

[Test Plan]
Several scenarios need to be conducted to confirm the migration outcome.
	Patched kernel with PKRU -> kernel with PKRU
	Patched kernel with PKRU -> kernel without PKRU
	Patched kernel without PKRU -> kernel with PKRU
	Patched kernel without PKRU -> kernel without PKRU
	Kernel with PKRU -> patched kernel with PKRU
	Kernel with PKRU -> patched kernel without PKRU
	Kernel without PKRU -> patched kernel with PKRU
	Kernel without PKRU -> patched kernel without PKRU
	Patched kernel with PKRU -> patched kernel without PKRU

Each scenarios shall succeed except "Kernel with PKRU -> patched kernel without PKRU" one.
Addressing this case poses challenges because the most plausible solution is to clamp the FPU features at the destination during migration.
However, upstream does not support this approach due to concerns about silently dropping features requested by userspace.
This could potentially lead to other issues and violate KVM's ABI.

[Where problems could occur]
The introduced commits will impact the guest migration process,
potentially leading to failures and preventing the guest from operating successfully on the migration destination.

Sean Christopherson (2):
  x86/fpu: Allow caller to constrain xfeatures when copying to uabi
    buffer
  KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}

 arch/x86/include/asm/fpu/api.h |  3 ++-
 arch/x86/kernel/fpu/core.c     |  5 +++--
 arch/x86/kernel/fpu/xstate.c   |  7 +++++--
 arch/x86/kernel/fpu/xstate.h   |  3 ++-
 arch/x86/kvm/x86.c             | 33 +++++++++++++++++++++++++++------
 5 files changed, 39 insertions(+), 12 deletions(-)

Comments

Stefan Bader March 26, 2024, 8:34 a.m. UTC | #1
On 16.03.24 04:43, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2032164
> 
> SRU Justification:
> 
> [Impact]
> When a host that supports PKRU initiates a guest that lacks PKRU support, the flag is enabled on the guest's fpstate.
> This information is then passed to userspace through the vcpu ioctl KVM_GET_XSAVE.
> However, a problem arises when the user opts to migrate the mentioned guest to another machine that does not support PKRU.
> In this scenario, the new host attempts to restore the guest's fpu registers.
> Nevertheless, due to the absence of PKRU support on the new host, a general-protection exception takes place, leading to a guest crash.
> 
> [Fix]
> The problem is resolved by the following upstream commit:
> 18164f66e6c5 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
> 8647c52e9504 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> 
> [Test Plan]
> Several scenarios need to be conducted to confirm the migration outcome.
> 	Patched kernel with PKRU -> kernel with PKRU
> 	Patched kernel with PKRU -> kernel without PKRU
> 	Patched kernel without PKRU -> kernel with PKRU
> 	Patched kernel without PKRU -> kernel without PKRU
> 	Kernel with PKRU -> patched kernel with PKRU
> 	Kernel with PKRU -> patched kernel without PKRU
> 	Kernel without PKRU -> patched kernel with PKRU
> 	Kernel without PKRU -> patched kernel without PKRU
> 	Patched kernel with PKRU -> patched kernel without PKRU
> 
> Each scenarios shall succeed except "Kernel with PKRU -> patched kernel without PKRU" one.
> Addressing this case poses challenges because the most plausible solution is to clamp the FPU features at the destination during migration.
> However, upstream does not support this approach due to concerns about silently dropping features requested by userspace.
> This could potentially lead to other issues and violate KVM's ABI.
> 
> [Where problems could occur]
> The introduced commits will impact the guest migration process,
> potentially leading to failures and preventing the guest from operating successfully on the migration destination.
> 
> Sean Christopherson (2):
>    x86/fpu: Allow caller to constrain xfeatures when copying to uabi
>      buffer
>    KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> 
>   arch/x86/include/asm/fpu/api.h |  3 ++-
>   arch/x86/kernel/fpu/core.c     |  5 +++--
>   arch/x86/kernel/fpu/xstate.c   |  7 +++++--
>   arch/x86/kernel/fpu/xstate.h   |  3 ++-
>   arch/x86/kvm/x86.c             | 33 +++++++++++++++++++++++++++------
>   5 files changed, 39 insertions(+), 12 deletions(-)
> 
Given the complexity of the changes with parts dropped from the upstream 
patches because other changes were not done, it would be better to have 
tested the outcome before applying to a cycle. Has this been done?
Chengen Du March 27, 2024, 1:59 a.m. UTC | #2
Hi,

We have prepared a tested PPA [1] for the support team and the
customer to evaluate.
The test results from the support team meet our expectations [2], and
the customer is also satisfied with the outcome.
Furthermore, I have carefully reviewed the patch against the content
of the tested PPA, and it aligns with our expectations.

Best regards,
Chengen Du

[1] https://launchpad.net/~chengendu/+archive/ubuntu/sf00364034-test
[2] https://pastebin.canonical.com/p/Hgx8hjzqky/plain/

On Tue, Mar 26, 2024 at 4:34 PM Stefan Bader <stefan.bader@canonical.com> wrote:
>
> On 16.03.24 04:43, Chengen Du wrote:
> > BugLink: https://bugs.launchpad.net/bugs/2032164
> >
> > SRU Justification:
> >
> > [Impact]
> > When a host that supports PKRU initiates a guest that lacks PKRU support, the flag is enabled on the guest's fpstate.
> > This information is then passed to userspace through the vcpu ioctl KVM_GET_XSAVE.
> > However, a problem arises when the user opts to migrate the mentioned guest to another machine that does not support PKRU.
> > In this scenario, the new host attempts to restore the guest's fpu registers.
> > Nevertheless, due to the absence of PKRU support on the new host, a general-protection exception takes place, leading to a guest crash.
> >
> > [Fix]
> > The problem is resolved by the following upstream commit:
> > 18164f66e6c5 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
> > 8647c52e9504 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> >
> > [Test Plan]
> > Several scenarios need to be conducted to confirm the migration outcome.
> >       Patched kernel with PKRU -> kernel with PKRU
> >       Patched kernel with PKRU -> kernel without PKRU
> >       Patched kernel without PKRU -> kernel with PKRU
> >       Patched kernel without PKRU -> kernel without PKRU
> >       Kernel with PKRU -> patched kernel with PKRU
> >       Kernel with PKRU -> patched kernel without PKRU
> >       Kernel without PKRU -> patched kernel with PKRU
> >       Kernel without PKRU -> patched kernel without PKRU
> >       Patched kernel with PKRU -> patched kernel without PKRU
> >
> > Each scenarios shall succeed except "Kernel with PKRU -> patched kernel without PKRU" one.
> > Addressing this case poses challenges because the most plausible solution is to clamp the FPU features at the destination during migration.
> > However, upstream does not support this approach due to concerns about silently dropping features requested by userspace.
> > This could potentially lead to other issues and violate KVM's ABI.
> >
> > [Where problems could occur]
> > The introduced commits will impact the guest migration process,
> > potentially leading to failures and preventing the guest from operating successfully on the migration destination.
> >
> > Sean Christopherson (2):
> >    x86/fpu: Allow caller to constrain xfeatures when copying to uabi
> >      buffer
> >    KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> >
> >   arch/x86/include/asm/fpu/api.h |  3 ++-
> >   arch/x86/kernel/fpu/core.c     |  5 +++--
> >   arch/x86/kernel/fpu/xstate.c   |  7 +++++--
> >   arch/x86/kernel/fpu/xstate.h   |  3 ++-
> >   arch/x86/kvm/x86.c             | 33 +++++++++++++++++++++++++++------
> >   5 files changed, 39 insertions(+), 12 deletions(-)
> >
> Given the complexity of the changes with parts dropped from the upstream
> patches because other changes were not done, it would be better to have
> tested the outcome before applying to a cycle. Has this been done?
>
> --
> - Stefan
>
Stefan Bader March 27, 2024, 9:40 a.m. UTC | #3
On 16.03.24 04:43, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2032164
> 
> SRU Justification:
> 
> [Impact]
> When a host that supports PKRU initiates a guest that lacks PKRU support, the flag is enabled on the guest's fpstate.
> This information is then passed to userspace through the vcpu ioctl KVM_GET_XSAVE.
> However, a problem arises when the user opts to migrate the mentioned guest to another machine that does not support PKRU.
> In this scenario, the new host attempts to restore the guest's fpu registers.
> Nevertheless, due to the absence of PKRU support on the new host, a general-protection exception takes place, leading to a guest crash.
> 
> [Fix]
> The problem is resolved by the following upstream commit:
> 18164f66e6c5 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
> 8647c52e9504 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> 
> [Test Plan]
> Several scenarios need to be conducted to confirm the migration outcome.
> 	Patched kernel with PKRU -> kernel with PKRU
> 	Patched kernel with PKRU -> kernel without PKRU
> 	Patched kernel without PKRU -> kernel with PKRU
> 	Patched kernel without PKRU -> kernel without PKRU
> 	Kernel with PKRU -> patched kernel with PKRU
> 	Kernel with PKRU -> patched kernel without PKRU
> 	Kernel without PKRU -> patched kernel with PKRU
> 	Kernel without PKRU -> patched kernel without PKRU
> 	Patched kernel with PKRU -> patched kernel without PKRU
> 
> Each scenarios shall succeed except "Kernel with PKRU -> patched kernel without PKRU" one.
> Addressing this case poses challenges because the most plausible solution is to clamp the FPU features at the destination during migration.
> However, upstream does not support this approach due to concerns about silently dropping features requested by userspace.
> This could potentially lead to other issues and violate KVM's ABI.
> 
> [Where problems could occur]
> The introduced commits will impact the guest migration process,
> potentially leading to failures and preventing the guest from operating successfully on the migration destination.
> 
> Sean Christopherson (2):
>    x86/fpu: Allow caller to constrain xfeatures when copying to uabi
>      buffer
>    KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> 
>   arch/x86/include/asm/fpu/api.h |  3 ++-
>   arch/x86/kernel/fpu/core.c     |  5 +++--
>   arch/x86/kernel/fpu/xstate.c   |  7 +++++--
>   arch/x86/kernel/fpu/xstate.h   |  3 ++-
>   arch/x86/kvm/x86.c             | 33 +++++++++++++++++++++++++++------
>   5 files changed, 39 insertions(+), 12 deletions(-)
> 

As this was tested beforehand with good results.

Acked-by: Stefan Bader <stefan.bader@canonical.com>
Roxana Nicolescu March 28, 2024, 10:58 a.m. UTC | #4
On 16/03/2024 04:43, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2032164
>
> SRU Justification:
>
> [Impact]
> When a host that supports PKRU initiates a guest that lacks PKRU support, the flag is enabled on the guest's fpstate.
> This information is then passed to userspace through the vcpu ioctl KVM_GET_XSAVE.
> However, a problem arises when the user opts to migrate the mentioned guest to another machine that does not support PKRU.
> In this scenario, the new host attempts to restore the guest's fpu registers.
> Nevertheless, due to the absence of PKRU support on the new host, a general-protection exception takes place, leading to a guest crash.
>
> [Fix]
> The problem is resolved by the following upstream commit:
> 18164f66e6c5 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
> 8647c52e9504 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
>
> [Test Plan]
> Several scenarios need to be conducted to confirm the migration outcome.
> 	Patched kernel with PKRU -> kernel with PKRU
> 	Patched kernel with PKRU -> kernel without PKRU
> 	Patched kernel without PKRU -> kernel with PKRU
> 	Patched kernel without PKRU -> kernel without PKRU
> 	Kernel with PKRU -> patched kernel with PKRU
> 	Kernel with PKRU -> patched kernel without PKRU
> 	Kernel without PKRU -> patched kernel with PKRU
> 	Kernel without PKRU -> patched kernel without PKRU
> 	Patched kernel with PKRU -> patched kernel without PKRU
>
> Each scenarios shall succeed except "Kernel with PKRU -> patched kernel without PKRU" one.
> Addressing this case poses challenges because the most plausible solution is to clamp the FPU features at the destination during migration.
> However, upstream does not support this approach due to concerns about silently dropping features requested by userspace.
> This could potentially lead to other issues and violate KVM's ABI.
>
> [Where problems could occur]
> The introduced commits will impact the guest migration process,
> potentially leading to failures and preventing the guest from operating successfully on the migration destination.
>
> Sean Christopherson (2):
>    x86/fpu: Allow caller to constrain xfeatures when copying to uabi
>      buffer
>    KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
>
>   arch/x86/include/asm/fpu/api.h |  3 ++-
>   arch/x86/kernel/fpu/core.c     |  5 +++--
>   arch/x86/kernel/fpu/xstate.c   |  7 +++++--
>   arch/x86/kernel/fpu/xstate.h   |  3 ++-
>   arch/x86/kvm/x86.c             | 33 +++++++++++++++++++++++++++------
>   5 files changed, 39 insertions(+), 12 deletions(-)
>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Stefan Bader March 28, 2024, 11:20 a.m. UTC | #5
On 16.03.24 04:43, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2032164
> 
> SRU Justification:
> 
> [Impact]
> When a host that supports PKRU initiates a guest that lacks PKRU support, the flag is enabled on the guest's fpstate.
> This information is then passed to userspace through the vcpu ioctl KVM_GET_XSAVE.
> However, a problem arises when the user opts to migrate the mentioned guest to another machine that does not support PKRU.
> In this scenario, the new host attempts to restore the guest's fpu registers.
> Nevertheless, due to the absence of PKRU support on the new host, a general-protection exception takes place, leading to a guest crash.
> 
> [Fix]
> The problem is resolved by the following upstream commit:
> 18164f66e6c5 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
> 8647c52e9504 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> 
> [Test Plan]
> Several scenarios need to be conducted to confirm the migration outcome.
> 	Patched kernel with PKRU -> kernel with PKRU
> 	Patched kernel with PKRU -> kernel without PKRU
> 	Patched kernel without PKRU -> kernel with PKRU
> 	Patched kernel without PKRU -> kernel without PKRU
> 	Kernel with PKRU -> patched kernel with PKRU
> 	Kernel with PKRU -> patched kernel without PKRU
> 	Kernel without PKRU -> patched kernel with PKRU
> 	Kernel without PKRU -> patched kernel without PKRU
> 	Patched kernel with PKRU -> patched kernel without PKRU
> 
> Each scenarios shall succeed except "Kernel with PKRU -> patched kernel without PKRU" one.
> Addressing this case poses challenges because the most plausible solution is to clamp the FPU features at the destination during migration.
> However, upstream does not support this approach due to concerns about silently dropping features requested by userspace.
> This could potentially lead to other issues and violate KVM's ABI.
> 
> [Where problems could occur]
> The introduced commits will impact the guest migration process,
> potentially leading to failures and preventing the guest from operating successfully on the migration destination.
> 
> Sean Christopherson (2):
>    x86/fpu: Allow caller to constrain xfeatures when copying to uabi
>      buffer
>    KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
> 
>   arch/x86/include/asm/fpu/api.h |  3 ++-
>   arch/x86/kernel/fpu/core.c     |  5 +++--
>   arch/x86/kernel/fpu/xstate.c   |  7 +++++--
>   arch/x86/kernel/fpu/xstate.h   |  3 ++-
>   arch/x86/kvm/x86.c             | 33 +++++++++++++++++++++++++++------
>   5 files changed, 39 insertions(+), 12 deletions(-)
> 

Applied to jammy:linux/master-next. Thanks.

-Stefan