UBUNTU: SAUCE: TDX: Work around the segfault issue in glibc 2.35 in Ubuntu 22.04.

Message ID 20230123140233.790103-2-tim.gardner@canonical.com
State New
Series Azure: TDX enabled hyper-visors cause segfault

Commit Message

Tim Gardner Jan. 23, 2023, 2:02 p.m. UTC
From: Dexuan Cui <decui@microsoft.com>

BugLink: https://bugs.launchpad.net/bugs/2003714

glibc 2.34/2.35 (and 2.36?) had a bug (2.32 is good):
See https://sourceware.org/bugzilla/show_bug.cgi?id=28784

The bug has been fixed in upstream glibc:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=c242fcce06e3102ca663b2f992611d0bda4f2668

However, it looks like a lot of distros haven't picked up the fix yet,
e.g. the glibc in Ubuntu 22.04/22.10/23.04 needs to pick up the glibc fix
(c242fcce06e3102ca663b2f992611d0bda4f2668).
RHEL 9's glibc needs the glibc fix as well.

Until the glibc packages in the distros are fixed, we can use this
kernel-side workaround patch. The workaround is from Intel.
See below for the rationale:

x86/tdx: Virtualize CPUID leaf 0x2
CPUID leaf 0x2 provides cache and TLB information. In a TDX guest, access
to the leaf causes a #VE.

The current implementation returns all zeros, which confuses some users:
some recent versions of glibc hit segfaults. It is a glibc bug, but it is
also a user-visible regression compared to a non-TDX environment.

The kernel can generate a sensible response to the #VE to work around the
glibc segfault for now.

The leaf is obsolete. There are leaves that provide the same
information in a structured form. See leaf 0x4 for cache info and
leaf 0x18 for TLB info.

Generate a response that indicates that CPUID leaves 0x4 and 0x18 have to
be used instead.

(cherry picked from commit 16218cf73491e867fd39c16c9e4b8aa926cbda68 https://github.com/dcui/tdx)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 arch/x86/coco/tdx/tdx.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
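
The rationale above says the synthesized leaf 0x2 response should steer
callers toward the structured leaves. As a rough illustration of how a
consumer might unpack such a response, here is a minimal user-space sketch
that splits a leaf 0x2 register into its four descriptor bytes and looks for
a given descriptor. The helper names and the LEAF2_DESC_USE_LEAF4 macro are
hypothetical and not part of the patch. The 0xff value is this sketch's
reading of the Intel SDM descriptor table ("no cache descriptors here, query
leaf 0x4"); the meaning of any other byte should be checked against the SDM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption (from the Intel SDM leaf 0x2 descriptor table):
 * 0xff means "cache info is not reported here, use CPUID leaf 0x4". */
#define LEAF2_DESC_USE_LEAF4 0xffu

/* Split one leaf 0x2 register into four bytes, least significant first. */
void leaf2_split(uint32_t reg, uint8_t out[4])
{
	assert(out != NULL);
	for (int i = 0; i < 4; i++)
		out[i] = (uint8_t)(reg >> (8 * i));
}

/*
 * Return 1 if EAX carries descriptor `desc`, 0 otherwise.
 * Byte 0 (AL) is skipped because it is not a descriptor, and a register
 * with bit 31 set is treated as holding no valid descriptors.
 */
int leaf2_eax_has_descriptor(uint32_t eax, uint8_t desc)
{
	uint8_t bytes[4];

	if (eax & 0x80000000u)
		return 0;

	leaf2_split(eax, bytes);
	for (int i = 1; i < 4; i++) {
		if (bytes[i] == desc)
			return 1;
	}
	return 0;
}
```

Calling leaf2_eax_has_descriptor() with an all-zero EAX, which is what the
guest saw before this workaround, finds no descriptor at all, so a caller
gets no hint that it should consult leaf 0x4.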

Comments

Ian May Jan. 23, 2023, 2:47 p.m. UTC | #1
LGTM

Acked-by: Ian May <ian.may@canonical.com>

Patch

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index c32c7ef55249..928ca748bb26 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -329,6 +329,18 @@  static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
 		.r13 = regs->cx,
 	};
 
+	/*
+	 * Work around the segfault issue in glibc 2.35 in Ubuntu 22.04.
+	 * See https://sourceware.org/bugzilla/show_bug.cgi?id=28784
+	 * Ubuntu 22.04/22.10/23.04's glibc should pick up this glibc fix:
+	 * https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=c242fcce06e3102ca663b2f992611d0bda4f2668
+	 */
+	if (regs->ax == 2) {
+		regs->ax = 0xf1ff01;
+		regs->bx = regs->cx = regs->dx = 0;
+		return ve_instr_len(ve);
+	}
+
 	/*
 	 * Only allow VMM to control range reserved for hypervisor
 	 * communication.
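
The control flow of the added branch can be exercised outside the kernel
with a small mock. This is only a sketch: struct mock_regs and
mock_handle_cpuid_leaf2() are hypothetical stand-ins for struct pt_regs and
the new branch in handle_cpuid(). The integer return value stands in for
"handled here" versus "fall through to the rest of the handler", whereas the
real code returns ve_instr_len(ve).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the four pt_regs fields the patch touches. */
struct mock_regs {
	unsigned long ax, bx, cx, dx;
};

/*
 * Mirror of the added early-return branch.
 * Returns 1 if leaf 0x2 was requested and the fixed response was filled in,
 * or 0 if the request is for another leaf and the registers are untouched.
 */
int mock_handle_cpuid_leaf2(struct mock_regs *regs)
{
	assert(regs != NULL);

	if (regs->ax != 2)
		return 0;

	/* Same constants as the patch: fixed EAX, other registers zeroed. */
	regs->ax = 0xf1ff01;
	regs->bx = regs->cx = regs->dx = 0;
	return 1;
}
```

Because the check sits before the rest of the handler, a leaf 0x2 request
never reaches the VMM in this model; every other leaf takes the existing
path unchanged.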