diff mbox

[Lucid,CVE-2013-0228] x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.

Message ID 1365511630-15274-1-git-send-email-luis.henriques@canonical.com
State New
Headers show

Commit Message

Luis Henriques April 9, 2013, 12:47 p.m. UTC
From: Jan Beulich <JBeulich@suse.com>

CVE-2013-0228

BugLink: http://bugs.launchpad.net/bugs/1143796

This fixes CVE-2013-0228 / XSA-42

Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
in 32bit PV guest can use to crash the > guest with the panic like this:

-------------
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: scsi_wait_scan]

Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
EIP is at xen_iret+0x12/0x2b
EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
Stack:
 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
Call Trace:
Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
general protection fault: 0000 [#2]
---[ end trace ab0d29a492dcd330 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1250, comm: r Tainted: G      D    ---------------
2.6.32-356.el6.i686 #1
Call Trace:
 [<c08476df>] ? panic+0x6e/0x122
 [<c084b63c>] ? oops_end+0xbc/0xd0
 [<c084b260>] ? do_general_protection+0x0/0x210
 [<c084a9b7>] ? error_code+0x73/
-------------

Petr says: "
 I've analysed the bug and I think that xen_iret() cannot cope with
 mangled DS, in this case zeroed out (null selector/descriptor) by either
 xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
 entry was invalidated by the reproducer. "

Jan took a look at the preliminary patch and came up a fix that solves
this problem:

"This code gets called after all registers other than those handled by
IRET got already restored, hence a null selector in %ds or a non-null
one that got loaded from a code or read-only data descriptor would
cause a kernel mode fault (with the potential of crashing the kernel
as a whole, if panic_on_oops is set)."

The way to fix this is to realize that the we can only relay on the
registers that IRET restores. The two that are guaranteed are the
%cs and %ss as they are always fixed GDT selectors. Also they are
inaccessible from user mode - so they cannot be altered. This is
the approach taken in this patch.

Another alternative option suggested by Jan would be to relay on
the subtle realization that using the %ebp or %esp relative references uses
the %ss segment.  In which case we could switch from using %eax to %ebp and
would not need the %ss over-rides. That would also require one extra
instruction to compensate for the one place where the register is used
as scaled index. However Andrew pointed out that is too subtle and if
further work was to be done in this code-path it could escape folks attention
and lead to accidents.

Reviewed-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(back ported from commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc)

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
 arch/x86/xen/xen-asm_32.S | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Stefan Bader April 9, 2013, 2:17 p.m. UTC | #1

Colin Ian King April 9, 2013, 2:30 p.m. UTC | #2
On 09/04/13 13:47, Luis Henriques wrote:
> From: Jan Beulich <JBeulich@suse.com>
>
> CVE-2013-0228
>
> BugLink: http://bugs.launchpad.net/bugs/1143796
>
> This fixes CVE-2013-0228 / XSA-42
>
> Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
> in 32bit PV guest can use to crash the > guest with the panic like this:
>
> -------------
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
> Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
> mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
> unloaded: scsi_wait_scan]
>
> Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
> EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
> EIP is at xen_iret+0x12/0x2b
> EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
> ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
>   DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
> Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
> Stack:
>   00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
> Call Trace:
> Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
> 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
> 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
> EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
> general protection fault: 0000 [#2]
> ---[ end trace ab0d29a492dcd330 ]---
> Kernel panic - not syncing: Fatal exception
> Pid: 1250, comm: r Tainted: G      D    ---------------
> 2.6.32-356.el6.i686 #1
> Call Trace:
>   [<c08476df>] ? panic+0x6e/0x122
>   [<c084b63c>] ? oops_end+0xbc/0xd0
>   [<c084b260>] ? do_general_protection+0x0/0x210
>   [<c084a9b7>] ? error_code+0x73/
> -------------
>
> Petr says: "
>   I've analysed the bug and I think that xen_iret() cannot cope with
>   mangled DS, in this case zeroed out (null selector/descriptor) by either
>   xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
>   entry was invalidated by the reproducer. "
>
> Jan took a look at the preliminary patch and came up a fix that solves
> this problem:
>
> "This code gets called after all registers other than those handled by
> IRET got already restored, hence a null selector in %ds or a non-null
> one that got loaded from a code or read-only data descriptor would
> cause a kernel mode fault (with the potential of crashing the kernel
> as a whole, if panic_on_oops is set)."
>
> The way to fix this is to realize that the we can only relay on the
> registers that IRET restores. The two that are guaranteed are the
> %cs and %ss as they are always fixed GDT selectors. Also they are
> inaccessible from user mode - so they cannot be altered. This is
> the approach taken in this patch.
>
> Another alternative option suggested by Jan would be to relay on
> the subtle realization that using the %ebp or %esp relative references uses
> the %ss segment.  In which case we could switch from using %eax to %ebp and
> would not need the %ss over-rides. That would also require one extra
> instruction to compensate for the one place where the register is used
> as scaled index. However Andrew pointed out that is too subtle and if
> further work was to be done in this code-path it could escape folks attention
> and lead to accidents.
>
> Reviewed-by: Petr Matousek <pmatouse@redhat.com>
> Reported-by: Petr Matousek <pmatouse@redhat.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> (back ported from commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc)
>
> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
> ---
>   arch/x86/xen/xen-asm_32.S | 14 +++++++-------
>   1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
> index 9a95a9c..d05bd11 100644
> --- a/arch/x86/xen/xen-asm_32.S
> +++ b/arch/x86/xen/xen-asm_32.S
> @@ -88,11 +88,11 @@ ENTRY(xen_iret)
>   	 */
>   #ifdef CONFIG_SMP
>   	GET_THREAD_INFO(%eax)
> -	movl TI_cpu(%eax), %eax
> -	movl __per_cpu_offset(,%eax,4), %eax
> -	mov per_cpu__xen_vcpu(%eax), %eax
> +	movl %ss:TI_cpu(%eax), %eax
> +	movl %ss:__per_cpu_offset(,%eax,4), %eax
> +	mov %ss:per_cpu__xen_vcpu(%eax), %eax
>   #else
> -	movl per_cpu__xen_vcpu, %eax
> +	movl %ss:per_cpu__xen_vcpu, %eax
>   #endif
>
>   	/* check IF state we're restoring */
> @@ -105,11 +105,11 @@ ENTRY(xen_iret)
>   	 * resuming the code, so we don't have to be worried about
>   	 * being preempted to another CPU.
>   	 */
> -	setz XEN_vcpu_info_mask(%eax)
> +	setz %ss:XEN_vcpu_info_mask(%eax)
>   xen_iret_start_crit:
>
>   	/* check for unmasked and pending */
> -	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
> +	cmpw $0x0001, %ss:XEN_vcpu_info_pending(%eax)
>
>   	/*
>   	 * If there's something pending, mask events again so we can
> @@ -117,7 +117,7 @@ xen_iret_start_crit:
>   	 * touch XEN_vcpu_info_mask.
>   	 */
>   	jne 1f
> -	movb $1, XEN_vcpu_info_mask(%eax)
> +	movb $1, %ss:XEN_vcpu_info_mask(%eax)
>
>   1:	popl %eax
>
>
Yup, looks like a clean backport.

Acked-by: Colin Ian King <colin.king@canonical.com>
Andy Whitcroft April 9, 2013, 2:52 p.m. UTC | #3
On Tue, Apr 09, 2013 at 01:47:10PM +0100, Luis Henriques wrote:
> From: Jan Beulich <JBeulich@suse.com>
> 
> CVE-2013-0228
> 
> BugLink: http://bugs.launchpad.net/bugs/1143796
> 
> This fixes CVE-2013-0228 / XSA-42
> 
> Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
> in 32bit PV guest can use to crash the > guest with the panic like this:
> 
> -------------
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
> Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
> mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
> unloaded: scsi_wait_scan]
> 
> Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
> EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
> EIP is at xen_iret+0x12/0x2b
> EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
> ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
>  DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
> Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
> Stack:
>  00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
> Call Trace:
> Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
> 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
> 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
> EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
> general protection fault: 0000 [#2]
> ---[ end trace ab0d29a492dcd330 ]---
> Kernel panic - not syncing: Fatal exception
> Pid: 1250, comm: r Tainted: G      D    ---------------
> 2.6.32-356.el6.i686 #1
> Call Trace:
>  [<c08476df>] ? panic+0x6e/0x122
>  [<c084b63c>] ? oops_end+0xbc/0xd0
>  [<c084b260>] ? do_general_protection+0x0/0x210
>  [<c084a9b7>] ? error_code+0x73/
> -------------
> 
> Petr says: "
>  I've analysed the bug and I think that xen_iret() cannot cope with
>  mangled DS, in this case zeroed out (null selector/descriptor) by either
>  xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
>  entry was invalidated by the reproducer. "
> 
> Jan took a look at the preliminary patch and came up a fix that solves
> this problem:
> 
> "This code gets called after all registers other than those handled by
> IRET got already restored, hence a null selector in %ds or a non-null
> one that got loaded from a code or read-only data descriptor would
> cause a kernel mode fault (with the potential of crashing the kernel
> as a whole, if panic_on_oops is set)."
> 
> The way to fix this is to realize that the we can only relay on the
> registers that IRET restores. The two that are guaranteed are the
> %cs and %ss as they are always fixed GDT selectors. Also they are
> inaccessible from user mode - so they cannot be altered. This is
> the approach taken in this patch.
> 
> Another alternative option suggested by Jan would be to relay on
> the subtle realization that using the %ebp or %esp relative references uses
> the %ss segment.  In which case we could switch from using %eax to %ebp and
> would not need the %ss over-rides. That would also require one extra
> instruction to compensate for the one place where the register is used
> as scaled index. However Andrew pointed out that is too subtle and if
> further work was to be done in this code-path it could escape folks attention
> and lead to accidents.
> 
> Reviewed-by: Petr Matousek <pmatouse@redhat.com>
> Reported-by: Petr Matousek <pmatouse@redhat.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> (back ported from commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc)
> 
> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
> ---
>  arch/x86/xen/xen-asm_32.S | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
> index 9a95a9c..d05bd11 100644
> --- a/arch/x86/xen/xen-asm_32.S
> +++ b/arch/x86/xen/xen-asm_32.S
> @@ -88,11 +88,11 @@ ENTRY(xen_iret)
>  	 */
>  #ifdef CONFIG_SMP
>  	GET_THREAD_INFO(%eax)
> -	movl TI_cpu(%eax), %eax
> -	movl __per_cpu_offset(,%eax,4), %eax
> -	mov per_cpu__xen_vcpu(%eax), %eax
> +	movl %ss:TI_cpu(%eax), %eax
> +	movl %ss:__per_cpu_offset(,%eax,4), %eax
> +	mov %ss:per_cpu__xen_vcpu(%eax), %eax
>  #else
> -	movl per_cpu__xen_vcpu, %eax
> +	movl %ss:per_cpu__xen_vcpu, %eax
>  #endif
>  
>  	/* check IF state we're restoring */
> @@ -105,11 +105,11 @@ ENTRY(xen_iret)
>  	 * resuming the code, so we don't have to be worried about
>  	 * being preempted to another CPU.
>  	 */
> -	setz XEN_vcpu_info_mask(%eax)
> +	setz %ss:XEN_vcpu_info_mask(%eax)
>  xen_iret_start_crit:
>  
>  	/* check for unmasked and pending */
> -	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
> +	cmpw $0x0001, %ss:XEN_vcpu_info_pending(%eax)
>  
>  	/*
>  	 * If there's something pending, mask events again so we can
> @@ -117,7 +117,7 @@ xen_iret_start_crit:
>  	 * touch XEN_vcpu_info_mask.
>  	 */
>  	jne 1f
> -	movb $1, XEN_vcpu_info_mask(%eax)
> +	movb $1, %ss:XEN_vcpu_info_mask(%eax)


Looks to mirror the spirit of the change it is backported from.  I
assume if this was wrong spectacular explosions would result.  Assuming
it has been tested to boot successfully:

Acked-by: Andy Whitcroft <apw@canonical.com>

-apw
Luis Henriques April 9, 2013, 3:10 p.m. UTC | #4
On Tue, Apr 09, 2013 at 03:52:11PM +0100, Andy Whitcroft wrote:
> On Tue, Apr 09, 2013 at 01:47:10PM +0100, Luis Henriques wrote:
> > From: Jan Beulich <JBeulich@suse.com>
> > 
> > CVE-2013-0228
> > 
> > BugLink: http://bugs.launchpad.net/bugs/1143796
> > 
> > This fixes CVE-2013-0228 / XSA-42
> > 
> > Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
> > in 32bit PV guest can use to crash the > guest with the panic like this:
> > 
> > -------------
> > general protection fault: 0000 [#1] SMP
> > last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
> > Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> > iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> > xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
> > mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
> > unloaded: scsi_wait_scan]
> > 
> > Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
> > EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
> > EIP is at xen_iret+0x12/0x2b
> > EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
> > ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
> >  DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
> > Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
> > Stack:
> >  00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
> > Call Trace:
> > Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
> > 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
> > 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
> > EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
> > general protection fault: 0000 [#2]
> > ---[ end trace ab0d29a492dcd330 ]---
> > Kernel panic - not syncing: Fatal exception
> > Pid: 1250, comm: r Tainted: G      D    ---------------
> > 2.6.32-356.el6.i686 #1
> > Call Trace:
> >  [<c08476df>] ? panic+0x6e/0x122
> >  [<c084b63c>] ? oops_end+0xbc/0xd0
> >  [<c084b260>] ? do_general_protection+0x0/0x210
> >  [<c084a9b7>] ? error_code+0x73/
> > -------------
> > 
> > Petr says: "
> >  I've analysed the bug and I think that xen_iret() cannot cope with
> >  mangled DS, in this case zeroed out (null selector/descriptor) by either
> >  xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
> >  entry was invalidated by the reproducer. "
> > 
> > Jan took a look at the preliminary patch and came up a fix that solves
> > this problem:
> > 
> > "This code gets called after all registers other than those handled by
> > IRET got already restored, hence a null selector in %ds or a non-null
> > one that got loaded from a code or read-only data descriptor would
> > cause a kernel mode fault (with the potential of crashing the kernel
> > as a whole, if panic_on_oops is set)."
> > 
> > The way to fix this is to realize that the we can only relay on the
> > registers that IRET restores. The two that are guaranteed are the
> > %cs and %ss as they are always fixed GDT selectors. Also they are
> > inaccessible from user mode - so they cannot be altered. This is
> > the approach taken in this patch.
> > 
> > Another alternative option suggested by Jan would be to relay on
> > the subtle realization that using the %ebp or %esp relative references uses
> > the %ss segment.  In which case we could switch from using %eax to %ebp and
> > would not need the %ss over-rides. That would also require one extra
> > instruction to compensate for the one place where the register is used
> > as scaled index. However Andrew pointed out that is too subtle and if
> > further work was to be done in this code-path it could escape folks attention
> > and lead to accidents.
> > 
> > Reviewed-by: Petr Matousek <pmatouse@redhat.com>
> > Reported-by: Petr Matousek <pmatouse@redhat.com>
> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > (back ported from commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc)
> > 
> > Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
> > ---
> >  arch/x86/xen/xen-asm_32.S | 14 +++++++-------
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
> > index 9a95a9c..d05bd11 100644
> > --- a/arch/x86/xen/xen-asm_32.S
> > +++ b/arch/x86/xen/xen-asm_32.S
> > @@ -88,11 +88,11 @@ ENTRY(xen_iret)
> >  	 */
> >  #ifdef CONFIG_SMP
> >  	GET_THREAD_INFO(%eax)
> > -	movl TI_cpu(%eax), %eax
> > -	movl __per_cpu_offset(,%eax,4), %eax
> > -	mov per_cpu__xen_vcpu(%eax), %eax
> > +	movl %ss:TI_cpu(%eax), %eax
> > +	movl %ss:__per_cpu_offset(,%eax,4), %eax
> > +	mov %ss:per_cpu__xen_vcpu(%eax), %eax
> >  #else
> > -	movl per_cpu__xen_vcpu, %eax
> > +	movl %ss:per_cpu__xen_vcpu, %eax
> >  #endif
> >  
> >  	/* check IF state we're restoring */
> > @@ -105,11 +105,11 @@ ENTRY(xen_iret)
> >  	 * resuming the code, so we don't have to be worried about
> >  	 * being preempted to another CPU.
> >  	 */
> > -	setz XEN_vcpu_info_mask(%eax)
> > +	setz %ss:XEN_vcpu_info_mask(%eax)
> >  xen_iret_start_crit:
> >  
> >  	/* check for unmasked and pending */
> > -	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
> > +	cmpw $0x0001, %ss:XEN_vcpu_info_pending(%eax)
> >  
> >  	/*
> >  	 * If there's something pending, mask events again so we can
> > @@ -117,7 +117,7 @@ xen_iret_start_crit:
> >  	 * touch XEN_vcpu_info_mask.
> >  	 */
> >  	jne 1f
> > -	movb $1, XEN_vcpu_info_mask(%eax)
> > +	movb $1, %ss:XEN_vcpu_info_mask(%eax)
> 
> 
> Looks to mirror the spirit of the change it is backported from.  I
> assume if this was wrong spectacular explosions would result.  Assuming
> it has been tested to boot successfully:

Hmm, no.  It hasn't actually been boot tested with xen -- I've build
tested and booted a PAE kernel on kvm.  One of these days I'll have to
setup a box to test xen, but haven't done that so far... :-/

> 
> Acked-by: Andy Whitcroft <apw@canonical.com>
> 
> -apw

Cheers,
--
Luis
Stefan Bader April 9, 2013, 3:38 p.m. UTC | #5
On 09.04.2013 17:10, Luis Henriques wrote:
> On Tue, Apr 09, 2013 at 03:52:11PM +0100, Andy Whitcroft wrote:
>> On Tue, Apr 09, 2013 at 01:47:10PM +0100, Luis Henriques wrote:
>>> From: Jan Beulich <JBeulich@suse.com>
>>>
>>> CVE-2013-0228
>>>
>>> BugLink: http://bugs.launchpad.net/bugs/1143796
>>>
>>> This fixes CVE-2013-0228 / XSA-42
>>>
>>> Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
>>> in 32bit PV guest can use to crash the > guest with the panic like this:
>>>
>>> -------------
>>> general protection fault: 0000 [#1] SMP
>>> last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
>>> Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
>>> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
>>> xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
>>> mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
>>> unloaded: scsi_wait_scan]
>>>
>>> Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
>>> EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
>>> EIP is at xen_iret+0x12/0x2b
>>> EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
>>> ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
>>>  DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
>>> Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
>>> Stack:
>>>  00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
>>> Call Trace:
>>> Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
>>> 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
>>> 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
>>> EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
>>> general protection fault: 0000 [#2]
>>> ---[ end trace ab0d29a492dcd330 ]---
>>> Kernel panic - not syncing: Fatal exception
>>> Pid: 1250, comm: r Tainted: G      D    ---------------
>>> 2.6.32-356.el6.i686 #1
>>> Call Trace:
>>>  [<c08476df>] ? panic+0x6e/0x122
>>>  [<c084b63c>] ? oops_end+0xbc/0xd0
>>>  [<c084b260>] ? do_general_protection+0x0/0x210
>>>  [<c084a9b7>] ? error_code+0x73/
>>> -------------
>>>
>>> Petr says: "
>>>  I've analysed the bug and I think that xen_iret() cannot cope with
>>>  mangled DS, in this case zeroed out (null selector/descriptor) by either
>>>  xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
>>>  entry was invalidated by the reproducer. "
>>>
>>> Jan took a look at the preliminary patch and came up a fix that solves
>>> this problem:
>>>
>>> "This code gets called after all registers other than those handled by
>>> IRET got already restored, hence a null selector in %ds or a non-null
>>> one that got loaded from a code or read-only data descriptor would
>>> cause a kernel mode fault (with the potential of crashing the kernel
>>> as a whole, if panic_on_oops is set)."
>>>
>>> The way to fix this is to realize that the we can only relay on the
>>> registers that IRET restores. The two that are guaranteed are the
>>> %cs and %ss as they are always fixed GDT selectors. Also they are
>>> inaccessible from user mode - so they cannot be altered. This is
>>> the approach taken in this patch.
>>>
>>> Another alternative option suggested by Jan would be to relay on
>>> the subtle realization that using the %ebp or %esp relative references uses
>>> the %ss segment.  In which case we could switch from using %eax to %ebp and
>>> would not need the %ss over-rides. That would also require one extra
>>> instruction to compensate for the one place where the register is used
>>> as scaled index. However Andrew pointed out that is too subtle and if
>>> further work was to be done in this code-path it could escape folks attention
>>> and lead to accidents.
>>>
>>> Reviewed-by: Petr Matousek <pmatouse@redhat.com>
>>> Reported-by: Petr Matousek <pmatouse@redhat.com>
>>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> (back ported from commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc)
>>>
>>> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
>>> ---
>>>  arch/x86/xen/xen-asm_32.S | 14 +++++++-------
>>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
>>> index 9a95a9c..d05bd11 100644
>>> --- a/arch/x86/xen/xen-asm_32.S
>>> +++ b/arch/x86/xen/xen-asm_32.S
>>> @@ -88,11 +88,11 @@ ENTRY(xen_iret)
>>>  	 */
>>>  #ifdef CONFIG_SMP
>>>  	GET_THREAD_INFO(%eax)
>>> -	movl TI_cpu(%eax), %eax
>>> -	movl __per_cpu_offset(,%eax,4), %eax
>>> -	mov per_cpu__xen_vcpu(%eax), %eax
>>> +	movl %ss:TI_cpu(%eax), %eax
>>> +	movl %ss:__per_cpu_offset(,%eax,4), %eax
>>> +	mov %ss:per_cpu__xen_vcpu(%eax), %eax
>>>  #else
>>> -	movl per_cpu__xen_vcpu, %eax
>>> +	movl %ss:per_cpu__xen_vcpu, %eax
>>>  #endif
>>>  
>>>  	/* check IF state we're restoring */
>>> @@ -105,11 +105,11 @@ ENTRY(xen_iret)
>>>  	 * resuming the code, so we don't have to be worried about
>>>  	 * being preempted to another CPU.
>>>  	 */
>>> -	setz XEN_vcpu_info_mask(%eax)
>>> +	setz %ss:XEN_vcpu_info_mask(%eax)
>>>  xen_iret_start_crit:
>>>  
>>>  	/* check for unmasked and pending */
>>> -	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
>>> +	cmpw $0x0001, %ss:XEN_vcpu_info_pending(%eax)
>>>  
>>>  	/*
>>>  	 * If there's something pending, mask events again so we can
>>> @@ -117,7 +117,7 @@ xen_iret_start_crit:
>>>  	 * touch XEN_vcpu_info_mask.
>>>  	 */
>>>  	jne 1f
>>> -	movb $1, XEN_vcpu_info_mask(%eax)
>>> +	movb $1, %ss:XEN_vcpu_info_mask(%eax)
>>
>>
>> Looks to mirror the spirit of the change it is backported from.  I
>> assume if this was wrong spectacular explosions would result.  Assuming
>> it has been tested to boot successfully:
> 
> Hmm, no.  It hasn't actually been boot tested with xen -- I've build
> tested and booted a PAE kernel on kvm.  One of these days I'll have to
> setup a box to test xen, but haven't done that so far... :-/

I would likely let you know when I will do the related ec2/ec2-nopatch kernel.
Of course you were expecting that. ;-P

> 
>>
>> Acked-by: Andy Whitcroft <apw@canonical.com>
>>
>> -apw
> 
> Cheers,
> --
> Luis
>
Tim Gardner April 9, 2013, 4:17 p.m. UTC | #6

diff mbox

Patch

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index 9a95a9c..d05bd11 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -88,11 +88,11 @@  ENTRY(xen_iret)
 	 */
 #ifdef CONFIG_SMP
 	GET_THREAD_INFO(%eax)
-	movl TI_cpu(%eax), %eax
-	movl __per_cpu_offset(,%eax,4), %eax
-	mov per_cpu__xen_vcpu(%eax), %eax
+	movl %ss:TI_cpu(%eax), %eax
+	movl %ss:__per_cpu_offset(,%eax,4), %eax
+	mov %ss:per_cpu__xen_vcpu(%eax), %eax
 #else
-	movl per_cpu__xen_vcpu, %eax
+	movl %ss:per_cpu__xen_vcpu, %eax
 #endif
 
 	/* check IF state we're restoring */
@@ -105,11 +105,11 @@  ENTRY(xen_iret)
 	 * resuming the code, so we don't have to be worried about
 	 * being preempted to another CPU.
 	 */
-	setz XEN_vcpu_info_mask(%eax)
+	setz %ss:XEN_vcpu_info_mask(%eax)
 xen_iret_start_crit:
 
 	/* check for unmasked and pending */
-	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
+	cmpw $0x0001, %ss:XEN_vcpu_info_pending(%eax)
 
 	/*
 	 * If there's something pending, mask events again so we can
@@ -117,7 +117,7 @@  xen_iret_start_crit:
 	 * touch XEN_vcpu_info_mask.
 	 */
 	jne 1f
-	movb $1, XEN_vcpu_info_mask(%eax)
+	movb $1, %ss:XEN_vcpu_info_mask(%eax)
 
 1:	popl %eax