Stack corruption problem with SeaBIOS/gPXE under QEMU

Submitter Naphtali Sprei
Date Nov. 12, 2009, 11:20 a.m.
Message ID <4AFBEF9A.5010802@redhat.com>
Permalink /patch/38238/
State New

Comments

Naphtali Sprei - Nov. 12, 2009, 11:20 a.m.
Hi,
I've found a problem with the usage of SeaBIOS/gPXE in QEMU.
The scenario is a failed network boot that falls back to booting from the hard disk (-boot nc).
The cause of the problem is that both SeaBIOS and gPXE (in its installation phase) use the same stack area, at 0x7c00.
The gPXE code corrupts the SeaBIOS stack, so when gPXE returns to SeaBIOS, chaos occurs.

Output: "qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000eb300000"
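
The failure mode described above can be illustrated with a small model. The sketch below is not SeaBIOS or gPXE code: it simulates a 16-bit stack in a bytearray, and the names (`Stack`, `simulate`) and the saved value 0x1234 are invented for the example. It only shows how loading a fresh stack pointer of 0x7c00 overwrites words the caller has already pushed there.

```python
import struct

STACK_TOP = 0x7C00  # top-of-stack address both sides use, per the report


class Stack:
    """Toy model of a real-mode stack: 16-bit pushes that grow downward."""

    def __init__(self, mem, sp):
        self.mem = mem
        self.sp = sp

    def push(self, value):
        self.sp -= 2
        struct.pack_into("<H", self.mem, self.sp, value & 0xFFFF)

    def peek(self, addr):
        return struct.unpack_from("<H", self.mem, addr)[0]


def simulate(rom_sp):
    """Return the caller's saved word after the ROM has run with its own SP."""
    mem = bytearray(0x10000)
    bios = Stack(mem, STACK_TOP)
    bios.push(0x1234)            # hypothetical return address saved by the BIOS
    saved_at = bios.sp

    rom = Stack(mem, rom_sp)     # the ROM prefix loads a fresh stack pointer
    for _ in range(8):           # and pushes a few words of its own
        rom.push(0xEB30)
    return bios.peek(saved_at)


print(hex(simulate(0x7C00)))  # shared stack top: saved word is overwritten
print(hex(simulate(0x7A00)))  # patched stack top: saved word survives
```

In this model the 0x7a00 value only helps while the caller has used fewer than 0x200 bytes of stack, which is consistent with the patch being described as a hack.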

A simple hack/patch (attached) solves this problem, but a proper patch is expected from the SeaBIOS guys.

 Enjoy,

  Naphtali

Patch against current SeaBIOS git


Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
---
 src/arch/i386/prefix/pxeprefix.S |    2 +-
 src/arch/i386/prefix/romprefix.S |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
Kevin O'Connor - Nov. 14, 2009, 7:47 p.m.
Hi,

On Thu, Nov 12, 2009 at 01:20:58PM +0200, Naphtali Sprei wrote:
> I've found a problem with the usage of SeaBIOS/gPXE in Qemu.  The
> scenario is when failing to boot from network and falling back to
> booting from hard-disk (-boot nc).  The cause of the problem is that
> both SeaBIOS and gPXE (in it's installation phase) uses same stack
> area, 0x7c00.  The gPXE code corrupts the SeaBIOS stack, so when
> gPXE returns to SeaBIOS chaos occurs.
> 
> Output: "qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000eb300000"

Thanks for reporting this.

We can move the SeaBIOS stack, but it's not clear to me where to move
it to.  Bochs bios puts the top of the stack at 0x10000, but this
could potentially conflict with the OS load to 0x7c00.  So, in SeaBIOS
the top of stack was moved to 0x7c00 to prevent this conflict.
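
One way to picture this trade-off: the region between 0x7c00 and 0x10000 is shared by a stack growing down from 0x10000 and whatever the OS loads upward from 0x7c00. The sketch below is only arithmetic under a stated assumption, namely that just the 512-byte boot sector occupies memory at 0x7c00 (a real loader may use more). The function name is invented for this example.

```python
BOOT_SECTOR_START = 0x7C00
BOOT_SECTOR_END = BOOT_SECTOR_START + 512  # one 512-byte sector


def headroom_above_boot_sector(stack_top):
    """Bytes a downward-growing stack can use before touching the boot sector.

    Returns None when the stack starts at or below the boot sector, in which
    case it grows away from it and can never reach it.
    """
    if stack_top <= BOOT_SECTOR_START:
        return None
    return max(stack_top - BOOT_SECTOR_END, 0)


print(headroom_above_boot_sector(0x10000))  # Bochs-style top of stack
print(headroom_above_boot_sector(0x7C00))   # SeaBIOS top of stack at the time
```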

Maybe the gPXE developers know where the bios typically places its
stack.

However, I'm not sure why gPXE doesn't just use the stack it was
given, or allocate the stack space it needs with PMM.

> A simple hack/patch (attached) solves this problem, but a proper
> patch expected from the SeaBIOS guys.
> 
>  Enjoy,
> 
>   Naphtali
> 
> Patch against current SeaBIOS git

The patch isn't against SeaBIOS.  Did you mean gPXE?

-Kevin


> 
> 
> Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
> ---
>  src/arch/i386/prefix/pxeprefix.S |    2 +-
>  src/arch/i386/prefix/romprefix.S |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/arch/i386/prefix/pxeprefix.S b/src/arch/i386/prefix/pxeprefix.S
> index b541e4b..11dd45d 100644
> --- a/src/arch/i386/prefix/pxeprefix.S
> +++ b/src/arch/i386/prefix/pxeprefix.S
> @@ -47,7 +47,7 @@ FILE_LICENCE ( GPL2_OR_LATER )
>  	/* Set up stack just below 0x7c00 */
>  	xorw	%ax, %ax
>  	movw	%ax, %ss
> -	movl	$0x7c00, %esp
> +	movl	$0x7a00, %esp
>  	/* Clear direction flag, for the sake of sanity */
>  	cld
>  	/* Print welcome message */
> diff --git a/src/arch/i386/prefix/romprefix.S b/src/arch/i386/prefix/romprefix.S
> index cb474e8..93f3f17 100644
> --- a/src/arch/i386/prefix/romprefix.S
> +++ b/src/arch/i386/prefix/romprefix.S
> @@ -587,7 +587,7 @@ exec:	/* Set %ds = %cs */
>  	/* Obtain a reasonably-sized temporary stack */
>  	xorw	%ax, %ax
>  	movw	%ax, %ss
> -	movw	$0x7c00, %sp
> +	movw	$0x7a00, %sp
>  
>  	/* Install gPXE */
>  	movl	image_source, %esi
> -- 
> 1.6.3.3
> 
>
Naphtali Sprei - Nov. 15, 2009, 9:43 a.m.
Kevin O'Connor wrote:
> Hi,
> 
> On Thu, Nov 12, 2009 at 01:20:58PM +0200, Naphtali Sprei wrote:
>> I've found a problem with the usage of SeaBIOS/gPXE in Qemu.  The
>> scenario is when failing to boot from network and falling back to
>> booting from hard-disk (-boot nc).  The cause of the problem is that
>> both SeaBIOS and gPXE (in it's installation phase) uses same stack
>> area, 0x7c00.  The gPXE code corrupts the SeaBIOS stack, so when
>> gPXE returns to SeaBIOS chaos occurs.
>>
>> Output: "qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000eb300000"
> 
> Thanks for reporting this.
> 
> We can move the SeaBIOS stack, but it's not clear to me where to move
> it to.  Bochs bios puts the top of the stack at 0x10000, but this
> could potentially conflict with the OS load to 0x7c00.  So, in SeaBIOS
> the top of stack was moved to 0x7c00 to prevent this conflict.
> 
> Maybe the gPXE developers know where the bios typically places its
> stack.
> 
> However, I'm not sure why gPXE doesn't just use the stack it was
> given, or allocate the stack space it needs with PMM.
> 
>> A simple hack/patch (attached) solves this problem, but a proper
>> patch expected from the SeaBIOS guys.

Sorry for the misleading addressee. I should have addressed the request to the gPXE project, not the SeaBIOS project.
Since gPXE uses the services of SeaBIOS, that is where the changes should be.
Thanks for CC'ing them.

>>
>>  Enjoy,
>>
>>   Naphtali
>>
>> Patch against current SeaBIOS git
> 
> The patch isn't against SeaBIOS.  Did you mean gPXE?

Sure, my mistake, against gPXE.

> 
> -Kevin
Avi Kivity - Nov. 16, 2009, 1:36 p.m.
On 11/14/2009 09:47 PM, Kevin O'Connor wrote:
> Hi,
>
> On Thu, Nov 12, 2009 at 01:20:58PM +0200, Naphtali Sprei wrote:
>    
>> I've found a problem with the usage of SeaBIOS/gPXE in Qemu.  The
>> scenario is when failing to boot from network and falling back to
>> booting from hard-disk (-boot nc).  The cause of the problem is that
>> both SeaBIOS and gPXE (in it's installation phase) uses same stack
>> area, 0x7c00.  The gPXE code corrupts the SeaBIOS stack, so when
>> gPXE returns to SeaBIOS chaos occurs.
>>
>> Output: "qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000eb300000"
>>      
> Thanks for reporting this.
>
> We can move the SeaBIOS stack, but it's not clear to me where to move
> it to.  Bochs bios puts the top of the stack at 0x10000, but this
> could potentially conflict with the OS load to 0x7c00.  So, in SeaBIOS
> the top of stack was moved to 0x7c00 to prevent this conflict.
>
> Maybe the gPXE developers know where the bios typically places its
> stack.
>
> However, I'm not sure why gPXE doesn't just use the stack it was
> given, or allocate the stack space it needs with PMM.
>    

Something that is likely related, I am seeing reboot failures in 
seabios's pmm_free.  Immediately after loading gpxe, seabios is in an 
endless loop there, likely due to memory corruption.

This is with -smp 2, rebooting Fedora 9 after installation.
Avi Kivity - Nov. 16, 2009, 2:02 p.m.
On 11/16/2009 03:36 PM, Avi Kivity wrote:
> On 11/14/2009 09:47 PM, Kevin O'Connor wrote:
>> Hi,
>>
>> On Thu, Nov 12, 2009 at 01:20:58PM +0200, Naphtali Sprei wrote:
>>> I've found a problem with the usage of SeaBIOS/gPXE in Qemu.  The
>>> scenario is when failing to boot from network and falling back to
>>> booting from hard-disk (-boot nc).  The cause of the problem is that
>>> both SeaBIOS and gPXE (in it's installation phase) uses same stack
>>> area, 0x7c00.  The gPXE code corrupts the SeaBIOS stack, so when
>>> gPXE returns to SeaBIOS chaos occurs.
>>>
>>> Output: "qemu: fatal: Trying to execute code outside RAM or ROM at 
>>> 0x00000000eb300000"
>> Thanks for reporting this.
>>
>> We can move the SeaBIOS stack, but it's not clear to me where to move
>> it to.  Bochs bios puts the top of the stack at 0x10000, but this
>> could potentially conflict with the OS load to 0x7c00.  So, in SeaBIOS
>> the top of stack was moved to 0x7c00 to prevent this conflict.
>>
>> Maybe the gPXE developers know where the bios typically places its
>> stack.
>>
>> However, I'm not sure why gPXE doesn't just use the stack it was
>> given, or allocate the stack space it needs with PMM.
>
> Something that is likely related, I am seeing reboot failures in 
> seabios's pmm_free.  Immediately after loading gpxe, seabios is in an 
> endless loop there, likely due to memory corruption.
>
> This is with -smp 2, rebooting Fedora 9 after installation.
>

With gpxe disabled, rebooting works as expected.

Note the tests were performed with the stack at 64K to avoid triggering 
the known issue.
Kevin O'Connor - Nov. 17, 2009, 2:26 a.m.
On Mon, Nov 16, 2009 at 04:02:20PM +0200, Avi Kivity wrote:
>> Something that is likely related, I am seeing reboot failures in  
>> seabios's pmm_free.  Immediately after loading gpxe, seabios is in an  
>> endless loop there, likely due to memory corruption.
>>
>> This is with -smp 2, rebooting Fedora 9 after installation.
>
> With gpxe disabled, rebooting works as expected.
>
> Note the tests were performed with the stack at 64K to avoid triggering  
> the known issue.

Hi Avi,

Can you send the full qemu command line that you used?  I can't seem
to reproduce this on my setup.

I do see an issue if SeaBIOS's reboot vector is called (eg, by using
"sendkey ctrl-alt-delete" while still in the bios) because seabios
allows gpxe to modify itself, and on a seabios only reboot the gpxe
rom isn't recopied and gpxe therefore gets confused.  However, on a
linux invoked reboot, it looks like a full machine reset occurs and
qemu recopies the gpxe rom, so that doesn't seem to be an issue.

BTW, how did you change the stack location?  I've been changing
seabios by setting BUILD_STACK_ADDR to 0x7000 (or 0xfff0) in
src/config.h.

-Kevin
Avi Kivity - Nov. 17, 2009, 1:23 p.m.
On 11/17/2009 04:26 AM, Kevin O'Connor wrote:
> On Mon, Nov 16, 2009 at 04:02:20PM +0200, Avi Kivity wrote:
>    
>>> Something that is likely related, I am seeing reboot failures in
>>> seabios's pmm_free.  Immediately after loading gpxe, seabios is in an
>>> endless loop there, likely due to memory corruption.
>>>
>>> This is with -smp 2, rebooting Fedora 9 after installation.
>>>        
>> With gpxe disabled, rebooting works as expected.
>>
>> Note the tests were performed with the stack at 64K to avoid triggering
>> the known issue.
>>      
> Hi Avi,
>
> Can you send the full qemu command line that you used?  I can't seem
> to reproduce this on my setup.
>
>    

Example command line is

   qemu -name 'vm1' \
     -drive file=/root/kvm-autotest/client/tests/kvm/images/winvista-64.qcow2,if=ide,cache=writeback \
     -net nic,vlan=0,model=rtl8139,macaddr=52:54:00:12:34:56 \
     -net user,vlan=0 \
     -m 512 -smp 2 \
     -cdrom /root/kvm-autotest/client/tests/kvm/isos/windows/winutils.iso \
     -redir tcp:5000::22

(generated by autotest) with qemu-kvm.git
b496fe34317ead61cf5ae019506fadc8f9ad6556.

> I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> "sendkey ctrl-alt-delete" while still in the bios) because seabios
> allows gpxe to modify itself, and on a seabios only reboot the gpxe
> rom isn't recopied and gpxe therefore gets confused.  However, on a
> linux invoked reboot, it looks like a full machine reset occurs and
> qemu recopies the gpxe rom, so that doesn't seem to be an issue.
>
> BTW, how did you change the stack location?  I've been changing
> seabios by setting BUILD_STACK_ADDR to 0x7000 (or 0xfff0) in
> src/config.h.
>    

I modified BUILD_STACK_ADDR as well.
Gleb Natapov - Nov. 18, 2009, 9:39 a.m.
On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
> On Mon, Nov 16, 2009 at 04:02:20PM +0200, Avi Kivity wrote:
> >> Something that is likely related, I am seeing reboot failures in  
> >> seabios's pmm_free.  Immediately after loading gpxe, seabios is in an  
> >> endless loop there, likely due to memory corruption.
> >>
> >> This is with -smp 2, rebooting Fedora 9 after installation.
> >
> > With gpxe disabled, rebooting works as expected.
> >
> > Note the tests were performed with the stack at 64K to avoid triggering  
> > the known issue.
> 
> Hi Avi,
> 
> Can you send the full qemu command line that you used?  I can't seem
> to reproduce this on my setup.
> 
> I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> "sendkey ctrl-alt-delete" while still in the bios) because seabios
> allows gpxe to modify itself, and on a seabios only reboot the gpxe
> rom isn't recopied and gpxe therefore gets confused.  However, on a
> linux invoked reboot, it looks like a full machine reset occurs and
> qemu recopies the gpxe rom, so that doesn't seem to be an issue.
> 
Do we have the same problem with the TPR-patching ROM (vapic.bin)? It modifies
itself too.

> BTW, how did you change the stack location?  I've been changing
> seabios by setting BUILD_STACK_ADDR to 0x7000 (or 0xfff0) in
> src/config.h.
> 
> -Kevin
> 

--
			Gleb.
Alexander Graf - Nov. 18, 2009, 9:49 a.m.
On 18.11.2009, at 10:39, Gleb Natapov wrote:

> On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
>> On Mon, Nov 16, 2009 at 04:02:20PM +0200, Avi Kivity wrote:
>>>> Something that is likely related, I am seeing reboot failures in
>>>> seabios's pmm_free.  Immediately after loading gpxe, seabios is  
>>>> in an
>>>> endless loop there, likely due to memory corruption.
>>>>
>>>> This is with -smp 2, rebooting Fedora 9 after installation.
>>>
>>> With gpxe disabled, rebooting works as expected.
>>>
>>> Note the tests were performed with the stack at 64K to avoid  
>>> triggering
>>> the known issue.
>>
>> Hi Avi,
>>
>> Can you send the full qemu command line that you used?  I can't seem
>> to reproduce this on my setup.
>>
>> I do see an issue if SeaBIOS's reboot vector is called (eg, by using
>> "sendkey ctrl-alt-delete" while still in the bios) because seabios
>> allows gpxe to modify itself, and on a seabios only reboot the gpxe
>> rom isn't recopied and gpxe therefore gets confused.  However, on a
>> linux invoked reboot, it looks like a full machine reset occurs and
>> qemu recopies the gpxe rom, so that doesn't seem to be an issue.
>>
> Do we have the same problem with tpr patching rom (vapic,bin)? It  
> modifies
> itself too.

Are you sure vapic.bin still works with SeaBIOS? I've had to modify  
the multiboot and linuxboot code to write to the stack because the  
code section of the option rom was read only.

Alex
Gleb Natapov - Nov. 18, 2009, 9:53 a.m.
On Wed, Nov 18, 2009 at 10:49:37AM +0100, Alexander Graf wrote:
> 
> On 18.11.2009, at 10:39, Gleb Natapov wrote:
> 
> >On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
> >>On Mon, Nov 16, 2009 at 04:02:20PM +0200, Avi Kivity wrote:
> >>>>Something that is likely related, I am seeing reboot failures in
> >>>>seabios's pmm_free.  Immediately after loading gpxe, seabios
> >>>>is in an
> >>>>endless loop there, likely due to memory corruption.
> >>>>
> >>>>This is with -smp 2, rebooting Fedora 9 after installation.
> >>>
> >>>With gpxe disabled, rebooting works as expected.
> >>>
> >>>Note the tests were performed with the stack at 64K to avoid
> >>>triggering
> >>>the known issue.
> >>
> >>Hi Avi,
> >>
> >>Can you send the full qemu command line that you used?  I can't seem
> >>to reproduce this on my setup.
> >>
> >>I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> >>"sendkey ctrl-alt-delete" while still in the bios) because seabios
> >>allows gpxe to modify itself, and on a seabios only reboot the gpxe
> >>rom isn't recopied and gpxe therefore gets confused.  However, on a
> >>linux invoked reboot, it looks like a full machine reset occurs and
> >>qemu recopies the gpxe rom, so that doesn't seem to be an issue.
> >>
> >Do we have the same problem with tpr patching rom (vapic,bin)? It
> >modifies
> >itself too.
> 
> Are you sure vapic.bin still works with SeaBIOS? I've had to modify
> the multiboot and linuxboot code to write to the stack because the
> code section of the option rom was read only.
> 
I tested it with SeaBIOS and it worked. Actually, vapic.bin doesn't modify
itself during the BIOS run. Parts of vapic.bin are modified by QEMU and other
parts are modified while Windows runs.

--
			Gleb.
Kevin O'Connor - Nov. 18, 2009, 12:58 p.m.
On Wed, Nov 18, 2009 at 10:49:37AM +0100, Alexander Graf wrote:
> Are you sure vapic.bin still works with SeaBIOS? I've had to modify the 
> multiboot and linuxboot code to write to the stack because the code 
> section of the option rom was read only.

SeaBIOS should be making the code writable during option rom
execution.  This is a change from bochs-bios, which did not do this.

-Kevin
Kevin O'Connor - Nov. 18, 2009, 1:06 p.m.
On Wed, Nov 18, 2009 at 11:39:49AM +0200, Gleb Natapov wrote:
> On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
> > I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> > "sendkey ctrl-alt-delete" while still in the bios) because seabios
> > allows gpxe to modify itself, and on a seabios only reboot the gpxe
> > rom isn't recopied and gpxe therefore gets confused.  However, on a
> > linux invoked reboot, it looks like a full machine reset occurs and
> > qemu recopies the gpxe rom, so that doesn't seem to be an issue.
> > 
> Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
> itself too.

I don't know, but I wouldn't think so.  The issue is only if the
option rom init code doesn't like getting run twice.  (Gpxe allocates
high memory via pmm, relocates itself there, and shrinks its option
rom size - on the second option rom init call the PMM allocation is
lost and its option rom has been shrunk - it rightfully can't handle
that.)  I don't think the vapic would have the same issue - would it?

Ideally, I think SeaBIOS should detect a second call to "post" and try
to issue a machine reboot.  That should fix this issue.  (To be clear
though, I don't think this is the cause of Avi's Fedora reboot hang.)

-Kevin
Avi Kivity - Nov. 18, 2009, 1:50 p.m.
On 11/18/2009 11:39 AM, Gleb Natapov wrote:
>
>> Hi Avi,
>>
>> Can you send the full qemu command line that you used?  I can't seem
>> to reproduce this on my setup.
>>
>> I do see an issue if SeaBIOS's reboot vector is called (eg, by using
>> "sendkey ctrl-alt-delete" while still in the bios) because seabios
>> allows gpxe to modify itself, and on a seabios only reboot the gpxe
>> rom isn't recopied and gpxe therefore gets confused.  However, on a
>> linux invoked reboot, it looks like a full machine reset occurs and
>> qemu recopies the gpxe rom, so that doesn't seem to be an issue.
>>
>>      
> Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
> itself too.
>    

But a reset will reload it.
Gleb Natapov - Nov. 18, 2009, 2:19 p.m.
On Wed, Nov 18, 2009 at 03:50:20PM +0200, Avi Kivity wrote:
> On 11/18/2009 11:39 AM, Gleb Natapov wrote:
> >
> >>Hi Avi,
> >>
> >>Can you send the full qemu command line that you used?  I can't seem
> >>to reproduce this on my setup.
> >>
> >>I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> >>"sendkey ctrl-alt-delete" while still in the bios) because seabios
> >>allows gpxe to modify itself, and on a seabios only reboot the gpxe
> >>rom isn't recopied and gpxe therefore gets confused.  However, on a
> >>linux invoked reboot, it looks like a full machine reset occurs and
> >>qemu recopies the gpxe rom, so that doesn't seem to be an issue.
> >>
> >Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
> >itself too.
> 
> But a reset will reload it.
> 
Correct, but Kevin says "sendkey ctrl-alt-delete" jumps to SeaBIOS's
reboot vector without issuing system reset. I am talking about this situation.

--
			Gleb.
Avi Kivity - Nov. 18, 2009, 2:21 p.m.
On 11/18/2009 04:19 PM, Gleb Natapov wrote:
>>>
>>> Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
>>> itself too.
>>>        
>> But a reset will reload it.
>>
>>      
> Correct, but Kevin says "sendkey ctrl-alt-delete" jumps to SeaBIOS's
> reboot vector without issuing system reset. I am talking about this situation.
>    

That's only if we're in the bios.  If an OS has taken over, it will 
issue a proper reset.  If an OS has not taken over (DOS won't, probably) 
then it isn't Windows and the vapic payload hasn't had a chance to 
modify itself.
Gleb Natapov - Nov. 18, 2009, 2:22 p.m.
On Wed, Nov 18, 2009 at 08:06:26AM -0500, Kevin O'Connor wrote:
> On Wed, Nov 18, 2009 at 11:39:49AM +0200, Gleb Natapov wrote:
> > On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
> > > I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> > > "sendkey ctrl-alt-delete" while still in the bios) because seabios
> > > allows gpxe to modify itself, and on a seabios only reboot the gpxe
> > > rom isn't recopied and gpxe therefore gets confused.  However, on a
> > > linux invoked reboot, it looks like a full machine reset occurs and
> > > qemu recopies the gpxe rom, so that doesn't seem to be an issue.
> > > 
> > Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
> > itself too.
> 
> I don't know, but I wouldn't think so.  The issue is only if the
> option rom init code doesn't like getting run twice.  (Gpxe allocates
If a ROM modifies itself, its checksum changes, so SeaBIOS thinks that the ROM
is invalid and does not call its init code a second time. Is this correct?

> high memory via pmm, relocates itself there, and shrinks its option
> rom size - on the second option rom init call the PMM allocation is
> lost and its option rom has been shrunk - it rightfully can't handle
> that.)  I don't think the vapic would have the same issue - would it?
> 
> Ideally, I think SeaBIOS should detect a second call to "post" and try
> to issue a machine reboot.  That should fix this issue.  (To be clear
> though, I don't think this is the cause of Avi's Fedora reboot hang.)
> 
> -Kevin

--
			Gleb.
Joshua Oreman - Nov. 18, 2009, 3:38 p.m.
On Wed, Nov 18, 2009 at 9:22 AM, Gleb Natapov <gleb@redhat.com> wrote:
> On Wed, Nov 18, 2009 at 08:06:26AM -0500, Kevin O'Connor wrote:
>> On Wed, Nov 18, 2009 at 11:39:49AM +0200, Gleb Natapov wrote:
>> > On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
>> > > I do see an issue if SeaBIOS's reboot vector is called (eg, by using
>> > > "sendkey ctrl-alt-delete" while still in the bios) because seabios
>> > > allows gpxe to modify itself, and on a seabios only reboot the gpxe
>> > > rom isn't recopied and gpxe therefore gets confused.  However, on a
>> > > linux invoked reboot, it looks like a full machine reset occurs and
>> > > qemu recopies the gpxe rom, so that doesn't seem to be an issue.
>> > >
>> > Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
>> > itself too.
>>
>> I don't know, but I wouldn't think so.  The issue is only if the
>> option rom init code doesn't like getting run twice.  (Gpxe allocates
>
> If rom modifies itself its checksum changes so SeaBIOS thinks that rom
> is invalid and does not call its init code second time. Is this correct?

I don't know how it's "supposed" to work, but gPXE's ROM init
procedure contains code to recompute and store the checksum after it's
modified itself. Presumably this is there because some vendor BIOSes
expect it.

-- Josh
Kevin O'Connor - Nov. 19, 2009, 1:07 a.m.
On Wed, Nov 18, 2009 at 04:22:17PM +0200, Gleb Natapov wrote:
> On Wed, Nov 18, 2009 at 08:06:26AM -0500, Kevin O'Connor wrote:
> > On Wed, Nov 18, 2009 at 11:39:49AM +0200, Gleb Natapov wrote:
> > > On Mon, Nov 16, 2009 at 09:26:20PM -0500, Kevin O'Connor wrote:
> > > > I do see an issue if SeaBIOS's reboot vector is called (eg, by using
> > > > "sendkey ctrl-alt-delete" while still in the bios) because seabios
> > > > allows gpxe to modify itself, and on a seabios only reboot the gpxe
> > > > rom isn't recopied and gpxe therefore gets confused.  However, on a
> > > > linux invoked reboot, it looks like a full machine reset occurs and
> > > > qemu recopies the gpxe rom, so that doesn't seem to be an issue.
> > > > 
> > > Do we have the same problem with tpr patching rom (vapic,bin)? It modifies
> > > itself too.
> > 
> > I don't know, but I wouldn't think so.  The issue is only if the
> > option rom init code doesn't like getting run twice.  (Gpxe allocates
> If rom modifies itself its checksum changes so SeaBIOS thinks that rom
> is invalid and does not call its init code second time. Is this correct?

An option rom that modifies itself is required to update its checksum
before returning to the bios.

If the vapic is modified without updating the checksum then SeaBIOS
won't execute its init vector.  I'm guessing that isn't really a
problem, though.
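
For reference, the legacy option ROM rule being discussed is that the image starts with the bytes 0x55 0xAA, byte 2 gives the length in 512-byte units, and all bytes within that length sum to zero modulo 256. The sketch below is not taken from SeaBIOS or gPXE. The function names are made up, and putting the correction in the last byte of the image is only this example's choice, since a real ROM decides where its checksum byte lives.

```python
def rom_length(image):
    """Length in bytes declared by the ROM header (byte 2, in 512-byte units)."""
    if len(image) < 3 or image[0] != 0x55 or image[1] != 0xAA:
        raise ValueError("missing 0x55 0xAA option ROM signature")
    return image[2] * 512


def checksum_ok(image):
    """True when the bytes covered by the declared length sum to 0 mod 256."""
    length = rom_length(image)
    if length == 0 or length > len(image):
        return False
    return sum(image[:length]) % 256 == 0


def fix_checksum(image, fix_offset=None):
    """Adjust one byte so the image sums to zero again.

    Which byte carries the correction is up to the ROM; using the last byte
    of the declared length is only this example's choice.
    """
    length = rom_length(image)
    if fix_offset is None:
        fix_offset = length - 1
    image[fix_offset] = 0
    image[fix_offset] = (-sum(image[:length])) % 256


rom = bytearray(1024)
rom[0:3] = b"\x55\xaa\x02"   # signature, then a length of 2 * 512 bytes
fix_checksum(rom)
print(checksum_ok(rom))      # valid after the initial fix-up

rom[0x40] ^= 0xFF            # the ROM patches itself...
print(checksum_ok(rom))      # ...and the stale checksum no longer matches

fix_checksum(rom)
print(checksum_ok(rom))      # recomputed before returning to the BIOS
```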

-Kevin
Kevin O'Connor - Nov. 20, 2009, 10:39 p.m.
On Sat, Nov 14, 2009 at 02:47:45PM -0500, Kevin O'Connor wrote:
> On Thu, Nov 12, 2009 at 01:20:58PM +0200, Naphtali Sprei wrote:
> > I've found a problem with the usage of SeaBIOS/gPXE in Qemu.  The
> > scenario is when failing to boot from network and falling back to
> > booting from hard-disk (-boot nc).  The cause of the problem is that
> > both SeaBIOS and gPXE (in it's installation phase) uses same stack
> > area, 0x7c00.  The gPXE code corrupts the SeaBIOS stack, so when
> > gPXE returns to SeaBIOS chaos occurs.
> > 
> > Output: "qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000eb300000"
> 
> Thanks for reporting this.
> 
> We can move the SeaBIOS stack, but it's not clear to me where to move
> it to.

I don't think this is a SeaBIOS bug, but in an effort to move forward,
I've moved the SeaBIOS stack from 0x7c00 to 0x7000.  Commit 494dfc6e.

-Kevin
Kevin O'Connor - Nov. 21, 2009, 12:47 a.m.
On Tue, Nov 17, 2009 at 03:23:46PM +0200, Avi Kivity wrote:
> On 11/17/2009 04:26 AM, Kevin O'Connor wrote:
>> On Mon, Nov 16, 2009 at 04:02:20PM +0200, Avi Kivity wrote:
>>>> Something that is likely related, I am seeing reboot failures in
>>>> seabios's pmm_free.  Immediately after loading gpxe, seabios is in an
>>>> endless loop there, likely due to memory corruption.
>>>>
>>>> This is with -smp 2, rebooting Fedora 9 after installation.
>>>>        
>>> With gpxe disabled, rebooting works as expected.
>>>
>>> Note the tests were performed with the stack at 64K to avoid triggering
>>> the known issue.
>>>      
>> Hi Avi,
>>
>> Can you send the full qemu command line that you used?  I can't seem
>> to reproduce this on my setup.

Hi Avi,

I'm still unable to reproduce this Fedora reboot failure.

> Example command line is

That command looks like the vista cdrom scenario.  To try and
reproduce the fedora reboot issue, I checked out 51a8ac6f of qemu-kvm
and ran:

qemu-img create -f qcow2 test-fc9 10G

qemu-system-x86_64 -hda test-fc9 -cdrom ../../iso/Fedora-9-i386-DVD/Fedora-9-i386-DVD.iso -m 512 -smp 2

This was with the bios.bin that came with 51a8ac6f.  I selected all
default options (except I disabled "Software productivity tools" to
make the install go faster).  The machine rebooted okay after the
install.

Can you retry this with the latest seabios git?  If you are able to
reproduce, can you set CONFIG_DEBUG_LEVEL to 8 and post the log?
Maybe something in the log will help.

Thanks,
-Kevin
Avi Kivity - Nov. 29, 2009, 10:58 a.m.
On 11/21/2009 02:47 AM, Kevin O'Connor wrote:
>
> Can you retry this with the latest seabios git.  If you are able to
> reproduce, can you set CONFIG_DEBUG_LEVEL to 8 and post the log?
> Maybe something in the log will help.
>    

With current seabios.git the problem is resolved.

Patch

diff --git a/src/arch/i386/prefix/pxeprefix.S b/src/arch/i386/prefix/pxeprefix.S
index b541e4b..11dd45d 100644
--- a/src/arch/i386/prefix/pxeprefix.S
+++ b/src/arch/i386/prefix/pxeprefix.S
@@ -47,7 +47,7 @@  FILE_LICENCE ( GPL2_OR_LATER )
 	/* Set up stack just below 0x7c00 */
 	xorw	%ax, %ax
 	movw	%ax, %ss
-	movl	$0x7c00, %esp
+	movl	$0x7a00, %esp
 	/* Clear direction flag, for the sake of sanity */
 	cld
 	/* Print welcome message */
diff --git a/src/arch/i386/prefix/romprefix.S b/src/arch/i386/prefix/romprefix.S
index cb474e8..93f3f17 100644
--- a/src/arch/i386/prefix/romprefix.S
+++ b/src/arch/i386/prefix/romprefix.S
@@ -587,7 +587,7 @@  exec:	/* Set %ds = %cs */
 	/* Obtain a reasonably-sized temporary stack */
 	xorw	%ax, %ax
 	movw	%ax, %ss
-	movw	$0x7c00, %sp
+	movw	$0x7a00, %sp
 
 	/* Install gPXE */
 	movl	image_source, %esi