diff mbox

piix: do not reset APIC base address (0x80) on piix4_reset.

Message ID 1386753670-11238-1-git-send-email-ghammer@redhat.com
State New
Headers show

Commit Message

Gal Hammer Dec. 11, 2013, 9:21 a.m. UTC
Fix a bug that was introduced in commit c046e8c4. QEMU fails to
resume from suspend mode (S3).

Signed-off-by: Gal Hammer <ghammer@redhat.com>
---
 hw/acpi/piix4.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Paolo Bonzini Dec. 11, 2013, 10:23 a.m. UTC | #1
Il 11/12/2013 10:21, Gal Hammer ha scritto:
> Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> resume from suspend mode (S3).
> 
> Signed-off-by: Gal Hammer <ghammer@redhat.com>
> ---
>  hw/acpi/piix4.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 93849c8..5c736a4 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
>      pci_conf[0x5b] = 0;
>  
>      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> -    pci_conf[0x80] = 0;
>  
>      if (s->kvm_enabled) {
>          /* Mark SMM as already inited (until KVM supports SMM). */
> 

Cc: qemu-stable@nongnu.org
Michael S. Tsirkin Dec. 11, 2013, 10:44 a.m. UTC | #2
On Wed, Dec 11, 2013 at 11:23:27AM +0100, Paolo Bonzini wrote:
> Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > resume from suspend mode (S3).
> > 
> > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > ---
> >  hw/acpi/piix4.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > index 93849c8..5c736a4 100644
> > --- a/hw/acpi/piix4.c
> > +++ b/hw/acpi/piix4.c
> > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> >      pci_conf[0x5b] = 0;
> >  
> >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > -    pci_conf[0x80] = 0;
> >  
> >      if (s->kvm_enabled) {
> >          /* Mark SMM as already inited (until KVM supports SMM). */
> > 
> 
> Cc: qemu-stable@nongnu.org

It's good to know this helps but I don't think we can apply
it as is without figuring out why,
otherwise it might break something else.
Gal Hammer Dec. 11, 2013, 11:04 a.m. UTC | #3
Michael,

True, I haven't figure it out yet, but the current status is that recover from sleep doesn't work.

As far as I can tell it could be either:

1. piix4_reset shouldn't be call on resume.
2. memory_region_set_enabled (called in pm_io_space_update) shouldn't use config[0x80].
3. the config[0x80] shouldn't be zero in piix4_reset (current solution).
4. something else?

I'm not well familiar with the PIIX4 emulation and your help will be appreciated.

Thanks,

    Gal.

----- Original Message -----
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Paolo Bonzini" <pbonzini@redhat.com>
Cc: "Gal Hammer" <ghammer@redhat.com>, qemu-devel@nongnu.org, qemu-stable@nongnu.org
Sent: Wednesday, December 11, 2013 12:44:37 PM
Subject: Re: [PATCH] piix: do not reset APIC base address (0x80) on piix4_reset.

On Wed, Dec 11, 2013 at 11:23:27AM +0100, Paolo Bonzini wrote:
> Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > resume from suspend mode (S3).
> > 
> > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > ---
> >  hw/acpi/piix4.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > index 93849c8..5c736a4 100644
> > --- a/hw/acpi/piix4.c
> > +++ b/hw/acpi/piix4.c
> > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> >      pci_conf[0x5b] = 0;
> >  
> >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > -    pci_conf[0x80] = 0;
> >  
> >      if (s->kvm_enabled) {
> >          /* Mark SMM as already inited (until KVM supports SMM). */
> > 
> 
> Cc: qemu-stable@nongnu.org

It's good to know this helps but I don't think we can apply
it as is without figuring out why,
otherwise it might break something else.
Paolo Bonzini Dec. 18, 2013, 2:19 p.m. UTC | #4
Il 11/12/2013 12:04, Gal Hammer ha scritto:
> Michael,
> 
> True, I haven't figure it out yet, but the current status is that recover from sleep doesn't work.
> 
> As far as I can tell it could be either:
> 
> 1. piix4_reset shouldn't be call on resume.
> 2. memory_region_set_enabled (called in pm_io_space_update) shouldn't use config[0x80].
> 3. the config[0x80] shouldn't be zero in piix4_reset (current solution).
> 4. something else?
> 
> I'm not well familiar with the PIIX4 emulation and your help will be appreciated.

The datasheet says that config[0x80] is reset to 0.

The PIIX spec says that during S3 the chipset provides "Shadow registers
for standard AT write only registers to save and restore system state
information"  These are just for the 825x (DMA controller, PIC, PIT).
We do not emulate them and our BIOS does not support them.

I was told that a few memory controller registers survive S3, which in
our case would be the i440FX's PAM registers, but I don't think this
register should be one of them.

What guest is breaking and how?  Does the guest usually initialize this
register, or does the firmware (SeaBIOS) do that?  If the latter, this
could be a SeaBIOS bug instead.

Paolo
Paolo Bonzini Dec. 18, 2013, 2:22 p.m. UTC | #5
Il 11/12/2013 10:21, Gal Hammer ha scritto:
> Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> resume from suspend mode (S3).
> 
> Signed-off-by: Gal Hammer <ghammer@redhat.com>
> ---
>  hw/acpi/piix4.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 93849c8..5c736a4 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
>      pci_conf[0x5b] = 0;
>  
>      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> -    pci_conf[0x80] = 0;
>  
>      if (s->kvm_enabled) {
>          /* Mark SMM as already inited (until KVM supports SMM). */

Note this is not the APIC base address, that one is 80h on the ISA
bridge (function 0).  You're changing the behavior for 80h on the power
management function, which is function 3.  The register is "PMBA—POWER
MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
piix4_pm_setup (src/fw/pciinit.c).

Michael, perhaps a part of pci_setup (same file) should run on S3 resume?

Paolo
Gal Hammer Dec. 18, 2013, 3:16 p.m. UTC | #6
On 18/12/2013 16:19, Paolo Bonzini wrote:

> The PIIX spec says that during S3 the chipset provides "Shadow registers
> for standard AT write only registers to save and restore system state
> information"  These are just for the 825x (DMA controller, PIC, PIT).
> We do not emulate them and our BIOS does not support them.
>
> I was told that a few memory controller registers survive S3, which in
> our case would be the i440FX's PAM registers, but I don't think this
> register should be one of them.
>
> What guest is breaking and how?  Does the guest usually initialize this
> register, or does the firmware (SeaBIOS) do that?  If the latter, this
> could be a SeaBIOS bug instead.

Both Windows and Linux guests are breaking when system is suspend. On 
system wakeup nothing occurs and the OS is not restored.

I don't know the answer for the remaining questions.

Thanks,

     Gal.
Michael S. Tsirkin Dec. 18, 2013, 3:22 p.m. UTC | #7
On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > resume from suspend mode (S3).
> > 
> > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > ---
> >  hw/acpi/piix4.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > index 93849c8..5c736a4 100644
> > --- a/hw/acpi/piix4.c
> > +++ b/hw/acpi/piix4.c
> > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> >      pci_conf[0x5b] = 0;
> >  
> >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > -    pci_conf[0x80] = 0;
> >  
> >      if (s->kvm_enabled) {
> >          /* Mark SMM as already inited (until KVM supports SMM). */
> 
> Note this is not the APIC base address, that one is 80h on the ISA
> bridge (function 0).  You're changing the behavior for 80h on the power
> management function, which is function 3.  The register is "PMBA—POWER
> MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> piix4_pm_setup (src/fw/pciinit.c).
> 
> Michael, perhaps a part of pci_setup (same file) should run on S3 resume?
> 
> Paolo

Seems reasonable: either seabios or guest OS must do it, and
guest does not seem to.
Gal Hammer Dec. 18, 2013, 3:59 p.m. UTC | #8
On 18/12/2013 16:22, Paolo Bonzini wrote:
> Il 11/12/2013 10:21, Gal Hammer ha scritto:
>> Fix a bug that was introduced in commit c046e8c4. QEMU fails to
>> resume from suspend mode (S3).
>>
>> Signed-off-by: Gal Hammer <ghammer@redhat.com>
>> ---
>>   hw/acpi/piix4.c | 1 -
>>   1 file changed, 1 deletion(-)
>>
>> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
>> index 93849c8..5c736a4 100644
>> --- a/hw/acpi/piix4.c
>> +++ b/hw/acpi/piix4.c
>> @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
>>       pci_conf[0x5b] = 0;
>>
>>       pci_conf[0x40] = 0x01; /* PM io base read only bit */
>> -    pci_conf[0x80] = 0;
>>
>>       if (s->kvm_enabled) {
>>           /* Mark SMM as already inited (until KVM supports SMM). */
>
> Note this is not the APIC base address, that one is 80h on the ISA
> bridge (function 0).  You're changing the behavior for 80h on the power
> management function, which is function 3.  The register is "PMBA—POWER
> MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> piix4_pm_setup (src/fw/pciinit.c).

I think we both made a mistake and the right name is 
"PMREGMISC—MISCELLANEOUS POWER MANAGEMENT (FUNCTION 3)" :-).

> Michael, perhaps a part of pci_setup (same file) should run on S3 resume?
>
> Paolo
>

     Gal.
Paolo Bonzini Dec. 18, 2013, 4 p.m. UTC | #9
Il 18/12/2013 16:59, Gal Hammer ha scritto:
>>
>> Note this is not the APIC base address, that one is 80h on the ISA
>> bridge (function 0).  You're changing the behavior for 80h on the power
>> management function, which is function 3.  The register is "PMBA—POWER
>> MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
>> piix4_pm_setup (src/fw/pciinit.c).
> 
> I think we both made a mistake and the right name is
> "PMREGMISC—MISCELLANEOUS POWER MANAGEMENT (FUNCTION 3)" :-).

Yeah. :)  Still, both PMBA and PMREGMISC need to be initialized by SeaBIOS.

Paolo
Marcel Apfelbaum Dec. 18, 2013, 4:27 p.m. UTC | #10
On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > resume from suspend mode (S3).
> > > 
> > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > ---
> > >  hw/acpi/piix4.c | 1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > index 93849c8..5c736a4 100644
> > > --- a/hw/acpi/piix4.c
> > > +++ b/hw/acpi/piix4.c
> > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > >      pci_conf[0x5b] = 0;
> > >  
> > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > -    pci_conf[0x80] = 0;
> > >  
> > >      if (s->kvm_enabled) {
> > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > 
> > Note this is not the APIC base address, that one is 80h on the ISA
> > bridge (function 0).  You're changing the behavior for 80h on the power
> > management function, which is function 3.  The register is "PMBA—POWER
> > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > piix4_pm_setup (src/fw/pciinit.c).
> > 
> > Michael, perhaps a part of pci_setup (same file) should run on S3 resume?
> > 
> > Paolo
> 
> Seems reasonable: either seabios or guest OS must do it, and
> guest does not seem to.
I was looking into this today, but it seems that we have a problem.
We cannot run pci_setup() in init section:
.data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']

Any thoughts how to get around this?
Thanks,
Marcel

>
Michael S. Tsirkin Dec. 18, 2013, 4:33 p.m. UTC | #11
On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
> On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> > On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > > resume from suspend mode (S3).
> > > > 
> > > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > > ---
> > > >  hw/acpi/piix4.c | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > > 
> > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > > index 93849c8..5c736a4 100644
> > > > --- a/hw/acpi/piix4.c
> > > > +++ b/hw/acpi/piix4.c
> > > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > > >      pci_conf[0x5b] = 0;
> > > >  
> > > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > > -    pci_conf[0x80] = 0;
> > > >  
> > > >      if (s->kvm_enabled) {
> > > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > > 
> > > Note this is not the APIC base address, that one is 80h on the ISA
> > > bridge (function 0).  You're changing the behavior for 80h on the power
> > > management function, which is function 3.  The register is "PMBA—POWER
> > > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > > piix4_pm_setup (src/fw/pciinit.c).
> > > 
> > > Michael, perhaps a part of pci_setup (same file) should run on S3 resume?
> > > 
> > > Paolo
> > 
> > Seems reasonable: either seabios or guest OS must do it, and
> > guest does not seem to.
> I was looking into this today, but it seems that we have a problem.
> We cannot run pci_setup() in init section:
> .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
> 
> Any thoughts how to get around this?
> Thanks,
> Marcel

We defintely don't want to do full pci enumeration.
Just pci_bios_init_platform or even less.

> > 
> 
>
Paolo Bonzini Dec. 18, 2013, 4:34 p.m. UTC | #12
----- Messaggio originale -----
> Da: "Michael S. Tsirkin" <mst@redhat.com>
> A: "marcel a" <marcel.a@redhat.com>
> Cc: "Paolo Bonzini" <pbonzini@redhat.com>, "Gal Hammer" <ghammer@redhat.com>, seabios@seabios.org,
> qemu-devel@nongnu.org
> Inviato: Mercoledì, 18 dicembre 2013 17:33:06
> Oggetto: Re: [Qemu-devel] [PATCH] piix: do not reset APIC base address (0x80) on piix4_reset.
> 
> On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
> > On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> > > On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > > > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > > > resume from suspend mode (S3).
> > > > > 
> > > > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > > > ---
> > > > >  hw/acpi/piix4.c | 1 -
> > > > >  1 file changed, 1 deletion(-)
> > > > > 
> > > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > > > index 93849c8..5c736a4 100644
> > > > > --- a/hw/acpi/piix4.c
> > > > > +++ b/hw/acpi/piix4.c
> > > > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > > > >      pci_conf[0x5b] = 0;
> > > > >  
> > > > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > > > -    pci_conf[0x80] = 0;
> > > > >  
> > > > >      if (s->kvm_enabled) {
> > > > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > > > 
> > > > Note this is not the APIC base address, that one is 80h on the ISA
> > > > bridge (function 0).  You're changing the behavior for 80h on the power
> > > > management function, which is function 3.  The register is "PMBA—POWER
> > > > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > > > piix4_pm_setup (src/fw/pciinit.c).
> > > > 
> > > > Michael, perhaps a part of pci_setup (same file) should run on S3
> > > > resume?
> > > > 
> > > > Paolo
> > > 
> > > Seems reasonable: either seabios or guest OS must do it, and
> > > guest does not seem to.
> > I was looking into this today, but it seems that we have a problem.
> > We cannot run pci_setup() in init section:
> > .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from
> > ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
> > 
> > Any thoughts how to get around this?
> > Thanks,
> > Marcel
> 
> We defintely don't want to do full pci enumeration.
> Just pci_bios_init_platform or even less.

Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
to what UEFI does.

Paolo
Marcel Apfelbaum Dec. 18, 2013, 4:49 p.m. UTC | #13
On Wed, 2013-12-18 at 18:33 +0200, Michael S. Tsirkin wrote:
> On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
> > On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> > > On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > > > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > > > resume from suspend mode (S3).
> > > > > 
> > > > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > > > ---
> > > > >  hw/acpi/piix4.c | 1 -
> > > > >  1 file changed, 1 deletion(-)
> > > > > 
> > > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > > > index 93849c8..5c736a4 100644
> > > > > --- a/hw/acpi/piix4.c
> > > > > +++ b/hw/acpi/piix4.c
> > > > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > > > >      pci_conf[0x5b] = 0;
> > > > >  
> > > > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > > > -    pci_conf[0x80] = 0;
> > > > >  
> > > > >      if (s->kvm_enabled) {
> > > > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > > > 
> > > > Note this is not the APIC base address, that one is 80h on the ISA
> > > > bridge (function 0).  You're changing the behavior for 80h on the power
> > > > management function, which is function 3.  The register is "PMBA—POWER
> > > > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > > > piix4_pm_setup (src/fw/pciinit.c).
> > > > 
> > > > Michael, perhaps a part of pci_setup (same file) should run on S3 resume?
> > > > 
> > > > Paolo
> > > 
> > > Seems reasonable: either seabios or guest OS must do it, and
> > > guest does not seem to.
> > I was looking into this today, but it seems that we have a problem.
> > We cannot run pci_setup() in init section:
> > .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
> > 
> > Any thoughts how to get around this?
> > Thanks,
> > Marcel
> 
> We defintely don't want to do full pci enumeration.
> Just pci_bios_init_platform or even less.
The problem still remains, we have to use pci_bios_init_device that
in turn uses the PCIDevices list.

Thanks,
Marcel
> 
> > > 
> > 
> >
Marcel Apfelbaum Dec. 18, 2013, 4:55 p.m. UTC | #14
On Wed, 2013-12-18 at 11:34 -0500, Paolo Bonzini wrote:
> 
> ----- Messaggio originale -----
> > Da: "Michael S. Tsirkin" <mst@redhat.com>
> > A: "marcel a" <marcel.a@redhat.com>
> > Cc: "Paolo Bonzini" <pbonzini@redhat.com>, "Gal Hammer" <ghammer@redhat.com>, seabios@seabios.org,
> > qemu-devel@nongnu.org
> > Inviato: Mercoledì, 18 dicembre 2013 17:33:06
> > Oggetto: Re: [Qemu-devel] [PATCH] piix: do not reset APIC base address (0x80) on piix4_reset.
> > 
> > On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
> > > On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > > > > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > > > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > > > > resume from suspend mode (S3).
> > > > > > 
> > > > > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > > > > ---
> > > > > >  hw/acpi/piix4.c | 1 -
> > > > > >  1 file changed, 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > > > > index 93849c8..5c736a4 100644
> > > > > > --- a/hw/acpi/piix4.c
> > > > > > +++ b/hw/acpi/piix4.c
> > > > > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > > > > >      pci_conf[0x5b] = 0;
> > > > > >  
> > > > > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > > > > -    pci_conf[0x80] = 0;
> > > > > >  
> > > > > >      if (s->kvm_enabled) {
> > > > > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > > > > 
> > > > > Note this is not the APIC base address, that one is 80h on the ISA
> > > > > bridge (function 0).  You're changing the behavior for 80h on the power
> > > > > management function, which is function 3.  The register is "PMBA—POWER
> > > > > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > > > > piix4_pm_setup (src/fw/pciinit.c).
> > > > > 
> > > > > Michael, perhaps a part of pci_setup (same file) should run on S3
> > > > > resume?
> > > > > 
> > > > > Paolo
> > > > 
> > > > Seems reasonable: either seabios or guest OS must do it, and
> > > > guest does not seem to.
> > > I was looking into this today, but it seems that we have a problem.
> > > We cannot run pci_setup() in init section:
> > > .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from
> > > ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
> > > 
> > > Any thoughts how to get around this?
> > > Thanks,
> > > Marcel
> > 
> > We defintely don't want to do full pci enumeration.
> > Just pci_bios_init_platform or even less.
> 
> Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> to what UEFI does.
Could you please elaborate a little more?
Let me see first if I understand the problem:
PciDevices list is a list of pointers that cannot be used
inside init code which is 16 bit code, right?

If I got it right, the solution is to find another way to iterate
over pci devices? If yes, *when* would this data be put in memory and "when"?
A pointer to the right direction would be appreciated,

Thanks!
Marcel

> 
> Paolo
>
Michael S. Tsirkin Dec. 18, 2013, 4:58 p.m. UTC | #15
On Wed, Dec 18, 2013 at 11:34:14AM -0500, Paolo Bonzini wrote:
> 
> 
> ----- Messaggio originale -----
> > Da: "Michael S. Tsirkin" <mst@redhat.com>
> > A: "marcel a" <marcel.a@redhat.com>
> > Cc: "Paolo Bonzini" <pbonzini@redhat.com>, "Gal Hammer" <ghammer@redhat.com>, seabios@seabios.org,
> > qemu-devel@nongnu.org
> > Inviato: Mercoledì, 18 dicembre 2013 17:33:06
> > Oggetto: Re: [Qemu-devel] [PATCH] piix: do not reset APIC base address (0x80) on piix4_reset.
> > 
> > On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
> > > On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > > > > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > > > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > > > > resume from suspend mode (S3).
> > > > > > 
> > > > > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > > > > ---
> > > > > >  hw/acpi/piix4.c | 1 -
> > > > > >  1 file changed, 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > > > > index 93849c8..5c736a4 100644
> > > > > > --- a/hw/acpi/piix4.c
> > > > > > +++ b/hw/acpi/piix4.c
> > > > > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > > > > >      pci_conf[0x5b] = 0;
> > > > > >  
> > > > > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > > > > -    pci_conf[0x80] = 0;
> > > > > >  
> > > > > >      if (s->kvm_enabled) {
> > > > > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > > > > 
> > > > > Note this is not the APIC base address, that one is 80h on the ISA
> > > > > bridge (function 0).  You're changing the behavior for 80h on the power
> > > > > management function, which is function 3.  The register is "PMBA—POWER
> > > > > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > > > > piix4_pm_setup (src/fw/pciinit.c).
> > > > > 
> > > > > Michael, perhaps a part of pci_setup (same file) should run on S3
> > > > > resume?
> > > > > 
> > > > > Paolo
> > > > 
> > > > Seems reasonable: either seabios or guest OS must do it, and
> > > > guest does not seem to.
> > > I was looking into this today, but it seems that we have a problem.
> > > We cannot run pci_setup() in init section:
> > > .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from
> > > ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
> > > 
> > > Any thoughts how to get around this?
> > > Thanks,
> > > Marcel
> > 
> > We defintely don't want to do full pci enumeration.
> > Just pci_bios_init_platform or even less.
> 
> Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> to what UEFI does.
> 
> Paolo

Yes, it's an option. Though reworking pci_bios_init_devices so that it
can work from low memory seems easier.
Michael S. Tsirkin Dec. 18, 2013, 5:21 p.m. UTC | #16
On Wed, Dec 18, 2013 at 06:55:24PM +0200, Marcel Apfelbaum wrote:
> On Wed, 2013-12-18 at 11:34 -0500, Paolo Bonzini wrote:
> > 
> > ----- Messaggio originale -----
> > > Da: "Michael S. Tsirkin" <mst@redhat.com>
> > > A: "marcel a" <marcel.a@redhat.com>
> > > Cc: "Paolo Bonzini" <pbonzini@redhat.com>, "Gal Hammer" <ghammer@redhat.com>, seabios@seabios.org,
> > > qemu-devel@nongnu.org
> > > Inviato: Mercoledì, 18 dicembre 2013 17:33:06
> > > Oggetto: Re: [Qemu-devel] [PATCH] piix: do not reset APIC base address (0x80) on piix4_reset.
> > > 
> > > On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
> > > > On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
> > > > > On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
> > > > > > Il 11/12/2013 10:21, Gal Hammer ha scritto:
> > > > > > > Fix a bug that was introduced in commit c046e8c4. QEMU fails to
> > > > > > > resume from suspend mode (S3).
> > > > > > > 
> > > > > > > Signed-off-by: Gal Hammer <ghammer@redhat.com>
> > > > > > > ---
> > > > > > >  hw/acpi/piix4.c | 1 -
> > > > > > >  1 file changed, 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > > > > > > index 93849c8..5c736a4 100644
> > > > > > > --- a/hw/acpi/piix4.c
> > > > > > > +++ b/hw/acpi/piix4.c
> > > > > > > @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
> > > > > > >      pci_conf[0x5b] = 0;
> > > > > > >  
> > > > > > >      pci_conf[0x40] = 0x01; /* PM io base read only bit */
> > > > > > > -    pci_conf[0x80] = 0;
> > > > > > >  
> > > > > > >      if (s->kvm_enabled) {
> > > > > > >          /* Mark SMM as already inited (until KVM supports SMM). */
> > > > > > 
> > > > > > Note this is not the APIC base address, that one is 80h on the ISA
> > > > > > bridge (function 0).  You're changing the behavior for 80h on the power
> > > > > > management function, which is function 3.  The register is "PMBA—POWER
> > > > > > MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
> > > > > > piix4_pm_setup (src/fw/pciinit.c).
> > > > > > 
> > > > > > Michael, perhaps a part of pci_setup (same file) should run on S3
> > > > > > resume?
> > > > > > 
> > > > > > Paolo
> > > > > 
> > > > > Seems reasonable: either seabios or guest OS must do it, and
> > > > > guest does not seem to.
> > > > I was looking into this today, but it seems that we have a problem.
> > > > We cannot run pci_setup() in init section:
> > > > .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from
> > > > ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
> > > > 
> > > > Any thoughts how to get around this?
> > > > Thanks,
> > > > Marcel
> > > 
> > > We defintely don't want to do full pci enumeration.
> > > Just pci_bios_init_platform or even less.
> > 
> > Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> > fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> > to what UEFI does.
> Could you please elaborate a little more?
> Let me see first if I understand the problem:
> PciDevices list is a list of pointers that cannot be used
> inside init code which is 16 bit code, right?
> 
> If I got it right, the solution is to find another way to iterate
> over pci devices?

Yes, but I think foreachbdf does this in already.

> If yes, *when* would this data be put in memory and "when"?
> A pointer to the right direction would be appreciated,
> 
> Thanks!
> Marcel
> 
> > 
> > Paolo
> > 
> 
>
Laszlo Ersek Dec. 18, 2013, 9:57 p.m. UTC | #17
On 12/18/13 15:19, Paolo Bonzini wrote:
> Il 11/12/2013 12:04, Gal Hammer ha scritto:
>> Michael,
>>
>> True, I haven't figure it out yet, but the current status is that recover from sleep doesn't work.
>>
>> As far as I can tell it could be either:
>>
>> 1. piix4_reset shouldn't be call on resume.
>> 2. memory_region_set_enabled (called in pm_io_space_update) shouldn't use config[0x80].
>> 3. the config[0x80] shouldn't be zero in piix4_reset (current solution).
>> 4. something else?
>>
>> I'm not well familiar with the PIIX4 emulation and your help will be appreciated.
> 
> The datasheet says that config[0x80] is reset to 0.
> 
> The PIIX spec says that during S3 the chipset provides "Shadow registers
> for standard AT write only registers to save and restore system state
> information"  These are just for the 825x (DMA controller, PIC, PIT).
> We do not emulate them and our BIOS does not support them.
> 
> I was told that a few memory controller registers survive S3, which in
> our case would be the i440FX's PAM registers, but I don't think this
> register should be one of them.
> 
> What guest is breaking and how?  Does the guest usually initialize this
> register, or does the firmware (SeaBIOS) do that?  If the latter, this
> could be a SeaBIOS bug instead.

I tend to agree. The commit Gal identified (c046e8c4) is part of v1.7.0,
and Fedora 19 correctly resumes from S3 when using OVMF and running on
qemu v1.7.0 (an out-of-tree qemu fix is needed for setting the 32-bit
PCI hole, but that's irrelevant now.)

OVMF has a bit of code that runs after both cold boot and during S3
resume:

InitializePlatform() [OvmfPkg/PlatformPei/Platform.c]
  MiscInitialization()

It reads:

  //
  // If PMREGMISC/PMIOSE is set, assume the ACPI PMBA has been configured (for
  // example by Xen) and skip the setup here. This matches the logic in
  // AcpiTimerLibConstructor ().
  //
  if ((PciRead8 (PCI_LIB_ADDRESS (0, 1, 3, 0x80)) & 0x01) == 0) {
    //
    // The PEI phase should be exited with fully accessibe PIIX4 IO space:
    // 1. set PMBA
    //
    PciAndThenOr32 (
      PCI_LIB_ADDRESS (0, 1, 3, 0x40),
      (UINT32) ~0xFFC0,
      PcdGet16 (PcdAcpiPmBaseAddress)
      );

    //
    // 2. set PCICMD/IOSE
    //
    PciOr8 (
      PCI_LIB_ADDRESS (0, 1, 3, PCI_COMMAND_OFFSET),
      EFI_PCI_COMMAND_IO_SPACE
      );

    //
    // 3. set PMREGMISC/PMIOSE
    //
    PciOr8 (PCI_LIB_ADDRESS (0, 1, 3, 0x80), 0x01);
  }

From edk2 SVN revisions 13722 and 13723.

https://github.com/tianocore/edk2/commit/931a0c7
https://github.com/tianocore/edk2/commit/0e20a18

Laszlo
Laszlo Ersek Dec. 18, 2013, 10:10 p.m. UTC | #18
On 12/18/13 17:34, Paolo Bonzini wrote:
> 
> 
> ----- Messaggio originale -----
>> Da: "Michael S. Tsirkin" <mst@redhat.com>
>> A: "marcel a" <marcel.a@redhat.com>
>> Cc: "Paolo Bonzini" <pbonzini@redhat.com>, "Gal Hammer" <ghammer@redhat.com>, seabios@seabios.org,
>> qemu-devel@nongnu.org
>> Inviato: Mercoledì, 18 dicembre 2013 17:33:06
>> Oggetto: Re: [Qemu-devel] [PATCH] piix: do not reset APIC base address (0x80) on piix4_reset.
>>
>> On Wed, Dec 18, 2013 at 06:27:12PM +0200, Marcel Apfelbaum wrote:
>>> On Wed, 2013-12-18 at 17:22 +0200, Michael S. Tsirkin wrote:
>>>> On Wed, Dec 18, 2013 at 03:22:59PM +0100, Paolo Bonzini wrote:
>>>>> Il 11/12/2013 10:21, Gal Hammer ha scritto:
>>>>>> Fix a bug that was introduced in commit c046e8c4. QEMU fails to
>>>>>> resume from suspend mode (S3).
>>>>>>
>>>>>> Signed-off-by: Gal Hammer <ghammer@redhat.com>
>>>>>> ---
>>>>>>  hw/acpi/piix4.c | 1 -
>>>>>>  1 file changed, 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
>>>>>> index 93849c8..5c736a4 100644
>>>>>> --- a/hw/acpi/piix4.c
>>>>>> +++ b/hw/acpi/piix4.c
>>>>>> @@ -376,7 +376,6 @@ static void piix4_reset(void *opaque)
>>>>>>      pci_conf[0x5b] = 0;
>>>>>>  
>>>>>>      pci_conf[0x40] = 0x01; /* PM io base read only bit */
>>>>>> -    pci_conf[0x80] = 0;
>>>>>>  
>>>>>>      if (s->kvm_enabled) {
>>>>>>          /* Mark SMM as already inited (until KVM supports SMM). */
>>>>>
>>>>> Note this is not the APIC base address, that one is 80h on the ISA
>>>>> bridge (function 0).  You're changing the behavior for 80h on the power
>>>>> management function, which is function 3.  The register is "PMBA—POWER
>>>>> MANAGEMENT BASE ADDRESS" and it is indeed initialized by SeaBIOS in
>>>>> piix4_pm_setup (src/fw/pciinit.c).
>>>>>
>>>>> Michael, perhaps a part of pci_setup (same file) should run on S3
>>>>> resume?
>>>>>
>>>>> Paolo
>>>>
>>>> Seems reasonable: either seabios or guest OS must do it, and
>>>> guest does not seem to.
>>> I was looking into this today, but it seems that we have a problem.
>>> We cannot run pci_setup() in init section:
>>> .data.varinit.seabios/src/hw/pci.h.66 is VARVERIFY32INIT but used from
>>> ['.text.runtime.seabios/src/resume.c.150', '.text.pci_setup']
>>>
>>> Any thoughts how to get around this?
>>> Thanks,
>>> Marcel
>>
>> We defintely don't want to do full pci enumeration.
>> Just pci_bios_init_platform or even less.
> 
> Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> to what UEFI does.

What UEFI does is kind of a mess :)

PEI runs both during cold boot and S3 resume.

DXE runs only after cold boot, it is not reached during S3 resume. (The
DXE initial program loader is the last PEI module, and dependent on the
boot mode, it either loads the DXE core, or invokes the S3 resume vector.)

So, PEI itself can reinitialize whatever it wants. The S3 boot script
that you refer to above is a way for DXE drivers to stash a bunch of IO
/ PCI etc writes for the *next* PEI phase, ie. the one that runs when
resuming from S3. QemuVideoDxe (the driver in OVMF that configures
produces the GOP on top of Cirrus) is such a DXE driver.

In a nutshell,

1. SEC after cold boot
2. PEI after cold boot
2.5 DXE IPL PEIM loads DXE core
3. DXE after cold boot
4. BDS after cold boot
5. runtime (OSPM), normal entry
6. PEI after S3 resume
6.5 DXE IPL PEIM branches to S3 resume PEIM
7. runtime (OSPM), entry via S3 resume vector

Steps 2 and 6 are implemented by the same PEI code (it can branch
internally based on boot mode of course), and this PEI code contains the
excerpt that I posted a minute ago.

The S3 boot script is prepared during step 3 by various DXE drivers, and
it is saved into reserved / AcpiNVS RAM no later than entering 5. This
script is then replayed / interpreted by step 6.

So, if PEI must do something after S3 resume that is independent of any
DXE drivers, it can simply do it. The boot script is only necessary when
the S3 resume PEI actions (in step 6) need to depend on earlier actions
during DXE (step 3).

Thanks,
Laszlo
Laszlo Ersek Dec. 18, 2013, 10:16 p.m. UTC | #19
On 12/18/13 23:10, Laszlo Ersek wrote:

> 1. SEC after cold boot
> 2. PEI after cold boot
> 2.5 DXE IPL PEIM loads DXE core
> 3. DXE after cold boot
> 4. BDS after cold boot
> 5. runtime (OSPM), normal entry
> 6. PEI after S3 resume
> 6.5 DXE IPL PEIM branches to S3 resume PEIM
> 7. runtime (OSPM), entry via S3 resume vector

(Sorry I left out SEC from between 5 and 6, but it's not relevant now.)

Laszlo
Paolo Bonzini Dec. 18, 2013, 11:43 p.m. UTC | #20
Il 18/12/2013 23:10, Laszlo Ersek ha scritto:
> 
> So, if PEI must do something after S3 resume that is independent of any
> DXE drivers, it can simply do it. The boot script is only necessary when
> the S3 resume PEI actions (in step 6) need to depend on earlier actions
> during DXE (step 3).

In SeaBIOS, PEI is really almost nothing (it's just the contents of
src/resume.c), so all the setup needs to be placed in the "boot script".

For example I suspect that using INT 13h for disk I/O does not work
after S3.

Paolo
Kevin O'Connor Dec. 19, 2013, 4:06 p.m. UTC | #21
On Wed, Dec 18, 2013 at 06:55:24PM +0200, Marcel Apfelbaum wrote:
> On Wed, 2013-12-18 at 11:34 -0500, Paolo Bonzini wrote:
> > Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> > fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> > to what UEFI does.
> Could you please elaborate a little more?
> Let me see first if I understand the problem:
> PciDevices list is a list of pointers that cannot be used
> inside init code which is 16 bit code, right?

FYI, all the code at this point is 32bit code.  Both the SeaBIOS init
code (aka POST) and the SeaBIOS resume code run in 32bit mode.

The problem is that SeaBIOS has ownership of all ram during its
initialization phase, but it must release ownership during its runtime
phase.  (During the runtime phase, the OS has ownership of the bulk of
ram and SeaBIOS only has a tiny fraction that it reserves.)  The PCI
device cache that SeaBIOS builds is not placed in reserved memory, and
that's why it is marked as VARVERIFY32INIT.  It's to try and prevent
pointers that no longer point to valid memory from being accessed
after the init phase has completed.

The error it produces is correct - one must not access the pci_device
structs from the resume code in the current code.

> If I got it right, the solution is to find another way to iterate
> over pci devices? If yes, *when* would this data be put in memory and "when"?
> A pointer to the right direction would be appreciated,

A good question.  I'll take a look at Michael's patch.

-Kevin
Marcel Apfelbaum Dec. 19, 2013, 6:03 p.m. UTC | #22
On Thu, 2013-12-19 at 11:06 -0500, Kevin O'Connor wrote:
> On Wed, Dec 18, 2013 at 06:55:24PM +0200, Marcel Apfelbaum wrote:
> > On Wed, 2013-12-18 at 11:34 -0500, Paolo Bonzini wrote:
> > > Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> > > fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> > > to what UEFI does.
> > Could you please elaborate a little more?
> > Let me see first if I understand the problem:
> > PciDevices list is a list of pointers that cannot be used
> > inside init code which is 16 bit code, right?
> 
> FYI, all the code at this point is 32bit code.  Both the SeaBIOS init
> code (aka POST) and the SeaBIOS resume code run in 32bit mode.
> 
> The problem is that SeaBIOS has ownership of all ram during its
> initialization phase, but it must release ownership during its runtime
> phase.  (During the runtime phase, the OS has ownership of the bulk of
> ram and SeaBIOS only has a tiny fraction that it reserves.)  The PCI
> device cache that SeaBIOS builds is not placed in reserved memory, and
> that's why it is marked as VARVERIFY32INIT.  It's to try and prevent
> pointers that no longer point to valid memory from being accessed
> after the init phase has completed.
> 
> The error it produces is correct - one must not access the pci_device
> structs from the resume code in the current code.
Thank you Kevin for the detailed explanation! By the way, do you know
how this fraction is allocated by Seabios and how can one "decide" to move
the device cache to this region reserved by the BIOS ? (not that I want to,
but to understand how Seabios does this)

Thanks again!
Marcel

> 
> > If I got it right, the solution is to find another way to iterate
> > over pci devices? If yes, *when* would this data be put in memory and "when"?
> > A pointer to the right direction would be appreciated,
> 
> A good question.  I'll take a look at Michael's patch.

> 
> -Kevin
Kevin O'Connor Dec. 19, 2013, 6:17 p.m. UTC | #23
On Thu, Dec 19, 2013 at 08:03:15PM +0200, Marcel Apfelbaum wrote:
> On Thu, 2013-12-19 at 11:06 -0500, Kevin O'Connor wrote:
> > On Wed, Dec 18, 2013 at 06:55:24PM +0200, Marcel Apfelbaum wrote:
> > > On Wed, 2013-12-18 at 11:34 -0500, Paolo Bonzini wrote:
> > > > Or put an array of (bdf, offset, size, value) tuples somewhere in low memory,
> > > > fill it at startup, and reproduce it blindly at S3 resume time.  This is similar
> > > > to what UEFI does.
> > > Could you please elaborate a little more?
> > > Let me see first if I understand the problem:
> > > PciDevices list is a list of pointers that cannot be used
> > > inside init code which is 16 bit code, right?
> > 
> > FYI, all the code at this point is 32bit code.  Both the SeaBIOS init
> > code (aka POST) and the SeaBIOS resume code run in 32bit mode.
> > 
> > The problem is that SeaBIOS has ownership of all ram during its
> > initialization phase, but it must release ownership during its runtime
> > phase.  (During the runtime phase, the OS has ownership of the bulk of
> > ram and SeaBIOS only has a tiny fraction that it reserves.)  The PCI
> > device cache that SeaBIOS builds is not placed in reserved memory, and
> > that's why it is marked as VARVERIFY32INIT.  It's to try and prevent
> > pointers that no longer point to valid memory from being accessed
> > after the init phase has completed.
> > 
> > The error it produces is correct - one must not access the pci_device
> > structs from the resume code in the current code.
> Thank you Kevin for the detailed explanation! By the way, do you know
> how this fraction is allocated by Seabios and how can one "decide" to move
> the device cache to this region reserved by the BIOS ? (not that I want to,
> but to understand how Seabios does this)

In pci.c:pci_probe_devices(), you'll see that it calls malloc_tmp() to
allocate the struct pci_device.  That allocation function takes memory
from ram that will eventually be given back to the OS.  To make it not
do that, one would need to choose one of the reserved zones (ie,
malloc_fseg, malloc_low, or malloc_high).  There is some freedom in
the choice of zones - malloc_fseg would probably be the simplest to
use.

However, as you suspected, I don't think just allocating in a reserved
zone is the right thing to do.  Caching all the PCI devices after init
could lead to confusion as the devices cached may not be present later
on or have different parameters (eg, due to hotplug).

-Kevin
diff mbox

Patch

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 93849c8..5c736a4 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -376,7 +376,6 @@  static void piix4_reset(void *opaque)
     pci_conf[0x5b] = 0;
 
     pci_conf[0x40] = 0x01; /* PM io base read only bit */
-    pci_conf[0x80] = 0;
 
     if (s->kvm_enabled) {
         /* Mark SMM as already inited (until KVM supports SMM). */