Patchwork [v2,3/5] Qemu: do not mark bios readonly

login
register
mail settings
Submitter Xiao Guangrong
Date Oct. 25, 2012, 9:22 a.m.
Message ID <508904C4.7030409@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/194085/
State New
Headers show

Comments

Xiao Guangrong - Oct. 25, 2012, 9:22 a.m.
In isapc, no i440x device exists in guest that means seabios can not
make 0xc0000 to 0x1000000 writable

It works fine in current code since the guest can happily write readonly
memory. In order to support readonly slot in Qemu, we do not make the bios
readonly anymore

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 hw/pc_sysfw.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)
Jan Kiszka - Oct. 26, 2012, 10:35 a.m.
On 2012-10-25 11:22, Xiao Guangrong wrote:
> In isapc, no i440x device exists in guest that means seabios can not
> make 0xc0000 to 0x1000000 writable
> 
> It works fine in current code since the guest can happily write readonly
> memory. In order to support readonly slot in Qemu, we do not make the bios
> readonly anymore
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  hw/pc_sysfw.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/pc_sysfw.c b/hw/pc_sysfw.c
> index b45f0ac..2d56fc7 100644
> --- a/hw/pc_sysfw.c
> +++ b/hw/pc_sysfw.c
> @@ -156,7 +156,6 @@ static void old_pc_system_rom_init(MemoryRegion *rom_memory)
>      bios = g_malloc(sizeof(*bios));
>      memory_region_init_ram(bios, "pc.bios", bios_size);
>      vmstate_register_ram_global(bios);
> -    memory_region_set_readonly(bios, true);
>      ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
>      if (ret != 0) {
>      bios_error:
> @@ -179,7 +178,6 @@ static void old_pc_system_rom_init(MemoryRegion *rom_memory)
>                                          0x100000 - isa_bios_size,
>                                          isa_bios,
>                                          1);
> -    memory_region_set_readonly(isa_bios, true);
> 
>      /* map all the bios at the top of memory */
>      memory_region_add_subregion(rom_memory,
> 

This has two problems: We know it breaks at least Win 95 that overwrites
its F-segment during boot. And it applies changes to the shadowed area
(below 1 MB) also to the ROM area - I don't think that is the original
behaviour on real hardware.

What we need is paravirtual shadow write control for the ISA PC. It's on
my todo list, maybe I will be able to look into this during the next week.

BTW, your patch series should allow to drop the KVM special case from
pc_system_firmware_init. That version, btw, treats high and low BIOS
areas separately - but only reloads the upper area. Hmm...

Jan
Xiao Guangrong - Oct. 29, 2012, 7:09 a.m.
Jan,

On 10/26/2012 06:35 PM, Jan Kiszka wrote:

> This has two problems: We know it breaks at least Win 95 that overwrites
> its F-segment during boot. And it applies changes to the shadowed area
> (below 1 MB) also to the ROM area - I don't think that is the original
> behaviour on real hardware.

So what is the problem? It can break Win95's running?

I tried to install win95 guest but it failed to boot regardless my patchset
was applied or not. I found the information that win 95 is not supported at
http://www.linux-kvm.org/page/Guest_Support_Status

Note: before my patchset, Win 95 still can happily something into ROM area
because readonly memory is actually writable on KVM. And win95 can not run
on isapc with --no-kvm since it is no way to enable shadow ROM.

> 
> What we need is paravirtual shadow write control for the ISA PC. It's on
> my todo list, maybe I will be able to look into this during the next week.
> 

You idea is that modify the code of seabios and use a special way (PV) to
notify Qemu to make the bios writable?

Actually, I am confused why the guest (including bios) persistently uses
shadow ROM even if it is not supported (on ISA PC), i think the right way
is move itself to RAM under this case, no?

> BTW, your patch series should allow to drop the KVM special case from
> pc_system_firmware_init. That version, btw, treats high and low BIOS
> areas separately - but only reloads the upper area. Hmm...
> 

You mean that also allow Qemu to use pflash to load bios if kvm is enabled?
We can not do that for pflash is a RD device which can not be directly written,
kvm can not emulate the instruction which implicitly write the memory. (e.g:
using this area as stack).

Thanks!
Jan Kiszka - Oct. 29, 2012, 7:44 a.m.
On 2012-10-29 08:09, Xiao Guangrong wrote:
> Jan,
> 
> On 10/26/2012 06:35 PM, Jan Kiszka wrote:
> 
>> This has two problems: We know it breaks at least Win 95 that overwrites
>> its F-segment during boot. And it applies changes to the shadowed area
>> (below 1 MB) also to the ROM area - I don't think that is the original
>> behaviour on real hardware.
> 
> So what is the problem? It can break Win95's running?
> 
> I tried to install win95 guest but it failed to boot regardless my patchset
> was applied or not. I found the information that win 95 is not supported at
> http://www.linux-kvm.org/page/Guest_Support_Status
> 
> Note: before my patchset, Win 95 still can happily something into ROM area
> because readonly memory is actually writable on KVM. And win95 can not run
> on isapc with --no-kvm since it is no way to enable shadow ROM.

Your patches causes regressions on TCG mode as that is perfectly fine
with booting Win95 so far.

> 
>>
>> What we need is paravirtual shadow write control for the ISA PC. It's on
>> my todo list, maybe I will be able to look into this during the next week.
>>
> 
> You idea is that modify the code of seabios and use a special way (PV) to
> notify Qemu to make the bios writable?

Yes.

> 
> Actually, I am confused why the guest (including bios) persistently uses
> shadow ROM even if it is not supported (on ISA PC), i think the right way
> is move itself to RAM under this case, no?

I've been told that Seabios has been built around that assumption and
the PV shadow control would be simpler to realize.

> 
>> BTW, your patch series should allow to drop the KVM special case from
>> pc_system_firmware_init. That version, btw, treats high and low BIOS
>> areas separately - but only reloads the upper area. Hmm...
>>
> 
> You mean that also allow Qemu to use pflash to load bios if kvm is enabled?

Yes.

> We can not do that for pflash is a RD device which can not be directly written,
> kvm can not emulate the instruction which implicitly write the memory. (e.g:
> using this area as stack).

Isn't enabling ROMD support for KVM that whole point of your patches? I
do not see yet what prevents this still, but it should be fixed first.

Jan
Xiao Guangrong - Oct. 29, 2012, 8:31 a.m.
On 10/29/2012 03:44 PM, Jan Kiszka wrote:
> On 2012-10-29 08:09, Xiao Guangrong wrote:
>> Jan,
>>
>> On 10/26/2012 06:35 PM, Jan Kiszka wrote:
>>
>>> This has two problems: We know it breaks at least Win 95 that overwrites
>>> its F-segment during boot. And it applies changes to the shadowed area
>>> (below 1 MB) also to the ROM area - I don't think that is the original
>>> behaviour on real hardware.
>>
>> So what is the problem? It can break Win95's running?
>>
>> I tried to install win95 guest but it failed to boot regardless my patchset
>> was applied or not. I found the information that win 95 is not supported at
>> http://www.linux-kvm.org/page/Guest_Support_Status
>>
>> Note: before my patchset, Win 95 still can happily something into ROM area
>> because readonly memory is actually writable on KVM. And win95 can not run
>> on isapc with --no-kvm since it is no way to enable shadow ROM.
> 
> Your patches causes regressions on TCG mode as that is perfectly fine
> with booting Win95 so far.

Aha, i tried accel=tcg, before my patchset, it works for -machine pc but
failed for -machine isapc (known issue for seabios). After my patchset,
it works fine for both -machine pc and isapc. :)

> 
>>
>>>
>>> What we need is paravirtual shadow write control for the ISA PC. It's on
>>> my todo list, maybe I will be able to look into this during the next week.
>>>
>>
>> You idea is that modify the code of seabios and use a special way (PV) to
>> notify Qemu to make the bios writable?
> 
> Yes.
> 
>>
>> Actually, I am confused why the guest (including bios) persistently uses
>> shadow ROM even if it is not supported (on ISA PC), i think the right way
>> is move itself to RAM under this case, no?
> 
> I've been told that Seabios has been built around that assumption and
> the PV shadow control would be simpler to realize.

Sounds the PV is complexer that directly making the bios area writable
(if it works).

> 
>>
>>> BTW, your patch series should allow to drop the KVM special case from
>>> pc_system_firmware_init. That version, btw, treats high and low BIOS
>>> areas separately - but only reloads the upper area. Hmm...
>>>
>>
>> You mean that also allow Qemu to use pflash to load bios if kvm is enabled?
> 
> Yes.
> 
>> We can not do that for pflash is a RD device which can not be directly written,
>> kvm can not emulate the instruction which implicitly write the memory. (e.g:
>> using this area as stack).
> 
> Isn't enabling ROMD support for KVM that whole point of your patches? I

It can generate MMIO exit if ROMD be written, that means the instruction
needs kvm's help to be finished if it explicitly/implicitly write the memory.

> do not see yet what prevents this still, but it should be fixed first.

For the explicitly write memory access, it is easy to be fixed - we just need
to fetch the instruction from EIP and emulate it. But for the implicitly memory
access, fixing its emulation is really hard work. Really worth doing it?
Jan Kiszka - Oct. 31, 2012, 6:03 a.m.
On 2012-10-29 09:31, Xiao Guangrong wrote:
> On 10/29/2012 03:44 PM, Jan Kiszka wrote:
>> On 2012-10-29 08:09, Xiao Guangrong wrote:
>>> Jan,
>>>
>>> On 10/26/2012 06:35 PM, Jan Kiszka wrote:
>>>
>>>> This has two problems: We know it breaks at least Win 95 that overwrites
>>>> its F-segment during boot. And it applies changes to the shadowed area
>>>> (below 1 MB) also to the ROM area - I don't think that is the original
>>>> behaviour on real hardware.
>>>
>>> So what is the problem? It can break Win95's running?
>>>
>>> I tried to install win95 guest but it failed to boot regardless my patchset
>>> was applied or not. I found the information that win 95 is not supported at
>>> http://www.linux-kvm.org/page/Guest_Support_Status
>>>
>>> Note: before my patchset, Win 95 still can happily something into ROM area
>>> because readonly memory is actually writable on KVM. And win95 can not run
>>> on isapc with --no-kvm since it is no way to enable shadow ROM.
>>
>> Your patches causes regressions on TCG mode as that is perfectly fine
>> with booting Win95 so far.
> 
> Aha, i tried accel=tcg, before my patchset, it works for -machine pc but
> failed for -machine isapc (known issue for seabios). After my patchset,
> it works fine for both -machine pc and isapc. :)

Indeed, looks like I'm on the wrong track regarding what breaks Win95 in
KVM mode.

However: This patch inappropriately allows the guest to change the BIOS
content during runtime. And that not only in the lower ISA range, not
only for our stepchild isapc but for the high ROM range as well, even
with PCI chipset enabled. So this is nothing more than a hack.

> 
>>
>>>
>>>>
>>>> What we need is paravirtual shadow write control for the ISA PC. It's on
>>>> my todo list, maybe I will be able to look into this during the next week.
>>>>
>>>
>>> You idea is that modify the code of seabios and use a special way (PV) to
>>> notify Qemu to make the bios writable?
>>
>> Yes.
>>
>>>
>>> Actually, I am confused why the guest (including bios) persistently uses
>>> shadow ROM even if it is not supported (on ISA PC), i think the right way
>>> is move itself to RAM under this case, no?
>>
>> I've been told that Seabios has been built around that assumption and
>> the PV shadow control would be simpler to realize.
> 
> Sounds the PV is complexer that directly making the bios area writable
> (if it works).

But it is the only correct solution. In fact, shadowing means mapping
RAM above the ROM, not enabling writability, and then copying necessary
bits from the high ROM part to that RAM. Seabios does this when PAM is
available, we just need to pull in those bits for PV shadow control.

> 
>>
>>>
>>>> BTW, your patch series should allow to drop the KVM special case from
>>>> pc_system_firmware_init. That version, btw, treats high and low BIOS
>>>> areas separately - but only reloads the upper area. Hmm...
>>>>
>>>
>>> You mean that also allow Qemu to use pflash to load bios if kvm is enabled?
>>
>> Yes.
>>
>>> We can not do that for pflash is a RD device which can not be directly written,
>>> kvm can not emulate the instruction which implicitly write the memory. (e.g:
>>> using this area as stack).
>>
>> Isn't enabling ROMD support for KVM that whole point of your patches? I
> 
> It can generate MMIO exit if ROMD be written, that means the instruction
> needs kvm's help to be finished if it explicitly/implicitly write the memory.

I was assuming that this is what you already do. If you trap write
accesses, why not allowing user space to handle them?

> 
>> do not see yet what prevents this still, but it should be fixed first.
> 
> For the explicitly write memory access, it is easy to be fixed - we just need
> to fetch the instruction from EIP and emulate it. But for the implicitly memory
> access, fixing its emulation is really hard work. Really worth doing it?

Aren't the read-only regions also marked read-only on the host side to
avoid that the guest writes to it? Or how is this implemented?

Support for flash emulation in KVM mode is increasingly important, for
embedded platform virtualization but also for classic x86 server-like
targets. The pflash-backed system firmware device was added for a reason...

Jan
Xiao Guangrong - Oct. 31, 2012, 6:35 a.m.
On 10/31/2012 02:03 PM, Jan Kiszka wrote:
> On 2012-10-29 09:31, Xiao Guangrong wrote:
>> On 10/29/2012 03:44 PM, Jan Kiszka wrote:
>>> On 2012-10-29 08:09, Xiao Guangrong wrote:
>>>> Jan,
>>>>
>>>> On 10/26/2012 06:35 PM, Jan Kiszka wrote:
>>>>
>>>>> This has two problems: We know it breaks at least Win 95 that overwrites
>>>>> its F-segment during boot. And it applies changes to the shadowed area
>>>>> (below 1 MB) also to the ROM area - I don't think that is the original
>>>>> behaviour on real hardware.
>>>>
>>>> So what is the problem? It can break Win95's running?
>>>>
>>>> I tried to install win95 guest but it failed to boot regardless my patchset
>>>> was applied or not. I found the information that win 95 is not supported at
>>>> http://www.linux-kvm.org/page/Guest_Support_Status
>>>>
>>>> Note: before my patchset, Win 95 still can happily something into ROM area
>>>> because readonly memory is actually writable on KVM. And win95 can not run
>>>> on isapc with --no-kvm since it is no way to enable shadow ROM.
>>>
>>> Your patches causes regressions on TCG mode as that is perfectly fine
>>> with booting Win95 so far.
>>
>> Aha, i tried accel=tcg, before my patchset, it works for -machine pc but
>> failed for -machine isapc (known issue for seabios). After my patchset,
>> it works fine for both -machine pc and isapc. :)
> 
> Indeed, looks like I'm on the wrong track regarding what breaks Win95 in
> KVM mode.
> 
> However: This patch inappropriately allows the guest to change the BIOS
> content during runtime. And that not only in the lower ISA range, not
> only for our stepchild isapc but for the high ROM range as well, even
> with PCI chipset enabled. So this is nothing more than a hack.

Okay.

> 
>>
>>>
>>>>
>>>>>
>>>>> What we need is paravirtual shadow write control for the ISA PC. It's on
>>>>> my todo list, maybe I will be able to look into this during the next week.
>>>>>
>>>>
>>>> You idea is that modify the code of seabios and use a special way (PV) to
>>>> notify Qemu to make the bios writable?
>>>
>>> Yes.
>>>
>>>>
>>>> Actually, I am confused why the guest (including bios) persistently uses
>>>> shadow ROM even if it is not supported (on ISA PC), i think the right way
>>>> is move itself to RAM under this case, no?
>>>
>>> I've been told that Seabios has been built around that assumption and
>>> the PV shadow control would be simpler to realize.
>>
>> Sounds the PV is complexer that directly making the bios area writable
>> (if it works).
> 
> But it is the only correct solution. In fact, shadowing means mapping
> RAM above the ROM, not enabling writability, and then copying necessary
> bits from the high ROM part to that RAM. Seabios does this when PAM is
> available, we just need to pull in those bits for PV shadow control.

Okay. I will continue my work after your PV works. :)

> 
>>
>>>
>>>>
>>>>> BTW, your patch series should allow to drop the KVM special case from
>>>>> pc_system_firmware_init. That version, btw, treats high and low BIOS
>>>>> areas separately - but only reloads the upper area. Hmm...
>>>>>
>>>>
>>>> You mean that also allow Qemu to use pflash to load bios if kvm is enabled?
>>>
>>> Yes.
>>>
>>>> We can not do that for pflash is a RD device which can not be directly written,
>>>> kvm can not emulate the instruction which implicitly write the memory. (e.g:
>>>> using this area as stack).
>>>
>>> Isn't enabling ROMD support for KVM that whole point of your patches? I
>>
>> It can generate MMIO exit if ROMD be written, that means the instruction
>> needs kvm's help to be finished if it explicitly/implicitly write the memory.
> 
> I was assuming that this is what you already do. If you trap write
> accesses, why not allowing user space to handle them?

Already have done that, guest write ROMD -> vmexit -> return to userspace with
MMIO Exit.

> 
>>
>>> do not see yet what prevents this still, but it should be fixed first.
>>
>> For the explicitly write memory access, it is easy to be fixed - we just need
>> to fetch the instruction from EIP and emulate it. But for the implicitly memory
>> access, fixing its emulation is really hard work. Really worth doing it?
> 
> Aren't the read-only regions also marked read-only on the host side to
> avoid that the guest writes to it? Or how is this implemented?
> 
> Support for flash emulation in KVM mode is increasingly important, for
> embedded platform virtualization but also for classic x86 server-like
> targets. The pflash-backed system firmware device was added for a reason...

Please allow me to clarify it more clearly.

The flash is ROMD device means guest can not write it, any kinds of guest write
access on this device can cause vmexit to kvm and return to userspace.

We should pay more attention on it if we execute the code in ROMD since we
can not use ROMD as stack/page table/IDT table and all other implicitly write access.
Of course, if you do not use ROM as those purposes, it is okay. :)
Jan Kiszka - Oct. 31, 2012, 6:46 a.m.
On 2012-10-31 07:35, Xiao Guangrong wrote:
>>>>> We can not do that for pflash is a RD device which can not be directly written,
>>>>> kvm can not emulate the instruction which implicitly write the memory. (e.g:
>>>>> using this area as stack).
>>>>
>>>> Isn't enabling ROMD support for KVM that whole point of your patches? I
>>>
>>> It can generate MMIO exit if ROMD be written, that means the instruction
>>> needs kvm's help to be finished if it explicitly/implicitly write the memory.
>>
>> I was assuming that this is what you already do. If you trap write
>> accesses, why not allowing user space to handle them?
> 
> Already have done that, guest write ROMD -> vmexit -> return to userspace with
> MMIO Exit.

Great.

> 
>>
>>>
>>>> do not see yet what prevents this still, but it should be fixed first.
>>>
>>> For the explicitly write memory access, it is easy to be fixed - we just need
>>> to fetch the instruction from EIP and emulate it. But for the implicitly memory
>>> access, fixing its emulation is really hard work. Really worth doing it?
>>
>> Aren't the read-only regions also marked read-only on the host side to
>> avoid that the guest writes to it? Or how is this implemented?
>>
>> Support for flash emulation in KVM mode is increasingly important, for
>> embedded platform virtualization but also for classic x86 server-like
>> targets. The pflash-backed system firmware device was added for a reason...
> 
> Please allow me to clarify it more clearly.
> 
> The flash is ROMD device means guest can not write it, any kinds of guest write
> access on this device can cause vmexit to kvm and return to userspace.
> 
> We should pay more attention on it if we execute the code in ROMD since we
> can not use ROMD as stack/page table/IDT table and all other implicitly write access.
> Of course, if you do not use ROM as those purposes, it is okay. :)

So the problem is that there is KVM code that still blindly writes to
guest memory and does not take the memory regions' protection flag into
account? And we cannot mark those regions read only in the host's page
table?

Jan
Xiao Guangrong - Oct. 31, 2012, 7:01 a.m.
On 10/31/2012 02:46 PM, Jan Kiszka wrote:

>> Please allow me to clarify it more clearly.
>>
>> The flash is ROMD device means guest can not write it, any kinds of guest write
>> access on this device can cause vmexit to kvm and return to userspace.
>>
>> We should pay more attention on it if we execute the code in ROMD since we
>> can not use ROMD as stack/page table/IDT table and all other implicitly write access.
>> Of course, if you do not use ROM as those purposes, it is okay. :)
> 
> So the problem is that there is KVM code that still blindly writes to
> guest memory and does not take the memory regions' protection flag into
> account? And we cannot mark those regions read only in the host's page
> table?

KVM has the ability to catch this kind of write access on ROMD, it is just hard to
emulate the implicitly memory access.
Jan Kiszka - Oct. 31, 2012, 7:21 a.m.
On 2012-10-31 08:01, Xiao Guangrong wrote:
> On 10/31/2012 02:46 PM, Jan Kiszka wrote:
> 
>>> Please allow me to clarify it more clearly.
>>>
>>> The flash is ROMD device means guest can not write it, any kinds of guest write
>>> access on this device can cause vmexit to kvm and return to userspace.
>>>
>>> We should pay more attention on it if we execute the code in ROMD since we
>>> can not use ROMD as stack/page table/IDT table and all other implicitly write access.
>>> Of course, if you do not use ROM as those purposes, it is okay. :)
>>
>> So the problem is that there is KVM code that still blindly writes to
>> guest memory and does not take the memory regions' protection flag into
>> account? And we cannot mark those regions read only in the host's page
>> table?
> 
> KVM has the ability to catch this kind of write access on ROMD, it is just hard to
> emulate the implicitly memory access.

Drop them? It is highly unlikely that they trigger the magic
write-enable patterns at the right spot in a ROMD device.

Jan

Patch

diff --git a/hw/pc_sysfw.c b/hw/pc_sysfw.c
index b45f0ac..2d56fc7 100644
--- a/hw/pc_sysfw.c
+++ b/hw/pc_sysfw.c
@@ -156,7 +156,6 @@  static void old_pc_system_rom_init(MemoryRegion *rom_memory)
     bios = g_malloc(sizeof(*bios));
     memory_region_init_ram(bios, "pc.bios", bios_size);
     vmstate_register_ram_global(bios);
-    memory_region_set_readonly(bios, true);
     ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
     if (ret != 0) {
     bios_error:
@@ -179,7 +178,6 @@  static void old_pc_system_rom_init(MemoryRegion *rom_memory)
                                         0x100000 - isa_bios_size,
                                         isa_bios,
                                         1);
-    memory_region_set_readonly(isa_bios, true);

     /* map all the bios at the top of memory */
     memory_region_add_subregion(rom_memory,