diff mbox

[RFC,v5,86/86] 440fx: fix PAM, PCI holes

Message ID 1311180636-17012-87-git-send-email-avi@redhat.com
State New
Headers show

Commit Message

Avi Kivity July 20, 2011, 4:50 p.m. UTC
The current implementation of PAM and the PCI holes is broken in several
ways:

  - PCI BARs are not restricted to the PCI hole (a BAR may hide memory)
  - PCI devices do not respect PAM (if a PCI device maps a region while
    PAM maps the region to RAM, the request will be honored)

This patch fixes things by introducing a pci address space, and using
memory region aliases to represent PAM regions, SMRAM, and PCI holes.

The memory hierarchy looks something like

system_memory
 |
 +--- low memory alias (0-0xe0000000)
 |      |
 |      +-- ram@0
 |
 +--- high memory alias (0x100000000-EOM)
 |      |
 |      +-- ram@0xe0000000
 |
 +--- pam[n] (0xc0000-0xc3fff etc) (when set to pci, priority 1)
 |      |
 |      +-- pci@0xc4000 etc
 |
 +--- smram (0xa0000-0xbffff) (when set to pci/vga, priority 1)
        |
        +-- pci@0xa0000 etc

ram (simple ram region)

pci
 |
 +--- BARn
 |
 +--- VGA 0xa0000-0xbffff
 |
 +--- ROMs

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 hw/pc.c       |   11 +++--
 hw/pc.h       |   13 +++++-
 hw/pc_piix.c  |   21 +++++++---
 hw/piix_pci.c |  114 ++++++++++++++++++++++++++++++++++++++++----------------
 4 files changed, 113 insertions(+), 46 deletions(-)

Comments

Anthony Liguori July 25, 2011, 1:07 p.m. UTC | #1
On 07/20/2011 11:50 AM, Avi Kivity wrote:
> The current implementation of PAM and the PCI holes is broken in several
> ways:
>
>    - PCI BARs are not restricted to the PCI hole (a BAR may hide memory)

Technically, a BAR can be mapped to any non-RAM memory location.

>    - PCI devices do not respect PAM (if a PCI device maps a region while
>      PAM maps the region to RAM, the request will be honored)

I assume you mean SMM shadowing, right?  PAM doesn't cover an area 
that's ever forwarded to the PCI bus.

> This patch fixes things by introducing a pci address space, and using
> memory region aliases to represent PAM regions, SMRAM, and PCI holes.
>
> The memory hierarchy looks something like
>
> system_memory
>   |
>   +--- low memory alias (0-0xe0000000)

According to the spec, PCI memory doesn't start at e00... but rather at 
the top of RAM.  In fact, this is what the spec says:

"The address range from the top of main DRAM to 4 Gbytes (top of 
physical memory space supported by the 440FX PCIset) is normally mapped 
to PCI. The PMC forwards all accesses within this address range to PCI.
There are two sub-ranges within this address range defined as APIC 
Configuration Space and High BIOS Address Range."

So the right thing to do is to forward all accesses from 
low_memory_memsize ... 4GB to the PCI bus.

Regards,

Anthony Liguori
Avi Kivity July 25, 2011, 1:14 p.m. UTC | #2
On 07/25/2011 04:07 PM, Anthony Liguori wrote:
> On 07/20/2011 11:50 AM, Avi Kivity wrote:
>> The current implementation of PAM and the PCI holes is broken in several
>> ways:
>>
>>    - PCI BARs are not restricted to the PCI hole (a BAR may hide memory)
>
> Technically, a BAR can be mapped to any non-RAM memory location.

I understood TOM (Top Of Memory) to be fixed - can't find a register for 
it - but maybe I misread the spec.

>
>>    - PCI devices do not respect PAM (if a PCI device maps a region while
>>      PAM maps the region to RAM, the request will be honored)
>
> I assume you mean SMM shadowing, right?  PAM doesn't cover an area 
> that's ever forwarded to the PCI bus.

No, PAM.  And all of the PAM address space is forwarded to the PCI bus 
(the ROMs are satisfied by PIIX which is a PCI device).

Here at least the spec is clear, see 3.2.18.

>
>> This patch fixes things by introducing a pci address space, and using
>> memory region aliases to represent PAM regions, SMRAM, and PCI holes.
>>
>> The memory hierarchy looks something like
>>
>> system_memory
>>   |
>>   +--- low memory alias (0-0xe0000000)
>
> According to the spec, PCI memory doesn't start at e00... but rather 
> at the top of RAM.  In fact, this is what the spec says:
>
> "The address range from the top of main DRAM to 4 Gbytes (top of 
> physical memory space supported by the 440FX PCIset) is normally 
> mapped to PCI. The PMC forwards all accesses within this address range 
> to PCI.
> There are two sub-ranges within this address range defined as APIC 
> Configuration Space and High BIOS Address Range."
>
> So the right thing to do is to forward all accesses from 
> low_memory_memsize ... 4GB to the PCI bus.
>

We could do that.  However what happens if we implement memory hotplug?

Maybe we should implement memory hotplug on a newer chipset anyway.
Gleb Natapov July 25, 2011, 1:17 p.m. UTC | #3
On Mon, Jul 25, 2011 at 04:14:45PM +0300, Avi Kivity wrote:
> On 07/25/2011 04:07 PM, Anthony Liguori wrote:
> >On 07/20/2011 11:50 AM, Avi Kivity wrote:
> >>The current implementation of PAM and the PCI holes is broken in several
> >>ways:
> >>
> >>   - PCI BARs are not restricted to the PCI hole (a BAR may hide memory)
> >
> >Technically, a BAR can be mapped to any non-RAM memory location.
> 
> I understood TOM (Top Of Memory) to be fixed - can't find a register
> for it - but maybe I misread the spec.
> 
PIIX3 spec:

2.2.11.    TOM—TOP OF MEMORY REGISTER (Function 0)
Address Offset:    69h
Default Value:     02h
Attribute:         Read/Write

--
			Gleb.
Avi Kivity July 25, 2011, 1:28 p.m. UTC | #4
On 07/25/2011 04:17 PM, Gleb Natapov wrote:
> On Mon, Jul 25, 2011 at 04:14:45PM +0300, Avi Kivity wrote:
> >  On 07/25/2011 04:07 PM, Anthony Liguori wrote:
> >  >On 07/20/2011 11:50 AM, Avi Kivity wrote:
> >  >>The current implementation of PAM and the PCI holes is broken in several
> >  >>ways:
> >  >>
> >  >>    - PCI BARs are not restricted to the PCI hole (a BAR may hide memory)
> >  >
> >  >Technically, a BAR can be mapped to any non-RAM memory location.
> >
> >  I understood TOM (Top Of Memory) to be fixed - can't find a register
> >  for it - but maybe I misread the spec.
> >
> PIIX3 spec:
>
> 2.2.11.    TOM—TOP OF MEMORY REGISTER (Function 0)
> Address Offset:    69h
> Default Value:     02h
> Attribute:         Read/Write
>

What's it doing in PIIX3?  Is it the same TOM?
Gleb Natapov July 25, 2011, 1:31 p.m. UTC | #5
On Mon, Jul 25, 2011 at 04:28:12PM +0300, Avi Kivity wrote:
> On 07/25/2011 04:17 PM, Gleb Natapov wrote:
> >On Mon, Jul 25, 2011 at 04:14:45PM +0300, Avi Kivity wrote:
> >>  On 07/25/2011 04:07 PM, Anthony Liguori wrote:
> >>  >On 07/20/2011 11:50 AM, Avi Kivity wrote:
> >>  >>The current implementation of PAM and the PCI holes is broken in several
> >>  >>ways:
> >>  >>
> >>  >>    - PCI BARs are not restricted to the PCI hole (a BAR may hide memory)
> >>  >
> >>  >Technically, a BAR can be mapped to any non-RAM memory location.
> >>
> >>  I understood TOM (Top Of Memory) to be fixed - can't find a register
> >>  for it - but maybe I misread the spec.
> >>
> >PIIX3 spec:
> >
> >2.2.11.    TOM—TOP OF MEMORY REGISTER (Function 0)
> >Address Offset:    69h
> >Default Value:     02h
> >Attribute:         Read/Write
> >
> 
> What's it doing in PIIX3?  Is it the same TOM?
> 
Good question. Looks like it is not:

This register enables the forwarding of ISA or DMA memory cycles to the
PCI Bus and sets the top of main
memory accessible by ISA or DMA devices. In addition, this register
controls the forwarding of ISA or DMA
accesses to the lower BIOS region (E0000–EFFFFh) and the 512–640-Kbyte
main memory region (80000–
9FFFFh). The Top Of Memory configuration register must be set by the
BIOS.

--
			Gleb.
Avi Kivity July 25, 2011, 1:31 p.m. UTC | #6
On 07/25/2011 04:28 PM, Avi Kivity wrote:
> On 07/25/2011 04:17 PM, Gleb Natapov wrote:
>> On Mon, Jul 25, 2011 at 04:14:45PM +0300, Avi Kivity wrote:
>> >  On 07/25/2011 04:07 PM, Anthony Liguori wrote:
>> > >On 07/20/2011 11:50 AM, Avi Kivity wrote:
>> > >>The current implementation of PAM and the PCI holes is broken in 
>> several
>> > >>ways:
>> > >>
>> > >>    - PCI BARs are not restricted to the PCI hole (a BAR may hide 
>> memory)
>> > >
>> > >Technically, a BAR can be mapped to any non-RAM memory location.
>> >
>> >  I understood TOM (Top Of Memory) to be fixed - can't find a register
>> >  for it - but maybe I misread the spec.
>> >
>> PIIX3 spec:
>>
>> 2.2.11.    TOM—TOP OF MEMORY REGISTER (Function 0)
>> Address Offset:    69h
>> Default Value:     02h
>> Attribute:         Read/Write
>>
>
> What's it doing in PIIX3?  Is it the same TOM?
>

That's the ISA TOM (15MB hole and friends).
Anthony Liguori July 25, 2011, 1:32 p.m. UTC | #7
On 07/25/2011 08:28 AM, Avi Kivity wrote:
> On 07/25/2011 04:17 PM, Gleb Natapov wrote:
>> On Mon, Jul 25, 2011 at 04:14:45PM +0300, Avi Kivity wrote:
>> > On 07/25/2011 04:07 PM, Anthony Liguori wrote:
>> > >On 07/20/2011 11:50 AM, Avi Kivity wrote:
>> > >>The current implementation of PAM and the PCI holes is broken in
>> several
>> > >>ways:
>> > >>
>> > >> - PCI BARs are not restricted to the PCI hole (a BAR may hide
>> memory)
>> > >
>> > >Technically, a BAR can be mapped to any non-RAM memory location.
>> >
>> > I understood TOM (Top Of Memory) to be fixed - can't find a register
>> > for it - but maybe I misread the spec.
>> >
>> PIIX3 spec:
>>
>> 2.2.11. TOM—TOP OF MEMORY REGISTER (Function 0)
>> Address Offset: 69h
>> Default Value: 02h
>> Attribute: Read/Write
>>
>
> What's it doing in PIIX3? Is it the same TOM?

This is a programmable register used to indicate the TOM for the 
purposes of forwarding ISA DMA transactions.

It is unrelated to the I440FX notion of top of memory.

Regards,

Anthony Liguori

>
Gleb Natapov July 25, 2011, 1:35 p.m. UTC | #8
On Mon, Jul 25, 2011 at 04:31:27PM +0300, Avi Kivity wrote:
> On 07/25/2011 04:28 PM, Avi Kivity wrote:
> >On 07/25/2011 04:17 PM, Gleb Natapov wrote:
> >>On Mon, Jul 25, 2011 at 04:14:45PM +0300, Avi Kivity wrote:
> >>>  On 07/25/2011 04:07 PM, Anthony Liguori wrote:
> >>> >On 07/20/2011 11:50 AM, Avi Kivity wrote:
> >>> >>The current implementation of PAM and the PCI holes is
> >>broken in several
> >>> >>ways:
> >>> >>
> >>> >>    - PCI BARs are not restricted to the PCI hole (a BAR may
> >>hide memory)
> >>> >
> >>> >Technically, a BAR can be mapped to any non-RAM memory location.
> >>>
> >>>  I understood TOM (Top Of Memory) to be fixed - can't find a register
> >>>  for it - but maybe I misread the spec.
> >>>
> >>PIIX3 spec:
> >>
> >>2.2.11.    TOM—TOP OF MEMORY REGISTER (Function 0)
> >>Address Offset:    69h
> >>Default Value:     02h
> >>Attribute:         Read/Write
> >>
> >
> >What's it doing in PIIX3?  Is it the same TOM?
> >
> 
> That's the ISA TOM (15MB hole and friends).
> 
Correct. What about:
3.2.19.        DRB[0:7] DRAM ROW BOUNDARY REGISTERS

from 440fx spec?

--
			Gleb.
Avi Kivity July 25, 2011, 1:38 p.m. UTC | #9
On 07/25/2011 04:35 PM, Gleb Natapov wrote:
> >
> >  That's the ISA TOM (15MB hole and friends).
> >
> Correct. What about:
> 3.2.19.        DRB[0:7] DRAM ROW BOUNDARY REGISTERS
>
> from 440fx spec?
>

Maybe.  But we can't use that, since it ignores address line 31.

(440fx supports only 1GB RAM, and we're ignoring that)
Anthony Liguori July 25, 2011, 1:47 p.m. UTC | #10
On 07/25/2011 08:38 AM, Avi Kivity wrote:
> On 07/25/2011 04:35 PM, Gleb Natapov wrote:
>> >
>> > That's the ISA TOM (15MB hole and friends).
>> >
>> Correct. What about:
>> 3.2.19. DRB[0:7] DRAM ROW BOUNDARY REGISTERS
>>
>> from 440fx spec?
>>
>
> Maybe. But we can't use that, since it ignores address line 31.
>
> (440fx supports only 1GB RAM, and we're ignoring that)

What are we trying to do?

Can't we just register highest RAM address under 4G to 4G as PCI memory 
and call it a day?

Do we really need a guest visible register to do this?

Regards,

Anthony Liguori

>
Gleb Natapov July 25, 2011, 1:50 p.m. UTC | #11
On Mon, Jul 25, 2011 at 08:47:31AM -0500, Anthony Liguori wrote:
> On 07/25/2011 08:38 AM, Avi Kivity wrote:
> >On 07/25/2011 04:35 PM, Gleb Natapov wrote:
> >>>
> >>> That's the ISA TOM (15MB hole and friends).
> >>>
> >>Correct. What about:
> >>3.2.19. DRB[0:7] DRAM ROW BOUNDARY REGISTERS
> >>
> >>from 440fx spec?
> >>
> >
> >Maybe. But we can't use that, since it ignores address line 31.
> >
> >(440fx supports only 1GB RAM, and we're ignoring that)
> 
> What are we trying to do?
> 
I just tried to answer Avi's question about TOM register.

> Can't we just register highest RAM address under 4G to 4G as PCI
> memory and call it a day?
> 
It is good idea to emulate existing functionality as close as possible,
but if existing functionality does not satisfy us (like in our case)
then yes, we can.

> Do we really need a guest visible register to do this?
> 
> Regards,
> 
> Anthony Liguori
> 
> >

--
			Gleb.
Avi Kivity July 25, 2011, 2:05 p.m. UTC | #12
On 07/25/2011 04:47 PM, Anthony Liguori wrote:
> On 07/25/2011 08:38 AM, Avi Kivity wrote:
>> On 07/25/2011 04:35 PM, Gleb Natapov wrote:
>>> >
>>> > That's the ISA TOM (15MB hole and friends).
>>> >
>>> Correct. What about:
>>> 3.2.19. DRB[0:7] DRAM ROW BOUNDARY REGISTERS
>>>
>>> from 440fx spec?
>>>
>>
>> Maybe. But we can't use that, since it ignores address line 31.
>>
>> (440fx supports only 1GB RAM, and we're ignoring that)
>
> What are we trying to do?
>
> Can't we just register highest RAM address under 4G to 4G as PCI 
> memory and call it a day?
>
> Do we really need a guest visible register to do this?

Why not use 3.5GB and call it a day?  It's safer for memory hotplug, if 
we ever get it.

The guest will never put a PCI BAR below that anyway.
Anthony Liguori July 25, 2011, 2:08 p.m. UTC | #13
On 07/25/2011 09:05 AM, Avi Kivity wrote:
> On 07/25/2011 04:47 PM, Anthony Liguori wrote:
>> On 07/25/2011 08:38 AM, Avi Kivity wrote:
>>> On 07/25/2011 04:35 PM, Gleb Natapov wrote:
>>>> >
>>>> > That's the ISA TOM (15MB hole and friends).
>>>> >
>>>> Correct. What about:
>>>> 3.2.19. DRB[0:7] DRAM ROW BOUNDARY REGISTERS
>>>>
>>>> from 440fx spec?
>>>>
>>>
>>> Maybe. But we can't use that, since it ignores address line 31.
>>>
>>> (440fx supports only 1GB RAM, and we're ignoring that)
>>
>> What are we trying to do?
>>
>> Can't we just register highest RAM address under 4G to 4G as PCI
>> memory and call it a day?
>>
>> Do we really need a guest visible register to do this?
>
> Why not use 3.5GB and call it a day? It's safer for memory hotplug, if
> we ever get it.
>
> The guest will never put a PCI BAR below that anyway.

My entire concern is that they will.

We're not just talking about Windows or Linux here, but any odd DOS 
application that's smart enough to switch into 32-bit mode, bootloaders, 
etc.

It's a significant behaviorial change that goes against the spec for no 
obviously good reason.

Even if we supported hot plug, we could easily just hot plug above the 
4GB mark.

Regards,

Anthony Liguori

>
Avi Kivity July 25, 2011, 2:10 p.m. UTC | #14
On 07/25/2011 05:08 PM, Anthony Liguori wrote:
>> Why not use 3.5GB and call it a day? It's safer for memory hotplug, if
>> we ever get it.
>>
>> The guest will never put a PCI BAR below that anyway.
>
>
> My entire concern is that they will.
>
> We're not just talking about Windows or Linux here, but any odd DOS 
> application that's smart enough to switch into 32-bit mode, 
> bootloaders, etc.
>
> It's a significant behaviorial change that goes against the spec for 
> no obviously good reason.
>

Okay, I'll update the patch.

> Even if we supported hot plug, we could easily just hot plug above the 
> 4GB mark.

Or change the PCI hole dynamically.
Eric Northup July 25, 2011, 9:34 p.m. UTC | #15
On Wed, Jul 20, 2011 at 9:50 AM, Avi Kivity <avi@redhat.com> wrote:
[...]
> @@ -130,7 +137,13 @@ static void pc_init1(MemoryRegion *system_memory,
>
>     if (pci_enabled) {
>         pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq,
> -                              system_memory, system_io, ram_size);
> +                              system_memory, system_io, ram_size,
> +                              0xe0000000, 0x1fe00000,
> +                              0x100000000 + above_4g_mem_size,
> +                              (sizeof(target_phys_addr_t) == 32

sizeof(target_phys_addr_t) == 8 ?

> +                               ? 0
> +                               : ((uint64_t)1 << 63)),
> +                              pci_memory, ram_memory);
>     } else {
>         pci_bus = NULL;
>         i440fx_state = NULL;
Avi Kivity July 26, 2011, 8:01 a.m. UTC | #16
On 07/26/2011 12:34 AM, Eric Northup wrote:
> On Wed, Jul 20, 2011 at 9:50 AM, Avi Kivity<avi@redhat.com>  wrote:
> [...]
> >  @@ -130,7 +137,13 @@ static void pc_init1(MemoryRegion *system_memory,
> >
> >       if (pci_enabled) {
> >           pci_bus = i440fx_init(&i440fx_state,&piix3_devfn, isa_irq,
> >  -                              system_memory, system_io, ram_size);
> >  +                              system_memory, system_io, ram_size,
> >  +                              0xe0000000, 0x1fe00000,
> >  +                              0x100000000 + above_4g_mem_size,
> >  +                              (sizeof(target_phys_addr_t) == 32
>
> sizeof(target_phys_addr_t) == 8 ?

Good catch.  Fixed

(== 4).
diff mbox

Patch

diff --git a/hw/pc.c b/hw/pc.c
index f920001..af24e3c 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -963,7 +963,9 @@  void pc_memory_init(MemoryRegion *system_memory,
                     const char *kernel_cmdline,
                     const char *initrd_filename,
                     ram_addr_t below_4g_mem_size,
-                    ram_addr_t above_4g_mem_size)
+                    ram_addr_t above_4g_mem_size,
+                    MemoryRegion *pci_memory,
+                    MemoryRegion **ram_memory)
 {
     char *filename;
     int ret, linux_boot, i;
@@ -981,6 +983,7 @@  void pc_memory_init(MemoryRegion *system_memory,
     ram = qemu_malloc(sizeof(*ram));
     memory_region_init_ram(ram, NULL, "pc.ram",
                            below_4g_mem_size + above_4g_mem_size);
+    *ram_memory = ram;
     ram_below_4g = qemu_malloc(sizeof(*ram_below_4g));
     memory_region_init_alias(ram_below_4g, "ram-below-4g", ram,
                              0, below_4g_mem_size);
@@ -1025,7 +1028,7 @@  void pc_memory_init(MemoryRegion *system_memory,
     isa_bios = qemu_malloc(sizeof(*isa_bios));
     memory_region_init_alias(isa_bios, "isa-bios", bios,
                              bios_size - isa_bios_size, isa_bios_size);
-    memory_region_add_subregion_overlap(system_memory,
+    memory_region_add_subregion_overlap(pci_memory,
                                         0x100000 - isa_bios_size,
                                         isa_bios,
                                         1);
@@ -1033,13 +1036,13 @@  void pc_memory_init(MemoryRegion *system_memory,
 
     option_rom_mr = qemu_malloc(sizeof(*option_rom_mr));
     memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE);
-    memory_region_add_subregion_overlap(system_memory,
+    memory_region_add_subregion_overlap(pci_memory,
                                         PC_ROM_MIN_VGA,
                                         option_rom_mr,
                                         1);
 
     /* map all the bios at the top of memory */
-    memory_region_add_subregion(system_memory,
+    memory_region_add_subregion(pci_memory,
                                 (uint32_t)(-bios_size),
                                 bios);
 
diff --git a/hw/pc.h b/hw/pc.h
index d871fd8..dae736e 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -136,7 +136,9 @@  void pc_memory_init(MemoryRegion *system_memory,
                     const char *kernel_cmdline,
                     const char *initrd_filename,
                     ram_addr_t below_4g_mem_size,
-                    ram_addr_t above_4g_mem_size);
+                    ram_addr_t above_4g_mem_size,
+                    MemoryRegion *pci_memory,
+                    MemoryRegion **ram_memory);
 qemu_irq *pc_allocate_cpu_irq(void);
 void pc_vga_init(PCIBus *pci_bus);
 void pc_basic_device_init(qemu_irq *isa_irq,
@@ -182,8 +184,13 @@  PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
                     qemu_irq *pic,
                     MemoryRegion *address_space_mem,
                     MemoryRegion *address_space_io,
-                    ram_addr_t ram_size);
-void i440fx_init_memory_mappings(PCII440FXState *d);
+                    ram_addr_t ram_size,
+                    target_phys_addr_t pci_hole_start,
+                    target_phys_addr_t pci_hole_size,
+                    target_phys_addr_t pci_hole64_start,
+                    target_phys_addr_t pci_hole64_size,
+                    MemoryRegion *pci_memory,
+                    MemoryRegion *ram_memory);
 
 /* piix4.c */
 extern PCIDevice *piix4_dev;
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index fb92035..35efc93 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -93,6 +93,8 @@  static void pc_init1(MemoryRegion *system_memory,
     DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
     BusState *idebus[MAX_IDE_BUS];
     ISADevice *rtc_state;
+    MemoryRegion *ram_memory;
+    MemoryRegion *pci_memory;
 
     pc_cpus_init(cpu_model);
 
@@ -108,11 +110,16 @@  static void pc_init1(MemoryRegion *system_memory,
         below_4g_mem_size = ram_size;
     }
 
+    pci_memory = QEMU_NEW(MemoryRegion);
+    memory_region_init(pci_memory, "pci", UINT64_MAX);
+    memory_region_add_subregion_overlap(system_memory, 0, pci_memory, 1);
+
     /* allocate ram and load rom/bios */
     if (!xen_enabled()) {
         pc_memory_init(system_memory,
                        kernel_filename, kernel_cmdline, initrd_filename,
-                       below_4g_mem_size, above_4g_mem_size);
+                       below_4g_mem_size, above_4g_mem_size,
+                       pci_memory, &ram_memory);
     }
 
     if (!xen_enabled()) {
@@ -130,7 +137,13 @@  static void pc_init1(MemoryRegion *system_memory,
 
     if (pci_enabled) {
         pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq,
-                              system_memory, system_io, ram_size);
+                              system_memory, system_io, ram_size,
+                              0xe0000000, 0x1fe00000,
+                              0x100000000 + above_4g_mem_size,
+                              (sizeof(target_phys_addr_t) == 32
+                               ? 0
+                               : ((uint64_t)1 << 63)),
+                              pci_memory, ram_memory);
     } else {
         pci_bus = NULL;
         i440fx_state = NULL;
@@ -198,10 +211,6 @@  static void pc_init1(MemoryRegion *system_memory,
         smbus_eeprom_init(smbus, 8, NULL, 0);
     }
 
-    if (i440fx_state) {
-        i440fx_init_memory_mappings(i440fx_state);
-    }
-
     if (pci_enabled) {
         pc_pci_device_init(pci_bus);
     }
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 28a3ee2..c563c6e 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -66,10 +66,22 @@  typedef struct PIIX3State {
     int32_t pci_irq_levels_vmstate[PIIX_NUM_PIRQS];
 } PIIX3State;
 
+typedef struct PAMMemoryRegion {
+    MemoryRegion mem;
+    bool initialized;
+} PAMMemoryRegion;
+
 struct PCII440FXState {
     PCIDevice dev;
-    target_phys_addr_t isa_page_descs[384 / 4];
+    MemoryRegion *system_memory;
+    MemoryRegion *pci_address_space;
+    MemoryRegion *ram_memory;
+    MemoryRegion pci_hole;
+    MemoryRegion pci_hole_64bit;
+    PAMMemoryRegion pam_regions[13];
+    MemoryRegion smram_region;
     uint8_t smm_enabled;
+    bool smram_enabled;
     PIIX3State *piix3;
 };
 
@@ -92,50 +104,62 @@  static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
     return (pci_intx + slot_addend) & 3;
 }
 
-static void update_pam(PCII440FXState *d, uint32_t start, uint32_t end, int r)
+static void update_pam(PCII440FXState *d, uint32_t start, uint32_t end, int r,
+                       PAMMemoryRegion *mem)
 {
-    uint32_t addr;
+    if (mem->initialized) {
+        memory_region_del_subregion(d->system_memory, &mem->mem);
+        memory_region_destroy(&mem->mem);
+    }
 
     //    printf("ISA mapping %08x-0x%08x: %d\n", start, end, r);
     switch(r) {
     case 3:
         /* RAM */
-        cpu_register_physical_memory(start, end - start,
-                                     start);
+        memory_region_init_alias(&mem->mem, "pam-ram", d->ram_memory,
+                                 start, end - start);
         break;
     case 1:
         /* ROM (XXX: not quite correct) */
-        cpu_register_physical_memory(start, end - start,
-                                     start | IO_MEM_ROM);
+        memory_region_init_alias(&mem->mem, "pam-rom", d->ram_memory,
+                                 start, end - start);
+        memory_region_set_readonly(&mem->mem, true);
         break;
     case 2:
     case 0:
         /* XXX: should distinguish read/write cases */
-        for(addr = start; addr < end; addr += 4096) {
-            cpu_register_physical_memory(addr, 4096,
-                                         d->isa_page_descs[(addr - 0xa0000) >> 12]);
-        }
+        memory_region_init_alias(&mem->mem, "pam-pci", d->pci_address_space,
+                                 start, end - start);
         break;
     }
+    memory_region_add_subregion_overlap(d->system_memory,
+                                        start, &mem->mem, 1);
+    mem->initialized = true;
 }
 
 static void i440fx_update_memory_mappings(PCII440FXState *d)
 {
     int i, r;
-    uint32_t smram, addr;
+    uint32_t smram;
 
-    update_pam(d, 0xf0000, 0x100000, (d->dev.config[I440FX_PAM] >> 4) & 3);
+    update_pam(d, 0xf0000, 0x100000, (d->dev.config[I440FX_PAM] >> 4) & 3,
+               &d->pam_regions[0]);
     for(i = 0; i < 12; i++) {
         r = (d->dev.config[(i >> 1) + (I440FX_PAM + 1)] >> ((i & 1) * 4)) & 3;
-        update_pam(d, 0xc0000 + 0x4000 * i, 0xc0000 + 0x4000 * (i + 1), r);
+        update_pam(d, 0xc0000 + 0x4000 * i, 0xc0000 + 0x4000 * (i + 1), r,
+                   &d->pam_regions[i+1]);
     }
     smram = d->dev.config[I440FX_SMRAM];
     if ((d->smm_enabled && (smram & 0x08)) || (smram & 0x40)) {
-        cpu_register_physical_memory(0xa0000, 0x20000, 0xa0000);
+        if (!d->smram_enabled) {
+            memory_region_del_subregion(d->system_memory, &d->smram_region);
+            d->smram_enabled = true;
+        }
     } else {
-        for(addr = 0xa0000; addr < 0xc0000; addr += 4096) {
-            cpu_register_physical_memory(addr, 4096,
-                                         d->isa_page_descs[(addr - 0xa0000) >> 12]);
+        if (d->smram_enabled) {
+            memory_region_add_subregion_overlap(d->system_memory, 0xa0000,
+                                                &d->smram_region, 1);
+            d->smram_enabled = false;
         }
     }
 }
@@ -152,17 +176,6 @@  static void i440fx_set_smm(int val, void *arg)
 }
 
 
-/* XXX: suppress when better memory API. We make the assumption that
-   no device (in particular the VGA) changes the memory mappings in
-   the 0xa0000-0x100000 range */
-void i440fx_init_memory_mappings(PCII440FXState *d)
-{
-    int i;
-    for(i = 0; i < 96; i++) {
-        d->isa_page_descs[i] = cpu_get_physical_page_desc(0xa0000 + i * 0x1000);
-    }
-}
-
 static void i440fx_write_config(PCIDevice *dev,
                                 uint32_t address, uint32_t val, int len)
 {
@@ -244,24 +257,48 @@  static PCIBus *i440fx_common_init(const char *device_name,
                                   qemu_irq *pic,
                                   MemoryRegion *address_space_mem,
                                   MemoryRegion *address_space_io,
-                                  ram_addr_t ram_size)
+                                  ram_addr_t ram_size,
+                                  target_phys_addr_t pci_hole_start,
+                                  target_phys_addr_t pci_hole_size,
+                                  target_phys_addr_t pci_hole64_start,
+                                  target_phys_addr_t pci_hole64_size,
+                                  MemoryRegion *pci_address_space,
+                                  MemoryRegion *ram_memory)
 {
     DeviceState *dev;
     PCIBus *b;
     PCIDevice *d;
     I440FXState *s;
     PIIX3State *piix3;
+    PCII440FXState *f;
 
     dev = qdev_create(NULL, "i440FX-pcihost");
     s = FROM_SYSBUS(I440FXState, sysbus_from_qdev(dev));
     s->address_space = address_space_mem;
-    b = pci_bus_new(&s->busdev.qdev, NULL, s->address_space,
+    b = pci_bus_new(&s->busdev.qdev, NULL, pci_address_space,
                     address_space_io, 0);
     s->bus = b;
     qdev_init_nofail(dev);
 
     d = pci_create_simple(b, 0, device_name);
     *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
+    f = *pi440fx_state;
+    f->system_memory = address_space_mem;
+    f->pci_address_space = pci_address_space;
+    f->ram_memory = ram_memory;
+    memory_region_init_alias(&f->pci_hole, "pci-hole", f->pci_address_space,
+                             pci_hole_start, pci_hole_size);
+    memory_region_add_subregion(f->system_memory, pci_hole_start, &f->pci_hole);
+    memory_region_init_alias(&f->pci_hole_64bit, "pci-hole64",
+                             f->pci_address_space,
+                             pci_hole64_start, pci_hole64_size);
+    if (pci_hole64_size) {
+        memory_region_add_subregion(f->system_memory, pci_hole64_start,
+                                    &f->pci_hole_64bit);
+    }
+    memory_region_init_alias(&f->smram_region, "smram-region",
+                             f->pci_address_space, 0xa0000, 0x20000);
+    f->smram_enabled = true;
 
     /* Xen supports additional interrupt routes from the PCI devices to
      * the IOAPIC: the four pins of each PCI device on the bus are also
@@ -289,6 +326,8 @@  static PCIBus *i440fx_common_init(const char *device_name,
         ram_size = 255;
     (*pi440fx_state)->dev.config[0x57]=ram_size;
 
+    i440fx_update_memory_mappings(f);
+
     return b;
 }
 
@@ -296,12 +335,21 @@  PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
                     qemu_irq *pic,
                     MemoryRegion *address_space_mem,
                     MemoryRegion *address_space_io,
-                    ram_addr_t ram_size)
+                    ram_addr_t ram_size,
+                    target_phys_addr_t pci_hole_start,
+                    target_phys_addr_t pci_hole_size,
+                    target_phys_addr_t pci_hole64_start,
+                    target_phys_addr_t pci_hole64_size,
+                    MemoryRegion *pci_memory, MemoryRegion *ram_memory)
+
 {
     PCIBus *b;
 
     b = i440fx_common_init("i440FX", pi440fx_state, piix3_devfn, pic,
-                           address_space_mem, address_space_io, ram_size);
+                           address_space_mem, address_space_io, ram_size,
+                           pci_hole_start, pci_hole_size,
+                           pci_hole64_size, pci_hole64_size,
+                           pci_memory, ram_memory);
     return b;
 }