Patchwork [for-1.4] pc: tag apic as overlap region

login
register
mail settings
Submitter Michael S. Tsirkin
Date Feb. 19, 2013, 3:20 p.m.
Message ID <20130219152057.GA30086@redhat.com>
Download mbox | patch
Permalink /patch/221714/
State New
Headers show

Comments

Michael S. Tsirkin - Feb. 19, 2013, 3:20 p.m.
apic overlaps PCI space. On real hardware it has
higher priority, emulate this correctly.

This should addresses the following issue:

> Subject: Re: [BUG] Guest OS hangs on boot when 64bit BAR present (kvm-apic-msi resource conflict)
> Sometime ago I reported an issue about guest OS hang when 64bit BAR present.
> http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg03189.html
> http://lists.gnu.org/archive/html/qemu-devel/2012-12/msg00413.html
> 
> Some more investigation has been done, so in this post I'll try to explain why it happens and offer possible solutions:
> 
> *When the issue happens*
> The issue occurs on Linux guest OS if kernel version <2.6.36
> A Guest OS hangs on boot when a 64bit PCI BAR is present in a system (if we use ivshmem driver for example) and occupies range within first
> 4 GB.
> 
> *How to reproduce*
> I used the following qemu command to reproduce the case:
> /usr/local/bin/qemu-system-x86_64 -M pc-1.3 -enable-kvm -m 2000 -smp 1,sockets=1,cores=1,threads=1 -name Rh5332 -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/Rh5332.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> base=utc -boot cd -drive file=/home/akorolev/rh5332.img,if=none,id=drive-ide0-0-0,format=raw -device
> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -chardev file,id=charserial0,path=/home/akorolev/serial.log -device
> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -device ivshmem,shm,size=32M-device
> virtio-balloon-pci,id=balloon0
> 
> Tried different guests: Centos 5.8 64bit, RHEL 5.3 32bit, FC 12 64bit on all machines hang occurs in 100% cases
> 
> *Why it happens*
> The issue basically comes from Linux PCI enumeration code.
> 
> The OS enumerates 64BIT bars when device is enabled using the following procedure.
> 1. Write all FF's to lower half of 64bit BAR
> 2. Write address back to lower half of 64bit BAR
> 3. Write all FF's to higher half of 64bit BAR
> 4. Write address back to higher half of 64bit BAR
> 
> For qemu it means that  qemu pci_default_write_config() recevies all FFs for lower part of the 64bit BAR.
> Then it applies the mask and converts the value to "All FF's - size + 1" (FE000000 if size is 32MB).
> 
> So for short period of time the range [0xFE000000 - 0xFFFFFFFF] will be occupied by ivshmem resource.
> For some reason it is lethal for further boot process.
> 
> We have found that boot process screws up completely if kvm-apic-msi range is overlapped even for short period of time.  (We still don't
> know why it happens, hope that the qemu maintainers can answer?)
> 
> If we look at kvm-apic-msi memory region it is a non-overlapable memory region with hardcoded address range [0xFEE00000 - 0xFEF00000].
> 
> Here is a log we collected from render_memory_regions:
> 
>  system overlap 0 pri 0 [0x0 - 0x7fffffffffffffff]
>      kvmvapic-rom overlap 1 pri 1000 [0xca000 - 0xcd000]
>          pc.ram overlap 0 pri 0 [0xca000 - 0xcd000]
>          ++ pc.ram [0xca000 - 0xcd000] is added to view
>      ....................
>      smram-region overlap 1 pri 1 [0xa0000 - 0xc0000]
>          pci overlap 0 pri 0 [0xa0000 - 0xc0000]
>              cirrus-lowmem-container overlap 1 pri 1 [0xa0000 - 0xc0000]
>                  cirrus-low-memory overlap 0 pri 0 [0xa0000 - 0xc0000]
>                 ++cirrus-low-memory [0xa0000 - 0xc0000] is added to view
>      kvm-ioapic overlap 0 pri 0 [0xfec00000 - 0xfec01000]
>     ++kvm-ioapic [0xfec00000 - 0xfec01000] is added to view
>      pci-hole64 overlap 0 pri 0 [0x100000000 - 0x4000000100000000]
>          pci overlap 0 pri 0 [0x100000000 - 0x4000000100000000]
>      pci-hole overlap 0 pri 0 [0x7d000000 - 0x100000000]
>          pci overlap 0 pri 0 [0x7d000000 - 0x100000000]
>              ivshmem-bar2-container overlap 1 pri 1 [0xfe000000 - 0x100000000]
>                  ivshmem.bar2 overlap 0 pri 0 [0xfe000000 - 0x100000000]
>                 ++ivshmem.bar2 [0xfe000000 - 0xfec00000] is added to view
>                 ++ivshmem.bar2  [0xfec01000 - 0x100000000] is added to view
>              ivshmem-mmio overlap 1 pri 1 [0xfebf1000 - 0xfebf1100]
>              e1000-mmio overlap 1 pri 1 [0xfeba0000 - 0xfebc0000]
>              cirrus-mmio overlap 1 pri 1 [0xfebf0000 - 0xfebf1000]
>              cirrus-pci-bar0 overlap 1 pri 1 [0xfa000000 - 0xfc000000]
>                  vga.vram overlap 1 pri 1 [0xfa000000 - 0xfa800000]
>                 ++vga.vram [0xfa000000 - 0xfa800000] is added to view
>                  cirrus-bitblt-mmio overlap 0 pri 0 [0xfb000000 - 0xfb400000]
>                 ++cirrus-bitblt-mmio [0xfb000000 - 0xfb400000] is added to view
>                  cirrus-linear-io overlap 0 pri 0 [0xfa000000 - 0xfa800000]
>              pc.bios overlap 0 pri 0 [0xfffe0000 - 0x100000000]
>      ram-below-4g overlap 0 pri 0 [0x0 - 0x7d000000]
>          pc.ram overlap 0 pri 0 [0x0 - 0x7d000000]
>         ++pc.ram [0x0 - 0xa0000] is added to view
>         ++pc.ram [0x100000 - 0x7d000000] is added to view
>      kvm-apic-msi overlap 0 pri 0 [0xfee00000 - 0xfef00000]
> 
> As you can see from log the kvm-apic-msi is enumarated last when range [0xfee00000 - 0xfef00000] is already occupied by ivshmem.bar2
> [0xfec01000 - 0x100000000].
> 
> 
> *Possible solutions*
> Solution 1. Probably the best would be adding the rule that regions which may not be overlapped are added to view first (In in other words
> regions which must not be overlapped have the highest priority).  Please find patch in the following message.
> 
> Solution 2. Raise priority of kvm-apic-msi resource. This is a bit misleading solution, as priority is only applicable for overlap-able
> regions, but this region must not be overlapped.
> 
> Solution 3. Fix the issue at PCI level. Track if the resource is 64bit and apply changes if both parts of 64bit BAR are programmed. (It
> appears that real PCI bus controllers are smart enough to track 64bit BAR writes on PC, so qemu could do the same? Drawbacks are that
> tracking PCI writes is bit cumbersome, and such tracking may appear to somebody as a hack)

---

The following patch is under test ATM.
Alexey, does it address the crash for you?
Jan Kiszka - Feb. 19, 2013, 3:51 p.m.
On 2013-02-19 16:20, Michael S. Tsirkin wrote:
> apic overlaps PCI space. On real hardware it has
> higher priority, emulate this correctly.
> 
> This should addresses the following issue:
> 
>> Subject: Re: [BUG] Guest OS hangs on boot when 64bit BAR present (kvm-apic-msi resource conflict)
>> Sometime ago I reported an issue about guest OS hang when 64bit BAR present.
>> http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg03189.html
>> http://lists.gnu.org/archive/html/qemu-devel/2012-12/msg00413.html
>>
>> Some more investigation has been done, so in this post I'll try to explain why it happens and offer possible solutions:
>>
>> *When the issue happens*
>> The issue occurs on Linux guest OS if kernel version <2.6.36
>> A Guest OS hangs on boot when a 64bit PCI BAR is present in a system (if we use ivshmem driver for example) and occupies range within first
>> 4 GB.
>>
>> *How to reproduce*
>> I used the following qemu command to reproduce the case:
>> /usr/local/bin/qemu-system-x86_64 -M pc-1.3 -enable-kvm -m 2000 -smp 1,sockets=1,cores=1,threads=1 -name Rh5332 -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/Rh5332.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=readline -rtc
>> base=utc -boot cd -drive file=/home/akorolev/rh5332.img,if=none,id=drive-ide0-0-0,format=raw -device
>> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -chardev file,id=charserial0,path=/home/akorolev/serial.log -device
>> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -device ivshmem,shm,size=32M-device
>> virtio-balloon-pci,id=balloon0
>>
>> Tried different guests: Centos 5.8 64bit, RHEL 5.3 32bit, FC 12 64bit on all machines hang occurs in 100% cases
>>
>> *Why it happens*
>> The issue basically comes from Linux PCI enumeration code.
>>
>> The OS enumerates 64BIT bars when device is enabled using the following procedure.
>> 1. Write all FF's to lower half of 64bit BAR
>> 2. Write address back to lower half of 64bit BAR
>> 3. Write all FF's to higher half of 64bit BAR
>> 4. Write address back to higher half of 64bit BAR
>>
>> For qemu it means that  qemu pci_default_write_config() recevies all FFs for lower part of the 64bit BAR.
>> Then it applies the mask and converts the value to "All FF's - size + 1" (FE000000 if size is 32MB).
>>
>> So for short period of time the range [0xFE000000 - 0xFFFFFFFF] will be occupied by ivshmem resource.
>> For some reason it is lethal for further boot process.
>>
>> We have found that boot process screws up completely if kvm-apic-msi range is overlapped even for short period of time.  (We still don't
>> know why it happens, hope that the qemu maintainers can answer?)
>>
>> If we look at kvm-apic-msi memory region it is a non-overlapable memory region with hardcoded address range [0xFEE00000 - 0xFEF00000].
>>
>> Here is a log we collected from render_memory_regions:
>>
>>  system overlap 0 pri 0 [0x0 - 0x7fffffffffffffff]
>>      kvmvapic-rom overlap 1 pri 1000 [0xca000 - 0xcd000]
>>          pc.ram overlap 0 pri 0 [0xca000 - 0xcd000]
>>          ++ pc.ram [0xca000 - 0xcd000] is added to view
>>      ....................
>>      smram-region overlap 1 pri 1 [0xa0000 - 0xc0000]
>>          pci overlap 0 pri 0 [0xa0000 - 0xc0000]
>>              cirrus-lowmem-container overlap 1 pri 1 [0xa0000 - 0xc0000]
>>                  cirrus-low-memory overlap 0 pri 0 [0xa0000 - 0xc0000]
>>                 ++cirrus-low-memory [0xa0000 - 0xc0000] is added to view
>>      kvm-ioapic overlap 0 pri 0 [0xfec00000 - 0xfec01000]
>>     ++kvm-ioapic [0xfec00000 - 0xfec01000] is added to view
>>      pci-hole64 overlap 0 pri 0 [0x100000000 - 0x4000000100000000]
>>          pci overlap 0 pri 0 [0x100000000 - 0x4000000100000000]
>>      pci-hole overlap 0 pri 0 [0x7d000000 - 0x100000000]
>>          pci overlap 0 pri 0 [0x7d000000 - 0x100000000]
>>              ivshmem-bar2-container overlap 1 pri 1 [0xfe000000 - 0x100000000]
>>                  ivshmem.bar2 overlap 0 pri 0 [0xfe000000 - 0x100000000]
>>                 ++ivshmem.bar2 [0xfe000000 - 0xfec00000] is added to view
>>                 ++ivshmem.bar2  [0xfec01000 - 0x100000000] is added to view
>>              ivshmem-mmio overlap 1 pri 1 [0xfebf1000 - 0xfebf1100]
>>              e1000-mmio overlap 1 pri 1 [0xfeba0000 - 0xfebc0000]
>>              cirrus-mmio overlap 1 pri 1 [0xfebf0000 - 0xfebf1000]
>>              cirrus-pci-bar0 overlap 1 pri 1 [0xfa000000 - 0xfc000000]
>>                  vga.vram overlap 1 pri 1 [0xfa000000 - 0xfa800000]
>>                 ++vga.vram [0xfa000000 - 0xfa800000] is added to view
>>                  cirrus-bitblt-mmio overlap 0 pri 0 [0xfb000000 - 0xfb400000]
>>                 ++cirrus-bitblt-mmio [0xfb000000 - 0xfb400000] is added to view
>>                  cirrus-linear-io overlap 0 pri 0 [0xfa000000 - 0xfa800000]
>>              pc.bios overlap 0 pri 0 [0xfffe0000 - 0x100000000]
>>      ram-below-4g overlap 0 pri 0 [0x0 - 0x7d000000]
>>          pc.ram overlap 0 pri 0 [0x0 - 0x7d000000]
>>         ++pc.ram [0x0 - 0xa0000] is added to view
>>         ++pc.ram [0x100000 - 0x7d000000] is added to view
>>      kvm-apic-msi overlap 0 pri 0 [0xfee00000 - 0xfef00000]
>>
>> As you can see from log the kvm-apic-msi is enumarated last when range [0xfee00000 - 0xfef00000] is already occupied by ivshmem.bar2
>> [0xfec01000 - 0x100000000].
>>
>>
>> *Possible solutions*
>> Solution 1. Probably the best would be adding the rule that regions which may not be overlapped are added to view first (In in other words
>> regions which must not be overlapped have the highest priority).  Please find patch in the following message.
>>
>> Solution 2. Raise priority of kvm-apic-msi resource. This is a bit misleading solution, as priority is only applicable for overlap-able
>> regions, but this region must not be overlapped.
>>
>> Solution 3. Fix the issue at PCI level. Track if the resource is 64bit and apply changes if both parts of 64bit BAR are programmed. (It
>> appears that real PCI bus controllers are smart enough to track 64bit BAR writes on PC, so qemu could do the same? Drawbacks are that
>> tracking PCI writes is bit cumbersome, and such tracking may appear to somebody as a hack)
> 
> ---
> 
> The following patch is under test ATM.
> Alexey, does it address the crash for you?
> 
> 
> diff --git a/hw/pc.c b/hw/pc.c
> index 53cc173..745c89d 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -1154,7 +1154,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
>      }
>      qdev_init_nofail(dev);
>      d = SYS_BUS_DEVICE(dev);
> -    sysbus_mmio_map(d, 0, 0xfec00000);
> +    /* APIC overlaps the PCI window. */
> +    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);

That's the IOAPIC, not the APIC. If you mean the IOAPIC, APIC and HPET
would require higher prio, too. But I suppose this is really about the
APIC and its special priority due to CPU-local access dispatching, right?

Jan
Peter Maydell - Feb. 19, 2013, 3:54 p.m.
On 19 February 2013 15:51, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2013-02-19 16:20, Michael S. Tsirkin wrote:
>>      qdev_init_nofail(dev);
>>      d = SYS_BUS_DEVICE(dev);
>> -    sysbus_mmio_map(d, 0, 0xfec00000);
>> +    /* APIC overlaps the PCI window. */
>> +    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);
>
> That's the IOAPIC, not the APIC. If you mean the IOAPIC, APIC and HPET
> would require higher prio, too. But I suppose this is really about the
> APIC and its special priority due to CPU-local access dispatching, right?

Is this a proposed minimally invasive patch for 1.4 with a
different approach (possibly involving reworking things with
a better managed set of container regions) for master, or
is this the planned fix for master too?

-- PMM
Michael S. Tsirkin - Feb. 19, 2013, 4:04 p.m.
On Tue, Feb 19, 2013 at 03:54:35PM +0000, Peter Maydell wrote:
> On 19 February 2013 15:51, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> > On 2013-02-19 16:20, Michael S. Tsirkin wrote:
> >>      qdev_init_nofail(dev);
> >>      d = SYS_BUS_DEVICE(dev);
> >> -    sysbus_mmio_map(d, 0, 0xfec00000);
> >> +    /* APIC overlaps the PCI window. */
> >> +    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);
> >
> > That's the IOAPIC, not the APIC. If you mean the IOAPIC, APIC and HPET
> > would require higher prio, too. But I suppose this is really about the
> > APIC and its special priority due to CPU-local access dispatching, right?
> 
> Is this a proposed minimally invasive patch for 1.4 with a
> different approach (possibly involving reworking things with
> a better managed set of container regions) for master, or
> is this the planned fix for master too?
> 
> -- PMM

I think it's a reasonable thing to do long-term.
Jan Kiszka - Feb. 19, 2013, 4:05 p.m.
On 2013-02-19 16:54, Peter Maydell wrote:
> On 19 February 2013 15:51, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2013-02-19 16:20, Michael S. Tsirkin wrote:
>>>      qdev_init_nofail(dev);
>>>      d = SYS_BUS_DEVICE(dev);
>>> -    sysbus_mmio_map(d, 0, 0xfec00000);
>>> +    /* APIC overlaps the PCI window. */
>>> +    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);
>>
>> That's the IOAPIC, not the APIC. If you mean the IOAPIC, APIC and HPET
>> would require higher prio, too. But I suppose this is really about the
>> APIC and its special priority due to CPU-local access dispatching, right?
> 
> Is this a proposed minimally invasive patch for 1.4 with a
> different approach (possibly involving reworking things with
> a better managed set of container regions) for master, or
> is this the planned fix for master too?

I'm not yet sure we need any overhaul at all. If hardware prioritizes
certain windows like APIC, IOAPIC, HPET over PCI device mappings, then
we can already express this today - we just need to do it.

Jan
Peter Maydell - Feb. 19, 2013, 4:20 p.m.
On 19 February 2013 16:05, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2013-02-19 16:54, Peter Maydell wrote:
>> On 19 February 2013 15:51, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> On 2013-02-19 16:20, Michael S. Tsirkin wrote:
>>>>      qdev_init_nofail(dev);
>>>>      d = SYS_BUS_DEVICE(dev);
>>>> -    sysbus_mmio_map(d, 0, 0xfec00000);
>>>> +    /* APIC overlaps the PCI window. */
>>>> +    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);
>>>
>>> That's the IOAPIC, not the APIC. If you mean the IOAPIC, APIC and HPET
>>> would require higher prio, too. But I suppose this is really about the
>>> APIC and its special priority due to CPU-local access dispatching, right?
>>
>> Is this a proposed minimally invasive patch for 1.4 with a
>> different approach (possibly involving reworking things with
>> a better managed set of container regions) for master, or
>> is this the planned fix for master too?
>
> I'm not yet sure we need any overhaul at all. If hardware prioritizes
> certain windows like APIC, IOAPIC, HPET over PCI device mappings, then
> we can already express this today - we just need to do it.

Yes, indeed, but there are different ways to do it with the
API we have at the moment (the patch above being one way).
On reflection it's probably a reasonable approach since we
can't currently set up CPUs to dispatch memory accesses to
an arbitrary MemoryRegion.[*]

[*] by which I mean that really the correct model for this
would be something like:
 the CPU cores live inside a QOM container object which also
 has the CPU-local devices like the IOAPIC &c inside it.
 This container has a container memory region which is set
 up as "background region == the MR which the board model
 passes to us; overlaid at higher priority == CPU-local devices",
 and that container MR is then what you pass to the CPU cores
 as their view of the address space.
I don't suppose we'll go down that path until we have a compelling
reason to implement multiple bus masters with different views
of the address space, though.

-- PMM
Jan Kiszka - Feb. 19, 2013, 4:23 p.m.
On 2013-02-19 17:20, Peter Maydell wrote:
> On 19 February 2013 16:05, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2013-02-19 16:54, Peter Maydell wrote:
>>> On 19 February 2013 15:51, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>> On 2013-02-19 16:20, Michael S. Tsirkin wrote:
>>>>>      qdev_init_nofail(dev);
>>>>>      d = SYS_BUS_DEVICE(dev);
>>>>> -    sysbus_mmio_map(d, 0, 0xfec00000);
>>>>> +    /* APIC overlaps the PCI window. */
>>>>> +    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);
>>>>
>>>> That's the IOAPIC, not the APIC. If you mean the IOAPIC, APIC and HPET
>>>> would require higher prio, too. But I suppose this is really about the
>>>> APIC and its special priority due to CPU-local access dispatching, right?
>>>
>>> Is this a proposed minimally invasive patch for 1.4 with a
>>> different approach (possibly involving reworking things with
>>> a better managed set of container regions) for master, or
>>> is this the planned fix for master too?
>>
>> I'm not yet sure we need any overhaul at all. If hardware prioritizes
>> certain windows like APIC, IOAPIC, HPET over PCI device mappings, then
>> we can already express this today - we just need to do it.
> 
> Yes, indeed, but there are different ways to do it with the
> API we have at the moment (the patch above being one way).
> On reflection it's probably a reasonable approach since we
> can't currently set up CPUs to dispatch memory accesses to
> an arbitrary MemoryRegion.[*]
> 
> [*] by which I mean that really the correct model for this
> would be something like:
>  the CPU cores live inside a QOM container object which also
>  has the CPU-local devices like the IOAPIC &c inside it.
>  This container has a container memory region which is set
>  up as "background region == the MR which the board model
>  passes to us; overlaid at higher priority == CPU-local devices",
>  and that container MR is then what you pass to the CPU cores
>  as their view of the address space.
> I don't suppose we'll go down that path until we have a compelling
> reason to implement multiple bus masters with different views
> of the address space, though.

Well, the APIC (not the IOAPIC) is already a reason to implement such a
feature. We still do not properly support it, means (in user space) we
do not allow the guest to remap its APIC to different addresses for
different CPUs. But we got along with that for quite a while now. So no
need to fix it in a hurry.

Jan

Patch

diff --git a/hw/pc.c b/hw/pc.c
index 53cc173..745c89d 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1154,7 +1154,8 @@  void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
     }
     qdev_init_nofail(dev);
     d = SYS_BUS_DEVICE(dev);
-    sysbus_mmio_map(d, 0, 0xfec00000);
+    /* APIC overlaps the PCI window. */
+    sysbus_mmio_map_overlap(d, 0, 0xfec00000, 1000);
 
     for (i = 0; i < IOAPIC_NUM_PINS; i++) {
         gsi_state->ioapic_irq[i] = qdev_get_gpio_in(dev, i);
diff --git a/hw/sysbus.c b/hw/sysbus.c
index 6d9d1df..40bf352 100644
--- a/hw/sysbus.c
+++ b/hw/sysbus.c
@@ -66,6 +66,26 @@  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr)
                                 dev->mmio[n].memory);
 }
 
+void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
+                             unsigned priority)
+{
+    assert(n >= 0 && n < dev->num_mmio);
+
+    if (dev->mmio[n].addr == addr) {
+        /* ??? region already mapped here.  */
+        return;
+    }
+    if (dev->mmio[n].addr != (hwaddr)-1) {
+        /* Unregister previous mapping.  */
+        memory_region_del_subregion(get_system_memory(), dev->mmio[n].memory);
+    }
+    dev->mmio[n].addr = addr;
+    memory_region_add_subregion_overlap(get_system_memory(),
+                                        addr,
+                                        dev->mmio[n].memory,
+                                        priority);
+}
+
 
 /* Request an IRQ source.  The actual IRQ object may be populated later.  */
 void sysbus_init_irq(SysBusDevice *dev, qemu_irq *p)
diff --git a/hw/sysbus.h b/hw/sysbus.h
index a7fcded..2100bd7 100644
--- a/hw/sysbus.h
+++ b/hw/sysbus.h
@@ -56,6 +56,8 @@  void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t size);
 
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq);
 void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
+void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
+                             unsigned priority);
 void sysbus_add_memory(SysBusDevice *dev, hwaddr addr,
                        MemoryRegion *mem);
 void sysbus_add_memory_overlap(SysBusDevice *dev, hwaddr addr,