Message ID | 1360823521-32306-4-git-send-email-scottwood@freescale.com |
---|---|
State | New |
On 14.02.2013, at 07:31, Scott Wood wrote: > This is useful for when a user of the memory region API needs to > communicate the absolute bus address to something outside QEMU > (in particular, KVM). > > Signed-off-by: Scott Wood <scottwood@freescale.com> Peter, how does the VGIC implementation handle this? Alex > --- > include/exec/memory.h | 9 +++++++++ > memory.c | 38 ++++++++++++++++++++++++++++++++++---- > 2 files changed, 43 insertions(+), 4 deletions(-) > > diff --git a/include/exec/memory.h b/include/exec/memory.h > index 2322732..b800391 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -892,6 +892,15 @@ void *address_space_map(AddressSpace *as, hwaddr addr, > void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, > int is_write, hwaddr access_len); > > +/* memory_region_to_address: Find the full address of the start of the > + * given #MemoryRegion, ignoring aliases. There is no guarantee > + * that the #MemoryRegion is actually visible at this address, if > + * there are overlapping regions. 
> + * > + * @mr: #MemoryRegion being queried > + * @asp: if non-NULL, returns the #AddressSpace @mr is mapped in, if any > + */ > +hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp); > > #endif > > diff --git a/memory.c b/memory.c > index cd7d5e0..0099f12 100644 > --- a/memory.c > +++ b/memory.c > @@ -453,21 +453,51 @@ const IORangeOps memory_region_iorange_ops = { > .destructor = memory_region_iorange_destructor, > }; > > -static AddressSpace *memory_region_to_address_space(MemoryRegion *mr) > +static AddressSpace *memory_region_root_to_address_space(MemoryRegion *mr) > { > AddressSpace *as; > > - while (mr->parent) { > - mr = mr->parent; > - } > QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) { > if (mr == as->root) { > return as; > } > } > + > + return NULL; > +} > + > +static AddressSpace *memory_region_to_address_space(MemoryRegion *mr) > +{ > + AddressSpace *as; > + > + while (mr->parent) { > + mr = mr->parent; > + } > + > + as = memory_region_root_to_address_space(mr); > + if (as) { > + return as; > + } > + > abort(); > } > > +hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp) > +{ > + hwaddr addr = mr->addr; > + > + while (mr->parent) { > + mr = mr->parent; > + addr += mr->addr; > + } > + > + if (asp) { > + *asp = memory_region_root_to_address_space(mr); > + } > + > + return addr; > +} > + > /* Render a memory region into the global view. Ranges in @view obscure > * ranges in @mr. > */ > -- > 1.7.9.5 > >
On 21 March 2013 08:31, Alexander Graf <agraf@suse.de> wrote: > On 14.02.2013, at 07:31, Scott Wood wrote: >> This is useful for when a user of the memory region API needs to >> communicate the absolute bus address to something outside QEMU >> (in particular, KVM). >> >> Signed-off-by: Scott Wood <scottwood@freescale.com> > > Peter, how does the VGIC implementation handle this? Check kvm_arm_register_device() in target-arm/kvm.c. Basically the VGIC device model calls this function to say "tell the kernel where this MemoryRegion is in the system address space, when it eventually gets mapped". The code in kvm.c uses the memory system's Notifier API to get a callback when the region is mapped into an address space, which it uses to track the offset in the address space. Finally, we use a machine init notifier so that just before everything finally starts we can make the KVM ioctls to say "here is where everything lives". I think this is a pretty neat way of doing it because it means neither the interrupt controller device nor the board model really need to care about the kernel being told where things are mapped; it's all abstracted out into kvm.c. If your interrupt controller can be moved around at runtime that's probably also handlable, but the ARM code just unregisters its notifiers at machine init because the GIC can't move. (I think the code assumes the device only gets mapped into one address space; this could easily be fixed if it's not true at some point in the future.) thanks -- PMM
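For readers following along, the register / listen / report-at-machine-init flow described above can be modelled roughly as follows. This is only a standalone sketch, not the code in target-arm/kvm.c: `Region`, `KvmDevice`, `register_device()`, `listener_region_add()` and `machine_init_done()` are hypothetical stand-ins, and the final step copies results into an array where the real code would issue the KVM ioctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real QEMU types. */
#define MAX_DEVICES   8
#define ADDR_UNMAPPED UINT64_MAX

typedef struct Region { const char *name; } Region;

typedef struct KvmDevice {
    const Region *mr;   /* region the device model registered */
    uint32_t id;        /* identifier the kernel side knows it by */
    uint64_t addr;      /* filled in once the region is mapped */
} KvmDevice;

KvmDevice devices[MAX_DEVICES];
size_t ndevices;

/* Device model: "tell the kernel where this region lives, once mapped". */
void register_device(const Region *mr, uint32_t id)
{
    assert(ndevices < MAX_DEVICES);
    devices[ndevices].mr = mr;
    devices[ndevices].id = id;
    devices[ndevices].addr = ADDR_UNMAPPED;
    ndevices++;
}

/* Listener callback: a region became visible in the address space. */
void listener_region_add(const Region *mr, uint64_t offset_within_address_space)
{
    for (size_t i = 0; i < ndevices; i++) {
        if (devices[i].mr == mr) {
            devices[i].addr = offset_within_address_space;
        }
    }
}

/* Machine init done: collect every mapped device (stand-in for the ioctl)
 * and drop the registry. Returns the number of entries written to out. */
size_t machine_init_done(KvmDevice *out, size_t max)
{
    size_t reported = 0;

    for (size_t i = 0; i < ndevices; i++) {
        if (devices[i].addr != ADDR_UNMAPPED && reported < max) {
            out[reported++] = devices[i];
        }
    }
    ndevices = 0;
    return reported;
}
```

The point of the split is that the device model only ever calls the registration function; it never sees the address that ends up being reported.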
On 21.03.2013, at 11:53, Peter Maydell wrote: > On 21 March 2013 08:31, Alexander Graf <agraf@suse.de> wrote: >> On 14.02.2013, at 07:31, Scott Wood wrote: >>> This is useful for when a user of the memory region API needs to >>> communicate the absolute bus address to something outside QEMU >>> (in particular, KVM). >>> >>> Signed-off-by: Scott Wood <scottwood@freescale.com> >> >> Peter, how does the VGIC implementation handle this? > > Check kvm_arm_register_device() in target-arm/kvm.c. Basically > the VGIC device model calls this function to say "tell the kernel > where this MemoryRegion is in the system address space, when it > eventually gets mapped". The code in kvm.c uses the memory system's > Notifier API to get a callback when the region is mapped into > an address space, which it uses to track the offset in the > address space. Finally, we use a machine init notifier so that > just before everything finally starts we can make the KVM ioctls > to say "here is where everything lives". Same thing here. The question is how the kvm-vgic code in QEMU finds out where it got mapped to. Scott adds this patch to do this, but I'd assume you have some other way :) Alex > > I think this is a pretty neat way of doing it because it means > neither the interrupt controller device nor the board model > really need to care about the kernel being told where things > are mapped; it's all abstracted out into kvm.c. If your > interrupt controller can be moved around at runtime that's > probably also handlable, but the ARM code just unregisters its > notifiers at machine init because the GIC can't move. > > (I think the code assumes the device only gets mapped into > one address space; this could easily be fixed if it's not true > at some point in the future.) > > thanks > -- PMM
On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote: > On 21.03.2013, at 11:53, Peter Maydell wrote: >> Check kvm_arm_register_device() in target-arm/kvm.c. Basically >> the VGIC device model calls this function to say "tell the kernel >> where this MemoryRegion is in the system address space, when it >> eventually gets mapped". The code in kvm.c uses the memory system's >> Notifier API to get a callback when the region is mapped into >> an address space, which it uses to track the offset in the >> address space. Finally, we use a machine init notifier so that >> just before everything finally starts we can make the KVM ioctls >> to say "here is where everything lives". > > Same thing here. The question is how the kvm-vgic code in QEMU > finds out where it got mapped to. Scott adds this patch to do > this, but I'd assume you have some other way :) Hmm? The kvm-vgic code in QEMU doesn't need to know where it lives. We have to tell the kernel so it can map its bits of registers in at the right place, that's all. -- PMM
On 21.03.2013, at 12:01, Peter Maydell wrote: > On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote: >> On 21.03.2013, at 11:53, Peter Maydell wrote: >>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically >>> the VGIC device model calls this function to say "tell the kernel >>> where this MemoryRegion is in the system address space, when it >>> eventually gets mapped". The code in kvm.c uses the memory system's >>> Notifier API to get a callback when the region is mapped into >>> an address space, which it uses to track the offset in the >>> address space. Finally, we use a machine init notifier so that >>> just before everything finally starts we can make the KVM ioctls >>> to say "here is where everything lives". >> >> Same thing here. The question is how the kvm-vgic code in QEMU >> finds out where it got mapped to. Scott adds this patch to do >> this, but I'd assume you have some other way :) > > Hmm? The kvm-vgic code in QEMU doesn't need to know where it > lives. We have to tell the kernel so it can map its bits of > registers in at the right place, that's all. The kvm-vgic code in QEMU needs to tell the kernel, no? For that, it needs to know what to tell the kernel. This patch adds a function that allows kvm-openpic to fetch its base flat address from the MemoryListener. I was wondering whether either this patch is superfluous or you guys had an awkward MemoryListener handler :) Alex
On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote: > > On 21.03.2013, at 12:01, Peter Maydell wrote: > >> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote: >>> On 21.03.2013, at 11:53, Peter Maydell wrote: >>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically >>>> the VGIC device model calls this function to say "tell the kernel >>>> where this MemoryRegion is in the system address space, when it >>>> eventually gets mapped". The code in kvm.c uses the memory system's >>>> Notifier API to get a callback when the region is mapped into >>>> an address space, which it uses to track the offset in the >>>> address space. Finally, we use a machine init notifier so that >>>> just before everything finally starts we can make the KVM ioctls >>>> to say "here is where everything lives". >>> >>> Same thing here. The question is how the kvm-vgic code in QEMU >>> finds out where it got mapped to. Scott adds this patch to do >>> this, but I'd assume you have some other way :) >> >> Hmm? The kvm-vgic code in QEMU doesn't need to know where it >> lives. We have to tell the kernel so it can map its bits of >> registers in at the right place, that's all. > > The kvm-vgic code in QEMU needs to tell the kernel, no? For > that, it needs to know what to tell the kernel. No. As I explained earlier, all the kvm-vgic code needs to do is call kvm_arm_register_device(). That code in kvm.c then takes care of telling the kernel. hw/kvm/arm_gic.c itself never knows or needs to know where it's mapped. This is the whole point of the mechanism involving notifiers. > This patch adds a function that allows kvm-openpic to fetch its > base flat address from the MemoryListener. This sounds to me like the wrong way to do it -- it's board models that decide where devices are mapped and the device code itself shouldn't have to know where it has been mapped. -- PMM
On 21.03.2013, at 12:09, Peter Maydell wrote: > On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote: >> >> On 21.03.2013, at 12:01, Peter Maydell wrote: >> >>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote: >>>> On 21.03.2013, at 11:53, Peter Maydell wrote: >>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically >>>>> the VGIC device model calls this function to say "tell the kernel >>>>> where this MemoryRegion is in the system address space, when it >>>>> eventually gets mapped". The code in kvm.c uses the memory system's >>>>> Notifier API to get a callback when the region is mapped into >>>>> an address space, which it uses to track the offset in the >>>>> address space. Finally, we use a machine init notifier so that >>>>> just before everything finally starts we can make the KVM ioctls >>>>> to say "here is where everything lives". >>>> >>>> Same thing here. The question is how the kvm-vgic code in QEMU >>>> finds out where it got mapped to. Scott adds this patch to do >>>> this, but I'd assume you have some other way :) >>> >>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it >>> lives. We have to tell the kernel so it can map its bits of >>> registers in at the right place, that's all. >> >> The kvm-vgic code in QEMU needs to tell the kernel, no? For >> that, it needs to know what to tell the kernel. > > No. As I explained earlier, all the kvm-vgic code needs to do > is call kvm_arm_register_device(). That code in kvm.c then takes > care of telling the kernel. hw/kvm/arm_gic.c itself never knows or > needs to know where it's mapped. This is the whole point of the > mechanism involving notifiers. I fully disagree. Code that talks to the in-kernel device should live in hw/kvm/<device>.c, not in some random target-XXX/kvm.c file. Of course the board defines where the device gets mapped to, but the communication with the in-kernel device bits should really be contained to the device itself. 
So this is the function that gets invoked on ARM: static void kvm_arm_machine_init_done(Notifier *notifier, void *data) { KVMDevice *kd, *tkd; memory_listener_unregister(&devlistener); QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) { if (kd->kda.addr != -1) { if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR, &kd->kda) < 0) { fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n", strerror(errno)); abort(); } } g_free(kd); } } This only goes one level deep, right? So if you ever have to nest the VGIC inside of another MemoryRegion, this will break, right? Alex > >> This patch adds a function that allows kvm-openpic to fetch its >> base flat address from the MemoryListener. > > This sounds to me like the wrong way to do it -- it's board > models that decide where devices are mapped and the device > code itself shouldn't have to know where it has been mapped. > > -- PMM
On 21 March 2013 11:14, Alexander Graf <agraf@suse.de> wrote: > > On 21.03.2013, at 12:09, Peter Maydell wrote: > >> On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote: >>> >>> On 21.03.2013, at 12:01, Peter Maydell wrote: >>> >>>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote: >>>>> On 21.03.2013, at 11:53, Peter Maydell wrote: >>>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically >>>>>> the VGIC device model calls this function to say "tell the kernel >>>>>> where this MemoryRegion is in the system address space, when it >>>>>> eventually gets mapped". The code in kvm.c uses the memory system's >>>>>> Notifier API to get a callback when the region is mapped into >>>>>> an address space, which it uses to track the offset in the >>>>>> address space. Finally, we use a machine init notifier so that >>>>>> just before everything finally starts we can make the KVM ioctls >>>>>> to say "here is where everything lives". >>>>> >>>>> Same thing here. The question is how the kvm-vgic code in QEMU >>>>> finds out where it got mapped to. Scott adds this patch to do >>>>> this, but I'd assume you have some other way :) >>>> >>>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it >>>> lives. We have to tell the kernel so it can map its bits of >>>> registers in at the right place, that's all. >>> >>> The kvm-vgic code in QEMU needs to tell the kernel, no? For >>> that, it needs to know what to tell the kernel. >> >> No. As I explained earlier, all the kvm-vgic code needs to do >> is call kvm_arm_register_device(). That code in kvm.c then takes >> care of telling the kernel. hw/kvm/arm_gic.c itself never knows or >> needs to know where it's mapped. This is the whole point of the >> mechanism involving notifiers. > > I fully disagree. Code that talks to the in-kernel device should > live in hw/kvm/<device>.c, not in some random target-XXX/kvm.c file. 
The code in kvm.c is entirely generic -- it provides a mechanism for a device to say "this memory region is kernel ID X and it will want to know where it lives". The kvm.c code will work for any device with a memory mapped region, whether it's the GIC or something else. > Of course the board defines where the device gets mapped to, but > the communication with the in-kernel device bits should really be > contained to the device itself. You're arguing that every device should implement its own set of notifier functions so it can get called back when its memory regions are finally mapped, just so it can make a non-device-specific KVM ioctl? The obvious thing to do is abstract that functionality out. > So this is the function that gets invoked on ARM: > > static void kvm_arm_machine_init_done(Notifier *notifier, void *data) > { > KVMDevice *kd, *tkd; > > memory_listener_unregister(&devlistener); > QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) { > if (kd->kda.addr != -1) { > if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR, > &kd->kda) < 0) { > fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n", > strerror(errno)); > abort(); > } > } > g_free(kd); > } > } > > This only goes one level deep, right? So if you ever have to nest the > VGIC inside of another MemoryRegion, this will break, right? We already nest the VGIC inside another memory region (the a15mpcore container), and it works fine. This function is just iterating through "everything any device asked me to tell the kernel about". -- PMM
On 21.03.2013, at 12:22, Peter Maydell wrote: > On 21 March 2013 11:14, Alexander Graf <agraf@suse.de> wrote: >> >> On 21.03.2013, at 12:09, Peter Maydell wrote: >> >>> On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote: >>>> >>>> On 21.03.2013, at 12:01, Peter Maydell wrote: >>>> >>>>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote: >>>>>> On 21.03.2013, at 11:53, Peter Maydell wrote: >>>>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically >>>>>>> the VGIC device model calls this function to say "tell the kernel >>>>>>> where this MemoryRegion is in the system address space, when it >>>>>>> eventually gets mapped". The code in kvm.c uses the memory system's >>>>>>> Notifier API to get a callback when the region is mapped into >>>>>>> an address space, which it uses to track the offset in the >>>>>>> address space. Finally, we use a machine init notifier so that >>>>>>> just before everything finally starts we can make the KVM ioctls >>>>>>> to say "here is where everything lives". >>>>>> >>>>>> Same thing here. The question is how the kvm-vgic code in QEMU >>>>>> finds out where it got mapped to. Scott adds this patch to do >>>>>> this, but I'd assume you have some other way :) >>>>> >>>>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it >>>>> lives. We have to tell the kernel so it can map its bits of >>>>> registers in at the right place, that's all. >>>> >>>> The kvm-vgic code in QEMU needs to tell the kernel, no? For >>>> that, it needs to know what to tell the kernel. >>> >>> No. As I explained earlier, all the kvm-vgic code needs to do >>> is call kvm_arm_register_device(). That code in kvm.c then takes >>> care of telling the kernel. hw/kvm/arm_gic.c itself never knows or >>> needs to know where it's mapped. This is the whole point of the >>> mechanism involving notifiers. >> >> I fully disagree. 
Code that talks to the in-kernel device should >> live in hw/kvm/<device>.c, not in some random target-XXX/kvm.c file. > > The code in kvm.c is entirely generic -- it provides a mechanism > for a device to say "this memory region is kernel ID X and it will > want to know where it lives". The kvm.c code will work for any device > with a memory mapped region, whether it's the GIC or something else. > >> Of course the board defines where the device gets mapped to, but >> the communication with the in-kernel device bits should really be >> contained to the device itself. > > You're arguing that every device should implement its own set > of notifier functions so it can get called back when its memory > regions are finally mapped, just so it can make a non-device-specific > KVM ioctl? The obvious thing to do is abstract that functionality > out. What I'm arguing is that every device should look as if it was a QEMU device. Devices that happen to live in KVM, should still make a significant effort to expose themselves to the board model as if they were QEMU devices. So yes, I think the device model should at least register the memory listener, because only the device model knows what its memory regions would map to in KVM's world. Not all devices have a single flat region. Some have more than one. Whether we have a helper function in (generic) kvm.c that can call a (generic) ioctl to set a device's region X is a different matter. I'd be open to that if it makes sense. 
> >> So this is the function that gets invoked on ARM: >> >> static void kvm_arm_machine_init_done(Notifier *notifier, void *data) >> { >> KVMDevice *kd, *tkd; >> >> memory_listener_unregister(&devlistener); >> QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) { >> if (kd->kda.addr != -1) { >> if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR, >> &kd->kda) < 0) { >> fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n", >> strerror(errno)); >> abort(); >> } >> } >> g_free(kd); >> } >> } >> >> This only goes one level deep, right? So if you ever have to nest the >> VGIC inside of another MemoryRegion, this will break, right? > > We already nest the VGIC inside another memory region (the a15mpcore > container), and it works fine. This function is just iterating through > "everything any device asked me to tell the kernel about". So kda is the real physical offset? I'm having a hard time reading that code :). According to this function: static void kvm_arm_devlistener_add(MemoryListener *listener, MemoryRegionSection *section) { KVMDevice *kd; QSLIST_FOREACH(kd, &kvm_devices_head, entries) { if (section->mr == kd->mr) { kd->kda.addr = section->offset_within_address_space; } } } it's only the offset within its parent region, which would mean it's broken, no? Alex
On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote: > On 21.03.2013, at 12:22, Peter Maydell wrote: >> We already nest the VGIC inside another memory region (the a15mpcore >> container), and it works fine. This function is just iterating through >> "everything any device asked me to tell the kernel about". > > So kda is the real physical offset? I'm having a hard time reading that code :). According to this function: > > static void kvm_arm_devlistener_add(MemoryListener *listener, > MemoryRegionSection *section) > { > KVMDevice *kd; > > QSLIST_FOREACH(kd, &kvm_devices_head, entries) { > if (section->mr == kd->mr) { > kd->kda.addr = section->offset_within_address_space; > } > } > } > > it's only the offset within its parent region, which would mean it's broken, no? Address spaces are not the same thing as memory regions :-) The only address space involved here is the system address space. (As I say, we currently assume we only get mapped into one address space, but that could be fixed if necessary.) -- PMM
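To make the distinction concrete, here is a small standalone sketch of "offset within the parent region" versus "offset within the address space". It mirrors the parent-chain walk from the patch at the top of the thread, but the `Region` struct is a hypothetical simplification and the addresses used with it are purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified region: only the fields needed here. */
typedef struct Region {
    struct Region *parent;  /* NULL for the root of an address space */
    uint64_t addr;          /* offset within the parent region */
} Region;

/* Sum the offsets up the parent chain, ignoring aliases and overlaps. */
uint64_t region_absolute_address(const Region *mr)
{
    assert(mr != NULL);

    uint64_t addr = mr->addr;
    while (mr->parent) {
        mr = mr->parent;
        addr += mr->addr;
    }
    return addr;
}
```

With a container at 0x2c000000 and a device region at offset 0x2000 inside it, `addr` on the device region is 0x2000 while the address-space offset is 0x2c002000. The listener is handed the latter, which is why nesting does not break it.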
On 21.03.2013, at 12:32, Peter Maydell wrote: > On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote: >> On 21.03.2013, at 12:22, Peter Maydell wrote: >>> We already nest the VGIC inside another memory region (the a15mpcore >>> container), and it works fine. This function is just iterating through >>> "everything any device asked me to tell the kernel about". >> >> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function: >> >> static void kvm_arm_devlistener_add(MemoryListener *listener, >> MemoryRegionSection *section) >> { >> KVMDevice *kd; >> >> QSLIST_FOREACH(kd, &kvm_devices_head, entries) { >> if (section->mr == kd->mr) { >> kd->kda.addr = section->offset_within_address_space; >> } >> } >> } >> >> it's only the offset within its parent region, which would mean it's broken, no? > > Address spaces are not the same thing as memory regions :-) > The only address space involved here is the system address space. > (As I say, we currently assume we only get mapped into one address > space, but that could be fixed if necessary.) Interesting. Oh well, I'll leave that one to Scott to figure out ;). So what if I want to write an in-kernel IDE PIO accelerator? Or even better yet: An AHCI accelerator that has one MMIO BAR and another PIO BAR that can be remapped by the guest at any time? The distinction on whether a region is handled by KVM really needs to be done by the device model. Alex
On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote: > > On 21.03.2013, at 12:32, Peter Maydell wrote: > >> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote: >>> On 21.03.2013, at 12:22, Peter Maydell wrote: >>>> We already nest the VGIC inside another memory region (the a15mpcore >>>> container), and it works fine. This function is just iterating through >>>> "everything any device asked me to tell the kernel about". >>> >>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function: >>> >>> static void kvm_arm_devlistener_add(MemoryListener *listener, >>> MemoryRegionSection *section) >>> { >>> KVMDevice *kd; >>> >>> QSLIST_FOREACH(kd, &kvm_devices_head, entries) { >>> if (section->mr == kd->mr) { >>> kd->kda.addr = section->offset_within_address_space; >>> } >>> } >>> } >>> >>> it's only the offset within its parent region, which would mean it's broken, no? >> >> Address spaces are not the same thing as memory regions :-) >> The only address space involved here is the system address space. >> (As I say, we currently assume we only get mapped into one address >> space, but that could be fixed if necessary.) > > Interesting. Oh well, I'll leave that one to Scott to figure out ;). > > So what if I want to write an in-kernel IDE PIO accelerator? Have the QEMU end of that device call (your equivalent of) kvm_arm_register_device(), and provide a 'reserved' mmio region to its users; the kernel end implements the standard 'tell me where I live' ioctl; that's it. > Or even better yet: An AHCI accelerator that has one MMIO BAR and > another PIO BAR that can be remapped by the guest at any time? Guest remappable KVM regions would require enhancements, yes (it's not like we have an existing mechanism for doing that on any architecture at the moment). The principle of implementing the mechanics of this in common code still holds, probably even more so for the increased complexity. 
> The distinction on whether a region is handled by KVM really needs > to be done by the device model. It is -- the device model is what calls kvm_arm_register_device(). It's just the mechanics of "how do we tell the kernel the right address for this region at the point when we know it" that are handled in kvm.c. -- PMM
On 21.03.2013, at 12:44, Peter Maydell wrote: > On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote: >> >> On 21.03.2013, at 12:32, Peter Maydell wrote: >> >>> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote: >>>> On 21.03.2013, at 12:22, Peter Maydell wrote: >>>>> We already nest the VGIC inside another memory region (the a15mpcore >>>>> container), and it works fine. This function is just iterating through >>>>> "everything any device asked me to tell the kernel about". >>>> >>>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function: >>>> >>>> static void kvm_arm_devlistener_add(MemoryListener *listener, >>>> MemoryRegionSection *section) >>>> { >>>> KVMDevice *kd; >>>> >>>> QSLIST_FOREACH(kd, &kvm_devices_head, entries) { >>>> if (section->mr == kd->mr) { >>>> kd->kda.addr = section->offset_within_address_space; >>>> } >>>> } >>>> } >>>> >>>> it's only the offset within its parent region, which would mean it's broken, no? >>> >>> Address spaces are not the same thing as memory regions :-) >>> The only address space involved here is the system address space. >>> (As I say, we currently assume we only get mapped into one address >>> space, but that could be fixed if necessary.) >> >> Interesting. Oh well, I'll leave that one to Scott to figure out ;). >> >> So what if I want to write an in-kernel IDE PIO accelerator? > > Have the QEMU end of that device call (your equivalent of) > kvm_arm_register_device(), and provide a 'reserved' mmio region to > its users; the kernel end implements the standard 'tell me where I live' > ioctl; that's it. > >> Or even better yet: An AHCI accelerator that has one MMIO BAR and >> another PIO BAR that can be remapped by the guest at any time? > > Guest remappable KVM regions would require enhancements, yes (it's > not like we have an existing mechanism for doing that on any > architecture at the moment). 
The principle of implementing the > mechanics of this in common code still holds, probably even more > so for the increased complexity. > >> The distinction on whether a region is handled by KVM really needs >> to be done by the device model. > > It is -- the device model is what calls kvm_arm_register_device(). > It's just the mechanics of "how do we tell the kernel the right > address for this region at the point when we know it" that are > handled in kvm.c. I think I'm slowly grasping what you're aiming at :). Ok, that works. You do actually do the listener in the device model, just that you pass code responsibility over to kvm.c. That's perfectly valid and sounds like a good model that Scott probably wants to follow as well :). Alex
On 21.03.2013, at 12:49, Alexander Graf wrote: > > On 21.03.2013, at 12:44, Peter Maydell wrote: > >> On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote: >>> >>> On 21.03.2013, at 12:32, Peter Maydell wrote: >>> >>>> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote: >>>>> On 21.03.2013, at 12:22, Peter Maydell wrote: >>>>>> We already nest the VGIC inside another memory region (the a15mpcore >>>>>> container), and it works fine. This function is just iterating through >>>>>> "everything any device asked me to tell the kernel about". >>>>> >>>>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function: >>>>> >>>>> static void kvm_arm_devlistener_add(MemoryListener *listener, >>>>> MemoryRegionSection *section) >>>>> { >>>>> KVMDevice *kd; >>>>> >>>>> QSLIST_FOREACH(kd, &kvm_devices_head, entries) { >>>>> if (section->mr == kd->mr) { >>>>> kd->kda.addr = section->offset_within_address_space; >>>>> } >>>>> } >>>>> } >>>>> >>>>> it's only the offset within its parent region, which would mean it's broken, no? >>>> >>>> Address spaces are not the same thing as memory regions :-) >>>> The only address space involved here is the system address space. >>>> (As I say, we currently assume we only get mapped into one address >>>> space, but that could be fixed if necessary.) >>> >>> Interesting. Oh well, I'll leave that one to Scott to figure out ;). >>> >>> So what if I want to write an in-kernel IDE PIO accelerator? >> >> Have the QEMU end of that device call (your equivalent of) >> kvm_arm_register_device(), and provide a 'reserved' mmio region to >> its users; the kernel end implements the standard 'tell me where I live' >> ioctl; that's it. >> >>> Or even better yet: An AHCI accelerator that has one MMIO BAR and >>> another PIO BAR that can be remapped by the guest at any time? 
>> >> Guest remappable KVM regions would require enhancements, yes (it's >> not like we have an existing mechanism for doing that on any >> architecture at the moment). The principle of implementing the >> mechanics of this in common code still holds, probably even more >> so for the increased complexity. >> >>> The distinction on whether a region is handled by KVM really needs >>> to be done by the device model. >> >> It is -- the device model is what calls kvm_arm_register_device(). >> It's just the mechanics of "how do we tell the kernel the right >> address for this region at the point when we know it" that are >> handled in kvm.c. > > I think I'm slowly grasping what you're aiming at :). Ok, that works. You do actually do the listener in the device model, just that you pass code responsibility over to kvm.c. > > That's perfectly valid and sounds like a good model that Scott probably wants to follow as well :). s/follow/evaluate/ :). The currently proposed device api doesn't have a generic notion of device regions. Regions are a per-device property, because a single device can have multiple regions. However, maybe with a bit of brainstorming we could come up with a reasonably generic scheme. Alex
On 21 March 2013 11:49, Alexander Graf <agraf@suse.de> wrote: > > On 21.03.2013, at 12:44, Peter Maydell wrote: >> It is -- the device model is what calls kvm_arm_register_device(). >> It's just the mechanics of "how do we tell the kernel the right >> address for this region at the point when we know it" that are >> handled in kvm.c. > > I think I'm slowly grasping what you're aiming at :). Ok, that > works. You do actually do the listener in the device model, just > that you pass code responsibility over to kvm.c. > > That's perfectly valid and sounds like a good model that Scott > probably wants to follow as well :). Yep. We were actually originally going to make the device ioctl a generic one, not an ARM one, because there really isn't anything ARM specific about it. We should probably move the code from target-arm/kvm.c into kvm-all.c with an arch hook to specify the ioctl to use (same as irq_set_ioctl) if you want to do the same approach with PPC. Re multiple regions: yes, the VGIC has several. We just divide the u32 ID into two halves, one for a device ID and one for a region ID for that device. -- PMM
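The "two halves of a u32" scheme for multi-region devices might look like the sketch below. Which half carries the device ID and which the region ID is an assumption made for illustration, as are the helper names; the thread only says the ID is split in two.

```c
#include <assert.h>
#include <stdint.h>

#define ID_HALF_BITS 16u
#define ID_HALF_MASK 0xffffu

/* Assumed layout: device ID in the upper half, region ID in the lower. */
uint32_t make_region_id(uint32_t device_id, uint32_t region_id)
{
    assert(device_id <= ID_HALF_MASK && region_id <= ID_HALF_MASK);
    return (device_id << ID_HALF_BITS) | region_id;
}

uint32_t region_id_device(uint32_t id)
{
    return id >> ID_HALF_BITS;
}

uint32_t region_id_region(uint32_t id)
{
    return id & ID_HALF_MASK;
}
```

A device with several regions would then register each one under the same device half and a different region half.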
On 21 March 2013 22:43, Scott Wood <scottwood@freescale.com> wrote:
> What if the update is to a parent memory region, not to the one directly
> associated with the device?
>
> Or does add() get called for all child regions (recursively) in such cases?

The memory API flattens the tree of memory regions down into a flat
view of the address space. These callbacks get called for the
final flattened view (so you'll never see a pure container in the
callback, only leaves). The callbacks happen for every region which
appears in the address space, in linear order. When an update happens,
memory.c identifies the changes between the old flat view and the
new one and calls callbacks appropriately. This code isn't the
first use of the memory API listeners, so it's all well-tested code.

>> However, maybe with a bit of brainstorming we could come up with a
>> reasonably generic scheme.
> In the kernel API? Or do you mean a generic scheme within QEMU that encodes
> any reasonably expected mechanism for setting the device address (e.g. assume
> that it is either a 64-bit attribute, or uses the legacy ARM API), or
> perhaps a callback into device code?
>
> The MPIC's memory listener isn't that much code... I'm not sure
> there's a great need for a central KVM registry.

Well, nor is the ARM memory listener, but why have two bits of
code doing the same thing when you could have one?

-- PMM
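The "compare old flat view with new flat view, then fire callbacks" behaviour described above can be sketched roughly as follows. This is a toy model, not the code in memory.c: the FlatRange and ToyListener types, the integer owner field, and the quadratic comparison are all assumptions made for illustration (the real implementation may well walk both views in address order instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a flattened view: a leaf region visible at [start, start+size).
 * 'owner' is a hypothetical stand-in for the MemoryRegion pointer. */
typedef struct FlatRange {
    int owner;
    uint64_t start;
    uint64_t size;
} FlatRange;

typedef struct ToyListener ToyListener;
struct ToyListener {
    void (*add)(ToyListener *l, const FlatRange *r);
    void (*del)(ToyListener *l, const FlatRange *r);
    int adds;   /* number of add callbacks seen */
    int dels;   /* number of del callbacks seen */
};

/* Example callbacks that only count invocations. */
static void count_add(ToyListener *l, const FlatRange *r) { (void)r; l->adds++; }
static void count_del(ToyListener *l, const FlatRange *r) { (void)r; l->dels++; }

static bool range_equal(const FlatRange *a, const FlatRange *b)
{
    return a->owner == b->owner && a->start == b->start && a->size == b->size;
}

static bool view_contains(const FlatRange *view, size_t n, const FlatRange *r)
{
    for (size_t i = 0; i < n; i++) {
        if (range_equal(&view[i], r)) {
            return true;
        }
    }
    return false;
}

/* Report the difference between two flat views: ranges that vanished get
 * a del, ranges that appeared get an add, unchanged ranges get nothing. */
static void view_update(ToyListener *l,
                        const FlatRange *old_view, size_t n_old,
                        const FlatRange *new_view, size_t n_new)
{
    for (size_t i = 0; i < n_old; i++) {
        if (!view_contains(new_view, n_new, &old_view[i])) {
            l->del(l, &old_view[i]);
        }
    }
    for (size_t i = 0; i < n_new; i++) {
        if (!view_contains(old_view, n_old, &new_view[i])) {
            l->add(l, &new_view[i]);
        }
    }
}
```

In this model, moving a parent container shows up to the listener as a del of the leaf at its old address followed by an add at the new one, which is why a device only has to watch for its own leaf region.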
On 22 March 2013 22:05, Scott Wood <scottwood@freescale.com> wrote:
> On 03/22/2013 08:08:57 AM, Peter Maydell wrote:
>> The memory API flattens the tree of memory regions down into a flat
>> view of the address space. These callbacks get called for the
>> final flattened view (so you'll never see a pure container in the
>> callback, only leaves). The callbacks happen for every region which
>> appears in the address space, in linear order. When an update happens
>> memory.c identifies the changes between the old flat view and the
>> new one and calls callbacks appropriately.
>
> OK, so .add and .del will be sufficient to capture any manipulation that
> would affect whether and where the region we care about is mapped?

Yes. Note that if the board (brokenly) maps the region so it is
'hidden' by another region, this manifests as a .del [since it is
no longer accessible]. Also I think if the board maps something
small on top and in the middle of the region you get an add for
each of the partially visible fragments. Personally I'm happy to
not worry about either of these cases on the basis that they would
be board model bugs.

>> This code isn't the
>> first use of the memory API listeners, so it's all well-tested code.
>
> Sure, I'm not suggesting the code doesn't work -- just trying to understand
> how, so I know I'm using it properly. The implementation is a bit opaque
> (to me at least), and the listener callbacks aren't documented the way the
> normal API functions are.

Yeah, it would I guess be good to add doc comments for all the
fields in struct MemoryListener describing the semantics of the
callbacks.

>> > The MPIC's memory listener isn't that much code... I'm not sure
>> > there's a great need for a central KVM registry.
>>
>> Well, nor is the ARM memory listener, but why have two bits of
>> code doing the same thing when you could have one?
>
> They're not doing quite the same thing, though, and the effort required to
> unify them is non-zero. The two main issues are the way that the address is
> communicated to KVM, and the ability to change the mapping after the guest
> starts.

Ah, guest-programmable mappings are a real use case and not
a hypothetical? Do we run into synchronisation issues with making
sure that QEMU and the kernel both agree simultaneously about
where the mapping is? Can the mapping be different between
different CPU cores? [let's hope not :-)] Is the mapping controlled
by a register within the mapping itself, or is there some separate
non-moving register which defines the location of the mappable
registers?

thanks
-- PMM
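The two overlap cases mentioned earlier in this message (a region fully hidden by another, and a small region mapped on top of the middle of it) can be illustrated with a small helper that computes which pieces stay visible. This is only a sketch of the arithmetic under the assumption that the overlay always wins; the ToyRange type and function name are hypothetical, and address overflow is ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef struct ToyRange {
    uint64_t start;
    uint64_t size;
} ToyRange;

/* Compute the parts of 'region' not covered by 'overlay'.
 * Writes up to two fragments to out[] and returns how many there are:
 *   0 -> region is fully hidden (a listener would only see a del)
 *   1 -> region is untouched, or clipped at one end
 *   2 -> overlay sits in the middle, leaving a fragment on each side
 * ASSUMPTION: the overlay has higher priority and no arithmetic overflows. */
static int toy_visible_fragments(ToyRange region, ToyRange overlay,
                                 ToyRange out[2])
{
    uint64_t r_end = region.start + region.size;
    uint64_t o_end = overlay.start + overlay.size;
    int n = 0;

    /* No overlap at all: the whole region stays visible. */
    if (o_end <= region.start || overlay.start >= r_end) {
        out[n++] = region;
        return n;
    }
    /* Piece of the region below the overlay. */
    if (overlay.start > region.start) {
        out[n++] = (ToyRange){ region.start, overlay.start - region.start };
    }
    /* Piece of the region above the overlay. */
    if (o_end < r_end) {
        out[n++] = (ToyRange){ o_end, r_end - o_end };
    }
    return n;
}
```

A listener that assumes it will see exactly one add for its region would be confused by the two-fragment case, which is one reason for treating such a mapping as a board model bug.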
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2322732..b800391 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -892,6 +892,15 @@ void *address_space_map(AddressSpace *as, hwaddr addr,
 void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
                          int is_write, hwaddr access_len);
 
+/* memory_region_to_address: Find the full address of the start of the
+ * given #MemoryRegion, ignoring aliases. There is no guarantee
+ * that the #MemoryRegion is actually visible at this address, if
+ * there are overlapping regions.
+ *
+ * @mr: #MemoryRegion being queried
+ * @asp: if non-NULL, returns the #AddressSpace @mr is mapped in, if any
+ */
+hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp);
 
 #endif
 
diff --git a/memory.c b/memory.c
index cd7d5e0..0099f12 100644
--- a/memory.c
+++ b/memory.c
@@ -453,21 +453,51 @@ const IORangeOps memory_region_iorange_ops = {
     .destructor = memory_region_iorange_destructor,
 };
 
-static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
+static AddressSpace *memory_region_root_to_address_space(MemoryRegion *mr)
 {
     AddressSpace *as;
 
-    while (mr->parent) {
-        mr = mr->parent;
-    }
     QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
         if (mr == as->root) {
             return as;
         }
     }
+
+    return NULL;
+}
+
+static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
+{
+    AddressSpace *as;
+
+    while (mr->parent) {
+        mr = mr->parent;
+    }
+
+    as = memory_region_root_to_address_space(mr);
+    if (as) {
+        return as;
+    }
+
     abort();
 }
 
+hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp)
+{
+    hwaddr addr = mr->addr;
+
+    while (mr->parent) {
+        mr = mr->parent;
+        addr += mr->addr;
+    }
+
+    if (asp) {
+        *asp = memory_region_root_to_address_space(mr);
+    }
+
+    return addr;
+}
+
 /* Render a memory region into the global view. Ranges in @view obscure
  * ranges in @mr.
  */
This is useful for when a user of the memory region API needs to
communicate the absolute bus address to something outside QEMU
(in particular, KVM).

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 include/exec/memory.h |    9 +++++++++
 memory.c              |   38 ++++++++++++++++++++++++++++++++++----
 2 files changed, 43 insertions(+), 4 deletions(-)
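For readers without a QEMU tree at hand, the address computation in memory_region_to_address() can be tried out in isolation. The sketch below mirrors only the parent-walking loop from the patch; ToyRegion and toy_hwaddr are stand-ins invented for this example, and the address-space lookup is reduced to returning the root region, so it says nothing about aliases or about whether the region is actually visible at the computed address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_hwaddr;        /* stand-in for QEMU's hwaddr */

/* Hypothetical, stripped-down region: just an offset and a parent link. */
typedef struct ToyRegion {
    toy_hwaddr addr;                /* offset within the parent region */
    struct ToyRegion *parent;       /* NULL for the root of the tree */
} ToyRegion;

/* Sum the offsets from 'mr' up to the root, as the loop in the patch does.
 * If rootp is non-NULL it receives the root region that was reached. */
static toy_hwaddr toy_region_to_address(const ToyRegion *mr,
                                        const ToyRegion **rootp)
{
    toy_hwaddr addr = mr->addr;

    while (mr->parent) {
        mr = mr->parent;
        addr += mr->addr;
    }

    if (rootp) {
        *rootp = mr;
    }

    return addr;
}
```

For example, a device at offset 0x40000 inside a container mapped at 0xe0000000 in the root resolves to 0xe0040000, which is the kind of absolute address a caller would then pass on to KVM.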