Message ID | 1305814352-15044-2-git-send-email-avi@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, 2011-05-19 at 10:12 -0400, Avi Kivity wrote: > The memory API separates the attributes of a memory region (its size, how > reads or writes are handled, dirty logging, and coalescing) from where it > is mapped and whether it is enabled. This allows a device to configure > a memory region once, then hand it off to its parent bus to map it according > to the bus configuration. > > Hierarchical registration also allows a device to compose a region out of > a number of sub-regions with different properties; for example some may be > RAM while others may be MMIO. > > Signed-off-by: Avi Kivity <avi@redhat.com> > --- > memory.h | 142 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 142 insertions(+), 0 deletions(-) > create mode 100644 memory.h > > diff --git a/memory.h b/memory.h > new file mode 100644 > index 0000000..77c5951 > --- /dev/null > +++ b/memory.h > @@ -0,0 +1,142 @@ > +#ifndef MEMORY_H > +#define MEMORY_H > + > +#include <stdint.h> > +#include <stdbool.h> > +#include "qemu-common.h" > +#include "cpu-common.h" > +#include "targphys.h" > +#include "qemu-queue.h" > + > +typedef struct MemoryRegionOps MemoryRegionOps; > +typedef struct MemoryRegion MemoryRegion; > + > +/* > + * Memory region callbacks > + */ > +struct MemoryRegionOps { > + /* Read from the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + uint64_t (*read)(MemoryRegion *mr, > + target_phys_addr_t addr, > + unsigned size); > + /* Write to the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + void (*write)(MemoryRegion *mr, > + target_phys_addr_t addr, > + uint64_t data, > + unsigned size); > + /* Guest-visible constraints: */ > + struct { > + /* If nonzero, specify bounds on access sizes beyond which a machine > + * check is thrown. > + */ > + unsigned min_access_size; > + unsigned max_access_size; Do we always support all access sizes between min and max? This might be easier to describe as a bitmap of supported power of 2 access sizes. Thanks, Alex
On 2011-05-19 16:12, Avi Kivity wrote: > +/* Sets an offset to be added to MemoryRegionOps callbacks. */ > +void memory_region_set_offset(MemoryRegion *mr, target_phys_addr_t offset); Please mark this as a legacy helper, ideally to be removed after the complete conversion to this API. During that phase we should try to identify those devices which still depend on offset=0 and maybe directly fix them. > +/* Turn loggging on or off for specified client (display, migration) */ > +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client); > +/* Enable memory coalescing for the region. MMIO ->write callbacks may be > + * delayed until a non-coalesced MMIO is issued. > + */ > +void memory_region_set_coalescing(MemoryRegion *mr); > +/* Enable memory coalescing for a sub-range of the region. MMIO ->write > + * callbacks may be delayed until a non-coalesced MMIO is issued. > + */ > +void memory_region_add_coalescing(MemoryRegion *mr, > + target_phys_addr_t offset, > + target_phys_addr_t size); Will this be such a common use case that requesting the user to split up the region and then use set_coalescing will generate too much boiler plate code? > +/* Disable MMIO coalescing for the region. */ > +void memory_region_clear_coalescing(MemoryRegion *mr); And what about clearing coalescing for sub-ranges? Maybe skip add_coalescing for the first run and see how far we get. Jan
On 05/19/2011 09:12 AM, Avi Kivity wrote: > The memory API separates the attributes of a memory region (its size, how > reads or writes are handled, dirty logging, and coalescing) from where it > is mapped and whether it is enabled. This allows a device to configure > a memory region once, then hand it off to its parent bus to map it according > to the bus configuration. > > Hierarchical registration also allows a device to compose a region out of > a number of sub-regions with different properties; for example some may be > RAM while others may be MMIO. > > Signed-off-by: Avi Kivity<avi@redhat.com> > --- > memory.h | 142 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 142 insertions(+), 0 deletions(-) > create mode 100644 memory.h > > diff --git a/memory.h b/memory.h > new file mode 100644 > index 0000000..77c5951 > --- /dev/null > +++ b/memory.h > @@ -0,0 +1,142 @@ > +#ifndef MEMORY_H > +#define MEMORY_H > + > +#include<stdint.h> > +#include<stdbool.h> > +#include "qemu-common.h" > +#include "cpu-common.h" > +#include "targphys.h" > +#include "qemu-queue.h" > + > +typedef struct MemoryRegionOps MemoryRegionOps; > +typedef struct MemoryRegion MemoryRegion; > + > +/* > + * Memory region callbacks > + */ > +struct MemoryRegionOps { > + /* Read from the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + uint64_t (*read)(MemoryRegion *mr, > + target_phys_addr_t addr, > + unsigned size); > + /* Write to the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + void (*write)(MemoryRegion *mr, > + target_phys_addr_t addr, > + uint64_t data, > + unsigned size); > + /* Guest-visible constraints: */ > + struct { > + /* If nonzero, specify bounds on access sizes beyond which a machine > + * check is thrown. > + */ > + unsigned min_access_size; > + unsigned max_access_size; > + /* If true, unaligned accesses are supported. Otherwise unaligned > + * accesses throw machine checks. > + */ > + bool unaligned; > + } valid; Under what circumstances would this be used? The behavior of devices that receive non-natural accesses varies wildly. For PCI devices, invalid accesses almost always return ~0. I can't think of a device where an MCE would occur. Regards, Anthony Liguori
Am 19.05.2011 16:12, schrieb Avi Kivity: > The memory API separates the attributes of a memory region (its size, how > reads or writes are handled, dirty logging, and coalescing) from where it > is mapped and whether it is enabled. This allows a device to configure > a memory region once, then hand it off to its parent bus to map it > according > to the bus configuration. > > Hierarchical registration also allows a device to compose a region out of > a number of sub-regions with different properties; for example some may be > RAM while others may be MMIO. > > Signed-off-by: Avi Kivity <avi@redhat.com> > --- > memory.h | 142 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 142 insertions(+), 0 deletions(-) > create mode 100644 memory.h > > diff --git a/memory.h b/memory.h > new file mode 100644 > index 0000000..77c5951 > --- /dev/null > +++ b/memory.h > @@ -0,0 +1,142 @@ > +#ifndef MEMORY_H > +#define MEMORY_H > + > +#include <stdint.h> > +#include <stdbool.h> stdbool.h is already included in qemu-common.h, stdint.h (indirectly) too. Therefore both include statements can be removed. > +#include "qemu-common.h" > +#include "cpu-common.h" > +#include "targphys.h" > +#include "qemu-queue.h" > + > +typedef struct MemoryRegionOps MemoryRegionOps; > +typedef struct MemoryRegion MemoryRegion; > + > +/* > + * Memory region callbacks > + */ > +struct MemoryRegionOps { > + /* Read from the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + uint64_t (*read)(MemoryRegion *mr, > + target_phys_addr_t addr, > + unsigned size); > + /* Write to the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + void (*write)(MemoryRegion *mr, > + target_phys_addr_t addr, > + uint64_t data, > + unsigned size); > + /* Guest-visible constraints: */ > + struct { > + /* If nonzero, specify bounds on access sizes beyond which a machine > + * check is thrown. > + */ > + unsigned min_access_size; > + unsigned max_access_size; > + /* If true, unaligned accesses are supported. Otherwise unaligned > + * accesses throw machine checks. > + */ > + bool unaligned; > + } valid; > + /* Internal implementation constraints: */ > + struct { > + /* If nonzero, specifies the minimum size implemented. Smaller sizes > + * will be rounded upwards and a partial result will be returned. > + */ > + unsigned min_access_size; > + /* If nonzero, specifies the maximum size implemented. Larger sizes > + * will be done as a series of accesses with smaller sizes. > + */ > + unsigned max_access_size; > + /* If true, unaligned accesses are supported. Otherwise all accesses > + * are converted to (possibly multiple) naturally aligned accesses. > + */ > + bool unaligned; > + } impl; > +}; > + > +typedef struct CoalescedMemoryRange CoalescedMemoryRange; > + > +struct CoalescedMemoryRange { > + target_phys_addr_t start; > + target_phys_addr_t size; > + QTAILQ_ENTRY(coalesced_ranges) link; > +}; > + > +struct MemoryRegion { > + /* All fields are private - violators will be prosecuted */ Is it possible to move this private declaration into the implementation file (or a private header file if the declaration is needed by more than one file)? > + const MemoryRegionOps *ops; > + MemoryRegion *parent; > + target_phys_addr_t size; > + target_phys_addr_t addr; > + ram_addr_t ram_addr; > + unsigned priority; > + bool may_overlap; > + QTAILQ_HEAD(subregions, MemoryRegion) subregions; > + QTAILQ_ENTRY(subregions) subregions_link; > + QTAILQ_HEAD(coalesced_ranges, CoalescedMemoryRange) coalesced; > +}; > + > +/* Initialize a memory region > + * > + * The region typically acts as a container for other memory regions. > + */ > +void memory_region_init(MemoryRegion *mr, > + target_phys_addr_t size); > +/* Initialize an I/O memory region. Accesses into the region will be > + * cause the callbacks in @ops to be called. > + * > + * if @size is nonzero, subregions will be clipped to @size. > + */ > +void memory_region_init_io(MemoryRegion *mr, > + const MemoryRegionOps *ops, > + target_phys_addr_t size); > +/* Initialize an I/O memory region. Accesses into the region will be > + * modify memory directly. > + */ > +void memory_region_init_ram(MemoryRegion *mr, > + target_phys_addr_t size); > +/* Initialize a RAM memory region. Accesses into the region will be > + * modify memory in @ptr directly. > + */ > +void memory_region_init_ram_ptr(MemoryRegion *mr, > + target_phys_addr_t size, > + void *ptr); > +/* Destroy a memory region. The memory becomes inaccessible. */ > +void memory_region_destroy(MemoryRegion *mr); > +/* Sets an offset to be added to MemoryRegionOps callbacks. */ > +void memory_region_set_offset(MemoryRegion *mr, target_phys_addr_t > offset); > +/* Turn loggging on or off for specified client (display, migration) */ > +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client); > +/* Enable memory coalescing for the region. MMIO ->write callbacks may be > + * delayed until a non-coalesced MMIO is issued. > + */ > +void memory_region_set_coalescing(MemoryRegion *mr); > +/* Enable memory coalescing for a sub-range of the region. MMIO ->write > + * callbacks may be delayed until a non-coalesced MMIO is issued. > + */ > +void memory_region_add_coalescing(MemoryRegion *mr, > + target_phys_addr_t offset, > + target_phys_addr_t size); > +/* Disable MMIO coalescing for the region. */ > +void memory_region_clear_coalescing(MemoryRegion *mr); > + > +/* Add a sub-region at @offset. The sub-region may not overlap with other > + * subregions (except for those explicitly marked as overlapping) > + */ > +void memory_region_add_subregion(MemoryRegion *mr, > + target_phys_addr_t offset, > + MemoryRegion *subregion); > +/* Add a sub-region at @offset. The sun-region may overlap other > subregions; > + * conflicts are resolved by having a higher @priority hide a lower > @priority. > + * Subregions without priority are taken as @priority 0. > + */ > +void memory_region_add_subregion_overlap(MemoryRegion *mr, > + target_phys_addr_t offset, > + MemoryRegion *subregion, > + unsigned priority); > +/* Remove a subregion. */ > +void memory_region_del_subregion(MemoryRegion *mr, > + MemoryRegion *subregion); > + > +#endif Kind regards, Stefan W.
On Thu, May 19, 2011 at 3:12 PM, Avi Kivity <avi@redhat.com> wrote: > +struct MemoryRegion { > + /* All fields are private - violators will be prosecuted */ > + const MemoryRegionOps *ops; > + MemoryRegion *parent; In the case where a region is aliased (mapped twice into the address space at different addresses) I need two MemoryRegions? The MemoryRegion describes an actual mapping in the <parent, addr, ram_addr> tuple, not just the attributes of the region (ops, size, ...). Stefan
On 05/19/2011 10:07 PM, Alex Williamson wrote: > On Thu, 2011-05-19 at 10:12 -0400, Avi Kivity wrote: > > The memory API separates the attributes of a memory region (its size, how > > reads or writes are handled, dirty logging, and coalescing) from where it > > is mapped and whether it is enabled. This allows a device to configure > > a memory region once, then hand it off to its parent bus to map it according > > to the bus configuration. > > > > Hierarchical registration also allows a device to compose a region out of > > a number of sub-regions with different properties; for example some may be > > RAM while others may be MMIO. > > > + /* Guest-visible constraints: */ > > + struct { > > + /* If nonzero, specify bounds on access sizes beyond which a machine > > + * check is thrown. > > + */ > > + unsigned min_access_size; > > + unsigned max_access_size; > > Do we always support all access sizes between min and max? As far as I can tell, yes. > This might > be easier to describe as a bitmap of supported power of 2 access sizes. This is uglier to initialize. However we can provide #defines for common use (MEM_ACCESS_BYTE_TO_LONG, MEM_ACCESS_LONG).
On 05/19/2011 10:27 PM, Jan Kiszka wrote: > On 2011-05-19 16:12, Avi Kivity wrote: > > +/* Sets an offset to be added to MemoryRegionOps callbacks. */ > > +void memory_region_set_offset(MemoryRegion *mr, target_phys_addr_t offset); > > Please mark this as a legacy helper, ideally to be removed after the > complete conversion to this API. During that phase we should try to > identify those devices which still depend on offset=0 and maybe directly > fix them. Okay. > > +/* Turn loggging on or off for specified client (display, migration) */ > > +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client); > > +/* Enable memory coalescing for the region. MMIO ->write callbacks may be > > + * delayed until a non-coalesced MMIO is issued. > > + */ > > +void memory_region_set_coalescing(MemoryRegion *mr); > > +/* Enable memory coalescing for a sub-range of the region. MMIO ->write > > + * callbacks may be delayed until a non-coalesced MMIO is issued. > > + */ > > +void memory_region_add_coalescing(MemoryRegion *mr, > > + target_phys_addr_t offset, > > + target_phys_addr_t size); > > Will this be such a common use case that requesting the user to split up > the region and then use set_coalescing will generate too much boiler > plate code? Look at e1000, coalescing ranges have byte granularity. > > +/* Disable MMIO coalescing for the region. */ > > +void memory_region_clear_coalescing(MemoryRegion *mr); > > And what about clearing coalescing for sub-ranges? Clear them all and rebuild. > Maybe skip > add_coalescing for the first run and see how far we get. We get as far as e.
On 05/19/2011 11:43 PM, Anthony Liguori wrote: > On 05/19/2011 09:12 AM, Avi Kivity wrote: >> The memory API separates the attributes of a memory region (its size, >> how >> reads or writes are handled, dirty logging, and coalescing) from >> where it >> is mapped and whether it is enabled. This allows a device to configure >> a memory region once, then hand it off to its parent bus to map it >> according >> to the bus configuration. >> >> Hierarchical registration also allows a device to compose a region >> out of >> a number of sub-regions with different properties; for example some >> may be >> RAM while others may be MMIO. >> >> + struct { >> + /* If nonzero, specify bounds on access sizes beyond which a >> machine >> + * check is thrown. >> + */ >> + unsigned min_access_size; >> + unsigned max_access_size; >> + /* If true, unaligned accesses are supported. Otherwise >> unaligned >> + * accesses throw machine checks. >> + */ >> + bool unaligned; >> + } valid; > > Under what circumstances would this be used? > > The behavior of devices that receive non-natural accesses varies wildly. > > For PCI devices, invalid accesses almost always return ~0. I can't > think of a device where an MCE would occur. This was requested by Richard, so I'll let him comment.
On 05/20/2011 12:04 AM, Stefan Weil wrote: > Am 19.05.2011 16:12, schrieb Avi Kivity: >> The memory API separates the attributes of a memory region (its size, >> how >> reads or writes are handled, dirty logging, and coalescing) from >> where it >> is mapped and whether it is enabled. This allows a device to configure >> a memory region once, then hand it off to its parent bus to map it >> according >> to the bus configuration. >> >> Hierarchical registration also allows a device to compose a region >> out of >> a number of sub-regions with different properties; for example some >> may be >> RAM while others may be MMIO. >> --- /dev/null >> +++ b/memory.h >> @@ -0,0 +1,142 @@ >> +#ifndef MEMORY_H >> +#define MEMORY_H >> + >> +#include <stdint.h> >> +#include <stdbool.h> > > stdbool.h is already included in qemu-common.h, > stdint.h (indirectly) too. > > Therefore both include statements can be removed. We shouldn't rely on indirect includes, it makes updating headers very hard. Each header should #include what it directly needs and no more. >> +typedef struct CoalescedMemoryRange CoalescedMemoryRange; >> + >> +struct CoalescedMemoryRange { >> + target_phys_addr_t start; >> + target_phys_addr_t size; >> + QTAILQ_ENTRY(coalesced_ranges) link; >> +}; >> + >> +struct MemoryRegion { >> + /* All fields are private - violators will be prosecuted */ > > Is it possible to move this private declaration into the implementation > file (or a private header file if the declaration is needed by more than > one file)? No, the structure size is needed by clients.
On 05/20/2011 12:11 AM, Stefan Hajnoczi wrote: > On Thu, May 19, 2011 at 3:12 PM, Avi Kivity<avi@redhat.com> wrote: > > +struct MemoryRegion { > > + /* All fields are private - violators will be prosecuted */ > > + const MemoryRegionOps *ops; > > + MemoryRegion *parent; > > In the case where a region is aliased (mapped twice into the address > space at different addresses) I need two MemoryRegions? Yes. > The > MemoryRegion describes an actual mapping in the<parent, addr, > ram_addr> tuple, not just the attributes of the region (ops, size, > ...). Correct. The region is not just a read-only descriptor. memory_region_add_subregion() can be used only once on a region (unless you _del_subregion() in between). (it also follows from the fact that there is no separate opaque for registration, and from the fact that RAM is owned by the region, not provided as part of registration).
On 05/20/2011 02:23 AM, Avi Kivity wrote: > On 05/19/2011 11:43 PM, Anthony Liguori wrote: >> On 05/19/2011 09:12 AM, Avi Kivity wrote: >>> The memory API separates the attributes of a memory region (its size, how >>> reads or writes are handled, dirty logging, and coalescing) from where it >>> is mapped and whether it is enabled. This allows a device to configure >>> a memory region once, then hand it off to its parent bus to map it according >>> to the bus configuration. >>> >>> Hierarchical registration also allows a device to compose a region out of >>> a number of sub-regions with different properties; for example some may be >>> RAM while others may be MMIO. >>> >>> + struct { >>> + /* If nonzero, specify bounds on access sizes beyond which a machine >>> + * check is thrown. >>> + */ >>> + unsigned min_access_size; >>> + unsigned max_access_size; >>> + /* If true, unaligned accesses are supported. Otherwise unaligned >>> + * accesses throw machine checks. >>> + */ >>> + bool unaligned; >>> + } valid; >> >> Under what circumstances would this be used? >> >> The behavior of devices that receive non-natural accesses varies wildly. >> >> For PCI devices, invalid accesses almost always return ~0. I can't think of a device where an MCE would occur. > > This was requested by Richard, so I'll let him comment. > Several alpha system chips MCE when accessed with incorrect sizes. E.g. only 64-bit accesses are allowed. Is this structure honestly any better than 4 function pointers? I can't see that it is, myself. r~
On 05/20/2011 09:06 AM, Richard Henderson wrote: > On 05/20/2011 02:23 AM, Avi Kivity wrote: >> On 05/19/2011 11:43 PM, Anthony Liguori wrote: >>> On 05/19/2011 09:12 AM, Avi Kivity wrote: >>>> The memory API separates the attributes of a memory region (its size, how >>>> reads or writes are handled, dirty logging, and coalescing) from where it >>>> is mapped and whether it is enabled. This allows a device to configure >>>> a memory region once, then hand it off to its parent bus to map it according >>>> to the bus configuration. >>>> >>>> Hierarchical registration also allows a device to compose a region out of >>>> a number of sub-regions with different properties; for example some may be >>>> RAM while others may be MMIO. >>>> >>>> + struct { >>>> + /* If nonzero, specify bounds on access sizes beyond which a machine >>>> + * check is thrown. >>>> + */ >>>> + unsigned min_access_size; >>>> + unsigned max_access_size; >>>> + /* If true, unaligned accesses are supported. Otherwise unaligned >>>> + * accesses throw machine checks. >>>> + */ >>>> + bool unaligned; >>>> + } valid; >>> >>> Under what circumstances would this be used? >>> >>> The behavior of devices that receive non-natural accesses varies wildly. >>> >>> For PCI devices, invalid accesses almost always return ~0. I can't think of a device where an MCE would occur. >> >> This was requested by Richard, so I'll let him comment. >> > > Several alpha system chips MCE when accessed with incorrect sizes. > E.g. only 64-bit accesses are allowed. But is this a characteristic of devices or is this a characteristic of the chipset/CPU? At any rate, I'm fairly sure it doesn't belong in the MemoryRegion structure. Regards, Anthony Liguori
On 05/20/2011 07:31 AM, Anthony Liguori wrote:
> But is this a characteristic of devices or is this a characteristic of the chipset/CPU?
Chipset.
r~
On 05/20/2011 09:40 AM, Richard Henderson wrote: > On 05/20/2011 07:31 AM, Anthony Liguori wrote: >> But is this a characteristic of devices or is this a characteristic of the chipset/CPU? > > Chipset. So if the chipset only allows accesses that are 64-bit, then you'll want to have hierarchical dispatch filter non 64-bit accesses and raise an MCE appropriately. So you don't need anything in MemoryRegion, you need code in the dispatch path. Regards, Anthony Liguori > > r~ >
On Thu, May 19, 2011 at 5:12 PM, Avi Kivity <avi@redhat.com> wrote: > The memory API separates the attributes of a memory region (its size, how > reads or writes are handled, dirty logging, and coalescing) from where it > is mapped and whether it is enabled. This allows a device to configure > a memory region once, then hand it off to its parent bus to map it according > to the bus configuration. > > Hierarchical registration also allows a device to compose a region out of > a number of sub-regions with different properties; for example some may be > RAM while others may be MMIO. > > Signed-off-by: Avi Kivity <avi@redhat.com> > --- > memory.h | 142 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 142 insertions(+), 0 deletions(-) > create mode 100644 memory.h > > diff --git a/memory.h b/memory.h > new file mode 100644 > index 0000000..77c5951 > --- /dev/null > +++ b/memory.h > @@ -0,0 +1,142 @@ > +#ifndef MEMORY_H > +#define MEMORY_H > + > +#include <stdint.h> > +#include <stdbool.h> > +#include "qemu-common.h" > +#include "cpu-common.h" > +#include "targphys.h" > +#include "qemu-queue.h" > + > +typedef struct MemoryRegionOps MemoryRegionOps; > +typedef struct MemoryRegion MemoryRegion; > + > +/* > + * Memory region callbacks > + */ > +struct MemoryRegionOps { > + /* Read from the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + uint64_t (*read)(MemoryRegion *mr, > + target_phys_addr_t addr, > + unsigned size); > + /* Write to the memory region. @addr is relative to @mr; @size is > + * in bytes. */ > + void (*write)(MemoryRegion *mr, > + target_phys_addr_t addr, > + uint64_t data, > + unsigned size); > + /* Guest-visible constraints: */ > + struct { > + /* If nonzero, specify bounds on access sizes beyond which a machine > + * check is thrown. > + */ > + unsigned min_access_size; > + unsigned max_access_size; > + /* If true, unaligned accesses are supported. Otherwise unaligned > + * accesses throw machine checks. > + */ > + bool unaligned; > + } valid; > + /* Internal implementation constraints: */ > + struct { > + /* If nonzero, specifies the minimum size implemented. Smaller sizes > + * will be rounded upwards and a partial result will be returned. > + */ > + unsigned min_access_size; > + /* If nonzero, specifies the maximum size implemented. Larger sizes > + * will be done as a series of accesses with smaller sizes. > + */ > + unsigned max_access_size; > + /* If true, unaligned accesses are supported. Otherwise all accesses > + * are converted to (possibly multiple) naturally aligned accesses. > + */ > + bool unaligned; > + } impl; > +}; > + > +typedef struct CoalescedMemoryRange CoalescedMemoryRange; > + > +struct CoalescedMemoryRange { > + target_phys_addr_t start; > + target_phys_addr_t size; > + QTAILQ_ENTRY(coalesced_ranges) link; > +}; > + > +struct MemoryRegion { > + /* All fields are private - violators will be prosecuted */ > + const MemoryRegionOps *ops; > + MemoryRegion *parent; > + target_phys_addr_t size; > + target_phys_addr_t addr; > + ram_addr_t ram_addr; > + unsigned priority; > + bool may_overlap; > + QTAILQ_HEAD(subregions, MemoryRegion) subregions; > + QTAILQ_ENTRY(subregions) subregions_link; > + QTAILQ_HEAD(coalesced_ranges, CoalescedMemoryRange) coalesced; > +}; > + > +/* Initialize a memory region > + * > + * The region typically acts as a container for other memory regions. > + */ > +void memory_region_init(MemoryRegion *mr, > + target_phys_addr_t size); > +/* Initialize an I/O memory region. Accesses into the region will be > + * cause the callbacks in @ops to be called. > + * > + * if @size is nonzero, subregions will be clipped to @size. > + */ > +void memory_region_init_io(MemoryRegion *mr, > + const MemoryRegionOps *ops, > + target_phys_addr_t size); > +/* Initialize an I/O memory region. Accesses into the region will be > + * modify memory directly. > + */ > +void memory_region_init_ram(MemoryRegion *mr, > + target_phys_addr_t size); > +/* Initialize a RAM memory region. Accesses into the region will be > + * modify memory in @ptr directly. > + */ > +void memory_region_init_ram_ptr(MemoryRegion *mr, > + target_phys_addr_t size, > + void *ptr); > +/* Destroy a memory region. The memory becomes inaccessible. */ > +void memory_region_destroy(MemoryRegion *mr); Doesn't the lower priority region become accessible instead in some cases? > +/* Sets an offset to be added to MemoryRegionOps callbacks. */ > +void memory_region_set_offset(MemoryRegion *mr, target_phys_addr_t offset); > +/* Turn loggging on or off for specified client (display, migration) */ g-- > +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client); > +/* Enable memory coalescing for the region. MMIO ->write callbacks may be > + * delayed until a non-coalesced MMIO is issued. > + */ > +void memory_region_set_coalescing(MemoryRegion *mr); > +/* Enable memory coalescing for a sub-range of the region. MMIO ->write > + * callbacks may be delayed until a non-coalesced MMIO is issued. > + */ > +void memory_region_add_coalescing(MemoryRegion *mr, > + target_phys_addr_t offset, > + target_phys_addr_t size); > +/* Disable MMIO coalescing for the region. */ > +void memory_region_clear_coalescing(MemoryRegion *mr); Perhaps the interface could be more generic, like +void memory_region_set_property(MemoryRegion *mr, unsigned flags); +void memory_region_clear_property(MemoryRegion *mr, unsigned flags); > + > +/* Add a sub-region at @offset. The sub-region may not overlap with other > + * subregions (except for those explicitly marked as overlapping) > + */ > +void memory_region_add_subregion(MemoryRegion *mr, > + target_phys_addr_t offset, > + MemoryRegion *subregion); > +/* Add a sub-region at @offset. The sun-region may overlap other subregions; Sunny regions? > + * conflicts are resolved by having a higher @priority hide a lower @priority. > + * Subregions without priority are taken as @priority 0. > + */ > +void memory_region_add_subregion_overlap(MemoryRegion *mr, > + target_phys_addr_t offset, > + MemoryRegion *subregion, > + unsigned priority); > +/* Remove a subregion. */ > +void memory_region_del_subregion(MemoryRegion *mr, > + MemoryRegion *subregion); What would the subregions be used for?
On Fri, May 20, 2011 at 5:46 PM, Anthony Liguori <anthony@codemonkey.ws> wrote: > On 05/20/2011 09:40 AM, Richard Henderson wrote: >> >> On 05/20/2011 07:31 AM, Anthony Liguori wrote: >>> >>> But is this a characteristic of devices or is this a characteristic of >>> the chipset/CPU? >> >> Chipset. > > So if the chipset only allows accesses that are 64-bit, then you'll want to > have hierarchical dispatch filter non 64-bit accesses and raise an MCE > appropriately. > > So you don't need anything in MemoryRegion, you need code in the dispatch > path. Sparc (32/64) systems are also very picky about wrong sized accesses. I think the buses have lines for access size and the device can (and they will) signal an error in these cases. Then the bus controller will raise an NMI. I think the easiest way to handle this could be to use overlapping registrations for specific sizes. Then we could make a default error generator device, which would just signal NMI/MCE on any access. It would register for all of the picky area with lowest possible priority. Other devices would register the small working access areas with no knowledge about this error generator device. Any correct access should go to other devices, bad accesses to the error generator. Though this would not be very different from current unassigned access handling.
On 05/20/2011 05:31 PM, Anthony Liguori wrote: >> Several alpha system chips MCE when accessed with incorrect sizes. >> E.g. only 64-bit accesses are allowed. > > > But is this a characteristic of devices or is this a characteristic of > the chipset/CPU? The chipset is modelled by a MemoryRegion too. > > At any rate, I'm fairly sure it doesn't belong in the MemoryRegion > structure. > Since it isn't a global property, where does it belong?
On 05/20/2011 05:46 PM, Anthony Liguori wrote: > On 05/20/2011 09:40 AM, Richard Henderson wrote: >> On 05/20/2011 07:31 AM, Anthony Liguori wrote: >>> But is this a characteristic of devices or is this a characteristic >>> of the chipset/CPU? >> >> Chipset. > > So if the chipset only allows accesses that are 64-bit, then you'll > want to have hierarchical dispatch filter non 64-bit accesses and > raise an MCE appropriately. > > So you don't need anything in MemoryRegion, you need code in the > dispatch path. MemoryRegion *is* the dispatch path. Only done declaratively so we can flatten it whenever it changes.
On 05/20/2011 09:16 PM, Blue Swirl wrote: > On Fri, May 20, 2011 at 5:46 PM, Anthony Liguori<anthony@codemonkey.ws> wrote: > > On 05/20/2011 09:40 AM, Richard Henderson wrote: > >> > >> On 05/20/2011 07:31 AM, Anthony Liguori wrote: > >>> > >>> But is this a characteristic of devices or is this a characteristic of > >>> the chipset/CPU? > >> > >> Chipset. > > > > So if the chipset only allows accesses that are 64-bit, then you'll want to > > have hierarchical dispatch filter non 64-bit accesses and raise an MCE > > appropriately. > > > > So you don't need anything in MemoryRegion, you need code in the dispatch > > path. > > Sparc (32/64) systems are also very picky about wrong sized accesses. > I think the buses have lines for access size and the device can (and > they will) signal an error in these cases. Then the bus controller > will raise an NMI. > > I think the easiest way to handle this could be to use overlapping > registrations for specific sizes. Then we could make a default error > generator device, which would just signal NMI/MCE on any access. It > would register for all of the picky area with lowest possible > priority. Other devices would register the small working access areas > with no knowledge about this error generator device. Any correct > access should go to other devices, bad accesses to the error > generator. > > Though this would not be very different from current unassigned access handling. The MemoryRegion way of doing it would be to register the MemoryRegion of the bus with the picky attributes. Any sub-regions will inherit the property.
On 05/20/2011 08:59 PM, Blue Swirl wrote: > On Thu, May 19, 2011 at 5:12 PM, Avi Kivity<avi@redhat.com> wrote: > > The memory API separates the attributes of a memory region (its size, how > > reads or writes are handled, dirty logging, and coalescing) from where it > > is mapped and whether it is enabled. This allows a device to configure > > a memory region once, then hand it off to its parent bus to map it according > > to the bus configuration. > > > > Hierarchical registration also allows a device to compose a region out of > > a number of sub-regions with different properties; for example some may be > > RAM while others may be MMIO. > > +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client); > > +/* Enable memory coalescing for the region. MMIO ->write callbacks may be > > + * delayed until a non-coalesced MMIO is issued. > > + */ > > +void memory_region_set_coalescing(MemoryRegion *mr); > > +/* Enable memory coalescing for a sub-range of the region. MMIO ->write > > + * callbacks may be delayed until a non-coalesced MMIO is issued. > > + */ > > +void memory_region_add_coalescing(MemoryRegion *mr, > > + target_phys_addr_t offset, > > + target_phys_addr_t size); > > +/* Disable MMIO coalescing for the region. */ > > +void memory_region_clear_coalescing(MemoryRegion *mr); > > Perhaps the interface could be more generic, like > +void memory_region_set_property(MemoryRegion *mr, unsigned flags); > +void memory_region_clear_property(MemoryRegion *mr, unsigned flags); > Coalescing is a complex property, not just a boolean attribute. We probably will have a number of boolean attributes later, though. > > + * conflicts are resolved by having a higher @priority hide a lower @priority. > > + * Subregions without priority are taken as @priority 0. > > + */ > > +void memory_region_add_subregion_overlap(MemoryRegion *mr, > > + target_phys_addr_t offset, > > + MemoryRegion *subregion, > > + unsigned priority); > > +/* Remove a subregion. */ > > +void memory_region_del_subregion(MemoryRegion *mr, > > + MemoryRegion *subregion); > > What would the subregions be used for? Subregions describe the flow of data through the memory bus. We'd have a subregion for the PCI bus, with its own subregions for various BARs, with some having subregions for dispatching different MMIO types within the BAR. This allows, for example, the PCI layer to move a BAR without the PCI device knowing anything about it.
On 05/20/2011 05:06 PM, Richard Henderson wrote: > Is this structure honestly any better than 4 function pointers? > I can't see that it is, myself. > That was requested by Anthony. And in fact we have two bits of information per access size, one is whether the access is allowed or not, the other is whether we want to use a larger or smaller access size function (useful for auto-splitting a 64-bit access into two 32-bit accesses for example).
On 05/20/2011 08:59 PM, Blue Swirl wrote: > On Thu, May 19, 2011 at 5:12 PM, Avi Kivity<avi@redhat.com> wrote: > > The memory API separates the attributes of a memory region (its size, how > > reads or writes are handled, dirty logging, and coalescing) from where it > > is mapped and whether it is enabled. This allows a device to configure > > a memory region once, then hand it off to its parent bus to map it according > > to the bus configuration. > > > > Hierarchical registration also allows a device to compose a region out of > > a number of sub-regions with different properties; for example some may be > > RAM while others may be MMIO. > > > > +/* Destroy a memory region. The memory becomes inaccessible. */ > > +void memory_region_destroy(MemoryRegion *mr); > > Doesn't the lower priority region become accessible instead in some cases? If we require that _add_subregion() and _del_subregion() be paired, then no.
On Sun, May 22, 2011 at 9:45 AM, Avi Kivity <avi@redhat.com> wrote: > On 05/20/2011 08:59 PM, Blue Swirl wrote: >> >> On Thu, May 19, 2011 at 5:12 PM, Avi Kivity<avi@redhat.com> wrote: >> > The memory API separates the attributes of a memory region (its size, >> > how >> > reads or writes are handled, dirty logging, and coalescing) from where >> > it >> > is mapped and whether it is enabled. This allows a device to configure >> > a memory region once, then hand it off to its parent bus to map it >> > according >> > to the bus configuration. >> > >> > Hierarchical registration also allows a device to compose a region out >> > of >> > a number of sub-regions with different properties; for example some may >> > be >> > RAM while others may be MMIO. > >> > +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned >> > client); >> > +/* Enable memory coalescing for the region. MMIO ->write callbacks >> > may be >> > + * delayed until a non-coalesced MMIO is issued. >> > + */ >> > +void memory_region_set_coalescing(MemoryRegion *mr); >> > +/* Enable memory coalescing for a sub-range of the region. MMIO >> > ->write >> > + * callbacks may be delayed until a non-coalesced MMIO is issued. >> > + */ >> > +void memory_region_add_coalescing(MemoryRegion *mr, >> > + target_phys_addr_t offset, >> > + target_phys_addr_t size); >> > +/* Disable MMIO coalescing for the region. */ >> > +void memory_region_clear_coalescing(MemoryRegion *mr); >> >> Perhaps the interface could be more generic, like >> +void memory_region_set_property(MemoryRegion *mr, unsigned flags); >> +void memory_region_clear_property(MemoryRegion *mr, unsigned flags); >> > > Coalescing is a complex property, not just a boolean attribute. We probably > will have a number of boolean attributes later, though. But what is the difference between adding coalescing to an area and setting the bit property 'coalescing' to an area? At least what you propose now is not so complex that it couldn't be handled as a single bit. >> > + * conflicts are resolved by having a higher @priority hide a lower >> > @priority. >> > + * Subregions without priority are taken as @priority 0. >> > + */ >> > +void memory_region_add_subregion_overlap(MemoryRegion *mr, >> > + target_phys_addr_t offset, >> > + MemoryRegion *subregion, >> > + unsigned priority); >> > +/* Remove a subregion. */ >> > +void memory_region_del_subregion(MemoryRegion *mr, >> > + MemoryRegion *subregion); >> >> What would the subregions be used for? > > Subregions describe the flow of data through the memory bus. We'd have a > subregion for the PCI bus, with its own subregions for various BARs, with > some having subregions for dispatching different MMIO types within the BAR. > > This allows, for example, the PCI layer to move a BAR without the PCI device > knowing anything about it. But why can't a first class region be used for that?
On 05/22/2011 12:32 PM, Blue Swirl wrote: > >> > +void memory_region_add_coalescing(MemoryRegion *mr, > >> > + target_phys_addr_t offset, > >> > + target_phys_addr_t size); > >> > +/* Disable MMIO coalescing for the region. */ > >> > +void memory_region_clear_coalescing(MemoryRegion *mr); > >> > >> Perhaps the interface could be more generic, like > >> +void memory_region_set_property(MemoryRegion *mr, unsigned flags); > >> +void memory_region_clear_property(MemoryRegion *mr, unsigned flags); > >> > > > > Coalescing is a complex property, not just a boolean attribute. We probably > > will have a number of boolean attributes later, though. > > But what is the difference between adding coalescing to an area and > setting the bit property 'coalescing' to an area? At least what you > propose now is not so complex that it couldn't be handled as a single > bit. Look at the API - add_coalescing() sets the coalescing property on a subrange of the memory region, not the entire region. (motivation - hw/e1000.c). > >> > + * conflicts are resolved by having a higher @priority hide a lower > >> > @priority. > >> > + * Subregions without priority are taken as @priority 0. > >> > + */ > >> > +void memory_region_add_subregion_overlap(MemoryRegion *mr, > >> > + target_phys_addr_t offset, > >> > + MemoryRegion *subregion, > >> > + unsigned priority); > >> > +/* Remove a subregion. */ > >> > +void memory_region_del_subregion(MemoryRegion *mr, > >> > + MemoryRegion *subregion); > >> > >> What would the subregions be used for? > > > > Subregions describe the flow of data through the memory bus. We'd have a > > subregion for the PCI bus, with its own subregions for various BARs, with > > some having subregions for dispatching different MMIO types within the BAR. > > > > This allows, for example, the PCI layer to move a BAR without the PCI device > > knowing anything about it. > > But why can't a first class region be used for that? Subregions are first-class regions. In fact all regions are subregions except the root. It's a tree of regions, each level adding an offset, clipping, and perhaps other attributes, with the leaves providing actual memory (mmio or RAM).
On Sun, May 22, 2011 at 2:36 PM, Avi Kivity <avi@redhat.com> wrote: > On 05/22/2011 12:32 PM, Blue Swirl wrote: >> >> >> > +void memory_region_add_coalescing(MemoryRegion *mr, >> >> > + target_phys_addr_t offset, >> >> > + target_phys_addr_t size); >> >> > +/* Disable MMIO coalescing for the region. */ >> >> > +void memory_region_clear_coalescing(MemoryRegion *mr); >> >> >> >> Perhaps the interface could be more generic, like >> >> +void memory_region_set_property(MemoryRegion *mr, unsigned flags); >> >> +void memory_region_clear_property(MemoryRegion *mr, unsigned flags); >> >> >> > >> > Coalescing is a complex property, not just a boolean attribute. We >> > probably >> > will have a number of boolean attributes later, though. >> >> But what is the difference between adding coalescing to an area and >> setting the bit property 'coalescing' to an area? At least what you >> propose now is not so complex that it couldn't be handled as a single >> bit. > > Look at the API - add_coalescing() sets the coalescing property on a > subrange of the memory region, not the entire region. Right, but doesn't the same apply to any other properties, they may apply to a full range or just a subrange? > (motivation - hw/e1000.c). > >> >> > + * conflicts are resolved by having a higher @priority hide a >> >> lower >> >> > @priority. >> >> > + * Subregions without priority are taken as @priority 0. >> >> > + */ >> >> > +void memory_region_add_subregion_overlap(MemoryRegion *mr, >> >> > + target_phys_addr_t >> >> offset, >> >> > + MemoryRegion >> >> *subregion, >> >> > + unsigned priority); >> >> > +/* Remove a subregion. */ >> >> > +void memory_region_del_subregion(MemoryRegion *mr, >> >> > + MemoryRegion *subregion); >> >> >> >> What would the subregions be used for? >> > >> > Subregions describe the flow of data through the memory bus. We'd have >> > a >> > subregion for the PCI bus, with its own subregions for various BARs, >> > with >> > some having subregions for dispatching different MMIO types within the >> > BAR. >> > >> > This allows, for example, the PCI layer to move a BAR without the PCI >> > device >> > knowing anything about it. >> >> But why can't a first class region be used for that? > > Subregions are first-class regions. In fact all regions are subregions > except the root. Oh, I see now. Maybe the comments should describe this. Or perhaps the terms should be something like 'bus/bridge/root' and 'region' instead of 'region' and 'subregion'? > It's a tree of regions, each level adding an offset, clipping, and perhaps > other attributes, with the leaves providing actual memory (mmio or RAM). I thought that there are two classes of regions, like PCI device vs. a single BAR.
On 05/22/2011 03:06 PM, Blue Swirl wrote: > On Sun, May 22, 2011 at 2:36 PM, Avi Kivity<avi@redhat.com> wrote: > > On 05/22/2011 12:32 PM, Blue Swirl wrote: > >> > >> >> > +void memory_region_add_coalescing(MemoryRegion *mr, > >> >> > + target_phys_addr_t offset, > >> >> > + target_phys_addr_t size); > >> >> > +/* Disable MMIO coalescing for the region. */ > >> >> > +void memory_region_clear_coalescing(MemoryRegion *mr); > >> >> > >> >> Perhaps the interface could be more generic, like > >> >> +void memory_region_set_property(MemoryRegion *mr, unsigned flags); > >> >> +void memory_region_clear_property(MemoryRegion *mr, unsigned flags); > >> >> > >> > > >> > Coalescing is a complex property, not just a boolean attribute. We > >> > probably > >> > will have a number of boolean attributes later, though. > >> > >> But what is the difference between adding coalescing to an area and > >> setting the bit property 'coalescing' to an area? At least what you > >> propose now is not so complex that it couldn't be handled as a single > >> bit. > > > > Look at the API - add_coalescing() sets the coalescing property on a > > subrange of the memory region, not the entire region. > > Right, but doesn't the same apply to any other properties, they may > apply to a full range or just a subrange? We'll know when we have more properties. I expect most will be region-wide. > > > > Subregions are first-class regions. In fact all regions are subregions > > except the root. > > Oh, I see now. Maybe the comments should describe this. Or perhaps the > terms should be something like 'bus/bridge/root' and 'region' instead > of 'region' and 'subregion'? Problem is, memory_region_add_subregion() adds both sub-bridges and leaf regions. It's quite possible that BAR 0 will be a leaf region, and BAR 1 will be a sub-bridge. Can you suggest an alternative naming for the API? > > It's a tree of regions, each level adding an offset, clipping, and perhaps > > other attributes, with the leaves providing actual memory (mmio or RAM). > > I thought that there are two classes of regions, like PCI device vs. a > single BAR. It's true in a way, except the mapping is not 1:1.
On Sun, May 22, 2011 at 3:18 PM, Avi Kivity <avi@redhat.com> wrote: > On 05/22/2011 03:06 PM, Blue Swirl wrote: >> >> On Sun, May 22, 2011 at 2:36 PM, Avi Kivity<avi@redhat.com> wrote: >> > On 05/22/2011 12:32 PM, Blue Swirl wrote: >> >> >> >> >> > +void memory_region_add_coalescing(MemoryRegion *mr, >> >> >> > + target_phys_addr_t >> >> offset, >> >> >> > + target_phys_addr_t >> >> size); >> >> >> > +/* Disable MMIO coalescing for the region. */ >> >> >> > +void memory_region_clear_coalescing(MemoryRegion *mr); >> >> >> >> >> >> Perhaps the interface could be more generic, like >> >> >> +void memory_region_set_property(MemoryRegion *mr, unsigned >> >> flags); >> >> >> +void memory_region_clear_property(MemoryRegion *mr, unsigned >> >> flags); >> >> >> >> >> > >> >> > Coalescing is a complex property, not just a boolean attribute. >> >> We >> >> > probably >> >> > will have a number of boolean attributes later, though. >> >> >> >> But what is the difference between adding coalescing to an area and >> >> setting the bit property 'coalescing' to an area? At least what you >> >> propose now is not so complex that it couldn't be handled as a single >> >> bit. >> > >> > Look at the API - add_coalescing() sets the coalescing property on a >> > subrange of the memory region, not the entire region. >> >> Right, but doesn't the same apply to any other properties, they may >> apply to a full range or just a subrange? > > We'll know when we have more properties. I expect most will be region-wide. Since we don't know about those yet, coalescing API could be like you propose. Later it can be changed to the property API, or leave for convenience. >> > >> > Subregions are first-class regions. In fact all regions are subregions >> > except the root. >> >> Oh, I see now. Maybe the comments should describe this. Or perhaps the >> terms should be something like 'bus/bridge/root' and 'region' instead >> of 'region' and 'subregion'? > > Problem is, memory_region_add_subregion() adds both sub-bridges and leaf > regions. > > It's quite possible that BAR 0 will be a leaf region, and BAR 1 will be a > sub-bridge. > > Can you suggest an alternative naming for the API? How about memory_region_container_init() memory_region_add()
On 05/22/2011 06:32 PM, Blue Swirl wrote: > > > > Can you suggest an alternative naming for the API? > > How about > memory_region_container_init() > memory_region_add() I'm neutral. If someone seconds this, I'll make it so.
On 05/22/2011 01:38 AM, Avi Kivity wrote: > On 05/20/2011 05:31 PM, Anthony Liguori wrote: >>> Several alpha system chips MCE when accessed with incorrect sizes. >>> E.g. only 64-bit accesses are allowed. >> >> >> But is this a characteristic of devices or is this a characteristic of >> the chipset/CPU? > > The chipset is modelled by a MemoryRegion too. > >> >> At any rate, I'm fairly sure it doesn't belong in the MemoryRegion >> structure. >> > > Since it isn't a global property, where does it belong? The chipset should have an intercept in the dispatch path that enforces this (this assumes hierarchical dispatch). Regards, Anthony Liguori >
On 05/22/2011 01:39 AM, Avi Kivity wrote: > On 05/20/2011 05:46 PM, Anthony Liguori wrote: >> On 05/20/2011 09:40 AM, Richard Henderson wrote: >>> On 05/20/2011 07:31 AM, Anthony Liguori wrote: >>>> But is this a characteristic of devices or is this a characteristic >>>> of the chipset/CPU? >>> >>> Chipset. >> >> So if the chipset only allows accesses that are 64-bit, then you'll >> want to have hierarchical dispatch filter non 64-bit accesses and >> raise an MCE appropriately. >> >> So you don't need anything in MemoryRegion, you need code in the >> dispatch path. > > MemoryRegion *is* the dispatch path. Only done declaratively so we can > flatten it whenever it changes. We don't want dispatch to be 100% declarative. That's what will cause the API to get horrendously ugly. An example is PCI-bus level endianness conversion. I also believe the Sparc IOMMU has an xor engine. You could add a 'bool swap_endian' and an 'uint32_t xor_mask' in MemoryRegion but now you're adding a ton of platform specific knowledge to what should be an independent layer. Regards, Anthony Liguori >
On 05/22/2011 06:46 PM, Anthony Liguori wrote: >> MemoryRegion *is* the dispatch path. Only done declaratively so we can >> flatten it whenever it changes. > > > We don't want dispatch to be 100% declarative. That's what will cause > the API to get horrendously ugly. > > An example is PCI-bus level endianness conversion. I also believe the > Sparc IOMMU has an xor engine. > > You could add a 'bool swap_endian' and an 'uint32_t xor_mask' in > MemoryRegion but now you're adding a ton of platform specific > knowledge to what should be an independent layer. > Currently containers do not use the read/write function pointers. We could make them (or perhaps others) act as transformations on the data as it passes. So it's still declarative (the entire flow is known at registration time) but it doesn't embed platform magic. Byteswap is sufficiently generic to add as a region property, IMO. btw, wrt iommu emulation, the API finally allows us to determine the path between any two devices, so we can apply the right iommu transformations.
On 05/22/2011 06:44 PM, Anthony Liguori wrote: > >>> >>> At any rate, I'm fairly sure it doesn't belong in the MemoryRegion >>> structure. >>> >> >> Since it isn't a global property, where does it belong? > > > The chipset should have an intercept in the dispatch path that > enforces this (this assumes hierarchical dispatch). So instead of region->ops->valid.*, region->ops->intercept()? btw, that still doesn't require hierarchical dispatch. If intercepts only check if the access is valid, it can still be flattened. Hierarchical dispatch means that chipset callbacks get to choose which subregion callbacks are called, which isn't the case here. If it were, it would be impossible to figure out the kvm slot layout.
diff --git a/memory.h b/memory.h new file mode 100644 index 0000000..77c5951 --- /dev/null +++ b/memory.h @@ -0,0 +1,142 @@ +#ifndef MEMORY_H +#define MEMORY_H + +#include <stdint.h> +#include <stdbool.h> +#include "qemu-common.h" +#include "cpu-common.h" +#include "targphys.h" +#include "qemu-queue.h" + +typedef struct MemoryRegionOps MemoryRegionOps; +typedef struct MemoryRegion MemoryRegion; + +/* + * Memory region callbacks + */ +struct MemoryRegionOps { + /* Read from the memory region. @addr is relative to @mr; @size is + * in bytes. */ + uint64_t (*read)(MemoryRegion *mr, + target_phys_addr_t addr, + unsigned size); + /* Write to the memory region. @addr is relative to @mr; @size is + * in bytes. */ + void (*write)(MemoryRegion *mr, + target_phys_addr_t addr, + uint64_t data, + unsigned size); + /* Guest-visible constraints: */ + struct { + /* If nonzero, specify bounds on access sizes beyond which a machine + * check is thrown. + */ + unsigned min_access_size; + unsigned max_access_size; + /* If true, unaligned accesses are supported. Otherwise unaligned + * accesses throw machine checks. + */ + bool unaligned; + } valid; + /* Internal implementation constraints: */ + struct { + /* If nonzero, specifies the minimum size implemented. Smaller sizes + * will be rounded upwards and a partial result will be returned. + */ + unsigned min_access_size; + /* If nonzero, specifies the maximum size implemented. Larger sizes + * will be done as a series of accesses with smaller sizes. + */ + unsigned max_access_size; + /* If true, unaligned accesses are supported. Otherwise all accesses + * are converted to (possibly multiple) naturally aligned accesses. + */ + bool unaligned; + } impl; +}; + +typedef struct CoalescedMemoryRange CoalescedMemoryRange; + +struct CoalescedMemoryRange { + target_phys_addr_t start; + target_phys_addr_t size; + QTAILQ_ENTRY(coalesced_ranges) link; +}; + +struct MemoryRegion { + /* All fields are private - violators will be prosecuted */ + const MemoryRegionOps *ops; + MemoryRegion *parent; + target_phys_addr_t size; + target_phys_addr_t addr; + ram_addr_t ram_addr; + unsigned priority; + bool may_overlap; + QTAILQ_HEAD(subregions, MemoryRegion) subregions; + QTAILQ_ENTRY(subregions) subregions_link; + QTAILQ_HEAD(coalesced_ranges, CoalescedMemoryRange) coalesced; +}; + +/* Initialize a memory region + * + * The region typically acts as a container for other memory regions. + */ +void memory_region_init(MemoryRegion *mr, + target_phys_addr_t size); +/* Initialize an I/O memory region. Accesses into the region will be + * cause the callbacks in @ops to be called. + * + * if @size is nonzero, subregions will be clipped to @size. + */ +void memory_region_init_io(MemoryRegion *mr, + const MemoryRegionOps *ops, + target_phys_addr_t size); +/* Initialize an I/O memory region. Accesses into the region will be + * modify memory directly. + */ +void memory_region_init_ram(MemoryRegion *mr, + target_phys_addr_t size); +/* Initialize a RAM memory region. Accesses into the region will be + * modify memory in @ptr directly. + */ +void memory_region_init_ram_ptr(MemoryRegion *mr, + target_phys_addr_t size, + void *ptr); +/* Destroy a memory region. The memory becomes inaccessible. */ +void memory_region_destroy(MemoryRegion *mr); +/* Sets an offset to be added to MemoryRegionOps callbacks. */ +void memory_region_set_offset(MemoryRegion *mr, target_phys_addr_t offset); +/* Turn loggging on or off for specified client (display, migration) */ +void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client); +/* Enable memory coalescing for the region. MMIO ->write callbacks may be + * delayed until a non-coalesced MMIO is issued. + */ +void memory_region_set_coalescing(MemoryRegion *mr); +/* Enable memory coalescing for a sub-range of the region. MMIO ->write + * callbacks may be delayed until a non-coalesced MMIO is issued. + */ +void memory_region_add_coalescing(MemoryRegion *mr, + target_phys_addr_t offset, + target_phys_addr_t size); +/* Disable MMIO coalescing for the region. */ +void memory_region_clear_coalescing(MemoryRegion *mr); + +/* Add a sub-region at @offset. The sub-region may not overlap with other + * subregions (except for those explicitly marked as overlapping) + */ +void memory_region_add_subregion(MemoryRegion *mr, + target_phys_addr_t offset, + MemoryRegion *subregion); +/* Add a sub-region at @offset. The sun-region may overlap other subregions; + * conflicts are resolved by having a higher @priority hide a lower @priority. + * Subregions without priority are taken as @priority 0. + */ +void memory_region_add_subregion_overlap(MemoryRegion *mr, + target_phys_addr_t offset, + MemoryRegion *subregion, + unsigned priority); +/* Remove a subregion. */ +void memory_region_del_subregion(MemoryRegion *mr, + MemoryRegion *subregion); + +#endif
The memory API separates the attributes of a memory region (its size, how reads or writes are handled, dirty logging, and coalescing) from where it is mapped and whether it is enabled. This allows a device to configure a memory region once, then hand it off to its parent bus to map it according to the bus configuration. Hierarchical registration also allows a device to compose a region out of a number of sub-regions with different properties; for example some may be RAM while others may be MMIO. Signed-off-by: Avi Kivity <avi@redhat.com> --- memory.h | 142 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 142 insertions(+), 0 deletions(-) create mode 100644 memory.h