Message ID | 1290670555-12575-2-git-send-email-agraf@suse.de |
---|---|
State | New |
Headers | show |
Am 25.11.2010 um 08:35 schrieb Alexander Graf: > The way we're currently modeling mmio is too simplified. We assume > that > every device has the same endianness as the target CPU. In reality, > most devices are little endian (all PCI and ISA ones I'm aware of). > Some > are big endian (special system devices) and a very little fraction is > target native endian (fw_cfg). > > So instead of assuming every device to be native endianness, let's > move > to a model where the device tells us which endianness it's in. > > That way we can compile the devices only once and get rid of all the > ugly > swap will be done by the underlying layer. > > For the same of readability, this patch only introduces the helper > framework > but doesn't allow the registering code to set its endianness yet. > > Signed-off-by: Alexander Graf <agraf@suse.de> > --- > cpu-common.h | 4 ++ > exec.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++ > +++++++++ > 2 files changed, 126 insertions(+), 0 deletions(-) > > diff --git a/cpu-common.h b/cpu-common.h > index a543b5d..839b236 100644 > --- a/cpu-common.h > +++ b/cpu-common.h > @@ -20,6 +20,10 @@ > > #if !defined(CONFIG_USER_ONLY) > > +#define DEVICE_NATIVE_ENDIAN 0 > +#define DEVICE_BIG_ENDIAN 1 > +#define DEVICE_LITTLE_ENDIAN 2 I believe some people around here voiced a preference for enums, to aid with gdb debugging? > diff --git a/exec.c b/exec.c > index db9ff55..f54a360 100644 > --- a/exec.c > +++ b/exec.c > @@ -3370,6 +3474,22 @@ static int cpu_register_io_memory_fixed(int > io_index, > } > io_mem_opaque[io_index] = opaque; > > + switch (endian) { > + case DEVICE_BIG_ENDIAN: > +#ifndef TARGET_WORDS_BIGENDIAN > + swapendian_init(io_index); > +#endif > + break; So basically, you just moved the #ifdefs to another place. :) Shouldn't this be dependent on the CPU state and determined at runtime? Thinking of MSR LE bit on ppc. I guess QEMU doesn't support bi-endian ppc today, as does the 970, but it would be nice to keep it in mind. Andreas
On 27.11.2010, at 00:00, Andreas Färber wrote: > Am 25.11.2010 um 08:35 schrieb Alexander Graf: > >> The way we're currently modeling mmio is too simplified. We assume that >> every device has the same endianness as the target CPU. In reality, >> most devices are little endian (all PCI and ISA ones I'm aware of). Some >> are big endian (special system devices) and a very little fraction is >> target native endian (fw_cfg). >> >> So instead of assuming every device to be native endianness, let's move >> to a model where the device tells us which endianness it's in. >> >> That way we can compile the devices only once and get rid of all the ugly >> swap will be done by the underlying layer. >> >> For the same of readability, this patch only introduces the helper framework >> but doesn't allow the registering code to set its endianness yet. >> >> Signed-off-by: Alexander Graf <agraf@suse.de> > >> --- >> cpu-common.h | 4 ++ >> exec.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 126 insertions(+), 0 deletions(-) >> >> diff --git a/cpu-common.h b/cpu-common.h >> index a543b5d..839b236 100644 >> --- a/cpu-common.h >> +++ b/cpu-common.h >> @@ -20,6 +20,10 @@ >> >> #if !defined(CONFIG_USER_ONLY) >> >> +#define DEVICE_NATIVE_ENDIAN 0 >> +#define DEVICE_BIG_ENDIAN 1 >> +#define DEVICE_LITTLE_ENDIAN 2 > > I believe some people around here voiced a preference for enums, to aid with gdb debugging? Yeah, I actually like enums better too :). Good point! > >> diff --git a/exec.c b/exec.c >> index db9ff55..f54a360 100644 >> --- a/exec.c >> +++ b/exec.c > >> @@ -3370,6 +3474,22 @@ static int cpu_register_io_memory_fixed(int io_index, >> } >> io_mem_opaque[io_index] = opaque; >> >> + switch (endian) { >> + case DEVICE_BIG_ENDIAN: >> +#ifndef TARGET_WORDS_BIGENDIAN >> + swapendian_init(io_index); >> +#endif >> + break; > > So basically, you just moved the #ifdefs to another place. :) Shouldn't this be dependent on the CPU state and determined at runtime? Thinking of MSR LE bit on ppc. I guess QEMU doesn't support bi-endian ppc today, as does the 970, but it would be nice to keep it in mind. MSR_LE is the only case where runtime determination really makes sense. I'm not sure how that should be handled in the end. MSR_LE swizzles every memory access. Maybe it makes sense to just swap things inside of TCG code there. Either way, I don't want to add any performance penalty to LE-on-LE (the default use case) and runtime checks on every MMIO are just too heavy :(. Alex
On 26 November 2010 23:00, Andreas Färber <andreas.faerber@web.de> wrote: > Am 25.11.2010 um 08:35 schrieb Alexander Graf: >> + switch (endian) { >> + case DEVICE_BIG_ENDIAN: >> +#ifndef TARGET_WORDS_BIGENDIAN >> + swapendian_init(io_index); >> +#endif >> + break; > > So basically, you just moved the #ifdefs to another place. :) Shouldn't this > be dependent on the CPU state and determined at runtime? Thinking of MSR LE > bit on ppc. I guess QEMU doesn't support bi-endian ppc today, as does the > 970, but it would be nice to keep it in mind. I was wondering about that (ARM actually has two different 'big endian' modes, one of which is byte-invariant and one of which is word-invariant, and which are runtime selectable on at least some cores...) But I think I agree with Paul Brook that endianness as we're considering it here is really kind of a bus level property, and TARGET_WORDS_BIGENDIAN is defining a sort of "system bus endianness". Runtime flippable cores do the swapping in generated code before they get out to this bit, which corresponds to how the hardware does it (things get swapped before they get out to the signals external to the core). That does mean that in some cases you end up doing a double byteswap, but my guess would be that the slowdown on doing that for device access is completely buried by the slowdown on having to do that for RAM access, so it's not worth worrying about. -- PMM
> > diff --git a/exec.c b/exec.c > > index db9ff55..f54a360 100644 > > --- a/exec.c > > +++ b/exec.c > > > > @@ -3370,6 +3474,22 @@ static int cpu_register_io_memory_fixed(int > > io_index, > > > > } > > io_mem_opaque[io_index] = opaque; > > > > + switch (endian) { > > + case DEVICE_BIG_ENDIAN: > > +#ifndef TARGET_WORDS_BIGENDIAN > > + swapendian_init(io_index); > > +#endif > > + break; > > So basically, you just moved the #ifdefs to another place. :) Many #ifdefs inconsistently scattered through all the device code have been replaced by a single #ifdef. > Shouldn't this be dependent on the CPU state and determined at > runtime? Thinking of MSR LE bit on ppc. I guess QEMU doesn't support > bi-endian ppc today, as does the 970, but it would be nice to keep it > in mind. Switching endianness of a CPU generally does not effect the endianness of the CPU/peripheral busses. It makes the CPU byteswap accesses before they are seen by either memory or devices. In theory it might be possible to avoid redundant byteswaps if you're really clever. In practice you still have to handle the fact that your devices are a different endianness to RAM, so it probably doesn't gain you a whole lot. Paul
On Fri, Nov 26, 2010 at 11:47 PM, Paul Brook <paul@codesourcery.com> wrote: >> > diff --git a/exec.c b/exec.c >> > index db9ff55..f54a360 100644 >> > --- a/exec.c >> > +++ b/exec.c >> > >> > @@ -3370,6 +3474,22 @@ static int cpu_register_io_memory_fixed(int >> > io_index, >> > >> > } >> > io_mem_opaque[io_index] = opaque; >> > >> > + switch (endian) { >> > + case DEVICE_BIG_ENDIAN: >> > +#ifndef TARGET_WORDS_BIGENDIAN >> > + swapendian_init(io_index); >> > +#endif >> > + break; >> >> So basically, you just moved the #ifdefs to another place. :) > > Many #ifdefs inconsistently scattered through all the device code have been > replaced by a single #ifdef. > >> Shouldn't this be dependent on the CPU state and determined at >> runtime? Thinking of MSR LE bit on ppc. I guess QEMU doesn't support >> bi-endian ppc today, as does the 970, but it would be nice to keep it >> in mind. > > Switching endianness of a CPU generally does not effect the endianness of the > CPU/peripheral busses. It makes the CPU byteswap accesses before they are > seen by either memory or devices. > > In theory it might be possible to avoid redundant byteswaps if you're really > clever. In practice you still have to handle the fact that your devices are a > different endianness to RAM, so it probably doesn't gain you a whole lot. Sparc64 MMU can also perform byte swapping, there is also a byte swapping CPU mode and byte swapping access instructions. I think only the instructions are used (for PCI).
> > Switching endianness of a CPU generally does not effect the endianness of > > the CPU/peripheral busses. It makes the CPU byteswap accesses before > > they are seen by either memory or devices. > > > > In theory it might be possible to avoid redundant byteswaps if you're > > really clever. In practice you still have to handle the fact that your > > devices are a different endianness to RAM, so it probably doesn't gain > > you a whole lot. > > Sparc64 MMU can also perform byte swapping, there is also a byte > swapping CPU mode and byte swapping access instructions. I think only > the instructions are used (for PCI). Right, but that is (to a large extent) a separate problem from memory mapped peripherals. You still have to handle the case where a single TLB entry covers both ram and a cross-endian device. Paul
On Thu, Nov 25, 2010 at 08:35:41AM +0100, Alexander Graf wrote: > The way we're currently modeling mmio is too simplified. We assume that > every device has the same endianness as the target CPU. In reality, > most devices are little endian (all PCI and ISA ones I'm aware of). Some > are big endian (special system devices) and a very little fraction is > target native endian (fw_cfg). > As far as I can see fw_cfg calls cpu_to_le() on some of entries and doesn't on others. Doesn't this makes it "mixed endian" (aka broken)? -- Gleb.
On 28.11.2010, at 09:12, Gleb Natapov wrote: > On Thu, Nov 25, 2010 at 08:35:41AM +0100, Alexander Graf wrote: >> The way we're currently modeling mmio is too simplified. We assume that >> every device has the same endianness as the target CPU. In reality, >> most devices are little endian (all PCI and ISA ones I'm aware of). Some >> are big endian (special system devices) and a very little fraction is >> target native endian (fw_cfg). >> > As far as I can see fw_cfg calls cpu_to_le() on some of entries and > doesn't on others. Doesn't this makes it "mixed endian" (aka broken)? There are two layers of brokenness: 1) MMIO 2) Internal structures A device that assumes native target endianness for its own MMIO is usually broken. Devices don't know which CPU they're plugged to (unless built inside of course) :). That specific issue is that fw_cfg_mem_writew directly takes the selector as is and both openbios and seabios expect it to accept values in native endianness. The way forward to fix this is probably to declare the selector as little endian and just commit a new version of openbios. Its internal structures are a different story. If the interface spec of fw_cfg (basically the .c file atm) defines some values to be a different endianness from the other, that might be acceptable. It's still unpretty though. I'd way prefer to see everything aligned to a single endianness. Alex
diff --git a/cpu-common.h b/cpu-common.h index a543b5d..839b236 100644 --- a/cpu-common.h +++ b/cpu-common.h @@ -20,6 +20,10 @@ #if !defined(CONFIG_USER_ONLY) +#define DEVICE_NATIVE_ENDIAN 0 +#define DEVICE_BIG_ENDIAN 1 +#define DEVICE_LITTLE_ENDIAN 2 + /* address in the RAM (different from a physical address) */ typedef unsigned long ram_addr_t; diff --git a/exec.c b/exec.c index db9ff55..f54a360 100644 --- a/exec.c +++ b/exec.c @@ -3336,6 +3336,109 @@ static int get_free_io_mem_idx(void) return -1; } +/* + * Usually, devices operate in little endian mode. There are devices out + * there that operate in big endian too. Each device gets byte swapped + * mmio if plugged onto a CPU that does the other endianness. + * + * CPU Device swap? + * + * little little no + * little big yes + * big little yes + * big big no + */ +#ifdef TARGET_WORDS_BIGENDIAN + +typedef struct SwapEndianContainer { + CPUReadMemoryFunc *read[3]; + CPUWriteMemoryFunc *write[3]; + void *opaque; +} SwapEndianContainer; + +static uint32_t swapendian_mem_readb (void *opaque, target_phys_addr_t addr) +{ + uint32_t val; + SwapEndianContainer *c = opaque; + val = c->read[0](c->opaque, addr); + return val; +} + +static uint32_t swapendian_mem_readw(void *opaque, target_phys_addr_t addr) +{ + uint32_t val; + SwapEndianContainer *c = opaque; + val = bswap16(c->read[1](c->opaque, addr)); + return val; +} + +static uint32_t swapendian_mem_readl(void *opaque, target_phys_addr_t addr) +{ + uint32_t val; + SwapEndianContainer *c = opaque; + val = bswap32(c->read[2](c->opaque, addr)); + return val; +} + +static CPUReadMemoryFunc * const swapendian_readfn[3]={ + swapendian_mem_readb, + swapendian_mem_readw, + swapendian_mem_readl +}; + +static void swapendian_mem_writeb(void *opaque, target_phys_addr_t addr, + uint32_t val) +{ + SwapEndianContainer *c = opaque; + c->write[0](c->opaque, addr, val); +} + +static void swapendian_mem_writew(void *opaque, target_phys_addr_t addr, + uint32_t val) +{ + SwapEndianContainer *c = opaque; + c->write[1](c->opaque, addr, bswap16(val)); +} + +static void swapendian_mem_writel(void *opaque, target_phys_addr_t addr, + uint32_t val) +{ + SwapEndianContainer *c = opaque; + c->write[2](c->opaque, addr, bswap32(val)); +} + +static CPUWriteMemoryFunc * const swapendian_writefn[3]={ + swapendian_mem_writeb, + swapendian_mem_writew, + swapendian_mem_writel +}; + +static void swapendian_init(int io_index) +{ + SwapEndianContainer *c = qemu_malloc(sizeof(SwapEndianContainer)); + int i; + + /* Swap mmio for big endian targets */ + c->opaque = io_mem_opaque[io_index]; + for (i = 0; i < 3; i++) { + c->read[i] = io_mem_read[io_index][i]; + c->write[i] = io_mem_write[io_index][i]; + + io_mem_read[io_index][i] = swapendian_readfn[i]; + io_mem_write[io_index][i] = swapendian_writefn[i]; + } + io_mem_opaque[io_index] = c; +} + +static void swapendian_del(int io_index) +{ + if (io_mem_read[io_index][0] == swapendian_readfn[0]) { + qemu_free(io_mem_opaque[io_index]); + } +} + +#endif + /* mem_read and mem_write are arrays of functions containing the function to access byte (index 0), word (index 1) and dword (index 2). Functions can be omitted with a NULL function pointer. @@ -3349,6 +3452,7 @@ static int cpu_register_io_memory_fixed(int io_index, void *opaque) { int i; + int endian = DEVICE_NATIVE_ENDIAN; if (io_index <= 0) { io_index = get_free_io_mem_idx(); @@ -3370,6 +3474,22 @@ static int cpu_register_io_memory_fixed(int io_index, } io_mem_opaque[io_index] = opaque; + switch (endian) { + case DEVICE_BIG_ENDIAN: +#ifndef TARGET_WORDS_BIGENDIAN + swapendian_init(io_index); +#endif + break; + case DEVICE_LITTLE_ENDIAN: +#ifdef TARGET_WORDS_BIGENDIAN + swapendian_init(io_index); +#endif + break; + case DEVICE_NATIVE_ENDIAN: + default: + break; + } + return (io_index << IO_MEM_SHIFT); } @@ -3385,6 +3505,8 @@ void cpu_unregister_io_memory(int io_table_address) int i; int io_index = io_table_address >> IO_MEM_SHIFT; + swapendian_del(io_index); + for (i=0;i < 3; i++) { io_mem_read[io_index][i] = unassigned_mem_read[i]; io_mem_write[io_index][i] = unassigned_mem_write[i];
The way we're currently modeling mmio is too simplified. We assume that every device has the same endianness as the target CPU. In reality, most devices are little endian (all PCI and ISA ones I'm aware of). Some are big endian (special system devices) and a very little fraction is target native endian (fw_cfg). So instead of assuming every device to be native endianness, let's move to a model where the device tells us which endianness it's in. That way we can compile the devices only once and get rid of all the ugly swap will be done by the underlying layer. For the same of readability, this patch only introduces the helper framework but doesn't allow the registering code to set its endianness yet. Signed-off-by: Alexander Graf <agraf@suse.de> --- cpu-common.h | 4 ++ exec.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 126 insertions(+), 0 deletions(-)