diff mbox

[1/9] exec: add endian specific phys ld/st functions

Message ID 484E8730-8686-4FBE-9C50-2211C9C4E929@suse.de
State New
Headers show

Commit Message

Alexander Graf July 5, 2011, 10:40 p.m. UTC
On 06.07.2011, at 00:22, Blue Swirl wrote:

> On Wed, Jul 6, 2011 at 1:13 AM, Alexander Graf <agraf@suse.de> wrote:
>> 
>> On 06.07.2011, at 00:05, Blue Swirl wrote:
>> 
>>> On Wed, Jul 6, 2011 at 12:55 AM, Alexander Graf <agraf@suse.de> wrote:
>>>> 
>>>> On 05.07.2011, at 23:48, Blue Swirl wrote:
>>>> 
>>>>> On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf <agraf@suse.de> wrote:
>>>>>> Device code some times needs to access physical memory and does that
>>>>>> through the ld./st._phys functions. However, these are the exact same
>>>>>> functions that the CPU uses to access memory, which means they will
>>>>>> be endianness swapped depending on the target CPU.
>>>>>> 
>>>>>> However, devices don't know about the CPU's endianness, but instead
>>>>>> access memory directly using their own interface to the memory bus,
>>>>>> so they need some way to read data with their native endianness.
>>>>>> 
>>>>>> This patch adds _le and _be functions to ld./st._phys.
>>>>>> 
>>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>>> ---
>>>>>>  cpu-common.h |   12 +++++++
>>>>>>  exec.c       |  102 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>  2 files changed, 114 insertions(+), 0 deletions(-)
>>>>>> 
>>>>>> diff --git a/cpu-common.h b/cpu-common.h
>>>>>> index b027e43..c6a2b5f 100644
>>>>>> --- a/cpu-common.h
>>>>>> +++ b/cpu-common.h
>>>>>> @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);
>>>>>> 
>>>>>>  uint32_t ldub_phys(target_phys_addr_t addr);
>>>>>>  uint32_t lduw_phys(target_phys_addr_t addr);
>>>>>> +uint32_t lduw_le_phys(target_phys_addr_t addr);
>>>>>> +uint32_t lduw_be_phys(target_phys_addr_t addr);
>>>>>>  uint32_t ldl_phys(target_phys_addr_t addr);
>>>>>> +uint32_t ldl_le_phys(target_phys_addr_t addr);
>>>>>> +uint32_t ldl_be_phys(target_phys_addr_t addr);
>>>>>>  uint64_t ldq_phys(target_phys_addr_t addr);
>>>>>> +uint64_t ldq_le_phys(target_phys_addr_t addr);
>>>>>> +uint64_t ldq_be_phys(target_phys_addr_t addr);
>>>>>>  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
>>>>>>  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
>>>>>>  void stb_phys(target_phys_addr_t addr, uint32_t val);
>>>>>>  void stw_phys(target_phys_addr_t addr, uint32_t val);
>>>>>> +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
>>>>>> +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
>>>>>>  void stl_phys(target_phys_addr_t addr, uint32_t val);
>>>>>> +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
>>>>>> +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
>>>>>>  void stq_phys(target_phys_addr_t addr, uint64_t val);
>>>>>> +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
>>>>>> +void stq_be_phys(target_phys_addr_t addr, uint64_t val);
>>>>>> 
>>>>>>  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
>>>>>>                                    const uint8_t *buf, int len);
>>>>>> diff --git a/exec.c b/exec.c
>>>>>> index 4c45299..5f2f87e 100644
>>>>>> --- a/exec.c
>>>>>> +++ b/exec.c
>>>>>> @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
>>>>>>     return val;
>>>>>>  }
>>>>>> 
>>>>>> +uint32_t ldl_le_phys(target_phys_addr_t addr)
>>>>>> +{
>>>>>> +#if defined(TARGET_WORDS_BIGENDIAN)
>>>>>> +    return bswap32(ldl_phys(addr));
>>>>> 
>>>>> This would introduce a second bswap in some cases. Please make instead
>>>>> two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
>>>>> ldl_phys could be #defined to the suitable function.
>>>> 
>>>> Yeah, I was thinking to do that one at first and then realized how MMIO gets tricky, since we also need to potentially swap it again, as by now it returns values in target CPU endianness. But if you prefer, I can dig into that.
>>> 
>>> Maybe the swapendian thing could be adjusted so that it's possible to
>>> access the device in either LE or BE way directly? For example, there
>>> could be two io_mem_read/write tables, the current one for CPU and
>>> another one for device MMIO with known endianness. Or even three
>>> tables: CPU, BE, LE.
>> 
>> If you take a look at the patches following this one, you can easily see how few devices actually use these functions. I don't think the additional memory usage would count up for a couple of byte swaps here really.
>> 
>> We could however try to be clever and directly check if the device mmio is a swapendian function and just bypass it. I'm just not sure it's worth the additional overhead for the non-swapped case (which should be the normal one).
> 
> Neither seems to be very attractive option. It may be enough to
> optimize just RAM accesses and forget about extra bswap for MMIO for
> now.

How about something like this on top? Plus the q, uw and st version of course. The inline is there on purpose, moving the be/le specific code (which is rarely used) out of the way from the often(?) used native ones. Eventually, TCG generated code comes into these, no?




Alex

Comments

Paolo Bonzini July 6, 2011, 10:24 a.m. UTC | #1
> diff --git a/exec.c b/exec.c
> index 5f2f87e..f281ba4 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
>   }
>
>   /* warning: addr must be aligned */
> -uint32_t ldl_phys(target_phys_addr_t addr)
> +static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
> +                                         enum device_endian endian)

You probably need the __always_inline__ attribute to really convince GCC 
to inline this.

Paolo
Alexander Graf July 6, 2011, 11:34 a.m. UTC | #2
On 06.07.2011, at 12:24, Paolo Bonzini <pbonzini@redhat.com> wrote:

>> diff --git a/exec.c b/exec.c
>> index 5f2f87e..f281ba4 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
>>  }
>> 
>>  /* warning: addr must be aligned */
>> -uint32_t ldl_phys(target_phys_addr_t addr)
>> +static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
>> +                                         enum device_endian endian)
> 
> You probably need the __always_inline__ attribute to really convince GCC to inline this.

There's a define in osdep.h that converts inline into always_inline :)

Alex

>
Hannes Reinecke July 6, 2011, 1:03 p.m. UTC | #3
On 07/06/2011 01:34 PM, Alexander Graf wrote:
>
>
>
>
> On 06.07.2011, at 12:24, Paolo Bonzini<pbonzini@redhat.com>  wrote:
>
>>> diff --git a/exec.c b/exec.c
>>> index 5f2f87e..f281ba4 100644
>>> --- a/exec.c
>>> +++ b/exec.c
>>> @@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
>>>   }
>>>
>>>   /* warning: addr must be aligned */
>>> -uint32_t ldl_phys(target_phys_addr_t addr)
>>> +static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
>>> +                                         enum device_endian endian)
>>
>> You probably need the __always_inline__ attribute to really convince GCC to inline this.
>
> There's a define in osdep.h that converts inline into always_inline :)
>
Btw, while you're at it:

uint32_t ldub_phys(target_phys_addr_t addr);
uint32_t lduw_phys(target_phys_addr_t addr);

Hmm? ldub is supposed to read an 'unsigned byte' (uint8_t),
and lduw is supposed to read an 'unsigned word' (uint16_t).

Why does it return an uint32_t?

Cheers,

Hannes
Alexander Graf July 6, 2011, 1:18 p.m. UTC | #4
Am 06.07.2011 um 15:03 schrieb Hannes Reinecke <hare@suse.de>:

> On 07/06/2011 01:34 PM, Alexander Graf wrote:
>> 
>> 
>> 
>> 
>> On 06.07.2011, at 12:24, Paolo Bonzini<pbonzini@redhat.com>  wrote:
>> 
>>>> diff --git a/exec.c b/exec.c
>>>> index 5f2f87e..f281ba4 100644
>>>> --- a/exec.c
>>>> +++ b/exec.c
>>>> @@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
>>>>  }
>>>> 
>>>>  /* warning: addr must be aligned */
>>>> -uint32_t ldl_phys(target_phys_addr_t addr)
>>>> +static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
>>>> +                                         enum device_endian endian)
>>> 
>>> You probably need the __always_inline__ attribute to really convince GCC to inline this.
>> 
>> There's a define in osdep.h that converts inline into always_inline :)
>> 
> Btw, while you're at it:
> 
> uint32_t ldub_phys(target_phys_addr_t addr);
> uint32_t lduw_phys(target_phys_addr_t addr);
> 
> Hmm? ldub is supposed to read an 'unsigned byte' (uint8_t),
> and lduw is supposed to read an 'unsigned word' (uint16_t).
> 
> Why does it return an uint32_t?

Because it ends up being a 32-bit register for the return value / parameter anyways :).

But this is a different thing. I'd prefer to  focus on the endian problem for now.


Alex
Paolo Bonzini July 6, 2011, 1:30 p.m. UTC | #5
On 07/06/2011 03:03 PM, Hannes Reinecke wrote:
>
> uint32_t ldub_phys(target_phys_addr_t addr);
> uint32_t lduw_phys(target_phys_addr_t addr);
>
> Hmm? ldub is supposed to read an 'unsigned byte' (uint8_t),
> and lduw is supposed to read an 'unsigned word' (uint16_t).
>
> Why does it return an uint32_t?

I don't know if this is the reason, but uint{8,16}_t are promoted to a 
signed int.  So when you do

   (uint64_t) (ldub_phys (addr) << 24)

you'd get a sign extension in bits 32-63.  Admittedly a bit contrived, 
but it can happen and QEMU is full of such bugs:

     case 4:
         lba = (uint64_t) buf[9] | ((uint64_t) buf[8] << 8) |
               ((uint64_t) buf[7] << 16) | ((uint64_t) buf[6] << 24) |
               ((uint64_t) buf[5] << 32) | ((uint64_t) buf[4] << 40) |
               ((uint64_t) buf[3] << 48) | ((uint64_t) buf[2] << 56);
         break;

(found by Coverity).

This was the reason for my series at 
http://permalink.gmane.org/gmane.comp.emulators.qemu/105336 (which you 
reminded me to ping---thanks!)

Paolo
diff mbox

Patch

diff --git a/exec.c b/exec.c
index 5f2f87e..f281ba4 100644
--- a/exec.c
+++ b/exec.c
@@ -4127,7 +4127,8 @@  void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
 }
 
 /* warning: addr must be aligned */
-uint32_t ldl_phys(target_phys_addr_t addr)
+static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
+                                         enum device_endian endian)
 {
     int io_index;
     uint8_t *ptr;
@@ -4149,31 +4150,47 @@  uint32_t ldl_phys(target_phys_addr_t addr)
         if (p)
             addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
         val = io_mem_read[io_index][2](io_mem_opaque[io_index], addr);
+#if defined(TARGET_WORDS_BIGENDIAN)
+        if (endian == DEVICE_LITTLE_ENDIAN) {
+            val = bswap32(val);
+        }
+#else
+        if (endian == DEVICE_BIG_ENDIAN) {
+            val = bswap32(val);
+        }
+#endif
     } else {
         /* RAM case */
         ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
             (addr & ~TARGET_PAGE_MASK);
-        val = ldl_p(ptr);
+        switch (endian) {
+        case DEVICE_LITTLE_ENDIAN:
+            val = ldl_le_p(ptr);
+            break;
+        case DEVICE_BIG_ENDIAN:
+            val = ldl_be_p(ptr);
+            break;
+        default:
+            val = ldl_p(ptr);
+            break;
+        }
     }
     return val;
 }
 
+uint32_t ldl_phys(target_phys_addr_t addr)
+{
+    return ldl_phys_internal(addr, DEVICE_NATIVE_ENDIAN);
+}
+
 uint32_t ldl_le_phys(target_phys_addr_t addr)
 {
-#if defined(TARGET_WORDS_BIGENDIAN)
-    return bswap32(ldl_phys(addr));
-#else
-    return ldl_phys(addr);
-#endif
+    return ldl_phys_internal(addr, DEVICE_LITTLE_ENDIAN);
 }
 
 uint32_t ldl_be_phys(target_phys_addr_t addr)
 {
-#if defined(TARGET_WORDS_BIGENDIAN)
-    return ldl_phys(addr);
-#else
-    return bswap32(ldl_phys(addr));
-#endif
+    return ldl_phys_internal(addr, DEVICE_BIG_ENDIAN);
 }
 
 /* warning: addr must be aligned */