Patchwork docs: memory.txt document the endian field

Submitter Michael S. Tsirkin
Date Feb. 12, 2012, 3:06 p.m.
Message ID <20120212150658.GC27718@redhat.com>
Permalink /patch/140813/
State New

Comments

Michael S. Tsirkin - Feb. 12, 2012, 3:06 p.m.
On Sun, Feb 12, 2012 at 03:55:20PM +0200, Avi Kivity wrote:
> On 02/12/2012 03:47 PM, Michael S. Tsirkin wrote:
> > On Sun, Feb 12, 2012 at 03:02:11PM +0200, Avi Kivity wrote:
> > > On 02/12/2012 02:52 PM, Michael S. Tsirkin wrote:
> > > > This is an attempt to document the endian
> > > > field in memory API. As this is a confusing topic,
> > > > it's best to make the text as explicit as possible.
> > > >
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > ---
> > > >  docs/memory.txt |   28 ++++++++++++++++++++++++++++
> > > >  1 files changed, 28 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/docs/memory.txt b/docs/memory.txt
> > > > index 5bbee8e..ff92b52 100644
> > > > --- a/docs/memory.txt
> > > > +++ b/docs/memory.txt
> > > > @@ -170,3 +170,31 @@ various constraints can be supplied to control how these callbacks are called:
> > > >   - .old_portio and .old_mmio can be used to ease porting from code using
> > > >     cpu_register_io_memory() and register_ioport().  They should not be used
> > > >     in new code.
> > > > +- .endianness; specifies the device endian-ness, which affects
> > > > +   the value parameter passed from guest to write and returned
> > > > +   to guest from read callbacks, as follows:
> > > > +        void write(void *opaque, target_phys_addr_t addr,
> > > > +                   uint64_t value, unsigned size)
> > > > +        uint64_t read(void *opaque, target_phys_addr_t addr,
> > > > +                       unsigned size)
> > > > +   Legal values are:
> > > > +   DEVICE_NATIVE_ENDIAN - Callbacks accept and return value in
> > > > +        host endian format. This makes it possible to do
> > > > +        math on values without type conversions.
> > > > +        Low size bytes in value are set, the rest are zero padded
> > > > +        on input and ignored on output.
> > > > +   DEVICE_LITTLE_ENDIAN - Callbacks accept and return value
> > > > +        in little endian format. This is appropriate
> > > > +        if you need to directly copy the data into device memory,
> > > > +        and the device programming interface is little endian
> > > > +        (true for most pci devices).
> > > > +        First size bytes in value are set, the rest are zero padded
> > > > +        on input and ignored on output.
> > > > > +   DEVICE_BIG_ENDIAN - Callbacks accept and return value
> > > > > +        in big endian format. This is appropriate
> > > > +        if you need to directly copy the data into device memory,
> > > > +        and the device programming interface is big endian
> > > > +        (true e.g. for some system devices on big endian architectures).
> > > > +        Last size bytes in value are set, the rest are zero padded
> > > > +        on input and ignored on output.
> > > 
> > > This is wrong.  Callback data is always in host endianness.  Device
> > > endianness is about the device.
> > > 
> > > For example, DEVICE_BIG_ENDIAN means that the device expects data in big
> > > endian format.  Qemu assumes the guest OS writes big endian data to the
> > > device, so it swaps from big endian to host endian before calling the
> > > callback.  Similarly it will swap from host endian to big endian on read.
> > > 
> > > DEVICE_NATIVE_ENDIAN means:
> > > 
> > > >   defined(TARGET_WORDS_BIGENDIAN) ? DEVICE_BIG_ENDIAN : DEVICE_LITTLE_ENDIAN
> > > 
> > > i.e. the device has the same endianness as the guest cpu.
> >
> > I think this boils down to the same thing in the end, no?
> 
> Maybe.
> 
> > However, it's a bad way to describe the setup
> > for device writers: it documents the
> > internal workings of qemu with multiple
> > swaps. We need to document the end result.
> >
> > And, it is IMO confusing to say that 'a device expects data':
> > this adds a speculative element where you
> > are asked to think about what you would want to
> > do and promised that this will be somehow
> > satisfied.
> >
> > Instead, please specify what the API does, users
> > can make their own decisions on when to use it.
> 
> But "callbacks accept data in little endian format" implies that you
> have to add a swap in the handler,
> since you usually want data in host endian.
> It's really really simple:
> 
> If the device spec says "big endian", specify DEVICE_BIG_ENDIAN, and
> treat the data naturally in the callback.
> If the device spec says "little endian", specify DEVICE_LITTLE_ENDIAN,
> and treat the data naturally in the callback.
> 
> That's it.

Okay, but I'm sure your API does not go read the spec, so
we should not base the description on that :)
Right?

So I think the following is right?


commit 02aa79aac9bec1c8c17d1b7b5405b59b649dfdb9
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Wed Feb 8 17:16:35 2012 +0200

    docs: memory.txt document the endian field
    
    This is an attempt to document the endian
    field in memory API. As this is a confusing topic,
    add some examples.
    
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Avi Kivity - Feb. 12, 2012, 3:19 p.m.
On 02/12/2012 05:06 PM, Michael S. Tsirkin wrote:
> > It's really really simple:
> > 
> > If the device spec says "big endian", specify DEVICE_BIG_ENDIAN, and
> > treat the data naturally in the callback.
> > If the device spec says "little endian", specify DEVICE_LITTLE_ENDIAN,
> > and treat the data naturally in the callback.
> > 
> > That's it.
>
> Okay, but I'm sure your API does not go read the spec, so
> we should not base the description on that :)
> Right?
>
> So I think the following is right?
>
>
> commit 02aa79aac9bec1c8c17d1b7b5405b59b649dfdb9
> Author: Michael S. Tsirkin <mst@redhat.com>
> Date:   Wed Feb 8 17:16:35 2012 +0200
>
>     docs: memory.txt document the endian field
>     
>     This is an attempt to document the endian
>     field in memory API. As this is a confusing topic,
>     add some examples.
>     
>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> diff --git a/docs/memory.txt b/docs/memory.txt
> index 5bbee8e..9132c86 100644
> --- a/docs/memory.txt
> +++ b/docs/memory.txt
> @@ -170,3 +170,48 @@ various constraints can be supplied to control how these callbacks are called:
>   - .old_portio and .old_mmio can be used to ease porting from code using
>     cpu_register_io_memory() and register_ioport().  They should not be used
>     in new code.
> +- .endianness; specifies the device endian-ness, which affects
> +   the handling of the value parameter passed from guest to write
> +   and returned to guest from read callbacks, as follows:
> +        void write(void *opaque, target_phys_addr_t addr,
> +                   uint64_t value, unsigned size)
> +        uint64_t read(void *opaque, target_phys_addr_t addr,
> +                       unsigned size)
> +   value is always passed in the natural host format,
> +   low size bytes in value are set, the rest are zero padded
> +   on input and ignored on output.
> +   Legal values for endian-ness are:
> +   DEVICE_NATIVE_ENDIAN - The value is left in the format used by guest.
> +       Note that although this is typically a fixed format as
> +       guest drivers take care of endian conversions,
> +       if host endian-ness does not match the device this will
> +       result in "mixed endian" since the data is always
> +       stored in low bits of value.
> +
> +       To handle this data, on write, you typically need to first
> +       convert to the appropriate type, removing the
> +       padding. On read, handle the data in the appropriate
> +       type and then convert to uint64_t, padding with leading zeroes.

No.  Data is converted from guest endian to host endian on write (vice
versa on read).  This works if the device endianness matches the guest
endianness.

> +
> +   DEVICE_LITTLE_ENDIAN - The value is assumed to be
> +       little endian, and is converted to host endian.
> +   DEVICE_BIG_ENDIAN - The value is assumed to be
> +        big endian, and is converted to host endian.

Yes.

> +
> +    As an example, consider a little endian guest writing a 32 bit
> +    value 0x12345678 into an MMIO register, on a big endian host.
> +    The value passed to the write callback is documented below:
> +
> +   DEVICE_NATIVE_ENDIAN - value = 0x0000000087654321
> +        Explanation: write callback will get the high bits
> +        in value set to 0, and low bits set to data left
> +        as is, that is in little endian format.

No, you'll see 0x12345678, same as DEVICE_LITTLE_ENDIAN.

> +   DEVICE_LITTLE_ENDIAN - value = 0x0000000012345678
> +        Explanation: the write callback will get the high bits
> +        in value set to 0, and low bits set to data in big endian
> +        format.
> +   DEVICE_BIG_ENDIAN - value = 0x0000000087654321
> +        Explanation: the write callback will get the high bits
> +        in value set to 0, and low bits set to data in little endian
> +        format.
> +

Right value, wrong explanation.  The value is still in big endian format.
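
A small model of the write path as described in the replies above may help.
The sketch below is not QEMU code: write_callback_value(), the
target_big_endian flag, and the local enum device_endian are hypothetical
stand-ins chosen for illustration. It takes the bytes in the order the guest
stored them and computes the integer a write callback would see. Note that a
byte-wise swap of 0x12345678 is 0x78563412, so that is what the sketch yields
for DEVICE_BIG_ENDIAN; the 0x87654321 in the patch reverses hex digits rather
than bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's enum; names mirror the thread. */
enum device_endian {
    DEVICE_NATIVE_ENDIAN,
    DEVICE_BIG_ENDIAN,
    DEVICE_LITTLE_ENDIAN,
};

/* Interpret `size` bytes (guest memory order) as a little endian integer. */
static uint64_t load_le(const uint8_t *bytes, unsigned size)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < size; i++) {
        v |= (uint64_t)bytes[i] << (8 * i);
    }
    return v;
}

/* Interpret `size` bytes (guest memory order) as a big endian integer. */
static uint64_t load_be(const uint8_t *bytes, unsigned size)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < size; i++) {
        v = (v << 8) | bytes[i];
    }
    return v;
}

/*
 * Model of the value handed to a write callback.  The result is an
 * ordinary host integer: the low `size` bytes carry data, the rest are 0.
 * Host byte order never appears here, because it does not change the
 * integer value the callback observes.
 */
uint64_t write_callback_value(const uint8_t *bytes, unsigned size,
                              enum device_endian e, int target_big_endian)
{
    assert(size >= 1 && size <= 8);
    if (e == DEVICE_NATIVE_ENDIAN) {
        /* "Native" follows the (nominal) endianness of the target. */
        e = target_big_endian ? DEVICE_BIG_ENDIAN : DEVICE_LITTLE_ENDIAN;
    }
    return e == DEVICE_BIG_ENDIAN ? load_be(bytes, size)
                                  : load_le(bytes, size);
}
```

For a little endian guest storing 0x12345678, the bytes arrive as
78 56 34 12. Under this model DEVICE_LITTLE_ENDIAN and DEVICE_NATIVE_ENDIAN
both give 0x12345678, and DEVICE_BIG_ENDIAN gives 0x78563412, regardless of
the host.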
Andreas Färber - Feb. 12, 2012, 6:20 p.m.
Am 12.02.2012 16:06, schrieb Michael S. Tsirkin:
> So I think the following is right?
> 
> 
> commit 02aa79aac9bec1c8c17d1b7b5405b59b649dfdb9
> Author: Michael S. Tsirkin <mst@redhat.com>
> Date:   Wed Feb 8 17:16:35 2012 +0200
> 
>     docs: memory.txt document the endian field
>     
>     This is an attempt to document the endian
>     field in memory API. As this is a confusing topic,
>     add some examples.
>     
>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> diff --git a/docs/memory.txt b/docs/memory.txt
> index 5bbee8e..9132c86 100644
> --- a/docs/memory.txt
> +++ b/docs/memory.txt
> @@ -170,3 +170,48 @@ various constraints can be supplied to control how these callbacks are called:
>   - .old_portio and .old_mmio can be used to ease porting from code using
>     cpu_register_io_memory() and register_ioport().  They should not be used
>     in new code.
> +- .endianness; specifies the device endian-ness, which affects
> +   the handling of the value parameter passed from guest to write
> +   and returned to guest from read callbacks, as follows:
> +        void write(void *opaque, target_phys_addr_t addr,
> +                   uint64_t value, unsigned size)
> +        uint64_t read(void *opaque, target_phys_addr_t addr,
> +                       unsigned size)
> +   value is always passed in the natural host format,
> +   low size bytes in value are set, the rest are zero padded
> +   on input and ignored on output.

Looks good so far.

> +   Legal values for endian-ness are:
> +   DEVICE_NATIVE_ENDIAN - The value is left in the format used by guest.
> +       Note that although this is typically a fixed format as
> +       guest drivers take care of endian conversions,

> +       if host endian-ness does not match the device this will
> +       result in "mixed endian" since the data is always
> +       stored in low bits of value.

Why "mixed" endian? The host always uses host endianness, and with
"native" we use the (nominal) endianness of the target.

Note that the endianness of the guest might be different from the
target's if the CPU is bi-endian.

> +
> +       To handle this data, on write, you typically need to first
> +       convert to the appropriate type, removing the
> +       padding. On read, handle the data in the appropriate
> +       type and then convert to uint64_t, padding with leading zeroes.

That applies to all three endiannesses, doesn't it?

Andreas

> +
> +   DEVICE_LITTLE_ENDIAN - The value is assumed to be
> +       little endian, and is converted to host endian.
> +   DEVICE_BIG_ENDIAN - The value is assumed to be
> +        big endian, and is converted to host endian.
> +
> +    As an example, consider a little endian guest writing a 32 bit
> +    value 0x12345678 into an MMIO register, on a big endian host.
> +    The value passed to the write callback is documented below:
> +
> +   DEVICE_NATIVE_ENDIAN - value = 0x0000000087654321
> +        Explanation: write callback will get the high bits
> +        in value set to 0, and low bits set to data left
> +        as is, that is in little endian format.
> +   DEVICE_LITTLE_ENDIAN - value = 0x0000000012345678
> +        Explanation: the write callback will get the high bits
> +        in value set to 0, and low bits set to data in big endian
> +        format.
> +   DEVICE_BIG_ENDIAN - value = 0x0000000087654321
> +        Explanation: the write callback will get the high bits
> +        in value set to 0, and low bits set to data in little endian
> +        format.
> +
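
The point that narrowing on write and zero-padding on read apply to all three
endianness settings can be sketched with a pair of callbacks. This is only an
illustration under stated assumptions: struct demo_dev, demo_write(),
demo_read() and size_mask() are hypothetical, phys_addr_t stands in for
target_phys_addr_t, and the device is reduced to a single 32-bit register so
that addr can be ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;   /* stand-in for target_phys_addr_t */

/* Hypothetical device with one 32-bit register. */
struct demo_dev {
    uint32_t reg;
};

/* Mask selecting the low `size` bytes of a 64-bit value. */
static uint64_t size_mask(unsigned size)
{
    assert(size >= 1 && size <= 8);
    return size == 8 ? UINT64_MAX : (UINT64_C(1) << (8 * size)) - 1;
}

/* Write: keep only the low `size` bytes, then narrow to the register type. */
void demo_write(void *opaque, phys_addr_t addr, uint64_t value, unsigned size)
{
    struct demo_dev *d = opaque;
    (void)addr;                 /* single register: offset is not decoded */
    d->reg = (uint32_t)(value & size_mask(size));
}

/* Read: work in the register type, then widen with leading zeroes. */
uint64_t demo_read(void *opaque, phys_addr_t addr, unsigned size)
{
    struct demo_dev *d = opaque;
    (void)addr;
    return (uint64_t)d->reg & size_mask(size);
}
```

Nothing in these callbacks depends on the .endianness setting; the value is
treated as a plain host integer in every case.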
Michael S. Tsirkin - Feb. 12, 2012, 6:27 p.m.
On Sun, Feb 12, 2012 at 07:20:07PM +0100, Andreas Färber wrote:
> Am 12.02.2012 16:06, schrieb Michael S. Tsirkin:
> > So I think the following is right?
> > 
> > 
> > commit 02aa79aac9bec1c8c17d1b7b5405b59b649dfdb9
> > Author: Michael S. Tsirkin <mst@redhat.com>
> > Date:   Wed Feb 8 17:16:35 2012 +0200
> > 
> >     docs: memory.txt document the endian field
> >     
> >     This is an attempt to document the endian
> >     field in memory API. As this is a confusing topic,
> >     add some examples.
> >     
> >     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > diff --git a/docs/memory.txt b/docs/memory.txt
> > index 5bbee8e..9132c86 100644
> > --- a/docs/memory.txt
> > +++ b/docs/memory.txt
> > @@ -170,3 +170,48 @@ various constraints can be supplied to control how these callbacks are called:
> >   - .old_portio and .old_mmio can be used to ease porting from code using
> >     cpu_register_io_memory() and register_ioport().  They should not be used
> >     in new code.
> > +- .endianness; specifies the device endian-ness, which affects
> > +   the handling of the value parameter passed from guest to write
> > +   and returned to guest from read callbacks, as follows:
> > +        void write(void *opaque, target_phys_addr_t addr,
> > +                   uint64_t value, unsigned size)
> > +        uint64_t read(void *opaque, target_phys_addr_t addr,
> > +                       unsigned size)
> > +   value is always passed in the natural host format,
> > +   low size bytes in value are set, the rest are zero padded
> > +   on input and ignored on output.
> 
> Looks good so far.
> 
> > +   Legal values for endian-ness are:
> > +   DEVICE_NATIVE_ENDIAN - The value is left in the format used by guest.
> > +       Note that although this is typically a fixed format as
> > +       guest drivers take care of endian conversions,
> 
> > +       if host endian-ness does not match the device this will
> > +       result in "mixed endian" since the data is always
> > +       stored in low bits of value.
> 
> Why "mixed" endian? The host always uses host endianness, and with
> "native" we use the (nominal) endianness of the target.
> Note that the endianness of the guest might be different from the
> target's if the CPU is bi-endian.
> 
> > +
> > +       To handle this data, on write, you typically need to first
> > +       convert to the appropriate type, removing the
> > +       padding. On read, handle the data in the appropriate
> > +       type and then convert to uint64_t, padding with leading zeroes.
> 
> That applies to all three endiannesses, doesn't it?
> 
> Andreas
> > +
> > +   DEVICE_LITTLE_ENDIAN - The value is assumed to be
> > +       little endian, and is converted to host endian.
> > +   DEVICE_BIG_ENDIAN - The value is assumed to be
> > +        big endian, and is converted to host endian.
> > +
> > +    As an example, consider a little endian guest writing a 32 bit
> > +    value 0x12345678 into an MMIO register, on a big endian host.
> > +    The value passed to the write callback is documented below:
> > +
> > +   DEVICE_NATIVE_ENDIAN - value = 0x0000000087654321
> > +        Explanation: write callback will get the high bits
> > +        in value set to 0, and low bits set to data left
> > +        as is, that is in little endian format.
> > +   DEVICE_LITTLE_ENDIAN - value = 0x0000000012345678
> > +        Explanation: the write callback will get the high bits
> > +        in value set to 0, and low bits set to data in big endian
> > +        format.
> > +   DEVICE_BIG_ENDIAN - value = 0x0000000087654321
> > +        Explanation: the write callback will get the high bits
> > +        in value set to 0, and low bits set to data in little endian
> > +        format.
> > +
> 


It looks like the text is wrong anyway.
I give up for now, maybe Avi can document it
properly.


> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Patch

diff --git a/docs/memory.txt b/docs/memory.txt
index 5bbee8e..9132c86 100644
--- a/docs/memory.txt
+++ b/docs/memory.txt
@@ -170,3 +170,48 @@  various constraints can be supplied to control how these callbacks are called:
  - .old_portio and .old_mmio can be used to ease porting from code using
    cpu_register_io_memory() and register_ioport().  They should not be used
    in new code.
+- .endianness; specifies the device endian-ness, which affects
+   the handling of the value parameter passed from guest to write
+   and returned to guest from read callbacks, as follows:
+        void write(void *opaque, target_phys_addr_t addr,
+                   uint64_t value, unsigned size)
+        uint64_t read(void *opaque, target_phys_addr_t addr,
+                       unsigned size)
+   value is always passed in the natural host format,
+   low size bytes in value are set, the rest are zero padded
+   on input and ignored on output.
+   Legal values for endian-ness are:
+   DEVICE_NATIVE_ENDIAN - The value is left in the format used by guest.
+       Note that although this is typically a fixed format as
+       guest drivers take care of endian conversions,
+       if host endian-ness does not match the device this will
+       result in "mixed endian" since the data is always
+       stored in low bits of value.
+
+       To handle this data, on write, you typically need to first
+       convert to the appropriate type, removing the
+       padding. On read, handle the data in the appropriate
+       type and then convert to uint64_t, padding with leading zeroes.
+
+   DEVICE_LITTLE_ENDIAN - The value is assumed to be
+       little endian, and is converted to host endian.
+   DEVICE_BIG_ENDIAN - The value is assumed to be
+        big endian, and is converted to host endian.
+
+    As an example, consider a little endian guest writing a 32 bit
+    value 0x12345678 into an MMIO register, on a big endian host.
+    The value passed to the write callback is documented below:
+
+   DEVICE_NATIVE_ENDIAN - value = 0x0000000087654321
+        Explanation: write callback will get the high bits
+        in value set to 0, and low bits set to data left
+        as is, that is in little endian format.
+   DEVICE_LITTLE_ENDIAN - value = 0x0000000012345678
+        Explanation: the write callback will get the high bits
+        in value set to 0, and low bits set to data in big endian
+        format.
+   DEVICE_BIG_ENDIAN - value = 0x0000000087654321
+        Explanation: the write callback will get the high bits
+        in value set to 0, and low bits set to data in little endian
+        format.
+
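
The read direction mentioned in the review ("it will swap from host endian to
big endian on read") can be modelled the same way. This is a sketch, not QEMU
code: read_callback_bytes(), the target_big_endian flag, and the local enum
are hypothetical names. It takes the host integer a read callback returns and
produces the bytes the guest would find, in memory order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's enum; names mirror the thread. */
enum device_endian {
    DEVICE_NATIVE_ENDIAN,
    DEVICE_BIG_ENDIAN,
    DEVICE_LITTLE_ENDIAN,
};

/*
 * Model of a guest read: `value` is what the read callback returned
 * (a host integer; bytes above `size` are ignored), `out` receives
 * `size` bytes in the order the guest sees them in memory.
 */
void read_callback_bytes(uint64_t value, unsigned size,
                         enum device_endian e, int target_big_endian,
                         uint8_t *out)
{
    assert(size >= 1 && size <= 8);
    if (e == DEVICE_NATIVE_ENDIAN) {
        /* "Native" follows the (nominal) endianness of the target. */
        e = target_big_endian ? DEVICE_BIG_ENDIAN : DEVICE_LITTLE_ENDIAN;
    }
    for (unsigned i = 0; i < size; i++) {
        /* Big endian: most significant byte first; little: least first. */
        unsigned shift = (e == DEVICE_BIG_ENDIAN) ? 8 * (size - 1 - i)
                                                  : 8 * i;
        out[i] = (uint8_t)(value >> shift);
    }
}
```

With a callback returning 0x12345678 for a 4-byte read, this model delivers
12 34 56 78 for DEVICE_BIG_ENDIAN and 78 56 34 12 for DEVICE_LITTLE_ENDIAN.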