diff mbox

qtest: add read/write accessors with a specific endianness

Message ID 1475583448-21013-1-git-send-email-clg@kaod.org
State New
Headers show

Commit Message

Cédric Le Goater Oct. 4, 2016, 12:17 p.m. UTC
Some test scenarios require to access memory regions using a specific
endianness, such as a device region, but the current qtest memory
accessors are done in native endian, which means that the values are
byteswapped in qtest if the endianness of the guest and the host are
different.

To maintain the endianness of a value, we need a new set of memory
accessor. This can be done in two ways:

- first, convert the value to the required endianness in libqtest and
  then use the memread/write routines so that qtest accesses the guest
  memory without doing any supplementary byteswapping

- an alternative method would be to handle the byte swapping on the
  qtest side. For that, we would need to extend the read/write
  protocol with an ending word : "native|le|be" and modify the tswap
  calls accordingly under the qtest_process_command() routine.

The result is the same and the first method is simpler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 tests/libqtest.c |   91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/libqtest.h |   71 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 162 insertions(+)

Comments

Peter Maydell Oct. 4, 2016, 12:36 p.m. UTC | #1
On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote:
> Some test scenarios require to access memory regions using a specific
> endianness, such as a device region, but the current qtest memory
> accessors are done in native endian, which means that the values are
> byteswapped in qtest if the endianness of the guest and the host are
> different.
>
> To maintain the endianness of a value, we need a new set of memory
> accessor. This can be done in two ways:
>
> - first, convert the value to the required endianness in libqtest and
>   then use the memread/write routines so that qtest accesses the guest
>   memory without doing any supplementary byteswapping
>
> - an alternative method would be to handle the byte swapping on the
>   qtest side. For that, we would need to extend the read/write
>   protocol with an ending word : "native|le|be" and modify the tswap
>   calls accordingly under the qtest_process_command() routine.
>
> The result is the same and the first method is simpler.

The difficulty with this patch is that it's hard to tell whether
it's really required, or if this is just adding an extra layer
of byteswapping that should really be done in some other location
in the stack. What's the actual test case here?

thanks
-- PMM
Laurent Vivier Oct. 4, 2016, 2:04 p.m. UTC | #2
On 04/10/2016 14:36, Peter Maydell wrote:
> On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote:
>> Some test scenarios require to access memory regions using a specific
>> endianness, such as a device region, but the current qtest memory
>> accessors are done in native endian, which means that the values are
>> byteswapped in qtest if the endianness of the guest and the host are
>> different.
>>
>> To maintain the endianness of a value, we need a new set of memory
>> accessor. This can be done in two ways:
>>
>> - first, convert the value to the required endianness in libqtest and
>>   then use the memread/write routines so that qtest accesses the guest
>>   memory without doing any supplementary byteswapping
>>
>> - an alternative method would be to handle the byte swapping on the
>>   qtest side. For that, we would need to extend the read/write
>>   protocol with an ending word : "native|le|be" and modify the tswap
>>   calls accordingly under the qtest_process_command() routine.
>>
>> The result is the same and the first method is simpler.
> 
> The difficulty with this patch is that it's hard to tell whether
> it's really required, or if this is just adding an extra layer
> of byteswapping that should really be done in some other location
> in the stack. What's the actual test case here?
> 

We need that to implement PCI qtest with SPAPR, as PCI is always
little-endian and SPAPR is big-endian by default, see my patch
http://patchwork.ozlabs.org/patch/676591/

I'm going to resend my series based on Cédric's patch.

Thanks,
Laurent
Paolo Bonzini Oct. 4, 2016, 5:15 p.m. UTC | #3
On 04/10/2016 14:36, Peter Maydell wrote:
> > - first, convert the value to the required endianness in libqtest and
> >   then use the memread/write routines so that qtest accesses the guest
> >   memory without doing any supplementary byteswapping
> >
> > - an alternative method would be to handle the byte swapping on the
> >   qtest side. For that, we would need to extend the read/write
> >   protocol with an ending word : "native|le|be" and modify the tswap
> >   calls accordingly under the qtest_process_command() routine.
> >
> > The result is the same and the first method is simpler.
>
> The difficulty with this patch is that it's hard to tell whether
> it's really required, or if this is just adding an extra layer
> of byteswapping that should really be done in some other location
> in the stack. What's the actual test case here?

I think the first method makes sense.  The second would be the wrong side.

Paolo
David Gibson Oct. 4, 2016, 11:26 p.m. UTC | #4
On Tue, Oct 04, 2016 at 02:17:28PM +0200, Cédric Le Goater wrote:
> Some test scenarios require to access memory regions using a specific
> endianness, such as a device region, but the current qtest memory
> accessors are done in native endian, which means that the values are
> byteswapped in qtest if the endianness of the guest and the host are
> different.
> 
> To maintain the endianness of a value, we need a new set of memory
> accessor. This can be done in two ways:
> 
> - first, convert the value to the required endianness in libqtest and
>   then use the memread/write routines so that qtest accesses the guest
>   memory without doing any supplementary byteswapping
> 
> - an alternative method would be to handle the byte swapping on the
>   qtest side. For that, we would need to extend the read/write
>   protocol with an ending word : "native|le|be" and modify the tswap
>   calls accordingly under the qtest_process_command() routine.
> 
> The result is the same and the first method is simpler.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  tests/libqtest.c |   91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/libqtest.h |   71 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 162 insertions(+)
> 
> Index: qemu-aspeed.git/tests/libqtest.h
> ===================================================================
> --- qemu-aspeed.git.orig/tests/libqtest.h
> +++ qemu-aspeed.git/tests/libqtest.h
> @@ -887,4 +887,75 @@ void qmp_fd_send(int fd, const char *fmt
>  QDict *qmp_fdv(int fd, const char *fmt, va_list ap);
>  QDict *qmp_fd(int fd, const char *fmt, ...);
>  
> +/*
> + * BE/LE read/write accessors
> + */
> +uint16_t qtest_readw_be(QTestState *s, uint64_t addr);
> +uint32_t qtest_readl_be(QTestState *s, uint64_t addr);
> +uint64_t qtest_readq_be(QTestState *s, uint64_t addr);
> +
> +static inline uint16_t readw_be(uint64_t addr)
> +{
> +    return qtest_readw_be(global_qtest, addr);
> +}
> +static inline uint32_t readl_be(uint64_t addr)
> +{
> +    return qtest_readl_be(global_qtest, addr);
> +}
> +static inline uint64_t readq_be(uint64_t addr)
> +{
> +    return qtest_readq_be(global_qtest, addr);
> +}
> +
> +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value);
> +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value);
> +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value);
> +
> +static inline void writew_be(uint64_t addr, uint16_t value)
> +{
> +    qtest_writew_be(global_qtest, addr, value);
> +}
> +static inline void writel_be(uint64_t addr, uint32_t value)
> +{
> +    qtest_writel_be(global_qtest, addr, value);
> +}
> +static inline void writeq_be(uint64_t addr, uint64_t value)
> +{
> +    qtest_writeq_be(global_qtest, addr, value);
> +}
> +
> +uint16_t qtest_readw_le(QTestState *s, uint64_t addr);
> +uint32_t qtest_readl_le(QTestState *s, uint64_t addr);
> +uint64_t qtest_readq_le(QTestState *s, uint64_t addr);
> +
> +static inline uint16_t readw_le(uint64_t addr)
> +{
> +    return qtest_readw_le(global_qtest, addr);
> +}
> +static inline uint32_t readl_le(uint64_t addr)
> +{
> +    return qtest_readl_le(global_qtest, addr);
> +}
> +static inline uint64_t readq_le(uint64_t addr)
> +{
> +    return qtest_readq_le(global_qtest, addr);
> +}
> +
> +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value);
> +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value);
> +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value);
> +
> +static inline void writew_le(uint64_t addr, uint16_t value)
> +{
> +    qtest_writew_le(global_qtest, addr, value);
> +}
> +static inline void writel_le(uint64_t addr, uint32_t value)
> +{
> +    qtest_writel_le(global_qtest, addr, value);
> +}
> +static inline void writeq_le(uint64_t addr, uint64_t value)
> +{
> +    qtest_writeq_le(global_qtest, addr, value);
> +}
> +
>  #endif
> Index: qemu-aspeed.git/tests/libqtest.c
> ===================================================================
> --- qemu-aspeed.git.orig/tests/libqtest.c
> +++ qemu-aspeed.git/tests/libqtest.c
> @@ -15,6 +15,7 @@
>   *
>   */
>  #include "qemu/osdep.h"
> +#include "qemu/bswap.h"
>  #include "libqtest.h"
>  
>  #include <sys/socket.h>
> @@ -688,6 +689,42 @@ void qtest_writeq(QTestState *s, uint64_
>      qtest_write(s, "writeq", addr, value);
>  }
>  
> +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value)
> +{
> +    value = cpu_to_be16(value);
> +    qtest_memread(s, addr, &value, sizeof(value));

Uh.. this surely needs to be qtest_memwrite().

> +}
> +
> +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value)
> +{
> +    value = cpu_to_be32(value);
> +    qtest_memread(s, addr, &value, sizeof(value));

and here,

> +}
> +
> +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value)
> +{
> +    value = cpu_to_be64(value);
> +    qtest_memread(s, addr, &value, sizeof(value));

here

> +}
> +
> +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value)
> +{
> +    value = cpu_to_le16(value);
> +    qtest_memread(s, addr, &value, sizeof(value));

here

> +}
> +
> +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value)
> +{
> +    value = cpu_to_le32(value);
> +    qtest_memread(s, addr, &value, sizeof(value));

here

> +}
> +
> +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value)
> +{
> +    value = cpu_to_le64(value);
> +    qtest_memread(s, addr, &value, sizeof(value));

and here

> +}
> +
>  static uint64_t qtest_read(QTestState *s, const char *cmd, uint64_t addr)
>  {
>      gchar **args;
> @@ -721,6 +758,60 @@ uint64_t qtest_readq(QTestState *s, uint
>      return qtest_read(s, "readq", addr);
>  }
>  
> +uint16_t qtest_readw_be(QTestState *s, uint64_t addr)
> +{
> +    uint16_t value;
> +
> +    qtest_memread(s, addr, &value, sizeof(value));
> +
> +    return be16_to_cpu(value);
> +}
> +
> +uint32_t qtest_readl_be(QTestState *s, uint64_t addr)
> +{
> +    uint32_t value;
> +
> +    qtest_memread(s, addr, &value, sizeof(value));
> +
> +    return be32_to_cpu(value);
> +}
> +
> +uint64_t qtest_readq_be(QTestState *s, uint64_t addr)
> +{
> +    uint64_t value;
> +
> +    qtest_memread(s, addr, &value, sizeof(value));
> +
> +    return be64_to_cpu(value);
> +}
> +
> +uint16_t qtest_readw_le(QTestState *s, uint64_t addr)
> +{
> +    uint16_t value;
> +
> +    qtest_memread(s, addr, &value, sizeof(value));
> +
> +    return le16_to_cpu(value);
> +}
> +
> +uint32_t qtest_readl_le(QTestState *s, uint64_t addr)
> +{
> +    uint32_t value;
> +
> +    qtest_memread(s, addr, &value, sizeof(value));
> +
> +    return le32_to_cpu(value);
> +}
> +
> +uint64_t qtest_readq_le(QTestState *s, uint64_t addr)
> +{
> +    uint64_t value;
> +
> +    qtest_memread(s, addr, &value, sizeof(value));
> +
> +    return le64_to_cpu(value);
> +}
> +
>  static int hex2nib(char ch)
>  {
>      if (ch >= '0' && ch <= '9') {
>
David Gibson Oct. 4, 2016, 11:43 p.m. UTC | #5
On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
> On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote:
> > Some test scenarios require to access memory regions using a specific
> > endianness, such as a device region, but the current qtest memory
> > accessors are done in native endian, which means that the values are
> > byteswapped in qtest if the endianness of the guest and the host are
> > different.
> >
> > To maintain the endianness of a value, we need a new set of memory
> > accessor. This can be done in two ways:
> >
> > - first, convert the value to the required endianness in libqtest and
> >   then use the memread/write routines so that qtest accesses the guest
> >   memory without doing any supplementary byteswapping
> >
> > - an alternative method would be to handle the byte swapping on the
> >   qtest side. For that, we would need to extend the read/write
> >   protocol with an ending word : "native|le|be" and modify the tswap
> >   calls accordingly under the qtest_process_command() routine.
> >
> > The result is the same and the first method is simpler.
> 
> The difficulty with this patch is that it's hard to tell whether
> it's really required, or if this is just adding an extra layer
> of byteswapping that should really be done in some other location

Actually, it's neither.  It's not essential for anything, but it
*removes* an extra layer of byteswapping that really never should have
been done in the first place.

> in the stack. What's the actual test case here?

The current readw, readl, etc. all work in "guest endianness".  But
guest endianness is not well defined - there are a number of targets
which can support either.  And it's doubly meaningless since it's a
property of the guest cpu, which we're essentially replacing with the
qtest stub anyway.

Furthermore "guest endianness" isn't useful.  With a tiny handful of
exceptions, all peripherals have their own endianness which is known
independent of the target.  It makes more sense for test cases to
explicitly do their accesses in the correct endianness for the device,
without having to compensate for the fact that it'll be swapped into
the essentially arbitrary "guest endianness" along the way.

Basically, the existing byteswaps, instead of removing the need for
them in the testcase code, instead mean that target-conditional
byteswaps will be *required* in nearly every testcase.  It's a recipe
for endianness-unsafe testcases.
Cédric Le Goater Oct. 5, 2016, 5:36 a.m. UTC | #6
On 10/05/2016 01:26 AM, David Gibson wrote:
> On Tue, Oct 04, 2016 at 02:17:28PM +0200, Cédric Le Goater wrote:
>> Some test scenarios require to access memory regions using a specific
>> endianness, such as a device region, but the current qtest memory
>> accessors are done in native endian, which means that the values are
>> byteswapped in qtest if the endianness of the guest and the host are
>> different.
>>
>> To maintain the endianness of a value, we need a new set of memory
>> accessor. This can be done in two ways:
>>
>> - first, convert the value to the required endianness in libqtest and
>>   then use the memread/write routines so that qtest accesses the guest
>>   memory without doing any supplementary byteswapping
>>
>> - an alternative method would be to handle the byte swapping on the
>>   qtest side. For that, we would need to extend the read/write
>>   protocol with an ending word : "native|le|be" and modify the tswap
>>   calls accordingly under the qtest_process_command() routine.
>>
>> The result is the same and the first method is simpler.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  tests/libqtest.c |   91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  tests/libqtest.h |   71 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 162 insertions(+)
>>
>> Index: qemu-aspeed.git/tests/libqtest.h
>> ===================================================================
>> --- qemu-aspeed.git.orig/tests/libqtest.h
>> +++ qemu-aspeed.git/tests/libqtest.h
>> @@ -887,4 +887,75 @@ void qmp_fd_send(int fd, const char *fmt
>>  QDict *qmp_fdv(int fd, const char *fmt, va_list ap);
>>  QDict *qmp_fd(int fd, const char *fmt, ...);
>>  
>> +/*
>> + * BE/LE read/write accessors
>> + */
>> +uint16_t qtest_readw_be(QTestState *s, uint64_t addr);
>> +uint32_t qtest_readl_be(QTestState *s, uint64_t addr);
>> +uint64_t qtest_readq_be(QTestState *s, uint64_t addr);
>> +
>> +static inline uint16_t readw_be(uint64_t addr)
>> +{
>> +    return qtest_readw_be(global_qtest, addr);
>> +}
>> +static inline uint32_t readl_be(uint64_t addr)
>> +{
>> +    return qtest_readl_be(global_qtest, addr);
>> +}
>> +static inline uint64_t readq_be(uint64_t addr)
>> +{
>> +    return qtest_readq_be(global_qtest, addr);
>> +}
>> +
>> +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value);
>> +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value);
>> +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value);
>> +
>> +static inline void writew_be(uint64_t addr, uint16_t value)
>> +{
>> +    qtest_writew_be(global_qtest, addr, value);
>> +}
>> +static inline void writel_be(uint64_t addr, uint32_t value)
>> +{
>> +    qtest_writel_be(global_qtest, addr, value);
>> +}
>> +static inline void writeq_be(uint64_t addr, uint64_t value)
>> +{
>> +    qtest_writeq_be(global_qtest, addr, value);
>> +}
>> +
>> +uint16_t qtest_readw_le(QTestState *s, uint64_t addr);
>> +uint32_t qtest_readl_le(QTestState *s, uint64_t addr);
>> +uint64_t qtest_readq_le(QTestState *s, uint64_t addr);
>> +
>> +static inline uint16_t readw_le(uint64_t addr)
>> +{
>> +    return qtest_readw_le(global_qtest, addr);
>> +}
>> +static inline uint32_t readl_le(uint64_t addr)
>> +{
>> +    return qtest_readl_le(global_qtest, addr);
>> +}
>> +static inline uint64_t readq_le(uint64_t addr)
>> +{
>> +    return qtest_readq_le(global_qtest, addr);
>> +}
>> +
>> +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value);
>> +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value);
>> +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value);
>> +
>> +static inline void writew_le(uint64_t addr, uint16_t value)
>> +{
>> +    qtest_writew_le(global_qtest, addr, value);
>> +}
>> +static inline void writel_le(uint64_t addr, uint32_t value)
>> +{
>> +    qtest_writel_le(global_qtest, addr, value);
>> +}
>> +static inline void writeq_le(uint64_t addr, uint64_t value)
>> +{
>> +    qtest_writeq_le(global_qtest, addr, value);
>> +}
>> +
>>  #endif
>> Index: qemu-aspeed.git/tests/libqtest.c
>> ===================================================================
>> --- qemu-aspeed.git.orig/tests/libqtest.c
>> +++ qemu-aspeed.git/tests/libqtest.c
>> @@ -15,6 +15,7 @@
>>   *
>>   */
>>  #include "qemu/osdep.h"
>> +#include "qemu/bswap.h"
>>  #include "libqtest.h"
>>  
>>  #include <sys/socket.h>
>> @@ -688,6 +689,42 @@ void qtest_writeq(QTestState *s, uint64_
>>      qtest_write(s, "writeq", addr, value);
>>  }
>>  
>> +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value)
>> +{
>> +    value = cpu_to_be16(value);
>> +    qtest_memread(s, addr, &value, sizeof(value));
> 
> Uh.. this surely needs to be qtest_memwrite().

arg. I missed a quilt refresh before sending ...

Thanks,

C. 
> 
>> +}
>> +
>> +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value)
>> +{
>> +    value = cpu_to_be32(value);
>> +    qtest_memread(s, addr, &value, sizeof(value));
> 
> and here,
> 
>> +}
>> +
>> +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value)
>> +{
>> +    value = cpu_to_be64(value);
>> +    qtest_memread(s, addr, &value, sizeof(value));
> 
> here
> 
>> +}
>> +
>> +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value)
>> +{
>> +    value = cpu_to_le16(value);
>> +    qtest_memread(s, addr, &value, sizeof(value));
> 
> here
> 
>> +}
>> +
>> +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value)
>> +{
>> +    value = cpu_to_le32(value);
>> +    qtest_memread(s, addr, &value, sizeof(value));
> 
> here
> 
>> +}
>> +
>> +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value)
>> +{
>> +    value = cpu_to_le64(value);
>> +    qtest_memread(s, addr, &value, sizeof(value));
> 
> and here
> 
>> +}
>> +
>>  static uint64_t qtest_read(QTestState *s, const char *cmd, uint64_t addr)
>>  {
>>      gchar **args;
>> @@ -721,6 +758,60 @@ uint64_t qtest_readq(QTestState *s, uint
>>      return qtest_read(s, "readq", addr);
>>  }
>>  
>> +uint16_t qtest_readw_be(QTestState *s, uint64_t addr)
>> +{
>> +    uint16_t value;
>> +
>> +    qtest_memread(s, addr, &value, sizeof(value));
>> +
>> +    return be16_to_cpu(value);
>> +}
>> +
>> +uint32_t qtest_readl_be(QTestState *s, uint64_t addr)
>> +{
>> +    uint32_t value;
>> +
>> +    qtest_memread(s, addr, &value, sizeof(value));
>> +
>> +    return be32_to_cpu(value);
>> +}
>> +
>> +uint64_t qtest_readq_be(QTestState *s, uint64_t addr)
>> +{
>> +    uint64_t value;
>> +
>> +    qtest_memread(s, addr, &value, sizeof(value));
>> +
>> +    return be64_to_cpu(value);
>> +}
>> +
>> +uint16_t qtest_readw_le(QTestState *s, uint64_t addr)
>> +{
>> +    uint16_t value;
>> +
>> +    qtest_memread(s, addr, &value, sizeof(value));
>> +
>> +    return le16_to_cpu(value);
>> +}
>> +
>> +uint32_t qtest_readl_le(QTestState *s, uint64_t addr)
>> +{
>> +    uint32_t value;
>> +
>> +    qtest_memread(s, addr, &value, sizeof(value));
>> +
>> +    return le32_to_cpu(value);
>> +}
>> +
>> +uint64_t qtest_readq_le(QTestState *s, uint64_t addr)
>> +{
>> +    uint64_t value;
>> +
>> +    qtest_memread(s, addr, &value, sizeof(value));
>> +
>> +    return le64_to_cpu(value);
>> +}
>> +
>>  static int hex2nib(char ch)
>>  {
>>      if (ch >= '0' && ch <= '9') {
>>
>
Cédric Le Goater Oct. 5, 2016, 5:59 a.m. UTC | #7
On 10/05/2016 01:43 AM, David Gibson wrote:
> On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
>> On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote:
>>> Some test scenarios require to access memory regions using a specific
>>> endianness, such as a device region, but the current qtest memory
>>> accessors are done in native endian, which means that the values are
>>> byteswapped in qtest if the endianness of the guest and the host are
>>> different.
>>>
>>> To maintain the endianness of a value, we need a new set of memory
>>> accessor. This can be done in two ways:
>>>
>>> - first, convert the value to the required endianness in libqtest and
>>>   then use the memread/write routines so that qtest accesses the guest
>>>   memory without doing any supplementary byteswapping
>>>
>>> - an alternative method would be to handle the byte swapping on the
>>>   qtest side. For that, we would need to extend the read/write
>>>   protocol with an ending word : "native|le|be" and modify the tswap
>>>   calls accordingly under the qtest_process_command() routine.
>>>
>>> The result is the same and the first method is simpler.
>>
>> The difficulty with this patch is that it's hard to tell whether
>> it's really required, or if this is just adding an extra layer
>> of byteswapping that should really be done in some other location
> 
> Actually, it's neither.  It's not essential for anything, but it
> *removes* an extra layer of byteswapping that really never should have
> been done in the first place.

So may be removing the tswap in qtest_process_command() is a better 
solution ? I don't know how much of an earthquake this will be in the 
tests results.

Tests will have to be explicit on which endianness they expect else 
they will be using the qtest default endianness which is the host. 


>> in the stack. What's the actual test case here?
> 
> The current readw, readl, etc. all work in "guest endianness".  But
> guest endianness is not well defined - there are a number of targets
> which can support either.  And it's doubly meaningless since it's a
> property of the guest cpu, which we're essentially replacing with the
> qtest stub anyway.
> 
> Furthermore "guest endianness" isn't useful.  With a tiny handful of
> exceptions, all peripherals have their own endianness which is known
> independent of the target.  It makes more sense for test cases to
> explicitly do their accesses in the correct endianness for the device,
> without having to compensate for the fact that it'll be swapped into
> the essentially arbitrary "guest endianness" along the way.
> 
> Basically, the existing byteswaps, instead of removing the need for
> them in the testcase code, instead mean that target-conditional
> byteswaps will be *required* in nearly every testcase.  It's a recipe
> for endianness-unsafe testcases.

and plenty of baroque work arounds.

Thanks,
C.
Peter Maydell Oct. 5, 2016, 12:31 p.m. UTC | #8
On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
> On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
>> The difficulty with this patch is that it's hard to tell whether
>> it's really required, or if this is just adding an extra layer
>> of byteswapping that should really be done in some other location
>
> Actually, it's neither.  It's not essential for anything, but it
> *removes* an extra layer of byteswapping that really never should have
> been done in the first place.

The patch is very clearly adding calls to swapping functions.
It looks like it's mostly convenience functions for not doing
those swaps explicitly in the test cases.

>> in the stack. What's the actual test case here?
>
> The current readw, readl, etc. all work in "guest endianness".  But
> guest endianness is not well defined - there are a number of targets
> which can support either.

It's guest bus endianness, and it's pretty well defined I think.
(ARM for instance is LE bus even if the CPU is doing BE writes.)

>  And it's doubly meaningless since it's a
> property of the guest cpu, which we're essentially replacing with the
> qtest stub anyway.

The stub sits on the same bus the guest cpu would.

> Furthermore "guest endianness" isn't useful.  With a tiny handful of
> exceptions, all peripherals have their own endianness which is known
> independent of the target.  It makes more sense for test cases to
> explicitly do their accesses in the correct endianness for the device,
> without having to compensate for the fact that it'll be swapped into
> the essentially arbitrary "guest endianness" along the way.

Here I definitely disagree. I think it makes much more sense
for writes to be "what the guest CPU would write", because that's
where we're injecting them. If we had a test framework where the
test was talking directly to the device, then maybe, but we don't.

> Basically, the existing byteswaps, instead of removing the need for
> them in the testcase code, instead mean that target-conditional
> byteswaps will be *required* in nearly every testcase.  It's a recipe
> for endianness-unsafe testcases.

I prefer the swaps to be explicit in the test cases. If your
test case running on the real CPU would have had to do
"swap to LE and then write this word", so does the test
case in our test framework. If your test case just does
"write the word", then so does the test.

Most devices IME do not require byteswaps by guest code,
and I think these functions are confusing -- if you try
to use them to write tests for ARM devices, for instance,
the _le versions of these functions will introduce an
incorrect byteswap on a BE host, even though ARM CPUs and
devices are typically all LE.

thanks
-- PMM
Cédric Le Goater Oct. 5, 2016, 1:49 p.m. UTC | #9
On 10/05/2016 02:31 PM, Peter Maydell wrote:
> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
>> On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
>>> The difficulty with this patch is that it's hard to tell whether
>>> it's really required, or if this is just adding an extra layer
>>> of byteswapping that should really be done in some other location
>>
>> Actually, it's neither.  It's not essential for anything, but it
>> *removes* an extra layer of byteswapping that really never should have
>> been done in the first place.
> 
> The patch is very clearly adding calls to swapping functions.
> It looks like it's mostly convenience functions for not doing
> those swaps explicitly in the test cases.
> 
>>> in the stack. What's the actual test case here?
>>
>> The current readw, readl, etc. all work in "guest endianness".  But
>> guest endianness is not well defined - there are a number of targets
>> which can support either.
> 
> It's guest bus endianness, and it's pretty well defined I think.
> (ARM for instance is LE bus even if the CPU is doing BE writes.)
> 
>>  And it's doubly meaningless since it's a
>> property of the guest cpu, which we're essentially replacing with the
>> qtest stub anyway.
> 
> The stub sits on the same bus the guest cpu would.
> 
>> Furthermore "guest endianness" isn't useful.  With a tiny handful of
>> exceptions, all peripherals have their own endianness which is known
>> independent of the target.  It makes more sense for test cases to
>> explicitly do their accesses in the correct endianness for the device,
>> without having to compensate for the fact that it'll be swapped into
>> the essentially arbitrary "guest endianness" along the way.
> 
> Here I definitely disagree. I think it makes much more sense
> for writes to be "what the guest CPU would write", because that's
> where we're injecting them. If we had a test framework where the
> test was talking directly to the device, then maybe, but we don't.

This is how it all started with the test I wrote. Quite ignorant
of the middle layers, I used cpu_to_be32() as I would have done 
on the guest and then realized that qtest was byte-swapping also
in some cases and so the test failed on a BE host.

>> Basically, the existing byteswaps, instead of removing the need for
>> them in the testcase code, instead mean that target-conditional
>> byteswaps will be *required* in nearly every testcase.  It's a recipe
>> for endianness-unsafe testcases.
> 
> I prefer the swaps to be explicit in the test cases. If your
> test case running on the real CPU would have had to do
> "swap to LE and then write this word", so does the test
> case in our test framework. If your test case just does
> "write the word", then so does the test.
>
> Most devices IME do not require byteswaps by guest code,
> and I think these functions are confusing -- if you try
> to use them to write tests for ARM devices, for instance,
> the _le versions of these functions will introduce an
> incorrect byteswap on a BE host, even though ARM CPUs and
> devices are typically all LE.

hmmm, yes I agree :/ My scenario is for a ARM SoC running LE 
on a BE host and accessing a BE device. 

So how do we handle the possible bswap due to the host and 
guest endianness being different ?  I am confused. Should we 
try to remove the tswap from qtest ? 

Thanks

C.
Peter Maydell Oct. 5, 2016, 1:53 p.m. UTC | #10
On 5 October 2016 at 06:49, Cédric Le Goater <clg@kaod.org> wrote:
> On 10/05/2016 02:31 PM, Peter Maydell wrote:
>> Here I definitely disagree. I think it makes much more sense
>> for writes to be "what the guest CPU would write", because that's
>> where we're injecting them. If we had a test framework where the
>> test was talking directly to the device, then maybe, but we don't.
>
> This is how it all started with the test I wrote. Quite ignorant
> of the middle layers, I used cpu_to_be32() as I would have done
> on the guest and then realized that qtest was byte-swapping also
> in some cases and so the test failed on a BE host.

At this point we're back to "I'd need to look at the actual test
and device code to identify what's going wrong"...

>>> Basically, the existing byteswaps, instead of removing the need for
>>> them in the testcase code, instead mean that target-conditional
>>> byteswaps will be *required* in nearly every testcase.  It's a recipe
>>> for endianness-unsafe testcases.
>>
>> I prefer the swaps to be explicit in the test cases. If your
>> test case running on the real CPU would have had to do
>> "swap to LE and then write this word", so does the test
>> case in our test framework. If your test case just does
>> "write the word", then so does the test.
>>
>> Most devices IME do not require byteswaps by guest code,
>> and I think these functions are confusing -- if you try
>> to use them to write tests for ARM devices, for instance,
>> the _le versions of these functions will introduce an
>> incorrect byteswap on a BE host, even though ARM CPUs and
>> devices are typically all LE.
>
> hmmm, yes I agree :/ My scenario is for a ARM SoC running LE
> on a BE host and accessing a BE device.
>
> So how do we handle the possible bswap due to the host and
> guest endianness being different ?  I am confused. Should we
> try to remove the tswap from qtest ?

Which tswap? Last time I worked through the stack of
what happens I thought that we had the right set of
swaps in the right places.

thanks
-- PMM
Cédric Le Goater Oct. 5, 2016, 2 p.m. UTC | #11
On 10/05/2016 03:53 PM, Peter Maydell wrote:
> On 5 October 2016 at 06:49, Cédric Le Goater <clg@kaod.org> wrote:
>> On 10/05/2016 02:31 PM, Peter Maydell wrote:
>>> Here I definitely disagree. I think it makes much more sense
>>> for writes to be "what the guest CPU would write", because that's
>>> where we're injecting them. If we had a test framework where the
>>> test was talking directly to the device, then maybe, but we don't.
>>
>> This is how it all started with the test I wrote. Quite ignorant
>> of the middle layers, I used cpu_to_be32() as I would have done
>> on the guest and then realized that qtest was byte-swapping also
>> in some cases and so the test failed on a BE host.
> 
> At this point we're back to "I'd need to look at the actual test
> and device code to identify what's going wrong"...

ok. This is clearly not critical. When ever you have time. 

>>>> Basically, the existing byteswaps, instead of removing the need for
>>>> them in the testcase code, instead mean that target-conditional
>>>> byteswaps will be *required* in nearly every testcase.  It's a recipe
>>>> for endianness-unsafe testcases.
>>>
>>> I prefer the swaps to be explicit in the test cases. If your
>>> test case running on the real CPU would have had to do
>>> "swap to LE and then write this word", so does the test
>>> case in our test framework. If your test case just does
>>> "write the word", then so does the test.
>>>
>>> Most devices IME do not require byteswaps by guest code,
>>> and I think these functions are confusing -- if you try
>>> to use them to write tests for ARM devices, for instance,
>>> the _le versions of these functions will introduce an
>>> incorrect byteswap on a BE host, even though ARM CPUs and
>>> devices are typically all LE.
>>
>> hmmm, yes I agree :/ My scenario is for a ARM SoC running LE
>> on a BE host and accessing a BE device.
>>
>> So how do we handle the possible bswap due to the host and
>> guest endianness being different ?  I am confused. Should we
>> try to remove the tswap from qtest ?
> 
> Which tswap? Last time I worked through the stack of
> what happens I thought that we had the right set of
> swaps in the right places.

The one I am talking about are under qtest_process_command(), 
see below.

Thanks,

C. 


    } else if (strcmp(words[0], "writeb") == 0 ||
               strcmp(words[0], "writew") == 0 ||
               strcmp(words[0], "writel") == 0 ||
               strcmp(words[0], "writeq") == 0) {
        uint64_t addr;
        uint64_t value;

        g_assert(words[1] && words[2]);
        g_assert(qemu_strtoull(words[1], NULL, 0, &addr) == 0);
        g_assert(qemu_strtoull(words[2], NULL, 0, &value) == 0);

        if (words[0][5] == 'b') {
            uint8_t data = value;
            cpu_physical_memory_write(addr, &data, 1);
        } else if (words[0][5] == 'w') {
            uint16_t data = value;
            tswap16s(&data);
            cpu_physical_memory_write(addr, &data, 2);
        } else if (words[0][5] == 'l') {
            uint32_t data = value;
            tswap32s(&data);
            cpu_physical_memory_write(addr, &data, 4);
        } else if (words[0][5] == 'q') {
            uint64_t data = value;
            tswap64s(&data);
            cpu_physical_memory_write(addr, &data, 8);
        }
        qtest_send_prefix(chr);
        qtest_send(chr, "OK\n");
    } else if (strcmp(words[0], "readb") == 0 ||
               strcmp(words[0], "readw") == 0 ||
               strcmp(words[0], "readl") == 0 ||
               strcmp(words[0], "readq") == 0) {
        uint64_t addr;
        uint64_t value = UINT64_C(-1);

        g_assert(words[1]);
        g_assert(qemu_strtoull(words[1], NULL, 0, &addr) == 0);

        if (words[0][4] == 'b') {
            uint8_t data;
            cpu_physical_memory_read(addr, &data, 1);
            value = data;
        } else if (words[0][4] == 'w') {
            uint16_t data;
            cpu_physical_memory_read(addr, &data, 2);
            value = tswap16(data);
        } else if (words[0][4] == 'l') {
            uint32_t data;
            cpu_physical_memory_read(addr, &data, 4);
            value = tswap32(data);
        } else if (words[0][4] == 'q') {
            cpu_physical_memory_read(addr, &value, 8);
            tswap64s(&value);
        }
        qtest_send_prefix(chr);
        qtest_sendf(chr, "OK 0x%016" PRIx64 "\n", value);
Peter Maydell Oct. 5, 2016, 2:20 p.m. UTC | #12
On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote:
> On 10/05/2016 03:53 PM, Peter Maydell wrote:
>> Which tswap? Last time I worked through the stack of
>> what happens I thought that we had the right set of
>> swaps in the right places.
>
> The one I am talking about are under qtest_process_command(),
> see below.

Those are correct and required, and they do not change
the overall behaviour of the system depending on the host
endianness. (They convert 32-bit values to "bag of
bytes in guest order" which is what the cpu_physical_memory_*
functions want.)

https://lists.gnu.org/archive/html/qemu-arm/2016-07/msg00037.html
is the explanation from last time around, I think.

If your test is giving different answers on different
host endiannesses, it is presumably because it is
incorrectly issuing different readl/etc commands
in the two cases.

thanks
-- PMM
Cédric Le Goater Oct. 5, 2016, 5:17 p.m. UTC | #13
On 10/05/2016 04:20 PM, Peter Maydell wrote:
> On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote:
>> On 10/05/2016 03:53 PM, Peter Maydell wrote:
>>> Which tswap? Last time I worked through the stack of
>>> what happens I thought that we had the right set of
>>> swaps in the right places.
>>
>> The one I am talking about are under qtest_process_command(),
>> see below.
> 
> Those are correct and required, and they do not change
> the overall behaviour of the system depending on the host
> endianness. (They convert 32-bit values to "bag of
> bytes in guest order" which is what the cpu_physical_memory_*
> functions want.)
> 
> https://lists.gnu.org/archive/html/qemu-arm/2016-07/msg00037.html
> is the explanation from last time around, I think.
> 
> If your test is giving different answers on different
> host endiannesses, it is presumably because it is
> incorrectly issuing different readl/etc commands
> in the two cases.

OK. I think my brain is starting to see things from the right 
angle.

Let's try that : my test is simulating a Little Endian CPU which
is writing a Big Endian value, so it *always* needs to bswap() 
that value. cpu_to_be32 is incorrect as it implies the endianness
of the host which can be wrong (when running BE).

The qtest layer is a just fixing layer.

Thanks,
C.
Peter Maydell Oct. 5, 2016, 5:32 p.m. UTC | #14
On 5 October 2016 at 10:17, Cédric Le Goater <clg@kaod.org> wrote:
> OK. I think my brain is starting to see things from the right
> angle.
>
> Let's try that : my test is simulating a Little Endian CPU which
> is writing a Big Endian value, so it *always* needs to bswap()
> that value.

Yes. If your code on the (LE) guest CPU would be
 "bswap 32-bit value X; store 32-bit value to address"
then your code in the test should be:
 "bswap 32-bit value X; store 32-bit value to address"
If your code on the guest CPU is just
 "write 32-bit value X to address"
then the code in the test is just
 "write 32-bit value X to address"

> cpu_to_be32 is incorrect as it implies the endianness
> of the host which can be wrong (when running BE).

The main time you might want it in a test is if you
do what the IDE and e1000e tests do: build up a
data structure locally, and then memwrite() it to
the guest memory as a pile of bytes (which you then
point the device-under-test at).

thanks
-- PMM
David Gibson Oct. 6, 2016, 3:38 a.m. UTC | #15
On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:
> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
> >> The difficulty with this patch is that it's hard to tell whether
> >> it's really required, or if this is just adding an extra layer
> >> of byteswapping that should really be done in some other location
> >
> > Actually, it's neither.  It's not essential for anything, but it
> > *removes* an extra layer of byteswapping that really never should have
> > been done in the first place.
> 
> The patch is very clearly adding calls to swapping functions.
> It looks like it's mostly convenience functions for not doing
> those swaps explicitly in the test cases.

It's adding 1 swap on top of the memread/memwrite path - that's the
path which had no existing swaps (intended primarily for bag-o'-bytes
block access AFAICT).

It's less swaps compared to the alternative approach of using the
readw/writew/readl/writel set of functions.  Those already include a
tswap, and will in nearly all cases require an additional
(conditional) swap in the caller.

> >> in the stack. What's the actual test case here?
> >
> > The current readw, readl, etc. all work in "guest endianness".  But
> > guest endianness is not well defined - there are a number of targets
> > which can support either.
> 
> It's guest bus endianness, and it's pretty well defined I think.
> (ARM for instance is LE bus even if the CPU is doing BE writes.)

I don't see that guest bus endianness is any better defined, or any
more useful than "guest endianness".  It might have a vague meaning
for ARM (or embedded Power) in the sense that the on-SoC devices will
use that endianness.  But since the SoC devices are generally unique
to the architecture anyway, you still know their endianness
independent of any notion of guest endianness.

> >  And it's doubly meaningless since it's a
> > property of the guest cpu, which we're essentially replacing with the
> > qtest stub anyway.
> 
> The stub sits on the same bus the guest cpu would.
> 
> > Furthermore "guest endianness" isn't useful.  With a tiny handful of
> > exceptions, all peripherals have their own endianness which is known
> > independent of the target.  It makes more sense for test cases to
> > explicitly do their accesses in the correct endianness for the device,
> > without having to compensate for the fact that it'll be swapped into
> > the essentially arbitrary "guest endianness" along the way.
> 
> Here I definitely disagree. I think it makes much more sense
> for writes to be "what the guest CPU would write", because that's
> where we're injecting them. If we had a test framework where the
> test was talking directly to the device, then maybe, but we don't.

When I say that guest endianness is not well defined, what I mean is
precisely that "what the guest CPU would write" is not well defined.
For example on Power, the endianness of a given access will depend on:
    - For server chips, whether the CPU is in LE or BE mode
    - For embedded chips, the endianness flag in the TLB
    - For all chips, whether the plain or byte-reversed load/store
      instructions are used

[There might even be variants with both a global mode flag and
per-page flags in the TLB, I'm not sure]

All of those components are not present in the qtest model.  So
attempting to do "what the cpu would do" - by making a bunch of
essentially arbitrary assumptions - gives us zero extra coverage.

The model proposed here is that our testcases, on every access specify
a specific final-result endianness.  The alternative is that every
write from qtest has to do:
    1) Start with a value (host endian)
    2) Conditional swap for host endian vs. device endian difference
    3) Conditional swap for host endian vs. "guest endian" difference
       simply to compensate for..
    4) [Existing tswap() within writew/writel]
       Conditional swap for host endian vs. "guest endian difference

> > Basically, the existing byteswaps, instead of removing the need for
> > them in the testcase code, instead mean that target-conditional
> > byteswaps will be *required* in nearly every testcase.  It's a recipe
> > for endianness-unsafe testcases.
> 
> I prefer the swaps to be explicit in the test cases. If your

The are explicit in the testcases, in the sense that on every read or
write you say this is an LE access or a BE access.  Potentially we
could have a "guest native" access option as well, but I think that's
useful 

> test case running on the real CPU would have had to do
> "swap to LE and then write this word", so does the test
> case in our test framework. If your test case just does
> "write the word", then so does the test.
> 
> Most devices IME do not require byteswaps by guest code,

What!?  So speaks the person working on a historically LE platform.
In the kernel, if a driver isn't littered with cpu_to_leXX() it's
almost certainly broken on a BE platform.

The only devices I can think of which don't have a fixed endianness
regardless of platform are:
   * legacy virtio: this was an insane design choice, which is why it
     went away in virtio-1.0
   * graphics card framebuffers, which can usually be configured
     either way (or have both BE and LE views of the framebuffer)
   * a handful of simple devices, usually from a while back, where
     some idiot thought a BE version of a previously LE device (or vice
     versa) was a good idea

> and I think these functions are confusing -- if you try
> to use them to write tests for ARM devices, for instance,
> the _le versions of these functions will introduce an
> incorrect byteswap on a BE host, even though ARM CPUs and
> devices are typically all LE.

Huh?  How so?

If you're accessing an LE device (like PCI) you say "write_le" and
there are no byteswaps at all on an LE host, and a single byteswap on
a BE host, which is just what you want.  For a PCI device "write_le"
is the correct option regardless of both host and guest platform.

That makes testcases straightforward to write.  As a bonus, it forces
the test author to think momentarily about what the right endianness
is for each access, which is a good habit to be in (not thinking about
this is why we've so often had endianness-broken drivers).

Switching to an arbitrary "guest endianness" and back again makes the
test more complicated, and accomplishes no extra coverage (because all
the swaps under consideration are in the test code anyway).  It gains
nothing but confusion.
David Gibson Oct. 6, 2016, 3:40 a.m. UTC | #16
On Wed, Oct 05, 2016 at 03:49:18PM +0200, Cédric Le Goater wrote:
> On 10/05/2016 02:31 PM, Peter Maydell wrote:
> > On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
> >> On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
> >>> The difficulty with this patch is that it's hard to tell whether
> >>> it's really required, or if this is just adding an extra layer
> >>> of byteswapping that should really be done in some other location
> >>
> >> Actually, it's neither.  It's not essential for anything, but it
> >> *removes* an extra layer of byteswapping that really never should have
> >> been done in the first place.
> > 
> > The patch is very clearly adding calls to swapping functions.
> > It looks like it's mostly convenience functions for not doing
> > those swaps explicitly in the test cases.
> > 
> >>> in the stack. What's the actual test case here?
> >>
> >> The current readw, readl, etc. all work in "guest endianness".  But
> >> guest endianness is not well defined - there are a number of targets
> >> which can support either.
> > 
> > It's guest bus endianness, and it's pretty well defined I think.
> > (ARM for instance is LE bus even if the CPU is doing BE writes.)
> > 
> >>  And it's doubly meaningless since it's a
> >> property of the guest cpu, which we're essentially replacing with the
> >> qtest stub anyway.
> > 
> > The stub sits on the same bus the guest cpu would.
> > 
> >> Furthermore "guest endianness" isn't useful.  With a tiny handful of
> >> exceptions, all peripherals have their own endianness which is known
> >> independent of the target.  It makes more sense for test cases to
> >> explicitly do their accesses in the correct endianness for the device,
> >> without having to compensate for the fact that it'll be swapped into
> >> the essentially arbitrary "guest endianness" along the way.
> > 
> > Here I definitely disagree. I think it makes much more sense
> > for writes to be "what the guest CPU would write", because that's
> > where we're injecting them. If we had a test framework where the
> > test was talking directly to the device, then maybe, but we don't.
> 
> This is how it all started with the test I wrote. Quite ignorant
> of the middle layers, I used cpu_to_be32() as I would have done 
> on the guest and then realized that qtest was byte-swapping also
> in some cases and so the test failed on a BE host.
> 
> >> Basically, the existing byteswaps, instead of removing the need for
> >> them in the testcase code, instead mean that target-conditional
> >> byteswaps will be *required* in nearly every testcase.  It's a recipe
> >> for endianness-unsafe testcases.
> > 
> > I prefer the swaps to be explicit in the test cases. If your
> > test case running on the real CPU would have had to do
> > "swap to LE and then write this word", so does the test
> > case in our test framework. If your test case just does
> > "write the word", then so does the test.
> >
> > Most devices IME do not require byteswaps by guest code,
> > and I think these functions are confusing -- if you try
> > to use them to write tests for ARM devices, for instance,
> > the _le versions of these functions will introduce an
> > incorrect byteswap on a BE host, even though ARM CPUs and
> > devices are typically all LE.
> 
> hmmm, yes I agree :/ My scenario is for a ARM SoC running LE 
> on a BE host and accessing a BE device. 

This case is handled perfectly well by this proposal.

The ARM devices are LE, so testcases testing them should use the
write_le operations.  That will do the right thing regardless of host
endianness.  It will also do the right thing regardless of guest
endianness, if for whatever reason you try to attach an ARM SoC device
to a (notionally) BE target.

> So how do we handle the possible bswap due to the host and 
> guest endianness being different ?  I am confused. Should we 
> try to remove the tswap from qtest ? 
> 
> Thanks
> 
> C. 
>
David Gibson Oct. 6, 2016, 3:45 a.m. UTC | #17
On Wed, Oct 05, 2016 at 07:20:52AM -0700, Peter Maydell wrote:
> On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote:
> > On 10/05/2016 03:53 PM, Peter Maydell wrote:
> >> Which tswap? Last time I worked through the stack of
> >> what happens I thought that we had the right set of
> >> swaps in the right places.
> >
> > The one I am talking about are under qtest_process_command(),
> > see below.
> 
> Those are correct and required, and they do not change
> the overall behaviour of the system depending on the host
> endianness. (They convert 32-bit values to "bag of
> bytes in guest order" which is what the cpu_physical_memory_*
> functions want.)

These functions are correct for the defined semantics of the
readw/readl operations, but those semantics are not useful.

This proposal is introducing alternate functions with the more useful
semantics which are "convert a 32-bit value to a bag of bytes in LE
order" or "convert a 32-bit value to a bag of bytes in BE order"
depending on which variant you choose.

qtest is about testing hardware, not the guest cpu, and whether you're
accessing MMIO space or buffers the hardware will read via DMA, it's
the hardware which determines the correct endianness, not the guest
cpu.

> https://lists.gnu.org/archive/html/qemu-arm/2016-07/msg00037.html
> is the explanation from last time around, I think.
> 
> If your test is giving different answers on different
> host endiannesses, it is presumably because it is
> incorrectly issuing different readl/etc commands
> in the two cases.
> 
> thanks
> -- PMM
>
David Gibson Oct. 6, 2016, 6:10 a.m. UTC | #18
On Thu, Oct 06, 2016 at 02:38:36PM +1100, David Gibson wrote:
> On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:
> > On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
> > > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
> > >> The difficulty with this patch is that it's hard to tell whether
> > >> it's really required, or if this is just adding an extra layer
> > >> of byteswapping that should really be done in some other location
> > >
> > > Actually, it's neither.  It's not essential for anything, but it
> > > *removes* an extra layer of byteswapping that really never should have
> > > been done in the first place.
> > 
> > The patch is very clearly adding calls to swapping functions.
> > It looks like it's mostly convenience functions for not doing
> > those swaps explicitly in the test cases.
> 
> It's adding 1 swap on top of the memread/memwrite path - that's the
> path which had no existing swaps (intended primarily for bag-o'-bytes
> block access AFAICT).
> 
> It's less swaps compared to the alternative approach of using the
> readw/writew/readl/writel set of functions.  Those already include a
> tswap, and will in nearly all cases require an additional
> (conditional) swap in the caller.
> 
> > >> in the stack. What's the actual test case here?
> > >
> > > The current readw, readl, etc. all work in "guest endianness".  But
> > > guest endianness is not well defined - there are a number of targets
> > > which can support either.
> > 
> > It's guest bus endianness, and it's pretty well defined I think.
> > (ARM for instance is LE bus even if the CPU is doing BE writes.)
> 
> I don't see that guest bus endianness is any better defined, or any
> more useful than "guest endianness".  It might have a vague meaning
> for ARM (or embedded Power) in the sense that the on-SoC devices will
> use that endianness.  But since the SoC devices are generally unique
> to the architecture anyway, you still know their endianness
> independent of any notion of guest endianness.
> 
> > >  And it's doubly meaningless since it's a
> > > property of the guest cpu, which we're essentially replacing with the
> > > qtest stub anyway.
> > 
> > The stub sits on the same bus the guest cpu would.
> > 
> > > Furthermore "guest endianness" isn't useful.  With a tiny handful of
> > > exceptions, all peripherals have their own endianness which is known
> > > independent of the target.  It makes more sense for test cases to
> > > explicitly do their accesses in the correct endianness for the device,
> > > without having to compensate for the fact that it'll be swapped into
> > > the essentially arbitrary "guest endianness" along the way.
> > 
> > Here I definitely disagree. I think it makes much more sense
> > for writes to be "what the guest CPU would write", because that's
> > where we're injecting them. If we had a test framework where the
> > test was talking directly to the device, then maybe, but we don't.
> 
> When I say that guest endianness is not well defined, what I mean is
> precisely that "what the guest CPU would write" is not well defined.
> For example on Power, the endianness of a given access will depend on:
>     - For server chips, whether the CPU is in LE or BE mode
>     - For embedded chips, the endianness flag in the TLB
>     - For all chips, whether the plain or byte-reversed load/store
>       instructions are used
> 
> [There might even be variants with both a global mode flag and
> per-page flags in the TLB, I'm not sure]
> 
> All of those components are not present in the qtest model.  So
> attempting to do "what the cpu would do" - by making a bunch of
> essentially arbitrary assumptions - gives us zero extra coverage.
> 
> The model proposed here is that our testcases, on every access specify
> a specific final-result endianness.  The alternative is that every
> write from qtest has to do:
>     1) Start with a value (host endian)
>     2) Conditional swap for host endian vs. device endian difference
>     3) Conditional swap for host endian vs. "guest endian" difference
>        simply to compensate for..
>     4) [Existing tswap() within writew/writel]
>        Conditional swap for host endian vs. "guest endian difference
> 
> > > Basically, the existing byteswaps, instead of removing the need for
> > > them in the testcase code, instead mean that target-conditional
> > > byteswaps will be *required* in nearly every testcase.  It's a recipe
> > > for endianness-unsafe testcases.
> > 
> > I prefer the swaps to be explicit in the test cases. If your
> 
> The are explicit in the testcases, in the sense that on every read or
> write you say this is an LE access or a BE access.  Potentially we
> could have a "guest native" access option as well, but I think that's
> useful 
> 
> > test case running on the real CPU would have had to do
> > "swap to LE and then write this word", so does the test
> > case in our test framework. If your test case just does
> > "write the word", then so does the test.
> > 
> > Most devices IME do not require byteswaps by guest code,
> 
> What!?  So speaks the person working on a historically LE platform.
> In the kernel, if a driver isn't littered with cpu_to_leXX() it's
> almost certainly broken on a BE platform.

Clarifying here.  In the kernel, byteswaps are not typically required
on writew(), writel() operations.  That's because the kernel writew()
and writel() are defined to have always-LE operation, and most
hardware is LE.

In other words, the kernel writew() and writel() have the same
semantics as the proposed writew_le() and writel_le() *NOT* the
existing qtest writew() and writel() (they're more like the rarely
used kernel __raw_writew() and __raw_writel()).
Paolo Bonzini Oct. 6, 2016, 7:23 a.m. UTC | #19
On 06/10/2016 05:45, David Gibson wrote:
> qtest is about testing hardware, not the guest cpu, and whether you're
> accessing MMIO space or buffers the hardware will read via DMA, it's
> the hardware which determines the correct endianness, not the guest
> cpu.

Well, sort of... qtest writes are definitely injecting things from the
point of view of the guest CPU.  For example see Laurent's patches that
add RTAS commands to qtest.

Paolo
David Gibson Oct. 6, 2016, 8:37 a.m. UTC | #20
On Thu, Oct 06, 2016 at 09:23:11AM +0200, Paolo Bonzini wrote:
> 
> 
> On 06/10/2016 05:45, David Gibson wrote:
> > qtest is about testing hardware, not the guest cpu, and whether you're
> > accessing MMIO space or buffers the hardware will read via DMA, it's
> > the hardware which determines the correct endianness, not the guest
> > cpu.
> 
> Well, sort of... qtest writes are definitely injecting things from the
> point of view of the guest CPU.  For example see Laurent's patches that
> add RTAS commands to qtest.

Sure.  My point is that any guest CPU can do both LE and BE accesses
pretty simply, so we might as well specify which we want.
Paolo Bonzini Oct. 6, 2016, 9:40 a.m. UTC | #21
On 06/10/2016 10:37, David Gibson wrote:
> On Thu, Oct 06, 2016 at 09:23:11AM +0200, Paolo Bonzini wrote:
>>
>>
>> On 06/10/2016 05:45, David Gibson wrote:
>>> qtest is about testing hardware, not the guest cpu, and whether you're
>>> accessing MMIO space or buffers the hardware will read via DMA, it's
>>> the hardware which determines the correct endianness, not the guest
>>> cpu.
>>
>> Well, sort of... qtest writes are definitely injecting things from the
>> point of view of the guest CPU.  For example see Laurent's patches that
>> add RTAS commands to qtest.
> 
> Sure.  My point is that any guest CPU can do both LE and BE accesses
> pretty simply, so we might as well specify which we want.

On this I do agree.

Paolo
Cédric Le Goater Oct. 6, 2016, 10:44 a.m. UTC | #22
On 10/06/2016 11:40 AM, Paolo Bonzini wrote:
> 
> 
> On 06/10/2016 10:37, David Gibson wrote:
>> On Thu, Oct 06, 2016 at 09:23:11AM +0200, Paolo Bonzini wrote:
>>>
>>>
>>> On 06/10/2016 05:45, David Gibson wrote:
>>>> qtest is about testing hardware, not the guest cpu, and whether you're
>>>> accessing MMIO space or buffers the hardware will read via DMA, it's
>>>> the hardware which determines the correct endianness, not the guest
>>>> cpu.
>>>
>>> Well, sort of... qtest writes are definitely injecting things from the
>>> point of view of the guest CPU.  For example see Laurent's patches that
>>> add RTAS commands to qtest.
>>
>> Sure.  My point is that any guest CPU can do both LE and BE accesses
>> pretty simply, so we might as well specify which we want.
>
> On this I do agree.

Yes. It is pretty confusing today from the test side. To clarify 
things a little, I came up with this  :

	+static inline uint32_t make_be32(uint32_t data)
	+{
	+    return bswap32(data);
	+}

	...

	+    writel(ASPEED_FLASH_BASE, make_be32(some_page_addr));

Which is what a Little Endian CPU would do to write a BE value. So
it should probably be called, make_b32_on_le_cpu() to be correct. 

I am not sure there is a good interface for these accesses ...

 
C.
Peter Maydell Oct. 6, 2016, 10:47 a.m. UTC | #23
On 6 October 2016 at 04:45, David Gibson <david@gibson.dropbear.id.au> wrote:
> On Wed, Oct 05, 2016 at 07:20:52AM -0700, Peter Maydell wrote:
>> On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote:
>> > On 10/05/2016 03:53 PM, Peter Maydell wrote:
>> >> Which tswap? Last time I worked through the stack of
>> >> what happens I thought that we had the right set of
>> >> swaps in the right places.
>> >
>> > The one I am talking about are under qtest_process_command(),
>> > see below.
>>
>> Those are correct and required, and they do not change
>> the overall behaviour of the system depending on the host
>> endianness. (They convert 32-bit values to "bag of
>> bytes in guest order" which is what the cpu_physical_memory_*
>> functions want.)
>
> These functions are correct for the defined semantics of the
> readw/readl operations, but those semantics are not useful.

?? They're the most obvious and required semantics: "act
like the CPU just did a word/halfword/byte read/write".
There's a reason we've got this far without needing anything
else, and it's that this is the most straightforward use case.

> This proposal is introducing alternate functions with the more useful
> semantics which are "convert a 32-bit value to a bag of bytes in LE
> order" or "convert a 32-bit value to a bag of bytes in BE order"
> depending on which variant you choose.

It's adding functions whose semantics are "act like the
CPU wrote this value to some RAM and then memcpy()ed it to
the device". I think devices whose usage model is "memcpy bytes
to me" are rare to nonexistent.

thanks
-- PMM
Peter Maydell Oct. 6, 2016, 11:03 a.m. UTC | #24
On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote:
> On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:
>> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
>> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
>> >> The difficulty with this patch is that it's hard to tell whether
>> >> it's really required, or if this is just adding an extra layer
>> >> of byteswapping that should really be done in some other location
>> >
>> > Actually, it's neither.  It's not essential for anything, but it
>> > *removes* an extra layer of byteswapping that really never should have
>> > been done in the first place.
>>
>> The patch is very clearly adding calls to swapping functions.
>> It looks like it's mostly convenience functions for not doing
>> those swaps explicitly in the test cases.
>
> It's adding 1 swap on top of the memread/memwrite path - that's the
> path which had no existing swaps (intended primarily for bag-o'-bytes
> block access AFAICT).

Yeah, I hadn't noticed it was using the memread/memwrite path.
I disagree with using that code path for what ought to be
register read/writes (among other things, it's not clear to me
that it guarantees that a 4-byte access by the test code is
always a 4-byte access on the device, etc).

>> >> in the stack. What's the actual test case here?
>> >
>> > The current readw, readl, etc. all work in "guest endianness".  But
>> > guest endianness is not well defined - there are a number of targets
>> > which can support either.
>>
>> It's guest bus endianness, and it's pretty well defined I think.
>> (ARM for instance is LE bus even if the CPU is doing BE writes.)
>
> I don't see that guest bus endianness is any better defined, or any
> more useful than "guest endianness".  It might have a vague meaning
> for ARM (or embedded Power) in the sense that the on-SoC devices will
> use that endianness.  But since the SoC devices are generally unique
> to the architecture anyway, you still know their endianness
> independent of any notion of guest endianness.

It means "the endianness that you would see if you could snoop
the data bus at the output of the CPU". It is definitely well
defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN
tells you. It's not the same as whatever the device thinks it
might interpret values as (though obviously people don't
often design systems where they differ).

>> >  And it's doubly meaningless since it's a
>> > property of the guest cpu, which we're essentially replacing with the
>> > qtest stub anyway.
>>
>> The stub sits on the same bus the guest cpu would.
>>
>> > Furthermore "guest endianness" isn't useful.  With a tiny handful of
>> > exceptions, all peripherals have their own endianness which is known
>> > independent of the target.  It makes more sense for test cases to
>> > explicitly do their accesses in the correct endianness for the device,
>> > without having to compensate for the fact that it'll be swapped into
>> > the essentially arbitrary "guest endianness" along the way.
>>
>> Here I definitely disagree. I think it makes much more sense
>> for writes to be "what the guest CPU would write", because that's
>> where we're injecting them. If we had a test framework where the
>> test was talking directly to the device, then maybe, but we don't.
>
> When I say that guest endianness is not well defined, what I mean is
> precisely that "what the guest CPU would write" is not well defined.
> For example on Power, the endianness of a given access will depend on:
>     - For server chips, whether the CPU is in LE or BE mode
>     - For embedded chips, the endianness flag in the TLB
>     - For all chips, whether the plain or byte-reversed load/store
>       instructions are used
>
> [There might even be variants with both a global mode flag and
> per-page flags in the TLB, I'm not sure]

AFAIK, these things all take effect inside the guest CPU, and affect
the value the guest eventually writes to the CPU bus.

> The model proposed here is that our testcases, on every access specify
> a specific final-result endianness.  The alternative is that every
> write from qtest has to do:
>     1) Start with a value (host endian)
>     2) Conditional swap for host endian vs. device endian difference
>     3) Conditional swap for host endian vs. "guest endian" difference
>        simply to compensate for..
>     4) [Existing tswap() within writew/writel]
>        Conditional swap for host endian vs. "guest endian difference
>
>> > Basically, the existing byteswaps, instead of removing the need for
>> > them in the testcase code, instead mean that target-conditional
>> > byteswaps will be *required* in nearly every testcase.  It's a recipe
>> > for endianness-unsafe testcases.
>>
>> I prefer the swaps to be explicit in the test cases. If your
>
> The are explicit in the testcases, in the sense that on every read or
> write you say this is an LE access or a BE access.  Potentially we
> could have a "guest native" access option as well, but I think that's
> useful
>
>> test case running on the real CPU would have had to do
>> "swap to LE and then write this word", so does the test
>> case in our test framework. If your test case just does
>> "write the word", then so does the test.
>>
>> Most devices IME do not require byteswaps by guest code,
>
> What!?  So speaks the person working on a historically LE platform.

Well, platforms where the devices and the CPU agree about
endianness.

> In the kernel, if a driver isn't littered with cpu_to_leXX() it's
> almost certainly broken on a BE platform.
>
> The only devices I can think of which don't have a fixed endianness
> regardless of platform are:
>    * legacy virtio: this was an insane design choice, which is why it
>      went away in virtio-1.0
>    * graphics card framebuffers, which can usually be configured
>      either way (or have both BE and LE views of the framebuffer)
>    * a handful of simple devices, usually from a while back, where
>      some idiot thought a BE version of a previously LE device (or vice
>      versa) was a good idea
>
>> and I think these functions are confusing -- if you try
>> to use them to write tests for ARM devices, for instance,
>> the _le versions of these functions will introduce an
>> incorrect byteswap on a BE host, even though ARM CPUs and
>> devices are typically all LE.
>
> Huh?  How so?

This was the result of my mis-reading of the code: I thought
they were doing a byteswap and then calling the readl/readw/etc
functions, not the readmem etc functions.

> If you're accessing an LE device (like PCI) you say "write_le" and
> there are no byteswaps at all on an LE host, and a single byteswap on
> a BE host, which is just what you want.  For a PCI device "write_le"
> is the correct option regardless of both host and guest platform.
>
> That makes testcases straightforward to write.  As a bonus, it forces
> the test author to think momentarily about what the right endianness
> is for each access, which is a good habit to be in (not thinking about
> this is why we've so often had endianness-broken drivers).
>
> Switching to an arbitrary "guest endianness" and back again makes the
> test more complicated, and accomplishes no extra coverage (because all
> the swaps under consideration are in the test code anyway).  It gains
> nothing but confusion.

I think I need to think about this a bit...

thanks
-- PMM
Greg Kurz Oct. 6, 2016, 2:11 p.m. UTC | #25
On Thu, 6 Oct 2016 12:03:34 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote:
> > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:  
> >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:  
> >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:  
> >> >> The difficulty with this patch is that it's hard to tell whether
> >> >> it's really required, or if this is just adding an extra layer
> >> >> of byteswapping that should really be done in some other location  
> >> >
> >> > Actually, it's neither.  It's not essential for anything, but it
> >> > *removes* an extra layer of byteswapping that really never should have
> >> > been done in the first place.  
> >>
> >> The patch is very clearly adding calls to swapping functions.
> >> It looks like it's mostly convenience functions for not doing
> >> those swaps explicitly in the test cases.  
> >
> > It's adding 1 swap on top of the memread/memwrite path - that's the
> > path which had no existing swaps (intended primarily for bag-o'-bytes
> > block access AFAICT).  
> 
> Yeah, I hadn't noticed it was using the memread/memwrite path.
> I disagree with using that code path for what ought to be
> register read/writes (among other things, it's not clear to me
> that it guarantees that a 4-byte access by the test code is
> always a 4-byte access on the device, etc).
> 

FWIW, Cedric had another proposal which apparently went unnoticed:

<fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>

The idea is to add an optional endianness argument to the read*/write*
commands in the qtest protocol:
- libqtest then provides explicit _le and _be APIs
- no extra byteswap is performed on the test program side: qtest
  actually handles that and does exactly 1 or 0 byteswap.
- it does not use memread/memwrite
- the current 'guest native' API where qtest tswaps is preserved

> >> >> in the stack. What's the actual test case here?  
> >> >
> >> > The current readw, readl, etc. all work in "guest endianness".  But
> >> > guest endianness is not well defined - there are a number of targets
> >> > which can support either.  
> >>
> >> It's guest bus endianness, and it's pretty well defined I think.
> >> (ARM for instance is LE bus even if the CPU is doing BE writes.)  
> >
> > I don't see that guest bus endianness is any better defined, or any
> > more useful than "guest endianness".  It might have a vague meaning
> > for ARM (or embedded Power) in the sense that the on-SoC devices will
> > use that endianness.  But since the SoC devices are generally unique
> > to the architecture anyway, you still know their endianness
> > independent of any notion of guest endianness.  
> 
> It means "the endianness that you would see if you could snoop
> the data bus at the output of the CPU". It is definitely well
> defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN
> tells you. It's not the same as whatever the device thinks it
> might interpret values as (though obviously people don't
> often design systems where they differ).
> 
> >> >  And it's doubly meaningless since it's a
> >> > property of the guest cpu, which we're essentially replacing with the
> >> > qtest stub anyway.  
> >>
> >> The stub sits on the same bus the guest cpu would.
> >>  
> >> > Furthermore "guest endianness" isn't useful.  With a tiny handful of
> >> > exceptions, all peripherals have their own endianness which is known
> >> > independent of the target.  It makes more sense for test cases to
> >> > explicitly do their accesses in the correct endianness for the device,
> >> > without having to compensate for the fact that it'll be swapped into
> >> > the essentially arbitrary "guest endianness" along the way.  
> >>
> >> Here I definitely disagree. I think it makes much more sense
> >> for writes to be "what the guest CPU would write", because that's
> >> where we're injecting them. If we had a test framework where the
> >> test was talking directly to the device, then maybe, but we don't.  
> >
> > When I say that guest endianness is not well defined, what I mean is
> > precisely that "what the guest CPU would write" is not well defined.
> > For example on Power, the endianness of a given access will depend on:
> >     - For server chips, whether the CPU is in LE or BE mode
> >     - For embedded chips, the endianness flag in the TLB
> >     - For all chips, whether the plain or byte-reversed load/store
> >       instructions are used
> >
> > [There might even be variants with both a global mode flag and
> > per-page flags in the TLB, I'm not sure]  
> 
> AFAIK, these things all take effect inside the guest CPU, and affect
> the value the guest eventually writes to the CPU bus.
> 
> > The model proposed here is that our testcases, on every access specify
> > a specific final-result endianness.  The alternative is that every
> > write from qtest has to do:
> >     1) Start with a value (host endian)
> >     2) Conditional swap for host endian vs. device endian difference
> >     3) Conditional swap for host endian vs. "guest endian" difference
> >        simply to compensate for..
> >     4) [Existing tswap() within writew/writel]
> >        Conditional swap for host endian vs. "guest endian difference
> >  
> >> > Basically, the existing byteswaps, instead of removing the need for
> >> > them in the testcase code, instead mean that target-conditional
> >> > byteswaps will be *required* in nearly every testcase.  It's a recipe
> >> > for endianness-unsafe testcases.  
> >>
> >> I prefer the swaps to be explicit in the test cases. If your  
> >
> > The are explicit in the testcases, in the sense that on every read or
> > write you say this is an LE access or a BE access.  Potentially we
> > could have a "guest native" access option as well, but I think that's
> > useful
> >  
> >> test case running on the real CPU would have had to do
> >> "swap to LE and then write this word", so does the test
> >> case in our test framework. If your test case just does
> >> "write the word", then so does the test.
> >>
> >> Most devices IME do not require byteswaps by guest code,  
> >
> > What!?  So speaks the person working on a historically LE platform.  
> 
> Well, platforms where the devices and the CPU agree about
> endianness.
> 
> > In the kernel, if a driver isn't littered with cpu_to_leXX() it's
> > almost certainly broken on a BE platform.
> >
> > The only devices I can think of which don't have a fixed endianness
> > regardless of platform are:
> >    * legacy virtio: this was an insane design choice, which is why it
> >      went away in virtio-1.0
> >    * graphics card framebuffers, which can usually be configured
> >      either way (or have both BE and LE views of the framebuffer)
> >    * a handful of simple devices, usually from a while back, where
> >      some idiot thought a BE version of a previously LE device (or vice
> >      versa) was a good idea
> >  
> >> and I think these functions are confusing -- if you try
> >> to use them to write tests for ARM devices, for instance,
> >> the _le versions of these functions will introduce an
> >> incorrect byteswap on a BE host, even though ARM CPUs and
> >> devices are typically all LE.  
> >
> > Huh?  How so?  
> 
> This was the result of my mis-reading of the code: I thought
> they were doing a byteswap and then calling the readl/readw/etc
> functions, not the readmem etc functions.
> 
> > If you're accessing an LE device (like PCI) you say "write_le" and
> > there are no byteswaps at all on an LE host, and a single byteswap on
> > a BE host, which is just what you want.  For a PCI device "write_le"
> > is the correct option regardless of both host and guest platform.
> >
> > That makes testcases straightforward to write.  As a bonus, it forces
> > the test author to think momentarily about what the right endianness
> > is for each access, which is a good habit to be in (not thinking about
> > this is why we've so often had endianness-broken drivers).
> >
> > Switching to an arbitrary "guest endianness" and back again makes the
> > test more complicated, and accomplishes no extra coverage (because all
> > the swaps under consideration are in the test code anyway).  It gains
> > nothing but confusion.  
> 
> I think I need to think about this a bit...
> 
> thanks
> -- PMM
>
Paolo Bonzini Oct. 6, 2016, 3:36 p.m. UTC | #26
On 06/10/2016 16:11, Greg Kurz wrote:
> FWIW, Cedric had another proposal which apparently went unnoticed:
> 
> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
> 
> The idea is to add an optional endianness argument to the read*/write*
> commands in the qtest protocol:
> - libqtest then provides explicit _le and _be APIs
> - no extra byteswap is performed on the test program side: qtest
>   actually handles that and does exactly 1 or 0 byteswap.
> - it does not use memread/memwrite
> - the current 'guest native' API where qtest tswaps is preserved
> 

No, this is a worse idea, because the right place to do the swap is in
the "program" (libqtest) not in the "CPU" (QEMU).

Paolo
Peter Maydell Oct. 6, 2016, 3:41 p.m. UTC | #27
On 6 October 2016 at 16:36, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 06/10/2016 16:11, Greg Kurz wrote:
>> FWIW, Cedric had another proposal which apparently went unnoticed:
>>
>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
>>
>> The idea is to add an optional endianness argument to the read*/write*
>> commands in the qtest protocol:
>> - libqtest then provides explicit _le and _be APIs
>> - no extra byteswap is performed on the test program side: qtest
>>   actually handles that and does exactly 1 or 0 byteswap.
>> - it does not use memread/memwrite
>> - the current 'guest native' API where qtest tswaps is preserved
>>
>
> No, this is a worse idea, because the right place to do the swap is in
> the "program" (libqtest) not in the "CPU" (QEMU).

Speaking of the right place to do things, perhaps we should
reimplement qtest_big_endian() in libqtest.c to send a query
to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says,
rather than hardcoding a big list of architectures...

thanks
-- PMM
Cédric Le Goater Oct. 6, 2016, 3:44 p.m. UTC | #28
On 10/06/2016 05:36 PM, Paolo Bonzini wrote:
> 
> 
> On 06/10/2016 16:11, Greg Kurz wrote:
>> FWIW, Cedric had another proposal which apparently went unnoticed:
>>
>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
>>
>> The idea is to add an optional endianness argument to the read*/write*
>> commands in the qtest protocol:
>> - libqtest then provides explicit _le and _be APIs
>> - no extra byteswap is performed on the test program side: qtest
>>   actually handles that and does exactly 1 or 0 byteswap.
>> - it does not use memread/memwrite
>> - the current 'guest native' API where qtest tswaps is preserved
>>
> 
> No, this is a worse idea, because the right place to do the swap is in
> the "program" (libqtest) not in the "CPU" (QEMU).

So the current patch, minus the typos, seems to be a "better" solution.

Thanks,

C.
Paolo Bonzini Oct. 6, 2016, 3:45 p.m. UTC | #29
On 06/10/2016 17:44, Cédric Le Goater wrote:
> On 10/06/2016 05:36 PM, Paolo Bonzini wrote:
>>
>>
>> On 06/10/2016 16:11, Greg Kurz wrote:
>>> FWIW, Cedric had another proposal which apparently went unnoticed:
>>>
>>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
>>>
>>> The idea is to add an optional endianness argument to the read*/write*
>>> commands in the qtest protocol:
>>> - libqtest then provides explicit _le and _be APIs
>>> - no extra byteswap is performed on the test program side: qtest
>>>   actually handles that and does exactly 1 or 0 byteswap.
>>> - it does not use memread/memwrite
>>> - the current 'guest native' API where qtest tswaps is preserved
>>>
>>
>> No, this is a worse idea, because the right place to do the swap is in
>> the "program" (libqtest) not in the "CPU" (QEMU).
> 
> So the current patch, minus the typos, seems to be a "better" solution.

Yes, at least I don't think it's a problem that it uses "memread" and
"memwrite".  It's just address_space_rw under the hood, and we know that
it does the right thing with MMIO addresses.

Paolo
Laurent Vivier Oct. 6, 2016, 3:59 p.m. UTC | #30
On 06/10/2016 17:41, Peter Maydell wrote:
> On 6 October 2016 at 16:36, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>> On 06/10/2016 16:11, Greg Kurz wrote:
>>> FWIW, Cedric had another proposal which apparently went unnoticed:
>>>
>>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
>>>
>>> The idea is to add an optional endianness argument to the read*/write*
>>> commands in the qtest protocol:
>>> - libqtest then provides explicit _le and _be APIs
>>> - no extra byteswap is performed on the test program side: qtest
>>>   actually handles that and does exactly 1 or 0 byteswap.
>>> - it does not use memread/memwrite
>>> - the current 'guest native' API where qtest tswaps is preserved
>>>
>>
>> No, this is a worse idea, because the right place to do the swap is in
>> the "program" (libqtest) not in the "CPU" (QEMU).
> 
> Speaking of the right place to do things, perhaps we should
> reimplement qtest_big_endian() in libqtest.c to send a query
> to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says,
> rather than hardcoding a big list of architectures...

Yes, it's a good idea.

I can do that...

Laurent
David Gibson Oct. 6, 2016, 11:09 p.m. UTC | #31
On Thu, Oct 06, 2016 at 11:47:36AM +0100, Peter Maydell wrote:
> On 6 October 2016 at 04:45, David Gibson <david@gibson.dropbear.id.au> wrote:
> > On Wed, Oct 05, 2016 at 07:20:52AM -0700, Peter Maydell wrote:
> >> On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote:
> >> > On 10/05/2016 03:53 PM, Peter Maydell wrote:
> >> >> Which tswap? Last time I worked through the stack of
> >> >> what happens I thought that we had the right set of
> >> >> swaps in the right places.
> >> >
> >> > The one I am talking about are under qtest_process_command(),
> >> > see below.
> >>
> >> Those are correct and required, and they do not change
> >> the overall behaviour of the system depending on the host
> >> endianness. (They convert 32-bit values to "bag of
> >> bytes in guest order" which is what the cpu_physical_memory_*
> >> functions want.)
> >
> > These functions are correct for the defined semantics of the
> > readw/readl operations, but those semantics are not useful.
> 
> ?? They're the most obvious and required semantics: "act
> like the CPU just did a word/halfword/byte read/write".

The CPU with what mode and options?

> There's a reason we've got this far without needing anything
> else, and it's that this is the most straightforward use case.

No, I'm pretty sure we've got this far because most of the tests
haven't yet been enabled for a traditionally BE target.

> > This proposal is introducing alternate functions with the more useful
> > semantics which are "convert a 32-bit value to a bag of bytes in LE
> > order" or "convert a 32-bit value to a bag of bytes in BE order"
> > depending on which variant you choose.
> 
> It's adding functions whose semantics are "act like the
> CPU wrote this value to some RAM and then memcpy()ed it to
> the device". I think devices whose usage model is "memcpy bytes
> to me" are rare to nonexistent.

Uh, that's not the intention.  I have some comments on this elsewhere.

The intended semantics are that we do a single atomic write to an
address, but with a specific endianness.
David Gibson Oct. 6, 2016, 11:31 p.m. UTC | #32
On Thu, Oct 06, 2016 at 12:03:34PM +0100, Peter Maydell wrote:
> On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote:
> > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:
> >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:
> >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
> >> >> The difficulty with this patch is that it's hard to tell whether
> >> >> it's really required, or if this is just adding an extra layer
> >> >> of byteswapping that should really be done in some other location
> >> >
> >> > Actually, it's neither.  It's not essential for anything, but it
> >> > *removes* an extra layer of byteswapping that really never should have
> >> > been done in the first place.
> >>
> >> The patch is very clearly adding calls to swapping functions.
> >> It looks like it's mostly convenience functions for not doing
> >> those swaps explicitly in the test cases.
> >
> > It's adding 1 swap on top of the memread/memwrite path - that's the
> > path which had no existing swaps (intended primarily for bag-o'-bytes
> > block access AFAICT).
> 
> Yeah, I hadn't noticed it was using the memread/memwrite path.
> I disagree with using that code path for what ought to be
> register read/writes (among other things, it's not clear to me
> that it guarantees that a 4-byte access by the test code is
> always a 4-byte access on the device, etc).

I was concerned about that when I first saw it as well.

But, I traced the path and both the memread() path and the readw/readl
path backend onto cpu_physical_memory_read() with the appropriate
width.  So if the proposed ops are broken in this way, so are
readw/readl.

If we're concerned that that won't always be the case we can still
implement the same operations, but instead of having them be just
wrappers on memread/memwrite, they'd be new primitives passed over the
pipe to the qtest accelerator.

> 
> >> >> in the stack. What's the actual test case here?
> >> >
> >> > The current readw, readl, etc. all work in "guest endianness".  But
> >> > guest endianness is not well defined - there are a number of targets
> >> > which can support either.
> >>
> >> It's guest bus endianness, and it's pretty well defined I think.
> >> (ARM for instance is LE bus even if the CPU is doing BE writes.)
> >
> > I don't see that guest bus endianness is any better defined, or any
> > more useful than "guest endianness".  It might have a vague meaning
> > for ARM (or embedded Power) in the sense that the on-SoC devices will
> > use that endianness.  But since the SoC devices are generally unique
> > to the architecture anyway, you still know their endianness
> > independent of any notion of guest endianness.
> 
> It means "the endianness that you would see if you could snoop
> the data bus at the output of the CPU".

Hrm.  I see two cases here, neither of them makes this clear:

1) The bus spec defines which data lines are MSB, and which are LSB.
In this case, the endianness depends on how those are mapped to byte
addresses when you get out to RAM - that could involve several
intermediate bridges.

2) The bus spec defines which data lines are byte0, 1, 2, etc (in byte
address order).  In this case it really is the CPU which determines
the endianness of its accesses - and that in turn could depend on
modes, TLB entries or instruction formss within the CPU.  So the
question is: the endianness you see if you snoop the bus when the CPU
does... what.. exactly?

> It is definitely well
> defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN
> tells you. It's not the same as whatever the device thinks it
> might interpret values as (though obviously people don't
> often design systems where they differ).

Honestly I think TAGET_WORDS_BIGENDIAN is just a hangover from when
nearly all CPUs worked exclusively in one endianness.  It's never
terribly well defined.

> >> >  And it's doubly meaningless since it's a
> >> > property of the guest cpu, which we're essentially replacing with the
> >> > qtest stub anyway.
> >>
> >> The stub sits on the same bus the guest cpu would.
> >>
> >> > Furthermore "guest endianness" isn't useful.  With a tiny handful of
> >> > exceptions, all peripherals have their own endianness which is known
> >> > independent of the target.  It makes more sense for test cases to
> >> > explicitly do their accesses in the correct endianness for the device,
> >> > without having to compensate for the fact that it'll be swapped into
> >> > the essentially arbitrary "guest endianness" along the way.
> >>
> >> Here I definitely disagree. I think it makes much more sense
> >> for writes to be "what the guest CPU would write", because that's
> >> where we're injecting them. If we had a test framework where the
> >> test was talking directly to the device, then maybe, but we don't.
> >
> > When I say that guest endianness is not well defined, what I mean is
> > precisely that "what the guest CPU would write" is not well defined.
> > For example on Power, the endianness of a given access will depend on:
> >     - For server chips, whether the CPU is in LE or BE mode
> >     - For embedded chips, the endianness flag in the TLB
> >     - For all chips, whether the plain or byte-reversed load/store
> >       instructions are used
> >
> > [There might even be variants with both a global mode flag and
> > per-page flags in the TLB, I'm not sure]
> 
> AFAIK, these things all take effect inside the guest CPU, and affect
> the value the guest eventually writes to the CPU bus.
> 
> > The model proposed here is that our testcases, on every access specify
> > a specific final-result endianness.  The alternative is that every
> > write from qtest has to do:
> >     1) Start with a value (host endian)
> >     2) Conditional swap for host endian vs. device endian difference
> >     3) Conditional swap for host endian vs. "guest endian" difference
> >        simply to compensate for..
> >     4) [Existing tswap() within writew/writel]
> >        Conditional swap for host endian vs. "guest endian difference
> >
> >> > Basically, the existing byteswaps, instead of removing the need for
> >> > them in the testcase code, instead mean that target-conditional
> >> > byteswaps will be *required* in nearly every testcase.  It's a recipe
> >> > for endianness-unsafe testcases.
> >>
> >> I prefer the swaps to be explicit in the test cases. If your
> >
> > The are explicit in the testcases, in the sense that on every read or
> > write you say this is an LE access or a BE access.  Potentially we
> > could have a "guest native" access option as well, but I think that's
> > useful
> >
> >> test case running on the real CPU would have had to do
> >> "swap to LE and then write this word", so does the test
> >> case in our test framework. If your test case just does
> >> "write the word", then so does the test.
> >>
> >> Most devices IME do not require byteswaps by guest code,
> >
> > What!?  So speaks the person working on a historically LE platform.
> 
> Well, platforms where the devices and the CPU agree about
> endianness.

Which means historically LE platforms, since nearly all commodity,
cross platform hardware is LE.

> > In the kernel, if a driver isn't littered with cpu_to_leXX() it's
> > almost certainly broken on a BE platform.
> >
> > The only devices I can think of which don't have a fixed endianness
> > regardless of platform are:
> >    * legacy virtio: this was an insane design choice, which is why it
> >      went away in virtio-1.0
> >    * graphics card framebuffers, which can usually be configured
> >      either way (or have both BE and LE views of the framebuffer)
> >    * a handful of simple devices, usually from a while back, where
> >      some idiot thought a BE version of a previously LE device (or vice
> >      versa) was a good idea
> >
> >> and I think these functions are confusing -- if you try
> >> to use them to write tests for ARM devices, for instance,
> >> the _le versions of these functions will introduce an
> >> incorrect byteswap on a BE host, even though ARM CPUs and
> >> devices are typically all LE.
> >
> > Huh?  How so?
> 
> This was the result of my mis-reading of the code: I thought
> they were doing a byteswap and then calling the readl/readw/etc
> functions, not the readmem etc functions.

Ah, ok, yes.  That would certainly have been wrong.

> > If you're accessing an LE device (like PCI) you say "write_le" and
> > there are no byteswaps at all on an LE host, and a single byteswap on
> > a BE host, which is just what you want.  For a PCI device "write_le"
> > is the correct option regardless of both host and guest platform.
> >
> > That makes testcases straightforward to write.  As a bonus, it forces
> > the test author to think momentarily about what the right endianness
> > is for each access, which is a good habit to be in (not thinking about
> > this is why we've so often had endianness-broken drivers).
> >
> > Switching to an arbitrary "guest endianness" and back again makes the
> > test more complicated, and accomplishes no extra coverage (because all
> > the swaps under consideration are in the test code anyway).  It gains
> > nothing but confusion.
> 
> I think I need to think about this a bit...
> 
> thanks
> -- PMM
>
David Gibson Oct. 6, 2016, 11:31 p.m. UTC | #33
On Thu, Oct 06, 2016 at 04:11:42PM +0200, Greg Kurz wrote:
> On Thu, 6 Oct 2016 12:03:34 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> > On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote:
> > > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:  
> > >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote:  
> > >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:  
> > >> >> The difficulty with this patch is that it's hard to tell whether
> > >> >> it's really required, or if this is just adding an extra layer
> > >> >> of byteswapping that should really be done in some other location  
> > >> >
> > >> > Actually, it's neither.  It's not essential for anything, but it
> > >> > *removes* an extra layer of byteswapping that really never should have
> > >> > been done in the first place.  
> > >>
> > >> The patch is very clearly adding calls to swapping functions.
> > >> It looks like it's mostly convenience functions for not doing
> > >> those swaps explicitly in the test cases.  
> > >
> > > It's adding 1 swap on top of the memread/memwrite path - that's the
> > > path which had no existing swaps (intended primarily for bag-o'-bytes
> > > block access AFAICT).  
> > 
> > Yeah, I hadn't noticed it was using the memread/memwrite path.
> > I disagree with using that code path for what ought to be
> > register read/writes (among other things, it's not clear to me
> > that it guarantees that a 4-byte access by the test code is
> > always a 4-byte access on the device, etc).
> > 
> 
> FWIW, Cedric had another proposal which apparently went unnoticed:
> 
> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
> 
> The idea is to add an optional endianness argument to the read*/write*
> commands in the qtest protocol:
> - libqtest then provides explicit _le and _be APIs
> - no extra byteswap is performed on the test program side: qtest
>   actually handles that and does exactly 1 or 0 byteswap.
> - it does not use memread/memwrite
> - the current 'guest native' API where qtest tswaps is preserved

That would also be fine by me.
David Gibson Oct. 6, 2016, 11:34 p.m. UTC | #34
On Thu, Oct 06, 2016 at 05:59:00PM +0200, Laurent Vivier wrote:
> 
> 
> On 06/10/2016 17:41, Peter Maydell wrote:
> > On 6 October 2016 at 16:36, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >>
> >> On 06/10/2016 16:11, Greg Kurz wrote:
> >>> FWIW, Cedric had another proposal which apparently went unnoticed:
> >>>
> >>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
> >>>
> >>> The idea is to add an optional endianness argument to the read*/write*
> >>> commands in the qtest protocol:
> >>> - libqtest then provides explicit _le and _be APIs
> >>> - no extra byteswap is performed on the test program side: qtest
> >>>   actually handles that and does exactly 1 or 0 byteswap.
> >>> - it does not use memread/memwrite
> >>> - the current 'guest native' API where qtest tswaps is preserved
> >>>
> >>
> >> No, this is a worse idea, because the right place to do the swap is in
> >> the "program" (libqtest) not in the "CPU" (QEMU).
> > 
> > Speaking of the right place to do things, perhaps we should
> > reimplement qtest_big_endian() in libqtest.c to send a query
> > to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says,
> > rather than hardcoding a big list of architectures...
> 
> Yes, it's a good idea.

I disagree.

TARGET_BIG_ENDIAN is simply not well defined - we should avoid using
it at all.
David Gibson Oct. 6, 2016, 11:43 p.m. UTC | #35
On Thu, Oct 06, 2016 at 05:36:32PM +0200, Paolo Bonzini wrote:
> 
> 
> On 06/10/2016 16:11, Greg Kurz wrote:
> > FWIW, Cedric had another proposal which apparently went unnoticed:
> > 
> > <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org>
> > 
> > The idea is to add an optional endianness argument to the read*/write*
> > commands in the qtest protocol:
> > - libqtest then provides explicit _le and _be APIs
> > - no extra byteswap is performed on the test program side: qtest
> >   actually handles that and does exactly 1 or 0 byteswap.
> > - it does not use memread/memwrite
> > - the current 'guest native' API where qtest tswaps is preserved
> > 
> 
> No, this is a worse idea, because the right place to do the swap is in
> the "program" (libqtest) not in the "CPU" (QEMU).

Hrm.. I guess that makes sense from an x86 perspective when
load/stores always operate in LE.  Not so much for something like
Power where the CPU can perform both LE and BE load/stores trivially.
You can select with CPU mode combined with which instruction form you
use.  e.g. the always-LE writel() on a BE Power kernel is a single
byte-reversed store instruction[0].  there's no "swap" as such, and the
swapped value never appears in a register.  I'm not certain if gcc is
smart enough to translate foo->bar = cpu_to_le32(val) into a
byte-reversed store, but it might be.

The value passed across the pipe to readw etc. is text, so it has no
endianness, just as a value in a cpu register has no endianness.  To
me it makes perfect sense to tell the qtest "cpu" which endianness of
load/store you want it to do with that.

[0] Well, ok, there's a memory barrier too, so it's not quite 1
instruction.
Laurent Vivier Oct. 7, 2016, 7:44 a.m. UTC | #36
On 07/10/2016 01:34, David Gibson wrote:

>> On 06/10/2016 17:41, Peter Maydell wrote:
>>> Speaking of the right place to do things, perhaps we should
>>> reimplement qtest_big_endian() in libqtest.c to send a query
>>> to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says,
>>> rather than hardcoding a big list of architectures...
>>
>> Yes, it's a good idea.
> 
> I disagree.
> 
> TARGET_BIG_ENDIAN is simply not well defined - we should avoid using
> it at all.

I think it is at least better to send a query to ask the endianness of
the guest instead of using some guessing on the test application side.

For the moment, we can use TARGET_WORDS_BIGENDIAN, because this is
simple and its purpose. TARGET_WORDS_BIGENDIAN defines the default
endianness of the guest.

But I don't think we can't use something else: in the case of POWER, for
instance, the endianness can be changed, but the processor is always big
endian by default at reset, and in the case of qtest, the processor is
never started (or in some special cases, never exits from the firmware,
which always runs in big endian mode), so the endianness will always
stay to the one given by TARGET_WORDS_BIGENDIAN.

Thanks,
Laurent
Peter Maydell Oct. 7, 2016, 9:52 a.m. UTC | #37
On 7 October 2016 at 00:31, David Gibson <david@gibson.dropbear.id.au> wrote:
> Honestly I think TAGET_WORDS_BIGENDIAN is just a hangover from when
> nearly all CPUs worked exclusively in one endianness.  It's never
> terribly well defined.

Feel free to try to get rid of it, but it's baked into the
KVM ABI as a concept and also important for TCG efficiency.
It works fine both for "always one endianness" systems and
for "programmably variable" CPUs like ARM.

thanks
-- PMM
diff mbox

Patch

Index: qemu-aspeed.git/tests/libqtest.h
===================================================================
--- qemu-aspeed.git.orig/tests/libqtest.h
+++ qemu-aspeed.git/tests/libqtest.h
@@ -887,4 +887,75 @@  void qmp_fd_send(int fd, const char *fmt
 QDict *qmp_fdv(int fd, const char *fmt, va_list ap);
 QDict *qmp_fd(int fd, const char *fmt, ...);
 
+/*
+ * BE/LE read/write accessors
+ */
+uint16_t qtest_readw_be(QTestState *s, uint64_t addr);
+uint32_t qtest_readl_be(QTestState *s, uint64_t addr);
+uint64_t qtest_readq_be(QTestState *s, uint64_t addr);
+
+static inline uint16_t readw_be(uint64_t addr)
+{
+    return qtest_readw_be(global_qtest, addr);
+}
+static inline uint32_t readl_be(uint64_t addr)
+{
+    return qtest_readl_be(global_qtest, addr);
+}
+static inline uint64_t readq_be(uint64_t addr)
+{
+    return qtest_readq_be(global_qtest, addr);
+}
+
+void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value);
+void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value);
+void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value);
+
+static inline void writew_be(uint64_t addr, uint16_t value)
+{
+    qtest_writew_be(global_qtest, addr, value);
+}
+static inline void writel_be(uint64_t addr, uint32_t value)
+{
+    qtest_writel_be(global_qtest, addr, value);
+}
+static inline void writeq_be(uint64_t addr, uint64_t value)
+{
+    qtest_writeq_be(global_qtest, addr, value);
+}
+
+uint16_t qtest_readw_le(QTestState *s, uint64_t addr);
+uint32_t qtest_readl_le(QTestState *s, uint64_t addr);
+uint64_t qtest_readq_le(QTestState *s, uint64_t addr);
+
+static inline uint16_t readw_le(uint64_t addr)
+{
+    return qtest_readw_le(global_qtest, addr);
+}
+static inline uint32_t readl_le(uint64_t addr)
+{
+    return qtest_readl_le(global_qtest, addr);
+}
+static inline uint64_t readq_le(uint64_t addr)
+{
+    return qtest_readq_le(global_qtest, addr);
+}
+
+void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value);
+void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value);
+void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value);
+
+static inline void writew_le(uint64_t addr, uint16_t value)
+{
+    qtest_writew_le(global_qtest, addr, value);
+}
+static inline void writel_le(uint64_t addr, uint32_t value)
+{
+    qtest_writel_le(global_qtest, addr, value);
+}
+static inline void writeq_le(uint64_t addr, uint64_t value)
+{
+    qtest_writeq_le(global_qtest, addr, value);
+}
+
 #endif
Index: qemu-aspeed.git/tests/libqtest.c
===================================================================
--- qemu-aspeed.git.orig/tests/libqtest.c
+++ qemu-aspeed.git/tests/libqtest.c
@@ -15,6 +15,7 @@ 
  *
  */
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
 #include "libqtest.h"
 
 #include <sys/socket.h>
@@ -688,6 +689,42 @@  void qtest_writeq(QTestState *s, uint64_
     qtest_write(s, "writeq", addr, value);
 }
 
+void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value)
+{
+    value = cpu_to_be16(value);
+    qtest_memread(s, addr, &value, sizeof(value));
+}
+
+void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value)
+{
+    value = cpu_to_be32(value);
+    qtest_memread(s, addr, &value, sizeof(value));
+}
+
+void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value)
+{
+    value = cpu_to_be64(value);
+    qtest_memread(s, addr, &value, sizeof(value));
+}
+
+void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value)
+{
+    value = cpu_to_le16(value);
+    qtest_memread(s, addr, &value, sizeof(value));
+}
+
+void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value)
+{
+    value = cpu_to_le32(value);
+    qtest_memread(s, addr, &value, sizeof(value));
+}
+
+void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value)
+{
+    value = cpu_to_le64(value);
+    qtest_memread(s, addr, &value, sizeof(value));
+}
+
 static uint64_t qtest_read(QTestState *s, const char *cmd, uint64_t addr)
 {
     gchar **args;
@@ -721,6 +758,60 @@  uint64_t qtest_readq(QTestState *s, uint
     return qtest_read(s, "readq", addr);
 }
 
+uint16_t qtest_readw_be(QTestState *s, uint64_t addr)
+{
+    uint16_t value;
+
+    qtest_memread(s, addr, &value, sizeof(value));
+
+    return be16_to_cpu(value);
+}
+
+uint32_t qtest_readl_be(QTestState *s, uint64_t addr)
+{
+    uint32_t value;
+
+    qtest_memread(s, addr, &value, sizeof(value));
+
+    return be32_to_cpu(value);
+}
+
+uint64_t qtest_readq_be(QTestState *s, uint64_t addr)
+{
+    uint64_t value;
+
+    qtest_memread(s, addr, &value, sizeof(value));
+
+    return be64_to_cpu(value);
+}
+
+uint16_t qtest_readw_le(QTestState *s, uint64_t addr)
+{
+    uint16_t value;
+
+    qtest_memread(s, addr, &value, sizeof(value));
+
+    return le16_to_cpu(value);
+}
+
+uint32_t qtest_readl_le(QTestState *s, uint64_t addr)
+{
+    uint32_t value;
+
+    qtest_memread(s, addr, &value, sizeof(value));
+
+    return le32_to_cpu(value);
+}
+
+uint64_t qtest_readq_le(QTestState *s, uint64_t addr)
+{
+    uint64_t value;
+
+    qtest_memread(s, addr, &value, sizeof(value));
+
+    return le64_to_cpu(value);
+}
+
 static int hex2nib(char ch)
 {
     if (ch >= '0' && ch <= '9') {