mbox series

[v2,0/7] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order

Message ID 20190307180520.13868-1-mark.cave-ayland@ilande.co.uk
Headers show
Series target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order | expand

Message

Mark Cave-Ayland March 7, 2019, 6:05 p.m. UTC
After some investigation into Andrew's report of corruption in his ppc64le
tests at https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg07234.html, I
discovered the underlying cause was that the first 32 VSX registers are not
stored in host endian order.

This is something that Richard and I had discussed before, but missed that with
VSX if you have source registers from different register sets then even logical
operations will give you the wrong result.

Rather than revert 7b8fe477e1 "target/ppc: convert VSX logical operations to
vector operations" let's keep the use of the accelerated vector instructions,
and instead fix the real problem which is to switch the first 32 VSX registers
to host endian order matching the VMX registers.

Patches 1-5 aim to consolidate the offset calculations for both CPUPPCState
and the associated _ptr() functions into one single place.

With this preliminary work complete, patch 6 switches the first 32 registers
into host endian order without too much difficulty.

Finally now that all VSX registers are stored in the same way, the vsr offset
functions and get_cpu_vsrh()/get_cpu_vsrl() can be simplified accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

v2:
- Rebase onto master
- Rework patchset set based upon av64_offset()/vsr64_offset() as suggested by
  Richard, rather than using separate low/high accessors


Mark Cave-Ayland (7):
  target/ppc: introduce single fpr_offset() function
  target/ppc: introduce single vsrl_offset() function
  target/ppc: move Vsr* macros from internal.h to cpu.h
  target/ppc: introduce avr_full_offset() function
  target/ppc: improve avr64_offset() and use it to simplify
    get_avr64()/set_avr64()
  target/ppc: switch fpr/vsrl registers so all VSX registers are in host
    endian order
  target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l,h}()
    and set_cpu_vsr{l,h}()

 target/ppc/cpu.h                    | 51 ++++++++++++++++++++++++++++++++++---
 target/ppc/internal.h               | 27 +++-----------------
 target/ppc/machine.c                |  8 +++---
 target/ppc/translate.c              | 20 +++------------
 target/ppc/translate/vmx-impl.inc.c | 27 ++++++++------------
 target/ppc/translate/vsx-impl.inc.c | 39 +++-------------------------
 6 files changed, 75 insertions(+), 97 deletions(-)

Comments

Richard Henderson March 7, 2019, 9 p.m. UTC | #1
On 3/7/19 10:05 AM, Mark Cave-Ayland wrote:
> Finally now that all VSX registers are stored in the same way, the vsr offset
> functions and get_cpu_vsrh()/get_cpu_vsrl() can be simplified accordingly.

For the todo list for 4.1 should be getting rid of getVSR and setVSR.


r`
Mark Cave-Ayland March 7, 2019, 9:27 p.m. UTC | #2
On 07/03/2019 21:00, Richard Henderson wrote:

> On 3/7/19 10:05 AM, Mark Cave-Ayland wrote:
>> Finally now that all VSX registers are stored in the same way, the vsr offset
>> functions and get_cpu_vsrh()/get_cpu_vsrl() can be simplified accordingly.
> 
> For the todo list for 4.1 should be getting rid of getVSR and setVSR.

That's something that certainly crossed my mind, although I was conscious that freeze
was coming up and wanted to get the fix for the existing VSX vector ops merged. I
think it should be a simple mechanical change, but I'm not real set up right now for
much in the way of ppc64 testing.


ATB,

Mark.
David Gibson March 7, 2019, 11:38 p.m. UTC | #3
On Thu, Mar 07, 2019 at 06:05:13PM +0000, Mark Cave-Ayland wrote:
> After some investigation into Andrew's report of corruption in his ppc64le
> tests at https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg07234.html, I
> discovered the underlying cause was that the first 32 VSX registers are not
> stored in host endian order.
> 
> This is something that Richard and I had discussed before, but missed that with
> VSX if you have source registers from different register sets then even logical
> operations will give you the wrong result.
> 
> Rather than revert 7b8fe477e1 "target/ppc: convert VSX logical operations to
> vector operations" let's keep the use of the accelerated vector instructions,
> and instead fix the real problem which is to switch the first 32 VSX registers
> to host endian order matching the VMX registers.
> 
> Patches 1-5 aim to consolidate the offset calculations for both CPUPPCState
> and the associated _ptr() functions into one single place.
> 
> With this preliminary work complete, patch 6 switches the first 32 registers
> into host endian order without too much difficulty.
> 
> Finally now that all VSX registers are stored in the same way, the vsr offset
> functions and get_cpu_vsrh()/get_cpu_vsrl() can be simplified accordingly.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Series applied to ppc-for-4.0, thanks.

> 
> v2:
> - Rebase onto master
> - Rework patchset set based upon av64_offset()/vsr64_offset() as suggested by
>   Richard, rather than using separate low/high accessors
> 
> 
> Mark Cave-Ayland (7):
>   target/ppc: introduce single fpr_offset() function
>   target/ppc: introduce single vsrl_offset() function
>   target/ppc: move Vsr* macros from internal.h to cpu.h
>   target/ppc: introduce avr_full_offset() function
>   target/ppc: improve avr64_offset() and use it to simplify
>     get_avr64()/set_avr64()
>   target/ppc: switch fpr/vsrl registers so all VSX registers are in host
>     endian order
>   target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l,h}()
>     and set_cpu_vsr{l,h}()
> 
>  target/ppc/cpu.h                    | 51 ++++++++++++++++++++++++++++++++++---
>  target/ppc/internal.h               | 27 +++-----------------
>  target/ppc/machine.c                |  8 +++---
>  target/ppc/translate.c              | 20 +++------------
>  target/ppc/translate/vmx-impl.inc.c | 27 ++++++++------------
>  target/ppc/translate/vsx-impl.inc.c | 39 +++-------------------------
>  6 files changed, 75 insertions(+), 97 deletions(-)
>