
target-mips: apply workaround for TCG optimizations for MFC1

Message ID 1436891912-14742-1-git-send-email-leon.alrae@imgtec.com
State New

Commit Message

Leon Alrae July 14, 2015, 4:38 p.m. UTC
There seems to be an issue when trying to keep a pointer in the bottom
32 bits of a 64-bit floating point register. Load and store instructions
accessing this address for some reason use the whole 64-bit content of
the floating point register rather than the truncated 32-bit value. The
following load uses an incorrect address, which leads to a crash if the
upper 32 bits of $f0 aren't 0:

0x00400c60:  mfc1       t8,$f0
0x00400c64:  lw t9,0(t8)

It can be reproduced with the following Linux userland program when
running on a MIPS32 CPU with CP0.Status.FR=1 (by default the
mips32r5-generic and mips32r6-generic CPUs have this bit set in
linux-user).

int main(int argc, char *argv[])
{
    int tmp = 0x11111111;
    /* Set f0 */
    __asm__ ("mtc1  %0, $f0\n"
             "mthc1 %1, $f0\n"
             : : "r" (&tmp), "r" (tmp));
    /* At this point $f0: w:76fff040 d:1111111176fff040 */
    __asm__ ("mfc1 $t8, $f0\n"
             "lw   $t9, 0($t8)\n"); /* <--- crash! */
    return 0;
}

Running the above program in normal (non-singlestep) mode leads to:

Program received signal SIGSEGV, Segmentation fault.
0x00005555559f6f37 in static_code_gen_buffer ()
(gdb) x/i 0x00005555559f6f37
=> 0x5555559f6f37 <static_code_gen_buffer+78359>:       mov    %gs:0x0(%rbp),%ebp
(gdb) info registers rbp
rbp            0x1111111176fff040       0x1111111176fff040

The program runs fine in singlestep mode, or with disabled TCG
optimizations. Also, I'm not able to reproduce it in system emulation.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
---
I had been investigating this some time ago, but had to move to other
things and haven't managed to get back to it. Now, since the 2.4 release
is relatively close, I think a workaround is better than nothing
(apparently some MIPS32R6 compilers may keep a pointer in a floating
point register, which exposes this problem in QEMU). Ideas and comments
are welcome.

More dumps if anyone is interested (I isolated the TB for these two
instructions by stopping translation after mthc1 and lw):

IN: main
0x00400c60:  mfc1	t8,$f0
0x00400c64:  lw	t9,0(t8)

OP:
 ld_i32 tmp0,env,$0xfffffffffffffffc
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$L0

 ---- 0x400c60
 mov_i32 tmp1,w0.d0
 mov_i32 tmp0,tmp1
 mov_i32 t8,tmp0

 ---- 0x400c64
 mov_i32 tmp0,t8
 qemu_ld_i32 tmp0,tmp0,un+leul,2
 mov_i32 t9,tmp0
 goto_tb $0x0
 movi_i32 PC,$0x400c68
 exit_tb $0x7ffff35d5d30
 set_label $L0
 exit_tb $0x7ffff35d5d33

OP after optimization and liveness analysis:
 ld_i32 tmp0,env,$0xfffffffffffffffc
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$L0

 ---- 0x400c60
 mov_i32 tmp1,w0.d0
 mov_i32 tmp0,tmp1
 mov_i32 t8,tmp0

 ---- 0x400c64
 qemu_ld_i32 tmp0,t8,un+leul,2
 mov_i32 t9,tmp0
 goto_tb $0x0
 movi_i32 PC,$0x400c68
 exit_tb $0x7ffff35d5d30
 set_label $L0
 exit_tb $0x7ffff35d5d33

OUT: [size=78]
0x5555559f6f20:  mov    -0x4(%r14),%ebp
0x5555559f6f24:  test   %ebp,%ebp
0x5555559f6f26:  jne    0x5555559f6f5f
0x5555559f6f2c:  mov    0xe8(%r14),%rbp
0x5555559f6f33:  mov    %ebp,0x60(%r14)
0x5555559f6f37:  mov    %gs:0x0(%rbp),%ebp
0x5555559f6f3b:  mov    %ebp,0x64(%r14)
0x5555559f6f3f:  jmpq   0x5555559f6f44
0x5555559f6f44:  mov    $0x400c68,%ebp
0x5555559f6f49:  mov    %ebp,0x80(%r14)
0x5555559f6f50:  mov    $0x7ffff35d5d30,%rax
0x5555559f6f5a:  jmpq   0x5555579e3936
0x5555559f6f5f:  mov    $0x7ffff35d5d33,%rax
0x5555559f6f69:  jmpq   0x5555579e3936
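
(In this host dump, the mov at 0x5555559f6f2c is the REX.W load that
pulls the full 64-bit contents of $f0 from env into %rbp, and the mov at
0x5555559f6f37 then dereferences that un-truncated value as the guest
address; this is exactly the instruction the SIGSEGV backtrace above
points at, with %rbp still holding 0x1111111176fff040.)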

---
 target-mips/translate.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Aurelien Jarno July 14, 2015, 5:09 p.m. UTC | #1
On 2015-07-14 17:38, Leon Alrae wrote:
> There seems to be an issue when trying to keep a pointer in bottom 32-bits
> of a 64-bit floating point register. Load and store instructions accessing
> this address for some reason use the whole 64-bit content of floating point
> register rather than truncated 32-bit value. The following load uses
> incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
> 
> 0x00400c60:  mfc1       t8,$f0
> 0x00400c64:  lw t9,0(t8)
> 
> It can be reproduced with the following linux userland program when running
> on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
> mips32r6-generic CPUs have this bit set in linux-user).
> 
> int main(int argc, char *argv[])
> {
>     int tmp = 0x11111111;
>     /* Set f0 */
>     __asm__ ("mtc1  %0, $f0\n"
>              "mthc1 %1, $f0\n"
>              : : "r" (&tmp), "r" (tmp));
>     /* At this point $f0: w:76fff040 d:1111111176fff040 */
>     __asm__ ("mfc1 $t8, $f0\n"
>              "lw   $t9, 0($t8)\n"); /* <--- crash! */
>     return 0;
> }
> 
> Running above program in normal (non-singlestep mode) leads to:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x00005555559f6f37 in static_code_gen_buffer ()
> (gdb) x/i 0x00005555559f6f37
> => 0x5555559f6f37 <static_code_gen_buffer+78359>:       mov    %gs:0x0(%rbp),%ebp
> (gdb) info registers rbp
> rbp            0x1111111176fff040       0x1111111176fff040
> 
> The program runs fine in singlestep mode, or with disabled TCG
> optimizations. Also, I'm not able to reproduce it in system emulation.

I am able to reproduce the problem, but for me disabling the
optimizations doesn't help. That said, the problem is just another issue
with the "let's assume the target supports moves between 32 and 64-bit
registers" approach. At some point we should add a paragraph to
tcg/README to define how to handle 32 vs 64-bit registers and what the
TCG targets should expect. We had to add special code to handle that for
sparc (the trunc_shr_i32 instruction), but also code to the optimizer to
remember about "garbage" high bits. I am not sure anyone has a global
view of how all this code interacts.

In this precise case the problem seems to be related to the following code
in tcg/i386/tcg-target.c:

|         /* ??? We assume all operations have left us with register contents
|            that are zero extended.  So far this appears to be true.  If we
|            want to enforce this, we can either do an explicit zero-extension
|            here, or (if GUEST_BASE == 0, or a segment register is in use)
|            use the ADDR32 prefix.  For now, do nothing.  */
|         if (GUEST_BASE && guest_base_flags) {
|             seg = guest_base_flags;
|             offset = 0;
|         } else if (TCG_TARGET_REG_BITS == 64 && offset != GUEST_BASE) {
|             tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_L1, GUEST_BASE);
|             tgen_arithr(s, ARITH_ADD + P_REXW, TCG_REG_L1, base);
|             base = TCG_REG_L1;
|             offset = 0;
|         }

I guess we are still in time for 2.4 to fix this, but in case it's not
possible we can apply your patch.
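
To visualise the failure mode outside of QEMU, a small standalone C
sketch shows what goes wrong when the guest address is not truncated
before being added to the guest base (this is only an illustration, not
QEMU code; guest_base and the register value are invented):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t guest_base = 0x7f0000000000ULL;      /* made-up base */
    uint64_t reg = 0x1111111176fff040ULL;         /* garbage high bits + guest pointer */

    uint64_t wrong = guest_base + reg;            /* what the generated code computes */
    uint64_t right = guest_base + (uint32_t)reg;  /* with an explicit zero-extension */

    printf("without truncation: 0x%016" PRIx64 "\n", wrong);
    printf("with truncation:    0x%016" PRIx64 "\n", right);
    return 0;
}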
Paolo Bonzini July 14, 2015, 6:20 p.m. UTC | #2
On 14/07/2015 19:09, Aurelien Jarno wrote:
> On 2015-07-14 17:38, Leon Alrae wrote:
>> There seems to be an issue when trying to keep a pointer in bottom 32-bits
>> of a 64-bit floating point register. Load and store instructions accessing
>> this address for some reason use the whole 64-bit content of floating point
>> register rather than truncated 32-bit value. The following load uses
>> incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
>>
>> 0x00400c60:  mfc1       t8,$f0
>> 0x00400c64:  lw t9,0(t8)
>>
>> It can be reproduced with the following linux userland program when running
>> on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
>> mips32r6-generic CPUs have this bit set in linux-user).
>>
>> int main(int argc, char *argv[])
>> {
>>     int tmp = 0x11111111;
>>     /* Set f0 */
>>     __asm__ ("mtc1  %0, $f0\n"
>>              "mthc1 %1, $f0\n"
>>              : : "r" (&tmp), "r" (tmp));
>>     /* At this point $f0: w:76fff040 d:1111111176fff040 */
>>     __asm__ ("mfc1 $t8, $f0\n"
>>              "lw   $t9, 0($t8)\n"); /* <--- crash! */
>>     return 0;
>> }
>>
>> Running above program in normal (non-singlestep mode) leads to:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00005555559f6f37 in static_code_gen_buffer ()
>> (gdb) x/i 0x00005555559f6f37
>> => 0x5555559f6f37 <static_code_gen_buffer+78359>:       mov    %gs:0x0(%rbp),%ebp
>> (gdb) info registers rbp
>> rbp            0x1111111176fff040       0x1111111176fff040
>>
>> The program runs fine in singlestep mode, or with disabled TCG
>> optimizations. Also, I'm not able to reproduce it in system emulation.
> 
> I am able to reproduce the problem, but for me disabling the
> optimizations doesn't help. That said the problem is just another issue
> with the "let's assume the target supports move between 32 and 64 bit
> registers". At some point we should add a paragraph to tcg/README, to
> define how handle 32 vs 64 bit registers and what the TCG targets should
> expect. We had to add special code to handle that for sparc
> (trunc_shr_i32 instruction), but also code to the optimizer to remember
> about "garbage" high bits. I am not sure someone has a global view about
> how all this code interacts.

I certainly don't have a global view, so much so that I didn't think at 
all of the optimizer... Instead, it looks to me like a bug in the 
register allocator.  In particular, this code in tcg_reg_alloc_mov:

        if (IS_DEAD_ARG(1) && !ts->fixed_reg && !ots->fixed_reg) {
            /* the mov can be suppressed */
            if (ots->val_type == TEMP_VAL_REG) {
                s->reg_to_temp[ots->reg] = -1;
            }
            ots->reg = ts->reg;
            temp_dead(s, args[1]);
        }

is not covering the "itype != otype" case.  In addition, the 
IS_DEAD_ARG(1) case can be covered above in the

    if (((NEED_SYNC_ARG(0) || ots->fixed_reg) && ts->val_type != TEMP_VAL_REG)
        || ts->val_type == TEMP_VAL_MEM) {

conditional: in this case there's no need at all to go through
itype, and it's possible to load directly into ots.

Paolo
Aurelien Jarno July 14, 2015, 6:37 p.m. UTC | #3
On 2015-07-14 20:20, Paolo Bonzini wrote:
> 
> 
> On 14/07/2015 19:09, Aurelien Jarno wrote:
> > On 2015-07-14 17:38, Leon Alrae wrote:
> >> There seems to be an issue when trying to keep a pointer in bottom 32-bits
> >> of a 64-bit floating point register. Load and store instructions accessing
> >> this address for some reason use the whole 64-bit content of floating point
> >> register rather than truncated 32-bit value. The following load uses
> >> incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
> >>
> >> 0x00400c60:  mfc1       t8,$f0
> >> 0x00400c64:  lw t9,0(t8)
> >>
> >> It can be reproduced with the following linux userland program when running
> >> on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
> >> mips32r6-generic CPUs have this bit set in linux-user).
> >>
> >> int main(int argc, char *argv[])
> >> {
> >>     int tmp = 0x11111111;
> >>     /* Set f0 */
> >>     __asm__ ("mtc1  %0, $f0\n"
> >>              "mthc1 %1, $f0\n"
> >>              : : "r" (&tmp), "r" (tmp));
> >>     /* At this point $f0: w:76fff040 d:1111111176fff040 */
> >>     __asm__ ("mfc1 $t8, $f0\n"
> >>              "lw   $t9, 0($t8)\n"); /* <--- crash! */
> >>     return 0;
> >> }
> >>
> >> Running above program in normal (non-singlestep mode) leads to:
> >>
> >> Program received signal SIGSEGV, Segmentation fault.
> >> 0x00005555559f6f37 in static_code_gen_buffer ()
> >> (gdb) x/i 0x00005555559f6f37
> >> => 0x5555559f6f37 <static_code_gen_buffer+78359>:       mov    %gs:0x0(%rbp),%ebp
> >> (gdb) info registers rbp
> >> rbp            0x1111111176fff040       0x1111111176fff040
> >>
> >> The program runs fine in singlestep mode, or with disabled TCG
> >> optimizations. Also, I'm not able to reproduce it in system emulation.
> > 
> > I am able to reproduce the problem, but for me disabling the
> > optimizations doesn't help. That said the problem is just another issue
> > with the "let's assume the target supports move between 32 and 64 bit
> > registers". At some point we should add a paragraph to tcg/README, to
> > define how handle 32 vs 64 bit registers and what the TCG targets should
> > expect. We had to add special code to handle that for sparc
> > (trunc_shr_i32 instruction), but also code to the optimizer to remember
> > about "garbage" high bits. I am not sure someone has a global view about
> > how all this code interacts.
> 
> I certainly don't have a global view, so much that I didn't think at 
> all of the optimizer... Instead, it looks to me like a bug in the 
> register allocator.  In particular this code in tcg_reg_alloc_mov:

That's exactly my point when I said that nobody has a global view. I
think the fact that we don't check for the type when simplifying moves
in the register allocator is intentional, the same way we simply
transform the trunc op into a mov op (except on sparc). This is done
because it's not needed, for example, on x86 and most other
architectures, given that 32-bit instructions do not care about the high
part of the registers.

Basically, the size-changing ops are trunc_i64_i32, ext_i32_i64 and
extu_i32_i64. We can be conservative and implement all of them as real
instructions in all TCG backends. In that case the mov op never has
to deal with registers of different sizes (just like we enforce at
the TCG frontend level), and the register allocator and the optimizer
do not have to deal with this. However, that's suboptimal on some
architectures, which is why on x86 we decided to just replace
trunc_i64_i32 by a move. But if we do this simplification it should be
done everywhere (in that case, including in the qemu_ld op). And
DOCUMENTED somewhere, given that different choices can be made for
different backends.

As for the optimizer, its goal is to predict the value of the registers
by constant folding. It should be seen as another CPU, with its own
rules. For example, TCG internally stores 32-bit constants
sign-extended. The optimizer should follow the same convention.
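
As a small illustration of that convention (standalone C, not TCG code),
"stored sign-extended" means a 32-bit constant with bit 31 set occupies
the full 64-bit slot like this:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int32_t c32 = (int32_t)0x80000000u;          /* 32-bit constant, bit 31 set */
    uint64_t sign_ext = (uint64_t)(int64_t)c32;  /* 0xffffffff80000000 */
    uint64_t zero_ext = (uint64_t)(uint32_t)c32; /* 0x0000000080000000 */

    printf("sign-extended: 0x%016" PRIx64 "\n", sign_ext);
    printf("zero-extended: 0x%016" PRIx64 "\n", zero_ext);
    return 0;
}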
Paolo Bonzini July 14, 2015, 8:56 p.m. UTC | #4
On 14/07/2015 20:37, Aurelien Jarno wrote:
>> > 
>> > I certainly don't have a global view, so much that I didn't think at 
>> > all of the optimizer... Instead, it looks to me like a bug in the 
>> > register allocator.  In particular this code in tcg_reg_alloc_mov:
> That's exactly my point when I said that someone doesn't have a global
> view. I think the fact that we don't check for type when simplifying
> moves in the register allocator is intentional, the same way we simply
> transform the trunc op into a mov op (except on sparc). This is done
> because it's not needed for example on x86 and most architectures,
> given 32-bit instructions do not care about the high part of the
> registers.
> 
> Basically size changing ops are trunc_i64_i32, ext_i32_i64 and
> extu_i32_i64. We can be conservative and implement all of them as real
> instructions in all TCG backends. In that case the mov op never has
> to deal with registers of different size (just like we enforce that at
> the TCG frotnend level), and the register allocator and the optimizer
> do not have to deal with this. However that's suboptimal on some
> architectures, that's why on x86 we decided to just replace the
> trunc_i64_i32 by a move. But if we do this simplification it should be
> done everywhere (in that case, including in the qemu_ld op). And
> DOCUMENTED somewhere, given different choices can be made for different
> backends.

I think there are four cases:

1) 64-bit processors that do not have loads with 32-bit addresses, and
do not zero extend on 32-bit operations---possibly because 32-bit
operations do not exist at all.

	=> qemu_ld/qemu_st must truncate the address

	ia64, s390, sparc all fall under this group.

2) 64-bit processors that have loads with 32-bit addresses.

	=> qemu_ld/qemu_st can use 32-bit addresses to do the
	   truncation

	aarch64, I think, falls under this group

3) Processors that do not have 32-bit loads, and automatically zero
extend on 32-bit operations

	=> qemu_ld/qemu_st could use 64-bit addresses and no truncation

x86 currently falls under 3, because it doesn't use ADDR32, but the
register allocator is breaking case 3 by forcing 64-bit operations when
loading from a global.
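
As an illustration of the property case 3 relies on (a minimal GNU C
check written for this discussion, not QEMU code): on x86-64, any 32-bit
register write clears the upper 32 bits of the destination, so a plain
32-bit mov acts as a free zero-extension:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t in = 0x1111111176fff040ULL;
    uint64_t out;

    /* A 32-bit mov implicitly zero-extends into the full 64-bit
       register, so the 0x11111111 garbage disappears. */
    __asm__ ("movl %k1, %k0" : "=r" (out) : "r" (in));

    printf("before: 0x%016" PRIx64 "\n", in);
    printf("after:  0x%016" PRIx64 "\n", out);   /* 0x0000000076fff040 */
    return 0;
}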

I am not sure if the optimizer could also break this case, or if it is
working by chance.  So, the simplest fix for 2.4 would be to add the
prefix as suggested in the comment and make x86 fall under 2.

If the optimizer is not breaking this case, fixing the register
allocator would be an option, and then the ADDR32 prefix could be reverted.

Even if the prefix was added, modifying the register allocator to use
32-bit loads would still be useful as an optimization, since on x86
32-bit loads are smaller than 64-bit loads.

Paolo
Aurelien Jarno July 14, 2015, 10:09 p.m. UTC | #5
On 2015-07-14 22:56, Paolo Bonzini wrote:
> 
> 
> On 14/07/2015 20:37, Aurelien Jarno wrote:
> >> > 
> >> > I certainly don't have a global view, so much that I didn't think at 
> >> > all of the optimizer... Instead, it looks to me like a bug in the 
> >> > register allocator.  In particular this code in tcg_reg_alloc_mov:
> > That's exactly my point when I said that someone doesn't have a global
> > view. I think the fact that we don't check for type when simplifying
> > moves in the register allocator is intentional, the same way we simply
> > transform the trunc op into a mov op (except on sparc). This is done
> > because it's not needed for example on x86 and most architectures,
> > given 32-bit instructions do not care about the high part of the
> > registers.
> > 
> > Basically size changing ops are trunc_i64_i32, ext_i32_i64 and
> > extu_i32_i64. We can be conservative and implement all of them as real
> > instructions in all TCG backends. In that case the mov op never has
> > to deal with registers of different size (just like we enforce that at
> > the TCG frotnend level), and the register allocator and the optimizer
> > do not have to deal with this. However that's suboptimal on some
> > architectures, that's why on x86 we decided to just replace the
> > trunc_i64_i32 by a move. But if we do this simplification it should be
> > done everywhere (in that case, including in the qemu_ld op). And
> > DOCUMENTED somewhere, given different choices can be made for different
> > backends.
> 
> I think there are four cases:

Well I think we should not see it in terms of only the qemu_ld/qemu_st
32-bit ops, but 32-bit ops in general.

> 1) 64-bit processors that do not have loads with 32-bit addresses, and
> do not zero extend on 32-bit operations---possibly because 32-bit
> operations do not exist at all.
> 
> 	=> qemu_ld/qemu_st must truncate the address
> 
> 	ia64, s390, sparc all fall under this group.
> 
> 2) 64-bit processors that have loads with 32-bit addresses.
> 
> 	=> qemu_ld/qemu_st can use 32-bit addresses to do the
> 	   truncation
> 
> 	aarch64, I think, falls under this group

I don't think that works. We don't want to get a load with a 32-bit
address. We want a load of (guest_base + address), with guest_base
possibly being 64-bit, address being 32-bit and the result likely
being 64-bit.

> 3) Processors that do not have 32-bit loads, and automatically zero
> extend on 32-bit operations
> 
> 	=> qemu_ld/qemu_st could use 64-bit addresses and no truncation
> 
> x86 currently falls under 3, because it doesn't use ADDR32, but the
> register allocator is breaking case 3 by forcing 64-bit operations when
> loading from a global.

Well, the use of ADDR32 is a bit special: it only works because we can
use %gs to add the guest base address. When we can't use %gs, ADDR32
can't work.

> I am not sure if the optimizer could also break this case, or if it is

Now that we track high bits as "garbage", the optimizer should be safe.

> working by chance.  So, the simplest fix for 2.4 would be to add the
> prefix as suggested in the comment and make x86 fall under 2.

I think it's the way to go, at least until we have a better view of how
the 32 to 64-bit register handling works.

> If the optimizer is not breaking this case, fixing the register
> allocator would be an option, and then the ADDR32 prefix could be reverted.

I don't think the register allocator is at fault at all.
tcg_reg_alloc_mov doesn't check for the register type because a TCG mov
is by definition only between registers of the same size. We have
different ops (trunc, ext, extu) to handle moves between registers of
different sizes.

The problem is that we replace the trunc instruction by a mov (except on
sparc) in tcg_gen_trunc_shr_i64_i32 to get more optimized code:

| ...
|     } else if (count == 0) {
|         tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(arg)));
|     } else {
|         TCGv_i64 t = tcg_temp_new_i64();
|         tcg_gen_shri_i64(t, arg, count);
|         tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(t)));
|         tcg_temp_free_i64(t);
|     }
| ...

If we actually implement the trunc_shr_i64_i32 instruction on all
targets, we get rid of this problem without having to tweak the register
allocator. But the generated code is then slightly less optimal, as we
emit an extra x86 mov instruction to do the zero extension.
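
For reference, the value-level semantics of trunc_shr_i64_i32 can be
stated in plain C (illustration only; the function below is not the TCG
implementation):

#include <assert.h>
#include <stdint.h>

/* Shift the 64-bit input right by count, keep the low 32 bits.
   For count == 0 this is a plain truncation, which on x86 can be
   implemented as a single 32-bit mov (implicit zero-extension). */
static uint32_t trunc_shr_i64_i32(uint64_t arg, unsigned count)
{
    return (uint32_t)(arg >> count);
}

int main(void)
{
    assert(trunc_shr_i64_i32(0x1111111176fff040ULL, 0)  == 0x76fff040u);
    assert(trunc_shr_i64_i32(0x1111111176fff040ULL, 32) == 0x11111111u);
    return 0;
}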

> Even if the prefix was added, modifying the register allocator to use
> 32-bit loads would still be useful as an optimization, since on x86
> 32-bit loads are smaller than 64-bit loads.

AFAIK, that's already the case. The REXW prefix is only emitted for
64-bit ops. The user-mode qemu_ld/st is a bit of a different case,
because there you mix values and addresses.
Paolo Bonzini July 15, 2015, 7:31 a.m. UTC | #6
On 15/07/2015 00:09, Aurelien Jarno wrote:
>> > 2) 64-bit processors that have loads with 32-bit addresses.
>> > 
>> > 	=> qemu_ld/qemu_st can use 32-bit addresses to do the
>> > 	   truncation
>> > 
>> > 	aarch64, I think, falls under this group
> I don't think that works. We don't want to get a load with a 32-bit
> address. We want a load of (guest_base + address), with guest_base
> possibly being 64-bit, address being 32-bit and the result likely
> being 64-bit.

aarch64, IIUC, has complicated addressing modes with a 64-bit base and a
32-bit sign- or zero-extended index, which is exactly what you need
here.  However, the backend is not using them, so right now aarch64 is
the same as x86.
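
(For concreteness, and assuming standard AArch64 assembler syntax, that
is the register-offset form along the lines of "ldr w0, [x1, w2, uxtw]":
a 64-bit base in x1 plus a 32-bit index in w2 that the addressing mode
itself zero-extends, or sign-extends with sxtw.)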

> Well the use of ADDR32 is a bit special, it only works because we can't
> use %gs to add the guest base address. When we can't use %gs, ADDR32
> can't work.

Yes.  bsd-user would have to sign extend, in particular.

> I don't think the register allocator is at fault at all. The register
> tcg_reg_alloc_mov doesn't check for the register type because a TCG mov
> is by definition only between registers of the same size.

Ok, I see your point.  If you put it like this :) the fault definitely
lies in the backends.  What I'm proposing would be in a new
tcg_reg_alloc_trunc function, and it would require implementing a
non-noop trunc.

I still believe the register allocator can be improved to do 32-bit
loads, though as an optimization and not as a bugfix:

> > Even if the prefix was added, modifying the register allocator to use
> > 32-bit loads would still be useful as an optimization, since on x86
> > 32-bit loads are smaller than 64-bit loads.
>
> AFAIK, that's already the case. The REXW prefix is only emitted for
> 64-bit ops.

Yes, but a load from a 64-bit register to a 32-bit destination emits
REX.W.  From Leon's dump:

 mov_i32 tmp1,w0.d0  => mov    0xe8(%r14),%rbp
 mov_i32 tmp0,tmp1
 mov_i32 t8,tmp0     => mov    %ebp,0x60(%r14)

Note %rbp as the load destination and %ebp as the source of the store.

Paolo
Aurelien Jarno July 15, 2015, 8:06 a.m. UTC | #7
On 2015-07-15 09:31, Paolo Bonzini wrote:
> Ok, I see your point.  If you put it like this :) the fault definitely
> lies in the backends.  What I'm proposing would be in a new
> tcg_reg_alloc_trunc function, and it would require implementing a
> non-noop trunc.

Why not reuse the existing trunc_shr_i64_i32 op? AFAIU, it has been
designed exactly for that.

Actually I think we should implement the following ops as optional but
*real* TCG ops:
- trunc_shr_i64_i32
- extu_i32_i64
- ext_i32_i64

Then each backend can implement the ones it considers necessary. If one
is not implemented in a backend, it is simply replaced by a mov. This
would also allow us to remove the "remember high bits as garbage" logic
in the optimizer, which I consider more of a band-aid than a real fix.

Note that we might have multiple choices, for example on x86:

1) implement trunc_shr_i64_i32 and ext_i32_i64
This way we make sure that all 32-bit values are always stored
zero-extended (even if a move has been propagated by the register
allocator or by the optimizer). The extu_i32_i64 can therefore always
be considered as a mov op.

2) implement extu_i32_i64 and ext_i32_i64
We have to guarantee that all 32-bit ops ignore the high part of the
registers (which is not the case currently for qemu_ld/st in user mode)
as they might contain garbage. Given that, we have to properly zero- and
sign-extend the value when converting a 32-bit value into a 64-bit value.

> I still believe the register allocator can be improved to do 32-bit
> loads, though as an optimization and not as a bugfix:
> 
> > > Even if the prefix was added, modifying the register allocator to use
> > > 32-bit loads would still be useful as an optimization, since on x86
> > > 32-bit loads are smaller than 64-bit loads.
> >
> > AFAIK, that's already the case. The REXW prefix is only emitted for
> > 64-bit ops.
> 
> Yes, but a load from a 64-bit register to a 32-bit destination emits
> REX.W.  From Leon's dump:
> 
>  mov_i32 tmp1,w0.d0  => mov    0xe8(%r14),%rbp
>  mov_i32 tmp0,tmp1
>  mov_i32 t8,tmp0     => mov    %ebp,0x60(%r14)
> 
> Note %rbp as the load destination and %ebp as the source of the store.

Indeed, that's something we might want to improve (and it is due to the
fact that we have just replaced trunc_shr_i64_i32 by a move on x86). Note
however that this simplification might be target-specific (it is at
least little-endian-specific if we don't adjust the address).
Richard Henderson July 15, 2015, 9:46 a.m. UTC | #8
On 07/14/2015 05:38 PM, Leon Alrae wrote:
> There seems to be an issue when trying to keep a pointer in bottom 32-bits
> of a 64-bit floating point register. Load and store instructions accessing
> this address for some reason use the whole 64-bit content of floating point
> register rather than truncated 32-bit value. The following load uses
> incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
>
> 0x00400c60:  mfc1       t8,$f0
> 0x00400c64:  lw t9,0(t8)
>
> It can be reproduced with the following linux userland program when running
> on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
> mips32r6-generic CPUs have this bit set in linux-user).
>
> int main(int argc, char *argv[])
> {
>      int tmp = 0x11111111;
>      /* Set f0 */
>      __asm__ ("mtc1  %0, $f0\n"
>               "mthc1 %1, $f0\n"
>               : : "r" (&tmp), "r" (tmp));
>      /* At this point $f0: w:76fff040 d:1111111176fff040 */
>      __asm__ ("mfc1 $t8, $f0\n"
>               "lw   $t9, 0($t8)\n"); /* <--- crash! */
>      return 0;
> }

What compilation options, exactly?  I'm having trouble reproducing.
Alternatively, perhaps you can send me a binary.


r~
Aurelien Jarno July 15, 2015, 9:59 a.m. UTC | #9
On 2015-07-15 10:46, Richard Henderson wrote:
> On 07/14/2015 05:38 PM, Leon Alrae wrote:
> >There seems to be an issue when trying to keep a pointer in bottom 32-bits
> >of a 64-bit floating point register. Load and store instructions accessing
> >this address for some reason use the whole 64-bit content of floating point
> >register rather than truncated 32-bit value. The following load uses
> >incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
> >
> >0x00400c60:  mfc1       t8,$f0
> >0x00400c64:  lw t9,0(t8)
> >
> >It can be reproduced with the following linux userland program when running
> >on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
> >mips32r6-generic CPUs have this bit set in linux-user).
> >
> >int main(int argc, char *argv[])
> >{
> >     int tmp = 0x11111111;
> >     /* Set f0 */
> >     __asm__ ("mtc1  %0, $f0\n"
> >              "mthc1 %1, $f0\n"
> >              : : "r" (&tmp), "r" (tmp));
> >     /* At this point $f0: w:76fff040 d:1111111176fff040 */
> >     __asm__ ("mfc1 $t8, $f0\n"
> >              "lw   $t9, 0($t8)\n"); /* <--- crash! */
> >     return 0;
> >}
> 
> What compilation options, exactly?  I'm having trouble reproducing.
> Alternately, perhaps you can send me a binary.

Please find attached the corresponding static binary. You should run it
with:

  qemu-mipsel -cpu mips32r5-generic ./mfc1

Patch

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 3ae09f8..3f6b701 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -8731,6 +8731,12 @@  static void gen_cp1 (DisasContext *ctx, uint32_t opc, int rt, int fs)
         }
         gen_store_gpr(t0, rt);
         opn = "mfc1";
+#if defined(CONFIG_USER_ONLY)
+        /* FIXME
+           Workaround: end translation to avoid TCG optimization with next
+           instruction. */
+        ctx->bstate = BS_STOP;
+#endif
         break;
     case OPC_MTC1:
         gen_load_gpr(t0, rt);