[Committed/AARCH64] Fix a few failures with LSE enabled

Message ID CA+=Sn1kUrHPfrB1ikBVJNPvs0dC_f9NyUoDAER59dzc54=PjnQ@mail.gmail.com
State New

Commit Message

Andrew Pinski Dec. 20, 2015, 10:01 p.m. UTC
Hi,
With LSE enabled by default, a few tests in libgomp fail.
The shortest testcase I came up with was:
extern void abort (void);
int x = 6;
int f(void) __attribute__((noinline,noclone));
int f(void)
{
  return 32;
}

int
main ()
{
  int v, l = 2, s = 1;
  x = f();
  #pragma omp atomic capture
    v = x = 5 | x;
  if (v != 37)
    abort ();
  return 0;
}
--- CUT ---
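For reference, here is a rough sketch of what the atomic capture above is expected to compute, written with GCC's __atomic_or_fetch builtin instead of OpenMP. The helper names capture_or and check_capture are made up for illustration, and the memory order is an arbitrary choice for this sketch, not what libgomp necessarily uses:

```c
#include <assert.h>

/* Hypothetical helper, not part of the testcase: OR MASK into *P
   atomically and return the NEW value, as "v = x = 5 | x" under
   "#pragma omp atomic capture" is expected to do.  The memory order
   is an assumption made for this sketch.  */
static int
capture_or (int *p, int mask)
{
  return __atomic_or_fetch (p, mask, __ATOMIC_SEQ_CST);
}

/* Mirrors the check in main () of the testcase: x starts at 32, so
   both the captured value and x must end up as 32 | 5 == 37.  */
static void
check_capture (void)
{
  int x = 32;
  int v = capture_or (&x, 5);
  assert (v == 37);
  assert (x == 37);
}
```

Calling check_capture () should pass silently on a correct compiler; the bug described below makes the captured value come back as the old value of x instead.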
What happened was that the register allocator decided to use the same
register for the input as for the clobber (scratch) register:
Before:
(insn 13 12 14 2 (parallel [
            (set (reg:SI 74 [ _5 ])
                (ior:SI (mem/v:SI (reg/f:DI 76) [-1  S4 A32])
                    (reg:SI 80)))
            (set (mem/v:SI (reg/f:DI 76) [-1  S4 A32])
                (unspec_volatile:SI [
                        (mem/v:SI (reg/f:DI 76) [-1  S4 A32])
                        (reg:SI 80)
                        (const_int 0 [0])
                    ] UNSPECV_ATOMIC_LDOP))
            (clobber (scratch:SI))
        ]) t.c:14 2895 {aarch64_atomic_or_fetchsi_lse}
     (expr_list:REG_DEAD (reg:SI 80)
        (expr_list:REG_DEAD (reg/f:DI 76)
            (nil))))

After:
(insn 13 12 14 2 (parallel [
            (set (reg:SI 2 x2 [orig:74 _5 ] [74])
                (ior:SI (mem/v:SI (reg/f:DI 1 x1 [76]) [-1  S4 A32])
                    (reg:SI 0 x0 [80])))
            (set (mem/v:SI (reg/f:DI 1 x1 [76]) [-1  S4 A32])
                (unspec_volatile:SI [
                        (mem/v:SI (reg/f:DI 1 x1 [76]) [-1  S4 A32])
                        (reg:SI 0 x0 [80])
                        (const_int 0 [0])
                    ] UNSPECV_ATOMIC_LDOP))
            (clobber (reg:SI 0 x0 [82]))
        ]) t.c:14 2895 {aarch64_atomic_or_fetchsi_lse}
     (nil))

The split then came along and used the clobber register as a temporary
to store the result of the ldset, and then ORed that register with the
original input register. Since both are x0 here, the input value has
already been overwritten by the time the OR is done.
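To make the failure concrete, here is a toy model in C of the post-split sequence. It is only a sketch: the register file, the helper names ldset and or_fetch_split, and the register numbers are all invented for illustration and do not come from the aarch64 backend.

```c
#include <assert.h>

#define NREGS 4

/* Toy register file; purely illustrative.  */
static int regs[NREGS];

/* Models the LSE ldset: store (*MEM | regs[SRC]) to memory and
   return the OLD memory value in regs[DST].  */
static void
ldset (int src, int dst, int *mem)
{
  assert (src >= 0 && src < NREGS && dst >= 0 && dst < NREGS);
  int old = *mem;
  *mem = old | regs[src];
  regs[dst] = old;
}

/* Models the split of or_fetch: ldset into SCRATCH, then OR the
   scratch with the original INPUT register to rebuild the new value
   in RESULT.  If SCRATCH == INPUT, the input has been overwritten
   with the old memory value before the OR reads it.  */
static int
or_fetch_split (int *mem, int value, int input, int scratch, int result)
{
  assert (result >= 0 && result < NREGS);
  regs[input] = value;
  ldset (input, scratch, mem);
  regs[result] = regs[scratch] | regs[input];
  return regs[result];
}
```

With x equal to 32 and a value of 5, or_fetch_split (&x, 5, 0, 0, 2) models the allocation shown in the "After" dump (input and scratch both in register 0) and returns 32, while or_fetch_split (&x, 5, 0, 3, 2) with a distinct scratch returns 37. Memory ends up as 37 in both cases; only the captured result is wrong.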

This is incorrect as the clobber register needs to be marked as early
clobber so it does not match up with the input register.

This obvious patch fixes the problem by marking the clobber register
as an early clobber so the register allocator does not choose the same
register as an input register.

Committed as obvious after a bootstrap/test on aarch64-linux-gnu
configured with/without --with-cpu=thunderx+lse on a pass 2 ThunderX
CPU (which has ARMv8.1 support).

Thanks,
Andrew

ChangeLog:
2015-12-20  Andrew Pinski  <apinski@cavium.com>

        * config/aarch64/atomics.md
        (aarch64_atomic_<atomic_optab>_fetch<mode>_lse): Add early clobber
        to the scratch register.
Patch

Index: config/aarch64/atomics.md
===================================================================
--- config/aarch64/atomics.md	(revision 231852)
+++ config/aarch64/atomics.md	(working copy)
@@ -428,7 +428,7 @@ 
        (match_dup 2)
        (match_operand:SI 3 "const_int_operand")]
       UNSPECV_ATOMIC_LDOP))
-     (clobber (match_scratch:ALLI 4 "=r"))]
+     (clobber (match_scratch:ALLI 4 "=&r"))]
   "TARGET_LSE"
   "#"
   "&& reload_completed"