Patchwork [rs6000] Improve atomic_load/store code gen for Power8 TI mode

Submitter Pat Haugen
Date April 8, 2014, 6:39 p.m.
Message ID <5344426C.9080207@linux.vnet.ibm.com>
Permalink /patch/337717/
State New

Comments

Pat Haugen - April 8, 2014, 6:39 p.m.
On 03/25/2014 11:20 AM, Pat Haugen wrote:
> Power8 can use lq/stq instructions for TI mode atomic_load/store. 
> Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once 
> bootstrap/regtest finishes)?
>
> -Pat
>
>
> 2014-03-25  Pat Haugen <pthaugen@us.ibm.com>
>
>         * config/rs6000/sync.md (AINT mode_iterator): Move definition.
>         (loadsync_<mode>): Change mode.
>         (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
>         (load_quadpti, store_quadpti): New.
>         * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
>
> gcc/testsuite:
>         * gcc.target/powerpc/atomic_load_store-p8.c: New.

Updated patch which was approved off list and I have committed.
William J. Schmidt - April 9, 2014, 6:56 p.m.
On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
> Updated patch which was approved off list and I have committed.
> 

Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:

checking for suffix of executables... /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c: In function 'libat_load_16':
/home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid failure memory model for '__atomic_compare_exchange'
     atomic_compare_exchange_n (mptr, &t, 0, true,
                               ^
make[4]: *** [load_16_.lo] Error 1
make[4]: *** Waiting for unfinished jobs....

Thanks,
Bill
David Edelsohn - April 9, 2014, 7:56 p.m.
I have reverted this on trunk and asked Bill to revert this on the 4.8
branch. This patch is too risky to apply this close to a freeze for
4.9.

Sorry for the problems.

- David


William J. Schmidt - April 9, 2014, 8:08 p.m.
On Wed, 2014-04-09 at 15:56 -0400, David Edelsohn wrote:
> I have reverted this on trunk and asked Bill to revert this on the 4.8
> branch. This patch is too risky to apply this close to a freeze for
> 4.9.

I've reverted this on 4.8 as r209254.

Thanks,
Bill

Pat Haugen - April 28, 2014, 8:56 p.m.
On 04/09/2014 02:56 PM, David Edelsohn wrote:
> I have reverted this on trunk and asked Bill to revert this on the 4.8
> branch. This patch is too risky to apply this close to a freeze for
> 4.9.
I received approval off list for an updated variant of the patch for 
4.8, so this patch has now been (re)committed to 4.8/4.9/trunk.

-Pat

Patch

Index: testsuite/gcc.target/powerpc/atomic_load_store-p8.c
===================================================================
--- testsuite/gcc.target/powerpc/atomic_load_store-p8.c	(revision 0)
+++ testsuite/gcc.target/powerpc/atomic_load_store-p8.c	(revision 0)
@@ -0,0 +1,22 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-final { scan-assembler-times "lq" 1 } } */
+/* { dg-final { scan-assembler-times "stq" 1 } } */
+/* { dg-final { scan-assembler-not "bl __atomic" } } */
+/* { dg-final { scan-assembler-not "lqarx" } } */
+/* { dg-final { scan-assembler-not "stqcx" } } */
+
+__int128
+atomic_load_128_relaxed (__int128 *ptr)
+{
+	return __atomic_load_n (ptr, __ATOMIC_RELAXED);
+}
+
+void
+atomic_store_128_relaxed (__int128 *ptr, __int128 val)
+{
+	__atomic_store_n (ptr, val, __ATOMIC_RELAXED);
+}
+
Index: config/rs6000/predicates.md
===================================================================
--- config/rs6000/predicates.md	(revision 209198)
+++ config/rs6000/predicates.md	(working copy)
@@ -624,14 +624,14 @@  (define_predicate "offsettable_mem_opera
        (match_test "offsettable_nonstrict_memref_p (op)")))
 
 ;; Return 1 if the operand is suitable for load/store quad memory.
-;; This predicate only checks for non-atomic loads/stores.
+;; This predicate only checks for non-atomic loads/stores (not lqarx/stqcx).
 (define_predicate "quad_memory_operand"
   (match_code "mem")
 {
   rtx addr, op0, op1;
   int ret;
 
-  if (!TARGET_QUAD_MEMORY)
+  if (!TARGET_QUAD_MEMORY && !TARGET_SYNC_TI)
     ret = 0;
 
   else if (!memory_operand (op, mode))
Index: config/rs6000/sync.md
===================================================================
--- config/rs6000/sync.md	(revision 209198)
+++ config/rs6000/sync.md	(working copy)
@@ -107,10 +107,17 @@  (define_insn "isync"
   "isync"
   [(set_attr "type" "isync")])
 
+;; Types that we should provide atomic instructions for.
+(define_mode_iterator AINT [QI
+			    HI
+			    SI
+			    (DI "TARGET_POWERPC64")
+			    (TI "TARGET_SYNC_TI")])
+
 ;; The control dependency used for load dependency described
 ;; in B.2.3 of the Power ISA 2.06B.
 (define_insn "loadsync_<mode>"
-  [(unspec_volatile:BLK [(match_operand:INT1 0 "register_operand" "r")]
+  [(unspec_volatile:BLK [(match_operand:AINT 0 "register_operand" "r")]
 			UNSPECV_ISYNC)
    (clobber (match_scratch:CC 1 "=y"))]
   ""
@@ -118,18 +125,56 @@  (define_insn "loadsync_<mode>"
   [(set_attr "type" "isync")
    (set_attr "length" "12")])
 
+(define_insn "load_quadpti"
+  [(set (match_operand:PTI 0 "quad_int_reg_operand" "=&r")
+	(unspec:PTI
+	 [(match_operand:TI 1 "quad_memory_operand" "wQ")] UNSPEC_LSQ))]
+  "TARGET_SYNC_TI
+   && !reg_mentioned_p (operands[0], operands[1])"
+  "lq %0,%1"
+  [(set_attr "type" "load")
+   (set_attr "length" "4")])
+
 (define_expand "atomic_load<mode>"
-  [(set (match_operand:INT1 0 "register_operand" "")		;; output
-	(match_operand:INT1 1 "memory_operand" ""))		;; memory
+  [(set (match_operand:AINT 0 "register_operand" "")		;; output
+	(match_operand:AINT 1 "memory_operand" ""))		;; memory
    (use (match_operand:SI 2 "const_int_operand" ""))]		;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_SYNC_TI)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
 
   if (model == MEMMODEL_SEQ_CST)
     emit_insn (gen_hwsync ());
 
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'lq'
+      if (indexed_address (XEXP (op1, 0), TImode))
+	{
+	  rtx old_addr = XEXP (op1, 0);
+	  rtx new_addr = force_reg (Pmode, old_addr);
+	  operands[1] = op1 = replace_equiv_address (op1, new_addr);
+	}
+
+      emit_insn (gen_load_quadpti (pti_reg, op1));
+
+      if (WORDS_BIG_ENDIAN)
+	emit_move_insn (op0, gen_lowpart (TImode, pti_reg));
+      else
+	{
+	  emit_move_insn (gen_lowpart (DImode, op0), gen_highpart (DImode, pti_reg));
+	  emit_move_insn (gen_highpart (DImode, op0), gen_lowpart (DImode, pti_reg));
+	}
+    }
 
   switch (model)
     {
@@ -146,12 +191,24 @@  (define_expand "atomic_load<mode>"
   DONE;
 })
 
+(define_insn "store_quadpti"
+  [(set (match_operand:PTI 0 "quad_memory_operand" "=wQ")
+	(unspec:PTI
+	 [(match_operand:PTI 1 "quad_int_reg_operand" "r")] UNSPEC_LSQ))]
+  "TARGET_SYNC_TI"
+  "stq %1,%0"
+  [(set_attr "type" "store")
+   (set_attr "length" "4")])
+
 (define_expand "atomic_store<mode>"
-  [(set (match_operand:INT1 0 "memory_operand" "")		;; memory
-	(match_operand:INT1 1 "register_operand" ""))		;; input
+  [(set (match_operand:AINT 0 "memory_operand" "")		;; memory
+	(match_operand:AINT 1 "register_operand" ""))		;; input
    (use (match_operand:SI 2 "const_int_operand" ""))]		;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_SYNC_TI)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
   switch (model)
     {
@@ -166,7 +223,33 @@  (define_expand "atomic_store<mode>"
     default:
       gcc_unreachable ();
     }
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'stq'
+      if (indexed_address (XEXP (op0, 0), TImode))
+	{
+	  rtx old_addr = XEXP (op0, 0);
+	  rtx new_addr = force_reg (Pmode, old_addr);
+	  operands[0] = op0 = replace_equiv_address (op0, new_addr);
+	}
+
+      if (WORDS_BIG_ENDIAN)
+	emit_move_insn (pti_reg, gen_lowpart (PTImode, op1));
+      else
+	{
+	  emit_move_insn (gen_lowpart (DImode, pti_reg), gen_highpart (DImode, op1));
+	  emit_move_insn (gen_highpart (DImode, pti_reg), gen_lowpart (DImode, op1));
+	}
+
+      emit_insn (gen_store_quadpti (gen_lowpart (PTImode, op0), pti_reg));
+    }
+
   DONE;
 })
 
@@ -180,14 +263,6 @@  (define_mode_iterator ATOMIC [(QI "TARGE
 			      SI
 			      (DI "TARGET_POWERPC64")])
 
-;; Types that we should provide atomic instructions for.
-
-(define_mode_iterator AINT [QI
-			    HI
-			    SI
-			    (DI "TARGET_POWERPC64")
-			    (TI "TARGET_SYNC_TI")])
-
 (define_insn "load_locked<mode>"
   [(set (match_operand:ATOMIC 0 "int_reg_operand" "=r")
 	(unspec_volatile:ATOMIC
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 209198)
+++ config/rs6000/rs6000.md	(working copy)
@@ -125,6 +125,7 @@  (define_c_enum "unspec"
    UNSPEC_P8V_MTVSRD
    UNSPEC_P8V_XXPERMDI
    UNSPEC_P8V_RELOAD_FROM_VSX
+   UNSPEC_LSQ
   ])
 
 ;;
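
Note on the !WORDS_BIG_ENDIAN branches in atomic_load<mode> and atomic_store<mode>: the expanders swap the two DImode halves when copying between the PTImode temporary and the TImode operand. The sketch below is a plain C model of that swap for the load case, not GCC code. It assumes that lq always places the most significant doubleword in the even (lower-numbered) register of the pair, and that the compiler treats the lower-numbered register as the least significant word when words are little-endian. The names reg_pair, model_lq, lowpart, highpart and copy_to_timode are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two consecutive 64-bit registers holding one 128-bit value.  */
struct reg_pair
{
  uint64_t first;   /* lower-numbered (even) register */
  uint64_t second;  /* higher-numbered (odd) register */
};

/* Assumed lq behaviour: the even register receives the most
   significant doubleword regardless of endianness.  */
static struct reg_pair
model_lq (uint64_t most_significant, uint64_t least_significant)
{
  struct reg_pair r = { most_significant, least_significant };
  return r;
}

/* Register the compiler regards as the least significant word: the
   last register for big-endian words, the first one otherwise.  */
static uint64_t *
lowpart (struct reg_pair *r, bool words_big_endian)
{
  return words_big_endian ? &r->second : &r->first;
}

/* Register the compiler regards as the most significant word.  */
static uint64_t *
highpart (struct reg_pair *r, bool words_big_endian)
{
  return words_big_endian ? &r->first : &r->second;
}

/* Model of the copy from the PTImode temporary to the TImode result.  */
static struct reg_pair
copy_to_timode (struct reg_pair pti, bool words_big_endian)
{
  struct reg_pair ti;

  if (words_big_endian)
    ti = pti;  /* layouts agree, so a plain move is enough */
  else
    {
      /* Layouts disagree, so exchange the halves.  */
      *lowpart (&ti, false) = *highpart (&pti, false);
      *highpart (&ti, false) = *lowpart (&pti, false);
    }

  /* The most significant half must be the one lq put in the even
     register.  */
  assert (*highpart (&ti, words_big_endian) == pti.first);
  return ti;
}
```

Under these assumptions the TImode result has the correct most and least significant halves in both word orders. The store expander performs the mirror-image swap before emitting stq.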