diff mbox

Avoid PR72749 by not using unspecs

Message ID 20170113115058.GU32333@bubble.grove.modra.org
State New
Headers show

Commit Message

Alan Modra Jan. 13, 2017, 11:50 a.m. UTC
Rather than using unspecs in doloop insns to stop combine creating
these insns, use legitimate_combined_insn.

I'm not sure why the original patch implementing
legitimate_combined_insn did not store the result of recog to the insn
but it seems good to me, and would allow the recog call in
ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
shown here.)

Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux,
x86_64-linux.  OK for mainline?

	PR target/72749
	* combine.c (recog_for_combine_1): Set INSN_CODE before calling
	target legitimate_combined_insn.
	* config/rs6000/rs6000.c (TARGET_LEGITIMATE_COMBINED_INSN): Define.
	(rs6000_legitimate_combined_insn): New function.
	* config/rs6000/rs6000.md (UNSPEC_DOLOOP): Delete, and remove
	all uses.
	(ctr<mode>_internal3): Rename from *ctr<mode>_internal5.
	(ctr<mode>_internal4): Rename from *ctr<mode>_internal6.
	(ctr<mode>_internal1, ctr<mode>_internal2): Remove '*' from name.

Comments

Uros Bizjak Jan. 13, 2017, 3:58 p.m. UTC | #1
On Fri, Jan 13, 2017 at 12:50 PM, Alan Modra <amodra@gmail.com> wrote:
> Rather than using unspecs in doloop insns to stop combine creating
> these insns, use legitimate_combined_insn.
>
> I'm not sure why the original patch implementing
> legitimate_combined_insn did not store the result of recog to the insn
> but it seems good to me, and would allow the recog call in
> ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
> shown here.)

IIRC, I copied operand scanning loop from recog.c (around line 2580)
and the function was later enhanced with preferred alternatives
handling. The function worked well, and not being an expert in this
area, I didn't try to "optimize" the code that worked...

So, there is no particular reason for the current implementation.

Uros.
Jeff Law Jan. 13, 2017, 5:17 p.m. UTC | #2
On 01/13/2017 04:50 AM, Alan Modra wrote:
> Rather than using unspecs in doloop insns to stop combine creating
> these insns, use legitimate_combined_insn.
>
> I'm not sure why the original patch implementing
> legitimate_combined_insn did not store the result of recog to the insn
> but it seems good to me, and would allow the recog call in
> ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
> shown here.)
>
> Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux,
> x86_64-linux.  OK for mainline?
>
> 	PR target/72749
> 	* combine.c (recog_for_combine_1): Set INSN_CODE before calling
> 	target legitimate_combined_insn.
> 	* config/rs6000/rs6000.c (TARGET_LEGITIMATE_COMBINED_INSN): Define.
> 	(rs6000_legitimate_combined_insn): New function.
> 	* config/rs6000/rs6000.md (UNSPEC_DOLOOP): Delete, and remove
> 	all uses.
> 	(ctr<mode>_internal3): Rename from *ctr<mode>_internal5.
> 	(ctr<mode>_internal4): Rename from *ctr<mode>_internal6.
> 	(ctr<mode>_internal1, ctr<mode>_internal2): Remove '*' from name.
OK on the combine.c changes.  ppc maintainers own the rest.

jeff
Segher Boessenkool Jan. 14, 2017, 9:51 a.m. UTC | #3
On Fri, Jan 13, 2017 at 10:20:58PM +1030, Alan Modra wrote:
> Rather than using unspecs in doloop insns to stop combine creating
> these insns, use legitimate_combined_insn.
> 
> I'm not sure why the original patch implementing
> legitimate_combined_insn did not store the result of recog to the insn
> but it seems good to me, and would allow the recog call in
> ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
> shown here.)

It is always restored after the hook call, so this change is fine.

> +      /* Reject creating doloop insns.  Combine should not be allowed
> +	 to create these for a number of reasons:

It also prevents combine from combining any instruction into an existing
doloop insn (resulting in a doloop insn again).  This isn't too serious,
almost always this is just a register copy that will be optimised away
some other way.  The unspec hack has more significant problems.

> +	 3) Faced with not being able to allocate ctr for ctrsi/crtdi
> +	 insns, the register allocator sometimes uses floating point
> +	 or vector registers for the pseudo.  Since ctrsi/ctrdi is a
> +	 jump insn and output reloads are not implemented for jumps,
> +	 the ctrsi/ctrdi splitters need to handle all possible cases.
> +	 That's a pain, and it gets to be seriously difficult when a
> +	 splitter that runs after reload needs memory to transfer from
> +	 a gpr to fpr.  See PR70098 and PR71763 which are not fixed
> +	 for the difficult case.  It's better to not create problems
> +	 in the first place.  */

Yeah :-)  Note that the problems still can happen, just much less frequently.

> @@ -12751,9 +12749,8 @@ (define_expand "ctr<mode>"
>  ;; JUMP_INSNs.
>  ;; For the length attribute to be calculated correctly, the
>  ;; label MUST be operand 0.
> -;; The UNSPEC is present to prevent combine creating this pattern.

Maybe add a note here that rs6000_legitimate_combined_insn will refuse
to combine to this?

Okay for trunk.  Thanks!


Segher
Alan Modra Jan. 14, 2017, 12:57 p.m. UTC | #4
On Sat, Jan 14, 2017 at 03:51:27AM -0600, Segher Boessenkool wrote:
> It also prevents combine from combining any instruction into an existing
> doloop insn (resulting in a doloop insn again).  This isn't too serious,
> almost always this is just a register copy that will be optimised away
> some other way.

Right.  I forgot to mention, I looked at all of gcc/*.o before/after
the patch and saw no occurrences of any missed combines.  Just the
expected changes in rs6000.o, various insn-*.o from twiddling
rs6000.md, and one constant change in read-rtl-function.o.
diff mbox

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index 3043f2a..73a74ac 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11199,6 +11199,7 @@  recog_for_combine_1 (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
       old_icode = INSN_CODE (insn);
       PATTERN (insn) = pat;
       REG_NOTES (insn) = notes;
+      INSN_CODE (insn) = insn_code_number;
 
       /* Allow targets to reject combined insn.  */
       if (!targetm.legitimate_combined_insn (insn))
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 02b521c..76ba81b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1558,6 +1558,9 @@  static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_CONST_NOT_OK_FOR_DEBUG_P
 #define TARGET_CONST_NOT_OK_FOR_DEBUG_P rs6000_const_not_ok_for_debug_p
 
+#undef TARGET_LEGITIMATE_COMBINED_INSN
+#define TARGET_LEGITIMATE_COMBINED_INSN rs6000_legitimate_combined_insn
+
 #undef TARGET_ASM_FUNCTION_PROLOGUE
 #define TARGET_ASM_FUNCTION_PROLOGUE rs6000_output_function_prologue
 #undef TARGET_ASM_FUNCTION_EPILOGUE
@@ -9076,6 +9079,49 @@  rs6000_const_not_ok_for_debug_p (rtx x)
   return false;
 }
 
+
+/* Implement the TARGET_LEGITIMATE_COMBINED_INSN hook.  */
+
+static bool
+rs6000_legitimate_combined_insn (rtx_insn *insn)
+{
+  switch (INSN_CODE (insn))
+    {
+      /* Reject creating doloop insns.  Combine should not be allowed
+	 to create these for a number of reasons:
+	 1) In a nested loop, if combine creates one of these in an
+	 outer loop and the register allocator happens to allocate ctr
+	 to the outer loop insn, then the inner loop can't use ctr.
+	 Inner loops ought to be more highly optimized.
+	 2) Combine often wants to create one of these from what was
+	 originally a three insn sequence, first combining the three
+	 insns to two, then to ctrsi/ctrdi.  When ctrsi/ctrdi is not
+	 allocated ctr, the splitter takes use back to the three insn
+	 sequence.  It's better to stop combine at the two insn
+	 sequence.
+	 3) Faced with not being able to allocate ctr for ctrsi/crtdi
+	 insns, the register allocator sometimes uses floating point
+	 or vector registers for the pseudo.  Since ctrsi/ctrdi is a
+	 jump insn and output reloads are not implemented for jumps,
+	 the ctrsi/ctrdi splitters need to handle all possible cases.
+	 That's a pain, and it gets to be seriously difficult when a
+	 splitter that runs after reload needs memory to transfer from
+	 a gpr to fpr.  See PR70098 and PR71763 which are not fixed
+	 for the difficult case.  It's better to not create problems
+	 in the first place.  */
+    case CODE_FOR_ctrsi_internal1:
+    case CODE_FOR_ctrdi_internal1:
+    case CODE_FOR_ctrsi_internal2:
+    case CODE_FOR_ctrdi_internal2:
+    case CODE_FOR_ctrsi_internal3:
+    case CODE_FOR_ctrdi_internal3:
+    case CODE_FOR_ctrsi_internal4:
+    case CODE_FOR_ctrdi_internal4:
+      return false;
+    }
+  return true;
+}
+
 /* Construct the SYMBOL_REF for the tls_get_addr function.  */
 
 static GTY(()) rtx rs6000_tls_symbol;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f7c1ab2..9442b07 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -149,7 +149,6 @@  (define_c_enum "unspec"
    UNSPEC_IEEE128_MOVE
    UNSPEC_IEEE128_CONVERT
    UNSPEC_SIGNBIT
-   UNSPEC_DOLOOP
    UNSPEC_SF_FROM_SI
    UNSPEC_SI_FROM_SF
   ])
@@ -12740,7 +12739,6 @@  (define_expand "ctr<mode>"
 	      (set (match_dup 0)
 		   (plus:P (match_dup 0)
 			    (const_int -1)))
-	      (unspec [(const_int 0)] UNSPEC_DOLOOP)
 	      (clobber (match_scratch:CC 2 ""))
 	      (clobber (match_scratch:P 3 ""))])]
   ""
@@ -12751,9 +12749,8 @@  (define_expand "ctr<mode>"
 ;; JUMP_INSNs.
 ;; For the length attribute to be calculated correctly, the
 ;; label MUST be operand 0.
-;; The UNSPEC is present to prevent combine creating this pattern.
 
-(define_insn "*ctr<mode>_internal1"
+(define_insn "ctr<mode>_internal1"
   [(set (pc)
 	(if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 			  (const_int 1))
@@ -12762,7 +12759,6 @@  (define_insn "*ctr<mode>_internal1"
    (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
-   (unspec [(const_int 0)] UNSPEC_DOLOOP)
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
    (clobber (match_scratch:P 4 "=X,X,&r,r"))]
   ""
@@ -12778,7 +12774,7 @@  (define_insn "*ctr<mode>_internal1"
   [(set_attr "type" "branch")
    (set_attr "length" "*,16,20,20")])
 
-(define_insn "*ctr<mode>_internal2"
+(define_insn "ctr<mode>_internal2"
   [(set (pc)
 	(if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 			  (const_int 1))
@@ -12787,7 +12783,6 @@  (define_insn "*ctr<mode>_internal2"
    (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
-   (unspec [(const_int 0)] UNSPEC_DOLOOP)
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
    (clobber (match_scratch:P 4 "=X,X,&r,r"))]
   ""
@@ -12805,7 +12800,7 @@  (define_insn "*ctr<mode>_internal2"
 
 ;; Similar but use EQ
 
-(define_insn "*ctr<mode>_internal5"
+(define_insn "ctr<mode>_internal3"
   [(set (pc)
 	(if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 			  (const_int 1))
@@ -12814,7 +12809,6 @@  (define_insn "*ctr<mode>_internal5"
    (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
-   (unspec [(const_int 0)] UNSPEC_DOLOOP)
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
    (clobber (match_scratch:P 4 "=X,X,&r,r"))]
   ""
@@ -12830,7 +12824,7 @@  (define_insn "*ctr<mode>_internal5"
   [(set_attr "type" "branch")
    (set_attr "length" "*,16,20,20")])
 
-(define_insn "*ctr<mode>_internal6"
+(define_insn "ctr<mode>_internal4"
   [(set (pc)
 	(if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 			  (const_int 1))
@@ -12839,7 +12833,6 @@  (define_insn "*ctr<mode>_internal6"
    (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
-   (unspec [(const_int 0)] UNSPEC_DOLOOP)
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
    (clobber (match_scratch:P 4 "=X,X,&r,r"))]
   ""
@@ -12866,7 +12859,6 @@  (define_split
 		      (match_operand 6 "" "")))
    (set (match_operand:P 0 "int_reg_operand" "")
 	(plus:P (match_dup 1) (const_int -1)))
-   (unspec [(const_int 0)] UNSPEC_DOLOOP)
    (clobber (match_scratch:CC 3 ""))
    (clobber (match_scratch:P 4 ""))]
   "reload_completed"
@@ -12892,7 +12884,6 @@  (define_split
 		      (match_operand 6 "" "")))
    (set (match_operand:P 0 "nonimmediate_operand" "")
 	(plus:P (match_dup 1) (const_int -1)))
-   (unspec [(const_int 0)] UNSPEC_DOLOOP)
    (clobber (match_scratch:CC 3 ""))
    (clobber (match_scratch:P 4 ""))]
   "reload_completed && ! gpc_reg_operand (operands[0], SImode)"
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr72749.c b/gcc/testsuite/gcc.c-torture/compile/pr72749.c
new file mode 100644
index 0000000..2ef4d9a
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr72749.c
@@ -0,0 +1,21 @@ 
+/* { dg-options "-O2 -fsched2-use-superblocks" } */
+
+int as;
+
+void
+ji (int *x4)
+{
+  if (0)
+    {
+      unsigned int pv;
+
+      while (as < 0)
+        {
+          for (*x4 = 0; *x4 < 1; ++(*x4))
+yj:
+            x4 = (int *)&pv;
+          ++as;
+        }
+    }
+  goto yj;
+}