rs6000: Cleanup bdz/bdnz insn/splitter, add new insn/splitter for bdzt/bdzf/bdnzt/bdnzf
diff mbox series

Message ID 1512063107.6005.22.camel@linux.vnet.ibm.com
State New
Headers show
Series
  • rs6000: Cleanup bdz/bdnz insn/splitter, add new insn/splitter for bdzt/bdzf/bdnzt/bdnzf
Related show

Commit Message

Aaron Sawdey Nov. 30, 2017, 5:31 p.m. UTC
This does some cleanup/consolidation so that bdz/bdnz are supported by
a single insn and splitter, and adds a new insn and splitter to support
the conditional form of those (bdzt/bdzf/bdnzt/bdnzf).

This is going to be used for the memcmp() builtin expansion patch which
is next. That also will require the change to canonicalize_condition I
posted before thanksgiving to prevent doloop from being confused by
bdnzt et. al. 

Bootstrap/regtest passes on ppc64le. OK for trunk?


2017-11-30  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>

	* config/rs6000/rs6000.md (cceq_ior_compare): Remove * so I can use it
	to generate rtl.
	(cceq_ior_compare_complement): Give it a name so I can use it, and
	change boolean_or_operator predicate to boolean_operator so it can
	be used to generate a crand.
	Define new code iterator eqne, and new code_attrs bd/bd_neg.
	(<bd>_<mode>) New name for ctr<mode>_internal[12] now combined into
	a single define_insn. There is now just a single splitter for this
	that looks whether the decremented ctr result is going to a register
	or memory and generates the appropriate rtl.
	(<bd>tf_<mode>) A new insn pattern for the conditional form branch
	decrement (bdnzt/bdnzf/bdzt/bdzf). This also has a splitter similar
	to the one for <bd>_<mode>.
	* config/rs6000/rs6000.c (rs6000_legitimate_combined_insn): Updated
	with the new names of the branch decrement patterns, and added the
	names of the branch decrement conditional patterns.

Comments

Segher Boessenkool Dec. 1, 2017, 10:45 p.m. UTC | #1
Hi Aaron,

On Thu, Nov 30, 2017 at 11:31:47AM -0600, Aaron Sawdey wrote:
> This does some cleanup/consolidation so that bdz/bdnz are supported by
> a single insn and splitter, and adds a new insn and splitter to support
> the conditional form of those (bdzt/bdzf/bdnzt/bdnzf).
> 
> This is going to be used for the memcmp() builtin expansion patch which
> is next. That also will require the change to canonicalize_condition I
> posted before thanksgiving to prevent doloop from being confused by
> bdnzt et. al. 

> 	* config/rs6000/rs6000.md (cceq_ior_compare): Remove * so I can use it
> 	to generate rtl.
> 	(cceq_ior_compare_complement): Give it a name so I can use it, and
> 	change boolean_or_operator predicate to boolean_operator so it can
> 	be used to generate a crand.
> 	Define new code iterator eqne, and new code_attrs bd/bd_neg.

	(eqne): New code_iterator.
etc.

> 	(<bd>_<mode>) New name for ctr<mode>_internal[12] now combined into
> 	a single define_insn. There is now just a single splitter for this
> 	that looks whether the decremented ctr result is going to a register
> 	or memory and generates the appropriate rtl.

Colon after ), two spaces after a full stop.  You don't need to explain
what a change is for btw; changelog is *what* changed, not *why*.

> 	(<bd>tf_<mode>) A new insn pattern for the conditional form branch
> 	decrement (bdnzt/bdnzf/bdzt/bdzf). This also has a splitter similar
> 	to the one for <bd>_<mode>.
> 	* config/rs6000/rs6000.c (rs6000_legitimate_combined_insn): Updated
> 	with the new names of the branch decrement patterns, and added the
> 	names of the branch decrement conditional patterns.

> +      && (icode == CODE_FOR_bdz_si
> +	  || icode == CODE_FOR_bdz_di
> +	  || icode == CODE_FOR_bdnz_si
> +	  || icode == CODE_FOR_bdnz_di
> +	  || icode == CODE_FOR_bdztf_si
> +	  || icode == CODE_FOR_bdnztf_si
> +	  || icode == CODE_FOR_bdztf_di
> +	  || icode == CODE_FOR_bdnztf_di))

Please swap bdnztf_si and bdztf_di so it is clearer you handle all cases?

> +(define_code_iterator eqne [eq ne])
> +(define_code_attr bd [(ne "bdnz") (eq "bdz")])
> +(define_code_attr bd_neg [(ne "bdz") (eq "bdnz")])

Maybe order those as eq, ne as well?

> +  rtx ctrin = operands[1];
> +  rtx ctrout = operands[0];
> +  rtx ctrtmp = operands[4];

I don't think these temporaries improve legibility at all?

>    operands[7] = gen_rtx_fmt_ee (GET_CODE (operands[2]), VOIDmode, operands[3],
>  				const0_rtx);
> +  emit_insn (gen_rtx_SET (operands[3],
> +                          gen_rtx_COMPARE (CCmode, ctrin, const1_rtx)));
> +  if (gpc_reg_operand (ctrout, <MODE>mode))
> +    emit_insn (gen_add<mode>3 (ctrout, ctrin, constm1_rtx));
> +  else
> +    {
> +      emit_insn (gen_add<mode>3 (ctrtmp, ctrin, constm1_rtx));
> +      emit_move_insn (ctrout, ctrtmp);
> +    } 

(Space at end of line).

> +    /* No DONE so branch comes from the pattern.  */
>  })

> +(define_insn "<bd>tf_<mode>"
>    [(set (pc)
> -	(if_then_else (match_operator 2 "comparison_operator"
> -				      [(match_operand:P 1 "gpc_reg_operand")
> -				       (const_int 1)])
> -		      (match_operand 5)
> -		      (match_operand 6)))
> -   (set (match_operand:P 0 "nonimmediate_operand")
> +	(if_then_else
> +	  (and
> +	     (eqne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
> +		   (const_int 1))
> +	     (match_operator 3 "branch_comparison_operator"
> +		      [(match_operand 4 "cc_reg_operand" "y,y,y,y")
> +		       (const_int 0)]))
> +		      (label_ref (match_operand 0))
> +		      (pc)))

Those last two lines should be indented the same as the "(and" (they are
sub rtx of the if_then_else).

> +{
> +  if (which_alternative != 0)
> +    return "#";
> +  else if (get_attr_length (insn) == 4)
> +    {
> +      if (branch_positive_comparison_operator (operands[3],
> +                                               GET_MODE (operands[3])))
> +        return "<bd>t %j3,%l0";
> +      else
> +        return "<bd>f %j3,%l0";

Eight leading spaces should be a tab.

> +    }
> +  else
> +    { 

Trailing space.

> +        static char seq[96];
> +        char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
> +	sprintf(seq, "<bd_neg> $+12\;%s;b %%l0", bcs);
> +	return seq;

The indent should be six spaces on these four lines.

It should be "const char *" really; but output_cbranch has a big bug
as well it seems: it returns a pointer to a string on its stack!  Uh-oh.

> +;; Now the splitter if we could not allocate the CTR register
> +(define_split
> +  [(set (pc)
> +	(if_then_else
> +	  (and
> +	     (match_operator 1 "comparison_operator"
> +	                     [(match_operand:P 0 "gpc_reg_operand")
> +		              (const_int 1)])
> +	     (match_operator 3 "branch_comparison_operator"
> +		      [(match_operand 2 "cc_reg_operand")
> +		       (const_int 0)]))
> +		      (match_operand 4)
> +		      (match_operand 5)))

Indent of these last two lines.

> +   if (!ispos)
> +      cmpcode = reverse_condition (cmpcode);

Should indent one space less.

> +   // want to generate crand/crandc
> +
> +   emit_insn (gen_rtx_SET (operands[8],
> +                           gen_rtx_COMPARE (CCmode, ctr, const1_rtx)));

Each eight leading spaces should be a tab.

> +   rtx ctrcmpcc = gen_rtx_fmt_ee (cmpcode, SImode, operands[8], const0_rtx);
> +
> +   rtx andexpr = gen_rtx_AND (SImode, ctrcmpcc, cccmp);
> +   if (ispos)
> +      emit_insn (gen_cceq_ior_compare (operands[9], andexpr, ctrcmpcc,
> +                                       operands[8], cccmp, ccin));
> +   else
> +      emit_insn (gen_cceq_ior_compare_complement (operands[9], andexpr, ctrcmpcc,
> +                                                  operands[8], cccmp, ccin));
> +   if (gpc_reg_operand (operands[0], <MODE>mode))
> +      emit_insn (gen_add<mode>3 (ctrout, ctr, constm1_rtx));
> +   else
> +      {
> +       	 emit_insn (gen_add<mode>3 (ctrtmp, ctr, constm1_rtx));

That has a tab after spaces, never correct.

> +         emit_move_insn (ctrout, ctrtmp);
> +      }
> +   rtx cmp = gen_rtx_EQ (CCEQmode, operands[9], const0_rtx);
> +   emit_jump_insn (gen_rtx_SET (pc_rtx,
> +                                gen_rtx_IF_THEN_ELSE (VOIDmode, cmp,
> +						      dst1, dst2)));
> +   DONE;
>  })

Looks good otherwise.  I'll ok it when there is a user (or a testcase).
It shouldn't go in before the canonicalize_condition patch, of course.

Thanks!


Segher
Aaron Sawdey Jan. 8, 2018, 3:07 p.m. UTC | #2
On Fri, 2017-12-01 at 16:45 -0600, Segher Boessenkool wrote:
> Looks good otherwise.  I'll ok it when there is a user (or a
> testcase).
> It shouldn't go in before the canonicalize_condition patch, of
> course.

The canonicalize_condition patch is in, so I have checked in this
cleanup and addition to the patterns and splitters for the branch
decrement instructions as 256344.

2018-01-08  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>

	* config/rs6000/rs6000.md (cceq_ior_compare): Remove * so I can use it
	to generate rtl.
	(cceq_ior_compare_complement): Give it a name so I can use it, and
	change boolean_or_operator predicate to boolean_operator so it can
	be used to generate a crand.
	(eqne): New code iterator.
	(bd/bd_neg): New code_attrs.
	(<bd>_<mode>): New name for ctr<mode>_internal[12] now combined into
	a single define_insn.
	(<bd>tf_<mode>): A new insn pattern for the conditional form branch
	decrement (bdnzt/bdnzf/bdzt/bdzf).
	* config/rs6000/rs6000.c (rs6000_legitimate_combined_insn): Updated
	with the new names of the branch decrement patterns, and added the
	names of the branch decrement conditional patterns.

Patch
diff mbox series

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 254553)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -9056,10 +9056,14 @@ 
      for the difficult case.  It's better to not create problems
      in the first place.  */
   if (icode != CODE_FOR_nothing
-      && (icode == CODE_FOR_ctrsi_internal1
-	  || icode == CODE_FOR_ctrdi_internal1
-	  || icode == CODE_FOR_ctrsi_internal2
-	  || icode == CODE_FOR_ctrdi_internal2))
+      && (icode == CODE_FOR_bdz_si
+	  || icode == CODE_FOR_bdz_di
+	  || icode == CODE_FOR_bdnz_si
+	  || icode == CODE_FOR_bdnz_di
+	  || icode == CODE_FOR_bdztf_si
+	  || icode == CODE_FOR_bdnztf_si
+	  || icode == CODE_FOR_bdztf_di
+	  || icode == CODE_FOR_bdnztf_di))
     return false;
 
   return true;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 254553)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -12744,7 +12744,7 @@ 
 ; which are generated by the branch logic.
 ; Prefer destructive operations where BT = BB (for crXX BT,BA,BB)
 
-(define_insn "*cceq_ior_compare"
+(define_insn "cceq_ior_compare"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=y,?y")
         (compare:CCEQ (match_operator:SI 1 "boolean_operator"
 	                [(match_operator:SI 2
@@ -12764,9 +12764,9 @@ 
 
 ; Why is the constant -1 here, but 1 in the previous pattern?
 ; Because ~1 has all but the low bit set.
-(define_insn ""
+(define_insn "cceq_ior_compare_complement"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=y,?y")
-        (compare:CCEQ (match_operator:SI 1 "boolean_or_operator"
+        (compare:CCEQ (match_operator:SI 1 "boolean_operator"
 	                [(not:SI (match_operator:SI 2
 				      "branch_positive_comparison_operator"
 				      [(match_operand 3
@@ -12983,34 +12983,13 @@ 
 ;; rs6000_legitimate_combined_insn prevents combine creating any of
 ;; the ctr<mode> insns.
 
-(define_insn "ctr<mode>_internal1"
-  [(set (pc)
-	(if_then_else (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
-			  (const_int 1))
-		      (label_ref (match_operand 0))
-		      (pc)))
-   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
-	(plus:P (match_dup 1)
-		(const_int -1)))
-   (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
-   (clobber (match_scratch:P 4 "=X,X,&r,r"))]
-  ""
-{
-  if (which_alternative != 0)
-    return "#";
-  else if (get_attr_length (insn) == 4)
-    return "bdnz %l0";
-  else
-    return "bdz $+8\;b %l0";
-}
-  [(set_attr "type" "branch")
-   (set_attr "length" "*,16,20,20")])
+(define_code_iterator eqne [eq ne])
+(define_code_attr bd [(ne "bdnz") (eq "bdz")])
+(define_code_attr bd_neg [(ne "bdz") (eq "bdnz")])
 
-;; Similar but use EQ
-
-(define_insn "ctr<mode>_internal2"
+(define_insn "<bd>_<mode>"
   [(set (pc)
-	(if_then_else (eq (match_operand:P 1 "register_operand" "c,*b,*b,*b")
+	(if_then_else (eqne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 			  (const_int 1))
 		      (label_ref (match_operand 0))
 		      (pc)))
@@ -13024,15 +13003,14 @@ 
   if (which_alternative != 0)
     return "#";
   else if (get_attr_length (insn) == 4)
-    return "bdz %l0";
+    return "<bd> %l0";
   else
-    return "bdnz $+8\;b %l0";
+    return "<bd_neg> $+8\;b %l0";
 }
   [(set_attr "type" "branch")
    (set_attr "length" "*,16,20,20")])
 
-;; Now the splitters if we could not allocate the CTR register
-
+;; Now the splitter if we could not allocate the CTR register
 (define_split
   [(set (pc)
 	(if_then_else (match_operator 2 "comparison_operator"
@@ -13040,56 +13018,141 @@ 
 				       (const_int 1)])
 		      (match_operand 5)
 		      (match_operand 6)))
-   (set (match_operand:P 0 "int_reg_operand")
+   (set (match_operand:P 0 "nonimmediate_operand")
 	(plus:P (match_dup 1)
 		(const_int -1)))
    (clobber (match_scratch:CC 3))
    (clobber (match_scratch:P 4))]
   "reload_completed"
-  [(set (match_dup 3)
-	(compare:CC (match_dup 1)
-		    (const_int 1)))
-   (set (match_dup 0)
-	(plus:P (match_dup 1)
-		(const_int -1)))
-   (set (pc)
+  [(set (pc)
 	(if_then_else (match_dup 7)
 		      (match_dup 5)
 		      (match_dup 6)))]
 {
+  rtx ctrin = operands[1];
+  rtx ctrout = operands[0];
+  rtx ctrtmp = operands[4];
   operands[7] = gen_rtx_fmt_ee (GET_CODE (operands[2]), VOIDmode, operands[3],
 				const0_rtx);
+  emit_insn (gen_rtx_SET (operands[3],
+                          gen_rtx_COMPARE (CCmode, ctrin, const1_rtx)));
+  if (gpc_reg_operand (ctrout, <MODE>mode))
+    emit_insn (gen_add<mode>3 (ctrout, ctrin, constm1_rtx));
+  else
+    {
+      emit_insn (gen_add<mode>3 (ctrtmp, ctrin, constm1_rtx));
+      emit_move_insn (ctrout, ctrtmp);
+    } 
+    /* No DONE so branch comes from the pattern.  */
 })
 
-(define_split
+;; patterns for bdnzt/bdnzf/bdzt/bdzf
+;; Note that in the case of long branches we have to decompose this into
+;; bdnz+bc. This is because bdnzt has an implied AND between the ctr condition
+;; and the CR bit, which means there is no way to conveniently invert the
+;; comparison as is done with plain bdnz/bdz.
+
+(define_insn "<bd>tf_<mode>"
   [(set (pc)
-	(if_then_else (match_operator 2 "comparison_operator"
-				      [(match_operand:P 1 "gpc_reg_operand")
-				       (const_int 1)])
-		      (match_operand 5)
-		      (match_operand 6)))
-   (set (match_operand:P 0 "nonimmediate_operand")
+	(if_then_else
+	  (and
+	     (eqne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
+		   (const_int 1))
+	     (match_operator 3 "branch_comparison_operator"
+		      [(match_operand 4 "cc_reg_operand" "y,y,y,y")
+		       (const_int 0)]))
+		      (label_ref (match_operand 0))
+		      (pc)))
+   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
 	(plus:P (match_dup 1)
 		(const_int -1)))
-   (clobber (match_scratch:CC 3))
-   (clobber (match_scratch:P 4))]
-  "reload_completed && !gpc_reg_operand (operands[0], SImode)"
-  [(set (match_dup 3)
-	(compare:CC (match_dup 1)
-		    (const_int 1)))
-   (set (match_dup 4)
-	(plus:P (match_dup 1)
+   (clobber (match_scratch:P 5 "=X,X,&r,r"))
+   (clobber (match_scratch:CC 6 "=X,&y,&y,&y"))
+   (clobber (match_scratch:CCEQ 7 "=X,&y,&y,&y"))]
+  ""
+{
+  if (which_alternative != 0)
+    return "#";
+  else if (get_attr_length (insn) == 4)
+    {
+      if (branch_positive_comparison_operator (operands[3],
+                                               GET_MODE (operands[3])))
+        return "<bd>t %j3,%l0";
+      else
+        return "<bd>f %j3,%l0";
+    }
+  else
+    { 
+        static char seq[96];
+        char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
+	sprintf(seq, "<bd_neg> $+12\;%s;b %%l0", bcs);
+	return seq;
+    }
+}
+  [(set_attr "type" "branch")
+   (set_attr "length" "*,16,20,20")])
+
+;; Now the splitter if we could not allocate the CTR register
+(define_split
+  [(set (pc)
+	(if_then_else
+	  (and
+	     (match_operator 1 "comparison_operator"
+	                     [(match_operand:P 0 "gpc_reg_operand")
+		              (const_int 1)])
+	     (match_operator 3 "branch_comparison_operator"
+		      [(match_operand 2 "cc_reg_operand")
+		       (const_int 0)]))
+		      (match_operand 4)
+		      (match_operand 5)))
+   (set (match_operand:P 6 "int_reg_operand")
+	(plus:P (match_dup 0)
 		(const_int -1)))
-   (set (match_dup 0)
-	(match_dup 4))
-   (set (pc)
-	(if_then_else (match_dup 7)
-		      (match_dup 5)
-		      (match_dup 6)))]
+   (clobber (match_scratch:P 7))
+   (clobber (match_scratch:CC 8))
+   (clobber (match_scratch:CCEQ 9))]
+  "reload_completed"
+[(pc)]
 {
-  operands[7] = gen_rtx_fmt_ee (GET_CODE (operands[2]), VOIDmode, operands[3],
-				const0_rtx);
+   rtx ctr = operands[0];
+   rtx ctrcmp = operands[1];
+   rtx ccin = operands[2];
+   rtx cccmp = operands[3];
+   rtx dst1 = operands[4];
+   rtx dst2 = operands[5];
+   rtx ctrout = operands[6];
+   rtx ctrtmp = operands[7];
+   enum rtx_code cmpcode = GET_CODE (ctrcmp);
+   bool ispos = branch_positive_comparison_operator (ctrcmp, GET_MODE (ctrcmp));
+   if (!ispos)
+      cmpcode = reverse_condition (cmpcode);
+   // want to generate crand/crandc
+
+   emit_insn (gen_rtx_SET (operands[8],
+                           gen_rtx_COMPARE (CCmode, ctr, const1_rtx)));
+   rtx ctrcmpcc = gen_rtx_fmt_ee (cmpcode, SImode, operands[8], const0_rtx);
+
+   rtx andexpr = gen_rtx_AND (SImode, ctrcmpcc, cccmp);
+   if (ispos)
+      emit_insn (gen_cceq_ior_compare (operands[9], andexpr, ctrcmpcc,
+                                       operands[8], cccmp, ccin));
+   else
+      emit_insn (gen_cceq_ior_compare_complement (operands[9], andexpr, ctrcmpcc,
+                                                  operands[8], cccmp, ccin));
+   if (gpc_reg_operand (operands[0], <MODE>mode))
+      emit_insn (gen_add<mode>3 (ctrout, ctr, constm1_rtx));
+   else
+      {
+       	 emit_insn (gen_add<mode>3 (ctrtmp, ctr, constm1_rtx));
+         emit_move_insn (ctrout, ctrtmp);
+      }
+   rtx cmp = gen_rtx_EQ (CCEQmode, operands[9], const0_rtx);
+   emit_jump_insn (gen_rtx_SET (pc_rtx,
+                                gen_rtx_IF_THEN_ELSE (VOIDmode, cmp,
+						      dst1, dst2)));
+   DONE;
 })
+
 
 (define_insn "trap"
   [(trap_if (const_int 1) (const_int 0))]