diff mbox

[AVR] : Fix PR49903

Message ID 4E43F9CA.70404@gjlay.de
State New
Headers show

Commit Message

Georg-Johann Lay Aug. 11, 2011, 3:48 p.m. UTC
This is an optimization in machine dependent reorg to
remove redundant comparisons like in

   cc0 = compare (Reg, Num)
   if (cc0 == 0)
     goto L1

   cc0 = compare (Reg, Num)
   if (cc0 > 0)
     goto L2

The second comparison is redundant an can be removed.
Code like this can be seen in binary decision switch/case
expansion.

Passed without regressions.

Ok to install?

Johann


	* PR target/49903
	* config/avr/avr.md (UNSPEC_IDENTITY): New c_enum.
	(branch_unspec): New insn.
	(branch): Beauty farm.
	* config/avr/avr.c (compare_condition): Use JUMP_P.  Test SET_SRC
	to be IF_THEN_ELSE.
	(avr_compare_pattern, avr_reorg_remove_redundant_compare):
	New static functions.
	(avr_reorg): Use them.  Use next_real_insn instead of NEXT_INSN.
	Use CONST_INT_P.  Beauty.

Comments

Denis Chertykov Aug. 11, 2011, 6:29 p.m. UTC | #1
2011/8/11 Georg-Johann Lay <avr@gjlay.de>:
> This is an optimization in machine dependent reorg to
> remove redundant comparisons like in
>
>   cc0 = compare (Reg, Num)
>   if (cc0 == 0)
>     goto L1
>
>   cc0 = compare (Reg, Num)
>   if (cc0 > 0)
>     goto L2
>
> The second comparison is redundant an can be removed.
> Code like this can be seen in binary decision switch/case
> expansion.
>
> Passed without regressions.
>
> Ok to install?

Please, commit.

Denis.
Hans-Peter Nilsson Aug. 12, 2011, 10:53 p.m. UTC | #2
On Thu, 11 Aug 2011, Georg-Johann Lay wrote:

> This is an optimization in machine dependent reorg to
> remove redundant comparisons like in
>
>    cc0 = compare (Reg, Num)
>    if (cc0 == 0)
>      goto L1
>
>    cc0 = compare (Reg, Num)
>    if (cc0 > 0)
>      goto L2
>
> The second comparison is redundant an can be removed.
> Code like this can be seen in binary decision switch/case
> expansion.

A glance at AVR makes me think this should already be handled by
the NOTICE_UPDATE_CC machinery.  Any analysis why this doesn't
happen?  With the same test-case (at -Os) I don't see redundant
compares for cris-elf, for example.

brgds, H-P
Georg-Johann Lay Aug. 13, 2011, 9:16 a.m. UTC | #3
Hans-Peter Nilsson schrieb:
> 
> On Thu, 11 Aug 2011, Georg-Johann Lay wrote:
> 
> 
>>This is an optimization in machine dependent reorg to
>>remove redundant comparisons like in
>>
>>   cc0 = compare (Reg, Num)
>>   if (cc0 == 0)
>>     goto L1
>>
>>   cc0 = compare (Reg, Num)
>>   if (cc0 > 0)
>>     goto L2
>>
>>The second comparison is redundant an can be removed.
>>Code like this can be seen in binary decision switch/case
>>expansion.
> 
> A glance at AVR makes me think this should already be handled by
> the NOTICE_UPDATE_CC machinery.  Any analysis why this doesn't
> happen?  With the same test-case (at -Os) I don't see redundant
> compares for cris-elf, for example.

One reason is that branch insns set cc0 to "clobber" where it is
actually "none".  I could not depict the rationale for this from
the avr BE, presumably it's because of text peepholes that change
branches or jump-over-one-insn skip optimizations.

Second reason is that avr has no GT/GTU and therefore reorg transforms

    cc0 = compare (Reg, Num)
    if (cc0 > 0)
      goto L2

to

    cc0 = compare (Reg, Num-1)
    if (cc0 >= 0)
      goto L2

so that the comparisons are no more the same.

Of cource, reorg could ommit the last optimization if there is
a similar comparison beforehand.  But the hard part is not
doing/skipping the optimization, the annoyance is detecting the
right 4-insn instruction sequences.

I also thought about extending genrecog et al. which currently
can handle 3 types of things (RECOG, SPLIT, PEEPHOLE2) to a
fourth one like INSN_SEQ so that one could write down the
sequence as RTL instead of as brain-dead C (brain-dead in the
way it must be written down, not in what it does)
and use insn-recog, insn-extract etc. to analyse such sequences.

This might also be helpful in other backends when doing similar
optimizations or writing hand-coded schedulers or when scanning
for specific sequences to work around core errata.

Johann

> 
> brgds, H-P
>
Hans-Peter Nilsson Aug. 13, 2011, 9:57 a.m. UTC | #4
On Sat, 13 Aug 2011, Georg-Johann Lay wrote:
> Hans-Peter Nilsson schrieb:
> > A glance at AVR makes me think this should already be handled by
> > the NOTICE_UPDATE_CC machinery.  Any analysis why this doesn't
> > happen?  With the same test-case (at -Os) I don't see redundant
> > compares for cris-elf, for example.
>
> One reason is that branch insns set cc0 to "clobber" where it is
> actually "none".  I could not depict the rationale for this from
> the avr BE, presumably it's because of text peepholes that change
> branches or jump-over-one-insn skip optimizations.

Or gas relaxation of branches ...nope, not in current binutils.
(And with gcc keeping track of instruction lengths, that's an
obsolete feature for code generated by gcc.)

> Second reason is that avr has no GT/GTU and therefore reorg transforms

Sidenote: please don't refer to it as reorg.  Reorg is the
delay-slot handler beast living in reorg.c and resource.c (but
whose days are numbered, thankfully).  ITYM machine-dependent
reorg; md_reorg, avr_reorg.

>
>    cc0 = compare (Reg, Num)
>    if (cc0 > 0)
>      goto L2
>
> to
>
>    cc0 = compare (Reg, Num-1)
>    if (cc0 >= 0)
>      goto L2
>
> so that the comparisons are no more the same.

Sorry, I missed that (those).  It just looked like you added a
lot of code for something that already should be handled.

> Of cource, reorg could ommit the last optimization if there is
> a similar comparison beforehand.  But the hard part is not
> doing/skipping the optimization, the annoyance is detecting the
> right 4-insn instruction sequences.
>
> I also thought about extending genrecog et al. which currently
> can handle 3 types of things (RECOG, SPLIT, PEEPHOLE2) to a
> fourth one like INSN_SEQ so that one could write down the
> sequence as RTL instead of as brain-dead C (brain-dead in the
> way it must be written down, not in what it does)
> and use insn-recog, insn-extract etc. to analyse such sequences.

I'm not sure what you mean here, except it sounds like yet more
code.  If you meant to keep the compare and branch together,
then there's an option to keep it: define cbranchM as a
define_insn, not as define_expand.

brgds, H-P
diff mbox

Patch

Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 177648)
+++ config/avr/avr.md	(working copy)
@@ -56,6 +56,7 @@  (define_c_enum "unspec"
    UNSPEC_FMULS
    UNSPEC_FMULSU
    UNSPEC_COPYSIGN
+   UNSPEC_IDENTITY
    ])
 
 (define_c_enum "unspecv"
@@ -3339,16 +3340,36 @@  (define_peephole2
 (define_insn "branch"
   [(set (pc)
         (if_then_else (match_operator 1 "simple_comparison_operator"
-                        [(cc0)
-                         (const_int 0)])
+                                      [(cc0)
+                                       (const_int 0)])
                       (label_ref (match_operand 0 "" ""))
                       (pc)))]
   ""
-  "*
-   return ret_cond_branch (operands[1], avr_jump_mode (operands[0],insn), 0);"
+  {
+    return ret_cond_branch (operands[1], avr_jump_mode (operands[0], insn), 0);
+  }
   [(set_attr "type" "branch")
    (set_attr "cc" "clobber")])
 
+
+;; Same as above but wrap SET_SRC so that this branch won't be transformed
+;; or optimized in the remainder.
+
+(define_insn "branch_unspec"
+  [(set (pc)
+        (unspec [(if_then_else (match_operator 1 "simple_comparison_operator"
+                                               [(cc0)
+                                                (const_int 0)])
+                               (label_ref (match_operand 0 "" ""))
+                               (pc))
+                 ] UNSPEC_IDENTITY))]
+  ""
+  {
+    return ret_cond_branch (operands[1], avr_jump_mode (operands[0], insn), 0);
+  }
+  [(set_attr "type" "branch")
+   (set_attr "cc" "none")])
+
 ;; ****************************************************************
 ;; AVR does not have following conditional jumps: LE,LEU,GT,GTU.
 ;; Convert them all to proper jumps.
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 177648)
+++ config/avr/avr.c	(working copy)
@@ -2947,15 +2947,17 @@  static RTX_CODE
 compare_condition (rtx insn)
 {
   rtx next = next_real_insn (insn);
-  RTX_CODE cond = UNKNOWN;
-  if (next && GET_CODE (next) == JUMP_INSN)
+
+  if (next && JUMP_P (next))
     {
       rtx pat = PATTERN (next);
       rtx src = SET_SRC (pat);
-      rtx t = XEXP (src, 0);
-      cond = GET_CODE (t);
+      
+      if (IF_THEN_ELSE == GET_CODE (src))
+        return GET_CODE (XEXP (src, 0));
     }
-  return cond;
+  
+  return UNKNOWN;
 }
 
 /* Returns nonzero if INSN is a tst insn that only tests the sign.  */
@@ -6046,82 +6048,265 @@  avr_normalize_condition (RTX_CODE condit
     }
 }
 
-/* This function optimizes conditional jumps.  */
+/* Helper function for `avr_reorg'.  */
+
+static rtx
+avr_compare_pattern (rtx insn)
+{
+  rtx pattern = single_set (insn);
+
+  if (pattern
+      && NONJUMP_INSN_P (insn)
+      && SET_DEST (pattern) == cc0_rtx
+      && GET_CODE (SET_SRC (pattern)) == COMPARE)
+    {
+      return pattern;
+    }
+
+  return NULL_RTX;
+}
+
+/* Helper function for `avr_reorg'.  */
+
+/* Expansion of switch/case decision trees leads to code like
+
+       cc0 = compare (Reg, Num)
+       if (cc0 == 0)
+         goto L1
+         
+       cc0 = compare (Reg, Num)
+       if (cc0 > 0)
+         goto L2
+
+   The second comparison is superfluous and can be deleted.
+   The second jump condition can be transformed from a
+   "difficult" one to a "simple" one because "cc0 > 0" and
+   "cc0 >= 0" will have the same effect here.
+
+   This function relies on the way switch/case is being expaned
+   as binary decision tree.  For example code see PR 49903.
+         
+   Return TRUE if optimization performed.
+   Return FALSE if nothing changed.
+
+   INSN1 is a comparison, i.e. avr_compare_pattern != 0.
+
+   We don't want to do this in text peephole because it is
+   tedious to work out jump offsets there and the second comparison
+   might have been transormed by `avr_reorg'.
+
+   RTL peephole won't do because peephole2 does not scan across
+   basic blocks.  */                
+                        
+static bool
+avr_reorg_remove_redundant_compare (rtx insn1)
+{
+  rtx comp1, ifelse1, xcond1, branch1;
+  rtx comp2, ifelse2, xcond2, branch2, insn2;
+  enum rtx_code code;
+  rtx jump, target, cond;
+  
+  /* Look out for:  compare1 - branch1 - compare2 - branch2  */
+
+  branch1 = next_nonnote_nondebug_insn (insn1);
+  if (!branch1 || !JUMP_P (branch1))
+    return false;
+
+  insn2 = next_nonnote_nondebug_insn (branch1);
+  if (!insn2 || !avr_compare_pattern (insn2))
+    return false;
+
+  branch2 = next_nonnote_nondebug_insn (insn2);
+  if (!branch2 || !JUMP_P (branch2))
+    return false;
+
+  comp1 = avr_compare_pattern (insn1);
+  comp2 = avr_compare_pattern (insn2);
+  xcond1 = single_set (branch1);
+  xcond2 = single_set (branch2);
+  
+  if (!comp1 || !comp2
+      || !rtx_equal_p (comp1, comp2)
+      || !xcond1 || SET_DEST (xcond1) != pc_rtx
+      || !xcond2 || SET_DEST (xcond2) != pc_rtx
+      || IF_THEN_ELSE != GET_CODE (SET_SRC (xcond1))
+      || IF_THEN_ELSE != GET_CODE (SET_SRC (xcond2)))
+    {
+      return false;
+    }
+
+  comp1 = SET_SRC (comp1);
+  ifelse1 = SET_SRC (xcond1);
+  ifelse2 = SET_SRC (xcond2);
+
+  /* comp<n> is COMPARE now and ifelse<n> is IF_THEN_ELSE.  */
+
+  if (EQ != GET_CODE (XEXP (ifelse1, 0))
+      || !REG_P (XEXP (comp1, 0))
+      || !CONST_INT_P (XEXP (comp1, 1))
+      || XEXP (ifelse1, 2) != pc_rtx
+      || XEXP (ifelse2, 2) != pc_rtx
+      || LABEL_REF != GET_CODE (XEXP (ifelse1, 1))
+      || LABEL_REF != GET_CODE (XEXP (ifelse2, 1))
+      || !COMPARISON_P (XEXP (ifelse2, 0))
+      || cc0_rtx != XEXP (XEXP (ifelse1, 0), 0)
+      || cc0_rtx != XEXP (XEXP (ifelse2, 0), 0)
+      || const0_rtx != XEXP (XEXP (ifelse1, 0), 1)
+      || const0_rtx != XEXP (XEXP (ifelse2, 0), 1))
+    {
+      return false;
+    }
+
+  /* We filtered the insn sequence to look like
+
+        (set (cc0)
+             (compare (reg:M N)
+                      (const_int VAL)))
+        (set (pc)
+             (if_then_else (eq (cc0)
+                               (const_int 0))
+                           (label_ref L1)
+                           (pc)))
+                           
+        (set (cc0)
+             (compare (reg:M N)
+                      (const_int VAL)))
+        (set (pc)
+             (if_then_else (CODE (cc0)
+                                 (const_int 0))
+                           (label_ref L2)
+                           (pc)))
+  */
+
+  code = GET_CODE (XEXP (ifelse2, 0));
+
+  /* Map GT/GTU to GE/GEU which is easier for AVR.
+     The first two instructions compare/branch on EQ
+     so we may replace the difficult
+        
+        if (x == VAL)   goto L1;
+        if (x > VAL)    goto L2;
+
+     with easy
+         
+         if (x == VAL)   goto L1;
+         if (x >= VAL)   goto L2;
+
+     Similarly, replace LE/LEU by LT/LTU.  */
+  
+  switch (code)
+    {
+    case EQ:
+    case LT:  case LTU:
+    case GE:  case GEU:
+      break;
+
+    case LE:  case LEU:
+    case GT:  case GTU:
+      code = avr_normalize_condition (code);
+      break;
+      
+    default:
+      return false;
+    }
+
+  /* Wrap the branches into UNSPECs so they won't be changed or
+     optimized in the remainder.  */
+
+  target = XEXP (XEXP (ifelse1, 1), 0);
+  cond = XEXP (ifelse1, 0);
+  jump = emit_jump_insn_after (gen_branch_unspec (target, cond), insn1);
+
+  JUMP_LABEL (jump) = JUMP_LABEL (branch1);
+
+  target = XEXP (XEXP (ifelse2, 1), 0);
+  cond = gen_rtx_fmt_ee (code, VOIDmode, cc0_rtx, const0_rtx);
+  jump = emit_jump_insn_after (gen_branch_unspec (target, cond), insn2);
+
+  JUMP_LABEL (jump) = JUMP_LABEL (branch2);
+
+  /* The comparisons in insn1 and insn2 are exactly the same;
+     insn2 is superfluous so delete it.  */
+     
+  delete_insn (insn2);
+  delete_insn (branch1);
+  delete_insn (branch2);
+
+  return true;
+}
+
+
+/* Implement `TARGET_MACHINE_DEPENDENT_REORG'.  */
+/* Optimize conditional jumps.  */
 
 static void
 avr_reorg (void)
 {
-  rtx insn, pattern;
+  rtx insn = get_insns();
   
-  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+  for (insn = next_real_insn (insn); insn; insn = next_real_insn (insn))
     {
-      if (! (GET_CODE (insn) == INSN
-	     || GET_CODE (insn) == CALL_INSN
-	     || GET_CODE (insn) == JUMP_INSN)
-	  || !single_set (insn))
-	continue;
+      rtx pattern = avr_compare_pattern (insn);
+      
+      if (!pattern)
+        continue;
 
-      pattern = PATTERN (insn);
+      if (optimize
+          && avr_reorg_remove_redundant_compare (insn))
+        {
+          continue;
+        }
 
-      if (GET_CODE (pattern) == PARALLEL)
-	pattern = XVECEXP (pattern, 0, 0);
-      if (GET_CODE (pattern) == SET
-	  && SET_DEST (pattern) == cc0_rtx
-	  && compare_diff_p (insn))
-	{
-	  if (GET_CODE (SET_SRC (pattern)) == COMPARE)
-	    {
-	      /* Now we work under compare insn.  */
-	      
-	      pattern = SET_SRC (pattern);
-	      if (true_regnum (XEXP (pattern,0)) >= 0
-		  && true_regnum (XEXP (pattern,1)) >= 0 )
-		{
-		  rtx x = XEXP (pattern,0);
-		  rtx next = next_real_insn (insn);
-		  rtx pat = PATTERN (next);
-		  rtx src = SET_SRC (pat);
-		  rtx t = XEXP (src,0);
-		  PUT_CODE (t, swap_condition (GET_CODE (t)));
-		  XEXP (pattern,0) = XEXP (pattern,1);
-		  XEXP (pattern,1) = x;
-		  INSN_CODE (next) = -1;
-		}
-	      else if (true_regnum (XEXP (pattern, 0)) >= 0
-		       && XEXP (pattern, 1) == const0_rtx)
-	        {
-	          /* This is a tst insn, we can reverse it.  */
-	          rtx next = next_real_insn (insn);
-	          rtx pat = PATTERN (next);
-	          rtx src = SET_SRC (pat);
-	          rtx t = XEXP (src,0);
-    
-	          PUT_CODE (t, swap_condition (GET_CODE (t)));
-	          XEXP (pattern, 1) = XEXP (pattern, 0);
-	          XEXP (pattern, 0) = const0_rtx;
-	          INSN_CODE (next) = -1;
-	          INSN_CODE (insn) = -1;
-	        }
-	      else if (true_regnum (XEXP (pattern,0)) >= 0
-		       && GET_CODE (XEXP (pattern,1)) == CONST_INT)
-		{
-		  rtx x = XEXP (pattern,1);
-		  rtx next = next_real_insn (insn);
-		  rtx pat = PATTERN (next);
-		  rtx src = SET_SRC (pat);
-		  rtx t = XEXP (src,0);
-		  enum machine_mode mode = GET_MODE (XEXP (pattern, 0));
+      if (compare_diff_p (insn))
+	{
+          /* Now we work under compare insn with difficult branch.  */
+          
+          rtx next = next_real_insn (insn);
+          rtx pat = PATTERN (next);
 
-		  if (avr_simplify_comparison_p (mode, GET_CODE (t), x))
-		    {
-		      XEXP (pattern, 1) = gen_int_mode (INTVAL (x) + 1, mode);
-		      PUT_CODE (t, avr_normalize_condition (GET_CODE (t)));
-		      INSN_CODE (next) = -1;
-		      INSN_CODE (insn) = -1;
-		    }
-		}
-	    }
-	}
+          pattern = SET_SRC (pattern);
+          
+          if (true_regnum (XEXP (pattern, 0)) >= 0
+              && true_regnum (XEXP (pattern, 1)) >= 0)
+            {
+              rtx x = XEXP (pattern, 0);
+              rtx src = SET_SRC (pat);
+              rtx t = XEXP (src,0);
+              PUT_CODE (t, swap_condition (GET_CODE (t)));
+              XEXP (pattern, 0) = XEXP (pattern, 1);
+              XEXP (pattern, 1) = x;
+              INSN_CODE (next) = -1;
+            }
+          else if (true_regnum (XEXP (pattern, 0)) >= 0
+                   && XEXP (pattern, 1) == const0_rtx)
+            {
+              /* This is a tst insn, we can reverse it.  */
+              rtx src = SET_SRC (pat);
+              rtx t = XEXP (src,0);
+    
+              PUT_CODE (t, swap_condition (GET_CODE (t)));
+              XEXP (pattern, 1) = XEXP (pattern, 0);
+              XEXP (pattern, 0) = const0_rtx;
+              INSN_CODE (next) = -1;
+              INSN_CODE (insn) = -1;
+            }
+          else if (true_regnum (XEXP (pattern, 0)) >= 0
+                   && CONST_INT_P (XEXP (pattern, 1)))
+            {
+              rtx x = XEXP (pattern, 1);
+              rtx src = SET_SRC (pat);
+              rtx t = XEXP (src,0);
+              enum machine_mode mode = GET_MODE (XEXP (pattern, 0));
+              
+              if (avr_simplify_comparison_p (mode, GET_CODE (t), x))
+                {
+                  XEXP (pattern, 1) = gen_int_mode (INTVAL (x) + 1, mode);
+                  PUT_CODE (t, avr_normalize_condition (GET_CODE (t)));
+                  INSN_CODE (next) = -1;
+                  INSN_CODE (insn) = -1;
+                }
+            }
+        }
     }
 }