diff mbox

[RFC] new post-reload compare elimination pass

Message ID 4CF4399B.2090403@redhat.com
State New
Headers show

Commit Message

Richard Henderson Nov. 29, 2010, 11:39 p.m. UTC
Since NickC is away for the next month or so, I spent some of the
long weekend trying to help him out with the mn10300 comparison
problem for which he's already sent two revisions.  The patch is
along the lines I proposed here:

  http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01076.html

This adds a new post-reload pass, which is only enabled by backends
that require it, e.g.  

> +  /* Enable comparison elimination, unless explicitly disabled.  */
> +  if (optimize && !global_options_set.x_flag_compare_elim_after_reload)
> +    flag_compare_elim_after_reload = 1;

as will be seen in the mn10300 backend patch that uses this.

I tried to use the df.h interfaces more, but I couldn't seem to find
any easy way to walk a def-def chain.  I'd need that to easily notice
the LUID of the clobber of the flags register preceding a compare.  In
the end it seemed easiest to simply walk the insn chain and keep track.

Comments wrt this specific patch?


r~
From f5c86689ffa10fe9a7e9f63454ad84f0f7f8e882 Mon Sep 17 00:00:00 2001
From: Richard Henderson <rth@twiddle.net>
Date: Mon, 29 Nov 2010 15:24:29 -0800
Subject: [PATCH 1/3] New -fcompare-elim pass.

---
 gcc/Makefile.in     |    6 +-
 gcc/common.opt      |    4 +
 gcc/compare-elim.c  |  287 +++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/doc/invoke.texi |   15 +++-
 gcc/passes.c        |    1 +
 gcc/recog.h         |    7 ++
 gcc/tree-pass.h     |    1 +
 7 files changed, 319 insertions(+), 2 deletions(-)
 create mode 100644 gcc/compare-elim.c

Comments

Yao Qi Nov. 30, 2010, 3:56 a.m. UTC | #1
On 11/30/2010 07:39 AM, Richard Henderson wrote:
> Comments wrt this specific patch?

It will be great if some test cases for this pass can be provided.
Jeff Law Nov. 30, 2010, 5:50 a.m. UTC | #2
On 11/29/10 16:39, Richard Henderson wrote:
> Since NickC is away for the next month or so, I spent some of the
> long weekend trying to help him out with the mn10300 comparison
> problem for which he's already sent two revisions.  The patch is
> along the lines I proposed here:
>
>    http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01076.html
>
> This adds a new post-reload pass, which is only enabled by backends
> that require it, e.g.
>
>> +  /* Enable comparison elimination, unless explicitly disabled.  */
>> +  if (optimize&&  !global_options_set.x_flag_compare_elim_after_reload)
>> +    flag_compare_elim_after_reload = 1;
> as will be seen in the mn10300 backend patch that uses this.
>
> I tried to use the df.h interfaces more, but I couldn't seem to find
> any easy way to walk a def-def chain.  I'd need that to easily notice
> the LUID of the clobber of the flags register preceding a compare.  In
> the end it seemed easiest to simply walk the insn chain and keep track.
>
> Comments wrt this specific patch?
At first glance this reminded me a lot of the old comparison elimination 
code in final, as it should since the code we want to generate is the 
same -- the major difference is this code twiddles the RTL to match 
reality where the old code waited until it was late enough on the 
pipeline that it could just change the assembly code without downstream 
consequences.

With that in mind, I looked over the old code in final.c to see if there 
were any high level gotchas -- the only ones that stuck out where checks 
to ensure the comparison didn't have a volatile operand or an operand 
with an auto-increment addressing mode.    I don't see similar checks in 
your new pass.

I don't think it's necessary right now, but handling conditional moves, 
conditional traps & setcc insns might be useful one day as well.

Overall, the pass looks pretty simple.  I presume you verified it 
actually eliminated some comparisons on the mn103 port?  Did NickC give 
you a testcase or two to work with?

Jeff
Richard Henderson Nov. 30, 2010, 3:56 p.m. UTC | #3
On 11/29/2010 09:50 PM, Jeff Law wrote:
> At first glance this reminded me a lot of the old comparison
> elimination code in final, as it should since the code we want to
> generate is the same ...

Yes, that's the idea: giving us a proper rtl replacement for the
cc0 compare elimination from final.

> With that in mind, I looked over the old code in final.c to see if
> there were any high level gotchas -- the only ones that stuck out
> where checks to ensure the comparison didn't have a volatile operand
> or an operand with an auto-increment addressing mode. I don't see
> similar checks in your new pass.

I validate that the comparison is exactly (compare (reg) (const_int)).
I don't allow memory operands at all.

> I don't think it's necessary right now, but handling conditional
> moves, conditional traps & setcc insns might be useful one day as
> well.

I'm not specifically checking for jumps.  In theory it does handle
conditional moves and setcc, but mn10300 doesn't have any of those
to test with.

I'm planning to convert the RX port as well, which does have setcc
and movcc patterns to deal with, and was also switched from cc0 to
flags register within the 4.6 release cycle.

> Overall, the pass looks pretty simple. I presume you verified it
> actually eliminated some comparisons on the mn103 port? Did NickC
> give you a testcase or two to work with?

No, I've just been examining make all-target output and comparing
results between gcc 4.5 (pre-cc0 removal) and current.  I should
think a proper test is that we don't regress the size of libc.a.


r~
Jeff Law Nov. 30, 2010, 4:18 p.m. UTC | #4
On 11/30/10 08:56, Richard Henderson wrote:
>
>> With that in mind, I looked over the old code in final.c to see if
>> there were any high level gotchas -- the only ones that stuck out
>> where checks to ensure the comparison didn't have a volatile operand
>> or an operand with an auto-increment addressing mode. I don't see
>> similar checks in your new pass.
> I validate that the comparison is exactly (compare (reg) (const_int)).
> I don't allow memory operands at all.
OK.  Solves that "problem".

>> Overall, the pass looks pretty simple. I presume you verified it
>> actually eliminated some comparisons on the mn103 port? Did NickC
>> give you a testcase or two to work with?
> No, I've just been examining make all-target output and comparing
> results between gcc 4.5 (pre-cc0 removal) and current.  I should
> think a proper test is that we don't regress the size of libc.a.
Seems like lots of other stuff could get in the way of a simple size 
comparison.  You might be able to instrument the pre-cc0 compiler to 
dump a little info when the old redundant tst removal code triggered and 
verify that those same functions trigger with your new pass.  newlib's 
libc.a is definitely the right thing to be looking at for testcases.

jeff
diff mbox

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 97ddf46..8a35cfd 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -902,7 +902,7 @@  GIMPLE_H = gimple.h gimple.def gsstruct.def pointer-set.h $(VEC_H) \
 GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h
 COVERAGE_H = coverage.h $(GCOV_IO_H)
 DEMANGLE_H = $(srcdir)/../include/demangle.h
-RECOG_H = recog.h
+RECOG_H = recog.h insn-config.h
 ALIAS_H = alias.h coretypes.h
 EMIT_RTL_H = emit-rtl.h
 FLAGS_H = flags.h coretypes.h flag-types.h $(OPTIONS_H)
@@ -1206,6 +1206,7 @@  OBJS-common = \
 	cfgrtl.o \
 	combine.o \
 	combine-stack-adj.o \
+	compare-elim.o \
 	convert.o \
 	coverage.o \
 	cse.o \
@@ -3390,6 +3391,9 @@  combine-stack-adj.o : combine-stack-adj.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
    $(TM_H) $(RTL_H) insn-config.h $(TIMEVAR_H) $(TREE_PASS_H) \
    $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
    $(EXPR_H) $(BASIC_BLOCK_H) $(TOPLEV_H) $(DIAGNOSTIC_CORE_H) $(TM_P_H) $(DF_H) $(EXCEPT_H) reload.h
+compare-elim.o : compare-elim.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+   $(TM_H) $(RTL_H) $(TM_P_H) $(RECOG_H) $(FLAGS_H) $(BASIC_BLOCK_H) \
+   $(TREE_PASS_H) $(DF_H)
 ddg.o : ddg.c $(DDG_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TARGET_H) \
    $(TOPLEV_H) $(DIAGNOSTIC_CORE_H) $(RTL_H) $(TM_P_H) $(REGS_H) $(FUNCTION_H) \
    $(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \
diff --git a/gcc/common.opt b/gcc/common.opt
index 57f5b0a..c146cee 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -818,6 +818,10 @@  fcompare-debug-second
 Common Driver RejectNegative
 Run only the second compilation of -fcompare-debug
 
+fcompare-elim
+Common Report Var(flag_compare_elim_after_reload) Optimization
+Perform comparison elimination after register allocation has finished
+
 fconserve-stack
 Common Var(flag_conserve_stack) Optimization
 Do not perform optimizations increasing noticeably stack usage
diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
new file mode 100644
index 0000000..754d308
--- /dev/null
+++ b/gcc/compare-elim.c
@@ -0,0 +1,287 @@ 
+/* Post-reload compare elimination.
+   Copyright (C) 2010
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* There is a set of targets whose general-purpose move or addition
+   instructions clobber the flags.  These targets cannot split their
+   CBRANCH/CSTORE etc patterns before reload is complete, lest reload
+   itself insert instructions in between the flags setter and user.
+   Because these targets cannot split the compare from the use, they
+   cannot make use of the comparison elimination offered by the combine pass.
+
+   This is a small pass intended to provide comparison elimination similar to
+   what is available via NOTICE_UPDATE_CC for cc0 targets.  This should help
+   encourage cc0 targets to convert to an explicit post-reload representation
+   of the flags.
+
+   This pass assumes:
+
+   (0) CBRANCH/CSTORE etc have been split in pass_split_after_reload.
+
+   (1) All comparison patterns are represented as
+
+	[(set (reg:CC) (compare:CC (reg) (immediate)]
+
+   (2) All insn patterns that modify the flags are represented as
+
+	[(set (reg) (operation)
+	 (clobber (reg:CC))]
+
+   (3) If an insn of form (2) can usefully set the flags, there is
+       another pattern of the form
+
+	[(set (reg) (operation)
+	 (set (reg:CCM) (compare:CCM (operation) (immediate)))]
+
+       The mode CCM will be chosen as if by SELECT_CC_MODE.
+*/
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "recog.h"
+#include "flags.h"
+#include "basic-block.h"
+#include "tree-pass.h"
+#include "df.h"
+
+
+/* Attempt to install a comparison similar to PAT into CC_SET_INSN.
+   OP_DEST and FLAGS_DEST contain the operation and flags registers
+   set and clobbered, respectively, inside CC_SET_INSN.
+
+   Return true iff CC_SET_INSN modified to contain the comparison is valid.  */
+
+static bool
+maybe_eliminate_compare (rtx insn, rtx cc_set_insn, rtx op_dest, rtx flags_dest)
+{
+  rtx cmp_dest, cmp_reg, cmp_imm, cmp_src, x, pat = PATTERN (insn);
+  enum machine_mode cmp_mode;
+
+  /* PAT should be only a compare pattern.  */
+  if (GET_CODE (pat) != SET)
+    return false;
+  if (GET_CODE (SET_SRC (pat)) != COMPARE)
+    return false;
+
+  /* The destination of the compare should be the same register as the
+     flags register that was clobbered, although we allow the CCmode to
+     differ (for the moment) between the setter and the compare.  */
+  cmp_dest = SET_DEST (pat);
+  cmp_mode = GET_MODE (cmp_dest);
+  if (!REG_P (cmp_dest)
+      || REGNO (cmp_dest) != REGNO (flags_dest)
+      || GET_MODE_CLASS (cmp_mode) != MODE_CC)
+    return false;
+
+  /* The source of the compare should involve the operation destination
+     and a constant.  */
+  cmp_reg = XEXP (SET_SRC (pat), 0);
+  cmp_imm = XEXP (SET_SRC (pat), 1);
+  if (!rtx_equal_p (cmp_reg, op_dest) || !CONSTANT_P (cmp_imm))
+    return false;
+
+  cmp_src = SET_SRC (XVECEXP (PATTERN (cc_set_insn), 0, 0));
+
+#ifdef SELECT_CC_MODE
+  {
+    /* Extract the comparison operator from the flags use.
+       This is assumed to be in the next insn, due to the location
+       of this pass wrt post-reload splitting.  */
+    rtx next = next_nonnote_nondebug_insn (insn);
+    x = next ? single_set (next) : NULL;
+    if (x)
+      {
+        x = SET_SRC (x);
+        if (GET_CODE (x) == IF_THEN_ELSE)
+	  x = XEXP (x, 0);
+
+        if (COMPARISON_P (x)
+	    && REG_P (XEXP (x, 0))
+	    && REGNO (XEXP (x, 0)) == REGNO (cmp_dest))
+	  {
+	    cmp_mode = SELECT_CC_MODE (GET_CODE (x), cmp_src, cmp_imm);
+	    if (cmp_mode != GET_MODE (cmp_dest))
+	      {
+		cmp_dest = gen_rtx_REG (cmp_mode, REGNO (cmp_dest));
+		validate_change (next, &XEXP (x, 0), cmp_dest, true);
+	      }
+	  }
+      }
+  }
+#endif
+
+  /* Generate a new comparison for installation in the setter.  */
+  x = copy_rtx (cmp_src);
+  x = gen_rtx_COMPARE (cmp_mode, x, cmp_imm);
+  x = gen_rtx_SET (VOIDmode, cmp_dest, x);
+
+  /* Succeed if the new instruction is valid.  */
+  validate_change (cc_set_insn, &XVECEXP (PATTERN (cc_set_insn), 0, 1),
+		   x, true);
+
+  return apply_change_group ();
+}
+
+
+/* Recognize an instruction that clobbers a flags register that is of the
+   correct form to transform into a combined operate-and-compare.  If PAT
+   does match, extract OP_DEST and FLAGS_DEST.  */
+
+static bool
+is_cc_setter (rtx insn, rtx *pop_dest, rtx *pflags_dest)
+{
+  rtx op0, op1, op_dest, flags_dest, pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != PARALLEL)
+    return false;
+  if (XVECLEN (pat, 0) != 2)
+    return false;
+
+  op0 = XVECEXP (pat, 0, 0);
+  op1 = XVECEXP (pat, 0, 1);
+  if (GET_CODE (op1) != CLOBBER || GET_CODE (op0) != SET)
+    return false;
+
+  flags_dest = XEXP (op1, 0);
+  if (!REG_P (flags_dest) || GET_MODE_CLASS (GET_MODE (flags_dest)) != MODE_CC)
+    return false;
+    
+  op_dest = SET_DEST (op0);
+  if (!REG_P (op_dest))
+    return false;
+
+  *pop_dest = op_dest;
+  *pflags_dest = flags_dest;
+  return true;
+}
+
+
+/* Perform the compare elimination pass on BB.  */
+
+static void
+compare_elim_bb (basic_block bb)
+{
+  rtx cc_set_insn = NULL, op_dest = NULL, flags_dest = NULL;
+  rtx insn, next;
+  bool finish = false;
+
+  for (insn = BB_HEAD (bb); !finish ; insn = next)
+    {
+      next = NEXT_INSN (insn);
+      finish = (insn == BB_END (bb));
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      /* Assume calls clobber the flags.  */
+      if (CALL_P (insn))
+	{
+	  cc_set_insn = op_dest = flags_dest = NULL;
+	  continue;
+	}
+
+      /* Notice compare insns, and eliminate if possible.  */
+      if (cc_set_insn
+	  && maybe_eliminate_compare (insn, cc_set_insn, op_dest, flags_dest))
+	{
+	  rtx note;
+
+	  /* Success.  Delete the compare insn...  */
+	  delete_insn (insn);
+
+	  /* ... and any notes that are now irrelevant due to multi-stores. */
+	  note = find_reg_note (cc_set_insn, REG_UNUSED, flags_dest);
+	  if (note)
+	    remove_note (cc_set_insn, note);
+	  note = find_reg_note (cc_set_insn, REG_EQUAL, NULL);
+	  if (note)
+	    remove_note (cc_set_insn, note);
+	  note = find_reg_note (cc_set_insn, REG_EQUIV, NULL);
+	  if (note)
+	    remove_note (cc_set_insn, note);
+
+	  cc_set_insn = op_dest = flags_dest = NULL;
+	  continue;
+	}
+
+      /* Notice potential cc_set insns.  */
+      if (is_cc_setter (insn, &op_dest, &flags_dest))
+	{
+	  cc_set_insn = insn;
+	  continue;
+	}
+
+      /* Notice clobbers of the registers involved.  */
+      if (cc_set_insn
+	  && (reg_set_p (op_dest, insn)
+	      || reg_set_p (flags_dest, insn)))
+	{
+	  cc_set_insn = op_dest = flags_dest = NULL;
+	  continue;
+	}
+    }
+}
+
+/* Main entry point to the pass.  */
+
+static unsigned int
+execute_compare_elim_after_reload (void)
+{
+  basic_block bb;
+
+  df_set_flags (DF_DEFER_INSN_RESCAN);
+
+  FOR_EACH_BB (bb)
+    compare_elim_bb (bb);
+
+  return 0;
+}
+
+static bool
+gate_compare_elim_after_reload (void)
+{
+  return flag_compare_elim_after_reload;
+}
+
+struct rtl_opt_pass pass_compare_elim_after_reload =
+{
+ {
+  RTL_PASS,
+  "cmpelim",				/* name */
+  gate_compare_elim_after_reload,	/* gate */
+  execute_compare_elim_after_reload,	/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_NONE,				/* tv_id */
+  0,					/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  TODO_df_finish
+  | TODO_df_verify
+  | TODO_verify_rtl_sharing
+  | TODO_dump_func
+  | TODO_ggc_collect,			/* todo_flags_finish */
+ }
+};
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b68d1dc..55d27fa 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -337,7 +337,7 @@  Objective-C and Objective-C++ Dialects}.
 -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
 -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
 -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol
--fcprop-registers -fcrossjumping @gol
+-fcompare-elim -fcprop-registers -fcrossjumping @gol
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
 -fcx-limited-range @gol
 -fdata-sections -fdce -fdce @gol
@@ -5865,6 +5865,7 @@  compilation time.
 @option{-O} turns on the following optimization flags:
 @gccoptlist{
 -fauto-inc-dec @gol
+-fcompare-elim @gol
 -fcprop-registers @gol
 -fdce @gol
 -fdefer-pop @gol
@@ -7670,6 +7671,18 @@  effect of this flag and how to use it.
 
 Disabled by default.
 
+@item -fcompare-elim
+@opindex fcompare-elim
+After register allocation and post-register allocation instruction splitting,
+identify arithmetic instructions that compute processor flags similar to a
+comparison operation based on that arithmetic.  If possible, eliminate the
+explicit comparison operation.
+
+This pass only applies to certain targets that cannot explicitly represent
+the comparison operation before register allocation is complete.
+
+Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fcprop-registers
 @opindex fcprop-registers
 After register allocation and post-register allocation instruction splitting,
diff --git a/gcc/passes.c b/gcc/passes.c
index 4be61a9..45ffebd 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1017,6 +1017,7 @@  init_optimization_passes (void)
 	  NEXT_PASS (pass_gcse2);
 	  NEXT_PASS (pass_split_after_reload);
 	  NEXT_PASS (pass_implicit_zee);
+	  NEXT_PASS (pass_compare_elim_after_reload);
 	  NEXT_PASS (pass_branch_target_load_optimize1);
 	  NEXT_PASS (pass_thread_prologue_and_epilogue);
 	  NEXT_PASS (pass_rtl_dse2);
diff --git a/gcc/recog.h b/gcc/recog.h
index 217c6e5..5d37100 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -18,6 +18,11 @@  You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#ifndef GCC_RECOG_H
+#define GCC_RECOG_H
+
+#include "insn-config.h"
+
 /* Random number that should be large enough for all purposes.  */
 #define MAX_RECOG_ALTERNATIVES 30
 
@@ -305,3 +310,5 @@  struct insn_data_d
 
 extern const struct insn_data_d insn_data[];
 extern int peep2_current_count;
+
+#endif /* GCC_RECOG_H */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index a87a770..29ac5d6 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -547,6 +547,7 @@  extern struct rtl_opt_pass pass_reorder_blocks;
 extern struct rtl_opt_pass pass_branch_target_load_optimize2;
 extern struct rtl_opt_pass pass_leaf_regs;
 extern struct rtl_opt_pass pass_split_before_sched2;
+extern struct rtl_opt_pass pass_compare_elim_after_reload;
 extern struct rtl_opt_pass pass_sched2;
 extern struct rtl_opt_pass pass_stack_regs;
 extern struct rtl_opt_pass pass_stack_regs_run;