
[28/28] New -fcompare-elim pass.

Message ID 1294691517-19580-29-git-send-email-rth@redhat.com
State New

Commit Message

Richard Henderson Jan. 10, 2011, 8:31 p.m. UTC
From: Richard Henderson <rth@twiddle.net>

Handles arithmetic redundant with compare, as well as compare
redundant with compare.  The latter happens very often with the
use of double-word arithmetic.
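
For example, given source like

	int f (int a, int b)
	{
	  int x = a - b;
	  return x == 0 ? 1 : x;
	}

on a target whose subtraction clobbers the flags, the subtraction already
computes everything the following test needs, so the explicit compare of X
against zero can be deleted.  (Illustrative sketch only; the exact code
depends on the target's patterns.)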
---
 gcc/Makefile.in              |    4 +
 gcc/common.opt               |    4 +
 gcc/compare-elim.c           |  519 ++++++++++++++++++++++++++++++++++++++++++
 gcc/config/mn10300/mn10300.c |    3 +
 gcc/doc/invoke.texi          |   15 ++-
 gcc/doc/tm.texi              |    4 +
 gcc/doc/tm.texi.in           |    2 +
 gcc/opts.c                   |    1 +
 gcc/passes.c                 |    1 +
 gcc/recog.h                  |    5 +
 gcc/target.def               |    9 +
 gcc/tree-pass.h              |    1 +
 12 files changed, 567 insertions(+), 1 deletions(-)
 create mode 100644 gcc/compare-elim.c

Comments

Nathan Froyd Jan. 11, 2011, 6:30 p.m. UTC | #1
On Mon, Jan 10, 2011 at 12:31:57PM -0800, Richard Henderson wrote:
> +static unsigned int
> +execute_compare_elim_after_reload (void)
> +{
> +      /* Eliminate comparisons that are redundant with flags computation.  */
> +      for (i = 0; VEC_iterate (comparison_struct_p, all_compares, i, cmp); ++i)

Please use FOR_EACH_VEC_ELT here.

-Nathan
Richard Henderson Jan. 11, 2011, 7 p.m. UTC | #2
On 01/11/2011 10:30 AM, Nathan Froyd wrote:
> On Mon, Jan 10, 2011 at 12:31:57PM -0800, Richard Henderson wrote:
>> +static unsigned int
>> +execute_compare_elim_after_reload (void)
>> +{
>> +      /* Eliminate comparisons that are redundant with flags computation.  */
>> +      for (i = 0; VEC_iterate (comparison_struct_p, all_compares, i, cmp); ++i)
> 
> Please use FOR_EACH_VEC_ELT here.

Fixed locally, thanks.


r~
Paolo Bonzini Jan. 12, 2011, 8:31 a.m. UTC | #3
> +  /* Find the DEF of the flags register.  It must be there.  */
> +  for (use_rec = DF_INSN_DEFS (insn); ; use_rec++)
> +    {
> +      use = *use_rec;
> +      if (DF_REF_TYPE (use) == DF_REF_REG_DEF
> +	  && DF_REF_REGNO (use) == targetm.flags_regnum)
> +	break;
> +    }

Can you rename these to def/def_rec?  For a moment I thought you were 
not using def-use chains because I found only DF_REF_CHAIN (use).

(My observations below actually override this comment).

> +      /* Note that df does not create use chains that cross basic blocks.

I don't think this is correct, as this is the very thing that is 
responsible for the chains problem's potential quadratic behavior.  Have 
you seen it in practice (the chain that doesn't cross basic blocks, not 
the quadratic behavior)?

This is basically the only case in which you'd rely on non-singleton 
chains, and you're solving it by punting anyway (i.e. via missing_uses). 
  This means this pass could be done just as easily without def-use chains.

During the forward walk you can record the last definition of CC in the 
basic block (which could start at last_cmp for an extended basic block). 
Then, when you walk each insn's uses, look for a CC use.  If you
find it, you know what its last definition is.  That is,
find_flags_uses_in_bb becomes something like maybe_record_flags_use and 
you would call it for all instructions, passing the last CC def.
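
Something along these lines (completely untested, names invented just to
show the shape):

static void
maybe_record_flags_use (struct comparison *cmp, rtx insn, rtx last_cc_def)
{
  df_ref *use_rec, use;

  if (cmp == NULL || cmp->missing_uses)
    return;

  for (use_rec = DF_INSN_USES (insn); (use = *use_rec) != NULL; use_rec++)
    if (DF_REF_REGNO (use) == targetm.flags_regnum)
      {
	/* The reaching definition of this use within the extended basic
	   block is LAST_CC_DEF; if that is not the comparison we are
	   tracking, we cannot account for this use.  */
	if (last_cc_def != cmp->insn)
	  cmp->missing_uses = true;
	else
	  {
	    /* ... record the use as find_flags_uses_in_bb does now ... */
	  }
	return;
      }
}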

> +  if (DF_REF_CHAIN (use) == NULL)
> +    return false;
> +
> +  def = DF_REF_CHAIN (use)->ref;

Here you should probably bail out if the use has multiple reaching 
definitions (i.e. DF_REF_CHAIN (use)->next != NULL).  It probably won't 
happen given how cbranch/cstore splitters work, but you never know.

> +     Note that this doesn't follow the USE-DEF chain from X, but
> +     since we already have to search for the previous clobber of
> +     the flags register, this wouldn't really be a problem.  */
> +
> +  /* Make sure that the flags are not clobbered in between the two
> +     instructions.  Unfortunately, we don't get DEF-DEF chains, so
> +     do this the old fashioned way.  */

Again, this is probably handled better without the use-def chains. 
(Chains are in the DF framework, but are rarely the best 
solution---especially if you're not limiting them to a region, e.g. a loop).

For a full-blown solution, we could generalize the 
singleton-use-def-chains code from fwprop and reuse it here.  Actually, 
I don't think you plan to move instructions in compare-elim, just like 
NOTICE_UPDATE_CC didn't.  So, the forward scan can remember, together 
with the last comparison insn, the last instruction that clobbered the 
flags.  Then whenever you find a comparison, you can already look at the 
shape of the last clobber; you then store if the comparison instruction 
is "mergeable", and if so with which insn.

Non-mergeable comparisons would still be recorded for the sake of 
eliminating duplicates.

> +  /* Succeed if the new instruction is valid.  */
> +  validate_change (dinsn, &XVECEXP (dpat, 0, 1), x, true);
> +  if (!apply_change_group ())
> +    return false;

Here you can test the return value of validate_change directly.

Thanks very much for this work, it is a very nice and readable pass, and 
probably one of the biggest steps towards eradication of cc0.

HTH,

Paolo
Richard Henderson Jan. 12, 2011, 7:44 p.m. UTC | #4
On 01/12/2011 12:31 AM, Paolo Bonzini wrote:
>> +      /* Note that df does not create use chains that cross basic blocks.
> 
> I don't think this is correct, as this is the very thing that is
> responsible for the chains problem's potential quadratic behavior.
> Have you seen it in practice (the chain that doesn't cross basic
> blocks, not the quadratic behavior)?

No, I thought I'd correctly read the code to determine that.  Of course,
with the code that I'm currently generating from mn10300 we'll never see
a cross-block occurrence, but I expect other targets to easily do so.

For instance, in the RX target I expect that we'll generate e.g. UNLT
with an UNORDERED branch and a LT branch, and that we'll only generate
one compare out of the post-reload splitter to do so.

> During the forward walk you can record the last definition of CC in
> the basic block (which could start at last_cmp for an extended basic
> block). Then, when you walk each insn's uses and look for a CC use.
> If you find it, you know what it's last definition is. That is,
> find_flags_uses_in_bb becomes something like maybe_record_flags_use
> and you would call it for all instructions, passing the last CC def.

An interesting idea.

>> +  if (DF_REF_CHAIN (use) == NULL)
>> +    return false;
>> +
>> +  def = DF_REF_CHAIN (use)->ref;
> 
> Here you should probably bail out if the use has multiple reaching
> definitions (i.e. DF_REF_CHAIN (use)->next != NULL). It probably
> won't happen given how cbranch/cstore splitters work, but you never
> know.

Good idea, though this has nothing to do with cbranch/cstore.  This
is finding the DEF for the register use inside the compare.

>> +     Note that this doesn't follow the USE-DEF chain from X, but
>> +     since we already have to search for the previous clobber of
>> +     the flags register, this wouldn't really be a problem.  */
>> +
>> +  /* Make sure that the flags are not clobbered in between the two
>> +     instructions.  Unfortunately, we don't get DEF-DEF chains, so
>> +     do this the old fashioned way.  */
> 
> Again, this is probably handled better without the use-def chains.
> (Chains are in the DF framework, but are rarely the best
> solution---especially if you're not limiting them to a region, e.g. a
> loop).

Err, I'm not using use-def chains for this.  I'm using reg_set_between_p.
I'm a bit confused about your statement here.

>> +  /* Succeed if the new instruction is valid.  */
>> +  validate_change (dinsn, &XVECEXP (dpat, 0, 1), x, true);
>> +  if (!apply_change_group ())
>> +    return false;
> 
> Here you can test the return value of validate_change directly.

True enough.

I'm a bit confused about the stance in regards to chains.  Should I 
attempt to avoid them entirely, so that I never add the problem?

I guess I can implement try_eliminate_compare by:

  * In find_comparisons_in_bb, remember the previous insn that
    clobbers CC_REG.  Record that in struct comparison.prev_clobber.

  * In try_eliminate_compare, if prev_clobber is not set, then we
    don't know where CC_REG was previously set, so we cannot eliminate.

  * The SET in prev_clobber must be the comparison.in_a register.
    If it isn't, then we cannot eliminate.

  * Use reg_set_between_p to verify that comparison.in_a is not
    modified between prev_clobber and comparison.insn.

It's that last step that seems a bit... odd, but not wrong exactly.
But is it odd enough to warrant using chains?
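
For concreteness, the outline above comes out roughly like so (untested
sketch; the CC_MODE selection and the actual rewrite of PREV_CLOBBER are
elided):

static bool
try_eliminate_compare (struct comparison *cmp)
{
  rtx x;

  /* We must have seen a usable clobber before the compare.  */
  if (cmp->prev_clobber == NULL)
    return false;

  /* The SET inside PREV_CLOBBER must set IN_A...  */
  x = XVECEXP (PATTERN (cmp->prev_clobber), 0, 0);
  if (!rtx_equal_p (SET_DEST (x), cmp->in_a))
    return false;

  /* ... and IN_A must not be modified in between.  */
  if (reg_set_between_p (cmp->in_a, cmp->prev_clobber, cmp->insn))
    return false;

  /* Here we would select a CC mode, install the flags computation
     into PREV_CLOBBER, and delete CMP->insn.  */
  return true;
}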



r~
Paolo Bonzini Jan. 13, 2011, 8:34 a.m. UTC | #5
On 01/12/2011 08:44 PM, Richard Henderson wrote:
> On 01/12/2011 12:31 AM, Paolo Bonzini wrote:
>>> +      /* Note that df does not create use chains that cross basic blocks.
>>
>> I don't think this is correct, as this is the very thing that is
>> responsible for the chains problem's potential quadratic behavior.
>> Have you seen it in practice (the chain that doesn't cross basic
>> blocks, not the quadratic behavior)?
>
> No, I thought I'd correctly read the code to determine that.

Chains are based on the output of reaching definitions; they do not care 
about whether the defs are in the same basic block or not.

>>> +  if (DF_REF_CHAIN (use) == NULL)
>>> +    return false;
>>> +
>>> +  def = DF_REF_CHAIN (use)->ref;
>>
>> Here you should probably bail out if the use has multiple reaching
>> definitions (i.e. DF_REF_CHAIN (use)->next != NULL). It probably
>> won't happen given how cbranch/cstore splitters work, but you never
>> know.
>
> Good idea, though this has nothing to do with cbranch/cstore.  This
> is finding the DEF for the register use inside the compare.

Oops, right.

>>> +     Note that this doesn't follow the USE-DEF chain from X, but
>>> +     since we already have to search for the previous clobber of
>>> +     the flags register, this wouldn't really be a problem.  */
>>> +
>>> +  /* Make sure that the flags are not clobbered in between the two
>>> +     instructions.  Unfortunately, we don't get DEF-DEF chains, so
>>> +     do this the old fashioned way.  */
>>
>> Again, this is probably handled better without the use-def chains.
>> (Chains are in the DF framework, but are rarely the best
>> solution---especially if you're not limiting them to a region, e.g. a
>> loop).
>
> Err, I'm not using use-def chains for this.  I'm using reg_set_between_p.
> I'm a bit confused about your statement here.

You're right, what I meant was "since you are anyway resorting to the 
old fashioned way, you can probably handle everything better without the 
use-def chains".

> I'm a bit confused about the stance in regards to chains.  Should I
> attempt to avoid them entirely, so that I never add the problem?

It's an expensive problem when applied to the entire function, so it is 
better to avoid it.  I'm not saying this cannot be delayed to 4.7 if the 
pass goes in now (it's almost a target-specific pass now, so it probably 
could).

> I guess I can implement try_eliminate_compare by:
>
>    * In find_comparisons_in_bb, remember the previous insn that
>      clobbers CC_REG.  Record that in struct comparison.prev_clobber.
>
>    * In try_eliminate_compare, if prev_clobber is not set, then we
>      don't know where CC_REG was previously set, so we cannot eliminate.
>
>    * The SET in prev_clobber must be the comparison.in_a register.
>      If it isn't, then we cannot eliminate.
>
>    * Use reg_set_between_p to verify that comparison.in_a is not
>      modified between prev_clobber and comparison.insn.

Exactly.

> It's that last step that seems a bit... odd, but not wrong exactly.
> But is it odd enough to warrant using chains?

I'd say no because you're doing the same right now.  USE-DEF chains get 
you to a clobber, and then reg_set_between_p confirms whether that 
clobber is the prev_clobber.

If you really want to remove the reg_set_between_p, you can of course 
record the SET of prev_clobber, and reset prev_clobber if an insn sets 
that register.  But I think it's more clumsy and not enough more 
efficient than what you outlined.

Paolo
Richard Henderson Jan. 17, 2011, 10:26 p.m. UTC | #6
Version 2, based on Paolo's comments from v1.  Changes:

(0) The USE-DEF and DEF-USE chains are gone.

(1) A PREV_CLOBBER insn is recorded during the single scan
    of the function.  This insn is the insn previous to the
    compare which (A) clobbers the flags and (B) is of a
    shape amenable to compare elimination.

    If the insn previous to the compare that clobbers the
    flags is not amenable to elimination -- e.g. a call --
    we don't bother recording it, and later know that we
    shouldn't try.

(2) Uses of the compare are also found during that single scan.
    Some, ahem, highly questionable assumptions that had been
    made about how DEF-USE chains are or are not created are gone.
    I check the DF_LIVE_BB_INFO now, combined with the shape of
    the CFG to determine if there's a use of the flags that 
    escapes the extended basic block.

(3) try_eliminate_compare has been rewritten along the lines we
    discussed previously, although instead of reg_set_between_p...

(4) Porting the RX target to the compare-elimination pass was
    very helpful.  It showed that there's a phase ordering problem
    that fortunately can be worked around with a little bit of effort.

    Consider libgcc _absvdi3.o, as of split2:

(insn 76 75 74 4 (parallel [
            (set (reg:SI 4 r4 [38])
                (minus:SI (minus:SI (reg:SI 4 r4 [38])
                        (reg:SI 2 r2 [orig:37 a+4 ] [37]))
                    (geu:SI (reg:CC 16 cc)
                        (const_int 0 [0]))))
            (clobber (reg:CC 16 cc))
        ]) libgcc2.c:265 76 {sbb_internal}
     (nil))

(insn 74 76 77 4 (set (reg:SI 2 r2 [orig:33 w+4 ] [33])
        (reg:SI 4 r4 [38])) libgcc2.c:265 22 {*movsi_internal}
     (nil))

(insn 77 74 78 4 (set (reg:CC 16 cc)
        (compare:CC (reg:SI 2 r2 [orig:33 w+4 ] [33])
            (const_int 0 [0]))) libgcc2.c:267 1 {*cmpsi}
     (nil))

    For RX, subtract-with-borrow (sbb) is a two-address insn, and
    so RA has little choice but to add an output reload (insn 74)
    in order to satisfy the matching constraint and put the result
    into the proper register.

    If we were to run this pass after pass_cprop_hardreg, this
    sequence would be tidied up such that the compare (insn 77)
    would have R4 as an input, and so we would immediately see the
    corresponding arithmetic (insn 76) and perform the elimination.

    However, pass_cprop_hardreg currently runs much later.  In
    particular, after pass_peephole2.  Now, peep2 is where we would
    like to make transformations like "mov 0,r" -> "xor r,r" iff
    the flags are not already live at that location.  In order to
    satisfy both, we'd have to shuffle lots of passes around with
    the attendant problem of determining if we've pessimized 
    something else.

    Instead, simply look for copy insns while we're stepping back
    looking to be sure that the input to the compare isn't 
    changed since PREV_CLOBBER.


Comments?


r~
/* Post-reload compare elimination.
   Copyright (C) 2010, 2011
   Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* There is a set of targets whose general-purpose move or addition
   instructions clobber the flags.  These targets cannot split their
   CBRANCH/CSTORE etc patterns before reload is complete, lest reload
   itself insert instructions in between the flags setter and user.
   Because these targets cannot split the compare from the use, they
   cannot make use of the comparison elimination offered by the combine pass.

   This is a small pass intended to provide comparison elimination similar to
   what is available via NOTICE_UPDATE_CC for cc0 targets.  This should help
   encourage cc0 targets to convert to an explicit post-reload representation
   of the flags.

   This pass assumes:

   (0) CBRANCH/CSTORE etc have been split in pass_split_after_reload.

   (1) All comparison patterns are represented as

	[(set (reg:CC) (compare:CC (reg) (immediate)))]

   (2) All insn patterns that modify the flags are represented as

	[(set (reg) (operation)
	 (clobber (reg:CC))]

   (3) If an insn of form (2) can usefully set the flags, there is
       another pattern of the form

	[(set (reg) (operation)
	 (set (reg:CCM) (compare:CCM (operation) (immediate)))]

       The mode CCM will be chosen as if by SELECT_CC_MODE.
*/

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "tm_p.h"
#include "insn-config.h"
#include "recog.h"
#include "flags.h"
#include "basic-block.h"
#include "tree-pass.h"
#include "target.h"
#include "df.h"
#include "domwalk.h"


/* These structures describe a comparison and how it is used.  */

/* The choice of maximum 3 uses comes from wanting to eliminate the two
   duplicate compares from a three-way branch on the sign of a value.
   This is also sufficient to eliminate the duplicate compare against the
   high-part of a double-word comparison.  */
#define MAX_CMP_USE 3

struct comparison_use
{
  /* The instruction in which the result of the compare is used.  */
  rtx insn;
  /* The location of the flags register within the use.  */
  rtx *loc;
  /* The comparison code applied against the flags register.  */
  enum rtx_code code;
};

struct comparison
{
  /* The comparison instruction.  */
  rtx insn;

  /* The insn prior to the comparison insn that clobbers the flags.  */
  rtx prev_clobber;

  /* The two values being compared.  These will be either REGs or
     constants.  */
  rtx in_a, in_b;

  /* Information about how this comparison is used.  */
  struct comparison_use uses[MAX_CMP_USE];

  /* The original CC_MODE for this comparison.  */
  enum machine_mode orig_mode;

  /* The number of uses identified for this comparison.  */
  unsigned short n_uses;

  /* True if not all uses of this comparison have been identified.
     This can happen either for overflowing the array above, or if
     the flags register is used in some unusual context.  */
  bool missing_uses;

  /* True if its inputs are still valid at the end of the block.  */
  bool inputs_valid;
};
  
typedef struct comparison *comparison_struct_p;
DEF_VEC_P(comparison_struct_p);
DEF_VEC_ALLOC_P(comparison_struct_p, heap);

static VEC(comparison_struct_p, heap) *all_compares;

/* Look for a "conforming" comparison, as defined above.  If valid, return
   the rtx for the COMPARE itself.  */

static rtx
conforming_compare (rtx insn)
{
  rtx set, src, dest;

  set = single_set (insn);
  if (set == NULL)
    return NULL;

  src = SET_SRC (set);
  if (GET_CODE (src) != COMPARE)
    return NULL;

  dest = SET_DEST (set);
  if (!REG_P (dest) || REGNO (dest) != targetm.flags_regnum)
    return NULL;

  if (REG_P (XEXP (src, 0))
      && (REG_P (XEXP (src, 1)) || CONSTANT_P (XEXP (src, 1))))
    return src;

  return NULL;
}

/* Look for a pattern of the "correct" form for an insn with a flags clobber
   for which we may be able to eliminate a compare later.  We're not looking
   to validate any inputs at this time, merely see that the basic shape is
   correct.  The term "arithmetic" may be somewhat misleading...  */

static bool
arithmetic_flags_clobber_p (rtx insn)
{
  rtx pat, x;

  if (!NONJUMP_INSN_P (insn))
    return false;
  pat = PATTERN (insn);
  if (extract_asm_operands (pat))
    return false;

  if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) == 2)
    {
      x = XVECEXP (pat, 0, 0);
      if (GET_CODE (x) != SET)
	return false;
      x = SET_DEST (x);
      if (!REG_P (x))
	return false;

      x = XVECEXP (pat, 0, 1);
      if (GET_CODE (x) == CLOBBER)
	{
	  x = XEXP (x, 0);
	  if (REG_P (x) && REGNO (x) == targetm.flags_regnum)
	    return true;
	}
    }

  return false;
}

/* Look for uses of FLAGS in INSN.  If we find one we can analyze, record
   it in CMP; otherwise indicate that we've missed a use.  */

static void
find_flags_uses_in_insn (struct comparison *cmp, rtx insn)
{
  df_ref *use_rec, use;

  /* If we've already lost track of uses, don't bother collecting more.  */
  if (cmp->missing_uses)
    return;

  /* Find a USE of the flags register.  */
  for (use_rec = DF_INSN_USES (insn); (use = *use_rec) != NULL; use_rec++)
    if (DF_REF_REGNO (use) == targetm.flags_regnum)
      {
	rtx x, *loc;

	/* If this is an unusual use, quit.  */
	if (DF_REF_TYPE (use) != DF_REF_REG_USE)
	  goto fail;

	/* If we've run out of slots to record uses, quit.  */
	if (cmp->n_uses == MAX_CMP_USE)
	  goto fail;

	/* Unfortunately, the location of the flags register, while
	   present in the reference structure, doesn't help.  We need
	   to find the comparison code that is outer to the actual
	   flags use.  */
	loc = DF_REF_LOC (use);
	x = PATTERN (insn);
	if (GET_CODE (x) == PARALLEL)
	  x = XVECEXP (x, 0, 0);
	x = SET_SRC (x);
	if (GET_CODE (x) == IF_THEN_ELSE)
	  x = XEXP (x, 0);
	if (COMPARISON_P (x)
	    && loc == &XEXP (x, 0)
	    && XEXP (x, 1) == const0_rtx)
	  {
	    /* We've found a use of the flags that we understand.  */
	    struct comparison_use *cuse = &cmp->uses[cmp->n_uses++];
	    cuse->insn = insn;
	    cuse->loc = loc;
	    cuse->code = GET_CODE (x);
	  }
	else
	  goto fail;
      }
  return;

 fail:
  /* We failed to recognize this use of the flags register.  */
  cmp->missing_uses = true;
}

/* Identify comparison instructions within BB.  If the last compare in the BB
   is valid at the end of the block, install it in BB->AUX.  Called via 
   walk_dominator_tree.  */

static void
find_comparisons_in_bb (struct dom_walk_data *data ATTRIBUTE_UNUSED,
			basic_block bb)
{
  struct comparison *last_cmp;
  rtx insn, next, last_clobber;
  bool last_cmp_valid;
  bitmap killed;

  killed = BITMAP_ALLOC (NULL);

  /* The last comparison that was made.  Will be reset to NULL
     once the flags are clobbered.  */
  last_cmp = NULL;

  /* True iff the last comparison has not been clobbered, nor
     have its inputs.  Used to eliminate duplicate compares.  */
  last_cmp_valid = false;

  /* The last insn that clobbered the flags, if that insn is of
     a form that may be valid for eliminating a following compare.
     To be reset to NULL once the flags are set otherwise.  */
  last_clobber = NULL;

  /* Propagate the last live comparison throughout the extended basic block.  */
  if (single_pred_p (bb))
    {
      last_cmp = (struct comparison *) single_pred (bb)->aux;
      if (last_cmp)
	last_cmp_valid = last_cmp->inputs_valid;
    }

  for (insn = BB_HEAD (bb); insn; insn = next)
    {
      rtx src;

      next = (insn == BB_END (bb) ? NULL_RTX : NEXT_INSN (insn));
      if (!NONDEBUG_INSN_P (insn))
	continue;

      src = conforming_compare (insn);
      if (src)
	{
	  /* Eliminate a compare that's redundant with the previous.  */
	  if (last_cmp_valid
	      && rtx_equal_p (last_cmp->in_a, XEXP (src, 0))
	      && rtx_equal_p (last_cmp->in_b, XEXP (src, 1)))
	    {
	      delete_insn (insn);
	      continue;
	    }

          last_cmp = XCNEW (struct comparison);
	  last_cmp->insn = insn;
	  last_cmp->prev_clobber = last_clobber;
	  last_cmp->in_a = XEXP (src, 0);
	  last_cmp->in_b = XEXP (src, 1);
	  last_cmp->orig_mode = GET_MODE (SET_DEST (single_set (insn)));
	  VEC_safe_push (comparison_struct_p, heap, all_compares, last_cmp);

	  /* It's unusual, but be prepared for comparison patterns that
	     also clobber an input, or perhaps a scratch.  */
	  last_clobber = NULL;
	  last_cmp_valid = true;
	  bitmap_clear (killed);
	  df_simulate_find_defs (insn, killed);
	}
      else
	{
	  /* Notice if this instruction kills the flags register.  */
	  bitmap_clear (killed);
	  df_simulate_find_defs (insn, killed);
	  if (bitmap_bit_p (killed, targetm.flags_regnum))
	    {
	      /* See if this insn could be the "clobber" that eliminates
		 a future comparison.   */
	      if (arithmetic_flags_clobber_p (insn))
		last_clobber = insn;
	      else
		last_clobber = NULL;

	      /* In either case, the previous compare is no longer valid.  */
	      last_cmp = NULL;
	      last_cmp_valid = false;
	      continue;
	    }

	  /* Notice if this insn uses the flags register.  */
	  if (last_cmp)
	    find_flags_uses_in_insn (last_cmp, insn);
	}

      /* Notice if any of the inputs to the comparison have changed.  */
      if (last_cmp
	  && (bitmap_bit_p (killed, REGNO (last_cmp->in_a))
	      || (REG_P (last_cmp->in_b)
		  && bitmap_bit_p (killed, REGNO (last_cmp->in_b)))))
	last_cmp_valid = false;
    }

  BITMAP_FREE (killed);

  /* Remember the live comparison for subsequent members of
     the extended basic block.  */
  if (last_cmp)
    {
      bb->aux = last_cmp;
      last_cmp->inputs_valid = last_cmp_valid;

      /* Look to see if the flags register is live outgoing here, and
	 incoming to any successor not part of the extended basic block.  */
      if (bitmap_bit_p (&DF_LIVE_BB_INFO (bb)->out, targetm.flags_regnum))
	{
	  edge e;
	  edge_iterator ei;

	  FOR_EACH_EDGE (e, ei, bb->succs)
	    {
	      basic_block dest = e->dest;
	      if (bitmap_bit_p (&DF_LIVE_BB_INFO (dest)->in,
				targetm.flags_regnum)
		  && !single_pred_p (dest))
		{
		  last_cmp->missing_uses = true;
		  break;
		}
	    }
	}
    }
}

/* Find all comparisons in the function.  */

static void
find_comparisons (void)
{
  struct dom_walk_data data;

  memset (&data, 0, sizeof(data));
  data.dom_direction = CDI_DOMINATORS;
  data.before_dom_children = find_comparisons_in_bb;

  calculate_dominance_info (CDI_DOMINATORS);

  init_walk_dominator_tree (&data);
  walk_dominator_tree (&data, ENTRY_BLOCK_PTR);
  fini_walk_dominator_tree (&data);

  clear_aux_for_blocks ();
  free_dominance_info (CDI_DOMINATORS);
}

/* Select an alternate CC_MODE for a comparison insn comparing A and B.
   Note that inputs are almost certainly different than the IN_A and IN_B
   stored in CMP -- we're called while attempting to eliminate the compare
   after all.  Return the new FLAGS rtx if successful, else return NULL.  */

static rtx
maybe_select_cc_mode (struct comparison *cmp, rtx a, rtx b)
{
  enum machine_mode sel_mode;
  const int n = cmp->n_uses;
  rtx flags = NULL;

#ifndef SELECT_CC_MODE
  /* Minimize code differences when this target macro is undefined.  */
  return NULL;
#define SELECT_CC_MODE(A,B,C) (gcc_unreachable (), VOIDmode)
#endif

  /* If we don't have access to all of the uses, we can't validate.  */
  if (cmp->missing_uses || n == 0)
    return NULL;

  /* Find a new mode that works for all of the uses.  Special case the
     common case of exactly one use.  */
  if (n == 1)
    {
      sel_mode = SELECT_CC_MODE (cmp->uses[0].code, a, b);
      if (sel_mode != cmp->orig_mode)
	{
	  flags = gen_rtx_REG (sel_mode, targetm.flags_regnum);
	  validate_change (cmp->uses[0].insn, cmp->uses[0].loc, flags, true);
	}
    }
  else
    {
      int i;

      sel_mode = SELECT_CC_MODE (cmp->uses[0].code, a, b);
      for (i = 1; i < n; ++i)
	{
	  enum machine_mode new_mode;
	  new_mode = SELECT_CC_MODE (cmp->uses[i].code, a, b);
	  if (new_mode != sel_mode)
	    {
	      sel_mode = targetm.cc_modes_compatible (sel_mode, new_mode);
	      if (sel_mode == VOIDmode)
		return NULL;
	    }
	}
      
      if (sel_mode != cmp->orig_mode)
	{
	  flags = gen_rtx_REG (sel_mode, targetm.flags_regnum);
	  for (i = 0; i < n; ++i)
	    validate_change (cmp->uses[i].insn, cmp->uses[i].loc, flags, true);
	}
    }

  return flags;
}

/* Attempt to find an instruction prior to CMP that can be used to compute the
   same flags value as the comparison itself.  Return true if successful.  */

static bool
try_eliminate_compare (struct comparison *cmp)
{
  rtx x, insn, bb_head, flags, in_a, cmp_src;

  /* We must have found an interesting "clobber" preceding the compare.  */
  if (cmp->prev_clobber == NULL)
    return false;

  /* ??? For the moment we don't handle comparisons for which IN_B
     is a register.  We accepted these during initial comparison 
     recognition in order to eliminate duplicate compares.
     An improvement here would be to handle x = a - b; if (a < b).  */
  if (!CONSTANT_P (cmp->in_b))
    return false;

  /* Verify that PREV_CLOBBER defines IN_A, and that IN_A is not clobbered
     in between.  Given that this target requires this pass, we can assume
     that most insns do clobber the flags, and so the distance between the
     compare and the clobber is likely to be small.  */
  /* ??? This is one point at which one could argue that DF_REF_CHAIN would
     be useful, but it is thought to be too heavy-weight a solution here.  */

  in_a = cmp->in_a;
  insn = cmp->insn;
  bb_head = BB_HEAD (BLOCK_FOR_INSN (insn));
  for (insn = PREV_INSN (insn);
       insn != cmp->prev_clobber;
       insn = PREV_INSN (insn))
    {
      const int abnormal_flags
	= (DF_REF_CONDITIONAL | DF_REF_PARTIAL | DF_REF_MAY_CLOBBER
	   | DF_REF_MUST_CLOBBER | DF_REF_SIGN_EXTRACT
	   | DF_REF_ZERO_EXTRACT | DF_REF_STRICT_LOW_PART
	   | DF_REF_PRE_POST_MODIFY);

      df_ref *def_rec, def;

      /* Note that the BB_HEAD is always either a note or a label, but in
	 any case it means that IN_A is defined outside the block.  */
      if (insn == bb_head)
	return false;
      if (NOTE_P (insn) || DEBUG_INSN_P (insn))
	continue;

      /* Find a possible def of IN_A in INSN.  */
      for (def_rec = DF_INSN_DEFS (insn); (def = *def_rec) != NULL; def_rec++)
	if (DF_REF_REGNO (def) == REGNO (in_a))
	  break;

      /* No definitions of IN_A; continue searching.  */
      if (def == NULL)
	continue;

      /* Bail if this is not a totally normal set of IN_A.  */
      if (DF_REF_IS_ARTIFICIAL (def))
	return false;
      if (DF_REF_FLAGS (def) & abnormal_flags)
	return false;

      /* We've found an insn between the compare and the clobber that sets
	 IN_A.  Given that pass_cprop_hardreg has not yet run, we still find
	 situations in which we can usefully look through a copy insn.  */
      x = single_set (insn);
      if (x == NULL)
	return false;
      in_a = SET_SRC (x);
      if (!REG_P (in_a))
	return false;
    }

  /* We've reached PREV_CLOBBER without finding a modification of IN_A.
     Validate that PREV_CLOBBER itself does in fact refer to IN_A.  Do
     recall that we've already validated the shape of PREV_CLOBBER.  */
  x = XVECEXP (PATTERN (insn), 0, 0);
  if (!rtx_equal_p (SET_DEST (x), in_a))
    return false;
  cmp_src = SET_SRC (x);
  
  /* Determine if we ought to use a different CC_MODE here.  */
  flags = maybe_select_cc_mode (cmp, cmp_src, cmp->in_b);
  if (flags == NULL)
    flags = gen_rtx_REG (cmp->orig_mode, targetm.flags_regnum);

  /* Generate a new comparison for installation in the setter.  */
  x = copy_rtx (cmp_src);
  x = gen_rtx_COMPARE (GET_MODE (flags), x, cmp->in_b);
  x = gen_rtx_SET (VOIDmode, flags, x);

  /* Succeed if the new instruction is valid.  */
  validate_change (insn, &XVECEXP (PATTERN (insn), 0, 1), x, true);
  if (!apply_change_group ())
    return false;
 
  /* Success.  Delete the compare insn...  */
  delete_insn (cmp->insn);

  /* ... and any notes that are now irrelevant due to multi-stores. */
  x = find_regno_note (insn, REG_UNUSED, targetm.flags_regnum);
  if (x)
    remove_note (insn, x);
  x = find_reg_note (insn, REG_EQUAL, NULL);
  if (x)
    remove_note (insn, x);
  x = find_reg_note (insn, REG_EQUIV, NULL);
  if (x)
    remove_note (insn, x);

  return true;
}

/* Main entry point to the pass.  */

static unsigned int
execute_compare_elim_after_reload (void)
{
  df_set_flags (DF_DEFER_INSN_RESCAN);
  df_live_add_problem ();
  df_analyze ();

  gcc_checking_assert (all_compares == NULL);

  /* Locate all comparisons and their uses, and eliminate duplicates.  */
  find_comparisons ();
  if (all_compares)
    {
      struct comparison *cmp;
      size_t i;

      /* Eliminate comparisons that are redundant with flags computation.  */
      FOR_EACH_VEC_ELT (comparison_struct_p, all_compares, i, cmp)
	{
	  try_eliminate_compare (cmp);
	  XDELETE (cmp);
	}

      VEC_free (comparison_struct_p, heap, all_compares);
      all_compares = NULL;

      df_analyze ();
    }

  return 0;
}

static bool
gate_compare_elim_after_reload (void)
{
  return (flag_compare_elim_after_reload
	  && targetm.flags_regnum != INVALID_REGNUM);
}

struct rtl_opt_pass pass_compare_elim_after_reload =
{
 {
  RTL_PASS,
  "cmpelim",				/* name */
  gate_compare_elim_after_reload,	/* gate */
  execute_compare_elim_after_reload,	/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_NONE,				/* tv_id */
  0,					/* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  0,					/* todo_flags_start */
  TODO_df_finish
  | TODO_df_verify
  | TODO_verify_rtl_sharing
  | TODO_dump_func
  | TODO_ggc_collect			/* todo_flags_finish */
 }
};
Paolo Bonzini Jan. 18, 2011, 8:45 a.m. UTC | #7
On 01/17/2011 11:26 PM, Richard Henderson wrote:
> Version 2, based on Paolo's comments from v1.
>
> Comments?

Just a few nits:

1) a comment above maybe_select_cc_mode and above its call, stating that 
the function starts a change group, would be nice.  I missed this in the 
previous review.  Given the very nice comments in the code, it would be 
a nice touch.

2) Here:

>       src = conforming_compare (insn);
>       if (src)
> 	{
> 	  ...
> 	  bitmap_clear (killed);
> 	  df_simulate_find_defs (insn, killed);
> 	}
>       else
> 	{
> 	  /* Notice if this instruction kills the flags register.  */
> 	  bitmap_clear (killed);
> 	  df_simulate_find_defs (insn, killed);

The bitmap_clear/df_simulate_find_defs can be hoisted above the if.  The 
only case where we don't use the result is when we delete a redundant 
compare, and it's definitely not the common case.  This would make it 
clearer that KILLED is not used for any kind of real "simulation", but 
only as a quick way to query the array of DEFs for the presence of a 
register.
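
I.e. something like (sketch):

      bitmap_clear (killed);
      df_simulate_find_defs (insn, killed);

      src = conforming_compare (insn);
      if (src)
	{
	  /* ... as now, minus the bitmap_clear/df_simulate_find_defs ...  */
	}
      else if (bitmap_bit_p (killed, targetm.flags_regnum))
	{
	  /* Flags clobbered: record or reset last_clobber and invalidate
	     the last comparison, as now.  */
	}
      else if (last_cmp)
	find_flags_uses_in_insn (last_cmp, insn);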

3) Just to take back that clock cycle lost in the above change, here:

>       if (last_cmp
> 	  && (bitmap_bit_p (killed, REGNO (last_cmp->in_a))
> 	      || (REG_P (last_cmp->in_b)
> 		  && bitmap_bit_p (killed, REGNO (last_cmp->in_b)))))
> 	last_cmp_valid = false;

You can change LAST_CMP to LAST_CMP_VALID.  LAST_CMP will never be NULL 
as long as LAST_CMP_VALID is true, but you may save some bitmap accesses.

4) Not a task for now, but perhaps some DF functions could be provided 
also in a form that operates on HARD_REG_SETs for improved performance. 
  Just a reminder to myself.


Besides this, "seems to be perfect".

Paolo

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index de3bde9..f421d5a 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1205,6 +1205,7 @@  OBJS-common = \
 	cfgrtl.o \
 	combine.o \
 	combine-stack-adj.o \
+	compare-elim.o \
 	convert.o \
 	coverage.o \
 	cse.o \
@@ -3352,6 +3353,9 @@  combine-stack-adj.o : combine-stack-adj.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
    $(TM_H) $(RTL_H) insn-config.h $(TIMEVAR_H) $(TREE_PASS_H) \
    $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
    $(EXPR_H) $(BASIC_BLOCK_H) $(DIAGNOSTIC_CORE_H) $(TM_P_H) $(DF_H) $(EXCEPT_H) reload.h
+compare-elim.o : compare-elim.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+   $(TM_H) $(RTL_H) $(TM_P_H) $(RECOG_H) $(FLAGS_H) $(BASIC_BLOCK_H) \
+   $(TREE_PASS_H) $(DF_H)
 ddg.o : ddg.c $(DDG_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TARGET_H) \
    $(DIAGNOSTIC_CORE_H) $(RTL_H) $(TM_P_H) $(REGS_H) $(FUNCTION_H) \
    $(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \
diff --git a/gcc/common.opt b/gcc/common.opt
index 32df6fc..5b01363 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -853,6 +853,10 @@  fcompare-debug-second
 Common Driver RejectNegative Var(flag_compare_debug)
 Run only the second compilation of -fcompare-debug
 
+fcompare-elim
+Common Report Var(flag_compare_elim_after_reload) Optimization
+Perform comparison elimination after register allocation has finished
+
 fconserve-stack
 Common Var(flag_conserve_stack) Optimization
 Do not perform optimizations increasing noticeably stack usage
diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
new file mode 100644
index 0000000..a1d91cc
--- /dev/null
+++ b/gcc/compare-elim.c
@@ -0,0 +1,519 @@ 
+/* Post-reload compare elimination.
+   Copyright (C) 2010, 2011
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* There is a set of targets whose general-purpose move or addition
+   instructions clobber the flags.  These targets cannot split their
+   CBRANCH/CSTORE etc patterns before reload is complete, lest reload
+   itself insert instructions in between the flags setter and user.
+   Because these targets cannot split the compare from the use, they
+   cannot make use of the comparison elimination offered by the combine pass.
+
+   This is a small pass intended to provide comparison elimination similar to
+   what is available via NOTICE_UPDATE_CC for cc0 targets.  This should help
+   encourage cc0 targets to convert to an explicit post-reload representation
+   of the flags.
+
+   This pass assumes:
+
+   (0) CBRANCH/CSTORE etc have been split in pass_split_after_reload.
+
+   (1) All comparison patterns are represented as
+
+	[(set (reg:CC) (compare:CC (reg) (immediate)))]
+
+   (2) All insn patterns that modify the flags are represented as
+
+	[(set (reg) (operation)
+	 (clobber (reg:CC))]
+
+   (3) If an insn of form (2) can usefully set the flags, there is
+       another pattern of the form
+
+	[(set (reg) (operation)
+	 (set (reg:CCM) (compare:CCM (operation) (immediate)))]
+
+       The mode CCM will be chosen as if by SELECT_CC_MODE.
+*/
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "insn-config.h"
+#include "recog.h"
+#include "flags.h"
+#include "basic-block.h"
+#include "tree-pass.h"
+#include "target.h"
+#include "df.h"
+#include "domwalk.h"
+
+
+/* These structures describe a comparison and how it is used.  */
+
+/* The choice of maximum 3 uses comes from wanting to eliminate the two
+   duplicate compares from a three-way branch on the sign of a value.
+   This is also sufficient to eliminate the duplicate compare against the
+   high-part of a double-word comparison.  */
+#define MAX_CMP_USE 3
+
+struct comparison_use
+{
+  /* The instruction in which the result of the compare is used.  */
+  rtx insn;
+  /* The location of the flags register within the use.  */
+  rtx *loc;
+  /* The comparison code applied against the flags register.  */
+  enum rtx_code code;
+};
+
+struct comparison
+{
+  /* The comparison instruction.  */
+  rtx insn;
+
+  /* The two values being compared.  These will be either REGs or
+     constants.  */
+  rtx in_a, in_b;
+
+  /* Information about how this comparison is used.  */
+  struct comparison_use uses[MAX_CMP_USE];
+
+  /* The original CC_MODE for this comparison.  */
+  enum machine_mode orig_mode;
+
+  /* The number of uses identified for this comparison.  */
+  unsigned short n_uses;
+
+  /* True if not all uses of this comparison have been identified.
+     This can happen either for overflowing the array above, or if
+     the flags register is used in some unusual context.  */
+  bool missing_uses;
+};
+  
+typedef struct comparison *comparison_struct_p;
+DEF_VEC_P(comparison_struct_p);
+DEF_VEC_ALLOC_P(comparison_struct_p, heap);
+
+static VEC(comparison_struct_p, heap) *all_compares;
+
+static void
+find_flags_uses_in_bb (struct comparison *cmp,
+		       basic_block bb ATTRIBUTE_UNUSED, rtx insn)
+{
+  df_ref *use_rec, use;
+  struct df_link *chain;
+
+  /* If we've already lost track of uses, don't bother collecting more.  */
+  if (cmp->missing_uses)
+    return;
+
+  /* Find the DEF of the flags register.  It must be there.  */
+  for (use_rec = DF_INSN_DEFS (insn); ; use_rec++)
+    {
+      use = *use_rec;
+      if (DF_REF_TYPE (use) == DF_REF_REG_DEF
+	  && DF_REF_REGNO (use) == targetm.flags_regnum)
+	break;
+    }
+
+  for (chain = DF_REF_CHAIN (use); chain ; chain = chain->next)
+    {
+      rtx x, uinsn, *uloc;
+
+      /* If we've run out of slots to record uses, quit.  */
+      if (cmp->n_uses == MAX_CMP_USE)
+	{
+	  cmp->missing_uses = true;
+	  return;
+	}
+
+      /* Unfortunately, the location of the flags register, while present in
+	 the reference structure, doesn't help.  We need to find the comparison
+	 code that is outer to the actual flags use.  */
+      uinsn = DF_REF_INSN (chain->ref);
+      uloc = DF_REF_LOC (chain->ref);
+      x = PATTERN (uinsn);
+      if (GET_CODE (x) == PARALLEL)
+	x = XVECEXP (x, 0, 0);
+      x = SET_SRC (x);
+      if (GET_CODE (x) == IF_THEN_ELSE)
+	x = XEXP (x, 0);
+      if (COMPARISON_P (x)
+	  && uloc == &XEXP (x, 0)
+	  && XEXP (x, 1) == const0_rtx)
+	{
+	  struct comparison_use *cuse = &cmp->uses[cmp->n_uses++];
+	  cuse->insn = uinsn;
+	  cuse->loc = uloc;
+	  cuse->code = GET_CODE (x);
+	}
+      else
+	{
+	  /* We failed to recognize this use of the flags register.  */
+	  cmp->missing_uses = true;
+	  return;
+	}
+    }
+}
+
+/* Identify comparison instructions within BB.  If the last compare in the BB
+   is valid at the end of the block, install it in BB->AUX.  Called via 
+   walk_dominator_tree.  */
+
+static void
+find_comparisons_in_bb (struct dom_walk_data *data ATTRIBUTE_UNUSED,
+			basic_block bb)
+{
+  struct comparison *last_cmp = NULL;
+  bitmap killed;
+  rtx insn, next;
+
+  killed = BITMAP_ALLOC (NULL);
+
+  /* Propagate the last live comparison throughout the extended basic block.  */
+  if (single_pred_p (bb))
+    last_cmp = (struct comparison *) single_pred (bb)->aux;
+
+  for (insn = BB_HEAD (bb); insn; insn = next)
+    {
+      rtx set, src, dst;
+
+      next = (insn == BB_END (bb) ? NULL_RTX : NEXT_INSN (insn));
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      if ((set = single_set (insn)) != NULL
+	  && (src = SET_SRC (set), GET_CODE (src) == COMPARE)
+	  && (dst = SET_DEST (set), REG_P (dst))
+	  && REGNO (dst) == targetm.flags_regnum
+	  && REG_P (XEXP (src, 0))
+	  && (REG_P (XEXP (src, 1)) || CONSTANT_P (XEXP (src, 1))))
+	{
+	  /* Eliminate a compare that's redundant with the previous.  */
+	  if (last_cmp
+	      && rtx_equal_p (last_cmp->in_a, XEXP (src, 0))
+	      && rtx_equal_p (last_cmp->in_b, XEXP (src, 1)))
+	    {
+	      find_flags_uses_in_bb (last_cmp, bb, insn);
+	      delete_insn (insn);
+	      continue;
+	    }
+
+          last_cmp = XCNEW (struct comparison);
+	  last_cmp->insn = insn;
+	  last_cmp->in_a = XEXP (src, 0);
+	  last_cmp->in_b = XEXP (src, 1);
+	  last_cmp->orig_mode = GET_MODE (dst);
+	  VEC_safe_push (comparison_struct_p, heap, all_compares, last_cmp);
+
+	  find_flags_uses_in_bb (last_cmp, bb, insn);
+
+	  /* It's unusual, but be prepared for comparison patterns that
+	     also clobber an input, or perhaps a scratch.  */
+	  bitmap_clear (killed);
+	  df_simulate_find_defs (insn, killed);
+	}
+      else if (last_cmp)
+	{
+	  /* Notice if this instruction kills the flags register.  If so,
+	     any comparison we're holding is no longer valid.  */
+	  bitmap_clear (killed);
+	  df_simulate_find_defs (insn, killed);
+	  if (bitmap_bit_p (killed, targetm.flags_regnum))
+	    {
+	      last_cmp = NULL;
+	      continue;
+	    }
+	}
+      else
+	continue;
+
+      /* Notice if any of the inputs to the comparison have changed.  */
+      if (bitmap_bit_p (killed, REGNO (last_cmp->in_a)))
+	last_cmp = NULL;
+      else if (REG_P (last_cmp->in_b)
+	       && bitmap_bit_p (killed, REGNO (last_cmp->in_b)))
+	last_cmp = NULL;
+    }
+
+  BITMAP_FREE (killed);
+
+  /* Remember the live comparison for subsequent members of
+     the extended basic block.  */
+  if (last_cmp)
+    {
+      bb->aux = last_cmp;
+
+      /* Note that df does not create use chains that cross basic blocks.
+	 Therefore, given that we've not yet eliminated duplicate comparisons,
+	 if the flags register is live (in the sense of being used) then we're
+	 not going to be able to find all uses.  Given that this target needed
+	 this pass in the first place, this should be exceedingly rare, caused
+	 by a post-reload splitter creating new basic blocks.  */
+      if (bitmap_bit_p (DF_LR_OUT (bb), targetm.flags_regnum))
+	last_cmp->missing_uses = true;
+    }
+}
+
+/* Find all comparisons in the function.  */
+
+static void
+find_comparisons (void)
+{
+  struct dom_walk_data data;
+
+  memset (&data, 0, sizeof(data));
+  data.dom_direction = CDI_DOMINATORS;
+  data.before_dom_children = find_comparisons_in_bb;
+
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  init_walk_dominator_tree (&data);
+  walk_dominator_tree (&data, ENTRY_BLOCK_PTR);
+  fini_walk_dominator_tree (&data);
+
+  clear_aux_for_blocks ();
+  free_dominance_info (CDI_DOMINATORS);
+}
+
+/* Select an alternate CC_MODE for a comparison insn comparing A and B.
+   Note that inputs are almost certainly different than the IN_A and IN_B
+   stored in CMP -- we're called while attempting to eliminate the compare
+   after all.  Return the new FLAGS rtx if successful, else return NULL.  */
+
+static rtx
+maybe_select_cc_mode (struct comparison *cmp, rtx a, rtx b)
+{
+  enum machine_mode sel_mode;
+  const int n = cmp->n_uses;
+  rtx flags = NULL;
+
+#ifndef SELECT_CC_MODE
+  /* Minimize code differences when this target macro is undefined.  */
+  return NULL;
+#define SELECT_CC_MODE(A,B,C) (gcc_unreachable (), VOIDmode)
+#endif
+
+  /* If we don't have access to all of the uses, we can't validate.  */
+  if (cmp->missing_uses || n == 0)
+    return NULL;
+
+  /* Find a new mode that works for all of the uses.  Special case the
+     common case of exactly one use.  */
+  if (n == 1)
+    {
+      sel_mode = SELECT_CC_MODE (cmp->uses[0].code, a, b);
+      if (sel_mode != cmp->orig_mode)
+	{
+	  flags = gen_rtx_REG (sel_mode, targetm.flags_regnum);
+	  validate_change (cmp->uses[0].insn, cmp->uses[0].loc, flags, true);
+	}
+    }
+  else
+    {
+      int i;
+
+      sel_mode = SELECT_CC_MODE (cmp->uses[0].code, a, b);
+      for (i = 1; i < n; ++i)
+	{
+	  enum machine_mode new_mode;
+	  new_mode = SELECT_CC_MODE (cmp->uses[i].code, a, b);
+	  if (new_mode != sel_mode)
+	    {
+	      sel_mode = targetm.cc_modes_compatible (sel_mode, new_mode);
+	      if (sel_mode == VOIDmode)
+		return NULL;
+	    }
+	}
+      
+      if (sel_mode != cmp->orig_mode)
+	{
+	  flags = gen_rtx_REG (sel_mode, targetm.flags_regnum);
+	  for (i = 0; i < n; ++i)
+	    validate_change (cmp->uses[i].insn, cmp->uses[i].loc, flags, true);
+	}
+    }
+
+  return flags;
+}
+
+/* Attempt to find an instruction prior to CMP that can be used to compute the
+   same flags value as the comparison itself.  Return true if successful.  */
+
+static bool
+try_eliminate_compare (struct comparison *cmp)
+{
+  df_ref *use_rec, use, def;
+  rtx dinsn, dpat, x, flags, cmp_src;
+
+  /* ??? For the moment we don't handle comparisons for which IN_B
+     is a register.  We accepted these during initial comparison 
+     recognition in order to eliminate duplicate compares.
+     An improvement here would be to handle
+	x = a - b; if (a < b)
+     Note that this doesn't follow the USE-DEF chain from X, but
+     since we already have to search for the previous clobber of
+     the flags register, this wouldn't really be a problem.  */
+  if (!CONSTANT_P (cmp->in_b))
+    return false;
+
+  /* The relevant instruction is the definition for the use in the
+     comparison instruction.  */
+  for (use_rec = DF_INSN_USES (cmp->insn); ; use_rec++)
+    {
+      use = *use_rec;
+      if (DF_REF_REGNO (use) == REGNO (cmp->in_a))
+	break;
+    }
+  if (DF_REF_CHAIN (use) == NULL)
+    return false;
+
+  def = DF_REF_CHAIN (use)->ref;
+  gcc_assert (DF_REF_REG_DEF_P (def));
+
+  /* If the definition comes from a parameter, or such, there's no insn.  */
+  if (DF_REF_IS_ARTIFICIAL (def))
+    return false;
+
+  /* Make sure that the defining instruction is of the proper form.
+     That is, a single set plus a clobber of the flags register.  */
+  dinsn = DF_REF_INSN (def);
+  if (!NONJUMP_INSN_P (dinsn))
+    return false;
+  dpat = PATTERN (dinsn);
+  if (GET_CODE (dpat) != PARALLEL || XVECLEN (dpat, 0) != 2)
+    return false;
+  x = XVECEXP (dpat, 0, 0);
+  if (GET_CODE (x) != SET || DF_REF_LOC (def) != &SET_DEST (x))
+    return false;
+  x = XVECEXP (dpat, 0, 1);
+  if (GET_CODE (x) != CLOBBER)
+    return false;
+  flags = XEXP (x, 0);
+  if (!REG_P (flags) || REGNO (flags) != targetm.flags_regnum)
+    return false;
+
+  /* Make sure that the flags are not clobbered in between the two
+     instructions.  Unfortunately, we don't get DEF-DEF chains, so
+     do this the old fashioned way.  */
+  if (reg_set_between_p (flags, dinsn, cmp->insn))
+    return false;
+
+  /* Determine if we ought to use a different CC_MODE here.  */
+  cmp_src = SET_SRC (XVECEXP (dpat, 0, 0));
+  x = maybe_select_cc_mode (cmp, cmp_src, cmp->in_b);
+  if (x != NULL)
+    flags = x;
+
+  /* Generate a new comparison for installation in the setter.  */
+  x = copy_rtx (cmp_src);
+  x = gen_rtx_COMPARE (GET_MODE (flags), x, cmp->in_b);
+  x = gen_rtx_SET (VOIDmode, flags, x);
+
+  /* Succeed if the new instruction is valid.  */
+  validate_change (dinsn, &XVECEXP (dpat, 0, 1), x, true);
+  if (!apply_change_group ())
+    return false;
+ 
+  /* Success.  Delete the compare insn...  */
+  delete_insn (cmp->insn);
+
+  /* ... and any notes that are now irrelevant due to multi-stores. */
+  x = find_regno_note (dinsn, REG_UNUSED, targetm.flags_regnum);
+  if (x)
+    remove_note (dinsn, x);
+  x = find_reg_note (dinsn, REG_EQUAL, NULL);
+  if (x)
+    remove_note (dinsn, x);
+  x = find_reg_note (dinsn, REG_EQUIV, NULL);
+  if (x)
+    remove_note (dinsn, x);
+
+  return true;
+}
+
+/* Main entry point to the pass.  */
+
+static unsigned int
+execute_compare_elim_after_reload (void)
+{
+  df_set_flags (DF_DEFER_INSN_RESCAN);
+  df_live_add_problem ();
+  df_chain_add_problem (DF_DU_CHAIN + DF_UD_CHAIN);
+  df_analyze ();
+
+  gcc_checking_assert (all_compares == NULL);
+
+  /* Locate all comparisons and their uses, and eliminate duplicates.  */
+  find_comparisons ();
+  if (all_compares)
+    {
+      struct comparison *cmp;
+      size_t i;
+
+      /* Eliminate comparisons that are redundant with flags computation.  */
+      for (i = 0; VEC_iterate (comparison_struct_p, all_compares, i, cmp); ++i)
+	{
+	  try_eliminate_compare (cmp);
+	  XDELETE (cmp);
+	}
+
+      VEC_free (comparison_struct_p, heap, all_compares);
+      all_compares = NULL;
+
+      df_remove_problem (df_chain);
+      df_analyze ();
+    }
+
+  return 0;
+}
+
+static bool
+gate_compare_elim_after_reload (void)
+{
+  return (flag_compare_elim_after_reload
+	  && targetm.flags_regnum != INVALID_REGNUM);
+}
+
+struct rtl_opt_pass pass_compare_elim_after_reload =
+{
+ {
+  RTL_PASS,
+  "cmpelim",				/* name */
+  gate_compare_elim_after_reload,	/* gate */
+  execute_compare_elim_after_reload,	/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_NONE,				/* tv_id */
+  0,					/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  TODO_df_finish
+  | TODO_df_verify
+  | TODO_verify_rtl_sharing
+  | TODO_dump_func
+  | TODO_ggc_collect			/* todo_flags_finish */
+ }
+};
diff --git a/gcc/config/mn10300/mn10300.c b/gcc/config/mn10300/mn10300.c
index 25bfa0d..e5bfdae 100644
--- a/gcc/config/mn10300/mn10300.c
+++ b/gcc/config/mn10300/mn10300.c
@@ -3023,4 +3023,7 @@  mn10300_md_asm_clobbers (tree outputs ATTRIBUTE_UNUSED,
 #undef TARGET_MD_ASM_CLOBBERS
 #define TARGET_MD_ASM_CLOBBERS  mn10300_md_asm_clobbers
 
+#undef  TARGET_FLAGS_REGNUM
+#define TARGET_FLAGS_REGNUM  CC_REG
+
 struct gcc_target targetm = TARGET_INITIALIZER;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 800c592..e50de64 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -337,7 +337,7 @@  Objective-C and Objective-C++ Dialects}.
 -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
 -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
 -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol
--fcprop-registers -fcrossjumping @gol
+-fcompare-elim -fcprop-registers -fcrossjumping @gol
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
 -fcx-limited-range @gol
 -fdata-sections -fdce -fdce @gol
@@ -5870,6 +5870,7 @@  compilation time.
 @option{-O} turns on the following optimization flags:
 @gccoptlist{
 -fauto-inc-dec @gol
+-fcompare-elim @gol
 -fcprop-registers @gol
 -fdce @gol
 -fdefer-pop @gol
@@ -7680,6 +7681,18 @@  use hidden visibility) is similar to @code{-fwhole-program}.  See
 Enabled by default when LTO support in GCC is enabled and GCC was compiled
 with linker supporting plugins (GNU ld or @code{gold}).
 
+@item -fcompare-elim
+@opindex fcompare-elim
+After register allocation and post-register allocation instruction splitting,
+identify arithmetic instructions that compute processor flags similar to a
+comparison operation based on that arithmetic.  If possible, eliminate the
+explicit comparison operation.
+
+This pass only applies to certain targets that cannot explicitly represent
+the comparison operation before register allocation is complete.
+
+Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fcprop-registers
 @opindex fcprop-registers
 After register allocation and post-register allocation instruction splitting,
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 139c5a7..5dfd100 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4345,6 +4345,10 @@  to return a nonzero value when it is required, the compiler will run out
 of spill registers and print a fatal error message.
 @end deftypefn
 
+@deftypevr {Target Hook} {unsigned int} TARGET_FLAGS_REGNUM
+If the target has a dedicated flags register, and it needs to use the post-reload comparison elimination pass, then this value should be set appropriately.
+@end deftypevr
+
 @node Scalar Return
 @subsection How Scalar Function Values Are Returned
 @cindex return values in registers
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 2085816..bd49012 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4333,6 +4333,8 @@  to return a nonzero value when it is required, the compiler will run out
 of spill registers and print a fatal error message.
 @end deftypefn
 
+@hook TARGET_FLAGS_REGNUM
+
 @node Scalar Return
 @subsection How Scalar Function Values Are Returned
 @cindex return values in registers
diff --git a/gcc/opts.c b/gcc/opts.c
index 42d53f4..7001122 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -456,6 +456,7 @@  static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_ftree_ch, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fcombine_stack_adjustments, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_fcompare_elim, NULL, 1 },
 
     /* -O2 optimizations.  */
     { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 },
diff --git a/gcc/passes.c b/gcc/passes.c
index 804ac9f..5aba987 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1021,6 +1021,7 @@  init_optimization_passes (void)
 	  NEXT_PASS (pass_gcse2);
 	  NEXT_PASS (pass_split_after_reload);
 	  NEXT_PASS (pass_implicit_zee);
+	  NEXT_PASS (pass_compare_elim_after_reload);
 	  NEXT_PASS (pass_branch_target_load_optimize1);
 	  NEXT_PASS (pass_thread_prologue_and_epilogue);
 	  NEXT_PASS (pass_rtl_dse2);
diff --git a/gcc/recog.h b/gcc/recog.h
index 217c6e5..534d2c9 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -18,6 +18,9 @@  You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#ifndef GCC_RECOG_H
+#define GCC_RECOG_H
+
 /* Random number that should be large enough for all purposes.  */
 #define MAX_RECOG_ALTERNATIVES 30
 
@@ -305,3 +308,5 @@  struct insn_data_d
 
 extern const struct insn_data_d insn_data[];
 extern int peep2_current_count;
+
+#endif /* GCC_RECOG_H */
diff --git a/gcc/target.def b/gcc/target.def
index 9d96e65..29d2a51 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1627,6 +1627,15 @@  DEFHOOK
  bool, (enum machine_mode mode),
  hook_bool_mode_false)
 
+/* Register number for a flags register.  Only needs to be defined if the
+   target is constrained to use post-reload comparison elimination.  */
+DEFHOOKPOD
+(flags_regnum,
+ "If the target has a dedicated flags register, and it needs to use the\
+ post-reload comparison elimination pass, then this value should be set\
+ appropriately.",
+ unsigned int, INVALID_REGNUM)
+
 /* Compute a (partial) cost for rtx X.  Return true if the complete
    cost has been computed, and false if subexpressions should be
    scanned.  In either case, *TOTAL contains the cost result.  */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 32d8f40..dd82288 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -551,6 +551,7 @@  extern struct rtl_opt_pass pass_reorder_blocks;
 extern struct rtl_opt_pass pass_branch_target_load_optimize2;
 extern struct rtl_opt_pass pass_leaf_regs;
 extern struct rtl_opt_pass pass_split_before_sched2;
+extern struct rtl_opt_pass pass_compare_elim_after_reload;
 extern struct rtl_opt_pass pass_sched2;
 extern struct rtl_opt_pass pass_stack_regs;
 extern struct rtl_opt_pass pass_stack_regs_run;