diff mbox

Improve rtl loop inv cost by checking if the inv can be propagated to address uses

Message ID CAHFci2-f+uvo85YQsZXkEQjr3tSTNG5DdiVskX9oHuwKBRp_jQ@mail.gmail.com
State New
Headers show

Commit Message

Bin.Cheng Oct. 9, 2015, noon UTC
On Wed, Sep 30, 2015 at 11:33 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Sep 29, 2015 at 1:21 AM, Jeff Law <law@redhat.com> wrote:
>> On 09/28/2015 05:28 AM, Bernd Schmidt wrote:
>>>
>>> On 09/28/2015 11:43 AM, Bin Cheng wrote:
>>>>
>>>> Bootstrap and test on x86_64 and x86_32.  Will test it on aarch64.  So
>>>> any
>>>> comments?
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> 2015-09-28  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>     * loop-invariant.c (struct def): New field cant_fwprop_to_addr_uses.
>>>>     (inv_cant_fwprop_to_addr_use): New function.
>>>>     (record_use): Call inv_cant_fwprop_to_addr_use, set the new field.
>>>>     (get_inv_cost): Count cost if inv can't be propagated into its
>>>>     address uses.
>>>
>>>
>>> It looks at least plausible.
>>
>> Definitely plausible.  Many targets have restrictions on the immediate
>> offsets, so this potentially affects many targets (in a good way).
> Here is some more information.
> For spec2k6 on x86, no regression or obvious improvement.
> For spec2k6 on aarch64, several cases are improved.
> Given the results and your positive feedback, I will continue to work
> on this patch when I get back from holiday.
I further bootstrap and test attached patch on aarch64.  Also three
cases in spec2k6/fp are improved by 3~6%, two cases in spec2k6/fp are
regressed by ~2%.  Overall score is improved by ~0.8% for spec2k6/fp
on aarch64 of my run.  I may later analyze the regression.

So is this patch OK?

Thanks,
bin

2015-10-09  Bin Cheng  <bin.cheng@arm.com>

    * loop-invariant.c (struct def): New field cant_prop_to_addr_uses.
    (inv_cant_prop_to_addr_use): New function.
    (record_use): Call inv_cant_prop_to_addr_use, set the new field.
    (get_inv_cost): Count cost if inv can't be propagated into its
    address uses.

>
>>
>>
>>  Another option which I think has had some
>>>
>>> discussion recently would be to just move everything, and leave it to
>>> cprop to put things back together if the costs allow it.
>>
>> I go back and forth on this kind of approach.
> Hmm, either way we need to model rtx/register pressure costs, in loop
> invariant or cprop.
>
> Thanks,
> bin
>
>>
>> jeff

Comments

Bernd Schmidt Oct. 9, 2015, 12:04 p.m. UTC | #1
On 10/09/2015 02:00 PM, Bin.Cheng wrote:
> I further bootstrap and test attached patch on aarch64.  Also three
> cases in spec2k6/fp are improved by 3~6%, two cases in spec2k6/fp are
> regressed by ~2%.  Overall score is improved by ~0.8% for spec2k6/fp
> on aarch64 of my run.  I may later analyze the regression.
>
> So is this patch OK?

I'll approve this with one change, but please keep an eye out for 
performance regressions on other targets.

>      * loop-invariant.c (struct def): New field cant_prop_to_addr_uses.
>      (inv_cant_prop_to_addr_use): New function.

I would like these to have switched truthvalues, i.e. 
can_prop_to_addr_uses, inv_can_prop_to_addr_use. Otherwise we end up 
with double negations like !def->cant_prop_to_addr_uses which can be 
slightly confusing.

You'll probably slightly need to tweak the initialization when 
n_addr_uses goes from zero to one.


Bernd
diff mbox

Patch

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 52c8ae8..3c2395c 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -99,6 +99,8 @@  struct def
   unsigned n_uses;		/* Number of such uses.  */
   unsigned n_addr_uses;		/* Number of uses in addresses.  */
   unsigned invno;		/* The corresponding invariant.  */
+  bool cant_prop_to_addr_uses;	/* True if the corresponding inv can't be
+				   propagated into its address uses.  */
 };
 
 /* The data stored for each invariant.  */
@@ -762,6 +764,34 @@  create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on,
   return inv;
 }
 
+/* Given invariant DEF and its address USE, check if the corresponding
+   invariant expr can be propagated into the use or not.  */
+
+static bool
+inv_cant_prop_to_addr_use (struct def *def, df_ref use)
+{
+  struct invariant *inv;
+  rtx *pos = DF_REF_REAL_LOC (use), def_set;
+  rtx_insn *use_insn = DF_REF_INSN (use);
+  rtx_insn *def_insn;
+  bool ok;
+
+  inv = invariants[def->invno];
+  /* No need to check if address expression is expensive.  */
+  if (!inv->cheap_address)
+    return true;
+
+  def_insn = inv->insn;
+  def_set = single_set (def_insn);
+  if (!def_set)
+    return true;
+
+  validate_unshare_change (use_insn, pos, SET_SRC (def_set), true);
+  ok = verify_changes (0);
+  cancel_changes (0);
+  return !ok;
+}
+
 /* Record USE at DEF.  */
 
 static void
@@ -777,7 +807,11 @@  record_use (struct def *def, df_ref use)
   def->uses = u;
   def->n_uses++;
   if (u->addr_use_p)
-    def->n_addr_uses++;
+    {
+      def->n_addr_uses++;
+      if (!def->cant_prop_to_addr_uses && inv_cant_prop_to_addr_use (def, use))
+	def->cant_prop_to_addr_uses = true;
+    }
 }
 
 /* Finds the invariants USE depends on and store them to the DEPENDS_ON
@@ -1158,7 +1192,9 @@  get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed,
 
   if (!inv->cheap_address
       || inv->def->n_uses == 0
-      || inv->def->n_addr_uses < inv->def->n_uses)
+      || inv->def->n_addr_uses < inv->def->n_uses
+      /* Count cost if the inv can't be propagated into address uses.  */
+      || inv->def->cant_prop_to_addr_uses)
     (*comp_cost) += inv->cost * inv->eqno;
 
 #ifdef STACK_REGS