diff mbox

[PR65768] Check rtx_cost when propagating constant

Message ID 552E1907.4090708@linaro.org
State New
Headers show

Commit Message

Kugan Vivekanandarajah April 15, 2015, 7:53 a.m. UTC
As mentioned in PR65768, ARM gcc generates suboptimal code for constant
Uses in loop. Part of the reason is cprop is undoing what loop invariant
code motion did.

Zhenqiang posted a patch at to fix this based on rtx costs:
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html

I cleaned it up and bootstrapped, regression tested on x86_64-linux-gnu;
no new regressions. Is this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2015-04-15  Kugan Vivekanandarajah  <kuganv@linaro.org>
	    Zhenqiang Chen  <zhenqiang.chen@linaro.org>

	PR target/65768
	* cprop.c (try_replace_reg): Check cost of constants before propagating.

Comments

Steven Bosscher April 15, 2015, 9:05 a.m. UTC | #1
On Wed, Apr 15, 2015 at 9:53 AM, Kugan wrote:
> 2015-04-15  Kugan Vivekanandarajah  < >
>             Zhenqiang Chen  <>
>
>         PR target/65768
>         * cprop.c (try_replace_reg): Check cost of constants before propagating.


> +
> +  /* For CONSTANT_P (to), loop2_invariant pass might hoist it out the loop.
> +     And it can be shared by different references.  So skip propagation if
> +     it makes INSN's rtx cost higher.  */
> +

So only undo if the insn is inside a loop (i.e.
BLOCK_FOR_INSN(insn)->loop_father != NULL) and this is a
post-pass_loop2 cprop run?

Ciao!
Steven
Richard Biener April 15, 2015, 11:18 a.m. UTC | #2
On Wed, Apr 15, 2015 at 11:05 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Wed, Apr 15, 2015 at 9:53 AM, Kugan wrote:
>> 2015-04-15  Kugan Vivekanandarajah  < >
>>             Zhenqiang Chen  <>
>>
>>         PR target/65768
>>         * cprop.c (try_replace_reg): Check cost of constants before propagating.
>
>
>> +
>> +  /* For CONSTANT_P (to), loop2_invariant pass might hoist it out the loop.
>> +     And it can be shared by different references.  So skip propagation if
>> +     it makes INSN's rtx cost higher.  */
>> +
>
> So only undo if the insn is inside a loop (i.e.
> BLOCK_FOR_INSN(insn)->loop_father != NULL) and this is a
> post-pass_loop2 cprop run?

post loop2 loops are destroyed.  When loops are available loop_father
is always non-NULL, the proper check is for loop_outer (->loop_father) == NULL.
or loop_depth (->loop_father) != 0.

Richard.

>
> Ciao!
> Steven
Kugan Vivekanandarajah April 17, 2015, 3:19 a.m. UTC | #3
On 15/04/15 21:18, Richard Biener wrote:
> On Wed, Apr 15, 2015 at 11:05 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Wed, Apr 15, 2015 at 9:53 AM, Kugan wrote:
>>> 2015-04-15  Kugan Vivekanandarajah  < >
>>>             Zhenqiang Chen  <>
>>>
>>>         PR target/65768
>>>         * cprop.c (try_replace_reg): Check cost of constants before propagating.
>>
>>
>>> +
>>> +  /* For CONSTANT_P (to), loop2_invariant pass might hoist it out the loop.
>>> +     And it can be shared by different references.  So skip propagation if
>>> +     it makes INSN's rtx cost higher.  */
>>> +
>>
>> So only undo if the insn is inside a loop (i.e.
>> BLOCK_FOR_INSN(insn)->loop_father != NULL) and this is a
>> post-pass_loop2 cprop run?
> 
> post loop2 loops are destroyed.  When loops are available loop_father
> is always non-NULL, the proper check is for loop_outer (->loop_father) == NULL.
> or loop_depth (->loop_father) != 0.

Thanks Steven and Richard for the comments. If the loop information is
present, we could have used this. But even otherwise, we are just
limiting the cprop of an expensive constant (based on the rtx_cost). I
understand that Richard was a bit concerned about extending the live
range but this is a trade off. As per his previous mail, Zhenqiang did
some benchmarking. I am happy to do further benchmarking if you want to
see that.

Probably the rematerialization that is being introduced in IRA/LRA can
redo this if it sees there is high register pressure. Any thoughts?

Thanks,
Kugan
Kugan Vivekanandarajah May 14, 2015, 5:46 a.m. UTC | #4
ping?

Thanks,
Kugan

On 15/04/15 17:53, Kugan wrote:
> As mentioned in PR65768, ARM gcc generates suboptimal code for constant
> Uses in loop. Part of the reason is cprop is undoing what loop invariant
> code motion did.
> 
> Zhenqiang posted a patch at to fix this based on rtx costs:
> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html
> 
> I cleaned it up and bootstrapped, regression tested on x86_64-linux-gnu;
> no new regressions. Is this OK for trunk?
> 
> Thanks,
> Kugan
> 
> gcc/ChangeLog:
> 
> 2015-04-15  Kugan Vivekanandarajah  <kuganv@linaro.org>
> 	    Zhenqiang Chen  <zhenqiang.chen@linaro.org>
> 
> 	PR target/65768
> 	* cprop.c (try_replace_reg): Check cost of constants before propagating.
>
Kugan Vivekanandarajah May 29, 2015, 6:32 a.m. UTC | #5
On 29/05/15 07:31, Jeff Law wrote:
> On 05/13/2015 11:46 PM, Kugan wrote:
>> ping?
>>
>> Thanks,
>> Kugan
>>
>> On 15/04/15 17:53, Kugan wrote:
>>> As mentioned in PR65768, ARM gcc generates suboptimal code for constant
>>> Uses in loop. Part of the reason is cprop is undoing what loop invariant
>>> code motion did.
>>>
>>> Zhenqiang posted a patch at to fix this based on rtx costs:
>>> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html
>>>
>>> I cleaned it up and bootstrapped, regression tested on x86_64-linux-gnu;
>>> no new regressions. Is this OK for trunk?
>>>
>>> Thanks,
>>> Kugan
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-04-15  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>         Zhenqiang Chen  <zhenqiang.chen@linaro.org>
>>>
>>>     PR target/65768
>>>     * cprop.c (try_replace_reg): Check cost of constants before
>>> propagating.
> I should have also noted, fresh bootstrap & regression test is needed too.

Thanks Jeff for the comments. I did a fresh bootstrap and regression
testing on x86_64-linux-gnu with no new regression. I will wait for you ACK.

Thanks,
Kugan
Jeff Law May 30, 2015, 4:54 a.m. UTC | #6
On 05/29/2015 12:32 AM, Kugan wrote:
>>>>
>>>>      PR target/65768
>>>>      * cprop.c (try_replace_reg): Check cost of constants before
>>>> propagating.
>> I should have also noted, fresh bootstrap & regression test is needed too.
>
> Thanks Jeff for the comments. I did a fresh bootstrap and regression
> testing on x86_64-linux-gnu with no new regression. I will wait for you ACK.
Can you address the 3 issues in my prior message?  I'll include them 
here for clarity:

--

The "const_p" variable is poorly named, though I can kindof see how you 
settled on it.  Maybe "check_rtx_costs" or something along those lines 
would be better.

The comment for the second hunk would probably be better as:

/* If TO is a constant, check the cost of the set after propagation
    to the cost of the set before the propagation.  If the cost is
    higher, then do not replace FROM with TO.  */


You should try to produce a testcase where this change shows a code 
generation improvement.    Given we're checking target costs, that test 
will naturally be target specific.  But please do try.

So with the two nits fixed and a testcase, I think this can go forward.
--
diff mbox

Patch

diff --git a/gcc/cprop.c b/gcc/cprop.c
index c9fb2fc..42a2a72 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -758,12 +758,38 @@  try_replace_reg (rtx from, rtx to, rtx_insn *insn)
   int success = 0;
   rtx set = single_set (insn);
 
+  bool already_const_p = false;
+  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
+  int old_cost = set ? set_rtx_cost (set, speed) : 0;
+
+  if ((note != 0
+      && REG_NOTE_KIND (note) == REG_EQUAL
+      && (GET_CODE (XEXP (note, 0)) == CONST
+	  || CONSTANT_P (XEXP (note, 0))))
+      || (set && CONSTANT_P (SET_SRC (set))))
+    already_const_p = true;
+
   /* Usually we substitute easy stuff, so we won't copy everything.
      We however need to take care to not duplicate non-trivial CONST
      expressions.  */
   to = copy_rtx (to);
 
   validate_replace_src_group (from, to, insn);
+
+
+  /* For CONSTANT_P (to), loop2_invariant pass might hoist it out the loop.
+     And it can be shared by different references.  So skip propagation if
+     it makes INSN's rtx cost higher.  */
+
+  if (!already_const_p
+      && CONSTANT_P (to)
+      && (set_rtx_cost (set, speed) > old_cost))
+    {
+      cancel_changes (0);
+      return false;
+    }
+
+
   if (num_changes_pending () && apply_change_group ())
     success = 1;