diff mbox

[ifcvt] Check size cost in noce_try_store_flag_mask

Message ID 000001cff4d4$29ff8750$7dfe95f0$@arm.com
State New
Headers show

Commit Message

Zhenqiang Chen Oct. 31, 2014, 6:30 a.m. UTC
Thank you all for the comments. Patch is updated.

Bootstrap and no make check regression on X86-64.
No make check regression with Cortex-M0 qemu.
No performance changes for coremark, dhrystone, spec2000 and spec2006 on
X86-64 and Cortex-A15.

For CSiBE, ARM Cortex-M0 result is a little better. A little regression for
MIPS (less than 0.01%).


> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Thursday, October 30, 2014 1:27 PM
> To: Zhenqiang Chen; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, ifcvt] Check size cost in noce_try_store_flag_mask
> 
> On 10/26/14 19:53, Zhenqiang Chen wrote:
> > Hi,
> >
> > Function noce_try_store_flag_mask converts "if (test) x = 0;" to "x &=
> > -(test == 0);"
> >
> > But from code size view, "x &= -(test == 0);" might have more
> > instructions than "if (test) x = 0;". The patch checks the cost to
> > determine the conversion is valuable or not.
> >
> > Bootstrap and no make check regression on X86-64.
> > No make check regression with Cortex-M0 qemu.
> > For CSiBE, ARM Cortex-m0 result is a little better. A little
> > regression for MIPS. Roughly no change for PowerPC.
> >
> > OK for trunk?
> >
> > Thanks!
> > -Zhenqiang
> >
> > ChangeLog:
> > 2014-10-27  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> > 	* ifcvt.c (noce_try_store_flag_mask): Check rtx cost.
> >
> > testsuite/ChangeLog:
> > 2014-10-27  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> > 	* gcc.target/arm/ifcvt-size-check.c: New test.
> >
> > diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 949d2b4..3abd518 100644
> > --- a/gcc/ifcvt.c
> > +++ b/gcc/ifcvt.c
> > @@ -1393,6 +1393,14 @@ noce_try_store_flag_mask (struct noce_if_info
> > *if_info)
> >   	  if (!seq)
> >   	    return FALSE;
> >
> > +	  if (optimize_function_for_size_p (cfun))
> > +	    {
> > +	      int old_cost = COSTS_N_INSNS (if_info->branch_cost + 1);
> > +	      int new_cost = seq_cost (seq, 0);
> > +	      if (new_cost > old_cost)
> > +		return FALSE;
> > +	    }
> > +
> As Andrew pointed out, there's really no reason to limit this to cases
where
> we're optimizing for size.  So that check should be removed.
> 
> You should also engage with Michael to determine if the MIPS regressions
> are significant enough to warrant deeper investigation.  My gut tells me
that
> if MIPS is regressing because of this, then that's going to be an issue in
the
> MIPS costing model that the MIPS folks will ultimately need to fix.
> 
> jeff
>

Comments

Andrew Pinski Oct. 31, 2014, 6:37 a.m. UTC | #1
On Thu, Oct 30, 2014 at 11:30 PM, Zhenqiang Chen <zhenqiang.chen@arm.com> wrote:
> Thank you all for the comments. Patch is updated.
>
> Bootstrap and no make check regression on X86-64.
> No make check regression with Cortex-M0 qemu.
> No performance changes for coremark, dhrystone, spec2000 and spec2006 on
> X86-64 and Cortex-A15.
>
> For CSiBE, ARM Cortex-M0 result is a little better. A little regression for
> MIPS (less than 0.01%).

I think I have a fix for MIPS which I need to submit too.  The problem
is IF_THEN_ELSE is not implemented for mips_rtx_costs.

Something like the attached one (though It is not updated for the new
cores including octeon3).

Thanks,
Andrew Pinski

>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 653653f..7bb2578 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -1393,6 +1393,9 @@ noce_try_store_flag_mask (struct noce_if_info
> *if_info)
>
>        if (target)
>         {
> +         int old_cost, new_cost, insn_cost;
> +         int speed_p;
> +
>           if (target != if_info->x)
>             noce_emit_move_insn (if_info->x, target);
>
> @@ -1400,6 +1403,14 @@ noce_try_store_flag_mask (struct noce_if_info
> *if_info)
>           if (!seq)
>             return FALSE;
>
> +         speed_p = optimize_bb_for_speed_p (BLOCK_FOR_INSN
> (if_info->insn_a));
> +         insn_cost = insn_rtx_cost (PATTERN (if_info->insn_a), speed_p);
> +         old_cost = COSTS_N_INSNS (if_info->branch_cost) + insn_cost;
> +         new_cost = seq_cost (seq, speed_p);
> +
> +         if (new_cost > old_cost)
> +           return FALSE;
> +
>           emit_insn_before_setloc (seq, if_info->jump,
>                                    INSN_LOCATION (if_info->insn_a));
>           return TRUE;
> diff --git a/gcc/testsuite/gcc.target/arm/ifcvt-size-check.c
> b/gcc/testsuite/gcc.target/arm/ifcvt-size-check.c
> new file mode 100644
> index 0000000..43fa16b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/ifcvt-size-check.c
> @@ -0,0 +1,13 @@
> +/* { dg-do assemble } */
> +/* { dg-options "-mthumb -Os " }  */
> +/* { dg-require-effective-target arm_thumb1_ok } */
> +
> +int
> +test (unsigned char iov_len, int count, int i) {
> +  unsigned char bytes = 0;
> +  if ((unsigned char) ((char) 127 - bytes) < iov_len)
> +    return 22;
> +  return 0;
> +}
> +/* { dg-final { object-size text <= 12 } } */
>
>
>> -----Original Message-----
>> From: Jeff Law [mailto:law@redhat.com]
>> Sent: Thursday, October 30, 2014 1:27 PM
>> To: Zhenqiang Chen; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH, ifcvt] Check size cost in noce_try_store_flag_mask
>>
>> On 10/26/14 19:53, Zhenqiang Chen wrote:
>> > Hi,
>> >
>> > Function noce_try_store_flag_mask converts "if (test) x = 0;" to "x &=
>> > -(test == 0);"
>> >
>> > But from code size view, "x &= -(test == 0);" might have more
>> > instructions than "if (test) x = 0;". The patch checks the cost to
>> > determine the conversion is valuable or not.
>> >
>> > Bootstrap and no make check regression on X86-64.
>> > No make check regression with Cortex-M0 qemu.
>> > For CSiBE, ARM Cortex-m0 result is a little better. A little
>> > regression for MIPS. Roughly no change for PowerPC.
>> >
>> > OK for trunk?
>> >
>> > Thanks!
>> > -Zhenqiang
>> >
>> > ChangeLog:
>> > 2014-10-27  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>> >
>> >     * ifcvt.c (noce_try_store_flag_mask): Check rtx cost.
>> >
>> > testsuite/ChangeLog:
>> > 2014-10-27  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>> >
>> >     * gcc.target/arm/ifcvt-size-check.c: New test.
>> >
>> > diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 949d2b4..3abd518 100644
>> > --- a/gcc/ifcvt.c
>> > +++ b/gcc/ifcvt.c
>> > @@ -1393,6 +1393,14 @@ noce_try_store_flag_mask (struct noce_if_info
>> > *if_info)
>> >       if (!seq)
>> >         return FALSE;
>> >
>> > +     if (optimize_function_for_size_p (cfun))
>> > +       {
>> > +         int old_cost = COSTS_N_INSNS (if_info->branch_cost + 1);
>> > +         int new_cost = seq_cost (seq, 0);
>> > +         if (new_cost > old_cost)
>> > +           return FALSE;
>> > +       }
>> > +
>> As Andrew pointed out, there's really no reason to limit this to cases
> where
>> we're optimizing for size.  So that check should be removed.
>>
>> You should also engage with Michael to determine if the MIPS regressions
>> are significant enough to warrant deeper investigation.  My gut tells me
> that
>> if MIPS is regressing because of this, then that's going to be an issue in
> the
>> MIPS costing model that the MIPS folks will ultimately need to fix.
>>
>> jeff
>>
>
>
>
>
Matthew Fortune Oct. 31, 2014, 11:07 a.m. UTC | #2
Andrew Pinski <pinskia@gmail.com> writes:
> On Thu, Oct 30, 2014 at 11:30 PM, Zhenqiang Chen <zhenqiang.chen@arm.com>

> wrote:

> > Thank you all for the comments. Patch is updated.

> >

> > Bootstrap and no make check regression on X86-64.

> > No make check regression with Cortex-M0 qemu.

> > No performance changes for coremark, dhrystone, spec2000 and spec2006 on

> > X86-64 and Cortex-A15.

> >

> > For CSiBE, ARM Cortex-M0 result is a little better. A little regression

> for

> > MIPS (less than 0.01%).

> 

> I think I have a fix for MIPS which I need to submit too.  The problem

> is IF_THEN_ELSE is not implemented for mips_rtx_costs.

> 

> Something like the attached one (though It is not updated for the new

> cores including octeon3).


This looks OK in principle so I have no objection to the original patch
from Zhengiang. The MIPS patch can follow on.

Andrew: Are you setting higher costs for octeon to try and avoid the
conversion owing to high latency for MOV[NZ] etc in octeon*? Should that
be conditional on speed vs size?

Thanks,
Matthew
Andrew Pinski Oct. 31, 2014, 2:57 p.m. UTC | #3
> On Oct 31, 2014, at 4:07 AM, Matthew Fortune <Matthew.Fortune@imgtec.com> wrote:
> 
> Andrew Pinski <pinskia@gmail.com> writes:
>> On Thu, Oct 30, 2014 at 11:30 PM, Zhenqiang Chen <zhenqiang.chen@arm.com>
>> wrote:
>>> Thank you all for the comments. Patch is updated.
>>> 
>>> Bootstrap and no make check regression on X86-64.
>>> No make check regression with Cortex-M0 qemu.
>>> No performance changes for coremark, dhrystone, spec2000 and spec2006 on
>>> X86-64 and Cortex-A15.
>>> 
>>> For CSiBE, ARM Cortex-M0 result is a little better. A little regression
>> for
>>> MIPS (less than 0.01%).
>> 
>> I think I have a fix for MIPS which I need to submit too.  The problem
>> is IF_THEN_ELSE is not implemented for mips_rtx_costs.
>> 
>> Something like the attached one (though It is not updated for the new
>> cores including octeon3).
> 
> This looks OK in principle so I have no objection to the original patch
> from Zhengiang. The MIPS patch can follow on.
> 
> Andrew: Are you setting higher costs for octeon to try and avoid the
> conversion owing to high latency for MOV[NZ] etc in octeon*?

Yes. In fact I was doing it for the higher latency on octeon 2 than Octeon 1/+. I saw a small improvement with that, using other instructions in one or two cases which be scheduled with other code. 


> Should that
> be conditional on speed vs size?

Yes though I thought we had a variable for size too. 

Thanks,
Andrew

> 
> Thanks,
> Matthew
Jeff Law Oct. 31, 2014, 5:25 p.m. UTC | #4
On 10/31/14 00:30, Zhenqiang Chen wrote:
> Thank you all for the comments. Patch is updated.
>
> Bootstrap and no make check regression on X86-64.
> No make check regression with Cortex-M0 qemu.
> No performance changes for coremark, dhrystone, spec2000 and spec2006 on
> X86-64 and Cortex-A15.
>
> For CSiBE, ARM Cortex-M0 result is a little better. A little regression for
> MIPS (less than 0.01%).
Given Matthew's follow-up indicting he was comfortable with the patch 
as-is and the MIPS folks will follow-up on their costing models, this 
patch is fine.

jeff
diff mbox

Patch

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 653653f..7bb2578 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1393,6 +1393,9 @@  noce_try_store_flag_mask (struct noce_if_info
*if_info)
 
       if (target)
 	{
+	  int old_cost, new_cost, insn_cost;
+	  int speed_p;
+
 	  if (target != if_info->x)
 	    noce_emit_move_insn (if_info->x, target);
 
@@ -1400,6 +1403,14 @@  noce_try_store_flag_mask (struct noce_if_info
*if_info)
 	  if (!seq)
 	    return FALSE;
 
+	  speed_p = optimize_bb_for_speed_p (BLOCK_FOR_INSN
(if_info->insn_a));
+	  insn_cost = insn_rtx_cost (PATTERN (if_info->insn_a), speed_p);
+	  old_cost = COSTS_N_INSNS (if_info->branch_cost) + insn_cost;
+	  new_cost = seq_cost (seq, speed_p);
+
+	  if (new_cost > old_cost)
+	    return FALSE;
+
 	  emit_insn_before_setloc (seq, if_info->jump,
 				   INSN_LOCATION (if_info->insn_a));
 	  return TRUE;
diff --git a/gcc/testsuite/gcc.target/arm/ifcvt-size-check.c
b/gcc/testsuite/gcc.target/arm/ifcvt-size-check.c
new file mode 100644
index 0000000..43fa16b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/ifcvt-size-check.c
@@ -0,0 +1,13 @@ 
+/* { dg-do assemble } */
+/* { dg-options "-mthumb -Os " }  */
+/* { dg-require-effective-target arm_thumb1_ok } */
+
+int
+test (unsigned char iov_len, int count, int i) {
+  unsigned char bytes = 0;
+  if ((unsigned char) ((char) 127 - bytes) < iov_len)
+    return 22;
+  return 0;
+}
+/* { dg-final { object-size text <= 12 } } */