diff mbox

[expmed] Calculate mult-by-const cost properly in mult_by_coeff_cost

Message ID 5506ACA9.4000909@arm.com
State New
Headers show

Commit Message

Kyrylo Tkachov March 16, 2015, 10:12 a.m. UTC
Hi all,

Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
It's supposed to return the cost of multiplying by a constant 'coeff'.
It calculates that by taking the cost of a MULT rtx by that constant
and comparing it to the cost of synthesizing that multiplication, and 
returning
the cheapest. However, in the MULT rtx cost calculations it creates
a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as 
I would
expect. This patches fixes that in the obvious way.

Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
I'm guessing this is stage 1 material at this point?

Thanks,
Kyrill

2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
     calculation rather than fake_reg.

Comments

Jeff Law March 17, 2015, 7:11 p.m. UTC | #1
On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
> Hi all,
>
> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
> It's supposed to return the cost of multiplying by a constant 'coeff'.
> It calculates that by taking the cost of a MULT rtx by that constant
> and comparing it to the cost of synthesizing that multiplication, and
> returning
> the cheapest. However, in the MULT rtx cost calculations it creates
> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
> I would
> expect. This patches fixes that in the obvious way.
>
> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
> I'm guessing this is stage 1 material at this point?
>
> Thanks,
> Kyrill
>
> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>      * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>      calculation rather than fake_reg.

I'd think stage1, unless you can point to a bug, particularly a regression.

Jeff
Kyrylo Tkachov March 18, 2015, 9:06 a.m. UTC | #2
On 17/03/15 19:11, Jeff Law wrote:
> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>> Hi all,
>>
>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>> It calculates that by taking the cost of a MULT rtx by that constant
>> and comparing it to the cost of synthesizing that multiplication, and
>> returning
>> the cheapest. However, in the MULT rtx cost calculations it creates
>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>> I would
>> expect. This patches fixes that in the obvious way.
>>
>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>> I'm guessing this is stage 1 material at this point?
>>
>> Thanks,
>> Kyrill
>>
>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>       * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>       calculation rather than fake_reg.
> I'd think stage1, unless you can point to a bug, particularly a regression.

No regression that I know of. I'll queue it up for stage 1 if it's ok 
code-wise.

Thanks,
Kyrill

> Jeff
>
Bin.Cheng March 18, 2015, 9:36 a.m. UTC | #3
On Wed, Mar 18, 2015 at 5:06 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>
> On 17/03/15 19:11, Jeff Law wrote:
>>
>> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>>>
>>> Hi all,
>>>
>>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>>> It calculates that by taking the cost of a MULT rtx by that constant
>>> and comparing it to the cost of synthesizing that multiplication, and
>>> returning
>>> the cheapest. However, in the MULT rtx cost calculations it creates
>>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>>> I would
>>> expect. This patches fixes that in the obvious way.
>>>
>>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>>> I'm guessing this is stage 1 material at this point?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>       * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>>       calculation rather than fake_reg.
>>
>> I'd think stage1, unless you can point to a bug, particularly a
>> regression.
>
>
> No regression that I know of. I'll queue it up for stage 1 if it's ok
> code-wise.

This function just estimate the max_cost roughly, since it is only
used by Tree passes.  It shouldn't have much impact on generated code.
Maybe some targets doesn't have proper cost function for reg * const
rtl expression, also most of calls are in IVOPT, so it would be better
if you run some benchmark to make sure there is no surprise.

Thanks,
bin
>
> Thanks,
> Kyrill
>
>> Jeff
>>
>
>
Kyrylo Tkachov March 18, 2015, 9:39 a.m. UTC | #4
On 18/03/15 09:36, Bin.Cheng wrote:
> On Wed, Mar 18, 2015 at 5:06 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> On 17/03/15 19:11, Jeff Law wrote:
>>> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>>>> Hi all,
>>>>
>>>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>>>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>>>> It calculates that by taking the cost of a MULT rtx by that constant
>>>> and comparing it to the cost of synthesizing that multiplication, and
>>>> returning
>>>> the cheapest. However, in the MULT rtx cost calculations it creates
>>>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>>>> I would
>>>> expect. This patches fixes that in the obvious way.
>>>>
>>>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>>>> I'm guessing this is stage 1 material at this point?
>>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>>
>>>>        * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>>>        calculation rather than fake_reg.
>>> I'd think stage1, unless you can point to a bug, particularly a
>>> regression.
>>
>> No regression that I know of. I'll queue it up for stage 1 if it's ok
>> code-wise.
> This function just estimate the max_cost roughly, since it is only
> used by Tree passes.  It shouldn't have much impact on generated code.
> Maybe some targets doesn't have proper cost function for reg * const
> rtl expression, also most of calls are in IVOPT, so it would be better
> if you run some benchmark to make sure there is no surprise.
Hi Bin,

Thanks for the guidance. I'll have a look.

Kyrill

> Thanks,
> bin
>> Thanks,
>> Kyrill
>>
>>> Jeff
>>>
>>
Jeff Law April 13, 2015, 6:18 p.m. UTC | #5
On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
> Hi all,
>
> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
> It's supposed to return the cost of multiplying by a constant 'coeff'.
> It calculates that by taking the cost of a MULT rtx by that constant
> and comparing it to the cost of synthesizing that multiplication, and
> returning
> the cheapest. However, in the MULT rtx cost calculations it creates
> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
> I would
> expect. This patches fixes that in the obvious way.
>
> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
> I'm guessing this is stage 1 material at this point?
>
> Thanks,
> Kyrill
>
> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>      * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>      calculation rather than fake_reg.
I'm pretty sure this patch is wrong.

The call you're referring to is computing an upper limit to the cost for 
use by choose_mult_variant.  Once a synthesized multiply sequence 
exceeds the cost of reg*reg, then that synthesized sequence can be 
thrown away because it's not profitable.

Jeff
Kyrylo Tkachov April 14, 2015, 8:07 a.m. UTC | #6
Hi Jeff,

Thanks for looking at this.

On 13/04/15 19:18, Jeff Law wrote:
> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>> Hi all,
>>
>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>> It calculates that by taking the cost of a MULT rtx by that constant
>> and comparing it to the cost of synthesizing that multiplication, and
>> returning
>> the cheapest. However, in the MULT rtx cost calculations it creates
>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>> I would
>> expect. This patches fixes that in the obvious way.
>>
>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>> I'm guessing this is stage 1 material at this point?
>>
>> Thanks,
>> Kyrill
>>
>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>       * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>       calculation rather than fake_reg.
> I'm pretty sure this patch is wrong.
>
> The call you're referring to is computing an upper limit to the cost for
> use by choose_mult_variant.  Once a synthesized multiply sequence
> exceeds the cost of reg*reg, then that synthesized sequence can be
> thrown away because it's not profitable.

But shouldn't the limit be the mult-by-constant cost?
Consider also similar logic in expand_mult:
max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, op1), speed);
if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
   return expand_mult_const (mode, op0, coeff, target,
                             &algorithm, variant);


Here op1 is a CONST_INT and therefore the cost of mult-by-immediate is
used as a limit for choose_mult_variant.

Thanks,
Kyrill


>
> Jeff
>
>
Jeff Law April 15, 2015, 3:41 p.m. UTC | #7
On 04/14/2015 02:07 AM, Kyrill Tkachov wrote:
> Hi Jeff,
>
> Thanks for looking at this.
>
> On 13/04/15 19:18, Jeff Law wrote:
>> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>>> Hi all,
>>>
>>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>>> It calculates that by taking the cost of a MULT rtx by that constant
>>> and comparing it to the cost of synthesizing that multiplication, and
>>> returning
>>> the cheapest. However, in the MULT rtx cost calculations it creates
>>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>>> I would
>>> expect. This patches fixes that in the obvious way.
>>>
>>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>>> I'm guessing this is stage 1 material at this point?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>       * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>>       calculation rather than fake_reg.
>> I'm pretty sure this patch is wrong.
>>
>> The call you're referring to is computing an upper limit to the cost for
>> use by choose_mult_variant.  Once a synthesized multiply sequence
>> exceeds the cost of reg*reg, then that synthesized sequence can be
>> thrown away because it's not profitable.
>
> But shouldn't the limit be the mult-by-constant cost?
No, because ultimately we're trying to do better than just loading the 
constant into a register and doing a reg * reg.  So the reg*reg case is 
the upper bound for allowed cost of a synthesized sequence.

> Consider also similar logic in expand_mult:
> max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, op1), speed);
> if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
>    return expand_mult_const (mode, op0, coeff, target,
>                              &algorithm, variant);
This looks wrong to me.  They're certainly inconsistent.

Maybe start by asking Bill (who added mult_by_coeff_cost and whom I've 
cc'd) what his intent was to make sure it matches my understanding.

Jeff
Kyrylo Tkachov April 15, 2015, 3:47 p.m. UTC | #8
On 15/04/15 16:41, Jeff Law wrote:
> On 04/14/2015 02:07 AM, Kyrill Tkachov wrote:
>> Hi Jeff,
>>
>> Thanks for looking at this.
>>
>> On 13/04/15 19:18, Jeff Law wrote:
>>> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>>>> Hi all,
>>>>
>>>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>>>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>>>> It calculates that by taking the cost of a MULT rtx by that constant
>>>> and comparing it to the cost of synthesizing that multiplication, and
>>>> returning
>>>> the cheapest. However, in the MULT rtx cost calculations it creates
>>>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>>>> I would
>>>> expect. This patches fixes that in the obvious way.
>>>>
>>>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>>>> I'm guessing this is stage 1 material at this point?
>>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>>
>>>>        * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>>>        calculation rather than fake_reg.
>>> I'm pretty sure this patch is wrong.
>>>
>>> The call you're referring to is computing an upper limit to the cost for
>>> use by choose_mult_variant.  Once a synthesized multiply sequence
>>> exceeds the cost of reg*reg, then that synthesized sequence can be
>>> thrown away because it's not profitable.
>> But shouldn't the limit be the mult-by-constant cost?
> No, because ultimately we're trying to do better than just loading the
> constant into a register and doing a reg * reg.  So the reg*reg case is
> the upper bound for allowed cost of a synthesized sequence.
>
>> Consider also similar logic in expand_mult:
>> max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, op1), speed);
>> if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
>>     return expand_mult_const (mode, op0, coeff, target,
>>                               &algorithm, variant);
> This looks wrong to me.  They're certainly inconsistent.

Ah ok. I had noticed the inconsistency and instead thought that

mult_by_coeff_cost was the one that needed fixing.
Actually, I'd prefer fixing the expand_mult logic, since that would mean we wouldn't end up
passing a mult-by-immediate rtx to the backend costs, which might not be a valid rtx for some
architectures (arm, for example) and would require special casing in rtx costs.


>
> Maybe start by asking Bill (who added mult_by_coeff_cost and whom I've
> cc'd) what his intent was to make sure it matches my understanding.

If we end up preferring to fix the expand_mult logic instead,
I'm withdrawing this patch then.
withdraw

Thanks,
Kyrill

>
> Jeff
>
Kyrylo Tkachov April 20, 2015, 9:27 a.m. UTC | #9
On 15/04/15 16:41, Jeff Law wrote:
> On 04/14/2015 02:07 AM, Kyrill Tkachov wrote:
>> Hi Jeff,
>>
>> Thanks for looking at this.
>>
>> On 13/04/15 19:18, Jeff Law wrote:
>>> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>>>> Hi all,
>>>>
>>>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug.
>>>> It's supposed to return the cost of multiplying by a constant 'coeff'.
>>>> It calculates that by taking the cost of a MULT rtx by that constant
>>>> and comparing it to the cost of synthesizing that multiplication, and
>>>> returning
>>>> the cheapest. However, in the MULT rtx cost calculations it creates
>>>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as
>>>> I would
>>>> expect. This patches fixes that in the obvious way.
>>>>
>>>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu.
>>>> I'm guessing this is stage 1 material at this point?
>>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>>
>>>>        * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost
>>>>        calculation rather than fake_reg.
>>> I'm pretty sure this patch is wrong.
>>>
>>> The call you're referring to is computing an upper limit to the cost for
>>> use by choose_mult_variant.  Once a synthesized multiply sequence
>>> exceeds the cost of reg*reg, then that synthesized sequence can be
>>> thrown away because it's not profitable.
>> But shouldn't the limit be the mult-by-constant cost?
> No, because ultimately we're trying to do better than just loading the
> constant into a register and doing a reg * reg.  So the reg*reg case is
> the upper bound for allowed cost of a synthesized sequence.

So I've thought about it a bit more and I have another concern.
The function returns this:
   if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
     return algorithm.cost.cost;
   else
     return max_cost;

If I read this right, it tries to synthesise the mult at choose_mult_variant
with the limit cost of the reg-by-reg mult, but if the synthesis cost exceeds
that, then it returns the reg-by-reg mult cost (in return max_cost;) so that
can't be right, can it?

Thanks,
Kyrill

>
>> Consider also similar logic in expand_mult:
>> max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, op1), speed);
>> if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
>>     return expand_mult_const (mode, op0, coeff, target,
>>                               &algorithm, variant);
> This looks wrong to me.  They're certainly inconsistent.
>
> Maybe start by asking Bill (who added mult_by_coeff_cost and whom I've
> cc'd) what his intent was to make sure it matches my understanding.
>
> Jeff
>
Jeff Law April 20, 2015, 5:51 p.m. UTC | #10
On 04/20/2015 03:27 AM, Kyrill Tkachov wrote:
>
> On 15/04/15 16:41, Jeff Law wrote:
>> On 04/14/2015 02:07 AM, Kyrill Tkachov wrote:
>>> Hi Jeff,
>>>
>>> Thanks for looking at this.
>>>
>>> On 13/04/15 19:18, Jeff Law wrote:
>>>> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote:
>>>>> Hi all,
>>>>>
>>>>> Eyeballing the mult_by_coeff_cost function I think it has a
>>>>> typo/bug. It's supposed to return the cost of multiplying by
>>>>> a constant 'coeff'. It calculates that by taking the cost of
>>>>> a MULT rtx by that constant and comparing it to the cost of
>>>>> synthesizing that multiplication, and returning the cheapest.
>>>>> However, in the MULT rtx cost calculations it creates a MULT
>>>>> rtx of two REGs rather than the a REG and the GEN_INT of
>>>>> coeff as I would expect. This patches fixes that in the
>>>>> obvious way.
>>>>>
>>>>> Tested aarch64-none-elf and bootstrapped on
>>>>> x86_64-linux-gnu. I'm guessing this is stage 1 material at
>>>>> this point?
>>>>>
>>>>> Thanks, Kyrill
>>>>>
>>>>> 2015-03-13  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>>>
>>>>> * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT
>>>>> cost calculation rather than fake_reg.
>>>> I'm pretty sure this patch is wrong.
>>>>
>>>> The call you're referring to is computing an upper limit to the
>>>> cost for use by choose_mult_variant.  Once a synthesized
>>>> multiply sequence exceeds the cost of reg*reg, then that
>>>> synthesized sequence can be thrown away because it's not
>>>> profitable.
>>> But shouldn't the limit be the mult-by-constant cost?
>> No, because ultimately we're trying to do better than just loading
>> the constant into a register and doing a reg * reg.  So the reg*reg
>> case is the upper bound for allowed cost of a synthesized
>> sequence.
>
> So I've thought about it a bit more and I have another concern. The
> function returns this: if (choose_mult_variant (mode, coeff,
> &algorithm, &variant, max_cost)) return algorithm.cost.cost; else
> return max_cost;
>
> If I read this right, it tries to synthesise the mult at
> choose_mult_variant with the limit cost of the reg-by-reg mult, but
> if the synthesis cost exceeds that, then it returns the reg-by-reg
> mult cost (in return max_cost;) so that can't be right, can it?
In the case where the target doesn't have mult imm,reg, then reg*reg 
would be the right estimated cost if there's no cheap synthesis.   It 
doesn't look like we correctly handle costing on targets with mult imm,reg.

jeff
diff mbox

Patch

commit fa230baf4f35d03cf072904dc048ce4ffcf43148
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu Mar 12 09:47:06 2015 +0000

    [expmed] Calculate mult-by-const properly in mult_by_coeff_cost

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 0034203..fec9501 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -3285,7 +3285,8 @@  mult_by_coeff_cost (HOST_WIDE_INT coeff, machine_mode mode, bool speed)
   enum mult_variant variant;
 
   rtx fake_reg = gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1);
-  max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, fake_reg), speed);
+  max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, GEN_INT (coeff)),
+			    speed);
   if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
     return algorithm.cost.cost;
   else