diff mbox

[RFC] : Try and vectorize with shift for mult expr with power 2 integer constant.

Message ID 7794A52CE4D579448B959EED7DD0A4723DD205E2@satlexdag06.amd.com
State New
Headers show

Commit Message

Kumar, Venkataramanan Aug. 4, 2015, 8:52 a.m. UTC
Hi Jeff, 

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Jeff Law
> Sent: Monday, August 03, 2015 11:42 PM
> To: Kumar, Venkataramanan; Jakub Jelinek
> Cc: Richard Beiner (richard.guenther@gmail.com); gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult expr with
> power 2 integer constant.
> 
> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
> > Hi Jakub,
> >
> > Thank you for reviewing the patch.
> >
> > I have incorporated your comments in the attached patch.
> Note Jakub is on PTO for the next 3 weeks.

 Thank you for this information.

> 
> 
> >
> >
> >
> > vectorize_mults_via_shift.diff.txt
> >
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
> Jakub would probably like more testcases :-)
> 
> The most obvious thing to test would be other shift factors.
> 
> A negative test to verify we don't try to turn a multiply by non-constant or
> multiply by a constant that is not a power of 2 into shifts.

I have added negative test in the attached patch.


> 
> [ Would it make sense, for example, to turn a multiply by 3 into a shift-add
> sequence?  As Jakub said, choose_mult_variant can be your friend. ]

Yes I will do that in a follow up patch.   

The new change log becomes 

gcc/ChangeLog
2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
     * tree-vect-patterns.c (vect_recog_mult_pattern): New function for vectorizing
        multiplication patterns.
     * tree-vectorizer.h: Adjust the number of patterns.

gcc/testsuite/ChangeLog
2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
     * gcc.dg/vect/vect-mult-pattern-1.c: New
    * gcc.dg/vect/vect-mult-pattern-2.c: New

Bootstrapped and reg tested on aarch64-unknown-linux-gnu.

Ok for trunk ?

> 
> 
> 
> > @@ -2147,6 +2152,140 @@ vect_recog_vector_vector_shift_pattern
> (vec<gimple> *stmts,
> >     return pattern_stmt;
> >   }
> >
> > +/* Detect multiplication by constant which are postive or negatives
> > +of power 2,
> s/postive/positive/
> 
> 
> Jeff

Regards,
Venkat.

Comments

Richard Biener Aug. 4, 2015, 10:37 a.m. UTC | #1
On Tue, Aug 4, 2015 at 10:52 AM, Kumar, Venkataramanan
<Venkataramanan.Kumar@amd.com> wrote:
> Hi Jeff,
>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>> owner@gcc.gnu.org] On Behalf Of Jeff Law
>> Sent: Monday, August 03, 2015 11:42 PM
>> To: Kumar, Venkataramanan; Jakub Jelinek
>> Cc: Richard Beiner (richard.guenther@gmail.com); gcc-patches@gcc.gnu.org
>> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult expr with
>> power 2 integer constant.
>>
>> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
>> > Hi Jakub,
>> >
>> > Thank you for reviewing the patch.
>> >
>> > I have incorporated your comments in the attached patch.
>> Note Jakub is on PTO for the next 3 weeks.
>
>  Thank you for this information.
>
>>
>>
>> >
>> >
>> >
>> > vectorize_mults_via_shift.diff.txt
>> >
>> >
>> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>> Jakub would probably like more testcases :-)
>>
>> The most obvious thing to test would be other shift factors.
>>
>> A negative test to verify we don't try to turn a multiply by non-constant or
>> multiply by a constant that is not a power of 2 into shifts.
>
> I have added negative test in the attached patch.
>
>
>>
>> [ Would it make sense, for example, to turn a multiply by 3 into a shift-add
>> sequence?  As Jakub said, choose_mult_variant can be your friend. ]
>
> Yes I will do that in a follow up patch.
>
> The new change log becomes
>
> gcc/ChangeLog
> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>      * tree-vect-patterns.c (vect_recog_mult_pattern): New function for vectorizing
>         multiplication patterns.
>      * tree-vectorizer.h: Adjust the number of patterns.
>
> gcc/testsuite/ChangeLog
> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>      * gcc.dg/vect/vect-mult-pattern-1.c: New
>     * gcc.dg/vect/vect-mult-pattern-2.c: New
>
> Bootstrapped and reg tested on aarch64-unknown-linux-gnu.
>
> Ok for trunk ?

+  if (TREE_CODE (oprnd0) != SSA_NAME
+      || TREE_CODE (oprnd1) != INTEGER_CST
+      || TREE_CODE (itype) != INTEGER_TYPE

INTEGRAL_TYPE_P (itype)

+  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
+  if (!optab
+      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+       return NULL;
+

indent of the return stmt looks wrong

+  /* Handle constant operands that are postive or negative powers of 2.  */
+  if ( wi::exact_log2 (oprnd1) != -1  ||
+       wi::exact_log2 (wi::neg (oprnd1)) != -1)

no space after (, || goes to the next line.

+    {
+      tree shift;
+
+      if (wi::exact_log2 (oprnd1) != -1)

please cache wi::exact_log2

in fact the first if () looks redundant if you simply put an else return NULL
after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)

Note that the issue with INT_MIN is that wi::neg (INT_MIN) is INT_MIN
again, but it seems that wi::exact_log2 returns -1 in that case so you
are fine (and in fact not handling this case).

Thanks,
Richard.

>>
>>
>>
>> > @@ -2147,6 +2152,140 @@ vect_recog_vector_vector_shift_pattern
>> (vec<gimple> *stmts,
>> >     return pattern_stmt;
>> >   }
>> >
>> > +/* Detect multiplication by constant which are postive or negatives
>> > +of power 2,
>> s/postive/positive/
>>
>>
>> Jeff
>
> Regards,
> Venkat.
>
Richard Biener Aug. 4, 2015, 2:22 p.m. UTC | #2
On Tue, Aug 4, 2015 at 4:21 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Aug 4, 2015 at 4:15 PM, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Aug 4, 2015 at 10:52 AM, Kumar, Venkataramanan
>>> <Venkataramanan.Kumar@amd.com> wrote:
>>>> Hi Jeff,
>>>>
>>>>> -----Original Message-----
>>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>>> owner@gcc.gnu.org] On Behalf Of Jeff Law
>>>>> Sent: Monday, August 03, 2015 11:42 PM
>>>>> To: Kumar, Venkataramanan; Jakub Jelinek
>>>>> Cc: Richard Beiner (richard.guenther@gmail.com); gcc-patches@gcc.gnu.org
>>>>> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult expr with
>>>>> power 2 integer constant.
>>>>>
>>>>> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
>>>>> > Hi Jakub,
>>>>> >
>>>>> > Thank you for reviewing the patch.
>>>>> >
>>>>> > I have incorporated your comments in the attached patch.
>>>>> Note Jakub is on PTO for the next 3 weeks.
>>>>
>>>>  Thank you for this information.
>>>>
>>>>>
>>>>>
>>>>> >
>>>>> >
>>>>> >
>>>>> > vectorize_mults_via_shift.diff.txt
>>>>> >
>>>>> >
>>>>> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>>>>> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>>>>> Jakub would probably like more testcases :-)
>>>>>
>>>>> The most obvious thing to test would be other shift factors.
>>>>>
>>>>> A negative test to verify we don't try to turn a multiply by non-constant or
>>>>> multiply by a constant that is not a power of 2 into shifts.
>>>>
>>>> I have added negative test in the attached patch.
>>>>
>>>>
>>>>>
>>>>> [ Would it make sense, for example, to turn a multiply by 3 into a shift-add
>>>>> sequence?  As Jakub said, choose_mult_variant can be your friend. ]
>>>>
>>>> Yes I will do that in a follow up patch.
>>>>
>>>> The new change log becomes
>>>>
>>>> gcc/ChangeLog
>>>> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>>>>      * tree-vect-patterns.c (vect_recog_mult_pattern): New function for vectorizing
>>>>         multiplication patterns.
>>>>      * tree-vectorizer.h: Adjust the number of patterns.
>>>>
>>>> gcc/testsuite/ChangeLog
>>>> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>>>>      * gcc.dg/vect/vect-mult-pattern-1.c: New
>>>>     * gcc.dg/vect/vect-mult-pattern-2.c: New
>>>>
>>>> Bootstrapped and reg tested on aarch64-unknown-linux-gnu.
>>>>
>>>> Ok for trunk ?
>>>
>>> +  if (TREE_CODE (oprnd0) != SSA_NAME
>>> +      || TREE_CODE (oprnd1) != INTEGER_CST
>>> +      || TREE_CODE (itype) != INTEGER_TYPE
>>>
>>> INTEGRAL_TYPE_P (itype)
>>>
>>> +  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
>>> +  if (!optab
>>> +      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
>>> +       return NULL;
>>> +
>>>
>>> indent of the return stmt looks wrong
>>>
>>> +  /* Handle constant operands that are postive or negative powers of 2.  */
>>> +  if ( wi::exact_log2 (oprnd1) != -1  ||
>>> +       wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>
>>> no space after (, || goes to the next line.
>>>
>>> +    {
>>> +      tree shift;
>>> +
>>> +      if (wi::exact_log2 (oprnd1) != -1)
>>>
>>> please cache wi::exact_log2
>>>
>>> in fact the first if () looks redundant if you simply put an else return NULL
>>> after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>
>>> Note that the issue with INT_MIN is that wi::neg (INT_MIN) is INT_MIN
>>> again, but it seems that wi::exact_log2 returns -1 in that case so you
>>> are fine (and in fact not handling this case).
>>
>> Are you sure it returns -1 for INT_MIN?  It isn't supposed to, assuming
>> INT_MIN is shorthand for "minimum value for a signed type".  wide_ints
>> aren't signed, so INT_MIN is indistinguishable from an unsigned
>> 1<<(prec-1).
>
> No, not sure.  I spotted
>
>   /* Reject cases where there are implicit -1 blocks above HIGH.  */
>   if (x.len * HOST_BITS_PER_WIDE_INT < x.precision && x.sign_mask () < 0)
>     return -1;
>
> and thought that would catch it.  I mean the tree value is negative so
> exact_log2 must see it is a negative value.

Now re-sent with Richards company disclaimer stripped...

> Richard.
>
>> wi::exact_log2 (wi::to_widest (INT_MIN)) would return -1, but that's
>> the difference between "infinite" precision and exact precision.
>>
>> Thanks,
>> Richard
Richard Sandiford Aug. 4, 2015, 2:28 p.m. UTC | #3
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Aug 4, 2015 at 4:21 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Aug 4, 2015 at 4:15 PM, Richard Sandiford
>> <richard.sandiford@arm.com> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Tue, Aug 4, 2015 at 10:52 AM, Kumar, Venkataramanan
>>>> <Venkataramanan.Kumar@amd.com> wrote:
>>>>> Hi Jeff,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>>>> owner@gcc.gnu.org] On Behalf Of Jeff Law
>>>>>> Sent: Monday, August 03, 2015 11:42 PM
>>>>>> To: Kumar, Venkataramanan; Jakub Jelinek
>>>>>> Cc: Richard Beiner (richard.guenther@gmail.com); gcc-patches@gcc.gnu.org
>>>>>> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult
>>>>>> expr with
>>>>>> power 2 integer constant.
>>>>>>
>>>>>> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
>>>>>> > Hi Jakub,
>>>>>> >
>>>>>> > Thank you for reviewing the patch.
>>>>>> >
>>>>>> > I have incorporated your comments in the attached patch.
>>>>>> Note Jakub is on PTO for the next 3 weeks.
>>>>>
>>>>>  Thank you for this information.
>>>>>
>>>>>>
>>>>>>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > vectorize_mults_via_shift.diff.txt
>>>>>> >
>>>>>> >
>>>>>> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>>>>>> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>>>>>> Jakub would probably like more testcases :-)
>>>>>>
>>>>>> The most obvious thing to test would be other shift factors.
>>>>>>
>>>>>> A negative test to verify we don't try to turn a multiply by
>>>>>> non-constant or
>>>>>> multiply by a constant that is not a power of 2 into shifts.
>>>>>
>>>>> I have added negative test in the attached patch.
>>>>>
>>>>>
>>>>>>
>>>>>> [ Would it make sense, for example, to turn a multiply by 3 into a
>>>>>> shift-add
>>>>>> sequence?  As Jakub said, choose_mult_variant can be your friend. ]
>>>>>
>>>>> Yes I will do that in a follow up patch.
>>>>>
>>>>> The new change log becomes
>>>>>
>>>>> gcc/ChangeLog
>>>>> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>>>>>      * tree-vect-patterns.c (vect_recog_mult_pattern): New function for vectorizing
>>>>>         multiplication patterns.
>>>>>      * tree-vectorizer.h: Adjust the number of patterns.
>>>>>
>>>>> gcc/testsuite/ChangeLog
>>>>> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>>>>>      * gcc.dg/vect/vect-mult-pattern-1.c: New
>>>>>     * gcc.dg/vect/vect-mult-pattern-2.c: New
>>>>>
>>>>> Bootstrapped and reg tested on aarch64-unknown-linux-gnu.
>>>>>
>>>>> Ok for trunk ?
>>>>
>>>> +  if (TREE_CODE (oprnd0) != SSA_NAME
>>>> +      || TREE_CODE (oprnd1) != INTEGER_CST
>>>> +      || TREE_CODE (itype) != INTEGER_TYPE
>>>>
>>>> INTEGRAL_TYPE_P (itype)
>>>>
>>>> +  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
>>>> +  if (!optab
>>>> +      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
>>>> +       return NULL;
>>>> +
>>>>
>>>> indent of the return stmt looks wrong
>>>>
>>>> +  /* Handle constant operands that are postive or negative powers of 2.  */
>>>> +  if ( wi::exact_log2 (oprnd1) != -1  ||
>>>> +       wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>>
>>>> no space after (, || goes to the next line.
>>>>
>>>> +    {
>>>> +      tree shift;
>>>> +
>>>> +      if (wi::exact_log2 (oprnd1) != -1)
>>>>
>>>> please cache wi::exact_log2
>>>>
>>>> in fact the first if () looks redundant if you simply put an else
>>>> return NULL
>>>> after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>>
>>>> Note that the issue with INT_MIN is that wi::neg (INT_MIN) is INT_MIN
>>>> again, but it seems that wi::exact_log2 returns -1 in that case so you
>>>> are fine (and in fact not handling this case).
>>>
>>> Are you sure it returns -1 for INT_MIN?  It isn't supposed to, assuming
>>> INT_MIN is shorthand for "minimum value for a signed type".  wide_ints
>>> aren't signed, so INT_MIN is indistinguishable from an unsigned
>>> 1<<(prec-1).
>>
>> No, not sure.  I spotted
>>
>>   /* Reject cases where there are implicit -1 blocks above HIGH.  */
>>   if (x.len * HOST_BITS_PER_WIDE_INT < x.precision && x.sign_mask () < 0)
>>     return -1;
>>
>> and thought that would catch it.  I mean the tree value is negative so
>> exact_log2 must see it is a negative value.

That's handling the "compressed" format, e.g.:

  {1 << 63}

as a 64-bit short-hand for a 256-bit:

  {1 << 63,-1,-1,-1}

In this case more than one of the low x.precision bits are known to be set.

> Now re-sent with Richards company disclaimer stripped...

Doh.  Sent via the right channels this time...

Thanks,
Richard
Richard Biener Aug. 4, 2015, 4:07 p.m. UTC | #4
On August 4, 2015 4:28:26 PM GMT+02:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Aug 4, 2015 at 4:21 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Tue, Aug 4, 2015 at 4:15 PM, Richard Sandiford
>>> <richard.sandiford@arm.com> wrote:
>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>> On Tue, Aug 4, 2015 at 10:52 AM, Kumar, Venkataramanan
>>>>> <Venkataramanan.Kumar@amd.com> wrote:
>>>>>> Hi Jeff,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>>>>> owner@gcc.gnu.org] On Behalf Of Jeff Law
>>>>>>> Sent: Monday, August 03, 2015 11:42 PM
>>>>>>> To: Kumar, Venkataramanan; Jakub Jelinek
>>>>>>> Cc: Richard Beiner (richard.guenther@gmail.com);
>gcc-patches@gcc.gnu.org
>>>>>>> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for
>mult
>>>>>>> expr with
>>>>>>> power 2 integer constant.
>>>>>>>
>>>>>>> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
>>>>>>> > Hi Jakub,
>>>>>>> >
>>>>>>> > Thank you for reviewing the patch.
>>>>>>> >
>>>>>>> > I have incorporated your comments in the attached patch.
>>>>>>> Note Jakub is on PTO for the next 3 weeks.
>>>>>>
>>>>>>  Thank you for this information.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > vectorize_mults_via_shift.diff.txt
>>>>>>> >
>>>>>>> >
>>>>>>> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>>>>>>> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
>>>>>>> Jakub would probably like more testcases :-)
>>>>>>>
>>>>>>> The most obvious thing to test would be other shift factors.
>>>>>>>
>>>>>>> A negative test to verify we don't try to turn a multiply by
>>>>>>> non-constant or
>>>>>>> multiply by a constant that is not a power of 2 into shifts.
>>>>>>
>>>>>> I have added negative test in the attached patch.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> [ Would it make sense, for example, to turn a multiply by 3 into
>a
>>>>>>> shift-add
>>>>>>> sequence?  As Jakub said, choose_mult_variant can be your
>friend. ]
>>>>>>
>>>>>> Yes I will do that in a follow up patch.
>>>>>>
>>>>>> The new change log becomes
>>>>>>
>>>>>> gcc/ChangeLog
>>>>>> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>>>>>>      * tree-vect-patterns.c (vect_recog_mult_pattern): New
>function for vectorizing
>>>>>>         multiplication patterns.
>>>>>>      * tree-vectorizer.h: Adjust the number of patterns.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog
>>>>>> 2015-08-04  Venkataramanan Kumar  <Venkataramanan.kumar@amd.com>
>>>>>>      * gcc.dg/vect/vect-mult-pattern-1.c: New
>>>>>>     * gcc.dg/vect/vect-mult-pattern-2.c: New
>>>>>>
>>>>>> Bootstrapped and reg tested on aarch64-unknown-linux-gnu.
>>>>>>
>>>>>> Ok for trunk ?
>>>>>
>>>>> +  if (TREE_CODE (oprnd0) != SSA_NAME
>>>>> +      || TREE_CODE (oprnd1) != INTEGER_CST
>>>>> +      || TREE_CODE (itype) != INTEGER_TYPE
>>>>>
>>>>> INTEGRAL_TYPE_P (itype)
>>>>>
>>>>> +  optab = optab_for_tree_code (LSHIFT_EXPR, vectype,
>optab_vector);
>>>>> +  if (!optab
>>>>> +      || optab_handler (optab, TYPE_MODE (vectype)) ==
>CODE_FOR_nothing)
>>>>> +       return NULL;
>>>>> +
>>>>>
>>>>> indent of the return stmt looks wrong
>>>>>
>>>>> +  /* Handle constant operands that are postive or negative powers
>of 2.  */
>>>>> +  if ( wi::exact_log2 (oprnd1) != -1  ||
>>>>> +       wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>>>
>>>>> no space after (, || goes to the next line.
>>>>>
>>>>> +    {
>>>>> +      tree shift;
>>>>> +
>>>>> +      if (wi::exact_log2 (oprnd1) != -1)
>>>>>
>>>>> please cache wi::exact_log2
>>>>>
>>>>> in fact the first if () looks redundant if you simply put an else
>>>>> return NULL
>>>>> after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>>>
>>>>> Note that the issue with INT_MIN is that wi::neg (INT_MIN) is
>INT_MIN
>>>>> again, but it seems that wi::exact_log2 returns -1 in that case so
>you
>>>>> are fine (and in fact not handling this case).
>>>>
>>>> Are you sure it returns -1 for INT_MIN?  It isn't supposed to,
>assuming
>>>> INT_MIN is shorthand for "minimum value for a signed type". 
>wide_ints
>>>> aren't signed, so INT_MIN is indistinguishable from an unsigned
>>>> 1<<(prec-1).
>>>
>>> No, not sure.  I spotted
>>>
>>>   /* Reject cases where there are implicit -1 blocks above HIGH.  */
>>>   if (x.len * HOST_BITS_PER_WIDE_INT < x.precision && x.sign_mask ()
>< 0)
>>>     return -1;
>>>
>>> and thought that would catch it.  I mean the tree value is negative
>so
>>> exact_log2 must see it is a negative value.
>
>That's handling the "compressed" format, e.g.:
>
>  {1 << 63}
>
>as a 64-bit short-hand for a 256-bit:
>
>  {1 << 63,-1,-1,-1}
>
>In this case more than one of the low x.precision bits are known to be
>set.

So you are saying exact_log2 is really exact_log2u?

>> Now re-sent with Richards company disclaimer stripped...
>
>Doh.  Sent via the right channels this time...
>
>Thanks,
>Richard
Richard Sandiford Aug. 4, 2015, 4:27 p.m. UTC | #5
Richard Biener <richard.guenther@gmail.com> writes:
> On August 4, 2015 4:28:26 PM GMT+02:00, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Aug 4, 2015 at 4:21 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Tue, Aug 4, 2015 at 4:15 PM, Richard Sandiford
>>>> <richard.sandiford@arm.com> wrote:
>>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>>> in fact the first if () looks redundant if you simply put an else
>>>>>> return NULL
>>>>>> after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)
>>>>>>
>>>>>> Note that the issue with INT_MIN is that wi::neg (INT_MIN) is
>>INT_MIN
>>>>>> again, but it seems that wi::exact_log2 returns -1 in that case so
>>you
>>>>>> are fine (and in fact not handling this case).
>>>>>
>>>>> Are you sure it returns -1 for INT_MIN?  It isn't supposed to,
>>assuming
>>>>> INT_MIN is shorthand for "minimum value for a signed type". 
>>wide_ints
>>>>> aren't signed, so INT_MIN is indistinguishable from an unsigned
>>>>> 1<<(prec-1).
>>>>
>>>> No, not sure.  I spotted
>>>>
>>>>   /* Reject cases where there are implicit -1 blocks above HIGH.  */
>>>>   if (x.len * HOST_BITS_PER_WIDE_INT < x.precision && x.sign_mask ()
>>< 0)
>>>>     return -1;
>>>>
>>>> and thought that would catch it.  I mean the tree value is negative
>>so
>>>> exact_log2 must see it is a negative value.
>>
>>That's handling the "compressed" format, e.g.:
>>
>>  {1 << 63}
>>
>>as a 64-bit short-hand for a 256-bit:
>>
>>  {1 << 63,-1,-1,-1}
>>
>>In this case more than one of the low x.precision bits are known to be
>>set.
>
> So you are saying exact_log2 is really exact_log2u?

Not sure what you mean, sorry.  All I meant was that a number like:

  0xffffffff_ffffffff_ffffffff_80000000

(uing 32-bit rather than 64-bit elements for brevity) is stored as
a single {0x80000000}, with the upper 3 elements being implicit.  And:

  exact_log2 (0xffffffff_ffffffff_ffffffff_80000000)

is obviously -1.  That's the case that the code above is handling.
If there are implicit all-1 blocks, the number is not a power of 2.

Thanks,
Richard
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-1.c b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-1.c
new file mode 100644
index 0000000..764d0e3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-1.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+
+unsigned  long int __attribute__ ((aligned (64)))arr[100];
+int i;
+
+void test_for_vectorshifts_via_mul_with_power2_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * 4;
+}
+
+void test_for_vectorshifts_via_mul_with_negative_power2_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * (-4);
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect"  {target  { ! { vect_int_mult } } } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect" {target  { ! { vect_int_mult } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-2.c b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-2.c
new file mode 100644
index 0000000..77e8cff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-mult-pattern-2.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+
+unsigned  long int __attribute__ ((aligned (64)))arr[100];
+int i;
+
+void negative_test_for_vectorshifts_via_mul_with_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * 123;
+}
+
+void negative_test_for_vectorshifts_via_mul_with_negative_const ()
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * (-123);
+}
+
+void negative_test_for_vectorshifts_via_mul_with_varable (int x)
+{
+  for (i=0; i<=99; i++)
+    arr[i] = arr[i] * x;
+}
+
+
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 3 "vect"  {target  { ! { vect_int_mult } } } } } */
+/* { dg-final { scan-tree-dump-not "vect_recog_mult_pattern: detected" "vect" {target  { ! { vect_int_mult } } } } } */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index f034635..5cbb49e 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -76,6 +76,10 @@  static gimple vect_recog_vector_vector_shift_pattern (vec<gimple> *,
 						      tree *, tree *);
 static gimple vect_recog_divmod_pattern (vec<gimple> *,
 					 tree *, tree *);
+
+static gimple vect_recog_mult_pattern (vec<gimple> *,
+				       tree *, tree *);
+
 static gimple vect_recog_mixed_size_cond_pattern (vec<gimple> *,
 						  tree *, tree *);
 static gimple vect_recog_bool_pattern (vec<gimple> *, tree *, tree *);
@@ -90,6 +94,7 @@  static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
 	vect_recog_rotate_pattern,
 	vect_recog_vector_vector_shift_pattern,
 	vect_recog_divmod_pattern,
+	vect_recog_mult_pattern,
 	vect_recog_mixed_size_cond_pattern,
 	vect_recog_bool_pattern};
 
@@ -2147,6 +2152,140 @@  vect_recog_vector_vector_shift_pattern (vec<gimple> *stmts,
   return pattern_stmt;
 }
 
+/* Detect multiplication by constant which are postive or negatives of power 2,
+   and convert them to shift patterns.
+
+   Mult with constants that are postive power of two.
+   type a_t;
+   type b_t
+   S1: b_t = a_t * n
+
+   or
+
+   Mult with constants that are negative power of two.
+   S2: b_t = a_t * -n
+
+   Input/Output:
+
+   STMTS: Contains a stmt from which the pattern search begins,
+   i.e. the mult stmt.  Convert the mult operation to LSHIFT if
+   constant operand is a power of 2.
+   type a_t, b_t
+   S1': b_t = a_t << log2 (n)
+
+   Convert the mult operation to LSHIFT and followed by a NEGATE
+   if constant operand is a negative power of 2.
+   type a_t, b_t, res_T;
+   S2': b_t = a_t << log2 (n)
+   S3': res_T  = - (b_t)
+
+ Output:
+
+  * TYPE_IN: The type of the input arguments to the pattern.
+
+  * TYPE_OUT: The type of the output of this pattern.
+
+  * Return value: A new stmt that will be used to replace the multiplication
+    S1 or S2 stmt.  */
+
+static gimple
+vect_recog_mult_pattern (vec<gimple> *stmts,
+			 tree *type_in, tree *type_out)
+{
+  gimple last_stmt = stmts->pop ();
+  tree oprnd0, oprnd1, vectype, itype;
+  gimple pattern_stmt, def_stmt;
+  optab optab;
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+
+  if (!is_gimple_assign (last_stmt))
+    return NULL;
+
+  if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR)
+    return NULL;
+
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
+  itype = TREE_TYPE (oprnd0);
+
+  if (TREE_CODE (oprnd0) != SSA_NAME
+      || TREE_CODE (oprnd1) != INTEGER_CST
+      || TREE_CODE (itype) != INTEGER_TYPE
+      || TYPE_PRECISION (itype) != GET_MODE_PRECISION (TYPE_MODE (itype)))
+    return NULL;
+
+  vectype = get_vectype_for_scalar_type (itype);
+  if (vectype == NULL_TREE)
+    return NULL;
+
+  /* If the target can handle vectorized multiplication natively,
+     don't attempt to optimize this.  */
+  optab = optab_for_tree_code (MULT_EXPR, vectype, optab_default);
+  if (optab != unknown_optab)
+    {
+      machine_mode vec_mode = TYPE_MODE (vectype);
+      int icode = (int) optab_handler (optab, vec_mode);
+      if (icode != CODE_FOR_nothing)
+	return NULL;
+    }
+
+  /* If target cannot handle vector left shift then we cannot
+     optimize and bail out.  */
+  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
+  if (!optab
+      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+	return NULL;
+
+  /* Handle constant operands that are postive or negative powers of 2.  */
+  if ( wi::exact_log2 (oprnd1) != -1  ||
+       wi::exact_log2 (wi::neg (oprnd1)) != -1)
+    {
+      tree shift;
+
+      if (wi::exact_log2 (oprnd1) != -1)
+	{
+	  shift = build_int_cst (itype, wi::exact_log2 (oprnd1));
+	  pattern_stmt
+	    = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+				   LSHIFT_EXPR, oprnd0, shift);
+	}
+      else
+	{
+	  /* If the target cannot handle vector NEGATE then we cannot
+	     do the optimization.  */
+	  optab = optab_for_tree_code (NEGATE_EXPR, vectype, optab_vector);
+	  if (!optab
+	      || optab_handler (optab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+	    return NULL;
+
+	  shift = build_int_cst (itype, wi::exact_log2 (wi::neg (oprnd1)));
+	  def_stmt
+	     = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+				    LSHIFT_EXPR, oprnd0, shift);
+	  new_pattern_def_seq (stmt_vinfo, def_stmt);
+	  pattern_stmt
+	    = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+				   NEGATE_EXPR, gimple_assign_lhs (def_stmt));
+	}
+
+      /* Pattern detected.  */
+      if (dump_enabled_p ())
+	dump_printf_loc (MSG_NOTE, vect_location,
+			 "vect_recog_mult_pattern: detected:\n");
+
+      if (dump_enabled_p ())
+	dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM,
+			      pattern_stmt,0);
+
+      stmts->safe_push (last_stmt);
+      *type_in = vectype;
+      *type_out = vectype;
+      return pattern_stmt;
+    }
+
+  return NULL;
+}
+
 /* Detect a signed division by a constant that wouldn't be
    otherwise vectorized:
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index dfa8795..b490af4 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1132,7 +1132,7 @@  extern void vect_slp_transform_bb (basic_block);
    Additional pattern recognition functions can (and will) be added
    in the future.  */
 typedef gimple (* vect_recog_func_ptr) (vec<gimple> *, tree *, tree *);
-#define NUM_PATTERNS 12
+#define NUM_PATTERNS 13
 void vect_pattern_recog (loop_vec_info, bb_vec_info);
 
 /* In tree-vectorizer.c.  */