diff mbox

[AArch64] PR target 66049: fix add/extend gcc test suite failures

Message ID 7794A52CE4D579448B959EED7DD0A4723DD06B62@satlexdag06.amd.com
State New
Headers show

Commit Message

Kumar, Venkataramanan May 19, 2015, 3:37 p.m. UTC
Hi Maintainers, 

Please find the attached patch, that fixes add/extend gcc test suite failures in Aarch64 target.  
Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049

These tests started to fail after we prevented combiner from converting shift RTX to mult RTX, when the RTX is not inside a memory operation (r222874) .
Now I have added new add/extend patterns which are based on shift operations,  to fix these cases.

Testing status with the patch.

(1) GCC bootstrap on AArch64 successful. 
(2)  SPEC2006 INT runs did not show any degradation.
(3) gcc regression testing passed.

(-----Snip-----)
# Comparing 3 common sum files
## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998 /tmp/gxx-sum2.24998
Tests that now work, but didn't before:

gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+, x[0-9]+, sxtw 2
gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+, x[0-9]+, sxtw 2

# No differences found in 3 common sum files  
(-----Snip-----)

The patterns are fixing the regressing tests, so I have not added any new tests. 
Regarding  removal of the old patterns based on "mults",  I am planning to do it as a separate work.

Is this OK for trunk ? 

gcc/ChangeLog 

2015-05-19  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>

        * config/aarch64/aarch64.md
        (*adds_shift_imm_<mode>):  New pattern.
        (*subs_shift_imm_<mode>):  Likewise.
        (*adds_<optab><ALLX:mode>_shift_<GPI:mode>):  Likewise.
        (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
        (*add_uxt<mode>_shift2): Likewise.
        (*add_uxtsi_shift2_uxtw): Likewise.
       (*sub_uxt<mode>_shift2): Likewise.
       (*sub_uxtsi_shift2_uxtw): Likewise.


Regards,
Venkat.

Comments

Kyrill Tkachov May 19, 2015, 3:42 p.m. UTC | #1
Hi Venkat,

On 19/05/15 16:37, Kumar, Venkataramanan wrote:
> Hi Maintainers,
>
> Please find the attached patch, that fixes add/extend gcc test suite failures in Aarch64 target.
> Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049
>
> These tests started to fail after we prevented combiner from converting shift RTX to mult RTX, when the RTX is not inside a memory operation (r222874) .
> Now I have added new add/extend patterns which are based on shift operations,  to fix these cases.
>
> Testing status with the patch.
>
> (1) GCC bootstrap on AArch64 successful.
> (2)  SPEC2006 INT runs did not show any degradation.

Does that mean there was no performance regression? Or no codegen difference?
What I'd expect from this patch is that the codegen would be the same as before the combine patch
(r222874). A performance difference can sometimes be hard to measure even at worse code quality.
Can you please confirm that on SPEC2006 INT the adds and shifts are now back to being combined
into a single instruction?

Thanks,
Kyrill

> (3) gcc regression testing passed.
>
> (-----Snip-----)
> # Comparing 3 common sum files
> ## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998 /tmp/gxx-sum2.24998
> Tests that now work, but didn't before:
>
> gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
> gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
> gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+, x[0-9]+, sxtw 2
> gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
> gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
> gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
> gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
> gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
> gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
> gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
> gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+, x[0-9]+, sxtw 2
>
> # No differences found in 3 common sum files
> (-----Snip-----)
>
> The patterns are fixing the regressing tests, so I have not added any new tests.
> Regarding  removal of the old patterns based on "mults",  I am planning to do it as a separate work.
>
> Is this OK for trunk ?
>
> gcc/ChangeLog
>
> 2015-05-19  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>
>
>          * config/aarch64/aarch64.md
>          (*adds_shift_imm_<mode>):  New pattern.
>          (*subs_shift_imm_<mode>):  Likewise.
>          (*adds_<optab><ALLX:mode>_shift_<GPI:mode>):  Likewise.
>          (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>          (*add_uxt<mode>_shift2): Likewise.
>          (*add_uxtsi_shift2_uxtw): Likewise.
>         (*sub_uxt<mode>_shift2): Likewise.
>         (*sub_uxtsi_shift2_uxtw): Likewise.
>
>
> Regards,
> Venkat.
>    
>
>
>
Kumar, Venkataramanan May 22, 2015, 8:50 a.m. UTC | #2
Hi Kyrill, 

Sorry for little delay in responding. 

> -----Original Message-----
> From: Kyrill Tkachov [mailto:kyrylo.tkachov@foss.arm.com]
> Sent: Tuesday, May 19, 2015 9:13 PM
> To: Kumar, Venkataramanan; James Greenhalgh; gcc-patches@gcc.gnu.org
> Cc: Ramana Radhakrishnan; segher@kernel.crashing.org; Marcus Shawcroft
> Subject: Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite
> failures
> 
> Hi Venkat,
> 
> On 19/05/15 16:37, Kumar, Venkataramanan wrote:
> > Hi Maintainers,
> >
> > Please find the attached patch, that fixes add/extend gcc test suite failures
> in Aarch64 target.
> > Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049
> >
> > These tests started to fail after we prevented combiner from converting
> shift RTX to mult RTX, when the RTX is not inside a memory operation
> (r222874) .
> > Now I have added new add/extend patterns which are based on shift
> operations,  to fix these cases.
> >
> > Testing status with the patch.
> >
> > (1) GCC bootstrap on AArch64 successful.
> > (2)  SPEC2006 INT runs did not show any degradation.
> 
> Does that mean there was no performance regression? Or no codegen
> difference?

Yes there was no performance regression.


> What I'd expect from this patch is that the codegen would be the same as
> before the combine patch
> (r222874). A performance difference can sometimes be hard to measure
> even at worse code quality.
> Can you please confirm that on SPEC2006 INT the adds and shifts are now
> back to being combined
> into a single instruction?

I  used -dp --save-temps to dump pattern names in assembly files. 
I used revision before combiner patch (r222873) and the patch on top of trunk (little older r223194) for comparison.

Quickly compared the counts generated for the below patterns from the SPEC INT binaries. 

*adds_<optab><mode>_multp2 vs *adds_<optab><ALLX:mode>_shft_<GPI:mode> 
*subs_<optab><mode>_multp2 vs *subs_<optab><ALLX:mode>_shft_<GPI:mode>
*add_uxt<mode>_multp2	   vs  *add_uxt<mode>_shift2
*add_uxtsi_multp2_uxtw	   vs  *add_uxtsi_shift2_uxtw
*sub_uxt<mode>_multp2	   vs  *sub_uxt<mode>_shift2
*sub_uxtsi_multp2_uxtw	   vs  *sub_uxtsi_shift2_uxtw
*adds_mul_imm_<mode>	vs     *adds_shift_imm_<mode> 
*subs_mul_imm_<mode>	vs     *subs_shift_imm_<mode>

Patterns are found in few benchmarks. The  generated counts are matching in the binaries compiled with the two revisions. 

 Also Looked at assembly generated and  the adds and shifts are combined properly using the new patterns. Please let me know if this is OK.

> 
> Thanks,
> Kyrill
> 
> > (3) gcc regression testing passed.
> >
> > (-----Snip-----)
> > # Comparing 3 common sum files
> > ## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998
> /tmp/gxx-sum2.24998
> > Tests that now work, but didn't before:
> >
> > gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-
> 9]+, lsl 3
> > gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+,
> lsl 3
> > gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+,
> x[0-9]+, sxtw 2
> > gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
> > gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
> > gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
> > gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
> > gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
> > gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-
> 9]+, lsl 3
> > gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+,
> lsl 3
> > gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+,
> x[0-9]+, sxtw 2
> >
> > # No differences found in 3 common sum files
> > (-----Snip-----)
> >
> > The patterns are fixing the regressing tests, so I have not added any new
> tests.
> > Regarding  removal of the old patterns based on "mults",  I am planning to
> do it as a separate work.
> >
> > Is this OK for trunk ?
> >
> > gcc/ChangeLog
> >
> > 2015-05-19  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>
> >
> >          * config/aarch64/aarch64.md
> >          (*adds_shift_imm_<mode>):  New pattern.
> >          (*subs_shift_imm_<mode>):  Likewise.
> >          (*adds_<optab><ALLX:mode>_shift_<GPI:mode>):  Likewise.
> >          (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> >          (*add_uxt<mode>_shift2): Likewise.
> >          (*add_uxtsi_shift2_uxtw): Likewise.
> >         (*sub_uxt<mode>_shift2): Likewise.
> >         (*sub_uxtsi_shift2_uxtw): Likewise.
> >
> >
> > Regards,
> > Venkat.
> >
> >
> >
> >

Regards,
Venkat,
Kyrill Tkachov May 22, 2015, 8:58 a.m. UTC | #3
Hi Venkat,

On 22/05/15 09:50, Kumar, Venkataramanan wrote:
> Hi Kyrill,
>
> Sorry for little delay in responding.
>
>> -----Original Message-----
>> From: Kyrill Tkachov [mailto:kyrylo.tkachov@foss.arm.com]
>> Sent: Tuesday, May 19, 2015 9:13 PM
>> To: Kumar, Venkataramanan; James Greenhalgh; gcc-patches@gcc.gnu.org
>> Cc: Ramana Radhakrishnan; segher@kernel.crashing.org; Marcus Shawcroft
>> Subject: Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite
>> failures
>>
>> Hi Venkat,
>>
>> On 19/05/15 16:37, Kumar, Venkataramanan wrote:
>>> Hi Maintainers,
>>>
>>> Please find the attached patch, that fixes add/extend gcc test suite failures
>> in Aarch64 target.
>>> Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049
>>>
>>> These tests started to fail after we prevented combiner from converting
>> shift RTX to mult RTX, when the RTX is not inside a memory operation
>> (r222874) .
>>> Now I have added new add/extend patterns which are based on shift
>> operations,  to fix these cases.
>>> Testing status with the patch.
>>>
>>> (1) GCC bootstrap on AArch64 successful.
>>> (2)  SPEC2006 INT runs did not show any degradation.
>> Does that mean there was no performance regression? Or no codegen
>> difference?
> Yes there was no performance regression.
>
>
>> What I'd expect from this patch is that the codegen would be the same as
>> before the combine patch
>> (r222874). A performance difference can sometimes be hard to measure
>> even at worse code quality.
>> Can you please confirm that on SPEC2006 INT the adds and shifts are now
>> back to being combined
>> into a single instruction?
> I  used -dp --save-temps to dump pattern names in assembly files.
> I used revision before combiner patch (r222873) and the patch on top of trunk (little older r223194) for comparison.
>
> Quickly compared the counts generated for the below patterns from the SPEC INT binaries.
>
> *adds_<optab><mode>_multp2 vs *adds_<optab><ALLX:mode>_shft_<GPI:mode>
> *subs_<optab><mode>_multp2 vs *subs_<optab><ALLX:mode>_shft_<GPI:mode>
> *add_uxt<mode>_multp2	   vs  *add_uxt<mode>_shift2
> *add_uxtsi_multp2_uxtw	   vs  *add_uxtsi_shift2_uxtw
> *sub_uxt<mode>_multp2	   vs  *sub_uxt<mode>_shift2
> *sub_uxtsi_multp2_uxtw	   vs  *sub_uxtsi_shift2_uxtw
> *adds_mul_imm_<mode>	vs     *adds_shift_imm_<mode>
> *subs_mul_imm_<mode>	vs     *subs_shift_imm_<mode>
>
> Patterns are found in few benchmarks. The  generated counts are matching in the binaries compiled with the two revisions.
>
>   Also Looked at assembly generated and  the adds and shifts are combined properly using the new patterns. Please let me know if this is OK.

Thanks for investigating this. I've run your patch on aarch64.exp and I see the add+shift/extend failures
that we were seeing go away (I'm sure you saw that as well ;).
Up to the maintainers to review the patch.

Thanks,
Kyrill

>
>> Thanks,
>> Kyrill
>>
>>> (3) gcc regression testing passed.
>>>
>>> (-----Snip-----)
>>> # Comparing 3 common sum files
>>> ## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998
>> /tmp/gxx-sum2.24998
>>> Tests that now work, but didn't before:
>>>
>>> gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-
>> 9]+, lsl 3
>>> gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+,
>> lsl 3
>>> gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+,
>> x[0-9]+, sxtw 2
>>> gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
>>> gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
>>> gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
>>> gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
>>> gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
>>> gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-
>> 9]+, lsl 3
>>> gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+,
>> lsl 3
>>> gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+,
>> x[0-9]+, sxtw 2
>>> # No differences found in 3 common sum files
>>> (-----Snip-----)
>>>
>>> The patterns are fixing the regressing tests, so I have not added any new
>> tests.
>>> Regarding  removal of the old patterns based on "mults",  I am planning to
>> do it as a separate work.
>>> Is this OK for trunk ?
>>>
>>> gcc/ChangeLog
>>>
>>> 2015-05-19  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>
>>>
>>>           * config/aarch64/aarch64.md
>>>           (*adds_shift_imm_<mode>):  New pattern.
>>>           (*subs_shift_imm_<mode>):  Likewise.
>>>           (*adds_<optab><ALLX:mode>_shift_<GPI:mode>):  Likewise.
>>>           (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>>>           (*add_uxt<mode>_shift2): Likewise.
>>>           (*add_uxtsi_shift2_uxtw): Likewise.
>>>          (*sub_uxt<mode>_shift2): Likewise.
>>>          (*sub_uxtsi_shift2_uxtw): Likewise.
>>>
>>>
>>> Regards,
>>> Venkat.
>>>
>>>
>>>
>>>
> Regards,
> Venkat,
Bin.Cheng May 22, 2015, 9:56 a.m. UTC | #4
On Fri, May 22, 2015 at 4:58 PM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi Venkat,
>
>
> On 22/05/15 09:50, Kumar, Venkataramanan wrote:
>>
>> Hi Kyrill,
>>
>> Sorry for little delay in responding.
>>
>>> -----Original Message-----
>>> From: Kyrill Tkachov [mailto:kyrylo.tkachov@foss.arm.com]
>>> Sent: Tuesday, May 19, 2015 9:13 PM
>>> To: Kumar, Venkataramanan; James Greenhalgh; gcc-patches@gcc.gnu.org
>>> Cc: Ramana Radhakrishnan; segher@kernel.crashing.org; Marcus Shawcroft
>>> Subject: Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test
>>> suite
>>> failures
>>>
>>> Hi Venkat,
>>>
>>> On 19/05/15 16:37, Kumar, Venkataramanan wrote:
>>>>
>>>> Hi Maintainers,
>>>>
>>>> Please find the attached patch, that fixes add/extend gcc test suite
>>>> failures
>>>
>>> in Aarch64 target.
>>>>
>>>> Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049
>>>>
>>>> These tests started to fail after we prevented combiner from converting
>>>
>>> shift RTX to mult RTX, when the RTX is not inside a memory operation
>>> (r222874) .
>>>>
>>>> Now I have added new add/extend patterns which are based on shift
>>>
>>> operations,  to fix these cases.
>>>>
>>>> Testing status with the patch.
>>>>
>>>> (1) GCC bootstrap on AArch64 successful.
>>>> (2)  SPEC2006 INT runs did not show any degradation.
>>>
>>> Does that mean there was no performance regression? Or no codegen
>>> difference?
>>
>> Yes there was no performance regression.
>>
>>
>>> What I'd expect from this patch is that the codegen would be the same as
>>> before the combine patch
>>> (r222874). A performance difference can sometimes be hard to measure
>>> even at worse code quality.
>>> Can you please confirm that on SPEC2006 INT the adds and shifts are now
>>> back to being combined
>>> into a single instruction?
>>
>> I  used -dp --save-temps to dump pattern names in assembly files.
>> I used revision before combiner patch (r222873) and the patch on top of
>> trunk (little older r223194) for comparison.
>>
>> Quickly compared the counts generated for the below patterns from the SPEC
>> INT binaries.
>>
>> *adds_<optab><mode>_multp2 vs *adds_<optab><ALLX:mode>_shft_<GPI:mode>
>> *subs_<optab><mode>_multp2 vs *subs_<optab><ALLX:mode>_shft_<GPI:mode>
>> *add_uxt<mode>_multp2      vs  *add_uxt<mode>_shift2
>> *add_uxtsi_multp2_uxtw     vs  *add_uxtsi_shift2_uxtw
>> *sub_uxt<mode>_multp2      vs  *sub_uxt<mode>_shift2
>> *sub_uxtsi_multp2_uxtw     vs  *sub_uxtsi_shift2_uxtw
>> *adds_mul_imm_<mode>    vs     *adds_shift_imm_<mode>
>> *subs_mul_imm_<mode>    vs     *subs_shift_imm_<mode>
>>
>> Patterns are found in few benchmarks. The  generated counts are matching
>> in the binaries compiled with the two revisions.
>>
>>   Also Looked at assembly generated and  the adds and shifts are combined
>> properly using the new patterns. Please let me know if this is OK.
>
>
> Thanks for investigating this. I've run your patch on aarch64.exp and I see
> the add+shift/extend failures
> that we were seeing go away (I'm sure you saw that as well ;).
Exact what's expected.  This patch and the previous combine one are
designed to fix those failures.

Thanks,
bin
> Up to the maintainers to review the patch.
>
> Thanks,
> Kyrill
>
>
>>
>>> Thanks,
>>> Kyrill
>>>
>>>> (3) gcc regression testing passed.
>>>>
>>>> (-----Snip-----)
>>>> # Comparing 3 common sum files
>>>> ## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998
>>>
>>> /tmp/gxx-sum2.24998
>>>>
>>>> Tests that now work, but didn't before:
>>>>
>>>> gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-
>>>
>>> 9]+, lsl 3
>>>>
>>>> gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+,
>>>> x[0-9]+,
>>>
>>> lsl 3
>>>>
>>>> gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+,
>>>
>>> x[0-9]+, sxtw 2
>>>>
>>>> gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
>>>> gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
>>>> gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
>>>> gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
>>>> gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
>>>> gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-
>>>
>>> 9]+, lsl 3
>>>>
>>>> gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
>>>> x[0-9]+,
>>>
>>> lsl 3
>>>>
>>>> gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+,
>>>
>>> x[0-9]+, sxtw 2
>>>>
>>>> # No differences found in 3 common sum files
>>>> (-----Snip-----)
>>>>
>>>> The patterns are fixing the regressing tests, so I have not added any
>>>> new
>>>
>>> tests.
>>>>
>>>> Regarding  removal of the old patterns based on "mults",  I am planning
>>>> to
>>>
>>> do it as a separate work.
>>>>
>>>> Is this OK for trunk ?
>>>>
>>>> gcc/ChangeLog
>>>>
>>>> 2015-05-19  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>
>>>>
>>>>           * config/aarch64/aarch64.md
>>>>           (*adds_shift_imm_<mode>):  New pattern.
>>>>           (*subs_shift_imm_<mode>):  Likewise.
>>>>           (*adds_<optab><ALLX:mode>_shift_<GPI:mode>):  Likewise.
>>>>           (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>>>>           (*add_uxt<mode>_shift2): Likewise.
>>>>           (*add_uxtsi_shift2_uxtw): Likewise.
>>>>          (*sub_uxt<mode>_shift2): Likewise.
>>>>          (*sub_uxtsi_shift2_uxtw): Likewise.
>>>>
>>>>
>>>> Regards,
>>>> Venkat.
>>>>
>>>>
>>>>
>>>>
>> Regards,
>> Venkat,
>
>
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1dbadc0..d0d6a6a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1539,6 +1539,38 @@ 
   [(set_attr "type" "alus_sreg,alus_imm,alus_imm")]
 )
 
+(define_insn "*adds_shift_imm_<mode>"
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
+	 (plus:GPI (ASHIFT:GPI 
+		    (match_operand:GPI 1 "register_operand" "r")
+		    (match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))
+		   (match_operand:GPI 3 "register_operand" "r"))
+	 (const_int 0)))
+   (set (match_operand:GPI 0 "register_operand" "=r")
+	(plus:GPI (ASHIFT:GPI (match_dup 1) (match_dup 2))
+		  (match_dup 3)))]
+  ""
+  "adds\\t%<w>0, %<w>3, %<w>1, <shift> %2"
+  [(set_attr "type" "alus_shift_imm")]
+)
+
+(define_insn "*subs_shift_imm_<mode>"
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
+	 (minus:GPI (match_operand:GPI 1 "register_operand" "r")
+		    (ASHIFT:GPI
+		     (match_operand:GPI 2 "register_operand" "r")
+		     (match_operand:QI 3 "aarch64_shift_imm_<mode>" "n")))
+	 (const_int 0)))
+   (set (match_operand:GPI 0 "register_operand" "=r")
+	(minus:GPI (match_dup 1)
+		   (ASHIFT:GPI (match_dup 2) (match_dup 3))))]
+  ""
+  "subs\\t%<w>0, %<w>1, %<w>2, <shift> %3"
+  [(set_attr "type" "alus_shift_imm")]
+)
+
 (define_insn "*adds_mul_imm_<mode>"
   [(set (reg:CC_NZ CC_REGNUM)
 	(compare:CC_NZ
@@ -1599,6 +1631,42 @@ 
   [(set_attr "type" "alus_ext")]
 )
 
+(define_insn "*adds_<optab><ALLX:mode>_shift_<GPI:mode>"
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
+	 (plus:GPI (ashift:GPI 
+		    (ANY_EXTEND:GPI 
+		     (match_operand:ALLX 1 "register_operand" "r"))
+		    (match_operand 2 "aarch64_imm3" "Ui3"))
+		   (match_operand:GPI 3 "register_operand" "r"))
+	 (const_int 0)))
+   (set (match_operand:GPI 0 "register_operand" "=rk")
+	(plus:GPI (ashift:GPI (ANY_EXTEND:GPI (match_dup 1))
+			      (match_dup 2))
+		  (match_dup 3)))]
+  ""
+  "adds\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %2"
+  [(set_attr "type" "alus_ext")]
+)
+
+(define_insn "*subs_<optab><ALLX:mode>_shift_<GPI:mode>"
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
+	 (minus:GPI (match_operand:GPI 1 "register_operand" "r")
+		    (ashift:GPI 
+		     (ANY_EXTEND:GPI
+		      (match_operand:ALLX 2 "register_operand" "r"))
+		     (match_operand 3 "aarch64_imm3" "Ui3")))
+	 (const_int 0)))
+   (set (match_operand:GPI 0 "register_operand" "=rk")
+	(minus:GPI (match_dup 1)
+		   (ashift:GPI (ANY_EXTEND:GPI (match_dup 2))
+			       (match_dup 3))))]
+  ""
+  "subs\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size> %3"
+  [(set_attr "type" "alus_ext")]
+)
+
 (define_insn "*adds_<optab><mode>_multp2"
   [(set (reg:CC_NZ CC_REGNUM)
 	(compare:CC_NZ
@@ -1894,6 +1962,38 @@ 
   [(set_attr "type" "adc_reg")]
 )
 
+(define_insn "*add_uxt<mode>_shift2"
+  [(set (match_operand:GPI 0 "register_operand" "=rk")
+	(plus:GPI (and:GPI
+		   (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+			       (match_operand 2 "aarch64_imm3" "Ui3"))
+		   (match_operand 3 "const_int_operand" "n"))
+		  (match_operand:GPI 4 "register_operand" "r")))]
+  "aarch64_uxt_size (INTVAL (operands[2]), INTVAL (operands[3])) != 0"
+  "*
+  operands[3] = GEN_INT (aarch64_uxt_size (INTVAL(operands[2]),
+					   INTVAL (operands[3])));
+  return \"add\t%<w>0, %<w>4, %<w>1, uxt%e3 %2\";"
+  [(set_attr "type" "alu_ext")]
+)
+
+;; zero_extend version of above
+(define_insn "*add_uxtsi_shift2_uxtw"
+  [(set (match_operand:DI 0 "register_operand" "=rk")
+	(zero_extend:DI
+	 (plus:SI (and:SI
+		   (ashift:SI (match_operand:SI 1 "register_operand" "r")
+			      (match_operand 2 "aarch64_imm3" "Ui3"))
+		   (match_operand 3 "const_int_operand" "n"))
+		  (match_operand:SI 4 "register_operand" "r"))))]
+  "aarch64_uxt_size (INTVAL (operands[2]), INTVAL (operands[3])) != 0"
+  "*
+  operands[3] = GEN_INT (aarch64_uxt_size (INTVAL (operands[2]),
+					   INTVAL (operands[3])));
+  return \"add\t%w0, %w4, %w1, uxt%e3 %2\";"
+  [(set_attr "type" "alu_ext")]
+)
+
 (define_insn "*add_uxt<mode>_multp2"
   [(set (match_operand:GPI 0 "register_operand" "=rk")
 	(plus:GPI (and:GPI
@@ -2150,6 +2250,38 @@ 
   [(set_attr "type" "adc_reg")]
 )
 
+(define_insn "*sub_uxt<mode>_shift2"
+  [(set (match_operand:GPI 0 "register_operand" "=rk")
+	(minus:GPI (match_operand:GPI 4 "register_operand" "rk")
+		   (and:GPI
+		    (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+				(match_operand 2 "aarch64_imm3" "Ui3"))
+		    (match_operand 3 "const_int_operand" "n"))))]
+  "aarch64_uxt_size (INTVAL (operands[2]),INTVAL (operands[3])) != 0"
+  "*
+  operands[3] = GEN_INT (aarch64_uxt_size (INTVAL (operands[2]),
+					   INTVAL (operands[3])));
+  return \"sub\t%<w>0, %<w>4, %<w>1, uxt%e3 %2\";"
+  [(set_attr "type" "alu_ext")]
+)
+
+;; zero_extend version of above
+(define_insn "*sub_uxtsi_shift2_uxtw"
+  [(set (match_operand:DI 0 "register_operand" "=rk")
+	(zero_extend:DI
+	 (minus:SI (match_operand:SI 4 "register_operand" "rk")
+		   (and:SI
+		    (ashift:SI (match_operand:SI 1 "register_operand" "r")
+			       (match_operand 2 "aarch64_imm3" "Ui3"))
+		    (match_operand 3 "const_int_operand" "n")))))]
+  "aarch64_uxt_size (INTVAL (operands[2]),INTVAL (operands[3])) != 0"
+  "*
+  operands[3] = GEN_INT (aarch64_uxt_size (INTVAL (operands[2]),
+					   INTVAL (operands[3])));
+  return \"sub\t%w0, %w4, %w1, uxt%e3 %2\";"
+  [(set_attr "type" "alu_ext")]
+)
+
 (define_insn "*sub_uxt<mode>_multp2"
   [(set (match_operand:GPI 0 "register_operand" "=rk")
 	(minus:GPI (match_operand:GPI 4 "register_operand" "rk")