
[RFC]: Remove Mem/address type assumption in combiner

Message ID 7794A52CE4D579448B959EED7DD0A4723DCF5AEF@satlexdag06.amd.com
State New

Commit Message

Kumar, Venkataramanan May 5, 2015, 4:14 p.m. UTC
Hi Segher, 

Thank you for testing the patch and approving it.

Before I commit it, I wanted to check with you on the changelog entry.

Please find the updated patch with the changelog entry. 
I have just removed the comment that says the checks for PLUS/MINUS RTX are a hack.

Let me know if it is ok.

Regards,
Venkat.

Change Log 
---------------

2015-05-05  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>

        * combine.c (make_compound_operation): Remove checks for PLUS/MINUS
        rtx type.

---patch---


---patch---

 
-----Original Message-----
From: Segher Boessenkool [mailto:segher@kernel.crashing.org] 
Sent: Friday, May 01, 2015 8:48 PM
To: Kumar, Venkataramanan
Cc: Jeff Law (law@redhat.com); gcc-patches@gcc.gnu.org; maxim.kuvyrkov@linaro.org
Subject: Re: [RFC]: Remove Mem/address type assumption in combiner

On Wed, Apr 29, 2015 at 12:03:35PM -0500, Segher Boessenkool wrote:
> On Wed, Apr 29, 2015 at 09:25:21AM +0000, Kumar, Venkataramanan wrote:
> > diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 
> > 100644
> > --- a/gcc/combine.c
> > +++ b/gcc/combine.c
> > @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code)
> >       but once inside, go back to our default of SET.  */
> > 
> >    next_code = (code == MEM ? MEM
> > -              : ((code == PLUS || code == MINUS)
> > -                 && SCALAR_INT_MODE_P (mode)) ? MEM
> >                : ((code == COMPARE || COMPARISON_P (x))
> >                   && XEXP (x, 1) == const0_rtx) ? COMPARE
> >                : in_code == COMPARE ? SET : in_code);
> > 
> > 
> > On X86_64, it passes bootstrap and regression tests.
> > But on Aarch64 the test in the PR passed, but I got a few test case failures.
> 
> > There are a few patterns based on multiplication operations in the Aarch64 backend which are used to match the patterns the combiner generated.
> > Now those patterns have to be fixed to use SHIFTS.  We also need to see the impact on other targets.
> 
> Right.  It would be good if you could find out for what targets it matters.
> The thing is, if a target expects only the patterns as combine used to 
> make them, it will regress (as you've seen on aarch64); but it will 
> regress _silently_.  Which isn't so nice.
> 
> > But before that, I wanted to check if the assumption in the combiner can simply be removed?
> 
> Seeing for what targets / patterns it makes a difference would tell us 
> the answer to that, too :-)
> 
> I'll run some tests with and without your patch.

So I ran the tests.  These are text sizes for vmlinux built for most architectures (before and after the patch, respectively).  Results are good, but they show many targets need some work...


BIGGER
 5410496   5432744   alpha
 3848987   3849143   arm
 2190276   2190292   blackfin
 2188431   2191503   c6x (but data shrank by more than text grew)
 2212322   2212706   cris
10896454  10896965   i386
 7471891   7488771   parisc
 6168799   6195391   parisc64
 1545840   1545872   shnommu
 1946649   1954149   xtensa

I looked at some of those.  Alpha regresses with s4add things, arm with add,lsl things, parisc with shladd things.  I tried to fix the parisc one (it seemed simplest), and the 32-bit kernel got most (but not all) of the size difference back; and the 64-bit wouldn't build (an assert in the kernel failed, and it uses a floating point reg where an integer one is needed -- I gave up).


SMALLER
 8688171   8688003   powerpc
20252605  20251797   powerpc64
11425334  11422558   s390
12300135  12299767   x86_64

The PowerPC differences are mostly what first was rlwinm;addi;rlwinm;lwz and now is rlwinm;lwz; i.e. the add is moved after the two rotates, the rotates are merged, and the add is made part of the offset in the load.


NO DIFF
 3321492   3321492   frv
 3252747   3252747   m32r
 4708232   4708232   microblaze
 3949145   3949145   mips
 5693019   5693019   mips64
 2373441   2373441   mn10300
 6489846   6489846   sh64
 3733749   3733749   sparc
 6075122   6075122   sparc64

Also quite a few without any diff.  Good :-)  Some of those might have unnecessary add/mult patterns now, but they also have the add/shift.


I didn't see any big differences, and all are of the expected kind.
Some targets need some love (but what else is new).

The patch is approved for trunk.

Thanks,


Segher

Comments

Segher Boessenkool May 5, 2015, 5:15 p.m. UTC | #1
On Tue, May 05, 2015 at 04:14:03PM +0000, Kumar, Venkataramanan wrote:
> Hi Segher, 
> 
> Thank you for testing the patch and approving it.
> 
> Before I commit it, I wanted to check with you on the changelog entry.
> 
> Please find the updated patch with the changelog entry. 
> I have just removed the comment that says the checks for PLUS/MINUS RTX are a hack.
> 
> Let me know if it is ok.
> 
> Regards,
> Venkat.
> 
> Change Log 
> ---------------
> 
> 2015-05-05  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>
> 
>         * combine.c (make_compound_operation): Remove checks for PLUS/MINUS
>         rtx type.

Yes, this is okay for trunk.


Segher
Kumar, Venkataramanan May 7, 2015, 11:01 a.m. UTC | #2
Hi Segher, 

Thank you, I committed it as r222874.
Ref: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=222874

Regards,
Venkat.


-----Original Message-----
From: Segher Boessenkool [mailto:segher@kernel.crashing.org] 
Sent: Tuesday, May 05, 2015 10:46 PM
To: Kumar, Venkataramanan
Cc: Jeff Law (law@redhat.com); gcc-patches@gcc.gnu.org; maxim.kuvyrkov@linaro.org
Subject: Re: [RFC]: Remove Mem/address type assumption in combiner

On Tue, May 05, 2015 at 04:14:03PM +0000, Kumar, Venkataramanan wrote:
> Hi Segher, 
> 
> Thank you for testing the patch and approving it.
> 
> Before I commit it, I wanted to check with you on the changelog entry.
> 
> Please find the updated patch with the changelog entry. 
> I have just removed the comment that says the checks for PLUS/MINUS RTX are a hack.
> 
> Let me know if it is ok.
> 
> Regards,
> Venkat.
> 
> Change Log 
> ---------------
> 
> 2015-05-05  Venkataramanan Kumar  <venkataramanan.kumar@amd.com>
> 
>         * combine.c (make_compound_operation): Remove checks for PLUS/MINUS
>         rtx type.

Yes, this is okay for trunk.


Segher
Steve Ellcey May 11, 2015, 5:50 p.m. UTC | #3
On Thu, 2015-05-07 at 11:01 +0000, Kumar, Venkataramanan wrote:
> Hi Segher, 
> 
> Thank you, I committed it as r222874.
> Ref: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=222874
> 
> Regards,
> Venkat.


Venkat,

This patch broke a number of MIPS tests, specifically mips32r6 tests
that look for the lsa instruction (load scaled address) which shifts one
register and then adds it to a second register.  I am not sure if this
needs to be addressed in combine.c or if we need to add a peephole
optimization to mips.md to handle the new instruction sequence.  What do
you think?  Is the change here what you would expect to see from your
patch?

With this C code:


signed short test (signed short *a, int index)
{
  return a[index];
}


GCC/combine for mips32r6 used to produce:

(insn 8 7 9 2 (set (reg/f:SI 203)
        (plus:SI (mult:SI (reg:SI 5 $5 [ index ])
                (const_int 2 [0x2]))
            (reg:SI 4 $4 [ a ]))) lsa.c:3 444 {lsa}
     (expr_list:REG_DEAD (reg:SI 5 $5 [ index ])
        (expr_list:REG_DEAD (reg:SI 4 $4 [ a ])
            (nil))))
(insn 15 10 16 2 (set (reg/i:SI 2 $2)
        (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16])))
lsa.c:4 237 {*extendhisi2_seh}
     (expr_list:REG_DEAD (reg/f:SI 203)
        (nil)))

And would generate:

	lsa	$4,$5,$4,1
	lh	$2,0($4)
	

But now it produces:

(insn 7 4 8 2 (set (reg:SI 202)
        (ashift:SI (reg:SI 5 $5 [ index ])
            (const_int 1 [0x1]))) lsa.c:3 432 {*ashlsi3}
     (expr_list:REG_DEAD (reg:SI 5 $5 [ index ])
        (nil)))
(insn 8 7 9 2 (set (reg/f:SI 203)
        (plus:SI (reg:SI 4 $4 [ a ])
            (reg:SI 202))) lsa.c:3 13 {*addsi3}
     (expr_list:REG_DEAD (reg:SI 4 $4 [ a ])
        (expr_list:REG_DEAD (reg:SI 202)
            (nil)))
(insn 15 10 16 2 (set (reg/i:SI 2 $2)
        (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16])))
lsa.c:4 237 {*extendhisi2_seh}
     (expr_list:REG_DEAD (reg/f:SI 203)
        (nil)))

Which generates:

	sll	$5,$5,1
	addu	$4,$4,$5
	lh	$2,0($4)




Steve Ellcey
sellcey@imgtec.com
Segher Boessenkool May 11, 2015, 6:22 p.m. UTC | #4
Hi Steve,

On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
> This patch broke a number of MIPS tests, specifically mips32r6 tests
> that look for the lsa instruction (load scaled address) which shifts one
> register and then adds it to a second register.  I am not sure if this
> needs to be addressed in combine.c or if we need to add a peephole
> optimization to mips.md to handle the new instruction sequence.  What do
> you think?  Is the change here what you would expect to see from your
> patch?

Yes, this is as expected.  AFAICS the only change you need in the
MIPS backend is to change the "<GPR:d>lsa" pattern to match a shift
instead of a mult (and change the "const_immlsa_operand" predicate
to just match 1..4 instead of the powers).


Segher
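
(For illustration only: a minimal sketch of the kind of mips.md change described above.  The pattern shape follows the RTL shown earlier in this thread; the constraints, insn condition and output shown here are placeholders and may not match the real mips.md.)

 (define_insn "<GPR:d>lsa"
   [(set (match_operand:GPR 0 "register_operand" "=d")
-        (plus:GPR (mult:GPR (match_operand:GPR 1 "register_operand" "d")
-                            (match_operand 2 "const_immlsa_operand" ""))
+        (plus:GPR (ashift:GPR (match_operand:GPR 1 "register_operand" "d")
+                              (match_operand 2 "const_immlsa_operand" ""))
                   (match_operand:GPR 3 "register_operand" "d")))]
   "ISA_HAS_LSA"
   ...)

together with const_immlsa_operand adjusted to accept the shift amounts 1..4 rather than the corresponding powers of two.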
Steve Ellcey May 11, 2015, 7:44 p.m. UTC | #5
On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote:
> Hi Steve,
> 
> On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
> > This patch broke a number of MIPS tests, specifically mips32r6 tests
> > that look for the lsa instruction (load scaled address) which shifts one
> > register and then adds it to a second register.  I am not sure if this
> > needs to be addressed in combine.c or if we need to add a peephole
> > optimization to mips.md to handle the new instruction sequence.  What do
> > you think?  Is the change here what you would expect to see from your
> > patch?
> 
> Yes, this is as expected.  AFAICS the only change you need in the
> MIPS backend is to change the "<GPR:d>lsa" pattern to match a shift
> instead of a mult (and change the "const_immlsa_operand" predicate
> to just match 1..4 instead of the powers).
> 
> 
> Segher

Hm, I thought it was going to be more complicated than that, but it
seems to be working.  I will do a complete test run and then submit a
patch.

Steve Ellcey
Jeff Law May 11, 2015, 7:46 p.m. UTC | #6
On 05/11/2015 01:44 PM, Steve Ellcey wrote:
> On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote:
>> Hi Steve,
>>
>> On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
>>> This patch broke a number of MIPS tests, specifically mips32r6 tests
>>> that look for the lsa instruction (load scaled address) which shifts one
>>> register and then adds it to a second register.  I am not sure if this
>>> needs to be addressed in combine.c or if we need to add a peephole
>>> optimization to mips.md to handle the new instruction sequence.  What do
>>> you think?  Is the change here what you would expect to see from your
>>> patch?
>>
>> Yes, this is as expected.  AFAICS the only change you need in the
>> MIPS backend is to change the "<GPR:d>lsa" pattern to match a shift
>> instead of a mult (and change the "const_immlsa_operand" predicate
>> to just match 1..4 instead of the powers).
>>
>>
>> Segher
>
> Hm, I thought it was going to be more complicated than that, but it
> seems to be working.  I will do a complete test run and then submit a
> patch.
Yea, it really should be that easy.

I'm pretty sure the sh[123]add insns in the PA need to be updated in a 
similar manner.

jeff
Jeff Law May 11, 2015, 7:46 p.m. UTC | #7
On 05/11/2015 01:46 PM, Jeff Law wrote:
> On 05/11/2015 01:44 PM, Steve Ellcey wrote:
>> On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote:
>>> Hi Steve,
>>>
>>> On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
>>>> This patch broke a number of MIPS tests, specifically mips32r6 tests
>>>> that look for the lsa instruction (load scaled address) which shifts
>>>> one
>>>> register and then adds it to a second register.  I am not sure if this
>>>> needs to be addressed in combine.c or if we need to add a peephole
>>>> optimization to mips.md to handle the new instruction sequence.
>>>> What do
>>>> you think?  Is the change here what you would expect to see from your
>>>> patch?
>>>
>>> Yes, this is as expected.  AFAICS the only change you need in the
>>> MIPS backend is to change the "<GPR:d>lsa" pattern to match a shift
>>> instead of a mult (and change the "const_immlsa_operand" predicate
>>> to just match 1..4 instead of the powers).
>>>
>>>
>>> Segher
>>
>> Hm, I thought it was going to be more complicated than that, but it
>> seems to be working.  I will do a complete test run and then submit a
>> patch.
> Yea, it really should be that easy.
>
> I'm pretty sure the sh[123]add insns in the PA need to be updated in a
> similar manner.
Oh, and just to be clear, I'm not expecting you to do this, Steve.  It'd 
be great if you did, but not required.

jeff
Matthew Fortune May 11, 2015, 8:16 p.m. UTC | #8
Jeff Law <law@redhat.com> writes:
> On 05/11/2015 01:46 PM, Jeff Law wrote:
> > On 05/11/2015 01:44 PM, Steve Ellcey wrote:
> >> On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote:
> >>> Hi Steve,
> >>>
> >>> On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
> >>>> This patch broke a number of MIPS tests, specifically mips32r6
> >>>> tests that look for the lsa instruction (load scaled address) which
> >>>> shifts one register and then adds it to a second register.  I am
> >>>> not sure if this needs to be addressed in combine.c or if we need
> >>>> to add a peephole optimization to mips.md to handle the new
> >>>> instruction sequence.
> >>>> What do
> >>>> you think?  Is the change here what you would expect to see from
> >>>> your patch?
> >>>
> >>> Yes, this is as expected.  AFAICS the only change you need in the
> >>> MIPS backend is to change the "<GPR:d>lsa" pattern to match a shift
> >>> instead of a mult (and change the "const_immlsa_operand" predicate
> >>> to just match 1..4 instead of the powers).
> >>>
> >>>
> >>> Segher
> >>
> >> Hm, I thought it was going to be more complicated than that, but it
> >> seems to be working.  I will do a complete test run and then submit a
> >> patch.
> > Yea, it really should be that easy.
> >
> > I'm pretty sure the sh[123]add insns in the PA need to be updated in a
> > similar manner.
> Oh, and just to be clear, I'm not expecting you to do this Steve.  It'd
> be great if you did, but not required.

Does this patch effectively change the canonicalization rules? The following
still exists in md.texi:

@item
Within address computations (i.e., inside @code{mem}), a left shift is
converted into the appropriate multiplication by a power of two.

Thanks,
Matthew
Segher Boessenkool May 11, 2015, 8:30 p.m. UTC | #9
On Mon, May 11, 2015 at 08:16:41PM +0000, Matthew Fortune wrote:
> Does this patch effectively change the canonicalization rules? The following
> Still exists in md.texi:
> 
> @item
> Within address computations (i.e., inside @code{mem}), a left shift is
> converted into the appropriate multiplication by a power of two.

No, it makes combine *follow* those rules -- this isn't inside a MEM.


Segher
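
(To make the distinction concrete, here is a hand-written illustration, not taken from any dump in this thread, of the two canonical forms for the same scaled-index computation on a 32-bit target: inside a MEM the scaling stays a mult, outside a MEM it becomes a shift.)

;; Address computation inside a MEM: the canonical form keeps the mult.
(set (reg:SI 2)
     (mem:SI (plus:SI (mult:SI (reg:SI 5) (const_int 4))
                      (reg:SI 4))))

;; The same scaling outside a MEM: the canonical form uses a shift.
(set (reg:SI 3)
     (plus:SI (ashift:SI (reg:SI 5) (const_int 2))
              (reg:SI 4)))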
Matthew Fortune May 11, 2015, 8:52 p.m. UTC | #10
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Mon, May 11, 2015 at 08:16:41PM +0000, Matthew Fortune wrote:
> > Does this patch effectively change the canonicalization rules? The
> > following still exists in md.texi:
> >
> > @item
> > Within address computations (i.e., inside @code{mem}), a left shift is
> > converted into the appropriate multiplication by a power of two.
> 
> No, it makes combine *follow* those rules -- this isn't inside a MEM.

Thanks, I'm being a bit slow today.

Matthew
Kumar, Venkataramanan May 12, 2015, 5:21 a.m. UTC | #11
Hi Steve, 

Yes, this is expected.  As Segher pointed out, we need to change the .md patterns in the target to be based on shifts instead of mults.

Regards,
Venkat.

-----Original Message-----
From: Steve Ellcey [mailto:sellcey@imgtec.com] 
Sent: Monday, May 11, 2015 11:20 PM
To: Kumar, Venkataramanan
Cc: Segher Boessenkool; Jeff Law (law@redhat.com); gcc-patches@gcc.gnu.org; maxim.kuvyrkov@linaro.org; Matthew Fortune; clm
Subject: RE: [RFC]: Remove Mem/address type assumption in combiner

On Thu, 2015-05-07 at 11:01 +0000, Kumar, Venkataramanan wrote:
> Hi Segher,
> 
> Thank you, I committed it as r222874.
> Ref: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=222874
> 
> Regards,
> Venkat.


Venkat,

This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register.  I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence.  What do you think?  Is the change here what you would expect to see from your patch?

With this C code:


signed short test (signed short *a, int index) {
  return a[index];
}


GCC/combine for mips32r6 used to produce:

(insn 8 7 9 2 (set (reg/f:SI 203)
        (plus:SI (mult:SI (reg:SI 5 $5 [ index ])
                (const_int 2 [0x2]))
            (reg:SI 4 $4 [ a ]))) lsa.c:3 444 {lsa}
     (expr_list:REG_DEAD (reg:SI 5 $5 [ index ])
        (expr_list:REG_DEAD (reg:SI 4 $4 [ a ])
            (nil))))
(insn 15 10 16 2 (set (reg/i:SI 2 $2)
        (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16])))
lsa.c:4 237 {*extendhisi2_seh}
     (expr_list:REG_DEAD (reg/f:SI 203)
        (nil)))

And would generate:

	lsa	$4,$5,$4,1
	lh	$2,0($4)
	

But now it produces:

(insn 7 4 8 2 (set (reg:SI 202)
        (ashift:SI (reg:SI 5 $5 [ index ])
            (const_int 1 [0x1]))) lsa.c:3 432 {*ashlsi3}
     (expr_list:REG_DEAD (reg:SI 5 $5 [ index ])
        (nil)))
(insn 8 7 9 2 (set (reg/f:SI 203)
        (plus:SI (reg:SI 4 $4 [ a ])
            (reg:SI 202))) lsa.c:3 13 {*addsi3}
     (expr_list:REG_DEAD (reg:SI 4 $4 [ a ])
        (expr_list:REG_DEAD (reg:SI 202)
            (nil)))
(insn 15 10 16 2 (set (reg/i:SI 2 $2)
        (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16])))
lsa.c:4 237 {*extendhisi2_seh}
     (expr_list:REG_DEAD (reg/f:SI 203)
        (nil)))

Which generates:

	sll	$5,$5,1
	addu	$4,$4,$5
	lh	$2,0($4)




Steve Ellcey
sellcey@imgtec.com

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index c04146a..9e3eb03 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7723,9 +7723,8 @@  extract_left_shift (rtx x, int count)
    We try, as much as possible, to re-use rtl expressions to save memory.

    IN_CODE says what kind of expression we are processing.  Normally, it is
-   SET.  In a memory address (inside a MEM, PLUS or minus, the latter two
-   being kludges), it is MEM.  When processing the arguments of a comparison
-   or a COMPARE against zero, it is COMPARE.  */
+   SET.  In a memory address it is MEM.  When processing the arguments of
+   a comparison or a COMPARE against zero, it is COMPARE.  */

rtx
 make_compound_operation (rtx x, enum rtx_code in_code)
@@ -7745,8 +7744,6 @@  make_compound_operation (rtx x, enum rtx_code in_code)
      but once inside, go back to our default of SET.  */

   next_code = (code == MEM ? MEM
-              : ((code == PLUS || code == MINUS)
-                 && SCALAR_INT_MODE_P (mode)) ? MEM
               : ((code == COMPARE || COMPARISON_P (x))
                  && XEXP (x, 1) == const0_rtx) ? COMPARE
               : in_code == COMPARE ? SET : in_code);