Patchwork RFA: patch to fix 2 testsuite failures for LRA on PPC

login
register
mail settings
Submitter Vladimir Makarov
Date Dec. 5, 2013, 5:40 p.m.
Message ID <52A0BA81.2030108@redhat.com>
Download mbox | patch
Permalink /patch/297305/
State New
Headers show

Comments

Vladimir Makarov - Dec. 5, 2013, 5:40 p.m.
The following patch fixes two GCC testsuite failures for LRA.  The patch 
makes swap through registers instead of memory for the test cases when 
LRA is used.

There are differences in reload and LRA constraint matching algorithm 
which results in different alternative choices when the original pattern 
is used.

Actually my first proposed solution variant used one pattern which is 
now for LRA in this patch.  But some doubt arose that it may affect 
reload pass in some bad way.

Ok to commit?

2013-12-05  Vladimir Makarov  <vmakarov@redhat.com>

        * config/rs6000/rs6000.md (*bswapdi2_64bit): Make two versions of
        the insn definition.  One is for reload, another one is for LRA.
David Edelsohn - Dec. 6, 2013, 1:44 p.m.
On Thu, Dec 5, 2013 at 12:40 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> The following patch fixes two GCC testsuite failures for LRA.  The patch
> makes swap through registers instead of memory for the test cases when LRA
> is used.
>
> There are differences in reload and LRA constraint matching algorithm which
> results in different alternative choices when the original pattern is used.
>
> Actually my first proposed solution variant used one pattern which is now
> for LRA in this patch.  But some doubt arose that it may affect reload pass
> in some bad way.
>
> Ok to commit?

I understand that LRA requires different tuning than reload, but I
continue to be a little uncomfortable with different patterns for LRA
and reload.

I would like to head some additional opinions.

Thanks, David
Vladimir Makarov - Dec. 6, 2013, 3:39 p.m.
On 12/6/2013, 8:44 AM, David Edelsohn wrote:
> On Thu, Dec 5, 2013 at 12:40 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>> The following patch fixes two GCC testsuite failures for LRA.  The patch
>> makes swap through registers instead of memory for the test cases when LRA
>> is used.
>>
>> There are differences in reload and LRA constraint matching algorithm which
>> results in different alternative choices when the original pattern is used.
>>
>> Actually my first proposed solution variant used one pattern which is now
>> for LRA in this patch.  But some doubt arose that it may affect reload pass
>> in some bad way.
>>
>> Ok to commit?
>
> I understand that LRA requires different tuning than reload, but I
> continue to be a little uncomfortable with different patterns for LRA
> and reload.
>
> I would like to head some additional opinions.
>
>

   Ok. I guess there is only one option to use one pattern for LRA and 
reload without ?? in register alternative.  In this case, reload and LRA 
will actually work according to GCC documentation (LRA treats ? cost as 
the cost of one reload, reload does the same but not in this case).

   That was my first solution but you were not comfortable with this too.

   Changing LRA most sensitive code to behave (wrongly in this case) as 
reload is not an option for me.

   So I don't know what to do anymore to fix this 2 failures.
Jakub Jelinek - Dec. 6, 2013, 3:45 p.m.
On Fri, Dec 06, 2013 at 10:39:29AM -0500, Vladimir Makarov wrote:
>   Ok. I guess there is only one option to use one pattern for LRA
> and reload without ?? in register alternative.  In this case, reload
> and LRA will actually work according to GCC documentation (LRA
> treats ? cost as the cost of one reload, reload does the same but
> not in this case).
> 
>   That was my first solution but you were not comfortable with this too.
> 
>   Changing LRA most sensitive code to behave (wrongly in this case)
> as reload is not an option for me.
> 
>   So I don't know what to do anymore to fix this 2 failures.

Could it be handled by enabled attribute?  You'd duplicate the
alternatives, one would be with the ??, one without, and enabled
attribute on the insn would be 1 for the first two alternatives
and also for the ?? alternative if not LRA, or non-?? alternative
if LRA.

	Jakub
Vladimir Makarov - Dec. 6, 2013, 3:59 p.m.
On 12/6/2013, 10:45 AM, Jakub Jelinek wrote:
> On Fri, Dec 06, 2013 at 10:39:29AM -0500, Vladimir Makarov wrote:
>>    Ok. I guess there is only one option to use one pattern for LRA
>> and reload without ?? in register alternative.  In this case, reload
>> and LRA will actually work according to GCC documentation (LRA
>> treats ? cost as the cost of one reload, reload does the same but
>> not in this case).
>>
>>    That was my first solution but you were not comfortable with this too.
>>
>>    Changing LRA most sensitive code to behave (wrongly in this case)
>> as reload is not an option for me.
>>
>>    So I don't know what to do anymore to fix this 2 failures.
>
> Could it be handled by enabled attribute?  You'd duplicate the
> alternatives, one would be with the ??, one without, and enabled
> attribute on the insn would be 1 for the first two alternatives
> and also for the ?? alternative if not LRA, or non-?? alternative
> if LRA.
>
>

It is still two different patterns.  One for reload and one for LRA.
Attribute enabled is mostly used to describe insn constraints for 
subtargets.

IMO removing ?? is the most right choice as both reload and LRA works 
fine without this.

On the other hand, it is not so important bug as it is performance one 
and as IBM guys at least for now are oriented to reload.  They have too 
many things (power8) to do besides LRA.

So we could postpone resolving these failures.
Jakub Jelinek - Dec. 6, 2013, 4:08 p.m.
On Fri, Dec 06, 2013 at 10:59:37AM -0500, Vladimir Makarov wrote:
> It is still two different patterns.  One for reload and one for LRA.
> Attribute enabled is mostly used to describe insn constraints for
> subtargets.

I meant something like:
(define_insn "*bswapdi2_64bit"
  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r,r")
       (bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r,r")))
   (clobber (match_scratch:DI 2 "=&b,&b,&r,&r"))
   (clobber (match_scratch:DI 3 "=&r,&r,&r,&r"))
   (clobber (match_scratch:DI 4 "=&r,X,&r,&r"))]
  "TARGET_POWERPC64 && !TARGET_LDBRX
   && (REG_P (operands[0]) || REG_P (operands[1]))
   && !(MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))
   && !(MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))"
  "#"
  [(set_attr "length" "16,12,36,36")
   (set (attr "enabled")
        (cond [(eq_attr "alternative" "0,1")
                 (const_int 1)
               (and (eq_attr "alternative" "2")
                    (match_test "!rs6000_lra_flag"))
		 (const_int 1)
               (and (eq_attr "alternative" "3")
                    (match_test "rs6000_lra_flag"))
		 (const_int 1)]
	      (const_int 0)]))])

That is just one pattern.
And of course
(define_attr "enabled" "" (const_int 1))
somewhere early, because rs6000 wasn't using enabled attribute yet.

	Jakub
Michael Meissner - Dec. 6, 2013, 4:28 p.m.
On Thu, Dec 05, 2013 at 12:40:17PM -0500, Vladimir Makarov wrote:
> The following patch fixes two GCC testsuite failures for LRA.  The
> patch makes swap through registers instead of memory for the test
> cases when LRA is used.
> 
> There are differences in reload and LRA constraint matching
> algorithm which results in different alternative choices when the
> original pattern is used.
> 
> Actually my first proposed solution variant used one pattern which
> is now for LRA in this patch.  But some doubt arose that it may
> affect reload pass in some bad way.
> 
> Ok to commit?

I must admit to not remembering why I used ??&r.  I know I wanted it to prefer
doing the memory patterns.  I would think we should try just the pattern
without the ??.
Vladimir Makarov - Dec. 6, 2013, 5:30 p.m.
On 12/6/2013, 11:28 AM, Michael Meissner wrote:
> On Thu, Dec 05, 2013 at 12:40:17PM -0500, Vladimir Makarov wrote:
>> The following patch fixes two GCC testsuite failures for LRA.  The
>> patch makes swap through registers instead of memory for the test
>> cases when LRA is used.
>>
>> There are differences in reload and LRA constraint matching
>> algorithm which results in different alternative choices when the
>> original pattern is used.
>>
>> Actually my first proposed solution variant used one pattern which
>> is now for LRA in this patch.  But some doubt arose that it may
>> affect reload pass in some bad way.
>>
>> Ok to commit?
>
> I must admit to not remembering why I used ??&r.  I know I wanted it to prefer
> doing the memory patterns.  I would think we should try just the pattern
> without the ??.
>

   I tried it about 2 months ago.  I did not see any problems of such 
change for reload and LRA.  There were no regressions on GCC testsuite.

   So, Mike, if you don't see any compelling reason to keep ??, probably 
we should remove them.

If you don't mind, I'll make the patch and test again and after that 
submit it for approval.

Patch

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 205647)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -2377,14 +2377,28 @@ 
    [(set_attr "length" "4,4,36")
     (set_attr "type" "load,store,*")])

-;; Non-power7/cell, fall back to use lwbrx/stwbrx
+;; Non-power7/cell, fall back to use lwbrx/stwbrx -- reload variant
  (define_insn "*bswapdi2_64bit"
    [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r")
         (bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r")))
     (clobber (match_scratch:DI 2 "=&b,&b,&r"))
     (clobber (match_scratch:DI 3 "=&r,&r,&r"))
     (clobber (match_scratch:DI 4 "=&r,X,&r"))]
-  "TARGET_POWERPC64 && !TARGET_LDBRX
+  "TARGET_POWERPC64 && !TARGET_LDBRX && !rs6000_lra_flag
+   && (REG_P (operands[0]) || REG_P (operands[1]))
+   && !(MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))
+   && !(MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))"
+  "#"
+  [(set_attr "length" "16,12,36")])
+
+;; Non-power7/cell, fall back to use lwbrx/stwbrx -- LRA variant
+(define_insn "*bswapdi2_64bit"
+  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")
+       (bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r")))
+   (clobber (match_scratch:DI 2 "=&b,&b,&r"))
+   (clobber (match_scratch:DI 3 "=&r,&r,&r"))
+   (clobber (match_scratch:DI 4 "=&r,X,&r"))]
+  "TARGET_POWERPC64 && !TARGET_LDBRX && rs6000_lra_flag
     && (REG_P (operands[0]) || REG_P (operands[1]))
     && !(MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))
     && !(MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))"