Patchwork Prefer reg as first operand in commutative operator

login
register
mail settings
Submitter Paulo J. Matos
Date Feb. 14, 2012, 7:27 a.m.
Message ID <jhd2cq$8rh$1@dough.gmane.org>
Download mbox | patch
Permalink /patch/141061/
State New
Headers show

Comments

Paulo J. Matos - Feb. 14, 2012, 7:27 a.m.
Hi,

This patch was submitted as part of PR 52235.
It increases the preference of a register for first operand of a 
commutative operator.

2012-02-13 Paulo Matos <paulo.matos@csr.com>
	* gcc/rtlanal.c: Increase preference of a register for the
	first operand in a commutative operator.


@@ -3059,26 +3059,28 @@
      {
      case RTX_CONST_OBJ:
        if (code == CONST_INT)
+        return -7;
+      if (code == CONST_DOUBLE)
          return -6;
-      if (code == CONST_DOUBLE)
-        return -5;
        if (code == CONST_FIXED)
-        return -5;
-      return -4;
+        return -6;
+      return -5;

      case RTX_EXTRA:
        /* SUBREGs of objects should come second.  */
        if (code == SUBREG && OBJECT_P (SUBREG_REG (op)))
-        return -3;
+        return -4;
        return 0;

      case RTX_OBJ:
        /* Complex expressions should be the first, so decrease priority
           of objects.  Prefer pointer objects over non pointer objects.  */
-      if ((REG_P (op) && REG_POINTER (op))
-	  || (MEM_P (op) && MEM_POINTER (op)))
-	return -1;
-      return -2;
+      if(REG_P(op))
+          return -1;
+      else if ((REG_P (op) && REG_POINTER (op))
+               || (MEM_P (op) && MEM_POINTER (op)))
+          return -2;
+      return -3;

      case RTX_COMM_ARITH:
        /* Prefer operands that are themselves commutative to be first.
Jakub Jelinek - Feb. 14, 2012, 9:52 a.m.
On Tue, Feb 14, 2012 at 07:27:21AM +0000, Paulo J. Matos wrote:
>      case RTX_OBJ:
>        /* Complex expressions should be the first, so decrease priority
>           of objects.  Prefer pointer objects over non pointer objects.  */
> -      if ((REG_P (op) && REG_POINTER (op))
> -	  || (MEM_P (op) && MEM_POINTER (op)))
> -	return -1;
> -      return -2;
> +      if(REG_P(op))
> +          return -1;
> +      else if ((REG_P (op) && REG_POINTER (op))
> +               || (MEM_P (op) && MEM_POINTER (op)))
> +          return -2;
> +      return -3;

The above is definitely wrong, I think it will penalize e.g. Power6/7 a lot.
Note that the REG_P && REG_POINTER testis then useless because of your
change.  So, if anything, you'd need to use highest priority for
REG_P && REG_POINTER, then MEM_P && MEM_POINTER, then REG_P and then MEM_P
and verify it doesn't regress on any major target on SPEC etc.
Watch your formatting and I don't think this is stage4 material in any case,
it is extremely risky change.

	Jakub
Paolo Bonzini - Feb. 14, 2012, 1:46 p.m.
On 02/14/2012 10:52 AM, Jakub Jelinek wrote:
>> >        /* Complex expressions should be the first, so decrease priority
>> >           of objects.  Prefer pointer objects over non pointer objects.  */
>> > -      if ((REG_P (op) && REG_POINTER (op))
>> > -	  || (MEM_P (op) && MEM_POINTER (op)))
>> > -	return -1;
>> > -      return -2;
>> > +      if(REG_P(op))
>> > +          return -1;
>> > +      else if ((REG_P (op) && REG_POINTER (op))
>> > +               || (MEM_P (op) && MEM_POINTER (op)))
>> > +          return -2;
>> > +      return -3;
> The above is definitely wrong, I think it will penalize e.g. Power6/7 a lot.

Even if this was fixed, I'm not sure it is always the correct thing to 
do, because it would override swap_commutative_operands_with_target.  If 
the target is memory, you'd want the memory to come first.

In other words, there is no right answer as to whether REG or MEM should 
come first.

I think the register allocator will generate good code using % if you 
make your predicate nonimmediate_operand in operand 1:

(define_insn "iorqi3"
   [(set (match_operand:QI 0 "register_operand" "=c")
         (ior:QI (match_operand:QI 1 "nonimmediate_operand" "%0")
                 (match_operand:QI 2 "general_operand" "cwmi")))
    (clobber (reg:CC RCC))]
   "register_operand(operands[1], QImode) ||
    register_operand(operands[2], QImode)"
   "or\\t%0,%f2")

Paolo
Paulo J. Matos - Feb. 14, 2012, 1:57 p.m.
On 14/02/12 09:52, Jakub Jelinek wrote:
> On Tue, Feb 14, 2012 at 07:27:21AM +0000, Paulo J. Matos wrote:
>>       case RTX_OBJ:
>>         /* Complex expressions should be the first, so decrease priority
>>            of objects.  Prefer pointer objects over non pointer objects.  */
>> -      if ((REG_P (op)&&  REG_POINTER (op))
>> -	  || (MEM_P (op)&&  MEM_POINTER (op)))
>> -	return -1;
>> -      return -2;
>> +      if(REG_P(op))
>> +          return -1;
>> +      else if ((REG_P (op)&&  REG_POINTER (op))
>> +               || (MEM_P (op)&&  MEM_POINTER (op)))
>> +          return -2;
>> +      return -3;
>
> The above is definitely wrong, I think it will penalize e.g. Power6/7 a lot.
> Note that the REG_P&&  REG_POINTER testis then useless because of your
> change.  So, if anything, you'd need to use highest priority for
> REG_P&&  REG_POINTER, then MEM_P&&  MEM_POINTER, then REG_P and then MEM_P
> and verify it doesn't regress on any major target on SPEC etc.
> Watch your formatting and I don't think this is stage4 material in any case,
> it is extremely risky change.
>
> 	Jakub
>

Thanks for the comments Jakub. I submitted the bug report since it 
helped me (my backend generated code is better) and Richard in the gcc 
mailing list agreed it to be unintuitive.

It is true that my test makes the following test on REG_P and 
REG_POINTER useless. My mistake, will fix.
I haven't tested it on any major target and the submission as a bug was 
in order to see if it has any interest for gcc main targets.
Paulo J. Matos - Feb. 14, 2012, 2:05 p.m.
On 14/02/12 13:46, Paolo Bonzini wrote:
> On 02/14/2012 10:52 AM, Jakub Jelinek wrote:
>>> > /* Complex expressions should be the first, so decrease priority
>>> > of objects. Prefer pointer objects over non pointer objects. */
>>> > - if ((REG_P (op) && REG_POINTER (op))
>>> > - || (MEM_P (op) && MEM_POINTER (op)))
>>> > - return -1;
>>> > - return -2;
>>> > + if(REG_P(op))
>>> > + return -1;
>>> > + else if ((REG_P (op) && REG_POINTER (op))
>>> > + || (MEM_P (op) && MEM_POINTER (op)))
>>> > + return -2;
>>> > + return -3;
>> The above is definitely wrong, I think it will penalize e.g. Power6/7
>> a lot.
>
> Even if this was fixed, I'm not sure it is always the correct thing to
> do, because it would override swap_commutative_operands_with_target. If
> the target is memory, you'd want the memory to come first.
>

I was unaware of swap_commutative_operands_with_target, therefore the 
patch change makes things a lot trickier.

> In other words, there is no right answer as to whether REG or MEM should
> come first.
>
> I think the register allocator will generate good code using % if you
> make your predicate nonimmediate_operand in operand 1:
>
> (define_insn "iorqi3"
> [(set (match_operand:QI 0 "register_operand" "=c")
> (ior:QI (match_operand:QI 1 "nonimmediate_operand" "%0")
> (match_operand:QI 2 "general_operand" "cwmi")))
> (clobber (reg:CC RCC))]
> "register_operand(operands[1], QImode) ||
> register_operand(operands[2], QImode)"
> "or\\t%0,%f2")
>
> Paolo
>

Yes, I think in general that seems to be the right procedure. In my 
case, unfortunately it does not work.
The reason is that it allows:
(set (reg:QI ...)
      (ior:QI (mem:QI (reg:QI ...))
              (mem:QI (reg:QI ...))))

This wouldn't in general be a problem except that my backend only has 
one register that can be used for a memory dereference, which means that 
BASE_REG_CLASS is a class with a single register.

This seriously breaks IRA, which doesn't have two registers to allocate 
for that rule.

I always try to change my backend before changing my port of GCC and 
unfortunately in this case I saw no alternative my to change 
commutative_operand_precedence. It might be the wrong solution in the 
general case, therefore just disregard it.

Thanks for the comments,
Jakub Jelinek - Feb. 14, 2012, 2:10 p.m.
On Tue, Feb 14, 2012 at 02:05:30PM +0000, Paulo J. Matos wrote:
> >I think the register allocator will generate good code using % if you
> >make your predicate nonimmediate_operand in operand 1:
> >
> >(define_insn "iorqi3"
> >[(set (match_operand:QI 0 "register_operand" "=c")
> >(ior:QI (match_operand:QI 1 "nonimmediate_operand" "%0")
> >(match_operand:QI 2 "general_operand" "cwmi")))
> >(clobber (reg:CC RCC))]
> >"register_operand(operands[1], QImode) ||
> >register_operand(operands[2], QImode)"
> >"or\\t%0,%f2")

> Yes, I think in general that seems to be the right procedure. In my
> case, unfortunately it does not work.
> The reason is that it allows:
> (set (reg:QI ...)
>      (ior:QI (mem:QI (reg:QI ...))
>              (mem:QI (reg:QI ...))))
> 
> This wouldn't in general be a problem except that my backend only
> has one register that can be used for a memory dereference, which
> means that BASE_REG_CLASS is a class with a single register.

It doesn't allow that, because the condition on the insn then fails,
as neither operand 1 nor operand 2 is register_operand.

	Jakub
Paulo J. Matos - Feb. 14, 2012, 2:16 p.m.
On 14/02/12 14:10, Jakub Jelinek wrote:
>
> It doesn't allow that, because the condition on the insn then fails,
> as neither operand 1 nor operand 2 is register_operand.
>
> 	Jakub
>


Oh!!!
I incorrectly read the instruction as I had initially tried it and I 
knew it was failing, however, I didn't have the predicate Paolo suggested.

I will give it a go. It does look good and it would avoid touching 	core 
GCC. :)

Thanks guys,
Paulo J. Matos - Feb. 15, 2012, 10:04 a.m.
On 14/02/12 14:10, Jakub Jelinek wrote:
> On Tue, Feb 14, 2012 at 02:05:30PM +0000, Paulo J. Matos wrote:
>>> I think the register allocator will generate good code using % if you
>>> make your predicate nonimmediate_operand in operand 1:
>>>
>>> (define_insn "iorqi3"
>>> [(set (match_operand:QI 0 "register_operand" "=c")
>>> (ior:QI (match_operand:QI 1 "nonimmediate_operand" "%0")
>>> (match_operand:QI 2 "general_operand" "cwmi")))
>>> (clobber (reg:CC RCC))]
>>> "register_operand(operands[1], QImode) ||
>>> register_operand(operands[2], QImode)"
>>> "or\\t%0,%f2")
>
>> Yes, I think in general that seems to be the right procedure. In my
>> case, unfortunately it does not work.
>> The reason is that it allows:
>> (set (reg:QI ...)
>>       (ior:QI (mem:QI (reg:QI ...))
>>               (mem:QI (reg:QI ...))))
>>
>> This wouldn't in general be a problem except that my backend only
>> has one register that can be used for a memory dereference, which
>> means that BASE_REG_CLASS is a class with a single register.
>
> It doesn't allow that, because the condition on the insn then fails,
> as neither operand 1 nor operand 2 is register_operand.
>

It doesn't look like you can actually do that because, according to the 
internals manual:
"For a named pattern, the condition (if present) may not depend on the 
data in the insn being matched, but only the target-machine-type flags. 
The compiler needs to test these conditions during initialization in 
order to learn exactly which named instructions are available in a 
particular run."

And GCC complains with:
insn-opinit.c: In function 'init_all_optabs':
insn-opinit.c:24: error: 'operands' undeclared (first use in this function)
insn-opinit.c:24: error: (Each undeclared identifier is reported only once
insn-opinit.c:24: error: for each function it appears in.)

Did I miss something in the example?
Jakub Jelinek - Feb. 15, 2012, 10:14 a.m.
On Wed, Feb 15, 2012 at 10:04:15AM +0000, Paulo J. Matos wrote:
> It doesn't look like you can actually do that because, according to
> the internals manual:
> "For a named pattern, the condition (if present) may not depend on
> the data in the insn being matched, but only the target-machine-type
> flags. The compiler needs to test these conditions during
> initialization in order to learn exactly which named instructions
> are available in a particular run."
> 
> And GCC complains with:
> insn-opinit.c: In function 'init_all_optabs':
> insn-opinit.c:24: error: 'operands' undeclared (first use in this function)
> insn-opinit.c:24: error: (Each undeclared identifier is reported only once
> insn-opinit.c:24: error: for each function it appears in.)
> 
> Did I miss something in the example?

In define_expand (or define_insn of a standard named pattern) you can't use
operands, those conditions are used to select whether the given insn is
available at all.  But you can use it in define_insn conditions for other
insn patterns.
So you can have
(define_expand "iorqi3"
  ...
  "")

(define_insn "*iorqi3"
  ...
  "register_operand (operands[1], QImode) || register_operand (operands[2], QImode)"
...
)

See config/i386/i386.md for thousands of examples.

	Jakub

Patch

--- gcc46/gcc/rtlanal.c	(gcc 4.6.2)
+++ gcc46/gcc/rtlanal.c	(working copy)
@@ -3047,11 +3047,11 @@ 

    /* Constants always come the second operand.  Prefer "nice" 
constants.  */
    if (code == CONST_INT)
+    return -9;
+  if (code == CONST_DOUBLE)
      return -8;
-  if (code == CONST_DOUBLE)
-    return -7;
    if (code == CONST_FIXED)
-    return -7;
+    return -8;
    op = avoid_constant_pool_reference (op);
    code = GET_CODE (op);