[ix86] Fix rtx_costs for flag-setting adds
diff mbox series

Message ID 29c1ed2a-10b0-63c9-cfe1-942b1a35e844@t-online.de
State New
Headers show
Series
  • [ix86] Fix rtx_costs for flag-setting adds
Related show

Commit Message

Bernd Schmidt Nov. 22, 2019, 12:58 p.m. UTC
A patch I posted recently fixes combine to take costs of JUMP_INSNs into
account. That causes the pr30315 test to fail with -m32, since the cost
of an add that sets the flags is estimated too high.

The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?


Bernd

Comments

Uros Bizjak Nov. 22, 2019, 2:04 p.m. UTC | #1
On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>
> A patch I posted recently fixes combine to take costs of JUMP_INSNs into
> account. That causes the pr30315 test to fail with -m32, since the cost
> of an add that sets the flags is estimated too high.
>
> The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?

I think that the intention of the code below  "The embedded comparison
operand is completely free." comment is to handle this case. It looks
that it should return the cost of the inside operation of COMPARE rtx.

Uros.
Bernd Schmidt Nov. 22, 2019, 4:39 p.m. UTC | #2
On 11/22/19 3:04 PM, Uros Bizjak wrote:
> On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>>
>> A patch I posted recently fixes combine to take costs of JUMP_INSNs into
>> account. That causes the pr30315 test to fail with -m32, since the cost
>> of an add that sets the flags is estimated too high.
>>
>> The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?
> 
> I think that the intention of the code below  "The embedded comparison
> operand is completely free." comment is to handle this case. It looks
> that it should return the cost of the inside operation of COMPARE rtx.

There seem to be two problems with that. We're dealing with patterns
such as:

        (set (reg:CCC 17 flags)
            (compare:CCC (plus:SI (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4
A32])
                    (reg/v:SI 87 [ b ]))
                (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4 A32])))

If I remove the test for const0_rtx, it still doesn't work - I think
setting *total to zero is ineffective, since we'll still count the MEM
twice.

So, how about the following?


Bernd

@@ -19502,9 +19502,12 @@
 	}

       /* The embedded comparison operand is completely free.  */
-      if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0)))
-	  && XEXP (x, 1) == const0_rtx)
-	*total = 0;
+      if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
+	{
+	  *total = rtx_cost (XEXP (x, 0), GET_MODE (XEXP (x, 0)),
+			     COMPARE, 0, speed);
+	  return true;
+	}

       return false;
Uros Bizjak Nov. 22, 2019, 5:05 p.m. UTC | #3
On Fri, Nov 22, 2019 at 5:39 PM Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>
> On 11/22/19 3:04 PM, Uros Bizjak wrote:
> > On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> >>
> >> A patch I posted recently fixes combine to take costs of JUMP_INSNs into
> >> account. That causes the pr30315 test to fail with -m32, since the cost
> >> of an add that sets the flags is estimated too high.
> >>
> >> The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?
> >
> > I think that the intention of the code below  "The embedded comparison
> > operand is completely free." comment is to handle this case. It looks
> > that it should return the cost of the inside operation of COMPARE rtx.
>
> There seem to be two problems with that. We're dealing with patterns
> such as:
>
>         (set (reg:CCC 17 flags)
>             (compare:CCC (plus:SI (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4
> A32])
>                     (reg/v:SI 87 [ b ]))
>                 (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4 A32])))

Indeed, this is a different case, an overflow test that results in one
CMP insn. I think, we should check if the second operand is either 0
(then proceed as it is now), or if the second operand equals first
operand of PLUS insn, then we actually emit CMP insn (please see
PR30315).

Uros.

> If I remove the test for const0_rtx, it still doesn't work - I think
> setting *total to zero is ineffective, since we'll still count the MEM
> twice.
>
> So, how about the following?
>
>
> Bernd
>
> @@ -19502,9 +19502,12 @@
>         }
>
>        /* The embedded comparison operand is completely free.  */
> -      if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0)))
> -         && XEXP (x, 1) == const0_rtx)
> -       *total = 0;
> +      if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
> +       {
> +         *total = rtx_cost (XEXP (x, 0), GET_MODE (XEXP (x, 0)),
> +                            COMPARE, 0, speed);
> +         return true;
> +       }
>
>        return false;
>
Bernd Schmidt Nov. 24, 2019, 1:08 p.m. UTC | #4
On 11/22/19 6:05 PM, Uros Bizjak wrote:
> Indeed, this is a different case, an overflow test that results in one
> CMP insn. I think, we should check if the second operand is either 0
> (then proceed as it is now), or if the second operand equals first
> operand of PLUS insn, then we actually emit CMP insn (please see
> PR30315).

Here's what I committed - preapproved by Uros off-list (I forgot
Reply-All again).


Bernd

Patch
diff mbox series

	* config/i386/i386.c (ix86_rtx_costs): For a PLUS inside a COMPARE,
	representing an add that sets the flags, count just the PLUS.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7115ec44c2a..6e48f5ccbde 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19500,6 +19500,11 @@  ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
 		    + rtx_cost (const1_rtx, mode, outer_code, opno, speed));
 	  return true;
 	}
+      if (GET_CODE (XEXP (x, 0)) == PLUS)
+	{
+	  *total = rtx_cost (XEXP (x, 0), mode, COMPARE, 0, speed);
+	  return true;
+	}
 
       /* The embedded comparison operand is completely free.  */
       if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0)))