diff mbox

[RFA,pr,target/60648] Fix non-canonical RTL from x86 backend -- P1 regression

Message ID 53330838.7010402@redhat.com
State New
Headers show

Commit Message

Jeff Law March 26, 2014, 5:02 p.m. UTC
The x86 backend can generate non-canonical RTL when it simplifies 
address expressions.

In particular addresses which have the form
(plus (mult) (A) (B) (label_ref))

If the multiplication can be simplified to a constant, the x86 backend 
will end up generating

(plus (constant) (label_ref))

Which is obviously non-canonical and should be written
(const (plus (label_ref) (constant))

This change merely canonicalizes the RTL, leaving it to other code to 
simplify.  At first I wanted to simplify, but it just gets painful if, 
for example B above is (const_int 0), in which case our example 
collapses into

(label_ref)

The subsequent code in the ix86_legitimize_address assumes it's still 
working with a PLUS.  Fixable, but painful.

It's also the case that simplify_rtx might return NULL.  So the code 
also has to DTRT in that case and the caller's don't DTRT if 
ix86_legitimize_address returns NULL.

Anyway, I did verify that the relevant code in the short example gets 
optimized if we just fix the canonicalization (namely the addition of 
(const_int 0) is eliminated).

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Verified it fixes the original and reduced testcase.

OK for the trunk?

Comments

Jakub Jelinek March 26, 2014, 6:12 p.m. UTC | #1
On Wed, Mar 26, 2014 at 11:02:48AM -0600, Jeff Law wrote:
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
> Verified it fixes the original and reduced testcase.

Note, the testcase is missing from your patch.

But I'd question if this is the right place to canonicalize it.
The non-canonical order seems to be created in the generic code, where
do_tablejump does:
11133	  /* ??? The only correct use of CASE_VECTOR_MODE is the one inside the
11134	     GET_MODE_SIZE, because this indicates how large insns are.  The other
11135	     uses should all be Pmode, because they are addresses.  This code
11136	     could fail if addresses and insns are not the same size.  */
11137	  index = gen_rtx_PLUS
11138	    (Pmode,
11139	     gen_rtx_MULT (Pmode, index,
11140			   gen_int_mode (GET_MODE_SIZE (CASE_VECTOR_MODE), Pmode)),
11141	     gen_rtx_LABEL_REF (Pmode, table_label));
and there I don't see why it shouldn't just try to simplify it.
Thus
  index = simplify_gen_binary (MULT, Pmode, index,
			       gen_int_mode (GET_MODE_SIZE (CASE_VECTOR_MODE),
					     Pmode));
  index = simplify_gen_binary (PLUS, Pmode, index,
			       gen_rtx_LABEL_REF (Pmode, table_label));
would be my (untested) preference.  In the usual case where index is
previously a REG (or less frequently a MEM), I guess the simplification
shouldn't make a difference.
Of course it would be better if we optimized this either at the tree level
or during expansion into a normal unconditional jump, but from what I see
we don't have enough info that expand_normal will expand it into a constant
earlier and try_tablejump doesn't get passed labelvec so that it would know
where to jump to.

	Jakub
Jeff Law March 26, 2014, 6:17 p.m. UTC | #2
On 03/26/14 12:12, Jakub Jelinek wrote:
> On Wed, Mar 26, 2014 at 11:02:48AM -0600, Jeff Law wrote:
>> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>> Verified it fixes the original and reduced testcase.
>
> Note, the testcase is missing from your patch.
>
> But I'd question if this is the right place to canonicalize it.
> The non-canonical order seems to be created in the generic code, where
> do_tablejump does:
No, at that point it's still canonical because the x86 backend hasn't 
simpified the (mult ...) subexpression.  Its the simplification of that 
subexpression to a constant that creates the non-canonical RTL.  That's 
why I fixed the x86 bits -- those are the bits that simplify the (mult 
...) into a (const_int) and thus creates the non-canonical RTL.

jeff
Jakub Jelinek March 26, 2014, 6:28 p.m. UTC | #3
On Wed, Mar 26, 2014 at 12:17:43PM -0600, Jeff Law wrote:
> On 03/26/14 12:12, Jakub Jelinek wrote:
> >On Wed, Mar 26, 2014 at 11:02:48AM -0600, Jeff Law wrote:
> >>Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
> >>Verified it fixes the original and reduced testcase.
> >
> >Note, the testcase is missing from your patch.
> >
> >But I'd question if this is the right place to canonicalize it.
> >The non-canonical order seems to be created in the generic code, where
> >do_tablejump does:
> No, at that point it's still canonical because the x86 backend
> hasn't simpified the (mult ...) subexpression.  Its the
> simplification of that subexpression to a constant that creates the
> non-canonical RTL.  That's why I fixed the x86 bits -- those are the
> bits that simplify the (mult ...) into a (const_int) and thus
> creates the non-canonical RTL.

(mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
And, I'd say it is likely other target legitimization hooks would also try
to simplify it similarly.
simplify_gen_binary is used in several other places during expansion,
so I don't see why it couldn't be desirable here.

	Jakub
Jeff Law March 26, 2014, 6:33 p.m. UTC | #4
On 03/26/14 12:28, Jakub Jelinek wrote:
> On Wed, Mar 26, 2014 at 12:17:43PM -0600, Jeff Law wrote:
>> On 03/26/14 12:12, Jakub Jelinek wrote:
>>> On Wed, Mar 26, 2014 at 11:02:48AM -0600, Jeff Law wrote:
>>>> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>>>> Verified it fixes the original and reduced testcase.
>>>
>>> Note, the testcase is missing from your patch.
>>>
>>> But I'd question if this is the right place to canonicalize it.
>>> The non-canonical order seems to be created in the generic code, where
>>> do_tablejump does:
>> No, at that point it's still canonical because the x86 backend
>> hasn't simpified the (mult ...) subexpression.  Its the
>> simplification of that subexpression to a constant that creates the
>> non-canonical RTL.  That's why I fixed the x86 bits -- those are the
>> bits that simplify the (mult ...) into a (const_int) and thus
>> creates the non-canonical RTL.
>
> (mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
It's debatable.  Our canonicalization rules don't explicitly cover the 
case where both arguments to a commutative expression are constants. 
Thus, I would classify that as legitimate, but unsimplified RTL.

Contrast to
(plus (const_int 0) (label_ref)

Which is clearly non-canonical.

Jeff
Jeff Law March 26, 2014, 7:32 p.m. UTC | #5
On 03/26/14 12:28, Jakub Jelinek wrote:
> (mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
> And, I'd say it is likely other target legitimization hooks would also try
> to simplify it similarly.
> simplify_gen_binary is used in several other places during expansion,
> so I don't see why it couldn't be desirable here.
No particular reason.  I'll try that since we disagree about the 
validity of the RTL and we can both agree that using simplify_gen_binary 
is reasonable.


jeff
Jakub Jelinek March 26, 2014, 7:40 p.m. UTC | #6
On Wed, Mar 26, 2014 at 01:32:44PM -0600, Jeff Law wrote:
> On 03/26/14 12:28, Jakub Jelinek wrote:
> >(mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
> >And, I'd say it is likely other target legitimization hooks would also try
> >to simplify it similarly.
> >simplify_gen_binary is used in several other places during expansion,
> >so I don't see why it couldn't be desirable here.
> No particular reason.  I'll try that since we disagree about the
> validity of the RTL and we can both agree that using
> simplify_gen_binary is reasonable.

Other possibility if you want to change it in the i386.c legitimize_address
hook would be IMHO using force_reg instead of force_operand, it should be
the same thing in most cases, except for these corner cases, and there would
be no need to canonizalize anything afterwards.
But, if the i?86 maintainers feel otherwise on this and think your patch is
ok, I don't feel that strongly about this.

	Jakub
Mike Stump March 26, 2014, 7:41 p.m. UTC | #7
On Mar 26, 2014, at 11:33 AM, Jeff Law <law@redhat.com> wrote:
> On 03/26/14 12:28, Jakub Jelinek wrote:
>> On Wed, Mar 26, 2014 at 12:17:43PM -0600, Jeff Law wrote:
>>> On 03/26/14 12:12, Jakub Jelinek wrote:
>>>> On Wed, Mar 26, 2014 at 11:02:48AM -0600, Jeff Law wrote:
>>>>> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>>>>> Verified it fixes the original and reduced testcase.
>>>> 
>>>> Note, the testcase is missing from your patch.
>>>> 
>>>> But I'd question if this is the right place to canonicalize it.
>>>> The non-canonical order seems to be created in the generic code, where
>>>> do_tablejump does:
>>> No, at that point it's still canonical because the x86 backend
>>> hasn't simpified the (mult ...) subexpression.  Its the
>>> simplification of that subexpression to a constant that creates the
>>> non-canonical RTL.  That's why I fixed the x86 bits -- those are the
>>> bits that simplify the (mult ...) into a (const_int) and thus
>>> creates the non-canonical RTL.
>> 
>> (mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
> It's debatable.

  (set (reg) (plus (reg) (mult 2 4)))

should be canonicalized into

  (set (reg) (plus (reg) 8))

remember, the entire point of canonicalization is so that the port doesn’t have to match:

  (set (reg) (plus (reg) (mult 2 4)))

and

  (set (reg) (plus (reg) 8)

with two patterns to get good code-gen.  I know, we goof the rules from time to time, but when we do, we should just admit there is no value to it, and fix it.

Anyway, the documented rule is:

> There are often cases where multiple RTL expressions could represent an
> operation performed by a single machine instruction.  This situation is
> most commonly encountered with logical, branch, and multiply-accumulate
> instructions.  In such cases, the compiler attempts to convert these
> multiple RTL expressions into a single canonical form to reduce the
> number of insn patterns required.
> 
> In addition to algebraic simplifications,

Now, when look up algebraic simplification, I get:

  http://blog.wolframalpha.com/2011/04/25/algebraic-simplification-simplifying-expressions-in-wolframalpha/

and when I look up:

  http://www.wolframalpha.com/input/?i=2+*+4

it says the answer is 8.  If that isn’t the right answer, please fix google and/or wolfram alpha as one of them is seriously misleading.  If the documented rule is wrong, please fix the document.

?

> Our canonicalization rules don't explicitly cover the case where both arguments to a commutative expression are constants.

Then why do we document it under:

@section Canonicalization of Instructions                                                                                                     
@cindex canonicalization of instructions
@cindex insn canonicalization

?
Richard Henderson March 26, 2014, 7:52 p.m. UTC | #8
On 03/26/2014 12:40 PM, Jakub Jelinek wrote:
> On Wed, Mar 26, 2014 at 01:32:44PM -0600, Jeff Law wrote:
>> On 03/26/14 12:28, Jakub Jelinek wrote:
>>> (mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
>>> And, I'd say it is likely other target legitimization hooks would also try
>>> to simplify it similarly.
>>> simplify_gen_binary is used in several other places during expansion,
>>> so I don't see why it couldn't be desirable here.
>> No particular reason.  I'll try that since we disagree about the
>> validity of the RTL and we can both agree that using
>> simplify_gen_binary is reasonable.
> 
> Other possibility if you want to change it in the i386.c legitimize_address
> hook would be IMHO using force_reg instead of force_operand, it should be
> the same thing in most cases, except for these corner cases, and there would
> be no need to canonizalize anything afterwards.
> But, if the i?86 maintainers feel otherwise on this and think your patch is
> ok, I don't feel that strongly about this.

I like this as a solution.  Let the combiner clean things up if it's gotten so far.


r~
Richard Sandiford March 26, 2014, 9:53 p.m. UTC | #9
Richard Henderson <rth@redhat.com> writes:
> On 03/26/2014 12:40 PM, Jakub Jelinek wrote:
>> On Wed, Mar 26, 2014 at 01:32:44PM -0600, Jeff Law wrote:
>>> On 03/26/14 12:28, Jakub Jelinek wrote:
>>>> (mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
>>>> And, I'd say it is likely other target legitimization hooks would also try
>>>> to simplify it similarly.
>>>> simplify_gen_binary is used in several other places during expansion,
>>>> so I don't see why it couldn't be desirable here.
>>> No particular reason.  I'll try that since we disagree about the
>>> validity of the RTL and we can both agree that using
>>> simplify_gen_binary is reasonable.
>> 
>> Other possibility if you want to change it in the i386.c legitimize_address
>> hook would be IMHO using force_reg instead of force_operand, it should be
>> the same thing in most cases, except for these corner cases, and there would
>> be no need to canonizalize anything afterwards.
>> But, if the i?86 maintainers feel otherwise on this and think your patch is
>> ok, I don't feel that strongly about this.
>
> I like this as a solution.  Let the combiner clean things up if it's
> gotten so far.

How about doing both?  Jakub's simplify_gen_binary change looked like a good
idea regardless of whatever else happens.  Seems a shame not to go with it.

Thanks,
Richard
Jeff Law March 27, 2014, 3:45 p.m. UTC | #10
On 03/26/14 15:53, Richard Sandiford wrote:
> Richard Henderson <rth@redhat.com> writes:
>> On 03/26/2014 12:40 PM, Jakub Jelinek wrote:
>>> On Wed, Mar 26, 2014 at 01:32:44PM -0600, Jeff Law wrote:
>>>> On 03/26/14 12:28, Jakub Jelinek wrote:
>>>>> (mult:SI (const_int 0) (const_int 4)) is IMHO far from being canonical.
>>>>> And, I'd say it is likely other target legitimization hooks would also try
>>>>> to simplify it similarly.
>>>>> simplify_gen_binary is used in several other places during expansion,
>>>>> so I don't see why it couldn't be desirable here.
>>>> No particular reason.  I'll try that since we disagree about the
>>>> validity of the RTL and we can both agree that using
>>>> simplify_gen_binary is reasonable.
>>>
>>> Other possibility if you want to change it in the i386.c legitimize_address
>>> hook would be IMHO using force_reg instead of force_operand, it should be
>>> the same thing in most cases, except for these corner cases, and there would
>>> be no need to canonizalize anything afterwards.
>>> But, if the i?86 maintainers feel otherwise on this and think your patch is
>>> ok, I don't feel that strongly about this.
>>
>> I like this as a solution.  Let the combiner clean things up if it's
>> gotten so far.
>
> How about doing both?  Jakub's simplify_gen_binary change looked like a good
> idea regardless of whatever else happens.  Seems a shame not to go with it.
Agreed.  The simplify_gen_binary change was spinning overnight, I'll be 
looking at the results momentarily.

jeff
diff mbox

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 53d58b3..80f0ba8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@ 
+2014-03-26  Jeff Law  <law@redhat.com>
+
+	* i386/i386.c (ix86_legitimize_address): Canonicalize
+	(plus (const) (label_ref)).
+
 2014-03-26  Richard Biener  <rguenther@suse.de>
 
 	* tree-pretty-print.c (percent_K_format): Implement special
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 842be68..79f4aff 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13939,6 +13939,28 @@  ix86_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 	  && REG_P (XEXP (x, 0)))
 	return x;
 
+      /* We might have started with something like
+
+	 (plus (mult (const_int 1) (const_int 4)) (label_ref))
+
+	 Which we change into:
+
+	 (plus (const_int 4) (label_ref))
+
+	 Which is obviously not canonical RTL.  Passing the non
+	 canonical RTL up to our caller is bad.
+
+	 Do not simplify, just canonicalize.  Simplification opens up
+	 a can of worms here as that can change the structure of X which
+	 this code isn't really prepared to handle.  */
+      if (COMMUTATIVE_ARITH_P (x)
+	  && swap_commutative_operands_p (XEXP (x, 0), XEXP (x, 1)))
+	{
+	  rtx temp = XEXP (x, 0);
+	  XEXP (x, 0) = XEXP (x, 1);
+	  XEXP (x, 1) = temp;
+	}
+
       if (flag_pic && SYMBOLIC_CONST (XEXP (x, 1)))
 	{
 	  changed = 1;
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cdc8e9a..789e27d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@ 
+2014-02-27  Jeff Law  <law@redhat.com>
+
+	PR target/60648
+	* g++.dg/pr60648.C: New test.
+
 2014-03-26  Jakub Jelinek  <jakub@redhat.com>
 
 	PR sanitizer/60636