Fix combine's simplify_if_then_else (PR rtl-optimization/81553)

Message ID 20171124213813.GB14653@tucnak
State New

Commit Message

Jakub Jelinek Nov. 24, 2017, 9:38 p.m. UTC
Hi!

The following testcase ICEs in wide-int*, but the reason is a mode mismatch
(we build a SImode MULT with one QImode argument and one VOIDmode argument,
then it is folded into SImode NEG with QImode argument, ...).
The bug is in assuming that the mode of c1 must be m; that is usually the
case, but shifts are special: the second argument can have a different mode.

The following patch makes sure we perform the computation of the new shift
count in the right mode.
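
As a rough illustration of the mode choice described above, here is a
minimal standalone sketch.  It is not the combine.c code: the enums and
function names (model_mode, model_code, model_is_shift, model_mult_mode)
are hypothetical stand-ins for GCC's machine_mode and rtx_code, chosen only
to show which mode the multiplication would be computed in.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine_mode and rtx_code enums;
   these are assumptions for the sketch, not the real definitions.  */
enum model_mode { MODEL_VOIDmode, MODEL_QImode, MODEL_SImode };
enum model_code { MODEL_IOR, MODEL_PLUS, MODEL_ASHIFT,
                  MODEL_LSHIFTRT, MODEL_ASHIFTRT };

/* True for the three shift codes that get special treatment.  */
bool
model_is_shift (enum model_code op)
{
  return op == MODEL_ASHIFT || op == MODEL_LSHIFTRT || op == MODEL_ASHIFTRT;
}

/* Pick the mode for the (mult COND C1) computation: normally the mode M
   of the whole operation, but for a shift whose count C1 carries its own
   (non-VOID) mode, use that mode instead.  */
enum model_mode
model_mult_mode (enum model_code op, enum model_mode m,
                 enum model_mode c1_mode)
{
  if (model_is_shift (op) && c1_mode != MODEL_VOIDmode)
    return c1_mode;
  return m;
}
```

With this model, an SImode shift whose count is a QImode value yields
QImode for the multiplication, while a constant (VOIDmode) count or a
non-shift operation keeps the mode of the whole operation.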

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-24  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/81553
	* combine.c (simplify_if_then_else): In (if_then_else COND (OP Z C1) Z)
	to (OP Z (mult COND (C1 * STORE_FLAG_VALUE))) optimization, if OP
	is a shift where C1 has different mode than the whole shift, use C1's
	mode for MULT rather than the shift's mode.

	* gcc.c-torture/compile/pr81553.c: New test.
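
For readers less familiar with the transformation named in the ChangeLog,
the following is a small source-level model of it for a left shift.  It is
only a sketch under stated assumptions: STORE_FLAG_VALUE is taken to be 1
(it is target-dependent in GCC), the function names are made up for the
illustration, and shift counts are kept below 32 so that the C shifts stay
well defined.

```c
#include <stdint.h>

/* Assumption for this sketch: STORE_FLAG_VALUE is 1.  */
#define MODEL_STORE_FLAG_VALUE 1

/* Source form: (if_then_else COND (ashift Z C1) Z).  */
uint32_t
shift_if_then_else (int cond, uint32_t z, uint8_t c1)
{
  return cond ? z << c1 : z;
}

/* Rewritten form: (ashift Z (mult COND (C1 * STORE_FLAG_VALUE))), with
   the multiplication carried out in the count's own 8-bit type rather
   than in the 32-bit type of the shifted value.  */
uint32_t
shift_by_mult (int cond, uint32_t z, uint8_t c1)
{
  uint8_t flag = cond ? MODEL_STORE_FLAG_VALUE : 0;
  uint8_t count = (uint8_t) (flag * (c1 * MODEL_STORE_FLAG_VALUE));
  return z << count;
}

/* Count the sampled inputs on which the two forms disagree.  */
int
count_mismatches (void)
{
  static const uint32_t samples[] = { 0u, 1u, 0x80000001u, 0xdeadbeefu };
  int bad = 0;

  for (int cond = 0; cond <= 1; cond++)
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
      for (uint8_t c1 = 0; c1 < 32; c1++)
        if (shift_if_then_else (cond, samples[i], c1)
            != shift_by_mult (cond, samples[i], c1))
          bad++;
  return bad;
}
```

When the condition is false the count becomes zero and the shift leaves Z
unchanged, which is why the conditional can be folded into the shift count.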


	Jakub

Comments

Segher Boessenkool Nov. 25, 2017, 1:19 a.m. UTC | #1
Hi,

On Fri, Nov 24, 2017 at 10:38:13PM +0100, Jakub Jelinek wrote:
> The following testcase ICEs in wide-int*, but the reason is a mode mismatch
> (we build a SImode MULT with one QImode argument and one VOIDmode argument,
> then it is folded into SImode NEG with QImode argument, ...).
> The bug is in assuming that the mode of c1 must be m; that is usually the
> case, but shifts are special: the second argument can have a different mode.
> 
> The following patch makes sure we perform the computation of the new shift
> count in the right mode.

Okay, thanks!  Is this better than bailing out though, do you have
an example?


Segher


> 2017-11-24  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/81553
> 	* combine.c (simplify_if_then_else): In (if_then_else COND (OP Z C1) Z)
> 	to (OP Z (mult COND (C1 * STORE_FLAG_VALUE))) optimization, if OP
> 	is a shift where C1 has different mode than the whole shift, use C1's
> 	mode for MULT rather than the shift's mode.
> 
> 	* gcc.c-torture/compile/pr81553.c: New test.

Jakub Jelinek Nov. 25, 2017, 8:39 a.m. UTC | #2
On Fri, Nov 24, 2017 at 07:19:29PM -0600, Segher Boessenkool wrote:
> Hi,
> 
> On Fri, Nov 24, 2017 at 10:38:13PM +0100, Jakub Jelinek wrote:
> > The following testcase ICEs in wide-int*, but the reason is a mode mismatch
> > (we build a SImode MULT with one QImode argument and one VOIDmode argument,
> > then it is folded into SImode NEG with QImode argument, ...).
> > The bug is in assuming that the mode of c1 must be m; that is usually the
> > case, but shifts are special: the second argument can have a different mode.
> > 
> > The following patch makes sure we perform the computation of the new shift
> > count in the right mode.
> 
> Okay, thanks!  Is this better than bailing out though, do you have
> an example?

I don't, but I haven't bootstrapped/regtested with a patch to gather
statistics on whether it would ever be successful (how?  Should I note in a
flag successful optimization of this and then check that flag upon
successful try_combine and clear it at the end of any try_combine?  It would
need to be done for all targets probably).  This code dates from 1994, but
testsuite was separate, so it is hard to check if this was covered by any
testcases.
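
The flag-based bookkeeping floated above could look roughly like the
sketch below.  This is purely hypothetical instrumentation: none of these
names (ite_mult_seen, ite_mult_hits, note_ite_mult, finish_combine_attempt,
ite_mult_hit_count) exist in combine.c, and where exactly the calls would
go is an assumption.

```c
#include <stdbool.h>

/* Hypothetical instrumentation state; not part of combine.c.  */
static bool ite_mult_seen;           /* set when the transformation fires */
static unsigned long ite_mult_hits;  /* successful combinations after it fired */

/* Would be called where simplify_if_then_else applies the transformation.  */
void
note_ite_mult (void)
{
  ite_mult_seen = true;
}

/* Would be called at the end of every try_combine attempt: count a hit
   only if the attempt succeeded and the transformation fired during it,
   then clear the flag for the next attempt.  */
void
finish_combine_attempt (bool succeeded)
{
  if (succeeded && ite_mult_seen)
    ite_mult_hits++;
  ite_mult_seen = false;
}

/* Number of successful combinations that involved the transformation.  */
unsigned long
ite_mult_hit_count (void)
{
  return ite_mult_hits;
}
```

A nonzero count after a bootstrap on a given target would suggest that the
optimization does lead to a successful combination at least sometimes there.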

If C1 is a constant (but then it would likely have VOIDmode), then I can
imagine it would match, though hard to find testcases, because usually
GIMPLE folding should propagate constants far earlier.  If C1 is a pseudo,
then we'd need an instruction that has
(any_shift (something) (mult (something) (something)))
which is quite unlikely.  But it is similarly unlikely for
(ior (something) (mult (something) (something))) and similar.
Or can try_combine then split it into two separate instructions, one doing
the mult and one doing the other, say when combining from original 3 or 4
instructions?  If so, then it could be likely it hits.

	Jakub

Segher Boessenkool Nov. 25, 2017, 7:05 p.m. UTC | #3
On Sat, Nov 25, 2017 at 09:39:54AM +0100, Jakub Jelinek wrote:
> > Okay, thanks!  Is this better than bailing out though, do you have
> > an example?
> 
> I don't, but I haven't bootstrapped/regtested with a patch to gather
> statistics on whether it would ever be successful (how?  Should I note in a
> flag successful optimization of this and then check that flag upon
> successful try_combine and clear it at the end of any try_combine?  It would
> need to be done for all targets probably).  This code dates from 1994, but
> testsuite was separate, so it is hard to check if this was covered by any
> testcases.
> 
> If C1 is a constant (but then it would likely have VOIDmode), then I can
> imagine it would match, though hard to find testcases, because usually
> GIMPLE folding should propagate constants far earlier.  If C1 is a pseudo,
> then we'd need an instruction that has
> (any_shift (something) (mult (something) (something)))
> which is quite unlikely.  But it is similarly unlikely for
> (ior (something) (mult (something) (something))) and similar.

const_int is always VOIDmode, yes.  So we're left with a shift of a mul
of pseudos, and when that is split it is extremely likely to be split on a
boundary that was already there when we began, so I don't see it as likely
to help.  I thought you might have hit on an example where it does :-)

The patch is fine either way.

> Or can try_combine then split it into two separate instructions, one doing
> the mult and one doing the other, say when combining from original 3 or 4
> instructions?  If so, then it could be likely it hits.

Yes: it will use a define_split if there is one, or else it will find
what it thinks is the best spot to split, and try that.

That should usually split the mul off, and you usually will have started
with a mul in a separate insn, so say this was i1+i2+i3 with i3 the
mul; in effect this is then just combining i1+i2.  Sometimes of course
that will then work, where combining just i1+i2 won't (because of known
register values, or REG_DEAD notes, or similar).


Segher

Patch

--- gcc/combine.c.jj	2017-11-19 18:08:08.000000000 +0100
+++ gcc/combine.c	2017-11-24 12:07:25.701480794 +0100
@@ -6639,11 +6639,15 @@  simplify_if_then_else (rtx x)
 
       if (z)
 	{
-	  temp = subst (simplify_gen_relational (true_code, m, VOIDmode,
+	  machine_mode cm = m;
+	  if ((op == ASHIFT || op == LSHIFTRT || op == ASHIFTRT)
+	      && GET_MODE (c1) != VOIDmode)
+	    cm = GET_MODE (c1);
+	  temp = subst (simplify_gen_relational (true_code, cm, VOIDmode,
 						 cond_op0, cond_op1),
 			pc_rtx, pc_rtx, 0, 0, 0);
-	  temp = simplify_gen_binary (MULT, m, temp,
-				      simplify_gen_binary (MULT, m, c1,
+	  temp = simplify_gen_binary (MULT, cm, temp,
+				      simplify_gen_binary (MULT, cm, c1,
 							   const_true_rtx));
 	  temp = subst (temp, pc_rtx, pc_rtx, 0, 0, 0);
 	  temp = simplify_gen_binary (op, m, gen_lowpart (m, z), temp);
--- gcc/testsuite/gcc.c-torture/compile/pr81553.c.jj	2017-11-24 12:11:25.681551110 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr81553.c	2017-11-24 12:10:55.000000000 +0100
@@ -0,0 +1,10 @@ 
+/* PR rtl-optimization/81553 */
+
+int a, b, c, d;
+
+void
+foo (void)
+{
+  d = 1 >> c >> 1;
+  b = ~(209883449764912897ULL & d) << (0 >= a) | ~d;
+}