
Fix A < 0 ? C : 0 optimization (PR tree-optimization/78720)

Message ID 20161210091851.GN3541@tucnak.redhat.com
State New

Commit Message

Jakub Jelinek Dec. 10, 2016, 9:18 a.m. UTC
On Sat, Dec 10, 2016 at 10:05:50AM +0100, Paolo Bonzini wrote:
> 
> 
> On 09/12/2016 20:26, Jakub Jelinek wrote:
> > +    tree ctype = TREE_TYPE (@1);
> > +   }
> > +   (if (shift >= 0)
> > +    (bit_and
> > +     (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
> > +     @1)
> > +    (bit_and
> > +     (lshift (convert:ctype @0) { build_int_cst (integer_type_node, -shift); })
> > +     @1)))))
> 
> The ":ctype" shouldn't be needed because the COND_EXPR already has @1's
> type.

Note I'm actually bootstrapping/regtesting the following patch today instead,
where it clearly isn't needed.  But for the above case with the shift, it
isn't clear to me how it would work without :ctype - how the algorithm
for figuring out what type to convert to works.  Because the first operand
of the lshift isn't used next to @1.
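(For illustration only, not part of the patch: the two expansions the rule
chooses between can be written out by hand in C.  The function names and the
concrete constants below are hypothetical; they mirror the pr78720 testcase,
with A being signed char, i.e. precision 8.)

```c
#include <assert.h>

/* shift >= 0 case: C = 0x20L, so shift = 8 - log2(0x20) - 1 = 2.
   Arithmetic right shift keeps bit 5 set exactly when x < 0.  */
static long
expand_shift (signed char x)
{
  return (long) (x >> 2) & 0x20L;
}

/* shift < 0 case: C = 0x80000L sets bit 19, at or above A's 8-bit
   precision, so plain sign extension followed by AND suffices:
   a negative x sign-extends to all-ones in the high bits.  */
static long
expand_sext (signed char x)
{
  return (long) x & 0x80000L;
}
```

Both match the original `x < 0 ? C : 0L` for every signed char value.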

2016-12-09  Jakub Jelinek  <jakub@redhat.com>
	    Marc Glisse  <marc.glisse@inria.fr>

	PR tree-optimization/78720
	* match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
	is negative, first convert to @1's type and then lshift it by -shift.

	* gcc.c-torture/execute/pr78720.c: New test.



	Jakub

Comments

Marc Glisse Dec. 10, 2016, 10:45 a.m. UTC | #1
On Sat, 10 Dec 2016, Jakub Jelinek wrote:

> 	* match.pd (A < 0 ? C : 0): Only optimize for signed A.  If shift
> 	is negative, first convert to @1's type and then lshift it by -shift.

Thanks, the ChangeLog needs updating.

> --- gcc/match.pd.jj	2016-12-09 10:19:10.909735559 +0100
> +++ gcc/match.pd	2016-12-10 09:21:26.260516596 +0100
> @@ -2768,17 +2768,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     (ncmp (convert:stype @0) { build_zero_cst (stype); })))))
>
> /* If we have A < 0 ? C : 0 where C is a power of 2, convert
> -   this into a right shift followed by ANDing with C.  */
> +   this into a right shift or sign extension followed by ANDing with C.  */
> (simplify
>  (cond
>   (lt @0 integer_zerop)
>   integer_pow2p@1 integer_zerop)
> - (with {
> + (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
> +  (with {
>     int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
> -  }
> -  (bit_and
> -   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
> -   @1)))
> +   }
> +   (if (shift >= 0)
> +    (bit_and
> +     (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
> +     @1)
> +    /* Otherwise ctype must be wider than TREE_TYPE (@0) and pure
> +       sign extension followed by AND with C will achieve the effect.  */
> +    (bit_and (convert @0) @1)))))

It is funny to notice that the fact that @1 is a power of 2 has become 
mostly irrelevant. When C fits in the type of A, we can do an arithmetic 
shift right by precision-1 to obtain a mask of 0 or -1, and then the 
bit_and works for any C, not just powers of 2. Otherwise, imagining that A 
is int32_t and C int64_t, all we care about is that the low 31 bits of C 
are 0 (otherwise we could also do an arithmetic right shift, either before 
or after the convert). But that's another patch; fixing the PR is what 
matters for now.
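(A sketch of that generalization, again illustration only and not part of
the patch: `cond_mask` is a hypothetical name.  Note that `>>` on negative
values is implementation-defined in C; GCC defines it as an arithmetic
shift, which is what this relies on.)

```c
#include <limits.h>

/* An arithmetic right shift by precision-1 turns a negative x into
   an all-ones mask and a non-negative x into zero, so the AND then
   selects an arbitrary c, not just a power of 2.  */
static int
cond_mask (int x, int c)
{
  return (x >> (sizeof (int) * CHAR_BIT - 1)) & c;
}
```

So `cond_mask (x, c)` equals `x < 0 ? c : 0` for any c that fits in int.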

Patch

--- gcc/match.pd.jj	2016-12-09 10:19:10.909735559 +0100
+++ gcc/match.pd	2016-12-10 09:21:26.260516596 +0100
@@ -2768,17 +2768,22 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     (ncmp (convert:stype @0) { build_zero_cst (stype); })))))
 
 /* If we have A < 0 ? C : 0 where C is a power of 2, convert
-   this into a right shift followed by ANDing with C.  */
+   this into a right shift or sign extension followed by ANDing with C.  */
 (simplify
  (cond
   (lt @0 integer_zerop)
   integer_pow2p@1 integer_zerop)
- (with {
+ (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
+  (with {
     int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
-  }
-  (bit_and
-   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
-   @1)))
+   }
+   (if (shift >= 0)
+    (bit_and
+     (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
+     @1)
+    /* Otherwise ctype must be wider than TREE_TYPE (@0) and pure
+       sign extension followed by AND with C will achieve the effect.  */
+    (bit_and (convert @0) @1)))))
 
 /* When the addresses are not directly of decls compare base and offset.
    This implements some remaining parts of fold_comparison address
--- gcc/testsuite/gcc.c-torture/execute/pr78720.c.jj	2016-12-10 09:18:43.386574179 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr78720.c	2016-12-10 09:18:43.386574179 +0100
@@ -0,0 +1,29 @@ 
+/* PR tree-optimization/78720 */
+
+__attribute__((noinline, noclone)) long int
+foo (signed char x)
+{
+  return x < 0 ? 0x80000L : 0L;
+}
+
+__attribute__((noinline, noclone)) long int
+bar (signed char x)
+{
+  return x < 0 ? 0x80L : 0L;
+}
+
+__attribute__((noinline, noclone)) long int
+baz (signed char x)
+{
+  return x < 0 ? 0x20L : 0L;
+}
+
+int
+main ()
+{
+  if (foo (-1) != 0x80000L || bar (-1) != 0x80L || baz (-1) != 0x20L
+      || foo (0) != 0L || bar (0) != 0L || baz (0) != 0L
+      || foo (31) != 0L || bar (31) != 0L || baz (31) != 0L)
+    __builtin_abort ();
+  return 0;
+}