| Message ID | 20161220190933.GO21933@tucnak |
| --- | --- |
| State | New |
On Tue, 20 Dec 2016, Jakub Jelinek wrote:

> If the shift count has enough known zero low bits (nonzero bits only
> above ceil_log2 (precision)), then the only shift count that is not
> out of bounds is 0, so we may as well fold the shift into its first
> argument.  This resolves a regression introduced by partly optimizing
> the expression at the gimple level, which left it unoptimized at the
> RTL level, where it previously had been handled completely.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-12-20  Jakub Jelinek  <jakub@redhat.com>
>
>     PR tree-optimization/71563
>     * match.pd: Simplify X << Y into X if Y is known to be 0 or
>     an out-of-range value, i.e. has its low bits known to be zero.

Hello,

would it make sense to extend this to rotates later?

Note that you can write (shift @0 SSA_NAME@1) in the pattern instead of
a separate test.
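To make the fold concrete, here is a minimal C sketch (not part of the patch; the function name and the 32-bit width are illustrative assumptions). A shift count of the form k << 8 has its low eight bits zero, so for a 32-bit left operand the only count in the valid range 0..31 it can take is 0, and the whole shift folds to its first operand:

/* Hypothetical illustration: the count's nonzero bits all sit at or
   above bit 8.  */
int
shift_by_wide_count (int x, int k)
{
  /* k << 8 is either 0 or at least 256; any count >= 32 is undefined
     behavior when shifting a 32-bit int, so the compiler may assume
     the count is 0 and fold the expression to plain x.  */
  return x << (k << 8);
}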
On Tue, Dec 20, 2016 at 09:30:17PM +0100, Marc Glisse wrote:
> would it make sense to extend this to rotates later?

I wasn't 100% sure whether rotates also require rotate counts in the
range 0..prec-1, or whether they accept arbitrary ones.

> Note that you can write (shift @0 SSA_NAME@1) in the pattern instead of
> a separate test.

That is what I tried first, but there is a bug in genmatch.c that
prevents it.  The pattern:

  (for vec (VECTOR_CST CONSTRUCTOR)
   (simplify
    (shiftrotate @0 vec@1)

results in

  case SSA_NAME:

being added to a switch:

      case SSA_NAME:
        if (do_valueize (valueize, op1) != NULL_TREE)
          {
            gimple *def_stmt = SSA_NAME_DEF_STMT (op1);
            if (gassign *def = dyn_cast <gassign *> (def_stmt))
              switch (gimple_assign_rhs_code (def))
                {
                case CONSTRUCTOR:

and the SSA_NAME@1 in another simplification resulted in another
case SSA_NAME: in the same switch (rather than being appended to the
existing case SSA_NAME), which is invalid C++.

        Jakub
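A note on why the duplicated label is fatal: C++ forbids two case labels with the same value in one switch, so genmatch would have to merge every pattern that can match an SSA_NAME under a single case SSA_NAME:. The stand-alone C analogue below (hypothetical names, not actual genmatch output) sketches the merged shape the generator would need to emit:

#include <stdio.h>

enum code { CODE_SSA_NAME, CODE_VECTOR_CST, CODE_OTHER };

/* Every pattern that can match an SSA_NAME must share one case label;
   emitting "case CODE_SSA_NAME:" twice in the same switch would be a
   duplicate-case-value compile error -- the genmatch.c bug above.  */
static void
dispatch (enum code c)
{
  switch (c)
    {
    case CODE_SSA_NAME:
      puts ("pattern 1: look through the defining statement");
      puts ("pattern 2: test the operand's known nonzero bits");
      break;
    default:
      puts ("no pattern matched");
      break;
    }
}

int
main (void)
{
  dispatch (CODE_SSA_NAME);
  return 0;
}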
--- gcc/match.pd.jj	2016-12-10 13:05:39.000000000 +0100
+++ gcc/match.pd	2016-12-20 15:44:30.892704283 +0100
@@ -1497,6 +1497,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     (if (tem)
      (shiftrotate @0 { tem; }))))))
 
+/* Simplify X << Y where Y's low width bits are 0 to X, as only valid
+   Y is 0.  Similarly for X >> Y.  */
+#if GIMPLE
+(for shift (lshift rshift)
+ (simplify
+  (shift @0 @1)
+  (if (TREE_CODE (@1) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
+   (with {
+     int width = ceil_log2 (element_precision (TREE_TYPE (@0)));
+     int prec = TYPE_PRECISION (TREE_TYPE (@1));
+    }
+    (if ((get_nonzero_bits (@1) & wi::mask (width, false, prec)) == 0)
+     @0)))))
+#endif
+
 /* Rewrite an LROTATE_EXPR by a constant into an RROTATE_EXPR
    by a new constant.  */
 (simplify
--- gcc/testsuite/gcc.dg/tree-ssa/pr71563.c.jj	2016-12-20 15:57:16.624722177 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr71563.c	2016-12-20 15:57:01.000000000 +0100
@@ -0,0 +1,23 @@
+/* PR tree-optimization/71563 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void link_error (void);
+
+void
+foo (int k)
+{
+  int t = 1 << ((1 / k) << 8);
+  if (t != 1)
+    link_error ();
+}
+
+void
+bar (int k, int l)
+{
+  int t = l << (k << 8);
+  if (t != l)
+    link_error ();
+}
+
+/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */
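To see the arithmetic the new pattern performs, here is a hedged, stand-alone C model (plain-C stand-ins for GCC's ceil_log2, get_nonzero_bits and wi::mask; every name in it is illustrative):

#include <stdio.h>

/* Smallest n such that (1u << n) >= v; a stand-in for GCC's ceil_log2.  */
static int
ceil_log2_u (unsigned v)
{
  int n = 0;
  while ((1u << n) < v)
    n++;
  return n;
}

int
main (void)
{
  unsigned precision = 32;               /* shifting a 32-bit int */
  int width = ceil_log2_u (precision);   /* 5 bits cover counts 0..31 */
  unsigned low_mask = (1u << width) - 1; /* analogue of wi::mask */

  /* A count of the form k << 8 has nonzero bits only above bit 7, so
     its low "width" bits are known zero and the shift folds to @0 --
     the condition the pattern tests via get_nonzero_bits.  */
  unsigned nonzero_bits_of_count = ~0u << 8;
  printf ("foldable: %s\n",
          (nonzero_bits_of_count & low_mask) == 0 ? "yes" : "no");
  return 0;
}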