Message ID | alpine.DEB.2.02.1809291327050.7519@stedding.saclay.inria.fr |
---|---|
State | New |
Headers | show |
Series | ((X /[ex] A) +- B) * A --> X +- A * B | expand |
On Sat, Sep 29, 2018 at 1:35 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > Hello, > > I noticed quite ugly code from both testcases. This transformation does > not fix either, but it helps a bit. I'm curious why you chose to restrict to INTEGER_CST A and B? Is that because of the case when (X / [ex] A) +- B evaluates to zero but A * B overflows? Can that ever happen? Isn't it enough to know that A isn't -1? That is, can we use expr_not_equal_to or friends to put constraints on possibly non-constant A/B? Otherwise the patch is of course OK and the above would just improve it. Thanks, Richard. > bootstrap+regtest on powerpc64le-unknown-linux-gnu. > > 2018-09-30 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * match.pd (((X /[ex] A) +- B) * A): New transformation. > > gcc/testsuite/ > * gcc.dg/tree-ssa/muldiv-1.c: New file. > * gcc.dg/tree-ssa/muldiv-2.c: Likewise. > > -- > Marc Glisse
On Mon, 1 Oct 2018, Richard Biener wrote: > On Sat, Sep 29, 2018 at 1:35 PM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> Hello, >> >> I noticed quite ugly code from both testcases. This transformation does >> not fix either, but it helps a bit. > > I'm curious why you chose to restrict to INTEGER_CST A and B? > Is that because of the case when (X / [ex] A) +- B evaluates to zero > but A * B overflows? Can that ever happen? Isn't it enough to know > that A isn't -1? That is, can we use expr_not_equal_to or friends > to put constraints on possibly non-constant A/B? For A, I don't remember seeing a divexact with non-constant denominator in gcc. For B, constants are what I was seeing in testcases, I was even tempted to implement it only for B==1 which is simpler. At first, I only had the version without casts, and for that I needed to be able to check for overflow, so constants. Now that I have a version that casts to unsigned, it would be valid with arbitrary A and B indeed. There are also a lot of casts making my head hurt (we may need one more for the last @1 if it isn't a constant anymore). I was also thinking of things like ((X/[ex]3)+1)*6, but that doesn't occur as often. > Otherwise the patch is of course OK and the above would just improve > it. I'll commit it and see if I can find some time...
Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 264371) +++ gcc/match.pd (working copy) @@ -2637,20 +2637,39 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && operand_equal_p (@1, build_low_bits_mask (TREE_TYPE (@1), TYPE_PRECISION (type)), 0)) (convert @0))) /* (X /[ex] A) * A -> X. */ (simplify (mult (convert1? (exact_div @0 @@1)) (convert2? @1)) (convert @0)) +/* ((X /[ex] A) +- B) * A --> X +- A * B. */ +(for op (plus minus) + (simplify + (mult (convert1? (op (convert2? (exact_div @0 INTEGER_CST@@1)) INTEGER_CST@2)) @1) + (if (tree_nop_conversion_p (type, TREE_TYPE (@2)) + && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@2))) + (with + { + wi::overflow_type overflow; + wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2), + TYPE_SIGN (type), &overflow); + } + (if (types_match (type, TREE_TYPE (@2)) + && types_match (TREE_TYPE (@0), TREE_TYPE (@2)) && !overflow) + (op @0 { wide_int_to_tree (type, mul); }) + (with { tree utype = unsigned_type_for (type); } + (convert (op (convert:utype @0) + (mult (convert:utype @1) (convert:utype @2)))))))))) + /* Canonicalization of binary operations. */ /* Convert X + -C into X - C. */ (simplify (plus @0 REAL_CST@1) (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1))) (with { tree tem = const_unop (NEGATE_EXPR, type, @1); } (if (!TREE_OVERFLOW (tem) || !flag_trapping_math) (minus @0 { tem; }))))) Index: gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c (working copy) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-tree-optimized-raw" } */ + +// ldist produces (((q-p-4)/4)&...+1)*4 +// Make sure we remove at least the division +// Eventually this should just be n*4 + +void foo(int*p, __SIZE_TYPE__ n){ + for(int*q=p+n;p!=q;++p)*p=0; +} + +/* { dg-final { scan-tree-dump "builtin_memset" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "div" "optimized" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -fdump-tree-optimized-raw" } */ + +// 'a' should disappear, but we are not there yet + +int* f(int* a, int* b, int* c){ + __PTRDIFF_TYPE__ d = b - a; + d += 1; + return a + d; +} + +/* { dg-final { scan-tree-dump-not "div" "optimized" } } */