((X /[ex] A) +- B) * A --> X +- A * B
diff mbox series

Message ID alpine.DEB.2.02.1809291327050.7519@stedding.saclay.inria.fr
State New
Headers show
Series
  • ((X /[ex] A) +- B) * A --> X +- A * B
Related show

Commit Message

Marc Glisse Sept. 29, 2018, 11:35 a.m. UTC
Hello,

I noticed quite ugly code from both testcases. This transformation does 
not fix either, but it helps a bit.

bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2018-09-30  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* match.pd (((X /[ex] A) +- B) * A): New transformation.

gcc/testsuite/
 	* gcc.dg/tree-ssa/muldiv-1.c: New file.
 	* gcc.dg/tree-ssa/muldiv-2.c: Likewise.

Comments

Richard Biener Oct. 1, 2018, 8:32 a.m. UTC | #1
On Sat, Sep 29, 2018 at 1:35 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> Hello,
>
> I noticed quite ugly code from both testcases. This transformation does
> not fix either, but it helps a bit.

I'm curious why you chose to restrict to INTEGER_CST A and B?
Is that because of the case when (X / [ex] A) +- B evaluates to zero
but A * B overflows?  Can that ever happen?  Isn't it enough to know
that A isn't -1?  That is, can we use expr_not_equal_to or friends
to put constraints on possibly non-constant A/B?

Otherwise the patch is of course OK and the above would just improve
it.

Thanks,
Richard.

> bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> 2018-09-30  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
>         * match.pd (((X /[ex] A) +- B) * A): New transformation.
>
> gcc/testsuite/
>         * gcc.dg/tree-ssa/muldiv-1.c: New file.
>         * gcc.dg/tree-ssa/muldiv-2.c: Likewise.
>
> --
> Marc Glisse
Marc Glisse Oct. 1, 2018, 4:14 p.m. UTC | #2
On Mon, 1 Oct 2018, Richard Biener wrote:

> On Sat, Sep 29, 2018 at 1:35 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>>
>> Hello,
>>
>> I noticed quite ugly code from both testcases. This transformation does
>> not fix either, but it helps a bit.
>
> I'm curious why you chose to restrict to INTEGER_CST A and B?
> Is that because of the case when (X / [ex] A) +- B evaluates to zero
> but A * B overflows?  Can that ever happen?  Isn't it enough to know
> that A isn't -1?  That is, can we use expr_not_equal_to or friends
> to put constraints on possibly non-constant A/B?

For A, I don't remember seeing a divexact with non-constant denominator in 
gcc. For B, constants are what I was seeing in testcases, I was even 
tempted to implement it only for B==1 which is simpler.

At first, I only had the version without casts, and for that I needed to 
be able to check for overflow, so constants. Now that I have a version 
that casts to unsigned, it would be valid with arbitrary A and B indeed. 
There are also a lot of casts making my head hurt (we may need one more 
for the last @1 if it isn't a constant anymore).

I was also thinking of things like ((X/[ex]3)+1)*6, but that doesn't occur 
as often.

> Otherwise the patch is of course OK and the above would just improve
> it.

I'll commit it and see if I can find some time...

Patch
diff mbox series

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 264371)
+++ gcc/match.pd	(working copy)
@@ -2637,20 +2637,39 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        && operand_equal_p (@1, build_low_bits_mask (TREE_TYPE (@1),
 						    TYPE_PRECISION (type)), 0))
    (convert @0)))
 
 
 /* (X /[ex] A) * A -> X.  */
 (simplify
   (mult (convert1? (exact_div @0 @@1)) (convert2? @1))
   (convert @0))
 
+/* ((X /[ex] A) +- B) * A  -->  X +- A * B.  */
+(for op (plus minus)
+ (simplify
+  (mult (convert1? (op (convert2? (exact_div @0 INTEGER_CST@@1)) INTEGER_CST@2)) @1)
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@2))
+       && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@2)))
+   (with
+     {
+       wi::overflow_type overflow;
+       wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
+			       TYPE_SIGN (type), &overflow);
+     }
+     (if (types_match (type, TREE_TYPE (@2))
+ 	 && types_match (TREE_TYPE (@0), TREE_TYPE (@2)) && !overflow)
+      (op @0 { wide_int_to_tree (type, mul); })
+      (with { tree utype = unsigned_type_for (type); }
+       (convert (op (convert:utype @0)
+		    (mult (convert:utype @1) (convert:utype @2))))))))))
+
 /* Canonicalization of binary operations.  */
 
 /* Convert X + -C into X - C.  */
 (simplify
  (plus @0 REAL_CST@1)
  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
   (with { tree tem = const_unop (NEGATE_EXPR, type, @1); }
    (if (!TREE_OVERFLOW (tem) || !flag_trapping_math)
     (minus @0 { tem; })))))
 
Index: gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized-raw" } */
+
+// ldist produces (((q-p-4)/4)&...+1)*4
+// Make sure we remove at least the division
+// Eventually this should just be n*4
+
+void foo(int*p, __SIZE_TYPE__ n){
+  for(int*q=p+n;p!=q;++p)*p=0;
+}
+
+/* { dg-final { scan-tree-dump "builtin_memset" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "div" "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c	(working copy)
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized-raw" } */
+
+// 'a' should disappear, but we are not there yet
+
+int* f(int* a, int* b, int* c){
+    __PTRDIFF_TYPE__ d = b - a;
+    d += 1;
+    return a + d;
+}
+
+/* { dg-final { scan-tree-dump-not "div" "optimized" } } */