diff mbox

Division Optimization in match and simplify

Message ID BLUPR0701MB10110152D9B362818EB337668E2A0@BLUPR0701MB1011.namprd07.prod.outlook.com
State New
Headers show

Commit Message

Hurugalawadi, Naveen Nov. 4, 2015, 10:41 a.m. UTC
Hi,

Thanks for the review and comments.

>> I thought we were mostly using the 'convert?' 
>> and tree_nop_conversion_p on integers

Done. Cleared all instances of convert which are not required.
However, I am still confused about the use of "convert" in match
and simplify.

>> So all patterns looking at arg[01] are handling nop-conversions on their
>> outermost operands while those looking at op[01] do not.

These patterns are looking at arg[01] rather than op[01] ?
Am really sorry, If I am repeating the same thing over and over.

>> That would probably also work for exact_div.
Done.

>> Don't you want to require that @1 is INTEGER_CST? And then you could use
>> const_binop, or even wi::add.

Done. The fold-const part did not specify it as INTEGER_CST.
Although it was obvious from the PLUS_EXPR.
However, just wanted to retain as per fold-const.

Please find attached the modified patch with all review comments addressed.
Regression tested on X86_64 without any extra failures.

Thanks,
Naveen

Comments

Marc Glisse Nov. 4, 2015, 11:18 a.m. UTC | #1
On Wed, 4 Nov 2015, Hurugalawadi, Naveen wrote:

>>> I thought we were mostly using the 'convert?'
>>> and tree_nop_conversion_p on integers
>
> Done. Cleared all instances of convert which are not required.
> However, I am still confused about the use of "convert" in match
> and simplify.

It could be that I am wrong, you'll have to wait for Richard's comments 
anyway. The one place where a convert could be useful is:
(div (convert? (bit_and @0 INTEGER_CST@1)) INTEGER_CST@2)
but I didn't check if we want it (it would probably at least require 
(convert @0) instead of plain @0 in the result so the division and the 
shift are done with the same signedness).

>>> So all patterns looking at arg[01] are handling nop-conversions on their
>>> outermost operands while those looking at op[01] do not.
>
> These patterns are looking at arg[01] rather than op[01] ?

Might be a case where it doesn't really matter which one you look at, and 
the easiest one in fold-const may not be the same as in match.pd.

+/* Convert (A/B)/C to A/(B*C)  */
+(simplify
+ (rdiv (rdiv @0 @1) @2)
+  (if (flag_reciprocal_math)

I would indent this line the same as the one above.

+   (rdiv @0 (mult @1 @2))))
+
+/* Convert A/(B/C) to (A/B)*C  */
+(simplify
+ (rdiv @0 (rdiv @1 @2))
+  (if (flag_reciprocal_math)
+   (mult (rdiv @0 @1) @2)))

I believe you might want a :s on the inner rdiv (?).
Note that you can factor out the test on flag_reciprocal_math so you write 
it only once for 2 patterns.

I don't really remember what the tests !TYPE_UNSIGNED (type) and 
tree_int_cst_sgn are for in the other pattern, but since you are only 
moving the transformation...
Richard Biener Nov. 4, 2015, 12:07 p.m. UTC | #2
On Wed, Nov 4, 2015 at 12:18 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Wed, 4 Nov 2015, Hurugalawadi, Naveen wrote:
>
>>>> I thought we were mostly using the 'convert?'
>>>> and tree_nop_conversion_p on integers

Yes, on floats they shouldn't be used.

>>
>> Done. Cleared all instances of convert which are not required.
>> However, I am still confused about the use of "convert" in match
>> and simplify.
>
>
> It could be that I am wrong, you'll have to wait for Richard's comments
> anyway. The one place where a convert could be useful is:
> (div (convert? (bit_and @0 INTEGER_CST@1)) INTEGER_CST@2)
> but I didn't check if we want it (it would probably at least require
> (convert @0) instead of plain @0 in the result so the division and the shift
> are done with the same signedness).
>
>>>> So all patterns looking at arg[01] are handling nop-conversions on their
>>>> outermost operands while those looking at op[01] do not.
>>
>>
>> These patterns are looking at arg[01] rather than op[01] ?
>
>
> Might be a case where it doesn't really matter which one you look at, and
> the easiest one in fold-const may not be the same as in match.pd.
>
> +/* Convert (A/B)/C to A/(B*C)  */
> +(simplify
> + (rdiv (rdiv @0 @1) @2)
> +  (if (flag_reciprocal_math)
>
> I would indent this line the same as the one above.

Yep.

> +   (rdiv @0 (mult @1 @2))))
> +
> +/* Convert A/(B/C) to (A/B)*C  */
> +(simplify
> + (rdiv @0 (rdiv @1 @2))
> +  (if (flag_reciprocal_math)
> +   (mult (rdiv @0 @1) @2)))
>
> I believe you might want a :s on the inner rdiv (?).
> Note that you can factor out the test on flag_reciprocal_math so you write
> it only once for 2 patterns.

Please use :s on both inner rdiv in both patterns.  With that the two patterns
are ok.

+/* Convert C1/(X*C2) into (C1/C2)/X  */
+(simplify
+ (rdiv (REAL_CST@0) (mult @1 REAL_CST@2))

Omit the parens around REAL_CST@0

+  (if (flag_reciprocal_math)
+   (with
+    { tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
+    (if (tem)
+     (rdiv { tem; } @1)))))

with that the pattern is ok.

>
> I don't really remember what the tests !TYPE_UNSIGNED (type) and
> tree_int_cst_sgn are for in the other pattern, but since you are only moving
> the transformation...

+/* Optimize (X & (-A)) / A where A is a power of 2, to X >> log2(A) */
+(for div (exact_div trunc_div)
+ (simplify
+  (div (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
+  (if (!TYPE_UNSIGNED (type) && integer_pow2p (@2)
+       && tree_int_cst_sgn (@2) > 0
+       && wi::add (@2, @1) == 0)
+   (rshift @0 { build_int_cst (integer_type_node, wi::exact_log2 (@2)); }))))

the TYPE_UNSIGNED test is because right shift of negative values is undefined,
so is a shift with a negative value.  I believe we can safely handle
conversions here
just like fold-const.c does with

 (div (convert? (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  ...

With that the pattern looks ok to me.

Richard.


>
> --
> Marc Glisse
Marc Glisse Nov. 4, 2015, 12:45 p.m. UTC | #3
On Wed, 4 Nov 2015, Richard Biener wrote:

>> I don't really remember what the tests !TYPE_UNSIGNED (type) and
>> tree_int_cst_sgn are for in the other pattern, but since you are only moving
>> the transformation...
>
> +/* Optimize (X & (-A)) / A where A is a power of 2, to X >> log2(A) */
> +(for div (exact_div trunc_div)
> + (simplify
> +  (div (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
> +  (if (!TYPE_UNSIGNED (type) && integer_pow2p (@2)
> +       && tree_int_cst_sgn (@2) > 0
> +       && wi::add (@2, @1) == 0)
> +   (rshift @0 { build_int_cst (integer_type_node, wi::exact_log2 (@2)); }))))
>
> the TYPE_UNSIGNED test is because right shift of negative values is undefined,

tree.def: "Shift means logical shift if done on an unsigned type, arithmetic shift if done on a signed type."
To me, this implies that right shift of negative values is well-defined
inside gcc.
Also, the test allows *only signed types*, not unsigned.

> so is a shift with a negative value.  I believe we can safely handle
> conversions here
> just like fold-const.c does with
>
> (div (convert? (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>  ...
>
> With that the pattern looks ok to me.

As long as it comes with (convert @0) in the result... I think the
fold-const.c pattern is lucky that (int)(x&-4u) gets folded to
(int)x&-4, or it might ICE for ((int)(x&-4u))/4.
Richard Biener Nov. 4, 2015, 1:26 p.m. UTC | #4
On Wed, Nov 4, 2015 at 1:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Wed, 4 Nov 2015, Richard Biener wrote:
>
>>> I don't really remember what the tests !TYPE_UNSIGNED (type) and
>>> tree_int_cst_sgn are for in the other pattern, but since you are only
>>> moving
>>> the transformation...
>>
>>
>> +/* Optimize (X & (-A)) / A where A is a power of 2, to X >> log2(A) */
>> +(for div (exact_div trunc_div)
>> + (simplify
>> +  (div (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
>> +  (if (!TYPE_UNSIGNED (type) && integer_pow2p (@2)
>> +       && tree_int_cst_sgn (@2) > 0
>> +       && wi::add (@2, @1) == 0)
>> +   (rshift @0 { build_int_cst (integer_type_node, wi::exact_log2 (@2));
>> }))))
>>
>> the TYPE_UNSIGNED test is because right shift of negative values is
>> undefined,
>
>
> tree.def: "Shift means logical shift if done on an unsigned type, arithmetic
> shift if done on a signed type."
> To me, this implies that right shift of negative values is well-defined
> inside gcc.

Ah, it was _left_ shift of negative values that ubsan complains about.

> Also, the test allows *only signed types*, not unsigned.

Doh.  Ok, so maybe that's because the tree_int_cst_sgn test would
"not work" otherwise (just use tree_int_cst_sign_bit (@2) == 0).

>
>> so is a shift with a negative value.  I believe we can safely handle
>> conversions here
>> just like fold-const.c does with
>>
>> (div (convert? (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
>> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>  ...
>>
>> With that the pattern looks ok to me.
>
>
> As long as it comes with (convert @0) in the result... I think the
> fold-const.c pattern is lucky that (int)(x&-4u) gets folded to
> (int)x&-4, or it might ICE for ((int)(x&-4u))/4.

I think the types match ok but I might have looked too quickly (again).

Richard.

> --
> Marc Glisse
Michael Matz Nov. 5, 2015, 1:40 p.m. UTC | #5
Hi,

On Wed, 4 Nov 2015, Richard Biener wrote:

> Ah, it was _left_ shift of negative values that ubsan complains about.

Note that this is only for the frontend definition of shifts.  I don't see 
why gimple shouldn't define it to the only sensible definition there is, 
which also happens to be the one that all CPUs in the world implement 
(well, those using two-complement representation at least).


Ciao,
Michael.
Richard Biener Nov. 5, 2015, 5:24 p.m. UTC | #6
On November 5, 2015 2:40:30 PM GMT+01:00, Michael Matz <matz@suse.de> wrote:
>Hi,
>
>On Wed, 4 Nov 2015, Richard Biener wrote:
>
>> Ah, it was _left_ shift of negative values that ubsan complains
>about.
>
>Note that this is only for the frontend definition of shifts.  I don't
>see 
>why gimple shouldn't define it to the only sensible definition there
>is, 
>which also happens to be the one that all CPUs in the world implement 
>(well, those using two-complement representation at least).

Yeah, but front ends and back end are not separated enough to have both.

Richard.

>
>Ciao,
>Michael.
diff mbox

Patch

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index ee9b349..88dbbdd 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -10163,54 +10163,9 @@  fold_binary_loc (location_t loc,
 	return fold_build2_loc (loc, RDIV_EXPR, type,
 			    negate_expr (arg0),
 			    TREE_OPERAND (arg1, 0));
-
-      /* Convert A/B/C to A/(B*C).  */
-      if (flag_reciprocal_math
-	  && TREE_CODE (arg0) == RDIV_EXPR)
-	return fold_build2_loc (loc, RDIV_EXPR, type, TREE_OPERAND (arg0, 0),
-			    fold_build2_loc (loc, MULT_EXPR, type,
-					 TREE_OPERAND (arg0, 1), arg1));
-
-      /* Convert A/(B/C) to (A/B)*C.  */
-      if (flag_reciprocal_math
-	  && TREE_CODE (arg1) == RDIV_EXPR)
-	return fold_build2_loc (loc, MULT_EXPR, type,
-			    fold_build2_loc (loc, RDIV_EXPR, type, arg0,
-					 TREE_OPERAND (arg1, 0)),
-			    TREE_OPERAND (arg1, 1));
-
-      /* Convert C1/(X*C2) into (C1/C2)/X.  */
-      if (flag_reciprocal_math
-	  && TREE_CODE (arg1) == MULT_EXPR
-	  && TREE_CODE (arg0) == REAL_CST
-	  && TREE_CODE (TREE_OPERAND (arg1, 1)) == REAL_CST)
-	{
-	  tree tem = const_binop (RDIV_EXPR, arg0,
-				  TREE_OPERAND (arg1, 1));
-	  if (tem)
-	    return fold_build2_loc (loc, RDIV_EXPR, type, tem,
-				TREE_OPERAND (arg1, 0));
-	}
-
       return NULL_TREE;
 
     case TRUNC_DIV_EXPR:
-      /* Optimize (X & (-A)) / A where A is a power of 2,
-	 to X >> log2(A) */
-      if (TREE_CODE (arg0) == BIT_AND_EXPR
-	  && !TYPE_UNSIGNED (type) && TREE_CODE (arg1) == INTEGER_CST
-	  && integer_pow2p (arg1) && tree_int_cst_sgn (arg1) > 0)
-	{
-	  tree sum = fold_binary_loc (loc, PLUS_EXPR, TREE_TYPE (arg1),
-				      arg1, TREE_OPERAND (arg0, 1));
-	  if (sum && integer_zerop (sum)) {
-	    tree pow2 = build_int_cst (integer_type_node,
-				       wi::exact_log2 (arg1));
-	    return fold_build2_loc (loc, RSHIFT_EXPR, type,
-				    TREE_OPERAND (arg0, 0), pow2);
-	  }
-	}
-
       /* Fall through */
       
     case FLOOR_DIV_EXPR:
diff --git a/gcc/match.pd b/gcc/match.pd
index f6c5c07..020f575 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -247,6 +247,27 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (!HONOR_SNANS (type))
   (negate @0)))
 
+/* Convert (A/B)/C to A/(B*C)  */
+(simplify
+ (rdiv (rdiv @0 @1) @2)
+  (if (flag_reciprocal_math)
+   (rdiv @0 (mult @1 @2))))
+
+/* Convert A/(B/C) to (A/B)*C  */
+(simplify
+ (rdiv @0 (rdiv @1 @2))
+  (if (flag_reciprocal_math)
+   (mult (rdiv @0 @1) @2)))
+
+/* Optimize (X & (-A)) / A where A is a power of 2, to X >> log2(A) */
+(for div (exact_div trunc_div) 
+ (simplify
+  (div (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
+  (if (!TYPE_UNSIGNED (type) && integer_pow2p (@2)
+       && tree_int_cst_sgn (@2) > 0
+       && wi::add (@2, @1) == 0)
+   (rshift @0 { build_int_cst (integer_type_node, wi::exact_log2 (@2)); }))))
+
 /* If ARG1 is a constant, we can convert this to a multiply by the
    reciprocal.  This does not have the same rounding properties,
    so only do this if -freciprocal-math.  We can actually
@@ -464,6 +485,15 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     (if (tem)
      (rdiv { tem; } @1)))))
 
+/* Convert C1/(X*C2) into (C1/C2)/X  */
+(simplify
+ (rdiv (REAL_CST@0) (mult @1 REAL_CST@2))
+  (if (flag_reciprocal_math)
+   (with
+    { tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
+    (if (tem)
+     (rdiv { tem; } @1)))))
+
 /* Simplify ~X & X as zero.  */
 (simplify
  (bit_and:c (convert? @0) (convert? (bit_not @0)))