
Move some bit and binary optimizations in simplify and match

Message ID SN2PR0701MB10242B9E56933B072C29DE698E360@SN2PR0701MB1024.namprd07.prod.outlook.com
State New

Commit Message

Hurugalawadi, Naveen Oct. 7, 2015, 9:54 a.m. UTC
Hi,

Please find attached the patch that moves some more patterns from
fold-const using simplify and match.

Please review the patch and let me know if any modifications are required.

Tested the patch on X86 without any regressions.

Thanks,
Naveen

ChangeLog

2015-10-07  Naveen H.S  <Naveen.Hurugalawadi@caviumnetworks.com>

	* fold-const.c (fold_binary_loc): Move X + (X / CST) * -CST ->
	X % CST to match.pd.
	Move fold of (A & ~B) - (A & B) into (A ^ B) - B to match.pd.
	Move (-A) * (-B) -> A * B to match.pd.
	Move (a * (1 << b)) -> (a << b) to match.pd.
	Move (C1/X)*C2 -> (C1*C2)/X to match.pd.
	Move (X & ~Y) | (~X & Y) -> X ^ Y to match.pd.
	Move folding of ~X & X, (X == 0) & X, and !X & X to zero to match.pd.
	Move folding of X & ~X, X & (X == 0), and X & !X to zero to match.pd.
	Move fold of X & (X ^ Y) into X & ~Y to match.pd.
	Move fold of X & (Y ^ X) into ~Y & X to match.pd.

	* match.pd (plus @0 (mult:s (trunc_div:s @0 INTEGER_CST@1)
	(negate @1))): New simplifier.
	(minus (bit_and:s @0 (bit_not:s @1)) (bit_and:s @0 @1)):
	New simplifier.
	(mult:c @0 (lshift integer_onep@1 @2)): New simplifier.
	(mult:c (plus @0 @0) INTEGER_CST@1): New simplifier.
	(mult (rdiv REAL_CST@0 @1) REAL_CST@2): New simplifier.
	(bit_ior (bit_and:s @0 (bit_not:s @1)) (bit_and:s (bit_not:s @0) @1)):
	New simplifier.
	(bit_and:c @0 (bit_not:s @0)): New simplifier.
	(bit_and:c @0 (eq @0 integer_zerop@1)): New simplifier.
	(bit_and @0 (bit_xor:s @0 @1)): New simplifier.
	(bit_and @0 (bit_xor:s @1 @0)): New simplifier.
	(mult:c (negate @0) (negate @1)): New simplifier.

Comments

Richard Biener Oct. 8, 2015, 1:39 p.m. UTC | #1
On Wed, Oct 7, 2015 at 11:54 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Please find attached the patch that moves some more patterns from
> fold-const using simplify and match.
>
> Please review the patch and let me know if any modifications are required.

+/* Fold X + (X / CST) * -CST to X % CST.  */
+(simplify
+ (plus @0 (mult:s (trunc_div:s @0 INTEGER_CST@1) (negate @1)))

that's a bit too literal -- (negate @1) won't match here, because by the
time the pattern runs -CST has already been folded into a plain
INTEGER_CST rather than a NEGATE_EXPR of @1

+  (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
+  (trunc_mod @0 @1)))

+/* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
+(simplify
+ (minus (bit_and:s @0 (bit_not:s @1)) (bit_and:s @0 @1))
+  (if (! FLOAT_TYPE_P (type))
+  (minus (bit_xor @0 @1) @1)))

Likewise the fold code handles both constant and non-constant B.
To mimic this you need a second pattern for the constant case
or add a predicate matching @1 with ~@1.

+/* (-A) * (-B) -> A * B  */
+(simplify
+ (mult:c (negate @0) (negate @1))
+  (mult @0 @1))

the fold-const.c code handles sign-conversions around the negates
so you should add (convert?)s around them and verify useless_type_conversions.

+/* Fold (a * (1 << b)) into (a << b)  */
+(simplify
+ (mult:c @0 (lshift integer_onep@1 @2))
+  (if (! FLOAT_TYPE_P (type))
+  (lshift @0 @2)))

Likewise (sign-conversion on the lshift).  Though I'm not sure
this won't trap ubsan for signed left-shift of negative values.

+/* Fold (C1/X)*C2 into (C1*C2)/X.  */
+(simplify
+ (mult (rdiv REAL_CST@0 @1) REAL_CST@2)
+  (if (FLOAT_TYPE_P (type)
+       && flag_associative_math)
+  (rdiv (mult @0 @2) @1)))

the fold-const.c code avoids the transform if @0 * @2 doesn't simplify
(nans/infs and flag combos).  Not sure if we care though.

+/* Simplify (X & ~Y) | (~X & Y) is X ^ Y.  */
+(simplify
+ (bit_ior (bit_and:s @0 (bit_not:s @1)) (bit_and:s (bit_not:s @0) @1))
+  (bit_xor @0 @1))

fold again handles also constants for X and Y.  I suggest to re-use
the matching predicate you need to add for the above ~ pattern.
fold also handles sign-converted bit-ands.

+/* Simplify ~X & X as zero.  */
+(simplify
+ (bit_and:c @0 (bit_not:s @0))
+  { build_zero_cst (type); })

I was sure we already had this one...  looks like I was wrong.  Again,
fold handles sign-conversions on @0 resp. the bit_not.

+/* Simplify (X == 0) & X as zero.  */
+(simplify
+ (bit_and:c @0 (eq @0 integer_zerop@1))
+  @1)

I think we already have this one; see logical_inverted_value and its uses:

(simplify
 (bit_and:c @0 (logical_inverted_value @0))
 { build_zero_cst (type); })

+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and @0 (bit_xor:s @0 @1))
+  (bit_and @0 (bit_not @1)))
+
+/* Fold X & (Y ^ X) as ~Y & X.  */
+(simplify
+ (bit_and @0 (bit_xor:s @1 @0))
+  (bit_and (bit_not @1) @0))

add :c on the bit_and and the bit_xor and then merge the patterns.

Thanks,
Richard.




> Tested the patch on X86 without any regressions.
>
> Thanks,
> Naveen
>
> ChangeLog
>
> 2015-10-07  Naveen H.S  <Naveen.Hurugalawadi@caviumnetworks.com>
>
>         * fold-const.c (fold_binary_loc) : Move X + (X / CST) * -CST ->
>         X % CST to match.pd.
>         Move Fold (A & ~B) - (A & B) into (A ^ B) - B to match.pd.
>         Move (-A) * (-B) -> A * B to match.pd.
>         Move (a * (1 << b)) is (a << b) to match.pd.
>         Move convert (C1/X)*C2 into (C1*C2)/X to match.pd.
>         Move (X & ~Y) | (~X & Y) is X ^ Y to match.pd.
>         Move ~X & X, (X == 0) & X, and !X & X are zero to match.pd.
>         Move X & ~X , X & (X == 0), and X & !X are zero to match.pd.
>         Move Fold X & (X ^ Y) as X & ~Y to match.pd.
>         Move Fold X & (Y ^ X) as ~Y & X to match.pd.
>
>         * match.pd (plus @0 (mult:s (trunc_div:s @0 INTEGER_CST@1)
>         (negate @1))): New simplifier.
>         (minus (bit_and:s @0 (bit_not:s @1)) (bit_and:s @0 @1)) :
>         New simplifier.
>         (mult:c @0 (lshift integer_onep@1 @2)): New simplifier.
>         (mult:c (plus @0 @0) INTEGER_CST@1): New simplifier.
>         (mult (rdiv REAL_CST@0 @1) REAL_CST@2): New simplifier.
>         (bit_ior (bit_and:s @0 (bit_not:s @1)) (bit_and:s (bit_not:s @0) @1))
>         : New simplifier.
>         (bit_and:c @0 (bit_not:s @0)): New simplifier.
>         (bit_and:c @0 (eq @0 integer_zerop@1)): New simplifier.
>         (bit_and @0 (bit_xor:s @0 @1)): New simplifier.
>         (bit_and @0 (bit_xor:s @1 @0)): New simplifier.
>         (mult:c (negate @0) (negate @1)): New simplifier.
Bernd Schmidt Oct. 8, 2015, 4:58 p.m. UTC | #2
On 10/07/2015 11:54 AM, Hurugalawadi, Naveen wrote:
> 	Move Fold X & (X ^ Y) as X & ~Y to match.pd.
> 	Move Fold X & (Y ^ X) as ~Y & X to match.pd.

I wonder if we shouldn't try to autogenerate patterns such as these. I 
did something like that for a different project a long time ago. 
Generate expressions up to a certain level of complexity, identify which 
ones are equivalent, and pick the one with the lowest cost for 
simplifications...


Bernd
Joseph Myers Oct. 8, 2015, 6:03 p.m. UTC | #3
On Thu, 8 Oct 2015, Bernd Schmidt wrote:

> On 10/07/2015 11:54 AM, Hurugalawadi, Naveen wrote:
> > 	Move Fold X & (X ^ Y) as X & ~Y to match.pd.
> > 	Move Fold X & (Y ^ X) as ~Y & X to match.pd.
> 
> I wonder if we shouldn't try to autogenerate patterns such as these. I did
> something like that for a different project a long time ago. Generate
> expressions up to a certain level of complexity, identify which ones are
> equivalent, and pick the one with the lowest cost for simplifications...

Any bitwise expression whose ultimate operands are X, Y, 0 and -1 
(possibly with conversions among types of the same width) could be 
canonicalized to one of: 0, -1, X, Y, ~X, ~Y, X^Y, X^~Y, A&B or A|B (where 
A is X or ~X and B is Y or ~Y).  I don't guarantee those are the best 
canonical forms, but if you're folding this sort of expression you ought 
to be able to make GCC fold all such expressions down to some such form 
(and fold away all equality comparisons among such expressions with 
constant value).

Now, such canonicalization could be done with a finite number of 
autogenerated patterns (if you can fold ~(A BINOP B), (A BINOP B) BINOP C 
and (A BINOP B) BINOP (C BINOP D), for A, B, C, D from 0, -1, X, Y, ~X, 
~Y, folding for more complicated expressions falls out).  I don't know if 
that's the best way to do such folding or not.

Given such folding, autogenerating expressions of the form ((A BINOP B) 
BINOP (C BINOP D)) == CANONICAL_FORM seems a plausible way of getting 
testsuite coverage for the folding (and for that matter for seeing what 
such folding is missing at present).
Bernd Schmidt Oct. 8, 2015, 6:15 p.m. UTC | #4
On 10/08/2015 08:03 PM, Joseph Myers wrote:
> On Thu, 8 Oct 2015, Bernd Schmidt wrote:
>
>> On 10/07/2015 11:54 AM, Hurugalawadi, Naveen wrote:
>>> 	Move Fold X & (X ^ Y) as X & ~Y to match.pd.
>>> 	Move Fold X & (Y ^ X) as ~Y & X to match.pd.
>>
>> I wonder if we shouldn't try to autogenerate patterns such as these. I did
>> something like that for a different project a long time ago. Generate
>> expressions up to a certain level of complexity, identify which ones are
>> equivalent, and pick the one with the lowest cost for simplifications...
>
> Any bitwise expression whose ultimate operands are X, Y, 0 and -1
> (possibly with conversions among types of the same width) could be
> canonicalized to one of: 0, -1, X, Y, ~X, ~Y, X^Y, X^~Y, A&B or A|B (where
> A is X or ~X and B is Y or ~Y).  I don't guarantee those are the best
> canonical forms, but if you're folding this sort of expression you ought
> to be able to make GCC fold all such expressions down to some such form
> (and fold away all equality comparisons among such expressions with
> constant value).

I was actually thinking of also doing this for more complex expressions 
with more than two different operands.


Bernd
Richard Biener Oct. 9, 2015, 9:32 a.m. UTC | #5
On Thu, Oct 8, 2015 at 8:15 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 10/08/2015 08:03 PM, Joseph Myers wrote:
>>
>> On Thu, 8 Oct 2015, Bernd Schmidt wrote:
>>
>>> On 10/07/2015 11:54 AM, Hurugalawadi, Naveen wrote:
>>>>
>>>>         Move Fold X & (X ^ Y) as X & ~Y to match.pd.
>>>>         Move Fold X & (Y ^ X) as ~Y & X to match.pd.
>>>
>>>
>>> I wonder if we shouldn't try to autogenerate patterns such as these. I
>>> did
>>> something like that for a different project a long time ago. Generate
>>> expressions up to a certain level of complexity, identify which ones are
>>> equivalent, and pick the one with the lowest cost for simplifications...
>>
>>
>> Any bitwise expression whose ultimate operands are X, Y, 0 and -1
>> (possibly with conversions among types of the same width) could be
>> canonicalized to one of: 0, -1, X, Y, ~X, ~Y, X^Y, X^~Y, A&B or A|B (where
>> A is X or ~X and B is Y or ~Y).  I don't guarantee those are the best
>> canonical forms, but if you're folding this sort of expression you ought
>> to be able to make GCC fold all such expressions down to some such form
>> (and fold away all equality comparisons among such expressions with
>> constant value).
>
>
> I was actually thinking of also doing this for more complex expressions with
> more than two different operands.

Note that the optimization problem in general is more complicated and
involves multiple such expressions which should be associated/optimized
with CSE in mind to result in the minimal number of overall instructions
needed to compute all required values.

But yes, we could auto-generate patterns up to a specific depth or up
to a point where more complex patterns are handled by composition.
Patches welcome ;)

Richard.

>
> Bernd

Patch

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 7231fd6..ad850f0 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9191,26 +9191,6 @@  fold_binary_loc (location_t loc,
       return NULL_TREE;
 
     case PLUS_EXPR:
-      if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
-	{
-	  /* X + (X / CST) * -CST is X % CST.  */
-	  if (TREE_CODE (arg1) == MULT_EXPR
-	      && TREE_CODE (TREE_OPERAND (arg1, 0)) == TRUNC_DIV_EXPR
-	      && operand_equal_p (arg0,
-				  TREE_OPERAND (TREE_OPERAND (arg1, 0), 0), 0))
-	    {
-	      tree cst0 = TREE_OPERAND (TREE_OPERAND (arg1, 0), 1);
-	      tree cst1 = TREE_OPERAND (arg1, 1);
-	      tree sum = fold_binary_loc (loc, PLUS_EXPR, TREE_TYPE (cst1),
-				      cst1, cst0);
-	      if (sum && integer_zerop (sum))
-		return fold_convert_loc (loc, type,
-					 fold_build2_loc (loc, TRUNC_MOD_EXPR,
-						      TREE_TYPE (arg0), arg0,
-						      cst0));
-	    }
-	}
-
       /* Handle (A1 * C1) + (A2 * C2) with A1, A2 or C1, C2 being the same or
 	 one.  Make sure the type is not saturating and has the signedness of
 	 the stripped operands, as fold_plusminus_mult_expr will re-associate.
@@ -9651,28 +9631,6 @@  fold_binary_loc (location_t loc,
 			    fold_convert_loc (loc, type,
 					      TREE_OPERAND (arg0, 0)));
 
-      if (! FLOAT_TYPE_P (type))
-	{
-	  /* Fold (A & ~B) - (A & B) into (A ^ B) - B, where B is
-	     any power of 2 minus 1.  */
-	  if (TREE_CODE (arg0) == BIT_AND_EXPR
-	      && TREE_CODE (arg1) == BIT_AND_EXPR
-	      && operand_equal_p (TREE_OPERAND (arg0, 0),
-				  TREE_OPERAND (arg1, 0), 0))
-	    {
-	      tree mask0 = TREE_OPERAND (arg0, 1);
-	      tree mask1 = TREE_OPERAND (arg1, 1);
-	      tree tem = fold_build1_loc (loc, BIT_NOT_EXPR, type, mask0);
-
-	      if (operand_equal_p (tem, mask1, 0))
-		{
-		  tem = fold_build2_loc (loc, BIT_XOR_EXPR, type,
-				     TREE_OPERAND (arg0, 0), mask1);
-		  return fold_build2_loc (loc, MINUS_EXPR, type, tem, mask1);
-		}
-	    }
-	}
-
       /* Fold __complex__ ( x, 0 ) - __complex__ ( 0, y ) to
 	 __complex__ ( x, -y ).  This is not the same for SNaNs or if
 	 signed zeros are involved.  */
@@ -9762,20 +9720,6 @@  fold_binary_loc (location_t loc,
       goto associate;
 
     case MULT_EXPR:
-      /* (-A) * (-B) -> A * B  */
-      if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (arg1))
-	return fold_build2_loc (loc, MULT_EXPR, type,
-			    fold_convert_loc (loc, type,
-					      TREE_OPERAND (arg0, 0)),
-			    fold_convert_loc (loc, type,
-					      negate_expr (arg1)));
-      if (TREE_CODE (arg1) == NEGATE_EXPR && negate_expr_p (arg0))
-	return fold_build2_loc (loc, MULT_EXPR, type,
-			    fold_convert_loc (loc, type,
-					      negate_expr (arg0)),
-			    fold_convert_loc (loc, type,
-					      TREE_OPERAND (arg1, 0)));
-
       if (! FLOAT_TYPE_P (type))
 	{
 	  /* Transform x * -C into -x * C if x is easily negatable.  */
@@ -9789,16 +9733,6 @@  fold_binary_loc (location_t loc,
 						  negate_expr (arg0)),
 				tem);
 
-	  /* (a * (1 << b)) is (a << b)  */
-	  if (TREE_CODE (arg1) == LSHIFT_EXPR
-	      && integer_onep (TREE_OPERAND (arg1, 0)))
-	    return fold_build2_loc (loc, LSHIFT_EXPR, type, op0,
-				TREE_OPERAND (arg1, 1));
-	  if (TREE_CODE (arg0) == LSHIFT_EXPR
-	      && integer_onep (TREE_OPERAND (arg0, 0)))
-	    return fold_build2_loc (loc, LSHIFT_EXPR, type, op1,
-				TREE_OPERAND (arg0, 1));
-
 	  /* (A + A) * C -> A * 2 * C  */
 	  if (TREE_CODE (arg0) == PLUS_EXPR
 	      && TREE_CODE (arg1) == INTEGER_CST
@@ -9841,21 +9775,6 @@  fold_binary_loc (location_t loc,
 	}
       else
 	{
-	  /* Convert (C1/X)*C2 into (C1*C2)/X.  This transformation may change
-             the result for floating point types due to rounding so it is applied
-             only if -fassociative-math was specify.  */
-	  if (flag_associative_math
-	      && TREE_CODE (arg0) == RDIV_EXPR
-	      && TREE_CODE (arg1) == REAL_CST
-	      && TREE_CODE (TREE_OPERAND (arg0, 0)) == REAL_CST)
-	    {
-	      tree tem = const_binop (MULT_EXPR, TREE_OPERAND (arg0, 0),
-				      arg1);
-	      if (tem)
-		return fold_build2_loc (loc, RDIV_EXPR, type, tem,
-				    TREE_OPERAND (arg0, 1));
-	    }
-
           /* Strip sign operations from X in X*X, i.e. -Y*-Y -> Y*Y.  */
 	  if (operand_equal_p (arg0, arg1, 0))
 	    {
@@ -9972,28 +9891,6 @@  fold_binary_loc (location_t loc,
 				    arg1);
 	}
 
-      /* (X & ~Y) | (~X & Y) is X ^ Y */
-      if (TREE_CODE (arg0) == BIT_AND_EXPR
-	  && TREE_CODE (arg1) == BIT_AND_EXPR)
-        {
-	  tree a0, a1, l0, l1, n0, n1;
-
-	  a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
-	  a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
-
-	  l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-	  l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
-	  
-	  n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
-	  n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
-	  
-	  if ((operand_equal_p (n0, a0, 0)
-	       && operand_equal_p (n1, a1, 0))
-	      || (operand_equal_p (n0, a1, 0)
-		  && operand_equal_p (n1, a0, 0)))
-	    return fold_build2_loc (loc, BIT_XOR_EXPR, type, l0, n1);
-	}
-
       /* See if this can be simplified into a rotate first.  If that
 	 is unsuccessful continue in the association code.  */
       goto bit_rotate;
@@ -10012,22 +9909,6 @@  fold_binary_loc (location_t loc,
       goto bit_rotate;
 
     case BIT_AND_EXPR:
-      /* ~X & X, (X == 0) & X, and !X & X are always zero.  */
-      if ((TREE_CODE (arg0) == BIT_NOT_EXPR
-	   || TREE_CODE (arg0) == TRUTH_NOT_EXPR
-	   || (TREE_CODE (arg0) == EQ_EXPR
-	       && integer_zerop (TREE_OPERAND (arg0, 1))))
-	  && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0))
-	return omit_one_operand_loc (loc, type, integer_zero_node, arg1);
-
-      /* X & ~X , X & (X == 0), and X & !X are always zero.  */
-      if ((TREE_CODE (arg1) == BIT_NOT_EXPR
-	   || TREE_CODE (arg1) == TRUTH_NOT_EXPR
-	   || (TREE_CODE (arg1) == EQ_EXPR
-	       && integer_zerop (TREE_OPERAND (arg1, 1))))
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	return omit_one_operand_loc (loc, type, integer_zero_node, arg0);
-
       /* Fold (X ^ 1) & 1 as (X & 1) == 0.  */
       if (TREE_CODE (arg0) == BIT_XOR_EXPR
 	  && INTEGRAL_TYPE_P (type)
@@ -10083,26 +9964,6 @@  fold_binary_loc (location_t loc,
 			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
 			      fold_convert_loc (loc, type, arg1));
 	}
-      /* Fold X & (X ^ Y) as X & ~Y.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_convert_loc (loc, type, arg0),
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
-	}
-      /* Fold X & (Y ^ X) as ~Y & X.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
-	  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg0));
-	}
-
       /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
          multiple of 1 << CST.  */
       if (TREE_CODE (arg1) == INTEGER_CST)
diff --git a/gcc/match.pd b/gcc/match.pd
index bd5c267..690ea14 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -316,6 +316,56 @@  along with GCC; see the file COPYING3.  If not see
   (coss (op @0))
    (coss @0))))
 
+/* Fold X + (X / CST) * -CST to X % CST.  */
+(simplify
+ (plus @0 (mult:s (trunc_div:s @0 INTEGER_CST@1) (negate @1)))
+  (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
+  (trunc_mod @0 @1)))
+
+/* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
+(simplify
+ (minus (bit_and:s @0 (bit_not:s @1)) (bit_and:s @0 @1))
+  (if (! FLOAT_TYPE_P (type))
+  (minus (bit_xor @0 @1) @1)))
+
+/* Fold (a * (1 << b)) into (a << b)  */
+(simplify
+ (mult:c @0 (lshift integer_onep@1 @2))
+  (if (! FLOAT_TYPE_P (type))
+  (lshift @0 @2)))
+
+/* Fold (C1/X)*C2 into (C1*C2)/X.  */ 
+(simplify
+ (mult (rdiv REAL_CST@0 @1) REAL_CST@2)
+  (if (FLOAT_TYPE_P (type)
+       && flag_associative_math)
+  (rdiv (mult @0 @2) @1)))
+
+/* Simplify (X & ~Y) | (~X & Y) is X ^ Y.  */
+(simplify
+ (bit_ior (bit_and:s @0 (bit_not:s @1)) (bit_and:s (bit_not:s @0) @1))
+  (bit_xor @0 @1))
+
+/* Simplify ~X & X as zero.  */
+(simplify
+ (bit_and:c @0 (bit_not:s @0))
+  { build_zero_cst (type); })
+
+/* Simplify (X == 0) & X as zero.  */
+(simplify
+ (bit_and:c @0 (eq @0 integer_zerop@1))
+  @1)
+
+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and @0 (bit_xor:s @0 @1))
+  (bit_and @0 (bit_not @1)))
+
+/* Fold X & (Y ^ X) as ~Y & X.  */
+(simplify
+ (bit_and @0 (bit_xor:s @1 @0))
+  (bit_and (bit_not @1) @0))
+
 /* X % Y is smaller than Y.  */
 (for cmp (lt ge)
  (simplify
@@ -549,6 +599,11 @@  along with GCC; see the file COPYING3.  If not see
  (if (!FIXED_POINT_TYPE_P (type))
  (plus @0 (negate @1))))
 
+/* (-A) * (-B) -> A * B  */
+(simplify
+ (mult:c (negate @0) (negate @1))
+  (mult @0 @1))
+
 /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
    when profitable.
    For bitwise binary operations apply operand conversions to the