
[10/n] Remove GENERIC stmt combining from SCCVN

Message ID: alpine.LSU.2.11.1507011502050.22477@zhemvz.fhfr.qr
State: New

Commit Message

Richard Biener July 1, 2015, 1:03 p.m. UTC
This merges the complete comparison patterns from the match-and-simplify
branch, leaving incomplete implementations of fold-const.c code alone.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-07-01  Richard Biener  <rguenther@suse.de>

	* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
	X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
	~X CMP C -> X CMP' ~C to ...
	* match.pd: ... patterns here.

Comments

Kyrylo Tkachov July 6, 2015, 2:34 p.m. UTC | #1
Hi Richard,

On 01/07/15 14:03, Richard Biener wrote:
> This merges the complete comparison patterns from the match-and-simplify
> branch, leaving incomplete implementations of fold-const.c code alone.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>
> Richard.
>
> 2015-07-01  Richard Biener  <rguenther@suse.de>
>
> 	* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
> 	X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
> 	~X CMP C -> X CMP' ~C to ...
> 	* match.pd: ... patterns here.
>
>
>   
> +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
> +   ??? The transformation is valid for the other operators if overflow
> +   is undefined for the type, but performing it here badly interacts
> +   with the transformation in fold_cond_expr_with_comparison which
> +   attempts to synthetize ABS_EXPR.  */
> +(for cmp (eq ne)
> + (simplify
> +  (cmp (minus @0 @1) integer_zerop)
> +  (cmp @0 @1)))

This broke some tests on aarch64:
FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3

To take subs.c as an example:
There's something odd going on here: the X - Y CMP 0 -> X CMP Y transformation
is triggered only for the int case, not the long long case, yet the int
case (foo) is exactly where the RTL ends up as:

(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
         (minus:SI (reg/v:SI 76 [ x ])
             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
      (nil))
(insn 10 9 11 2 (set (reg:CC 66 cc)
         (compare:CC (reg/v:SI 76 [ x ])
             (reg/v:SI 77 [ y ])))

instead of the previous:

(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
         (minus:SI (reg/v:SI 76 [ x ])
             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}

(insn 10 9 11 2 (set (reg:CC 66 cc)
         (compare:CC (reg/v:SI 74 [ l ])
             (const_int 0 [0])))


so the transformed X CMP Y does not get matched by combine into a subs.
Was the transformation before the patch in fold-const.c not getting triggered?
In aarch64 we have patterns to match:
   [(set (reg:CC_NZ CC_REGNUM)
     (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
                   (match_operand:GPI 2 "register_operand" "r"))
                (const_int 0)))
    (set (match_operand:GPI 0 "register_operand" "=r")
     (minus:GPI (match_dup 1) (match_dup 2)))]


Should we add a pattern to match:
   [(set (reg:CC CC_REGNUM)
     (compare:CC (match_operand:GPI 1 "register_operand" "r")
                    (match_operand:GPI 2 "register_operand" "r")))
    (set (match_operand:GPI 0 "register_operand" "=r")
     (minus:GPI (match_dup 1) (match_dup 2)))]

as well?

Kyrill

> +
> +/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
> +   signed arithmetic case.  That form is created by the compiler
> +   often enough for folding it to be of value.  One example is in
> +   computing loop trip counts after Operator Strength Reduction.  */
> +(for cmp (tcc_comparison)
> +     scmp (swapped_tcc_comparison)
> + (simplify
> +  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
> +  /* Handle unfolded multiplication by zero.  */
> +  (if (integer_zerop (@1))
> +   (cmp @1 @2))
> +  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +       && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> +   /* If @1 is negative we swap the sense of the comparison.  */
> +   (if (tree_int_cst_sgn (@1) < 0)
> +    (scmp @0 @2))
> +   (cmp @0 @2))))
> +
> +/* Simplify comparison of something with itself.  For IEEE
> +   floating-point, we can only do some of these simplifications.  */
> +(simplify
> + (eq @0 @0)
> + (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
> +      || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
> +  { constant_boolean_node (true, type); }))
> +(for cmp (ge le)
> + (simplify
> +  (cmp @0 @0)
> +  (eq @0 @0)))
> +(for cmp (ne gt lt)
> + (simplify
> +  (cmp @0 @0)
> +  (if (cmp != NE_EXPR
> +       || ! FLOAT_TYPE_P (TREE_TYPE (@0))
> +       || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
> +   { constant_boolean_node (false, type); })))
> +
> +/* Fold ~X op ~Y as Y op X.  */
> +(for cmp (tcc_comparison)
> + (simplify
> +  (cmp (bit_not @0) (bit_not @1))
> +  (cmp @1 @0)))
> +
> +/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> +(for cmp (tcc_comparison)
> +     scmp (swapped_tcc_comparison)
> + (simplify
> +  (cmp (bit_not @0) CONSTANT_CLASS_P@1)
> +  (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
> +   (scmp @0 (bit_not @1)))))
> +
> +
>   /* Unordered tests if either argument is a NaN.  */
>   (simplify
>    (bit_ior (unordered @0 @0) (unordered @1 @1))
>
Andreas Schwab July 6, 2015, 2:38 p.m. UTC | #2
Kyrill Tkachov <kyrylo.tkachov@arm.com> writes:

> This broke some tests on aarch64:
> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3

This is PR66739.

Andreas.
Richard Biener July 6, 2015, 2:46 p.m. UTC | #3
On Mon, 6 Jul 2015, Kyrill Tkachov wrote:

> Hi Richard,
> 
> On 01/07/15 14:03, Richard Biener wrote:
> > This merges the complete comparison patterns from the match-and-simplify
> > branch, leaving incomplete implementations of fold-const.c code alone.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> > 
> > Richard.
> > 
> > 2015-07-01  Richard Biener  <rguenther@suse.de>
> > 
> > 	* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
> > 	X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
> > 	~X CMP C -> X CMP' ~C to ...
> > 	* match.pd: ... patterns here.
> > 
> > 
> >   +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
> > +   ??? The transformation is valid for the other operators if overflow
> > +   is undefined for the type, but performing it here badly interacts
> > +   with the transformation in fold_cond_expr_with_comparison which
> > +   attempts to synthetize ABS_EXPR.  */
> > +(for cmp (eq ne)
> > + (simplify
> > +  (cmp (minus @0 @1) integer_zerop)
> > +  (cmp @0 @1)))
> 
> This broke some tests on aarch64:
> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
> w[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
> w[0-9]+, lsl 3
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
> x[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
> x[0-9]+, lsl 3
> 
> To take subs.c as an example:
> There's something odd going on:
> The X - Y CMP 0 -> X CMP Y transformation gets triggered only for the int case
> but
> not the long long case, but the int case (foo) is the place where the rtl ends
> up being:
> 
> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>         (minus:SI (reg/v:SI 76 [ x ])
>             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>      (nil))
> (insn 10 9 11 2 (set (reg:CC 66 cc)
>         (compare:CC (reg/v:SI 76 [ x ])
>             (reg/v:SI 77 [ y ])))
> 
> instead of the previous:
> 
> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>         (minus:SI (reg/v:SI 76 [ x ])
>             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
> 
> (insn 10 9 11 2 (set (reg:CC 66 cc)
>         (compare:CC (reg/v:SI 74 [ l ])
>             (const_int 0 [0])))
> 
> 
> so the transformed X CMP Y does not get matched by combine into a subs.
> Was the transformation before the patch in fold-const.c not getting triggered?

It was prevented from getting triggered by restricting the transform
to single uses (a fix I am testing right now).

Note that in case you'd write

  int l = x - y;
  if (l == 0)
    return 5;

  /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */
  z = x - y ;

the simplification will happen anyway, because the redundant
computation of z has not yet been eliminated (a reason why such
single-use checks are not 100% the "correct" thing to do).
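For illustration, Richard's fragment can be fleshed out into a complete function (an editorial sketch; foo's signature and the global z are assumptions extrapolated from the snippet):

```c
int z;

int
foo (int x, int y)
{
  int l = x - y;
  if (l == 0)
    return 5;

  /* A second use of x - y defeats the single-use restriction: the
     redundant subtraction computing z has not yet been eliminated when
     the match.pd fold runs, so l appears multiply used and the
     comparison is rewritten to x == y anyway.  */
  z = x - y;
  return l;
}
```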

> In aarch64 we have patterns to match:
>   [(set (reg:CC_NZ CC_REGNUM)
>     (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
>                   (match_operand:GPI 2 "register_operand" "r"))
>                (const_int 0)))
>    (set (match_operand:GPI 0 "register_operand" "=r")
>     (minus:GPI (match_dup 1) (match_dup 2)))]
> 
> 
> Should we add a pattern to match:
>   [(set (reg:CC CC_REGNUM)
>     (compare:CC (match_operand:GPI 1 "register_operand" "r")
>                    (match_operand:GPI 2 "register_operand" "r")))
>    (set (match_operand:GPI 0 "register_operand" "=r")
>     (minus:GPI (match_dup 1) (match_dup 2)))]
> 
> as well?

No, I don't think so.

Richard.

> Kyrill
> 
> > +
> > +/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
> > +   signed arithmetic case.  That form is created by the compiler
> > +   often enough for folding it to be of value.  One example is in
> > +   computing loop trip counts after Operator Strength Reduction.  */
> > +(for cmp (tcc_comparison)
> > +     scmp (swapped_tcc_comparison)
> > + (simplify
> > +  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
> > +  /* Handle unfolded multiplication by zero.  */
> > +  (if (integer_zerop (@1))
> > +   (cmp @1 @2))
> > +  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +       && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> > +   /* If @1 is negative we swap the sense of the comparison.  */
> > +   (if (tree_int_cst_sgn (@1) < 0)
> > +    (scmp @0 @2))
> > +   (cmp @0 @2))))
> > +
> > +/* Simplify comparison of something with itself.  For IEEE
> > +   floating-point, we can only do some of these simplifications.  */
> > +(simplify
> > + (eq @0 @0)
> > + (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
> > +      || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
> > +  { constant_boolean_node (true, type); }))
> > +(for cmp (ge le)
> > + (simplify
> > +  (cmp @0 @0)
> > +  (eq @0 @0)))
> > +(for cmp (ne gt lt)
> > + (simplify
> > +  (cmp @0 @0)
> > +  (if (cmp != NE_EXPR
> > +       || ! FLOAT_TYPE_P (TREE_TYPE (@0))
> > +       || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
> > +   { constant_boolean_node (false, type); })))
> > +
> > +/* Fold ~X op ~Y as Y op X.  */
> > +(for cmp (tcc_comparison)
> > + (simplify
> > +  (cmp (bit_not @0) (bit_not @1))
> > +  (cmp @1 @0)))
> > +
> > +/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> > +(for cmp (tcc_comparison)
> > +     scmp (swapped_tcc_comparison)
> > + (simplify
> > +  (cmp (bit_not @0) CONSTANT_CLASS_P@1)
> > +  (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
> > +   (scmp @0 (bit_not @1)))))
> > +
> > +
> >   /* Unordered tests if either argument is a NaN.  */
> >   (simplify
> >    (bit_ior (unordered @0 @0) (unordered @1 @1))
> > 
> 
>
Kyrylo Tkachov July 6, 2015, 2:56 p.m. UTC | #4
On 06/07/15 15:46, Richard Biener wrote:
> On Mon, 6 Jul 2015, Kyrill Tkachov wrote:
>
>> Hi Richard,
>>
>> On 01/07/15 14:03, Richard Biener wrote:
>>> This merges the complete comparison patterns from the match-and-simplify
>>> branch, leaving incomplete implementations of fold-const.c code alone.
>>>
>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>>>
>>> Richard.
>>>
>>> 2015-07-01  Richard Biener  <rguenther@suse.de>
>>>
>>> 	* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
>>> 	X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
>>> 	~X CMP C -> X CMP' ~C to ...
>>> 	* match.pd: ... patterns here.
>>>
>>>
>>>    +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
>>> +   ??? The transformation is valid for the other operators if overflow
>>> +   is undefined for the type, but performing it here badly interacts
>>> +   with the transformation in fold_cond_expr_with_comparison which
>>> +   attempts to synthetize ABS_EXPR.  */
>>> +(for cmp (eq ne)
>>> + (simplify
>>> +  (cmp (minus @0 @1) integer_zerop)
>>> +  (cmp @0 @1)))
>> This broke some tests on aarch64:
>> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
>> w[0-9]+
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
>> w[0-9]+, lsl 3
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
>> x[0-9]+
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
>> x[0-9]+, lsl 3
>>
>> To take subs.c as an example:
>> There's something odd going on:
>> The X - Y CMP 0 -> X CMP Y transformation gets triggered only for the int case
>> but
>> not the long long case, but the int case (foo) is the place where the rtl ends
>> up being:
>>
>> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>>          (minus:SI (reg/v:SI 76 [ x ])
>>              (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>>       (nil))
>> (insn 10 9 11 2 (set (reg:CC 66 cc)
>>          (compare:CC (reg/v:SI 76 [ x ])
>>              (reg/v:SI 77 [ y ])))
>>
>> instead of the previous:
>>
>> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>>          (minus:SI (reg/v:SI 76 [ x ])
>>              (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>>
>> (insn 10 9 11 2 (set (reg:CC 66 cc)
>>          (compare:CC (reg/v:SI 74 [ l ])
>>              (const_int 0 [0])))
>>
>>
>> so the transformed X CMP Y does not get matched by combine into a subs.
>> Was the transformation before the patch in fold-const.c not getting triggered?
> It was prevented from getting triggered by restricting the transform
> to single uses (a fix I am testing right now).
>
> Note that in case you'd write
>
>    int l = x - y;
>    if (l == 0)
>      return 5;
>
>    /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */
>    z = x - y ;
>
> the simplification will happen anyway because the redundancy
> computing z has not yet been eliminated (a reason why such
> single-use checks are not 100% the very much "correct" thing to do).

Ok, thanks. Andreas pointed out PR 66739 to me. I had not noticed it.
Sorry for the noise.

Kyrill

>
>> In aarch64 we have patterns to match:
>>    [(set (reg:CC_NZ CC_REGNUM)
>>      (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
>>                    (match_operand:GPI 2 "register_operand" "r"))
>>                 (const_int 0)))
>>     (set (match_operand:GPI 0 "register_operand" "=r")
>>      (minus:GPI (match_dup 1) (match_dup 2)))]
>>
>>
>> Should we add a pattern to match:
>>    [(set (reg:CC CC_REGNUM)
>>      (compare:CC (match_operand:GPI 1 "register_operand" "r")
>>                     (match_operand:GPI 2 "register_operand" "r")))
>>     (set (match_operand:GPI 0 "register_operand" "=r")
>>      (minus:GPI (match_dup 1) (match_dup 2)))]
>>
>> as well?
> No, I don't think so.
>
> Richard.
>
>> Kyrill
>>
>>> +
>>> +/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
>>> +   signed arithmetic case.  That form is created by the compiler
>>> +   often enough for folding it to be of value.  One example is in
>>> +   computing loop trip counts after Operator Strength Reduction.  */
>>> +(for cmp (tcc_comparison)
>>> +     scmp (swapped_tcc_comparison)
>>> + (simplify
>>> +  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
>>> +  /* Handle unfolded multiplication by zero.  */
>>> +  (if (integer_zerop (@1))
>>> +   (cmp @1 @2))
>>> +  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
>>> +       && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>>> +   /* If @1 is negative we swap the sense of the comparison.  */
>>> +   (if (tree_int_cst_sgn (@1) < 0)
>>> +    (scmp @0 @2))
>>> +   (cmp @0 @2))))
>>> +
>>> +/* Simplify comparison of something with itself.  For IEEE
>>> +   floating-point, we can only do some of these simplifications.  */
>>> +(simplify
>>> + (eq @0 @0)
>>> + (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
>>> +      || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
>>> +  { constant_boolean_node (true, type); }))
>>> +(for cmp (ge le)
>>> + (simplify
>>> +  (cmp @0 @0)
>>> +  (eq @0 @0)))
>>> +(for cmp (ne gt lt)
>>> + (simplify
>>> +  (cmp @0 @0)
>>> +  (if (cmp != NE_EXPR
>>> +       || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>>> +       || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
>>> +   { constant_boolean_node (false, type); })))
>>> +
>>> +/* Fold ~X op ~Y as Y op X.  */
>>> +(for cmp (tcc_comparison)
>>> + (simplify
>>> +  (cmp (bit_not @0) (bit_not @1))
>>> +  (cmp @1 @0)))
>>> +
>>> +/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
>>> +(for cmp (tcc_comparison)
>>> +     scmp (swapped_tcc_comparison)
>>> + (simplify
>>> +  (cmp (bit_not @0) CONSTANT_CLASS_P@1)
>>> +  (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
>>> +   (scmp @0 (bit_not @1)))))
>>> +
>>> +
>>>    /* Unordered tests if either argument is a NaN.  */
>>>    (simplify
>>>     (bit_ior (unordered @0 @0) (unordered @1 @1))
>>>
>>

Patch

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 225225)
+++ gcc/fold-const.c	(working copy)
@@ -8783,23 +8734,6 @@  fold_comparison (location_t loc, enum tr
 	}
     }
 
-  /* Transform comparisons of the form X - Y CMP 0 to X CMP Y.  */
-  if (TREE_CODE (arg0) == MINUS_EXPR
-      && equality_code
-      && integer_zerop (arg1))
-    {
-      /* ??? The transformation is valid for the other operators if overflow
-	 is undefined for the type, but performing it here badly interacts
-	 with the transformation in fold_cond_expr_with_comparison which
-	 attempts to synthetize ABS_EXPR.  */
-      if (!equality_code)
-	fold_overflow_warning ("assuming signed overflow does not occur "
-			       "when changing X - Y cmp 0 to X cmp Y",
-			       WARN_STRICT_OVERFLOW_COMPARISON);
-      return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
-			      TREE_OPERAND (arg0, 1));
-    }
-
   /* For comparisons of pointers we can decompose it to a compile time
      comparison of the base objects and the offsets into the object.
      This requires at least one operand being an ADDR_EXPR or a
@@ -9088,38 +9022,6 @@  fold_comparison (location_t loc, enum tr
 	}
     }
 
-  /* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
-     signed arithmetic case.  That form is created by the compiler
-     often enough for folding it to be of value.  One example is in
-     computing loop trip counts after Operator Strength Reduction.  */
-  if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
-      && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
-      && TREE_CODE (arg0) == MULT_EXPR
-      && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
-          && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1)))
-      && integer_zerop (arg1))
-    {
-      tree const1 = TREE_OPERAND (arg0, 1);
-      tree const2 = arg1;                       /* zero */
-      tree variable1 = TREE_OPERAND (arg0, 0);
-      enum tree_code cmp_code = code;
-
-      /* Handle unfolded multiplication by zero.  */
-      if (integer_zerop (const1))
-	return fold_build2_loc (loc, cmp_code, type, const1, const2);
-
-      fold_overflow_warning (("assuming signed overflow does not occur when "
-			      "eliminating multiplication in comparison "
-			      "with zero"),
-			     WARN_STRICT_OVERFLOW_COMPARISON);
-
-      /* If const1 is negative we swap the sense of the comparison.  */
-      if (tree_int_cst_sgn (const1) < 0)
-        cmp_code = swap_tree_comparison (cmp_code);
-
-      return fold_build2_loc (loc, cmp_code, type, variable1, const2);
-    }
-
   tem = maybe_canonicalize_comparison (loc, code, type, arg0, arg1);
   if (tem)
     return tem;
@@ -9241,40 +9138,6 @@  fold_comparison (location_t loc, enum tr
 	return tem;
     }
 
-  /* Simplify comparison of something with itself.  (For IEEE
-     floating-point, we can only do some of these simplifications.)  */
-  if (operand_equal_p (arg0, arg1, 0))
-    {
-      switch (code)
-	{
-	case EQ_EXPR:
-	  if (! FLOAT_TYPE_P (TREE_TYPE (arg0))
-	      || ! HONOR_NANS (arg0))
-	    return constant_boolean_node (1, type);
-	  break;
-
-	case GE_EXPR:
-	case LE_EXPR:
-	  if (! FLOAT_TYPE_P (TREE_TYPE (arg0))
-	      || ! HONOR_NANS (arg0))
-	    return constant_boolean_node (1, type);
-	  return fold_build2_loc (loc, EQ_EXPR, type, arg0, arg1);
-
-	case NE_EXPR:
-	  /* For NE, we can only do this simplification if integer
-	     or we don't honor IEEE floating point NaNs.  */
-	  if (FLOAT_TYPE_P (TREE_TYPE (arg0))
-	      && HONOR_NANS (arg0))
-	    break;
-	  /* ... fall through ...  */
-	case GT_EXPR:
-	case LT_EXPR:
-	  return constant_boolean_node (0, type);
-	default:
-	  gcc_unreachable ();
-	}
-    }
-
   /* If we are comparing an expression that just has comparisons
      of two integer values, arithmetic expressions of those comparisons,
      and constants, we can simplify it.  There are only three cases
@@ -9392,28 +9255,6 @@  fold_comparison (location_t loc, enum tr
 	return tem;
     }
 
-  /* Fold ~X op ~Y as Y op X.  */
-  if (TREE_CODE (arg0) == BIT_NOT_EXPR
-      && TREE_CODE (arg1) == BIT_NOT_EXPR)
-    {
-      tree cmp_type = TREE_TYPE (TREE_OPERAND (arg0, 0));
-      return fold_build2_loc (loc, code, type,
-			  fold_convert_loc (loc, cmp_type,
-					    TREE_OPERAND (arg1, 0)),
-			  TREE_OPERAND (arg0, 0));
-    }
-
-  /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
-  if (TREE_CODE (arg0) == BIT_NOT_EXPR
-      && (TREE_CODE (arg1) == INTEGER_CST || TREE_CODE (arg1) == VECTOR_CST))
-    {
-      tree cmp_type = TREE_TYPE (TREE_OPERAND (arg0, 0));
-      return fold_build2_loc (loc, swap_tree_comparison (code), type,
-			  TREE_OPERAND (arg0, 0),
-			  fold_build1_loc (loc, BIT_NOT_EXPR, cmp_type,
-				       fold_convert_loc (loc, cmp_type, arg1)));
-    }
-
   return NULL_TREE;
 }
 
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 225225)
+++ gcc/match.pd	(working copy)
@@ -1262,6 +1262,7 @@  (define_operator_list swapped_tcc_compar
           == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)))))
   (plus @3 (view_convert @0))))
 
+
 /* Simplifications of comparisons.  */
 
 /* We can simplify a logical negation of a comparison to the
@@ -1299,6 +1300,68 @@  (define_operator_list swapped_tcc_compar
    (if (ic == ncmp)
     (ncmp @0 @1)))))
 
+/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
+   ??? The transformation is valid for the other operators if overflow
+   is undefined for the type, but performing it here badly interacts
+   with the transformation in fold_cond_expr_with_comparison which
+   attempts to synthetize ABS_EXPR.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (minus @0 @1) integer_zerop)
+  (cmp @0 @1)))
+
+/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
+   signed arithmetic case.  That form is created by the compiler
+   often enough for folding it to be of value.  One example is in
+   computing loop trip counts after Operator Strength Reduction.  */
+(for cmp (tcc_comparison)
+     scmp (swapped_tcc_comparison)
+ (simplify
+  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
+  /* Handle unfolded multiplication by zero.  */
+  (if (integer_zerop (@1))
+   (cmp @1 @2))
+  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+   /* If @1 is negative we swap the sense of the comparison.  */
+   (if (tree_int_cst_sgn (@1) < 0)
+    (scmp @0 @2))
+   (cmp @0 @2))))
+ 
+/* Simplify comparison of something with itself.  For IEEE
+   floating-point, we can only do some of these simplifications.  */
+(simplify
+ (eq @0 @0)
+ (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
+      || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
+  { constant_boolean_node (true, type); }))
+(for cmp (ge le)
+ (simplify
+  (cmp @0 @0)
+  (eq @0 @0)))
+(for cmp (ne gt lt)
+ (simplify
+  (cmp @0 @0)
+  (if (cmp != NE_EXPR
+       || ! FLOAT_TYPE_P (TREE_TYPE (@0))
+       || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
+   { constant_boolean_node (false, type); })))
+
+/* Fold ~X op ~Y as Y op X.  */
+(for cmp (tcc_comparison)
+ (simplify
+  (cmp (bit_not @0) (bit_not @1))
+  (cmp @1 @0)))
+
+/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
+(for cmp (tcc_comparison)
+     scmp (swapped_tcc_comparison)
+ (simplify
+  (cmp (bit_not @0) CONSTANT_CLASS_P@1)
+  (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
+   (scmp @0 (bit_not @1)))))
+
+
 /* Unordered tests if either argument is a NaN.  */
 (simplify
  (bit_ior (unordered @0 @0) (unordered @1 @1))