diff mbox

[tree-ssa-reassoc.c] : Better reassoication for comparision and boolean-logic

Message ID BANLkTik+c=A4xqqMTXphcP7_b1KTptw71w@mail.gmail.com
State New
Headers show

Commit Message

Kai Tietz May 19, 2011, 12:48 p.m. UTC
Hello,

This patch improves reassociation folding for comparision. It expands
expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
0 && Y == 0)
and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
better reassociation
on weak pre-folded logical expressions.  This unfolding gets undone
anyway later by pass,
so no disadvantage gets introduced.
Also while going through BB-list, it tries to do some little
type-sinking for SSA sequences
like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
bool1 & bool2; D2 = (type) D1;'.
This folding has the advantage to see better through intermediate
results with none-boolean type.
The function eliminate_redundant_comparison () got reworded so, that
doesn't break in all cases.
It now continues to find duplicates and tries to find inverse variant
(folded to constant). By this
change we don't combine possible weak optimizations too fast, before
we can find and handle
inverse or duplicates.

ChangeLog

	* tree-ssa-reassoc.c (is_boolean_compatible_type_p): New
	helper.
	(is_ior_ne_eq_cmp): Likewise.
	(build_inner_expand_ne_eq_cmp): Likewise.
	(build_expand_ne_eq_cmp): Likewise.
	(sink_cast_and_expand): Likewise.
	(build_and_add_sum): Add forward declaration.
	(remove_visited_stmt_chain): Likewise.
	(build_and_add_sum): Allow generation of unary expression.
	(eliminate_redundant_comparison): Rework this routine.
	(break_up_subtract_bb): Rename to break_up_operation_bb.
	Additional do logical unfolding for optimization and
	primitive type-sinking for type-casts on boolean-kind
	(do_reassoc): Rename break_up_subtract_bb to break_up_operation_bb.

ChangeLog

	* gcc.dg/binop-tand1.c
	* gcc.dg/binop-tand2.c
	* gcc.dg/binop-tand3.c
	* gcc.dg/binop-tand4.c
	* gcc.dg/binop-tor1.c
	* gcc.dg/binop-tor2.c
	* gcc.dg/binop-tor3.c
	* gcc.dg/binop-tor4.c
	* gcc.dg/binop-tor5.c

Bootstrapped and tested for x86_64-pc-linux-gnu for all standard
languages plus ADA and Obj-C++.
Ok for apply?

Regards,
Kai

Comments

Jakub Jelinek May 19, 2011, 12:52 p.m. UTC | #1
On Thu, May 19, 2011 at 02:48:08PM +0200, Kai Tietz wrote:
> ChangeLog
> 
> 	* gcc.dg/binop-tand1.c
> 	* gcc.dg/binop-tand2.c
> 	* gcc.dg/binop-tand3.c
> 	* gcc.dg/binop-tand4.c
> 	* gcc.dg/binop-tor1.c
> 	* gcc.dg/binop-tor2.c
> 	* gcc.dg/binop-tor3.c
> 	* gcc.dg/binop-tor4.c
> 	* gcc.dg/binop-tor5.c

The testsuite ChangeLog entry is wrong, should have
: New test.
at the end of each line...

	Jakub
Kai Tietz May 19, 2011, 12:55 p.m. UTC | #2
2011/5/19 Jakub Jelinek <jakub@redhat.com>:
> On Thu, May 19, 2011 at 02:48:08PM +0200, Kai Tietz wrote:
>> ChangeLog
>>
>>       * gcc.dg/binop-tand1.c
>>       * gcc.dg/binop-tand2.c
>>       * gcc.dg/binop-tand3.c
>>       * gcc.dg/binop-tand4.c
>>       * gcc.dg/binop-tor1.c
>>       * gcc.dg/binop-tor2.c
>>       * gcc.dg/binop-tor3.c
>>       * gcc.dg/binop-tor4.c
>>       * gcc.dg/binop-tor5.c
>
> The testsuite ChangeLog entry is wrong, should have
> : New test.
> at the end of each line...
>
>        Jakub

Ups missed, this was a paste & cut issue. Correct it in my safe.

Thanks,
Kai
Richard Biener May 19, 2011, 12:56 p.m. UTC | #3
On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> Hello,
>
> This patch improves reassociation folding for comparision. It expands
> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
> 0 && Y == 0)
> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
> better reassociation
> on weak pre-folded logical expressions.  This unfolding gets undone
> anyway later by pass,
> so no disadvantage gets introduced.
> Also while going through BB-list, it tries to do some little
> type-sinking for SSA sequences
> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
> bool1 & bool2; D2 = (type) D1;'.
> This folding has the advantage to see better through intermediate
> results with none-boolean type.
> The function eliminate_redundant_comparison () got reworded so, that
> doesn't break in all cases.
> It now continues to find duplicates and tries to find inverse variant
> (folded to constant). By this
> change we don't combine possible weak optimizations too fast, before
> we can find and handle
> inverse or duplicates.

sinking casting belongs not here but instead to tree-ssa-forwprop.
I'm not sure that a != 0 | b != 0 is the better canonical variant than
a | b != 0 though.

is_boolean_compatible_type_p looks like a strange remanent.

Richard.

> ChangeLog
>
>        * tree-ssa-reassoc.c (ii: New
>        helper.
>        (is_ior_ne_eq_cmp): Likewise.
>        (build_inner_expand_ne_eq_cmp): Likewise.
>        (build_expand_ne_eq_cmp): Likewise.
>        (sink_cast_and_expand): Likewise.
>        (build_and_add_sum): Add forward declaration.
>        (remove_visited_stmt_chain): Likewise.
>        (build_and_add_sum): Allow generation of unary expression.
>        (eliminate_redundant_comparison): Rework this routine.
>        (break_up_subtract_bb): Rename to break_up_operation_bb.
>        Additional do logical unfolding for optimization and
>        primitive type-sinking for type-casts on boolean-kind
>        (do_reassoc): Rename break_up_subtract_bb to break_up_operation_bb.
>
> ChangeLog
>
>        * gcc.dg/binop-tand1.c
>        * gcc.dg/binop-tand2.c
>        * gcc.dg/binop-tand3.c
>        * gcc.dg/binop-tand4.c
>        * gcc.dg/binop-tor1.c
>        * gcc.dg/binop-tor2.c
>        * gcc.dg/binop-tor3.c
>        * gcc.dg/binop-tor4.c
>        * gcc.dg/binop-tor5.c
>
> Bootstrapped and tested for x86_64-pc-linux-gnu for all standard
> languages plus ADA and Obj-C++.
> Ok for apply?
>
> Regards,
> Kai
>
Kai Tietz May 19, 2011, 12:59 p.m. UTC | #4
2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> Hello,
>>
>> This patch improves reassociation folding for comparision. It expands
>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>> 0 && Y == 0)
>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>> better reassociation
>> on weak pre-folded logical expressions.  This unfolding gets undone
>> anyway later by pass,
>> so no disadvantage gets introduced.
>> Also while going through BB-list, it tries to do some little
>> type-sinking for SSA sequences
>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>> bool1 & bool2; D2 = (type) D1;'.
>> This folding has the advantage to see better through intermediate
>> results with none-boolean type.
>> The function eliminate_redundant_comparison () got reworded so, that
>> doesn't break in all cases.
>> It now continues to find duplicates and tries to find inverse variant
>> (folded to constant). By this
>> change we don't combine possible weak optimizations too fast, before
>> we can find and handle
>> inverse or duplicates.
>
> sinking casting belongs not here but instead to tree-ssa-forwprop.
> I'm not sure that a != 0 | b != 0 is the better canonical variant than
> a | b != 0 though.
>
> is_boolean_compatible_type_p looks like a strange remanent.
>
> Richard.

Well, a | b != 0 is for sure more optimal, but for reassociation we
need to see the unfolded variant temporary. This is necessary as
fold-const can't see through SSA statements.  But this kind of
expansion should be reversed then by pass to the form (a | b) != 0
back.

Regards,
Kai
Richard Biener May 19, 2011, 1 p.m. UTC | #5
On Thu, May 19, 2011 at 2:56 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> Hello,
>>
>> This patch improves reassociation folding for comparision. It expands
>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>> 0 && Y == 0)
>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>> better reassociation
>> on weak pre-folded logical expressions.  This unfolding gets undone
>> anyway later by pass,
>> so no disadvantage gets introduced.
>> Also while going through BB-list, it tries to do some little
>> type-sinking for SSA sequences
>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>> bool1 & bool2; D2 = (type) D1;'.
>> This folding has the advantage to see better through intermediate
>> results with none-boolean type.
>> The function eliminate_redundant_comparison () got reworded so, that
>> doesn't break in all cases.
>> It now continues to find duplicates and tries to find inverse variant
>> (folded to constant). By this
>> change we don't combine possible weak optimizations too fast, before
>> we can find and handle
>> inverse or duplicates.
>
> sinking casting belongs not here but instead to tree-ssa-forwprop.
> I'm not sure that a != 0 | b != 0 is the better canonical variant than
> a | b != 0 though.

For example w/o your patch I see on the first testcase:

<bb 2>:
  D.2689_3 = a_2(D) != 0;
  D.2690_5 = b_4(D) != 0;
  D.2691_6 = D.2690_5 & D.2689_3;
  if (D.2691_6 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  D.2693_7 = b_4(D) | a_2(D);
  D.2694_8 = D.2693_7 == 0;
  D.2695_10 = c_9(D) != 0;
  D.2696_11 = D.2695_10 & D.2694_8;
  D.2685_16 = (int) D.2696_11;

canonicalizing to the latter (which is also smaller) would be

<bb 2>:
  D.2691_5 = a_2(D) & b_4(D);
  D.2691_6 = D.2691_5 != 0;
  if (D.2691_6 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  D.2693_7 = b_4(D) | a_2(D);
  D.2694_8 = ~D.2693_7;
  D.2696_11 = c_9(D) & D.2694_8;
  D.2696_11 = D.2696_11 != 0;
  D.2685_16 = (int) D.2696_11;

that looks re-associatable as good as the other.

Richard.
Richard Biener May 19, 2011, 1:01 p.m. UTC | #6
On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> Hello,
>>>
>>> This patch improves reassociation folding for comparision. It expands
>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>> 0 && Y == 0)
>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>> better reassociation
>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>> anyway later by pass,
>>> so no disadvantage gets introduced.
>>> Also while going through BB-list, it tries to do some little
>>> type-sinking for SSA sequences
>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>> bool1 & bool2; D2 = (type) D1;'.
>>> This folding has the advantage to see better through intermediate
>>> results with none-boolean type.
>>> The function eliminate_redundant_comparison () got reworded so, that
>>> doesn't break in all cases.
>>> It now continues to find duplicates and tries to find inverse variant
>>> (folded to constant). By this
>>> change we don't combine possible weak optimizations too fast, before
>>> we can find and handle
>>> inverse or duplicates.
>>
>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>> a | b != 0 though.
>>
>> is_boolean_compatible_type_p looks like a strange remanent.
>>
>> Richard.
>
> Well, a | b != 0 is for sure more optimal, but for reassociation we
> need to see the unfolded variant temporary. This is necessary as
> fold-const can't see through SSA statements.  But this kind of
> expansion should be reversed then by pass to the form (a | b) != 0
> back.

?  fold-const shouldn't deal with this at all as we are in gimple and in
SSA form.  Surely re-association comes to play only with chains of
the above with more than two operands.

Richard.

>
> Regards,
> Kai
>
Kai Tietz May 19, 2011, 1:08 p.m. UTC | #7
2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>> Hello,
>>>>
>>>> This patch improves reassociation folding for comparision. It expands
>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>> 0 && Y == 0)
>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>> better reassociation
>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>> anyway later by pass,
>>>> so no disadvantage gets introduced.
>>>> Also while going through BB-list, it tries to do some little
>>>> type-sinking for SSA sequences
>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>> bool1 & bool2; D2 = (type) D1;'.
>>>> This folding has the advantage to see better through intermediate
>>>> results with none-boolean type.
>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>> doesn't break in all cases.
>>>> It now continues to find duplicates and tries to find inverse variant
>>>> (folded to constant). By this
>>>> change we don't combine possible weak optimizations too fast, before
>>>> we can find and handle
>>>> inverse or duplicates.
>>>
>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>> a | b != 0 though.
>>>
>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>
>>> Richard.
>>
>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>> need to see the unfolded variant temporary. This is necessary as
>> fold-const can't see through SSA statements.  But this kind of
>> expansion should be reversed then by pass to the form (a | b) != 0
>> back.
>
> ?  fold-const shouldn't deal with this at all as we are in gimple and in
> SSA form.  Surely re-association comes to play only with chains of
> the above with more than two operands.
>
> Richard.
>
>>
>> Regards,
>> Kai
>>
>

The issue you can see by testcase binop_tor4.c, as here are the
intermediate variables d and e (with int type) are destroying the
reassociation pass. This testcase for example needs this sinking.

Kai
Richard Biener May 19, 2011, 1:11 p.m. UTC | #8
On Thu, May 19, 2011 at 3:08 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>> Hello,
>>>>>
>>>>> This patch improves reassociation folding for comparision. It expands
>>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>>> 0 && Y == 0)
>>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>>> better reassociation
>>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>>> anyway later by pass,
>>>>> so no disadvantage gets introduced.
>>>>> Also while going through BB-list, it tries to do some little
>>>>> type-sinking for SSA sequences
>>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>>> bool1 & bool2; D2 = (type) D1;'.
>>>>> This folding has the advantage to see better through intermediate
>>>>> results with none-boolean type.
>>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>>> doesn't break in all cases.
>>>>> It now continues to find duplicates and tries to find inverse variant
>>>>> (folded to constant). By this
>>>>> change we don't combine possible weak optimizations too fast, before
>>>>> we can find and handle
>>>>> inverse or duplicates.
>>>>
>>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>>> a | b != 0 though.
>>>>
>>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>>
>>>> Richard.
>>>
>>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>>> need to see the unfolded variant temporary. This is necessary as
>>> fold-const can't see through SSA statements.  But this kind of
>>> expansion should be reversed then by pass to the form (a | b) != 0
>>> back.
>>
>> ?  fold-const shouldn't deal with this at all as we are in gimple and in
>> SSA form.  Surely re-association comes to play only with chains of
>> the above with more than two operands.
>>
>> Richard.
>>
>>>
>>> Regards,
>>> Kai
>>>
>>
>
> The issue you can see by testcase binop_tor4.c, as here are the
> intermediate variables d and e (with int type) are destroying the
> reassociation pass. This testcase for example needs this sinking.

hoisting would work equally well

> Kai
>
Kai Tietz May 19, 2011, 1:30 p.m. UTC | #9
2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, May 19, 2011 at 3:08 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> This patch improves reassociation folding for comparision. It expands
>>>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>>>> 0 && Y == 0)
>>>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>>>> better reassociation
>>>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>>>> anyway later by pass,
>>>>>> so no disadvantage gets introduced.
>>>>>> Also while going through BB-list, it tries to do some little
>>>>>> type-sinking for SSA sequences
>>>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>>>> bool1 & bool2; D2 = (type) D1;'.
>>>>>> This folding has the advantage to see better through intermediate
>>>>>> results with none-boolean type.
>>>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>>>> doesn't break in all cases.
>>>>>> It now continues to find duplicates and tries to find inverse variant
>>>>>> (folded to constant). By this
>>>>>> change we don't combine possible weak optimizations too fast, before
>>>>>> we can find and handle
>>>>>> inverse or duplicates.
>>>>>
>>>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>>>> a | b != 0 though.
>>>>>
>>>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>>>
>>>>> Richard.
>>>>
>>>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>>>> need to see the unfolded variant temporary. This is necessary as
>>>> fold-const can't see through SSA statements.  But this kind of
>>>> expansion should be reversed then by pass to the form (a | b) != 0
>>>> back.
>>>
>>> ?  fold-const shouldn't deal with this at all as we are in gimple and in
>>> SSA form.  Surely re-association comes to play only with chains of
>>> the above with more than two operands.
>>>
>>> Richard.
>>>
>>>>
>>>> Regards,
>>>> Kai
>>>>
>>>
>>
>> The issue you can see by testcase binop_tor4.c, as here are the
>> intermediate variables d and e (with int type) are destroying the
>> reassociation pass. This testcase for example needs this sinking.
>
> hoisting would work equally well

Well, but just if then all operands in combined BIT_AND/OR block are
getting hoisting. And well, there might be still some cases where we
wouldn't find the equivalent. As hoisting leads to following
sequences, eg:

D1 = a != 0;
D2 = b != 0;
D3 = a == 0;
D4 = b == 0;
D5 = (long) D1
D6 = (long) D2
D7 = (long) D3
D8 = (long) D4
D9 = D5 & D6;
D10 = D8 & D9
D11 = D9 & D10;

which means that comparision folding will never will happen as the
statement passed to fold algorithm is a cast to a comparison and not
the comparison itself.  So sinking looks more sane IMHO.

Regards,
Kai
Richard Biener May 19, 2011, 1:32 p.m. UTC | #10
On Thu, May 19, 2011 at 3:30 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>> On Thu, May 19, 2011 at 3:08 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> This patch improves reassociation folding for comparision. It expands
>>>>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>>>>> 0 && Y == 0)
>>>>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>>>>> better reassociation
>>>>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>>>>> anyway later by pass,
>>>>>>> so no disadvantage gets introduced.
>>>>>>> Also while going through BB-list, it tries to do some little
>>>>>>> type-sinking for SSA sequences
>>>>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>>>>> bool1 & bool2; D2 = (type) D1;'.
>>>>>>> This folding has the advantage to see better through intermediate
>>>>>>> results with none-boolean type.
>>>>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>>>>> doesn't break in all cases.
>>>>>>> It now continues to find duplicates and tries to find inverse variant
>>>>>>> (folded to constant). By this
>>>>>>> change we don't combine possible weak optimizations too fast, before
>>>>>>> we can find and handle
>>>>>>> inverse or duplicates.
>>>>>>
>>>>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>>>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>>>>> a | b != 0 though.
>>>>>>
>>>>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>>>>
>>>>>> Richard.
>>>>>
>>>>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>>>>> need to see the unfolded variant temporary. This is necessary as
>>>>> fold-const can't see through SSA statements.  But this kind of
>>>>> expansion should be reversed then by pass to the form (a | b) != 0
>>>>> back.
>>>>
>>>> ?  fold-const shouldn't deal with this at all as we are in gimple and in
>>>> SSA form.  Surely re-association comes to play only with chains of
>>>> the above with more than two operands.
>>>>
>>>> Richard.
>>>>
>>>>>
>>>>> Regards,
>>>>> Kai
>>>>>
>>>>
>>>
>>> The issue you can see by testcase binop_tor4.c, as here are the
>>> intermediate variables d and e (with int type) are destroying the
>>> reassociation pass. This testcase for example needs this sinking.
>>
>> hoisting would work equally well
>
> Well, but just if then all operands in combined BIT_AND/OR block are
> getting hoisting. And well, there might be still some cases where we
> wouldn't find the equivalent. As hoisting leads to following
> sequences, eg:
>
> D1 = a != 0;
> D2 = b != 0;
> D3 = a == 0;
> D4 = b == 0;
> D5 = (long) D1
> D6 = (long) D2
> D7 = (long) D3
> D8 = (long) D4
> D9 = D5 & D6;
> D10 = D8 & D9
> D11 = D9 & D10;
>
> which means that comparision folding will never will happen as the
> statement passed to fold algorithm is a cast to a comparison and not
> the comparison itself.  So sinking looks more sane IMHO.

The above is what you do.

>
> Regards,
> Kai
>
Kai Tietz May 19, 2011, 1:36 p.m. UTC | #11
2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, May 19, 2011 at 3:30 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>> On Thu, May 19, 2011 at 3:08 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> This patch improves reassociation folding for comparision. It expands
>>>>>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>>>>>> 0 && Y == 0)
>>>>>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>>>>>> better reassociation
>>>>>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>>>>>> anyway later by pass,
>>>>>>>> so no disadvantage gets introduced.
>>>>>>>> Also while going through BB-list, it tries to do some little
>>>>>>>> type-sinking for SSA sequences
>>>>>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>>>>>> bool1 & bool2; D2 = (type) D1;'.
>>>>>>>> This folding has the advantage to see better through intermediate
>>>>>>>> results with none-boolean type.
>>>>>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>>>>>> doesn't break in all cases.
>>>>>>>> It now continues to find duplicates and tries to find inverse variant
>>>>>>>> (folded to constant). By this
>>>>>>>> change we don't combine possible weak optimizations too fast, before
>>>>>>>> we can find and handle
>>>>>>>> inverse or duplicates.
>>>>>>>
>>>>>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>>>>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>>>>>> a | b != 0 though.
>>>>>>>
>>>>>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>>>>>
>>>>>>> Richard.
>>>>>>
>>>>>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>>>>>> need to see the unfolded variant temporary. This is necessary as
>>>>>> fold-const can't see through SSA statements.  But this kind of
>>>>>> expansion should be reversed then by pass to the form (a | b) != 0
>>>>>> back.
>>>>>
>>>>> ?  fold-const shouldn't deal with this at all as we are in gimple and in
>>>>> SSA form.  Surely re-association comes to play only with chains of
>>>>> the above with more than two operands.
>>>>>
>>>>> Richard.
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Kai
>>>>>>
>>>>>
>>>>
>>>> The issue you can see by testcase binop_tor4.c, as here are the
>>>> intermediate variables d and e (with int type) are destroying the
>>>> reassociation pass. This testcase for example needs this sinking.
>>>
>>> hoisting would work equally well
>>
>> Well, but just if then all operands in combined BIT_AND/OR block are
>> getting hoisting. And well, there might be still some cases where we
>> wouldn't find the equivalent. As hoisting leads to following
>> sequences, eg:
>>
>> D1 = a != 0;
>> D2 = b != 0;
>> D3 = a == 0;
>> D4 = b == 0;
>> D5 = (long) D1
>> D6 = (long) D2
>> D7 = (long) D3
>> D8 = (long) D4
>> D9 = D5 & D6;
>> D10 = D8 & D9
>> D11 = D9 & D10;
>>
>> which means that comparision folding will never will happen as the
>> statement passed to fold algorithm is a cast to a comparison and not
>> the comparison itself.  So sinking looks more sane IMHO.
>
> The above is what you do.

No, I don't do this. Please see function sink_cast_and_expand function in patch.

  if (gimple_assign_cast_p (s1)
      && gimple_assign_cast_p (s2)
      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s1)))
      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s2)))
      && useless_type_conversion_p (TREE_TYPE (gimple_assign_rhs1 (s1)),
      				    TREE_TYPE (gimple_assign_rhs1 (s2))))
    {
      gimple_stmt_iterator gsi;
      gimple sum;
      tree op1a, op1b, tmpvar;

      tmpvar = create_tmp_reg (TREE_TYPE (gimple_assign_rhs1 (s1)), NULL);

      add_referenced_var (tmpvar);
      sum = build_and_add_sum (tmpvar, gimple_assign_rhs1 (s1),
      			       gimple_assign_rhs1 (s2), code);
      op1 = gimple_get_lhs (sum);
      op1 = fold_convert (type, op1);

      op1a = gimple_assign_rhs1 (stmt);
      op1b = gimple_assign_rhs2 (stmt);
      gsi = gsi_for_stmt (stmt);
      gimple_assign_set_rhs_from_tree (&gsi, op1);
      update_stmt (stmt);
      remove_visited_stmt_chain (op1a);
      remove_visited_stmt_chain (op1b);
      ret = true;
    }

The none-boolean cast get moved outer, not inner.

Regards,
Kai
Richard Biener May 19, 2011, 1:38 p.m. UTC | #12
On Thu, May 19, 2011 at 3:36 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>> On Thu, May 19, 2011 at 3:30 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>> On Thu, May 19, 2011 at 3:08 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>>> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>>>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> This patch improves reassociation folding for comparision. It expands
>>>>>>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>>>>>>> 0 && Y == 0)
>>>>>>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>>>>>>> better reassociation
>>>>>>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>>>>>>> anyway later by pass,
>>>>>>>>> so no disadvantage gets introduced.
>>>>>>>>> Also while going through BB-list, it tries to do some little
>>>>>>>>> type-sinking for SSA sequences
>>>>>>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>>>>>>> bool1 & bool2; D2 = (type) D1;'.
>>>>>>>>> This folding has the advantage to see better through intermediate
>>>>>>>>> results with none-boolean type.
>>>>>>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>>>>>>> doesn't break in all cases.
>>>>>>>>> It now continues to find duplicates and tries to find inverse variant
>>>>>>>>> (folded to constant). By this
>>>>>>>>> change we don't combine possible weak optimizations too fast, before
>>>>>>>>> we can find and handle
>>>>>>>>> inverse or duplicates.
>>>>>>>>
>>>>>>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>>>>>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>>>>>>> a | b != 0 though.
>>>>>>>>
>>>>>>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>
>>>>>>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>>>>>>> need to see the unfolded variant temporary. This is necessary as
>>>>>>> fold-const can't see through SSA statements.  But this kind of
>>>>>>> expansion should be reversed then by pass to the form (a | b) != 0
>>>>>>> back.
>>>>>>
>>>>>> ?  fold-const shouldn't deal with this at all as we are in gimple and in
>>>>>> SSA form.  Surely re-association comes to play only with chains of
>>>>>> the above with more than two operands.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Kai
>>>>>>>
>>>>>>
>>>>>
>>>>> The issue you can see by testcase binop_tor4.c, as here are the
>>>>> intermediate variables d and e (with int type) are destroying the
>>>>> reassociation pass. This testcase for example needs this sinking.
>>>>
>>>> hoisting would work equally well
>>>
>>> Well, but just if then all operands in combined BIT_AND/OR block are
>>> getting hoisting. And well, there might be still some cases where we
>>> wouldn't find the equivalent. As hoisting leads to following
>>> sequences, eg:
>>>
>>> D1 = a != 0;
>>> D2 = b != 0;
>>> D3 = a == 0;
>>> D4 = b == 0;
>>> D5 = (long) D1
>>> D6 = (long) D2
>>> D7 = (long) D3
>>> D8 = (long) D4
>>> D9 = D5 & D6;
>>> D10 = D8 & D9
>>> D11 = D9 & D10;
>>>
>>> which means that comparision folding will never will happen as the
>>> statement passed to fold algorithm is a cast to a comparison and not
>>> the comparison itself.  So sinking looks more sane IMHO.
>>
>> The above is what you do.
>
> No, I don't do this. Please see function sink_cast_and_expand function in patch.
>
>  if (gimple_assign_cast_p (s1)
>      && gimple_assign_cast_p (s2)
>      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s1)))
>      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s2)))
>      && useless_type_conversion_p (TREE_TYPE (gimple_assign_rhs1 (s1)),
>                                    TREE_TYPE (gimple_assign_rhs1 (s2))))
>    {
>      gimple_stmt_iterator gsi;
>      gimple sum;
>      tree op1a, op1b, tmpvar;
>
>      tmpvar = create_tmp_reg (TREE_TYPE (gimple_assign_rhs1 (s1)), NULL);
>
>      add_referenced_var (tmpvar);
>      sum = build_and_add_sum (tmpvar, gimple_assign_rhs1 (s1),
>                               gimple_assign_rhs1 (s2), code);
>      op1 = gimple_get_lhs (sum);
>      op1 = fold_convert (type, op1);
>
>      op1a = gimple_assign_rhs1 (stmt);
>      op1b = gimple_assign_rhs2 (stmt);
>      gsi = gsi_for_stmt (stmt);
>      gimple_assign_set_rhs_from_tree (&gsi, op1);
>      update_stmt (stmt);
>      remove_visited_stmt_chain (op1a);
>      remove_visited_stmt_chain (op1b);
>      ret = true;
>    }
>
> The none-boolean cast get moved outer, not inner.

You said  (X | Y) != 0 to (X != 0 || Y != 0).  I would simply move everything
down, including the cast (that's already done by tree-ssa-forwprop.c).

Richard.

> Regards,
> Kai
>
Kai Tietz May 19, 2011, 1:51 p.m. UTC | #13
2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, May 19, 2011 at 3:36 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>> On Thu, May 19, 2011 at 3:30 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>> On Thu, May 19, 2011 at 3:08 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>>>> On Thu, May 19, 2011 at 2:59 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>>>> 2011/5/19 Richard Guenther <richard.guenther@gmail.com>:
>>>>>>>>> On Thu, May 19, 2011 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> This patch improves reassociation folding for comparision. It expands
>>>>>>>>>> expressions within binary-AND/OR expression like (X | Y) == 0 to (X ==
>>>>>>>>>> 0 && Y == 0)
>>>>>>>>>> and (X | Y) != 0 to (X != 0 || Y != 0).  This is necessary to allow
>>>>>>>>>> better reassociation
>>>>>>>>>> on weak pre-folded logical expressions.  This unfolding gets undone
>>>>>>>>>> anyway later by pass,
>>>>>>>>>> so no disadvantage gets introduced.
>>>>>>>>>> Also while going through BB-list, it tries to do some little
>>>>>>>>>> type-sinking for SSA sequences
>>>>>>>>>> like "D1 = (type) bool1; D2 = (type) bool2; D3 = D1 & D2;' to 'D1 =
>>>>>>>>>> bool1 & bool2; D2 = (type) D1;'.
>>>>>>>>>> This folding has the advantage to see better through intermediate
>>>>>>>>>> results with none-boolean type.
>>>>>>>>>> The function eliminate_redundant_comparison () got reworded so, that
>>>>>>>>>> doesn't break in all cases.
>>>>>>>>>> It now continues to find duplicates and tries to find inverse variant
>>>>>>>>>> (folded to constant). By this
>>>>>>>>>> change we don't combine possible weak optimizations too fast, before
>>>>>>>>>> we can find and handle
>>>>>>>>>> inverse or duplicates.
>>>>>>>>>
>>>>>>>>> sinking casting belongs not here but instead to tree-ssa-forwprop.
>>>>>>>>> I'm not sure that a != 0 | b != 0 is the better canonical variant than
>>>>>>>>> a | b != 0 though.
>>>>>>>>>
>>>>>>>>> is_boolean_compatible_type_p looks like a strange remanent.
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>
>>>>>>>> Well, a | b != 0 is for sure more optimal, but for reassociation we
>>>>>>>> need to see the unfolded variant temporary. This is necessary as
>>>>>>>> fold-const can't see through SSA statements.  But this kind of
>>>>>>>> expansion should be reversed then by pass to the form (a | b) != 0
>>>>>>>> back.
>>>>>>>
>>>>>>> ?  fold-const shouldn't deal with this at all as we are in gimple and in
>>>>>>> SSA form.  Surely re-association comes to play only with chains of
>>>>>>> the above with more than two operands.
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Kai
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> The issue you can see by testcase binop_tor4.c, as here are the
>>>>>> intermediate variables d and e (with int type) are destroying the
>>>>>> reassociation pass. This testcase for example needs this sinking.
>>>>>
>>>>> hoisting would work equally well
>>>>
>>>> Well, but just if then all operands in combined BIT_AND/OR block are
>>>> getting hoisting. And well, there might be still some cases where we
>>>> wouldn't find the equivalent. As hoisting leads to following
>>>> sequences, eg:
>>>>
>>>> D1 = a != 0;
>>>> D2 = b != 0;
>>>> D3 = a == 0;
>>>> D4 = b == 0;
>>>> D5 = (long) D1
>>>> D6 = (long) D2
>>>> D7 = (long) D3
>>>> D8 = (long) D4
>>>> D9 = D5 & D6;
>>>> D10 = D8 & D9
>>>> D11 = D9 & D10;
>>>>
>>>> which means that comparision folding will never will happen as the
>>>> statement passed to fold algorithm is a cast to a comparison and not
>>>> the comparison itself.  So sinking looks more sane IMHO.
>>>
>>> The above is what you do.
>>
>> No, I don't do this. Please see function sink_cast_and_expand function in patch.
>>
>>  if (gimple_assign_cast_p (s1)
>>      && gimple_assign_cast_p (s2)
>>      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s1)))
>>      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s2)))
>>      && useless_type_conversion_p (TREE_TYPE (gimple_assign_rhs1 (s1)),
>>                                    TREE_TYPE (gimple_assign_rhs1 (s2))))
>>    {
>>      gimple_stmt_iterator gsi;
>>      gimple sum;
>>      tree op1a, op1b, tmpvar;
>>
>>      tmpvar = create_tmp_reg (TREE_TYPE (gimple_assign_rhs1 (s1)), NULL);
>>
>>      add_referenced_var (tmpvar);
>>      sum = build_and_add_sum (tmpvar, gimple_assign_rhs1 (s1),
>>                               gimple_assign_rhs1 (s2), code);
>>      op1 = gimple_get_lhs (sum);
>>      op1 = fold_convert (type, op1);
>>
>>      op1a = gimple_assign_rhs1 (stmt);
>>      op1b = gimple_assign_rhs2 (stmt);
>>      gsi = gsi_for_stmt (stmt);
>>      gimple_assign_set_rhs_from_tree (&gsi, op1);
>>      update_stmt (stmt);
>>      remove_visited_stmt_chain (op1a);
>>      remove_visited_stmt_chain (op1b);
>>      ret = true;
>>    }
>>
>> The none-boolean cast get moved outer, not inner.
>
> You said  (X | Y) != 0 to (X != 0 || Y != 0).  I would simply move everything
> down, including the cast (that's already done by tree-ssa-forwprop.c).

Yes, here I do also already sinking. See build_expand_ne_eq_cmp
function (which is called by sink_cast_and_expand () - tree is here
unrolled recursively.

  if (is_boolean_compatible_type_p (type))
    op1 = fold_build2 ((code == NE_EXPR ? BIT_IOR_EXPR : BIT_AND_EXPR),
		       type, op1a, op1b);
  else
    {
      gimple sum;
      tree tmpvar = create_tmp_reg (boolean_type_node, NULL);
      add_referenced_var (tmpvar);
      sum = build_and_add_sum (tmpvar, op1a, op1b,
			       (code == NE_EXPR ? BIT_IOR_EXPR
						: BIT_AND_EXPR));
      op1 = gimple_get_lhs (sum);
      op1 = fold_convert (type, op1);
    }

For internal expansions of (A | B | C) != to the series of A!=0 & B !=
0 & C != 0 will done only on boolean_type_node type. There is no
recast. Just on final expansion it gets recasted.

So yes, the way is to move everything down as far as possible. I don't
see your point here that I wouldn't "simply move everything down" by
this patch.  If I wouldn't the testcases would fail.

Regards,
Kai
Kai Tietz May 19, 2011, 5:52 p.m. UTC | #14
To illustrate in which scenario code in tree-ssa-forwprop doesn't help
is binop-tor4.c

w/o this patch we get


foo (int a, int b, int c)
{
  int e;
  int d;
  int D.2701;
  _Bool D.2700;
  _Bool D.2699;
  _Bool D.2698;
  _Bool D.2697;
  _Bool D.2696;
  int D.2695;

<bb 2>:
  D.2695_3 = b_2(D) | a_1(D);
  d_4 = D.2695_3 != 0;
  D.2696_5 = a_1(D) == 0;
  D.2697_6 = b_2(D) == 0;
  D.2698_7 = D.2697_6 | D.2696_5;
  D.2699_9 = c_8(D) != 0;
  D.2700_10 = D.2698_7 | D.2699_9;
  e_11 = (int) D.2700_10;
  D.2701_12 = e_11 | d_4;
  return D.2701_12;
}

Of interest is here  D.2701_12, which doesn't have a type sinking.
This is caused by

 D.2695_3 = b_2(D) | a_1(D);
 d_4 = D.2695_3 != 0;

which is a comparison result with implicit integer cast. So maybe the
solution here could be to first doing boolification of comparison in
gimplifier. By this, the code for type-sinking in my patch could go
away.

Regards,
Kai
Eric Botcazou May 19, 2011, 8:17 p.m. UTC | #15
> Bootstrapped and tested for x86_64-pc-linux-gnu for all standard
> languages plus ADA and Obj-C++.

Ada, not ADA, this is the first name of the Countess of Lovelace:
  http://en.wikipedia.org/wiki/Ada_Lovelace
Kai Tietz May 19, 2011, 8:32 p.m. UTC | #16
2011/5/19 Eric Botcazou <ebotcazou@adacore.com>:
>> Bootstrapped and tested for x86_64-pc-linux-gnu for all standard
>> languages plus ADA and Obj-C++.
>
> Ada, not ADA, this is the first name of the Countess of Lovelace:
>  http://en.wikipedia.org/wiki/Ada_Lovelace

Well, I glad to hear that isn't drived by Russian word Ad (ад) ;)
Ok, I will write it next time as Ada.

Regards,
Kai
Richard Biener May 20, 2011, 9:17 a.m. UTC | #17
On Thu, May 19, 2011 at 7:52 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> To illustrate in which scenario code in tree-ssa-forwprop doesn't help
> is binop-tor4.c
>
> w/o this patch we get
>
>
> foo (int a, int b, int c)
> {
>  int e;
>  int d;
>  int D.2701;
>  _Bool D.2700;
>  _Bool D.2699;
>  _Bool D.2698;
>  _Bool D.2697;
>  _Bool D.2696;
>  int D.2695;
>
> <bb 2>:
>  D.2695_3 = b_2(D) | a_1(D);
>  d_4 = D.2695_3 != 0;
>  D.2696_5 = a_1(D) == 0;
>  D.2697_6 = b_2(D) == 0;
>  D.2698_7 = D.2697_6 | D.2696_5;
>  D.2699_9 = c_8(D) != 0;
>  D.2700_10 = D.2698_7 | D.2699_9;
>  e_11 = (int) D.2700_10;
>  D.2701_12 = e_11 | d_4;
>  return D.2701_12;
> }
>
> Of interest is here  D.2701_12, which doesn't have a type sinking.
> This is caused by
>
>  D.2695_3 = b_2(D) | a_1(D);
>  d_4 = D.2695_3 != 0;
>
> which is a comparison result with implicit integer cast. So maybe the
> solution here could be to first doing boolification of comparison in
> gimplifier. By this, the code for type-sinking in my patch could go
> away.

Well, forwprop either needs to be teached to handle this different kind
of widening

  d_4 = D.2687_3 != 0;
  e_11 = (int) D.2692_10;
  D.2694_12 = e_11 | d_4;

or indeed comparisons should also be boolified (which I think they
should - they are also predicate producers).

Still whether sinking or hoisting the stuff is the right thing, reassoc
is not the place to do it.

Richard.
Kai Tietz May 20, 2011, 7:21 p.m. UTC | #18
2011/5/20 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, May 19, 2011 at 7:52 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> To illustrate in which scenario code in tree-ssa-forwprop doesn't help
>> is binop-tor4.c
>>
>> w/o this patch we get
>>
>>
>> foo (int a, int b, int c)
>> {
>>  int e;
>>  int d;
>>  int D.2701;
>>  _Bool D.2700;
>>  _Bool D.2699;
>>  _Bool D.2698;
>>  _Bool D.2697;
>>  _Bool D.2696;
>>  int D.2695;
>>
>> <bb 2>:
>>  D.2695_3 = b_2(D) | a_1(D);
>>  d_4 = D.2695_3 != 0;
>>  D.2696_5 = a_1(D) == 0;
>>  D.2697_6 = b_2(D) == 0;
>>  D.2698_7 = D.2697_6 | D.2696_5;
>>  D.2699_9 = c_8(D) != 0;
>>  D.2700_10 = D.2698_7 | D.2699_9;
>>  e_11 = (int) D.2700_10;
>>  D.2701_12 = e_11 | d_4;
>>  return D.2701_12;
>> }
>>
>> Of interest is here  D.2701_12, which doesn't have a type sinking.
>> This is caused by
>>
>>  D.2695_3 = b_2(D) | a_1(D);
>>  d_4 = D.2695_3 != 0;
>>
>> which is a comparison result with implicit integer cast. So maybe the
>> solution here could be to first doing boolification of comparison in
>> gimplifier. By this, the code for type-sinking in my patch could go
>> away.
>
> Well, forwprop either needs to be teached to handle this different kind
> of widening
>
>  d_4 = D.2687_3 != 0;
>  e_11 = (int) D.2692_10;
>  D.2694_12 = e_11 | d_4;
>
> or indeed comparisons should also be boolified (which I think they
> should - they are also predicate producers).
>
> Still whether sinking or hoisting the stuff is the right thing, reassoc
> is not the place to do it.
>
> Richard.

So I tested code to do boolifying of comparison in gimplifier.  This
works so far nice when fold_convert doesn't hoist for boolean-types.
But in pass  forwprop (see here function forward_propagate_comparison)
does again type hoisting, which destroys of coures the boolified
comparisons and so later reassociation pass has again the issue about
finding matches.
To introduce (as you suggested) into tree-ssa-forwprop the type
sinking, therefore doesn't work.  As type hoisting is for sure the
better final result of an expression, but on expression folding
passes it has advantages to use type sinking instead.
So this might be a thing for a different pass, or in reassoc-pass
itself (as patch does) as here type-sinking helps to combine.  As
after reassociation again the forward-propagation happens, we have
still the better final expression variant as result.

So how to continue here?

Regards,
Kai
Richard Biener May 21, 2011, 9:20 a.m. UTC | #19
On Fri, May 20, 2011 at 9:21 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/5/20 Richard Guenther <richard.guenther@gmail.com>:
>> On Thu, May 19, 2011 at 7:52 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> To illustrate in which scenario code in tree-ssa-forwprop doesn't help
>>> is binop-tor4.c
>>>
>>> w/o this patch we get
>>>
>>>
>>> foo (int a, int b, int c)
>>> {
>>>  int e;
>>>  int d;
>>>  int D.2701;
>>>  _Bool D.2700;
>>>  _Bool D.2699;
>>>  _Bool D.2698;
>>>  _Bool D.2697;
>>>  _Bool D.2696;
>>>  int D.2695;
>>>
>>> <bb 2>:
>>>  D.2695_3 = b_2(D) | a_1(D);
>>>  d_4 = D.2695_3 != 0;
>>>  D.2696_5 = a_1(D) == 0;
>>>  D.2697_6 = b_2(D) == 0;
>>>  D.2698_7 = D.2697_6 | D.2696_5;
>>>  D.2699_9 = c_8(D) != 0;
>>>  D.2700_10 = D.2698_7 | D.2699_9;
>>>  e_11 = (int) D.2700_10;
>>>  D.2701_12 = e_11 | d_4;
>>>  return D.2701_12;
>>> }
>>>
>>> Of interest is here  D.2701_12, which doesn't have a type sinking.
>>> This is caused by
>>>
>>>  D.2695_3 = b_2(D) | a_1(D);
>>>  d_4 = D.2695_3 != 0;
>>>
>>> which is a comparison result with implicit integer cast. So maybe the
>>> solution here could be to first doing boolification of comparison in
>>> gimplifier. By this, the code for type-sinking in my patch could go
>>> away.
>>
>> Well, forwprop either needs to be teached to handle this different kind
>> of widening
>>
>>  d_4 = D.2687_3 != 0;
>>  e_11 = (int) D.2692_10;
>>  D.2694_12 = e_11 | d_4;
>>
>> or indeed comparisons should also be boolified (which I think they
>> should - they are also predicate producers).
>>
>> Still whether sinking or hoisting the stuff is the right thing, reassoc
>> is not the place to do it.
>>
>> Richard.
>
> So I tested code to do boolifying of comparison in gimplifier.  This
> works so far nice when fold_convert doesn't hoist for boolean-types.
> But in pass  forwprop (see here function forward_propagate_comparison)
> does again type hoisting, which destroys of coures the boolified
> comparisons and so later reassociation pass has again the issue about
> finding matches.
> To introduce (as you suggested) into tree-ssa-forwprop the type
> sinking, therefore doesn't work.  As type hoisting is for sure the
> better final result of an expression, but on expression folding
> passes it has advantages to use type sinking instead.
> So this might be a thing for a different pass, or in reassoc-pass
> itself (as patch does) as here type-sinking helps to combine.  As
> after reassociation again the forward-propagation happens, we have
> still the better final expression variant as result.
>
> So how to continue here?

Please send me the boolification of comparisons patch you have,
as that is the right way to continue.  forwprop shouldn't undo
this (if it does, it does so via fold).

Richard.

> Regards,
> Kai
>
diff mbox

Patch

Index: gcc/gcc/tree-ssa-reassoc.c
===================================================================
--- gcc.orig/gcc/tree-ssa-reassoc.c	2011-05-19 14:25:10.404289600 +0200
+++ gcc/gcc/tree-ssa-reassoc.c	2011-05-19 14:26:11.088817100 +0200
@@ -41,6 +41,10 @@  along with GCC; see the file COPYING3.
 #include "cfgloop.h"
 #include "flags.h"
 
+/* Forward declarations of some helper routines.  */
+static gimple build_and_add_sum (tree , tree , tree , enum tree_code);
+static void remove_visited_stmt_chain (tree);
+
 /*  This is a simple global reassociation pass.  It is, in part, based
     on the LLVM pass of the same name (They do some things more/less
     than we do, in different orders, etc).
@@ -411,6 +415,236 @@  get_unary_op (tree name, enum tree_code
   return NULL_TREE;
 }
 
+/* Checks if a type is based on a BOOLEAN_TYPE, and or has 1-bit precision.
+   This check is necessary to detect ADA's 8-bit unsigned boolean types
+   proper.
+   If type is of kind boolean or compatible, TRUE is returned.  Otherwise
+   FALSE.  */
+
+static bool
+is_boolean_compatible_type_p (tree type)
+{
+  if (!type)
+    return false;
+
+  if (TYPE_PRECISION (type) == 1)
+    return true;
+
+  /* We try to look here into inner type, as ADA uses
+     boolean_type_node with type precision != 1.  */
+  while (TREE_TYPE (type)
+        && (TREE_CODE (type) == INTEGER_TYPE
+            || TREE_CODE (type) == REAL_TYPE))
+    type = TREE_TYPE (type);
+
+  return TYPE_PRECISION (type) == 1 || TREE_CODE (type) == BOOLEAN_TYPE;
+}
+
+/* This routine checks for (A | B) != 0 and (A | B) == 0.
+   If found TRUE is returned, otherwise FALSE.  */
+
+static bool
+is_ior_ne_eq_cmp (gimple stmt)
+{
+  gimple s;
+  tree op1, op2;
+  enum tree_code code;
+
+  if (!is_gimple_assign (stmt))
+    return false;
+  code = gimple_assign_rhs_code (stmt);
+  if (code != NE_EXPR && code != EQ_EXPR)
+    return false;
+  op1 = gimple_assign_rhs1 (stmt);
+  op2 = gimple_assign_rhs2 (stmt);
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (op2)) || !integer_zerop (op2))
+    return false;
+  if (TREE_CODE (op1) != SSA_NAME)
+    return false;
+  s = SSA_NAME_DEF_STMT (op1);
+  if (!is_gimple_assign (s) || gimple_assign_rhs_code (s) != BIT_IOR_EXPR)
+    return false;
+
+  return true;
+}
+
+/* This helper routine for build_expand_ne_eq_cmp () expands (A | B) != 0
+   to A != 0 || B != 0, and (A | B) == 0 to A == 0 && B == 0 to allow later
+   association to eliminate duplicates and do inverse operation.  */
+
+static tree
+build_inner_expand_ne_eq_cmp (tree op, tree zero, enum tree_code cmp_code)
+{
+  tree op1a, op1b, tmpvar;
+  gimple sum, s = NULL;
+  bool is_gimple = false;
+
+  if (TREE_CODE (op) == SSA_NAME
+      && is_gimple_assign ((s = SSA_NAME_DEF_STMT (op))))
+    is_gimple = true;
+
+  if (is_gimple
+      && gimple_assign_rhs_code (s) == BIT_IOR_EXPR)
+    {
+      op1a = build_inner_expand_ne_eq_cmp (gimple_assign_rhs1 (s),
+      					   zero, cmp_code);
+      op1b = build_inner_expand_ne_eq_cmp (gimple_assign_rhs2 (s),
+      					   zero, cmp_code);
+      tmpvar = create_tmp_reg (boolean_type_node, NULL);
+      add_referenced_var (tmpvar);
+      sum = build_and_add_sum (tmpvar, op1a, op1b,
+      			       (cmp_code == NE_EXPR ? BIT_IOR_EXPR
+						    : BIT_AND_EXPR));
+      op1a = gimple_get_lhs (sum);
+    }
+  else if (cmp_code == NE_EXPR
+	   && is_boolean_compatible_type_p (TREE_TYPE (op)))
+    return op;
+  else if (cmp_code == NE_EXPR && is_gimple && gimple_assign_cast_p (s)
+  	   && is_boolean_compatible_type_p
+		(TREE_TYPE (gimple_assign_rhs1 (s))))
+    return gimple_assign_rhs1 (s);
+  else
+    {
+      tmpvar = create_tmp_reg (boolean_type_node, NULL);
+      add_referenced_var (tmpvar);
+      sum = build_and_add_sum (tmpvar, op, zero, cmp_code);
+      op1a = gimple_get_lhs (sum);
+    }
+  return op1a;
+}
+
+/* This routine expands (A | B) != 0 to A != 0 || B != 0,
+   and (A | B) == 0 to A == 0 && B == 0 to allow later
+   association to eliminate duplicates and do inverse
+   operation.  */
+
+static void
+build_expand_ne_eq_cmp (gimple stmt)
+{
+  gimple_stmt_iterator gsi;
+  gimple s;
+  tree op1, op2, op1a, op1b, type;
+  enum tree_code code;
+
+  code = gimple_assign_rhs_code (stmt);
+  op1 = gimple_assign_rhs1 (stmt);
+  op2 = gimple_assign_rhs2 (stmt);
+  s = SSA_NAME_DEF_STMT (op1);
+
+  /* (X | Y) != 0 or (X | Y) == 0.  */
+  type = TREE_TYPE (gimple_assign_lhs (stmt));
+  op1a = gimple_assign_rhs1 (s);
+  op1b = gimple_assign_rhs2 (s);
+
+  op1a = build_inner_expand_ne_eq_cmp (op1a, op2, code);
+  op1b = build_inner_expand_ne_eq_cmp (op1b, op2, code);
+
+  if (is_boolean_compatible_type_p (type))
+    op1 = fold_build2 ((code == NE_EXPR ? BIT_IOR_EXPR : BIT_AND_EXPR),
+		       type, op1a, op1b);
+  else
+    {
+      gimple sum;
+      tree tmpvar = create_tmp_reg (boolean_type_node, NULL);
+      add_referenced_var (tmpvar);
+      sum = build_and_add_sum (tmpvar, op1a, op1b,
+			       (code == NE_EXPR ? BIT_IOR_EXPR
+						: BIT_AND_EXPR));
+      op1 = gimple_get_lhs (sum);
+      op1 = fold_convert (type, op1);
+    }
+
+  op1a = gimple_assign_rhs1 (stmt);
+  op1b = gimple_assign_rhs2 (stmt);
+  gsi = gsi_for_stmt (stmt);
+  gimple_assign_set_rhs_from_tree (&gsi, op1);
+  update_stmt (stmt);
+  remove_visited_stmt_chain (op1a);
+  remove_visited_stmt_chain (op1b);
+}
+
+/* This helper function converts (A | B) != 0 and (A | B) == 0 to their
+   binary sequence .  Additionally it walks binary &, |, and ^ trees
+   for making sure to expand operations and it tries to do a type sink
+   for boolean based expressions in SSA form like:
+     D1 = (int) BOOL-COND1
+     D2 = (int) BOOL-COND2
+     D3 = D1 | D2
+   into
+     D1 = BOOL-COND1 | BOOL-COND2
+     D2 = (int) D1
+   */
+
+static bool
+sink_cast_and_expand (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree type, op1, op2;
+  gimple s1, s2;
+  bool ret = false;
+
+  if (is_ior_ne_eq_cmp (stmt))
+    {
+      build_expand_ne_eq_cmp (stmt);
+      return true;
+    }
+
+  if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR
+      && code != BIT_XOR_EXPR)
+    return false;
+  type = TREE_TYPE (gimple_assign_lhs (stmt));
+
+  op1 = gimple_assign_rhs1 (stmt);
+  if (TREE_CODE (op1) != SSA_NAME)
+    return false;
+  op2 = gimple_assign_rhs2 (stmt);
+  if (TREE_CODE (op2) != SSA_NAME)
+    return false;
+  s1 = SSA_NAME_DEF_STMT (op1);
+  if (!is_gimple_assign (s1))
+    return false;
+  if (sink_cast_and_expand (s1))
+    ret = true;
+  s2 = SSA_NAME_DEF_STMT (op2);
+  if (!is_gimple_assign (s2))
+    return ret;
+  if (sink_cast_and_expand (s2))
+    ret = true;
+
+  if (is_boolean_compatible_type_p (type))
+    return ret;
+  if (gimple_assign_cast_p (s1)
+      && gimple_assign_cast_p (s2)
+      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s1)))
+      && is_boolean_compatible_type_p (TREE_TYPE (gimple_assign_rhs1 (s2)))
+      && useless_type_conversion_p (TREE_TYPE (gimple_assign_rhs1 (s1)),
+      				    TREE_TYPE (gimple_assign_rhs1 (s2))))
+    {
+      gimple_stmt_iterator gsi;
+      gimple sum;
+      tree op1a, op1b, tmpvar;
+
+      tmpvar = create_tmp_reg (TREE_TYPE (gimple_assign_rhs1 (s1)), NULL);
+
+      add_referenced_var (tmpvar);
+      sum = build_and_add_sum (tmpvar, gimple_assign_rhs1 (s1),
+      			       gimple_assign_rhs1 (s2), code);
+      op1 = gimple_get_lhs (sum);
+      op1 = fold_convert (type, op1);
+      
+      op1a = gimple_assign_rhs1 (stmt);
+      op1b = gimple_assign_rhs2 (stmt);
+      gsi = gsi_for_stmt (stmt);
+      gimple_assign_set_rhs_from_tree (&gsi, op1);
+      update_stmt (stmt);
+      remove_visited_stmt_chain (op1a);
+      remove_visited_stmt_chain (op1b);
+      ret = true;
+    }
+  return ret;
+}
+
 /* If CURR and LAST are a pair of ops that OPCODE allows us to
    eliminate through equivalences, do so, remove them from OPS, and
    return true.  Otherwise, return false.  */
@@ -872,9 +1106,9 @@  zero_one_operation (tree *def, enum tree
   while (1);
 }
 
-/* Builds one statement performing OP1 OPCODE OP2 using TMPVAR for
-   the result.  Places the statement after the definition of either
-   OP1 or OP2.  Returns the new statement.  */
+/* Builds one statement performing OP1 OPCODE OP2, or UNARY-OPCODE OP1
+   using TMPVAR for the result.  Places the statement after the definition
+   of either OP1 or - for binary OPCODE - OP2 .  Returns the new statement.  */
 
 static gimple
 build_and_add_sum (tree tmpvar, tree op1, tree op2, enum tree_code opcode)
@@ -892,7 +1126,7 @@  build_and_add_sum (tree tmpvar, tree op1
   /* Find an insertion place and insert.  */
   if (TREE_CODE (op1) == SSA_NAME)
     op1def = SSA_NAME_DEF_STMT (op1);
-  if (TREE_CODE (op2) == SSA_NAME)
+  if (op2 && TREE_CODE (op2) == SSA_NAME)
     op2def = SSA_NAME_DEF_STMT (op2);
   if ((!op1def || gimple_nop_p (op1def))
       && (!op2def || gimple_nop_p (op2def)))
@@ -1223,11 +1457,11 @@  eliminate_redundant_comparison (enum tre
 				unsigned int currindex,
 				operand_entry_t curr)
 {
-  tree op1, op2;
+  tree op1, op2, matched = NULL_TREE, matched_op = NULL_TREE;
   enum tree_code lcode, rcode;
   gimple def1, def2;
-  int i;
   operand_entry_t oe;
+  int i, matched_i = 0;
 
   if (opcode != BIT_IOR_EXPR && opcode != BIT_AND_EXPR)
     return false;
@@ -1294,42 +1528,84 @@  eliminate_redundant_comparison (enum tre
 	    continue;
 	}
 
+      /* If a duplicate was found, remove it and continue search.  */
+      if (TREE_CODE (t) != INTEGER_CST && operand_equal_p (t, curr->op, 0))
+        {
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "Equivalence: ");
+	      print_generic_expr (dump_file, curr->op, 0);
+	      fprintf (dump_file, " %s ", op_symbol_code (opcode));
+	      print_generic_expr (dump_file, oe->op, 0);
+	      fprintf (dump_file, " -> ");
+	      print_generic_expr (dump_file, t, 0);
+	      fprintf (dump_file, "\n");
+	    }
+
+	  /* Now we can delete oe, as it has been subsumed by the new combined
+	     expression t.  */
+	  VEC_ordered_remove (operand_entry_t, *ops, i);
+	  reassociate_stats.ops_eliminated ++;
+
+	  /* Continue search, we might have another duplicate in list.  We
+	     don't break here to allow further duplicate elimination.  Also
+	     iteration might continue on next statement, when no further
+	     optimization for the current element is possible.  */
+	  --i;
+	  continue;
+        }
+      if (TREE_CODE (t) == INTEGER_CST)
+        {
+	  matched = t;
+	  matched_i = i;
+	  matched_op = oe->op;
+	  break;
+	}
+
+      if (matched)
+        continue;
+      matched = t;
+      matched_i = i;
+      matched_op = oe->op;
+    }
+
+  if (matched)
+    {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	{
 	  fprintf (dump_file, "Equivalence: ");
 	  print_generic_expr (dump_file, curr->op, 0);
 	  fprintf (dump_file, " %s ", op_symbol_code (opcode));
-	  print_generic_expr (dump_file, oe->op, 0);
+	  print_generic_expr (dump_file, matched_op, 0);
 	  fprintf (dump_file, " -> ");
-	  print_generic_expr (dump_file, t, 0);
+	  print_generic_expr (dump_file, matched, 0);
 	  fprintf (dump_file, "\n");
 	}
 
       /* Now we can delete oe, as it has been subsumed by the new combined
-         expression t.  */
-      VEC_ordered_remove (operand_entry_t, *ops, i);
+	 expression t.  */
+      VEC_ordered_remove (operand_entry_t, *ops, matched_i);
       reassociate_stats.ops_eliminated ++;
-
-      /* If t is the same as curr->op, we're done.  Otherwise we must
+      /* If matched is the same as curr->op, we're done.  Otherwise we must
 	 replace curr->op with t.  Special case is if we got a constant
 	 back, in which case we add it to the end instead of in place of
 	 the current entry.  */
-      if (TREE_CODE (t) == INTEGER_CST)
+      if (TREE_CODE (matched) == INTEGER_CST)
 	{
 	  VEC_ordered_remove (operand_entry_t, *ops, currindex);
-	  add_to_ops_vec (ops, t);
+	  add_to_ops_vec (ops, matched);
 	}
-      else if (!operand_equal_p (t, curr->op, 0))
+      else if (!operand_equal_p (matched, curr->op, 0))
 	{
 	  tree tmpvar;
 	  gimple sum;
 	  enum tree_code subcode;
 	  tree newop1;
 	  tree newop2;
-	  gcc_assert (COMPARISON_CLASS_P (t));
-	  tmpvar = create_tmp_var (TREE_TYPE (t), NULL);
+	  gcc_assert (COMPARISON_CLASS_P (matched));
+	  tmpvar = create_tmp_var (TREE_TYPE (matched), NULL);
 	  add_referenced_var (tmpvar);
-	  extract_ops_from_tree (t, &subcode, &newop1, &newop2);
+	  extract_ops_from_tree (matched, &subcode, &newop1, &newop2);
 	  STRIP_USELESS_TYPE_CONVERSION (newop1);
 	  STRIP_USELESS_TYPE_CONVERSION (newop2);
 	  gcc_checking_assert (is_gimple_val (newop1)
@@ -1975,8 +2251,11 @@  can_reassociate_p (tree op)
   return false;
 }
 
-/* Break up subtract operations in block BB.
+/* Break up subtract operations, normalize compare or operations, and
+   do type sinking for boolean based binary or/and/xor operations in
+   block BB.
 
+   For substract:
    We do this top down because we don't know whether the subtract is
    part of a possible chain of reassociation except at the top.
 
@@ -1987,32 +2266,46 @@  can_reassociate_p (tree op)
    q = b - r
    k = t - q
 
-   we want to break up k = t - q, but we won't until we've transformed q
+   We want to break up k = t - q, but we won't until we've transformed q
    = b - r, which won't be broken up until we transform b = c - d.
 
+   Compare operations:
+   For being able to reassoc in and/or/xor operations for optimized
+   comparison, we need to break up (A | B) == 0 to (A == 0 && B == 0)
+   and (A | B) != 0 to (A != 0 || B != 0).
+
    En passant, clear the GIMPLE visited flag on every statement.  */
 
 static void
-break_up_subtract_bb (basic_block bb)
+break_up_operation_bb (basic_block bb)
 {
   gimple_stmt_iterator gsi;
   basic_block son;
 
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
     {
       gimple stmt = gsi_stmt (gsi);
       gimple_set_visited (stmt, false);
 
+      if (is_gimple_assign (stmt) && sink_cast_and_expand (stmt))
+	continue;
+
       if (!is_gimple_assign (stmt)
 	  || !can_reassociate_p (gimple_assign_lhs (stmt)))
-	continue;
+	{
+	  gsi_next (&gsi);
+	  continue;
+	}
 
       /* Look for simple gimple subtract operations.  */
       if (gimple_assign_rhs_code (stmt) == MINUS_EXPR)
 	{
 	  if (!can_reassociate_p (gimple_assign_rhs1 (stmt))
 	      || !can_reassociate_p (gimple_assign_rhs2 (stmt)))
-	    continue;
+	    {
+	      gsi_next (&gsi);
+	      continue;
+	    }
 
 	  /* Check for a subtract used only in an addition.  If this
 	     is the case, transform it into add of a negate for better
@@ -2024,11 +2317,12 @@  break_up_subtract_bb (basic_block bb)
       else if (gimple_assign_rhs_code (stmt) == NEGATE_EXPR
 	       && can_reassociate_p (gimple_assign_rhs1 (stmt)))
 	VEC_safe_push (tree, heap, plus_negates, gimple_assign_lhs (stmt));
+      gsi_next (&gsi);
     }
   for (son = first_dom_son (CDI_DOMINATORS, bb);
        son;
        son = next_dom_son (CDI_DOMINATORS, son))
-    break_up_subtract_bb (son);
+    break_up_operation_bb (son);
 }
 
 /* Reassociate expressions in basic block BB and its post-dominator as
@@ -2180,7 +2474,7 @@  debug_ops_vector (VEC (operand_entry_t,
 static void
 do_reassoc (void)
 {
-  break_up_subtract_bb (ENTRY_BLOCK_PTR);
+  break_up_operation_bb (ENTRY_BLOCK_PTR);
   reassociate_bb (EXIT_BLOCK_PTR);
 }
 
Index: gcc/gcc/testsuite/gcc.dg/binop-tand1.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tand1.c	2011-05-19 14:28:19.980184200 +0200
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return ((!a && !b) && c && b && c && a);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tand2.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tand2.c	2011-05-19 14:28:19.982184500 +0200
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return (a != 0 && b != 0 && ((a | b) == 0) && c != 0) ? 1 : 0;
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tand4.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tand4.c	2011-05-19 14:28:19.985184900 +0200
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  int e = (a && b && c);
+  return (e && (a | b) == 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tor1.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tor1.c	2011-05-19 14:28:12.189194900 +0200
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return (a || b || c || (!b || !a) || c);
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tor2.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tor2.c	2011-05-19 14:28:12.191195200 +0200
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return ((!a || !b) || c || (a | b) != 0) || c;
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tor4.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tor4.c	2011-05-19 14:28:12.194195500 +0200
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  int d = (a | b) != 0;
+  int e = (!a || c || !b || c);
+  return (e || d) != 0;
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tor5.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tor5.c	2011-05-19 14:28:12.196695900 +0200
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  _Bool d = (a | b) != 0;
+  _Bool e = (!a || c || !b || c);
+  return (e || d) != 0;
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tand3.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tand3.c	2011-05-19 14:28:19.983684700 +0200
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  int d = (!a && !b);
+  int e = (b && c);
+  return (d && e && c && a);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-tor3.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-tor3.c	2011-05-19 14:28:12.192695300 +0200
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  int d = (a || !b);
+  int e = (!a || b);
+  return (e || d || c);
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */