diff mbox series

[V2] Handle bitop with INTEGER_CST in analyze_and_compute_bitop_with_inv_effect.

Message ID 20231107060539.443303-1-hongtao.liu@intel.com
State New
Headers show
Series [V2] Handle bitop with INTEGER_CST in analyze_and_compute_bitop_with_inv_effect. | expand

Commit Message

Liu, Hongtao Nov. 7, 2023, 6:05 a.m. UTC
analyze_and_compute_bitop_with_inv_effect assumes the first operand is
loop invariant which is not the case when it's INTEGER_CST.

Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR tree-optimization/105735
	PR tree-optimization/111972
	* tree-scalar-evolution.cc
	(analyze_and_compute_bitop_with_inv_effect): Handle bitop with
	INTEGER_CST.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr105735-3.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
 gcc/tree-scalar-evolution.cc               |  3 +
 2 files changed, 90 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c

Comments

Richard Biener Nov. 7, 2023, 8:10 a.m. UTC | #1
On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> loop invariant which is not the case when it's INTEGER_CST.
>
> Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

So this addresses a missed optimization, right?  It seems to me that
even with two SSA names we are only "lucky" when rhs1 is the invariant
one.  So instead of swapping this way I'd do

 unsigned i;
 for (i = 0; i < 2; ++i)
   if (TREE_CODE (match_op[i]) == SSA_NAME
       && ...)
    break; /* found! */

  if (i == 2)
    return NULL_TREE;
  if (i == 0)
    std::swap (match_op[0], match_op[1]);

to also handle a "swapped" pair of SSA names?

> gcc/ChangeLog:
>
>         PR tree-optimization/105735
>         PR tree-optimization/111972
>         * tree-scalar-evolution.cc
>         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
>         INTEGER_CST.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr105735-3.c: New test.
> ---
>  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
>  gcc/tree-scalar-evolution.cc               |  3 +
>  2 files changed, 90 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> new file mode 100644
> index 00000000000..9e268a1a997
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> @@ -0,0 +1,87 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> +
> +unsigned int
> +__attribute__((noipa))
> +foo (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 64; bit++)
> +    tmp &= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo1 (unsigned int tmp)
> +{
> +  for (int bit = 63; bit >= 0; bit -=3)
> +    tmp &= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo2 (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 64; bit++)
> +    tmp |= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo3 (unsigned int tmp)
> +{
> +  for (int bit = 63; bit >= 0; bit -=3)
> +    tmp |= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo4 (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 64; bit++)
> +    tmp ^= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo5 (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 63; bit++)
> +    tmp ^= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +f (unsigned int tmp, int bit)
> +{
> +  unsigned int res = tmp;
> +  for (int i = 0; i < bit; i++)
> +    res &= 11304;
> +  return res;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +f1 (unsigned int tmp, int bit)
> +{
> +  unsigned int res = tmp;
> +  for (int i = 0; i < bit; i++)
> +    res |= 11304;
> +  return res;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +f2 (unsigned int tmp, int bit)
> +{
> +  unsigned int res = tmp;
> +  for (int i = 0; i < bit; i++)
> +    res ^= 11304;
> +  return res;
> +}
> diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> index 70b17c5bca1..f61277c32df 100644
> --- a/gcc/tree-scalar-evolution.cc
> +++ b/gcc/tree-scalar-evolution.cc
> @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
>    match_op[0] = gimple_assign_rhs1 (def);
>    match_op[1] = gimple_assign_rhs2 (def);
>
> +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> +    std::swap (match_op[0], match_op[1]);
> +
>    if (TREE_CODE (match_op[1]) != SSA_NAME
>        || !expr_invariant_in_loop_p (loop, match_op[0])
>        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> --
> 2.31.1
>
Hongtao Liu Nov. 7, 2023, 1:03 p.m. UTC | #2
On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> >
> > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > loop invariant which is not the case when it's INTEGER_CST.
> >
> > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
>
> So this addresses a missed optimization, right?  It seems to me that
> even with two SSA names we are only "lucky" when rhs1 is the invariant
> one.  So instead of swapping this way I'd do
Yes, it's a miss optimization.
And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
enough, if match_op[1] is a loop invariant.it must be false for the
below conditions(there couldn't be any header_phi from its
definition).

>
>  unsigned i;
>  for (i = 0; i < 2; ++i)
>    if (TREE_CODE (match_op[i]) == SSA_NAME
>        && ...)
>     break; /* found! */
>
>   if (i == 2)
>     return NULL_TREE;
>   if (i == 0)
>     std::swap (match_op[0], match_op[1]);
>
> to also handle a "swapped" pair of SSA names?
>
> > gcc/ChangeLog:
> >
> >         PR tree-optimization/105735
> >         PR tree-optimization/111972
> >         * tree-scalar-evolution.cc
> >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> >         INTEGER_CST.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/pr105735-3.c: New test.
> > ---
> >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> >  gcc/tree-scalar-evolution.cc               |  3 +
> >  2 files changed, 90 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > new file mode 100644
> > index 00000000000..9e268a1a997
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > @@ -0,0 +1,87 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 64; bit++)
> > +    tmp &= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo1 (unsigned int tmp)
> > +{
> > +  for (int bit = 63; bit >= 0; bit -=3)
> > +    tmp &= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo2 (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 64; bit++)
> > +    tmp |= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo3 (unsigned int tmp)
> > +{
> > +  for (int bit = 63; bit >= 0; bit -=3)
> > +    tmp |= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo4 (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 64; bit++)
> > +    tmp ^= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo5 (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 63; bit++)
> > +    tmp ^= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +f (unsigned int tmp, int bit)
> > +{
> > +  unsigned int res = tmp;
> > +  for (int i = 0; i < bit; i++)
> > +    res &= 11304;
> > +  return res;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +f1 (unsigned int tmp, int bit)
> > +{
> > +  unsigned int res = tmp;
> > +  for (int i = 0; i < bit; i++)
> > +    res |= 11304;
> > +  return res;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +f2 (unsigned int tmp, int bit)
> > +{
> > +  unsigned int res = tmp;
> > +  for (int i = 0; i < bit; i++)
> > +    res ^= 11304;
> > +  return res;
> > +}
> > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > index 70b17c5bca1..f61277c32df 100644
> > --- a/gcc/tree-scalar-evolution.cc
> > +++ b/gcc/tree-scalar-evolution.cc
> > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> >    match_op[0] = gimple_assign_rhs1 (def);
> >    match_op[1] = gimple_assign_rhs2 (def);
> >
> > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > +    std::swap (match_op[0], match_op[1]);
> > +
> >    if (TREE_CODE (match_op[1]) != SSA_NAME
> >        || !expr_invariant_in_loop_p (loop, match_op[0])
> >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > --
> > 2.31.1
> >
Richard Biener Nov. 7, 2023, 2:34 p.m. UTC | #3
On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > >
> > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > loop invariant which is not the case when it's INTEGER_CST.
> > >
> > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > Ok for trunk?
> >
> > So this addresses a missed optimization, right?  It seems to me that
> > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > one.  So instead of swapping this way I'd do
> Yes, it's a miss optimization.
> And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> enough, if match_op[1] is a loop invariant.it must be false for the
> below conditions(there couldn't be any header_phi from its
> definition).

Yes, all I said is that when you now care for op1 being INTEGER_CST
it could also be an invariant SSA name and thus only after swapping op0/op1
we could have a successful match, no?

> >
> >  unsigned i;
> >  for (i = 0; i < 2; ++i)
> >    if (TREE_CODE (match_op[i]) == SSA_NAME
> >        && ...)
> >     break; /* found! */
> >
> >   if (i == 2)
> >     return NULL_TREE;
> >   if (i == 0)
> >     std::swap (match_op[0], match_op[1]);
> >
> > to also handle a "swapped" pair of SSA names?
> >
> > > gcc/ChangeLog:
> > >
> > >         PR tree-optimization/105735
> > >         PR tree-optimization/111972
> > >         * tree-scalar-evolution.cc
> > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > >         INTEGER_CST.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.target/i386/pr105735-3.c: New test.
> > > ---
> > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > >  gcc/tree-scalar-evolution.cc               |  3 +
> > >  2 files changed, 90 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > >
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > new file mode 100644
> > > index 00000000000..9e268a1a997
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > @@ -0,0 +1,87 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +foo (unsigned int tmp)
> > > +{
> > > +  for (int bit = 0; bit < 64; bit++)
> > > +    tmp &= 11304;
> > > +  return tmp;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +foo1 (unsigned int tmp)
> > > +{
> > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > +    tmp &= 11304;
> > > +  return tmp;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +foo2 (unsigned int tmp)
> > > +{
> > > +  for (int bit = 0; bit < 64; bit++)
> > > +    tmp |= 11304;
> > > +  return tmp;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +foo3 (unsigned int tmp)
> > > +{
> > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > +    tmp |= 11304;
> > > +  return tmp;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +foo4 (unsigned int tmp)
> > > +{
> > > +  for (int bit = 0; bit < 64; bit++)
> > > +    tmp ^= 11304;
> > > +  return tmp;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +foo5 (unsigned int tmp)
> > > +{
> > > +  for (int bit = 0; bit < 63; bit++)
> > > +    tmp ^= 11304;
> > > +  return tmp;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +f (unsigned int tmp, int bit)
> > > +{
> > > +  unsigned int res = tmp;
> > > +  for (int i = 0; i < bit; i++)
> > > +    res &= 11304;
> > > +  return res;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +f1 (unsigned int tmp, int bit)
> > > +{
> > > +  unsigned int res = tmp;
> > > +  for (int i = 0; i < bit; i++)
> > > +    res |= 11304;
> > > +  return res;
> > > +}
> > > +
> > > +unsigned int
> > > +__attribute__((noipa))
> > > +f2 (unsigned int tmp, int bit)
> > > +{
> > > +  unsigned int res = tmp;
> > > +  for (int i = 0; i < bit; i++)
> > > +    res ^= 11304;
> > > +  return res;
> > > +}
> > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > index 70b17c5bca1..f61277c32df 100644
> > > --- a/gcc/tree-scalar-evolution.cc
> > > +++ b/gcc/tree-scalar-evolution.cc
> > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > >    match_op[0] = gimple_assign_rhs1 (def);
> > >    match_op[1] = gimple_assign_rhs2 (def);
> > >
> > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > +    std::swap (match_op[0], match_op[1]);
> > > +
> > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > --
> > > 2.31.1
> > >
>
>
>
> --
> BR,
> Hongtao
Hongtao Liu Nov. 8, 2023, 1:18 a.m. UTC | #4
On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > >
> > > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > > loop invariant which is not the case when it's INTEGER_CST.
> > > >
> > > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > > Ok for trunk?
> > >
> > > So this addresses a missed optimization, right?  It seems to me that
> > > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > > one.  So instead of swapping this way I'd do
> > Yes, it's a miss optimization.
> > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > enough, if match_op[1] is a loop invariant.it must be false for the
> > below conditions(there couldn't be any header_phi from its
> > definition).
>
> Yes, all I said is that when you now care for op1 being INTEGER_CST
> it could also be an invariant SSA name and thus only after swapping op0/op1
> we could have a successful match, no?
Sorry, the commit message is a little bit misleading.
At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
(match_op[1]) == INTEGER_CST), but then I realized that this could
probably be extended to the normal SSA_NAME case as well, so I used
expr_invariant_in_loop_p, which should theoretically be able to handle
the SSA_NAME case as well.

if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
swapping it must return NULL_TREE for below conditions.
if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
swapping it must return NULL_TREE too.
So it can cover the both cases you mentioned, no need for a loop to
iterate 2 match_ops for all conditions.

3692  if (TREE_CODE (match_op[1]) != SSA_NAME
3693      || !expr_invariant_in_loop_p (loop, match_op[0])
3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
3695      || gimple_bb (header_phi) != loop->header
3696      || gimple_phi_num_args (header_phi) != 2)
3697    return NULL_TREE;
3698
3699  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
3700    return NULL_TREE;


>
> > >
> > >  unsigned i;
> > >  for (i = 0; i < 2; ++i)
> > >    if (TREE_CODE (match_op[i]) == SSA_NAME
> > >        && ...)
> > >     break; /* found! */
> > >
> > >   if (i == 2)
> > >     return NULL_TREE;
> > >   if (i == 0)
> > >     std::swap (match_op[0], match_op[1]);
> > >
> > > to also handle a "swapped" pair of SSA names?
> > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR tree-optimization/105735
> > > >         PR tree-optimization/111972
> > > >         * tree-scalar-evolution.cc
> > > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > > >         INTEGER_CST.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         * gcc.target/i386/pr105735-3.c: New test.
> > > > ---
> > > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > > >  gcc/tree-scalar-evolution.cc               |  3 +
> > > >  2 files changed, 90 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > >
> > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > new file mode 100644
> > > > index 00000000000..9e268a1a997
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > @@ -0,0 +1,87 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +foo (unsigned int tmp)
> > > > +{
> > > > +  for (int bit = 0; bit < 64; bit++)
> > > > +    tmp &= 11304;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +foo1 (unsigned int tmp)
> > > > +{
> > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > +    tmp &= 11304;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +foo2 (unsigned int tmp)
> > > > +{
> > > > +  for (int bit = 0; bit < 64; bit++)
> > > > +    tmp |= 11304;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +foo3 (unsigned int tmp)
> > > > +{
> > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > +    tmp |= 11304;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +foo4 (unsigned int tmp)
> > > > +{
> > > > +  for (int bit = 0; bit < 64; bit++)
> > > > +    tmp ^= 11304;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +foo5 (unsigned int tmp)
> > > > +{
> > > > +  for (int bit = 0; bit < 63; bit++)
> > > > +    tmp ^= 11304;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +f (unsigned int tmp, int bit)
> > > > +{
> > > > +  unsigned int res = tmp;
> > > > +  for (int i = 0; i < bit; i++)
> > > > +    res &= 11304;
> > > > +  return res;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +f1 (unsigned int tmp, int bit)
> > > > +{
> > > > +  unsigned int res = tmp;
> > > > +  for (int i = 0; i < bit; i++)
> > > > +    res |= 11304;
> > > > +  return res;
> > > > +}
> > > > +
> > > > +unsigned int
> > > > +__attribute__((noipa))
> > > > +f2 (unsigned int tmp, int bit)
> > > > +{
> > > > +  unsigned int res = tmp;
> > > > +  for (int i = 0; i < bit; i++)
> > > > +    res ^= 11304;
> > > > +  return res;
> > > > +}
> > > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > > index 70b17c5bca1..f61277c32df 100644
> > > > --- a/gcc/tree-scalar-evolution.cc
> > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > > >    match_op[0] = gimple_assign_rhs1 (def);
> > > >    match_op[1] = gimple_assign_rhs2 (def);
> > > >
> > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > +    std::swap (match_op[0], match_op[1]);
> > > > +
> > > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > > --
> > > > 2.31.1
> > > >
> >
> >
> >
> > --
> > BR,
> > Hongtao



--
BR,
Hongtao
Richard Biener Nov. 8, 2023, 7:53 a.m. UTC | #5
On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > > <richard.guenther@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > >
> > > > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > > > loop invariant which is not the case when it's INTEGER_CST.
> > > > >
> > > > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > > > Ok for trunk?
> > > >
> > > > So this addresses a missed optimization, right?  It seems to me that
> > > > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > > > one.  So instead of swapping this way I'd do
> > > Yes, it's a miss optimization.
> > > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > > enough, if match_op[1] is a loop invariant.it must be false for the
> > > below conditions(there couldn't be any header_phi from its
> > > definition).
> >
> > Yes, all I said is that when you now care for op1 being INTEGER_CST
> > it could also be an invariant SSA name and thus only after swapping op0/op1
> > we could have a successful match, no?
> Sorry, the commit message is a little bit misleading.
> At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
> (match_op[1]) == INTEGER_CST), but then I realized that this could
> probably be extended to the normal SSA_NAME case as well, so I used
> expr_invariant_in_loop_p, which should theoretically be able to handle
> the SSA_NAME case as well.
>
> if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
> swapping it must return NULL_TREE for below conditions.
> if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
> swapping it must return NULL_TREE too.
> So it can cover the both cases you mentioned, no need for a loop to
> iterate 2 match_ops for all conditions.

Sorry if it appears we're going in circles ;)

> 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))

but this only checks match_op[1] (an SSA name at this point) for being defined
by the header PHI.  What if expr_invariant_in_loop_p (loop, mach_op[1])
and header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
which I think can happen when both ops are SSA name?

The only canonicalization we have is that constant operands are put second so
it would have been more natural to write the matching with the other operand
order (but likely you'd have been unlucky for the existing testcases then).

> 3695      || gimple_bb (header_phi) != loop->header
> 3696      || gimple_phi_num_args (header_phi) != 2)
> 3697    return NULL_TREE;
> 3698
> 3699  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
> 3700    return NULL_TREE;
>
>
> >
> > > >
> > > >  unsigned i;
> > > >  for (i = 0; i < 2; ++i)
> > > >    if (TREE_CODE (match_op[i]) == SSA_NAME
> > > >        && ...)
> > > >     break; /* found! */
> > > >
> > > >   if (i == 2)
> > > >     return NULL_TREE;
> > > >   if (i == 0)
> > > >     std::swap (match_op[0], match_op[1]);
> > > >
> > > > to also handle a "swapped" pair of SSA names?
> > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         PR tree-optimization/105735
> > > > >         PR tree-optimization/111972
> > > > >         * tree-scalar-evolution.cc
> > > > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > > > >         INTEGER_CST.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >         * gcc.target/i386/pr105735-3.c: New test.
> > > > > ---
> > > > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > > > >  gcc/tree-scalar-evolution.cc               |  3 +
> > > > >  2 files changed, 90 insertions(+)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > >
> > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > new file mode 100644
> > > > > index 00000000000..9e268a1a997
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > @@ -0,0 +1,87 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +foo (unsigned int tmp)
> > > > > +{
> > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > +    tmp &= 11304;
> > > > > +  return tmp;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +foo1 (unsigned int tmp)
> > > > > +{
> > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > +    tmp &= 11304;
> > > > > +  return tmp;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +foo2 (unsigned int tmp)
> > > > > +{
> > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > +    tmp |= 11304;
> > > > > +  return tmp;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +foo3 (unsigned int tmp)
> > > > > +{
> > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > +    tmp |= 11304;
> > > > > +  return tmp;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +foo4 (unsigned int tmp)
> > > > > +{
> > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > +    tmp ^= 11304;
> > > > > +  return tmp;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +foo5 (unsigned int tmp)
> > > > > +{
> > > > > +  for (int bit = 0; bit < 63; bit++)
> > > > > +    tmp ^= 11304;
> > > > > +  return tmp;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +f (unsigned int tmp, int bit)
> > > > > +{
> > > > > +  unsigned int res = tmp;
> > > > > +  for (int i = 0; i < bit; i++)
> > > > > +    res &= 11304;
> > > > > +  return res;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +f1 (unsigned int tmp, int bit)
> > > > > +{
> > > > > +  unsigned int res = tmp;
> > > > > +  for (int i = 0; i < bit; i++)
> > > > > +    res |= 11304;
> > > > > +  return res;
> > > > > +}
> > > > > +
> > > > > +unsigned int
> > > > > +__attribute__((noipa))
> > > > > +f2 (unsigned int tmp, int bit)
> > > > > +{
> > > > > +  unsigned int res = tmp;
> > > > > +  for (int i = 0; i < bit; i++)
> > > > > +    res ^= 11304;
> > > > > +  return res;
> > > > > +}
> > > > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > > > index 70b17c5bca1..f61277c32df 100644
> > > > > --- a/gcc/tree-scalar-evolution.cc
> > > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > > > >    match_op[0] = gimple_assign_rhs1 (def);
> > > > >    match_op[1] = gimple_assign_rhs2 (def);
> > > > >
> > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > +    std::swap (match_op[0], match_op[1]);
> > > > > +
> > > > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > > > --
> > > > > 2.31.1
> > > > >
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao
Hongtao Liu Nov. 8, 2023, 8:22 a.m. UTC | #6
On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > > > <richard.guenther@gmail.com> wrote:
> > > > >
> > > > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > >
> > > > > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > > > > loop invariant which is not the case when it's INTEGER_CST.
> > > > > >
> > > > > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > > > > Ok for trunk?
> > > > >
> > > > > So this addresses a missed optimization, right?  It seems to me that
> > > > > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > > > > one.  So instead of swapping this way I'd do
> > > > Yes, it's a miss optimization.
> > > > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > > > enough, if match_op[1] is a loop invariant.it must be false for the
> > > > below conditions(there couldn't be any header_phi from its
> > > > definition).
> > >
> > > Yes, all I said is that when you now care for op1 being INTEGER_CST
> > > it could also be an invariant SSA name and thus only after swapping op0/op1
> > > we could have a successful match, no?
> > Sorry, the commit message is a little bit misleading.
> > At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
> > (match_op[1]) == INTEGER_CST), but then I realized that this could
> > probably be extended to the normal SSA_NAME case as well, so I used
> > expr_invariant_in_loop_p, which should theoretically be able to handle
> > the SSA_NAME case as well.
> >
> > if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
> > swapping it must return NULL_TREE for below conditions.
> > if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
> > swapping it must return NULL_TREE too.
> > So it can cover the both cases you mentioned, no need for a loop to
> > iterate 2 match_ops for all conditions.
>
> Sorry if it appears we're going in circles ;)
>
> > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> > 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
>
> but this only checks match_op[1] (an SSA name at this point) for being defined
> by the header PHI.  What if expr_invariant_in_loop_p (loop, mach_op[1])
> and header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
> which I think can happen when both ops are SSA name?
The whole condition is like

3692  if (TREE_CODE (match_op[1]) != SSA_NAME
3693      || !expr_invariant_in_loop_p (loop, match_op[0])
3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
3695      || gimple_bb (header_phi) != loop->header  ----- This would
be true if match_op[1] is SSA_NAME and expr_invariant_in_loop_p
3696      || gimple_phi_num_args (header_phi) != 2)

If expr_invariant_in_loop_p (loop, mach_op[1]) is true and it's an SSA_NAME
according to code in expr_invariant_in_loop_p, def_bb of gphi is
either NULL or not belong to this loop, either case will make will
make gimple_bb (header_phi) != loop->header true.

1857  if (TREE_CODE (expr) == SSA_NAME)
1858    {
1859      def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
1860      if (def_bb
1861          && flow_bb_inside_loop_p (loop, def_bb))  -- def_bb is
NULL or it doesn't belong to the loop
1862        return false;
1863
1864      return true;
1865    }
1866
1867  if (!EXPR_P (expr))

>
> The only canonicalization we have is that constant operands are put second so
> it would have been more natural to write the matching with the other operand
> order (but likely you'd have been unlucky for the existing testcases then).
>
> > 3695      || gimple_bb (header_phi) != loop->header
> > 3696      || gimple_phi_num_args (header_phi) != 2)
> > 3697    return NULL_TREE;
> > 3698
> > 3699  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
> > 3700    return NULL_TREE;
> >
> >
> > >
> > > > >
> > > > >  unsigned i;
> > > > >  for (i = 0; i < 2; ++i)
> > > > >    if (TREE_CODE (match_op[i]) == SSA_NAME
> > > > >        && ...)
> > > > >     break; /* found! */
> > > > >
> > > > >   if (i == 2)
> > > > >     return NULL_TREE;
> > > > >   if (i == 0)
> > > > >     std::swap (match_op[0], match_op[1]);
> > > > >
> > > > > to also handle a "swapped" pair of SSA names?
> > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >         PR tree-optimization/105735
> > > > > >         PR tree-optimization/111972
> > > > > >         * tree-scalar-evolution.cc
> > > > > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > > > > >         INTEGER_CST.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > >         * gcc.target/i386/pr105735-3.c: New test.
> > > > > > ---
> > > > > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > > > > >  gcc/tree-scalar-evolution.cc               |  3 +
> > > > > >  2 files changed, 90 insertions(+)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > >
> > > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > new file mode 100644
> > > > > > index 00000000000..9e268a1a997
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > @@ -0,0 +1,87 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +foo (unsigned int tmp)
> > > > > > +{
> > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > +    tmp &= 11304;
> > > > > > +  return tmp;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +foo1 (unsigned int tmp)
> > > > > > +{
> > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > +    tmp &= 11304;
> > > > > > +  return tmp;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +foo2 (unsigned int tmp)
> > > > > > +{
> > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > +    tmp |= 11304;
> > > > > > +  return tmp;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +foo3 (unsigned int tmp)
> > > > > > +{
> > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > +    tmp |= 11304;
> > > > > > +  return tmp;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +foo4 (unsigned int tmp)
> > > > > > +{
> > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > +    tmp ^= 11304;
> > > > > > +  return tmp;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +foo5 (unsigned int tmp)
> > > > > > +{
> > > > > > +  for (int bit = 0; bit < 63; bit++)
> > > > > > +    tmp ^= 11304;
> > > > > > +  return tmp;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +f (unsigned int tmp, int bit)
> > > > > > +{
> > > > > > +  unsigned int res = tmp;
> > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > +    res &= 11304;
> > > > > > +  return res;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +f1 (unsigned int tmp, int bit)
> > > > > > +{
> > > > > > +  unsigned int res = tmp;
> > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > +    res |= 11304;
> > > > > > +  return res;
> > > > > > +}
> > > > > > +
> > > > > > +unsigned int
> > > > > > +__attribute__((noipa))
> > > > > > +f2 (unsigned int tmp, int bit)
> > > > > > +{
> > > > > > +  unsigned int res = tmp;
> > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > +    res ^= 11304;
> > > > > > +  return res;
> > > > > > +}
> > > > > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > > > > index 70b17c5bca1..f61277c32df 100644
> > > > > > --- a/gcc/tree-scalar-evolution.cc
> > > > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > > > > >    match_op[0] = gimple_assign_rhs1 (def);
> > > > > >    match_op[1] = gimple_assign_rhs2 (def);
> > > > > >
> > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > +    std::swap (match_op[0], match_op[1]);
> > > > > > +
> > > > > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
Richard Biener Nov. 10, 2023, 9:12 a.m. UTC | #7
On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > > <richard.guenther@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > >
> > > > > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > > > > <richard.guenther@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > > >
> > > > > > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > > > > > loop invariant which is not the case when it's INTEGER_CST.
> > > > > > >
> > > > > > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > > > > > Ok for trunk?
> > > > > >
> > > > > > So this addresses a missed optimization, right?  It seems to me that
> > > > > > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > > > > > one.  So instead of swapping this way I'd do
> > > > > Yes, it's a miss optimization.
> > > > > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > > > > enough, if match_op[1] is a loop invariant.it must be false for the
> > > > > below conditions(there couldn't be any header_phi from its
> > > > > definition).
> > > >
> > > > Yes, all I said is that when you now care for op1 being INTEGER_CST
> > > > it could also be an invariant SSA name and thus only after swapping op0/op1
> > > > we could have a successful match, no?
> > > Sorry, the commit message is a little bit misleading.
> > > At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
> > > (match_op[1]) == INTEGER_CST), but then I realized that this could
> > > probably be extended to the normal SSA_NAME case as well, so I used
> > > expr_invariant_in_loop_p, which should theoretically be able to handle
> > > the SSA_NAME case as well.
> > >
> > > if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
> > > swapping it must return NULL_TREE for below conditions.
> > > if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
> > > swapping it must return NULL_TREE too.
> > > So it can cover the both cases you mentioned, no need for a loop to
> > > iterate 2 match_ops for all conditions.
> >
> > Sorry if it appears we're going in circles ;)
> >
> > > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > > 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> > > 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> >
> > but this only checks match_op[1] (an SSA name at this point) for being defined
> > by the header PHI.  What if expr_invariant_in_loop_p (loop, mach_op[1])
> > and header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
> > which I think can happen when both ops are SSA name?
> The whole condition is like
>
> 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> 3695      || gimple_bb (header_phi) != loop->header  ----- This would
> be true if match_op[1] is SSA_NAME and expr_invariant_in_loop_p

But it could be expr_invariant_in_loop_p (match_op[1]) and
header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))

all I say is that for two SSA names we could not match the condition
(aka not fail)
when we swap op0/op1.  Not only when op1 is INTEGER_CST.

> 3696      || gimple_phi_num_args (header_phi) != 2)
>
> If expr_invariant_in_loop_p (loop, mach_op[1]) is true and it's an SSA_NAME
> according to code in expr_invariant_in_loop_p, def_bb of gphi is
> either NULL or not belong to this loop, either case will make will
> make gimple_bb (header_phi) != loop->header true.
>
> 1857  if (TREE_CODE (expr) == SSA_NAME)
> 1858    {
> 1859      def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
> 1860      if (def_bb
> 1861          && flow_bb_inside_loop_p (loop, def_bb))  -- def_bb is
> NULL or it doesn't belong to the loop
> 1862        return false;
> 1863
> 1864      return true;
> 1865    }
> 1866
> 1867  if (!EXPR_P (expr))
>
> >
> > The only canonicalization we have is that constant operands are put second so
> > it would have been more natural to write the matching with the other operand
> > order (but likely you'd have been unlucky for the existing testcases then).
> >
> > > 3695      || gimple_bb (header_phi) != loop->header
> > > 3696      || gimple_phi_num_args (header_phi) != 2)
> > > 3697    return NULL_TREE;
> > > 3698
> > > 3699  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
> > > 3700    return NULL_TREE;
> > >
> > >
> > > >
> > > > > >
> > > > > >  unsigned i;
> > > > > >  for (i = 0; i < 2; ++i)
> > > > > >    if (TREE_CODE (match_op[i]) == SSA_NAME
> > > > > >        && ...)
> > > > > >     break; /* found! */
> > > > > >
> > > > > >   if (i == 2)
> > > > > >     return NULL_TREE;
> > > > > >   if (i == 0)
> > > > > >     std::swap (match_op[0], match_op[1]);
> > > > > >
> > > > > > to also handle a "swapped" pair of SSA names?
> > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > >         PR tree-optimization/105735
> > > > > > >         PR tree-optimization/111972
> > > > > > >         * tree-scalar-evolution.cc
> > > > > > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > > > > > >         INTEGER_CST.
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > >         * gcc.target/i386/pr105735-3.c: New test.
> > > > > > > ---
> > > > > > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > > > > > >  gcc/tree-scalar-evolution.cc               |  3 +
> > > > > > >  2 files changed, 90 insertions(+)
> > > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > >
> > > > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > new file mode 100644
> > > > > > > index 00000000000..9e268a1a997
> > > > > > > --- /dev/null
> > > > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > @@ -0,0 +1,87 @@
> > > > > > > +/* { dg-do compile } */
> > > > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +foo (unsigned int tmp)
> > > > > > > +{
> > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > +    tmp &= 11304;
> > > > > > > +  return tmp;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +foo1 (unsigned int tmp)
> > > > > > > +{
> > > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > > +    tmp &= 11304;
> > > > > > > +  return tmp;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +foo2 (unsigned int tmp)
> > > > > > > +{
> > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > +    tmp |= 11304;
> > > > > > > +  return tmp;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +foo3 (unsigned int tmp)
> > > > > > > +{
> > > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > > +    tmp |= 11304;
> > > > > > > +  return tmp;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +foo4 (unsigned int tmp)
> > > > > > > +{
> > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > +    tmp ^= 11304;
> > > > > > > +  return tmp;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +foo5 (unsigned int tmp)
> > > > > > > +{
> > > > > > > +  for (int bit = 0; bit < 63; bit++)
> > > > > > > +    tmp ^= 11304;
> > > > > > > +  return tmp;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +f (unsigned int tmp, int bit)
> > > > > > > +{
> > > > > > > +  unsigned int res = tmp;
> > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > +    res &= 11304;
> > > > > > > +  return res;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +f1 (unsigned int tmp, int bit)
> > > > > > > +{
> > > > > > > +  unsigned int res = tmp;
> > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > +    res |= 11304;
> > > > > > > +  return res;
> > > > > > > +}
> > > > > > > +
> > > > > > > +unsigned int
> > > > > > > +__attribute__((noipa))
> > > > > > > +f2 (unsigned int tmp, int bit)
> > > > > > > +{
> > > > > > > +  unsigned int res = tmp;
> > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > +    res ^= 11304;
> > > > > > > +  return res;
> > > > > > > +}
> > > > > > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > > > > > index 70b17c5bca1..f61277c32df 100644
> > > > > > > --- a/gcc/tree-scalar-evolution.cc
> > > > > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > > > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > > > > > >    match_op[0] = gimple_assign_rhs1 (def);
> > > > > > >    match_op[1] = gimple_assign_rhs2 (def);
> > > > > > >
> > > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > > +    std::swap (match_op[0], match_op[1]);
> > > > > > > +
> > > > > > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > > > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > > > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > > > > > --
> > > > > > > 2.31.1
> > > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao
Hongtao Liu Nov. 13, 2023, 7:58 a.m. UTC | #8
On Fri, Nov 10, 2023 at 5:12 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > > > <richard.guenther@gmail.com> wrote:
> > > > >
> > > > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > > > > > <richard.guenther@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > > > >
> > > > > > > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > > > > > > loop invariant which is not the case when it's INTEGER_CST.
> > > > > > > >
> > > > > > > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > > > > > > Ok for trunk?
> > > > > > >
> > > > > > > So this addresses a missed optimization, right?  It seems to me that
> > > > > > > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > > > > > > one.  So instead of swapping this way I'd do
> > > > > > Yes, it's a miss optimization.
> > > > > > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > > > > > enough, if match_op[1] is a loop invariant.it must be false for the
> > > > > > below conditions(there couldn't be any header_phi from its
> > > > > > definition).
> > > > >
> > > > > Yes, all I said is that when you now care for op1 being INTEGER_CST
> > > > > it could also be an invariant SSA name and thus only after swapping op0/op1
> > > > > we could have a successful match, no?
> > > > Sorry, the commit message is a little bit misleading.
> > > > At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
> > > > (match_op[1]) == INTEGER_CST), but then I realized that this could
> > > > probably be extended to the normal SSA_NAME case as well, so I used
> > > > expr_invariant_in_loop_p, which should theoretically be able to handle
> > > > the SSA_NAME case as well.
> > > >
> > > > if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
> > > > swapping it must return NULL_TREE for below conditions.
> > > > if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
> > > > swapping it must return NULL_TREE too.
> > > > So it can cover the both cases you mentioned, no need for a loop to
> > > > iterate 2 match_ops for all conditions.
> > >
> > > Sorry if it appears we're going in circles ;)
> > >
> > > > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > >
> > > but this only checks match_op[1] (an SSA name at this point) for being defined
> > > by the header PHI.  What if expr_invariant_in_loop_p (loop, mach_op[1])
> > > and header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
> > > which I think can happen when both ops are SSA name?
> > The whole condition is like
> >
> > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> > 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > 3695      || gimple_bb (header_phi) != loop->header  ----- This would
> > be true if match_op[1] is SSA_NAME and expr_invariant_in_loop_p
>
> But it could be expr_invariant_in_loop_p (match_op[1]) and
> header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))

> > > > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > > > +    std::swap (match_op[0], match_op[1]);
match_op[1] will be swapped to match_op[0], the case is also handled
by my patch [1](the v2 patch)
My point is the upper code already handles 2 SSA names, no need to
iterate with all conditions, expr_invariant_in_loop_p alone is enough.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635440.html
>
> all I say is that for two SSA names we could not match the condition
> (aka not fail)
> when we swap op0/op1.  Not only when op1 is INTEGER_CST.

>
> > 3696      || gimple_phi_num_args (header_phi) != 2)
> >
> > If expr_invariant_in_loop_p (loop, mach_op[1]) is true and it's an SSA_NAME
> > according to code in expr_invariant_in_loop_p, def_bb of gphi is
> > either NULL or not belong to this loop, either case will make will
> > make gimple_bb (header_phi) != loop->header true.
> >
> > 1857  if (TREE_CODE (expr) == SSA_NAME)
> > 1858    {
> > 1859      def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
> > 1860      if (def_bb
> > 1861          && flow_bb_inside_loop_p (loop, def_bb))  -- def_bb is
> > NULL or it doesn't belong to the loop
> > 1862        return false;
> > 1863
> > 1864      return true;
> > 1865    }
> > 1866
> > 1867  if (!EXPR_P (expr))
> >
> > >
> > > The only canonicalization we have is that constant operands are put second so
> > > it would have been more natural to write the matching with the other operand
> > > order (but likely you'd have been unlucky for the existing testcases then).
> > >
> > > > 3695      || gimple_bb (header_phi) != loop->header
> > > > 3696      || gimple_phi_num_args (header_phi) != 2)
> > > > 3697    return NULL_TREE;
> > > > 3698
> > > > 3699  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
> > > > 3700    return NULL_TREE;
> > > >
> > > >
> > > > >
> > > > > > >
> > > > > > >  unsigned i;
> > > > > > >  for (i = 0; i < 2; ++i)
> > > > > > >    if (TREE_CODE (match_op[i]) == SSA_NAME
> > > > > > >        && ...)
> > > > > > >     break; /* found! */
> > > > > > >
> > > > > > >   if (i == 2)
> > > > > > >     return NULL_TREE;
> > > > > > >   if (i == 0)
> > > > > > >     std::swap (match_op[0], match_op[1]);
> > > > > > >
> > > > > > > to also handle a "swapped" pair of SSA names?
> > > > > > >
> > > > > > > > gcc/ChangeLog:
> > > > > > > >
> > > > > > > >         PR tree-optimization/105735
> > > > > > > >         PR tree-optimization/111972
> > > > > > > >         * tree-scalar-evolution.cc
> > > > > > > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > > > > > > >         INTEGER_CST.
> > > > > > > >
> > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > >
> > > > > > > >         * gcc.target/i386/pr105735-3.c: New test.
> > > > > > > > ---
> > > > > > > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > > > > > > >  gcc/tree-scalar-evolution.cc               |  3 +
> > > > > > > >  2 files changed, 90 insertions(+)
> > > > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > >
> > > > > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > > new file mode 100644
> > > > > > > > index 00000000000..9e268a1a997
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > > @@ -0,0 +1,87 @@
> > > > > > > > +/* { dg-do compile } */
> > > > > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +foo (unsigned int tmp)
> > > > > > > > +{
> > > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > > +    tmp &= 11304;
> > > > > > > > +  return tmp;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +foo1 (unsigned int tmp)
> > > > > > > > +{
> > > > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > > > +    tmp &= 11304;
> > > > > > > > +  return tmp;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +foo2 (unsigned int tmp)
> > > > > > > > +{
> > > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > > +    tmp |= 11304;
> > > > > > > > +  return tmp;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +foo3 (unsigned int tmp)
> > > > > > > > +{
> > > > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > > > +    tmp |= 11304;
> > > > > > > > +  return tmp;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +foo4 (unsigned int tmp)
> > > > > > > > +{
> > > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > > +    tmp ^= 11304;
> > > > > > > > +  return tmp;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +foo5 (unsigned int tmp)
> > > > > > > > +{
> > > > > > > > +  for (int bit = 0; bit < 63; bit++)
> > > > > > > > +    tmp ^= 11304;
> > > > > > > > +  return tmp;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +f (unsigned int tmp, int bit)
> > > > > > > > +{
> > > > > > > > +  unsigned int res = tmp;
> > > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > > +    res &= 11304;
> > > > > > > > +  return res;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +f1 (unsigned int tmp, int bit)
> > > > > > > > +{
> > > > > > > > +  unsigned int res = tmp;
> > > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > > +    res |= 11304;
> > > > > > > > +  return res;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +unsigned int
> > > > > > > > +__attribute__((noipa))
> > > > > > > > +f2 (unsigned int tmp, int bit)
> > > > > > > > +{
> > > > > > > > +  unsigned int res = tmp;
> > > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > > +    res ^= 11304;
> > > > > > > > +  return res;
> > > > > > > > +}
> > > > > > > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > > > > > > index 70b17c5bca1..f61277c32df 100644
> > > > > > > > --- a/gcc/tree-scalar-evolution.cc
> > > > > > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > > > > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > > > > > > >    match_op[0] = gimple_assign_rhs1 (def);
> > > > > > > >    match_op[1] = gimple_assign_rhs2 (def);
> > > > > > > >
> > > > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > > > +    std::swap (match_op[0], match_op[1]);
> > > > > > > > +
> > > > > > > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > > > > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > > > > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > > > > > > --
> > > > > > > > 2.31.1
> > > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > BR,
> > > > > > Hongtao
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
Richard Biener Nov. 13, 2023, 11:52 a.m. UTC | #9
On Mon, Nov 13, 2023 at 8:50 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Nov 10, 2023 at 5:12 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> > > <richard.guenther@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > >
> > > > > On Tue, Nov 7, 2023 at 10:34 PM Richard Biener
> > > > > <richard.guenther@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Nov 7, 2023 at 2:03 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
> > > > > > > <richard.guenther@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Nov 7, 2023 at 7:08 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > > > > >
> > > > > > > > > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > > > > > > > > loop invariant which is not the case when it's INTEGER_CST.
> > > > > > > > >
> > > > > > > > > Bootstrapped and regtseted on x86_64-pc-linux-gnu{-m32,}.
> > > > > > > > > Ok for trunk?
> > > > > > > >
> > > > > > > > So this addresses a missed optimization, right?  It seems to me that
> > > > > > > > even with two SSA names we are only "lucky" when rhs1 is the invariant
> > > > > > > > one.  So instead of swapping this way I'd do
> > > > > > > Yes, it's a miss optimization.
> > > > > > > And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
> > > > > > > enough, if match_op[1] is a loop invariant.it must be false for the
> > > > > > > below conditions(there couldn't be any header_phi from its
> > > > > > > definition).
> > > > > >
> > > > > > Yes, all I said is that when you now care for op1 being INTEGER_CST
> > > > > > it could also be an invariant SSA name and thus only after swapping op0/op1
> > > > > > we could have a successful match, no?
> > > > > Sorry, the commit message is a little bit misleading.
> > > > > At first, I just wanted to handle the INTEGER_CST case (with TREE_CODE
> > > > > (match_op[1]) == INTEGER_CST), but then I realized that this could
> > > > > probably be extended to the normal SSA_NAME case as well, so I used
> > > > > expr_invariant_in_loop_p, which should theoretically be able to handle
> > > > > the SSA_NAME case as well.
> > > > >
> > > > > if (expr_invariant_in_loop_p (loop, match_op[1])) is true, w/o
> > > > > swapping it must return NULL_TREE for below conditions.
> > > > > if (expr_invariant_in_loop_p (loop, match_op[1])) is false, w/
> > > > > swapping it must return NULL_TREE too.
> > > > > So it can cover the both cases you mentioned, no need for a loop to
> > > > > iterate 2 match_ops for all conditions.
> > > >
> > > > Sorry if it appears we're going in circles ;)
> > > >
> > > > > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > > 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > > 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > >
> > > > but this only checks match_op[1] (an SSA name at this point) for being defined
> > > > by the header PHI.  What if expr_invariant_in_loop_p (loop, mach_op[1])
> > > > and header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
> > > > which I think can happen when both ops are SSA name?
> > > The whole condition is like
> > >
> > > 3692  if (TREE_CODE (match_op[1]) != SSA_NAME
> > > 3693      || !expr_invariant_in_loop_p (loop, match_op[0])
> > > 3694      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > 3695      || gimple_bb (header_phi) != loop->header  ----- This would
> > > be true if match_op[1] is SSA_NAME and expr_invariant_in_loop_p
> >
> > But it could be expr_invariant_in_loop_p (match_op[1]) and
> > header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[0]))
>
> > > > > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > > > > +    std::swap (match_op[0], match_op[1]);
> match_op[1] will be swapped to match_op[0], the case is also handled
> by my patch [1](the v2 patch)
> My point is the upper code already handles 2 SSA names, no need to
> iterate with all conditions, expr_invariant_in_loop_p alone is enough.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635440.html

I see - thanks for the repeated explanations.  That patch is OK.

Thanks,
Richard.

> >
> > all I say is that for two SSA names we could not match the condition
> > (aka not fail)
> > when we swap op0/op1.  Not only when op1 is INTEGER_CST.
>
> >
> > > 3696      || gimple_phi_num_args (header_phi) != 2)
> > >
> > > If expr_invariant_in_loop_p (loop, mach_op[1]) is true and it's an SSA_NAME
> > > according to code in expr_invariant_in_loop_p, def_bb of gphi is
> > > either NULL or not belong to this loop, either case will make will
> > > make gimple_bb (header_phi) != loop->header true.
> > >
> > > 1857  if (TREE_CODE (expr) == SSA_NAME)
> > > 1858    {
> > > 1859      def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
> > > 1860      if (def_bb
> > > 1861          && flow_bb_inside_loop_p (loop, def_bb))  -- def_bb is
> > > NULL or it doesn't belong to the loop
> > > 1862        return false;
> > > 1863
> > > 1864      return true;
> > > 1865    }
> > > 1866
> > > 1867  if (!EXPR_P (expr))
> > >
> > > >
> > > > The only canonicalization we have is that constant operands are put second so
> > > > it would have been more natural to write the matching with the other operand
> > > > order (but likely you'd have been unlucky for the existing testcases then).
> > > >
> > > > > 3695      || gimple_bb (header_phi) != loop->header
> > > > > 3696      || gimple_phi_num_args (header_phi) != 2)
> > > > > 3697    return NULL_TREE;
> > > > > 3698
> > > > > 3699  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
> > > > > 3700    return NULL_TREE;
> > > > >
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > >  unsigned i;
> > > > > > > >  for (i = 0; i < 2; ++i)
> > > > > > > >    if (TREE_CODE (match_op[i]) == SSA_NAME
> > > > > > > >        && ...)
> > > > > > > >     break; /* found! */
> > > > > > > >
> > > > > > > >   if (i == 2)
> > > > > > > >     return NULL_TREE;
> > > > > > > >   if (i == 0)
> > > > > > > >     std::swap (match_op[0], match_op[1]);
> > > > > > > >
> > > > > > > > to also handle a "swapped" pair of SSA names?
> > > > > > > >
> > > > > > > > > gcc/ChangeLog:
> > > > > > > > >
> > > > > > > > >         PR tree-optimization/105735
> > > > > > > > >         PR tree-optimization/111972
> > > > > > > > >         * tree-scalar-evolution.cc
> > > > > > > > >         (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > > > > > > > >         INTEGER_CST.
> > > > > > > > >
> > > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > >
> > > > > > > > >         * gcc.target/i386/pr105735-3.c: New test.
> > > > > > > > > ---
> > > > > > > > >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++++++++++++++++++++++
> > > > > > > > >  gcc/tree-scalar-evolution.cc               |  3 +
> > > > > > > > >  2 files changed, 90 insertions(+)
> > > > > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > > >
> > > > > > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > > > new file mode 100644
> > > > > > > > > index 00000000000..9e268a1a997
> > > > > > > > > --- /dev/null
> > > > > > > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > > > > > > > > @@ -0,0 +1,87 @@
> > > > > > > > > +/* { dg-do compile } */
> > > > > > > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > > > > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +foo (unsigned int tmp)
> > > > > > > > > +{
> > > > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > > > +    tmp &= 11304;
> > > > > > > > > +  return tmp;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +foo1 (unsigned int tmp)
> > > > > > > > > +{
> > > > > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > > > > +    tmp &= 11304;
> > > > > > > > > +  return tmp;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +foo2 (unsigned int tmp)
> > > > > > > > > +{
> > > > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > > > +    tmp |= 11304;
> > > > > > > > > +  return tmp;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +foo3 (unsigned int tmp)
> > > > > > > > > +{
> > > > > > > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > > > > > > +    tmp |= 11304;
> > > > > > > > > +  return tmp;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +foo4 (unsigned int tmp)
> > > > > > > > > +{
> > > > > > > > > +  for (int bit = 0; bit < 64; bit++)
> > > > > > > > > +    tmp ^= 11304;
> > > > > > > > > +  return tmp;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +foo5 (unsigned int tmp)
> > > > > > > > > +{
> > > > > > > > > +  for (int bit = 0; bit < 63; bit++)
> > > > > > > > > +    tmp ^= 11304;
> > > > > > > > > +  return tmp;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +f (unsigned int tmp, int bit)
> > > > > > > > > +{
> > > > > > > > > +  unsigned int res = tmp;
> > > > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > > > +    res &= 11304;
> > > > > > > > > +  return res;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +f1 (unsigned int tmp, int bit)
> > > > > > > > > +{
> > > > > > > > > +  unsigned int res = tmp;
> > > > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > > > +    res |= 11304;
> > > > > > > > > +  return res;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +unsigned int
> > > > > > > > > +__attribute__((noipa))
> > > > > > > > > +f2 (unsigned int tmp, int bit)
> > > > > > > > > +{
> > > > > > > > > +  unsigned int res = tmp;
> > > > > > > > > +  for (int i = 0; i < bit; i++)
> > > > > > > > > +    res ^= 11304;
> > > > > > > > > +  return res;
> > > > > > > > > +}
> > > > > > > > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > > > > > > > index 70b17c5bca1..f61277c32df 100644
> > > > > > > > > --- a/gcc/tree-scalar-evolution.cc
> > > > > > > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > > > > > > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
> > > > > > > > >    match_op[0] = gimple_assign_rhs1 (def);
> > > > > > > > >    match_op[1] = gimple_assign_rhs2 (def);
> > > > > > > > >
> > > > > > > > > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > > > > > > > > +    std::swap (match_op[0], match_op[1]);
> > > > > > > > > +
> > > > > > > > >    if (TREE_CODE (match_op[1]) != SSA_NAME
> > > > > > > > >        || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > > > > > >        || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> > > > > > > > > --
> > > > > > > > > 2.31.1
> > > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > BR,
> > > > > > > Hongtao
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao
diff mbox series

Patch

diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c b/gcc/testsuite/gcc.target/i386/pr105735-3.c
new file mode 100644
index 00000000000..9e268a1a997
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
@@ -0,0 +1,87 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-sccp-details" } */
+/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
+
+unsigned int
+__attribute__((noipa))
+foo (unsigned int tmp)
+{
+  for (int bit = 0; bit < 64; bit++)
+    tmp &= 11304;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo1 (unsigned int tmp)
+{
+  for (int bit = 63; bit >= 0; bit -=3)
+    tmp &= 11304;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo2 (unsigned int tmp)
+{
+  for (int bit = 0; bit < 64; bit++)
+    tmp |= 11304;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo3 (unsigned int tmp)
+{
+  for (int bit = 63; bit >= 0; bit -=3)
+    tmp |= 11304;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo4 (unsigned int tmp)
+{
+  for (int bit = 0; bit < 64; bit++)
+    tmp ^= 11304;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo5 (unsigned int tmp)
+{
+  for (int bit = 0; bit < 63; bit++)
+    tmp ^= 11304;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+f (unsigned int tmp, int bit)
+{
+  unsigned int res = tmp;
+  for (int i = 0; i < bit; i++)
+    res &= 11304;
+  return res;
+}
+
+unsigned int
+__attribute__((noipa))
+f1 (unsigned int tmp, int bit)
+{
+  unsigned int res = tmp;
+  for (int i = 0; i < bit; i++)
+    res |= 11304;
+  return res;
+}
+
+unsigned int
+__attribute__((noipa))
+f2 (unsigned int tmp, int bit)
+{
+  unsigned int res = tmp;
+  for (int i = 0; i < bit; i++)
+    res ^= 11304;
+  return res;
+}
diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 70b17c5bca1..f61277c32df 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -3689,6 +3689,9 @@  analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
   match_op[0] = gimple_assign_rhs1 (def);
   match_op[1] = gimple_assign_rhs2 (def);
 
+  if (expr_invariant_in_loop_p (loop, match_op[1]))
+    std::swap (match_op[0], match_op[1]);
+
   if (TREE_CODE (match_op[1]) != SSA_NAME
       || !expr_invariant_in_loop_p (loop, match_op[0])
       || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))