diff mbox series

Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]

Message ID CAMZc-byf4rYBdOtgC2enpVymG6iRcrdyRPYHCUfn6J=F72jfCg@mail.gmail.com
State New
Headers show
Series Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738] | expand

Commit Message

Hongtao Liu June 1, 2021, 5:22 a.m. UTC
Hi:
  This patch is about to simplify (view_convert:type ~a) < 0 to
(view_convert:type a) >= 0 when type is signed integer. Similar for
(view_convert:type ~a) >= 0.
  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for the trunk?

gcc/ChangeLog:

        PR middle-end/100738
        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
        simplification.

gcc/testsuite/ChangeLog:

        PR middle-end/100738
        * g++.target/i386/avx2-pr100738-1.C: New test.
        * g++.target/i386/sse4_1-pr100738-1.C: New test.

Comments

Andrew Pinski June 1, 2021, 5:29 a.m. UTC | #1
On Mon, May 31, 2021 at 10:21 PM Hongtao Liu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi:
>   This patch is about to simplify (view_convert:type ~a) < 0 to
> (view_convert:type a) >= 0 when type is signed integer. Similar for
> (view_convert:type ~a) >= 0.
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for the trunk?

Why not just do instead:

/* ((view_convert)~a) is just ~(view_convert)a .  */
(simplify
  (view_convert (bit_not @0))
  (if ((VECTOR_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (type)))
        || INTEGRAL_TYPE_P (type))
   (bit_not (view_convert @0))))

And then the other patterns for converting ~a < 0 to a >= 0 should
happen (if they don't add a few) (for wrapping types).

Thanks,
Andrew Pinski

>
> gcc/ChangeLog:
>
>         PR middle-end/100738
>         * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
>         (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
>         simplification.
>
> gcc/testsuite/ChangeLog:
>
>         PR middle-end/100738
>         * g++.target/i386/avx2-pr100738-1.C: New test.
>         * g++.target/i386/sse4_1-pr100738-1.C: New test.
>
> --
> BR,
> Hongtao
Marc Glisse June 1, 2021, 10:17 a.m. UTC | #2
On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:

> Hi:
>  This patch is about to simplify (view_convert:type ~a) < 0 to
> (view_convert:type a) >= 0 when type is signed integer. Similar for
> (view_convert:type ~a) >= 0.
>  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>  Ok for the trunk?
>
> gcc/ChangeLog:
>
>        PR middle-end/100738
>        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
>        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
>        simplification.

We already have

/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
(for cmp (simple_comparison)
      scmp (swapped_simple_comparison)
  (simplify
   (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
   (if (single_use (@2)
        && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
    (scmp @0 (bit_not @1)))))

Would it make sense to try and generalize it a bit, say with

(cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)

(scmp (view_convert:XXX @0) (bit_not @1))

(I still believe that it is a bad idea that SSA_NAMEs are strongly typed, 
encoding the type in operations would be more convenient, but I think the 
time for that choice has long gone)
Hongtao Liu June 1, 2021, 11:52 a.m. UTC | #3
On Tue, Jun 1, 2021 at 1:29 PM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Mon, May 31, 2021 at 10:21 PM Hongtao Liu via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi:
> >   This patch is about to simplify (view_convert:type ~a) < 0 to
> > (view_convert:type a) >= 0 when type is signed integer. Similar for
> > (view_convert:type ~a) >= 0.
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >   Ok for the trunk?
>
> Why not just do instead:
>
> /* ((view_convert)~a) is just ~(view_convert)a .  */
> (simplify
>   (view_convert (bit_not @0))
>   (if ((VECTOR_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (type)))
>         || INTEGRAL_TYPE_P (type))
>    (bit_not (view_convert @0))))
>
> And then the other patterns for converting ~a < 0 to a >= 0 should
> happen (if they don't add a few) (for wrapping types).
>
It fails c11-atomic-exec-3.c where @0 is _Bool and type is unsigned char.
> Thanks,
> Andrew Pinski
>
> >
> > gcc/ChangeLog:
> >
> >         PR middle-end/100738
> >         * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> >         (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> >         simplification.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR middle-end/100738
> >         * g++.target/i386/avx2-pr100738-1.C: New test.
> >         * g++.target/i386/sse4_1-pr100738-1.C: New test.
> >
> > --
> > BR,
> > Hongtao
Hongtao Liu June 4, 2021, 5:01 a.m. UTC | #4
On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
>
> > Hi:
> >  This patch is about to simplify (view_convert:type ~a) < 0 to
> > (view_convert:type a) >= 0 when type is signed integer. Similar for
> > (view_convert:type ~a) >= 0.
> >  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >  Ok for the trunk?
> >
> > gcc/ChangeLog:
> >
> >        PR middle-end/100738
> >        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> >        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> >        simplification.
>
> We already have
>
> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> (for cmp (simple_comparison)
>       scmp (swapped_simple_comparison)
>   (simplify
>    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
>    (if (single_use (@2)
>         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
>     (scmp @0 (bit_not @1)))))
>
> Would it make sense to try and generalize it a bit, say with
>
> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
>
> (scmp (view_convert:XXX @0) (bit_not @1))
>
Thanks for your advice, it looks great.
And can I use *view_convert1?* instead of *nop_convert1?* here,
because the original case is view_convert, and nop_convert would fail
to simplify the case.
> (I still believe that it is a bad idea that SSA_NAMEs are strongly typed,
> encoding the type in operations would be more convenient, but I think the
> time for that choice has long gone)
>
> --
> Marc Glisse
Hongtao Liu June 4, 2021, 8:11 a.m. UTC | #5
On Fri, Jun 4, 2021 at 1:01 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> >
> > On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> >
> > > Hi:
> > >  This patch is about to simplify (view_convert:type ~a) < 0 to
> > > (view_convert:type a) >= 0 when type is signed integer. Similar for
> > > (view_convert:type ~a) >= 0.
> > >  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > >  Ok for the trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >        PR middle-end/100738
> > >        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> > >        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> > >        simplification.
> >
> > We already have
> >
> > /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> > (for cmp (simple_comparison)
> >       scmp (swapped_simple_comparison)
> >   (simplify
> >    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
> >    (if (single_use (@2)
> >         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
> >     (scmp @0 (bit_not @1)))))
> >
> > Would it make sense to try and generalize it a bit, say with
> >
> > (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
> >
> > (scmp (view_convert:XXX @0) (bit_not @1))
> >
> Thanks for your advice, it looks great.
> And can I use *view_convert1?* instead of *nop_convert1?* here,
> because the original case is view_convert, and nop_convert would fail
> to simplify the case.
Here is updated patch

gcc/ChangeLog:

        PR middle-end/100738
        * match.pd (Fold ~X op C as X op' ~C): Extend GIMPLE
        simplification to handle view_convert ~X.

gcc/testsuite/ChangeLog:

        PR middle-end/100738
        * g++.target/i386/avx2-pr100738-1.C: New test.
        * g++.target/i386/sse4_1-pr100738-1.C: New test.

> > (I still believe that it is a bad idea that SSA_NAMEs are strongly typed,
> > encoding the type in operations would be more convenient, but I think the
> > time for that choice has long gone)
> >
> > --
> > Marc Glisse
>
>
>
> --
> BR,
> Hongtao
Marc Glisse June 4, 2021, 8:18 a.m. UTC | #6
On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:

> On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>>
>> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
>>
>>> Hi:
>>>  This patch is about to simplify (view_convert:type ~a) < 0 to
>>> (view_convert:type a) >= 0 when type is signed integer. Similar for
>>> (view_convert:type ~a) >= 0.
>>>  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>>>  Ok for the trunk?
>>>
>>> gcc/ChangeLog:
>>>
>>>        PR middle-end/100738
>>>        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
>>>        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
>>>        simplification.
>>
>> We already have
>>
>> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
>> (for cmp (simple_comparison)
>>       scmp (swapped_simple_comparison)
>>   (simplify
>>    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
>>    (if (single_use (@2)
>>         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
>>     (scmp @0 (bit_not @1)))))
>>
>> Would it make sense to try and generalize it a bit, say with
>>
>> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
>>
>> (scmp (view_convert:XXX @0) (bit_not @1))
>>
> Thanks for your advice, it looks great.
> And can I use *view_convert1?* instead of *nop_convert1?* here,
> because the original case is view_convert, and nop_convert would fail
> to simplify the case.

Near the top of match.pd, you can see

/* With nop_convert? combine convert? and view_convert? in one pattern
    plus conditionalize on tree_nop_conversion_p conversions.  */
(match (nop_convert @0)
  (convert @0)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
(match (nop_convert @0)
  (view_convert @0)
  (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
       && known_eq (TYPE_VECTOR_SUBPARTS (type),
                    TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
       && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))))))

So at least the intention is that it can handle both NOP_EXPR for scalars 
and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way 
in some places in match.pd, like

(simplify
  (negate (nop_convert? (bit_not @0)))
  (plus (view_convert @0) { build_each_one_cst (type); }))

(simplify
  (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
   (bit_not (bit_xor (view_convert @0) @1))))

(the 'if' seems redundant for this one)

  (simplify
   (negate (nop_convert? (negate @1)))
   (if (!TYPE_OVERFLOW_SANITIZED (type)
        && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
    (view_convert @1)))

etc.


At some point this got some genmatch help, to handle '?' and numbers, so I 
don't remember all the details, but following these examples should work.
Hongtao Liu June 7, 2021, 6:22 a.m. UTC | #7
On Fri, Jun 4, 2021 at 4:18 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:
>
> > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> >>
> >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> >>
> >>> Hi:
> >>>  This patch is about to simplify (view_convert:type ~a) < 0 to
> >>> (view_convert:type a) >= 0 when type is signed integer. Similar for
> >>> (view_convert:type ~a) >= 0.
> >>>  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >>>  Ok for the trunk?
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>        PR middle-end/100738
> >>>        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> >>>        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> >>>        simplification.
> >>
> >> We already have
> >>
> >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> >> (for cmp (simple_comparison)
> >>       scmp (swapped_simple_comparison)
> >>   (simplify
> >>    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
> >>    (if (single_use (@2)
> >>         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
> >>     (scmp @0 (bit_not @1)))))
> >>
> >> Would it make sense to try and generalize it a bit, say with
> >>
> >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
> >>
> >> (scmp (view_convert:XXX @0) (bit_not @1))
> >>
> > Thanks for your advice, it looks great.
> > And can I use *view_convert1?* instead of *nop_convert1?* here,
> > because the original case is view_convert, and nop_convert would fail
> > to simplify the case.
>
> Near the top of match.pd, you can see
>
> /* With nop_convert? combine convert? and view_convert? in one pattern
>     plus conditionalize on tree_nop_conversion_p conversions.  */
> (match (nop_convert @0)
>   (convert @0)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
> (match (nop_convert @0)
>   (view_convert @0)
>   (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
>        && known_eq (TYPE_VECTOR_SUBPARTS (type),
>                     TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
>        && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))))))
>
Oh, it's restricted to the same number of elements which is not the
case i tested.
That's why nop_convert failed to simplify the case.

Guess we can define another nop1_convert to handle vector types with
different number of elements, but still tree_nop_convertion_p?

> So at least the intention is that it can handle both NOP_EXPR for scalars
> and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way
> in some places in match.pd, like
>
> (simplify
>   (negate (nop_convert? (bit_not @0)))
>   (plus (view_convert @0) { build_each_one_cst (type); }))
>
> (simplify
>   (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>    (bit_not (bit_xor (view_convert @0) @1))))
>
> (the 'if' seems redundant for this one)
>
>   (simplify
>    (negate (nop_convert? (negate @1)))
>    (if (!TYPE_OVERFLOW_SANITIZED (type)
>         && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
>     (view_convert @1)))
>
> etc.
>
>
> At some point this got some genmatch help, to handle '?' and numbers, so I
> don't remember all the details, but following these examples should work.
>
> --
> Marc Glisse
Hongtao Liu June 7, 2021, 7:06 a.m. UTC | #8
On Mon, Jun 7, 2021 at 2:22 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Jun 4, 2021 at 4:18 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> >
> > On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> >
> > > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> > >>
> > >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> > >>
> > >>> Hi:
> > >>>  This patch is about to simplify (view_convert:type ~a) < 0 to
> > >>> (view_convert:type a) >= 0 when type is signed integer. Similar for
> > >>> (view_convert:type ~a) >= 0.
> > >>>  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > >>>  Ok for the trunk?
> > >>>
> > >>> gcc/ChangeLog:
> > >>>
> > >>>        PR middle-end/100738
> > >>>        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> > >>>        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> > >>>        simplification.
> > >>
> > >> We already have
> > >>
> > >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> > >> (for cmp (simple_comparison)
> > >>       scmp (swapped_simple_comparison)
> > >>   (simplify
> > >>    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
> > >>    (if (single_use (@2)
> > >>         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
> > >>     (scmp @0 (bit_not @1)))))
> > >>
> > >> Would it make sense to try and generalize it a bit, say with
> > >>
> > >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
> > >>
> > >> (scmp (view_convert:XXX @0) (bit_not @1))
> > >>
> > > Thanks for your advice, it looks great.
> > > And can I use *view_convert1?* instead of *nop_convert1?* here,
> > > because the original case is view_convert, and nop_convert would fail
> > > to simplify the case.
> >
> > Near the top of match.pd, you can see
> >
> > /* With nop_convert? combine convert? and view_convert? in one pattern
> >     plus conditionalize on tree_nop_conversion_p conversions.  */
> > (match (nop_convert @0)
> >   (convert @0)
> >   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
> > (match (nop_convert @0)
> >   (view_convert @0)
> >   (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
> >        && known_eq (TYPE_VECTOR_SUBPARTS (type),
> >                     TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
> >        && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))))))
> >
> Oh, it's restricted to the same number of elements which is not the
> case i tested.
> That's why nop_convert failed to simplify the case.
And tree_nop_conversion_p also doesn't handle vector types with
different element numbers.
Shouldn't v4si --> v16qi a nop conversion for all targets?
>
> Guess we can define another nop1_convert to handle vector types with
> different number of elements, but still tree_nop_convertion_p?
>
> > So at least the intention is that it can handle both NOP_EXPR for scalars
> > and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way
> > in some places in match.pd, like
> >
> > (simplify
> >   (negate (nop_convert? (bit_not @0)))
> >   (plus (view_convert @0) { build_each_one_cst (type); }))
> >
> > (simplify
> >   (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1)
> >   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> >    (bit_not (bit_xor (view_convert @0) @1))))
> >
> > (the 'if' seems redundant for this one)
> >
> >   (simplify
> >    (negate (nop_convert? (negate @1)))
> >    (if (!TYPE_OVERFLOW_SANITIZED (type)
> >         && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
> >     (view_convert @1)))
> >
> > etc.
> >
> >
> > At some point this got some genmatch help, to handle '?' and numbers, so I
> > don't remember all the details, but following these examples should work.
> >
> > --
> > Marc Glisse
>
>
>
> --
> BR,
> Hongtao
Richard Biener June 7, 2021, 1:19 p.m. UTC | #9
On Mon, Jun 7, 2021 at 9:02 AM Hongtao Liu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, Jun 7, 2021 at 2:22 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Fri, Jun 4, 2021 at 4:18 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> > >
> > > On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> > >
> > > > On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> > > >>
> > > >> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
> > > >>
> > > >>> Hi:
> > > >>>  This patch is about to simplify (view_convert:type ~a) < 0 to
> > > >>> (view_convert:type a) >= 0 when type is signed integer. Similar for
> > > >>> (view_convert:type ~a) >= 0.
> > > >>>  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > >>>  Ok for the trunk?
> > > >>>
> > > >>> gcc/ChangeLog:
> > > >>>
> > > >>>        PR middle-end/100738
> > > >>>        * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> > > >>>        (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> > > >>>        simplification.
> > > >>
> > > >> We already have
> > > >>
> > > >> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> > > >> (for cmp (simple_comparison)
> > > >>       scmp (swapped_simple_comparison)
> > > >>   (simplify
> > > >>    (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
> > > >>    (if (single_use (@2)
> > > >>         && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
> > > >>     (scmp @0 (bit_not @1)))))
> > > >>
> > > >> Would it make sense to try and generalize it a bit, say with
> > > >>
> > > >> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
> > > >>
> > > >> (scmp (view_convert:XXX @0) (bit_not @1))
> > > >>
> > > > Thanks for your advice, it looks great.
> > > > And can I use *view_convert1?* instead of *nop_convert1?* here,
> > > > because the original case is view_convert, and nop_convert would fail
> > > > to simplify the case.
> > >
> > > Near the top of match.pd, you can see
> > >
> > > /* With nop_convert? combine convert? and view_convert? in one pattern
> > >     plus conditionalize on tree_nop_conversion_p conversions.  */
> > > (match (nop_convert @0)
> > >   (convert @0)
> > >   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
> > > (match (nop_convert @0)
> > >   (view_convert @0)
> > >   (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
> > >        && known_eq (TYPE_VECTOR_SUBPARTS (type),
> > >                     TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
> > >        && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))))))
> > >
> > Oh, it's restricted to the same number of elements which is not the
> > case i tested.
> > That's why nop_convert failed to simplify the case.
> And tree_nop_conversion_p also doesn't handle vector types with
> different element numbers.
> Shouldn't v4si --> v16qi a nop conversion for all targets?

It's a conversion with semantics, like v4si + V_C_E<v4si> (v16qi) doesn't
make sense when you strip the V_C_E as you'll get v4si + v16qi.  We
don't consider those a nop conversion.

Richard.

> >
> > Guess we can define another nop1_convert to handle vector types with
> > different number of elements, but still tree_nop_convertion_p?
> >
> > > So at least the intention is that it can handle both NOP_EXPR for scalars
> > > and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way
> > > in some places in match.pd, like
> > >
> > > (simplify
> > >   (negate (nop_convert? (bit_not @0)))
> > >   (plus (view_convert @0) { build_each_one_cst (type); }))
> > >
> > > (simplify
> > >   (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1)
> > >   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> > >    (bit_not (bit_xor (view_convert @0) @1))))
> > >
> > > (the 'if' seems redundant for this one)
> > >
> > >   (simplify
> > >    (negate (nop_convert? (negate @1)))
> > >    (if (!TYPE_OVERFLOW_SANITIZED (type)
> > >         && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
> > >     (view_convert @1)))
> > >
> > > etc.
> > >
> > >
> > > At some point this got some genmatch help, to handle '?' and numbers, so I
> > > don't remember all the details, but following these examples should work.
> > >
> > > --
> > > Marc Glisse
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao
diff mbox series

Patch

From 8c13f61c25821aca63ef2920fddce9704bfadeec Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Thu, 27 May 2021 15:21:06 +0800
Subject: [PATCH] Optimize (view_convert:type ~a) < 0 to (view_convert:type a)
 >= 0 when type is signed integer. Similar for (view_convert:type ~a) >= 0.

gcc/ChangeLog:

	PR middle-end/100738
	* match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
	(view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
	simplification.

gcc/testsuite/ChangeLog:

	PR middle-end/100738
	* g++.target/i386/avx2-pr100738-1.C: New test.
	* g++.target/i386/sse4_1-pr100738-1.C: New test.
---
 gcc/match.pd                                  |   9 ++
 .../g++.target/i386/avx2-pr100738-1.C         | 120 ++++++++++++++++++
 .../g++.target/i386/sse4_1-pr100738-1.C       | 120 ++++++++++++++++++
 3 files changed, 249 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr100738-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C

diff --git a/gcc/match.pd b/gcc/match.pd
index cdb87636951..d1c6b4ea2b4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3983,6 +3983,15 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 				  wide_int_to_tree (TREE_TYPE (cst),
 				  wi::to_wide (cst) - 1)); })))))
 
+/* ((view_convert:signed_type)~a) < 0 is just (view_convert) a >= 0.  */
+(for cmp  (lt ge)
+     acmp (ge lt)
+ (simplify
+  (cmp (view_convert (bit_not @0)) integer_zerop@1)
+  (if (!TYPE_UNSIGNED (TREE_TYPE (@1)))
+    (with { tree stype = TREE_TYPE (@1); }
+    (acmp (view_convert:stype @0) @1)))))
+
 /* We can simplify a logical negation of a comparison to the
    inverted comparison.  As we cannot compute an expression
    operator using invert_tree_comparison we have to simulate
diff --git a/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C b/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C
new file mode 100644
index 00000000000..80fdad3e5f0
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx2-pr100738-1.C
@@ -0,0 +1,120 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -std=c++14 -O2 -mno-avx512f -mno-xop" } */
+/* { dg-final { scan-assembler-not "pxor" } } */
+/* { dg-final { scan-assembler-not "pcmpgt\[bdq]" } } */
+/* { dg-final { scan-assembler-times "pblendvb" 6 } } */
+/* { dg-final { scan-assembler-times "blendvps" 6 } } */
+/* { dg-final { scan-assembler-times "blendvpd" 6 } } */
+
+typedef char v32qi __attribute__ ((vector_size (32)));
+typedef short v16hi __attribute__ ((vector_size (32)));
+typedef int v8si __attribute__ ((vector_size (32)));
+typedef long long v4di __attribute__ ((vector_size (32)));
+
+v8si
+f1 (v32qi a, v8si b, v8si c)
+{
+  return ((v8si)~a) < 0 ? b : c;
+}
+
+v4di
+f2 (v32qi a, v4di b, v4di c)
+{
+  return ((v4di)~a) < 0 ? b : c;
+}
+
+v32qi
+f3 (v16hi a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) < 0 ? b : c;
+}
+
+v8si
+f4 (v16hi a, v8si b, v8si c)
+{
+  return ((v8si)~a) < 0 ? b : c;
+}
+
+v4di
+f5 (v16hi a, v4di b, v4di c)
+{
+  return ((v4di)~a) < 0 ? b : c;
+}
+
+v32qi
+f6 (v8si a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) < 0 ? b : c;
+}
+
+v4di
+f7 (v8si a, v4di b, v4di c)
+{
+  return ((v4di)~a) < 0 ? b : c;
+}
+
+v32qi
+f8 (v4di a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) < 0 ? b : c;
+}
+
+v8si
+f9 (v4di a, v8si b, v8si c)
+{
+  return ((v8si)~a) < 0 ? b : c;
+}
+
+v8si
+f10 (v32qi a, v8si b, v8si c)
+{
+  return ((v8si)~a) >= 0 ? b : c;
+}
+
+v4di
+f11 (v32qi a, v4di b, v4di c)
+{
+  return ((v4di)~a) >= 0 ? b : c;
+}
+
+v32qi
+f12 (v16hi a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) >= 0 ? b : c;
+}
+
+v8si
+f13 (v16hi a, v8si b, v8si c)
+{
+  return ((v8si)~a) >= 0 ? b : c;
+}
+
+v4di
+f14 (v16hi a, v4di b, v4di c)
+{
+  return ((v4di)~a) >= 0 ? b : c;
+}
+
+v32qi
+f15 (v8si a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) >= 0 ? b : c;
+}
+
+v4di
+f16 (v8si a, v4di b, v4di c)
+{
+  return ((v4di)~a) >= 0 ? b : c;
+}
+
+v32qi
+f17 (v4di a, v32qi b, v32qi c)
+{
+  return ((v32qi)~a) >= 0 ? b : c;
+}
+
+v8si
+f18 (v4di a, v8si b, v8si c)
+{
+  return ((v8si)~a) >= 0 ? b : c;
+}
diff --git a/gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C b/gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C
new file mode 100644
index 00000000000..d3454c264cd
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/sse4_1-pr100738-1.C
@@ -0,0 +1,120 @@ 
+/* { dg-do compile } */
+/* { dg-options "-msse4 -std=c++14 -mno-avx2 -O2 -mno-xop" } */
+/* { dg-final { scan-assembler-not "pxor" } } */
+/* { dg-final { scan-assembler-not "pcmpgt\[bdq]" } } */
+/* { dg-final { scan-assembler-times "pblendvb" 6 } } */
+/* { dg-final { scan-assembler-times "blendvps" 6 } } */
+/* { dg-final { scan-assembler-times "blendvpd" 6 } } */
+
+typedef char v16qi __attribute__ ((vector_size (16)));
+typedef short v8hi __attribute__ ((vector_size (16)));
+typedef int v4si __attribute__ ((vector_size (16)));
+typedef long long v2di __attribute__ ((vector_size (16)));
+
+v4si
+f1 (v16qi a, v4si b, v4si c)
+{
+  return ((v4si)~a) < 0 ? b : c;
+}
+
+v2di
+f2 (v16qi a, v2di b, v2di c)
+{
+  return ((v2di)~a) < 0 ? b : c;
+}
+
+v16qi
+f3 (v8hi a, v16qi b, v16qi c)
+{
+  return ((v16qi)~a) < 0 ? b : c;
+}
+
+v4si
+f4 (v8hi a, v4si b, v4si c)
+{
+  return ((v4si)~a) < 0 ? b : c;
+}
+
+v2di
+f5 (v8hi a, v2di b, v2di c)
+{
+  return ((v2di)~a) < 0 ? b : c;
+}
+
+v16qi
+f6 (v4si a, v16qi b, v16qi c)
+{
+  return ((v16qi)~a) < 0 ? b : c;
+}
+
+v2di
+f7 (v4si a, v2di b, v2di c)
+{
+  return ((v2di)~a) < 0 ? b : c;
+}
+
+v16qi
+f8 (v2di a, v16qi b, v16qi c)
+{
+  return ((v16qi)~a) < 0 ? b : c;
+}
+
+v4si
+f9 (v2di a, v4si b, v4si c)
+{
+  return ((v4si)~a) < 0 ? b : c;
+}
+
+v4si
+f10 (v16qi a, v4si b, v4si c)
+{
+  return ((v4si)~a) >= 0 ? b : c;
+}
+
+v2di
+f11 (v16qi a, v2di b, v2di c)
+{
+  return ((v2di)~a) >= 0 ? b : c;
+}
+
+v16qi
+f12 (v8hi a, v16qi b, v16qi c)
+{
+  return ((v16qi)~a) >= 0 ? b : c;
+}
+
+v4si
+f13 (v8hi a, v4si b, v4si c)
+{
+  return ((v4si)~a) >= 0 ? b : c;
+}
+
+v2di
+f14 (v8hi a, v2di b, v2di c)
+{
+  return ((v2di)~a) >= 0 ? b : c;
+}
+
+v16qi
+f15 (v4si a, v16qi b, v16qi c)
+{
+  return ((v16qi)~a) >= 0 ? b : c;
+}
+
+v2di
+f16 (v4si a, v2di b, v2di c)
+{
+  return ((v2di)~a) >= 0 ? b : c;
+}
+
+v16qi
+f17 (v2di a, v16qi b, v16qi c)
+{
+  return ((v16qi)~a) >= 0 ? b : c;
+}
+
+v4si
+f18 (v2di a, v4si b, v4si c)
+{
+  return ((v4si)~a) >= 0 ? b : c;
+}
-- 
2.18.1