diff mbox series

vect: Use xor to invert oversized vector masks

Message ID 20240314154153.263896-1-ams@baylibre.com
State New
Headers show
Series vect: Use xor to invert oversized vector masks | expand

Commit Message

Andrew Stubbs March 14, 2024, 3:41 p.m. UTC
Don't enable excess lanes when inverting vector bit-masks smaller than the
integer mode.  This is yet another case of wrong-code due to mishandling
of oversized bitmasks.

This issue shows up in vect/tsvc/vect-tsvc-s278.c and
vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32
(down from V64) on amdgcn.

OK for mainline?

Andrew

gcc/ChangeLog:

	* expr.cc (expand_expr_real_2): Use xor to invert vector masks.
---
 gcc/expr.cc | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Hongtao Liu March 15, 2024, 3:45 a.m. UTC | #1
On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs <ams@baylibre.com> wrote:
>
> Don't enable excess lanes when inverting vector bit-masks smaller than the
> integer mode.  This is yet another case of wrong-code due to mishandling
> of oversized bitmasks.
>
> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
> vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32
> (down from V64) on amdgcn.
>
> OK for mainline?
>
> Andrew
>
> gcc/ChangeLog:
>
>         * expr.cc (expand_expr_real_2): Use xor to invert vector masks.
> ---
>  gcc/expr.cc | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index 403eeaa108e4..3540327d879e 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -10497,6 +10497,17 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
>                                immed_wide_int_const (mask, int_mode),
>                                target, 1, OPTAB_LIB_WIDEN);
>         }
> +      /* If it's a vector mask don't enable excess bits.  */
> +      else if (VECTOR_BOOLEAN_TYPE_P (type)
> +              && SCALAR_INT_MODE_P (mode)
> +              && maybe_ne (GET_MODE_PRECISION (mode),
> +                           TYPE_VECTOR_SUBPARTS (type).to_constant ()))
> +       {
> +         auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> +         temp = expand_binop (mode, xor_optab, op0,
> +                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
> +                              target, true, OPTAB_WIDEN);
> +       }
Not review, just curious, should the issue be fixed by the commit in PR113576.
Also wonder besides cbranch, excess land bits also matter?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576#c35
>        else
>         temp = expand_unop (mode, one_cmpl_optab, op0, target, 1);
>        gcc_assert (temp);
> --
> 2.41.0
>
Richard Biener March 15, 2024, 7:35 a.m. UTC | #2
On Fri, Mar 15, 2024 at 4:35 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs <ams@baylibre.com> wrote:
> >
> > Don't enable excess lanes when inverting vector bit-masks smaller than the
> > integer mode.  This is yet another case of wrong-code due to mishandling
> > of oversized bitmasks.
> >
> > This issue shows up in vect/tsvc/vect-tsvc-s278.c and
> > vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32
> > (down from V64) on amdgcn.
> >
> > OK for mainline?
> >
> > Andrew
> >
> > gcc/ChangeLog:
> >
> >         * expr.cc (expand_expr_real_2): Use xor to invert vector masks.
> > ---
> >  gcc/expr.cc | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/gcc/expr.cc b/gcc/expr.cc
> > index 403eeaa108e4..3540327d879e 100644
> > --- a/gcc/expr.cc
> > +++ b/gcc/expr.cc
> > @@ -10497,6 +10497,17 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
> >                                immed_wide_int_const (mask, int_mode),
> >                                target, 1, OPTAB_LIB_WIDEN);
> >         }
> > +      /* If it's a vector mask don't enable excess bits.  */
> > +      else if (VECTOR_BOOLEAN_TYPE_P (type)
> > +              && SCALAR_INT_MODE_P (mode)
> > +              && maybe_ne (GET_MODE_PRECISION (mode),
> > +                           TYPE_VECTOR_SUBPARTS (type).to_constant ()))
> > +       {
> > +         auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> > +         temp = expand_binop (mode, xor_optab, op0,
> > +                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
> > +                              target, true, OPTAB_WIDEN);
> > +       }
> Not review, just curious, should the issue be fixed by the commit in PR113576.
> Also wonder besides cbranch, excess land bits also matter?
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576#c35

Yes, you patch BIT_NOT but we decided to patch final compares.  Is it that
we need to fixup every mask use in a .COND_* expansion as well?  If so
we should do it there.

Richard.

> >        else
> >         temp = expand_unop (mode, one_cmpl_optab, op0, target, 1);
> >        gcc_assert (temp);
> > --
> > 2.41.0
> >
>
>
> --
> BR,
> Hongtao
Andrew Stubbs March 15, 2024, 9:52 a.m. UTC | #3
On 15/03/2024 03:45, Hongtao Liu wrote:
> On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs <ams@baylibre.com> wrote:
>>
>> Don't enable excess lanes when inverting vector bit-masks smaller than the
>> integer mode.  This is yet another case of wrong-code due to mishandling
>> of oversized bitmasks.
>>
>> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
>> vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32
>> (down from V64) on amdgcn.
>>
>> OK for mainline?
>>
>> Andrew
>>
>> gcc/ChangeLog:
>>
>>          * expr.cc (expand_expr_real_2): Use xor to invert vector masks.
>> ---
>>   gcc/expr.cc | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>> index 403eeaa108e4..3540327d879e 100644
>> --- a/gcc/expr.cc
>> +++ b/gcc/expr.cc
>> @@ -10497,6 +10497,17 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
>>                                 immed_wide_int_const (mask, int_mode),
>>                                 target, 1, OPTAB_LIB_WIDEN);
>>          }
>> +      /* If it's a vector mask don't enable excess bits.  */
>> +      else if (VECTOR_BOOLEAN_TYPE_P (type)
>> +              && SCALAR_INT_MODE_P (mode)
>> +              && maybe_ne (GET_MODE_PRECISION (mode),
>> +                           TYPE_VECTOR_SUBPARTS (type).to_constant ()))
>> +       {
>> +         auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
>> +         temp = expand_binop (mode, xor_optab, op0,
>> +                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
>> +                              target, true, OPTAB_WIDEN);
>> +       }
> Not review, just curious, should the issue be fixed by the commit in PR113576.
> Also wonder besides cbranch, excess land bits also matter?
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576#c35

It does seem to be another case of the same problem, but those commits 
are long enough ago that I do have them, and still saw a problem.

Andrew
Andrew Stubbs March 15, 2024, 11:24 a.m. UTC | #4
On 15/03/2024 07:35, Richard Biener wrote:
> On Fri, Mar 15, 2024 at 4:35 AM Hongtao Liu <crazylht@gmail.com> wrote:
>>
>> On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs <ams@baylibre.com> wrote:
>>>
>>> Don't enable excess lanes when inverting vector bit-masks smaller than the
>>> integer mode.  This is yet another case of wrong-code due to mishandling
>>> of oversized bitmasks.
>>>
>>> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
>>> vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32
>>> (down from V64) on amdgcn.
>>>
>>> OK for mainline?
>>>
>>> Andrew
>>>
>>> gcc/ChangeLog:
>>>
>>>          * expr.cc (expand_expr_real_2): Use xor to invert vector masks.
>>> ---
>>>   gcc/expr.cc | 11 +++++++++++
>>>   1 file changed, 11 insertions(+)
>>>
>>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>>> index 403eeaa108e4..3540327d879e 100644
>>> --- a/gcc/expr.cc
>>> +++ b/gcc/expr.cc
>>> @@ -10497,6 +10497,17 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
>>>                                 immed_wide_int_const (mask, int_mode),
>>>                                 target, 1, OPTAB_LIB_WIDEN);
>>>          }
>>> +      /* If it's a vector mask don't enable excess bits.  */
>>> +      else if (VECTOR_BOOLEAN_TYPE_P (type)
>>> +              && SCALAR_INT_MODE_P (mode)
>>> +              && maybe_ne (GET_MODE_PRECISION (mode),
>>> +                           TYPE_VECTOR_SUBPARTS (type).to_constant ()))
>>> +       {
>>> +         auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
>>> +         temp = expand_binop (mode, xor_optab, op0,
>>> +                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
>>> +                              target, true, OPTAB_WIDEN);
>>> +       }
>> Not review, just curious, should the issue be fixed by the commit in PR113576.
>> Also wonder besides cbranch, excess land bits also matter?
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576#c35
> 
> Yes, you patch BIT_NOT but we decided to patch final compares.  Is it that
> we need to fixup every mask use in a .COND_* expansion as well?  If so
> we should do it there.

I thought that the "not" to "xor" change was nice and there was already 
code there for fixing bitfields, but OK, I take your point.

The .COND_* statements are handled as internal function calls that are 
expanded directly via the optab with no special cases for different call 
types. This is because the "expand_cond_*_optab_fn" functions just map 
straight to "expand_direct_optab_fn".... would that be the right place 
to insert a special case handler to insert "and" operations?

Andrew
Richard Biener March 15, 2024, 1:10 p.m. UTC | #5
On Fri, Mar 15, 2024 at 12:24 PM Andrew Stubbs <ams@baylibre.com> wrote:
>
> On 15/03/2024 07:35, Richard Biener wrote:
> > On Fri, Mar 15, 2024 at 4:35 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >>
> >> On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs <ams@baylibre.com> wrote:
> >>>
> >>> Don't enable excess lanes when inverting vector bit-masks smaller than the
> >>> integer mode.  This is yet another case of wrong-code due to mishandling
> >>> of oversized bitmasks.
> >>>
> >>> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
> >>> vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32
> >>> (down from V64) on amdgcn.
> >>>
> >>> OK for mainline?
> >>>
> >>> Andrew
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>          * expr.cc (expand_expr_real_2): Use xor to invert vector masks.
> >>> ---
> >>>   gcc/expr.cc | 11 +++++++++++
> >>>   1 file changed, 11 insertions(+)
> >>>
> >>> diff --git a/gcc/expr.cc b/gcc/expr.cc
> >>> index 403eeaa108e4..3540327d879e 100644
> >>> --- a/gcc/expr.cc
> >>> +++ b/gcc/expr.cc
> >>> @@ -10497,6 +10497,17 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
> >>>                                 immed_wide_int_const (mask, int_mode),
> >>>                                 target, 1, OPTAB_LIB_WIDEN);
> >>>          }
> >>> +      /* If it's a vector mask don't enable excess bits.  */
> >>> +      else if (VECTOR_BOOLEAN_TYPE_P (type)
> >>> +              && SCALAR_INT_MODE_P (mode)
> >>> +              && maybe_ne (GET_MODE_PRECISION (mode),
> >>> +                           TYPE_VECTOR_SUBPARTS (type).to_constant ()))
> >>> +       {
> >>> +         auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> >>> +         temp = expand_binop (mode, xor_optab, op0,
> >>> +                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
> >>> +                              target, true, OPTAB_WIDEN);
> >>> +       }
> >> Not review, just curious, should the issue be fixed by the commit in PR113576.
> >> Also wonder besides cbranch, excess land bits also matter?
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576#c35
> >
> > Yes, you patch BIT_NOT but we decided to patch final compares.  Is it that
> > we need to fixup every mask use in a .COND_* expansion as well?  If so
> > we should do it there.
>
> I thought that the "not" to "xor" change was nice and there was already
> code there for fixing bitfields, but OK, I take your point.
>
> The .COND_* statements are handled as internal function calls that are
> expanded directly via the optab with no special cases for different call
> types. This is because the "expand_cond_*_optab_fn" functions just map
> straight to "expand_direct_optab_fn".... would that be the right place
> to insert a special case handler to insert "and" operations?

Yes, I think in expand_fn_using_insn where we handle the "undefined" input
operands we want to handle the vector bool operands as well, masking
rhs_rtx accordingly.

Richard.

> Andrew
diff mbox series

Patch

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 403eeaa108e4..3540327d879e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10497,6 +10497,17 @@  expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 			       immed_wide_int_const (mask, int_mode),
 			       target, 1, OPTAB_LIB_WIDEN);
 	}
+      /* If it's a vector mask don't enable excess bits.  */
+      else if (VECTOR_BOOLEAN_TYPE_P (type)
+	       && SCALAR_INT_MODE_P (mode)
+	       && maybe_ne (GET_MODE_PRECISION (mode),
+			    TYPE_VECTOR_SUBPARTS (type).to_constant ()))
+	{
+	  auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
+	  temp = expand_binop (mode, xor_optab, op0,
+			       GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+			       target, true, OPTAB_WIDEN);
+	}
       else
 	temp = expand_unop (mode, one_cmpl_optab, op0, target, 1);
       gcc_assert (temp);