diff mbox series

[RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

Message ID 6d6de1d2-0c76-c425-ac4e-b06c39752487@linux.ibm.com
State New
Headers show
Series [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros | expand

Commit Message

Kewen.Lin June 24, 2020, 3:12 a.m. UTC
Hi,

When I am working on IFNs for vector with length, I noticed that the
current optab support query for mask_load/mask_store looks unexpected.
The mask_load/mask_store requires two modes for convert_optab query,
but the macros direct_mask_{load,store}_optab_supported_p uses
direct_optab_supported_p which asserts type pair should have the same mode.

I'm not sure whether we have some special reason here or just a typo,
since everything goes well now, mask_{load,store} optab check is mainly
handled by can_vec_mask_load_store_p.

But if we have some codes as below (eg: one checking for all IFNs finally)

  tree_pair types = direct_internal_fn_types (ifn, call);
  if(direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED) ...

It will cause ICE.

Does it make sense to fix it?

Thanks in advance!

BR,
Kewen
-----
gcc/ChangeLog:

	* internal-fn.c (direct_mask_load_optab_supported_p): Use
	convert_optab_supported_p instead of direct_optab_supported_p.
	(direct_mask_store_optab_supported_p): Likewise.

-----

Comments

Richard Sandiford June 24, 2020, 7:13 a.m. UTC | #1
"Kewen.Lin" <linkw@linux.ibm.com> writes:
> Hi,
>
> When I am working on IFNs for vector with length, I noticed that the
> current optab support query for mask_load/mask_store looks unexpected.
> The mask_load/mask_store requires two modes for convert_optab query,
> but the macros direct_mask_{load,store}_optab_supported_p uses
> direct_optab_supported_p which asserts type pair should have the same mode.
>
> I'm not sure whether we have some special reason here or just a typo,
> since everything goes well now, mask_{load,store} optab check is mainly
> handled by can_vec_mask_load_store_p.
>
> But if we have some codes as below (eg: one checking for all IFNs finally)
>
>   tree_pair types = direct_internal_fn_types (ifn, call);
>   if(direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED) ...
>
> It will cause ICE.
>
> Does it make sense to fix it?
>
> Thanks in advance!
>
> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
> 	* internal-fn.c (direct_mask_load_optab_supported_p): Use
> 	convert_optab_supported_p instead of direct_optab_supported_p.
> 	(direct_mask_store_optab_supported_p): Likewise.

OK, thanks.

I keep meaning to experiment with dropping the second mode from these optabs,
since it should be uniquely determined by the first.  That would make things
slightly simpler and more consistent with the other load/store IFNs.  But as
usual I've never found the time.

Richard
Richard Biener June 24, 2020, 7:39 a.m. UTC | #2
On Wed, Jun 24, 2020 at 9:13 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> "Kewen.Lin" <linkw@linux.ibm.com> writes:
> > Hi,
> >
> > When I am working on IFNs for vector with length, I noticed that the
> > current optab support query for mask_load/mask_store looks unexpected.
> > The mask_load/mask_store requires two modes for convert_optab query,
> > but the macros direct_mask_{load,store}_optab_supported_p uses
> > direct_optab_supported_p which asserts type pair should have the same mode.
> >
> > I'm not sure whether we have some special reason here or just a typo,
> > since everything goes well now, mask_{load,store} optab check is mainly
> > handled by can_vec_mask_load_store_p.
> >
> > But if we have some codes as below (eg: one checking for all IFNs finally)
> >
> >   tree_pair types = direct_internal_fn_types (ifn, call);
> >   if(direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED) ...
> >
> > It will cause ICE.
> >
> > Does it make sense to fix it?
> >
> > Thanks in advance!
> >
> > BR,
> > Kewen
> > -----
> > gcc/ChangeLog:
> >
> >       * internal-fn.c (direct_mask_load_optab_supported_p): Use
> >       convert_optab_supported_p instead of direct_optab_supported_p.
> >       (direct_mask_store_optab_supported_p): Likewise.
>
> OK, thanks.
>
> I keep meaning to experiment with dropping the second mode from these optabs,
> since it should be uniquely determined by the first.  That would make things
> slightly simpler and more consistent with the other load/store IFNs.  But as
> usual I've never found the time.

AVX512 would have V16SImode and SImode because the mask would have
an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
or however their mask registers look like.

Richard.

>
> Richard
Richard Sandiford June 24, 2020, 8:35 a.m. UTC | #3
Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Jun 24, 2020 at 9:13 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>> > Hi,
>> >
>> > When I am working on IFNs for vector with length, I noticed that the
>> > current optab support query for mask_load/mask_store looks unexpected.
>> > The mask_load/mask_store requires two modes for convert_optab query,
>> > but the macros direct_mask_{load,store}_optab_supported_p uses
>> > direct_optab_supported_p which asserts type pair should have the same mode.
>> >
>> > I'm not sure whether we have some special reason here or just a typo,
>> > since everything goes well now, mask_{load,store} optab check is mainly
>> > handled by can_vec_mask_load_store_p.
>> >
>> > But if we have some codes as below (eg: one checking for all IFNs finally)
>> >
>> >   tree_pair types = direct_internal_fn_types (ifn, call);
>> >   if(direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED) ...
>> >
>> > It will cause ICE.
>> >
>> > Does it make sense to fix it?
>> >
>> > Thanks in advance!
>> >
>> > BR,
>> > Kewen
>> > -----
>> > gcc/ChangeLog:
>> >
>> >       * internal-fn.c (direct_mask_load_optab_supported_p): Use
>> >       convert_optab_supported_p instead of direct_optab_supported_p.
>> >       (direct_mask_store_optab_supported_p): Likewise.
>>
>> OK, thanks.
>>
>> I keep meaning to experiment with dropping the second mode from these optabs,
>> since it should be uniquely determined by the first.  That would make things
>> slightly simpler and more consistent with the other load/store IFNs.  But as
>> usual I've never found the time.
>
> AVX512 would have V16SImode and SImode because the mask would have
> an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
> or however their mask registers look like.

But what I mean is, once you know the vector mode, there should only
be one “choice” of mask mode.  (I agree it might not be the same mode,
and isn't for SVE as well as AVX512.)  TARGET_GET_MASK_MODE and related
vectoriser masking code is based around the assumption that the mask
mode is a function of the vector mode.

This is similar to single-mode direct optabs like vec_pack and vec_unpack,
which operate on pairs of vector modes, but where one mode is uniquely
determined by the other.

In contrast, convert optabs are AIUI designed for cases where there are
two modes that can vary independently of one another (at least to some
extent), and so giving a single mode isn't enough to identity the operation.

Thanks,
Richard
Jakub Jelinek June 24, 2020, 8:46 a.m. UTC | #4
On Wed, Jun 24, 2020 at 09:35:42AM +0100, Richard Sandiford wrote:
> >> I keep meaning to experiment with dropping the second mode from these optabs,
> >> since it should be uniquely determined by the first.  That would make things
> >> slightly simpler and more consistent with the other load/store IFNs.  But as
> >> usual I've never found the time.
> >
> > AVX512 would have V16SImode and SImode because the mask would have
> > an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
> > or however their mask registers look like.
> 
> But what I mean is, once you know the vector mode, there should only
> be one “choice” of mask mode.  (I agree it might not be the same mode,
> and isn't for SVE as well as AVX512.)  TARGET_GET_MASK_MODE and related
> vectoriser masking code is based around the assumption that the mask
> mode is a function of the vector mode.

But on the same target the same vector mode (e.g. V16HImode) can have two
different mask modes (for AVX and non-AVX512VL V16HImode, for AVX512VL
HImode).  It is true that better both shouldn't appear in the same function,
but it would be better if at least in the optabs it is clearly spelled out.

> This is similar to single-mode direct optabs like vec_pack and vec_unpack,
> which operate on pairs of vector modes, but where one mode is uniquely
> determined by the other.

For those IMHO it would be better to use two modes, otherwise it is a pain.

	Jakub
Richard Sandiford June 24, 2020, 9:19 a.m. UTC | #5
Jakub Jelinek <jakub@redhat.com> writes:
> On Wed, Jun 24, 2020 at 09:35:42AM +0100, Richard Sandiford wrote:
>> >> I keep meaning to experiment with dropping the second mode from these optabs,
>> >> since it should be uniquely determined by the first.  That would make things
>> >> slightly simpler and more consistent with the other load/store IFNs.  But as
>> >> usual I've never found the time.
>> >
>> > AVX512 would have V16SImode and SImode because the mask would have
>> > an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
>> > or however their mask registers look like.
>> 
>> But what I mean is, once you know the vector mode, there should only
>> be one “choice” of mask mode.  (I agree it might not be the same mode,
>> and isn't for SVE as well as AVX512.)  TARGET_GET_MASK_MODE and related
>> vectoriser masking code is based around the assumption that the mask
>> mode is a function of the vector mode.
>
> But on the same target the same vector mode (e.g. V16HImode) can have two
> different mask modes (for AVX and non-AVX512VL V16HImode, for AVX512VL
> HImode).  It is true that better both shouldn't appear in the same function,
> but it would be better if at least in the optabs it is clearly spelled out.

Hmm, OK.

The reason for bringing this up is that we took the above shortcut
for things like mask_gather_load_optab.  That's a convert optab in
which one mode is the data vector mode and the other is the index
vector mode (which can vary independently of the data mode to some
extent).  If we allow multiple mask modes for the same pair of
index and data vector modes, we'd presumably need to add support
for three-mode optabs.

I don't know whether that would be a practical problem for AVX or not,
if it did switch to using ifns instead of built-in functions for
autovectorised gathers.

Richard
Jim Wilson June 25, 2020, 12:55 a.m. UTC | #6
On Wed, Jun 24, 2020 at 1:35 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
> > AVX512 would have V16SImode and SImode because the mask would have
> > an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
> > or however their mask registers look like.

RISC-V has 7 mask modes, currently.  Masks are defined to always fit
in one register.  So if you have a register group of 8 (mlen=8) and an
element length of 8-bits (elen=8) then you have 1 bit of mask per
element which completely fills one register.  If you have a register
group of 1 (mlen=1) and an element length of 64-bits (elen=64) then
you have 64-bits of mask per element which completely fills one
register.  So we need a mode for each possible way of tiling a
register with mask bits, which is 2**x for x=(0..6).

> But what I mean is, once you know the vector mode, there should only
> be one “choice” of mask mode.

This is true for RISC-V also.  The mask mode is determined by the
number of registers in the group and the element width, so there is
only one valid mask mode for each vector mode.

Jim
Andrew Waterman June 25, 2020, 1:29 a.m. UTC | #7
On Wed, Jun 24, 2020 at 5:56 PM Jim Wilson <jimw@sifive.com> wrote:
>
> On Wed, Jun 24, 2020 at 1:35 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> > Richard Biener <richard.guenther@gmail.com> writes:
> > > AVX512 would have V16SImode and SImode because the mask would have
> > > an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
> > > or however their mask registers look like.
>
> RISC-V has 7 mask modes, currently.  Masks are defined to always fit
> in one register.  So if you have a register group of 8 (mlen=8) and an
> element length of 8-bits (elen=8) then you have 1 bit of mask per
> element which completely fills one register.  If you have a register
> group of 1 (mlen=1) and an element length of 64-bits (elen=64) then
> you have 64-bits of mask per element which completely fills one
> register.  So we need a mode for each possible way of tiling a
> register with mask bits, which is 2**x for x=(0..6).

The more recent (and hopefully final) ISA design has element i masked
by mask bit i, irrespective of element size--i.e., mlen=1
unconditionally.  Sorry about the churn.

>
> > But what I mean is, once you know the vector mode, there should only
> > be one “choice” of mask mode.
>
> This is true for RISC-V also.  The mask mode is determined by the
> number of registers in the group and the element width, so there is
> only one valid mask mode for each vector mode.

And of course this is still true.

>
> Jim
diff mbox series

Patch

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index f9e851069a5..1e53ced60eb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3142,12 +3142,12 @@  multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_cond_unary_optab_supported_p direct_optab_supported_p
 #define direct_cond_binary_optab_supported_p direct_optab_supported_p
 #define direct_cond_ternary_optab_supported_p direct_optab_supported_p
-#define direct_mask_load_optab_supported_p direct_optab_supported_p
+#define direct_mask_load_optab_supported_p convert_optab_supported_p
 #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_gather_load_optab_supported_p convert_optab_supported_p
 #define direct_len_load_optab_supported_p direct_optab_supported_p
-#define direct_mask_store_optab_supported_p direct_optab_supported_p
+#define direct_mask_store_optab_supported_p convert_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p