diff mbox series

Make sure we're playing with integral modes before call extract_integral_bit_field.

Message ID 20210806033228.3270579-1-hongtao.liu@intel.com
State New
Headers show
Series Make sure we're playing with integral modes before call extract_integral_bit_field. | expand

Commit Message

liuhongt Aug. 6, 2021, 3:32 a.m. UTC
Hi:
---
OK, I think sth is amiss here upthread.  insv/extv do look like they
are designed
to work on integer modes (but docs do not say anything about this here).
In fact the caller of extract_bit_field_using_extv is named
extract_integral_bit_field.  Of course nothing seems to check what kind of
modes we're dealing with, but we're for example happily doing
expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
some integer mode and op0 is HFmode?  From the above I get it's
the other way around?  In that case we should wrap the
call to extract_integral_bit_field, extracting in an integer mode with the
same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
---
  This is a separate patch as a follow up of upper comments.
 
gcc/ChangeLog:

	* expmed.c (extract_bit_field_1): Wrap the call to
	extract_integral_bit_field, extracting in an integer mode with
	the same size as 'tmode' and then converting the result
	as (subreg:tmode (reg:imode)).

gcc/testsuite/ChangeLog:
	* gcc.target/i386/float16-5.c: New test.
---
 gcc/expmed.c                              | 19 +++++++++++++++++++
 gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c

Comments

Andrew Pinski Aug. 6, 2021, 3:44 a.m. UTC | #1
On Thu, Aug 5, 2021 at 8:33 PM liuhongt via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi:
> ---
> OK, I think sth is amiss here upthread.  insv/extv do look like they
> are designed
> to work on integer modes (but docs do not say anything about this here).
> In fact the caller of extract_bit_field_using_extv is named
> extract_integral_bit_field.  Of course nothing seems to check what kind of
> modes we're dealing with, but we're for example happily doing
> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> some integer mode and op0 is HFmode?  From the above I get it's
> the other way around?  In that case we should wrap the
> call to extract_integral_bit_field, extracting in an integer mode with the
> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).

This seems related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93235 .
I wonder why the fix for that did not help here.

Thanks,
Andrew Pinski

> ---
>   This is a separate patch as a follow up of upper comments.
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Wrap the call to
>         extract_integral_bit_field, extracting in an integer mode with
>         the same size as 'tmode' and then converting the result
>         as (subreg:tmode (reg:imode)).
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
> ---
>  gcc/expmed.c                              | 19 +++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
>  2 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
>
> diff --git a/gcc/expmed.c b/gcc/expmed.c
> index 3143f38e057..72790693ef0 100644
> --- a/gcc/expmed.c
> +++ b/gcc/expmed.c
> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>        op0_mode = opt_scalar_int_mode ();
>      }
>
> +  /* Make sure we are playing with integral modes.  Pun with subregs
> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> +     in extract_integral_bit_field.  */
> +  if (int_mode_for_mode (tmode).exists (&imode)
> +      && imode != tmode
> +      && imode != GET_MODE (op0))
> +    {
> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> +                                           bitsize.to_constant (),
> +                                           bitnum.to_constant (), unsignedp,
> +                                           NULL, imode, imode,
> +                                           reverse, fallback_p);
> +      gcc_assert (ret);
> +
> +      if (!REG_P (ret))
> +       ret = force_reg (imode, ret);
> +      return gen_lowpart_SUBREG (tmode, ret);
> +    }
> +
>    /* It's possible we'll need to handle other cases here for
>       polynomial bitnum and bitsize.  */
>
> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> new file mode 100644
> index 00000000000..ebc0af1490b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse2 -O2" } */
> +_Float16
> +foo (int a)
> +{
> +  union {
> +    int a;
> +    _Float16 b;
> +  }c;
> +  c.a = a;
> +  return c.b;
> +}
> --
> 2.27.0
>
Hongtao Liu Aug. 6, 2021, 4:59 a.m. UTC | #2
On Fri, Aug 6, 2021 at 11:44 AM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Aug 5, 2021 at 8:33 PM liuhongt via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi:
> > ---
> > OK, I think sth is amiss here upthread.  insv/extv do look like they
> > are designed
> > to work on integer modes (but docs do not say anything about this here).
> > In fact the caller of extract_bit_field_using_extv is named
> > extract_integral_bit_field.  Of course nothing seems to check what kind of
> > modes we're dealing with, but we're for example happily doing
> > expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > some integer mode and op0 is HFmode?  From the above I get it's
> > the other way around?  In that case we should wrap the
> > call to extract_integral_bit_field, extracting in an integer mode with the
> > same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
>
> This seems related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93235 .
> I wonder why the fix for that did not help here.
>
aarch64 didn't hit gcc_assert with my testcase, and I debugged it to
figure out why.

in gimple level, both x86 and aarch64 is the same with
_3 = BIT_FIELD_REF <a_2(D), 16, 0>;

and they all goes into
extract_bit_field_using_extv

The difference is aarch64 has ext_mode as DImode, but x86 has ext_mode
as SImode.
with ext_mode as DImode and target as (reg:HF 94), aarch64 doesn't hit
gcc_assert in
 gen_lowpart (ext_mode, target)

since validate_subreg allow (subreg:DI (reg:HF)), but disallow
(subreg:SI (reg:HF)).

  /* ??? This should not be here.  Temporarily continue to allow word_mode
     subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
     Generally, backends are doing something sketchy but it'll take time to
     fix them all.  */
  if (omode == word_mode)
    ;

ext_mode is assigned from extv->field mode which is initialized in
get_best_reg_extraction_insn.
get_best_reg_extraction_insn will finally call
get_optab_extraction_insn and find
aarch64 doesn't have CODE_FOR_extzvsi but x86 has.

That's why aarch64 has ext_mode as DImode and x86 SImode.

> Thanks,
> Andrew Pinski
>
> > ---
> >   This is a separate patch as a follow up of upper comments.
> >
> > gcc/ChangeLog:
> >
> >         * expmed.c (extract_bit_field_1): Wrap the call to
> >         extract_integral_bit_field, extracting in an integer mode with
> >         the same size as 'tmode' and then converting the result
> >         as (subreg:tmode (reg:imode)).
> >
> > gcc/testsuite/ChangeLog:
> >         * gcc.target/i386/float16-5.c: New test.
> > ---
> >  gcc/expmed.c                              | 19 +++++++++++++++++++
> >  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> >  2 files changed, 31 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> >
> > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > index 3143f38e057..72790693ef0 100644
> > --- a/gcc/expmed.c
> > +++ b/gcc/expmed.c
> > @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> >        op0_mode = opt_scalar_int_mode ();
> >      }
> >
> > +  /* Make sure we are playing with integral modes.  Pun with subregs
> > +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > +     in extract_integral_bit_field.  */
> > +  if (int_mode_for_mode (tmode).exists (&imode)
> > +      && imode != tmode
> > +      && imode != GET_MODE (op0))
> > +    {
> > +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > +                                           bitsize.to_constant (),
> > +                                           bitnum.to_constant (), unsignedp,
> > +                                           NULL, imode, imode,
> > +                                           reverse, fallback_p);
> > +      gcc_assert (ret);
> > +
> > +      if (!REG_P (ret))
> > +       ret = force_reg (imode, ret);
> > +      return gen_lowpart_SUBREG (tmode, ret);
> > +    }
> > +
> >    /* It's possible we'll need to handle other cases here for
> >       polynomial bitnum and bitsize.  */
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > new file mode 100644
> > index 00000000000..ebc0af1490b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-msse2 -O2" } */
> > +_Float16
> > +foo (int a)
> > +{
> > +  union {
> > +    int a;
> > +    _Float16 b;
> > +  }c;
> > +  c.a = a;
> > +  return c.b;
> > +}
> > --
> > 2.27.0
> >
Hongtao Liu Aug. 6, 2021, 5:52 a.m. UTC | #3
On Fri, Aug 6, 2021 at 12:59 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 11:44 AM Andrew Pinski via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Thu, Aug 5, 2021 at 8:33 PM liuhongt via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Hi:
> > > ---
> > > OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > are designed
> > > to work on integer modes (but docs do not say anything about this here).
> > > In fact the caller of extract_bit_field_using_extv is named
> > > extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > modes we're dealing with, but we're for example happily doing
> > > expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > some integer mode and op0 is HFmode?  From the above I get it's
> > > the other way around?  In that case we should wrap the
> > > call to extract_integral_bit_field, extracting in an integer mode with the
> > > same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> >
> > This seems related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93235 .
CC jakub.
> > I wonder why the fix for that did not help here.
> >
> aarch64 didn't hit gcc_assert with my testcase, and I debugged it to
> figure out why.
>
> in gimple level, both x86 and aarch64 is the same with
> _3 = BIT_FIELD_REF <a_2(D), 16, 0>;
>
> and they all goes into
> extract_bit_field_using_extv
>
> The difference is aarch64 has ext_mode as DImode, but x86 has ext_mode
> as SImode.
> with ext_mode as DImode and target as (reg:HF 94), aarch64 doesn't hit
> gcc_assert in
>  gen_lowpart (ext_mode, target)
>
> since validate_subreg allow (subreg:DI (reg:HF)), but disallow
> (subreg:SI (reg:HF)).
>
>   /* ??? This should not be here.  Temporarily continue to allow word_mode
>      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
>      Generally, backends are doing something sketchy but it'll take time to
>      fix them all.  */
>   if (omode == word_mode)
>     ;
>
> ext_mode is assigned from extv->field mode which is initialized in
> get_best_reg_extraction_insn.
> get_best_reg_extraction_insn will finally call
> get_optab_extraction_insn and find
> aarch64 doesn't have CODE_FOR_extzvsi but x86 has.
>
> That's why aarch64 has ext_mode as DImode and x86 SImode.
>
> > Thanks,
> > Andrew Pinski
> >
> > > ---
> > >   This is a separate patch as a follow up of upper comments.
> > >
> > > gcc/ChangeLog:
> > >
> > >         * expmed.c (extract_bit_field_1): Wrap the call to
> > >         extract_integral_bit_field, extracting in an integer mode with
> > >         the same size as 'tmode' and then converting the result
> > >         as (subreg:tmode (reg:imode)).
> > >
> > > gcc/testsuite/ChangeLog:
> > >         * gcc.target/i386/float16-5.c: New test.
> > > ---
> > >  gcc/expmed.c                              | 19 +++++++++++++++++++
> > >  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > >  2 files changed, 31 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > >
> > > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > index 3143f38e057..72790693ef0 100644
> > > --- a/gcc/expmed.c
> > > +++ b/gcc/expmed.c
> > > @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > >        op0_mode = opt_scalar_int_mode ();
> > >      }
> > >
> > > +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > +     in extract_integral_bit_field.  */
> > > +  if (int_mode_for_mode (tmode).exists (&imode)
> > > +      && imode != tmode
> > > +      && imode != GET_MODE (op0))
> > > +    {
> > > +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > +                                           bitsize.to_constant (),
> > > +                                           bitnum.to_constant (), unsignedp,
> > > +                                           NULL, imode, imode,
> > > +                                           reverse, fallback_p);
> > > +      gcc_assert (ret);
> > > +
> > > +      if (!REG_P (ret))
> > > +       ret = force_reg (imode, ret);
> > > +      return gen_lowpart_SUBREG (tmode, ret);
> > > +    }
> > > +
> > >    /* It's possible we'll need to handle other cases here for
> > >       polynomial bitnum and bitsize.  */
> > >
> > > diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > new file mode 100644
> > > index 00000000000..ebc0af1490b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > @@ -0,0 +1,12 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-msse2 -O2" } */
> > > +_Float16
> > > +foo (int a)
> > > +{
> > > +  union {
> > > +    int a;
> > > +    _Float16 b;
> > > +  }c;
> > > +  c.a = a;
> > > +  return c.b;
> > > +}
> > > --
> > > 2.27.0
> > >
>
>
>
> --
> BR,
> Hongtao
Richard Biener Aug. 6, 2021, 6:57 a.m. UTC | #4
On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> Hi:
> ---
> OK, I think sth is amiss here upthread.  insv/extv do look like they
> are designed
> to work on integer modes (but docs do not say anything about this here).
> In fact the caller of extract_bit_field_using_extv is named
> extract_integral_bit_field.  Of course nothing seems to check what kind of
> modes we're dealing with, but we're for example happily doing
> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> some integer mode and op0 is HFmode?  From the above I get it's
> the other way around?  In that case we should wrap the
> call to extract_integral_bit_field, extracting in an integer mode with the
> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> ---
>   This is a separate patch as a follow up of upper comments.
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Wrap the call to
>         extract_integral_bit_field, extracting in an integer mode with
>         the same size as 'tmode' and then converting the result
>         as (subreg:tmode (reg:imode)).
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
> ---
>  gcc/expmed.c                              | 19 +++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
>  2 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
>
> diff --git a/gcc/expmed.c b/gcc/expmed.c
> index 3143f38e057..72790693ef0 100644
> --- a/gcc/expmed.c
> +++ b/gcc/expmed.c
> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>        op0_mode = opt_scalar_int_mode ();
>      }
>
> +  /* Make sure we are playing with integral modes.  Pun with subregs
> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> +     in extract_integral_bit_field.  */
> +  if (int_mode_for_mode (tmode).exists (&imode)

check !INTEGRAL_MODE_P (tmode) before, that should be slightly
cheaper.  Then imode should always be != tmode.  Maybe
even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
how it behaves for composite modes.

Of course the least surprises would happen when we restrict this
to FLOAT_MODE_P (tmode).

Richard - any preferences?

> +      && imode != tmode
> +      && imode != GET_MODE (op0))
> +    {
> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> +                                           bitsize.to_constant (),
> +                                           bitnum.to_constant (), unsignedp,
> +                                           NULL, imode, imode,
> +                                           reverse, fallback_p);
> +      gcc_assert (ret);
> +
> +      if (!REG_P (ret))
> +       ret = force_reg (imode, ret);
> +      return gen_lowpart_SUBREG (tmode, ret);
> +    }
> +
>    /* It's possible we'll need to handle other cases here for
>       polynomial bitnum and bitsize.  */
>
> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> new file mode 100644
> index 00000000000..ebc0af1490b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse2 -O2" } */
> +_Float16
> +foo (int a)
> +{
> +  union {
> +    int a;
> +    _Float16 b;
> +  }c;
> +  c.a = a;
> +  return c.b;
> +}
> --
> 2.27.0
>
Richard Biener Aug. 6, 2021, 6:59 a.m. UTC | #5
On Fri, Aug 6, 2021 at 6:54 AM Hongtao Liu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Aug 6, 2021 at 11:44 AM Andrew Pinski via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Thu, Aug 5, 2021 at 8:33 PM liuhongt via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Hi:
> > > ---
> > > OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > are designed
> > > to work on integer modes (but docs do not say anything about this here).
> > > In fact the caller of extract_bit_field_using_extv is named
> > > extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > modes we're dealing with, but we're for example happily doing
> > > expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > some integer mode and op0 is HFmode?  From the above I get it's
> > > the other way around?  In that case we should wrap the
> > > call to extract_integral_bit_field, extracting in an integer mode with the
> > > same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> >
> > This seems related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93235 .
> > I wonder why the fix for that did not help here.
> >
> aarch64 didn't hit gcc_assert with my testcase, and I debugged it to
> figure out why.
>
> in gimple level, both x86 and aarch64 is the same with
> _3 = BIT_FIELD_REF <a_2(D), 16, 0>;
>
> and they all goes into
> extract_bit_field_using_extv
>
> The difference is aarch64 has ext_mode as DImode, but x86 has ext_mode
> as SImode.
> with ext_mode as DImode and target as (reg:HF 94), aarch64 doesn't hit
> gcc_assert in
>  gen_lowpart (ext_mode, target)
>
> since validate_subreg allow (subreg:DI (reg:HF)), but disallow
> (subreg:SI (reg:HF)).
>
>   /* ??? This should not be here.  Temporarily continue to allow word_mode
>      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
>      Generally, backends are doing something sketchy but it'll take time to
>      fix them all.  */
>   if (omode == word_mode)
>     ;

Yeah, as said all this verification looks dubious.  This one especially so.

> ext_mode is assigned from extv->field mode which is initialized in
> get_best_reg_extraction_insn.
> get_best_reg_extraction_insn will finally call
> get_optab_extraction_insn and find
> aarch64 doesn't have CODE_FOR_extzvsi but x86 has.
>
> That's why aarch64 has ext_mode as DImode and x86 SImode.
>
> > Thanks,
> > Andrew Pinski
> >
> > > ---
> > >   This is a separate patch as a follow up of upper comments.
> > >
> > > gcc/ChangeLog:
> > >
> > >         * expmed.c (extract_bit_field_1): Wrap the call to
> > >         extract_integral_bit_field, extracting in an integer mode with
> > >         the same size as 'tmode' and then converting the result
> > >         as (subreg:tmode (reg:imode)).
> > >
> > > gcc/testsuite/ChangeLog:
> > >         * gcc.target/i386/float16-5.c: New test.
> > > ---
> > >  gcc/expmed.c                              | 19 +++++++++++++++++++
> > >  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > >  2 files changed, 31 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > >
> > > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > index 3143f38e057..72790693ef0 100644
> > > --- a/gcc/expmed.c
> > > +++ b/gcc/expmed.c
> > > @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > >        op0_mode = opt_scalar_int_mode ();
> > >      }
> > >
> > > +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > +     in extract_integral_bit_field.  */
> > > +  if (int_mode_for_mode (tmode).exists (&imode)
> > > +      && imode != tmode
> > > +      && imode != GET_MODE (op0))
> > > +    {
> > > +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > +                                           bitsize.to_constant (),
> > > +                                           bitnum.to_constant (), unsignedp,
> > > +                                           NULL, imode, imode,
> > > +                                           reverse, fallback_p);
> > > +      gcc_assert (ret);
> > > +
> > > +      if (!REG_P (ret))
> > > +       ret = force_reg (imode, ret);
> > > +      return gen_lowpart_SUBREG (tmode, ret);
> > > +    }
> > > +
> > >    /* It's possible we'll need to handle other cases here for
> > >       polynomial bitnum and bitsize.  */
> > >
> > > diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > new file mode 100644
> > > index 00000000000..ebc0af1490b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > @@ -0,0 +1,12 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-msse2 -O2" } */
> > > +_Float16
> > > +foo (int a)
> > > +{
> > > +  union {
> > > +    int a;
> > > +    _Float16 b;
> > > +  }c;
> > > +  c.a = a;
> > > +  return c.b;
> > > +}
> > > --
> > > 2.27.0
> > >
>
>
>
> --
> BR,
> Hongtao
Richard Sandiford Aug. 6, 2021, 9:05 a.m. UTC | #6
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
>>
>> Hi:
>> ---
>> OK, I think sth is amiss here upthread.  insv/extv do look like they
>> are designed
>> to work on integer modes (but docs do not say anything about this here).
>> In fact the caller of extract_bit_field_using_extv is named
>> extract_integral_bit_field.  Of course nothing seems to check what kind of
>> modes we're dealing with, but we're for example happily doing
>> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
>> some integer mode and op0 is HFmode?  From the above I get it's
>> the other way around?  In that case we should wrap the
>> call to extract_integral_bit_field, extracting in an integer mode with the
>> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
>> ---
>>   This is a separate patch as a follow up of upper comments.
>>
>> gcc/ChangeLog:
>>
>>         * expmed.c (extract_bit_field_1): Wrap the call to
>>         extract_integral_bit_field, extracting in an integer mode with
>>         the same size as 'tmode' and then converting the result
>>         as (subreg:tmode (reg:imode)).
>>
>> gcc/testsuite/ChangeLog:
>>         * gcc.target/i386/float16-5.c: New test.
>> ---
>>  gcc/expmed.c                              | 19 +++++++++++++++++++
>>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
>>  2 files changed, 31 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
>>
>> diff --git a/gcc/expmed.c b/gcc/expmed.c
>> index 3143f38e057..72790693ef0 100644
>> --- a/gcc/expmed.c
>> +++ b/gcc/expmed.c
>> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>>        op0_mode = opt_scalar_int_mode ();
>>      }
>>
>> +  /* Make sure we are playing with integral modes.  Pun with subregs
>> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
>> +     in extract_integral_bit_field.  */
>> +  if (int_mode_for_mode (tmode).exists (&imode)
>
> check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> cheaper.  Then imode should always be != tmode.  Maybe
> even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> how it behaves for composite modes.
>
> Of course the least surprises would happen when we restrict this
> to FLOAT_MODE_P (tmode).
>
> Richard - any preferences?

If the bug is that extract_integral_bit_field is being called with
a non-integral mode parameter, then it looks odd that we can still
fall through to it without an integral mode (when exists is false).

If calling extract_integral_bit_field without an integral mode is
a bug then I think we should have:

  int_mode_for_mode (mode).require ()

whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
Ideally we'd make the mode parameter scalar_int_mode too.

extract_integral_bit_field currently has:

  /* Find a correspondingly-sized integer field, so we can apply
     shifts and masks to it.  */
  scalar_int_mode int_mode;
  if (!int_mode_for_mode (tmode).exists (&int_mode))
    /* If this fails, we should probably push op0 out to memory and then
       do a load.  */
    int_mode = int_mode_for_mode (mode).require ();

which would seem to be redundant after this change.

>> +      && imode != tmode
>> +      && imode != GET_MODE (op0))
>> +    {
>> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
>> +                                           bitsize.to_constant (),
>> +                                           bitnum.to_constant (), unsignedp,
>> +                                           NULL, imode, imode,
>> +                                           reverse, fallback_p);
>> +      gcc_assert (ret);
>> +
>> +      if (!REG_P (ret))
>> +       ret = force_reg (imode, ret);
>> +      return gen_lowpart_SUBREG (tmode, ret);
>> +    }
>> +
>>    /* It's possible we'll need to handle other cases here for
>>       polynomial bitnum and bitsize.  */

Minor nit, but since the code is using to_constant, it should go after
this comment rather than before it.

Thanks,
Richard

>>
>> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
>> new file mode 100644
>> index 00000000000..ebc0af1490b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-msse2 -O2" } */
>> +_Float16
>> +foo (int a)
>> +{
>> +  union {
>> +    int a;
>> +    _Float16 b;
>> +  }c;
>> +  c.a = a;
>> +  return c.b;
>> +}
>> --
>> 2.27.0
>>
Richard Biener Aug. 6, 2021, 11:27 a.m. UTC | #7
On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> >>
> >> Hi:
> >> ---
> >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> >> are designed
> >> to work on integer modes (but docs do not say anything about this here).
> >> In fact the caller of extract_bit_field_using_extv is named
> >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> >> modes we're dealing with, but we're for example happily doing
> >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> >> some integer mode and op0 is HFmode?  From the above I get it's
> >> the other way around?  In that case we should wrap the
> >> call to extract_integral_bit_field, extracting in an integer mode with the
> >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> >> ---
> >>   This is a separate patch as a follow up of upper comments.
> >>
> >> gcc/ChangeLog:
> >>
> >>         * expmed.c (extract_bit_field_1): Wrap the call to
> >>         extract_integral_bit_field, extracting in an integer mode with
> >>         the same size as 'tmode' and then converting the result
> >>         as (subreg:tmode (reg:imode)).
> >>
> >> gcc/testsuite/ChangeLog:
> >>         * gcc.target/i386/float16-5.c: New test.
> >> ---
> >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> >>  2 files changed, 31 insertions(+)
> >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> >>
> >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> >> index 3143f38e057..72790693ef0 100644
> >> --- a/gcc/expmed.c
> >> +++ b/gcc/expmed.c
> >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> >>        op0_mode = opt_scalar_int_mode ();
> >>      }
> >>
> >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> >> +     in extract_integral_bit_field.  */
> >> +  if (int_mode_for_mode (tmode).exists (&imode)
> >
> > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > cheaper.  Then imode should always be != tmode.  Maybe
> > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > how it behaves for composite modes.
> >
> > Of course the least surprises would happen when we restrict this
> > to FLOAT_MODE_P (tmode).
> >
> > Richard - any preferences?
>
> If the bug is that extract_integral_bit_field is being called with
> a non-integral mode parameter, then it looks odd that we can still
> fall through to it without an integral mode (when exists is false).
>
> If calling extract_integral_bit_field without an integral mode is
> a bug then I think we should have:
>
>   int_mode_for_mode (mode).require ()
>
> whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> Ideally we'd make the mode parameter scalar_int_mode too.
>
> extract_integral_bit_field currently has:
>
>   /* Find a correspondingly-sized integer field, so we can apply
>      shifts and masks to it.  */
>   scalar_int_mode int_mode;
>   if (!int_mode_for_mode (tmode).exists (&int_mode))
>     /* If this fails, we should probably push op0 out to memory and then
>        do a load.  */
>     int_mode = int_mode_for_mode (mode).require ();
>
> which would seem to be redundant after this change.

I'm not sure what exactly the bug is, but extract_integral_bit_field ends
up creating a lowpart subreg that's not allowed and that ICEs (and I
can't see a way to check beforehand).  So it seems to me at least
part of that function doesn't expect non-integral extraction modes.

But who knows - the code is older than I am (OK, not, but older than
my involvment in GCC ;))

Richard.

> >> +      && imode != tmode
> >> +      && imode != GET_MODE (op0))
> >> +    {
> >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> >> +                                           bitsize.to_constant (),
> >> +                                           bitnum.to_constant (), unsignedp,
> >> +                                           NULL, imode, imode,
> >> +                                           reverse, fallback_p);
> >> +      gcc_assert (ret);
> >> +
> >> +      if (!REG_P (ret))
> >> +       ret = force_reg (imode, ret);
> >> +      return gen_lowpart_SUBREG (tmode, ret);
> >> +    }
> >> +
> >>    /* It's possible we'll need to handle other cases here for
> >>       polynomial bitnum and bitsize.  */
>
> Minor nit, but since the code is using to_constant, it should go after
> this comment rather than before it.
>
> Thanks,
> Richard
>
> >>
> >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> >> new file mode 100644
> >> index 00000000000..ebc0af1490b
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> >> @@ -0,0 +1,12 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-msse2 -O2" } */
> >> +_Float16
> >> +foo (int a)
> >> +{
> >> +  union {
> >> +    int a;
> >> +    _Float16 b;
> >> +  }c;
> >> +  c.a = a;
> >> +  return c.b;
> >> +}
> >> --
> >> 2.27.0
> >>
Hongtao Liu Aug. 9, 2021, 8:34 a.m. UTC | #8
On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> >
> > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> > >>
> > >> Hi:
> > >> ---
> > >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> > >> are designed
> > >> to work on integer modes (but docs do not say anything about this here).
> > >> In fact the caller of extract_bit_field_using_extv is named
> > >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> > >> modes we're dealing with, but we're for example happily doing
> > >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > >> some integer mode and op0 is HFmode?  From the above I get it's
> > >> the other way around?  In that case we should wrap the
> > >> call to extract_integral_bit_field, extracting in an integer mode with the
> > >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> > >> ---
> > >>   This is a separate patch as a follow up of upper comments.
> > >>
> > >> gcc/ChangeLog:
> > >>
> > >>         * expmed.c (extract_bit_field_1): Wrap the call to
> > >>         extract_integral_bit_field, extracting in an integer mode with
> > >>         the same size as 'tmode' and then converting the result
> > >>         as (subreg:tmode (reg:imode)).
> > >>
> > >> gcc/testsuite/ChangeLog:
> > >>         * gcc.target/i386/float16-5.c: New test.
> > >> ---
> > >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> > >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > >>  2 files changed, 31 insertions(+)
> > >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > >>
> > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > >> index 3143f38e057..72790693ef0 100644
> > >> --- a/gcc/expmed.c
> > >> +++ b/gcc/expmed.c
> > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > >>        op0_mode = opt_scalar_int_mode ();
> > >>      }
> > >>
> > >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> > >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > >> +     in extract_integral_bit_field.  */
> > >> +  if (int_mode_for_mode (tmode).exists (&imode)
> > >
> > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > cheaper.  Then imode should always be != tmode.  Maybe
> > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > how it behaves for composite modes.
> > >
> > > Of course the least surprises would happen when we restrict this
> > > to FLOAT_MODE_P (tmode).
> > >
> > > Richard - any preferences?
> >
> > If the bug is that extract_integral_bit_field is being called with
> > a non-integral mode parameter, then it looks odd that we can still
> > fall through to it without an integral mode (when exists is false).
> >
> > If calling extract_integral_bit_field without an integral mode is
> > a bug then I think we should have:
> >
> >   int_mode_for_mode (mode).require ()
> >
> > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > Ideally we'd make the mode parameter scalar_int_mode too.
> >
> > extract_integral_bit_field currently has:
> >
> >   /* Find a correspondingly-sized integer field, so we can apply
> >      shifts and masks to it.  */
> >   scalar_int_mode int_mode;
> >   if (!int_mode_for_mode (tmode).exists (&int_mode))
> >     /* If this fails, we should probably push op0 out to memory and then
> >        do a load.  */
> >     int_mode = int_mode_for_mode (mode).require ();
> >
> > which would seem to be redundant after this change.
>
> I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> up creating a lowpart subreg that's not allowed and that ICEs (and I
> can't see a way to check beforehand).  So it seems to me at least
> part of that function doesn't expect non-integral extraction modes.
>
> But who knows - the code is older than I am (OK, not, but older than
> my involvment in GCC ;))
>
How about attached patch w/ below changelog

gcc/ChangeLog:

        * expmed.c (extract_bit_field_1): Make sure we're playing with
        integral modes before call extract_integral_bit_field.
        (extract_integral_bit_field): Add a parameter of type
        scalar_int_mode which corresponds to of tmode.
        And call extract_and_convert_fixed_bit_field instead of
        extract_fixed_bit_field and convert_extracted_bit_field.
        (extract_and_convert_fixed_bit_field): New function, it's a
        combination of extract_fixed_bit_field and
        convert_extracted_bit_field.

gcc/testsuite/ChangeLog:
        * gcc.target/i386/float16-5.c: New test.


> Richard.
>
> > >> +      && imode != tmode
> > >> +      && imode != GET_MODE (op0))
> > >> +    {
> > >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > >> +                                           bitsize.to_constant (),
> > >> +                                           bitnum.to_constant (), unsignedp,
> > >> +                                           NULL, imode, imode,
> > >> +                                           reverse, fallback_p);
> > >> +      gcc_assert (ret);
> > >> +
> > >> +      if (!REG_P (ret))
> > >> +       ret = force_reg (imode, ret);
> > >> +      return gen_lowpart_SUBREG (tmode, ret);
> > >> +    }
> > >> +
> > >>    /* It's possible we'll need to handle other cases here for
> > >>       polynomial bitnum and bitsize.  */
> >
> > Minor nit, but since the code is using to_constant, it should go after
> > this comment rather than before it.
> >
> > Thanks,
> > Richard
> >
> > >>
> > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > >> new file mode 100644
> > >> index 00000000000..ebc0af1490b
> > >> --- /dev/null
> > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > >> @@ -0,0 +1,12 @@
> > >> +/* { dg-do compile } */
> > >> +/* { dg-options "-msse2 -O2" } */
> > >> +_Float16
> > >> +foo (int a)
> > >> +{
> > >> +  union {
> > >> +    int a;
> > >> +    _Float16 b;
> > >> +  }c;
> > >> +  c.a = a;
> > >> +  return c.b;
> > >> +}
> > >> --
> > >> 2.27.0
> > >>
Hongtao Liu Aug. 17, 2021, 1:52 a.m. UTC | #9
On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> > >
> > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > >>
> > > >> Hi:
> > > >> ---
> > > >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > >> are designed
> > > >> to work on integer modes (but docs do not say anything about this here).
> > > >> In fact the caller of extract_bit_field_using_extv is named
> > > >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > >> modes we're dealing with, but we're for example happily doing
> > > >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > >> some integer mode and op0 is HFmode?  From the above I get it's
> > > >> the other way around?  In that case we should wrap the
> > > >> call to extract_integral_bit_field, extracting in an integer mode with the
> > > >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> > > >> ---
> > > >>   This is a separate patch as a follow up of upper comments.
> > > >>
> > > >> gcc/ChangeLog:
> > > >>
> > > >>         * expmed.c (extract_bit_field_1): Wrap the call to
> > > >>         extract_integral_bit_field, extracting in an integer mode with
> > > >>         the same size as 'tmode' and then converting the result
> > > >>         as (subreg:tmode (reg:imode)).
> > > >>
> > > >> gcc/testsuite/ChangeLog:
> > > >>         * gcc.target/i386/float16-5.c: New test.
> > > >> ---
> > > >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> > > >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > > >>  2 files changed, 31 insertions(+)
> > > >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > > >>
> > > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > >> index 3143f38e057..72790693ef0 100644
> > > >> --- a/gcc/expmed.c
> > > >> +++ b/gcc/expmed.c
> > > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > >>        op0_mode = opt_scalar_int_mode ();
> > > >>      }
> > > >>
> > > >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > >> +     in extract_integral_bit_field.  */
> > > >> +  if (int_mode_for_mode (tmode).exists (&imode)
> > > >
> > > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > > cheaper.  Then imode should always be != tmode.  Maybe
> > > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > > how it behaves for composite modes.
> > > >
> > > > Of course the least surprises would happen when we restrict this
> > > > to FLOAT_MODE_P (tmode).
> > > >
> > > > Richard - any preferences?
> > >
> > > If the bug is that extract_integral_bit_field is being called with
> > > a non-integral mode parameter, then it looks odd that we can still
> > > fall through to it without an integral mode (when exists is false).
> > >
> > > If calling extract_integral_bit_field without an integral mode is
> > > a bug then I think we should have:
> > >
> > >   int_mode_for_mode (mode).require ()
> > >
> > > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > > Ideally we'd make the mode parameter scalar_int_mode too.
> > >
> > > extract_integral_bit_field currently has:
> > >
> > >   /* Find a correspondingly-sized integer field, so we can apply
> > >      shifts and masks to it.  */
> > >   scalar_int_mode int_mode;
> > >   if (!int_mode_for_mode (tmode).exists (&int_mode))
> > >     /* If this fails, we should probably push op0 out to memory and then
> > >        do a load.  */
> > >     int_mode = int_mode_for_mode (mode).require ();
> > >
> > > which would seem to be redundant after this change.
> >
> > I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> > up creating a lowpart subreg that's not allowed and that ICEs (and I
> > can't see a way to check beforehand).  So it seems to me at least
> > part of that function doesn't expect non-integral extraction modes.
> >
> > But who knows - the code is older than I am (OK, not, but older than
> > my involvment in GCC ;))
> >
> How about attached patch w/ below changelog
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Make sure we're playing with
>         integral modes before call extract_integral_bit_field.
>         (extract_integral_bit_field): Add a parameter of type
>         scalar_int_mode which corresponds to of tmode.
>         And call extract_and_convert_fixed_bit_field instead of
>         extract_fixed_bit_field and convert_extracted_bit_field.
>         (extract_and_convert_fixed_bit_field): New function, it's a
>         combination of extract_fixed_bit_field and
>         convert_extracted_bit_field.
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
>
I'd like to ping this patch, or maybe we can use the patch before with
richi's comments.
>
> > Richard.
> >
> > > >> +      && imode != tmode
> > > >> +      && imode != GET_MODE (op0))
> > > >> +    {
> > > >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > >> +                                           bitsize.to_constant (),
> > > >> +                                           bitnum.to_constant (), unsignedp,
> > > >> +                                           NULL, imode, imode,
> > > >> +                                           reverse, fallback_p);
> > > >> +      gcc_assert (ret);
> > > >> +
> > > >> +      if (!REG_P (ret))
> > > >> +       ret = force_reg (imode, ret);
> > > >> +      return gen_lowpart_SUBREG (tmode, ret);
> > > >> +    }
> > > >> +
> > > >>    /* It's possible we'll need to handle other cases here for
> > > >>       polynomial bitnum and bitsize.  */
> > >
> > > Minor nit, but since the code is using to_constant, it should go after
> > > this comment rather than before it.
> > >
> > > Thanks,
> > > Richard
> > >
> > > >>
> > > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > >> new file mode 100644
> > > >> index 00000000000..ebc0af1490b
> > > >> --- /dev/null
> > > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > >> @@ -0,0 +1,12 @@
> > > >> +/* { dg-do compile } */
> > > >> +/* { dg-options "-msse2 -O2" } */
> > > >> +_Float16
> > > >> +foo (int a)
> > > >> +{
> > > >> +  union {
> > > >> +    int a;
> > > >> +    _Float16 b;
> > > >> +  }c;
> > > >> +  c.a = a;
> > > >> +  return c.b;
> > > >> +}
> > > >> --
> > > >> 2.27.0
> > > >>
>
>
>
> --
> BR,
> Hongtao
Hongtao Liu Aug. 24, 2021, 9:40 a.m. UTC | #10
On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > > <richard.sandiford@arm.com> wrote:
> > > >
> > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > >>
> > > > >> Hi:
> > > > >> ---
> > > > >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > > >> are designed
> > > > >> to work on integer modes (but docs do not say anything about this here).
> > > > >> In fact the caller of extract_bit_field_using_extv is named
> > > > >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > > >> modes we're dealing with, but we're for example happily doing
> > > > >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > > >> some integer mode and op0 is HFmode?  From the above I get it's
> > > > >> the other way around?  In that case we should wrap the
> > > > >> call to extract_integral_bit_field, extracting in an integer mode with the
> > > > >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> > > > >> ---
> > > > >>   This is a separate patch as a follow up of upper comments.
> > > > >>
> > > > >> gcc/ChangeLog:
> > > > >>
> > > > >>         * expmed.c (extract_bit_field_1): Wrap the call to
> > > > >>         extract_integral_bit_field, extracting in an integer mode with
> > > > >>         the same size as 'tmode' and then converting the result
> > > > >>         as (subreg:tmode (reg:imode)).
> > > > >>
> > > > >> gcc/testsuite/ChangeLog:
> > > > >>         * gcc.target/i386/float16-5.c: New test.
> > > > >> ---
> > > > >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> > > > >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > > > >>  2 files changed, 31 insertions(+)
> > > > >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > > > >>
> > > > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > >> index 3143f38e057..72790693ef0 100644
> > > > >> --- a/gcc/expmed.c
> > > > >> +++ b/gcc/expmed.c
> > > > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > >>        op0_mode = opt_scalar_int_mode ();
> > > > >>      }
> > > > >>
> > > > >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > > >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > > >> +     in extract_integral_bit_field.  */
> > > > >> +  if (int_mode_for_mode (tmode).exists (&imode)
> > > > >
> > > > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > > > cheaper.  Then imode should always be != tmode.  Maybe
> > > > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > > > how it behaves for composite modes.
> > > > >
> > > > > Of course the least surprises would happen when we restrict this
> > > > > to FLOAT_MODE_P (tmode).
> > > > >
> > > > > Richard - any preferences?
> > > >
> > > > If the bug is that extract_integral_bit_field is being called with
> > > > a non-integral mode parameter, then it looks odd that we can still
> > > > fall through to it without an integral mode (when exists is false).
> > > >
> > > > If calling extract_integral_bit_field without an integral mode is
> > > > a bug then I think we should have:
> > > >
> > > >   int_mode_for_mode (mode).require ()
> > > >
> > > > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > > > Ideally we'd make the mode parameter scalar_int_mode too.
> > > >
> > > > extract_integral_bit_field currently has:
> > > >
> > > >   /* Find a correspondingly-sized integer field, so we can apply
> > > >      shifts and masks to it.  */
> > > >   scalar_int_mode int_mode;
> > > >   if (!int_mode_for_mode (tmode).exists (&int_mode))
> > > >     /* If this fails, we should probably push op0 out to memory and then
> > > >        do a load.  */
> > > >     int_mode = int_mode_for_mode (mode).require ();
> > > >
> > > > which would seem to be redundant after this change.
> > >
> > > I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> > > up creating a lowpart subreg that's not allowed and that ICEs (and I
> > > can't see a way to check beforehand).  So it seems to me at least
> > > part of that function doesn't expect non-integral extraction modes.
> > >
> > > But who knows - the code is older than I am (OK, not, but older than
> > > my involvment in GCC ;))
> > >
> > How about attached patch w/ below changelog
> >
> > gcc/ChangeLog:
> >
> >         * expmed.c (extract_bit_field_1): Make sure we're playing with
> >         integral modes before call extract_integral_bit_field.
> >         (extract_integral_bit_field): Add a parameter of type
> >         scalar_int_mode which corresponds to of tmode.
> >         And call extract_and_convert_fixed_bit_field instead of
> >         extract_fixed_bit_field and convert_extracted_bit_field.
> >         (extract_and_convert_fixed_bit_field): New function, it's a
> >         combination of extract_fixed_bit_field and
> >         convert_extracted_bit_field.
> >
> > gcc/testsuite/ChangeLog:
> >         * gcc.target/i386/float16-5.c: New test.
> >
> I'd like to ping this patch, or maybe we can use the patch before with
> richi's comments.
> >

Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
this patch, i'd like someone to help review this patch.

> > > Richard.
> > >
> > > > >> +      && imode != tmode
> > > > >> +      && imode != GET_MODE (op0))
> > > > >> +    {
> > > > >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > > >> +                                           bitsize.to_constant (),
> > > > >> +                                           bitnum.to_constant (), unsignedp,
> > > > >> +                                           NULL, imode, imode,
> > > > >> +                                           reverse, fallback_p);
> > > > >> +      gcc_assert (ret);
> > > > >> +
> > > > >> +      if (!REG_P (ret))
> > > > >> +       ret = force_reg (imode, ret);
> > > > >> +      return gen_lowpart_SUBREG (tmode, ret);
> > > > >> +    }
> > > > >> +
> > > > >>    /* It's possible we'll need to handle other cases here for
> > > > >>       polynomial bitnum and bitsize.  */
> > > >
> > > > Minor nit, but since the code is using to_constant, it should go after
> > > > this comment rather than before it.
> > > >
> > > > Thanks,
> > > > Richard
> > > >
> > > > >>
> > > > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > >> new file mode 100644
> > > > >> index 00000000000..ebc0af1490b
> > > > >> --- /dev/null
> > > > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > >> @@ -0,0 +1,12 @@
> > > > >> +/* { dg-do compile } */
> > > > >> +/* { dg-options "-msse2 -O2" } */
> > > > >> +_Float16
> > > > >> +foo (int a)
> > > > >> +{
> > > > >> +  union {
> > > > >> +    int a;
> > > > >> +    _Float16 b;
> > > > >> +  }c;
> > > > >> +  c.a = a;
> > > > >> +  return c.b;
> > > > >> +}
> > > > >> --
> > > > >> 2.27.0
> > > > >>
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao
Hongtao Liu Aug. 24, 2021, 9:44 a.m. UTC | #11
On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > > > <richard.sandiford@arm.com> wrote:
> > > > >
> > > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > >>
> > > > > >> Hi:
> > > > > >> ---
> > > > > >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > > > >> are designed
> > > > > >> to work on integer modes (but docs do not say anything about this here).
> > > > > >> In fact the caller of extract_bit_field_using_extv is named
> > > > > >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > > > >> modes we're dealing with, but we're for example happily doing
> > > > > >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > > > >> some integer mode and op0 is HFmode?  From the above I get it's
> > > > > >> the other way around?  In that case we should wrap the
> > > > > >> call to extract_integral_bit_field, extracting in an integer mode with the
> > > > > >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> > > > > >> ---
> > > > > >>   This is a separate patch as a follow up of upper comments.
> > > > > >>
> > > > > >> gcc/ChangeLog:
> > > > > >>
> > > > > >>         * expmed.c (extract_bit_field_1): Wrap the call to
> > > > > >>         extract_integral_bit_field, extracting in an integer mode with
> > > > > >>         the same size as 'tmode' and then converting the result
> > > > > >>         as (subreg:tmode (reg:imode)).
> > > > > >>
> > > > > >> gcc/testsuite/ChangeLog:
> > > > > >>         * gcc.target/i386/float16-5.c: New test.
> > > > > >> ---
> > > > > >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> > > > > >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > > > > >>  2 files changed, 31 insertions(+)
> > > > > >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > >>
> > > > > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > > >> index 3143f38e057..72790693ef0 100644
> > > > > >> --- a/gcc/expmed.c
> > > > > >> +++ b/gcc/expmed.c
> > > > > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > >>        op0_mode = opt_scalar_int_mode ();
> > > > > >>      }
> > > > > >>
> > > > > >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > > > >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > > > >> +     in extract_integral_bit_field.  */
> > > > > >> +  if (int_mode_for_mode (tmode).exists (&imode)
> > > > > >
> > > > > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > > > > cheaper.  Then imode should always be != tmode.  Maybe
> > > > > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > > > > how it behaves for composite modes.
> > > > > >
> > > > > > Of course the least surprises would happen when we restrict this
> > > > > > to FLOAT_MODE_P (tmode).
> > > > > >
> > > > > > Richard - any preferences?
> > > > >
> > > > > If the bug is that extract_integral_bit_field is being called with
> > > > > a non-integral mode parameter, then it looks odd that we can still
> > > > > fall through to it without an integral mode (when exists is false).
> > > > >
> > > > > If calling extract_integral_bit_field without an integral mode is
> > > > > a bug then I think we should have:
> > > > >
> > > > >   int_mode_for_mode (mode).require ()
> > > > >
> > > > > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > > > > Ideally we'd make the mode parameter scalar_int_mode too.
> > > > >
> > > > > extract_integral_bit_field currently has:
> > > > >
> > > > >   /* Find a correspondingly-sized integer field, so we can apply
> > > > >      shifts and masks to it.  */
> > > > >   scalar_int_mode int_mode;
> > > > >   if (!int_mode_for_mode (tmode).exists (&int_mode))
> > > > >     /* If this fails, we should probably push op0 out to memory and then
> > > > >        do a load.  */
> > > > >     int_mode = int_mode_for_mode (mode).require ();
> > > > >
> > > > > which would seem to be redundant after this change.
> > > >
> > > > I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> > > > up creating a lowpart subreg that's not allowed and that ICEs (and I
> > > > can't see a way to check beforehand).  So it seems to me at least
> > > > part of that function doesn't expect non-integral extraction modes.
> > > >
> > > > But who knows - the code is older than I am (OK, not, but older than
> > > > my involvment in GCC ;))
> > > >
> > > How about attached patch w/ below changelog
> > >
> > > gcc/ChangeLog:
> > >
> > >         * expmed.c (extract_bit_field_1): Make sure we're playing with
> > >         integral modes before call extract_integral_bit_field.
> > >         (extract_integral_bit_field): Add a parameter of type
> > >         scalar_int_mode which corresponds to of tmode.
> > >         And call extract_and_convert_fixed_bit_field instead of
> > >         extract_fixed_bit_field and convert_extracted_bit_field.
> > >         (extract_and_convert_fixed_bit_field): New function, it's a
> > >         combination of extract_fixed_bit_field and
> > >         convert_extracted_bit_field.
> > >
> > > gcc/testsuite/ChangeLog:
> > >         * gcc.target/i386/float16-5.c: New test.
> > >
> > I'd like to ping this patch, or maybe we can use the patch before with
> > richi's comments.
> > >
>
> Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
> this patch, i'd like someone to help review this patch.
>
Please ignore the former attached patch, should be the patch attached here.
> > > > Richard.
> > > >
> > > > > >> +      && imode != tmode
> > > > > >> +      && imode != GET_MODE (op0))
> > > > > >> +    {
> > > > > >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > > > >> +                                           bitsize.to_constant (),
> > > > > >> +                                           bitnum.to_constant (), unsignedp,
> > > > > >> +                                           NULL, imode, imode,
> > > > > >> +                                           reverse, fallback_p);
> > > > > >> +      gcc_assert (ret);
> > > > > >> +
> > > > > >> +      if (!REG_P (ret))
> > > > > >> +       ret = force_reg (imode, ret);
> > > > > >> +      return gen_lowpart_SUBREG (tmode, ret);
> > > > > >> +    }
> > > > > >> +
> > > > > >>    /* It's possible we'll need to handle other cases here for
> > > > > >>       polynomial bitnum and bitsize.  */
> > > > >
> > > > > Minor nit, but since the code is using to_constant, it should go after
> > > > > this comment rather than before it.
> > > > >
> > > > > Thanks,
> > > > > Richard
> > > > >
> > > > > >>
> > > > > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > >> new file mode 100644
> > > > > >> index 00000000000..ebc0af1490b
> > > > > >> --- /dev/null
> > > > > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > >> @@ -0,0 +1,12 @@
> > > > > >> +/* { dg-do compile } */
> > > > > >> +/* { dg-options "-msse2 -O2" } */
> > > > > >> +_Float16
> > > > > >> +foo (int a)
> > > > > >> +{
> > > > > >> +  union {
> > > > > >> +    int a;
> > > > > >> +    _Float16 b;
> > > > > >> +  }c;
> > > > > >> +  c.a = a;
> > > > > >> +  return c.b;
> > > > > >> +}
> > > > > >> --
> > > > > >> 2.27.0
> > > > > >>
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao
Richard Biener Aug. 24, 2021, 11:38 a.m. UTC | #12
On Tue, Aug 24, 2021 at 11:38 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > > > > <richard.sandiford@arm.com> wrote:
> > > > > >
> > > > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > > >>
> > > > > > >> Hi:
> > > > > > >> ---
> > > > > > >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > > > > >> are designed
> > > > > > >> to work on integer modes (but docs do not say anything about this here).
> > > > > > >> In fact the caller of extract_bit_field_using_extv is named
> > > > > > >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > > > > >> modes we're dealing with, but we're for example happily doing
> > > > > > >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > > > > >> some integer mode and op0 is HFmode?  From the above I get it's
> > > > > > >> the other way around?  In that case we should wrap the
> > > > > > >> call to extract_integral_bit_field, extracting in an integer mode with the
> > > > > > >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> > > > > > >> ---
> > > > > > >>   This is a separate patch as a follow up of upper comments.
> > > > > > >>
> > > > > > >> gcc/ChangeLog:
> > > > > > >>
> > > > > > >>         * expmed.c (extract_bit_field_1): Wrap the call to
> > > > > > >>         extract_integral_bit_field, extracting in an integer mode with
> > > > > > >>         the same size as 'tmode' and then converting the result
> > > > > > >>         as (subreg:tmode (reg:imode)).
> > > > > > >>
> > > > > > >> gcc/testsuite/ChangeLog:
> > > > > > >>         * gcc.target/i386/float16-5.c: New test.
> > > > > > >> ---
> > > > > > >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> > > > > > >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > > > > > >>  2 files changed, 31 insertions(+)
> > > > > > >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >>
> > > > > > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > > > >> index 3143f38e057..72790693ef0 100644
> > > > > > >> --- a/gcc/expmed.c
> > > > > > >> +++ b/gcc/expmed.c
> > > > > > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > > >>        op0_mode = opt_scalar_int_mode ();
> > > > > > >>      }
> > > > > > >>
> > > > > > >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > > > > >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > > > > >> +     in extract_integral_bit_field.  */
> > > > > > >> +  if (int_mode_for_mode (tmode).exists (&imode)
> > > > > > >
> > > > > > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > > > > > cheaper.  Then imode should always be != tmode.  Maybe
> > > > > > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > > > > > how it behaves for composite modes.
> > > > > > >
> > > > > > > Of course the least surprises would happen when we restrict this
> > > > > > > to FLOAT_MODE_P (tmode).
> > > > > > >
> > > > > > > Richard - any preferences?
> > > > > >
> > > > > > If the bug is that extract_integral_bit_field is being called with
> > > > > > a non-integral mode parameter, then it looks odd that we can still
> > > > > > fall through to it without an integral mode (when exists is false).
> > > > > >
> > > > > > If calling extract_integral_bit_field without an integral mode is
> > > > > > a bug then I think we should have:
> > > > > >
> > > > > >   int_mode_for_mode (mode).require ()
> > > > > >
> > > > > > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > > > > > Ideally we'd make the mode parameter scalar_int_mode too.
> > > > > >
> > > > > > extract_integral_bit_field currently has:
> > > > > >
> > > > > >   /* Find a correspondingly-sized integer field, so we can apply
> > > > > >      shifts and masks to it.  */
> > > > > >   scalar_int_mode int_mode;
> > > > > >   if (!int_mode_for_mode (tmode).exists (&int_mode))
> > > > > >     /* If this fails, we should probably push op0 out to memory and then
> > > > > >        do a load.  */
> > > > > >     int_mode = int_mode_for_mode (mode).require ();
> > > > > >
> > > > > > which would seem to be redundant after this change.
> > > > >
> > > > > I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> > > > > up creating a lowpart subreg that's not allowed and that ICEs (and I
> > > > > can't see a way to check beforehand).  So it seems to me at least
> > > > > part of that function doesn't expect non-integral extraction modes.
> > > > >
> > > > > But who knows - the code is older than I am (OK, not, but older than
> > > > > my involvment in GCC ;))
> > > > >
> > > > How about attached patch w/ below changelog
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * expmed.c (extract_bit_field_1): Make sure we're playing with
> > > >         integral modes before call extract_integral_bit_field.
> > > >         (extract_integral_bit_field): Add a parameter of type
> > > >         scalar_int_mode which corresponds to of tmode.
> > > >         And call extract_and_convert_fixed_bit_field instead of
> > > >         extract_fixed_bit_field and convert_extracted_bit_field.
> > > >         (extract_and_convert_fixed_bit_field): New function, it's a
> > > >         combination of extract_fixed_bit_field and
> > > >         convert_extracted_bit_field.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >         * gcc.target/i386/float16-5.c: New test.
> > > >
> > > I'd like to ping this patch, or maybe we can use the patch before with
> > > richi's comments.
> > > >
> >
> > Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
> > this patch, i'd like someone to help review this patch.
> >
> Please ignore the former attached patch, should be the patch attached here.

I think the patch is reasonable.  I'm a bit worried approving it since
my knowledge
of the code is restricted.  I wonder if you can tell the change doesn't make
a difference for the majority of cases - that is, did you try for
example comparing
generated code for GCC (or parts of it)?

OK if there's no objection from others within 48 hours.

Thanks,
Richard.

> > > > > Richard.
> > > > >
> > > > > > >> +      && imode != tmode
> > > > > > >> +      && imode != GET_MODE (op0))
> > > > > > >> +    {
> > > > > > >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > > > > >> +                                           bitsize.to_constant (),
> > > > > > >> +                                           bitnum.to_constant (), unsignedp,
> > > > > > >> +                                           NULL, imode, imode,
> > > > > > >> +                                           reverse, fallback_p);
> > > > > > >> +      gcc_assert (ret);
> > > > > > >> +
> > > > > > >> +      if (!REG_P (ret))
> > > > > > >> +       ret = force_reg (imode, ret);
> > > > > > >> +      return gen_lowpart_SUBREG (tmode, ret);
> > > > > > >> +    }
> > > > > > >> +
> > > > > > >>    /* It's possible we'll need to handle other cases here for
> > > > > > >>       polynomial bitnum and bitsize.  */
> > > > > >
> > > > > > Minor nit, but since the code is using to_constant, it should go after
> > > > > > this comment rather than before it.
> > > > > >
> > > > > > Thanks,
> > > > > > Richard
> > > > > >
> > > > > > >>
> > > > > > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >> new file mode 100644
> > > > > > >> index 00000000000..ebc0af1490b
> > > > > > >> --- /dev/null
> > > > > > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >> @@ -0,0 +1,12 @@
> > > > > > >> +/* { dg-do compile } */
> > > > > > >> +/* { dg-options "-msse2 -O2" } */
> > > > > > >> +_Float16
> > > > > > >> +foo (int a)
> > > > > > >> +{
> > > > > > >> +  union {
> > > > > > >> +    int a;
> > > > > > >> +    _Float16 b;
> > > > > > >> +  }c;
> > > > > > >> +  c.a = a;
> > > > > > >> +  return c.b;
> > > > > > >> +}
> > > > > > >> --
> > > > > > >> 2.27.0
> > > > > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao
Jeff Law Aug. 25, 2021, 11:16 p.m. UTC | #13
On 8/24/2021 3:44 AM, Hongtao Liu via Gcc-patches wrote:
> On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazylht@gmail.com> wrote:
>> On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
>>> On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
>>>> On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>> On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
>>>>> <richard.sandiford@arm.com> wrote:
>>>>>> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>>>>>> On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
>>>>>>>> Hi:
>>>>>>>> ---
>>>>>>>> OK, I think sth is amiss here upthread.  insv/extv do look like they
>>>>>>>> are designed
>>>>>>>> to work on integer modes (but docs do not say anything about this here).
>>>>>>>> In fact the caller of extract_bit_field_using_extv is named
>>>>>>>> extract_integral_bit_field.  Of course nothing seems to check what kind of
>>>>>>>> modes we're dealing with, but we're for example happily doing
>>>>>>>> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
>>>>>>>> some integer mode and op0 is HFmode?  From the above I get it's
>>>>>>>> the other way around?  In that case we should wrap the
>>>>>>>> call to extract_integral_bit_field, extracting in an integer mode with the
>>>>>>>> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
>>>>>>>> ---
>>>>>>>>    This is a separate patch as a follow up of upper comments.
>>>>>>>>
>>>>>>>> gcc/ChangeLog:
>>>>>>>>
>>>>>>>>          * expmed.c (extract_bit_field_1): Wrap the call to
>>>>>>>>          extract_integral_bit_field, extracting in an integer mode with
>>>>>>>>          the same size as 'tmode' and then converting the result
>>>>>>>>          as (subreg:tmode (reg:imode)).
>>>>>>>>
>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>          * gcc.target/i386/float16-5.c: New test.
>>>>>>>> ---
>>>>>>>>   gcc/expmed.c                              | 19 +++++++++++++++++++
>>>>>>>>   gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
>>>>>>>>   2 files changed, 31 insertions(+)
>>>>>>>>   create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
>>>>>>>>
>>>>>>>> diff --git a/gcc/expmed.c b/gcc/expmed.c
>>>>>>>> index 3143f38e057..72790693ef0 100644
>>>>>>>> --- a/gcc/expmed.c
>>>>>>>> +++ b/gcc/expmed.c
>>>>>>>> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>>>>>>>>         op0_mode = opt_scalar_int_mode ();
>>>>>>>>       }
>>>>>>>>
>>>>>>>> +  /* Make sure we are playing with integral modes.  Pun with subregs
>>>>>>>> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
>>>>>>>> +     in extract_integral_bit_field.  */
>>>>>>>> +  if (int_mode_for_mode (tmode).exists (&imode)
>>>>>>> check !INTEGRAL_MODE_P (tmode) before, that should be slightly
>>>>>>> cheaper.  Then imode should always be != tmode.  Maybe
>>>>>>> even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
>>>>>>> how it behaves for composite modes.
>>>>>>>
>>>>>>> Of course the least surprises would happen when we restrict this
>>>>>>> to FLOAT_MODE_P (tmode).
>>>>>>>
>>>>>>> Richard - any preferences?
>>>>>> If the bug is that extract_integral_bit_field is being called with
>>>>>> a non-integral mode parameter, then it looks odd that we can still
>>>>>> fall through to it without an integral mode (when exists is false).
>>>>>>
>>>>>> If calling extract_integral_bit_field without an integral mode is
>>>>>> a bug then I think we should have:
>>>>>>
>>>>>>    int_mode_for_mode (mode).require ()
>>>>>>
>>>>>> whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
>>>>>> Ideally we'd make the mode parameter scalar_int_mode too.
>>>>>>
>>>>>> extract_integral_bit_field currently has:
>>>>>>
>>>>>>    /* Find a correspondingly-sized integer field, so we can apply
>>>>>>       shifts and masks to it.  */
>>>>>>    scalar_int_mode int_mode;
>>>>>>    if (!int_mode_for_mode (tmode).exists (&int_mode))
>>>>>>      /* If this fails, we should probably push op0 out to memory and then
>>>>>>         do a load.  */
>>>>>>      int_mode = int_mode_for_mode (mode).require ();
>>>>>>
>>>>>> which would seem to be redundant after this change.
>>>>> I'm not sure what exactly the bug is, but extract_integral_bit_field ends
>>>>> up creating a lowpart subreg that's not allowed and that ICEs (and I
>>>>> can't see a way to check beforehand).  So it seems to me at least
>>>>> part of that function doesn't expect non-integral extraction modes.
>>>>>
>>>>> But who knows - the code is older than I am (OK, not, but older than
>>>>> my involvment in GCC ;))
>>>>>
>>>> How about attached patch w/ below changelog
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>          * expmed.c (extract_bit_field_1): Make sure we're playing with
>>>>          integral modes before call extract_integral_bit_field.
>>>>          (extract_integral_bit_field): Add a parameter of type
>>>>          scalar_int_mode which corresponds to of tmode.
>>>>          And call extract_and_convert_fixed_bit_field instead of
>>>>          extract_fixed_bit_field and convert_extracted_bit_field.
>>>>          (extract_and_convert_fixed_bit_field): New function, it's a
>>>>          combination of extract_fixed_bit_field and
>>>>          convert_extracted_bit_field.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>          * gcc.target/i386/float16-5.c: New test.
>>>>
>>> I'd like to ping this patch, or maybe we can use the patch before with
>>> richi's comments.
>> Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
>> this patch, i'd like someone to help review this patch.
>>
> Please ignore the former attached patch, should be the patch attached here.
>>>>> Richard.
>>>>>
>>>>>>>> +      && imode != tmode
>>>>>>>> +      && imode != GET_MODE (op0))
>>>>>>>> +    {
>>>>>>>> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
>>>>>>>> +                                           bitsize.to_constant (),
>>>>>>>> +                                           bitnum.to_constant (), unsignedp,
>>>>>>>> +                                           NULL, imode, imode,
>>>>>>>> +                                           reverse, fallback_p);
>>>>>>>> +      gcc_assert (ret);
>>>>>>>> +
>>>>>>>> +      if (!REG_P (ret))
>>>>>>>> +       ret = force_reg (imode, ret);
>>>>>>>> +      return gen_lowpart_SUBREG (tmode, ret);
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>>     /* It's possible we'll need to handle other cases here for
>>>>>>>>        polynomial bitnum and bitsize.  */
>>>>>> Minor nit, but since the code is using to_constant, it should go after
>>>>>> this comment rather than before it.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard
>>>>>>
>>>>>>>> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
>>>>>>>> new file mode 100644
>>>>>>>> index 00000000000..ebc0af1490b
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
>>>>>>>> @@ -0,0 +1,12 @@
>>>>>>>> +/* { dg-do compile } */
>>>>>>>> +/* { dg-options "-msse2 -O2" } */
>>>>>>>> +_Float16
>>>>>>>> +foo (int a)
>>>>>>>> +{
>>>>>>>> +  union {
>>>>>>>> +    int a;
>>>>>>>> +    _Float16 b;
>>>>>>>> +  }c;
>>>>>>>> +  c.a = a;
>>>>>>>> +  return c.b;
>>>>>>>> +}
>>>>>>>> --
>>>>>>>> 2.27.0
>>>>>>>>
>>>>
>>>>
>>>> --
>>>> BR,
>>>> Hongtao
>>>
>>>
>>> --
>>> BR,
>>> Hongtao
>>
>>
>> --
>> BR,
>> Hongtao
>
>
>
> 0001-Make-sure-we-re-playing-with-integral-modes-before-c.patch
>
>  From 9c77ac15e69b567156a82debe45e3ced10df1110 Mon Sep 17 00:00:00 2001
> From: liuhongt <hongtao.liu@intel.com>
> Date: Fri, 6 Aug 2021 10:18:43 +0800
> Subject: [PATCH] Make sure we're playing with integral modes before call
>   extract_integral_bit_field.
>
> gcc/ChangeLog:
>
> 	* expmed.c (extract_bit_field_1): Make sure we're playing with
> 	integral modes before call extract_integral_bit_field.
> 	(extract_integral_bit_field): Add a parameter of type
> 	scalar_int_mode which corresponds to of tmode.
> 	And call extract_and_convert_fixed_bit_field instead of
> 	extract_fixed_bit_field and convert_extracted_bit_field.
> 	(extract_and_convert_fixed_bit_field): New function, it's a
> 	combination of extract_fixed_bit_field and
> 	convert_extracted_bit_field.
>
> gcc/testsuite/ChangeLog:
> 	* gcc.target/i386/float16-5.c: New test.
I bet this is all getting triggered due to the introduction of HFmode.  
Wrapping with a subreg to get an integral mode may work, but I'd be more 
comfortable if we had other instances where we knew wrapping an SF/DF 
mode with SI/DI was enough to make all this code safe.  I fear we're 
just pushing the bug down in one spot and it's going to pop up elsewhere.

Another approach would be to force the object into memory, but I suspect 
y'all don't want to do that ;-)

So in the end, it may be reasonable, but I wouldn't be surprised if we 
trip over more problems in this code with FP modes.

jeff
Hongtao Liu Aug. 26, 2021, 1:17 a.m. UTC | #14
On Tue, Aug 24, 2021 at 7:39 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 11:38 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > >
> > > > > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > > > > > <richard.sandiford@arm.com> wrote:
> > > > > > >
> > > > > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > > > > > >>
> > > > > > > >> Hi:
> > > > > > > >> ---
> > > > > > > >> OK, I think sth is amiss here upthread.  insv/extv do look like they
> > > > > > > >> are designed
> > > > > > > >> to work on integer modes (but docs do not say anything about this here).
> > > > > > > >> In fact the caller of extract_bit_field_using_extv is named
> > > > > > > >> extract_integral_bit_field.  Of course nothing seems to check what kind of
> > > > > > > >> modes we're dealing with, but we're for example happily doing
> > > > > > > >> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> > > > > > > >> some integer mode and op0 is HFmode?  From the above I get it's
> > > > > > > >> the other way around?  In that case we should wrap the
> > > > > > > >> call to extract_integral_bit_field, extracting in an integer mode with the
> > > > > > > >> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> > > > > > > >> ---
> > > > > > > >>   This is a separate patch as a follow up of upper comments.
> > > > > > > >>
> > > > > > > >> gcc/ChangeLog:
> > > > > > > >>
> > > > > > > >>         * expmed.c (extract_bit_field_1): Wrap the call to
> > > > > > > >>         extract_integral_bit_field, extracting in an integer mode with
> > > > > > > >>         the same size as 'tmode' and then converting the result
> > > > > > > >>         as (subreg:tmode (reg:imode)).
> > > > > > > >>
> > > > > > > >> gcc/testsuite/ChangeLog:
> > > > > > > >>         * gcc.target/i386/float16-5.c: New test.
> > > > > > > >> ---
> > > > > > > >>  gcc/expmed.c                              | 19 +++++++++++++++++++
> > > > > > > >>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > > > > > > >>  2 files changed, 31 insertions(+)
> > > > > > > >>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > > >>
> > > > > > > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > > > > >> index 3143f38e057..72790693ef0 100644
> > > > > > > >> --- a/gcc/expmed.c
> > > > > > > >> +++ b/gcc/expmed.c
> > > > > > > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > > > >>        op0_mode = opt_scalar_int_mode ();
> > > > > > > >>      }
> > > > > > > >>
> > > > > > > >> +  /* Make sure we are playing with integral modes.  Pun with subregs
> > > > > > > >> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> > > > > > > >> +     in extract_integral_bit_field.  */
> > > > > > > >> +  if (int_mode_for_mode (tmode).exists (&imode)
> > > > > > > >
> > > > > > > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > > > > > > cheaper.  Then imode should always be != tmode.  Maybe
> > > > > > > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > > > > > > how it behaves for composite modes.
> > > > > > > >
> > > > > > > > Of course the least surprises would happen when we restrict this
> > > > > > > > to FLOAT_MODE_P (tmode).
> > > > > > > >
> > > > > > > > Richard - any preferences?
> > > > > > >
> > > > > > > If the bug is that extract_integral_bit_field is being called with
> > > > > > > a non-integral mode parameter, then it looks odd that we can still
> > > > > > > fall through to it without an integral mode (when exists is false).
> > > > > > >
> > > > > > > If calling extract_integral_bit_field without an integral mode is
> > > > > > > a bug then I think we should have:
> > > > > > >
> > > > > > >   int_mode_for_mode (mode).require ()
> > > > > > >
> > > > > > > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > > > > > > Ideally we'd make the mode parameter scalar_int_mode too.
> > > > > > >
> > > > > > > extract_integral_bit_field currently has:
> > > > > > >
> > > > > > >   /* Find a correspondingly-sized integer field, so we can apply
> > > > > > >      shifts and masks to it.  */
> > > > > > >   scalar_int_mode int_mode;
> > > > > > >   if (!int_mode_for_mode (tmode).exists (&int_mode))
> > > > > > >     /* If this fails, we should probably push op0 out to memory and then
> > > > > > >        do a load.  */
> > > > > > >     int_mode = int_mode_for_mode (mode).require ();
> > > > > > >
> > > > > > > which would seem to be redundant after this change.
> > > > > >
> > > > > > I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> > > > > > up creating a lowpart subreg that's not allowed and that ICEs (and I
> > > > > > can't see a way to check beforehand).  So it seems to me at least
> > > > > > part of that function doesn't expect non-integral extraction modes.
> > > > > >
> > > > > > But who knows - the code is older than I am (OK, not, but older than
> > > > > > my involvment in GCC ;))
> > > > > >
> > > > > How about attached patch w/ below changelog
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         * expmed.c (extract_bit_field_1): Make sure we're playing with
> > > > >         integral modes before call extract_integral_bit_field.
> > > > >         (extract_integral_bit_field): Add a parameter of type
> > > > >         scalar_int_mode which corresponds to of tmode.
> > > > >         And call extract_and_convert_fixed_bit_field instead of
> > > > >         extract_fixed_bit_field and convert_extracted_bit_field.
> > > > >         (extract_and_convert_fixed_bit_field): New function, it's a
> > > > >         combination of extract_fixed_bit_field and
> > > > >         convert_extracted_bit_field.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >         * gcc.target/i386/float16-5.c: New test.
> > > > >
> > > > I'd like to ping this patch, or maybe we can use the patch before with
> > > > richi's comments.
> > > > >
> > >
> > > Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
> > > this patch, i'd like someone to help review this patch.
> > >
> > Please ignore the former attached patch, should be the patch attached here.
>
> I think the patch is reasonable.  I'm a bit worried approving it since
> my knowledge
> of the code is restricted.  I wonder if you can tell the change doesn't make
> a difference for the majority of cases - that is, did you try for
> example comparing
> generated code for GCC (or parts of it)?
Build same for SPEC2017 and eembc.
>
> OK if there's no objection from others within 48 hours.
>
> Thanks,
> Richard.
>
> > > > > > Richard.
> > > > > >
> > > > > > > >> +      && imode != tmode
> > > > > > > >> +      && imode != GET_MODE (op0))
> > > > > > > >> +    {
> > > > > > > >> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > > > > > >> +                                           bitsize.to_constant (),
> > > > > > > >> +                                           bitnum.to_constant (), unsignedp,
> > > > > > > >> +                                           NULL, imode, imode,
> > > > > > > >> +                                           reverse, fallback_p);
> > > > > > > >> +      gcc_assert (ret);
> > > > > > > >> +
> > > > > > > >> +      if (!REG_P (ret))
> > > > > > > >> +       ret = force_reg (imode, ret);
> > > > > > > >> +      return gen_lowpart_SUBREG (tmode, ret);
> > > > > > > >> +    }
> > > > > > > >> +
> > > > > > > >>    /* It's possible we'll need to handle other cases here for
> > > > > > > >>       polynomial bitnum and bitsize.  */
> > > > > > >
> > > > > > > Minor nit, but since the code is using to_constant, it should go after
> > > > > > > this comment rather than before it.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Richard
> > > > > > >
> > > > > > > >>
> > > > > > > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > > >> new file mode 100644
> > > > > > > >> index 00000000000..ebc0af1490b
> > > > > > > >> --- /dev/null
> > > > > > > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > > >> @@ -0,0 +1,12 @@
> > > > > > > >> +/* { dg-do compile } */
> > > > > > > >> +/* { dg-options "-msse2 -O2" } */
> > > > > > > >> +_Float16
> > > > > > > >> +foo (int a)
> > > > > > > >> +{
> > > > > > > >> +  union {
> > > > > > > >> +    int a;
> > > > > > > >> +    _Float16 b;
> > > > > > > >> +  }c;
> > > > > > > >> +  c.a = a;
> > > > > > > >> +  return c.b;
> > > > > > > >> +}
> > > > > > > >> --
> > > > > > > >> 2.27.0
> > > > > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
Hongtao Liu Aug. 26, 2021, 2:05 a.m. UTC | #15
On Thu, Aug 26, 2021 at 7:16 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 8/24/2021 3:44 AM, Hongtao Liu via Gcc-patches wrote:
>
> On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
> On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> Hi:
> ---
> OK, I think sth is amiss here upthread.  insv/extv do look like they
> are designed
> to work on integer modes (but docs do not say anything about this here).
> In fact the caller of extract_bit_field_using_extv is named
> extract_integral_bit_field.  Of course nothing seems to check what kind of
> modes we're dealing with, but we're for example happily doing
> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> some integer mode and op0 is HFmode?  From the above I get it's
> the other way around?  In that case we should wrap the
> call to extract_integral_bit_field, extracting in an integer mode with the
> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> ---
>   This is a separate patch as a follow up of upper comments.
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Wrap the call to
>         extract_integral_bit_field, extracting in an integer mode with
>         the same size as 'tmode' and then converting the result
>         as (subreg:tmode (reg:imode)).
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
> ---
>  gcc/expmed.c                              | 19 +++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
>  2 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
>
> diff --git a/gcc/expmed.c b/gcc/expmed.c
> index 3143f38e057..72790693ef0 100644
> --- a/gcc/expmed.c
> +++ b/gcc/expmed.c
> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>        op0_mode = opt_scalar_int_mode ();
>      }
>
> +  /* Make sure we are playing with integral modes.  Pun with subregs
> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> +     in extract_integral_bit_field.  */
> +  if (int_mode_for_mode (tmode).exists (&imode)
>
> check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> cheaper.  Then imode should always be != tmode.  Maybe
> even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> how it behaves for composite modes.
>
> Of course the least surprises would happen when we restrict this
> to FLOAT_MODE_P (tmode).
>
> Richard - any preferences?
>
> If the bug is that extract_integral_bit_field is being called with
> a non-integral mode parameter, then it looks odd that we can still
> fall through to it without an integral mode (when exists is false).
>
> If calling extract_integral_bit_field without an integral mode is
> a bug then I think we should have:
>
>   int_mode_for_mode (mode).require ()
>
> whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> Ideally we'd make the mode parameter scalar_int_mode too.
>
> extract_integral_bit_field currently has:
>
>   /* Find a correspondingly-sized integer field, so we can apply
>      shifts and masks to it.  */
>   scalar_int_mode int_mode;
>   if (!int_mode_for_mode (tmode).exists (&int_mode))
>     /* If this fails, we should probably push op0 out to memory and then
>        do a load.  */
>     int_mode = int_mode_for_mode (mode).require ();
>
> which would seem to be redundant after this change.
>
> I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> up creating a lowpart subreg that's not allowed and that ICEs (and I
> can't see a way to check beforehand).  So it seems to me at least
> part of that function doesn't expect non-integral extraction modes.
>
> But who knows - the code is older than I am (OK, not, but older than
> my involvment in GCC ;))
>
> How about attached patch w/ below changelog
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Make sure we're playing with
>         integral modes before call extract_integral_bit_field.
>         (extract_integral_bit_field): Add a parameter of type
>         scalar_int_mode which corresponds to of tmode.
>         And call extract_and_convert_fixed_bit_field instead of
>         extract_fixed_bit_field and convert_extracted_bit_field.
>         (extract_and_convert_fixed_bit_field): New function, it's a
>         combination of extract_fixed_bit_field and
>         convert_extracted_bit_field.
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
>
> I'd like to ping this patch, or maybe we can use the patch before with
> richi's comments.
>
> Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
> this patch, i'd like someone to help review this patch.
>
> Please ignore the former attached patch, should be the patch attached here.
>
> Richard.
>
> +      && imode != tmode
> +      && imode != GET_MODE (op0))
> +    {
> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> +                                           bitsize.to_constant (),
> +                                           bitnum.to_constant (), unsignedp,
> +                                           NULL, imode, imode,
> +                                           reverse, fallback_p);
> +      gcc_assert (ret);
> +
> +      if (!REG_P (ret))
> +       ret = force_reg (imode, ret);
> +      return gen_lowpart_SUBREG (tmode, ret);
> +    }
> +
>    /* It's possible we'll need to handle other cases here for
>       polynomial bitnum and bitsize.  */
>
> Minor nit, but since the code is using to_constant, it should go after
> this comment rather than before it.
>
> Thanks,
> Richard
>
> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> new file mode 100644
> index 00000000000..ebc0af1490b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse2 -O2" } */
> +_Float16
> +foo (int a)
> +{
> +  union {
> +    int a;
> +    _Float16 b;
> +  }c;
> +  c.a = a;
> +  return c.b;
> +}
> --
> 2.27.0
>
>
>
> --
> BR,
> Hongtao
>
>
> --
> BR,
> Hongtao
>
>
> --
> BR,
> Hongtao
>
>
>
> 0001-Make-sure-we-re-playing-with-integral-modes-before-c.patch
>
> From 9c77ac15e69b567156a82debe45e3ced10df1110 Mon Sep 17 00:00:00 2001
> From: liuhongt <hongtao.liu@intel.com>
> Date: Fri, 6 Aug 2021 10:18:43 +0800
> Subject: [PATCH] Make sure we're playing with integral modes before call
>  extract_integral_bit_field.
>
> gcc/ChangeLog:
>
> * expmed.c (extract_bit_field_1): Make sure we're playing with
> integral modes before call extract_integral_bit_field.
> (extract_integral_bit_field): Add a parameter of type
> scalar_int_mode which corresponds to of tmode.
> And call extract_and_convert_fixed_bit_field instead of
> extract_fixed_bit_field and convert_extracted_bit_field.
> (extract_and_convert_fixed_bit_field): New function, it's a
> combination of extract_fixed_bit_field and
> convert_extracted_bit_field.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/float16-5.c: New test.
>
> I bet this is all getting triggered due to the introduction of HFmode.  Wrapping with a subreg to get an integral mode may work, but I'd be more comfortable if we had other instances where we knew wrapping an SF/DF mode with SI/DI was enough to make all this code safe.  I fear we're just pushing the bug down in one spot and it's going to pop up elsewhere.
For SFmode, it will go into the new approach and work fine, i.e.
float
foo (long long b)
{
  union{float a;
    long long b;}c;
  c.b = b;
  return c.a;
}

For DFmode, It will generate (subreg:DF (reg:TI/DI)) right before it
goes into my new patch, because validate_subreg allows

  /* ??? This should not be here.  Temporarily continue to allow word_mode
     subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
     Generally, backends are doing something sketchy but it'll take time to
     fix them all.  */
  if (omode == word_mode)
    ;
  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
     is the culprit here, and not the backends.  */
  else if (known_ge (osize, regsize) && known_ge (isize, osize))
    ;

>
> Another approach would be to force the object into memory, but I suspect y'all don't want to do that ;-)
>
> So in the end, it may be reasonable, but I wouldn't be surprised if we trip over more problems in this code with FP modes.
I can upstream this patches separately first, then wait for a week, if
no new problems are exposed on trunk, then I will upstream the float16
patches.
>
> jeff
>
Richard Biener Aug. 26, 2021, 7:11 a.m. UTC | #16
On Thu, Aug 26, 2021 at 1:16 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 8/24/2021 3:44 AM, Hongtao Liu via Gcc-patches wrote:
>
> On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
> On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> Hi:
> ---
> OK, I think sth is amiss here upthread.  insv/extv do look like they
> are designed
> to work on integer modes (but docs do not say anything about this here).
> In fact the caller of extract_bit_field_using_extv is named
> extract_integral_bit_field.  Of course nothing seems to check what kind of
> modes we're dealing with, but we're for example happily doing
> expand_shift in 'mode'.  In the extract_integral_bit_field call 'mode' is
> some integer mode and op0 is HFmode?  From the above I get it's
> the other way around?  In that case we should wrap the
> call to extract_integral_bit_field, extracting in an integer mode with the
> same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)).
> ---
>   This is a separate patch as a follow up of upper comments.
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Wrap the call to
>         extract_integral_bit_field, extracting in an integer mode with
>         the same size as 'tmode' and then converting the result
>         as (subreg:tmode (reg:imode)).
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
> ---
>  gcc/expmed.c                              | 19 +++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
>  2 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
>
> diff --git a/gcc/expmed.c b/gcc/expmed.c
> index 3143f38e057..72790693ef0 100644
> --- a/gcc/expmed.c
> +++ b/gcc/expmed.c
> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
>        op0_mode = opt_scalar_int_mode ();
>      }
>
> +  /* Make sure we are playing with integral modes.  Pun with subregs
> +     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
> +     in extract_integral_bit_field.  */
> +  if (int_mode_for_mode (tmode).exists (&imode)
>
> check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> cheaper.  Then imode should always be != tmode.  Maybe
> even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> how it behaves for composite modes.
>
> Of course the least surprises would happen when we restrict this
> to FLOAT_MODE_P (tmode).
>
> Richard - any preferences?
>
> If the bug is that extract_integral_bit_field is being called with
> a non-integral mode parameter, then it looks odd that we can still
> fall through to it without an integral mode (when exists is false).
>
> If calling extract_integral_bit_field without an integral mode is
> a bug then I think we should have:
>
>   int_mode_for_mode (mode).require ()
>
> whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> Ideally we'd make the mode parameter scalar_int_mode too.
>
> extract_integral_bit_field currently has:
>
>   /* Find a correspondingly-sized integer field, so we can apply
>      shifts and masks to it.  */
>   scalar_int_mode int_mode;
>   if (!int_mode_for_mode (tmode).exists (&int_mode))
>     /* If this fails, we should probably push op0 out to memory and then
>        do a load.  */
>     int_mode = int_mode_for_mode (mode).require ();
>
> which would seem to be redundant after this change.
>
> I'm not sure what exactly the bug is, but extract_integral_bit_field ends
> up creating a lowpart subreg that's not allowed and that ICEs (and I
> can't see a way to check beforehand).  So it seems to me at least
> part of that function doesn't expect non-integral extraction modes.
>
> But who knows - the code is older than I am (OK, not, but older than
> my involvment in GCC ;))
>
> How about attached patch w/ below changelog
>
> gcc/ChangeLog:
>
>         * expmed.c (extract_bit_field_1): Make sure we're playing with
>         integral modes before call extract_integral_bit_field.
>         (extract_integral_bit_field): Add a parameter of type
>         scalar_int_mode which corresponds to of tmode.
>         And call extract_and_convert_fixed_bit_field instead of
>         extract_fixed_bit_field and convert_extracted_bit_field.
>         (extract_and_convert_fixed_bit_field): New function, it's a
>         combination of extract_fixed_bit_field and
>         convert_extracted_bit_field.
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/i386/float16-5.c: New test.
>
> I'd like to ping this patch, or maybe we can use the patch before with
> richi's comments.
>
> Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
> this patch, i'd like someone to help review this patch.
>
> Please ignore the former attached patch, should be the patch attached here.
>
> Richard.
>
> +      && imode != tmode
> +      && imode != GET_MODE (op0))
> +    {
> +      rtx ret = extract_integral_bit_field (op0, op0_mode,
> +                                           bitsize.to_constant (),
> +                                           bitnum.to_constant (), unsignedp,
> +                                           NULL, imode, imode,
> +                                           reverse, fallback_p);
> +      gcc_assert (ret);
> +
> +      if (!REG_P (ret))
> +       ret = force_reg (imode, ret);
> +      return gen_lowpart_SUBREG (tmode, ret);
> +    }
> +
>    /* It's possible we'll need to handle other cases here for
>       polynomial bitnum and bitsize.  */
>
> Minor nit, but since the code is using to_constant, it should go after
> this comment rather than before it.
>
> Thanks,
> Richard
>
> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
> new file mode 100644
> index 00000000000..ebc0af1490b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse2 -O2" } */
> +_Float16
> +foo (int a)
> +{
> +  union {
> +    int a;
> +    _Float16 b;
> +  }c;
> +  c.a = a;
> +  return c.b;
> +}
> --
> 2.27.0
>
>
>
> --
> BR,
> Hongtao
>
>
> --
> BR,
> Hongtao
>
>
> --
> BR,
> Hongtao
>
>
>
> 0001-Make-sure-we-re-playing-with-integral-modes-before-c.patch
>
> From 9c77ac15e69b567156a82debe45e3ced10df1110 Mon Sep 17 00:00:00 2001
> From: liuhongt <hongtao.liu@intel.com>
> Date: Fri, 6 Aug 2021 10:18:43 +0800
> Subject: [PATCH] Make sure we're playing with integral modes before call
>  extract_integral_bit_field.
>
> gcc/ChangeLog:
>
> * expmed.c (extract_bit_field_1): Make sure we're playing with
> integral modes before call extract_integral_bit_field.
> (extract_integral_bit_field): Add a parameter of type
> scalar_int_mode which corresponds to of tmode.
> And call extract_and_convert_fixed_bit_field instead of
> extract_fixed_bit_field and convert_extracted_bit_field.
> (extract_and_convert_fixed_bit_field): New function, it's a
> combination of extract_fixed_bit_field and
> convert_extracted_bit_field.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/float16-5.c: New test.
>
> I bet this is all getting triggered due to the introduction of HFmode.  Wrapping with a subreg to get an integral mode may work, but I'd be more comfortable if we had other instances where we knew wrapping an SF/DF mode with SI/DI was enough to make all this code safe.  I fear we're just pushing the bug down in one spot and it's going to pop up elsewhere.
>
> Another approach would be to force the object into memory, but I suspect y'all don't want to do that ;-)
>
> So in the end, it may be reasonable, but I wouldn't be surprised if we trip over more problems in this code with FP modes.

One thought I had is whether we can "fix" validate_subreg to have less
"weird" allowed float-int
special cases.  As said upthread I think that we either should allow
all of those, implying that
subregs work semantically as if there's subregs to same-sized integer
modes inbetween or
disallow them all and make sure we're actually doing that explicitely.

For example

  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
     is the culprit here, and not the backends.  */
  else if (known_ge (osize, regsize) && known_ge (isize, osize))
    ;

I can't decipther rtl.text as to what the semantics of such a subreg is
given the docs hand-wave about WORDS_BIG_ENDIAN vs.
FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
when you mix those in a subreg.  So maybe the above should
have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.

But then the world would be much simpler if subregs of non-same size
modes have explicit documentation for the mode kinds we have.

Richard.

> jeff
>
Richard Sandiford Aug. 26, 2021, 9:06 a.m. UTC | #17
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> One thought I had is whether we can "fix" validate_subreg to have less
> "weird" allowed float-int
> special cases.  As said upthread I think that we either should allow
> all of those, implying that
> subregs work semantically as if there's subregs to same-sized integer
> modes inbetween or
> disallow them all and make sure we're actually doing that explicitely.
>
> For example
>
>   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
>      is the culprit here, and not the backends.  */
>   else if (known_ge (osize, regsize) && known_ge (isize, osize))
>     ;
>
> I can't decipther rtl.text as to what the semantics of such a subreg is
> given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> when you mix those in a subreg.  So maybe the above should
> have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
>
> But then the world would be much simpler if subregs of non-same size
> modes have explicit documentation for the mode kinds we have.

Yeah.  Although validate_subreg was a good idea, some of the mode checks
are IMO a failed experiment.  The hope was that eventually we'd remove
all those special exceptions once the culprit has been fixed.  However,
the code is over 16 years old at this point and those changes never
happened.

Nested subregs aren't a thing (thankfully) and one of the big disadvantages
of the current validate_subreg mode-changing rules is that they aren't
transitive.  This can artificially require temporary pseudos for things
that could be expressed directly as a single subreg.

I'm not even sure we have to worry about WORDS_BIG_ENDIAN vs.
FLOAT_WORDS_BIG_ENDIAN.  The SUBREG_BYTE rules follow WORDS_BIG_ENDIAN
for all modes, so its up to the generator of the subreg to diddle the
SUBREG_BYTE if they want to use a different interpretation for floats.

Thanks,
Richard
Richard Biener Aug. 26, 2021, 10:14 a.m. UTC | #18
On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > One thought I had is whether we can "fix" validate_subreg to have less
> > "weird" allowed float-int
> > special cases.  As said upthread I think that we either should allow
> > all of those, implying that
> > subregs work semantically as if there's subregs to same-sized integer
> > modes inbetween or
> > disallow them all and make sure we're actually doing that explicitely.
> >
> > For example
> >
> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> >      is the culprit here, and not the backends.  */
> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> >     ;
> >
> > I can't decipther rtl.text as to what the semantics of such a subreg is
> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > when you mix those in a subreg.  So maybe the above should
> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> >
> > But then the world would be much simpler if subregs of non-same size
> > modes have explicit documentation for the mode kinds we have.
>
> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> are IMO a failed experiment.  The hope was that eventually we'd remove
> all those special exceptions once the culprit has been fixed.  However,
> the code is over 16 years old at this point and those changes never
> happened.
>
> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> of the current validate_subreg mode-changing rules is that they aren't
> transitive.  This can artificially require temporary pseudos for things
> that could be expressed directly as a single subreg.

And that's what the proposed patch does (add same-mode size integer mode
punning intermediate subregs).

So if that's not supposed to be necessary then why restrict subregs at all?

I mean you seem to imply that the semantics would be clear and well-defined
(to you - not to me).  The only thing is that of course not all subregs are
"implemented" by a target (or can be, w/o spilling).

Which means - we should adjust validate_subreg with another special-case
or rather generalize the existing ones to an overall set that makes more
sense?

Richard.

> I'm not even sure we have to worry about WORDS_BIG_ENDIAN vs.
> FLOAT_WORDS_BIG_ENDIAN.  The SUBREG_BYTE rules follow WORDS_BIG_ENDIAN
> for all modes, so its up to the generator of the subreg to diddle the
> SUBREG_BYTE if they want to use a different interpretation for floats.
>
> Thanks,
> Richard
Richard Sandiford Aug. 26, 2021, 10:50 a.m. UTC | #19
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > One thought I had is whether we can "fix" validate_subreg to have less
>> > "weird" allowed float-int
>> > special cases.  As said upthread I think that we either should allow
>> > all of those, implying that
>> > subregs work semantically as if there's subregs to same-sized integer
>> > modes inbetween or
>> > disallow them all and make sure we're actually doing that explicitely.
>> >
>> > For example
>> >
>> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
>> >      is the culprit here, and not the backends.  */
>> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
>> >     ;
>> >
>> > I can't decipther rtl.text as to what the semantics of such a subreg is
>> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
>> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
>> > when you mix those in a subreg.  So maybe the above should
>> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
>> >
>> > But then the world would be much simpler if subregs of non-same size
>> > modes have explicit documentation for the mode kinds we have.
>>
>> Yeah.  Although validate_subreg was a good idea, some of the mode checks
>> are IMO a failed experiment.  The hope was that eventually we'd remove
>> all those special exceptions once the culprit has been fixed.  However,
>> the code is over 16 years old at this point and those changes never
>> happened.
>>
>> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
>> of the current validate_subreg mode-changing rules is that they aren't
>> transitive.  This can artificially require temporary pseudos for things
>> that could be expressed directly as a single subreg.
>
> And that's what the proposed patch does (add same-mode size integer mode
> punning intermediate subregs).
>
> So if that's not supposed to be necessary then why restrict subregs at all?

I was trying to say: I'm not sure we should.

> I mean you seem to imply that the semantics would be clear and well-defined
> (to you - not to me).  The only thing is that of course not all subregs are
> "implemented" by a target (or can be, w/o spilling).

Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
But it only comes in to play during RA or when trying to take
the subreg of a particular hard register.  Transitivity doesn't
matter so much for the hard register case since the result of
simplify_gen_subreg should then be another hard register.

> Which means - we should adjust validate_subreg with another special-case
> or rather generalize the existing ones to an overall set that makes more
> sense?

Maybe it's too radical, but I would whether we should just get rid of:

  /* ??? This should not be here.  Temporarily continue to allow word_mode
     subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
     Generally, backends are doing something sketchy but it'll take time to
     fix them all.  */
  if (omode == word_mode)
    ;
  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
     is the culprit here, and not the backends.  */
  else if (known_ge (osize, regsize) && known_ge (isize, osize))
    ;
  /* Allow component subregs of complex and vector.  Though given the below
     extraction rules, it's not always clear what that means.  */
  else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
	   && GET_MODE_INNER (imode) == omode)
    ;
  /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
     i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
     surely isn't the cleanest way to represent this.  It's questionable
     if this ought to be represented at all -- why can't this all be hidden
     in post-reload splitters that make arbitrarily mode changes to the
     registers themselves.  */
  else if (VECTOR_MODE_P (omode)
	   && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
    ;
  /* Subregs involving floating point modes are not allowed to
     change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
     (subreg:SI (reg:DF) 0) isn't.  */
  else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
    {
      if (! (known_eq (isize, osize)
	     /* LRA can use subreg to store a floating point value in
		an integer mode.  Although the floating point and the
		integer modes need the same number of hard registers,
		the size of floating point mode can be less than the
		integer mode.  LRA also uses subregs for a register
		should be used in different mode in on insn.  */
	     || lra_in_progress))
	return false;
    }

altogether.

Thanks,
Richard
Richard Biener Aug. 26, 2021, 11:09 a.m. UTC | #20
On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> >> > One thought I had is whether we can "fix" validate_subreg to have less
> >> > "weird" allowed float-int
> >> > special cases.  As said upthread I think that we either should allow
> >> > all of those, implying that
> >> > subregs work semantically as if there's subregs to same-sized integer
> >> > modes inbetween or
> >> > disallow them all and make sure we're actually doing that explicitely.
> >> >
> >> > For example
> >> >
> >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> >> >      is the culprit here, and not the backends.  */
> >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> >> >     ;
> >> >
> >> > I can't decipther rtl.text as to what the semantics of such a subreg is
> >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> >> > when you mix those in a subreg.  So maybe the above should
> >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> >> >
> >> > But then the world would be much simpler if subregs of non-same size
> >> > modes have explicit documentation for the mode kinds we have.
> >>
> >> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> >> are IMO a failed experiment.  The hope was that eventually we'd remove
> >> all those special exceptions once the culprit has been fixed.  However,
> >> the code is over 16 years old at this point and those changes never
> >> happened.
> >>
> >> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> >> of the current validate_subreg mode-changing rules is that they aren't
> >> transitive.  This can artificially require temporary pseudos for things
> >> that could be expressed directly as a single subreg.
> >
> > And that's what the proposed patch does (add same-mode size integer mode
> > punning intermediate subregs).
> >
> > So if that's not supposed to be necessary then why restrict subregs at all?
>
> I was trying to say: I'm not sure we should.
>
> > I mean you seem to imply that the semantics would be clear and well-defined
> > (to you - not to me).  The only thing is that of course not all subregs are
> > "implemented" by a target (or can be, w/o spilling).
>
> Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> But it only comes in to play during RA or when trying to take
> the subreg of a particular hard register.  Transitivity doesn't
> matter so much for the hard register case since the result of
> simplify_gen_subreg should then be another hard register.
>
> > Which means - we should adjust validate_subreg with another special-case
> > or rather generalize the existing ones to an overall set that makes more
> > sense?
>
> Maybe it's too radical, but I would whether we should just get rid of:
>
>   /* ??? This should not be here.  Temporarily continue to allow word_mode
>      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
>      Generally, backends are doing something sketchy but it'll take time to
>      fix them all.  */
>   if (omode == word_mode)
>     ;
>   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
>      is the culprit here, and not the backends.  */
>   else if (known_ge (osize, regsize) && known_ge (isize, osize))
>     ;
>   /* Allow component subregs of complex and vector.  Though given the below
>      extraction rules, it's not always clear what that means.  */
>   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
>            && GET_MODE_INNER (imode) == omode)
>     ;
>   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
>      i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
>      surely isn't the cleanest way to represent this.  It's questionable
>      if this ought to be represented at all -- why can't this all be hidden
>      in post-reload splitters that make arbitrarily mode changes to the
>      registers themselves.  */
>   else if (VECTOR_MODE_P (omode)
>            && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
>     ;
>   /* Subregs involving floating point modes are not allowed to
>      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
>      (subreg:SI (reg:DF) 0) isn't.  */
>   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
>     {
>       if (! (known_eq (isize, osize)
>              /* LRA can use subreg to store a floating point value in
>                 an integer mode.  Although the floating point and the
>                 integer modes need the same number of hard registers,
>                 the size of floating point mode can be less than the
>                 integer mode.  LRA also uses subregs for a register
>                 should be used in different mode in on insn.  */
>              || lra_in_progress))
>         return false;
>     }
>
> altogether.

Yeah, I would fully support this.  Maybe replace it with a comment
but I don't know what it should say.

Richard.

>
> Thanks,
> Richard
Hongtao Liu Aug. 27, 2021, 4:56 a.m. UTC | #21
On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> >
> > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > <richard.sandiford@arm.com> wrote:
> > >>
> > >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > >> > One thought I had is whether we can "fix" validate_subreg to have less
> > >> > "weird" allowed float-int
> > >> > special cases.  As said upthread I think that we either should allow
> > >> > all of those, implying that
> > >> > subregs work semantically as if there's subregs to same-sized integer
> > >> > modes inbetween or
> > >> > disallow them all and make sure we're actually doing that explicitely.
> > >> >
> > >> > For example
> > >> >
> > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > >> >      is the culprit here, and not the backends.  */
> > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > >> >     ;
> > >> >
> > >> > I can't decipther rtl.text as to what the semantics of such a subreg is
> > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > >> > when you mix those in a subreg.  So maybe the above should
> > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > >> >
> > >> > But then the world would be much simpler if subregs of non-same size
> > >> > modes have explicit documentation for the mode kinds we have.
> > >>
> > >> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> > >> are IMO a failed experiment.  The hope was that eventually we'd remove
> > >> all those special exceptions once the culprit has been fixed.  However,
> > >> the code is over 16 years old at this point and those changes never
> > >> happened.
> > >>
> > >> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> > >> of the current validate_subreg mode-changing rules is that they aren't
> > >> transitive.  This can artificially require temporary pseudos for things
> > >> that could be expressed directly as a single subreg.
> > >
> > > And that's what the proposed patch does (add same-mode size integer mode
> > > punning intermediate subregs).
> > >
> > > So if that's not supposed to be necessary then why restrict subregs at all?
> >
> > I was trying to say: I'm not sure we should.
> >
> > > I mean you seem to imply that the semantics would be clear and well-defined
> > > (to you - not to me).  The only thing is that of course not all subregs are
> > > "implemented" by a target (or can be, w/o spilling).
> >
> > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > But it only comes in to play during RA or when trying to take
> > the subreg of a particular hard register.  Transitivity doesn't
> > matter so much for the hard register case since the result of
> > simplify_gen_subreg should then be another hard register.
> >
> > > Which means - we should adjust validate_subreg with another special-case
> > > or rather generalize the existing ones to an overall set that makes more
> > > sense?
> >
> > Maybe it's too radical, but I would whether we should just get rid of:
> >
> >   /* ??? This should not be here.  Temporarily continue to allow word_mode
> >      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
> >      Generally, backends are doing something sketchy but it'll take time to
> >      fix them all.  */
> >   if (omode == word_mode)
> >     ;
> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> >      is the culprit here, and not the backends.  */
> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> >     ;
> >   /* Allow component subregs of complex and vector.  Though given the below
> >      extraction rules, it's not always clear what that means.  */
> >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> >            && GET_MODE_INNER (imode) == omode)
> >     ;
> >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> >      i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> >      surely isn't the cleanest way to represent this.  It's questionable
> >      if this ought to be represented at all -- why can't this all be hidden
> >      in post-reload splitters that make arbitrarily mode changes to the
> >      registers themselves.  */
> >   else if (VECTOR_MODE_P (omode)
> >            && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> >     ;
> >   /* Subregs involving floating point modes are not allowed to
> >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> >      (subreg:SI (reg:DF) 0) isn't.  */
> >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> >     {
> >       if (! (known_eq (isize, osize)
> >              /* LRA can use subreg to store a floating point value in
> >                 an integer mode.  Although the floating point and the
> >                 integer modes need the same number of hard registers,
> >                 the size of floating point mode can be less than the
> >                 integer mode.  LRA also uses subregs for a register
> >                 should be used in different mode in on insn.  */
> >              || lra_in_progress))
> >         return false;
> >     }
> >
> > altogether.
>
> Yeah, I would fully support this.  Maybe replace it with a comment
> but I don't know what it should say.
>
> Richard.
>
> >
> > Thanks,
> > Richard

I'm going to upstream the patch.
Joseph Myers Aug. 30, 2021, 7:09 p.m. UTC | #22
This commit introduces an ICE building libgcc for 32-bit SPARC.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133
Jeff Law Aug. 30, 2021, 9:15 p.m. UTC | #23
On 8/30/2021 1:09 PM, Joseph Myers wrote:
> This commit introduces an ICE building libgcc for 32-bit SPARC.
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133
I've sent Hongtao a testcase which is ICE-ing.  It was mcore-elf, but I 
saw similar failures on a half-dozen architectures before I reverted the 
change locally.

jeff
Richard Biener Aug. 31, 2021, 6:10 a.m. UTC | #24
On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> > >
> > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > > <richard.sandiford@arm.com> wrote:
> > > >>
> > > >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > >> > One thought I had is whether we can "fix" validate_subreg to have less
> > > >> > "weird" allowed float-int
> > > >> > special cases.  As said upthread I think that we either should allow
> > > >> > all of those, implying that
> > > >> > subregs work semantically as if there's subregs to same-sized integer
> > > >> > modes inbetween or
> > > >> > disallow them all and make sure we're actually doing that explicitely.
> > > >> >
> > > >> > For example
> > > >> >
> > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > >> >      is the culprit here, and not the backends.  */
> > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > >> >     ;
> > > >> >
> > > >> > I can't decipther rtl.text as to what the semantics of such a subreg is
> > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > >> > when you mix those in a subreg.  So maybe the above should
> > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > >> >
> > > >> > But then the world would be much simpler if subregs of non-same size
> > > >> > modes have explicit documentation for the mode kinds we have.
> > > >>
> > > >> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> > > >> are IMO a failed experiment.  The hope was that eventually we'd remove
> > > >> all those special exceptions once the culprit has been fixed.  However,
> > > >> the code is over 16 years old at this point and those changes never
> > > >> happened.
> > > >>
> > > >> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> > > >> of the current validate_subreg mode-changing rules is that they aren't
> > > >> transitive.  This can artificially require temporary pseudos for things
> > > >> that could be expressed directly as a single subreg.
> > > >
> > > > And that's what the proposed patch does (add same-mode size integer mode
> > > > punning intermediate subregs).
> > > >
> > > > So if that's not supposed to be necessary then why restrict subregs at all?
> > >
> > > I was trying to say: I'm not sure we should.
> > >
> > > > I mean you seem to imply that the semantics would be clear and well-defined
> > > > (to you - not to me).  The only thing is that of course not all subregs are
> > > > "implemented" by a target (or can be, w/o spilling).
> > >
> > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > But it only comes in to play during RA or when trying to take
> > > the subreg of a particular hard register.  Transitivity doesn't
> > > matter so much for the hard register case since the result of
> > > simplify_gen_subreg should then be another hard register.
> > >
> > > > Which means - we should adjust validate_subreg with another special-case
> > > > or rather generalize the existing ones to an overall set that makes more
> > > > sense?
> > >
> > > Maybe it's too radical, but I would whether we should just get rid of:
> > >
> > >   /* ??? This should not be here.  Temporarily continue to allow word_mode
> > >      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
> > >      Generally, backends are doing something sketchy but it'll take time to
> > >      fix them all.  */
> > >   if (omode == word_mode)
> > >     ;
> > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > >      is the culprit here, and not the backends.  */
> > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > >     ;
> > >   /* Allow component subregs of complex and vector.  Though given the below
> > >      extraction rules, it's not always clear what that means.  */
> > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > >            && GET_MODE_INNER (imode) == omode)
> > >     ;
> > >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> > >      i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > >      surely isn't the cleanest way to represent this.  It's questionable
> > >      if this ought to be represented at all -- why can't this all be hidden
> > >      in post-reload splitters that make arbitrarily mode changes to the
> > >      registers themselves.  */
> > >   else if (VECTOR_MODE_P (omode)
> > >            && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> > >     ;
> > >   /* Subregs involving floating point modes are not allowed to
> > >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > >      (subreg:SI (reg:DF) 0) isn't.  */
> > >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> > >     {
> > >       if (! (known_eq (isize, osize)
> > >              /* LRA can use subreg to store a floating point value in
> > >                 an integer mode.  Although the floating point and the
> > >                 integer modes need the same number of hard registers,
> > >                 the size of floating point mode can be less than the
> > >                 integer mode.  LRA also uses subregs for a register
> > >                 should be used in different mode in on insn.  */
> > >              || lra_in_progress))
> > >         return false;
> > >     }
> > >
> > > altogether.
> >
> > Yeah, I would fully support this.  Maybe replace it with a comment
> > but I don't know what it should say.
> >
> > Richard.
> >
> > >
> > > Thanks,
> > > Richard
>
> I'm going to upstream the patch.

Hmm, so looks like you pushed the variant massaging extract_bit_field.  Above
we supported to instead "fix" validate_subreg to allow the HFmode subreg.

So maybe we should revert and try that?

Richard.

> --
> BR,
> Hongtao
Hongtao Liu Aug. 31, 2021, 6:30 a.m. UTC | #25
On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> > > <richard.sandiford@arm.com> wrote:
> > > >
> > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > > > <richard.sandiford@arm.com> wrote:
> > > > >>
> > > > >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > >> > One thought I had is whether we can "fix" validate_subreg to have less
> > > > >> > "weird" allowed float-int
> > > > >> > special cases.  As said upthread I think that we either should allow
> > > > >> > all of those, implying that
> > > > >> > subregs work semantically as if there's subregs to same-sized integer
> > > > >> > modes inbetween or
> > > > >> > disallow them all and make sure we're actually doing that explicitely.
> > > > >> >
> > > > >> > For example
> > > > >> >
> > > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > > >> >      is the culprit here, and not the backends.  */
> > > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > >> >     ;
> > > > >> >
> > > > >> > I can't decipther rtl.text as to what the semantics of such a subreg is
> > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > > >> > when you mix those in a subreg.  So maybe the above should
> > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > > >> >
> > > > >> > But then the world would be much simpler if subregs of non-same size
> > > > >> > modes have explicit documentation for the mode kinds we have.
> > > > >>
> > > > >> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> > > > >> are IMO a failed experiment.  The hope was that eventually we'd remove
> > > > >> all those special exceptions once the culprit has been fixed.  However,
> > > > >> the code is over 16 years old at this point and those changes never
> > > > >> happened.
> > > > >>
> > > > >> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> > > > >> of the current validate_subreg mode-changing rules is that they aren't
> > > > >> transitive.  This can artificially require temporary pseudos for things
> > > > >> that could be expressed directly as a single subreg.
> > > > >
> > > > > And that's what the proposed patch does (add same-mode size integer mode
> > > > > punning intermediate subregs).
> > > > >
> > > > > So if that's not supposed to be necessary then why restrict subregs at all?
> > > >
> > > > I was trying to say: I'm not sure we should.
> > > >
> > > > > I mean you seem to imply that the semantics would be clear and well-defined
> > > > > (to you - not to me).  The only thing is that of course not all subregs are
> > > > > "implemented" by a target (or can be, w/o spilling).
> > > >
> > > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > > But it only comes in to play during RA or when trying to take
> > > > the subreg of a particular hard register.  Transitivity doesn't
> > > > matter so much for the hard register case since the result of
> > > > simplify_gen_subreg should then be another hard register.
> > > >
> > > > > Which means - we should adjust validate_subreg with another special-case
> > > > > or rather generalize the existing ones to an overall set that makes more
> > > > > sense?
> > > >
> > > > Maybe it's too radical, but I would whether we should just get rid of:
> > > >
> > > >   /* ??? This should not be here.  Temporarily continue to allow word_mode
> > > >      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
> > > >      Generally, backends are doing something sketchy but it'll take time to
> > > >      fix them all.  */
> > > >   if (omode == word_mode)
> > > >     ;
> > > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > >      is the culprit here, and not the backends.  */
> > > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > >     ;
> > > >   /* Allow component subregs of complex and vector.  Though given the below
> > > >      extraction rules, it's not always clear what that means.  */
> > > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > > >            && GET_MODE_INNER (imode) == omode)
> > > >     ;
> > > >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> > > >      i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > > >      surely isn't the cleanest way to represent this.  It's questionable
> > > >      if this ought to be represented at all -- why can't this all be hidden
> > > >      in post-reload splitters that make arbitrarily mode changes to the
> > > >      registers themselves.  */
> > > >   else if (VECTOR_MODE_P (omode)
> > > >            && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> > > >     ;
> > > >   /* Subregs involving floating point modes are not allowed to
> > > >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > > >      (subreg:SI (reg:DF) 0) isn't.  */
> > > >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> > > >     {
> > > >       if (! (known_eq (isize, osize)
> > > >              /* LRA can use subreg to store a floating point value in
> > > >                 an integer mode.  Although the floating point and the
> > > >                 integer modes need the same number of hard registers,
> > > >                 the size of floating point mode can be less than the
> > > >                 integer mode.  LRA also uses subregs for a register
> > > >                 should be used in different mode in on insn.  */
> > > >              || lra_in_progress))
> > > >         return false;
> > > >     }
> > > >
> > > > altogether.
> > >
> > > Yeah, I would fully support this.  Maybe replace it with a comment
> > > but I don't know what it should say.
> > >
> > > Richard.
> > >
> > > >
> > > > Thanks,
> > > > Richard
> >
> > I'm going to upstream the patch.
>
> Hmm, so looks like you pushed the variant massaging extract_bit_field.  Above
> we supported to instead "fix" validate_subreg to allow the HFmode subreg.
>
> So maybe we should revert and try that?
This one:

> +  /* ???Similarly like (subreg:DI (reg:SF), also allow (subreg:SI (reg:HF))
> +     here. Though extract_bit_field is the culprit here, not the backends.  */
> +  else if (known_gt (regsize, osize) && known_gt (osize, isize)
> +          && FLOAT_MODE_P (imode) && INTEGRAL_MODE_P (omode))
> +    ;

or this one

+      machine_mode tmode = GET_MODE (target);
       if (REG_P (target)
-          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
+          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+          /* When validate_subreg doesn't allow subreg between integer mode
+             and float mode with different size, It will hit gcc_assert in
+             gen_lowpart_general. Also subreg like (subreg:DI (reg:SF)) is
+             not really needed, codes like below will be finally generated.
+             (set (reg:SI 1)
+                  (and:SI (reg:DI 2) -1))
+             (set (reg:SF 3)
+                  (subreg:SF (reg:SI 1)))  */
+          && FLOAT_MODE_P (tmode) && INTEGRAL_MODE_P (mode)
+          && maybe_ne (GET_MODE_SIZE (tmode), GET_MODE_SIZE (mode)))
         {
           target = gen_lowpart (ext_mode, target);
           if (partial_subreg_p (GET_MODE (spec_target), ext_mode))

or the proposed patch in PR102133 which may risk falling down a rabbit hole?

   gcc_checking_assert (!x
        || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
        || (use_register_for_decl (t)
-   ? (REG_P (x)
+   ? (REG_P (x) || SUBREG_P (x)
       || (GET_CODE (x) == CONCAT
   && (REG_P (XEXP (x, 0))
       || SUBREG_P (XEXP (x, 0)))


>
> Richard.
>
> > --
> > BR,
> > Hongtao
Hongtao Liu Aug. 31, 2021, 6:48 a.m. UTC | #26
On Tue, Aug 31, 2021 at 2:30 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> > > > <richard.sandiford@arm.com> wrote:
> > > > >
> > > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > > > > <richard.sandiford@arm.com> wrote:
> > > > > >>
> > > > > >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > >> > One thought I had is whether we can "fix" validate_subreg to have less
> > > > > >> > "weird" allowed float-int
> > > > > >> > special cases.  As said upthread I think that we either should allow
> > > > > >> > all of those, implying that
> > > > > >> > subregs work semantically as if there's subregs to same-sized integer
> > > > > >> > modes inbetween or
> > > > > >> > disallow them all and make sure we're actually doing that explicitely.
> > > > > >> >
> > > > > >> > For example
> > > > > >> >
> > > > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > > > >> >      is the culprit here, and not the backends.  */
> > > > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > > >> >     ;
> > > > > >> >
> > > > > >> > I can't decipther rtl.text as to what the semantics of such a subreg is
> > > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > > > >> > when you mix those in a subreg.  So maybe the above should
> > > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > > > >> >
> > > > > >> > But then the world would be much simpler if subregs of non-same size
> > > > > >> > modes have explicit documentation for the mode kinds we have.
> > > > > >>
> > > > > >> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> > > > > >> are IMO a failed experiment.  The hope was that eventually we'd remove
> > > > > >> all those special exceptions once the culprit has been fixed.  However,
> > > > > >> the code is over 16 years old at this point and those changes never
> > > > > >> happened.
> > > > > >>
> > > > > >> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> > > > > >> of the current validate_subreg mode-changing rules is that they aren't
> > > > > >> transitive.  This can artificially require temporary pseudos for things
> > > > > >> that could be expressed directly as a single subreg.
> > > > > >
> > > > > > And that's what the proposed patch does (add same-mode size integer mode
> > > > > > punning intermediate subregs).
> > > > > >
> > > > > > So if that's not supposed to be necessary then why restrict subregs at all?
> > > > >
> > > > > I was trying to say: I'm not sure we should.
> > > > >
> > > > > > I mean you seem to imply that the semantics would be clear and well-defined
> > > > > > (to you - not to me).  The only thing is that of course not all subregs are
> > > > > > "implemented" by a target (or can be, w/o spilling).
> > > > >
> > > > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > > > But it only comes in to play during RA or when trying to take
> > > > > the subreg of a particular hard register.  Transitivity doesn't
> > > > > matter so much for the hard register case since the result of
> > > > > simplify_gen_subreg should then be another hard register.
> > > > >
> > > > > > Which means - we should adjust validate_subreg with another special-case
> > > > > > or rather generalize the existing ones to an overall set that makes more
> > > > > > sense?
> > > > >
> > > > > Maybe it's too radical, but I would whether we should just get rid of:
> > > > >
> > > > >   /* ??? This should not be here.  Temporarily continue to allow word_mode
> > > > >      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
> > > > >      Generally, backends are doing something sketchy but it'll take time to
> > > > >      fix them all.  */
> > > > >   if (omode == word_mode)
> > > > >     ;
> > > > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > > >      is the culprit here, and not the backends.  */
> > > > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > >     ;
> > > > >   /* Allow component subregs of complex and vector.  Though given the below
> > > > >      extraction rules, it's not always clear what that means.  */
> > > > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > > > >            && GET_MODE_INNER (imode) == omode)
> > > > >     ;
> > > > >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> > > > >      i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > > > >      surely isn't the cleanest way to represent this.  It's questionable
> > > > >      if this ought to be represented at all -- why can't this all be hidden
> > > > >      in post-reload splitters that make arbitrarily mode changes to the
> > > > >      registers themselves.  */
> > > > >   else if (VECTOR_MODE_P (omode)
> > > > >            && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> > > > >     ;
> > > > >   /* Subregs involving floating point modes are not allowed to
> > > > >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > > > >      (subreg:SI (reg:DF) 0) isn't.  */
> > > > >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> > > > >     {
> > > > >       if (! (known_eq (isize, osize)
> > > > >              /* LRA can use subreg to store a floating point value in
> > > > >                 an integer mode.  Although the floating point and the
> > > > >                 integer modes need the same number of hard registers,
> > > > >                 the size of floating point mode can be less than the
> > > > >                 integer mode.  LRA also uses subregs for a register
> > > > >                 should be used in different mode in on insn.  */
> > > > >              || lra_in_progress))
> > > > >         return false;
> > > > >     }
> > > > >
> > > > > altogether.
let me test the patch which removed the upper code.
> > > >
> > > > Yeah, I would fully support this.  Maybe replace it with a comment
> > > > but I don't know what it should say.
> > > >
> > > > Richard.
> > > >
> > > > >
> > > > > Thanks,
> > > > > Richard
> > >
> > > I'm going to upstream the patch.
> >
> > Hmm, so looks like you pushed the variant massaging extract_bit_field.  Above
> > we supported to instead "fix" validate_subreg to allow the HFmode subreg.
> >
> > So maybe we should revert and try that?
> This one:
>
> > +  /* ???Similarly like (subreg:DI (reg:SF), also allow (subreg:SI (reg:HF))
> > +     here. Though extract_bit_field is the culprit here, not the backends.  */
> > +  else if (known_gt (regsize, osize) && known_gt (osize, isize)
> > +          && FLOAT_MODE_P (imode) && INTEGRAL_MODE_P (omode))
> > +    ;
>
> or this one
>
> +      machine_mode tmode = GET_MODE (target);
>        if (REG_P (target)
> -          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> +          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
> +          /* When validate_subreg doesn't allow subreg between integer mode
> +             and float mode with different size, It will hit gcc_assert in
> +             gen_lowpart_general. Also subreg like (subreg:DI (reg:SF)) is
> +             not really needed, codes like below will be finally generated.
> +             (set (reg:SI 1)
> +                  (and:SI (reg:DI 2) -1))
> +             (set (reg:SF 3)
> +                  (subreg:SF (reg:SI 1)))  */
> +          && FLOAT_MODE_P (tmode) && INTEGRAL_MODE_P (mode)
> +          && maybe_ne (GET_MODE_SIZE (tmode), GET_MODE_SIZE (mode)))
>          {
>            target = gen_lowpart (ext_mode, target);
>            if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
>
> or the proposed patch in PR102133 which may risk falling down a rabbit hole?
>
>    gcc_checking_assert (!x
>         || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
>         || (use_register_for_decl (t)
> -   ? (REG_P (x)
> +   ? (REG_P (x) || SUBREG_P (x)
>        || (GET_CODE (x) == CONCAT
>    && (REG_P (XEXP (x, 0))
>        || SUBREG_P (XEXP (x, 0)))
>
>
> >
> > Richard.
> >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao
Richard Biener Aug. 31, 2021, 11:16 a.m. UTC | #27
On Tue, Aug 31, 2021 at 8:48 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Aug 31, 2021 at 2:30 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Tue, Aug 31, 2021 at 2:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford
> > > > > <richard.sandiford@arm.com> wrote:
> > > > > >
> > > > > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford
> > > > > > > <richard.sandiford@arm.com> wrote:
> > > > > > >>
> > > > > > >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > > > > > >> > One thought I had is whether we can "fix" validate_subreg to have less
> > > > > > >> > "weird" allowed float-int
> > > > > > >> > special cases.  As said upthread I think that we either should allow
> > > > > > >> > all of those, implying that
> > > > > > >> > subregs work semantically as if there's subregs to same-sized integer
> > > > > > >> > modes inbetween or
> > > > > > >> > disallow them all and make sure we're actually doing that explicitely.
> > > > > > >> >
> > > > > > >> > For example
> > > > > > >> >
> > > > > > >> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > > > > >> >      is the culprit here, and not the backends.  */
> > > > > > >> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > > > >> >     ;
> > > > > > >> >
> > > > > > >> > I can't decipther rtl.text as to what the semantics of such a subreg is
> > > > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs.
> > > > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens
> > > > > > >> > when you mix those in a subreg.  So maybe the above should
> > > > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN.
> > > > > > >> >
> > > > > > >> > But then the world would be much simpler if subregs of non-same size
> > > > > > >> > modes have explicit documentation for the mode kinds we have.
> > > > > > >>
> > > > > > >> Yeah.  Although validate_subreg was a good idea, some of the mode checks
> > > > > > >> are IMO a failed experiment.  The hope was that eventually we'd remove
> > > > > > >> all those special exceptions once the culprit has been fixed.  However,
> > > > > > >> the code is over 16 years old at this point and those changes never
> > > > > > >> happened.
> > > > > > >>
> > > > > > >> Nested subregs aren't a thing (thankfully) and one of the big disadvantages
> > > > > > >> of the current validate_subreg mode-changing rules is that they aren't
> > > > > > >> transitive.  This can artificially require temporary pseudos for things
> > > > > > >> that could be expressed directly as a single subreg.
> > > > > > >
> > > > > > > And that's what the proposed patch does (add same-mode size integer mode
> > > > > > > punning intermediate subregs).
> > > > > > >
> > > > > > > So if that's not supposed to be necessary then why restrict subregs at all?
> > > > > >
> > > > > > I was trying to say: I'm not sure we should.
> > > > > >
> > > > > > > I mean you seem to imply that the semantics would be clear and well-defined
> > > > > > > (to you - not to me).  The only thing is that of course not all subregs are
> > > > > > > "implemented" by a target (or can be, w/o spilling).
> > > > > >
> > > > > > Yeah.  That's for TARGET_CAN_CHANGE_MODE_CLASS to decide.
> > > > > > But it only comes in to play during RA or when trying to take
> > > > > > the subreg of a particular hard register.  Transitivity doesn't
> > > > > > matter so much for the hard register case since the result of
> > > > > > simplify_gen_subreg should then be another hard register.
> > > > > >
> > > > > > > Which means - we should adjust validate_subreg with another special-case
> > > > > > > or rather generalize the existing ones to an overall set that makes more
> > > > > > > sense?
> > > > > >
> > > > > > Maybe it's too radical, but I would whether we should just get rid of:
> > > > > >
> > > > > >   /* ??? This should not be here.  Temporarily continue to allow word_mode
> > > > > >      subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
> > > > > >      Generally, backends are doing something sketchy but it'll take time to
> > > > > >      fix them all.  */
> > > > > >   if (omode == word_mode)
> > > > > >     ;
> > > > > >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> > > > > >      is the culprit here, and not the backends.  */
> > > > > >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > > > > >     ;
> > > > > >   /* Allow component subregs of complex and vector.  Though given the below
> > > > > >      extraction rules, it's not always clear what that means.  */
> > > > > >   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> > > > > >            && GET_MODE_INNER (imode) == omode)
> > > > > >     ;
> > > > > >   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> > > > > >      i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > > > > >      surely isn't the cleanest way to represent this.  It's questionable
> > > > > >      if this ought to be represented at all -- why can't this all be hidden
> > > > > >      in post-reload splitters that make arbitrarily mode changes to the
> > > > > >      registers themselves.  */
> > > > > >   else if (VECTOR_MODE_P (omode)
> > > > > >            && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> > > > > >     ;
> > > > > >   /* Subregs involving floating point modes are not allowed to
> > > > > >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > > > > >      (subreg:SI (reg:DF) 0) isn't.  */
> > > > > >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> > > > > >     {
> > > > > >       if (! (known_eq (isize, osize)
> > > > > >              /* LRA can use subreg to store a floating point value in
> > > > > >                 an integer mode.  Although the floating point and the
> > > > > >                 integer modes need the same number of hard registers,
> > > > > >                 the size of floating point mode can be less than the
> > > > > >                 integer mode.  LRA also uses subregs for a register
> > > > > >                 should be used in different mode in on insn.  */
> > > > > >              || lra_in_progress))
> > > > > >         return false;
> > > > > >     }
> > > > > >
> > > > > > altogether.
> let me test the patch which removed the upper code.

Yes, that's what I was refering to.

Richard.

> > > > >
> > > > > Yeah, I would fully support this.  Maybe replace it with a comment
> > > > > but I don't know what it should say.
> > > > >
> > > > > Richard.
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Richard
> > > >
> > > > I'm going to upstream the patch.
> > >
> > > Hmm, so looks like you pushed the variant massaging extract_bit_field.  Above
> > > we supported to instead "fix" validate_subreg to allow the HFmode subreg.
> > >
> > > So maybe we should revert and try that?
> > This one:
> >
> > > +  /* ???Similarly like (subreg:DI (reg:SF), also allow (subreg:SI (reg:HF))
> > > +     here. Though extract_bit_field is the culprit here, not the backends.  */
> > > +  else if (known_gt (regsize, osize) && known_gt (osize, isize)
> > > +          && FLOAT_MODE_P (imode) && INTEGRAL_MODE_P (omode))
> > > +    ;
> >
> > or this one
> >
> > +      machine_mode tmode = GET_MODE (target);
> >        if (REG_P (target)
> > -          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> > +          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
> > +          /* When validate_subreg doesn't allow subreg between integer mode
> > +             and float mode with different size, It will hit gcc_assert in
> > +             gen_lowpart_general. Also subreg like (subreg:DI (reg:SF)) is
> > +             not really needed, codes like below will be finally generated.
> > +             (set (reg:SI 1)
> > +                  (and:SI (reg:DI 2) -1))
> > +             (set (reg:SF 3)
> > +                  (subreg:SF (reg:SI 1)))  */
> > +          && FLOAT_MODE_P (tmode) && INTEGRAL_MODE_P (mode)
> > +          && maybe_ne (GET_MODE_SIZE (tmode), GET_MODE_SIZE (mode)))
> >          {
> >            target = gen_lowpart (ext_mode, target);
> >            if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> >
> > or the proposed patch in PR102133 which may risk falling down a rabbit hole?
> >
> >    gcc_checking_assert (!x
> >         || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
> >         || (use_register_for_decl (t)
> > -   ? (REG_P (x)
> > +   ? (REG_P (x) || SUBREG_P (x)
> >        || (GET_CODE (x) == CONCAT
> >    && (REG_P (XEXP (x, 0))
> >        || SUBREG_P (XEXP (x, 0)))
> >
> >
> > >
> > > Richard.
> > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao
diff mbox series

Patch

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 3143f38e057..72790693ef0 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -1850,6 +1850,25 @@  extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
       op0_mode = opt_scalar_int_mode ();
     }
 
+  /* Make sure we are playing with integral modes.  Pun with subregs
+     if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE
+     in extract_integral_bit_field.  */
+  if (int_mode_for_mode (tmode).exists (&imode)
+      && imode != tmode
+      && imode != GET_MODE (op0))
+    {
+      rtx ret = extract_integral_bit_field (op0, op0_mode,
+					    bitsize.to_constant (),
+					    bitnum.to_constant (), unsignedp,
+					    NULL, imode, imode,
+					    reverse, fallback_p);
+      gcc_assert (ret);
+
+      if (!REG_P (ret))
+	ret = force_reg (imode, ret);
+      return gen_lowpart_SUBREG (tmode, ret);
+    }
+
   /* It's possible we'll need to handle other cases here for
      polynomial bitnum and bitsize.  */
 
diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c b/gcc/testsuite/gcc.target/i386/float16-5.c
new file mode 100644
index 00000000000..ebc0af1490b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/float16-5.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-msse2 -O2" } */
+_Float16
+foo (int a)
+{
+  union {
+    int a;
+    _Float16 b;
+  }c;
+  c.a = a;
+  return c.b;
+}