diff mbox series

[06/13,APX,EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

Message ID 20230831082024.314097-7-hongyu.wang@intel.com
State New
Headers show
Series Support Intel APX EGPR | expand

Commit Message

Hongyu Wang Aug. 31, 2023, 8:20 a.m. UTC
From: Kong Lingling <lingling.kong@intel.com>

In inline asm, we do not know if the insn can use EGPR, so disable EGPR
usage by default from mapping the common reg/mem constraint to non-EGPR
constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
for inline asm.

gcc/ChangeLog:

	* config/i386/i386.cc (INCLUDE_STRING): Add include for
	ix86_md_asm_adjust.
	(ix86_md_asm_adjust): When APX EGPR enabled without specifying the
	target option, map reg/mem constraints to non-EGPR constraints.
	* config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-inline-gpr-norex2.c: New test.
---
 gcc/config/i386/i386.cc                       |  44 +++++++
 gcc/config/i386/i386.opt                      |   5 +
 .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
 3 files changed, 156 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c

Comments

Jakub Jelinek Aug. 31, 2023, 9:17 a.m. UTC | #1
On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> From: Kong Lingling <lingling.kong@intel.com>
> 
> In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> usage by default from mapping the common reg/mem constraint to non-EGPR
> constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> for inline asm.
> 
> gcc/ChangeLog:
> 
> 	* config/i386/i386.cc (INCLUDE_STRING): Add include for
> 	ix86_md_asm_adjust.
> 	(ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> 	target option, map reg/mem constraints to non-EGPR constraints.
> 	* config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> ---
>  gcc/config/i386/i386.cc                       |  44 +++++++
>  gcc/config/i386/i386.opt                      |   5 +
>  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
>  3 files changed, 156 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> 
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index d26d9ab0d9d..9460ebbfda4 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
>  along with GCC; see the file COPYING3.  If not see
>  <http://www.gnu.org/licenses/>.  */
>  
> +#define INCLUDE_STRING
>  #define IN_TARGET_CODE 1
>  
>  #include "config.h"
> @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
>    bool saw_asm_flag = false;
>  
>    start_sequence ();
> +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> +   constraints, will eventually map all the usable constraints in the future. */

I think there should be some constraint which explicitly has all the 32
GPRs, like there is one for just all 16 GPRs (h), so that regardless of
-mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.

Also, what about the "g" constraint?  Shouldn't there be another for "g"
without r16..r31?  What about the various other memory
constraints ("<", "o", ...)?

> +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> +    {
> +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> +	 and replace only r, exclude Br and Yr.  */
> +      for (unsigned i = 0; i < constraints.length (); i++)
> +	{
> +	  std::string *s = new std::string (constraints[i]);

Doesn't this leak memory (all the time)?
I must say I don't really understand why you need to use std::string here,
but certainly it shouldn't leak.

> +	  size_t pos = s->find ('r');
> +	  while (pos != std::string::npos)
> +	    {
> +	      if (pos > 0
> +		  && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> +		pos = s->find ('r', pos + 1);
> +	      else
> +		{
> +		  s->replace (pos, 1, "h");
> +		  constraints[i] = (const char*) s->c_str ();

Formatting (space before *).  The usual way for constraints is ggc_strdup on
some string in a buffer.  Also, one could have several copies or r (or m, memory (doesn't
that appear just in clobbers?  And that doesn't look like something that
should be replaced), Bm, e.g. in various alternatives.  So, you
need to change them all, not just the first hit.  "r,r,r,m" and the like.
Normally, one would simply walk the constraint string, parsing the special
letters (+, =, & etc.) and single letter constraints and 2 letter
constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources).
Either do it in 2 passes, first one counts how long constraint string one
will need after the adjustments (and whether to adjust something at all),
then if needed XALLOCAVEC it and adjust in there, or say use a
auto_vec<char, 32> for
it.

> +		  break;
> +		}
> +	    }
> +	}
> +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> +	 "Bt/Bt/BT".  */
> +      for (unsigned i = 0; i < constraints.length (); i++)
> +	{
> +	  std::string *s = new std::string (constraints[i]);
> +	  size_t pos = s->find ("m");
> +	  size_t pos2 = s->find ("memory");
> +	  if (pos != std::string::npos)
> +	    {
> +	      if (pos > 0 && (s->at (pos - 1) == 'B'))
> +		  s->replace (pos - 1, 2, "BT");
> +	      else if (pos2 != std::string::npos)
> +		  s->replace (pos, 6, "Bt");
> +	      else
> +		  s->replace (pos, 1, "Bt");

Formatting, the s->replace calls are indented too much.

	Jakub
Uros Bizjak Aug. 31, 2023, 10 a.m. UTC | #2
On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > From: Kong Lingling <lingling.kong@intel.com>
> >
> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > usage by default from mapping the common reg/mem constraint to non-EGPR
> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > for inline asm.
> >
> > gcc/ChangeLog:
> >
> >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> >       ix86_md_asm_adjust.
> >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> >       target option, map reg/mem constraints to non-EGPR constraints.
> >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > ---
> >  gcc/config/i386/i386.cc                       |  44 +++++++
> >  gcc/config/i386/i386.opt                      |   5 +
> >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> >  3 files changed, 156 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index d26d9ab0d9d..9460ebbfda4 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> >  along with GCC; see the file COPYING3.  If not see
> >  <http://www.gnu.org/licenses/>.  */
> >
> > +#define INCLUDE_STRING
> >  #define IN_TARGET_CODE 1
> >
> >  #include "config.h"
> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> >    bool saw_asm_flag = false;
> >
> >    start_sequence ();
> > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > +   constraints, will eventually map all the usable constraints in the future. */
>
> I think there should be some constraint which explicitly has all the 32
> GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
>
> Also, what about the "g" constraint?  Shouldn't there be another for "g"
> without r16..r31?  What about the various other memory
> constraints ("<", "o", ...)?

I think we should leave all existing constraints as they are, so "r"
covers only GPR16, "m" and "o" to only use GPR16. We can then
introduce "h" to instructions that have the ability to handle EGPR.
This would be somehow similar to the SSE -> AVX512F transition, where
we still have "x" for SSE16 and "v" was introduced as a separate
register class for EVEX SSE registers. This way, asm will be
compatible, when "r", "m", "o" and "g" are used. The new memory
constraint "Bt", should allow new registers, and should be added to
the constraint string as a separate constraint, and conditionally
enabled by relevant "isa" (AKA "enabled") attribute.

Uros.

> > +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> > +    {
> > +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> > +      and replace only r, exclude Br and Yr.  */
> > +      for (unsigned i = 0; i < constraints.length (); i++)
> > +     {
> > +       std::string *s = new std::string (constraints[i]);
>
> Doesn't this leak memory (all the time)?
> I must say I don't really understand why you need to use std::string here,
> but certainly it shouldn't leak.
>
> > +       size_t pos = s->find ('r');
> > +       while (pos != std::string::npos)
> > +         {
> > +           if (pos > 0
> > +               && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> > +             pos = s->find ('r', pos + 1);
> > +           else
> > +             {
> > +               s->replace (pos, 1, "h");
> > +               constraints[i] = (const char*) s->c_str ();
>
> Formatting (space before *).  The usual way for constraints is ggc_strdup on
> some string in a buffer.  Also, one could have several copies or r (or m, memory (doesn't
> that appear just in clobbers?  And that doesn't look like something that
> should be replaced), Bm, e.g. in various alternatives.  So, you
> need to change them all, not just the first hit.  "r,r,r,m" and the like.
> Normally, one would simply walk the constraint string, parsing the special
> letters (+, =, & etc.) and single letter constraints and 2 letter
> constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources).
> Either do it in 2 passes, first one counts how long constraint string one
> will need after the adjustments (and whether to adjust something at all),
> then if needed XALLOCAVEC it and adjust in there, or say use a
> auto_vec<char, 32> for
> it.
>
> > +               break;
> > +             }
> > +         }
> > +     }
> > +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> > +      "Bt/Bt/BT".  */
> > +      for (unsigned i = 0; i < constraints.length (); i++)
> > +     {
> > +       std::string *s = new std::string (constraints[i]);
> > +       size_t pos = s->find ("m");
> > +       size_t pos2 = s->find ("memory");
> > +       if (pos != std::string::npos)
> > +         {
> > +           if (pos > 0 && (s->at (pos - 1) == 'B'))
> > +               s->replace (pos - 1, 2, "BT");
> > +           else if (pos2 != std::string::npos)
> > +               s->replace (pos, 6, "Bt");
> > +           else
> > +               s->replace (pos, 1, "Bt");
>
> Formatting, the s->replace calls are indented too much.
>
>         Jakub
>
Hongyu Wang Sept. 1, 2023, 9:04 a.m. UTC | #3
Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 17:18写道:
>
> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > From: Kong Lingling <lingling.kong@intel.com>
> >
> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > usage by default from mapping the common reg/mem constraint to non-EGPR
> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > for inline asm.
> >
> > gcc/ChangeLog:
> >
> >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> >       ix86_md_asm_adjust.
> >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> >       target option, map reg/mem constraints to non-EGPR constraints.
> >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > ---
> >  gcc/config/i386/i386.cc                       |  44 +++++++
> >  gcc/config/i386/i386.opt                      |   5 +
> >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> >  3 files changed, 156 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index d26d9ab0d9d..9460ebbfda4 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> >  along with GCC; see the file COPYING3.  If not see
> >  <http://www.gnu.org/licenses/>.  */
> >
> > +#define INCLUDE_STRING
> >  #define IN_TARGET_CODE 1
> >
> >  #include "config.h"
> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> >    bool saw_asm_flag = false;
> >
> >    start_sequence ();
> > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > +   constraints, will eventually map all the usable constraints in the future. */
>
> I think there should be some constraint which explicitly has all the 32
> GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
>

Yes, we will add new register constraints. For memory constraints it requires
some special handling in ix86_memory_address_use_extended_reg_class_p

> Also, what about the "g" constraint?  Shouldn't there be another for "g"
> without r16..r31?  What about the various other memory
> constraints ("<", "o", ...)?

We will support fully mapping of all common constraints, with refining
of current
mapping in the V2 patch.

>
> > +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> > +    {
> > +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> > +      and replace only r, exclude Br and Yr.  */
> > +      for (unsigned i = 0; i < constraints.length (); i++)
> > +     {
> > +       std::string *s = new std::string (constraints[i]);
>
> Doesn't this leak memory (all the time)?
> I must say I don't really understand why you need to use std::string here,
> but certainly it shouldn't leak.

std::string just makes the code shorter than using str functions. Current code
will be completely refactored with supporting more mapping of constraints.

>
> > +       size_t pos = s->find ('r');
> > +       while (pos != std::string::npos)
> > +         {
> > +           if (pos > 0
> > +               && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> > +             pos = s->find ('r', pos + 1);
> > +           else
> > +             {
> > +               s->replace (pos, 1, "h");
> > +               constraints[i] = (const char*) s->c_str ();
>
> Formatting (space before *).  The usual way for constraints is ggc_strdup on
> some string in a buffer.  Also, one could have several copies or r (or m, memory (doesn't
> that appear just in clobbers?  And that doesn't look like something that
> should be replaced), Bm, e.g. in various alternatives.  So, you
> need to change them all, not just the first hit.  "r,r,r,m" and the like.
> Normally, one would simply walk the constraint string, parsing the special
> letters (+, =, & etc.) and single letter constraints and 2 letter
> constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources).
> Either do it in 2 passes, first one counts how long constraint string one
> will need after the adjustments (and whether to adjust something at all),
> then if needed XALLOCAVEC it and adjust in there, or say use a
> auto_vec<char, 32> for
> it.

Thanks for your guidance. Previously we thought constraints[i] was a splitted
simple constraint, but clearly it is not.
We will refer to an existing example of this and rewrite current one.

>
> > +               break;
> > +             }
> > +         }
> > +     }
> > +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> > +      "Bt/Bt/BT".  */
> > +      for (unsigned i = 0; i < constraints.length (); i++)
> > +     {
> > +       std::string *s = new std::string (constraints[i]);
> > +       size_t pos = s->find ("m");
> > +       size_t pos2 = s->find ("memory");
> > +       if (pos != std::string::npos)
> > +         {
> > +           if (pos > 0 && (s->at (pos - 1) == 'B'))
> > +               s->replace (pos - 1, 2, "BT");
> > +           else if (pos2 != std::string::npos)
> > +               s->replace (pos, 6, "Bt");
> > +           else
> > +               s->replace (pos, 1, "Bt");
>
> Formatting, the s->replace calls are indented too much.
>
>         Jakub
>
Hongyu Wang Sept. 1, 2023, 9:04 a.m. UTC | #4
Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道:
>
> On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > > From: Kong Lingling <lingling.kong@intel.com>
> > >
> > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > > usage by default from mapping the common reg/mem constraint to non-EGPR
> > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > > for inline asm.
> > >
> > > gcc/ChangeLog:
> > >
> > >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > >       ix86_md_asm_adjust.
> > >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> > >       target option, map reg/mem constraints to non-EGPR constraints.
> > >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > ---
> > >  gcc/config/i386/i386.cc                       |  44 +++++++
> > >  gcc/config/i386/i386.opt                      |   5 +
> > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> > >  3 files changed, 156 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index d26d9ab0d9d..9460ebbfda4 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> > >  along with GCC; see the file COPYING3.  If not see
> > >  <http://www.gnu.org/licenses/>.  */
> > >
> > > +#define INCLUDE_STRING
> > >  #define IN_TARGET_CODE 1
> > >
> > >  #include "config.h"
> > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> > >    bool saw_asm_flag = false;
> > >
> > >    start_sequence ();
> > > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > > +   constraints, will eventually map all the usable constraints in the future. */
> >
> > I think there should be some constraint which explicitly has all the 32
> > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> >
> > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > without r16..r31?  What about the various other memory
> > constraints ("<", "o", ...)?
>
> I think we should leave all existing constraints as they are, so "r"
> covers only GPR16, "m" and "o" to only use GPR16. We can then
> introduce "h" to instructions that have the ability to handle EGPR.
> This would be somehow similar to the SSE -> AVX512F transition, where
> we still have "x" for SSE16 and "v" was introduced as a separate
> register class for EVEX SSE registers. This way, asm will be
> compatible, when "r", "m", "o" and "g" are used. The new memory
> constraint "Bt", should allow new registers, and should be added to
> the constraint string as a separate constraint, and conditionally
> enabled by relevant "isa" (AKA "enabled") attribute.

The extended constraint can work for registers, but for memory it is more
complicated.

If we want to use new mem constraints that allow gpr32, then BASE/INDEX
reg class still requires per-insn verification, so it means changes
on all patterns with vm, and those SSE patterns on opcode map0/1. Also,
several legacy insns that are promoted to EVEX encoding space need to be
changed. The overall implementation could be 10 times larger than current,
which would be quite hard for maintenance.

>
> Uros.
>
> > > +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> > > +    {
> > > +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> > > +      and replace only r, exclude Br and Yr.  */
> > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > +     {
> > > +       std::string *s = new std::string (constraints[i]);
> >
> > Doesn't this leak memory (all the time)?
> > I must say I don't really understand why you need to use std::string here,
> > but certainly it shouldn't leak.
> >
> > > +       size_t pos = s->find ('r');
> > > +       while (pos != std::string::npos)
> > > +         {
> > > +           if (pos > 0
> > > +               && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> > > +             pos = s->find ('r', pos + 1);
> > > +           else
> > > +             {
> > > +               s->replace (pos, 1, "h");
> > > +               constraints[i] = (const char*) s->c_str ();
> >
> > Formatting (space before *).  The usual way for constraints is ggc_strdup on
> > some string in a buffer.  Also, one could have several copies or r (or m, memory (doesn't
> > that appear just in clobbers?  And that doesn't look like something that
> > should be replaced), Bm, e.g. in various alternatives.  So, you
> > need to change them all, not just the first hit.  "r,r,r,m" and the like.
> > Normally, one would simply walk the constraint string, parsing the special
> > letters (+, =, & etc.) and single letter constraints and 2 letter
> > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources).
> > Either do it in 2 passes, first one counts how long constraint string one
> > will need after the adjustments (and whether to adjust something at all),
> > then if needed XALLOCAVEC it and adjust in there, or say use a
> > auto_vec<char, 32> for
> > it.
> >
> > > +               break;
> > > +             }
> > > +         }
> > > +     }
> > > +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> > > +      "Bt/Bt/BT".  */
> > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > +     {
> > > +       std::string *s = new std::string (constraints[i]);
> > > +       size_t pos = s->find ("m");
> > > +       size_t pos2 = s->find ("memory");
> > > +       if (pos != std::string::npos)
> > > +         {
> > > +           if (pos > 0 && (s->at (pos - 1) == 'B'))
> > > +               s->replace (pos - 1, 2, "BT");
> > > +           else if (pos2 != std::string::npos)
> > > +               s->replace (pos, 6, "Bt");
> > > +           else
> > > +               s->replace (pos, 1, "Bt");
> >
> > Formatting, the s->replace calls are indented too much.
> >
> >         Jakub
> >
Uros Bizjak Sept. 1, 2023, 9:38 a.m. UTC | #5
On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
>
> Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道:
> >
> > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > > > From: Kong Lingling <lingling.kong@intel.com>
> > > >
> > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > > > usage by default from mapping the common reg/mem constraint to non-EGPR
> > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > > > for inline asm.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > > >       ix86_md_asm_adjust.
> > > >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> > > >       target option, map reg/mem constraints to non-EGPR constraints.
> > > >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > > ---
> > > >  gcc/config/i386/i386.cc                       |  44 +++++++
> > > >  gcc/config/i386/i386.opt                      |   5 +
> > > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> > > >  3 files changed, 156 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > > >
> > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > index d26d9ab0d9d..9460ebbfda4 100644
> > > > --- a/gcc/config/i386/i386.cc
> > > > +++ b/gcc/config/i386/i386.cc
> > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> > > >  along with GCC; see the file COPYING3.  If not see
> > > >  <http://www.gnu.org/licenses/>.  */
> > > >
> > > > +#define INCLUDE_STRING
> > > >  #define IN_TARGET_CODE 1
> > > >
> > > >  #include "config.h"
> > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> > > >    bool saw_asm_flag = false;
> > > >
> > > >    start_sequence ();
> > > > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > > > +   constraints, will eventually map all the usable constraints in the future. */
> > >
> > > I think there should be some constraint which explicitly has all the 32
> > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > >
> > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > without r16..r31?  What about the various other memory
> > > constraints ("<", "o", ...)?
> >
> > I think we should leave all existing constraints as they are, so "r"
> > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > introduce "h" to instructions that have the ability to handle EGPR.
> > This would be somehow similar to the SSE -> AVX512F transition, where
> > we still have "x" for SSE16 and "v" was introduced as a separate
> > register class for EVEX SSE registers. This way, asm will be
> > compatible, when "r", "m", "o" and "g" are used. The new memory
> > constraint "Bt", should allow new registers, and should be added to
> > the constraint string as a separate constraint, and conditionally
> > enabled by relevant "isa" (AKA "enabled") attribute.
>
> The extended constraint can work for registers, but for memory it is more
> complicated.

Yes, unfortunately. The compiler assumes that an unchangeable register
class is used for BASE/INDEX registers. I have hit this limitation
when trying to implement memory support for instructions involving
8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
registers, also inside memory operand. (You can see the "hack" in e.g.
*extzvqi_mem_rex64" and corresponding peephole2 with the original
*extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
register class is the major limitation in the compiler, so perhaps the
strategy on how to override this limitation should be discussed with
the register allocator author first. Perhaps adding an insn attribute
to insn RTX pattern to specify different BASE/INDEX register sets can
be a better solution than passing insn RTX to the register allocator.

The above idea still does not solve the asm problem on how to select
correct BASE/INDEX register set for memory operands.

Uros.
>
> If we want to use new mem constraints that allow gpr32, then BASE/INDEX
> reg class still requires per-insn verification, so it means changes
> on all patterns with vm, and those SSE patterns on opcode map0/1. Also,
> several legacy insns that are promoted to EVEX encoding space need to be
> changed. The overall implementation could be 10 times larger than current,
> which would be quite hard for maintenance.
>
> >
> > Uros.
> >
> > > > +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> > > > +    {
> > > > +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> > > > +      and replace only r, exclude Br and Yr.  */
> > > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > > +     {
> > > > +       std::string *s = new std::string (constraints[i]);
> > >
> > > Doesn't this leak memory (all the time)?
> > > I must say I don't really understand why you need to use std::string here,
> > > but certainly it shouldn't leak.
> > >
> > > > +       size_t pos = s->find ('r');
> > > > +       while (pos != std::string::npos)
> > > > +         {
> > > > +           if (pos > 0
> > > > +               && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> > > > +             pos = s->find ('r', pos + 1);
> > > > +           else
> > > > +             {
> > > > +               s->replace (pos, 1, "h");
> > > > +               constraints[i] = (const char*) s->c_str ();
> > >
> > > Formatting (space before *).  The usual way for constraints is ggc_strdup on
> > > some string in a buffer.  Also, one could have several copies or r (or m, memory (doesn't
> > > that appear just in clobbers?  And that doesn't look like something that
> > > should be replaced), Bm, e.g. in various alternatives.  So, you
> > > need to change them all, not just the first hit.  "r,r,r,m" and the like.
> > > Normally, one would simply walk the constraint string, parsing the special
> > > letters (+, =, & etc.) and single letter constraints and 2 letter
> > > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources).
> > > Either do it in 2 passes, first one counts how long constraint string one
> > > will need after the adjustments (and whether to adjust something at all),
> > > then if needed XALLOCAVEC it and adjust in there, or say use a
> > > auto_vec<char, 32> for
> > > it.
> > >
> > > > +               break;
> > > > +             }
> > > > +         }
> > > > +     }
> > > > +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> > > > +      "Bt/Bt/BT".  */
> > > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > > +     {
> > > > +       std::string *s = new std::string (constraints[i]);
> > > > +       size_t pos = s->find ("m");
> > > > +       size_t pos2 = s->find ("memory");
> > > > +       if (pos != std::string::npos)
> > > > +         {
> > > > +           if (pos > 0 && (s->at (pos - 1) == 'B'))
> > > > +               s->replace (pos - 1, 2, "BT");
> > > > +           else if (pos2 != std::string::npos)
> > > > +               s->replace (pos, 6, "Bt");
> > > > +           else
> > > > +               s->replace (pos, 1, "Bt");
> > >
> > > Formatting, the s->replace calls are indented too much.
> > >
> > >         Jakub
> > >
Hongtao Liu Sept. 1, 2023, 10:35 a.m. UTC | #6
On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
> >
> > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道:
> > >
> > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > > > > From: Kong Lingling <lingling.kong@intel.com>
> > > > >
> > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > > > > usage by default from mapping the common reg/mem constraint to non-EGPR
> > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > > > > for inline asm.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > > > >       ix86_md_asm_adjust.
> > > > >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> > > > >       target option, map reg/mem constraints to non-EGPR constraints.
> > > > >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > > > ---
> > > > >  gcc/config/i386/i386.cc                       |  44 +++++++
> > > > >  gcc/config/i386/i386.opt                      |   5 +
> > > > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> > > > >  3 files changed, 156 insertions(+)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > > > >
> > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > > index d26d9ab0d9d..9460ebbfda4 100644
> > > > > --- a/gcc/config/i386/i386.cc
> > > > > +++ b/gcc/config/i386/i386.cc
> > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> > > > >  along with GCC; see the file COPYING3.  If not see
> > > > >  <http://www.gnu.org/licenses/>.  */
> > > > >
> > > > > +#define INCLUDE_STRING
> > > > >  #define IN_TARGET_CODE 1
> > > > >
> > > > >  #include "config.h"
> > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> > > > >    bool saw_asm_flag = false;
> > > > >
> > > > >    start_sequence ();
> > > > > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > > > > +   constraints, will eventually map all the usable constraints in the future. */
> > > >
> > > > I think there should be some constraint which explicitly has all the 32
> > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > > >
> > > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > > without r16..r31?  What about the various other memory
> > > > constraints ("<", "o", ...)?
> > >
> > > I think we should leave all existing constraints as they are, so "r"
> > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > introduce "h" to instructions that have the ability to handle EGPR.
> > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > register class for EVEX SSE registers. This way, asm will be
> > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > constraint "Bt", should allow new registers, and should be added to
> > > the constraint string as a separate constraint, and conditionally
> > > enabled by relevant "isa" (AKA "enabled") attribute.
> >
> > The extended constraint can work for registers, but for memory it is more
> > complicated.
>
> Yes, unfortunately. The compiler assumes that an unchangeable register
> class is used for BASE/INDEX registers. I have hit this limitation
> when trying to implement memory support for instructions involving
> 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> registers, also inside memory operand. (You can see the "hack" in e.g.
> *extzvqi_mem_rex64" and corresponding peephole2 with the original
> *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> register class is the major limitation in the compiler, so perhaps the
> strategy on how to override this limitation should be discussed with
> the register allocator author first. Perhaps adding an insn attribute
> to insn RTX pattern to specify different BASE/INDEX register sets can
> be a better solution than passing insn RTX to the register allocator.
>
> The above idea still does not solve the asm problem on how to select
> correct BASE/INDEX register set for memory operands.
The current approach disables gpr32 for memory operand in asm_operand
by default. but can be turned on by options
ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
supports gpr32).
Only ~ 5% of total instructions don't support gpr32, reversed approach
only gonna get more complicated.

>
> Uros.
> >
> > If we want to use new mem constraints that allow gpr32, then BASE/INDEX
> > reg class still requires per-insn verification, so it means changes
> > on all patterns with vm, and those SSE patterns on opcode map0/1. Also,
> > several legacy insns that are promoted to EVEX encoding space need to be
> > changed. The overall implementation could be 10 times larger than current,
> > which would be quite hard for maintenance.
> >
> > >
> > > Uros.
> > >
> > > > > +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> > > > > +    {
> > > > > +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> > > > > +      and replace only r, exclude Br and Yr.  */
> > > > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > > > +     {
> > > > > +       std::string *s = new std::string (constraints[i]);
> > > >
> > > > Doesn't this leak memory (all the time)?
> > > > I must say I don't really understand why you need to use std::string here,
> > > > but certainly it shouldn't leak.
> > > >
> > > > > +       size_t pos = s->find ('r');
> > > > > +       while (pos != std::string::npos)
> > > > > +         {
> > > > > +           if (pos > 0
> > > > > +               && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> > > > > +             pos = s->find ('r', pos + 1);
> > > > > +           else
> > > > > +             {
> > > > > +               s->replace (pos, 1, "h");
> > > > > +               constraints[i] = (const char*) s->c_str ();
> > > >
> > > > Formatting (space before *).  The usual way for constraints is ggc_strdup on
> > > > some string in a buffer.  Also, one could have several copies or r (or m, memory (doesn't
> > > > that appear just in clobbers?  And that doesn't look like something that
> > > > should be replaced), Bm, e.g. in various alternatives.  So, you
> > > > need to change them all, not just the first hit.  "r,r,r,m" and the like.
> > > > Normally, one would simply walk the constraint string, parsing the special
> > > > letters (+, =, & etc.) and single letter constraints and 2 letter
> > > > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources).
> > > > Either do it in 2 passes, first one counts how long constraint string one
> > > > will need after the adjustments (and whether to adjust something at all),
> > > > then if needed XALLOCAVEC it and adjust in there, or say use a
> > > > auto_vec<char, 32> for
> > > > it.
> > > >
> > > > > +               break;
> > > > > +             }
> > > > > +         }
> > > > > +     }
> > > > > +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> > > > > +      "Bt/Bt/BT".  */
> > > > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > > > +     {
> > > > > +       std::string *s = new std::string (constraints[i]);
> > > > > +       size_t pos = s->find ("m");
> > > > > +       size_t pos2 = s->find ("memory");
> > > > > +       if (pos != std::string::npos)
> > > > > +         {
> > > > > +           if (pos > 0 && (s->at (pos - 1) == 'B'))
> > > > > +               s->replace (pos - 1, 2, "BT");
> > > > > +           else if (pos2 != std::string::npos)
> > > > > +               s->replace (pos, 6, "Bt");
> > > > > +           else
> > > > > +               s->replace (pos, 1, "Bt");
> > > >
> > > > Formatting, the s->replace calls are indented too much.
> > > >
> > > >         Jakub
> > > >
Richard Sandiford Sept. 1, 2023, 11:03 a.m. UTC | #7
Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
>> > From: Kong Lingling <lingling.kong@intel.com>
>> >
>> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
>> > usage by default from mapping the common reg/mem constraint to non-EGPR
>> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
>> > for inline asm.
>> >
>> > gcc/ChangeLog:
>> >
>> >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
>> >       ix86_md_asm_adjust.
>> >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
>> >       target option, map reg/mem constraints to non-EGPR constraints.
>> >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
>> > ---
>> >  gcc/config/i386/i386.cc                       |  44 +++++++
>> >  gcc/config/i386/i386.opt                      |   5 +
>> >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
>> >  3 files changed, 156 insertions(+)
>> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
>> >
>> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
>> > index d26d9ab0d9d..9460ebbfda4 100644
>> > --- a/gcc/config/i386/i386.cc
>> > +++ b/gcc/config/i386/i386.cc
>> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
>> >  along with GCC; see the file COPYING3.  If not see
>> >  <http://www.gnu.org/licenses/>.  */
>> >
>> > +#define INCLUDE_STRING
>> >  #define IN_TARGET_CODE 1
>> >
>> >  #include "config.h"
>> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
>> >    bool saw_asm_flag = false;
>> >
>> >    start_sequence ();
>> > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
>> > +   constraints, will eventually map all the usable constraints in the future. */
>>
>> I think there should be some constraint which explicitly has all the 32
>> GPRs, like there is one for just all 16 GPRs (h), so that regardless of
>> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
>>
>> Also, what about the "g" constraint?  Shouldn't there be another for "g"
>> without r16..r31?  What about the various other memory
>> constraints ("<", "o", ...)?
>
> I think we should leave all existing constraints as they are, so "r"
> covers only GPR16, "m" and "o" to only use GPR16. We can then
> introduce "h" to instructions that have the ability to handle EGPR.

Yeah.  I'm jumping in without having read the full thread, sorry,
but the current mechanism for handling this is TARGET_MEM_CONSTRAINT
(added for s390).  That is, TARGET_MEM_CONSTRAINT can be defined to some
new constraint that is more general than the traditional "m" constraint.
This constraint is then the one that is associated with memory_operand
etc.  "m" can then be defined explicitly to the old definition,
so that existing asms continue to work.

So if the port wants generic internal memory addresses to use the
EGPR set (sounds reasonable), then TARGET_MEM_CONSTRAINT would be
a new constraint that maps to those addresses.

Thanks,
Richard
Uros Bizjak Sept. 1, 2023, 11:27 a.m. UTC | #8
On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
> > >
> > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道:
> > > >
> > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > > > > > From: Kong Lingling <lingling.kong@intel.com>
> > > > > >
> > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > > > > > usage by default from mapping the common reg/mem constraint to non-EGPR
> > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > > > > > for inline asm.
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > > > > >       ix86_md_asm_adjust.
> > > > > >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> > > > > >       target option, map reg/mem constraints to non-EGPR constraints.
> > > > > >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > > > > ---
> > > > > >  gcc/config/i386/i386.cc                       |  44 +++++++
> > > > > >  gcc/config/i386/i386.opt                      |   5 +
> > > > > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> > > > > >  3 files changed, 156 insertions(+)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > > > > >
> > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > > > index d26d9ab0d9d..9460ebbfda4 100644
> > > > > > --- a/gcc/config/i386/i386.cc
> > > > > > +++ b/gcc/config/i386/i386.cc
> > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> > > > > >  along with GCC; see the file COPYING3.  If not see
> > > > > >  <http://www.gnu.org/licenses/>.  */
> > > > > >
> > > > > > +#define INCLUDE_STRING
> > > > > >  #define IN_TARGET_CODE 1
> > > > > >
> > > > > >  #include "config.h"
> > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> > > > > >    bool saw_asm_flag = false;
> > > > > >
> > > > > >    start_sequence ();
> > > > > > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > > > > > +   constraints, will eventually map all the usable constraints in the future. */
> > > > >
> > > > > I think there should be some constraint which explicitly has all the 32
> > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > > > >
> > > > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > > > without r16..r31?  What about the various other memory
> > > > > constraints ("<", "o", ...)?
> > > >
> > > > I think we should leave all existing constraints as they are, so "r"
> > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > introduce "h" to instructions that have the ability to handle EGPR.
> > > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > register class for EVEX SSE registers. This way, asm will be
> > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > constraint "Bt", should allow new registers, and should be added to
> > > > the constraint string as a separate constraint, and conditionally
> > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > >
> > > The extended constraint can work for registers, but for memory it is more
> > > complicated.
> >
> > Yes, unfortunately. The compiler assumes that an unchangeable register
> > class is used for BASE/INDEX registers. I have hit this limitation
> > when trying to implement memory support for instructions involving
> > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > registers, also inside memory operand. (You can see the "hack" in e.g.
> > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > register class is the major limitation in the compiler, so perhaps the
> > strategy on how to override this limitation should be discussed with
> > the register allocator author first. Perhaps adding an insn attribute
> > to insn RTX pattern to specify different BASE/INDEX register sets can
> > be a better solution than passing insn RTX to the register allocator.
> >
> > The above idea still does not solve the asm problem on how to select
> > correct BASE/INDEX register set for memory operands.
> The current approach disables gpr32 for memory operand in asm_operand
> by default. but can be turned on by options
> ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
> supports gpr32).
> Only ~ 5% of total instructions don't support gpr32, reversed approach
> only gonna get more complicated.

I'm not referring to the reversed approach, just want to point out
that the same approach as you proposed w.r.t. to memory operand can be
achieved using some named insn attribute that would affect BASE/INDEX
register class selection. The attribute could default to gpr32 with
APX, unless the insn specific attribute has e.g. nogpr32 value. See
for example how "enabled" and "preferred_for_*" attributes are used.
Perhaps this new attribute can also be applied to separate
alternatives.

Uros.
Hongtao Liu Sept. 4, 2023, 12:28 a.m. UTC | #9
On Fri, Sep 1, 2023 at 7:27 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
> > > >
> > > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道:
> > > > >
> > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > >
> > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > > > > > > From: Kong Lingling <lingling.kong@intel.com>
> > > > > > >
> > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > > > > > > usage by default from mapping the common reg/mem constraint to non-EGPR
> > > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> > > > > > > for inline asm.
> > > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > > > > > >       ix86_md_asm_adjust.
> > > > > > >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> > > > > > >       target option, map reg/mem constraints to non-EGPR constraints.
> > > > > > >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > > > > > ---
> > > > > > >  gcc/config/i386/i386.cc                       |  44 +++++++
> > > > > > >  gcc/config/i386/i386.opt                      |   5 +
> > > > > > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> > > > > > >  3 files changed, 156 insertions(+)
> > > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > > > > > >
> > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > > > > index d26d9ab0d9d..9460ebbfda4 100644
> > > > > > > --- a/gcc/config/i386/i386.cc
> > > > > > > +++ b/gcc/config/i386/i386.cc
> > > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> > > > > > >  along with GCC; see the file COPYING3.  If not see
> > > > > > >  <http://www.gnu.org/licenses/>.  */
> > > > > > >
> > > > > > > +#define INCLUDE_STRING
> > > > > > >  #define IN_TARGET_CODE 1
> > > > > > >
> > > > > > >  #include "config.h"
> > > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> > > > > > >    bool saw_asm_flag = false;
> > > > > > >
> > > > > > >    start_sequence ();
> > > > > > > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > > > > > > +   constraints, will eventually map all the usable constraints in the future. */
> > > > > >
> > > > > > I think there should be some constraint which explicitly has all the 32
> > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > > > > >
> > > > > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > > > > without r16..r31?  What about the various other memory
> > > > > > constraints ("<", "o", ...)?
> > > > >
> > > > > I think we should leave all existing constraints as they are, so "r"
> > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > > introduce "h" to instructions that have the ability to handle EGPR.
> > > > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > > register class for EVEX SSE registers. This way, asm will be
> > > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > > constraint "Bt", should allow new registers, and should be added to
> > > > > the constraint string as a separate constraint, and conditionally
> > > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > > >
> > > > The extended constraint can work for registers, but for memory it is more
> > > > complicated.
> > >
> > > Yes, unfortunately. The compiler assumes that an unchangeable register
> > > class is used for BASE/INDEX registers. I have hit this limitation
> > > when trying to implement memory support for instructions involving
> > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > > registers, also inside memory operand. (You can see the "hack" in e.g.
> > > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > > register class is the major limitation in the compiler, so perhaps the
> > > strategy on how to override this limitation should be discussed with
> > > the register allocator author first. Perhaps adding an insn attribute
> > > to insn RTX pattern to specify different BASE/INDEX register sets can
> > > be a better solution than passing insn RTX to the register allocator.
> > >
> > > The above idea still does not solve the asm problem on how to select
> > > correct BASE/INDEX register set for memory operands.
> > The current approach disables gpr32 for memory operand in asm_operand
> > by default. but can be turned on by options
> > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
> > supports gpr32).
> > Only ~ 5% of total instructions don't support gpr32, reversed approach
> > only gonna get more complicated.
>
> I'm not referring to the reversed approach, just want to point out
> that the same approach as you proposed w.r.t. to memory operand can be
> achieved using some named insn attribute that would affect BASE/INDEX
> register class selection. The attribute could default to gpr32 with
> APX, unless the insn specific attribute has e.g. nogpr32 value. See
> for example how "enabled" and "preferred_for_*" attributes are used.
> Perhaps this new attribute can also be applied to separate
> alternatives.
Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like
(define_attr "gpr32" "0, 1"
  (cond [(eq_attr "isa" "fma4")
           (const_string "0")]
      (const_string "1")))

But still, we need to adjust memory constraints in the pattern.
Ideally, gcc includes encoding information for every instruction,
(.i.e. map0/map1), so that we can determine the attribute value of
gpr32 directly from this information.
>
> Uros.
Hongtao Liu Sept. 4, 2023, 1:03 a.m. UTC | #10
On Fri, Sep 1, 2023 at 7:03 PM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> >> > From: Kong Lingling <lingling.kong@intel.com>
> >> >
> >> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> >> > usage by default from mapping the common reg/mem constraint to non-EGPR
> >> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> >> > for inline asm.
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> >> >       ix86_md_asm_adjust.
> >> >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> >> >       target option, map reg/mem constraints to non-EGPR constraints.
> >> >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> >> > ---
> >> >  gcc/config/i386/i386.cc                       |  44 +++++++
> >> >  gcc/config/i386/i386.opt                      |   5 +
> >> >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> >> >  3 files changed, 156 insertions(+)
> >> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> >> >
> >> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> >> > index d26d9ab0d9d..9460ebbfda4 100644
> >> > --- a/gcc/config/i386/i386.cc
> >> > +++ b/gcc/config/i386/i386.cc
> >> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
> >> >  along with GCC; see the file COPYING3.  If not see
> >> >  <http://www.gnu.org/licenses/>.  */
> >> >
> >> > +#define INCLUDE_STRING
> >> >  #define IN_TARGET_CODE 1
> >> >
> >> >  #include "config.h"
> >> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
> >> >    bool saw_asm_flag = false;
> >> >
> >> >    start_sequence ();
> >> > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> >> > +   constraints, will eventually map all the usable constraints in the future. */
> >>
> >> I think there should be some constraint which explicitly has all the 32
> >> GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> >> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> >>
> >> Also, what about the "g" constraint?  Shouldn't there be another for "g"
> >> without r16..r31?  What about the various other memory
> >> constraints ("<", "o", ...)?
> >
> > I think we should leave all existing constraints as they are, so "r"
> > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > introduce "h" to instructions that have the ability to handle EGPR.
>
> Yeah.  I'm jumping in without having read the full thread, sorry,
> but the current mechanism for handling this is TARGET_MEM_CONSTRAINT
> (added for s390).  That is, TARGET_MEM_CONSTRAINT can be defined to some
Thanks for the comments.
> new constraint that is more general than the traditional "m" constraint.
> This constraint is then the one that is associated with memory_operand
> etc.  "m" can then be defined explicitly to the old definition,
> so that existing asms continue to work.
>
> So if the port wants generic internal memory addresses to use the
> EGPR set (sounds reasonable), then TARGET_MEM_CONSTRAINT would be
> a new constraint that maps to those addresses.
But still we need to enhance current reload infrastructure to support
selective base_reg_class/index_reg_class, refer to [1].
The good thing about using TARGET_MEM_CONSTRAINT is that we don't have
to remapping memory constraint for inline asm, but the bad thing about
it is that we need to modify the backend pattern a lot, because only
5% of the instructions don't support gpr32, and 95% of them need to be
changed to the new memory constraint.
It feels like the cons outweigh the pros.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629040.html

>
> Thanks,
> Richard
Uros Bizjak Sept. 4, 2023, 8:57 a.m. UTC | #11
On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu <crazylht@gmail.com> wrote:

> > > > > > > I think there should be some constraint which explicitly has all the 32
> > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > > > > > >
> > > > > > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > > > > > without r16..r31?  What about the various other memory
> > > > > > > constraints ("<", "o", ...)?
> > > > > >
> > > > > > I think we should leave all existing constraints as they are, so "r"
> > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > > > introduce "h" to instructions that have the ability to handle EGPR.
> > > > > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > > > register class for EVEX SSE registers. This way, asm will be
> > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > > > constraint "Bt", should allow new registers, and should be added to
> > > > > > the constraint string as a separate constraint, and conditionally
> > > > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > > > >
> > > > > The extended constraint can work for registers, but for memory it is more
> > > > > complicated.
> > > >
> > > > Yes, unfortunately. The compiler assumes that an unchangeable register
> > > > class is used for BASE/INDEX registers. I have hit this limitation
> > > > when trying to implement memory support for instructions involving
> > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > > > registers, also inside memory operand. (You can see the "hack" in e.g.
> > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > > > register class is the major limitation in the compiler, so perhaps the
> > > > strategy on how to override this limitation should be discussed with
> > > > the register allocator author first. Perhaps adding an insn attribute
> > > > to insn RTX pattern to specify different BASE/INDEX register sets can
> > > > be a better solution than passing insn RTX to the register allocator.
> > > >
> > > > The above idea still does not solve the asm problem on how to select
> > > > correct BASE/INDEX register set for memory operands.
> > > The current approach disables gpr32 for memory operand in asm_operand
> > > by default. but can be turned on by options
> > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
> > > supports gpr32).
> > > Only ~ 5% of total instructions don't support gpr32, reversed approach
> > > only gonna get more complicated.
> >
> > I'm not referring to the reversed approach, just want to point out
> > that the same approach as you proposed w.r.t. to memory operand can be
> > achieved using some named insn attribute that would affect BASE/INDEX
> > register class selection. The attribute could default to gpr32 with
> > APX, unless the insn specific attribute has e.g. nogpr32 value. See
> > for example how "enabled" and "preferred_for_*" attributes are used.
> > Perhaps this new attribute can also be applied to separate
> > alternatives.
> Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like
> (define_attr "gpr32" "0, 1"
>   (cond [(eq_attr "isa" "fma4")
>            (const_string "0")]
>       (const_string "1")))

Just a nit, can the member be named "map0" and "map1"? The code will
then look like:

if (get_attr_gpr32 (insn) == GPR32_MAP0) ...

instead of:

if (get_attr_gpr32 (insn) == GPR32_0) ...

> But still, we need to adjust memory constraints in the pattern.

I guess the gpr32 property is the same for all alternatives of the
insn pattern. In this case,  "m" "g" and "a" constraints could remain
as they are, the final register class will be adjusted (by some target
hook?) based on the value of gpr32 attribute.

> Ideally, gcc includes encoding information for every instruction,
> (.i.e. map0/map1), so that we can determine the attribute value of
> gpr32 directly from this information.

I think the right tool for this is attribute infrastructure of insn
patterns. We can set the default, set precise value of the insns, or
calculate attribute from some other attribute in a quite flexible way.
Other than that, adjusting BASE/INDEX register class of the RA pass is
the infrastructure change, but perhaps similar to the one you
proposed.

Uros.
Hongtao Liu Sept. 4, 2023, 9:10 a.m. UTC | #12
On Mon, Sep 4, 2023 at 4:57 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> > > > > > > > I think there should be some constraint which explicitly has all the 32
> > > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > > > > > > >
> > > > > > > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > > > > > > without r16..r31?  What about the various other memory
> > > > > > > > constraints ("<", "o", ...)?
> > > > > > >
> > > > > > > I think we should leave all existing constraints as they are, so "r"
> > > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > > > > introduce "h" to instructions that have the ability to handle EGPR.
> > > > > > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > > > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > > > > register class for EVEX SSE registers. This way, asm will be
> > > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > > > > constraint "Bt", should allow new registers, and should be added to
> > > > > > > the constraint string as a separate constraint, and conditionally
> > > > > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > > > > >
> > > > > > The extended constraint can work for registers, but for memory it is more
> > > > > > complicated.
> > > > >
> > > > > Yes, unfortunately. The compiler assumes that an unchangeable register
> > > > > class is used for BASE/INDEX registers. I have hit this limitation
> > > > > when trying to implement memory support for instructions involving
> > > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > > > > registers, also inside memory operand. (You can see the "hack" in e.g.
> > > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > > > > register class is the major limitation in the compiler, so perhaps the
> > > > > strategy on how to override this limitation should be discussed with
> > > > > the register allocator author first. Perhaps adding an insn attribute
> > > > > to insn RTX pattern to specify different BASE/INDEX register sets can
> > > > > be a better solution than passing insn RTX to the register allocator.
> > > > >
> > > > > The above idea still does not solve the asm problem on how to select
> > > > > correct BASE/INDEX register set for memory operands.
> > > > The current approach disables gpr32 for memory operand in asm_operand
> > > > by default. but can be turned on by options
> > > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
> > > > supports gpr32).
> > > > Only ~ 5% of total instructions don't support gpr32, reversed approach
> > > > only gonna get more complicated.
> > >
> > > I'm not referring to the reversed approach, just want to point out
> > > that the same approach as you proposed w.r.t. to memory operand can be
> > > achieved using some named insn attribute that would affect BASE/INDEX
> > > register class selection. The attribute could default to gpr32 with
> > > APX, unless the insn specific attribute has e.g. nogpr32 value. See
> > > for example how "enabled" and "preferred_for_*" attributes are used.
> > > Perhaps this new attribute can also be applied to separate
> > > alternatives.
> > Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like
> > (define_attr "gpr32" "0, 1"
> >   (cond [(eq_attr "isa" "fma4")
> >            (const_string "0")]
> >       (const_string "1")))
>
> Just a nit, can the member be named "map0" and "map1"? The code will
> then look like:
>
> if (get_attr_gpr32 (insn) == GPR32_MAP0) ...
>
> instead of:
>
> if (get_attr_gpr32 (insn) == GPR32_0) ...
>
> > But still, we need to adjust memory constraints in the pattern.
>
> I guess the gpr32 property is the same for all alternatives of the
> insn pattern. In this case,  "m" "g" and "a" constraints could remain
> as they are, the final register class will be adjusted (by some target
> hook?) based on the value of gpr32 attribute.
I'm worried that not all rtl optimizers after post_reload will respect
base/index_reg_class regarding the insn they belong to.
 if they just check if it's a legitimate memory/address (the current
legitimate_address doesn't have a corresponding insn to pass down),
m/g/a will still generate invalid instruction.
So a defensive programming is to explicitly modifying the constraint.
>
> > Ideally, gcc includes encoding information for every instruction,
> > (.i.e. map0/map1), so that we can determine the attribute value of
> > gpr32 directly from this information.
>
> I think the right tool for this is attribute infrastructure of insn
> patterns. We can set the default, set precise value of the insns, or
> calculate attribute from some other attribute in a quite flexible way.
> Other than that, adjusting BASE/INDEX register class of the RA pass is
> the infrastructure change, but perhaps similar to the one you
> proposed.
Yes.
>
> Uros.
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index d26d9ab0d9d..9460ebbfda4 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -17,6 +17,7 @@  You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_STRING
 #define IN_TARGET_CODE 1
 
 #include "config.h"
@@ -23077,6 +23078,49 @@  ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
   bool saw_asm_flag = false;
 
   start_sequence ();
+  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
+   constraints, will eventually map all the usable constraints in the future. */
+  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
+    {
+      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
+	 and replace only r, exclude Br and Yr.  */
+      for (unsigned i = 0; i < constraints.length (); i++)
+	{
+	  std::string *s = new std::string (constraints[i]);
+	  size_t pos = s->find ('r');
+	  while (pos != std::string::npos)
+	    {
+	      if (pos > 0
+		  && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
+		pos = s->find ('r', pos + 1);
+	      else
+		{
+		  s->replace (pos, 1, "h");
+		  constraints[i] = (const char*) s->c_str ();
+		  break;
+		}
+	    }
+	}
+      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
+	 "Bt/Bt/BT".  */
+      for (unsigned i = 0; i < constraints.length (); i++)
+	{
+	  std::string *s = new std::string (constraints[i]);
+	  size_t pos = s->find ("m");
+	  size_t pos2 = s->find ("memory");
+	  if (pos != std::string::npos)
+	    {
+	      if (pos > 0 && (s->at (pos - 1) == 'B'))
+		  s->replace (pos - 1, 2, "BT");
+	      else if (pos2 != std::string::npos)
+		  s->replace (pos, 6, "Bt");
+	      else
+		  s->replace (pos, 1, "Bt");
+	      constraints[i] = (const char*) s->c_str ();
+	    }
+	}
+     }
+
   for (unsigned i = 0, n = outputs.length (); i < n; ++i)
     {
       const char *con = constraints[i];
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 1ee4d90186e..5c8d3a207e3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1335,3 +1335,8 @@  Enum(apx_features) String(ndd) Value(apx_ndd) Set(4)
 
 EnumValue
 Enum(apx_features) String(all) Value(apx_all) Set(1)
+
+mapx-inline-asm-use-gpr32
+Target Var(ix86_apx_inline_asm_use_gpr32) Init(0)
+Enable GPR32 in inline asm when APX_EGPR enabled, do not
+hook reg or mem constraint in inline asm to GPR16.
diff --git a/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
new file mode 100644
index 00000000000..21534450045
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
@@ -0,0 +1,107 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mapxf -m64 -march=skylake-avx512 -DDTYPE32" } */
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+#ifdef DTYPE32
+typedef u32 DTYPE;
+#define byteswap byteswapu32
+#endif
+
+#define R(x,n) ( (x >> n) | (x << (32 - n)))
+
+#define S0(x) (R(x, 2) ^ R(x,13) ^ R(x,22))
+#define S1(x) (R(x, 6) ^ R(x,11) ^ R(x,25))
+
+#define TT(a,b,c,d,e,f,g,h,x,K)                 \
+{                                                        \
+    tmp1 = h + S1(e) + (g ^ (e & (f ^ g))) + K + x;                \
+    tmp2 = S0(a) + ((a & b) | (c & (a | b)));                           \
+    h  = tmp1 + tmp2;                                    \
+    d += tmp1;                                           \
+}
+
+static inline u32 byteswapu32(u32 x)
+{
+  x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16;
+  x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8;  
+  return x;
+}
+
+void foo (DTYPE in[16], DTYPE out[8], const DTYPE C[16])
+{
+    DTYPE tmp1 = 0, tmp2 = 0, a, b, c, d, e, f, g, h;
+    DTYPE w0, w1, w2, w3, w4, w5, w6, w7,
+	w8, w9, w10, w11, w12, w13, w14, w15;
+    w0  = byteswap(in[0]);
+    w1  = byteswap(in[1]);
+    w2  = byteswap(in[2]);
+    w3  = byteswap(in[3]);
+    w4  = byteswap(in[4]);
+    w5  = byteswap(in[5]);
+    w6  = byteswap(in[6]);
+    w7  = byteswap(in[7]);
+    w8  = byteswap(in[8]);
+    w9  = byteswap(in[9]);
+    w10 = byteswap(in[10]);
+    w11 = byteswap(in[11]);
+    w12 = byteswap(in[12]);
+    w13 = byteswap(in[13]);
+    w14 = byteswap(in[14]);
+    w15 = byteswap(in[15]);
+    a = out[0];
+    b = out[1];
+    c = out[2];
+    d = out[3];
+    e = out[4];
+    f = out[5];
+    g = out[6];
+    h = out[7];
+    
+    TT(a, b, c, d, e, f, g, h,  w0, C[0]);
+    TT(h, a, b, c, d, e, f, g,  w1, C[1]);
+    TT(g, h, a, b, c, d, e, f,  w2, C[2]);
+    TT(f, g, h, a, b, c, d, e,  w3, C[3]);
+    TT(e, f, g, h, a, b, c, d,  w4, C[4]);
+    TT(d, e, f, g, h, a, b, c,  w5, C[5]);
+    TT(c, d, e, f, g, h, a, b,  w6, C[6]);
+    TT(b, c, d, e, f, g, h, a,  w7, C[7]);
+    TT(a, b, c, d, e, f, g, h,  w8, C[8]);
+    TT(h, a, b, c, d, e, f, g,  w9, C[9]);
+    TT(g, h, a, b, c, d, e, f, w10, C[10]);
+    TT(f, g, h, a, b, c, d, e, w11, C[11]);
+    TT(e, f, g, h, a, b, c, d, w12, C[12]);
+    TT(d, e, f, g, h, a, b, c, w13, C[13]);
+    TT(c, d, e, f, g, h, a, b, w14, C[14]);
+    TT(b, c, d, e, f, g, h, a, w15, C[15]);
+
+    out[0] += a;
+    out[1] += b;
+    out[2] += c;
+    out[3] += d;
+    out[4] += e;
+    out[5] += f;
+    out[6] += g;
+    out[7] += h;
+
+    __asm__ __volatile__ ("test_asm_xmm %0, %%rax" : : "Yr" (out[7]) : "rax");
+    __asm__ __volatile__ ("test_asm_Brr %0, %%rax" : : "Brr" (w14) : "rbx");
+    __asm__ __volatile__ ("test_asm_rBr %0, %%rax" : : "rBr" (w13) : "rbx");
+    __asm__ __volatile__ ("test_asm_r %0, %%rax" : : "r" (w15) : "rbx");
+    __asm__ __volatile__ ("test_asm_m %0, %%rax" : : "m" (out[0]) : "rbx");
+    __asm__ __volatile__ ("test_asm_mem %0, %%rax" : : "memory" (out[1]) : "rbx");
+}
+
+/* { dg-final { scan-assembler-not "knot" } } */
+/* { dg-final { scan-assembler-not "kxor" } } */
+/* { dg-final { scan-assembler-not "kor" } } */
+/* { dg-final { scan-assembler-not "kandn" } } */
+/* { dg-final { scan-assembler-times "test_asm_xmm %xmm5, %rax" 1 } } */
+/* { dg-final { scan-assembler-times "test_asm_Brr %r15d, %rax" 1 } } */
+/* { dg-final { scan-assembler-times "test_asm_rBr %r14d, %rax" 1 } } */
+/* { dg-final { scan-assembler-times "test_asm_r %r13d, %rax" 1 } } */
+/* { dg-final { scan-assembler-not "test_asm_rBr %r31d, %rax" } } */
+/* { dg-final { scan-assembler-not "test_asm_r %r30d, %rax" } } */
+/* { dg-final { scan-assembler-not "test_asm_m \\(%r29d\\), %rax" } } */
+/* { dg-final { scan-assembler-not "test_asm_mem \\(%r28d\\), %rax" } } */