diff mbox series

, Fix V1TI in Altivec regs on old systems

Message ID 20190820180031.GA23728@ibm-toto.the-meissners.org
State New
Headers show
Series , Fix V1TI in Altivec regs on old systems | expand

Commit Message

Michael Meissner Aug. 20, 2019, 6 p.m. UTC
This is a little corner case that I noticed in my rewrite of the RELOAD_REG
stuff for the future machine.

I was testing what registers were allowed in what registers for various systems
(power5 through power9 for big endian on both 32 & 64-bit systems, and
power8/power9 for little endian systems).  I was using the debug flag
-mdebug=reg which dumps out the reg_addr information (among other things).  I
noticed on power5 that the V1TImode mode is allowed in Altivec registers, even
though power5 doesn't have Altivec registers.

While it doesn't seem to effect anything (I couldn't create a test case that
failed), it is a small nit that should be fixed.  The test for TARGET_VADDUQM
matches a test earlier in the function where VSX registers are checked.

I have done a bootstrap on a little endian power8 and there were no regressions
in the bootstrap or make check.  I also verified via -mdebug=reg that V1TI mode
is marked as being valid in the Altivec registers with -mcpu=power5.  Can I
check this into the trunk?

2019-08-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
	Don't allow V1TImode in Altivec registers on pre-altivec systems.

Comments

Segher Boessenkool Aug. 29, 2019, 5:02 p.m. UTC | #1
Hi!

On Tue, Aug 20, 2019 at 02:00:31PM -0400, Michael Meissner wrote:
> I
> noticed on power5 that the V1TImode mode is allowed in Altivec registers, even
> though power5 doesn't have Altivec registers.
> 
> While it doesn't seem to effect anything (I couldn't create a test case that
> failed), it is a small nit that should be fixed.  The test for TARGET_VADDUQM
> matches a test earlier in the function where VSX registers are checked.

Yeah, but does that test make any sense?

Why p8 (or later) only?  Why vector only?  (Well that one is clear, and
what this patch is about).  Why -mpowerpc64 only?  Because it has __int128
maybe?  But then you should test for *that* (and why is it important?)

Where would V1TI go if not in vector regs?  Just in memory?

And, what happens on 970?  p5 doesn't have vector registers, but 970 does.

> --- gcc/config/rs6000/rs6000.c	(revision 274635)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -1874,7 +1874,7 @@ rs6000_hard_regno_mode_ok_uncached (int
>    /* AltiVec only in AldyVec registers.  */
>    if (ALTIVEC_REGNO_P (regno))
>      return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
> -	    || mode == V1TImode);
> +	    || (TARGET_VADDUQM && mode == V1TImode));


Segher
Michael Meissner Sept. 3, 2019, 9:01 p.m. UTC | #2
On Thu, Aug 29, 2019 at 12:02:05PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Aug 20, 2019 at 02:00:31PM -0400, Michael Meissner wrote:
> > I
> > noticed on power5 that the V1TImode mode is allowed in Altivec registers, even
> > though power5 doesn't have Altivec registers.
> > 
> > While it doesn't seem to effect anything (I couldn't create a test case that
> > failed), it is a small nit that should be fixed.  The test for TARGET_VADDUQM
> > matches a test earlier in the function where VSX registers are checked.
> 
> Yeah, but does that test make any sense?

The test is copying the same test used earlier for VSX support.  As I said
without the patch, the reg_addr tables thinks that V1TImode can got in Altivec
registers, even if Altivec registers aren't allowed.  It is a nit, and other
things prevent the type from being allocated.

> Why p8 (or later) only?  Why vector only?  (Well that one is clear, and
> what this patch is about).  Why -mpowerpc64 only?  Because it has __int128
> maybe?  But then you should test for *that* (and why is it important?)

Well -mpower64 is needed because the only way you can get __int128 types is by
being 64-bit (VITImode is of course vector __int128).

The history is this mode is a hack.  It was added in the power8 time frame,
solely for the following built-in functions:

	__builtin_altivec_vadduqm
	__builtin_altivec_vaddcuq
	__builtin_altivec_vsubuqm
	__builtin_altivec_vsubcuq
	__builtin_altivec_vaddeuqm
	__builtin_altivec_vaddecuq
	__builtin_altivec_vsubeuqm
	__builtin_altivec_vsubecuq

At the time, TImode (aka, __int128) was not allowed in VSX registers since we
were using the original register allocator (reload based).  There were several
issues that showed up in building large programs if we allowed TImode into
vector registers.  So we disallowed it (with a chicken switch to re-enable it
for debugging, -mvsx-timode).

Later when we switched over to the LRA register allocator, we allowed TImode to
go into vector registers.  But the GCC release had gone out that said if you
wanted to use those two power8 instructions you had to use a vector __int128
wrapper to allow the code.

> Where would V1TI go if not in vector regs?  Just in memory?

Since the only real use of it is for the built-ins, it likely won't matter.
But like all other modes, it could easily go in vector registers.

Personally except for the register allocation issue, I don't see any reason you
should use a vector type that is as large as the vector element inside.

> 
> And, what happens on 970?  p5 doesn't have vector registers, but 970 does.

You can't declare the type using the normal sequence, since rs6000-c.c does not
allow 'vector __int128' until power8.  If instead, you write out:

	vector __int128
as
	__attribute__((altivec(vector__))) __int128

it will use GPRs to hold it on -mcpu=970 (with/without -maltivec).  On power7
it will use the vector register.  On power6 with -maltivec, it will use GPRs.


> > --- gcc/config/rs6000/rs6000.c	(revision 274635)
> > +++ gcc/config/rs6000/rs6000.c	(working copy)
> > @@ -1874,7 +1874,7 @@ rs6000_hard_regno_mode_ok_uncached (int
> >    /* AltiVec only in AldyVec registers.  */
> >    if (ALTIVEC_REGNO_P (regno))
> >      return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
> > -	    || mode == V1TImode);
> > +	    || (TARGET_VADDUQM && mode == V1TImode));
> 
> 
> Segher
Segher Boessenkool Sept. 3, 2019, 9:52 p.m. UTC | #3
Hi!

On Tue, Sep 03, 2019 at 05:01:46PM -0400, Michael Meissner wrote:
> On Thu, Aug 29, 2019 at 12:02:05PM -0500, Segher Boessenkool wrote:
> > On Tue, Aug 20, 2019 at 02:00:31PM -0400, Michael Meissner wrote:
> > > The test for TARGET_VADDUQM
> > > matches a test earlier in the function where VSX registers are checked.
> > 
> > Yeah, but does that test make any sense?
> 
> The test is copying the same test used earlier for VSX support.  As I said
> without the patch, the reg_addr tables thinks that V1TImode can got in Altivec
> registers, even if Altivec registers aren't allowed.  It is a nit, and other
> things prevent the type from being allocated.
> 
> > Why p8 (or later) only?  Why vector only?  (Well that one is clear, and
> > what this patch is about).  Why -mpowerpc64 only?  Because it has __int128
> > maybe?  But then you should test for *that* (and why is it important?)
> 
> Well -mpower64 is needed because the only way you can get __int128 types is by
> being 64-bit (VITImode is of course vector __int128).

But types are not modes.

> The history is this mode is a hack.  It was added in the power8 time frame,
> solely for the following built-in functions:
> 
> 	__builtin_altivec_vadduqm
> 	__builtin_altivec_vaddcuq
> 	__builtin_altivec_vsubuqm
> 	__builtin_altivec_vsubcuq
> 	__builtin_altivec_vaddeuqm
> 	__builtin_altivec_vaddecuq
> 	__builtin_altivec_vsubeuqm
> 	__builtin_altivec_vsubecuq
> 
> At the time, TImode (aka, __int128) was not allowed in VSX registers since we
> were using the original register allocator (reload based).  There were several
> issues that showed up in building large programs if we allowed TImode into
> vector registers.  So we disallowed it (with a chicken switch to re-enable it
> for debugging, -mvsx-timode).
> 
> Later when we switched over to the LRA register allocator, we allowed TImode to
> go into vector registers.  But the GCC release had gone out that said if you
> wanted to use those two power8 instructions you had to use a vector __int128
> wrapper to allow the code.
> 
> > Where would V1TI go if not in vector regs?  Just in memory?
> 
> Since the only real use of it is for the built-ins, it likely won't matter.
> But like all other modes, it could easily go in vector registers.
> 
> Personally except for the register allocation issue, I don't see any reason you
> should use a vector type that is as large as the vector element inside.
> 
> > 
> > And, what happens on 970?  p5 doesn't have vector registers, but 970 does.
> 
> You can't declare the type using the normal sequence, since rs6000-c.c does not
> allow 'vector __int128' until power8.  If instead, you write out:

Ah, so that is why you test for p8 as well.  So you need neither the p8
nor the -mpowerpc64 test here...  Can you just test TARGET_VSX instead?
Or whatever else is actually needed?

> 	vector __int128
> as
> 	__attribute__((altivec(vector__))) __int128
> 
> it will use GPRs to hold it on -mcpu=970 (with/without -maltivec).  On power7
> it will use the vector register.  On power6 with -maltivec, it will use GPRs.
> 
> > > --- gcc/config/rs6000/rs6000.c	(revision 274635)
> > > +++ gcc/config/rs6000/rs6000.c	(working copy)
> > > @@ -1874,7 +1874,7 @@ rs6000_hard_regno_mode_ok_uncached (int
> > >    /* AltiVec only in AldyVec registers.  */
> > >    if (ALTIVEC_REGNO_P (regno))
> > >      return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
> > > -	    || mode == V1TImode);
> > > +	    || (TARGET_VADDUQM && mode == V1TImode));

Or is nothing needed?  Now I don't understand anymore :-/

...

rs6000_vector_mem[V1TImode] = VECTOR_VSX;  if TARGET_VSX so that is all
you need?  Then VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode) is true already,
so we do not need to handle V1TImode specially here at all?  That is, if
your patch is correct.

So the question then is do we ever use V1TI if we do not have VSX.


Segher
diff mbox series

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274635)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1874,7 +1874,7 @@  rs6000_hard_regno_mode_ok_uncached (int
   /* AltiVec only in AldyVec registers.  */
   if (ALTIVEC_REGNO_P (regno))
     return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
-	    || mode == V1TImode);
+	    || (TARGET_VADDUQM && mode == V1TImode));
 
   /* We cannot put non-VSX TImode or PTImode anywhere except general register
      and it must be able to fit within the register set.  */