diff mbox series

[v2] Do not emulate vectors containing floats.

Message ID 20240223134345.20888-1-jchrist@linux.ibm.com
State New
Headers show
Series [v2] Do not emulate vectors containing floats. | expand

Commit Message

Juergen Christ Feb. 23, 2024, 1:43 p.m. UTC
The emulation via word mode tries to perform integer arithmetic on floating
point values instead of floating point arithmetic.  This leads to
mis-compilations.

Failure occured on s390x on these existing test cases:
gcc.dg/vect/tsvc/vect-tsvc-s112.c
gcc.dg/vect/tsvc/vect-tsvc-s113.c
gcc.dg/vect/tsvc/vect-tsvc-s119.c
gcc.dg/vect/tsvc/vect-tsvc-s121.c
gcc.dg/vect/tsvc/vect-tsvc-s131.c
gcc.dg/vect/tsvc/vect-tsvc-s132.c
gcc.dg/vect/tsvc/vect-tsvc-s2233.c
gcc.dg/vect/tsvc/vect-tsvc-s421.c
gcc.dg/vect/vect-alias-check-14.c
gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
          point vectors

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
---
 gcc/tree-vect-stmts.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Sam James Feb. 23, 2024, 1:57 p.m. UTC | #1
Juergen Christ <jchrist@linux.ibm.com> writes:

> The emulation via word mode tries to perform integer arithmetic on floating
> point values instead of floating point arithmetic.  This leads to
> mis-compilations.

Is the bug ref + test missing?

>
> Failure occured on s390x on these existing test cases:
> gcc.dg/vect/tsvc/vect-tsvc-s112.c
> gcc.dg/vect/tsvc/vect-tsvc-s113.c
> gcc.dg/vect/tsvc/vect-tsvc-s119.c
> gcc.dg/vect/tsvc/vect-tsvc-s121.c
> gcc.dg/vect/tsvc/vect-tsvc-s131.c
> gcc.dg/vect/tsvc/vect-tsvc-s132.c
> gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> gcc.dg/vect/tsvc/vect-tsvc-s421.c
> gcc.dg/vect/vect-alias-check-14.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
>
> gcc/ChangeLog:
>
> 	* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
>           point vectors
>
> Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> ---
>  gcc/tree-vect-stmts.cc | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 09749ae38174..f95ff2c2aa34 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
>  	 those through even when the mode isn't word_mode.  For
>  	 ops we have to lower the lowering code assumes we are
>  	 dealing with word_mode.  */
> -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
>  	    || !target_support_p)
>  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
>  	  /* Check only during analysis.  */
Juergen Christ Feb. 23, 2024, 2:01 p.m. UTC | #2
Am Fri, Feb 23, 2024 at 01:57:12PM +0000 schrieb Sam James:
> 
> Juergen Christ <jchrist@linux.ibm.com> writes:
> 
> > The emulation via word mode tries to perform integer arithmetic on floating
> > point values instead of floating point arithmetic.  This leads to
> > mis-compilations.
> 
> Is the bug ref + test missing?

Sorry, forgot to add the "bootstrapped and tested on s390x and x86_64".

Not sure how to reference a bugzilla here.  There is 114075 that
should be solved with this, too.

> >
> > Failure occured on s390x on these existing test cases:
> > gcc.dg/vect/tsvc/vect-tsvc-s112.c
> > gcc.dg/vect/tsvc/vect-tsvc-s113.c
> > gcc.dg/vect/tsvc/vect-tsvc-s119.c
> > gcc.dg/vect/tsvc/vect-tsvc-s121.c
> > gcc.dg/vect/tsvc/vect-tsvc-s131.c
> > gcc.dg/vect/tsvc/vect-tsvc-s132.c
> > gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> > gcc.dg/vect/tsvc/vect-tsvc-s421.c
> > gcc.dg/vect/vect-alias-check-14.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> >
> > gcc/ChangeLog:
> >
> > 	* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> >           point vectors
> >
> > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> > ---
> >  gcc/tree-vect-stmts.cc | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 09749ae38174..f95ff2c2aa34 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> >  	 those through even when the mode isn't word_mode.  For
> >  	 ops we have to lower the lowering code assumes we are
> >  	 dealing with word_mode.  */
> > -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> >  	    || !target_support_p)
> >  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> >  	  /* Check only during analysis.  */
>
Jakub Jelinek Feb. 23, 2024, 6:17 p.m. UTC | #3
On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote:
> The emulation via word mode tries to perform integer arithmetic on floating
> point values instead of floating point arithmetic.  This leads to
> mis-compilations.
> 
> Failure occured on s390x on these existing test cases:
> gcc.dg/vect/tsvc/vect-tsvc-s112.c
> gcc.dg/vect/tsvc/vect-tsvc-s113.c
> gcc.dg/vect/tsvc/vect-tsvc-s119.c
> gcc.dg/vect/tsvc/vect-tsvc-s121.c
> gcc.dg/vect/tsvc/vect-tsvc-s131.c
> gcc.dg/vect/tsvc/vect-tsvc-s132.c
> gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> gcc.dg/vect/tsvc/vect-tsvc-s421.c
> gcc.dg/vect/vect-alias-check-14.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> 
> gcc/ChangeLog:
> 

Please add
	PR tree-optimization/114075
above the * tree-vect-stmts line.
> 	* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
>           point vectors

This line should be tab indented like the first one, and end with .
And given what the patch does, perhaps say non-integral instead of floating
point.

As for testcase, I'll handle it separately, given that it already
fixes some pre-existing tests.

> Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> ---
>  gcc/tree-vect-stmts.cc | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 09749ae38174..f95ff2c2aa34 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
>  	 those through even when the mode isn't word_mode.  For
>  	 ops we have to lower the lowering code assumes we are
>  	 dealing with word_mode.  */
> -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
>  	    || !target_support_p)
>  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
>  	  /* Check only during analysis.  */

LGTM, but please wait until Monday evening so that Richi or Richard
have a chance to chime in.

	Jakub
Richard Biener Feb. 26, 2024, 8 a.m. UTC | #4
On Fri, 23 Feb 2024, Jakub Jelinek wrote:

> On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote:
> > The emulation via word mode tries to perform integer arithmetic on floating
> > point values instead of floating point arithmetic.  This leads to
> > mis-compilations.
> > 
> > Failure occured on s390x on these existing test cases:
> > gcc.dg/vect/tsvc/vect-tsvc-s112.c
> > gcc.dg/vect/tsvc/vect-tsvc-s113.c
> > gcc.dg/vect/tsvc/vect-tsvc-s119.c
> > gcc.dg/vect/tsvc/vect-tsvc-s121.c
> > gcc.dg/vect/tsvc/vect-tsvc-s131.c
> > gcc.dg/vect/tsvc/vect-tsvc-s132.c
> > gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> > gcc.dg/vect/tsvc/vect-tsvc-s421.c
> > gcc.dg/vect/vect-alias-check-14.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> > 
> > gcc/ChangeLog:
> > 
> 
> Please add
> 	PR tree-optimization/114075
> above the * tree-vect-stmts line.
> > 	* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> >           point vectors
> 
> This line should be tab indented like the first one, and end with .
> And given what the patch does, perhaps say non-integral instead of floating
> point.
> 
> As for testcase, I'll handle it separately, given that it already
> fixes some pre-existing tests.
> 
> > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> > ---
> >  gcc/tree-vect-stmts.cc | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 09749ae38174..f95ff2c2aa34 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> >  	 those through even when the mode isn't word_mode.  For
> >  	 ops we have to lower the lowering code assumes we are
> >  	 dealing with word_mode.  */
> > -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> >  	    || !target_support_p)
> >  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> >  	  /* Check only during analysis.  */

I think it will work fine.  Even after the last TLC this feels like in
the need of more TLC ;)

So OK.  Also for affected branches - the effective check should be the
same in GCC 13 at least, but with some added ad-hoc costing which might
make this not trigger (maybe_lt (nunits_out, 4U)) - so we'd need a
word_mode that can cover 4 FP elements.  Possibly triggerable with
HFmode?

Thanks,
Richard.

> LGTM, but please wait until Monday evening so that Richi or Richard
> have a chance to chime in.
> 
> 	Jakub
> 
>
Jakub Jelinek Feb. 26, 2024, 8:33 a.m. UTC | #5
On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > >  	 those through even when the mode isn't word_mode.  For
> > >  	 ops we have to lower the lowering code assumes we are
> > >  	 dealing with word_mode.  */
> > > -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > >  	    || !target_support_p)
> > >  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > >  	  /* Check only during analysis.  */
> 
> I think it will work fine.  Even after the last TLC this feels like in
> the need of more TLC ;)

Note, at least in theory, floating point NEGATE_EXPR could be handled just
fine in the emulated vectors, just xor the sign bit, and
BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
floating point modes (though e.g. in RTL they are used even for them),
it is mainly PLUS_EXPR/MINUS_EXPR.  Perhaps the NEGATE_EXPR isn't worth
it though.  In any case, that wouldn't be stage4 material.

	Jakub
Richard Biener Feb. 26, 2024, 8:53 a.m. UTC | #6
On Mon, 26 Feb 2024, Jakub Jelinek wrote:

> On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > >  	 those through even when the mode isn't word_mode.  For
> > > >  	 ops we have to lower the lowering code assumes we are
> > > >  	 dealing with word_mode.  */
> > > > -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > >  	    || !target_support_p)
> > > >  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > >  	  /* Check only during analysis.  */
> > 
> > I think it will work fine.  Even after the last TLC this feels like in
> > the need of more TLC ;)
> 
> Note, at least in theory, floating point NEGATE_EXPR could be handled just
> fine in the emulated vectors, just xor the sign bit, and
> BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> floating point modes (though e.g. in RTL they are used even for them),

Hmm, I think vector types not supported only ever get integer modes
assigned, not FP modes so the operations would be on integer modes.
Unless I'm missing the point ...

> it is mainly PLUS_EXPR/MINUS_EXPR.  Perhaps the NEGATE_EXPR isn't worth
> it though.  In any case, that wouldn't be stage4 material.

Agreed.

Richard.
Jakub Jelinek Feb. 26, 2024, 9:07 a.m. UTC | #7
On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> 
> > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > >  	 those through even when the mode isn't word_mode.  For
> > > > >  	 ops we have to lower the lowering code assumes we are
> > > > >  	 dealing with word_mode.  */
> > > > > -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > > +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > >  	    || !target_support_p)
> > > > >  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > >  	  /* Check only during analysis.  */
> > > 
> > > I think it will work fine.  Even after the last TLC this feels like in
> > > the need of more TLC ;)
> > 
> > Note, at least in theory, floating point NEGATE_EXPR could be handled just
> > fine in the emulated vectors, just xor the sign bit, and
> > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> > floating point modes (though e.g. in RTL they are used even for them),
> 
> Hmm, I think vector types not supported only ever get integer modes
> assigned, not FP modes so the operations would be on integer modes.

I mean one could still SLP vectorize using the emulated vectors
float a[2], b[2];
...
  a[0] = -b[0];
  a[1] = -b[1];
as effectively
  *(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL;
if we handled that case later on (except the above !INTEGRAL_TYPE_P check
would prevent that).

As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't
disallow
  float f, g;
  f_3 |= g_4;
etc. (but the FEs do), though maybe we'd ICE on some targets during
expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs.
Emulation using integer mode would be well defined.

	Jakub
Richard Biener Feb. 26, 2024, 9:19 a.m. UTC | #8
On Mon, 26 Feb 2024, Jakub Jelinek wrote:

> On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> > On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> > 
> > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > > >  	 those through even when the mode isn't word_mode.  For
> > > > > >  	 ops we have to lower the lowering code assumes we are
> > > > > >  	 dealing with word_mode.  */
> > > > > > -      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > > +      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > > > +	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > >  	    || !target_support_p)
> > > > > >  	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > > >  	  /* Check only during analysis.  */
> > > > 
> > > > I think it will work fine.  Even after the last TLC this feels like in
> > > > the need of more TLC ;)
> > > 
> > > Note, at least in theory, floating point NEGATE_EXPR could be handled just
> > > fine in the emulated vectors, just xor the sign bit, and
> > > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> > > floating point modes (though e.g. in RTL they are used even for them),
> > 
> > Hmm, I think vector types not supported only ever get integer modes
> > assigned, not FP modes so the operations would be on integer modes.
> 
> I mean one could still SLP vectorize using the emulated vectors
> float a[2], b[2];
> ...
>   a[0] = -b[0];
>   a[1] = -b[1];
> as effectively
>   *(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL;
> if we handled that case later on (except the above !INTEGRAL_TYPE_P check
> would prevent that).

Yep.

> As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't
> disallow
>   float f, g;
>   f_3 |= g_4;
> etc. (but the FEs do), though maybe we'd ICE on some targets during
> expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs.

I think that's an omission in the verifier (we allow the ops on
pointers though).

Richard.
diff mbox series

Patch

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 09749ae38174..f95ff2c2aa34 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6756,7 +6756,8 @@  vectorizable_operation (vec_info *vinfo,
 	 those through even when the mode isn't word_mode.  For
 	 ops we have to lower the lowering code assumes we are
 	 dealing with word_mode.  */
-      if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
+	  || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
 	    || !target_support_p)
 	   && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
 	  /* Check only during analysis.  */