Message ID | 20240223134345.20888-1-jchrist@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [v2] Do not emulate vectors containing floats. | expand |
Juergen Christ <jchrist@linux.ibm.com> writes: > The emulation via word mode tries to perform integer arithmetic on floating > point values instead of floating point arithmetic. This leads to > mis-compilations. Is the bug ref + test missing? > > Failure occured on s390x on these existing test cases: > gcc.dg/vect/tsvc/vect-tsvc-s112.c > gcc.dg/vect/tsvc/vect-tsvc-s113.c > gcc.dg/vect/tsvc/vect-tsvc-s119.c > gcc.dg/vect/tsvc/vect-tsvc-s121.c > gcc.dg/vect/tsvc/vect-tsvc-s131.c > gcc.dg/vect/tsvc/vect-tsvc-s132.c > gcc.dg/vect/tsvc/vect-tsvc-s2233.c > gcc.dg/vect/tsvc/vect-tsvc-s421.c > gcc.dg/vect/vect-alias-check-14.c > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c > > gcc/ChangeLog: > > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating > point vectors > > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com> > --- > gcc/tree-vect-stmts.cc | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index 09749ae38174..f95ff2c2aa34 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > those through even when the mode isn't word_mode. For > ops we have to lower the lowering code assumes we are > dealing with word_mode. */ > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > || !target_support_p) > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > /* Check only during analysis. */
Am Fri, Feb 23, 2024 at 01:57:12PM +0000 schrieb Sam James: > > Juergen Christ <jchrist@linux.ibm.com> writes: > > > The emulation via word mode tries to perform integer arithmetic on floating > > point values instead of floating point arithmetic. This leads to > > mis-compilations. > > Is the bug ref + test missing? Sorry, forgot to add the "bootstrapped and tested on s390x and x86_64". Not sure how to reference a bugzilla here. There is 114075 that should be solved with this, too. > > > > Failure occured on s390x on these existing test cases: > > gcc.dg/vect/tsvc/vect-tsvc-s112.c > > gcc.dg/vect/tsvc/vect-tsvc-s113.c > > gcc.dg/vect/tsvc/vect-tsvc-s119.c > > gcc.dg/vect/tsvc/vect-tsvc-s121.c > > gcc.dg/vect/tsvc/vect-tsvc-s131.c > > gcc.dg/vect/tsvc/vect-tsvc-s132.c > > gcc.dg/vect/tsvc/vect-tsvc-s2233.c > > gcc.dg/vect/tsvc/vect-tsvc-s421.c > > gcc.dg/vect/vect-alias-check-14.c > > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c > > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c > > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c > > > > gcc/ChangeLog: > > > > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating > > point vectors > > > > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com> > > --- > > gcc/tree-vect-stmts.cc | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > index 09749ae38174..f95ff2c2aa34 100644 > > --- a/gcc/tree-vect-stmts.cc > > +++ b/gcc/tree-vect-stmts.cc > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > those through even when the mode isn't word_mode. For > > ops we have to lower the lowering code assumes we are > > dealing with word_mode. */ > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > || !target_support_p) > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > > /* Check only during analysis. */ >
On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote: > The emulation via word mode tries to perform integer arithmetic on floating > point values instead of floating point arithmetic. This leads to > mis-compilations. > > Failure occured on s390x on these existing test cases: > gcc.dg/vect/tsvc/vect-tsvc-s112.c > gcc.dg/vect/tsvc/vect-tsvc-s113.c > gcc.dg/vect/tsvc/vect-tsvc-s119.c > gcc.dg/vect/tsvc/vect-tsvc-s121.c > gcc.dg/vect/tsvc/vect-tsvc-s131.c > gcc.dg/vect/tsvc/vect-tsvc-s132.c > gcc.dg/vect/tsvc/vect-tsvc-s2233.c > gcc.dg/vect/tsvc/vect-tsvc-s421.c > gcc.dg/vect/vect-alias-check-14.c > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c > > gcc/ChangeLog: > Please add PR tree-optimization/114075 above the * tree-vect-stmts line. > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating > point vectors This line should be tab indented like the first one, and end with . And given what the patch does, perhaps say non-integral instead of floating point. As for testcase, I'll handle it separately, given that it already fixes some pre-existing tests. > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com> > --- > gcc/tree-vect-stmts.cc | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index 09749ae38174..f95ff2c2aa34 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > those through even when the mode isn't word_mode. For > ops we have to lower the lowering code assumes we are > dealing with word_mode. */ > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > || !target_support_p) > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > /* Check only during analysis. */ LGTM, but please wait until Monday evening so that Richi or Richard have a chance to chime in. Jakub
On Fri, 23 Feb 2024, Jakub Jelinek wrote: > On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote: > > The emulation via word mode tries to perform integer arithmetic on floating > > point values instead of floating point arithmetic. This leads to > > mis-compilations. > > > > Failure occured on s390x on these existing test cases: > > gcc.dg/vect/tsvc/vect-tsvc-s112.c > > gcc.dg/vect/tsvc/vect-tsvc-s113.c > > gcc.dg/vect/tsvc/vect-tsvc-s119.c > > gcc.dg/vect/tsvc/vect-tsvc-s121.c > > gcc.dg/vect/tsvc/vect-tsvc-s131.c > > gcc.dg/vect/tsvc/vect-tsvc-s132.c > > gcc.dg/vect/tsvc/vect-tsvc-s2233.c > > gcc.dg/vect/tsvc/vect-tsvc-s421.c > > gcc.dg/vect/vect-alias-check-14.c > > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c > > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c > > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c > > > > gcc/ChangeLog: > > > > Please add > PR tree-optimization/114075 > above the * tree-vect-stmts line. > > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating > > point vectors > > This line should be tab indented like the first one, and end with . > And given what the patch does, perhaps say non-integral instead of floating > point. > > As for testcase, I'll handle it separately, given that it already > fixes some pre-existing tests. > > > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com> > > --- > > gcc/tree-vect-stmts.cc | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > index 09749ae38174..f95ff2c2aa34 100644 > > --- a/gcc/tree-vect-stmts.cc > > +++ b/gcc/tree-vect-stmts.cc > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > those through even when the mode isn't word_mode. For > > ops we have to lower the lowering code assumes we are > > dealing with word_mode. */ > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > || !target_support_p) > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > > /* Check only during analysis. */ I think it will work fine. Even after the last TLC this feels like in the need of more TLC ;) So OK. Also for affected branches - the effective check should be the same in GCC 13 at least, but with some added ad-hoc costing which might make this not trigger (maybe_lt (nunits_out, 4U)) - so we'd need a word_mode that can cover 4 FP elements. Possibly triggerable with HFmode? Thanks, Richard. > LGTM, but please wait until Monday evening so that Richi or Richard > have a chance to chime in. > > Jakub > >
On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote: > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > > those through even when the mode isn't word_mode. For > > > ops we have to lower the lowering code assumes we are > > > dealing with word_mode. */ > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > || !target_support_p) > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > > > /* Check only during analysis. */ > > I think it will work fine. Even after the last TLC this feels like in > the need of more TLC ;) Note, at least in theory, floating point NEGATE_EXPR could be handled just fine in the emulated vectors, just xor the sign bit, and BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on floating point modes (though e.g. in RTL they are used even for them), it is mainly PLUS_EXPR/MINUS_EXPR. Perhaps the NEGATE_EXPR isn't worth it though. In any case, that wouldn't be stage4 material. Jakub
On Mon, 26 Feb 2024, Jakub Jelinek wrote: > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote: > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > > > those through even when the mode isn't word_mode. For > > > > ops we have to lower the lowering code assumes we are > > > > dealing with word_mode. */ > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > > || !target_support_p) > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > > > > /* Check only during analysis. */ > > > > I think it will work fine. Even after the last TLC this feels like in > > the need of more TLC ;) > > Note, at least in theory, floating point NEGATE_EXPR could be handled just > fine in the emulated vectors, just xor the sign bit, and > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on > floating point modes (though e.g. in RTL they are used even for them), Hmm, I think vector types not supported only ever get integer modes assigned, not FP modes so the operations would be on integer modes. Unless I'm missing the point ... > it is mainly PLUS_EXPR/MINUS_EXPR. Perhaps the NEGATE_EXPR isn't worth > it though. In any case, that wouldn't be stage4 material. Agreed. Richard.
On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote: > On Mon, 26 Feb 2024, Jakub Jelinek wrote: > > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote: > > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > > > > those through even when the mode isn't word_mode. For > > > > > ops we have to lower the lowering code assumes we are > > > > > dealing with word_mode. */ > > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > > > || !target_support_p) > > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > > > > > /* Check only during analysis. */ > > > > > > I think it will work fine. Even after the last TLC this feels like in > > > the need of more TLC ;) > > > > Note, at least in theory, floating point NEGATE_EXPR could be handled just > > fine in the emulated vectors, just xor the sign bit, and > > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on > > floating point modes (though e.g. in RTL they are used even for them), > > Hmm, I think vector types not supported only ever get integer modes > assigned, not FP modes so the operations would be on integer modes. I mean one could still SLP vectorize using the emulated vectors float a[2], b[2]; ... a[0] = -b[0]; a[1] = -b[1]; as effectively *(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL; if we handled that case later on (except the above !INTEGRAL_TYPE_P check would prevent that). As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't disallow float f, g; f_3 |= g_4; etc. (but the FEs do), though maybe we'd ICE on some targets during expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs. Emulation using integer mode would be well defined. Jakub
On Mon, 26 Feb 2024, Jakub Jelinek wrote: > On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote: > > On Mon, 26 Feb 2024, Jakub Jelinek wrote: > > > > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote: > > > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > > > > > those through even when the mode isn't word_mode. For > > > > > > ops we have to lower the lowering code assumes we are > > > > > > dealing with word_mode. */ > > > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) > > > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) > > > > > > || !target_support_p) > > > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) > > > > > > /* Check only during analysis. */ > > > > > > > > I think it will work fine. Even after the last TLC this feels like in > > > > the need of more TLC ;) > > > > > > Note, at least in theory, floating point NEGATE_EXPR could be handled just > > > fine in the emulated vectors, just xor the sign bit, and > > > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on > > > floating point modes (though e.g. in RTL they are used even for them), > > > > Hmm, I think vector types not supported only ever get integer modes > > assigned, not FP modes so the operations would be on integer modes. > > I mean one could still SLP vectorize using the emulated vectors > float a[2], b[2]; > ... > a[0] = -b[0]; > a[1] = -b[1]; > as effectively > *(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL; > if we handled that case later on (except the above !INTEGRAL_TYPE_P check > would prevent that). Yep. > As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't > disallow > float f, g; > f_3 |= g_4; > etc. (but the FEs do), though maybe we'd ICE on some targets during > expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs. I think that's an omission in the verifier (we allow the ops on pointers though). Richard.
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 09749ae38174..f95ff2c2aa34 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, those through even when the mode isn't word_mode. For ops we have to lower the lowering code assumes we are dealing with word_mode. */ - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype)) + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR) || !target_support_p) && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD)) /* Check only during analysis. */
The emulation via word mode tries to perform integer arithmetic on floating point values instead of floating point arithmetic. This leads to mis-compilations. Failure occured on s390x on these existing test cases: gcc.dg/vect/tsvc/vect-tsvc-s112.c gcc.dg/vect/tsvc/vect-tsvc-s113.c gcc.dg/vect/tsvc/vect-tsvc-s119.c gcc.dg/vect/tsvc/vect-tsvc-s121.c gcc.dg/vect/tsvc/vect-tsvc-s131.c gcc.dg/vect/tsvc/vect-tsvc-s132.c gcc.dg/vect/tsvc/vect-tsvc-s2233.c gcc.dg/vect/tsvc/vect-tsvc-s421.c gcc.dg/vect/vect-alias-check-14.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating point vectors Signed-off-by: Juergen Christ <jchrist@linux.ibm.com> --- gcc/tree-vect-stmts.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)