Message ID: DB5PR0801MB2742FFC7E11DD25875D0C86AE7BF0@DB5PR0801MB2742.eurprd08.prod.outlook.com
State: New
Bin Cheng <Bin.Cheng@arm.com> writes:
> Hi,
> This simple patch fixes the ICE by adding LTGT in
> vec_cmp<mode><v_cmp_result> pattern.
> I also modified the original test case into a compilation one since
> -fno-wrapping-math
> should not be used in general.
> Bootstrap and test on AArch64, test result check for x86_64.  Is it OK?
> I would also need to
> backport it to gcc-7-branch.
>
> Thanks,
> bin
>
> 2017-07-27  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR target/81228
> 	* config/aarch64/aarch64-simd.md (vec_cmp<mode><v_cmp_result>): Add
> 	LTGT.
>
> gcc/testsuite/ChangeLog
> 2017-07-27  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR target/81228
> 	* gcc.dg/pr81228.c: New.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 011fcec0..9cd67a2 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2524,6 +2524,7 @@
>      case EQ:
>        comparison = gen_aarch64_cmeq<mode>;
>        break;
> +    case LTGT:
>      case UNEQ:
>      case ORDERED:
>      case UNORDERED:
> @@ -2571,6 +2572,7 @@
>        emit_insn (comparison (operands[0], operands[2], operands[3]));
>        break;
>
> +    case LTGT:
>      case UNEQ:
>        /* We first check (a > b || b > a) which is !UNEQ, inverting
> 	 this result will then give us (a == b || a UNORDERED b).  */
> @@ -2578,7 +2580,8 @@
> 					 operands[2], operands[3]));
>        emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2]));
>        emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp));
> -      emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
> +      if (code == UNEQ)
> +	emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
>        break;

AFAIK this is still a grey area, but I think (ltgt x y) is supposed to
be a trapping operation, i.e. it's closer to (ior (lt x y) (gt x y))
than (not (uneq x y)).  See e.g. the handling in may_trap_p_1, where
LTGT is handled like LT and GT rather than like UNEQ.

See also: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00583.html

Thanks,
Richard
On Fri, Jul 28, 2017 at 12:55 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Bin Cheng <Bin.Cheng@arm.com> writes:
>> Hi,
>> This simple patch fixes the ICE by adding LTGT in
>> vec_cmp<mode><v_cmp_result> pattern.
>> [...]
>
> AFAIK this is still a grey area, but I think (ltgt x y) is supposed to
> be a trapping operation, i.e. it's closer to (ior (lt x y) (gt x y))
> than (not (uneq x y)).  See e.g. the handling in may_trap_p_1, where
> LTGT is handled like LT and GT rather than like UNEQ.
>
> See also: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00583.html

Thanks for pointing me to this; I don't know anything about floating
point here.  As for the change, the code now looks like:

    case LTGT:
    case UNEQ:
      /* We first check (a > b || b > a) which is !UNEQ, inverting
	 this result will then give us (a == b || a UNORDERED b).  */
      emit_insn (gen_aarch64_cmgt<mode> (operands[0],
					 operands[2], operands[3]));
      emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2]));
      emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp));
      if (code == UNEQ)
	emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
      break;

So (a > b || b > a) is generated for LTGT, which is what you suggested?
Here we invert the result for UNEQ though.

Thanks,
bin

> Thanks,
> Richard
"Bin.Cheng" <amker.cheng@gmail.com> writes: > On Fri, Jul 28, 2017 at 12:55 PM, Richard Sandiford > <richard.sandiford@linaro.org> wrote: >> Bin Cheng <Bin.Cheng@arm.com> writes: >>> Hi, >>> This simple patch fixes the ICE by adding LTGT in >>> vec_cmp<mode><v_cmp_result> pattern. >>> I also modified the original test case into a compilation one since >>> -fno-wrapping-math >>> should not be used in general. >>> Bootstrap and test on AArch64, test result check for x86_64. Is it OK? >>> I would also need to >>> backport it to gcc-7-branch. >>> >>> Thanks, >>> bin >>> 2017-07-27 Bin Cheng <bin.cheng@arm.com> >>> >>> PR target/81228 >>> * config/aarch64/aarch64-simd.md (vec_cmp<mode><v_cmp_result>): Add >>> LTGT. >>> >>> gcc/testsuite/ChangeLog >>> 2017-07-27 Bin Cheng <bin.cheng@arm.com> >>> >>> PR target/81228 >>> * gcc.dg/pr81228.c: New. >>> >>> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md >>> index 011fcec0..9cd67a2 100644 >>> --- a/gcc/config/aarch64/aarch64-simd.md >>> +++ b/gcc/config/aarch64/aarch64-simd.md >>> @@ -2524,6 +2524,7 @@ >>> case EQ: >>> comparison = gen_aarch64_cmeq<mode>; >>> break; >>> + case LTGT: >>> case UNEQ: >>> case ORDERED: >>> case UNORDERED: >>> @@ -2571,6 +2572,7 @@ >>> emit_insn (comparison (operands[0], operands[2], operands[3])); >>> break; >>> >>> + case LTGT: >>> case UNEQ: >>> /* We first check (a > b || b > a) which is !UNEQ, inverting >>> this result will then give us (a == b || a UNORDERED b). */ >>> @@ -2578,7 +2580,8 @@ >>> operands[2], operands[3])); >>> emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2])); >>> emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp)); >>> - emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0])); >>> + if (code == UNEQ) >>> + emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0])); >>> break; >> >> AFAIK this is still a grey area, but I think (ltgt x y) is supposed to >> be a trapping operation, i.e. 
it's closer to (ior (lt x y) (gt x y)) >> than (not (uneq x y)). See e.g. the handling in may_trap_p_1, where >> LTGT is handled like LT and GT rather than like UNEQ. >> >> See also: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00583.html > Thanks for pointing me to this, I don't know anything about floating point here. > As for the change, the code now looks like: > > case LTGT: > case UNEQ: > /* We first check (a > b || b > a) which is !UNEQ, inverting > this result will then give us (a == b || a UNORDERED b). */ > emit_insn (gen_aarch64_cmgt<mode> (operands[0], > operands[2], operands[3])); > emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2])); > emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp)); > if (code == UNEQ) > emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0])); > break; > > So (a > b || b > a) is generated for LTGT which you suggested? Ah, yeah, I was just going off LTGT being treated as !UNEQ, but... > Here we invert the result for UNEQ though. ...it looks like it might be the UNEQ code that's wrong. E.g. this test fails at -O3 and passes at -O for me: #define _GNU_SOURCE #include <fenv.h> double x[16], y[16]; int res[16]; int main (void) { for (int i = 0; i < 16; ++i) { x[i] = __builtin_nan (""); y[i] = i; } asm volatile ("" ::: "memory"); feclearexcept (FE_ALL_EXCEPT); for (int i = 0; i < 16; ++i) res[i] = __builtin_islessgreater (x[i], y[i]); asm volatile ("" ::: "memory"); return fetestexcept (FE_ALL_EXCEPT) != 0; } (asm volatiles just added for paranoia, in case stuff gets optimised away otherwise.) But I suppose that's no reason to hold up your patch. :-) Maybe it'd be worth having a comment though? Thanks, Richard
On Fri, Jul 28, 2017 at 3:15 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> "Bin.Cheng" <amker.cheng@gmail.com> writes:
>> [...]
>> So (a > b || b > a) is generated for LTGT which you suggested?
>
> Ah, yeah, I was just going off LTGT being treated as !UNEQ, but...
>
>> Here we invert the result for UNEQ though.
>
> ...it looks like it might be the UNEQ code that's wrong.  E.g. this
> test fails at -O3 and passes at -O for me:

Thanks very much for showing the issue with the example below.  That
part of the code was refactored from the old implementation when the
pattern was added.

> [test case quoted]
>
> (asm volatiles just added for paranoia, in case stuff gets optimised
> away otherwise.)
>
> But I suppose that's no reason to hold up your patch. :-)  Maybe it'd
> be worth having a comment though?

Given the code is wrong, I will file a bug to track it and see whether
I can fix it while I am at it.  Hopefully it won't take long, so a
comment can be saved for now.

Thanks,
bin

> Thanks,
> Richard
On Fri, Jul 28, 2017 at 3:15 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> "Bin.Cheng" <amker.cheng@gmail.com> writes:
>> [...]
>
> ...it looks like it might be the UNEQ code that's wrong.  E.g. this
> test fails at -O3 and passes at -O for me:
>
> [test case quoted]
>
> (asm volatiles just added for paranoia, in case stuff gets optimised
> away otherwise.)

Thanks for the test; I filed PR81647 to track it.  This is actually an
inconsistency in the behaviour of LTGT: I think it is translated
differently with and without vectorization.

Thanks,
bin

> But I suppose that's no reason to hold up your patch. :-)  Maybe it'd
> be worth having a comment though?
>
> Thanks,
> Richard
Ping.

On Fri, Jul 28, 2017 at 12:37 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> This simple patch fixes the ICE by adding LTGT in the
> vec_cmp<mode><v_cmp_result> pattern.
> I also modified the original test case into a compilation one, since
> -fno-wrapping-math should not be used in general.
> Bootstrapped and tested on AArch64; test results checked for x86_64.
> Is it OK?  I would also need to backport it to gcc-7-branch.
>
> Thanks,
> bin
>
> 2017-07-27  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR target/81228
> 	* config/aarch64/aarch64-simd.md (vec_cmp<mode><v_cmp_result>): Add
> 	LTGT.
>
> gcc/testsuite/ChangeLog
> 2017-07-27  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR target/81228
> 	* gcc.dg/pr81228.c: New.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 011fcec0..9cd67a2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2524,6 +2524,7 @@
     case EQ:
       comparison = gen_aarch64_cmeq<mode>;
       break;
+    case LTGT:
     case UNEQ:
     case ORDERED:
     case UNORDERED:
@@ -2571,6 +2572,7 @@
       emit_insn (comparison (operands[0], operands[2], operands[3]));
       break;

+    case LTGT:
     case UNEQ:
       /* We first check (a > b || b > a) which is !UNEQ, inverting
	 this result will then give us (a == b || a UNORDERED b).  */
@@ -2578,7 +2580,8 @@
					 operands[2], operands[3]));
       emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2]));
       emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp));
-      emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
+      if (code == UNEQ)
+	emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
       break;

     case UNORDERED:
diff --git a/gcc/testsuite/gcc.dg/pr81228.c b/gcc/testsuite/gcc.dg/pr81228.c
new file mode 100644
index 0000000..3334299
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr81228.c
@@ -0,0 +1,47 @@
+/* PR target/81228 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-trapping-math" } */
+/* { dg-options "-O3 -fno-trapping-math -mavx" { target avx_runtime } } */
+
+double s1[4], s2[4], s3[64];
+
+int
+main (void)
+{
+  int i;
+  asm volatile ("" : : : "memory");
+  for (i = 0; i < 4; i++)
+    s3[0 * 4 + i] = __builtin_isgreater (s1[i], s2[i]) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[1 * 4 + i] = (!__builtin_isgreater (s1[i], s2[i])) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[2 * 4 + i] = __builtin_isgreaterequal (s1[i], s2[i]) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[3 * 4 + i] = (!__builtin_isgreaterequal (s1[i], s2[i])) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[4 * 4 + i] = __builtin_isless (s1[i], s2[i]) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[5 * 4 + i] = (!__builtin_isless (s1[i], s2[i])) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[6 * 4 + i] = __builtin_islessequal (s1[i], s2[i]) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[7 * 4 + i] = (!__builtin_islessequal (s1[i], s2[i])) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[8 * 4 + i] = __builtin_islessgreater (s1[i], s2[i]) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[9 * 4 + i] = (!__builtin_islessgreater (s1[i], s2[i])) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[10 * 4 + i] = __builtin_isunordered (s1[i], s2[i]) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[11 * 4 + i] = (!__builtin_isunordered (s1[i], s2[i])) ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[12 * 4 + i] = s1[i] > s2[i] ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[13 * 4 + i] = s1[i] >= s2[i] ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[14 * 4 + i] = s1[i] < s2[i] ? -1.0 : 0.0;
+  for (i = 0; i < 4; i++)
+    s3[15 * 4 + i] = s1[i] <= s2[i] ? -1.0 : 0.0;
+  asm volatile ("" : : : "memory");
+  return 0;
+}