
Don't error about x86 return value in SSE reg (or x87 reg) or argument in SSE reg too early (PR target/80298)

Message ID CAFULd4YxXU=wwE0Yygime6RMMWiR3RwrH=vRXeoGsF8CNoOwsQ@mail.gmail.com

Commit Message

Uros Bizjak April 5, 2017, 7:42 a.m. UTC
On Tue, Apr 4, 2017 at 9:24 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> aggregate_value_p is often called very early during compilation, e.g.
> from allocate_struct_function or during gimplification of a call with lhs.
> The problem with that is that e.g. on x86_64 with -m64 -mno-sse we can't
> include <x86intrin.h>, because the always_inline inline functions
> in the MMX and 3DNow! intrinsic headers return __m64 or take __m64 as
> arguments, which in the 64-bit ABI is passed in an SSE register.
>
> The following patch makes sure we diagnose this only later (e.g. when
> expanding a function to RTL or when expanding calls to other functions),
> which means we don't diagnose e.g. inline functions that got successfully
> inlined (because then there is really no function return in SSE or x87
> reg) or e.g. for builtin calls if they are emitted inline rather than
> as a library call (again, I think that is desirable).
> I had to tweak a few tests because the reported line changed slightly,
> and in the last test add -fno-builtin-fminl, because otherwise fminl
> is expanded inline and again there is no call left with the problem.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

No, I think the issue should be fixed in the intrinsics.

The attached patch solves this problem for me, and also fixes a couple
of similar problems (one with -m3dnowa, which is nowadays a regular
compile option). The patched intrinsics were tested with combinations
of -m{,no-}sse{,2}, -m{,no-}mmx, -m{,no-}3dnow{,a} and -m64, and there
were no problems with any combination.

Uros.
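
For reference, the mechanism the attached patch applies to each affected
header (the full diff is at the end of this page) is the usual
push_options/target pragma guard, extended on x86_64 to also enable SSE,
since the 64-bit ABI passes and returns __m64 in SSE registers.  A
condensed sketch of the mmintrin.h variant (not the verbatim header
text):

#if defined __x86_64__ && !defined __SSE__ || !defined __MMX__
#pragma GCC push_options
#ifdef __x86_64__
/* 64-bit ABI: __m64 travels in SSE registers, so the intrinsics
   need SSE enabled as well.  */
#pragma GCC target("sse,mmx")
#else
#pragma GCC target("mmx")
#endif
#define __DISABLE_MMX__
#endif /* __MMX__ */

/* ... intrinsic definitions ... */

#ifdef __DISABLE_MMX__
#undef __DISABLE_MMX__
#pragma GCC pop_options
#endif /* __DISABLE_MMX__ */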

Comments

Jakub Jelinek April 5, 2017, 8 a.m. UTC | #1
On Wed, Apr 05, 2017 at 09:42:31AM +0200, Uros Bizjak wrote:
> On Tue, Apr 4, 2017 at 9:24 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > Hi!
> >
> > aggregate_value_p is often called very early during compilation, e.g.
> > from allocate_struct_function or during gimplification of a call with lhs.
> > The problem with that is that e.g. on x86_64 with -m64 -mno-sse we can't
> > include <x86intrin.h>, because the always_inline inline functions
> > in the MMX and 3DNow! intrinsic headers return __m64 or take __m64 as
> > arguments, which in the 64-bit ABI is passed in an SSE register.
> >
> > The following patch makes sure we diagnose this only later (e.g. when
> > expanding a function to RTL or when expanding calls to other functions),
> > which means we don't diagnose e.g. inline functions that got successfully
> > inlined (because then there is really no function return in SSE or x87
> > reg) or e.g. for builtin calls if they are emitted inline rather than
> > as a library call (again, I think that is desirable).
> > I had to tweak a few tests because the reported line changed slightly,
> > and in the last test add -fno-builtin-fminl, because otherwise fminl
> > is expanded inline and again there is no call left with the problem.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> No, I think the issue should be fixed in the intrinsics.

But then you can't use the mmx intrinsics in pure mmx non-sse code, even
when my patch allows that.  Say:
#include <x86intrin.h>

__attribute__((target ("no-sse,mmx"))) void
mmx_only (__m64 *a, __m64 *b, __m64 *c)
{
  *a = _mm_packs_pu16 (*b, *c);
}

or without the attribute, but in -mno-sse -mmmx compilation.
This compiles just fine for -m32, and there is no reason why it couldn't
work similarly for -m64: when the intrinsic is inlined, there is no
return nor argument needed in SSE registers.
So in effect it makes MMX unusable for 64-bit code without SSE.
Or do we just declare that unsupported?  Of course, people shouldn't be
using MMX, especially not in 64-bit code.

Note, my patch apparently doesn't handle the above: there is still the
aggregate_value_p call in NRV.
So maybe we should also not error if:
(cfun && current_ir_type () == IR_GIMPLE && !currently_expanding_to_rtl)
or so.

	Jakub
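
A sketch of the suggested extra check, as a fragment against GCC's
internals; only the condition itself is from the mail above, while the
surrounding error site is paraphrased and hypothetical:

/* In the x86 back end, at the point where "SSE register return with
   SSE disabled" would be reported: skip the error while the function
   is still in GIMPLE form, i.e. before RTL expansion, so calls that
   later get inlined never trigger it.  */
if (cfun
    && current_ir_type () == IR_GIMPLE
    && !currently_expanding_to_rtl)
  return;  /* Defer; the error is issued again at expansion time.  */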
Uros Bizjak April 5, 2017, 8:12 a.m. UTC | #2
On Wed, Apr 5, 2017 at 10:00 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Apr 05, 2017 at 09:42:31AM +0200, Uros Bizjak wrote:
>> On Tue, Apr 4, 2017 at 9:24 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > Hi!
>> >
>> > aggregate_value_p is often called very early during compilation, e.g.
>> > from allocate_struct_function or during gimplification of a call with lhs.
>> > The problem with that is that e.g. on x86_64 with -m64 -mno-sse we can't
>> > include <x86intrin.h>, because the always_inline inline functions
>> > in the MMX and 3DNow! intrinsic headers return __m64 or take __m64 as
>> > arguments, which in the 64-bit ABI is passed in an SSE register.
>> >
>> > The following patch makes sure we diagnose this only later (e.g. when
>> > expanding a function to RTL or when expanding calls to other functions),
>> > which means we don't diagnose e.g. inline functions that got successfully
>> > inlined (because then there is really no function return in SSE or x87
>> > reg) or e.g. for builtin calls if they are emitted inline rather than
>> > as a library call (again, I think that is desirable).
>> > I had to tweak a few tests because the reported line changed slightly,
>> > and in the last test add -fno-builtin-fminl, because otherwise fminl
>> > is expanded inline and again there is no call left with the problem.
>> >
>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>
>> No, I think the issue should be fixed in the intrinsics.
>
> But then you can't use the mmx intrinsics in pure mmx non-sse code, even
> when my patch allows that.  Say:
> #include <x86intrin.h>
>
> __attribute__((target ("no-sse,mmx"))) void
> mmx_only (__m64 *a, __m64 *b, __m64 *c)
> {
>   *a = _mm_packs_pu16 (*b, *c);
> }
>
> or without the attribute, but in -mno-sse -mmmx compilation.
> This compiles just fine for -m32, and there is no reason why it couldn't
> work similarly for -m64: when the intrinsic is inlined, there is no
> return nor argument needed in SSE registers.
> So in effect it makes MMX unusable for 64-bit code without SSE.
> Or do we just declare that unsupported?  Of course, people shouldn't be
> using MMX, especially not in 64-bit code.

Oh, I forgot to point out that on x86_64 the ABI specifies that MMX
values go in SSE registers.  So, on x86_64, users should not compile
MMX code with -mno-sse.

The problem, as reported in the PR, was that <x86intrin.h> couldn't be
used with -mno-sse, even when the source code that includes the header
doesn't use SSE or MMX on x86_64.  But when these intrinsics are *used*,
we should still warn on x86_64.  So the warning in your example is
correct.

Uros.
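
To make that distinction concrete, here is a hypothetical pair of
translation units (not from the PR), both compiled with -m64 -mno-sse
and the patched headers:

/* ok.c: merely including the header is fine; no __m64 value crosses
   a function boundary.  */
#include <x86intrin.h>
int f (int x) { return x + 1; }

/* bad.c: still diagnosed, as intended; the __m64 arguments and return
   value go in SSE registers under the 64-bit ABI.  */
#include <x86intrin.h>
__m64 g (__m64 *b, __m64 *c) { return _mm_packs_pu16 (*b, *c); }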
Jakub Jelinek April 5, 2017, 8:20 a.m. UTC | #3
On Wed, Apr 05, 2017 at 10:12:02AM +0200, Uros Bizjak wrote:
> Oh, I forgot to point out that on x86_64 the ABI specifies that MMX
> values go in SSE registers.

I know it does.  And if people have their own function that returns
__m64 or takes such arguments, they surely have to.
The question is only about the case when no function (in the assembly)
returns in SSE registers nor gets arguments in them, when all the
MMX code is inside of a function.
With your patch, that case still errors: the MMX intrinsics are
functions and we error on them even when they are inlined.
With my patch we count only the non-inlined functions, i.e. those we
emit assembly for or call from other TUs.

If you think requiring SSE for MMX always in 64-bit code is fine, even
when not strictly needed (as in, you really don't need SSE ISA to execute
such code, although there are no CPUs without that HW), so be it, then
let's go with your patch.

	Jakub
Uros Bizjak April 5, 2017, 8:26 a.m. UTC | #4
On Wed, Apr 5, 2017 at 10:20 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Apr 05, 2017 at 10:12:02AM +0200, Uros Bizjak wrote:
>> Oh, I forgot to point out that on x86_64 the ABI specifies that MMX
>> values go in SSE registers.
>
> I know it does.  And if people have their own function that returns
> __m64 or takes such arguments, they surely have to.
> The question is only about the case when no function (in the assembly)
> returns in SSE registers nor gets arguments in them, when all the
> MMX code is inside of a function.
> With your patch, that case still errors: the MMX intrinsics are
> functions and we error on them even when they are inlined.
> With my patch we count only the non-inlined functions, i.e. those we
> emit assembly for or call from other TUs.
>
> If you think requiring SSE for MMX always in 64-bit code is fine, even
> when not strictly needed (as in, you really don't need SSE ISA to execute
> such code, although there are no CPUs without that HW), so be it, then
> let's go with your patch.

Yes, I think that we should consistently warn even when the intrinsic
is inlined.  I'm afraid that otherwise we will have an endless stream
of bug reports that something warns with -O0 but not with -O1+, when
LTO is used, etc., etc.  The rule is: when the user uses (or intends
to use) an MMX value (which goes in an SSE reg on x86_64), -msse is
needed on x86_64; otherwise a warning is emitted.

Uros.
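
Under that rule, the earlier mmx_only example is expected to build on
x86_64 only with SSE enabled; a hypothetical adaptation that compiles:

#include <x86intrin.h>

/* OK on x86_64: SSE is enabled for this function, so __m64 values can
   travel in SSE registers as the 64-bit ABI requires.  */
__attribute__((target ("sse,mmx"))) void
mmx_only (__m64 *a, __m64 *b, __m64 *c)
{
  *a = _mm_packs_pu16 (*b, *c);
}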
Jakub Jelinek April 5, 2017, 8:29 a.m. UTC | #5
On Wed, Apr 05, 2017 at 10:26:47AM +0200, Uros Bizjak wrote:
> On Wed, Apr 5, 2017 at 10:20 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Wed, Apr 05, 2017 at 10:12:02AM +0200, Uros Bizjak wrote:
> >> Oh, I forgot to point out that on x86_64 the ABI specifies that MMX
> >> values go in SSE registers.
> >
> > I know it does.  And if people have their own function that returns
> > __m64 or takes such arguments, they surely have to.
> > The question is only about the case when no function (in the assembly)
> > returns in SSE registers nor gets arguments in them, when all the
> > MMX code is inside of a function.
> > With your patch, that case still errors: the MMX intrinsics are
> > functions and we error on them even when they are inlined.
> > With my patch we count only the non-inlined functions, i.e. those we
> > emit assembly for or call from other TUs.
> >
> > If you think requiring SSE for MMX always in 64-bit code is fine, even
> > when not strictly needed (as in, you really don't need SSE ISA to execute
> > such code, although there are no CPUs without that HW), so be it, then
> > let's go with your patch.
> 
> Yes, I think that we should consistently warn even when the intrinsic
> is inlined.  I'm afraid that otherwise we will have an endless stream
> of bug reports that something warns with -O0 but not with -O1+, when
> LTO is used, etc., etc.  The rule is: when the user uses (or intends
> to use) an MMX value (which goes in an SSE reg on x86_64), -msse is
> needed on x86_64; otherwise a warning is emitted.

Well, error.  Anyway, if so, then please commit your patch.

	Jakub
Uros Bizjak April 5, 2017, 3:37 p.m. UTC | #6
On Wed, Apr 5, 2017 at 10:29 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Apr 05, 2017 at 10:26:47AM +0200, Uros Bizjak wrote:
>> On Wed, Apr 5, 2017 at 10:20 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > On Wed, Apr 05, 2017 at 10:12:02AM +0200, Uros Bizjak wrote:
>> >> Oh, I forgot to point out that on x86_64 the ABI specifies that MMX
>> >> values go in SSE registers.
>> >
>> > I know it does.  And if people have their own function that returns
>> > __m64 or takes such arguments, they surely have to.
>> > The question is only about the case when no function (in the assembly)
>> > returns in SSE registers nor gets arguments in them, when all the
>> > MMX code is inside of a function.
>> > With your patch, that case still errors: the MMX intrinsics are
>> > functions and we error on them even when they are inlined.
>> > With my patch we count only the non-inlined functions, i.e. those we
>> > emit assembly for or call from other TUs.
>> >
>> > If you think requiring SSE for MMX always in 64-bit code is fine, even
>> > when not strictly needed (as in, you really don't need SSE ISA to execute
>> > such code, although there are no CPUs without that HW), so be it, then
>> > let's go with your patch.
>>
>> Yes, I think that we should consistently warn even when the intrinsic
>> is inlined.  I'm afraid that otherwise we will have an endless stream
>> of bug reports that something warns with -O0 but not with -O1+, when
>> LTO is used, etc., etc.  The rule is: when the user uses (or intends
>> to use) an MMX value (which goes in an SSE reg on x86_64), -msse is
>> needed on x86_64; otherwise a warning is emitted.
>
> Well, error.  Anyway, if so, then please commit your patch.

Committed with the following ChangeLog:

2017-04-05  Uros Bizjak  <ubizjak@gmail.com>

    PR target/80298
    * config/i386/mmintrin.h: Add -msse target option when __SSE__ is
    not defined for x86_64 target.  Add -mmmx target option when __SSE2__
    is not defined.
    * config/i386/mm3dnow.h: Add -msse target when __SSE__ is not defined
    for x86_64 target.  Handle -m3dnowa option.

I chose not to include testcases, since mm_malloc.h includes stdlib.h,
which uses an SSE register return with -O2, resulting in:

In file included from /usr/include/stdlib.h:921:0,
                 from /ssd/uros/gcc-build/gcc/include/mm_malloc.h:27,
                 from /ssd/uros/gcc-build/gcc/include/xmmintrin.h:34,
                 from /ssd/uros/gcc-build/gcc/include/x86intrin.h:33,
                 from
/home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr80298-2.c:5:
/usr/include/bits/stdlib-float.h: In function 'atof':
/usr/include/bits/stdlib-float.h:27:1: error: SSE register return with
SSE disabled
compiler exited with status 1

This issue is outside the scope of the compiler.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
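
For the record, a testcase of the dropped shape would have looked
roughly like this (a hypothetical reconstruction from the error log
above, not the actual pr80298-2.c):

/* { dg-do compile } */
/* { dg-options "-mno-sse" } */
/* Including the header alone must now compile on x86_64 with SSE
   disabled; adding -O2 instead trips over atof in stdlib.h, which
   returns a double in an SSE register.  */
#include <x86intrin.h>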

Patch

diff --git a/gcc/config/i386/mm3dnow.h b/gcc/config/i386/mm3dnow.h
index c8a91a1..2d5c538 100644
--- a/gcc/config/i386/mm3dnow.h
+++ b/gcc/config/i386/mm3dnow.h
@@ -30,9 +30,13 @@ 
 #include <mmintrin.h>
 #include <prfchwintrin.h>
 
-#ifndef __3dNOW__
+#if defined __x86_64__ && !defined __SSE__ || !defined __3dNOW__
 #pragma GCC push_options
+#ifdef __x86_64__
+#pragma GCC target("sse,3dnow")
+#else
 #pragma GCC target("3dnow")
+#endif
 #define __DISABLE_3dNOW__
 #endif /* __3dNOW__ */
 
@@ -176,7 +180,20 @@  _m_to_float (__m64 __A)
   return __tmp.a[0];
 }
 
-#ifdef __3dNOW_A__
+#ifdef __DISABLE_3dNOW__
+#undef __DISABLE_3dNOW__
+#pragma GCC pop_options
+#endif /* __DISABLE_3dNOW__ */
+
+#if defined __x86_64__ && !defined __SSE__ || !defined __3dNOW_A__
+#pragma GCC push_options
+#ifdef __x86_64__
+#pragma GCC target("sse,3dnowa")
+#else
+#pragma GCC target("3dnowa")
+#endif
+#define __DISABLE_3dNOW_A__
+#endif /* __3dNOW_A__ */
 
 extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _m_pf2iw (__m64 __A)
@@ -208,11 +225,9 @@  _m_pswapd (__m64 __A)
   return (__m64)__builtin_ia32_pswapdsf ((__v2sf)__A);
 }
 
-#endif /* __3dNOW_A__ */
-
-#ifdef __DISABLE_3dNOW__
-#undef __DISABLE_3dNOW__
+#ifdef __DISABLE_3dNOW_A__
+#undef __DISABLE_3dNOW_A__
 #pragma GCC pop_options
-#endif /* __DISABLE_3dNOW__ */
+#endif /* __DISABLE_3dNOW_A__ */
 
 #endif /* _MM3DNOW_H_INCLUDED */
diff --git a/gcc/config/i386/mmintrin.h b/gcc/config/i386/mmintrin.h
index 957d766..2cb73e3 100644
--- a/gcc/config/i386/mmintrin.h
+++ b/gcc/config/i386/mmintrin.h
@@ -27,9 +27,13 @@ 
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
-#ifndef __MMX__
+#if defined __x86_64__ && !defined __SSE__ || !defined __MMX__
 #pragma GCC push_options
+#ifdef __x86_64__
+#pragma GCC target("sse,mmx")
+#else
 #pragma GCC target("mmx")
+#endif
 #define __DISABLE_MMX__
 #endif /* __MMX__ */
 
@@ -311,7 +315,7 @@  _m_paddd (__m64 __m1, __m64 __m2)
 /* Add the 64-bit values in M1 to the 64-bit values in M2.  */
 #ifndef __SSE2__
 #pragma GCC push_options
-#pragma GCC target("sse2")
+#pragma GCC target("sse2,mmx")
 #define __DISABLE_SSE2__
 #endif /* __SSE2__ */
 
@@ -423,7 +427,7 @@  _m_psubd (__m64 __m1, __m64 __m2)
 /* Add the 64-bit values in M1 to the 64-bit values in M2.  */
 #ifndef __SSE2__
 #pragma GCC push_options
-#pragma GCC target("sse2")
+#pragma GCC target("sse2,mmx")
 #define __DISABLE_SSE2__
 #endif /* __SSE2__ */
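
Applied, the mm3dnow.h hunks above give the 3DNow!A section its own
guard, symmetric with the base 3DNow! one; condensed, the resulting
header structure is:

#if defined __x86_64__ && !defined __SSE__ || !defined __3dNOW_A__
#pragma GCC push_options
#ifdef __x86_64__
#pragma GCC target("sse,3dnowa")
#else
#pragma GCC target("3dnowa")
#endif
#define __DISABLE_3dNOW_A__
#endif /* __3dNOW_A__ */

/* _m_pf2iw, _m_pswapd and the other 3DNow!A intrinsics ...  */

#ifdef __DISABLE_3dNOW_A__
#undef __DISABLE_3dNOW_A__
#pragma GCC pop_options
#endif /* __DISABLE_3dNOW_A__ */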