PR target/69225: Set FLT_EVAL_METHOD to 2 only if 387 FPU is used

Message ID CAFULd4ZRmtooD2A+G6iWaSpAQvY8Jt1tS582G7bpj7HpjtUxSA@mail.gmail.com
State New

Commit Message

Uros Bizjak Jan. 12, 2016, 12:12 p.m. UTC
On Tue, Jan 12, 2016 at 12:18 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jan 12, 2016 at 12:10:20PM +0100, Uros Bizjak wrote:
>> On Tue, Jan 12, 2016 at 1:15 AM, Joseph Myers <joseph@codesourcery.com> wrote:
>> > On Mon, 11 Jan 2016, H.J. Lu wrote:
>> >
>> >> Here is the updated patch.  Joseph, is this OK?
>> >
>> > I have no objections to this patch.
>>
>> Thinking some more, it looks to me that we also have to return 2 when
>> SSE2 (SSE doubles) is not enabled.
>>
>> I'm testing following patch:
>
> That looks weird.  If TARGET_80387 and !TARGET_SSE_MATH, then no matter
> whether sse2 is enabled or not, normal floating point operations will be
> performed in 387 stack and thus FLT_EVAL_METHOD should be 2, not 0.
> Do you want to do this because some instructions might be vectorized and
> therefore end up in sse registers?  For -std=c99 that shouldn't happen,
> already the C FE would promote all the arithmetics to be done in long
> doubles, and for -std=gnu99 it is acceptable if non-vectorized computations
> honor FLT_EVAL_METHOD and vectorized ones don't.

Eh, today is just not the day for science.

Hopefully, the logic in the patch below is correct:


Uros.

Comments

Uros Bizjak Jan. 12, 2016, 12:32 p.m. UTC | #1
On Tue, Jan 12, 2016 at 1:12 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Jan 12, 2016 at 12:18 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Jan 12, 2016 at 12:10:20PM +0100, Uros Bizjak wrote:
>>> On Tue, Jan 12, 2016 at 1:15 AM, Joseph Myers <joseph@codesourcery.com> wrote:
>>> > On Mon, 11 Jan 2016, H.J. Lu wrote:
>>> >
>>> >> Here is the updated patch.  Joseph, is this OK?
>>> >
>>> > I have no objections to this patch.
>>>
>>> Thinking some more, it looks to me that we also have to return 2 when
>>> SSE2 (SSE doubles) is not enabled.
>>>
>>> I'm testing following patch:
>>
>> That looks weird.  If TARGET_80387 and !TARGET_SSE_MATH, then no matter
>> whether sse2 is enabled or not, normal floating point operations will be
>> performed in 387 stack and thus FLT_EVAL_METHOD should be 2, not 0.
>> Do you want to do this because some instructions might be vectorized and
>> therefore end up in sse registers?  For -std=c99 that shouldn't happen,
>> already the C FE would promote all the arithmetics to be done in long
>> doubles, and for -std=gnu99 it is acceptable if non-vectorized computations
>> honor FLT_EVAL_METHOD and vectorized ones don't.
>
> Eh, today is just not the day for science.
>
> Hopefully, the logic in the patch below is correct:
>
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 6c63871..5b42e89 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -693,8 +693,9 @@ extern const char *host_detect_local_cpu (int
> argc, const char **argv);
>     only SSE, rounding is correct; when using both SSE and the FPU,
>     the rounding precision is indeterminate, since either may be chosen
>     apparently at random.  */
> -#define TARGET_FLT_EVAL_METHOD \
> -  (TARGET_MIX_SSE_I387 ? -1 : (TARGET_80387 && !TARGET_SSE_MATH) ? 2 : 0)
> +#define TARGET_FLT_EVAL_METHOD                                         \
> +  (TARGET_MIX_SSE_I387 ? -1                                            \
> +   : TARGET_80387 && !(TARGET_SSE2 && TARGET_SSE_MATH) ? 2 : 0)
>
>  /* Whether to allow x87 floating-point arithmetic on MODE (one of
>     SFmode, DFmode and XFmode) in the current excess precision

Using this patch, SSE math won't be emitted for a simple testcase
compiled with "-O2 -msse -m32 -std=c99 -mfpmath=sse":

float test (float a, float b)
{
  return a + b;
}

since we start with:

test (float a, float b)
{
  long double _2;
  long double _4;
  long double _5;
  float _6;

  <bb 2>:
  _2 = (long double) a_1(D);
  _4 = (long double) b_3(D);
  _5 = _2 + _4;
  _6 = (float) _5;
  return _6;
}

This is counter-intuitive, so I'd say we leave things as they are. The
situation where only floats are evaluated as floats and doubles are
evaluated as long doubles is not covered in the FLT_EVAL_METHOD spec.

Uros.
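For readers unfamiliar with the dump above: with FLT_EVAL_METHOD equal to 2
and -std=c99, the front end widens both operands to long double, adds them,
and narrows the result back to float. The following is a minimal hand-written
sketch that mirrors the dump in plain C. The names add_plain and add_promoted
are hypothetical and are not part of GCC.

```c
/* Hand-written illustration only; the function names are hypothetical. */

/* What the testcase says: the evaluation format is left to the compiler. */
float add_plain (float a, float b)
{
  return a + b;
}

/* What the dump shows: explicit promotion to long double. */
float add_promoted (float a, float b)
{
  long double wide_a = (long double) a;   /* _2 = (long double) a_1(D); */
  long double wide_b = (long double) b;   /* _4 = (long double) b_3(D); */
  long double sum = wide_a + wide_b;      /* _5 = _2 + _4; */
  return (float) sum;                     /* _6 = (float) _5; */
}
```

Both functions return 3.75f for the inputs 1.5f and 2.25f, since all three
values are exactly representable. The point of the thread is not this result
but that the promoted form requires long double arithmetic, which on this
target means the 387 rather than SSE.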
Jakub Jelinek Jan. 12, 2016, 12:43 p.m. UTC | #2
On Tue, Jan 12, 2016 at 01:32:05PM +0100, Uros Bizjak wrote:
> Using this patch, SSE math won't be emitted for a simple testcase
> using " -O2 -msse -m32 -std=c99 -mfpmath=sse" compile flags:
> 
> float test (float a, float b)
> {
>   return a + b;
> }
> 
> since we start with:
> 
> test (float a, float b)
> {
>   long double _2;
>   long double _4;
>   long double _5;
>   float _6;
> 
>   <bb 2>:
>   _2 = (long double) a_1(D);
>   _4 = (long double) b_3(D);
>   _5 = _2 + _4;
>   _6 = (float) _5;
>   return _6;
> }
> 
> This is counter-intuitive, so I'd say we leave things as they are. The
> situation where only floats are evaluated as floats and doubles are
> evaluated as long doubles is not covered in the FLT_EVAL_METHOD spec.

Well, for the -fexcess-precision=standard case (== -std=c99) FLT_EVAL_METHOD
2 doesn't hurt, that forces in the FE long double computation.  While if it
is 0 with -msse -mfpmath=sse, it means that the FE leaves computations as is
and they are computed in float precision for floats and in long double
precision for doubles.  For -fexcess-precision=fast it is different, because
the FE doesn't do anything, so in the end it is mixed in that case.
So, for -msse -mfpmath=sse, I think either we need FLT_EVAL_METHOD 2 or -1
or 2 for -fexcess-precision=standard and -1 for -fexcess-precision=fast.

	Jakub
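As background for the values discussed here, user code sees the chosen method
through FLT_EVAL_METHOD in <float.h>, and <math.h> provides float_t and
double_t as the corresponding evaluation types. A small sketch, assuming a C99
or later toolchain; the helper names describe_eval_method and
eval_types_consistent are hypothetical:

```c
#include <float.h>   /* FLT_EVAL_METHOD */
#include <math.h>    /* float_t, double_t */

/* Hypothetical helper: describe a FLT_EVAL_METHOD value as C99 defines it. */
const char *describe_eval_method (int method)
{
  switch (method)
    {
    case 0:  return "each operation evaluated in the precision of its type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    case -1: return "indeterminable";
    default: return "implementation-defined";
    }
}

/* Hypothetical helper: returns 1 if float_t and double_t have the sizes
   that C99 prescribes for methods 0, 1 and 2; other values are not checked. */
int eval_types_consistent (void)
{
  int method = FLT_EVAL_METHOD;

  if (method == 0)
    return sizeof (float_t) == sizeof (float)
           && sizeof (double_t) == sizeof (double);
  if (method == 1)
    return sizeof (float_t) == sizeof (double)
           && sizeof (double_t) == sizeof (double);
  if (method == 2)
    return sizeof (float_t) == sizeof (long double)
           && sizeof (double_t) == sizeof (long double);
  return 1;
}
```

Calling describe_eval_method (FLT_EVAL_METHOD) in a test program built with
different -mfpmath and -msse settings is one way to observe the value the
compiler reports for each configuration.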
Uros Bizjak Jan. 12, 2016, 1:10 p.m. UTC | #3
On Tue, Jan 12, 2016 at 1:43 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jan 12, 2016 at 01:32:05PM +0100, Uros Bizjak wrote:
>> Using this patch, SSE math won't be emitted for a simple testcase
>> using " -O2 -msse -m32 -std=c99 -mfpmath=sse" compile flags:
>>
>> float test (float a, float b)
>> {
>>   return a + b;
>> }
>>
>> since we start with:
>>
>> test (float a, float b)
>> {
>>   long double _2;
>>   long double _4;
>>   long double _5;
>>   float _6;
>>
>>   <bb 2>:
>>   _2 = (long double) a_1(D);
>>   _4 = (long double) b_3(D);
>>   _5 = _2 + _4;
>>   _6 = (float) _5;
>>   return _6;
>> }
>>
>> This is counter-intuitive, so I'd say we leave things as they are. The
>> situation where only floats are evaluated as floats and doubles are
>> evaluated as long doubles is not covered in the FLT_EVAL_METHOD spec.
>
> Well, for the -fexcess-precision=standard case (== -std=c99) FLT_EVAL_METHOD
> 2 doesn't hurt, that forces in the FE long double computation.  While if it
> is 0 with -msse -mfpmath=sse, it means that the FE leaves computations as is
> and they are computed in float precision for floats and in long double
> precision for doubles.  For -fexcess-precision=fast it is different, because
> the FE doesn't do anything, so in the end it is mixed in that case.
> So, for -msse -mfpmath=sse, I think either we need FLT_EVAL_METHOD 2 or -1
> or 2 for -fexcess-precision=standard and -1 for -fexcess-precision=fast.

I think that the following definition describes the -msse -mfpmath=sse
situation in the most elegant way. We can simply declare that the
precision is not known in this case:

#define TARGET_FLT_EVAL_METHOD                        \
  (TARGET_MIX_SSE_I387 ? -1                        \
   : (TARGET_80387 && !TARGET_SSE_MATH) ? 2 : TARGET_SSE2 ? 0 : -1)

Using this patch, the compiler will still generate SSE instructions
for the above test.

Joseph, what is your opinion on this approach?

Uros.
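The three definitions that appear in this thread can be compared side by side
by modelling them as ordinary functions. This is only a sketch: the boolean
parameters are hypothetical stand-ins for TARGET_MIX_SSE_I387, TARGET_80387,
TARGET_SSE_MATH and TARGET_SSE2, and the function names are not GCC
identifiers.

```c
#include <stdbool.h>

/* Parameters model the target macros; they are assumptions made for
   illustration and carry none of GCC's option handling.  */

/* Definition currently in i386.h (the '-' lines of the patch). */
int eval_method_current (bool mix, bool x87, bool sse_math, bool sse2)
{
  (void) sse2;  /* not consulted by this variant */
  return mix ? -1 : (x87 && !sse_math) ? 2 : 0;
}

/* First attempt (the '+' lines of the patch): 2 unless SSE2 math is used. */
int eval_method_first_patch (bool mix, bool x87, bool sse_math, bool sse2)
{
  return mix ? -1 : (x87 && !(sse2 && sse_math)) ? 2 : 0;
}

/* Definition proposed in this message: -1 for SSE math without SSE2. */
int eval_method_proposed (bool mix, bool x87, bool sse_math, bool sse2)
{
  return mix ? -1 : (x87 && !sse_math) ? 2 : sse2 ? 0 : -1;
}
```

For the -msse -mfpmath=sse case without SSE2 (mix false, x87 true, sse_math
true, sse2 false), the three variants give 0, 2 and -1 respectively. They
agree on 2 for pure 387 math and on 0 for SSE2 math.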
Joseph Myers Jan. 12, 2016, 6:18 p.m. UTC | #4
On Tue, 12 Jan 2016, Uros Bizjak wrote:

> I think that following definition describes -msse -mfpmath=sse
> situation in the most elegant way. We can just declare that the
> precision is not known in this case:
> 
> #define TARGET_FLT_EVAL_METHOD                        \
>   (TARGET_MIX_SSE_I387 ? -1                        \
>    : (TARGET_80387 && !TARGET_SSE_MATH) ? 2 : TARGET_SSE2 ? 0 : -1)
> 
> Using this patch, the compiler will still generate SSE instructions
> for the above test.
> 
> Joseph, what is your opinion on this approach?

I think this is reasonable.

Patch

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 6c63871..5b42e89 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -693,8 +693,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
    only SSE, rounding is correct; when using both SSE and the FPU,
    the rounding precision is indeterminate, since either may be chosen
    apparently at random.  */
-#define TARGET_FLT_EVAL_METHOD \
-  (TARGET_MIX_SSE_I387 ? -1 : (TARGET_80387 && !TARGET_SSE_MATH) ? 2 : 0)
+#define TARGET_FLT_EVAL_METHOD                                         \
+  (TARGET_MIX_SSE_I387 ? -1                                            \
+   : TARGET_80387 && !(TARGET_SSE2 && TARGET_SSE_MATH) ? 2 : 0)

 /* Whether to allow x87 floating-point arithmetic on MODE (one of
    SFmode, DFmode and XFmode) in the current excess precision