Patchwork Fixing improper conversion from sin() to sinf() in optimization mode.

login
register
mail settings
Submitter Cong Hou
Date Sept. 6, 2013, 10:24 p.m.
Message ID <CAK=A3=2jYZBkyhHM=5__s6DCqKv1Af_PmOHZLANpj1OOXoUEPA@mail.gmail.com>
Download mbox | patch
Permalink /patch/273347/
State New
Headers show

Comments

Cong Hou - Sept. 6, 2013, 10:24 p.m.
First, thank you for your detailed comments again! Then I deeply
apologize for not explaining my patch properly and responding to your
previous comment. I didn't understand thoroughly the problem before
submitting the patch.

Previously I only considered the following three conversions for sqrt():


1: (float) sqrt ((double) float_val)  ->  sqrtf (float_val)
2: (float) sqrtl ((long double) float_val)  ->  sqrtf (float_val)
3: (double) sqrtl ((long double) double_val)  ->  sqrt (double_val)


We have four types here:

TYPE is the type to which the result of the function call is converted.
ITYPE is the type of the math call expression.
TREE_TYPE(arg0) is the type of the function argument (before type conversion).
NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision.
It will be the type of the new math call expression after conversion.

For all three cases above, TYPE is always the same as NEWTYPE. That is
why I only considered TYPE during the precision comparison. ITYPE can
only be double_type_node or long_double_type_node depending on the
type of the math function. That is why I explicitly used those two
types instead of ITYPE (no correctness issue). But you are right,
ITYPE is more elegant and better here.

After further analysis, I found I missed two more cases. Note that we
have the following conditions according to the code in convert.c:

TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE)
TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0))
TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE)

the last condition comes from the fact that we only consider
converting a math function call into another one with narrower type.
Therefore we have

TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE)
TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE)

So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for
sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with
four possible combinations. Therefore we have two more conversions to
consider besides the three ones I mentioned above:


4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)


For the first conversion here, TYPE (float) is different from NEWTYPE
(double), and my previous patch doesn't handle this case.The correct
way is to compare precisions of ITYPE and NEWTYPE now.

To sum up, we are converting the expression

(TYPE) sqrtITYPE ((ITYPE) expr)

to

(TYPE) sqrtNEWTYPE ((NEWTYPE) expr)

and we require

PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2

to make it a safe conversion.


The new patch is pasted below.

I appreciate your detailed comments and analysis, and next time when I
submit a patch I will be more carefully about the reviewer's comment.


Thank you!

Cong




On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:
> On Wed, 4 Sep 2013, Xinliang David Li wrote:
>
>> > This patch submission still fails to pay attention to various of my
>> > comments.
>> >
>>
>> If you can provide inlined comments in the patch, that will be more
>> useful and productive.
>
> I have explained things several times in this thread already.  I see no
> point in repeating things when what I would say has already been said and
> ignored.  As far as I can tell, this latest patch submission has taken one
> line from the message it is in response to, and largely ignored the
> following two paragraphs (including where I explicitly say that said line
> should not appear literally in the source code at all).  But, repeating
> what I said before yet again:
>
>   (but you should be referring to the relevant types
>
> The patch does not do properly that.  It refers explicitly to
> long_double_type_node and double_type_node.
>
>   - "type", the type
>   being converted to, "itype", the type of the function being called in the
>   source code, "TREE_TYPE (arg0)", the type of the argument after extensions
>   have been removed, and "newtype", computed from those
>
> The patch only engages with "type".  I suspect "newtype" is what it really
> means there when using "type".  When it uses long_double_type_node and
> double_type_node, those should be "itype".
>
>   - so you should have
>   expressions like the above with two or more of those four types, but not
>   with long_double_type_node directly).
>
> See above.  The patch uses long_double_type_node directly, contrary to
> what I explicitly said.  You are free to disagree with something I say in
> a review - but in that case you need to reply specifically to the review
> comment explaining your rationale for disagreeing with it.  Just ignoring
> the comment and not mentioning the disagreement wastes the time of
> reviewers.
>
>   The patch submission will need to include a proper analysis to justify to
>   the reader why the particular inequality with particular types from those
>   four is correct in all cases where the relevant code may be executed.
>
> The comments only refer to "T1" and "T2" without explaining how they
> relate to the original expression (three types) or the variables within
> GCC dealing with various types (four types, because newtype gets
> involved).  I said back in
> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and
> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types
> are involved - when I say "the patch submission needs to include its own
> analysis for the full generality of three types", again, it's
> inappropriate for a patch to omit such an analysis without justification.
> The patch submission needs to include an analysis that properly explains
> the transformation involved and the conditions under which it is safe.
> Maybe starting along the lines of:
>
> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3
> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point
> square root function being used (ITYPE), T1 is TYPE and all these types
> are binary floating-point types.  We wish to optimize if possible to an
> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is
> narrower than T2.  (Then explain the choice of T4 and the conditions under
> which the transformation is safe, with appropriate references.)
>
> I suggest that for the next patch submission you (the patch submitter)
> make sure you spend at least an hour properly understanding the issues and
> all the previous messages in the thread and writing up the detailed,
> coherent explanation of the transformation and analysis of the issues that
> I asked for some time ago, as if for a reader who hasn't read any of this
> thread or looked at this transformation in GCC before.  I've certainly
> spent longer than that on review in this thread.  It's normal for a patch
> involving anything at all tricky to take hours to write up (and also
> normal for one to discover, in the course of writing the detailed coherent
> analysis for people who aren't familiar with the code and the issues
> involved, that there's in fact something wrong with the patch and it needs
> revisiting before submission).
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Xinliang David Li - Sept. 10, 2013, 1:26 a.m.
On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <congh@google.com> wrote:
> First, thank you for your detailed comments again! Then I deeply
> apologize for not explaining my patch properly and responding to your
> previous comment. I didn't understand thoroughly the problem before
> submitting the patch.
>
> Previously I only considered the following three conversions for sqrt():
>
>
> 1: (float) sqrt ((double) float_val)  ->  sqrtf (float_val)
> 2: (float) sqrtl ((long double) float_val)  ->  sqrtf (float_val)
> 3: (double) sqrtl ((long double) double_val)  ->  sqrt (double_val)
>
>
> We have four types here:
>
> TYPE is the type to which the result of the function call is converted.
> ITYPE is the type of the math call expression.
> TREE_TYPE(arg0) is the type of the function argument (before type conversion).
> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision.
> It will be the type of the new math call expression after conversion.
>
> For all three cases above, TYPE is always the same as NEWTYPE. That is
> why I only considered TYPE during the precision comparison. ITYPE can
> only be double_type_node or long_double_type_node depending on the
> type of the math function. That is why I explicitly used those two
> types instead of ITYPE (no correctness issue). But you are right,
> ITYPE is more elegant and better here.
>
> After further analysis, I found I missed two more cases. Note that we
> have the following conditions according to the code in convert.c:
>
> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE)
> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0))
> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE)
>
> the last condition comes from the fact that we only consider
> converting a math function call into another one with narrower type.
> Therefore we have
>
> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE)
> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE)
>
> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for
> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with
> four possible combinations. Therefore we have two more conversions to
> consider besides the three ones I mentioned above:
>
>
> 4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
> 5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)
>
>
> For the first conversion here, TYPE (float) is different from NEWTYPE
> (double), and my previous patch doesn't handle this case.The correct
> way is to compare precisions of ITYPE and NEWTYPE now.
>
> To sum up, we are converting the expression
>
> (TYPE) sqrtITYPE ((ITYPE) expr)
>
> to
>
> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr)
>
> and we require
>
> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2
>
> to make it a safe conversion.
>
>
> The new patch is pasted below.
>
> I appreciate your detailed comments and analysis, and next time when I
> submit a patch I will be more carefully about the reviewer's comment.
>
>
> Thank you!
>
> Cong
>
>
>
> Index: gcc/convert.c
> ===================================================================
> --- gcc/convert.c (revision 201891)
> +++ gcc/convert.c (working copy)
> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr)
>    CASE_MATHFN (COS)
>    CASE_MATHFN (ERF)
>    CASE_MATHFN (ERFC)
> -  CASE_MATHFN (FABS)
>    CASE_MATHFN (LOG)
>    CASE_MATHFN (LOG10)
>    CASE_MATHFN (LOG2)
>    CASE_MATHFN (LOG1P)
> -  CASE_MATHFN (LOGB)
>    CASE_MATHFN (SIN)
> -  CASE_MATHFN (SQRT)
>    CASE_MATHFN (TAN)
>    CASE_MATHFN (TANH)
> +    /* The above functions are not safe to do this conversion. */
> +    if (!flag_unsafe_math_optimizations)
> +      break;
> +  CASE_MATHFN (SQRT)
> +  CASE_MATHFN (FABS)
> +  CASE_MATHFN (LOGB)
>  #undef CASE_MATHFN
>      {
>        tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0));
> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr)
>        if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>   newtype = TREE_TYPE (arg0);
>
> +      /* We consider to convert
> +
> +     (T1) sqrtT2 ((T2) exprT3)
> + to
> +     (T1) sqrtT4 ((T4) exprT3)

Should this be

          (T4) sqrtT4 ((T4) exprT3)
> +
> +  , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0),
> + and T4 is NEWTYPE.

NEWTYPE is also the wider one between T1 and T3.

David

> All those types are of floating point types.
> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion
> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of
> + T2 and T4. See the following URL for a reference:
> + http://stackoverflow.com/questions/9235456/determining-floating-point-square-root
> + */
> +      if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL)
> + {
> +  int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p;
> +  int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p;
> +  if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations)
> +    break;
> + }
> +
>        /* Be careful about integer to fp conversions.
>   These may overflow still.  */
>        if (FLOAT_TYPE_P (TREE_TYPE (arg0))
> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c
> ===================================================================
> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891)
> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy)
> @@ -44,11 +44,11 @@ __attribute__ ((noinline))
>  double
>  sin(double a)
>  {
> - abort ();
> + return a;
>  }
>  __attribute__ ((noinline))
>  float
>  sinf(float a)
>  {
> - return a;
> + abort ();
>  }
>
> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:
>> On Wed, 4 Sep 2013, Xinliang David Li wrote:
>>
>>> > This patch submission still fails to pay attention to various of my
>>> > comments.
>>> >
>>>
>>> If you can provide inlined comments in the patch, that will be more
>>> useful and productive.
>>
>> I have explained things several times in this thread already.  I see no
>> point in repeating things when what I would say has already been said and
>> ignored.  As far as I can tell, this latest patch submission has taken one
>> line from the message it is in response to, and largely ignored the
>> following two paragraphs (including where I explicitly say that said line
>> should not appear literally in the source code at all).  But, repeating
>> what I said before yet again:
>>
>>   (but you should be referring to the relevant types
>>
>> The patch does not do properly that.  It refers explicitly to
>> long_double_type_node and double_type_node.
>>
>>   - "type", the type
>>   being converted to, "itype", the type of the function being called in the
>>   source code, "TREE_TYPE (arg0)", the type of the argument after extensions
>>   have been removed, and "newtype", computed from those
>>
>> The patch only engages with "type".  I suspect "newtype" is what it really
>> means there when using "type".  When it uses long_double_type_node and
>> double_type_node, those should be "itype".
>>
>>   - so you should have
>>   expressions like the above with two or more of those four types, but not
>>   with long_double_type_node directly).
>>
>> See above.  The patch uses long_double_type_node directly, contrary to
>> what I explicitly said.  You are free to disagree with something I say in
>> a review - but in that case you need to reply specifically to the review
>> comment explaining your rationale for disagreeing with it.  Just ignoring
>> the comment and not mentioning the disagreement wastes the time of
>> reviewers.
>>
>>   The patch submission will need to include a proper analysis to justify to
>>   the reader why the particular inequality with particular types from those
>>   four is correct in all cases where the relevant code may be executed.
>>
>> The comments only refer to "T1" and "T2" without explaining how they
>> relate to the original expression (three types) or the variables within
>> GCC dealing with various types (four types, because newtype gets
>> involved).  I said back in
>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and
>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types
>> are involved - when I say "the patch submission needs to include its own
>> analysis for the full generality of three types", again, it's
>> inappropriate for a patch to omit such an analysis without justification.
>> The patch submission needs to include an analysis that properly explains
>> the transformation involved and the conditions under which it is safe.
>> Maybe starting along the lines of:
>>
>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3
>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point
>> square root function being used (ITYPE), T1 is TYPE and all these types
>> are binary floating-point types.  We wish to optimize if possible to an
>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is
>> narrower than T2.  (Then explain the choice of T4 and the conditions under
>> which the transformation is safe, with appropriate references.)
>>
>> I suggest that for the next patch submission you (the patch submitter)
>> make sure you spend at least an hour properly understanding the issues and
>> all the previous messages in the thread and writing up the detailed,
>> coherent explanation of the transformation and analysis of the issues that
>> I asked for some time ago, as if for a reader who hasn't read any of this
>> thread or looked at this transformation in GCC before.  I've certainly
>> spent longer than that on review in this thread.  It's normal for a patch
>> involving anything at all tricky to take hours to write up (and also
>> normal for one to discover, in the course of writing the detailed coherent
>> analysis for people who aren't familiar with the code and the issues
>> involved, that there's in fact something wrong with the patch and it needs
>> revisiting before submission).
>>
>> --
>> Joseph S. Myers
>> joseph@codesourcery.com
Cong Hou - Sept. 10, 2013, 2:28 a.m.
On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <congh@google.com> wrote:
>> First, thank you for your detailed comments again! Then I deeply
>> apologize for not explaining my patch properly and responding to your
>> previous comment. I didn't understand thoroughly the problem before
>> submitting the patch.
>>
>> Previously I only considered the following three conversions for sqrt():
>>
>>
>> 1: (float) sqrt ((double) float_val)  ->  sqrtf (float_val)
>> 2: (float) sqrtl ((long double) float_val)  ->  sqrtf (float_val)
>> 3: (double) sqrtl ((long double) double_val)  ->  sqrt (double_val)
>>
>>
>> We have four types here:
>>
>> TYPE is the type to which the result of the function call is converted.
>> ITYPE is the type of the math call expression.
>> TREE_TYPE(arg0) is the type of the function argument (before type conversion).
>> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision.
>> It will be the type of the new math call expression after conversion.
>>
>> For all three cases above, TYPE is always the same as NEWTYPE. That is
>> why I only considered TYPE during the precision comparison. ITYPE can
>> only be double_type_node or long_double_type_node depending on the
>> type of the math function. That is why I explicitly used those two
>> types instead of ITYPE (no correctness issue). But you are right,
>> ITYPE is more elegant and better here.
>>
>> After further analysis, I found I missed two more cases. Note that we
>> have the following conditions according to the code in convert.c:
>>
>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE)
>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0))
>> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE)
>>
>> the last condition comes from the fact that we only consider
>> converting a math function call into another one with narrower type.
>> Therefore we have
>>
>> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE)
>> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE)
>>
>> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for
>> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with
>> four possible combinations. Therefore we have two more conversions to
>> consider besides the three ones I mentioned above:
>>
>>
>> 4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>> 5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)
>>
>>
>> For the first conversion here, TYPE (float) is different from NEWTYPE
>> (double), and my previous patch doesn't handle this case.The correct
>> way is to compare precisions of ITYPE and NEWTYPE now.
>>
>> To sum up, we are converting the expression
>>
>> (TYPE) sqrtITYPE ((ITYPE) expr)
>>
>> to
>>
>> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr)
>>
>> and we require
>>
>> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2
>>
>> to make it a safe conversion.
>>
>>
>> The new patch is pasted below.
>>
>> I appreciate your detailed comments and analysis, and next time when I
>> submit a patch I will be more carefully about the reviewer's comment.
>>
>>
>> Thank you!
>>
>> Cong
>>
>>
>>
>> Index: gcc/convert.c
>> ===================================================================
>> --- gcc/convert.c (revision 201891)
>> +++ gcc/convert.c (working copy)
>> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr)
>>    CASE_MATHFN (COS)
>>    CASE_MATHFN (ERF)
>>    CASE_MATHFN (ERFC)
>> -  CASE_MATHFN (FABS)
>>    CASE_MATHFN (LOG)
>>    CASE_MATHFN (LOG10)
>>    CASE_MATHFN (LOG2)
>>    CASE_MATHFN (LOG1P)
>> -  CASE_MATHFN (LOGB)
>>    CASE_MATHFN (SIN)
>> -  CASE_MATHFN (SQRT)
>>    CASE_MATHFN (TAN)
>>    CASE_MATHFN (TANH)
>> +    /* The above functions are not safe to do this conversion. */
>> +    if (!flag_unsafe_math_optimizations)
>> +      break;
>> +  CASE_MATHFN (SQRT)
>> +  CASE_MATHFN (FABS)
>> +  CASE_MATHFN (LOGB)
>>  #undef CASE_MATHFN
>>      {
>>        tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0));
>> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr)
>>        if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>>   newtype = TREE_TYPE (arg0);
>>
>> +      /* We consider to convert
>> +
>> +     (T1) sqrtT2 ((T2) exprT3)
>> + to
>> +     (T1) sqrtT4 ((T4) exprT3)
>
> Should this be
>
>           (T4) sqrtT4 ((T4) exprT3)

T4 is not necessarily the same as T1. For the conversion:

 (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)

T4 is double and T1 is float.


>> +
>> +  , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0),
>> + and T4 is NEWTYPE.
>
> NEWTYPE is also the wider one between T1 and T3.

Right. Actually this is easy to catch from the context just before
this comment.

tree newtype = type;
if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
    newtype = TREE_TYPE (arg0);



thanks,
Cong


>
> David
>
>> All those types are of floating point types.
>> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion
>> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of
>> + T2 and T4. See the following URL for a reference:
>> + http://stackoverflow.com/questions/9235456/determining-floating-point-square-root
>> + */
>> +      if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL)
>> + {
>> +  int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p;
>> +  int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p;
>> +  if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations)
>> +    break;
>> + }
>> +
>>        /* Be careful about integer to fp conversions.
>>   These may overflow still.  */
>>        if (FLOAT_TYPE_P (TREE_TYPE (arg0))
>> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c
>> ===================================================================
>> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891)
>> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy)
>> @@ -44,11 +44,11 @@ __attribute__ ((noinline))
>>  double
>>  sin(double a)
>>  {
>> - abort ();
>> + return a;
>>  }
>>  __attribute__ ((noinline))
>>  float
>>  sinf(float a)
>>  {
>> - return a;
>> + abort ();
>>  }
>>
>> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:
>>> On Wed, 4 Sep 2013, Xinliang David Li wrote:
>>>
>>>> > This patch submission still fails to pay attention to various of my
>>>> > comments.
>>>> >
>>>>
>>>> If you can provide inlined comments in the patch, that will be more
>>>> useful and productive.
>>>
>>> I have explained things several times in this thread already.  I see no
>>> point in repeating things when what I would say has already been said and
>>> ignored.  As far as I can tell, this latest patch submission has taken one
>>> line from the message it is in response to, and largely ignored the
>>> following two paragraphs (including where I explicitly say that said line
>>> should not appear literally in the source code at all).  But, repeating
>>> what I said before yet again:
>>>
>>>   (but you should be referring to the relevant types
>>>
>>> The patch does not do properly that.  It refers explicitly to
>>> long_double_type_node and double_type_node.
>>>
>>>   - "type", the type
>>>   being converted to, "itype", the type of the function being called in the
>>>   source code, "TREE_TYPE (arg0)", the type of the argument after extensions
>>>   have been removed, and "newtype", computed from those
>>>
>>> The patch only engages with "type".  I suspect "newtype" is what it really
>>> means there when using "type".  When it uses long_double_type_node and
>>> double_type_node, those should be "itype".
>>>
>>>   - so you should have
>>>   expressions like the above with two or more of those four types, but not
>>>   with long_double_type_node directly).
>>>
>>> See above.  The patch uses long_double_type_node directly, contrary to
>>> what I explicitly said.  You are free to disagree with something I say in
>>> a review - but in that case you need to reply specifically to the review
>>> comment explaining your rationale for disagreeing with it.  Just ignoring
>>> the comment and not mentioning the disagreement wastes the time of
>>> reviewers.
>>>
>>>   The patch submission will need to include a proper analysis to justify to
>>>   the reader why the particular inequality with particular types from those
>>>   four is correct in all cases where the relevant code may be executed.
>>>
>>> The comments only refer to "T1" and "T2" without explaining how they
>>> relate to the original expression (three types) or the variables within
>>> GCC dealing with various types (four types, because newtype gets
>>> involved).  I said back in
>>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and
>>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types
>>> are involved - when I say "the patch submission needs to include its own
>>> analysis for the full generality of three types", again, it's
>>> inappropriate for a patch to omit such an analysis without justification.
>>> The patch submission needs to include an analysis that properly explains
>>> the transformation involved and the conditions under which it is safe.
>>> Maybe starting along the lines of:
>>>
>>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3
>>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point
>>> square root function being used (ITYPE), T1 is TYPE and all these types
>>> are binary floating-point types.  We wish to optimize if possible to an
>>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is
>>> narrower than T2.  (Then explain the choice of T4 and the conditions under
>>> which the transformation is safe, with appropriate references.)
>>>
>>> I suggest that for the next patch submission you (the patch submitter)
>>> make sure you spend at least an hour properly understanding the issues and
>>> all the previous messages in the thread and writing up the detailed,
>>> coherent explanation of the transformation and analysis of the issues that
>>> I asked for some time ago, as if for a reader who hasn't read any of this
>>> thread or looked at this transformation in GCC before.  I've certainly
>>> spent longer than that on review in this thread.  It's normal for a patch
>>> involving anything at all tricky to take hours to write up (and also
>>> normal for one to discover, in the course of writing the detailed coherent
>>> analysis for people who aren't familiar with the code and the issues
>>> involved, that there's in fact something wrong with the patch and it needs
>>> revisiting before submission).
>>>
>>> --
>>> Joseph S. Myers
>>> joseph@codesourcery.com
Cong Hou - Sept. 20, 2013, 4:49 p.m.
Any comment or more suggestions on this patch?


thanks,
Cong

On Mon, Sep 9, 2013 at 7:28 PM, Cong Hou <congh@google.com> wrote:
> On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <congh@google.com> wrote:
>>> First, thank you for your detailed comments again! Then I deeply
>>> apologize for not explaining my patch properly and responding to your
>>> previous comment. I didn't understand thoroughly the problem before
>>> submitting the patch.
>>>
>>> Previously I only considered the following three conversions for sqrt():
>>>
>>>
>>> 1: (float) sqrt ((double) float_val)  ->  sqrtf (float_val)
>>> 2: (float) sqrtl ((long double) float_val)  ->  sqrtf (float_val)
>>> 3: (double) sqrtl ((long double) double_val)  ->  sqrt (double_val)
>>>
>>>
>>> We have four types here:
>>>
>>> TYPE is the type to which the result of the function call is converted.
>>> ITYPE is the type of the math call expression.
>>> TREE_TYPE(arg0) is the type of the function argument (before type conversion).
>>> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision.
>>> It will be the type of the new math call expression after conversion.
>>>
>>> For all three cases above, TYPE is always the same as NEWTYPE. That is
>>> why I only considered TYPE during the precision comparison. ITYPE can
>>> only be double_type_node or long_double_type_node depending on the
>>> type of the math function. That is why I explicitly used those two
>>> types instead of ITYPE (no correctness issue). But you are right,
>>> ITYPE is more elegant and better here.
>>>
>>> After further analysis, I found I missed two more cases. Note that we
>>> have the following conditions according to the code in convert.c:
>>>
>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE)
>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0))
>>> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE)
>>>
>>> the last condition comes from the fact that we only consider
>>> converting a math function call into another one with narrower type.
>>> Therefore we have
>>>
>>> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE)
>>> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE)
>>>
>>> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for
>>> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with
>>> four possible combinations. Therefore we have two more conversions to
>>> consider besides the three ones I mentioned above:
>>>
>>>
>>> 4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>>> 5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)
>>>
>>>
>>> For the first conversion here, TYPE (float) is different from NEWTYPE
>>> (double), and my previous patch doesn't handle this case.The correct
>>> way is to compare precisions of ITYPE and NEWTYPE now.
>>>
>>> To sum up, we are converting the expression
>>>
>>> (TYPE) sqrtITYPE ((ITYPE) expr)
>>>
>>> to
>>>
>>> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr)
>>>
>>> and we require
>>>
>>> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2
>>>
>>> to make it a safe conversion.
>>>
>>>
>>> The new patch is pasted below.
>>>
>>> I appreciate your detailed comments and analysis, and next time when I
>>> submit a patch I will be more carefully about the reviewer's comment.
>>>
>>>
>>> Thank you!
>>>
>>> Cong
>>>
>>>
>>>
>>> Index: gcc/convert.c
>>> ===================================================================
>>> --- gcc/convert.c (revision 201891)
>>> +++ gcc/convert.c (working copy)
>>> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr)
>>>    CASE_MATHFN (COS)
>>>    CASE_MATHFN (ERF)
>>>    CASE_MATHFN (ERFC)
>>> -  CASE_MATHFN (FABS)
>>>    CASE_MATHFN (LOG)
>>>    CASE_MATHFN (LOG10)
>>>    CASE_MATHFN (LOG2)
>>>    CASE_MATHFN (LOG1P)
>>> -  CASE_MATHFN (LOGB)
>>>    CASE_MATHFN (SIN)
>>> -  CASE_MATHFN (SQRT)
>>>    CASE_MATHFN (TAN)
>>>    CASE_MATHFN (TANH)
>>> +    /* The above functions are not safe to do this conversion. */
>>> +    if (!flag_unsafe_math_optimizations)
>>> +      break;
>>> +  CASE_MATHFN (SQRT)
>>> +  CASE_MATHFN (FABS)
>>> +  CASE_MATHFN (LOGB)
>>>  #undef CASE_MATHFN
>>>      {
>>>        tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0));
>>> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr)
>>>        if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>>>   newtype = TREE_TYPE (arg0);
>>>
>>> +      /* We consider to convert
>>> +
>>> +     (T1) sqrtT2 ((T2) exprT3)
>>> + to
>>> +     (T1) sqrtT4 ((T4) exprT3)
>>
>> Should this be
>>
>>           (T4) sqrtT4 ((T4) exprT3)
>
> T4 is not necessarily the same as T1. For the conversion:
>
>  (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>
> T4 is double and T1 is float.
>
>
>>> +
>>> +  , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0),
>>> + and T4 is NEWTYPE.
>>
>> NEWTYPE is also the wider one between T1 and T3.
>
> Right. Actually this is easy to catch from the context just before
> this comment.
>
> tree newtype = type;
> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>     newtype = TREE_TYPE (arg0);
>
>
>
> thanks,
> Cong
>
>
>>
>> David
>>
>>> All those types are of floating point types.
>>> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion
>>> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of
>>> + T2 and T4. See the following URL for a reference:
>>> + http://stackoverflow.com/questions/9235456/determining-floating-point-square-root
>>> + */
>>> +      if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL)
>>> + {
>>> +  int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p;
>>> +  int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p;
>>> +  if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations)
>>> +    break;
>>> + }
>>> +
>>>        /* Be careful about integer to fp conversions.
>>>   These may overflow still.  */
>>>        if (FLOAT_TYPE_P (TREE_TYPE (arg0))
>>> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c
>>> ===================================================================
>>> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891)
>>> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy)
>>> @@ -44,11 +44,11 @@ __attribute__ ((noinline))
>>>  double
>>>  sin(double a)
>>>  {
>>> - abort ();
>>> + return a;
>>>  }
>>>  __attribute__ ((noinline))
>>>  float
>>>  sinf(float a)
>>>  {
>>> - return a;
>>> + abort ();
>>>  }
>>>
>>> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:
>>>> On Wed, 4 Sep 2013, Xinliang David Li wrote:
>>>>
>>>>> > This patch submission still fails to pay attention to various of my
>>>>> > comments.
>>>>> >
>>>>>
>>>>> If you can provide inlined comments in the patch, that will be more
>>>>> useful and productive.
>>>>
>>>> I have explained things several times in this thread already.  I see no
>>>> point in repeating things when what I would say has already been said and
>>>> ignored.  As far as I can tell, this latest patch submission has taken one
>>>> line from the message it is in response to, and largely ignored the
>>>> following two paragraphs (including where I explicitly say that said line
>>>> should not appear literally in the source code at all).  But, repeating
>>>> what I said before yet again:
>>>>
>>>>   (but you should be referring to the relevant types
>>>>
>>>> The patch does not do properly that.  It refers explicitly to
>>>> long_double_type_node and double_type_node.
>>>>
>>>>   - "type", the type
>>>>   being converted to, "itype", the type of the function being called in the
>>>>   source code, "TREE_TYPE (arg0)", the type of the argument after extensions
>>>>   have been removed, and "newtype", computed from those
>>>>
>>>> The patch only engages with "type".  I suspect "newtype" is what it really
>>>> means there when using "type".  When it uses long_double_type_node and
>>>> double_type_node, those should be "itype".
>>>>
>>>>   - so you should have
>>>>   expressions like the above with two or more of those four types, but not
>>>>   with long_double_type_node directly).
>>>>
>>>> See above.  The patch uses long_double_type_node directly, contrary to
>>>> what I explicitly said.  You are free to disagree with something I say in
>>>> a review - but in that case you need to reply specifically to the review
>>>> comment explaining your rationale for disagreeing with it.  Just ignoring
>>>> the comment and not mentioning the disagreement wastes the time of
>>>> reviewers.
>>>>
>>>>   The patch submission will need to include a proper analysis to justify to
>>>>   the reader why the particular inequality with particular types from those
>>>>   four is correct in all cases where the relevant code may be executed.
>>>>
>>>> The comments only refer to "T1" and "T2" without explaining how they
>>>> relate to the original expression (three types) or the variables within
>>>> GCC dealing with various types (four types, because newtype gets
>>>> involved).  I said back in
>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and
>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types
>>>> are involved - when I say "the patch submission needs to include its own
>>>> analysis for the full generality of three types", again, it's
>>>> inappropriate for a patch to omit such an analysis without justification.
>>>> The patch submission needs to include an analysis that properly explains
>>>> the transformation involved and the conditions under which it is safe.
>>>> Maybe starting along the lines of:
>>>>
>>>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3
>>>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point
>>>> square root function being used (ITYPE), T1 is TYPE and all these types
>>>> are binary floating-point types.  We wish to optimize if possible to an
>>>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is
>>>> narrower than T2.  (Then explain the choice of T4 and the conditions under
>>>> which the transformation is safe, with appropriate references.)
>>>>
>>>> I suggest that for the next patch submission you (the patch submitter)
>>>> make sure you spend at least an hour properly understanding the issues and
>>>> all the previous messages in the thread and writing up the detailed,
>>>> coherent explanation of the transformation and analysis of the issues that
>>>> I asked for some time ago, as if for a reader who hasn't read any of this
>>>> thread or looked at this transformation in GCC before.  I've certainly
>>>> spent longer than that on review in this thread.  It's normal for a patch
>>>> involving anything at all tricky to take hours to write up (and also
>>>> normal for one to discover, in the course of writing the detailed coherent
>>>> analysis for people who aren't familiar with the code and the issues
>>>> involved, that there's in fact something wrong with the patch and it needs
>>>> revisiting before submission).
>>>>
>>>> --
>>>> Joseph S. Myers
>>>> joseph@codesourcery.com
Cong Hou - Oct. 3, 2013, 11:19 p.m.
Ping...


thanks,
Cong


On Fri, Sep 20, 2013 at 9:49 AM, Cong Hou <congh@google.com> wrote:
> Any comment or more suggestions on this patch?
>
>
> thanks,
> Cong
>
> On Mon, Sep 9, 2013 at 7:28 PM, Cong Hou <congh@google.com> wrote:
>> On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <congh@google.com> wrote:
>>>> First, thank you for your detailed comments again! Then I deeply
>>>> apologize for not explaining my patch properly and responding to your
>>>> previous comment. I didn't understand thoroughly the problem before
>>>> submitting the patch.
>>>>
>>>> Previously I only considered the following three conversions for sqrt():
>>>>
>>>>
>>>> 1: (float) sqrt ((double) float_val)  ->  sqrtf (float_val)
>>>> 2: (float) sqrtl ((long double) float_val)  ->  sqrtf (float_val)
>>>> 3: (double) sqrtl ((long double) double_val)  ->  sqrt (double_val)
>>>>
>>>>
>>>> We have four types here:
>>>>
>>>> TYPE is the type to which the result of the function call is converted.
>>>> ITYPE is the type of the math call expression.
>>>> TREE_TYPE(arg0) is the type of the function argument (before type conversion).
>>>> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision.
>>>> It will be the type of the new math call expression after conversion.
>>>>
>>>> For all three cases above, TYPE is always the same as NEWTYPE. That is
>>>> why I only considered TYPE during the precision comparison. ITYPE can
>>>> only be double_type_node or long_double_type_node depending on the
>>>> type of the math function. That is why I explicitly used those two
>>>> types instead of ITYPE (no correctness issue). But you are right,
>>>> ITYPE is more elegant and better here.
>>>>
>>>> After further analysis, I found I missed two more cases. Note that we
>>>> have the following conditions according to the code in convert.c:
>>>>
>>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE)
>>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0))
>>>> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE)
>>>>
>>>> the last condition comes from the fact that we only consider
>>>> converting a math function call into another one with narrower type.
>>>> Therefore we have
>>>>
>>>> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE)
>>>> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE)
>>>>
>>>> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for
>>>> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with
>>>> four possible combinations. Therefore we have two more conversions to
>>>> consider besides the three ones I mentioned above:
>>>>
>>>>
>>>> 4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>>>> 5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)
>>>>
>>>>
>>>> For the first conversion here, TYPE (float) is different from NEWTYPE
>>>> (double), and my previous patch doesn't handle this case.The correct
>>>> way is to compare precisions of ITYPE and NEWTYPE now.
>>>>
>>>> To sum up, we are converting the expression
>>>>
>>>> (TYPE) sqrtITYPE ((ITYPE) expr)
>>>>
>>>> to
>>>>
>>>> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr)
>>>>
>>>> and we require
>>>>
>>>> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2
>>>>
>>>> to make it a safe conversion.
>>>>
>>>>
>>>> The new patch is pasted below.
>>>>
>>>> I appreciate your detailed comments and analysis, and next time when I
>>>> submit a patch I will be more carefully about the reviewer's comment.
>>>>
>>>>
>>>> Thank you!
>>>>
>>>> Cong
>>>>
>>>>
>>>>
>>>> Index: gcc/convert.c
>>>> ===================================================================
>>>> --- gcc/convert.c (revision 201891)
>>>> +++ gcc/convert.c (working copy)
>>>> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr)
>>>>    CASE_MATHFN (COS)
>>>>    CASE_MATHFN (ERF)
>>>>    CASE_MATHFN (ERFC)
>>>> -  CASE_MATHFN (FABS)
>>>>    CASE_MATHFN (LOG)
>>>>    CASE_MATHFN (LOG10)
>>>>    CASE_MATHFN (LOG2)
>>>>    CASE_MATHFN (LOG1P)
>>>> -  CASE_MATHFN (LOGB)
>>>>    CASE_MATHFN (SIN)
>>>> -  CASE_MATHFN (SQRT)
>>>>    CASE_MATHFN (TAN)
>>>>    CASE_MATHFN (TANH)
>>>> +    /* The above functions are not safe to do this conversion. */
>>>> +    if (!flag_unsafe_math_optimizations)
>>>> +      break;
>>>> +  CASE_MATHFN (SQRT)
>>>> +  CASE_MATHFN (FABS)
>>>> +  CASE_MATHFN (LOGB)
>>>>  #undef CASE_MATHFN
>>>>      {
>>>>        tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0));
>>>> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr)
>>>>        if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>>>>   newtype = TREE_TYPE (arg0);
>>>>
>>>> +      /* We consider to convert
>>>> +
>>>> +     (T1) sqrtT2 ((T2) exprT3)
>>>> + to
>>>> +     (T1) sqrtT4 ((T4) exprT3)
>>>
>>> Should this be
>>>
>>>           (T4) sqrtT4 ((T4) exprT3)
>>
>> T4 is not necessarily the same as T1. For the conversion:
>>
>>  (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>>
>> T4 is double and T1 is float.
>>
>>
>>>> +
>>>> +  , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0),
>>>> + and T4 is NEWTYPE.
>>>
>>> NEWTYPE is also the wider one between T1 and T3.
>>
>> Right. Actually this is easy to catch from the context just before
>> this comment.
>>
>> tree newtype = type;
>> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>>     newtype = TREE_TYPE (arg0);
>>
>>
>>
>> thanks,
>> Cong
>>
>>
>>>
>>> David
>>>
>>>> All those types are of floating point types.
>>>> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion
>>>> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of
>>>> + T2 and T4. See the following URL for a reference:
>>>> + http://stackoverflow.com/questions/9235456/determining-floating-point-square-root
>>>> + */
>>>> +      if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL)
>>>> + {
>>>> +  int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p;
>>>> +  int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p;
>>>> +  if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations)
>>>> +    break;
>>>> + }
>>>> +
>>>>        /* Be careful about integer to fp conversions.
>>>>   These may overflow still.  */
>>>>        if (FLOAT_TYPE_P (TREE_TYPE (arg0))
>>>> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c
>>>> ===================================================================
>>>> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891)
>>>> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy)
>>>> @@ -44,11 +44,11 @@ __attribute__ ((noinline))
>>>>  double
>>>>  sin(double a)
>>>>  {
>>>> - abort ();
>>>> + return a;
>>>>  }
>>>>  __attribute__ ((noinline))
>>>>  float
>>>>  sinf(float a)
>>>>  {
>>>> - return a;
>>>> + abort ();
>>>>  }
>>>>
>>>> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:
>>>>> On Wed, 4 Sep 2013, Xinliang David Li wrote:
>>>>>
>>>>>> > This patch submission still fails to pay attention to various of my
>>>>>> > comments.
>>>>>> >
>>>>>>
>>>>>> If you can provide inlined comments in the patch, that will be more
>>>>>> useful and productive.
>>>>>
>>>>> I have explained things several times in this thread already.  I see no
>>>>> point in repeating things when what I would say has already been said and
>>>>> ignored.  As far as I can tell, this latest patch submission has taken one
>>>>> line from the message it is in response to, and largely ignored the
>>>>> following two paragraphs (including where I explicitly say that said line
>>>>> should not appear literally in the source code at all).  But, repeating
>>>>> what I said before yet again:
>>>>>
>>>>>   (but you should be referring to the relevant types
>>>>>
>>>>> The patch does not do properly that.  It refers explicitly to
>>>>> long_double_type_node and double_type_node.
>>>>>
>>>>>   - "type", the type
>>>>>   being converted to, "itype", the type of the function being called in the
>>>>>   source code, "TREE_TYPE (arg0)", the type of the argument after extensions
>>>>>   have been removed, and "newtype", computed from those
>>>>>
>>>>> The patch only engages with "type".  I suspect "newtype" is what it really
>>>>> means there when using "type".  When it uses long_double_type_node and
>>>>> double_type_node, those should be "itype".
>>>>>
>>>>>   - so you should have
>>>>>   expressions like the above with two or more of those four types, but not
>>>>>   with long_double_type_node directly).
>>>>>
>>>>> See above.  The patch uses long_double_type_node directly, contrary to
>>>>> what I explicitly said.  You are free to disagree with something I say in
>>>>> a review - but in that case you need to reply specifically to the review
>>>>> comment explaining your rationale for disagreeing with it.  Just ignoring
>>>>> the comment and not mentioning the disagreement wastes the time of
>>>>> reviewers.
>>>>>
>>>>>   The patch submission will need to include a proper analysis to justify to
>>>>>   the reader why the particular inequality with particular types from those
>>>>>   four is correct in all cases where the relevant code may be executed.
>>>>>
>>>>> The comments only refer to "T1" and "T2" without explaining how they
>>>>> relate to the original expression (three types) or the variables within
>>>>> GCC dealing with various types (four types, because newtype gets
>>>>> involved).  I said back in
>>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and
>>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types
>>>>> are involved - when I say "the patch submission needs to include its own
>>>>> analysis for the full generality of three types", again, it's
>>>>> inappropriate for a patch to omit such an analysis without justification.
>>>>> The patch submission needs to include an analysis that properly explains
>>>>> the transformation involved and the conditions under which it is safe.
>>>>> Maybe starting along the lines of:
>>>>>
>>>>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3
>>>>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point
>>>>> square root function being used (ITYPE), T1 is TYPE and all these types
>>>>> are binary floating-point types.  We wish to optimize if possible to an
>>>>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is
>>>>> narrower than T2.  (Then explain the choice of T4 and the conditions under
>>>>> which the transformation is safe, with appropriate references.)
>>>>>
>>>>> I suggest that for the next patch submission you (the patch submitter)
>>>>> make sure you spend at least an hour properly understanding the issues and
>>>>> all the previous messages in the thread and writing up the detailed,
>>>>> coherent explanation of the transformation and analysis of the issues that
>>>>> I asked for some time ago, as if for a reader who hasn't read any of this
>>>>> thread or looked at this transformation in GCC before.  I've certainly
>>>>> spent longer than that on review in this thread.  It's normal for a patch
>>>>> involving anything at all tricky to take hours to write up (and also
>>>>> normal for one to discover, in the course of writing the detailed coherent
>>>>> analysis for people who aren't familiar with the code and the issues
>>>>> involved, that there's in fact something wrong with the patch and it needs
>>>>> revisiting before submission).
>>>>>
>>>>> --
>>>>> Joseph S. Myers
>>>>> joseph@codesourcery.com
Joseph S. Myers - Oct. 4, 2013, 12:06 a.m.
On Fri, 6 Sep 2013, Cong Hou wrote:

> 4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)

I don't believe this case is in fact safe even if precision (long double) 
>= precision (double) * 2 + 2 (when your patch would allow it).

The result that precision (double) * 2 + 2 is sufficient for the result of 
rounding the long double value to double to be the same as the result of 
rounding once from infinite precision to double would I think also mean 
the same when rounding of the infinite-precision result to float happens 
once - that is, if instead of (float) sqrt (double_val) you have fsqrt 
(double_val) (fsqrt being the proposed function in draft TS 18661-1 for 
computing a square root of a double value, rounded exactly once to float).  
But here the question isn't whether rounding to long double then float 
gives the same result as rounding to float.  It's whether rounding to long 
double then float gives the same result as rounding to double then float.

Consider, for example, double_val = 0x1.0000020000011p+0.  The square root 
rounded once to float (and so the result if you go via long double and 
long double is sufficiently precise) is 0x1.000002p+0.  But the square 
root rounded to double is 0x1.000001p+0, which on rounding to float 
produces 0x1p+0.

> 5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)

(This case, however, *is* safe if long double is sufficiently precise; the 
conversion of float to long double is trivially equivalent to a conversion 
involving an intermediate "double" type, so it reduces to a case for which 
the standard formula involving precisions of just two types applies.)

Patch

Index: gcc/convert.c
===================================================================
--- gcc/convert.c (revision 201891)
+++ gcc/convert.c (working copy)
@@ -135,16 +135,19 @@  convert_to_real (tree type, tree expr)
   CASE_MATHFN (COS)
   CASE_MATHFN (ERF)
   CASE_MATHFN (ERFC)
-  CASE_MATHFN (FABS)
   CASE_MATHFN (LOG)
   CASE_MATHFN (LOG10)
   CASE_MATHFN (LOG2)
   CASE_MATHFN (LOG1P)
-  CASE_MATHFN (LOGB)
   CASE_MATHFN (SIN)
-  CASE_MATHFN (SQRT)
   CASE_MATHFN (TAN)
   CASE_MATHFN (TANH)
+    /* The above functions are not safe to do this conversion. */
+    if (!flag_unsafe_math_optimizations)
+      break;
+  CASE_MATHFN (SQRT)
+  CASE_MATHFN (FABS)
+  CASE_MATHFN (LOGB)
 #undef CASE_MATHFN
     {
       tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0));
@@ -155,6 +158,27 @@  convert_to_real (tree type, tree expr)
       if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
  newtype = TREE_TYPE (arg0);

+      /* We consider to convert
+
+     (T1) sqrtT2 ((T2) exprT3)
+ to
+     (T1) sqrtT4 ((T4) exprT3)
+
+  , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0),
+ and T4 is NEWTYPE. All those types are of floating point types.
+ T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion
+ is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of
+ T2 and T4. See the following URL for a reference:
+ http://stackoverflow.com/questions/9235456/determining-floating-point-square-root
+ */
+      if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL)
+ {
+  int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p;
+  int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p;
+  if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations)
+    break;
+ }
+
       /* Be careful about integer to fp conversions.
  These may overflow still.  */
       if (FLOAT_TYPE_P (TREE_TYPE (arg0))
Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891)
+++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy)
@@ -44,11 +44,11 @@  __attribute__ ((noinline))
 double
 sin(double a)
 {
- abort ();
+ return a;
 }
 __attribute__ ((noinline))
 float
 sinf(float a)
 {
- return a;
+ abort ();
 }