Move some flag_unsafe_math_optimizations using simplify and match

Message ID SN2PR0701MB1024EC4AD2C9B654AE3F0E9A8E790@SN2PR0701MB1024.namprd07.prod.outlook.com
State New

Commit Message

Hurugalawadi, Naveen Aug. 17, 2015, 5:24 a.m. UTC
Hi,

Please find attached the modified patch as per the comments.

Tested the patch on AArch64 and X86 without any regressions.

The other hunks of the earlier patch have been removed as per the earlier
comments due to failure in regressions.
Investigated those issues and found that it's because of the Double and Float
patterns.
Could not deduce why the double and float patterns FAIL though.

>> fold_builtin_cos/cosh can be reduced to constant folding, thus
>> remove their fold_strip_sign_nops () path.
Had removed them, but the double and float patterns do not generate the
optimizations and hence had to retain them.

Please let me know why the double and float patterns are failing.
I could work on those and try to move all other patterns using
"simplify and match".

The testcases for these pattern optimizations are present.
Please let me know whether we would need a separate check so that I
can add them.

Thanks,
Naveen

ChangeLog

2015-08-17  Naveen H.S  <Naveen.Hurugalawadi@caviumnetworks.com>

        PR middle-end/16107
        * fold-const.c (fold_binary_loc) : Move Optimize tan(x)*cos(x) as
	sin(x) to match.pd.
	Move Optimize x*pow(x,c) as pow(x,c+1) to match.pd.
	Move Optimize pow(x,c)*x as pow(x,c+1) to match.pd.
	Move Optimize sin(x)/cos(x) as tan(x) to match.pd.
	Move Optimize cos(x)/sin(x) as 1.0/tan(x) to match.pd.
	Move Optimize sin(x)/tan(x) as cos(x) to match.pd.
	Move Optimize tan(x)/sin(x) as 1.0/cos(x) to match.pd.
	Move Optimize pow(x,c)/x as pow(x,c-1) to match.pd.
	Move Optimize x/pow(y,z) into x*pow(y,-z) to match.pd.
	* match.pd (SIN ) : New Operator.
	(TAN) : New Operator.
	(mult:c (SQRT (SQRT@1 @0)) @1) : New simplifier.
	(mult (POW:s @0 @1) (POW:s @2 @1))
	(mult:c (TAN:s @0) (COS:s @0))
	(mult:c @0 (POW @0 @1))
	(rdiv (SIN:s @0) (COS:s @0))
	(rdiv (COS:s @0) (SIN:s @0))
	(rdiv (SIN:s @0) (TAN:s @0))
	(rdiv (TAN:s @0) (SIN:s @0))
	(rdiv (POW @0 @1) @0)
	(rdiv @0 (SQRT (rdiv @1 @2)))
	(rdiv @0 (POW @1 @2))

Comments

Richard Biener Aug. 18, 2015, 10:24 a.m. UTC | #1
On Mon, Aug 17, 2015 at 7:24 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Please find attached the modified patch as per the comments.
>
> Tested the patch on AArch64 and X86 without any regressions.
>
> The other hunks of the earlier patch have been removed as per the earlier
> comments due to failure in regressions.
> Investigated those issues and found that it's because of the Double and Float
> patterns.
> Could not deduce why the double and float patterns FAIL though.
>
>>> fold_builtin_cos/cosh can be reduced to constant folding, thus
>>> remove their fold_strip_sign_nops () path.
> Had removed them, but the double and float patterns do not generate the
> optimizations and hence had to retain them.
>
> Please let me know why the double and float patterns are failing.
> I could work on those and try to move all other patterns using
> "simplify and match".

Can you point me to which patterns exhibit this behavior?  I guess
you mean 'long double' and 'float' variants, not 'double' variants?

> The testcases for these pattern optimizations are present.
> Please let me know whether we would need a separate check so that I
> can add them.

In your new patch I see

+ /* Simplify sqrt(x) * sqrt(x) -> x.  */
+ (simplify
+  (mult:c (SQRT (SQRT@1 @0)) @1)
+  (if (!HONOR_SNANS (type))
+   @0))

which looks like a typo: it matches sqrt (sqrt (x)) * sqrt (x).  You want

  (mult (SQRT@1 @0) @1)

also note there is no need for the :c here.
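
To make the difference between the two pattern shapes concrete, here is a
minimal sketch in plain C.  It is a toy model only: struct expr and the two
matcher functions are hypothetical names invented for illustration and are not
GCC's tree/gimple representation or genmatch output.  A repeated capture (@1)
is modelled as "the same node", i.e. pointer equality.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree; hypothetical, not GCC's internal representation.  */
enum kind { VAR, SQRT, MULT };

struct expr {
  enum kind kind;
  const struct expr *op0;   /* operand of SQRT, left operand of MULT */
  const struct expr *op1;   /* right operand of MULT, otherwise NULL */
};

/* Shape of the posted pattern: (mult (SQRT (SQRT@1 @0)) @1).
   The left operand must be sqrt (sqrt (x)) and the right operand must be
   the very same inner sqrt (x) node that was captured as @1.  */
bool
matches_posted (const struct expr *e)
{
  return e->kind == MULT
         && e->op0->kind == SQRT
         && e->op0->op0->kind == SQRT
         && e->op1 == e->op0->op0;
}

/* Shape of the intended pattern: (mult (SQRT@1 @0) @1).
   Both operands of the multiplication are the same sqrt (x) node.  */
bool
matches_intended (const struct expr *e)
{
  return e->kind == MULT
         && e->op0->kind == SQRT
         && e->op1 == e->op0;
}
```

In this model a tree for sqrt (x) * sqrt (x) that shares one sqrt node is
accepted only by matches_intended, while sqrt (sqrt (x)) * sqrt (x) is
accepted only by matches_posted.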

For cases like

+ /* Simplify tan(x) * cos(x) -> sin(x). */
+ (simplify
+  (mult:c (TAN:s @0) (COS:s @0))
+   (SIN @0))

you can run into the issue that the program does not use sinf() and thus
the compiler refuses to generate it (but the fold-const.c has the same
issue, so it shouldn't regress).  Not sure if that is your issue with
long double / float variants.

+ /* Simplify pow(x,c) / x -> pow(x,c-1). */
+ (simplify
+  (rdiv (POW @0 @1) @0)

:s missing on the POW

+  (if (TREE_CODE (@1) == REAL_CST
+       && !TREE_OVERFLOW (@1))
+   (POW @0 (minus @1 { build_one_cst (type); }))))

please use

  (simplify
    (rdiv (POW @0 REAL_CST@1) @0)
    (if (!TREE_OVERFLOW (@1))
  ...

here and in other cases where you restrict one operand to a constant.
That results in more efficient code.

+ /* Simplify a/root(b/c) into a*root(c/b).  */
+ (simplify
+  (rdiv @0 (SQRT (rdiv @1 @2)))

:s missing on the SQRT and the rdiv

+   (mult @0 (SQRT (rdiv @2 @1))))


+ /* Simplify x / pow (y,z) -> x * pow(y,-z). */
+ (simplify
+  (rdiv @0 (POW @1 @2))

:s missing on the POW.

+   (mult @0 (POW @1 (negate @2))))

Otherwise the new patch looks ok to me.

Thanks,
Richard.

> Thanks,
> Naveen
>
> ChangeLog
>
> 2015-08-17  Naveen H.S  <Naveen.Hurugalawadi@caviumnetworks.com>
>
>         PR middle-end/16107
>         * fold-const.c (fold_binary_loc) : Move Optimize tan(x)*cos(x) as
>         sin(x) to match.pd.
>         Move Optimize x*pow(x,c) as pow(x,c+1) to match.pd.
>         Move Optimize pow(x,c)*x as pow(x,c+1) to match.pd.
>         Move Optimize sin(x)/cos(x) as tan(x) to match.pd.
>         Move Optimize cos(x)/sin(x) as 1.0/tan(x) to match.pd.
>         Move Optimize sin(x)/tan(x) as cos(x) to match.pd.
>         Move Optimize tan(x)/sin(x) as 1.0/cos(x) to match.pd.
>         Move Optimize pow(x,c)/x as pow(x,c-1) to match.pd.
>         Move Optimize x/pow(y,z) into x*pow(y,-z) to match.pd.
>         * match.pd (SIN ) : New Operator.
>         (TAN) : New Operator.
>         (mult:c (SQRT (SQRT@1 @0)) @1) : New simplifier.
>         (mult (POW:s @0 @1) (POW:s @2 @1))
>         (mult:c (TAN:s @0) (COS:s @0))
>         (mult:c @0 (POW @0 @1))
>         (rdiv (SIN:s @0) (COS:s @0))
>         (rdiv (COS:s @0) (SIN:s @0))
>         (rdiv (SIN:s @0) (TAN:s @0))
>         (rdiv (TAN:s @0) (SIN:s @0))
>         (rdiv (POW @0 @1) @0)
>         (rdiv @0 (SQRT (rdiv @1 @2)))
>         (rdiv @0 (POW @1 @2))
Hurugalawadi, Naveen Aug. 19, 2015, 4:53 a.m. UTC | #2
Hi Richard,

Thanks very much for your review and comments.

>> Can you point me to which patterns exhibit this behavior?  

root(x)*root(y) as root(x*y)
expN(x)*expN(y) as expN(x+y)
pow(x,y)*pow(x,z) as pow(x,y+z)
x/expN(y) into x*expN(-y)

Long Double and Float variants FAIL with segmentation fault with these
patterns in match.pd file for AArch64.
However, most of these work as expected with X86_64.

I had those implemented as per the fold-const.c which can be found at:-
https://gcc.gnu.org/ml/gcc/2015-08/msg00021.html

>>  (mult (SQRT@1 @0) @1)

Sorry for the typo in there.
However, the current pattern does not generate the optimized pattern as expected.
x_2 = ABS_EXPR <x_1(D)>;
return x_2;

>> use (rdiv (POW @0 REAL_CST@1) @0)

It generates an ICE with the above modification:
internal compiler error: tree check: expected ssa_name, have var_decl in simplify_builtin_call, at tree-ssa-forwprop.c:1259

Also, could you please explain the significance and use of ":s"?
I understand it a bit but am still confused about its use in match.pd.

Thanks,
Naveen
Richard Biener Aug. 19, 2015, 10:04 a.m. UTC | #3
On Wed, Aug 19, 2015 at 6:53 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi Richard,
>
> Thanks very much for your review and comments.
>
>>> Can you point me to which patterns exhibit this behavior?
>
> root(x)*root(y) as root(x*y)
> expN(x)*expN(y) as expN(x+y)
> pow(x,y)*pow(x,z) as pow(x,y+z)
> x/expN(y) into x*expN(-y)
>
> Long Double and Float variants FAIL with segmentation fault with these
> patterns in match.pd file for AArch64.

Presumably the backend tells GCC the builtins are not available.

> However, most of these work as expected with X86_64.
>
> I had those implemented as per the fold-const.c which can be found at:-
> https://gcc.gnu.org/ml/gcc/2015-08/msg00021.html
>
>>>  (mult (SQRT@1 @0) @1)
>
> Sorry for the typo in there.
> However, the current pattern does not generate the optimized pattern as expected.
> x_2 = ABS_EXPR <x_1(D)>;
> return x_2;

I see.  But I can't really help without a testcase that I can use to have a look
(same for the above issue with the segfaults).

>>> use (rdiv (POW @0 REAL_CST@1) @0)
>
> It generates an ICE with the above modification:
> internal compiler error: tree check: expected ssa_name, have var_decl in simplify_builtin_call, at tree-ssa-forwprop.c:1259

Hmm.  Indeed, replacing a non-call with a call isn't very well supported
yet.  A quick "fix" to avoid this ICE would disable the pattern for
-fmath-errno.  If you open a bug report with the pattern and a testcase
I'm going to have a closer look.

> Also, could you please explain the significance and use of ":s"?
> I understand it a bit but am still confused about its use in match.pd.

":s" is important so that when we have, say

 tem = pow (x, 4.5);
 tem2 = tem / x;
 foo (tem);

thus the result of 'pow (x, 4.5)' is used in the pattern we match and also
elsewhere, we avoid turning this into

 tem = pow (x, 4.5);
 tem2 = pow (x, 3.5);
 foo (tem);

which is of course more expensive than doing the division.  Thus it makes sure
that parts of the patterns we don't use in the result are later removed as dead.
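
The cost argument above can be sketched as a tiny C model.  This is only an
illustration under simplifying assumptions: struct pow_value, single_use_ok
and pow_calls_after_rewrite are hypothetical names, and the real single-use
check in the generated matcher code is more involved than a bare use count.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a value defined by a call to pow ().  */
struct pow_value {
  int num_uses;   /* number of statements that read the result */
};

/* Model of ":s": allow pow (x, c) / x -> pow (x, c - 1) only when the
   division is the sole user of the pow result.  */
bool
single_use_ok (const struct pow_value *v)
{
  return v->num_uses == 1;
}

/* Number of pow () calls left if the rewrite were applied regardless.
   With other users, the original call has to stay next to the new one.  */
int
pow_calls_after_rewrite (const struct pow_value *v)
{
  return v->num_uses == 1 ? 1 : 2;
}
```

For the example above (tem is used by both the division and foo), num_uses
is 2, so the model rejects the rewrite; applying it anyway would leave two
pow calls instead of one pow call plus a division.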

Richard.

> Thanks,
> Naveen
Hurugalawadi, Naveen Aug. 20, 2015, 4:48 a.m. UTC | #4
Hi,

Thanks again for your review and useful comments.

>> I see.  But I can't really help without a testcase that I can use to have a look
>> (same for the above issue with the segfaults).

The following testcase does not generate "x" as needed.
====================
double t (double x)
{
 x = sqrt (x) * sqrt (x);
 return x;
}
====================

All of the following operations result in a segfault with:
aarch64-thunder-elf-gcc simlify-2.c -O2 -funsafe-math-optimizations
===============================================
#include <math.h>

double t (double x, double y, double z)
{
 x = cbrt (x) * cbrt (y);
 x = exp10 (x) * exp10 (y);
 x = pow10 (x) * pow10 (y);
 x = x / cbrt (x/y);
 x = x / exp10 (y);
 x = x / pow10 (y);
 return x;
}

float t (float x, float y, float z)
{
 x = sqrtf (x) * sqrtf (y);
 x = expf (x) * expf (y);
 x = powf (x, y) * powf (x, z);
 x = x / expf (y);
 return x;
}

long double t1 (long double x, long double y, long double z)
{
 x = sqrtl (x) * sqrtl (y);
 x = expl (x) * expl (y);
 x = powl (x, y) * powl (x, z);
 x = x / expl (y);
 return x;
}
===============================================
 /* Simplify sqrt(x) * sqrt(y) -> sqrt(x*y).  */
 (simplify
  (mult (SQRT:s @0) (SQRT:s @1))
  (SQRT (mult @0 @1)))

 /* Simplify pow(x,y) * pow(x,z) -> pow(x,y+z). */
 (simplify
  (mult (POW:s @0 @1) (POW:s @0 @2))
    (POW @0 (plus @1 @2)))

 /* Simplify expN(x) * expN(y) -> expN(x+y). */
 (simplify
  (mult (EXP:s @0) (EXP:s @1))
   (EXP (plus @0 @1)))

/* Simplify x / expN(y) into x*expN(-y). */
 (simplify
  (rdiv @0 (EXP @1))
   (mult @0 (EXP (negate @1))))
===============================================

>> A quick "fix" to avoid this ICE would disable the pattern for
>> -fmath-errno.

Disabled the pattern for -fmath-errno.

>> If you open a bugreport with the pattern and a testcase
>> I'm going to have a closer look.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67285

Thanks for the detailed explanation of ":s".

Please let me know whether the working patch can be committed?
If it's okay and with your approval, I would like to move some more
patterns using "match and simplify".

Thanks,
Naveen
Marc Glisse Aug. 20, 2015, 5:38 a.m. UTC | #5
On Thu, 20 Aug 2015, Hurugalawadi, Naveen wrote:

> The following testcase does not generate "x" as needed.
> ====================
> double t (double x)
> {
> x = sqrt (x) * sqrt (x);
> return x;
> }
> ====================

With -fno-math-errno, we CSE the calls to sqrt, so I would expect this to 
match:

   (mult (SQRT@1 @0) @1)

Without the flag, I expect that one will apply

  (simplify
   (mult (SQRT:s @0) (SQRT:s @1))
   (SQRT (mult @0 @1)))

and then maybe we have something converting sqrt(x*x) to abs(x) or maybe 
not.

I wonder if all the unsafe math optimizations are really ok without 
-fno-math-errno...
Richard Biener Aug. 20, 2015, 8:16 a.m. UTC | #6
On Thu, Aug 20, 2015 at 6:48 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Thanks again for your review and useful comments.
>
>>> I see.  But I can't really help without a testcase that I can use to have a look
>>> (same for the above issue with the segfaults).
>
> The following testcase does not generate "x" as needed.
> ====================
> double t (double x)
> {
>  x = sqrt (x) * sqrt (x);
>  return x;
> }

Works for me if you specify -fno-math-errno.  I think that's a
"regression" we can accept.
Later on GIMPLE CSE fails to CSE the two calls (because of the unknown
side-effects, special-casing of (some) builtins would be necessary).
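
As a rough illustration of why the flag matters here, the following C sketch
models the interaction between CSE and the (mult (SQRT@1 @0) @1) pattern.
Everything in it is a hypothetical simplification (struct call_stmt, can_cse
and sqrt_square_matches are made-up names); it is not how the GIMPLE passes
are implemented.

```c
#include <stdbool.h>

/* Hypothetical model of one call statement such as "tmp = sqrt (x)".  */
struct call_stmt {
  int fn_id;    /* which builtin is called */
  int arg_id;   /* which value is passed */
};

/* Decide whether a toy CSE pass may reuse FIRST's result for SECOND.
   When the calls may set errno, this model treats them as having unknown
   side-effects and keeps both.  */
bool
can_cse (const struct call_stmt *first, const struct call_stmt *second,
         bool math_errno)
{
  if (math_errno)
    return false;
  return first->fn_id == second->fn_id && first->arg_id == second->arg_id;
}

/* (mult (SQRT@1 @0) @1) needs both operands to be the same value, so in
   this model it applies only once the two calls have been merged.  */
bool
sqrt_square_matches (const struct call_stmt *lhs, const struct call_stmt *rhs,
                     bool math_errno)
{
  return lhs == rhs || can_cse (lhs, rhs, math_errno);
}
```

With two separate sqrt (x) calls the model matches only when math_errno is
false, which mirrors the observation that the testcase simplifies with
-fno-math-errno.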

> ====================
>
> All of the following operations result in a segfault with:
> aarch64-thunder-elf-gcc simlify-2.c -O2 -funsafe-math-optimizations
> ===============================================
> #include <math.h>
>
> double t (double x, double y, double z)
> {
>  x = cbrt (x) * cbrt (y);
>  x = exp10 (x) * exp10 (y);
>  x = pow10 (x) * pow10 (y);
>  x = x / cbrt (x/y);
>  x = x / exp10 (y);
>  x = x / pow10 (y);
>  return x;
> }
>
> float t (float x, float y, float z)
> {
>  x = sqrtf (x) * sqrtf (y);
>  x = expf (x) * expf (y);
>  x = powf (x, y) * powf (x, z);
>  x = x / expf (y);
>  return x;
> }
>
> long double t1 (long double x, long double y, long double z)
> {
>  x = sqrtl (x) * sqrtl (y);
>  x = expl (x) * expl (y);
>  x = powl (x, y) * powl (x, z);
>  x = x / expl (y);
>  return x;
> }
> ===============================================
>  /* Simplify sqrt(x) * sqrt(y) -> sqrt(x*y).  */
>  (simplify
>   (mult (SQRT:s @0) (SQRT:s @1))
>   (SQRT (mult @0 @1)))
>
>  /* Simplify pow(x,y) * pow(x,z) -> pow(x,y+z). */
>  (simplify
>   (mult (POW:s @0 @1) (POW:s @0 @2))
>     (POW @0 (plus @1 @2)))
>
>  /* Simplify expN(x) * expN(y) -> expN(x+y). */
>  (simplify
>   (mult (EXP:s @0) (EXP:s @1))
>    (EXP (plus @0 @1)))
>
> /* Simplify x / expN(y) into x*expN(-y). */
>  (simplify
>   (rdiv @0 (EXP @1))
>    (mult @0 (EXP (negate @1))))
> ===============================================
>
>>> A quick "fix" to avoid this ICE would disable the pattern for
>>> -fmath-errno.
>
> Disabled the pattern for -fmath-errno.
>
>>> If you open a bugreport with the pattern and a testcase
>>> I'm going to have a closer look.
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67285
>
> Thanks for the detailed explanation of ":s".
>
> Please let me know whether the working patch can be committed?
> If it's okay and with your approval, I would like to move some more
> patterns using "match and simplify".

Can you re-post with the typo fix and the missing :s?

Thanks,
Richard.

> Thanks,
> Naveen
Richard Biener Aug. 20, 2015, 8:20 a.m. UTC | #7
On Thu, Aug 20, 2015 at 7:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Thu, 20 Aug 2015, Hurugalawadi, Naveen wrote:
>
>> The following testcase does not generate "x" as needed.
>> ====================
>> double t (double x)
>> {
>> x = sqrt (x) * sqrt (x);
>> return x;
>> }
>> ====================
>
>
> With -fno-math-errno, we CSE the calls to sqrt, so I would expect this to
> match:
>
>   (mult (SQRT@1 @0) @1)
>
> Without the flag, I expect that one will apply
>
>  (simplify
>   (mult (SQRT:s @0) (SQRT:s @1))
>   (SQRT (mult @0 @1)))
>
> and then maybe we have something converting sqrt(x*x) to abs(x) or maybe
> not.

ICK.  I'd rather have CSE still CSE the two calls by adding some
tricks regarding errno ...

> I wonder if all the unsafe math optimizations are really ok without
> -fno-math-errno...

Well, on GIMPLE they will preserve the original calls because of their
side-effects setting errno...  on GENERIC probably not.

Richard.

> --
> Marc Glisse
Marc Glisse Aug. 20, 2015, 9:11 a.m. UTC | #8
On Thu, 20 Aug 2015, Richard Biener wrote:

> On Thu, Aug 20, 2015 at 7:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Thu, 20 Aug 2015, Hurugalawadi, Naveen wrote:
>>
>>> The following testcase does not generate "x" as needed.
>>> ====================
>>> double t (double x)
>>> {
>>> x = sqrt (x) * sqrt (x);
>>> return x;
>>> }
>>> ====================
>>
>>
>> With -fno-math-errno, we CSE the calls to sqrt, so I would expect this to
>> match:
>>
>>   (mult (SQRT@1 @0) @1)
>>
>> Without the flag, I expect that one will apply
>>
>>  (simplify
>>   (mult (SQRT:s @0) (SQRT:s @1))
>>   (SQRT (mult @0 @1)))
>>
>> and then maybe we have something converting sqrt(x*x) to abs(x) or maybe
>> not.
>
> ICK.  I'd rather have CSE still CSE the two calls by adding some tricks 
> regarding errno ...
>
>> I wonder if all the unsafe math optimizations are really ok without
>> -fno-math-errno...
>
> Well, on GIMPLE they will preserve the original calls because of their 
> side-effects setting errno...  on GENERIC probably not.

But we are also introducing new math calls, and I am afraid those might 
set errno at an unexpected place in the code...

I don't know if anyone interested in errno would ever use 
-funsafe-math-optimizations though.

Patch

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 6c65fe1..c0399ca 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -10008,67 +10008,6 @@  fold_binary_loc (location_t loc,
 		    }
 		}
 
-	      /* Optimize tan(x)*cos(x) as sin(x).  */
-	      if (((fcode0 == BUILT_IN_TAN && fcode1 == BUILT_IN_COS)
-		   || (fcode0 == BUILT_IN_TANF && fcode1 == BUILT_IN_COSF)
-		   || (fcode0 == BUILT_IN_TANL && fcode1 == BUILT_IN_COSL)
-		   || (fcode0 == BUILT_IN_COS && fcode1 == BUILT_IN_TAN)
-		   || (fcode0 == BUILT_IN_COSF && fcode1 == BUILT_IN_TANF)
-		   || (fcode0 == BUILT_IN_COSL && fcode1 == BUILT_IN_TANL))
-		  && operand_equal_p (CALL_EXPR_ARG (arg0, 0),
-				      CALL_EXPR_ARG (arg1, 0), 0))
-		{
-		  tree sinfn = mathfn_built_in (type, BUILT_IN_SIN);
-
-		  if (sinfn != NULL_TREE)
-		    return build_call_expr_loc (loc, sinfn, 1,
-					    CALL_EXPR_ARG (arg0, 0));
-		}
-
-	      /* Optimize x*pow(x,c) as pow(x,c+1).  */
-	      if (fcode1 == BUILT_IN_POW
-		  || fcode1 == BUILT_IN_POWF
-		  || fcode1 == BUILT_IN_POWL)
-		{
-		  tree arg10 = CALL_EXPR_ARG (arg1, 0);
-		  tree arg11 = CALL_EXPR_ARG (arg1, 1);
-		  if (TREE_CODE (arg11) == REAL_CST
-		      && !TREE_OVERFLOW (arg11)
-		      && operand_equal_p (arg0, arg10, 0))
-		    {
-		      tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg1), 0);
-		      REAL_VALUE_TYPE c;
-		      tree arg;
-
-		      c = TREE_REAL_CST (arg11);
-		      real_arithmetic (&c, PLUS_EXPR, &c, &dconst1);
-		      arg = build_real (type, c);
-		      return build_call_expr_loc (loc, powfn, 2, arg0, arg);
-		    }
-		}
-
-	      /* Optimize pow(x,c)*x as pow(x,c+1).  */
-	      if (fcode0 == BUILT_IN_POW
-		  || fcode0 == BUILT_IN_POWF
-		  || fcode0 == BUILT_IN_POWL)
-		{
-		  tree arg00 = CALL_EXPR_ARG (arg0, 0);
-		  tree arg01 = CALL_EXPR_ARG (arg0, 1);
-		  if (TREE_CODE (arg01) == REAL_CST
-		      && !TREE_OVERFLOW (arg01)
-		      && operand_equal_p (arg1, arg00, 0))
-		    {
-		      tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg0), 0);
-		      REAL_VALUE_TYPE c;
-		      tree arg;
-
-		      c = TREE_REAL_CST (arg01);
-		      real_arithmetic (&c, PLUS_EXPR, &c, &dconst1);
-		      arg = build_real (type, c);
-		      return build_call_expr_loc (loc, powfn, 2, arg1, arg);
-		    }
-		}
-
 	      /* Canonicalize x*x as pow(x,2.0), which is expanded as x*x.  */
 	      if (!in_gimple_form
 		  && optimize
@@ -10481,107 +10420,8 @@  fold_binary_loc (location_t loc,
 
       if (flag_unsafe_math_optimizations)
 	{
-	  enum built_in_function fcode0 = builtin_mathfn_code (arg0);
 	  enum built_in_function fcode1 = builtin_mathfn_code (arg1);
 
-	  /* Optimize sin(x)/cos(x) as tan(x).  */
-	  if (((fcode0 == BUILT_IN_SIN && fcode1 == BUILT_IN_COS)
-	       || (fcode0 == BUILT_IN_SINF && fcode1 == BUILT_IN_COSF)
-	       || (fcode0 == BUILT_IN_SINL && fcode1 == BUILT_IN_COSL))
-	      && operand_equal_p (CALL_EXPR_ARG (arg0, 0),
-				  CALL_EXPR_ARG (arg1, 0), 0))
-	    {
-	      tree tanfn = mathfn_built_in (type, BUILT_IN_TAN);
-
-	      if (tanfn != NULL_TREE)
-		return build_call_expr_loc (loc, tanfn, 1, CALL_EXPR_ARG (arg0, 0));
-	    }
-
-	  /* Optimize cos(x)/sin(x) as 1.0/tan(x).  */
-	  if (((fcode0 == BUILT_IN_COS && fcode1 == BUILT_IN_SIN)
-	       || (fcode0 == BUILT_IN_COSF && fcode1 == BUILT_IN_SINF)
-	       || (fcode0 == BUILT_IN_COSL && fcode1 == BUILT_IN_SINL))
-	      && operand_equal_p (CALL_EXPR_ARG (arg0, 0),
-				  CALL_EXPR_ARG (arg1, 0), 0))
-	    {
-	      tree tanfn = mathfn_built_in (type, BUILT_IN_TAN);
-
-	      if (tanfn != NULL_TREE)
-		{
-		  tree tmp = build_call_expr_loc (loc, tanfn, 1,
-					      CALL_EXPR_ARG (arg0, 0));
-		  return fold_build2_loc (loc, RDIV_EXPR, type,
-				      build_real (type, dconst1), tmp);
-		}
-	    }
-
- 	  /* Optimize sin(x)/tan(x) as cos(x) if we don't care about
-	     NaNs or Infinities.  */
- 	  if (((fcode0 == BUILT_IN_SIN && fcode1 == BUILT_IN_TAN)
- 	       || (fcode0 == BUILT_IN_SINF && fcode1 == BUILT_IN_TANF)
- 	       || (fcode0 == BUILT_IN_SINL && fcode1 == BUILT_IN_TANL)))
-	    {
-	      tree arg00 = CALL_EXPR_ARG (arg0, 0);
-	      tree arg01 = CALL_EXPR_ARG (arg1, 0);
-
-	      if (! HONOR_NANS (arg00)
-		  && ! HONOR_INFINITIES (element_mode (arg00))
-		  && operand_equal_p (arg00, arg01, 0))
-		{
-		  tree cosfn = mathfn_built_in (type, BUILT_IN_COS);
-
-		  if (cosfn != NULL_TREE)
-		    return build_call_expr_loc (loc, cosfn, 1, arg00);
-		}
-	    }
-
- 	  /* Optimize tan(x)/sin(x) as 1.0/cos(x) if we don't care about
-	     NaNs or Infinities.  */
- 	  if (((fcode0 == BUILT_IN_TAN && fcode1 == BUILT_IN_SIN)
- 	       || (fcode0 == BUILT_IN_TANF && fcode1 == BUILT_IN_SINF)
- 	       || (fcode0 == BUILT_IN_TANL && fcode1 == BUILT_IN_SINL)))
-	    {
-	      tree arg00 = CALL_EXPR_ARG (arg0, 0);
-	      tree arg01 = CALL_EXPR_ARG (arg1, 0);
-
-	      if (! HONOR_NANS (arg00)
-		  && ! HONOR_INFINITIES (element_mode (arg00))
-		  && operand_equal_p (arg00, arg01, 0))
-		{
-		  tree cosfn = mathfn_built_in (type, BUILT_IN_COS);
-
-		  if (cosfn != NULL_TREE)
-		    {
-		      tree tmp = build_call_expr_loc (loc, cosfn, 1, arg00);
-		      return fold_build2_loc (loc, RDIV_EXPR, type,
-					  build_real (type, dconst1),
-					  tmp);
-		    }
-		}
-	    }
-
-	  /* Optimize pow(x,c)/x as pow(x,c-1).  */
-	  if (fcode0 == BUILT_IN_POW
-	      || fcode0 == BUILT_IN_POWF
-	      || fcode0 == BUILT_IN_POWL)
-	    {
-	      tree arg00 = CALL_EXPR_ARG (arg0, 0);
-	      tree arg01 = CALL_EXPR_ARG (arg0, 1);
-	      if (TREE_CODE (arg01) == REAL_CST
-		  && !TREE_OVERFLOW (arg01)
-		  && operand_equal_p (arg1, arg00, 0))
-		{
-		  tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg0), 0);
-		  REAL_VALUE_TYPE c;
-		  tree arg;
-
-		  c = TREE_REAL_CST (arg01);
-		  real_arithmetic (&c, MINUS_EXPR, &c, &dconst1);
-		  arg = build_real (type, c);
-		  return build_call_expr_loc (loc, powfn, 2, arg1, arg);
-		}
-	    }
-
 	  /* Optimize a/root(b/c) into a*root(c/b).  */
 	  if (BUILTIN_ROOT_P (fcode1))
 	    {
@@ -10611,19 +10451,6 @@  fold_binary_loc (location_t loc,
 	      return fold_build2_loc (loc, MULT_EXPR, type, arg0, arg1);
 	    }
 
-	  /* Optimize x/pow(y,z) into x*pow(y,-z).  */
-	  if (fcode1 == BUILT_IN_POW
-	      || fcode1 == BUILT_IN_POWF
-	      || fcode1 == BUILT_IN_POWL)
-	    {
-	      tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg1), 0);
-	      tree arg10 = CALL_EXPR_ARG (arg1, 0);
-	      tree arg11 = CALL_EXPR_ARG (arg1, 1);
-	      tree neg11 = fold_convert_loc (loc, type,
-					     negate_expr (arg11));
-	      arg1 = build_call_expr_loc (loc, powfn, 2, arg10, neg11);
-	      return fold_build2_loc (loc, MULT_EXPR, type, arg0, arg1);
-	    }
 	}
       return NULL_TREE;
 
diff --git a/gcc/match.pd b/gcc/match.pd
index 71f4127..7675332 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -55,10 +55,11 @@  along with GCC; see the file COPYING3.  If not see
 (define_operator_list POW10 BUILT_IN_POW10F BUILT_IN_POW10 BUILT_IN_POW10L)
 (define_operator_list SQRT BUILT_IN_SQRTF BUILT_IN_SQRT BUILT_IN_SQRTL)
 (define_operator_list CBRT BUILT_IN_CBRTF BUILT_IN_CBRT BUILT_IN_CBRTL)
+(define_operator_list SIN BUILT_IN_SIN BUILT_IN_SINL BUILT_IN_SINF)
 (define_operator_list COS BUILT_IN_COS BUILT_IN_COSL BUILT_IN_COSF)
+(define_operator_list TAN BUILT_IN_TAN BUILT_IN_TANL BUILT_IN_TANF)
 (define_operator_list COSH BUILT_IN_COSH BUILT_IN_COSHL BUILT_IN_COSHF)
 
-
 /* Simplifications of operations with one constant operand and
    simplifications to constants or single values.  */
 
@@ -2006,6 +2007,71 @@  along with GCC; see the file COPYING3.  If not see
 
 /* fold_builtin_logarithm */
 (if (flag_unsafe_math_optimizations)
+
+ /* Simplify sqrt(x) * sqrt(x) -> x.  */
+ (simplify
+  (mult:c (SQRT (SQRT@1 @0)) @1)
+  (if (!HONOR_SNANS (type))
+   @0))
+
+ /* Simplify pow(x,y) * pow(z,y) -> pow(x*z,y). */
+ (simplify
+  (mult (POW:s @0 @1) (POW:s @2 @1))
+   (POW (mult @0 @2) @1))
+
+ /* Simplify tan(x) * cos(x) -> sin(x). */
+ (simplify
+  (mult:c (TAN:s @0) (COS:s @0))
+   (SIN @0))
+
+ /* Simplify x * pow(x,c) -> pow(x,c+1). */
+ (simplify
+  (mult:c @0 (POW @0 @1))
+  (if (TREE_CODE (@1) == REAL_CST
+       && !TREE_OVERFLOW (@1))
+   (POW @0 (plus @1 { build_one_cst (type); }))))
+
+ /* Simplify sin(x) / cos(x) -> tan(x). */
+ (simplify
+  (rdiv (SIN:s @0) (COS:s @0))
+   (TAN @0))
+
+ /* Simplify cos(x) / sin(x) -> 1 / tan(x). */
+ (simplify
+  (rdiv (COS:s @0) (SIN:s @0))
+   (rdiv { build_one_cst (type); } (TAN @0)))
+
+ /* Simplify sin(x) / tan(x) -> cos(x). */
+ (simplify
+  (rdiv (SIN:s @0) (TAN:s @0))
+  (if (! HONOR_NANS (@0)
+       && ! HONOR_INFINITIES (@0))
+   (COS @0)))
+
+ /* Simplify tan(x) / sin(x) -> 1.0 / cos(x). */
+ (simplify
+  (rdiv (TAN:s @0) (SIN:s @0))
+  (if (! HONOR_NANS (@0)
+       && ! HONOR_INFINITIES (@0))
+   (rdiv { build_one_cst (type); } (COS @0))))
+
+ /* Simplify pow(x,c) / x -> pow(x,c-1). */
+ (simplify
+  (rdiv (POW @0 @1) @0)
+  (if (TREE_CODE (@1) == REAL_CST
+       && !TREE_OVERFLOW (@1))
+   (POW @0 (minus @1 { build_one_cst (type); }))))
+
+ /* Simplify a/root(b/c) into a*root(c/b).  */
+ (simplify
+  (rdiv @0 (SQRT (rdiv @1 @2)))
+   (mult @0 (SQRT (rdiv @2 @1))))
+
+ /* Simplify x / pow (y,z) -> x * pow(y,-z). */
+ (simplify
+  (rdiv @0 (POW @1 @2))
+   (mult @0 (POW @1 (negate @2))))
+
  /* Special case, optimize logN(expN(x)) = x.  */
  (for logs (LOG LOG2 LOG10)
       exps (EXP EXP2 EXP10)