diff mbox

More pow(x,c) expansions in cse_sincos pass (PR46728, patch 3)

Message ID 1306432259.4821.73.camel@L3G5336.ibm.com
State New
Headers show

Commit Message

Bill Schmidt May 26, 2011, 5:50 p.m. UTC
On Thu, 2011-05-26 at 11:22 +0200, Richard Guenther wrote:

<snip>


> There are several extra pre-conditions in the original code in
> builtins.c as well as commens for reasonings (yes, there seem
> to be duplicates of several of the transforms there ...).  Please
> try to preserve them.  I noticed especially
> 
>               /* Optimize pow (x, 0.25) into sqrt (sqrt (x)).  Assume on most
>                  machines that a builtin sqrt instruction is smaller than a
>                  call to pow with 0.25, so do this optimization even if
>                  -Os.  */
> 
> and
> 
>       /* Check whether we can do cbrt insstead of pow (x, 1./3.) and
>          cbrt/sqrts instead of pow (x, 1./6.).  */
>       if (cbrtfn && ! op
>           && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode)))
> 
> where you omit the non-negative check and the check for NANs.
> Note that -funsafe-math-optimizations does not mean you can
> ignore nans or signed zero issues.  pow(x,1/3) -> cbrt(x) does not
> need a -funsafe-math-optimization check either.

OK.  Good catch on the non-negative/NaN check -- that just slipped
through the cracks.  I've fixed this and beefed up the comments as you
suggest.

> 
> It would be nice if you could re-arrange this function so that before
> each individual replacement all preconditions are checked (even if
> that means duplicating some checks and code) - that way
> reviewing and later maintainance should be much easier.
> 

Done -- you're right, the result is much cleaner.  Hopefully this
version is easier to follow.

Thanks, 
Bill


2011-05-25  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR tree-optimization/46728
	* tree-ssa-math-opts.c (powi_as_mults_1): Add gimple_set_location.
	(powi_as_mults): Add gimple_set_location.
	(build_and_insert_call): New.
	(gimple_expand_builtin_pow): Add handling for pow(x,y) when y is
	0.5, 0.25, 0.75, 1./3., or 1./6.

Comments

Joseph Myers May 26, 2011, 5:59 p.m. UTC | #1
On Thu, 26 May 2011, William J. Schmidt wrote:

> +  /* Optimize pow(x,1./3.) = cbrt(x).  This is always safe.  */

No, it's not always safe.  1./3. isn't exactly representable; if x is 
negative (and finite), the correct value of the LHS is a NaN (with the 
"invalid" exception raised) because the value of 1./3. actually has an 
even denominator, whereas the correct value of the RHS is a negative real 
value.
Richard Biener May 27, 2011, 10:24 a.m. UTC | #2
On Thu, May 26, 2011 at 7:50 PM, William J. Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
>
> On Thu, 2011-05-26 at 11:22 +0200, Richard Guenther wrote:
>
> <snip>
>
>
>> There are several extra pre-conditions in the original code in
>> builtins.c as well as commens for reasonings (yes, there seem
>> to be duplicates of several of the transforms there ...).  Please
>> try to preserve them.  I noticed especially
>>
>>               /* Optimize pow (x, 0.25) into sqrt (sqrt (x)).  Assume on most
>>                  machines that a builtin sqrt instruction is smaller than a
>>                  call to pow with 0.25, so do this optimization even if
>>                  -Os.  */
>>
>> and
>>
>>       /* Check whether we can do cbrt insstead of pow (x, 1./3.) and
>>          cbrt/sqrts instead of pow (x, 1./6.).  */
>>       if (cbrtfn && ! op
>>           && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode)))
>>
>> where you omit the non-negative check and the check for NANs.
>> Note that -funsafe-math-optimizations does not mean you can
>> ignore nans or signed zero issues.  pow(x,1/3) -> cbrt(x) does not
>> need a -funsafe-math-optimization check either.
>
> OK.  Good catch on the non-negative/NaN check -- that just slipped
> through the cracks.  I've fixed this and beefed up the comments as you
> suggest.
>
>>
>> It would be nice if you could re-arrange this function so that before
>> each individual replacement all preconditions are checked (even if
>> that means duplicating some checks and code) - that way
>> reviewing and later maintainance should be much easier.
>>
>
> Done -- you're right, the result is much cleaner.  Hopefully this
> version is easier to follow.
>
> Thanks,
> Bill
>
>
> 2011-05-25  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>        PR tree-optimization/46728
>        * tree-ssa-math-opts.c (powi_as_mults_1): Add gimple_set_location.
>        (powi_as_mults): Add gimple_set_location.
>        (build_and_insert_call): New.
>        (gimple_expand_builtin_pow): Add handling for pow(x,y) when y is
>        0.5, 0.25, 0.75, 1./3., or 1./6.
>
>
> Index: gcc/tree-ssa-math-opts.c
> ===================================================================
> --- gcc/tree-ssa-math-opts.c    (revision 174199)
> +++ gcc/tree-ssa-math-opts.c    (working copy)
> @@ -965,6 +965,7 @@ powi_as_mults_1 (gimple_stmt_iterator *gsi, locati
>     }
>
>   mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target, op0, op1);
> +  gimple_set_location (mult_stmt, loc);
>   gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT);
>
>   return ssa_target;
> @@ -999,6 +1000,7 @@ powi_as_mults (gimple_stmt_iterator *gsi, location
>   div_stmt = gimple_build_assign_with_ops (RDIV_EXPR, target,
>                                           build_real (type, dconst1),
>                                           result);
> +  gimple_set_location (div_stmt, loc);
>   gsi_insert_before (gsi, div_stmt, GSI_SAME_STMT);
>
>   return target;
> @@ -1024,6 +1026,34 @@ gimple_expand_builtin_powi (gimple_stmt_iterator *
>   return NULL_TREE;
>  }
>
> +/* Build a gimple call statement that calls FN with argument ARG.
> +   Set the lhs of the call statement to a fresh SSA name for
> +   variable VAR.  If VAR is NULL, first allocate it.  Insert the
> +   statement prior to GSI's current position, and return the fresh
> +   SSA name.  */
> +
> +static tree
> +build_and_insert_call (gimple_stmt_iterator *gsi, tree fn, tree arg,
> +                      tree *var, location_t loc)
> +{
> +  gimple call_stmt;
> +  tree ssa_target;
> +
> +  if (!*var)
> +    {
> +      *var = create_tmp_var (TREE_TYPE (arg), "powroot");
> +      add_referenced_var (*var);
> +    }
> +
> +  call_stmt = gimple_build_call (fn, 1, arg);
> +  ssa_target = make_ssa_name (*var, NULL);
> +  gimple_set_lhs (call_stmt, ssa_target);
> +  gimple_set_location (call_stmt, loc);
> +  gsi_insert_before (gsi, call_stmt, GSI_SAME_STMT);
> +
> +  return ssa_target;
> +}
> +
>  /* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
>    with location info LOC.  If possible, create an equivalent and
>    less expensive sequence of statements prior to GSI, and return an
> @@ -1033,16 +1063,21 @@ static tree
>  gimple_expand_builtin_pow (gimple_stmt_iterator *gsi, location_t loc,
>                           tree arg0, tree arg1)
>  {
> -  REAL_VALUE_TYPE c, cint;
> +  REAL_VALUE_TYPE c, cint, dconst1_4, dconst3_4, dconst1_3, dconst1_6;
>   HOST_WIDE_INT n;
> +  tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, ssa_target;
> +  tree target = NULL_TREE;
> +  enum machine_mode mode;
> +  bool hw_sqrt_exists;
> +  gimple mult_stmt;
>
>   /* If the exponent isn't a constant, there's nothing of interest
>      to be done.  */
>   if (TREE_CODE (arg1) != REAL_CST)
>     return NULL_TREE;
>
> -  /* If the exponent is equivalent to an integer, expand it into
> -     multiplies when profitable.  */
> +  /* If the exponent is equivalent to an integer, expand to an optimal
> +     multiplication sequence when profitable.  */
>   c = TREE_REAL_CST (arg1);
>   n = real_to_integer (&c);
>   real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
> @@ -1054,6 +1089,97 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g
>              && powi_cost (n) <= POWI_MAX_MULTS)))
>     return gimple_expand_builtin_powi (gsi, loc, arg0, n);
>
> +  /* Attempt various optimizations using sqrt and cbrt.  */
> +  type = TREE_TYPE (arg0);
> +  mode = TYPE_MODE (type);
> +  sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
> +
> +  /* Optimize pow(x,0.5) = sqrt(x).  This replacement is always safe
> +     unless signed zeros must be maintained.  pow(-0,0.5) = +0, while
> +     sqrt(-0) = -0.  */
> +  if (sqrtfn
> +      && REAL_VALUES_EQUAL (c, dconsthalf)
> +      && (flag_unsafe_math_optimizations
> +         || !HONOR_SIGNED_ZEROS (mode)))

Drop flag_unsafe_math_optimizations - the replacement isn't safe
for -funsafe-math-optimizations -fsigned-zeros.

> +    return build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
> +
> +  /* Optimize pow(x,0.25) = sqrt(sqrt(x)).  Assume on most machines that
> +     a builtin sqrt instruction is smaller than a call to pow with 0.25,
> +     so do this optimization even if -Os.  Don't do this optimization
> +     if we don't have a hardware sqrt insn.  */
> +  dconst1_4 = dconst1;
> +  SET_REAL_EXP (&dconst1_4, REAL_EXP (&dconst1_4) - 2);
> +  hw_sqrt_exists = optab_handler(sqrt_optab, mode) != CODE_FOR_nothing;
> +
> +  if (flag_unsafe_math_optimizations
> +      && sqrtfn
> +      && REAL_VALUES_EQUAL (c, dconst1_4)
> +      && hw_sqrt_exists)
> +    {
> +      /* sqrt(x)  */
> +      sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
> +
> +      /* sqrt(sqrt(x))  */
> +      return build_and_insert_call (gsi, sqrtfn, sqrt_arg0, &target, loc);
> +    }
> +
> +  /* Optimize pow(x,0.75) = sqrt(x) * sqrt(sqrt(x)) unless we are
> +     optimizing for space.  Don't do this optimization if we don't have
> +     a hardware sqrt insn.  */
> +  real_from_integer (&dconst3_4, VOIDmode, 3, 0, 0);
> +  SET_REAL_EXP (&dconst3_4, REAL_EXP (&dconst3_4) - 2);
> +
> +  if (flag_unsafe_math_optimizations
> +      && sqrtfn
> +      && optimize_function_for_speed_p (cfun)
> +      && REAL_VALUES_EQUAL (c, dconst3_4)
> +      && hw_sqrt_exists)
> +    {
> +      /* sqrt(x)  */
> +      sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
> +
> +      /* sqrt(sqrt(x))  */
> +      sqrt_sqrt = build_and_insert_call (gsi, sqrtfn, sqrt_arg0, &target, loc);
> +
> +      /* sqrt(x) * sqrt(sqrt(x))  */
> +      ssa_target = make_ssa_name (target, NULL);
> +      mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target,
> +                                               sqrt_arg0, sqrt_sqrt);
> +      gimple_set_location (mult_stmt, loc);
> +      gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT);
> +
> +      return ssa_target;
> +    }
> +
> +  /* Optimize pow(x,1./3.) = cbrt(x).  This is always safe.  */
> +  cbrtfn = mathfn_built_in (type, BUILT_IN_CBRT);
> +  dconst1_3 = real_value_truncate (mode, dconst_third ());
> +
> +  if (cbrtfn
> +      && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode))
> +      && REAL_VALUES_EQUAL (c, dconst1_3))
> +    return build_and_insert_call (gsi, cbrtfn, arg0, &target, loc);

As Joseph said this needs to be conditional on flag_unsafe_math_optimization
because of the inexactness of 1./3..
Please add rationale as explained by Joseph for the HONOR_NANS
check.

Now, the tree_expr_nonnegative_p check will always return false
because arg0 is a gimple value.  We would need to implement
either some propagation engine to derive properties for floating
point entities or mimic what tree_expr_nonnegative_p does by
looking at defining statements.  A very simple variant could be
like

bool
gimple_val_nonnegative_p (tree val)
{
  if (tree_expr_nonnegative_p (val))
    return true;

  if (TREE_CODE (val) == SSA_NAME)
   {
     gimple def_stmt = SSA_NAME_DEF_STMT (val);
     if (is_gimple_assign (def_stmt))
       return gimple_assign_rhs_code (def_stmt) == ABS_EXPR;
     else if (is_gimple_call (def_stmt))
       {
          tree fndecl = gimple_call_fndecl (def_stmt);
          if (fndecl
              && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
            {
               switch (DECL_FUNCTION_CODE (fndecl))
                 {
                 CASE_FLT_FN (BUILT_IN_FABS):
                     return true;
                 default:;
                 }
            }
       }
   }

  return false;
}

suitable for gimple-fold.c.

Alternatively you can drop the call and add a FIXME comment.

> +
> +  /* Optimize pow(x,1./6.) = cbrt(sqrt(x)).  Don't do this optimization
> +     if we don't have a hardware sqrt insn.  */
> +  dconst1_6 = dconst1_3;
> +  SET_REAL_EXP (&dconst1_6, REAL_EXP (&dconst1_6) - 1);
> +
> +  if (flag_unsafe_math_optimizations
> +      && sqrtfn
> +      && cbrtfn
> +      && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode))

Likewise.

Ok with these changes.

Thanks,
Richard.

> +      && optimize_function_for_speed_p (cfun)
> +      && hw_sqrt_exists
> +      && REAL_VALUES_EQUAL (c, dconst1_6))
> +    {
> +      /* sqrt(x)  */
> +      sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
> +
> +      /* cbrt(sqrt(x))  */
> +      return build_and_insert_call (gsi, cbrtfn, sqrt_arg0, &target, loc);
> +    }
> +
>   return NULL_TREE;
>  }
Bill Schmidt May 27, 2011, 1:45 p.m. UTC | #3
On Fri, 2011-05-27 at 12:24 +0200, Richard Guenther wrote:

<snip>

> Now, the tree_expr_nonnegative_p check will always return false
> because arg0 is a gimple value.  We would need to implement
> either some propagation engine to derive properties for floating
> point entities or mimic what tree_expr_nonnegative_p does by
> looking at defining statements.  A very simple variant could be
> like
> 
> bool
> gimple_val_nonnegative_p (tree val)
> {
>   if (tree_expr_nonnegative_p (val))
>     return true;
> 
>   if (TREE_CODE (val) == SSA_NAME)
>    {
>      gimple def_stmt = SSA_NAME_DEF_STMT (val);
>      if (is_gimple_assign (def_stmt))
>        return gimple_assign_rhs_code (def_stmt) == ABS_EXPR;
>      else if (is_gimple_call (def_stmt))
>        {
>           tree fndecl = gimple_call_fndecl (def_stmt);
>           if (fndecl
>               && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
>             {
>                switch (DECL_FUNCTION_CODE (fndecl))
>                  {
>                  CASE_FLT_FN (BUILT_IN_FABS):
>                      return true;
>                  default:;
>                  }
>             }
>        }
>    }
> 
>   return false;
> }
> 
> suitable for gimple-fold.c.
> 
> Alternatively you can drop the call and add a FIXME comment.
> 
> > +
> > +  /* Optimize pow(x,1./6.) = cbrt(sqrt(x)).  Don't do this optimization
> > +     if we don't have a hardware sqrt insn.  */
> > +  dconst1_6 = dconst1_3;
> > +  SET_REAL_EXP (&dconst1_6, REAL_EXP (&dconst1_6) - 1);
> > +
> > +  if (flag_unsafe_math_optimizations
> > +      && sqrtfn
> > +      && cbrtfn
> > +      && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode))
> 
> Likewise.

OK, sounds good.  I will commit this patch with the FIXME comments, and
test the gimple_val_nonnegative_p changes next week after the holiday.

Thanks,
Bill
diff mbox

Patch

Index: gcc/tree-ssa-math-opts.c
===================================================================
--- gcc/tree-ssa-math-opts.c	(revision 174199)
+++ gcc/tree-ssa-math-opts.c	(working copy)
@@ -965,6 +965,7 @@  powi_as_mults_1 (gimple_stmt_iterator *gsi, locati
     }
 
   mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target, op0, op1);
+  gimple_set_location (mult_stmt, loc);
   gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT);
 
   return ssa_target;
@@ -999,6 +1000,7 @@  powi_as_mults (gimple_stmt_iterator *gsi, location
   div_stmt = gimple_build_assign_with_ops (RDIV_EXPR, target, 
 					   build_real (type, dconst1),
 					   result);
+  gimple_set_location (div_stmt, loc);
   gsi_insert_before (gsi, div_stmt, GSI_SAME_STMT);
 
   return target;
@@ -1024,6 +1026,34 @@  gimple_expand_builtin_powi (gimple_stmt_iterator *
   return NULL_TREE;
 }
 
+/* Build a gimple call statement that calls FN with argument ARG.
+   Set the lhs of the call statement to a fresh SSA name for
+   variable VAR.  If VAR is NULL, first allocate it.  Insert the
+   statement prior to GSI's current position, and return the fresh
+   SSA name.  */
+
+static tree
+build_and_insert_call (gimple_stmt_iterator *gsi, tree fn, tree arg,
+		       tree *var, location_t loc)
+{
+  gimple call_stmt;
+  tree ssa_target;
+
+  if (!*var)
+    {
+      *var = create_tmp_var (TREE_TYPE (arg), "powroot");
+      add_referenced_var (*var);
+    }
+
+  call_stmt = gimple_build_call (fn, 1, arg);
+  ssa_target = make_ssa_name (*var, NULL);
+  gimple_set_lhs (call_stmt, ssa_target);
+  gimple_set_location (call_stmt, loc);
+  gsi_insert_before (gsi, call_stmt, GSI_SAME_STMT);
+
+  return ssa_target;
+}
+
 /* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI
    with location info LOC.  If possible, create an equivalent and
    less expensive sequence of statements prior to GSI, and return an
@@ -1033,16 +1063,21 @@  static tree
 gimple_expand_builtin_pow (gimple_stmt_iterator *gsi, location_t loc, 
 			   tree arg0, tree arg1)
 {
-  REAL_VALUE_TYPE c, cint;
+  REAL_VALUE_TYPE c, cint, dconst1_4, dconst3_4, dconst1_3, dconst1_6;
   HOST_WIDE_INT n;
+  tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, ssa_target;
+  tree target = NULL_TREE;
+  enum machine_mode mode;
+  bool hw_sqrt_exists;
+  gimple mult_stmt;
 
   /* If the exponent isn't a constant, there's nothing of interest
      to be done.  */
   if (TREE_CODE (arg1) != REAL_CST)
     return NULL_TREE;
 
-  /* If the exponent is equivalent to an integer, expand it into
-     multiplies when profitable.  */
+  /* If the exponent is equivalent to an integer, expand to an optimal
+     multiplication sequence when profitable.  */
   c = TREE_REAL_CST (arg1);
   n = real_to_integer (&c);
   real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
@@ -1054,6 +1089,97 @@  gimple_expand_builtin_pow (gimple_stmt_iterator *g
 	      && powi_cost (n) <= POWI_MAX_MULTS)))
     return gimple_expand_builtin_powi (gsi, loc, arg0, n);
 
+  /* Attempt various optimizations using sqrt and cbrt.  */
+  type = TREE_TYPE (arg0);
+  mode = TYPE_MODE (type);
+  sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
+
+  /* Optimize pow(x,0.5) = sqrt(x).  This replacement is always safe
+     unless signed zeros must be maintained.  pow(-0,0.5) = +0, while
+     sqrt(-0) = -0.  */
+  if (sqrtfn
+      && REAL_VALUES_EQUAL (c, dconsthalf)
+      && (flag_unsafe_math_optimizations
+	  || !HONOR_SIGNED_ZEROS (mode)))
+    return build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+  /* Optimize pow(x,0.25) = sqrt(sqrt(x)).  Assume on most machines that
+     a builtin sqrt instruction is smaller than a call to pow with 0.25,
+     so do this optimization even if -Os.  Don't do this optimization
+     if we don't have a hardware sqrt insn.  */
+  dconst1_4 = dconst1;
+  SET_REAL_EXP (&dconst1_4, REAL_EXP (&dconst1_4) - 2);
+  hw_sqrt_exists = optab_handler(sqrt_optab, mode) != CODE_FOR_nothing;
+
+  if (flag_unsafe_math_optimizations
+      && sqrtfn
+      && REAL_VALUES_EQUAL (c, dconst1_4)
+      && hw_sqrt_exists)
+    {
+      /* sqrt(x)  */
+      sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+      /* sqrt(sqrt(x))  */
+      return build_and_insert_call (gsi, sqrtfn, sqrt_arg0, &target, loc);
+    }
+      
+  /* Optimize pow(x,0.75) = sqrt(x) * sqrt(sqrt(x)) unless we are
+     optimizing for space.  Don't do this optimization if we don't have
+     a hardware sqrt insn.  */
+  real_from_integer (&dconst3_4, VOIDmode, 3, 0, 0);
+  SET_REAL_EXP (&dconst3_4, REAL_EXP (&dconst3_4) - 2);
+
+  if (flag_unsafe_math_optimizations
+      && sqrtfn
+      && optimize_function_for_speed_p (cfun)
+      && REAL_VALUES_EQUAL (c, dconst3_4)
+      && hw_sqrt_exists)
+    {
+      /* sqrt(x)  */
+      sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+      /* sqrt(sqrt(x))  */
+      sqrt_sqrt = build_and_insert_call (gsi, sqrtfn, sqrt_arg0, &target, loc);
+
+      /* sqrt(x) * sqrt(sqrt(x))  */
+      ssa_target = make_ssa_name (target, NULL);
+      mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target,
+						sqrt_arg0, sqrt_sqrt);
+      gimple_set_location (mult_stmt, loc);
+      gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT);
+
+      return ssa_target;
+    }
+
+  /* Optimize pow(x,1./3.) = cbrt(x).  This is always safe.  */
+  cbrtfn = mathfn_built_in (type, BUILT_IN_CBRT);
+  dconst1_3 = real_value_truncate (mode, dconst_third ());
+
+  if (cbrtfn
+      && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode))
+      && REAL_VALUES_EQUAL (c, dconst1_3))
+    return build_and_insert_call (gsi, cbrtfn, arg0, &target, loc);
+  
+  /* Optimize pow(x,1./6.) = cbrt(sqrt(x)).  Don't do this optimization
+     if we don't have a hardware sqrt insn.  */
+  dconst1_6 = dconst1_3;
+  SET_REAL_EXP (&dconst1_6, REAL_EXP (&dconst1_6) - 1);
+
+  if (flag_unsafe_math_optimizations
+      && sqrtfn
+      && cbrtfn
+      && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode))
+      && optimize_function_for_speed_p (cfun)
+      && hw_sqrt_exists
+      && REAL_VALUES_EQUAL (c, dconst1_6))
+    {
+      /* sqrt(x)  */
+      sqrt_arg0 = build_and_insert_call (gsi, sqrtfn, arg0, &target, loc);
+
+      /* cbrt(sqrt(x))  */
+      return build_and_insert_call (gsi, cbrtfn, sqrt_arg0, &target, loc);
+    }
+
   return NULL_TREE;
 }