Extend tree-call-cdce to calls whose result is used

Message ID 876111kmo5.fsf@e105548-lin.cambridge.arm.com
State New

Commit Message

Richard Sandiford Nov. 17, 2015, 9:19 a.m. UTC
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Nov 13, 2015 at 2:12 PM, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Mon, Nov 9, 2015 at 10:03 PM, Michael Matz <matz@suse.de> wrote:
>>>> Hi,
>>>>
>>>> On Mon, 9 Nov 2015, Richard Sandiford wrote:
>>>>
>>>>> +static bool
>>>>> +can_use_internal_fn (gcall *call)
>>>>> +{
>>>>> +  /* Only replace calls that set errno.  */
>>>>> +  if (!gimple_vdef (call))
>>>>> +    return false;
>>>>
>>>> Oh, I managed to confuse this in my head while reading the patch.  So,
>>>> hmm, you don't actually replace the builtin with an internal function
>>>> (without the condition) under no-errno-math?  Does something else do that?
>>>> Because otherwise that seems an unnecessary restriction?
>>>>
>>>>> >> r229916 fixed that for the non-EH case.
>>>>> >
>>>>> > Ah, missed it.  Even the EH case shouldn't be difficult.  If the
>>>>> > original dominator of the EH destination was the call block it moves,
>>>>> > otherwise it remains unchanged.
>>>>>
>>>>> The target of the edge is easy in itself, I agree, but that isn't
>>>>> necessarily the only affected block, if the EH handler doesn't
>>>>> exit or rethrow.
>>>>
>>>> You're worried the non-EH and the EH regions merge again, right?  Like so:
>>>>
>>>> before change:
>>>>
>>>>         BB1: throwing-call
>>>>      fallthru/           \EH
>>>>     BB2                   BBeh
>>>>      |                   /....\ (stuff in EH-region)
>>>>      |                     /    some path out of EH region
>>>>      | /------------------/
>>>>     BB3
>>>>
>>>> Here, BB3 must at least be dominated by BB1 (the throwing block), or by
>>>> something further up (when there are other side-entries to the path
>>>> BB2->BB3 or into the EH region).  When further up, nothing changes, when
>>>> it's BB1, then it's afterwards dominated by the BB containing the
>>>> condition.  So everything with idom==BB1 gets idom=Bcond, except for BBeh,
>>>> which gets idom=Bcall.  Depending on how you split BB1, either Bcond or
>>>> BBcall might still be BB1 and doesn't lead to changes in the dom tree.
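
The dominator claim above can be checked on a toy CFG. The following is only an illustrative sketch, not code from the patch: the block numbering, the edge lists and the helper names (compute_idoms, count_bits) are all hypothetical, and the "after" graph assumes BB1 is split into a condition block (Bcond) and a guarded call block (Bcall) whose fallthrough edge rejoins BB2. It uses the simple iterative bit-set formulation rather than GCC's own dominance code.

```c
#define MAX_BLOCKS 8

struct edge { int src, dest; };

/* Number of set bits in MASK.  */
static int
count_bits (unsigned mask)
{
  int n = 0;
  for (; mask; mask &= mask - 1)
    n++;
  return n;
}

/* Compute immediate dominators for a small CFG whose entry is block 0 and
   whose blocks are all reachable.  idom[0] is set to -1.  */
static void
compute_idoms (int nblocks, const struct edge *edges, int nedges, int *idom)
{
  unsigned dom[MAX_BLOCKS];
  unsigned all = (1u << nblocks) - 1;

  /* The entry is dominated only by itself; start everything else at "all".  */
  dom[0] = 1u;
  for (int b = 1; b < nblocks; b++)
    dom[b] = all;

  /* dom[b] = {b} union (intersection of dom[p] over predecessors p).  */
  for (int changed = 1; changed; )
    {
      changed = 0;
      for (int b = 1; b < nblocks; b++)
        {
          unsigned meet = all;
          for (int e = 0; e < nedges; e++)
            if (edges[e].dest == b)
              meet &= dom[edges[e].src];
          unsigned updated = meet | (1u << b);
          if (updated != dom[b])
            {
              dom[b] = updated;
              changed = 1;
            }
        }
    }

  /* Strict dominators form a chain, so the immediate dominator is the
     strict dominator that itself has the most dominators.  */
  idom[0] = -1;
  for (int b = 1; b < nblocks; b++)
    {
      unsigned strict = dom[b] & ~(1u << b);
      int best = -1;
      for (int d = 0; d < nblocks; d++)
        if ((strict & (1u << d))
            && (best < 0 || count_bits (dom[d]) > count_bits (dom[best])))
          best = d;
      idom[b] = best;
    }
}

/* Before the change: BB1 is the throwing call block.  */
enum { BB1, BB2, BBEH, BB3, N_BEFORE };
static const struct edge before_edges[] = {
  { BB1, BB2 }, { BB1, BBEH }, { BB2, BB3 }, { BBEH, BB3 }
};

/* After the change (assumed shape): Bcond guards Bcall.  */
enum { BCOND, BCALL, NEW_BB2, NEW_BBEH, NEW_BB3, N_AFTER };
static const struct edge after_edges[] = {
  { BCOND, BCALL }, { BCOND, NEW_BB2 },
  { BCALL, NEW_BB2 }, { BCALL, NEW_BBEH },
  { NEW_BB2, NEW_BB3 }, { NEW_BBEH, NEW_BB3 }
};
```

Calling compute_idoms on before_edges (4 edges) gives BB1 as the immediate dominator of BB2, BBeh and BB3. On after_edges (6 edges) it gives Bcond for Bcall, BB2 and BB3, and Bcall for BBeh, which is the update described in the quoted paragraph.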
>>>>
>>>>> > Currently we have quite some of such passes (reassoc, forwprop,
>>>>> > lower_vector_ssa, cse_reciprocals, cse_sincos (sigh!), optimize_bswap
>>>>> > and others), but they are all handling only special situations in one
>>>>> > way or the other.  pass_fold_builtins is another one, but it seems
>>>>> > most related to what you want (replacing a call with something else),
>>>>> > so I thought that'd be the natural choice.
>>>>>
>>>>> Well, to be pedantic, it's not really replacing the call.  Except for
>>>>> the special case of targets that support direct assignments to errno,
>>>>> it keeps the original call but ensures that it isn't usually executed.
>>>>> From that point of view it doesn't really seem like a fold.
>>>>>
>>>>> But I suppose that's just naming again :-).  And it's easily solved with
>>>>> s/fold/rewrite/.
>>>>
>>>> Exactly, in my mind pass_fold_builtin (like many of the others I
>>>> mentioned) doesn't do folding but rewriting :)
>>>
>>> So I am replying here to the issue of where to do the transform call_cdce
>>> does and the one Richard wants to add.  For example we "lower"
>>> posix_memalign as early as GIMPLE lowering (that's before CFG construction).
>>> We also lower sincos to cexpi during GENERIC folding (or if that is dropped
>>> either GIMPLE lowering or GIMPLE folding during gimplification would be
>>> appropriate).
>>>
>>> Now, with offloading we have to avoid creating target dependencies before
>>> LTO stream-out (thus no IFN replacements before that - not sure if
>>> Richard's patches have an issue there already).
>>
>> No, this patch was the earliest point at which we converted to internal
>> functions.  The idea was to make code treat ECF_PURE built-in functions
>> and internal functions as being basically equivalent.  There's therefore
>> not much benefit to doing a straight replacement of one with the other
>> during either GENERIC or gimple.  Instead the series only used internal
>> functions for things that built-in functions couldn't do, specifically:
>>
>> - the case used in this patch, to optimise part of a non-pure built-in
>>   function using a pure equivalent.
>>
>> - vector versions of built-in functions.
>>
>> The cfgexpand patch makes sure that pure built-in functions are expanded
>> like internal functions where possible.
>>
>>> Which would leave us with a lowering stage early in the main
>>> optimization pipeline - I think fold_builtins pass is way too late but
>>> any "folding" pass will do (like forwprop or backprop where the latter
>>> might be better because it might end up computing FP "ranges" to
>>> improve the initial lowering code).
>>
>> This isn't at all related to what backprop is doing though.
>> backprop is about optimising definitions based on information
>> about all uses.
>>
>> Does fold_builtins need to be as late as it is?
>
> Not sure.  It folds remaining __builtin_constant_p stuff for example.

Ah, OK.  Can imagine that's quite position-sensitive.

>>> Of course call_cdce is as good as long as it still exists.
>>
>> Does this mean that you're not against the patch in principle
>> (i.e. keeping call_cdce for now and extending it in the way that
>> this patch does)?
>
> Yes, I'm fine with extending call_cdce.  Of course I'd happily
> approve a patch dissolving it into somewhere where it makes more
> sense.  But this shouldn't block this patch.

OK, thanks.  Here's an updated patch with the problems that Joseph
mentioned fixed, and rebased on top of the fix for PR 68264.

Tested on x86_64-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabi
and visium-elf (for the EDOM stuff).  OK to install?

Thanks,
Richard


gcc/
	* builtins.c (expand_errno_check, expand_builtin_mathfn)
	(expand_builtin_mathfn_2): Delete.
	(expand_builtin): Remove handling of functions with
	internal function equivalents.
	* internal-fn.def (SET_EDOM): New internal function.
	* internal-fn.h (set_edom_supported_p): Declare.
	* internal-fn.c (expand_SET_EDOM): New function.
	(set_edom_supported_p): Likewise.
	* tree-call-cdce.c: Include builtins.h and internal-fn.h.
	Rewrite comment at head of file.
	(is_call_dce_candidate): Rename to...
	(can_test_argument_range): ...this.  Don't check gimple_call_lhs
	or gimple_call_builtin_p here.
	(edom_only_function): New function.
	(shrink_wrap_one_built_in_call_with_conds): New function, split out
	from...
	(shrink_wrap_one_built_in_call): ...here.
	(can_use_internal_fn, use_internal_fn): New functions.
	(shrink_wrap_conditional_dead_built_in_calls): Call use_internal_fn
	for calls that have an lhs.
	(pass_call_cdce::gate): Remove optimize_function_for_speed_p check.
	(pass_call_cdce::execute): Skip blocks that are optimized for size.
	Check gimple_call_builtin_p here.  Use can_use_internal_fn for
	calls with an lhs.
	* opts.c (default_options_table): Enable -ftree-builtin-call-dce
	at -O and above.
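
For readers who want a source-level picture of what use_internal_fn is described as doing for calls that have an lhs, here is a rough sketch. It is not taken from the patch. model_ifn_sqrt stands in for an errno-free internal function, model_lib_sqrt stands in for the errno-setting library call, and both (along with the Newton iteration and the library_calls counter) are hypothetical helpers that exist only to make the example self-contained. The two wrappers correspond to the two ways of protecting the original call: an argument range check, or a NaN test on the result.

```c
#include <errno.h>

/* Hypothetical model of an errno-free internal function: it computes the
   value and never touches errno.  Only meant for small finite inputs.  */
static double
model_ifn_sqrt (double x)
{
  if (x != x || x < 0.0)
    {
      double zero = 0.0;
      return zero / zero;       /* NaN for NaN or negative input.  */
    }
  if (x == 0.0)
    return x;
  double r = x > 1.0 ? x : 1.0;
  for (int i = 0; i < 200; i++) /* Newton iteration, enough for a demo.  */
    r = 0.5 * (r + x / r);
  return r;
}

/* Counts how often the "real" call was executed.  */
static int library_calls;

/* Hypothetical model of the errno-setting library routine.  */
static double
model_lib_sqrt (double x)
{
  library_calls++;
  if (x < 0.0)
    errno = EDOM;
  return model_ifn_sqrt (x);
}

/* Guard the original call with an argument range check.  */
static double
sqrt_with_range_check (double x)
{
  double res = model_ifn_sqrt (x);
  if (x < 0.0)
    model_lib_sqrt (x);         /* Executed only so that errno gets set.  */
  return res;
}

/* Guard the original call with a NaN test on the computed result.  */
static double
sqrt_with_nan_check (double x)
{
  double res = model_ifn_sqrt (x);
  if (res != res)
    model_lib_sqrt (x);         /* Executed only so that errno gets set.  */
  return res;
}
```

With an in-range argument such as 4.0, neither wrapper reaches the modelled library call and errno is left alone. With -1.0 both return NaN and execute the call once, which sets errno to EDOM. The NaN variant has to wait for the result before it can test it, whereas the range check depends only on the argument.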

Comments

Richard Biener Nov. 17, 2015, 2:26 p.m. UTC | #1
On Tue, Nov 17, 2015 at 10:19 AM, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Fri, Nov 13, 2015 at 2:12 PM, Richard Sandiford
>> <richard.sandiford@arm.com> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Mon, Nov 9, 2015 at 10:03 PM, Michael Matz <matz@suse.de> wrote:
>>>>> Hi,
>>>>>
>>>>> On Mon, 9 Nov 2015, Richard Sandiford wrote:
>>>>>
>>>>>> +static bool
>>>>>> +can_use_internal_fn (gcall *call)
>>>>>> +{
>>>>>> +  /* Only replace calls that set errno.  */
>>>>>> +  if (!gimple_vdef (call))
>>>>>> +    return false;
>>>>>
>>>>> Oh, I managed to confuse this in my head while reading the patch.  So,
>>>>> hmm, you don't actually replace the builtin with an internal function
>>>>> (without the condition) under no-errno-math?  Does something else do that?
>>>>> Because otherwise that seems an unnecessary restriction?
>>>>>
>>>>>> >> r229916 fixed that for the non-EH case.
>>>>>> >
>>>>>> > Ah, missed it.  Even the EH case shouldn't be difficult.  If the
>>>>>> > original dominator of the EH destination was the call block it moves,
>>>>>> > otherwise it remains unchanged.
>>>>>>
>>>>>> The target of the edge is easy in itself, I agree, but that isn't
>>>>>> necessarily the only affected block, if the EH handler doesn't
>>>>>> exit or rethrow.
>>>>>
>>>>> You're worried the non-EH and the EH regions merge again, right?  Like so:
>>>>>
>>>>> before change:
>>>>>
>>>>>         BB1: throwing-call
>>>>>      fallthru/           \EH
>>>>>     BB2                   BBeh
>>>>>      |                   /....\ (stuff in EH-region)
>>>>>      |                     /    some path out of EH region
>>>>>      | /------------------/
>>>>>     BB3
>>>>>
>>>>> Here, BB3 must at least be dominated by BB1 (the throwing block), or by
>>>>> something further up (when there are other side-entries to the path
>>>>> BB2->BB3 or into the EH region).  When further up, nothing changes, when
>>>>> it's BB1, then it's afterwards dominated by the BB containing the
>>>>> condition.  So everything with idom==BB1 gets idom=Bcond, except for BBeh,
>>>>> which gets idom=Bcall.  Depending on how you split BB1, either Bcond or
>>>>> BBcall might still be BB1 and doesn't lead to changes in the dom tree.
>>>>>
>>>>>> > Currently we have quite some of such passes (reassoc, forwprop,
>>>>>> > lower_vector_ssa, cse_reciprocals, cse_sincos (sigh!), optimize_bswap
>>>>>> > and others), but they are all handling only special situations in one
>>>>>> > way or the other.  pass_fold_builtins is another one, but it seems
>>>>>> > most related to what you want (replacing a call with something else),
>>>>>> > so I thought that'd be the natural choice.
>>>>>>
>>>>>> Well, to be pedantic, it's not really replacing the call.  Except for
>>>>>> the special case of targets that support direct assignments to errno,
>>>>>> it keeps the original call but ensures that it isn't usually executed.
>>>>>> From that point of view it doesn't really seem like a fold.
>>>>>>
>>>>>> But I suppose that's just naming again :-).  And it's easily solved with
>>>>>> s/fold/rewrite/.
>>>>>
>>>>> Exactly, in my mind pass_fold_builtin (like many of the others I
>>>>> mentioned) doesn't do folding but rewriting :)
>>>>
>>>> So I am replying here to the issue of where to do the transform call_cdce
>>>> does and the one Richard wants to add.  For example we "lower"
>>>> posix_memalign as early as GIMPLE lowering (that's before CFG construction).
>>>> We also lower sincos to cexpi during GENERIC folding (or if that is dropped
>>>> either GIMPLE lowering or GIMPLE folding during gimplification would be
>>>> appropriate).
>>>>
>>>> Now, with offloading we have to avoid creating target dependencies before
>>>> LTO stream-out (thus no IFN replacements before that - not sure if
>>>> Richard's patches have an issue there already).
>>>
>>> No, this patch was the earliest point at which we converted to internal
>>> functions.  The idea was to make code treat ECF_PURE built-in functions
>>> and internal functions as being basically equivalent.  There's therefore
>>> not much benefit to doing a straight replacement of one with the other
>>> during either GENERIC or gimple.  Instead the series only used internal
>>> functions for things that built-in functions couldn't do, specifically:
>>>
>>> - the case used in this patch, to optimise part of a non-pure built-in
>>>   function using a pure equivalent.
>>>
>>> - vector versions of built-in functions.
>>>
>>> The cfgexpand patch makes sure that pure built-in functions are expanded
>>> like internal functions where possible.
>>>
>>>> Which would leave us with a lowering stage early in the main
>>>> optimization pipeline - I think fold_builtins pass is way too late but
>>>> any "folding" pass will do (like forwprop or backprop where the latter
>>>> might be better because it might end up computing FP "ranges" to
>>>> improve the initial lowering code).
>>>
>>> This isn't at all related to what backprop is doing though.
>>> backprop is about optimising definitions based on information
>>> about all uses.
>>>
>>> Does fold_builtins need to be as late as it is?
>>
>> Not sure.  It folds remaining __builtin_constant_p stuff for example.
>
> Ah, OK.  Can imagine that's quite position-sensitive.
>
>>>> Of course call_cdce is as good as long as it still exists.
>>>
>>> Does this mean that you're not against the patch in principle
>>> (i.e. keeping call_cdce for now and extending it in the way that
>>> this patch does)?
>>
>> Yes, I'm fine with extending call_cdce.  Of course I'd happily
>> approve a patch dissolving it into somewhere where it makes more
>> sense.  But this shouldn't block this patch.
>
> OK, thanks.  Here's an updated patch with the problems that Joseph
> mentioned fixed, and rebased on top of the fix for PR 68264.
>
> Tested on x86_64-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabi
> and visium-elf (for the EDOM stuff).  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
>         * builtins.c (expand_errno_check, expand_builtin_mathfn)
>         (expand_builtin_mathfn_2): Delete.
>         (expand_builtin): Remove handling of functions with
>         internal function equivalents.
>         * internal-fn.def (SET_EDOM): New internal function.
>         * internal-fn.h (set_edom_supported_p): Declare.
>         * internal-fn.c (expand_SET_EDOM): New function.
>         (set_edom_supported_p): Likewise.
>         * tree-call-cdce.c: Include builtins.h and internal-fn.h.
>         Rewrite comment at head of file.
>         (is_call_dce_candidate): Rename to...
>         (can_test_argument_range): ...this.  Don't check gimple_call_lhs
>         or gimple_call_builtin_p here.
>         (edom_only_function): New function.
>         (shrink_wrap_one_built_in_call_with_conds): New function, split out
>         from...
>         (shrink_wrap_one_built_in_call): ...here.
>         (can_use_internal_fn, use_internal_fn): New functions.
>         (shrink_wrap_conditional_dead_built_in_calls): Call use_internal_fn
>         for calls that have an lhs.
>         (pass_call_cdce::gate): Remove optimize_function_for_speed_p check.
>         (pass_call_cdce::execute): Skip blocks that are optimized for size.
>         Check gimple_call_builtin_p here.  Use can_use_internal_fn for
>         calls with an lhs.
>         * opts.c (default_options_table): Enable -ftree-builtin-call-dce
>         at -O and above.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index c422d0d..df5c493 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -101,9 +101,6 @@ static rtx expand_builtin_apply (rtx, rtx, rtx);
>  static void expand_builtin_return (rtx);
>  static enum type_class type_to_class (tree);
>  static rtx expand_builtin_classify_type (tree);
> -static void expand_errno_check (tree, rtx);
> -static rtx expand_builtin_mathfn (tree, rtx, rtx);
> -static rtx expand_builtin_mathfn_2 (tree, rtx, rtx);
>  static rtx expand_builtin_mathfn_3 (tree, rtx, rtx);
>  static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx);
>  static rtx expand_builtin_interclass_mathfn (tree, rtx);
> @@ -1972,286 +1969,6 @@ replacement_internal_fn (gcall *call)
>    return IFN_LAST;
>  }
>
> -/* If errno must be maintained, expand the RTL to check if the result,
> -   TARGET, of a built-in function call, EXP, is NaN, and if so set
> -   errno to EDOM.  */
> -
> -static void
> -expand_errno_check (tree exp, rtx target)
> -{
> -  rtx_code_label *lab = gen_label_rtx ();
> -
> -  /* Test the result; if it is NaN, set errno=EDOM because
> -     the argument was not in the domain.  */
> -  do_compare_rtx_and_jump (target, target, EQ, 0, GET_MODE (target),
> -                          NULL_RTX, NULL, lab,
> -                          /* The jump is very likely.  */
> -                          REG_BR_PROB_BASE - (REG_BR_PROB_BASE / 2000 - 1));
> -
> -#ifdef TARGET_EDOM
> -  /* If this built-in doesn't throw an exception, set errno directly.  */
> -  if (TREE_NOTHROW (TREE_OPERAND (CALL_EXPR_FN (exp), 0)))
> -    {
> -#ifdef GEN_ERRNO_RTX
> -      rtx errno_rtx = GEN_ERRNO_RTX;
> -#else
> -      rtx errno_rtx
> -         = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno"));
> -#endif
> -      emit_move_insn (errno_rtx,
> -                     gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx)));
> -      emit_label (lab);
> -      return;
> -    }
> -#endif
> -
> -  /* Make sure the library call isn't expanded as a tail call.  */
> -  CALL_EXPR_TAILCALL (exp) = 0;
> -
> -  /* We can't set errno=EDOM directly; let the library call do it.
> -     Pop the arguments right away in case the call gets deleted.  */
> -  NO_DEFER_POP;
> -  expand_call (exp, target, 0);
> -  OK_DEFER_POP;
> -  emit_label (lab);
> -}
> -
> -/* Expand a call to one of the builtin math functions (sqrt, exp, or log).
> -   Return NULL_RTX if a normal call should be emitted rather than expanding
> -   the function in-line.  EXP is the expression that is a call to the builtin
> -   function; if convenient, the result should be placed in TARGET.
> -   SUBTARGET may be used as the target for computing one of EXP's operands.  */
> -
> -static rtx
> -expand_builtin_mathfn (tree exp, rtx target, rtx subtarget)
> -{
> -  optab builtin_optab;
> -  rtx op0;
> -  rtx_insn *insns;
> -  tree fndecl = get_callee_fndecl (exp);
> -  machine_mode mode;
> -  bool errno_set = false;
> -  bool try_widening = false;
> -  tree arg;
> -
> -  if (!validate_arglist (exp, REAL_TYPE, VOID_TYPE))
> -    return NULL_RTX;
> -
> -  arg = CALL_EXPR_ARG (exp, 0);
> -
> -  switch (DECL_FUNCTION_CODE (fndecl))
> -    {
> -    CASE_FLT_FN (BUILT_IN_SQRT):
> -      errno_set = ! tree_expr_nonnegative_p (arg);
> -      try_widening = true;
> -      builtin_optab = sqrt_optab;
> -      break;
> -    CASE_FLT_FN (BUILT_IN_EXP):
> -      errno_set = true; builtin_optab = exp_optab; break;
> -    CASE_FLT_FN (BUILT_IN_EXP10):
> -    CASE_FLT_FN (BUILT_IN_POW10):
> -      errno_set = true; builtin_optab = exp10_optab; break;
> -    CASE_FLT_FN (BUILT_IN_EXP2):
> -      errno_set = true; builtin_optab = exp2_optab; break;
> -    CASE_FLT_FN (BUILT_IN_EXPM1):
> -      errno_set = true; builtin_optab = expm1_optab; break;
> -    CASE_FLT_FN (BUILT_IN_LOGB):
> -      errno_set = true; builtin_optab = logb_optab; break;
> -    CASE_FLT_FN (BUILT_IN_LOG):
> -      errno_set = true; builtin_optab = log_optab; break;
> -    CASE_FLT_FN (BUILT_IN_LOG10):
> -      errno_set = true; builtin_optab = log10_optab; break;
> -    CASE_FLT_FN (BUILT_IN_LOG2):
> -      errno_set = true; builtin_optab = log2_optab; break;
> -    CASE_FLT_FN (BUILT_IN_LOG1P):
> -      errno_set = true; builtin_optab = log1p_optab; break;
> -    CASE_FLT_FN (BUILT_IN_ASIN):
> -      builtin_optab = asin_optab; break;
> -    CASE_FLT_FN (BUILT_IN_ACOS):
> -      builtin_optab = acos_optab; break;
> -    CASE_FLT_FN (BUILT_IN_TAN):
> -      builtin_optab = tan_optab; break;
> -    CASE_FLT_FN (BUILT_IN_ATAN):
> -      builtin_optab = atan_optab; break;
> -    CASE_FLT_FN (BUILT_IN_FLOOR):
> -      builtin_optab = floor_optab; break;
> -    CASE_FLT_FN (BUILT_IN_CEIL):
> -      builtin_optab = ceil_optab; break;
> -    CASE_FLT_FN (BUILT_IN_TRUNC):
> -      builtin_optab = btrunc_optab; break;
> -    CASE_FLT_FN (BUILT_IN_ROUND):
> -      builtin_optab = round_optab; break;
> -    CASE_FLT_FN (BUILT_IN_NEARBYINT):
> -      builtin_optab = nearbyint_optab;
> -      if (flag_trapping_math)
> -       break;
> -      /* Else fallthrough and expand as rint.  */
> -    CASE_FLT_FN (BUILT_IN_RINT):
> -      builtin_optab = rint_optab; break;
> -    CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
> -      builtin_optab = significand_optab; break;
> -    default:
> -      gcc_unreachable ();
> -    }
> -
> -  /* Make a suitable register to place result in.  */
> -  mode = TYPE_MODE (TREE_TYPE (exp));
> -
> -  if (! flag_errno_math || ! HONOR_NANS (mode))
> -    errno_set = false;
> -
> -  /* Before working hard, check whether the instruction is available, but try
> -     to widen the mode for specific operations.  */
> -  if ((optab_handler (builtin_optab, mode) != CODE_FOR_nothing
> -       || (try_widening && !excess_precision_type (TREE_TYPE (exp))))
> -      && (!errno_set || !optimize_insn_for_size_p ()))
> -    {
> -      rtx result = gen_reg_rtx (mode);
> -
> -      /* Wrap the computation of the argument in a SAVE_EXPR, as we may
> -        need to expand the argument again.  This way, we will not perform
> -        side-effects more the once.  */
> -      CALL_EXPR_ARG (exp, 0) = arg = builtin_save_expr (arg);
> -
> -      op0 = expand_expr (arg, subtarget, VOIDmode, EXPAND_NORMAL);
> -
> -      start_sequence ();
> -
> -      /* Compute into RESULT.
> -        Set RESULT to wherever the result comes back.  */
> -      result = expand_unop (mode, builtin_optab, op0, result, 0);
> -
> -      if (result != 0)
> -       {
> -         if (errno_set)
> -           expand_errno_check (exp, result);
> -
> -         /* Output the entire sequence.  */
> -         insns = get_insns ();
> -         end_sequence ();
> -         emit_insn (insns);
> -         return result;
> -       }
> -
> -      /* If we were unable to expand via the builtin, stop the sequence
> -        (without outputting the insns) and call to the library function
> -        with the stabilized argument list.  */
> -      end_sequence ();
> -    }
> -
> -  return expand_call (exp, target, target == const0_rtx);
> -}
> -
> -/* Expand a call to the builtin binary math functions (pow and atan2).
> -   Return NULL_RTX if a normal call should be emitted rather than expanding the
> -   function in-line.  EXP is the expression that is a call to the builtin
> -   function; if convenient, the result should be placed in TARGET.
> -   SUBTARGET may be used as the target for computing one of EXP's
> -   operands.  */
> -
> -static rtx
> -expand_builtin_mathfn_2 (tree exp, rtx target, rtx subtarget)
> -{
> -  optab builtin_optab;
> -  rtx op0, op1, result;
> -  rtx_insn *insns;
> -  int op1_type = REAL_TYPE;
> -  tree fndecl = get_callee_fndecl (exp);
> -  tree arg0, arg1;
> -  machine_mode mode;
> -  bool errno_set = true;
> -
> -  switch (DECL_FUNCTION_CODE (fndecl))
> -    {
> -    CASE_FLT_FN (BUILT_IN_SCALBN):
> -    CASE_FLT_FN (BUILT_IN_SCALBLN):
> -    CASE_FLT_FN (BUILT_IN_LDEXP):
> -      op1_type = INTEGER_TYPE;
> -    default:
> -      break;
> -    }
> -
> -  if (!validate_arglist (exp, REAL_TYPE, op1_type, VOID_TYPE))
> -    return NULL_RTX;
> -
> -  arg0 = CALL_EXPR_ARG (exp, 0);
> -  arg1 = CALL_EXPR_ARG (exp, 1);
> -
> -  switch (DECL_FUNCTION_CODE (fndecl))
> -    {
> -    CASE_FLT_FN (BUILT_IN_POW):
> -      builtin_optab = pow_optab; break;
> -    CASE_FLT_FN (BUILT_IN_ATAN2):
> -      builtin_optab = atan2_optab; break;
> -    CASE_FLT_FN (BUILT_IN_SCALB):
> -      if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2)
> -       return 0;
> -      builtin_optab = scalb_optab; break;
> -    CASE_FLT_FN (BUILT_IN_SCALBN):
> -    CASE_FLT_FN (BUILT_IN_SCALBLN):
> -      if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2)
> -       return 0;
> -    /* Fall through... */
> -    CASE_FLT_FN (BUILT_IN_LDEXP):
> -      builtin_optab = ldexp_optab; break;
> -    CASE_FLT_FN (BUILT_IN_FMOD):
> -      builtin_optab = fmod_optab; break;
> -    CASE_FLT_FN (BUILT_IN_REMAINDER):
> -    CASE_FLT_FN (BUILT_IN_DREM):
> -      builtin_optab = remainder_optab; break;
> -    default:
> -      gcc_unreachable ();
> -    }
> -
> -  /* Make a suitable register to place result in.  */
> -  mode = TYPE_MODE (TREE_TYPE (exp));
> -
> -  /* Before working hard, check whether the instruction is available.  */
> -  if (optab_handler (builtin_optab, mode) == CODE_FOR_nothing)
> -    return NULL_RTX;
> -
> -  result = gen_reg_rtx (mode);
> -
> -  if (! flag_errno_math || ! HONOR_NANS (mode))
> -    errno_set = false;
> -
> -  if (errno_set && optimize_insn_for_size_p ())
> -    return 0;
> -
> -  /* Always stabilize the argument list.  */
> -  CALL_EXPR_ARG (exp, 0) = arg0 = builtin_save_expr (arg0);
> -  CALL_EXPR_ARG (exp, 1) = arg1 = builtin_save_expr (arg1);
> -
> -  op0 = expand_expr (arg0, subtarget, VOIDmode, EXPAND_NORMAL);
> -  op1 = expand_normal (arg1);
> -
> -  start_sequence ();
> -
> -  /* Compute into RESULT.
> -     Set RESULT to wherever the result comes back.  */
> -  result = expand_binop (mode, builtin_optab, op0, op1,
> -                        result, 0, OPTAB_DIRECT);
> -
> -  /* If we were unable to expand via the builtin, stop the sequence
> -     (without outputting the insns) and call to the library function
> -     with the stabilized argument list.  */
> -  if (result == 0)
> -    {
> -      end_sequence ();
> -      return expand_call (exp, target, target == const0_rtx);
> -    }
> -
> -  if (errno_set)
> -    expand_errno_check (exp, result);
> -
> -  /* Output the entire sequence.  */
> -  insns = get_insns ();
> -  end_sequence ();
> -  emit_insn (insns);
> -
> -  return result;
> -}
> -
>  /* Expand a call to the builtin trinary math functions (fma).
>     Return NULL_RTX if a normal call should be emitted rather than expanding the
>     function in-line.  EXP is the expression that is a call to the builtin
> @@ -5984,37 +5701,6 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
>      CASE_FLT_FN (BUILT_IN_CABS):
>        break;
>
> -    CASE_FLT_FN (BUILT_IN_EXP):
> -    CASE_FLT_FN (BUILT_IN_EXP10):
> -    CASE_FLT_FN (BUILT_IN_POW10):
> -    CASE_FLT_FN (BUILT_IN_EXP2):
> -    CASE_FLT_FN (BUILT_IN_EXPM1):
> -    CASE_FLT_FN (BUILT_IN_LOGB):
> -    CASE_FLT_FN (BUILT_IN_LOG):
> -    CASE_FLT_FN (BUILT_IN_LOG10):
> -    CASE_FLT_FN (BUILT_IN_LOG2):
> -    CASE_FLT_FN (BUILT_IN_LOG1P):
> -    CASE_FLT_FN (BUILT_IN_TAN):
> -    CASE_FLT_FN (BUILT_IN_ASIN):
> -    CASE_FLT_FN (BUILT_IN_ACOS):
> -    CASE_FLT_FN (BUILT_IN_ATAN):
> -    CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
> -      /* Treat these like sqrt only if unsafe math optimizations are allowed,
> -        because of possible accuracy problems.  */
> -      if (! flag_unsafe_math_optimizations)
> -       break;
> -    CASE_FLT_FN (BUILT_IN_SQRT):
> -    CASE_FLT_FN (BUILT_IN_FLOOR):
> -    CASE_FLT_FN (BUILT_IN_CEIL):
> -    CASE_FLT_FN (BUILT_IN_TRUNC):
> -    CASE_FLT_FN (BUILT_IN_ROUND):
> -    CASE_FLT_FN (BUILT_IN_NEARBYINT):
> -    CASE_FLT_FN (BUILT_IN_RINT):
> -      target = expand_builtin_mathfn (exp, target, subtarget);
> -      if (target)
> -       return target;
> -      break;
> -
>      CASE_FLT_FN (BUILT_IN_FMA):
>        target = expand_builtin_mathfn_ternary (exp, target, subtarget);
>        if (target)
> @@ -6061,23 +5747,6 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
>         return target;
>        break;
>
> -    CASE_FLT_FN (BUILT_IN_ATAN2):
> -    CASE_FLT_FN (BUILT_IN_LDEXP):
> -    CASE_FLT_FN (BUILT_IN_SCALB):
> -    CASE_FLT_FN (BUILT_IN_SCALBN):
> -    CASE_FLT_FN (BUILT_IN_SCALBLN):
> -      if (! flag_unsafe_math_optimizations)
> -       break;
> -
> -    CASE_FLT_FN (BUILT_IN_FMOD):
> -    CASE_FLT_FN (BUILT_IN_REMAINDER):
> -    CASE_FLT_FN (BUILT_IN_DREM):
> -    CASE_FLT_FN (BUILT_IN_POW):
> -      target = expand_builtin_mathfn_2 (exp, target, subtarget);
> -      if (target)
> -       return target;
> -      break;
> -
>      CASE_FLT_FN (BUILT_IN_CEXPI):
>        target = expand_builtin_cexpi (exp, target);
>        gcc_assert (target);
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index f23d799..06c5d9e 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -2073,6 +2073,24 @@ expand_GOACC_REDUCTION (internal_fn, gcall *)
>    gcc_unreachable ();
>  }
>
> +/* Set errno to EDOM.  */
> +
> +static void
> +expand_SET_EDOM (internal_fn, gcall *)
> +{
> +#ifdef TARGET_EDOM
> +#ifdef GEN_ERRNO_RTX
> +  rtx errno_rtx = GEN_ERRNO_RTX;
> +#else
> +  rtx errno_rtx = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno"));
> +#endif
> +  emit_move_insn (errno_rtx,
> +                 gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx)));
> +#else
> +  gcc_unreachable ();
> +#endif
> +}
> +
>  /* Expand a call to FN using the operands in STMT.  FN has a single
>     output operand and NARGS input operands.  */
>
> @@ -2217,6 +2235,18 @@ direct_internal_fn_supported_p (internal_fn fn, tree type)
>    return direct_internal_fn_supported_p (fn, tree_pair (type, type));
>  }
>
> +/* Return true if IFN_SET_EDOM is supported.  */
> +
> +bool
> +set_edom_supported_p (void)
> +{
> +#ifdef TARGET_EDOM
> +  return true;
> +#else
> +  return false;
> +#endif
> +}
> +
>  #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
>    static void                                          \
>    expand_##CODE (internal_fn fn, gcall *stmt)          \
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index bf8047a..825dba1 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -181,6 +181,10 @@ DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL)
>  /* OpenACC reduction abstraction.  See internal-fn.h  for usage.  */
>  DEF_INTERNAL_FN (GOACC_REDUCTION, ECF_NOTHROW | ECF_LEAF, NULL)
>
> +/* Set errno to EDOM, if GCC knows how to do that directly for the
> +   current target.  */
> +DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | ECF_NOTHROW, NULL)
> +
>  #undef DEF_INTERNAL_INT_FN
>  #undef DEF_INTERNAL_FLT_FN
>  #undef DEF_INTERNAL_OPTAB_FN
> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> index 5ee43b8..6cb123f 100644
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -160,6 +160,7 @@ extern tree_pair direct_internal_fn_types (internal_fn, tree, tree *);
>  extern tree_pair direct_internal_fn_types (internal_fn, gcall *);
>  extern bool direct_internal_fn_supported_p (internal_fn, tree_pair);
>  extern bool direct_internal_fn_supported_p (internal_fn, tree);
> +extern bool set_edom_supported_p (void);
>
>  extern void expand_internal_call (gcall *);
>  extern void expand_internal_call (internal_fn, gcall *);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index be04cf5..4345cc8 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -478,6 +478,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fmove_loop_invariants, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_pta, NULL, 1 },
>      { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fssa_phiopt, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 },
>
>      /* -O2 optimizations.  */
>      { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 },
> @@ -503,7 +504,6 @@ static const struct default_options default_options_table[] =
>        REORDER_BLOCKS_ALGORITHM_STC },
>      { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
> -    { OPT_LEVELS_2_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
> diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
> index fbcc70b..75ef180 100644
> --- a/gcc/tree-call-cdce.c
> +++ b/gcc/tree-call-cdce.c
> @@ -33,46 +33,77 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-iterator.h"
>  #include "tree-cfg.h"
>  #include "tree-into-ssa.h"
> +#include "builtins.h"
> +#include "internal-fn.h"
>
>
> -/* Conditional dead call elimination
> +/* This pass serves two closely-related purposes:
>
> -   Some builtin functions can set errno on error conditions, but they
> -   are otherwise pure.  If the result of a call to such a function is
> -   not used, the compiler can still not eliminate the call without
> -   powerful interprocedural analysis to prove that the errno is not
> -   checked.  However, if the conditions under which the error occurs
> -   are known, the compiler can conditionally dead code eliminate the
> -   calls by shrink-wrapping the semi-dead calls into the error condition:
> +   1. It conditionally executes calls that set errno if (a) the result of
> +      the call is unused and (b) a simple range check on the arguments can
> +      detect most cases where errno does not need to be set.
>
> -        built_in_call (args)
> -          ==>
> -        if (error_cond (args))
> -             built_in_call (args)
> +      This is the "conditional dead-code elimination" that gave the pass
> +      its original name, since the call is dead for most argument values.
> +      The calls for which it helps are usually part of the C++ abstraction
> +      penalty exposed after inlining.
> +
> +   2. It looks for calls to built-in functions that set errno and whose
> +      result is used.  It checks whether there is an associated internal
> +      function that doesn't set errno and whether the target supports
> +      that internal function.  If so, the pass uses the internal function
> +      to compute the result of the built-in function but still arranges
> +      for errno to be set when necessary.  There are two ways of setting
> +      errno:
> +
> +      a. by protecting the original call with the same argument checks as (1)
> +
> +      b. by protecting the original call with a check that the result
> +        of the internal function is not equal to itself (i.e. is NaN).
> +
> +      (b) requires that NaNs are the only erroneous results.  It is not
> +      appropriate for functions like log, which returns ERANGE for zero
> +      arguments.  (b) is also likely to perform worse than (a) because it
> +      requires the result to be calculated first.  The pass therefore uses
> +      (a) when it can and uses (b) as a fallback.
> +
> +      For (b) the pass can replace the original call with a call to
> +      IFN_SET_EDOM, if the target supports direct assignments to errno.
> +
> +   In both cases, arguments that require errno to be set should occur
> +   rarely in practice.  Checks of the errno result should also be rare,
> +   but the compiler would need powerful interprocedural analysis to
> +   prove that errno is not checked.  It's much easier to add argument
> +   checks or result checks instead.
> +
> +     An example of (1) is:
>
> -    An actual simple example is :
>          log (x);   // Mostly dead call
>       ==>
>          if (__builtin_islessequal (x, 0))
>              log (x);
>
>       With this change, call to log (x) is effectively eliminated, as
> -     in majority of the cases, log won't be called with x out of
> +     in the majority of the cases, log won't be called with x out of
>       range.  The branch is totally predictable, so the branch cost
>       is low.
>
> +     An example of (2) is:
> +
> +       y = sqrt (x);
> +     ==>
> +       y = IFN_SQRT (x);
> +       if (__builtin_isless (x, 0))
> +           sqrt (x);
> +
> +     In the vast majority of cases we should then never need to call sqrt.
> +
>     Note that library functions are not supposed to clear errno to zero without
>     error.  See IEEE Std 1003.1, section 2.3 Error Numbers, and section 7.5:3 of
>     ISO/IEC 9899 (C99).
>
>     The condition wrapping the builtin call is conservatively set to avoid too
> -   aggressive (wrong) shrink wrapping.  The optimization is called conditional
> -   dead call elimination because the call is eliminated under the condition
> -   that the input arguments would not lead to domain or range error (for
> -   instance when x <= 0 for a log (x) call), however the chances that the error
> -   condition is hit is very low (those builtin calls which are conditionally
> -   dead are usually part of the C++ abstraction penalty exposed after
> -   inlining).  */
> +   aggressive (wrong) shrink wrapping.  */
>
>
>  /* A structure for representing input domain of
> @@ -251,28 +282,15 @@ check_builtin_call (gcall *bcall)
>    return check_target_format (arg);
>  }
>
> -/* A helper function to determine if a builtin function call is a
> -   candidate for conditional DCE.  Returns true if the builtin call
> -   is a candidate.  */
> +/* Return true if built-in function call CALL calls a math function
> +   and if we know how to test the range of its arguments to detect _most_
> +   situations in which errno is not set.  The test must err on the side
> +   of treating non-erroneous values as potentially erroneous.  */
>
>  static bool
> -is_call_dce_candidate (gcall *call)
> +can_test_argument_range (gcall *call)
>  {
> -  tree fn;
> -  enum built_in_function fnc;
> -
> -  /* Only potentially dead calls are considered.  */
> -  if (gimple_call_lhs (call))
> -    return false;
> -
> -  fn = gimple_call_fndecl (call);
> -  if (!fn
> -      || !DECL_BUILT_IN (fn)
> -      || (DECL_BUILT_IN_CLASS (fn) != BUILT_IN_NORMAL))
> -    return false;
> -
> -  fnc = DECL_FUNCTION_CODE (fn);
> -  switch (fnc)
> +  switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call)))
>      {
>      /* Trig functions.  */
>      CASE_FLT_FN (BUILT_IN_ACOS):
> @@ -306,6 +324,31 @@ is_call_dce_candidate (gcall *call)
>    return false;
>  }
>
> +/* Return true if CALL can produce a domain error (EDOM) but can never
> +   produce a pole, range overflow or range underflow error (all ERANGE).
> +   This means that we can tell whether a function would have set errno
> +   by testing whether the result is a NaN.  */
> +
> +static bool
> +edom_only_function (gcall *call)
> +{
> +  switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call)))
> +    {
> +    CASE_FLT_FN (BUILT_IN_ACOS):
> +    CASE_FLT_FN (BUILT_IN_ASIN):
> +    CASE_FLT_FN (BUILT_IN_ATAN):
> +    CASE_FLT_FN (BUILT_IN_COS):
> +    CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
> +    CASE_FLT_FN (BUILT_IN_SIN):
> +    CASE_FLT_FN (BUILT_IN_SQRT):
> +    CASE_FLT_FN (BUILT_IN_FMOD):
> +    CASE_FLT_FN (BUILT_IN_REMAINDER):
> +      return true;
> +
> +    default:
> +      return false;
> +    }
> +}
>
>  /* A helper function to generate gimple statements for one bound
>     comparison, so that the built-in function is called whenever
> @@ -703,33 +746,24 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds,
>  /* Probability of the branch (to the call) is taken.  */
>  #define ERR_PROB 0.01
>
> -/* The function to shrink wrap a partially dead builtin call
> -   whose return value is not used anywhere, but has to be kept
> -   live due to potential error condition.  Returns true if the
> -   transformation actually happens.  */
> +/* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS
> +   conditions in CONDS is false.
> +
> +   Return true on success, in which case the cfg will have been updated.  */
>
>  static bool
> -shrink_wrap_one_built_in_call (gcall *bi_call)
> +shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds,
> +                                         unsigned int nconds)
>  {
>    gimple_stmt_iterator bi_call_bsi;
>    basic_block bi_call_bb, join_tgt_bb, guard_bb;
>    edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru;
>    edge bi_call_in_edge0, guard_bb_in_edge;
> -  unsigned tn_cond_stmts, nconds;
> +  unsigned tn_cond_stmts;
>    unsigned ci;
>    gimple *cond_expr = NULL;
>    gimple *cond_expr_start;
>
> -  auto_vec<gimple *, 12> conds;
> -  gen_shrink_wrap_conditions (bi_call, conds, &nconds);
> -
> -  /* This can happen if the condition generator decides
> -     it is not beneficial to do the transformation.  Just
> -     return false and do not do any transformation for
> -     the call.  */
> -  if (nconds == 0)
> -    return false;
> -
>    /* The cfg we want to create looks like this:
>
>            [guard n-1]         <- guard_bb (old block)
> @@ -868,6 +902,117 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
>    return true;
>  }
>
> +/* Shrink-wrap BI_CALL so that it is only called when it might set errno
> +   (but is always called if it would set errno).
> +
> +   Return true on success, in which case the cfg will have been updated.  */
> +
> +static bool
> +shrink_wrap_one_built_in_call (gcall *bi_call)
> +{
> +  unsigned nconds = 0;
> +  auto_vec<gimple *, 12> conds;
> +  gen_shrink_wrap_conditions (bi_call, conds, &nconds);
> +  /* This can happen if the condition generator decides
> +     it is not beneficial to do the transformation.  Just
> +     return false and do not do any transformation for
> +     the call.  */
> +  if (nconds == 0)
> +    return false;
> +  return shrink_wrap_one_built_in_call_with_conds (bi_call, conds, nconds);
> +}
> +
> +/* Return true if built-in function call CALL could be implemented using
> +   a combination of an internal function to compute the result and a
> +   separate call to set errno.  */
> +
> +static bool
> +can_use_internal_fn (gcall *call)
> +{
> +  /* Only replace calls that set errno.  */
> +  if (!gimple_vdef (call))
> +    return false;
> +
> +  /* Punt if we can't conditionalize the call.  */
> +  basic_block bb = gimple_bb (call);
> +  if (stmt_ends_bb_p (call) && !find_fallthru_edge (bb->succs))
> +    return false;
> +
> +  /* See whether there is an internal function for this built-in.  */
> +  if (replacement_internal_fn (call) == IFN_LAST)
> +    return false;
> +
> +  /* See whether we can catch all cases where errno would be set,
> +     while still avoiding the call in most cases.  */
> +  if (!can_test_argument_range (call)
> +      && !edom_only_function (call))
> +    return false;
> +
> +  return true;
> +}
> +
> +/* Implement built-in function call CALL using an internal function.
> +   Return true on success, in which case the cfg will have changed.  */
> +
> +static bool
> +use_internal_fn (gcall *call)
> +{
> +  unsigned nconds = 0;
> +  auto_vec<gimple *, 12> conds;
> +  gen_shrink_wrap_conditions (call, conds, &nconds);
> +  if (nconds == 0 && !edom_only_function (call))
> +    return false;
> +
> +  internal_fn ifn = replacement_internal_fn (call);
> +  gcc_assert (ifn != IFN_LAST);
> +
> +  /* Construct the new call, with the same arguments as the original one.  */
> +  auto_vec <tree, 16> args;
> +  unsigned int nargs = gimple_call_num_args (call);
> +  for (unsigned int i = 0; i < nargs; ++i)
> +    args.safe_push (gimple_call_arg (call, i));
> +  gcall *new_call = gimple_build_call_internal_vec (ifn, args);
> +  gimple_set_location (new_call, gimple_location (call));
> +
> +  /* Transfer the LHS to the new call.  */
> +  tree lhs = gimple_call_lhs (call);
> +  gimple_call_set_lhs (new_call, lhs);
> +  gimple_call_set_lhs (call, NULL_TREE);
> +  SSA_NAME_DEF_STMT (lhs) = new_call;
> +
> +  /* Insert the new call.  */
> +  gimple_stmt_iterator gsi = gsi_for_stmt (call);
> +  gsi_insert_before (&gsi, new_call, GSI_SAME_STMT);
> +
> +  if (nconds == 0)
> +    {
> +      /* Skip the call if LHS == LHS.  If we reach here, EDOM is the only
> +        valid errno value and it is used iff the result is NaN.  */
> +      conds.quick_push (gimple_build_cond (EQ_EXPR, lhs, lhs,
> +                                          NULL_TREE, NULL_TREE));
> +      nconds++;
> +
> +      /* Try replacing the original call with a direct assignment to
> +        errno, via an internal function.  */
> +      if (set_edom_supported_p () && !stmt_ends_bb_p (call))
> +       {
> +         gimple_stmt_iterator gsi = gsi_for_stmt (call);
> +         gcall *new_call = gimple_build_call_internal (IFN_SET_EDOM, 0);
> +         gimple_set_vuse (new_call, gimple_vuse (call));
> +         gimple_set_vdef (new_call, gimple_vdef (call));
> +         SSA_NAME_DEF_STMT (gimple_vdef (new_call)) = new_call;
> +         gimple_set_location (new_call, gimple_location (call));
> +         gsi_replace (&gsi, new_call, false);
> +         call = new_call;
> +       }
> +    }
> +
> +  if (!shrink_wrap_one_built_in_call_with_conds (call, conds, nconds))
> +    /* It's too late to back out now.  */
> +    gcc_unreachable ();
> +  return true;
> +}
> +
>  /* The top level function for conditional dead code shrink
>     wrapping transformation.  */
>
> @@ -884,7 +1029,10 @@ shrink_wrap_conditional_dead_built_in_calls (vec<gcall *> calls)
>    for (; i < n ; i++)
>      {
>        gcall *bi_call = calls[i];
> -      changed |= shrink_wrap_one_built_in_call (bi_call);
> +      if (gimple_call_lhs (bi_call))
> +       changed |= use_internal_fn (bi_call);
> +      else
> +       changed |= shrink_wrap_one_built_in_call (bi_call);
>      }
>
>    return changed;
> @@ -913,13 +1061,12 @@ public:
>    {}
>
>    /* opt_pass methods: */
> -  virtual bool gate (function *fun)
> +  virtual bool gate (function *)
>      {
>        /* The limit constants used in the implementation
>          assume IEEE floating point format.  Other formats
>          can be supported in the future if needed.  */
> -      return flag_tree_builtin_call_dce != 0
> -               && optimize_function_for_speed_p (fun);
> +      return flag_tree_builtin_call_dce != 0;
>      }
>
>    virtual unsigned int execute (function *);
> @@ -935,11 +1082,20 @@ pass_call_cdce::execute (function *fun)
>    auto_vec<gcall *> cond_dead_built_in_calls;
>    FOR_EACH_BB_FN (bb, fun)
>      {
> +      /* Skip blocks that are being optimized for size, since our
> +        transformation always increases code size.  */
> +      if (optimize_bb_for_size_p (bb))
> +       continue;
> +
>        /* Collect dead call candidates.  */
>        for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
>          {
>           gcall *stmt = dyn_cast <gcall *> (gsi_stmt (i));
> -          if (stmt && is_call_dce_candidate (stmt))
> +          if (stmt
> +             && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
> +             && (gimple_call_lhs (stmt)
> +                 ? can_use_internal_fn (stmt)
> +                 : can_test_argument_range (stmt)))
>              {
>                if (dump_file && (dump_flags & TDF_DETAILS))
>                  {
>

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c422d0d..df5c493 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -101,9 +101,6 @@  static rtx expand_builtin_apply (rtx, rtx, rtx);
 static void expand_builtin_return (rtx);
 static enum type_class type_to_class (tree);
 static rtx expand_builtin_classify_type (tree);
-static void expand_errno_check (tree, rtx);
-static rtx expand_builtin_mathfn (tree, rtx, rtx);
-static rtx expand_builtin_mathfn_2 (tree, rtx, rtx);
 static rtx expand_builtin_mathfn_3 (tree, rtx, rtx);
 static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx);
 static rtx expand_builtin_interclass_mathfn (tree, rtx);
@@ -1972,286 +1969,6 @@  replacement_internal_fn (gcall *call)
   return IFN_LAST;
 }
 
-/* If errno must be maintained, expand the RTL to check if the result,
-   TARGET, of a built-in function call, EXP, is NaN, and if so set
-   errno to EDOM.  */
-
-static void
-expand_errno_check (tree exp, rtx target)
-{
-  rtx_code_label *lab = gen_label_rtx ();
-
-  /* Test the result; if it is NaN, set errno=EDOM because
-     the argument was not in the domain.  */
-  do_compare_rtx_and_jump (target, target, EQ, 0, GET_MODE (target),
-			   NULL_RTX, NULL, lab,
-			   /* The jump is very likely.  */
-			   REG_BR_PROB_BASE - (REG_BR_PROB_BASE / 2000 - 1));
-
-#ifdef TARGET_EDOM
-  /* If this built-in doesn't throw an exception, set errno directly.  */
-  if (TREE_NOTHROW (TREE_OPERAND (CALL_EXPR_FN (exp), 0)))
-    {
-#ifdef GEN_ERRNO_RTX
-      rtx errno_rtx = GEN_ERRNO_RTX;
-#else
-      rtx errno_rtx
-	  = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno"));
-#endif
-      emit_move_insn (errno_rtx,
-		      gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx)));
-      emit_label (lab);
-      return;
-    }
-#endif
-
-  /* Make sure the library call isn't expanded as a tail call.  */
-  CALL_EXPR_TAILCALL (exp) = 0;
-
-  /* We can't set errno=EDOM directly; let the library call do it.
-     Pop the arguments right away in case the call gets deleted.  */
-  NO_DEFER_POP;
-  expand_call (exp, target, 0);
-  OK_DEFER_POP;
-  emit_label (lab);
-}
-
-/* Expand a call to one of the builtin math functions (sqrt, exp, or log).
-   Return NULL_RTX if a normal call should be emitted rather than expanding
-   the function in-line.  EXP is the expression that is a call to the builtin
-   function; if convenient, the result should be placed in TARGET.
-   SUBTARGET may be used as the target for computing one of EXP's operands.  */
-
-static rtx
-expand_builtin_mathfn (tree exp, rtx target, rtx subtarget)
-{
-  optab builtin_optab;
-  rtx op0;
-  rtx_insn *insns;
-  tree fndecl = get_callee_fndecl (exp);
-  machine_mode mode;
-  bool errno_set = false;
-  bool try_widening = false;
-  tree arg;
-
-  if (!validate_arglist (exp, REAL_TYPE, VOID_TYPE))
-    return NULL_RTX;
-
-  arg = CALL_EXPR_ARG (exp, 0);
-
-  switch (DECL_FUNCTION_CODE (fndecl))
-    {
-    CASE_FLT_FN (BUILT_IN_SQRT):
-      errno_set = ! tree_expr_nonnegative_p (arg);
-      try_widening = true;
-      builtin_optab = sqrt_optab;
-      break;
-    CASE_FLT_FN (BUILT_IN_EXP):
-      errno_set = true; builtin_optab = exp_optab; break;
-    CASE_FLT_FN (BUILT_IN_EXP10):
-    CASE_FLT_FN (BUILT_IN_POW10):
-      errno_set = true; builtin_optab = exp10_optab; break;
-    CASE_FLT_FN (BUILT_IN_EXP2):
-      errno_set = true; builtin_optab = exp2_optab; break;
-    CASE_FLT_FN (BUILT_IN_EXPM1):
-      errno_set = true; builtin_optab = expm1_optab; break;
-    CASE_FLT_FN (BUILT_IN_LOGB):
-      errno_set = true; builtin_optab = logb_optab; break;
-    CASE_FLT_FN (BUILT_IN_LOG):
-      errno_set = true; builtin_optab = log_optab; break;
-    CASE_FLT_FN (BUILT_IN_LOG10):
-      errno_set = true; builtin_optab = log10_optab; break;
-    CASE_FLT_FN (BUILT_IN_LOG2):
-      errno_set = true; builtin_optab = log2_optab; break;
-    CASE_FLT_FN (BUILT_IN_LOG1P):
-      errno_set = true; builtin_optab = log1p_optab; break;
-    CASE_FLT_FN (BUILT_IN_ASIN):
-      builtin_optab = asin_optab; break;
-    CASE_FLT_FN (BUILT_IN_ACOS):
-      builtin_optab = acos_optab; break;
-    CASE_FLT_FN (BUILT_IN_TAN):
-      builtin_optab = tan_optab; break;
-    CASE_FLT_FN (BUILT_IN_ATAN):
-      builtin_optab = atan_optab; break;
-    CASE_FLT_FN (BUILT_IN_FLOOR):
-      builtin_optab = floor_optab; break;
-    CASE_FLT_FN (BUILT_IN_CEIL):
-      builtin_optab = ceil_optab; break;
-    CASE_FLT_FN (BUILT_IN_TRUNC):
-      builtin_optab = btrunc_optab; break;
-    CASE_FLT_FN (BUILT_IN_ROUND):
-      builtin_optab = round_optab; break;
-    CASE_FLT_FN (BUILT_IN_NEARBYINT):
-      builtin_optab = nearbyint_optab;
-      if (flag_trapping_math)
-	break;
-      /* Else fallthrough and expand as rint.  */
-    CASE_FLT_FN (BUILT_IN_RINT):
-      builtin_optab = rint_optab; break;
-    CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
-      builtin_optab = significand_optab; break;
-    default:
-      gcc_unreachable ();
-    }
-
-  /* Make a suitable register to place result in.  */
-  mode = TYPE_MODE (TREE_TYPE (exp));
-
-  if (! flag_errno_math || ! HONOR_NANS (mode))
-    errno_set = false;
-
-  /* Before working hard, check whether the instruction is available, but try
-     to widen the mode for specific operations.  */
-  if ((optab_handler (builtin_optab, mode) != CODE_FOR_nothing
-       || (try_widening && !excess_precision_type (TREE_TYPE (exp))))
-      && (!errno_set || !optimize_insn_for_size_p ()))
-    {
-      rtx result = gen_reg_rtx (mode);
-
-      /* Wrap the computation of the argument in a SAVE_EXPR, as we may
-	 need to expand the argument again.  This way, we will not perform
-	 side-effects more the once.  */
-      CALL_EXPR_ARG (exp, 0) = arg = builtin_save_expr (arg);
-
-      op0 = expand_expr (arg, subtarget, VOIDmode, EXPAND_NORMAL);
-
-      start_sequence ();
-
-      /* Compute into RESULT.
-	 Set RESULT to wherever the result comes back.  */
-      result = expand_unop (mode, builtin_optab, op0, result, 0);
-
-      if (result != 0)
-	{
-	  if (errno_set)
-	    expand_errno_check (exp, result);
-
-	  /* Output the entire sequence.  */
-	  insns = get_insns ();
-	  end_sequence ();
-	  emit_insn (insns);
-	  return result;
-	}
-
-      /* If we were unable to expand via the builtin, stop the sequence
-	 (without outputting the insns) and call to the library function
-	 with the stabilized argument list.  */
-      end_sequence ();
-    }
-
-  return expand_call (exp, target, target == const0_rtx);
-}
-
-/* Expand a call to the builtin binary math functions (pow and atan2).
-   Return NULL_RTX if a normal call should be emitted rather than expanding the
-   function in-line.  EXP is the expression that is a call to the builtin
-   function; if convenient, the result should be placed in TARGET.
-   SUBTARGET may be used as the target for computing one of EXP's
-   operands.  */
-
-static rtx
-expand_builtin_mathfn_2 (tree exp, rtx target, rtx subtarget)
-{
-  optab builtin_optab;
-  rtx op0, op1, result;
-  rtx_insn *insns;
-  int op1_type = REAL_TYPE;
-  tree fndecl = get_callee_fndecl (exp);
-  tree arg0, arg1;
-  machine_mode mode;
-  bool errno_set = true;
-
-  switch (DECL_FUNCTION_CODE (fndecl))
-    {
-    CASE_FLT_FN (BUILT_IN_SCALBN):
-    CASE_FLT_FN (BUILT_IN_SCALBLN):
-    CASE_FLT_FN (BUILT_IN_LDEXP):
-      op1_type = INTEGER_TYPE;
-    default:
-      break;
-    }
-
-  if (!validate_arglist (exp, REAL_TYPE, op1_type, VOID_TYPE))
-    return NULL_RTX;
-
-  arg0 = CALL_EXPR_ARG (exp, 0);
-  arg1 = CALL_EXPR_ARG (exp, 1);
-
-  switch (DECL_FUNCTION_CODE (fndecl))
-    {
-    CASE_FLT_FN (BUILT_IN_POW):
-      builtin_optab = pow_optab; break;
-    CASE_FLT_FN (BUILT_IN_ATAN2):
-      builtin_optab = atan2_optab; break;
-    CASE_FLT_FN (BUILT_IN_SCALB):
-      if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2)
-	return 0;
-      builtin_optab = scalb_optab; break;
-    CASE_FLT_FN (BUILT_IN_SCALBN):
-    CASE_FLT_FN (BUILT_IN_SCALBLN):
-      if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2)
-	return 0;
-    /* Fall through... */
-    CASE_FLT_FN (BUILT_IN_LDEXP):
-      builtin_optab = ldexp_optab; break;
-    CASE_FLT_FN (BUILT_IN_FMOD):
-      builtin_optab = fmod_optab; break;
-    CASE_FLT_FN (BUILT_IN_REMAINDER):
-    CASE_FLT_FN (BUILT_IN_DREM):
-      builtin_optab = remainder_optab; break;
-    default:
-      gcc_unreachable ();
-    }
-
-  /* Make a suitable register to place result in.  */
-  mode = TYPE_MODE (TREE_TYPE (exp));
-
-  /* Before working hard, check whether the instruction is available.  */
-  if (optab_handler (builtin_optab, mode) == CODE_FOR_nothing)
-    return NULL_RTX;
-
-  result = gen_reg_rtx (mode);
-
-  if (! flag_errno_math || ! HONOR_NANS (mode))
-    errno_set = false;
-
-  if (errno_set && optimize_insn_for_size_p ())
-    return 0;
-
-  /* Always stabilize the argument list.  */
-  CALL_EXPR_ARG (exp, 0) = arg0 = builtin_save_expr (arg0);
-  CALL_EXPR_ARG (exp, 1) = arg1 = builtin_save_expr (arg1);
-
-  op0 = expand_expr (arg0, subtarget, VOIDmode, EXPAND_NORMAL);
-  op1 = expand_normal (arg1);
-
-  start_sequence ();
-
-  /* Compute into RESULT.
-     Set RESULT to wherever the result comes back.  */
-  result = expand_binop (mode, builtin_optab, op0, op1,
-			 result, 0, OPTAB_DIRECT);
-
-  /* If we were unable to expand via the builtin, stop the sequence
-     (without outputting the insns) and call to the library function
-     with the stabilized argument list.  */
-  if (result == 0)
-    {
-      end_sequence ();
-      return expand_call (exp, target, target == const0_rtx);
-    }
-
-  if (errno_set)
-    expand_errno_check (exp, result);
-
-  /* Output the entire sequence.  */
-  insns = get_insns ();
-  end_sequence ();
-  emit_insn (insns);
-
-  return result;
-}
-
 /* Expand a call to the builtin trinary math functions (fma).
    Return NULL_RTX if a normal call should be emitted rather than expanding the
    function in-line.  EXP is the expression that is a call to the builtin
@@ -5984,37 +5701,6 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
     CASE_FLT_FN (BUILT_IN_CABS):
       break;
 
-    CASE_FLT_FN (BUILT_IN_EXP):
-    CASE_FLT_FN (BUILT_IN_EXP10):
-    CASE_FLT_FN (BUILT_IN_POW10):
-    CASE_FLT_FN (BUILT_IN_EXP2):
-    CASE_FLT_FN (BUILT_IN_EXPM1):
-    CASE_FLT_FN (BUILT_IN_LOGB):
-    CASE_FLT_FN (BUILT_IN_LOG):
-    CASE_FLT_FN (BUILT_IN_LOG10):
-    CASE_FLT_FN (BUILT_IN_LOG2):
-    CASE_FLT_FN (BUILT_IN_LOG1P):
-    CASE_FLT_FN (BUILT_IN_TAN):
-    CASE_FLT_FN (BUILT_IN_ASIN):
-    CASE_FLT_FN (BUILT_IN_ACOS):
-    CASE_FLT_FN (BUILT_IN_ATAN):
-    CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
-      /* Treat these like sqrt only if unsafe math optimizations are allowed,
-	 because of possible accuracy problems.  */
-      if (! flag_unsafe_math_optimizations)
-	break;
-    CASE_FLT_FN (BUILT_IN_SQRT):
-    CASE_FLT_FN (BUILT_IN_FLOOR):
-    CASE_FLT_FN (BUILT_IN_CEIL):
-    CASE_FLT_FN (BUILT_IN_TRUNC):
-    CASE_FLT_FN (BUILT_IN_ROUND):
-    CASE_FLT_FN (BUILT_IN_NEARBYINT):
-    CASE_FLT_FN (BUILT_IN_RINT):
-      target = expand_builtin_mathfn (exp, target, subtarget);
-      if (target)
-	return target;
-      break;
-
     CASE_FLT_FN (BUILT_IN_FMA):
       target = expand_builtin_mathfn_ternary (exp, target, subtarget);
       if (target)
@@ -6061,23 +5747,6 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
-    CASE_FLT_FN (BUILT_IN_ATAN2):
-    CASE_FLT_FN (BUILT_IN_LDEXP):
-    CASE_FLT_FN (BUILT_IN_SCALB):
-    CASE_FLT_FN (BUILT_IN_SCALBN):
-    CASE_FLT_FN (BUILT_IN_SCALBLN):
-      if (! flag_unsafe_math_optimizations)
-	break;
-
-    CASE_FLT_FN (BUILT_IN_FMOD):
-    CASE_FLT_FN (BUILT_IN_REMAINDER):
-    CASE_FLT_FN (BUILT_IN_DREM):
-    CASE_FLT_FN (BUILT_IN_POW):
-      target = expand_builtin_mathfn_2 (exp, target, subtarget);
-      if (target)
-	return target;
-      break;
-
     CASE_FLT_FN (BUILT_IN_CEXPI):
       target = expand_builtin_cexpi (exp, target);
       gcc_assert (target);
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index f23d799..06c5d9e 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2073,6 +2073,24 @@  expand_GOACC_REDUCTION (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* Set errno to EDOM.  */
+
+static void
+expand_SET_EDOM (internal_fn, gcall *)
+{
+#ifdef TARGET_EDOM
+#ifdef GEN_ERRNO_RTX
+  rtx errno_rtx = GEN_ERRNO_RTX;
+#else
+  rtx errno_rtx = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno"));
+#endif
+  emit_move_insn (errno_rtx,
+		  gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx)));
+#else
+  gcc_unreachable ();
+#endif
+}
+
 /* Expand a call to FN using the operands in STMT.  FN has a single
    output operand and NARGS input operands.  */
 
@@ -2217,6 +2235,18 @@  direct_internal_fn_supported_p (internal_fn fn, tree type)
   return direct_internal_fn_supported_p (fn, tree_pair (type, type));
 }
 
+/* Return true if IFN_SET_EDOM is supported.  */
+
+bool
+set_edom_supported_p (void)
+{
+#ifdef TARGET_EDOM
+  return true;
+#else
+  return false;
+#endif
+}
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void						\
   expand_##CODE (internal_fn fn, gcall *stmt)		\
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index bf8047a..825dba1 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -181,6 +181,10 @@  DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL)
 /* OpenACC reduction abstraction.  See internal-fn.h  for usage.  */
 DEF_INTERNAL_FN (GOACC_REDUCTION, ECF_NOTHROW | ECF_LEAF, NULL)
 
+/* Set errno to EDOM, if GCC knows how to do that directly for the
+   current target.  */
+DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | ECF_NOTHROW, NULL)
+
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_OPTAB_FN
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 5ee43b8..6cb123f 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -160,6 +160,7 @@  extern tree_pair direct_internal_fn_types (internal_fn, tree, tree *);
 extern tree_pair direct_internal_fn_types (internal_fn, gcall *);
 extern bool direct_internal_fn_supported_p (internal_fn, tree_pair);
 extern bool direct_internal_fn_supported_p (internal_fn, tree);
+extern bool set_edom_supported_p (void);
 
 extern void expand_internal_call (gcall *);
 extern void expand_internal_call (internal_fn, gcall *);
diff --git a/gcc/opts.c b/gcc/opts.c
index be04cf5..4345cc8 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -478,6 +478,7 @@  static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fmove_loop_invariants, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_pta, NULL, 1 },
     { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fssa_phiopt, NULL, 1 },
+    { OPT_LEVELS_1_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 },
 
     /* -O2 optimizations.  */
     { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 },
@@ -503,7 +504,6 @@  static const struct default_options default_options_table[] =
       REORDER_BLOCKS_ALGORITHM_STC },
     { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
-    { OPT_LEVELS_2_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index fbcc70b..75ef180 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -33,46 +33,77 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "tree-cfg.h"
 #include "tree-into-ssa.h"
+#include "builtins.h"
+#include "internal-fn.h"
 
 
-/* Conditional dead call elimination
+/* This pass serves two closely-related purposes:
 
-   Some builtin functions can set errno on error conditions, but they
-   are otherwise pure.  If the result of a call to such a function is
-   not used, the compiler can still not eliminate the call without
-   powerful interprocedural analysis to prove that the errno is not
-   checked.  However, if the conditions under which the error occurs
-   are known, the compiler can conditionally dead code eliminate the
-   calls by shrink-wrapping the semi-dead calls into the error condition:
+   1. It conditionally executes calls that set errno if (a) the result of
+      the call is unused and (b) a simple range check on the arguments can
+      detect most cases where errno does not need to be set.
 
-        built_in_call (args)
-          ==>
-        if (error_cond (args))
-             built_in_call (args)
+      This is the "conditional dead-code elimination" that gave the pass
+      its original name, since the call is dead for most argument values.
+      The calls for which it helps are usually part of the C++ abstraction
+      penalty exposed after inlining.
+
+   2. It looks for calls to built-in functions that set errno and whose
+      result is used.  It checks whether there is an associated internal
+      function that doesn't set errno and whether the target supports
+      that internal function.  If so, the pass uses the internal function
+      to compute the result of the built-in function but still arranges
+      for errno to be set when necessary.  There are two ways of setting
+      errno:
+
+      a. by protecting the original call with the same argument checks as (1)
+
+      b. by protecting the original call with a check that the result
+	 of the internal function is not equal to itself (i.e. is NaN).
+
+      (b) requires that NaNs are the only erroneous results.  It is not
+      appropriate for functions like log, which sets errno to ERANGE for
+      zero arguments.  (b) is also likely to perform worse than (a) because it
+      requires the result to be calculated first.  The pass therefore uses
+      (a) when it can and uses (b) as a fallback.
+
+      For (b) the pass can replace the original call with a call to
+      IFN_SET_EDOM, if the target supports direct assignments to errno.
+
+   In both cases, arguments that require errno to be set should occur
+   rarely in practice.  Checks of the errno result should also be rare,
+   but the compiler would need powerful interprocedural analysis to
+   prove that errno is not checked.  It's much easier to add argument
+   checks or result checks instead.
+
+     An example of (1) is:
 
-    An actual simple example is :
 	 log (x);   // Mostly dead call
      ==>
 	 if (__builtin_islessequal (x, 0))
 	     log (x);
 
      With this change, call to log (x) is effectively eliminated, as
-     in majority of the cases, log won't be called with x out of
+     in the majority of cases, log won't be called with x out of
      range.  The branch is totally predictable, so the branch cost
      is low.
 
+     An example of (2) is:
+
+	y = sqrt (x);
+     ==>
+	y = IFN_SQRT (x);
+	if (__builtin_isless (x, 0))
+	    sqrt (x);
+
+     In the vast majority of cases we should then never need to call sqrt.
+
    Note that library functions are not supposed to clear errno to zero without
    error.  See IEEE Std 1003.1, section 2.3 Error Numbers, and section 7.5:3 of
    ISO/IEC 9899 (C99).
 
    The condition wrapping the builtin call is conservatively set to avoid too
-   aggressive (wrong) shrink wrapping.  The optimization is called conditional
-   dead call elimination because the call is eliminated under the condition
-   that the input arguments would not lead to domain or range error (for
-   instance when x <= 0 for a log (x) call), however the chances that the error
-   condition is hit is very low (those builtin calls which are conditionally
-   dead are usually part of the C++ abstraction penalty exposed after
-   inlining).  */
+   aggressive (wrong) shrink wrapping.  */
 
 
 /* A structure for representing input domain of
@@ -251,28 +282,15 @@  check_builtin_call (gcall *bcall)
   return check_target_format (arg);
 }
 
-/* A helper function to determine if a builtin function call is a
-   candidate for conditional DCE.  Returns true if the builtin call
-   is a candidate.  */
+/* Return true if built-in function call CALL calls a math function
+   and if we know how to test the range of its arguments to detect _most_
+   situations in which errno is not set.  The test must err on the side
+   of treating non-erroneous values as potentially erroneous.  */
 
 static bool
-is_call_dce_candidate (gcall *call)
+can_test_argument_range (gcall *call)
 {
-  tree fn;
-  enum built_in_function fnc;
-
-  /* Only potentially dead calls are considered.  */
-  if (gimple_call_lhs (call))
-    return false;
-
-  fn = gimple_call_fndecl (call);
-  if (!fn
-      || !DECL_BUILT_IN (fn)
-      || (DECL_BUILT_IN_CLASS (fn) != BUILT_IN_NORMAL))
-    return false;
-
-  fnc = DECL_FUNCTION_CODE (fn);
-  switch (fnc)
+  switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call)))
     {
     /* Trig functions.  */
     CASE_FLT_FN (BUILT_IN_ACOS):
@@ -306,6 +324,31 @@  is_call_dce_candidate (gcall *call)
   return false;
 }
 
+/* Return true if CALL can produce a domain error (EDOM) but can never
+   produce a pole, range overflow or range underflow error (all ERANGE).
+   This means that we can tell whether a function would have set errno
+   by testing whether the result is a NaN.  */
+
+static bool
+edom_only_function (gcall *call)
+{
+  switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call)))
+    {
+    CASE_FLT_FN (BUILT_IN_ACOS):
+    CASE_FLT_FN (BUILT_IN_ASIN):
+    CASE_FLT_FN (BUILT_IN_ATAN):
+    CASE_FLT_FN (BUILT_IN_COS):
+    CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
+    CASE_FLT_FN (BUILT_IN_SIN):
+    CASE_FLT_FN (BUILT_IN_SQRT):
+    CASE_FLT_FN (BUILT_IN_FMOD):
+    CASE_FLT_FN (BUILT_IN_REMAINDER):
+      return true;
+
+    default:
+      return false;
+    }
+}
 
 /* A helper function to generate gimple statements for one bound
    comparison, so that the built-in function is called whenever
@@ -703,33 +746,24 @@  gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds,
 /* Probability of the branch (to the call) is taken.  */
 #define ERR_PROB 0.01
 
-/* The function to shrink wrap a partially dead builtin call
-   whose return value is not used anywhere, but has to be kept
-   live due to potential error condition.  Returns true if the
-   transformation actually happens.  */
+/* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS
+   conditions in CONDS is false.
+
+   Return true on success, in which case the cfg will have been updated.  */
 
 static bool
-shrink_wrap_one_built_in_call (gcall *bi_call)
+shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds,
+					  unsigned int nconds)
 {
   gimple_stmt_iterator bi_call_bsi;
   basic_block bi_call_bb, join_tgt_bb, guard_bb;
   edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru;
   edge bi_call_in_edge0, guard_bb_in_edge;
-  unsigned tn_cond_stmts, nconds;
+  unsigned tn_cond_stmts;
   unsigned ci;
   gimple *cond_expr = NULL;
   gimple *cond_expr_start;
 
-  auto_vec<gimple *, 12> conds;
-  gen_shrink_wrap_conditions (bi_call, conds, &nconds);
-
-  /* This can happen if the condition generator decides
-     it is not beneficial to do the transformation.  Just
-     return false and do not do any transformation for
-     the call.  */
-  if (nconds == 0)
-    return false;
-
   /* The cfg we want to create looks like this:
 
 	   [guard n-1]         <- guard_bb (old block)
@@ -868,6 +902,117 @@  shrink_wrap_one_built_in_call (gcall *bi_call)
   return true;
 }
 
+/* Shrink-wrap BI_CALL so that it is only called when it might set errno
+   (but is always called if it would set errno).
+
+   Return true on success, in which case the cfg will have been updated.  */
+
+static bool
+shrink_wrap_one_built_in_call (gcall *bi_call)
+{
+  unsigned nconds = 0;
+  auto_vec<gimple *, 12> conds;
+  gen_shrink_wrap_conditions (bi_call, conds, &nconds);
+  /* This can happen if the condition generator decides
+     it is not beneficial to do the transformation.  Just
+     return false and do not do any transformation for
+     the call.  */
+  if (nconds == 0)
+    return false;
+  return shrink_wrap_one_built_in_call_with_conds (bi_call, conds, nconds);
+}
+
+/* Return true if built-in function call CALL could be implemented using
+   a combination of an internal function to compute the result and a
+   separate call to set errno.  */
+
+static bool
+can_use_internal_fn (gcall *call)
+{
+  /* Only replace calls that might set errno (i.e. that have a vdef).  */
+  if (!gimple_vdef (call))
+    return false;
+
+  /* Punt if we can't conditionalize the call.  */
+  basic_block bb = gimple_bb (call);
+  if (stmt_ends_bb_p (call) && !find_fallthru_edge (bb->succs))
+    return false;
+
+  /* See whether there is an internal function for this built-in.  */
+  if (replacement_internal_fn (call) == IFN_LAST)
+    return false;
+
+  /* See whether we can catch all cases where errno would be set,
+     while still avoiding the call in most cases.  */
+  if (!can_test_argument_range (call)
+      && !edom_only_function (call))
+    return false;
+
+  return true;
+}
+
+/* Implement built-in function call CALL using an internal function.
+   Return true on success, in which case the cfg will have changed.  */
+
+static bool
+use_internal_fn (gcall *call)
+{
+  unsigned nconds = 0;
+  auto_vec<gimple *, 12> conds;
+  gen_shrink_wrap_conditions (call, conds, &nconds);
+  if (nconds == 0 && !edom_only_function (call))
+    return false;
+
+  internal_fn ifn = replacement_internal_fn (call);
+  gcc_assert (ifn != IFN_LAST);
+
+  /* Construct the new call, with the same arguments as the original one.  */
+  auto_vec <tree, 16> args;
+  unsigned int nargs = gimple_call_num_args (call);
+  for (unsigned int i = 0; i < nargs; ++i)
+    args.safe_push (gimple_call_arg (call, i));
+  gcall *new_call = gimple_build_call_internal_vec (ifn, args);
+  gimple_set_location (new_call, gimple_location (call));
+
+  /* Transfer the LHS to the new call.  */
+  tree lhs = gimple_call_lhs (call);
+  gimple_call_set_lhs (new_call, lhs);
+  gimple_call_set_lhs (call, NULL_TREE);
+  SSA_NAME_DEF_STMT (lhs) = new_call;
+
+  /* Insert the new call.  */
+  gimple_stmt_iterator gsi = gsi_for_stmt (call);
+  gsi_insert_before (&gsi, new_call, GSI_SAME_STMT);
+
+  if (nconds == 0)
+    {
+      /* Skip the call if LHS == LHS.  If we reach here, EDOM is the only
+	 valid errno value and it is used iff the result is NaN.  */
+      conds.quick_push (gimple_build_cond (EQ_EXPR, lhs, lhs,
+					   NULL_TREE, NULL_TREE));
+      nconds++;
+
+      /* Try replacing the original call with a direct assignment to
+	 errno, via an internal function.  */
+      if (set_edom_supported_p () && !stmt_ends_bb_p (call))
+	{
+	  gimple_stmt_iterator gsi = gsi_for_stmt (call);
+	  gcall *new_call = gimple_build_call_internal (IFN_SET_EDOM, 0);
+	  gimple_set_vuse (new_call, gimple_vuse (call));
+	  gimple_set_vdef (new_call, gimple_vdef (call));
+	  SSA_NAME_DEF_STMT (gimple_vdef (new_call)) = new_call;
+	  gimple_set_location (new_call, gimple_location (call));
+	  gsi_replace (&gsi, new_call, false);
+	  call = new_call;
+	}
+    }
+
+  if (!shrink_wrap_one_built_in_call_with_conds (call, conds, nconds))
+    /* It's too late to back out now.  */
+    gcc_unreachable ();
+  return true;
+}
+
 /* The top level function for conditional dead code shrink
    wrapping transformation.  */
 
@@ -884,7 +1029,10 @@  shrink_wrap_conditional_dead_built_in_calls (vec<gcall *> calls)
   for (; i < n ; i++)
     {
       gcall *bi_call = calls[i];
-      changed |= shrink_wrap_one_built_in_call (bi_call);
+      if (gimple_call_lhs (bi_call))
+	changed |= use_internal_fn (bi_call);
+      else
+	changed |= shrink_wrap_one_built_in_call (bi_call);
     }
 
   return changed;
@@ -913,13 +1061,12 @@  public:
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *fun)
+  virtual bool gate (function *)
     {
       /* The limit constants used in the implementation
 	 assume IEEE floating point format.  Other formats
 	 can be supported in the future if needed.  */
-      return flag_tree_builtin_call_dce != 0
-       	&& optimize_function_for_speed_p (fun);
+      return flag_tree_builtin_call_dce != 0;
     }
 
   virtual unsigned int execute (function *);
@@ -935,11 +1082,20 @@  pass_call_cdce::execute (function *fun)
   auto_vec<gcall *> cond_dead_built_in_calls;
   FOR_EACH_BB_FN (bb, fun)
     {
+      /* Skip blocks that are being optimized for size, since our
+	 transformation always increases code size.  */
+      if (optimize_bb_for_size_p (bb))
+	continue;
+
       /* Collect dead call candidates.  */
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
 	  gcall *stmt = dyn_cast <gcall *> (gsi_stmt (i));
-          if (stmt && is_call_dce_candidate (stmt))
+          if (stmt
+	      && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
+	      && (gimple_call_lhs (stmt)
+		  ? can_use_internal_fn (stmt)
+		  : can_test_argument_range (stmt)))
             {
               if (dump_file && (dump_flags & TDF_DETAILS))
                 {