Message ID | 20160502170137.GB46758@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
On Mon, May 02, 2016 at 07:01:38PM +0200, Jan Hubicka wrote: > This patch impleemnts the suggested check for presence of FP parameters. > We can play with special casing the moves incrementally. This patch has broken the build: /home/marek/src/gcc/gcc/ipa-inline.c: In function ‘bool can_inline_edge_p(cgraph_edge*, bool, bool, bool)’: /home/marek/src/gcc/gcc/ipa-inline.c:341:55: error: ‘CIF_THUNK’ was not declared in this scope e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK : CIF_MISMATCHED_ARGUMENTS; ^ Makefile:1085: recipe for target 'ipa-inline.o' failed make: *** [ipa-inline.o] Error 1 make: *** Waiting for unfinished jobs.... /home/marek/src/gcc/gcc/ipa-inline-analysis.c: In function ‘clause_t evaluate_conditions_for_known_args(cgraph_node*, bool, vec<tree_node*>, vec<ipa_agg_jump_function*>)’: /home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: invalid conversion from ‘tree_node*’ to ‘long int’ [-fpermissive] c->offset, c->by_ref); ^ /home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: too many arguments to function ‘tree_node* ipa_find_agg_cst_for_param(ipa_agg_jump_function*, long int, bool)’ In file included from /home/marek/src/gcc/gcc/ipa-inline-analysis.c:90:0: /home/marek/src/gcc/gcc/ipa-prop.h:639:6: note: declared here tree ipa_find_agg_cst_for_param (struct ipa_agg_jump_function *, HOST_WIDE_INT, ^ Makefile:1085: recipe for target 'ipa-inline-analysis.o' failed make: *** [ipa-inline-analysis.o] Error 1 Marek
> On Mon, May 02, 2016 at 07:01:38PM +0200, Jan Hubicka wrote: > > This patch impleemnts the suggested check for presence of FP parameters. > > We can play with special casing the moves incrementally. > > This patch has broken the build: > > /home/marek/src/gcc/gcc/ipa-inline.c: In function ‘bool > can_inline_edge_p(cgraph_edge*, bool, bool, bool)’: > /home/marek/src/gcc/gcc/ipa-inline.c:341:55: error: ‘CIF_THUNK’ was not > declared in this scope > e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK : > CIF_MISMATCHED_ARGUMENTS; Sorry, I managed to miss independent changes that was in my tree. I will revert the evaluate_conditions_for_known_args and add CIF_THUNK shortly. Honza > ^ > Makefile:1085: recipe for target 'ipa-inline.o' failed > make: *** [ipa-inline.o] Error 1 > make: *** Waiting for unfinished jobs.... > /home/marek/src/gcc/gcc/ipa-inline-analysis.c: In function ‘clause_t > evaluate_conditions_for_known_args(cgraph_node*, bool, vec<tree_node*>, > vec<ipa_agg_jump_function*>)’: > /home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: invalid conversion > from ‘tree_node*’ to ‘long int’ [-fpermissive] > c->offset, c->by_ref); > ^ > /home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: too many arguments > to function ‘tree_node* ipa_find_agg_cst_for_param(ipa_agg_jump_function*, long > int, bool)’ > In file included from /home/marek/src/gcc/gcc/ipa-inline-analysis.c:90:0: > /home/marek/src/gcc/gcc/ipa-prop.h:639:6: note: declared here > tree ipa_find_agg_cst_for_param (struct ipa_agg_jump_function *, > HOST_WIDE_INT, > ^ > Makefile:1085: recipe for target 'ipa-inline-analysis.o' failed > make: *** [ipa-inline-analysis.o] Error 1 > > Marek
On 2 May 2016 at 19:01, Jan Hubicka <hubicka@ucw.cz> wrote: >> On Thu, 21 Apr 2016, Jan Hubicka wrote: >> >> > Hi, >> > this patch implements the long promised logic to inline across -ffast-math >> > boundary when eitehr caller or callee has no fp operations in it. This is >> > needed to resolve code quality regression on Firefox with LTO where >> > -O3/-O2/-Ofast flags are combined and we fail to inline a lot of comdats >> > otherwise. >> > >> > Bootstrapped/regtested x86_64-linux. Ricahrd, I would like to know your opinion >> > on fp_expression_p predicate - it is bit ugly but I do not know how to implement >> > it better. >> > >> > We still won't inline -O1 code into -O2+ because flag_strict_overflow differs. >> > I will implement similar logic for overflows incrementally. Similarly flag_errno_math >> > can be handled better, but I am not sure it matters - I think wast majority of time >> > users set errno_math in sync with other -ffast-math implied flags. >> >> Note that for reasons PR70586 shows (const functions having possible >> trapping side-effect because of FP math or division) we'd like to >> have sth like "uses FP math" "uses possibly trapping integer math" >> "uses integer math with undefined overflow" on a per function level >> and propagated alongside pure/const/nothrow state. >> >> So maybe you can fit that into a more suitable place than just the >> inliner (which of course is interested in "uses FP math locally", >> not the transitive answer we need for PR70586). > > We don't really have much more suitable place - ipa-inline-analysis is > doing most of the analysis of function body that is usefull for IPA passes, > not only for inliner. It should be renamed perhaps to something like > function_body_summary. I will do that later this stage1. > > For PR70686 in addition to transitive answer we will need to know that > the transformation is win. Const function may take a lot of time and > introducing new call on code path that did not used it previously is > bad idea unless we know that the function is very cheap (which may > be true only for fast builtins, I don't know) > > This patch impleemnts the suggested check for presence of FP parameters. > We can play with special casing the moves incrementally. > > Bootstrapped/regtested x86_64-linux. > > Index: testsuite/ChangeLog > =================================================================== > --- testsuite/ChangeLog (revision 235765) > +++ testsuite/ChangeLog (working copy) > @@ -1,3 +1,7 @@ > +2016-05-02 Jan Hubicka <hubicka@ucw.cz> > + > + * gcc.dg/ipa/inline-8.c: New testcase. > + > 2016-05-02 Jakub Jelinek <jakub@redhat.com> > > PR rtl-optimization/70467 > Index: testsuite/gcc.dg/ipa/inline-8.c > =================================================================== > --- testsuite/gcc.dg/ipa/inline-8.c (revision 0) > +++ testsuite/gcc.dg/ipa/inline-8.c (working copy) > @@ -0,0 +1,36 @@ > +/* Verify that we do not inline isnanf test info -ffast-math code but that we > + do inline trivial functions across -Ofast boundary. */ > +/* { dg-do run } */ > +/* { dg-options "-O2" } */ > +#include <math.h> > +/* Can't be inlined because isnanf will be optimized out. */ > +int > +cmp (float a) > +{ > + return isnanf (a); Hi, This new testcase does not pass on bare-metal configs (using newlib), because: warning: implicit declaration of function 'isnanf' [-Wimplicit-function-declaration] warning: incompatible implicit declaration of built-in function 'isnanf' I'm not sure what's the appropriate dg-require to avoid this? Christophe > +} > +/* Can be inlined. */ > +int > +move (int a) > +{ > + return a; > +} > +float a; > +void > +set () > +{ > + a=nan(""); > +} > +float b; > +__attribute__ ((optimize("Ofast"))) > +int > +main() > +{ > + b++; > + if (cmp(a)) > + __builtin_abort (); > + float a = move (1); > + if (!__builtin_constant_p (a)) > + __builtin_abort (); > + return 0; > +} > Index: ChangeLog > =================================================================== > --- ChangeLog (revision 235765) > +++ ChangeLog (working copy) > @@ -1,3 +1,18 @@ > +2016-05-02 Jan Hubicka <hubicka@ucw.cz> > + > + * ipa-inline-analysis.c (reset_inline_summary): Clear fp_expressions > + (dump_inline_summary): Dump it. > + (fp_expression_p): New predicate. > + (estimate_function_body_sizes): Use it. > + (inline_merge_summary): Merge fp_expressions. > + (inline_read_section): Read fp_expressions. > + (inline_write_summary): Write fp_expressions. > + * ipa-inline.c (can_inline_edge_p): Permit inlining across fp math > + codegen boundary if either caller or callee is !fp_expressions. > + * ipa-inline.h (inline_summary): Add fp_expressions. > + * ipa-inline-transform.c (inline_call): When inlining !fp_expressions > + to fp_expressions be sure the fp generation flags are updated. > + > 2016-05-02 Jakub Jelinek <jakub@redhat.com> > > PR rtl-optimization/70467 > Index: ipa-inline-analysis.c > =================================================================== > --- ipa-inline-analysis.c (revision 235693) > +++ ipa-inline-analysis.c (working copy) > @@ -850,7 +850,8 @@ evaluate_conditions_for_known_args (stru > if (known_aggs.exists ()) > { > agg = known_aggs[c->operand_num]; > - val = ipa_find_agg_cst_for_param (agg, c->offset, c->by_ref); > + val = ipa_find_agg_cst_for_param (agg, known_vals[c->operand_num], > + c->offset, c->by_ref); > } > else > val = NULL_TREE; > @@ -1069,6 +1070,7 @@ reset_inline_summary (struct cgraph_node > reset_inline_edge_summary (e); > for (e = node->indirect_calls; e; e = e->next_callee) > reset_inline_edge_summary (e); > + info->fp_expressions = false; > } > > /* Hook that is called by cgraph.c when a node is removed. */ > @@ -1423,6 +1425,8 @@ dump_inline_summary (FILE *f, struct cgr > fprintf (f, " inlinable"); > if (s->contains_cilk_spawn) > fprintf (f, " contains_cilk_spawn"); > + if (s->fp_expressions) > + fprintf (f, " fp_expression"); > fprintf (f, "\n self time: %i\n", s->self_time); > fprintf (f, " global time: %i\n", s->time); > fprintf (f, " self size: %i\n", s->self_size); > @@ -2459,6 +2463,21 @@ clobber_only_eh_bb_p (basic_block bb, bo > return true; > } > > +/* Return true if STMT compute a floating point expression that may be affected > + by -ffast-math and similar flags. */ > + > +static bool > +fp_expression_p (gimple *stmt) > +{ > + ssa_op_iter i; > + tree op; > + > + FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_DEF|SSA_OP_USE) > + if (FLOAT_TYPE_P (TREE_TYPE (op))) > + return true; > + return false; > +} > + > /* Compute function body size parameters for NODE. > When EARLY is true, we compute only simple summaries without > non-trivial predicates to drive the early inliner. */ > @@ -2733,6 +2752,13 @@ estimate_function_body_sizes (struct cgr > this_time * (2 - prob), &p); > } > > + if (!info->fp_expressions && fp_expression_p (stmt)) > + { > + info->fp_expressions = true; > + if (dump_file) > + fprintf (dump_file, " fp_expression set\n"); > + } > + > gcc_assert (time >= 0); > gcc_assert (size >= 0); > } > @@ -3577,6 +3603,8 @@ inline_merge_summary (struct cgraph_edge > else > toplev_predicate = true_predicate (); > > + info->fp_expressions |= callee_info->fp_expressions; > + > if (callee_info->conds) > evaluate_properties_for_edge (edge, true, &clause, NULL, NULL, NULL); > if (ipa_node_params_sum && callee_info->conds) > @@ -4229,6 +4257,7 @@ inline_read_section (struct lto_file_dec > bp = streamer_read_bitpack (&ib); > info->inlinable = bp_unpack_value (&bp, 1); > info->contains_cilk_spawn = bp_unpack_value (&bp, 1); > + info->fp_expressions = bp_unpack_value (&bp, 1); > > count2 = streamer_read_uhwi (&ib); > gcc_assert (!info->conds); > @@ -4395,6 +4424,7 @@ inline_write_summary (void) > bp = bitpack_create (ob->main_stream); > bp_pack_value (&bp, info->inlinable, 1); > bp_pack_value (&bp, info->contains_cilk_spawn, 1); > + bp_pack_value (&bp, info->fp_expressions, 1); > streamer_write_bitpack (&bp); > streamer_write_uhwi (ob, vec_safe_length (info->conds)); > for (i = 0; vec_safe_iterate (info->conds, i, &c); i++) > Index: ipa-inline.c > =================================================================== > --- ipa-inline.c (revision 235693) > +++ ipa-inline.c (working copy) > @@ -338,7 +338,7 @@ can_inline_edge_p (struct cgraph_edge *e > else if (e->call_stmt_cannot_inline_p) > { > if (e->inline_failed != CIF_FUNCTION_NOT_OPTIMIZED) > - e->inline_failed = CIF_MISMATCHED_ARGUMENTS; > + e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK : CIF_MISMATCHED_ARGUMENTS; > inlinable = false; > } > /* Don't inline if the functions have different EH personalities. */ > @@ -396,6 +396,8 @@ can_inline_edge_p (struct cgraph_edge *e > (DECL_DISREGARD_INLINE_LIMITS (callee->decl) > && lookup_attribute ("always_inline", > DECL_ATTRIBUTES (callee->decl))); > + inline_summary *caller_info = inline_summaries->get (caller); > + inline_summary *callee_info = inline_summaries->get (callee); > > /* Until GCC 4.9 we did not check the semantics alterning flags > bellow and inline across optimization boundry. > @@ -417,16 +419,21 @@ can_inline_edge_p (struct cgraph_edge *e > == !opt_for_fn (callee->decl, optimize) || !always_inline)) > || check_match (flag_wrapv) > || check_match (flag_trapv) > - /* Strictly speaking only when the callee uses FP math. */ > - || check_maybe_up (flag_rounding_math) > - || check_maybe_up (flag_trapping_math) > - || check_maybe_down (flag_unsafe_math_optimizations) > - || check_maybe_down (flag_finite_math_only) > - || check_maybe_up (flag_signaling_nans) > - || check_maybe_down (flag_cx_limited_range) > - || check_maybe_up (flag_signed_zeros) > - || check_maybe_down (flag_associative_math) > - || check_maybe_down (flag_reciprocal_math) > + /* When caller or callee does FP math, be sure FP codegen flags > + compatible. */ > + || ((caller_info->fp_expressions && callee_info->fp_expressions) > + && (check_maybe_up (flag_rounding_math) > + || check_maybe_up (flag_trapping_math) > + || check_maybe_down (flag_unsafe_math_optimizations) > + || check_maybe_down (flag_finite_math_only) > + || check_maybe_up (flag_signaling_nans) > + || check_maybe_down (flag_cx_limited_range) > + || check_maybe_up (flag_signed_zeros) > + || check_maybe_down (flag_associative_math) > + || check_maybe_down (flag_reciprocal_math) > + /* Strictly speaking only when the callee contains function > + calls that may end up setting errno. */ > + || check_maybe_up (flag_errno_math))) > /* We do not want to make code compiled with exceptions to be > brought into a non-EH function unless we know that the callee > does not throw. > @@ -435,9 +442,6 @@ can_inline_edge_p (struct cgraph_edge *e > && DECL_FUNCTION_PERSONALITY (callee->decl)) > || (check_maybe_up (flag_exceptions) > && DECL_FUNCTION_PERSONALITY (callee->decl)) > - /* Strictly speaking only when the callee contains function > - calls that may end up setting errno. */ > - || check_maybe_up (flag_errno_math) > /* When devirtualization is diabled for callee, it is not safe > to inline it as we possibly mangled the type info. > Allow early inlining of always inlines. */ > Index: ipa-inline.h > =================================================================== > --- ipa-inline.h (revision 235693) > +++ ipa-inline.h (working copy) > @@ -132,6 +132,8 @@ struct GTY(()) inline_summary > /* True wen there is only one caller of the function before small function > inlining. */ > unsigned int single_caller : 1; > + /* True if function contains any floating point expressions. */ > + unsigned int fp_expressions : 1; > > /* Information about function that will result after applying all the > inline decisions present in the callgraph. Generally kept up to > Index: ipa-inline-transform.c > =================================================================== > --- ipa-inline-transform.c (revision 235693) > +++ ipa-inline-transform.c (working copy) > @@ -338,6 +338,63 @@ inline_call (struct cgraph_edge *e, bool > DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) > = build_optimization_node (&opts); > } > + inline_summary *caller_info = inline_summaries->get (to); > + inline_summary *callee_info = inline_summaries->get (callee); > + if (!caller_info->fp_expressions && callee_info->fp_expressions) > + { > + caller_info->fp_expressions = true; > + if (opt_for_fn (callee->decl, flag_rounding_math) > + != opt_for_fn (to->decl, flag_rounding_math) > + || opt_for_fn (callee->decl, flag_trapping_math) > + != opt_for_fn (to->decl, flag_trapping_math) > + || opt_for_fn (callee->decl, flag_unsafe_math_optimizations) > + != opt_for_fn (to->decl, flag_unsafe_math_optimizations) > + || opt_for_fn (callee->decl, flag_finite_math_only) > + != opt_for_fn (to->decl, flag_finite_math_only) > + || opt_for_fn (callee->decl, flag_signaling_nans) > + != opt_for_fn (to->decl, flag_signaling_nans) > + || opt_for_fn (callee->decl, flag_cx_limited_range) > + != opt_for_fn (to->decl, flag_cx_limited_range) > + || opt_for_fn (callee->decl, flag_signed_zeros) > + != opt_for_fn (to->decl, flag_signed_zeros) > + || opt_for_fn (callee->decl, flag_associative_math) > + != opt_for_fn (to->decl, flag_associative_math) > + || opt_for_fn (callee->decl, flag_reciprocal_math) > + != opt_for_fn (to->decl, flag_reciprocal_math) > + || opt_for_fn (callee->decl, flag_errno_math) > + != opt_for_fn (to->decl, flag_errno_math)) > + { > + struct gcc_options opts = global_options; > + > + cl_optimization_restore (&opts, opts_for_fn (to->decl)); > + opts.x_flag_rounding_math > + = opt_for_fn (callee->decl, flag_rounding_math); > + opts.x_flag_trapping_math > + = opt_for_fn (callee->decl, flag_trapping_math); > + opts.x_flag_unsafe_math_optimizations > + = opt_for_fn (callee->decl, flag_unsafe_math_optimizations); > + opts.x_flag_finite_math_only > + = opt_for_fn (callee->decl, flag_finite_math_only); > + opts.x_flag_signaling_nans > + = opt_for_fn (callee->decl, flag_signaling_nans); > + opts.x_flag_cx_limited_range > + = opt_for_fn (callee->decl, flag_cx_limited_range); > + opts.x_flag_signed_zeros > + = opt_for_fn (callee->decl, flag_signed_zeros); > + opts.x_flag_associative_math > + = opt_for_fn (callee->decl, flag_associative_math); > + opts.x_flag_reciprocal_math > + = opt_for_fn (callee->decl, flag_reciprocal_math); > + opts.x_flag_errno_math > + = opt_for_fn (callee->decl, flag_errno_math); > + if (dump_file) > + fprintf (dump_file, "Copying FP flags from %s:%i to %s:%i\n", > + callee->name (), callee->order, to->name (), to->order); > + build_optimization_node (&opts); > + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) > + = build_optimization_node (&opts); > + } > + } > > /* If aliases are involved, redirect edge to the actual destination and > possibly remove the aliases. */
On Tue, 3 May 2016, Christophe Lyon wrote: > On 2 May 2016 at 19:01, Jan Hubicka <hubicka@ucw.cz> wrote: > >> On Thu, 21 Apr 2016, Jan Hubicka wrote: > >> > >> > Hi, > >> > this patch implements the long promised logic to inline across -ffast-math > >> > boundary when eitehr caller or callee has no fp operations in it. This is > >> > needed to resolve code quality regression on Firefox with LTO where > >> > -O3/-O2/-Ofast flags are combined and we fail to inline a lot of comdats > >> > otherwise. > >> > > >> > Bootstrapped/regtested x86_64-linux. Ricahrd, I would like to know your opinion > >> > on fp_expression_p predicate - it is bit ugly but I do not know how to implement > >> > it better. > >> > > >> > We still won't inline -O1 code into -O2+ because flag_strict_overflow differs. > >> > I will implement similar logic for overflows incrementally. Similarly flag_errno_math > >> > can be handled better, but I am not sure it matters - I think wast majority of time > >> > users set errno_math in sync with other -ffast-math implied flags. > >> > >> Note that for reasons PR70586 shows (const functions having possible > >> trapping side-effect because of FP math or division) we'd like to > >> have sth like "uses FP math" "uses possibly trapping integer math" > >> "uses integer math with undefined overflow" on a per function level > >> and propagated alongside pure/const/nothrow state. > >> > >> So maybe you can fit that into a more suitable place than just the > >> inliner (which of course is interested in "uses FP math locally", > >> not the transitive answer we need for PR70586). > > > > We don't really have much more suitable place - ipa-inline-analysis is > > doing most of the analysis of function body that is usefull for IPA passes, > > not only for inliner. It should be renamed perhaps to something like > > function_body_summary. I will do that later this stage1. > > > > For PR70686 in addition to transitive answer we will need to know that > > the transformation is win. Const function may take a lot of time and > > introducing new call on code path that did not used it previously is > > bad idea unless we know that the function is very cheap (which may > > be true only for fast builtins, I don't know) > > > > This patch impleemnts the suggested check for presence of FP parameters. > > We can play with special casing the moves incrementally. > > > > Bootstrapped/regtested x86_64-linux. > > > > Index: testsuite/ChangeLog > > =================================================================== > > --- testsuite/ChangeLog (revision 235765) > > +++ testsuite/ChangeLog (working copy) > > @@ -1,3 +1,7 @@ > > +2016-05-02 Jan Hubicka <hubicka@ucw.cz> > > + > > + * gcc.dg/ipa/inline-8.c: New testcase. > > + > > 2016-05-02 Jakub Jelinek <jakub@redhat.com> > > > > PR rtl-optimization/70467 > > Index: testsuite/gcc.dg/ipa/inline-8.c > > =================================================================== > > --- testsuite/gcc.dg/ipa/inline-8.c (revision 0) > > +++ testsuite/gcc.dg/ipa/inline-8.c (working copy) > > @@ -0,0 +1,36 @@ > > +/* Verify that we do not inline isnanf test info -ffast-math code but that we > > + do inline trivial functions across -Ofast boundary. */ > > +/* { dg-do run } */ > > +/* { dg-options "-O2" } */ > > +#include <math.h> > > +/* Can't be inlined because isnanf will be optimized out. */ > > +int > > +cmp (float a) > > +{ > > + return isnanf (a); > > Hi, > > This new testcase does not pass on bare-metal configs (using newlib), because: > warning: implicit declaration of function 'isnanf' > [-Wimplicit-function-declaration] > warning: incompatible implicit declaration of built-in function 'isnanf' > > I'm not sure what's the appropriate dg-require to avoid this? c99_runtime I guess. Richard. > Christophe > > > +} > > +/* Can be inlined. */ > > +int > > +move (int a) > > +{ > > + return a; > > +} > > +float a; > > +void > > +set () > > +{ > > + a=nan(""); > > +} > > +float b; > > +__attribute__ ((optimize("Ofast"))) > > +int > > +main() > > +{ > > + b++; > > + if (cmp(a)) > > + __builtin_abort (); > > + float a = move (1); > > + if (!__builtin_constant_p (a)) > > + __builtin_abort (); > > + return 0; > > +} > > Index: ChangeLog > > =================================================================== > > --- ChangeLog (revision 235765) > > +++ ChangeLog (working copy) > > @@ -1,3 +1,18 @@ > > +2016-05-02 Jan Hubicka <hubicka@ucw.cz> > > + > > + * ipa-inline-analysis.c (reset_inline_summary): Clear fp_expressions > > + (dump_inline_summary): Dump it. > > + (fp_expression_p): New predicate. > > + (estimate_function_body_sizes): Use it. > > + (inline_merge_summary): Merge fp_expressions. > > + (inline_read_section): Read fp_expressions. > > + (inline_write_summary): Write fp_expressions. > > + * ipa-inline.c (can_inline_edge_p): Permit inlining across fp math > > + codegen boundary if either caller or callee is !fp_expressions. > > + * ipa-inline.h (inline_summary): Add fp_expressions. > > + * ipa-inline-transform.c (inline_call): When inlining !fp_expressions > > + to fp_expressions be sure the fp generation flags are updated. > > + > > 2016-05-02 Jakub Jelinek <jakub@redhat.com> > > > > PR rtl-optimization/70467 > > Index: ipa-inline-analysis.c > > =================================================================== > > --- ipa-inline-analysis.c (revision 235693) > > +++ ipa-inline-analysis.c (working copy) > > @@ -850,7 +850,8 @@ evaluate_conditions_for_known_args (stru > > if (known_aggs.exists ()) > > { > > agg = known_aggs[c->operand_num]; > > - val = ipa_find_agg_cst_for_param (agg, c->offset, c->by_ref); > > + val = ipa_find_agg_cst_for_param (agg, known_vals[c->operand_num], > > + c->offset, c->by_ref); > > } > > else > > val = NULL_TREE; > > @@ -1069,6 +1070,7 @@ reset_inline_summary (struct cgraph_node > > reset_inline_edge_summary (e); > > for (e = node->indirect_calls; e; e = e->next_callee) > > reset_inline_edge_summary (e); > > + info->fp_expressions = false; > > } > > > > /* Hook that is called by cgraph.c when a node is removed. */ > > @@ -1423,6 +1425,8 @@ dump_inline_summary (FILE *f, struct cgr > > fprintf (f, " inlinable"); > > if (s->contains_cilk_spawn) > > fprintf (f, " contains_cilk_spawn"); > > + if (s->fp_expressions) > > + fprintf (f, " fp_expression"); > > fprintf (f, "\n self time: %i\n", s->self_time); > > fprintf (f, " global time: %i\n", s->time); > > fprintf (f, " self size: %i\n", s->self_size); > > @@ -2459,6 +2463,21 @@ clobber_only_eh_bb_p (basic_block bb, bo > > return true; > > } > > > > +/* Return true if STMT compute a floating point expression that may be affected > > + by -ffast-math and similar flags. */ > > + > > +static bool > > +fp_expression_p (gimple *stmt) > > +{ > > + ssa_op_iter i; > > + tree op; > > + > > + FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_DEF|SSA_OP_USE) > > + if (FLOAT_TYPE_P (TREE_TYPE (op))) > > + return true; > > + return false; > > +} > > + > > /* Compute function body size parameters for NODE. > > When EARLY is true, we compute only simple summaries without > > non-trivial predicates to drive the early inliner. */ > > @@ -2733,6 +2752,13 @@ estimate_function_body_sizes (struct cgr > > this_time * (2 - prob), &p); > > } > > > > + if (!info->fp_expressions && fp_expression_p (stmt)) > > + { > > + info->fp_expressions = true; > > + if (dump_file) > > + fprintf (dump_file, " fp_expression set\n"); > > + } > > + > > gcc_assert (time >= 0); > > gcc_assert (size >= 0); > > } > > @@ -3577,6 +3603,8 @@ inline_merge_summary (struct cgraph_edge > > else > > toplev_predicate = true_predicate (); > > > > + info->fp_expressions |= callee_info->fp_expressions; > > + > > if (callee_info->conds) > > evaluate_properties_for_edge (edge, true, &clause, NULL, NULL, NULL); > > if (ipa_node_params_sum && callee_info->conds) > > @@ -4229,6 +4257,7 @@ inline_read_section (struct lto_file_dec > > bp = streamer_read_bitpack (&ib); > > info->inlinable = bp_unpack_value (&bp, 1); > > info->contains_cilk_spawn = bp_unpack_value (&bp, 1); > > + info->fp_expressions = bp_unpack_value (&bp, 1); > > > > count2 = streamer_read_uhwi (&ib); > > gcc_assert (!info->conds); > > @@ -4395,6 +4424,7 @@ inline_write_summary (void) > > bp = bitpack_create (ob->main_stream); > > bp_pack_value (&bp, info->inlinable, 1); > > bp_pack_value (&bp, info->contains_cilk_spawn, 1); > > + bp_pack_value (&bp, info->fp_expressions, 1); > > streamer_write_bitpack (&bp); > > streamer_write_uhwi (ob, vec_safe_length (info->conds)); > > for (i = 0; vec_safe_iterate (info->conds, i, &c); i++) > > Index: ipa-inline.c > > =================================================================== > > --- ipa-inline.c (revision 235693) > > +++ ipa-inline.c (working copy) > > @@ -338,7 +338,7 @@ can_inline_edge_p (struct cgraph_edge *e > > else if (e->call_stmt_cannot_inline_p) > > { > > if (e->inline_failed != CIF_FUNCTION_NOT_OPTIMIZED) > > - e->inline_failed = CIF_MISMATCHED_ARGUMENTS; > > + e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK : CIF_MISMATCHED_ARGUMENTS; > > inlinable = false; > > } > > /* Don't inline if the functions have different EH personalities. */ > > @@ -396,6 +396,8 @@ can_inline_edge_p (struct cgraph_edge *e > > (DECL_DISREGARD_INLINE_LIMITS (callee->decl) > > && lookup_attribute ("always_inline", > > DECL_ATTRIBUTES (callee->decl))); > > + inline_summary *caller_info = inline_summaries->get (caller); > > + inline_summary *callee_info = inline_summaries->get (callee); > > > > /* Until GCC 4.9 we did not check the semantics alterning flags > > bellow and inline across optimization boundry. > > @@ -417,16 +419,21 @@ can_inline_edge_p (struct cgraph_edge *e > > == !opt_for_fn (callee->decl, optimize) || !always_inline)) > > || check_match (flag_wrapv) > > || check_match (flag_trapv) > > - /* Strictly speaking only when the callee uses FP math. */ > > - || check_maybe_up (flag_rounding_math) > > - || check_maybe_up (flag_trapping_math) > > - || check_maybe_down (flag_unsafe_math_optimizations) > > - || check_maybe_down (flag_finite_math_only) > > - || check_maybe_up (flag_signaling_nans) > > - || check_maybe_down (flag_cx_limited_range) > > - || check_maybe_up (flag_signed_zeros) > > - || check_maybe_down (flag_associative_math) > > - || check_maybe_down (flag_reciprocal_math) > > + /* When caller or callee does FP math, be sure FP codegen flags > > + compatible. */ > > + || ((caller_info->fp_expressions && callee_info->fp_expressions) > > + && (check_maybe_up (flag_rounding_math) > > + || check_maybe_up (flag_trapping_math) > > + || check_maybe_down (flag_unsafe_math_optimizations) > > + || check_maybe_down (flag_finite_math_only) > > + || check_maybe_up (flag_signaling_nans) > > + || check_maybe_down (flag_cx_limited_range) > > + || check_maybe_up (flag_signed_zeros) > > + || check_maybe_down (flag_associative_math) > > + || check_maybe_down (flag_reciprocal_math) > > + /* Strictly speaking only when the callee contains function > > + calls that may end up setting errno. */ > > + || check_maybe_up (flag_errno_math))) > > /* We do not want to make code compiled with exceptions to be > > brought into a non-EH function unless we know that the callee > > does not throw. > > @@ -435,9 +442,6 @@ can_inline_edge_p (struct cgraph_edge *e > > && DECL_FUNCTION_PERSONALITY (callee->decl)) > > || (check_maybe_up (flag_exceptions) > > && DECL_FUNCTION_PERSONALITY (callee->decl)) > > - /* Strictly speaking only when the callee contains function > > - calls that may end up setting errno. */ > > - || check_maybe_up (flag_errno_math) > > /* When devirtualization is diabled for callee, it is not safe > > to inline it as we possibly mangled the type info. > > Allow early inlining of always inlines. */ > > Index: ipa-inline.h > > =================================================================== > > --- ipa-inline.h (revision 235693) > > +++ ipa-inline.h (working copy) > > @@ -132,6 +132,8 @@ struct GTY(()) inline_summary > > /* True wen there is only one caller of the function before small function > > inlining. */ > > unsigned int single_caller : 1; > > + /* True if function contains any floating point expressions. */ > > + unsigned int fp_expressions : 1; > > > > /* Information about function that will result after applying all the > > inline decisions present in the callgraph. Generally kept up to > > Index: ipa-inline-transform.c > > =================================================================== > > --- ipa-inline-transform.c (revision 235693) > > +++ ipa-inline-transform.c (working copy) > > @@ -338,6 +338,63 @@ inline_call (struct cgraph_edge *e, bool > > DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) > > = build_optimization_node (&opts); > > } > > + inline_summary *caller_info = inline_summaries->get (to); > > + inline_summary *callee_info = inline_summaries->get (callee); > > + if (!caller_info->fp_expressions && callee_info->fp_expressions) > > + { > > + caller_info->fp_expressions = true; > > + if (opt_for_fn (callee->decl, flag_rounding_math) > > + != opt_for_fn (to->decl, flag_rounding_math) > > + || opt_for_fn (callee->decl, flag_trapping_math) > > + != opt_for_fn (to->decl, flag_trapping_math) > > + || opt_for_fn (callee->decl, flag_unsafe_math_optimizations) > > + != opt_for_fn (to->decl, flag_unsafe_math_optimizations) > > + || opt_for_fn (callee->decl, flag_finite_math_only) > > + != opt_for_fn (to->decl, flag_finite_math_only) > > + || opt_for_fn (callee->decl, flag_signaling_nans) > > + != opt_for_fn (to->decl, flag_signaling_nans) > > + || opt_for_fn (callee->decl, flag_cx_limited_range) > > + != opt_for_fn (to->decl, flag_cx_limited_range) > > + || opt_for_fn (callee->decl, flag_signed_zeros) > > + != opt_for_fn (to->decl, flag_signed_zeros) > > + || opt_for_fn (callee->decl, flag_associative_math) > > + != opt_for_fn (to->decl, flag_associative_math) > > + || opt_for_fn (callee->decl, flag_reciprocal_math) > > + != opt_for_fn (to->decl, flag_reciprocal_math) > > + || opt_for_fn (callee->decl, flag_errno_math) > > + != opt_for_fn (to->decl, flag_errno_math)) > > + { > > + struct gcc_options opts = global_options; > > + > > + cl_optimization_restore (&opts, opts_for_fn (to->decl)); > > + opts.x_flag_rounding_math > > + = opt_for_fn (callee->decl, flag_rounding_math); > > + opts.x_flag_trapping_math > > + = opt_for_fn (callee->decl, flag_trapping_math); > > + opts.x_flag_unsafe_math_optimizations > > + = opt_for_fn (callee->decl, flag_unsafe_math_optimizations); > > + opts.x_flag_finite_math_only > > + = opt_for_fn (callee->decl, flag_finite_math_only); > > + opts.x_flag_signaling_nans > > + = opt_for_fn (callee->decl, flag_signaling_nans); > > + opts.x_flag_cx_limited_range > > + = opt_for_fn (callee->decl, flag_cx_limited_range); > > + opts.x_flag_signed_zeros > > + = opt_for_fn (callee->decl, flag_signed_zeros); > > + opts.x_flag_associative_math > > + = opt_for_fn (callee->decl, flag_associative_math); > > + opts.x_flag_reciprocal_math > > + = opt_for_fn (callee->decl, flag_reciprocal_math); > > + opts.x_flag_errno_math > > + = opt_for_fn (callee->decl, flag_errno_math); > > + if (dump_file) > > + fprintf (dump_file, "Copying FP flags from %s:%i to %s:%i\n", > > + callee->name (), callee->order, to->name (), to->order); > > + build_optimization_node (&opts); > > + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) > > + = build_optimization_node (&opts); > > + } > > + } > > > > /* If aliases are involved, redirect edge to the actual destination and > > possibly remove the aliases. */ > >
Index: testsuite/ChangeLog =================================================================== --- testsuite/ChangeLog (revision 235765) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,7 @@ +2016-05-02 Jan Hubicka <hubicka@ucw.cz> + + * gcc.dg/ipa/inline-8.c: New testcase. + 2016-05-02 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/70467 Index: testsuite/gcc.dg/ipa/inline-8.c =================================================================== --- testsuite/gcc.dg/ipa/inline-8.c (revision 0) +++ testsuite/gcc.dg/ipa/inline-8.c (working copy) @@ -0,0 +1,36 @@ +/* Verify that we do not inline isnanf test info -ffast-math code but that we + do inline trivial functions across -Ofast boundary. */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ +#include <math.h> +/* Can't be inlined because isnanf will be optimized out. */ +int +cmp (float a) +{ + return isnanf (a); +} +/* Can be inlined. */ +int +move (int a) +{ + return a; +} +float a; +void +set () +{ + a=nan(""); +} +float b; +__attribute__ ((optimize("Ofast"))) +int +main() +{ + b++; + if (cmp(a)) + __builtin_abort (); + float a = move (1); + if (!__builtin_constant_p (a)) + __builtin_abort (); + return 0; +} Index: ChangeLog =================================================================== --- ChangeLog (revision 235765) +++ ChangeLog (working copy) @@ -1,3 +1,18 @@ +2016-05-02 Jan Hubicka <hubicka@ucw.cz> + + * ipa-inline-analysis.c (reset_inline_summary): Clear fp_expressions + (dump_inline_summary): Dump it. + (fp_expression_p): New predicate. + (estimate_function_body_sizes): Use it. + (inline_merge_summary): Merge fp_expressions. + (inline_read_section): Read fp_expressions. + (inline_write_summary): Write fp_expressions. + * ipa-inline.c (can_inline_edge_p): Permit inlining across fp math + codegen boundary if either caller or callee is !fp_expressions. + * ipa-inline.h (inline_summary): Add fp_expressions. + * ipa-inline-transform.c (inline_call): When inlining !fp_expressions + to fp_expressions be sure the fp generation flags are updated. + 2016-05-02 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/70467 Index: ipa-inline-analysis.c =================================================================== --- ipa-inline-analysis.c (revision 235693) +++ ipa-inline-analysis.c (working copy) @@ -850,7 +850,8 @@ evaluate_conditions_for_known_args (stru if (known_aggs.exists ()) { agg = known_aggs[c->operand_num]; - val = ipa_find_agg_cst_for_param (agg, c->offset, c->by_ref); + val = ipa_find_agg_cst_for_param (agg, known_vals[c->operand_num], + c->offset, c->by_ref); } else val = NULL_TREE; @@ -1069,6 +1070,7 @@ reset_inline_summary (struct cgraph_node reset_inline_edge_summary (e); for (e = node->indirect_calls; e; e = e->next_callee) reset_inline_edge_summary (e); + info->fp_expressions = false; } /* Hook that is called by cgraph.c when a node is removed. */ @@ -1423,6 +1425,8 @@ dump_inline_summary (FILE *f, struct cgr fprintf (f, " inlinable"); if (s->contains_cilk_spawn) fprintf (f, " contains_cilk_spawn"); + if (s->fp_expressions) + fprintf (f, " fp_expression"); fprintf (f, "\n self time: %i\n", s->self_time); fprintf (f, " global time: %i\n", s->time); fprintf (f, " self size: %i\n", s->self_size); @@ -2459,6 +2463,21 @@ clobber_only_eh_bb_p (basic_block bb, bo return true; } +/* Return true if STMT compute a floating point expression that may be affected + by -ffast-math and similar flags. */ + +static bool +fp_expression_p (gimple *stmt) +{ + ssa_op_iter i; + tree op; + + FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_DEF|SSA_OP_USE) + if (FLOAT_TYPE_P (TREE_TYPE (op))) + return true; + return false; +} + /* Compute function body size parameters for NODE. When EARLY is true, we compute only simple summaries without non-trivial predicates to drive the early inliner. */ @@ -2733,6 +2752,13 @@ estimate_function_body_sizes (struct cgr this_time * (2 - prob), &p); } + if (!info->fp_expressions && fp_expression_p (stmt)) + { + info->fp_expressions = true; + if (dump_file) + fprintf (dump_file, " fp_expression set\n"); + } + gcc_assert (time >= 0); gcc_assert (size >= 0); } @@ -3577,6 +3603,8 @@ inline_merge_summary (struct cgraph_edge else toplev_predicate = true_predicate (); + info->fp_expressions |= callee_info->fp_expressions; + if (callee_info->conds) evaluate_properties_for_edge (edge, true, &clause, NULL, NULL, NULL); if (ipa_node_params_sum && callee_info->conds) @@ -4229,6 +4257,7 @@ inline_read_section (struct lto_file_dec bp = streamer_read_bitpack (&ib); info->inlinable = bp_unpack_value (&bp, 1); info->contains_cilk_spawn = bp_unpack_value (&bp, 1); + info->fp_expressions = bp_unpack_value (&bp, 1); count2 = streamer_read_uhwi (&ib); gcc_assert (!info->conds); @@ -4395,6 +4424,7 @@ inline_write_summary (void) bp = bitpack_create (ob->main_stream); bp_pack_value (&bp, info->inlinable, 1); bp_pack_value (&bp, info->contains_cilk_spawn, 1); + bp_pack_value (&bp, info->fp_expressions, 1); streamer_write_bitpack (&bp); streamer_write_uhwi (ob, vec_safe_length (info->conds)); for (i = 0; vec_safe_iterate (info->conds, i, &c); i++) Index: ipa-inline.c =================================================================== --- ipa-inline.c (revision 235693) +++ ipa-inline.c (working copy) @@ -338,7 +338,7 @@ can_inline_edge_p (struct cgraph_edge *e else if (e->call_stmt_cannot_inline_p) { if (e->inline_failed != CIF_FUNCTION_NOT_OPTIMIZED) - e->inline_failed = CIF_MISMATCHED_ARGUMENTS; + e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK : CIF_MISMATCHED_ARGUMENTS; inlinable = false; } /* Don't inline if the functions have different EH personalities. */ @@ -396,6 +396,8 @@ can_inline_edge_p (struct cgraph_edge *e (DECL_DISREGARD_INLINE_LIMITS (callee->decl) && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee->decl))); + inline_summary *caller_info = inline_summaries->get (caller); + inline_summary *callee_info = inline_summaries->get (callee); /* Until GCC 4.9 we did not check the semantics alterning flags bellow and inline across optimization boundry. @@ -417,16 +419,21 @@ can_inline_edge_p (struct cgraph_edge *e == !opt_for_fn (callee->decl, optimize) || !always_inline)) || check_match (flag_wrapv) || check_match (flag_trapv) - /* Strictly speaking only when the callee uses FP math. */ - || check_maybe_up (flag_rounding_math) - || check_maybe_up (flag_trapping_math) - || check_maybe_down (flag_unsafe_math_optimizations) - || check_maybe_down (flag_finite_math_only) - || check_maybe_up (flag_signaling_nans) - || check_maybe_down (flag_cx_limited_range) - || check_maybe_up (flag_signed_zeros) - || check_maybe_down (flag_associative_math) - || check_maybe_down (flag_reciprocal_math) + /* When caller or callee does FP math, be sure FP codegen flags + compatible. */ + || ((caller_info->fp_expressions && callee_info->fp_expressions) + && (check_maybe_up (flag_rounding_math) + || check_maybe_up (flag_trapping_math) + || check_maybe_down (flag_unsafe_math_optimizations) + || check_maybe_down (flag_finite_math_only) + || check_maybe_up (flag_signaling_nans) + || check_maybe_down (flag_cx_limited_range) + || check_maybe_up (flag_signed_zeros) + || check_maybe_down (flag_associative_math) + || check_maybe_down (flag_reciprocal_math) + /* Strictly speaking only when the callee contains function + calls that may end up setting errno. */ + || check_maybe_up (flag_errno_math))) /* We do not want to make code compiled with exceptions to be brought into a non-EH function unless we know that the callee does not throw. @@ -435,9 +442,6 @@ can_inline_edge_p (struct cgraph_edge *e && DECL_FUNCTION_PERSONALITY (callee->decl)) || (check_maybe_up (flag_exceptions) && DECL_FUNCTION_PERSONALITY (callee->decl)) - /* Strictly speaking only when the callee contains function - calls that may end up setting errno. */ - || check_maybe_up (flag_errno_math) /* When devirtualization is diabled for callee, it is not safe to inline it as we possibly mangled the type info. Allow early inlining of always inlines. */ Index: ipa-inline.h =================================================================== --- ipa-inline.h (revision 235693) +++ ipa-inline.h (working copy) @@ -132,6 +132,8 @@ struct GTY(()) inline_summary /* True wen there is only one caller of the function before small function inlining. */ unsigned int single_caller : 1; + /* True if function contains any floating point expressions. */ + unsigned int fp_expressions : 1; /* Information about function that will result after applying all the inline decisions present in the callgraph. Generally kept up to Index: ipa-inline-transform.c =================================================================== --- ipa-inline-transform.c (revision 235693) +++ ipa-inline-transform.c (working copy) @@ -338,6 +338,63 @@ inline_call (struct cgraph_edge *e, bool DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) = build_optimization_node (&opts); } + inline_summary *caller_info = inline_summaries->get (to); + inline_summary *callee_info = inline_summaries->get (callee); + if (!caller_info->fp_expressions && callee_info->fp_expressions) + { + caller_info->fp_expressions = true; + if (opt_for_fn (callee->decl, flag_rounding_math) + != opt_for_fn (to->decl, flag_rounding_math) + || opt_for_fn (callee->decl, flag_trapping_math) + != opt_for_fn (to->decl, flag_trapping_math) + || opt_for_fn (callee->decl, flag_unsafe_math_optimizations) + != opt_for_fn (to->decl, flag_unsafe_math_optimizations) + || opt_for_fn (callee->decl, flag_finite_math_only) + != opt_for_fn (to->decl, flag_finite_math_only) + || opt_for_fn (callee->decl, flag_signaling_nans) + != opt_for_fn (to->decl, flag_signaling_nans) + || opt_for_fn (callee->decl, flag_cx_limited_range) + != opt_for_fn (to->decl, flag_cx_limited_range) + || opt_for_fn (callee->decl, flag_signed_zeros) + != opt_for_fn (to->decl, flag_signed_zeros) + || opt_for_fn (callee->decl, flag_associative_math) + != opt_for_fn (to->decl, flag_associative_math) + || opt_for_fn (callee->decl, flag_reciprocal_math) + != opt_for_fn (to->decl, flag_reciprocal_math) + || opt_for_fn (callee->decl, flag_errno_math) + != opt_for_fn (to->decl, flag_errno_math)) + { + struct gcc_options opts = global_options; + + cl_optimization_restore (&opts, opts_for_fn (to->decl)); + opts.x_flag_rounding_math + = opt_for_fn (callee->decl, flag_rounding_math); + opts.x_flag_trapping_math + = opt_for_fn (callee->decl, flag_trapping_math); + opts.x_flag_unsafe_math_optimizations + = opt_for_fn (callee->decl, flag_unsafe_math_optimizations); + opts.x_flag_finite_math_only + = opt_for_fn (callee->decl, flag_finite_math_only); + opts.x_flag_signaling_nans + = opt_for_fn (callee->decl, flag_signaling_nans); + opts.x_flag_cx_limited_range + = opt_for_fn (callee->decl, flag_cx_limited_range); + opts.x_flag_signed_zeros + = opt_for_fn (callee->decl, flag_signed_zeros); + opts.x_flag_associative_math + = opt_for_fn (callee->decl, flag_associative_math); + opts.x_flag_reciprocal_math + = opt_for_fn (callee->decl, flag_reciprocal_math); + opts.x_flag_errno_math + = opt_for_fn (callee->decl, flag_errno_math); + if (dump_file) + fprintf (dump_file, "Copying FP flags from %s:%i to %s:%i\n", + callee->name (), callee->order, to->name (), to->order); + build_optimization_node (&opts); + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) + = build_optimization_node (&opts); + } + } /* If aliases are involved, redirect edge to the actual destination and possibly remove the aliases. */