Message ID | 20180130131354.GB5559@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
Series | Fix ipa-inline ICE | expand |
Hi Jan, > this patch fixed ICE that was introduced by Richard Standiford's change to reorder > can and want_inline predicates to reduce amount of work done to verify inlining limits. > This bypasses check that the function is optimized that makes inliner to ICE because > function summary is missing. > > This patch breaks out the expensive limits checking from can predicate to new one > which makes code bit more convoluted but I hope to clean things up next stage1. [...] > * g++.dg/torture/pr81360.C: New testcase the new testcase comes out UNRESOLVED everywhere: pr81360.C -O0 scan-ipa-dump icf "Equal symbols: 0" +UNRESOLVED: g++.dg/torture/pr81360.C -O1 scan-ipa-dump icf "Equal symbols: 0" +UNRESOLVED: g++.dg/torture/pr81360.C -O2 scan-ipa-dump icf "Equal symbols: 0" +UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto scan-ipa-dump icf "Equal symbols: 0" +UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto -flto-partition=none scan-ipa-dump icf "Equal symbols: 0" +UNRESOLVED: g++.dg/torture/pr81360.C -O3 -g scan-ipa-dump icf "Equal symbols: 0" +UNRESOLVED: g++.dg/torture/pr81360.C -Os scan-ipa-dump icf "Equal symbols: 0" with g++.dg/torture/pr81360.C -O0 : dump file does not exist in the log. The following patch fixes that, tested with the appropriate runtest invocation. I guess this is obvious? Rainer
On January 30, 2018 9:05:31 PM GMT+01:00, Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> wrote: >Hi Jan, > >> this patch fixed ICE that was introduced by Richard Standiford's >change to reorder >> can and want_inline predicates to reduce amount of work done to >verify inlining limits. >> This bypasses check that the function is optimized that makes inliner >to ICE because >> function summary is missing. >> >> This patch breaks out the expensive limits checking from can >predicate to new one >> which makes code bit more convoluted but I hope to clean things up >next stage1. >[...] >> * g++.dg/torture/pr81360.C: New testcase > >the new testcase comes out UNRESOLVED everywhere: > >pr81360.C -O0 scan-ipa-dump icf "Equal symbols: 0" >+UNRESOLVED: g++.dg/torture/pr81360.C -O1 scan-ipa-dump icf "Equal >symbols: 0" >+UNRESOLVED: g++.dg/torture/pr81360.C -O2 scan-ipa-dump icf "Equal >symbols: 0" >+UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto scan-ipa-dump icf >"Equal symbols: 0" >+UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto -flto-partition=none > scan-ipa-dump icf "Equal symbols: 0" >+UNRESOLVED: g++.dg/torture/pr81360.C -O3 -g scan-ipa-dump icf >"Equal symbols: 0" >+UNRESOLVED: g++.dg/torture/pr81360.C -Os scan-ipa-dump icf "Equal >symbols: 0" > >with > >g++.dg/torture/pr81360.C -O0 : dump file does not exist > >in the log. The following patch fixes that, tested with the >appropriate >runtest invocation. > >I guess this is obvious? Yes. > Rainer
On Tue, Jan 30, 2018 at 09:05:31PM +0100, Rainer Orth wrote: > > this patch fixed ICE that was introduced by Richard Standiford's change to reorder > > can and want_inline predicates to reduce amount of work done to verify inlining limits. > > This bypasses check that the function is optimized that makes inliner to ICE because > > function summary is missing. > > > > This patch breaks out the expensive limits checking from can predicate to new one > > which makes code bit more convoluted but I hope to clean things up next stage1. > [...] > > * g++.dg/torture/pr81360.C: New testcase > > the new testcase comes out UNRESOLVED everywhere: > > pr81360.C -O0 scan-ipa-dump icf "Equal symbols: 0" > +UNRESOLVED: g++.dg/torture/pr81360.C -O1 scan-ipa-dump icf "Equal symbols: 0" > +UNRESOLVED: g++.dg/torture/pr81360.C -O2 scan-ipa-dump icf "Equal symbols: 0" > +UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto scan-ipa-dump icf "Equal symbols: 0" > +UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto -flto-partition=none scan-ipa-dump icf "Equal symbols: 0" > +UNRESOLVED: g++.dg/torture/pr81360.C -O3 -g scan-ipa-dump icf "Equal symbols: 0" > +UNRESOLVED: g++.dg/torture/pr81360.C -Os scan-ipa-dump icf "Equal symbols: 0" > > with > > g++.dg/torture/pr81360.C -O0 : dump file does not exist > > in the log. The following patch fixes that, tested with the appropriate > runtest invocation. > > I guess this is obvious? Yes, but I wonder why the test is in g++.dg/torture/, that makes no sense for a test with -O2 in dg-options which overrides all the optimization options it cycles through. Move it into g++.dg/ipa/ instead? Jakub
On 30 January 2018 at 23:42, Jakub Jelinek <jakub@redhat.com> wrote: > On Tue, Jan 30, 2018 at 09:05:31PM +0100, Rainer Orth wrote: >> > this patch fixed ICE that was introduced by Richard Standiford's change to reorder >> > can and want_inline predicates to reduce amount of work done to verify inlining limits. >> > This bypasses check that the function is optimized that makes inliner to ICE because >> > function summary is missing. >> > >> > This patch breaks out the expensive limits checking from can predicate to new one >> > which makes code bit more convoluted but I hope to clean things up next stage1. >> [...] >> > * g++.dg/torture/pr81360.C: New testcase >> >> the new testcase comes out UNRESOLVED everywhere: >> >> pr81360.C -O0 scan-ipa-dump icf "Equal symbols: 0" >> +UNRESOLVED: g++.dg/torture/pr81360.C -O1 scan-ipa-dump icf "Equal symbols: 0" >> +UNRESOLVED: g++.dg/torture/pr81360.C -O2 scan-ipa-dump icf "Equal symbols: 0" >> +UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto scan-ipa-dump icf "Equal symbols: 0" >> +UNRESOLVED: g++.dg/torture/pr81360.C -O2 -flto -flto-partition=none scan-ipa-dump icf "Equal symbols: 0" >> +UNRESOLVED: g++.dg/torture/pr81360.C -O3 -g scan-ipa-dump icf "Equal symbols: 0" >> +UNRESOLVED: g++.dg/torture/pr81360.C -Os scan-ipa-dump icf "Equal symbols: 0" >> >> with >> >> g++.dg/torture/pr81360.C -O0 : dump file does not exist >> >> in the log. The following patch fixes that, tested with the appropriate >> runtest invocation. >> >> I guess this is obvious? > > Yes, but I wonder why the test is in g++.dg/torture/, that makes > no sense for a test with -O2 in dg-options which overrides all the > optimization options it cycles through. Move it into g++.dg/ipa/ instead? > > Jakub Even with Rainer's fix, I've noticed that the new test fails on arm/aarch64: FAIL: g++.dg/torture/pr81360.C -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects scan-ipa-dump icf "Equal symbols: 0" Christophe
On Wed, Jan 31, 2018 at 09:51:31AM +0100, Christophe Lyon wrote: > Even with Rainer's fix, I've noticed that the new test fails on arm/aarch64: > FAIL: g++.dg/torture/pr81360.C -O2 -flto -fuse-linker-plugin > -fno-fat-lto-objects scan-ipa-dump icf "Equal symbols: 0" With -flto -fno-fat-lto-objects I believe we just write the bytecode before IPA passes and stop, so dg-do compile test won't really do icf at compile time, but at link time. The more reason to move the test to g++.dg/ipa/ ... Jakub
> On Wed, Jan 31, 2018 at 09:51:31AM +0100, Christophe Lyon wrote: > > Even with Rainer's fix, I've noticed that the new test fails on arm/aarch64: > > FAIL: g++.dg/torture/pr81360.C -O2 -flto -fuse-linker-plugin > > -fno-fat-lto-objects scan-ipa-dump icf "Equal symbols: 0" > > With -flto -fno-fat-lto-objects I believe we just write the bytecode before > IPA passes and stop, so dg-do compile test won't really do icf at compile > time, but at link time. > The more reason to move the test to g++.dg/ipa/ ... The testcase was about ICE on another testcase and I have copied the Equal symbols check by accident. We can drop all the dump scanning from the test, just we want to see if it builds. Honza > > Jakub
On Wed, Jan 31, 2018 at 10:16:10AM +0100, Jan Hubicka wrote: > > On Wed, Jan 31, 2018 at 09:51:31AM +0100, Christophe Lyon wrote: > > > Even with Rainer's fix, I've noticed that the new test fails on arm/aarch64: > > > FAIL: g++.dg/torture/pr81360.C -O2 -flto -fuse-linker-plugin > > > -fno-fat-lto-objects scan-ipa-dump icf "Equal symbols: 0" > > > > With -flto -fno-fat-lto-objects I believe we just write the bytecode before > > IPA passes and stop, so dg-do compile test won't really do icf at compile > > time, but at link time. > > The more reason to move the test to g++.dg/ipa/ ... > > The testcase was about ICE on another testcase and I have copied the Equal symbols > check by accident. We can drop all the dump scanning from the test, just we want > to see if it builds. If you want to torture test it, please drop the -O2 from dg-options though too. Jakub
> On Wed, Jan 31, 2018 at 10:16:10AM +0100, Jan Hubicka wrote: > > > On Wed, Jan 31, 2018 at 09:51:31AM +0100, Christophe Lyon wrote: > > > > Even with Rainer's fix, I've noticed that the new test fails on arm/aarch64: > > > > FAIL: g++.dg/torture/pr81360.C -O2 -flto -fuse-linker-plugin > > > > -fno-fat-lto-objects scan-ipa-dump icf "Equal symbols: 0" > > > > > > With -flto -fno-fat-lto-objects I believe we just write the bytecode before > > > IPA passes and stop, so dg-do compile test won't really do icf at compile > > > time, but at link time. > > > The more reason to move the test to g++.dg/ipa/ ... > > > > The testcase was about ICE on another testcase and I have copied the Equal symbols > > check by accident. We can drop all the dump scanning from the test, just we want > > to see if it builds. > > If you want to torture test it, please drop the -O2 from dg-options though > too. Like this? Index: testsuite/g++.dg/torture/pr81360.C =================================================================== --- testsuite/g++.dg/torture/pr81360.C (revision 257226) +++ testsuite/g++.dg/torture/pr81360.C (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-icf" } */ +/* { dg-options "-fno-early-inlining" } */ template <int dim> class B; template <int, int dim> class TriaObjectAccessor; @@ -76,4 +76,3 @@ int main() return 0; } -/* { dg-final { scan-ipa-dump "Equal symbols: 0" "icf" } } */
On Wed, Jan 31, 2018 at 11:28:26AM +0100, Jan Hubicka wrote: > > On Wed, Jan 31, 2018 at 10:16:10AM +0100, Jan Hubicka wrote: > > > > On Wed, Jan 31, 2018 at 09:51:31AM +0100, Christophe Lyon wrote: > > > > > Even with Rainer's fix, I've noticed that the new test fails on arm/aarch64: > > > > > FAIL: g++.dg/torture/pr81360.C -O2 -flto -fuse-linker-plugin > > > > > -fno-fat-lto-objects scan-ipa-dump icf "Equal symbols: 0" > > > > > > > > With -flto -fno-fat-lto-objects I believe we just write the bytecode before > > > > IPA passes and stop, so dg-do compile test won't really do icf at compile > > > > time, but at link time. > > > > The more reason to move the test to g++.dg/ipa/ ... > > > > > > The testcase was about ICE on another testcase and I have copied the Equal symbols > > > check by accident. We can drop all the dump scanning from the test, just we want > > > to see if it builds. > > > > If you want to torture test it, please drop the -O2 from dg-options though > > too. > Like this? > > Index: testsuite/g++.dg/torture/pr81360.C > =================================================================== > --- testsuite/g++.dg/torture/pr81360.C (revision 257226) > +++ testsuite/g++.dg/torture/pr81360.C (working copy) > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-icf" } */ > +/* { dg-options "-fno-early-inlining" } */ > > template <int dim> class B; > template <int, int dim> class TriaObjectAccessor; > @@ -76,4 +76,3 @@ int main() > return 0; > } > > -/* { dg-final { scan-ipa-dump "Equal symbols: 0" "icf" } } */ Yes, thanks (with ChangeLog of course). Jakub
Index: ipa-inline.c =================================================================== --- ipa-inline.c (revision 257174) +++ ipa-inline.c (working copy) @@ -289,18 +289,16 @@ sanitize_attrs_match_for_inline_p (const (opts_for_fn (caller->decl)->x_##flag \ != opts_for_fn (callee->decl)->x_##flag) - /* Decide if we can inline the edge and possibly update +/* Decide if we can inline the edge and possibly update inline_failed reason. We check whether inlining is possible at all and whether caller growth limits allow doing so. - if REPORT is true, output reason to the dump file. - - if DISREGARD_LIMITS is true, ignore size limits.*/ + if REPORT is true, output reason to the dump file. */ static bool can_inline_edge_p (struct cgraph_edge *e, bool report, - bool disregard_limits = false, bool early = false) + bool early = false) { gcc_checking_assert (e->inline_failed); @@ -316,9 +314,6 @@ can_inline_edge_p (struct cgraph_edge *e cgraph_node *caller = e->caller->global.inlined_to ? e->caller->global.inlined_to : e->caller; cgraph_node *callee = e->callee->ultimate_alias_target (&avail, caller); - tree caller_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (caller->decl); - tree callee_tree - = callee ? DECL_FUNCTION_SPECIFIC_OPTIMIZATION (callee->decl) : NULL; if (!callee->definition) { @@ -379,12 +374,47 @@ can_inline_edge_p (struct cgraph_edge *e e->inline_failed = CIF_ATTRIBUTE_MISMATCH; inlinable = false; } + if (!inlinable && report) + report_inline_failed_reason (e); + return inlinable; +} + +/* Decide if we can inline the edge and possibly update + inline_failed reason. + We check whether inlining is possible at all and whether + caller growth limits allow doing so. + + if REPORT is true, output reason to the dump file. + + if DISREGARD_LIMITS is true, ignore size limits. */ + +static bool +can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report, + bool disregard_limits = false, bool early = false) +{ + gcc_checking_assert (e->inline_failed); + + if (cgraph_inline_failed_type (e->inline_failed) == CIF_FINAL_ERROR) + { + if (report) + report_inline_failed_reason (e); + return false; + } + + bool inlinable = true; + enum availability avail; + cgraph_node *caller = e->caller->global.inlined_to + ? e->caller->global.inlined_to : e->caller; + cgraph_node *callee = e->callee->ultimate_alias_target (&avail, caller); + tree caller_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (caller->decl); + tree callee_tree + = callee ? DECL_FUNCTION_SPECIFIC_OPTIMIZATION (callee->decl) : NULL; /* Check if caller growth allows the inlining. */ - else if (!DECL_DISREGARD_INLINE_LIMITS (callee->decl) - && !disregard_limits - && !lookup_attribute ("flatten", - DECL_ATTRIBUTES (caller->decl)) - && !caller_growth_limits (e)) + if (!DECL_DISREGARD_INLINE_LIMITS (callee->decl) + && !disregard_limits + && !lookup_attribute ("flatten", + DECL_ATTRIBUTES (caller->decl)) + && !caller_growth_limits (e)) inlinable = false; /* Don't inline a function with a higher optimization level than the caller. FIXME: this is really just tip of iceberg of handling @@ -541,7 +571,8 @@ can_early_inline_edge_p (struct cgraph_e fprintf (dump_file, " edge not inlinable: not in SSA form\n"); return false; } - if (!can_inline_edge_p (e, true, false, true)) + if (!can_inline_edge_p (e, true, true) + || !can_inline_edge_by_limits_p (e, true, false, true)) return false; return true; } @@ -925,6 +956,8 @@ check_callers (struct cgraph_node *node, return true; if (e->recursive_p ()) return true; + if (!can_inline_edge_by_limits_p (e, true)) + return true; if (!(*(bool *)has_hot_call) && e->maybe_hot_p ()) *(bool *)has_hot_call = true; } @@ -1317,8 +1350,9 @@ update_caller_keys (edge_heap_t *heap, s if (!check_inlinablity_for || check_inlinablity_for == edge) { - if (want_inline_small_function_p (edge, false) - && can_inline_edge_p (edge, false)) + if (can_inline_edge_p (edge, false) + && want_inline_small_function_p (edge, false) + && can_inline_edge_by_limits_p (edge, false)) update_edge_key (heap, edge); else if (edge->aux) { @@ -1361,8 +1395,9 @@ update_callee_keys (edge_heap_t *heap, s && avail >= AVAIL_AVAILABLE && !bitmap_bit_p (updated_nodes, callee->uid)) { - if (want_inline_small_function_p (e, false) - && can_inline_edge_p (e, false)) + if (can_inline_edge_p (e, false) + && want_inline_small_function_p (e, false) + && can_inline_edge_by_limits_p (e, false)) update_edge_key (heap, e); else if (e->aux) { @@ -1449,7 +1484,8 @@ recursive_inlining (struct cgraph_edge * struct cgraph_edge *curr = heap.extract_min (); struct cgraph_node *cnode, *dest = curr->callee; - if (!can_inline_edge_p (curr, true)) + if (!can_inline_edge_p (curr, true) + || can_inline_edge_by_limits_p (curr, true)) continue; /* MASTER_CLONE is produced in the case we already started modified @@ -1569,7 +1605,8 @@ add_new_edges_to_heap (edge_heap_t *heap gcc_assert (!edge->aux); if (edge->inline_failed && can_inline_edge_p (edge, true) - && want_inline_small_function_p (edge, true)) + && want_inline_small_function_p (edge, true) + && can_inline_edge_by_limits_p (edge, true)) edge->aux = heap->insert (edge_badness (edge, false), edge); } } @@ -1630,7 +1667,9 @@ speculation_useful_p (struct cgraph_edge if (!anticipate_inlining && e->inline_failed && !target->local.local) return false; /* For overwritable targets there is not much to do. */ - if (e->inline_failed && !can_inline_edge_p (e, false, true)) + if (e->inline_failed + && (!can_inline_edge_p (e, false) + || !can_inline_edge_by_limits_p (e, false, true))) return false; /* OK, speculation seems interesting. */ return true; @@ -1790,6 +1829,7 @@ inline_small_functions (void) && !edge->aux && can_inline_edge_p (edge, true) && want_inline_small_function_p (edge, true) + && can_inline_edge_by_limits_p (edge, true) && edge->inline_failed) { gcc_assert (!edge->aux); @@ -1890,7 +1930,8 @@ inline_small_functions (void) badness = current_badness; } - if (!can_inline_edge_p (edge, true)) + if (!can_inline_edge_p (edge, true) + || !can_inline_edge_by_limits_p (edge, true)) { resolve_noninline_speculation (&edge_heap, edge); continue; @@ -2101,6 +2142,7 @@ flatten_function (struct cgraph_node *no too. */ if (!early ? !can_inline_edge_p (e, true) + && !can_inline_edge_by_limits_p (e, true) : !can_early_inline_edge_p (e)) continue; @@ -2155,6 +2197,7 @@ inline_to_all_callers_1 (struct cgraph_n struct cgraph_node *caller = node->callers->caller; if (!can_inline_edge_p (node->callers, true) + || !can_inline_edge_by_limits_p (node->callers, true) || node->callers->recursive_p ()) { if (dump_file) Index: testsuite/g++.dg/torture/pr81360.C =================================================================== --- testsuite/g++.dg/torture/pr81360.C (revision 0) +++ testsuite/g++.dg/torture/pr81360.C (working copy) @@ -0,0 +1,79 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-early-inlining" } */ + +template <int dim> class B; +template <int, int dim> class TriaObjectAccessor; +template <int, typename Accessor> class A; +template <int dim> class TriaDimensionInfo { +public: + typedef A<3, TriaObjectAccessor<2, 3> > raw_quad_iterator; + typedef A<3, B<3> > raw_hex_iterator; + typedef raw_hex_iterator raw_cell_iterator; +}; +template <int dim> class Triangulation : public TriaDimensionInfo<1> { + public: + typedef typename TriaDimensionInfo<dim>::raw_quad_iterator raw_quad_iterator; + TriaDimensionInfo::raw_cell_iterator end() const; + raw_quad_iterator end_quad() const { + return raw_quad_iterator(const_cast<Triangulation *>(this), 0, 0); + } +}; +template <int dim> class TriaAccessor { +public: + typedef void AccessorData; + TriaAccessor(const Triangulation<dim> * = 0); + Triangulation<1> *tria; + + int a, b, c; +}; +template <int dim> class TriaObjectAccessor<2, dim> : public TriaAccessor<dim> { +public: + typedef typename TriaAccessor<dim>::AccessorData AccessorData; + TriaObjectAccessor(const Triangulation<dim> * = 0); +}; +template <int dim> class TriaObjectAccessor<3, dim> : public TriaAccessor<dim> { +public: + typedef typename TriaAccessor<dim>::AccessorData AccessorData; + TriaObjectAccessor(const Triangulation<dim> * = 0); +}; +template <int dim> class B : public TriaObjectAccessor<dim, dim> { +public: + typedef typename TriaObjectAccessor<dim, dim>::AccessorData AccessorData; + B(const Triangulation<dim> * = 0); +}; +template <int dim, typename Accessor> class A { +public: + A(const A &); + A(const Triangulation<dim> *, int, int); + Accessor accessor; +}; +template class Triangulation<3>; +template <int dim, typename Accessor> +A<dim, Accessor>::A(const Triangulation<dim> *, int, int) {} +template <int dim> +TriaAccessor<dim>::TriaAccessor(const Triangulation<dim> *) + : tria(), a(-1), b(-2), c(-3) {} +template <int dim> +TriaObjectAccessor<2, dim>::TriaObjectAccessor(const Triangulation<dim> *) {} +template <int dim> +TriaObjectAccessor<3, dim>::TriaObjectAccessor(const Triangulation<dim> *) {} +template <int dim> B<dim>::B(const Triangulation<dim> *) {} +template <> +TriaDimensionInfo<3>::raw_cell_iterator Triangulation<3>::end() const { + return raw_hex_iterator(const_cast<Triangulation *>(this), 0, 0); +} + +#pragma GCC optimize ("-O0") +int main() +{ + Triangulation <3> t; + Triangulation<3>::raw_quad_iterator i1 = t.end_quad(); + TriaDimensionInfo<3>::raw_cell_iterator i2 = t.end(); + + if(i2.accessor.c != -3) + return 1; + + return 0; +} + +/* { dg-final { scan-ipa-dump "Equal symbols: 0" "icf" } } */