Message ID | 20151210073037.GA40772@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
Hi, thanks for looking into this, I only have one question: On Thu, Dec 10, 2015 at 08:30:37AM +0100, Jan Hubicka wrote: > Martin, > while looking into the ipa-cp dumps for bzip and Firefox I noticed few issues. > First of all, ipcp_cloning_candidate_p calls > optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl)) > which can not be used at WPA time, becuase we have no DECL_STRUCT_FUNCTION > around. I replaced it by node->optimize_for_size_p (). > > Second we perform incredible number of clones because we do obtain some sort of > polymorphic call context for them. In wast majority of cases this is useless > effort, because the functions in question do not contain virtual calls and do > not pass the parameter further. For firefox about 40k out of 50k clones > created are created just because we found some context. > > I changed the code to only clone if this immediately leads to devirtualization. > This do not cause any noticeable drop in number of devirtualized calls on > Firefox. I suppose we will miss the case where cloning a caller may allow > devirtualization in a clone of callee, but I do not think the heuristics for > context independent values can handle this as implemented right now and it > simply have way to many false positives. > > What we can do is to devirtualize w/o cloning for local functions and > speculatively devirtualize in case we would otherwise clone. > > Third problem I noticed is that > will_be_removed_from_program_if_no_direct_calls_p is used to decide if we can > ignore the function size when deciding about the code size impact. > This function is doing some analysis for inliner where it, for example, analyses > if a comdat which is going to be inlined consistently in the whole program > will be removed. > > In the cloning case I do not see this to apply: we have no evidence that the > other units will pass the same constants to the function. I think you > basically want to assume that the function will be removed if it has no > address taken and it is not externally visibible. This is what local flag > is for. > > I gathered some stats: > > number of clones for all contexts: 49948->11102 > number of clones: 4376->4383 > > good_cloning_opportunity_p is called about 70k times, I wonder if the > thresholds are not simply set too high. For example, inliner does about 300k > inlines at Firefox. > > number of param replacements: 13041-> 13056 + 5383 aggregate replacements (I do not have data on unpatched tree for this) > number of devirts: 956->933 > number of devirts happening at inline: 781->868 > number of indirect calls promoted: 512->512 > > Inliner stats from: Unit growth for small function inlining: 7965701->9130051 (14%) > to: Unit growth for small function inlining: 7965010->9138577 > > So it seems that except for large drop in number of clones there is no significant difference. > > I am bootstrapping/regtesting this on x86_64-linux, does it seem OK? > > Honza > > * ipa-cp.c (ipcp_cloning_candidate_p): Use node->optimize_for_size_p. > (good_cloning_opportunity_p): Likewise. > (gather_context_independent_values): Do not return true when > polymorphic call context is known or when we have known aggregate > value of unused parameter. > (estimate_local_effects): Try to create clone for all context > when either some params are substituted or devirtualization is possible > or some params can be removed; use local flag instead of > node->will_be_removed_from_program_if_no_direct_calls_p. > (identify_dead_nodes): Likewise. > Index: ipa-cp.c > =================================================================== > --- ipa-cp.c (revision 231477) > +++ ipa-cp.c (working copy) > @@ -613,7 +613,7 @@ ipcp_cloning_candidate_p (struct cgraph_ > return false; > } > > - if (!optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl))) > + if (node->optimize_for_size_p ()) > { > if (dump_file) > fprintf (dump_file, "Not considering %s for cloning; " > @@ -2267,7 +2267,7 @@ good_cloning_opportunity_p (struct cgrap > { > if (time_benefit == 0 > || !opt_for_fn (node->decl, flag_ipa_cp_clone) > - || !optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl))) > + || node->optimize_for_size_p ()) > return false; > > gcc_assert (size_cost > 0); > @@ -2387,12 +2387,14 @@ gather_context_independent_values (struc > *removable_params_cost > += ipa_get_param_move_cost (info, i); > > + if (!ipa_is_param_used (info, i)) > + continue; > + Is this really necessary, is it not enough to remove the assignment to ret below? If the parameter is not used, devirtualization time bonus, which you then rely on estimate_local_effects, should be zero for it. It is a very minor point, I suppose, but if the function gets cloned for a different reason, it might still be beneficial to have as much context-independent information for it as possible too, because that can then be used in a callee (see the second call of gather_context_independent_values). Other than that, all the changes seem like a clear improvement. Thanks, Martin > ipcp_lattice<ipa_polymorphic_call_context> *ctxlat = &plats->ctxlat; > + /* Do not account known context as reason for cloning. We can see > + if it permits devirtualization. */ > if (ctxlat->is_single_const ()) > - { > - (*known_contexts)[i] = ctxlat->values->value; > - ret = true; > - } > + (*known_contexts)[i] = ctxlat->values->value; > > if (known_aggs) > { > @@ -2494,7 +2496,9 @@ estimate_local_effects (struct cgraph_no > &known_contexts, &known_aggs, > &removable_params_cost); > known_aggs_ptrs = agg_jmp_p_vec_for_t_vec (known_aggs); > - if (always_const) > + int devirt_bonus = devirtualization_time_bonus (node, known_csts, > + known_contexts, known_aggs_ptrs); > + if (always_const || devirt_bonus || removable_params_cost) > { > struct caller_statistics stats; > inline_hints hints; > @@ -2505,8 +2509,7 @@ estimate_local_effects (struct cgraph_no > false); > estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts, > known_aggs_ptrs, &size, &time, &hints); > - time -= devirtualization_time_bonus (node, known_csts, known_contexts, > - known_aggs_ptrs); > + time -= devirt_bonus; > time -= hint_time_bonus (hints); > time -= removable_params_cost; > size -= stats.n_calls * removable_params_cost; > @@ -2515,8 +2518,7 @@ estimate_local_effects (struct cgraph_no > fprintf (dump_file, " - context independent values, size: %i, " > "time_benefit: %i\n", size, base_time - time); > > - if (size <= 0 > - || node->will_be_removed_from_program_if_no_direct_calls_p ()) > + if (size <= 0 || node->local.local) > { > info->do_clone_for_all_contexts = true; > base_time = time; > @@ -2544,6 +2546,10 @@ estimate_local_effects (struct cgraph_no > "max_new_size would be reached with %li.\n", > size + overall_size); > } > + else if (dump_file && (dump_flags & TDF_DETAILS)) > + fprintf (dump_file, " Not cloning for all contexts because " > + "!good_cloning_opportunity_p.\n"); > + > } > > for (i = 0; i < count ; i++) > @@ -4419,7 +4425,7 @@ identify_dead_nodes (struct cgraph_node > { > struct cgraph_node *v; > for (v = node; v ; v = ((struct ipa_dfs_info *) v->aux)->next_cycle) > - if (v->will_be_removed_from_program_if_no_direct_calls_p () > + if (v->local.local > && !v->call_for_symbol_thunks_and_aliases > (has_undead_caller_from_outside_scc_p, NULL, true)) > IPA_NODE_REF (v)->node_dead = 1;
> Is this really necessary, is it not enough to remove the assignment to > ret below? If the parameter is not used, devirtualization time bonus, > which you then rely on estimate_local_effects, should be zero for it. > > It is a very minor point, I suppose, but if the function gets cloned > for a different reason, it might still be beneficial to have as much > context-independent information for it as possible too, because that > can then be used in a callee (see the second call of > gather_context_independent_values). > > Other than that, all the changes seem like a clear improvement. The cutoff is there mainly for the rest of the function: if (known_aggs) { vec<ipa_agg_jf_item, va_gc> *agg_items; struct ipa_agg_jump_function *ajf; agg_items = context_independent_aggregate_values (plats); ajf = &(*known_aggs)[i]; ajf->items = agg_items; ajf->by_ref = plats->aggs_by_ref; ret |= agg_items != NULL; } I did not want ret to become true if we manage to propagate into an unused aggregate parameter. Honza
On Thu, Dec 10, 2015 at 05:56:26PM +0100, Jan Hubicka wrote: > > Is this really necessary, is it not enough to remove the assignment to > > ret below? If the parameter is not used, devirtualization time bonus, > > which you then rely on estimate_local_effects, should be zero for it. > > > > It is a very minor point, I suppose, but if the function gets cloned > > for a different reason, it might still be beneficial to have as much > > context-independent information for it as possible too, because that > > can then be used in a callee (see the second call of > > gather_context_independent_values). > > > > Other than that, all the changes seem like a clear improvement. > > The cutoff is there mainly for the rest of the function: > if (known_aggs) > { > vec<ipa_agg_jf_item, va_gc> *agg_items; > struct ipa_agg_jump_function *ajf; > > agg_items = context_independent_aggregate_values (plats); > ajf = &(*known_aggs)[i]; > ajf->items = agg_items; > ajf->by_ref = plats->aggs_by_ref; > ret |= agg_items != NULL; > } > I did not want ret to become true if we manage to propagate into an unused > aggregate parameter. I see, it makes sense. Thanks, Martin
On Thu, Dec 10, 2015 at 08:30:37AM +0100, Jan Hubicka wrote: > I am bootstrapping/regtesting this on x86_64-linux, does it seem OK? I think this patch (just a guess, but certainly ipa-cp related during last 24 hours) significantly regressed guality/pr36728-*.c on x86_64. Previously we have not turned foo into foo.constprop*, now we do, and pass just arg7 instead of arg1..arg7. That is fine, but we really should be emitting the debug info stuff for that case, that was added to fix PR47858, but for whatever reason it doesn't happen in this case. Does it take some other path in ipa-prop.c, or bypass ipa-prop, something different? > * ipa-cp.c (ipcp_cloning_candidate_p): Use node->optimize_for_size_p. > (good_cloning_opportunity_p): Likewise. > (gather_context_independent_values): Do not return true when > polymorphic call context is known or when we have known aggregate > value of unused parameter. > (estimate_local_effects): Try to create clone for all context > when either some params are substituted or devirtualization is possible > or some params can be removed; use local flag instead of > node->will_be_removed_from_program_if_no_direct_calls_p. > (identify_dead_nodes): Likewise. Jakub
> On Thu, Dec 10, 2015 at 08:30:37AM +0100, Jan Hubicka wrote: > > I am bootstrapping/regtesting this on x86_64-linux, does it seem OK? > > I think this patch (just a guess, but certainly ipa-cp related during last > 24 hours) significantly regressed guality/pr36728-*.c on x86_64. > Previously we have not turned foo into foo.constprop*, now we do, and pass > just arg7 instead of arg1..arg7. That is fine, but we really should be Yes, I changed the heuristics to consider it a win to drop the arugment even if no constant propagation is done. > emitting the debug info stuff for that case, that was added to fix PR47858, > but for whatever reason it doesn't happen in this case. Does it take some > other path in ipa-prop.c, or bypass ipa-prop, something different? No, it is the same clonning as any other. Just the decision heuristics changed. We would get same issue qith guality/pr36728- if one of arg1...arg7 was constant. I will try to take a look at your patch for PR47858 and see if I can work out what goes wrong. Honza > > > * ipa-cp.c (ipcp_cloning_candidate_p): Use node->optimize_for_size_p. > > (good_cloning_opportunity_p): Likewise. > > (gather_context_independent_values): Do not return true when > > polymorphic call context is known or when we have known aggregate > > value of unused parameter. > > (estimate_local_effects): Try to create clone for all context > > when either some params are substituted or devirtualization is possible > > or some params can be removed; use local flag instead of > > node->will_be_removed_from_program_if_no_direct_calls_p. > > (identify_dead_nodes): Likewise. > > Jakub
Actually I added if (!ipa_is_param_used (info, i)) continue; shortcut to gather_context_independent_values which prevents us from recording context_independent_aggregate_values for unused aggregate parameters. Perhaps that is causing the isssue? We can simply record them and just avoid returning true if all propagations happen to those.
On Thu, Dec 10, 2015 at 08:30:37AM +0100, Jan Hubicka wrote: > * ipa-cp.c (ipcp_cloning_candidate_p): Use node->optimize_for_size_p. > (good_cloning_opportunity_p): Likewise. > (gather_context_independent_values): Do not return true when > polymorphic call context is known or when we have known aggregate > value of unused parameter. > (estimate_local_effects): Try to create clone for all context > when either some params are substituted or devirtualization is possible > or some params can be removed; use local flag instead of > node->will_be_removed_from_program_if_no_direct_calls_p. > (identify_dead_nodes): Likewise. This commit breaks several guality tests on S/390x: +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 y == 2 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 y == 2 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 y == 2 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 y == 2 FAIL: gcc.dg/guality/pr36728-2.c -O2 line 18 *x == (char) 25 FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 18 *x == (char) 25 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 y == 2 FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 *x == (char) 25 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 y == 2 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 y == 2 FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 *x == (char) 25 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 y == 2 FAIL: gcc.dg/guality/pr36728-2.c -Os line 18 *x == (char) 25 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg7 == 30 ... FAIL: gcc.dg/guality/vla-1.c -O1 line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O2 line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 17 sizeof (a) == 6 +FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 17 i == 5 FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 17 sizeof (a) == 6 +FAIL: gcc.dg/guality/vla-1.c -O3 -g line 17 i == 5 FAIL: gcc.dg/guality/vla-1.c -O3 -g line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -Os line 17 sizeof (a) == 6 What can I do to help fixing this? Ciao Dominik ^_^ ^_^
On Wed, Dec 16, 2015 at 10:15 AM, Dominik Vogt <vogt@linux.vnet.ibm.com> wrote: > On Thu, Dec 10, 2015 at 08:30:37AM +0100, Jan Hubicka wrote: >> * ipa-cp.c (ipcp_cloning_candidate_p): Use node->optimize_for_size_p. >> (good_cloning_opportunity_p): Likewise. >> (gather_context_independent_values): Do not return true when >> polymorphic call context is known or when we have known aggregate >> value of unused parameter. >> (estimate_local_effects): Try to create clone for all context >> when either some params are substituted or devirtualization is possible >> or some params can be removed; use local flag instead of >> node->will_be_removed_from_program_if_no_direct_calls_p. >> (identify_dead_nodes): Likewise. > > This commit breaks several guality tests on S/390x: > > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 y == 2 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 y == 2 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 y == 2 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -O2 line 18 *x == (char) 25 > FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 y == 2 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -Os line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg7 == 30 > ... > FAIL: gcc.dg/guality/vla-1.c -O1 line 17 sizeof (a) == 6 > FAIL: gcc.dg/guality/vla-1.c -O2 line 17 sizeof (a) == 6 > FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 17 sizeof (a) == 6 > +FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 17 i == 5 > FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 17 sizeof (a) == 6 > +FAIL: gcc.dg/guality/vla-1.c -O3 -g line 17 i == 5 > FAIL: gcc.dg/guality/vla-1.c -O3 -g line 17 sizeof (a) == 6 > FAIL: gcc.dg/guality/vla-1.c -Os line 17 sizeof (a) == 6 > > What can I do to help fixing this? Same on x86_64 btw. If there isn't a bugreport already please open one to track this issue. Thanks, Richard. > Ciao > > Dominik ^_^ ^_^ > > -- > > Dominik Vogt > IBM Germany >
On Wed, Dec 16, 2015 at 10:59:10AM +0100, Richard Biener wrote: > > What can I do to help fixing this? > > Same on x86_64 btw. If there isn't a bugreport already please open > one to track this issue. PR68860 covers this already and it has been discussed on this ml too already. Jakub
On Wed, Dec 16, 2015 at 10:59:10AM +0100, Richard Biener wrote: > Same on x86_64 btw. If there isn't a bugreport already please open > one to track this issue. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68935 Ciao Dominik ^_^ ^_^
> On Thu, Dec 10, 2015 at 08:30:37AM +0100, Jan Hubicka wrote: > > * ipa-cp.c (ipcp_cloning_candidate_p): Use node->optimize_for_size_p. > > (good_cloning_opportunity_p): Likewise. > > (gather_context_independent_values): Do not return true when > > polymorphic call context is known or when we have known aggregate > > value of unused parameter. > > (estimate_local_effects): Try to create clone for all context > > when either some params are substituted or devirtualization is possible > > or some params can be removed; use local flag instead of > > node->will_be_removed_from_program_if_no_direct_calls_p. > > (identify_dead_nodes): Likewise. > > This commit breaks several guality tests on S/390x: The patch changes ipa-cp heuristics in a way that it will clone in order to remove unused parameter. Previously it did know how to remove unused parameter but would do so only if some of function's parameter (perhaps the unused one) was also constant. This is inconsistent. As a result we do more ipa-cp cloning and I suppose we are more likely to hit them on guality for simplified testcases that often contain unused code: int __attribute__((noinline)) foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) { char *x = __builtin_alloca (arg7); int __attribute__ ((aligned(32))) y; y = 2; asm (NOP : "=m" (y), "=m" (b) : "m" (y)); x[0] = 25; asm (NOP : "=m" (x[0]), "=m" (a) : "m" (x[0]), "m" (b)); return y; } int main () { int l = 0; asm ("" : "=r" (l) : "0" (l)); a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); asm volatile ("" :: "r" (l)); return 0; } I am trying to understand Jakub's debug code and perhaps it can be improved. But in the case of optimized out unused parameters I think it is perfectly resonable to say that the variable was optimized out. As you can see, the testcase explicitely prevents ipa-cp constant propagation by the asm statement. We can just update the testcases to use the parameters and test the original issue again. > > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 y == 2 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 y == 2 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 16 y == 2 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 18 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -O2 line 18 *x == (char) 25 > FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 18 y == 2 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 16 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-2.c -O3 -g line 18 y == 2 > FAIL: gcc.dg/guality/pr36728-2.c -Os line 18 *x == (char) 25 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-3.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 16 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 14 arg7 == 30 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg3 == 3 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg4 == 4 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg5 == 5 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg6 == 6 > +FAIL: gcc.dg/guality/pr36728-4.c -O3 -g line 16 arg7 == 30 > ... > FAIL: gcc.dg/guality/vla-1.c -O1 line 17 sizeof (a) == 6 > FAIL: gcc.dg/guality/vla-1.c -O2 line 17 sizeof (a) == 6 > FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 17 sizeof (a) == 6 > +FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 17 i == 5 > FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 17 sizeof (a) == 6 > +FAIL: gcc.dg/guality/vla-1.c -O3 -g line 17 i == 5 > FAIL: gcc.dg/guality/vla-1.c -O3 -g line 17 sizeof (a) == 6 > FAIL: gcc.dg/guality/vla-1.c -Os line 17 sizeof (a) == 6 > > What can I do to help fixing this? > > Ciao > > Dominik ^_^ ^_^ > > -- > > Dominik Vogt > IBM Germany
On Wed, Dec 16, 2015 at 05:24:25PM +0100, Jan Hubicka wrote: > I am trying to understand Jakub's debug code and perhaps it can be improved. But in > the case of optimized out unused parameters I think it is perfectly resonable to > say that the variable was optimized out. As long as the values that would be passed to the unused parameters are constant or live in memory or registers that isn't clobbered by the call in the caller, we have the ability to express that in the debug info now, and we should. > As you can see, the testcase explicitely prevents ipa-cp constant propagation by the > asm statement. We can just update the testcases to use the parameters and test > the original issue again. No, the testcase intentionally wants to test unused arguments, they happen in quite a lot of code and "optimized out" is not really desirable answer if we can provide the values some other way. Jakub
Index: ipa-cp.c =================================================================== --- ipa-cp.c (revision 231477) +++ ipa-cp.c (working copy) @@ -613,7 +613,7 @@ ipcp_cloning_candidate_p (struct cgraph_ return false; } - if (!optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl))) + if (node->optimize_for_size_p ()) { if (dump_file) fprintf (dump_file, "Not considering %s for cloning; " @@ -2267,7 +2267,7 @@ good_cloning_opportunity_p (struct cgrap { if (time_benefit == 0 || !opt_for_fn (node->decl, flag_ipa_cp_clone) - || !optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl))) + || node->optimize_for_size_p ()) return false; gcc_assert (size_cost > 0); @@ -2387,12 +2387,14 @@ gather_context_independent_values (struc *removable_params_cost += ipa_get_param_move_cost (info, i); + if (!ipa_is_param_used (info, i)) + continue; + ipcp_lattice<ipa_polymorphic_call_context> *ctxlat = &plats->ctxlat; + /* Do not account known context as reason for cloning. We can see + if it permits devirtualization. */ if (ctxlat->is_single_const ()) - { - (*known_contexts)[i] = ctxlat->values->value; - ret = true; - } + (*known_contexts)[i] = ctxlat->values->value; if (known_aggs) { @@ -2494,7 +2496,9 @@ estimate_local_effects (struct cgraph_no &known_contexts, &known_aggs, &removable_params_cost); known_aggs_ptrs = agg_jmp_p_vec_for_t_vec (known_aggs); - if (always_const) + int devirt_bonus = devirtualization_time_bonus (node, known_csts, + known_contexts, known_aggs_ptrs); + if (always_const || devirt_bonus || removable_params_cost) { struct caller_statistics stats; inline_hints hints; @@ -2505,8 +2509,7 @@ estimate_local_effects (struct cgraph_no false); estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts, known_aggs_ptrs, &size, &time, &hints); - time -= devirtualization_time_bonus (node, known_csts, known_contexts, - known_aggs_ptrs); + time -= devirt_bonus; time -= hint_time_bonus (hints); time -= removable_params_cost; size -= stats.n_calls * removable_params_cost; @@ -2515,8 +2518,7 @@ estimate_local_effects (struct cgraph_no fprintf (dump_file, " - context independent values, size: %i, " "time_benefit: %i\n", size, base_time - time); - if (size <= 0 - || node->will_be_removed_from_program_if_no_direct_calls_p ()) + if (size <= 0 || node->local.local) { info->do_clone_for_all_contexts = true; base_time = time; @@ -2544,6 +2546,10 @@ estimate_local_effects (struct cgraph_no "max_new_size would be reached with %li.\n", size + overall_size); } + else if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " Not cloning for all contexts because " + "!good_cloning_opportunity_p.\n"); + } for (i = 0; i < count ; i++) @@ -4419,7 +4425,7 @@ identify_dead_nodes (struct cgraph_node { struct cgraph_node *v; for (v = node; v ; v = ((struct ipa_dfs_info *) v->aux)->next_cycle) - if (v->will_be_removed_from_program_if_no_direct_calls_p () + if (v->local.local && !v->call_for_symbol_thunks_and_aliases (has_undead_caller_from_outside_scc_p, NULL, true)) IPA_NODE_REF (v)->node_dead = 1;