Message ID | 20140516172559.GF20755@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
Jan Hubicka <hubicka@ucw.cz> writes: > + /* Remove stores to variables we marked readonly. s/read/write/ > + /* For calls we can simply remove LHS when it is known to be read only. */ s/read/write/ Andreas.
> Jan Hubicka <hubicka@ucw.cz> writes: > > > + /* Remove stores to variables we marked readonly. > > s/read/write/ > > > + /* For calls we can simply remove LHS when it is known to be read only. */ > > s/read/write/ Ah, thanks! Honza > > Andreas. > > -- > Andreas Schwab, schwab@linux-m68k.org > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 > "And now for something completely different."
On 05/16/2014 11:25 AM, Jan Hubicka wrote: > Hi, > this patch adds code to remove write only static variables. While analyzing > effectivity of LTO on firefox, I noticed that surprisingly large part of > binary's data segment is occupied by these. Fixed thus. > (this is quite trivial transformation, I just never considered it important > enough to work on it). > > The patch goes by marking write only variables in ipa.c (at same time we > discover addressable flag) and also fixes handling of the flags for > aliases. References to variables are then removed by fixup_cfg. > As first cut, I only remove stores without side effects, so copies from > volatile variables are preserved. I also kill LHS of function calls. > I do not attempt to remove asm statements. This means that some references > may be left in the code and therefore the IPA code does not eliminate the > referneces after discovering write only variable and instead it relies > on dead variable elimination to do the job later. Consequently not all write > only variables are removed with WHOPR in the case the references ends up > in different partitions. Something I can address incrementally. > This patch seems quite similar in purpose to the remove_local_statics optimization that Mentor has proposed, although the implementation is quite different. Here is the last version of our patch, prepared by Bernd Schmidt last year: https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html I think we can drop our patch from our local tree now, but it includes a large number of test cases which I think are worth keeping on mainline. A few of them fail with your implementation, though -- which might be genuine bugs, or just different limitations of the two approaches. Can you take a look? The failing tests are remove-local-statics-{4,5,7,12,14b}.c. -Sandra
Hi, On Fri, May 16, 2014 at 07:25:59PM +0200, Jan Hubicka wrote: > ... > > * varpool.c (dump_varpool_node): Dump write-only flag. > * lto-cgraph.c (lto_output_varpool_node, input_varpool_node): Stream > write-only flag. > * tree-cfg.c (execute_fixup_cfg): Remove statements setting write-only variables. > > > * gcc.c-torture/execute/20101011-1.c: Update testcase. > * gcc.dg/ira-shrinkwrap-prep-1.c: Update testcase. > * gcc.dg/tree-ssa/writeonly.c: New testcase. > * gcc.dg/tree-ssa/ssa-dse-6.c: Update testcase. > * gcc.dg/tree-ssa/pr21559.c: Update testcase. > * gcc.dg/debug/pr35154.c: Update testcase. > * gcc.target/i386/vectorize1.c: Update testcase. > * ipa.c (process_references): New function. > (set_readonly_bit): New function. > (set_writeonly_bit): New function. > (clear_addressable_bit): New function. > (ipa_discover_readonly_nonaddressable_var): Mark write only variables; fix > handling of aliases. > * cgraph.h (struct varpool_node): Add writeonly flag. > ... > Index: ipa.c > =================================================================== > --- ipa.c (revision 210514) > +++ ipa.c (working copy) > @@ -640,43 +711,40 @@ ipa_discover_readonly_nonaddressable_var > if (dump_file) > fprintf (dump_file, "Clearing variable flags:"); > FOR_EACH_VARIABLE (vnode) > - if (vnode->definition && varpool_all_refs_explicit_p (vnode) > + if (!vnode->alias > && (TREE_ADDRESSABLE (vnode->decl) > + || !vnode->writeonly > || !TREE_READONLY (vnode->decl))) > { > bool written = false; > bool address_taken = false; > - int i; > - struct ipa_ref *ref; > - for (i = 0; ipa_ref_list_referring_iterate (&vnode->ref_list, > - i, ref) > - && (!written || !address_taken); i++) > - switch (ref->use) > - { > - case IPA_REF_ADDR: > - address_taken = true; > - break; > - case IPA_REF_LOAD: > - break; > - case IPA_REF_STORE: > - written = true; > - break; > - } > - if (TREE_ADDRESSABLE (vnode->decl) && !address_taken) > + bool read = false; > + bool explicit_refs = true; > + > + process_references (vnode, &written, &address_taken, &read, &explicit_refs); > + if (!explicit_refs) > + continue; > + if (!address_taken) > { > - if (dump_file) > + if (TREE_ADDRESSABLE (vnode->decl) && dump_file) > fprintf (dump_file, " %s (addressable)", vnode->name ()); I know it is technically not a part of the patch... but surely this is supposed to dump not addressable and might be quite a bit confusing, so if you are already changing this, correcting the dump would be great. Martin > - TREE_ADDRESSABLE (vnode->decl) = 0; > + varpool_for_node_and_aliases (vnode, clear_addressable_bit, NULL, true); > } > - if (!TREE_READONLY (vnode->decl) && !address_taken && !written > + if (!address_taken && !written > /* Making variable in explicit section readonly can cause section > type conflict. > See e.g. gcc.c-torture/compile/pr23237.c */ > && DECL_SECTION_NAME (vnode->decl) == NULL) > { > - if (dump_file) > + if (!TREE_READONLY (vnode->decl) && dump_file) > fprintf (dump_file, " %s (read-only)", vnode->name ()); > - TREE_READONLY (vnode->decl) = 1; > + varpool_for_node_and_aliases (vnode, set_readonly_bit, NULL, true); > + } > + if (!vnode->writeonly && !read && !address_taken) > + { > + if (dump_file) > + fprintf (dump_file, " %s (write-only)", vnode->name ()); > + varpool_for_node_and_aliases (vnode, set_writeonly_bit, NULL, true); > } > } > if (dump_file)
> > + if (!address_taken) > > { > > - if (dump_file) > > + if (TREE_ADDRESSABLE (vnode->decl) && dump_file) > > fprintf (dump_file, " %s (addressable)", vnode->name ()); > > I know it is technically not a part of the patch... but surely this is > supposed to dump not addressable and might be quite a bit confusing, > so if you are already changing this, correcting the dump would be > great. Yep, the original logic was that the variables appear in list of flags removed, so we are removing addressable flag. The other two flags do not follow the practice. I plan to cleanup this code in general (it has gathered quite some clutter), so I will look into it next and get dumps more readable. Honza > > Martin > > > - TREE_ADDRESSABLE (vnode->decl) = 0; > > + varpool_for_node_and_aliases (vnode, clear_addressable_bit, NULL, true); > > } > > - if (!TREE_READONLY (vnode->decl) && !address_taken && !written > > + if (!address_taken && !written > > /* Making variable in explicit section readonly can cause section > > type conflict. > > See e.g. gcc.c-torture/compile/pr23237.c */ > > && DECL_SECTION_NAME (vnode->decl) == NULL) > > { > > - if (dump_file) > > + if (!TREE_READONLY (vnode->decl) && dump_file) > > fprintf (dump_file, " %s (read-only)", vnode->name ()); > > - TREE_READONLY (vnode->decl) = 1; > > + varpool_for_node_and_aliases (vnode, set_readonly_bit, NULL, true); > > + } > > + if (!vnode->writeonly && !read && !address_taken) > > + { > > + if (dump_file) > > + fprintf (dump_file, " %s (write-only)", vnode->name ()); > > + varpool_for_node_and_aliases (vnode, set_writeonly_bit, NULL, true); > > } > > } > > if (dump_file)
Hi, On Fri, May 16, 2014 at 07:25:59PM +0200, Jan Hubicka wrote: > Hi, > this patch adds code to remove write only static variables. While analyzing > effectivity of LTO on firefox, I noticed that surprisingly large part of > binary's data segment is occupied by these. Fixed thus. > (this is quite trivial transformation, I just never considered it important > enough to work on it). > > The patch goes by marking write only variables in ipa.c (at same time we > discover addressable flag) and also fixes handling of the flags for > aliases. References to variables are then removed by fixup_cfg. > As first cut, I only remove stores without side effects, so copies from > volatile variables are preserved. I also kill LHS of function calls. > I do not attempt to remove asm statements. This means that some references > may be left in the code and therefore the IPA code does not eliminate the > referneces after discovering write only variable and instead it relies > on dead variable elimination to do the job later. Consequently not all write > only variables are removed with WHOPR in the case the references ends up > in different partitions. Something I can address incrementally. > > Also I think dwarf2out should be updated to mark value of the write only > variables as optimized out. Jakub, can you help me with this? > (I do not think it is valid to output the optimized out value of constructor) > > Bootstrapped/regtested x86_64-linux, will commit it later today. > > Honza > > * varpool.c (dump_varpool_node): Dump write-only flag. > * lto-cgraph.c (lto_output_varpool_node, input_varpool_node): Stream > write-only flag. > * tree-cfg.c (execute_fixup_cfg): Remove statements setting write-only variables. > > > * gcc.c-torture/execute/20101011-1.c: Update testcase. > * gcc.dg/ira-shrinkwrap-prep-1.c: Update testcase. > * gcc.dg/tree-ssa/writeonly.c: New testcase. > * gcc.dg/tree-ssa/ssa-dse-6.c: Update testcase. > * gcc.dg/tree-ssa/pr21559.c: Update testcase. > * gcc.dg/debug/pr35154.c: Update testcase. > * gcc.target/i386/vectorize1.c: Update testcase. > * ipa.c (process_references): New function. > (set_readonly_bit): New function. > (set_writeonly_bit): New function. > (clear_addressable_bit): New function. > (ipa_discover_readonly_nonaddressable_var): Mark write only variables; fix > handling of aliases. > * cgraph.h (struct varpool_node): Add writeonly flag. > Unfortunately, this commit has caused the following ICE for me when LTO building 471.omnetpp from SPEC 2006 or Firefox (but not libxul, something that gets built earlier): lto1: internal compiler error: in gimple_get_virt_method_for_vtable, at gimple-fold.c:3276 0x7437a3 gimple_get_virt_method_for_vtable(long, tree_node*, unsigned long, bool*) /home/mjambor/gcc/bisect/src/gcc/gimple-fold.c:3276 0x743993 gimple_get_virt_method_for_binfo(long, tree_node*, bool*) /home/mjambor/gcc/bisect/src/gcc/gimple-fold.c:3377 0x7913f2 possible_polymorphic_call_targets(tree_node*, long, ipa_polymorphic_call_context, bool*, void**, int*) /home/mjambor/gcc/bisect/src/gcc/ipa-devirt.c:1697 0x7b73a9 possible_polymorphic_call_targets /home/mjambor/gcc/bisect/src/gcc/ipa-utils.h:121 0x7b73a9 walk_polymorphic_call_targets /home/mjambor/gcc/bisect/src/gcc/ipa.c:177 0x7b73a9 symtab_remove_unreachable_nodes(bool, _IO_FILE*) /home/mjambor/gcc/bisect/src/gcc/ipa.c:407 0x86bf47 execute_todo /home/mjambor/gcc/bisect/src/gcc/passes.c:1843 Please submit a full bug report... I compile omnetpp with -Ofast -g -flto=8 -fwhole-program -funroll-loops -fpeel-loops -march=native -mtune=native I'm afraid I won't be able to prepare a more reduced testcase very soon. Thanks, Martin
Index: varpool.c =================================================================== --- varpool.c (revision 210514) +++ varpool.c (working copy) @@ -211,6 +211,8 @@ dump_varpool_node (FILE *f, varpool_node fprintf (f, " read-only"); if (ctor_for_folding (node->decl) != error_mark_node) fprintf (f, " const-value-known"); + if (node->writeonly) + fprintf (f, " write-only"); fprintf (f, "\n"); } Index: lto-cgraph.c =================================================================== --- lto-cgraph.c (revision 210514) +++ lto-cgraph.c (working copy) @@ -562,6 +562,7 @@ lto_output_varpool_node (struct lto_simp bp_pack_value (&bp, node->forced_by_abi, 1); bp_pack_value (&bp, node->unique_name, 1); bp_pack_value (&bp, node->body_removed, 1); + bp_pack_value (&bp, node->writeonly, 1); bp_pack_value (&bp, node->definition, 1); alias_p = node->alias && (!boundary_p || node->weakref); bp_pack_value (&bp, alias_p, 1); @@ -1153,6 +1154,7 @@ input_varpool_node (struct lto_file_decl node->forced_by_abi = bp_unpack_value (&bp, 1); node->unique_name = bp_unpack_value (&bp, 1); node->body_removed = bp_unpack_value (&bp, 1); + node->writeonly = bp_unpack_value (&bp, 1); node->definition = bp_unpack_value (&bp, 1); node->alias = bp_unpack_value (&bp, 1); node->weakref = bp_unpack_value (&bp, 1); Index: tree-cfg.c =================================================================== --- tree-cfg.c (revision 210514) +++ tree-cfg.c (working copy) @@ -8431,7 +8431,7 @@ execute_fixup_cfg (void) FOR_EACH_BB_FN (bb, cfun) { bb->count = apply_scale (bb->count, count_scale); - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) { gimple stmt = gsi_stmt (gsi); tree decl = is_gimple_call (stmt) @@ -8457,9 +8457,45 @@ execute_fixup_cfg (void) todo |= TODO_cleanup_cfg; } + /* Remove stores to variables we marked readonly. + Keep access when store has side effect, i.e. in case when source + is volatile. */ + if (gimple_store_p (stmt) + && !gimple_has_side_effects (stmt)) + { + tree lhs = get_base_address (gimple_get_lhs (stmt)); + + if (TREE_CODE (lhs) == VAR_DECL + && (TREE_STATIC (lhs) || DECL_EXTERNAL (lhs)) + && varpool_get_node (lhs)->writeonly) + { + unlink_stmt_vdef (stmt); + gsi_remove (&gsi, true); + release_defs (stmt); + todo |= TODO_update_ssa | TODO_cleanup_cfg; + continue; + } + } + /* For calls we can simply remove LHS when it is known to be read only. */ + if (is_gimple_call (stmt) + && gimple_get_lhs (stmt)) + { + tree lhs = get_base_address (gimple_get_lhs (stmt)); + + if (TREE_CODE (lhs) == VAR_DECL + && (TREE_STATIC (lhs) || DECL_EXTERNAL (lhs)) + && varpool_get_node (lhs)->writeonly) + { + gimple_call_set_lhs (stmt, NULL); + update_stmt (stmt); + todo |= TODO_update_ssa | TODO_cleanup_cfg; + } + } + if (maybe_clean_eh_stmt (stmt) && gimple_purge_dead_eh_edges (bb)) todo |= TODO_cleanup_cfg; + gsi_next (&gsi); } FOR_EACH_EDGE (e, ei, bb->succs) Index: testsuite/gcc.c-torture/execute/20101011-1.c =================================================================== --- testsuite/gcc.c-torture/execute/20101011-1.c (revision 210514) +++ testsuite/gcc.c-torture/execute/20101011-1.c (working copy) @@ -92,7 +92,7 @@ sigfpe (int signum __attribute__ ((unuse eliminate the assignment to the global k. */ static int i; static int j; -int k; +int k __attribute__ ((used)); int main () Index: testsuite/gcc.dg/ira-shrinkwrap-prep-1.c =================================================================== --- testsuite/gcc.dg/ira-shrinkwrap-prep-1.c (revision 210514) +++ testsuite/gcc.dg/ira-shrinkwrap-prep-1.c (working copy) @@ -7,7 +7,7 @@ foo (long a) return a + 5; } -static long g; +static long g __attribute__ ((used)); long __attribute__((noinline, noclone)) bar (long a) Index: testsuite/gcc.dg/tree-ssa/writeonly.c =================================================================== --- testsuite/gcc.dg/tree-ssa/writeonly.c (revision 0) +++ testsuite/gcc.dg/tree-ssa/writeonly.c (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -fdump-tree-optimized" } */ +static struct a {int magic1,b;} a; +volatile int magic2; +static struct b {int a,b,c,d,e,f;} magic3; + +struct b foo(); + +t() +{ + a.magic1 = 1; + magic2 = 1; + magic3 = foo(); +} +/* { dg-final { scan-tree-dump-not "magic1" "optimized"} } */ +/* { dg-final { scan-tree-dump-not "magic3" "optimized"} } */ +/* { dg-final { scan-tree-dump "magic2" "optimized"} } */ +/* { dg-final { scan-tree-dump "foo" "optimized"} } */ + +/* { dg-final { cleanup-tree-dump "optimized" } } */ Index: testsuite/gcc.dg/tree-ssa/ssa-dse-6.c =================================================================== --- testsuite/gcc.dg/tree-ssa/ssa-dse-6.c (revision 210514) +++ testsuite/gcc.dg/tree-ssa/ssa-dse-6.c (working copy) @@ -3,6 +3,7 @@ int foo11 (int c) { + __attribute__ ((used)) static int local1, local2; local1 = 0; local2 += c; Index: testsuite/gcc.dg/tree-ssa/pr21559.c =================================================================== --- testsuite/gcc.dg/tree-ssa/pr21559.c (revision 210514) +++ testsuite/gcc.dg/tree-ssa/pr21559.c (working copy) @@ -1,6 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -fdump-tree-vrp1-details" } */ +__attribute__ ((used)) static int blocksize = 4096; int bar (int); Index: testsuite/gcc.dg/debug/pr35154.c =================================================================== --- testsuite/gcc.dg/debug/pr35154.c (revision 210514) +++ testsuite/gcc.dg/debug/pr35154.c (working copy) @@ -15,7 +15,7 @@ struct { int main() { - static char i_inner[2]; + static char i_inner[2] __attribute__ ((used)); i_inner[0] = 'a'; i_inner[1] = 'b'; opta.f1 = 'c'; opta.f2 = 'd'; Index: testsuite/gcc.target/i386/vectorize1.c =================================================================== --- testsuite/gcc.target/i386/vectorize1.c (revision 210514) +++ testsuite/gcc.target/i386/vectorize1.c (working copy) @@ -10,6 +10,7 @@ typedef struct int set_names (void) { + __attribute__ ((used)) static tx_typ tt1; int ln; for (ln = 0; ln < 8; ln++) Index: ipa.c =================================================================== --- ipa.c (revision 210514) +++ ipa.c (working copy) @@ -624,6 +624,77 @@ symtab_remove_unreachable_nodes (bool be return changed; } +/* Process references to VNODE and set flags WRITTEN, ADDRESS_TAKEN, READ + as needed, also clear EXPLICIT_REFS if the references to given variable + do not need to be explicit. */ + +void +process_references (varpool_node *vnode, + bool *written, bool *address_taken, + bool *read, bool *explicit_refs) +{ + int i; + struct ipa_ref *ref; + + if (!varpool_all_refs_explicit_p (vnode) + || TREE_THIS_VOLATILE (vnode->decl)) + *explicit_refs = false; + + for (i = 0; ipa_ref_list_referring_iterate (&vnode->ref_list, + i, ref) + && *explicit_refs && (!*written || !*address_taken || !*read); i++) + switch (ref->use) + { + case IPA_REF_ADDR: + *address_taken = true; + break; + case IPA_REF_LOAD: + *read = true; + break; + case IPA_REF_STORE: + *written = true; + break; + case IPA_REF_ALIAS: + process_references (varpool (ref->referring), written, address_taken, + read, explicit_refs); + break; + } +} + +/* Set TREE_READONLY bit. */ + +bool +set_readonly_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED) +{ + TREE_READONLY (vnode->decl) = true; + return false; +} + +/* Set writeonly bit and clear the initalizer, since it will not be needed. */ + +bool +set_writeonly_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED) +{ + vnode->writeonly = true; + if (optimize) + { + DECL_INITIAL (vnode->decl) = NULL; + if (!vnode->alias) + ipa_remove_all_references (&vnode->ref_list); + } + return false; +} + +/* Clear addressale bit of VNODE. */ + +bool +clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED) +{ + vnode->address_taken = false; + TREE_ADDRESSABLE (vnode->decl) = 0; + return false; +} + /* Discover variables that have no longer address taken or that are read only and update their flags. @@ -640,43 +711,40 @@ ipa_discover_readonly_nonaddressable_var if (dump_file) fprintf (dump_file, "Clearing variable flags:"); FOR_EACH_VARIABLE (vnode) - if (vnode->definition && varpool_all_refs_explicit_p (vnode) + if (!vnode->alias && (TREE_ADDRESSABLE (vnode->decl) + || !vnode->writeonly || !TREE_READONLY (vnode->decl))) { bool written = false; bool address_taken = false; - int i; - struct ipa_ref *ref; - for (i = 0; ipa_ref_list_referring_iterate (&vnode->ref_list, - i, ref) - && (!written || !address_taken); i++) - switch (ref->use) - { - case IPA_REF_ADDR: - address_taken = true; - break; - case IPA_REF_LOAD: - break; - case IPA_REF_STORE: - written = true; - break; - } - if (TREE_ADDRESSABLE (vnode->decl) && !address_taken) + bool read = false; + bool explicit_refs = true; + + process_references (vnode, &written, &address_taken, &read, &explicit_refs); + if (!explicit_refs) + continue; + if (!address_taken) { - if (dump_file) + if (TREE_ADDRESSABLE (vnode->decl) && dump_file) fprintf (dump_file, " %s (addressable)", vnode->name ()); - TREE_ADDRESSABLE (vnode->decl) = 0; + varpool_for_node_and_aliases (vnode, clear_addressable_bit, NULL, true); } - if (!TREE_READONLY (vnode->decl) && !address_taken && !written + if (!address_taken && !written /* Making variable in explicit section readonly can cause section type conflict. See e.g. gcc.c-torture/compile/pr23237.c */ && DECL_SECTION_NAME (vnode->decl) == NULL) { - if (dump_file) + if (!TREE_READONLY (vnode->decl) && dump_file) fprintf (dump_file, " %s (read-only)", vnode->name ()); - TREE_READONLY (vnode->decl) = 1; + varpool_for_node_and_aliases (vnode, set_readonly_bit, NULL, true); + } + if (!vnode->writeonly && !read && !address_taken) + { + if (dump_file) + fprintf (dump_file, " %s (write-only)", vnode->name ()); + varpool_for_node_and_aliases (vnode, set_writeonly_bit, NULL, true); } } if (dump_file) Index: cgraph.h =================================================================== --- cgraph.h (revision 210514) +++ cgraph.h (working copy) @@ -76,6 +76,8 @@ public: /* Set once the definition was analyzed. The list of references and other properties are built during analysis. */ unsigned analyzed : 1; + /* Set for write-only variables. */ + unsigned writeonly : 1; /*** Visibility and linkage flags. ***/