diff mbox

varpool alias reorg

Message ID 20110623163834.GA3591@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka June 23, 2011, 4:38 p.m. UTC
> On Sat, Jun 18, 2011 at 7:19 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Sat, Jun 18, 2011 at 1:32 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> Hi,
> >> this patch makes symetric changes to varpool as did the prevoius series to cgraph.
> >> Basically the aliases are now represented as separate varpool nodes with alias reference
> >> to the variable they refer to, with some infrastructure to walk the alias references
> >> as needed.
> >>
> >> Bootstrapped/regtested x86_64-linux, comitted.
> >>
> >> Honza
> >>
> >>        * lto-symtab.c (lto_varpool_replace_node): Remove code handling
> >>        extra name aliases.
> >>        (lto_symtab_resolve_can_prevail_p): Likewise.
> >>        (lto_symtab_merge_cgraph_nodes): Update alias_of pointers..
> >>        * cgraphbuild.c (record_reference): Remove extra body alias code.
> >>        (mark_load): Likewise.
> >>        (mark_store): Likewise.
> >>        * cgraph.h (varpool_node): Remove extra_name filed;
> >>        add alias_of and extraname_alias.
> >>        (varpool_create_variable_alias, varpool_for_node_and_aliases): Declare.
> >>        (varpool_alias_aliased_node): New inline function.
> >>        (varpool_variable_node): New function.
> >>        * cgraphunit.c (handle_alias_pairs): Handle also variable aliases.
> >>        * ipa-ref.c (ipa_record_reference): Allow aliases on variables.
> >>        * lto-cgraph.c (lto_output_varpool_node): Update streaming.
> >>        (input_varpool_node): Likewise.
> >>        * lto-streamer-out.c (produce_symtab): Remove extra name aliases.
> >>        (varpool_externally_visible_p): Remove extra body alias code.
> >>        (function_and_variable_visibility): Likewise.
> >>        * tree-ssa-structalias.c (associate_varinfo_to_alias_1): New function.
> >>        (ipa_pta_execute): Use it.
> >>        * varpool.c (varpool_remove_node): Remove extra name alias code.
> >>        (varpool_mark_needed_node): Likewise.
> >>        (varpool_analyze_pending_decls): Analyze aliases.
> >>        (assemble_aliases): New functoin.
> >>        (varpool_assemble_decl): Use it.
> >>        (varpool_create_variable_alias): New function.
> >>        (varpool_extra_name_alias): Rewrite.
> >>        (varpool_for_node_and_aliases): New function.
> >
> > This caused:
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49463
> >
> 
> This patch is incorrect as shown in the PR above.

The builtins/strstr-asm.c is the same issue as I patched some time ago in 
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00031.html
just in this case the problem remained hidden until now.

Again the problem needs plugin to manifest.

There are two problems here
  1) We do not stream builtin decls and merge them outside lto-symtab (by just
     streaming references to builtins with their asm names).  There is at least
     one extra PR related to this and on my TODO is to simply remove the code.
  2) Aliases within single unit works only when both the alias and the target use
     asm name.  This is because internally we store mangled DECL_ASSEMBLER_NAME and the
     alias_pair code.
     With LTO this breaks existing code simply because what used to be multiple units
     and thus safe is now single LTO unit.

     Dave Korn fixed part of the problem by introducing mangling code into lto-symtab
     His code solve similar problems with aliases from the asm code, but it
     did not solve the problem with aliases from LTO units, like here, simply
     because alias pair code bypass the lto-symtab.  One of goals of the incorrect patch
     above is to make lto-symtab to do the merging and thus fix this issue (for now at
     all decls except for builtins).

     Still it would be good to solve the problem on non-LTO compilation, too.
     We discussed introduction of proper symbol table into GCC at the GCC gathering last
     weekend.  It is where I am heading but it will take some time.

Until that happens, I suggest fixing the testcase same was as we already fixed
the memops-asm-lib.c in 4.6 timeframe.

Bootstrapped/regtested x86_64-linux, OK?

Comments

Jan Hubicka June 24, 2011, 11:06 a.m. UTC | #1
Hi,
I also tested the attached variant that simply disable the builtins streaming.
It fixes the testcase, too, bootstraps/regtestes x86_64 with and without plugin
and builds mozilla. It also solves the decl sharing problems that leads to
debug info confussion.
However it breaks memops-asm.c testcase.  What testcase does is giving builtlins
asm name (i.e. my_memcpy instead of memcpy) and then it tests that the new calls
to the builtilns introduced via folding actually use these asm names.

The code this patch remove was probably invented specifically for this testcase:
instead of streaming builtin decl like we do all other streaming, we just stream
reference to the builtin + an asm name and the single builtin decl in the
LTO unit gets the name.

The problem is that this won't work when LTOing multiple such units each giving a
different asm name (i.e. one of units will win). We can't quite fix this because
we don't want to keep track from where the code we are folding is comming.

So I wonder, do we really need to preserve this behaviour?  
I do not think it is documented anywhere and it seems to me that both variants:
ignoring the asm names or honoring them are sane.

Honza

Index: testsuite/gcc.c-torture/execute/builtins/memops-asm-lib.c
===================================================================
*** testsuite/gcc.c-torture/execute/builtins/memops-asm-lib.c	(revision 175350)
--- testsuite/gcc.c-torture/execute/builtins/memops-asm-lib.c	(working copy)
*************** typedef __SIZE_TYPE__ size_t;
*** 6,12 ****
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
     actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void *
  my_memcpy (void *d, const void *s, size_t n)
  {
--- 6,11 ----
*************** my_memcpy (void *d, const void *s, size_
*** 19,25 ****
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
     actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void
  my_bcopy (const void *s, void *d, size_t n)
  {
--- 18,23 ----
*************** my_bcopy (const void *s, void *d, size_t
*** 39,45 ****
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
     actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void *
  my_memset (void *d, int c, size_t n)
  {
--- 37,42 ----
*************** my_memset (void *d, int c, size_t n)
*** 51,57 ****
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
     actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void
  my_bzero (void *d, size_t n)
  {
--- 48,53 ----
Index: lto-streamer-out.c
===================================================================
*** lto-streamer-out.c	(revision 175350)
--- lto-streamer-out.c	(working copy)
*************** pack_ts_function_decl_value_fields (stru
*** 484,491 ****
  {
    /* For normal/md builtins we only write the class and code, so they
       should never be handled here.  */
-   gcc_assert (!lto_stream_as_builtin_p (expr));
- 
    bp_pack_enum (bp, built_in_class, BUILT_IN_LAST,
  		DECL_BUILT_IN_CLASS (expr));
    bp_pack_value (bp, DECL_STATIC_CONSTRUCTOR (expr), 1);
--- 484,489 ----
*************** lto_output_tree_header (struct output_bl
*** 1306,1346 ****
  }
  
  
- /* Write the code and class of builtin EXPR to output block OB.  IX is
-    the index into the streamer cache where EXPR is stored.*/
- 
- static void
- lto_output_builtin_tree (struct output_block *ob, tree expr)
- {
-   gcc_assert (lto_stream_as_builtin_p (expr));
- 
-   if (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_MD
-       && !targetm.builtin_decl)
-     sorry ("gimple bytecode streams do not support machine specific builtin "
- 	   "functions on this target");
- 
-   output_record_start (ob, LTO_builtin_decl);
-   lto_output_enum (ob->main_stream, built_in_class, BUILT_IN_LAST,
- 		   DECL_BUILT_IN_CLASS (expr));
-   output_uleb128 (ob, DECL_FUNCTION_CODE (expr));
- 
-   if (DECL_ASSEMBLER_NAME_SET_P (expr))
-     {
-       /* When the assembler name of a builtin gets a user name,
- 	 the new name is always prefixed with '*' by
- 	 set_builtin_user_assembler_name.  So, to prevent the
- 	 reader side from adding a second '*', we omit it here.  */
-       const char *str = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (expr));
-       if (strlen (str) > 1 && str[0] == '*')
- 	lto_output_string (ob, ob->main_stream, &str[1], true);
-       else
- 	lto_output_string (ob, ob->main_stream, NULL, true);
-     }
-   else
-     lto_output_string (ob, ob->main_stream, NULL, true);
- }
- 
- 
  /* Write a physical representation of tree node EXPR to output block
     OB.  If REF_P is true, the leaves of EXPR are emitted as references
     via lto_output_tree_ref.  IX is the index into the streamer cache
--- 1304,1309 ----
*************** lto_output_tree (struct output_block *ob
*** 1456,1470 ****
        lto_output_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
  		       lto_tree_code_to_tag (TREE_CODE (expr)));
      }
-   else if (lto_stream_as_builtin_p (expr))
-     {
-       /* MD and NORMAL builtins do not need to be written out
- 	 completely as they are always instantiated by the
- 	 compiler on startup.  The only builtins that need to
- 	 be written out are BUILT_IN_FRONTEND.  For all other
- 	 builtins, we simply write the class and code.  */
-       lto_output_builtin_tree (ob, expr);
-     }
    else
      {
        /* This is the first time we see EXPR, write its fields
--- 1419,1424 ----
Index: lto-streamer-in.c
===================================================================
*** lto-streamer-in.c	(revision 175350)
--- lto-streamer-in.c	(working copy)
*************** lto_get_pickled_tree (struct lto_input_b
*** 2434,2481 ****
  }
  
  
- /* Read a code and class from input block IB and return the
-    corresponding builtin.  DATA_IN is as in lto_input_tree.  */
- 
- static tree
- lto_get_builtin_tree (struct lto_input_block *ib, struct data_in *data_in)
- {
-   enum built_in_class fclass;
-   enum built_in_function fcode;
-   const char *asmname;
-   tree result;
- 
-   fclass = lto_input_enum (ib, built_in_class, BUILT_IN_LAST);
-   gcc_assert (fclass == BUILT_IN_NORMAL || fclass == BUILT_IN_MD);
- 
-   fcode = (enum built_in_function) lto_input_uleb128 (ib);
- 
-   if (fclass == BUILT_IN_NORMAL)
-     {
-       if (fcode >= END_BUILTINS)
- 	fatal_error ("machine independent builtin code out of range");
-       result = built_in_decls[fcode];
-       gcc_assert (result);
-     }
-   else if (fclass == BUILT_IN_MD)
-     {
-       result = targetm.builtin_decl (fcode, true);
-       if (!result || result == error_mark_node)
- 	fatal_error ("target specific builtin not available");
-     }
-   else
-     gcc_unreachable ();
- 
-   asmname = lto_input_string (data_in, ib);
-   if (asmname)
-     set_builtin_user_assembler_name (result, asmname);
- 
-   lto_streamer_cache_append (data_in->reader_cache, result);
- 
-   return result;
- }
- 
- 
  /* Read the physical representation of a tree node with tag TAG from
     input block IB using the per-file context in DATA_IN.  */
  
--- 2434,2439 ----
*************** lto_read_tree (struct lto_input_block *i
*** 2495,2504 ****
    if (streamer_hooks.read_tree)
      streamer_hooks.read_tree (ib, data_in, result);
  
-   /* We should never try to instantiate an MD or NORMAL builtin here.  */
-   if (TREE_CODE (result) == FUNCTION_DECL)
-     gcc_assert (!lto_stream_as_builtin_p (result));
- 
    /* end_marker = */ lto_input_1_unsigned (ib);
  
  #ifdef LTO_STREAMER_DEBUG
--- 2453,2458 ----
*************** lto_input_tree (struct lto_input_block *
*** 2582,2593 ****
  	 the reader cache.  */
        result = lto_get_pickled_tree (ib, data_in);
      }
-   else if (tag == LTO_builtin_decl)
-     {
-       /* If we are going to read a built-in function, all we need is
- 	 the code and class.  */
-       result = lto_get_builtin_tree (ib, data_in);
-     }
    else if (tag == lto_tree_code_to_tag (INTEGER_CST))
      {
        /* For integer constants we only need the type and its hi/low
--- 2536,2541 ----
Index: lto-streamer.h
===================================================================
*** lto-streamer.h	(revision 175350)
--- lto-streamer.h	(working copy)
*************** enum LTO_tags
*** 198,206 ****
    /* EH region holding the previous statement.  */
    LTO_eh_region,
  
-   /* An MD or NORMAL builtin.  Only the code and class are streamed out.  */
-   LTO_builtin_decl,
- 
    /* Function body.  */
    LTO_function,
  
--- 198,203 ----
*************** emit_label_in_global_context_p (tree lab
*** 1141,1157 ****
    return DECL_NONLOCAL (label) || FORCED_LABEL (label);
  }
  
- /* Return true if tree node EXPR should be streamed as a builtin.  For
-    these nodes, we just emit the class and function code.  */
- static inline bool
- lto_stream_as_builtin_p (tree expr)
- {
-   return (TREE_CODE (expr) == FUNCTION_DECL
- 	  && DECL_IS_BUILTIN (expr)
- 	  && (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_NORMAL
- 	      || DECL_BUILT_IN_CLASS (expr) == BUILT_IN_MD));
- }
- 
  DEFINE_DECL_STREAM_FUNCS (TYPE, type)
  DEFINE_DECL_STREAM_FUNCS (FIELD_DECL, field_decl)
  DEFINE_DECL_STREAM_FUNCS (FN_DECL, fn_decl)
--- 1138,1143 ----
Richard Biener June 27, 2011, 8:50 a.m. UTC | #2
On Thu, 23 Jun 2011, Jan Hubicka wrote:

> > On Sat, Jun 18, 2011 at 7:19 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > > On Sat, Jun 18, 2011 at 1:32 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> > >> Hi,
> > >> this patch makes symetric changes to varpool as did the prevoius series to cgraph.
> > >> Basically the aliases are now represented as separate varpool nodes with alias reference
> > >> to the variable they refer to, with some infrastructure to walk the alias references
> > >> as needed.
> > >>
> > >> Bootstrapped/regtested x86_64-linux, comitted.
> > >>
> > >> Honza
> > >>
> > >>        * lto-symtab.c (lto_varpool_replace_node): Remove code handling
> > >>        extra name aliases.
> > >>        (lto_symtab_resolve_can_prevail_p): Likewise.
> > >>        (lto_symtab_merge_cgraph_nodes): Update alias_of pointers..
> > >>        * cgraphbuild.c (record_reference): Remove extra body alias code.
> > >>        (mark_load): Likewise.
> > >>        (mark_store): Likewise.
> > >>        * cgraph.h (varpool_node): Remove extra_name filed;
> > >>        add alias_of and extraname_alias.
> > >>        (varpool_create_variable_alias, varpool_for_node_and_aliases): Declare.
> > >>        (varpool_alias_aliased_node): New inline function.
> > >>        (varpool_variable_node): New function.
> > >>        * cgraphunit.c (handle_alias_pairs): Handle also variable aliases.
> > >>        * ipa-ref.c (ipa_record_reference): Allow aliases on variables.
> > >>        * lto-cgraph.c (lto_output_varpool_node): Update streaming.
> > >>        (input_varpool_node): Likewise.
> > >>        * lto-streamer-out.c (produce_symtab): Remove extra name aliases.
> > >>        (varpool_externally_visible_p): Remove extra body alias code.
> > >>        (function_and_variable_visibility): Likewise.
> > >>        * tree-ssa-structalias.c (associate_varinfo_to_alias_1): New function.
> > >>        (ipa_pta_execute): Use it.
> > >>        * varpool.c (varpool_remove_node): Remove extra name alias code.
> > >>        (varpool_mark_needed_node): Likewise.
> > >>        (varpool_analyze_pending_decls): Analyze aliases.
> > >>        (assemble_aliases): New functoin.
> > >>        (varpool_assemble_decl): Use it.
> > >>        (varpool_create_variable_alias): New function.
> > >>        (varpool_extra_name_alias): Rewrite.
> > >>        (varpool_for_node_and_aliases): New function.
> > >
> > > This caused:
> > >
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49463
> > >
> > 
> > This patch is incorrect as shown in the PR above.
> 
> The builtins/strstr-asm.c is the same issue as I patched some time ago in 
> http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00031.html
> just in this case the problem remained hidden until now.
> 
> Again the problem needs plugin to manifest.
> 
> There are two problems here
>   1) We do not stream builtin decls and merge them outside lto-symtab (by just
>      streaming references to builtins with their asm names).  There is at least
>      one extra PR related to this and on my TODO is to simply remove the code.
>   2) Aliases within single unit works only when both the alias and the target use
>      asm name.  This is because internally we store mangled DECL_ASSEMBLER_NAME and the
>      alias_pair code.
>      With LTO this breaks existing code simply because what used to be multiple units
>      and thus safe is now single LTO unit.
> 
>      Dave Korn fixed part of the problem by introducing mangling code into lto-symtab
>      His code solve similar problems with aliases from the asm code, but it
>      did not solve the problem with aliases from LTO units, like here, simply
>      because alias pair code bypass the lto-symtab.  One of goals of the incorrect patch
>      above is to make lto-symtab to do the merging and thus fix this issue (for now at
>      all decls except for builtins).
> 
>      Still it would be good to solve the problem on non-LTO compilation, too.
>      We discussed introduction of proper symbol table into GCC at the GCC gathering last
>      weekend.  It is where I am heading but it will take some time.
> 
> Until that happens, I suggest fixing the testcase same was as we already fixed
> the memops-asm-lib.c in 4.6 timeframe.
> 
> Bootstrapped/regtested x86_64-linux, OK?

Ok.

Thanks,
Richard.

> Index: strstr-asm-lib.c
> ===================================================================
> --- strstr-asm-lib.c	(revision 175183)
> +++ strstr-asm-lib.c	(working copy)
> @@ -7,6 +7,7 @@
>  extern int inside_main;
>  extern const char *p;
>  
> +__attribute__ ((used))
>  char *
>  my_strstr (const char *s1, const char *s2)
>  {
Jan Hubicka June 27, 2011, 9:16 a.m. UTC | #3
> > There are two problems here
> >   1) We do not stream builtin decls and merge them outside lto-symtab (by just
> >      streaming references to builtins with their asm names).  There is at least
> >      one extra PR related to this and on my TODO is to simply remove the code.
> >   2) Aliases within single unit works only when both the alias and the target use
> >      asm name.  This is because internally we store mangled DECL_ASSEMBLER_NAME and the
> >      alias_pair code.
> >      With LTO this breaks existing code simply because what used to be multiple units
> >      and thus safe is now single LTO unit.
> > 
> >      Dave Korn fixed part of the problem by introducing mangling code into lto-symtab
> >      His code solve similar problems with aliases from the asm code, but it
> >      did not solve the problem with aliases from LTO units, like here, simply
> >      because alias pair code bypass the lto-symtab.  One of goals of the incorrect patch
> >      above is to make lto-symtab to do the merging and thus fix this issue (for now at
> >      all decls except for builtins).
> > 
> >      Still it would be good to solve the problem on non-LTO compilation, too.
> >      We discussed introduction of proper symbol table into GCC at the GCC gathering last
> >      weekend.  It is where I am heading but it will take some time.
> > 
> > Until that happens, I suggest fixing the testcase same was as we already fixed
> > the memops-asm-lib.c in 4.6 timeframe.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> Ok.
Hi,
thanks!  Could you, please, also consider the second two variants I sent?  In
general attribute used is needed when implementing libfuncs since we never know
when folding will intoduce new libcall (like in memops-asm).  However this
particular case don't need use because the builtin is called directly.
They also solve the problem of BLOCK lists being messed up.

In meantime I tested both variants and they pass.  I will add a changelog, of course.

Honza
diff mbox

Patch

Index: strstr-asm-lib.c
===================================================================
--- strstr-asm-lib.c	(revision 175183)
+++ strstr-asm-lib.c	(working copy)
@@ -7,6 +7,7 @@ 
 extern int inside_main;
 extern const char *p;
 
+__attribute__ ((used))
 char *
 my_strstr (const char *s1, const char *s2)
 {