diff mbox

Fix type merging deficiency during WPA

Message ID 1644035.nvA1iejZSU@polaris
State New
Headers show

Commit Message

Eric Botcazou July 5, 2016, 10:57 a.m. UTC
Hi,

the deficiency comes from a chicken-and-egg problem during WPA: DECL nodes 
merging depends on type merging, but type merging also depends on DECL nodes 
merging for dynamic types declared at file scope, which easily occurs in Ada.

For the attached trivial testcase, the compiler issues:

/home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto18_pkg1.ads:12:13: warning: type 
of 'lto18_pkg1__proc' does not match original declaration [-Wlto-type-
mismatch]
/home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto18_pkg1.adb:3:3: note: 
'lto18_pkg1__proc' was previously declared here
/home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto18_pkg1.adb:3:3: note: code may be 
misoptimized unless -fno-strict-aliasing is used

The proposed fix is to add a special processing in operand_equal_p/add_expr 
for DECL nodes during WPA.  It contains a tweak for lto_fixup_prevailing_decls 
for the sake of completeness, but it is not necessary for fixing the problem.

Tested on x86_64-suse-linux, OK for the mainline?


2016-07-05  Eric Botcazou  <ebotcazou@adacore.com>

	* cgraph.h (symbol_table::decl_assembler_name_hash): Make public.
	* fold-const.c (operand_equal_p) <tcc_declaration>: Add special
	processing during WPA.
	* tree.c (add_expr) <tcc_declaration>: Likewise.
lto/
	* lto.c (walk_simple_constant_arithmetic): New function.
	(LTO_SET_PREVAIL): Use it to fix up DECL nodes in simple expressions.


2016-07-05  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/lto18.adb: New test.
	* gnat.dg/lto18_pkg1.ad[sb]: New helper.
	* gnat.dg/lto18_pkg2.ad[sb]: Likewise.

Comments

Richard Biener July 5, 2016, 11:59 a.m. UTC | #1
On Tue, Jul 5, 2016 at 12:57 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> the deficiency comes from a chicken-and-egg problem during WPA: DECL nodes
> merging depends on type merging, but type merging also depends on DECL nodes
> merging for dynamic types declared at file scope, which easily occurs in Ada.

So this is sth like (invalid C)

t.h
---
int n;
struct X { int x[n]; };

t1.c
--
#include "t.h"
struct X x;
t2.c
--
#include "t.h"
struct X x;

?

It's not obvious from the fix (which I think is in the wrong place)
which operand_equal/hash
call during WPA this is supposed to fix.  So can you please provide a
little more context here?

Thanks,
Richard.

> For the attached trivial testcase, the compiler issues:
>
> /home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto18_pkg1.ads:12:13: warning: type
> of 'lto18_pkg1__proc' does not match original declaration [-Wlto-type-
> mismatch]
> /home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto18_pkg1.adb:3:3: note:
> 'lto18_pkg1__proc' was previously declared here
> /home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto18_pkg1.adb:3:3: note: code may be
> misoptimized unless -fno-strict-aliasing is used
>
> The proposed fix is to add a special processing in operand_equal_p/add_expr
> for DECL nodes during WPA.  It contains a tweak for lto_fixup_prevailing_decls
> for the sake of completeness, but it is not necessary for fixing the problem.
>
> Tested on x86_64-suse-linux, OK for the mainline?
>
>
> 2016-07-05  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * cgraph.h (symbol_table::decl_assembler_name_hash): Make public.
>         * fold-const.c (operand_equal_p) <tcc_declaration>: Add special
>         processing during WPA.
>         * tree.c (add_expr) <tcc_declaration>: Likewise.
> lto/
>         * lto.c (walk_simple_constant_arithmetic): New function.
>         (LTO_SET_PREVAIL): Use it to fix up DECL nodes in simple expressions.
>
>
> 2016-07-05  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * gnat.dg/lto18.adb: New test.
>         * gnat.dg/lto18_pkg1.ad[sb]: New helper.
>         * gnat.dg/lto18_pkg2.ad[sb]: Likewise.
>
> --
> Eric Botcazou
Eric Botcazou July 6, 2016, 7:56 a.m. UTC | #2
> So this is sth like (invalid C)
> 
> t.h
> ---
> int n;
> struct X { int x[n]; };
> 
> t1.c
> --
> #include "t.h"
> struct X x;
> t2.c
> --
> #include "t.h"
> struct X x;
> 
> ?

Essentially yes, but with a single definition for 'n' and references to it.

> It's not obvious from the fix (which I think is in the wrong place)
> which operand_equal/hash
> call during WPA this is supposed to fix.  So can you please provide a
> little more context here?

It's called during the canonical type computation invoked from lto_read_decls:

	      /* Compute the canonical type of all types.
		 ???  Should be able to assert that !TYPE_CANONICAL.  */
	      if (TYPE_P (t) && !TYPE_CANONICAL (t))
		{
		  gimple_register_canonical_type (t);
		  if (odr_type_p (t))
		    register_odr_type (t);
		}

In particular for VLAs:

  /* For array types hash the domain bounds and the string flag.  */
  if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type))
    {
      hstate.add_int (TYPE_STRING_FLAG (type));
      /* OMP lowering can introduce error_mark_node in place of
	 random local decls in types.  */
      if (TYPE_MIN_VALUE (TYPE_DOMAIN (type)) != error_mark_node)
	inchash::add_expr (TYPE_MIN_VALUE (TYPE_DOMAIN (type)), hstate);
      if (TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != error_mark_node)
	inchash::add_expr (TYPE_MAX_VALUE (TYPE_DOMAIN (type)), hstate);
    }
Richard Biener July 6, 2016, 9:11 a.m. UTC | #3
On Wed, Jul 6, 2016 at 9:56 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> So this is sth like (invalid C)
>>
>> t.h
>> ---
>> int n;
>> struct X { int x[n]; };
>>
>> t1.c
>> --
>> #include "t.h"
>> struct X x;
>> t2.c
>> --
>> #include "t.h"
>> struct X x;
>>
>> ?
>
> Essentially yes, but with a single definition for 'n' and references to it.
>
>> It's not obvious from the fix (which I think is in the wrong place)
>> which operand_equal/hash
>> call during WPA this is supposed to fix.  So can you please provide a
>> little more context here?
>
> It's called during the canonical type computation invoked from lto_read_decls:
>
>               /* Compute the canonical type of all types.
>                  ???  Should be able to assert that !TYPE_CANONICAL.  */
>               if (TYPE_P (t) && !TYPE_CANONICAL (t))
>                 {
>                   gimple_register_canonical_type (t);
>                   if (odr_type_p (t))
>                     register_odr_type (t);
>                 }
>
> In particular for VLAs:
>
>   /* For array types hash the domain bounds and the string flag.  */
>   if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type))
>     {
>       hstate.add_int (TYPE_STRING_FLAG (type));
>       /* OMP lowering can introduce error_mark_node in place of
>          random local decls in types.  */
>       if (TYPE_MIN_VALUE (TYPE_DOMAIN (type)) != error_mark_node)
>         inchash::add_expr (TYPE_MIN_VALUE (TYPE_DOMAIN (type)), hstate);
>       if (TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != error_mark_node)
>         inchash::add_expr (TYPE_MAX_VALUE (TYPE_DOMAIN (type)), hstate);
>     }

I see.  I think the solution is to perform cgraph/varpool merging
before attempting to read in
the global decl stream.  IIRC Micha had (old) patches for this.

But I wonder why we don't tree-merge 'n' here (from my C example) and
thus figure
that the type domain of x is equal?  Or is it that 'n' and 'x' are in
the same SCC (they
referece each other in some way)?  In this case the bug would be that we fail to
treat them equal optimistically.  That said, I don't see how TYPE_CANONICAL
computation is relevant - what is relevant is the failure to merge the
two types.
In debugging this I'd start to see if the hashes are not equal or if
they are equal
at which node we consider them to differ.

Richard.

> --
> Eric Botcazou
Eric Botcazou July 6, 2016, 9:33 a.m. UTC | #4
> I see.  I think the solution is to perform cgraph/varpool merging
> before attempting to read in
> the global decl stream.  IIRC Micha had (old) patches for this.

How can you merge varpool nodes if you haven't merged types?

> But I wonder why we don't tree-merge 'n' here (from my C example) and
> thus figure
> that the type domain of x is equal?  Or is it that 'n' and 'x' are in
> the same SCC (they
> referece each other in some way)?  In this case the bug would be that we
> fail to treat them equal optimistically.  That said, I don't see how
> TYPE_CANONICAL computation is relevant - what is relevant is the failure to
> merge the two types.
> In debugging this I'd start to see if the hashes are not equal or if
> they are equal
> at which node we consider them to differ.

We just have 2 different DECLs with different DECL_UIDs, the definition from a 
compilation unit and a reference from another compilation unit, so the hashes 
naturally differ too.  What's supposed to have them reconciled at this point?

The TYPE_CANONICAL computation is relevant because, with GCC 6, the criterion 
for compatibility of pointer types is the alias set, which is based on the 
TYPE_CANONICAL of the pointed-to type, so we fail to merge pointer types 
because warn_type_compatibility_p returns non-zero if TYPE_CANONICAL differs.
Richard Biener July 6, 2016, 10:21 a.m. UTC | #5
On Wed, Jul 6, 2016 at 11:33 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> I see.  I think the solution is to perform cgraph/varpool merging
>> before attempting to read in
>> the global decl stream.  IIRC Micha had (old) patches for this.
>
> How can you merge varpool nodes if you haven't merged types?
>
>> But I wonder why we don't tree-merge 'n' here (from my C example) and
>> thus figure
>> that the type domain of x is equal?  Or is it that 'n' and 'x' are in
>> the same SCC (they
>> referece each other in some way)?  In this case the bug would be that we
>> fail to treat them equal optimistically.  That said, I don't see how
>> TYPE_CANONICAL computation is relevant - what is relevant is the failure to
>> merge the two types.
>> In debugging this I'd start to see if the hashes are not equal or if
>> they are equal
>> at which node we consider them to differ.
>
> We just have 2 different DECLs with different DECL_UIDs, the definition from a
> compilation unit and a reference from another compilation unit, so the hashes
> naturally differ too.  What's supposed to have them reconciled at this point?

I am talking about tree/SCC merging which happily merges global decls as well.
It uses custom hashing (see lto-streamer-out.c:hash_tree) which doesn't hash
DECL_UID (obviously).  This merging process should be optimistic for all nodes
in the same SCC as well.

That said, I expect the types to be tree merged and wonder why they are not.

If they were merged they'd obviously share TYPE_CANONICAL because there
would be only one type to compute TYPE_CANONICAL for.

Richard.

> The TYPE_CANONICAL computation is relevant because, with GCC 6, the criterion
> for compatibility of pointer types is the alias set, which is based on the
> TYPE_CANONICAL of the pointed-to type, so we fail to merge pointer types
> because warn_type_compatibility_p returns non-zero if TYPE_CANONICAL differs.
>
> --
> Eric Botcazou
Richard Biener July 6, 2016, 10:44 a.m. UTC | #6
On Wed, Jul 6, 2016 at 12:21 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Jul 6, 2016 at 11:33 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>>> I see.  I think the solution is to perform cgraph/varpool merging
>>> before attempting to read in
>>> the global decl stream.  IIRC Micha had (old) patches for this.
>>
>> How can you merge varpool nodes if you haven't merged types?

You merge them just in the way the linker instructs you via the
resolution table.

>>> But I wonder why we don't tree-merge 'n' here (from my C example) and
>>> thus figure
>>> that the type domain of x is equal?  Or is it that 'n' and 'x' are in
>>> the same SCC (they
>>> referece each other in some way)?  In this case the bug would be that we
>>> fail to treat them equal optimistically.  That said, I don't see how
>>> TYPE_CANONICAL computation is relevant - what is relevant is the failure to
>>> merge the two types.
>>> In debugging this I'd start to see if the hashes are not equal or if
>>> they are equal
>>> at which node we consider them to differ.
>>
>> We just have 2 different DECLs with different DECL_UIDs, the definition from a
>> compilation unit and a reference from another compilation unit, so the hashes
>> naturally differ too.  What's supposed to have them reconciled at this point?
>
> I am talking about tree/SCC merging which happily merges global decls as well.
> It uses custom hashing (see lto-streamer-out.c:hash_tree) which doesn't hash
> DECL_UID (obviously).  This merging process should be optimistic for all nodes
> in the same SCC as well.
>
> That said, I expect the types to be tree merged and wonder why they are not.
>
> If they were merged they'd obviously share TYPE_CANONICAL because there
> would be only one type to compute TYPE_CANONICAL for.

It probably boils down to one unit refering to the DECL with DECL_EXTERNAL set
and one to the DECL with TREE_STATIC set?  This would mean that
hashing/comparing
should be set up to "merge" those but the merging ultimatively rejected by some
toplevel logic (so we get users merged).  But as said, early symtab
merging would
fix this as well.

Richard.

> Richard.
>
>> The TYPE_CANONICAL computation is relevant because, with GCC 6, the criterion
>> for compatibility of pointer types is the alias set, which is based on the
>> TYPE_CANONICAL of the pointed-to type, so we fail to merge pointer types
>> because warn_type_compatibility_p returns non-zero if TYPE_CANONICAL differs.
>>
>> --
>> Eric Botcazou
diff mbox

Patch

Index: cgraph.h
===================================================================
--- cgraph.h	(revision 237999)
+++ cgraph.h	(working copy)
@@ -2175,6 +2175,9 @@  public:
   /* Set the DECL_ASSEMBLER_NAME and update symtab hashtables.  */
   void change_decl_assembler_name (tree decl, tree name);
 
+  /* Hash asmnames ignoring the user specified marks.  */
+  static hashval_t decl_assembler_name_hash (const_tree asmname);
+
   /* Return true if assembler names NAME1 and NAME2 leads to the same symbol
      name.  */
   static bool assembler_names_equal_p (const char *name1, const char *name2);
@@ -2243,9 +2246,6 @@  private:
   /* Remove NODE from assembler name hash.  */
   void unlink_from_assembler_name_hash (symtab_node *node, bool with_clones);
 
-  /* Hash asmnames ignoring the user specified marks.  */
-  static hashval_t decl_assembler_name_hash (const_tree asmname);
-
   /* Compare ASMNAME with the DECL_ASSEMBLER_NAME of DECL.  */
   static bool decl_assembler_name_equal (tree decl, const_tree asmname);
 
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 237999)
+++ fold-const.c	(working copy)
@@ -3226,6 +3226,17 @@  operand_equal_p (const_tree arg0, const_
 	}
 
     case tcc_declaration:
+      /* During WPA we can be called before the DECL nodes are merged.  */
+      if (flag_wpa
+	  && TREE_CODE (arg0) == VAR_DECL
+	  && (TREE_PUBLIC (arg0) || DECL_EXTERNAL (arg0))
+	  && (TREE_PUBLIC (arg1) || DECL_EXTERNAL (arg1)))
+	return symbol_table::assembler_names_equal_p
+		 (IDENTIFIER_POINTER
+		    (DECL_ASSEMBLER_NAME (const_cast<tree> (arg0))),
+		  IDENTIFIER_POINTER
+		    (DECL_ASSEMBLER_NAME (const_cast<tree> (arg1))));
+
       /* Consider __builtin_sqrt equal to sqrt.  */
       return (TREE_CODE (arg0) == FUNCTION_DECL
 	      && DECL_BUILT_IN (arg0) && DECL_BUILT_IN (arg1)
Index: lto/lto.c
===================================================================
--- lto/lto.c	(revision 237999)
+++ lto/lto.c	(working copy)
@@ -2506,15 +2506,55 @@  lto_wpa_write_files (void)
   timevar_pop (TV_WHOPR_WPA_IO);
 }
 
+/* Look inside *EXPR into simple arithmetic operations involving constants.
+   Return the address of the outermost non-arithmetic or non-constant node.
+   This should be equivalent to tree.c:skip_simple_constant_arithmetic.  */
 
-/* If TT is a variable or function decl replace it with its
-   prevailing variant.  */
+static tree *
+walk_simple_constant_arithmetic (tree *expr)
+{
+  while (TREE_CODE (*expr) == NON_LVALUE_EXPR)
+    expr = &TREE_OPERAND (*expr, 0);
+
+  while (true)
+    {
+      if (UNARY_CLASS_P (*expr))
+	expr = &TREE_OPERAND (*expr, 0);
+      else if (BINARY_CLASS_P (*expr))
+	{
+	  if (TREE_CONSTANT (TREE_OPERAND (*expr, 1)))
+	    expr = &TREE_OPERAND (*expr, 0);
+	  else if (TREE_CONSTANT (TREE_OPERAND (*expr, 0)))
+	    expr = &TREE_OPERAND (*expr, 1);
+	  else
+	    break;
+	}
+      else
+	break;
+    }
+
+  return expr;
+}
+
+/* If TT is a variable or function decl or a simple expression containing one,
+   replace the occurrence with its prevailing variant.  This should be able to
+   deal with all the expressions attached to _DECL and _TYPE nodes which were
+   streamed into the GIMPLE IR.  */
 #define LTO_SET_PREVAIL(tt) \
   do {\
+    tree *loc; \
     if ((tt) && VAR_OR_FUNCTION_DECL_P (tt) \
 	&& (TREE_PUBLIC (tt) || DECL_EXTERNAL (tt))) \
       { \
-        tt = lto_symtab_prevailing_decl (tt); \
+	tt = lto_symtab_prevailing_decl (tt); \
+	fixed = true; \
+      } \
+    else if ((tt) && EXPR_P (tt) \
+	     && (loc = walk_simple_constant_arithmetic (&tt)) \
+	     && VAR_OR_FUNCTION_DECL_P (*loc) \
+	     && (TREE_PUBLIC (*loc) || DECL_EXTERNAL (*loc))) \
+      { \
+	*loc = lto_symtab_prevailing_decl (*loc); \
 	fixed = true; \
       } \
   } while (0)
Index: tree.c
===================================================================
--- tree.c	(revision 237999)
+++ tree.c	(working copy)
@@ -7878,8 +7878,17 @@  add_expr (const_tree t, inchash::hash &h
 
       if (tclass == tcc_declaration)
 	{
-	  /* DECL's have a unique ID */
-	  hstate.add_wide_int (DECL_UID (t));
+	  /* During WPA we can be called before the DECL nodes are merged.  */
+	  if (flag_wpa
+	      && TREE_CODE (t) == VAR_DECL
+	      && (TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
+	    {
+	      tree x = DECL_ASSEMBLER_NAME (const_cast <tree> (t));
+	      hstate.merge_hash (symbol_table::decl_assembler_name_hash (x));
+	    }
+	  else
+	    /* DECL's have a unique ID */
+	    hstate.add_wide_int (DECL_UID (t));
 	}
       else if (tclass == tcc_comparison && !commutative_tree_code (code))
 	{