diff mbox

[C,frontend] Fix construction of TYPE_STUB_DECL

Message ID 20150510173354.GB80167@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka May 10, 2015, 5:33 p.m. UTC
Hi,
TREE_PUBLIC of TYPE_DECL is defined to say if the type is public:
/* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,                   
   nonzero means name is to be accessible from outside this translation unit.   
   In an IDENTIFIER_NODE, nonzero means an external declaration                 
   accessible from outside this translation unit was previously seen            
   for this name in an inner scope.  */                                         
#define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)                            

This is properly honored by C++ FE but other FEs are bit random, which in turn
confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
on type mismatch warnings.  I used to be able to get around by checking only
C++ types at LTO time, but with type checking in lto-symtab I can not, because
I do want to compare type compatibility cross translation units and cross languages
and we have no reliable way to say what type originated as C++ and what did not.

This fixed TYPE_STUB_DECl construction in C frontend. I will check other
FEs separately. I can also add way to recognize C++ types, but I think it is
good idea to make type representation consistent across FEs.

Bootstrapped/regtested x86_64-linux, OK?

Honza

	* c-decl.c (pushtag): Declare type as public.

Comments

Richard Biener May 11, 2015, 7:53 a.m. UTC | #1
On Sun, May 10, 2015 at 7:33 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> TREE_PUBLIC of TYPE_DECL is defined to say if the type is public:
> /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
>    nonzero means name is to be accessible from outside this translation unit.
>    In an IDENTIFIER_NODE, nonzero means an external declaration
>    accessible from outside this translation unit was previously seen
>    for this name in an inner scope.  */
> #define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)
>
> This is properly honored by C++ FE but other FEs are bit random, which in turn
> confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
> on type mismatch warnings.  I used to be able to get around by checking only
> C++ types at LTO time, but with type checking in lto-symtab I can not, because
> I do want to compare type compatibility cross translation units and cross languages
> and we have no reliable way to say what type originated as C++ and what did not.

The idea was you can walk up the context chain until you reach a
TRANSLATION_UNIT_DECL
where the source language is encoded in...  of course most FEs are
quite lazy here and
eventually end up at NULL_TREE instead.

Richard.

> This fixed TYPE_STUB_DECl construction in C frontend. I will check other
> FEs separately. I can also add way to recognize C++ types, but I think it is
> good idea to make type representation consistent across FEs.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> Honza
>
>         * c-decl.c (pushtag): Declare type as public.
> Index: c/c-decl.c
> ===================================================================
> --- c/c-decl.c  (revision 222981)
> +++ c/c-decl.c  (working copy)
> @@ -1563,6 +1563,7 @@ pushtag (location_t loc, tree name, tree
>
>    TYPE_STUB_DECL (type) = pushdecl (build_decl (loc,
>                                                 TYPE_DECL, NULL_TREE, type));
> +  TREE_PUBLIC (TYPE_STUB_DECL (type)) = 1;
>
>    /* An approximation for now, so we can tell this is a function-scope tag.
>       This will be updated in pop_scope.  */
Jan Hubicka May 11, 2015, 10:57 a.m. UTC | #2
> On Sun, May 10, 2015 at 7:33 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> > Hi,
> > TREE_PUBLIC of TYPE_DECL is defined to say if the type is public:
> > /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
> >    nonzero means name is to be accessible from outside this translation unit.
> >    In an IDENTIFIER_NODE, nonzero means an external declaration
> >    accessible from outside this translation unit was previously seen
> >    for this name in an inner scope.  */
> > #define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)
> >
> > This is properly honored by C++ FE but other FEs are bit random, which in turn
> > confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
> > on type mismatch warnings.  I used to be able to get around by checking only
> > C++ types at LTO time, but with type checking in lto-symtab I can not, because
> > I do want to compare type compatibility cross translation units and cross languages
> > and we have no reliable way to say what type originated as C++ and what did not.
> 
> The idea was you can walk up the context chain until you reach a
> TRANSLATION_UNIT_DECL
> where the source language is encoded in...  of course most FEs are
> quite lazy here and
> eventually end up at NULL_TREE instead.

Yep, I can walk up and look for NAMESPACE_DECL without TREE_PUBLIC flag, too,
or cleanup the flag in free_lang_data when I know if we do C++ (probably).
The predicate for anonymous_namespace is called on few quite hot paths in
devirt code, but that is not my main concern either.

In general I think we ought to fix frontends to agree with each other (and
docs) on how to represent semantics instead of adding special cases and
workarounds to middle-end/LTO.  I am trying to keep the anonymous namespace
thing indepent of C++. While I am not sure we support other language with
similar notion, it seems to make sense independently of C++. After all, it
gives the type escape info for free :)

Honza
> 
> Richard.
Jeff Law May 11, 2015, 6:40 p.m. UTC | #3
On 05/10/2015 11:33 AM, Jan Hubicka wrote:
> Hi,
> TREE_PUBLIC of TYPE_DECL is defined to say if the type is public:
> /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
>     nonzero means name is to be accessible from outside this translation unit.
>     In an IDENTIFIER_NODE, nonzero means an external declaration
>     accessible from outside this translation unit was previously seen
>     for this name in an inner scope.  */
> #define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)
>
> This is properly honored by C++ FE but other FEs are bit random, which in turn
> confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
> on type mismatch warnings.  I used to be able to get around by checking only
> C++ types at LTO time, but with type checking in lto-symtab I can not, because
> I do want to compare type compatibility cross translation units and cross languages
> and we have no reliable way to say what type originated as C++ and what did not.
>
> This fixed TYPE_STUB_DECl construction in C frontend. I will check other
> FEs separately. I can also add way to recognize C++ types, but I think it is
> good idea to make type representation consistent across FEs.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> Honza
>
> 	* c-decl.c (pushtag): Declare type as public.
What I'm struggling with here is how do you know the stub decl is 
public?  I realize these things are a bit special, but I don't see the 
C++ front-end doing anything similar.  What am I missing?

jeff
Jan Hubicka May 11, 2015, 6:53 p.m. UTC | #4
> >Bootstrapped/regtested x86_64-linux, OK?
> >
> >Honza
> >
> >	* c-decl.c (pushtag): Declare type as public.
> What I'm struggling with here is how do you know the stub decl is
> public?  I realize these things are a bit special, but I don't see
> the C++ front-end doing anything similar.  What am I missing?

It is used by type_in_anonymous_namespace_p. C++ FE definitely sets them accordingly,
but I have no idea how it is done.  Jason, can you help?

Honza
> 
> jeff
Jason Merrill May 11, 2015, 9:26 p.m. UTC | #5
On 05/10/2015 12:33 PM, Jan Hubicka wrote:
> This is properly honored by C++ FE but other FEs are bit random, which in turn
> confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
> on type mismatch warnings.  I used to be able to get around by checking only
> C++ types at LTO time, but with type checking in lto-symtab I can not, because
> I do want to compare type compatibility cross translation units and cross languages
> and we have no reliable way to say what type originated as C++ and what did not.

I think we should, as only C++ declarations are subject to the ODR.  C 
has different (structural) compatibility rules, and I think they should 
apply when comparing C and C++ types.

Since C struct names have no linkage, I don't think it's right to set 
TREE_PUBLIC on them.

Jason
Jan Hubicka May 11, 2015, 9:52 p.m. UTC | #6
> On 05/10/2015 12:33 PM, Jan Hubicka wrote:
> >This is properly honored by C++ FE but other FEs are bit random, which in turn
> >confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
> >on type mismatch warnings.  I used to be able to get around by checking only
> >C++ types at LTO time, but with type checking in lto-symtab I can not, because
> >I do want to compare type compatibility cross translation units and cross languages
> >and we have no reliable way to say what type originated as C++ and what did not.
> 
> I think we should, as only C++ declarations are subject to the ODR.
> C has different (structural) compatibility rules, and I think they
> should apply when comparing C and C++ types.
> 
> Since C struct names have no linkage, I don't think it's right to
> set TREE_PUBLIC on them.

Yes, I need safe way to tell what type is subject to ODR and what is not.
I am not quite sure what is the best approach here. This is what the code is doing
currently:

To detect ODR types at LTO time I use combination of presence of mangled name
and type_in_anonymous_namespace_p check. (The idea is that we do not really need
to mangle anonymous type as we do not need to deal with cross-module merging.):

odr_type_p (const_tree t)
{
  if (type_in_anonymous_namespace_p (t))
    return true;
  /* We do not have this information when not in LTO, but we do not need
     to care, since it is used only for type merging.  */
  gcc_assert (in_lto_p || flag_lto);

  return (TYPE_NAME (t)
          && (DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t))));
}
bool
type_in_anonymous_namespace_p (const_tree t)
{
  /* TREE_PUBLIC of TYPE_STUB_DECL may not be properly set for
     bulitin types; those have CONTEXT NULL.  */
  if (!TYPE_CONTEXT (t))
    return false;
  return (TYPE_STUB_DECL (t) && !TREE_PUBLIC (TYPE_STUB_DECL (t)));
}

the catch is that type_in_anonymous_namespace_p returns true for some C types.
The TYPE_CONTEXT test is already hack as I run into cases of pre-streamed LTO
types to get injected into C++ classes (i.e. if you refere to int, LTO
streaming will replace C++ int with TREE_PUBLIC (TYPE_STUB_DECL (t)) by its own
int that is !TREE_PUBLIC (TYPE_STUB_DECL (t)).  The hack to avoid builtin types
made type_in_anonymous_namespace_p to work well in cases I needed for class
types. (I believe it is because C builds record with
 TREE_PUBLIC (TYPE_STUB_DECL (t))=1 but it is a while I checked this)

We could certainly just add a flag TYPE_ODR_P that says "this type is
controlled by odr rule".  I considered that but it is generally not so nice
to introduce new flags and it seemed to me that because C
standard allows to match all types cross-module on structural basis, it makes
sense to consider them all as having public linkage because they are
"accessible from outside this translation unit." as tree.h defines the flag.

I would be happy with TYPE_ODR_P or any other solution that looks better.

Honza

> 
> Jason
Jan Hubicka May 13, 2015, 4:56 p.m. UTC | #7
Hello,
it seems that the discussion here got stuck without arriving to a consensus.
I generally see three options here
 1) make TYPE_PUBLIC flag of TYPE_STUB_DECL to work consistently across frontends
    in a sense that types with flag !TYPE_PUBLIC (TYPE_STUB_DECL (t)) can be
    considered local to the translation unit and thus we can consider these types
    non-escaping (possibly changing the layout) and TBAA incompatible with types
    from other units for LTO.
 2) Add TYPE_ODR flag that will mark all types that comply the ODR rule.  This
    would be ideally set on C++ types by the FE.
 3) Make type_in_anonymous_namespace to walk context and look for non-public
    namespace instead of relying on TYPE_STUB_DECL alone and if TREE_PUBLIC
    flag on namespaces turns out to be not consistent across FEs, it may
    walk up to check TRANSLATION_UNIT_DECL and work out if it is C++ lang.

In general I still lean toward 1 as it has hope to be useful for non-C++
languages that have notion of local types. I am trying to keep all the
ODR/devirt code as generic as posisble to make it useful for non-C++ based
languages, too. 1) is one of very few cases where we can "solve" type escape.

3) feels like a hack and would make type_in_anonymous_namespace
non-constant (probably still cheap enough for my needs).  It seems like
something useful for verify_type to check that 1) or 2) works as expected.

Any solution here would let me to apply
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01061.html which otherwise
triggers false positives on Firefox. Mixing C and C++ units makes us to
think that one of global vars (originating from C unit) is declared with type
in anonymous namespace and thus can not match declaration from other unit.
This is workaroundable, too, because C++ variables in anonymous namespaces
are never global so I know all those warnings are false positives, but I
would preffer to not go this way.

Together with the -Wodr warning on types the patch
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01061.html tests that the ODR
equivalency as established by ipa-devrt at linktime exactly match what C++
standard says - if it was wrongly defining two different types equivalent, we
will eventually get ODR type mismatch warning.  If it was wrongly defining two
types different, we will get eventually incompatible declaration warning. I
would like to get this tested and keep an eye at ODR warnings double checking
that they really indicate bugs in compiled programs and not bugs in GCC.
(current implementation seems correct so far on the bigger apps I built)

Moreover the patch is useful to catch some real bugs in Firefox or Libreoffice.

Honza
Jan Hubicka May 21, 2015, 8:32 p.m. UTC | #8
Hello,
I would like to ping this. Currently we have a problem with Ada ICE because we
consider a global variable produced by ada to have type in C++ anonymous
namespace and we get false ODR merging wraning compiling clang because we
consider instances of templates with parameter in anonymous namespace
non-anonymous. I do not think I can make type_in_anonymous_namespace_p to
work reliably at LTO time without frontend change.

This is not really hard thing to do - we only need to decide on a representation.
I think at LTO time it is useful to have two things
  - be able to say what type comply C++ ODR rule, because we special case these
    for ODR warnings
  - be able to say at LTO time what types are anonymous, that is they are not
    compatible with any type from other unit.

Either a special flag TYPE_ODR_P set by C++ FE on all types that comply ODR rule
or making TYPE_PUBLIC (TYPE_STUB_DECL (t)) to reliably identify that type is
anonymous (or an equivalent) would fix the two issues.

Honza
Jason Merrill May 21, 2015, 9:24 p.m. UTC | #9
On 05/21/2015 04:32 PM, Jan Hubicka wrote:
> I think at LTO time it is useful to have two things
>    - be able to say what type comply C++ ODR rule, because we special case these
>      for ODR warnings
>    - be able to say at LTO time what types are anonymous, that is they are not
>      compatible with any type from other unit.

...where the second group is a subset of the first; in languages that 
use structural compatibility, anonymous types can be compatible.

> Either a special flag TYPE_ODR_P set by C++ FE on all types that comply ODR rule
> or making TYPE_PUBLIC (TYPE_STUB_DECL (t)) to reliably identify that type is
> anonymous (or an equivalent) would fix the two issues.

It seems that we've been establishing which types have linkage by 
setting their DECL_ASSEMBLER_NAME.  It seems that the problem is coming 
from leaving it unset for types with internal (anonymous) linkage, so 
you're trying to check TREE_PUBLIC instead, which breaks with other 
front ends. Perhaps we should set DECL_ASSEMBLER_NAME for internal C++ 
types to some magic value?

Jason
Jan Hubicka May 21, 2015, 9:34 p.m. UTC | #10
> On 05/21/2015 04:32 PM, Jan Hubicka wrote:
> >I think at LTO time it is useful to have two things
> >   - be able to say what type comply C++ ODR rule, because we special case these
> >     for ODR warnings
> >   - be able to say at LTO time what types are anonymous, that is they are not
> >     compatible with any type from other unit.
> 
> ...where the second group is a subset of the first; in languages
> that use structural compatibility, anonymous types can be
> compatible.
> 
> >Either a special flag TYPE_ODR_P set by C++ FE on all types that comply ODR rule
> >or making TYPE_PUBLIC (TYPE_STUB_DECL (t)) to reliably identify that type is
> >anonymous (or an equivalent) would fix the two issues.
> 
> It seems that we've been establishing which types have linkage by
> setting their DECL_ASSEMBLER_NAME.  It seems that the problem is
> coming from leaving it unset for types with internal (anonymous)
> linkage, so you're trying to check TREE_PUBLIC instead, which breaks
> with other front ends. Perhaps we should set DECL_ASSEMBLER_NAME for
> internal C++ types to some magic value?

This would also work, yes.  We can set it into something like "<anonymous>".
One problem would be that type_with_linkage_p/type_in_anonymous_namespace_p
would not work on non-C++ types without LTO (because then we do not produce the
type manglings). I suppose it is not really a problem because only
place I use them is the devirtualization machinery and that won't get any
polymorphic types for other languages.

I suppose I can drop type_in_anonymous_namespcae_p in tree.c and make
mangle_decl to set "anonymous" for those?

Honza
Jason Merrill May 22, 2015, 2:56 a.m. UTC | #11
On 05/21/2015 05:34 PM, Jan Hubicka wrote:
> This would also work, yes.  We can set it into something like "<anonymous>".

> One problem would be that type_with_linkage_p/type_in_anonymous_namespace_p
> would not work on non-C++ types without LTO (because then we do not produce the
> type manglings). I suppose it is not really a problem because only
> place I use them is the devirtualization machinery and that won't get any
> polymorphic types for other languages.

And I don't know whether types from any other languages have linkage 
like C++ times, so that's fine.

> I suppose I can drop type_in_anonymous_namespcae_p in tree.c and make
> mangle_decl to set "anonymous" for those?

Sounds good.

Jason
diff mbox

Patch

Index: c/c-decl.c
===================================================================
--- c/c-decl.c	(revision 222981)
+++ c/c-decl.c	(working copy)
@@ -1563,6 +1563,7 @@  pushtag (location_t loc, tree name, tree
 
   TYPE_STUB_DECL (type) = pushdecl (build_decl (loc,
 						TYPE_DECL, NULL_TREE, type));
+  TREE_PUBLIC (TYPE_STUB_DECL (type)) = 1;
 
   /* An approximation for now, so we can tell this is a function-scope tag.
      This will be updated in pop_scope.  */