Patchwork ELF interposition and One Definition Rule

login
register
mail settings
Submitter Jan Hubicka
Date Aug. 26, 2013, 3:21 p.m.
Message ID <20130826152141.GA19918@kam.mff.cuni.cz>
Download mbox | patch
Permalink /patch/269907/
State New
Headers show

Comments

Jan Hubicka - Aug. 26, 2013, 3:21 p.m.
Hi,
cgraph_function_body_availability determine if body of a given function is
available in current compilation unit and if so, if it may be overwritten by
completely different semantic by (dynamic)linker. Function that is
AVAIL_AVAILABLE can still be repleaced by a function fron different unit, but
its effects must be the same.

Our default behaviour special case inline functions that are always
AVAIL_AVAILABLE and, via decl_replaceable_p, also any COMDAT (that may be for
functions since all COMDATs are also inlines, but makes difference for
variables I think).

In the variable initializer case we also special case DECL_VIRTUAL and assume
that there is only one possible initializer for every virtual variable.

The performance implications of cgraph_function_body_availability are important;
-fPIC costs over 10% of performance but making everything AVAIL_AVAILABLE cuts
the costs well under 1%.


My understanding of C++ One Definition Rule, in a strict sense, does not a
allow in to define two functions of the same name and different semantics in a
valid program . I also think that all DSOs eventually linked together or
dlopenned are part of the same program.  So theoretically, for C++ produced
code, we may go with AVAIL_AVAILABLE everywhere.

After some discussion on mainline list Michael Matz pointed out that one can
understand ODR in a sense that the interposed function was never part of the
program, since it is completely replaced.

On IRC we got into an agreement that we may disallow interposition for
virtuals, since replacing these seems fishy - they don't even have address well
defined and compiler can freely duplicate and/or unify them.  I think same
apply for C++ constructors and destructors now.

Does the following patch seems sane?

If there will be no opposition till end of week, I will go ahead with it.

Of course I would be happier with a stronger rule - for instance allowing
interposition only on plain functions not on methods.

Main benefits of AVAILABLE is inlining, but it also helps to avoid expensive
dynamic relocations. Making virtual functions AVAILABLE will allow us to hide
them fron interaces of shared libraries.  This should turn good part of
non-local dynamic relocations into local dynamic relocations in practice.

Bootstrapped/regtested ppc64-linux.

Honza

	* cgraph.c (cgraph_function_body_availability): Also return
	AVAIL_AVAILABLE for virtual functions, constructors and destructors.
Mike Stump - Aug. 26, 2013, 10:29 p.m.
On Aug 26, 2013, at 8:21 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> My understanding of C++ One Definition Rule, in a strict sense, does not a
> allow in to define two functions of the same name and different semantics in a
> valid program . I also think that all DSOs eventually linked together or
> dlopenned are part of the same program.  So theoretically, for C++ produced
> code, we may go with AVAIL_AVAILABLE everywhere.

So, I think you're on firm ground wrt the standard.  I think LTO naturally wants see and make use of semantics, and once you accept that as valid, which, reasonably it is, I think you get to see and understand quite a lot about the code.  Replacing anything comes with a heavy constraint that it is reasonably the same and the user will die if it is not.  When an allocation function that the LTO optimizer can see is 32 byte aligned on the returned pointer, it is reasonable to make use of this on the client side code-gen.  If the user replaces that allocation function with one that was not 32-byte aligned, bad things would happen.

I think what the optimizer can see is open ended, and any use it wants to make of what it sees is fine.  Functions, data, classes, methods, ctors, dtors, templates, everything.

Now, that the standard perspective.  From the QOI viewpoint, you will have users that want to do various things, and they should explain what they want, and we should document and vend them the good stuff.  I defer to the interposing types on what they want to do and why.  Roughly, they need a way to hide things from the optimizer.  Assembly can do this, but, we'd also want to hide (or mark as please don't peer into) any function, method or variable.  Separate compilation not using the -flto flag seems a reasonable way to do this.  I don't know if it is enough… I think those types of people will scream when they discover they want more control.

> On IRC we got into an agreement that we may disallow interposition for
> virtuals,

Hum…  I'm not one of those people that want to interpose virtuals, but as a tool vendor, it would seem like some would like to be able to interpose virtuals.  I think separate compilation with no -flto should be enough to hide enough details to make the interposition of virtuals possible.  For example, someone has a nice C++ abi that includes a virtual function for open, and one wants to interpose open to trace it to merely help debug a problem.  Doesn't strike me as wrong.

For comdat (template functions), I can't help but think having a way to mark definitions as, please don't peer into this, would be nice to have.  One can separate declaration and definition and explicitly instantiate, but doing this might be a pain.  I'd defer, again, to the interposers.

Now, when the cost of allowing interposing is high (dynamic relocs for example), disallowing interposition by default is fine, not arguing that one must always have the cost.  Just seems nice from a theoretic perspective to allow the user to say, yes, we do want to allow interposing on these virtuals.

> Does the following patch seems sane?

Easier to review the change in semantics of a sample bit of code…  I think I understand the effects of the change.

> Of course I would be happier with a stronger rule - for instance allowing
> interposition only on plain functions not on methods.

Hum, I like the orthogonal rules that apply generally.  Meaning, I don't like the notion of treating functions and methods (or virtual methods) differently.  For example, a don't peer into for a template function definition, should be used to not peer into a normal inline function.

I think I like letting the optimizer do anything, and making the user responsible for not using -flto, or ensuring enough separate compilation, or otherwise marking the boundaries that don't want to peer though…  I could also be burned alive by a linux distributor with existing code if I tried this…  :-)  Good luck.


Oh, so keep in mind, if you do something like

template <class T>
class actor {
ctor() { 
	static int i = 100;
	printf("%p\n", &i);
}
};

and don't smash all the ctors together, you (can) wind up with multiple ctor::i objects.  The standard says there is one of them.  The usual way to ensure there is only one of them is to collapse all the the ctors together into one, then, trivially, that one can only reference one of the i's that exist.  This is one way (beyond equality) to tell if there are multiples (if the function is replicated).  Trying to think if there were any other ways…  ah yes, here it is:

6 A static local variable in a member function always refers to the same
  object, whether or not the member function is inline.

  A  static local variable in an extern inline function always refers to
  the same object.  A string literal in an extern inline function is the
  same object in different translation units.

So, string literals can also be used to notice the uniqueness of the function (method).  Curious, we didn't do the same for string literals in an member function, not sure why, feels like an oversight, not on purpose.  I'd have to dig through all the papers to find when it went in, and the paper that brought it in to see.  I don't recall that we talked about making it different.
Jason Merrill - Aug. 27, 2013, 12:58 a.m.
On 08/26/2013 11:21 AM, Jan Hubicka wrote:
> Our default behaviour special case inline functions that are always
> AVAIL_AVAILABLE and, via decl_replaceable_p, also any COMDAT (that may be for
> functions since all COMDATs are also inlines, but makes difference for
> variables I think).

Not all COMDAT functions are inlines; template instantiations are also 
COMDAT.

> My understanding of C++ One Definition Rule, in a strict sense, does not a
> allow in to define two functions of the same name and different semantics in a
> valid program.

Strictly speaking, the same is true of C:

"If an identifier declared with external linkage is used in an 
expression (other than as part of the operand of a sizeof or _Alignof 
operator whose result is an integer constant), somewhere in the entire 
program there shall be exactly one external definition for the 
identifier; otherwise, there shall be no more than one."

> I also think that all DSOs eventually linked together or
> dlopenned are part of the same program.  So theoretically, for C++ produced
> code, we may go with AVAIL_AVAILABLE everywhere.

And for C code as well.

ELF symbol replacement is an extension to C/C++ semantics, which does 
not include DSOs; it seems to me that it is up to us to define what 
assumptions we want the compiler to be able to make.

> On IRC we got into an agreement that we may disallow interposition for
> virtuals, since replacing these seems fishy - they don't even have address well
> defined and compiler can freely duplicate and/or unify them.  I think same
> apply for C++ constructors and destructors now.

But it would be simple to create a DSO which replaces a virtual function 
or a constructor, by simply providing a new definition.  Since 
interposition happens outside the language, I don't think reasoning 
based on things that happen within the language makes much sense.

> Of course I would be happier with a stronger rule - for instance allowing
> interposition only on plain functions not on methods.

Existing uses of interposition apply to plain functions, but that 
doesn't mean someone might not want to use it for member functions as well.

I would be happy with an even stronger default that optimizes on the 
assumption that no interposition occurs; typically interposition is 
overriding a symbol found in a dynamic library (i.e. malloc) rather than 
a symbol defined in the same translation unit as the use.

Jason
Gabriel Dos Reis - Aug. 27, 2013, 1:16 a.m.
On Mon, Aug 26, 2013 at 7:58 PM, Jason Merrill <jason@redhat.com> wrote:
> On 08/26/2013 11:21 AM, Jan Hubicka wrote:
>>
>> Our default behaviour special case inline functions that are always
>> AVAIL_AVAILABLE and, via decl_replaceable_p, also any COMDAT (that may be
>> for
>> functions since all COMDATs are also inlines, but makes difference for
>> variables I think).
>
>
> Not all COMDAT functions are inlines; template instantiations are also
> COMDAT.
>
>
>> My understanding of C++ One Definition Rule, in a strict sense, does not a
>> allow in to define two functions of the same name and different semantics
>> in a
>> valid program.
>
>
> Strictly speaking, the same is true of C:
>
> "If an identifier declared with external linkage is used in an expression
> (other than as part of the operand of a sizeof or _Alignof operator whose
> result is an integer constant), somewhere in the entire program there shall
> be exactly one external definition for the identifier; otherwise, there
> shall be no more than one."
>
>
>> I also think that all DSOs eventually linked together or
>> dlopenned are part of the same program.  So theoretically, for C++
>> produced
>> code, we may go with AVAIL_AVAILABLE everywhere.
>
>
> And for C code as well.

C++'s ODR is much stronger than C's though.  For example, C
allows definitions for the same inline function to differ -- the
out-of-line extern definition does not need to be the same.
That is anathema in C++.

>
> ELF symbol replacement is an extension to C/C++ semantics, which does not
> include DSOs; it seems to me that it is up to us to define what assumptions
> we want the compiler to be able to make.
>
>
>> On IRC we got into an agreement that we may disallow interposition for
>> virtuals, since replacing these seems fishy - they don't even have address
>> well
>> defined and compiler can freely duplicate and/or unify them.  I think same
>> apply for C++ constructors and destructors now.
>
>
> But it would be simple to create a DSO which replaces a virtual function or
> a constructor, by simply providing a new definition.  Since interposition
> happens outside the language, I don't think reasoning based on things that
> happen within the language makes much sense.
>
>
>> Of course I would be happier with a stronger rule - for instance allowing
>> interposition only on plain functions not on methods.
>
>
> Existing uses of interposition apply to plain functions, but that doesn't
> mean someone might not want to use it for member functions as well.

Agreed.

>
> I would be happy with an even stronger default that optimizes on the
> assumption that no interposition occurs; typically interposition is
> overriding a symbol found in a dynamic library (i.e. malloc) rather than a
> symbol defined in the same translation unit as the use.
>
> Jason
>

Again, strongly agreed.

-- Gaby
Nathan Sidwell - Aug. 27, 2013, 12:53 p.m.
On 08/26/13 20:58, Jason Merrill wrote:

> I would be happy with an even stronger default that optimizes on the assumption
> that no interposition occurs; typically interposition is overriding a symbol
> found in a dynamic library (i.e. malloc) rather than a symbol defined in the
> same translation unit as the use.

I had thought that that case (overriding malloc etc) was what this patch was 
dealing with.  Perhaps I was confused.

There's nothing particularly special about ctors, dtors and virtual function 
implementations.  Their common feature, as Jan described, is that it's hard to 
take their address:
* ctors don't have names.
* dtors do have names, but you can't take their address.
* virtual function implementations can't have their instance address taken -- 
you get the ptr-to-memfn object describing the vtable slot etc.  There is (was?) 
a GNU extension to work around that, IIRC.

So that stops a programmer calling these indirectly.  However I don't really see 
the semantic difference between:

   fptr = &somefunc; ... fptr (...);
and
   somefunc (...);

So treating those 3 types of function specially because you can't use the former 
syntax with them seems odd.

nathan
Richard Guenther - Aug. 28, 2013, 10:47 a.m.
On Tue, Aug 27, 2013 at 2:53 PM, Nathan Sidwell
<nathan_sidwell@mentor.com> wrote:
> On 08/26/13 20:58, Jason Merrill wrote:
>
>> I would be happy with an even stronger default that optimizes on the
>> assumption
>> that no interposition occurs; typically interposition is overriding a
>> symbol
>> found in a dynamic library (i.e. malloc) rather than a symbol defined in
>> the
>> same translation unit as the use.
>
>
> I had thought that that case (overriding malloc etc) was what this patch was
> dealing with.  Perhaps I was confused.
>
> There's nothing particularly special about ctors, dtors and virtual function
> implementations.  Their common feature, as Jan described, is that it's hard
> to take their address:
> * ctors don't have names.
> * dtors do have names, but you can't take their address.
> * virtual function implementations can't have their instance address taken
> -- you get the ptr-to-memfn object describing the vtable slot etc.  There is
> (was?) a GNU extension to work around that, IIRC.
>
> So that stops a programmer calling these indirectly.  However I don't really
> see the semantic difference between:
>
>   fptr = &somefunc; ... fptr (...);
> and
>   somefunc (...);
>
> So treating those 3 types of function specially because you can't use the
> former syntax with them seems odd.

Yeah - this is why I'm always worried about "frontend specific" optimizations
in the middle-end.  If you need to rely stuff that's not visible in GIMPLE then
it smells.

Richard.

> nathan
>
> --
> Nathan Sidwell
Jan Hubicka - Aug. 28, 2013, 11:52 a.m.
> > I had thought that that case (overriding malloc etc) was what this patch was
> > dealing with.  Perhaps I was confused.
> >
> > There's nothing particularly special about ctors, dtors and virtual function
> > implementations.  Their common feature, as Jan described, is that it's hard
> > to take their address:
> > * ctors don't have names.
> > * dtors do have names, but you can't take their address.
> > * virtual function implementations can't have their instance address taken
> > -- you get the ptr-to-memfn object describing the vtable slot etc.  There is
> > (was?) a GNU extension to work around that, IIRC.
> >
> > So that stops a programmer calling these indirectly.  However I don't really
> > see the semantic difference between:
> >
> >   fptr = &somefunc; ... fptr (...);
> > and
> >   somefunc (...);
> >
> > So treating those 3 types of function specially because you can't use the
> > former syntax with them seems odd.

What is really important for backend is that it is not defined what happens
when you compare addresses of those functions (based on fact that youcan't take
it, as for ctors/dtors, or compare it, as for virtual functions).  If backend
also knows that they are not interposable, it knows it can freely duplicate
or unify their bodies when it seems win.

The problem whether we want to promise working interposition on these is
however ortoghonal to whether one can or can not take address of them.  I
wonder how much we want to allow by default: my impression is that we pay large
cost to allow occasional rewrite of functions like malloc. As Jason mentioned,
it is really rare to rewrite something implemented in current unit and it is
where 90% of the runtime costs come from.

Adjusting by defualt would be a good win.  We can also invent command line flag
for this and ask users to use it often.
> 
> Yeah - this is why I'm always worried about "frontend specific" optimizations
> in the middle-end.  If you need to rely stuff that's not visible in GIMPLE then
> it smells.

I am not too keen about language specific rules in gimple either (though I introduced
quite some knowledge now for C++ devirtualization - we may think how to make these
more generic if we care about Java/other languages with polymorphic calls).

I have in my TODO to discuss ading extra gimple flag "address of this symbol is
never used for equality comparsion" flag.  S far we always tested DECL_VIRTUAL
for this (that is something I learned from ICF presentation of gold) but I am
sure non-C frontends can use it more widely.  We can also make early
optimizations /propagation to rule out simple cases where address is taken but
clearly never compared.

Similarly we may want to make frontend to decide on availablity of function in
question instead of having the magic in the cgraph function.  But I guess we first
need to figure out what the rules are.

Honza
Nathan Sidwell - Sept. 4, 2013, 6:02 a.m.
On 08/28/13 12:52, Jan Hubicka wrote:

> What is really important for backend is that it is not defined what happens
> when you compare addresses of those functions (based on fact that youcan't take
> it, as for ctors/dtors, or compare it, as for virtual functions).  If backend
> also knows that they are not interposable, it knows it can freely duplicate
> or unify their bodies when it seems win.

I think this is the wrong test.  It's true that at the language level, these 
things don't have addresses.  But someone trying to use interposition is not 
working at that level -- they're working at the ELF level (for sake of picking a 
binary format).  At that level, these things do have addresses, just like any 
other (non-inlined) function.

I know of a case where users have expected interposition of member functions to 
work (ugly though that is).  I can't recall whether ctors/dtors or vfuncs were 
the functions of interest, but that's not really the point.  At the binary 
level, these are just regular functions.

nathan

Patch

Index: cgraph.c
===================================================================
--- cgraph.c	(revision 201996)
+++ cgraph.c	(working copy)
@@ -2035,6 +2052,29 @@  cgraph_function_body_availability (struc
      behaviour on replacing inline function by different body.  */
   else if (DECL_DECLARED_INLINE_P (node->symbol.decl))
     avail = AVAIL_AVAILABLE;
+  /* C++ One Deifnition Rule states that in the entire program, an object or
+     non-inline function cannot have more than one definition; if an object or
+     function is used, it must have exactly one definition. You can declare an
+     object or function that is never used, in which case you don't have to
+     provide a definition. In no event can there be more than one definition.
+
+     In the interposition done by linking or dynamic linking one function
+     is replaced by another.  Direct interpretation of C++ ODR would imply
+     that both functions must origin from the same definition and thus be
+     semantically equivalent and we should always return AVAIL_AVAILABLE.
+
+     We however opted to be more conservative and allow interposition
+     for common (noninline) functions and methods based on interpretation
+     of ODR that simply consider the replaced function definition to not
+     be part of the program.
+
+     We however do allow interposition of virtual functions on the basis that
+     their address is not well defined and compiler is free to duplicate their
+     bodies same way as it does with inline functions.  */
+  else if (DECL_VIRTUAL_P (node->symbol.decl)
+	   || DECL_CXX_CONSTRUCTOR_P (node->symbol.decl)
+	   || DECL_CXX_DESTRUCTOR_P (node->symbol.decl))
+    avail = AVAIL_AVAILABLE;
 
   /* If the function can be overwritten, return OVERWRITABLE.  Take
      care at least of two notable extensions - the COMDAT functions