Message ID | af29444d-118a-4818-26ca-bbe468a587dd@suse.cz |
---|---|
State | New |
Headers | show |
Series | Check DECL_CONTEXT of new/delete operators. | expand |
On Mon, Mar 30, 2020 at 10:41 AM Martin Liška <mliska@suse.cz> wrote: > > Hi. > > The patch ensures that a deleted new/delete pair has a same context. > That will fix the issue presented in the PR. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. I think this will break the DCE with LTO since we strip quite some DECL_CONTEXT in free-lang-data. Honza - do you remember why we strip DECL_CONTEXT for methods which will most likely keep their containing record types live through their this arguments? Quoting: /* We need to keep field decls associated with their trees. Otherwise tree merging may merge some fileds and keep others disjoint wich in turn will not do well with TREE_CHAIN pointers linking them. Also do not drop containing types for virtual methods and tables because these are needed by devirtualization. C++ destructors are special because C++ frontends sometimes produces virtual destructor as an alias of non-virtual destructor. In devirutalization code we always walk through aliases and we need context to be preserved too. See PR89335 */ if (TREE_CODE (decl) != FIELD_DECL && ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL) || (!DECL_VIRTUAL_P (decl) && (TREE_CODE (decl) != FUNCTION_DECL || !DECL_CXX_DESTRUCTOR_P (decl))))) DECL_CONTEXT (decl) = fld_decl_context (DECL_CONTEXT (decl)); and fld_decl_context stripping up to the next non-TYPE context (unless VLA). Richard. > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > 2020-03-30 Martin Liska <mliska@suse.cz> > > PR c++/94314 > * tree-ssa-dce.c (propagate_necessity): Verify that > DECL_CONTEXT of a new/delete operators do match. > > gcc/testsuite/ChangeLog: > > 2020-03-30 Martin Liska <mliska@suse.cz> > > PR c++/94314 > * g++.dg/pr94314.C: New test. > --- > gcc/testsuite/g++.dg/pr94314.C | 84 ++++++++++++++++++++++++++++++++++ > gcc/tree-ssa-dce.c | 31 +++++++++---- > 2 files changed, 105 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/pr94314.C > >
On Mon, 30 Mar 2020, Martin Liška wrote: > The patch ensures that a deleted new/delete pair has a same context. > That will fix the issue presented in the PR. DECL_CONTEXT seems good for that example (assuming it is still available in the middle-end), but shouldn't we also check if both are array versions? I can build a similar example by implementing the global operator new[] in terms of operator new and operator delete[] in terms of operator delete. Checking if both are aligned versions could be nice as well. I don't know what others can be relevant, I think we are allowed to mix throw and nothrow, or sized and non-sized versions. For DECL_CONTEXT, if we use new on a derived class and delete on a base class, I guess it is sensible not to optimize unless devirt / inlining manage to show a matched pair of operators, so your patch looks good here. I understand this feature is getting rather painful, sorry...
> On Mon, Mar 30, 2020 at 10:41 AM Martin Liška <mliska@suse.cz> wrote: > > > > Hi. > > > > The patch ensures that a deleted new/delete pair has a same context. > > That will fix the issue presented in the PR. > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > I think this will break the DCE with LTO since we strip quite some > DECL_CONTEXT in free-lang-data. > > Honza - do you remember why we strip DECL_CONTEXT for methods > which will most likely keep their containing record types live through > their this > arguments? Quoting: > > /* We need to keep field decls associated with their trees. Otherwise tree > merging may merge some fileds and keep others disjoint wich in turn will > not do well with TREE_CHAIN pointers linking them. > > Also do not drop containing types for virtual methods and tables because > these are needed by devirtualization. > C++ destructors are special because C++ frontends sometimes produces > virtual destructor as an alias of non-virtual destructor. In > devirutalization code we always walk through aliases and we need > context to be preserved too. See PR89335 */ > if (TREE_CODE (decl) != FIELD_DECL > && ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL) > || (!DECL_VIRTUAL_P (decl) > && (TREE_CODE (decl) != FUNCTION_DECL > || !DECL_CXX_DESTRUCTOR_P (decl))))) > DECL_CONTEXT (decl) = fld_decl_context (DECL_CONTEXT (decl)); > > and fld_decl_context stripping up to the next non-TYPE context (unless VLA). Well, I basically went through all pointers and tried to get rid of as many of them as possible. CONTEXT pointers do increase size of SCCs that increases chance they will not get merged and also processing time of merging algorithm. I guess if we need to stream more contexts we could do that (and check the effect on merging and average SCC size) Honza > > Richard. > > > Ready to be installed? > > Thanks, > > Martin > > > > gcc/ChangeLog: > > > > 2020-03-30 Martin Liska <mliska@suse.cz> > > > > PR c++/94314 > > * tree-ssa-dce.c (propagate_necessity): Verify that > > DECL_CONTEXT of a new/delete operators do match. > > > > gcc/testsuite/ChangeLog: > > > > 2020-03-30 Martin Liska <mliska@suse.cz> > > > > PR c++/94314 > > * g++.dg/pr94314.C: New test. > > --- > > gcc/testsuite/g++.dg/pr94314.C | 84 ++++++++++++++++++++++++++++++++++ > > gcc/tree-ssa-dce.c | 31 +++++++++---- > > 2 files changed, 105 insertions(+), 10 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/pr94314.C > > > >
On 3/31/20 2:29 PM, Jan Hubicka wrote: > Well, I basically went through all pointers and tried to get rid of as > many of them as possible. CONTEXT pointers do increase size of SCCs > that increases chance they will not get merged and also processing time > of merging algorithm. I guess if we need to stream more contexts we > could do that (and check the effect on merging and average SCC size) Ok, do we want to stream contexts just for the new/delete operators? Can you please prepare a streaming patch? Thank you, Martin
> On 3/31/20 2:29 PM, Jan Hubicka wrote: > > Well, I basically went through all pointers and tried to get rid of as > > many of them as possible. CONTEXT pointers do increase size of SCCs > > that increases chance they will not get merged and also processing time > > of merging algorithm. I guess if we need to stream more contexts we > > could do that (and check the effect on merging and average SCC size) > > Ok, do we want to stream contexts just for the new/delete operators? > Can you please prepare a streaming patch? Hi, I am still not convinced that context is useful here. It took me a while to understand what the code does and why it fails, but here is a testcase. It fails for me with your patch and -O2 --param early-inlining-insns=100 The invalid transform is to remove pair base:new and B:delete B:new gets inlined and we get count out of sync. Honza #include <stdio.h> volatile int idx; struct base { __attribute__((malloc,noinline)) static void* operator new(unsigned long sz) { return ::operator new(sz); } __attribute__((malloc,noinline)) static void operator delete(void* ptr) { --count[idx]; ::operator delete(ptr); } volatile static int count[2]; }; volatile int base::count[2] = {0,0}; struct B:base { static void* operator new(unsigned long sz) { ++count[idx]; return base::operator new(sz); } }; volatile int c=1; int main(){ for (int i; i<c;i++) { idx=0; delete new B; if (B::count[0] != 0) __builtin_abort (); } return 0; }
Hi, and this is the streaming fix diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c index e4077b58890..dd9645723c1 100644 --- a/gcc/tree-ssa-dce.c +++ b/gcc/tree-ssa-dce.c @@ -646,6 +647,19 @@ degenerate_phi_p (gimple *phi) return true; } +/* Return true if C1 and C2 are matching contexts (both translation unit decls + or both types. */ + +bool +matching_contexts_p (tree c1, tree c2) +{ + if (TREE_CODE (c1) == TRANSLATION_UNIT_DECL) + return TREE_CODE (c2) == TRANSLATION_UNIT_DECL; + if (TREE_CODE (c2) == TRANSLATION_UNIT_DECL) + return TREE_CODE (c1) == TRANSLATION_UNIT_DECL; + return types_same_for_odr (c1, c2); +} + /* Propagate necessity using the operands of necessary statements. Process the uses on each statement in the worklist, and add all feeding statements which contribute to the calculation of this @@ -824,16 +838,28 @@ propagate_necessity (bool aggressive) || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC)) || DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee))) { - /* Delete operators can have alignment and (or) size as next - arguments. When being a SSA_NAME, they must be marked - as necessary. */ - if (is_delete_operator && gimple_call_num_args (stmt) >= 2) - for (unsigned i = 1; i < gimple_call_num_args (stmt); i++) - { - tree arg = gimple_call_arg (stmt, i); - if (TREE_CODE (arg) == SSA_NAME) - mark_operand_necessary (arg); - } + if (is_delete_operator) + { + /* Verify that new and delete operators have the same + context. */ + if (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee) + && !matching_contexts_p + (DECL_CONTEXT (def_callee), + DECL_CONTEXT (gimple_call_fndecl (stmt)))) + mark_operand_necessary (gimple_call_arg (stmt, 0)); + + /* Delete operators can have alignment and (or) size + as next arguments. When being a SSA_NAME, they + must be marked as necessary. */ + if (gimple_call_num_args (stmt) >= 2) + for (unsigned i = 1; i < gimple_call_num_args (stmt); + i++) + { + tree arg = gimple_call_arg (stmt, i); + if (TREE_CODE (arg) == SSA_NAME) + mark_operand_necessary (arg); + } + } continue; } diff --git a/gcc/tree.c b/gcc/tree.c index 63dc6730b2b..cd608a9c878 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -5879,6 +5879,9 @@ free_lang_data_in_decl (tree decl, class free_lang_data_d *fld) merging may merge some fileds and keep others disjoint wich in turn will not do well with TREE_CHAIN pointers linking them. + tree-ssa-dce is using context to match new and delete operators (which may + be static functions). + Also do not drop containing types for virtual methods and tables because these are needed by devirtualization. C++ destructors are special because C++ frontends sometimes produces @@ -5886,6 +5889,9 @@ free_lang_data_in_decl (tree decl, class free_lang_data_d *fld) devirutalization code we always walk through aliases and we need context to be preserved too. See PR89335 */ if (TREE_CODE (decl) != FIELD_DECL + && (TREE_CODE (decl) != FUNCTION_DECL + || (!DECL_IS_REPLACEABLE_OPERATOR_NEW_P (decl) + && !DECL_IS_OPERATOR_DELETE_P (decl))) && ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL) || (!DECL_VIRTUAL_P (decl) && (TREE_CODE (decl) != FUNCTION_DECL
Hi, thinking a bit of the problem, I guess we could match in addition to DECL_CONTEXT the whole inline stack of both statements and see if there are inlined new/delete operators and if so if they are always in matching pairs. The inline stack is available as for (tree block = gimple_block (call); block && TREE_CODE (block) == BLOCK; block = BLOCK_SUPERCONTEXT (block)) { tree fn = block_ultimate_origin (block); if (fn != NULL && TREE_CODE (fn) == FUNCTION_DECL) do the checking htere. } But I do not understand what C++ promises here and in what conditions the new/delete pair can be removed. Honza
On 4/3/20 5:26 PM, Jan Hubicka wrote: >> On 3/31/20 2:29 PM, Jan Hubicka wrote: >>> Well, I basically went through all pointers and tried to get rid of as >>> many of them as possible. CONTEXT pointers do increase size of SCCs >>> that increases chance they will not get merged and also processing time >>> of merging algorithm. I guess if we need to stream more contexts we >>> could do that (and check the effect on merging and average SCC size) >> >> Ok, do we want to stream contexts just for the new/delete operators? >> Can you please prepare a streaming patch? > Hi, > I am still not convinced that context is useful here. > It took me a while to understand what the code does and why it fails, > but here is a testcase. > It fails for me with your patch and -O2 --param early-inlining-insns=100 > > The invalid transform is to remove pair > base:new and B:delete > B:new gets inlined and we get count out of sync. > > Honza > > #include <stdio.h> > volatile int idx; > struct base > { > __attribute__((malloc,noinline)) > static void* operator new(unsigned long sz) > { > return ::operator new(sz); > } > > __attribute__((malloc,noinline)) > static void operator delete(void* ptr) > { > --count[idx]; > ::operator delete(ptr); > } > volatile static int count[2]; > }; > volatile int base::count[2] = {0,0}; > struct B:base > { > static void* operator new(unsigned long sz) > { > ++count[idx]; > return base::operator new(sz); > } > > }; > > > volatile int c=1; > > int main(){ > for (int i; i<c;i++) > { > idx=0; > delete new B; > if (B::count[0] != 0) > __builtin_abort (); > } > > > return 0; > } > May I please ping Jason, Nathan and Jonathan who can help us here? Thanks, Martin
On Sat, Apr 4, 2020 at 1:53 PM Jan Hubicka <hubicka@ucw.cz> wrote: > > Hi, > thinking a bit of the problem, I guess we could match in addition to > DECL_CONTEXT the whole inline stack of both statements and see if there > are inlined new/delete operators and if so if they are always in > matching pairs. > > The inline stack is available as > for (tree block = gimple_block (call); block && TREE_CODE (block) == BLOCK; block = BLOCK_SUPERCONTEXT (block)) > { > tree fn = block_ultimate_origin (block); > if (fn != NULL && TREE_CODE (fn) == FUNCTION_DECL) > do the checking htere. > } > > But I do not understand what C++ promises here and in what conditions > the new/delete pair can be removed. But if the inline stack matches in pairs then the ultimate new/delete call should also match, no? When there's a mismatch in inlining we can't DCE since we can't remove the extra inlined stmts. Your failing testcase suggests we never can remove new/delete pairs though unless the DECL_CONTEXT is 'final'. Also the user could have chosen to "inline" the side-effect of the new operation manually but not the delete one, so operator delete() { count-- } ptr = new A; count++; delete ptr; is it valid to elide the new/delete pair here? Richard. > Honza
On 4/6/20 4:34 AM, Martin Liška wrote: > > May I please ping Jason, Nathan and Jonathan who can help us here? On IRC Martin clarified the question as essentially 'how do you pair up operator new and operator delete calls?' (so you may delete them if the object is not used). I am not sure you're permitted to remove those calls in general. All I can find is [expr.new]/12 'An implementation is allowed to omit a call to a replaceable global allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression.' But, I suspect the optimization is safe, in that either no one counts objects by their allocation, or if they do, they don't actually care that the std-conforming number of allocations happen. The both operator new and operator delete are looked up in the same manner. The std does not require a 'matching pair' be found. but it is extremely poor form for a class to declare exactly one of operator {new,delete}. The following is well formed: struct PoorForm { void *operator new (size_t s) {count++; return ::operator new (s)}; static unsigned count; }; Have you considered throwing ctors? struct Obj { Obj (); // might throw }; 'obj = new Obj (); ... delete obj;' sequence expand to something like ... // obj = new Obj (); void *p = ::operator new (sizeof (Obj)); try { Obj::ctor(p); } catch (...) // cleanup code { ::operator delete (p); // #1 throw; } obj = (Obj*)p; .... user code // delete obj; Obj::dtor (obj); ::operator delete (obj); // #2 calls 1 & 2 matchup to the operator new call Notice that para I quoted allows one to coalesce allocations using the global operator new/deletes. The rules are pretty much as you can guess -- one lifetime must be entirely within the other. If inner one's ctor throws, the exception path must destroy the outer. does that help? nathan
On Mon, Apr 6, 2020 at 5:27 AM Richard Biener via Gcc-patches < gcc-patches@gcc.gnu.org> wrote: > On Sat, Apr 4, 2020 at 1:53 PM Jan Hubicka <hubicka@ucw.cz> wrote: > > > > Hi, > > thinking a bit of the problem, I guess we could match in addition to > > DECL_CONTEXT the whole inline stack of both statements and see if there > > are inlined new/delete operators and if so if they are always in > > matching pairs. > > > > The inline stack is available as > > for (tree block = gimple_block (call); block && TREE_CODE (block) == > BLOCK; block = BLOCK_SUPERCONTEXT (block)) > > { > > tree fn = block_ultimate_origin (block); > > if (fn != NULL && TREE_CODE (fn) == FUNCTION_DECL) > > do the checking htere. > > } > > > > But I do not understand what C++ promises here and in what conditions > > the new/delete pair can be removed. > > But if the inline stack matches in pairs then the ultimate new/delete > call should also > match, no? When there's a mismatch in inlining we can't DCE since we > can't remove > the extra inlined stmts. > > Your failing testcase suggests we never can remove new/delete pairs though > unless the DECL_CONTEXT is 'final'. Also the user could have chosen to > "inline" the side-effect of the new operation manually but not the > delete one, so > > operator delete() { count-- } > > ptr = new A; > count++; > delete ptr; > > is it valid to elide the new/delete pair here? > The C++ constraints are (deliberately, I think) vague. As Nathan quotes, it just says that a call to a replaceable operator new can be omitted, and that if it is, the matching delete-expression should not call operator delete. This example seems to demonstrate that we should also only consider the replaceable delete operators, not any operator delete. Jason
On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell <nathan@acm.org> wrote: > > On 4/6/20 4:34 AM, Martin Liška wrote: > > > > > May I please ping Jason, Nathan and Jonathan who can help us here? > > On IRC Martin clarified the question as essentially 'how do you pair up > operator new and operator delete calls?' (so you may delete them if the > object is not used). > > I am not sure you're permitted to remove those calls in general. All I > can find is [expr.new]/12 > 'An implementation is allowed to omit a call to a replaceable global > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage > is instead provided by the implementation or provided by extending the > allocation of another new-expression.' At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith summarised the rules as "new-expressions, like std::allocator, may obtain storage by calling 'operator new', but it's unspecified how often it's called and with what arguments." So if our optimisation is removing the calls to base::operator new and base::operator delete, but not the B::operator new call, then it seems to be working at the wrong level. It should be eliding any calls to operator new and operator delete at the point of the new-expression and delete-expression, not leaving one call to operator new present and then removing another one (which leaves the call "partially removed").
On Tue, Apr 7, 2020 at 10:26 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell <nathan@acm.org> wrote: > > > > On 4/6/20 4:34 AM, Martin Liška wrote: > > > > > > > > May I please ping Jason, Nathan and Jonathan who can help us here? > > > > On IRC Martin clarified the question as essentially 'how do you pair up > > operator new and operator delete calls?' (so you may delete them if the > > object is not used). > > > > I am not sure you're permitted to remove those calls in general. All I > > can find is [expr.new]/12 > > 'An implementation is allowed to omit a call to a replaceable global > > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage > > is instead provided by the implementation or provided by extending the > > allocation of another new-expression.' > > At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith > summarised the rules as "new-expressions, like std::allocator, may > obtain storage by calling 'operator new', but it's unspecified how > often it's called and with what arguments." > > So if our optimisation is removing the calls to base::operator new and > base::operator delete, but not the B::operator new call, then it seems > to be working at the wrong level. It should be eliding any calls to > operator new and operator delete at the point of the new-expression > and delete-expression, not leaving one call to operator new present > and then removing another one (which leaves the call "partially > removed"). Well, the question is how to identify "operator new and operator delete at the point of the new-expression and delete-expression". Currently we're just matching up "is this a new/delete operator" and the dataflow of the pointer. In the PR it was suggested the actual called methods should have the same DECL_CONTEXT. Honza suggested the context should have the "same" ODR type (or be global). You make it sound it's much harder and the parser needs to build the relation? You also suggest the "validness" is only OK in the context of std::allocator and based on the unspecifiedness of the number of allocations from the standard library. That would further suggest that we need to mark the allocation/deallocation points somehow and _not_ base the optimization on the called new/delete "function" (maybe with an exception of the global ::new and ::delete). Richard.
> > Well, the question is how to identify "operator new and operator delete at the > point of the new-expression and delete-expression". Currently we're > just matching up "is this a new/delete operator" and the dataflow of the > pointer. In the PR it was suggested the actual called methods should have > the same DECL_CONTEXT. Honza suggested the context should have the > "same" ODR type (or be global). I was just arguing that comparing pointers to types does not make much sence in LTO where types can get unshared :) Since the classes have ODR name at least this problem can be solved by using ODR name compare. However the testcase I sent abuses the fact that if you inherit the class you can rewrite only new operator. In the inerited class DECL_CONTEXT of delete will still point to the oriignal class. This means that you can mix new/delete pair from the original and inherited class. Honza > > You make it sound it's much harder and the parser needs to build the > relation? You also suggest the "validness" is only OK in the context > of std::allocator and based on the unspecifiedness of the number of > allocations from the standard library. That would further suggest that > we need to mark the allocation/deallocation points somehow and _not_ > base the optimization on the called new/delete "function" (maybe with > an exception of the global ::new and ::delete). > > Richard.
On Tue, Apr 7, 2020 at 11:49 AM Jan Hubicka <hubicka@ucw.cz> wrote: > > > > > Well, the question is how to identify "operator new and operator delete at the > > point of the new-expression and delete-expression". Currently we're > > just matching up "is this a new/delete operator" and the dataflow of the > > pointer. In the PR it was suggested the actual called methods should have > > the same DECL_CONTEXT. Honza suggested the context should have the > > "same" ODR type (or be global). > I was just arguing that comparing pointers to types does not make much > sence in LTO where types can get unshared :) > Since the classes have ODR name at least this problem can be solved by > using ODR name compare. Sure. > However the testcase I sent abuses the fact that if you inherit the > class you can rewrite only new operator. In the inerited class > DECL_CONTEXT of delete will still point to the oriignal class. This > means that you can mix new/delete pair from the original and inherited > class. Yeah, but we're searching for a correctness fix not for an optimality one ;) It sounds matching up the pairs in the middle-end is impossible and thus the FE has to do this match-up (or refrain from marking new/delete when a matchup according to to be defined methods is not going to be enough). And maybe that tracking has to be done on a per call level rather than on a called function level. Richard. > Honza > > > > You make it sound it's much harder and the parser needs to build the > > relation? You also suggest the "validness" is only OK in the context > > of std::allocator and based on the unspecifiedness of the number of > > allocations from the standard library. That would further suggest that > > we need to mark the allocation/deallocation points somehow and _not_ > > base the optimization on the called new/delete "function" (maybe with > > an exception of the global ::new and ::delete). > > > > Richard.
On 4/7/20 12:22 PM, Richard Biener wrote: > On Tue, Apr 7, 2020 at 11:49 AM Jan Hubicka <hubicka@ucw.cz> wrote: >> >>> >>> Well, the question is how to identify "operator new and operator delete at the >>> point of the new-expression and delete-expression". Currently we're >>> just matching up "is this a new/delete operator" and the dataflow of the >>> pointer. In the PR it was suggested the actual called methods should have >>> the same DECL_CONTEXT. Honza suggested the context should have the >>> "same" ODR type (or be global). >> I was just arguing that comparing pointers to types does not make much >> sence in LTO where types can get unshared :) >> Since the classes have ODR name at least this problem can be solved by >> using ODR name compare. > > Sure. > >> However the testcase I sent abuses the fact that if you inherit the >> class you can rewrite only new operator. In the inerited class >> DECL_CONTEXT of delete will still point to the oriignal class. This >> means that you can mix new/delete pair from the original and inherited >> class. > > Yeah, but we're searching for a correctness fix not for an optimality one ;) > > It sounds matching up the pairs in the middle-end is impossible and thus > the FE has to do this match-up (or refrain from marking new/delete when > a matchup according to to be defined methods is not going to be enough). > And maybe that tracking has to be done on a per call level rather than > on a called function level. Based on what was said here, I see a proper matching implementation quite expensive from implementation point of point. Moreover, the optimization itself has quite low impact and so I suggest to only remove matching replaceable operators (1-12) from [1], which are the top-level ones. I'll come up with DECL_IS_REPLACEABLE_OPERATOR_DELETE_P and fix #define DECL_IS_REPLACEABLE_OPERATOR_NEW_P(NODE) \ (DECL_IS_OPERATOR_NEW_P (NODE) && DECL_IS_MALLOC (NODE)) which is not correct. It also matches struct A { __attribute__((noinline)) static void* operator new(unsigned long sz) { ++count; return ::operator new(sz); } where DECL_IS_MALLOC is discovered by local IPA passes. Hope it's viable solution? Thanks, Martin [1] https://en.cppreference.com/w/cpp/memory/new/operator_delete > > Richard. > >> Honza >>> >>> You make it sound it's much harder and the parser needs to build the >>> relation? You also suggest the "validness" is only OK in the context >>> of std::allocator and based on the unspecifiedness of the number of >>> allocations from the standard library. That would further suggest that >>> we need to mark the allocation/deallocation points somehow and _not_ >>> base the optimization on the called new/delete "function" (maybe with >>> an exception of the global ::new and ::delete). >>> >>> Richard.
On 4/6/20 2:45 PM, Nathan Sidwell wrote: > On 4/6/20 4:34 AM, Martin Liška wrote: > >> >> May I please ping Jason, Nathan and Jonathan who can help us here? > > On IRC Martin clarified the question as essentially 'how do you pair up operator new and operator delete calls?' (so you may delete them if the object is not used). > > I am not sure you're permitted to remove those calls in general. All I can find is [expr.new]/12 > 'An implementation is allowed to omit a call to a replaceable global allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression.' > > But, I suspect the optimization is safe, in that either no one counts objects by their allocation, or if they do, they don't actually care that the std-conforming number of allocations happen. > > The both operator new and operator delete are looked up in the same manner. The std does not require a 'matching pair' be found. but it is extremely poor form for a class to declare exactly one of operator {new,delete}. Hi. Thank you for the information. > > The following is well formed: > > struct PoorForm { > void *operator new (size_t s) {count++; return ::operator new (s)}; > static unsigned count; > }; > > Have you considered throwing ctors? > > struct Obj { > Obj (); // might throw > }; > > 'obj = new Obj (); ... delete obj;' sequence expand to something like ... > > // obj = new Obj (); > void *p = ::operator new (sizeof (Obj)); > try { > Obj::ctor(p); > } > catch (...) // cleanup code > { > ::operator delete (p); // #1 > throw; > } > > obj = (Obj*)p; > > .... user code > > // delete obj; > Obj::dtor (obj); > ::operator delete (obj); // #2 > > calls 1 & 2 matchup to the operator new call Looking at the throwing ctors: $ cat new-delete2.C #include <stdio.h> struct A { __attribute__((always_inline)) A(int x) { if (x == 123) throw x; } }; int main(int argc, char **argv) { A *a = new A (argc); delete a; return 0; } $ g++-9 new-delete2.C -O2 -c -fdump-tree-optimized=/dev/stdout ... <bb 2> [local count: 1073741824]: _3 = operator new (1); if (argc_4(D) == 123) goto <bb 3>; [0.00%] else goto <bb 4>; [100.00%] <bb 3> [count: 0]: _8 = __cxa_allocate_exception (4); MEM[(int *)_8] = 123; __cxa_throw (_8, &_ZTIi, 0B); <bb 4> [local count: 1073741824]: operator delete (_3, 1); return 0; ... As seen cddce can still optimize out _3 = operator new (1); ... operator delete (_3, 1); Martin > > Notice that para I quoted allows one to coalesce allocations using the global operator new/deletes. The rules are pretty much as you can guess -- one lifetime must be entirely within the other. If inner one's ctor throws, the exception path must destroy the outer. > > does that help? > > nathan >
On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: > The both operator new and operator delete are looked up in the same > manner. The std does not require a 'matching pair' be found. but it is > extremely poor form for a class to declare exactly one of operator > {new,delete}. There are unfortunately several such example in the standard! I wonder how much benefit we will really get from trying to make this optimisation too general. Just eliding (or coalescing) implicit calls to ::operator new(size_t) and ::operator delete(void*, size_t) (and the [] and align_val_t forms of those) probably covers 99% of real cases.
On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: > > The both operator new and operator delete are looked up in the same > > manner. The std does not require a 'matching pair' be found. but it is > > extremely poor form for a class to declare exactly one of operator > > {new,delete}. > > There are unfortunately several such example in the standard! > > I wonder how much benefit we will really get from trying to make this > optimisation too general. > > Just eliding (or coalescing) implicit calls to ::operator new(size_t) > and ::operator delete(void*, size_t) (and the [] and align_val_t forms > of those) probably covers 99% of real cases. IIRC the only reason to have the optimization was to fully elide STL containers when possible. That brings in allocators and thus non ::new/::delete allocations. Which then suggests that our standard library implementation could annotate allocation points somehow. Richard.
On Tue, 7 Apr 2020 at 10:29, Richard Biener <richard.guenther@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 10:26 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell <nathan@acm.org> wrote: > > > > > > On 4/6/20 4:34 AM, Martin Liška wrote: > > > > > > > > > > > May I please ping Jason, Nathan and Jonathan who can help us here? > > > > > > On IRC Martin clarified the question as essentially 'how do you pair up > > > operator new and operator delete calls?' (so you may delete them if the > > > object is not used). > > > > > > I am not sure you're permitted to remove those calls in general. All I > > > can find is [expr.new]/12 > > > 'An implementation is allowed to omit a call to a replaceable global > > > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage > > > is instead provided by the implementation or provided by extending the > > > allocation of another new-expression.' > > > > At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith > > summarised the rules as "new-expressions, like std::allocator, may > > obtain storage by calling 'operator new', but it's unspecified how > > often it's called and with what arguments." > > > > So if our optimisation is removing the calls to base::operator new and > > base::operator delete, but not the B::operator new call, then it seems > > to be working at the wrong level. It should be eliding any calls to > > operator new and operator delete at the point of the new-expression > > and delete-expression, not leaving one call to operator new present > > and then removing another one (which leaves the call "partially > > removed"). > > Well, the question is how to identify "operator new and operator delete at the > point of the new-expression and delete-expression". Currently we're > just matching up "is this a new/delete operator" and the dataflow of the > pointer. In the PR it was suggested the actual called methods should have > the same DECL_CONTEXT. Honza suggested the context should have the > "same" ODR type (or be global). > > You make it sound it's much harder and the parser needs to build the > relation? You also suggest the "validness" is only OK in the context > of std::allocator and based on the unspecifiedness of the number of > allocations from the standard library. I don't think Richard's summary or my paraphrasing intends to say it only applies to std::allocator.
On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: > > > The both operator new and operator delete are looked up in the same > > > manner. The std does not require a 'matching pair' be found. but it is > > > extremely poor form for a class to declare exactly one of operator > > > {new,delete}. > > > > There are unfortunately several such example in the standard! > > > > I wonder how much benefit we will really get from trying to make this > > optimisation too general. > > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t) > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms > > of those) probably covers 99% of real cases. > > IIRC the only reason to have the optimization was to fully elide > STL containers when possible. That brings in allocators and > thus non ::new/::delete allocations. But the vast majority of containers are used with std::allocator, and we control that. Wouldn't it be simpler to add __builtin_operator_new and __builtin_operator_delete like clang has, then make std::allocator use those, and limit the optimizations to uses of those built-ins?
On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote: > > > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: > > > > The both operator new and operator delete are looked up in the same > > > > manner. The std does not require a 'matching pair' be found. but it is > > > > extremely poor form for a class to declare exactly one of operator > > > > {new,delete}. > > > > > > There are unfortunately several such example in the standard! > > > > > > I wonder how much benefit we will really get from trying to make this > > > optimisation too general. > > > > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t) > > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms > > > of those) probably covers 99% of real cases. > > > > IIRC the only reason to have the optimization was to fully elide > > STL containers when possible. That brings in allocators and > > thus non ::new/::delete allocations. > > But the vast majority of containers are used with std::allocator, and > we control that. > > Wouldn't it be simpler to add __builtin_operator_new and > __builtin_operator_delete like clang has, then make std::allocator use > those, and limit the optimizations to uses of those built-ins? Sure, that's a viable implementation strathegy. Another might be void delete (void *) __attribute__((matching_new(somewhere::new))); and thus allow the user to relate a new/delete pair (here the FE would do lookup for 'new' and record for example the mangled assembler name). That is, we usually try to design a mechanism that's more broadly usable. But yes, we match malloc/free so matching ::new/::delete by aliasing them to __builtin_operator_new and __builtin_operator_delete is fair enough. Not easily usable by other languages with custom allocation though (fortran allocate/deallocate anyone? that's currently inlined to expose underlying malloc/free calls for similar reasons) Richard.
On 4/7/20 7:29 AM, Jonathan Wakely wrote: > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: >> The both operator new and operator delete are looked up in the same >> manner. The std does not require a 'matching pair' be found. but it is >> extremely poor form for a class to declare exactly one of operator >> {new,delete}. > > There are unfortunately several such example in the standard! Pah! I also realized one could write: struct Derived : Base { void *operator new (size_t t) { ... } using Base::operator delete; }; which is not such poor code, but the FE will generate calls that completely lose the usingness of the operator delete. As RichardB comments, I think the FE needs to mark sets of new/delete calls for the middle end to safely optimize. but ... > > I wonder how much benefit we will really get from trying to make this > optimisation too general. > > Just eliding (or coalescing) implicit calls to ::operator new(size_t) > and ::operator delete(void*, size_t) (and the [] and align_val_t forms > of those) probably covers 99% of real cases. I agree. It's certainly a simpler problem and the major component of any win we'll get. nathan
On Tue, 7 Apr 2020 at 12:57, Richard Biener <richard.guenther@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote: > > > > > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > > > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: > > > > > The both operator new and operator delete are looked up in the same > > > > > manner. The std does not require a 'matching pair' be found. but it is > > > > > extremely poor form for a class to declare exactly one of operator > > > > > {new,delete}. > > > > > > > > There are unfortunately several such example in the standard! > > > > > > > > I wonder how much benefit we will really get from trying to make this > > > > optimisation too general. > > > > > > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t) > > > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms > > > > of those) probably covers 99% of real cases. > > > > > > IIRC the only reason to have the optimization was to fully elide > > > STL containers when possible. That brings in allocators and > > > thus non ::new/::delete allocations. > > > > But the vast majority of containers are used with std::allocator, and > > we control that. > > > > Wouldn't it be simpler to add __builtin_operator_new and > > __builtin_operator_delete like clang has, then make std::allocator use > > those, and limit the optimizations to uses of those built-ins? > > Sure, that's a viable implementation strathegy. Another might be > > void delete (void *) __attribute__((matching_new(somewhere::new))); > > and thus allow the user to relate a new/delete pair (here the FE would > do lookup for 'new' and record for example the mangled assembler name). Does that attribute tell us anything useful? Given a pointer obtained from new and passed to delete, can't we assume they are a matching pair? If not, the behaviour would be undefined anyway. > That is, we usually try to design a mechanism that's more broadly usable. > But yes, we match malloc/free so matching ::new/::delete by aliasing them > to __builtin_operator_new and __builtin_operator_delete is fair enough. Would it make sense to annotate the actual calls to the allocation/deallocation functions, instead of the declarations of those functions? According to Richard Smith, we don't want to optimise away 'operator delete(operator new(1), 1)' because that's an explicit call, and the user might have replaced those functions and be relying on the side effects. But we can optimise away 'delete new char' and we can optimise away std::allocator<char>().deallocate(std::allocator<char>().allocate(1), 1). So what matters is not whether the functions being called match up (they have to anyway) or which functions are being called. What matters is whether they are called implicitly (by a new-expression or by std::allocator). So when the compiler expands 'new T' into a call to operator new followed by constructing a T (plus exception handling) the call to operator new would be marked as optimisable by the FE, and likewise when the compiler expands 'delete p' into a destructor followed by calling operator delete, the call to operator delete would be marked as optimisable. If a pointer is allocated by a call marked optimisable and deallocated by a call marked optimisable, it can be optimised away (or coalesced with another optimisation). Then for std::allocator we just want to be able to mark the explicit calls to operator new and operator delete as "optimisable", as the FE does for the implicit calls it generates. So if we want a general purpose utility, it would be an attribute to mark /calls/ as optimisable, not functions. Adding __builtin_operator_new would just be a different syntax for "call operator new but mark it optimisable". N.B. I am not a compiler dev and might be talking nonsense :-)
On Tue, Apr 7, 2020 at 5:17 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > On Tue, 7 Apr 2020 at 12:57, Richard Biener <richard.guenther@gmail.com> wrote: > > > > On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > > > On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote: > > > > > > > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote: > > > > > > > > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote: > > > > > > The both operator new and operator delete are looked up in the same > > > > > > manner. The std does not require a 'matching pair' be found. but it is > > > > > > extremely poor form for a class to declare exactly one of operator > > > > > > {new,delete}. > > > > > > > > > > There are unfortunately several such example in the standard! > > > > > > > > > > I wonder how much benefit we will really get from trying to make this > > > > > optimisation too general. > > > > > > > > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t) > > > > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms > > > > > of those) probably covers 99% of real cases. > > > > > > > > IIRC the only reason to have the optimization was to fully elide > > > > STL containers when possible. That brings in allocators and > > > > thus non ::new/::delete allocations. > > > > > > But the vast majority of containers are used with std::allocator, and > > > we control that. > > > > > > Wouldn't it be simpler to add __builtin_operator_new and > > > __builtin_operator_delete like clang has, then make std::allocator use > > > those, and limit the optimizations to uses of those built-ins? > > > > Sure, that's a viable implementation strathegy. Another might be > > > > void delete (void *) __attribute__((matching_new(somewhere::new))); > > > > and thus allow the user to relate a new/delete pair (here the FE would > > do lookup for 'new' and record for example the mangled assembler name). > > Does that attribute tell us anything useful? > > Given a pointer obtained from new and passed to delete, can't we > assume they are a matching pair? If not, the behaviour would be > undefined anyway. Further matching of new/delete came up in the context of inlining where we might not be able to elide side-effects of new/delete "appropriately". That's actually the case in the referenced PR. > > That is, we usually try to design a mechanism that's more broadly usable. > > But yes, we match malloc/free so matching ::new/::delete by aliasing them > > to __builtin_operator_new and __builtin_operator_delete is fair enough. > > Would it make sense to annotate the actual calls to the > allocation/deallocation functions, instead of the declarations of > those functions? Sure - I think that's ultimatively the way to go. > According to Richard Smith, we don't want to optimise away 'operator > delete(operator new(1), 1)' because that's an explicit call, and the > user might have replaced those functions and be relying on the side > effects. But we can optimise away 'delete new char' and we can > optimise away std::allocator<char>().deallocate(std::allocator<char>().allocate(1), > 1). So what matters is not whether the functions being called match up > (they have to anyway) or which functions are being called. What > matters is whether they are called implicitly (by a new-expression or > by std::allocator). OK, I see. So for the inline case we'd for example have operator new() { return ::new ...; } and delete new T; where the generated and marked operator new is inlined we'd no longer elide the pair because the operator new implementation had an explicit call to ::new and this is what the optimization sees now. Only when the operator new implementation uses a new expression again we'd see it as candidate pair and then run into the inline issue again (the issue of the new operator implementation incrementing some counter and the delete one decrementing it, us inlining one of both and then eliding the pair, only eliding either the increment or the decrement and thus retaining half of the overall side-effect). > So when the compiler expands 'new T' into a call to operator new > followed by constructing a T (plus exception handling) the call to > operator new would be marked as optimisable by the FE, and likewise > when the compiler expands 'delete p' into a destructor followed by > calling operator delete, the call to operator delete would be marked > as optimisable. If a pointer is allocated by a call marked optimisable > and deallocated by a call marked optimisable, it can be optimised away > (or coalesced with another optimisation). > > Then for std::allocator we just want to be able to mark the explicit > calls to operator new and operator delete as "optimisable", as the FE > does for the implicit calls it generates. So if we want a general > purpose utility, it would be an attribute to mark /calls/ as > optimisable, not functions. > > Adding __builtin_operator_new would just be a different syntax for > "call operator new but mark it optimisable". Ah, so __builtin_operator_new isn't a function but an alternate new expression syntax? Richard. > N.B. I am not a compiler dev and might be talking nonsense :-)
On Wed, 8 Apr 2020 at 08:35, Richard Biener wrote: > Ah, so __builtin_operator_new isn't a function but an alternate > new expression syntax? No, not a new-expression, because a new-expression calls operator new to obtain storage *and* begins the lifetime of one or more objects in that storage. __builtin_operator_new is just the first part, i.e. the operator new(...) call. But because explicit calls to operator new(...) are not supposed to be optimized, __builtin_operator_new is a way of calling operator new(...) that tells the compiler "this isn't an explicit call, you can optimise it". So a new-expression can use it (because that needs to call operator new(...), but the call should be optimisable) and std::allocator<T>::allocate(n) can use it (because that call is also optimisable).
diff --git a/gcc/testsuite/g++.dg/pr94314.C b/gcc/testsuite/g++.dg/pr94314.C new file mode 100644 index 00000000000..76cd9d8d2a4 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr94314.C @@ -0,0 +1,84 @@ +/* PR c++/94314. */ +/* { dg-options "-O2 -fdump-tree-cddce-details" } */ +/* { dg-additional-options "-fdelete-null-pointer-checks" } */ + +#include <stdio.h> + +struct A +{ + __attribute__((malloc,noinline)) + static void* operator new(unsigned long sz) + { + ++count; + return ::operator new(sz); + } + + static void operator delete(void* ptr) + { + --count; + ::operator delete(ptr); + } + + static int count; +}; + +int A::count = 0; + +struct B +{ + __attribute__((malloc,noinline)) + static void* operator new(unsigned long sz) + { + ++count; + return ::operator new(sz); + } + + __attribute__((noinline)) + static void operator delete(void* ptr) + { + --count; + ::operator delete(ptr); + } + + static int count; +}; + +int B::count = 0; + +struct C +{ + static void* operator new(unsigned long sz) + { + ++count; + return ::operator new(sz); + } + + static void operator delete(void* ptr) + { + --count; + ::operator delete(ptr); + } + + static int count; +}; + +int C::count = 0; + +int main(){ + delete new A; + if (A::count != 0) + __builtin_abort (); + + delete new B; + if (B::count != 0) + __builtin_abort (); + + delete new C; + if (C::count != 0) + __builtin_abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "Deleting : operator delete" 1 "cddce1"} } */ +/* { dg-final { scan-tree-dump-times "Deleting : B::operator delete" 1 "cddce1"} } */ diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c index e4077b58890..d86234ead23 100644 --- a/gcc/tree-ssa-dce.c +++ b/gcc/tree-ssa-dce.c @@ -824,16 +824,27 @@ propagate_necessity (bool aggressive) || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC)) || DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee))) { - /* Delete operators can have alignment and (or) size as next - arguments. When being a SSA_NAME, they must be marked - as necessary. */ - if (is_delete_operator && gimple_call_num_args (stmt) >= 2) - for (unsigned i = 1; i < gimple_call_num_args (stmt); i++) - { - tree arg = gimple_call_arg (stmt, i); - if (TREE_CODE (arg) == SSA_NAME) - mark_operand_necessary (arg); - } + if (is_delete_operator) + { + /* Verify that new and delete operators have the same + context. */ + if (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee) + && (DECL_CONTEXT (def_callee) + != DECL_CONTEXT (gimple_call_fndecl (stmt)))) + mark_operand_necessary (gimple_call_arg (stmt, 0)); + + /* Delete operators can have alignment and (or) size + as next arguments. When being a SSA_NAME, they + must be marked as necessary. */ + if (gimple_call_num_args (stmt) >= 2) + for (unsigned i = 1; i < gimple_call_num_args (stmt); + i++) + { + tree arg = gimple_call_arg (stmt, i); + if (TREE_CODE (arg) == SSA_NAME) + mark_operand_necessary (arg); + } + } continue; }