Message ID | 1459538004-13812-1-git-send-email-patrick@parcs.ath.cx |
---|---|
State | New |
Headers | show |
On Fri, Apr 1, 2016 at 3:13 PM, Patrick Palka <patrick@parcs.ath.cx> wrote: > Currently during constexpr CALL_EXPR evaluation we create a new copy of > the callee function's body for each separate call with no attempt made > at reusing the function body. So when a function ends up getting called > 10s of thousands of times durnig constexpr evaluuation, we end up > creating 10s of thousands of copies of the function. > > This patch is an attempt at reusing these copies instead of having a new > copy get created each call. It implements a per-function freelist of > the unshared body/parm/res copies that were created by copy_fn(). When > a constexpr call is finished it pushes the body/parm/res trees to the > freelist, and before a call is evaluated it tries to reuse the trees > from the freelist, falling back to copy_fn() only if the freelist is > empty. > > Consequently, instead of creating 10s of thousands of copies for 10s of > thousands of calls, the number of copies that're made corresponds to the > recursion depth of the function. This makes sense because the reason we > create copies in the first place (IIUC) is to ensure that recursive > calls to a function do not share the same > PARM_DECLs/VAR_DECLs/RESULT_DECLs. > > In order for this reuse to work properly in the presence of SAVE_EXPRs, > we have to track each SAVE_EXPR (belonging to the callee) that we > evaluate during the call and then forget its value after the call. > constexpr-vla4.C is a new test that would fail if this patch had missed > this SAVE_EXPR part. > > With this patch, the compile time of the test case in the PR gets cut > from 3.5s to 2s and memory usage gets cut from 550M to 200M. > > I bootstrapped + regtested this change on x86_64-pc-linux-gnu, and also > compile-tested it against boost with no regressions. > > gcc/cp/ChangeLog: > > PR c++/70452 > * constexpr.c (struct bpr_entry): New struct. > (struct constexpr_fundef): New field bpr_freelist. > (retrieve_constexpr_fundef): Initialize field bpr_freelist to > NULL. > (register_constexpr_fundef): Likewise. > (cxx_eval_call_expression): Check constexpr_fundef::bpr_freelist > for a reusable copy of the function's body before creating a new > copy. Keep track of the callee's SAVE_EXPRs that we evaluate > and forget their values afterwards. Push the function body we > used onto the freelist for later reuse. > > gcc/testsuite/ChangeLog: > > PR c++/70452 > * g++.dg/ext/constexpr-vla4.C: New test. > --- > gcc/cp/constexpr.c | 55 ++++++++++++++++++++++++++++--- > gcc/testsuite/g++.dg/ext/constexpr-vla4.C | 17 ++++++++++ > 2 files changed, 67 insertions(+), 5 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/ext/constexpr-vla4.C > > diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c > index ea605dc..fc9b67c 100644 > --- a/gcc/cp/constexpr.c > +++ b/gcc/cp/constexpr.c > @@ -112,11 +112,24 @@ ensure_literal_type_for_constexpr_object (tree decl) > return decl; > } > > +/* The representation of a single node in the constexpr_fundef::bpr_freelist chain. */ > + > +struct GTY((chain_next ("%h.prev"))) bpr_entry > +{ > + tree body; > + tree parms; > + tree res; > + struct bpr_entry *prev; > +}; > + > /* Representation of entries in the constexpr function definition table. */ > > struct GTY((for_user)) constexpr_fundef { > tree decl; > tree body; > + /* A chain of unused copies of this function's body awaiting reuse for > + CALL_EXPR evaluation. */ > + struct bpr_entry *bpr_freelist; > }; > > struct constexpr_fundef_hasher : ggc_ptr_hash<constexpr_fundef> > @@ -154,7 +167,7 @@ constexpr_fundef_hasher::hash (constexpr_fundef *fundef) > static constexpr_fundef * > retrieve_constexpr_fundef (tree fun) > { > - constexpr_fundef fundef = { NULL, NULL }; > + constexpr_fundef fundef = { NULL, NULL, NULL }; > if (constexpr_fundef_table == NULL) > return NULL; > > @@ -806,6 +819,7 @@ register_constexpr_fundef (tree fun, tree body) > > entry.decl = fun; > entry.body = body; > + entry.bpr_freelist = NULL; > slot = constexpr_fundef_table->find_slot (&entry, INSERT); > > gcc_assert (*slot == NULL); > @@ -1377,10 +1391,21 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, > if (!result || result == error_mark_node) > { > gcc_assert (DECL_SAVED_TREE (fun)); > - tree parms, res; > + tree body, parms, res; > > - /* Unshare the whole function body. */ > - tree body = copy_fn (fun, parms, res); > + /* Reuse or create a new unshared copy of this function's body. */ > + if (bpr_entry *bpr = new_call.fundef->bpr_freelist) > + { > + body = bpr->body; > + parms = bpr->parms; > + res = bpr->res; > + new_call.fundef->bpr_freelist = bpr->prev; > + } > + else > + { > + /* Unshare the whole function body. */ > + body = copy_fn (fun, parms, res); > + } > > /* Associate the bindings with the remapped parms. */ > tree bound = new_call.bindings; > @@ -1409,11 +1434,22 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, > else > ctx->values->put (res, NULL_TREE); > > + /* Track the SAVE_EXPRs of the callee that we evaluate so that > + we can later forget their values. */ > + constexpr_ctx ctx_with_save_exprs = *ctx; > + hash_set<tree> save_exprs; > + ctx_with_save_exprs.save_exprs = &save_exprs; Didn't like the way this comment is worded so I changed it to + /* Track the callee's evaluated SAVE_EXPRs so that we can forget + their values after the call. */
I like this approach a lot. One thing, though: On 04/01/2016 03:13 PM, Patrick Palka wrote: > +struct GTY((chain_next ("%h.prev"))) bpr_entry > +{ > + tree body; > + tree parms; > + tree res; > + struct bpr_entry *prev; > +}; > + > /* Representation of entries in the constexpr function definition table. */ > > struct GTY((for_user)) constexpr_fundef { > tree decl; > tree body; > + /* A chain of unused copies of this function's body awaiting reuse for > + CALL_EXPR evaluation. */ > + struct bpr_entry *bpr_freelist; > }; The freelist should be discarded on GC. I think the way to achieve that is to put all the freelists into a separate deletable hash table rather than hang them off the fundefs. Jason
On Fri, Apr 1, 2016 at 9:28 PM, Jason Merrill <jason@redhat.com> wrote: > I like this approach a lot. One thing, though: > > On 04/01/2016 03:13 PM, Patrick Palka wrote: >> >> +struct GTY((chain_next ("%h.prev"))) bpr_entry >> +{ >> + tree body; >> + tree parms; >> + tree res; >> + struct bpr_entry *prev; >> +}; >> + >> /* Representation of entries in the constexpr function definition table. >> */ >> >> struct GTY((for_user)) constexpr_fundef { >> tree decl; >> tree body; >> + /* A chain of unused copies of this function's body awaiting reuse for >> + CALL_EXPR evaluation. */ >> + struct bpr_entry *bpr_freelist; >> }; > > > The freelist should be discarded on GC. I think the way to achieve that is > to put all the freelists into a separate deletable hash table rather than > hang them off the fundefs. > > Jason > Would instead marking the bpr_freelist field as GTY ((skip)) have the same effect?
On Fri, Apr 1, 2016 at 9:54 PM, Patrick Palka <patrick@parcs.ath.cx> wrote: > On Fri, Apr 1, 2016 at 9:28 PM, Jason Merrill <jason@redhat.com> wrote: >> I like this approach a lot. One thing, though: >> >> On 04/01/2016 03:13 PM, Patrick Palka wrote: >>> >>> +struct GTY((chain_next ("%h.prev"))) bpr_entry >>> +{ >>> + tree body; >>> + tree parms; >>> + tree res; >>> + struct bpr_entry *prev; >>> +}; >>> + >>> /* Representation of entries in the constexpr function definition table. >>> */ >>> >>> struct GTY((for_user)) constexpr_fundef { >>> tree decl; >>> tree body; >>> + /* A chain of unused copies of this function's body awaiting reuse for >>> + CALL_EXPR evaluation. */ >>> + struct bpr_entry *bpr_freelist; >>> }; >> >> >> The freelist should be discarded on GC. I think the way to achieve that is >> to put all the freelists into a separate deletable hash table rather than >> hang them off the fundefs. >> >> Jason >> > > Would instead marking the bpr_freelist field as GTY ((skip)) have the > same effect? Looks like not according to https://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html. I should have read that first :)
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index ea605dc..fc9b67c 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -112,11 +112,24 @@ ensure_literal_type_for_constexpr_object (tree decl) return decl; } +/* The representation of a single node in the constexpr_fundef::bpr_freelist chain. */ + +struct GTY((chain_next ("%h.prev"))) bpr_entry +{ + tree body; + tree parms; + tree res; + struct bpr_entry *prev; +}; + /* Representation of entries in the constexpr function definition table. */ struct GTY((for_user)) constexpr_fundef { tree decl; tree body; + /* A chain of unused copies of this function's body awaiting reuse for + CALL_EXPR evaluation. */ + struct bpr_entry *bpr_freelist; }; struct constexpr_fundef_hasher : ggc_ptr_hash<constexpr_fundef> @@ -154,7 +167,7 @@ constexpr_fundef_hasher::hash (constexpr_fundef *fundef) static constexpr_fundef * retrieve_constexpr_fundef (tree fun) { - constexpr_fundef fundef = { NULL, NULL }; + constexpr_fundef fundef = { NULL, NULL, NULL }; if (constexpr_fundef_table == NULL) return NULL; @@ -806,6 +819,7 @@ register_constexpr_fundef (tree fun, tree body) entry.decl = fun; entry.body = body; + entry.bpr_freelist = NULL; slot = constexpr_fundef_table->find_slot (&entry, INSERT); gcc_assert (*slot == NULL); @@ -1377,10 +1391,21 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, if (!result || result == error_mark_node) { gcc_assert (DECL_SAVED_TREE (fun)); - tree parms, res; + tree body, parms, res; - /* Unshare the whole function body. */ - tree body = copy_fn (fun, parms, res); + /* Reuse or create a new unshared copy of this function's body. */ + if (bpr_entry *bpr = new_call.fundef->bpr_freelist) + { + body = bpr->body; + parms = bpr->parms; + res = bpr->res; + new_call.fundef->bpr_freelist = bpr->prev; + } + else + { + /* Unshare the whole function body. */ + body = copy_fn (fun, parms, res); + } /* Associate the bindings with the remapped parms. */ tree bound = new_call.bindings; @@ -1409,11 +1434,22 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, else ctx->values->put (res, NULL_TREE); + /* Track the SAVE_EXPRs of the callee that we evaluate so that + we can later forget their values. */ + constexpr_ctx ctx_with_save_exprs = *ctx; + hash_set<tree> save_exprs; + ctx_with_save_exprs.save_exprs = &save_exprs; + tree jump_target = NULL_TREE; - cxx_eval_constant_expression (ctx, body, + cxx_eval_constant_expression (&ctx_with_save_exprs, body, lval, non_constant_p, overflow_p, &jump_target); + /* Forget the saved values of the callee's SAVE_EXPRs. */ + for (hash_set<tree>::iterator iter = save_exprs.begin(); + iter != save_exprs.end(); ++iter) + ctx_with_save_exprs.values->remove (*iter); + if (DECL_CONSTRUCTOR_P (fun)) /* This can be null for a subobject constructor call, in which case what we care about is the initialization @@ -1444,6 +1480,15 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, ctx->values->remove (slot); for (tree parm = parms; parm; parm = TREE_CHAIN (parm)) ctx->values->remove (parm); + + /* Make the unshared function body that we evaluated available for + reuse by a later call to cxx_eval_call_expression. */ + bpr_entry *bpr = ggc_alloc<bpr_entry> (); + bpr->body = body; + bpr->parms = parms; + bpr->res = res; + bpr->prev = new_call.fundef->bpr_freelist; + new_call.fundef->bpr_freelist = bpr; } if (result == error_mark_node) diff --git a/gcc/testsuite/g++.dg/ext/constexpr-vla4.C b/gcc/testsuite/g++.dg/ext/constexpr-vla4.C new file mode 100644 index 0000000..428a8fd --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/constexpr-vla4.C @@ -0,0 +1,17 @@ +// PR c++/70452 +// { dg-do compile { target c++14 } } + +constexpr int +foo (int n, bool p) +{ + __extension__ int a [n] = { 0 }; + if (n == 3) + foo (n - 2, false); + if (n == 3) + foo(n - 1, true); + if (p) + return a[1]; + return 0; +} + +constexpr int i2 = foo (3, false); // { dg-bogus "array subscript out of bound" }