Message ID | 8550101.ZHQCWdrEkP@polaris |
---|---|
State | New |
Headers | show |
On Sun, Oct 14, 2012 at 10:47 PM, Eric Botcazou <ebotcazou@adacore.com> wrote: > Hi, > > This is the execution failure of gfortran.dg/array_constructor_4.f90 in 64-bit > mode on SPARC/Solaris at -O3. The dse2 dump for the reduced testcase reads: > > dse: local deletions = 0, global deletions = 1, spill deletions = 0 > starting the processing of deferred insns > deleting insn with uid = 25. > ending the processing of deferred insns > > but the memory location stored to: > > (insn 25 27 154 2 (set (mem/c:SI (plus:DI (reg/f:DI 30 %fp) > (const_int 2039 [0x7f7])) [6 A.1+16 S4 A64]) > (reg:SI 1 %g1 [136])) array_constructor_4.f90:4 61 {*movsi_insn} > (nil)) > > is read by a subsequent call to memcpy. > > It turns out that this memcpy call is generated for an aggregate assignment: > > MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17]; > > Note the A.1 in the store and the A.17 in the load. A.1 and A.17 are aggregate > variables sharing the same stack slot. A.17 is correcty marked as addressable > because of the call to memcpy, but A.1 isn't since its address isn't taken, > and DSE can optimize away (since 4.7) stores if their MEM_EXPR doesn't escape. > > The store is reaching the load because an intermediate store into A.17: > > (insn 78 76 82 6 (set (mem/c:SI (plus:DI (reg/f:DI 30 %fp) > (const_int 2039 [0x7f7])) [6 A.17+16 S4 A64]) > (reg:SI 1 %g1 [136])) array_constructor_4.f90:14 61 {*movsi_insn} > (nil)) > > has been deleted by postreload as no-op (because redundant), thus making A.1 > partially escape without marking it as addressable. > > The attached patch uses cfun->gimple_df->escaped.vars to plug the hole: when > mark_addressable is called during RTL expansion and the decl is partitioned, > all the variables in the partition are added to the bitmap. Then can_escape > is changed to additionally test cfun->gimple_df->escaped.vars. > > Tested on x86-64/Linux and SPARC64/Solaris, OK for mainline and 4.7 branch? Hmm. I think this points to an issue with update_alias_info_with_stack_vars instead. That is, this function should have already cared for handling this case where two decls have their stack slot shared. What it seems to get confused about is addressability, or rather can_escape is not using the RTL alias export properly. Instead of static bool can_escape (tree expr) { tree base; if (!expr) return true; base = get_base_address (expr); if (DECL_P (base) && !may_be_aliased (base)) return false; return true; it needs to check decls_to_pointers[base] and then check if any of the pointed-to decls may be aliased. Now, that's not that easy because we don't have a mapping from DECL UID to DECL (and the decl isn't in the escaped solution if it is just used by memcpy), but we could compute a bitmap of all address-taken decls in update_alias_info_with_stack_vars or simply treat all check decls_to_pointers[base] != NULL bases as possibly having their address taken. Richard. > > 2012-10-14 Eric Botcazou <ebotcazou@adacore.com> > > PR rtl-optimization/54870 > * dse.c (can_escape): Test cfun->gimple_df->escaped.vars as well. > * gimplify.c (mark_addressable): If this is a partition decl, add > all the variables in the partition to cfun->gimple_df->escaped.vars. > > > -- > Eric Botcazou
> Hmm. I think this points to an issue with update_alias_info_with_stack_vars > instead. That is, this function should have already cared for handling > this case where two decls have their stack slot shared. The problem here is that mark_addressable is called _after_ the function is run. IOW, by the time update_alias_info_with_stack_vars is run, there are no aliased variables in the function. > static bool > can_escape (tree expr) > { > tree base; > if (!expr) > return true; > base = get_base_address (expr); > if (DECL_P (base) > && !may_be_aliased (base)) > return false; > return true; > > it needs to check decls_to_pointers[base] and then check > if any of the pointed-to decls may be aliased. That's essentially what the patch does though (except that it does it more efficiently), since update_alias_info_with_stack_vars correctly computes cfun->gimple_df->escaped.vars for partitioned decls. > Now, that's not that easy because we don't have a > mapping from DECL UID to DECL (and the decl > isn't in the escaped solution if it is just used by > memcpy), but we could compute a bitmap of > all address-taken decls in update_alias_info_with_stack_vars > or simply treat all check decls_to_pointers[base] != NULL > bases as possibly having their address taken. OK, we can populate another bitmap in update_alias_info_with_stack_vars and update it in mark_addressable by means of decls_to_pointers and pi->pt.vars. That seems a bit redundant with cfun->gimple_df->escaped.vars, but why not.
On Mon, Oct 15, 2012 at 12:00 PM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> Hmm. I think this points to an issue with update_alias_info_with_stack_vars >> instead. That is, this function should have already cared for handling >> this case where two decls have their stack slot shared. > > The problem here is that mark_addressable is called _after_ the function is > run. IOW, by the time update_alias_info_with_stack_vars is run, there are no > aliased variables in the function. Where is mark_addressable called? It's wrong (and generally impossible) to do that late. >> static bool >> can_escape (tree expr) >> { >> tree base; >> if (!expr) >> return true; >> base = get_base_address (expr); >> if (DECL_P (base) >> && !may_be_aliased (base)) >> return false; >> return true; >> >> it needs to check decls_to_pointers[base] and then check >> if any of the pointed-to decls may be aliased. > > That's essentially what the patch does though (except that it does it more > efficiently), since update_alias_info_with_stack_vars correctly computes > cfun->gimple_df->escaped.vars for partitioned decls. No, what it does is if a decl is in ESCAPED make sure to add decls that share the same partition also to ESCAPED. The issue is that can_escape queries TREE_ADDRESSABLE (which is correct on the gimple level, only things that have their address taken can escape) - that's no longer possible as soon as we have partitions with both addressable and non-addressable decls. >> Now, that's not that easy because we don't have a >> mapping from DECL UID to DECL (and the decl >> isn't in the escaped solution if it is just used by >> memcpy), but we could compute a bitmap of >> all address-taken decls in update_alias_info_with_stack_vars >> or simply treat all check decls_to_pointers[base] != NULL >> bases as possibly having their address taken. > > OK, we can populate another bitmap in update_alias_info_with_stack_vars and > update it in mark_addressable by means of decls_to_pointers and pi->pt.vars. > That seems a bit redundant with cfun->gimple_df->escaped.vars, but why not. If you only have memcpy then escaped will be empty. fixing escaped is not the right solution (it may work for some reason in this case though). The rtl code has to approximate ref_maybe_used_by_call_p in a conservative way which it doesn't seem to do correctly (I don't remember a RTL alias.c interface that would match this, or ref_maybe_used_by_stmt_p - maybe we should add one?) Thanks, Richard. > -- > Eric Botcazou
> Where is mark_addressable called? It's wrong (and generally impossible) to > do that late. In expr.c:emit_block_move_hints. It's one of the calls added to support the enhanced DSE last year, there are others in calls.c for example. > If you only have memcpy then escaped will be empty. fixing escaped is > not the right solution (it may work for some reason in this case though). > The rtl code has to approximate ref_maybe_used_by_call_p in a conservative > way which it doesn't seem to do correctly (I don't remember a RTL alias.c > interface that would match this, or ref_maybe_used_by_stmt_p - maybe > we should add one?) I'm OK with the new bitmap + decls_to_pointers idea. Keep in mind that the info needs to be updated after update_alias_info_with_stack_vars, because for MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17]; you don't know until expand whether this will a memcpy or a move by pieces and the info is needed for the enhanced DSE to work properly.
On Mon, Oct 15, 2012 at 12:43 PM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> Where is mark_addressable called? It's wrong (and generally impossible) to >> do that late. > > In expr.c:emit_block_move_hints. It's one of the calls added to support the > enhanced DSE last year, there are others in calls.c for example. Ugh ... that looks like a hack to make can_escape "work". It looks to me that we should somehow preserve knowledge on what vars a call may use or clobber (thus the GIMPLE call-use and call-clobber sets). As I'm not sure how to best do that I suggest we do a more proper RTL DSE hack by adding a 'libcall-call-escape'-set which we can add to instead of calling mark_addressable this late. We need to add all partitions of a decl here, of course, and we need to query it from can_escape. But that sounds way cleaner than abusing TREE_ADDRESSABLE for this ... >> If you only have memcpy then escaped will be empty. fixing escaped is >> not the right solution (it may work for some reason in this case though). >> The rtl code has to approximate ref_maybe_used_by_call_p in a conservative >> way which it doesn't seem to do correctly (I don't remember a RTL alias.c >> interface that would match this, or ref_maybe_used_by_stmt_p - maybe >> we should add one?) > > I'm OK with the new bitmap + decls_to_pointers idea. Keep in mind that the > info needs to be updated after update_alias_info_with_stack_vars, because for > > MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17]; > > you don't know until expand whether this will a memcpy or a move by pieces and > the info is needed for the enhanced DSE to work properly. Well, it just means that the enhanced DSE is fragile :/ Richard. > -- > Eric Botcazou
Index: dse.c =================================================================== --- dse.c (revision 192353) +++ dse.c (working copy) @@ -990,6 +990,7 @@ delete_dead_store_insn (insn_info_t insn } /* Check if EXPR can possibly escape the current function scope. */ + static bool can_escape (tree expr) { @@ -998,7 +999,10 @@ can_escape (tree expr) return true; base = get_base_address (expr); if (DECL_P (base) - && !may_be_aliased (base)) + && !may_be_aliased (base) + && !(cfun->gimple_df->escaped.vars + && bitmap_bit_p (cfun->gimple_df->escaped.vars, + DECL_PT_UID (base)))) return false; return true; } Index: gimplify.c =================================================================== --- gimplify.c (revision 192353) +++ gimplify.c (working copy) @@ -116,6 +116,26 @@ mark_addressable (tree x) && TREE_CODE (x) != RESULT_DECL) return; TREE_ADDRESSABLE (x) = 1; + + /* If this is a partitioned decl, we need to mark all the variables in the + partition as escaped. This is needed because a store into one of them + can be replaced with a store into another, and this may not change the + outcome of the escape analysis for DSE to work properly. */ + if (TREE_CODE (x) == VAR_DECL + && !TREE_STATIC (x) + && cfun->gimple_df != NULL + && cfun->gimple_df->decls_to_pointers != NULL) + { + void *namep + = pointer_map_contains (cfun->gimple_df->decls_to_pointers, x); + if (namep) + { + struct ptr_info_def *pi = get_ptr_info (*(tree *)namep); + if (cfun->gimple_df->escaped.vars == NULL) + cfun->gimple_df->escaped.vars = BITMAP_GGC_ALLOC (); + bitmap_ior_into (cfun->gimple_df->escaped.vars, pi->pt.vars); + } + } } /* Return a hash value for a formal temporary table entry. */