Fix PR rtl-optimization/54870

Message ID 8550101.ZHQCWdrEkP@polaris
State New

Commit Message

Eric Botcazou Oct. 14, 2012, 8:47 p.m. UTC
Hi,

This is the execution failure of gfortran.dg/array_constructor_4.f90 in 64-bit
mode on SPARC/Solaris at -O3.  The dse2 dump for the reduced testcase reads:

dse: local deletions = 0, global deletions = 1, spill deletions = 0
starting the processing of deferred insns
deleting insn with uid = 25.
ending the processing of deferred insns

but the memory location stored to:

(insn 25 27 154 2 (set (mem/c:SI (plus:DI (reg/f:DI 30 %fp)
                (const_int 2039 [0x7f7])) [6 A.1+16 S4 A64])
        (reg:SI 1 %g1 [136])) array_constructor_4.f90:4 61 {*movsi_insn}
     (nil))

is read by a subsequent call to memcpy.

It turns out that this memcpy call is generated for an aggregate assignment:

  MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17];

Note the A.1 in the store and the A.17 in the load. A.1 and A.17 are aggregate
variables sharing the same stack slot.  A.17 is correctly marked as addressable
because of the call to memcpy, but A.1 isn't since its address isn't taken,
and DSE can optimize away (since 4.7) stores if their MEM_EXPR doesn't escape.

The store is reaching the load because an intermediate store into A.17:

(insn 78 76 82 6 (set (mem/c:SI (plus:DI (reg/f:DI 30 %fp)
                (const_int 2039 [0x7f7])) [6 A.17+16 S4 A64])
        (reg:SI 1 %g1 [136])) array_constructor_4.f90:14 61 {*movsi_insn}
     (nil))

has been deleted by postreload as a no-op (because it is redundant), thus
making A.1 partially escape without marking it as addressable.
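
The interaction between the two passes can be replayed on a toy model.  This is
only a sketch under simplifying assumptions, not GCC code: there is a single
shared stack slot, the only reader is a trailing memcpy, and all the toy_*
names, the decl indices and the stored value 136 are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical decl indices standing in for the two aggregates.  */
enum { TOY_A1, TOY_A17, TOY_NDECLS };

/* One store into the single shared stack slot.  */
struct toy_store
{
  int uid;       /* insn uid, as in the dumps */
  int decl;      /* models MEM_EXPR: TOY_A1 or TOY_A17 */
  int value;     /* value written to the slot */
  bool deleted;  /* set once a pass removes the insn */
};

/* postreload model: a store that rewrites the value the slot already
   holds is a no-op and gets deleted.  */
static void
toy_postreload (struct toy_store *s, size_t n)
{
  bool known = false;
  int last = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (s[i].deleted)
        continue;
      if (known && s[i].value == last)
        s[i].deleted = true;
      else
        {
          known = true;
          last = s[i].value;
        }
    }
}

/* DSE model: the only reader is a trailing memcpy call, so a store is
   treated as dead exactly when its decl is deemed unable to escape.  */
static void
toy_dse (struct toy_store *s, size_t n, const bool *escapes)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (s[i].decl >= 0 && s[i].decl < TOY_NDECLS);
      if (!s[i].deleted && !escapes[s[i].decl])
        s[i].deleted = true;
    }
}

/* Return true and set *VALUE if some store still initializes the slot
   by the time memcpy reads it.  */
static bool
toy_slot_value (const struct toy_store *s, size_t n, int *value)
{
  bool written = false;
  for (size_t i = 0; i < n; i++)
    if (!s[i].deleted)
      {
        *value = s[i].value;
        written = true;
      }
  return written;
}
```

Running toy_postreload and then toy_dse on the stores { 25, TOY_A1 } and
{ 78, TOY_A17 } with only TOY_A17 marked as escaping deletes both of them, so
the slot is never written.  Marking TOY_A1 as escaping too keeps insn 25.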

The attached patch uses cfun->gimple_df->escaped.vars to plug the hole: when 
mark_addressable is called during RTL expansion and the decl is partitioned, 
all the variables in the partition are added to the bitmap.  Then can_escape 
is changed to additionally test cfun->gimple_df->escaped.vars.
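
As a rough standalone model of that scheme (a sketch, not the patch itself):
the escaped bitmap is reduced to a 64-bit mask, each decl carries the mask of
the uids in its partition, and the model_* names are hypothetical.  The
addressable flag stands in for what may_be_aliased tests, which is a
simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_decl
{
  unsigned uid;        /* models DECL_PT_UID; must be < 64 in this sketch */
  bool addressable;    /* models TREE_ADDRESSABLE */
  uint64_t partition;  /* mask of uids sharing the stack slot, 0 if none */
};

/* Models cfun->gimple_df->escaped.vars.  */
static uint64_t model_escaped;

/* Late mark_addressable: flag the decl itself, then add every variable
   of its partition to the escaped set.  */
static void
model_mark_addressable (struct model_decl *x)
{
  assert (x->uid < 64);
  x->addressable = true;
  model_escaped |= x->partition;
}

/* can_escape: a missing base is assumed to escape; otherwise test the
   addressable flag first and the escaped set second.  */
static bool
model_can_escape (const struct model_decl *base)
{
  if (base == NULL)
    return true;
  assert (base->uid < 64);
  if (base->addressable)
    return true;
  return ((model_escaped >> base->uid) & 1) != 0;
}
```

With two decls whose partition mask contains both uids, calling
model_mark_addressable on one of them makes model_can_escape return true for
the other as well, even though its own addressable flag stays clear.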

Tested on x86-64/Linux and SPARC64/Solaris.  OK for mainline and 4.7 branch?


2012-10-14  Eric Botcazou  <ebotcazou@adacore.com>

	PR rtl-optimization/54870
	* dse.c (can_escape): Test cfun->gimple_df->escaped.vars as well.
	* gimplify.c (mark_addressable): If this is a partition decl, add
	all the variables in the partition to cfun->gimple_df->escaped.vars.

Comments

Richard Biener Oct. 15, 2012, 9:33 a.m. UTC | #1
On Sun, Oct 14, 2012 at 10:47 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> This is the execution failure of gfortran.dg/array_constructor_4.f90 in 64-bit
> mode on SPARC/Solaris at -O3.  The dse2 dump for the reduced testcase reads:
>
> dse: local deletions = 0, global deletions = 1, spill deletions = 0
> starting the processing of deferred insns
> deleting insn with uid = 25.
> ending the processing of deferred insns
>
> but the memory location stored to:
>
> (insn 25 27 154 2 (set (mem/c:SI (plus:DI (reg/f:DI 30 %fp)
>                 (const_int 2039 [0x7f7])) [6 A.1+16 S4 A64])
>         (reg:SI 1 %g1 [136])) array_constructor_4.f90:4 61 {*movsi_insn}
>      (nil))
>
> is read by a subsequent call to memcpy.
>
> It turns out that this memcpy call is generated for an aggregate assignment:
>
>   MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17];
>
> Note the A.1 in the store and the A.17 in the load. A.1 and A.17 are aggregate
> variables sharing the same stack slot.  A.17 is correctly marked as addressable
> because of the call to memcpy, but A.1 isn't since its address isn't taken,
> and DSE can optimize away (since 4.7) stores if their MEM_EXPR doesn't escape.
>
> The store is reaching the load because an intermediate store into A.17:
>
> (insn 78 76 82 6 (set (mem/c:SI (plus:DI (reg/f:DI 30 %fp)
>                 (const_int 2039 [0x7f7])) [6 A.17+16 S4 A64])
>         (reg:SI 1 %g1 [136])) array_constructor_4.f90:14 61 {*movsi_insn}
>      (nil))
>
> has been deleted by postreload as no-op (because redundant), thus making A.1
> partially escape without marking it as addressable.
>
> The attached patch uses cfun->gimple_df->escaped.vars to plug the hole: when
> mark_addressable is called during RTL expansion and the decl is partitioned,
> all the variables in the partition are added to the bitmap.  Then can_escape
> is changed to additionally test cfun->gimple_df->escaped.vars.
>
> Tested on x86-64/Linux and SPARC64/Solaris, OK for mainline and 4.7 branch?

Hmm.  I think this points to an issue with update_alias_info_with_stack_vars
instead.  That is, this function should already have taken care of this
case where two decls have their stack slot shared.  What it seems to get
confused about is addressability, or rather can_escape is not using the
RTL alias export properly.  Instead of

static bool
can_escape (tree expr)
{
  tree base;
  if (!expr)
    return true;
  base = get_base_address (expr);
  if (DECL_P (base)
      && !may_be_aliased (base))
    return false;
  return true;
}

it needs to check decls_to_pointers[base] and then check
if any of the pointed-to decls may be aliased.

Now, that's not that easy because we don't have a
mapping from DECL UID to DECL (and the decl
isn't in the escaped solution if it is just used by
memcpy), but we could compute a bitmap of
all address-taken decls in update_alias_info_with_stack_vars
or simply treat all bases with decls_to_pointers[base] != NULL
as possibly having their address taken.
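
The second, simpler option could look roughly like the sketch below.  It is a
standalone model rather than GCC code: the partition field stands in for a
non-NULL decls_to_pointers[base] entry, may_be_aliased is reduced to a flag,
and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_decl
{
  bool may_be_aliased;    /* models may_be_aliased (base) */
  const void *partition;  /* models decls_to_pointers[base]; NULL when
                             the stack slot is not shared */
};

/* Conservative can_escape: any decl that shares its stack slot is
   treated as possibly having its address taken.  */
static bool
sketch_can_escape (const struct sketch_decl *base)
{
  if (base == NULL)
    return true;
  if (base->may_be_aliased)
    return true;
  return base->partition != NULL;
}
```

The trade-off is that no extra bitmap has to be kept up to date during
expansion, at the price of disabling the optimization for every partitioned
decl.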

Richard.

>
> 2012-10-14  Eric Botcazou  <ebotcazou@adacore.com>
>
>         PR rtl-optimization/54870
>         * dse.c (can_escape): Test cfun->gimple_df->escaped.vars as well.
>         * gimplify.c (mark_addressable): If this is a partition decl, add
>         all the variables in the partition to cfun->gimple_df->escaped.vars.
>
>
> --
> Eric Botcazou

Eric Botcazou Oct. 15, 2012, 10 a.m. UTC | #2
> Hmm.  I think this points to an issue with update_alias_info_with_stack_vars
> instead.  That is, this function should have already cared for handling
> this case where two decls have their stack slot shared.

The problem here is that mark_addressable is called _after_ that function is
run.  In other words, by the time update_alias_info_with_stack_vars is run,
there are no aliased variables in the function.

> static bool
> can_escape (tree expr)
> {
>   tree base;
>   if (!expr)
>     return true;
>   base = get_base_address (expr);
>   if (DECL_P (base)
>       && !may_be_aliased (base))
>     return false;
>   return true;
> 
> it needs to check decls_to_pointers[base] and then check
> if any of the pointed-to decls may be aliased.

That's essentially what the patch does though (except that it does it more 
efficiently), since update_alias_info_with_stack_vars correctly computes
cfun->gimple_df->escaped.vars for partitioned decls.

> Now, that's not that easy because we don't have a
> mapping from DECL UID to DECL (and the decl
> isn't in the escaped solution if it is just used by
> memcpy), but we could compute a bitmap of
> all address-taken decls in update_alias_info_with_stack_vars
> or simply treat all check decls_to_pointers[base] != NULL
> bases as possibly having their address taken.

OK, we can populate another bitmap in update_alias_info_with_stack_vars and 
update it in mark_addressable by means of decls_to_pointers and pi->pt.vars.
That seems a bit redundant with cfun->gimple_df->escaped.vars, but why not.

Richard Biener Oct. 15, 2012, 10:27 a.m. UTC | #3
On Mon, Oct 15, 2012 at 12:00 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Hmm.  I think this points to an issue with update_alias_info_with_stack_vars
>> instead.  That is, this function should have already cared for handling
>> this case where two decls have their stack slot shared.
>
> The problem here is that mark_addressable is called _after_ the function is
> run.  IOW, by the time update_alias_info_with_stack_vars is run, there are no
> aliased variables in the function.

Where is mark_addressable called?  It's wrong (and generally impossible) to
do that late.

>> static bool
>> can_escape (tree expr)
>> {
>>   tree base;
>>   if (!expr)
>>     return true;
>>   base = get_base_address (expr);
>>   if (DECL_P (base)
>>       && !may_be_aliased (base))
>>     return false;
>>   return true;
>>
>> it needs to check decls_to_pointers[base] and then check
>> if any of the pointed-to decls may be aliased.
>
> That's essentially what the patch does though (except that it does it more
> efficiently), since update_alias_info_with_stack_vars correctly computes
> cfun->gimple_df->escaped.vars for partitioned decls.

No, what it does is make sure that, if a decl is in ESCAPED, the decls that
share the same partition are also added to ESCAPED.  The issue is that
can_escape queries TREE_ADDRESSABLE (which is correct on the gimple level,
only things that have their address taken can escape); that's no longer
possible as soon as we have partitions with both addressable and
non-addressable decls.

>> Now, that's not that easy because we don't have a
>> mapping from DECL UID to DECL (and the decl
>> isn't in the escaped solution if it is just used by
>> memcpy), but we could compute a bitmap of
>> all address-taken decls in update_alias_info_with_stack_vars
>> or simply treat all check decls_to_pointers[base] != NULL
>> bases as possibly having their address taken.
>
> OK, we can populate another bitmap in update_alias_info_with_stack_vars and
> update it in mark_addressable by means of decls_to_pointers and pi->pt.vars.
> That seems a bit redundant with cfun->gimple_df->escaped.vars, but why not.

If you only have memcpy then escaped will be empty.  Fixing escaped is
not the right solution (it may work for some reason in this case though).
The RTL code has to approximate ref_maybe_used_by_call_p in a conservative
way, which it doesn't seem to do correctly (I don't remember an RTL alias.c
interface that would match this, or ref_maybe_used_by_stmt_p; maybe
we should add one?).

Thanks,
Richard.

> --
> Eric Botcazou

Eric Botcazou Oct. 15, 2012, 10:43 a.m. UTC | #4
> Where is mark_addressable called?  It's wrong (and generally impossible) to
> do that late.

In expr.c:emit_block_move_hints.  It's one of the calls added to support the
enhanced DSE last year; there are others, in calls.c for example.

> If you only have memcpy then escaped will be empty.  fixing escaped is
> not the right solution (it may work for some reason in this case though).
> The rtl code has to approximate ref_maybe_used_by_call_p in a conservative
> way which it doesn't seem to do correctly (I don't remember a RTL alias.c
> interface that would match this, or ref_maybe_used_by_stmt_p - maybe
> we should add one?)

I'm OK with the new bitmap + decls_to_pointers idea.  Keep in mind that the 
info needs to be updated after update_alias_info_with_stack_vars, because for

MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17];

you don't know until expand whether this will be a memcpy or a move by pieces,
and the info is needed for the enhanced DSE to work properly.

Richard Biener Oct. 15, 2012, 11:07 a.m. UTC | #5
On Mon, Oct 15, 2012 at 12:43 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Where is mark_addressable called?  It's wrong (and generally impossible) to
>> do that late.
>
> In expr.c:emit_block_move_hints.  It's one of the calls added to support the
> enhanced DSE last year, there are others in calls.c for example.

Ugh ... that looks like a hack to make can_escape "work".  It looks to me
like we should somehow preserve knowledge of which vars a call may
use or clobber (thus the GIMPLE call-use and call-clobber sets).

As I'm not sure how to best do that I suggest we do a more proper RTL
DSE hack by adding a 'libcall-call-escape'-set which we can add to
instead of calling mark_addressable this late.  We need to add all
partitions of a decl here, of course, and we need to query it from can_escape.

But that sounds way cleaner than abusing TREE_ADDRESSABLE for this ...

>> If you only have memcpy then escaped will be empty.  fixing escaped is
>> not the right solution (it may work for some reason in this case though).
>> The rtl code has to approximate ref_maybe_used_by_call_p in a conservative
>> way which it doesn't seem to do correctly (I don't remember a RTL alias.c
>> interface that would match this, or ref_maybe_used_by_stmt_p - maybe
>> we should add one?)
>
> I'm OK with the new bitmap + decls_to_pointers idea.  Keep in mind that the
> info needs to be updated after update_alias_info_with_stack_vars, because for
>
> MEM[(c_char * {ref-all})&i] = MEM[(c_char * {ref-all})&A.17];
>
> you don't know until expand whether this will be a memcpy or a move by pieces,
> and the info is needed for the enhanced DSE to work properly.

Well, it just means that the enhanced DSE is fragile :/

Richard.

> --
> Eric Botcazou

Patch

Index: dse.c
===================================================================
--- dse.c	(revision 192353)
+++ dse.c	(working copy)
@@ -990,6 +990,7 @@  delete_dead_store_insn (insn_info_t insn
 }
 
 /* Check if EXPR can possibly escape the current function scope.  */
+
 static bool
 can_escape (tree expr)
 {
@@ -998,7 +999,10 @@  can_escape (tree expr)
     return true;
   base = get_base_address (expr);
   if (DECL_P (base)
-      && !may_be_aliased (base))
+      && !may_be_aliased (base)
+      && !(cfun->gimple_df->escaped.vars
+	   && bitmap_bit_p (cfun->gimple_df->escaped.vars,
+			    DECL_PT_UID (base))))
     return false;
   return true;
 }
Index: gimplify.c
===================================================================
--- gimplify.c	(revision 192353)
+++ gimplify.c	(working copy)
@@ -116,6 +116,26 @@  mark_addressable (tree x)
       && TREE_CODE (x) != RESULT_DECL)
     return;
   TREE_ADDRESSABLE (x) = 1;
+
+  /* If this is a partitioned decl, we need to mark all the variables in the
+     partition as escaped.  This is needed because a store into one of them
+     can be replaced with a store into another, and this may not change the
+     outcome of the escape analysis for DSE to work properly.  */
+  if (TREE_CODE (x) == VAR_DECL
+      && !TREE_STATIC (x)
+      && cfun->gimple_df != NULL
+      && cfun->gimple_df->decls_to_pointers != NULL)
+    {
+      void *namep
+	= pointer_map_contains (cfun->gimple_df->decls_to_pointers, x);
+      if (namep)
+	{
+	  struct ptr_info_def *pi = get_ptr_info (*(tree *)namep);
+	  if (cfun->gimple_df->escaped.vars == NULL)
+	    cfun->gimple_df->escaped.vars = BITMAP_GGC_ALLOC ();
+	  bitmap_ior_into (cfun->gimple_df->escaped.vars, pi->pt.vars);
+	}
+    }
 }
 
 /* Return a hash value for a formal temporary table entry.  */