diff mbox

Set stores_off_frame_dead_at_return to false if sibling call

Message ID BLU437-SMTP45F1BE4C7DB75367213AE3977F0@phx.gbl
State New
Headers show

Commit Message

John David Anglin Nov. 29, 2014, 1:05 a.m. UTC
The attached simple change fixes a long standing regression since  
4.2.  When we have stack arguments and a sibling
call, the dse pass deletes stores to the parent's stack frame when we  
have a sibling call because they are are off frame
for the current function.  As a result, the sibling call arguments are  
not passed correctly on PA.

The attached change disables this behavior.

Tested on hppa2.0w-hp-hpux11.11, hppa64-hp-hpux11.11 and hppa-unknown- 
linux-gnu.

Okay for trunk, 4.9 and 4.8?

Dave
--
John David Anglin	dave.anglin@bell.net
2014-11-28  John David Anglin  <danglin@gcc.gnu.org>

	PR target/55023
	* dse.c (scan_insn): Set stores_off_frame_dead_at_return to false if
	we have a sibling call.

Comments

Jeff Law Dec. 1, 2014, 4:57 p.m. UTC | #1
On 11/28/14 18:05, John David Anglin wrote:
> The attached simple change fixes a long standing regression since 4.2.
> When we have stack arguments and a sibling
> call, the dse pass deletes stores to the parent's stack frame when we
> have a sibling call because they are are off frame
> for the current function.  As a result, the sibling call arguments are
> not passed correctly on PA.
>
> The attached change disables this behavior.
>
> Tested on hppa2.0w-hp-hpux11.11, hppa64-hp-hpux11.11 and
> hppa-unknown-linux-gnu.
>
> Okay for trunk, 4.9 and 4.8?
What is the insn_info for the sibcall itself?  I'm particularly 
interested in the frame_read flag.

Prior to reload (ie, in DSE1) there's a bit of magic in that we do not 
set frame_read on call insns.  That may in fact be wrong and possibly 
the source of the problem.

  /* This field is only used for the processing of const functions.
      These functions cannot read memory, but they can read the stack
      because that is where they may get their parms.  We need to be
      this conservative because, like the store motion pass, we don't
      consider CALL_INSN_FUNCTION_USAGE when processing call insns.
      Moreover, we need to distinguish two cases:
      1. Before reload (register elimination), the stores related to
         outgoing arguments are stack pointer based and thus deemed
         of non-constant base in this pass.  This requires special
         handling but also means that the frame pointer based stores
         need not be killed upon encountering a const function call.
      2. After reload, the stores related to outgoing arguments can be
         either stack pointer or hard frame pointer based.  This means
         that we have no other choice than also killing all the frame
         pointer based stores upon encountering a const function call.
      This field is set after reload for const function calls.  Having
      this set is less severe than a wild read, it just means that all
      the frame related stores are killed rather than all the stores.  */
   bool frame_read;


As a test, what happens if we change:


           /* See the head comment of the frame_read field.  */
           if (reload_completed)
             insn_info->frame_read = true;

To do unconditionally set frame_read?  Or if we don't want to get that 
drastic,

if (reload_completed || SIBLING_CALL_P (insn))
   insn_info->frame_read = true;


As for the patch itself, it'd be good to include a testcase, 
particularly since the BZ has a nice simple one.

It feels like setting stores_off_frame_dead_at_return, while effective 
is the wrong solution.  But if we go in that direction, then we clearly 
need a comment where we set that for sibling calls and the comment for 
that variable will need to be updated since it says it's only used for 
stdarg functions.

Jeff

*I
Jeff
John David Anglin Dec. 1, 2014, 6:44 p.m. UTC | #2
On 12/1/2014 11:57 AM, Jeff Law wrote:
> Prior to reload (ie, in DSE1) there's a bit of magic in that we do not 
> set frame_read on call insns.  That may in fact be wrong and possibly 
> the source of the problem.
>
>  /* This field is only used for the processing of const functions.
>      These functions cannot read memory, but they can read the stack
>      because that is where they may get their parms.  We need to be
>      this conservative because, like the store motion pass, we don't
>      consider CALL_INSN_FUNCTION_USAGE when processing call insns.
>      Moreover, we need to distinguish two cases:
>      1. Before reload (register elimination), the stores related to
>         outgoing arguments are stack pointer based and thus deemed
>         of non-constant base in this pass.  This requires special
>         handling but also means that the frame pointer based stores
>         need not be killed upon encountering a const function call.
>      2. After reload, the stores related to outgoing arguments can be
>         either stack pointer or hard frame pointer based.  This means
>         that we have no other choice than also killing all the frame
>         pointer based stores upon encountering a const function call.
>      This field is set after reload for const function calls. Having
>      this set is less severe than a wild read, it just means that all
>      the frame related stores are killed rather than all the stores.  */
>   bool frame_read;
>
>
> As a test, what happens if we change:
>
>
>           /* See the head comment of the frame_read field.  */
>           if (reload_completed)
>             insn_info->frame_read = true;
>
> To do unconditionally set frame_read?  Or if we don't want to get that 
> drastic,
>
> if (reload_completed || SIBLING_CALL_P (insn))
>   insn_info->frame_read = true;
Will test but I, if I read the code correctly, setting 
insn_info->frame_read = true
results in frame related stores being killed in a constant call. This 
doesn't seem
like the right solution.

Here we have frame related calls being killed before reload because the 
argument
stores for the sibcall are off frame:

/* Set the gen set of the exit block, and also any block with no
    successors that does not have a wild read.  */

static void
dse_step3_exit_block_scan (bb_info_t bb_info)
{
   /* The gen set is all 0's for the exit block except for the
      frame_pointer_group.  */

   if (stores_off_frame_dead_at_return)
     {
       unsigned int i;
       group_info_t group;

       FOR_EACH_VEC_ELT (rtx_group_vec, i, group)
         {
           if (group->process_globally && group->frame_related)
             bitmap_ior_into (bb_info->gen, group->group_kill);
         }
     }
}

Dave
John David Anglin Dec. 4, 2014, 3:50 p.m. UTC | #3
On 12/1/2014 11:57 AM, Jeff Law wrote:
> Prior to reload (ie, in DSE1) there's a bit of magic in that we do not 
> set frame_read on call insns.  That may in fact be wrong and possibly 
> the source of the problem.
>
>  /* This field is only used for the processing of const functions.
>      These functions cannot read memory, but they can read the stack
>      because that is where they may get their parms.  We need to be
>      this conservative because, like the store motion pass, we don't
>      consider CALL_INSN_FUNCTION_USAGE when processing call insns.
>      Moreover, we need to distinguish two cases:
>      1. Before reload (register elimination), the stores related to
>         outgoing arguments are stack pointer based and thus deemed
>         of non-constant base in this pass.  This requires special
>         handling but also means that the frame pointer based stores
>         need not be killed upon encountering a const function call.
The above is wrong for sibcalls.  Sibcall arguments are relative to the 
incoming
argument pointer.  Is this always the frame pointer?

If that is the case, then it may be possible to eliminate stack based 
stores when
we have a sibcall before reload.
>  if (reload_completed || SIBLING_CALL_P (insn))
>   insn_info->frame_read = true;
This works.

Dave
Jeff Law Dec. 8, 2014, 8:01 p.m. UTC | #4
On 12/04/14 08:50, John David Anglin wrote:
> On 12/1/2014 11:57 AM, Jeff Law wrote:
>> Prior to reload (ie, in DSE1) there's a bit of magic in that we do
>> not set frame_read on call insns.  That may in fact be wrong and
>> possibly the source of the problem.
>>
>> /* This field is only used for the processing of const functions.
>> These functions cannot read memory, but they can read the stack
>> because that is where they may get their parms.  We need to be this
>> conservative because, like the store motion pass, we don't consider
>> CALL_INSN_FUNCTION_USAGE when processing call insns. Moreover, we
>> need to distinguish two cases: 1. Before reload (register
>> elimination), the stores related to outgoing arguments are stack
>> pointer based and thus deemed of non-constant base in this pass.
>> This requires special handling but also means that the frame
>> pointer based stores need not be killed upon encountering a const
>> function call.
> The above is wrong for sibcalls.  Sibcall arguments are relative to
> the incoming argument pointer.  Is this always the frame pointer?
I don't think it's always the frame pointer.  Don't we use an argument 
pointer for the PA64 runtime?  If I recall, it was the only port that 
had a non-eliminable argument pointer at the time.



> If that is the case, then it may be possible to eliminate stack based
>  stores when we have a sibcall before reload.
>> if (reload_completed || SIBLING_CALL_P (insn))
>> insn_info->frame_read = true;
> This works.
>
> Dave
>
Jeff Law Dec. 8, 2014, 8:49 p.m. UTC | #5
On 12/01/14 11:44, John David Anglin wrote:
>>
>> To do unconditionally set frame_read?  Or if we don't want to get
>> that drastic,
>>
>> if (reload_completed || SIBLING_CALL_P (insn))
>> insn_info->frame_read = true;
> Will test but I, if I read the code correctly, setting
> insn_info->frame_read = true results in frame related stores being
> killed in a constant call. This doesn't seem like the right
> solution.
But isn't killing in this context referring to GEN/KILL types of 
operations for global dataflow?   ISTM that GEN is a store operation 
while KILL is a load operation (over-simplification, but stick with me).

Thus a GEN-GEN will get optimized into a single GEN (dead store 
eliminated) while a GEN-KILL-GEN can not be optimized by DSE because of 
the KILL.

It should always be safe to have an extraneous KILL since that merely 
inhibits optimization.  While an extraneous GEN can be a disaster.

So with setting frame_read, we're going to have more KILLs in the 
dataflow sets, which results in fewer stores being eliminated.  They 
come from:


               /* If the frame is read, the frame related stores are 
killed.  */
               else if (insn_info->frame_read)
                 {
                   store_info_t store_info = i_ptr->store_rec;

                   /* Skip the clobbers.  */
                   while (!store_info->is_set)
                     store_info = store_info->next;

                   if (store_info->group_id >= 0
                       && 
rtx_group_vec[store_info->group_id]->frame_related)
                     remove_store = true;
                 }

               if (remove_store)
                 {
                   if (dump_file && (dump_flags & TDF_DETAILS))
                     dump_insn_info ("removing from active", i_ptr);

                   active_local_stores_len--;
                   if (last)
                     last->next_local_store = i_ptr->next_local_store;
                   else
                     active_local_stores = i_ptr->next_local_store;
                 }
               else
                 last = i_ptr;

               i_ptr = i_ptr->next_local_store;

AFAICT in this loop, setting FRAME_READ causes us to set REMOVE_STORE 
more often.  REMOVE_STORE in this context seems to indicate that we're 
going to remove a store from the list we are tracking for potential 
removal.  Which seems exactly right.

Here as well:

   /* If this insn reads the frame, kill all the frame related stores.  */
   if (insn_info->frame_read)
     {
       FOR_EACH_VEC_ELT (rtx_group_vec, i, group)
         if (group->process_globally && group->frame_related)
           {
             if (kill)
               bitmap_ior_into (kill, group->group_kill);
             bitmap_and_compl_into (gen, group->group_kill);
           }
     }

Which also seems like exactly what we want since when we set FRAME_READ 
we end up with a bigger KILL set and a smaller GEN set.


I think the terminology and variable names certainly makes this tougher 
to follow than it should.


>
> Here we have frame related calls being killed before reload because
> the argument stores for the sibcall are off frame:
>
> /* Set the gen set of the exit block, and also any block with no
> successors that does not have a wild read.  */
>
> static void dse_step3_exit_block_scan (bb_info_t bb_info) { /* The
> gen set is all 0's for the exit block except for the
> frame_pointer_group.  */
>
> if (stores_off_frame_dead_at_return) { unsigned int i; group_info_t
> group;
>
> FOR_EACH_VEC_ELT (rtx_group_vec, i, group) { if
> (group->process_globally && group->frame_related) bitmap_ior_into
> (bb_info->gen, group->group_kill); } } }
I see your point.  Expanding the GEN set here seems wrong.    If we go 
with setting STORES_OFF_FRAME_DEAD_AT_RETURN, then we definitely need to 
update its documentation.

I think an argument could be made that we want both changes if I've 
interpreted this code correctly.

Jeff
John David Anglin Dec. 8, 2014, 9:50 p.m. UTC | #6
On 12/8/2014 3:49 PM, Jeff Law wrote:
> I think the terminology and variable names certainly makes this 
> tougher to follow than it should.
I certainly agree. When I first looked at the code, I thought it
was completely backwards.

Dave
John David Anglin Dec. 8, 2014, 10:15 p.m. UTC | #7
On 12/8/2014 3:01 PM, Jeff Law wrote:
>> The above is wrong for sibcalls.  Sibcall arguments are relative to
>> the incoming argument pointer.  Is this always the frame pointer?
> I don't think it's always the frame pointer.  Don't we use an argument 
> pointer for the PA64 runtime?  If I recall, it was the only port that 
> had a non-eliminable argument pointer at the time.
I don't think PA64 is an argument against this as sibcalls don't
work in the PA64 runtime (they are disabled in pa.c) because the 
argument pointer isn't a
fixed register.  I guess in theory it could be fixed if it was saved and 
restored across calls.

DSE as it stands doesn't look at argument pointer based stores and I 
suspect they would
be deleted with current code.

Dave
Jeff Law Dec. 8, 2014, 10:36 p.m. UTC | #8
On 12/08/14 15:15, John David Anglin wrote:
> On 12/8/2014 3:01 PM, Jeff Law wrote:
>>> The above is wrong for sibcalls.  Sibcall arguments are relative
>>> to the incoming argument pointer.  Is this always the frame
>>> pointer?
>> I don't think it's always the frame pointer.  Don't we use an
>> argument pointer for the PA64 runtime?  If I recall, it was the
>> only port that had a non-eliminable argument pointer at the time.
> I don't think PA64 is an argument against this as sibcalls don't work
> in the PA64 runtime (they are disabled in pa.c) because the argument
> pointer isn't a fixed register.  I guess in theory it could be fixed
> if it was saved and restored across calls.
But there's nothing that says another port in the future won't have 
similar characteristics as the PA, so while the PA isn't particularly 
important, it shows there's cases where arguments won't be accessed by 
the FP.


>
> DSE as it stands doesn't look at argument pointer based stores and I
>  suspect they would be deleted with current code.
Agreed.

jeff
Jeff Law Dec. 9, 2014, 8 p.m. UTC | #9
On 12/08/14 14:50, John David Anglin wrote:
> On 12/8/2014 3:49 PM, Jeff Law wrote:
>> I think the terminology and variable names certainly makes this
>> tougher to follow than it should.
> I certainly agree. When I first looked at the code, I thought it
> was completely backwards.
Me too.

Thoughts on using both patches to ensure the we don't have extraneous 
entries in the GEN set and that we're adding more stuff to the KILL set 
for sibcalls?

Jeff
John David Anglin Dec. 9, 2014, 9:09 p.m. UTC | #10
On 12/9/2014 3:00 PM, Jeff Law wrote:
> Thoughts on using both patches to ensure the we don't have extraneous 
> entries in the GEN set and that we're adding more stuff to the KILL 
> set for sibcalls?
Maybe we should add all stores to the KILL set
for a sibling call and not worry about whether
they are frame related or not.

Dave
Jeff Law Dec. 22, 2014, 4:58 p.m. UTC | #11
On 12/09/14 14:09, John David Anglin wrote:
> On 12/9/2014 3:00 PM, Jeff Law wrote:
>> Thoughts on using both patches to ensure the we don't have extraneous
>> entries in the GEN set and that we're adding more stuff to the KILL
>> set for sibcalls?
> Maybe we should add all stores to the KILL set
> for a sibling call and not worry about whether
> they are frame related or not.
For a sibling call, that may not be a terrible idea.  We're not really 
able to track the frame/arg pointer stuff well in that case and once 
those are off the table, there's not much benefit in tracking stuff for 
sibcalls.

Jeff
diff mbox

Patch

Index: dse.c
===================================================================
--- dse.c	(revision 217974)
+++ dse.c	(working copy)
@@ -2483,6 +2483,9 @@ 
 
       insn_info->cannot_delete = true;
 
+      if (SIBLING_CALL_P (insn))
+	stores_off_frame_dead_at_return = false;
+
       /* Const functions cannot do anything bad i.e. read memory,
 	 however, they can read their parameters which may have
 	 been pushed onto the stack.