Message ID | BLU437-SMTP45F1BE4C7DB75367213AE3977F0@phx.gbl |
---|---|
State | New |
Headers | show |
On 11/28/14 18:05, John David Anglin wrote: > The attached simple change fixes a long standing regression since 4.2. > When we have stack arguments and a sibling > call, the dse pass deletes stores to the parent's stack frame when we > have a sibling call because they are are off frame > for the current function. As a result, the sibling call arguments are > not passed correctly on PA. > > The attached change disables this behavior. > > Tested on hppa2.0w-hp-hpux11.11, hppa64-hp-hpux11.11 and > hppa-unknown-linux-gnu. > > Okay for trunk, 4.9 and 4.8? What is the insn_info for the sibcall itself? I'm particularly interested in the frame_read flag. Prior to reload (ie, in DSE1) there's a bit of magic in that we do not set frame_read on call insns. That may in fact be wrong and possibly the source of the problem. /* This field is only used for the processing of const functions. These functions cannot read memory, but they can read the stack because that is where they may get their parms. We need to be this conservative because, like the store motion pass, we don't consider CALL_INSN_FUNCTION_USAGE when processing call insns. Moreover, we need to distinguish two cases: 1. Before reload (register elimination), the stores related to outgoing arguments are stack pointer based and thus deemed of non-constant base in this pass. This requires special handling but also means that the frame pointer based stores need not be killed upon encountering a const function call. 2. After reload, the stores related to outgoing arguments can be either stack pointer or hard frame pointer based. This means that we have no other choice than also killing all the frame pointer based stores upon encountering a const function call. This field is set after reload for const function calls. Having this set is less severe than a wild read, it just means that all the frame related stores are killed rather than all the stores. */ bool frame_read; As a test, what happens if we change: /* See the head comment of the frame_read field. */ if (reload_completed) insn_info->frame_read = true; To do unconditionally set frame_read? Or if we don't want to get that drastic, if (reload_completed || SIBLING_CALL_P (insn)) insn_info->frame_read = true; As for the patch itself, it'd be good to include a testcase, particularly since the BZ has a nice simple one. It feels like setting stores_off_frame_dead_at_return, while effective is the wrong solution. But if we go in that direction, then we clearly need a comment where we set that for sibling calls and the comment for that variable will need to be updated since it says it's only used for stdarg functions. Jeff *I Jeff
On 12/1/2014 11:57 AM, Jeff Law wrote: > Prior to reload (ie, in DSE1) there's a bit of magic in that we do not > set frame_read on call insns. That may in fact be wrong and possibly > the source of the problem. > > /* This field is only used for the processing of const functions. > These functions cannot read memory, but they can read the stack > because that is where they may get their parms. We need to be > this conservative because, like the store motion pass, we don't > consider CALL_INSN_FUNCTION_USAGE when processing call insns. > Moreover, we need to distinguish two cases: > 1. Before reload (register elimination), the stores related to > outgoing arguments are stack pointer based and thus deemed > of non-constant base in this pass. This requires special > handling but also means that the frame pointer based stores > need not be killed upon encountering a const function call. > 2. After reload, the stores related to outgoing arguments can be > either stack pointer or hard frame pointer based. This means > that we have no other choice than also killing all the frame > pointer based stores upon encountering a const function call. > This field is set after reload for const function calls. Having > this set is less severe than a wild read, it just means that all > the frame related stores are killed rather than all the stores. */ > bool frame_read; > > > As a test, what happens if we change: > > > /* See the head comment of the frame_read field. */ > if (reload_completed) > insn_info->frame_read = true; > > To do unconditionally set frame_read? Or if we don't want to get that > drastic, > > if (reload_completed || SIBLING_CALL_P (insn)) > insn_info->frame_read = true; Will test but I, if I read the code correctly, setting insn_info->frame_read = true results in frame related stores being killed in a constant call. This doesn't seem like the right solution. Here we have frame related calls being killed before reload because the argument stores for the sibcall are off frame: /* Set the gen set of the exit block, and also any block with no successors that does not have a wild read. */ static void dse_step3_exit_block_scan (bb_info_t bb_info) { /* The gen set is all 0's for the exit block except for the frame_pointer_group. */ if (stores_off_frame_dead_at_return) { unsigned int i; group_info_t group; FOR_EACH_VEC_ELT (rtx_group_vec, i, group) { if (group->process_globally && group->frame_related) bitmap_ior_into (bb_info->gen, group->group_kill); } } } Dave
On 12/1/2014 11:57 AM, Jeff Law wrote: > Prior to reload (ie, in DSE1) there's a bit of magic in that we do not > set frame_read on call insns. That may in fact be wrong and possibly > the source of the problem. > > /* This field is only used for the processing of const functions. > These functions cannot read memory, but they can read the stack > because that is where they may get their parms. We need to be > this conservative because, like the store motion pass, we don't > consider CALL_INSN_FUNCTION_USAGE when processing call insns. > Moreover, we need to distinguish two cases: > 1. Before reload (register elimination), the stores related to > outgoing arguments are stack pointer based and thus deemed > of non-constant base in this pass. This requires special > handling but also means that the frame pointer based stores > need not be killed upon encountering a const function call. The above is wrong for sibcalls. Sibcall arguments are relative to the incoming argument pointer. Is this always the frame pointer? If that is the case, then it may be possible to eliminate stack based stores when we have a sibcall before reload. > if (reload_completed || SIBLING_CALL_P (insn)) > insn_info->frame_read = true; This works. Dave
On 12/04/14 08:50, John David Anglin wrote: > On 12/1/2014 11:57 AM, Jeff Law wrote: >> Prior to reload (ie, in DSE1) there's a bit of magic in that we do >> not set frame_read on call insns. That may in fact be wrong and >> possibly the source of the problem. >> >> /* This field is only used for the processing of const functions. >> These functions cannot read memory, but they can read the stack >> because that is where they may get their parms. We need to be this >> conservative because, like the store motion pass, we don't consider >> CALL_INSN_FUNCTION_USAGE when processing call insns. Moreover, we >> need to distinguish two cases: 1. Before reload (register >> elimination), the stores related to outgoing arguments are stack >> pointer based and thus deemed of non-constant base in this pass. >> This requires special handling but also means that the frame >> pointer based stores need not be killed upon encountering a const >> function call. > The above is wrong for sibcalls. Sibcall arguments are relative to > the incoming argument pointer. Is this always the frame pointer? I don't think it's always the frame pointer. Don't we use an argument pointer for the PA64 runtime? If I recall, it was the only port that had a non-eliminable argument pointer at the time. > If that is the case, then it may be possible to eliminate stack based > stores when we have a sibcall before reload. >> if (reload_completed || SIBLING_CALL_P (insn)) >> insn_info->frame_read = true; > This works. > > Dave >
On 12/01/14 11:44, John David Anglin wrote: >> >> To do unconditionally set frame_read? Or if we don't want to get >> that drastic, >> >> if (reload_completed || SIBLING_CALL_P (insn)) >> insn_info->frame_read = true; > Will test but I, if I read the code correctly, setting > insn_info->frame_read = true results in frame related stores being > killed in a constant call. This doesn't seem like the right > solution. But isn't killing in this context referring to GEN/KILL types of operations for global dataflow? ISTM that GEN is a store operation while KILL is a load operation (over-simplification, but stick with me). Thus a GEN-GEN will get optimized into a single GEN (dead store eliminated) while a GEN-KILL-GEN can not be optimized by DSE because of the KILL. It should always be safe to have an extraneous KILL since that merely inhibits optimization. While an extraneous GEN can be a disaster. So with setting frame_read, we're going to have more KILLs in the dataflow sets, which results in fewer stores being eliminated. They come from: /* If the frame is read, the frame related stores are killed. */ else if (insn_info->frame_read) { store_info_t store_info = i_ptr->store_rec; /* Skip the clobbers. */ while (!store_info->is_set) store_info = store_info->next; if (store_info->group_id >= 0 && rtx_group_vec[store_info->group_id]->frame_related) remove_store = true; } if (remove_store) { if (dump_file && (dump_flags & TDF_DETAILS)) dump_insn_info ("removing from active", i_ptr); active_local_stores_len--; if (last) last->next_local_store = i_ptr->next_local_store; else active_local_stores = i_ptr->next_local_store; } else last = i_ptr; i_ptr = i_ptr->next_local_store; AFAICT in this loop, setting FRAME_READ causes us to set REMOVE_STORE more often. REMOVE_STORE in this context seems to indicate that we're going to remove a store from the list we are tracking for potential removal. Which seems exactly right. Here as well: /* If this insn reads the frame, kill all the frame related stores. */ if (insn_info->frame_read) { FOR_EACH_VEC_ELT (rtx_group_vec, i, group) if (group->process_globally && group->frame_related) { if (kill) bitmap_ior_into (kill, group->group_kill); bitmap_and_compl_into (gen, group->group_kill); } } Which also seems like exactly what we want since when we set FRAME_READ we end up with a bigger KILL set and a smaller GEN set. I think the terminology and variable names certainly makes this tougher to follow than it should. > > Here we have frame related calls being killed before reload because > the argument stores for the sibcall are off frame: > > /* Set the gen set of the exit block, and also any block with no > successors that does not have a wild read. */ > > static void dse_step3_exit_block_scan (bb_info_t bb_info) { /* The > gen set is all 0's for the exit block except for the > frame_pointer_group. */ > > if (stores_off_frame_dead_at_return) { unsigned int i; group_info_t > group; > > FOR_EACH_VEC_ELT (rtx_group_vec, i, group) { if > (group->process_globally && group->frame_related) bitmap_ior_into > (bb_info->gen, group->group_kill); } } } I see your point. Expanding the GEN set here seems wrong. If we go with setting STORES_OFF_FRAME_DEAD_AT_RETURN, then we definitely need to update its documentation. I think an argument could be made that we want both changes if I've interpreted this code correctly. Jeff
On 12/8/2014 3:49 PM, Jeff Law wrote: > I think the terminology and variable names certainly makes this > tougher to follow than it should. I certainly agree. When I first looked at the code, I thought it was completely backwards. Dave
On 12/8/2014 3:01 PM, Jeff Law wrote: >> The above is wrong for sibcalls. Sibcall arguments are relative to >> the incoming argument pointer. Is this always the frame pointer? > I don't think it's always the frame pointer. Don't we use an argument > pointer for the PA64 runtime? If I recall, it was the only port that > had a non-eliminable argument pointer at the time. I don't think PA64 is an argument against this as sibcalls don't work in the PA64 runtime (they are disabled in pa.c) because the argument pointer isn't a fixed register. I guess in theory it could be fixed if it was saved and restored across calls. DSE as it stands doesn't look at argument pointer based stores and I suspect they would be deleted with current code. Dave
On 12/08/14 15:15, John David Anglin wrote: > On 12/8/2014 3:01 PM, Jeff Law wrote: >>> The above is wrong for sibcalls. Sibcall arguments are relative >>> to the incoming argument pointer. Is this always the frame >>> pointer? >> I don't think it's always the frame pointer. Don't we use an >> argument pointer for the PA64 runtime? If I recall, it was the >> only port that had a non-eliminable argument pointer at the time. > I don't think PA64 is an argument against this as sibcalls don't work > in the PA64 runtime (they are disabled in pa.c) because the argument > pointer isn't a fixed register. I guess in theory it could be fixed > if it was saved and restored across calls. But there's nothing that says another port in the future won't have similar characteristics as the PA, so while the PA isn't particularly important, it shows there's cases where arguments won't be accessed by the FP. > > DSE as it stands doesn't look at argument pointer based stores and I > suspect they would be deleted with current code. Agreed. jeff
On 12/08/14 14:50, John David Anglin wrote: > On 12/8/2014 3:49 PM, Jeff Law wrote: >> I think the terminology and variable names certainly makes this >> tougher to follow than it should. > I certainly agree. When I first looked at the code, I thought it > was completely backwards. Me too. Thoughts on using both patches to ensure the we don't have extraneous entries in the GEN set and that we're adding more stuff to the KILL set for sibcalls? Jeff
On 12/9/2014 3:00 PM, Jeff Law wrote: > Thoughts on using both patches to ensure the we don't have extraneous > entries in the GEN set and that we're adding more stuff to the KILL > set for sibcalls? Maybe we should add all stores to the KILL set for a sibling call and not worry about whether they are frame related or not. Dave
On 12/09/14 14:09, John David Anglin wrote: > On 12/9/2014 3:00 PM, Jeff Law wrote: >> Thoughts on using both patches to ensure the we don't have extraneous >> entries in the GEN set and that we're adding more stuff to the KILL >> set for sibcalls? > Maybe we should add all stores to the KILL set > for a sibling call and not worry about whether > they are frame related or not. For a sibling call, that may not be a terrible idea. We're not really able to track the frame/arg pointer stuff well in that case and once those are off the table, there's not much benefit in tracking stuff for sibcalls. Jeff
Index: dse.c =================================================================== --- dse.c (revision 217974) +++ dse.c (working copy) @@ -2483,6 +2483,9 @@ insn_info->cannot_delete = true; + if (SIBLING_CALL_P (insn)) + stores_off_frame_dead_at_return = false; + /* Const functions cannot do anything bad i.e. read memory, however, they can read their parameters which may have been pushed onto the stack.