Message ID | 56033FC7.5080005@yahoo.com |
---|---|
State | New |
Headers | show |
On 09/23/2015 06:11 PM, Abe wrote: > The patch can pass the bootstrap stage2-to-stage3 comparison [same > platform] *_if_* I prevent the "bootstrap-debug" value for > BUILD_CONFIG from being automatically chosen, e.g. via > "--with-build-config=bootstrap=time" during configuration. With the > default "BUILD_CONFIG=bootstrap-debug", it fails at the comparison > stage. So what that means is the presence or absence of debug information is causing a difference in the code you generate. That is (of course) bad and indicates a bug of some kind in your code. I haven't put your code under a debugger or anything like that, but this does jump out: + rtx_insn* temp_src_insn = BB_HEAD (then_bb); + if (temp_src_insn && NOTE_INSN_BASIC_BLOCK_P (temp_src_insn)) + temp_src_insn = NEXT_INSN (temp_src_insn); /* skip over a start-of-BB note */ What if BB_HEAD (then_bb) is a DEBUG_INSN with -g enabled, but is a NOTE_INSN_BASIC_BLOCK when -g is not enabled. That could cause this kind of failure. Jeff
On 9/24/15 1:07 AM, Jeff Law wrote: > So what that means is the presence or absence of debug information is causing a difference in > the code you generate. That is (of course) bad and indicates a bug of some kind in your code. > I haven't put your code under a debugger or anything like that, but this does jump out: > + rtx_insn* temp_src_insn = BB_HEAD (then_bb); > + if (temp_src_insn && NOTE_INSN_BASIC_BLOCK_P (temp_src_insn)) > + temp_src_insn = NEXT_INSN (temp_src_insn); /* skip over a start-of-BB note */ > What if BB_HEAD (then_bb) is a DEBUG_INSN with -g enabled, but is a NOTE_INSN_BASIC_BLOCK when -g is not enabled. That could cause > this kind of failure. Thanks very much! I`m just checking email right before heading out to surgery, so I can`t experiment on changing the above code right now, but I`ll be sure to do so next week. After some thinking, I thought of this question: would "-g"/"-g3"/etc. cause a DEBUG_INSN to be present in the RTL as the first insn in the BB, just before the start-of-BB note? In other words, rather than only checking to see if I should skip copying the first insn in the BB, would checking {the first insn, and if that`s not a NOTE, then check the second insn} and skipping over whichever one is the first "hit" for "NOTE" be a safe strategy? Thanks again! Regards, Abe
On 09/24/2015 09:19 AM, Abe wrote: > > On 9/24/15 1:07 AM, Jeff Law wrote: > >> So what that means is the presence or absence of debug information is >> causing a difference in > > the code you generate. That is (of course) bad and indicates a bug > of some kind in your code. > >> I haven't put your code under a debugger or anything like that, but >> this does jump out: > >> + rtx_insn* temp_src_insn = BB_HEAD (then_bb); >> + if (temp_src_insn && NOTE_INSN_BASIC_BLOCK_P (temp_src_insn)) >> + temp_src_insn = NEXT_INSN (temp_src_insn); /* skip over a >> start-of-BB note */ > >> What if BB_HEAD (then_bb) is a DEBUG_INSN with -g enabled, but is a >> NOTE_INSN_BASIC_BLOCK when -g is not enabled. That could cause >> this kind of failure. > > > Thanks very much! I`m just checking email right before heading out to > surgery, so I can`t experiment on changing the above code right now, but > I`ll be sure to do so next week. > > After some thinking, I thought of this question: would "-g"/"-g3"/etc. > cause a DEBUG_INSN to be present in the RTL as the first insn in the BB, > just before the start-of-BB note? In other words, rather than only > checking to see if I should skip copying the first insn in the BB, would > checking {the first insn, and if that`s not a NOTE, then check the > second insn} and skipping over whichever one is the first "hit" for > "NOTE" be a safe strategy? I'd be surprised if we had DEBUG_INSNs as the first insn in a block (before the note), but it can't hurt to verify. I believe cfgrtl checks for proper location of the block note. But maybe I'm mis-remembering. It might not be a bad idea to run the verification code immediately after your transformation (for testing purposes only). Otherwise, you're just going to have to slog through and find out why the codegen is different. These kind of bugs are usually painful to track down, but it's usually worth the time spent in the end. jeff
> From: Jeff Law <law@redhat.com> > Sent: Friday, September 25, 2015 11:09 AM > I'd be surprised if we had DEBUG_INSNs as the first insn in a block > (before the note), but it can't hurt to verify. I believe cfgrtl checks > for proper location of the block note. But maybe I'm mis-remembering. > It might not be a bad idea to run the verification code immediately > after your transformation (for testing purposes only). Thanks for the advice. By "run the verification code", did you mean: * call some code whereby "cfgrtl checks for proper location of the block note"? * re-use some other already-existing-in-GCC code? * write some new verification code of my own? By "after your transformation", did you mean: * just before my code does "goto success;", which jumps to a point in "noce_process_if_block" that was already there and is shared between several different RTL-level if conversions? * well after the relevant "success:" but before returning? * both of the preceding? * "other"? > Otherwise, you're just going to have to slog through and find out why > the codegen is different. These kind of bugs are usually painful to > track down, but it's usually worth the time spent in the end. Yeah. I think I understand. I hope I`ll be able to figure it out at the RTL level. Thanks again for your help, Jeff. Regards, Abe
On 09/25/2015 11:13 AM, Abe Skolnik wrote: >> From: Jeff Law <law@redhat.com> > >> Sent: Friday, September 25, 2015 11:09 AM > > >> I'd be surprised if we had DEBUG_INSNs as the first insn in a block >> (before the note), but it can't hurt to verify. I believe cfgrtl checks >> for proper location of the block note. But maybe I'm mis-remembering. >> It might not be a bad idea to run the verification code immediately >> after your transformation (for testing purposes only). > > Thanks for the advice. > > > By "run the verification code", did you mean: > > * call some code whereby "cfgrtl checks for proper location of the block note"? > > * re-use some other already-existing-in-GCC code? > > * write some new verification code of my own? There's several routines for verfication of the CFG and various RTL artifacts in cfgrtl.c. I'm not offhand sure of the main entry point for those routines. I believe they're typically run by the pass manager. > > > By "after your transformation", did you mean: > > * just before my code does "goto success;", which jumps to a point > in "noce_process_if_block" that was already there and is shared > between several different RTL-level if conversions? At whatever point you've finished your transformation of a single IF-THEN-ELSE block. That way if something was wrong, you'll know precisely which IF-THEN-ELSE block is getting messed up. Since it's just for debugging purposes, you can afford the time to do the checking this often. Obviously that wouldn't be part of the official patch. > > * well after the relevant "success:" but before returning? > > * both of the preceding? > > * "other"? WHenver you've totally completed transforming an IF-THEN-ELSE block, but before any other transformations are started. jeff
On Fri, Sep 25, 2015 at 10:09:23AM -0600, Jeff Law wrote: > >>So what that means is the presence or absence of debug information is > >>causing a difference in > > > the code you generate. That is (of course) bad and indicates a bug > >of some kind in your code. > > > >>I haven't put your code under a debugger or anything like that, but > >>this does jump out: > > > >>+ rtx_insn* temp_src_insn = BB_HEAD (then_bb); > >>+ if (temp_src_insn && NOTE_INSN_BASIC_BLOCK_P (temp_src_insn)) > >>+ temp_src_insn = NEXT_INSN (temp_src_insn); /* skip over a > >>start-of-BB note */ > > > >>What if BB_HEAD (then_bb) is a DEBUG_INSN with -g enabled, but is a > >>NOTE_INSN_BASIC_BLOCK when -g is not enabled. That could cause > >>this kind of failure. I haven't looked at the full code, but NEXT_INSN is even more suspicious (you also need to skip the debug insns). > Otherwise, you're just going to have to slog through and find out why > the codegen is different. These kind of bugs are usually painful to > track down, but it's usually worth the time spent in the end. Compile with -dap (directly with cc1 or with -S), find some difference in the generated asm, see what the RTL insn number for that was (that's what -dp is for), find where the difference came from (from the dumpfiles, that's -da; you probably already know what pass is the culprit ;-) ) Segher
On 09/25/2015 01:18 PM, Segher Boessenkool wrote: > On Fri, Sep 25, 2015 at 10:09:23AM -0600, Jeff Law wrote: >>>> So what that means is the presence or absence of debug information is >>>> causing a difference in >>>> the code you generate. That is (of course) bad and indicates a bug >>> of some kind in your code. >>> >>>> I haven't put your code under a debugger or anything like that, but >>>> this does jump out: >>> >>>> + rtx_insn* temp_src_insn = BB_HEAD (then_bb); >>>> + if (temp_src_insn && NOTE_INSN_BASIC_BLOCK_P (temp_src_insn)) >>>> + temp_src_insn = NEXT_INSN (temp_src_insn); /* skip over a >>>> start-of-BB note */ >>> >>>> What if BB_HEAD (then_bb) is a DEBUG_INSN with -g enabled, but is a >>>> NOTE_INSN_BASIC_BLOCK when -g is not enabled. That could cause >>>> this kind of failure. > > I haven't looked at the full code, but NEXT_INSN is even more suspicious > (you also need to skip the debug insns). Yes, you're absolutely right. :-) jeff
> I haven't looked at the full code, but NEXT_INSN is even more > suspicious (you also need to skip the debug insns). Thanks for that information. > Compile with -dap (directly with cc1 or with -S), find some difference > in the generated asm, see what the RTL insn number for that was (that's > what -dp is for), find where the difference came from (from the dumpfiles, > that's -da; you probably already know what pass is the culprit ;-) ) Thanks again! Regards, Abe
--- ifcvt.c___original_from_Kyrill_commit_227368_.c 2015-09-01 12:54:38.234108158 -0500 +++ ifcvt.c___Abe-edited_withOUT_debug_code_by_Abe.c 2015-09-23 18:17:15.663107377 -0500 @@ -47,6 +47,7 @@ #include "insn-codes.h" #include "optabs.h" #include "diagnostic-core.h" +#include "diagnostic-color.h" #include "tm_p.h" #include "cfgloop.h" #include "target.h" @@ -56,6 +57,8 @@ #include "rtl-iter.h" #include "ifcvt.h" +#include <utility> + #ifndef MAX_CONDITIONAL_EXECUTE #define MAX_CONDITIONAL_EXECUTE \ (BRANCH_COST (optimize_function_for_speed_p (cfun), false) \ @@ -66,6 +69,9 @@ #define NULL_BLOCK ((basic_block) NULL) +/* an arbitrary upper bound; it should never be *nearly* this big */ +#define SCRATCHPADS_INCLUSIVE_MAX_SIZE_IN_BYTES 128 + /* True if after combine pass. */ static bool ifcvt_after_combine; @@ -110,6 +116,10 @@ static int dead_or_predicable (basic_blo edge, int); static void noce_emit_move_insn (rtx, rtx); static rtx_insn *block_has_only_trap (basic_block); + +static auto_vec<std::pair<rtx, unsigned int> > scratchpads; +static rtx biggest_scratchpad_rtx; +static size_t size_of_biggest_scratchpad; /* Count the number of non-jump active insns in BB. */ @@ -2784,18 +2794,27 @@ noce_operand_ok (const_rtx op) return ! may_trap_p (op); } -/* Return true if a write into MEM may trap or fault. */ +/* Return true if a write into MEM may trap or fault. + Pass in "true" to SCRATCHPADS_ENABLED + if/when/where asking with regard to a + scratchpad-based if-conversion, "false" otherwise; + "noce_mem_write_may_trap_or_fault_p", below, + preserves the pre-scratchpad-enhancements behavior + of the older function with the same name, + which used to contain most of what is now in + "noce_mem_write_may_trap_or_fault_considering_scratchpad_support_p" */ static bool -noce_mem_write_may_trap_or_fault_p (const_rtx mem) +noce_mem_write_may_trap_or_fault_considering_scratchpad_support_p (const_rtx mem, bool scratchpads_enabled) { rtx addr; if (MEM_READONLY_P (mem)) return true; - if (may_trap_or_fault_p (mem)) - return true; + if (! scratchpads_enabled) + if (may_trap_or_fault_p (mem)) + return true; addr = XEXP (mem, 0); @@ -2837,6 +2856,17 @@ noce_mem_write_may_trap_or_fault_p (cons return false; } + /* the following function preserves the pre-scratchpad-enhancements + behavior of the older function with the same name, + which used to contain most of what is now in + "noce_mem_write_may_trap_or_fault_considering_scratchpad_support_p" */ + +static bool +noce_mem_write_may_trap_or_fault_p (const_rtx mem) +{ + return noce_mem_write_may_trap_or_fault_considering_scratchpad_support_p (mem, false); +} + /* Return whether we can use store speculation for MEM. TOP_BB is the basic block above the conditional block where we are considering doing the speculative store. We look for whether MEM is set @@ -3106,6 +3136,7 @@ noce_process_if_block (struct noce_if_in } /* Don't operate on sources that may trap or are volatile. */ + /* Abe note: this will probably need to be altered for scratchpad-based loads */ if (! noce_operand_ok (a) || ! noce_operand_ok (b)) return FALSE; @@ -3156,17 +3187,166 @@ noce_process_if_block (struct noce_if_in if (!set_b && MEM_P (orig_x)) { - /* Disallow the "if (...) x = a;" form (implicit "else x = x;") - for optimizations if writing to x may trap or fault, - i.e. it's a memory other than a static var or a stack slot, - is misaligned on strict aligned machines or is read-only. If - x is a read-only memory, then the program is valid only if we + /* Disallow the "if (...) x = a;" form (with no "else") for optimizations + when x is misaligned on strict-alignment machines or is read-only. + If x is a memory other than a static var or a stack slot: + for targets _with_ conditional move and _without_ conditional execution, + convert using the scratchpad technique, otherwise don`t convert. + If x is a read-only memory, then the program is valid only if we avoid the store into it. If there are stores on both the THEN and ELSE arms, then we can go ahead with the conversion; either the program is broken, or the condition is always - false such that the other memory is selected. */ + false such that the other memory is selected. The non-scratchpad-based + conversion here has an implicit "else x = x;". */ if (noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + { + /* The next "if": quoting "noce_emit_cmove": If we can't create new pseudos, though, don't bother. */ + if (reload_completed) + { + return FALSE; + } + + if (optimize<2) + { + return FALSE; + } + + if (optimize_function_for_size_p (cfun)) + { // reminder: "cfun" is a global, decl.d as "extern" in "function.h" and def.d in "function.c" + return FALSE; + } + + if (targetm.have_conditional_execution () || ! HAVE_conditional_move) + { + return FALSE; + } + + const bool not_a_scratchpad_candidate = noce_mem_write_may_trap_or_fault_considering_scratchpad_support_p (orig_x, true); + + if (! not_a_scratchpad_candidate) { + + if (MEM_SIZE_KNOWN_P (orig_x) ) { + const size_t size_of_MEM_operand = MEM_SIZE (orig_x); + + if (size_of_MEM_operand <= SCRATCHPADS_INCLUSIVE_MAX_SIZE_IN_BYTES) { + + if (size_of_MEM_operand > size_of_biggest_scratchpad) { + size_of_biggest_scratchpad = size_of_MEM_operand; + biggest_scratchpad_rtx = assign_stack_local(GET_MODE (orig_x), size_of_MEM_operand, 0); + /* 0: align acc. to the machine mode indicated by "GET_MODE (orig_x)" */ + gcc_assert(biggest_scratchpad_rtx); + scratchpads.safe_push(std::make_pair(biggest_scratchpad_rtx, size_of_MEM_operand)); + } + + gcc_assert(biggest_scratchpad_rtx); + + rtx reg_for_address_to_which_to_store = gen_reg_rtx (Pmode); + set_used_flags(reg_for_address_to_which_to_store); + + start_sequence (); + + /* we must copy the insn.s between {the start of the THEN block} and {the set of 'a'}, + if they exist, since they may be needed for the converted code as well */ + + rtx_insn* temp_src_insn = BB_HEAD (then_bb); + if (temp_src_insn && NOTE_INSN_BASIC_BLOCK_P (temp_src_insn)) + temp_src_insn = NEXT_INSN (temp_src_insn); /* skip over a start-of-BB note */ + + if (temp_src_insn) + { + rtx_insn* last_one_to_splice_in = PREV_INSN(insn_a); + duplicate_insn_chain(temp_src_insn, last_one_to_splice_in); + /* a return of 0 from "duplicate_insn_chain" is _not_ a failure, + as far as Abe knows; it just seems to return + the "NEXT_INSN" of the last insn it duplicated */ + } + + /* done copying the insn.s between {the start of the THEN block} + and {the set of 'a'}, if any */ + + if (CONSTANT_P (XEXP (cond, 0)) && CONSTANT_P (XEXP (cond, 1))) + { + end_sequence (); + return FALSE; + } + + + /* WARNING: "noce_emit_cmove" takes its resulting-expr. param.s in {false, true} order, + whereas "gen_rtx_IF_THEN_ELSE" takes them the other way */ + + /* IMPORTANT NOTE: _intentionally_ not reversing the positions of + "XEXP (orig_x,0)" and "XEXP (biggest_scratchpad_rtx, 0)" + when "if_info->then_else_reversed" is true: + when that flag is true, code that has already run by this point + has _already_ inverted "if_info->cond", + which is the source for the local "cond" */ + + rtx target = noce_emit_cmove (if_info, reg_for_address_to_which_to_store, GET_CODE (cond), + XEXP (cond, 0), XEXP (cond, 1), XEXP (orig_x, 0), + XEXP (biggest_scratchpad_rtx, 0)); + + if (!target) + { + end_sequence (); + return FALSE; + } + if (target != reg_for_address_to_which_to_store) + noce_emit_move_insn (reg_for_address_to_which_to_store, target); + + /* --- start of building the new MEM --- */ + /* most or all of the code for building the new MEM + is Abe mimicking the MEM-related code in "noce_try_cmove_arith" */ + rtx mem_rtx = gen_rtx_MEM (GET_MODE (orig_x), reg_for_address_to_which_to_store); + MEM_NOTRAP_P (mem_rtx) = true; + MEM_VOLATILE_P (mem_rtx) = MEM_VOLATILE_P (orig_x); + + alias_set_type temp_alias_set = new_alias_set (); + if (MEM_ALIAS_SET (orig_x)) + record_alias_subset (MEM_ALIAS_SET (orig_x), temp_alias_set); + set_mem_alias_set (mem_rtx, temp_alias_set); + + + set_mem_align (mem_rtx, MIN (MEM_ALIGN (biggest_scratchpad_rtx), MEM_ALIGN (orig_x))); + + if (MEM_ADDR_SPACE (orig_x) != MEM_ADDR_SPACE (biggest_scratchpad_rtx)) + { + end_sequence (); + return FALSE; + } + + set_used_flags(mem_rtx); + + /* --- end of building the new MEM --- */ + + noce_emit_move_insn(mem_rtx, a); + + /* Abe`s note: do we need to do the following after getting a new pseudo-reg., as shown elsewhere in this file? + if (max_regno < max_reg_num ()) max_regno = max_reg_num (); + */ + + rtx_insn *seq = end_ifcvt_sequence (if_info); + if (!seq) + { + return FALSE; + } + else + { + unshare_all_rtl_in_chain (seq); + x = orig_x; /* prevent the code right after "success:" from throwing away the changes */ + emit_insn_before_setloc (seq, if_info->jump, INSN_LOCATION (if_info->insn_a)); + goto success; + } + + } /* (size_of_MEM_operand <= SCRATCHPADS_INCLUSIVE_MAX_SIZE_IN_BYTES) */ + + } else { + /* Abe`s manually-inserted note: this empty block is here due to stripped-out debugging-by-Abe code */ + } + + } + + return FALSE; + } /* Avoid store speculation: given "if (...) x = a" where x is a MEM, we only want to do the store if x is always set @@ -4956,9 +5136,14 @@ dead_or_predicable (basic_block test_bb, static void if_convert (bool after_combine) { + basic_block bb; int pass; + scratchpads.truncate(0); /* ensure that we start the scratchpads vec fresh each time */ + size_of_biggest_scratchpad = 0; + biggest_scratchpad_rtx = 0; /* reminder: an "rtx", therefore a pointer */ + if (optimize == 1) { df_live_add_problem ();