Message ID | CABvNLJu3OLHxvY3VYsA_ZCS-RotQ9k8V43Y+_5=QJsgKOwqcaA@mail.gmail.com |
---|---|
State | New |
Headers | show |
Series | emit-rtl.c: Allow splitting of RTX_FRAME_RELATED_P insns? | expand |
Senthil Kumar via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > Hi, > > I'm working on converting the AVR backend to MODE_CC, following > the steps described for case #2 in the CC0 transition wiki page, > and I've implemented the first three bullet > points (https://github.com/saaadhu/gcc-avr-cc0/tree/avr-cc0-squashed). With > the below patch, there are zero regressions (for mega and xmega > subarchs) compared to the current mainline, as of yesterday. > > The wiki suggests using post-reload splitters, so that's the > direction I took, but I ran into an issue where split_insn > bails out early if RTX_FRAME_RELATED_P is true - this means > that splits for REG_CC clobbering insns with > RTX_FRAME_RELATED_P will never execute, resulting in a > could-not-split insn ICE in the final stage. > > I see that the recog.c:peep2_attempt allows splitting of a > RTX_FRAME_RELATED_P insn, provided the result of the split is a > single insn. Would it be ok to modify try_split also to > allow those kinds of insns (tentative patch attached, code > copied over from peep2_attempt, only setting old and new_insn)? Or is there > a different approach to fix this? I agree there's no obvious reason why splitting to a single insn should be rejected but a peephole2 to a single instruction should be OK. And reusing the existing, tried-and-tested code is the way to go. But could you split the code out of peep2_attempt into a subroutine (probably still in recog.c) and reuse it in try_split? BTW, just to check: is your email address in MAINTAINERS still correct? Thanks, Richard
On Mon, Aug 10, 2020 at 05:16:15PM +0100, Richard Sandiford wrote: > Senthil Kumar via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > > The wiki suggests using post-reload splitters, so that's the > > direction I took, but I ran into an issue where split_insn > > bails out early if RTX_FRAME_RELATED_P is true - this means > > that splits for REG_CC clobbering insns with > > RTX_FRAME_RELATED_P will never execute, resulting in a > > could-not-split insn ICE in the final stage. > > > > I see that the recog.c:peep2_attempt allows splitting of a > > RTX_FRAME_RELATED_P insn, provided the result of the split is a > > single insn. Would it be ok to modify try_split also to > > allow those kinds of insns (tentative patch attached, code > > copied over from peep2_attempt, only setting old and new_insn)? Or is there > > a different approach to fix this? > > I agree there's no obvious reason why splitting to a single insn > should be rejected but a peephole2 to a single instruction should be OK. > And reusing the existing, tried-and-tested code is the way to go. The only obvious difference is that the splitters run many times, while peep2 runs only once, very late. If you make this only do stuff for reload_completed splitters, that difference is gone as well. > But could you split the code out of peep2_attempt into a subroutine > (probably still in recog.c) and reuse it in try_split? Yes please :-) Segher
Segher Boessenkool <segher@kernel.crashing.org> writes: > On Mon, Aug 10, 2020 at 05:16:15PM +0100, Richard Sandiford wrote: >> Senthil Kumar via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >> > The wiki suggests using post-reload splitters, so that's the >> > direction I took, but I ran into an issue where split_insn >> > bails out early if RTX_FRAME_RELATED_P is true - this means >> > that splits for REG_CC clobbering insns with >> > RTX_FRAME_RELATED_P will never execute, resulting in a >> > could-not-split insn ICE in the final stage. >> > >> > I see that the recog.c:peep2_attempt allows splitting of a >> > RTX_FRAME_RELATED_P insn, provided the result of the split is a >> > single insn. Would it be ok to modify try_split also to >> > allow those kinds of insns (tentative patch attached, code >> > copied over from peep2_attempt, only setting old and new_insn)? Or is there >> > a different approach to fix this? >> >> I agree there's no obvious reason why splitting to a single insn >> should be rejected but a peephole2 to a single instruction should be OK. >> And reusing the existing, tried-and-tested code is the way to go. > > The only obvious difference is that the splitters run many times, while > peep2 runs only once, very late. If you make this only do stuff for > reload_completed splitters, that difference is gone as well. Yeah, but I was talking specifically about RTX_FRAME_RELATED_P stuff, rather than in general, and RTX_FRAME_RELATED_P insns shouldn't exist until prologue/epilogue generation. The reference to “single insn” was because both passes would still reject splitting/peepholing an RTX_FRAME_RELATED_P insn to multiple insns. Thanks, Richard
On Tue, Aug 11, 2020 at 07:59:44AM +0100, Richard Sandiford wrote: > >> I agree there's no obvious reason why splitting to a single insn > >> should be rejected but a peephole2 to a single instruction should be OK. > >> And reusing the existing, tried-and-tested code is the way to go. > > > > The only obvious difference is that the splitters run many times, while > > peep2 runs only once, very late. If you make this only do stuff for > > reload_completed splitters, that difference is gone as well. > > Yeah, but I was talking specifically about RTX_FRAME_RELATED_P stuff, > rather than in general, and RTX_FRAME_RELATED_P insns shouldn't exist > until prologue/epilogue generation. Yeah, good points. > The reference to “single insn” > was because both passes would still reject splitting/peepholing an > RTX_FRAME_RELATED_P insn to multiple insns. Is that the only thing RTX_FRAME_RELATED_P is actually useful for now, btw? (I do get that it is handy to have the simple cases in the prologue automatically figured out, but that is at best a nicety). What is actually bad about splitting FRAME_RELATED insns, anyway? I can think of many things that could go wrong, but all of those can go wrong with 1-1 splits as well. Maybe this all just works because not very many 1-1 splits are used in practice? So many questions, feel free to ignore all :-) Segher
Richard Sandiford writes: > Senthil Kumar via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >> Hi, >> >> I'm working on converting the AVR backend to MODE_CC, following >> the steps described for case #2 in the CC0 transition wiki page, >> and I've implemented the first three bullet >> points (https://github.com/saaadhu/gcc-avr-cc0/tree/avr-cc0-squashed). With >> the below patch, there are zero regressions (for mega and xmega >> subarchs) compared to the current mainline, as of yesterday. >> >> The wiki suggests using post-reload splitters, so that's the >> direction I took, but I ran into an issue where split_insn >> bails out early if RTX_FRAME_RELATED_P is true - this means >> that splits for REG_CC clobbering insns with >> RTX_FRAME_RELATED_P will never execute, resulting in a >> could-not-split insn ICE in the final stage. >> >> I see that the recog.c:peep2_attempt allows splitting of a >> RTX_FRAME_RELATED_P insn, provided the result of the split is a >> single insn. Would it be ok to modify try_split also to >> allow those kinds of insns (tentative patch attached, code >> copied over from peep2_attempt, only setting old and new_insn)? Or is there >> a different approach to fix this? > > I agree there's no obvious reason why splitting to a single insn > should be rejected but a peephole2 to a single instruction should be OK. > And reusing the existing, tried-and-tested code is the way to go. > > But could you split the code out of peep2_attempt into a subroutine > (probably still in recog.c) and reuse it in try_split? How does the below patch look? Bootstrapped and reg tested on x86_64-linux. > > BTW, just to check: is your email address in MAINTAINERS still correct? It was out-of-date, yes - updated now. Regards Senthil 2020-08-13 Senthil Kumar Selvaraj <saaadhu@gcc.gnu.org> gcc/ChangeLog: * emit-rtl.c (try_split): Call copy_frame_info_to_split_insn to split certain RTX_FRAME_RELATED_P insns. * recog.c (copy_frame_info_to_split_insn): New function. (peep2_attempt): Split copying of frame related info of RTX_FRAME_RELATED_P insns into above function and call it. * recog.h (copy_frame_info_to_split_insn): Declare it. diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index f9b0e9714d9..3706f0a03fd 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -3822,10 +3822,6 @@ try_split (rtx pat, rtx_insn *trial, int last) int njumps = 0; rtx_insn *call_insn = NULL; - /* We're not good at redistributing frame information. */ - if (RTX_FRAME_RELATED_P (trial)) - return trial; - if (any_condjump_p (trial) && (note = find_reg_note (trial, REG_BR_PROB, 0))) split_branch_probability @@ -3842,6 +3838,7 @@ try_split (rtx pat, rtx_insn *trial, int last) if (!seq) return trial; + int split_insn_count = 0; /* Avoid infinite loop if any insn of the result matches the original pattern. */ insn_last = seq; @@ -3850,11 +3847,25 @@ try_split (rtx pat, rtx_insn *trial, int last) if (INSN_P (insn_last) && rtx_equal_p (PATTERN (insn_last), pat)) return trial; + split_insn_count++; if (!NEXT_INSN (insn_last)) break; insn_last = NEXT_INSN (insn_last); } + /* We're not good at redistributing frame information if + the split occurs before reload or if it results in more + than one insn. */ + if (RTX_FRAME_RELATED_P (trial)) + { + if (!reload_completed || split_insn_count != 1) + return trial; + + rtx_insn *new_insn = seq; + rtx_insn *old_insn = trial; + copy_frame_info_to_split_insn (old_insn, new_insn); + } + /* We will be adding the new sequence to the function. The splitters may have introduced invalid RTL sharing, so unshare the sequence now. */ unshare_all_rtl_in_chain (seq); diff --git a/gcc/recog.c b/gcc/recog.c index 25f19b1b1cf..e024597f9d7 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -3277,6 +3277,78 @@ peep2_reinit_state (regset live) COPY_REG_SET (peep2_insn_data[MAX_INSNS_PER_PEEP2].live_before, live); } +/* Copies frame related info of an insn (old_insn) to the single + insn (new_insn) that was obtained by splitting old_insn. */ + +void +copy_frame_info_to_split_insn (rtx_insn *old_insn, rtx_insn *new_insn) +{ + bool any_note = false; + rtx note; + + if (!RTX_FRAME_RELATED_P (old_insn)) + return; + + RTX_FRAME_RELATED_P (new_insn) = 1; + + /* Allow the backend to fill in a note during the split. */ + for (note = REG_NOTES (new_insn); note ; note = XEXP (note, 1)) + switch (REG_NOTE_KIND (note)) + { + case REG_FRAME_RELATED_EXPR: + case REG_CFA_DEF_CFA: + case REG_CFA_ADJUST_CFA: + case REG_CFA_OFFSET: + case REG_CFA_REGISTER: + case REG_CFA_EXPRESSION: + case REG_CFA_RESTORE: + case REG_CFA_SET_VDRAP: + any_note = true; + break; + default: + break; + } + + /* If the backend didn't supply a note, copy one over. */ + if (!any_note) + for (note = REG_NOTES (old_insn); note ; note = XEXP (note, 1)) + switch (REG_NOTE_KIND (note)) + { + case REG_FRAME_RELATED_EXPR: + case REG_CFA_DEF_CFA: + case REG_CFA_ADJUST_CFA: + case REG_CFA_OFFSET: + case REG_CFA_REGISTER: + case REG_CFA_EXPRESSION: + case REG_CFA_RESTORE: + case REG_CFA_SET_VDRAP: + add_reg_note (new_insn, REG_NOTE_KIND (note), XEXP (note, 0)); + any_note = true; + break; + default: + break; + } + + /* If there still isn't a note, make sure the unwind info sees the + same expression as before the split. */ + if (!any_note) + { + rtx old_set, new_set; + + /* The old insn had better have been simple, or annotated. */ + old_set = single_set (old_insn); + gcc_assert (old_set != NULL); + + new_set = single_set (new_insn); + if (!new_set || !rtx_equal_p (new_set, old_set)) + add_reg_note (new_insn, REG_FRAME_RELATED_EXPR, old_set); + } + + /* Copy prologue/epilogue status. This is required in order to keep + proper placement of EPILOGUE_BEG and the DW_CFA_remember_state. */ + maybe_copy_prologue_epilogue_insn (old_insn, new_insn); +} + /* While scanning basic block BB, we found a match of length MATCH_LEN, starting at INSN. Perform the replacement, removing the old insns and replacing them with ATTEMPT. Returns the last insn emitted, or NULL @@ -3297,9 +3369,6 @@ peep2_attempt (basic_block bb, rtx_insn *insn, int match_len, rtx_insn *attempt) old_insn = peep2_insn_data[peep2_current].insn; if (RTX_FRAME_RELATED_P (old_insn)) { - bool any_note = false; - rtx note; - if (match_len != 0) return NULL; @@ -3313,64 +3382,7 @@ peep2_attempt (basic_block bb, rtx_insn *insn, int match_len, rtx_insn *attempt) return NULL; /* We have a 1-1 replacement. Copy over any frame-related info. */ - RTX_FRAME_RELATED_P (new_insn) = 1; - - /* Allow the backend to fill in a note during the split. */ - for (note = REG_NOTES (new_insn); note ; note = XEXP (note, 1)) - switch (REG_NOTE_KIND (note)) - { - case REG_FRAME_RELATED_EXPR: - case REG_CFA_DEF_CFA: - case REG_CFA_ADJUST_CFA: - case REG_CFA_OFFSET: - case REG_CFA_REGISTER: - case REG_CFA_EXPRESSION: - case REG_CFA_RESTORE: - case REG_CFA_SET_VDRAP: - any_note = true; - break; - default: - break; - } - - /* If the backend didn't supply a note, copy one over. */ - if (!any_note) - for (note = REG_NOTES (old_insn); note ; note = XEXP (note, 1)) - switch (REG_NOTE_KIND (note)) - { - case REG_FRAME_RELATED_EXPR: - case REG_CFA_DEF_CFA: - case REG_CFA_ADJUST_CFA: - case REG_CFA_OFFSET: - case REG_CFA_REGISTER: - case REG_CFA_EXPRESSION: - case REG_CFA_RESTORE: - case REG_CFA_SET_VDRAP: - add_reg_note (new_insn, REG_NOTE_KIND (note), XEXP (note, 0)); - any_note = true; - break; - default: - break; - } - - /* If there still isn't a note, make sure the unwind info sees the - same expression as before the split. */ - if (!any_note) - { - rtx old_set, new_set; - - /* The old insn had better have been simple, or annotated. */ - old_set = single_set (old_insn); - gcc_assert (old_set != NULL); - - new_set = single_set (new_insn); - if (!new_set || !rtx_equal_p (new_set, old_set)) - add_reg_note (new_insn, REG_FRAME_RELATED_EXPR, old_set); - } - - /* Copy prologue/epilogue status. This is required in order to keep - proper placement of EPILOGUE_BEG and the DW_CFA_remember_state. */ - maybe_copy_prologue_epilogue_insn (old_insn, new_insn); + copy_frame_info_to_split_insn (old_insn, new_insn); } /* If we are splitting a CALL_INSN, look for the CALL_INSN diff --git a/gcc/recog.h b/gcc/recog.h index 3e4b55bdf3f..ae3675f5c82 100644 --- a/gcc/recog.h +++ b/gcc/recog.h @@ -150,6 +150,8 @@ extern rtx_insn *peephole2_insns (rtx, rtx_insn *, int *); extern int store_data_bypass_p (rtx_insn *, rtx_insn *); extern int if_test_bypass_p (rtx_insn *, rtx_insn *); +extern void copy_frame_info_to_split_insn (rtx_insn *, rtx_insn *); + #ifndef GENERATOR_FILE /* Try recognizing the instruction INSN, and return the code number that results.
Sorry for the slow reply. Senthil Kumar Selvaraj <senthil.thecoder@gmail.com> writes: > 2020-08-13 Senthil Kumar Selvaraj <saaadhu@gcc.gnu.org> > > gcc/ChangeLog: > > * emit-rtl.c (try_split): Call copy_frame_info_to_split_insn > to split certain RTX_FRAME_RELATED_P insns. > * recog.c (copy_frame_info_to_split_insn): New function. > (peep2_attempt): Split copying of frame related info of > RTX_FRAME_RELATED_P insns into above function and call it. > * recog.h (copy_frame_info_to_split_insn): Declare it. > > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > index f9b0e9714d9..3706f0a03fd 100644 > --- a/gcc/emit-rtl.c > +++ b/gcc/emit-rtl.c > @@ -3822,10 +3822,6 @@ try_split (rtx pat, rtx_insn *trial, int last) > int njumps = 0; > rtx_insn *call_insn = NULL; > > - /* We're not good at redistributing frame information. */ > - if (RTX_FRAME_RELATED_P (trial)) > - return trial; > - > if (any_condjump_p (trial) > && (note = find_reg_note (trial, REG_BR_PROB, 0))) > split_branch_probability > @@ -3842,6 +3838,7 @@ try_split (rtx pat, rtx_insn *trial, int last) > if (!seq) > return trial; > > + int split_insn_count = 0; > /* Avoid infinite loop if any insn of the result matches > the original pattern. */ > insn_last = seq; > @@ -3850,11 +3847,25 @@ try_split (rtx pat, rtx_insn *trial, int last) > if (INSN_P (insn_last) > && rtx_equal_p (PATTERN (insn_last), pat)) > return trial; > + split_insn_count++; > if (!NEXT_INSN (insn_last)) > break; > insn_last = NEXT_INSN (insn_last); > } > > + /* We're not good at redistributing frame information if > + the split occurs before reload or if it results in more > + than one insn. */ > + if (RTX_FRAME_RELATED_P (trial)) > + { > + if (!reload_completed || split_insn_count != 1) > + return trial; > + > + rtx_insn *new_insn = seq; > + rtx_insn *old_insn = trial; > + copy_frame_info_to_split_insn (old_insn, new_insn); > + } > + > /* We will be adding the new sequence to the function. The splitters > may have introduced invalid RTL sharing, so unshare the sequence now. */ > unshare_all_rtl_in_chain (seq); > diff --git a/gcc/recog.c b/gcc/recog.c > index 25f19b1b1cf..e024597f9d7 100644 > --- a/gcc/recog.c > +++ b/gcc/recog.c > @@ -3277,6 +3277,78 @@ peep2_reinit_state (regset live) > COPY_REG_SET (peep2_insn_data[MAX_INSNS_PER_PEEP2].live_before, live); > } > > +/* Copies frame related info of an insn (old_insn) to the single > + insn (new_insn) that was obtained by splitting old_insn. */ By convention, old_insn and new_insn should be in caps, since they refer to parameter names. OK otherwise, thanks. Richard
diff --git gcc/emit-rtl.c gcc/emit-rtl.c index f9b0e9714d9..7cf5704cf14 100644 --- gcc/emit-rtl.c +++ gcc/emit-rtl.c @@ -3822,10 +3822,6 @@ try_split (rtx pat, rtx_insn *trial, int last) int njumps = 0; rtx_insn *call_insn = NULL; - /* We're not good at redistributing frame information. */ - if (RTX_FRAME_RELATED_P (trial)) - return trial; - if (any_condjump_p (trial) && (note = find_reg_note (trial, REG_BR_PROB, 0))) split_branch_probability @@ -3842,6 +3838,7 @@ try_split (rtx pat, rtx_insn *trial, int last) if (!seq) return trial; + int split_insn_count = 0; /* Avoid infinite loop if any insn of the result matches the original pattern. */ insn_last = seq; @@ -3850,11 +3847,87 @@ try_split (rtx pat, rtx_insn *trial, int last) if (INSN_P (insn_last) && rtx_equal_p (PATTERN (insn_last), pat)) return trial; + split_insn_count++; if (!NEXT_INSN (insn_last)) break; insn_last = NEXT_INSN (insn_last); } + /* We're not good at redistributing frame information if + the split results in more than one insn */ + if (RTX_FRAME_RELATED_P (trial)) + { + if (split_insn_count != 1) + return trial; + + // Copy from recog.c:peep2_attempt + bool any_note = false; + rtx note; + rtx_insn *new_insn = seq; + rtx_insn *old_insn = trial; + + /* We have a 1-1 replacement. Copy over any frame-related info. */ + RTX_FRAME_RELATED_P (new_insn) = 1; + + /* Allow the backend to fill in a note during the split. */ + for (note = REG_NOTES (new_insn); note ; note = XEXP (note, 1)) + switch (REG_NOTE_KIND (note)) + { + case REG_FRAME_RELATED_EXPR: + case REG_CFA_DEF_CFA: + case REG_CFA_ADJUST_CFA: + case REG_CFA_OFFSET: + case REG_CFA_REGISTER: + case REG_CFA_EXPRESSION: + case REG_CFA_RESTORE: + case REG_CFA_SET_VDRAP: + any_note = true; + break; + default: + break; + } + + /* If the backend didn't supply a note, copy one over. */ + if (!any_note) + for (note = REG_NOTES (old_insn); note ; note = XEXP (note, 1)) + switch (REG_NOTE_KIND (note)) + { + case REG_FRAME_RELATED_EXPR: + case REG_CFA_DEF_CFA: + case REG_CFA_ADJUST_CFA: + case REG_CFA_OFFSET: + case REG_CFA_REGISTER: + case REG_CFA_EXPRESSION: + case REG_CFA_RESTORE: + case REG_CFA_SET_VDRAP: + add_reg_note (new_insn, REG_NOTE_KIND (note), XEXP (note, 0)); + any_note = true; + break; + default: + break; + } + + /* If there still isn't a note, make sure the unwind info sees the + same expression as before the split. */ + if (!any_note) + { + rtx old_set, new_set; + + /* The old insn had better have been simple, or annotated. */ + old_set = single_set (old_insn); + gcc_assert (old_set != NULL); + + new_set = single_set (new_insn); + if (!new_set || !rtx_equal_p (new_set, old_set)) + add_reg_note (new_insn, REG_FRAME_RELATED_EXPR, old_set); + } + + /* Copy prologue/epilogue status. This is required in order to keep + proper placement of EPILOGUE_BEG and the DW_CFA_remember_state. */ + maybe_copy_prologue_epilogue_insn (old_insn, new_insn); + } + + /* We will be adding the new sequence to the function. The splitters may have introduced invalid RTL sharing, so unshare the sequence now. */ unshare_all_rtl_in_chain (seq);