Message ID | 6748c592-4a51-604c-8dd1-c5f2bae5b5da@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1 | expand |
Peter Bergner <bergner@linux.ibm.com> writes: > rtl-optimization: ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1 [PR94740] > > We ICE on the test case below because decompose_normal_address() doesn't > expect to see memory operands with constant addresses like below without > a (const:DI ...) wrapped around the PLUS: > > (mem/c:SI (plus:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) > (const_int 4 [0x4])) [1 array+4 S4 A32]) > > What we expect to see is: > > (mem/c:SI (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) > (const_int 4 [0x4]))) [1 arrayD.2903+4 S4 A32]) > > Sometimes, combine adds the (const: ...) for us via simplify_binary_operand > call, but that only happens when combine actually does something with this > insn. The bad address from the test case actually comes from CSE, so the > fix here is to make CSE add the (const: ...) whenever it creates a MEM with > a constant address. I think we should do this in cse_process_notes_1, both to avoid creating an invalid MEM in the first place, and so that we handle embedded MEMs correctly too. Also, the (const:P ...) ought to be there even outside of a MEM. E.g. we ought to have: (set (reg X) (const:P (plus:P (symbol_ref:P S) (const_int D)))) rather than: (set (reg X) (plus:P (symbol_ref:P S) (const_int D))) Adding a PLUS case would fix this example, but I guess a more general fix would be to add a second switch statement (after the first) that switches on GET_RTX_CLASS and uses logic like simplify_replace_fn_rtx: case RTX_BIN_ARITH: case RTX_COMM_ARITH: op0 = simplify_replace_fn_rtx (XEXP (x, 0), old_rtx, fn, data); op1 = simplify_replace_fn_rtx (XEXP (x, 1), old_rtx, fn, data); if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1)) return x; return simplify_gen_binary (code, mode, op0, op1); Maybe we could even replace cse_process_notes_1 with simplify_replace_fn_rtx, but that's obviously a much more invasive change :-) Thanks, Richard
On 4/28/20 2:38 AM, Richard Sandiford wrote: > I think we should do this in cse_process_notes_1, both to avoid creating > an invalid MEM in the first place, and so that we handle embedded MEMs > correctly too. > > Also, the (const:P ...) ought to be there even outside of a MEM. E.g. we > ought to have: > > (set (reg X) (const:P (plus:P (symbol_ref:P S) (const_int D)))) > > rather than: > > (set (reg X) (plus:P (symbol_ref:P S) (const_int D))) I wondered about things such as this, but I couldn't remember ever seeing a (const:P ...) wrapped around a constant expression before. Maybe I just wasn't looking for it??? > Adding a PLUS case would fix this example, but I guess a more general > fix would be to add a second switch statement (after the first) that > switches on GET_RTX_CLASS and uses logic like simplify_replace_fn_rtx: > > case RTX_BIN_ARITH: > case RTX_COMM_ARITH: > op0 = simplify_replace_fn_rtx (XEXP (x, 0), old_rtx, fn, data); > op1 = simplify_replace_fn_rtx (XEXP (x, 1), old_rtx, fn, data); > if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1)) > return x; > return simplify_gen_binary (code, mode, op0, op1); Ok, I'll try this and see whether we survive bootstrap and regtesting. Thanks! Peter
On 4/28/20 2:38 AM, Richard Sandiford wrote: > case RTX_BIN_ARITH: > case RTX_COMM_ARITH: > op0 = simplify_replace_fn_rtx (XEXP (x, 0), old_rtx, fn, data); > op1 = simplify_replace_fn_rtx (XEXP (x, 1), old_rtx, fn, data); > if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1)) > return x; > return simplify_gen_binary (code, mode, op0, op1); Is there a reason you use simplify_replace_fn_rtx here, rather than just using op0 = simplify_rtx (XEXP (x, 0))? Ditto for op1. Does simplify_replace_fn_rtx do something that simplify_rtx doesn't? Peter
Peter Bergner <bergner@linux.ibm.com> writes: > On 4/28/20 2:38 AM, Richard Sandiford wrote: >> case RTX_BIN_ARITH: >> case RTX_COMM_ARITH: >> op0 = simplify_replace_fn_rtx (XEXP (x, 0), old_rtx, fn, data); >> op1 = simplify_replace_fn_rtx (XEXP (x, 1), old_rtx, fn, data); >> if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1)) >> return x; >> return simplify_gen_binary (code, mode, op0, op1); > > Is there a reason you use simplify_replace_fn_rtx here, rather than > just using op0 = simplify_rtx (XEXP (x, 0))? Ditto for op1. > Does simplify_replace_fn_rtx do something that simplify_rtx doesn't? I was just quoting code from simplify_replace_fn_rtx as an example of something that handles a similar situation. The recursive calls would be different for cse_process_notes_1. Sorry for the confusion :-) Richard
On 4/28/20 11:57 AM, Richard Sandiford wrote: > I was just quoting code from simplify_replace_fn_rtx as an example of > something that handles a similar situation. The recursive calls would > be different for cse_process_notes_1. Ok, how about the following which adds the (const:P ...) as well, which fixes the ICE? The FOR loop which now follows this switch statement is what used to modify the PLUS's operands without adding the (const:P ). This is still in the middle of bootstrap and regtesting. Peter gcc/ PR rtl-optimization/94740 * cse.c (cse_process_notes_1): gcc/testsuite/ PR rtl-optimization/94740 * gcc.target/powerpc/pr94740.c: New test. diff --git a/gcc/cse.c b/gcc/cse.c index 5aaba8d80e0..4592820c04c 100644 --- a/gcc/cse.c +++ b/gcc/cse.c @@ -6314,6 +6314,21 @@ cse_process_notes_1 (rtx x, rtx object, bool *changed) break; } + switch (GET_RTX_CLASS (code)) + { + case RTX_BIN_ARITH: + case RTX_COMM_ARITH: + /* Call simplify_gen_binary with our updated operands in case they + are now constant and need to be wrapped with a (const:P ...). */ + validate_change (object, &XEXP (x, 0), + cse_process_notes (XEXP (x, 0), object, changed), false); + validate_change (object, &XEXP (x, 1), + cse_process_notes (XEXP (x, 1), object, changed), false); + return simplify_gen_binary (code, GET_MODE (x), XEXP (x, 0), XEXP (x, 1)); + default: + break; + } + for (i = 0; i < GET_RTX_LENGTH (code); i++) if (fmt[i] == 'e') validate_change (object, &XEXP (x, i), diff --git a/gcc/testsuite/gcc.target/powerpc/pr94740.c b/gcc/testsuite/gcc.target/powerpc/pr94740.c new file mode 100644 index 00000000000..78e40fc84da --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr94740.c @@ -0,0 +1,11 @@ +/* PR rtl-optimization/94740 */ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_future_ok } */ +/* { dg-options "-O2 -mdejagnu-cpu=future -mpcrel" } */ + +int array[8]; +int +foo (void) +{ + return __builtin_bswap32 (array[1]); +}
Peter Bergner <bergner@linux.ibm.com> writes: > On 4/28/20 11:57 AM, Richard Sandiford wrote: >> I was just quoting code from simplify_replace_fn_rtx as an example of >> something that handles a similar situation. The recursive calls would >> be different for cse_process_notes_1. > > Ok, how about the following which adds the (const:P ...) as well, which > fixes the ICE? The FOR loop which now follows this switch statement is > what used to modify the PLUS's operands without adding the (const:P ). > > This is still in the middle of bootstrap and regtesting. > > Peter > > > gcc/ > PR rtl-optimization/94740 > * cse.c (cse_process_notes_1): > > gcc/testsuite/ > PR rtl-optimization/94740 > * gcc.target/powerpc/pr94740.c: New test. > > diff --git a/gcc/cse.c b/gcc/cse.c > index 5aaba8d80e0..4592820c04c 100644 > --- a/gcc/cse.c > +++ b/gcc/cse.c > @@ -6314,6 +6314,21 @@ cse_process_notes_1 (rtx x, rtx object, bool *changed) > break; > } > > + switch (GET_RTX_CLASS (code)) > + { > + case RTX_BIN_ARITH: > + case RTX_COMM_ARITH: > + /* Call simplify_gen_binary with our updated operands in case they > + are now constant and need to be wrapped with a (const:P ...). */ > + validate_change (object, &XEXP (x, 0), > + cse_process_notes (XEXP (x, 0), object, changed), false); > + validate_change (object, &XEXP (x, 1), > + cse_process_notes (XEXP (x, 1), object, changed), false); > + return simplify_gen_binary (code, GET_MODE (x), XEXP (x, 0), XEXP (x, 1)); If we use simplify_gen_binary then I don't think we need to validate each change individually. It should be enough to do something like: x0 = cse_process_notes (XEXP (x, 0), object, changed); x1 = cse_process_notes (XEXP (x, 1), object, changed); if (x0 == XEXP (x, 0) && x1 == XEXP (x, 1)) return x; return simplify_gen_binary (code, GET_MODE (x), x0, x1); Alternatively, in the hope of reducing redundant rtl, I guess we could add the switch statement after the format walk and do: switch (GET_RTX_CLASS (code)) { case RTX_BIN_ARITH: case RTX_COMM_ARITH: if (rtx new_rtx = simplify_binary (code, GET_MODE (x), XEXP (x, 0), XEXP (x, 1))) return new_rtx; break; default: break; } Thanks, Richard
On 4/28/20 5:39 PM, Richard Sandiford wrote: > If we use simplify_gen_binary then I don't think we need to validate > each change individually. It should be enough to do something like: > > x0 = cse_process_notes (XEXP (x, 0), object, changed); > x1 = cse_process_notes (XEXP (x, 1), object, changed); > if (x0 == XEXP (x, 0) && x1 == XEXP (x, 1)) > return x; > return simplify_gen_binary (code, GET_MODE (x), x0, x1); Ok, I wasn't sure whether simplify_gen_binary verified its changes or not. > Alternatively, in the hope of reducing redundant rtl, I guess we could > add the switch statement after the format walk and do: > > switch (GET_RTX_CLASS (code)) > { > case RTX_BIN_ARITH: > case RTX_COMM_ARITH: > if (rtx new_rtx = simplify_binary (code, GET_MODE (x), > XEXP (x, 0), XEXP (x, 1))) > return new_rtx; > break; > default: > break; > } Doesn't moving this after the format walk, create the situation of producing constant expressions without (const:P ...) and then fixing them up after the fact like my original patch did? That's why I placed the previous version before the format walk, but if you're ok with that now, then that's fine with me. That said, the PLUS will be wrapped in a (const:P ...) before it's substituted into the MEM, so maybe it was just the MEM you were worried about? However, do you mean the change to be the following, since I don't think simplify_gen_binary ever returns NULL? validate_change (object, &XEXP (x, i), cse_process_notes (XEXP (x, i), object, changed), 0); + switch (GET_RTX_CLASS (code)) + { + case RTX_BIN_ARITH: + case RTX_COMM_ARITH: + return simplify_gen_binary (code, GET_MODE (x), XEXP (x, 0), XEXP (x, 1)); + default: + break; + } + return x; } Peter
On 4/28/20 6:06 PM, Peter Bergner wrote: > However, do you mean the change to be the following, since I don't think > simplify_gen_binary ever returns NULL? > > > validate_change (object, &XEXP (x, i), > cse_process_notes (XEXP (x, i), object, changed), 0); > > + switch (GET_RTX_CLASS (code)) > + { > + case RTX_BIN_ARITH: > + case RTX_COMM_ARITH: > + return simplify_gen_binary (code, GET_MODE (x), XEXP (x, 0), XEXP (x, 1)); > + default: > + break; > + } > + > return x; > } FYI, this passed bootstrap and regtesting with no regressions. Is the above ok (with comment from previous patch) for trunk? Peter
Hi! On Mon, Apr 27, 2020 at 05:30:24PM -0500, Peter Bergner wrote: > rtl-optimization: ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1 [PR94740] > > We ICE on the test case below because decompose_normal_address() doesn't > expect to see memory operands with constant addresses like below without > a (const:DI ...) wrapped around the PLUS: > > (mem/c:SI (plus:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) > (const_int 4 [0x4])) [1 array+4 S4 A32]) > > What we expect to see is: > > (mem/c:SI (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) > (const_int 4 [0x4]))) [1 arrayD.2903+4 S4 A32]) > > Sometimes, combine adds the (const: ...) for us via simplify_binary_operand > call, but that only happens when combine actually does something with this > insn. The bad address from the test case actually comes from CSE, so the > fix here is to make CSE add the (const: ...) whenever it creates a MEM with > a constant address. So, is it a rule that we always should have CONST here? Where is that documented? > gcc/ > PR rtl-optimization/94740 > * cse.c (cse_process_notes): (Missing changelog entry here). > diff --git a/gcc/cse.c b/gcc/cse.c > index 5aaba8d80e0..cbe9e0a0692 100644 > --- a/gcc/cse.c > +++ b/gcc/cse.c > @@ -6328,6 +6328,14 @@ cse_process_notes (rtx x, rtx object, bool *changed) > rtx new_rtx = cse_process_notes_1 (x, object, changed); > if (new_rtx != x) > *changed = true; > + if (*changed && object != NULL_RTX && MEM_P (object)) > + { > + /* Call simplify_rtx on the updated address in case it is now > + a constant and needs to be wrapped with a (const: ...). */ > + rtx simplified_rtx = simplify_rtx (new_rtx); > + if (simplified_rtx) > + new_rtx = simplified_rtx; > + } > return new_rtx; > } You can just do return simplify_replace_rtx (new_rtx, 0, 0); (it doesn't return 0 if nothing changed, unlike many other simplify* things) for everything inside the if (); but you can even lose the if () itself? (Although, maybe you're doing this all somewhere else now? :-) ) > diff --git a/gcc/testsuite/gcc.target/powerpc/pr94740.c b/gcc/testsuite/gcc.target/powerpc/pr94740.c > new file mode 100644 > index 00000000000..78e40fc84da > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr94740.c > @@ -0,0 +1,11 @@ > +/* PR rtl-optimization/94740 */ > +/* { dg-do compile } */ > +/* { dg-require-effective-target powerpc_future_ok } */ > +/* { dg-options "-O2 -mdejagnu-cpu=future -mpcrel" } */ > + > +int array[8]; > +int > +foo (void) > +{ > + return __builtin_bswap32 (array[1]); > +} That part is okay for trunk, of course :-) Segher
Hi! On Tue, Apr 28, 2020 at 08:38:56AM +0100, Richard Sandiford wrote: > Also, the (const:P ...) ought to be there even outside of a MEM. E.g. we > ought to have: > > (set (reg X) (const:P (plus:P (symbol_ref:P S) (const_int D)))) > > rather than: > > (set (reg X) (plus:P (symbol_ref:P S) (const_int D))) What are the actual rules? Where is this documented? Of course it is highly desirable to have CONST around such constant addresses, but when is it *required*? And what *is* a constant address (in this context)? Segher
On Tue, 2020-04-28 at 20:03 -0500, Segher Boessenkool wrote: > Hi! > > On Tue, Apr 28, 2020 at 08:38:56AM +0100, Richard Sandiford wrote: > > Also, the (const:P ...) ought to be there even outside of a MEM. E.g. we > > ought to have: > > > > (set (reg X) (const:P (plus:P (symbol_ref:P S) (const_int D)))) > > > > rather than: > > > > (set (reg X) (plus:P (symbol_ref:P S) (const_int D))) > > What are the actual rules? Where is this documented? > > Of course it is highly desirable to have CONST around such constant > addresses, but when is it *required*? And what *is* a constant address > (in this context)? I don't know if it's documented, but it's been expected practice for decades. It includes symbolic constants, and the results of simple arithmetic on symbolic constants. It (to the best of my knowledge) does not include logical operations on symbolic constants. Jeff
Peter Bergner <bergner@linux.ibm.com> writes: > On 4/28/20 5:39 PM, Richard Sandiford wrote: >> If we use simplify_gen_binary then I don't think we need to validate >> each change individually. It should be enough to do something like: >> >> x0 = cse_process_notes (XEXP (x, 0), object, changed); >> x1 = cse_process_notes (XEXP (x, 1), object, changed); >> if (x0 == XEXP (x, 0) && x1 == XEXP (x, 1)) >> return x; >> return simplify_gen_binary (code, GET_MODE (x), x0, x1); > > Ok, I wasn't sure whether simplify_gen_binary verified its changes or not. > >> Alternatively, in the hope of reducing redundant rtl, I guess we could >> add the switch statement after the format walk and do: >> >> switch (GET_RTX_CLASS (code)) >> { >> case RTX_BIN_ARITH: >> case RTX_COMM_ARITH: >> if (rtx new_rtx = simplify_binary (code, GET_MODE (x), >> XEXP (x, 0), XEXP (x, 1))) >> return new_rtx; >> break; >> default: >> break; >> } To answer your later question, I meant simplify_binary_operation rather than simplify_binary here. Sorry, I always forget that the simplify_gen_* and non-gen versions use different naming schemes. But then I guess we're effectively back to acting like simplify_rtx, as in your original patch, but without checking whether we're in a MEM. The traditional problem with doing recursive replacement followed by simplification is that we can lose the original mode for the operands of unary operators. E.g. (zero_extend:DI (reg:M X)) can become (zero_extend:DI (const_int -1)), which isn't well-formed. So calling simplify_rtx for all rtxes (not just binary ones) could lead to problems. More below. > Doesn't moving this after the format walk, create the situation of > producing constant expressions without (const:P ...) and then fixing > them up after the fact like my original patch did? That's why I placed > the previous version before the format walk, but if you're ok with that > now, then that's fine with me. That said, the PLUS will be wrapped in > a (const:P ...) before it's substituted into the MEM, so maybe it was > just the MEM you were worried about? To be honest, when reviewing the original patch, I think I'd missed that cse_process_notes and cse_process_notes_1 were co-recursive, and so every level of the MEM would be simplified by your simplify_rtx call. I originally thought it was building up a new MEM and only then simplifying it. But yeah, like you say, the problem is that if we just add the simplification on top, we're still validating the original unsimplified replacement beforehand, so there's still a potential for non-canonical rtl to seep into the validation routine before it gets corrected by a parent call. That might cause problems in future, even if it doesn't yet. Ugh. Unfortunately, there doesn't seem to be a perfect off-the-shelf routine to use here. validate_replace_rtx_part looks like it has the right kind of idea, but can only handle a single "from" and "to" value. simplify_replace_fn_rtx can handle general replacements, but generates more garbage rtl and doesn't verify the new MEMs. If this is a bug we need to fix for GCC 10 then how about this: (1) change the MEM case to something like: { bool addr_changed = false; rtx addr = cse_process_notes (XEXP (x, 0), x, &addr_changed); if (addr_changed && validate_change (x, &XEXP (x, 0), addr, 0)) *changed = true; return x; } which makes it safe to do the validation only at the top level of the MEM. (2) make the: for (i = 0; i < GET_RTX_LENGTH (code); i++) if (fmt[i] == 'e') loop assign directly to XEXP (x, i) if the new rtx is different, instead of using validate_change. This ensures that temporarily invalid rtl doesn't get verified. (passing a null object to validate_change would have the same effect, but that feels a bit hacky) (3) use the simplify_binary_operator thing above, since that at least should be safe, even if it potentially leaves similar bugs unfixed for other operators. Sorry for the runaround :-( Hopefully this area is something we can clean up for GCC 11. Thanks, Richard
On 4/29/20 10:13 AM, Richard Sandiford wrote: > If this is a bug we need to fix for GCC 10 then how about this: I do think this is important to fix. Mike Meissner just switched our -mcpu=future code to enable -mpcrel by default in GCC10, so we're going to see a lot more of these types of addresses than before. > (1) change the MEM case to something like: > > { > bool addr_changed = false; > rtx addr = cse_process_notes (XEXP (x, 0), x, &addr_changed); > if (addr_changed && validate_change (x, &XEXP (x, 0), addr, 0)) > *changed = true; > return x; > } > > which makes it safe to do the validation only at the top level > of the MEM. > > (2) make the: > > for (i = 0; i < GET_RTX_LENGTH (code); i++) > if (fmt[i] == 'e') > > loop assign directly to XEXP (x, i) if the new rtx is different, > instead of using validate_change. This ensures that temporarily > invalid rtl doesn't get verified. > > (passing a null object to validate_change would have the same effect, > but that feels a bit hacky) > > (3) use the simplify_binary_operator thing above, since that at least > should be safe, even if it potentially leaves similar bugs unfixed > for other operators. > > Sorry for the runaround :-( No problem. I appreciate the feedback and I agree we want to get this right. I'll implement the above and report back. Thanks! Peter
Peter Bergner <bergner@linux.ibm.com> writes: > On 4/29/20 10:13 AM, Richard Sandiford wrote: >> If this is a bug we need to fix for GCC 10 then how about this: > > I do think this is important to fix. Mike Meissner just switched our > -mcpu=future code to enable -mpcrel by default in GCC10, so we're going > to see a lot more of these types of addresses than before. OK. >> (1) change the MEM case to something like: >> >> { >> bool addr_changed = false; >> rtx addr = cse_process_notes (XEXP (x, 0), x, &addr_changed); >> if (addr_changed && validate_change (x, &XEXP (x, 0), addr, 0)) >> *changed = true; >> return x; >> } >> >> which makes it safe to do the validation only at the top level >> of the MEM. >> >> (2) make the: >> >> for (i = 0; i < GET_RTX_LENGTH (code); i++) >> if (fmt[i] == 'e') >> >> loop assign directly to XEXP (x, i) if the new rtx is different, >> instead of using validate_change. This ensures that temporarily >> invalid rtl doesn't get verified. >> >> (passing a null object to validate_change would have the same effect, >> but that feels a bit hacky) >> >> (3) use the simplify_binary_operator thing above, since that at least >> should be safe, even if it potentially leaves similar bugs unfixed >> for other operators. >> >> Sorry for the runaround :-( > > No problem. I appreciate the feedback and I agree we want to get this right. > I'll implement the above and report back. Thanks! Sigh. And now of course I realise that that too is flawed. If we make a replacement on a subexpression and the containing expression isn't simplified, the validate_change for the MEM will end up being a no-op, even if it shouldn't be. So in one case we really need the recursive validate_changes, and in one case we really want to avoid them. I think at this point it would be better to go for the simplify_replace_fn_rtx thing I mentioned in the original review, rather than try to hack at the current code even more. What do you think about the attached? Testing in progress. (Sorry for going ahead and writing an alternative patch, since if we do go for this, I guess the earlier misdirections will have wasted two days of your time. But it seemed like I was just never going to think about this PR properly unless I actually tried to write something. :-() Richard
On 4/29/20 3:28 PM, Richard Sandiford wrote: > (Sorry for going ahead and writing an alternative patch, since if we do > go for this, I guess the earlier misdirections will have wasted two days > of your time. But it seemed like I was just never going to think about > this PR properly unless I actually tried to write something. :-() No worries from me! I'm just glad to see this fixed before the release. I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition to your tests (arm & x86_64?). Thanks for your help with this! Peter
On 4/29/20 3:28 PM, Richard Sandiford wrote: > I think at this point it would be better to go for the > simplify_replace_fn_rtx thing I mentioned in the original review, > rather than try to hack at the current code even more. What do you > think about the attached? Testing in progress. My testing is still running, but I like the patch a lot! It's nice when a patch not only fixes a bug, but makes the code *MUCH* simpler too. Well done! Peter
On Wed, Apr 29, 2020 at 08:56:11AM -0600, Jeff Law wrote: > On Tue, 2020-04-28 at 20:03 -0500, Segher Boessenkool wrote: > > What are the actual rules? Where is this documented? > > > > Of course it is highly desirable to have CONST around such constant > > addresses, but when is it *required*? And what *is* a constant address > > (in this context)? > I don't know if it's documented, but it's been expected practice for decades. > > It includes symbolic constants, and the results of simple arithmetic on symbolic > constants. It (to the best of my knowledge) does not include logical operations > on symbolic constants. So should it be done for anything that is "i" as constraint? Or only for constant addresses? In what sense constant, anyway? None of this ever was clear to me... I always did the "if it doesn't ICE anymore, we're good" routine, but *everyone* seems to do that, and that isn't so optimal ;-) Segher
On 4/29/20 4:15 PM, Peter Bergner wrote: > On 4/29/20 3:28 PM, Richard Sandiford wrote: >> (Sorry for going ahead and writing an alternative patch, since if we do >> go for this, I guess the earlier misdirections will have wasted two days >> of your time. But it seemed like I was just never going to think about >> this PR properly unless I actually tried to write something. :-() > > No worries from me! I'm just glad to see this fixed before the release. > I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition > to your tests (arm & x86_64?). Thanks for your help with this! My bootstrap and regtesting of your patch on powerpc64le-linux was clean. Ship it! :-) Peter
On Wed, 2020-04-29 at 17:54 -0500, Segher Boessenkool wrote: > On Wed, Apr 29, 2020 at 08:56:11AM -0600, Jeff Law wrote: > > On Tue, 2020-04-28 at 20:03 -0500, Segher Boessenkool wrote: > > > What are the actual rules? Where is this documented? > > > > > > Of course it is highly desirable to have CONST around such constant > > > addresses, but when is it *required*? And what *is* a constant address > > > (in this context)? > > I don't know if it's documented, but it's been expected practice for decades. > > > > It includes symbolic constants, and the results of simple arithmetic on > > symbolic > > constants. It (to the best of my knowledge) does not include logical > > operations > > on symbolic constants. > > So should it be done for anything that is "i" as constraint? Or only > for constant addresses? In what sense constant, anyway? Only symbolic constants. ie, things involving SYMBOL_REF or LABEL_REF which are constant when the program is run, but the exact value isn't known at compile time. Simple CONST_INT, CONST_DOUBLE and the like do not need the CONST wrapping. Jeff
Peter Bergner <bergner@linux.ibm.com> writes: > On 4/29/20 4:15 PM, Peter Bergner wrote: >> On 4/29/20 3:28 PM, Richard Sandiford wrote: >>> (Sorry for going ahead and writing an alternative patch, since if we do >>> go for this, I guess the earlier misdirections will have wasted two days >>> of your time. But it seemed like I was just never going to think about >>> this PR properly unless I actually tried to write something. :-() >> >> No worries from me! I'm just glad to see this fixed before the release. >> I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition >> to your tests (arm & x86_64?). Thanks for your help with this! > > My bootstrap and regtesting of your patch on powerpc64le-linux was clean. Thanks. aarch64-linux-gnu and x86_64-linux-gnu bootstrap & regtests also came back clean. I'll kick off an arm-linux-gnueabihf one too just to be safe. I guess at this point it needs a review from someone else though. Jeff, WDYT? Attached again below, this time without the shonky mime type. Richard From 39e20b51af8f977a1786ef5c15af80e47415d352 Mon Sep 17 00:00:00 2001 From: Richard Sandiford <richard.sandiford@arm.com> Date: Wed, 29 Apr 2020 20:38:13 +0100 Subject: [PATCH] cse: Use simplify_replace_fn_rtx to process notes [PR94740] cse_process_notes did a very simple substitution, which in the wrong circumstances could create non-canonical RTL and invalid MEMs. Various sticking plasters have been applied to cse_process_notes_1 to handle cases like ZERO_EXTEND, SIGN_EXTEND and UNSIGNED_FLOAT, but I think this PR is a plaster too far. The code is trying hard to avoid creating unnecessary rtl, which of course is a good thing. If we continue to do that, then we can end up changing subexpressions while keeping the containing rtx. This in turn means that validate_change will be a no-op on the containing rtx, even if its contents have changed. So in these cases we have to apply validate_change to the individual subexpressions. On the other hand, if we always apply validate_change to the individual subexpressions, we'll end up calling validate_change on something before it has been simplified and canonicalised. And that's one of the situations we're trying to avoid. There might be a middle ground in which we queue the validate_changes as part of a group, and so can cancel the pending validate_changes for subexpressions if there's a change in the outer expression. But that seems even more ad-hoc than the current code. It would also be quite an invasive change. I think the best thing is just to hook into the existing simplify_replace_fn_rtx function, keeping the REG and MEM handling from cse_process_notes_1 essentially unchanged. It can generate more redundant rtl when a simplification takes place, but it has the advantage of being relative well-used code (both directly and via simplify_replace_rtx). 2020-04-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR rtl-optimization/94740 * cse.c (cse_process_notes_1): Replace with... (cse_process_note_1): ...this new function, acting as a simplify_replace_fn_rtx callback to process_note. Handle only REGs and MEMs directly. Validate the MEM if cse_process_note changes its address. (cse_process_notes): Replace with... (cse_process_note): ...this new function. (cse_extended_basic_block): Update accordingly, iterating over the register notes and passing individual notes to cse_process_note. --- gcc/cse.c | 118 ++++++++++++++++-------------------------------------- 1 file changed, 34 insertions(+), 84 deletions(-) diff --git a/gcc/cse.c b/gcc/cse.c index 5aaba8d80e0..36bcfc354d8 100644 --- a/gcc/cse.c +++ b/gcc/cse.c @@ -585,7 +585,6 @@ static void cse_insn (rtx_insn *); static void cse_prescan_path (struct cse_basic_block_data *); static void invalidate_from_clobbers (rtx_insn *); static void invalidate_from_sets_and_clobbers (rtx_insn *); -static rtx cse_process_notes (rtx, rtx, bool *); static void cse_extended_basic_block (struct cse_basic_block_data *); extern void dump_class (struct table_elt*); static void get_cse_reg_info_1 (unsigned int regno); @@ -6222,75 +6221,28 @@ invalidate_from_sets_and_clobbers (rtx_insn *insn) } } -/* Process X, part of the REG_NOTES of an insn. Look at any REG_EQUAL notes - and replace any registers in them with either an equivalent constant - or the canonical form of the register. If we are inside an address, - only do this if the address remains valid. +static rtx cse_process_note (rtx); - OBJECT is 0 except when within a MEM in which case it is the MEM. +/* A simplify_replace_fn_rtx callback for cse_process_note. Process X, + part of the REG_NOTES of an insn. Replace any registers with either + an equivalent constant or the canonical form of the register. + Only replace addresses if the containing MEM remains valid. - Return the replacement for X. */ + Return the replacement for X, or null if it should be simplified + recursively. */ static rtx -cse_process_notes_1 (rtx x, rtx object, bool *changed) +cse_process_note_1 (rtx x, const_rtx, void *) { - enum rtx_code code = GET_CODE (x); - const char *fmt = GET_RTX_FORMAT (code); - int i; - - switch (code) + if (MEM_P (x)) { - case CONST: - case SYMBOL_REF: - case LABEL_REF: - CASE_CONST_ANY: - case PC: - case CC0: - case LO_SUM: + validate_change (x, &XEXP (x, 0), cse_process_note (XEXP (x, 0)), false); return x; + } - case MEM: - validate_change (x, &XEXP (x, 0), - cse_process_notes (XEXP (x, 0), x, changed), 0); - return x; - - case EXPR_LIST: - if (REG_NOTE_KIND (x) == REG_EQUAL) - XEXP (x, 0) = cse_process_notes (XEXP (x, 0), NULL_RTX, changed); - /* Fall through. */ - - case INSN_LIST: - case INT_LIST: - if (XEXP (x, 1)) - XEXP (x, 1) = cse_process_notes (XEXP (x, 1), NULL_RTX, changed); - return x; - - case SIGN_EXTEND: - case ZERO_EXTEND: - case SUBREG: - { - rtx new_rtx = cse_process_notes (XEXP (x, 0), object, changed); - /* We don't substitute VOIDmode constants into these rtx, - since they would impede folding. */ - if (GET_MODE (new_rtx) != VOIDmode) - validate_change (object, &XEXP (x, 0), new_rtx, 0); - return x; - } - - case UNSIGNED_FLOAT: - { - rtx new_rtx = cse_process_notes (XEXP (x, 0), object, changed); - /* We don't substitute negative VOIDmode constants into these rtx, - since they would impede folding. */ - if (GET_MODE (new_rtx) != VOIDmode - || (CONST_INT_P (new_rtx) && INTVAL (new_rtx) >= 0) - || (CONST_DOUBLE_P (new_rtx) && CONST_DOUBLE_HIGH (new_rtx) >= 0)) - validate_change (object, &XEXP (x, 0), new_rtx, 0); - return x; - } - - case REG: - i = REG_QTY (REGNO (x)); + if (REG_P (x)) + { + int i = REG_QTY (REGNO (x)); /* Return a constant or a constant register. */ if (REGNO_QTY_VALID_P (REGNO (x))) @@ -6309,26 +6261,19 @@ cse_process_notes_1 (rtx x, rtx object, bool *changed) /* Otherwise, canonicalize this register. */ return canon_reg (x, NULL); - - default: - break; } - for (i = 0; i < GET_RTX_LENGTH (code); i++) - if (fmt[i] == 'e') - validate_change (object, &XEXP (x, i), - cse_process_notes (XEXP (x, i), object, changed), 0); - - return x; + return NULL_RTX; } +/* Process X, part of the REG_NOTES of an insn. Replace any registers in it + with either an equivalent constant or the canonical form of the register. + Only replace addresses if the containing MEM remains valid. */ + static rtx -cse_process_notes (rtx x, rtx object, bool *changed) +cse_process_note (rtx x) { - rtx new_rtx = cse_process_notes_1 (x, object, changed); - if (new_rtx != x) - *changed = true; - return new_rtx; + return simplify_replace_fn_rtx (x, NULL_RTX, cse_process_note_1, NULL); } @@ -6623,14 +6568,19 @@ cse_extended_basic_block (struct cse_basic_block_data *ebb_data) { /* Process notes first so we have all notes in canonical forms when looking for duplicate operations. */ - if (REG_NOTES (insn)) - { - bool changed = false; - REG_NOTES (insn) = cse_process_notes (REG_NOTES (insn), - NULL_RTX, &changed); - if (changed) - df_notes_rescan (insn); - } + bool changed = false; + for (rtx note = REG_NOTES (insn); note; note = XEXP (note, 1)) + if (REG_NOTE_KIND (note) == REG_EQUAL) + { + rtx newval = cse_process_note (XEXP (note, 0)); + if (newval != XEXP (note, 0)) + { + XEXP (note, 0) = newval; + changed = true; + } + } + if (changed) + df_notes_rescan (insn); cse_insn (insn);
Richard Sandiford <richard.sandiford@arm.com> writes: > Peter Bergner <bergner@linux.ibm.com> writes: >> On 4/29/20 4:15 PM, Peter Bergner wrote: >>> On 4/29/20 3:28 PM, Richard Sandiford wrote: >>>> (Sorry for going ahead and writing an alternative patch, since if we do >>>> go for this, I guess the earlier misdirections will have wasted two days >>>> of your time. But it seemed like I was just never going to think about >>>> this PR properly unless I actually tried to write something. :-() >>> >>> No worries from me! I'm just glad to see this fixed before the release. >>> I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition >>> to your tests (arm & x86_64?). Thanks for your help with this! >> >> My bootstrap and regtesting of your patch on powerpc64le-linux was clean. > > Thanks. aarch64-linux-gnu and x86_64-linux-gnu bootstrap & regtests > also came back clean. I'll kick off an arm-linux-gnueabihf one too > just to be safe. FTR, that came back clean too. > I guess at this point it needs a review from someone else though. > Jeff, WDYT? Attached again below, this time without the shonky mime type. Here's a different attempt to attach the patch (sorry -- experimenting with different ways of attaching the patch inline, to see how the ML archive reacts.) Richard From 39e20b51af8f977a1786ef5c15af80e47415d352 Mon Sep 17 00:00:00 2001 From: Richard Sandiford <richard.sandiford@arm.com> Date: Wed, 29 Apr 2020 20:38:13 +0100 Subject: [PATCH] cse: Use simplify_replace_fn_rtx to process notes [PR94740] cse_process_notes did a very simple substitution, which in the wrong circumstances could create non-canonical RTL and invalid MEMs. Various sticking plasters have been applied to cse_process_notes_1 to handle cases like ZERO_EXTEND, SIGN_EXTEND and UNSIGNED_FLOAT, but I think this PR is a plaster too far. The code is trying hard to avoid creating unnecessary rtl, which of course is a good thing. If we continue to do that, then we can end up changing subexpressions while keeping the containing rtx. This in turn means that validate_change will be a no-op on the containing rtx, even if its contents have changed. So in these cases we have to apply validate_change to the individual subexpressions. On the other hand, if we always apply validate_change to the individual subexpressions, we'll end up calling validate_change on something before it has been simplified and canonicalised. And that's one of the situations we're trying to avoid. There might be a middle ground in which we queue the validate_changes as part of a group, and so can cancel the pending validate_changes for subexpressions if there's a change in the outer expression. But that seems even more ad-hoc than the current code. It would also be quite an invasive change. I think the best thing is just to hook into the existing simplify_replace_fn_rtx function, keeping the REG and MEM handling from cse_process_notes_1 essentially unchanged. It can generate more redundant rtl when a simplification takes place, but it has the advantage of being relative well-used code (both directly and via simplify_replace_rtx). 2020-04-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR rtl-optimization/94740 * cse.c (cse_process_notes_1): Replace with... (cse_process_note_1): ...this new function, acting as a simplify_replace_fn_rtx callback to process_note. Handle only REGs and MEMs directly. Validate the MEM if cse_process_note changes its address. (cse_process_notes): Replace with... (cse_process_note): ...this new function. (cse_extended_basic_block): Update accordingly, iterating over the register notes and passing individual notes to cse_process_note. --- gcc/cse.c | 118 ++++++++++++++++-------------------------------------- 1 file changed, 34 insertions(+), 84 deletions(-) diff --git a/gcc/cse.c b/gcc/cse.c index 5aaba8d80e0..36bcfc354d8 100644 --- a/gcc/cse.c +++ b/gcc/cse.c @@ -585,7 +585,6 @@ static void cse_insn (rtx_insn *); static void cse_prescan_path (struct cse_basic_block_data *); static void invalidate_from_clobbers (rtx_insn *); static void invalidate_from_sets_and_clobbers (rtx_insn *); -static rtx cse_process_notes (rtx, rtx, bool *); static void cse_extended_basic_block (struct cse_basic_block_data *); extern void dump_class (struct table_elt*); static void get_cse_reg_info_1 (unsigned int regno); @@ -6222,75 +6221,28 @@ invalidate_from_sets_and_clobbers (rtx_insn *insn) } } -/* Process X, part of the REG_NOTES of an insn. Look at any REG_EQUAL notes - and replace any registers in them with either an equivalent constant - or the canonical form of the register. If we are inside an address, - only do this if the address remains valid. +static rtx cse_process_note (rtx); - OBJECT is 0 except when within a MEM in which case it is the MEM. +/* A simplify_replace_fn_rtx callback for cse_process_note. Process X, + part of the REG_NOTES of an insn. Replace any registers with either + an equivalent constant or the canonical form of the register. + Only replace addresses if the containing MEM remains valid. - Return the replacement for X. */ + Return the replacement for X, or null if it should be simplified + recursively. */ static rtx -cse_process_notes_1 (rtx x, rtx object, bool *changed) +cse_process_note_1 (rtx x, const_rtx, void *) { - enum rtx_code code = GET_CODE (x); - const char *fmt = GET_RTX_FORMAT (code); - int i; - - switch (code) + if (MEM_P (x)) { - case CONST: - case SYMBOL_REF: - case LABEL_REF: - CASE_CONST_ANY: - case PC: - case CC0: - case LO_SUM: + validate_change (x, &XEXP (x, 0), cse_process_note (XEXP (x, 0)), false); return x; + } - case MEM: - validate_change (x, &XEXP (x, 0), - cse_process_notes (XEXP (x, 0), x, changed), 0); - return x; - - case EXPR_LIST: - if (REG_NOTE_KIND (x) == REG_EQUAL) - XEXP (x, 0) = cse_process_notes (XEXP (x, 0), NULL_RTX, changed); - /* Fall through. */ - - case INSN_LIST: - case INT_LIST: - if (XEXP (x, 1)) - XEXP (x, 1) = cse_process_notes (XEXP (x, 1), NULL_RTX, changed); - return x; - - case SIGN_EXTEND: - case ZERO_EXTEND: - case SUBREG: - { - rtx new_rtx = cse_process_notes (XEXP (x, 0), object, changed); - /* We don't substitute VOIDmode constants into these rtx, - since they would impede folding. */ - if (GET_MODE (new_rtx) != VOIDmode) - validate_change (object, &XEXP (x, 0), new_rtx, 0); - return x; - } - - case UNSIGNED_FLOAT: - { - rtx new_rtx = cse_process_notes (XEXP (x, 0), object, changed); - /* We don't substitute negative VOIDmode constants into these rtx, - since they would impede folding. */ - if (GET_MODE (new_rtx) != VOIDmode - || (CONST_INT_P (new_rtx) && INTVAL (new_rtx) >= 0) - || (CONST_DOUBLE_P (new_rtx) && CONST_DOUBLE_HIGH (new_rtx) >= 0)) - validate_change (object, &XEXP (x, 0), new_rtx, 0); - return x; - } - - case REG: - i = REG_QTY (REGNO (x)); + if (REG_P (x)) + { + int i = REG_QTY (REGNO (x)); /* Return a constant or a constant register. */ if (REGNO_QTY_VALID_P (REGNO (x))) @@ -6309,26 +6261,19 @@ cse_process_notes_1 (rtx x, rtx object, bool *changed) /* Otherwise, canonicalize this register. */ return canon_reg (x, NULL); - - default: - break; } - for (i = 0; i < GET_RTX_LENGTH (code); i++) - if (fmt[i] == 'e') - validate_change (object, &XEXP (x, i), - cse_process_notes (XEXP (x, i), object, changed), 0); - - return x; + return NULL_RTX; } +/* Process X, part of the REG_NOTES of an insn. Replace any registers in it + with either an equivalent constant or the canonical form of the register. + Only replace addresses if the containing MEM remains valid. */ + static rtx -cse_process_notes (rtx x, rtx object, bool *changed) +cse_process_note (rtx x) { - rtx new_rtx = cse_process_notes_1 (x, object, changed); - if (new_rtx != x) - *changed = true; - return new_rtx; + return simplify_replace_fn_rtx (x, NULL_RTX, cse_process_note_1, NULL); } @@ -6623,14 +6568,19 @@ cse_extended_basic_block (struct cse_basic_block_data *ebb_data) { /* Process notes first so we have all notes in canonical forms when looking for duplicate operations. */ - if (REG_NOTES (insn)) - { - bool changed = false; - REG_NOTES (insn) = cse_process_notes (REG_NOTES (insn), - NULL_RTX, &changed); - if (changed) - df_notes_rescan (insn); - } + bool changed = false; + for (rtx note = REG_NOTES (insn); note; note = XEXP (note, 1)) + if (REG_NOTE_KIND (note) == REG_EQUAL) + { + rtx newval = cse_process_note (XEXP (note, 0)); + if (newval != XEXP (note, 0)) + { + XEXP (note, 0) = newval; + changed = true; + } + } + if (changed) + df_notes_rescan (insn); cse_insn (insn);
On Thu, 2020-04-30 at 08:54 +0100, Richard Sandiford wrote: > Peter Bergner <bergner@linux.ibm.com> writes: > > On 4/29/20 4:15 PM, Peter Bergner wrote: > > > On 4/29/20 3:28 PM, Richard Sandiford wrote: > > > > (Sorry for going ahead and writing an alternative patch, since if we do > > > > go for this, I guess the earlier misdirections will have wasted two days > > > > of your time. But it seemed like I was just never going to think about > > > > this PR properly unless I actually tried to write something. :-() > > > > > > No worries from me! I'm just glad to see this fixed before the release. > > > I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition > > > to your tests (arm & x86_64?). Thanks for your help with this! > > > > My bootstrap and regtesting of your patch on powerpc64le-linux was clean. > > Thanks. aarch64-linux-gnu and x86_64-linux-gnu bootstrap & regtests > also came back clean. I'll kick off an arm-linux-gnueabihf one too > just to be safe. > > I guess at this point it needs a review from someone else though. > Jeff, WDYT? Attached again below, this time without the shonky mime type. It looks reasonable reasonable to me. Re-using simplify_replace_fn_rtx seems like a major simplification, which is definitely good. Presumably one of the major goals here is to get the CONST wrapping from simplify_plus_minus? Jeff ps. Both forms looked the same in my inbox. Not sure how they showed up in the archives.
Jeff Law <law@redhat.com> writes: > On Thu, 2020-04-30 at 08:54 +0100, Richard Sandiford wrote: >> Peter Bergner <bergner@linux.ibm.com> writes: >> > On 4/29/20 4:15 PM, Peter Bergner wrote: >> > > On 4/29/20 3:28 PM, Richard Sandiford wrote: >> > > > (Sorry for going ahead and writing an alternative patch, since if we do >> > > > go for this, I guess the earlier misdirections will have wasted two days >> > > > of your time. But it seemed like I was just never going to think about >> > > > this PR properly unless I actually tried to write something. :-() >> > > >> > > No worries from me! I'm just glad to see this fixed before the release. >> > > I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition >> > > to your tests (arm & x86_64?). Thanks for your help with this! >> > >> > My bootstrap and regtesting of your patch on powerpc64le-linux was clean. >> >> Thanks. aarch64-linux-gnu and x86_64-linux-gnu bootstrap & regtests >> also came back clean. I'll kick off an arm-linux-gnueabihf one too >> just to be safe. >> >> I guess at this point it needs a review from someone else though. >> Jeff, WDYT? Attached again below, this time without the shonky mime type. > It looks reasonable reasonable to me. Re-using simplify_replace_fn_rtx seems > like a major simplification, which is definitely good. Great, thanks! Now pushed to master. > Presumably one of the major goals here is to get the CONST wrapping > from simplify_plus_minus? Yeah, that's right (via very indirect means :-)) > ps. Both forms looked the same in my inbox. Not sure how they showed up in the > archives. The last one came out OK. I'd missed specifying a charset for the second one. Richard
On Thu, 2020-04-30 at 20:02 +0100, Richard Sandiford wrote: > Jeff Law <law@redhat.com> writes: > > On Thu, 2020-04-30 at 08:54 +0100, Richard Sandiford wrote: > > > Peter Bergner <bergner@linux.ibm.com> writes: > > > > On 4/29/20 4:15 PM, Peter Bergner wrote: > > > > > On 4/29/20 3:28 PM, Richard Sandiford wrote: > > > > > > (Sorry for going ahead and writing an alternative patch, since if we > > > > > > do > > > > > > go for this, I guess the earlier misdirections will have wasted two > > > > > > days > > > > > > of your time. But it seemed like I was just never going to think > > > > > > about > > > > > > this PR properly unless I actually tried to write something. :-() > > > > > > > > > > No worries from me! I'm just glad to see this fixed before the > > > > > release. > > > > > I'll kill off a bootstrap and regtest on powerpc64le-linux too, in > > > > > addition > > > > > to your tests (arm & x86_64?). Thanks for your help with this! > > > > > > > > My bootstrap and regtesting of your patch on powerpc64le-linux was clean. > > > > > > Thanks. aarch64-linux-gnu and x86_64-linux-gnu bootstrap & regtests > > > also came back clean. I'll kick off an arm-linux-gnueabihf one too > > > just to be safe. > > > > > > I guess at this point it needs a review from someone else though. > > > Jeff, WDYT? Attached again below, this time without the shonky mime type. > > It looks reasonable reasonable to me. Re-using simplify_replace_fn_rtx seems > > like a major simplification, which is definitely good. > > Great, thanks! Now pushed to master. > > > Presumably one of the major goals here is to get the CONST wrapping > > from simplify_plus_minus? > > Yeah, that's right (via very indirect means :-)) It was certainly indirect :-) Thankfully there's only one place that does CONST wrapping in simplify-rtx, so it was just a matter of walking up the most interesting call sites. Jeff
On 4/30/20 2:02 PM, Richard Sandiford wrote: > Jeff Law <law@redhat.com> writes: >> On Thu, 2020-04-30 at 08:54 +0100, Richard Sandiford wrote: >>> I guess at this point it needs a review from someone else though. >>> Jeff, WDYT? Attached again below, this time without the shonky mime type. >> It looks reasonable reasonable to me. Re-using simplify_replace_fn_rtx seems >> like a major simplification, which is definitely good. > > Great, thanks! Now pushed to master. I pushed the test case to master too. Thank you everyone! Peter
diff --git a/gcc/cse.c b/gcc/cse.c index 5aaba8d80e0..cbe9e0a0692 100644 --- a/gcc/cse.c +++ b/gcc/cse.c @@ -6328,6 +6328,14 @@ cse_process_notes (rtx x, rtx object, bool *changed) rtx new_rtx = cse_process_notes_1 (x, object, changed); if (new_rtx != x) *changed = true; + if (*changed && object != NULL_RTX && MEM_P (object)) + { + /* Call simplify_rtx on the updated address in case it is now + a constant and needs to be wrapped with a (const: ...). */ + rtx simplified_rtx = simplify_rtx (new_rtx); + if (simplified_rtx) + new_rtx = simplified_rtx; + } return new_rtx; } diff --git a/gcc/testsuite/gcc.target/powerpc/pr94740.c b/gcc/testsuite/gcc.target/powerpc/pr94740.c new file mode 100644 index 00000000000..78e40fc84da --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr94740.c @@ -0,0 +1,11 @@ +/* PR rtl-optimization/94740 */ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_future_ok } */ +/* { dg-options "-O2 -mdejagnu-cpu=future -mpcrel" } */ + +int array[8]; +int +foo (void) +{ + return __builtin_bswap32 (array[1]); +}