Message ID | 20200106082410.GW10088@tucnak |
---|---|
State | New |
Headers | show |
Series | Fix PowerPC -fstack-clash-protection -mprefixed-addr ICE (PR target/93122) | expand |
On Mon, 2020-01-06 at 09:24 +0100, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, the following testcase ICEs because rs, while valid > add_operand is not valid add_cint_operand and so gen_add3_insn fails, > because it doesn't meet the expander predicates. > > Fixed thusly, bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk? > > Another option would be to: > 1) always use gen_add3_insn, but if it returns NULL use the extra > emit_move_insn (end_addr, GEN_INT (-rounded_size)); > and retry with end_addr instead of rs > 2) if the first gen_add3_insn returned NULL or if it created more than one > instruction, add the REG_FRAME_RELATED_EXPR note on the last insn > > Is that what you want to do instead? > > 2020-01-06 Jakub Jelinek <jakub@redhat.com> > > PR target/93122 > * config/rs6000/rs6000-logue.c > (rs6000_emit_probe_stack_range_stack_clash): Only use gen_addr3_insn > if add_cint_operand predicate matches. Use rs instead of doing > GEN_INT again. > > * gcc.target/powerpc/pr93122.c: New test. I think this is fine, but give the PPC maintainers a few days to chime in. jeff >
On Mon, Jan 06, 2020 at 11:03:02AM -0700, Jeff Law wrote: > On Mon, 2020-01-06 at 09:24 +0100, Jakub Jelinek wrote: > > Hi! > > > > As mentioned in the PR, the following testcase ICEs because rs, while valid > > add_operand is not valid add_cint_operand and so gen_add3_insn fails, > > because it doesn't meet the expander predicates. > > > > Fixed thusly, bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk? > > > > Another option would be to: > > 1) always use gen_add3_insn, but if it returns NULL use the extra > > emit_move_insn (end_addr, GEN_INT (-rounded_size)); > > and retry with end_addr instead of rs > > 2) if the first gen_add3_insn returned NULL or if it created more than one > > instruction, add the REG_FRAME_RELATED_EXPR note on the last insn > > > > Is that what you want to do instead? > > > > 2020-01-06 Jakub Jelinek <jakub@redhat.com> > > > > PR target/93122 > > * config/rs6000/rs6000-logue.c > > (rs6000_emit_probe_stack_range_stack_clash): Only use gen_addr3_insn > > if add_cint_operand predicate matches. Use rs instead of doing > > GEN_INT again. > > > > * gcc.target/powerpc/pr93122.c: New test. > I think this is fine, but give the PPC maintainers a few days to chime > in. It's not okay, whether I'll need a few days or more to reply. But I will reply soon :-) Segher
Hi! On Mon, Jan 06, 2020 at 02:46:30PM -0600, Segher Boessenkool wrote: > > > Another option would be to: > > > 1) always use gen_add3_insn, but if it returns NULL use the extra > > > emit_move_insn (end_addr, GEN_INT (-rounded_size)); > > > and retry with end_addr instead of rs > > > 2) if the first gen_add3_insn returned NULL or if it created more than one > > > instruction, add the REG_FRAME_RELATED_EXPR note on the last insn > > > > > > Is that what you want to do instead? > > > > > > 2020-01-06 Jakub Jelinek <jakub@redhat.com> > > > > > > PR target/93122 > > > * config/rs6000/rs6000-logue.c > > > (rs6000_emit_probe_stack_range_stack_clash): Only use gen_addr3_insn > > > if add_cint_operand predicate matches. Use rs instead of doing > > > GEN_INT again. > > > > > > * gcc.target/powerpc/pr93122.c: New test. > > I think this is fine, but give the PPC maintainers a few days to chime > > in. > > It's not okay, whether I'll need a few days or more to reply. But I will > reply soon :-) I'd like to ping this patch (or can code up the above option). Thanks. Jakub
Hi! On Mon, Jan 06, 2020 at 09:24:10AM +0100, Jakub Jelinek wrote: > 1) always use gen_add3_insn, but if it returns NULL use the extra > emit_move_insn (end_addr, GEN_INT (-rounded_size)); > and retry with end_addr instead of rs > 2) if the first gen_add3_insn returned NULL or if it created more than one > instruction, add the REG_FRAME_RELATED_EXPR note on the last insn 3) Don't use gen_add3_insn, generate the wanted insns directly. Why can't we do that? gen_add3_insn is fine for generic expansion, but that is not what we are doing here at all. And since not everything it can do is okay here at all apparently, it is a bad choice. It looks like your patch will pessimise code in some cases as well, btw? Segher
On Wed, Jan 22, 2020 at 08:06:03AM -0600, Segher Boessenkool wrote: > On Mon, Jan 06, 2020 at 09:24:10AM +0100, Jakub Jelinek wrote: > > 1) always use gen_add3_insn, but if it returns NULL use the extra > > emit_move_insn (end_addr, GEN_INT (-rounded_size)); > > and retry with end_addr instead of rs > > 2) if the first gen_add3_insn returned NULL or if it created more than one > > instruction, add the REG_FRAME_RELATED_EXPR note on the last insn > > 3) Don't use gen_add3_insn, generate the wanted insns directly. Why > can't we do that? gen_add3_insn is fine for generic expansion, but that > is not what we are doing here at all. And since not everything it can do > is okay here at all apparently, it is a bad choice. I don't see how not using gen_add3_insn would make anything easier, the problem isn't gen_add3_insn, but that the add expander has predicates that need to be satisfied, and even if I build the insn by hand, I'll still need to make sure the predicates are satisfied and for that will need to check them anyway. > It looks like your patch will pessimise code in some cases as well, btw? No, it will solely turn previous wrong-codes into something that works (== cases where gen_addr3_insn would previously fail). The 1)+2) variant could even improve code, as gen_addr3_insn could succeed even if we currently don't call it (perhaps generate more than one insn, but it still might be better than forcing the constant into register and then performing add that way). Jakub
On Wed, Jan 22, 2020 at 03:24:54PM +0100, Jakub Jelinek wrote: > > It looks like your patch will pessimise code in some cases as well, btw? > > No, it will solely turn previous wrong-codes into something that works > (== cases where gen_addr3_insn would previously fail). > The 1)+2) variant could even improve code, as gen_addr3_insn could succeed > even if we currently don't call it (perhaps generate more than one insn, > but it still might be better than forcing the constant into register and > then performing add that way). Here is what I meant as the alternative, i.e. don't check any predicates, just gen_add3_insn, if that fails, force rs into register and retry. And, add REG_FRAME_RELATED_EXPR note always when we haven't emitted a single insn that has rtl exactly matching what we'd add the REG_FRAME_RELATED_EXPR with (in that case, dwarf2cfi.c is able to figure it out by itself, no need to waste compile time memory). Ok for trunk if it passes bootstrap/regtest? 2020-01-30 Jakub Jelinek <jakub@redhat.com> PR target/93122 * config/rs6000/rs6000-logue.c (rs6000_emit_probe_stack_range_stack_clash): Always use gen_add3_insn, if it fails, move rs into end_addr and retry. Add REG_FRAME_RELATED_EXPR note whenever it returns more than one insn or the insn pattern doesn't describe well what exactly happens to dwarf2cfi.c. * gcc.target/powerpc/pr93122.c: New test. --- gcc/config/rs6000/rs6000-logue.c.jj 2020-01-12 11:54:36.395413680 +0100 +++ gcc/config/rs6000/rs6000-logue.c 2020-01-30 16:54:28.646943559 +0100 @@ -1604,20 +1604,32 @@ rs6000_emit_probe_stack_range_stack_clas rtx end_addr = copy_reg ? gen_rtx_REG (Pmode, 0) : gen_rtx_REG (Pmode, 12); rtx rs = GEN_INT (-rounded_size); - rtx_insn *insn; - if (add_operand (rs, Pmode)) - insn = emit_insn (gen_add3_insn (end_addr, stack_pointer_rtx, rs)); - else + rtx_insn *insn = gen_add3_insn (end_addr, stack_pointer_rtx, rs); + if (insn == NULL) + { + emit_move_insn (end_addr, rs); + insn = gen_add3_insn (end_addr, end_addr, stack_pointer_rtx); + gcc_assert (insn); + } + rtx set; + if (!NONJUMP_INSN_P (insn) + || NEXT_INSN (insn) + || (set = single_set (insn)) == NULL_RTX + || SET_DEST (set) != end_addr + || GET_CODE (SET_SRC (set)) != PLUS + || XEXP (SET_SRC (set), 0) != stack_pointer_rtx + || XEXP (SET_SRC (set), 1) != rs) { - emit_move_insn (end_addr, GEN_INT (-rounded_size)); - insn = emit_insn (gen_add3_insn (end_addr, end_addr, - stack_pointer_rtx)); - /* Describe the effect of INSN to the CFI engine. */ + insn = emit_insn (insn); + /* Describe the effect of INSN to the CFI engine, unless it + is a single insn that describes it itself. */ add_reg_note (insn, REG_FRAME_RELATED_EXPR, gen_rtx_SET (end_addr, gen_rtx_PLUS (Pmode, stack_pointer_rtx, rs))); } + else + insn = emit_insn (insn); RTX_FRAME_RELATED_P (insn) = 1; /* Emit the loop. */ --- gcc/testsuite/gcc.target/powerpc/pr93122.c.jj 2020-01-30 16:42:14.255927311 +0100 +++ gcc/testsuite/gcc.target/powerpc/pr93122.c 2020-01-30 16:42:14.255927311 +0100 @@ -0,0 +1,12 @@ +/* PR target/93122 */ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-fstack-clash-protection -mprefixed-addr -mfuture" } */ + +void bar (char *); + +void +foo (void) +{ + char s[4294967296]; + bar (s); +} Jakub
On Thu, Jan 30, 2020 at 05:14:08PM +0100, Jakub Jelinek wrote: > On Wed, Jan 22, 2020 at 03:24:54PM +0100, Jakub Jelinek wrote: > > > It looks like your patch will pessimise code in some cases as well, btw? > > > > No, it will solely turn previous wrong-codes into something that works > > (== cases where gen_addr3_insn would previously fail). > > The 1)+2) variant could even improve code, as gen_addr3_insn could succeed > > even if we currently don't call it (perhaps generate more than one insn, > > but it still might be better than forcing the constant into register and > > then performing add that way). > > Here is what I meant as the alternative, i.e. don't check any predicates, > just gen_add3_insn, if that fails, force rs into register and retry. > And, add REG_FRAME_RELATED_EXPR note always when we haven't emitted a single > insn that has rtl exactly matching what we'd add the REG_FRAME_RELATED_EXPR > with (in that case, dwarf2cfi.c is able to figure it out by itself, no need > to waste compile time memory). > > Ok for trunk if it passes bootstrap/regtest? Successfully bootstrapped/regtested on powerpc64{,le}-linux. > 2020-01-30 Jakub Jelinek <jakub@redhat.com> > > PR target/93122 > * config/rs6000/rs6000-logue.c > (rs6000_emit_probe_stack_range_stack_clash): Always use gen_add3_insn, > if it fails, move rs into end_addr and retry. Add > REG_FRAME_RELATED_EXPR note whenever it returns more than one insn or > the insn pattern doesn't describe well what exactly happens to > dwarf2cfi.c. > > * gcc.target/powerpc/pr93122.c: New test. Jakub
Hi! Sorry for dropping this once again. On Thu, Jan 30, 2020 at 05:14:08PM +0100, Jakub Jelinek wrote: > Here is what I meant as the alternative, i.e. don't check any predicates, > just gen_add3_insn, if that fails, force rs into register and retry. I don't like gen_add3_insn here *at all*, as I said, but okay, you're only fixing existing code. But as long as it is there, this code will be a problem child. > And, add REG_FRAME_RELATED_EXPR note always when we haven't emitted a single > insn that has rtl exactly matching what we'd add the REG_FRAME_RELATED_EXPR > with (in that case, dwarf2cfi.c is able to figure it out by itself, no need > to waste compile time memory). I would say "just always emit that note", but that is what the patch does, already :-) > + rtx set; > + if (!NONJUMP_INSN_P (insn) > + || NEXT_INSN (insn) > + || (set = single_set (insn)) == NULL_RTX > + || SET_DEST (set) != end_addr > + || GET_CODE (SET_SRC (set)) != PLUS > + || XEXP (SET_SRC (set), 0) != stack_pointer_rtx > + || XEXP (SET_SRC (set), 1) != rs) Please don't have side effects in conditions. Two nested ifs would be fine here. > + insn = emit_insn (insn); > + /* Describe the effect of INSN to the CFI engine, unless it > + is a single insn that describes it itself. */ > add_reg_note (insn, REG_FRAME_RELATED_EXPR, > gen_rtx_SET (end_addr, > gen_rtx_PLUS (Pmode, stack_pointer_rtx, > rs))); So please fix the comment? Segher
On Thu, Feb 06, 2020 at 01:15:25PM -0600, Segher Boessenkool wrote: > On Thu, Jan 30, 2020 at 05:14:08PM +0100, Jakub Jelinek wrote: > > Here is what I meant as the alternative, i.e. don't check any predicates, > > just gen_add3_insn, if that fails, force rs into register and retry. > > I don't like gen_add3_insn here *at all*, as I said, but okay, you're > only fixing existing code. But as long as it is there, this code will > be a problem child. gen_add3_insn is used 25 times elsewhere in the rs6000 backend when not counting these 2 calls that were just slightly moved around by the patch. > > And, add REG_FRAME_RELATED_EXPR note always when we haven't emitted a single > > insn that has rtl exactly matching what we'd add the REG_FRAME_RELATED_EXPR > > with (in that case, dwarf2cfi.c is able to figure it out by itself, no need > > to waste compile time memory). > > I would say "just always emit that note", but that is what the patch > does, already :-) No, the patch doesn't emit it always, see below. > > + rtx set; > > + if (!NONJUMP_INSN_P (insn) > > + || NEXT_INSN (insn) > > + || (set = single_set (insn)) == NULL_RTX > > + || SET_DEST (set) != end_addr > > + || GET_CODE (SET_SRC (set)) != PLUS > > + || XEXP (SET_SRC (set), 0) != stack_pointer_rtx > > + || XEXP (SET_SRC (set), 1) != rs) > > Please don't have side effects in conditions. Two nested ifs would be > fine here. Ok, so like attached? There is an add_note boolean, so that the insn = emit_insn (insn); doesn't have to be done 3 times. > > + insn = emit_insn (insn); > > + /* Describe the effect of INSN to the CFI engine, unless it > > + is a single insn that describes it itself. */ > > add_reg_note (insn, REG_FRAME_RELATED_EXPR, > > gen_rtx_SET (end_addr, > > gen_rtx_PLUS (Pmode, stack_pointer_rtx, > > rs))); > > So please fix the comment? The point is not to add the REG_FRAME_RELATED_EXPR note in the most common case where it would be just waste of compile time memory. E.g. (insn/f 17 16 18 2 (set (reg:DI 12 12) (plus:DI (reg/f:DI 1 1) (const_int -40960 [0xffffffffffff6000]))) "pr93122-2.c":9:1 -1 (nil)) doesn't need the note, the /f flag on the insn is all that is needed, because the instruction is self-descriptive to dwarf2frame.c. The note we'd add in that case would be (set (reg:DI 12 12) (plus:DI (reg/f:DI 1 1) (const_int -40960))) but that is the PATTERN of the insn already. We need it in all other cases, when there is more than one insn doing that or it isn't like that (end_addr = stack_pointer_rtx + rs). So e.g. on the testcase in the patch, (insn 17 16 18 2 (set (reg:DI 12 12) (const_int -4294967296 [0xffffffff00000000])) "pr93122.c":9:1 -1 (nil)) (insn/f 18 17 19 2 (set (reg:DI 12 12) (plus:DI (reg:DI 12 12) (reg/f:DI 1 1))) "pr93122.c":9:1 -1 (expr_list:REG_FRAME_RELATED_EXPR (set (reg:DI 12 12) (plus:DI (reg/f:DI 1 1) (const_int -4294967296 [0xffffffff00000000]))) (nil))) Only the last insn is /f marked and the effect of it is the one written in the note. 2020-02-06 Jakub Jelinek <jakub@redhat.com> PR target/93122 * config/rs6000/rs6000-logue.c (rs6000_emit_probe_stack_range_stack_clash): Always use gen_add3_insn, if it fails, move rs into end_addr and retry. Add REG_FRAME_RELATED_EXPR note whenever it returns more than one insn or the insn pattern doesn't describe well what exactly happens to dwarf2cfi.c. * gcc.target/powerpc/pr93122.c: New test. --- gcc/config/rs6000/rs6000-logue.c.jj 2020-01-30 17:55:38.606339203 +0100 +++ gcc/config/rs6000/rs6000-logue.c 2020-02-06 20:36:21.511409319 +0100 @@ -1604,20 +1604,34 @@ rs6000_emit_probe_stack_range_stack_clas rtx end_addr = copy_reg ? gen_rtx_REG (Pmode, 0) : gen_rtx_REG (Pmode, 12); rtx rs = GEN_INT (-rounded_size); - rtx_insn *insn; - if (add_operand (rs, Pmode)) - insn = emit_insn (gen_add3_insn (end_addr, stack_pointer_rtx, rs)); + rtx_insn *insn = gen_add3_insn (end_addr, stack_pointer_rtx, rs); + if (insn == NULL) + { + emit_move_insn (end_addr, rs); + insn = gen_add3_insn (end_addr, end_addr, stack_pointer_rtx); + gcc_assert (insn); + } + bool add_note = false; + if (!NONJUMP_INSN_P (insn) || NEXT_INSN (insn)) + add_note = true; else { - emit_move_insn (end_addr, GEN_INT (-rounded_size)); - insn = emit_insn (gen_add3_insn (end_addr, end_addr, - stack_pointer_rtx)); - /* Describe the effect of INSN to the CFI engine. */ - add_reg_note (insn, REG_FRAME_RELATED_EXPR, - gen_rtx_SET (end_addr, - gen_rtx_PLUS (Pmode, stack_pointer_rtx, - rs))); + rtx set = single_set (insn); + if (set == NULL_RTX + || SET_DEST (set) != end_addr + || GET_CODE (SET_SRC (set)) != PLUS + || XEXP (SET_SRC (set), 0) != stack_pointer_rtx + || XEXP (SET_SRC (set), 1) != rs) + add_note = true; } + insn = emit_insn (insn); + if (add_note) + /* Describe the effect of INSN to the CFI engine, unless it + is a single insn that describes it itself. */ + add_reg_note (insn, REG_FRAME_RELATED_EXPR, + gen_rtx_SET (end_addr, + gen_rtx_PLUS (Pmode, stack_pointer_rtx, + rs))); RTX_FRAME_RELATED_P (insn) = 1; /* Emit the loop. */ --- gcc/testsuite/gcc.target/powerpc/pr93122.c.jj 2020-02-06 20:29:58.985147528 +0100 +++ gcc/testsuite/gcc.target/powerpc/pr93122.c 2020-02-06 20:29:58.985147528 +0100 @@ -0,0 +1,12 @@ +/* PR target/93122 */ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-fstack-clash-protection -mprefixed-addr -mfuture" } */ + +void bar (char *); + +void +foo (void) +{ + char s[4294967296]; + bar (s); +} Jakub
Hi again, On Thu, Feb 06, 2020 at 08:51:06PM +0100, Jakub Jelinek wrote: > On Thu, Feb 06, 2020 at 01:15:25PM -0600, Segher Boessenkool wrote: > > On Thu, Jan 30, 2020 at 05:14:08PM +0100, Jakub Jelinek wrote: > > > Here is what I meant as the alternative, i.e. don't check any predicates, > > > just gen_add3_insn, if that fails, force rs into register and retry. > > > > I don't like gen_add3_insn here *at all*, as I said, but okay, you're > > only fixing existing code. But as long as it is there, this code will > > be a problem child. > > gen_add3_insn is used 25 times elsewhere in the rs6000 backend when not > counting these 2 calls that were just slightly moved around by the patch. Yes, and almost none of those cases check for errors. If they really cannot error, they can probably just call one of the actual patterns for the machine instructions directly (like we already do in many more cases). > > > And, add REG_FRAME_RELATED_EXPR note always when we haven't emitted a single > > > insn that has rtl exactly matching what we'd add the REG_FRAME_RELATED_EXPR > > > with (in that case, dwarf2cfi.c is able to figure it out by itself, no need > > > to waste compile time memory). > > > > I would say "just always emit that note", but that is what the patch > > does, already :-) > > No, the patch doesn't emit it always, see below. So move the comment *before* "if (add_note)" then? :-) (I don't think it would be terrible to do it actually always either, fwiw, but this is fine). > 2020-02-06 Jakub Jelinek <jakub@redhat.com> > > PR target/93122 > * config/rs6000/rs6000-logue.c > (rs6000_emit_probe_stack_range_stack_clash): Always use gen_add3_insn, > if it fails, move rs into end_addr and retry. Add > REG_FRAME_RELATED_EXPR note whenever it returns more than one insn or > the insn pattern doesn't describe well what exactly happens to > dwarf2cfi.c. > > * gcc.target/powerpc/pr93122.c: New test. Okay for trunk (and backports if you want those). Thanks for the patch, and thanks for bearing with me. Segher
--- gcc/config/rs6000/rs6000-logue.c.jj 2020-01-01 12:22:32.945491552 +0100 +++ gcc/config/rs6000/rs6000-logue.c 2020-01-02 12:59:09.612734662 +0100 @@ -1605,11 +1605,11 @@ rs6000_emit_probe_stack_range_stack_clas = copy_reg ? gen_rtx_REG (Pmode, 0) : gen_rtx_REG (Pmode, 12); rtx rs = GEN_INT (-rounded_size); rtx_insn *insn; - if (add_operand (rs, Pmode)) + if (add_cint_operand (rs, Pmode) && add_operand (rs, Pmode)) insn = emit_insn (gen_add3_insn (end_addr, stack_pointer_rtx, rs)); else { - emit_move_insn (end_addr, GEN_INT (-rounded_size)); + emit_move_insn (end_addr, rs); insn = emit_insn (gen_add3_insn (end_addr, end_addr, stack_pointer_rtx)); /* Describe the effect of INSN to the CFI engine. */ --- gcc/testsuite/gcc.target/powerpc/pr93122.c.jj 2020-01-02 13:07:07.022459554 +0100 +++ gcc/testsuite/gcc.target/powerpc/pr93122.c 2020-01-06 07:05:31.397917021 +0100 @@ -0,0 +1,12 @@ +/* PR target/93122 */ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-fstack-clash-protection -mprefixed-addr -mfuture" } */ + +void bar (char *); + +void +foo (void) +{ + char s[4294967296]; + bar (s); +}