Message ID | 50ca58cf-f2e2-40d3-c3dc-f4bfc7a80f74@redhat.com |
---|---|
State | New |
Headers | show |
Series | [PR,target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64 | expand |
On 01/02/2018 09:02 PM, Jeff Law wrote: > * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not Typo: “adjut”. > explicitly probe *sp in a noreturn function if there were any callee > register saves. I recompiled glibc with this patch, and I can confirm it fixes the new glibc test: https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html However, I would appreciate if it were possible to avoid emitting the .cfi_offset/.cfi_register annotations and only record the change of frame address. The other CFI notes aren't needed, and it would avoid reintroducing this bug if the way the prologue is constructed changes and the condition for emitting the probe is not completely correct anymore. Thanks, Florian
On 01/02/2018 03:05 PM, Florian Weimer wrote: > On 01/02/2018 09:02 PM, Jeff Law wrote: >> * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not > > Typo: “adjut”. > >> explicitly probe *sp in a noreturn function if there were any callee >> register saves. > > I recompiled glibc with this patch, and I can confirm it fixes the new > glibc test: > > https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html > > However, I would appreciate if it were possible to avoid emitting the > .cfi_offset/.cfi_register annotations and only record the change of > frame address. The other CFI notes aren't needed, and it would avoid > reintroducing this bug if the way the prologue is constructed changes > and the condition for emitting the probe is not completely correct anymore. I'm not aware of a way to do that. I'm not even sure having the ability to tell the CFI machinery to avoid that stuff is a good idea from a design/implementation standpoint. What we could do is beef up the testsuite checks to verify there are no cfi restores and possibly add some asserts in the CFI machinery to verify they do not emit a .cfi_restore in a noreturn function. The former it obviously trivial -- the latter may have fallout I'm not aware of, particularly since I know very little about the CFI bits. jeff
On 01/02/2018 04:22 PM, Jeff Law wrote: > On 01/02/2018 03:05 PM, Florian Weimer wrote: >> On 01/02/2018 09:02 PM, Jeff Law wrote: >>> * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not >> >> Typo: “adjut”. >> >>> explicitly probe *sp in a noreturn function if there were any callee >>> register saves. >> >> I recompiled glibc with this patch, and I can confirm it fixes the new >> glibc test: >> >> https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html >> >> However, I would appreciate if it were possible to avoid emitting the >> .cfi_offset/.cfi_register annotations and only record the change of >> frame address. The other CFI notes aren't needed, and it would avoid >> reintroducing this bug if the way the prologue is constructed changes >> and the condition for emitting the probe is not completely correct anymore. > I'm not aware of a way to do that. I'm not even sure having the ability > to tell the CFI machinery to avoid that stuff is a good idea from a > design/implementation standpoint. So Jakub pointed out in IRC we can use a REG_FRAME_RELATED_EXPR note to control more precisely what cfis are emitted. But that introduces its own set of issues. I suspect we're more likely to muck up generating the note explicitly than we are to muck up the condition controlling whether or not we need the explicit *sp probe. The way we run into problems is if the CFI notes do not reflect reality -- for example, indicating a register is saved at a location where it isn't actually saved, indicating a register was restored to its entry value when that's not true throughout the function, or mucking up the CFA offset. The way to get into those first two states is if we have a redundant push/pop pair and its associated notes for the *sp explicit probe, particularly for %esi on x86 or %rax on x86_64. An additional comment seems warranted though. And testing that we don't have a cfi restore in the regression test also seems warranted. Consider those added :-) I'll come up with the exact comment text in the AM. While avoiding the cfi notes seems useful, I doubt it's really going to improve things in practice. In theory if someone completely rewrote the prologue code they could potentially introduce all kinds of problems, but I don't see how your suggested change would really help there either. Comments in the code as well as tests verifying the output seem to be the way to go here as well. Jeff
On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote: > It's fairly obvious that the probe of *sp isn't actually necessary here > because the register saves in the prologue act as probe points for *sp. > > In fact, the only way this can ever cause problems is if %esi is used in > the body in which case it would have been callee saved in the prologue. > So if we detect that %esi is already callee saved in the prologue then > we could eliminate the explicit probe of *sp. > > But we can do even better. If any register is saved in the prologue, > then that callee register save functions as an implicit probe of *sp and > we do not need to explicitly probe *sp. > > While this was reported with -m32, I'm pretty sure we can trigger a > similar issue on x86_64. > > Bootstrapped and regression tested on x86_64. Also verified the > testcase behavior on -m32. The test uses flags to hopefully ensure > expected behavior on x86/Solaris, but I did not explicitly test that > configuration. > > OK for the trunk? > > Jeff > Missing PR target/83641 here. > * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not adjust as Florian reported. > explicitly probe *sp in a noreturn function if there were any callee > register saves. " or frame pointer is needed" ? > > * gcc.target/i386/stack-check-17.c: New test Missing full stop after New test This is a nice optimization, ok for trunk. I think we do not want to put anything into the CFI even if the condition you've added doesn't trigger, that is a pure space and runtime cycle waste, except for noting the sp change if it is the current CFA. Even after the push the register value still holds the caller's value, and after the pop too. ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't want to do, if dummy_reg happened to be a drap reg, we'd emit really weird stuff, the cfa restore, etc. There are many other spots that gen_pop themselves and do the appropriate thing at that spot, instead of all using ix86_emit_restore_reg_using_pop. Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? 2018-01-03 Jakub Jelinek <jakub@redhat.com> PR target/83641 * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp and add REG_CFA_ADJUST_CFA notes in that case to both insns. --- gcc/config/i386/i386.c.jj 2018-01-03 10:20:06.495535771 +0100 +++ gcc/config/i386/i386.c 2018-01-03 12:32:01.747649506 +0100 @@ -12229,9 +12229,21 @@ ix86_adjust_stack_and_probe_stack_clash argument passing registers so as not to introduce dependencies in the pipeline. For 32 bit we use %esi and for 64 bit we use %rax. */ rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG); - rtx_insn *insn = emit_insn (gen_push (dummy_reg)); - RTX_FRAME_RELATED_P (insn) = 1; - ix86_emit_restore_reg_using_pop (dummy_reg); + rtx_insn *insn_push = emit_insn (gen_push (dummy_reg)); + rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg)); + m->fs.sp_offset -= UNITS_PER_WORD; + if (m->fs.cfa_reg == stack_pointer_rtx) + { + m->fs.cfa_offset -= UNITS_PER_WORD; + rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_push) = 1; + x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_pop) = 1; + } emit_insn (gen_blockage ()); } Jakub
On 01/03/2018 04:57 AM, Jakub Jelinek wrote: > On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote: >> It's fairly obvious that the probe of *sp isn't actually necessary here >> because the register saves in the prologue act as probe points for *sp. >> >> In fact, the only way this can ever cause problems is if %esi is used in >> the body in which case it would have been callee saved in the prologue. >> So if we detect that %esi is already callee saved in the prologue then >> we could eliminate the explicit probe of *sp. >> >> But we can do even better. If any register is saved in the prologue, >> then that callee register save functions as an implicit probe of *sp and >> we do not need to explicitly probe *sp. >> >> While this was reported with -m32, I'm pretty sure we can trigger a >> similar issue on x86_64. >> >> Bootstrapped and regression tested on x86_64. Also verified the >> testcase behavior on -m32. The test uses flags to hopefully ensure >> expected behavior on x86/Solaris, but I did not explicitly test that >> configuration. >> >> OK for the trunk? >> >> Jeff >> > > Missing > PR target/83641 > here. > >> * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not > > adjust as Florian reported. > >> explicitly probe *sp in a noreturn function if there were any callee >> register saves. > > " or frame pointer is needed" ? > >> >> * gcc.target/i386/stack-check-17.c: New test > > Missing full stop after New test > > This is a nice optimization, ok for trunk. > > I think we do not want to put anything into the CFI even if the condition > you've added doesn't trigger, that is a pure space and runtime cycle waste, > except for noting the sp change if it is the current CFA. Even after the > push the register value still holds the caller's value, and after the pop > too. ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't > want to do, if dummy_reg happened to be a drap reg, we'd emit really weird > stuff, the cfa restore, etc. There are many other spots that gen_pop > themselves and do the appropriate thing at that spot, instead of > all using ix86_emit_restore_reg_using_pop. > > Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? > > 2018-01-03 Jakub Jelinek <jakub@redhat.com> > > PR target/83641 > * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For > noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, > only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp > and add REG_CFA_ADJUST_CFA notes in that case to both insns. I still question how useful this part is, but not enough to object given you've done the work. I'll go ahead and commit both as a single unit. Jeff
On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote: > > Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? > > > > 2018-01-03 Jakub Jelinek <jakub@redhat.com> > > > > PR target/83641 > > * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For > > noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, > > only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp > > and add REG_CFA_ADJUST_CFA notes in that case to both insns. > I still question how useful this part is, but not enough to object given Reduces bloat in .eh_frame and when unwinding less work is needed. The patch isn't that large after all. > you've done the work. I'll go ahead and commit both as a single unit. BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and my patch, the only regression caused by your patch is: +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi on i686-linux, quite understandably, that is exactly the case you are changing, there is %edi clobbered in the function and thus needs to be saved and restored and so no push/pop of %esi is needed. So, this testcase should be tweaked. Jakub
On 01/03/2018 01:36 PM, Jakub Jelinek wrote: > On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote: >>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? >>> >>> 2018-01-03 Jakub Jelinek <jakub@redhat.com> >>> >>> PR target/83641 >>> * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For >>> noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, >>> only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp >>> and add REG_CFA_ADJUST_CFA notes in that case to both insns. >> I still question how useful this part is, but not enough to object given > > Reduces bloat in .eh_frame and when unwinding less work is needed. The > patch isn't that large after all. > >> you've done the work. I'll go ahead and commit both as a single unit. > > BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and > my patch, the only regression caused by your patch is: > +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi > +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi > on i686-linux, quite understandably, that is exactly the case you are > changing, there is %edi clobbered in the function and thus needs to be saved > and restored and so no push/pop of %esi is needed. > So, this testcase should be tweaked. I think the ASM can just be dropped. It doesn't contribute anything to what Florian was trying to show. Once we drop the ASM we should see those probes return. I'll have to check that obviously. jeff
On 01/03/2018 01:49 PM, Jeff Law wrote: > On 01/03/2018 01:36 PM, Jakub Jelinek wrote: >> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote: >>>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? >>>> >>>> 2018-01-03 Jakub Jelinek <jakub@redhat.com> >>>> >>>> PR target/83641 >>>> * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For >>>> noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, >>>> only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp >>>> and add REG_CFA_ADJUST_CFA notes in that case to both insns. >>> I still question how useful this part is, but not enough to object given >> >> Reduces bloat in .eh_frame and when unwinding less work is needed. The >> patch isn't that large after all. >> >>> you've done the work. I'll go ahead and commit both as a single unit. >> >> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and >> my patch, the only regression caused by your patch is: >> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi >> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi >> on i686-linux, quite understandably, that is exactly the case you are >> changing, there is %edi clobbered in the function and thus needs to be saved >> and restored and so no push/pop of %esi is needed. >> So, this testcase should be tweaked. > I think the ASM can just be dropped. It doesn't contribute anything to > what Florian was trying to show. Once we drop the ASM we should see > those probes return. I'll have to check that obviously. Here's the final patch -- only changes are smashing Jakub's and my work together, adding a comment and the stack-check-12 tweak mentioned above. Committing to the trunk. jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8df8f232d80..7aa0920fc65 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,17 @@ +2017-01-03 Jakub Jelinek <jakub@redhat.com> + Jeff Law <law@redhat.com> + + PR target/83641 + * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For + noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, + only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp + and add REG_CFA_ADJUST_CFA notes in that case to both insns. + + PR target/83641 + * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): Do not + explicitly probe *sp in a noreturn function if there were any callee + register saves or frame pointer is needed. + 2018-01-03 Jakub Jelinek <jakub@redhat.com> PR debug/83621 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 56baaa7e5e8..c363de93e02 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -12217,21 +12217,39 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size) pointer could be anywhere in the guard page. The safe thing to do is emit a probe now. + The probe can be avoided if we have already emitted any callee + register saves into the stack or have a frame pointer (which will + have been saved as well). Those saves will function as implicit + probes. + ?!? This should be revamped to work like aarch64 and s390 where we track the offset from the most recent probe. Normally that offset would be zero. For a noreturn function we would reset it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT). Then we just probe when we cross PROBE_INTERVAL. */ - if (TREE_THIS_VOLATILE (cfun->decl)) + if (TREE_THIS_VOLATILE (cfun->decl) + && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed)) { /* We can safely use any register here since we're just going to push its value and immediately pop it back. But we do try and avoid argument passing registers so as not to introduce dependencies in the pipeline. For 32 bit we use %esi and for 64 bit we use %rax. */ rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG); - rtx_insn *insn = emit_insn (gen_push (dummy_reg)); - RTX_FRAME_RELATED_P (insn) = 1; - ix86_emit_restore_reg_using_pop (dummy_reg); + rtx_insn *insn_push = emit_insn (gen_push (dummy_reg)); + rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg)); + m->fs.sp_offset -= UNITS_PER_WORD; + if (m->fs.cfa_reg == stack_pointer_rtx) + { + m->fs.cfa_offset -= UNITS_PER_WORD; + rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_push) = 1; + x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_pop) = 1; + } emit_insn (gen_blockage ()); } diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 2c66a0d007e..1b8e7ad6ea5 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2018-01-03 Jeff Law <law@redhat.com> + + PR target/83641 + * gcc.target/i386/stack-check-17.c: New test. + * gcc.target/i386/stack-check-12.c: Drop unnecessary asm. + 2018-01-03 Jakub Jelinek <jakub@redhat.com> PR debug/83621 diff --git a/gcc/testsuite/gcc.target/i386/stack-check-12.c b/gcc/testsuite/gcc.target/i386/stack-check-12.c index 980416946df..74d3a26cad1 100644 --- a/gcc/testsuite/gcc.target/i386/stack-check-12.c +++ b/gcc/testsuite/gcc.target/i386/stack-check-12.c @@ -7,7 +7,6 @@ __attribute__ ((noreturn)) void exit (int); __attribute__ ((noreturn)) void f (void) { - asm volatile ("nop" ::: "edi"); exit (1); } diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c b/gcc/testsuite/gcc.target/i386/stack-check-17.c new file mode 100644 index 00000000000..d2ef83b348a --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer" } */ +/* { dg-require-effective-target supports_stack_clash_protection } */ + + +int x0, x1; +void f1 (void); +void f2 (int, int); + +__attribute__ ((noreturn)) +void +f3 (void) +{ + int y0 = x0; + int y1 = x1; + f1 (); + f2 (y0, y1); + while (1); +} + +/* Verify no explicit probes. */ +/* { dg-final { scan-assembler-not "or\[ql\]" } } */ + +/* We also want to verify we did not use a push/pop sequence + to probe *sp as the callee register saves are sufficient + to probe *sp. + + y0/y1 are live across the call and thus must be allocated + into either a stack slot or callee saved register. The former + would be rather dumb. So assume it does not happen. + + So search for two/four pushes for the callee register saves/argument + pushes and no pops (since the function has no reachable epilogue). */ +/* { dg-final { scan-assembler-times "push\[ql\]" 2 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "push\[ql\]" 4 { target { ia32 } } } } */ +/* { dg-final { scan-assembler-not "pop" } } */ +
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 9ff9ca4e37f..5bd37d9cda5 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -12222,7 +12222,8 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size) offset would be zero. For a noreturn function we would reset it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT). Then we just probe when we cross PROBE_INTERVAL. */ - if (TREE_THIS_VOLATILE (cfun->decl)) + if (TREE_THIS_VOLATILE (cfun->decl) + && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed)) { /* We can safely use any register here since we're just going to push its value and immediately pop it back. But we do try and avoid diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c b/gcc/testsuite/gcc.target/i386/stack-check-17.c new file mode 100644 index 00000000000..d2ef83b348a --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer" } */ +/* { dg-require-effective-target supports_stack_clash_protection } */ + + +int x0, x1; +void f1 (void); +void f2 (int, int); + +__attribute__ ((noreturn)) +void +f3 (void) +{ + int y0 = x0; + int y1 = x1; + f1 (); + f2 (y0, y1); + while (1); +} + +/* Verify no explicit probes. */ +/* { dg-final { scan-assembler-not "or\[ql\]" } } */ + +/* We also want to verify we did not use a push/pop sequence + to probe *sp as the callee register saves are sufficient + to probe *sp. + + y0/y1 are live across the call and thus must be allocated + into either a stack slot or callee saved register. The former + would be rather dumb. So assume it does not happen. + + So search for two/four pushes for the callee register saves/argument + pushes and no pops (since the function has no reachable epilogue). */ +/* { dg-final { scan-assembler-times "push\[ql\]" 2 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "push\[ql\]" 4 { target { ia32 } } } } */ +/* { dg-final { scan-assembler-not "pop" } } */ +
This time with the patch. -------- Forwarded Message -------- Subject: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64 Date: Tue, 2 Jan 2018 11:58:04 -0700 From: Jeff Law <law@redhat.com> To: gcc-patches <gcc-patches@gcc.gnu.org> Oh how I sometimes wish we'd never opened the can of worms WRT stack clash protection of noreturn functions. In this BZ we have a noreturn function. So we trigger the special bits to emit a push/pop sequence to explicitly probe *sp. For ia32 we push/pop %esi. The pop %esi tells the generic CFI machinery that %esi's value is returned to its state in the caller. But that's not entirely correct as the value will be over written in the body of the function. This situation shows up in some of the nptl code within glibc (pthread_unwind). This in turn is believed to cause giac to behave improperly. -- It's fairly obvious that the probe of *sp isn't actually necessary here because the register saves in the prologue act as probe points for *sp. In fact, the only way this can ever cause problems is if %esi is used in the body in which case it would have been callee saved in the prologue. So if we detect that %esi is already callee saved in the prologue then we could eliminate the explicit probe of *sp. But we can do even better. If any register is saved in the prologue, then that callee register save functions as an implicit probe of *sp and we do not need to explicitly probe *sp. While this was reported with -m32, I'm pretty sure we can trigger a similar issue on x86_64. Bootstrapped and regression tested on x86_64. Also verified the testcase behavior on -m32. The test uses flags to hopefully ensure expected behavior on x86/Solaris, but I did not explicitly test that configuration. OK for the trunk? Jeff * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not explicitly probe *sp in a noreturn function if there were any callee register saves. * gcc.target/i386/stack-check-17.c: New test