diff mbox series

[PR,target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64

Message ID 50ca58cf-f2e2-40d3-c3dc-f4bfc7a80f74@redhat.com
State New
Headers show
Series [PR,target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64 | expand

Commit Message

Jeff Law Jan. 2, 2018, 8:02 p.m. UTC
This time with the patch.



-------- Forwarded Message --------
Subject: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash
protected noreturn function on x86/x86_64
Date: Tue, 2 Jan 2018 11:58:04 -0700
From: Jeff Law <law@redhat.com>
To: gcc-patches <gcc-patches@gcc.gnu.org>

Oh how I sometimes wish we'd never opened the can of worms WRT stack
clash protection of noreturn functions.

In this BZ we have a noreturn function.  So we trigger the special bits
to emit a push/pop sequence to explicitly probe *sp.  For ia32 we
push/pop %esi.

The pop %esi tells the generic CFI machinery that %esi's value is
returned to its state in the caller.  But that's not entirely correct as
the value will be over written in the body of the function.

This situation shows up in some of the nptl code within glibc
(pthread_unwind).  This in turn is believed to cause giac to behave
improperly.

--

It's fairly obvious that the probe of *sp isn't actually necessary here
because the register saves in the prologue act as probe points for *sp.

In fact, the only way this can ever cause problems is if %esi is used in
the body in which case it would have been callee saved in the prologue.
So if we detect that %esi is already callee saved in the prologue then
we could eliminate the explicit probe of *sp.

But we can do even better.  If any register is saved in the prologue,
then that callee register save functions as an implicit probe of *sp and
we do not need to explicitly probe *sp.

While this was reported with -m32, I'm pretty sure we can trigger a
similar issue on x86_64.

Bootstrapped and regression tested on x86_64.  Also verified the
testcase behavior on -m32.  The test uses flags to hopefully ensure
expected behavior on x86/Solaris, but I did not explicitly test that
configuration.

OK for the trunk?

Jeff
* config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not
	explicitly probe *sp in a noreturn function if there were any callee
	register saves.

	* gcc.target/i386/stack-check-17.c: New test

Comments

Florian Weimer Jan. 2, 2018, 10:05 p.m. UTC | #1
On 01/02/2018 09:02 PM, Jeff Law wrote:
> 	* config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not

Typo: “adjut”.

> 	explicitly probe *sp in a noreturn function if there were any callee
> 	register saves.

I recompiled glibc with this patch, and I can confirm it fixes the new 
glibc test:

   https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html

However, I would appreciate if it were possible to avoid emitting the 
.cfi_offset/.cfi_register annotations and only record the change of 
frame address.  The other CFI notes aren't needed, and it would avoid 
reintroducing this bug if the way the prologue is constructed changes 
and the condition for emitting the probe is not completely correct anymore.

Thanks,
Florian
Jeff Law Jan. 2, 2018, 11:22 p.m. UTC | #2
On 01/02/2018 03:05 PM, Florian Weimer wrote:
> On 01/02/2018 09:02 PM, Jeff Law wrote:
>>     * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not
> 
> Typo: “adjut”.
> 
>>     explicitly probe *sp in a noreturn function if there were any callee
>>     register saves.
> 
> I recompiled glibc with this patch, and I can confirm it fixes the new
> glibc test:
> 
>   https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html
> 
> However, I would appreciate if it were possible to avoid emitting the
> .cfi_offset/.cfi_register annotations and only record the change of
> frame address.  The other CFI notes aren't needed, and it would avoid
> reintroducing this bug if the way the prologue is constructed changes
> and the condition for emitting the probe is not completely correct anymore.
I'm not aware of a way to do that.  I'm not even sure having the ability
to tell the CFI machinery to avoid that stuff is a good idea from a
design/implementation standpoint.

What we could do is beef up the testsuite checks to verify there are no
cfi restores and possibly add some asserts in the CFI machinery to
verify they do not emit a .cfi_restore in a noreturn function.  The
former it obviously trivial -- the latter may have fallout I'm not aware
of, particularly since I know very little about the CFI bits.

jeff
Jeff Law Jan. 3, 2018, 8:26 a.m. UTC | #3
On 01/02/2018 04:22 PM, Jeff Law wrote:
> On 01/02/2018 03:05 PM, Florian Weimer wrote:
>> On 01/02/2018 09:02 PM, Jeff Law wrote:
>>>     * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not
>>
>> Typo: “adjut”.
>>
>>>     explicitly probe *sp in a noreturn function if there were any callee
>>>     register saves.
>>
>> I recompiled glibc with this patch, and I can confirm it fixes the new
>> glibc test:
>>
>>   https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html
>>
>> However, I would appreciate if it were possible to avoid emitting the
>> .cfi_offset/.cfi_register annotations and only record the change of
>> frame address.  The other CFI notes aren't needed, and it would avoid
>> reintroducing this bug if the way the prologue is constructed changes
>> and the condition for emitting the probe is not completely correct anymore.
> I'm not aware of a way to do that.  I'm not even sure having the ability
> to tell the CFI machinery to avoid that stuff is a good idea from a
> design/implementation standpoint.
So Jakub pointed out in IRC we can use a REG_FRAME_RELATED_EXPR note to
control more precisely what cfis are emitted.  But that introduces its
own set of issues.   I suspect we're more likely to muck up generating
the note explicitly than we are to muck up the condition controlling
whether or not we need the explicit *sp probe.

The way we run into problems is if the CFI notes do not reflect reality
-- for example, indicating a register is saved at a location where it
isn't actually saved, indicating a register was restored to its entry
value when that's not true throughout the function, or mucking up the
CFA offset.

The way to get into those first two states is if we have a redundant
push/pop pair and its associated notes for the *sp explicit probe,
particularly for %esi on x86 or %rax on x86_64.  An additional comment
seems warranted though.  And testing that we don't have a cfi restore in
the regression test also seems warranted.  Consider those added :-)
I'll come up with the exact comment text in the AM.  While avoiding the
cfi notes seems useful, I doubt it's really going to improve things in
practice.

In theory if someone completely rewrote the prologue code they could
potentially introduce all kinds of problems, but I don't see how your
suggested change would really help there either.  Comments in the code
as well as tests verifying the output seem to be the way to go here as well.


Jeff
Jakub Jelinek Jan. 3, 2018, 11:57 a.m. UTC | #4
On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote:
> It's fairly obvious that the probe of *sp isn't actually necessary here
> because the register saves in the prologue act as probe points for *sp.
> 
> In fact, the only way this can ever cause problems is if %esi is used in
> the body in which case it would have been callee saved in the prologue.
> So if we detect that %esi is already callee saved in the prologue then
> we could eliminate the explicit probe of *sp.
> 
> But we can do even better.  If any register is saved in the prologue,
> then that callee register save functions as an implicit probe of *sp and
> we do not need to explicitly probe *sp.
> 
> While this was reported with -m32, I'm pretty sure we can trigger a
> similar issue on x86_64.
> 
> Bootstrapped and regression tested on x86_64.  Also verified the
> testcase behavior on -m32.  The test uses flags to hopefully ensure
> expected behavior on x86/Solaris, but I did not explicitly test that
> configuration.
> 
> OK for the trunk?
> 
> Jeff
> 

Missing
	PR target/83641
here.

> 	* config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not

adjust as Florian reported.

> 	explicitly probe *sp in a noreturn function if there were any callee
> 	register saves.

" or frame pointer is needed" ?

> 
> 	* gcc.target/i386/stack-check-17.c: New test

Missing full stop after New test

This is a nice optimization, ok for trunk.

I think we do not want to put anything into the CFI even if the condition
you've added doesn't trigger, that is a pure space and runtime cycle waste,
except for noting the sp change if it is the current CFA.  Even after the
push the register value still holds the caller's value, and after the pop
too.  ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't
want to do, if dummy_reg happened to be a drap reg, we'd emit really weird
stuff, the cfa restore, etc.  There are many other spots that gen_pop
themselves and do the appropriate thing at that spot, instead of
all using ix86_emit_restore_reg_using_pop.

Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?

2018-01-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/83641
	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
	noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
	only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
	and add REG_CFA_ADJUST_CFA notes in that case to both insns.

--- gcc/config/i386/i386.c.jj	2018-01-03 10:20:06.495535771 +0100
+++ gcc/config/i386/i386.c	2018-01-03 12:32:01.747649506 +0100
@@ -12229,9 +12229,21 @@ ix86_adjust_stack_and_probe_stack_clash
 	 argument passing registers so as not to introduce dependencies in
 	 the pipeline.  For 32 bit we use %esi and for 64 bit we use %rax.  */
       rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
-      rtx_insn *insn = emit_insn (gen_push (dummy_reg));
-      RTX_FRAME_RELATED_P (insn) = 1;
-      ix86_emit_restore_reg_using_pop (dummy_reg);
+      rtx_insn *insn_push = emit_insn (gen_push (dummy_reg));
+      rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg));
+      m->fs.sp_offset -= UNITS_PER_WORD;
+      if (m->fs.cfa_reg == stack_pointer_rtx)
+	{
+	  m->fs.cfa_offset -= UNITS_PER_WORD;
+	  rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD);
+	  x = gen_rtx_SET (stack_pointer_rtx, x);
+	  add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x);
+	  RTX_FRAME_RELATED_P (insn_push) = 1;
+	  x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+	  x = gen_rtx_SET (stack_pointer_rtx, x);
+	  add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x);
+	  RTX_FRAME_RELATED_P (insn_pop) = 1;
+	}
       emit_insn (gen_blockage ());
     }
 


	Jakub
Jeff Law Jan. 3, 2018, 7:57 p.m. UTC | #5
On 01/03/2018 04:57 AM, Jakub Jelinek wrote:
> On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote:
>> It's fairly obvious that the probe of *sp isn't actually necessary here
>> because the register saves in the prologue act as probe points for *sp.
>>
>> In fact, the only way this can ever cause problems is if %esi is used in
>> the body in which case it would have been callee saved in the prologue.
>> So if we detect that %esi is already callee saved in the prologue then
>> we could eliminate the explicit probe of *sp.
>>
>> But we can do even better.  If any register is saved in the prologue,
>> then that callee register save functions as an implicit probe of *sp and
>> we do not need to explicitly probe *sp.
>>
>> While this was reported with -m32, I'm pretty sure we can trigger a
>> similar issue on x86_64.
>>
>> Bootstrapped and regression tested on x86_64.  Also verified the
>> testcase behavior on -m32.  The test uses flags to hopefully ensure
>> expected behavior on x86/Solaris, but I did not explicitly test that
>> configuration.
>>
>> OK for the trunk?
>>
>> Jeff
>>
> 
> Missing
> 	PR target/83641
> here.
> 
>> 	* config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not
> 
> adjust as Florian reported.
> 
>> 	explicitly probe *sp in a noreturn function if there were any callee
>> 	register saves.
> 
> " or frame pointer is needed" ?
> 
>>
>> 	* gcc.target/i386/stack-check-17.c: New test
> 
> Missing full stop after New test
> 
> This is a nice optimization, ok for trunk.
> 
> I think we do not want to put anything into the CFI even if the condition
> you've added doesn't trigger, that is a pure space and runtime cycle waste,
> except for noting the sp change if it is the current CFA.  Even after the
> push the register value still holds the caller's value, and after the pop
> too.  ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't
> want to do, if dummy_reg happened to be a drap reg, we'd emit really weird
> stuff, the cfa restore, etc.  There are many other spots that gen_pop
> themselves and do the appropriate thing at that spot, instead of
> all using ix86_emit_restore_reg_using_pop.
> 
> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
> 
> 2018-01-03  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/83641
> 	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
> 	noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
> 	only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
> 	and add REG_CFA_ADJUST_CFA notes in that case to both insns.
I still question how useful this part is, but not enough to object given
you've done the work.  I'll go ahead and commit both as a single unit.

Jeff
Jakub Jelinek Jan. 3, 2018, 8:36 p.m. UTC | #6
On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
> > Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
> > 
> > 2018-01-03  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR target/83641
> > 	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
> > 	noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
> > 	only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
> > 	and add REG_CFA_ADJUST_CFA notes in that case to both insns.
> I still question how useful this part is, but not enough to object given

Reduces bloat in .eh_frame and when unwinding less work is needed.  The
patch isn't that large after all.

> you've done the work.  I'll go ahead and commit both as a single unit.

BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
my patch, the only regression caused by your patch is:
+FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
+FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
on i686-linux, quite understandably, that is exactly the case you are
changing, there is %edi clobbered in the function and thus needs to be saved
and restored and so no push/pop of %esi is needed.
So, this testcase should be tweaked.

	Jakub
Jeff Law Jan. 3, 2018, 8:49 p.m. UTC | #7
On 01/03/2018 01:36 PM, Jakub Jelinek wrote:
> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
>>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
>>>
>>> 2018-01-03  Jakub Jelinek  <jakub@redhat.com>
>>>
>>> 	PR target/83641
>>> 	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
>>> 	noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
>>> 	only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
>>> 	and add REG_CFA_ADJUST_CFA notes in that case to both insns.
>> I still question how useful this part is, but not enough to object given
> 
> Reduces bloat in .eh_frame and when unwinding less work is needed.  The
> patch isn't that large after all.
> 
>> you've done the work.  I'll go ahead and commit both as a single unit.
> 
> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
> my patch, the only regression caused by your patch is:
> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
> on i686-linux, quite understandably, that is exactly the case you are
> changing, there is %edi clobbered in the function and thus needs to be saved
> and restored and so no push/pop of %esi is needed.
> So, this testcase should be tweaked.
I think the ASM can just be dropped.  It doesn't contribute anything to
what Florian was trying to show.   Once we drop the ASM we should see
those probes return.  I'll have to check that obviously.

jeff
Jeff Law Jan. 3, 2018, 9:25 p.m. UTC | #8
On 01/03/2018 01:49 PM, Jeff Law wrote:
> On 01/03/2018 01:36 PM, Jakub Jelinek wrote:
>> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
>>>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
>>>>
>>>> 2018-01-03  Jakub Jelinek  <jakub@redhat.com>
>>>>
>>>> 	PR target/83641
>>>> 	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
>>>> 	noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
>>>> 	only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
>>>> 	and add REG_CFA_ADJUST_CFA notes in that case to both insns.
>>> I still question how useful this part is, but not enough to object given
>>
>> Reduces bloat in .eh_frame and when unwinding less work is needed.  The
>> patch isn't that large after all.
>>
>>> you've done the work.  I'll go ahead and commit both as a single unit.
>>
>> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
>> my patch, the only regression caused by your patch is:
>> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
>> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
>> on i686-linux, quite understandably, that is exactly the case you are
>> changing, there is %edi clobbered in the function and thus needs to be saved
>> and restored and so no push/pop of %esi is needed.
>> So, this testcase should be tweaked.
> I think the ASM can just be dropped.  It doesn't contribute anything to
> what Florian was trying to show.   Once we drop the ASM we should see
> those probes return.  I'll have to check that obviously.
Here's the final patch -- only changes are smashing Jakub's and my work
together, adding a comment and the stack-check-12 tweak mentioned above.

Committing to the trunk.

jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8df8f232d80..7aa0920fc65 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2017-01-03  Jakub Jelinek  <jakub@redhat.com>
+	    Jeff Law  <law@redhat.com>
+
+	PR target/83641
+	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
+	noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
+	only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
+	and add REG_CFA_ADJUST_CFA notes in that case to both insns.
+
+	PR target/83641
+	* config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): Do not
+	explicitly probe *sp in a noreturn function if there were any callee
+	register saves or frame pointer is needed.
+
 2018-01-03  Jakub Jelinek  <jakub@redhat.com>
 
 	PR debug/83621
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56baaa7e5e8..c363de93e02 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12217,21 +12217,39 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
      pointer could be anywhere in the guard page.  The safe thing
      to do is emit a probe now.
 
+     The probe can be avoided if we have already emitted any callee
+     register saves into the stack or have a frame pointer (which will
+     have been saved as well).  Those saves will function as implicit
+     probes.
+
      ?!? This should be revamped to work like aarch64 and s390 where
      we track the offset from the most recent probe.  Normally that
      offset would be zero.  For a noreturn function we would reset
      it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT).   Then
      we just probe when we cross PROBE_INTERVAL.  */
-  if (TREE_THIS_VOLATILE (cfun->decl))
+  if (TREE_THIS_VOLATILE (cfun->decl)
+      && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed))
     {
       /* We can safely use any register here since we're just going to push
 	 its value and immediately pop it back.  But we do try and avoid
 	 argument passing registers so as not to introduce dependencies in
 	 the pipeline.  For 32 bit we use %esi and for 64 bit we use %rax.  */
       rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
-      rtx_insn *insn = emit_insn (gen_push (dummy_reg));
-      RTX_FRAME_RELATED_P (insn) = 1;
-      ix86_emit_restore_reg_using_pop (dummy_reg);
+      rtx_insn *insn_push = emit_insn (gen_push (dummy_reg));
+      rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg));
+      m->fs.sp_offset -= UNITS_PER_WORD;
+      if (m->fs.cfa_reg == stack_pointer_rtx)
+	{
+	  m->fs.cfa_offset -= UNITS_PER_WORD;
+	  rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD);
+	  x = gen_rtx_SET (stack_pointer_rtx, x);
+	  add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x);
+	  RTX_FRAME_RELATED_P (insn_push) = 1;
+	  x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+	  x = gen_rtx_SET (stack_pointer_rtx, x);
+	  add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x);
+	  RTX_FRAME_RELATED_P (insn_pop) = 1;
+	}
       emit_insn (gen_blockage ());
     }
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2c66a0d007e..1b8e7ad6ea5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2018-01-03  Jeff Law  <law@redhat.com>
+
+	PR target/83641
+	* gcc.target/i386/stack-check-17.c: New test.
+	* gcc.target/i386/stack-check-12.c: Drop unnecessary asm.
+
 2018-01-03  Jakub Jelinek  <jakub@redhat.com>
 
 	PR debug/83621
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-12.c b/gcc/testsuite/gcc.target/i386/stack-check-12.c
index 980416946df..74d3a26cad1 100644
--- a/gcc/testsuite/gcc.target/i386/stack-check-12.c
+++ b/gcc/testsuite/gcc.target/i386/stack-check-12.c
@@ -7,7 +7,6 @@ __attribute__ ((noreturn)) void exit (int);
 __attribute__ ((noreturn)) void
 f (void)
 {
-  asm volatile ("nop" ::: "edi");
   exit (1);
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c b/gcc/testsuite/gcc.target/i386/stack-check-17.c
new file mode 100644
index 00000000000..d2ef83b348a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+
+int x0, x1;
+void f1 (void);
+void f2 (int, int);
+
+__attribute__ ((noreturn))
+void
+f3 (void)
+{ 
+  int y0 = x0;
+  int y1 = x1;
+  f1 ();
+  f2 (y0, y1);
+  while (1);
+}
+
+/* Verify no explicit probes.  */
+/* { dg-final { scan-assembler-not "or\[ql\]" } } */
+
+/* We also want to verify we did not use a push/pop sequence
+   to probe *sp as the callee register saves are sufficient
+   to probe *sp.
+
+   y0/y1 are live across the call and thus must be allocated
+   into either a stack slot or callee saved register.  The former
+   would be rather dumb.  So assume it does not happen.
+
+   So search for two/four pushes for the callee register saves/argument
+   pushes and no pops (since the function has no reachable epilogue).  */
+/* { dg-final { scan-assembler-times "push\[ql\]" 2 { target { ! ia32 } } } }  */
+/* { dg-final { scan-assembler-times "push\[ql\]" 4 { target { ia32 } } } }  */
+/* { dg-final { scan-assembler-not "pop" } } */
+
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9ff9ca4e37f..5bd37d9cda5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12222,7 +12222,8 @@  ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
      offset would be zero.  For a noreturn function we would reset
      it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT).   Then
      we just probe when we cross PROBE_INTERVAL.  */
-  if (TREE_THIS_VOLATILE (cfun->decl))
+  if (TREE_THIS_VOLATILE (cfun->decl)
+      && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed))
     {
       /* We can safely use any register here since we're just going to push
 	 its value and immediately pop it back.  But we do try and avoid
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c b/gcc/testsuite/gcc.target/i386/stack-check-17.c
new file mode 100644
index 00000000000..d2ef83b348a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c
@@ -0,0 +1,37 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+
+int x0, x1;
+void f1 (void);
+void f2 (int, int);
+
+__attribute__ ((noreturn))
+void
+f3 (void)
+{ 
+  int y0 = x0;
+  int y1 = x1;
+  f1 ();
+  f2 (y0, y1);
+  while (1);
+}
+
+/* Verify no explicit probes.  */
+/* { dg-final { scan-assembler-not "or\[ql\]" } } */
+
+/* We also want to verify we did not use a push/pop sequence
+   to probe *sp as the callee register saves are sufficient
+   to probe *sp.
+
+   y0/y1 are live across the call and thus must be allocated
+   into either a stack slot or callee saved register.  The former
+   would be rather dumb.  So assume it does not happen.
+
+   So search for two/four pushes for the callee register saves/argument
+   pushes and no pops (since the function has no reachable epilogue).  */
+/* { dg-final { scan-assembler-times "push\[ql\]" 2 { target { ! ia32 } } } }  */
+/* { dg-final { scan-assembler-times "push\[ql\]" 4 { target { ia32 } } } }  */
+/* { dg-final { scan-assembler-not "pop" } } */
+