diff mbox

[SPARC] PR target/80968 Prevent stack loads in return delay slot.

Message ID 20170604.162744.1050556198244389551.davem@davemloft.net
State New
Headers show

Commit Message

David Miller June 4, 2017, 8:27 p.m. UTC
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Sun, 04 Jun 2017 10:32:47 +0200

>> This is an attempt to fix PR target/80968.  This bug has existed
>> basically forever.
>> 
>> The stack_tie sequence seems to be how other targets deal with this
>> issue.  I only emit this when alloca is used.  If there are other
>> conditions that potentially would necessitate such a barrier, just let
>> me know.
> 
> See my comment in the audit trail about the stack tie approach, let's just 
> emit a frame_blockage instead.

That seems to work as well, following is going through a testsuite
run right now:

Comments

Eric Botcazou June 5, 2017, 10:02 p.m. UTC | #1
> That seems to work as well, following is going through a testsuite
> run right now:
> 
> ====================
> [PATCH] sparc: Fix stack references in return delay slot.
> 
> gcc/
> 
> 	PR target/80968
> 	* config/sparc/sparc.c (sparc_expand_prologue): Emit frame
> 	blockage if function uses alloca.

Probably worth applying on all active branches I'd think.
David Miller June 6, 2017, 12:54 a.m. UTC | #2
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Tue, 06 Jun 2017 00:02:06 +0200

>> That seems to work as well, following is going through a testsuite
>> run right now:
>> 
>> ====================
>> [PATCH] sparc: Fix stack references in return delay slot.
>> 
>> gcc/
>> 
>> 	PR target/80968
>> 	* config/sparc/sparc.c (sparc_expand_prologue): Emit frame
>> 	blockage if function uses alloca.
> 
> Probably worth applying on all active branches I'd think.

I agree, this is a really nasty bug.

I'll do that after some more testing.

Eric, thanks for reviewing and your advice to use frame blockage!
David Miller June 6, 2017, 7:02 p.m. UTC | #3
From: David Miller <davem@davemloft.net>
Date: Mon, 05 Jun 2017 20:54:46 -0400 (EDT)

> From: Eric Botcazou <ebotcazou@adacore.com>
> Date: Tue, 06 Jun 2017 00:02:06 +0200
> 
>>> That seems to work as well, following is going through a testsuite
>>> run right now:
>>> 
>>> ====================
>>> [PATCH] sparc: Fix stack references in return delay slot.
>>> 
>>> gcc/
>>> 
>>> 	PR target/80968
>>> 	* config/sparc/sparc.c (sparc_expand_prologue): Emit frame
>>> 	blockage if function uses alloca.
>> 
>> Probably worth applying on all active branches I'd think.
> 
> I agree, this is a really nasty bug.
> 
> I'll do that after some more testing.

This is now done, thanks again.
David Miller June 9, 2017, 6:56 p.m. UTC | #4
From: David Miller <davem@davemloft.net>
Date: Tue, 06 Jun 2017 15:02:55 -0400 (EDT)

> From: David Miller <davem@davemloft.net>
> Date: Mon, 05 Jun 2017 20:54:46 -0400 (EDT)
> 
>> From: Eric Botcazou <ebotcazou@adacore.com>
>> Date: Tue, 06 Jun 2017 00:02:06 +0200
>> 
>>>> That seems to work as well, following is going through a testsuite
>>>> run right now:
>>>> 
>>>> ====================
>>>> [PATCH] sparc: Fix stack references in return delay slot.
>>>> 
>>>> gcc/
>>>> 
>>>> 	PR target/80968
>>>> 	* config/sparc/sparc.c (sparc_expand_prologue): Emit frame
>>>> 	blockage if function uses alloca.
>>> 
>>> Probably worth applying on all active branches I'd think.
>> 
>> I agree, this is a really nasty bug.
>> 
>> I'll do that after some more testing.
> 
> This is now done, thanks again.

Eric, after some more testing it turns out that we need something
more for gcc-5 and gcc-6 to cover all cases.

The problem is that before gcc-7, the compiler can emit return
instructions directly without going through the epilogue expander.

So I have to add the frame barrier emission to our return expander as
well, which I will work on right now.

Just FYI...
Eric Botcazou June 9, 2017, 8:13 p.m. UTC | #5
> Eric, after some more testing it turns out that we need something
> more for gcc-5 and gcc-6 to cover all cases.

Hmm, yes, I should have thought of that...

> The problem is that before gcc-7, the compiler can emit return
> instructions directly without going through the epilogue expander.

That should also be true with GCC 7 and later, no?
David Miller June 10, 2017, 5:59 p.m. UTC | #6
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Fri, 09 Jun 2017 22:13:20 +0200

>> Eric, after some more testing it turns out that we need something
>> more for gcc-5 and gcc-6 to cover all cases.
> 
> Hmm, yes, I should have thought of that...
> 
>> The problem is that before gcc-7, the compiler can emit return
>> instructions directly without going through the epilogue expander.
> 
> That should also be true with GCC 7 and later, no?

I do not see a direct gen_return happening in function.c in the gcc-7
branch.

Is it somewhere else?
Eric Botcazou June 12, 2017, 9:27 a.m. UTC | #7
> I do not see a direct gen_return happening in function.c in the gcc-7
> branch.
> 
> Is it somewhere else?

There is a call from force_nonfallthru_and_redirect in cfgrtl.c AFAICS.

So the code generated for your testcase is less optimized with GCC 7 and later 
than with GCC 6 and earlier?
David Miller June 12, 2017, 2:02 p.m. UTC | #8
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Mon, 12 Jun 2017 11:27:10 +0200

>> I do not see a direct gen_return happening in function.c in the gcc-7
>> branch.
>> 
>> Is it somewhere else?
> 
> There is a call from force_nonfallthru_and_redirect in cfgrtl.c AFAICS.
> 
> So the code generated for your testcase is less optimized with GCC 7 and later 
> than with GCC 6 and earlier?

Ok, I see.

That invocation from cfgrtl.c is not triggered for my testcase in gcc-7 and
later.

I'll update the return pattern in the gcc-7 branch and mainline in order to
cover all possible cases.
diff mbox

Patch

====================
[PATCH] sparc: Fix stack references in return delay slot.

gcc/

	PR target/80968
	* config/sparc/sparc.c (sparc_expand_prologue): Emit frame
	blockage if function uses alloca.

gcc/testsuite/

	* gcc.target/sparc/sparc-ret-3.c: New test.
---
 gcc/config/sparc/sparc.c                     |  3 ++
 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c | 53 ++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 6dfb269..95a64a4 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -5792,6 +5792,9 @@  sparc_expand_epilogue (bool for_eh)
 {
   HOST_WIDE_INT size = sparc_frame_size;
 
+  if (cfun->calls_alloca)
+    emit_insn (gen_frame_blockage ());
+
   if (sparc_n_global_fp_regs > 0)
     emit_save_or_restore_global_fp_regs (sparc_frame_base_reg,
 				         sparc_frame_base_offset
diff --git a/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c
new file mode 100644
index 0000000..7a151f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c
@@ -0,0 +1,53 @@ 
+/* PR target/80968 */
+/* { dg-do compile } */
+/* { dg-skip-if "no register windows" { *-*-* } { "-mflat" } { "" } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-mcpu=ultrasparc -O" } */
+
+/* Make sure references to the stack frame do not slip into the delay slot
+   of a return instruction.  */
+
+struct crypto_shash {
+	unsigned int descsize;
+};
+struct crypto_shash *tfm;
+
+struct shash_desc {
+	struct crypto_shash *tfm;
+	unsigned int flags;
+
+	void *__ctx[] __attribute__((aligned(8)));
+};
+
+static inline unsigned int crypto_shash_descsize(struct crypto_shash *tfm)
+{
+	return tfm->descsize;
+}
+
+static inline void *shash_desc_ctx(struct shash_desc *desc)
+{
+	return desc->__ctx;
+}
+
+#define SHASH_DESC_ON_STACK(shash, ctx)				  \
+	char __##shash##_desc[sizeof(struct shash_desc) +	  \
+			      crypto_shash_descsize(ctx)] __attribute__((aligned(8))); \
+	struct shash_desc *shash = (struct shash_desc *)__##shash##_desc
+
+extern int crypto_shash_update(struct shash_desc *, const void *, unsigned int);
+
+unsigned int bug(unsigned int crc, const void *address, unsigned int length)
+{
+	SHASH_DESC_ON_STACK(shash, tfm);
+	unsigned int *ctx = (unsigned int *)shash_desc_ctx(shash);
+	int err;
+
+	shash->tfm = tfm;
+	shash->flags = 0;
+	*ctx = crc;
+
+	err = crypto_shash_update(shash, address, length);
+
+	return *ctx;
+}
+/* { dg-final { scan-assembler "ld\[ \t\]*\\\[%i5\\+8\\\], %i0\n\[^\n\]*return\[ \t\]*%i7\\+8" } } */