
[4/5] powerpc/tm: Check for already reclaimed tasks

Message ID 1447390652-28355-4-git-send-email-mikey@neuling.org (mailing list archive)
State Superseded

Commit Message

Michael Neuling Nov. 13, 2015, 4:57 a.m. UTC
Currently we can hit a scenario where we'll tm_reclaim() twice.  This
results in a TM bad thing exception because the second reclaim occurs
when not in suspend mode.

The scenario in which this can happen is the following.  We attempt to
deliver a signal to userspace.  To do this we need to obtain the stack
pointer to write the signal context.  To get this stack pointer we
must tm_reclaim() in case we need to use the checkpointed stack
pointer (see get_tm_stackpointer()).  Normally we'd then return
directly to userspace to deliver the signal without going through
__switch_to().

Unfortunately, if at this point we get an error (such as a bad
userspace stack pointer), we need to exit the process.  The exit will
result in a __switch_to().  __switch_to() will attempt to save the
process state which results in another tm_reclaim().  This
tm_reclaim() now causes a TM Bad Thing exception as this state has
already been saved and the processor is no longer in TM suspend mode.
Whee!
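
The sequence above can be modelled in plain user-space C. This is only a rough sketch: every `model_` name is hypothetical and nothing here is kernel code. A single boolean stands in for the MSR "TM suspended" state, and a reclaim is treated as legal only while that boolean is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_result { MODEL_OK, MODEL_BAD_THING };

struct model_cpu {
	bool tm_suspended;	/* stands in for the MSR "TM suspended" state */
};

/* Models tm_reclaim(): only legal while suspended, and leaves suspend mode. */
static enum model_result model_reclaim(struct model_cpu *cpu)
{
	if (!cpu->tm_suspended)
		return MODEL_BAD_THING;	/* hardware would raise TM Bad Thing */
	cpu->tm_suspended = false;	/* checkpointed state is now saved */
	return MODEL_OK;
}

/*
 * Models signal delivery: reclaim once to obtain the stack pointer. If the
 * stack is bad, the exit path context-switches and reclaims a second time.
 */
static enum model_result model_deliver_signal(struct model_cpu *cpu,
					      bool stack_ok)
{
	enum model_result r = model_reclaim(cpu);	/* get_tm_stackpointer() step */

	if (r != MODEL_OK || stack_ok)
		return r;		/* normal case: straight back to userspace */
	return model_reclaim(cpu);	/* exit -> __switch_to() -> second reclaim */
}

/* Usage: a good stack is fine, a bad stack hits the double reclaim. */
static void model_demo(void)
{
	struct model_cpu good = { .tm_suspended = true };
	struct model_cpu bad = { .tm_suspended = true };

	assert(model_deliver_signal(&good, true) == MODEL_OK);
	assert(model_deliver_signal(&bad, false) == MODEL_BAD_THING);
}
```

In this model the failure comes purely from calling the reclaim a second time after suspend mode has been left, which is the ordering problem described above.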

This patch checks the state of the MSR to ensure we are TM suspended
before we attempt the tm_reclaim().  If we've already saved the state
away, we should no longer be in TM suspend mode.  This has the
additional advantage of checking for a potential TM Bad Thing
exception.
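
As an illustration of the check, here is a user-space sketch of a guarded reclaim. The `SKETCH_` names are hypothetical, and the bit positions (33 for suspended, 34 for transactional) are an assumption modelled on the kernel's MSR_TS_* definitions; the real code uses MSR_TM_SUSPENDED(mfmsr()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit layout, modelled on the kernel's MSR_TS_* definitions.
 * The SKETCH_ names are hypothetical and exist only for this example.
 */
#define SKETCH_MSR_TS_S		(UINT64_C(1) << 33)	/* suspended */
#define SKETCH_MSR_TS_T		(UINT64_C(1) << 34)	/* transactional */
#define SKETCH_MSR_TS_MASK	(SKETCH_MSR_TS_S | SKETCH_MSR_TS_T)

/* True only when the TS field says "suspended" (not transactional). */
static bool sketch_tm_suspended(uint64_t msr)
{
	return (msr & SKETCH_MSR_TS_MASK) == SKETCH_MSR_TS_S;
}

/* Guarded reclaim: returns true if a reclaim was actually performed. */
static bool sketch_reclaim_thread(uint64_t *msr)
{
	if (!sketch_tm_suspended(*msr))
		return false;		/* already reclaimed, nothing to do */
	*msr &= ~SKETCH_MSR_TS_MASK;	/* model: reclaim leaves TM state */
	return true;
}

/* Usage: the second call is skipped instead of faulting. */
static void sketch_demo(void)
{
	uint64_t msr = SKETCH_MSR_TS_S;

	assert(sketch_reclaim_thread(&msr));	/* first call reclaims */
	assert(!sketch_reclaim_thread(&msr));	/* second call is a no-op */
}
```

Keying the guard off the live MSR value, rather than a separate software flag, means the check cannot drift out of sync with what the hardware would accept.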

Found using syscall fuzzer.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org
---
 arch/powerpc/kernel/process.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Anshuman Khandual Nov. 16, 2015, 7:21 a.m. UTC | #1
On 11/13/2015 10:27 AM, Michael Neuling wrote:
> Currently we can hit a scenario where we'll tm_reclaim() twice.  This
> results in a TM bad thing exception because the second reclaim occurs
> when not in suspend mode.
> 
> The scenario in which this can happen is the following.  We attempt to
> deliver a signal to userspace.  To do this we need obtain the stack
> pointer to write the signal context.  To get this stack pointer we
> must tm_reclaim() in case we need to use the checkpointed stack
> pointer (see get_tm_stackpointer()).  Normally we'd then return
> directly to userspace to deliver the signal without going through
> __switch_to().
> 
> Unfortunatley, if at this point we get an error (such as a bad
> userspace stack pointer), we need to exit the process.  The exit will
> result in a __switch_to().  __switch_to() will attempt to save the
> process state which results in another tm_reclaim().  This
> tm_reclaim() now causes a TM Bad Thing exception as this state has
> already been saved and the processor is no longer in TM suspend mode.
> Whee!
> 
> This patch checks the state of the MSR to ensure we are TM suspended
> before we attempt the tm_reclaim().  If we've already saved the state
> away, we should no longer be in TM suspend mode.  This has the
> additional advantage of checking for a potential TM Bad Thing
> exception.

Can this situation be reproduced with a test, to verify that the
kernel handles it correctly with this change? I guess the self test
in the series does not cover this scenario.

> 
> Found using syscall fuzzer.
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> Cc: stable@vger.kernel.org
> ---
>  arch/powerpc/kernel/process.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 5fbe5d8..a1b41d1 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -551,6 +551,25 @@ static void tm_reclaim_thread(struct thread_struct *thr,
>  		msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
>  	}
>  
> +	/*
> +	 * Use the current MSR TM suspended bit to track if we have
> +	 * checkpointed state outstanding.
> +	 * On signal delivery, we'd normally reclaim the checkpointed
> +	 * state to obtain stack pointer (see:get_tm_stackpointer()).
> +	 * This will then directly return to userspace without going
> +	 * through __switch_to(). However, if the stack frame is bad,
> +	 * we need to exit this thread which calls __switch_to() which
> +	 * will again attempt to reclaim the already saved tm state.
> +	 * Hence we need to check that we've not already reclaimed
> +	 * this state.
> +	 * We do this using the current MSR, rather tracking it in
> +	 * some specific bit thread_struct bit, as it has the

There is one extra "bit" here ^^^^^.
Michael Neuling Nov. 16, 2015, 9:23 a.m. UTC | #2
On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > Currently we can hit a scenario where we'll tm_reclaim() twice. 
> >  This
> > results in a TM bad thing exception because the second reclaim
> > occurs
> > when not in suspend mode.
> > 
> > The scenario in which this can happen is the following.  We attempt
> > to
> > deliver a signal to userspace.  To do this we need obtain the stack
> > pointer to write the signal context.  To get this stack pointer we
> > must tm_reclaim() in case we need to use the checkpointed stack
> > pointer (see get_tm_stackpointer()).  Normally we'd then return
> > directly to userspace to deliver the signal without going through
> > __switch_to().
> > 
> > Unfortunatley, if at this point we get an error (such as a bad
> > userspace stack pointer), we need to exit the process.  The exit
> > will
> > result in a __switch_to().  __switch_to() will attempt to save the
> > process state which results in another tm_reclaim().  This
> > tm_reclaim() now causes a TM Bad Thing exception as this state has
> > already been saved and the processor is no longer in TM suspend
> > mode.
> > Whee!
> > 
> > This patch checks the state of the MSR to ensure we are TM
> > suspended
> > before we attempt the tm_reclaim().  If we've already saved the
> > state
> > away, we should no longer be in TM suspend mode.  This has the
> > additional advantage of checking for a potential TM Bad Thing
> > exception.
> 
> Can this situation be created using a test and verified that with
> this new change, the kernel can handle it successfully. I guess
> the self test in the series does not cover this scenario.

No, it doesn't.  The syscall fuzzer I have does hit it but I don't have
permission to post that.

> > 
> > Found using syscall fuzzer.
> > 
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/powerpc/kernel/process.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/arch/powerpc/kernel/process.c
> > b/arch/powerpc/kernel/process.c
> > index 5fbe5d8..a1b41d1 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -551,6 +551,25 @@ static void tm_reclaim_thread(struct
> > thread_struct *thr,
> >  		msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 |
> > MSR_FE1;
> >  	}
> >  
> > +	/*
> > +	 * Use the current MSR TM suspended bit to track if we
> > have
> > +	 * checkpointed state outstanding.
> > +	 * On signal delivery, we'd normally reclaim the
> > checkpointed
> > +	 * state to obtain stack pointer
> > (see:get_tm_stackpointer()).
> > +	 * This will then directly return to userspace without
> > going
> > +	 * through __switch_to(). However, if the stack frame is
> > bad,
> > +	 * we need to exit this thread which calls __switch_to()
> > which
> > +	 * will again attempt to reclaim the already saved tm
> > state.
> > +	 * Hence we need to check that we've not already reclaimed
> > +	 * this state.
> > +	 * We do this using the current MSR, rather tracking it in
> > +	 * some specific bit thread_struct bit, as it has the
> 
> There is one extra "bit" here ^^^^^.

Thanks!

Mikey
Michael Ellerman Nov. 16, 2015, 9:33 a.m. UTC | #3
On Mon, 2015-11-16 at 20:23 +1100, Michael Neuling wrote:
> On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> > On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > > Currently we can hit a scenario where we'll tm_reclaim() twice. 
> > >  This
> > > results in a TM bad thing exception because the second reclaim
> > > occurs
> > > when not in suspend mode.
> > > 
> > > The scenario in which this can happen is the following.  We attempt
> > > to
> > > deliver a signal to userspace.  To do this we need obtain the stack
> > > pointer to write the signal context.  To get this stack pointer we
> > > must tm_reclaim() in case we need to use the checkpointed stack
> > > pointer (see get_tm_stackpointer()).  Normally we'd then return
> > > directly to userspace to deliver the signal without going through
> > > __switch_to().
> > > 
> > > Unfortunatley, if at this point we get an error (such as a bad
> > > userspace stack pointer), we need to exit the process.  The exit
> > > will
> > > result in a __switch_to().  __switch_to() will attempt to save the
> > > process state which results in another tm_reclaim().  This
> > > tm_reclaim() now causes a TM Bad Thing exception as this state has
> > > already been saved and the processor is no longer in TM suspend
> > > mode.
> > > Whee!
> > > 
> > > This patch checks the state of the MSR to ensure we are TM
> > > suspended
> > > before we attempt the tm_reclaim().  If we've already saved the
> > > state
> > > away, we should no longer be in TM suspend mode.  This has the
> > > additional advantage of checking for a potential TM Bad Thing
> > > exception.
> > 
> > Can this situation be created using a test and verified that with
> > this new change, the kernel can handle it successfully. I guess
> > the self test in the series does not cover this scenario.
> 
> No it doesn't.  The syscall fuzzer I have does hit it but I don't have
> permission to post that.

And we don't really want a fuzzer as a selftest, because it might call unlink
or something else bad.

But having found the bug with the fuzzer, can't you write a test that triggers
the bad case?

From your description it sounds like if you had a child spinning with a bad r1,
and then a parent sent it a signal that would trip it?

cheers
Michael Neuling Nov. 16, 2015, 10:21 a.m. UTC | #4
On Mon, 2015-11-16 at 20:33 +1100, Michael Ellerman wrote:
> On Mon, 2015-11-16 at 20:23 +1100, Michael Neuling wrote:
> > On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> > > On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > > > Currently we can hit a scenario where we'll tm_reclaim() twice.
> > > >  This
> > > > results in a TM bad thing exception because the second reclaim
> > > > occurs
> > > > when not in suspend mode.
> > > > 
> > > > The scenario in which this can happen is the following.  We
> > > > attempt
> > > > to
> > > > deliver a signal to userspace.  To do this we need obtain the
> > > > stack
> > > > pointer to write the signal context.  To get this stack pointer
> > > > we
> > > > must tm_reclaim() in case we need to use the checkpointed stack
> > > > pointer (see get_tm_stackpointer()).  Normally we'd then return
> > > > directly to userspace to deliver the signal without going
> > > > through
> > > > __switch_to().
> > > > 
> > > > Unfortunatley, if at this point we get an error (such as a bad
> > > > userspace stack pointer), we need to exit the process.  The
> > > > exit
> > > > will
> > > > result in a __switch_to().  __switch_to() will attempt to save
> > > > the
> > > > process state which results in another tm_reclaim().  This
> > > > tm_reclaim() now causes a TM Bad Thing exception as this state
> > > > has
> > > > already been saved and the processor is no longer in TM suspend
> > > > mode.
> > > > Whee!
> > > > 
> > > > This patch checks the state of the MSR to ensure we are TM
> > > > suspended
> > > > before we attempt the tm_reclaim().  If we've already saved the
> > > > state
> > > > away, we should no longer be in TM suspend mode.  This has the
> > > > additional advantage of checking for a potential TM Bad Thing
> > > > exception.
> > > 
> > > Can this situation be created using a test and verified that with
> > > this new change, the kernel can handle it successfully. I guess
> > > the self test in the series does not cover this scenario.
> > 
> > No it doesn't.  The syscall fuzzer I have does hit it but I don't
> > have
> > permission to post that.
> 
> And we don't really want a fuzzer as a selftest, because it might
> call unlink
> or something else bad.
> 
> But having found the bug with the fuzzer, can't you write a test that
> triggers
> the bad case?
> 
> From your description it sounds like if you had a child spinning with a bad r1,
> and then a parent sent it a signal that would trip it?

You'd need to turn on TM too, but yeah... I have something like this
working which I'll clean up and post as a self test:

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>

void signal_segv(int signum)
{
	/* This should never actually run since stack is foobar */
	exit(1);
}

int main()
{
	int pid;

	pid = fork();
	if (pid < 0)
		exit(1);

	if (pid) {
		// Parent
		wait(NULL);
		printf("PASSED\n");
		return 0;
	}

	if (signal(SIGSEGV, signal_segv) == SIG_ERR)
		exit(1);

	asm volatile("li 1, 0 ;"		// zero r1 (the stack pointer)
		     "1:"
		     ".long 0x7C00051D ;"	// tbegin
		     "beq 1b ;"			// retry forever
		     ".long 0x7C0005DD ;"	// tsuspend
		     "ld 2, 0(1) ;"		// trigger segv
		     : : : "memory");

	return 1;
}
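
The parent in the program above prints PASSED whatever happens to the child. One possible refinement, sketched here under assumptions, is to inspect the wait status. `child_killed_by()` and `die_with_segv()` are hypothetical helpers, and the stand-in child just raises SIGSEGV rather than running the TM sequence. Which signal the real child ends with is also an assumption (presumably SIGSEGV).

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run child_fn in a forked child and report whether
 * the child was terminated by signal sig.
 */
static bool child_killed_by(int sig, void (*child_fn)(void))
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return false;		/* fork failed */
	if (pid == 0) {
		child_fn();
		_exit(0);		/* child_fn returned: not killed */
	}
	if (waitpid(pid, &status, 0) != pid)
		return false;
	return WIFSIGNALED(status) && WTERMSIG(status) == sig;
}

/* Stand-in child for demonstration; a real test would run the TM asm. */
static void die_with_segv(void)
{
	signal(SIGSEGV, SIG_DFL);	/* make sure the default action applies */
	raise(SIGSEGV);
}

/* Usage: the parent only passes if the child died the expected way. */
static void wait_demo(void)
{
	assert(child_killed_by(SIGSEGV, die_with_segv));
}
```

With a check like this, a child that exits normally or dies from an unexpected signal shows up as a failure instead of a pass.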

Patch

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 5fbe5d8..a1b41d1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -551,6 +551,25 @@  static void tm_reclaim_thread(struct thread_struct *thr,
 		msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
 	}
 
+	/*
+	 * Use the current MSR TM suspended bit to track if we have
+	 * checkpointed state outstanding.
+	 * On signal delivery, we'd normally reclaim the checkpointed
+	 * state to obtain stack pointer (see:get_tm_stackpointer()).
+	 * This will then directly return to userspace without going
+	 * through __switch_to(). However, if the stack frame is bad,
+	 * we need to exit this thread which calls __switch_to() which
+	 * will again attempt to reclaim the already saved tm state.
+	 * Hence we need to check that we've not already reclaimed
+	 * this state.
+	 * We do this using the current MSR, rather than tracking it
+	 * in some specific thread_struct bit, as it has the
+	 * additional benefit of checking for a potential TM bad thing
+	 * exception.
+	 */
+	if (!MSR_TM_SUSPENDED(mfmsr()))
+		return;
+
 	tm_reclaim(thr, thr->regs->msr, cause);
 
 	/* Having done the reclaim, we now have the checkpointed