Message ID | 1447390652-28355-4-git-send-email-mikey@neuling.org (mailing list archive) |
---|---|
State | Superseded |
On 11/13/2015 10:27 AM, Michael Neuling wrote:
> Currently we can hit a scenario where we'll tm_reclaim() twice.
> This results in a TM Bad Thing exception because the second
> reclaim occurs when not in suspend mode.
>
> The scenario in which this can happen is the following. We
> attempt to deliver a signal to userspace. To do this we need to
> obtain the stack pointer to write the signal context. To get
> this stack pointer we must tm_reclaim() in case we need to use
> the checkpointed stack pointer (see get_tm_stackpointer()).
> Normally we'd then return directly to userspace to deliver the
> signal without going through __switch_to().
>
> Unfortunately, if at this point we get an error (such as a bad
> userspace stack pointer), we need to exit the process. The exit
> will result in a __switch_to(). __switch_to() will attempt to
> save the process state which results in another tm_reclaim().
> This tm_reclaim() now causes a TM Bad Thing exception as this
> state has already been saved and the processor is no longer in
> TM suspend mode. Whee!
>
> This patch checks the state of the MSR to ensure we are TM
> suspended before we attempt the tm_reclaim(). If we've already
> saved the state away, we should no longer be in TM suspend mode.
> This has the additional advantage of checking for a potential TM
> Bad Thing exception.

Can this situation be created using a test, to verify that with this
new change the kernel handles it successfully? I guess the self test
in the series does not cover this scenario.

> Found using syscall fuzzer.
>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> Cc: stable@vger.kernel.org
> ---
>  arch/powerpc/kernel/process.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 5fbe5d8..a1b41d1 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -551,6 +551,25 @@ static void tm_reclaim_thread(struct thread_struct *thr,
>  		msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
>  	}
>
> +	/*
> +	 * Use the current MSR TM suspended bit to track if we have
> +	 * checkpointed state outstanding.
> +	 * On signal delivery, we'd normally reclaim the checkpointed
> +	 * state to obtain stack pointer (see:get_tm_stackpointer()).
> +	 * This will then directly return to userspace without going
> +	 * through __switch_to(). However, if the stack frame is bad,
> +	 * we need to exit this thread which calls __switch_to() which
> +	 * will again attempt to reclaim the already saved tm state.
> +	 * Hence we need to check that we've not already reclaimed
> +	 * this state.
> +	 * We do this using the current MSR, rather tracking it in
> +	 * some specific bit thread_struct bit, as it has the

There is one extra "bit" in the line above.

On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > Currently we can hit a scenario where we'll tm_reclaim() twice.
> > This results in a TM Bad Thing exception because the second
> > reclaim occurs when not in suspend mode.
> >
> > The scenario in which this can happen is the following. We
> > attempt to deliver a signal to userspace. To do this we need to
> > obtain the stack pointer to write the signal context. To get
> > this stack pointer we must tm_reclaim() in case we need to use
> > the checkpointed stack pointer (see get_tm_stackpointer()).
> > Normally we'd then return directly to userspace to deliver the
> > signal without going through __switch_to().
> >
> > Unfortunately, if at this point we get an error (such as a bad
> > userspace stack pointer), we need to exit the process. The exit
> > will result in a __switch_to(). __switch_to() will attempt to
> > save the process state which results in another tm_reclaim().
> > This tm_reclaim() now causes a TM Bad Thing exception as this
> > state has already been saved and the processor is no longer in
> > TM suspend mode. Whee!
> >
> > This patch checks the state of the MSR to ensure we are TM
> > suspended before we attempt the tm_reclaim(). If we've already
> > saved the state away, we should no longer be in TM suspend mode.
> > This has the additional advantage of checking for a potential TM
> > Bad Thing exception.
>
> Can this situation be created using a test, to verify that with this
> new change the kernel handles it successfully? I guess the self test
> in the series does not cover this scenario.

No, it doesn't. The syscall fuzzer I have does hit it, but I don't
have permission to post that.

> > Found using syscall fuzzer.
> >
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/powerpc/kernel/process.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 5fbe5d8..a1b41d1 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -551,6 +551,25 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  		msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
> >  	}
> >
> > +	/*
> > +	 * Use the current MSR TM suspended bit to track if we have
> > +	 * checkpointed state outstanding.
> > +	 * On signal delivery, we'd normally reclaim the checkpointed
> > +	 * state to obtain stack pointer (see:get_tm_stackpointer()).
> > +	 * This will then directly return to userspace without going
> > +	 * through __switch_to(). However, if the stack frame is bad,
> > +	 * we need to exit this thread which calls __switch_to() which
> > +	 * will again attempt to reclaim the already saved tm state.
> > +	 * Hence we need to check that we've not already reclaimed
> > +	 * this state.
> > +	 * We do this using the current MSR, rather tracking it in
> > +	 * some specific bit thread_struct bit, as it has the
>
> There is one extra "bit" in the line above.

Thanks!

Mikey

On Mon, 2015-11-16 at 20:23 +1100, Michael Neuling wrote:
> On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> > On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > > Currently we can hit a scenario where we'll tm_reclaim() twice.
> > > This results in a TM Bad Thing exception because the second
> > > reclaim occurs when not in suspend mode.
> > >
> > > The scenario in which this can happen is the following. We
> > > attempt to deliver a signal to userspace. To do this we need to
> > > obtain the stack pointer to write the signal context. To get
> > > this stack pointer we must tm_reclaim() in case we need to use
> > > the checkpointed stack pointer (see get_tm_stackpointer()).
> > > Normally we'd then return directly to userspace to deliver the
> > > signal without going through __switch_to().
> > >
> > > Unfortunately, if at this point we get an error (such as a bad
> > > userspace stack pointer), we need to exit the process. The exit
> > > will result in a __switch_to(). __switch_to() will attempt to
> > > save the process state which results in another tm_reclaim().
> > > This tm_reclaim() now causes a TM Bad Thing exception as this
> > > state has already been saved and the processor is no longer in
> > > TM suspend mode. Whee!
> > >
> > > This patch checks the state of the MSR to ensure we are TM
> > > suspended before we attempt the tm_reclaim(). If we've already
> > > saved the state away, we should no longer be in TM suspend mode.
> > > This has the additional advantage of checking for a potential TM
> > > Bad Thing exception.
> >
> > Can this situation be created using a test, to verify that with this
> > new change the kernel handles it successfully? I guess the self test
> > in the series does not cover this scenario.
>
> No, it doesn't. The syscall fuzzer I have does hit it, but I don't
> have permission to post that.

And we don't really want a fuzzer as a selftest, because it might call
unlink or something else bad.

But having found the bug with the fuzzer, can't you write a test that
triggers the bad case?

From your description it sounds like if you had a child spinning with
a bad r1, and then a parent sent it a signal, that would trip it?

cheers

On Mon, 2015-11-16 at 20:33 +1100, Michael Ellerman wrote:
> On Mon, 2015-11-16 at 20:23 +1100, Michael Neuling wrote:
> > On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> > > On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > > > Currently we can hit a scenario where we'll tm_reclaim() twice.
> > > > This results in a TM Bad Thing exception because the second
> > > > reclaim occurs when not in suspend mode.
> > > >
> > > > The scenario in which this can happen is the following. We
> > > > attempt to deliver a signal to userspace. To do this we need to
> > > > obtain the stack pointer to write the signal context. To get
> > > > this stack pointer we must tm_reclaim() in case we need to use
> > > > the checkpointed stack pointer (see get_tm_stackpointer()).
> > > > Normally we'd then return directly to userspace to deliver the
> > > > signal without going through __switch_to().
> > > >
> > > > Unfortunately, if at this point we get an error (such as a bad
> > > > userspace stack pointer), we need to exit the process. The exit
> > > > will result in a __switch_to(). __switch_to() will attempt to
> > > > save the process state which results in another tm_reclaim().
> > > > This tm_reclaim() now causes a TM Bad Thing exception as this
> > > > state has already been saved and the processor is no longer in
> > > > TM suspend mode. Whee!
> > > >
> > > > This patch checks the state of the MSR to ensure we are TM
> > > > suspended before we attempt the tm_reclaim(). If we've already
> > > > saved the state away, we should no longer be in TM suspend mode.
> > > > This has the additional advantage of checking for a potential TM
> > > > Bad Thing exception.
> > >
> > > Can this situation be created using a test, to verify that with this
> > > new change the kernel handles it successfully? I guess the self test
> > > in the series does not cover this scenario.
> >
> > No, it doesn't. The syscall fuzzer I have does hit it, but I don't
> > have permission to post that.
>
> And we don't really want a fuzzer as a selftest, because it might call
> unlink or something else bad.
>
> But having found the bug with the fuzzer, can't you write a test that
> triggers the bad case?
>
> From your description it sounds like if you had a child spinning with
> a bad r1, and then a parent sent it a signal, that would trip it?

You'd need to turn on TM too, but yeah...

I have something like this working, which I'll clean up and post as a
self test:

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>

void signal_segv(int signum)
{
	/* This should never actually run since stack is foobar */
	exit(1);
}

int main(void)
{
	pid_t pid;

	pid = fork();
	if (pid < 0)
		exit(1);
	if (pid) { /* Parent: wait for the child to terminate, then report */
		wait(NULL);
		printf("PASSED\n");
		return 0;
	}

	/* Child: the handler is installed but cannot run on the bad stack */
	if (signal(SIGSEGV, signal_segv) == SIG_ERR)
		exit(1);
	asm volatile("li 1, 0 ;"		/* r1 (stack pointer) = 0 */
		     "1:"
		     ".long 0x7C00051D ;"	/* tbegin */
		     "beq 1b ;"			/* retry forever */
		     ".long 0x7C0005DD ;"	/* tsuspend */
		     "ld 2, 0(1) ;"		/* trigger segv */
		     : : : "memory");

	return 1;
}

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 5fbe5d8..a1b41d1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -551,6 +551,25 @@ static void tm_reclaim_thread(struct thread_struct *thr,
 		msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
 	}
 
+	/*
+	 * Use the current MSR TM suspended bit to track if we have
+	 * checkpointed state outstanding.
+	 * On signal delivery, we'd normally reclaim the checkpointed
+	 * state to obtain the stack pointer (see get_tm_stackpointer()).
+	 * This will then directly return to userspace without going
+	 * through __switch_to(). However, if the stack frame is bad,
+	 * we need to exit this thread which calls __switch_to() which
+	 * will again attempt to reclaim the already saved tm state.
+	 * Hence we need to check that we've not already reclaimed
+	 * this state.
+	 * We do this using the current MSR, rather than tracking it in
+	 * some specific thread_struct bit, as it has the
+	 * additional benefit of checking for a potential TM bad thing
+	 * exception.
+	 */
+	if (!MSR_TM_SUSPENDED(mfmsr()))
+		return;
+
 	tm_reclaim(thr, thr->regs->msr, cause);
 
 	/* Having done the reclaim, we now have the checkpointed

Currently we can hit a scenario where we'll tm_reclaim() twice.
This results in a TM Bad Thing exception because the second
reclaim occurs when not in suspend mode.

The scenario in which this can happen is the following. We
attempt to deliver a signal to userspace. To do this we need to
obtain the stack pointer to write the signal context. To get
this stack pointer we must tm_reclaim() in case we need to use
the checkpointed stack pointer (see get_tm_stackpointer()).
Normally we'd then return directly to userspace to deliver the
signal without going through __switch_to().

Unfortunately, if at this point we get an error (such as a bad
userspace stack pointer), we need to exit the process. The exit
will result in a __switch_to(). __switch_to() will attempt to
save the process state which results in another tm_reclaim().
This tm_reclaim() now causes a TM Bad Thing exception as this
state has already been saved and the processor is no longer in
TM suspend mode. Whee!

This patch checks the state of the MSR to ensure we are TM
suspended before we attempt the tm_reclaim(). If we've already
saved the state away, we should no longer be in TM suspend mode.
This has the additional advantage of checking for a potential TM
Bad Thing exception.

Found using syscall fuzzer.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org
---
 arch/powerpc/kernel/process.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)