Patchwork [3.5.y.z,extended,stable] Patch "ptrace/x86: Partly fix" has been added to staging queue

Submitter Luis Henriques
Date Feb. 28, 2013, 11:59 a.m.
This is a note to let you know that I have just added a patch titled

    ptrace/x86: Partly fix

to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree 
which can be found at:;a=shortlog;h=refs/heads/linux-3.5.y-queue

If you, or anyone else, feel it should not be added to this tree, please
reply to this email.

For more information about the 3.5.y.z tree, see



From 04a8df512c72f3373a05b11eaf2bb6d7457545dd Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <>
Date: Sat, 11 Aug 2012 18:06:42 +0200
Subject: [PATCH] ptrace/x86: Partly fix
 set_task_blockstep()->update_debugctlmsr() logic

commit 95cf00fa5d5e2a200a2c044c84bde8389a237e02 upstream.

Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in
step.c was always very wrong.

1. update_debugctlmsr() was simply unneeded. The child sleeps
   TASK_TRACED, __switch_to_xtra(next_p => child) should notice
   TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if
   needed.

2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register
   should always match the state of current's TIF_BLOCKSTEP bit.

3. Even get_debugctlmsr() + update_debugctlmsr() itself does not
   look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR
   register or the caller can be preempted in between.

4. It is not safe to play with TIF_BLOCKSTEP if task != current.
   DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each
   other if the task is running. The tracee is stopped but it
   can be SIGKILL'ed right before set/clear_tsk_thread_flag().
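
The lost update described in item 3 can be illustrated with a small
userspace model. This is only a sketch, not kernel code: a plain variable
stands in for MSR_IA32_DEBUGCTLMSR, and every model_* name and MODEL_*
constant is hypothetical. The "interrupt" is simply a callback invoked
between the read and the write-back.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a plain variable stands in for MSR_IA32_DEBUGCTLMSR. */
#define MODEL_BTF   (UINT64_C(1) << 1)  /* stands in for DEBUGCTLMSR_BTF */
#define MODEL_OTHER (UINT64_C(1) << 0)  /* any other bit in the same register */

uint64_t model_msr;

typedef void (*model_irq_fn)(void);

/* Simulated interrupt handler that sets another bit in the register. */
void model_irq_sets_other(void)
{
	model_msr |= MODEL_OTHER;
}

/*
 * Unprotected read-modify-write. If irq is non-NULL it runs between
 * the read and the write-back, like an interrupt landing in the window.
 */
uint64_t model_set_btf_racy(model_irq_fn irq)
{
	uint64_t debugctl = model_msr;	/* get_debugctlmsr() */

	if (irq)
		irq();
	debugctl |= MODEL_BTF;
	model_msr = debugctl;		/* update_debugctlmsr() with a stale value */
	return model_msr;
}

/* Returns 1 when the bit set by the simulated interrupt was lost. */
int model_other_bit_lost(void)
{
	model_msr = 0;
	model_set_btf_racy(model_irq_sets_other);
	assert(model_msr & MODEL_BTF);
	return (model_msr & MODEL_OTHER) == 0;
}
```

In this model the write-back stores a value computed from the stale read,
so the bit set by the simulated interrupt disappears.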

However, now that uprobes uses user_enable_single_step(current)
we can't simply remove update_debugctlmsr(). So this patch adds
the additional "task == current" check and disables irqs to avoid
the race with interrupts/preemption.
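
The resulting control flow can be modelled in userspace as follows. This
is a sketch under stated assumptions, not the kernel implementation: the
sketch_* types and SKETCH_BTF are hypothetical, the "MSR" is a struct
field, and the irq disable/enable calls are only marked by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BTF (UINT64_C(1) << 1)	/* stands in for DEBUGCTLMSR_BTF */

/* Hypothetical stand-in for the per-task TIF_BLOCKSTEP flag. */
struct sketch_task {
	bool tif_blockstep;
};

/* Hypothetical stand-in for one CPU: its debugctl register and current task. */
struct sketch_cpu {
	uint64_t debugctl;
	struct sketch_task *current;
};

void sketch_set_task_blockstep(struct sketch_cpu *cpu,
			       struct sketch_task *task, bool on)
{
	uint64_t debugctl;

	/* local_irq_disable() would go here in the kernel. */
	debugctl = cpu->debugctl;		/* get_debugctlmsr() */
	if (on) {
		debugctl |= SKETCH_BTF;
		task->tif_blockstep = true;	/* set_tsk_thread_flag() */
	} else {
		debugctl &= ~SKETCH_BTF;
		task->tif_blockstep = false;	/* clear_tsk_thread_flag() */
	}
	/* Only touch the register when the target is the running task. */
	if (task == cpu->current)
		cpu->debugctl = debugctl;	/* update_debugctlmsr() */
	/* local_irq_enable() would go here in the kernel. */
}

/* Returns 1 if changing a non-current task leaves the register untouched. */
int sketch_non_current_keeps_msr(void)
{
	struct sketch_task tracer = { false }, tracee = { false };
	struct sketch_cpu cpu = { 0, &tracer };

	sketch_set_task_blockstep(&cpu, &tracee, true);
	assert(tracee.tif_blockstep);
	return cpu.debugctl == 0;
}
```

In the model, the flag of a non-current task changes while the register
keeps the value that belongs to the running task. The uprobes case,
where the task is current, updates both.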

Unfortunately this patch doesn't solve the last problem, we need
another fix. Probably we should teach ptrace_stop() to set/clear
single/block stepping after resume.

And afaics there is yet another problem: perf can play with
MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even
__switch_to_xtra() has problems.

Signed-off-by: Oleg Nesterov <>
Signed-off-by: Luis Henriques <>
---
 arch/x86/kernel/step.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)



diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index 7a51498..f89cdc6 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -161,6 +161,16 @@  static void set_task_blockstep(struct task_struct *task, bool on)
 {
 	unsigned long debugctl;

+	/*
+	 * Ensure irq/preemption can't change debugctl in between.
+	 * Note also that both TIF_BLOCKSTEP and debugctl should
+	 * be changed atomically wrt preemption.
+	 * FIXME: this means that set/clear TIF_BLOCKSTEP is simply
+	 * wrong if task != current, SIGKILL can wakeup the stopped
+	 * tracee and set/clear can play with the running task, this
+	 * can confuse the next __switch_to_xtra().
+	 */
+	local_irq_disable();
 	debugctl = get_debugctlmsr();
 	if (on) {
 		debugctl |= DEBUGCTLMSR_BTF;
@@ -169,7 +179,9 @@  static void set_task_blockstep(struct task_struct *task, bool on)
 		debugctl &= ~DEBUGCTLMSR_BTF;
 		clear_tsk_thread_flag(task, TIF_BLOCKSTEP);
 	}
-	update_debugctlmsr(debugctl);
+	if (task == current)
+		update_debugctlmsr(debugctl);
+	local_irq_enable();
 }