From patchwork Thu Jul 11 14:23:50 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 258478 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id A866E2C0316 for ; Fri, 12 Jul 2013 00:26:09 +1000 (EST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1UxHoo-0007jr-Bu; Thu, 11 Jul 2013 14:26:02 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1UxHoV-0007XM-19 for kernel-team@lists.ubuntu.com; Thu, 11 Jul 2013 14:25:43 +0000 Received: from bl15-104-132.dsl.telepac.pt ([188.80.104.132] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1UxHoU-0001aa-5v; Thu, 11 Jul 2013 14:25:42 +0000 From: Luis Henriques To: linux-kernel@vger.kernel.org, stable@vger.kernel.org, kernel-team@lists.ubuntu.com Subject: [PATCH 20/98] tracing: Use current_uid() for critical time tracing Date: Thu, 11 Jul 2013 15:23:50 +0100 Message-Id: <1373552708-15235-21-git-send-email-luis.henriques@canonical.com> X-Mailer: git-send-email 1.8.1.2 In-Reply-To: <1373552708-15235-1-git-send-email-luis.henriques@canonical.com> References: <1373552708-15235-1-git-send-email-luis.henriques@canonical.com> X-Extended-Stable: 3.5 Cc: Steven Rostedt X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com 3.5.7.17 -stable review patch. If anyone has any objections, please let me know. ------------------ From: "Steven Rostedt (Red Hat)" commit f17a5194859a82afe4164e938b92035b86c55794 upstream. The irqsoff tracer records the max time that interrupts are disabled. There are hooks in the assembly code that calls back into the tracer when interrupts are disabled or enabled. When they are enabled, the tracer checks if the amount of time they were disabled is larger than the previous recorded max interrupts off time. If it is, it creates a snapshot of the currently running trace to store where the last largest interrupts off time was held and how it happened. During testing, this RCU lockdep dump appeared: [ 1257.829021] =============================== [ 1257.829021] [ INFO: suspicious RCU usage. ] [ 1257.829021] 3.10.0-rc1-test+ #171 Tainted: G W [ 1257.829021] ------------------------------- [ 1257.829021] /home/rostedt/work/git/linux-trace.git/include/linux/rcupdate.h:780 rcu_read_lock() used illegally while idle! [ 1257.829021] [ 1257.829021] other info that might help us debug this: [ 1257.829021] [ 1257.829021] [ 1257.829021] RCU used illegally from idle CPU! [ 1257.829021] rcu_scheduler_active = 1, debug_locks = 0 [ 1257.829021] RCU used illegally from extended quiescent state! [ 1257.829021] 2 locks held by trace-cmd/4831: [ 1257.829021] #0: (max_trace_lock){......}, at: [] stop_critical_timing+0x1a3/0x209 [ 1257.829021] #1: (rcu_read_lock){.+.+..}, at: [] __update_max_tr+0x88/0x1ee [ 1257.829021] [ 1257.829021] stack backtrace: [ 1257.829021] CPU: 3 PID: 4831 Comm: trace-cmd Tainted: G W 3.10.0-rc1-test+ #171 [ 1257.829021] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007 [ 1257.829021] 0000000000000001 ffff880065f49da8 ffffffff8153dd2b ffff880065f49dd8 [ 1257.829021] ffffffff81092a00 ffff88006bd78680 ffff88007add7500 0000000000000003 [ 1257.829021] ffff88006bd78680 ffff880065f49e18 ffffffff810daebf ffffffff810dae5a [ 1257.829021] Call Trace: [ 1257.829021] [] dump_stack+0x19/0x1b [ 1257.829021] [] lockdep_rcu_suspicious+0x109/0x112 [ 1257.829021] [] __update_max_tr+0xed/0x1ee [ 1257.829021] [] ? __update_max_tr+0x88/0x1ee [ 1257.829021] [] ? user_enter+0xfd/0x107 [ 1257.829021] [] update_max_tr_single+0x11d/0x12d [ 1257.829021] [] ? user_enter+0xfd/0x107 [ 1257.829021] [] stop_critical_timing+0x141/0x209 [ 1257.829021] [] ? trace_hardirqs_on+0xd/0xf [ 1257.829021] [] ? user_enter+0xfd/0x107 [ 1257.829021] [] time_hardirqs_on+0x2a/0x2f [ 1257.829021] [] ? user_enter+0xfd/0x107 [ 1257.829021] [] trace_hardirqs_on_caller+0x16/0x197 [ 1257.829021] [] trace_hardirqs_on+0xd/0xf [ 1257.829021] [] user_enter+0xfd/0x107 [ 1257.829021] [] do_notify_resume+0x92/0x97 [ 1257.829021] [] int_signal+0x12/0x17 What happened was entering into the user code, the interrupts were enabled and a max interrupts off was recorded. The trace buffer was saved along with various information about the task: comm, pid, uid, priority, etc. The uid is recorded with task_uid(tsk). But this is a macro that uses rcu_read_lock() to retrieve the data, and this happened to happen where RCU is blind (user_enter). As only the preempt and irqs off tracers can have this happen, and they both only have the tsk == current, if tsk == current, use current_uid() instead of task_uid(), as current_uid() does not use RCU as only current can change its uid. This fixes the RCU suspicious splat. Signed-off-by: Steven Rostedt Signed-off-by: Luis Henriques --- kernel/trace/trace.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 0ae8b32..8aba3f9 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -672,7 +672,15 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu) memcpy(max_data->comm, tsk->comm, TASK_COMM_LEN); max_data->pid = tsk->pid; - max_data->uid = task_uid(tsk); + /* + * If tsk == current, then use current_uid(), as that does not use + * RCU. The irq tracer can be called out of RCU scope. + */ + if (tsk == current) + max_data->uid = current_uid(); + else + max_data->uid = task_uid(tsk); + max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO; max_data->policy = tsk->policy; max_data->rt_priority = tsk->rt_priority;