Date: Sat, 17 Apr 2010 19:22:21 +0200
From: Frederic Weisbecker
To: David Miller
Cc: rostedt@goodmis.org, sparclinux@vger.kernel.org
Subject: Re: [PATCH 7/7] sparc64: Add function graph tracer support.
Message-ID: <20100417172220.GB15037@nowhere>
In-Reply-To: <20100417165923.GA15037@nowhere>
X-Patchwork-Id: 50383
X-Patchwork-Delegate: davem@davemloft.net

On Sat, Apr 17, 2010 at 06:59:25PM +0200, Frederic Weisbecker wrote:
> On Sat, Apr 17, 2010 at 12:51:09AM -0700, David Miller wrote:
> > From: David Miller
> > Date: Fri, 16 Apr 2010 16:17:44 -0700 (PDT)
> >
> > > From: Frederic Weisbecker
> > > Date: Sat, 17 Apr 2010 01:14:12 +0200
> > >
> > >> Hmm, just a random idea: do you think it could be due to stack overflows?
> > >> Because the function graph eats more stack by digging to function graph
> > >> handlers, ring buffer, etc...
> > >>
> > >> It diggs further than what is supposed to happen without tracing.
> > >
> > > I have mcount checking for stack-overflow by hand in assembler
> > > during these tests, it even knows about the irq stacks.
> > >
> > > And those checks are not triggering.
> >
> > Ugh... hold on.
> >
> > They're not triggering because I put the test assembler into mcount
> > and dynamic ftrace patches the call sites to bypass mcount altogether
> > :-)
> >
> > Doing real tests now, and I bet you're right.
> >
> > That's pretty insane though, as we use 16K stacks on sparc64 and
> > the gcc I'm using has the minimum stack frame decreased down to
> > 176 bytes (used to be 192).  I'd be interested to see what one
> > of these too-large backtraces look like.
> >
> > Hmmm, will it be in the scheduler load balancing code? :-)
>
>
> May be yeah :)
>
> This could be also a tracing recursion somewhere.
> One good way to know is to put pause_graph_tracing()
> and unpause_graph_tracing() in the very beginning and
> end of the tracing paths.
>
> /me goes to try this.
>

This looks like the true cause. I tested the following patch and it
fixes the issue after several manual loops of:

	echo function_graph > current_tracer
	echo nop > current_tracer

Before, it was triggering after two trials. Now it doesn't happen
after 10 iterations.

I'm going to hunt the culprit that causes this; we certainly need a
new __notrace somewhere.

---
diff --git a/arch/sparc/kernel/ftrace.c b/arch/sparc/kernel/ftrace.c
index 03ab022..3dc19b6 100644
--- a/arch/sparc/kernel/ftrace.c
+++ b/arch/sparc/kernel/ftrace.c
@@ -134,18 +134,24 @@ unsigned long prepare_ftrace_return(unsigned long parent,
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return parent + 8UL;
 
+	pause_graph_tracing();
+
 	if (ftrace_push_return_trace(parent, self_addr, &trace.depth,
-				     frame_pointer) == -EBUSY)
+				     frame_pointer) == -EBUSY) {
+		unpause_graph_tracing();
 		return parent + 8UL;
+	}
 
 	trace.func = self_addr;
 
 	/* Only trace if the calling function expects to */
 	if (!ftrace_graph_entry(&trace)) {
 		current->curr_ret_stack--;
+		unpause_graph_tracing();
 		return parent + 8UL;
 	}
 
+	unpause_graph_tracing();
 	return return_hooker;
 }
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 9aed1a5..251a46a 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -163,6 +163,8 @@ unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
 	struct ftrace_graph_ret trace;
 	unsigned long ret;
 
+	pause_graph_tracing();
+
 	ftrace_pop_return_trace(&trace, &ret, frame_pointer);
 	trace.rettime = trace_clock_local();
 	ftrace_graph_return(&trace);
@@ -176,6 +178,8 @@ unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
 		ret = (unsigned long)panic;
 	}
 
+	unpause_graph_tracing();
+
 	return ret;
 }
 
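
For anyone following along, here is a rough userspace sketch of why the
pause/unpause pair above hides a tracing recursion. It is not kernel code
and only an illustration under stated assumptions: graph_pause stands in
for current->tracing_graph_pause, record_event() is a hypothetical helper
on the tracing path that was left traceable, and MAX_DEMO_DEPTH is an
artificial cap so the unguarded case stops before really overflowing the
stack.

```c
#include <assert.h>

#define MAX_DEMO_DEPTH 64	/* artificial cap, only for this demo */

/* Stand-in for current->tracing_graph_pause (an assumption of the sketch). */
static int graph_pause;
static int depth;		/* current nesting of the tracing hook */
static int max_depth;		/* deepest nesting seen in one run */
static int entries_recorded;	/* events that reached the fake buffer */

static void pause_graph(void)
{
	graph_pause++;
}

static void unpause_graph(void)
{
	assert(graph_pause > 0);	/* pause/unpause must stay balanced */
	graph_pause--;
}

static void trace_entry(int use_guard);

/* Hypothetical helper on the tracing path that is itself traced. */
static void record_event(int use_guard)
{
	/* Simulates the mcount call the compiler adds to a traced function. */
	trace_entry(use_guard);
	entries_recorded++;
}

/* Simplified analogue of the entry hook (prepare_ftrace_return). */
static void trace_entry(int use_guard)
{
	if (graph_pause)		/* same early exit as in the patch */
		return;
	if (depth >= MAX_DEMO_DEPTH)	/* bail out before the real stack dies */
		return;

	depth++;
	if (depth > max_depth)
		max_depth = depth;

	if (use_guard)
		pause_graph();
	record_event(use_guard);	/* re-enters trace_entry() */
	if (use_guard)
		unpause_graph();

	depth--;
}

/* Run one traced call and report how deep the hook nested. */
static int run_demo(int use_guard)
{
	graph_pause = 0;
	depth = 0;
	max_depth = 0;
	entries_recorded = 0;
	trace_entry(use_guard);
	return max_depth;
}
```

With the guard, run_demo(1) nests only once, because the re-entered hook
sees the pause count and returns immediately. Without it, run_demo(0)
recurses until it hits the demo cap. In the kernel the pause only masks
the problem; the longer-term fix is still to find the function that needs
the notrace annotation.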