From patchwork Wed May 2 20:25:30 2012
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 156547
Date: Wed, 2 May 2012 13:25:30 -0700 (PDT)
From: Hugh Dickins
To: "Paul E. McKenney"
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, "Paul E. McKenney"
Subject: Re: linux-next ppc64: RCU mods cause __might_sleep BUGs
In-Reply-To: <20120501232516.GR2441@linux.vnet.ibm.com>
References: <1335832418.20866.95.camel@pasglop> <20120501142208.GA2441@linux.vnet.ibm.com> <20120501232516.GR2441@linux.vnet.ibm.com>
List-Id: Linux on PowerPC Developers Mail List

On Tue, 1 May 2012, Paul E.
McKenney wrote:
> > > > > On Mon, 2012-04-30 at 15:37 -0700, Hugh Dickins wrote:
> > > > > >
> > > > > > BUG: sleeping function called from invalid context at include/linux/pagemap.h:354
> > > > > > in_atomic(): 0, irqs_disabled(): 0, pid: 6886, name: cc1
> > > > > > Call Trace:
> > > > > > [c0000001a99f78e0] [c00000000000f34c] .show_stack+0x6c/0x16c (unreliable)
> > > > > > [c0000001a99f7990] [c000000000077b40] .__might_sleep+0x11c/0x134
> > > > > > [c0000001a99f7a10] [c0000000000c6228] .filemap_fault+0x1fc/0x494
> > > > > > [c0000001a99f7af0] [c0000000000e7c9c] .__do_fault+0x120/0x684
> > > > > > [c0000001a99f7c00] [c000000000025790] .do_page_fault+0x458/0x664
> > > > > > [c0000001a99f7e30] [c000000000005868] handle_page_fault+0x10/0x30

Got it at last.  Embarrassingly obvious.  __rcu_read_lock() and
__rcu_read_unlock() are not safe to be using __this_cpu operations,
the cpu may change in between the rmw's read and write: they should
be using this_cpu operations (or, I put preempt_disable/enable in the
__rcu_read_unlock below).  __this_cpus there work out fine on x86,
which was given good instructions to use; but not so well on PowerPC.

I've been running successfully for an hour now with the patch below;
but I expect you'll want to consider the tradeoffs, and may choose a
different solution.

Hugh

--- 3.4-rc4-next-20120427/include/linux/rcupdate.h	2012-04-28 09:26:38.000000000 -0700
+++ testing/include/linux/rcupdate.h	2012-05-02 11:46:06.000000000 -0700
@@ -159,7 +159,7 @@ DECLARE_PER_CPU(struct task_struct *, rc
  */
 static inline void __rcu_read_lock(void)
 {
-	__this_cpu_inc(rcu_read_lock_nesting);
+	this_cpu_inc(rcu_read_lock_nesting);
 	barrier(); /* Keep code within RCU read-side critical section. */
 }

--- 3.4-rc4-next-20120427/kernel/rcupdate.c	2012-04-28 09:26:40.000000000 -0700
+++ testing/kernel/rcupdate.c	2012-05-02 11:44:13.000000000 -0700
@@ -72,6 +72,7 @@ DEFINE_PER_CPU(struct task_struct *, rcu
  */
 void __rcu_read_unlock(void)
 {
+	preempt_disable();
 	if (__this_cpu_read(rcu_read_lock_nesting) != 1)
 		__this_cpu_dec(rcu_read_lock_nesting);
 	else {
@@ -83,13 +84,14 @@ void __rcu_read_unlock(void)
 		barrier(); /* ->rcu_read_unlock_special load before assign */
 		__this_cpu_write(rcu_read_lock_nesting, 0);
 	}
-#ifdef CONFIG_PROVE_LOCKING
+#if 1 /* CONFIG_PROVE_LOCKING */
 	{
 		int rln = __this_cpu_read(rcu_read_lock_nesting);

-		WARN_ON_ONCE(rln < 0 && rln > INT_MIN / 2);
+		BUG_ON(rln < 0 && rln > INT_MIN / 2);
 	}
 #endif /* #ifdef CONFIG_PROVE_LOCKING */
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(__rcu_read_unlock);

--- 3.4-rc4-next-20120427/kernel/sched/core.c	2012-04-28 09:26:40.000000000 -0700
+++ testing/kernel/sched/core.c	2012-05-01 22:40:46.000000000 -0700
@@ -2024,7 +2024,7 @@ asmlinkage void schedule_tail(struct tas
 {
 	struct rq *rq = this_rq();

-	rcu_switch_from(prev);
+	/* rcu_switch_from(prev); */
 	rcu_switch_to();
 	finish_task_switch(rq, prev);

@@ -7093,6 +7093,10 @@ void __might_sleep(const char *file, int
 		"BUG: sleeping function called from invalid context at %s:%d\n",
 			file, line);
 	printk(KERN_ERR
+		"cpu=%d preempt_count=%x preempt_offset=%x rcu_nesting=%x nesting_save=%x\n",
+			raw_smp_processor_id(), preempt_count(), preempt_offset,
+			rcu_preempt_depth(), current->rcu_read_lock_nesting_save);
+	printk(KERN_ERR
 		"in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
 			in_atomic(), irqs_disabled(),
 			current->pid, current->comm);
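
For anyone wanting to see the hazard in isolation, here is a minimal userspace sketch of a per-CPU read-modify-write being split by a migration. It is only an illustration under stated assumptions, not kernel code: nesting[], current_cpu, racy_inc(), safe_inc() and reset() are hypothetical names invented for the example, the "migration" is a plain assignment, and the save/restore that the real code does at context switch is not modelled at all.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 2

/* Hypothetical stand-in for a per-CPU counter such as rcu_read_lock_nesting. */
static int nesting[NR_CPUS];

/* Hypothetical stand-in for "the CPU this task is running on". */
static int current_cpu;

/* Put the task on CPU 0 and give each simulated CPU a known counter value. */
static void reset(int cpu0_val, int cpu1_val)
{
	current_cpu = 0;
	nesting[0] = cpu0_val;
	nesting[1] = cpu1_val;
}

/*
 * Increment done as separate load and store, as a plain rmw would be on
 * an architecture without a single-instruction per-CPU increment.
 * migrate_to >= 0 simulates preemption plus migration between the two.
 */
static void racy_inc(int migrate_to)
{
	int val = nesting[current_cpu];	/* load on the old CPU */

	if (migrate_to >= 0) {
		assert(migrate_to < NR_CPUS);
		current_cpu = migrate_to;	/* task moved mid-rmw */
	}
	nesting[current_cpu] = val + 1;	/* store lands on the new CPU */
}

/*
 * Increment that cannot be split: any migration happens only after the
 * whole update, which is the effect of disabling preemption around it.
 */
static void safe_inc(int migrate_to)
{
	nesting[current_cpu] += 1;
	if (migrate_to >= 0) {
		assert(migrate_to < NR_CPUS);
		current_cpu = migrate_to;
	}
}

/* Print both simulated per-CPU counters. */
static void print_counters(const char *label)
{
	printf("%s: cpu0=%d cpu1=%d\n", label, nesting[0], nesting[1]);
}
```

Calling reset(0, 3) then racy_inc(1) leaves CPU 0's counter untouched and overwrites CPU 1's counter with a value derived from CPU 0; the same sequence with safe_inc(1) updates only CPU 0. print_counters() can be called after each step to watch the values.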