From patchwork Mon Feb 1 14:28:34 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Woodhouse X-Patchwork-Id: 576559 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2001:1868:205::9]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 1EE49140BA6 for ; Tue, 2 Feb 2016 01:30:57 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1aQFTB-0005sh-3T; Mon, 01 Feb 2016 14:28:45 +0000 Received: from [2001:8b0:10b:1:841b:5eff:0:3b3] (helo=i7.infradead.org) by bombadil.infradead.org with esmtpsa (Exim 4.80.1 #2 (Red Hat Linux)) id 1aQFT4-0005oN-I4; Mon, 01 Feb 2016 14:28:38 +0000 Message-ID: <1454336914.133285.297.camel@infradead.org> Subject: Re: JFFS2 deadlock From: David Woodhouse To: Thomas.Betker@rohde-schwarz.com, computersforpeace@gmail.com Date: Mon, 01 Feb 2016 14:28:34 +0000 In-Reply-To: References: <1453910781.20662.35.camel@infinera.com> <20160128000559.GA14270@google.com> X-Mailer: Evolution 3.18.3 (3.18.3-1.fc23) Mime-Version: 1.0 X-BeenThere: linux-mtd@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mtd , Artem Bityutskiy , Ming Liu , linux-kernel@vger.kernel.org, Joakim Tjernlund , "sztomi89@gmail.com" , wangzaiwei , "linux-mtd@lists.infradead.org" , Alexander Viro , Thomas Betker , linux-fsdevel@vger.kernel.org, Deng Chao Sender: "linux-mtd" Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org On Thu, 2016-01-28 at 09:16 +0100, Thomas.Betker@rohde-schwarz.com wrote: > > Subject: [PATCH] Revert "jffs2: Fix lock acquisition order bug in  > jffs2_write_begin" > http://article.gmane.org/gmane.linux.drivers.mtd/62951 > > This is a patch revising my original patch, which I sent to linux-mtd on  > 10-Nov-2015. I didn't see a response yet, but it's one of the outstanding  > patches above. That looks necessary but not sufficient. I think we need this (untested) patch on top of it, to ensure that we *always* take the page lock before f->sem? Please could you try what's in the tree at http://git.infradead.org/users/dwmw2/jffs2-fixes.git ---- From: David Woodhouse Subject: [PATCH] jffs2: Fix page lock / f->sem deadlock With this fix, all code paths should now be obtaining the page lock before f->sem. Reported-by: Szabó Tamás Reported-by: Thomas Betker Signed-off-by: David Woodhouse Cc: stable@vger.kernel.org ---  fs/jffs2/README.Locking |  5 +----  fs/jffs2/gc.c           | 17 ++++++++++-------  2 files changed, 11 insertions(+), 11 deletions(-) --  2.5.0 diff --git a/fs/jffs2/README.Locking b/fs/jffs2/README.Locking index 3ea3655..8918ac9 100644 --- a/fs/jffs2/README.Locking +++ b/fs/jffs2/README.Locking @@ -2,10 +2,6 @@   JFFS2 LOCKING DOCUMENTATION   ---------------------------   -At least theoretically, JFFS2 does not require the Big Kernel Lock -(BKL), which was always helpfully obtained for it by Linux 2.4 VFS -code. It has its own locking, as described below. -  This document attempts to describe the existing locking rules for  JFFS2. It is not expected to remain perfectly up to date, but ought to  be fairly close. @@ -69,6 +65,7 @@ Ordering constraints:      any f->sem held.   2. Never attempt to lock two file mutexes in one thread.      No ordering rules have been made for doing so. + 3. Never lock a page cache page with f->sem held.       erase_completion_lock spinlock diff --git a/fs/jffs2/gc.c b/fs/jffs2/gc.c index 6fb0802..5919fef 100644 --- a/fs/jffs2/gc.c +++ b/fs/jffs2/gc.c @@ -1316,14 +1316,17 @@ static int jffs2_garbage_collect_dnode(struct jffs2_sb_info *c, struct jffs2_era   BUG_ON(start > orig_start);   }   - /* First, use readpage() to read the appropriate page into the page cache */ - /* Q: What happens if we actually try to GC the _same_ page for which commit_write() -  *    triggered garbage collection in the first place? -  * A: I _think_ it's OK. read_cache_page shouldn't deadlock, we'll write out the -  *    page OK. We'll actually write it out again in commit_write, which is a little -  *    suboptimal, but at least we're correct. -  */ + /* The rules state that we must obtain the page lock *before* f->sem, so +  * drop f->sem temporarily. Since we also hold c->alloc_sem, nothing's +  * actually going to *change* so we're safe; we only allow reading. +  * +  * It is important to note that jffs2_write_begin() will ensure that its +  * page is marked Uptodate before allocating space. That means that if we +  * end up here trying to GC the *same* page that jffs2_write_begin() is +  * trying to write out, read_cache_page() will not deadlock. */ + mutex_unlock(&f->sem);   pg_ptr = jffs2_gc_fetch_page(c, f, start, &pg); + mutex_lock(&f->sem);     if (IS_ERR(pg_ptr)) {   pr_warn("read_cache_page() returned error: %ld\n",