From patchwork Wed Aug 1 03:27:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liu Song X-Patchwork-Id: 951883 X-Patchwork-Delegate: dwmw2@infradead.org Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:e::133; helo=bombadil.infradead.org; envelope-from=linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=zte.com.cn Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="Ng2P6Cr6"; dkim-atps=neutral Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:e::133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41gJhR6rGjz9s3x for ; Wed, 1 Aug 2018 13:29:43 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To: References:List-Owner; bh=5fpJ32AcGh0WoX0Hm42/saqOLitAMQ1i3O/9nrhogas=; b=Ng2 P6Cr6ltDrz73WpbuTTLNa9LkNs5lwzfYifW4RFwbvYWX3OBLVrfNtwA+HR2ZzNWS+lOmi5bgCdMKr 3gZ3+ICM84JMN/6KWO27BPzzM/0AcCIf1/I0UTPx0Nri/Ut8dQoLATaCEx8zw/FAD/pFom3X/z+Wi 2YDsIdjGt/R0cm5lkRkK+eHLaqYUaeJLDPkAGzVhNHwaYP3gRA4S70RLOk0TvFzIMz95jclPf3np4 ELevCtfKSaEq1AX1Zf/E98PW9vXV1V7XfWel1YnDsR0otbxIH9DaI+scpddi2Fz0B0EETI4ZRNB12 lATId7LV+3loeudZgRWiGX7ggQbm3jA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1fkhpM-0006kk-0J; Wed, 01 Aug 2018 03:29:32 +0000 Received: from out1.zte.com.cn ([202.103.147.172] helo=mxct.zte.com.cn) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1fkhpG-0006jk-Cy for linux-mtd@lists.infradead.org; Wed, 01 Aug 2018 03:29:29 +0000 Received: from mse01.zte.com.cn (unknown [10.30.3.20]) by Forcepoint Email with ESMTPS id 7336AE789105E71BF2FD; Wed, 1 Aug 2018 11:29:04 +0800 (CST) Received: from notes_smtp.zte.com.cn ([10.30.1.239]) by mse01.zte.com.cn with ESMTP id w713SxwW060402; Wed, 1 Aug 2018 11:28:59 +0800 (GMT-8) (envelope-from liu.song11@zte.com.cn) Received: from localhost.localdomain ([10.75.10.200]) by szsmtp06.zte.com.cn (Lotus Domino Release 8.5.3FP6) with ESMTP id 2018080111290448-2795477 ; Wed, 1 Aug 2018 11:29:04 +0800 From: Liu Song To: dwmw2@infradead.org Subject: [PATCH] fs/jffs2 : prevent gc to pick block which includes inode with I_NEW state Date: Wed, 1 Aug 2018 11:27:26 +0800 Message-Id: <1533094046-132789-1-git-send-email-liu.song11@zte.com.cn> X-Mailer: git-send-email 1.8.3.1 X-MIMETrack: Itemize by SMTP Server on SZSMTP06/server/zte_ltd(Release 8.5.3FP6|November 21, 2013) at 2018-08-01 11:29:04, Serialize by Router on notes_smtp/zte_ltd(Release 9.0.1FP7|August 17, 2016) at 2018-08-01 11:28:52, Serialize complete at 2018-08-01 11:28:52 X-MAIL: mse01.zte.com.cn w713SxwW060402 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20180731_202926_801431_6C3AFD13 X-CRM114-Status: GOOD ( 18.46 ) X-Spam-Score: -2.3 (--) X-Spam-Report: SpamAssassin version 3.4.1 on bombadil.infradead.org summary: Content analysis details: (-2.3 points) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium trust [202.103.147.172 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-mtd@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: jiang.biao2@zte.com.cn, linux-mtd@lists.infradead.org, liu.song11@zte.com.cn, linux-kernel@vger.kernel.org, zhong.weidong@zte.com.cn MIME-Version: 1.0 Sender: "linux-mtd" Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org In JFFS2, the process of creating a new file is split into two parts. First, create the inode, name it "Part A", then create the dirent, named it "Part B", both need reserve space. In Part A, a inode with I_NEW state has been written in the block, we name it "block_a". The inode I_NEW state will be cleared after Part B finished. After "Part A" finished, alloc_sem lock is released. Other waiting process able to run and write data in "block_a" until no more space in the block. Next, if reserve space trigger gc, then "block_a" is possible been picked by gc to collect. Then, there is an inode with state I_NEW in "block_a", and gc collect will find that inode and wait on I_NEW clear. Because gc holds alloc_sem lock, so Part B waiting alloc_sem forever. System enter a deadlock state. The call trace is as follows: ----------------------------- [43850.832264] Call trace: [43850.834700] [] __switch_to+0xcc/0xd8 [43850.839834] [] __schedule+0x180/0x468 [43850.845048] [] schedule+0x38/0xb8 [43850.849920] [] bit_wait+0x14/0x68 [43850.854787] [] __wait_on_bit+0xa4/0xe0 [43850.860093] [] out_of_line_wait_on_bit+0x60/0x68 [43850.866263] [] iget_locked+0xd8/0x1a8 [43850.871483] [] jffs2_iget+0x14/0x330 [43850.876610] [] jffs2_gc_fetch_inode+0x5c/0x148 [43850.882611] [] jffs2_garbage_collect_pass+0x678/0x810 [43850.889219] [] jffs2_reserve_space+0x170/0x338 [43850.895215] [] jffs2_do_unlink+0x5c/0x240 [43850.900782] [] jffs2_rename+0x108/0x298 [43850.906169] [] vfs_rename+0x840/0x858 [43850.911389] [] SyS_renameat2+0x430/0x4a0 [43850.916863] [] SyS_renameat+0x10/0x18 [43850.922083] [] __sys_trace_return+0x0/0x4 This patch add "block_a" into a prevent list, which prevent gc to pick for collecting. After testing, it can effectively solve the deadlock bug. Reported-by: Yang Yi Signed-off-by: Liu Song Tested-by: Yang Yi Reviewed-by: Jiang Biao --- fs/jffs2/build.c | 1 + fs/jffs2/dir.c | 22 +++++++++++++++++++++- fs/jffs2/gc.c | 4 ++++ fs/jffs2/jffs2_fs_sb.h | 1 + fs/jffs2/nodelist.h | 46 +++++++++++++++++++++++++++++++++++++++++++++- fs/jffs2/write.c | 5 ++++- 6 files changed, 76 insertions(+), 3 deletions(-) diff --git a/fs/jffs2/build.c b/fs/jffs2/build.c index b288c8a..bee6521 100644 --- a/fs/jffs2/build.c +++ b/fs/jffs2/build.c @@ -403,6 +403,7 @@ int jffs2_do_mount_fs(struct jffs2_sb_info *c) INIT_LIST_HEAD(&c->free_list); INIT_LIST_HEAD(&c->bad_list); INIT_LIST_HEAD(&c->bad_used_list); + INIT_LIST_HEAD(&c->prevent_gc_list); c->highest_ino = 1; c->summary = NULL; diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c index b2944f9..b6782f1 100644 --- a/fs/jffs2/dir.c +++ b/fs/jffs2/dir.c @@ -164,6 +164,7 @@ static int jffs2_create(struct inode *dir_i, struct dentry *dentry, struct jffs2_inode_info *f, *dir_f; struct jffs2_sb_info *c; struct inode *inode; + struct prevent_block pb; int ret; ri = jffs2_alloc_raw_inode(); @@ -197,7 +198,7 @@ static int jffs2_create(struct inode *dir_i, struct dentry *dentry, no chance of AB-BA deadlock involving its f->sem). */ mutex_unlock(&f->sem); - ret = jffs2_do_create(c, dir_f, f, ri, &dentry->d_name); + ret = jffs2_do_create(c, dir_f, f, ri, &dentry->d_name, &pb); if (ret) goto fail; @@ -210,10 +211,12 @@ static int jffs2_create(struct inode *dir_i, struct dentry *dentry, f->inocache->pino_nlink, inode->i_mapping->nrpages); d_instantiate_new(dentry, inode); + remove_prevent_gc_list(c, &pb); return 0; fail: iget_failed(inode); + remove_prevent_gc_list(c, &pb); jffs2_free_raw_inode(ri); return ret; } @@ -285,6 +288,7 @@ static int jffs2_symlink (struct inode *dir_i, struct dentry *dentry, const char struct jffs2_raw_dirent *rd; struct jffs2_full_dnode *fn; struct jffs2_full_dirent *fd; + struct prevent_block pb; int namelen; uint32_t alloclen; int ret, targetlen = strlen(target); @@ -346,6 +350,8 @@ static int jffs2_symlink (struct inode *dir_i, struct dentry *dentry, const char goto fail; } + add_prevent_gc_list(c, &pb); + /* We use f->target field to store the target path. */ f->target = kmemdup(target, targetlen + 1, GFP_KERNEL); if (!f->target) { @@ -430,10 +436,12 @@ static int jffs2_symlink (struct inode *dir_i, struct dentry *dentry, const char jffs2_complete_reservation(c); d_instantiate_new(dentry, inode); + remove_prevent_gc_list(c, &pb); return 0; fail: iget_failed(inode); + remove_prevent_gc_list(c, &pb); return ret; } @@ -447,6 +455,7 @@ static int jffs2_mkdir (struct inode *dir_i, struct dentry *dentry, umode_t mode struct jffs2_raw_dirent *rd; struct jffs2_full_dnode *fn; struct jffs2_full_dirent *fd; + struct prevent_block pb; int namelen; uint32_t alloclen; int ret; @@ -503,6 +512,9 @@ static int jffs2_mkdir (struct inode *dir_i, struct dentry *dentry, umode_t mode ret = PTR_ERR(fn); goto fail; } + + add_prevent_gc_list(c, &pb); + /* No data here. Only a metadata node, which will be obsoleted by the first data write */ @@ -574,10 +586,12 @@ static int jffs2_mkdir (struct inode *dir_i, struct dentry *dentry, umode_t mode jffs2_complete_reservation(c); d_instantiate_new(dentry, inode); + remove_prevent_gc_list(c, &pb); return 0; fail: iget_failed(inode); + remove_prevent_gc_list(c, &pb); return ret; } @@ -614,6 +628,7 @@ static int jffs2_mknod (struct inode *dir_i, struct dentry *dentry, umode_t mode struct jffs2_raw_dirent *rd; struct jffs2_full_dnode *fn; struct jffs2_full_dirent *fd; + struct prevent_block pb; int namelen; union jffs2_device_node dev; int devlen = 0; @@ -672,6 +687,9 @@ static int jffs2_mknod (struct inode *dir_i, struct dentry *dentry, umode_t mode ret = PTR_ERR(fn); goto fail; } + + add_prevent_gc_list(c, &pb); + /* No data here. Only a metadata node, which will be obsoleted by the first data write */ @@ -745,10 +763,12 @@ static int jffs2_mknod (struct inode *dir_i, struct dentry *dentry, umode_t mode jffs2_complete_reservation(c); d_instantiate_new(dentry, inode); + remove_prevent_gc_list(c, &pb); return 0; fail: iget_failed(inode); + remove_prevent_gc_list(c, &pb); return ret; } diff --git a/fs/jffs2/gc.c b/fs/jffs2/gc.c index 9ed0f26..1fe4b11 100644 --- a/fs/jffs2/gc.c +++ b/fs/jffs2/gc.c @@ -95,6 +95,10 @@ static struct jffs2_eraseblock *jffs2_find_gc_block(struct jffs2_sb_info *c) } ret = list_entry(nextlist->next, struct jffs2_eraseblock, list); + if (is_prevented_block(c, ret)) { + n = jiffies % 128; + goto again; + } list_del(&ret->list); c->gcblock = ret; ret->gc_node = ret->first_node; diff --git a/fs/jffs2/jffs2_fs_sb.h b/fs/jffs2/jffs2_fs_sb.h index 778275f..a018fbf 100644 --- a/fs/jffs2/jffs2_fs_sb.h +++ b/fs/jffs2/jffs2_fs_sb.h @@ -106,6 +106,7 @@ struct jffs2_sb_info { struct list_head free_list; /* Blocks which are free and ready to be used */ struct list_head bad_list; /* Bad blocks. */ struct list_head bad_used_list; /* Bad blocks with valid data in. */ + struct list_head prevent_gc_list; /* Prevent blocks picked for gc. */ spinlock_t erase_completion_lock; /* Protect free_list and erasing_list against erase completion handler */ diff --git a/fs/jffs2/nodelist.h b/fs/jffs2/nodelist.h index 0637271..d54b634 100644 --- a/fs/jffs2/nodelist.h +++ b/fs/jffs2/nodelist.h @@ -98,6 +98,12 @@ struct jffs2_raw_node_ref /* Use blocks of about 256 bytes */ #define REFS_PER_BLOCK ((255/sizeof(struct jffs2_raw_node_ref))-1) +struct prevent_block { + struct list_head list; + struct jffs2_eraseblock *jeb; + int in_use; +}; + static inline struct jffs2_raw_node_ref *ref_next(struct jffs2_raw_node_ref *ref) { ref++; @@ -405,7 +411,7 @@ int jffs2_write_inode_range(struct jffs2_sb_info *c, struct jffs2_inode_info *f, struct jffs2_raw_inode *ri, unsigned char *buf, uint32_t offset, uint32_t writelen, uint32_t *retlen); int jffs2_do_create(struct jffs2_sb_info *c, struct jffs2_inode_info *dir_f, struct jffs2_inode_info *f, - struct jffs2_raw_inode *ri, const struct qstr *qstr); + struct jffs2_raw_inode *ri, const struct qstr *qstr, struct prevent_block *pb); int jffs2_do_unlink(struct jffs2_sb_info *c, struct jffs2_inode_info *dir_f, const char *name, int namelen, struct jffs2_inode_info *dead_f, uint32_t time); int jffs2_do_link(struct jffs2_sb_info *c, struct jffs2_inode_info *dir_f, uint32_t ino, @@ -479,6 +485,44 @@ int jffs2_check_nand_cleanmarker(struct jffs2_sb_info *c, struct jffs2_erasebloc int jffs2_write_nand_cleanmarker(struct jffs2_sb_info *c, struct jffs2_eraseblock *jeb); #endif +static inline void add_prevent_gc_list(struct jffs2_sb_info *c, struct prevent_block *pb) +{ + BUG_ON(!mutex_is_locked(&c->alloc_sem)); + pb->jeb = c->nextblock; + pb->in_use = 1; + list_add(&pb->list, &c->prevent_gc_list); +} + +static inline void remove_prevent_gc_list(struct jffs2_sb_info *c, struct prevent_block *pb) +{ + if (pb->in_use == 1) { + mutex_lock(&c->alloc_sem); + list_del(&pb->list); + mutex_unlock(&c->alloc_sem); + } + pb->in_use = 0; +} + +static inline int is_prevented_block(struct jffs2_sb_info *c, struct jffs2_eraseblock *jeb) +{ + int ret = 0; + struct list_head *this = NULL; + struct prevent_block *pb = NULL; + + BUG_ON(!mutex_is_locked(&c->alloc_sem)); + if (list_empty(&c->prevent_gc_list)) + goto out; + + list_for_each(this, &c->prevent_gc_list) { + pb = list_entry(this, struct prevent_block, list); + if (pb->jeb != NULL && pb->jeb->offset == jeb->offset) { + ret = 1; + break; + } + } +out: + return ret; +} #include "debug.h" #endif /* __JFFS2_NODELIST_H__ */ diff --git a/fs/jffs2/write.c b/fs/jffs2/write.c index cda9a36..bd95d3b 100644 --- a/fs/jffs2/write.c +++ b/fs/jffs2/write.c @@ -440,7 +440,7 @@ int jffs2_write_inode_range(struct jffs2_sb_info *c, struct jffs2_inode_info *f, int jffs2_do_create(struct jffs2_sb_info *c, struct jffs2_inode_info *dir_f, struct jffs2_inode_info *f, struct jffs2_raw_inode *ri, - const struct qstr *qstr) + const struct qstr *qstr, struct prevent_block *pb) { struct jffs2_raw_dirent *rd; struct jffs2_full_dnode *fn; @@ -474,6 +474,9 @@ int jffs2_do_create(struct jffs2_sb_info *c, struct jffs2_inode_info *dir_f, jffs2_complete_reservation(c); return PTR_ERR(fn); } + + add_prevent_gc_list(c, pb); + /* No data here. Only a metadata node, which will be obsoleted by the first data write */