diff mbox

[2/4] jffs2: fix unbalanced locking

Message ID 20140214070122.GG3500@norris-Latitude-E6410
State Accepted
Commit 7aaea7605c0e19fa7b38d7ac5dcd818942fd17a7
Headers show

Commit Message

Brian Norris Feb. 14, 2014, 7:01 a.m. UTC
Hi Li,

On Thu, Feb 13, 2014 at 04:27:16PM +0800, Li Zefan wrote:
> On 2014/2/13 14:48, Brian Norris wrote:
> > On Wed, Feb 12, 2014 at 12:44:55PM -0800, Andrew Morton wrote:
> >> From: Li Zefan <lizefan@huawei.com>
> >> Subject: jffs2: fix unbalanced locking
> >>
> >> This was found by our internal debugging feature on runtime, but this bug
> >> won't lead to deadlock, as the structure that this lock is embedded in is
> >> freed on error.
> > 
> > Well, one of its callers frees it without unlocking it, but you're
> > forgetting about one of its other callers, and in doing so, you are
> > introducing a potential double-unlock instead!
> 
> But I don't think I should be blamed here.
> 
> Without my patch, some of the error paths release f->sem but some don't,
> so the potential double-unlock is already there.

So you can't be blamed for making *all* cases a double unlock? And this
patch is incorrect as well. See below.

> > BTW, the right way to handle lock balancing is to handle the unlocking
> > at the same level where you do the locking. So I guess you're looking
> > for the following patch instead, which is really not very useful because
> > (as Li noted) the lock is freed immediately afterward anyway:
> > 
> 
> Yeah, I do believe it's better to do locking/unlocking in the same level.
> How about this:
> 
> [PATCH v2] jffs2: fix unbalanced locking
[...]
> diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c
> index 386303d..6f22234 100644
> --- a/fs/jffs2/readinode.c
> +++ b/fs/jffs2/readinode.c
[...]
> @@ -1317,8 +1308,13 @@ static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
>  	}
>  	if (f->inocache->state == INO_STATE_READING)
>  		jffs2_set_inocache_state(c, f->inocache, INO_STATE_PRESENT);
> -
> -	return 0;
> +out:
> +	if (ret) {
> +		mutex_unlock(&f->sem);
> +		jffs2_do_clear_inode(c, f);
> +		mutex_lock(&f->sem);

This still is not consistent, and this access pattern (unlock / call
function which uses lock / lock) seems like a hack. How about we just
make the callers all do the same cleanup instead? Notice the 'error'
label in jffs2_iget(), which performs the jffs2_do_clear_inode() itself
already -- seemingly redundant. At least your patch doesn't add double
unlocks this time...

> +	}
> +	return ret;
>  }
>  
>  /* Scan the list of all nodes present for this ino, build map of versions, etc. */
[...]

How about we just push all the mutex_unlock(&f->sem) and
jffs2_do_clear_inode() to the callers which hold the mutex? The
following patch is totally untested:

(BTW Li, have you actually tested any of your patches? I'm going to need
some Tested-by's before I merge any of this.)

Signed-off-by: Brian Norris <norris@broadcom.com>

Comments

Zefan Li Feb. 19, 2014, 8:04 a.m. UTC | #1
>>> BTW, the right way to handle lock balancing is to handle the unlocking
>>> at the same level where you do the locking. So I guess you're looking
>>> for the following patch instead, which is really not very useful because
>>> (as Li noted) the lock is freed immediately afterward anyway:
>>>
>>
>> Yeah, I do believe it's better to do locking/unlocking in the same level.
>> How about this:
>>
>> [PATCH v2] jffs2: fix unbalanced locking
> [...]
>> diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c
>> index 386303d..6f22234 100644
>> --- a/fs/jffs2/readinode.c
>> +++ b/fs/jffs2/readinode.c
> [...]
>> @@ -1317,8 +1308,13 @@ static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
>>  	}
>>  	if (f->inocache->state == INO_STATE_READING)
>>  		jffs2_set_inocache_state(c, f->inocache, INO_STATE_PRESENT);
>> -
>> -	return 0;
>> +out:
>> +	if (ret) {
>> +		mutex_unlock(&f->sem);
>> +		jffs2_do_clear_inode(c, f);
>> +		mutex_lock(&f->sem);
> 
> This still is not consistent, and this access pattern (unlock / call
> function which uses lock / lock) seems like a hack. How about we just
> make the callers all do the same cleanup instead? Notice the 'error'
> label in jffs2_iget(), which performs the jffs2_do_clear_inode() itself
> already -- seemingly redundant. At least your patch doesn't add double
> unlocks this time...
> 
>> +	}
>> +	return ret;
>>  }
>>  
>>  /* Scan the list of all nodes present for this ino, build map of versions, etc. */
> [...]
> 
> How about we just push all the mutex_unlock(&f->sem) and
> jffs2_do_clear_inode() to the callers which hold the mutex? The
> following patch is totally untested:
> 

Yeah, this is better.

> (BTW Li, have you actually tested any of your patches? I'm going to need
> some Tested-by's before I merge any of this.)
> 

"jffs2: avoid soft-lockup in jffs2_reserve_space_gc()" and an older version of
"jffs2: unlock f->sem on error in jffs2_new_inode()" were tested on 2.6.34
kernel.

We're moving to 3.10, and those jffs2 patches will be tested on 3.10, but
currently we don't have test environment, and it might take month for
the test environment to be available...
diff mbox

Patch

diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c
index a69e426435dd..9ac37354852d 100644
--- a/fs/jffs2/fs.c
+++ b/fs/jffs2/fs.c
@@ -272,12 +272,9 @@  struct inode *jffs2_iget(struct super_block *sb, unsigned long ino)
 	mutex_lock(&f->sem);
 
 	ret = jffs2_do_read_inode(c, f, inode->i_ino, &latest_node);
+	if (ret)
+		goto error;
 
-	if (ret) {
-		mutex_unlock(&f->sem);
-		iget_failed(inode);
-		return ERR_PTR(ret);
-	}
 	inode->i_mode = jemode_to_cpu(latest_node.mode);
 	i_uid_write(inode, je16_to_cpu(latest_node.uid));
 	i_gid_write(inode, je16_to_cpu(latest_node.gid));
diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c
index 386303dca382..5539ed46e89b 100644
--- a/fs/jffs2/readinode.c
+++ b/fs/jffs2/readinode.c
@@ -1203,17 +1203,13 @@  static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
 		JFFS2_ERROR("failed to read from flash: error %d, %zd of %zd bytes read\n",
 			ret, retlen, sizeof(*latest_node));
 		/* FIXME: If this fails, there seems to be a memory leak. Find it. */
-		mutex_unlock(&f->sem);
-		jffs2_do_clear_inode(c, f);
-		return ret?ret:-EIO;
+		return ret ? ret : -EIO;
 	}
 
 	crc = crc32(0, latest_node, sizeof(*latest_node)-8);
 	if (crc != je32_to_cpu(latest_node->node_crc)) {
 		JFFS2_ERROR("CRC failed for read_inode of inode %u at physical location 0x%x\n",
 			f->inocache->ino, ref_offset(rii.latest_ref));
-		mutex_unlock(&f->sem);
-		jffs2_do_clear_inode(c, f);
 		return -EIO;
 	}
 
@@ -1250,16 +1246,11 @@  static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
 			 * keep in RAM to facilitate quick follow symlink
 			 * operation. */
 			uint32_t csize = je32_to_cpu(latest_node->csize);
-			if (csize > JFFS2_MAX_NAME_LEN) {
-				mutex_unlock(&f->sem);
-				jffs2_do_clear_inode(c, f);
+			if (csize > JFFS2_MAX_NAME_LEN)
 				return -ENAMETOOLONG;
-			}
 			f->target = kmalloc(csize + 1, GFP_KERNEL);
 			if (!f->target) {
 				JFFS2_ERROR("can't allocate %u bytes of memory for the symlink target path cache\n", csize);
-				mutex_unlock(&f->sem);
-				jffs2_do_clear_inode(c, f);
 				return -ENOMEM;
 			}
 
@@ -1271,8 +1262,6 @@  static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
 					ret = -EIO;
 				kfree(f->target);
 				f->target = NULL;
-				mutex_unlock(&f->sem);
-				jffs2_do_clear_inode(c, f);
 				return ret;
 			}
 
@@ -1289,15 +1278,11 @@  static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
 		if (f->metadata) {
 			JFFS2_ERROR("Argh. Special inode #%u with mode 0%o had metadata node\n",
 			       f->inocache->ino, jemode_to_cpu(latest_node->mode));
-			mutex_unlock(&f->sem);
-			jffs2_do_clear_inode(c, f);
 			return -EIO;
 		}
 		if (!frag_first(&f->fragtree)) {
 			JFFS2_ERROR("Argh. Special inode #%u with mode 0%o has no fragments\n",
 			       f->inocache->ino, jemode_to_cpu(latest_node->mode));
-			mutex_unlock(&f->sem);
-			jffs2_do_clear_inode(c, f);
 			return -EIO;
 		}
 		/* ASSERT: f->fraglist != NULL */
@@ -1305,8 +1290,6 @@  static int jffs2_do_read_inode_internal(struct jffs2_sb_info *c,
 			JFFS2_ERROR("Argh. Special inode #%u with mode 0x%x had more than one node\n",
 			       f->inocache->ino, jemode_to_cpu(latest_node->mode));
 			/* FIXME: Deal with it - check crc32, check for duplicate node, check times and discard the older one */
-			mutex_unlock(&f->sem);
-			jffs2_do_clear_inode(c, f);
 			return -EIO;
 		}
 		/* OK. We're happy */
@@ -1400,10 +1383,8 @@  int jffs2_do_crccheck_inode(struct jffs2_sb_info *c, struct jffs2_inode_cache *i
 	f->inocache = ic;
 
 	ret = jffs2_do_read_inode_internal(c, f, &n);
-	if (!ret) {
-		mutex_unlock(&f->sem);
-		jffs2_do_clear_inode(c, f);
-	}
+	mutex_unlock(&f->sem);
+	jffs2_do_clear_inode(c, f);
 	jffs2_xattr_do_crccheck_inode(c, ic);
 	kfree (f);
 	return ret;