[3/3] ext4: Handle non empty on-disk orphan link

Message ID 1267132807-5882-3-git-send-email-dmonakhov@openvz.org
State Accepted, archived

Commit Message

Dmitry Monakhov Feb. 25, 2010, 9:20 p.m. UTC
On truncate errors we explicitly remove the inode from the in-core
orphan list via orphan_del(NULL, inode), without modifying the on-disk
list.
But the same inode may later be inserted into the orphan list again,
which results in on-disk link corruption.  If the inode's i_dtime (which
holds the next-orphan link) already contains a valid inode number, skip
the on-disk list modification.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/namei.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)
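
The failure mode described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: struct toy_fs, INODES_COUNT and the helper names are
hypothetical, the array next_orphan[] stands in for NEXT_ORPHAN(inode),
and last_orphan stands in for s_last_orphan. Inserting an inode that is
still linked on disk rewrites its next pointer and creates a cycle; the
range check mirrors the one added by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_COUNT 8u  /* hypothetical stand-in for s_inodes_count */

/* Toy model of the on-disk orphan list; inode number 0 ends the list. */
struct toy_fs {
	uint32_t last_orphan;                   /* models s_last_orphan */
	uint32_t next_orphan[INODES_COUNT + 1]; /* models NEXT_ORPHAN(inode) */
};

/* Same shape as the check in the patch: a non-zero link that is also a
 * plausible inode number means the inode is already on the on-disk list. */
static int already_linked(const struct toy_fs *fs, uint32_t ino)
{
	uint32_t next = fs->next_orphan[ino];

	return next != 0 && next <= INODES_COUNT;
}

/* Behaviour before the patch: always insert at the head of the list. */
static void orphan_add_unchecked(struct toy_fs *fs, uint32_t ino)
{
	assert(ino >= 1 && ino <= INODES_COUNT);
	fs->next_orphan[ino] = fs->last_orphan;
	fs->last_orphan = ino;
}

/* Behaviour with the patch: leave the on-disk links alone if the inode
 * already looks linked. */
static void orphan_add_checked(struct toy_fs *fs, uint32_t ino)
{
	assert(ino >= 1 && ino <= INODES_COUNT);
	if (already_linked(fs, ino))
		return;
	orphan_add_unchecked(fs, ino);
}

/* Walk the list and return its length, or -1 if the walk takes more
 * steps than there are inodes (which means the list contains a cycle). */
static int orphan_walk(const struct toy_fs *fs)
{
	int n = 0;
	uint32_t ino;

	for (ino = fs->last_orphan; ino != 0; ino = fs->next_orphan[ino]) {
		if (++n > (int)INODES_COUNT)
			return -1;
	}
	return n;
}
```

In this model, adding inodes 3 and 5 and then calling
orphan_add_unchecked() on 5 again makes inode 5 point at itself, so
orphan_walk() reports a cycle; orphan_add_checked() leaves the list
intact. The model also shows one case the check cannot see: the tail of
the list has a next pointer of 0, so re-adding it is not detected.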

Comments

Dmitry Monakhov Feb. 25, 2010, 10:55 p.m. UTC | #1
Dmitry Monakhov <dmonakhov@openvz.org> writes:

> On truncate errors we explicitly remove the inode from the in-core
> orphan list via orphan_del(NULL, inode), without modifying the on-disk
> list.
> But the same inode may later be inserted into the orphan list again,
> which results in on-disk link corruption.
There is another, "100% reliable" way to solve the issue.
On a truncate error, instead of clearing the inode's in-core list
entry, we could move it to a separate sb->s_orphan_error list.
orphan_add() would then work without changes, because its
!list_empty() check would behave as expected, and it would still be
possible to call orphan_del() later.
We could even try to replay this s_orphan_error list later, for
example before umount/remount.
But this solution has a major disadvantage: we would have to pin the
inode in memory to prevent it from being pruned.
That is not the best choice, because truncate usually fails
because of ENOMEM.

That's why I use this simple, though not absolutely reliable, approach.
>  If the inode's i_dtime (which holds the next-orphan link) already
> contains a valid inode number, skip the on-disk list modification.

>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
>  fs/ext4/namei.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 17a17e1..19ca9bf 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -2020,6 +2020,13 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
>  	err = ext4_reserve_inode_write(handle, inode, &iloc);
>  	if (err)
>  		goto out_unlock;
> +	/*
> +	 * Due to previous errors the inode may already be part of the on-disk
> +	 * orphan list. If so, skip the on-disk list modification.
> +	 */
> +	if (NEXT_ORPHAN(inode) && NEXT_ORPHAN(inode) <=
> +		(le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)))
> +			goto mem_insert;
>  
>  	/* Insert this inode at the head of the on-disk orphan list... */
>  	NEXT_ORPHAN(inode) = le32_to_cpu(EXT4_SB(sb)->s_es->s_last_orphan);
> @@ -2037,6 +2044,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
>  	 *
>  	 * This is safe: on error we're going to ignore the orphan list
>  	 * anyway on the next recovery. */
> +mem_insert:
>  	if (!err)
>  		list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan);
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o March 2, 2010, 4:33 a.m. UTC | #2
Thanks, I've added these three patches to the ext4 patch queue.

	     	   	       	       	  - Ted

Patch

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 17a17e1..19ca9bf 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2020,6 +2020,13 @@  int ext4_orphan_add(handle_t *handle, struct inode *inode)
 	err = ext4_reserve_inode_write(handle, inode, &iloc);
 	if (err)
 		goto out_unlock;
+	/*
+	 * Due to previous errors the inode may already be part of the on-disk
+	 * orphan list. If so, skip the on-disk list modification.
+	 */
+	if (NEXT_ORPHAN(inode) && NEXT_ORPHAN(inode) <=
+		(le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)))
+			goto mem_insert;
 
 	/* Insert this inode at the head of the on-disk orphan list... */
 	NEXT_ORPHAN(inode) = le32_to_cpu(EXT4_SB(sb)->s_es->s_last_orphan);
@@ -2037,6 +2044,7 @@  int ext4_orphan_add(handle_t *handle, struct inode *inode)
 	 *
 	 * This is safe: on error we're going to ignore the orphan list
 	 * anyway on the next recovery. */
+mem_insert:
 	if (!err)
 		list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan);