diff mbox

ext4: Resolve the hang of direct i/o read in handling EXT4_IO_END_UNWRITTEN.

Message ID 1313476673-4866-1-git-send-email-tm@tao.ma
State Accepted, archived
Headers show

Commit Message

Tao Ma Aug. 16, 2011, 6:37 a.m. UTC
From: Tao Ma <boyu.mt@taobao.com>

EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten should
be done simultaneously since ext4_end_io_nolock always clear the flag and
decrease the counter in the same time.

We don't increase i_aiodio_unwritten when setting EXT4_IO_END_UNWRITTEN so
it will go nagative and causes some process to wait forever.

Part of the patch came from Eric in his e-mail, but it doesn't fix the problem
met by Michael actually.
http://marc.info/?l=linux-ext4&m=131316851417460&w=2

Reported-and-Tested-by: Michael Tokarev<mjt@tls.msk.ru>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
Eric talked about creating a helper for this and I feel that it looks
like a new functionality not a bug fix and decided to leave it to the
next merge window.

 fs/ext4/inode.c   |    9 ++++++++-
 fs/ext4/page-io.c |    6 ++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

Comments

Eric Sandeen Aug. 16, 2011, 2:06 p.m. UTC | #1
On 8/16/11 1:37 AM, Tao Ma wrote:
> From: Tao Ma <boyu.mt@taobao.com>
> 
> EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten should
> be done simultaneously since ext4_end_io_nolock always clear the flag and
> decrease the counter in the same time.
> 
> We don't increase i_aiodio_unwritten when setting EXT4_IO_END_UNWRITTEN so
> it will go nagative and causes some process to wait forever.
> 
> Part of the patch came from Eric in his e-mail, but it doesn't fix the problem
> met by Michael actually.
> http://marc.info/?l=linux-ext4&m=131316851417460&w=2
> 
> Reported-and-Tested-by: Michael Tokarev<mjt@tls.msk.ru>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: Tao Ma <boyu.mt@taobao.com>

Yes, this looks right to me.  It's the patch I meant to send, but
I missed one spot in my haste.  ;)  Thanks.

-Eric

> ---
> Eric talked about creating a helper for this and I feel that it looks
> like a new functionality not a bug fix and decided to leave it to the
> next merge window.
> 
>  fs/ext4/inode.c   |    9 ++++++++-
>  fs/ext4/page-io.c |    6 ++++--
>  2 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d47264c..40c0b10 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2668,8 +2668,15 @@ static void ext4_end_io_buffer_write(struct buffer_head *bh, int uptodate)
>  		goto out;
>  	}
>  
> -	io_end->flag = EXT4_IO_END_UNWRITTEN;
> +	/*
> +	 * It may be over-defensive here to check EXT4_IO_END_UNWRITTEN now,
> +	 * but being more careful is always safe for the future change.
> +	 */
>  	inode = io_end->inode;
> +	if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
> +		io_end->flag |= EXT4_IO_END_UNWRITTEN;
> +		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
> +	}
>  
>  	/* Add the io_end to per-inode completed io list*/
>  	spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags);
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 430c401..78839af 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -334,8 +334,10 @@ submit_and_retry:
>  	if ((io_end->num_io_pages >= MAX_IO_PAGES) &&
>  	    (io_end->pages[io_end->num_io_pages-1] != io_page))
>  		goto submit_and_retry;
> -	if (buffer_uninit(bh))
> -		io->io_end->flag |= EXT4_IO_END_UNWRITTEN;
> +	if (buffer_uninit(bh) && !(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
> +		io_end->flag |= EXT4_IO_END_UNWRITTEN;
> +		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
> +	}
>  	io->io_end->size += bh->b_size;
>  	io->io_next_block++;
>  	ret = bio_add_page(io->io_bio, bh->b_page, bh->b_size, bh_offset(bh));

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o Aug. 16, 2011, 6:29 p.m. UTC | #2
On Tue, Aug 16, 2011 at 02:37:53PM +0800, Tao Ma wrote:
> From: Tao Ma <boyu.mt@taobao.com>
> 
> EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten should
> be done simultaneously since ext4_end_io_nolock always clear the flag and
> decrease the counter in the same time.
> 
> We don't increase i_aiodio_unwritten when setting EXT4_IO_END_UNWRITTEN so
> it will go nagative and causes some process to wait forever.
> 
> Part of the patch came from Eric in his e-mail, but it doesn't fix the problem
> met by Michael actually.
> http://marc.info/?l=linux-ext4&m=131316851417460&w=2
> 
> Reported-and-Tested-by: Michael Tokarev<mjt@tls.msk.ru>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: Tao Ma <boyu.mt@taobao.com>

Thanks I've taken this into the ext4 tree.  I am a bit worried this
will trigger a GCC warning:

+	/*
+	 * It may be over-defensive here to check EXT4_IO_END_UNWRITTEN now,
+	 * but being more careful is always safe for the future change.
+	 */
 	inode = io_end->inode;
+	if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
+		io_end->flag |= EXT4_IO_END_UNWRITTEN;
+		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
+	}
 
 	/* Add the io_end to per-inode completed io list*/
 	spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags);

... since in Google we've been compiling with -Werror, but it's not
causing an error on gcc 4.4, which is what I still have on my laptop.
It may be that newer versions of GCC are smart enough to notice tha
the above is dead code, and then complain with a warning.

        	     	  	    	 - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tao Ma Aug. 17, 2011, 2:32 a.m. UTC | #3
On 08/17/2011 02:29 AM, Ted Ts'o wrote:
> On Tue, Aug 16, 2011 at 02:37:53PM +0800, Tao Ma wrote:
>> From: Tao Ma <boyu.mt@taobao.com>
>>
>> EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten should
>> be done simultaneously since ext4_end_io_nolock always clear the flag and
>> decrease the counter in the same time.
>>
>> We don't increase i_aiodio_unwritten when setting EXT4_IO_END_UNWRITTEN so
>> it will go nagative and causes some process to wait forever.
>>
>> Part of the patch came from Eric in his e-mail, but it doesn't fix the problem
>> met by Michael actually.
>> http://marc.info/?l=linux-ext4&m=131316851417460&w=2
>>
>> Reported-and-Tested-by: Michael Tokarev<mjt@tls.msk.ru>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> Signed-off-by: Tao Ma <boyu.mt@taobao.com>
> 
> Thanks I've taken this into the ext4 tree.  I am a bit worried this
> will trigger a GCC warning:
> 
> +	/*
> +	 * It may be over-defensive here to check EXT4_IO_END_UNWRITTEN now,
> +	 * but being more careful is always safe for the future change.
> +	 */
>  	inode = io_end->inode;
> +	if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
> +		io_end->flag |= EXT4_IO_END_UNWRITTEN;
> +		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
> +	}
>  
>  	/* Add the io_end to per-inode completed io list*/
>  	spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags);
> 
> ... since in Google we've been compiling with -Werror, but it's not
> causing an error on gcc 4.4, which is what I still have on my laptop.
> It may be that newer versions of GCC are smart enough to notice tha
> the above is dead code, and then complain with a warning.
Sorry for my bluntness. But where is the 'dead code' you mean?

Thanks
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d47264c..40c0b10 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2668,8 +2668,15 @@  static void ext4_end_io_buffer_write(struct buffer_head *bh, int uptodate)
 		goto out;
 	}
 
-	io_end->flag = EXT4_IO_END_UNWRITTEN;
+	/*
+	 * It may be over-defensive here to check EXT4_IO_END_UNWRITTEN now,
+	 * but being more careful is always safe for the future change.
+	 */
 	inode = io_end->inode;
+	if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
+		io_end->flag |= EXT4_IO_END_UNWRITTEN;
+		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
+	}
 
 	/* Add the io_end to per-inode completed io list*/
 	spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags);
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 430c401..78839af 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -334,8 +334,10 @@  submit_and_retry:
 	if ((io_end->num_io_pages >= MAX_IO_PAGES) &&
 	    (io_end->pages[io_end->num_io_pages-1] != io_page))
 		goto submit_and_retry;
-	if (buffer_uninit(bh))
-		io->io_end->flag |= EXT4_IO_END_UNWRITTEN;
+	if (buffer_uninit(bh) && !(io_end->flag & EXT4_IO_END_UNWRITTEN)) {
+		io_end->flag |= EXT4_IO_END_UNWRITTEN;
+		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
+	}
 	io->io_end->size += bh->b_size;
 	io->io_next_block++;
 	ret = bio_add_page(io->io_bio, bh->b_page, bh->b_size, bh_offset(bh));