Patchwork ext4: Fix warning in ext4_evict_inode()

login
register
mail settings
Submitter Jan Kara
Date July 9, 2013, 7:51 p.m.
Message ID <1373399484-10406-1-git-send-email-jack@suse.cz>
Download mbox | patch
Permalink /patch/257863/
State Accepted
Headers show

Comments

Jan Kara - July 9, 2013, 7:51 p.m.
The following race can lead to ext4_evict_inode() seeing i_ioend_count
> 0 and thus triggering a sanity check warning:

        CPU1                                    CPU2
ext4_end_bio()                          ext4_evict_inode()
  ext4_finish_bio()
    end_page_writeback();
                                          truncate_inode_pages()
                                            evict page
                                        WARN_ON(i_ioend_count > 0);
  ext4_put_io_end_defer()
    ext4_release_io_end()
      dec i_ioend_count

This is possible use-after-free bug since we decrement i_ioend_count in
possibly released inode.

Since i_ioend_count is used only for sanity checks one possible solution
would be to just remove it but for now I'd like to keep those sanity
checks to help debugging the new ext4 writeback code.

This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before
ext4_finish_bio() in the shortcut case when unwritten extent conversion
isn't needed. In that case we don't need the io_end so we are safe to
drop it early.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/page-io.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

 Ted, this should fix the warning spotted by Guenter.
Guenter Roeck - July 9, 2013, 8:38 p.m.
On Tue, Jul 09, 2013 at 09:51:24PM +0200, Jan Kara wrote:
> The following race can lead to ext4_evict_inode() seeing i_ioend_count
> > 0 and thus triggering a sanity check warning:
> 
>         CPU1                                    CPU2
> ext4_end_bio()                          ext4_evict_inode()
>   ext4_finish_bio()
>     end_page_writeback();
>                                           truncate_inode_pages()
>                                             evict page
>                                         WARN_ON(i_ioend_count > 0);
>   ext4_put_io_end_defer()
>     ext4_release_io_end()
>       dec i_ioend_count
> 
> This is possible use-after-free bug since we decrement i_ioend_count in
> possibly released inode.
> 
> Since i_ioend_count is used only for sanity checks one possible solution
> would be to just remove it but for now I'd like to keep those sanity
> checks to help debugging the new ext4 writeback code.
> 
> This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before
> ext4_finish_bio() in the shortcut case when unwritten extent conversion
> isn't needed. In that case we don't need the io_end so we are safe to
> drop it early.
> 
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---

I just saw the problem again, oddly enough while building an image
with this patch.

I'll run the kernel with the patch and let you know how it goes.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck - July 10, 2013, 6:46 p.m.
On Tue, Jul 09, 2013 at 09:51:24PM +0200, Jan Kara wrote:
> The following race can lead to ext4_evict_inode() seeing i_ioend_count
> > 0 and thus triggering a sanity check warning:
> 
>         CPU1                                    CPU2
> ext4_end_bio()                          ext4_evict_inode()
>   ext4_finish_bio()
>     end_page_writeback();
>                                           truncate_inode_pages()
>                                             evict page
>                                         WARN_ON(i_ioend_count > 0);
>   ext4_put_io_end_defer()
>     ext4_release_io_end()
>       dec i_ioend_count
> 
> This is possible use-after-free bug since we decrement i_ioend_count in
> possibly released inode.
> 
> Since i_ioend_count is used only for sanity checks one possible solution
> would be to just remove it but for now I'd like to keep those sanity
> checks to help debugging the new ext4 writeback code.
> 
> This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before
> ext4_finish_bio() in the shortcut case when unwritten extent conversion
> isn't needed. In that case we don't need the io_end so we are safe to
> drop it early.
> 
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Jan Kara <jack@suse.cz>

I tried hard to reproduce the problem with this patch applied, but it just
didn't happen. While that is of course not proof that the problem is gone,
feel free to add

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

> ---
>  fs/ext4/page-io.c | 32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)
> 
>  Ted, this should fix the warning spotted by Guenter.
> 
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 48786cd..d63cc5e 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -308,6 +308,7 @@ ext4_io_end_t *ext4_get_io_end(ext4_io_end_t *io_end)
>  	return io_end;
>  }
>  
> +/* BIO completion function for page writeback */
>  static void ext4_end_bio(struct bio *bio, int error)
>  {
>  	ext4_io_end_t *io_end = bio->bi_private;
> @@ -318,18 +319,6 @@ static void ext4_end_bio(struct bio *bio, int error)
>  	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
>  		error = 0;
>  
> -	if (io_end->flag & EXT4_IO_END_UNWRITTEN) {
> -		/*
> -		 * Link bio into list hanging from io_end. We have to do it
> -		 * atomically as bio completions can be racing against each
> -		 * other.
> -		 */
> -		bio->bi_private = xchg(&io_end->bio, bio);
> -	} else {
> -		ext4_finish_bio(bio);
> -		bio_put(bio);
> -	}
> -
>  	if (error) {
>  		struct inode *inode = io_end->inode;
>  
> @@ -341,7 +330,24 @@ static void ext4_end_bio(struct bio *bio, int error)
>  			     (unsigned long long)
>  			     bi_sector >> (inode->i_blkbits - 9));
>  	}
> -	ext4_put_io_end_defer(io_end);
> +
> +	if (io_end->flag & EXT4_IO_END_UNWRITTEN) {
> +		/*
> +		 * Link bio into list hanging from io_end. We have to do it
> +		 * atomically as bio completions can be racing against each
> +		 * other.
> +		 */
> +		bio->bi_private = xchg(&io_end->bio, bio);
> +		ext4_put_io_end_defer(io_end);
> +	} else {
> +		/*
> +		 * Drop io_end reference early. Inode can get freed once
> +		 * we finish the bio.
> +		 */
> +		ext4_put_io_end_defer(io_end);
> +		ext4_finish_bio(bio);
> +		bio_put(bio);
> +	}
>  }
>  
>  void ext4_io_submit(struct ext4_io_submit *io)
> -- 
> 1.8.1.4
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 48786cd..d63cc5e 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -308,6 +308,7 @@  ext4_io_end_t *ext4_get_io_end(ext4_io_end_t *io_end)
 	return io_end;
 }
 
+/* BIO completion function for page writeback */
 static void ext4_end_bio(struct bio *bio, int error)
 {
 	ext4_io_end_t *io_end = bio->bi_private;
@@ -318,18 +319,6 @@  static void ext4_end_bio(struct bio *bio, int error)
 	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
 		error = 0;
 
-	if (io_end->flag & EXT4_IO_END_UNWRITTEN) {
-		/*
-		 * Link bio into list hanging from io_end. We have to do it
-		 * atomically as bio completions can be racing against each
-		 * other.
-		 */
-		bio->bi_private = xchg(&io_end->bio, bio);
-	} else {
-		ext4_finish_bio(bio);
-		bio_put(bio);
-	}
-
 	if (error) {
 		struct inode *inode = io_end->inode;
 
@@ -341,7 +330,24 @@  static void ext4_end_bio(struct bio *bio, int error)
 			     (unsigned long long)
 			     bi_sector >> (inode->i_blkbits - 9));
 	}
-	ext4_put_io_end_defer(io_end);
+
+	if (io_end->flag & EXT4_IO_END_UNWRITTEN) {
+		/*
+		 * Link bio into list hanging from io_end. We have to do it
+		 * atomically as bio completions can be racing against each
+		 * other.
+		 */
+		bio->bi_private = xchg(&io_end->bio, bio);
+		ext4_put_io_end_defer(io_end);
+	} else {
+		/*
+		 * Drop io_end reference early. Inode can get freed once
+		 * we finish the bio.
+		 */
+		ext4_put_io_end_defer(io_end);
+		ext4_finish_bio(bio);
+		bio_put(bio);
+	}
 }
 
 void ext4_io_submit(struct ext4_io_submit *io)