ext4: fix checking on nr_to_write

Message ID 20131015191556.50c3eb03@tom-ThinkPad-T410
State Superseded, archived

Commit Message

Ming Lei Oct. 15, 2013, 11:15 a.m. UTC
On Tue, 15 Oct 2013 12:39:00 +0200
Jan Kara <jack@suse.cz> wrote:

> On Tue 15-10-13 10:25:53, Ming Lei wrote:
> > Looks like it makes sense, so how about the change below?
> > 
> > --
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 32c04ab..c32b599 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2294,7 +2294,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> >  {
> >  	struct address_space *mapping = mpd->inode->i_mapping;
> >  	struct pagevec pvec;
> > -	unsigned int nr_pages;
> > +	unsigned int nr_pages, nr_added = 0;
> >  	pgoff_t index = mpd->first_page;
> >  	pgoff_t end = mpd->last_page;
> >  	int tag;
> > @@ -2330,6 +2330,18 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> >  			if (page->index > end)
> >  				goto out;
> >  
> > +			/*
> > +			 * Accumulated enough dirty pages? This doesn't apply
> > +			 * to WB_SYNC_ALL mode. For integrity sync we have to
> > +			 * keep going because someone may be concurrently
> > +			 * dirtying pages, and we might have synced a lot of
> > +			 * newly appeared dirty pages, but have not synced all
> > +			 * of the old dirty pages.
> > +			 */
> > +			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
> > +					nr_added >= mpd->wbc->nr_to_write)
> > +				goto out;
> > +
>   This won't quite work because if the page is fully mapped
> mpage_process_page_bufs() will immediately submit the page and decrease
> nr_to_write. So now you would end up writing less than you were asked for
> in some cases. 

Yes, you are right, so how about the patch below?



> Attached patch should do what's needed. Can you try whether
> it fixes the problem for you (it seems to work OK in my testing).

In fact, I had written and tested your attached patch before my last post,
and it may trigger BUG() in mpage_release_unused_pages(). That is because
we touch mpd->next_page without locking the current page, so it is better
not to increase mpd->next_page if the current page won't be processed.


Thanks,

Comments

Jan Kara Oct. 15, 2013, 12:34 p.m. UTC | #1
On Tue 15-10-13 19:15:56, Ming Lei wrote:
> >   This won't quite work because if the page is fully mapped
> > mpage_process_page_bufs() will immediately submit the page and decrease
> > nr_to_write. So now you would end up writing less than you were asked for
> > in some cases. 
> 
> Yes, you are right, so how about the patch below?
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 32c04ab..3cf7abb 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2295,6 +2295,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>  	struct address_space *mapping = mpd->inode->i_mapping;
>  	struct pagevec pvec;
>  	unsigned int nr_pages;
> +	int left = mpd->wbc->nr_to_write;
  'long' please. Otherwise the patch looks fine. Thanks!

								Honza

>  	pgoff_t index = mpd->first_page;
>  	pgoff_t end = mpd->last_page;
>  	int tag;
> @@ -2330,6 +2331,17 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>  			if (page->index > end)
>  				goto out;
>  
> +			/*
> +			 * Accumulated enough dirty pages? This doesn't apply
> +			 * to WB_SYNC_ALL mode. For integrity sync we have to
> +			 * keep going because someone may be concurrently
> +			 * dirtying pages, and we might have synced a lot of
> +			 * newly appeared dirty pages, but have not synced all
> +			 * of the old dirty pages.
> +			 */
> +			if (mpd->wbc->sync_mode == WB_SYNC_NONE && left <= 0)
> +				goto out;
> +
>  			/* If we can't merge this page, we are done. */
>  			if (mpd->map.m_len > 0 && mpd->next_page != page->index)
>  				goto out;
> @@ -2364,19 +2376,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>  			if (err <= 0)
>  				goto out;
>  			err = 0;
> -
> -			/*
> -			 * Accumulated enough dirty pages? This doesn't apply
> -			 * to WB_SYNC_ALL mode. For integrity sync we have to
> -			 * keep going because someone may be concurrently
> -			 * dirtying pages, and we might have synced a lot of
> -			 * newly appeared dirty pages, but have not synced all
> -			 * of the old dirty pages.
> -			 */
> -			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
> -			    mpd->next_page - mpd->first_page >=
> -							mpd->wbc->nr_to_write)
> -				goto out;
> +			left--;
>  		}
>  		pagevec_release(&pvec);
>  		cond_resched();
> 
> 
> > Attached patch should do what's needed. Can you try whether
> > it fixes the problem for you (it seems to work OK in my testing).
> 
> In fact, I had written and tested your attached patch before my last post,
> and it may trigger BUG() in mpage_release_unused_pages(). That is because
> we touch mpd->next_page without locking the current page, so it is better
> not to increase mpd->next_page if the current page won't be processed.
> 
> 
> Thanks,
> -- 
> Ming Lei
Ming Lei Oct. 15, 2013, 2:53 p.m. UTC | #2
On Tue, Oct 15, 2013 at 8:34 PM, Jan Kara <jack@suse.cz> wrote:
> On Tue 15-10-13 19:15:56, Ming Lei wrote:
>> >   This won't quite work because if the page is fully mapped
>> > mpage_process_page_bufs() will immediately submit the page and decrease
>> > nr_to_write. So now you would end up writing less than you were asked for
>> > in some cases.
>>
>> Yes, you are right, so how about the patch below?
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 32c04ab..3cf7abb 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -2295,6 +2295,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>>       struct address_space *mapping = mpd->inode->i_mapping;
>>       struct pagevec pvec;
>>       unsigned int nr_pages;
>> +     int left = mpd->wbc->nr_to_write;
>   'long' please. Otherwise the patch looks fine. Thanks!

OK, I will submit the formal one with your ack.

Thanks,
--
Ming Lei
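
For context on the 'long' request in this thread: in the kernel, nr_to_write
in struct writeback_control is declared as long, so copying it into an int
can lose the value when a caller passes a very large budget. The stand-alone
C sketch below is only an illustration. fits_in_int() is a hypothetical
helper, not kernel code, and the widths of int and long depend on the
platform.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not kernel code): reports whether a write budget
 * held in a long can be copied into an int without changing its value.
 */
static bool fits_in_int(long budget)
{
	return budget >= INT_MIN && budget <= INT_MAX;
}
```

On a platform where long is wider than int, fits_in_int(LONG_MAX) is false,
which is why a local copy of nr_to_write is safer declared as long.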

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 32c04ab..3cf7abb 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2295,6 +2295,7 @@  static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 	struct address_space *mapping = mpd->inode->i_mapping;
 	struct pagevec pvec;
 	unsigned int nr_pages;
+	int left = mpd->wbc->nr_to_write;
 	pgoff_t index = mpd->first_page;
 	pgoff_t end = mpd->last_page;
 	int tag;
@@ -2330,6 +2331,17 @@  static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (page->index > end)
 				goto out;
 
+			/*
+			 * Accumulated enough dirty pages? This doesn't apply
+			 * to WB_SYNC_ALL mode. For integrity sync we have to
+			 * keep going because someone may be concurrently
+			 * dirtying pages, and we might have synced a lot of
+			 * newly appeared dirty pages, but have not synced all
+			 * of the old dirty pages.
+			 */
+			if (mpd->wbc->sync_mode == WB_SYNC_NONE && left <= 0)
+				goto out;
+
 			/* If we can't merge this page, we are done. */
 			if (mpd->map.m_len > 0 && mpd->next_page != page->index)
 				goto out;
@@ -2364,19 +2376,7 @@  static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (err <= 0)
 				goto out;
 			err = 0;
-
-			/*
-			 * Accumulated enough dirty pages? This doesn't apply
-			 * to WB_SYNC_ALL mode. For integrity sync we have to
-			 * keep going because someone may be concurrently
-			 * dirtying pages, and we might have synced a lot of
-			 * newly appeared dirty pages, but have not synced all
-			 * of the old dirty pages.
-			 */
-			if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
-			    mpd->next_page - mpd->first_page >=
-							mpd->wbc->nr_to_write)
-				goto out;
+			left--;
 		}
 		pagevec_release(&pvec);
 		cond_resched();
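
The reasoning in the thread can be sketched outside the kernel. The following
is a minimal user-space simulation, not ext4 code: struct wbc_sim and both
scan_* helpers are hypothetical names, and it assumes every page is fully
mapped, so each one is submitted immediately and decrements nr_to_write, as
Jan describes. It compares a check against the live, shrinking budget with a
check against a snapshot taken on entry (the 'left' counter in the patch).

```c
/* Hypothetical stand-in for the nr_to_write field of writeback_control. */
struct wbc_sim {
	long nr_to_write;
};

/*
 * Compare a growing page counter against the live budget. Because the
 * budget shrinks with every submitted page, the two values meet in the
 * middle and the scan stops early.
 */
static long scan_live_budget(struct wbc_sim *wbc, long dirty_pages)
{
	long nr_added = 0;
	long written = 0;

	for (long i = 0; i < dirty_pages; i++) {
		if (nr_added >= wbc->nr_to_write)
			break;
		/* Fully mapped page: submitted at once, budget decremented. */
		wbc->nr_to_write--;
		written++;
		nr_added++;
	}
	return written;
}

/*
 * Snapshot the budget on entry and count it down locally, checking
 * before the page is processed, as the patch does with 'left'.
 */
static long scan_snapshot_budget(struct wbc_sim *wbc, long dirty_pages)
{
	long left = wbc->nr_to_write;
	long written = 0;

	for (long i = 0; i < dirty_pages; i++) {
		if (left <= 0)
			break;
		/* Submission still decrements the shared budget. */
		wbc->nr_to_write--;
		written++;
		left--;
	}
	return written;
}
```

Starting from a budget of 10 with 100 dirty pages, scan_live_budget() stops
after 5 pages, while scan_snapshot_budget() writes all 10.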