ext4: fix a bug in ext4_wait_for_tail_page_commit

Message ID 20190917084814.40370-1-yangerkun@huawei.com
State New

Commit Message

yangerkun Sept. 17, 2019, 8:48 a.m. UTC
There is no need to wait when offset equals 0. Waiting in that case can even
trigger a bug, since the subsequent __ext4_journalled_invalidatepage() can
free the buffers but leave the page dirty.

[   26.057508] ------------[ cut here ]------------
[   26.058531] kernel BUG at fs/ext4/inode.c:2134!
...
[   26.088130] Call trace:
[   26.088695]  ext4_writepage+0x914/0xb28
[   26.089541]  writeout.isra.4+0x1b4/0x2b8
[   26.090409]  move_to_new_page+0x3b0/0x568
[   26.091338]  __unmap_and_move+0x648/0x988
[   26.092241]  unmap_and_move+0x48c/0xbb8
[   26.093096]  migrate_pages+0x220/0xb28
[   26.093945]  kernel_mbind+0x828/0xa18
[   26.094791]  __arm64_sys_mbind+0xc8/0x138
[   26.095716]  el0_svc_common+0x190/0x490
[   26.096571]  el0_svc_handler+0x60/0xd0
[   26.097423]  el0_svc+0x8/0xc

Running several instances of the program below in parallel reproduces it easily (ext3):
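#define _GNU_SOURCE		/* for O_NOATIME */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <numaif.h>		/* mbind(); link with -lnuma */

int main(void)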
{
        int fd, fd1, fd2, fd3, ret;
        void *addr;
        size_t length = 4096;
        int flags;
        off_t offset = 0;
        char *str = "12345";

        fd = open("a", O_RDWR | O_CREAT);
        assert(fd >= 0);

        ret = ftruncate(fd, length);
        assert(ret == 0);

        fd1 = open("a", O_RDWR | O_CREAT, -1);
        assert(fd1 >= 0);

        flags = 0xc00f;/*Journal data mode*/
        ret = ioctl(fd1, _IOW('f', 2, long), &flags);
        assert(ret == 0);

        fd2 = open("a", O_RDWR | O_CREAT);
        assert(fd2 >= 0);

        fd3 = open("a", O_TRUNC | O_NOATIME);
        assert(fd3 >= 0);

        addr = mmap(NULL, length, 0xe, 0x28013, fd2, offset);
        assert(addr != (void *)-1);
        memcpy(addr, str, 5);
        mbind(addr, length, 0, 0, 0, 2);

        close(fd);
        munmap(addr, length);
}
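
For readers decoding the raw constants in the reproducer: the ioctl request
_IOW('f', 2, long) is what <linux/fs.h> defines as FS_IOC_SETFLAGS, and
0xc00f includes the FS_JOURNAL_DATA_FL bit, which matches the "Journal data
mode" comment. The sketch below only checks those two facts; the helper
names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* FS_IOC_SETFLAGS, FS_JOURNAL_DATA_FL */

/* Hypothetical helper: is this ioctl request number FS_IOC_SETFLAGS? */
static inline bool is_setflags_request(unsigned long request)
{
	return request == FS_IOC_SETFLAGS;
}

/* Hypothetical helper: does this inode flag word request data journalling? */
static inline bool sets_journal_data(long flags)
{
	return (flags & FS_JOURNAL_DATA_FL) != 0;
}
```

Calling is_setflags_request(_IOW('f', 2, long)) and
sets_journal_data(0xc00f) should both return true on Linux.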

Signed-off-by: yangerkun <yangerkun@huawei.com>
---
 fs/ext4/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Theodore Y. Ts'o Sept. 17, 2019, 3:31 p.m. UTC | #1
On Tue, Sep 17, 2019 at 04:48:14PM +0800, yangerkun wrote:
> No need to wait when offset equals to 0. And it will trigger a bug since
> the latter __ext4_journalled_invalidatepage can free the buffers but leave
> page still dirty.

That's only true if the block size == the page size, no?  If the
offset is zero and the block size is 1k, we still need to wait.
Shouldn't the better fix be:

> -	if (offset > PAGE_SIZE - i_blocksize(inode))
> +	if (offset >= PAGE_SIZE - i_blocksize(inode))

  	   	      		- Ted
yangerkun Sept. 18, 2019, 1:47 a.m. UTC | #2
On 2019/9/17 23:31, Theodore Y. Ts'o wrote:
> On Tue, Sep 17, 2019 at 04:48:14PM +0800, yangerkun wrote:
>> No need to wait when offset equals to 0. And it will trigger a bug since
>> the latter __ext4_journalled_invalidatepage can free the buffers but leave
>> page still dirty.
> That's only true if the block size == the page size, no?  If the
> offset is zero and the block size is 1k, we still need to wait.
> Shouldn't the better fix be:
>
>> -	if (offset > PAGE_SIZE - i_blocksize(inode))
>> +	if (offset >= PAGE_SIZE - i_blocksize(inode))

With '>=', __ext4_journalled_invalidatepage() can still free the buffers
when offset == 0 and blocksize == 1k, so we will see the bug again.

In some other cases, such as offset == 3k with blocksize == 1k, we should
actually wait for the commit, but with '>=' we would return directly.

Thanks.

>    	   	      		- Ted
>
> .
>
Jan Kara Sept. 18, 2019, 10:06 a.m. UTC | #3
On Tue 17-09-19 11:31:40, Theodore Y. Ts'o wrote:
> On Tue, Sep 17, 2019 at 04:48:14PM +0800, yangerkun wrote:
> > No need to wait when offset equals to 0. And it will trigger a bug since
> > the latter __ext4_journalled_invalidatepage can free the buffers but leave
> > page still dirty.
> 
> That's only true if the block size == the page size, no?  If the
> offset is zero and the block size is 1k, we still need to wait.
> Shouldn't the better fix be:
> 
> > -	if (offset > PAGE_SIZE - i_blocksize(inode))
> > +	if (offset >= PAGE_SIZE - i_blocksize(inode))

No, what yangerkun wrote is correct. We don't have to wait for commit when
offset == 0 - truncate_inode_pages() should just happily process such page.
Also '>' in the above condition is correct. offset == PAGE_SIZE -
i_blocksize(inode) means one full block is getting truncated from the page
and we need to wait in that case to avoid jbd2_journal_invalidatepage()
failing with EBUSY when called from truncate_inode_pages().

								Honza
Jan Kara Sept. 18, 2019, 10:45 a.m. UTC | #4
On Tue 17-09-19 16:48:14, yangerkun wrote:
> No need to wait when offset equals to 0. And it will trigger a bug since
> the latter __ext4_journalled_invalidatepage can free the buffers but leave
> page still dirty.
> 
> [   26.057508] ------------[ cut here ]------------
> [   26.058531] kernel BUG at fs/ext4/inode.c:2134!
> ...
> [   26.088130] Call trace:
> [   26.088695]  ext4_writepage+0x914/0xb28
> [   26.089541]  writeout.isra.4+0x1b4/0x2b8
> [   26.090409]  move_to_new_page+0x3b0/0x568
> [   26.091338]  __unmap_and_move+0x648/0x988
> [   26.092241]  unmap_and_move+0x48c/0xbb8
> [   26.093096]  migrate_pages+0x220/0xb28
> [   26.093945]  kernel_mbind+0x828/0xa18
> [   26.094791]  __arm64_sys_mbind+0xc8/0x138
> [   26.095716]  el0_svc_common+0x190/0x490
> [   26.096571]  el0_svc_handler+0x60/0xd0
> [   26.097423]  el0_svc+0x8/0xc
> 
> Run below parallel can reproduce it easily(ext3):
> void main()
> {
>         int fd, fd1, fd2, fd3, ret;
>         void *addr;
>         size_t length = 4096;
>         int flags;
>         off_t offset = 0;
>         char *str = "12345";
> 
>         fd = open("a", O_RDWR | O_CREAT);
>         assert(fd >= 0);
> 
>         ret = ftruncate(fd, length);
>         assert(ret == 0);
> 
>         fd1 = open("a", O_RDWR | O_CREAT, -1);
>         assert(fd1 >= 0);
> 
>         flags = 0xc00f;/*Journal data mode*/
>         ret = ioctl(fd1, _IOW('f', 2, long), &flags);
>         assert(ret == 0);
> 
>         fd2 = open("a", O_RDWR | O_CREAT);
>         assert(fd2 >= 0);
> 
>         fd3 = open("a", O_TRUNC | O_NOATIME);
>         assert(fd3 >= 0);
> 
>         addr = mmap(NULL, length, 0xe, 0x28013, fd2, offset);

Ugh, these mmap flags look pretty bogus. Were they generated by some
fuzzer?

>         assert(addr != (void *)-1);
>         memcpy(addr, str, 5);

Also the O_TRUNC open above will truncate "a" to 0 so the mapping is
actually beyond i_size and this memcpy should fail with SIGBUS. So I'm
surprised your test program gets up to mbind()...

>         mbind(addr, length, 0, 0, 0, 2);
> 
>         close(fd);
>         munmap(addr, length);
> }
> 
> Signed-off-by: yangerkun <yangerkun@huawei.com>

I agree that there's no need to wait for transaction commit when offset ==
0. So your patch is correct in that regard. What still escapes me is why
this is necessary. I have a feeling that it just papers over the real
problem.  You mention crash in ext4_writepage() because page is dirty but
has no buffers - but how come the page is dirty? If offset == 0 for a page,
truncate_inode_pages() should have cleaned PageDirty flag so the page
should never get to ext4_writepage() in the first place. Together with my
comments about the test case this is still a bit of a mystery to me... I guess
I'll try to reproduce this to understand this better.

								Honza

> ---
>  fs/ext4/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 006b7a2070bf..a9943ae4f74d 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5479,7 +5479,7 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
>  	 * do. We do the check mainly to optimize the common PAGE_SIZE ==
>  	 * blocksize case
>  	 */
> -	if (offset > PAGE_SIZE - i_blocksize(inode))
> +	if (!offset || offset > PAGE_SIZE - i_blocksize(inode))
>  		return;
>  	while (1) {
>  		page = find_lock_page(inode->i_mapping,
> -- 
> 2.17.2
>
yangerkun Sept. 18, 2019, 1:09 p.m. UTC | #5
On 2019/9/18 18:45, Jan Kara wrote:
> On Tue 17-09-19 16:48:14, yangerkun wrote:
>> No need to wait when offset equals to 0. And it will trigger a bug since
>> the latter __ext4_journalled_invalidatepage can free the buffers but leave
>> page still dirty.
>>
>> [   26.057508] ------------[ cut here ]------------
>> [   26.058531] kernel BUG at fs/ext4/inode.c:2134!
>> ...
>> [   26.088130] Call trace:
>> [   26.088695]  ext4_writepage+0x914/0xb28
>> [   26.089541]  writeout.isra.4+0x1b4/0x2b8
>> [   26.090409]  move_to_new_page+0x3b0/0x568
>> [   26.091338]  __unmap_and_move+0x648/0x988
>> [   26.092241]  unmap_and_move+0x48c/0xbb8
>> [   26.093096]  migrate_pages+0x220/0xb28
>> [   26.093945]  kernel_mbind+0x828/0xa18
>> [   26.094791]  __arm64_sys_mbind+0xc8/0x138
>> [   26.095716]  el0_svc_common+0x190/0x490
>> [   26.096571]  el0_svc_handler+0x60/0xd0
>> [   26.097423]  el0_svc+0x8/0xc
>>
>> Run below parallel can reproduce it easily(ext3):
>> void main()
>> {
>>          int fd, fd1, fd2, fd3, ret;
>>          void *addr;
>>          size_t length = 4096;
>>          int flags;
>>          off_t offset = 0;
>>          char *str = "12345";
>>
>>          fd = open("a", O_RDWR | O_CREAT);
>>          assert(fd >= 0);
>>
>>          ret = ftruncate(fd, length);
>>          assert(ret == 0);
>>
>>          fd1 = open("a", O_RDWR | O_CREAT, -1);
>>          assert(fd1 >= 0);
>>
>>          flags = 0xc00f;/*Journal data mode*/
>>          ret = ioctl(fd1, _IOW('f', 2, long), &flags);
>>          assert(ret == 0);
>>
>>          fd2 = open("a", O_RDWR | O_CREAT);
>>          assert(fd2 >= 0);
>>
>>          fd3 = open("a", O_TRUNC | O_NOATIME);
>>          assert(fd3 >= 0);
>>
>>          addr = mmap(NULL, length, 0xe, 0x28013, fd2, offset);
> 
> Ugh, these mmap flags look pretty bogus. Were they generated by some
> fuzzer?
Yeah, generated by syzkaller.
> 
>>          assert(addr != (void *)-1);
>>          memcpy(addr, str, 5);
> 
> Also the O_TRUNC open above will truncate "a" to 0 so the mapping is
> actually beyond i_size and this memcpy should fail with SIGBUS. So I'm
> surprised your test program gets up to mbind()...

We run the program in parallel, and it sometimes executes as below:

reproduce1                         reproduce2

...                            |   ...
truncate to 4k                 |
change to journal data mode    |
                               |   memcpy(set page dirty)
truncate to 0:                 |
ext4_setattr:                  |
...                            |
ext4_wait_for_tail_page_commit |
                               |   mbind(trigger bug)
truncate_pagecache(clean dirty)|   ...
...                            |

Reproduce2 marks the page dirty via memcpy. If mbind then runs between
ext4_wait_for_tail_page_commit and truncate_pagecache in ext4_setattr, it
triggers the bug because the page is still dirty but its buffer heads have
already been freed.
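
The interleaving above can be sketched as a toy user-space model. This is
not kernel code: struct model_page and every model_* function are invented
for illustration, and the only states tracked are "dirty" and "has
buffers". It assumes, as described in this thread, that the wait path may
strip the buffers without clearing the dirty flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the page state that matters in this race. */
struct model_page {
	bool dirty;
	bool has_buffers;
};

/* reproduce2: memcpy() through the shared mapping dirties the page. */
static inline void model_mmap_write(struct model_page *p)
{
	p->dirty = true;
}

/*
 * reproduce1: ext4_wait_for_tail_page_commit(). Unless it returns early,
 * the invalidate step may drop the buffers while the page stays dirty.
 */
static inline void model_wait_for_tail_page_commit(struct model_page *p,
						   bool early_return)
{
	if (early_return)
		return;
	p->has_buffers = false;	/* dirty flag is left untouched */
}

/* reproduce2: mbind() -> migration -> writepage. False models the BUG. */
static inline bool model_writepage_ok(const struct model_page *p)
{
	return !p->dirty || p->has_buffers;
}

/* reproduce1: truncate_pagecache() finally cleans up the page. */
static inline void model_truncate_pagecache(struct model_page *p)
{
	p->dirty = false;
	p->has_buffers = false;
}

/* Replay the timeline; returns whether the writepage check would pass. */
static inline bool model_race(bool early_return)
{
	struct model_page p = { .dirty = false, .has_buffers = true };
	bool ok;

	model_mmap_write(&p);
	model_wait_for_tail_page_commit(&p, early_return);
	ok = model_writepage_ok(&p);	/* mbind lands in the window */
	model_truncate_pagecache(&p);
	return ok;
}
```

In this model, model_race(false) (no early return at offset == 0) fails the
writepage check, while model_race(true) (the patched early return) passes.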

Sorry for the unclear description! I will reorganize and resend the
patch!

Thanks a lot.
> 
>>          mbind(addr, length, 0, 0, 0, 2);
>>
>>          close(fd);
>>          munmap(addr, length);
>> }
>>
>> Signed-off-by: yangerkun <yangerkun@huawei.com>
> 
> I agree that there's no need to wait for transaction commit when offset ==
> 0. So your patch is correct in that regard. What still escapes me is why
> this is necessary. I have a feeling that it just papers over the real
> problem.  You mention crash in ext4_writepage() because page is dirty but
> has no buffers - but how come the page is dirty? If offset == 0 for a page,
> truncate_inode_pages() should have cleaned PageDirty flag so the page
> should never get to ext4_writepage() in the first place. Together with my
> comments about the test case this is still a bit mystery to me... I guess
> I'll try to reproduce this to understand this better.
> 
> 								Honza
> 
>> ---
>>   fs/ext4/inode.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 006b7a2070bf..a9943ae4f74d 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5479,7 +5479,7 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
>>   	 * do. We do the check mainly to optimize the common PAGE_SIZE ==
>>   	 * blocksize case
>>   	 */
>> -	if (offset > PAGE_SIZE - i_blocksize(inode))
>> +	if (!offset || offset > PAGE_SIZE - i_blocksize(inode))
>>   		return;
>>   	while (1) {
>>   		page = find_lock_page(inode->i_mapping,
>> -- 
>> 2.17.2
>>
Jan Kara Sept. 18, 2019, 1:27 p.m. UTC | #6
On Wed 18-09-19 21:09:00, yangerkun wrote:
> On 2019/9/18 18:45, Jan Kara wrote:
> > On Tue 17-09-19 16:48:14, yangerkun wrote:
> > > No need to wait when offset equals to 0. And it will trigger a bug since
> > > the latter __ext4_journalled_invalidatepage can free the buffers but leave
> > > page still dirty.
> > > 
> > > [   26.057508] ------------[ cut here ]------------
> > > [   26.058531] kernel BUG at fs/ext4/inode.c:2134!
> > > ...
> > > [   26.088130] Call trace:
> > > [   26.088695]  ext4_writepage+0x914/0xb28
> > > [   26.089541]  writeout.isra.4+0x1b4/0x2b8
> > > [   26.090409]  move_to_new_page+0x3b0/0x568
> > > [   26.091338]  __unmap_and_move+0x648/0x988
> > > [   26.092241]  unmap_and_move+0x48c/0xbb8
> > > [   26.093096]  migrate_pages+0x220/0xb28
> > > [   26.093945]  kernel_mbind+0x828/0xa18
> > > [   26.094791]  __arm64_sys_mbind+0xc8/0x138
> > > [   26.095716]  el0_svc_common+0x190/0x490
> > > [   26.096571]  el0_svc_handler+0x60/0xd0
> > > [   26.097423]  el0_svc+0x8/0xc
> > > 
> > > Run below parallel can reproduce it easily(ext3):
> > > void main()
> > > {
> > >          int fd, fd1, fd2, fd3, ret;
> > >          void *addr;
> > >          size_t length = 4096;
> > >          int flags;
> > >          off_t offset = 0;
> > >          char *str = "12345";
> > > 
> > >          fd = open("a", O_RDWR | O_CREAT);
> > >          assert(fd >= 0);
> > > 
> > >          ret = ftruncate(fd, length);
> > >          assert(ret == 0);
> > > 
> > >          fd1 = open("a", O_RDWR | O_CREAT, -1);
> > >          assert(fd1 >= 0);
> > > 
> > >          flags = 0xc00f;/*Journal data mode*/
> > >          ret = ioctl(fd1, _IOW('f', 2, long), &flags);
> > >          assert(ret == 0);
> > > 
> > >          fd2 = open("a", O_RDWR | O_CREAT);
> > >          assert(fd2 >= 0);
> > > 
> > >          fd3 = open("a", O_TRUNC | O_NOATIME);
> > >          assert(fd3 >= 0);
> > > 
> > >          addr = mmap(NULL, length, 0xe, 0x28013, fd2, offset);
> > 
> > Ugh, these mmap flags look pretty bogus. Were they generated by some
> > fuzzer?
> Yeah, generated by syzkaller.
> > 
> > >          assert(addr != (void *)-1);
> > >          memcpy(addr, str, 5);
> > 
> > Also the O_TRUNC open above will truncate "a" to 0 so the mapping is
> > actually beyond i_size and this memcpy should fail with SIGBUS. So I'm
> > surprised your test program gets up to mbind()...
> 
> We run the program parallel, sometimes will run as below:
> 
> reproduce1                         reproduce2
> 
> ...                            |   ...
> truncate to 4k                 |
> change to journal data mode    |
>                                |   memcpy(set page dirty)
> truncate to 0:                 |
> ext4_setattr:                  |
> ...                            |
> ext4_wait_for_tail_page_commit |
>                                |   mbind(trigger bug)
> truncate_pagecache(clean dirty)|   ...
> ...                            |
> Reproduce2 will mark page as dirty by memcpy, then mbind run between
> ext4_wait_for_tail_page_commit and truncate_pagecache in ext4_setattr can
> trigger the bug with page still be dirty but buffer head has been free.

Aha! Thanks for explanation. Makes sense. So I agree with your patch but we
also need to update the comment before the condition in
ext4_wait_for_tail_page_commit(). Something like:

If the page is fully truncated, we don't need to wait for any commit (and
we even should not as __ext4_journalled_invalidatepage() may strip all
buffers from the page but keep the page dirty which can then confuse e.g.
concurrent ext4_writepage() seeing dirty page without buffers). Also we
don't need to wait for any commit if all buffers in the page remain valid.
This is most beneficial for the common case of blocksize == PAGE_SIZE.

								Honza

> > > ---
> > >   fs/ext4/inode.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index 006b7a2070bf..a9943ae4f74d 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -5479,7 +5479,7 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
> > >   	 * do. We do the check mainly to optimize the common PAGE_SIZE ==
> > >   	 * blocksize case
> > >   	 */
> > > -	if (offset > PAGE_SIZE - i_blocksize(inode))
> > > +	if (!offset || offset > PAGE_SIZE - i_blocksize(inode))
> > >   		return;
> > >   	while (1) {
> > >   		page = find_lock_page(inode->i_mapping,
yangerkun Sept. 19, 2019, 1:29 a.m. UTC | #7
On 2019/9/18 21:27, Jan Kara wrote:
> On Wed 18-09-19 21:09:00, yangerkun wrote:
>> On 2019/9/18 18:45, Jan Kara wrote:
>>> On Tue 17-09-19 16:48:14, yangerkun wrote:
>>>> No need to wait when offset equals to 0. And it will trigger a bug since
>>>> the latter __ext4_journalled_invalidatepage can free the buffers but leave
>>>> page still dirty.
>>>>
>>>> [   26.057508] ------------[ cut here ]------------
>>>> [   26.058531] kernel BUG at fs/ext4/inode.c:2134!
>>>> ...
>>>> [   26.088130] Call trace:
>>>> [   26.088695]  ext4_writepage+0x914/0xb28
>>>> [   26.089541]  writeout.isra.4+0x1b4/0x2b8
>>>> [   26.090409]  move_to_new_page+0x3b0/0x568
>>>> [   26.091338]  __unmap_and_move+0x648/0x988
>>>> [   26.092241]  unmap_and_move+0x48c/0xbb8
>>>> [   26.093096]  migrate_pages+0x220/0xb28
>>>> [   26.093945]  kernel_mbind+0x828/0xa18
>>>> [   26.094791]  __arm64_sys_mbind+0xc8/0x138
>>>> [   26.095716]  el0_svc_common+0x190/0x490
>>>> [   26.096571]  el0_svc_handler+0x60/0xd0
>>>> [   26.097423]  el0_svc+0x8/0xc
>>>>
>>>> Run below parallel can reproduce it easily(ext3):
>>>> void main()
>>>> {
>>>>           int fd, fd1, fd2, fd3, ret;
>>>>           void *addr;
>>>>           size_t length = 4096;
>>>>           int flags;
>>>>           off_t offset = 0;
>>>>           char *str = "12345";
>>>>
>>>>           fd = open("a", O_RDWR | O_CREAT);
>>>>           assert(fd >= 0);
>>>>
>>>>           ret = ftruncate(fd, length);
>>>>           assert(ret == 0);
>>>>
>>>>           fd1 = open("a", O_RDWR | O_CREAT, -1);
>>>>           assert(fd1 >= 0);
>>>>
>>>>           flags = 0xc00f;/*Journal data mode*/
>>>>           ret = ioctl(fd1, _IOW('f', 2, long), &flags);
>>>>           assert(ret == 0);
>>>>
>>>>           fd2 = open("a", O_RDWR | O_CREAT);
>>>>           assert(fd2 >= 0);
>>>>
>>>>           fd3 = open("a", O_TRUNC | O_NOATIME);
>>>>           assert(fd3 >= 0);
>>>>
>>>>           addr = mmap(NULL, length, 0xe, 0x28013, fd2, offset);
>>>
>>> Ugh, these mmap flags look pretty bogus. Were they generated by some
>>> fuzzer?
>> Yeah, generated by syzkaller.
>>>
>>>>           assert(addr != (void *)-1);
>>>>           memcpy(addr, str, 5);
>>>
>>> Also the O_TRUNC open above will truncate "a" to 0 so the mapping is
>>> actually beyond i_size and this memcpy should fail with SIGBUS. So I'm
>>> surprised your test program gets up to mbind()...
>>
>> We run the program parallel, sometimes will run as below:
>>
>> reproduce1                         reproduce2
>>
>> ...                            |   ...
>> truncate to 4k                 |
>> change to journal data mode    |
>>                                 |   memcpy(set page dirty)
>> truncate to 0:                 |
>> ext4_setattr:                  |
>> ...                            |
>> ext4_wait_for_tail_page_commit |
>>                                 |   mbind(trigger bug)
>> truncate_pagecache(clean dirty)|   ...
>> ...                            |
>> Reproduce2 will mark page as dirty by memcpy, then mbind run between
>> ext4_wait_for_tail_page_commit and truncate_pagecache in ext4_setattr can
>> trigger the bug with page still be dirty but buffer head has been free.
> 
> Aha! Thanks for explanation. Makes sense. So I agree with your patch but we
> also need to update the comment before the condition in
> ext4_wait_for_tail_page_commit(). Something like:
> 
> If the page is fully truncated, we don't need to wait for any commit (and
> we even should not as __ext4_journalled_invalidatepage() may strip all
> buffers from the page but keep the page dirty which can then confuse e.g.
> concurrent ext4_writepage() seeing dirty page without buffers). Also we
> don't need to wait for any commit if all buffers in the page remain valid.
> This is most beneficial for the common case of blocksize == PAGE_SIZE.

I will add this comment and reorganize the patch. Thanks a lot!

> 
> 								Honza
> 
>>>> ---
>>>>    fs/ext4/inode.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>>> index 006b7a2070bf..a9943ae4f74d 100644
>>>> --- a/fs/ext4/inode.c
>>>> +++ b/fs/ext4/inode.c
>>>> @@ -5479,7 +5479,7 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
>>>>    	 * do. We do the check mainly to optimize the common PAGE_SIZE ==
>>>>    	 * blocksize case
>>>>    	 */
>>>> -	if (offset > PAGE_SIZE - i_blocksize(inode))
>>>> +	if (!offset || offset > PAGE_SIZE - i_blocksize(inode))
>>>>    		return;
>>>>    	while (1) {
>>>>    		page = find_lock_page(inode->i_mapping,

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 006b7a2070bf..a9943ae4f74d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5479,7 +5479,7 @@  static void ext4_wait_for_tail_page_commit(struct inode *inode)
 	 * do. We do the check mainly to optimize the common PAGE_SIZE ==
 	 * blocksize case
 	 */
-	if (offset > PAGE_SIZE - i_blocksize(inode))
+	if (!offset || offset > PAGE_SIZE - i_blocksize(inode))
 		return;
 	while (1) {
 		page = find_lock_page(inode->i_mapping,
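
To compare the three conditions discussed in this thread, here is a minimal
user-space sketch of the early-return test alone. It assumes a 4096-byte
page (as in the reproducer) and takes offset to be the position of the new
i_size within its page; MODEL_PAGE_SIZE and the skip_wait_* names are
hypothetical and do not exist in fs/ext4/inode.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages, matching length = 4096 in the reproducer. */
#define MODEL_PAGE_SIZE 4096u

/* Condition before the patch. */
static inline bool skip_wait_old(unsigned int offset, unsigned int blocksize)
{
	return offset > MODEL_PAGE_SIZE - blocksize;
}

/* Condition after the patch: also skip a fully truncated page. */
static inline bool skip_wait_new(unsigned int offset, unsigned int blocksize)
{
	return !offset || offset > MODEL_PAGE_SIZE - blocksize;
}

/* The '>=' alternative raised in comment #1. */
static inline bool skip_wait_ge(unsigned int offset, unsigned int blocksize)
{
	return offset >= MODEL_PAGE_SIZE - blocksize;
}
```

With blocksize 1024, skip_wait_ge(0, 1024) is false (still waits at
offset == 0) and skip_wait_ge(3072, 1024) is true (skips a wait that is
needed), which are the two objections from comment #2. skip_wait_new()
returns true at offset 0 and false at offset 3072, in line with comment #3.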