Message ID | 20210804125044.2480435-1-yangerkun@huawei.com |
---|---|
State | Superseded |
Headers | show |
Series | ext4: stop return ENOSPC from ext4_issue_zeroout | expand |
On Wed 04-08-21 20:50:44, yangerkun wrote: > Our testcase(briefly described as fsstress on dm thin-provisioning which > ext4 see volume size with 100G but actual size 10G) trigger a hungtask > bug since ext4_writepages fall into a infinite loop: > > static int ext4_writepages(xxx) > { > ... > while (!done && mpd.first_page <= mpd.last_page) { > ... > ret = mpage_prepare_extent_to_map(&mpd); > if (!ret) { > ... > ret = mpage_map_and_submit_extent(handle, > &mpd,&give_up_on_write); > <----- will return -ENOSPC > ... > } > ... > if (ret == -ENOSPC && sbi->s_journal) { > <------ we cannot break since we will get ENOSPC forever > jbd2_journal_force_commit_nested(sbi->s_journal); > ret = 0; > continue; > } > ... > } > } > > Got ENOSPC with follow stack: > ... > ext4_ext_map_blocks > ext4_ext_convert_to_initialized > ext4_ext_zeroout > ext4_issue_zeroout > ... > submit_bio_wait <-- bio to thinpool will return ENOSPC > > 'df22291ff0fd ("ext4: Retry block allocation if we have free blocks > left")' add the logic to retry block allcate since we may get free block > after we commit a transaction. But the ENOSPC from thin-provisioning > will confuse ext4, and lead the upper infinite loop. Fix it by convert > the err to EIO. > > Signed-off-by: yangerkun <yangerkun@huawei.com> Thanks for the patch. As a quick fix for the problem this is probably fine. But longer term we might need to implement a configurable behavior for this because just dropping data on the floor (which is what would happen here) need not be what sysadmin wants and blocking until space is provisioned may be actually a preferable behavior. Anyway for now feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/ext4/inode.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 038aebd7eb2f..d9ded699a88c 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -428,6 +428,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk, ext4_fsblk_t pblk, > if (ret > 0) > ret = 0; > > + if (ret == -ENOSPC) > + ret = -EIO; > + > return ret; > } > > -- > 2.31.1 >
On Wed, Aug 04, 2021 at 03:35:29PM +0200, Jan Kara wrote: > On Wed 04-08-21 20:50:44, yangerkun wrote: > > Our testcase(briefly described as fsstress on dm thin-provisioning which > > ext4 see volume size with 100G but actual size 10G) trigger a hungtask > > bug since ext4_writepages fall into a infinite loop: > > > > Got ENOSPC with follow stack: > > ... > > ext4_ext_map_blocks > > ext4_ext_convert_to_initialized > > ext4_ext_zeroout > > ext4_issue_zeroout > > ... > > submit_bio_wait <-- bio to thinpool will return ENOSPC > > > > Thanks for the patch. As a quick fix for the problem this is probably fine. > But longer term we might need to implement a configurable behavior for this > because just dropping data on the floor (which is what would happen here) > need not be what sysadmin wants and blocking until space is provisioned may be > actually a preferable behavior. Anyway for now feel free to add: > > Reviewed-by: Jan Kara <jack@suse.cz> Hmm, I wonder if this would be a better fix. (Not yet tested, may fry your file system, etc....) What do folks think? - Ted diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 92ad64b89d9b..501516cadc1b 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, split_map.m_len - ee_block); err = ext4_ext_zeroout(inode, &zero_ex1); if (err) - goto out; + goto fallback; split_map.m_len = allocated; } if (split_map.m_lblk - ee_block + split_map.m_len < @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, ext4_ext_pblock(ex)); err = ext4_ext_zeroout(inode, &zero_ex2); if (err) - goto out; + goto fallback; } split_map.m_len += split_map.m_lblk - ee_block; @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, } } +fallback: err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag, flags); if (err > 0)
On Fri 13-08-21 11:18:01, Theodore Ts'o wrote: > On Wed, Aug 04, 2021 at 03:35:29PM +0200, Jan Kara wrote: > > On Wed 04-08-21 20:50:44, yangerkun wrote: > > > Our testcase(briefly described as fsstress on dm thin-provisioning which > > > ext4 see volume size with 100G but actual size 10G) trigger a hungtask > > > bug since ext4_writepages fall into a infinite loop: > > > > > > Got ENOSPC with follow stack: > > > ... > > > ext4_ext_map_blocks > > > ext4_ext_convert_to_initialized > > > ext4_ext_zeroout > > > ext4_issue_zeroout > > > ... > > > submit_bio_wait <-- bio to thinpool will return ENOSPC > > > > > > > Thanks for the patch. As a quick fix for the problem this is probably fine. > > But longer term we might need to implement a configurable behavior for this > > because just dropping data on the floor (which is what would happen here) > > need not be what sysadmin wants and blocking until space is provisioned may be > > actually a preferable behavior. Anyway for now feel free to add: > > > > Reviewed-by: Jan Kara <jack@suse.cz> > > Hmm, I wonder if this would be a better fix. (Not yet tested, may fry > your file system, etc....) What do folks think? Yes, that looks indeed better. I'd note that even splitting extent may fail due to ENOSPC on thin-provisioned storage but the chances are *much* lower. Honza > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > index 92ad64b89d9b..501516cadc1b 100644 > --- a/fs/ext4/extents.c > +++ b/fs/ext4/extents.c > @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, > split_map.m_len - ee_block); > err = ext4_ext_zeroout(inode, &zero_ex1); > if (err) > - goto out; > + goto fallback; > split_map.m_len = allocated; > } > if (split_map.m_lblk - ee_block + split_map.m_len < > @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, > ext4_ext_pblock(ex)); > err = ext4_ext_zeroout(inode, &zero_ex2); > if (err) > - goto out; > + goto fallback; > } > > split_map.m_len += split_map.m_lblk - ee_block; > @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, > } > } > > +fallback: > err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag, > flags); > if (err > 0)
On Mon, Aug 16, 2021 at 12:05:45PM +0200, Jan Kara wrote: > > Yes, that looks indeed better. I'd note that even splitting extent may fail > due to ENOSPC on thin-provisioned storage but the chances are *much* lower. Indeed, any kind of metadata update (updating an inode atime, creating a new inode, deleting a directory entry, etc.) can fail due to ENOSPC on thin-provision storage, leading to a potentially corrupted file system since some writes will succeed, while others won't --- and that's not a scenario that's super well tested, nor is there much we can do other than remounting the file system read/only or forcing a reboot. But at least we won't throw the kernel into an infinite loop, which is what yangerkun was reporting... - Ted
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 038aebd7eb2f..d9ded699a88c 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -428,6 +428,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk, ext4_fsblk_t pblk, if (ret > 0) ret = 0; + if (ret == -ENOSPC) + ret = -EIO; + return ret; }
Our testcase(briefly described as fsstress on dm thin-provisioning which ext4 see volume size with 100G but actual size 10G) trigger a hungtask bug since ext4_writepages fall into a infinite loop: static int ext4_writepages(xxx) { ... while (!done && mpd.first_page <= mpd.last_page) { ... ret = mpage_prepare_extent_to_map(&mpd); if (!ret) { ... ret = mpage_map_and_submit_extent(handle, &mpd,&give_up_on_write); <----- will return -ENOSPC ... } ... if (ret == -ENOSPC && sbi->s_journal) { <------ we cannot break since we will get ENOSPC forever jbd2_journal_force_commit_nested(sbi->s_journal); ret = 0; continue; } ... } } Got ENOSPC with follow stack: ... ext4_ext_map_blocks ext4_ext_convert_to_initialized ext4_ext_zeroout ext4_issue_zeroout ... submit_bio_wait <-- bio to thinpool will return ENOSPC 'df22291ff0fd ("ext4: Retry block allocation if we have free blocks left")' add the logic to retry block allcate since we may get free block after we commit a transaction. But the ENOSPC from thin-provisioning will confuse ext4, and lead the upper infinite loop. Fix it by convert the err to EIO. Signed-off-by: yangerkun <yangerkun@huawei.com> --- fs/ext4/inode.c | 3 +++ 1 file changed, 3 insertions(+)