Message ID | 20130927155329.3272.64086.stgit@dhcp-10-30-17-2.sw.ru |
---|---|
State | Accepted, archived |
On Fri 27-09-13 19:54:03, Maxim Patlasov wrote:
> While handling punch-hole fallocate, it's useless to truncate page cache
> before removing the range from extent tree (or block map in indirect case)
> because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
> read) immediately after truncating page cache, but before updating extent
> tree (or block map). In that case the user will see stale data even after
> fallocate is completed.
>
> Changed in v2 (Thanks to Jan Kara):
> - Until the problem of data corruption resulting from pages backed by
>   already freed blocks is fully resolved, the simple thing we can do now
>   is to add another truncation of pagecache after punch hole is done.

The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

Honza

> Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
> ---
>  fs/ext4/inode.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 0d424d7..2984ddf 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3621,6 +3621,12 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
>  	up_write(&EXT4_I(inode)->i_data_sem);
>  	if (IS_SYNC(inode))
>  		ext4_handle_sync(handle);
> +
> +	/* Now release the pages again to reduce race window */
> +	if (last_block_offset > first_block_offset)
> +		truncate_pagecache_range(inode, first_block_offset,
> +					 last_block_offset);
> +
>  	inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
>  	ext4_mark_inode_dirty(handle, inode);
>  out_stop:
>
On Fri, Sep 27, 2013 at 06:05:17PM +0200, Jan Kara wrote:
> On Fri 27-09-13 19:54:03, Maxim Patlasov wrote:
> > While handling punch-hole fallocate, it's useless to truncate page cache
> > before removing the range from extent tree (or block map in indirect case)
> > because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
> > read) immediately after truncating page cache, but before updating extent
> > tree (or block map). In that case the user will see stale data even after
> > fallocate is completed.
> >
> > Changed in v2 (Thanks to Jan Kara):
> > - Until the problem of data corruption resulting from pages backed by
> >   already freed blocks is fully resolved, the simple thing we can do now
> >   is to add another truncation of pagecache after punch hole is done.
> The patch looks good. You can add:
> Reviewed-by: Jan Kara <jack@suse.cz>

I was going through old patches, and it looks like this one got
dropped. My apologies.

As far as I can tell, the underlying problem in the VFS/MM layer
hasn't been solved yet (Jan, can you confirm?), so I've queued this
patch for the next merge window.

- Ted

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu 20-02-14 19:21:07, Ted Tso wrote:
> On Fri, Sep 27, 2013 at 06:05:17PM +0200, Jan Kara wrote:
> > On Fri 27-09-13 19:54:03, Maxim Patlasov wrote:
> > > While handling punch-hole fallocate, it's useless to truncate page cache
> > > before removing the range from extent tree (or block map in indirect case)
> > > because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
> > > read) immediately after truncating page cache, but before updating extent
> > > tree (or block map). In that case the user will see stale data even after
> > > fallocate is completed.
> > >
> > > Changed in v2 (Thanks to Jan Kara):
> > > - Until the problem of data corruption resulting from pages backed by
> > >   already freed blocks is fully resolved, the simple thing we can do now
> > >   is to add another truncation of pagecache after punch hole is done.
> > The patch looks good. You can add:
> > Reviewed-by: Jan Kara <jack@suse.cz>
>
> I was going through old patches, and it looks like this one got
> dropped. My apologies.
>
> As far as I can tell, the underlying problem in the VFS/MM layer
> hasn't been solved yet (Jan, can you confirm?), so I've queued this
> patch for the next merge window.

Yes, we didn't solve it yet. Thanks for queueing the patch!

Honza
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0d424d7..2984ddf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3621,6 +3621,12 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
 	up_write(&EXT4_I(inode)->i_data_sem);
 	if (IS_SYNC(inode))
 		ext4_handle_sync(handle);
+
+	/* Now release the pages again to reduce race window */
+	if (last_block_offset > first_block_offset)
+		truncate_pagecache_range(inode, first_block_offset,
+					 last_block_offset);
+
 	inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
 	ext4_mark_inode_dirty(handle, inode);
 out_stop:
While handling punch-hole fallocate, it's useless to truncate page cache
before removing the range from extent tree (or block map in indirect case)
because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
read) immediately after truncating page cache, but before updating extent
tree (or block map). In that case the user will see stale data even after
fallocate is completed.

Changed in v2 (Thanks to Jan Kara):
- Until the problem of data corruption resulting from pages backed by
  already freed blocks is fully resolved, the simple thing we can do now
  is to add another truncation of pagecache after punch hole is done.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/ext4/inode.c | 6 ++++++
 1 file changed, 6 insertions(+)
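The ordering problem described in the commit message can be illustrated with a small user-space toy model. This is only a sketch and not kernel code: `struct toy_file` and every `toy_*` function are hypothetical names invented for the illustration, the "file" is a single block with a single cached page, and the racing reader is simulated by one call placed between the two steps. It shows why a read that lands after the first page cache truncation but before the mapping is removed leaves stale data behind, and why truncating again afterwards hides it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, NOT kernel code): one file block, one cached page. */
struct toy_file {
	bool mapped;	/* block is still present in the "extent tree" */
	char disk;	/* byte stored in the on-disk block */
	bool cached;	/* page is present in the "page cache" */
	char page;	/* byte held by the cached page */
};

/* Models read(2)/read-ahead: on a cache miss, fill the page from the mapping. */
static char toy_read(struct toy_file *f)
{
	assert(f);
	if (!f->cached) {
		f->page = f->mapped ? f->disk : 0;	/* a hole reads as zero */
		f->cached = true;
	}
	return f->page;
}

/* Models dropping the cached page for the punched range. */
static void toy_truncate_pagecache(struct toy_file *f)
{
	f->cached = false;
}

/* Models removing the range from the extent tree (or block map). */
static void toy_remove_mapping(struct toy_file *f)
{
	f->mapped = false;
}

/*
 * Punch a hole while a reader races in between the two steps.
 * Returns the byte a read sees after the punch has completed.
 */
char toy_punch_hole(struct toy_file *f, bool truncate_again)
{
	toy_truncate_pagecache(f);	/* truncation before the mapping update */
	(void)toy_read(f);		/* racing read re-populates the page cache */
	toy_remove_mapping(f);		/* block is now a hole on "disk" */
	if (truncate_again)
		toy_truncate_pagecache(f);	/* the extra step the patch adds */
	return toy_read(f);
}
```

Starting from a mapped block holding `'X'`, `toy_punch_hole(&f, false)` returns the stale `'X'`, while `toy_punch_hole(&f, true)` returns 0. As the comment in the patch says, the second truncation only reduces the race window: a read that starts after the mapping is removed is harmless in this model, but the model does not cover pages that are still being filled from already freed blocks, which is the unresolved VFS/MM problem discussed in the thread.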