Message ID | 002a01cf59c3$4eaf7490$ec0e5db0$@samsung.com |
---|---|
State | Superseded, archived |
On Thu, 17 Apr 2014, Namjae Jeon wrote:

> Date: Thu, 17 Apr 2014 07:29:18 +0900
> From: Namjae Jeon <namjae.jeon@samsung.com>
> To: Theodore Ts'o <tytso@mit.edu>
> Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
>     Lukáš Czerner <lczerner@redhat.com>
> Subject: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
>     mode
>
> From: Namjae Jeon <namjae.jeon@samsung.com>
>
> xfstests generic/091 is failing when mounting ext4 with data=journal.
> I think that this regression is same problem that occurred prior to collapse
> range issue. So ZERO RANGE also need to call ext4_force_commit as
> collapse range.
>
> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> ---
>  fs/ext4/extents.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index f386dd6..a64242f 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
>
>  	trace_ext4_zero_range(inode, offset, len, mode);
>
> +	/* Call ext4_force_commit to flush all data in case of data=journal. */
> +	if (ext4_should_journal_data(inode)) {
> +		ret = ext4_force_commit(inode->i_sb);
> +		if (ret)
> +			return ret;
> +	}

Hi,

it makes sense. But I have a question, maybe I do not understand it
correctly but what protect us from other writes coming in after we
force the commit ?

-Lukas

> +
>  	/*
>  	 * Write out all dirty pages to avoid race conditions
>  	 * Then release them.
>
> On Thu, 17 Apr 2014, Namjae Jeon wrote:
>
> > Date: Thu, 17 Apr 2014 07:29:18 +0900
> > From: Namjae Jeon <namjae.jeon@samsung.com>
> > To: Theodore Ts'o <tytso@mit.edu>
> > Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
> >     Lukáš Czerner <lczerner@redhat.com>
> > Subject: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> >     mode
> >
> > From: Namjae Jeon <namjae.jeon@samsung.com>
> >
> > xfstests generic/091 is failing when mounting ext4 with data=journal.
> > I think that this regression is same problem that occurred prior to collapse
> > range issue. So ZERO RANGE also need to call ext4_force_commit as
> > collapse range.
> >
> > Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
> > Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> > ---
> >  fs/ext4/extents.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index f386dd6..a64242f 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> >
> >  	trace_ext4_zero_range(inode, offset, len, mode);
> >
> > +	/* Call ext4_force_commit to flush all data in case of data=journal. */
> > +	if (ext4_should_journal_data(inode)) {
> > +		ret = ext4_force_commit(inode->i_sb);
> > +		if (ret)
> > +			return ret;
> > +	}
>
> Hi,
Hi Lukas.
>
> it makes sense. But I have a question, maybe I do not understand it
> correctly but what protect us from other writes coming in after we
> force the commit ?
Yes, Currently new write can come between ext4_force_commit and till
we acquire mutex_lock. But this window is already present even
without patch. Its just that in case of data=journal mode, this
window will become slightly bigger. one possible solution coming to
my mind is one more time calling ext4_force_commit followed by a call
to filemap_write_and_wait_range inside mutex_lock which would sync
data that has dirtied after 1st call.

Thanks!
>
> -Lukas
>
> > +
> >  	/*
> >  	 * Write out all dirty pages to avoid race conditions
> >  	 * Then release them.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 17 Apr 2014, Namjae Jeon wrote:

> Date: Thu, 17 Apr 2014 19:52:09 +0900
> From: Namjae Jeon <namjae.jeon@samsung.com>
> To: 'Lukáš Czerner' <lczerner@redhat.com>
> Cc: 'Theodore Ts'o' <tytso@mit.edu>, 'linux-ext4' <linux-ext4@vger.kernel.org>
> Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
>     mode
>
> >
> > On Thu, 17 Apr 2014, Namjae Jeon wrote:
> >
> > > Date: Thu, 17 Apr 2014 07:29:18 +0900
> > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > To: Theodore Ts'o <tytso@mit.edu>
> > > Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
> > >     Lukáš Czerner <lczerner@redhat.com>
> > > Subject: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> > >     mode
> > >
> > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > >
> > > xfstests generic/091 is failing when mounting ext4 with data=journal.
> > > I think that this regression is same problem that occurred prior to collapse
> > > range issue. So ZERO RANGE also need to call ext4_force_commit as
> > > collapse range.
> > >
> > > Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
> > > Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> > > ---
> > >  fs/ext4/extents.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > index f386dd6..a64242f 100644
> > > --- a/fs/ext4/extents.c
> > > +++ b/fs/ext4/extents.c
> > > @@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> > >
> > >  	trace_ext4_zero_range(inode, offset, len, mode);
> > >
> > > +	/* Call ext4_force_commit to flush all data in case of data=journal. */
> > > +	if (ext4_should_journal_data(inode)) {
> > > +		ret = ext4_force_commit(inode->i_sb);
> > > +		if (ret)
> > > +			return ret;
> > > +	}
> >
> > Hi,
> Hi Lukas.
> >
> > it makes sense. But I have a question, maybe I do not understand it
> > correctly but what protect us from other writes coming in after we
> > force the commit ?
> Yes, Currently new write can come between ext4_force_commit and till
> we acquire mutex_lock. But this window is already present even
> without patch. Its just that in case of data=journal mode, this
> window will become slightly bigger. one possible solution coming to
> my mind is one more time calling ext4_force_commit followed by a call
> to filemap_write_and_wait_range inside mutex_lock which would sync
> data that has dirtied after 1st call.

Can we really call ext4_force_commit() inside mutex_lock ?

-Lukas

>
> Thanks!
> >
> > -Lukas
> >
> > > +
> > >  	/*
> > >  	 * Write out all dirty pages to avoid race conditions
> > >  	 * Then release them.
>
> On Thu, 17 Apr 2014, Namjae Jeon wrote:
>
> > Date: Thu, 17 Apr 2014 19:52:09 +0900
> > From: Namjae Jeon <namjae.jeon@samsung.com>
> > To: 'Lukáš Czerner' <lczerner@redhat.com>
> > Cc: 'Theodore Ts'o' <tytso@mit.edu>, 'linux-ext4' <linux-ext4@vger.kernel.org>
> > Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> >     mode
> >
> > >
> > > On Thu, 17 Apr 2014, Namjae Jeon wrote:
> > >
> > > > Date: Thu, 17 Apr 2014 07:29:18 +0900
> > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > > To: Theodore Ts'o <tytso@mit.edu>
> > > > Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
> > > >     Lukáš Czerner <lczerner@redhat.com>
> > > > Subject: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> > > >     mode
> > > >
> > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > >
> > > > xfstests generic/091 is failing when mounting ext4 with data=journal.
> > > > I think that this regression is same problem that occurred prior to collapse
> > > > range issue. So ZERO RANGE also need to call ext4_force_commit as
> > > > collapse range.
> > > >
> > > > Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
> > > > Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> > > > ---
> > > >  fs/ext4/extents.c | 7 +++++++
> > > >  1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > > index f386dd6..a64242f 100644
> > > > --- a/fs/ext4/extents.c
> > > > +++ b/fs/ext4/extents.c
> > > > @@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> > > >
> > > >  	trace_ext4_zero_range(inode, offset, len, mode);
> > > >
> > > > +	/* Call ext4_force_commit to flush all data in case of data=journal. */
> > > > +	if (ext4_should_journal_data(inode)) {
> > > > +		ret = ext4_force_commit(inode->i_sb);
> > > > +		if (ret)
> > > > +			return ret;
> > > > +	}
> > >
> > > Hi,
> > Hi Lukas.
> > >
> > > it makes sense. But I have a question, maybe I do not understand it
> > > correctly but what protect us from other writes coming in after we
> > > force the commit ?
> > Yes, Currently new write can come between ext4_force_commit and till
> > we acquire mutex_lock. But this window is already present even
> > without patch. Its just that in case of data=journal mode, this
> > window will become slightly bigger. one possible solution coming to
> > my mind is one more time calling ext4_force_commit followed by a call
> > to filemap_write_and_wait_range inside mutex_lock which would sync
> > data that has dirtied after 1st call.
>
> Can we really call ext4_force_commit() inside mutex_lock ?
Yes, I can see ext4_force_commit inside mutex_lock in ext4_sync_file().

>
> -Lukas
>
> >
> > Thanks!
> > >
> > > -Lukas
> > >
> > > > +
> > > >  	/*
> > > >  	 * Write out all dirty pages to avoid race conditions
> > > >  	 * Then release them.
On Thu, 17 Apr 2014, Namjae Jeon wrote:

> Date: Thu, 17 Apr 2014 21:01:25 +0900
> From: Namjae Jeon <namjae.jeon@samsung.com>
> To: 'Lukáš Czerner' <lczerner@redhat.com>
> Cc: 'Theodore Ts'o' <tytso@mit.edu>, 'linux-ext4' <linux-ext4@vger.kernel.org>
> Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
>     mode
>
> >
> > On Thu, 17 Apr 2014, Namjae Jeon wrote:
> >
> > > Date: Thu, 17 Apr 2014 19:52:09 +0900
> > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > To: 'Lukáš Czerner' <lczerner@redhat.com>
> > > Cc: 'Theodore Ts'o' <tytso@mit.edu>, 'linux-ext4' <linux-ext4@vger.kernel.org>
> > > Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> > >     mode
> > >
> > > >
> > > > On Thu, 17 Apr 2014, Namjae Jeon wrote:
> > > >
> > > > > Date: Thu, 17 Apr 2014 07:29:18 +0900
> > > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > > > To: Theodore Ts'o <tytso@mit.edu>
> > > > > Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
> > > > >     Lukáš Czerner <lczerner@redhat.com>
> > > > > Subject: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> > > > >     mode
> > > > >
> > > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > > >
> > > > > xfstests generic/091 is failing when mounting ext4 with data=journal.
> > > > > I think that this regression is same problem that occurred prior to collapse
> > > > > range issue. So ZERO RANGE also need to call ext4_force_commit as
> > > > > collapse range.
> > > > >
> > > > > Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
> > > > > Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> > > > > ---
> > > > >  fs/ext4/extents.c | 7 +++++++
> > > > >  1 file changed, 7 insertions(+)
> > > > >
> > > > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > > > index f386dd6..a64242f 100644
> > > > > --- a/fs/ext4/extents.c
> > > > > +++ b/fs/ext4/extents.c
> > > > > @@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> > > > >
> > > > >  	trace_ext4_zero_range(inode, offset, len, mode);
> > > > >
> > > > > +	/* Call ext4_force_commit to flush all data in case of data=journal. */
> > > > > +	if (ext4_should_journal_data(inode)) {
> > > > > +		ret = ext4_force_commit(inode->i_sb);
> > > > > +		if (ret)
> > > > > +			return ret;
> > > > > +	}
> > > >
> > > > Hi,
> > > Hi Lukas.
> > > >
> > > > it makes sense. But I have a question, maybe I do not understand it
> > > > correctly but what protect us from other writes coming in after we
> > > > force the commit ?
> > > Yes, Currently new write can come between ext4_force_commit and till
> > > we acquire mutex_lock. But this window is already present even
> > > without patch. Its just that in case of data=journal mode, this
> > > window will become slightly bigger. one possible solution coming to
> > > my mind is one more time calling ext4_force_commit followed by a call
> > > to filemap_write_and_wait_range inside mutex_lock which would sync
> > > data that has dirtied after 1st call.
> >
> > Can we really call ext4_force_commit() inside mutex_lock ?
> Yes, I can see ext4_force_commit inside mutex_lock in ext4_sync_file().

There might be some misunderstanding, are we talking about
inode->i_mutex because that is certainly not held in
ext4_sync_file() or am I missing something ?

-Lukas

>
> >
> > -Lukas
> >
> > >
> > > Thanks!
> > > >
> > > > -Lukas
> > > >
> > > > > +
> > > > >  	/*
> > > > >  	 * Write out all dirty pages to avoid race conditions
> > > > >  	 * Then release them.
> -----Original Message-----
> From: Lukáš Czerner [mailto:lczerner@redhat.com]
> Sent: Thursday, April 17, 2014 9:16 PM
> To: Namjae Jeon
> Cc: 'Theodore Ts'o'; 'linux-ext4'
> Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling mode
>
> On Thu, 17 Apr 2014, Namjae Jeon wrote:
>
> > Date: Thu, 17 Apr 2014 21:01:25 +0900
> > From: Namjae Jeon <namjae.jeon@samsung.com>
> > To: 'Lukáš Czerner' <lczerner@redhat.com>
> > Cc: 'Theodore Ts'o' <tytso@mit.edu>, 'linux-ext4' <linux-ext4@vger.kernel.org>
> > Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> >     mode
> >
> > >
> > > On Thu, 17 Apr 2014, Namjae Jeon wrote:
> > >
> > > > Date: Thu, 17 Apr 2014 19:52:09 +0900
> > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > > To: 'Lukáš Czerner' <lczerner@redhat.com>
> > > > Cc: 'Theodore Ts'o' <tytso@mit.edu>, 'linux-ext4' <linux-ext4@vger.kernel.org>
> > > > Subject: RE: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> > > >     mode
> > > >
> > > > >
> > > > > On Thu, 17 Apr 2014, Namjae Jeon wrote:
> > > > >
> > > > > > Date: Thu, 17 Apr 2014 07:29:18 +0900
> > > > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > > > > To: Theodore Ts'o <tytso@mit.edu>
> > > > > > Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
> > > > > >     Lukáš Czerner <lczerner@redhat.com>
> > > > > > Subject: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
> > > > > >     mode
> > > > > >
> > > > > > From: Namjae Jeon <namjae.jeon@samsung.com>
> > > > > >
> > > > > > xfstests generic/091 is failing when mounting ext4 with data=journal.
> > > > > > I think that this regression is same problem that occurred prior to collapse
> > > > > > range issue. So ZERO RANGE also need to call ext4_force_commit as
> > > > > > collapse range.
> > > > > >
> > > > > > Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
> > > > > > Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> > > > > > ---
> > > > > >  fs/ext4/extents.c | 7 +++++++
> > > > > >  1 file changed, 7 insertions(+)
> > > > > >
> > > > > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > > > > index f386dd6..a64242f 100644
> > > > > > --- a/fs/ext4/extents.c
> > > > > > +++ b/fs/ext4/extents.c
> > > > > > @@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> > > > > >
> > > > > >  	trace_ext4_zero_range(inode, offset, len, mode);
> > > > > >
> > > > > > +	/* Call ext4_force_commit to flush all data in case of data=journal. */
> > > > > > +	if (ext4_should_journal_data(inode)) {
> > > > > > +		ret = ext4_force_commit(inode->i_sb);
> > > > > > +		if (ret)
> > > > > > +			return ret;
> > > > > > +	}
> > > > >
> > > > > Hi,
> > > > Hi Lukas.
> > > > >
> > > > > it makes sense. But I have a question, maybe I do not understand it
> > > > > correctly but what protect us from other writes coming in after we
> > > > > force the commit ?
> > > > Yes, Currently new write can come between ext4_force_commit and till
> > > > we acquire mutex_lock. But this window is already present even
> > > > without patch. Its just that in case of data=journal mode, this
> > > > window will become slightly bigger. one possible solution coming to
> > > > my mind is one more time calling ext4_force_commit followed by a call
> > > > to filemap_write_and_wait_range inside mutex_lock which would sync
> > > > data that has dirtied after 1st call.
> > >
> > > Can we really call ext4_force_commit() inside mutex_lock ?
> > Yes, I can see ext4_force_commit inside mutex_lock in ext4_sync_file().
>
> There might be some misunderstanding, are we talking about
> inode->i_mutex because that is certainly not held in
> ext4_sync_file() or am I missing something ?
Ah. Sorry, I checked old kernel source(v3.10) i_mutex was removed from
ext4_sync_file by Jan. But that does not mean we can't call
ext4_force_commit from within i_mutex.
Hi Jan. Am I missing something ?

Thanks.
>
> -Lukas
>
> >
> > >
> > > -Lukas
> > >
> > > >
> > > > Thanks!
> > > > >
> > > > > -Lukas
> > > > >
> > > > > > +
> > > > > >  	/*
> > > > > >  	 * Write out all dirty pages to avoid race conditions
> > > > > >  	 * Then release them.
So a couple of things.  First of all, ext4_force_commit() is a very
expensive call, so calling it twice is really not a good idea.

Secondly, in the ext4_collapse_range() you are calling
ext4_force_commit() before filemap_write_and_wait_range().

	/* Call ext4_force_commit to flush all data in case of data=journal. */
	if (ext4_should_journal_data(inode)) {
		ret = ext4_force_commit(inode->i_sb);
		if (ret)
			return ret;
	}

	/* Write out all dirty pages */
	ret = filemap_write_and_wait_range(inode->i_mapping, offset, -1);
	if (ret)
		return ret;

Shouldn't we reverse these two calls?

Finally, I'm wondering if we would be better off creating a new
explicit EXT4_I(inode)->i_write_mutex which is used to block new
writes from starting.  This could also be used to subsume the
ext4_aio_mutex.

					- Ted
On Fri, 18 Apr 2014, Theodore Ts'o wrote:

> Date: Fri, 18 Apr 2014 10:37:11 -0400
> From: Theodore Ts'o <tytso@mit.edu>
> To: Namjae Jeon <namjae.jeon@samsung.com>
> Cc: 'Lukáš Czerner' <lczerner@redhat.com>, 'Jan Kara' <jack@suse.cz>,
>     'linux-ext4' <linux-ext4@vger.kernel.org>
> Subject: Re: [PATCH 2/3] ext4: fix ZERO_RANGE test failure in data journalling
>     mode
>
> So a couple of things.  First of all, ext4_force_commit() is a very
> expensive call, so calling it twice is really not a good idea.
>
> Secondly, in the ext4_collapse_range() you are calling
> ext4_force_commit() before filemap_write_and_wait_range().
>
> 	/* Call ext4_force_commit to flush all data in case of data=journal. */
> 	if (ext4_should_journal_data(inode)) {
> 		ret = ext4_force_commit(inode->i_sb);
> 		if (ret)
> 			return ret;
> 	}
>
> 	/* Write out all dirty pages */
> 	ret = filemap_write_and_wait_range(inode->i_mapping, offset, -1);
> 	if (ret)
> 		return ret;
>
> Shouldn't we reverse these two calls?
>
> Finally, I'm wondering if we would be better off creating a new
> explicit EXT4_I(inode)->i_write_mutex which is used to block new
> writes from starting.  This could also be used to subsume the
> ext4_aio_mutex.

We can maybe use something similar xfs has with their XFS_IOLOCK

-Lukas

>
> 					- Ted
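To make the idea of a dedicated write lock a bit more concrete, here is a
minimal userspace sketch in C. It is not ext4 code and not a proposed patch:
struct toy_inode, toy_write(), toy_flush_locked() and toy_zero_range() are
hypothetical names, and a pthread mutex only stands in for the suggested
i_write_mutex. The one point it tries to show is that if the flush happens
after the lock is taken, no new write can land between the flush and the
zeroing.

```c
/* Userspace model only; none of these names are ext4 or kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FILE_SIZE 16

struct toy_inode {
	pthread_mutex_t write_mutex;	/* hypothetical stand-in for i_write_mutex */
	char cache[FILE_SIZE];		/* stand-in for the page cache */
	char disk[FILE_SIZE];		/* stand-in for the on-disk blocks */
	bool dirty;			/* cache holds data not yet on "disk" */
};

static void toy_inode_init(struct toy_inode *ino)
{
	assert(ino != NULL);
	pthread_mutex_init(&ino->write_mutex, NULL);
	memset(ino->cache, 0, FILE_SIZE);
	memset(ino->disk, 0, FILE_SIZE);
	ino->dirty = false;
}

/* Buffered write: dirties the cache, and only while holding the write mutex. */
static int toy_write(struct toy_inode *ino, size_t off, const char *buf, size_t len)
{
	if (off > FILE_SIZE || len > FILE_SIZE - off)
		return -1;
	pthread_mutex_lock(&ino->write_mutex);
	memcpy(ino->cache + off, buf, len);
	ino->dirty = true;
	pthread_mutex_unlock(&ino->write_mutex);
	return 0;
}

/* Copy the cache to "disk". The caller must already hold write_mutex. */
static void toy_flush_locked(struct toy_inode *ino)
{
	memcpy(ino->disk, ino->cache, FILE_SIZE);
	ino->dirty = false;
}

/*
 * Zero a range. The lock is taken first and the flush is done under it,
 * so writers are blocked from the flush until the zeroing has finished.
 */
static int toy_zero_range(struct toy_inode *ino, size_t off, size_t len)
{
	if (off > FILE_SIZE || len > FILE_SIZE - off)
		return -1;
	pthread_mutex_lock(&ino->write_mutex);
	toy_flush_locked(ino);
	memset(ino->cache + off, 0, len);
	memset(ino->disk + off, 0, len);
	pthread_mutex_unlock(&ino->write_mutex);
	return 0;
}
```

A caller would initialise the inode with toy_inode_init(), issue toy_write()
calls from any number of threads, and call toy_zero_range() from another; on
some toolchains this needs -pthread. How such a lock would interact with
i_mutex, the journal, and AIO in the real code is exactly what the thread
leaves open.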
> So a couple of things.  First of all, ext4_force_commit() is a very
> expensive call, so calling it twice is really not a good idea.
Yes, right.
>
> Secondly, in the ext4_collapse_range() you are calling
> ext4_force_commit() before filemap_write_and_wait_range().
>
> 	/* Call ext4_force_commit to flush all data in case of data=journal. */
> 	if (ext4_should_journal_data(inode)) {
> 		ret = ext4_force_commit(inode->i_sb);
> 		if (ret)
> 			return ret;
> 	}
>
> 	/* Write out all dirty pages */
> 	ret = filemap_write_and_wait_range(inode->i_mapping, offset, -1);
> 	if (ret)
> 		return ret;
>
> Shouldn't we reverse these two calls?
Yes, the original problem will occur again if we reverse these calls.
ext4_force_commit will mark the buffers as dirty during the commit
transaction, so we should sync them afterwards using
filemap_write_and_wait_range.
>
> Finally, I'm wondering if we would be better off creating a new
> explicit EXT4_I(inode)->i_write_mutex which is used to block new
> writes from starting.  This could also be used to subsume the
> ext4_aio_mutex.
Right. It is a better method. I will check your point. :)

Thanks Ted!!
>
> 					- Ted
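The ordering argument above can be illustrated with a small toy state model
in C. This is only a sketch resting on the assumption stated in the reply,
namely that forcing a commit in data=journal mode leaves the journalled
buffers marked dirty. struct toy_fs and the toy_*() functions are
hypothetical names and do not correspond to ext4 or jbd2 interfaces.

```c
/* Toy model of the data=journal ordering argument; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 4

struct toy_fs {
	bool in_journal[NBLOCKS];	/* data sits in the running transaction */
	bool dirty[NBLOCKS];		/* buffer is dirty, awaiting writeback */
	bool on_disk[NBLOCKS];		/* data has reached its final location */
};

/* Journalled write: the data goes into the running transaction only. */
static void toy_journalled_write(struct toy_fs *fs, int blk)
{
	assert(blk >= 0 && blk < NBLOCKS);
	fs->in_journal[blk] = true;
}

/* Models the claim from the thread: committing marks the buffers dirty. */
static void toy_force_commit(struct toy_fs *fs)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (fs->in_journal[i]) {
			fs->in_journal[i] = false;
			fs->dirty[i] = true;
		}
	}
}

/* Models the writeback step: write out whatever is dirty right now. */
static void toy_writeback(struct toy_fs *fs)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (fs->dirty[i]) {
			fs->dirty[i] = false;
			fs->on_disk[i] = true;
		}
	}
}

/* Number of blocks whose data has not yet reached its final location. */
static int toy_pending(const struct toy_fs *fs)
{
	int n = 0;

	for (int i = 0; i < NBLOCKS; i++)
		n += fs->in_journal[i] || fs->dirty[i];
	return n;
}
```

In this model, toy_force_commit() followed by toy_writeback() leaves
toy_pending() at zero, while the reversed order leaves every journalled block
dirty, because the writeback ran before the commit produced anything for it
to write. Whether the real code behaves this way depends on jbd2 details the
sketch does not capture.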
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index f386dd6..a64242f 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4742,6 +4742,13 @@ static long ext4_zero_range(struct file *file, loff_t offset,
 
 	trace_ext4_zero_range(inode, offset, len, mode);
 
+	/* Call ext4_force_commit to flush all data in case of data=journal. */
+	if (ext4_should_journal_data(inode)) {
+		ret = ext4_force_commit(inode->i_sb);
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * Write out all dirty pages to avoid race conditions
 	 * Then release them.