| Message ID | 1237311235-13623-4-git-send-email-jack@suse.cz |
|---|---|
| State | Not Applicable, archived |
On Tue, Mar 17, 2009 at 06:33:54PM +0100, Jan Kara wrote:
> Assume the following situation:
> Filesystem with blocksize < pagesize - suppose blocksize = 1024,
> pagesize = 4096. File 'f' has first four blocks already allocated.
> (line with "state:" contains the state of buffers in the page - m = mapped,
> u = uptodate, d = dirty)
>
> process 1:                                process 2:
>
> write to 'f' bytes 0 - 1024
>   state: |mud,-,-,-|, page dirty
>                                           write to 'f' bytes 1024 - 4096:
>                                             __block_prepare_write() maps blocks
>   state: |mud,m,m,m|, page dirty
>                                             we fail to copy data -> copied = 0
>                                             block_write_end() does nothing
>                                             page gets unlocked
> writepage() is called on the page
> block_write_full_page() writes buffers with garbage
>
> This patch fixes the problem by skipping !uptodate buffers in
> block_write_full_page().
>
> CC: Nick Piggin <npiggin@suse.de>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/buffer.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 9f69741..22c0144 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1774,7 +1774,12 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
>  	} while (bh != head);
>
>  	do {
> -		if (!buffer_mapped(bh))
> +		/*
> +		 * Parallel write could have already mapped the buffers but
> +		 * it then had to restart before copying in new data. We
> +		 * must avoid writing garbage so just skip the buffer.
> +		 */
> +		if (!buffer_mapped(bh) || !buffer_uptodate(bh))
>  			continue;

I don't quite see how this can happen. Further down in this loop,
we do a test_clear_buffer_dirty(), which should exclude this I
think? And marking the buffer dirty if it is not uptodate should
be a bug.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed 18-03-09 13:00:23, Nick Piggin wrote:
> On Tue, Mar 17, 2009 at 06:33:54PM +0100, Jan Kara wrote:
> > Assume the following situation:
> > Filesystem with blocksize < pagesize - suppose blocksize = 1024,
> > pagesize = 4096. File 'f' has first four blocks already allocated.
> > (line with "state:" contains the state of buffers in the page - m = mapped,
> > u = uptodate, d = dirty)
> >
> > process 1:                                process 2:
> >
> > write to 'f' bytes 0 - 1024
> >   state: |mud,-,-,-|, page dirty
> >                                           write to 'f' bytes 1024 - 4096:
> >                                             __block_prepare_write() maps blocks
> >   state: |mud,m,m,m|, page dirty
> >                                             we fail to copy data -> copied = 0
> >                                             block_write_end() does nothing
> >                                             page gets unlocked
> > writepage() is called on the page
> > block_write_full_page() writes buffers with garbage
> >
> > This patch fixes the problem by skipping !uptodate buffers in
> > block_write_full_page().
> >
> > CC: Nick Piggin <npiggin@suse.de>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/buffer.c |    7 ++++++-
> >  1 files changed, 6 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index 9f69741..22c0144 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -1774,7 +1774,12 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
> >  	} while (bh != head);
> >
> >  	do {
> > -		if (!buffer_mapped(bh))
> > +		/*
> > +		 * Parallel write could have already mapped the buffers but
> > +		 * it then had to restart before copying in new data. We
> > +		 * must avoid writing garbage so just skip the buffer.
> > +		 */
> > +		if (!buffer_mapped(bh) || !buffer_uptodate(bh))
> >  			continue;
>
> I don't quite see how this can happen. Further down in this loop,
> we do a test_clear_buffer_dirty(), which should exclude this I
> think? And marking the buffer dirty if it is not uptodate should
> be a bug.

Hmm, this patch definitely does something important, because without it
I hit corruption in UML in ~20 minutes, and with it no corruption happens
in ~3 hours. Maybe someone calls set_page_dirty() on the page, and
__set_page_dirty_buffers() unconditionally dirties all the buffers the
page has? But I still don't see how the write could be lost, which is
what I observe in the fsx-linux test. I'm doing some more tests to
understand this better.

								Honza
On Tue, Mar 17, 2009 at 06:33:54PM +0100, Jan Kara wrote:
> Assume the following situation:
> Filesystem with blocksize < pagesize - suppose blocksize = 1024,
> pagesize = 4096. File 'f' has first four blocks already allocated.
> (line with "state:" contains the state of buffers in the page - m = mapped,
> u = uptodate, d = dirty)
>
> process 1:                                process 2:
>
> write to 'f' bytes 0 - 1024
>   state: |mud,-,-,-|, page dirty
>                                           write to 'f' bytes 1024 - 4096:
>                                             __block_prepare_write() maps blocks
>   state: |mud,m,m,m|, page dirty
>                                             we fail to copy data -> copied = 0
>                                             block_write_end() does nothing
>                                             page gets unlocked

If copied = 0 then in block_write_end() we do

	page_zero_new_buffers(page, start+copied, start+len);

which would mean we should not see garbage.

> writepage() is called on the page
> block_write_full_page() writes buffers with garbage

-aneesh
On Thu 19-03-09 00:12:22, Aneesh Kumar K.V wrote:
> On Tue, Mar 17, 2009 at 06:33:54PM +0100, Jan Kara wrote:
> > Assume the following situation:
> > Filesystem with blocksize < pagesize - suppose blocksize = 1024,
> > pagesize = 4096. File 'f' has first four blocks already allocated.
> > (line with "state:" contains the state of buffers in the page - m = mapped,
> > u = uptodate, d = dirty)
> >
> > process 1:                                process 2:
> >
> > write to 'f' bytes 0 - 1024
> >   state: |mud,-,-,-|, page dirty
> >                                           write to 'f' bytes 1024 - 4096:
> >                                             __block_prepare_write() maps blocks
> >   state: |mud,m,m,m|, page dirty
> >                                             we fail to copy data -> copied = 0
> >                                             block_write_end() does nothing
> >                                             page gets unlocked
>
> If copied = 0 then in block_write_end() we do
>
> 	page_zero_new_buffers(page, start+copied, start+len);
>
> which would mean we should not see garbage.

But this will zero only *new* buffers - so if the blocks are already
allocated, get_block() won't set the new flag and they won't be
zeroed... But I'm not saying I understand why this seems to help
against the corruption under UML, because we don't seem to be writing
!uptodate buffers there.

								Honza
On Wed 18-03-09 13:00:23, Nick Piggin wrote:
> On Tue, Mar 17, 2009 at 06:33:54PM +0100, Jan Kara wrote:
> > Assume the following situation:
> > Filesystem with blocksize < pagesize - suppose blocksize = 1024,
> > pagesize = 4096. File 'f' has first four blocks already allocated.
> > (line with "state:" contains the state of buffers in the page - m = mapped,
> > u = uptodate, d = dirty)
> >
> > process 1:                                process 2:
> >
> > write to 'f' bytes 0 - 1024
> >   state: |mud,-,-,-|, page dirty
> >                                           write to 'f' bytes 1024 - 4096:
> >                                             __block_prepare_write() maps blocks
> >   state: |mud,m,m,m|, page dirty
> >                                             we fail to copy data -> copied = 0
> >                                             block_write_end() does nothing
> >                                             page gets unlocked
> > writepage() is called on the page
> > block_write_full_page() writes buffers with garbage
> >
> > This patch fixes the problem by skipping !uptodate buffers in
> > block_write_full_page().
> >
> > CC: Nick Piggin <npiggin@suse.de>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/buffer.c |    7 ++++++-
> >  1 files changed, 6 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index 9f69741..22c0144 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -1774,7 +1774,12 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
> >  	} while (bh != head);
> >
> >  	do {
> > -		if (!buffer_mapped(bh))
> > +		/*
> > +		 * Parallel write could have already mapped the buffers but
> > +		 * it then had to restart before copying in new data. We
> > +		 * must avoid writing garbage so just skip the buffer.
> > +		 */
> > +		if (!buffer_mapped(bh) || !buffer_uptodate(bh))
> >  			continue;
>
> I don't quite see how this can happen. Further down in this loop,
> we do a test_clear_buffer_dirty(), which should exclude this I
> think? And marking the buffer dirty if it is not uptodate should
> be a bug.

OK, I spoke too soon. Now I reproduced the corruption under UML even
with this patch. So it may be something different...

								Honza
diff --git a/fs/buffer.c b/fs/buffer.c
index 9f69741..22c0144 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1774,7 +1774,12 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
 	} while (bh != head);
 
 	do {
-		if (!buffer_mapped(bh))
+		/*
+		 * Parallel write could have already mapped the buffers but
+		 * it then had to restart before copying in new data. We
+		 * must avoid writing garbage so just skip the buffer.
+		 */
+		if (!buffer_mapped(bh) || !buffer_uptodate(bh))
 			continue;
 		/*
 		 * If it's a fully non-blocking write attempt and we cannot
Assume the following situation:
Filesystem with blocksize < pagesize - suppose blocksize = 1024,
pagesize = 4096. File 'f' has first four blocks already allocated.
(line with "state:" contains the state of buffers in the page - m = mapped,
u = uptodate, d = dirty)

process 1:                                process 2:

write to 'f' bytes 0 - 1024
  state: |mud,-,-,-|, page dirty
                                          write to 'f' bytes 1024 - 4096:
                                            __block_prepare_write() maps blocks
  state: |mud,m,m,m|, page dirty
                                            we fail to copy data -> copied = 0
                                            block_write_end() does nothing
                                            page gets unlocked
writepage() is called on the page
block_write_full_page() writes buffers with garbage

This patch fixes the problem by skipping !uptodate buffers in
block_write_full_page().

CC: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/buffer.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)