Message ID | 1241692770-22547-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com |
---|---|
State | Superseded, archived |
Headers | show |
Aneesh Kumar K.V wrote: > ext4_get_blocks_wrap does a block lookup requesting to > allocate new blocks. A lookup of blocks in prealloc area > result in setting the unwritten flag in buffer_head. So > a write to an unwritten extent will cause the buffer_head > to have unwritten and mapped flag set. Clear hte unwritten > buffer_head flag before requesting to allocate blocks. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > --- > fs/ext4/inode.c | 7 +++++++ > 1 files changed, 7 insertions(+), 0 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index c3cd00f..f6d7e9b 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -1149,6 +1149,7 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, > int retval; > > clear_buffer_mapped(bh); > + clear_buffer_unwritten(bh); > > /* > * Try to see if we can get the block without requesting > @@ -1179,6 +1180,12 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, > return retval; > > /* > + * The above get_blocks can cause the buffer to be > + * marked unwritten. So clear the same. > + */ > + clear_buffer_unwritten(bh); hm, thinking out loud here. ext4_ext_get_blocks() will only set unwritten if (!create) ... but then ext4_get_blocks_wrap() calls ext4_ext_get_blocks() !create as an argument no matter what, the first time, for an initial lookup. But if ext4_get_blocks_wrap() was called with !create, then we return regardless, so ok - by the time you get to the above hunk, we -are- in create mode, we're planning to write it ... so I guess clearing the unwritten state makes sense here. But is this too late, because it's after this? /* * Returns if the blocks have already allocated * * Note that if blocks have been preallocated * ext4_ext_get_block() returns th create = 0 * with buffer head unmapped. */ if (retval > 0 && buffer_mapped(bh)) return retval; I guess not; ext4_ext_get_blocks() won't map the buffer if it's found to be preallocated/unwritten because it was called with !create. If we're going on to write it, we want to clear unwritten. So I guess this looks right, although I can't help but think that in general, the buffer_head state management is really getting to be a hard-to-follow mess... -Eric -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 07, 2009 at 10:36:49AM -0500, Eric Sandeen wrote: > Aneesh Kumar K.V wrote: > > ext4_get_blocks_wrap does a block lookup requesting to > > allocate new blocks. A lookup of blocks in prealloc area > > result in setting the unwritten flag in buffer_head. So > > a write to an unwritten extent will cause the buffer_head > > to have unwritten and mapped flag set. Clear hte unwritten > > buffer_head flag before requesting to allocate blocks. > > > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > > --- > > fs/ext4/inode.c | 7 +++++++ > > 1 files changed, 7 insertions(+), 0 deletions(-) > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index c3cd00f..f6d7e9b 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -1149,6 +1149,7 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, > > int retval; > > > > clear_buffer_mapped(bh); > > + clear_buffer_unwritten(bh); > > > > /* > > * Try to see if we can get the block without requesting > > @@ -1179,6 +1180,12 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, > > return retval; > > > > /* > > + * The above get_blocks can cause the buffer to be > > + * marked unwritten. So clear the same. > > + */ > > + clear_buffer_unwritten(bh); > > hm, thinking out loud here. > > ext4_ext_get_blocks() will only set unwritten if (!create) ... but then > ext4_get_blocks_wrap() calls ext4_ext_get_blocks() !create as an > argument no matter what, the first time, for an initial lookup. > > But if ext4_get_blocks_wrap() was called with !create, then we return > regardless, so ok - by the time you get to the above hunk, we -are- in > create mode, we're planning to write it ... so I guess clearing the > unwritten state makes sense here. > > But is this too late, because it's after this? > > /* > * Returns if the blocks have already allocated > * > * Note that if blocks have been preallocated > * ext4_ext_get_block() returns th create = 0 > * with buffer head unmapped. > */ > if (retval > 0 && buffer_mapped(bh)) > return retval; > > I guess not; ext4_ext_get_blocks() won't map the buffer if it's found to > be preallocated/unwritten because it was called with !create. If we're > going on to write it, we want to clear unwritten. > > So I guess this looks right, although I can't help but think that in > general, the buffer_head state management is really getting to be a > hard-to-follow mess... To further clarify what i think was causing the I/O error. 1) We do a multi block delayed alloc to prealloc space. That would get us multiple buffer_heads marked with BH_Unwritten. (say 10, 11, 12) 2) pdflush attempt to write some pages (say mapping block 10) which cause a get_block call with create = 1. That would attempt to convert uninitialized extent to initialized one. This can cause multiple blocks to be marked initialized. ( say 10, 11 , 12) 3) We do an overwrite of block 11. That would mean we call ext4_da_get_block_prep, which would again do a get_block for block 11 with create = 0. But remember we already have buffer_head marked with BH_Unwritten flag. But the buffer was unmapped because it is unwritten ( We are fixing this mess in the patch for 2.6.31). 4) The get_block call will find the buffer mapped due to step b. And mark the buffer_head mapped. There we go . We end up with buffer_head mapped and unwritten 5) later in ext4_da_get_block_prep we check whether the buffer_head in marked BH_Unwritten if so we set the block number to ~0. This is introduced by [PATCH -V4 1/2] Fix sub-block zeroing for buffered writes into unwritten extents 6) So now we have a buffer_head that is mapped, unwritten, with b_blocknr = ~0. That would result in the I/O error. -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 07, 2009 at 04:09:29PM +0530, Aneesh Kumar K.V wrote: > ext4_get_blocks_wrap does a block lookup requesting to > allocate new blocks. A lookup of blocks in prealloc area > result in setting the unwritten flag in buffer_head. So > a write to an unwritten extent will cause the buffer_head > to have unwritten and mapped flag set. Clear hte unwritten > buffer_head flag before requesting to allocate blocks. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> I've rewritten the commit changelog to this, which I believe more accurately describes the patch. Comments, please? ext4: Clear the unwritten buffer_head flag after the extent is initialized From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> The BH_Unwritten flag indicates that the buffer is allocated on disk but has not been written; that is, the disk was part of a persistent preallocation area. That flag should only be set when a get_blocks() function is looking up a inode's logical to physical block mapping. When ext4_get_blocks_wrap() is called with create=1, the uninitialized extent is converted into an initialized one, so the BH_Unwritten flag is no longer appropriate. Hence, we need to make sure the BH_Unwritten is not left set, to avoid the ensuing confusion and hilarty. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 11, 2009 at 11:08:56PM -0400, Theodore Tso wrote: > On Thu, May 07, 2009 at 04:09:29PM +0530, Aneesh Kumar K.V wrote: > > ext4_get_blocks_wrap does a block lookup requesting to > > allocate new blocks. A lookup of blocks in prealloc area > > result in setting the unwritten flag in buffer_head. So > > a write to an unwritten extent will cause the buffer_head > > to have unwritten and mapped flag set. Clear hte unwritten > > buffer_head flag before requesting to allocate blocks. > > > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > > I've rewritten the commit changelog to this, which I believe more > accurately describes the patch. Comments, please? > > ext4: Clear the unwritten buffer_head flag after the extent is initialized > > From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> > > The BH_Unwritten flag indicates that the buffer is allocated on disk > but has not been written; that is, the disk was part of a persistent > preallocation area. That flag should only be set when a get_blocks() > function is looking up a inode's logical to physical block mapping. > > When ext4_get_blocks_wrap() is called with create=1, the uninitialized > extent is converted into an initialized one, so the BH_Unwritten flag > is no longer appropriate. Hence, we need to make sure the > BH_Unwritten is not left set, to avoid the ensuing confusion and > hilarty. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> > I think it is good. But one thing missing in the commit message is, what happens if we do a write to prealloc space. Since a get_block(create = 1) is now split into __get_block(create = 0 ) and __get_block(create = 1). That would mean if we pass a buffer head with BH_Unwritten cleared we will have 1) buffer_head as BH_Unwritten cleared. 2) __get_block(create = 0 ) -> Since it is prealloc space we will have BH_Unwritten set . 3) __get_block(create = 1) -> get the blocks out of prealloc space. and retun with BH_Mapped set. That would imply we have BH_Unwritten and BH_Mapped set in the above case which is wrong. So we need a BH_Unwritten clear between (2) and (3). The patch does the same. May be we need to capture it in commit message. -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Aneesh Kumar K.V wrote: > On Mon, May 11, 2009 at 11:08:56PM -0400, Theodore Tso wrote: >> On Thu, May 07, 2009 at 04:09:29PM +0530, Aneesh Kumar K.V wrote: >>> ext4_get_blocks_wrap does a block lookup requesting to >>> allocate new blocks. A lookup of blocks in prealloc area >>> result in setting the unwritten flag in buffer_head. So >>> a write to an unwritten extent will cause the buffer_head >>> to have unwritten and mapped flag set. Clear hte unwritten >>> buffer_head flag before requesting to allocate blocks. >>> >>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> >> I've rewritten the commit changelog to this, which I believe more >> accurately describes the patch. Comments, please? >> >> ext4: Clear the unwritten buffer_head flag after the extent is initialized >> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> >> >> The BH_Unwritten flag indicates that the buffer is allocated on disk >> but has not been written; that is, the disk was part of a persistent >> preallocation area. That flag should only be set when a get_blocks() >> function is looking up a inode's logical to physical block mapping. >> >> When ext4_get_blocks_wrap() is called with create=1, the uninitialized >> extent is converted into an initialized one, so the BH_Unwritten flag >> is no longer appropriate. Hence, we need to make sure the >> BH_Unwritten is not left set, to avoid the ensuing confusion and >> hilarty. >> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> >> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> >> > > I think it is good. But one thing missing in the commit message is, > what happens if we do a write to prealloc space. Since a > get_block(create = 1) is now split into __get_block(create = 0 ) and > __get_block(create = 1). That would mean if we pass a buffer head with > BH_Unwritten cleared we will have > > > 1) buffer_head as BH_Unwritten cleared. > > 2) __get_block(create = 0 ) -> Since it is prealloc space we will have > BH_Unwritten set . Why do we need to set BH_Unwritten on a !create call at all? Or maybe another way of asking is, are there any !create callers of get_block who -want- BH_Unwritten set? Which is to say, should we just not be setting BH_Unwritten in get_block in the !create case, ever? The comment: /* + * The above get_blocks can cause the buffer to be + * marked unwritten. So clear the same. + */ + clear_buffer_unwritten(bh); is imho not helpful; to me it says "we -just- set this, so clear it!" It leaves me scratching my head. > 3) __get_block(create = 1) -> get the blocks out of prealloc space. > and retun with BH_Mapped set. > > That would imply we have BH_Unwritten and BH_Mapped set in the above > case which is wrong. So we need a BH_Unwritten clear between (2) and > (3). The patch does the same. May be we need to capture it in commit > message. Better in comments, I think. :) -Eric > -aneesh > > > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 13, 2009 at 01:56:56PM -0500, Eric Sandeen wrote: > The comment: > > /* > + * The above get_blocks can cause the buffer to be > + * marked unwritten. So clear the same. > + */ > + clear_buffer_unwritten(bh); > > is imho not helpful; to me it says "we -just- set this, so clear it!" > It leaves me scratching my head. I've updated it the comment to say this instead. /* * When we call get_blocks without the create flag, the * BH_Unwritten flag could have gotten set if the blocks * requested were part of a uninitialized extent. We need to * clear this flag now that we are committed to convert all or * part of the uninitialized extent to be an initialized * extent. This is because we need to avoid the combination * of BH_Unwritten and BH_Mapped flags being simultaneously * set on the buffer_head. */ I'm still not entirely clear what is the downside of having BH_Unwritten and BH_Mapped at the same time; whether it is just a CPU time-waster that will cause some unneeded calls to get_blocks() in the write_begin path which "just" wastes a little CPU, or whether there are other massive disasters that take place on the combination of BH_Unwritten and BH_Mapped. What we *desperately* need is documentation that describes which functions sets these flags, and whether they are intended for long-term use (either stored in buffer heads attached to a page, or in the buffer cache, and if so, which), or just as short-term hacks to pass information between two functions, or multiple functions deep in a call stack, and if so, what the implications are of the bits, and when they should be cleared --- and then we should add assert's or BUG_ON's to enforce what we think the abstraction invariant should be. Magic flags attached to objects that are really there because we don't want to change function signatures are very scary; for example, i_delalloc_reserved_flag in ext4_inode_info --- it's not clear to me exactly what this is supposed to do, but I note that mballoc.c does a lot more with the flag than balloc.c. So it's not clear to me if there is magic semantics associated with that flag which are being done in mballoc.c, but we never got around to implementing them in balloc.c. This leads to the kind of code fragility that I've been complaining about for some time. The fact that Aneesh missed one of these magic code paths in his patch attests to why this style of programming is really bad and not long-term maintainable. So we need to document all of this mess, and then gradually clean it up, one tiny patch at a time, each one describing what we were doing before, and what the new way of doing this should be, and why it's safe to make that change. > > That would imply we have BH_Unwritten and BH_Mapped set in the above > > case which is wrong. So we need a BH_Unwritten clear between (2) and > > (3). The patch does the same. May be we need to capture it in commit > > message. > > Better in comments, I think. :) The comprehensive description of all of this should be in comments, yes. See the example of the sort of comments that I've added in the clear-unwritten-bh-flag and initialize-map_bh-state. If we don't like such verbose comments, then we should use simpler programming semantics when passing information between the various abstracton layers. :-) - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 13, 2009 at 01:56:56PM -0500, Eric Sandeen wrote: > > > > I think it is good. But one thing missing in the commit message is, > > what happens if we do a write to prealloc space. Since a > > get_block(create = 1) is now split into __get_block(create = 0 ) and > > __get_block(create = 1). That would mean if we pass a buffer head with > > BH_Unwritten cleared we will have > > > > > > 1) buffer_head as BH_Unwritten cleared. > > > > 2) __get_block(create = 0 ) -> Since it is prealloc space we will have > > BH_Unwritten set . > > Why do we need to set BH_Unwritten on a !create call at all? > > Or maybe another way of asking is, are there any !create callers of > get_block who -want- BH_Unwritten set? > > Which is to say, should we just not be setting BH_Unwritten in get_block > in the !create case, ever? It should only be set in the !create case. With create == 1, we would have already converted the uninitialized extent to initialized one and the buffer_head should not be unwritten at all. My understanding is unwritten flag is used to indicate the buffer_head state between a write_begin and write_page phase with delayed allocation. ie, when we write to fallocate space, since we have delalloc enabled, we just do a block lookup (get_block with create = 0). The buffer_head returned in the above case should have unwritten set so that during writepage we do the actual block allocation (get_block writh create = 1) looking at the flag. > > The comment: > > /* > + * The above get_blocks can cause the buffer to be > + * marked unwritten. So clear the same. > + */ > + clear_buffer_unwritten(bh); > > is imho not helpful; to me it says "we -just- set this, so clear it!" > It leaves me scratching my head. > > > 3) __get_block(create = 1) -> get the blocks out of prealloc space. > > and retun with BH_Mapped set. > > > > That would imply we have BH_Unwritten and BH_Mapped set in the above > > case which is wrong. So we need a BH_Unwritten clear between (2) and > > (3). The patch does the same. May be we need to capture it in commit > > message. > > Better in comments, I think. :) > -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 13, 2009 at 06:28:47PM -0400, Theodore Tso wrote: > On Wed, May 13, 2009 at 01:56:56PM -0500, Eric Sandeen wrote: > > The comment: > > > > /* > > + * The above get_blocks can cause the buffer to be > > + * marked unwritten. So clear the same. > > + */ > > + clear_buffer_unwritten(bh); > > > > is imho not helpful; to me it says "we -just- set this, so clear it!" > > It leaves me scratching my head. > > I've updated it the comment to say this instead. > > /* > * When we call get_blocks without the create flag, the > * BH_Unwritten flag could have gotten set if the blocks > * requested were part of a uninitialized extent. We need to > * clear this flag now that we are committed to convert all or > * part of the uninitialized extent to be an initialized > * extent. This is because we need to avoid the combination > * of BH_Unwritten and BH_Mapped flags being simultaneously > * set on the buffer_head. > */ The last line in the above comment is not a problem with the latest kernel with all the patches in the patch queue. The patch that does that is "ext4: Mark the unwritten buffer_head as mapped during write_begin" The unwritten and mapped state together was a problem with the code path we had before in ext4_da_get_block_prep we had: ret = ext4_get_blocks(NULL, inode, iblock, 1, bh_result, 0); ..... .. } else if (ret > 0) { bh_result->b_size = (ret << inode->i_blkbits); if (buffer_unwritten(bh_result)) { bh_result.b_blocknr = ~0; .... } } The above can result in 1) We do a multi block delayed alloc to prealloc space. That would get us multiple buffer_heads marked with BH_Unwritten. (say 10, 11, 12) 2) pdflush attempt to write some pages (say mapping block 10) which cause a get_block call with create = 1. That would attempt to convert uninitialized extent to initialized one. This can cause multiple blocks to be marked initialized. ( say 10, 11 , 12) 3) We do an overwrite of block 11. That would mean we call ext4_da_get_block_prep, which would again do a get_block for block 11 with create = 0. But remember we already have buffer_head marked with BH_Unwritten flag. But the buffer was unmapped because it is unwritten ( We are fixing this mess in the patch for 2.6.31). 4) The get_block call will find the buffer mapped due to step b. And mark the buffer_head mapped. There we go . We end up with buffer_head mapped and unwritten 5) later in ext4_da_get_block_prep we check whether the buffer_head in marked BH_Unwritten if so we set the block number to ~0. This is introduced by [PATCH -V4 1/2] Fix sub-block zeroing for buffered writes into unwritten extents 6) So now we have a buffer_head that is mapped, unwritten, with b_blocknr = ~0. That would result in the I/O error. Now that we have the actual block number in bh_result.b_blocknr I guess we should be ok to have the unwritten flag set. But then i guess it is against the VFS assumption of buffer_head state. The state unwritten indicate the blocks are not yet written. So if we don't do a clear_buffer_unwritten as in the patch we end up with a block that is written ( done via create = 1 get_block call) and still marked unwritten. (see the 6 step example above) Now to explain what "ext4: Mark the unwritten buffer_head as mapped during write_begin" does. It marks the buffer_head as mapped in the write_begin phase. And that helps in making sure we don't end up calling get_block multiple times. So with delayed allocation we now have between the write_begin and writepage phase a buffer_head which is mapped and unwritten for blocks in the fallocate space. Once we do writepage for the block we will have buffer_head which is mapped and unwritten flag cleared. The unwritten get cleared when we do a get_block call with create = 1. To achieve the above we need to make sure writepage code path looks at the unwritten flag and does a get_blocks call with create = 1. With mainline we have in writepage code path mpage_da_map_blocks(...) if ((mpd->b_state & (1 << BH_Mapped)) && !(mpd->b_state & (1 << BH_Delay))) return 0; ... .. ext4_da_get_block_write() If we start marking buffer_head mapped for fallocate blocks in the write_begin phase, then the above code will return without doing any get_block(create = 1) call. That means we don't convert the uninitialized extent to initialized one. So along with marking buffer_head mapped and unwritten in the write_begin phase we also need changes to writepage code path which does something like below mpage_da_map_blocks(...) if ((mpd->b_state & (1 << BH_Mapped)) && !(mpd->b_state & (1 << BH_Delay)) && !(mpd->b_state & (1 << BH_Unwritten))) return 0; ... .. ext4_da_get_block_write() this is done in patch "ext4: Mark the unwritten buffer_head as mapped during write_begin". -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 14, 2009 at 11:10:02AM +0530, Aneesh Kumar K.V wrote: > > It should only be set in the !create case. With create == 1, we would > have already converted the uninitialized extent to initialized one and > the buffer_head should not be unwritten at all. My understanding is > unwritten flag is used to indicate the buffer_head state between a > write_begin and write_page phase with delayed allocation. ie, when we > write to fallocate space, since we have delalloc enabled, we just > do a block lookup (get_block with create = 0). The buffer_head returned > in the above case should have unwritten set so that during writepage > we do the actual block allocation (get_block writh create = 1) > looking at the flag. At the moment, ext4_da_get_block_prep(), which is used as a callback function by ext4_da_write_begin(), checks for buffer_unwritten(), and if true, set BH_New and BH_Mapped. So between the time that that write_begin() and the time when the page is actually written out, BH_Unwritten and BH_Mapped will both be set. If we end up bailing due to some error of some kind, such that we don't complete the write(2) operation we *can* have some pages that are simultaneously have BH_Unwritten and BH_Mapped flags set. So this had better be a harmless case, since I think it can happen. What's confusing then is some of the comments which have been made about why BH_Unwritten and BH_Mapped simultaneously are a bad. It may be bad at some points in time, but at other points in time it's completely normal operations. Or am I missing something? - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index c3cd00f..f6d7e9b 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1149,6 +1149,7 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, int retval; clear_buffer_mapped(bh); + clear_buffer_unwritten(bh); /* * Try to see if we can get the block without requesting @@ -1179,6 +1180,12 @@ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, return retval; /* + * The above get_blocks can cause the buffer to be + * marked unwritten. So clear the same. + */ + clear_buffer_unwritten(bh); + + /* * New blocks allocate and/or writing to uninitialized extent * will possibly result in updating i_data, so we take * the write lock of i_data_sem, and call get_blocks()
ext4_get_blocks_wrap does a block lookup requesting to allocate new blocks. A lookup of blocks in prealloc area result in setting the unwritten flag in buffer_head. So a write to an unwritten extent will cause the buffer_head to have unwritten and mapped flag set. Clear hte unwritten buffer_head flag before requesting to allocate blocks. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- fs/ext4/inode.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-)