[-v2] ext4: use truncate_setsize() unconditionally

Message ID 1306178341-17632-1-git-send-email-tytso@mit.edu
State Accepted, archived

Commit Message

Theodore Ts'o May 23, 2011, 7:19 p.m. UTC
In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
before calling vmtruncate().  This caused any allocated but unwritten
blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
flag to be dropped.  This was done to make sure that
EOFBLOCKS_FL would not be cleared while still leaving blocks past
i_size allocated.  This was not necessary, since ext4_truncate()
guarantees that blocks past i_size will be dropped, even in the case
where truncate() has increased i_size before calling ext4_truncate().

So fix this by removing the EOFBLOCKS_FL special case treatment in
ext4_setattr().  In addition, use truncate_setsize() followed by a
call to ext4_truncate() instead of using vmtruncate().  This is more
efficient since it skips the call to inode_newsize_ok(), which has
been checked already by inode_change_ok().  This is also a win in
the case where EOFBLOCKS_FL is set since it avoids calling
ext4_truncate() twice.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 Jiayingz pointed out that in the case where we fallocate 12k, write 4k, and
 then truncate to 4k, we should discard the excess fallocate'd blocks.  So if
 attr->ia_size == inode.i_size, we can skip the truncate_setsize() call, but
 if the EOFBLOCKS_FL flag is set, we should still call ext4_truncate().

 fs/ext4/inode.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)
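
The fallocate 12k / write 4k / truncate to 4k case from the note above can be
tried from user space.  What follows is only a minimal sketch, not the xfstest
mentioned later in this thread: the helper name keep_size_truncate_demo() and
the idea of comparing st_blocks before and after the truncate are assumptions
made for illustration.

```c
#define _GNU_SOURCE             /* exposes fallocate() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_KEEP_SIZE */
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch or of xfstests.
 * Fallocates 12k with FALLOC_FL_KEEP_SIZE, writes 4k, truncates to 4k,
 * and reports st_blocks (512-byte units) before and after the truncate.
 * Returns 0 on success, -1 if any step fails (for example when the
 * filesystem does not support FALLOC_FL_KEEP_SIZE).
 */
int keep_size_truncate_demo(const char *path, long long *before, long long *after)
{
    char buf[4096];
    struct stat st;
    int rc = -1;

    assert(path && before && after);
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    memset(buf, 'a', sizeof(buf));
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 12288) != 0)
        goto out;
    if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto out;
    if (fsync(fd) != 0 || fstat(fd, &st) != 0)
        goto out;
    *before = (long long)st.st_blocks;

    /* Same-size truncate: i_size stays 4k, only blocks past EOF should go. */
    if (ftruncate(fd, 4096) != 0 || fsync(fd) != 0 || fstat(fd, &st) != 0)
        goto out;
    *after = (long long)st.st_blocks;
    rc = 0;
out:
    close(fd);
    unlink(path);
    return rc;
}
```

Calling the helper on a scratch file on the filesystem under test and comparing
the two counts should show whether the excess fallocate'd blocks were
discarded.  The exact numbers depend on the block size and on the filesystem.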

Comments

Jiaying Zhang May 23, 2011, 8:22 p.m. UTC | #1
On Mon, May 23, 2011 at 12:19 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
> past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
> before calling vmtruncate().  This caused any allocated but unwritten
> blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
> flag to be dropped.  This was done to make sure that
> EOFBLOCKS_FL would not be cleared while still leaving blocks past
> i_size allocated.  This was not necessary, since ext4_truncate()
> guarantees that blocks past i_size will be dropped, even in the case
> where truncate() has increased i_size before calling ext4_truncate().
>
> So fix this by removing the EOFBLOCKS_FL special case treatment in
> ext4_setattr().  In addition, use truncate_setsize() followed by a
> call to ext4_truncate() instead of using vmtruncate().  This is more
> efficient since it skips the call to inode_newsize_ok(), which has
> been checked already by inode_change_ok().  This is also a win in
> the case where EOFBLOCKS_FL is set since it avoids calling
> ext4_truncate() twice.
>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  Jiayingz pointed out that in the case where we fallocate 12k, write 4k, and
>  then truncate to 4k, we should discard the excess fallocate'd blocks.  So if
>  attr->ia_size == inode.i_size, we can skip the truncate_setsize() call, but
>  if the EOFBLOCKS_FL flag is set, we should still call ext4_truncate().
>
>  fs/ext4/inode.c |   16 ++++++++--------
>  1 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index df3fb20..2e95819 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5363,8 +5363,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>
>        if (S_ISREG(inode->i_mode) &&
>            attr->ia_valid & ATTR_SIZE &&
> -           (attr->ia_size < inode->i_size ||
> -            (ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS)))) {
> +           (attr->ia_size < inode->i_size)) {
>                handle_t *handle;
>
>                handle = ext4_journal_start(inode, 3);
> @@ -5398,14 +5397,15 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>                                goto err_out;
>                        }
>                }
> -               /* ext4_truncate will clear the flag */
> -               if ((ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS)))
> -                       ext4_truncate(inode);
>        }
>
> -       if ((attr->ia_valid & ATTR_SIZE) &&
> -           attr->ia_size != i_size_read(inode))
> -               rc = vmtruncate(inode, attr->ia_size);
> +       if (attr->ia_valid & ATTR_SIZE) {
> +               if (attr->ia_size != i_size_read(inode)) {
> +                       truncate_setsize(inode, attr->ia_size);
> +                       ext4_truncate(inode);
> +               } else if (ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS))
> +                       ext4_truncate(inode);
> +       }
>
>        if (!rc) {
>                setattr_copy(inode, attr);
> --
> 1.7.3.1
lgtm.

Jiaying

>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Sandeen May 24, 2011, 2:30 p.m. UTC | #2
On 5/23/11 2:19 PM, Theodore Ts'o wrote:
> In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
> past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
> before calling vmtruncate().  This caused any allocated but unwritten
> blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
> flag to be dropped.  This was done to make sure that
> EOFBLOCKS_FL would not be cleared while still leaving blocks past
> i_size allocated.  This was not necessary, since ext4_truncate()
> guarantees that blocks past i_size will be dropped, even in the case
> where truncate() has increased i_size before calling ext4_truncate().
> 
> So fix this by removing the EOFBLOCKS_FL special case treatment in
> ext4_setattr().  In addition, use truncate_setsize() followed by a
> call to ext4_truncate() instead of using vmtruncate().  This is more
> efficient since it skips the call to inode_newsize_ok(), which has
> been checked already by inode_change_ok().  This is also a win in
> the case where EOFBLOCKS_FL is set since it avoids calling
> ext4_truncate() twice.
> 
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  Jiayingz pointed out that in the case where we fallocate 12k, write 4k, and
>  then truncate to 4k, we should discard the excess fallocate'd blocks.  So if
>  attr->ia_size == inode.i_size, we can skip the truncate_setsize() call, but
>  if the EOFBLOCKS_FL flag is set, we should still call ext4_truncate().

are there xfstests which cover this explicitly?  It should be simple to write.

If filesystem behavior differs we can always make ext4-only tests.

-Eric
Jiaying Zhang May 24, 2011, 10:06 p.m. UTC | #3
On Tue, May 24, 2011 at 7:30 AM, Eric Sandeen <sandeen@redhat.com> wrote:
> On 5/23/11 2:19 PM, Theodore Ts'o wrote:
>> In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
>> past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
>> before calling vmtruncate().  This caused any allocated but unwritten
>> blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
>> flag to be dropped.  This was done to make sure that
>> EOFBLOCKS_FL would not be cleared while still leaving blocks past
>> i_size allocated.  This was not necessary, since ext4_truncate()
>> guarantees that blocks past i_size will be dropped, even in the case
>> where truncate() has increased i_size before calling ext4_truncate().
>>
>> So fix this by removing the EOFBLOCKS_FL special case treatment in
>> ext4_setattr().  In addition, use truncate_setsize() followed by a
>> call to ext4_truncate() instead of using vmtruncate().  This is more
>> efficient since it skips the call to inode_newsize_ok(), which has
>> been checked already by inode_change_ok().  This is also a win in
>> the case where EOFBLOCKS_FL is set since it avoids calling
>> ext4_truncate() twice.
>>
>> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
>> ---
>>  Jiayingz pointed out that in the case where we fallocate 12k, write 4k, and
>>  then truncate to 4k, we should discard the excess fallocate'd blocks.  So if
>>  attr->ia_size == inode.i_size, we can skip the truncate_setsize() call, but
>>  if the EOFBLOCKS_FL flag is set, we should still call ext4_truncate().
>
> are there xfstests which cover this explicitly?  It should be simple to write.
>
Vivek has written an xfstest to cover this and more fallocate/truncate cases:
http://old.nabble.com/-PATCH--xfstests%3A-test-fallocate%2C-write%2C-ftruncate-combinations.-to31666685.html#a31666685

Jiaying

> If filesystem behavior differs we can always make ext4-only tests.
>
> -Eric
>
Eric Sandeen May 24, 2011, 10:31 p.m. UTC | #4
On 5/24/11 5:06 PM, Jiaying Zhang wrote:
> On Tue, May 24, 2011 at 7:30 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>> On 5/23/11 2:19 PM, Theodore Ts'o wrote:
>>> In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
>>> past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
>>> before calling vmtruncate().  This caused any allocated but unwritten
>>> blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
>>> flag to be dropped.  This was done to make sure that
>>> EOFBLOCKS_FL would not be cleared while still leaving blocks past
>>> i_size allocated.  This was not necessary, since ext4_truncate()
>>> guarantees that blocks past i_size will be dropped, even in the case
>>> where truncate() has increased i_size before calling ext4_truncate().
>>>
>>> So fix this by removing the EOFBLOCKS_FL special case treatment in
>>> ext4_setattr().  In addition, use truncate_setsize() followed by a
>>> call to ext4_truncate() instead of using vmtruncate().  This is more
>>> efficient since it skips the call to inode_newsize_ok(), which has
>>> been checked already by inode_change_ok().  This is also a win in
>>> the case where EOFBLOCKS_FL is set since it avoids calling
>>> ext4_truncate() twice.
>>>
>>> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
>>> ---
>>>  Jiayingz pointed out that in the case where we fallocate 12k, write 4k, and
>>>  then truncate to 4k, we should discard the excess fallocate'd blocks.  So if
>>>  attr->ia_size == inode.i_size, we can skip the truncate_setsize() call, but
>>>  if the EOFBLOCKS_FL flag is set, we should still call ext4_truncate().
>>
>> are there xfstests which cover this explicitly?  It should be simple to write.
>>
> Vivek has written an xfstest to cover this and more fallocate/truncate cases:
> http://old.nabble.com/-PATCH--xfstests%3A-test-fallocate%2C-write%2C-ftruncate-combinations.-to31666685.html#a31666685

ah, right - ok, thanks!  Sorry for not keeping up.

-Eric

> Jiaying
> 
>> If filesystem behavior differs we can always make ext4-only tests.
>>
>> -Eric
>>

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index df3fb20..2e95819 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5363,8 +5363,7 @@  int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 
 	if (S_ISREG(inode->i_mode) &&
 	    attr->ia_valid & ATTR_SIZE &&
-	    (attr->ia_size < inode->i_size ||
-	     (ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS)))) {
+	    (attr->ia_size < inode->i_size)) {
 		handle_t *handle;
 
 		handle = ext4_journal_start(inode, 3);
@@ -5398,14 +5397,15 @@  int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 				goto err_out;
 			}
 		}
-		/* ext4_truncate will clear the flag */
-		if ((ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS)))
-			ext4_truncate(inode);
 	}
 
-	if ((attr->ia_valid & ATTR_SIZE) &&
-	    attr->ia_size != i_size_read(inode))
-		rc = vmtruncate(inode, attr->ia_size);
+	if (attr->ia_valid & ATTR_SIZE) {
+		if (attr->ia_size != i_size_read(inode)) {
+			truncate_setsize(inode, attr->ia_size);
+			ext4_truncate(inode);
+		} else if (ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS))
+			ext4_truncate(inode);
+	}
 
 	if (!rc) {
 		setattr_copy(inode, attr);
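
To make the control-flow change easier to follow, here is a small user-space
model of the size-change path before and after this patch.  It is only a
sketch: struct model_inode and every model_*() function are hypothetical
stand-ins, journal and orphan-list handling is left out, and model_vmtruncate()
assumes that vmtruncate() rechecks the size with inode_newsize_ok(), calls
truncate_setsize(), and then invokes the filesystem's ->truncate method.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode state that matters here. */
struct model_inode {
    long long i_size;
    bool eofblocks;             /* models EXT4_INODE_EOFBLOCKS */
    int ext4_truncate_calls;    /* how often the model called ext4_truncate() */
    int newsize_checks;         /* how often the model called inode_newsize_ok() */
};

static void model_ext4_truncate(struct model_inode *inode)
{
    inode->ext4_truncate_calls++;
    inode->eofblocks = false;   /* "ext4_truncate will clear the flag" */
}

static void model_truncate_setsize(struct model_inode *inode, long long newsize)
{
    inode->i_size = newsize;    /* the real helper also trims the page cache */
}

static int model_vmtruncate(struct model_inode *inode, long long newsize)
{
    inode->newsize_checks++;    /* redundant after inode_change_ok() */
    model_truncate_setsize(inode, newsize);
    model_ext4_truncate(inode);
    return 0;
}

/* Flow removed by the patch. */
void model_setattr_old(struct model_inode *inode, long long ia_size)
{
    if (ia_size < inode->i_size || inode->eofblocks) {
        if (inode->eofblocks)
            model_ext4_truncate(inode);     /* runs before i_size is updated */
    }
    if (ia_size != inode->i_size)
        model_vmtruncate(inode, ia_size);
}

/* Flow added by the patch. */
void model_setattr_new(struct model_inode *inode, long long ia_size)
{
    if (ia_size != inode->i_size) {
        model_truncate_setsize(inode, ia_size);
        model_ext4_truncate(inode);
    } else if (inode->eofblocks) {
        model_ext4_truncate(inode);
    }
}

/* Compare both flows on the two cases discussed in the commit message. */
void model_compare(void)
{
    /* Size change with the flag set: old flow truncates twice, new flow once. */
    struct model_inode a = { .i_size = 4096, .eofblocks = true };
    struct model_inode b = a;
    model_setattr_old(&a, 8192);
    model_setattr_new(&b, 8192);
    assert(a.ext4_truncate_calls == 2 && a.newsize_checks == 1);
    assert(b.ext4_truncate_calls == 1 && b.newsize_checks == 0);

    /* Same-size truncate with the flag set: both flows truncate once. */
    struct model_inode c = { .i_size = 4096, .eofblocks = true };
    struct model_inode d = c;
    model_setattr_old(&c, 4096);
    model_setattr_new(&d, 4096);
    assert(c.ext4_truncate_calls == 1 && d.ext4_truncate_calls == 1);
}
```

The model only counts calls.  It does not track which blocks are freed, so it
illustrates the "avoids calling ext4_truncate() twice" point rather than the
block-dropping behaviour itself.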