diff mbox

ext4: Adjust trim start with first_data_block.

Message ID 1295618707-3677-1-git-send-email-tm@tao.ma
State New, archived
Headers show

Commit Message

Tao Ma Jan. 21, 2011, 2:05 p.m. UTC
From: Tao Ma <boyu.mt@taobao.com>

As we have make the consense in the e-mail[1], the trim start should
be added with first_data_block. So this patch fulfill it and remove
the check for start < first_data_block.

[1] http://www.spinics.net/lists/linux-ext4/msg22737.html

Cc: Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 fs/ext4/mballoc.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

Comments

Theodore Ts'o Feb. 4, 2011, 5:40 a.m. UTC | #1
On Fri, Jan 21, 2011 at 10:05:07PM +0800, Tao Ma wrote:
> From: Tao Ma <boyu.mt@taobao.com>
> 
> As we have make the consense in the e-mail[1], the trim start should
> be added with first_data_block. So this patch fulfill it and remove
> the check for start < first_data_block.

> 
> [1] http://www.spinics.net/lists/linux-ext4/msg22737.html
> 

Sorry, I was away at Linux.conf.au and didn't have a chance to respond
this.

Fundamentally I think we need to understand what the arguments to the
trim ioctl is supposed to *mean*.  Are they supposed to be physical
block numbers, or some thing else?  If we just bump everything by the
first_data_block, (which is non-zero only for the 1k case), that will
screw up the length argument, because the userspace program isn't
going to know that (a) the file system is using 1k blocks, and (b) on
ext2/3/4 that means that first_data_block is 1.

So instead of saying that the arguments to trim mean "the data blocks"
--- which is a concept that doesn't really have any meaning to the
caller of the trim ioctl, without forcing it to know a lot more about
the physical layout of filesystems, I think the argument to trim
should be the physical block numbers.

And just as we don't trim blocks that contain data that should be
saved, we should just simply not trim block #0 (the boot block) and
and block #1 (the superblock) on 1k block filesystems, and not trim
block #0 (the boot block as well as the superblock) on 4k block
filesystems.  But we shouldn't be doing this by taking block number
passed to trim and just adding first_data_block to it.

I know a patch to do that has already merged into ext3, but with the
indulgence of the folks on linux-ext4, could we reopen this question?

Thanks,

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tao Ma Feb. 5, 2011, 3:16 p.m. UTC | #2
Hi Ted,
On 02/04/2011 01:40 PM, Ted Ts'o wrote:
> On Fri, Jan 21, 2011 at 10:05:07PM +0800, Tao Ma wrote:
>    
>> From: Tao Ma<boyu.mt@taobao.com>
>>
>> As we have make the consense in the e-mail[1], the trim start should
>> be added with first_data_block. So this patch fulfill it and remove
>> the check for start<  first_data_block.
>>      
>    
>> [1] http://www.spinics.net/lists/linux-ext4/msg22737.html
>>
>>      
> Sorry, I was away at Linux.conf.au and didn't have a chance to respond
> this.
>    
np.
> Fundamentally I think we need to understand what the arguments to the
> trim ioctl is supposed to *mean*.  Are they supposed to be physical
> block numbers, or some thing else?  If we just bump everything by the
> first_data_block, (which is non-zero only for the 1k case), that will
> screw up the length argument, because the userspace program isn't
> going to know that (a) the file system is using 1k blocks, and (b) on
> ext2/3/4 that means that first_data_block is 1.
>    
fair enough.
> So instead of saying that the arguments to trim mean "the data blocks"
> --- which is a concept that doesn't really have any meaning to the
> caller of the trim ioctl, without forcing it to know a lot more about
> the physical layout of filesystems, I think the argument to trim
> should be the physical block numbers.
>
> And just as we don't trim blocks that contain data that should be
> saved, we should just simply not trim block #0 (the boot block) and
> and block #1 (the superblock) on 1k block filesystems, and not trim
> block #0 (the boot block as well as the superblock) on 4k block
> filesystems.  But we shouldn't be doing this by taking block number
> passed to trim and just adding first_data_block to it.
>    
I am open to any suggestion as:
1) you are the original author of ext2/3/4. ;)
2) I am not the original author of trim in ext3/4 and this patch is 
suggested by Lucas. So Lucas, do you agree with it?
> I know a patch to do that has already merged into ext3, but with the
> indulgence of the folks on linux-ext4, could we reopen this question?
>    
Never mind.
So if you has the objection to this patch, please consider merging 
another patch.
It fix an underflow problem and treat 'start' as the first physical 
block number.
http://marc.info/?l=linux-ext4&m=129543034911496&w=2

As for ext3, I will contact Jan for the update if you decide to abandon 
this patch and accept the patch for 'underflow'.

Regards,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Feb. 7, 2011, 1:03 p.m. UTC | #3
On Sat, 5 Feb 2011, Tao Ma wrote:

> Hi Ted,
> On 02/04/2011 01:40 PM, Ted Ts'o wrote:
> > On Fri, Jan 21, 2011 at 10:05:07PM +0800, Tao Ma wrote:
> >    
> > > From: Tao Ma<boyu.mt@taobao.com>
> > > 
> > > As we have make the consense in the e-mail[1], the trim start should
> > > be added with first_data_block. So this patch fulfill it and remove
> > > the check for start<  first_data_block.
> > >      
> >    
> > > [1] http://www.spinics.net/lists/linux-ext4/msg22737.html
> > > 
> > >      
> > Sorry, I was away at Linux.conf.au and didn't have a chance to respond
> > this.
> >    
> np.
> > Fundamentally I think we need to understand what the arguments to the
> > trim ioctl is supposed to *mean*.  Are they supposed to be physical
> > block numbers, or some thing else?  If we just bump everything by the
> > first_data_block, (which is non-zero only for the 1k case), that will
> > screw up the length argument, because the userspace program isn't
> > going to know that (a) the file system is using 1k blocks, and (b) on
> > ext2/3/4 that means that first_data_block is 1.
> >    
> fair enough.
> > So instead of saying that the arguments to trim mean "the data blocks"
> > --- which is a concept that doesn't really have any meaning to the
> > caller of the trim ioctl, without forcing it to know a lot more about
> > the physical layout of filesystems, I think the argument to trim
> > should be the physical block numbers.
> > 
> > And just as we don't trim blocks that contain data that should be
> > saved, we should just simply not trim block #0 (the boot block) and
> > and block #1 (the superblock) on 1k block filesystems, and not trim
> > block #0 (the boot block as well as the superblock) on 4k block
> > filesystems.  But we shouldn't be doing this by taking block number
> > passed to trim and just adding first_data_block to it.
> >    
> I am open to any suggestion as:
> 1) you are the original author of ext2/3/4. ;)
> 2) I am not the original author of trim in ext3/4 and this patch is suggested
> by Lucas. So Lucas, do you agree with it?

Well, originally (when I suggested that we should just add
first_data_block) the intention was for the arguments to be "bytes of
data". I am not sure what exactly is counted into filesystem size
(device size - metadata blocks - reserved bocks - first data block etc.. ??),
but the idea was (since FITRIM is called on mounted filesystem), that
start=0 means first data block for the user, but if it is counted into
filesystem size (marked as used space) it does not really make sense to
add fist_data_block to the start argument.

Does it make sense ?

Thanks!
-Lukas

(BTW, my name is spelled with 'k' in it, so it is "Lukas", or "Lukáš" to be
exact, but the first one is just fine :) )

> > I know a patch to do that has already merged into ext3, but with the
> > indulgence of the folks on linux-ext4, could we reopen this question?
> >    
> Never mind.
> So if you has the objection to this patch, please consider merging another
> patch.
> It fix an underflow problem and treat 'start' as the first physical block
> number.
> http://marc.info/?l=linux-ext4&m=129543034911496&w=2
> 
> As for ext3, I will contact Jan for the update if you decide to abandon this
> patch and accept the patch for 'underflow'.
> 
> Regards,
> Tao
>
diff mbox

Patch

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 851f49b..71dd7bd 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4808,21 +4808,16 @@  int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
 	ext4_group_t group, ngroups = ext4_get_groups_count(sb);
 	ext4_grpblk_t cnt = 0, first_block, last_block;
 	uint64_t start, len, minlen, trimmed;
-	ext4_fsblk_t first_data_blk =
-			le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block);
 	int ret = 0;
 
-	start = range->start >> sb->s_blocksize_bits;
+	start = (range->start >> sb->s_blocksize_bits) +
+		le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block);
 	len = range->len >> sb->s_blocksize_bits;
 	minlen = range->minlen >> sb->s_blocksize_bits;
 	trimmed = 0;
 
 	if (unlikely(minlen > EXT4_BLOCKS_PER_GROUP(sb)))
 		return -EINVAL;
-	if (start < first_data_blk) {
-		len -= first_data_blk - start;
-		start = first_data_blk;
-	}
 
 	/* Determine first and last group to examine based on start and len */
 	ext4_get_group_no_and_offset(sb, (ext4_fsblk_t) start,