[06/15] ext2fs: add new APIs needed for fast commits

Message ID 20201120191606.2224881-7-harshadshirwadkar@gmail.com
State Superseded
Series Fast commits support for e2fsprogs

Commit Message

harshad shirwadkar Nov. 20, 2020, 7:15 p.m. UTC
This patch adds the following new APIs:

Count the total number of blocks occupied by inode including
intermediate extent tree nodes.
extern blk64_t ext2fs_count_blocks(ext2_filsys fs, ext2_ino_t ino,
                                       struct ext2_inode *inode);

Convert ext3_extent to ext2fs_extent.
extern void ext2fs_convert_extent(struct ext2fs_extent *to,
                                       struct ext3_extent *from);

Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
---
 lib/ext2fs/ext2fs.h |  4 ++++
 lib/ext2fs/extent.c | 56 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)
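
The block-counting idea described in the commit message (data blocks
plus the tree nodes that index them) can be modelled on a toy
in-memory tree.  This is only a sketch under stated assumptions:
"struct toy_node" and "toy_count_blocks" are made-up names, and the
recursive walk below is a simplification, not the handle-based
iteration that the patch itself uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of an extent tree node (not a libext2fs type). */
struct toy_node {
	int is_leaf;                          /* 1: entries are extents, 0: children */
	size_t count;                         /* number of valid entries */
	const uint32_t *extent_len;           /* leaf only: length of each extent */
	const struct toy_node *const *child;  /* interior only: child nodes */
};

/*
 * Count the blocks used by the subtree rooted at `node`.
 * `on_disk` is 0 for the root, which lives inside the inode, and 1 for
 * every other node, each of which occupies one block of its own.
 */
static uint64_t toy_count_blocks(const struct toy_node *node, int on_disk)
{
	uint64_t total = on_disk ? 1 : 0;
	size_t i;

	assert(node != NULL);
	for (i = 0; i < node->count; i++) {
		if (node->is_leaf)
			total += node->extent_len[i];   /* data blocks */
		else
			total += toy_count_blocks(node->child[i], 1);
	}
	return total;
}
```

Calling toy_count_blocks(&root, 0) on a root with two leaf children
therefore counts both leaf blocks in addition to the data blocks they
map.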

Comments

Theodore Ts'o Dec. 2, 2020, 6:44 p.m. UTC | #1
On Fri, Nov 20, 2020 at 11:15:57AM -0800, Harshad Shirwadkar wrote:
> This patch adds the following new APIs:
> 
> Count the total number of blocks occupied by inode including
> intermediate extent tree nodes.
> extern blk64_t ext2fs_count_blocks(ext2_filsys fs, ext2_ino_t ino,
>                                        struct ext2_inode *inode);
> 
> Convert ext3_extent to ext2fs_extent.
> extern void ext2fs_convert_extent(struct ext2fs_extent *to,
>                                        struct ext3_extent *from);

So one of the reasons why I've intentionally never exposed "struct
ext3_extent" in the libext2fs interface is because that's an on-disk
structure which I keep hoping we might change someday --- for example,
to allow for 64-bit logical block numbers so we can create ext4 files
greater than 2**32 blocks.  It might be that some other future
enhancement, such as say, reflinks (depending on how we implement
them), or reverse pointers, might also require making changes to the
on-disk format.
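
To put a number on the 2**32 limit mentioned above: the starting
logical block of a leaf extent is stored in a 32-bit field, so a file
can address at most 2**32 logical blocks.  A minimal sketch of the
arithmetic follows; the helper names are made up for illustration and
are not part of libext2fs.

```c
#include <assert.h>
#include <stdint.h>

/* Width of the on-disk field holding an extent's first logical block. */
#define TOY_LBLK_BITS 32

/* Number of distinct logical block numbers that field can express. */
static uint64_t toy_max_logical_blocks(void)
{
	return (uint64_t)1 << TOY_LBLK_BITS;
}

/* Upper bound on file size, in bytes, for a given block size. */
static uint64_t toy_max_file_bytes(uint32_t block_size)
{
	assert(block_size >= 1024);	/* smallest ext4 block size */
	return toy_max_logical_blocks() * block_size;
}
```

With 4 KiB blocks this works out to 2**44 bytes, that is 16 TiB.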

The kernel code has the on-disk format and the various logical
manipulations of the extent tree hopelessly entangled with each other,
which means changing the kernel code to support more than one on-disk
extent structure is going to be **hard**.  But in the userspace code,
all of the knowledge about the on-disk structure is abstracted away
inside lib/ext2fs/extent.c.

It may very well be that for fast commit, we're going to need to crack
open that abstraction barrier a bit.  But let's make sure the function
name makes it clear that what we are doing here is converting between
a particular on-disk encoding and the ext2fs abstract extent type.
"ext2fs_convert_extent" doesn't exactly make this clear.

It might also be that what we should do is include a pointer to the fs
and inode structures, and call this something like
"ext2fs_{decode,encode}_extent()", and pass in the on-disk format via
a void *.  We might also want to have some kind of
ext2fs_validate_extent() function which takes a void * and validates
the on-disk structure to make sure it's sane.
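
As a rough illustration of what a decode/encode pair over an opaque
buffer could look like, here is a self-contained sketch.  Everything
prefixed "toy_" or "TOY_" is a stand-in invented for this example: the
structures only mirror the field layout of the current leaf extent,
and the fs and inode pointers suggested above are left out to keep it
short.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 12-byte on-disk leaf extent; every field is little-endian. */
struct toy_disk_extent {
	uint8_t ee_block[4];    /* first logical block covered */
	uint8_t ee_len[2];      /* block count, plus 32768 if uninitialized */
	uint8_t ee_start_hi[2]; /* high 16 bits of the physical block */
	uint8_t ee_start[4];    /* low 32 bits of the physical block */
};

/* Stand-in for the abstract, host-endian extent type. */
struct toy_extent {
	uint64_t e_pblk;
	uint64_t e_lblk;
	uint32_t e_len;
	uint32_t e_flags;
};

#define TOY_FLAG_LEAF    0x1u
#define TOY_FLAG_UNINIT  0x2u
#define TOY_INIT_MAX_LEN 32768u

/* Read an nbytes-wide little-endian integer (nbytes <= 4). */
static uint32_t get_le(const uint8_t *p, int nbytes)
{
	uint32_t v = 0;

	while (nbytes-- > 0)
		v = (v << 8) | p[nbytes];
	return v;
}

/* Write the low nbytes of v in little-endian order. */
static void put_le(uint8_t *p, int nbytes, uint32_t v)
{
	int i;

	for (i = 0; i < nbytes; i++) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Decode one on-disk leaf extent from an opaque buffer. */
static void toy_decode_extent(struct toy_extent *to, const void *from)
{
	const struct toy_disk_extent *d = from;

	to->e_pblk = ((uint64_t)get_le(d->ee_start_hi, 2) << 32) |
		     get_le(d->ee_start, 4);
	to->e_lblk = get_le(d->ee_block, 4);
	to->e_len = get_le(d->ee_len, 2);
	to->e_flags = TOY_FLAG_LEAF;
	if (to->e_len > TOY_INIT_MAX_LEN) {
		to->e_len -= TOY_INIT_MAX_LEN;
		to->e_flags |= TOY_FLAG_UNINIT;
	}
}

/* Encode the abstract extent back into the on-disk layout. */
static void toy_encode_extent(void *to, const struct toy_extent *from)
{
	struct toy_disk_extent *d = to;
	uint32_t len = from->e_len;

	if (from->e_flags & TOY_FLAG_UNINIT) {
		assert(len < TOY_INIT_MAX_LEN);	/* biased length must fit in 16 bits */
		len += TOY_INIT_MAX_LEN;
	} else {
		assert(len <= TOY_INIT_MAX_LEN);
	}
	put_le(d->ee_block, 4, (uint32_t)from->e_lblk);
	put_le(d->ee_len, 2, len);
	put_le(d->ee_start_hi, 2, (uint32_t)(from->e_pblk >> 32));
	put_le(d->ee_start, 4, (uint32_t)from->e_pblk);
}
```

Callers only ever see "struct toy_extent" and a void *, so a second
on-disk layout could be added behind the same pair of functions
without changing them.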

What do you think?

					- Ted
harshad shirwadkar Dec. 10, 2020, 1:45 a.m. UTC | #2
I see, that makes sense. In that case, I'll rename the function to
errcode_t ext2fs_decode_extent(struct ext2fs_extent *dst, void *src).
I wonder if it's okay if we make this function return an error in case
the on-disk format is not sane. If we do that, we can add
ext2fs_validate_extent() later. Does that make sense?
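
A minimal sketch of a decode step that reports an error when the
on-disk data looks wrong.  All names here are hypothetical, the
src_len and fs_blocks parameters are assumptions that are not in the
signature proposed above, and which checks count as "sane" is
likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DISK_EXTENT_SIZE 12u     /* bytes in one on-disk leaf extent */
#define TOY_INIT_MAX_LEN     32768u  /* lengths above this mean "uninitialized" */

enum toy_err { TOY_OK = 0, TOY_ERR_BAD_SIZE, TOY_ERR_CORRUPT };

struct toy_extent {
	uint64_t e_pblk;   /* first physical block */
	uint64_t e_lblk;   /* first logical block */
	uint32_t e_len;    /* number of blocks */
	int      e_uninit; /* nonzero if the range is unwritten */
};

static uint32_t le16(const uint8_t *p) { return p[0] | ((uint32_t)p[1] << 8); }
static uint32_t le32(const uint8_t *p) { return le16(p) | (le16(p + 2) << 16); }

/*
 * Decode and sanity-check in one step.  `fs_blocks` is the size of the
 * file system in blocks, passed in so the physical range can be checked.
 */
static enum toy_err toy_decode_extent(struct toy_extent *dst, const void *src,
				      size_t src_len, uint64_t fs_blocks)
{
	const uint8_t *p = src;
	struct toy_extent e;

	assert(dst != NULL && src != NULL);
	if (src_len != TOY_DISK_EXTENT_SIZE)
		return TOY_ERR_BAD_SIZE;

	e.e_lblk = le32(p);
	e.e_len = le16(p + 4);
	e.e_pblk = ((uint64_t)le16(p + 6) << 32) | le32(p + 8);
	e.e_uninit = e.e_len > TOY_INIT_MAX_LEN;
	if (e.e_uninit)
		e.e_len -= TOY_INIT_MAX_LEN;

	/* Reject empty extents and ranges that run past the file system. */
	if (e.e_len == 0 || e.e_pblk >= fs_blocks ||
	    e.e_len > fs_blocks - e.e_pblk)
		return TOY_ERR_CORRUPT;

	*dst = e;	/* only written on success */
	return TOY_OK;
}
```

Because dst is written only on success, a caller can keep its previous
extent when the function returns an error.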

Theodore Ts'o Dec. 10, 2020, 3:48 p.m. UTC | #3
On Wed, Dec 09, 2020 at 05:45:27PM -0800, harshad shirwadkar wrote:
> I see, that makes sense. In that case, I'll rename the function to
> errcode_t ext2fs_decode_extent(struct ext2fs_extent *dst, void *src).
> I wonder if it's okay if we make this function return an error in case
> the on-disk format is not sane. If we do that, we can add
> ext2fs_validate_extent() later. Does that make sense?

Sure, that works for me.

Something that you should think about at some point is how much impact
supporting an alternate on-disk extent node structure (for the leaf
and/or intermediate nodes) would have on Fast Commit.  Obviously
doing this would require a new INCOMPAT feature at the file system level,
so we probably won't need any additional version negotiation in the
fast commit journal header itself, but how many tags would need to be
changed if we were to extend the extent tree structure sometime in the
future?

Cheers,

						- Ted

Patch

diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 01132245..afa9c5e4 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1332,6 +1332,10 @@  extern errcode_t ext2fs_extent_fix_parents(ext2_extent_handle_t handle);
 extern size_t ext2fs_max_extent_depth(ext2_extent_handle_t handle);
 extern errcode_t ext2fs_fix_extents_checksums(ext2_filsys fs, ext2_ino_t ino,
 					      struct ext2_inode *inode);
+extern blk64_t ext2fs_count_blocks(ext2_filsys fs, ext2_ino_t ino,
+					struct ext2_inode *inode);
+extern void ext2fs_convert_extent(struct ext2fs_extent *to,
+					struct ext3_extent *from);
 
 /* fallocate.c */
 #define EXT2_FALLOCATE_ZERO_BLOCKS	(0x1)
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index ac3dbfec..43feea0a 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -1785,6 +1785,62 @@  out:
 	return errcode;
 }
 
+void ext2fs_convert_extent(struct ext2fs_extent *to, struct ext3_extent *from)
+{
+	to->e_pblk = ext2fs_le32_to_cpu(from->ee_start) +
+		((__u64) ext2fs_le16_to_cpu(from->ee_start_hi)
+			<< 32);
+	to->e_lblk = ext2fs_le32_to_cpu(from->ee_block);
+	to->e_len = ext2fs_le16_to_cpu(from->ee_len);
+	to->e_flags = EXT2_EXTENT_FLAGS_LEAF;
+	if (to->e_len > EXT_INIT_MAX_LEN) {
+		to->e_len -= EXT_INIT_MAX_LEN;
+		to->e_flags |= EXT2_EXTENT_FLAGS_UNINIT;
+	}
+}
+
+blk64_t ext2fs_count_blocks(ext2_filsys fs, ext2_ino_t ino,
+			struct ext2_inode *inode)
+{
+	ext2_extent_handle_t	handle = NULL;
+	struct ext2fs_extent	extent;
+	errcode_t		errcode;
+	int			i;
+	blk64_t			blkcount = 0;
+	blk64_t			*intermediate_nodes;
+
+	errcode = ext2fs_extent_open2(fs, ino, inode, &handle);
+	if (errcode)
+		goto out;
+
+	errcode = ext2fs_extent_get(handle, EXT2_EXTENT_ROOT, &extent);
+	if (errcode)
+		goto out;
+
+	ext2fs_get_array(handle->max_depth, sizeof(blk64_t),
+				&intermediate_nodes);
+	blkcount = handle->level;
+	while (!errcode) {
+		if (extent.e_flags & EXT2_EXTENT_FLAGS_LEAF) {
+			blkcount += extent.e_len;
+			for (i = 0; i < handle->level; i++) {
+				if (intermediate_nodes[i] !=
+					handle->path[i].end_blk) {
+					blkcount++;
+					intermediate_nodes[i] =
+						handle->path[i].end_blk;
+				}
+			}
+		}
+		errcode = ext2fs_extent_get(handle, EXT2_EXTENT_NEXT, &extent);
+	}
+	ext2fs_free_mem(&intermediate_nodes);
+out:
+	ext2fs_extent_free(handle);
+
+	return blkcount;
+}
+
 #ifdef DEBUG
 /*
  * Override debugfs's prompt