diff mbox

[05/54] libext2fs: avoid pointless EA block allocation

Message ID 20150127073606.13308.68905.stgit@birch.djwong.org
State Accepted, archived
Headers show

Commit Message

Darrick Wong Jan. 27, 2015, 7:36 a.m. UTC
Use qsort to move the inlinedata attribute to the front of the list
and the empty entries to the end.  Then we can use handle->count to
decide if we're done writing xattrs, which helps us to avoid the
situation where we're midway through the attribute list, so we
allocate an EA block to store more, but have no idea that there's
actually nothing left in the list.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/ext2fs/ext_attr.c |   40 +++++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 19 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Theodore Ts'o Jan. 27, 2015, 4:07 p.m. UTC | #1
On Mon, Jan 26, 2015 at 11:36:06PM -0800, Darrick J. Wong wrote:
> Use qsort to move the inlinedata attribute to the front of the list
> and the empty entries to the end.  Then we can use handle->count to
> decide if we're done writing xattrs, which helps us to avoid the
> situation where we're midway through the attribute list, so we
> allocate an EA block to store more, but have no idea that there's
> actually nothing left in the list.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Applied, although I wonder if we might want to be a bit more
sophisticated about how we handle the order of attributes in the
future.  In particular, if we need to spill over into an EA block,
there are certain things like the selinux security id that we would
want to be inline in the inode, and let other xattrs spill over into
the EA block.  And of course, if we can manage to fit all of the
xattrs into the inode, except for system.data, then we should bail on
using the inline data feature at all, and just allocate a normal data
block for the data, and not use an EA block for system.data at all.

It's much more imporant that we get this sort of stuff right for the
kernel implementation, though, and if we occasionally misoptimize
things in libext2fs, that's unfortunate, but it's primarily going to
be a problem for the FUSE driver...

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick Wong Jan. 27, 2015, 7:26 p.m. UTC | #2
On Tue, Jan 27, 2015 at 11:07:36AM -0500, Theodore Ts'o wrote:
> On Mon, Jan 26, 2015 at 11:36:06PM -0800, Darrick J. Wong wrote:
> > Use qsort to move the inlinedata attribute to the front of the list
> > and the empty entries to the end.  Then we can use handle->count to
> > decide if we're done writing xattrs, which helps us to avoid the
> > situation where we're midway through the attribute list, so we
> > allocate an EA block to store more, but have no idea that there's
> > actually nothing left in the list.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Applied, although I wonder if we might want to be a bit more
> sophisticated about how we handle the order of attributes in the
> future.  In particular, if we need to spill over into an EA block,
> there are certain things like the selinux security id that we would

Agreed.

> want to be inline in the inode, and let other xattrs spill over into
> the EA block.  And of course, if we can manage to fit all of the
> xattrs into the inode, except for system.data, then we should bail on
> using the inline data feature at all, and just allocate a normal data
> block for the data, and not use an EA block for system.data at all.

I've been wondering for a while why there isn't a name index for
system.data -- doing that would allow 4 more bytes of inline data per
inode.

> It's much more imporant that we get this sort of stuff right for the
> kernel implementation, though, and if we occasionally misoptimize
> things in libext2fs, that's unfortunate, but it's primarily going to
> be a problem for the FUSE driver...

<shrug> That might (for now) be less of a problem for fuse, since
libext2fs has inline directories spill over to a block as soon as
i_block[] fills up... haven't decided if that's a "feature" or a "bug".

(The kernel implementation seems ok with my not-rigorous 3.19 spot
check.)

--D
> 
> 					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index c6fcf54..e8544dc 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -254,22 +254,20 @@  static struct ea_name_index ea_names[] = {
 	{0, NULL},
 };
 
-static void move_inline_data_to_front(struct ext2_xattr_handle *h)
+/* Push empty attributes to the end and inlinedata to the front. */
+static int attr_compare(const void *a, const void *b)
 {
-	struct ext2_xattr *x;
-	struct ext2_xattr tmp;
-
-	for (x = h->attrs + 1; x < h->attrs + h->length; x++) {
-		if (!x->name)
-			continue;
-
-		if (strcmp(x->name, "system.data") == 0) {
-			memcpy(&tmp, x, sizeof(tmp));
-			memcpy(x, h->attrs, sizeof(tmp));
-			memcpy(h->attrs, &tmp, sizeof(tmp));
-			return;
-		}
-	}
+	const struct ext2_xattr *xa = a, *xb = b;
+
+	if (xa->name == NULL)
+		return +1;
+	else if (xb->name == NULL)
+		return -1;
+	else if (!strcmp(xa->name, "system.data"))
+		return -1;
+	else if (!strcmp(xb->name, "system.data"))
+		return +1;
+	return 0;
 }
 
 static const char *find_ea_prefix(int index)
@@ -531,9 +529,13 @@  errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
 		inode->i_extra_isize = extra;
 	}
 
-	move_inline_data_to_front(handle);
-
+	/*
+	 * Force the inlinedata attr to the front and the empty entries
+	 * to the end.
+	 */
 	x = handle->attrs;
+	qsort(x, handle->length, sizeof(struct ext2_xattr), attr_compare);
+
 	/* Does the inode have size for EA? */
 	if (EXT2_INODE_SIZE(handle->fs->super) <= EXT2_GOOD_OLD_INODE_SIZE +
 						  inode->i_extra_isize +
@@ -554,11 +556,11 @@  errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
 	if (err)
 		goto out;
 
+write_ea_block:
 	/* Are we done? */
-	if (x == handle->attrs + handle->length)
+	if (x >= handle->attrs + handle->count)
 		goto skip_ea_block;
 
-write_ea_block:
 	/* Write the EA block */
 	err = ext2fs_get_memzero(handle->fs->blocksize, &block_buf);
 	if (err)