diff mbox

[29/31] ext4: reserve space for xattr entries/names

Message ID 20170614172326.18670-1-tahsin@google.com
State Superseded, archived
Headers show

Commit Message

Tahsin Erdogan June 14, 2017, 5:23 p.m. UTC
New ea_inode feature allows putting large xattr values into external
inodes. struct ext4_xattr_entry and the attribute name however have to
remain in the inode extra space or external attribute block. Once that
space is exhausted, no further entries can be added. Some of that space
could also be used by values that fit in there at the time of addition.

So, a single xattr entry whose value barely fits in the external block
could prevent further entries being added.

To mitigate the problem, this patch introduces a notion of reserve in
the
external attribute block that cannot be used by value data. This reserve
is enforced when ea_inode feature is enabled. The amount of reserve is
arbitrarily chosen to be min(block_size/8, 1024). The table below shows
how much space is reserved for each block size and the guaranteed
mininum
number of entries that can be placed in the external attribute block.

block size     reserved bytes  entries (name length = 16)
 1k            128              3
 2k            256              7
 4k            512             15
 8k            1024            31
16k            1024            31
32k            1024            31
64k            1024            31

Signed-off-by: Tahsin Erdogan <tahsin@google.com>
---
 fs/ext4/xattr.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Darrick Wong June 14, 2017, 11:05 p.m. UTC | #1
On Wed, Jun 14, 2017 at 10:23:26AM -0700, Tahsin Erdogan wrote:
> New ea_inode feature allows putting large xattr values into external
> inodes. struct ext4_xattr_entry and the attribute name however have to
> remain in the inode extra space or external attribute block. Once that
> space is exhausted, no further entries can be added. Some of that space
> could also be used by values that fit in there at the time of addition.
> 
> So, a single xattr entry whose value barely fits in the external block
> could prevent further entries being added.
> 
> To mitigate the problem, this patch introduces a notion of reserve in
> the
> external attribute block that cannot be used by value data. This reserve
> is enforced when ea_inode feature is enabled. The amount of reserve is
> arbitrarily chosen to be min(block_size/8, 1024). The table below shows
> how much space is reserved for each block size and the guaranteed
> mininum
> number of entries that can be placed in the external attribute block.
> 
> block size     reserved bytes  entries (name length = 16)
>  1k            128              3
>  2k            256              7
>  4k            512             15

Why not just spill the values into their own ea_inodes if we need the
space?  I guess that has the disadvantage that now we need to reserve
quite a few more journal credits ((1 inode block, 1 bbitmap block, 1
ibitmap block, 1 data block) * nr_inline_values) just in case we end up
spilling all the values.

--D

>  8k            1024            31
> 16k            1024            31
> 32k            1024            31
> 64k            1024            31
> 
> Signed-off-by: Tahsin Erdogan <tahsin@google.com>
> ---
>  fs/ext4/xattr.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 3ad1fc62cbf0..c9579d220a0c 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1428,6 +1428,12 @@ static int ext4_xattr_inode_lookup_create(handle_t *handle, struct inode *inode,
>  	return 0;
>  }
>  
> +/*
> + * Reserve min(block_size/8, 1024) bytes for xattr entries/names if ea_inode
> + * feature is enabled.
> + */
> +#define EXT4_XATTR_BLOCK_RESERVE(inode)	min(i_blocksize(inode)/8, 1024U)
> +
>  static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
>  				struct ext4_xattr_search *s,
>  				handle_t *handle, struct inode *inode)
> @@ -1487,6 +1493,20 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
>  			ret = -ENOSPC;
>  			goto out;
>  		}
> +
> +		/*
> +		 * If storing the value in an external inode is an option,
> +		 * reserve space for xattr entries/names in the external
> +		 * attribute block so that a long value does not occupy the
> +		 * whole space and prevent futher entries being added.
> +		 */
> +		if (ext4_has_feature_ea_inode(inode->i_sb) && new_size &&
> +		    (s->end - s->base) == i_blocksize(inode) &&
> +		    (min_offs + old_size - new_size) <
> +					EXT4_XATTR_BLOCK_RESERVE(inode)) {
> +			ret = -ENOSPC;
> +			goto out;
> +		}
>  	}
>  
>  	/*
> -- 
> 2.13.1.508.gb3defc5cc-goog
>
Tahsin Erdogan June 14, 2017, 11:39 p.m. UTC | #2
On Wed, Jun 14, 2017 at 4:05 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> Why not just spill the values into their own ea_inodes if we need the
> space?  I guess that has the disadvantage that now we need to reserve
> quite a few more journal credits ((1 inode block, 1 bbitmap block, 1
> ibitmap block, 1 data block) * nr_inline_values) just in case we end up
> spilling all the values.
>

In last week's ext4 conference call we have discussed this problem and
possible solutions. The major concern with spilling/evacuation
approach was the added complexity. We haven't discussed the impact on
journal credits, but that is a valid point too.
diff mbox

Patch

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 3ad1fc62cbf0..c9579d220a0c 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1428,6 +1428,12 @@  static int ext4_xattr_inode_lookup_create(handle_t *handle, struct inode *inode,
 	return 0;
 }
 
+/*
+ * Reserve min(block_size/8, 1024) bytes for xattr entries/names if ea_inode
+ * feature is enabled.
+ */
+#define EXT4_XATTR_BLOCK_RESERVE(inode)	min(i_blocksize(inode)/8, 1024U)
+
 static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 				struct ext4_xattr_search *s,
 				handle_t *handle, struct inode *inode)
@@ -1487,6 +1493,20 @@  static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 			ret = -ENOSPC;
 			goto out;
 		}
+
+		/*
+		 * If storing the value in an external inode is an option,
+		 * reserve space for xattr entries/names in the external
+		 * attribute block so that a long value does not occupy the
+		 * whole space and prevent futher entries being added.
+		 */
+		if (ext4_has_feature_ea_inode(inode->i_sb) && new_size &&
+		    (s->end - s->base) == i_blocksize(inode) &&
+		    (min_offs + old_size - new_size) <
+					EXT4_XATTR_BLOCK_RESERVE(inode)) {
+			ret = -ENOSPC;
+			goto out;
+		}
 	}
 
 	/*