ext4: inherit encryption xattr before other xattrs

Message ID 20170217233321.108637-1-ebiggers3@gmail.com
State Accepted

Commit Message

Eric Biggers Feb. 17, 2017, 11:33 p.m.
From: Eric Biggers <ebiggers@google.com>

When using both encryption and SELinux (or another feature that requires
an xattr per file) on a filesystem with 256-byte inodes, each file's
xattrs usually spill into an external xattr block.  Currently, the
xattrs are inherited in the order ACL, security, then encryption.
Therefore, if spillage occurs, the encryption xattr will always end up
in the external block.  This is not ideal because the encryption xattrs
contain a nonce, so they will always be unique and will prevent the
external xattr blocks from being deduplicated.

To improve the situation, change the inheritance order to encryption,
ACL, then security.  This gives the encryption xattr a better chance to
be stored in-inode, allowing the other xattr(s) to be deduplicated.

Note that it may be better for userspace to format the filesystem with
512-byte inodes in this case.  However, it's not the default.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/ext4/ialloc.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
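
The ordering effect described in the commit message can be illustrated with a small user-space model. This is only a sketch, not ext4 code: `struct xattr_model`, `place_xattrs()`, `is_in_inode()`, the first-fit placement rule, and all byte sizes are assumptions made for illustration. The real in-inode space and xattr footprints depend on the inode size, `i_extra_isize`, and the xattr values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one xattr; sizes are illustrative, not real ext4 values. */
struct xattr_model {
	const char *name;
	size_t size;		/* assumed on-disk footprint in bytes */
	int in_inode;		/* set by place_xattrs(): 1 = in-inode, 0 = external */
};

/*
 * Place xattrs in creation order: each one goes in-inode if it still fits
 * in the remaining in-inode space, otherwise into the external block.
 */
static void place_xattrs(struct xattr_model *xs, size_t n, size_t inode_space)
{
	size_t used = 0;

	assert(xs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (xs[i].size <= inode_space - used) {
			xs[i].in_inode = 1;
			used += xs[i].size;
		} else {
			xs[i].in_inode = 0;
		}
	}
}

/* Return the placement of the named xattr, or -1 if it is not in the array. */
static int is_in_inode(const struct xattr_model *xs, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(xs[i].name, name) == 0)
			return xs[i].in_inode;
	return -1;
}
```

In this model, with an assumed 92 bytes of in-inode space and footprints of 40 (ACL), 48 (security), and 44 (encryption) bytes, the order ACL, security, encryption leaves the encryption xattr in the external block. The order encryption, ACL, security keeps it in the inode and spills only the security xattr, which is the one that can be shared between files.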

Comments

Andreas Dilger Feb. 27, 2017, 8:28 p.m. | #1
On Feb 17, 2017, at 4:33 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> 
> From: Eric Biggers <ebiggers@google.com>
> 
> When using both encryption and SELinux (or another feature that requires
> an xattr per file) on a filesystem with 256-byte inodes, each file's
> xattrs usually spill into an external xattr block.  Currently, the
> xattrs are inherited in the order ACL, security, then encryption.
> Therefore, if spillage occurs, the encryption xattr will always end up
> in the external block.  This is not ideal because the encryption xattrs
> contain a nonce, so they will always be unique and will prevent the
> external xattr blocks from being deduplicated.
> 
> To improve the situation, change the inheritance order to encryption,
> ACL, then security.  This gives the encryption xattr a better chance to
> be stored in-inode, allowing the other xattr(s) to be deduplicated.
> 
> Note that it may be better for userspace to format the filesystem with
> 512-byte inodes in this case.  However, it's not the default.
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

Note that we've been using 512-byte inodes for Lustre metadata servers
for ages.  We may want to consider enabling this by default when the
filesystem features are set (e.g. crypto, inline data, etc).

Having a general mechanism to bias xattrs to in-inode or external xattr
storage would be nice also, but I don't know any way to do this beyond
just having a list of xattr names and then prioritizing the ones that
go into the in-inode space.

Cheers, Andreas
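
The "list of xattr names" idea mentioned above could look roughly like the following user-space sketch. Nothing here exists in ext4: `inode_preferred`, `struct xattr_req`, `xattr_priority()`, and `order_for_inode()` are hypothetical names, and the xattr names are illustrative labels rather than real on-disk names. The sketch only shows sorting pending xattrs so that listed names are created first and therefore get first claim on the in-inode space.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list of xattr names that should be biased toward in-inode storage. */
static const char *const inode_preferred[] = { "encryption", NULL };

/* Hypothetical pending xattr; size is an illustrative footprint in bytes. */
struct xattr_req {
	const char *name;
	size_t size;
};

/* Lower value = created earlier; names not on the list sort last. */
static int xattr_priority(const char *name)
{
	for (int i = 0; inode_preferred[i] != NULL; i++)
		if (strcmp(name, inode_preferred[i]) == 0)
			return i;
	return INT_MAX;
}

static int cmp_priority(const void *a, const void *b)
{
	int pa = xattr_priority(((const struct xattr_req *)a)->name);
	int pb = xattr_priority(((const struct xattr_req *)b)->name);

	return (pa > pb) - (pa < pb);
}

/*
 * Reorder pending xattrs so that preferred names come first.
 * qsort() is not stable, so the relative order of unlisted names
 * is unspecified in this sketch.
 */
static void order_for_inode(struct xattr_req *xs, size_t n)
{
	assert(xs != NULL || n == 0);
	qsort(xs, n, sizeof(*xs), cmp_priority);
}
```

A static list like this is the simplest form of the mechanism. Anything more general, such as a per-filesystem or per-mount preference, is beyond what the thread discusses.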

> ---
> fs/ext4/ialloc.c | 17 +++++++++++------
> 1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index b14bae2598bc..0304e28c2014 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -1096,6 +1096,17 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
> 	if (err)
> 		goto fail_drop;
> 
> +	/*
> +	 * Since the encryption xattr will always be unique, create it first so
> +	 * that it's less likely to end up in an external xattr block and
> +	 * prevent its deduplication.
> +	 */
> +	if (encrypt) {
> +		err = fscrypt_inherit_context(dir, inode, handle, true);
> +		if (err)
> +			goto fail_free_drop;
> +	}
> +
> 	err = ext4_init_acl(handle, inode, dir);
> 	if (err)
> 		goto fail_free_drop;
> @@ -1117,12 +1128,6 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
> 		ei->i_datasync_tid = handle->h_transaction->t_tid;
> 	}
> 
> -	if (encrypt) {
> -		err = fscrypt_inherit_context(dir, inode, handle, true);
> -		if (err)
> -			goto fail_free_drop;
> -	}
> -
> 	err = ext4_mark_inode_dirty(handle, inode);
> 	if (err) {
> 		ext4_std_error(sb, err);
> --
> 2.11.0.483.g087da7b7c-goog
> 


Eric Biggers Feb. 27, 2017, 9:36 p.m. | #2
On Mon, Feb 27, 2017 at 01:28:28PM -0700, Andreas Dilger wrote:
> On Feb 17, 2017, at 4:33 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> > 
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > When using both encryption and SELinux (or another feature that requires
> > an xattr per file) on a filesystem with 256-byte inodes, each file's
> > xattrs usually spill into an external xattr block.  Currently, the
> > xattrs are inherited in the order ACL, security, then encryption.
> > Therefore, if spillage occurs, the encryption xattr will always end up
> > in the external block.  This is not ideal because the encryption xattrs
> > contain a nonce, so they will always be unique and will prevent the
> > external xattr blocks from being deduplicated.
> > 
> > To improve the situation, change the inheritance order to encryption,
> > ACL, then security.  This gives the encryption xattr a better chance to
> > be stored in-inode, allowing the other xattr(s) to be deduplicated.
> > 
> > Note that it may be better for userspace to format the filesystem with
> > 512-byte inodes in this case.  However, it's not the default.
> > 
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
> 
> Note that we've been using 512-byte inodes for Lustre metadata servers
> for ages.  We may want to consider enabling this by default when the
> filesystem features are set (e.g. crypto, inline data, etc).
> 
> Having a general mechanism to bias xattrs to in-inode or external xattr
> storage would be nice also, but I don't know any way to do this beyond
> just having a list of xattr names and then prioritizing the ones that
> go into the in-inode space.
> 

I think it's a good idea to have mke2fs default to 512-byte inodes if
'-O inline_data' is specified.  But with '-O encrypt' it may be more debatable
because there is no guarantee as to how many files userspace will actually
choose to encrypt.  It could be almost the whole filesystem, or just a few
files, or even nothing at all.

Regardless, I think adjusting the xattr inheritance order (as this patch does)
has advantages but no real disadvantages, so it might as well be done too.

Eric
Andreas Dilger Feb. 27, 2017, 10:20 p.m. | #3
On Feb 27, 2017, at 2:36 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> 
> On Mon, Feb 27, 2017 at 01:28:28PM -0700, Andreas Dilger wrote:
>> On Feb 17, 2017, at 4:33 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
>>> 
>>> From: Eric Biggers <ebiggers@google.com>
>>> 
>>> When using both encryption and SELinux (or another feature that requires
>>> an xattr per file) on a filesystem with 256-byte inodes, each file's
>>> xattrs usually spill into an external xattr block.  Currently, the
>>> xattrs are inherited in the order ACL, security, then encryption.
>>> Therefore, if spillage occurs, the encryption xattr will always end up
>>> in the external block.  This is not ideal because the encryption xattrs
>>> contain a nonce, so they will always be unique and will prevent the
>>> external xattr blocks from being deduplicated.
>>> 
>>> To improve the situation, change the inheritance order to encryption,
>>> ACL, then security.  This gives the encryption xattr a better chance to
>>> be stored in-inode, allowing the other xattr(s) to be deduplicated.
>>> 
>>> Note that it may be better for userspace to format the filesystem with
>>> 512-byte inodes in this case.  However, it's not the default.
>>> 
>>> Signed-off-by: Eric Biggers <ebiggers@google.com>
>> 
>> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
>> 
>> Note that we've been using 512-byte inodes for Lustre metadata servers
>> for ages.  We may want to consider enabling this by default when the
>> filesystem features are set (e.g. crypto, inline data, etc).
>> 
>> Having a general mechanism to bias xattrs to in-inode or external xattr
>> storage would be nice also, but I don't know any way to do this beyond
>> just having a list of xattr names and then prioritizing the ones that
>> go into the in-inode space.
>> 
> 
> I think it's a good idea to have mke2fs default to 512-byte inodes if
> '-O inline_data' is specified.  But with '-O encrypt' it may be more debatable
> because there is no guarantee as to how many files userspace will actually
> choose to encrypt.  It could be almost the whole filesystem, or just a few
> files, or even nothing at all.
> 
> Regardless, I think adjusting the xattr inheritance order (as this patch does)
> has advantages but no real disadvantages, so it might as well be done too.

Sorry if I wasn't clear.  I have no objections to this patch at all, and would
be happy to see it land.  My comments were just for related improvements that
could be done in different patches.

Cheers, Andreas
Eric Biggers May 1, 2017, 6:52 p.m. | #4
On Mon, Feb 27, 2017 at 01:36:37PM -0800, Eric Biggers wrote:
> On Mon, Feb 27, 2017 at 01:28:28PM -0700, Andreas Dilger wrote:
> > On Feb 17, 2017, at 4:33 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> > > 
> > > From: Eric Biggers <ebiggers@google.com>
> > > 
> > > When using both encryption and SELinux (or another feature that requires
> > > an xattr per file) on a filesystem with 256-byte inodes, each file's
> > > xattrs usually spill into an external xattr block.  Currently, the
> > > xattrs are inherited in the order ACL, security, then encryption.
> > > Therefore, if spillage occurs, the encryption xattr will always end up
> > > in the external block.  This is not ideal because the encryption xattrs
> > > contain a nonce, so they will always be unique and will prevent the
> > > external xattr blocks from being deduplicated.
> > > 
> > > To improve the situation, change the inheritance order to encryption,
> > > ACL, then security.  This gives the encryption xattr a better chance to
> > > be stored in-inode, allowing the other xattr(s) to be deduplicated.
> > > 
> > > Note that it may be better for userspace to format the filesystem with
> > > 512-byte inodes in this case.  However, it's not the default.
> > > 
> > > Signed-off-by: Eric Biggers <ebiggers@google.com>
> > 
> > Reviewed-by: Andreas Dilger <adilger@dilger.ca>
> > 
> > Note that we've been using 512-byte inodes for Lustre metadata servers
> > for ages.  We may want to consider enabling this by default when the
> > filesystem features are set (e.g. crypto, inline data, etc).
> > 
> > Having a general mechanism to bias xattrs to in-inode or external xattr
> > storage would be nice also, but I don't know any way to do this beyond
> > just having a list of xattr names and then prioritizing the ones that
> > go into the in-inode space.
> > 
> 
> I think it's a good idea to have mke2fs default to 512-byte inodes if
> '-O inline_data' is specified.  But with '-O encrypt' it may be more debatable
> because there is no guarantee as to how many files userspace will actually
> choose to encrypt.  It could be almost the whole filesystem, or just a few
> files, or even nothing at all.
> 
> Regardless, I think adjusting the xattr inheritance order (as this patch does)
> has advantages but no real disadvantages, so it might as well be done too.
> 
> Eric

Ted, this patch seems to have gotten missed; are you planning to apply it?  Next
cycle is fine too, but I'd like to get it in sometime.

- Eric
Theodore Ts'o May 2, 2017, 5:28 a.m. | #5
On Mon, May 01, 2017 at 11:52:28AM -0700, Eric Biggers wrote:
> 
> Ted, this patch seems to have gotten missed; are you planning to apply it?  Next
> cycle is fine too, but I'd like to get it in sometime.

Yes, I had missed it.  Since it's simple and low-risk, I'll include it
this cycle; it's in the dev branch now.  Thanks for reminding me!

						- Ted

Patch

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index b14bae2598bc..0304e28c2014 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1096,6 +1096,17 @@  struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
 	if (err)
 		goto fail_drop;
 
+	/*
+	 * Since the encryption xattr will always be unique, create it first so
+	 * that it's less likely to end up in an external xattr block and
+	 * prevent its deduplication.
+	 */
+	if (encrypt) {
+		err = fscrypt_inherit_context(dir, inode, handle, true);
+		if (err)
+			goto fail_free_drop;
+	}
+
 	err = ext4_init_acl(handle, inode, dir);
 	if (err)
 		goto fail_free_drop;
@@ -1117,12 +1128,6 @@  struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
 		ei->i_datasync_tid = handle->h_transaction->t_tid;
 	}
 
-	if (encrypt) {
-		err = fscrypt_inherit_context(dir, inode, handle, true);
-		if (err)
-			goto fail_free_drop;
-	}
-
 	err = ext4_mark_inode_dirty(handle, inode);
 	if (err) {
 		ext4_std_error(sb, err);