[f2fs-dev] fscrypt: use 32 bytes of encrypted filename

Message ID 20170419204448.GA1021@jaegeuk.local
State New
Headers show

Commit Message

Jaegeuk Kim April 19, 2017, 8:44 p.m.
On 04/18, Eric Biggers wrote:
> On Tue, Apr 18, 2017 at 06:42:09PM -0700, Jaegeuk Kim wrote:
> > Hi Eric,
> > 
> > On 04/18, Eric Biggers wrote:
> > > On Tue, Apr 18, 2017 at 04:01:36PM -0700, Eric Biggers wrote:
> > > > 
> > > > Strangely, f2fs and ubifs do not use the bytes from the filename at all when
> > > > trying to find a specific directory entry in this case.  So this patch doesn't
> > > > really affect them.  This seems unreliable; perhaps we should introduce a
> > > > function like "fscrypt_name_matches()" which all the filesystems could call?
> > > > Can any of the f2fs and ubifs developers explain why they don't look at any
> > > > bytes from the filename?
> > > > 
> > 
> > The fscrypt_setup_filename sets fname->hash in the bigname case, but doesn't
> > give fname->disk_name. If it's not such the bigname case, we check its name
> > since fname->hash is zero.
> > 
> 
> Yes, that's what it does now.  The question is, in the "bigname" case why
> doesn't f2fs check the 16 bytes of ciphertext in fname->crypto_buf too?  f2fs
> doesn't even use 'minor_hash'; it can't possibly be the case that there are
> never any collisions of a 32-bit hash in a directory, can it?
> 
> I actually tested it, and it definitely happens if you put a lot of files in an
> encrypted directory on f2fs.  An example with 100000 files:
> 
> # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
> # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
> 100000
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # keyctl new_session
> # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
> 99999
> 
> So when I tried accessing the encrypted directory without the key, two dentries
> showed the same inode, due to a hash collision.

Thank you for sharing more details. I could reproduce this issue and reach out
to what you mentioned. In order to resolve this, I wrote a patch for f2fs first
to act like ext4 for easy backports. Then, I expect a global fscrypt function
is easily able to clean them up.

Thanks,

From 63ca24a64fb1625dac9740a2183fd8966388e185 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Wed, 19 Apr 2017 10:49:21 -0700
Subject: [PATCH] f2fs: check entire encrypted bigname when finding a dentry

If user has no key under an encrypted dir, fscrypt gives digested dentries.
Previously, when looking up a dentry, f2fs only checks its hash value with
first 4 bytes of the digested dentry, which didn't handle hash collisions fully.
This patch enhances to check entire dentry bytes likewise ext4.

Eric reported how to reproduce this issue by:

 # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
 # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
100000
 # sync
 # echo 3 > /proc/sys/vm/drop_caches
 # keyctl new_session
 # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
99999

Cc: <stable@vger.kernel.org>
Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/dir.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

Comments

Eric Biggers April 21, 2017, 7:44 a.m. | #1
Hi Jaegeuk,

On Wed, Apr 19, 2017 at 01:44:48PM -0700, Jaegeuk Kim wrote:
> 
> Thank you for sharing more details. I could reproduce this issue and reach out
> to what you mentioned. In order to resolve this, I wrote a patch for f2fs first
> to act like ext4 for easy backports. Then, I expect a global fscrypt function
> is easily able to clean them up.
[...]
> @@ -130,19 +130,29 @@ struct f2fs_dir_entry *find_target_dentry(struct fscrypt_name *fname,
>  			continue;
>  		}
>  
> -		/* encrypted case */
> +		if (de->hash_code != namehash)
> +			goto not_match;
> +
>  		de_name.name = d->filename[bit_pos];
>  		de_name.len = le16_to_cpu(de->name_len);
>  
> -		/* show encrypted name */
> -		if (fname->hash) {
> -			if (de->hash_code == cpu_to_le32(fname->hash))
> -				goto found;
> -		} else if (de_name.len == name->len &&
> -			de->hash_code == namehash &&
> -			!memcmp(de_name.name, name->name, name->len))
> +#ifdef CONFIG_F2FS_FS_ENCRYPTION
> +		if (unlikely(!name->name)) {
> +			if (fname->usr_fname->name[0] == '_') {
> +				if (de_name.len >= 16 &&
> +					!memcmp(de_name.name + de_name.len - 16,
> +						fname->crypto_buf.name + 8, 16))
> +					goto found;
> +				goto not_match;
> +			}
> +			name->name = fname->crypto_buf.name;
> +			name->len = fname->crypto_buf.len;
> +		}
> +#endif
> +		if (de_name.len == name->len &&
> +				!memcmp(de_name.name, name->name, name->len))
>  			goto found;
> -
> +not_match:

I agree with this approach, but I don't think it's ever the case that
fname->usr_fname->name[0] != '_'.  (Yes, ext4 checks it, but I think it's
unneeded there too.)  And if it was, it doesn't make sense to modify the 'name'
that is passed in.

In any case, I guess that unless there are other ideas we can do these patches:

1.) f2fs patch to start checking the name, as above
2.) patch to start encoding last 32 bytes of the name (or second-to-last CTS
    block, I haven't decided yet) rather than last 16 bytes, changing
    fs/crypto/, fs/ext4/, and fs/f2fs/
3.) cleanup patches to introduce helper function and switch ext4 and f2fs to it

(1) and (2) will be backported.

UBIFS can be changed to use the helper function later if needed.  It's not as
important for it to be backported there since UBIFS does the "double hashing",
and UBIFS encryption was just added in 4.10 anyway.

I can try to put together the full series when I have time.  It probably would
make sense for it to go through the fscrypt tree, given the dependencies.

Eric
Gwendal Grignou April 21, 2017, 5:21 p.m. | #2
>
> In any case, I guess that unless there are other ideas we can do these patches:
>
> 1.) f2fs patch to start checking the name, as above
> 2.) patch to start encoding last 32 bytes of the name (or second-to-last CTS
>     block, I haven't decided yet) rather than last 16 bytes, changing
>     fs/crypto/, fs/ext4/, and fs/f2fs/

Using second-to-last CTS block is CTS-CBC specific. If we use another
method to encode filename (I am thinking of HEH,
http://www.mail-archive.com/linux-crypto@vger.kernel.org/msg21826.html)
that may not work anymore.
We don't have to use the last 32 bytes: using for instance the last 24
should be good enough, the escape rate will be 1/2^64 instead 1/2^128.

Gwendal.

> 3.) cleanup patches to introduce helper function and switch ext4 and f2fs to it
>
> (1) and (2) will be backported.
>
> UBIFS can be changed to use the helper function later if needed.  It's not as
> important for it to be backported there since UBIFS does the "double hashing",
> and UBIFS encryption was just added in 4.10 anyway.
>
> I can try to put together the full series when I have time.  It probably would
> make sense for it to go through the fscrypt tree, given the dependencies.
>
> Eric
Jaegeuk Kim April 21, 2017, 5:35 p.m. | #3
Hi Eric,

On 04/21, Eric Biggers wrote:
> Hi Jaegeuk,
> 
> On Wed, Apr 19, 2017 at 01:44:48PM -0700, Jaegeuk Kim wrote:
> > 
> > Thank you for sharing more details. I could reproduce this issue and reach out
> > to what you mentioned. In order to resolve this, I wrote a patch for f2fs first
> > to act like ext4 for easy backports. Then, I expect a global fscrypt function
> > is easily able to clean them up.
> [...]
> > @@ -130,19 +130,29 @@ struct f2fs_dir_entry *find_target_dentry(struct fscrypt_name *fname,
> >  			continue;
> >  		}
> >  
> > -		/* encrypted case */
> > +		if (de->hash_code != namehash)
> > +			goto not_match;
> > +
> >  		de_name.name = d->filename[bit_pos];
> >  		de_name.len = le16_to_cpu(de->name_len);
> >  
> > -		/* show encrypted name */
> > -		if (fname->hash) {
> > -			if (de->hash_code == cpu_to_le32(fname->hash))
> > -				goto found;
> > -		} else if (de_name.len == name->len &&
> > -			de->hash_code == namehash &&
> > -			!memcmp(de_name.name, name->name, name->len))
> > +#ifdef CONFIG_F2FS_FS_ENCRYPTION
> > +		if (unlikely(!name->name)) {
> > +			if (fname->usr_fname->name[0] == '_') {
> > +				if (de_name.len >= 16 &&
> > +					!memcmp(de_name.name + de_name.len - 16,
> > +						fname->crypto_buf.name + 8, 16))
> > +					goto found;
> > +				goto not_match;
> > +			}
> > +			name->name = fname->crypto_buf.name;
> > +			name->len = fname->crypto_buf.len;
> > +		}
> > +#endif
> > +		if (de_name.len == name->len &&
> > +				!memcmp(de_name.name, name->name, name->len))
> >  			goto found;
> > -
> > +not_match:
> 
> I agree with this approach, but I don't think it's ever the case that
> fname->usr_fname->name[0] != '_'.  (Yes, ext4 checks it, but I think it's
> unneeded there too.)  And if it was, it doesn't make sense to modify the 'name'
> that is passed in.

Agreed, and actually I tried to sync ext4 as much as possible for further work
like 2.) and 3.) below. ;)

> In any case, I guess that unless there are other ideas we can do these patches:
> 
> 1.) f2fs patch to start checking the name, as above
> 2.) patch to start encoding last 32 bytes of the name (or second-to-last CTS
>     block, I haven't decided yet) rather than last 16 bytes, changing
>     fs/crypto/, fs/ext4/, and fs/f2fs/
> 3.) cleanup patches to introduce helper function and switch ext4 and f2fs to it

IMO, it'd better to do 3.) followed by 2.), since 2.) already needs to change
fs/crypto which does not give much backporting effort.

> (1) and (2) will be backported.
> 
> UBIFS can be changed to use the helper function later if needed.  It's not as
> important for it to be backported there since UBIFS does the "double hashing",
> and UBIFS encryption was just added in 4.10 anyway.
> 
> I can try to put together the full series when I have time.  It probably would
> make sense for it to go through the fscrypt tree, given the dependencies.

I found one issue in my patch and modified it in f2fs tree [1]. Given next merge
window probable starting next week, let me upstream this modified one first
through f2fs. Then, you can see it in 4.12-rc1 two weeks later, so fscrypt
patches can be easily integrated after then. If you have any concern, I'm also
okay to push this patch through fscrypt. Let me know.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?h=dev-test&id=1585cfbbb269be6a112e0629a52123c0f9eaf4fa

Thanks,

> 
> Eric
Eric Biggers April 21, 2017, 6:53 p.m. | #4
Hi Gwendal,

On Fri, Apr 21, 2017 at 10:21:16AM -0700, Gwendal Grignou wrote:
> >
> > In any case, I guess that unless there are other ideas we can do these patches:
> >
> > 1.) f2fs patch to start checking the name, as above
> > 2.) patch to start encoding last 32 bytes of the name (or second-to-last CTS
> >     block, I haven't decided yet) rather than last 16 bytes, changing
> >     fs/crypto/, fs/ext4/, and fs/f2fs/
> 
> Using second-to-last CTS block is CTS-CBC specific. If we use another
> method to encode filename (I am thinking of HEH,
> http://www.mail-archive.com/linux-crypto@vger.kernel.org/msg21826.html)
> that may not work anymore.
> We don't have to use the last 32 bytes: using for instance the last 24
> should be good enough, the escape rate will be 1/2^64 instead 1/2^128.
> 

The thing is, even using the last N bytes is depending on the encryption
algorithm.  The only way to make it work for arbitrary algorithms would be to
use a cryptographic hash like SHA-256 --- which actually seems to have been the
original design, but apparently it got changed at some point.  (I guess so that
hashes wouldn't have to be computed in so many places, and taking advantage of
the encryption should be sufficient.)

HEH is not a problem because it's a strong pseudorandom permutation, so any
choice of bytes from the ciphertext is equally good for it.

We can always change this later if different algorithms are added, or even make
different algorithms choose different bytes.

So I think I'm leaning towards making it use the second-to-last block, rather
than the last 32 bytes.

Eric
Eric Biggers April 21, 2017, 7:26 p.m. | #5
Hi Jaegeuk,

On Fri, Apr 21, 2017 at 10:35:03AM -0700, Jaegeuk Kim wrote:
> 
> > In any case, I guess that unless there are other ideas we can do these patches:
> > 
> > 1.) f2fs patch to start checking the name, as above
> > 2.) patch to start encoding last 32 bytes of the name (or second-to-last CTS
> >     block, I haven't decided yet) rather than last 16 bytes, changing
> >     fs/crypto/, fs/ext4/, and fs/f2fs/
> > 3.) cleanup patches to introduce helper function and switch ext4 and f2fs to it
> 
> IMO, it'd better to do 3.) followed by 2.), since 2.) already needs to change
> fs/crypto which does not give much backporting effort.
> 

That would be ideal, but unfortunately the main users of filesystem encryption
are using old kernel versions which don't have fs/crypto/, usually 4.4 at
latest.  So it would be nice for it to be easier to backport the "use different
bytes from the encrypted filename" change to 4.4-stable, as I've been doing for
some of the other filesystem encryption fixes.  And people do need it, it seems,
as it causes real problems like undeletable files; Gwendal is even already
trying to merge a fix into some Chrome OS kernel.

> 
> I found one issue in my patch and modified it in f2fs tree [1]. Given next merge
> window probable starting next week, let me upstream this modified one first
> through f2fs. Then, you can see it in 4.12-rc1 two weeks later, so fscrypt
> patches can be easily integrated after then. If you have any concern, I'm also
> okay to push this patch through fscrypt. Let me know.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?h=dev-test&id=1585cfbbb269be6a112e0629a52123c0f9eaf4fa
> 

I think the series through fscrypt makes more sense, though if I don't have it
ready soon please go ahead and take the f2fs portion through the f2fs tree.

Thanks!

- Eric

Patch

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index c143dffcae6e..007c3b4adc85 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -130,19 +130,29 @@  struct f2fs_dir_entry *find_target_dentry(struct fscrypt_name *fname,
 			continue;
 		}
 
-		/* encrypted case */
+		if (de->hash_code != namehash)
+			goto not_match;
+
 		de_name.name = d->filename[bit_pos];
 		de_name.len = le16_to_cpu(de->name_len);
 
-		/* show encrypted name */
-		if (fname->hash) {
-			if (de->hash_code == cpu_to_le32(fname->hash))
-				goto found;
-		} else if (de_name.len == name->len &&
-			de->hash_code == namehash &&
-			!memcmp(de_name.name, name->name, name->len))
+#ifdef CONFIG_F2FS_FS_ENCRYPTION
+		if (unlikely(!name->name)) {
+			if (fname->usr_fname->name[0] == '_') {
+				if (de_name.len >= 16 &&
+					!memcmp(de_name.name + de_name.len - 16,
+						fname->crypto_buf.name + 8, 16))
+					goto found;
+				goto not_match;
+			}
+			name->name = fname->crypto_buf.name;
+			name->len = fname->crypto_buf.len;
+		}
+#endif
+		if (de_name.len == name->len &&
+				!memcmp(de_name.name, name->name, name->len))
 			goto found;
-
+not_match:
 		if (max_slots && max_len > *max_slots)
 			*max_slots = max_len;
 		max_len = 0;