[RFC,V2,07/11] fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page

Message ID 20180212094347.22071-8-chandan@linux.vnet.ibm.com
State Superseded
Headers show
Series
  • Ext4 encryption support for blocksize < pagesize
Related show

Commit Message

Chandan Rajendra Feb. 12, 2018, 9:43 a.m.
For block size < page size, This commit adds code to encrypt all zeroed
out blocks of a page.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
---
 fs/crypto/bio.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

Comments

Eric Biggers Feb. 21, 2018, 1:16 a.m. | #1
On Mon, Feb 12, 2018 at 03:13:43PM +0530, Chandan Rajendra wrote:
> For block size < page size, This commit adds code to encrypt all zeroed
> out blocks of a page.
> 
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> ---
>  fs/crypto/bio.c | 38 +++++++++++++++++++++++++-------------
>  1 file changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> index 378df08..4d0d14f 100644
> --- a/fs/crypto/bio.c
> +++ b/fs/crypto/bio.c
> @@ -104,10 +104,12 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
>  {
>  	struct fscrypt_ctx *ctx;
>  	struct page *ciphertext_page = NULL;
> +	unsigned int page_nr_blks;
> +	unsigned int offset;
> +	unsigned int page_io_len;
>  	struct bio *bio;
>  	int ret, err = 0;
> -
> -	BUG_ON(inode->i_sb->s_blocksize != PAGE_SIZE);
> +	int i;
>  
>  	ctx = fscrypt_get_ctx(inode, GFP_NOFS);
>  	if (IS_ERR(ctx))
> @@ -119,12 +121,23 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
>  		goto errout;
>  	}
>  
> -	while (len--) {
> -		err = fscrypt_do_block_crypto(inode, FS_ENCRYPT, lblk,
> -					     ZERO_PAGE(0), ciphertext_page,
> -					     PAGE_SIZE, 0, GFP_NOFS);
> -		if (err)
> -			goto errout;
> +	page_nr_blks = 1 << (PAGE_SHIFT - inode->i_blkbits);
> +
> +	while (len) {
> +		page_nr_blks = min_t(unsigned int, page_nr_blks, len);
> +		page_io_len = page_nr_blks << inode->i_blkbits;
> +		offset = 0;

The 'page_io_len' variable isn't needed, since 'offset == page_io_len' after the
encryption loop.  You can do 'bio_add_page(bio, ciphertext_page, offset, 0);'.

Eric
Chandan Rajendra Feb. 21, 2018, 9:57 a.m. | #2
On Wednesday, February 21, 2018 6:46:48 AM IST Eric Biggers wrote:
> On Mon, Feb 12, 2018 at 03:13:43PM +0530, Chandan Rajendra wrote:
> > For block size < page size, This commit adds code to encrypt all zeroed
> > out blocks of a page.
> > 
> > Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> > ---
> >  fs/crypto/bio.c | 38 +++++++++++++++++++++++++-------------
> >  1 file changed, 25 insertions(+), 13 deletions(-)
> > 
> > diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
> > index 378df08..4d0d14f 100644
> > --- a/fs/crypto/bio.c
> > +++ b/fs/crypto/bio.c
> > @@ -104,10 +104,12 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> >  {
> >  	struct fscrypt_ctx *ctx;
> >  	struct page *ciphertext_page = NULL;
> > +	unsigned int page_nr_blks;
> > +	unsigned int offset;
> > +	unsigned int page_io_len;
> >  	struct bio *bio;
> >  	int ret, err = 0;
> > -
> > -	BUG_ON(inode->i_sb->s_blocksize != PAGE_SIZE);
> > +	int i;
> >  
> >  	ctx = fscrypt_get_ctx(inode, GFP_NOFS);
> >  	if (IS_ERR(ctx))
> > @@ -119,12 +121,23 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
> >  		goto errout;
> >  	}
> >  
> > -	while (len--) {
> > -		err = fscrypt_do_block_crypto(inode, FS_ENCRYPT, lblk,
> > -					     ZERO_PAGE(0), ciphertext_page,
> > -					     PAGE_SIZE, 0, GFP_NOFS);
> > -		if (err)
> > -			goto errout;
> > +	page_nr_blks = 1 << (PAGE_SHIFT - inode->i_blkbits);
> > +
> > +	while (len) {
> > +		page_nr_blks = min_t(unsigned int, page_nr_blks, len);
> > +		page_io_len = page_nr_blks << inode->i_blkbits;
> > +		offset = 0;
> 
> The 'page_io_len' variable isn't needed, since 'offset == page_io_len' after the
> encryption loop.  You can do 'bio_add_page(bio, ciphertext_page, offset, 0);'.
> 

You are right. I will fix that up in the next iteration of the patchset.
Theodore Y. Ts'o March 26, 2018, 6:05 a.m. | #3
On Wed, Feb 21, 2018 at 03:27:24PM +0530, Chandan Rajendra wrote:
> 
> You are right. I will fix that up in the next iteration of the patchset.
>

Hi Chandan,

When were you planning on sending out the next iteration of the
patchset?  The merge window will be opening soon, and I wasn't sure if
I had missed a new iteration of your changes.ts

Also, it looks like when you renamed the *_page fscrypt functions to
*_blocks, on the write side, a bounce page is still being used for
each block.  So so an an architecture which has 64k pages, and we are
writing to a file sytem with 4k blocks, to write a 64k page, the
fscrypt layer will have to allocate 16 64k bounce pages to write a
single 64k page to an encrypted file.  Am I missing something?

Thanks,

					- Ted
Chandan Rajendra March 26, 2018, 8:22 a.m. | #4
On Monday, March 26, 2018 11:35:33 AM IST Theodore Y. Ts'o wrote:
> On Wed, Feb 21, 2018 at 03:27:24PM +0530, Chandan Rajendra wrote:
> > 
> > You are right. I will fix that up in the next iteration of the patchset.
> >
> 
> Hi Chandan,
> 
> When were you planning on sending out the next iteration of the
> patchset?  The merge window will be opening soon, and I wasn't sure if
> I had missed a new iteration of your changes.ts
> 

Hi Ted,

I am sorry about the delay. I got pulled into various other things at
workplace. I have not posted V3 yet. I am still working on fixing 
test failures. I don't think the patches will be ready by the next
merge window.

> Also, it looks like when you renamed the *_page fscrypt functions to
> *_blocks, on the write side, a bounce page is still being used for
> each block.  So so an an architecture which has 64k pages, and we are
> writing to a file sytem with 4k blocks, to write a 64k page, the
> fscrypt layer will have to allocate 16 64k bounce pages to write a
> single 64k page to an encrypted file.  Am I missing something?
> 

ext4_bio_write_page() invokes the new fscrypt_encrypt_block() function for
each block of the page that has been marked with "Async write". For all blocks
of the page that needs to be written to the disk, we pass the same bounce page
as an argument to fscrypt_encrypt_block().
Theodore Y. Ts'o March 27, 2018, 7:40 p.m. | #5
On Mon, Mar 26, 2018 at 01:52:54PM +0530, Chandan Rajendra wrote:
> > Also, it looks like when you renamed the *_page fscrypt functions to
> > *_blocks, on the write side, a bounce page is still being used for
> > each block.  So so an an architecture which has 64k pages, and we are
> > writing to a file sytem with 4k blocks, to write a 64k page, the
> > fscrypt layer will have to allocate 16 64k bounce pages to write a
> > single 64k page to an encrypted file.  Am I missing something?
> > 
> 
> ext4_bio_write_page() invokes the new fscrypt_encrypt_block() function for
> each block of the page that has been marked with "Async write". For all blocks
> of the page that needs to be written to the disk, we pass the same bounce page
> as an argument to fscrypt_encrypt_block().

Thanks for the explanation.  I do wonder if the proper thing to export
from the fscrypt layer is fscrypt_encrypt_page(), since for all file
systems, the only thing which really makes sense is to read and write
a full page at a time, since we cache things at the page cache a full
page a time.  So instead of teaching each file system how to use
fscrypt_{encrypt,decrypt}_block, maybe push that into the fscrypt
layer, and implement a new fscrypt_encrypt_page() which calls
fs_encrypt_block()?

							- Ted
Chandan Rajendra March 28, 2018, 1:36 p.m. | #6
On Wednesday, March 28, 2018 1:10:56 AM IST Theodore Y. Ts'o wrote:
> On Mon, Mar 26, 2018 at 01:52:54PM +0530, Chandan Rajendra wrote:
> > > Also, it looks like when you renamed the *_page fscrypt functions to
> > > *_blocks, on the write side, a bounce page is still being used for
> > > each block.  So so an an architecture which has 64k pages, and we are
> > > writing to a file sytem with 4k blocks, to write a 64k page, the
> > > fscrypt layer will have to allocate 16 64k bounce pages to write a
> > > single 64k page to an encrypted file.  Am I missing something?
> > > 
> > 
> > ext4_bio_write_page() invokes the new fscrypt_encrypt_block() function for
> > each block of the page that has been marked with "Async write". For all blocks
> > of the page that needs to be written to the disk, we pass the same bounce page
> > as an argument to fscrypt_encrypt_block().
> 
> Thanks for the explanation.  I do wonder if the proper thing to export
> from the fscrypt layer is fscrypt_encrypt_page(), since for all file
> systems, the only thing which really makes sense is to read and write
> a full page at a time, since we cache things at the page cache a full
> page a time.  So instead of teaching each file system how to use
> fscrypt_{encrypt,decrypt}_block, maybe push that into the fscrypt
> layer, and implement a new fscrypt_encrypt_page() which calls
> fs_encrypt_block()?

I don't see any problems in doing that. I will implement that. Thanks for the 
suggestion.
Chandan Rajendra April 5, 2018, 7:03 a.m. | #7
On Wednesday, March 28, 2018 1:10:56 AM IST Theodore Y. Ts'o wrote:
> On Mon, Mar 26, 2018 at 01:52:54PM +0530, Chandan Rajendra wrote:
> > > Also, it looks like when you renamed the *_page fscrypt functions to
> > > *_blocks, on the write side, a bounce page is still being used for
> > > each block.  So so an an architecture which has 64k pages, and we are
> > > writing to a file sytem with 4k blocks, to write a 64k page, the
> > > fscrypt layer will have to allocate 16 64k bounce pages to write a
> > > single 64k page to an encrypted file.  Am I missing something?
> > > 
> > 
> > ext4_bio_write_page() invokes the new fscrypt_encrypt_block() function for
> > each block of the page that has been marked with "Async write". For all blocks
> > of the page that needs to be written to the disk, we pass the same bounce page
> > as an argument to fscrypt_encrypt_block().
> 
> Thanks for the explanation.  I do wonder if the proper thing to export
> from the fscrypt layer is fscrypt_encrypt_page(), since for all file
> systems, the only thing which really makes sense is to read and write
> a full page at a time, since we cache things at the page cache a full
> page a time.  So instead of teaching each file system how to use
> fscrypt_{encrypt,decrypt}_block, maybe push that into the fscrypt
> layer, and implement a new fscrypt_encrypt_page() which calls
> fs_encrypt_block()?
> 

I encountered a problem when refactoring the code to get fscrypt layer to
encrypt all the blocks of a page by internally calling
fscrypt_encrypt_block().

It is the filesystem which knows which subset of blocks mapped by a page that
needs to be encrypted. For example, ext4_bio_write_page() marks such blocks
with "Async Write" flag and later in another pass, it encrypts and also adds
these blocks to a bio. So IMHO, I think fscrypt layer should limit itself to
encrypting/decrypting file data in block size units.
Theodore Y. Ts'o April 5, 2018, 12:47 p.m. | #8
On Thu, Apr 05, 2018 at 12:33:26PM +0530, Chandan Rajendra wrote:
> 
> I encountered a problem when refactoring the code to get fscrypt layer to
> encrypt all the blocks of a page by internally calling
> fscrypt_encrypt_block().
> 
> It is the filesystem which knows which subset of blocks mapped by a page that
> needs to be encrypted.

That's not quite correct.  All blocks in a file are either always
encrypted, or not.  So that's not really the problem.

> For example, ext4_bio_write_page() marks such blocks
> with "Async Write" flag and later in another pass, it encrypts and also adds
> these blocks to a bio.

The tricky bits with ext4_bio_write_page() all are in handling the
case where page_size > block_size.  In that case, where there are multiple
file system blocks covering a page, we need to know the on-disk
block numbers are for the blocks covering the page, and the encryption
is intertwined with the I/O submission path, which is file system
specific -- mainly because how the completion callback and the
parameters which need to be passed *into* the the callback function is
file system specific.

However, none of that is needed or relevant to the encryption
operation.  It's an accident of code development history that
fscrypt_encrypt_page was placed where it was.

That is, none of work done in the first pass (starting with the
comment "In the first loop we prepare and mark buffers to submit....")
is needed to be done before we call fscrypt_encrypt_page().  That call:

	data_page = fscrypt_encrypt_page(inode, page, PAGE_SIZE, 0,
					page->index, gfp_flags);

... could easily be moved to the beginning of ext4_bio_write_page().

I can do that to make the function easier to understand, but that
particular cleanup is merely cosmetic.  It doesn't what you would need
to do order to make fscrypt_encrypt_page() iterate over the page as it
calls fscrypt_encrypt_buffer().

Regards,

						- Ted
Chandan Rajendra April 5, 2018, 1:07 p.m. | #9
On Thursday, April 5, 2018 6:17:45 PM IST Theodore Y. Ts'o wrote:
> On Thu, Apr 05, 2018 at 12:33:26PM +0530, Chandan Rajendra wrote:
> > 
> > I encountered a problem when refactoring the code to get fscrypt layer to
> > encrypt all the blocks of a page by internally calling
> > fscrypt_encrypt_block().
> > 
> > It is the filesystem which knows which subset of blocks mapped by a page that
> > needs to be encrypted.
> 
> That's not quite correct.  All blocks in a file are either always
> encrypted, or not.  So that's not really the problem.

Sorry, I wasn't clear enough with my explaination. I actually meant to say
that not all blocks mapped by a page might be dirty and hence only a subset
of the dirty blocks might need to be written to the disk. I understand that
in such cases we could still invoke fscrypt_encrypt_page() and encrypt the
contents of all the blocks (irrespective of whether a block is dirty or not).
I wanted to optimize this and avoid the encryption of "clean blocks".

> 
> > For example, ext4_bio_write_page() marks such blocks
> > with "Async Write" flag and later in another pass, it encrypts and also adds
> > these blocks to a bio.
> 
> The tricky bits with ext4_bio_write_page() all are in handling the
> case where page_size > block_size.  In that case, where there are multiple
> file system blocks covering a page, we need to know the on-disk
> block numbers are for the blocks covering the page, and the encryption
> is intertwined with the I/O submission path, which is file system
> specific -- mainly because how the completion callback and the
> parameters which need to be passed *into* the the callback function is
> file system specific.
> 
> However, none of that is needed or relevant to the encryption
> operation.  It's an accident of code development history that
> fscrypt_encrypt_page was placed where it was.
> 
> That is, none of work done in the first pass (starting with the
> comment "In the first loop we prepare and mark buffers to submit....")
> is needed to be done before we call fscrypt_encrypt_page().  That call:
> 
> 	data_page = fscrypt_encrypt_page(inode, page, PAGE_SIZE, 0,
> 					page->index, gfp_flags);
> 
> ... could easily be moved to the beginning of ext4_bio_write_page().
> 
> I can do that to make the function easier to understand, but that
> particular cleanup is merely cosmetic.  It doesn't what you would need
> to do order to make fscrypt_encrypt_page() iterate over the page as it
> calls fscrypt_encrypt_buffer().
>
Theodore Y. Ts'o April 5, 2018, 8:50 p.m. | #10
On Thu, Apr 05, 2018 at 06:37:24PM +0530, Chandan Rajendra wrote:
> Sorry, I wasn't clear enough with my explaination. I actually meant to say
> that not all blocks mapped by a page might be dirty and hence only a subset
> of the dirty blocks might need to be written to the disk.

But we only track dirtiness on a per-page level.  That's all the page
cache gives us.  Remember, the function name is ext4_bio_write_page()
or {btrfs,xfs,ext4}_writepage() --- we will always be writing a full
page at a time.

						- Ted

Patch

diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
index 378df08..4d0d14f 100644
--- a/fs/crypto/bio.c
+++ b/fs/crypto/bio.c
@@ -104,10 +104,12 @@  int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 {
 	struct fscrypt_ctx *ctx;
 	struct page *ciphertext_page = NULL;
+	unsigned int page_nr_blks;
+	unsigned int offset;
+	unsigned int page_io_len;
 	struct bio *bio;
 	int ret, err = 0;
-
-	BUG_ON(inode->i_sb->s_blocksize != PAGE_SIZE);
+	int i;
 
 	ctx = fscrypt_get_ctx(inode, GFP_NOFS);
 	if (IS_ERR(ctx))
@@ -119,12 +121,23 @@  int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 		goto errout;
 	}
 
-	while (len--) {
-		err = fscrypt_do_block_crypto(inode, FS_ENCRYPT, lblk,
-					     ZERO_PAGE(0), ciphertext_page,
-					     PAGE_SIZE, 0, GFP_NOFS);
-		if (err)
-			goto errout;
+	page_nr_blks = 1 << (PAGE_SHIFT - inode->i_blkbits);
+
+	while (len) {
+		page_nr_blks = min_t(unsigned int, page_nr_blks, len);
+		page_io_len = page_nr_blks << inode->i_blkbits;
+		offset = 0;
+
+		for (i = 0; i < page_nr_blks; i++) {
+			err = fscrypt_do_block_crypto(inode, FS_ENCRYPT, lblk,
+						ZERO_PAGE(0), ciphertext_page,
+						inode->i_sb->s_blocksize, offset,
+						GFP_NOFS);
+			if (err)
+				goto errout;
+			lblk++;
+			offset += inode->i_sb->s_blocksize;
+		}
 
 		bio = bio_alloc(GFP_NOWAIT, 1);
 		if (!bio) {
@@ -135,9 +148,8 @@  int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 		bio->bi_iter.bi_sector =
 			pblk << (inode->i_sb->s_blocksize_bits - 9);
 		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
-		ret = bio_add_page(bio, ciphertext_page,
-					inode->i_sb->s_blocksize, 0);
-		if (ret != inode->i_sb->s_blocksize) {
+		ret = bio_add_page(bio, ciphertext_page, page_io_len, 0);
+		if (ret != page_io_len) {
 			/* should never happen! */
 			WARN_ON(1);
 			bio_put(bio);
@@ -150,8 +162,8 @@  int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 		bio_put(bio);
 		if (err)
 			goto errout;
-		lblk++;
-		pblk++;
+		pblk += page_nr_blks;
+		len -= page_nr_blks;
 	}
 	err = 0;
 errout: