[6/8] nowait aio: ext4

Submitted by Goldwyn Rodrigues on April 3, 2017, 6:53 p.m.

Details

Message ID 20170403185307.6243-7-rgoldwyn@suse.de
State New
Headers show

Commit Message

Goldwyn Rodrigues April 3, 2017, 6:53 p.m.
From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Return EAGAIN if any of the following checks fail for direct I/O:
 + i_rwsem is lockable
 + Writing beyond end of file (will trigger allocation)
 + Blocks are not allocated at the write location
---
 fs/ext4/file.c  | 48 +++++++++++++++++++++++++++++++-----------------
 fs/ext4/super.c |  6 +++---
 2 files changed, 34 insertions(+), 20 deletions(-)

Comments

Jan Kara April 4, 2017, 7:58 a.m.
On Mon 03-04-17 13:53:05, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> Return EAGAIN if any of the following checks fail for direct I/O:
>  + i_rwsem is lockable
>  + Writing beyond end of file (will trigger allocation)
>  + Blocks are not allocated at the write location

Patches seem to be missing your Signed-off-by tag...

> @@ -235,9 +237,21 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  
>  	iocb->private = &overwrite;
>  	/* Check whether we do a DIO overwrite or not */
> -	if (o_direct && ext4_should_dioread_nolock(inode) && !unaligned_aio &&
> -	    ext4_overwrite_io(inode, iocb->ki_pos, iov_iter_count(from)))
> -		overwrite = 1;
> +	if (o_direct && !unaligned_aio) {
> +		struct ext4_map_blocks map;
> +		if (ext4_blocks_mapped(inode, iocb->ki_pos,
> +				      iov_iter_count(from), &map)) {
> +	 		/* To exclude unwritten extents, we need to check
> +			 * m_flags.
> +			 */
> +			if (ext4_should_dioread_nolock(inode) &&
> +			    (map.m_flags & EXT4_MAP_MAPPED))
> +				overwrite = 1;
> +		} else if (iocb->ki_flags & IOCB_NOWAIT) {
> +			ret = -EAGAIN;
> +			goto out;
> +		}
> +	}

Actually, overwriting unwritten extents is relatively complex in ext4 as
well. In particular we need to start a transaction and split out the
written part of the extent. So I don't think we can easily support this
without blocking. As a result I'd keep the condition for IOCB_NOWAIT the
same as for overwrite IO.

> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -117,7 +117,7 @@ static struct file_system_type ext2_fs_type = {
>  	.name		= "ext2",
>  	.mount		= ext4_mount,
>  	.kill_sb	= kill_block_super,
> -	.fs_flags	= FS_REQUIRES_DEV,
> +	.fs_flags	= FS_REQUIRES_DEV | FS_NOWAIT,

FS_NOWAIT looks a bit too generic given these are filesystem feature flags.
Can we call it FS_NOWAIT_IO?

								Honza
Christoph Hellwig April 4, 2017, 8:41 a.m.
On Tue, Apr 04, 2017 at 09:58:53AM +0200, Jan Kara wrote:
> FS_NOWAIT looks a bit too generic given these are filesystem feature flags.
> Can we call it FS_NOWAIT_IO?

It's way to generic as it's a feature of the particular file_operations
instance.  But once we switch to using RWF_* we can just the existing
per-op feature checks for thos and the per-fs flag should just go away.
Goldwyn Rodrigues April 4, 2017, 6:41 p.m.
On 04/04/2017 03:41 AM, Christoph Hellwig wrote:
> On Tue, Apr 04, 2017 at 09:58:53AM +0200, Jan Kara wrote:
>> FS_NOWAIT looks a bit too generic given these are filesystem feature flags.
>> Can we call it FS_NOWAIT_IO?
> 
> It's way to generic as it's a feature of the particular file_operations
> instance.  But once we switch to using RWF_* we can just the existing
> per-op feature checks for thos and the per-fs flag should just go away.
> 

I am working on incorporating RWF_* flags. However, I am not sure how
RWF_* flags would get rid of FS_NOWAIT/FS_NOWAIT_IO. Since most of
"blocking" information is with the filesystem, it is a per-filesystem
flag to block out (EOPNOTSUPP) the filesystems which do not support it.
Christoph Hellwig April 10, 2017, 7:45 a.m.
On Tue, Apr 04, 2017 at 01:41:09PM -0500, Goldwyn Rodrigues wrote:
> I am working on incorporating RWF_* flags. However, I am not sure how
> RWF_* flags would get rid of FS_NOWAIT/FS_NOWAIT_IO. Since most of
> "blocking" information is with the filesystem, it is a per-filesystem
> flag to block out (EOPNOTSUPP) the filesystems which do not support it.

You need to check the flag in the actual read/write methods as the
support for features on Linux is not a per-file_system_type thing.
Jan Kara April 10, 2017, 12:37 p.m.
On Mon 10-04-17 00:45:39, Christoph Hellwig wrote:
> On Tue, Apr 04, 2017 at 01:41:09PM -0500, Goldwyn Rodrigues wrote:
> > I am working on incorporating RWF_* flags. However, I am not sure how
> > RWF_* flags would get rid of FS_NOWAIT/FS_NOWAIT_IO. Since most of
> > "blocking" information is with the filesystem, it is a per-filesystem
> > flag to block out (EOPNOTSUPP) the filesystems which do not support it.
> 
> You need to check the flag in the actual read/write methods as the
> support for features on Linux is not a per-file_system_type thing.

I don't understand here. Do you want that all filesystems support NOWAIT
direct IO? IMO that's not realistic and also not necessary. In reality
different filesystems support different sets or operations and we have a
precedens for that in various fallocate operations, rename exchange, or
O_TMPFILE support...

								Honza
Christoph Hellwig April 10, 2017, 2:39 p.m.
On Mon, Apr 10, 2017 at 02:37:50PM +0200, Jan Kara wrote:
> I don't understand here. Do you want that all filesystems support NOWAIT
> direct IO?

No.  Per-file_system_type is way to coarse grained.  All feature flag
needs to be per-file_operation at least for cases like ext4 with our
without extents (or journal) XFS v4 vs v5, different NFS versions, etc.

For RWF_* each file operation simply declares if the feature is
supported not by rejecting unknown ones.  FIEMAP does the same as do
a few other interfaces.
Jan Kara April 10, 2017, 3:13 p.m.
On Mon 10-04-17 07:39:43, Christoph Hellwig wrote:
> On Mon, Apr 10, 2017 at 02:37:50PM +0200, Jan Kara wrote:
> > I don't understand here. Do you want that all filesystems support NOWAIT
> > direct IO?
> 
> No.  Per-file_system_type is way to coarse grained.  All feature flag
> needs to be per-file_operation at least for cases like ext4 with our
> without extents (or journal) XFS v4 vs v5, different NFS versions, etc.

Ah, I see your point now. Thanks for patience. I think we could make this
work by making generic_file_write/read_iter() refuse NOWAIT IO with
EOPNOTSUPP and then only modify those few filesystems that implement their
own iter helpers and will not initially support NOWAIT IO. Sounds easy
enough.

								Honza

Patch hide | download patch | download mbox

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8210c1f..e223b9f 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -127,27 +127,22 @@  ext4_unaligned_aio(struct inode *inode, struct iov_iter *from, loff_t pos)
 	return 0;
 }
 
-/* Is IO overwriting allocated and initialized blocks? */
-static bool ext4_overwrite_io(struct inode *inode, loff_t pos, loff_t len)
+/* Are IO blocks allocated */
+static bool ext4_blocks_mapped(struct inode *inode, loff_t pos, loff_t len,
+				struct ext4_map_blocks *map)
 {
-	struct ext4_map_blocks map;
 	unsigned int blkbits = inode->i_blkbits;
 	int err, blklen;
 
 	if (pos + len > i_size_read(inode))
 		return false;
 
-	map.m_lblk = pos >> blkbits;
-	map.m_len = EXT4_MAX_BLOCKS(len, pos, blkbits);
-	blklen = map.m_len;
+	map->m_lblk = pos >> blkbits;
+	map->m_len = EXT4_MAX_BLOCKS(len, pos, blkbits);
+	blklen = map->m_len;
 
-	err = ext4_map_blocks(NULL, inode, &map, 0);
-	/*
-	 * 'err==len' means that all of the blocks have been preallocated,
-	 * regardless of whether they have been initialized or not. To exclude
-	 * unwritten extents, we need to check m_flags.
-	 */
-	return err == blklen && (map.m_flags & EXT4_MAP_MAPPED);
+	err = ext4_map_blocks(NULL, inode, map, 0);
+	return err == blklen;
 }
 
 static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
@@ -204,6 +199,7 @@  ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
 	int o_direct = iocb->ki_flags & IOCB_DIRECT;
+	int nowait = iocb->ki_flags & IOCB_NOWAIT;
 	int unaligned_aio = 0;
 	int overwrite = 0;
 	ssize_t ret;
@@ -216,7 +212,13 @@  ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		return ext4_dax_write_iter(iocb, from);
 #endif
 
-	inode_lock(inode);
+	if (o_direct && nowait) {
+		if (!inode_trylock(inode))
+			return -EAGAIN;
+	} else {
+		inode_lock(inode);
+	}
+
 	ret = ext4_write_checks(iocb, from);
 	if (ret <= 0)
 		goto out;
@@ -235,9 +237,21 @@  ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 
 	iocb->private = &overwrite;
 	/* Check whether we do a DIO overwrite or not */
-	if (o_direct && ext4_should_dioread_nolock(inode) && !unaligned_aio &&
-	    ext4_overwrite_io(inode, iocb->ki_pos, iov_iter_count(from)))
-		overwrite = 1;
+	if (o_direct && !unaligned_aio) {
+		struct ext4_map_blocks map;
+		if (ext4_blocks_mapped(inode, iocb->ki_pos,
+				      iov_iter_count(from), &map)) {
+	 		/* To exclude unwritten extents, we need to check
+			 * m_flags.
+			 */
+			if (ext4_should_dioread_nolock(inode) &&
+			    (map.m_flags & EXT4_MAP_MAPPED))
+				overwrite = 1;
+		} else if (iocb->ki_flags & IOCB_NOWAIT) {
+			ret = -EAGAIN;
+			goto out;
+		}
+	}
 
 	ret = __generic_file_write_iter(iocb, from);
 	inode_unlock(inode);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a9448db..e1d424a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -117,7 +117,7 @@  static struct file_system_type ext2_fs_type = {
 	.name		= "ext2",
 	.mount		= ext4_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV | FS_NOWAIT,
 };
 MODULE_ALIAS_FS("ext2");
 MODULE_ALIAS("ext2");
@@ -132,7 +132,7 @@  static struct file_system_type ext3_fs_type = {
 	.name		= "ext3",
 	.mount		= ext4_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV | FS_NOWAIT,
 };
 MODULE_ALIAS_FS("ext3");
 MODULE_ALIAS("ext3");
@@ -5623,7 +5623,7 @@  static struct file_system_type ext4_fs_type = {
 	.name		= "ext4",
 	.mount		= ext4_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV | FS_NOWAIT,
 };
 MODULE_ALIAS_FS("ext4");