Patchwork [1/7] ext4: ext4_inode_info diet

login
register
mail settings
Submitter Dmitri Monakho
Date Sept. 9, 2012, 5:27 p.m.
Message ID <1347211634-11509-2-git-send-email-dmonakhov@openvz.org>
Download mbox | patch
Permalink /patch/182636/
State Superseded
Headers show

Comments

Dmitri Monakho - Sept. 9, 2012, 5:27 p.m.
Generic inode has unused i_private pointer which may be used as cur_aio_dio
storage.

TODO: If cur_aio_dio will be passed as an argument to get_block_t this allow
      to have concurent AIO_DIO requests.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/ext4.h    |    5 +++--
 fs/ext4/extents.c |    4 ++--
 fs/ext4/inode.c   |    6 +++---
 fs/ext4/super.c   |    1 -
 4 files changed, 8 insertions(+), 8 deletions(-)
Zheng Liu - Sept. 13, 2012, 10:50 a.m.
On Sun, Sep 09, 2012 at 09:27:08PM +0400, Dmitry Monakhov wrote:
> Generic inode has unused i_private pointer which may be used as cur_aio_dio
> storage.
> 
> TODO: If cur_aio_dio will be passed as an argument to get_block_t this allow
>       to have concurent AIO_DIO requests.

Out of curiosity, could you please describe your idea about concurrent
AIO_DIO requests in get_block_t?  Thanks!

> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

Looks good to me.  You can add:
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>

Regards,
Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitri Monakho - Sept. 13, 2012, 11:15 a.m.
On Thu, 13 Sep 2012 18:50:22 +0800, Zheng Liu <gnehzuil.liu@gmail.com> wrote:
> On Sun, Sep 09, 2012 at 09:27:08PM +0400, Dmitry Monakhov wrote:
> > Generic inode has unused i_private pointer which may be used as cur_aio_dio
> > storage.
> > 
> > TODO: If cur_aio_dio will be passed as an argument to get_block_t this allow
> >       to have concurent AIO_DIO requests.
> 
> Out of curiosity, could you please describe your idea about concurrent
> AIO_DIO requests in get_block_t?  Thanks!
Current buffer.c API layering looks sub-optimal

->xxx_fs_routine: May create different contexts
   ->generic_buffer_methods(inode, bh..)
     ->xxx_fs_get_block(inode, bh,...): There is no efficient way to pass fscontext 
Both xxx_fs_routine and xxx_fs_get_block are fs specific, but
the only way to pass fscontext down to get_block is to pass it by
attaching it to inode, which make it implicit serialization point.

I just want to add fsprivate argument to get_block_t callback similar
write_begin/write_end and iocb->private, so filesystem will able to pass
it to it's get_block callback.
 
> 
> > 
> > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> Looks good to me.  You can add:
> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
> 
> Regards,
> Zheng
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Sept. 15, 2012, 3:53 p.m.
On Thu, Sep 13, 2012 at 03:15:40PM +0400, Dmitry Monakhov wrote:
> Current buffer.c API layering looks sub-optimal
> 
> ->xxx_fs_routine: May create different contexts
>    ->generic_buffer_methods(inode, bh..)
>      ->xxx_fs_get_block(inode, bh,...): There is no efficient way to pass fscontext 
> Both xxx_fs_routine and xxx_fs_get_block are fs specific, but
> the only way to pass fscontext down to get_block is to pass it by
> attaching it to inode, which make it implicit serialization point.
> 
> I just want to add fsprivate argument to get_block_t callback similar
> write_begin/write_end and iocb->private, so filesystem will able to pass
> it to it's get_block callback.

If we're going to change the function prototype for get_block_t, it
might also be good to fix a long-standing ugliness with this interface
which is that the get_block() interface does two very different things
when bh->b_size is greater a single block.  If b_size is a greater a
block, then instead of bh being a "real" buffer_head, it's a pseudo
map_bh.  This is annoying because map_bh is generally allocated on the
stack, but most of the fields are unused, since it's not really a
_real_ buffer_head, but just a countainer for a handful of fields used
by the AIO/DIO and mpage.c functions.

Instead of changing get_block(), which would require lots of changes
all over the kernel, what I'd suggest doing instead is creating a new
function which I'd propose calling get_map() which fills in a new
structure which replaces map_bh, and which includes the aio_dio
structure.  It could be made mandatory for the few file systems which
support DIO, but that way we don't havec to change the dozens of other
file systems which don't use the DIO code paths.

Regards,

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f9024a6..b3f3c55 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -912,8 +912,6 @@  struct ext4_inode_info {
 	struct list_head i_completed_io_list;
 	spinlock_t i_completed_io_lock;
 	atomic_t i_ioend_count;	/* Number of outstanding io_end structs */
-	/* current io_end structure for async DIO write*/
-	ext4_io_end_t *cur_aio_dio;
 	atomic_t i_aiodio_unwritten; /* Nr. of inflight conversions pending */
 
 	spinlock_t i_block_reservation_lock;
@@ -929,6 +927,9 @@  struct ext4_inode_info {
 	__u32 i_csum_seed;
 };
 
+/* current io_end structure for async DIO write */
+#define EXT4_CUR_AIO_DIO(inode) (inode)->i_private
+
 /*
  * File system states
  */
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 10f0afd..e993879 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3644,7 +3644,7 @@  ext4_ext_handle_uninitialized_extents(handle_t *handle, struct inode *inode,
 {
 	int ret = 0;
 	int err = 0;
-	ext4_io_end_t *io = EXT4_I(inode)->cur_aio_dio;
+	ext4_io_end_t *io = (ext4_io_end_t*) EXT4_CUR_AIO_DIO(inode);
 
 	ext_debug("ext4_ext_handle_uninitialized_extents: inode %lu, logical "
 		  "block %llu, max_blocks %u, flags %x, allocated %u\n",
@@ -3902,7 +3902,7 @@  int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 	unsigned int allocated = 0, offset = 0;
 	unsigned int allocated_clusters = 0;
 	struct ext4_allocation_request ar;
-	ext4_io_end_t *io = EXT4_I(inode)->cur_aio_dio;
+	ext4_io_end_t *io = (ext4_io_end_t*) EXT4_CUR_AIO_DIO(inode);
 	ext4_lblk_t cluster_offset;
 
 	ext_debug("blocks %u/%u requested for inode %lu\n",
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d12d30e..202ae3f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3060,7 +3060,7 @@  static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
 		 * hook to the iocb.
  		 */
 		iocb->private = NULL;
-		EXT4_I(inode)->cur_aio_dio = NULL;
+		EXT4_CUR_AIO_DIO(inode) = NULL;
 		if (!is_sync_kiocb(iocb)) {
 			ext4_io_end_t *io_end =
 				ext4_init_io_end(inode, GFP_NOFS);
@@ -3077,7 +3077,7 @@  static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
 			 * is a unwritten extents needs to be converted
 			 * when IO is completed.
 			 */
-			EXT4_I(inode)->cur_aio_dio = iocb->private;
+			EXT4_CUR_AIO_DIO(inode) = iocb->private;
 		}
 
 		if (overwrite)
@@ -3097,7 +3097,7 @@  static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
 						 NULL,
 						 DIO_LOCKING);
 		if (iocb->private)
-			EXT4_I(inode)->cur_aio_dio = NULL;
+			EXT4_CUR_AIO_DIO(inode) = NULL;
 		/*
 		 * The io_end structure takes a reference to the inode,
 		 * that structure needs to be destroyed and the
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4a7092b..a6904b3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -967,7 +967,6 @@  static struct inode *ext4_alloc_inode(struct super_block *sb)
 	ei->jinode = NULL;
 	INIT_LIST_HEAD(&ei->i_completed_io_list);
 	spin_lock_init(&ei->i_completed_io_lock);
-	ei->cur_aio_dio = NULL;
 	ei->i_sync_tid = 0;
 	ei->i_datasync_tid = 0;
 	atomic_set(&ei->i_ioend_count, 0);