diff mbox series

[1/3] mm/filemap: initialize fsdata with iocb->ki_flags

Message ID 20250421105026.19577-2-chentaotao@didiglobal.com
State Changes Requested
Headers show
Series [1/3] mm/filemap: initialize fsdata with iocb->ki_flags | expand

Commit Message

陈涛涛 Taotao Chen April 21, 2025, 10:50 a.m. UTC
From: Taotao Chen <chentaotao@didiglobal.com>

Initialize fsdata with &iocb->ki_flags to allow filesystems to check
iocb flags like IOCB_DONTCACHE during ->write_begin().

Signed-off-by: Taotao Chen <chentaotao@didiglobal.com>
---
 mm/filemap.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Theodore Ts'o May 14, 2025, 3:51 a.m. UTC | #1
On Mon, Apr 21, 2025 at 10:50:30AM +0000, 陈涛涛 Taotao Chen wrote:
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 7b90cbeb4a1a..9174b6310f0b 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -4087,7 +4087,11 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
>  		size_t offset;		/* Offset into folio */
>  		size_t bytes;		/* Bytes to write to folio */
>  		size_t copied;		/* Bytes copied from user */
> -		void *fsdata = NULL;
> +		/*
> +		 * Initialize fsdata with iocb flags pointer so that filesystems
> +		 * can check ki_flags (like IOCB_DONTCACHE) in write operations.
> +		 */
> +		void *fsdata = &iocb->ki_flags;

Unfortunately, this is't safe.  There may very well be code
paths which depend on fsdata being initialized to NULL before
calling write_begin().

In fact in the patch 2/3 in this series, ext4_write_end() will get
confused in the non-delayed allocation case, since ext4_write_begin()
doesn't force fsdata to be not be &iocb->ki_flags, and this will cause
ext4_write_end() to potentially get confused and do the wrong thing.

I understand that it would be a lot more inconvenient change the
function signature of write_begin() to pass through iocb->ki_fags via
a new parameter.  But I think that probably is the best way to go.

Cheers,

					- Ted
Matthew Wilcox May 14, 2025, 1:38 p.m. UTC | #2
On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote:
> I understand that it would be a lot more inconvenient change the
> function signature of write_begin() to pass through iocb->ki_fags via
> a new parameter.  But I think that probably is the best way to go.

I'd suggest that passing in iocb rather than file is the way to go.
Most callers of ->write_begin already pass NULL as the first argument so
would not need to change.  i915/gem passes a non-NULL file, but it only
operates on shmem and shmem does not use the file argument, so they can
pass NULL instead.  fs/buffer.c simply passes through the file passed
to write_begin and can be changed to pass through the iocb passed in.
exfat_extend_valid_size() has an iocb in its caller and can pass in the
iocb instead.  generic_perform_write() has an iocb.
Theodore Ts'o May 14, 2025, 2:23 p.m. UTC | #3
On Wed, May 14, 2025 at 02:38:05PM +0100, Matthew Wilcox wrote:
> On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote:
> > I understand that it would be a lot more inconvenient change the
> > function signature of write_begin() to pass through iocb->ki_fags via
> > a new parameter.  But I think that probably is the best way to go.
> 
> I'd suggest that passing in iocb rather than file is the way to go.
> Most callers of ->write_begin already pass NULL as the first argument so
> would not need to change.  i915/gem passes a non-NULL file, but it only
> operates on shmem and shmem does not use the file argument, so they can
> pass NULL instead.  fs/buffer.c simply passes through the file passed
> to write_begin and can be changed to pass through the iocb passed in.
> exfat_extend_valid_size() has an iocb in its caller and can pass in the
> iocb instead.  generic_perform_write() has an iocb.

Mmm, nice!  I agree, that's probably the way to go.

There might be some callers if write_begin() that might require some
restructing because they don't have an iocb.  For example,
shmem_pwrite() in drivers/gpu/i915/gem/i915_gem_shmem.c came up when I
did a quick grep.

     	      	     	    	     	     - Ted
Christoph Hellwig May 14, 2025, 3:06 p.m. UTC | #4
On Wed, May 14, 2025 at 02:38:05PM +0100, Matthew Wilcox wrote:
> On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote:
> > I understand that it would be a lot more inconvenient change the
> > function signature of write_begin() to pass through iocb->ki_fags via
> > a new parameter.  But I think that probably is the best way to go.
> 
> I'd suggest that passing in iocb rather than file is the way to go.
> Most callers of ->write_begin already pass NULL as the first argument so
> would not need to change.  i915/gem passes a non-NULL file, but it only
> operates on shmem and shmem does not use the file argument, so they can
> pass NULL instead.

i915/gem needs to stop using write_begin/end and just do an iter_write.
Someone who has the hardware and/or CI setup needs to figure out if
vfs_write and kernel_write is fine, or this is magic enough to skip
checks and protection and go straight to callig ->iter_write.

> fs/buffer.c simply passes through the file passed
> to write_begin and can be changed to pass through the iocb passed in.
> exfat_extend_valid_size() has an iocb in its caller and can pass in the
> iocb instead.  generic_perform_write() has an iocb.

And we really need to stop theading this through the address_space
ops because it's not a generic method but a callback for the file
system..
Taotao Chen May 18, 2025, 2:01 p.m. UTC | #5
Hi Christoph, Matthew, Ted

Thanks for the suggestions.

Replacing file with iocb in write_begin(), updating call sites,
and adjusting i915/gem usage makes a lot of sense. I’ll send a
v2 to reflect this change.

Thanks,  
Taotao
diff mbox series

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index 7b90cbeb4a1a..9174b6310f0b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4087,7 +4087,11 @@  ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
 		size_t offset;		/* Offset into folio */
 		size_t bytes;		/* Bytes to write to folio */
 		size_t copied;		/* Bytes copied from user */
-		void *fsdata = NULL;
+		/*
+		 * Initialize fsdata with iocb flags pointer so that filesystems
+		 * can check ki_flags (like IOCB_DONTCACHE) in write operations.
+		 */
+		void *fsdata = &iocb->ki_flags;
 
 		bytes = iov_iter_count(i);
 retry: