Message ID | 20250421105026.19577-2-chentaotao@didiglobal.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [1/3] mm/filemap: initialize fsdata with iocb->ki_flags | expand |
On Mon, Apr 21, 2025 at 10:50:30AM +0000, 陈涛涛 Taotao Chen wrote: > diff --git a/mm/filemap.c b/mm/filemap.c > index 7b90cbeb4a1a..9174b6310f0b 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -4087,7 +4087,11 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i) > size_t offset; /* Offset into folio */ > size_t bytes; /* Bytes to write to folio */ > size_t copied; /* Bytes copied from user */ > - void *fsdata = NULL; > + /* > + * Initialize fsdata with iocb flags pointer so that filesystems > + * can check ki_flags (like IOCB_DONTCACHE) in write operations. > + */ > + void *fsdata = &iocb->ki_flags; Unfortunately, this is't safe. There may very well be code paths which depend on fsdata being initialized to NULL before calling write_begin(). In fact in the patch 2/3 in this series, ext4_write_end() will get confused in the non-delayed allocation case, since ext4_write_begin() doesn't force fsdata to be not be &iocb->ki_flags, and this will cause ext4_write_end() to potentially get confused and do the wrong thing. I understand that it would be a lot more inconvenient change the function signature of write_begin() to pass through iocb->ki_fags via a new parameter. But I think that probably is the best way to go. Cheers, - Ted
On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote: > I understand that it would be a lot more inconvenient change the > function signature of write_begin() to pass through iocb->ki_fags via > a new parameter. But I think that probably is the best way to go. I'd suggest that passing in iocb rather than file is the way to go. Most callers of ->write_begin already pass NULL as the first argument so would not need to change. i915/gem passes a non-NULL file, but it only operates on shmem and shmem does not use the file argument, so they can pass NULL instead. fs/buffer.c simply passes through the file passed to write_begin and can be changed to pass through the iocb passed in. exfat_extend_valid_size() has an iocb in its caller and can pass in the iocb instead. generic_perform_write() has an iocb.
On Wed, May 14, 2025 at 02:38:05PM +0100, Matthew Wilcox wrote: > On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote: > > I understand that it would be a lot more inconvenient change the > > function signature of write_begin() to pass through iocb->ki_fags via > > a new parameter. But I think that probably is the best way to go. > > I'd suggest that passing in iocb rather than file is the way to go. > Most callers of ->write_begin already pass NULL as the first argument so > would not need to change. i915/gem passes a non-NULL file, but it only > operates on shmem and shmem does not use the file argument, so they can > pass NULL instead. fs/buffer.c simply passes through the file passed > to write_begin and can be changed to pass through the iocb passed in. > exfat_extend_valid_size() has an iocb in its caller and can pass in the > iocb instead. generic_perform_write() has an iocb. Mmm, nice! I agree, that's probably the way to go. There might be some callers if write_begin() that might require some restructing because they don't have an iocb. For example, shmem_pwrite() in drivers/gpu/i915/gem/i915_gem_shmem.c came up when I did a quick grep. - Ted
On Wed, May 14, 2025 at 02:38:05PM +0100, Matthew Wilcox wrote: > On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote: > > I understand that it would be a lot more inconvenient change the > > function signature of write_begin() to pass through iocb->ki_fags via > > a new parameter. But I think that probably is the best way to go. > > I'd suggest that passing in iocb rather than file is the way to go. > Most callers of ->write_begin already pass NULL as the first argument so > would not need to change. i915/gem passes a non-NULL file, but it only > operates on shmem and shmem does not use the file argument, so they can > pass NULL instead. i915/gem needs to stop using write_begin/end and just do an iter_write. Someone who has the hardware and/or CI setup needs to figure out if vfs_write and kernel_write is fine, or this is magic enough to skip checks and protection and go straight to callig ->iter_write. > fs/buffer.c simply passes through the file passed > to write_begin and can be changed to pass through the iocb passed in. > exfat_extend_valid_size() has an iocb in its caller and can pass in the > iocb instead. generic_perform_write() has an iocb. And we really need to stop theading this through the address_space ops because it's not a generic method but a callback for the file system..
Hi Christoph, Matthew, Ted Thanks for the suggestions. Replacing file with iocb in write_begin(), updating call sites, and adjusting i915/gem usage makes a lot of sense. I’ll send a v2 to reflect this change. Thanks, Taotao
diff --git a/mm/filemap.c b/mm/filemap.c index 7b90cbeb4a1a..9174b6310f0b 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -4087,7 +4087,11 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i) size_t offset; /* Offset into folio */ size_t bytes; /* Bytes to write to folio */ size_t copied; /* Bytes copied from user */ - void *fsdata = NULL; + /* + * Initialize fsdata with iocb flags pointer so that filesystems + * can check ki_flags (like IOCB_DONTCACHE) in write operations. + */ + void *fsdata = &iocb->ki_flags; bytes = iov_iter_count(i); retry: