
[v4,6/9] f2fs: don't allow DIO reads but not DIO writes

Message ID 20220722071228.146690-7-ebiggers@kernel.org
State Superseded
Series make statx() return DIO alignment information

Commit Message

Eric Biggers July 22, 2022, 7:12 a.m. UTC
From: Eric Biggers <ebiggers@google.com>

Currently, if an f2fs filesystem is mounted with the mode=lfs and
io_bits mount options, DIO reads are allowed but DIO writes are not.
Allowing DIO reads but not DIO writes is an unusual restriction, which
is likely to be surprising to applications, namely any application that
both reads and writes from a file (using O_DIRECT).  This behavior is
also incompatible with the proposed STATX_DIOALIGN extension to statx.
Given this, let's drop the support for DIO reads in this configuration.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/f2fs/file.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Jaegeuk Kim July 24, 2022, 2:01 a.m. UTC | #1
On 07/22, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Currently, if an f2fs filesystem is mounted with the mode=lfs and
> io_bits mount options, DIO reads are allowed but DIO writes are not.
> Allowing DIO reads but not DIO writes is an unusual restriction, which
> is likely to be surprising to applications, namely any application that
> both reads and writes from a file (using O_DIRECT).  This behavior is
> also incompatible with the proposed STATX_DIOALIGN extension to statx.
> Given this, let's drop the support for DIO reads in this configuration.

IIRC, we allowed DIO reads because applications complained about lower
performance. So I'm afraid this change will cause more confusion for users.
Could you please apply the new behavior only for STATX_DIOALIGN?

> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  fs/f2fs/file.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 5e5c97fccfb4ee..ad0212848a1ab9 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -823,7 +823,6 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
>  				struct kiocb *iocb, struct iov_iter *iter)
>  {
>  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> -	int rw = iov_iter_rw(iter);
>  
>  	if (!fscrypt_dio_supported(inode))
>  		return true;
> @@ -841,7 +840,7 @@ static inline bool f2fs_force_buffered_io(struct inode *inode,
>  	 */
>  	if (f2fs_sb_has_blkzoned(sbi))
>  		return true;
> -	if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
> +	if (f2fs_lfs_mode(sbi)) {
>  		if (block_unaligned_IO(inode, iocb, iter))
>  			return true;
>  		if (F2FS_IO_ALIGNED(sbi))
> -- 
> 2.37.0
Eric Biggers July 25, 2022, 6:12 p.m. UTC | #2
On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> On 07/22, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > is likely to be surprising to applications, namely any application that
> > both reads and writes from a file (using O_DIRECT).  This behavior is
> > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > Given this, let's drop the support for DIO reads in this configuration.
> 
> IIRC, we allowed DIO reads because applications complained about lower
> performance. So I'm afraid this change will cause more confusion for users.
> Could you please apply the new behavior only for STATX_DIOALIGN?
> 

Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
weird case where DIO reads are allowed but not DIO writes.  So the question is
whether this case actually matters, in which case we should make STATX_DIOALIGN
distinguish between DIO reads and DIO writes, or whether it's some odd edge case
that doesn't really matter, in which case we could just fix it or make
STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
insight here.  What sort of applications want DIO reads but not DIO writes?
Is this common at all?

- Eric
Andreas Dilger July 25, 2022, 11:58 p.m. UTC | #3
On Jul 25, 2022, at 12:12 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> 
> On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
>> On 07/22, Eric Biggers wrote:
>>> From: Eric Biggers <ebiggers@google.com>
>>> 
>>> Currently, if an f2fs filesystem is mounted with the mode=lfs and
>>> io_bits mount options, DIO reads are allowed but DIO writes are not.
>>> Allowing DIO reads but not DIO writes is an unusual restriction, which
>>> is likely to be surprising to applications, namely any application that
>>> both reads and writes from a file (using O_DIRECT).  This behavior is
>>> also incompatible with the proposed STATX_DIOALIGN extension to statx.
>>> Given this, let's drop the support for DIO reads in this configuration.
>> 
>> IIRC, we allowed DIO reads because applications complained about lower
>> performance. So I'm afraid this change will cause more confusion for users.
>> Could you please apply the new behavior only for STATX_DIOALIGN?
>> 
> 
> Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> weird case where DIO reads are allowed but not DIO writes.  So the question is
> whether this case actually matters, in which case we should make STATX_DIOALIGN
> distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> that doesn't really matter, in which case we could just fix it or make
> STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> insight here.  What sort of applications want DIO reads but not DIO writes?
> Is this common at all?

I don't think this is f2fs related, but some backup applications I'm aware
of are using DIO reads to avoid polluting the page cache when reading large
numbers of files. They don't care about DIO writes, since that is usually
slower than async writes due to the sync before returning from the syscall.

Also, IMHO it doesn't make sense to remove useful functionality because the
new STATX_DIOALIGN fields don't handle this.  At worst the application will
still get an error when trying a DIO write, but in most cases they will
not use the brand new STATX call in the first place, and if this is documented
then any application that starts to use it should be able to handle it.

Cheers, Andreas
Jaegeuk Kim July 31, 2022, 3:08 a.m. UTC | #4
On 07/25, Eric Biggers wrote:
> On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > On 07/22, Eric Biggers wrote:
> > > From: Eric Biggers <ebiggers@google.com>
> > > 
> > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > is likely to be surprising to applications, namely any application that
> > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > Given this, let's drop the support for DIO reads in this configuration.
> > 
> > IIRC, we allowed DIO reads because applications complained about lower
> > performance. So I'm afraid this change will cause more confusion for users.
> > Could you please apply the new behavior only for STATX_DIOALIGN?
> > 
> 
> Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> weird case where DIO reads are allowed but not DIO writes.  So the question is
> whether this case actually matters, in which case we should make STATX_DIOALIGN
> distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> that doesn't really matter, in which case we could just fix it or make
> STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> insight here.  What sort of applications want DIO reads but not DIO writes?
> Is this common at all?

I think there's no specific application to use the LFS mode at this
moment, but I'd like to allow DIO read for zoned device which will be
used for Android devices.

> 
> - Eric
Eric Biggers Aug. 16, 2022, 12:55 a.m. UTC | #5
On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> On 07/25, Eric Biggers wrote:
> > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > On 07/22, Eric Biggers wrote:
> > > > From: Eric Biggers <ebiggers@google.com>
> > > > 
> > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > is likely to be surprising to applications, namely any application that
> > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > Given this, let's drop the support for DIO reads in this configuration.
> > > 
> > > IIRC, we allowed DIO reads because applications complained about lower
> > > performance. So I'm afraid this change will cause more confusion for users.
> > > Could you please apply the new behavior only for STATX_DIOALIGN?
> > > 
> > 
> > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > that doesn't really matter, in which case we could just fix it or make
> > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > insight here.  What sort of applications want DIO reads but not DIO writes?
> > Is this common at all?
> 
> I think there's no specific application to use the LFS mode at this
> moment, but I'd like to allow DIO read for zoned device which will be
> used for Android devices.
> 

So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
be useless on all Android devices?  That sounds undesirable.  Are you sure that
supporting DIO reads but not DIO writes actually works?  Does it not cause
problems for existing applications?

What we need to do is make a decision about whether this means we should build
in a stx_dio_direction field (indicating no support / readonly support /
writeonly support / readwrite support) into the API from the beginning.  If we
don't do that, then I don't think we could simply add such a field later, as the
statx_dio_*_align fields will have already been assigned their meaning.  I think
we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
statx_dio_ro_*_align fields.  That seems uglier than building a directional
indicator into the API from the beginning.  On the other hand, requiring all
programs to check stx_dio_direction would add complexity to using the API.

Any thoughts on this?

- Eric
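
To make the trade-off above concrete, here is a minimal userspace sketch of what a directional indicator could look like to a caller. Everything in it is hypothetical: the enum values, the struct dio_caps type, and the dio_request_ok() helper are illustration only and are not part of any posted patch. The two alignment members stand in for the proposed statx_dio_*_align fields, and a value of 0 is assumed to mean that DIO is unsupported.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of the proposed stx_dio_direction field. */
enum dio_direction {
	DIO_DIR_NONE  = 0,	/* no DIO support */
	DIO_DIR_READ  = 1,	/* DIO reads supported */
	DIO_DIR_WRITE = 2,	/* DIO writes supported */
	DIO_DIR_RW    = DIO_DIR_READ | DIO_DIR_WRITE,
};

/* Hypothetical container for the values a caller would get back. */
struct dio_caps {
	uint32_t mem_align;	/* required buffer address alignment, in bytes */
	uint32_t offset_align;	/* required offset and length alignment, in bytes */
	uint32_t direction;	/* bitmask of enum dio_direction values */
};

/*
 * Return true if a request may be issued as direct I/O: the direction
 * must be supported, and the buffer address, file offset and length
 * must all satisfy the reported alignments.
 */
static bool dio_request_ok(const struct dio_caps *caps, bool is_write,
			   uintptr_t buf_addr, uint64_t offset, uint64_t len)
{
	uint32_t need = is_write ? DIO_DIR_WRITE : DIO_DIR_READ;

	if (!(caps->direction & need))
		return false;
	/* Assumption: zero alignment means "DIO unsupported". */
	if (caps->mem_align == 0 || caps->offset_align == 0)
		return false;
	if (buf_addr % caps->mem_align)
		return false;
	return offset % caps->offset_align == 0 &&
	       len % caps->offset_align == 0;
}
```

The extra direction test is the added complexity mentioned above: every caller has to perform it before trusting the alignment values, whereas without the field a single zero check on the alignments would be enough.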
Dave Chinner Aug. 16, 2022, 9:03 a.m. UTC | #6
On Mon, Aug 15, 2022 at 05:55:45PM -0700, Eric Biggers wrote:
> On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> > On 07/25, Eric Biggers wrote:
> > > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > > On 07/22, Eric Biggers wrote:
> > > > > From: Eric Biggers <ebiggers@google.com>
> > > > > 
> > > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > > is likely to be surprising to applications, namely any application that
> > > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > > Given this, let's drop the support for DIO reads in this configuration.
> > > > 
> > > > IIRC, we allowed DIO reads because applications complained about lower
> > > > performance. So I'm afraid this change will cause more confusion for users.
> > > > Could you please apply the new behavior only for STATX_DIOALIGN?
> > > > 
> > > 
> > > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > > that doesn't really matter, in which case we could just fix it or make
> > > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > > insight here.  What sort of applications want DIO reads but not DIO writes?
> > > Is this common at all?
> > 
> > I think there's no specific application to use the LFS mode at this
> > moment, but I'd like to allow DIO read for zoned device which will be
> > used for Android devices.
> > 
> 
> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> be useless on all Android devices?  That sounds undesirable.  Are you sure that
> supporting DIO reads but not DIO writes actually works?  Does it not cause
> problems for existing applications?

What purpose does DIO in only one direction actually serve? All it
means is that we're forcibly mixing buffered and direct IO to the
same file and that simply never ends well from a data coherency POV.

Hence I'd suggest that mixing DIO reads and buffered writes like
this ends up exposing users to the worst of both worlds - all of the
problems with none of the benefits...

> What we need to do is make a decision about whether this means we should build
> in a stx_dio_direction field (indicating no support / readonly support /
> writeonly support / readwrite support) into the API from the beginning.  If we
> don't do that, then I don't think we could simply add such a field later, as the
> statx_dio_*_align fields will have already been assigned their meaning.  I think
> we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
> statx_dio_ro_*_align fields.  That seems uglier than building a directional
> indicator into the API from the beginning.  On the other hand, requiring all
> programs to check stx_dio_direction would add complexity to using the API.
> 
> Any thoughts on this?

Decide whether partial, single direction DIO serves a useful purpose
before trying to work out what is needed in the API to indicate that
this sort of crazy will be supported....

Cheers,

Dave.
Andreas Dilger Aug. 16, 2022, 4:42 p.m. UTC | #7
On Aug 16, 2022, at 3:03 AM, Dave Chinner <david@fromorbit.com> wrote:
> 
> On Mon, Aug 15, 2022 at 05:55:45PM -0700, Eric Biggers wrote:
>> On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
>>> On 07/25, Eric Biggers wrote:
>>>> On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
>>>>> On 07/22, Eric Biggers wrote:
>>>>>> From: Eric Biggers <ebiggers@google.com>
>>>>>> 
>>>>>> Currently, if an f2fs filesystem is mounted with the mode=lfs and
>>>>>> io_bits mount options, DIO reads are allowed but DIO writes are not.
>>>>>> Allowing DIO reads but not DIO writes is an unusual restriction, which
>>>>>> is likely to be surprising to applications, namely any application that
>>>>>> both reads and writes from a file (using O_DIRECT).  This behavior is
>>>>>> also incompatible with the proposed STATX_DIOALIGN extension to statx.
>>>>>> Given this, let's drop the support for DIO reads in this configuration.
>>>>> 
>>>>> IIRC, we allowed DIO reads because applications complained about lower
>>>>> performance. So I'm afraid this change will cause more confusion for users.
>>>>> Could you please apply the new behavior only for STATX_DIOALIGN?
>>>>> 
>>>> 
>>>> Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
>>>> weird case where DIO reads are allowed but not DIO writes.  So the question is
>>>> whether this case actually matters, in which case we should make STATX_DIOALIGN
>>>> distinguish between DIO reads and DIO writes, or whether it's some odd edge case
>>>> that doesn't really matter, in which case we could just fix it or make
>>>> STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
>>>> insight here.  What sort of applications want DIO reads but not DIO writes?
>>>> Is this common at all?
>>> 
>>> I think there's no specific application to use the LFS mode at this
>>> moment, but I'd like to allow DIO read for zoned device which will be
>>> used for Android devices.
>>> 
>> 
>> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
>> be useless on all Android devices?  That sounds undesirable.  Are you sure that
>> supporting DIO reads but not DIO writes actually works?  Does it not cause
>> problems for existing applications?
> 
> What purpose does DIO in only one direction actually serve? All it
> means is that we're forcibly mixing buffered and direct IO to the
> same file and that simply never ends well from a data coherency POV.
> 
> Hence I'd suggest that mixing DIO reads and buffered writes like
> this ends up exposing users to the worst of both worlds - all of the
> problems with none of the benefits...
> 
>> What we need to do is make a decision about whether this means we should
>> build in a stx_dio_direction field (indicating no support / readonly
>> support / writeonly support / readwrite support) into the API from the
>> beginning.  If we don't do that, then I don't think we could simply add
>> such a field later, as the statx_dio_*_align fields will have already
>> been assigned their meaning.  I think we'd instead have to "duplicate"
>> the API, with STATX_DIOROALIGN and statx_dio_ro_*_align fields.  That
>> seems uglier than building a directional indicator into the API from the
>> beginning.  On the other hand, requiring all programs to check
>> stx_dio_direction would add complexity to using the API.
>> 
>> Any thoughts on this?
> 
> Decide whether partial, single direction DIO serves a useful purpose
> before trying to work out what is needed in the API to indicate that
> this sort of crazy will be supported....

Using read-only O_DIRECT makes sense for backup and other filesystem
scanning tools that don't want to pollute the page cache of a system
(which may be in use by other programs) while reading many files once.

Using interfaces like posix_fadvise(FADV_DONTNEED) to drop file cache
afterward is both a hassle and problematic when reading very large files
that would push out more important pages from cache before the large
file's pages can be dropped.


IMHO, this whole discussion is putting the cart before the horse.
Changing existing (and useful) IO behavior to accommodate an API that
nobody has ever used, and is unlikely to even be widely used, doesn't
make sense to me.  Most applications won't check or care about the new
DIO size fields, since they've lived this long without statx() returning
this info, and will just pick a "large enough" size (4KB, 1MB, whatever)
that gives them the performance they need.  They *WILL* care if the app
is suddenly unable to read data from a file in ways that have worked for
a long time.

Even if apps are modified to check these new DIO size fields, and then
try to DIO write to a file in f2fs that doesn't allow it, then f2fs will
return an error, which is what it would have done without the statx()
changes, so no harm done AFAICS.

Even with a more-complex DIO status return that handles a "direction"
field (which IMHO is needlessly complex), there is always the potential
for a TOCTOU race where a file changes between checking and access, so
the userspace code would need to handle this.

Cheers, Andreas
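
The backup-style access pattern described above, together with the point that userspace has to cope with a DIO error anyway, could be sketched as follows. This is only an illustration under stated assumptions: scan_fd() and scan_file() are hypothetical helper names, and the 1 MiB buffer size and 4096-byte alignment are guesses at a "large enough" size, not values reported by any filesystem. The sketch tries O_DIRECT first and, if the open or a read fails, rereads the file through the page cache and drops the cached pages with posix_fadvise().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* If the headers do not expose O_DIRECT, degrade to buffered reads. */
#ifndef O_DIRECT
#define O_DIRECT 0
#endif

#define SCAN_BUF_SIZE	(1 << 20)	/* assumed "large enough" read size */
#define SCAN_BUF_ALIGN	4096		/* assumed buffer alignment */

/* Read fd to EOF; return the number of bytes read, or -1 on error. */
static long long scan_fd(int fd, void *buf)
{
	long long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, SCAN_BUF_SIZE);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return total;
		total += n;
	}
}

/* Read a file once, preferring direct I/O over the page cache. */
static long long scan_file(const char *path)
{
	long long total = -1;
	void *buf;
	int fd;

	if (posix_memalign(&buf, SCAN_BUF_ALIGN, SCAN_BUF_SIZE) != 0)
		return -1;

	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd >= 0) {
		total = scan_fd(fd, buf);
		close(fd);
	}
	if (total < 0) {
		/* DIO open or read failed: start over with buffered I/O. */
		fd = open(path, O_RDONLY);
		if (fd >= 0) {
			total = scan_fd(fd, buf);
			/* Best-effort request to drop the pages just read. */
			posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
			close(fd);
		}
	}
	free(buf);
	return total;
}
```

The fallback path is where the drawback noted above shows up: the pages are dropped only after the whole file has been read, so a very large file can still evict more useful pages before posix_fadvise() runs.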
Eric Biggers Aug. 19, 2022, 11:09 p.m. UTC | #8
On Tue, Aug 16, 2022 at 10:42:29AM -0600, Andreas Dilger wrote:
> 
> IMHO, this whole discussion is putting the cart before the horse.
> Changing existing (and useful) IO behavior to accommodate an API that
> nobody has ever used, and is unlikely to even be widely used, doesn't
> make sense to me.  Most applications won't check or care about the new
> DIO size fields, since they've lived this long without statx() returning
> this info, and will just pick a "large enough" size (4KB, 1MB, whatever)
> that gives them the performance they need.  They *WILL* care if the app
> is suddenly unable to read data from a file in ways that have worked for
> a long time.
> 
> Even if apps are modified to check these new DIO size fields, and then
> try to DIO write to a file in f2fs that doesn't allow it, then f2fs will
> return an error, which is what it would have done without the statx()
> changes, so no harm done AFAICS.
> 
> Even with a more-complex DIO status return that handles a "direction"
> field (which IMHO is needlessly complex), there is always the potential
> for a TOCTOU race where a file changes between checking and access, so
> the userspace code would need to handle this.
> 

I'm having trouble making sense of your argument here; you seem to be saying
that STATX_DIOALIGN isn't useful, so it doesn't matter if we design it
correctly?  That line of reasoning is concerning, as it's certainly intended to
be useful, and if it's not useful there's no point in adding it.

Are there any specific concerns that you have, besides TOCTOU races and the lack
of support for read-only DIO?

I don't think that TOCTOU races are a real concern here.  Generally DIO
constraints would only change if the application doing DIO intentionally does
something to the file, or if there are changes that involve the filesystem being
taken offline, e.g. the filesystem being mounted with significantly different
options or being moved to a different block device.  And, well, everything else
in stat()/statx() is subject to TOCTOU as well, but is still used...

- Eric
Jaegeuk Kim Aug. 20, 2022, 12:06 a.m. UTC | #9
On 08/15, Eric Biggers wrote:
> On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> > On 07/25, Eric Biggers wrote:
> > > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > > On 07/22, Eric Biggers wrote:
> > > > > From: Eric Biggers <ebiggers@google.com>
> > > > > 
> > > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > > is likely to be surprising to applications, namely any application that
> > > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > > Given this, let's drop the support for DIO reads in this configuration.
> > > > 
> > > > IIRC, we allowed DIO reads because applications complained about lower
> > > > performance. So I'm afraid this change will cause more confusion for users.
> > > > Could you please apply the new behavior only for STATX_DIOALIGN?
> > > > 
> > > 
> > > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > > that doesn't really matter, in which case we could just fix it or make
> > > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > > insight here.  What sort of applications want DIO reads but not DIO writes?
> > > Is this common at all?
> > 
> > I think there's no specific application to use the LFS mode at this
> > moment, but I'd like to allow DIO read for zoned device which will be
> > used for Android devices.
> > 
> 
> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> be useless on all Android devices?  That sounds undesirable. 

Do you have a plan to adopt STATX_DIOALIGN in android?

> Are you sure that
> supporting DIO reads but not DIO writes actually works?  Does it not cause
> problems for existing applications?

I haven't heard of any issues so far.

> 
> What we need to do is make a decision about whether this means we should build
> in a stx_dio_direction field (indicating no support / readonly support /
> writeonly support / readwrite support) into the API from the beginning.  If we
> don't do that, then I don't think we could simply add such a field later, as the
> statx_dio_*_align fields will have already been assigned their meaning.  I think
> we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
> statx_dio_ro_*_align fields.  That seems uglier than building a directional
> indicator into the API from the beginning.  On the other hand, requiring all
> programs to check stx_dio_direction would add complexity to using the API.
> 
> Any thoughts on this?

I haven't seen the details of the implementation, though. Why not support it
only if the filesystem has the same DIO policy for reads and writes?

> 
> - Eric
Eric Biggers Aug. 20, 2022, 12:33 a.m. UTC | #10
On Fri, Aug 19, 2022 at 05:06:06PM -0700, Jaegeuk Kim wrote:
> On 08/15, Eric Biggers wrote:
> > On Sat, Jul 30, 2022 at 08:08:26PM -0700, Jaegeuk Kim wrote:
> > > On 07/25, Eric Biggers wrote:
> > > > On Sat, Jul 23, 2022 at 07:01:59PM -0700, Jaegeuk Kim wrote:
> > > > > On 07/22, Eric Biggers wrote:
> > > > > > From: Eric Biggers <ebiggers@google.com>
> > > > > > 
> > > > > > Currently, if an f2fs filesystem is mounted with the mode=lfs and
> > > > > > io_bits mount options, DIO reads are allowed but DIO writes are not.
> > > > > > Allowing DIO reads but not DIO writes is an unusual restriction, which
> > > > > > is likely to be surprising to applications, namely any application that
> > > > > > both reads and writes from a file (using O_DIRECT).  This behavior is
> > > > > > also incompatible with the proposed STATX_DIOALIGN extension to statx.
> > > > > > Given this, let's drop the support for DIO reads in this configuration.
> > > > > 
> > > > > IIRC, we allowed DIO reads because applications complained about lower
> > > > > performance. So I'm afraid this change will cause more confusion for users.
> > > > > Could you please apply the new behavior only for STATX_DIOALIGN?
> > > > > 
> > > > 
> > > > Well, the issue is that the proposed STATX_DIOALIGN fields cannot represent this
> > > > weird case where DIO reads are allowed but not DIO writes.  So the question is
> > > > whether this case actually matters, in which case we should make STATX_DIOALIGN
> > > > distinguish between DIO reads and DIO writes, or whether it's some odd edge case
> > > > that doesn't really matter, in which case we could just fix it or make
> > > > STATX_DIOALIGN report that DIO is unsupported.  I was hoping that you had some
> > > > insight here.  What sort of applications want DIO reads but not DIO writes?
> > > > Is this common at all?
> > > 
> > > I think there's no specific application to use the LFS mode at this
> > > moment, but I'd like to allow DIO read for zoned device which will be
> > > used for Android devices.
> > > 
> > 
> > So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> > be useless on all Android devices?  That sounds undesirable. 
> 
> Do you have a plan to adopt STATX_DIOALIGN in android?

Nothing specific, but statx() is among the system calls that are supported by
Android's libc and that apps are allowed to use.  So STATX_DIOALIGN would become
available as well.  I'd prefer if it actually worked properly if apps, or
Android system components, do actually try to use it (or need to use it)...

> > What we need to do is make a decision about whether this means we should build
> > in a stx_dio_direction field (indicating no support / readonly support /
> > writeonly support / readwrite support) into the API from the beginning.  If we
> > don't do that, then I don't think we could simply add such a field later, as the
> > statx_dio_*_align fields will have already been assigned their meaning.  I think
> > we'd instead have to "duplicate" the API, with STATX_DIOROALIGN and
> > statx_dio_ro_*_align fields.  That seems uglier than building a directional
> > indicator into the API from the beginning.  On the other hand, requiring all
> > programs to check stx_dio_direction would add complexity to using the API.
> > 
> > Any thoughts on this?
> 
> I haven't seen the details of the implementation, though. Why not support it
> only if the filesystem has the same DIO policy for reads and writes?

As I've mentioned, we could of course make STATX_DIOALIGN report that DIO is
unsupported when the DIO support is read-only.

The thing that confuses me based on the responses so far is that there seem to
be two camps of people: (1) people who really want STATX_DIOALIGN, and who don't
think that read-only DIO support should exist so they don't want STATX_DIOALIGN
to support it; and (2) people who feel that read-only DIO support is perfectly
reasonable and useful, and who don't care whether STATX_DIOALIGN supports it
because they don't care about STATX_DIOALIGN in the first place.

While both camps seem to agree that STATX_DIOALIGN shouldn't support read-only
DIO, it is for totally contradictory reasons, so it's not very convincing.  We
should ensure that we have rock-solid reasoning before committing to a new UAPI
that will have to be permanently supported...

- Eric
Christoph Hellwig Aug. 21, 2022, 8:53 a.m. UTC | #11
On Mon, Aug 15, 2022 at 05:55:45PM -0700, Eric Biggers wrote:
> So if the zoned device feature becomes widely adopted, then STATX_DIOALIGN will
> be useless on all Android devices?  That sounds undesirable.  Are you sure that

We just need to fix f2fs to support direct I/O on zoned devices.  There
is no good reason not to support it; in fact, the way zoned devices
require appends with Zone Append semantics makes direct I/O way
safer than how f2fs currently does direct I/O on non-zoned devices.

Until then just supporting direct I/O reads on zoned devices for f2fs
seems like a really bad choice given that it will lead to nasty cache
incoherency.
Andreas Dilger Aug. 23, 2022, 3:22 a.m. UTC | #12
On Aug 19, 2022, at 5:09 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> 
> On Tue, Aug 16, 2022 at 10:42:29AM -0600, Andreas Dilger wrote:
>> 
>> IMHO, this whole discussion is putting the cart before the horse.
>> Changing existing (and useful) IO behavior to accommodate an API that
>> nobody has ever used, and is unlikely to even be widely used, doesn't
>> make sense to me.  Most applications won't check or care about the new
>> DIO size fields, since they've lived this long without statx() returning
>> this info, and will just pick a "large enough" size (4KB, 1MB, whatever)
>> that gives them the performance they need.  They *WILL* care if the app
>> is suddenly unable to read data from a file in ways that have worked for
>> a long time.
>> 
>> Even if apps are modified to check these new DIO size fields, and then
>> try to DIO write to a file in f2fs that doesn't allow it, then f2fs will
>> return an error, which is what it would have done without the statx()
>> changes, so no harm done AFAICS.
>> 
>> Even with a more-complex DIO status return that handles a "direction"
>> field (which IMHO is needlessly complex), there is always the potential
>> for a TOCTOU race where a file changes between checking and access, so
>> the userspace code would need to handle this.
> 
> I'm having trouble making sense of your argument here; you seem to be saying
> that STATX_DIOALIGN isn't useful, so it doesn't matter if we design it
> correctly?  That line of reasoning is concerning, as it's certainly intended
> to be useful, and if it's not useful there's no point in adding it.
> 
> Are there any specific concerns that you have, besides TOCTOU races and the
> lack of support for read-only DIO?

My main concern is disabling useful functionality that exists today to appease
the new DIO size API.  Whether STATX_DIOALIGN will become widely used by
applications or not is hard to say at this point.

If there were separate STATX_DIOREAD and STATX_DIOWRITE flags in the returned
data, and the alignment is provided as it is today, that would be enough IMHO
to address the original use case without significant complexity.

> I don't think that TOCTOU races are a real concern here.  Generally DIO
> constraints would only change if the application doing DIO intentionally does
> something to the file, or if there are changes that involve the filesystem
> being taken offline, e.g. the filesystem being mounted with significantly
> different options or being moved to a different block device.  And, well,
> everything else in stat()/statx() is subject to TOCTOU as well, but is still
> used...

I was thinking of background filesystem operations like compression, LVM
migration to new storage with a different sector size, etc. that may change
the DIO characteristics of the file even while it is open.  Not that I think
this will happen frequently, but it is possible, and applications shouldn't
explode if the DIO parameters change and they get an error.

Cheers, Andreas

Patch

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5e5c97fccfb4ee..ad0212848a1ab9 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -823,7 +823,6 @@  static inline bool f2fs_force_buffered_io(struct inode *inode,
 				struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-	int rw = iov_iter_rw(iter);
 
 	if (!fscrypt_dio_supported(inode))
 		return true;
@@ -841,7 +840,7 @@  static inline bool f2fs_force_buffered_io(struct inode *inode,
 	 */
 	if (f2fs_sb_has_blkzoned(sbi))
 		return true;
-	if (f2fs_lfs_mode(sbi) && (rw == WRITE)) {
+	if (f2fs_lfs_mode(sbi)) {
 		if (block_unaligned_IO(inode, iocb, iter))
 			return true;
 		if (F2FS_IO_ALIGNED(sbi))
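
The effect of the second hunk can be modeled in userspace as a pair of predicates. This is a simplified sketch and not the kernel code: struct mount_model and both function names are hypothetical, only the LFS-mode branch is modeled, and the result of the F2FS_IO_ALIGNED() test is an assumption (the hunk is cut off before its body), chosen to match the commit message's statement that DIO writes are not allowed with mode=lfs plus io_bits.

```c
#include <stdbool.h>

enum io_dir { IO_READ, IO_WRITE };

/* Hypothetical stand-in for the mount state the real checks consult. */
struct mount_model {
	bool lfs_mode;		/* mounted with mode=lfs */
	bool io_aligned;	/* io_bits set; models F2FS_IO_ALIGNED() */
};

/* Before the patch: only writes are subject to the LFS-mode checks. */
static bool lfs_forces_buffered_old(const struct mount_model *m,
				    enum io_dir dir, bool unaligned_io)
{
	if (m->lfs_mode && dir == IO_WRITE) {
		if (unaligned_io)
			return true;
		if (m->io_aligned)
			return true;	/* assumed: hunk is truncated here */
	}
	return false;
}

/* After the patch: reads and writes are treated the same way. */
static bool lfs_forces_buffered_new(const struct mount_model *m,
				    bool unaligned_io)
{
	if (m->lfs_mode) {
		if (unaligned_io)
			return true;
		if (m->io_aligned)
			return true;	/* assumed: hunk is truncated here */
	}
	return false;
}
```

Under this model the only requests whose outcome changes are reads on an LFS-mode mount that are unaligned or that hit the io_bits case; they switch from direct to buffered I/O, which is the behavior change debated in the comments.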