diff mbox series

[v3,04/12] fs/xfs: Clean up DAX support check

Message ID 20200208193445.27421-5-ira.weiny@intel.com
State Not Applicable
Headers show
Series Enable per-file/directory DAX operations V3 | expand

Commit Message

Ira Weiny Feb. 8, 2020, 7:34 p.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

Rather than open coding xfs_inode_supports_dax() in
xfs_ioctl_setattr_dax_invalidate() export xfs_inode_supports_dax() and
call it in preparation for swapping dax flags.

This also means updating xfs_inode_supports_dax() to return true for a
directory.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/xfs/xfs_ioctl.c | 16 +++-------------
 fs/xfs/xfs_iops.c  |  8 ++++++--
 fs/xfs/xfs_iops.h  |  2 ++
 3 files changed, 11 insertions(+), 15 deletions(-)

Comments

Dave Chinner Feb. 11, 2020, 5:57 a.m. UTC | #1
On Sat, Feb 08, 2020 at 11:34:37AM -0800, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Rather than open coding xfs_inode_supports_dax() in
> xfs_ioctl_setattr_dax_invalidate() export xfs_inode_supports_dax() and
> call it in preparation for swapping dax flags.
> 
> This also means updating xfs_inode_supports_dax() to return true for a
> directory.

That's not correct. This now means S_DAX gets set on directory inodes
because both xfs_inode_supports_dax() and the on-disk inode flag
checks return true in xfs_diflags_to_iflags(). Hence when we
instantiate a directory inode with a DAX inherit hint set on it
we'll set S_DAX on the inode and so IS_DAX() will return true for
directory inodes...

Cheers,

Dave.
Ira Weiny Feb. 11, 2020, 4:28 p.m. UTC | #2
On Tue, Feb 11, 2020 at 04:57:45PM +1100, Dave Chinner wrote:
> On Sat, Feb 08, 2020 at 11:34:37AM -0800, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Rather than open coding xfs_inode_supports_dax() in
> > xfs_ioctl_setattr_dax_invalidate() export xfs_inode_supports_dax() and
> > call it in preparation for swapping dax flags.
> > 
> > This also means updating xfs_inode_supports_dax() to return true for a
> > directory.
> 
> That's not correct. This now means S_DAX gets set on directory inodes
> because both xfs_inode_supports_dax() and the on-disk inode flag
> checks return true in xfs_diflags_to_iflags(). Hence when we
> instantiate a directory inode with a DAX inherit hint set on it
> we'll set S_DAX on the inode and so IS_DAX() will return true for
> directory inodes...

I'm not following.  Don't we want S_DAX to get set on directory inodes?

IIRC what we wanted was something like this where IS_DAX is the current state
and "dax" is the inode flag:

/ <IS_DAX=0 dax=0>
	dir1 <IS_DAX=0 dax=0>
		f0 <IS_DAX=0 dax=0>
		f1 <IS_DAX=1 dax=1>
	dir2 <IS_DAX=1 dax=1>
		f2 <IS_DAX=1 dax=1>
		f3 <IS_DAX=0 dax=0>
		dir3 <IS_DAX=1 dax=1>
			f4 <IS_DAX=1 dax=1>
		dir4 <IS_DAX=0 dax=0>
			f5 <IS_DAX=0 dax=0>
		f6 <IS_DAX=1 dax=1>

Where f1, dir2, f3, and dir4 required explicit state changes when they were
created.  Because they inherited their dax state from the parent.  All the
other creations happened based on the DAX state of the parent directory.  So we
need to store and know the state of the directories.  What am I missing?

Ira

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
Dave Chinner Feb. 11, 2020, 8:38 p.m. UTC | #3
On Tue, Feb 11, 2020 at 08:28:30AM -0800, Ira Weiny wrote:
> On Tue, Feb 11, 2020 at 04:57:45PM +1100, Dave Chinner wrote:
> > On Sat, Feb 08, 2020 at 11:34:37AM -0800, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > Rather than open coding xfs_inode_supports_dax() in
> > > xfs_ioctl_setattr_dax_invalidate() export xfs_inode_supports_dax() and
> > > call it in preparation for swapping dax flags.
> > > 
> > > This also means updating xfs_inode_supports_dax() to return true for a
> > > directory.
> > 
> > That's not correct. This now means S_DAX gets set on directory inodes
> > because both xfs_inode_supports_dax() and the on-disk inode flag
> > checks return true in xfs_diflags_to_iflags(). Hence when we
> > instantiate a directory inode with a DAX inherit hint set on it
> > we'll set S_DAX on the inode and so IS_DAX() will return true for
> > directory inodes...
> 
> I'm not following.  Don't we want S_DAX to get set on directory inodes?

No, because S_DAX is used for controlling direct user data access,
and we *never* let users directly access directory data. Even the
filesystems don't access directory data directly - the transactional
change model of journaling filesystems requires metadata to be
buffered in memory and never directly modified.

Hence when filesystems like ext4 keep their directory data in the
page cache, we do not want the kernel to think that this inode is
accessed through the DAX subsystem - it is accessed via the buffered
IO interfaces like page_cache_sync_readahead() and writeback is
controlled by the journal. Hence setting S_DAX on these inodes is
incorrect.

Whilst XFS doesn't use the page cache for it's metadata
buffering, the issue is the same as it's also a journalling
filesystem. hence setting S_DAX on XFS directory inodes is also
incorrect.

> IIRC what we wanted was something like this where IS_DAX is the current state
> and "dax" is the inode flag:
> 
> / <IS_DAX=0 dax=0>
> 	dir1 <IS_DAX=0 dax=0>
> 		f0 <IS_DAX=0 dax=0>
> 		f1 <IS_DAX=1 dax=1>
> 	dir2 <IS_DAX=1 dax=1>
> 		f2 <IS_DAX=1 dax=1>
> 		f3 <IS_DAX=0 dax=0>
> 		dir3 <IS_DAX=1 dax=1>
> 			f4 <IS_DAX=1 dax=1>
> 		dir4 <IS_DAX=0 dax=0>
> 			f5 <IS_DAX=0 dax=0>
> 		f6 <IS_DAX=1 dax=1>
> 
> Where f1, dir2, f3, and dir4 required explicit state changes when they were
> created.  Because they inherited their dax state from the parent.  All the
> other creations happened based on the DAX state of the parent directory.  So we
> need to store and know the state of the directories.  What am I missing?

I think that you are conflating internal filesystem feature
management details (the inheritance of the DAX flag feature of
directories) with kernel IO path behaviour (the inode S_DAX flag).

i.e. IS_DAX() indicates whether DAX is _actively being used_ to
access the data of a regular file inode, not to indicate whether the
on-disk flags used to manage default behaviour are set or not.

Cheers,

Dave.
diff mbox series

Patch

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 1a57be696810..da1eb2bdb386 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1190,23 +1190,13 @@  xfs_ioctl_setattr_dax_invalidate(
 	int			*join_flags)
 {
 	struct inode		*inode = VFS_I(ip);
-	struct super_block	*sb = inode->i_sb;
 	int			error;
 
 	*join_flags = 0;
 
-	/*
-	 * It is only valid to set the DAX flag on regular files and
-	 * directories on filesystems where the block size is equal to the page
-	 * size. On directories it serves as an inherited hint so we don't
-	 * have to check the device for dax support or flush pagecache.
-	 */
-	if (fa->fsx_xflags & FS_XFLAG_DAX) {
-		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
-
-		if (!bdev_dax_supported(target->bt_bdev, sb->s_blocksize))
-			return -EINVAL;
-	}
+	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
+	    !xfs_inode_supports_dax(ip))
+		return -EINVAL;
 
 	/* If the DAX state is not changing, we have nothing to do here. */
 	if ((fa->fsx_xflags & FS_XFLAG_DAX) &&
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a7db50d923d4..eebec159d873 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1246,14 +1246,18 @@  xfs_inode_mount_is_dax(
 }
 
 /* Figure out if this file actually supports DAX. */
-static bool
+bool
 xfs_inode_supports_dax(
 	struct xfs_inode	*ip)
 {
 	struct xfs_mount	*mp = ip->i_mount;
 
 	/* Only supported on non-reflinked files. */
-	if (!S_ISREG(VFS_I(ip)->i_mode) || xfs_is_reflink_inode(ip))
+	if (xfs_is_reflink_inode(ip))
+		return false;
+
+	/* Only supported on regular files and directories. */
+	if (!(S_ISREG(VFS_I(ip)->i_mode) || S_ISDIR(VFS_I(ip)->i_mode)))
 		return false;
 
 	/* Block size must match page size */
diff --git a/fs/xfs/xfs_iops.h b/fs/xfs/xfs_iops.h
index 4d24ff309f59..f24fec8de1d6 100644
--- a/fs/xfs/xfs_iops.h
+++ b/fs/xfs/xfs_iops.h
@@ -24,4 +24,6 @@  extern int xfs_setattr_nonsize(struct xfs_inode *ip, struct iattr *vap,
 extern int xfs_vn_setattr_nonsize(struct dentry *dentry, struct iattr *vap);
 extern int xfs_vn_setattr_size(struct dentry *dentry, struct iattr *vap);
 
+extern bool xfs_inode_supports_dax(struct xfs_inode *ip);
+
 #endif /* __XFS_IOPS_H__ */