Patchwork Rearrange i_flags to be consistent with FS_IOC_GETFLAGS

login
register
mail settings
Submitter David Howells
Date July 5, 2010, 3:43 p.m.
Message ID <20100705154319.31193.56706.stgit@warthog.procyon.org.uk>
Download mbox | patch
Permalink /patch/57918/
State New
Headers show

Comments

David Howells - July 5, 2010, 3:43 p.m.
Rearrange the constituent bits of i_flags to be consistent with the flag values
returned by the FS_IOC_GETFLAGS ioctl where flags with common meanings occur.
Otherwise pack the bits of i_flags as low as possible so that RISC CPUs can
use small instructions.

This allows those filesystems that use i_flags (Ext2/3/4) to simplify their
get/set flags routines.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/ext2/inode.c    |   33 ++++++---------------------------
 fs/ext3/inode.c    |   31 +++++--------------------------
 fs/ext4/inode.c    |   32 +++++---------------------------
 include/linux/fs.h |   30 ++++++++++++++++++------------
 4 files changed, 34 insertions(+), 92 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Sandeen - July 5, 2010, 3:54 p.m.
David Howells wrote:
> Rearrange the constituent bits of i_flags to be consistent with the flag values
> returned by the FS_IOC_GETFLAGS ioctl where flags with common meanings occur.
> Otherwise pack the bits of i_flags as low as possible so that RISC CPUs can
> use small instructions.
> 
> This allows those filesystems that use i_flags (Ext2/3/4) to simplify their
> get/set flags routines.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>

Seems like a good idea to me, although:

> diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> index 3675088..da5fd2d 100644
> --- a/fs/ext2/inode.c
> +++ b/fs/ext2/inode.c
> @@ -1259,38 +1259,17 @@ Egdp:
>  
>  void ext2_set_inode_flags(struct inode *inode)
>  {
> -	unsigned int flags = EXT2_I(inode)->i_flags;
> -
> -	inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC);
> -	if (flags & EXT2_SYNC_FL)
> -		inode->i_flags |= S_SYNC;
> -	if (flags & EXT2_APPEND_FL)
> -		inode->i_flags |= S_APPEND;
> -	if (flags & EXT2_IMMUTABLE_FL)
> -		inode->i_flags |= S_IMMUTABLE;
> -	if (flags & EXT2_NOATIME_FL)
> -		inode->i_flags |= S_NOATIME;
> -	if (flags & EXT2_DIRSYNC_FL)
> -		inode->i_flags |= S_DIRSYNC;

Maybe a stern comment should be here about the relationship between
this and the flag definitions...

> +	unsigned int flags = EXT2_I(inode)->i_flags & S_FS_IOC_FLAGS;
> +
> +	inode->i_flags = (inode->i_flags & ~S_FS_IOC_FLAGS) | flags;
>  }
>  

...

> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 48616db..5808b9b 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h

...

> +/*
> + * Inode flags - they have no relation to superblock flags now
> + *
> + * Where applicable, flags that are also in FS_IOC_GETFLAGS have the same bit
> + * position
> + */

... a stern comment here about not re-ordering since other fs/*/* code depends
strongly on this order?

"where applicable" sounds like a wish, but now these can't be changed, right?

If somebody rearranges these bad things will happen.

-Eric

> +#define S_DEAD		0x00000001	/* removed, but still open directory */
> +#define S_NOQUOTA	0x00000002	/* Inode is not counted to quota */
> +#define S_NOCMTIME	0x00000004	/* Do not update file c/mtime */
> +#define S_SYNC		0x00000008	/* [FS_SYNC_FL] Writes are synced at once */
> +#define S_IMMUTABLE	0x00000010	/* [FS_IMMUTABLE_FL] Immutable file */
> +#define S_APPEND	0x00000020	/* [FS_APPEND_FL] Append-only file */
> +#define S_SWAPFILE	0x00000040	/* Do not truncate: swapon got its bmaps */
> +#define S_NOATIME	0x00000080	/* [FS_NOATIME_FL] Do not update access times */
> +#define S_PRIVATE	0x00000100	/* Inode is fs-internal */
> +#define S_DIRSYNC	0x00010000	/* [FS_DIRSYNC_FL] Directory mods are synchronous */
> +
> +#define S_FS_IOC_FLAGS	0x000100B8	/* the S_xxx flags that are also FS_xxx_FL flags */
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - July 5, 2010, 5:27 p.m.
On Mon, Jul 05, 2010 at 10:54:07AM -0500, Eric Sandeen wrote:
> 
> ... a stern comment here about not re-ordering since other fs/*/*
> code depends strongly on this order?

A stern warning is needed there, but it's also needed a few lines
further down in the inode flags section where the FS_*_FL flags are
defined.  These were originally ext2-specific inode flags, and it's
become generalized to a fs-independent set of bit fields, but what's
nasty/important to remember is that these flags are also used as
on-disk flags for ext2/3/4, and so extreme care is needed before new
flags are for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS are allocated....

In fact, I'd argue it's much more strongly needed for the FS_*_FL
flags.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner - July 6, 2010, 12:10 a.m.
On Mon, Jul 05, 2010 at 04:43:19PM +0100, David Howells wrote:
> Rearrange the constituent bits of i_flags to be consistent with the flag values
> returned by the FS_IOC_GETFLAGS ioctl where flags with common meanings occur.
> Otherwise pack the bits of i_flags as low as possible so that RISC CPUs can
> use small instructions.
> 
> This allows those filesystems that use i_flags (Ext2/3/4) to simplify their
> get/set flags routines.

This seems like a nasty landmine just waiting for someone to step
on.  The S_* flags are VFS level flags used by all filesytsems, not
just extN, and have a history of changing values and meaning as uses
come and go.  This code forces the VFS to use certain flag values
that *match on-disk values* of ext2/3/4.

Further, it opens up the possibility that further down the track
someone modifies S_* flag values (say when adding some XFS
functionality) and corrupts extN filesystems all over the place.
There isn't even compiler guards to catch someone modifying the S_*
flags in a way that makes it incompatible with the extN on-disk
definitions.


Cheers,

Dave.
David Howells - July 6, 2010, 1:40 p.m.
Dave Chinner <david@fromorbit.com> wrote:

> Further, it opens up the possibility that further down the track
> someone modifies S_* flag values (say when adding some XFS
> functionality) and corrupts extN filesystems all over the place.
> There isn't even compiler guards to catch someone modifying the S_*
> flags in a way that makes it incompatible with the extN on-disk
> definitions.

That occurred to me after I sent the patch.  I can add some preprocessor guards
for this.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner - July 6, 2010, 11:03 p.m.
On Tue, Jul 06, 2010 at 02:40:23PM +0100, David Howells wrote:
> Dave Chinner <david@fromorbit.com> wrote:
> 
> > Further, it opens up the possibility that further down the track
> > someone modifies S_* flag values (say when adding some XFS
> > functionality) and corrupts extN filesystems all over the place.
> > There isn't even compiler guards to catch someone modifying the S_*
> > flags in a way that makes it incompatible with the extN on-disk
> > definitions.
> 
> That occurred to me after I sent the patch.  I can add some preprocessor guards
> for this.

I'd prefer generic flags are not dependent on fixed values from a
specific filesystem several layers down the storage stack.

Also, if the problem you are trying to solve is overhead of
calculating the flags for stat() on RISC architectures, then I'd
argue that XFS is just as important target for such an optimisation
because it is widely used in small ARM and MIPS based NAS
appliances....

Cheers,

Dave.
David Howells - July 6, 2010, 11:45 p.m.
Dave Chinner <david@fromorbit.com> wrote:

> I'd prefer generic flags are not dependent on fixed values from a
> specific filesystem several layers down the storage stack.

They're not so dependent.  They're based on the FS_IOC_[GS]ETFLAGS ioctl which
even XFS translates its flags for.  These ioctl flags must now remain
invariant.  Whilst they might have originated as Ext2/3/4 flags, they're now
independent of that.

> Also, if the problem you are trying to solve is overhead of calculating the
> flags for stat() on RISC architectures, then I'd argue that XFS is just as
> important target for such an optimisation because it is widely used in small
> ARM and MIPS based NAS appliances....

This can be argued one way or another, however aligning i_flags with something
would probably be an improvement somewhere.  Most of what I deal with is Ext3/4
based, and BTRFS-based is likely to become important too.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner - July 7, 2010, 1:55 a.m.
On Wed, Jul 07, 2010 at 12:45:25AM +0100, David Howells wrote:
> Dave Chinner <david@fromorbit.com> wrote:
> 
> > I'd prefer generic flags are not dependent on fixed values from a
> > specific filesystem several layers down the storage stack.
> 
> They're not so dependent.

History says otherwise. :)

> They're based on the FS_IOC_[GS]ETFLAGS ioctl which
> even XFS translates its flags for.

Sure, because the ioctl flags values are derived
from the ext2 on-disk format flags and hence don't match anything
XFS related at all.

> These ioctl flags must now remain
> invariant.  Whilst they might have originated as Ext2/3/4 flags, they're now
> independent of that.

Yes, the ioctl flags must remain invariant. OTOH, the generic inode
flags (S_*) have no such invariant requirement and have a history of
frequent change. IMO, that means some flags should not be tied to
the value of a specific subsystem just so a subsystem specific
optimisation can be made. It just seems like a dangerous layering
violation to be making...

Cheers,

Dave.

Patch

diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 3675088..da5fd2d 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1259,38 +1259,17 @@  Egdp:
 
 void ext2_set_inode_flags(struct inode *inode)
 {
-	unsigned int flags = EXT2_I(inode)->i_flags;
-
-	inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC);
-	if (flags & EXT2_SYNC_FL)
-		inode->i_flags |= S_SYNC;
-	if (flags & EXT2_APPEND_FL)
-		inode->i_flags |= S_APPEND;
-	if (flags & EXT2_IMMUTABLE_FL)
-		inode->i_flags |= S_IMMUTABLE;
-	if (flags & EXT2_NOATIME_FL)
-		inode->i_flags |= S_NOATIME;
-	if (flags & EXT2_DIRSYNC_FL)
-		inode->i_flags |= S_DIRSYNC;
+	unsigned int flags = EXT2_I(inode)->i_flags & S_FS_IOC_FLAGS;
+
+	inode->i_flags = (inode->i_flags & ~S_FS_IOC_FLAGS) | flags;
 }
 
 /* Propagate flags from i_flags to EXT2_I(inode)->i_flags */
 void ext2_get_inode_flags(struct ext2_inode_info *ei)
 {
-	unsigned int flags = ei->vfs_inode.i_flags;
-
-	ei->i_flags &= ~(EXT2_SYNC_FL|EXT2_APPEND_FL|
-			EXT2_IMMUTABLE_FL|EXT2_NOATIME_FL|EXT2_DIRSYNC_FL);
-	if (flags & S_SYNC)
-		ei->i_flags |= EXT2_SYNC_FL;
-	if (flags & S_APPEND)
-		ei->i_flags |= EXT2_APPEND_FL;
-	if (flags & S_IMMUTABLE)
-		ei->i_flags |= EXT2_IMMUTABLE_FL;
-	if (flags & S_NOATIME)
-		ei->i_flags |= EXT2_NOATIME_FL;
-	if (flags & S_DIRSYNC)
-		ei->i_flags |= EXT2_DIRSYNC_FL;
+	unsigned int flags = ei->vfs_inode.i_flags & S_FS_IOC_FLAGS;
+
+	ei->i_flags = (ei->i_flags & ~S_FS_IOC_FLAGS) | flags;
 }
 
 struct inode *ext2_iget (struct super_block *sb, unsigned long ino)
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 735f019..4a1e45e 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -2737,38 +2737,17 @@  int ext3_get_inode_loc(struct inode *inode, struct ext3_iloc *iloc)
 
 void ext3_set_inode_flags(struct inode *inode)
 {
-	unsigned int flags = EXT3_I(inode)->i_flags;
+	unsigned int flags = EXT3_I(inode)->i_flags & S_FS_IOC_FLAGS;
 
-	inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC);
-	if (flags & EXT3_SYNC_FL)
-		inode->i_flags |= S_SYNC;
-	if (flags & EXT3_APPEND_FL)
-		inode->i_flags |= S_APPEND;
-	if (flags & EXT3_IMMUTABLE_FL)
-		inode->i_flags |= S_IMMUTABLE;
-	if (flags & EXT3_NOATIME_FL)
-		inode->i_flags |= S_NOATIME;
-	if (flags & EXT3_DIRSYNC_FL)
-		inode->i_flags |= S_DIRSYNC;
+	inode->i_flags = (inode->i_flags & ~S_FS_IOC_FLAGS) | flags;
 }
 
 /* Propagate flags from i_flags to EXT3_I(inode)->i_flags */
 void ext3_get_inode_flags(struct ext3_inode_info *ei)
 {
-	unsigned int flags = ei->vfs_inode.i_flags;
-
-	ei->i_flags &= ~(EXT3_SYNC_FL|EXT3_APPEND_FL|
-			EXT3_IMMUTABLE_FL|EXT3_NOATIME_FL|EXT3_DIRSYNC_FL);
-	if (flags & S_SYNC)
-		ei->i_flags |= EXT3_SYNC_FL;
-	if (flags & S_APPEND)
-		ei->i_flags |= EXT3_APPEND_FL;
-	if (flags & S_IMMUTABLE)
-		ei->i_flags |= EXT3_IMMUTABLE_FL;
-	if (flags & S_NOATIME)
-		ei->i_flags |= EXT3_NOATIME_FL;
-	if (flags & S_DIRSYNC)
-		ei->i_flags |= EXT3_DIRSYNC_FL;
+	unsigned int flags = ei->vfs_inode.i_flags & S_FS_IOC_FLAGS;
+
+	ei->i_flags = (ei->i_flags & ~S_FS_IOC_FLAGS) | flags;
 }
 
 struct inode *ext3_iget(struct super_block *sb, unsigned long ino)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index efa17d6..63a5da7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4924,19 +4924,9 @@  int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 void ext4_set_inode_flags(struct inode *inode)
 {
-	unsigned int flags = EXT4_I(inode)->i_flags;
-
-	inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC);
-	if (flags & EXT4_SYNC_FL)
-		inode->i_flags |= S_SYNC;
-	if (flags & EXT4_APPEND_FL)
-		inode->i_flags |= S_APPEND;
-	if (flags & EXT4_IMMUTABLE_FL)
-		inode->i_flags |= S_IMMUTABLE;
-	if (flags & EXT4_NOATIME_FL)
-		inode->i_flags |= S_NOATIME;
-	if (flags & EXT4_DIRSYNC_FL)
-		inode->i_flags |= S_DIRSYNC;
+	unsigned int flags = EXT4_I(inode)->i_flags & S_FS_IOC_FLAGS;
+
+	inode->i_flags = (inode->i_flags & ~S_FS_IOC_FLAGS) | flags;
 }
 
 /* Propagate flags from i_flags to EXT4_I(inode)->i_flags */
@@ -4946,21 +4936,9 @@  void ext4_get_inode_flags(struct ext4_inode_info *ei)
 	unsigned long old_fl, new_fl;
 
 	do {
-		vfs_fl = ei->vfs_inode.i_flags;
+		vfs_fl = ei->vfs_inode.i_flags & S_FS_IOC_FLAGS;
 		old_fl = ei->i_flags;
-		new_fl = old_fl & ~(EXT4_SYNC_FL|EXT4_APPEND_FL|
-				EXT4_IMMUTABLE_FL|EXT4_NOATIME_FL|
-				EXT4_DIRSYNC_FL);
-		if (vfs_fl & S_SYNC)
-			new_fl |= EXT4_SYNC_FL;
-		if (vfs_fl & S_APPEND)
-			new_fl |= EXT4_APPEND_FL;
-		if (vfs_fl & S_IMMUTABLE)
-			new_fl |= EXT4_IMMUTABLE_FL;
-		if (vfs_fl & S_NOATIME)
-			new_fl |= EXT4_NOATIME_FL;
-		if (vfs_fl & S_DIRSYNC)
-			new_fl |= EXT4_DIRSYNC_FL;
+		new_fl = (old_fl & ~S_FS_IOC_FLAGS) | vfs_fl;
 	} while (cmpxchg(&ei->i_flags, old_fl, new_fl) != old_fl);
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 48616db..5808b9b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -223,18 +223,24 @@  struct inodes_stat_t {
 #define MS_MGC_VAL 0xC0ED0000
 #define MS_MGC_MSK 0xffff0000
 
-/* Inode flags - they have nothing to superblock flags now */
-
-#define S_SYNC		1	/* Writes are synced at once */
-#define S_NOATIME	2	/* Do not update access times */
-#define S_APPEND	4	/* Append-only file */
-#define S_IMMUTABLE	8	/* Immutable file */
-#define S_DEAD		16	/* removed, but still open directory */
-#define S_NOQUOTA	32	/* Inode is not counted to quota */
-#define S_DIRSYNC	64	/* Directory modifications are synchronous */
-#define S_NOCMTIME	128	/* Do not update file c/mtime */
-#define S_SWAPFILE	256	/* Do not truncate: swapon got its bmaps */
-#define S_PRIVATE	512	/* Inode is fs-internal */
+/*
+ * Inode flags - they have no relation to superblock flags now
+ *
+ * Where applicable, flags that are also in FS_IOC_GETFLAGS have the same bit
+ * position
+ */
+#define S_DEAD		0x00000001	/* removed, but still open directory */
+#define S_NOQUOTA	0x00000002	/* Inode is not counted to quota */
+#define S_NOCMTIME	0x00000004	/* Do not update file c/mtime */
+#define S_SYNC		0x00000008	/* [FS_SYNC_FL] Writes are synced at once */
+#define S_IMMUTABLE	0x00000010	/* [FS_IMMUTABLE_FL] Immutable file */
+#define S_APPEND	0x00000020	/* [FS_APPEND_FL] Append-only file */
+#define S_SWAPFILE	0x00000040	/* Do not truncate: swapon got its bmaps */
+#define S_NOATIME	0x00000080	/* [FS_NOATIME_FL] Do not update access times */
+#define S_PRIVATE	0x00000100	/* Inode is fs-internal */
+#define S_DIRSYNC	0x00010000	/* [FS_DIRSYNC_FL] Directory mods are synchronous */
+
+#define S_FS_IOC_FLAGS	0x000100B8	/* the S_xxx flags that are also FS_xxx_FL flags */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system