Patchwork [03/10,v3] ext4: add physical block and status member into extent status tree

login
register
mail settings
Submitter Zheng Liu
Date Jan. 23, 2013, 12:03 p.m.
Message ID <1358942640-2262-4-git-send-email-wenqing.lz@taobao.com>
Download mbox | patch
Permalink /patch/214900/
State Superseded
Headers show

Comments

Zheng Liu - Jan. 23, 2013, 12:03 p.m.
From: Zheng Liu <wenqing.lz@taobao.com>

es_pblk is used to record physical block that maps to the disk.
es_status is used to record the status of the extent.  Three status
are defined, which are written, unwritten and delayed.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/ext4/extents_status.h | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
Theodore Ts'o - Jan. 29, 2013, 3:03 a.m.
On Wed, Jan 23, 2013 at 08:03:53PM +0800, Zheng Liu wrote:
> +	ext4_fsblk_t es_pblk : 62;	/* first physical block */
> +	ext4_fsblk_t es_status : 2;	/* record the status of extent */

I'll accept this for now but note that ext4_fsblk_t is typedefed to be
an unsigned long long, and C99 only guarantees that bitfields can be
made from Bool, signed int, and unsigned int.  Gcc accepts unsigned
long long based bit fields as an extension, but it's not portable
code.  This is kernel code, though, and we have plenty more gcc
specific code....

						- Ted


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zheng Liu - Jan. 29, 2013, 5:34 a.m.
On Mon, Jan 28, 2013 at 10:03:53PM -0500, Theodore Ts'o wrote:
> On Wed, Jan 23, 2013 at 08:03:53PM +0800, Zheng Liu wrote:
> > +	ext4_fsblk_t es_pblk : 62;	/* first physical block */
> > +	ext4_fsblk_t es_status : 2;	/* record the status of extent */
> 
> I'll accept this for now but note that ext4_fsblk_t is typedefed to be
> an unsigned long long, and C99 only guarantees that bitfields can be
> made from Bool, signed int, and unsigned int.  Gcc accepts unsigned
> long long based bit fields as an extension, but it's not portable
> code.  This is kernel code, though, and we have plenty more gcc
> specific code....

Thanks for pointing out.  When I tried to implement this code, there are
two choices.  One is like this that bit field is used.  IMHO it is clear
enough, although it is not portable.

Another choice is like this:

        struct extent_status {
                ...
                ext4_fsblk_t es_pblk;   /* first physical block */
        };

        #define EXTENT_STATUS_WRITTEN   (1ULL << 60)
        #define EXTENT_STATUS_UNWRITTEN (1ULL << 61)
        #define EXTENT_STATUS_DELAYED   (1ULL << 62)

When we want to set extent status, we will need to do like the following:

        es->es_pblk |= EXTENT_STATUS_WRITTEN;

This can make us avoid non-protable code.  I am happy to refine this
patch if you think the latter one is better.

Thanks,
                                                - Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Jan. 29, 2013, 5:28 p.m.
On Tue, Jan 29, 2013 at 01:34:15PM +0800, Zheng Liu wrote:
> 
> Another choice is like this:
> 
>         struct extent_status {
>                 ...
>                 ext4_fsblk_t es_pblk;   /* first physical block */
>         };
> 
>         #define EXTENT_STATUS_WRITTEN   (1ULL << 60)
>         #define EXTENT_STATUS_UNWRITTEN (1ULL << 61)
>         #define EXTENT_STATUS_DELAYED   (1ULL << 62)
> 
> When we want to set extent status, we will need to do like the following:
> 
>         es->es_pblk |= EXTENT_STATUS_WRITTEN;
> 
> This can make us avoid non-protable code.  I am happy to refine this
> patch if you think the latter one is better.

This is probably the way I would have done it myself, but the then you
need to make sure that all of the places where es_pblk is used you
have to mask off the high bits.

At this point, though, I don't think it's worth it to make the change
now, especially since we're almost at -rc6, I want to make sure this
gets into linux-next and so we get lots of testing.

As a matter of fact, I've already started testing the v3 vesion of the
extent status patches from January 23rd, with the v2 version of the
slab reclaim patch.  It's in the unstable portion of the ext4 git
tree, at:

      http://repo.or.cz/w/ext4-patch-queue.git /
      git://repo.or.cz/ext4-patch-queue.git

I'm waiting for your next version of your patch series before I move
it into the dev branch which will get fed into linux-next; my
understanding is you're just about ready to push it out, right?

If we want to move away from using bitfields, we can do that as a
separate patch that gets submitted later, since that's pretty easy to
audit and verify for correctness.  Also, I've since tested clang and
noted that it supports bitfields for unsigned long long.  There is
some differences between how gcc and clang handles sign extension for
unsigned long values, though:

#include <stdio.h>
struct s
{
        unsigned long long a:2;
        unsigned long long b:40;
        unsigned long long c:22;
};

int main()
{
        struct s t = {1, 2, 3};
        printf("0x%llx\n",(t.b-8));
}

Gcc 4.7.2 will print "0xfffffffffa", while clang 3.0-6 will print
"0xfffffffffffffffa" for the same program.

I don't think this is a huge issue for us, but it's worth keeping in
mind...

So let's go ahead and keep the bitfields at least for the initial
patch submission.

Thanks for all your on this patch series!

					- Ted



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zheng Liu - Jan. 30, 2013, 2:43 a.m.
On Tue, Jan 29, 2013 at 12:28:14PM -0500, Theodore Ts'o wrote:
> On Tue, Jan 29, 2013 at 01:34:15PM +0800, Zheng Liu wrote:
> > 
> > Another choice is like this:
> > 
> >         struct extent_status {
> >                 ...
> >                 ext4_fsblk_t es_pblk;   /* first physical block */
> >         };
> > 
> >         #define EXTENT_STATUS_WRITTEN   (1ULL << 60)
> >         #define EXTENT_STATUS_UNWRITTEN (1ULL << 61)
> >         #define EXTENT_STATUS_DELAYED   (1ULL << 62)
> > 
> > When we want to set extent status, we will need to do like the following:
> > 
> >         es->es_pblk |= EXTENT_STATUS_WRITTEN;
> > 
> > This can make us avoid non-protable code.  I am happy to refine this
> > patch if you think the latter one is better.
> 
> This is probably the way I would have done it myself, but the then you
> need to make sure that all of the places where es_pblk is used you
> have to mask off the high bits.
> 
> At this point, though, I don't think it's worth it to make the change
> now, especially since we're almost at -rc6, I want to make sure this
> gets into linux-next and so we get lots of testing.
> 
> As a matter of fact, I've already started testing the v3 vesion of the
> extent status patches from January 23rd, with the v2 version of the
> slab reclaim patch.  It's in the unstable portion of the ext4 git
> tree, at:
> 
>       http://repo.or.cz/w/ext4-patch-queue.git /
>       git://repo.or.cz/ext4-patch-queue.git
> 
> I'm waiting for your next version of your patch series before I move
> it into the dev branch which will get fed into linux-next; my
> understanding is you're just about ready to push it out, right?

Yes, I am running xfstests to make sure that the patch series doesn't
break anything.  Later it will be sent out.

> 
> If we want to move away from using bitfields, we can do that as a
> separate patch that gets submitted later, since that's pretty easy to
> audit and verify for correctness.  Also, I've since tested clang and
> noted that it supports bitfields for unsigned long long.  There is
> some differences between how gcc and clang handles sign extension for
> unsigned long values, though:
> 
> #include <stdio.h>
> struct s
> {
>         unsigned long long a:2;
>         unsigned long long b:40;
>         unsigned long long c:22;
> };
> 
> int main()
> {
>         struct s t = {1, 2, 3};
>         printf("0x%llx\n",(t.b-8));
> }
> 
> Gcc 4.7.2 will print "0xfffffffffa", while clang 3.0-6 will print
> "0xfffffffffffffffa" for the same program.

Clang is first coming in my mind.  I know that some one try to use it
to build a linux kernel and get a lot of problems that are about gcc
extension.  But for us it seems that things are not too bad. ;)

> 
> I don't think this is a huge issue for us, but it's worth keeping in
> mind...
> 
> So let's go ahead and keep the bitfields at least for the initial
> patch submission.

Yes, just keep in mind and go ahead.

Thanks for teaching me a lot,
                                                - Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Feb. 3, 2013, 3:03 a.m.
On Wed, Jan 30, 2013 at 10:43:33AM +0800, Zheng Liu wrote:
> 
> Clang is first coming in my mind.  I know that some one try to use it
> to build a linux kernel and get a lot of problems that are about gcc
> extension.  But for us it seems that things are not too bad. ;)

Clang accepts bitfields with "unsigned long long", but I've discovered
something which does _not_ support unsigned long long --- the "sparse"
tool.  :-(

I discovered this when running "make C=1", i.e.:

  rm -f fs/ext4/extents_status.o
  make C=1 fs/ext4/extents_status.o

Here's a simple test case which demo's that sparse doesn't deal well
with unsigned long long.  If we change the last two fields in struct
extents_status to:

	unsigned long es_pblk : 30;	/* first physical block */
	unsigned long es_status : 2;	/* record the status of extent */

sparse doesn't complain.  But as shown below, sparse complains bitterly:

/tmp/foo.c:22:24: warning: invalid access past the end of 'es' (24 28)

I'm not sure Chris will consider this a bug, since bitfields
with "unsigned long long" isn't standards complaint, even if gcc and
clang supports it.   Chris, what do you think?

              	       		      	       - Ted
   	

#!/bin/sh
cat > /tmp/foo.c << EOF
#include <unistd.h>
#include <stdio.h>

struct rb_node {
	unsigned long  __rb_parent_color;
	struct rb_node *rb_right;
	struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));

struct extent_status {
	struct rb_node rb_node;
	unsigned long es_lblk;		/* first logical block extent covers */
	unsigned long es_len;		/* length of extent in block */
	unsigned long long es_pblk : 62;	/* first physical block */
	unsigned long long es_status : 2;	/* record the status of extent */
};

int main(int argc, char **argv)
{
	struct extent_status es;

	es.es_status = 3;

	printf("%d\n", es.es_status);
	printf("size %u\n", sizeof(es));
}
EOF
sparse /tmp/foo.c
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zheng Liu - Feb. 3, 2013, 5:21 a.m.
On Sat, Feb 02, 2013 at 10:03:43PM -0500, Theodore Ts'o wrote:
> On Wed, Jan 30, 2013 at 10:43:33AM +0800, Zheng Liu wrote:
> > 
> > Clang is first coming in my mind.  I know that some one try to use it
> > to build a linux kernel and get a lot of problems that are about gcc
> > extension.  But for us it seems that things are not too bad. ;)
> 
> Clang accepts bitfields with "unsigned long long", but I've discovered
> something which does _not_ support unsigned long long --- the "sparse"
> tool.  :-(

Yes, this problem has been reported by Fengguang.  So I am plan to use
another method to define extent_status structure as last time we
discuessed.  What do you think?

Thanks,
                                                - Zheng

> 
> I discovered this when running "make C=1", i.e.:
> 
>   rm -f fs/ext4/extents_status.o
>   make C=1 fs/ext4/extents_status.o
> 
> Here's a simple test case which demo's that sparse doesn't deal well
> with unsigned long long.  If we change the last two fields in struct
> extents_status to:
> 
> 	unsigned long es_pblk : 30;	/* first physical block */
> 	unsigned long es_status : 2;	/* record the status of extent */
> 
> sparse doesn't complain.  But as shown below, sparse complains bitterly:
> 
> /tmp/foo.c:22:24: warning: invalid access past the end of 'es' (24 28)
> 
> I'm not sure Chris will consider this a bug, since bitfields
> with "unsigned long long" isn't standards complaint, even if gcc and
> clang supports it.   Chris, what do you think?
> 
>               	       		      	       - Ted
>    	
> 
> #!/bin/sh
> cat > /tmp/foo.c << EOF
> #include <unistd.h>
> #include <stdio.h>
> 
> struct rb_node {
> 	unsigned long  __rb_parent_color;
> 	struct rb_node *rb_right;
> 	struct rb_node *rb_left;
> } __attribute__((aligned(sizeof(long))));
> 
> struct extent_status {
> 	struct rb_node rb_node;
> 	unsigned long es_lblk;		/* first logical block extent covers */
> 	unsigned long es_len;		/* length of extent in block */
> 	unsigned long long es_pblk : 62;	/* first physical block */
> 	unsigned long long es_status : 2;	/* record the status of extent */
> };
> 
> int main(int argc, char **argv)
> {
> 	struct extent_status es;
> 
> 	es.es_status = 3;
> 
> 	printf("%d\n", es.es_status);
> 	printf("size %u\n", sizeof(es));
> }
> EOF
> sparse /tmp/foo.c
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sam Ravnborg - Feb. 3, 2013, 7:30 a.m.
On Sat, Feb 02, 2013 at 10:03:43PM -0500, Theodore Ts'o wrote:
> On Wed, Jan 30, 2013 at 10:43:33AM +0800, Zheng Liu wrote:
> > 
> > Clang is first coming in my mind.  I know that some one try to use it
> > to build a linux kernel and get a lot of problems that are about gcc
> > extension.  But for us it seems that things are not too bad. ;)
> 
> Clang accepts bitfields with "unsigned long long", but I've discovered
> something which does _not_ support unsigned long long --- the "sparse"
> tool.  :-(
> 
> I discovered this when running "make C=1", i.e.:
> 
>   rm -f fs/ext4/extents_status.o
>   make C=1 fs/ext4/extents_status.o

Small hint...
If you use:

    make C=2 fs/ext4/extents_status.o

Then kbuild will run sparse on all targets you specify,
even if they do not need to be rebuild.
In other words - you then do not need to delete the .o file first.

This works for all the usual ways you can specify a target so to
check all of ext4 you just issue:

    make C=2 fs/ext4/

	Sam
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Carpenter - Feb. 3, 2013, 8:19 a.m.
On Sun, Feb 03, 2013 at 01:21:13PM +0800, Zheng Liu wrote:
> On Sat, Feb 02, 2013 at 10:03:43PM -0500, Theodore Ts'o wrote:
> > On Wed, Jan 30, 2013 at 10:43:33AM +0800, Zheng Liu wrote:
> > > 
> > > Clang is first coming in my mind.  I know that some one try to use it
> > > to build a linux kernel and get a lot of problems that are about gcc
> > > extension.  But for us it seems that things are not too bad. ;)
> > 
> > Clang accepts bitfields with "unsigned long long", but I've discovered
> > something which does _not_ support unsigned long long --- the "sparse"
> > tool.  :-(
> 
> Yes, this problem has been reported by Fengguang.  So I am plan to use
> another method to define extent_status structure as last time we
> discuessed.  What do you think?
> 

I don't get this warning on my version of Sparse.

Sparse used to assume -m32 all the time but now that's been changed.
Are you using the most recent version of Sparse?  Try passing -m64.

regards,
dan carpenter

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Feb. 3, 2013, 2:57 p.m.
On Sun, Feb 03, 2013 at 11:19:43AM +0300, Dan Carpenter wrote:
> 
> I don't get this warning on my version of Sparse.
> 
> Sparse used to assume -m32 all the time but now that's been changed.
> Are you using the most recent version of Sparse?  Try passing -m64.

Hmm, I got my version of sparse from

git://git.kernel.org/pub/scm/devel/sparse/sparse.git

... where the latest version is 0.4.4, dated November 21, 2011.  Is
there a different dit repository I should be using?

I see that it doesn't complain with -m64, but the test program should
be valid for x86 with 32 bits just as much as 64 bits.  Am I missing
something?

						- Ted
#!/bin/sh
cat > /tmp/testcase.c << EOF
#include <unistd.h>
#include <stdio.h>

struct rb_node {
	unsigned long  __rb_parent_color;
	struct rb_node *rb_right;
	struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));

struct extent_status {
	struct rb_node rb_node;
	unsigned long es_lblk;		/* first logical block extent covers */
	unsigned long es_len;		/* length of extent in block */
	unsigned long long es_pblk : 62;	/* first physical block */
	unsigned long long es_status : 2;	/* record the status of extent */
};

int main(int argc, char **argv)
{
	struct extent_status es;

	es.es_status = 3;

	printf("%d\n", es.es_status);
	printf("size %u\n", sizeof(es));
}
EOF
echo "sparse /tmp/testcase.c"
sparse /tmp/testcase.c
echo " "
echo "sparse -m32 /tmp/testcase.c"
sparse -m32 /tmp/testcase.c
echo " "
echo "sparse -m64 /tmp/testcase.c"
sparse -m64 /tmp/testcase.c

Patch

diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 81e9339..2eb9cc3 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -20,10 +20,22 @@ 
 #define es_debug(fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
 #endif
 
+enum {
+	EXTENT_STATUS_WRITTEN = 0,	/* written extent */
+	EXTENT_STATUS_UNWRITTEN = 1,	/* unwritten extent */
+	EXTENT_STATUS_DELAYED = 2,	/* delayed extent */
+};
+
+/*
+ * Here for save memory es_status is stashed into es_pblk because we only have
+ * 48 bits physical block and es_status only needs 2 bits.
+ */
 struct extent_status {
 	struct rb_node rb_node;
-	ext4_lblk_t es_lblk;	/* first logical block extent covers */
-	ext4_lblk_t es_len;	/* length of extent in block */
+	ext4_lblk_t es_lblk;		/* first logical block extent covers */
+	ext4_lblk_t es_len;		/* length of extent in block */
+	ext4_fsblk_t es_pblk : 62;	/* first physical block */
+	ext4_fsblk_t es_status : 2;	/* record the status of extent */
 };
 
 struct ext4_es_tree {