Patchwork [GIT,PULL] ext4 updates for 3.9

login
register
mail settings
Submitter Theodore Ts'o
Date Feb. 27, 2013, 6:49 p.m.
Message ID <20130227184912.GA19624@thunk.org>
Download mbox | patch
Permalink /patch/223677/
State Not Applicable
Headers show

Comments

Theodore Ts'o - Feb. 27, 2013, 6:49 p.m.
Markus, Dave, can you confirm that this fixes your problem?

Thanks!!

(Sigh, this is a real brown paper bug; I'm embarassed I missed this in
my code review.)

						- Ted

From f47f0d11096ca5d9e1965d8a9f266aa13fe2b73b Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Wed, 27 Feb 2013 13:47:52 -0500
Subject: [PATCH] ext4: fix extent status tree regression for file systems >
 512GB

This fixes a regression introduced by commit f7fec032aa782.  The
problem was that the extents status flags caused us to mask out block
numbers smaller than 2**28 blocks.  Since we didn't test with file
systems smaller than 512GB, we didn't notice this during the
development cycle.

A typical failure looks like this:

EXT4-fs error (device sdb1): htree_dirblock_to_tree:919: inode #172235804: block
152052301: comm ls: bad entry in directory: rec_len is smaller than minimal -
offset=0(0), inode=0, rec_len=0, name_len=0

... where 'debugfs -R "stat <172235804>" /dev/sdb1' reports that the
inode has block number 688923213.  When viewed in hex, block number
152052301 (from the syslog) is 0x910224D, while block number 688923213
is 0x2910224D.  Note the missing "0x20000000" in the block number.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/ext4/extents_status.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
Markus Trippelsdorf - Feb. 27, 2013, 6:56 p.m.
On 2013.02.27 at 13:49 -0500, Theodore Ts'o wrote:

> Markus, Dave, can you confirm that this fixes your problem?

Yes, it fixes the issue.
Thanks Ted.
Dave Jones - Feb. 27, 2013, 6:57 p.m.
On Wed, Feb 27, 2013 at 01:49:12PM -0500, Theodore Ts'o wrote:
 > Markus, Dave, can you confirm that this fixes your problem?
 > 
 > Thanks!!
 > 
 > (Sigh, this is a real brown paper bug; I'm embarassed I missed this in
 > my code review.)

Building now.  Can you confirm that nothing on-disk should be awry ?
Or will I need a new fsck to detect what happened ?

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zheng Liu - Feb. 27, 2013, 6:59 p.m.
Hi Ted,

On 02/28/2013 02:49 AM, Theodore Ts'o wrote:
> Markus, Dave, can you confirm that this fixes your problem?
> 
> Thanks!!
> 
> (Sigh, this is a real brown paper bug; I'm embarassed I missed this in
> my code review.)

Sorry, I don't have a big disk in my hand now.  So I could reproduce it.
 But the patch looks good.  Thanks for fixing it.

Regards,
						- Zheng

> 
> 						- Ted
> 
> From f47f0d11096ca5d9e1965d8a9f266aa13fe2b73b Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Wed, 27 Feb 2013 13:47:52 -0500
> Subject: [PATCH] ext4: fix extent status tree regression for file systems >
>  512GB
> 
> This fixes a regression introduced by commit f7fec032aa782.  The
> problem was that the extents status flags caused us to mask out block
> numbers smaller than 2**28 blocks.  Since we didn't test with file
> systems smaller than 512GB, we didn't notice this during the
> development cycle.
> 
> A typical failure looks like this:
> 
> EXT4-fs error (device sdb1): htree_dirblock_to_tree:919: inode #172235804: block
> 152052301: comm ls: bad entry in directory: rec_len is smaller than minimal -
> offset=0(0), inode=0, rec_len=0, name_len=0
> 
> ... where 'debugfs -R "stat <172235804>" /dev/sdb1' reports that the
> inode has block number 688923213.  When viewed in hex, block number
> 152052301 (from the syslog) is 0x910224D, while block number 688923213
> is 0x2910224D.  Note the missing "0x20000000" in the block number.
> 
> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
> Reported-by: Dave Jones <davej@redhat.com>
> Cc: Zheng Liu <gnehzuil.liu@gmail.com>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  fs/ext4/extents_status.h | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
> index cf83e77..c795ff6 100644
> --- a/fs/ext4/extents_status.h
> +++ b/fs/ext4/extents_status.h
> @@ -20,10 +20,10 @@
>  #define es_debug(fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
>  #endif
>  
> -#define EXTENT_STATUS_WRITTEN	0x80000000	/* written extent */
> -#define EXTENT_STATUS_UNWRITTEN	0x40000000	/* unwritten extent */
> -#define EXTENT_STATUS_DELAYED	0x20000000	/* delayed extent */
> -#define EXTENT_STATUS_HOLE	0x10000000	/* hole */
> +#define EXTENT_STATUS_WRITTEN	(((unsigned long long) 1) << 63)
> +#define EXTENT_STATUS_UNWRITTEN	(((unsigned long long) 1) << 62)
> +#define EXTENT_STATUS_DELAYED	(((unsigned long long) 1) << 61)
> +#define EXTENT_STATUS_HOLE	(((unsigned long long) 1) << 60)
>  
>  #define EXTENT_STATUS_FLAGS	(EXTENT_STATUS_WRITTEN | \
>  				 EXTENT_STATUS_UNWRITTEN | \
> @@ -58,22 +58,22 @@ extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
>  
>  static inline int ext4_es_is_written(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_WRITTEN);
> +	return (es->es_pblk & EXTENT_STATUS_WRITTEN) != 0;
>  }
>  
>  static inline int ext4_es_is_unwritten(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN);
> +	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN) != 0;
>  }
>  
>  static inline int ext4_es_is_delayed(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_DELAYED);
> +	return (es->es_pblk & EXTENT_STATUS_DELAYED) != 0;
>  }
>  
>  static inline int ext4_es_is_hole(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_HOLE);
> +	return (es->es_pblk & EXTENT_STATUS_HOLE) != 0;
>  }
>  
>  static inline ext4_fsblk_t ext4_es_status(struct extent_status *es)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Feb. 27, 2013, 7:04 p.m.
On Wed, Feb 27, 2013 at 01:57:55PM -0500, Dave Jones wrote:
> Building now.  Can you confirm that nothing on-disk should be awry ?
> Or will I need a new fsck to detect what happened ?
> 

Well, it's possible that a read from data file which had blocks
located above 512GB might have gotten bogus information, or a block
allocated above 512GB might result in a write to the wrong place on
disk.

So if you are very cautious, running fsck just to make sure things are
OK is not a bad idea.  But it's likely that the directory sanity
checks would have caught things quickly, or trying run an executable
would have caused a seg fault quickly enough.  If your system crashed
very quickly during the boot process, you'll probably be OK.

       	      	      	   	      	  - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov - Feb. 27, 2013, 7:06 p.m.
On Wed, Feb 27, 2013 at 01:49:12PM -0500, Theodore Ts'o wrote:
> -#define EXTENT_STATUS_WRITTEN	0x80000000	/* written extent */
> -#define EXTENT_STATUS_UNWRITTEN	0x40000000	/* unwritten extent */
> -#define EXTENT_STATUS_DELAYED	0x20000000	/* delayed extent */
> -#define EXTENT_STATUS_HOLE	0x10000000	/* hole */
> +#define EXTENT_STATUS_WRITTEN	(((unsigned long long) 1) << 63)
> +#define EXTENT_STATUS_UNWRITTEN	(((unsigned long long) 1) << 62)
> +#define EXTENT_STATUS_DELAYED	(((unsigned long long) 1) << 61)
> +#define EXTENT_STATUS_HOLE	(((unsigned long long) 1) << 60)

Just a nitpick:

1ULL << ...

is shorter and probably the standard way we do flags.
Dave Jones - Feb. 27, 2013, 7:11 p.m.
On Wed, Feb 27, 2013 at 02:04:19PM -0500, Theodore Ts'o wrote:
 > On Wed, Feb 27, 2013 at 01:57:55PM -0500, Dave Jones wrote:
 > > Building now.  Can you confirm that nothing on-disk should be awry ?
 > > Or will I need a new fsck to detect what happened ?
 > > 
 > 
 > Well, it's possible that a read from data file which had blocks
 > located above 512GB might have gotten bogus information, or a block
 > allocated above 512GB might result in a write to the wrong place on
 > disk.
 > 
 > So if you are very cautious, running fsck just to make sure things are
 > OK is not a bad idea.  But it's likely that the directory sanity
 > checks would have caught things quickly, or trying run an executable
 > would have caused a seg fault quickly enough.  If your system crashed
 > very quickly during the boot process, you'll probably be OK.

It had been up a few hours before I hit those checks.

fsck never found anything.  I do have another disk (XFS formatted) with
a backup from a week ago. I'll run a --dry-run rsync to see if it
picks up any changes that I don't expect.

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Jones - Feb. 27, 2013, 7:19 p.m.
On Wed, Feb 27, 2013 at 07:56:25PM +0100, Markus Trippelsdorf wrote:
 > On 2013.02.27 at 13:49 -0500, Theodore Ts'o wrote:
 > 
 > > Markus, Dave, can you confirm that this fixes your problem?
 > 
 > Yes, it fixes the issue.

Looks like it's fixed here too.

How did this make it through -next without anyone hitting it ?

I can't remember how many years ago I last bought a disk < 1TB,
and I can't be alone.  Or is everyone all about SSDs these days?

Is anyone running xfstests or similar on linux-next regularly ?

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Feb. 27, 2013, 7:19 p.m.
On Wed, Feb 27, 2013 at 02:11:15PM -0500, Dave Jones wrote:
> 
> fsck never found anything.  I do have another disk (XFS formatted) with
> a backup from a week ago. I'll run a --dry-run rsync to see if it
> picks up any changes that I don't expect.

Probably a good idea.

Given that Markus reports that this fixes the problem for him, and
regression smoke test has passed, I'll get a cleaned up patch ready
for Linus to pull ASAP.

My deep apologies that we didn't catch this earlier.  i'm going to
think about how we can improve our testing regime to make sure
problems like this get caught earlier.

				- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zheng Liu - Feb. 27, 2013, 7:27 p.m.
On 02/28/2013 03:19 AM, Dave Jones wrote:
> On Wed, Feb 27, 2013 at 07:56:25PM +0100, Markus Trippelsdorf wrote:
>  > On 2013.02.27 at 13:49 -0500, Theodore Ts'o wrote:
>  > 
>  > > Markus, Dave, can you confirm that this fixes your problem?
>  > 
>  > Yes, it fixes the issue.
> 
> Looks like it's fixed here too.
> 
> How did this make it through -next without anyone hitting it ?

Sorry, my apology.

> 
> I can't remember how many years ago I last bought a disk < 1TB,
> and I can't be alone.  Or is everyone all about SSDs these days?

I admit that I run xfstests on a SSD that has 128G.

> 
> Is anyone running xfstests or similar on linux-next regularly ?

Ted has a dev branch for -next, and I work on this branch.

Regards,
						- Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Feb. 27, 2013, 7:29 p.m.
On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
> 
> Looks like it's fixed here too.
> 
> How did this make it through -next without anyone hitting it ?
> 
> I can't remember how many years ago I last bought a disk < 1TB,
> and I can't be alone.  Or is everyone all about SSDs these days?

I use LVM, so I have a number of volues which are smaler than 512GB,
but very few which are actually larger than 1TB.  And none on my test
boxes.  I was running the bleeding edge ext4 code on my laptop as for
dogfooding purposes, but I have an 80GB mSATA SSD and a 500GB HDD on
my X230 laptop (it requires a thin laptop drive, and 7mm drives don't
come any bigger, alas).

> Is anyone running xfstests or similar on linux-next regularly ?

I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
before I submitted a pull request.  The problem is that XFSTESTS is
S-L-O-W if you use large partitions, so typically I use a 5GB
partition sizes for my test runs.  Normally we're worried about race
condition bugs, not something as bone-headed as a bitmasking problem,
so it makes sense to use a smaller disk for most of your testing.
(Some folks do their xfstests run on SSD's or tmpfs image files, again
for speed reasons, and it's unlikely they would be big enough.)

So what we probably need to do is to have a separate set of tests
using a loopback mount, and perhaps an artificially created file
system which has a large percentage of the blocks in the middle of the
file system busied out, to make efficient testing of these sorts of
bugs more efficient.  As I said, I'm thinking about how's the best way
to improve our testing regime to catch such problems the next time around.

Cheers,

   	       	       	      	       	    - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitri Monakho - Feb. 27, 2013, 8:58 p.m.
On Wed, 27 Feb 2013 14:29:07 -0500, Theodore Ts'o <tytso@mit.edu> wrote:
> On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
> > 
> > Looks like it's fixed here too.
> > 
> > How did this make it through -next without anyone hitting it ?
> > 
> > I can't remember how many years ago I last bought a disk < 1TB,
> > and I can't be alone.  Or is everyone all about SSDs these days?
> 
> I use LVM, so I have a number of volues which are smaler than 512GB,
> but very few which are actually larger than 1TB.  And none on my test
> boxes.  I was running the bleeding edge ext4 code on my laptop as for
> dogfooding purposes, but I have an 80GB mSATA SSD and a 500GB HDD on
> my X230 laptop (it requires a thin laptop drive, and 7mm drives don't
> come any bigger, alas).
> 
> > Is anyone running xfstests or similar on linux-next regularly ?
> 
> I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
> before I submitted a pull request.  The problem is that XFSTESTS is
> S-L-O-W if you use large partitions, so typically I use a 5GB
Indeed. That's why i give-up rotated disks and run xfstest only on SSD
or brd module 
> partition sizes for my test runs.  Normally we're worried about race
> condition bugs, not something as bone-headed as a bitmasking problem,
> so it makes sense to use a smaller disk for most of your testing.
> (Some folks do their xfstests run on SSD's or tmpfs image files, again
> for speed reasons, and it's unlikely they would be big enough.)
> 
> So what we probably need to do is to have a separate set of tests
> using a loopback mount, and perhaps an artificially created file
> system which has a large percentage of the blocks in the middle of the
> file system busied out, to make efficient testing of these sorts of
> bugs more efficient.  As I said, I'm thinking about how's the best way
> to improve our testing regime to catch such problems the next time around.
Amazing idea. Something like:

#dd if=/dev/zero of=/tmp/fs.img bs=1M seek=2000000 count=1
#mkfs.ext4 -m0 -i4096000 /tmp/fs.img
#mount /tmp/fs.img /mnt/ -oloop
#for ((i=0; i < 2000; i++));do   fallocate -l $((1024*1024*1024)) /mnt/f$i ;done
#for ((i=0; i < 2000; i++));do   truncate -s $((1023*1024*1024)) /mnt/f$i ;done

As result file system image has 2gb of free space wich is fragmented to ~2000
chunks 1Mb each. But image itself is quite small
# df /mnt
Filesystem      1K-blocks       Used Available Use% Mounted on
/dev/loop0     2047678076 2045679228   1998848 100% /mnt
# du -sch /tmp/fs.img 
242M     /tmp/fs.img
242M     total

Later we can simply run xfstest/fio/fsx on this image.
I'll prepare new xfstest based on that idea. But the only disadvantage
is that loop dev has bottleneck, all requests will be serialized on i_mutex.
> 
> Cheers,
> 
>    	       	       	      	       	    - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Feb. 27, 2013, 9:30 p.m.
On Thu, Feb 28, 2013 at 12:58:55AM +0400, Dmitry Monakhov wrote:
> Indeed. That's why i give-up rotated disks and run xfstest only on SSD
> or brd module 

Note that some problems only show up on slower devices (because then
the race window is wider), and sometimes they only show up on fast
devices.  This is why I've been doing runs on both HDD's and on tmpfs
kvm disk image files.

> Later we can simply run xfstest/fio/fsx on this image.
> I'll prepare new xfstest based on that idea. But the only disadvantage
> is that loop dev has bottleneck, all requests will be serialized on i_mutex.

If the image file is used as a disk file for KVM, that should avoid
the serialization bottleneck of using the loop device.

Regards,

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner - Feb. 28, 2013, 1:18 p.m.
On Wed, Feb 27, 2013 at 02:29:07PM -0500, Theodore Ts'o wrote:
> On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
> > 
> > Looks like it's fixed here too.
> > 
> > How did this make it through -next without anyone hitting it ?
> 
> > Is anyone running xfstests or similar on linux-next regularly ?
> 
> I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
> before I submitted a pull request.  The problem is that XFSTESTS is
> S-L-O-W if you use large partitions, so typically I use a 5GB
> partition sizes for my test runs.

This isn't the case for XFS. I typically see 1TB scratch devices
only being ~10-20% slower than 10GB scratch devices, and 10TB only
being a little slower than 1TB scratch devices. I have to use sparse
devices and --large-fs for 100TB filesystem testing, so I can't
directly compare the speeds to those that I run on physical devices.
However I can say that it isn't significantly slower than using
small scratch devices...

> So what we probably need to do is to have a separate set of tests
> using a loopback mount, and perhaps an artificially created file
> system which has a large percentage of the blocks in the middle of the
> file system busied out, to make efficient testing of these sorts of

That's exactly what the --large-fs patch set I posted months ago does
for ext4 - it uses fallocate() to fill all but 50GB of the large
filesystem without actually writing any data and runs the standard
tests in the remaining unused space.

However, last time I tested ext4 with this patchset (when I posted
the patches months ago), multi-TB preallocation on ext4 was still too
slow to make it practical for testing on devices larger than 2-3TB.
Perhaps it would make testing 1-2TB ext4 filesystems fast enough
that you could do it regularly...

Cheers,

Dave.
Eric Sandeen - March 1, 2013, 3:41 p.m.
On 2/27/13 2:58 PM, Dmitry Monakhov wrote:
> On Wed, 27 Feb 2013 14:29:07 -0500, Theodore Ts'o <tytso@mit.edu> wrote:
>> On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
>>>
>>> Looks like it's fixed here too.
>>>
>>> How did this make it through -next without anyone hitting it ?
>>>
>>> I can't remember how many years ago I last bought a disk < 1TB,
>>> and I can't be alone.  Or is everyone all about SSDs these days?
>>
>> I use LVM, so I have a number of volues which are smaler than 512GB,
>> but very few which are actually larger than 1TB.  And none on my test
>> boxes.  I was running the bleeding edge ext4 code on my laptop as for
>> dogfooding purposes, but I have an 80GB mSATA SSD and a 500GB HDD on
>> my X230 laptop (it requires a thin laptop drive, and 7mm drives don't
>> come any bigger, alas).
>>
>>> Is anyone running xfstests or similar on linux-next regularly ?
>>
>> I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
>> before I submitted a pull request.  The problem is that XFSTESTS is
>> S-L-O-W if you use large partitions, so typically I use a 5GB
> Indeed. That's why i give-up rotated disks and run xfstest only on SSD
> or brd module 
>> partition sizes for my test runs.  Normally we're worried about race
>> condition bugs, not something as bone-headed as a bitmasking problem,
>> so it makes sense to use a smaller disk for most of your testing.
>> (Some folks do their xfstests run on SSD's or tmpfs image files, again
>> for speed reasons, and it's unlikely they would be big enough.)
>>
>> So what we probably need to do is to have a separate set of tests
>> using a loopback mount, and perhaps an artificially created file
>> system which has a large percentage of the blocks in the middle of the
>> file system busied out, to make efficient testing of these sorts of
>> bugs more efficient.  As I said, I'm thinking about how's the best way
>> to improve our testing regime to catch such problems the next time around.
> Amazing idea. Something like:
> 
> #dd if=/dev/zero of=/tmp/fs.img bs=1M seek=2000000 count=1
> #mkfs.ext4 -m0 -i4096000 /tmp/fs.img
> #mount /tmp/fs.img /mnt/ -oloop
> #for ((i=0; i < 2000; i++));do   fallocate -l $((1024*1024*1024)) /mnt/f$i ;done
> #for ((i=0; i < 2000; i++));do   truncate -s $((1023*1024*1024)) /mnt/f$i ;done
> 
> As result file system image has 2gb of free space wich is fragmented to ~2000
> chunks 1Mb each. But image itself is quite small
> # df /mnt
> Filesystem      1K-blocks       Used Available Use% Mounted on
> /dev/loop0     2047678076 2045679228   1998848 100% /mnt
> # du -sch /tmp/fs.img 
> 242M     /tmp/fs.img
> 242M     total
> 
> Later we can simply run xfstest/fio/fsx on this image.
> I'll prepare new xfstest based on that idea. But the only disadvantage
> is that loop dev has bottleneck, all requests will be serialized on i_mutex.

Before anyone does too much work, it would be worth revisiting
dchinner's
	[PATCH 0/10] xfstests: rework large filesystem testing
series from July 2012 to see if it meets the needs already.

It almost got all reviews, with one sticking point left, AFAICT.

-Eric



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index cf83e77..c795ff6 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -20,10 +20,10 @@ 
 #define es_debug(fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
 #endif
 
-#define EXTENT_STATUS_WRITTEN	0x80000000	/* written extent */
-#define EXTENT_STATUS_UNWRITTEN	0x40000000	/* unwritten extent */
-#define EXTENT_STATUS_DELAYED	0x20000000	/* delayed extent */
-#define EXTENT_STATUS_HOLE	0x10000000	/* hole */
+#define EXTENT_STATUS_WRITTEN	(((unsigned long long) 1) << 63)
+#define EXTENT_STATUS_UNWRITTEN	(((unsigned long long) 1) << 62)
+#define EXTENT_STATUS_DELAYED	(((unsigned long long) 1) << 61)
+#define EXTENT_STATUS_HOLE	(((unsigned long long) 1) << 60)
 
 #define EXTENT_STATUS_FLAGS	(EXTENT_STATUS_WRITTEN | \
 				 EXTENT_STATUS_UNWRITTEN | \
@@ -58,22 +58,22 @@  extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 
 static inline int ext4_es_is_written(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_WRITTEN);
+	return (es->es_pblk & EXTENT_STATUS_WRITTEN) != 0;
 }
 
 static inline int ext4_es_is_unwritten(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN);
+	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN) != 0;
 }
 
 static inline int ext4_es_is_delayed(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_DELAYED);
+	return (es->es_pblk & EXTENT_STATUS_DELAYED) != 0;
 }
 
 static inline int ext4_es_is_hole(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_HOLE);
+	return (es->es_pblk & EXTENT_STATUS_HOLE) != 0;
 }
 
 static inline ext4_fsblk_t ext4_es_status(struct extent_status *es)