[RFC] fstests: Check if a fs can survive random (emulated) power loss

Message ID 20180226073111.3066-1-wqu@suse.com
State Not Applicable, archived

Commit Message

Qu Wenruo Feb. 26, 2018, 7:31 a.m.
This test case was originally designed to expose unexpected corruption
in btrfs, following several reports of serious btrfs metadata
corruption after power loss.

The test case runs a heavy fsstress workload on the fs, then uses
dm-flakey to emulate power loss by dropping all later writes.
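
As a rough illustration of the emulation (a sketch, not the code from
common/dmflakey): a dm-flakey table has the form
"<start> <len> flakey <dev> <offset> <up> <down> [features]". The helper
name flakey_table, the 180-second interval and the /dev/sdX path below are
assumptions for illustration. The drop_writes feature is what silently
discards writes while the device is in its "down" interval.

```shell
#!/bin/bash
# Hypothetical helper: print a dm-flakey table line for a device.
#   $1 = backing device path, $2 = size in 512-byte sectors,
#   $3 = "up" (pass all I/O through) or "drop" (discard all writes)
flakey_table()
{
	local dev=$1 sectors=$2 mode=$3

	case "$mode" in
	up)
		# always up: 180s up interval, 0s down interval
		echo "0 $sectors flakey $dev 0 180 0"
		;;
	drop)
		# always down, with 1 feature argument: drop_writes
		echo "0 $sectors flakey $dev 0 0 180 1 drop_writes"
		;;
	*)
		echo "unknown mode: $mode" >&2
		return 1
		;;
	esac
}

# Print both tables for an example 1 GiB device (2097152 sectors).
flakey_table /dev/sdX 2097152 up
flakey_table /dev/sdX 2097152 drop
```

On a real system, switching tables would presumably be done as root with
dmsetup suspend, dmsetup load <name> --table "...", and dmsetup resume. The
sketch only prints the table strings, so it is safe to run anywhere.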

For btrfs, this should be completely fine as long as the superblock
write (a FUA write) finishes atomically: with metadata CoW, the
superblock points to either the old trees or the new trees, so the fs
should be as atomic as the superblock.
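
The reasoning can be sketched with a toy model (this is not btrfs code; the
block names and the "block=value" write format are made up for
illustration). Each commit writes a new tree block to a fresh location
first and only then updates the superblock, so cutting the write stream at
any point leaves a superblock that references a fully written tree.

```shell
#!/bin/bash
# Toy model (not btrfs code): each write is "block=value".
# A commit first writes a new tree block at a fresh location (CoW),
# and only then updates the superblock to point at it.
writes=(
	"tree1=gen1" "sb=tree1"
	"tree2=gen2" "sb=tree2"
	"tree3=gen3" "sb=tree3"
)

# Replay only the first $1 writes, as if power was lost right after them,
# then verify that the superblock points at a tree that is on disk.
check_after_cut()
{
	local cut=$1 i w
	local -A disk=()

	for ((i = 0; i < cut; i++)); do
		w=${writes[i]}
		disk[${w%%=*}]=${w#*=}
	done

	# No superblock yet: an empty fs, which is still consistent.
	[ -z "${disk[sb]}" ] && return 0
	# The tree the superblock references must already be on disk.
	[ -n "${disk[${disk[sb]}]}" ]
}

for ((cut = 0; cut <= ${#writes[@]}; cut++)); do
	if check_after_cut "$cut"; then
		echo "cut after $cut writes: consistent"
	else
		echo "cut after $cut writes: CORRUPTED"
	fi
done
```

If the order is reversed so that a superblock update lands before the tree
it references, a cut between the two writes fails the check. That is the
kind of ordering problem the test is meant to expose.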

For journal-based filesystems, each metadata update should be journaled,
so metadata operations are as atomic as journal updates.

The results show that XFS does best among the tested filesystems
(Btrfs, XFS, ext4): no kernel or xfs_repair problems at all.

For btrfs, although btrfs check doesn't report any problem, the kernel
reports some data checksum errors, which is a little unexpected since
data is CoWed by default and should be as atomic as the superblock.
(Unfortunately, this is still not the exact problem I'm chasing.)

For ext4, the kernel is fine, but a later e2fsck reports problems, which
may indicate there is still something to be improved.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/generic/479     | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/479.out |   2 +
 tests/generic/group   |   1 +
 3 files changed, 112 insertions(+)
 create mode 100755 tests/generic/479
 create mode 100644 tests/generic/479.out

Comments

Amir Goldstein Feb. 26, 2018, 8:15 a.m. | #1
On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo <wqu@suse.com> wrote:
> This test case is originally designed to expose unexpected corruption
> for btrfs, where there are several reports about btrfs serious metadata
> corruption after power loss.
>
> The test case itself will trigger heavy fsstress for the fs, and use
> dm-flakey to emulate power loss by dropping all later writes.
>

Come on... dm-flakey is so 2016
You should take Josef's fsstress+log-writes test and bring it to fstests:
https://github.com/josefbacik/log-writes

By doing that you will gain two very important features from the test:

1. Problems will be discovered much faster, because the test can run fsck
    after every single block write has been replayed instead of just at random
    times like in your test

2. An absolute guarantee of reproducing the problem by replaying the write log.
    Even though your fsstress could use a pre-defined random seed, the results
    will be far from reproducible, because of process and IO scheduling
    differences between subsequent test runs.
    When you catch an inconsistency with the log-writes test, you can send the
    write-log recording to the maintainer to analyze the problem, even if it is
    a hard problem to hit. I used that technique for ext4, btrfs and xfs when
    I ran tests with generic/455 and found problems.

Cheers,
Amir.
Qu Wenruo Feb. 26, 2018, 8:20 a.m. | #2
On 2018-02-26 16:15, Amir Goldstein wrote:
> On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo <wqu@suse.com> wrote:
>> This test case is originally designed to expose unexpected corruption
>> for btrfs, where there are several reports about btrfs serious metadata
>> corruption after power loss.
>>
>> The test case itself will trigger heavy fsstress for the fs, and use
>> dm-flakey to emulate power loss by dropping all later writes.
>>
> 
> Come on... dm-flakey is so 2016
> You should take Josef's fsstress+log-writes test and bring it to fstests:
> https://github.com/josefbacik/log-writes
> 
> By doing that you will gain two very important features from the test:
> 
> 1. Problems will be discovered much faster, because the test can run fsck
>     after every single block write has been replayed instead of just at random
>     times like in your test

That's exactly what I want!!!

Many thanks for this one! I will definitely look into it.
(Although the initial commit is even older than 2016)


But the test itself can already expose something on ext4, so it still
makes some sense for ext4 developers as a verification test case.

Thanks,
Qu

> 
> 2. Absolute guaranty to reproducing the problem by replaying the write log.
>     Even though your fsstress could use a pre-defined random seed to results
>     will be far from reproduciable, because of process and IO scheduling
>     differences between subsequent test runs.
>     When you catch an inconsistency with log-writes test, you can send the
>     write-log recording to the maintainer to analyze the problem, even if it is
>     a hard problem to hit. I used that useful technique for ext4,btrfs,xfs when
>     ran tests with generic/455 and found problems.
> 
> Cheers,
> Amir.
Amir Goldstein Feb. 26, 2018, 8:33 a.m. | #3
On Mon, Feb 26, 2018 at 10:20 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2018-02-26 16:15, Amir Goldstein wrote:
>> On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo <wqu@suse.com> wrote:
>>> This test case is originally designed to expose unexpected corruption
>>> for btrfs, where there are several reports about btrfs serious metadata
>>> corruption after power loss.
>>>
>>> The test case itself will trigger heavy fsstress for the fs, and use
>>> dm-flakey to emulate power loss by dropping all later writes.
>>>
>>
>> Come on... dm-flakey is so 2016
>> You should take Josef's fsstress+log-writes test and bring it to fstests:
>> https://github.com/josefbacik/log-writes
>>
>> By doing that you will gain two very important features from the test:
>>
>> 1. Problems will be discovered much faster, because the test can run fsck
>>     after every single block write has been replayed instead of just at random
>>     times like in your test
>
> That's what exactly I want!!!
>
> Great thanks for this one! I would definitely look into this.
> (Although the initial commit is even older than 2016)
>

Please note that Josef's replay-individual-faster.sh script runs fsck
every 1000 writes (i.e. --check 1000), so you can play with this argument
in your test. You can also run --fsck on every --check fua or --check flush,
which may be more indicative of real-world problems. Not sure.
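
For illustration, a dry-run sketch of how the check interval could be
parameterized. It only prints a replay-log command line and runs nothing.
The --fsck and --check options are the ones mentioned above; --log and
--replay are assumed to be the tool's options for the log and target
devices. The helper name, device paths and fsck script are placeholders.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the replay-log invocation instead of
# running it.
#   $1 = log device, $2 = replay device, $3 = fsck command,
#   $4 = check interval: a number of entries, "fua" or "flush"
replay_cmd()
{
	local logdev=$1 replaydev=$2 fsck=$3 check=$4

	case "$check" in
	fua|flush) ;;
	''|*[!0-9]*)
		echo "invalid --check value: $check" >&2
		return 1
		;;
	esac

	printf 'replay-log --log %s --replay %s --fsck "%s" --check %s\n' \
		"$logdev" "$replaydev" "$fsck" "$check"
}

replay_cmd /dev/logdev /dev/replaydev "./fsck-script.sh" 1000
replay_cmd /dev/logdev /dev/replaydev "./fsck-script.sh" fua
```

A smaller numeric interval checks more often and narrows down the first bad
write more precisely, at the cost of a longer replay.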

>
> But the test itself could already expose something on EXT4, it still
> makes some sense for ext4 developers as a verification test case.
>

Please take a look at generic/456.
When generic/455 found a reproducible problem in ext4,
I created a specific test without any randomness to pinpoint the
problem found (using dm-flakey).
If the problem you found is reproducible, then it will be easy for you
to create a similar "bisected" test.

Thanks,
Amir.
Qu Wenruo Feb. 26, 2018, 8:41 a.m. | #4
On 2018-02-26 16:33, Amir Goldstein wrote:
> On Mon, Feb 26, 2018 at 10:20 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2018-02-26 16:15, Amir Goldstein wrote:
>>> On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo <wqu@suse.com> wrote:
>>>> This test case is originally designed to expose unexpected corruption
>>>> for btrfs, where there are several reports about btrfs serious metadata
>>>> corruption after power loss.
>>>>
>>>> The test case itself will trigger heavy fsstress for the fs, and use
>>>> dm-flakey to emulate power loss by dropping all later writes.
>>>>
>>>
>>> Come on... dm-flakey is so 2016
>>> You should take Josef's fsstress+log-writes test and bring it to fstests:
>>> https://github.com/josefbacik/log-writes
>>>
>>> By doing that you will gain two very important features from the test:
>>>
>>> 1. Problems will be discovered much faster, because the test can run fsck
>>>     after every single block write has been replayed instead of just at random
>>>     times like in your test
>>
>> That's what exactly I want!!!
>>
>> Great thanks for this one! I would definitely look into this.
>> (Although the initial commit is even older than 2016)
>>
> 
> Please note that Josef's replay-individual-faster.sh script runs fsck
> every 1000 writes (i.e. --check 1000), so you can play with this argument
> in your test. Can also run --fsck every --check fua or --check flush, which
> may be more indicative of real world problems. not sure.
> 
>>
>> But the test itself could already expose something on EXT4, it still
>> makes some sense for ext4 developers as a verification test case.
>>
> 
> Please take a look at generic/456
> When generic/455 found a reproduciable problem in ext4,
> I created a specific test without any randomness to pin point the
> problem found (using dm-flakey).
> If the problem you found is reproduciable, then it will be easy for you
> to create a similar "bisected" test.

Yep, a pinpoint test case is definitely needed, but I'm also
wondering whether a random stress test could help too.

A test case with plain fsstress is already super helpful for exposing
some bugs, so such a stress test won't hurt.

Thanks,
Qu
> 
> Thanks,
> Amir.
>
Amir Goldstein Feb. 26, 2018, 8:45 a.m. | #5
On Mon, Feb 26, 2018 at 10:41 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2018-02-26 16:33, Amir Goldstein wrote:
>> On Mon, Feb 26, 2018 at 10:20 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>
>>>
>>> On 2018-02-26 16:15, Amir Goldstein wrote:
>>>> On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo <wqu@suse.com> wrote:
>>>>> This test case is originally designed to expose unexpected corruption
>>>>> for btrfs, where there are several reports about btrfs serious metadata
>>>>> corruption after power loss.
>>>>>
>>>>> The test case itself will trigger heavy fsstress for the fs, and use
>>>>> dm-flakey to emulate power loss by dropping all later writes.
>>>>>
>>>>
>>>> Come on... dm-flakey is so 2016
>>>> You should take Josef's fsstress+log-writes test and bring it to fstests:
>>>> https://github.com/josefbacik/log-writes
>>>>
>>>> By doing that you will gain two very important features from the test:
>>>>
>>>> 1. Problems will be discovered much faster, because the test can run fsck
>>>>     after every single block write has been replayed instead of just at random
>>>>     times like in your test
>>>
>>> That's what exactly I want!!!
>>>
>>> Great thanks for this one! I would definitely look into this.
>>> (Although the initial commit is even older than 2016)
>>>
>>
>> Please note that Josef's replay-individual-faster.sh script runs fsck
>> every 1000 writes (i.e. --check 1000), so you can play with this argument
>> in your test. Can also run --fsck every --check fua or --check flush, which
>> may be more indicative of real world problems. not sure.
>>
>>>
>>> But the test itself could already expose something on EXT4, it still
>>> makes some sense for ext4 developers as a verification test case.
>>>
>>
>> Please take a look at generic/456
>> When generic/455 found a reproduciable problem in ext4,
>> I created a specific test without any randomness to pin point the
>> problem found (using dm-flakey).
>> If the problem you found is reproduciable, then it will be easy for you
>> to create a similar "bisected" test.
>
> Yep, it's definitely needed for a pin-point test case, but I'm also
> wondering if a random, stress test could also help.
>
> Test case with plain fsstress is already super helpful to expose some
> bugs, such stress test won't hurt.
>


Yes, but the same stress test with dm-log-writes instead of dm-flakey
will be just as useful and much more, so there is no reason to merge the
less useful stress test.

Thanks,
Amir.
Qu Wenruo Feb. 26, 2018, 8:50 a.m. | #6
On 2018-02-26 16:45, Amir Goldstein wrote:
> On Mon, Feb 26, 2018 at 10:41 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2018-02-26 16:33, Amir Goldstein wrote:
>>> On Mon, Feb 26, 2018 at 10:20 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>>
>>>>
>>>> On 2018-02-26 16:15, Amir Goldstein wrote:
>>>>> On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo <wqu@suse.com> wrote:
>>>>>> This test case is originally designed to expose unexpected corruption
>>>>>> for btrfs, where there are several reports about btrfs serious metadata
>>>>>> corruption after power loss.
>>>>>>
>>>>>> The test case itself will trigger heavy fsstress for the fs, and use
>>>>>> dm-flakey to emulate power loss by dropping all later writes.
>>>>>>
>>>>>
>>>>> Come on... dm-flakey is so 2016
>>>>> You should take Josef's fsstress+log-writes test and bring it to fstests:
>>>>> https://github.com/josefbacik/log-writes
>>>>>
>>>>> By doing that you will gain two very important features from the test:
>>>>>
>>>>> 1. Problems will be discovered much faster, because the test can run fsck
>>>>>     after every single block write has been replayed instead of just at random
>>>>>     times like in your test
>>>>
>>>> That's what exactly I want!!!
>>>>
>>>> Great thanks for this one! I would definitely look into this.
>>>> (Although the initial commit is even older than 2016)
>>>>
>>>
>>> Please note that Josef's replay-individual-faster.sh script runs fsck
>>> every 1000 writes (i.e. --check 1000), so you can play with this argument
>>> in your test. Can also run --fsck every --check fua or --check flush, which
>>> may be more indicative of real world problems. not sure.
>>>
>>>>
>>>> But the test itself could already expose something on EXT4, it still
>>>> makes some sense for ext4 developers as a verification test case.
>>>>
>>>
>>> Please take a look at generic/456
>>> When generic/455 found a reproduciable problem in ext4,
>>> I created a specific test without any randomness to pin point the
>>> problem found (using dm-flakey).
>>> If the problem you found is reproduciable, then it will be easy for you
>>> to create a similar "bisected" test.
>>
>> Yep, it's definitely needed for a pin-point test case, but I'm also
>> wondering if a random, stress test could also help.
>>
>> Test case with plain fsstress is already super helpful to expose some
>> bugs, such stress test won't hurt.
>>
> 
> 
> Yes, but the same stress test with dm-log-writes instead of dm-flakey
> will be as useful and much more, so no reason to merge the less useful
> stress test.

OK, I'll try to use dm-log-writes to enhance the test case.

Thanks,
Qu

> 
> Thanks,
> Amir.

Patch

diff --git a/tests/generic/479 b/tests/generic/479
new file mode 100755
index 00000000..ab530231
--- /dev/null
+++ b/tests/generic/479
@@ -0,0 +1,109 @@ 
+#! /bin/bash
+# FS QA Test 479
+#
+# Test if a filesystem can survive emulated power loss.
+#
+# No matter which solution a filesystem uses (journal or CoW),
+# it should survive unexpected power loss without major metadata
+# corruption.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2018 SuSE.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	ps -e | grep fsstress > /dev/null 2>&1
+	while [ $? -eq 0 ]; do
+		$KILLALL_PROG -KILL fsstress > /dev/null 2>&1
+		wait > /dev/null 2>&1
+		ps -e | grep fsstress > /dev/null 2>&1
+	done
+	_unmount_flakey &> /dev/null
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_command "$KILLALL_PROG" "killall"
+
+runtime=$(($TIME_FACTOR * 15))
+loops=$(($LOAD_FACTOR * 4))
+
+for i in $(seq -w $loops); do
+	echo "=== Loop $i: $(date) ===" >> $seqres.full
+
+	_scratch_mkfs >/dev/null 2>&1
+	_init_flakey
+	_mount_flakey
+
+	($FSSTRESS_PROG $FSSTRESS_AVOID -w -d $SCRATCH_MNT -n 1000000 \
+		-p 100 >> $seqres.full &) > /dev/null 2>&1
+
+	sleep $runtime
+
+	# Here we only want to drop all writes; no need to umount the fs
+	_load_flakey_table $FLAKEY_DROP_WRITES
+
+	ps -e | grep fsstress > /dev/null 2>&1
+	while [ $? -eq 0 ]; do
+		$KILLALL_PROG -KILL fsstress > /dev/null 2>&1
+		wait > /dev/null 2>&1
+		ps -e | grep fsstress > /dev/null 2>&1
+	done
+
+	_unmount_flakey
+	_cleanup_flakey
+
+	# Mount the fs so journal-based filesystems do a proper log replay;
+	# the later check then won't report an annoying dirty log, only
+	# real problems.
+	_scratch_mount
+	_scratch_unmount
+
+	_check_scratch_fs
+done
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/479.out b/tests/generic/479.out
new file mode 100644
index 00000000..290f18b3
--- /dev/null
+++ b/tests/generic/479.out
@@ -0,0 +1,2 @@ 
+QA output created by 479
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index 1e808865..5ce3db1d 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -481,3 +481,4 @@ 
 476 auto rw
 477 auto quick exportfs
 478 auto quick
+479 auto