xfstests: device busy when umount

Submitted by Amir Goldstein on May 18, 2011, 8:19 a.m.


On Wed, May 18, 2011 at 9:31 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, May 17, 2011 at 06:01:14PM +0300, Amir Goldstein wrote:
>> On Tue, May 17, 2011 at 5:32 PM, Eric Sandeen <sandeen@redhat.com> wrote:
>> > On 5/17/11 4:03 AM, Yongqiang Yang wrote:
>> >> Hi,
>> >>
>> >> I noticed that all tests which contain 'device busy' errors have
>> >> falloc operations.  Does the error have something to do with falloc?
> <shrug>
> Perhaps a bit more detail about what you are testing, how you've set
> up xfstests, etc, and some analysis of the problem is in order first?


Let me make it simple:

amir@qalab:~/xfstests$ uname -a
Linux qalab 2.6.39-rc7+ #11 SMP Mon May 16 12:08:52 IDT 2011 x86_64
x86_64 x86_64 GNU/Linux
amir@qalab:~/xfstests$ mount -t ext4
/dev/sdb1 on / type ext4 (rw,errors=remount-ro,commit=0)
/dev/sda5 on /mnt/test/ext4 type ext4 (rw,acl,user_xattr)
amir@qalab:~/xfstests$ cat local.config
export TEST_DEV=/dev/sda5
export TEST_DIR=/mnt/test/ext4
export SCRATCH_DEV=/dev/sda8
export SCRATCH_MNT=/mnt/test/scratch
amir@qalab:~/xfstests$ sudo ./check 124
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 qalab 2.6.39-rc7+
MKFS_OPTIONS  -- /dev/sda8
MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda8 /mnt/test/scratch

124 9s ... - output mismatch (see 124.out.bad)
Ran: 124
Failures: 124
Failed 1 of 1 tests
amir@qalab:~/xfstests$ mount -t ext4
/dev/sdb1 on / type ext4 (rw,errors=remount-ro,commit=0)
/dev/sda8 on /mnt/test/scratch type ext4 (rw,acl,user_xattr)
/dev/sda5 on /mnt/test/ext4 type ext4 (rw,acl,user_xattr)
amir@qalab:~/xfstests$ sudo umount /mnt/test/scratch/
amir@qalab:~/xfstests$ mount -t ext4
/dev/sdb1 on / type ext4 (rw,errors=remount-ro,commit=0)
/dev/sda5 on /mnt/test/ext4 type ext4 (rw,acl,user_xattr)

I am not trying anything special.
Running umount from the command line after the test has finished
succeeds, so it must be some kind of race.
As I said, I tried running lsof before umount in common.rc, but it
detected nothing.
Do you have any suggestions for further analysis?
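
One way to check the race theory would be to retry the unmount a few
times and record who holds the mount at the moment each attempt fails.
The following is only a minimal sketch under stated assumptions:
retry_umount and the UMOUNT_CMD override are hypothetical names, not
part of xfstests, and it assumes fuser from psmisc is installed.

```shell
# Hypothetical helper, not part of xfstests: retry an unmount and report
# which attempt succeeded. UMOUNT_CMD is an assumed override that lets the
# loop be dry-run without touching a real mount.
retry_umount() {
    mnt=$1
    tries=${2:-5}
    i=1
    while [ "$i" -le "$tries" ]; do
        if ${UMOUNT_CMD:-umount} "$mnt" 2>/dev/null; then
            echo "unmounted $mnt on attempt $i"
            return 0
        fi
        # List processes using the mount right after the failed attempt.
        fuser -vm "$mnt" 2>&1 || true
        sleep 1
        i=$((i + 1))
    done
    echo "still busy after $tries attempts: $mnt"
    return 1
}
```

If the unmount only succeeds on a later attempt, that would be
consistent with a short-lived reference rather than a leaked one.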

>> > cc'ing xfs list since xfs devs maintain xfstests.
>> >
>> > What tests have "device busy" errors?  What do the usual investigative
>> > steps such as "lsof" and "fuser" tell you when this happens?
>> I tried running lsof | grep $TEST_DIR before umount
>> and I tried sleep 1 before umount and it didn't yield anything.
> Which usually indicates that you've got some kind of reference
> counting problem preventing the filesystem from being unmounted.

As I demonstrated, the filesystem *can* be unmounted.

>> > Are there loop devices that didn't get cleaned up, or processes that
>> > have not terminated?
>> >
>> > What tests have these problems?
>> for me 124 always fails to umount, and 198 and 213 sometimes fail to umount.
> What, exactly, are you testing on? test 124 uses XFS_IOC_RESVSP
> directly, not fallocate(), so all it is doing on a non-XFS
> filesystem is iterating a loop that writes a 1MB file, reads it back
> then unlinks it....

Tell me about it...

The machine was a clean install of Ubuntu 10.10,
which was recently upgraded to Ubuntu 11.04, but this problem
has existed from the beginning.

It is used for nothing but running tests and I only
installed the packages required (to my understanding) by xfstests.
I just built xfstests from git (HEAD 30456902).

The kernel is the latest 2.6.39-rc7 with ext4 dev branch changes,
but again, the problem existed with every previous/release kernel I tried.

--- 124.out     2011-03-01 18:00:49.808338003 +0200
+++ 124.out.bad 2011-05-18 10:47:01.830998615 +0300
@@ -1 +1,4 @@ 
 QA output created by 124
+umount: /mnt/test/scratch: device is busy.
+        (In some cases useful info about processes that use
+         the device is found by lsof(8) or fuser(1))