
Issues with a rather unusual configured NFS server

Message ID 52075E01.7030506@gmx.de
State New, archived

Commit Message

Toralf Förster Aug. 11, 2013, 9:48 a.m. UTC
so that the server either crashes (if it is a user mode linux image) or at least its reboot functionality gets broken
- if the NFS server is hammered with scary NFS calls using a fuzzing tool running on a remote NFS client under a non-privileged user id.

It can be reproduced, if
	- the NFS share is an EXT3 or EXT4 directory
	- and it is created in a file located on tmpfs and mounted via loop device
	- and the NFS server is forced to umount the NFS share
	- and the server is forced to restart the NFS service afterwards
	- and trinity is used

I was able to set up a scenario for an automated bisect. Both times it reported this commit
commit 68a3396178e6688ad7367202cdf0af8ed03c8727
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Mar 21 11:21:50 2013 -0400

    nfsd4: shut down more of delegation earlier


to be the one after which the user mode linux server crashes with a backtrace like this:


$ cat /mnt/ramdisk/bt.v3.11-rc4-172-g8ae3f1d
[New LWP 14025]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb77ef424 in __kernel_vsyscall ()
#0  0xb77ef424 in __kernel_vsyscall ()
#1  0x083a33c5 in kill ()
#2  0x0807163d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08071925 in os_dump_core () at arch/um/os-Linux/util.c:138
#4  0x080613a7 in panic_exit (self=0x85a1518 <panic_exit_notifier>, unused1=0, unused2=0x85d6ce0 <buf.15904>) at arch/um/kernel/um_arch.c:240
#5  0x0809a3b8 in notifier_call_chain (nl=0x0, val=0, v=0x85d6ce0 <buf.15904>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809a503 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
#7  atomic_notifier_call_chain (nh=0x85d6cc4 <panic_notifier_list>, val=0, v=0x85d6ce0 <buf.15904>) at kernel/notifier.c:191
#8  0x08400ba8 in panic (fmt=0x0) at kernel/panic.c:128
#9  0x0818edf4 in ext4_put_super (sb=0x4a042690) at fs/ext4/super.c:818
#10 0x081010d2 in generic_shutdown_super (sb=0x4a042690) at fs/super.c:418
#11 0x0810209a in kill_block_super (sb=0x0) at fs/super.c:1028
#12 0x08100f6a in deactivate_locked_super (s=0x4a042690) at fs/super.c:299
#13 0x08101001 in deactivate_super (s=0x4a042690) at fs/super.c:324
#14 0x08118e0c in mntfree (mnt=<optimized out>) at fs/namespace.c:891
#15 mntput_no_expire (mnt=0x0) at fs/namespace.c:929
#16 0x0811a2f5 in SYSC_umount (flags=<optimized out>, name=<optimized out>) at fs/namespace.c:1335
#17 SyS_umount (name=134541632, flags=0) at fs/namespace.c:1305
#18 0x0811a369 in SYSC_oldumount (name=<optimized out>) at fs/namespace.c:1347
#19 SyS_oldumount (name=134541632) at fs/namespace.c:1345
#20 0x080618e2 in handle_syscall (r=0x49e919d4) at arch/um/kernel/skas/syscall.c:35
#21 0x08073c0d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#22 userspace (regs=0x49e919d4) at arch/um/os-Linux/skas/process.c:431
#23 0x0805e65c in fork_handler () at arch/um/kernel/process.c:160
#24 0x00000000 in ?? ()



A real system however would not crash but would give a kernel BUG as reported here:
http://article.gmane.org/gmane.comp.file-systems.ext4/38915
Furthermore the server will no longer be able to reboot - it hangs indefinitely in the reboot phase.
Only the magic sysrq keys still work then.



Steps to reproduce with two 32-bit Gentoo Linux user mode linux images:

1. prepare the server:
	<mount a tmpfs onto /mnt/ramdisk>
	mkdir /mnt/ramdisk/victims
	dd if=/dev/zero of=/mnt/ramdisk/disk1 bs=1M count=257 2>/dev/null
	yes | mkfs.ext4 -q /mnt/ramdisk/disk1 1>/dev/null
	mount -o loop /mnt/ramdisk/disk1 /mnt/ramdisk/victims
	chmod 777 /mnt/ramdisk/victims
	/etc/init.d/nfs restart

2. prepare the client
	mount the NFS share onto the local mount point /mnt/ramdisk/victims/ with NFSv4
	
3. run the fuzzing tool trinity on the client:
	while [[ : ]]; do
		<(re-)create and fill /mnt/ramdisk/victims/v1/v2 with 100 empty files and 100 empty directories>
		trinity -V /mnt/ramdisk/victims/v1/v2 -C 1 -N 10000 -q
		sleep 3
	done

4. after 15 min kill the user mode linux client with -9

5. now run on the server
	umount /mnt/ramdisk/victims || /etc/init.d/nfs restart && umount /mnt/ramdisk/victims && echo ' no issue so far'


You might also need this patch from Oleg Nesterov <oleg@redhat.com> (currently not in mainline).

Comments

Jan Kara Aug. 12, 2013, 2:36 p.m. UTC | #1
On Sun 11-08-13 11:48:49, Toralf Förster wrote:
> so that the server either crashes (if it is a user mode linux image) or at least its reboot functionality got broken
> - if the NFS server is hammered with scary NFS calls using a fuzzy tool running at a remote NFS client under a non-privileged user id.
> 
> It can be reproduced, if
> 	- the NFS share is an EXT3 or EXT4 directory
> 	- and it is created in a file located on tmpfs and mounted via loop device
> 	- and the NFS server is forced to umount the NFS share
> 	- and the server is forced to restart the NFS service afterwards
> 	- and trinity is used
> 
> I could find a scenario for an automated bisect. 2 times it brought this commit 
> commit 68a3396178e6688ad7367202cdf0af8ed03c8727
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Thu Mar 21 11:21:50 2013 -0400
> 
>     nfsd4: shut down more of delegation earlier
  Added Bruce to CC.

> to be the one after which the user mode linux server crashes with a back trace like this:
> 
> 
> $ cat /mnt/ramdisk/bt.v3.11-rc4-172-g8ae3f1d
> [New LWP 14025]
> Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> Program terminated with signal 6, Aborted.
> #0  0xb77ef424 in __kernel_vsyscall ()
> #0  0xb77ef424 in __kernel_vsyscall ()
> #1  0x083a33c5 in kill ()
> #2  0x0807163d in uml_abort () at arch/um/os-Linux/util.c:93
> #3  0x08071925 in os_dump_core () at arch/um/os-Linux/util.c:138
> #4  0x080613a7 in panic_exit (self=0x85a1518 <panic_exit_notifier>, unused1=0, unused2=0x85d6ce0 <buf.15904>) at arch/um/kernel/um_arch.c:240
> #5  0x0809a3b8 in notifier_call_chain (nl=0x0, val=0, v=0x85d6ce0 <buf.15904>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> #6  0x0809a503 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
> #7  atomic_notifier_call_chain (nh=0x85d6cc4 <panic_notifier_list>, val=0, v=0x85d6ce0 <buf.15904>) at kernel/notifier.c:191
> #8  0x08400ba8 in panic (fmt=0x0) at kernel/panic.c:128
> #9  0x0818edf4 in ext4_put_super (sb=0x4a042690) at fs/ext4/super.c:818
> #10 0x081010d2 in generic_shutdown_super (sb=0x4a042690) at fs/super.c:418
> #11 0x0810209a in kill_block_super (sb=0x0) at fs/super.c:1028
> #12 0x08100f6a in deactivate_locked_super (s=0x4a042690) at fs/super.c:299
> #13 0x08101001 in deactivate_super (s=0x4a042690) at fs/super.c:324
> #14 0x08118e0c in mntfree (mnt=<optimized out>) at fs/namespace.c:891
> #15 mntput_no_expire (mnt=0x0) at fs/namespace.c:929
> #16 0x0811a2f5 in SYSC_umount (flags=<optimized out>, name=<optimized out>) at fs/namespace.c:1335
> #17 SyS_umount (name=134541632, flags=0) at fs/namespace.c:1305
> #18 0x0811a369 in SYSC_oldumount (name=<optimized out>) at fs/namespace.c:1347
> #19 SyS_oldumount (name=134541632) at fs/namespace.c:1345
> #20 0x080618e2 in handle_syscall (r=0x49e919d4) at arch/um/kernel/skas/syscall.c:35
> #21 0x08073c0d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> #22 userspace (regs=0x49e919d4) at arch/um/os-Linux/skas/process.c:431
> #23 0x0805e65c in fork_handler () at arch/um/kernel/process.c:160
> #24 0x00000000 in ?? ()
> 
> 
> 
> A real system however would not crash but would give a kernel BUG as reported here:
> http://article.gmane.org/gmane.comp.file-systems.ext4/38915
  We have deleted inodes (regular files) in the orphan list during
ext4_put_super(). My guess is that NFS is still holding some inode
references to these inodes and thus inodes don't get deleted. So ext3/4
would be just a victim here.

> Furthermore the server won't be able any longer to reboot - it would hang
> infinitely in the reboot phase.  Just the magic sysrq keys still works
> then.
  Well, this is likely because the filesystem cannot be shut down.

								Honza
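
The guess above (unlinked inodes that stay on the orphan list because something still holds a reference) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not ext4 code: struct toy_inode, toy_unlink(), toy_iput(), toy_put_super() and toy_orphans_at_unmount() are hypothetical names made up for this illustration, and the model ignores everything except the link count, the reference count and orphan-list membership.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an inode; field names are illustrative, not the kernel's. */
struct toy_inode {
    int nlink;            /* directory entries pointing at the inode */
    int refcount;         /* in-memory references (openers, NFS state, ...) */
    bool on_orphan_list;  /* unlinked but not yet deleted */
};

/* Drop one name; when the last name goes, park the inode on the orphan list. */
static void toy_unlink(struct toy_inode *ino)
{
    assert(ino->nlink > 0);
    if (--ino->nlink == 0)
        ino->on_orphan_list = true;
}

/* Drop one reference; the last put of an unlinked inode really deletes it. */
static void toy_iput(struct toy_inode *ino)
{
    assert(ino->refcount > 0);
    if (--ino->refcount == 0 && ino->nlink == 0)
        ino->on_orphan_list = false;
}

/* Unmount-time check: count inodes that are still on the orphan list. */
static int toy_put_super(const struct toy_inode *inodes, int n)
{
    int stuck = 0;
    for (int i = 0; i < n; i++)
        if (inodes[i].on_orphan_list)
            stuck++;
    return stuck;
}

/* Scenario: two files are deleted; one of them has an extra, leaked reference. */
int toy_orphans_at_unmount(void)
{
    struct toy_inode inodes[2] = {
        { .nlink = 1, .refcount = 1 },  /* released properly */
        { .nlink = 1, .refcount = 2 },  /* one reference is never dropped */
    };
    for (int i = 0; i < 2; i++) {
        toy_unlink(&inodes[i]);
        toy_iput(&inodes[i]);           /* the regular holder goes away */
    }
    return toy_put_super(inodes, 2);
}
```

In this model toy_orphans_at_unmount() returns 1: the inode with the leaked reference is still on the orphan list when the unmount check runs, which is the situation ext4_put_super() complains about in the backtrace.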
J. Bruce Fields Aug. 13, 2013, 9:53 p.m. UTC | #2
On Mon, Aug 12, 2013 at 04:36:40PM +0200, Jan Kara wrote:
> On Sun 11-08-13 11:48:49, Toralf Förster wrote:
> > so that the server either crashes (if it is a user mode linux image) or at least its reboot functionality got broken
> > - if the NFS server is hammered with scary NFS calls using a fuzzy tool running at a remote NFS client under a non-privileged user id.
> > 
> > It can be reproduced, if
> > 	- the NFS share is an EXT3 or EXT4 directory
> > 	- and it is created in a file located on tmpfs and mounted via loop device
> > 	- and the NFS server is forced to umount the NFS share
> > 	- and the server is forced to restart the NFS service afterwards
> > 	- and trinity is used
> > 
> > I could find a scenario for an automated bisect. 2 times it brought this commit 
> > commit 68a3396178e6688ad7367202cdf0af8ed03c8727
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date:   Thu Mar 21 11:21:50 2013 -0400
> > 
> >     nfsd4: shut down more of delegation earlier

Thanks for the report.  I think I see the problem--after this commit
nfs4_set_delegation() failures result in nfs4_put_delegation being
called, but nfs4_put_delegation doesn't free the nfs4_file that has
already been set by alloc_init_deleg().

Let me think about how to fix that....

--b.
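
The imbalance described above can be sketched in user space. This is a minimal model and not the nfsd code: the toy_* names are hypothetical, only the roles (an allocation step that pins the file, a put step on the failure path) follow the description in this mail, and toy_put_deleg_balanced() is just one conceivable way to balance the count, not the actual fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins; the names echo the thread but this is not kernel code. */
struct toy_file {
    int refcount;
};

struct toy_deleg {
    struct toy_file *file;  /* reference taken at allocation time */
};

/* Allocation step: the new delegation takes a reference on the file. */
static struct toy_deleg *toy_alloc_init_deleg(struct toy_file *fp)
{
    struct toy_deleg *dp = malloc(sizeof(*dp));
    if (!dp)
        return NULL;
    fp->refcount++;
    dp->file = fp;
    return dp;
}

/* Put path as described: frees the delegation, forgets the file reference. */
static void toy_put_deleg_leaky(struct toy_deleg *dp)
{
    free(dp);
}

/* Hypothetical balanced variant: drop the file reference before freeing. */
static void toy_put_deleg_balanced(struct toy_deleg *dp)
{
    assert(dp->file->refcount > 0);
    dp->file->refcount--;
    free(dp);
}

/* Simulate a failed "set delegation" step; return references left on the file. */
int toy_refs_after_failed_set(int use_balanced_put)
{
    struct toy_file f = { .refcount = 1 };  /* 1 = the opener's own reference */
    struct toy_deleg *dp = toy_alloc_init_deleg(&f);
    if (!dp)
        return -1;
    /* ... setting the delegation fails here, so the error path runs ... */
    if (use_balanced_put)
        toy_put_deleg_balanced(dp);
    else
        toy_put_deleg_leaky(dp);
    return f.refcount;
}
```

With the leaky put the file is left with 2 references after the failure, with the balanced put it is back at 1. A reference that is never dropped is what would keep the underlying inode pinned until unmount.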

>   Added Bruce to CC.
> 
> > to be the one after which the user mode linux server crashes with a back trace like this:
> > 
> > 
> > $ cat /mnt/ramdisk/bt.v3.11-rc4-172-g8ae3f1d
> > [New LWP 14025]
> > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> > Program terminated with signal 6, Aborted.
> > #0  0xb77ef424 in __kernel_vsyscall ()
> > #0  0xb77ef424 in __kernel_vsyscall ()
> > #1  0x083a33c5 in kill ()
> > #2  0x0807163d in uml_abort () at arch/um/os-Linux/util.c:93
> > #3  0x08071925 in os_dump_core () at arch/um/os-Linux/util.c:138
> > #4  0x080613a7 in panic_exit (self=0x85a1518 <panic_exit_notifier>, unused1=0, unused2=0x85d6ce0 <buf.15904>) at arch/um/kernel/um_arch.c:240
> > #5  0x0809a3b8 in notifier_call_chain (nl=0x0, val=0, v=0x85d6ce0 <buf.15904>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > #6  0x0809a503 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
> > #7  atomic_notifier_call_chain (nh=0x85d6cc4 <panic_notifier_list>, val=0, v=0x85d6ce0 <buf.15904>) at kernel/notifier.c:191
> > #8  0x08400ba8 in panic (fmt=0x0) at kernel/panic.c:128
> > #9  0x0818edf4 in ext4_put_super (sb=0x4a042690) at fs/ext4/super.c:818
> > #10 0x081010d2 in generic_shutdown_super (sb=0x4a042690) at fs/super.c:418
> > #11 0x0810209a in kill_block_super (sb=0x0) at fs/super.c:1028
> > #12 0x08100f6a in deactivate_locked_super (s=0x4a042690) at fs/super.c:299
> > #13 0x08101001 in deactivate_super (s=0x4a042690) at fs/super.c:324
> > #14 0x08118e0c in mntfree (mnt=<optimized out>) at fs/namespace.c:891
> > #15 mntput_no_expire (mnt=0x0) at fs/namespace.c:929
> > #16 0x0811a2f5 in SYSC_umount (flags=<optimized out>, name=<optimized out>) at fs/namespace.c:1335
> > #17 SyS_umount (name=134541632, flags=0) at fs/namespace.c:1305
> > #18 0x0811a369 in SYSC_oldumount (name=<optimized out>) at fs/namespace.c:1347
> > #19 SyS_oldumount (name=134541632) at fs/namespace.c:1345
> > #20 0x080618e2 in handle_syscall (r=0x49e919d4) at arch/um/kernel/skas/syscall.c:35
> > #21 0x08073c0d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> > #22 userspace (regs=0x49e919d4) at arch/um/os-Linux/skas/process.c:431
> > #23 0x0805e65c in fork_handler () at arch/um/kernel/process.c:160
> > #24 0x00000000 in ?? ()
> > 
> > 
> > 
> > A real system however would not crash but would give a kernel BUG as reported here:
> > http://article.gmane.org/gmane.comp.file-systems.ext4/38915
>   We have deleted inodes (regular files) in the orphan list during
> ext4_put_super(). My guess is that NFS is still holding some inode
> references to these inodes and thus inodes don't get deleted. So ext3/4
> would be just a victim here.
> 
> > Furthermore the server won't be able any longer to reboot - it would hang
> > infinitely in the reboot phase.  Just the magic sysrq keys still works
> > then.
>   Well, this is likely because the filesystem cannot be shut down.
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
Toralf Förster Aug. 14, 2013, 4:44 p.m. UTC | #3
On 08/13/2013 11:53 PM, J. Bruce Fields wrote:
> On Mon, Aug 12, 2013 at 04:36:40PM +0200, Jan Kara wrote:
>> On Sun 11-08-13 11:48:49, Toralf Förster wrote:
>>> so that the server either crashes (if it is a user mode linux image) or at least its reboot functionality got broken
>>> - if the NFS server is hammered with scary NFS calls using a fuzzy tool running at a remote NFS client under a non-privileged user id.
>>>
>>> It can be reproduced, if
>>> 	- the NFS share is an EXT3 or EXT4 directory
>>> 	- and it is created in a file located on tmpfs and mounted via loop device
>>> 	- and the NFS server is forced to umount the NFS share
>>> 	- and the server is forced to restart the NFS service afterwards
>>> 	- and trinity is used
>>>
>>> I could find a scenario for an automated bisect. 2 times it brought this commit 
>>> commit 68a3396178e6688ad7367202cdf0af8ed03c8727
>>> Author: J. Bruce Fields <bfields@redhat.com>
>>> Date:   Thu Mar 21 11:21:50 2013 -0400
>>>
>>>     nfsd4: shut down more of delegation earlier
> 
> Thanks for the report.  I think I see the problem--after this commit
> nfs4_set_delegation() failures result in nfs4_put_delegation being
> called, but nfs4_put_delegation doesn't free the nfs4_file that has
> already been set by alloc_init_deleg().
> 
> Let me think about how to fix that....
> 
> --b.
> 
>>   Added Bruce to CC.
>>
>>> to be the one after which the user mode linux server crashes with a back trace like this:
>>>
>>>
>>> $ cat /mnt/ramdisk/bt.v3.11-rc4-172-g8ae3f1d
>>> [New LWP 14025]
>>> Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
>>> Program terminated with signal 6, Aborted.
>>> #0  0xb77ef424 in __kernel_vsyscall ()
>>> #0  0xb77ef424 in __kernel_vsyscall ()
>>> #1  0x083a33c5 in kill ()
>>> #2  0x0807163d in uml_abort () at arch/um/os-Linux/util.c:93
>>> #3  0x08071925 in os_dump_core () at arch/um/os-Linux/util.c:138
>>> #4  0x080613a7 in panic_exit (self=0x85a1518 <panic_exit_notifier>, unused1=0, unused2=0x85d6ce0 <buf.15904>) at arch/um/kernel/um_arch.c:240
>>> #5  0x0809a3b8 in notifier_call_chain (nl=0x0, val=0, v=0x85d6ce0 <buf.15904>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
>>> #6  0x0809a503 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
>>> #7  atomic_notifier_call_chain (nh=0x85d6cc4 <panic_notifier_list>, val=0, v=0x85d6ce0 <buf.15904>) at kernel/notifier.c:191
>>> #8  0x08400ba8 in panic (fmt=0x0) at kernel/panic.c:128
>>> #9  0x0818edf4 in ext4_put_super (sb=0x4a042690) at fs/ext4/super.c:818
>>> #10 0x081010d2 in generic_shutdown_super (sb=0x4a042690) at fs/super.c:418
>>> #11 0x0810209a in kill_block_super (sb=0x0) at fs/super.c:1028
>>> #12 0x08100f6a in deactivate_locked_super (s=0x4a042690) at fs/super.c:299
>>> #13 0x08101001 in deactivate_super (s=0x4a042690) at fs/super.c:324
>>> #14 0x08118e0c in mntfree (mnt=<optimized out>) at fs/namespace.c:891
>>> #15 mntput_no_expire (mnt=0x0) at fs/namespace.c:929
>>> #16 0x0811a2f5 in SYSC_umount (flags=<optimized out>, name=<optimized out>) at fs/namespace.c:1335
>>> #17 SyS_umount (name=134541632, flags=0) at fs/namespace.c:1305
>>> #18 0x0811a369 in SYSC_oldumount (name=<optimized out>) at fs/namespace.c:1347
>>> #19 SyS_oldumount (name=134541632) at fs/namespace.c:1345
>>> #20 0x080618e2 in handle_syscall (r=0x49e919d4) at arch/um/kernel/skas/syscall.c:35
>>> #21 0x08073c0d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
>>> #22 userspace (regs=0x49e919d4) at arch/um/os-Linux/skas/process.c:431
>>> #23 0x0805e65c in fork_handler () at arch/um/kernel/process.c:160
>>> #24 0x00000000 in ?? ()
>>>
>>>
>>>
>>> A real system however would not crash but would give a kernel BUG as reported here:
>>> http://article.gmane.org/gmane.comp.file-systems.ext4/38915
>>   We have deleted inodes (regular files) in the orphan list during
>> ext4_put_super(). My guess is that NFS is still holding some inode
>> references to these inodes and thus inodes don't get deleted. So ext3/4
>> would be just a victim here.

Just FWIW, EXT2 is not affected.

>>> Furthermore the server won't be able any longer to reboot - it would hang
>>> infinitely in the reboot phase.  Just the magic sysrq keys still works
>>> then.
>>   Well, this is likely because the filesystem cannot be shut down.
>>
>> 								Honza
>> -- 
>> Jan Kara <jack@suse.cz>
>> SUSE Labs, CR

Patch

--- x/kernel/exit.c
+++ x/kernel/exit.c
@@ -783,8 +783,8 @@  void do_exit(long code)
        exit_shm(tsk);
        exit_files(tsk);
        exit_fs(tsk);
-       exit_task_namespaces(tsk);
        exit_task_work(tsk);
+       exit_task_namespaces(tsk);
        check_stack_usage();
        exit_thread();