From patchwork Wed Dec 2 18:18:35 2009
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 40083
Date: Wed, 2 Dec 2009 13:18:35 -0500
From: Jeff Layton
To: Gustavo Carvalho Homem
Cc: linux-cifs-client@lists.samba.org
Message-ID: <20091202131835.3f2584bd@tlielax.poochiereds.net>
In-Reply-To: <200912021715.31136.gustavo@angulosolido.pt>
References: <200912021715.31136.gustavo@angulosolido.pt>
Subject: Re: [linux-cifs-client] kernel crash - CIFS client unstable on faulty network conditions

On Wed, 2 Dec 2009 17:15:30 +0000
Gustavo Carvalho Homem wrote:

> Hi,
>
> We are using:
>
> kernel 2.6.31-5
> samba 3.4.2
>
> to mount CIFS shares over DFS.
>
> Everything works fine under normal conditions. However if some server(s) is/are unreachable we end up with a kernel crash that locks up the machine.
>
> Kernel logs can be seen below.
>
> Any comment?
>
> Cheers
> Gustavo
>
>
> -------------------------------------
>
> Dec 2 15:13:39 CGDWX08027093 klogd: CIFS VFS: No response for cmd 114 mid 1
> Dec 2 15:13:39 CGDWX08027093 klogd: BUG: unable to handle kernel NULL pointer dereference at 00000020
> Dec 2 15:13:39 CGDWX08027093 klogd: IP: [] cifs_put_smb_ses+0x14/0xd0 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: *pde = 00000000
> Dec 2 15:13:39 CGDWX08027093 klogd: Oops: 0000 [#1] SMP
> Dec 2 15:13:39 CGDWX08027093 klogd: last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/net/eth0/ifindex
> Dec 2 15:13:39 CGDWX08027093 klogd: Modules linked in: nls_utf8 nls_iso8859_1 cifs i915 drm i2c_algo_bit i2c_core af_packet ipv6 binfmt_misc loop dm_mirror dm_region_hash dm_log dm_mod cpufreq_ondemand cpufreq_conservative cpufreq_powersave acpi_cpufreq freq_table snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_timer snd_mixer_oss ehci_hcd snd e1000e heci(C) soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr uhci_hcd processor floppy button evdev video wmi output tpm_infineon tpm tpm_bios thermal sg usbcore ide_generic ata_generic ide_pci_generic ide_gd_mod ide_core pata_acpi ahci ata_piix libata sd_mod scsi_mod crc_t10dif ext3 jbd
> Dec 2 15:13:39 CGDWX08027093 klogd:
> Dec 2 15:13:39 CGDWX08027093 klogd: Pid: 3511, comm: mount.cifs Tainted: G WC (2.6.31.5-desktop-1xcm #1) HP Compaq dc7900 Small Form Factor
> Dec 2 15:13:39 CGDWX08027093 klogd: EIP: 0060:[] EFLAGS: 00010282 CPU: 1
> Dec 2 15:13:39 CGDWX08027093 klogd: EIP is at cifs_put_smb_ses+0x14/0xd0 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: EAX: 00000000 EBX: f5e64400 ECX: f5e64400 EDX: f5e65600
> Dec 2 15:13:39 CGDWX08027093 klogd: ESI: 00000079 EDI: 00000000 EBP: f5dc1dfc ESP: f5dc1de0
> Dec 2 15:13:39 CGDWX08027093 klogd: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Dec 2 15:13:39 CGDWX08027093 klogd: Process mount.cifs (pid: 3511, ti=f5dc0000 task=f61ca550 task.ti=f5dc0000)
> Dec 2 15:13:39 CGDWX08027093 klogd: Stack:
> Dec 2 15:13:39 CGDWX08027093 klogd: 00000000 f5dc1dfc f848e101 f5dc1dfc f5e64400 00000079 00000000 f5dc1e20
> Dec 2 15:13:39 CGDWX08027093 klogd: <0> f847e957 f8481193 00000000 c16bb260 f8481193 ffffff90 f614b000 f5e64400
> Dec 2 15:13:39 CGDWX08027093 klogd: <0> f5dc1eb8 f84811a6 f5f9e950 f849d2f8 f5e64430 00000000 c0701028 c0701038
> Dec 2 15:13:39 CGDWX08027093 klogd: Call Trace:
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? tconInfoFree+0x61/0x90 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_put_tcon+0x97/0xd0 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_mount+0x4e3/0x2570 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_mount+0x4e3/0x2570 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_mount+0x4f6/0x2570 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_get_sb+0x124/0x2c0 [cifs]
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? vfs_kern_mount+0x5e/0x120
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? do_kern_mount+0x3e/0xe0
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? do_mount+0x446/0x7d0
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? copy_mount_options+0xad/0x130
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? sys_mount+0x8c/0xb0
> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? sysenter_do_call+0x12/0x28
> Dec 2 15:13:39 CGDWX08027093 klogd: Code: a4 00 00 00 85 d2 74 bc b8 09 00 00 00 e8 85 30 cd c7 5b 5d c3 66 90 55 89 e5 83 ec 1c 89 5d f4 89 75 f8 89 7d fc 0f 1f 44 00 00 <8b> 70 20 89 c3 b8 20 fd 4a f8 e8 4d 35 f9 c7 8b 43 24 83 e8 01
> Dec 2 15:13:39 CGDWX08027093 klogd: EIP: [] cifs_put_smb_ses+0x14/0xd0 [cifs] SS:ESP 0068:f5dc1de0
> Dec 2 15:13:39 CGDWX08027093 klogd: CR2: 0000000000000020
> Dec 2 15:13:39 CGDWX08027093 klogd: ---[ end trace 93d72a36b9146f24 ]---
> Dec 2 15:14:01 CGDWX08027093 CROND[6710]: (root) CMD ( /usr/share/msec/promisc_check.sh)
> Dec 2 15:14:10 CGDWX08027093 klogd: BUG: unable to handle kernel NULL pointer dereference at (null)
> Dec 2 15:14:10 CGDWX08027093 klogd: IP: [] cifs_demultiplex_thread+0x37c/0xc50 [cifs]
> Dec 2 15:14:10 CGDWX08027093 klogd: *pde = 00000000
> Dec 2 15:14:10 CGDWX08027093 klogd: Oops: 0000 [#2] SMP
> Dec 2 15:14:10 CGDWX08027093 klogd: last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/net/eth0/ifindex
> Dec 2 15:14:10 CGDWX08027093 klogd: Modules linked in: nls_utf8 nls_iso8859_1 cifs i915 drm i2c_algo_bit i2c_core af_packet ipv6 binfmt_misc loop dm_mirror dm_region_hash dm_log dm_mod cpufreq_ondemand cpufreq_conservative cpufreq_powersave acpi_cpufreq freq_table snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_timer snd_mixer_oss ehci_hcd snd e1000e heci(C) soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr uhci_hcd processor floppy button evdev video wmi output tpm_infineon tpm tpm_bios thermal sg usbcore ide_generic ata_generic ide_pci_generic ide_gd_mod ide_core pata_acpi ahci ata_piix libata sd_mod scsi_mod crc_t10dif ext3 jbd
> Dec 2 15:14:10 CGDWX08027093 klogd:
> Dec 2 15:14:10 CGDWX08027093 klogd: Pid: 5120, comm: cifsd Tainted: G D WC (2.6.31.5-desktop-1xcm #1) HP Compaq dc7900 Small Form Factor
> Dec 2 15:14:10 CGDWX08027093 klogd: EIP: 0060:[] EFLAGS: 00010216 CPU: 1
> Dec 2 15:14:10 CGDWX08027093 klogd: EIP is at cifs_demultiplex_thread+0x37c/0xc50 [cifs]
> Dec 2 15:14:10 CGDWX08027093 klogd: EAX: f84afd20 EBX: f5e64400 ECX: f6c38000 EDX: 00000000
> Dec 2 15:14:10 CGDWX08027093 klogd: ESI: f5e64460 EDI: f5e64408 EBP: f5f73fb8 ESP: f5f73f40
> Dec 2 15:14:10 CGDWX08027093 klogd: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Dec 2 15:14:10 CGDWX08027093 klogd: Process cifsd (pid: 5120, ti=f5f72000 task=f5f5eff0 task.ti=f5f72000)
> Dec 2 15:14:10 CGDWX08027093 klogd: Stack:
> Dec 2 15:14:10 CGDWX08027093 klogd: 00000000 00000004 00000000 743594af f5e64448 c1f9a420 f5e64460 f61ca550
> Dec 2 15:14:10 CGDWX08027093 klogd: <0> f5fd0000 f5dbb340 f6b46e00 f5f5eff0 00f5f270 c1f9a420 00000001 c012e588
> Dec 2 15:14:10 CGDWX08027093 klogd: <0> 74359c99 83000001 00000003 00000000 f5f73fa4 00000001 00000000 00000000
> Dec 2 15:14:10 CGDWX08027093 klogd: Call Trace:
> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? __wake_up_common+0x48/0x70
> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? complete+0x4e/0x60
> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? cifs_demultiplex_thread+0x0/0xc50 [cifs]
> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? kthread+0x84/0x90
> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? kthread+0x0/0x90
> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? kernel_thread_helper+0x7/0x10
> Dec 2 15:14:10 CGDWX08027093 klogd: Code: 4d a0 3b 4b 60 74 17 f6 05 04 fd 4a f8 01 0f 85 4e 04 00 00 b8 b0 b3 00 00 e8 71 e9 cc c7 b8 20 fd 4a f8 e8 57 22 f9 c7 8b 53 08 <8b> 02 0f 18 00 90 39 fa 74 15 66 90 c7 42 20 00 00 00 00 89 c2
> Dec 2 15:14:10 CGDWX08027093 klogd: EIP: [] cifs_demultiplex_thread+0x37c/0xc50 [cifs] SS:ESP 0068:f5f73f40
> Dec 2 15:14:10 CGDWX08027093 klogd: CR2: 0000000000000000
> Dec 2 15:14:10 CGDWX08027093 klogd: ---[ end trace 93d72a36b9146f25 ]---
>
> -------------------------------------
>

Hard to be sure from the info here but I think I might see the problem...
Can you reproduce this at will?
If you can, could you let me know whether the attached patch fixes it?
Note that I haven't even compile tested it, but it's pretty
straightforward.

Thanks,

From a2d6f76bb2bbc45ab9a534fdfe5c7f1617c0e87a Mon Sep 17 00:00:00 2001
From: Jeff Layton
Date: Wed, 2 Dec 2009 13:16:20 -0500
Subject: [PATCH] cifs: NULL out tcon, pSesInfo, and srvTcp pointers when
 chasing DFS referrals

The scenario is this:

We've got a valid tcon pointer and we're chasing a DFS referral. We put
the tcon reference, which puts the session reference too. Then we try
the mount again with the new mount info. That mount fails, and we goto
mount_fail_check. The tcon and pSesInfo pointers are non-NULL, but no
longer valid, and things blow up when we try to put references to them.

Fix this by zeroing out the tcon, tcp and smb session pointers before
retrying the mount.

Signed-off-by: Jeff Layton
---
 fs/cifs/connect.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 63ea83f..54f38f1 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -2595,6 +2595,9 @@ remote_path_check:
 			else if (pSesInfo)
 				cifs_put_smb_ses(pSesInfo);
 
+			tcon = NULL;
+			pSesInfo = NULL;
+			srvTcp = NULL;
 			cleanup_volume_info(&volume_info);
 			referral_walks_count++;
 			goto try_mount_again;
-- 
1.6.5.2
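
For anyone following along, here is a rough userspace sketch of the pattern the patch description talks about: a retry loop that drops a reference, jumps back to try again, and shares one error path. It is not the cifs code. The names struct session, get_session(), put_session() and mount_with_referrals() are all made up for illustration, and the failure is simulated with a plain counter. The only point is that resetting the pointer to NULL after the put keeps the error path from putting a stale reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted object; a stand-in, not the kernel's session struct. */
struct session {
	int refcount;
};

/* Number of sessions currently allocated, so callers can check for leaks. */
static int live_sessions;

/* Allocate a session holding one reference (hypothetical helper). */
static struct session *get_session(void)
{
	struct session *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->refcount = 1;
	live_sessions++;
	return s;
}

/* Drop one reference and free the object when the last one goes away. */
static void put_session(struct session *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0) {
		free(s);
		live_sessions--;
	}
}

/*
 * Simulated mount: the first 'referrals' attempts each end in a referral,
 * so the session is dropped and the mount is retried. If 'fail_on_attempt'
 * is non-zero, that attempt fails before a new session is obtained.
 * Returns 0 and stores the session in *out on success, -1 on failure.
 */
static int mount_with_referrals(int referrals, int fail_on_attempt,
				struct session **out)
{
	struct session *ses = NULL;
	int attempt = 0;
	int rc = -1;

try_mount_again:
	attempt++;
	if (attempt == fail_on_attempt)
		goto mount_fail_check;	/* e.g. the referred server is unreachable */

	ses = get_session();
	if (!ses)
		goto mount_fail_check;

	if (attempt <= referrals) {
		put_session(ses);
		/*
		 * Without this reset, 'ses' would still be non-NULL but freed,
		 * and a failure on the next attempt would put it a second time.
		 */
		ses = NULL;
		goto try_mount_again;
	}

	*out = ses;
	return 0;

mount_fail_check:
	if (ses)
		put_session(ses);
	*out = NULL;
	return rc;
}
```

Calling mount_with_referrals(1, 2, &s) walks one referral and then fails on the retry; with the reset in place the error path sees a NULL pointer and skips the put. Removing the "ses = NULL;" line turns the same call into a use-after-free, which is roughly the shape of the oops in the report.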