From patchwork Thu Dec 3 05:53:22 2009
X-Patchwork-Submitter: Suresh Jayaraman
X-Patchwork-Id: 40130
Message-ID: <4B175252.6070607@suse.de>
Date: Thu, 03 Dec 2009 11:23:22 +0530
From: Suresh Jayaraman
To: Jeff Layton
References: <200912021715.31136.gustavo@angulosolido.pt>
 <20091202131835.3f2584bd@tlielax.poochiereds.net>
In-Reply-To: <20091202131835.3f2584bd@tlielax.poochiereds.net>
Cc: linux-cifs-client@lists.samba.org
Subject: Re: [linux-cifs-client] kernel crash - CIFS client unstable on faulty network conditions
List-Id: The Linux CIFS VFS client

On 12/02/2009 11:48 PM, Jeff Layton wrote:
> On Wed, 2 Dec 2009 17:15:30 +0000
> Gustavo Carvalho Homem wrote:
>
>> Hi,
>>
>> We are using:
>>
>> kernel 2.6.31-5
>> samba 3.4.2
>>
>> to mount CIFS shares over DFS.
>>
>> Everything works fine under normal conditions. However if some server(s) is/are unreachable we end up with a kernel crash that locks up the machine.
>>
>> Kernel logs can be seen below.
>>
>> Any comment?
>>
>> Cheers
>> Gustavo
>>
>>
>> -------------------------------------
>>
>> Dec 2 15:13:39 CGDWX08027093 klogd: CIFS VFS: No response for cmd 114 mid 1
>> Dec 2 15:13:39 CGDWX08027093 klogd: BUG: unable to handle kernel NULL pointer dereference at 00000020
>> Dec 2 15:13:39 CGDWX08027093 klogd: IP: [] cifs_put_smb_ses+0x14/0xd0 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: *pde = 00000000
>> Dec 2 15:13:39 CGDWX08027093 klogd: Oops: 0000 [#1] SMP
>> Dec 2 15:13:39 CGDWX08027093 klogd: last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/net/eth0/ifindex
>> Dec 2 15:13:39 CGDWX08027093 klogd: Modules linked in: nls_utf8 nls_iso8859_1 cifs i915 drm i2c_algo_bit i2c_core af_packet ipv6 binfmt_misc loop dm_mirror dm_region_hash dm_log dm_mod cpufreq_ondemand cpufreq_conservative cpufreq_powersave acpi_cpufreq freq_table snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_timer snd_mixer_oss ehci_hcd snd e1000e heci(C) soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr uhci_hcd processor floppy button evdev video wmi output tpm_infineon tpm tpm_bios thermal sg usbcore ide_generic ata_generic ide_pci_generic ide_gd_mod ide_core pata_acpi ahci ata_piix libata sd_mod scsi_mod crc_t10dif ext3 jbd
>> Dec 2 15:13:39 CGDWX08027093 klogd:
>> Dec 2 15:13:39 CGDWX08027093 klogd: Pid: 3511, comm: mount.cifs Tainted: G WC (2.6.31.5-desktop-1xcm #1) HP Compaq dc7900 Small Form Factor
>> Dec 2 15:13:39 CGDWX08027093 klogd: EIP: 0060:[] EFLAGS: 00010282 CPU: 1
>> Dec 2 15:13:39 CGDWX08027093 klogd: EIP is at cifs_put_smb_ses+0x14/0xd0 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: EAX: 00000000 EBX: f5e64400 ECX: f5e64400 EDX: f5e65600
>> Dec 2 15:13:39 CGDWX08027093 klogd: ESI: 00000079 EDI: 00000000 EBP: f5dc1dfc ESP: f5dc1de0
>> Dec 2 15:13:39 CGDWX08027093 klogd: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>> Dec 2 15:13:39 CGDWX08027093 klogd: Process mount.cifs (pid: 3511, ti=f5dc0000 task=f61ca550 task.ti=f5dc0000)
>> Dec 2 15:13:39 CGDWX08027093 klogd: Stack:
>> Dec 2 15:13:39 CGDWX08027093 klogd: 00000000 f5dc1dfc f848e101 f5dc1dfc f5e64400 00000079 00000000 f5dc1e20
>> Dec 2 15:13:39 CGDWX08027093 klogd: <0> f847e957 f8481193 00000000 c16bb260 f8481193 ffffff90 f614b000 f5e64400
>> Dec 2 15:13:39 CGDWX08027093 klogd: <0> f5dc1eb8 f84811a6 f5f9e950 f849d2f8 f5e64430 00000000 c0701028 c0701038
>> Dec 2 15:13:39 CGDWX08027093 klogd: Call Trace:
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? tconInfoFree+0x61/0x90 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_put_tcon+0x97/0xd0 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_mount+0x4e3/0x2570 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_mount+0x4e3/0x2570 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_mount+0x4f6/0x2570 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? cifs_get_sb+0x124/0x2c0 [cifs]
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? vfs_kern_mount+0x5e/0x120
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? do_kern_mount+0x3e/0xe0
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? do_mount+0x446/0x7d0
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? copy_mount_options+0xad/0x130
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? sys_mount+0x8c/0xb0
>> Dec 2 15:13:39 CGDWX08027093 klogd: [] ? sysenter_do_call+0x12/0x28
>> Dec 2 15:13:39 CGDWX08027093 klogd: Code: a4 00 00 00 85 d2 74 bc b8 09 00 00 00 e8 85 30 cd c7 5b 5d c3 66 90 55 89 e5 83 ec 1c 89 5d f4 89 75 f8 89 7d fc 0f 1f 44 00 00 <8b> 70 20 89 c3 b8 20 fd 4a f8 e8 4d 35 f9 c7 8b 43 24 83 e8 01
>> Dec 2 15:13:39 CGDWX08027093 klogd: EIP: [] cifs_put_smb_ses+0x14/0xd0 [cifs] SS:ESP 0068:f5dc1de0
>> Dec 2 15:13:39 CGDWX08027093 klogd: CR2: 0000000000000020
>> Dec 2 15:13:39 CGDWX08027093 klogd: ---[ end trace 93d72a36b9146f24 ]---
>> Dec 2 15:14:01 CGDWX08027093 CROND[6710]: (root) CMD ( /usr/share/msec/promisc_check.sh)
>> Dec 2 15:14:10 CGDWX08027093 klogd: BUG: unable to handle kernel NULL pointer dereference at (null)
>> Dec 2 15:14:10 CGDWX08027093 klogd: IP: [] cifs_demultiplex_thread+0x37c/0xc50 [cifs]
>> Dec 2 15:14:10 CGDWX08027093 klogd: *pde = 00000000
>> Dec 2 15:14:10 CGDWX08027093 klogd: Oops: 0000 [#2] SMP
>> Dec 2 15:14:10 CGDWX08027093 klogd: last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/net/eth0/ifindex
>> Dec 2 15:14:10 CGDWX08027093 klogd: Modules linked in: nls_utf8 nls_iso8859_1 cifs i915 drm i2c_algo_bit i2c_core af_packet ipv6 binfmt_misc loop dm_mirror dm_region_hash dm_log dm_mod cpufreq_ondemand cpufreq_conservative cpufreq_powersave acpi_cpufreq freq_table snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_timer snd_mixer_oss ehci_hcd snd e1000e heci(C) soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr uhci_hcd processor floppy button evdev video wmi output tpm_infineon tpm tpm_bios thermal sg usbcore ide_generic ata_generic ide_pci_generic ide_gd_mod ide_core pata_acpi ahci ata_piix libata sd_mod scsi_mod crc_t10dif ext3 jbd
>> Dec 2 15:14:10 CGDWX08027093 klogd:
>> Dec 2 15:14:10 CGDWX08027093 klogd: Pid: 5120, comm: cifsd Tainted: G D WC (2.6.31.5-desktop-1xcm #1) HP Compaq dc7900 Small Form Factor
>> Dec 2 15:14:10 CGDWX08027093 klogd: EIP: 0060:[] EFLAGS: 00010216 CPU: 1
>> Dec 2 15:14:10 CGDWX08027093 klogd: EIP is at cifs_demultiplex_thread+0x37c/0xc50 [cifs]
>> Dec 2 15:14:10 CGDWX08027093 klogd: EAX: f84afd20 EBX: f5e64400 ECX: f6c38000 EDX: 00000000
>> Dec 2 15:14:10 CGDWX08027093 klogd: ESI: f5e64460 EDI: f5e64408 EBP: f5f73fb8 ESP: f5f73f40
>> Dec 2 15:14:10 CGDWX08027093 klogd: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>> Dec 2 15:14:10 CGDWX08027093 klogd: Process cifsd (pid: 5120, ti=f5f72000 task=f5f5eff0 task.ti=f5f72000)
>> Dec 2 15:14:10 CGDWX08027093 klogd: Stack:
>> Dec 2 15:14:10 CGDWX08027093 klogd: 00000000 00000004 00000000 743594af f5e64448 c1f9a420 f5e64460 f61ca550
>> Dec 2 15:14:10 CGDWX08027093 klogd: <0> f5fd0000 f5dbb340 f6b46e00 f5f5eff0 00f5f270 c1f9a420 00000001 c012e588
>> Dec 2 15:14:10 CGDWX08027093 klogd: <0> 74359c99 83000001 00000003 00000000 f5f73fa4 00000001 00000000 00000000
>> Dec 2 15:14:10 CGDWX08027093 klogd: Call Trace:
>> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? __wake_up_common+0x48/0x70
>> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? complete+0x4e/0x60
>> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? cifs_demultiplex_thread+0x0/0xc50 [cifs]
>> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? kthread+0x84/0x90
>> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? kthread+0x0/0x90
>> Dec 2 15:14:10 CGDWX08027093 klogd: [] ? kernel_thread_helper+0x7/0x10
>> Dec 2 15:14:10 CGDWX08027093 klogd: Code: 4d a0 3b 4b 60 74 17 f6 05 04 fd 4a f8 01 0f 85 4e 04 00 00 b8 b0 b3 00 00 e8 71 e9 cc c7 b8 20 fd 4a f8 e8 57 22 f9 c7 8b 53 08 <8b> 02 0f 18 00 90 39 fa 74 15 66 90 c7 42 20 00 00 00 00 89 c2
>> Dec 2 15:14:10 CGDWX08027093 klogd: EIP: [] cifs_demultiplex_thread+0x37c/0xc50 [cifs] SS:ESP 0068:f5f73f40
>> Dec 2 15:14:10 CGDWX08027093 klogd: CR2: 0000000000000000
>> Dec 2 15:14:10 CGDWX08027093 klogd: ---[ end trace 93d72a36b9146f25 ]---
>>
>> -------------------------------------
>>
>
> Hard to be sure from the info here but I think I might see the problem...
>
> Can you reproduce this at will? If you can, could you let me know
> whether the attached patch fixes it? Note that I haven't even compile
> tested it, but it's pretty straightforward.
>

While the scenario seems quite possible, the NULL pointer dereference
error (as opposed to a GPF) makes me doubt whether this patch will
actually fix the issue.

There is one more scenario which I think is possible. In
mount_fail_check, we fail to check whether pSesInfo is non-NULL before
calling cifs_put_smb_ses() from cifs_put_tcon(). If pSesInfo is already
NULL, cifs_put_smb_ses() will try to dereference ses to get the
TCP_Server_Info pointer and will oops.

Something like the below should fix this?

 fs/cifs/connect.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 63ea83f..04047e7 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1672,7 +1672,8 @@ cifs_put_tcon(struct cifsTconInfo *tcon)
 	_FreeXid(xid);
 
 	tconInfoFree(tcon);
-	cifs_put_smb_ses(ses);
+	if (ses)
+		cifs_put_smb_ses(ses);
 }
 
 int
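
For anyone who wants to see the guard in isolation, here is a minimal userspace sketch of the same pattern. It is not the cifs code: struct session, struct tcon, session_put(), tcon_put() and the sessions_freed counter are all made-up stand-ins, and the refcounting is deliberately simplified. It only shows that a put routine which dereferences its argument has to be skipped when the owner's session pointer was never set. That would be consistent with the first oops, which faults at address 0x20 with EAX zero, i.e. a read at a small offset from a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real cifs structures. */
struct session {
	int refcount;		/* number of users of this session */
};

struct tcon {
	struct session *ses;	/* may be NULL if session setup never completed */
};

int sessions_freed;		/* counter used only to observe behaviour */

/* Allocate a session holding one reference; returns NULL on failure. */
struct session *session_new(void)
{
	struct session *ses = malloc(sizeof(*ses));

	if (ses)
		ses->refcount = 1;
	return ses;
}

/* Allocate a tcon pointing at ses (which may be NULL). */
struct tcon *tcon_new(struct session *ses)
{
	struct tcon *tcon = malloc(sizeof(*tcon));

	if (tcon)
		tcon->ses = ses;
	return tcon;
}

/* Drop one reference. Dereferences ses, so the caller must not pass NULL. */
void session_put(struct session *ses)
{
	assert(ses->refcount > 0);
	if (--ses->refcount == 0) {
		free(ses);
		sessions_freed++;
	}
}

/* Tear down a tcon, guarding the session put the way the patch does. */
void tcon_put(struct tcon *tcon)
{
	struct session *ses = tcon->ses;	/* read before freeing tcon */

	free(tcon);
	if (ses)
		session_put(ses);
}
```

Calling tcon_put() on a tcon created with tcon_new(NULL) returns without touching the session code; without the "if (ses)" check the same call would read through a NULL pointer inside session_put().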