From patchwork Tue Sep 21 07:56:00 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 65281 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id ACEF5B6F10 for ; Tue, 21 Sep 2010 17:56:19 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756354Ab0IUH4I (ORCPT ); Tue, 21 Sep 2010 03:56:08 -0400 Received: from mail-wy0-f174.google.com ([74.125.82.174]:51851 "EHLO mail-wy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755907Ab0IUH4G (ORCPT ); Tue, 21 Sep 2010 03:56:06 -0400 Received: by wyf22 with SMTP id 22so5139940wyf.19 for ; Tue, 21 Sep 2010 00:56:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:subject:from:to:cc :in-reply-to:references:content-type:date:message-id:mime-version :x-mailer:content-transfer-encoding; bh=gv2vZiLLXTW4auWahq7QMneqRaCEV3zsCUGmJD1xy6Y=; b=hnvk7S20vI15u1NfEGrkqNRJy4zqtDkApQpld/kI/KIh5s4qq/c2wJWa5Jhz0+iE7F p8nLkbT1IKOgALL+1pV4xPHj8ozQDjufGkQQN22n2jxw4JQ1C/NrML70v7QIDqd+PWHc IYcM5pPeevp59gzsJr1U6OKnwQd8Iw04u9wQI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=subject:from:to:cc:in-reply-to:references:content-type:date :message-id:mime-version:x-mailer:content-transfer-encoding; b=kFdMKZTU8MZr74oLf6TyKembFvCSnMBmcKPFIFcTbvAwF8q+iNJhBwsJC3sZJwi6MM 3l2b1wOtbj7Xblm0VKi3auX/AgRmxjWu54hLfw5INFbTzDoH5OCllYeR82B4IR4ERbx2 IBKCqfEslUlEYTmbVsTZ8xdwX28IaD5b1s6sQ= Received: by 10.227.141.77 with SMTP id l13mr2873051wbu.77.1285055764046; Tue, 21 Sep 2010 00:56:04 -0700 (PDT) Received: from [10.150.51.210] (gw0.net.jmsp.net [212.23.165.14]) by mx.google.com with ESMTPS id e56sm5776091wer.46.2010.09.21.00.56.02 (version=SSLv3 cipher=RC4-MD5); Tue, 21 Sep 2010 00:56:03 -0700 (PDT) Subject: Re: [2.6.35-rc5] INET?: possible irq lock inversion dependency From: Eric Dumazet To: Tetsuo Handa Cc: linux-fsdevel@vger.kernel.org, netdev@vger.kernel.org In-Reply-To: <201009210651.o8L6pbkP038129@www262.sakura.ne.jp> References: <201006242053.IAG26562.JHQFFOMSVFtOLO@I-love.SAKURA.ne.jp> <201009072132.FHA93457.MHOJFQOOFLVStF@I-love.SAKURA.ne.jp> <201009210651.o8L6pbkP038129@www262.sakura.ne.jp> Date: Tue, 21 Sep 2010 09:56:00 +0200 Message-ID: <1285055760.2617.27.camel@edumazet-laptop> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Le mardi 21 septembre 2010 à 15:51 +0900, Tetsuo Handa a écrit : > I succeeded to save undamaged messages by doing > > echo 1 > /proc/sys/kernel/printk_delay > > . (Well, it would be nice to do above before printing lockdep warning because > it prints a lot of lines enough to drop some lines.) > > Although my patch is still applied, I believe my patch is not the culprit. > This warning might be related to lock_sock_fast() which was introduced in > 2.6.35 because I'm seeing this warning since 2.6.35-rcX . What can I do next? > > Regards. > > [ 199.267633] nfsd: last server has exited, flushing export cache > [ 200.784089] > [ 200.784094] ========================================================= > [ 200.786186] [ INFO: possible irq lock inversion dependency detected ] > [ 200.786186] 2.6.36-rc5-old #2 > [ 200.786186] --------------------------------------------------------- > [ 200.786186] kworker/0:1/10 just changed the state of lock: > [ 200.786186] (clock-AF_INET){++.?..}, at: [] xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] but this lock was taken by another, SOFTIRQ-safe lock in the past: > [ 200.786186] (slock-AF_INET){+.-...} > [ 200.786186] > [ 200.786186] and interrupts could create inverse lock ordering between them. > [ 200.786186] > [ 200.786186] > [ 200.786186] other info that might help us debug this: > [ 200.786186] 3 locks held by kworker/0:1/10: > [ 200.786186] #0: (rpciod){+.+.+.}, at: [] process_one_work+0xd6/0x3a0 > [ 200.786186] #1: ((&xprt->task_cleanup)){+.+...}, at: [] process_one_work+0xd6/0x3a0 > [ 200.786186] #2: (sk_lock-AF_INET-RPC){+.+...}, at: [] inet_shutdown+0x49/0x110 > [ 200.786186] > [ 200.786186] the shortest dependencies between 2nd lock and 1st lock: > [ 200.786186] -> (slock-AF_INET){+.-...} ops: 11832 { > [ 200.786186] HARDIRQ-ON-W at: > [ 200.786186] [] mark_irqflags+0xfe/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_spin_lock_bh+0x3e/0x80 > [ 200.786186] [] lock_sock_fast+0x29/0x90 > [ 200.786186] [] udp_destroy_sock+0x14/0x40 > [ 200.786186] [] sk_common_release+0xb3/0xc0 > [ 200.786186] [] udp_lib_close+0x8/0x10 > [ 200.786186] [] inet_release+0xbe/0x100 > [ 200.786186] [] sock_release+0x66/0x80 > [ 200.786186] [] sock_close+0x12/0x30 > [ 200.786186] [] __fput+0x1cb/0x240 > [ 200.786186] [] fput+0x19/0x20 > [ 200.786186] [] filp_close+0x43/0x70 > [ 200.786186] [] close_files+0xad/0x150 > [ 200.786186] [] put_files_struct+0x29/0xf0 > [ 200.786186] [] exit_files+0x40/0x50 > [ 200.786186] [] do_exit+0x116/0x2e0 > [ 200.786186] [] do_group_exit+0x34/0xb0 > [ 200.786186] [] sys_exit_group+0xf/0x20 > [ 200.786186] [] syscall_call+0x7/0xb > [ 200.786186] IN-SOFTIRQ-W at: > [ 200.786186] [] mark_irqflags+0x11e/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_spin_lock+0x39/0x70 > [ 200.786186] [] sk_clone+0xb1/0x2c0 > [ 200.786186] [] inet_csk_clone+0x1a/0xb0 > [ 200.786186] [] tcp_create_openreq_child+0x1c/0x460 > [ 200.786186] [] tcp_v4_syn_recv_sock+0x3f/0x1e0 > [ 200.786186] [] tcp_check_req+0x18c/0x3b0 > [ 200.786186] [] tcp_v4_hnd_req+0x4d/0x160 > [ 200.786186] [] tcp_v4_do_rcv+0x159/0x280 > [ 200.786186] [] tcp_v4_rcv+0x574/0xa30 > [ 200.786186] [] ip_local_deliver_finish+0x103/0x320 > [ 200.786186] [] ip_local_deliver+0x30/0x40 > [ 200.786186] [] ip_rcv_finish+0x169/0x480 > [ 200.786186] [] ip_rcv+0x19a/0x2b0 > [ 200.786186] [] __netif_receive_skb+0x21d/0x310 > [ 200.786186] [] process_backlog+0x88/0x160 > [ 200.786186] [] net_rx_action+0x127/0x140 > [ 200.786186] [] __do_softirq+0xd0/0x130 > [ 200.786186] INITIAL USE at: > [ 200.786186] [] __lock_acquire+0x1c0/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_spin_lock_bh+0x3e/0x80 > [ 200.786186] [] lock_sock_fast+0x29/0x90 > [ 200.786186] [] udp_destroy_sock+0x14/0x40 > [ 200.786186] [] sk_common_release+0xb3/0xc0 > [ 200.786186] [] udp_lib_close+0x8/0x10 > [ 200.786186] [] inet_release+0xbe/0x100 > [ 200.786186] [] sock_release+0x66/0x80 > [ 200.786186] [] sock_close+0x12/0x30 > [ 200.786186] [] __fput+0x1cb/0x240 > [ 200.786186] [] fput+0x19/0x20 > [ 200.786186] [] filp_close+0x43/0x70 > [ 200.786186] [] close_files+0xad/0x150 > [ 200.786186] [] put_files_struct+0x29/0xf0 > [ 200.786186] [] exit_files+0x40/0x50 > [ 200.786186] [] do_exit+0x116/0x2e0 > [ 200.786186] [] do_group_exit+0x34/0xb0 > [ 200.786186] [] sys_exit_group+0xf/0x20 > [ 200.786186] [] syscall_call+0x7/0xb > [ 200.786186] } > [ 200.786186] ... key at: [] af_family_slock_keys+0x10/0x140 > [ 200.786186] ... acquired at: > [ 200.786186] [] check_prevs_add+0xab/0x100 > [ 200.786186] [] validate_chain+0x305/0x5a0 > [ 200.786186] [] __lock_acquire+0x2ad/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_write_lock_bh+0x3e/0x80 > [ 200.786186] [] inet_csk_listen_stop+0x86/0x160 > [ 200.786186] [] tcp_close+0x382/0x390 > [ 200.786186] [] inet_release+0xbe/0x100 > [ 200.786186] [] sock_release+0x66/0x80 > [ 200.786186] [] sock_close+0x12/0x30 > [ 200.786186] [] __fput+0x1cb/0x240 > [ 200.786186] [] fput+0x19/0x20 > [ 200.786186] [] filp_close+0x43/0x70 > [ 200.786186] [] sys_close+0x6d/0x110 > [ 200.786186] [] syscall_call+0x7/0xb > [ 200.786186] > [ 200.786186] -> (clock-AF_INET){++.?..} ops: 2429 { > [ 200.786186] HARDIRQ-ON-W at: > [ 200.786186] [] mark_irqflags+0xfe/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_write_lock_bh+0x3e/0x80 > [ 200.786186] [] sk_common_release+0x2f/0xc0 > [ 200.786186] [] udp_lib_close+0x8/0x10 > [ 200.786186] [] inet_release+0xbe/0x100 > [ 200.786186] [] sock_release+0x66/0x80 > [ 200.786186] [] sock_close+0x12/0x30 > [ 200.786186] [] __fput+0x1cb/0x240 > [ 200.786186] [] fput+0x19/0x20 > [ 200.786186] [] filp_close+0x43/0x70 > [ 200.786186] [] close_files+0xad/0x150 > [ 200.786186] [] put_files_struct+0x29/0xf0 > [ 200.786186] [] exit_files+0x40/0x50 > [ 200.786186] [] do_exit+0x116/0x2e0 > [ 200.786186] [] do_group_exit+0x34/0xb0 > [ 200.786186] [] sys_exit_group+0xf/0x20 > [ 200.786186] [] syscall_call+0x7/0xb > [ 200.786186] HARDIRQ-ON-R at: > [ 200.786186] [] mark_irqflags+0x6e/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_read_lock+0x39/0x70 > [ 200.786186] [] xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] [] tcp_rcv_synsent_state_process+0x398/0x590 > [ 200.786186] [] tcp_rcv_state_process+0x4a7/0x560 > [ 200.786186] [] tcp_v4_do_rcv+0x71/0x280 > [ 200.786186] [] __release_sock+0x66/0x150 > [ 200.786186] [] release_sock+0x87/0x90 > [ 200.786186] [] inet_stream_connect+0x5a/0x1b0 > [ 200.786186] [] kernel_connect+0x18/0x30 > [ 200.786186] [] xs_tcp_finish_connecting+0x4e/0x120 > [ 200.786186] [] xs_tcp_setup_socket+0x5b/0x180 > [ 200.786186] [] xs_tcp_connect_worker4+0x14/0x20 > [ 200.786186] [] process_one_work+0x147/0x3a0 > [ 200.786186] [] worker_thread+0xa6/0x200 > [ 200.786186] [] kthread+0x75/0x80 > [ 200.786186] [] kernel_thread_helper+0x6/0x1c > [ 200.786186] IN-SOFTIRQ-R at: > [ 200.786186] [] mark_irqflags+0x11e/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_read_lock+0x39/0x70 > [ 200.786186] [] xs_tcp_data_ready+0x21/0x90 > [ 200.786186] [] tcp_data_queue+0x248/0x820 > [ 200.786186] [] tcp_rcv_established+0xae/0x4e0 > [ 200.786186] [] tcp_v4_do_rcv+0x1d1/0x280 > [ 200.786186] [] tcp_v4_rcv+0x574/0xa30 > [ 200.786186] [] ip_local_deliver_finish+0x103/0x320 > [ 200.786186] [] ip_local_deliver+0x30/0x40 > [ 200.786186] [] ip_rcv_finish+0x169/0x480 > [ 200.786186] [] ip_rcv+0x19a/0x2b0 > [ 200.786186] [] __netif_receive_skb+0x21d/0x310 > [ 200.786186] [] process_backlog+0x88/0x160 > [ 200.786186] [] net_rx_action+0x127/0x140 > [ 200.786186] [] __do_softirq+0xd0/0x130 > [ 200.786186] SOFTIRQ-ON-R at: > [ 200.786186] [] mark_irqflags+0xe2/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_read_lock+0x39/0x70 > [ 200.786186] [] xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] [] inet_shutdown+0x97/0x110 > [ 200.786186] [] kernel_sock_shutdown+0x9/0x10 > [ 200.786186] [] xs_tcp_shutdown+0x17/0x20 > [ 200.786186] [] xs_tcp_close+0x27/0x30 > [ 200.786186] [] xprt_autoclose+0x1d/0x50 > [ 200.786186] [] process_one_work+0x147/0x3a0 > [ 200.786186] [] worker_thread+0xa6/0x200 > [ 200.786186] [] kthread+0x75/0x80 > [ 200.786186] [] kernel_thread_helper+0x6/0x1c > [ 200.786186] INITIAL USE at: > [ 200.786186] [] __lock_acquire+0x1c0/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_write_lock_bh+0x3e/0x80 > [ 200.786186] [] sk_common_release+0x2f/0xc0 > [ 200.786186] [] udp_lib_close+0x8/0x10 > [ 200.786186] [] inet_release+0xbe/0x100 > [ 200.786186] [] sock_release+0x66/0x80 > [ 200.786186] [] sock_close+0x12/0x30 > [ 200.786186] [] __fput+0x1cb/0x240 > [ 200.786186] [] fput+0x19/0x20 > [ 200.786186] [] filp_close+0x43/0x70 > [ 200.786186] [] close_files+0xad/0x150 > [ 200.786186] [] put_files_struct+0x29/0xf0 > [ 200.786186] [] exit_files+0x40/0x50 > [ 200.786186] [] do_exit+0x116/0x2e0 > [ 200.786186] [] do_group_exit+0x34/0xb0 > [ 200.786186] [] sys_exit_group+0xf/0x20 > [ 200.786186] [] syscall_call+0x7/0xb > [ 200.786186] } > [ 200.786186] ... key at: [] af_callback_keys+0x10/0x130 > [ 200.786186] ... acquired at: > [ 200.786186] [] check_usage_backwards+0x76/0xd0 > [ 200.786186] [] mark_lock_irq+0x99/0x240 > [ 200.786186] [] mark_lock+0x21c/0x3c0 > [ 200.786186] [] mark_irqflags+0xe2/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] _raw_read_lock+0x39/0x70 > [ 200.786186] [] xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] [] inet_shutdown+0x97/0x110 > [ 200.786186] [] kernel_sock_shutdown+0x9/0x10 > [ 200.786186] [] xs_tcp_shutdown+0x17/0x20 > [ 200.786186] [] xs_tcp_close+0x27/0x30 > [ 200.786186] [] xprt_autoclose+0x1d/0x50 > [ 200.786186] [] process_one_work+0x147/0x3a0 > [ 200.786186] [] worker_thread+0xa6/0x200 > [ 200.786186] [] kthread+0x75/0x80 > [ 200.786186] [] kernel_thread_helper+0x6/0x1c > [ 200.786186] > [ 200.786186] > [ 200.786186] stack backtrace: > [ 200.786186] Pid: 10, comm: kworker/0:1 Not tainted 2.6.36-rc5-old #2 > [ 200.786186] Call Trace: > [ 200.786186] [] ? printk+0x18/0x20 > [ 200.786186] [] print_irq_inversion_bug+0x108/0x120 > [ 200.786186] [] check_usage_backwards+0x76/0xd0 > [ 200.786186] [] mark_lock_irq+0x99/0x240 > [ 200.786186] [] ? check_usage_backwards+0x0/0xd0 > [ 200.786186] [] mark_lock+0x21c/0x3c0 > [ 200.786186] [] mark_irqflags+0xe2/0x180 > [ 200.786186] [] __lock_acquire+0x38d/0x8e0 > [ 200.786186] [] ? __lock_is_held+0x3a/0x60 > [ 200.786186] [] lock_acquire+0x7a/0xa0 > [ 200.786186] [] ? xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] [] _raw_read_lock+0x39/0x70 > [ 200.786186] [] ? xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] [] xs_tcp_state_change+0x1e/0x1d0 > [ 200.786186] [] ? tcp_send_fin+0x4f/0xc0 > [ 200.786186] [] inet_shutdown+0x97/0x110 > [ 200.786186] [] kernel_sock_shutdown+0x9/0x10 > [ 200.786186] [] xs_tcp_shutdown+0x17/0x20 > [ 200.786186] [] xs_tcp_close+0x27/0x30 > [ 200.786186] [] xprt_autoclose+0x1d/0x50 > [ 200.786186] [] process_one_work+0x147/0x3a0 > [ 200.786186] [] ? process_one_work+0xd6/0x3a0 > [ 200.786186] [] ? __lock_acquired+0x119/0x1c0 > [ 200.786186] [] ? xprt_autoclose+0x0/0x50 > [ 200.786186] [] ? worker_thread+0x1c/0x200 > [ 200.786186] [] ? __need_more_worker+0x17/0x40 > [ 200.786186] [] worker_thread+0xa6/0x200 > [ 200.786186] [] kthread+0x75/0x80 > [ 200.786186] [] ? worker_thread+0x0/0x200 > [ 200.786186] [] ? kthread+0x0/0x80 > [ 200.786186] [] kernel_thread_helper+0x6/0x1c > [ 208.718821] Restarting system. > [ 208.720581] machine restart I would say someone calls xs_tcp_state_change() while softirqs are not masked. If its expected, then following patch is needed ? --- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index b6309db..fc94e5e 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1301,7 +1301,7 @@ static void xs_tcp_state_change(struct sock *sk) { struct rpc_xprt *xprt; - read_lock(&sk->sk_callback_lock); + read_lock_bh(&sk->sk_callback_lock); if (!(xprt = xprt_from_sock(sk))) goto out; dprintk("RPC: xs_tcp_state_change client %p...\n", xprt); @@ -1313,7 +1313,7 @@ static void xs_tcp_state_change(struct sock *sk) switch (sk->sk_state) { case TCP_ESTABLISHED: - spin_lock_bh(&xprt->transport_lock); + spin_lock(&xprt->transport_lock); if (!xprt_test_and_set_connected(xprt)) { struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt); @@ -1327,7 +1327,7 @@ static void xs_tcp_state_change(struct sock *sk) xprt_wake_pending_tasks(xprt, -EAGAIN); } - spin_unlock_bh(&xprt->transport_lock); + spin_unlock(&xprt->transport_lock); break; case TCP_FIN_WAIT1: /* The client initiated a shutdown of the socket */ @@ -1365,7 +1365,7 @@ static void xs_tcp_state_change(struct sock *sk) xs_sock_mark_closed(xprt); } out: - read_unlock(&sk->sk_callback_lock); + read_unlock_bh(&sk->sk_callback_lock); } /**