From patchwork Thu Oct 29 14:08:30 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 37186 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by ozlabs.org (Postfix) with ESMTP id BDA3FB7BE9 for ; Fri, 30 Oct 2009 01:08:50 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754230AbZJ2OIk (ORCPT ); Thu, 29 Oct 2009 10:08:40 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753977AbZJ2OIj (ORCPT ); Thu, 29 Oct 2009 10:08:39 -0400 Received: from gw1.cosmosbay.com ([212.99.114.194]:49214 "EHLO gw1.cosmosbay.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752891AbZJ2OIj (ORCPT ); Thu, 29 Oct 2009 10:08:39 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) by gw1.cosmosbay.com (8.13.7/8.13.7) with ESMTP id n9TE8V2a002103; Thu, 29 Oct 2009 15:08:31 +0100 Message-ID: <4AE9A1DE.6000808@gmail.com> Date: Thu, 29 Oct 2009 15:08:30 +0100 From: Eric Dumazet User-Agent: Thunderbird 2.0.0.23 (Windows/20090812) MIME-Version: 1.0 To: =?ISO-8859-15?Q?Ilpo_J=E4rvinen?= CC: David Miller , Andrew Morton , Stephen Hemminger , Netdev , kolo@albatani.cz, bugzilla-daemon@bugzilla.kernel.org Subject: Re: Fw: [Bug 14470] New: freez in TCP stack References: <20091026084132.57bc3d07@nehalam> <20091028151313.ba4a4d23.akpm@linux-foundation.org> <4AE9298C.1000204@gmail.com> In-Reply-To: X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [0.0.0.0]); Thu, 29 Oct 2009 15:08:32 +0100 (CET) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org > ...I don't understand how a stale reference would yield to a consistent > NULL ptr crash there rather than hard to track corruption for most of the > times and random crashes then here and there. Or perhaps we were just very > lucky to immediately get only those reports which point out to the right > track :-). > When a skb is freed, and re-allocated, we clear most of its fields in __alloc_skb() memset(skb, 0, offsetof(struct sk_buff, tail)); Then if this skb is freed again, not queued anywhere, its skb->next stays NULL So if we have a stale reference to a freed skb, we can : - Get a NULL pointer, or a poisonned value (if SLUB_DEBUG) Here is a debug patch to check we dont have stale pointers, maybe this will help ?sync [PATCH] tcp: check stale pointers in tcp_unlink_write_queue() In order to track some obscure bug, we check in tcp_unlink_write_queue() if we dont have stale references to unlinked skb Signed-off-by: Eric Dumazet --- include/net/tcp.h | 4 ++++ net/ipv4/tcp.c | 2 +- net/ipv4/tcp_input.c | 4 ++-- net/ipv4/tcp_output.c | 8 ++++---- 4 files changed, 11 insertions(+), 7 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/net/tcp.h b/include/net/tcp.h index 740d09b..09da342 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1357,6 +1357,10 @@ static inline void tcp_insert_write_queue_before(struct sk_buff *new, static inline void tcp_unlink_write_queue(struct sk_buff *skb, struct sock *sk) { + WARN_ON(skb == tcp_sk(sk)->retransmit_skb_hint); + WARN_ON(skb == tcp_sk(sk)->lost_skb_hint); + WARN_ON(skb == tcp_sk(sk)->scoreboard_skb_hint); + WARN_ON(skb == sk->sk_send_head); __skb_unlink(skb, &sk->sk_write_queue); } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e0cfa63..328bdb1 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1102,11 +1102,11 @@ out: do_fault: if (!skb->len) { - tcp_unlink_write_queue(skb, sk); /* It is the one place in all of TCP, except connection * reset, where we can be unlinking the send_head. */ tcp_check_send_head(sk, skb); + tcp_unlink_write_queue(skb, sk); sk_wmem_free_skb(sk, skb); } diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index ba0eab6..fccc6e9 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3251,13 +3251,13 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets, if (!fully_acked) break; - tcp_unlink_write_queue(skb, sk); - sk_wmem_free_skb(sk, skb); tp->scoreboard_skb_hint = NULL; if (skb == tp->retransmit_skb_hint) tp->retransmit_skb_hint = NULL; if (skb == tp->lost_skb_hint) tp->lost_skb_hint = NULL; + tcp_unlink_write_queue(skb, sk); + sk_wmem_free_skb(sk, skb); } if (likely(between(tp->snd_up, prior_snd_una, tp->snd_una))) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 616c686..196171d 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1791,6 +1791,10 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb) tcp_highest_sack_combine(sk, next_skb, skb); + /* changed transmit queue under us so clear hints */ + tcp_clear_retrans_hints_partial(tp); + if (next_skb == tp->retransmit_skb_hint) + tp->retransmit_skb_hint = skb; tcp_unlink_write_queue(next_skb, sk); skb_copy_from_linear_data(next_skb, skb_put(skb, next_skb_size), @@ -1813,10 +1817,6 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb) */ TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS; - /* changed transmit queue under us so clear hints */ - tcp_clear_retrans_hints_partial(tp); - if (next_skb == tp->retransmit_skb_hint) - tp->retransmit_skb_hint = skb; tcp_adjust_pcount(sk, next_skb, tcp_skb_pcount(next_skb));