From patchwork Wed Jun 17 11:51:45 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrick McHardy
X-Patchwork-Id: 28777
X-Patchwork-Delegate: davem@davemloft.net
Message-ID: <4A38D8D1.6060004@trash.net>
Date: Wed, 17 Jun 2009 13:51:45 +0200
From: Patrick McHardy
User-Agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103)
To: Eric Dumazet
CC: Ingo Molnar, David Miller, Thomas Gleixner,
 torvalds@linux-foundation.org, akpm@linux-foundation.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30
 __list_add+0x7d/0xad()
References: <20090615.050449.144947903.davem@davemloft.net>
 <20090616091538.GA4184@elte.hu>
 <20090616.034752.226811527.davem@davemloft.net>
 <20090616105304.GA3579@elte.hu>
 <20090616122415.GA16630@elte.hu>
 <20090617092152.GA17449@elte.hu>
 <4A38C2F3.3000009@gmail.com>
 <4A38D5BD.2040502@trash.net>
In-Reply-To: <4A38D5BD.2040502@trash.net>
X-Mailing-List: netdev@vger.kernel.org

Patrick McHardy wrote:
> Eric Dumazet wrote:
>> IPS_CONFIRMED_BIT is set under nf_conntrack_lock (in
>> __nf_conntrack_confirm()),
>> we probably want to add a synchronisation under ct->lock as well,
>> or __nf_ct_refresh_acct() could set ct->timeout.expires to extra_jiffies,
>> while a different cpu could confirm the conntrack.
>
> Before the conntrack is confirmed, it is exclusively handled by a
> single CPU. I agree that we need to make sure the IPS_CONFIRMED_BIT
> is visible before we add the conntrack to the hash table since the
> lookup is lockless, but simply moving the set_bit before the hash
> insertion should be fine I think.
>

A slightly changed version which moves hash insertion to the end and
adds a comment about ordering. This makes sure the timer is actually
running before the conntrack can be found by other CPUs.

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..e2cc707 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -425,7 +425,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	/* Remove from unconfirmed list */
 	hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
 
-	__nf_conntrack_hash_insert(ct, hash, repl_hash);
 	/* Timer relative to confirmation time, not original
 	   setting time, otherwise we'd get timer wrap in
 	   weird delay cases. */
@@ -433,8 +432,15 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	add_timer(&ct->timeout);
 	atomic_inc(&ct->ct_general.use);
 	set_bit(IPS_CONFIRMED_BIT, &ct->status);
+
+	/* Since the lookup is lockless, hash insertion must be after starting the
+	 * timer and setting the CONFIRMED bit. The RCU barriers guarantee that no
+	 * other CPU can find the conntrack before the above stores are visible.
+	 */
+	__nf_conntrack_hash_insert(ct, hash, repl_hash);
 	NF_CT_STAT_INC(net, insert);
 	spin_unlock_bh(&nf_conntrack_lock);
+
 	help = nfct_help(ct);
 	if (help && help->helper)
 		nf_conntrack_event_cache(IPCT_HELPER, ct);
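
For readers less familiar with the pattern, the ordering rule in the new
comment (finish initialising the object, then make it reachable by lockless
readers) can be sketched in userspace with C11 atomics. This is only an
illustration under stated assumptions, not the kernel code: struct entry,
slot, confirm_and_publish() and lookup() are hypothetical names, a single
pointer stands in for the hash table, and release/acquire ordering stands in
for the RCU barriers mentioned in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a conntrack entry; NOT the kernel's struct nf_conn. */
struct entry {
	bool timer_running;	/* stands in for add_timer(&ct->timeout) */
	bool confirmed;		/* stands in for IPS_CONFIRMED_BIT */
	int refcount;		/* stands in for ct->ct_general.use */
};

/* One-slot "hash table" (an assumption for brevity) read without any lock. */
static _Atomic(struct entry *) slot;

/* Writer: complete every store to the entry first, publish the pointer last. */
static void confirm_and_publish(struct entry *e)
{
	assert(e != NULL);

	e->timer_running = true;
	e->refcount += 1;
	e->confirmed = true;

	/* Release store: the plain stores above are visible to any reader
	 * that observes the pointer through an acquire load. */
	atomic_store_explicit(&slot, e, memory_order_release);
}

/* Reader: lockless lookup; the acquire load pairs with the release store. */
static struct entry *lookup(void)
{
	return atomic_load_explicit(&slot, memory_order_acquire);
}
```

A reader that gets a non-NULL pointer from lookup() may rely on confirmed and
timer_running already being set. Publishing first and setting the fields
afterwards would reopen the window the patch closes.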