From: Eric Dumazet
Date: Tue, 24 Mar 2009 20:54:53 +0100
To: Patrick McHardy
CC: mbizon@freebox.fr, "Paul E. McKenney", Joakim Tjernlund, avorontsov@ru.mvista.com, netdev@vger.kernel.org
Subject: [PATCH] netfilter: Use hlist_add_head_rcu() in nf_conntrack_set_hashsize()
Message-ID: <49C93A8D.8000603@cosmosbay.com>
In-Reply-To: <49C8FBCA.40402@cosmosbay.com>
X-Patchwork-Id: 25027

Eric Dumazet wrote:
>
> We are working on a SLAB_DESTROY_BY_RCU implementation so that
> conntrack won't use call_rcu() anymore, give us a couple of days :)
>

While working on this stuff, I found one suspect use of hlist_add_head().

It's not a hot path, and I believe the following patch would make sure
nothing wrong happens.

If a chain contains elements A and B, then we might build a new table
with a new chain containing B and A (in this reverse order), and a cpu
could see A->next = B (old pointer), B->next = A (new pointer).

Thanks

[PATCH] netfilter: Use hlist_add_head_rcu() in nf_conntrack_set_hashsize()

Using hlist_add_head() in nf_conntrack_set_hashsize() is quite dangerous.
Without any barrier, one CPU could see a loop while doing its lookup.
It's true the new table cannot be seen by another cpu, but the previous
table is still readable.

Signed-off-by: Eric Dumazet
---
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 55befe5..54e983f 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1121,7 +1121,7 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
 					struct nf_conntrack_tuple_hash, hnode);
 			hlist_del_rcu(&h->hnode);
 			bucket = __hash_conntrack(&h->tuple, hashsize, rnd);
-			hlist_add_head(&h->hnode, &hash[bucket]);
+			hlist_add_head_rcu(&h->hnode, &hash[bucket]);
 		}
 	}
 	old_size = nf_conntrack_htable_size;
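
For readers who want to see the ordering concern outside the kernel, here
is a minimal userspace sketch using C11 atomics. It is not the kernel
code: struct node, struct head, add_head_publish(), rehash_chain(),
chain_length() and resize_demo() are all hypothetical names made up for
this illustration, and the release fence only stands in for the write
barrier that hlist_add_head_rcu() provides through rcu_assign_pointer().
The demo is single-threaded, so it only checks the A,B -> B,A reordering
and that the resulting chain terminates; it does not reproduce the race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct hlist_node / hlist_head. */
struct node {
	_Atomic(struct node *) next;
	int key;
};

struct head {
	_Atomic(struct node *) first;
};

/*
 * Rough analogue of hlist_add_head_rcu(): set n->next, then issue a
 * release fence (standing in for the kernel's write barrier) before
 * the store that makes n reachable from the chain head.
 */
static void add_head_publish(struct node *n, struct head *h)
{
	struct node *first = atomic_load_explicit(&h->first, memory_order_relaxed);

	atomic_store_explicit(&n->next, first, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&h->first, n, memory_order_relaxed);
}

/*
 * Move every node from old_chain to new_chain, always taking the first
 * node, which reverses the order just as the resize loop does. As with
 * hlist_del_rcu(), the unlinked node keeps its old next pointer until
 * it is re-added, so a reader still standing on it can move on.
 */
static void rehash_chain(struct head *old_chain, struct head *new_chain)
{
	struct node *n;

	while ((n = atomic_load_explicit(&old_chain->first, memory_order_relaxed)) != NULL) {
		struct node *next = atomic_load_explicit(&n->next, memory_order_relaxed);

		atomic_store_explicit(&old_chain->first, next, memory_order_relaxed);
		add_head_publish(n, new_chain);
	}
}

/* Reader-side walk: chain length, or -1 if more than limit nodes are visited. */
static int chain_length(struct head *h, int limit)
{
	struct node *pos = atomic_load_explicit(&h->first, memory_order_acquire);
	int len = 0;

	while (pos != NULL) {
		if (++len > limit)
			return -1;	/* looks like a loop */
		pos = atomic_load_explicit(&pos->next, memory_order_acquire);
	}
	return len;
}

/* Single-threaded check of the A,B -> B,A reordering; returns 0 on success. */
int resize_demo(void)
{
	struct node a, b;
	struct head old_chain, new_chain;

	a.key = 1;
	b.key = 2;
	atomic_init(&a.next, NULL);
	atomic_init(&b.next, NULL);
	atomic_init(&old_chain.first, NULL);
	atomic_init(&new_chain.first, NULL);

	add_head_publish(&b, &old_chain);
	add_head_publish(&a, &old_chain);	/* old chain: A -> B */
	rehash_chain(&old_chain, &new_chain);	/* new chain: B -> A */

	if (atomic_load(&new_chain.first) != &b)
		return 1;
	if (atomic_load(&b.next) != &a)
		return 2;
	if (atomic_load(&a.next) != NULL)
		return 3;
	if (chain_length(&new_chain, 2) != 2)
		return 4;
	if (chain_length(&old_chain, 2) != 0)
		return 5;
	return 0;
}
```

The point of the fence is ordering between the two insertions: the store
A->next = NULL sits before the fence of the first add_head_publish() call,
and the store B->next = A comes after it, so a reader that observes the
new B->next should not then observe the stale A->next = B. Dropping the
fence corresponds to the plain hlist_add_head() case described above.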