Patchwork [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad()

login
register
mail settings
Submitter Patrick McHardy
Date June 17, 2009, 2:23 p.m.
Message ID <4A38FC5A.70500@trash.net>
Download mbox | patch
Permalink /patch/28792/
State RFC
Delegated to: David Miller
Headers show

Comments

Patrick McHardy - June 17, 2009, 2:23 p.m.
Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Patrick McHardy a écrit :
>>> No, before it is confirmed, its only visible to the CPU handling
>>> the initial packet of a connection. Confirmation is the step that
>>> makes it visible to other CPUs.
>>
>> Thanks Patrick, I missed this, and your patch seems fine now :)
> 
> Thanks for your help, I'll send it to Dave later today.

I'm having some trouble figuring out the exact events that would
lead to the timer base corruption. Ingo, could you please test
this patch to make sure it also fixes the problem?
Eric Dumazet - June 17, 2009, 3:29 p.m.
Patrick McHardy a écrit :
> Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Patrick McHardy a écrit :
>>>> No, before it is confirmed, its only visible to the CPU handling
>>>> the initial packet of a connection. Confirmation is the step that
>>>> makes it visible to other CPUs.
>>>
>>> Thanks Patrick, I missed this, and your patch seems fine now :)
>>
>> Thanks for your help, I'll send it to Dave later today.
> 
> I'm having some trouble figuring out the exact events that would
> lead to the timer base corruption. Ingo, could you please test
> this patch to make sure it also fixes the problem?
> 
> 

;)

Event can be described as following :

CPU1					CPU2

/* __nf_conntrack_confirm() */
__nf_conntrack_hash_insert(ct, hash, repl_hash);
// now 'ct' is visible by other cpus
					// search conntrack and find ct
// timeout.expires becomes absolute here
ct->timeout.expires += jiffies;
add_timer(&ct->timeout);

					/* __nf_ct_refresh_acct() */
					if (!nf_ct_is_confirmed(ct)) {
					// we *believe* timeout.expires 
					// is not yet in use by timer code
					// and is still a relative quantity.
					// We want to 'update' it but we should not !
						ct->timeout.expires = extra_jiffies;  << CORRUPTION >>
					} else {
// too late :(
set_bit(IPS_CONFIRMED_BIT, &ct->status);



This is how I understood the problem, but I may be wrong ?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy - June 17, 2009, 3:34 p.m.
Eric Dumazet wrote:
> Patrick McHardy a écrit :
>> I'm having some trouble figuring out the exact events that would
>> lead to the timer base corruption. Ingo, could you please test
>> this patch to make sure it also fixes the problem?
> 
> ;)
> 
> Event can be described as following :
> 
> CPU1					CPU2
> 
> /* __nf_conntrack_confirm() */
> __nf_conntrack_hash_insert(ct, hash, repl_hash);
> // now 'ct' is visible by other cpus
> 					// search conntrack and find ct
> // timeout.expires becomes absolute here
> ct->timeout.expires += jiffies;
> add_timer(&ct->timeout);
> 
> 					/* __nf_ct_refresh_acct() */
> 					if (!nf_ct_is_confirmed(ct)) {
> 					// we *believe* timeout.expires 
> 					// is not yet in use by timer code
> 					// and is still a relative quantity.
> 					// We want to 'update' it but we should not !
> 						ct->timeout.expires = extra_jiffies;  << CORRUPTION >>
> 					} else {
> // too late :(
> set_bit(IPS_CONFIRMED_BIT, &ct->status);
> 
> This is how I understood the problem, but I may be wrong ?

Thats one case that can happen, but that wouldn't corrupt the
timer base AFAICS. Also the callpath shows that it actually went
into the mod_timer_pending() path *and* timer_pending() was true.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..9b20e58 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -425,7 +425,6 @@  __nf_conntrack_confirm(struct sk_buff *skb)
 	/* Remove from unconfirmed list */
 	hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
 
-	__nf_conntrack_hash_insert(ct, hash, repl_hash);
 	/* Timer relative to confirmation time, not original
 	   setting time, otherwise we'd get timer wrap in
 	   weird delay cases. */
@@ -433,8 +432,16 @@  __nf_conntrack_confirm(struct sk_buff *skb)
 	add_timer(&ct->timeout);
 	atomic_inc(&ct->ct_general.use);
 	set_bit(IPS_CONFIRMED_BIT, &ct->status);
+
+	/* Since the lookup is lockless, hash insertion must be done after
+	 * starting the timer and setting the CONFIRMED bit. The RCU barriers
+	 * guarantee that no other CPU can find the conntrack before the above
+	 * stores are visible.
+	 */
+	__nf_conntrack_hash_insert(ct, hash, repl_hash);
 	NF_CT_STAT_INC(net, insert);
 	spin_unlock_bh(&nf_conntrack_lock);
+
 	help = nfct_help(ct);
 	if (help && help->helper)
 		nf_conntrack_event_cache(IPCT_HELPER, ct);