[RFT,3/4] Use mod_timer_noact to remove nf_conntrack_lock

Message ID 20090218052747.555811553@vyatta.com
State RFC, archived
Delegated to: David Miller

Commit Message

stephen hemminger Feb. 18, 2009, 5:19 a.m. UTC
Now that we are using mod_timer_noact() for timer updates there's no need to
hold the global lock during the timer update since the actual timeout update
is now protected by the timer locking.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>

---
 net/netfilter/nf_conntrack_core.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
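The description leans on the behaviour of mod_timer_noact(), which is introduced elsewhere in this series and is not shown in this patch. The sketch below is a small userspace model of the semantics it appears to rely on: the expiry is updated only while the timer is still pending, and an inactive timer is never re-armed. That semantics is an assumption, and every toy_* name here is hypothetical; none of this is kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel timer, not struct timer_list. */
struct toy_timer {
	atomic_flag lock;       /* plays the role of the timer's own locking */
	bool pending;           /* true while the timer is armed */
	unsigned long expires;  /* expiry time in made-up "jiffies" */
};

static void toy_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin until the current holder clears the flag */
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void toy_timer_init(struct toy_timer *t)
{
	atomic_flag_clear(&t->lock);
	t->pending = false;
	t->expires = 0;
}

/* Arm an inactive timer, as would happen once when an entry is confirmed. */
static void toy_timer_add(struct toy_timer *t, unsigned long expires)
{
	toy_lock(&t->lock);
	assert(!t->pending);	/* arming an already pending timer is a bug */
	t->expires = expires;
	t->pending = true;
	toy_unlock(&t->lock);
}

/* Disarm the timer; returns true if it was still pending. */
static bool toy_timer_del(struct toy_timer *t)
{
	bool was_pending;

	toy_lock(&t->lock);
	was_pending = t->pending;
	t->pending = false;
	toy_unlock(&t->lock);
	return was_pending;
}

/*
 * ASSUMED semantics of mod_timer_noact(): change the expiry only if the
 * timer is still pending, never re-arm it. Returns true if updated.
 */
static bool toy_mod_timer_noact(struct toy_timer *t, unsigned long expires)
{
	bool updated = false;

	toy_lock(&t->lock);
	if (t->pending) {
		t->expires = expires;
		updated = true;
	}
	toy_unlock(&t->lock);
	return updated;
}
```

In this model a refresh that races with toy_timer_del() either lands before the delete or finds the timer inactive and does nothing, so no outer lock is needed to keep a dying entry from being re-armed.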

Comments

Patrick McHardy Feb. 18, 2009, 9:54 a.m. UTC | #1
Stephen Hemminger wrote:

This looks good, thanks for not letting those patches die :)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Feb. 18, 2009, 11:05 a.m. UTC | #2
On 18-02-2009 06:19, Stephen Hemminger wrote:
> Now that we are using mod_timer_noact() for timer updates there's no need to

Hmm... so where exactly are we using this mod_timer_noact() now?

Jarek P.

> hold the global lock during the timer update since the actual timeout update
> is now protected by the timer locking.
> 
> Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
> 
> ---
>  net/netfilter/nf_conntrack_core.c |    9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> --- a/net/netfilter/nf_conntrack_core.c	2009-02-17 10:55:33.370882059 -0800
> +++ b/net/netfilter/nf_conntrack_core.c	2009-02-17 13:48:25.080060712 -0800
> @@ -793,13 +793,12 @@ void __nf_ct_refresh_acct(struct nf_conn
>  	NF_CT_ASSERT(ct->timeout.data == (unsigned long)ct);
>  	NF_CT_ASSERT(skb);
>  
> -	spin_lock_bh(&nf_conntrack_lock);
> -
>  	/* Only update if this is not a fixed timeout */
>  	if (test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status))
>  		goto acct;
>  
> -	/* If not in hash table, timer will not be active yet */
> +	/* If not in hash table, timer will not be active yet,
> +	   we are the only one able to see it. */
>  	if (!nf_ct_is_confirmed(ct)) {
>  		ct->timeout.expires = extra_jiffies;
>  		event = IPCT_REFRESH;
> @@ -821,16 +820,16 @@ acct:
>  	if (do_acct) {
>  		struct nf_conn_counter *acct;
>  
> +		spin_lock_bh(&nf_conntrack_lock);
>  		acct = nf_conn_acct_find(ct);
>  		if (acct) {
>  			acct[CTINFO2DIR(ctinfo)].packets++;
>  			acct[CTINFO2DIR(ctinfo)].bytes +=
>  				skb->len - skb_network_offset(skb);
>  		}
> +		spin_unlock_bh(&nf_conntrack_lock);
>  	}
>  
> -	spin_unlock_bh(&nf_conntrack_lock);
> -
>  	/* must be unlocked when calling event cache */
>  	if (event)
>  		nf_conntrack_event_cache(event, ct);
Patrick McHardy Feb. 18, 2009, 11:08 a.m. UTC | #3
Jarek Poplawski wrote:
> On 18-02-2009 06:19, Stephen Hemminger wrote:
>> Now that we are using mod_timer_noact() for timer updates there's no need to
> 
> Hmm... so where exactly are we using this mod_timer_noact() now?

Hehe, good point, the conversion to actually use it seems to
be missing :)
Eric Dumazet Feb. 18, 2009, 2:01 p.m. UTC | #4
Stephen Hemminger wrote:
> Now that we are using mod_timer_noact() for timer updates there's no need to
> hold the global lock during the timer update since the actual timeout update
> is now protected by the timer locking.
> 
> Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
> 
> ---
>  net/netfilter/nf_conntrack_core.c |    9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> --- a/net/netfilter/nf_conntrack_core.c	2009-02-17 10:55:33.370882059 -0800
> +++ b/net/netfilter/nf_conntrack_core.c	2009-02-17 13:48:25.080060712 -0800
> @@ -793,13 +793,12 @@ void __nf_ct_refresh_acct(struct nf_conn
>  	NF_CT_ASSERT(ct->timeout.data == (unsigned long)ct);
>  	NF_CT_ASSERT(skb);
>  
> -	spin_lock_bh(&nf_conntrack_lock);
> -
>  	/* Only update if this is not a fixed timeout */
>  	if (test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status))
>  		goto acct;
>  
> -	/* If not in hash table, timer will not be active yet */
> +	/* If not in hash table, timer will not be active yet,
> +	   we are the only one able to see it. */
>  	if (!nf_ct_is_confirmed(ct)) {
>  		ct->timeout.expires = extra_jiffies;
>  		event = IPCT_REFRESH;
> @@ -821,16 +820,16 @@ acct:
>  	if (do_acct) {
>  		struct nf_conn_counter *acct;
>  
> +		spin_lock_bh(&nf_conntrack_lock);
>  		acct = nf_conn_acct_find(ct);
>  		if (acct) {
>  			acct[CTINFO2DIR(ctinfo)].packets++;
>  			acct[CTINFO2DIR(ctinfo)].bytes +=
>  				skb->len - skb_network_offset(skb);
>  		}
> +		spin_unlock_bh(&nf_conntrack_lock);
>  	}
>  
> -	spin_unlock_bh(&nf_conntrack_lock);
> -
>  	/* must be unlocked when calling event cache */
>  	if (event)
>  		nf_conntrack_event_cache(event, ct);
> 

Unfortunately, this patch changes nothing, since do_acct is true most of the time.

We need to lock the accounting part at a finer granularity as well.

	spin_lock_bh(&ct->some_lock);
	acct = nf_conn_acct_find(ct);
	if (acct) {
		acct[CTINFO2DIR(ctinfo)].packets++;
		acct[CTINFO2DIR(ctinfo)].bytes +=
			skb->len - skb_network_offset(skb);
	}
	spin_unlock_bh(&ct->some_lock);
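
The per-conntrack lock suggested above can be modelled in userspace as follows. This is only a sketch under assumptions: the struct layout, the toy_* names and the spinlock built on atomic_flag are all hypothetical stand-ins, and "some_lock" is a placeholder in the original suggestion, not an existing field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical directions, mirroring the original/reply split of a flow. */
enum toy_dir { TOY_DIR_ORIGINAL, TOY_DIR_REPLY, TOY_DIR_MAX };

struct toy_counter {
	uint64_t packets;
	uint64_t bytes;
};

/* Hypothetical connection object carrying its own lock. */
struct toy_conn {
	atomic_flag lock;                    /* stands in for "some_lock" */
	struct toy_counter acct[TOY_DIR_MAX];
};

static void toy_conn_init(struct toy_conn *ct)
{
	atomic_flag_clear(&ct->lock);
	for (int d = 0; d < TOY_DIR_MAX; d++) {
		ct->acct[d].packets = 0;
		ct->acct[d].bytes = 0;
	}
}

/* Account one packet of len bytes under the connection's own lock. */
static void toy_conn_acct(struct toy_conn *ct, enum toy_dir dir, uint32_t len)
{
	assert(dir < TOY_DIR_MAX);

	/* Spin until we own this connection's lock. */
	while (atomic_flag_test_and_set_explicit(&ct->lock,
						 memory_order_acquire))
		;

	ct->acct[dir].packets++;
	ct->acct[dir].bytes += len;

	atomic_flag_clear_explicit(&ct->lock, memory_order_release);
}
```

With one lock per connection, packets for unrelated flows no longer serialize on a single global lock; the cost is one extra lock word in every connection object.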

Patrick McHardy Feb. 18, 2009, 2:04 p.m. UTC | #5
Eric Dumazet wrote:
> Unfortunately, this patch changes nothing, since do_acct is true most of the time.
> 
> We need to lock the accounting part at a finer granularity as well.
> 
> 	spin_lock_bh(&ct->some_lock);
> 	acct = nf_conn_acct_find(ct);
> 	if (acct) {
> 		acct[CTINFO2DIR(ctinfo)].packets++;
> 		acct[CTINFO2DIR(ctinfo)].bytes +=
> 			skb->len - skb_network_offset(skb);
> 	}
> 	spin_unlock_bh(&ct->some_lock);
> 

It's currently still enabled by default, but we intend to change that.
After that I guess almost nobody will have it enabled.
Eric Dumazet Feb. 18, 2009, 2:22 p.m. UTC | #6
Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Unfortunately, this patch changes nothing, since do_acct is true most
>> of the time.
>>
>> We need to lock the accounting part at a finer granularity as well.
>>
>>     spin_lock_bh(&ct->some_lock);
>>     acct = nf_conn_acct_find(ct);
>>     if (acct) {
>>         acct[CTINFO2DIR(ctinfo)].packets++;
>>         acct[CTINFO2DIR(ctinfo)].bytes +=
>>             skb->len - skb_network_offset(skb);
>>     }
>>     spin_unlock_bh(&ct->some_lock);
>>
> 
> It's currently still enabled by default, but we intend to change that.
> After that I guess almost nobody will have it enabled.
> 
> 

Really? I find this accounting stuff really useful and always enable it :)


Patrick McHardy Feb. 18, 2009, 2:27 p.m. UTC | #7
Eric Dumazet wrote:
> Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Unfortunately, this patch changes nothing, since do_acct is true most
>>> of the time.
>>>
>>> We need to lock the accounting part at a finer granularity as well.
>>>
>>>     spin_lock_bh(&ct->some_lock);
>>>     acct = nf_conn_acct_find(ct);
>>>     if (acct) {
>>>         acct[CTINFO2DIR(ctinfo)].packets++;
>>>         acct[CTINFO2DIR(ctinfo)].bytes +=
>>>             skb->len - skb_network_offset(skb);
>>>     }
>>>     spin_unlock_bh(&ct->some_lock);
>>>
>> It's currently still enabled by default, but we intend to change that.
>> After that I guess almost nobody will have it enabled.
>> 
> Really? I find this accounting stuff really useful and always enable it :)

You usually need extra userspace daemons to make something useful
out of the data and I doubt many people are running them. It doesn't
hurt to optimize it anyway, of course :)

But I'm somewhat doubtful that we're actually having lock contention
here. One thing we could do with your lock hash change is to perform
the counter updates while holding those locks, which avoids taking
a different lock just for the counters. The only reason it's done
in nf_ct_refresh is that it was already taking the conntrack lock,
but if that's no longer the case, there is no reason to keep it there.
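
A rough userspace sketch of that alternative, updating the counters under a hashed lock that the packet path already holds, might look like this. The details of the lock hash change are not part of this thread, so the bucket count, the hashing and all toy_* names below are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_LOCK_BUCKETS 8	/* assumed size of the lock array */

enum toy_dir { TOY_DIR_ORIGINAL, TOY_DIR_REPLY, TOY_DIR_MAX };

struct toy_counter {
	uint64_t packets;
	uint64_t bytes;
};

/* Hypothetical connection: no lock of its own, it borrows a bucket lock. */
struct toy_conn {
	uint32_t hash;                        /* precomputed hash of the flow */
	unsigned long timeout;                /* state the packet path updates */
	struct toy_counter acct[TOY_DIR_MAX];
};

static atomic_flag toy_bucket_locks[TOY_LOCK_BUCKETS];

static void toy_locks_init(void)
{
	for (int i = 0; i < TOY_LOCK_BUCKETS; i++)
		atomic_flag_clear(&toy_bucket_locks[i]);
}

static void toy_conn_init(struct toy_conn *ct, uint32_t hash)
{
	ct->hash = hash;
	ct->timeout = 0;
	for (int d = 0; d < TOY_DIR_MAX; d++) {
		ct->acct[d].packets = 0;
		ct->acct[d].bytes = 0;
	}
}

/* Map a connection to the lock that protects it. */
static unsigned int toy_bucket(const struct toy_conn *ct)
{
	return ct->hash % TOY_LOCK_BUCKETS;
}

/*
 * Handle one packet: the state update and the accounting share a single
 * critical section, so no separate counter lock is taken.
 */
static void toy_conn_packet(struct toy_conn *ct, enum toy_dir dir,
			    uint32_t len, unsigned long new_timeout)
{
	atomic_flag *lock = &toy_bucket_locks[toy_bucket(ct)];

	assert(dir < TOY_DIR_MAX);

	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until we own the bucket lock */

	ct->timeout = new_timeout;
	ct->acct[dir].packets++;
	ct->acct[dir].bytes += len;

	atomic_flag_clear_explicit(lock, memory_order_release);
}
```

Compared with a lock in every connection, this keeps the object smaller and takes only one lock per packet, at the price of occasional false sharing between flows that hash to the same bucket.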
Patch

--- a/net/netfilter/nf_conntrack_core.c	2009-02-17 10:55:33.370882059 -0800
+++ b/net/netfilter/nf_conntrack_core.c	2009-02-17 13:48:25.080060712 -0800
@@ -793,13 +793,12 @@  void __nf_ct_refresh_acct(struct nf_conn
 	NF_CT_ASSERT(ct->timeout.data == (unsigned long)ct);
 	NF_CT_ASSERT(skb);
 
-	spin_lock_bh(&nf_conntrack_lock);
-
 	/* Only update if this is not a fixed timeout */
 	if (test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status))
 		goto acct;
 
-	/* If not in hash table, timer will not be active yet */
+	/* If not in hash table, timer will not be active yet,
+	   we are the only one able to see it. */
 	if (!nf_ct_is_confirmed(ct)) {
 		ct->timeout.expires = extra_jiffies;
 		event = IPCT_REFRESH;
@@ -821,16 +820,16 @@  acct:
 	if (do_acct) {
 		struct nf_conn_counter *acct;
 
+		spin_lock_bh(&nf_conntrack_lock);
 		acct = nf_conn_acct_find(ct);
 		if (acct) {
 			acct[CTINFO2DIR(ctinfo)].packets++;
 			acct[CTINFO2DIR(ctinfo)].bytes +=
 				skb->len - skb_network_offset(skb);
 		}
+		spin_unlock_bh(&nf_conntrack_lock);
 	}
 
-	spin_unlock_bh(&nf_conntrack_lock);
-
 	/* must be unlocked when calling event cache */
 	if (event)
 		nf_conntrack_event_cache(event, ct);