diff mbox

[v4,net-next,1/1] net_sched: Introduce skbmod action

Message ID 1473721837.18970.95.camel@edumazet-glaptop3.roam.corp.google.com
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Sept. 12, 2016, 11:10 p.m. UTC
On Mon, 2016-09-12 at 19:02 -0400, Jamal Hadi Salim wrote:
> On 16-09-12 06:26 PM, Eric Dumazet wrote:
> > On Mon, 2016-09-12 at 18:14 -0400, Jamal Hadi Salim wrote:
> >
> >> I noticed some very weird issues when I took that out.
> >> Running sufficiently large amount of traffic (ping -f is sufficient)
> >> I saw that when i did a dump it took anywhere between 6-15 seconds.
> >> With the read_lock in place response was immediate.
> >> I can go back and run things to verify - but it was very odd.
> >
> > This was on uni processor ?
> >
> 
> It was a VM.
> 
> > Looks like typical starvation caused by aggressive softirq.
> >
> 
> Well, then it is strange that in one case a tc dump of the rule
> was immediate and in the other case it was consistent for 5-15
> seconds.
> 

This needs investigation ;)

One possible loop under high stress would be possible in
__gnet_stats_copy_basic(), since we might restart the loop if we are
really really unlucky, but this would have nothing with your patches.

Comments

Eric Dumazet Sept. 12, 2016, 11:20 p.m. UTC | #1
On Mon, 2016-09-12 at 16:10 -0700, Eric Dumazet wrote:

> 
> diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
> index 508e051304fb62627e61b5065b2325edd1b84f2e..dc9dd8ae7d5405f76c775278dac7689655b21041 100644
> --- a/net/core/gen_stats.c
> +++ b/net/core/gen_stats.c
> @@ -142,10 +142,14 @@ __gnet_stats_copy_basic(const seqcount_t *running,
>  		return;
>  	}
>  	do {
> -		if (running)
> +		if (running) {
> +			local_bh_disable();
>  			seq = read_seqcount_begin(running);
> +		}
>  		bstats->bytes = b->bytes;
>  		bstats->packets = b->packets;
> +		if (running)
> +			local_bh_enable();
>  	} while (running && read_seqcount_retry(running, seq));
>  }

Ah well, forget this patch, re-enabling bh right before
read_seqcount_retry() is not going to help.
Jamal Hadi Salim Sept. 12, 2016, 11:40 p.m. UTC | #2
On 16-09-12 07:20 PM, Eric Dumazet wrote:
> On Mon, 2016-09-12 at 16:10 -0700, Eric Dumazet wrote:
>
>>
>> diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
>> index 508e051304fb62627e61b5065b2325edd1b84f2e..dc9dd8ae7d5405f76c775278dac7689655b21041 100644
>> --- a/net/core/gen_stats.c
>> +++ b/net/core/gen_stats.c
>> @@ -142,10 +142,14 @@ __gnet_stats_copy_basic(const seqcount_t *running,
>>  		return;
>>  	}
>>  	do {
>> -		if (running)
>> +		if (running) {
>> +			local_bh_disable();
>>  			seq = read_seqcount_begin(running);
>> +		}
>>  		bstats->bytes = b->bytes;
>>  		bstats->packets = b->packets;
>> +		if (running)
>> +			local_bh_enable();
>>  	} while (running && read_seqcount_retry(running, seq));
>>  }
>
> Ah well, forget this patch, re-enabling bh right before
> read_seqcount_retry() is not going to help.
>

I have to say I have seen some odd issues once in a while reading
generic action stats.
I had a program that opened a netlink socket into the kernel.
Every X seconds it does a dump of all the actions to read the
stats.
There is a very reproducible behavior that the stats
are not in sync with the kernel. Given generic stats is lockless
I thought maybe rcu or per-cpu stats was the issue. I havent had time
to look closely.
The solution is instead of keeping the socket open all the time;
I open, read stats, close (repeat every x seconds).

If there is something you want me to try - I could do sometimes
this week. Your patch above may be useful!

cheers,
jamal
diff mbox

Patch

diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 508e051304fb62627e61b5065b2325edd1b84f2e..dc9dd8ae7d5405f76c775278dac7689655b21041 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -142,10 +142,14 @@  __gnet_stats_copy_basic(const seqcount_t *running,
 		return;
 	}
 	do {
-		if (running)
+		if (running) {
+			local_bh_disable();
 			seq = read_seqcount_begin(running);
+		}
 		bstats->bytes = b->bytes;
 		bstats->packets = b->packets;
+		if (running)
+			local_bh_enable();
 	} while (running && read_seqcount_retry(running, seq));
 }
 EXPORT_SYMBOL(__gnet_stats_copy_basic);