
[RFC,V2,2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

Message ID 20150829025715.GA5546@linux.vnet.ibm.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Raghavendra K T Aug. 29, 2015, 2:57 a.m. UTC
* David Miller <davem@davemloft.net> [2015-08-28 11:24:13]:

> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> Date: Fri, 28 Aug 2015 12:09:52 +0530
> 
> > On 08/28/2015 12:08 AM, David Miller wrote:
> >> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> >> Date: Wed, 26 Aug 2015 23:07:33 +0530
> >>
> >>> @@ -4641,10 +4647,12 @@ static inline void __snmp6_fill_stats64(u64
> >>> *stats, void __percpu *mib,
> >>>   static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int
> >>>   attrtype,
> >>>   			     int bytes)
> >>>   {
> >>> +	u64 buff[IPSTATS_MIB_MAX] = {0,};
> >>> +
>  ...
> > I hope you wanted to know the overhead rather than to change the current
> > patch. Please let me know.
> 
> I want you to change that variable initializer to an explicit memset().
> 
> The compiler is emitting a memset() or similar _anyways_.
> 
> Not because it will have any impact at all upon performance, but because
> of how it looks to people trying to read and understand the code.
> 
> 

Hi David,
resending the patch with memset. Please let me know if you want me to
resend all the patches.

----8<----
From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Subject: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking
 all the percpu data at once

Docker container creation time increased linearly from around 1.6 sec to 7.5 sec
(at 1000 containers), and perf data showed 50% overhead in snmp_fold_field.

Reason: currently __snmp6_fill_stats64 calls snmp_fold_field, which walks
through the per-cpu data of a single item (iteratively for around 90 items).

Idea: this patch aggregates the statistics by walking through all the
items of each cpu sequentially, which reduces cache misses.

Docker creation got faster by more than 2x after the patch.

Result:
                       Before           After
Docker creation time   6.836s           3.357s
cache miss             2.7%             1.38%

perf before:
    50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
     9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
     3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
     2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock

perf after:
    10.56%  swapper          [kernel.kallsyms]     [k] snooze_loop
     8.72%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
     7.59%  docker           [kernel.kallsyms]     [k] veth_stats_one
     3.65%  swapper          [kernel.kallsyms]     [k] _raw_spin_lock

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---
 net/ipv6/addrconf.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

 Changes in V2:
 - Allocate stat calculation buffer in stack (Eric)
 - Use memset to zero temp buffer (David)

Thanks, David and Eric, for the comments on V1. As both of them pointed out,
unfortunately we cannot get rid of the calculation buffer without
avoiding the unaligned op.

Comments

Eric Dumazet Aug. 29, 2015, 3:26 a.m. UTC | #1
On Sat, 2015-08-29 at 08:27 +0530, Raghavendra K T wrote:
>  
>  	/* Use put_unaligned() because stats may not be aligned for u64. */
>  	put_unaligned(items, &stats[0]);


>  	for (i = 1; i < items; i++)
> -		put_unaligned(snmp_fold_field64(mib, i, syncpoff), &stats[i]);
> +		put_unaligned(buff[i], &stats[i]);
>  

I believe Joe suggested the following code instead:

buff[0] = items;
memcpy(stats, buff, items * sizeof(u64));

Also please move buff[] array into __snmp6_fill_stats64() to make it
clear it is used in a 'leaf' function.

(even if calling memcpy()/memset() makes it not a leaf function)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Aug. 29, 2015, 5:11 a.m. UTC | #2
From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Date: Sat, 29 Aug 2015 08:27:15 +0530

> resending the patch with memset. Please let me know if you want me to
> resend all the patches.

Do not post patches as replies to existing discussion threads.

Instead, make a new, fresh, patch posting, updating the Subject line
as needed.
Raghavendra K T Aug. 29, 2015, 7:52 a.m. UTC | #3
On 08/29/2015 08:56 AM, Eric Dumazet wrote:
> On Sat, 2015-08-29 at 08:27 +0530, Raghavendra K T wrote:
>>
>>   	/* Use put_unaligned() because stats may not be aligned for u64. */
>>   	put_unaligned(items, &stats[0]);
>
>
>>   	for (i = 1; i < items; i++)
>> -		put_unaligned(snmp_fold_field64(mib, i, syncpoff), &stats[i]);
>> +		put_unaligned(buff[i], &stats[i]);
>>
>
> I believe Joe suggested the following code instead:
>
> buff[0] = items;
> memcpy(stats, buff, items * sizeof(u64));

Thanks. Sure, will use this.

(I missed that. I thought it was applicable only when we have aligned
data, and on power, put_unaligned is not a nop, unlike on intel.)

>
> Also please move buff[] array into __snmp6_fill_stats64() to make it
> clear it is used in a 'leaf' function.

Correct.

>
> (even if calling memcpy()/memset() makes it not a leaf function)
>

Raghavendra K T Aug. 29, 2015, 7:53 a.m. UTC | #4
On 08/29/2015 10:41 AM, David Miller wrote:
> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> Date: Sat, 29 Aug 2015 08:27:15 +0530
>
>> resending the patch with memset. Please let me know if you want me to
>> resend all the patches.
>
> Do not post patches as replies to existing discussion threads.
>
> Instead, make a new, fresh, patch posting, updating the Subject line
> as needed.
>

Sure.

Patch

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 21c2c81..9bdfba3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4624,16 +4624,22 @@  static inline void __snmp6_fill_statsdev(u64 *stats, atomic_long_t *mib,
 }
 
 static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
-				      int items, int bytes, size_t syncpoff)
+					int items, int bytes, size_t syncpoff,
+					u64 *buff)
 {
-	int i;
+	int i, c;
 	int pad = bytes - sizeof(u64) * items;
 	BUG_ON(pad < 0);
 
 	/* Use put_unaligned() because stats may not be aligned for u64. */
 	put_unaligned(items, &stats[0]);
+
+	for_each_possible_cpu(c)
+		for (i = 1; i < items; i++)
+			buff[i] += snmp_get_cpu_field64(mib, c, i, syncpoff);
+
 	for (i = 1; i < items; i++)
-		put_unaligned(snmp_fold_field64(mib, i, syncpoff), &stats[i]);
+		put_unaligned(buff[i], &stats[i]);
 
 	memset(&stats[items], 0, pad);
 }
@@ -4641,10 +4647,13 @@  static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
 static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
 			     int bytes)
 {
+	u64 buff[IPSTATS_MIB_MAX];
+
 	switch (attrtype) {
 	case IFLA_INET6_STATS:
-		__snmp6_fill_stats64(stats, idev->stats.ipv6,
-				     IPSTATS_MIB_MAX, bytes, offsetof(struct ipstats_mib, syncp));
+		memset(buff, 0, sizeof(buff));
+		__snmp6_fill_stats64(stats, idev->stats.ipv6, IPSTATS_MIB_MAX, bytes,
+				     offsetof(struct ipstats_mib, syncp), buff);
 		break;
 	case IFLA_INET6_ICMP6STATS:
 		__snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, ICMP6_MIB_MAX, bytes);