From patchwork Sat Aug 29 02:57:15 2015
X-Patchwork-Submitter: Raghavendra K T
X-Patchwork-Id: 512110
X-Patchwork-Delegate: davem@davemloft.net
Date: Sat, 29 Aug 2015 08:27:15 +0530
From: Raghavendra K T
To: David Miller
Cc: raghavendra.kt@linux.vnet.ibm.com, edumazet@google.com,
 kuznet@ms2.inr.ac.ru, jmorris@namei.org, yoshfuji@linux-ipv6.org,
 kaber@trash.net, jiri@resnulli.us, hannes@stressinduktion.org,
 tom@herbertland.com, azhou@nicira.com, ebiederm@xmission.com,
 ipm@chirality.org.uk, nicolas.dichtel@6wind.com,
 serge.hallyn@canonical.com, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, anton@au1.ibm.com,
 nacc@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com
Subject: Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once
Message-ID: <20150829025715.GA5546@linux.vnet.ibm.com>
Reply-To: Raghavendra K T
References: <1440610653-14210-3-git-send-email-raghavendra.kt@linux.vnet.ibm.com>
 <20150827.113823.214019265460582055.davem@davemloft.net>
 <55E00238.10909@linux.vnet.ibm.com>
 <20150828.112413.424099339331017970.davem@davemloft.net>
In-Reply-To: <20150828.112413.424099339331017970.davem@davemloft.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: netdev-owner@vger.kernel.org
X-Mailing-List: netdev@vger.kernel.org

* David Miller [2015-08-28 11:24:13]:

> From: Raghavendra K T
> Date: Fri, 28 Aug 2015 12:09:52 +0530
>
> > On 08/28/2015 12:08 AM, David Miller wrote:
> >> From: Raghavendra K T
> >> Date: Wed, 26 Aug 2015 23:07:33 +0530
> >>
> >>> @@ -4641,10 +4647,12 @@ static inline void __snmp6_fill_stats64(u64
> >>> *stats, void __percpu *mib,
> >>> static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int
> >>> attrtype,
> >>> int bytes)
> >>> {
> >>> + u64 buff[IPSTATS_MIB_MAX] = {0,};
> >>> +
> ...
> > hope you wanted to know the overhead than to change the current
> > patch. please let me know..
>
> I want you to change that variable initializer to an explicit memset().
>
> The compiler is emitting a memset() or similar _anyways_.
>
> Not because it will have any impact at all upon performance, but because
> of how it looks to people trying to read and understand the code.
>

Hi David, resending the patch with memset. Please let me know if you want
me to resend all the patches.

----8<----
From: Raghavendra K T
Subject: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

Docker container creation time increased linearly from around 1.6 sec to
7.5 sec (at 1000 containers), and perf data showed 50% overhead in
snmp_fold_field.

reason: currently __snmp6_fill_stats64 calls snmp_fold_field, which walks
through the per cpu data of one item at a time (repeated for around 90
items).

idea: This patch aggregates the statistics by going through all the items
of each cpu sequentially, which reduces cache misses.

Docker creation got faster by more than 2x after the patch.

Result:
                        Before    After
Docker creation time    6.836s    3.357s
cache miss              2.7%      1.38%

perf before:
    50.73%  docker   [kernel.kallsyms]  [k] snmp_fold_field
     9.07%  swapper  [kernel.kallsyms]  [k] snooze_loop
     3.49%  docker   [kernel.kallsyms]  [k] veth_stats_one
     2.85%  swapper  [kernel.kallsyms]  [k] _raw_spin_lock

perf after:
    10.56%  swapper  [kernel.kallsyms]  [k] snooze_loop
     8.72%  docker   [kernel.kallsyms]  [k] snmp_get_cpu_field
     7.59%  docker   [kernel.kallsyms]  [k] veth_stats_one
     3.65%  swapper  [kernel.kallsyms]  [k] _raw_spin_lock

Signed-off-by: Raghavendra K T
---
 net/ipv6/addrconf.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

Change in V2:
 - Allocate stat calculation buffer in stack (Eric)
 - Use memset to zero temp buffer (David)

Thanks David and Eric for comments on V1 and as both of them pointed out,
unfortunately we cannot get rid of buffer for calculation without
avoiding unaligned op.

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 21c2c81..9bdfba3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4624,16 +4624,22 @@ static inline void __snmp6_fill_statsdev(u64 *stats, atomic_long_t *mib,
 }
 
 static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
-				      int items, int bytes, size_t syncpoff)
+					int items, int bytes, size_t syncpoff,
+					u64 *buff)
 {
-	int i;
+	int i, c;
 	int pad = bytes - sizeof(u64) * items;
 	BUG_ON(pad < 0);
 
 	/* Use put_unaligned() because stats may not be aligned for u64. */
 	put_unaligned(items, &stats[0]);
+
+	for_each_possible_cpu(c)
+		for (i = 1; i < items; i++)
+			buff[i] += snmp_get_cpu_field64(mib, c, i, syncpoff);
+
 	for (i = 1; i < items; i++)
-		put_unaligned(snmp_fold_field64(mib, i, syncpoff), &stats[i]);
+		put_unaligned(buff[i], &stats[i]);
 
 	memset(&stats[items], 0, pad);
 }
@@ -4641,10 +4647,13 @@ static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
 static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
 			     int bytes)
 {
+	u64 buff[IPSTATS_MIB_MAX];
+
 	switch (attrtype) {
 	case IFLA_INET6_STATS:
-		__snmp6_fill_stats64(stats, idev->stats.ipv6,
-				     IPSTATS_MIB_MAX, bytes, offsetof(struct ipstats_mib, syncp));
+		memset(buff, 0, sizeof(buff));
+		__snmp6_fill_stats64(stats, idev->stats.ipv6, IPSTATS_MIB_MAX, bytes,
+				     offsetof(struct ipstats_mib, syncp), buff);
 		break;
 	case IFLA_INET6_ICMP6STATS:
 		__snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, ICMP6_MIB_MAX, bytes);
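
For anyone following the thread without the kernel tree at hand, the change in loop order can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct demo_mib, DEMO_NR_CPUS, DEMO_NR_ITEMS and the demo_* functions are hypothetical names made up for this sketch, a 2D array stands in for the real per cpu storage, and the u64_stats sync and the item count kept in stats[0] are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NR_CPUS  4	/* hypothetical cpu count for the sketch */
#define DEMO_NR_ITEMS 8	/* hypothetical item count (around 90 in the report) */

/* Stand-in for per cpu MIB storage: one row of counters per cpu. */
struct demo_mib {
	uint64_t counters[DEMO_NR_CPUS][DEMO_NR_ITEMS];
};

/* Old order: for each item, visit every cpu's row (one fold per item). */
static void demo_fold_item_major(const struct demo_mib *mib, uint64_t *out)
{
	int i, c;

	for (i = 0; i < DEMO_NR_ITEMS; i++) {
		uint64_t sum = 0;

		for (c = 0; c < DEMO_NR_CPUS; c++)
			sum += mib->counters[c][i];
		out[i] = sum;
	}
}

/* New order: for each cpu, walk all of its items sequentially into a
 * zeroed scratch buffer, then copy the totals out.
 */
static void demo_fold_cpu_major(const struct demo_mib *mib, uint64_t *out)
{
	uint64_t buff[DEMO_NR_ITEMS];
	int i, c;

	memset(buff, 0, sizeof(buff));

	for (c = 0; c < DEMO_NR_CPUS; c++)
		for (i = 0; i < DEMO_NR_ITEMS; i++)
			buff[i] += mib->counters[c][i];

	memcpy(out, buff, sizeof(buff));
}

/* Fill the table with known values and check both orders agree. */
static void demo_selfcheck(void)
{
	struct demo_mib mib;
	uint64_t a[DEMO_NR_ITEMS], b[DEMO_NR_ITEMS];
	int i, c;

	for (c = 0; c < DEMO_NR_CPUS; c++)
		for (i = 0; i < DEMO_NR_ITEMS; i++)
			mib.counters[c][i] = (uint64_t)(c * 10 + i);

	demo_fold_item_major(&mib, a);
	demo_fold_cpu_major(&mib, b);
	assert(memcmp(a, b, sizeof(a)) == 0);
}
```

Calling demo_selfcheck() from a test harness confirms that both orders produce the same totals; only the memory access pattern differs. The sketch says nothing about timing, so the numbers in the Result table above remain the only measurements.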