From patchwork Mon Sep 11 21:10:57 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Gregory Rose
X-Patchwork-Id: 812620
From: Greg Rose
To: dev@openvswitch.org
Date: Mon, 11 Sep 2017 14:10:57 -0700
Message-Id: <1505164269-9455-4-git-send-email-gvrose8192@gmail.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1505164269-9455-1-git-send-email-gvrose8192@gmail.com>
References: <1505164269-9455-1-git-send-email-gvrose8192@gmail.com>
Subject: [ovs-dev] [PATCH V2 04/16] datapath: Optimize operations for OvS flow_stats.
X-BeenThere: ovs-dev@openvswitch.org
Sender: ovs-dev-bounces@openvswitch.org

Upstream commit:
    commit c4b2bf6b4a35348fe6d1eb06928eb68d7b9d99a9
    Author: Tonghao Zhang
    Date:   Mon Jul 17 23:28:06 2017 -0700

    openvswitch: Optimize operations for OvS flow_stats.

    When calling flow_free() to free a flow, we call cpumask_next() once
    for every CPU in cpu_possible_mask (e.g. 128 by default). That takes
    up CPU time if flow_free() is called frequently. This happens when
    all packets are put to userspace via upcall: OvS sends them back via
    netlink to ovs_packet_cmd_execute(), which calls flow_free().

    The test topology is shown below. VM01 sends TCP packets to VM02,
    and OvS forwards the packets. When testing, we use perf to report
    the system performance.

    VM01 --- OvS-VM --- VM02

    Without this patch, perf top shows the following. flow_free()
    accounts for 3.02% of CPU usage.

    4.23%  [kernel]      [k] _raw_spin_unlock_irqrestore
    3.62%  [kernel]      [k] __do_softirq
    3.16%  [kernel]      [k] __memcpy
    3.02%  [kernel]      [k] flow_free
    2.42%  libc-2.17.so  [.] __memcpy_ssse3_back
    2.18%  [kernel]      [k] copy_user_generic_unrolled
    2.17%  [kernel]      [k] find_next_bit

    With this patch applied, perf top shows the following. flow_free()
    is no longer on the list.

    4.11%  [kernel]      [k] _raw_spin_unlock_irqrestore
    3.79%  [kernel]      [k] __do_softirq
    3.46%  [kernel]      [k] __memcpy
    2.73%  libc-2.17.so  [.] __memcpy_ssse3_back
    2.25%  [kernel]      [k] copy_user_generic_unrolled
    1.89%  libc-2.17.so  [.] _int_malloc
    1.53%  ovs-vswitchd  [.] xlate_actions

    With this patch, the TCP throughput between VMs (we don't use the
    Megaflow Cache + Microflow Cache) rises from 1.18Gbs/sec to
    1.30Gbs/sec (maybe a ~10% performance improvement).

    This patch adds a struct cpumask, cpu_used_mask, which stores the
    IDs of the CPUs that the flow has used. We then only check the
    flow_stats on the CPUs that were used; it is unnecessary to check
    all possible CPUs when getting, clearing, and updating the
    flow_stats. Adding cpu_used_mask to struct sw_flow doesn't increase
    the number of cachelines.

    Signed-off-by: Tonghao Zhang
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

Signed-off-by: Greg Rose
---
 datapath/flow.c       | 7 ++++---
 datapath/flow.h       | 2 ++
 datapath/flow_table.c | 4 +++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 30e4d21..5da7e3e 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -71,7 +71,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
 			   const struct sk_buff *skb)
 {
 	struct flow_stats *stats;
-	int cpu = smp_processor_id();
+	unsigned int cpu = smp_processor_id();
 	int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
 	stats = rcu_dereference(flow->stats[cpu]);
@@ -116,6 +116,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
 
 				rcu_assign_pointer(flow->stats[cpu],
 						   new_stats);
+				cpumask_set_cpu(cpu, &flow->cpu_used_mask);
 				goto unlock;
 			}
 		}
@@ -143,7 +144,7 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
 	memset(ovs_stats, 0, sizeof(*ovs_stats));
 
 	/* We open code this to make sure cpu 0 is always considered */
-	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) {
+	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
 		struct flow_stats *stats = rcu_dereference_ovsl(flow->stats[cpu]);
 
 		if (stats) {
@@ -167,7 +168,7 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
 	int cpu;
 
 	/* We open code this to make sure cpu 0 is always considered */
-	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) {
+	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
 		struct flow_stats *stats = ovsl_dereference(flow->stats[cpu]);
 
 		if (stats) {
diff --git a/datapath/flow.h b/datapath/flow.h
index 07af912..0796b09 100644
--- a/datapath/flow.h
+++ b/datapath/flow.h
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -218,6 +219,7 @@ struct sw_flow {
 	 */
 	struct sw_flow_key key;
 	struct sw_flow_id id;
+	struct cpumask cpu_used_mask;
 	struct sw_flow_mask *mask;
 	struct sw_flow_actions __rcu *sf_acts;
 	struct flow_stats __rcu *stats[]; /* One for each CPU.  First one
diff --git a/datapath/flow_table.c b/datapath/flow_table.c
index 6fe3739..47057a1 100644
--- a/datapath/flow_table.c
+++ b/datapath/flow_table.c
@@ -104,6 +104,8 @@ struct sw_flow *ovs_flow_alloc(void)
 
 	RCU_INIT_POINTER(flow->stats[0], stats);
 
+	cpumask_set_cpu(0, &flow->cpu_used_mask);
+
 	return flow;
 err:
 	kmem_cache_free(flow_cache, flow);
@@ -147,7 +149,7 @@ static void flow_free(struct sw_flow *flow)
 	if (flow->sf_acts)
 		ovs_nla_free_flow_actions((struct sw_flow_actions __force *)flow->sf_acts);
 	/* We open code this to make sure cpu 0 is always considered */
-	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask))
+	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask))
 		if (flow->stats[cpu])
 			kmem_cache_free(flow_stats_cache,
 					rcu_dereference_raw(flow->stats[cpu]));
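
Illustration only, not part of the patch: the effect of walking a per-flow
"used" mask instead of every possible CPU can be modelled in user space.
The sketch below is not kernel code and makes several assumptions. The
names NR_CPUS_MODEL, struct stats_model, struct flow_model, next_used_cpu()
and the flow_model_*() helpers are all hypothetical; a single 64-bit word
stands in for struct cpumask; and locking and RCU are left out entirely.
It only mirrors the loop shape used above, where slot 0 is always visited
and the remaining slots are reached through the mask.

```c
/* User-space model only. Every name below is a hypothetical stand-in. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 64

struct stats_model {
	uint64_t packets;
};

struct flow_model {
	uint64_t used_mask;	/* bit n set => stats[n] has been allocated */
	struct stats_model *stats[NR_CPUS_MODEL];
};

/* Return the next set bit strictly after 'cpu', or NR_CPUS_MODEL if none. */
static int next_used_cpu(int cpu, uint64_t mask)
{
	for (cpu++; cpu < NR_CPUS_MODEL; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPUS_MODEL;
}

/* Count one packet on 'cpu'; allocate the slot and mark it used on first touch. */
static int flow_model_update(struct flow_model *f, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	if (!f->stats[cpu]) {
		f->stats[cpu] = calloc(1, sizeof(*f->stats[cpu]));
		if (!f->stats[cpu])
			return -1;
		f->used_mask |= UINT64_C(1) << cpu;
	}
	f->stats[cpu]->packets++;
	return 0;
}

/* Sum the packet counters; *visited reports how many slots the loop examined. */
static uint64_t flow_model_get(const struct flow_model *f, int *visited)
{
	uint64_t total = 0;
	int cpu;

	*visited = 0;
	/* Open-coded start at 0 so that slot 0 is always considered. */
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu = next_used_cpu(cpu, f->used_mask)) {
		(*visited)++;
		if (f->stats[cpu])
			total += f->stats[cpu]->packets;
	}
	return total;
}

/* Free slot 0 plus every slot the mask points at, then reset the mask. */
static void flow_model_free(struct flow_model *f)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu = next_used_cpu(cpu, f->used_mask)) {
		free(f->stats[cpu]);
		f->stats[cpu] = NULL;
	}
	f->used_mask = 0;
}
```

In this model a flow that was only ever updated on CPUs 0 and 5 has two
slots examined by flow_model_get() and flow_model_free(), rather than all
NR_CPUS_MODEL slots. That is the saving the commit message describes for
flow_free().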