From patchwork Sat Dec 5 14:21:56 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:21:56 +0100
Message-Id: <46eaa47594f53a10b6e86d7e75c19e161cb8645b.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 01/26] netdev: Add flow API de-init function

Add a new operation for flow API providers to deinitialize when the API is
disassociated from a netdev.

Signed-off-by: Gaetan Rivet
---
 lib/netdev-offload-provider.h | 3 +++
 lib/netdev-offload.c          | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h
index 0bed7bf61..f6e8b009c 100644
--- a/lib/netdev-offload-provider.h
+++ b/lib/netdev-offload-provider.h
@@ -86,6 +86,9 @@ struct netdev_flow_api {
     /* Initializies the netdev flow api.
      * Return 0 if successful, otherwise returns a positive errno value. */
     int (*init_flow_api)(struct netdev *);
+
+    /* Deinitializes the netdev flow api. */
+    void (*deinit_flow_api)(struct netdev *);
 };
 
 int netdev_register_flow_api_provider(const struct netdev_flow_api *);
diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index 2da3bc701..f748fcf0d 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -309,6 +309,10 @@ netdev_uninit_flow_api(struct netdev *netdev)
         return;
     }
 
+    if (flow_api->deinit_flow_api) {
+        flow_api->deinit_flow_api(netdev);
+    }
+
     ovsrcu_set(&netdev->flow_api, NULL);
     rfa = netdev_lookup_flow_api(flow_api->type);
     ovs_refcount_unref(&rfa->refcnt);
From patchwork Sat Dec 5 14:21:57 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:21:57 +0100
Message-Id: <0dfe9e77ff913fff36868d851552d411b68ac42c.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 02/26] netdev-offload-dpdk: Use per-netdev offload metadata

Add a per-netdev offload data field as part of the netdev hw_info structure.
Use this field in netdev-offload-dpdk to map offload metadata
(ufid to rte_flow).
Use the flow API deinit ops to destroy the per-netdev metadata when
deallocating a netdev.

Signed-off-by: Gaetan Rivet
---
 lib/netdev-offload-dpdk.c | 100 +++++++++++++++++++++++++++++++-------
 lib/netdev-offload.h      |   1 +
 2 files changed, 84 insertions(+), 17 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 01c52e1de..8d39ab7b4 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -52,7 +52,6 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(100, 5);
 /*
  * A mapping from ufid to dpdk rte_flow.
  */
-static struct cmap ufid_to_rte_flow = CMAP_INITIALIZER;
 
 struct ufid_to_rte_flow_data {
     struct cmap_node node;
@@ -62,14 +61,63 @@ struct ufid_to_rte_flow_data {
     struct dpif_flow_stats stats;
 };
 
+struct netdev_offload_dpdk_data {
+    struct cmap ufid_to_rte_flow;
+};
+
+static int
+offload_data_init(struct netdev *netdev)
+{
+    struct netdev_offload_dpdk_data *data;
+
+    data = xzalloc(sizeof *data);
+    cmap_init(&data->ufid_to_rte_flow);
+
+    netdev->hw_info.offload_data = data;
+
+    return 0;
+}
+
+static void
+offload_data_destroy(struct netdev *netdev)
+{
+    struct netdev_offload_dpdk_data *data;
+    struct ufid_to_rte_flow_data *node;
+
+    data = netdev->hw_info.offload_data;
+    if (data == NULL) {
+        return;
+    }
+
+    CMAP_FOR_EACH (node, node, &data->ufid_to_rte_flow) {
+        ovsrcu_postpone(free, node);
+    }
+
+    cmap_destroy(&data->ufid_to_rte_flow);
+    free(data);
+
+    netdev->hw_info.offload_data = NULL;
+}
+
+static struct cmap *
+offload_data_map(struct netdev *netdev)
+{
+    struct netdev_offload_dpdk_data *data;
+
+    data = netdev->hw_info.offload_data;
+    return &data->ufid_to_rte_flow;
+}
+
 /* Find rte_flow with @ufid. */
 static struct ufid_to_rte_flow_data *
-ufid_to_rte_flow_data_find(const ovs_u128 *ufid)
+ufid_to_rte_flow_data_find(struct netdev *netdev,
+                           const ovs_u128 *ufid)
 {
     size_t hash = hash_bytes(ufid, sizeof *ufid, 0);
     struct ufid_to_rte_flow_data *data;
+    struct cmap *map = offload_data_map(netdev);
 
-    CMAP_FOR_EACH_WITH_HASH (data, node, hash, &ufid_to_rte_flow) {
+    CMAP_FOR_EACH_WITH_HASH (data, node, hash, map) {
         if (ovs_u128_equals(*ufid, data->ufid)) {
             return data;
         }
@@ -79,12 +127,13 @@ ufid_to_rte_flow_data_find(const ovs_u128 *ufid)
 }
 
 static inline struct ufid_to_rte_flow_data *
-ufid_to_rte_flow_associate(const ovs_u128 *ufid,
+ufid_to_rte_flow_associate(struct netdev *netdev, const ovs_u128 *ufid,
                            struct rte_flow *rte_flow, bool actions_offloaded)
 {
     size_t hash = hash_bytes(ufid, sizeof *ufid, 0);
     struct ufid_to_rte_flow_data *data = xzalloc(sizeof *data);
     struct ufid_to_rte_flow_data *data_prev;
+    struct cmap *map = offload_data_map(netdev);
 
     /*
      * We should not simply overwrite an existing rte flow.
@@ -92,7 +141,7 @@ ufid_to_rte_flow_associate(const ovs_u128 *ufid,
      * Thus, if following assert triggers, something is wrong:
      * the rte_flow is not destroyed.
      */
-    data_prev = ufid_to_rte_flow_data_find(ufid);
+    data_prev = ufid_to_rte_flow_data_find(netdev, ufid);
     if (data_prev) {
         ovs_assert(data_prev->rte_flow == NULL);
     }
@@ -101,21 +150,22 @@ ufid_to_rte_flow_associate(const ovs_u128 *ufid,
     data->rte_flow = rte_flow;
     data->actions_offloaded = actions_offloaded;
 
-    cmap_insert(&ufid_to_rte_flow,
-                CONST_CAST(struct cmap_node *, &data->node), hash);
+    cmap_insert(map, CONST_CAST(struct cmap_node *, &data->node), hash);
 
     return data;
 }
 
 static inline void
-ufid_to_rte_flow_disassociate(const ovs_u128 *ufid)
+ufid_to_rte_flow_disassociate(struct netdev *netdev,
+                              const ovs_u128 *ufid)
 {
+    struct cmap *map = offload_data_map(netdev);
     size_t hash = hash_bytes(ufid, sizeof *ufid, 0);
     struct ufid_to_rte_flow_data *data;
 
-    CMAP_FOR_EACH_WITH_HASH (data, node, hash, &ufid_to_rte_flow) {
+    CMAP_FOR_EACH_WITH_HASH (data, node, hash, map) {
         if (ovs_u128_equals(*ufid, data->ufid)) {
-            cmap_remove(&ufid_to_rte_flow,
-                        CONST_CAST(struct cmap_node *, &data->node), hash);
+            cmap_remove(map, CONST_CAST(struct cmap_node *, &data->node),
+                        hash);
             ovsrcu_postpone(free, data);
             return;
         }
@@ -1435,7 +1485,8 @@ netdev_offload_dpdk_add_flow(struct netdev *netdev,
     if (!flow) {
         goto out;
     }
-    flows_data = ufid_to_rte_flow_associate(ufid, flow, actions_offloaded);
+    flows_data = ufid_to_rte_flow_associate(netdev, ufid, flow,
+                                            actions_offloaded);
     VLOG_DBG("%s: installed flow %p by ufid "UUID_FMT,
              netdev_get_name(netdev), flow,
              UUID_ARGS((struct uuid *)ufid));
@@ -1453,7 +1504,7 @@ netdev_offload_dpdk_destroy_flow(struct netdev *netdev,
     int ret = netdev_dpdk_rte_flow_destroy(netdev, rte_flow, &error);
 
     if (ret == 0) {
-        ufid_to_rte_flow_disassociate(ufid);
+        ufid_to_rte_flow_disassociate(netdev, ufid);
         VLOG_DBG_RL(&rl, "%s: rte_flow 0x%"PRIxPTR
                     " flow destroy %d ufid " UUID_FMT,
                     netdev_get_name(netdev), (intptr_t) rte_flow,
@@ -1484,7 +1535,7 @@ netdev_offload_dpdk_flow_put(struct netdev *netdev, struct match *match,
      * Here destroy the old rte flow first before adding a new one.
      * Keep the stats for the newly created rule.
      */
-    rte_flow_data = ufid_to_rte_flow_data_find(ufid);
+    rte_flow_data = ufid_to_rte_flow_data_find(netdev, ufid);
     if (rte_flow_data && rte_flow_data->rte_flow) {
         old_stats = rte_flow_data->stats;
         modification = true;
@@ -1515,7 +1566,7 @@ netdev_offload_dpdk_flow_del(struct netdev *netdev, const ovs_u128 *ufid,
 {
     struct ufid_to_rte_flow_data *rte_flow_data;
 
-    rte_flow_data = ufid_to_rte_flow_data_find(ufid);
+    rte_flow_data = ufid_to_rte_flow_data_find(netdev, ufid);
     if (!rte_flow_data || !rte_flow_data->rte_flow) {
         return -1;
     }
@@ -1530,7 +1581,21 @@ netdev_offload_dpdk_flow_del(struct netdev *netdev, const ovs_u128 *ufid,
 static int
 netdev_offload_dpdk_init_flow_api(struct netdev *netdev)
 {
-    return netdev_dpdk_flow_api_supported(netdev) ? 0 : EOPNOTSUPP;
+    int ret = EOPNOTSUPP;
+
+    if (netdev_dpdk_flow_api_supported(netdev)) {
+        ret = offload_data_init(netdev);
+    }
+
+    return ret;
+}
+
+static void
+netdev_offload_dpdk_deinit_flow_api(struct netdev *netdev)
+{
+    if (netdev_dpdk_flow_api_supported(netdev)) {
+        offload_data_destroy(netdev);
+    }
 }
 
 static int
@@ -1547,7 +1612,7 @@ netdev_offload_dpdk_flow_get(struct netdev *netdev,
     struct rte_flow_error error;
     int ret = 0;
 
-    rte_flow_data = ufid_to_rte_flow_data_find(ufid);
+    rte_flow_data = ufid_to_rte_flow_data_find(netdev, ufid);
     if (!rte_flow_data || !rte_flow_data->rte_flow) {
         ret = -1;
         goto out;
@@ -1584,5 +1649,6 @@ const struct netdev_flow_api netdev_offload_dpdk = {
     .flow_put = netdev_offload_dpdk_flow_put,
     .flow_del = netdev_offload_dpdk_flow_del,
     .init_flow_api = netdev_offload_dpdk_init_flow_api,
+    .deinit_flow_api = netdev_offload_dpdk_deinit_flow_api,
     .flow_get = netdev_offload_dpdk_flow_get,
 };
diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
index 4c0ed2ae8..49b893190 100644
--- a/lib/netdev-offload.h
+++ b/lib/netdev-offload.h
@@ -45,6 +45,7 @@ struct netdev_hw_info {
     bool oor;            /* Out of Offload Resources ? */
     int offload_count;   /* Pending (non-offloaded) flow count */
     int pending_count;   /* Offloaded flow count */
+    void *offload_data;  /* Offload metadata. */
 };
 
 enum hw_info_type {
From patchwork Sat Dec 5 14:21:58 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:21:58 +0100
Subject: [ovs-dev] [RFC PATCH 03/26] netdev-offload: Add function to read hardware offload stats

Add a function for an offload provider to report a device hardware offload
count. It is not for reporting offload usage statistics, i.e. in terms of
packets or bytes matched by offloads, but only to know how many offloads are
currently programmed into the device. Because it is not related to any
specific flow, this count is not integrated with the more general flow
statistics.
Signed-off-by: Gaetan Rivet
---
 lib/netdev-offload-provider.h |  3 +++
 lib/netdev-offload.c          | 11 +++++++++++
 lib/netdev-offload.h          |  1 +
 3 files changed, 15 insertions(+)

diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h
index f6e8b009c..fd38cea66 100644
--- a/lib/netdev-offload-provider.h
+++ b/lib/netdev-offload-provider.h
@@ -83,6 +83,9 @@ struct netdev_flow_api {
     int (*flow_del)(struct netdev *, const ovs_u128 *ufid,
                     struct dpif_flow_stats *);
 
+    /* Queries an offload provider hardware statistics. */
+    int (*hw_offload_stats_get)(struct netdev *netdev, uint64_t *counter);
+
     /* Initializies the netdev flow api.
      * Return 0 if successful, otherwise returns a positive errno value. */
     int (*init_flow_api)(struct netdev *);
diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index f748fcf0d..4a8403ead 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -280,6 +280,17 @@ netdev_flow_del(struct netdev *netdev, const ovs_u128 *ufid,
            : EOPNOTSUPP;
 }
 
+int
+netdev_hw_offload_stats_get(struct netdev *netdev, uint64_t *counter)
+{
+    const struct netdev_flow_api *flow_api =
+        ovsrcu_get(const struct netdev_flow_api *, &netdev->flow_api);
+
+    return (flow_api && flow_api->hw_offload_stats_get)
+           ? flow_api->hw_offload_stats_get(netdev, counter)
+           : EOPNOTSUPP;
+}
+
 int
 netdev_init_flow_api(struct netdev *netdev)
 {
diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
index 49b893190..5ed561d13 100644
--- a/lib/netdev-offload.h
+++ b/lib/netdev-offload.h
@@ -95,6 +95,7 @@ int netdev_flow_get(struct netdev *, struct match *, struct nlattr **actions,
                     struct dpif_flow_attrs *, struct ofpbuf *wbuffer);
 int netdev_flow_del(struct netdev *, const ovs_u128 *,
                     struct dpif_flow_stats *);
+int netdev_hw_offload_stats_get(struct netdev *, uint64_t *counter);
 int netdev_init_flow_api(struct netdev *);
 void netdev_uninit_flow_api(struct netdev *);
 uint32_t netdev_get_block_id(struct netdev *);
From patchwork Sat Dec 5 14:21:59 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:21:59 +0100
Message-Id: <41f050792e324f23eaa606c3637471ffd6e85e0b.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 04/26] netdev-offload-dpdk: Implement hw-offload statistics read

In the DPDK offload provider, keep track of the number of inserted rte_flow
rules and report it when asked. Only one thread writes the counter, so
consistency is guaranteed.

Signed-off-by: Gaetan Rivet
---
 lib/netdev-offload-dpdk.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 8d39ab7b4..b29c1188f 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -63,6 +63,7 @@ struct ufid_to_rte_flow_data {
 
 struct netdev_offload_dpdk_data {
     struct cmap ufid_to_rte_flow;
+    uint64_t rte_flow_counter;
 };
 
 static int
@@ -618,6 +619,10 @@ netdev_offload_dpdk_flow_create(struct netdev *netdev,
 
     flow = netdev_dpdk_rte_flow_create(netdev, attr, items, actions, error);
     if (flow) {
+        struct netdev_offload_dpdk_data *data;
+
+        data = netdev->hw_info.offload_data;
+        data->rte_flow_counter++;
 
         if (!VLOG_DROP_DBG(&rl)) {
             dump_flow(&s, &s_extra, attr, items, actions);
             extra_str = ds_cstr(&s_extra);
@@ -1504,6 +1509,11 @@ netdev_offload_dpdk_destroy_flow(struct netdev *netdev,
     int ret = netdev_dpdk_rte_flow_destroy(netdev, rte_flow, &error);
 
     if (ret == 0) {
+        struct netdev_offload_dpdk_data *data;
+
+        data = netdev->hw_info.offload_data;
+        data->rte_flow_counter--;
+
         ufid_to_rte_flow_disassociate(netdev, ufid);
         VLOG_DBG_RL(&rl, "%s: rte_flow 0x%"PRIxPTR
                     " flow destroy %d ufid " UUID_FMT,
@@ -1644,6 +1654,17 @@ out:
     return ret;
 }
 
+static int
+netdev_offload_dpdk_hw_offload_stats_get(struct netdev *netdev,
+                                         uint64_t *counter)
+{
+    struct netdev_offload_dpdk_data *data;
+
+    data = netdev->hw_info.offload_data;
+    *counter = data->rte_flow_counter;
+    return 0;
+}
+
 const struct netdev_flow_api netdev_offload_dpdk = {
     .type = "dpdk_flow_api",
     .flow_put = netdev_offload_dpdk_flow_put,
@@ -1651,4 +1672,5 @@ const struct netdev_flow_api netdev_offload_dpdk = {
     .init_flow_api = netdev_offload_dpdk_init_flow_api,
     .deinit_flow_api = netdev_offload_dpdk_deinit_flow_api,
     .flow_get = netdev_offload_dpdk_flow_get,
+    .hw_offload_stats_get = netdev_offload_dpdk_hw_offload_stats_get,
 };
From patchwork Sat Dec 5 14:22:00 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:00 +0100
Subject: [ovs-dev] [RFC PATCH 05/26] dpif: Add function to read hardware offload statistics

Expose a function to query datapath offload statistics. Call the new API
from dpctl.
Signed-off-by: Gaetan Rivet --- lib/dpctl.c | 36 ++++++++++++++++++++++++++++++++++++ lib/dpif-netdev.c | 1 + lib/dpif-netlink.c | 1 + lib/dpif-provider.h | 7 +++++++ lib/dpif.c | 8 ++++++++ lib/dpif.h | 9 +++++++++ 6 files changed, 62 insertions(+) diff --git a/lib/dpctl.c b/lib/dpctl.c index 33202813b..7bd75ae1a 100644 --- a/lib/dpctl.c +++ b/lib/dpctl.c @@ -1387,6 +1387,40 @@ dpctl_del_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p) return error; } +static int +dpctl_offload_stats_show(int argc, const char *argv[], + struct dpctl_params *dpctl_p) +{ + struct netdev_custom_stats stats; + struct dpif *dpif; + int error; + size_t i; + + error = opt_dpif_open(argc, argv, dpctl_p, 2, &dpif); + if (error) { + return error; + } + + memset(&stats, 0, sizeof(stats)); + error = dpif_offload_stats_get(dpif, &stats); + if (error) { + dpctl_error(dpctl_p, error, "retrieving offload statistics"); + goto close_dpif; + } + + dpctl_print(dpctl_p, "HW Offload stats:\n"); + for (i = 0; i < stats.size; i++) { + dpctl_print(dpctl_p, " %s: %6" PRIu64 "\n", + stats.counters[i].name, stats.counters[i].value); + } + + netdev_free_custom_stats_counters(&stats); + +close_dpif: + dpif_close(dpif); + return error; +} + static int dpctl_help(int argc OVS_UNUSED, const char *argv[] OVS_UNUSED, struct dpctl_params *dpctl_p) @@ -2541,6 +2575,8 @@ static const struct dpctl_command all_commands[] = { { "get-flow", "[dp] ufid", 1, 2, dpctl_get_flow, DP_RO }, { "del-flow", "[dp] flow", 1, 2, dpctl_del_flow, DP_RW }, { "del-flows", "[dp]", 0, 1, dpctl_del_flows, DP_RW }, + { "offload-stats-show", "[dp]", + 0, 1, dpctl_offload_stats_show, DP_RO }, { "dump-conntrack", "[dp] [zone=N]", 0, 2, dpctl_dump_conntrack, DP_RO }, { "flush-conntrack", "[dp] [zone=N] [ct-tuple]", 0, 3, dpctl_flush_conntrack, DP_RW }, diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 300861ca5..a97796f64 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -8415,6 +8415,7 @@ const struct dpif_class 
dpif_netdev_class = { dpif_netdev_flow_dump_thread_destroy, dpif_netdev_flow_dump_next, dpif_netdev_operate, + NULL, /* offload_stats_get */ NULL, /* recv_set */ NULL, /* handlers_set */ dpif_netdev_set_config, diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 2f881e4fa..5b4cf6d09 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -3967,6 +3967,7 @@ const struct dpif_class dpif_netlink_class = { dpif_netlink_flow_dump_thread_destroy, dpif_netlink_flow_dump_next, dpif_netlink_operate, + NULL, /* offload_stats_get */ dpif_netlink_recv_set, dpif_netlink_handlers_set, NULL, /* set_config */ diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h index b817fceac..36dfa8e71 100644 --- a/lib/dpif-provider.h +++ b/lib/dpif-provider.h @@ -330,6 +330,13 @@ struct dpif_class { void (*operate)(struct dpif *dpif, struct dpif_op **ops, size_t n_ops, enum dpif_offload_type offload_type); + /* Get hardware-offloads activity counters from a dataplane. + * Those counters are not offload statistics (which are accessible through + * netdev statistics), but a status of hardware offload management: + * how many offloads are currently waiting, inserted, etc. */ + int (*offload_stats_get)(struct dpif *dpif, + struct netdev_custom_stats *stats); + /* Enables or disables receiving packets with dpif_recv() for 'dpif'. * Turning packet receive off and then back on is allowed to change Netlink * PID assignments (see ->port_get_pid()). The client is responsible for diff --git a/lib/dpif.c b/lib/dpif.c index ac2860764..7218357aa 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -1426,6 +1426,14 @@ dpif_operate(struct dpif *dpif, struct dpif_op **ops, size_t n_ops, } } +int dpif_offload_stats_get(struct dpif *dpif, + struct netdev_custom_stats *stats) +{ + return (dpif->dpif_class->offload_stats_get + ? dpif->dpif_class->offload_stats_get(dpif, stats) + : EOPNOTSUPP); +} + /* Returns a string that represents 'type', for use in log messages. 
*/ const char * dpif_upcall_type_to_string(enum dpif_upcall_type type) diff --git a/lib/dpif.h b/lib/dpif.h index cb047dbe2..7ad0fe604 100644 --- a/lib/dpif.h +++ b/lib/dpif.h @@ -785,6 +785,15 @@ struct dpif_op { void dpif_operate(struct dpif *, struct dpif_op **ops, size_t n_ops, enum dpif_offload_type); + +/* Queries the datapath for hardware offloads stats. + * + * Statistics are written in 'stats' following the 'netdev_custom_stats' + * format. They are allocated on the heap and must be freed by the caller, + * using 'netdev_free_custom_stats_counters'. + */ +int dpif_offload_stats_get(struct dpif *dpif, + struct netdev_custom_stats *stats); /* Upcalls. */ From patchwork Sat Dec 5 14:22:01 2020
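The dpif.c hunk above dispatches through an optional provider callback and maps a missing callback to EOPNOTSUPP. A minimal, self-contained sketch of that pattern follows; the `dp`, `dp_class`, and `stats` names are simplified stand-ins for the real `struct dpif`, `struct dpif_class`, and `struct netdev_custom_stats`, not the OVS definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct stats { uint64_t inserted; };

struct dp;
struct dp_class {
    /* Optional: providers that do not track offloads leave this NULL. */
    int (*offload_stats_get)(struct dp *, struct stats *);
};
struct dp { const struct dp_class *cls; };

/* Same dispatch shape as dpif_offload_stats_get(): a NULL provider
 * callback yields EOPNOTSUPP instead of a crash in the caller. */
static int
dp_offload_stats_get(struct dp *dp, struct stats *stats)
{
    return dp->cls->offload_stats_get
           ? dp->cls->offload_stats_get(dp, stats)
           : EOPNOTSUPP;
}

/* A fake provider implementation, for demonstration only. */
static int
fake_stats_get(struct dp *dp, struct stats *stats)
{
    (void) dp;
    stats->inserted = 42;
    return 0;
}
```

With this shape, dpctl can report "not supported" cleanly for datapaths such as dpif-netlink that leave the slot NULL.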
From: Gaetan Rivet To: dev@openvswitch.org Date: Sat, 5 Dec 2020 15:22:01 +0100 Subject: [ovs-dev] [RFC PATCH 06/26] dpif-netdev: Implement hardware offloads stats query In the netdev datapath, keep track of the enqueued offloads between the PMDs and the offload thread.
Additionally, query each netdev for their hardware offload counters. Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 54 insertions(+), 1 deletion(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index a97796f64..71c75174b 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -435,12 +435,14 @@ struct dp_flow_offload_item { struct dp_flow_offload { struct ovs_mutex mutex; struct ovs_list list; + uint64_t enqueued_item; pthread_cond_t cond; }; static struct dp_flow_offload dp_flow_offload = { .mutex = OVS_MUTEX_INITIALIZER, .list = OVS_LIST_INITIALIZER(&dp_flow_offload.list), + .enqueued_item = 0, }; static struct ovsthread_once offload_thread_once @@ -2627,6 +2629,7 @@ dp_netdev_append_flow_offload(struct dp_flow_offload_item *offload) { ovs_mutex_lock(&dp_flow_offload.mutex); ovs_list_push_back(&dp_flow_offload.list, &offload->node); + dp_flow_offload.enqueued_item++; xpthread_cond_signal(&dp_flow_offload.cond); ovs_mutex_unlock(&dp_flow_offload.mutex); } @@ -2743,6 +2746,7 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED) ovsrcu_quiesce_end(); } list = ovs_list_pop_front(&dp_flow_offload.list); + dp_flow_offload.enqueued_item--; offload = CONTAINER_OF(list, struct dp_flow_offload_item, node); ovs_mutex_unlock(&dp_flow_offload.mutex); @@ -4197,6 +4201,55 @@ dpif_netdev_operate(struct dpif *dpif, struct dpif_op **ops, size_t n_ops, } } +static int +dpif_netdev_offload_stats_get(struct dpif *dpif, + struct netdev_custom_stats *stats) +{ + enum { + DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED, + DP_NETDEV_HW_OFFLOADS_STATS_INSERTED, + }; + const char *names[] = { + [DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED] = "Enqueued offloads", + [DP_NETDEV_HW_OFFLOADS_STATS_INSERTED] = "Inserted offloads", + }; + struct dp_netdev *dp = get_dp_netdev(dpif); + struct dp_netdev_port *port; + uint64_t nb_offloads; + size_t i; + + if (!netdev_is_flow_api_enabled()) { + return EINVAL; + } + + stats->size = 
ARRAY_SIZE(names); + stats->counters = xcalloc(stats->size, sizeof *stats->counters); + + nb_offloads = 0; + + ovs_mutex_lock(&dp->port_mutex); + HMAP_FOR_EACH (port, node, &dp->ports) { + uint64_t port_nb_offloads = 0; + + /* Do not abort on read error from a port, just report 0. */ + if (!netdev_hw_offload_stats_get(port->netdev, &port_nb_offloads)) { + nb_offloads += port_nb_offloads; + } + } + ovs_mutex_unlock(&dp->port_mutex); + + stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value = + dp_flow_offload.enqueued_item; + stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED].value = nb_offloads; + + for (i = 0; i < ARRAY_SIZE(names); i++) { + snprintf(stats->counters[i].name, sizeof(stats->counters[i].name), + "%s", names[i]); + } + + return 0; +} + /* Enable or Disable PMD auto load balancing. */ static void set_pmd_auto_lb(struct dp_netdev *dp) @@ -8415,7 +8468,7 @@ const struct dpif_class dpif_netdev_class = { dpif_netdev_flow_dump_thread_destroy, dpif_netdev_flow_dump_next, dpif_netdev_operate, - NULL, /* offload_stats_get */ + dpif_netdev_offload_stats_get, NULL, /* recv_set */ NULL, /* handlers_set */ dpif_netdev_set_config, From patchwork Sat Dec 5 14:22:02 2020
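The fill pattern in dpif_netdev_offload_stats_get() above (an enum paired with a designated-initializer `names[]` array, counters allocated in one calloc, names copied last) can be sketched standalone. The `custom_counter` and `custom_stats` types below are simplified stand-ins for OVS's `netdev_custom_counter` and `netdev_custom_stats`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Simplified stand-ins for the netdev_custom_stats types. */
struct custom_counter {
    char name[64];
    uint64_t value;
};

struct custom_stats {
    uint16_t size;
    struct custom_counter *counters;
};

enum { STATS_ENQUEUED, STATS_INSERTED };

/* Enum-indexed designated initializers keep the name strings and the
 * value slots in sync even if entries are later reordered. */
static void
fill_offload_stats(struct custom_stats *stats, uint64_t enqueued,
                   uint64_t inserted)
{
    static const char *names[] = {
        [STATS_ENQUEUED] = "Enqueued offloads",
        [STATS_INSERTED] = "Inserted offloads",
    };
    size_t i;

    stats->size = ARRAY_SIZE(names);
    stats->counters = calloc(stats->size, sizeof *stats->counters);
    stats->counters[STATS_ENQUEUED].value = enqueued;
    stats->counters[STATS_INSERTED].value = inserted;
    for (i = 0; i < ARRAY_SIZE(names); i++) {
        snprintf(stats->counters[i].name, sizeof stats->counters[i].name,
                 "%s", names[i]);
    }
}
```

As in the patch, the caller owns the heap-allocated counter array and must free it after printing.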
From: Gaetan Rivet To: dev@openvswitch.org Date: Sat, 5 Dec 2020 15:22:02 +0100 Message-Id: X-Mailer: git-send-email 2.29.2
In-Reply-To: References: MIME-Version: 1.0 Subject: [ovs-dev] [RFC PATCH 07/26] dpif-netdev: Rename flow offload thread ovs_strlcpy silently fails to copy the thread name if it is too long. Rename the flow offload thread to differentiate it from the main thread. Fixes: 02bb2824e51d ("dpif-netdev: do hw flow offload in a thread") Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 71c75174b..4cc6492a1 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -2785,8 +2785,7 @@ queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd, if (ovsthread_once_start(&offload_thread_once)) { xpthread_cond_init(&dp_flow_offload.cond, NULL); - ovs_thread_create("dp_netdev_flow_offload", - dp_netdev_flow_offload_main, NULL); + ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); ovsthread_once_done(&offload_thread_once); } @@ -2809,8 +2808,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd, if (ovsthread_once_start(&offload_thread_once)) { xpthread_cond_init(&dp_flow_offload.cond, NULL); - ovs_thread_create("dp_netdev_flow_offload", - dp_netdev_flow_offload_main, NULL); + ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); ovsthread_once_done(&offload_thread_once); } From patchwork Sat Dec 5 14:22:03 2020
From: Gaetan Rivet To: dev@openvswitch.org Date: Sat, 5 Dec 2020 15:22:03 +0100 Message-Id: <4aa1c30180eb5399fbf034ca32482c63f7b2ebdd.1607177117.git.grive@u256.net> Subject: [ovs-dev] [RFC PATCH 08/26] dpif-netdev: Rename offload thread structure The offload management in userspace is done through a separate thread. The name of the structure holding the objects used for synchronization with the dataplane is generic and nondescript. Clarify the structure's role by renaming it.
Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 58 +++++++++++++++++++++++------------------------ 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 4cc6492a1..e8156cd57 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -421,7 +421,7 @@ enum { DP_NETDEV_FLOW_OFFLOAD_OP_DEL, }; -struct dp_flow_offload_item { +struct dp_offload_thread_item { struct dp_netdev_pmd_thread *pmd; struct dp_netdev_flow *flow; int op; @@ -432,16 +432,16 @@ struct dp_flow_offload_item { struct ovs_list node; }; -struct dp_flow_offload { +struct dp_offload_thread { struct ovs_mutex mutex; struct ovs_list list; uint64_t enqueued_item; pthread_cond_t cond; }; -static struct dp_flow_offload dp_flow_offload = { +static struct dp_offload_thread dp_offload_thread = { .mutex = OVS_MUTEX_INITIALIZER, - .list = OVS_LIST_INITIALIZER(&dp_flow_offload.list), + .list = OVS_LIST_INITIALIZER(&dp_offload_thread.list), .enqueued_item = 0, }; @@ -2596,12 +2596,12 @@ mark_to_flow_find(const struct dp_netdev_pmd_thread *pmd, return NULL; } -static struct dp_flow_offload_item * +static struct dp_offload_thread_item * dp_netdev_alloc_flow_offload(struct dp_netdev_pmd_thread *pmd, struct dp_netdev_flow *flow, int op) { - struct dp_flow_offload_item *offload; + struct dp_offload_thread_item *offload; offload = xzalloc(sizeof(*offload)); offload->pmd = pmd; @@ -2615,7 +2615,7 @@ dp_netdev_alloc_flow_offload(struct dp_netdev_pmd_thread *pmd, } static void -dp_netdev_free_flow_offload(struct dp_flow_offload_item *offload) +dp_netdev_free_flow_offload(struct dp_offload_thread_item *offload) { dp_netdev_pmd_unref(offload->pmd); dp_netdev_flow_unref(offload->flow); @@ -2625,17 +2625,17 @@ dp_netdev_free_flow_offload(struct dp_flow_offload_item *offload) } static void -dp_netdev_append_flow_offload(struct dp_flow_offload_item *offload) +dp_netdev_append_flow_offload(struct dp_offload_thread_item *offload) { - ovs_mutex_lock(&dp_flow_offload.mutex); - 
ovs_list_push_back(&dp_flow_offload.list, &offload->node); - dp_flow_offload.enqueued_item++; - xpthread_cond_signal(&dp_flow_offload.cond); - ovs_mutex_unlock(&dp_flow_offload.mutex); + ovs_mutex_lock(&dp_offload_thread.mutex); + ovs_list_push_back(&dp_offload_thread.list, &offload->node); + dp_offload_thread.enqueued_item++; + xpthread_cond_signal(&dp_offload_thread.cond); + ovs_mutex_unlock(&dp_offload_thread.mutex); } static int -dp_netdev_flow_offload_del(struct dp_flow_offload_item *offload) +dp_netdev_flow_offload_del(struct dp_offload_thread_item *offload) { return mark_to_flow_disassociate(offload->pmd, offload->flow); } @@ -2652,7 +2652,7 @@ dp_netdev_flow_offload_del(struct dp_flow_offload_item *offload) * valid, thus only item 2 needed. */ static int -dp_netdev_flow_offload_put(struct dp_flow_offload_item *offload) +dp_netdev_flow_offload_put(struct dp_offload_thread_item *offload) { struct dp_netdev_pmd_thread *pmd = offload->pmd; struct dp_netdev_flow *flow = offload->flow; @@ -2732,23 +2732,23 @@ err_free: static void * dp_netdev_flow_offload_main(void *data OVS_UNUSED) { - struct dp_flow_offload_item *offload; + struct dp_offload_thread_item *offload; struct ovs_list *list; const char *op; int ret; for (;;) { - ovs_mutex_lock(&dp_flow_offload.mutex); - if (ovs_list_is_empty(&dp_flow_offload.list)) { + ovs_mutex_lock(&dp_offload_thread.mutex); + if (ovs_list_is_empty(&dp_offload_thread.list)) { ovsrcu_quiesce_start(); - ovs_mutex_cond_wait(&dp_flow_offload.cond, - &dp_flow_offload.mutex); + ovs_mutex_cond_wait(&dp_offload_thread.cond, + &dp_offload_thread.mutex); ovsrcu_quiesce_end(); } - list = ovs_list_pop_front(&dp_flow_offload.list); - dp_flow_offload.enqueued_item--; - offload = CONTAINER_OF(list, struct dp_flow_offload_item, node); - ovs_mutex_unlock(&dp_flow_offload.mutex); + list = ovs_list_pop_front(&dp_offload_thread.list); + dp_offload_thread.enqueued_item--; + offload = CONTAINER_OF(list, struct dp_offload_thread_item, node); + 
ovs_mutex_unlock(&dp_offload_thread.mutex); switch (offload->op) { case DP_NETDEV_FLOW_OFFLOAD_OP_ADD: @@ -2781,10 +2781,10 @@ static void queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd, struct dp_netdev_flow *flow) { - struct dp_flow_offload_item *offload; + struct dp_offload_thread_item *offload; if (ovsthread_once_start(&offload_thread_once)) { - xpthread_cond_init(&dp_flow_offload.cond, NULL); + xpthread_cond_init(&dp_offload_thread.cond, NULL); ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); ovsthread_once_done(&offload_thread_once); } @@ -2799,7 +2799,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd, struct dp_netdev_flow *flow, struct match *match, const struct nlattr *actions, size_t actions_len) { - struct dp_flow_offload_item *offload; + struct dp_offload_thread_item *offload; int op; if (!netdev_is_flow_api_enabled()) { @@ -2807,7 +2807,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd, } if (ovsthread_once_start(&offload_thread_once)) { - xpthread_cond_init(&dp_flow_offload.cond, NULL); + xpthread_cond_init(&dp_offload_thread.cond, NULL); ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); ovsthread_once_done(&offload_thread_once); } @@ -4237,7 +4237,7 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, ovs_mutex_unlock(&dp->port_mutex); stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value = - dp_flow_offload.enqueued_item; + dp_offload_thread.enqueued_item; stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED].value = nb_offloads; for (i = 0; i < ARRAY_SIZE(names); i++) { From patchwork Sat Dec 5 14:22:04 2020
From: Gaetan Rivet To: dev@openvswitch.org Date: Sat, 5 Dec 2020 15:22:04 +0100 Message-Id: <1b0d4b06fd8ad3a65cd7290c0e90da8a3858c0ce.1607177117.git.grive@u256.net> Subject: [ovs-dev] [RFC PATCH 09/26] mov-avg: Add a moving average helper structure Add a new library offering helpers to compute the Cumulative Moving Average (CMA) and the Exponential Moving Average (EMA) of a series of values. Signed-off-by: Gaetan Rivet --- lib/automake.mk | 1 + lib/mov-avg.h | 166 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 167 insertions(+) create mode 100644 lib/mov-avg.h diff --git a/lib/automake.mk b/lib/automake.mk index 8eeb6c3f6..52c99b288 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -166,6 +166,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/memory.c \ lib/memory.h \ lib/meta-flow.c \ + lib/mov-avg.h \ lib/multipath.c \ lib/multipath.h \ lib/namemap.c \ diff --git a/lib/mov-avg.h b/lib/mov-avg.h new file mode 100644 index 000000000..8569bdc34 --- /dev/null +++ b/lib/mov-avg.h @@ -0,0 +1,166 @@ +/* + * Copyright (c) 2020 NVIDIA Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef _MOV_AVG_H +#define _MOV_AVG_H 1 + +#include <math.h> + +/* Moving average helpers. */ + +/* Cumulative Moving Average. + * + * Also called Simple Moving Average. + * Online equivalent of sum(V) / len(V). + * + * As all values have equal weight, this average will + * be slower to show recent changes in the series. + * + */ + +struct mov_avg_cma { + unsigned long long int count; + double mean; + double sum_dsquared; +}; + +#define MOV_AVG_CMA_INITIALIZER \ + { .count = 0, .mean = .0, .sum_dsquared = .0 } + +static inline void +mov_avg_cma_init(struct mov_avg_cma *cma) +{ + *cma = (struct mov_avg_cma)MOV_AVG_CMA_INITIALIZER; +} + +static inline void +mov_avg_cma_update(struct mov_avg_cma *cma, double new_val) +{ + double mean; + + cma->count++; + mean = cma->mean + (new_val - cma->mean) / cma->count; + + cma->sum_dsquared += (new_val - mean) * (new_val - cma->mean); + cma->mean = mean; +} + +static inline double +mov_avg_cma(struct mov_avg_cma *cma) +{ + return cma->mean; +} + +static inline double +mov_avg_cma_std_dev(struct mov_avg_cma *cma) +{ + double variance = 0.0; + + if (cma->count > 1) { + variance = cma->sum_dsquared / (cma->count - 1); + } + + return sqrt(variance); +} + +/* Exponential Moving Average. + * + * Each value in the series has an exponentially decreasing weight: + * the older they get, the less weight they have. + * + * The smoothing factor 'alpha' must be within 0 < alpha < 1. + * The closer this factor is to zero, the more equal the weight between + * recent and older values.
As it approaches one, the more recent values + * will have more weight. + * + * The EMA can be thought of as an estimator for the next value when measures + * are dependent. In that sense, it can make sense to consider the mean square + * error of the prediction. An 'alpha' minimizing this error would be a + * better choice to improve the estimation. + * + * A common choice for 'alpha' is to derive it from the 'N' past periods that + * are interesting for the average. The following formula is used: + * + * a = 2 / (N + 1) + * + * It makes the 'N' previous values weigh approximately 86% of the average. + * Using the above formula is common practice but arbitrary. When doing so, + * it should be noted that the EMA will not forget past values before 'N', + * only that their weight will be reduced. + */ + +struct mov_avg_ema { + double alpha; /* 'Smoothing' factor. */ + double mean; + double variance; + bool initialized; +}; + +/* Choose alpha explicitly. */ +#define MOV_AVG_EMA_INITIALIZER_ALPHA(a) { \ + .initialized = false, \ + .alpha = (a), .variance = .0, .mean = .0 \ +} + +/* Choose alpha from 'N' past periods. */ +#define MOV_AVG_EMA_INITIALIZER(n_elem) \ + MOV_AVG_EMA_INITIALIZER_ALPHA(2.
/ ((double)(n_elem) + 1.)) + +static inline void +mov_avg_ema_init_alpha(struct mov_avg_ema *ema, + double alpha) +{ + *ema = (struct mov_avg_ema)MOV_AVG_EMA_INITIALIZER_ALPHA(alpha); +} + +static inline void +mov_avg_ema_init(struct mov_avg_ema *ema, + unsigned long long int n_elem) +{ + *ema = (struct mov_avg_ema)MOV_AVG_EMA_INITIALIZER(n_elem); +} + +static inline void +mov_avg_ema_update(struct mov_avg_ema *ema, double new_val) +{ + const double alpha = ema->alpha; + double diff; + + if (!ema->initialized) { + ema->initialized = true; + ema->mean = new_val; + return; + } + + diff = new_val - ema->mean; + + ema->variance = (1.0 - alpha) * (ema->variance + alpha * diff * diff); + ema->mean = ema->mean + alpha * diff; +} + +static inline double +mov_avg_ema(struct mov_avg_ema *ema) +{ + return ema->mean; +} + +static inline double +mov_avg_ema_std_dev(struct mov_avg_ema *ema) +{ + return sqrt(ema->variance); +} + +#endif /* _MOV_AVG_H */ From patchwork Sat Dec 5 14:22:05 2020
From: Gaetan Rivet To: dev@openvswitch.org Date: Sat, 5 Dec 2020 15:22:05 +0100 Message-Id: <3f9d08b1499a7c042d6eb94dbd895cbb23d50138.1607177117.git.grive@u256.net> Subject: [ovs-dev] [RFC PATCH 10/26] dpif-netdev: Add flow offload latency metrics
Add an offload latency average metric. Use an exponential moving average so that recent changes in latency show up faster. Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index e8156cd57..1bbe6d98f 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -51,6 +51,7 @@ #include "hmapx.h" #include "id-pool.h" #include "ipf.h" +#include "mov-avg.h" #include "netdev.h" #include "netdev-offload.h" #include "netdev-provider.h" @@ -428,6 +429,7 @@ struct dp_offload_thread_item { struct match match; struct nlattr *actions; size_t actions_len; + long long int timestamp; struct ovs_list node; }; @@ -436,13 +438,17 @@ struct dp_offload_thread { struct ovs_mutex mutex; struct ovs_list list; uint64_t enqueued_item; + struct mov_avg_ema ema; pthread_cond_t cond; }; +#define DP_NETDEV_OFFLOAD_EMA_N (10) + static struct dp_offload_thread dp_offload_thread = { .mutex = OVS_MUTEX_INITIALIZER, .list = OVS_LIST_INITIALIZER(&dp_offload_thread.list), .enqueued_item = 0, + .ema = MOV_AVG_EMA_INITIALIZER(DP_NETDEV_OFFLOAD_EMA_N), }; static struct ovsthread_once offload_thread_once @@ -2734,6 +2740,7 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED) { struct dp_offload_thread_item *offload; struct ovs_list *list; + long long int latency_us; const char *op; int ret; @@ -2767,6 +2774,9 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED) OVS_NOT_REACHED(); } + latency_us = time_usec() - offload->timestamp; + mov_avg_ema_update(&dp_offload_thread.ema, latency_us); + VLOG_DBG("%s to %s netdev flow "UUID_FMT, ret == 0 ?
"succeed" : "failed", op, UUID_ARGS((struct uuid *) &offload->flow->mega_ufid)); @@ -2791,6 +2801,7 @@ queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd, offload = dp_netdev_alloc_flow_offload(pmd, flow, DP_NETDEV_FLOW_OFFLOAD_OP_DEL); + offload->timestamp = pmd->ctx.now; dp_netdev_append_flow_offload(offload); } @@ -2823,6 +2834,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd, memcpy(offload->actions, actions, actions_len); offload->actions_len = actions_len; + offload->timestamp = pmd->ctx.now; dp_netdev_append_flow_offload(offload); } @@ -4206,10 +4218,12 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, enum { DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED, DP_NETDEV_HW_OFFLOADS_STATS_INSERTED, + DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN, }; const char *names[] = { - [DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED] = "Enqueued offloads", - [DP_NETDEV_HW_OFFLOADS_STATS_INSERTED] = "Inserted offloads", + [DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED] = " Enqueued offloads", + [DP_NETDEV_HW_OFFLOADS_STATS_INSERTED] = " Inserted offloads", + [DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN] = " Average latency (us)", }; struct dp_netdev *dp = get_dp_netdev(dpif); struct dp_netdev_port *port; @@ -4239,6 +4253,8 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value = dp_offload_thread.enqueued_item; stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED].value = nb_offloads; + stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN].value = + mov_avg_ema(&dp_offload_thread.ema); for (i = 0; i < ARRAY_SIZE(names); i++) { snprintf(stats->counters[i].name, sizeof(stats->counters[i].name), From patchwork Sat Dec 5 14:22:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gaetan Rivet X-Patchwork-Id: 1411453 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass 
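For reference, the exponential moving average behind the latency metric in patch 10 above can be sketched as follows. This is a hypothetical standalone version: the real implementation lives in OVS's lib/mov-avg.h (MOV_AVG_EMA_INITIALIZER, mov_avg_ema_update, mov_avg_ema), whose exact seeding and smoothing factor may differ.

```c
#include <assert.h>

/* Hypothetical EMA with smoothing factor alpha = 2 / (N + 1), as is
 * conventional for an N-sample EMA.  The first sample seeds the mean. */
struct ema {
    double mean;
    double alpha;
    int seeded;
};

static void
ema_init(struct ema *e, unsigned int n)
{
    e->mean = 0.0;
    e->alpha = 2.0 / (double) (n + 1);
    e->seeded = 0;
}

static void
ema_update(struct ema *e, double sample)
{
    if (!e->seeded) {
        e->mean = sample;
        e->seeded = 1;
    } else {
        /* Move the mean toward the new sample by a fixed fraction,
         * so recent latencies dominate older ones. */
        e->mean += e->alpha * (sample - e->mean);
    }
}
```

With N = 10 (the patch's DP_NETDEV_OFFLOAD_EMA_N), a burst of slow offloads stops dominating the reported average after roughly ten further updates, which is the "show latest changes faster" property the commit message refers to.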
From: Gaetan Rivet <grive@u256.net>
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:06 +0100
Message-Id: <5628b58b78ec4b38f4d72306d26cfdc0cc6612b8.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 11/26] lib/atomic: Expose atomic exchange operation

The atomic exchange operation is a useful primitive that should be available as well. Most compilers already expose it, or offer a way to use it, but a single symbol needs to be defined.
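The single compiler-level symbol in question is, for GCC and Clang, the __atomic_exchange_n builtin that the diff below maps onto. A minimal sketch of its behavior (the wrapper name here is hypothetical, not part of the patch):

```c
#include <assert.h>

/* __atomic_exchange_n atomically stores the new value into *p and
 * returns the value that was there before; no other thread can observe
 * an intermediate state.  This is the builtin behind the new
 * atomic_exchange_explicit() macro in ovs-atomic-gcc4.7+.h. */
static int
exchange_int(int *p, int desired)
{
    return __atomic_exchange_n(p, desired, __ATOMIC_SEQ_CST);
}
```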
Signed-off-by: Gaetan Rivet --- lib/ovs-atomic-c++.h | 3 +++ lib/ovs-atomic-clang.h | 5 +++++ lib/ovs-atomic-gcc4+.h | 5 +++++ lib/ovs-atomic-gcc4.7+.h | 5 +++++ lib/ovs-atomic-i586.h | 5 +++++ lib/ovs-atomic-msvc.h | 22 ++++++++++++++++++++++ lib/ovs-atomic-x86_64.h | 5 +++++ lib/ovs-atomic.h | 8 +++++++- 8 files changed, 57 insertions(+), 1 deletion(-) diff --git a/lib/ovs-atomic-c++.h b/lib/ovs-atomic-c++.h index d47b8dd39..8605fa9d3 100644 --- a/lib/ovs-atomic-c++.h +++ b/lib/ovs-atomic-c++.h @@ -29,6 +29,9 @@ using std::atomic_compare_exchange_strong_explicit; using std::atomic_compare_exchange_weak; using std::atomic_compare_exchange_weak_explicit; +using std::atomic_exchange; +using std::atomic_exchange_explicit; + #define atomic_read(SRC, DST) \ atomic_read_explicit(SRC, DST, memory_order_seq_cst) #define atomic_read_explicit(SRC, DST, ORDER) \ diff --git a/lib/ovs-atomic-clang.h b/lib/ovs-atomic-clang.h index 34cc2faa7..cdf02a512 100644 --- a/lib/ovs-atomic-clang.h +++ b/lib/ovs-atomic-clang.h @@ -67,6 +67,11 @@ typedef enum { #define atomic_compare_exchange_weak_explicit(DST, EXP, SRC, ORD1, ORD2) \ __c11_atomic_compare_exchange_weak(DST, EXP, SRC, ORD1, ORD2) +#define atomic_exchange(RMW, ARG) \ + atomic_exchange_explicit(RMW, ARG, memory_order_seq_cst) +#define atomic_exchange_explicit(RMW, ARG, ORDER) \ + __c11_atomic_exchange(RMW, ARG, ORDER) + #define atomic_add(RMW, ARG, ORIG) \ atomic_add_explicit(RMW, ARG, ORIG, memory_order_seq_cst) #define atomic_sub(RMW, ARG, ORIG) \ diff --git a/lib/ovs-atomic-gcc4+.h b/lib/ovs-atomic-gcc4+.h index 25bcf20a0..f9accde1a 100644 --- a/lib/ovs-atomic-gcc4+.h +++ b/lib/ovs-atomic-gcc4+.h @@ -128,6 +128,11 @@ atomic_signal_fence(memory_order order) #define atomic_compare_exchange_weak_explicit \ atomic_compare_exchange_strong_explicit +#define atomic_exchange_explicit(DST, SRC, ORDER) \ + __sync_lock_test_and_set(DST, SRC) +#define atomic_exchange(DST, SRC) \ + atomic_exchange_explicit(DST, SRC, 
memory_order_seq_cst) + #define atomic_op__(RMW, OP, ARG, ORIG) \ ({ \ typeof(RMW) rmw__ = (RMW); \ diff --git a/lib/ovs-atomic-gcc4.7+.h b/lib/ovs-atomic-gcc4.7+.h index 4c197ebe0..846e05775 100644 --- a/lib/ovs-atomic-gcc4.7+.h +++ b/lib/ovs-atomic-gcc4.7+.h @@ -61,6 +61,11 @@ typedef enum { #define atomic_compare_exchange_weak_explicit(DST, EXP, SRC, ORD1, ORD2) \ __atomic_compare_exchange_n(DST, EXP, SRC, true, ORD1, ORD2) +#define atomic_exchange_explicit(DST, SRC, ORDER) \ + __atomic_exchange_n(DST, SRC, ORDER) +#define atomic_exchange(DST, SRC) \ + atomic_exchange_explicit(DST, SRC, memory_order_seq_cst) + #define atomic_add(RMW, OPERAND, ORIG) \ atomic_add_explicit(RMW, OPERAND, ORIG, memory_order_seq_cst) #define atomic_sub(RMW, OPERAND, ORIG) \ diff --git a/lib/ovs-atomic-i586.h b/lib/ovs-atomic-i586.h index 9a385ce84..35a0959ff 100644 --- a/lib/ovs-atomic-i586.h +++ b/lib/ovs-atomic-i586.h @@ -400,6 +400,11 @@ atomic_signal_fence(memory_order order) #define atomic_compare_exchange_weak_explicit \ atomic_compare_exchange_strong_explicit +#define atomic_exchange_explicit(RMW, ARG, ORDER) \ + atomic_exchange__(RMW, ARG, ORDER) +#define atomic_exchange(RMW, ARG) \ + atomic_exchange_explicit(RMW, ARG, memory_order_seq_cst) + #define atomic_add__(RMW, ARG, CLOB) \ asm volatile("lock; xadd %0,%1 ; " \ "# atomic_add__ " \ diff --git a/lib/ovs-atomic-msvc.h b/lib/ovs-atomic-msvc.h index 9def887d3..19cc57888 100644 --- a/lib/ovs-atomic-msvc.h +++ b/lib/ovs-atomic-msvc.h @@ -345,6 +345,28 @@ atomic_signal_fence(memory_order order) #define atomic_compare_exchange_weak_explicit \ atomic_compare_exchange_strong_explicit +/* While intrinsics offering different memory ordering + * are available in MSVC C compiler, they are not defined + * in the C++ compiler. Ignore for compatibility. + * + * Use nested ternary operators as the GNU extension ({}) + * is not available. + */ + +#define atomic_exchange_explicit(DST, SRC, ORDER) \ + ((sizeof *(DST) == 1) ? 
\ + _InterlockedExchange8((char volatile *)DST, SRC) \ + : (sizeof *(DST) == 2) ? \ + _InterlockedExchange16((short volatile *)DST, SRC) \ + : (sizeof *(DST) == 4) ? \ + _InterlockedExchange((long int volatile *)DST, SRC) \ + : (sizeof *(DST) == 8) ? \ + _InterlockedExchange64((__int64 volatile *)DST, SRC) \ + : (ovs_abort(), 0)) + +#define atomic_exchange(DST, SRC) \ + atomic_exchange_explicit(DST, SRC, memory_order_seq_cst) + /* MSVCs c++ compiler implements c11 atomics and looking through its * implementation (in xatomic.h), orders are ignored for x86 platform. * Do the same here. */ diff --git a/lib/ovs-atomic-x86_64.h b/lib/ovs-atomic-x86_64.h index 1e7d42707..3bdaf2f08 100644 --- a/lib/ovs-atomic-x86_64.h +++ b/lib/ovs-atomic-x86_64.h @@ -274,6 +274,11 @@ atomic_signal_fence(memory_order order) #define atomic_compare_exchange_weak_explicit \ atomic_compare_exchange_strong_explicit +#define atomic_exchange_explicit(RMW, ARG, ORDER) \ + atomic_exchange__(RMW, ARG, ORDER) +#define atomic_exchange(RMW, ARG) \ + atomic_exchange_explicit(RMW, ARG, memory_order_seq_cst) + #define atomic_add__(RMW, ARG, CLOB) \ asm volatile("lock; xadd %0,%1 ; " \ "# atomic_add__ " \ diff --git a/lib/ovs-atomic.h b/lib/ovs-atomic.h index 11fa19268..8fdce0cf8 100644 --- a/lib/ovs-atomic.h +++ b/lib/ovs-atomic.h @@ -210,7 +210,7 @@ * In this section, A is an atomic type and C is the corresponding non-atomic * type. * - * The "store" and "compare_exchange" primitives match C11: + * The "store", "exchange", and "compare_exchange" primitives match C11: * * void atomic_store(A *object, C value); * void atomic_store_explicit(A *object, C value, memory_order); @@ -244,6 +244,12 @@ * efficiently, so it should be used if the application will need to * loop anyway. * + * C atomic_exchange(A *object, C desired); + * C atomic_exchange_explicit(A *object, C desired, memory_order); + * + * Atomically stores 'desired' into '*object', returning the value + * previously held. 
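A common use of the primitive documented above, sketched with standard C11 <stdatomic.h> (the OVS wrappers have the same semantics): letting exactly one of several racing callers claim a resource.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Returns true for the first caller only: the exchange atomically reads
 * the old value and writes 'true' in one step, so two racing callers
 * cannot both observe 'false'. */
static bool
try_claim(atomic_bool *claimed)
{
    return !atomic_exchange(claimed, true);
}
```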
+ *
 * The following primitives differ from the C11 ones (and have different names)
 * because there does not appear to be a way to implement the standard
 * primitives in standard C:

From patchwork Sat Dec 5 14:22:07 2020
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1411458
From: Gaetan Rivet <grive@u256.net>
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:07 +0100
Message-Id: <4030699f2734d9967ec2799817fdb190113dca28.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 12/26] mpsc-queue: Module for lock-free message passing

Add a lockless multi-producer/single-consumer (MPSC), linked-list based, intrusive, unbounded queue that does not require deferred memory management. The queue is an implementation of the structure described by Dmitri Vyukov [1]. It adds a slightly more explicit API explaining the proper use of the queue.

Alternatives such as a Treiber stack [2] or a Michael-Scott queue [3] were considered, but this one is faster, simpler, and scalable.

[1]: http://www.1024cores.net/home/lock-free-algorithms/queues/intrusive-mpsc-node-based-queue
[2]: R. K. Treiber.
Systems programming: Coping with parallelism. Technical Report RJ 5118, IBM Almaden Research Center, April 1986.
[3]: M. M. Michael, Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms. https://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html

The queue is designed for this specific MPSC setup. A benchmark accompanies the unit tests to measure the difference in this configuration: a single reader thread polls the queue while N writers enqueue elements as fast as possible. The mpsc-queue is compared against the regular ovs-list as well as the guarded list. The latter usually offers a slight improvement by batching the element removal; the mpsc-queue is nonetheless faster. The rightmost column is the average of the producer threads' times:

$ ./tests/ovstest test-mpsc-queue benchmark 3000000 1
Benchmarking n=3000000 on 1 + 1 threads.
 type\thread:  Reader      1    Avg
  mpsc-queue:     161    161    161 ms
        list:     803    803    803 ms
guarded list:     665    665    665 ms

$ ./tests/ovstest test-mpsc-queue benchmark 3000000 2
Benchmarking n=3000000 on 1 + 2 threads.
 type\thread:  Reader      1      2    Avg
  mpsc-queue:     102    101     97     99 ms
        list:     246    212    246    229 ms
guarded list:     264    263    214    238 ms

$ ./tests/ovstest test-mpsc-queue benchmark 3000000 3
Benchmarking n=3000000 on 1 + 3 threads.
 type\thread:  Reader      1      2      3    Avg
  mpsc-queue:      92     91     92     91     91 ms
        list:     520    517    515    520    517 ms
guarded list:     405    395    401    404    400 ms

$ ./tests/ovstest test-mpsc-queue benchmark 3000000 4
Benchmarking n=3000000 on 1 + 4 threads.
 type\thread:  Reader      1      2      3      4    Avg
  mpsc-queue:      77     73     73     77     75     74 ms
        list:     371    359    361    287    370    344 ms
guarded list:     389    388    359    363    357    366 ms

Signed-off-by: Gaetan Rivet <grive@u256.net>
---
 lib/automake.mk         |   2 +
 lib/mpsc-queue.c        | 190 +++++++++++++
 lib/mpsc-queue.h        | 149 +++++++++++
 tests/automake.mk       |   1 +
 tests/library.at        |   5 +
 tests/test-mpsc-queue.c | 580 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 927 insertions(+)
 create mode 100644 lib/mpsc-queue.c
 create mode 100644 lib/mpsc-queue.h
 create mode 100644 tests/test-mpsc-queue.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 52c99b288..3012d4700 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -167,6 +167,8 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/memory.h \
 	lib/meta-flow.c \
 	lib/mov-avg.h \
+	lib/mpsc-queue.c \
+	lib/mpsc-queue.h \
 	lib/multipath.c \
 	lib/multipath.h \
 	lib/namemap.c \
diff --git a/lib/mpsc-queue.c b/lib/mpsc-queue.c
new file mode 100644
index 000000000..9280d81f6
--- /dev/null
+++ b/lib/mpsc-queue.c
@@ -0,0 +1,190 @@
+/*
+ * Copyright (c) 2020 NVIDIA Corporation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include "ovs-atomic.h"
+
+#include "mpsc-queue.h"
+
+/* Multi-producer, single-consumer queue
+ * =====================================
+ *
+ * This is an implementation of the MPSC queue described by Dmitri Vyukov [1].
+ *
+ * One atomic exchange operation is done per insertion.
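The scheme described in this comment can be modeled in a self-contained way with C11 atomics. The sketch below is illustrative only, not the OVS module: the names are hypothetical, and the reader lock, OVS atomics wrappers, and clang thread-safety annotations are omitted.

```c
#include <stdatomic.h>
#include <stddef.h>

struct mq_node { _Atomic(struct mq_node *) next; };

struct mq {
    _Atomic(struct mq_node *) head;  /* Producers swap this (XCHG). */
    struct mq_node *tail;            /* Consumer-private. */
    struct mq_node stub;
};

enum mq_result { MQ_EMPTY, MQ_ITEM, MQ_RETRY };

static void
mq_init(struct mq *q)
{
    atomic_store(&q->head, &q->stub);
    q->tail = &q->stub;
    atomic_store(&q->stub.next, NULL);
}

/* One atomic exchange per insertion; the chain is momentarily broken
 * between the exchange and the store to prev->next. */
static void
mq_insert(struct mq *q, struct mq_node *n)
{
    struct mq_node *prev;

    atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
    prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
    atomic_store_explicit(&prev->next, n, memory_order_release);
}

static enum mq_result
mq_poll(struct mq *q, struct mq_node **out)
{
    struct mq_node *tail = q->tail;
    struct mq_node *next = atomic_load_explicit(&tail->next,
                                                memory_order_acquire);

    if (tail == &q->stub) {              /* Skip over the stub. */
        if (!next) {
            return MQ_EMPTY;
        }
        q->tail = tail = next;
        next = atomic_load_explicit(&tail->next, memory_order_acquire);
    }
    if (next) {                          /* Fast path: no atomics on head. */
        q->tail = next;
        *out = tail;
        return MQ_ITEM;
    }
    if (tail != atomic_load_explicit(&q->head, memory_order_acquire)) {
        return MQ_RETRY;                 /* A producer has not linked yet. */
    }
    mq_insert(q, &q->stub);              /* Close the chain with the stub. */
    next = atomic_load_explicit(&tail->next, memory_order_acquire);
    if (next) {
        q->tail = next;
        *out = tail;
        return MQ_ITEM;
    }
    return MQ_EMPTY;
}
```

Note that draining the last element re-inserts the stub, which is why emptying the queue still costs exactly one XCHG, as the comment above states.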
Removal in most cases + * will not require atomic operation and will use one atomic exchange to close + * the queue chain. + * + * Insertion + * ========= + * + * The queue is implemented using a linked-list. Insertion is done at the + * back of the queue, by swapping the current end with the new node atomically, + * then pointing the previous end toward the new node. To follow Vyukov + * nomenclature, the end-node of the chain is called head. A producer will + * only manipulate the head. + * + * The head swap is atomic, however the link from the previous head to the new + * one is done in a separate operation. This means that the chain is + * momentarily broken, when the previous head still points to NULL and the + * current head has been inserted. + * + * Considering a series of insertions, the queue state will remain consistent + * and the insertions order is compatible with their precedence, thus the + * queue is serializable. However, because an insertion consists in two + * separate memory transactions, it is not linearizable. + * + * Removal + * ======= + * + * The consumer must deal with the queue inconsistency. It will manipulate + * the tail of the queue and move it along the latest consumed elements. + * When an end of the chain of elements is found (the next pointer is NULL), + * the tail is compared with the head. + * + * If both points to different addresses, then the queue is in an inconsistent + * state: the tail cannot move forward as the next is NULL, but the head is not + * the last element in the chain: this can only happen if the chain is broken. + * + * In this case, the consumer must wait for the producer to finish writing the + * next pointer of its current tail: 'MPSC_QUEUE_RETRY' is returned. + * + * Removal is thus in most cases (when there are elements in the queue) + * accomplished without using atomics, until the last element of the queue. + * There, the head is atomically loaded. 
If the queue is in a consistent state, + * the head is moved back to the queue stub by inserting the stub in the queue: + * ending the queue is the same as an insertion, which is one atomic XCHG. + * + * Limitations + * =========== + * + * The chain will remain broken as long as a producer is not finished writing + * its next pointer. If a producer is cancelled for example, the queue could + * remain broken for any future readings. This queue should either be used + * with cooperative threads or outside any cancellable sections. + * + * Performances + * ============ + * + * In benchmarks this structure was better than alternatives such as: + * + * * A reversed Treiber stack [2], using 1 CAS per operations + * and requiring reversal of the node list on removal. + * + * * Michael-Scott lock-free queue [3], using 2 CAS per operations. + * + * While it is not linearizable, this queue is well-suited for message passing. + * If a proper hardware XCHG operation is used, it scales better than + * CAS-based implementations. + * + * References + * ========== + * + * [1]: http://www.1024cores.net/home/lock-free-algorithms/queues/intrusive-mpsc-node-based-queue + * + * [2]: R. K. Treiber. Systems programming: Coping with parallelism. + * Technical Report RJ 5118, IBM Almaden Research Center, April 1986. + * + * [3]: M. M. 
Michael, Simple, Fast, and Practical Non-Blocking and + * Blocking Concurrent Queue Algorithms + * [3]: https://www.cs.rochester.edu/research/synchronization/pseudocode/queues.html + * + */ + +void +mpsc_queue_init(struct mpsc_queue *queue) +{ + atomic_store_relaxed(&queue->head, &queue->stub); + atomic_store_relaxed(&queue->tail, &queue->stub); + atomic_store_relaxed(&queue->stub.next, NULL); + + ovs_mutex_init(&queue->read_lock); +} + +void +mpsc_queue_destroy(struct mpsc_queue *queue) +{ + ovs_mutex_destroy(&queue->read_lock); +} + +int +mpsc_queue_acquire(struct mpsc_queue *queue) + OVS_TRY_LOCK(1, queue->read_lock) +{ + return !ovs_mutex_trylock(&queue->read_lock); +} + +void +mpsc_queue_release(struct mpsc_queue *queue) + OVS_RELEASES(queue->read_lock) +{ + ovs_mutex_unlock(&queue->read_lock); +} + +enum mpsc_queue_poll_result +mpsc_queue_poll(struct mpsc_queue *queue, struct mpsc_queue_node **node) + OVS_REQUIRES(queue->read_lock) +{ + struct mpsc_queue_node *tail; + struct mpsc_queue_node *next; + struct mpsc_queue_node *head; + + atomic_read_relaxed(&queue->tail, &tail); + atomic_read_explicit(&tail->next, &next, memory_order_acquire); + + if (tail == &queue->stub) { + if (next == NULL) { + return MPSC_QUEUE_EMPTY; + } + + atomic_store_relaxed(&queue->tail, next); + tail = next; + atomic_read_explicit(&tail->next, &next, memory_order_acquire); + } + + if (next != NULL) { + atomic_store_relaxed(&queue->tail, next); + *node = tail; + return MPSC_QUEUE_ITEM; + } + + atomic_read_explicit(&queue->head, &head, memory_order_acquire); + if (tail != head) { + return MPSC_QUEUE_RETRY; + } + + mpsc_queue_insert(queue, &queue->stub); + + atomic_read_explicit(&tail->next, &next, memory_order_acquire); + if (next != NULL) { + atomic_store_relaxed(&queue->tail, next); + *node = tail; + return MPSC_QUEUE_ITEM; + } + + return MPSC_QUEUE_EMPTY; +} + +void +mpsc_queue_insert(struct mpsc_queue *queue, struct mpsc_queue_node *node) +{ + struct mpsc_queue_node *prev; + + 
atomic_store_relaxed(&node->next, NULL); + prev = atomic_exchange_explicit(&queue->head, node, memory_order_acq_rel); + atomic_store_explicit(&prev->next, node, memory_order_release); +} diff --git a/lib/mpsc-queue.h b/lib/mpsc-queue.h new file mode 100644 index 000000000..8ee59409f --- /dev/null +++ b/lib/mpsc-queue.h @@ -0,0 +1,149 @@ +/* + * Copyright (c) 2020 NVIDIA Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef MPSC_QUEUE_H +#define MPSC_QUEUE_H 1 + +#include +#include +#include + +#include +#include + +#include "ovs-atomic.h" + +/* Multi-producer, single-consumer queue + * ===================================== + * + * This data structure is a lockless queue implementation with + * the following properties: + * + * * Multi-producer: multiple threads can write concurrently. + * Insertion in the queue is thread-safe, no inter-thread + * synchronization is necessary. + * + * * Single-consumer: only a single thread can safely remove + * nodes from the queue. The queue must be 'acquired' using + * 'mpsc_queue_acquire()' before removing nodes. + * + * * Unbounded: the queue is backed by a linked-list and is not + * limited in number of elements. + * + * * Intrusive: queue elements are allocated as part of larger + * objects. Objects are retrieved by offset manipulation. + * + * * per-producer FIFO: Elements in the queue are kept in the + * order their producer inserted them. 
The consumer retrieves + * them in in the same insertion order. When multiple + * producers insert at the same time, either will proceed. + * + * This queue is well-suited for message passing between threads, + * where any number of thread can insert a message and a single + * thread is meant to receive and process them. + * + * Thread-safety + * ============= + * + * The consumer thread must acquire the queue using 'mpsc_queue_acquire()'. + * If no error is returned, the thread can call 'mpsc_queue_poll()'. + * When a thread is finished with reading the queue, it can release the + * reader lock using 'mpsc_queue_release()'. + * + * Producers can always insert elements in the queue, even if no consumer + * acquired the reader lock. No inter-producer synchronization (e.g. using a + * lock) is needed. + * + * The consumer thread is also allowed to insert elements while it holds the + * reader lock. + * + * Producer threads must never be cancelled while writing to the queue. + * This will block the consumer, that will then lose any subsequent writes + * to the queue. Producers should ideally be cooperatively managed or + * the queue insertion should be within non-cancellable sections. + * + * Queue state + * =========== + * + * When polling the queue, three states can be observed: 'empty', 'non-empty', + * and 'inconsistent'. Three polling results are defined, respectively: + * + * * MPSC_QUEUE_EMPTY: the queue is empty. + * * MPSC_QUEUE_ITEM: an item was available and has been removed. + * * MPSC_QUEUE_RETRY: the queue is inconsistent. + * + * If 'MPSC_QUEUE_RETRY' is returned, then a producer has not yet finished + * writing to the queue and the list of nodes is not coherent. The consumer + * can retry shortly to check if the producer has finished. + * + * This behavior is the reason the removal function is called + * 'mpsc_queue_poll()'. 
+ *
+ */
+
+struct mpsc_queue_node {
+    ATOMIC(struct mpsc_queue_node *) next;
+};
+
+struct mpsc_queue {
+    ATOMIC(struct mpsc_queue_node *) head;
+    ATOMIC(struct mpsc_queue_node *) tail;
+    struct mpsc_queue_node stub;
+    struct ovs_mutex read_lock;
+};
+
+#define MPSC_QUEUE_INITIALIZER(Q) { \
+    .head = ATOMIC_VAR_INIT(&(Q)->stub), \
+    .tail = ATOMIC_VAR_INIT(&(Q)->stub), \
+    .stub = { .next = ATOMIC_VAR_INIT(NULL) }, \
+    .read_lock = OVS_MUTEX_INITIALIZER, \
+}
+
+/* Consumer API. */
+
+/* Initialize the queue. Not necessary if 'MPSC_QUEUE_INITIALIZER' was used. */
+void mpsc_queue_init(struct mpsc_queue *queue);
+/* The reader lock must be released prior to destroying the queue. */
+void mpsc_queue_destroy(struct mpsc_queue *queue);
+/* Acquire the reader lock if 1 is returned. */
+int mpsc_queue_acquire(struct mpsc_queue *queue);
+/* Release the reader lock. */
+void mpsc_queue_release(struct mpsc_queue *queue);
+
+enum mpsc_queue_poll_result {
+    /* Queue is empty. */
+    MPSC_QUEUE_EMPTY,
+    /* Polling the queue returned an item. */
+    MPSC_QUEUE_ITEM,
+    /* Data has been enqueued but one or more producer threads have not
+     * finished writing it. The queue is in an inconsistent state.
+     * Retrying shortly, if the producer threads are still active, will
+     * return the data.
+     */
+    MPSC_QUEUE_RETRY,
+};
+
+/* Set 'node' to a removed item from the queue if 'MPSC_QUEUE_ITEM' is
+ * returned, otherwise 'node' is not set.
+ */
+enum mpsc_queue_poll_result
+mpsc_queue_poll(struct mpsc_queue *queue, struct mpsc_queue_node **node);
+
+/* Producer API.
*/ + +void mpsc_queue_insert(struct mpsc_queue *queue, struct mpsc_queue_node *node); + +#endif /* MPSC_QUEUE_H */ diff --git a/tests/automake.mk b/tests/automake.mk index 677b99a6b..d7ae5df90 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -460,6 +460,7 @@ tests_ovstest_SOURCES = \ tests/test-list.c \ tests/test-lockfile.c \ tests/test-multipath.c \ + tests/test-mpsc-queue.c \ tests/test-netflow.c \ tests/test-odp.c \ tests/test-ofpbuf.c \ diff --git a/tests/library.at b/tests/library.at index ac4ea4abf..537f0aa4c 100644 --- a/tests/library.at +++ b/tests/library.at @@ -253,3 +253,8 @@ AT_SETUP([stopwatch module]) AT_CHECK([ovstest test-stopwatch], [0], [...... ], [ignore]) AT_CLEANUP + +AT_SETUP([mpsc-queue module]) +AT_CHECK([ovstest test-mpsc-queue check], [0], [.. +]) +AT_CLEANUP diff --git a/tests/test-mpsc-queue.c b/tests/test-mpsc-queue.c new file mode 100644 index 000000000..791715848 --- /dev/null +++ b/tests/test-mpsc-queue.c @@ -0,0 +1,580 @@ +/* + * Copyright (c) 2020 NVIDIA Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#undef NDEBUG +#include +#include +#include + +#include + +#include "command-line.h" +#include "guarded-list.h" +#include "mpsc-queue.h" +#include "openvswitch/list.h" +#include "openvswitch/util.h" +#include "ovs-thread.h" +#include "ovstest.h" +#include "timeval.h" +#include "util.h" + +struct element { + union { + struct mpsc_queue_node mpscq; + struct ovs_list list; + } node; + uint64_t mark; +}; + +static void +test_mpsc_queue_mark_element(struct mpsc_queue_node *node, + uint64_t mark, + unsigned int *counter) +{ + struct element *elem; + + elem = CONTAINER_OF(node, struct element, node.mpscq); + elem->mark = mark; + *counter += 1; +} + +static void +test_mpsc_queue_insert(void) +{ + struct element elements[100]; + struct mpsc_queue_node *node; + struct mpsc_queue queue; + unsigned int counter; + size_t i; + + memset(elements, 0, sizeof(elements)); + mpsc_queue_init(&queue); + ignore(mpsc_queue_acquire(&queue)); + + for (i = 0; i < ARRAY_SIZE(elements); i++) { + mpsc_queue_insert(&queue, &elements[i].node.mpscq); + } + + counter = 0; + while (mpsc_queue_poll(&queue, &node) == MPSC_QUEUE_ITEM) { + test_mpsc_queue_mark_element(node, 1, &counter); + } + + mpsc_queue_release(&queue); + mpsc_queue_destroy(&queue); + + ovs_assert(counter == ARRAY_SIZE(elements)); + for (i = 0; i < ARRAY_SIZE(elements); i++) { + ovs_assert(elements[i].mark == 1); + } + + printf("."); +} + +static void +test_mpsc_queue_flush_is_fifo(void) +{ + struct element elements[100]; + struct mpsc_queue_node *node; + struct mpsc_queue queue; + unsigned int counter; + size_t i; + + memset(elements, 0, sizeof(elements)); + + mpsc_queue_init(&queue); + ignore(mpsc_queue_acquire(&queue)); + + for (i = 0; i < ARRAY_SIZE(elements); i++) { + mpsc_queue_insert(&queue, &elements[i].node.mpscq); + } + + /* Elements are in the same order in the list as they + * were declared / initialized. 
+ */ + counter = 0; + while (mpsc_queue_poll(&queue, &node) == MPSC_QUEUE_ITEM) { + test_mpsc_queue_mark_element(node, counter, &counter); + } + + /* The list is valid once extracted from the queue, + * the queue can be destroyed here. + */ + mpsc_queue_release(&queue); + mpsc_queue_destroy(&queue); + + for (i = 0; i < ARRAY_SIZE(elements) - 1; i++) { + struct element *e1, *e2; + + e1 = &elements[i]; + e2 = &elements[i + 1]; + + ovs_assert(e1->mark < e2->mark); + } + + printf("."); +} + +static void +run_tests(struct ovs_cmdl_context *ctx OVS_UNUSED) +{ + /* Verify basic insertion worked. */ + test_mpsc_queue_insert(); + /* Verify flush() happens in FIFO if configured. */ + test_mpsc_queue_flush_is_fifo(); + printf("\n"); +} + +static struct element *elements; +static uint64_t *thread_working_ms; /* Measured work time. */ + +static unsigned int n_threads; +static unsigned int n_elems; + +static struct ovs_barrier barrier; +static volatile bool working; + +static int +elapsed(const struct timeval *start) +{ + struct timeval end; + + xgettimeofday(&end); + return timeval_to_msec(&end) - timeval_to_msec(start); +} + +struct mpscq_aux { + struct mpsc_queue *queue; + atomic_uint thread_id; +}; + +static void * +mpsc_queue_insert_thread(void *aux_) +{ + unsigned int n_elems_per_thread; + struct element *th_elements; + struct mpscq_aux *aux = aux_; + struct timeval start; + unsigned int id; + size_t i; + + atomic_add(&aux->thread_id, 1u, &id); + n_elems_per_thread = n_elems / n_threads; + th_elements = &elements[id * n_elems_per_thread]; + + ovs_barrier_block(&barrier); + xgettimeofday(&start); + + for (i = 0; i < n_elems_per_thread; i++) { + mpsc_queue_insert(aux->queue, &th_elements[i].node.mpscq); + } + + thread_working_ms[id] = elapsed(&start); + ovs_barrier_block(&barrier); + + working = false; + + return NULL; +} + +static void +benchmark_mpsc_queue(void) +{ + struct mpsc_queue_node *node; + struct mpsc_queue queue; + struct timeval start; + unsigned int counter; + 
bool work_complete;
+    pthread_t *threads;
+    struct mpscq_aux aux;
+    uint64_t epoch;
+    uint64_t avg;
+    size_t i;
+
+    memset(elements, 0, n_elems * sizeof *elements);
+    memset(thread_working_ms, 0, n_threads * sizeof *thread_working_ms);
+
+    mpsc_queue_init(&queue);
+
+    aux.queue = &queue;
+    atomic_store(&aux.thread_id, 0);
+
+    for (i = n_elems - (n_elems % n_threads); i < n_elems; i++) {
+        mpsc_queue_insert(&queue, &elements[i].node.mpscq);
+    }
+
+    working = true;
+
+    threads = xmalloc(n_threads * sizeof *threads);
+    ovs_barrier_init(&barrier, n_threads);
+
+    for (i = 0; i < n_threads; i++) {
+        threads[i] = ovs_thread_create("sc_queue_insert",
+                                       mpsc_queue_insert_thread, &aux);
+    }
+
+    ignore(mpsc_queue_acquire(&queue));
+    xgettimeofday(&start);
+
+    counter = 0;
+    epoch = 1;
+    do {
+        while (mpsc_queue_poll(&queue, &node) == MPSC_QUEUE_ITEM) {
+            test_mpsc_queue_mark_element(node, epoch, &counter);
+        }
+        if (epoch == UINT64_MAX) {
+            epoch = 0;
+        }
+        epoch++;
+    } while (working);
+
+    avg = 0;
+    for (i = 0; i < n_threads; i++) {
+        xpthread_join(threads[i], NULL);
+        avg += thread_working_ms[i];
+    }
+    avg /= n_threads;
+
+    /* Elements might have been inserted before threads were joined.
*/ + while (mpsc_queue_poll(&queue, &node) == MPSC_QUEUE_ITEM) { + test_mpsc_queue_mark_element(node, epoch, &counter); + } + + printf(" mpsc-queue: %6d", elapsed(&start)); + for (i = 0; i < n_threads; i++) { + printf(" %6" PRIu64, thread_working_ms[i]); + } + printf(" %6" PRIu64 " ms\n", avg); + + mpsc_queue_release(&queue); + mpsc_queue_destroy(&queue); + ovs_barrier_destroy(&barrier); + free(threads); + + work_complete = true; + for (i = 0; i < n_elems; i++) { + if (elements[i].mark == 0) { + printf("Element %" PRIuSIZE " was never consumed.\n", i); + work_complete = false; + } + } + ovs_assert(work_complete); + ovs_assert(counter == n_elems); +} + +struct list_aux { + struct ovs_list *list; + struct ovs_mutex *lock; + atomic_uint thread_id; +}; + +static void * +locked_list_insert_thread(void *aux_) +{ + unsigned int n_elems_per_thread; + struct element *th_elements; + struct list_aux *aux = aux_; + struct timeval start; + unsigned int id; + size_t i; + + atomic_add(&aux->thread_id, 1u, &id); + n_elems_per_thread = n_elems / n_threads; + th_elements = &elements[id * n_elems_per_thread]; + + ovs_barrier_block(&barrier); + xgettimeofday(&start); + + for (i = 0; i < n_elems_per_thread; i++) { + ovs_mutex_lock(aux->lock); + ovs_list_push_front(aux->list, &th_elements[i].node.list); + ovs_mutex_unlock(aux->lock); + } + + thread_working_ms[id] = elapsed(&start); + ovs_barrier_block(&barrier); + + working = false; + + return NULL; +} + +static void +benchmark_list(void) +{ + struct ovs_mutex lock; + struct ovs_list list; + struct element *elem; + struct timeval start; + unsigned int counter; + bool work_complete; + pthread_t *threads; + struct list_aux aux; + uint64_t epoch; + uint64_t avg; + size_t i; + + memset(elements, 0, n_elems * sizeof *elements); + memset(thread_working_ms, 0, n_threads * sizeof *thread_working_ms); + + ovs_mutex_init(&lock); + ovs_list_init(&list); + + aux.list = &list; + aux.lock = &lock; + atomic_store(&aux.thread_id, 0); + + 
ovs_mutex_lock(&lock); + for (i = n_elems - (n_elems % n_threads); i < n_elems; i++) { + ovs_list_push_front(&list, &elements[i].node.list); + } + ovs_mutex_unlock(&lock); + + working = true; + + threads = xmalloc(n_threads * sizeof *threads); + ovs_barrier_init(&barrier, n_threads); + + for (i = 0; i < n_threads; i++) { + threads[i] = ovs_thread_create("locked_list_insert", + locked_list_insert_thread, &aux); + } + + xgettimeofday(&start); + + counter = 0; + epoch = 1; + do { + ovs_mutex_lock(&lock); + LIST_FOR_EACH_POP (elem, node.list, &list) { + elem->mark = epoch; + counter++; + } + ovs_mutex_unlock(&lock); + if (epoch == UINT64_MAX) { + epoch = 0; + } + epoch++; + } while (working); + + avg = 0; + for (i = 0; i < n_threads; i++) { + xpthread_join(threads[i], NULL); + avg += thread_working_ms[i]; + } + avg /= n_threads; + + /* Elements might have been inserted before threads were joined. */ + ovs_mutex_lock(&lock); + LIST_FOR_EACH_POP (elem, node.list, &list) { + elem->mark = epoch; + counter++; + } + ovs_mutex_unlock(&lock); + + printf(" list: %6d", elapsed(&start)); + for (i = 0; i < n_threads; i++) { + printf(" %6" PRIu64, thread_working_ms[i]); + } + printf(" %6" PRIu64 " ms\n", avg); + ovs_barrier_destroy(&barrier); + free(threads); + + work_complete = true; + for (i = 0; i < n_elems; i++) { + if (elements[i].mark == 0) { + printf("Element %" PRIuSIZE " was never consumed.\n", i); + work_complete = false; + } + } + ovs_assert(work_complete); + ovs_assert(counter == n_elems); +} + +struct guarded_list_aux { + struct guarded_list *glist; + atomic_uint thread_id; +}; + +static void * +guarded_list_insert_thread(void *aux_) +{ + unsigned int n_elems_per_thread; + struct element *th_elements; + struct guarded_list_aux *aux = aux_; + struct timeval start; + unsigned int id; + size_t i; + + atomic_add(&aux->thread_id, 1u, &id); + n_elems_per_thread = n_elems / n_threads; + th_elements = &elements[id * n_elems_per_thread]; + + ovs_barrier_block(&barrier); + 
xgettimeofday(&start); + + for (i = 0; i < n_elems_per_thread; i++) { + guarded_list_push_back(aux->glist, &th_elements[i].node.list, n_elems); + } + + thread_working_ms[id] = elapsed(&start); + ovs_barrier_block(&barrier); + + working = false; + + return NULL; +} + +static void +benchmark_guarded_list(void) +{ + struct guarded_list_aux aux; + struct ovs_list extracted; + struct guarded_list glist; + struct element *elem; + struct timeval start; + unsigned int counter; + bool work_complete; + pthread_t *threads; + uint64_t epoch; + uint64_t avg; + size_t i; + + memset(elements, 0, n_elems * sizeof *elements); + memset(thread_working_ms, 0, n_threads * sizeof *thread_working_ms); + + guarded_list_init(&glist); + ovs_list_init(&extracted); + + aux.glist = &glist; + atomic_store(&aux.thread_id, 0); + + for (i = n_elems - (n_elems % n_threads); i < n_elems; i++) { + guarded_list_push_back(&glist, &elements[i].node.list, n_elems); + } + + working = true; + + threads = xmalloc(n_threads * sizeof *threads); + ovs_barrier_init(&barrier, n_threads); + + for (i = 0; i < n_threads; i++) { + threads[i] = ovs_thread_create("guarded_list_insert", + guarded_list_insert_thread, &aux); + } + + xgettimeofday(&start); + + counter = 0; + epoch = 1; + do { + guarded_list_pop_all(&glist, &extracted); + LIST_FOR_EACH_POP (elem, node.list, &extracted) { + elem->mark = epoch; + counter++; + } + if (epoch == UINT64_MAX) { + epoch = 0; + } + epoch++; + } while (working); + + avg = 0; + for (i = 0; i < n_threads; i++) { + xpthread_join(threads[i], NULL); + avg += thread_working_ms[i]; + } + avg /= n_threads; + + /* Elements might have been inserted before threads were joined. 
*/ + guarded_list_pop_all(&glist, &extracted); + LIST_FOR_EACH_POP (elem, node.list, &extracted) { + elem->mark = epoch; + counter++; + } + + printf("guarded list: %6d", elapsed(&start)); + for (i = 0; i < n_threads; i++) { + printf(" %6" PRIu64, thread_working_ms[i]); + } + printf(" %6" PRIu64 " ms\n", avg); + ovs_barrier_destroy(&barrier); + free(threads); + guarded_list_destroy(&glist); + + work_complete = true; + for (i = 0; i < n_elems; i++) { + if (elements[i].mark == 0) { + printf("Element %" PRIuSIZE " was never consumed.\n", i); + work_complete = false; + } + } + ovs_assert(work_complete); + ovs_assert(counter == n_elems); +} + +static void +run_benchmarks(struct ovs_cmdl_context *ctx) +{ + long int l_threads; + long int l_elems; + size_t i; + + l_elems = strtol(ctx->argv[1], NULL, 10); + l_threads = strtol(ctx->argv[2], NULL, 10); + ovs_assert(l_elems > 0 && l_threads > 0); + + n_elems = l_elems; + n_threads = l_threads; + + elements = xcalloc(n_elems, sizeof *elements); + thread_working_ms = xcalloc(n_threads, sizeof *thread_working_ms); + + printf("Benchmarking n=%u on 1 + %u threads.\n", n_elems, n_threads); + + printf(" type\\thread: Reader "); + for (i = 0; i < n_threads; i++) { + printf(" %3" PRIuSIZE " ", i + 1); + } + printf(" Avg\n"); + + benchmark_mpsc_queue(); + benchmark_list(); + benchmark_guarded_list(); + + free(thread_working_ms); + free(elements); +} + +static const struct ovs_cmdl_command commands[] = { + {"check", NULL, 0, 0, run_tests, OVS_RO}, + {"benchmark", " ", 2, 2, run_benchmarks, OVS_RO}, + {NULL, NULL, 0, 0, NULL, OVS_RO}, +}; + +static void +test_mpsc_queue_main(int argc, char *argv[]) +{ + struct ovs_cmdl_context ctx = { + .argc = argc - optind, + .argv = argv + optind, + }; + + set_program_name(argv[0]); + ovs_cmdl_run_command(&ctx, commands); +} + +OVSTEST_REGISTER("test-mpsc-queue", test_mpsc_queue_main); From patchwork Sat Dec 5 14:22:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1411460
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:08 +0100
Message-Id: <05ee3850fb71e80c82ccfdc9acaaa7fbb4b15d4e.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 13/26] llring: Add lockless MPMC bounded queue structure

Add a lockless multi-producer/multi-consumer, array-based, non-intrusive,
bounded queue that will fail on overflow. Each operation (enqueue, dequeue)
uses a CAS(). As such, both producer and consumer sides guarantee lock-free
forward progress.

If the queue is full, enqueuing will fail. Conversely, if the queue is empty,
dequeuing will fail. The bounds of the queue are restricted to powers of two,
to allow simple overflow of the unsigned position markers.
Signed-off-by: Gaetan Rivet --- lib/automake.mk | 2 + lib/llring.c | 153 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/llring.h | 76 ++++++++++++++++++++++++ 3 files changed, 231 insertions(+) create mode 100644 lib/llring.c create mode 100644 lib/llring.h diff --git a/lib/automake.mk b/lib/automake.mk index 3012d4700..c67c01779 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -156,6 +156,8 @@ lib_libopenvswitch_la_SOURCES = \ lib/learn.h \ lib/learning-switch.c \ lib/learning-switch.h \ + lib/llring.c \ + lib/llring.h \ lib/lockfile.c \ lib/lockfile.h \ lib/mac-learning.c \ diff --git a/lib/llring.c b/lib/llring.c new file mode 100644 index 000000000..66fb22a1b --- /dev/null +++ b/lib/llring.c @@ -0,0 +1,153 @@ +/* + * Copyright (c) 2020 NVIDIA Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "ovs-atomic.h" + +#include "llring.h" + +/* A queue element. + * Calling 'llring_create' will allocate an array of such elements, + * that will hold the inserted data. + */ +struct llring_node { + atomic_uint32_t seq; + uint32_t data; +}; + +/* A ring description. + * The head and tail of the ring are padded to avoid false-sharing, + * which improves slightly multi-thread performance, at the cost + * of some memory. 
+ */ +struct llring { + PADDED_MEMBERS(CACHE_LINE_SIZE, atomic_uint32_t head;); + PADDED_MEMBERS(CACHE_LINE_SIZE, atomic_uint32_t tail;); + uint32_t mask; + struct llring_node nodes[0]; +}; + +struct llring * +llring_create(uint32_t size) +{ + struct llring *r; + uint32_t i; + + if (size < 2 || !IS_POW2(size)) { + return NULL; + } + + r = xmalloc(sizeof *r + size * sizeof r->nodes[0]); + + r->mask = size - 1; + for (i = 0; i < size; i++) { + atomic_store_relaxed(&r->nodes[i].seq, i); + } + atomic_store_relaxed(&r->head, 0); + atomic_store_relaxed(&r->tail, 0); + + return r; +} + +void +llring_destroy(struct llring *r) +{ + free(r); +} + +bool +llring_enqueue(struct llring *r, uint32_t data) +{ + struct llring_node *node; + uint32_t pos; + + atomic_read_relaxed(&r->head, &pos); + while (true) { + int64_t diff; + uint32_t seq; + + node = &r->nodes[pos & r->mask]; + atomic_read_explicit(&node->seq, &seq, memory_order_acquire); + diff = (int64_t)seq - (int64_t)pos; + + if (diff < 0) { + /* Current ring[head].seq is from previous ring generation, + * ring is full and enqueue fails. */ + return false; + } + + if (diff == 0) { + /* If head == ring[head].seq, then the slot is free, + * attempt to take it by moving the head, if no one moved it since. + */ + if (atomic_compare_exchange_weak_explicit(&r->head, &pos, pos + 1, + memory_order_relaxed, + memory_order_relaxed)) { + break; + } + } else { + /* Someone changed the head since last read, retry. 
*/ + atomic_read_relaxed(&r->head, &pos); + } + } + + node->data = data; + atomic_store_explicit(&node->seq, pos + 1, memory_order_release); + return true; +} + +bool +llring_dequeue(struct llring *r, uint32_t *data) +{ + struct llring_node *node; + uint32_t pos; + + atomic_read_relaxed(&r->tail, &pos); + while (true) { + int64_t diff; + uint32_t seq; + + node = &r->nodes[pos & r->mask]; + atomic_read_explicit(&node->seq, &seq, memory_order_acquire); + diff = (int64_t)seq - (int64_t)(pos + 1); + + if (diff < 0) { + /* Current ring[tail + 1].seq is from previous ring generation, + * ring is empty and dequeue fails. */ + return false; + } + + if (diff == 0) { + /* If tail + 1 == ring[tail + 1].seq, then the slot is allocated, + * attempt to free it by moving the tail, if no one moved it since. + */ + if (atomic_compare_exchange_weak_explicit(&r->tail, &pos, pos + 1, + memory_order_relaxed, + memory_order_relaxed)) { + break; + } + } else { + /* Someone changed the tail since last read, retry. */ + atomic_read_relaxed(&r->tail, &pos); + } + } + + *data = node->data; + /* Advance the slot to next gen by adding r->mask + 1 to its sequence. */ + atomic_store_explicit(&node->seq, pos + r->mask + 1, memory_order_release); + return true; +} diff --git a/lib/llring.h b/lib/llring.h new file mode 100644 index 000000000..f97baa343 --- /dev/null +++ b/lib/llring.h @@ -0,0 +1,76 @@ +/* + * Copyright (c) 2020 NVIDIA Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include
+#include
+
+#include "ovs-atomic.h"
+
+/* Bounded lockless queue
+ * ======================
+ *
+ * A lockless FIFO queue bounded to a known size.
+ * Each operation (insert, remove) uses one CAS().
+ *
+ * The structure is:
+ *
+ * Multi-producer: multiple threads can write to it
+ * concurrently.
+ *
+ * Multi-consumer: multiple threads can read from it
+ * concurrently.
+ *
+ * Bounded: the queue is backed by external memory.
+ * No new allocation is made on insertion; only the
+ * used elements in the queue are marked as such.
+ * The boundary of the queue is defined as the size given
+ * at init, which must be a power of two.
+ *
+ * Failing: when an operation (enqueue, dequeue) cannot
+ * be performed due to the queue being full/empty, the
+ * operation immediately fails, instead of waiting on
+ * a state change.
+ *
+ * Non-intrusive: queue elements are allocated prior to
+ * initialization. Data is shallow-copied to those
+ * allocated elements.
+ *
+ * Thread safety
+ * =============
+ *
+ * The queue is thread-safe for the MPMC case.
+ * No lock is taken by the queue. The queue guarantees
+ * lock-free forward progress for each of its operations.
+ */
+
+/* Create a circular lockless ring.
+ * The 'size' parameter must be a power of two of at least 2,
+ * otherwise creation will fail.
+ */
+struct llring;
+struct llring *llring_create(uint32_t size);
+
+/* Free a lockless ring. */
+void llring_destroy(struct llring *r);
+
+/* 'data' is copied to the latest free slot in the queue. */
+bool llring_enqueue(struct llring *r, uint32_t data);
+
+/* The value within the oldest slot taken in the queue is copied
+ * to the address pointed to by 'data'.
+ */
+bool llring_dequeue(struct llring *r, uint32_t *data);
From patchwork Sat Dec 5 14:22:09 2020
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1411454
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:09 +0100
Subject: [ovs-dev] [RFC PATCH 14/26] seq-pool: Module for faster ID generation

The current id-pool module is slow to allocate the next valid ID, and can be
optimized when restricting some properties of the pool.

Those restrictions are:

* No ability to add a random ID to the pool.

* A new ID is no longer the smallest possible ID. It is however guaranteed
  to be in the range of [base, next_id]. Multiple users of the pool are
  registered, each with a thread-local cache for better scalability, and
  next_id is one past the latest ID added to any user cache. The allocation
  range can be written as: [base, last_alloc + nb-user * cache-size + 1].

* A user should never free an ID that is not allocated. No checks are done,
  and doing so will cause the spurious ID to be allocated twice.
Refcounting or another memory management scheme should be used to ensure an
object and its ID are only freed once.

This pool is designed to scale reasonably well in a multi-thread setup. As it
is aimed at being a faster replacement for the current id-pool, a benchmark
has been implemented alongside the unit tests.

The benchmark is composed of 4 rounds: 'new', 'del', 'mix', and 'rnd'.
Respectively:

  + 'new': only allocate IDs.
  + 'del': only free IDs.
  + 'mix': allocate, sequential free, then allocate ID.
  + 'rnd': allocate, random free, allocate ID.

Randomized freeing is done by swapping the latest allocated ID with any ID
from the range of currently allocated IDs, which is reminiscent of the
Fisher-Yates shuffle. This evaluates freeing non-sequential IDs, which is the
more natural use-case. For this specific round, the id-pool performance is
such that a timeout of 10 seconds is added to the benchmark:

   $ ./tests/ovstest test-seq-pool benchmark 10000 1
   Benchmarking n=10000 on 1 thread.
       type\thread:       1    Avg
      seq-pool new:       1      1 ms
      seq-pool del:       0      0 ms
      seq-pool mix:       1      1 ms
      seq-pool rnd:       1      1 ms
       id-pool new:       0      0 ms
       id-pool del:       1      1 ms
       id-pool mix:       1      1 ms
       id-pool rnd:    1201   1201 ms

   $ ./tests/ovstest test-seq-pool benchmark 100000 1
   Benchmarking n=100000 on 1 thread.
       type\thread:       1    Avg
      seq-pool new:       2      2 ms
      seq-pool del:       5      5 ms
      seq-pool mix:       5      5 ms
      seq-pool rnd:       5      5 ms
       id-pool new:       8      8 ms
       id-pool del:       5      5 ms
       id-pool mix:      11     11 ms
       id-pool rnd:  10000+ ****** ms

   $ ./tests/ovstest test-seq-pool benchmark 1000000 1
   Benchmarking n=1000000 on 1 thread.
       type\thread:       1    Avg
      seq-pool new:      23     23 ms
      seq-pool del:      49     49 ms
      seq-pool mix:      53     53 ms
      seq-pool rnd:      53     53 ms
       id-pool new:     190    190 ms
       id-pool del:     173    173 ms
       id-pool mix:     273    273 ms
       id-pool rnd:  10042+ ****** ms

   $ ./tests/ovstest test-seq-pool benchmark 1000000 2
   Benchmarking n=1000000 on 2 threads.
       type\thread:       1      2    Avg
      seq-pool new:      40     39     39 ms
      seq-pool del:      33     33     33 ms
      seq-pool mix:      89     91     90 ms
      seq-pool rnd:     146    151    148 ms
       id-pool new:     485    485    485 ms
       id-pool del:     541    542    541 ms
       id-pool mix:     550    600    575 ms
       id-pool rnd:  10048+ 10003+ ****** ms

   $ ./tests/ovstest test-seq-pool benchmark 1000000 4
   Benchmarking n=1000000 on 4 threads.
       type\thread:       1      2      3      4    Avg
      seq-pool new:      40     39     40     40     39 ms
      seq-pool del:      24     28     28     30     27 ms
      seq-pool mix:      60     63     69     69     65 ms
      seq-pool rnd:     195    197    202    202    199 ms
       id-pool new:     478    471    482    485    479 ms
       id-pool del:     474    469    467    474    471 ms
       id-pool mix:     558    558    611    545    568 ms
       id-pool rnd:  10121+ 10076+ 10030+ 10167+ ****** ms

Signed-off-by: Gaetan Rivet
---
 lib/automake.mk       |   2 +
 lib/seq-pool.c        | 198 +++++++++++++++
 lib/seq-pool.h        |  66 +++++
 tests/automake.mk     |   1 +
 tests/library.at      |   5 +
 tests/test-seq-pool.c | 542 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 814 insertions(+)
 create mode 100644 lib/seq-pool.c
 create mode 100644 lib/seq-pool.h
 create mode 100644 tests/test-seq-pool.c

diff --git a/lib/automake.mk b/lib/automake.mk
index c67c01779..639f1000f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -293,6 +293,8 @@ lib_libopenvswitch_la_SOURCES = \
     lib/sat-math.h \
     lib/seq.c \
     lib/seq.h \
+    lib/seq-pool.c \
+    lib/seq-pool.h \
     lib/sha1.c \
     lib/sha1.h \
     lib/shash.c \
diff --git a/lib/seq-pool.c b/lib/seq-pool.c
new file mode 100644
index 000000000..4426d11d8
--- /dev/null
+++ b/lib/seq-pool.c
@@ -0,0 +1,198 @@
+/*
+ * Copyright (c) 2020 NVIDIA Corporation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "openvswitch/list.h" +#include "openvswitch/thread.h" +#include "openvswitch/util.h" +#include "ovs-atomic.h" +#include "llring.h" +#include "seq-pool.h" + +#define SEQPOOL_CACHE_SIZE 32 +BUILD_ASSERT_DECL(IS_POW2(SEQPOOL_CACHE_SIZE)); + +struct seq_node { + struct ovs_list list_node; + uint32_t id; +}; + +struct seq_pool { + uint32_t next_id; + struct llring **cache; /* per-user id cache. */ + size_t nb_user; /* Number of user threads. */ + struct ovs_mutex lock; /* Protects free_ids access. */ + struct ovs_list free_ids; /* Set of currently free IDs. */ + uint32_t base; /* IDs in the range of [base, base + n_ids). */ + uint32_t n_ids; /* Total number of ids in the pool. */ +}; + +struct seq_pool * +seq_pool_create(unsigned int nb_user, uint32_t base, uint32_t n_ids) +{ + struct seq_pool *pool; + size_t i; + + ovs_assert(nb_user != 0); + ovs_assert(base <= UINT32_MAX - n_ids); + + pool = xmalloc(sizeof *pool); + + pool->cache = xcalloc(nb_user, sizeof *pool->cache); + for (i = 0; i < nb_user; i++) { + pool->cache[i] = llring_create(SEQPOOL_CACHE_SIZE); + } + pool->nb_user = nb_user; + + pool->next_id = base; + pool->base = base; + pool->n_ids = n_ids; + + ovs_mutex_init(&pool->lock); + ovs_list_init(&pool->free_ids); + + return pool; +} + +void +seq_pool_destroy(struct seq_pool *pool) +{ + struct seq_node *node; + struct seq_node *next; + size_t i; + + if (!pool) { + return; + } + + ovs_mutex_lock(&pool->lock); + LIST_FOR_EACH_SAFE (node, next, list_node, &pool->free_ids) { + free(node); + } + ovs_list_poison(&pool->free_ids); + ovs_mutex_unlock(&pool->lock); + ovs_mutex_destroy(&pool->lock); + + for (i = 0; i < pool->nb_user; i++) { + llring_destroy(pool->cache[i]); + } + free(pool->cache); + + free(pool); +} + +bool +seq_pool_new_id(struct seq_pool *pool, unsigned int uid, uint32_t *id) +{ + struct llring *cache; + struct 
ovs_list *front; + struct seq_node *node; + + uid %= pool->nb_user; + cache = pool->cache[uid]; + + if (llring_dequeue(cache, id)) { + return true; + } + + ovs_mutex_lock(&pool->lock); + + while (!ovs_list_is_empty(&pool->free_ids)) { + front = ovs_list_front(&pool->free_ids); + node = CONTAINER_OF(front, struct seq_node, list_node); + if (llring_enqueue(cache, node->id)) { + ovs_list_remove(front); + free(node); + } else { + break; + } + } + + while (pool->next_id < pool->base + pool->n_ids) { + if (llring_enqueue(cache, pool->next_id)) { + pool->next_id++; + } else { + break; + } + } + + ovs_mutex_unlock(&pool->lock); + + if (llring_dequeue(cache, id)) { + return true; + } else { + struct llring *c2; + size_t i; + + /* If no ID was available either from shared counter, + * free-list or local cache, steal an ID from another + * user cache. + */ + for (i = 0; i < pool->nb_user; i++) { + if (i == uid) { + continue; + } + c2 = pool->cache[i]; + if (llring_dequeue(c2, id)) { + return true; + } + } + } + + return false; +} + +void +seq_pool_free_id(struct seq_pool *pool, unsigned int uid, uint32_t id) +{ + struct seq_node *nodes[SEQPOOL_CACHE_SIZE + 1]; + struct llring *cache; + uint32_t node_id; + size_t i; + + if (id < pool->base || id >= pool->base + pool->n_ids) { + return; + } + + uid %= pool->nb_user; + cache = pool->cache[uid]; + + if (llring_enqueue(cache, id)) { + return; + } + + /* Flush the cache. */ + for (i = 0; llring_dequeue(cache, &node_id); i++) { + nodes[i] = xmalloc(sizeof *nodes[i]); + nodes[i]->id = node_id; + } + + /* Finish with the last freed node. 
*/
+    nodes[i] = xmalloc(sizeof **nodes);
+    nodes[i]->id = id;
+    i++;
+
+    if (i < ARRAY_SIZE(nodes)) {
+        nodes[i] = NULL;
+    }
+
+    ovs_mutex_lock(&pool->lock);
+    for (i = 0; i < ARRAY_SIZE(nodes) && nodes[i] != NULL; i++) {
+        ovs_list_push_back(&pool->free_ids, &nodes[i]->list_node);
+    }
+    ovs_mutex_unlock(&pool->lock);
+}
diff --git a/lib/seq-pool.h b/lib/seq-pool.h
new file mode 100644
index 000000000..c992a0988
--- /dev/null
+++ b/lib/seq-pool.h
@@ -0,0 +1,66 @@
+/*
+ * Copyright (c) 2020 NVIDIA Corporation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef SEQ_POOL_H
+#define SEQ_POOL_H
+
+#include
+#include
+#include
+
+/*
+ * Sequential ID pool.
+ * ===================
+ *
+ * Pool of unique 32-bit IDs.
+ *
+ * Multiple users are registered at initialization. Each user uses a cache
+ * of IDs. When each thread using the pool uses its own user ID, the pool
+ * scales reasonably for concurrent allocation.
+ *
+ * New IDs are always in the range '[base, next_id]', where 'next_id' is
+ * at most 'last_alloc_id + nb_user * cache_size + 1'.
+ * This means that a new ID is not always the smallest available ID, but it
+ * is still from a limited range.
+ *
+ * Users should ensure that an ID is *never* freed twice. Not doing so will
+ * have the effect of double-allocating such an ID afterward.
+ *
+ * Thread-safety
+ * =============
+ *
+ * APIs are thread safe.
+ * + * Multiple threads can share the same user ID if necessary, but it can hurt + * performance if threads are not otherwise synchronized. + */ + +struct seq_pool; + +/* nb_user is the number of expected users of the pool, + * in terms of execution threads. */ +struct seq_pool *seq_pool_create(unsigned int nb_user, + uint32_t base, uint32_t n_ids); +void seq_pool_destroy(struct seq_pool *pool); + +/* uid is the thread user-id. It should be within '[0, nb_user['. */ +bool seq_pool_new_id(struct seq_pool *pool, unsigned int uid, uint32_t *id); + +/* uid is the thread user-id. It should be within '[0, nb_user['. + * An allocated ID must *never* be freed twice. + */ +void seq_pool_free_id(struct seq_pool *pool, unsigned int uid, uint32_t id); +#endif /* seq-pool.h */ diff --git a/tests/automake.mk b/tests/automake.mk index d7ae5df90..f34016a24 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -469,6 +469,7 @@ tests_ovstest_SOURCES = \ tests/test-rcu.c \ tests/test-reconnect.c \ tests/test-rstp.c \ + tests/test-seq-pool.c \ tests/test-sflow.c \ tests/test-sha1.c \ tests/test-skiplist.c \ diff --git a/tests/library.at b/tests/library.at index 537f0aa4c..9e7ea1de2 100644 --- a/tests/library.at +++ b/tests/library.at @@ -258,3 +258,8 @@ AT_SETUP([mpsc-queue module]) AT_CHECK([ovstest test-mpsc-queue check], [0], [.. ]) AT_CLEANUP + +AT_SETUP([seq-pool module]) +AT_CHECK([ovstest test-seq-pool check], [0], [.... +]) +AT_CLEANUP diff --git a/tests/test-seq-pool.c b/tests/test-seq-pool.c new file mode 100644 index 000000000..9ff18074a --- /dev/null +++ b/tests/test-seq-pool.c @@ -0,0 +1,542 @@ +/* + * Copyright (c) 2020 NVIDIA Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#undef NDEBUG +#include +#include +#include + +#include + +#include "command-line.h" +#include "id-pool.h" +#include "openvswitch/util.h" +#include "ovs-thread.h" +#include "ovs-rcu.h" +#include "ovstest.h" +#include "random.h" +#include "seq-pool.h" +#include "timeval.h" +#include "util.h" + +#define SEQ_POOL_CACHE_SIZE 32 + +#define N_IDS 100 + +static void +test_seq_pool_alloc_full_range(void) +{ + bool ids[N_IDS]; + struct seq_pool *pool; + size_t i; + + memset(ids, 0, sizeof ids); + pool = seq_pool_create(1, 0, N_IDS); + + for (i = 0; i < N_IDS; i++) { + uint32_t id; + + ovs_assert(seq_pool_new_id(pool, 0, &id)); + /* No double alloc.*/ + ovs_assert(ids[id] == false); + ids[id] = true; + } + + for (i = 0; i < N_IDS; i++) { + ovs_assert(ids[i]); + } + + seq_pool_destroy(pool); + printf("."); +} + +static void +test_seq_pool_alloc_steal(void) +{ + /* N must be less than a pool cache size to force the second user + * to steal from the first. + */ + const unsigned int N = SEQ_POOL_CACHE_SIZE / 4; + bool ids[N]; + struct seq_pool *pool; + uint32_t id; + size_t i; + + memset(ids, 0, sizeof ids); + pool = seq_pool_create(2, 0, N); + + /* Fill up user 0 cache. */ + ovs_assert(seq_pool_new_id(pool, 0, &id)); + for (i = 0; i < N - 1; i++) { + /* Check that user 1 can still alloc from user 0 cache. 
*/ + ovs_assert(seq_pool_new_id(pool, 1, &id)); + } + + seq_pool_destroy(pool); + printf("."); +} + +static void +test_seq_pool_alloc_monotonic(void) +{ + uint32_t ids[N_IDS]; + struct seq_pool *pool; + size_t i; + + memset(ids, 0, sizeof ids); + pool = seq_pool_create(1, 0, N_IDS); + + for (i = 0; i < N_IDS; i++) { + ovs_assert(seq_pool_new_id(pool, 0, &ids[i])); + ovs_assert(ids[i] == i); + } + + seq_pool_destroy(pool); + printf("."); +} + +static void +test_seq_pool_alloc_under_limit(void) +{ + uint32_t ids[N_IDS]; + unsigned int limit; + struct seq_pool *pool; + size_t i; + + memset(ids, 0, sizeof ids); + pool = seq_pool_create(1, 0, N_IDS); + + for (limit = 1; limit < N_IDS; limit++) { + /* Allocate until arbitrary limit then free allocated ids. */ + for (i = 0; i < limit; i++) { + ovs_assert(seq_pool_new_id(pool, 0, &ids[i])); + } + for (i = 0; i < limit; i++) { + seq_pool_free_id(pool, 0, ids[i]); + } + /* Verify that the N='limit' next allocations are under limit. */ + for (i = 0; i < limit; i++) { + ovs_assert(seq_pool_new_id(pool, 0, &ids[i])); + ovs_assert(ids[i] < limit + SEQ_POOL_CACHE_SIZE); + } + for (i = 0; i < limit; i++) { + seq_pool_free_id(pool, 0, ids[i]); + } + } + + seq_pool_destroy(pool); + printf("."); +} + +static void +run_tests(struct ovs_cmdl_context *ctx OVS_UNUSED) +{ + /* Check that all ids can be allocated. */ + test_seq_pool_alloc_full_range(); + /* Check that all ids can be allocated with multiple users. */ + test_seq_pool_alloc_steal(); + /* Check that id allocation is always increasing. */ + test_seq_pool_alloc_monotonic(); + /* Check that id allocation stays under some limit. */ + test_seq_pool_alloc_under_limit(); + printf("\n"); +} + +static uint32_t *ids; +static uint64_t *thread_working_ms; /* Measured work time. 
*/ + +static unsigned int n_threads; +static unsigned int n_ids; + +static struct ovs_barrier barrier; + +#define TIMEOUT_MS (10 * 1000) /* 10 sec timeout */ +static int running_time_ms; +volatile bool stop = false; + +static int +elapsed(int *start) +{ + return running_time_ms - *start; +} + +static void +swap_u32(uint32_t *a, uint32_t *b) +{ + uint32_t t; + t = *a; + *a = *b; + *b = t; +} + +static void +shuffle(uint32_t *p, size_t n) +{ + for (; n > 1; n--, p++) { + uint32_t *q = &p[random_range(n)]; + swap_u32(p, q); + } +} + +static void +print_result(const char *prefix) +{ + uint64_t avg; + size_t i; + + avg = 0; + for (i = 0; i < n_threads; i++) { + avg += thread_working_ms[i]; + } + avg /= n_threads; + printf("%s: ", prefix); + for (i = 0; i < n_threads; i++) { + if (thread_working_ms[i] >= TIMEOUT_MS) { + printf("%6" PRIu64 "+", thread_working_ms[i]); + } else { + printf(" %6" PRIu64, thread_working_ms[i]); + } + } + if (avg >= TIMEOUT_MS) { + printf(" ****** ms\n"); + } else { + printf(" %6" PRIu64 " ms\n", avg); + } +} + +struct seq_pool_aux { + struct seq_pool *pool; + atomic_uint thread_id; +}; + +static void * +seq_pool_thread(void *aux_) +{ + unsigned int n_ids_per_thread; + struct seq_pool_aux *aux = aux_; + uint32_t *th_ids; + unsigned int tid; + int start; + size_t i; + + atomic_add(&aux->thread_id, 1u, &tid); + n_ids_per_thread = n_ids / n_threads; + th_ids = &ids[tid * n_ids_per_thread]; + + /* NEW / ALLOC */ + + start = running_time_ms; + for (i = 0; i < n_ids_per_thread; i++) { + ignore(seq_pool_new_id(aux->pool, tid, &th_ids[i])); + } + thread_working_ms[tid] = elapsed(&start); + + ovs_barrier_block(&barrier); + + /* DEL */ + + shuffle(th_ids, n_ids_per_thread); + + start = running_time_ms; + for (i = 0; i < n_ids_per_thread; i++) { + seq_pool_free_id(aux->pool, tid, th_ids[i]); + } + thread_working_ms[tid] = elapsed(&start); + + ovs_barrier_block(&barrier); + + /* MIX */ + + start = running_time_ms; + for (i = 0; i < n_ids_per_thread; i++) { 
+        ignore(seq_pool_new_id(aux->pool, tid, &th_ids[i]));
+        seq_pool_free_id(aux->pool, tid, th_ids[i]);
+        ignore(seq_pool_new_id(aux->pool, tid, &th_ids[i]));
+    }
+    thread_working_ms[tid] = elapsed(&start);
+
+    ovs_barrier_block(&barrier);
+
+    /* MIX SHUFFLED */
+
+    start = running_time_ms;
+    for (i = 0; i < n_ids_per_thread; i++) {
+        if (elapsed(&start) >= TIMEOUT_MS) {
+            break;
+        }
+        ignore(seq_pool_new_id(aux->pool, tid, &th_ids[i]));
+        swap_u32(&th_ids[i], &th_ids[random_range(i + 1)]);
+        seq_pool_free_id(aux->pool, tid, th_ids[i]);
+        ignore(seq_pool_new_id(aux->pool, tid, &th_ids[i]));
+    }
+    thread_working_ms[tid] = elapsed(&start);
+
+    return NULL;
+}
+
+static void
+benchmark_seq_pool(void)
+{
+    pthread_t *threads;
+    struct seq_pool_aux aux;
+    size_t i;
+
+    /* Fixed: 'n_ids & sizeof *ids' zeroed at most sizeof bytes;
+     * the intent is to clear the whole arrays. */
+    memset(ids, 0, n_ids * sizeof *ids);
+    memset(thread_working_ms, 0, n_threads * sizeof *thread_working_ms);
+
+    aux.pool = seq_pool_create(n_threads, 0, n_ids);
+    atomic_store(&aux.thread_id, 0);
+
+    for (i = n_ids - (n_ids % n_threads); i < n_ids; i++) {
+        uint32_t id;
+
+        seq_pool_new_id(aux.pool, 0, &id);
+        ids[i] = id;
+    }
+
+    threads = xmalloc(n_threads * sizeof *threads);
+    ovs_barrier_init(&barrier, n_threads + 1);
+
+    for (i = 0; i < n_threads; i++) {
+        threads[i] = ovs_thread_create("seq_pool_alloc",
+                                       seq_pool_thread, &aux);
+    }
+
+    ovs_barrier_block(&barrier);
+
+    print_result("seq-pool new");
+
+    ovs_barrier_block(&barrier);
+
+    print_result("seq-pool del");
+
+    ovs_barrier_block(&barrier);
+
+    print_result("seq-pool mix");
+
+    for (i = 0; i < n_threads; i++) {
+        xpthread_join(threads[i], NULL);
+    }
+
+    print_result("seq-pool rnd");
+
+    seq_pool_destroy(aux.pool);
+    ovs_barrier_destroy(&barrier);
+    free(threads);
+}
+
+struct id_pool_aux {
+    struct id_pool *pool;
+    struct ovs_mutex *lock;
+    atomic_uint thread_id;
+};
+
+static void *
+id_pool_thread(void *aux_)
+{
+    unsigned int n_ids_per_thread;
+    struct id_pool_aux *aux = aux_;
+    uint32_t *th_ids;
+    unsigned int tid;
+    int start;
+    size_t i;
+
+    atomic_add(&aux->thread_id, 1u, &tid);
+    n_ids_per_thread = n_ids / n_threads;
+    th_ids = &ids[tid * n_ids_per_thread];
+
+    /* NEW */
+
+    start = running_time_ms;
+    for (i = 0; i < n_ids_per_thread; i++) {
+        ovs_mutex_lock(aux->lock);
+        ovs_assert(id_pool_alloc_id(aux->pool, &th_ids[i]));
+        ovs_mutex_unlock(aux->lock);
+    }
+    thread_working_ms[tid] = elapsed(&start);
+
+    ovs_barrier_block(&barrier);
+
+    /* DEL */
+
+    shuffle(th_ids, n_ids_per_thread);
+
+    start = running_time_ms;
+    for (i = 0; i < n_ids_per_thread; i++) {
+        ovs_mutex_lock(aux->lock);
+        id_pool_free_id(aux->pool, th_ids[i]);
+        ovs_mutex_unlock(aux->lock);
+    }
+    thread_working_ms[tid] = elapsed(&start);
+
+    ovs_barrier_block(&barrier);
+
+    /* MIX */
+
+    start = running_time_ms;
+    for (i = 0; i < n_ids_per_thread; i++) {
+        ovs_mutex_lock(aux->lock);
+        ignore(id_pool_alloc_id(aux->pool, &th_ids[i]));
+        id_pool_free_id(aux->pool, th_ids[i]);
+        ignore(id_pool_alloc_id(aux->pool, &th_ids[i]));
+        ovs_mutex_unlock(aux->lock);
+    }
+    thread_working_ms[tid] = elapsed(&start);
+
+    ovs_barrier_block(&barrier);
+
+    /* MIX SHUFFLED */
+
+    start = running_time_ms;
+    for (i = 0; i < n_ids_per_thread; i++) {
+        if (elapsed(&start) >= TIMEOUT_MS) {
+            break;
+        }
+        ovs_mutex_lock(aux->lock);
+        ignore(id_pool_alloc_id(aux->pool, &th_ids[i]));
+        swap_u32(&th_ids[i], &th_ids[random_range(i + 1)]);
+        id_pool_free_id(aux->pool, th_ids[i]);
+        ignore(id_pool_alloc_id(aux->pool, &th_ids[i]));
+        ovs_mutex_unlock(aux->lock);
+    }
+    thread_working_ms[tid] = elapsed(&start);
+
+    return NULL;
+}
+
+static void
+benchmark_id_pool(void)
+{
+    pthread_t *threads;
+    struct id_pool_aux aux;
+    struct ovs_mutex lock;
+    size_t i;
+
+    /* Fixed: 'n_ids & sizeof *ids' zeroed at most sizeof bytes;
+     * the intent is to clear the whole arrays. */
+    memset(ids, 0, n_ids * sizeof *ids);
+    memset(thread_working_ms, 0, n_threads * sizeof *thread_working_ms);
+
+    aux.pool = id_pool_create(0, n_ids);
+    aux.lock = &lock;
+    ovs_mutex_init(&lock);
+    atomic_store(&aux.thread_id, 0);
+
+    for (i = n_ids - (n_ids % n_threads); i < n_ids; i++) {
+
id_pool_alloc_id(aux.pool, &ids[i]); + } + + threads = xmalloc(n_threads * sizeof *threads); + ovs_barrier_init(&barrier, n_threads + 1); + + for (i = 0; i < n_threads; i++) { + threads[i] = ovs_thread_create("id_pool_alloc", id_pool_thread, &aux); + } + + ovs_barrier_block(&barrier); + + print_result(" id-pool new"); + + ovs_barrier_block(&barrier); + + print_result(" id-pool del"); + + ovs_barrier_block(&barrier); + + print_result(" id-pool mix"); + + for (i = 0; i < n_threads; i++) { + xpthread_join(threads[i], NULL); + } + + print_result(" id-pool rnd"); + + id_pool_destroy(aux.pool); + ovs_barrier_destroy(&barrier); + free(threads); +} + +static void * +clock_main(void *arg OVS_UNUSED) +{ + struct timeval start; + struct timeval end; + + xgettimeofday(&start); + while (!stop) { + xgettimeofday(&end); + running_time_ms = timeval_to_msec(&end) - timeval_to_msec(&start); + xnanosleep(1000); + } + + return NULL; +} + +static void +run_benchmarks(struct ovs_cmdl_context *ctx) +{ + pthread_t clock; + long int l_threads; + long int l_ids; + size_t i; + + l_ids = strtol(ctx->argv[1], NULL, 10); + l_threads = strtol(ctx->argv[2], NULL, 10); + ovs_assert(l_ids > 0 && l_threads > 0); + + n_ids = l_ids; + n_threads = l_threads; + + ids = xcalloc(n_ids, sizeof *ids); + thread_working_ms = xcalloc(n_threads, sizeof *thread_working_ms); + + clock = ovs_thread_create("clock", clock_main, NULL); + + printf("Benchmarking n=%u on %u thread%s.\n", n_ids, n_threads, + n_threads > 1 ? 
"s" : "");
+
+    printf(" type\\thread: ");
+    for (i = 0; i < n_threads; i++) {
+        printf(" %3" PRIuSIZE " ", i + 1);
+    }
+    printf(" Avg\n");
+
+    benchmark_seq_pool();
+    benchmark_id_pool();
+
+    stop = true;
+
+    free(thread_working_ms);
+    xpthread_join(clock, NULL);
+}
+
+static const struct ovs_cmdl_command commands[] = {
+    {"check", NULL, 0, 0, run_tests, OVS_RO},
+    {"benchmark", " ", 2, 2, run_benchmarks, OVS_RO},
+    {NULL, NULL, 0, 0, NULL, OVS_RO},
+};
+
+static void
+test_seq_pool_main(int argc, char *argv[])
+{
+    struct ovs_cmdl_context ctx = {
+        .argc = argc - optind,
+        .argv = argv + optind,
+    };
+
+    set_program_name(argv[0]);
+    ovs_cmdl_run_command(&ctx, commands);
+}
+
+OVSTEST_REGISTER("test-seq-pool", test_seq_pool_main);

From patchwork Sat Dec 5 14:22:10 2020
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1411450
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:10 +0100
Message-Id: <70fc444d979df346a0184d9a9c0733d58e578e27.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 15/26] netdev-offload: Add multi-thread API

Expose functions
reporting the user configuration of offload threads, as well as utility
functions for multithreading.

This only exposes the configuration knob to the user; no datapath implements
the multiple-thread request yet. It allows implementations to use this API
for offload thread management in the relevant layers before the actual
dataplane implementation is enabled.

The offload thread ID is lazily allocated and can thus be in a different
order than the offload thread start sequence.

The RCU thread will sometimes access hardware-offload objects from a
provider for reclamation purposes. In such a case, it will get the default
offload thread ID of 0. Care must be taken that using this thread ID is safe
concurrently with the offload threads.

Signed-off-by: Gaetan Rivet
---
 lib/netdev-offload-dpdk.c     |  4 +-
 lib/netdev-offload-provider.h |  6 ++-
 lib/netdev-offload.c          | 81 +++++++++++++++++++++++++++++++++--
 lib/netdev-offload.h          | 21 ++++++++-
 vswitchd/vswitch.xml          | 16 +++++++
 5 files changed, 120 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index b29c1188f..7648a0ccd 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -1656,12 +1656,12 @@ out:
 
 static int
 netdev_offload_dpdk_hw_offload_stats_get(struct netdev *netdev,
-                                         uint64_t *counter)
+                                         uint64_t *counters)
 {
     struct netdev_offload_dpdk_data *data;
 
     data = netdev->hw_info.offload_data;
-    *counter = data->rte_flow_counter;
+    *counters = data->rte_flow_counter;
 
     return 0;
 }
diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h
index fd38cea66..2ac73cd36 100644
--- a/lib/netdev-offload-provider.h
+++ b/lib/netdev-offload-provider.h
@@ -83,8 +83,10 @@ struct netdev_flow_api {
     int (*flow_del)(struct netdev *, const ovs_u128 *ufid,
                     struct dpif_flow_stats *);
 
-    /* Queries an offload provider hardware statistics.
*/ - int (*hw_offload_stats_get)(struct netdev *netdev, uint64_t *counter); + /* Queries an offload provider hardware statistics. + * One counter per offload thread is expected. + */ + int (*hw_offload_stats_get)(struct netdev *netdev, uint64_t *counters); /* Initializies the netdev flow api. * Return 0 if successful, otherwise returns a positive errno value. */ diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c index 4a8403ead..17b8cbc5a 100644 --- a/lib/netdev-offload.c +++ b/lib/netdev-offload.c @@ -60,6 +60,12 @@ VLOG_DEFINE_THIS_MODULE(netdev_offload); static bool netdev_flow_api_enabled = false; +#define DEFAULT_OFFLOAD_THREAD_NB 1 +#define MAX_OFFLOAD_THREAD_NB 10 + +static unsigned int offload_thread_nb = DEFAULT_OFFLOAD_THREAD_NB; +DEFINE_EXTERN_PER_THREAD_DATA(netdev_offload_thread_id, OVSTHREAD_ID_UNSET); + /* Protects 'netdev_flow_apis'. */ static struct ovs_mutex netdev_flow_api_provider_mutex = OVS_MUTEX_INITIALIZER; @@ -281,13 +287,13 @@ netdev_flow_del(struct netdev *netdev, const ovs_u128 *ufid, } int -netdev_hw_offload_stats_get(struct netdev *netdev, uint64_t *counter) +netdev_hw_offload_stats_get(struct netdev *netdev, uint64_t *counters) { const struct netdev_flow_api *flow_api = ovsrcu_get(const struct netdev_flow_api *, &netdev->flow_api); return (flow_api && flow_api->hw_offload_stats_get) - ? flow_api->hw_offload_stats_get(netdev, counter) + ? 
flow_api->hw_offload_stats_get(netdev, counters)
           : EOPNOTSUPP;
 }
 
@@ -436,6 +442,64 @@ netdev_is_flow_api_enabled(void)
     return netdev_flow_api_enabled;
 }
 
+unsigned int
+netdev_offload_thread_nb(void)
+{
+    return offload_thread_nb;
+}
+
+unsigned int
+netdev_offload_ufid_to_thread_id(const ovs_u128 ufid)
+{
+    uint32_t ufid_hash;
+
+    if (netdev_offload_thread_nb() == 1) {
+        return 0;
+    }
+
+    ufid_hash = hash_words64_inline(
+            (const uint64_t [2]){ ufid.u64.lo, ufid.u64.hi }, 2, 1);
+    return ufid_hash % netdev_offload_thread_nb();
+}
+
+unsigned int
+netdev_offload_thread_init(void)
+{
+    static atomic_count next_id = ATOMIC_COUNT_INIT(0);
+    bool thread_is_hw_offload;
+    bool thread_is_rcu;
+
+    thread_is_hw_offload = !strncmp(get_subprogram_name(),
+                                    "hw_offload", strlen("hw_offload"));
+    thread_is_rcu = !strncmp(get_subprogram_name(), "urcu", strlen("urcu"));
+
+    /* Panic if any thread besides the offload and RCU threads tries
+     * to initialize its thread ID. */
+    ovs_assert(thread_is_hw_offload || thread_is_rcu);
+
+    if (*netdev_offload_thread_id_get() == OVSTHREAD_ID_UNSET) {
+        unsigned int id;
+
+        if (thread_is_rcu) {
+            /* RCU will compete with other threads for shared object access.
+             * Reclamation functions using a thread ID must be thread-safe.
+             * To that end, and because RCU must consider all potential shared
+             * objects anyway, its thread ID can be any value, so use 0.
+             */
+            id = 0;
+        } else {
+            /* Only the actual offload threads have their own ID. */
+            id = atomic_count_inc(&next_id);
+        }
+        /* Panic if any offload thread is getting a spurious ID.
*/ + ovs_assert(id < netdev_offload_thread_nb()); + return *netdev_offload_thread_id_get() = id; + } else { + return *netdev_offload_thread_id_get(); + } +} + void netdev_ports_flow_flush(const char *dpif_type) { @@ -664,7 +728,18 @@ netdev_set_flow_api_enabled(const struct smap *ovs_other_config) if (ovsthread_once_start(&once)) { netdev_flow_api_enabled = true; - VLOG_INFO("netdev: Flow API Enabled"); + offload_thread_nb = smap_get_ullong(ovs_other_config, + "hw-offload-thread-nb", + DEFAULT_OFFLOAD_THREAD_NB); + if (offload_thread_nb > MAX_OFFLOAD_THREAD_NB) { + VLOG_WARN("netdev: Invalid number of threads requested: %u", + offload_thread_nb); + offload_thread_nb = DEFAULT_OFFLOAD_THREAD_NB; + } + + VLOG_INFO("netdev: Flow API Enabled, using %u thread%s", + offload_thread_nb, + offload_thread_nb > 1 ? "s" : ""); #ifdef __linux__ tc_set_policy(smap_get_def(ovs_other_config, "tc-policy", diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h index 5ed561d13..e60539706 100644 --- a/lib/netdev-offload.h +++ b/lib/netdev-offload.h @@ -20,6 +20,7 @@ #include "openvswitch/netdev.h" #include "openvswitch/types.h" +#include "ovs-thread.h" #include "packets.h" #include "flow.h" @@ -79,6 +80,24 @@ struct offload_info { * to delete the original flow. 
*/ }; +DECLARE_EXTERN_PER_THREAD_DATA(unsigned int, netdev_offload_thread_id); + +unsigned int netdev_offload_thread_nb(void); +unsigned int netdev_offload_thread_init(void); +unsigned int netdev_offload_ufid_to_thread_id(const ovs_u128 ufid); + +static inline unsigned int +netdev_offload_thread_id(void) +{ + unsigned int id = *netdev_offload_thread_id_get(); + + if (OVS_UNLIKELY(id == OVSTHREAD_ID_UNSET)) { + id = netdev_offload_thread_init(); + } + + return id; +} + int netdev_flow_flush(struct netdev *); int netdev_flow_dump_create(struct netdev *, struct netdev_flow_dump **dump, bool terse); @@ -95,7 +114,7 @@ int netdev_flow_get(struct netdev *, struct match *, struct nlattr **actions, struct dpif_flow_attrs *, struct ofpbuf *wbuffer); int netdev_flow_del(struct netdev *, const ovs_u128 *, struct dpif_flow_stats *); -int netdev_hw_offload_stats_get(struct netdev *, uint64_t *counter); +int netdev_hw_offload_stats_get(struct netdev *, uint64_t *counters); int netdev_init_flow_api(struct netdev *); void netdev_uninit_flow_api(struct netdev *); uint32_t netdev_get_block_id(struct netdev *); diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 89a876796..63f99299e 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -247,6 +247,22 @@

+        <column name="other_config" key="hw-offload-thread-nb"
+                type='{"type": "integer", "minInteger": 1, "maxInteger": 10}'>
+          <p>
+            Set this value to the number of threads created to manage hardware
+            offloads.
+          </p>
+          <p>
+            The default value is 1. Changing this value requires
+            restarting the daemon.
+          </p>
+          <p>
+            This is only relevant if
+            <ref column="other_config" key="hw-offload"/> is enabled.
+          </p>
+        </column>
From patchwork Sat Dec 5 14:22:11 2020
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1411457
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:11 +0100
Message-Id: <2eb0e7ba586aff40e1a98f42459dc3d6ae08cb7c.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 16/26] dpif-netdev: Quiesce offload thread periodically

After each processed offload, the offload thread currently quiesces and
syncs with RCU. This synchronization can be lengthy and makes the thread
unnecessarily slow. Instead, attempt to quiesce at most once every 10 ms,
or any time the queue is empty.
Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 1bbe6d98f..6a3413b2e 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -2735,15 +2735,20 @@ err_free: return -1; } +#define DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US (10 * 1000) /* 10 ms */ + static void * dp_netdev_flow_offload_main(void *data OVS_UNUSED) { struct dp_offload_thread_item *offload; struct ovs_list *list; long long int latency_us; + long long int next_rcu; + long long int now; const char *op; int ret; + next_rcu = time_usec() + DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US; for (;;) { ovs_mutex_lock(&dp_offload_thread.mutex); if (ovs_list_is_empty(&dp_offload_thread.list)) { @@ -2751,6 +2756,7 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED) ovs_mutex_cond_wait(&dp_offload_thread.cond, &dp_offload_thread.mutex); ovsrcu_quiesce_end(); + next_rcu = time_usec() + DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US; } list = ovs_list_pop_front(&dp_offload_thread.list); dp_offload_thread.enqueued_item--; @@ -2774,14 +2780,22 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED) OVS_NOT_REACHED(); } - latency_us = time_usec() - offload->timestamp; + now = time_usec(); + + latency_us = now - offload->timestamp; mov_avg_ema_update(&dp_offload_thread.ema, latency_us); VLOG_DBG("%s to %s netdev flow "UUID_FMT, ret == 0 ? "succeed" : "failed", op, UUID_ARGS((struct uuid *) &offload->flow->mega_ufid)); dp_netdev_free_flow_offload(offload); - ovsrcu_quiesce(); + + /* Do RCU synchronization at fixed interval. 
*/
+        if (now > next_rcu) {
+            if (!ovsrcu_try_quiesce()) {
+                next_rcu += DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
+            }
+        }
     }
 
     return NULL;

From patchwork Sat Dec 5 14:22:12 2020
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1411467
localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id EAB508780E for ; Sat, 5 Dec 2020 14:22:44 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id w7V4pEZo08oQ for ; Sat, 5 Dec 2020 14:22:40 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from relay2-d.mail.gandi.net (relay2-d.mail.gandi.net [217.70.183.194]) by whitealder.osuosl.org (Postfix) with ESMTPS id 026728750B for ; Sat, 5 Dec 2020 14:22:38 +0000 (UTC) X-Originating-IP: 90.78.4.16 Received: from inocybe.home (lfbn-poi-1-1343-16.w90-78.abo.wanadoo.fr [90.78.4.16]) (Authenticated sender: grive@u256.net) by relay2-d.mail.gandi.net (Postfix) with ESMTPSA id 81C6540004 for ; Sat, 5 Dec 2020 14:22:37 +0000 (UTC) From: Gaetan Rivet To: dev@openvswitch.org Date: Sat, 5 Dec 2020 15:22:12 +0100 Message-Id: <0d3273743cc46e0705f7ccbc7b5d228ded0166dd.1607177117.git.grive@u256.net> X-Mailer: git-send-email 2.29.2 In-Reply-To: References: MIME-Version: 1.0 Subject: [ovs-dev] [RFC PATCH 17/26] dpif-netdev: Postpone flow offload item freeing X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Profiling the HW offload thread, the flow offload freeing takes approximatively 25% of the time. Most of this time is spent waiting on the futex used by the libc free(), as it triggers a syscall and reschedule the thread. Avoid the syscall and its expensive context switch. Batch the offload messages freeing using the RCU. 
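The batching idea can be sketched outside OVS: instead of calling free() on the hot path, the destructor is queued and the whole batch runs later, the way ovsrcu_postpone() defers it to the next grace period. This is a minimal, single-threaded sketch; the names (`defer_free`, `drain_deferred`) are hypothetical, and the real mechanism additionally tracks RCU grace periods before draining.

```c
#include <stdlib.h>
#include <stddef.h>

/* Illustrative sketch of deferred freeing: destructors are queued instead of
 * running inline, then the whole batch is drained at a quiescent point, the
 * way ovsrcu_postpone() does in OVS.  Grace-period tracking is omitted. */

struct deferred {
    void (*cb)(void *);         /* destructor postponed until the drain */
    void *arg;
    struct deferred *next;
};

static struct deferred *deferred_head;

static int destroyed;           /* observable effect for the usage check */

static void count_destroy(void *arg)
{
    (void) arg;
    destroyed++;
}

/* Queue 'arg' for destruction instead of freeing it on the hot path. */
static void defer_free(void (*cb)(void *), void *arg)
{
    struct deferred *d = malloc(sizeof *d);

    d->cb = cb;
    d->arg = arg;
    d->next = deferred_head;
    deferred_head = d;
}

/* Run every postponed destructor, e.g. when the consumer thread quiesces.
 * Returns how many destructors ran. */
static size_t drain_deferred(void)
{
    size_t n = 0;

    while (deferred_head) {
        struct deferred *d = deferred_head;

        deferred_head = d->next;
        d->cb(d->arg);
        free(d);
        n++;
    }
    return n;
}
```

Draining in batch keeps the allocator's futex out of the per-offload path; the cost moves to a point where the thread is otherwise idle.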
Signed-off-by: Gaetan Rivet
---
 lib/dpif-netdev.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 6a3413b2e..ea9d66580 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2621,13 +2621,19 @@ dp_netdev_alloc_flow_offload(struct dp_netdev_pmd_thread *pmd,
 }
 
 static void
-dp_netdev_free_flow_offload(struct dp_offload_thread_item *offload)
+dp_netdev_flow_offload_free(struct dp_offload_thread_item *offload)
+{
+    free(offload->actions);
+    free(offload);
+}
+
+static void
+dp_netdev_flow_offload_unref(struct dp_offload_thread_item *offload)
 {
     dp_netdev_pmd_unref(offload->pmd);
     dp_netdev_flow_unref(offload->flow);
-    free(offload->actions);
-    free(offload);
+    ovsrcu_postpone(dp_netdev_flow_offload_free, offload);
 }
 
 static void
@@ -2788,7 +2794,8 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED)
         VLOG_DBG("%s to %s netdev flow "UUID_FMT,
                  ret == 0 ? "succeed" : "failed", op,
                  UUID_ARGS((struct uuid *) &offload->flow->mega_ufid));
-        dp_netdev_free_flow_offload(offload);
+
+        dp_netdev_flow_offload_unref(offload);
 
         /* Do RCU synchronization at fixed interval.
          */
         if (now > next_rcu) {

From patchwork Sat Dec 5 14:22:13 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:13 +0100
Message-Id: <5a98617b820e273c00efa167fddd55195a1fdd49.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 18/26] dpif-netdev: Use seq-pool for mark allocation

Change the flow mark pool to a seq-pool.  Use the netdev-offload
multithread API to allow multiple threads to allocate marks concurrently.
Initialize the pool only once in a multithread context by using the
ovsthread_once type.
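The once-guarded, per-thread allocation pattern can be sketched in plain POSIX C. Here the id space is statically split into one contiguous range per thread, which is a simplification: the real seq-pool recycles ids through per-thread caches. All names and constants below are illustrative, not the OVS API.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define N_THREADS 4             /* stand-in for netdev_offload_thread_nb() */
#define MARK_MIN  1u
#define MARK_MAX  4096u

static pthread_once_t pool_once = PTHREAD_ONCE_INIT;
static uint32_t next_mark[N_THREADS];   /* per-thread allocation cursor */
static uint32_t range_end[N_THREADS];   /* exclusive end of each range */

/* Runs exactly once no matter how many threads race into mark_alloc(),
 * playing the role of ovsthread_once_start()/done() in the patch. */
static void pool_init(void)
{
    uint32_t span = (MARK_MAX - MARK_MIN + 1) / N_THREADS;

    for (unsigned int t = 0; t < N_THREADS; t++) {
        next_mark[t] = MARK_MIN + t * span;
        range_end[t] = next_mark[t] + span;
    }
}

/* Allocate a mark for thread 'tid': the ranges are disjoint, so the fast
 * path needs no lock. */
static bool mark_alloc(unsigned int tid, uint32_t *mark)
{
    pthread_once(&pool_once, pool_init);

    if (tid >= N_THREADS || next_mark[tid] >= range_end[tid]) {
        return false;           /* range exhausted, or bogus thread id */
    }
    *mark = next_mark[tid]++;
    return true;
}
```

With MARK_MIN = 1 and a span of 1024, thread 0 hands out 1, 2, ... while thread 1 starts at 1025, so concurrent allocators never contend.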
Signed-off-by: Gaetan Rivet
---
 lib/dpif-netdev.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index ea9d66580..daf8fb249 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -74,6 +74,7 @@
 #include "pvector.h"
 #include "random.h"
 #include "seq.h"
+#include "seq-pool.h"
 #include "smap.h"
 #include "sset.h"
 #include "timeval.h"
@@ -2413,7 +2414,7 @@ struct megaflow_to_mark_data {
 struct flow_mark {
     struct cmap megaflow_to_mark;
     struct cmap mark_to_flow;
-    struct id_pool *pool;
+    struct seq_pool *pool;
 };
 
 static struct flow_mark flow_mark = {
@@ -2424,14 +2425,18 @@ static struct flow_mark flow_mark = {
 static uint32_t
 flow_mark_alloc(void)
 {
+    static struct ovsthread_once pool_init = OVSTHREAD_ONCE_INITIALIZER;
+    unsigned int tid = netdev_offload_thread_id();
     uint32_t mark;
 
-    if (!flow_mark.pool) {
+    if (ovsthread_once_start(&pool_init)) {
         /* Haven't initiated yet, do it here */
-        flow_mark.pool = id_pool_create(1, MAX_FLOW_MARK);
+        flow_mark.pool = seq_pool_create(netdev_offload_thread_nb(),
+                                         1, MAX_FLOW_MARK);
+        ovsthread_once_done(&pool_init);
     }
 
-    if (id_pool_alloc_id(flow_mark.pool, &mark)) {
+    if (seq_pool_new_id(flow_mark.pool, tid, &mark)) {
         return mark;
     }
 
@@ -2441,7 +2446,9 @@ flow_mark_alloc(void)
 static void
 flow_mark_free(uint32_t mark)
 {
-    id_pool_free_id(flow_mark.pool, mark);
+    unsigned int tid = netdev_offload_thread_id();
+
+    seq_pool_free_id(flow_mark.pool, tid, mark);
 }
 
 /* associate megaflow with a mark, which is a 1:1 mapping */

From patchwork Sat Dec 5 14:22:14 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:14 +0100
Subject: [ovs-dev] [RFC PATCH 19/26] netdev-offload-dpdk: Use per-thread HW offload stats

The implementation of hardware offload counters is currently meant to be
managed by a single thread.  Use the offload thread pool API to manage one
counter per thread.

Signed-off-by: Gaetan Rivet
---
 lib/netdev-offload-dpdk.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 7648a0ccd..48cf6d696 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -63,7 +63,7 @@ struct ufid_to_rte_flow_data {
 struct netdev_offload_dpdk_data {
     struct cmap ufid_to_rte_flow;
-    uint64_t rte_flow_counter;
+    uint64_t *rte_flow_counters;
 };
 
 static int
@@ -73,6 +73,8 @@ offload_data_init(struct netdev *netdev)
     data = xzalloc(sizeof *data);
     cmap_init(&data->ufid_to_rte_flow);
+    data->rte_flow_counters = xcalloc(netdev_offload_thread_nb(),
+                                      sizeof *data->rte_flow_counters);
 
     netdev->hw_info.offload_data = data;
 
@@ -95,6 +97,7 @@ offload_data_destroy(struct netdev *netdev)
     }
 
     cmap_destroy(&data->ufid_to_rte_flow);
+    free(data->rte_flow_counters);
     free(data);
 
     netdev->hw_info.offload_data = NULL;
@@ -620,9 +623,10 @@ netdev_offload_dpdk_flow_create(struct netdev *netdev,
     flow = netdev_dpdk_rte_flow_create(netdev, attr, items,
                                        actions, error);
     if (flow) {
         struct netdev_offload_dpdk_data *data;
+        unsigned int tid = netdev_offload_thread_id();
 
         data = netdev->hw_info.offload_data;
-        data->rte_flow_counter++;
+        data->rte_flow_counters[tid]++;
 
         if (!VLOG_DROP_DBG(&rl)) {
             dump_flow(&s, &s_extra, attr, items, actions);
             extra_str = ds_cstr(&s_extra);
@@ -1510,9 +1514,10 @@ netdev_offload_dpdk_destroy_flow(struct netdev *netdev,
     if (ret == 0) {
         struct netdev_offload_dpdk_data *data;
+        unsigned int tid = netdev_offload_thread_id();
 
         data = netdev->hw_info.offload_data;
-        data->rte_flow_counter--;
+        data->rte_flow_counters[tid]--;
 
         ufid_to_rte_flow_disassociate(netdev, ufid);
         VLOG_DBG_RL(&rl, "%s: rte_flow 0x%"PRIxPTR
@@ -1659,9 +1664,12 @@ netdev_offload_dpdk_hw_offload_stats_get(struct netdev *netdev,
                                          uint64_t *counters)
 {
     struct netdev_offload_dpdk_data *data;
+    unsigned int tid;
 
     data = netdev->hw_info.offload_data;
-    *counters = data->rte_flow_counter;
+    for (tid = 0; tid < netdev_offload_thread_nb(); tid++) {
+        counters[tid] = data->rte_flow_counters[tid];
+    }
 
     return 0;
 }

From patchwork Sat Dec 5 14:22:15 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:15 +0100
Message-Id: <286dcc941744a81f84e9af46ad7158b644570e55.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 20/26] netdev-offload-dpdk: Lock rte_flow map access

Add a lock for accessing the ufid to rte_flow map.  It protects the map
from concurrent write accesses when multiple threads attempt them.

At this point, the reason for taking the lock is no longer to fulfill the
needs of the DPDK offload implementation.  Rewrite the comments to reflect
this change.  The lock is still needed to protect against changes to the
netdev port mapping.

Signed-off-by: Gaetan Rivet
---
 lib/dpif-netdev.c         |  8 ++---
 lib/netdev-offload-dpdk.c | 76 +++++++++++++++++++++++++++++++--------
 2 files changed, 66 insertions(+), 18 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index daf8fb249..4eae34893 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2562,7 +2562,7 @@ mark_to_flow_disassociate(struct dp_netdev_pmd_thread *pmd,
     port = netdev_ports_get(in_port, dpif_type_str);
     if (port) {
         /* Taking a global 'port_mutex' to fulfill thread safety
-         * restrictions for the netdev-offload-dpdk module. */
+         * restrictions regarding netdev port mapping. */
         ovs_mutex_lock(&pmd->dp->port_mutex);
         ret = netdev_flow_del(port, &flow->mega_ufid, NULL);
         ovs_mutex_unlock(&pmd->dp->port_mutex);
@@ -2719,8 +2719,8 @@ dp_netdev_flow_offload_put(struct dp_offload_thread_item *offload)
         netdev_close(port);
         goto err_free;
     }
-    /* Taking a global 'port_mutex' to fulfill thread safety restrictions for
-     * the netdev-offload-dpdk module. */
+    /* Taking a global 'port_mutex' to fulfill thread safety
+     * restrictions regarding the netdev port mapping.
+     */
     ovs_mutex_lock(&pmd->dp->port_mutex);
     ret = netdev_flow_put(port, &offload->match,
                           CONST_CAST(struct nlattr *, offload->actions),
@@ -3402,7 +3402,7 @@ dpif_netdev_get_flow_offload_status(const struct dp_netdev *dp,
     }
     ofpbuf_use_stack(&buf, &act_buf, sizeof act_buf);
     /* Taking a global 'port_mutex' to fulfill thread safety
-     * restrictions for the netdev-offload-dpdk module.
+     * restrictions regarding netdev port mapping.
      *
      * XXX: Main thread will try to pause/stop all revalidators during datapath
      * reconfiguration via datapath purge callback (dp_purge_cb) while
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 48cf6d696..5bc67254c 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -37,9 +37,6 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(100, 5);
  *
  * Below API is NOT thread safe in following terms:
  *
- * - The caller must be sure that none of these functions will be called
- *   simultaneously.  Even for different 'netdev's.
- *
  * - The caller must be sure that 'netdev' will not be destructed/deallocated.
  *
  * - The caller must be sure that 'netdev' configuration will not be changed.
@@ -64,6 +61,7 @@ struct ufid_to_rte_flow_data {
 struct netdev_offload_dpdk_data {
     struct cmap ufid_to_rte_flow;
     uint64_t *rte_flow_counters;
+    struct ovs_mutex map_lock;
 };
 
 static int
@@ -72,6 +70,7 @@ offload_data_init(struct netdev *netdev)
     struct netdev_offload_dpdk_data *data;
 
     data = xzalloc(sizeof *data);
+    ovs_mutex_init(&data->map_lock);
     cmap_init(&data->ufid_to_rte_flow);
     data->rte_flow_counters = xcalloc(netdev_offload_thread_nb(),
                                       sizeof *data->rte_flow_counters);
@@ -97,12 +96,33 @@ offload_data_destroy(struct netdev *netdev)
     }
 
     cmap_destroy(&data->ufid_to_rte_flow);
+    ovs_mutex_destroy(&data->map_lock);
     free(data->rte_flow_counters);
     free(data);
 
     netdev->hw_info.offload_data = NULL;
 }
 
+static void
+offload_data_lock(struct netdev *netdev)
+    OVS_NO_THREAD_SAFETY_ANALYSIS
+{
+    struct netdev_offload_dpdk_data *data;
+
+    data = netdev->hw_info.offload_data;
+    ovs_mutex_lock(&data->map_lock);
+}
+
+static void
+offload_data_unlock(struct netdev *netdev)
+    OVS_NO_THREAD_SAFETY_ANALYSIS
+{
+    struct netdev_offload_dpdk_data *data;
+
+    data = netdev->hw_info.offload_data;
+    ovs_mutex_unlock(&data->map_lock);
+}
+
 static struct cmap *
 offload_data_map(struct netdev *netdev)
 {
@@ -130,6 +150,24 @@ ufid_to_rte_flow_data_find(struct netdev *netdev,
     return NULL;
 }
 
+/* Find rte_flow with @ufid, lock-protected. */
+static struct ufid_to_rte_flow_data *
+ufid_to_rte_flow_data_find_protected(struct netdev *netdev,
+                                     const ovs_u128 *ufid)
+{
+    size_t hash = hash_bytes(ufid, sizeof *ufid, 0);
+    struct ufid_to_rte_flow_data *data;
+    struct cmap *map = offload_data_map(netdev);
+
+    CMAP_FOR_EACH_WITH_HASH_PROTECTED (data, node, hash, map) {
+        if (ovs_u128_equals(*ufid, data->ufid)) {
+            return data;
+        }
+    }
+
+    return NULL;
+}
+
 static inline struct ufid_to_rte_flow_data *
 ufid_to_rte_flow_associate(struct netdev *netdev, const ovs_u128 *ufid,
                            struct rte_flow *rte_flow, bool actions_offloaded)
@@ -139,13 +177,15 @@ ufid_to_rte_flow_associate(struct netdev *netdev, const ovs_u128 *ufid,
     struct ufid_to_rte_flow_data *data_prev;
     struct cmap *map = offload_data_map(netdev);
 
+    offload_data_lock(netdev);
+
     /*
      * We should not simply overwrite an existing rte flow.
      * We should have deleted it first before re-adding it.
      * Thus, if following assert triggers, something is wrong:
      * the rte_flow is not destroyed.
      */
-    data_prev = ufid_to_rte_flow_data_find(netdev, ufid);
+    data_prev = ufid_to_rte_flow_data_find_protected(netdev, ufid);
     if (data_prev) {
         ovs_assert(data_prev->rte_flow == NULL);
     }
@@ -155,6 +195,9 @@ ufid_to_rte_flow_associate(struct netdev *netdev, const ovs_u128 *ufid,
     data->actions_offloaded = actions_offloaded;
 
     cmap_insert(map, CONST_CAST(struct cmap_node *, &data->node), hash);
+
+    offload_data_unlock(netdev);
+
     return data;
 }
 
@@ -163,20 +206,25 @@ ufid_to_rte_flow_disassociate(struct netdev *netdev, const ovs_u128 *ufid)
 {
     struct cmap *map = offload_data_map(netdev);
-    size_t hash = hash_bytes(ufid, sizeof *ufid, 0);
     struct ufid_to_rte_flow_data *data;
+    size_t hash;
 
-    CMAP_FOR_EACH_WITH_HASH (data, node, hash, map) {
-        if (ovs_u128_equals(*ufid, data->ufid)) {
-            cmap_remove(map, CONST_CAST(struct cmap_node *, &data->node),
-                        hash);
-            ovsrcu_postpone(free, data);
-            return;
-        }
+    offload_data_lock(netdev);
+
+    data = ufid_to_rte_flow_data_find_protected(netdev, ufid);
+    if (!data) {
+        offload_data_unlock(netdev);
+        VLOG_WARN("ufid "UUID_FMT" is not associated with an rte flow",
+                  UUID_ARGS((struct uuid *) ufid));
+        return;
     }
+    hash = hash_bytes(ufid, sizeof *ufid, 0);
+    cmap_remove(map, CONST_CAST(struct cmap_node *, &data->node),
+                hash);
+
+    offload_data_unlock(netdev);
 
-    VLOG_WARN("ufid "UUID_FMT" is not associated with an rte flow",
-              UUID_ARGS((struct uuid *) ufid));
+    ovsrcu_postpone(free, data);
 }
 
 /*

From patchwork Sat Dec 5 14:22:16 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:16 +0100
Message-Id: <377176c04ff72f29451a15cdf3db8f8a13d13a25.1607177117.git.grive@u256.net>
Subject: [ovs-dev] [RFC PATCH 21/26] dpif-netdev: Use lockless queue to manage offloads

The dataplane threads (PMDs) send offloading commands to a dedicated
offload management thread.  The current implementation uses a lock, and
benchmarks show high contention on the queue in some cases.

Under high contention, the mutex more often leads the locking thread to
yield in wait, using a syscall.  This should be avoided in a userland
dataplane.

The mpsc-queue can be used instead.  It uses fewer cycles and has lower
latency.  Benchmarks show better behavior as multiple revalidators and one
or several PMDs write to a single queue while another thread polls it.

One trade-off with the new scheme, however, is that the offload thread is
forced to poll the queue: without a mutex, a cond_wait cannot be used for
signaling.  The offload thread implements an exponential backoff and
sleeps in short increments when no data is available.  This makes the
thread yield, at the price of some latency to manage offloads after an
inactivity period.
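The backoff schedule can be isolated and checked on its own. This sketch mirrors the constants from the patch (1 ms minimum, 64 ms cap, doubling after each empty poll, reset once work arrives); `next_backoff` and `slept_ms` are illustrative helpers, not OVS functions.

```c
#include <stdint.h>

#define BACKOFF_MIN_MS 1u    /* DP_NETDEV_OFFLOAD_BACKOFF_MIN in the patch */
#define BACKOFF_MAX_MS 64u   /* DP_NETDEV_OFFLOAD_BACKOFF_MAX in the patch */

/* Next sleep duration after one more empty poll: double, capped at the max. */
static uint64_t next_backoff(uint64_t cur_ms)
{
    return cur_ms < BACKOFF_MAX_MS ? cur_ms * 2 : cur_ms;
}

/* Total time the consumer sleeps across 'empty_polls' consecutive empty
 * polls, each poll being followed by one backoff sleep (the patch sleeps
 * via xnanosleep(backoff * 1E6)). */
static uint64_t slept_ms(unsigned int empty_polls)
{
    uint64_t backoff = BACKOFF_MIN_MS;
    uint64_t total = 0;

    while (empty_polls--) {
        total += backoff;           /* sleep, then maybe lengthen it */
        backoff = next_backoff(backoff);
    }
    return total;
}
```

An idle thread thus converges to one wakeup every 64 ms, while resetting to 1 ms as soon as an item is dequeued keeps latency low during bursts.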
Signed-off-by: Gaetan Rivet
---
 lib/dpif-netdev.c | 78 ++++++++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 32 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4eae34893..d0cdb33db 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -52,6 +52,7 @@
 #include "id-pool.h"
 #include "ipf.h"
 #include "mov-avg.h"
+#include "mpsc-queue.h"
 #include "netdev.h"
 #include "netdev-offload.h"
 #include "netdev-provider.h"
@@ -432,23 +433,20 @@ struct dp_offload_thread_item {
     size_t actions_len;
     long long int timestamp;
 
-    struct ovs_list node;
+    struct mpsc_queue_node node;
 };
 
 struct dp_offload_thread {
-    struct ovs_mutex mutex;
-    struct ovs_list list;
-    uint64_t enqueued_item;
+    struct mpsc_queue queue;
+    atomic_uint64_t enqueued_item;
     struct mov_avg_ema ema;
-    pthread_cond_t cond;
 };
 
 #define DP_NETDEV_OFFLOAD_EMA_N (10)
 
 static struct dp_offload_thread dp_offload_thread = {
-    .mutex = OVS_MUTEX_INITIALIZER,
-    .list = OVS_LIST_INITIALIZER(&dp_offload_thread.list),
-    .enqueued_item = 0,
+    .queue = MPSC_QUEUE_INITIALIZER(&dp_offload_thread.queue),
+    .enqueued_item = ATOMIC_VAR_INIT(0),
     .ema = MOV_AVG_EMA_INITIALIZER(DP_NETDEV_OFFLOAD_EMA_N),
 };
 
@@ -2646,11 +2644,8 @@ dp_netdev_flow_offload_unref(struct dp_offload_thread_item *offload)
 static void
 dp_netdev_append_flow_offload(struct dp_offload_thread_item *offload)
 {
-    ovs_mutex_lock(&dp_offload_thread.mutex);
-    ovs_list_push_back(&dp_offload_thread.list, &offload->node);
-    dp_offload_thread.enqueued_item++;
-    xpthread_cond_signal(&dp_offload_thread.cond);
-    ovs_mutex_unlock(&dp_offload_thread.mutex);
+    mpsc_queue_insert(&dp_offload_thread.queue, &offload->node);
+    atomic_count_inc64(&dp_offload_thread.enqueued_item);
 }
 
 static int
@@ -2748,33 +2743,48 @@ err_free:
     return -1;
 }
 
+#define DP_NETDEV_OFFLOAD_BACKOFF_MIN 1
+#define DP_NETDEV_OFFLOAD_BACKOFF_MAX 64
 #define DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US (10 * 1000) /* 10 ms */
 
 static void *
 dp_netdev_flow_offload_main(void *data OVS_UNUSED)
 {
     struct dp_offload_thread_item *offload;
-    struct ovs_list *list;
+    enum mpsc_queue_poll_result poll_result;
+    struct mpsc_queue_node *node;
+    struct mpsc_queue *queue;
     long long int latency_us;
     long long int next_rcu;
     long long int now;
+    uint64_t backoff;
     const char *op;
     int ret;
 
+    queue = &dp_offload_thread.queue;
+    if (!mpsc_queue_acquire(queue)) {
+        VLOG_ERR("failed to register as consumer of the offload queue");
+        return NULL;
+    }
+
+sleep_until_next:
+    backoff = DP_NETDEV_OFFLOAD_BACKOFF_MIN;
+    while ((poll_result = mpsc_queue_poll(queue, &node)) == MPSC_QUEUE_EMPTY) {
+        xnanosleep(backoff * 1E6);
+        if (backoff < DP_NETDEV_OFFLOAD_BACKOFF_MAX) {
+            backoff <<= 1;
+        }
+    }
+
     next_rcu = time_usec() + DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
-    for (;;) {
-        ovs_mutex_lock(&dp_offload_thread.mutex);
-        if (ovs_list_is_empty(&dp_offload_thread.list)) {
-            ovsrcu_quiesce_start();
-            ovs_mutex_cond_wait(&dp_offload_thread.cond,
-                                &dp_offload_thread.mutex);
-            ovsrcu_quiesce_end();
-            next_rcu = time_usec() + DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
-        }
-        list = ovs_list_pop_front(&dp_offload_thread.list);
-        dp_offload_thread.enqueued_item--;
-        offload = CONTAINER_OF(list, struct dp_offload_thread_item, node);
-        ovs_mutex_unlock(&dp_offload_thread.mutex);
+
+    do {
+        while (poll_result == MPSC_QUEUE_RETRY) {
+            poll_result = mpsc_queue_poll(queue, &node);
+        }
+
+        offload = CONTAINER_OF(node, struct dp_offload_thread_item, node);
+        atomic_count_dec64(&dp_offload_thread.enqueued_item);
 
         switch (offload->op) {
         case DP_NETDEV_FLOW_OFFLOAD_OP_ADD:
@@ -2810,7 +2820,11 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED)
                 next_rcu += DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
             }
         }
-    }
+
+        poll_result = mpsc_queue_poll(queue, &node);
+    } while (poll_result != MPSC_QUEUE_EMPTY);
+
+    goto sleep_until_next;
 
     return NULL;
 }
 
@@ -2822,7 +2836,7 @@ queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd,
     struct dp_offload_thread_item *offload;
 
     if
(ovsthread_once_start(&offload_thread_once)) { - xpthread_cond_init(&dp_offload_thread.cond, NULL); + mpsc_queue_init(&dp_offload_thread.queue); ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); ovsthread_once_done(&offload_thread_once); } @@ -2846,7 +2860,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd, } if (ovsthread_once_start(&offload_thread_once)) { - xpthread_cond_init(&dp_offload_thread.cond, NULL); + mpsc_queue_init(&dp_offload_thread.queue); ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); ovsthread_once_done(&offload_thread_once); } @@ -4278,8 +4292,8 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, } ovs_mutex_unlock(&dp->port_mutex); - stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value = - dp_offload_thread.enqueued_item; + atomic_read_relaxed(&dp_offload_thread.enqueued_item, + &stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value); stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED].value = nb_offloads; stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN].value = mov_avg_ema(&dp_offload_thread.ema);

From patchwork Sat Dec 5 14:22:17 2020 X-Patchwork-Id: 1411468
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:17 +0100
Subject:
[ovs-dev] [RFC PATCH 22/26] dpif-netdev: Make megaflow and mark mappings thread objects

In later commits, hardware offloads are managed by several threads. Each offload is managed by a thread determined by its flow's 'mega_ufid'. As the megaflow-to-mark and mark-to-flow mappings are 1:1 and 1:N respectively, a single mark exists for a single 'mega_ufid', and multiple flows use the same 'mega_ufid'. Because the managing thread is chosen from the 'mega_ufid', each mapping does not need to be shared with other offload threads. The mappings are kept as cmaps, as upcalls will sometimes query them before enqueuing orders to the offload threads. To prepare for this change, move the mappings into the offload thread structure.

Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 41 +++++++++++++++++++---------------------- 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d0cdb33db..aeadb0790 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -439,6 +439,8 @@ struct dp_offload_thread_item { struct dp_offload_thread { struct mpsc_queue queue; atomic_uint64_t enqueued_item; + struct cmap megaflow_to_mark; + struct cmap mark_to_flow; struct mov_avg_ema ema; }; @@ -446,6 +448,8 @@ struct dp_offload_thread { static struct dp_offload_thread dp_offload_thread = { .queue = MPSC_QUEUE_INITIALIZER(&dp_offload_thread.queue), + .megaflow_to_mark = CMAP_INITIALIZER, + .mark_to_flow = CMAP_INITIALIZER, .enqueued_item = ATOMIC_VAR_INIT(0), .ema = MOV_AVG_EMA_INITIALIZER(DP_NETDEV_OFFLOAD_EMA_N), }; @@ -2409,16 +2413,7 @@ struct megaflow_to_mark_data { uint32_t mark; }; -struct flow_mark { - struct cmap megaflow_to_mark; - struct cmap mark_to_flow; - struct seq_pool *pool; -}; - -static struct flow_mark
flow_mark = { - .megaflow_to_mark = CMAP_INITIALIZER, - .mark_to_flow = CMAP_INITIALIZER, -}; +static struct seq_pool *flow_mark_pool; static uint32_t flow_mark_alloc(void) @@ -2429,12 +2424,12 @@ flow_mark_alloc(void) if (ovsthread_once_start(&pool_init)) { /* Haven't initiated yet, do it here */ - flow_mark.pool = seq_pool_create(netdev_offload_thread_nb(), + flow_mark_pool = seq_pool_create(netdev_offload_thread_nb(), 1, MAX_FLOW_MARK); ovsthread_once_done(&pool_init); } - if (seq_pool_new_id(flow_mark.pool, tid, &mark)) { + if (seq_pool_new_id(flow_mark_pool, tid, &mark)) { return mark; } @@ -2446,7 +2441,7 @@ flow_mark_free(uint32_t mark) { unsigned int tid = netdev_offload_thread_id(); - seq_pool_free_id(flow_mark.pool, tid, mark); + seq_pool_free_id(flow_mark_pool, tid, mark); } /* associate megaflow with a mark, which is a 1:1 mapping */ @@ -2459,7 +2454,7 @@ megaflow_to_mark_associate(const ovs_u128 *mega_ufid, uint32_t mark) data->mega_ufid = *mega_ufid; data->mark = mark; - cmap_insert(&flow_mark.megaflow_to_mark, + cmap_insert(&dp_offload_thread.megaflow_to_mark, CONST_CAST(struct cmap_node *, &data->node), hash); } @@ -2470,9 +2465,10 @@ megaflow_to_mark_disassociate(const ovs_u128 *mega_ufid) size_t hash = dp_netdev_flow_hash(mega_ufid); struct megaflow_to_mark_data *data; - CMAP_FOR_EACH_WITH_HASH (data, node, hash, &flow_mark.megaflow_to_mark) { + CMAP_FOR_EACH_WITH_HASH (data, node, hash, + &dp_offload_thread.megaflow_to_mark) { if (ovs_u128_equals(*mega_ufid, data->mega_ufid)) { - cmap_remove(&flow_mark.megaflow_to_mark, + cmap_remove(&dp_offload_thread.megaflow_to_mark, CONST_CAST(struct cmap_node *, &data->node), hash); ovsrcu_postpone(free, data); return; @@ -2489,7 +2485,8 @@ megaflow_to_mark_find(const ovs_u128 *mega_ufid) size_t hash = dp_netdev_flow_hash(mega_ufid); struct megaflow_to_mark_data *data; - CMAP_FOR_EACH_WITH_HASH (data, node, hash, &flow_mark.megaflow_to_mark) { + CMAP_FOR_EACH_WITH_HASH (data, node, hash, + 
&dp_offload_thread.megaflow_to_mark) { if (ovs_u128_equals(*mega_ufid, data->mega_ufid)) { return data->mark; } @@ -2506,7 +2503,7 @@ mark_to_flow_associate(const uint32_t mark, struct dp_netdev_flow *flow) { dp_netdev_flow_ref(flow); - cmap_insert(&flow_mark.mark_to_flow, + cmap_insert(&dp_offload_thread.mark_to_flow, CONST_CAST(struct cmap_node *, &flow->mark_node), hash_int(mark, 0)); flow->mark = mark; @@ -2521,7 +2518,7 @@ flow_mark_has_no_ref(uint32_t mark) struct dp_netdev_flow *flow; CMAP_FOR_EACH_WITH_HASH (flow, mark_node, hash_int(mark, 0), - &flow_mark.mark_to_flow) { + &dp_offload_thread.mark_to_flow) { if (flow->mark == mark) { return false; } @@ -2546,7 +2543,7 @@ mark_to_flow_disassociate(struct dp_netdev_pmd_thread *pmd, return EINVAL; } - cmap_remove(&flow_mark.mark_to_flow, mark_node, hash_int(mark, 0)); + cmap_remove(&dp_offload_thread.mark_to_flow, mark_node, hash_int(mark, 0)); flow->mark = INVALID_FLOW_MARK; /* @@ -2583,7 +2580,7 @@ flow_mark_flush(struct dp_netdev_pmd_thread *pmd) { struct dp_netdev_flow *flow; - CMAP_FOR_EACH (flow, mark_node, &flow_mark.mark_to_flow) { + CMAP_FOR_EACH (flow, mark_node, &dp_offload_thread.mark_to_flow) { if (flow->pmd_id == pmd->core_id) { queue_netdev_flow_del(pmd, flow); } @@ -2597,7 +2594,7 @@ mark_to_flow_find(const struct dp_netdev_pmd_thread *pmd, struct dp_netdev_flow *flow; CMAP_FOR_EACH_WITH_HASH (flow, mark_node, hash_int(mark, 0), - &flow_mark.mark_to_flow) { + &dp_offload_thread.mark_to_flow) { if (flow->mark == mark && flow->pmd_id == pmd->core_id && flow->dead == false) { return flow;

From patchwork Sat Dec 5 14:22:18 2020 X-Patchwork-Id: 1411462
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:18 +0100
Subject: [ovs-dev] [RFC PATCH 23/26] dpif-netdev: Revert make datapath port mutex recursive.

This reverts commit 81e89d5c2645c16bad3309f532a6ab1ea41530d8. The initial issue motivating the use of a recursive mutex here is still valid. However, the reverted commit was only a partial workaround. A similar deadlock could still happen, described and avoided in commit 12d0edd75eba ("dpif-netdev: Avoid deadlock with offloading during PMD thread deletion."). Neither commit is a full fix for the deadlocks. One blocking part was the requirement for mutual exclusion of the netdev-offload-dpdk module, which was alleviated by the previous commit 33351d84c115 ("netdev-offload-dpdk: Lock rte_flow map access").
Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index aeadb0790..2443904fa 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1780,7 +1780,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class, ovs_refcount_init(&dp->ref_cnt); atomic_flag_clear(&dp->destroyed); - ovs_mutex_init_recursive(&dp->port_mutex); + ovs_mutex_init(&dp->port_mutex); hmap_init(&dp->ports); dp->port_seq = seq_create(); ovs_mutex_init(&dp->bond_mutex);

From patchwork Sat Dec 5 14:22:19 2020 X-Patchwork-Id: 1411466
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:19 +0100
Message-Id: <42929ac609c2b79e89d96380759915712fad20fe.1607177117.git.grive@u256.net>
Cc: Eli Britstein
Subject: [ovs-dev] [RFC PATCH 24/26] dpif-netdev: Replace port mutex by rw port lock

The port mutex protects the netdev mapping, which can be changed by port addition or port deletion.
Queries and the application of HW flows only need to take the reader lock, while port changes take the writer lock.

Signed-off-by: Gaetan Rivet Signed-off-by: Eli Britstein --- lib/dpif-netdev.c | 123 +++++++++++++++++++------------------- lib/netdev-offload-dpdk.c | 2 +- 2 files changed, 63 insertions(+), 62 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 2443904fa..dedfaae37 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -330,8 +330,8 @@ struct dp_netdev { /* Ports. * * Any lookup into 'ports' or any access to the dp_netdev_ports found - * through 'ports' requires taking 'port_mutex'. */ - struct ovs_mutex port_mutex; + * through 'ports' requires taking 'port_rwlock'. */ + struct ovs_rwlock port_rwlock; struct hmap ports; struct seq *port_seq; /* Incremented whenever a port changes. */ @@ -407,7 +407,7 @@ static void meter_unlock(const struct dp_netdev *dp, uint32_t meter_id) static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp, odp_port_t) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); enum rxq_cycles_counter_type { RXQ_CYCLES_PROC_CURR, /* Cycles spent successfully polling and @@ -828,17 +828,17 @@ struct dpif_netdev { static int get_port_by_number(struct dp_netdev *dp, odp_port_t port_no, struct dp_netdev_port **portp) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); static int get_port_by_name(struct dp_netdev *dp, const char *devname, struct dp_netdev_port **portp) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); static void dp_netdev_free(struct dp_netdev *) OVS_REQUIRES(dp_netdev_mutex); static int do_add_port(struct dp_netdev *dp, const char *devname, const char *type, odp_port_t port_no) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); static void do_del_port(struct dp_netdev *dp, struct dp_netdev_port *) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); static int dpif_netdev_open(const struct dpif_class *, const char *name, bool
create, struct dpif **); static void dp_netdev_execute_actions(struct dp_netdev_pmd_thread *pmd, @@ -859,7 +859,7 @@ static void dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, int numa_id); static void dp_netdev_destroy_pmd(struct dp_netdev_pmd_thread *pmd); static void dp_netdev_set_nonpmd(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); static void *pmd_thread_main(void *); static struct dp_netdev_pmd_thread *dp_netdev_get_pmd(struct dp_netdev *dp, @@ -893,7 +893,7 @@ static void dp_netdev_del_bond_tx_from_pmd(struct dp_netdev_pmd_thread *pmd, OVS_EXCLUDED(pmd->bond_mutex); static void reconfigure_datapath(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex); + OVS_REQ_WRLOCK(dp->port_rwlock); static bool dp_netdev_pmd_try_ref(struct dp_netdev_pmd_thread *pmd); static void dp_netdev_pmd_unref(struct dp_netdev_pmd_thread *pmd); static void dp_netdev_pmd_flow_flush(struct dp_netdev_pmd_thread *pmd); @@ -1400,7 +1400,7 @@ dpif_netdev_subtable_lookup_set(struct unixctl_conn *conn, int argc, sorted_poll_thread_list(dp, &pmd_list, &n); /* take port mutex as HMAP iters over them. */ - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_rdlock(&dp->port_rwlock); for (size_t i = 0; i < n; i++) { struct dp_netdev_pmd_thread *pmd = pmd_list[i]; @@ -1424,7 +1424,7 @@ dpif_netdev_subtable_lookup_set(struct unixctl_conn *conn, int argc, } /* release port mutex before netdev mutex. */ - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); ovs_mutex_unlock(&dp_netdev_mutex); struct ds reply = DS_EMPTY_INITIALIZER; @@ -1717,7 +1717,7 @@ create_dpif_netdev(struct dp_netdev *dp) * Return ODPP_NONE on failure. 
*/ static odp_port_t choose_port(struct dp_netdev *dp, const char *name) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { uint32_t port_no; @@ -1780,7 +1780,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class, ovs_refcount_init(&dp->ref_cnt); atomic_flag_clear(&dp->destroyed); - ovs_mutex_init(&dp->port_mutex); + ovs_rwlock_init(&dp->port_rwlock); hmap_init(&dp->ports); dp->port_seq = seq_create(); ovs_mutex_init(&dp->bond_mutex); @@ -1815,7 +1815,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class, ovs_mutex_init_recursive(&dp->non_pmd_mutex); ovsthread_key_create(&dp->per_pmd_key, NULL); - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); /* non-PMD will be created before all other threads and will * allocate static_tx_qid = 0. */ dp_netdev_set_nonpmd(dp); @@ -1823,7 +1823,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class, error = do_add_port(dp, name, dpif_netdev_port_open_type(dp->class, "internal"), ODPP_LOCAL); - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); if (error) { dp_netdev_free(dp); return error; @@ -1909,11 +1909,11 @@ dp_netdev_free(struct dp_netdev *dp) shash_find_and_delete(&dp_netdevs, dp->name); - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); HMAP_FOR_EACH_SAFE (port, next, node, &dp->ports) { do_del_port(dp, port); } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); ovs_mutex_lock(&dp->bond_mutex); CMAP_FOR_EACH (bond, node, &dp->tx_bonds) { @@ -1938,7 +1938,7 @@ dp_netdev_free(struct dp_netdev *dp) seq_destroy(dp->port_seq); hmap_destroy(&dp->ports); - ovs_mutex_destroy(&dp->port_mutex); + ovs_rwlock_destroy(&dp->port_rwlock); cmap_destroy(&dp->tx_bonds); ovs_mutex_destroy(&dp->bond_mutex); @@ -2106,7 +2106,7 @@ out: static int do_add_port(struct dp_netdev *dp, const char *devname, const char *type, odp_port_t port_no) - OVS_REQUIRES(dp->port_mutex) + 
OVS_REQ_WRLOCK(dp->port_rwlock) { struct netdev_saved_flags *sf; struct dp_netdev_port *port; @@ -2158,7 +2158,7 @@ dpif_netdev_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t port_no; int error; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); dpif_port = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf); if (*port_nop != ODPP_NONE) { port_no = *port_nop; @@ -2171,7 +2171,7 @@ dpif_netdev_port_add(struct dpif *dpif, struct netdev *netdev, *port_nop = port_no; error = do_add_port(dp, dpif_port, netdev_get_type(netdev), port_no); } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); return error; } @@ -2182,7 +2182,7 @@ dpif_netdev_port_del(struct dpif *dpif, odp_port_t port_no) struct dp_netdev *dp = get_dp_netdev(dpif); int error; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); if (port_no == ODPP_LOCAL) { error = EINVAL; } else { @@ -2193,7 +2193,7 @@ dpif_netdev_port_del(struct dpif *dpif, odp_port_t port_no) do_del_port(dp, port); } } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); return error; } @@ -2206,7 +2206,7 @@ is_valid_port_number(odp_port_t port_no) static struct dp_netdev_port * dp_netdev_lookup_port(const struct dp_netdev *dp, odp_port_t port_no) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_port *port; @@ -2221,7 +2221,7 @@ dp_netdev_lookup_port(const struct dp_netdev *dp, odp_port_t port_no) static int get_port_by_number(struct dp_netdev *dp, odp_port_t port_no, struct dp_netdev_port **portp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { if (!is_valid_port_number(port_no)) { *portp = NULL; @@ -2256,7 +2256,7 @@ port_destroy(struct dp_netdev_port *port) static int get_port_by_name(struct dp_netdev *dp, const char *devname, struct dp_netdev_port **portp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_port *port; @@ -2275,7 +2275,7 
@@ get_port_by_name(struct dp_netdev *dp, /* Returns 'true' if there is a port with pmd netdev. */ static bool has_pmd_port(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_port *port; @@ -2290,7 +2290,7 @@ has_pmd_port(struct dp_netdev *dp) static void do_del_port(struct dp_netdev *dp, struct dp_netdev_port *port) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { hmap_remove(&dp->ports, &port->node); seq_change(dp->port_seq); @@ -2317,12 +2317,12 @@ dpif_netdev_port_query_by_number(const struct dpif *dpif, odp_port_t port_no, struct dp_netdev_port *port; int error; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); error = get_port_by_number(dp, port_no, &port); if (!error && dpif_port) { answer_port_query(port, dpif_port); } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); return error; } @@ -2335,12 +2335,12 @@ dpif_netdev_port_query_by_name(const struct dpif *dpif, const char *devname, struct dp_netdev_port *port; int error; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); error = get_port_by_name(dp, devname, &port); if (!error && dpif_port) { answer_port_query(port, dpif_port); } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); return error; } @@ -2558,9 +2558,9 @@ mark_to_flow_disassociate(struct dp_netdev_pmd_thread *pmd, if (port) { /* Taking a global 'port_mutex' to fulfill thread safety * restrictions regarding netdev port mapping. */ - ovs_mutex_lock(&pmd->dp->port_mutex); + ovs_rwlock_rdlock(&pmd->dp->port_rwlock); ret = netdev_flow_del(port, &flow->mega_ufid, NULL); - ovs_mutex_unlock(&pmd->dp->port_mutex); + ovs_rwlock_unlock(&pmd->dp->port_rwlock); netdev_close(port); } @@ -2713,12 +2713,12 @@ dp_netdev_flow_offload_put(struct dp_offload_thread_item *offload) } /* Taking a global 'port_mutex' to fulfill thread safety * restrictions regarding the netdev port mapping. 
*/ - ovs_mutex_lock(&pmd->dp->port_mutex); + ovs_rwlock_rdlock(&pmd->dp->port_rwlock); ret = netdev_flow_put(port, &offload->match, CONST_CAST(struct nlattr *, offload->actions), offload->actions_len, &flow->mega_ufid, &info, NULL); - ovs_mutex_unlock(&pmd->dp->port_mutex); + ovs_rwlock_unlock(&pmd->dp->port_rwlock); netdev_close(port); if (ret) { @@ -2944,7 +2944,7 @@ dpif_netdev_port_dump_next(const struct dpif *dpif, void *state_, struct hmap_node *node; int retval; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_rdlock(&dp->port_rwlock); node = hmap_at_position(&dp->ports, &state->position); if (node) { struct dp_netdev_port *port; @@ -2961,7 +2961,7 @@ dpif_netdev_port_dump_next(const struct dpif *dpif, void *state_, } else { retval = EOF; } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); return retval; } @@ -3412,24 +3412,24 @@ dpif_netdev_get_flow_offload_status(const struct dp_netdev *dp, return false; } ofpbuf_use_stack(&buf, &act_buf, sizeof act_buf); - /* Taking a global 'port_mutex' to fulfill thread safety + /* Taking a global 'port_rwlock' to fulfill thread safety * restrictions regarding netdev port mapping. * * XXX: Main thread will try to pause/stop all revalidators during datapath * reconfiguration via datapath purge callback (dp_purge_cb) while - * holding 'dp->port_mutex'. So we're not waiting for mutex here. - * Otherwise, deadlock is possible, bcause revalidators might sleep + * rw-holding 'dp->port_rwlock'. So we're not waiting for lock here. + * Otherwise, deadlock is possible, because revalidators might sleep * waiting for the main thread to release the lock and main thread * will wait for them to stop processing. * This workaround might make statistics less accurate. Especially * for flow deletion case, since there will be no other attempt. 
*/ - if (!ovs_mutex_trylock(&dp->port_mutex)) { + if (!ovs_rwlock_tryrdlock(&dp->port_rwlock)) { ret = netdev_flow_get(netdev, &match, &actions, &netdev_flow->mega_ufid, stats, attrs, &buf); /* Storing statistics and attributes from the last request for * later use on mutex contention. */ dp_netdev_flow_set_last_stats_attrs(netdev_flow, stats, attrs, ret); - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); } else { dp_netdev_flow_get_last_stats_attrs(netdev_flow, stats, attrs, &ret); if (!ret && !attrs->dp_layer) { @@ -4278,7 +4278,7 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, nb_offloads = 0; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_rdlock(&dp->port_rwlock); HMAP_FOR_EACH (port, node, &dp->ports) { uint64_t port_nb_offloads = 0; @@ -4287,7 +4287,7 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, nb_offloads += port_nb_offloads; } } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); atomic_read_relaxed(&dp_offload_thread.enqueued_item, &stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value); @@ -4548,7 +4548,7 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t port_no, const char *affinity_list = smap_get(cfg, "pmd-rxq-affinity"); bool emc_enabled = smap_get_bool(cfg, "emc-enable", true); - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); error = get_port_by_number(dp, port_no, &port); if (error) { goto unlock; @@ -4602,7 +4602,7 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t port_no, dp_netdev_request_reconfigure(dp); unlock: - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); return error; } @@ -5086,7 +5086,8 @@ compare_rxq_cycles(const void *a, const void *b) * The function doesn't touch the pmd threads, it just stores the assignment * in the 'pmd' member of each rxq. 
*/ static void -rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex) +rxq_scheduling(struct dp_netdev *dp, bool pinned) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_port *port; struct rr_numa_list rr; @@ -5230,7 +5231,7 @@ reload_affected_pmds(struct dp_netdev *dp) static void reconfigure_pmd_threads(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_pmd_thread *pmd; struct ovs_numa_dump *pmd_cores; @@ -5328,7 +5329,7 @@ static void pmd_remove_stale_ports(struct dp_netdev *dp, struct dp_netdev_pmd_thread *pmd) OVS_EXCLUDED(pmd->port_mutex) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct rxq_poll *poll, *poll_next; struct tx_port *tx, *tx_next; @@ -5358,7 +5359,7 @@ pmd_remove_stale_ports(struct dp_netdev *dp, * rxqs and assigns all rxqs/txqs to pmd threads. */ static void reconfigure_datapath(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct hmapx busy_threads = HMAPX_INITIALIZER(&busy_threads); struct dp_netdev_pmd_thread *pmd; @@ -5542,7 +5543,7 @@ reconfigure_datapath(struct dp_netdev *dp) /* Returns true if one of the netdevs in 'dp' requires a reconfiguration */ static bool ports_require_restart(const struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_port *port; @@ -5593,7 +5594,7 @@ variance(uint64_t a[], int n) static bool get_dry_run_variance(struct dp_netdev *dp, uint32_t *core_list, uint32_t num_pmds, uint64_t *predicted_variance) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_port *port; struct dp_netdev_pmd_thread *pmd; @@ -5709,7 +5710,7 @@ cleanup: * better distribution of load on PMDs. 
*/ static bool pmd_rebalance_dry_run(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_pmd_thread *pmd; uint64_t *curr_pmd_usage; @@ -5804,7 +5805,7 @@ dpif_netdev_run(struct dpif *dpif) long long int now = time_msec(); struct dp_netdev_pmd_thread *pmd; - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID); if (non_pmd) { ovs_mutex_lock(&dp->non_pmd_mutex); @@ -5876,7 +5877,7 @@ dpif_netdev_run(struct dpif *dpif) if (dp_netdev_is_reconf_required(dp) || ports_require_restart(dp)) { reconfigure_datapath(dp); } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); tnl_neigh_cache_run(); tnl_port_map_run(); @@ -5896,7 +5897,7 @@ dpif_netdev_wait(struct dpif *dpif) struct dp_netdev *dp = get_dp_netdev(dpif); ovs_mutex_lock(&dp_netdev_mutex); - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); HMAP_FOR_EACH (port, node, &dp->ports) { netdev_wait_reconf_required(port->netdev); if (!netdev_is_pmd(port->netdev)) { @@ -5907,7 +5908,7 @@ dpif_netdev_wait(struct dpif *dpif) } } } - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); ovs_mutex_unlock(&dp_netdev_mutex); seq_wait(tnl_conf_seq, dp->last_tnl_conf_seq); } @@ -6524,7 +6525,7 @@ dp_netdev_get_pmd(struct dp_netdev *dp, unsigned core_id) /* Sets the 'struct dp_netdev_pmd_thread' for non-pmd threads. 
*/ static void dp_netdev_set_nonpmd(struct dp_netdev *dp) - OVS_REQUIRES(dp->port_mutex) + OVS_REQ_WRLOCK(dp->port_rwlock) { struct dp_netdev_pmd_thread *non_pmd; @@ -8587,7 +8588,7 @@ dpif_dummy_change_port_number(struct unixctl_conn *conn, int argc OVS_UNUSED, ovs_refcount_ref(&dp->ref_cnt); ovs_mutex_unlock(&dp_netdev_mutex); - ovs_mutex_lock(&dp->port_mutex); + ovs_rwlock_wrlock(&dp->port_rwlock); if (get_port_by_name(dp, argv[2], &port)) { unixctl_command_reply_error(conn, "unknown port"); goto exit; @@ -8616,7 +8617,7 @@ dpif_dummy_change_port_number(struct unixctl_conn *conn, int argc OVS_UNUSED, unixctl_command_reply(conn, NULL); exit: - ovs_mutex_unlock(&dp->port_mutex); + ovs_rwlock_unlock(&dp->port_rwlock); dp_netdev_unref(dp); } diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index 5bc67254c..6a1e90e62 100644 --- a/lib/netdev-offload-dpdk.c +++ b/lib/netdev-offload-dpdk.c @@ -44,7 +44,7 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(100, 5); * 'netdev' is forbidden. * * For current implementation all above restrictions could be fulfilled by - * taking the datapath 'port_mutex' in lib/dpif-netdev.c. */ + * taking the datapath 'port_rwlock' in lib/dpif-netdev.c. */ /* * A mapping from ufid to dpdk rte_flow. 
From patchwork Sat Dec 5 14:22:20 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:20 +0100
Subject: [ovs-dev] [RFC PATCH 25/26] dpif-netdev: Use one or more offload threads

Read the user configuration in the netdev-offload module to set the number of threads used to manage hardware offload requests. This allows insertion, deletion, and modification requests to be processed concurrently. The offload thread structure is extended to hold all per-thread state; one instance is allocated per requested thread, and each instance is used independently.
Signed-off-by: Gaetan Rivet --- lib/dpif-netdev.c | 210 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 145 insertions(+), 65 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index dedfaae37..f10478f79 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -437,25 +437,48 @@ struct dp_offload_thread_item { }; struct dp_offload_thread { - struct mpsc_queue queue; - atomic_uint64_t enqueued_item; - struct cmap megaflow_to_mark; - struct cmap mark_to_flow; - struct mov_avg_ema ema; + PADDED_MEMBERS(CACHE_LINE_SIZE, + struct mpsc_queue queue; + atomic_uint64_t enqueued_item; + struct cmap megaflow_to_mark; + struct cmap mark_to_flow; + struct mov_avg_ema ema; + ); }; +static struct dp_offload_thread *dp_offload_threads; +static void *dp_netdev_flow_offload_main(void *arg); + #define DP_NETDEV_OFFLOAD_EMA_N (10) -static struct dp_offload_thread dp_offload_thread = { - .queue = MPSC_QUEUE_INITIALIZER(&dp_offload_thread.queue), - .megaflow_to_mark = CMAP_INITIALIZER, - .mark_to_flow = CMAP_INITIALIZER, - .enqueued_item = ATOMIC_VAR_INIT(0), - .ema = MOV_AVG_EMA_INITIALIZER(DP_NETDEV_OFFLOAD_EMA_N), -}; +static void +dp_netdev_offload_init(void) +{ + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; + unsigned int nb_offload_thread = netdev_offload_thread_nb(); + unsigned int tid; + + if (!ovsthread_once_start(&once)) { + return; + } + + dp_offload_threads = xcalloc(nb_offload_thread, + sizeof *dp_offload_threads); + + for (tid = 0; tid < nb_offload_thread; tid++) { + struct dp_offload_thread *thread; -static struct ovsthread_once offload_thread_once - = OVSTHREAD_ONCE_INITIALIZER; + thread = &dp_offload_threads[tid]; + mpsc_queue_init(&thread->queue); + cmap_init(&thread->megaflow_to_mark); + cmap_init(&thread->mark_to_flow); + atomic_init(&thread->enqueued_item, 0); + mov_avg_ema_init(&thread->ema, DP_NETDEV_OFFLOAD_EMA_N); + ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, thread); + } + + 
ovsthread_once_done(&once); +} #define XPS_TIMEOUT 500000LL /* In microseconds. */ @@ -2450,11 +2473,12 @@ megaflow_to_mark_associate(const ovs_u128 *mega_ufid, uint32_t mark) { size_t hash = dp_netdev_flow_hash(mega_ufid); struct megaflow_to_mark_data *data = xzalloc(sizeof(*data)); + unsigned int tid = netdev_offload_thread_id(); data->mega_ufid = *mega_ufid; data->mark = mark; - cmap_insert(&dp_offload_thread.megaflow_to_mark, + cmap_insert(&dp_offload_threads[tid].megaflow_to_mark, CONST_CAST(struct cmap_node *, &data->node), hash); } @@ -2464,11 +2488,12 @@ megaflow_to_mark_disassociate(const ovs_u128 *mega_ufid) { size_t hash = dp_netdev_flow_hash(mega_ufid); struct megaflow_to_mark_data *data; + unsigned int tid = netdev_offload_thread_id(); CMAP_FOR_EACH_WITH_HASH (data, node, hash, - &dp_offload_thread.megaflow_to_mark) { + &dp_offload_threads[tid].megaflow_to_mark) { if (ovs_u128_equals(*mega_ufid, data->mega_ufid)) { - cmap_remove(&dp_offload_thread.megaflow_to_mark, + cmap_remove(&dp_offload_threads[tid].megaflow_to_mark, CONST_CAST(struct cmap_node *, &data->node), hash); ovsrcu_postpone(free, data); return; @@ -2484,9 +2509,10 @@ megaflow_to_mark_find(const ovs_u128 *mega_ufid) { size_t hash = dp_netdev_flow_hash(mega_ufid); struct megaflow_to_mark_data *data; + unsigned int tid = netdev_offload_thread_id(); CMAP_FOR_EACH_WITH_HASH (data, node, hash, - &dp_offload_thread.megaflow_to_mark) { + &dp_offload_threads[tid].megaflow_to_mark) { if (ovs_u128_equals(*mega_ufid, data->mega_ufid)) { return data->mark; } @@ -2501,9 +2527,10 @@ megaflow_to_mark_find(const ovs_u128 *mega_ufid) static void mark_to_flow_associate(const uint32_t mark, struct dp_netdev_flow *flow) { + unsigned int tid = netdev_offload_thread_id(); dp_netdev_flow_ref(flow); - cmap_insert(&dp_offload_thread.mark_to_flow, + cmap_insert(&dp_offload_threads[tid].mark_to_flow, CONST_CAST(struct cmap_node *, &flow->mark_node), hash_int(mark, 0)); flow->mark = mark; @@ -2515,10 +2542,11 @@ 
mark_to_flow_associate(const uint32_t mark, struct dp_netdev_flow *flow) static bool flow_mark_has_no_ref(uint32_t mark) { + unsigned int tid = netdev_offload_thread_id(); struct dp_netdev_flow *flow; CMAP_FOR_EACH_WITH_HASH (flow, mark_node, hash_int(mark, 0), - &dp_offload_thread.mark_to_flow) { + &dp_offload_threads[tid].mark_to_flow) { if (flow->mark == mark) { return false; } @@ -2534,6 +2562,7 @@ mark_to_flow_disassociate(struct dp_netdev_pmd_thread *pmd, const char *dpif_type_str = dpif_normalize_type(pmd->dp->class->type); struct cmap_node *mark_node = CONST_CAST(struct cmap_node *, &flow->mark_node); + unsigned int tid = netdev_offload_thread_id(); uint32_t mark = flow->mark; int ret = 0; @@ -2543,7 +2572,8 @@ mark_to_flow_disassociate(struct dp_netdev_pmd_thread *pmd, return EINVAL; } - cmap_remove(&dp_offload_thread.mark_to_flow, mark_node, hash_int(mark, 0)); + cmap_remove(&dp_offload_threads[tid].mark_to_flow, + mark_node, hash_int(mark, 0)); flow->mark = INVALID_FLOW_MARK; /* @@ -2579,10 +2609,18 @@ static void flow_mark_flush(struct dp_netdev_pmd_thread *pmd) { struct dp_netdev_flow *flow; + unsigned int tid; - CMAP_FOR_EACH (flow, mark_node, &dp_offload_thread.mark_to_flow) { - if (flow->pmd_id == pmd->core_id) { - queue_netdev_flow_del(pmd, flow); + if (dp_offload_threads == NULL) { + return; + } + + for (tid = 0; tid < netdev_offload_thread_nb(); tid++) { + CMAP_FOR_EACH (flow, mark_node, + &dp_offload_threads[tid].mark_to_flow) { + if (flow->pmd_id == pmd->core_id) { + queue_netdev_flow_del(pmd, flow); + } } } } @@ -2592,12 +2630,21 @@ mark_to_flow_find(const struct dp_netdev_pmd_thread *pmd, const uint32_t mark) { struct dp_netdev_flow *flow; + unsigned int tid; + size_t hash; - CMAP_FOR_EACH_WITH_HASH (flow, mark_node, hash_int(mark, 0), - &dp_offload_thread.mark_to_flow) { - if (flow->mark == mark && flow->pmd_id == pmd->core_id && - flow->dead == false) { - return flow; + if (dp_offload_threads == NULL) { + return NULL; + } + + hash = 
hash_int(mark, 0); + for (tid = 0; tid < netdev_offload_thread_nb(); tid++) { + CMAP_FOR_EACH_WITH_HASH (flow, mark_node, hash, + &dp_offload_threads[tid].mark_to_flow) { + if (flow->mark == mark && flow->pmd_id == pmd->core_id && + flow->dead == false) { + return flow; + } } } @@ -2641,8 +2688,13 @@ dp_netdev_flow_offload_unref(struct dp_offload_thread_item *offload) static void dp_netdev_append_flow_offload(struct dp_offload_thread_item *offload) { - mpsc_queue_insert(&dp_offload_thread.queue, &offload->node); - atomic_count_inc64(&dp_offload_thread.enqueued_item); + unsigned int i; + + dp_netdev_offload_init(); + + i = netdev_offload_ufid_to_thread_id(offload->flow->mega_ufid); + mpsc_queue_insert(&dp_offload_threads[i].queue, &offload->node); + atomic_count_inc64(&dp_offload_threads[i].enqueued_item); } static int @@ -2745,8 +2797,9 @@ err_free: #define DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US (10 * 1000) /* 10 ms */ static void * -dp_netdev_flow_offload_main(void *data OVS_UNUSED) +dp_netdev_flow_offload_main(void *arg) { + struct dp_offload_thread *ofl_thread = arg; struct dp_offload_thread_item *offload; enum mpsc_queue_poll_result poll_result; struct mpsc_queue_node *node; @@ -2758,7 +2811,7 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED) const char *op; int ret; - queue = &dp_offload_thread.queue; + queue = &ofl_thread->queue; if (!mpsc_queue_acquire(queue)) { VLOG_ERR("failed to register as consumer of the offload queue"); return NULL; @@ -2781,7 +2834,7 @@ sleep_until_next: } offload = CONTAINER_OF(node, struct dp_offload_thread_item, node); - atomic_count_dec64(&dp_offload_thread.enqueued_item); + atomic_count_dec64(&ofl_thread->enqueued_item); switch (offload->op) { case DP_NETDEV_FLOW_OFFLOAD_OP_ADD: @@ -2803,7 +2856,7 @@ sleep_until_next: now = time_usec(); latency_us = now - offload->timestamp; - mov_avg_ema_update(&dp_offload_thread.ema, latency_us); + mov_avg_ema_update(&ofl_thread->ema, latency_us); VLOG_DBG("%s to %s netdev flow "UUID_FMT, 
ret == 0 ? "succeed" : "failed", op, @@ -2832,12 +2885,6 @@ queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd, { struct dp_offload_thread_item *offload; - if (ovsthread_once_start(&offload_thread_once)) { - mpsc_queue_init(&dp_offload_thread.queue); - ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); - ovsthread_once_done(&offload_thread_once); - } - offload = dp_netdev_alloc_flow_offload(pmd, flow, DP_NETDEV_FLOW_OFFLOAD_OP_DEL); offload->timestamp = pmd->ctx.now; @@ -2856,12 +2903,6 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd, return; } - if (ovsthread_once_start(&offload_thread_once)) { - mpsc_queue_init(&dp_offload_thread.queue); - ovs_thread_create("hw_offload", dp_netdev_flow_offload_main, NULL); - ovsthread_once_done(&offload_thread_once); - } - if (flow->mark != INVALID_FLOW_MARK) { op = DP_NETDEV_FLOW_OFFLOAD_OP_MOD; } else { @@ -4259,45 +4300,84 @@ dpif_netdev_offload_stats_get(struct dpif *dpif, DP_NETDEV_HW_OFFLOADS_STATS_INSERTED, DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN, }; - const char *names[] = { - [DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED] = " Enqueued offloads", - [DP_NETDEV_HW_OFFLOADS_STATS_INSERTED] = " Inserted offloads", - [DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN] = " Average latency (us)", + struct { + const char *name; + uint64_t total; + } hwol_stats[] = { + [DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED] = + { " Enqueued offloads", 0 }, + [DP_NETDEV_HW_OFFLOADS_STATS_INSERTED] = + { " Inserted offloads", 0 }, + [DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN] = + { " Average latency (us)", 0 }, }; struct dp_netdev *dp = get_dp_netdev(dpif); struct dp_netdev_port *port; - uint64_t nb_offloads; + unsigned int nb_thread; + uint64_t *port_nb_offloads; + uint64_t *nb_offloads; + unsigned int tid; size_t i; if (!netdev_is_flow_api_enabled()) { return EINVAL; } - stats->size = ARRAY_SIZE(names); + nb_thread = netdev_offload_thread_nb(); + /* nb_thread counters for the overall total as well. 
*/ + stats->size = ARRAY_SIZE(hwol_stats) * (nb_thread + 1); stats->counters = xcalloc(stats->size, sizeof *stats->counters); - nb_offloads = 0; + nb_offloads = xcalloc(nb_thread, sizeof *nb_offloads); + port_nb_offloads = xcalloc(nb_thread, sizeof *port_nb_offloads); ovs_rwlock_rdlock(&dp->port_rwlock); HMAP_FOR_EACH (port, node, &dp->ports) { - uint64_t port_nb_offloads = 0; - + memset(port_nb_offloads, 0, nb_thread * sizeof *port_nb_offloads); /* Do not abort on read error from a port, just report 0. */ - if (!netdev_hw_offload_stats_get(port->netdev, &port_nb_offloads)) { - nb_offloads += port_nb_offloads; + if (!netdev_hw_offload_stats_get(port->netdev, port_nb_offloads)) { + for (i = 0; i < nb_thread; i++) { + nb_offloads[i] += port_nb_offloads[i]; + } } } ovs_rwlock_unlock(&dp->port_rwlock); - atomic_read_relaxed(&dp_offload_thread.enqueued_item, - &stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value); - stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED].value = nb_offloads; - stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN].value = - mov_avg_ema(&dp_offload_thread.ema); + free(port_nb_offloads); + + for (tid = 0; tid < nb_thread; tid++) { + uint64_t counts[ARRAY_SIZE(hwol_stats)]; + size_t idx = ((tid + 1) * ARRAY_SIZE(hwol_stats)); + + memset(counts, 0, sizeof counts); + counts[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED] = nb_offloads[tid]; + if (dp_offload_threads != NULL) { + atomic_read_relaxed(&dp_offload_threads[tid].enqueued_item, + &counts[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED]); + + counts[DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN] = + mov_avg_ema(&dp_offload_threads[tid].ema); + + } + + for (i = 0; i < ARRAY_SIZE(hwol_stats); i++) { + snprintf(stats->counters[idx + i].name, + sizeof(stats->counters[idx + i].name), + " [%3u] %s", tid, hwol_stats[i].name); + stats->counters[idx + i].value = counts[i]; + hwol_stats[i].total += counts[i]; + } + } + + free(nb_offloads); + + /* Do an average of the average for the aggregate. 
*/ + hwol_stats[DP_NETDEV_HW_OFFLOADS_STATS_LATENCY_MEAN].total /= nb_thread; - for (i = 0; i < ARRAY_SIZE(names); i++) { + for (i = 0; i < ARRAY_SIZE(hwol_stats); i++) { snprintf(stats->counters[i].name, sizeof(stats->counters[i].name), - "%s", names[i]); + " Total %s", hwol_stats[i].name); + stats->counters[i].value = hwol_stats[i].total; } return 0;

From patchwork Sat Dec 5 14:22:21 2020
From: Gaetan Rivet
To: dev@openvswitch.org
Date: Sat, 5 Dec 2020 15:22:21 +0100
Subject: [ovs-dev] [RFC PATCH 26/26] netdev-dpdk: remove rte-flow API access locks

The rte_flow DPDK API is now thread-safe.
Remove the locks protecting access to it.

Signed-off-by: Gaetan Rivet
---
 lib/netdev-dpdk.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2640a421a..296227a7d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -5249,9 +5249,7 @@ netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int ret;
 
-    ovs_mutex_lock(&dev->mutex);
     ret = rte_flow_destroy(dev->port_id, rte_flow, error);
-    ovs_mutex_unlock(&dev->mutex);
 
     return ret;
 }
@@ -5265,9 +5263,7 @@ netdev_dpdk_rte_flow_create(struct netdev *netdev,
     struct rte_flow *flow;
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 
-    ovs_mutex_lock(&dev->mutex);
     flow = rte_flow_create(dev->port_id, attr, items, actions, error);
-    ovs_mutex_unlock(&dev->mutex);
 
     return flow;
 }