From patchwork Thu Jul 6 10:01:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han Zhou X-Patchwork-Id: 1804236 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QxXBj47Nnz20ZQ for ; Thu, 6 Jul 2023 20:02:21 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id B7C064175E; Thu, 6 Jul 2023 10:02:17 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org B7C064175E X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FoW35uHJbZfe; Thu, 6 Jul 2023 10:02:15 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp4.osuosl.org (Postfix) with ESMTPS id 0F34D41747; Thu, 6 Jul 2023 10:02:13 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 0F34D41747 Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id BFAF3C008D; Thu, 6 Jul 2023 10:02:13 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 59386C0DD3 for ; Thu, 6 Jul 2023 10:02:12 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 2976781FE8 for ; Thu, 6 Jul 2023 10:02:12 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 2976781FE8 X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EhNYogdeDc60 for ; Thu, 6 Jul 2023 10:02:11 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org E0B0F81FCA Received: from relay8-d.mail.gandi.net (relay8-d.mail.gandi.net [217.70.183.201]) by smtp1.osuosl.org (Postfix) with ESMTPS id E0B0F81FCA for ; Thu, 6 Jul 2023 10:02:10 +0000 (UTC) X-GND-Sasl: hzhou@ovn.org X-GND-Sasl: hzhou@ovn.org X-GND-Sasl: hzhou@ovn.org X-GND-Sasl: hzhou@ovn.org Received: by mail.gandi.net (Postfix) with ESMTPSA id 27EE01BF20E; Thu, 6 Jul 2023 10:02:06 +0000 (UTC) From: Han Zhou To: dev@openvswitch.org Date: Thu, 6 Jul 2023 03:01:40 -0700 Message-Id: <20230706100140.699587-3-hzhou@ovn.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230706100140.699587-1-hzhou@ovn.org> References: <20230706100140.699587-1-hzhou@ovn.org> MIME-Version: 1.0 Cc: Dumitru Ceara Subject: [ovs-dev] [PATCH ovn 3/3] ovn-northd: Avoid recompute caused by in-flight transactions. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Although each individual VIF port related changes are handled incrementally, it still triggers recompute if there are in-flight transactions (either to NB or SB) when a change comes, which is very common in real production environment if change happens frequently. It is also easy to hit such situation in test cases where nb_cfg mechanism is heavily used, which makes it difficult to write reliable and stable tests, such as what the commit 8c30ba1386 was trying to work around. This patch skips the I-P engine execution until the NB & SB transaction handles are available (no in-flight transactions), and when skippiing the runs it keeps the tracked changes in IDL across main loop iterations. This way we avoid recompute without worrying about missing any changes. Signed-off-by: Han Zhou Acked-by: Numan Siddique --- northd/inc-proc-northd.c | 6 +-- northd/ovn-northd.c | 22 +++++---- tests/ovn-northd.at | 104 +++++++++++++++++++-------------------- 3 files changed, 66 insertions(+), 66 deletions(-) diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c index 19fc67795643..d328deb222e6 100644 --- a/northd/inc-proc-northd.c +++ b/northd/inc-proc-northd.c @@ -296,6 +296,7 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb, bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn, struct ovsdb_idl_txn *ovnsb_txn, bool recompute) { + ovs_assert(ovnnb_txn && ovnsb_txn); engine_init_run(); /* Force a full recompute if instructed to, for example, after a NB/SB @@ -312,10 +313,7 @@ bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn, }; engine_set_context(&eng_ctx); - - if (ovnnb_txn && ovnsb_txn) { - engine_run(true); - } + engine_run(true); if (!engine_has_run()) { if (engine_need_run()) { diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 3e0848e41b8e..4fa1b039ea32 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -881,6 +881,7 @@ main(int argc, char *argv[]) simap_destroy(&usage); } + bool clear_idl_track = true; if (!state.paused) { if (!ovsdb_idl_has_lock(ovnsb_idl_loop.idl) && !ovsdb_idl_is_lock_contended(ovnsb_idl_loop.idl)) @@ -930,25 +931,26 @@ main(int argc, char *argv[]) } if (ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) { - int64_t loop_start_time = time_wall_msec(); - bool activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn, - recompute); - recompute = false; - if (ovnsb_txn) { + bool activity = false; + if (ovnnb_txn && ovnsb_txn) { + int64_t loop_start_time = time_wall_msec(); + activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn, + recompute); + recompute = false; check_and_add_supported_dhcp_opts_to_sb_db( ovnsb_txn, ovnsb_idl_loop.idl); check_and_add_supported_dhcpv6_opts_to_sb_db( ovnsb_txn, ovnsb_idl_loop.idl); check_and_update_rbac( ovnsb_txn, ovnsb_idl_loop.idl); - } - if (ovnnb_txn && ovnsb_txn) { update_sequence_numbers(loop_start_time, ovnnb_idl_loop.idl, ovnsb_idl_loop.idl, ovnnb_txn, ovnsb_txn, &ovnsb_idl_loop); + } else if (!recompute) { + clear_idl_track = false; } /* If there are any errors, we force a full recompute in order @@ -998,8 +1000,10 @@ main(int argc, char *argv[]) recompute = true; } - ovsdb_idl_track_clear(ovnnb_idl_loop.idl); - ovsdb_idl_track_clear(ovnsb_idl_loop.idl); + if (clear_idl_track) { + ovsdb_idl_track_clear(ovnnb_idl_loop.idl); + ovsdb_idl_track_clear(ovnsb_idl_loop.idl); + } unixctl_server_run(unixctl); unixctl_server_wait(unixctl); diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index f1bf9092eeb7..e79d33b2aec5 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -9522,67 +9522,65 @@ as hv1 ovs-vsctl add-br br-phys ovn_attach n1 br-phys 192.168.0.11 -fail_count=0 check_recompute_counter() { northd_recomp=$(as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute) - if test x$northd_recomp != x$1; then - fail_count=$(($fail_count + 1)) - echo check northd recompute failed: expected $1, got $northd_recomp - return 1 - fi + AT_CHECK([test x$northd_recomp = x$1]) + lflow_recomp=$(as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute) - if test x$lflow_recomp != x$2; then - fail_count=$(($fail_count + 1)) - echo check lflow recompute failed: expected $2, got $lflow_recomp - return 1 - fi - return 0 + AT_CHECK([test x$lflow_recomp = x$2]) } -# Depending on order of responses from NB and SB, the number of recompute may -# be different. This test case only verifies the best case scenario, which -# should have the expected recompute count at least 50% of the time. +check ovn-nbctl --wait=hv ls-add ls0 + +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0 "unknown" +ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0 external_ids:iface-id=lsp0-0 +wait_for_ports_up +check ovn-nbctl --wait=hv sync +check_recompute_counter 5 5 + +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses lsp0-1 "aa:aa:aa:00:00:01 192.168.0.11" +ovs-vsctl add-port br-int lsp0-1 -- set interface lsp0-1 external_ids:iface-id=lsp0-1 +wait_for_ports_up +check ovn-nbctl --wait=hv sync +check_recompute_counter 0 0 + +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-2 -- lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:02 192.168.0.12" +ovs-vsctl add-port br-int lsp0-2 -- set interface lsp0-2 external_ids:iface-id=lsp0-2 +wait_for_ports_up +check ovn-nbctl --wait=hv sync +check_recompute_counter 0 0 + +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats +check ovn-nbctl --wait=hv lsp-del lsp0-1 +check_recompute_counter 0 0 + +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats +check ovn-nbctl --wait=hv lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:88 192.168.0.88" +check_recompute_counter 0 0 + +# Delete and re-add a LSP for several times continuously, to ensure +# frequent operations do not trigger recompute when there are in-flight +# transcations. for i in $(seq 10); do - check ovn-nbctl --wait=hv ls-add ls$i - - check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats - check ovn-nbctl --wait=hv lsp-add ls$i lsp${i}-0 -- lsp-set-addresses lsp${i}-0 "unknown" - ovs-vsctl add-port br-int lsp${i}-0 -- set interface lsp${i}-0 external_ids:iface-id=lsp${i}-0 - wait_for_ports_up - check ovn-nbctl --wait=hv sync - check_recompute_counter 5 5 || continue - - check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats - check ovn-nbctl --wait=hv lsp-add ls$i lsp${i}-1 -- lsp-set-addresses lsp${i}-1 "aa:aa:aa:00:00:01 192.168.0.11" - ovs-vsctl add-port br-int lsp${i}-1 -- set interface lsp${i}-1 external_ids:iface-id=lsp${i}-1 - wait_for_ports_up - check ovn-nbctl --wait=hv sync - check_recompute_counter 0 0 || continue - - check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats - check ovn-nbctl --wait=hv lsp-add ls$i lsp${i}-2 -- lsp-set-addresses lsp${i}-2 "aa:aa:aa:00:00:02 192.168.0.12" - ovs-vsctl add-port br-int lsp${i}-2 -- set interface lsp${i}-2 external_ids:iface-id=lsp${i}-2 - wait_for_ports_up - check ovn-nbctl --wait=hv sync - check_recompute_counter 0 0 || continue - - check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats - check ovn-nbctl --wait=hv lsp-del lsp${i}-1 - check_recompute_counter 0 0 || continue - - check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats - check ovn-nbctl --wait=hv lsp-set-addresses lsp${i}-2 "aa:aa:aa:00:00:88 192.168.0.88" - check_recompute_counter 0 0 || continue - - # No change, no recompute - check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats - check ovn-nbctl --wait=sb sync - check_recompute_counter 0 0 || continue + # wait for sb but not wait for hv + check ovn-nbctl --wait=sb lsp-del lsp0-2 + check ovn-nbctl --wait=sb lsp-add ls0 lsp0-2 -- lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:02 192.168.0.12" - CHECK_NO_CHANGE_AFTER_RECOMPUTE + # even without waiting for sb + check ovn-nbctl lsp-del lsp0-2 + check ovn-nbctl lsp-add ls0 lsp0-2 -- lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:02 192.168.0.12" done -echo Test failed $fail_count in 10. -AT_CHECK([test $fail_count -le 5]) +check_recompute_counter 0 0 + +# No change, no recompute +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats +check ovn-nbctl --wait=sb sync +check_recompute_counter 0 0 + +CHECK_NO_CHANGE_AFTER_RECOMPUTE OVN_CLEANUP([hv1]) AT_CLEANUP