From patchwork Tue Oct 26 00:42:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han Zhou X-Patchwork-Id: 1546175 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4HdY255TPTz9t0k for ; Tue, 26 Oct 2021 11:42:37 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 326A440448; Tue, 26 Oct 2021 00:42:35 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ruCdMILbHH69; Tue, 26 Oct 2021 00:42:34 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp4.osuosl.org (Postfix) with ESMTPS id 63A9B403B1; Tue, 26 Oct 2021 00:42:33 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 400E4C0019; Tue, 26 Oct 2021 00:42:33 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 76C74C000E for ; Tue, 26 Oct 2021 00:42:31 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 5F77980F26 for ; Tue, 26 Oct 2021 00:42:31 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xAY0kjRrzEAa for ; Tue, 26 Oct 2021 00:42:30 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from relay11.mail.gandi.net (relay11.mail.gandi.net [217.70.178.231]) by smtp1.osuosl.org (Postfix) with ESMTPS id C021180F21 for ; Tue, 26 Oct 2021 00:42:29 +0000 (UTC) Received: (Authenticated sender: hzhou@ovn.org) by relay11.mail.gandi.net (Postfix) with ESMTPSA id 8792E100003; Tue, 26 Oct 2021 00:42:25 +0000 (UTC) From: Han Zhou To: dev@openvswitch.org Date: Mon, 25 Oct 2021 17:42:18 -0700 Message-Id: <20211026004218.2206164-1-hzhou@ovn.org> X-Mailer: git-send-email 2.30.2 MIME-Version: 1.0 Cc: Vladislav Odintsov Subject: [ovs-dev] [PATCH ovn] northd.c: Fix north blocking on deleting duplicated SB datapath. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Duplicated datapaths shouldn't be created in the first place, but in case it is created because of either bug or dirty data, it should be properly deleted instead of causing permanent failure in northd. Currently, when there is a duplicated datapath record and a ip_multicast record referencing it created, northd tries to delete the duplicated datapath but the transaction fails due to schema constraint: transaction error: {"details":"Deletion of 1 weak reference(s) to deleted (or never-existing) rows from column \"datapath\" in \"IP_Multicast\" row 72bac803-e484-4358-9e48-11911c8aa16f caused this column to become empty, but constraints on this column disallow an empty column.","error":"constraint violation"} Northd would be blocked forever until manual deletion of the ip_multicast record. The problem is that in the same transaction only datapath is deleted but not the ip_multicast record that references the datapath. In build_ip_mcast() there is logic to delete ip_multicast records with non-exist datapath, but the logic of ovn_datapath_from_sbrec() can find the datapath and regard it as valid whenever the external-ids:logical-switch/router matches. This patch fixes that by comparing the sbrec of the datapath afterwards, and regards it as valid only if the sbrec matches, too. This way, both ip_multicast and datapath records are deleted in the same transaction and won't cause any trouble. Reported-by: Vladislav Odintsov Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-October/388269.html Signed-off-by: Han Zhou Acked-by: Numan Siddique --- northd/northd.c | 7 ++++++- tests/ovn-northd.at | 20 ++++++++++++++++++++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/northd/northd.c b/northd/northd.c index f71121486..da4f9cd14 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -1045,7 +1045,12 @@ ovn_datapath_from_sbrec(struct hmap *datapaths, !smap_get_uuid(&sb->external_ids, "logical-router", &key)) { return NULL; } - return ovn_datapath_find(datapaths, &key); + struct ovn_datapath *od = ovn_datapath_find(datapaths, &key); + if (od && (od->sb == sb)) { + return od; + } + + return NULL; } static bool diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index 544820764..e2b9924b6 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -5446,3 +5446,23 @@ wait_row_count Port_binding 1 logical-port=S1-vm2 requested_chassis=$ch2_uuid AT_CLEANUP ]) + +# Duplicated datapaths shouldn't be created, but in case it is created because +# of bug or dirty data, it should be properly deleted instead of causing +# permanent failure in northd. +OVN_FOR_EACH_NORTHD([ +AT_SETUP([handling duplicated datapaths]) +ovn_start + +check ovn-nbctl --wait=sb ls-add ls1 +ls1_uuid=$(fetch_column nb:Logical_Switch _uuid) + +# create a duplicated sb datapath (and an IP_Mulicast record that references +# it) on purpose. +AT_CHECK([ovn-sbctl --id=@dp create Datapath_Binding external_ids:logical-switch=$ls1_uuid external_ids:name=ls1 tunnel_key=123 -- create IP_Multicast datapath=@dp], [0], [ignore]) + +# northd should delete one of the datapaths in the end +wait_row_count Datapath_Binding 1 + +AT_CLEANUP +])