From patchwork Fri Jul 12 17:50:00 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Numan Siddique
X-Patchwork-Id: 1131501
From: nusiddiq@redhat.com
To: dev@openvswitch.org
Date: Fri, 12 Jul 2019 23:20:00 +0530
Message-Id: <20190712175000.11085-1-nusiddiq@redhat.com>
Subject: [ovs-dev] [PATCH] ovn-northd: Fix the ovn-northd continuous looping

From: Numan Siddique

ovn-northd wakes up continuously from poll_block(). The issue can be
reproduced in the sandbox with the following commands:

ovn-nbctl lr-add lr0
ovn-nbctl ls-add public
ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.100/24
ovn-nbctl lsp-add public public-lr0
ovn-nbctl lsp-set-type public-lr0 router
ovn-nbctl lsp-set-addresses public-lr0 router
ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
ovn-nbctl lrp-set-gateway-chassis lr0-public chassis-1 20

The issue appeared with commit [1], which uses the function
sbrec_port_binding_update_nat_addresses_addvalue() to add a value to the
Port_Binding.nat_addresses column. When this function is used, the IDL
client code appears to keep sending transactions to ovsdb-server to
update Port_Binding.nat_addresses, even though the Southbound DB has
already updated the column.

The actual bug seems to be in the IDL client code, and that still needs
to be fixed. As a quick fix, this patch stops ovn-northd's continuous
loop by no longer using this function and calling
sbrec_port_binding_set_nat_addresses() instead.

The messages below are seen continuously when ovn-northd debug logs are
enabled.

****
2019-07-12T17:26:13.837Z|74512|jsonrpc|DBG|unix:sb1.ovsdb: received reply,
result=[{},{"count":1},{"count":1}], id=18628
2019-07-12T17:26:13.837Z|74513|poll_loop|DBG|wakeup due to 0-ms timeout at
../lib/ovsdb-idl.c:5397 (75% CPU usage)
2019-07-12T17:26:13.837Z|74514|jsonrpc|DBG|unix:sb1.ovsdb: send request,
method="transact", params=["OVN_Southbound",{"lock":"ovn_northd","op":"assert"},
{"where":[["_uuid","==",["uuid","56a9eb75-8d3b-4144-b4e7-1bb749645011"]]],"row":
{"nat_addresses":["set",[]]},"op":"update","table":"Port_Binding"},{"mutations":[["nat_addresses",
"insert",["set",["00:00:20:20:12:13 172.168.0.100 is_chassis_resident(\"cr-lr0-public\")"]]]],
"where":[["_uuid","==",["uuid","56a9eb75-8d3b-4144-b4e7-1bb749645011"]]],"op":"mutate","table":"Port_Binding"}],
id=18629
2019-07-12T17:26:13.837Z|74516|jsonrpc|DBG|unix:sb1.ovsdb: received reply,
result=[{},{"count":1},{"count":1}], id=18629
2019-07-12T17:26:13.837Z|74517|poll_loop|DBG|wakeup due to 0-ms timeout at
../lib/ovsdb-idl.c:5397 (75% CPU usage)
2019-07-12T17:26:13.837Z|74518|jsonrpc|DBG|unix:sb1.ovsdb: send request,
method="transact", params=["OVN_Southbound",{"lock":"ovn_northd","op":"assert"},
{"where":[["_uuid","==",["uuid","56a9eb75-8d3b-4144-b4e7-1bb749645011"]]],
"row":{"nat_addresses":["set",[]]},"op":"update","table":"Port_Binding"},
{"mutations":[["nat_addresses","insert",["set",["00:00:20:20:12:13 172.168.0.100 is_chassis_resident(\"cr-lr0-public\")"]]]],"where":[["_uuid","==",["uuid",
"56a9eb75-8d3b-4144-b4e7-1bb749645011"]]],"op":"mutate","table":"Port_Binding"}],
id=18630
2019-07-12T17:26:13.837Z|74520|jsonrpc|DBG|unix:sb1.ovsdb: received reply,
result=[{},{"count":1},{"count":1}], id=18630
******

The OpenStack CI for networking-ovn has frequently been failing a few
tests since commit [1]. The failures seem to be timing related, since
ovn-northd hogs the CPU continuously.
We are also seeing Travis CI test failures after commit [1].

[1] - ed198fb3b92e

Fixes: ed198fb3b92e("ovn: Send GARP for the router ports with reside-on-redirect-chassis options set")
Signed-off-by: Numan Siddique
Tested-by: Greg Rose
Reviewed-by: Greg Rose
---
 ovn/northd/ovn-northd.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index ce382ac89..127227712 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2530,13 +2530,6 @@ ovn_port_update_sbrec(struct northd_context *ctx,
                     }
                 }
 
-                sbrec_port_binding_set_nat_addresses(op->sb,
-                                                     (const char **) nats, n_nats);
-                for (size_t i = 0; i < n_nats; i++) {
-                    free(nats[i]);
-                }
-                free(nats);
-
                 /* Add the router mac and IPv4 addresses to
                  * Port_Binding.nat_addresses so that GARP is sent for these
                  * IPs by the ovn-controller on which the distributed gateway
@@ -2578,10 +2571,18 @@ ovn_port_update_sbrec(struct northd_context *ctx,
                                       op->peer->od->l3redirect_port->json_key);
                     }
 
-                    sbrec_port_binding_update_nat_addresses_addvalue(
-                        op->sb, ds_cstr(&garp_info));
+                    n_nats++;
+                    nats = xrealloc(nats, (n_nats * sizeof *nats));
+                    nats[n_nats - 1] = ds_steal_cstr(&garp_info);
                     ds_destroy(&garp_info);
                 }
+
+                sbrec_port_binding_set_nat_addresses(op->sb,
+                                                     (const char **) nats, n_nats);
+                for (size_t i = 0; i < n_nats; i++) {
+                    free(nats[i]);
+                }
+                free(nats);
             }
 
             sbrec_port_binding_set_parent_port(op->sb, op->nbsp->parent_name);
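
For readers without the OVS tree at hand, the pattern the diff moves to can be sketched in isolation: collect every value in a temporary array, append the GARP string to that array, and only then write the whole column once. This is a minimal sketch and not the ovn-northd code. struct fake_row, fake_row_set_values(), dup_string(), str_array_append() and update_nat_addresses() are hypothetical names invented for illustration, and plain malloc()/realloc() stand in for the OVS helpers xrealloc() and ds_steal_cstr() used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a row with one set-valued column.  This is
 * not the OVSDB IDL; it only records what was last written. */
struct fake_row {
    char **values;
    size_t n_values;
    unsigned int n_writes;      /* Number of whole-column writes so far. */
};

/* Returns a heap-allocated copy of 's'. */
static char *
dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    assert(copy);
    memcpy(copy, s, len);
    return copy;
}

/* Hypothetical analogue of a "set whole column" setter: replaces the
 * column contents with copies of the 'n' strings in 'values'. */
static void
fake_row_set_values(struct fake_row *row, const char **values, size_t n)
{
    for (size_t i = 0; i < row->n_values; i++) {
        free(row->values[i]);
    }
    free(row->values);

    row->values = n ? malloc(n * sizeof *row->values) : NULL;
    assert(!n || row->values);
    for (size_t i = 0; i < n; i++) {
        row->values[i] = dup_string(values[i]);
    }
    row->n_values = n;
    row->n_writes++;
}

/* Appends a copy of 's' to the growable array '*arr' of '*n' strings. */
static void
str_array_append(char ***arr, size_t *n, const char *s)
{
    char **tmp = realloc(*arr, (*n + 1) * sizeof **arr);
    assert(tmp);
    tmp[*n] = dup_string(s);
    *arr = tmp;
    (*n)++;
}

/* Builds the full value list first, then writes the column exactly once
 * and frees the temporary array.  'nat' and 'garp_info' may be NULL. */
static void
update_nat_addresses(struct fake_row *row, const char *nat,
                     const char *garp_info)
{
    char **nats = NULL;
    size_t n_nats = 0;

    if (nat) {
        str_array_append(&nats, &n_nats, nat);
    }
    if (garp_info) {
        /* Extra entry goes into the same array, not a separate write. */
        str_array_append(&nats, &n_nats, garp_info);
    }

    fake_row_set_values(row, (const char **) nats, n_nats);
    for (size_t i = 0; i < n_nats; i++) {
        free(nats[i]);
    }
    free(nats);
}
```

Calling update_nat_addresses(&row, "nat-entry", "garp-entry") on a zero-initialized struct fake_row leaves two values in the row after a single write. The sketch shows only the "build, then set once" shape of the change. It does not reproduce or explain the IDL behaviour that the commit message suspects as the underlying bug.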