From: Han Zhou
To: dev@openvswitch.org
Date: Thu, 28 Feb 2019 09:15:15 -0800
Message-Id: <1551374120-44287-1-git-send-email-hzhou8@ebay.com>
X-Patchwork-Id: 1049630
Subject: [ovs-dev] [PATCH v4 0/5] Fast OVSDB resync after restart or fail-over.

In scalability testing with ovn-scale-test, the load on the SB ovsdb-server
is not a problem, at least with 1k HVs. However, if we restart ovsdb-server,
then depending on the number of HVs and the scale of logical objects (e.g.
the number of logical ports), the SB ovsdb-server becomes an obvious
bottleneck.

In our test with 1k HVs and 20k logical ports (200 lports * 100 lswitches
connected by one single logical router), restarting the SB ovsdb-server
resulted in 100% CPU usage by ovsdb-server for more than 1 hour. All HVs
(and northd) reconnect and resync a large amount of data at the same time.
A similar problem would happen in a failover scenario. With an active-active
cluster, the problem is alleviated slightly, because only 1/3 of the HVs
(assuming a 3-node cluster) need to resync data from new servers, but it is
still a serious problem.

For a detailed discussion of the problem and solutions, see:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047591.html

The patches implement the proposal in that discussion. They introduce a new
method, monitor_cond_since, which lets a client request only the changes
that happened after a specific point, so that data already cached in the
client is not retransferred. Scalability tests show a dramatic improvement:
all HVs finish syncing as soon as they reconnect, since there is no new data
to be transferred.

The current patches support all 3 modes of ovsdb-server, but only clustered
mode can benefit from them, since it is the only one that supports
transaction IDs out of the box.

---
v1 -> v2: fix the unused variable in patch 6/7 reported by 0-day.
v2 -> v3: the first 2 of the 7 patches were merged. Revised 1/5 addressing
          Ben's comment on the json cache hmap.
v3 -> v4: fix the use after free found by Address Sanitizer in 2/5.

Han Zhou (5):
  ovsdb-monitor: Refactor ovsdb monitor implementation.
  ovsdb-server: Transaction history tracking.
  ovsdb-monitor: Support monitor_cond_since.
  ovsdb-idl.c: Support monitor_cond_since method in C IDL.
  ovsdb-idl.c: Fast resync from server when connection reset.

 Documentation/ref/ovsdb-server.7.rst |  78 +++++-
 lib/ovsdb-idl.c                      | 229 ++++++++++++----
 ovsdb/jsonrpc-server.c               | 101 +++++--
 ovsdb/monitor.c                      | 516 ++++++++++++++++++++++-------------
 ovsdb/monitor.h                      |  16 +-
 ovsdb/ovsdb-client.1.in              |  28 +-
 ovsdb/ovsdb-client.c                 | 104 ++++++-
 ovsdb/ovsdb-server.c                 |   9 +
 ovsdb/ovsdb.c                        |   6 +
 ovsdb/ovsdb.h                        |  10 +
 ovsdb/transaction.c                  | 127 ++++++++-
 ovsdb/transaction.h                  |   5 +
 tests/ovsdb-monitor.at               | 301 +++++++++++++++++++-
 13 files changed, 1244 insertions(+), 286 deletions(-)
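
To illustrate the monitor_cond_since idea from the description in this cover
letter, here is a minimal, self-contained sketch of the server-side decision:
if the transaction ID the client last saw is still in the server's history,
only the later transactions need to be sent; otherwise the server has to fall
back to a full resync. This is not the code from the patches. The names
txn_history, history_add, history_changes_since, HISTORY_MAX and TXN_ID_LEN
are hypothetical, and the fixed-size history is an assumption made to keep
the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names and sizes; not taken from the ovsdb-server sources. */
#define HISTORY_MAX 4   /* Assumed cap on remembered transactions. */
#define TXN_ID_LEN 37   /* Room for a UUID string plus the terminating NUL. */

struct txn_history {
    char ids[HISTORY_MAX][TXN_ID_LEN];  /* Transaction IDs, oldest first. */
    size_t n;                           /* Number of valid entries. */
};

/* Records a committed transaction, dropping the oldest entry when full. */
void
history_add(struct txn_history *h, const char *txn_id)
{
    assert(h && txn_id && strlen(txn_id) < TXN_ID_LEN);
    if (h->n == HISTORY_MAX) {
        memmove(&h->ids[0], &h->ids[1],
                (HISTORY_MAX - 1) * sizeof h->ids[0]);
        h->n--;
    }
    strcpy(h->ids[h->n++], txn_id);
}

/* Looks up the last transaction ID known to the client.
 *
 * Returns true and stores in '*n_missed' the number of transactions
 * committed after 'last_id' if it is still in the history, so that only
 * those need to be replayed.  Returns false and stores 0 if 'last_id' is
 * unknown, in which case the caller must send the full contents. */
bool
history_changes_since(const struct txn_history *h, const char *last_id,
                      size_t *n_missed)
{
    assert(h && last_id && n_missed);
    for (size_t i = 0; i < h->n; i++) {
        if (!strcmp(h->ids[i], last_id)) {
            *n_missed = h->n - 1 - i;
            return true;
        }
    }
    *n_missed = 0;
    return false;
}
```

In this sketch, a client that reconnects with the newest transaction ID gets
a result of true with zero missed transactions, which corresponds to the
"no new data to be transferred" case above. A client whose ID has already
been dropped from the history gets false and needs the full data again.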