From patchwork Mon Sep 10 07:56:18 2018
X-Patchwork-Submitter: Numan Siddique
X-Patchwork-Id: 967888
From: nusiddiq@redhat.com
To: dev@openvswitch.org
Date: Mon, 10 Sep 2018 13:26:18 +0530
Message-Id: <20180910075618.20131-1-nusiddiq@redhat.com>
Subject: [ovs-dev] [PATCH] ovsdb-server: Fix the possible data loss in an
 active/standby setup

From: Numan Siddique

The present code resets the database while it is in the state
'RPL_S_SCHEMA_REQUESTED' and repopulates it only when the monitor reply
arrives in the state 'RPL_S_MONITOR_REQUESTED'. If, however, the server
switches to active mode before it processes the monitor reply, all of
the data is lost.

This patch fixes the issue by resetting the database when the monitor
reply is received, immediately before processing it, so that the reset
and the repopulation of the db happen in the same state.

This approach still leaves a very small window for data loss if
process_notification(), which processes the monitor reply, fails for
some reason.

Reported-by: Han Zhou
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
Tested-by: aginwala
Signed-off-by: Numan Siddique
Acked-by: Han Zhou
---
 ovsdb/replication.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/ovsdb/replication.c b/ovsdb/replication.c
index 2b9ae2f83..44428a48e 100644
--- a/ovsdb/replication.c
+++ b/ovsdb/replication.c
@@ -299,19 +299,7 @@ replication_run(void)
                 /* After receiving schemas, reset the local databases that
                  * will be monitored and send out monitor requests for them. */
                 if (hmap_is_empty(&request_ids)) {
-                    struct shash_node *node, *next;
-
-                    SHASH_FOR_EACH_SAFE (node, next, replication_dbs) {
-                        db = node->data;
-                        error = reset_database(db);
-                        if (error) {
-                            const char *db_name = db->schema->name;
-                            shash_find_and_delete(replication_dbs, db_name);
-                            ovsdb_error_assert(error);
-                            VLOG_WARN("Failed to reset database, "
-                                      "%s not replicated.", db_name);
-                        }
-                    }
+                    struct shash_node *node;
 
                     if (shash_is_empty(replication_dbs)) {
                         VLOG_WARN("Nothing to replicate.");
@@ -335,7 +323,11 @@ replication_run(void)
             case RPL_S_MONITOR_REQUESTED: {
                 /* Reply to monitor requests. */
                 struct ovsdb_error *error;
-                error = process_notification(msg->result, db);
+                VLOG_INFO("Monitor request received. Resetting the database");
+                error = reset_database(db);
+                if (!error) {
+                    error = process_notification(msg->result, db);
+                }
                 if (error) {
                     ovsdb_error_assert(error);
                     state = RPL_S_ERR;
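
For illustration only, and not part of the patch: the ordering change described in the commit message can be sketched with a toy model in C. Every name here (the toy_* types and functions, the n_rows counter, the reset_on_schema_reply flag) is a hypothetical stand-in and not ovsdb code; the real logic lives in replication_run(), reset_database() and process_notification(). The sketch assumes that promotion to active simply stops replication, so a monitor reply that has not yet been processed is never applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the replication states named in the commit message. */
enum toy_state {
    TOY_SCHEMA_REQUESTED,
    TOY_MONITOR_REQUESTED,
    TOY_REPLICATING
};

/* Toy database: only tracks how many rows the standby currently holds. */
struct toy_db {
    int n_rows;
};

struct toy_replica {
    enum toy_state state;
    struct toy_db db;
    bool reset_on_schema_reply;   /* true: old ordering; false: patched. */
};

void
toy_reset(struct toy_db *db)
{
    db->n_rows = 0;
}

/* Handles the schema reply.  The old ordering wipes the local data here. */
void
toy_schema_reply(struct toy_replica *r)
{
    assert(r->state == TOY_SCHEMA_REQUESTED);
    if (r->reset_on_schema_reply) {
        toy_reset(&r->db);
    }
    r->state = TOY_MONITOR_REQUESTED;
}

/* Handles the monitor reply.  The patched ordering resets and repopulates
 * in this one step, so both happen in the same state. */
void
toy_monitor_reply(struct toy_replica *r, int active_rows)
{
    assert(r->state == TOY_MONITOR_REQUESTED);
    if (!r->reset_on_schema_reply) {
        toy_reset(&r->db);
    }
    r->db.n_rows = active_rows;
    r->state = TOY_REPLICATING;
}

/* Rows left on the standby if it is promoted to active after the schema
 * reply but before the monitor reply has been processed. */
int
toy_rows_after_early_promotion(bool reset_on_schema_reply, int initial_rows)
{
    struct toy_replica r = {
        TOY_SCHEMA_REQUESTED, { initial_rows }, reset_on_schema_reply
    };

    toy_schema_reply(&r);
    /* Promotion happens here: the monitor reply is never applied. */
    return r.db.n_rows;
}

/* Rows on the standby after an uninterrupted sync with the active server. */
int
toy_rows_after_full_sync(bool reset_on_schema_reply, int initial_rows,
                         int active_rows)
{
    struct toy_replica r = {
        TOY_SCHEMA_REQUESTED, { initial_rows }, reset_on_schema_reply
    };

    toy_schema_reply(&r);
    toy_monitor_reply(&r, active_rows);
    return r.db.n_rows;
}
```

In this model the old ordering leaves an early-promoted standby with no rows, while the patched ordering leaves its previous contents intact; an uninterrupted sync gives the same result either way. The model does not cover the remaining window noted in the commit message, where the reset succeeds but processing the monitor reply fails.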