From patchwork Wed Nov 2 08:33:55 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Babu Shanmugam
X-Patchwork-Id: 690303
From: Babu Shanmugam
To: dev@openvswitch.org
Date: Wed, 2 Nov 2016 14:03:55 +0530
Message-Id: <1478075637-28565-3-git-send-email-bschanmu@redhat.com>
In-Reply-To: <1478075637-28565-1-git-send-email-bschanmu@redhat.com>
References: <1478075637-28565-1-git-send-email-bschanmu@redhat.com>
Cc: Andrew Beekhof
Subject: [ovs-dev] [PATCH v4 2/4] ovn: OCF script for OVN OVSDB servers

Co-authored-by: Numan Siddique
Signed-off-by: Numan Siddique
Co-authored-by: Andrew Beekhof
Signed-off-by: Andrew Beekhof
Signed-off-by: Babu Shanmugam
---
 IntegrationGuide.md             |  63 +++++++
 ovn/utilities/automake.mk       |   6 +-
 ovn/utilities/ovndb-servers.ocf | 356 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 423 insertions(+), 2
deletions(-)
 create mode 100755 ovn/utilities/ovndb-servers.ocf

diff --git a/IntegrationGuide.md b/IntegrationGuide.md
index 5d3e574..945ecfd 100644
--- a/IntegrationGuide.md
+++ b/IntegrationGuide.md
@@ -167,3 +167,66 @@
 following command can be used:
 
     ovs-vsctl set Interface eth0 external-ids:iface-id='"${UUID}"'
+
+HA for OVN DB servers using pacemaker
+-------------------------------------
+
+The ovsdb servers can work in either active or backup mode. In backup mode, the
+db server is connected to an active server and replicates the active server's
+contents. At all times, the data can be transacted only from the active server.
+When the active server dies for some reason, all OVN operations will be
+stalled.
+
+[Pacemaker][] is a cluster resource manager which can manage a defined set of
+resources across a set of clustered nodes. Pacemaker manages each resource with
+the help of resource agents. One type of resource agent is [OCF][].
+
+An OCF resource agent is a shell script which accepts a set of actions and
+returns an appropriate status code.
+
+With the help of the OCF resource agent ovn/utilities/ovndb-servers.ocf, one
+can define a resource for pacemaker such that pacemaker will always
+maintain one running active server at any time.
+
+After creating a pacemaker cluster, use the following commands to create
+one active and multiple backup servers for OVN databases.
+
+    pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
+        master_ip=x.x.x.x \
+        ovn_ctl= \
+        op monitor interval="10s"
+
+    pcs resource master ovndb_servers-master ovndb_servers \
+        meta notify="true"
+
+The `master_ip` and `ovn_ctl` are the parameters that will be used by the
+OCF script. `ovn_ctl` is optional; if not given, it assumes a default value of
+/usr/share/openvswitch/scripts/ovn-ctl.
+
+Whenever the active server dies, pacemaker is responsible for promoting one of
+the backup servers to be active.
+Both ovn-controller and ovn-northd need the ip-address at which the active
+server is listening. With pacemaker changing the node on which the active
+server runs, it is not efficient to instruct all the ovn-controllers and
+ovn-northd to connect to the latest active server's ip-address.
+
+This problem can be solved by using a native ocf resource agent
+`ocf:heartbeat:IPaddr2`. The IPaddr2 resource agent is just a resource with an
+ip-address. When we colocate this resource with the active server, pacemaker
+will enable the active server to be reached at a single ip-address all the
+time. This is the ip-address that needs to be given as the `master_ip` parameter
+while creating the `ovndb_servers` resource.
+
+Use the following commands to create the IPaddr2 resource and colocate it
+with the active server.
+
+    pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=x.x.x.x \
+        op monitor interval=30s
+
+    pcs constraint order VirtualIP then ovndb_servers-master
+
+    pcs constraint colocation add ovndb_servers-master with master VirtualIP \
+        score=INFINITY
+
+[Pacemaker]: http://clusterlabs.org/pacemaker.html
+[OCF]: http://www.linux-ha.org/wiki/OCF_Resource_Agents
diff --git a/ovn/utilities/automake.mk b/ovn/utilities/automake.mk
index b03d125..164cdda 100644
--- a/ovn/utilities/automake.mk
+++ b/ovn/utilities/automake.mk
@@ -1,5 +1,6 @@
 scripts_SCRIPTS += \
-    ovn/utilities/ovn-ctl
+    ovn/utilities/ovn-ctl \
+    ovn/utilities/ovndb-servers.ocf
 
 man_MANS += \
     ovn/utilities/ovn-ctl.8 \
@@ -20,7 +21,8 @@ EXTRA_DIST += \
     ovn/utilities/ovn-docker-overlay-driver \
     ovn/utilities/ovn-docker-underlay-driver \
     ovn/utilities/ovn-nbctl.8.xml \
-    ovn/utilities/ovn-trace.8.xml
+    ovn/utilities/ovn-trace.8.xml \
+    ovn/utilities/ovndb-servers.ocf
 
 DISTCLEANFILES += \
     ovn/utilities/ovn-ctl.8 \
diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
new file mode 100755
index 0000000..1fc61a7
--- /dev/null
+++ b/ovn/utilities/ovndb-servers.ocf
@@ -0,0 +1,356 @@
+#!/bin/bash
+
+: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
+. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
+: ${OVN_CTL_DEFAULT="/usr/share/openvswitch/scripts/ovn-ctl"}
+CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot"
+CRM_ATTR_REPL_INFO="${HA_SBIN_DIR}/crm_attribute --type crm_config --name OVN_REPL_INFO -s ovn_ovsdb_master_server"
+OVN_CTL=${OCF_RESKEY_ovn_ctl:-${OVN_CTL_DEFAULT}}
+MASTER_IP=${OCF_RESKEY_master_ip}
+
+# The invalid IP address is an address that can never exist in the network, as
+# described in RFC 5737. The ovsdb servers connect to this IP address until
+# a master is promoted and the IPaddr2 resource is started.
+INVALID_IP_ADDRESS=192.0.2.254
+
+host_name=$(ocf_local_nodename)
+: ${slave_score=5}
+: ${master_score=10}
+
+ovsdb_server_metadata() {
+    cat <<END
+
+
+  1.0
+
+
+  This resource manages ovsdb-server.
+
+
+
+  Manages ovsdb-server.
+
+
+
+
+
+
+  Location to the ovn-ctl script file
+
+  ovn-ctl script
+
+
+
+
+
+  The IP address resource which will be available on the master ovsdb server
+
+  master ip address
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+END
+    exit $OCF_SUCCESS
+}
+
+ovsdb_server_notify() {
+    # requires the notify=true meta resource attribute
+    local type_op="${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}"
+
+    if [ "$type_op" != "post-promote" ]; then
+        # We are only interested in specific events
+        return $OCF_SUCCESS
+    fi
+
+    ocf_log debug "ovndb_server: notified of event $type_op"
+    if [ "x${OCF_RESKEY_CRM_meta_notify_promote_uname}" = "x${host_name}" ]; then
+        # Record ourselves so that the agent has a better chance of doing
+        # the right thing at startup
+        ocf_log debug "ovndb_server: $host_name is the master"
+        ${CRM_ATTR_REPL_INFO} -v "$host_name"
+
+    else
+        # Synchronize with the new master
+        ocf_log debug "ovndb_server: Connecting to the new master ${OCF_RESKEY_CRM_meta_notify_promote_uname}"
+        ${OVN_CTL} demote_ovnnb --db-nb-sync-from-addr=${MASTER_IP}
+        ${OVN_CTL} demote_ovnsb \
+            --db-sb-sync-from-addr=${MASTER_IP}
+    fi
+}
+
+ovsdb_server_usage() {
+    cat < Stop -> Start -> Promote
+    # At the point this is run, the only active masters will be
+    # previous masters minus any that were scheduled to be demoted
+
+    for master in ${OCF_RESKEY_CRM_meta_notify_master_uname}; do
+        found=0
+        for old in ${OCF_RESKEY_CRM_meta_notify_demote_uname}; do
+            if [ $master = $old ]; then
+                found=1
+            fi
+        done
+        if [ $found = 0 ]; then
+            # Rely on master-max=1
+            # Pacemaker will demote any additional ones it finds before starting new copies
+            echo "$master"
+            return
+        fi
+    done
+
+    local expected_master=$($CRM_ATTR_REPL_INFO --query -q 2>/dev/null)
+    case "x${OCF_RESKEY_CRM_meta_notify_start_uname}x" in
+        *${expected_master}*) echo "${expected_master}";; # The previous master is expected to start
+    esac
+}
+
+ovsdb_server_find_active_peers() {
+    # Do we have any peers that are not stopping
+    for peer in ${OCF_RESKEY_CRM_meta_notify_slave_uname}; do
+        found=0
+        for old in ${OCF_RESKEY_CRM_meta_notify_stop_uname}; do
+            if [ $peer = $old ]; then
+                found=1
+            fi
+        done
+        if [ $found = 0 ]; then
+            # Rely on master-max=1
+            # Pacemaker will demote any additional ones it finds before starting new copies
+            echo "$peer"
+            return
+        fi
+    done
+}
+
+ovsdb_server_master_update() {
+
+    case $1 in
+        $OCF_SUCCESS)
+        $CRM_MASTER -v ${slave_score};;
+        $OCF_RUNNING_MASTER)
+            $CRM_MASTER -v ${master_score};;
+        #*) $CRM_MASTER -D;;
+    esac
+}
+
+ovsdb_server_monitor() {
+    ovsdb_server_check_status
+    rc=$?
+
+    ovsdb_server_master_update $rc
+    return $rc
+}
+
+ovsdb_server_check_status() {
+    local sb_status=`${OVN_CTL} status_ovnsb`
+    local nb_status=`${OVN_CTL} status_ovnnb`
+
+    if [[ $sb_status == "running/backup" && $nb_status == "running/backup" ]]; then
+        return $OCF_SUCCESS
+    fi
+
+    if [[ $sb_status == "running/active" && $nb_status == "running/active" ]]; then
+        return $OCF_RUNNING_MASTER
+    fi
+
+    # TODO: What about service running but not in either state above?
+    # Eg.
+    # a transient state where one db is "active" and the other
+    # "backup"
+
+    return $OCF_NOT_RUNNING
+}
+
+ovsdb_server_start() {
+    ovsdb_server_check_status
+    local status=$?
+    # If not in stopped state, return
+    if [ $status -ne $OCF_NOT_RUNNING ]; then
+        return $status
+    fi
+
+    local present_master=$(ovsdb_server_find_active_master)
+
+    set ${OVN_CTL}
+
+    if [ "x${present_master}" = x ]; then
+        # No master detected, or the previous master is not among the
+        # set starting.
+        #
+        # Force all copies to come up as slaves by pointing them into
+        # space and let pacemaker pick one to promote:
+        #
+        set $@ --db-nb-sync-from-addr=${INVALID_IP_ADDRESS} --db-sb-sync-from-addr=${INVALID_IP_ADDRESS}
+
+    elif [ ${present_master} != ${host_name} ]; then
+        # An existing master is active, connect to it
+        set $@ --db-nb-sync-from-addr=${MASTER_IP} --db-sb-sync-from-addr=${MASTER_IP}
+    fi
+
+    $@ start_ovsdb
+
+    while [ 1 = 1 ]; do
+        # It is important that we don't return until we're in a functional state
+        ovsdb_server_monitor
+        rc=$?
+        case $rc in
+            $OCF_SUCCESS)        return $rc;;
+            $OCF_RUNNING_MASTER) return $rc;;
+            $OCF_ERR_GENERIC)    return $rc;;
+            # Otherwise loop, waiting for the service to start, until
+            # the cluster times the operation out
+        esac
+        ocf_log warn "ovndb_servers: After starting ovsdb, status is $rc. Checking the status again"
+    done
+}
+
+ovsdb_server_stop() {
+    ovsdb_server_check_status
+    case $? in
+        $OCF_NOT_RUNNING)    return ${OCF_SUCCESS};;
+        $OCF_RUNNING_MASTER) return ${OCF_RUNNING_MASTER};;
+    esac
+
+    ${OVN_CTL} stop_ovsdb
+    ovsdb_server_master_update ${OCF_NOT_RUNNING}
+
+    while [ 1 = 1 ]; do
+        # It is important that we don't return until we're stopped
+        ovsdb_server_check_status
+        rc=$?
+        case $rc in
+        $OCF_SUCCESS)
+            # Loop, waiting for the service to stop, until the
+            # cluster times the operation out
+            ocf_log warn "ovndb_servers: Even after stopping, the servers seem to be running"
+            ;;
+        $OCF_NOT_RUNNING)
+            return $OCF_SUCCESS
+            ;;
+        *)
+            return $rc
+            ;;
+        esac
+    done
+
+    return $OCF_ERR_GENERIC
+}
+
+ovsdb_server_promote() {
+    ovsdb_server_check_status
+    rc=$?
+    case $rc in
+        ${OCF_SUCCESS}) ;;
+        ${OCF_RUNNING_MASTER}) return ${OCF_SUCCESS};;
+        *)
+            ovsdb_server_master_update $OCF_RUNNING_MASTER
+            return ${rc}
+            ;;
+    esac
+
+    ${OVN_CTL} promote_ovnnb
+    ${OVN_CTL} promote_ovnsb
+
+    ocf_log debug "ovndb_servers: Promoting $host_name as the master"
+    # Record ourselves so that the agent has a better chance of doing
+    # the right thing at startup
+    ${CRM_ATTR_REPL_INFO} -v "$host_name"
+    ovsdb_server_master_update $OCF_RUNNING_MASTER
+    return $OCF_SUCCESS
+}
+
+ovsdb_server_demote() {
+    ovsdb_server_check_status
+    if [ $? = $OCF_NOT_RUNNING ]; then
+        return $OCF_NOT_RUNNING
+    fi
+
+    local present_master=$(ovsdb_server_find_active_master)
+    local recorded_master=$($CRM_ATTR_REPL_INFO --query -q 2>/dev/null)
+
+    ocf_log debug "ovndb_servers: Demoting $host_name, present master ${present_master}, recorded master ${recorded_master}"
+    if [ "x${recorded_master}" = "x${host_name}" -a "x${present_master}" = x ]; then
+        # We are the one and only master
+        # This should be the "normal" case
+        # The only way to be demoted is to call demote_ovn*
+        #
+        # The local database is only reset once we successfully
+        # connect to the peer. So specify one that doesn't exist.
+        #
+        # Eventually a new master will be promoted and we'll resync
+        # using the logic in ovsdb_server_notify()
+        ${OVN_CTL} demote_ovnnb --db-nb-sync-from-addr=${INVALID_IP_ADDRESS}
+        ${OVN_CTL} demote_ovnsb --db-sb-sync-from-addr=${INVALID_IP_ADDRESS}
+
+    elif [ "x${present_master}" = "x${host_name}" ]; then
+        # Safety check, should never be called
+        #
+        # Never allow sync'ing from ourselves, it's a great way to
+        # erase the local DB
+        ${OVN_CTL} demote_ovnnb --db-nb-sync-from-addr=${INVALID_IP_ADDRESS}
+        ${OVN_CTL} demote_ovnsb --db-sb-sync-from-addr=${INVALID_IP_ADDRESS}
+
+    elif [ "x${present_master}" != x ]; then
+        # There are too many masters and we're an extra one that is
+        # being demoted. Sync to the surviving one
+        ${OVN_CTL} demote_ovnnb --db-nb-sync-from-addr=${MASTER_IP}
+        ${OVN_CTL} demote_ovnsb --db-sb-sync-from-addr=${MASTER_IP}
+
+    else
+        # For completeness, should never be called
+        #
+        # Something unexpected happened, perhaps CRM_ATTR_REPL_INFO is incorrect
+        ${OVN_CTL} demote_ovnnb --db-nb-sync-from-addr=${INVALID_IP_ADDRESS}
+        ${OVN_CTL} demote_ovnsb --db-sb-sync-from-addr=${INVALID_IP_ADDRESS}
+    fi
+
+    ovsdb_server_master_update $OCF_SUCCESS
+    return $OCF_SUCCESS
+}
+
+ovsdb_server_validate() {
+    if [ ! -e ${OVN_CTL} ]; then
+        return $OCF_ERR_INSTALLED
+    fi
+    return $OCF_SUCCESS
+}
+
+
+case $__OCF_ACTION in
+start)          ovsdb_server_start;;
+stop)           ovsdb_server_stop;;
+promote)        ovsdb_server_promote;;
+demote)         ovsdb_server_demote;;
+start)          ovsdb_server_start;;
+notify)         ovsdb_server_notify;;
+meta-data)      ovsdb_server_metadata;;
+validate-all)   ovsdb_server_validate;;
+status|monitor) ovsdb_server_monitor;;
+usage|help)     ovsdb_server_usage $OCF_SUCCESS;;
+*)              ovsdb_server_usage $OCF_ERR_UNIMPLEMENTED ;;
+esac
+
+rc=$?
+exit $rc
+
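
The status mapping in ovsdb_server_check_status can be exercised without a running cluster. The following is a minimal sketch, not part of the patch: `classify_ovsdb_status` is a hypothetical helper that takes the two status strings as arguments instead of calling ovn-ctl, and the numeric defaults are the standard OCF return codes that ocf-shellfuncs normally provides. A mixed state (one database active, the other backup) falls through to OCF_NOT_RUNNING, which is the behaviour the TODO in the script questions.

```shell
#!/bin/sh

# Standard OCF return codes; ocf-shellfuncs normally defines these.
: "${OCF_SUCCESS:=0}"
: "${OCF_NOT_RUNNING:=7}"
: "${OCF_RUNNING_MASTER:=8}"

# Hypothetical helper (an assumption for illustration): maps the
# southbound and northbound status strings to an OCF return code,
# mirroring the comparisons in ovsdb_server_check_status.
#   $1 - status string of the southbound db
#   $2 - status string of the northbound db
classify_ovsdb_status() {
    case "$1,$2" in
        running/backup,running/backup) return "$OCF_SUCCESS" ;;
        running/active,running/active) return "$OCF_RUNNING_MASTER" ;;
        # Anything else, including a mixed active/backup pair,
        # is reported as not running.
        *) return "$OCF_NOT_RUNNING" ;;
    esac
}

# Usage: capture the return code without tripping "set -e".
rc=0
classify_ovsdb_status "running/active" "running/active" || rc=$?
echo "both active -> $rc"

rc=0
classify_ovsdb_status "running/active" "running/backup" || rc=$?
echo "mixed -> $rc"
```

With the default codes above, the two calls print 8 and 7 respectively, so a transient mixed state looks the same to pacemaker as a stopped server.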