From patchwork Wed Nov 25 10:12:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lorenzo Bianconi X-Patchwork-Id: 1405994 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.133; helo=hemlock.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=aHxpAeTV; dkim-atps=neutral Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CgxYM027rz9s0b for ; Wed, 25 Nov 2020 21:13:26 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id A89208752B; Wed, 25 Nov 2020 10:13:24 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0asqQY7L8fON; Wed, 25 Nov 2020 10:13:22 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by hemlock.osuosl.org (Postfix) with ESMTP id 856C2874E9; Wed, 25 Nov 2020 10:13:22 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 76DFDC0891; Wed, 25 Nov 2020 10:13:22 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2E013C0052 for ; Wed, 25 Nov 2020 10:13:21 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id 26B5C874E6 for ; Wed, 25 Nov 2020 10:13:21 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id R-EZetnolQxQ for ; Wed, 25 Nov 2020 10:13:20 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by hemlock.osuosl.org (Postfix) with ESMTPS id 37196874E9 for ; Wed, 25 Nov 2020 10:13:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1606299199; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=HV5eeKNpjWrQvBi6KjL+f5art0O2oxKJH4pTIqYINd8=; b=aHxpAeTVhpmq7IZIkJRvJpxRZLV4X6/+KItjssktZ/zngKNXxHudHVTNqlUwaIDwE+aXpc ccYxDQClSY2W5OEH84d1qrohJQPqBTQMDSBH5nGnNtsdeO4B5fM3vwTs3VAcR29jqQUUkG h7bFVn/1SwVEoWwBR1WiTOTLGsY5dso= Received: from mail-ed1-f69.google.com (mail-ed1-f69.google.com [209.85.208.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-569-lZzYpH2wM6-msRb76qr4TQ-1; Wed, 25 Nov 2020 05:13:17 -0500 X-MC-Unique: lZzYpH2wM6-msRb76qr4TQ-1 Received: by mail-ed1-f69.google.com with SMTP id c23so452837edr.4 for ; Wed, 25 Nov 2020 02:13:17 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=HV5eeKNpjWrQvBi6KjL+f5art0O2oxKJH4pTIqYINd8=; b=Us9x0npOcO53GMM0hU32/ZGEzpEAqWJzS+dWMh9itm22SagyfoK+IivS8nxGnf69V2 9niCxhPscUjfgN+JCtHXwT5mi0G95iTSlDtQchdkzZwuSkbpBRt1kXJU7hlzZJTY4s0M O8R90zqpF1ZV58Y0cp/OVF1CxAlkGw94+A3yuvm6KAz0UTnXEwRI7yWhYa3/zgndOr1w Wm/msGKvWx7tHJicrrmG6zNA9hvTEuGdguE1ru3QkAEss2wVPkZ/+Puu0+tSHgNOfqko zFTWmnVP+uyJxVyJxMhD0pZoJLgkPICpe5vg2pVDdAiBxVb5340Df8Dyl9/H36Vivgiz 4jWg== X-Gm-Message-State: AOAM532OdT8XADMenxdelsw0STT3ZhQhbffWuzSZn0ZG5Jl3m17S4j6S FzZWzcobnymeyjkdonxfnDjjPR+9coVl9hFd5ngwRj3AzSf4nNLg/HjWib1xtAuPbgNIRl/aqHq DjQdar5ZRMPAiPoOw4Vccm50DBrIbV2Noa90+puYVVxGjZbINfUivWu9nIgYFKINKyRK5ju4uXH g= X-Received: by 2002:a50:9fc1:: with SMTP id c59mr2624775edf.59.1606299195787; Wed, 25 Nov 2020 02:13:15 -0800 (PST) X-Google-Smtp-Source: ABdhPJyKf14iU6hzWC/2xF3gAiZUcqNYZCav+pDNfCNmVcfpQkfOtdXPWD8S61zD62ZOS7GR6ugbbg== X-Received: by 2002:a50:9fc1:: with SMTP id c59mr2624754edf.59.1606299195555; Wed, 25 Nov 2020 02:13:15 -0800 (PST) Received: from lore-desk.redhat.com ([151.66.8.153]) by smtp.gmail.com with ESMTPSA id gn41sm922892ejc.32.2020.11.25.02.13.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Nov 2020 02:13:15 -0800 (PST) From: Lorenzo Bianconi To: dev@openvswitch.org Date: Wed, 25 Nov 2020 11:12:59 +0100 Message-Id: X-Mailer: git-send-email 2.28.0 MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lorenzo.bianconi@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Cc: i.maximets@ovn.org Subject: [ovs-dev] [PATCH v4] ovsdb: raft: add some debugging information to cluster/status command X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Introduce the following info useful for cluster debugging to cluster/status command: - time elapsed from last start/complete election - election trigger (e.g. timeout) - number of disconnections - time elapsed from last raft messaged received Acked-by: Dumitru Ceara Signed-off-by: Lorenzo Bianconi --- Changes since v3: - fixed comment alignment Changes since v2: - move last_msg_tx in raft_handle_rpc() - store election_start and election_won in ms - improve comments --- ovsdb/raft-private.h | 2 ++ ovsdb/raft.c | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/ovsdb/raft-private.h b/ovsdb/raft-private.h index 76b097b89..a69e37e5c 100644 --- a/ovsdb/raft-private.h +++ b/ovsdb/raft-private.h @@ -90,6 +90,8 @@ struct raft_server { /* For use in adding and removing servers: */ struct uuid requester_sid; /* Nonzero if requested via RPC. */ struct unixctl_conn *requester_conn; /* Only if requested via unixctl. */ + + long long int last_msg_ts; /* Last received msg timestamp in ms. */ }; void raft_server_destroy(struct raft_server *); diff --git a/ovsdb/raft.c b/ovsdb/raft.c index 760dfca6d..ea91d1fdb 100644 --- a/ovsdb/raft.c +++ b/ovsdb/raft.c @@ -264,6 +264,12 @@ struct raft { long long int election_base; /* Time of last heartbeat from leader. */ long long int election_timeout; /* Time at which we start an election. */ + long long int election_start; /* Start election time. */ + long long int election_won; /* Time of election completion. */ + bool leadership_transfer; /* Was the leadership transferred? */ + + unsigned int n_disconnections; + /* Used for joining a cluster. */ bool joining; /* Attempting to join the cluster? */ struct sset remote_addresses; /* Addresses to try to find other servers. */ @@ -1708,6 +1714,10 @@ raft_start_election(struct raft *raft, bool leadership_transfer) raft->n_votes = 0; + raft->election_start = time_msec(); + raft->election_won = 0; + raft->leadership_transfer = leadership_transfer; + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); if (!VLOG_DROP_INFO(&rl)) { long long int now = time_msec(); @@ -1857,6 +1867,7 @@ raft_run(struct raft *raft) struct raft_conn *next; LIST_FOR_EACH_SAFE (conn, next, list_node, &raft->conns) { if (!raft_conn_should_stay_open(raft, conn)) { + raft->n_disconnections++; raft_conn_close(conn); } } @@ -2597,6 +2608,7 @@ raft_become_leader(struct raft *raft) ovs_assert(raft->role != RAFT_LEADER); raft->role = RAFT_LEADER; + raft->election_won = time_msec(); raft_set_leader(raft, &raft->sid); raft_reset_election_timer(raft); raft_reset_ping_timer(raft); @@ -3714,6 +3726,7 @@ raft_handle_add_server_request(struct raft *raft, s->requester_sid = rq->common.sid; s->requester_conn = NULL; s->phase = RAFT_PHASE_CATCHUP; + s->last_msg_ts = time_msec(); /* Start sending the log. If this is the first time we've tried to add * this server, then this will quickly degenerate into an InstallSnapshot @@ -4273,6 +4286,11 @@ raft_handle_execute_command_reply( static void raft_handle_rpc(struct raft *raft, const union raft_rpc *rpc) { + struct raft_server *s = raft_find_server(raft, &rpc->common.sid); + if (s) { + s->last_msg_ts = time_msec(); + } + uint64_t term = raft_rpc_get_term(rpc); if (term && !raft_should_suppress_disruptive_server(raft, rpc) @@ -4485,6 +4503,17 @@ raft_unixctl_status(struct unixctl_conn *conn, raft_put_sid("Vote", &raft->vote, raft, &s); ds_put_char(&s, '\n'); + if (raft->election_start) { + ds_put_format(&s, + "Last Election started %"PRIu64" ms ago, reason: %s\n", + (uint64_t) (time_msec() - raft->election_start), + raft->leadership_transfer + ? "leadership_transfer" : "timeout"); + } + if (raft->election_won) { + ds_put_format(&s, "Last Election won: %"PRIu64" ms ago\n", + (uint64_t) (time_msec() - raft->election_won)); + } ds_put_format(&s, "Election timer: %"PRIu64, raft->election_timer); if (raft->role == RAFT_LEADER && raft->election_timer_new) { ds_put_format(&s, " (changing to %"PRIu64")", @@ -4512,6 +4541,8 @@ raft_unixctl_status(struct unixctl_conn *conn, } ds_put_char(&s, '\n'); + ds_put_format(&s, "Disconnections: %u\n", raft->n_disconnections); + ds_put_cstr(&s, "Servers:\n"); struct raft_server *server; HMAP_FOR_EACH (server, hmap_node, &raft->servers) { @@ -4536,6 +4567,10 @@ raft_unixctl_status(struct unixctl_conn *conn, ds_put_format(&s, " next_index=%"PRIu64" match_index=%"PRIu64, server->next_index, server->match_index); } + if (server->last_msg_ts) { + ds_put_format(&s, " last msg %"PRIu64" ms ago", + (uint64_t) (time_msec() - server->last_msg_ts)); + } ds_put_char(&s, '\n'); }