From patchwork Thu May 25 08:37:23 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Dumitru Ceara
X-Patchwork-Id: 1785996
From: Dumitru Ceara
To: ovs-dev@openvswitch.org
Date: Thu, 25 May 2023 10:37:23 +0200
Message-Id: <20230525083723.676797-1-dceara@redhat.com>
Cc: frigo@amadeus.com, i.maximets@ovn.org
Subject: [ovs-dev] [PATCH ovn] controller: Handle OpenFlow errors.

Whenever an OpenFlow error is returned by OvS, trigger a reconnect of
the OpenFlow (rconn) connection.  This will clear any installed OpenFlow
rules/groups.  To ensure consistency, trigger a full I-P recompute too.

An example of a scenario that can result in an OpenFlow error returned
by OvS follows (describing two main loop iterations in ovn-controller):
- Iteration I:
  a. get updates from SB
  b. process these updates and generate "desired" openflows (let's
     assume this generates quite a lot of desired openflow
     modifications)
  c.1. add bundle-open msg to rconn
  c.2. add openflow mod msgs to rconn (only some of these make it
       through, the rest get queued, the rconn is backlogged at this
       point).
  c.3. add bundle-commit msg to rconn (this gets queued)
- Iteration II:
  a. get updates from SB (rconn is still backlogged)
  b. process the updates and generate "desired" openflows (let's
     assume this takes 10+ seconds for the specific SB updates)

At some point, while step II.b was being executed, OvS declared that the
bundle operation (started at I.c.1) had timed out.

We now act on this error by reconnecting, which in turn triggers a flush
of the rconn backlog and gives the next full recompute a better chance
of installing all flows.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2134880
Reported-by: François Rigault
CC: Ilya Maximets
Signed-off-by: Dumitru Ceara
Reviewed-by: Simon Horman
Reviewed-by: Ales Musil
---
 controller/ofctrl.c         | 17 ++++++++++++++---
 controller/ofctrl.h         |  2 +-
 controller/ovn-controller.c | 10 ++++++++--
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index b1ba1c743a..1da23bc27e 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -766,13 +766,18 @@ ofctrl_get_mf_field_id(void)
 
 /* Runs the OpenFlow state machine against 'br_int', which is local to the
  * hypervisor on which we are running. Attempts to negotiate a Geneve option
- * field for class OVN_GENEVE_CLASS, type OVN_GENEVE_TYPE. */
-void
+ * field for class OVN_GENEVE_CLASS, type OVN_GENEVE_TYPE.
+ *
+ * Returns 'true' if an OpenFlow reconnect happened; 'false' otherwise.
+ */
+bool
 ofctrl_run(const struct ovsrec_bridge *br_int,
            const struct ovsrec_open_vswitch_table *ovs_table,
            struct shash *pending_ct_zones)
 {
     char *target = xasprintf("unix:%s/%s.mgmt", ovs_rundir(), br_int->name);
+    bool reconnected = false;
+
     if (strcmp(target, rconn_get_target(swconn))) {
         VLOG_INFO("%s: connecting to switch", target);
         rconn_connect(swconn, target, target);
@@ -782,10 +787,12 @@ ofctrl_run(const struct ovsrec_bridge *br_int,
     rconn_run(swconn);
 
     if (!rconn_is_connected(swconn)) {
-        return;
+        goto done;
     }
+
     if (seqno != rconn_get_connection_seqno(swconn)) {
         seqno = rconn_get_connection_seqno(swconn);
+        reconnected = true;
         state = S_NEW;
 
         /* Reset the state of any outstanding ct flushes to resend them. */
@@ -855,6 +862,9 @@ ofctrl_run(const struct ovsrec_bridge *br_int,
          * point, so ensure that we come back again without waiting. */
         poll_immediate_wake();
     }
+
+done:
+    return reconnected;
 }
 
 void
@@ -909,6 +919,7 @@ ofctrl_recv(const struct ofp_header *oh, enum ofptype type)
     } else if (type == OFPTYPE_ERROR) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(30, 300);
         log_openflow_rl(&rl, VLL_INFO, oh, "OpenFlow error");
+        rconn_reconnect(swconn);
     } else {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(30, 300);
         log_openflow_rl(&rl, VLL_DBG, oh, "OpenFlow packet ignored");
diff --git a/controller/ofctrl.h b/controller/ofctrl.h
index f5751e3ee4..105f9370be 100644
--- a/controller/ofctrl.h
+++ b/controller/ofctrl.h
@@ -51,7 +51,7 @@ struct ovn_desired_flow_table {
 void ofctrl_init(struct ovn_extend_table *group_table,
                  struct ovn_extend_table *meter_table,
                  int inactivity_probe_interval);
-void ofctrl_run(const struct ovsrec_bridge *br_int,
+bool ofctrl_run(const struct ovsrec_bridge *br_int,
                 const struct ovsrec_open_vswitch_table *,
                 struct shash *pending_ct_zones);
 enum mf_field_id ofctrl_get_mf_field_id(void);
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 1151d36644..b301c50157 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -5075,8 +5075,14 @@ main(int argc, char *argv[])
 
             if (br_int) {
                 ct_zones_data = engine_get_data(&en_ct_zones);
-                if (ct_zones_data) {
-                    ofctrl_run(br_int, ovs_table, &ct_zones_data->pending);
+                if (ct_zones_data && ofctrl_run(br_int, ovs_table,
+                                                &ct_zones_data->pending)) {
+                    static struct vlog_rate_limit rl
+                            = VLOG_RATE_LIMIT_INIT(1, 1);
+
+                    VLOG_INFO_RL(&rl, "OVS OpenFlow connection reconnected, "
+                                      "force recompute.");
+                    engine_set_force_recompute(true);
                 }
 
                 if (chassis) {