From patchwork Wed Aug 24 17:39:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xavier Simonart
X-Patchwork-Id: 1669920
From: Xavier Simonart
To: xsimonar@redhat.com, dev@openvswitch.org
Date: Wed, 24 Aug 2022 13:39:31 -0400
Message-Id: <20220824173931.3689774-1-xsimonar@redhat.com>
Subject: [ovs-dev] [PATCH ovn] northd: Fix multicast table full

The active_v4_flows count was initialized when the northd node was
computed.  However, neither sb_multicast_group nor en_sb_igmp_group
causes northd updates, so this count could keep increasing while
processing IGMP groups.  The issue was sometimes hidden by northd
recomputes triggered when lflows could not be incrementally processed
(SB busy).

active_v4_flows is now reinitialized right before building flows,
i.e. as part of the lflow node, which is recomputed on IGMP group
changes.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2094710
Signed-off-by: Xavier Simonart
---
 northd/northd.c     |  18 +++++--
 tests/system-ovn.at | 125 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 140 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 7e2681865..8bbcc1b82 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1051,7 +1051,12 @@ init_mcast_info_for_switch_datapath(struct ovn_datapath *od)
     mcast_sw_info->query_max_response =
         smap_get_ullong(&od->nbs->other_config, "mcast_query_max_response",
                         OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S);
+}

+static void
+init_mcast_flow_count(struct ovn_datapath *od)
+{
+    struct mcast_switch_info *mcast_sw_info = &od->mcast_info.sw;
     mcast_sw_info->active_v4_flows = ATOMIC_VAR_INIT(0);
     mcast_sw_info->active_v6_flows = ATOMIC_VAR_INIT(0);
 }
@@ -8368,6 +8373,10 @@ build_lswitch_ip_mcast_igmp_mld(struct ovn_igmp_group *igmp_group,
         if (atomic_compare_exchange_strong(
                 &mcast_sw_info->active_v4_flows, &table_size,
                 mcast_sw_info->table_size)) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+
+            VLOG_INFO_RL(&rl, "Too many active mcast flows: %ld",
+                         mcast_sw_info->active_v4_flows);
             return;
         }
         atomic_add(&mcast_sw_info->active_v4_flows, 1, &dummy);
@@ -15069,6 +15078,11 @@ build_mcast_groups(struct lflow_input *input_data,
     hmap_init(mcast_groups);
     hmap_init(igmp_groups);

+    struct ovn_datapath *od;
+
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        init_mcast_flow_count(od);
+    }

     HMAP_FOR_EACH (op, key_node, ports) {
         if (op->nbrp && lrport_is_enabled(op->nbrp)) {
@@ -15126,8 +15140,7 @@ build_mcast_groups(struct lflow_input *input_data,
         }

         /* If the datapath value is stale, purge the group. */
-        struct ovn_datapath *od =
-            ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath);
+        od = ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath);

         if (!od || ovn_datapath_is_stale(od)) {
             sbrec_igmp_group_delete(sb_igmp);
@@ -15172,7 +15185,6 @@ build_mcast_groups(struct lflow_input *input_data,
      * IGMP groups are based on the groups learnt by their multicast enabled
      * peers.
      */
-    struct ovn_datapath *od;

     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (ovs_list_is_empty(&od->mcast_info.groups)) {
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 992813614..87093fbcc 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -8266,3 +8266,128 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([mcast flow count])
+
+ovn_start
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-int])
+
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+    -- set Open_vSwitch . external-ids:system-id=hv1 \
+    -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+    -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+    -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+    -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+check ovn-nbctl ls-add ls
+check ovn-nbctl lsp-add ls vm1
+check ovn-nbctl lsp-set-addresses vm1 00:00:00:00:00:01
+check ovn-nbctl lsp-add ls vm2
+check ovn-nbctl lsp-set-addresses vm2 00:00:00:00:00:02
+check ovn-nbctl lsp-add ls vm3
+check ovn-nbctl lsp-set-addresses vm3 00:00:00:00:00:03
+
+check ovn-nbctl set logical_switch ls other_config:mcast_querier=false other_config:mcast_snoop=true other_config:mcast_query_interval=30 other_config:mcast_eth_src=00:00:00:00:00:05 other_config:mcast_ip4_src=42.42.42.5 other_config:mcast_ip6_src=fe80::1 other_config:mcast_idle_timeout=3000
+ovn-sbctl list ip_multicast
+
+wait_igmp_flows_installed()
+{
+    OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int table=31 | \
+    grep 'priority=90' | grep "nw_dst=$1"])
+}
+
+ADD_NAMESPACES(vm1)
+ADD_INT([vm1], [vm1], [br-int], [42.42.42.1/24])
+NS_CHECK_EXEC([vm1], [ip link set vm1 address 00:00:00:00:00:01], [0])
+NS_CHECK_EXEC([vm1], [ip route add default via 42.42.42.5], [0])
+NS_CHECK_EXEC([vm1], [ip -6 addr add 2000::1/24 dev vm1], [0])
+NS_CHECK_EXEC([vm1], [ip -6 route add default via 2000::5], [0])
+check ovs-vsctl set Interface vm1 external_ids:iface-id=vm1
+
+ADD_NAMESPACES(vm2)
+ADD_INT([vm2], [vm2], [br-int], [42.42.42.2/24])
+NS_CHECK_EXEC([vm2], [ip link set vm2 address 00:00:00:00:00:02], [0])
+NS_CHECK_EXEC([vm2], [ip -6 addr add 2000::2/64 dev vm2], [0])
+NS_CHECK_EXEC([vm2], [ip link set lo up], [0])
+check ovs-vsctl set Interface vm2 external_ids:iface-id=vm2
+
+ADD_NAMESPACES(vm3)
+NS_CHECK_EXEC([vm3], [tcpdump -n -i any -nnle > vm3.pcap 2>/dev/null &], [ignore], [ignore])
+
+ADD_INT([vm3], [vm3], [br-int], [42.42.42.3/24])
+NS_CHECK_EXEC([vm3], [ip link set vm3 address 00:00:00:00:00:03], [0])
+NS_CHECK_EXEC([vm3], [ip -6 addr add 2000::3/64 dev vm3], [0])
+NS_CHECK_EXEC([vm3], [ip link set lo up], [0])
+NS_CHECK_EXEC([vm3], [ip route add default via 42.42.42.5], [0])
+NS_CHECK_EXEC([vm3], [ip -6 route add default via 2000::5], [0])
+check ovs-vsctl set Interface vm3 external_ids:iface-id=vm3
+
+NS_CHECK_EXEC([vm2], [sysctl -w net.ipv4.igmp_max_memberships=100], [ignore], [ignore])
+NS_CHECK_EXEC([vm3], [sysctl -w net.ipv4.igmp_max_memberships=100], [ignore], [ignore])
+wait_for_ports_up
+
+NS_CHECK_EXEC([vm3], [ip addr add 228.0.0.1 dev vm3 autojoin], [0])
+wait_igmp_flows_installed 228.0.0.1
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 228.0.0.1], [ignore], [ignore])
+
+OVS_WAIT_UNTIL([
+    requests=`grep "ICMP echo request" -c vm3.pcap`
+    test "${requests}" -ge "3"
+])
+kill $(pidof tcpdump)
+
+NS_CHECK_EXEC([vm3], [tcpdump -n -i any -nnleX > vm3.pcap 2>/dev/null &], [ignore], [ignore])
+NS_CHECK_EXEC([vm2], [tcpdump -n -i any -nnleX > vm2.pcap 2>/dev/null &], [ignore], [ignore])
+NS_CHECK_EXEC([vm1], [tcpdump -n -i any -nnleX > vm1.pcap 2>/dev/null &], [ignore], [ignore])
+
+for i in `seq 1 40`; do
+    NS_CHECK_EXEC([vm2], [ip addr add 228.1.$i.1 dev vm2 autojoin &], [0])
+    NS_CHECK_EXEC([vm3], [ip addr add 229.1.$i.1 dev vm3 autojoin &], [0])
+    # Do not go too fast.  Going fast raises the chance of the SB being
+    # busy, which causes a full recompute (the engine has not run).
+    # In this test we do not want too many recomputes, as they might hide
+    # I+I related errors.
+    sleep 0.2
+done
+
+for i in `seq 1 40`; do
+    wait_igmp_flows_installed 228.1.$i.1
+    wait_igmp_flows_installed 229.1.$i.1
+done
+ovn-sbctl list multicast_group
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 228.1.1.1], [ignore], [ignore])
+
+OVS_WAIT_UNTIL([
+    requests=`grep "ICMP echo request" -c vm2.pcap`
+    test "${requests}" -ge "3"
+])
+kill $(pidof tcpdump)
+
+# The test could succeed thanks to a lucky northd recompute after hitting
+# too many flows.  Double-check that we never hit the error condition.
+AT_CHECK([grep -qE 'Too many active mcast flows' northd/ovn-northd.log], [1])
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
+/connection dropped.*/d
+/removing policing failed: No such device/d"])
+AT_CLEANUP
+])