From patchwork Thu Jul 11 08:44:45 2019
X-Patchwork-Submitter: Dumitru Ceara
X-Patchwork-Id: 1130709
From: Dumitru Ceara
To: dev@openvswitch.org
Date: Thu, 11 Jul 2019 10:44:45 +0200
Message-Id: <20190711084430.15842.49840.stgit@dceara.remote.csb>
In-Reply-To: <20190711084413.15842.62313.stgit@dceara.remote.csb>
References: <20190711084413.15842.62313.stgit@dceara.remote.csb>
Subject: [ovs-dev] [PATCH v3 1/3] packets: Add IGMPv3 query packet definitions

Signed-off-by: Dumitru Ceara
Acked-by: Mark Michelson
--- lib/packets.c | 44 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/packets.h | 19 ++++++++++++++++++- 2 files changed, 62 insertions(+), 1 deletion(-) diff --git a/lib/packets.c b/lib/packets.c index a8fd61f..ab0b1a3 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -1281,6 +1281,50 @@ packet_set_icmp(struct dp_packet *packet, uint8_t type, uint8_t code) } } +/* Sets the IGMP type to IGMP_HOST_MEMBERSHIP_QUERY and populates the + * v3 query header
fields in 'packet'. 'packet' must be a valid IGMPv3 + * query packet with its l4 offset properly populated. + */ +void +packet_set_igmp3_query(struct dp_packet *packet, uint8_t max_resp, + ovs_be32 group, bool srs, uint8_t qrv, uint8_t qqic) +{ + struct igmpv3_query_header *igh = dp_packet_l4(packet); + ovs_be16 orig_type_max_resp = + htons(igh->type << 8 | igh->max_resp); + ovs_be16 new_type_max_resp = + htons(IGMP_HOST_MEMBERSHIP_QUERY << 8 | max_resp); + + if (orig_type_max_resp != new_type_max_resp) { + igh->type = IGMP_HOST_MEMBERSHIP_QUERY; + igh->max_resp = max_resp; + igh->csum = recalc_csum16(igh->csum, orig_type_max_resp, + new_type_max_resp); + } + + ovs_be32 old_group = get_16aligned_be32(&igh->group); + + if (old_group != group) { + put_16aligned_be32(&igh->group, group); + igh->csum = recalc_csum32(igh->csum, old_group, group); + } + + /* See RFC 3376 4.1.6. */ + if (qrv > 7) { + qrv = 0; + } + + ovs_be16 orig_srs_qrv_qqic = htons(igh->srs_qrv << 8 | igh->qqic); + ovs_be16 new_srs_qrv_qqic = htons(srs << 11 | qrv << 8 | qqic); + + if (orig_srs_qrv_qqic != new_srs_qrv_qqic) { + igh->srs_qrv = (srs << 3 | qrv); + igh->qqic = qqic; + igh->csum = recalc_csum16(igh->csum, orig_srs_qrv_qqic, + new_srs_qrv_qqic); + } +} + void packet_set_nd_ext(struct dp_packet *packet, const ovs_16aligned_be32 rso_flags, const uint8_t opt_type) diff --git a/lib/packets.h b/lib/packets.h index d293b35..4124490 100644 --- a/lib/packets.h +++ b/lib/packets.h @@ -681,6 +681,7 @@ char *ip_parse_cidr_len(const char *s, int *n, ovs_be32 *ip, #define IP_ECN_ECT_0 0x02 #define IP_ECN_CE 0x03 #define IP_ECN_MASK 0x03 +#define IP_DSCP_CS6 0xc0 #define IP_DSCP_MASK 0xfc static inline int @@ -763,6 +764,20 @@ struct igmpv3_header { }; BUILD_ASSERT_DECL(IGMPV3_HEADER_LEN == sizeof(struct igmpv3_header)); +#define IGMPV3_QUERY_HEADER_LEN 12 +struct igmpv3_query_header { + uint8_t type; + uint8_t max_resp; + ovs_be16 csum; + ovs_16aligned_be32 group; + uint8_t srs_qrv; + uint8_t qqic; + ovs_be16 nsrcs; +}; +BUILD_ASSERT_DECL( + IGMPV3_QUERY_HEADER_LEN == sizeof(struct igmpv3_query_header +)); + #define IGMPV3_RECORD_LEN 8 struct igmpv3_record { uint8_t type; @@ -1543,7 +1558,9 @@ void packet_set_nd(struct dp_packet *, const struct in6_addr *target, void packet_set_nd_ext(struct dp_packet *packet, const ovs_16aligned_be32 rso_flags, const uint8_t opt_type); - +void packet_set_igmp3_query(struct dp_packet *, uint8_t max_resp, + ovs_be32 group, bool srs, uint8_t qrv, + uint8_t qqic); void packet_format_tcp_flags(struct ds *, uint16_t); const char *packet_tcp_flag_to_string(uint32_t flag); void compose_arp__(struct dp_packet *);
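For context, a minimal sketch of how the new helper is meant to be used when composing an IGMPv3 general query; patch 2/3 below does essentially this in ip_mcast_querier_send(), and the max-response value here is illustrative:

    /* 'packet' (a struct dp_packet) already carries its Ethernet and IPv4
     * headers; append the IGMPv3 query header and let the helper fill it
     * in, updating the checksum incrementally. */
    struct igmpv3_query_header *igh =
        dp_packet_put_zeros(&packet, sizeof *igh);
    dp_packet_set_l4(&packet, igh);

    /* General query: group 0.0.0.0, "Suppress Router-side Processing"
     * flag clear, QRV 0. 'max_resp' is in tenths of a second; the same
     * value is reused for QQIC, as patch 2/3 does. */
    uint8_t max_resp = 10;
    packet_set_igmp3_query(&packet, max_resp, 0, false, 0, max_resp);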
From patchwork Thu Jul 11 08:45:11 2019
X-Patchwork-Submitter: Dumitru Ceara
X-Patchwork-Id: 1130710
From: Dumitru Ceara
To: dev@openvswitch.org
Date: Thu, 11 Jul 2019 10:45:11 +0200
Message-Id: <20190711084454.15842.53180.stgit@dceara.remote.csb>
In-Reply-To: <20190711084413.15842.62313.stgit@dceara.remote.csb>
References: <20190711084413.15842.62313.stgit@dceara.remote.csb>
Subject: [ovs-dev] [PATCH v3 2/3] OVN: Add IGMP SB definitions and ovn-controller support

A new IP_Multicast table is added to the Southbound DB. This table stores the multicast related configuration for each datapath. Each row will be populated by ovn-northd and will control:
- whether IGMP Snooping is enabled, the snooping table size and the multicast group idle timeout.
- whether IGMP Querier is enabled (only if snooping is enabled too), the query interval, the query source addresses (Ethernet and IP) and the max-response field to be stored in outgoing queries.
- an additional "seq_no" column that lets ovn-sbctl or, if needed, a CMS flush all currently learned groups by incrementing the "seq_no" value.

A new IGMP_Group table is added to the Southbound DB. This table stores all the multicast groups learned by ovn-controllers. The table is indexed by datapath, group address and chassis. For a learned multicast group on a specific datapath, each ovn-controller stores its own row in this table. Each row contains the list of chassis-local ports on which the group was learned. Rows in the IGMP_Group table are updated or deleted only by the ovn-controllers that created them.

A new action ("igmp") is added to punt IGMP packets on a specific logical switch datapath to ovn-controller if IGMP snooping is enabled.
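To illustrate how these pieces fit together: the punting is driven by a logical flow that matches the new "igmp" predicate and invokes the new action. A flow installed by ovn-northd could look like the sketch below (the northd flows land in the remaining patch of this series; only the match/action pair is defined here):

  match=(igmp), action=(igmp;)

Flushing the learned groups then amounts to bumping "seq_no", e.g. via the ovn-sbctl command added by this patch ("sw0" is a hypothetical datapath name):

  $ ovn-sbctl ip-multicast-flush sw0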
Per datapath IGMP multicast snooping support is added to pinctrl: - incoming IGMP reports are processed and multicast groups are maintained (using the OVS mcast-snooping library). - each OVN controller syncs its in-memory IGMP groups to the Southbound DB in the IGMP_Group table. - pinctrl also sends periodic IGMPv3 general queries for all datapaths where querier is enabled. Signed-off-by: Mark Michelson Co-authored-by: Mark Michelson Signed-off-by: Dumitru Ceara Acked-by: Mark Michelson --- include/ovn/actions.h | 7 ovn/controller/automake.mk | 2 ovn/controller/ip-mcast.c | 164 ++++++++ ovn/controller/ip-mcast.h | 52 +++ ovn/controller/ovn-controller.c | 23 + ovn/controller/pinctrl.c | 786 +++++++++++++++++++++++++++++++++++++++ ovn/controller/pinctrl.h | 2 ovn/lib/actions.c | 16 + ovn/lib/automake.mk | 2 ovn/lib/ip-mcast-index.c | 40 ++ ovn/lib/ip-mcast-index.h | 39 ++ ovn/lib/logical-fields.c | 2 ovn/ovn-sb.ovsschema | 43 ++ ovn/ovn-sb.xml | 80 ++++ ovn/utilities/ovn-sbctl.c | 53 +++ ovn/utilities/ovn-trace.c | 4 tests/ovn.at | 4 17 files changed, 1314 insertions(+), 5 deletions(-) create mode 100644 ovn/controller/ip-mcast.c create mode 100644 ovn/controller/ip-mcast.h create mode 100644 ovn/lib/ip-mcast-index.c create mode 100644 ovn/lib/ip-mcast-index.h diff --git a/include/ovn/actions.h b/include/ovn/actions.h index f42bbc2..fe19424 100644 --- a/include/ovn/actions.h +++ b/include/ovn/actions.h @@ -67,6 +67,7 @@ struct ovn_extend_table; OVNACT(ICMP4, ovnact_nest) \ OVNACT(ICMP4_ERROR, ovnact_nest) \ OVNACT(ICMP6, ovnact_nest) \ + OVNACT(IGMP, ovnact_null) \ OVNACT(TCP_RESET, ovnact_nest) \ OVNACT(ND_NA, ovnact_nest) \ OVNACT(ND_NA_ROUTER, ovnact_nest) \ @@ -486,6 +487,12 @@ enum action_opcode { * The actions, in OpenFlow 1.3 format, follow the action_header. */ ACTION_OPCODE_ICMP4_ERROR, + + /* "igmp()". + * + * Snoop IGMP, learn the multicast participants + */ + ACTION_OPCODE_IGMP, }; /* Header. */ diff --git a/ovn/controller/automake.mk b/ovn/controller/automake.mk index fcdf7a4..193ea69 100644 --- a/ovn/controller/automake.mk +++ b/ovn/controller/automake.mk @@ -10,6 +10,8 @@ ovn_controller_ovn_controller_SOURCES = \ ovn/controller/encaps.h \ ovn/controller/ha-chassis.c \ ovn/controller/ha-chassis.h \ + ovn/controller/ip-mcast.c \ + ovn/controller/ip-mcast.h \ ovn/controller/lflow.c \ ovn/controller/lflow.h \ ovn/controller/lport.c \ diff --git a/ovn/controller/ip-mcast.c b/ovn/controller/ip-mcast.c new file mode 100644 index 0000000..ef36be2 --- /dev/null +++ b/ovn/controller/ip-mcast.c @@ -0,0 +1,164 @@ +/* Copyright (c) 2019, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "ip-mcast.h" +#include "lport.h" +#include "ovn/lib/ovn-sb-idl.h" + +/* + * Used for (faster) updating of IGMP_Group ports. 
+ */ +struct igmp_group_port { + struct hmap_node hmap_node; + const struct sbrec_port_binding *port; +}; + +struct ovsdb_idl_index * +igmp_group_index_create(struct ovsdb_idl *idl) +{ + const struct ovsdb_idl_index_column cols[] = { + { .column = &sbrec_igmp_group_col_address }, + { .column = &sbrec_igmp_group_col_datapath }, + { .column = &sbrec_igmp_group_col_chassis }, + }; + + return ovsdb_idl_index_create(idl, cols, ARRAY_SIZE(cols)); +} + +/* Looks up an IGMP group based on an IPv4 (mapped in IPv6) or IPv6 'address' + * and 'datapath'. + */ +const struct sbrec_igmp_group * +igmp_group_lookup(struct ovsdb_idl_index *igmp_groups, + const struct in6_addr *address, + const struct sbrec_datapath_binding *datapath, + const struct sbrec_chassis *chassis) +{ + char addr_str[INET6_ADDRSTRLEN]; + + if (!ipv6_string_mapped(addr_str, address)) { + return NULL; + } + + struct sbrec_igmp_group *target = + sbrec_igmp_group_index_init_row(igmp_groups); + + sbrec_igmp_group_index_set_address(target, addr_str); + sbrec_igmp_group_index_set_datapath(target, datapath); + sbrec_igmp_group_index_set_chassis(target, chassis); + + const struct sbrec_igmp_group *g = + sbrec_igmp_group_index_find(igmp_groups, target); + sbrec_igmp_group_index_destroy_row(target); + return g; +} + +/* Creates and returns a new IGMP group based on an IPv4 (mapped in IPv6) or + * IPv6 'address', 'datapath' and 'chassis'. + */ +struct sbrec_igmp_group * +igmp_group_create(struct ovsdb_idl_txn *idl_txn, + const struct in6_addr *address, + const struct sbrec_datapath_binding *datapath, + const struct sbrec_chassis *chassis) +{ + char addr_str[INET6_ADDRSTRLEN]; + + if (!ipv6_string_mapped(addr_str, address)) { + return NULL; + } + + struct sbrec_igmp_group *g = sbrec_igmp_group_insert(idl_txn); + + sbrec_igmp_group_set_address(g, addr_str); + sbrec_igmp_group_set_datapath(g, datapath); + sbrec_igmp_group_set_chassis(g, chassis); + + return g; +} + +void +igmp_group_update_ports(const struct sbrec_igmp_group *g, + struct ovsdb_idl_index *datapaths, + struct ovsdb_idl_index *port_bindings, + const struct mcast_snooping *ms OVS_UNUSED, + const struct mcast_group *mc_group) + OVS_REQ_RDLOCK(ms->rwlock) +{ + struct igmp_group_port *old_ports_storage = + (g->n_ports ? 
xmalloc(g->n_ports * sizeof *old_ports_storage) : NULL); + + struct hmap old_ports = HMAP_INITIALIZER(&old_ports); + + for (size_t i = 0; i < g->n_ports; i++) { + struct igmp_group_port *old_port = &old_ports_storage[i]; + + old_port->port = g->ports[i]; + hmap_insert(&old_ports, &old_port->hmap_node, + old_port->port->tunnel_key); + } + + struct mcast_group_bundle *bundle; + uint64_t dp_key = g->datapath->tunnel_key; + + LIST_FOR_EACH (bundle, bundle_node, &mc_group->bundle_lru) { + uint32_t port_key = (uintptr_t)bundle->port; + const struct sbrec_port_binding *sbrec_port = + lport_lookup_by_key(datapaths, port_bindings, dp_key, port_key); + if (!sbrec_port) { + continue; + } + + struct hmap_node *node = hmap_first_with_hash(&old_ports, port_key); + if (!node) { + sbrec_igmp_group_update_ports_addvalue(g, sbrec_port); + } else { + hmap_remove(&old_ports, node); + } + } + + struct igmp_group_port *igmp_port; + HMAP_FOR_EACH_POP (igmp_port, hmap_node, &old_ports) { + sbrec_igmp_group_update_ports_delvalue(g, igmp_port->port); + } + + free(old_ports_storage); + hmap_destroy(&old_ports); +} + +void +igmp_group_delete(const struct sbrec_igmp_group *g) +{ + sbrec_igmp_group_delete(g); +} + +bool +igmp_group_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *igmp_groups) +{ + const struct sbrec_igmp_group *g; + + if (!ovnsb_idl_txn) { + return true; + } + + SBREC_IGMP_GROUP_FOR_EACH_BYINDEX (g, igmp_groups) { + igmp_group_delete(g); + } + + return true; +} diff --git a/ovn/controller/ip-mcast.h b/ovn/controller/ip-mcast.h new file mode 100644 index 0000000..6014f43 --- /dev/null +++ b/ovn/controller/ip-mcast.h @@ -0,0 +1,52 @@ +/* Copyright (c) 2019, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#ifndef OVN_IP_MCAST_H +#define OVN_IP_MCAST_H 1 + +#include "mcast-snooping.h" + +struct ovsdb_idl; +struct ovsdb_idl_txn; + +struct sbrec_chassis; +struct sbrec_datapath_binding; + +struct ovsdb_idl_index *igmp_group_index_create(struct ovsdb_idl *); +const struct sbrec_igmp_group *igmp_group_lookup( + struct ovsdb_idl_index *igmp_groups, + const struct in6_addr *address, + const struct sbrec_datapath_binding *datapath, + const struct sbrec_chassis *chassis); + +struct sbrec_igmp_group *igmp_group_create( + struct ovsdb_idl_txn *idl_txn, + const struct in6_addr *address, + const struct sbrec_datapath_binding *datapath, + const struct sbrec_chassis *chassis); + +void igmp_group_update_ports(const struct sbrec_igmp_group *g, + struct ovsdb_idl_index *datapaths, + struct ovsdb_idl_index *port_bindings, + const struct mcast_snooping *ms, + const struct mcast_group *mc_group) + OVS_REQ_RDLOCK(ms->rwlock); + +void igmp_group_delete(const struct sbrec_igmp_group *g); + +bool igmp_group_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *igmp_groups); + +#endif /* ovn/controller/ip-mcast.h */ diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index c4883aa..3afe60b 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -33,6 +33,7 @@ #include "openvswitch/dynamic-string.h" #include "encaps.h" #include "fatal-signal.h" +#include "ip-mcast.h" #include "openvswitch/hmap.h" #include "lflow.h" #include "lib/vswitch-idl.h" @@ -43,6 +44,7 @@ #include "ovn/actions.h" #include "ovn/lib/chassis-index.h" #include "ovn/lib/extend-table.h" +#include "ovn/lib/ip-mcast-index.h" #include "ovn/lib/ovn-sb-idl.h" #include "ovn/lib/ovn-util.h" #include "patch.h" @@ -133,6 +135,10 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, * Monitor Logical_Flow, MAC_Binding, Multicast_Group, and DNS tables for * local datapaths. * + * Monitor IP_Multicast for local datapaths. + * + * Monitor IGMP_Groups for local chassis. + * * We always monitor patch ports because they allow us to see the linkages * between related logical datapaths. 
That way, when we know that we have * a VIF on a particular logical switch, we immediately know to monitor all @@ -142,6 +148,8 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, struct ovsdb_idl_condition mb = OVSDB_IDL_CONDITION_INIT(&mb); struct ovsdb_idl_condition mg = OVSDB_IDL_CONDITION_INIT(&mg); struct ovsdb_idl_condition dns = OVSDB_IDL_CONDITION_INIT(&dns); + struct ovsdb_idl_condition ip_mcast = OVSDB_IDL_CONDITION_INIT(&ip_mcast); + struct ovsdb_idl_condition igmp = OVSDB_IDL_CONDITION_INIT(&igmp); sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "patch"); /* XXX: We can optimize this, if we find a way to only monitor * ports that have a Gateway_Chassis that point's to our own @@ -165,6 +173,8 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, sbrec_port_binding_add_clause_options(&pb, OVSDB_F_INCLUDES, &l2); const struct smap l3 = SMAP_CONST1(&l3, "l3gateway-chassis", id); sbrec_port_binding_add_clause_options(&pb, OVSDB_F_INCLUDES, &l3); + sbrec_igmp_group_add_clause_chassis(&igmp, OVSDB_F_EQ, + &chassis->header_.uuid); } if (local_ifaces) { const char *name; @@ -184,6 +194,8 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, sbrec_mac_binding_add_clause_datapath(&mb, OVSDB_F_EQ, uuid); sbrec_multicast_group_add_clause_datapath(&mg, OVSDB_F_EQ, uuid); sbrec_dns_add_clause_datapaths(&dns, OVSDB_F_INCLUDES, &uuid, 1); + sbrec_ip_multicast_add_clause_datapath(&ip_mcast, OVSDB_F_EQ, + uuid); } } sbrec_port_binding_set_condition(ovnsb_idl, &pb); @@ -191,11 +203,15 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, sbrec_mac_binding_set_condition(ovnsb_idl, &mb); sbrec_multicast_group_set_condition(ovnsb_idl, &mg); sbrec_dns_set_condition(ovnsb_idl, &dns); + sbrec_ip_multicast_set_condition(ovnsb_idl, &ip_mcast); + sbrec_igmp_group_set_condition(ovnsb_idl, &igmp); ovsdb_idl_condition_destroy(&pb); ovsdb_idl_condition_destroy(&lf); ovsdb_idl_condition_destroy(&mb); ovsdb_idl_condition_destroy(&mg); ovsdb_idl_condition_destroy(&dns); + ovsdb_idl_condition_destroy(&ip_mcast); + ovsdb_idl_condition_destroy(&igmp); } static const char * @@ -1739,6 +1755,10 @@ main(int argc, char *argv[]) = ovsdb_idl_index_create2(ovnsb_idl_loop.idl, &sbrec_mac_binding_col_logical_port, &sbrec_mac_binding_col_ip); + struct ovsdb_idl_index *sbrec_ip_multicast + = ip_mcast_index_create(ovnsb_idl_loop.idl); + struct ovsdb_idl_index *sbrec_igmp_group + = igmp_group_index_create(ovnsb_idl_loop.idl); ovsdb_idl_track_add_all(ovnsb_idl_loop.idl); ovsdb_idl_omit_alert(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg); @@ -1980,6 +2000,8 @@ main(int argc, char *argv[]) sbrec_port_binding_by_key, sbrec_port_binding_by_name, sbrec_mac_binding_by_lport_ip, + sbrec_igmp_group, + sbrec_ip_multicast, sbrec_dns_table_get(ovnsb_idl_loop.idl), br_int, chassis, &ed_runtime_data.local_datapaths, @@ -2111,6 +2133,7 @@ main(int argc, char *argv[]) done = binding_cleanup(ovnsb_idl_txn, port_binding_table, chassis); done = chassis_cleanup(ovnsb_idl_txn, chassis) && done; done = encaps_cleanup(ovs_idl_txn, br_int) && done; + done = igmp_group_cleanup(ovnsb_idl_txn, sbrec_igmp_group) && done; if (done) { poll_immediate_wake(); } diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c index a442738..7f297cd 100644 --- a/ovn/controller/pinctrl.c +++ b/ovn/controller/pinctrl.c @@ -44,6 +44,7 @@ #include "ovn/actions.h" #include "ovn/lex.h" #include "ovn/lib/acl-log.h" +#include "ovn/lib/ip-mcast-index.h" #include "ovn/lib/ovn-l7.h" #include "ovn/lib/ovn-util.h" #include "ovn/logical-fields.h" @@ -54,6 +55,7 @@ #include 
"timeval.h" #include "vswitch-idl.h" #include "lflow.h" +#include "ip-mcast.h" VLOG_DEFINE_THIS_MODULE(pinctrl); @@ -105,6 +107,17 @@ VLOG_DEFINE_THIS_MODULE(pinctrl); * the hmap - 'buffered_mac_bindings' and reinjects the * buffered packets. * + * - igmp - This action punts an IGMP packet to the controller + * which maintains multicast group information. The + * multicast groups (mcast_snoop_map) are synced to + * the 'IGMP_Group' table by ip_mcast_sync(). + * ip_mcast_sync() also reads the 'IP_Multicast' + * (snooping and querier) configuration and builds a + * local configuration mcast_cfg_map. + * ip_mcast_snoop_run() which runs in the + * pinctrl_handler() thread configures the per datapath + * mcast_snoop_map entries according to mcast_cfg_map. + * * pinctrl module also periodically sends IPv6 Router Solicitation requests * and gARPs (for the router gateway IPs and configured NAT addresses). * @@ -122,6 +135,13 @@ VLOG_DEFINE_THIS_MODULE(pinctrl); * pinctrl_handler() thread sends these gARPs using the * shash 'send_garp_data'. * + * IGMP Queries - pinctrl_run() prepares the IGMP queries (at most one + * per local datapath) based on the mcast_snoop_map + * contents and stores them in mcast_query_list. + * + * pinctrl_handler thread sends the periodic IGMP queries + * by walking the mcast_query_list. + * * Notification between pinctrl_handler() and pinctrl_run() * ------------------------------------------------------- * 'struct seq' is used for notification between pinctrl_handler() thread @@ -131,8 +151,8 @@ VLOG_DEFINE_THIS_MODULE(pinctrl); * in 'send_garp_data', 'ipv6_ras' and 'buffered_mac_bindings' structures. * * 'pinctrl_main_seq' is used by pinctrl_handler() thread to wake up - * the main thread from poll_block() when mac bindings needs to be updated - * in the Southboubd DB. + * the main thread from poll_block() when mac bindings/igmp groups need to + * be updated in the Southboubd DB. 
* */ static struct ovs_mutex pinctrl_mutex = OVS_MUTEX_INITIALIZER; @@ -222,6 +242,30 @@ static void prepare_ipv6_ras( static void send_ipv6_ras(struct rconn *swconn, long long int *send_ipv6_ra_time) OVS_REQUIRES(pinctrl_mutex); + +static void ip_mcast_snoop_init(void); +static void ip_mcast_snoop_destroy(void); +static void ip_mcast_snoop_run(void) + OVS_REQUIRES(pinctrl_mutex); +static void ip_mcast_querier_run(struct rconn *swconn, + long long int *query_time); +static void ip_mcast_querier_wait(long long int query_time); +static void ip_mcast_sync( + struct ovsdb_idl_txn *ovnsb_idl_txn, + const struct sbrec_chassis *chassis, + const struct hmap *local_datapaths, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_key, + struct ovsdb_idl_index *sbrec_igmp_groups, + struct ovsdb_idl_index *sbrec_ip_multicast) + OVS_REQUIRES(pinctrl_mutex); +static void pinctrl_ip_mcast_handle_igmp( + struct rconn *swconn, + const struct flow *ip_flow, + struct dp_packet *pkt_in, + const struct match *md, + struct ofpbuf *userdata); + static bool may_inject_pkts(void); COVERAGE_DEFINE(pinctrl_drop_put_mac_binding); @@ -234,6 +278,7 @@ pinctrl_init(void) init_send_garps(); init_ipv6_ras(); init_buffered_packets_map(); + ip_mcast_snoop_init(); pinctrl.br_int_name = NULL; pinctrl_handler_seq = seq_create(); pinctrl_main_seq = seq_create(); @@ -1748,6 +1793,10 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg) pinctrl_handle_put_icmp4_frag_mtu(swconn, &headers, &packet, &pin, &userdata, &continuation); break; + case ACTION_OPCODE_IGMP: + pinctrl_ip_mcast_handle_igmp(swconn, &headers, &packet, + &pin.flow_metadata, &userdata); + break; default: VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, @@ -1785,7 +1834,6 @@ pinctrl_recv(struct rconn *swconn, const struct ofp_header *oh, } /* Called with in the main ovn-controller thread context. */ - static void notify_pinctrl_handler(void) { @@ -1817,6 +1865,8 @@ pinctrl_handler(void *arg_) static long long int send_ipv6_ra_time = LLONG_MAX; /* Next GARP announcement in ms. */ static long long int send_garp_time = LLONG_MAX; + /* Next multicast query (IGMP) in ms. 
*/ + static long long int send_mcast_query_time = LLONG_MAX; swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION); @@ -1841,6 +1891,10 @@ pinctrl_handler(void *arg_) rconn_disconnect(swconn); } + ovs_mutex_lock(&pinctrl_mutex); + ip_mcast_snoop_run(); + ovs_mutex_unlock(&pinctrl_mutex); + rconn_run(swconn); if (rconn_is_connected(swconn)) { if (conn_seq_no != rconn_get_connection_seqno(swconn)) { @@ -1868,6 +1922,8 @@ send_ipv6_ras(swconn, &send_ipv6_ra_time); send_mac_binding_buffered_pkts(swconn); ovs_mutex_unlock(&pinctrl_mutex); + + ip_mcast_querier_run(swconn, &send_mcast_query_time); } } @@ -1875,6 +1931,7 @@ rconn_recv_wait(swconn); send_garp_wait(send_garp_time); ipv6_ra_wait(send_ipv6_ra_time); + ip_mcast_querier_wait(send_mcast_query_time); new_seq = seq_read(pinctrl_handler_seq); seq_wait(pinctrl_handler_seq, new_seq); @@ -1896,6 +1953,8 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, struct ovsdb_idl_index *sbrec_port_binding_by_key, struct ovsdb_idl_index *sbrec_port_binding_by_name, struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, + struct ovsdb_idl_index *sbrec_igmp_groups, + struct ovsdb_idl_index *sbrec_ip_multicast_opts, const struct sbrec_dns_table *dns_table, const struct ovsrec_bridge *br_int, const struct sbrec_chassis *chassis, @@ -1922,9 +1981,16 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, prepare_ipv6_ras(sbrec_port_binding_by_datapath, sbrec_port_binding_by_name, local_datapaths); sync_dns_cache(dns_table); + ip_mcast_sync(ovnsb_idl_txn, chassis, local_datapaths, + sbrec_datapath_binding_by_key, + sbrec_port_binding_by_key, + sbrec_igmp_groups, + sbrec_ip_multicast_opts); + run_buffered_binding(sbrec_port_binding_by_datapath, sbrec_mac_binding_by_lport_ip, local_datapaths); + ovs_mutex_unlock(&pinctrl_mutex); } @@ -2272,6 +2338,7 @@ pinctrl_destroy(void) destroy_buffered_packets_map(); destroy_put_mac_bindings(); destroy_dns_cache(); + ip_mcast_snoop_destroy(); seq_destroy(pinctrl_main_seq); seq_destroy(pinctrl_handler_seq); } @@ -2711,6 +2778,718 @@ send_garp(struct rconn *swconn, struct garp_data *garp, return garp->announce_time; } +/* + * Multicast snooping configuration. + */ +struct ip_mcast_snoop_cfg { + bool enabled; + bool querier_enabled; + + uint32_t table_size; /* Max number of allowed multicast groups. */ + uint32_t idle_time_s; /* Idle timeout for multicast groups. */ + uint32_t query_interval_s; /* Multicast query interval. */ + uint32_t query_max_resp_s; /* Multicast query max-response field. */ + uint32_t seq_no; /* Used for flushing learnt groups. */ + + struct eth_addr query_eth_src; /* Src ETH address used for queries. */ + struct eth_addr query_eth_dst; /* Dst ETH address used for queries. */ + ovs_be32 query_ipv4_src; /* Src IPv4 address used for queries. */ + ovs_be32 query_ipv4_dst; /* Dst IPv4 address used for queries. */ +}; + +/* + * Holds per-datapath information about multicast snooping. Maintained by + * pinctrl_handler(). + */ +struct ip_mcast_snoop { + struct hmap_node hmap_node; /* Linkage in the hash map. */ + struct ovs_list query_node; /* Linkage in the query list. */ + struct ip_mcast_snoop_cfg cfg; /* Multicast configuration. */ + struct mcast_snooping *ms; /* Multicast group state. */ + int64_t dp_key; /* Datapath running the snooping. */ + + long long int query_time_ms; /* Next query time in ms. */ +}; + +/* + * Holds the per-datapath multicast configuration state. Maintained by + * pinctrl_run().
+ */ +struct ip_mcast_snoop_state { + struct hmap_node hmap_node; + int64_t dp_key; + struct ip_mcast_snoop_cfg cfg; +}; + +/* Only default vlan supported for now. */ +#define IP_MCAST_VLAN 1 + +/* Multicast snooping information stored independently by datapath key. + * Protected by pinctrl_mutex. pinctrl_handler has RW access and pinctrl_main + * has RO access. + */ +static struct hmap mcast_snoop_map OVS_GUARDED_BY(pinctrl_mutex); + +/* Contains multicast queries to be sent. Only used by pinctrl_handler so no + * locking needed. + */ +static struct ovs_list mcast_query_list; + +/* Multicast config information stored independently by datapath key. + * Protected by pinctrl_mutex. pinctrl_handler has RO access and pinctrl_main + * has RW access. Read accesses from pinctrl_ip_mcast_handle_igmp() can be + * performed without taking the lock as they are executed in the pinctrl_main + * thread. + */ +static struct hmap mcast_cfg_map OVS_GUARDED_BY(pinctrl_mutex); + +static void +ip_mcast_snoop_cfg_load(struct ip_mcast_snoop_cfg *cfg, + const struct sbrec_ip_multicast *ip_mcast) +{ + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); + + memset(cfg, 0, sizeof *cfg); + cfg->enabled = + (ip_mcast->enabled && ip_mcast->enabled[0]); + cfg->querier_enabled = + (cfg->enabled && ip_mcast->querier && ip_mcast->querier[0]); + + if (ip_mcast->table_size) { + cfg->table_size = ip_mcast->table_size[0]; + } else { + cfg->table_size = OVN_MCAST_DEFAULT_MAX_ENTRIES; + } + + if (ip_mcast->idle_timeout) { + cfg->idle_time_s = ip_mcast->idle_timeout[0]; + } else { + cfg->idle_time_s = OVN_MCAST_DEFAULT_IDLE_TIMEOUT_S; + } + + if (ip_mcast->query_interval) { + cfg->query_interval_s = ip_mcast->query_interval[0]; + } else { + cfg->query_interval_s = cfg->idle_time_s / 2; + if (cfg->query_interval_s < OVN_MCAST_MIN_QUERY_INTERVAL_S) { + cfg->query_interval_s = OVN_MCAST_MIN_QUERY_INTERVAL_S; + } + } + + if (ip_mcast->query_max_resp) { + cfg->query_max_resp_s = ip_mcast->query_max_resp[0]; + } else { + cfg->query_max_resp_s = OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S; + } + + cfg->seq_no = ip_mcast->seq_no; + + if (cfg->querier_enabled) { + /* Try to parse the source ETH address. */ + if (!ip_mcast->eth_src || + !eth_addr_from_string(ip_mcast->eth_src, + &cfg->query_eth_src)) { + VLOG_WARN_RL(&rl, + "IGMP Querier enabled with invalid ETH src address"); + /* Failed to parse the ETH source address. Disable the querier. */ + cfg->querier_enabled = false; + } + + /* Try to parse the source IP address. */ + if (!ip_mcast->ip4_src || + !ip_parse(ip_mcast->ip4_src, &cfg->query_ipv4_src)) { + VLOG_WARN_RL(&rl, + "IGMP Querier enabled with invalid IPv4 src address"); + /* Failed to parse the IPv4 source address. Disable the querier. */ + cfg->querier_enabled = false; + } + + /* IGMP queries must be sent to 224.0.0.1.
*/ + cfg->query_eth_dst = + (struct eth_addr)ETH_ADDR_C(01, 00, 5E, 00, 00, 01); + cfg->query_ipv4_dst = htonl(0xe0000001); + } +} + +static uint32_t +ip_mcast_snoop_hash(int64_t dp_key) +{ + return hash_uint64(dp_key); +} + +static struct ip_mcast_snoop_state * +ip_mcast_snoop_state_add(int64_t dp_key) + OVS_REQUIRES(pinctrl_mutex) +{ + struct ip_mcast_snoop_state *ms_state = xmalloc(sizeof *ms_state); + + ms_state->dp_key = dp_key; + hmap_insert(&mcast_cfg_map, &ms_state->hmap_node, + ip_mcast_snoop_hash(dp_key)); + return ms_state; +} + +static struct ip_mcast_snoop_state * +ip_mcast_snoop_state_find(int64_t dp_key) + OVS_REQUIRES(pinctrl_mutex) +{ + struct ip_mcast_snoop_state *ms_state; + uint32_t hash = ip_mcast_snoop_hash(dp_key); + + HMAP_FOR_EACH_WITH_HASH (ms_state, hmap_node, hash, &mcast_cfg_map) { + if (ms_state->dp_key == dp_key) { + return ms_state; + } + } + return NULL; +} + +static bool +ip_mcast_snoop_state_update(int64_t dp_key, + const struct ip_mcast_snoop_cfg *cfg) + OVS_REQUIRES(pinctrl_mutex) +{ + bool notify = false; + struct ip_mcast_snoop_state *ms_state = ip_mcast_snoop_state_find(dp_key); + + if (!ms_state) { + ms_state = ip_mcast_snoop_state_add(dp_key); + notify = true; + } else if (memcmp(cfg, &ms_state->cfg, sizeof *cfg)) { + notify = true; + } + + ms_state->cfg = *cfg; + return notify; +} + +static void +ip_mcast_snoop_state_remove(struct ip_mcast_snoop_state *ms_state) + OVS_REQUIRES(pinctrl_mutex) +{ + hmap_remove(&mcast_cfg_map, &ms_state->hmap_node); + free(ms_state); +} + +static bool +ip_mcast_snoop_enable(struct ip_mcast_snoop *ip_ms) +{ + if (ip_ms->cfg.enabled) { + return true; + } + + ip_ms->ms = mcast_snooping_create(); + return ip_ms->ms != NULL; +} + +static void +ip_mcast_snoop_flush(struct ip_mcast_snoop *ip_ms) +{ + if (!ip_ms->cfg.enabled) { + return; + } + + mcast_snooping_flush(ip_ms->ms); +} + +static void +ip_mcast_snoop_disable(struct ip_mcast_snoop *ip_ms) +{ + if (!ip_ms->cfg.enabled) { + return; + } + + mcast_snooping_unref(ip_ms->ms); + ip_ms->ms = NULL; +} + +static bool +ip_mcast_snoop_configure(struct ip_mcast_snoop *ip_ms, + const struct ip_mcast_snoop_cfg *cfg) +{ + if (cfg->enabled) { + if (!ip_mcast_snoop_enable(ip_ms)) { + return false; + } + if (ip_ms->cfg.seq_no != cfg->seq_no) { + ip_mcast_snoop_flush(ip_ms); + } + + if (ip_ms->cfg.querier_enabled && !cfg->querier_enabled) { + ovs_list_remove(&ip_ms->query_node); + } else if (!ip_ms->cfg.querier_enabled && cfg->querier_enabled) { + ovs_list_push_back(&mcast_query_list, &ip_ms->query_node); + } + } else { + ip_mcast_snoop_disable(ip_ms); + goto set_fields; + } + + ovs_rwlock_wrlock(&ip_ms->ms->rwlock); + if (cfg->table_size != ip_ms->cfg.table_size) { + mcast_snooping_set_max_entries(ip_ms->ms, cfg->table_size); + } + + if (cfg->idle_time_s != ip_ms->cfg.idle_time_s) { + mcast_snooping_set_idle_time(ip_ms->ms, cfg->idle_time_s); + } + ovs_rwlock_unlock(&ip_ms->ms->rwlock); + + if (cfg->query_interval_s != ip_ms->cfg.query_interval_s) { + long long int now = time_msec(); + + if (ip_ms->query_time_ms > now + cfg->query_interval_s * 1000) { + ip_ms->query_time_ms = now; + } + } + +set_fields: + memcpy(&ip_ms->cfg, cfg, sizeof ip_ms->cfg); + return true; +} + +static struct ip_mcast_snoop * +ip_mcast_snoop_add(int64_t dp_key, const struct ip_mcast_snoop_cfg *cfg) + OVS_REQUIRES(pinctrl_mutex) +{ + struct ip_mcast_snoop *ip_ms = xzalloc(sizeof *ip_ms); + + ip_ms->dp_key = dp_key; + if (!ip_mcast_snoop_configure(ip_ms, cfg)) { + free(ip_ms); + return NULL; + } + + 
hmap_insert(&mcast_snoop_map, &ip_ms->hmap_node, + ip_mcast_snoop_hash(dp_key)); + return ip_ms; +} + +static struct ip_mcast_snoop * +ip_mcast_snoop_find(int64_t dp_key) + OVS_REQUIRES(pinctrl_mutex) +{ + struct ip_mcast_snoop *ip_ms; + + HMAP_FOR_EACH_WITH_HASH (ip_ms, hmap_node, ip_mcast_snoop_hash(dp_key), + &mcast_snoop_map) { + if (ip_ms->dp_key == dp_key) { + return ip_ms; + } + } + return NULL; +} + +static void +ip_mcast_snoop_remove(struct ip_mcast_snoop *ip_ms) + OVS_REQUIRES(pinctrl_mutex) +{ + hmap_remove(&mcast_snoop_map, &ip_ms->hmap_node); + + if (ip_ms->cfg.querier_enabled) { + ovs_list_remove(&ip_ms->query_node); + } + + ip_mcast_snoop_disable(ip_ms); + free(ip_ms); +} + +static void +ip_mcast_snoop_init(void) + OVS_NO_THREAD_SAFETY_ANALYSIS +{ + hmap_init(&mcast_snoop_map); + ovs_list_init(&mcast_query_list); + hmap_init(&mcast_cfg_map); +} + +static void +ip_mcast_snoop_destroy(void) + OVS_NO_THREAD_SAFETY_ANALYSIS +{ + struct ip_mcast_snoop *ip_ms, *ip_ms_next; + + HMAP_FOR_EACH_SAFE (ip_ms, ip_ms_next, hmap_node, &mcast_snoop_map) { + ip_mcast_snoop_remove(ip_ms); + } + hmap_destroy(&mcast_snoop_map); + + struct ip_mcast_snoop_state *ip_ms_state; + + HMAP_FOR_EACH_POP (ip_ms_state, hmap_node, &mcast_cfg_map) { + free(ip_ms_state); + } +} + +static void +ip_mcast_snoop_run(void) + OVS_REQUIRES(pinctrl_mutex) +{ + struct ip_mcast_snoop *ip_ms, *ip_ms_next; + + /* First read the config updated by pinctrl_main. If there's any new or + * updated config then apply it. + */ + struct ip_mcast_snoop_state *ip_ms_state; + + HMAP_FOR_EACH (ip_ms_state, hmap_node, &mcast_cfg_map) { + ip_ms = ip_mcast_snoop_find(ip_ms_state->dp_key); + + if (!ip_ms) { + ip_mcast_snoop_add(ip_ms_state->dp_key, &ip_ms_state->cfg); + } else if (memcmp(&ip_ms_state->cfg, &ip_ms->cfg, + sizeof ip_ms_state->cfg)) { + ip_mcast_snoop_configure(ip_ms, &ip_ms_state->cfg); + } + } + + bool notify = false; + + /* Then walk the multicast snoop instances. */ + HMAP_FOR_EACH_SAFE (ip_ms, ip_ms_next, hmap_node, &mcast_snoop_map) { + + /* Delete the stale ones. */ + if (!ip_mcast_snoop_state_find(ip_ms->dp_key)) { + ip_mcast_snoop_remove(ip_ms); + continue; + } + + /* If enabled run the snooping instance to timeout old groups. */ + if (ip_ms->cfg.enabled) { + if (mcast_snooping_run(ip_ms->ms)) { + notify = true; + } + + mcast_snooping_wait(ip_ms->ms); + } + } + + if (notify) { + notify_pinctrl_main(); + } +} + +/* + * This runs in the pinctrl main thread, so it has access to the southbound + * database. It reads the IP_Multicast table and updates the local multicast + * configuration. Then writes to the southbound database the updated + * IGMP_Groups. + */ +static void +ip_mcast_sync(struct ovsdb_idl_txn *ovnsb_idl_txn, + const struct sbrec_chassis *chassis, + const struct hmap *local_datapaths, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_key, + struct ovsdb_idl_index *sbrec_igmp_groups, + struct ovsdb_idl_index *sbrec_ip_multicast) + OVS_REQUIRES(pinctrl_mutex) +{ + bool notify = false; + + if (!ovnsb_idl_txn || !chassis) { + return; + } + + struct sbrec_ip_multicast *ip_mcast; + struct ip_mcast_snoop_state *ip_ms_state, *ip_ms_state_next; + + /* First read and update our own local multicast configuration for the + * local datapaths. 
+ */ + SBREC_IP_MULTICAST_FOR_EACH_BYINDEX (ip_mcast, sbrec_ip_multicast) { + + int64_t dp_key = ip_mcast->datapath->tunnel_key; + struct ip_mcast_snoop_cfg cfg; + + ip_mcast_snoop_cfg_load(&cfg, ip_mcast); + if (ip_mcast_snoop_state_update(dp_key, &cfg)) { + notify = true; + } + } + + /* Then delete the old entries. */ + HMAP_FOR_EACH_SAFE (ip_ms_state, ip_ms_state_next, hmap_node, + &mcast_cfg_map) { + if (!get_local_datapath(local_datapaths, ip_ms_state->dp_key)) { + ip_mcast_snoop_state_remove(ip_ms_state); + notify = true; + } + } + + const struct sbrec_igmp_group *sbrec_igmp; + + /* Then flush any IGMP_Group entries that are not needed anymore: + * - either multicast snooping was disabled on the datapath + * - or the group has expired. + */ + SBREC_IGMP_GROUP_FOR_EACH_BYINDEX (sbrec_igmp, sbrec_igmp_groups) { + ovs_be32 group_addr; + + if (!sbrec_igmp->datapath) { + continue; + } + + int64_t dp_key = sbrec_igmp->datapath->tunnel_key; + struct ip_mcast_snoop *ip_ms = ip_mcast_snoop_find(dp_key); + + /* If the datapath doesn't exist anymore or IGMP snooping was disabled + * on it then delete the IGMP_Group entry. + */ + if (!ip_ms || !ip_ms->cfg.enabled) { + igmp_group_delete(sbrec_igmp); + continue; + } + + if (!ip_parse(sbrec_igmp->address, &group_addr)) { + continue; + } + + ovs_rwlock_rdlock(&ip_ms->ms->rwlock); + struct mcast_group *mc_group = + mcast_snooping_lookup4(ip_ms->ms, group_addr, + IP_MCAST_VLAN); + + if (!mc_group || ovs_list_is_empty(&mc_group->bundle_lru)) { + igmp_group_delete(sbrec_igmp); + } + ovs_rwlock_unlock(&ip_ms->ms->rwlock); + } + + struct ip_mcast_snoop *ip_ms, *ip_ms_next; + + /* Last: write new IGMP_Groups to the southbound DB and update existing + * ones (if needed). We also flush any old per-datapath multicast snoop + * structures. + */ + HMAP_FOR_EACH_SAFE (ip_ms, ip_ms_next, hmap_node, &mcast_snoop_map) { + /* Flush any non-local snooping datapaths (e.g., stale). */ + struct local_datapath *local_dp = + get_local_datapath(local_datapaths, ip_ms->dp_key); + + if (!local_dp) { + continue; + } + + /* Skip datapaths on which snooping is disabled. */ + if (!ip_ms->cfg.enabled) { + continue; + } + + struct mcast_group *mc_group; + + ovs_rwlock_rdlock(&ip_ms->ms->rwlock); + LIST_FOR_EACH (mc_group, group_node, &ip_ms->ms->group_lru) { + if (ovs_list_is_empty(&mc_group->bundle_lru)) { + continue; + } + sbrec_igmp = igmp_group_lookup(sbrec_igmp_groups, &mc_group->addr, + local_dp->datapath, chassis); + if (!sbrec_igmp) { + sbrec_igmp = igmp_group_create(ovnsb_idl_txn, &mc_group->addr, + local_dp->datapath, chassis); + } + + igmp_group_update_ports(sbrec_igmp, sbrec_datapath_binding_by_key, + sbrec_port_binding_by_key, ip_ms->ms, + mc_group); + } + ovs_rwlock_unlock(&ip_ms->ms->rwlock); + } + + if (notify) { + notify_pinctrl_handler(); + } +} + +static void +pinctrl_ip_mcast_handle_igmp(struct rconn *swconn OVS_UNUSED, + const struct flow *ip_flow, + struct dp_packet *pkt_in, + const struct match *md, + struct ofpbuf *userdata OVS_UNUSED) + OVS_NO_THREAD_SAFETY_ANALYSIS +{ + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); + + /* This action only works for IP packets, and the switch should only send + * us IP packets this way, but check here just to be sure. 
+ */ + if (ip_flow->dl_type != htons(ETH_TYPE_IP)) { + VLOG_WARN_RL(&rl, + "IGMP action on non-IP packet (eth_type 0x%"PRIx16")", + ntohs(ip_flow->dl_type)); + return; + } + + int64_t dp_key = ntohll(md->flow.metadata); + uint32_t port_key = md->flow.regs[MFF_LOG_INPORT - MFF_REG0]; + + const struct igmp_header *igmp; + size_t offset; + + offset = (char *) dp_packet_l4(pkt_in) - (char *) dp_packet_data(pkt_in); + igmp = dp_packet_at(pkt_in, offset, IGMP_HEADER_LEN); + if (!igmp || csum(igmp, dp_packet_l4_size(pkt_in)) != 0) { + VLOG_WARN_RL(&rl, "multicast snooping received bad IGMP checksum"); + return; + } + + ovs_be32 ip4 = ip_flow->igmp_group_ip4; + + struct ip_mcast_snoop *ip_ms = ip_mcast_snoop_find(dp_key); + if (!ip_ms || !ip_ms->cfg.enabled) { + /* IGMP snooping is not configured or is disabled. */ + return; + } + + void *port_key_data = (void *)(uintptr_t)port_key; + + bool group_change = false; + + ovs_rwlock_wrlock(&ip_ms->ms->rwlock); + switch (ntohs(ip_flow->tp_src)) { + /* Only default VLAN is supported for now. */ + case IGMP_HOST_MEMBERSHIP_REPORT: + case IGMPV2_HOST_MEMBERSHIP_REPORT: + group_change = + mcast_snooping_add_group4(ip_ms->ms, ip4, IP_MCAST_VLAN, + port_key_data); + break; + case IGMP_HOST_LEAVE_MESSAGE: + group_change = + mcast_snooping_leave_group4(ip_ms->ms, ip4, IP_MCAST_VLAN, + port_key_data); + break; + case IGMP_HOST_MEMBERSHIP_QUERY: + /* Shouldn't be receiving any of these since we are the multicast + * router. Store them for now. + */ + group_change = + mcast_snooping_add_mrouter(ip_ms->ms, IP_MCAST_VLAN, + port_key_data); + break; + case IGMPV3_HOST_MEMBERSHIP_REPORT: + group_change = + mcast_snooping_add_report(ip_ms->ms, pkt_in, IP_MCAST_VLAN, + port_key_data); + break; + } + ovs_rwlock_unlock(&ip_ms->ms->rwlock); + + if (group_change) { + notify_pinctrl_main(); + } +} + +static long long int +ip_mcast_querier_send(struct rconn *swconn, struct ip_mcast_snoop *ip_ms, + long long int current_time) +{ + if (current_time < ip_ms->query_time_ms) { + return ip_ms->query_time_ms; + } + + /* Compose a multicast query. */ + uint64_t packet_stub[128 / 8]; + struct dp_packet packet; + + dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub); + + uint8_t ip_tos = 0; + uint8_t igmp_ttl = 1; + + dp_packet_clear(&packet); + packet.packet_type = htonl(PT_ETH); + + struct eth_header *eh = dp_packet_put_zeros(&packet, sizeof *eh); + eh->eth_dst = ip_ms->cfg.query_eth_dst; + eh->eth_src = ip_ms->cfg.query_eth_src; + + struct ip_header *nh = dp_packet_put_zeros(&packet, sizeof *nh); + + eh->eth_type = htons(ETH_TYPE_IP); + dp_packet_set_l3(&packet, nh); + nh->ip_ihl_ver = IP_IHL_VER(5, 4); + nh->ip_tot_len = htons(sizeof(struct ip_header) + + sizeof(struct igmpv3_query_header)); + nh->ip_tos = IP_DSCP_CS6; + nh->ip_proto = IPPROTO_IGMP; + nh->ip_frag_off = htons(IP_DF); + packet_set_ipv4(&packet, ip_ms->cfg.query_ipv4_src, + ip_ms->cfg.query_ipv4_dst, ip_tos, igmp_ttl); + + nh->ip_csum = 0; + nh->ip_csum = csum(nh, sizeof *nh); + + struct igmpv3_query_header *igh = + dp_packet_put_zeros(&packet, sizeof *igh); + dp_packet_set_l4(&packet, igh); + + /* IGMP query max-response in tenths of seconds. */ + uint8_t max_response = ip_ms->cfg.query_max_resp_s * 10; + uint8_t qqic = max_response; + packet_set_igmp3_query(&packet, max_response, 0, false, 0, qqic); + + /* Inject multicast query. 
*/ + uint64_t ofpacts_stub[4096 / 8]; + struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(ofpacts_stub); + enum ofp_version version = rconn_get_version(swconn); + put_load(ip_ms->dp_key, MFF_LOG_DATAPATH, 0, 64, &ofpacts); + put_load(OVN_MCAST_FLOOD_TUNNEL_KEY, MFF_LOG_OUTPORT, 0, 32, &ofpacts); + put_load(1, MFF_LOG_FLAGS, MLF_LOCAL_ONLY, 1, &ofpacts); + struct ofpact_resubmit *resubmit = ofpact_put_RESUBMIT(&ofpacts); + resubmit->in_port = OFPP_CONTROLLER; + resubmit->table_id = OFTABLE_LOCAL_OUTPUT; + + struct ofputil_packet_out po = { + .packet = dp_packet_data(&packet), + .packet_len = dp_packet_size(&packet), + .buffer_id = UINT32_MAX, + .ofpacts = ofpacts.data, + .ofpacts_len = ofpacts.size, + }; + match_set_in_port(&po.flow_metadata, OFPP_CONTROLLER); + enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version); + queue_msg(swconn, ofputil_encode_packet_out(&po, proto)); + dp_packet_uninit(&packet); + ofpbuf_uninit(&ofpacts); + + /* Set the next query time. */ + ip_ms->query_time_ms = current_time + ip_ms->cfg.query_interval_s * 1000; + return ip_ms->query_time_ms; +} + +static void +ip_mcast_querier_run(struct rconn *swconn, long long int *query_time) +{ + if (ovs_list_is_empty(&mcast_query_list)) { + return; + } + + /* Send multicast queries and update the next query time. */ + long long int current_time = time_msec(); + *query_time = LLONG_MAX; + + struct ip_mcast_snoop *ip_ms; + + LIST_FOR_EACH (ip_ms, query_node, &mcast_query_list) { + long long int next_query_time = + ip_mcast_querier_send(swconn, ip_ms, current_time); + if (*query_time > next_query_time) { + *query_time = next_query_time; + } + } +} + +static void +ip_mcast_querier_wait(long long int query_time) +{ + if (!ovs_list_is_empty(&mcast_query_list)) { + poll_timer_wait_until(query_time); + } +} + /* Get localnet vifs, local l3gw ports and ofport for localnet patch ports. 
*/ static void get_localnet_vifs_l3gwports( @@ -3059,6 +3838,7 @@ may_inject_pkts(void) { return (!shash_is_empty(&ipv6_ras) || !shash_is_empty(&send_garp_data) || + !ovs_list_is_empty(&mcast_query_list) || !ovs_list_is_empty(&buffered_mac_bindings)); } diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h index f61d705..a0b478e 100644 --- a/ovn/controller/pinctrl.h +++ b/ovn/controller/pinctrl.h @@ -37,6 +37,8 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, struct ovsdb_idl_index *sbrec_port_binding_by_key, struct ovsdb_idl_index *sbrec_port_binding_by_name, struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, + struct ovsdb_idl_index *sbrec_igmp_groups, + struct ovsdb_idl_index *sbrec_ip_multicast_opts, const struct sbrec_dns_table *, const struct ovsrec_bridge *, const struct sbrec_chassis *, const struct hmap *local_datapaths, diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c index d132214..6d7e669 100644 --- a/ovn/lib/actions.c +++ b/ovn/lib/actions.c @@ -1230,6 +1230,12 @@ format_ICMP6(const struct ovnact_nest *nest, struct ds *s) } static void +format_IGMP(const struct ovnact_null *a OVS_UNUSED, struct ds *s) +{ + ds_put_cstr(s, "igmp;"); +} + +static void format_TCP_RESET(const struct ovnact_nest *nest, struct ds *s) { format_nested_action(nest, "tcp_reset", s); @@ -1317,6 +1323,14 @@ encode_ICMP6(const struct ovnact_nest *on, } static void +encode_IGMP(const struct ovnact_null *a OVS_UNUSED, + const struct ovnact_encode_params *ep OVS_UNUSED, + struct ofpbuf *ofpacts OVS_UNUSED) +{ + encode_controller_op(ACTION_OPCODE_IGMP, ofpacts); +} + +static void encode_TCP_RESET(const struct ovnact_nest *on, const struct ovnact_encode_params *ep, struct ofpbuf *ofpacts) @@ -2492,6 +2506,8 @@ parse_action(struct action_context *ctx) parse_ICMP4_ERROR(ctx); } else if (lexer_match_id(ctx->lexer, "icmp6")) { parse_ICMP6(ctx); + } else if (lexer_match_id(ctx->lexer, "igmp")) { + ovnact_put_IGMP(ctx->ovnacts); } else if (lexer_match_id(ctx->lexer, "tcp_reset")) { parse_TCP_RESET(ctx); } else if (lexer_match_id(ctx->lexer, "nd_na")) { diff --git a/ovn/lib/automake.mk b/ovn/lib/automake.mk index 023f51b..8f69982 100644 --- a/ovn/lib/automake.mk +++ b/ovn/lib/automake.mk @@ -12,6 +12,8 @@ ovn_lib_libovn_la_SOURCES = \ ovn/lib/expr.c \ ovn/lib/extend-table.h \ ovn/lib/extend-table.c \ + ovn/lib/ip-mcast-index.c \ + ovn/lib/ip-mcast-index.h \ ovn/lib/lex.c \ ovn/lib/ovn-l7.h \ ovn/lib/ovn-util.c \ diff --git a/ovn/lib/ip-mcast-index.c b/ovn/lib/ip-mcast-index.c new file mode 100644 index 0000000..1f6ebc4 --- /dev/null +++ b/ovn/lib/ip-mcast-index.c @@ -0,0 +1,40 @@ +/* Copyright (c) 2019, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#include + +#include "ovn/lib/ip-mcast-index.h" +#include "ovn/lib/ovn-sb-idl.h" + +struct ovsdb_idl_index * +ip_mcast_index_create(struct ovsdb_idl *idl) +{ + return ovsdb_idl_index_create1(idl, &sbrec_ip_multicast_col_datapath); +} + +const struct sbrec_ip_multicast * +ip_mcast_lookup(struct ovsdb_idl_index *ip_mcast_index, + const struct sbrec_datapath_binding *datapath) +{ + struct sbrec_ip_multicast *target = + sbrec_ip_multicast_index_init_row(ip_mcast_index); + sbrec_ip_multicast_index_set_datapath(target, datapath); + + struct sbrec_ip_multicast *ip_mcast = + sbrec_ip_multicast_index_find(ip_mcast_index, target); + sbrec_ip_multicast_index_destroy_row(target); + + return ip_mcast; +} diff --git a/ovn/lib/ip-mcast-index.h b/ovn/lib/ip-mcast-index.h new file mode 100644 index 0000000..15141ea --- /dev/null +++ b/ovn/lib/ip-mcast-index.h @@ -0,0 +1,39 @@ +/* Copyright (c) 2019, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef OVN_IP_MCAST_INDEX_H +#define OVN_IP_MCAST_INDEX_H 1 + +struct ovsdb_idl; + +struct sbrec_datapath_binding; + +#define OVN_MCAST_MIN_IDLE_TIMEOUT_S 15 +#define OVN_MCAST_MAX_IDLE_TIMEOUT_S 3600 +#define OVN_MCAST_DEFAULT_IDLE_TIMEOUT_S 300 +#define OVN_MCAST_MIN_QUERY_INTERVAL_S 1 +#define OVN_MCAST_MAX_QUERY_INTERVAL_S OVN_MCAST_MAX_IDLE_TIMEOUT_S +#define OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S 1 +#define OVN_MCAST_DEFAULT_MAX_ENTRIES 2048 + +#define OVN_MCAST_FLOOD_TUNNEL_KEY 65535 +#define OVN_MCAST_UNKNOWN_TUNNEL_KEY (OVN_MCAST_FLOOD_TUNNEL_KEY - 1) + +struct ovsdb_idl_index *ip_mcast_index_create(struct ovsdb_idl *); +const struct sbrec_ip_multicast *ip_mcast_lookup( + struct ovsdb_idl_index *ip_mcast_index, + const struct sbrec_datapath_binding *datapath); + +#endif /* ovn/lib/ip-mcast-index.h */ diff --git a/ovn/lib/logical-fields.c b/ovn/lib/logical-fields.c index 579537d..9b80438 100644 --- a/ovn/lib/logical-fields.c +++ b/ovn/lib/logical-fields.c @@ -164,6 +164,8 @@ ovn_init_symtab(struct shash *symtab) expr_symtab_add_field(symtab, "icmp4.code", MFF_ICMPV4_CODE, "icmp4", false); + expr_symtab_add_predicate(symtab, "igmp", "ip4 && ip.proto == 2"); + expr_symtab_add_field(symtab, "ip6.src", MFF_IPV6_SRC, "ip6", false); expr_symtab_add_field(symtab, "ip6.dst", MFF_IPV6_DST, "ip6", false); expr_symtab_add_field(symtab, "ip6.label", MFF_IPV6_LABEL, "ip6", false); diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema index 2b543c6..f642790 100644 --- a/ovn/ovn-sb.ovsschema +++ b/ovn/ovn-sb.ovsschema @@ -1,7 +1,7 @@ { "name": "OVN_Southbound", - "version": "2.3.0", - "cksum": "3092285199 17409", + "version": "2.4.0", + "cksum": "4023664301 19573", "tables": { "SB_Global": { "columns": { @@ -349,4 +349,43 @@ "type": {"key": "string", "value": "string", "min": 0, "max": "unlimited"}}}, "indexes": [["name"]], + "isRoot": true}, + "IP_Multicast": { + "columns": { + "datapath": {"type": {"key": {"type": "uuid", + "refTable": "Datapath_Binding", + "refType": "weak"}}}, + "enabled": {"type": {"key": "boolean", "min": 
0, "max": 1}}, + "querier": {"type": {"key": "boolean", "min": 0, "max": 1}}, + "eth_src": {"type": "string"}, + "ip4_src": {"type": "string"}, + "table_size": {"type": {"key": "integer", + "min": 0, "max": 1}}, + "idle_timeout": {"type": {"key": "integer", + "min": 0, "max": 1}}, + "query_interval": {"type": {"key": "integer", + "min": 0, "max": 1}}, + "query_max_resp": {"type": {"key": "integer", + "min": 0, "max": 1}}, + "seq_no": {"type": "integer"}}, + "indexes": [["datapath"]], + "isRoot": true}, + "IGMP_Group": { + "columns": { + "address": {"type": "string"}, + "datapath": {"type": {"key": {"type": "uuid", + "refTable": "Datapath_Binding", + "refType": "weak"}, + "min": 0, + "max": 1}}, + "chassis": {"type": {"key": {"type": "uuid", + "refTable": "Chassis", + "refType": "weak"}, + "min": 0, + "max": 1}}, + "ports": {"type": {"key": {"type": "uuid", + "refTable": "Port_Binding", + "refType": "weak"}, + "min": 0, "max": "unlimited"}}}, + "indexes": [["address", "datapath", "chassis"]], "isRoot": true}}} diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml index c2faa2c..0b03815 100644 --- a/ovn/ovn-sb.xml +++ b/ovn/ovn-sb.xml @@ -1988,6 +1988,14 @@ tcp.flags = RST;

      Prerequisite: tcp

+      igmp;
+
+        This action sends the packet to ovn-controller for
+        multicast snooping.
+
+        Prerequisite: igmp
+
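
A quick way to see this action in the generated logical flows is the
ovn-sbctl lflow-list command. A minimal sketch, assuming a logical
switch named "sw1" (the name is an example):

  # Enable snooping so that ovn-northd generates the IGMP punt flow.
  ovn-nbctl set Logical_Switch sw1 other_config:mcast_snoop="true"
  # The flow matching "ip4 && ip.proto == 2" should carry the igmp; action.
  ovn-sbctl lflow-list sw1 | grep 'igmp;'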
@@ -3494,4 +3502,76 @@ tcp.flags = RST; + +

+      IP Multicast configuration options. For now these options are
+      applicable only to IGMP.

+      datapath: The Datapath_Binding entry for which these configuration
+      options are defined.
+
+      enabled: Enables/disables multicast snooping. Default: disabled.
+
+      querier: Enables/disables multicast querying. If snooping is enabled,
+      then multicast querying is enabled by default.
+
+      table_size: Limits the number of multicast groups that can be learned.
+      Default: 2048 groups per datapath.
+
+      idle_timeout: Configures the idle timeout (in seconds) for IP multicast
+      groups if multicast snooping is enabled. Default: 300 seconds.
+
+      query_interval: Configures the interval (in seconds) for sending
+      multicast queries if snooping and querier are enabled.
+      Default: idle_timeout / 2 seconds.
+
+      seq_no: ovn-controller reads this value and flushes all learned
+      multicast groups when it detects that seq_no has changed.
+
+      The ovn-controller process that runs on OVN hypervisor
+      nodes uses the following columns to determine field values in IGMP
+      queries that it originates:
+
+      eth_src: Source Ethernet address.
+
+      ip4_src: Source IPv4 address.
+
+      query_max_resp: Value (in seconds) to be used as the "max-response"
+      field in multicast queries. Default: 1 second.
+
+ +
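
As a usage sketch (the switch name "sw1" is an example), these rows can
be inspected with the generic ovn-sbctl database commands, and the
ip-multicast-flush command added later in this series bumps seq_no so
that every ovn-controller flushes its learned groups:

  # Show the per-datapath snooping configuration written by ovn-northd.
  ovn-sbctl list IP_Multicast
  # Ask all controllers to flush the groups learned on datapath sw1.
  ovn-sbctl ip-multicast-flush sw1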

+ Contains learned IGMP groups indexed by address/datapath/chassis. +

+      address: Destination IPv4 address for the IGMP group.
+
+      datapath: Datapath to which this IGMP group belongs.
+
+      chassis: Chassis to which this IGMP group belongs.
+
+      ports: The destination port bindings for this IGMP group.
+
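
For illustration, learned groups can be listed with the generic
ovn-sbctl find command, much as the tests later in this series do (the
multicast address below is an example):

  # List the IGMP_Group rows learned for a given multicast address.
  ovn-sbctl find IGMP_Group address=239.0.1.68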
diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c index c5ff931..dd6585a 100644 --- a/ovn/utilities/ovn-sbctl.c +++ b/ovn/utilities/ovn-sbctl.c @@ -524,6 +524,9 @@ pre_get_info(struct ctl_context *ctx) ovsdb_idl_add_column(ctx->idl, &sbrec_logical_flow_col_external_ids); ovsdb_idl_add_column(ctx->idl, &sbrec_datapath_binding_col_external_ids); + + ovsdb_idl_add_column(ctx->idl, &sbrec_ip_multicast_col_datapath); + ovsdb_idl_add_column(ctx->idl, &sbrec_ip_multicast_col_seq_no); } static struct cmd_show_table cmd_show_tables[] = { @@ -955,6 +958,52 @@ cmd_lflow_list(struct ctl_context *ctx) } static void +sbctl_ip_mcast_flush_switch(struct ctl_context *ctx, + const struct sbrec_datapath_binding *dp) +{ + const struct sbrec_ip_multicast *ip_mcast; + + /* Lookup the corresponding IP_Multicast entry. */ + SBREC_IP_MULTICAST_FOR_EACH (ip_mcast, ctx->idl) { + if (ip_mcast->datapath != dp) { + continue; + } + + sbrec_ip_multicast_set_seq_no(ip_mcast, ip_mcast->seq_no + 1); + } +} + +static void +sbctl_ip_mcast_flush(struct ctl_context *ctx) +{ + const struct sbrec_datapath_binding *dp; + + if (ctx->argc > 2) { + return; + } + + if (ctx->argc == 2) { + const struct ovsdb_idl_row *row; + char *error = ctl_get_row(ctx, &sbrec_table_datapath_binding, + ctx->argv[1], false, &row); + if (error) { + ctl_fatal("%s", error); + } + + dp = (const struct sbrec_datapath_binding *)row; + if (!dp) { + ctl_fatal("%s is not a valid datapath", ctx->argv[1]); + } + + sbctl_ip_mcast_flush_switch(ctx, dp); + } else { + SBREC_DATAPATH_BINDING_FOR_EACH (dp, ctx->idl) { + sbctl_ip_mcast_flush_switch(ctx, dp); + } + } +} + +static void verify_connections(struct ctl_context *ctx) { const struct sbrec_sb_global *sb_global = sbrec_sb_global_first(ctx->idl); @@ -1462,6 +1511,10 @@ static const struct ctl_command_syntax sbctl_commands[] = { pre_get_info, cmd_lflow_list, NULL, "--uuid,--ovs?,--stats", RO}, /* Friendly alias for lflow-list */ + /* IP multicast commands. */ + {"ip-multicast-flush", 0, 1, "SWITCH", + pre_get_info, sbctl_ip_mcast_flush, NULL, "", RW }, + /* Connection commands. */ {"get-connection", 0, 0, "", pre_connection, cmd_get_connection, NULL, "", RO}, {"del-connection", 0, 0, "", pre_connection, cmd_del_connection, NULL, "", RW}, diff --git a/ovn/utilities/ovn-trace.c b/ovn/utilities/ovn-trace.c index fff432d..5e497ad 100644 --- a/ovn/utilities/ovn-trace.c +++ b/ovn/utilities/ovn-trace.c @@ -2126,6 +2126,10 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len, super); break; + case OVNACT_IGMP: + /* Nothing to do for tracing. 
*/ + break; + case OVNACT_TCP_RESET: execute_tcp_reset(ovnact_get_TCP_RESET(a), dp, uflow, table_id, pipeline, super); diff --git a/tests/ovn.at b/tests/ovn.at index 4da7059..55f612c 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -1333,6 +1333,10 @@ tcp_reset { }; encodes as controller(userdata=00.00.00.0b.00.00.00.00) has prereqs tcp +# IGMP +igmp; + encodes as controller(userdata=00.00.00.0f.00.00.00.00) + # Contradictionary prerequisites (allowed but not useful): ip4.src = ip6.src[0..31]; encodes as move:NXM_NX_IPV6_SRC[0..31]->NXM_OF_IP_SRC[] From patchwork Thu Jul 11 08:45:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dumitru Ceara X-Patchwork-Id: 1130711 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45kqfq5jJbz9sNT for ; Thu, 11 Jul 2019 18:56:35 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 614624CCC; Thu, 11 Jul 2019 08:54:33 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 785C64CBF for ; Thu, 11 Jul 2019 08:45:34 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 178FBCF for ; Thu, 11 Jul 2019 08:45:32 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B22BD59441 for ; Thu, 11 Jul 2019 08:45:31 +0000 (UTC) Received: from dceara.remote.csb (ovpn-117-109.ams2.redhat.com [10.36.117.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id B461D1001B10 for ; Thu, 11 Jul 2019 08:45:30 +0000 (UTC) From: Dumitru Ceara To: dev@openvswitch.org Date: Thu, 11 Jul 2019 10:45:28 +0200 Message-Id: <20190711084520.15842.4299.stgit@dceara.remote.csb> In-Reply-To: <20190711084413.15842.62313.stgit@dceara.remote.csb> References: <20190711084413.15842.62313.stgit@dceara.remote.csb> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 11 Jul 2019 08:45:31 +0000 (UTC) X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: [ovs-dev] [PATCH v3 3/3] OVN: Add ovn-northd IGMP support X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ovs-dev-bounces@openvswitch.org Errors-To: 
ovs-dev-bounces@openvswitch.org New IP Multicast Snooping Options are added to the Northbound DB Logical_Switch:other_config column. These allow enabling IGMP snooping and querier on the logical switch and get translated by ovn-northd to rows in the IP_Multicast Southbound DB table. ovn-northd monitors for changes done by ovn-controllers in the Southbound DB IGMP_Group table. Based on the entries in IGMP_Group ovn-northd creates Multicast_Group entries in the Southbound DB, one per IGMP_Group address X, containing the list of logical switch ports (aggregated from all controllers) that have IGMP_Group entries for that datapath and address X. ovn-northd also creates a logical flow that matches on IP multicast traffic destined to address X and outputs it on the tunnel key of the corresponding Multicast_Group entry. Signed-off-by: Dumitru Ceara Acked-by: Mark Michelson --- ovn/northd/ovn-northd.c | 460 ++++++++++++++++++++++++++++++++++++++++++++--- ovn/ovn-nb.xml | 54 ++++++ tests/ovn.at | 270 ++++++++++++++++++++++++++++ tests/system-ovn.at | 119 ++++++++++++ 4 files changed, 871 insertions(+), 32 deletions(-) diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index ce382ac..2b71526 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -29,6 +29,7 @@ #include "openvswitch/json.h" #include "ovn/lex.h" #include "ovn/lib/chassis-index.h" +#include "ovn/lib/ip-mcast-index.h" #include "ovn/lib/ovn-l7.h" #include "ovn/lib/ovn-nb-idl.h" #include "ovn/lib/ovn-sb-idl.h" @@ -57,6 +58,7 @@ struct northd_context { struct ovsdb_idl_txn *ovnnb_txn; struct ovsdb_idl_txn *ovnsb_txn; struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name; + struct ovsdb_idl_index *sbrec_ip_mcast_by_dp; }; static const char *ovnnb_db; @@ -424,6 +426,33 @@ struct ipam_info { bool mac_only; }; +#define OVN_MIN_MULTICAST 32768 +#define OVN_MAX_MULTICAST OVN_MCAST_FLOOD_TUNNEL_KEY +BUILD_ASSERT_DECL(OVN_MIN_MULTICAST < OVN_MAX_MULTICAST); + +#define OVN_MIN_IP_MULTICAST OVN_MIN_MULTICAST +#define OVN_MAX_IP_MULTICAST (OVN_MCAST_UNKNOWN_TUNNEL_KEY - 1) +BUILD_ASSERT_DECL(OVN_MAX_IP_MULTICAST >= OVN_MIN_MULTICAST); + +/* + * Multicast snooping and querier per datapath configuration. + */ +struct mcast_info { + bool enabled; + bool querier; + bool flood_unregistered; + + int64_t table_size; + int64_t idle_timeout; + int64_t query_interval; + char *eth_src; + char *ipv4_src; + int64_t query_max_response; + + uint32_t group_key_next; + uint32_t active_flows; +}; + /* The 'key' comes from nbs->header_.uuid or nbr->header_.uuid or * sb->external_ids:logical-switch. */ struct ovn_datapath { @@ -448,6 +477,9 @@ struct ovn_datapath { /* IPAM data. */ struct ipam_info ipam_info; + /* Multicast data. */ + struct mcast_info mcast_info; + /* OVN northd only needs to know about the logical router gateway port for * NAT on a distributed router. 
This "distributed gateway port" is * populated only when there is a "redirect-chassis" specified for one of @@ -522,6 +554,8 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od) hmap_remove(datapaths, &od->key_node); destroy_tnlids(&od->port_tnlids); bitmap_free(od->ipam_info.allocated_ipv4s); + free(od->mcast_info.eth_src); + free(od->mcast_info.ipv4_src); free(od->router_ports); ovn_ls_port_group_destroy(&od->nb_pgs); free(od); @@ -659,6 +693,85 @@ init_ipam_info_for_datapath(struct ovn_datapath *od) } static void +init_mcast_info_for_datapath(struct ovn_datapath *od) +{ + if (!od->nbs) { + return; + } + + struct mcast_info *mcast_info = &od->mcast_info; + + mcast_info->enabled = + smap_get_bool(&od->nbs->other_config, "mcast_snoop", false); + mcast_info->querier = + smap_get_bool(&od->nbs->other_config, "mcast_querier", true); + mcast_info->flood_unregistered = + smap_get_bool(&od->nbs->other_config, "mcast_flood_unregistered", + false); + + mcast_info->table_size = + smap_get_ullong(&od->nbs->other_config, "mcast_table_size", + OVN_MCAST_DEFAULT_MAX_ENTRIES); + + uint32_t idle_timeout = + smap_get_ullong(&od->nbs->other_config, "mcast_idle_timeout", + OVN_MCAST_DEFAULT_IDLE_TIMEOUT_S); + if (idle_timeout < OVN_MCAST_MIN_IDLE_TIMEOUT_S) { + idle_timeout = OVN_MCAST_MIN_IDLE_TIMEOUT_S; + } else if (idle_timeout > OVN_MCAST_MAX_IDLE_TIMEOUT_S) { + idle_timeout = OVN_MCAST_MAX_IDLE_TIMEOUT_S; + } + mcast_info->idle_timeout = idle_timeout; + + uint32_t query_interval = + smap_get_ullong(&od->nbs->other_config, "mcast_query_interval", + mcast_info->idle_timeout / 2); + if (query_interval < OVN_MCAST_MIN_QUERY_INTERVAL_S) { + query_interval = OVN_MCAST_MIN_QUERY_INTERVAL_S; + } else if (query_interval > OVN_MCAST_MAX_QUERY_INTERVAL_S) { + query_interval = OVN_MCAST_MAX_QUERY_INTERVAL_S; + } + mcast_info->query_interval = query_interval; + + mcast_info->eth_src = + nullable_xstrdup(smap_get(&od->nbs->other_config, "mcast_eth_src")); + mcast_info->ipv4_src = + nullable_xstrdup(smap_get(&od->nbs->other_config, "mcast_ip4_src")); + + mcast_info->query_max_response = + smap_get_ullong(&od->nbs->other_config, "mcast_query_max_response", + OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S); + + mcast_info->group_key_next = OVN_MAX_IP_MULTICAST; + mcast_info->active_flows = 0; +} + +static void +store_mcast_info_for_datapath(const struct sbrec_ip_multicast *sb, + struct ovn_datapath *od) +{ + struct mcast_info *mcast_info = &od->mcast_info; + + sbrec_ip_multicast_set_datapath(sb, od->sb); + sbrec_ip_multicast_set_enabled(sb, &mcast_info->enabled, 1); + sbrec_ip_multicast_set_querier(sb, &mcast_info->querier, 1); + sbrec_ip_multicast_set_table_size(sb, &mcast_info->table_size, 1); + sbrec_ip_multicast_set_idle_timeout(sb, &mcast_info->idle_timeout, 1); + sbrec_ip_multicast_set_query_interval(sb, + &mcast_info->query_interval, 1); + sbrec_ip_multicast_set_query_max_resp(sb, + &mcast_info->query_max_response, 1); + + if (mcast_info->eth_src) { + sbrec_ip_multicast_set_eth_src(sb, mcast_info->eth_src); + } + + if (mcast_info->ipv4_src) { + sbrec_ip_multicast_set_ip4_src(sb, mcast_info->ipv4_src); + } +} + +static void ovn_datapath_update_external_ids(struct ovn_datapath *od) { /* Get the logical-switch or logical-router UUID to set in @@ -741,6 +854,7 @@ join_datapaths(struct northd_context *ctx, struct hmap *datapaths, } init_ipam_info_for_datapath(od); + init_mcast_info_for_datapath(od); } const struct nbrec_logical_router *nbr; @@ -910,7 +1024,7 @@ ovn_port_destroy(struct hmap *ports, struct 
ovn_port *port) } static struct ovn_port * -ovn_port_find(struct hmap *ports, const char *name) +ovn_port_find(const struct hmap *ports, const char *name) { struct ovn_port *op; @@ -2700,20 +2814,19 @@ build_ports(struct northd_context *ctx, cleanup_sb_ha_chassis_groups(ctx, &active_ha_chassis_grps); sset_destroy(&active_ha_chassis_grps); } - -#define OVN_MIN_MULTICAST 32768 -#define OVN_MAX_MULTICAST 65535 struct multicast_group { - const char *name; + char *name; uint16_t key; /* OVN_MIN_MULTICAST...OVN_MAX_MULTICAST. */ }; #define MC_FLOOD "_MC_flood" -static const struct multicast_group mc_flood = { MC_FLOOD, 65535 }; +static const struct multicast_group mc_flood = + { MC_FLOOD, OVN_MCAST_FLOOD_TUNNEL_KEY }; #define MC_UNKNOWN "_MC_unknown" -static const struct multicast_group mc_unknown = { MC_UNKNOWN, 65534 }; +static const struct multicast_group mc_unknown = + { MC_UNKNOWN, OVN_MCAST_UNKNOWN_TUNNEL_KEY }; static bool multicast_group_equal(const struct multicast_group *a, @@ -2756,10 +2869,10 @@ ovn_multicast_find(struct hmap *mcgroups, struct ovn_datapath *datapath, } static void -ovn_multicast_add(struct hmap *mcgroups, const struct multicast_group *group, - struct ovn_port *port) +ovn_multicast_add_ports(struct hmap *mcgroups, struct ovn_datapath *od, + const struct multicast_group *group, + struct ovn_port **ports, size_t n_ports) { - struct ovn_datapath *od = port->od; struct ovn_multicast *mc = ovn_multicast_find(mcgroups, od, group); if (!mc) { mc = xmalloc(sizeof *mc); @@ -2770,11 +2883,27 @@ ovn_multicast_add(struct hmap *mcgroups, const struct multicast_group *group, mc->allocated_ports = 4; mc->ports = xmalloc(mc->allocated_ports * sizeof *mc->ports); } - if (mc->n_ports >= mc->allocated_ports) { + + size_t n_ports_total = mc->n_ports + n_ports; + + if (n_ports_total > 2 * mc->allocated_ports) { + mc->allocated_ports = n_ports_total; + mc->ports = xrealloc(mc->ports, + mc->allocated_ports * sizeof *mc->ports); + } else if (n_ports_total > mc->allocated_ports) { mc->ports = x2nrealloc(mc->ports, &mc->allocated_ports, sizeof *mc->ports); } - mc->ports[mc->n_ports++] = port; + + memcpy(&mc->ports[mc->n_ports], &ports[0], n_ports * sizeof *ports); + mc->n_ports += n_ports; +} + +static void +ovn_multicast_add(struct hmap *mcgroups, const struct multicast_group *group, + struct ovn_port *port) +{ + ovn_multicast_add_ports(mcgroups, port->od, group, &port, 1); } static void @@ -2798,7 +2927,115 @@ ovn_multicast_update_sbrec(const struct ovn_multicast *mc, sbrec_multicast_group_set_ports(sb, ports, mc->n_ports); free(ports); } - + +/* + * IGMP group entry. + */ +struct ovn_igmp_group { + struct hmap_node hmap_node; /* Index on 'datapath' and 'mcgroup.name'. 
*/ + + struct ovn_datapath *datapath; + struct in6_addr address; /* Multicast IPv6-mapped-IPv4 or IPv4 address */ + char address_s[INET6_ADDRSTRLEN + 1]; + struct multicast_group mcgroup; +}; + +static uint32_t +ovn_igmp_group_hash(const struct ovn_datapath *datapath, + const struct in6_addr *address) +{ + return hash_pointer(datapath, hash_bytes(address, sizeof *address, 0)); +} + +static struct ovn_igmp_group * +ovn_igmp_group_find(struct hmap *igmp_groups, + const struct ovn_datapath *datapath, + const struct in6_addr *address) +{ + struct ovn_igmp_group *group; + + HMAP_FOR_EACH_WITH_HASH (group, hmap_node, + ovn_igmp_group_hash(datapath, address), + igmp_groups) { + if (group->datapath == datapath && + ipv6_addr_equals(&group->address, address)) { + return group; + } + } + return NULL; +} + +static void +ovn_igmp_group_add(struct hmap *mcast_groups, struct hmap *igmp_groups, + struct ovn_datapath *datapath, + const struct hmap *ports, + const char *address, + struct sbrec_port_binding **igmp_ports, + size_t n_ports) +{ + struct in6_addr group_address; + ovs_be32 ipv4; + + if (ip_parse(address, &ipv4)) { + group_address = in6_addr_mapped_ipv4(ipv4); + } else if (!ipv6_parse(address, &group_address)) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + VLOG_WARN_RL(&rl, "invalid IGMP group address: %s", address); + return; + } + + struct ovn_igmp_group *igmp_group = + ovn_igmp_group_find(igmp_groups, datapath, &group_address); + + if (!igmp_group) { + if (datapath->mcast_info.group_key_next == OVN_MIN_IP_MULTICAST) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + VLOG_WARN_RL(&rl, "all IP multicast tunnel ids exhausted"); + return; + } + + igmp_group = xmalloc(sizeof *igmp_group); + + igmp_group->datapath = datapath; + igmp_group->address = group_address; + ovs_strzcpy(igmp_group->address_s, address, + sizeof igmp_group->address_s); + igmp_group->mcgroup.key = datapath->mcast_info.group_key_next; + igmp_group->mcgroup.name = + xasprintf("%u", datapath->mcast_info.group_key_next); + + hmap_insert(igmp_groups, &igmp_group->hmap_node, + ovn_igmp_group_hash(datapath, &group_address)); + + datapath->mcast_info.group_key_next--; + } + + struct ovn_port **oports = xmalloc(n_ports * sizeof *oports); + size_t n_oports = 0; + + for (size_t i = 0; i < n_ports; i++) { + oports[n_oports] = ovn_port_find(ports, igmp_ports[i]->logical_port); + if (oports[n_oports]) { + n_oports++; + } + } + + ovn_multicast_add_ports(mcast_groups, datapath, &igmp_group->mcgroup, + oports, n_oports); + free(oports); +} + +static void +ovn_igmp_group_destroy(struct hmap *igmp_groups, + struct ovn_igmp_group *igmp_group) +{ + if (igmp_group) { + hmap_remove(igmp_groups, &igmp_group->hmap_node); + free(igmp_group->mcgroup.name); + free(igmp_group); + } +} + /* Logical flow generation. * * This code generates the Logical_Flow table in the southbound database, as a @@ -4444,7 +4681,7 @@ build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list) static void build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, struct hmap *port_groups, struct hmap *lflows, - struct hmap *mcgroups) + struct hmap *mcgroups, struct hmap *igmp_groups) { /* This flow table structure is documented in ovn-northd(8), so please * update ovn-northd.8.xml if you change anything. */ @@ -4908,24 +5145,63 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, } } } + /* Ingress table 17: Destination lookup, broadcast and multicast handling - * (priority 100). 
*/ - HMAP_FOR_EACH (op, key_node, ports) { - if (!op->nbsp) { + * (priority 70 - 100). */ + HMAP_FOR_EACH (od, key_node, datapaths) { + if (!od->nbs) { continue; } - if (lsp_is_enabled(op->nbsp)) { - ovn_multicast_add(mcgroups, &mc_flood, op); + if (od->mcast_info.enabled) { + /* Punt IGMP traffic to controller. */ + ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 100, + "ip4 && ip.proto == 2", "igmp;"); + + /* Flood all IP multicast traffic destined to 224.0.0.X to all + * ports - RFC 4541, section 2.1.2, item 2. + */ + ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 85, + "ip4 && ip4.dst == 224.0.0.0/24", + "outport = \""MC_FLOOD"\"; output;"); + + /* Drop unregistered IP multicast if not allowed. */ + if (!od->mcast_info.flood_unregistered) { + ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 80, + "ip4 && ip4.mcast", "drop;"); + } } + + ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 70, "eth.mcast", + "outport = \""MC_FLOOD"\"; output;"); } - HMAP_FOR_EACH (od, key_node, datapaths) { - if (!od->nbs) { + + /* Ingress table 17: Add IP multicast flows learnt from IGMP + * (priority 90). */ + struct ovn_igmp_group *igmp_group, *next_igmp_group; + + HMAP_FOR_EACH_SAFE (igmp_group, next_igmp_group, hmap_node, igmp_groups) { + ds_clear(&match); + ds_clear(&actions); + + if (!igmp_group->datapath) { continue; } - ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 100, "eth.mcast", - "outport = \""MC_FLOOD"\"; output;"); + struct mcast_info *mcast_info = &igmp_group->datapath->mcast_info; + + if (mcast_info->active_flows >= mcast_info->table_size) { + continue; + } + mcast_info->active_flows++; + + ds_put_format(&match, "eth.mcast && ip4 && ip4.dst == %s ", + igmp_group->address_s); + ds_put_format(&actions, "outport = \"%s\"; output; ", + igmp_group->mcgroup.name); + + ovn_lflow_add(lflows, igmp_group->datapath, S_SWITCH_IN_L2_LKUP, 90, + ds_cstr(&match), ds_cstr(&actions)); } /* Ingress table 17: Destination lookup, unicast handling (priority 50), */ @@ -7526,12 +7802,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, * constructing their contents based on the OVN_NB database. */ static void build_lflows(struct northd_context *ctx, struct hmap *datapaths, - struct hmap *ports, struct hmap *port_groups) + struct hmap *ports, struct hmap *port_groups, + struct hmap *mcgroups, struct hmap *igmp_groups) { struct hmap lflows = HMAP_INITIALIZER(&lflows); - struct hmap mcgroups = HMAP_INITIALIZER(&mcgroups); - build_lswitch_flows(datapaths, ports, port_groups, &lflows, &mcgroups); + build_lswitch_flows(datapaths, ports, port_groups, &lflows, mcgroups, + igmp_groups); build_lrouter_flows(datapaths, ports, &lflows); /* Push changes to the Logical_Flow table to database. 
*/ @@ -7606,24 +7883,26 @@ build_lflows(struct northd_context *ctx, struct hmap *datapaths, struct multicast_group group = { .name = sbmc->name, .key = sbmc->tunnel_key }; - struct ovn_multicast *mc = ovn_multicast_find(&mcgroups, od, &group); + struct ovn_multicast *mc = ovn_multicast_find(mcgroups, od, &group); if (mc) { ovn_multicast_update_sbrec(mc, sbmc); - ovn_multicast_destroy(&mcgroups, mc); + ovn_multicast_destroy(mcgroups, mc); } else { sbrec_multicast_group_delete(sbmc); } } struct ovn_multicast *mc, *next_mc; - HMAP_FOR_EACH_SAFE (mc, next_mc, hmap_node, &mcgroups) { + HMAP_FOR_EACH_SAFE (mc, next_mc, hmap_node, mcgroups) { + if (!mc->datapath) { + continue; + } sbmc = sbrec_multicast_group_insert(ctx->ovnsb_txn); sbrec_multicast_group_set_datapath(sbmc, mc->datapath->sb); sbrec_multicast_group_set_name(sbmc, mc->group->name); sbrec_multicast_group_set_tunnel_key(sbmc, mc->group->key); ovn_multicast_update_sbrec(mc, sbmc); - ovn_multicast_destroy(&mcgroups, mc); + ovn_multicast_destroy(mcgroups, mc); } - hmap_destroy(&mcgroups); } static void @@ -8043,6 +8322,80 @@ destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports, } static void +build_ip_mcast(struct northd_context *ctx, struct hmap *datapaths) +{ + struct ovn_datapath *od; + + HMAP_FOR_EACH (od, key_node, datapaths) { + if (!od->nbs) { + continue; + } + + const struct sbrec_ip_multicast *ip_mcast = + ip_mcast_lookup(ctx->sbrec_ip_mcast_by_dp, od->sb); + + if (!ip_mcast) { + ip_mcast = sbrec_ip_multicast_insert(ctx->ovnsb_txn); + } + store_mcast_info_for_datapath(ip_mcast, od); + } + + /* Delete southbound records without northbound matches. */ + const struct sbrec_ip_multicast *sb, *sb_next; + + SBREC_IP_MULTICAST_FOR_EACH_SAFE (sb, sb_next, ctx->ovnsb_idl) { + if (!sb->datapath || + !ovn_datapath_from_sbrec(datapaths, sb->datapath)) { + sbrec_ip_multicast_delete(sb); + } + } +} + +static void +build_mcast_groups(struct northd_context *ctx, + struct hmap *datapaths, struct hmap *ports, + struct hmap *mcast_groups, + struct hmap *igmp_groups) +{ + struct ovn_port *op; + + hmap_init(mcast_groups); + hmap_init(igmp_groups); + + HMAP_FOR_EACH (op, key_node, ports) { + if (!op->nbsp) { + continue; + } + + if (lsp_is_enabled(op->nbsp)) { + ovn_multicast_add(mcast_groups, &mc_flood, op); + } + } + + const struct sbrec_igmp_group *sb_igmp, *sb_igmp_next; + + SBREC_IGMP_GROUP_FOR_EACH_SAFE (sb_igmp, sb_igmp_next, ctx->ovnsb_idl) { + /* If this is a stale group (e.g., controller had crashed, + * purge it). 
+ */ + if (!sb_igmp->chassis || !sb_igmp->datapath) { + sbrec_igmp_group_delete(sb_igmp); + continue; + } + + struct ovn_datapath *od = + ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath); + if (!od) { + sbrec_igmp_group_delete(sb_igmp); + continue; + } + + ovn_igmp_group_add(mcast_groups, igmp_groups, od, ports, + sb_igmp->address, sb_igmp->ports, sb_igmp->n_ports); + } +} + +static void ovnnb_db_run(struct northd_context *ctx, struct ovsdb_idl_index *sbrec_chassis_by_name, struct ovsdb_idl_loop *sb_loop, @@ -8053,23 +8406,36 @@ ovnnb_db_run(struct northd_context *ctx, return; } struct hmap port_groups; + struct hmap mcast_groups; + struct hmap igmp_groups; build_datapaths(ctx, datapaths, lr_list); build_ports(ctx, sbrec_chassis_by_name, datapaths, ports); build_ipam(datapaths, ports); build_port_group_lswitches(ctx, &port_groups, ports); build_lrouter_groups(ports, lr_list); - build_lflows(ctx, datapaths, ports, &port_groups); + build_ip_mcast(ctx, datapaths); + build_mcast_groups(ctx, datapaths, ports, &mcast_groups, &igmp_groups); + build_lflows(ctx, datapaths, ports, &port_groups, &mcast_groups, + &igmp_groups); sync_address_sets(ctx); sync_port_groups(ctx); sync_meters(ctx); sync_dns_entries(ctx, datapaths); + struct ovn_igmp_group *igmp_group, *next_igmp_group; + + HMAP_FOR_EACH_SAFE (igmp_group, next_igmp_group, hmap_node, &igmp_groups) { + ovn_igmp_group_destroy(&igmp_groups, igmp_group); + } + struct ovn_port_group *pg, *next_pg; HMAP_FOR_EACH_SAFE (pg, next_pg, key_node, &port_groups) { ovn_port_group_destroy(&port_groups, pg); } + hmap_destroy(&igmp_groups); + hmap_destroy(&mcast_groups); hmap_destroy(&port_groups); /* Sync ipsec configuration. @@ -8866,12 +9232,41 @@ main(int argc, char *argv[]) add_column_noalert(ovnsb_idl_loop.idl, &sbrec_ha_chassis_group_col_ref_chassis); + ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_igmp_group); + ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_igmp_group_col_address); + ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_igmp_group_col_datapath); + ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_igmp_group_col_chassis); + ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_igmp_group_col_ports); + + ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ip_multicast); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_datapath); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_enabled); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_querier); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_eth_src); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_ip4_src); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_table_size); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_idle_timeout); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_query_interval); + add_column_noalert(ovnsb_idl_loop.idl, + &sbrec_ip_multicast_col_query_max_resp); + struct ovsdb_idl_index *sbrec_chassis_by_name = chassis_index_create(ovnsb_idl_loop.idl); struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name = ha_chassis_group_index_create(ovnsb_idl_loop.idl); + struct ovsdb_idl_index *sbrec_ip_mcast_by_dp + = ip_mcast_index_create(ovnsb_idl_loop.idl); + /* Ensure that only a single ovn-northd is active in the deployment by * acquiring a lock called "ovn_northd" on the southbound database * and then only performing DB transactions if the lock is held. 
*/ @@ -8887,6 +9282,7 @@ main(int argc, char *argv[]) .ovnsb_idl = ovnsb_idl_loop.idl, .ovnsb_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop), .sbrec_ha_chassis_grp_by_name = sbrec_ha_chassis_grp_by_name, + .sbrec_ip_mcast_by_dp = sbrec_ip_mcast_by_dp, }; if (!had_lock && ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) { diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml index 318379c..0aba420 100644 --- a/ovn/ovn-nb.xml +++ b/ovn/ovn-nb.xml @@ -274,6 +274,60 @@ + +

+      These options control IP Multicast Snooping configuration of the
+      logical switch. To enable IP Multicast Snooping set
+      other_config:mcast_snoop to true. To enable IP
+      Multicast Querier set other_config:mcast_querier to true.
+      If IP Multicast Querier is enabled, other_config:mcast_eth_src
+      and other_config:mcast_ip4_src must also be set.

+      mcast_snoop: Enables/disables IP Multicast Snooping on the logical
+      switch.
+
+      mcast_querier: Enables/disables IP Multicast Querier on the logical
+      switch.
+
+      mcast_flood_unregistered: Determines whether unregistered multicast
+      traffic should be flooded or not. Only applicable if mcast_snoop
+      is enabled.
+
+      mcast_table_size: Number of multicast groups to be stored.
+      Default: 2048.
+
+      mcast_idle_timeout: Configures the IP Multicast Snooping group idle
+      timeout (in seconds). Default: 300 seconds.
+
+      mcast_query_interval: Configures the IP Multicast Querier interval
+      between queries (in seconds). Default: mcast_idle_timeout / 2.
+
+      mcast_query_max_response: Configures the value of the "max-response"
+      field in the multicast queries originated by the logical switch.
+      Default: 1 second.
+
+      mcast_eth_src: Configures the source Ethernet address for queries
+      originated by the logical switch.
+
+      mcast_ip4_src: Configures the source IPv4 address for queries
+      originated by the logical switch.
+
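
As a concrete sketch, the querier configuration exercised by the tests
later in this series sets these keys together (the switch name and
addresses are examples taken from those tests):

  ovn-nbctl set Logical_Switch sw2 \
      other_config:mcast_snoop="true" \
      other_config:mcast_querier="true" \
      other_config:mcast_query_interval=1 \
      other_config:mcast_eth_src="00:00:00:00:02:fe" \
      other_config:mcast_ip4_src="20.0.0.254"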
+ See External IDs at the beginning of this document. diff --git a/tests/ovn.at b/tests/ovn.at index 55f612c..5e7cd94 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -14338,3 +14338,273 @@ AT_CHECK([ovn-nbctl ls-add sw1], [1], [ignore], ]) AT_CLEANUP + +AT_SETUP([ovn -- IGMP snoop/querier]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +ovn_start + +# Logical network: +# Two independent logical switches (sw1 and sw2). +# sw1: +# - subnet 10.0.0.0/8 +# - 2 ports bound on hv1 (sw1-p11, sw1-p12) +# - 2 ports bound on hv2 (sw1-p21, sw1-p22) +# sw2: +# - subnet 20.0.0.0/8 +# - 1 port bound on hv1 (sw2-p1) +# - 1 port bound on hv2 (sw2-p2) +# - IGMP Querier from 20.0.0.254 + +reset_pcap_file() { + local iface=$1 + local pcap_file=$2 + ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \ +options:rxq_pcap=dummy-rx.pcap + rm -f ${pcap_file}*.pcap + ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \ +options:rxq_pcap=${pcap_file}-rx.pcap +} + +ip_to_hex() { + printf "%02x%02x%02x%02x" "$@" +} + +# +# send_igmp_v3_report INPORT HV ETH_SRC IP_SRC IP_CSUM GROUP REC_TYPE +# IGMP_CSUM OUTFILE +# +# This shell function causes an IGMPv3 report to be received on INPORT of HV. +# The packet's content has Ethernet destination 01:00:5E:00:00:22 and source +# ETH_SRC (exactly 12 hex digits). Ethernet type is set to IP. +# GROUP is the IP multicast group to be joined/to leave (based on REC_TYPE). +# REC_TYPE == 04: join GROUP +# REC_TYPE == 03: leave GROUP +# The packet hexdump is also stored in OUTFILE. +# +send_igmp_v3_report() { + local inport=$1 hv=$2 eth_src=$3 ip_src=$4 ip_chksum=$5 group=$6 + local rec_type=$7 igmp_chksum=$8 outfile=$9 + + local eth_dst=01005e000016 + local ip_dst=$(ip_to_hex 224 0 0 22) + local ip_ttl=01 + local ip_ra_opt=94040000 + + local igmp_type=2200 + local num_rec=00000001 + local aux_dlen=00 + local num_src=0000 + + local eth=${eth_dst}${eth_src}0800 + local ip=46c0002800004000${ip_ttl}02${ip_chksum}${ip_src}${ip_dst}${ip_ra_opt} + local igmp=${igmp_type}${igmp_chksum}${num_rec}${rec_type}${aux_dlen}${num_src}${group} + local packet=${eth}${ip}${igmp} + + echo ${packet} >> ${outfile} + as $hv ovs-appctl netdev-dummy/receive ${inport} ${packet} +} + +# +# store_igmp_v3_query ETH_SRC IP_SRC IP_CSUM OUTFILE +# +# This shell function builds an IGMPv3 general query from ETH_SRC and IP_SRC +# and stores the hexdump of the packet in OUTFILE. +# +store_igmp_v3_query() { + local eth_src=$1 ip_src=$2 ip_chksum=$3 outfile=$4 + + local eth_dst=01005e000001 + local ip_dst=$(ip_to_hex 224 0 0 1) + local ip_ttl=01 + local igmp_type=11 + local max_resp=0a + local igmp_chksum=eeeb + local addr=00000000 + + local eth=${eth_dst}${eth_src}0800 + local ip=4500002000004000${ip_ttl}02${ip_chksum}${ip_src}${ip_dst} + local igmp=${igmp_type}${max_resp}${igmp_chksum}${addr}000a0000 + local packet=${eth}${ip}${igmp} + + echo ${packet} >> ${outfile} +} + +# +# send_ip_multicast_pkt INPORT HV ETH_SRC ETH_DST IP_SRC IP_DST IP_LEN +# IP_PROTO DATA OUTFILE +# +# This shell function causes an IP multicast packet to be received on INPORT +# of HV. +# The hexdump of the packet is stored in OUTFILE. 
+# +send_ip_multicast_pkt() { + local inport=$1 hv=$2 eth_src=$3 eth_dst=$4 ip_src=$5 ip_dst=$6 + local ip_len=$7 ip_chksum=$8 proto=$9 data=${10} outfile=${11} + + local ip_ttl=20 + + local eth=${eth_dst}${eth_src}0800 + local ip=450000${ip_len}95f14000${ip_ttl}${proto}${ip_chksum}${ip_src}${ip_dst} + local packet=${eth}${ip}${data} + + as $hv ovs-appctl netdev-dummy/receive ${inport} ${packet} + echo ${packet} >> ${outfile} +} + +ovn-nbctl ls-add sw1 +ovn-nbctl ls-add sw2 + +ovn-nbctl lsp-add sw1 sw1-p11 +ovn-nbctl lsp-add sw1 sw1-p12 +ovn-nbctl lsp-add sw1 sw1-p21 +ovn-nbctl lsp-add sw1 sw1-p22 +ovn-nbctl lsp-add sw2 sw2-p1 +ovn-nbctl lsp-add sw2 sw2-p2 + +net_add n1 +sim_add hv1 +as hv1 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.1 +ovs-vsctl -- add-port br-int hv1-vif1 -- \ + set interface hv1-vif1 external-ids:iface-id=sw1-p11 \ + options:tx_pcap=hv1/vif1-tx.pcap \ + options:rxq_pcap=hv1/vif1-rx.pcap \ + ofport-request=1 +ovs-vsctl -- add-port br-int hv1-vif2 -- \ + set interface hv1-vif2 external-ids:iface-id=sw1-p12 \ + options:tx_pcap=hv1/vif2-tx.pcap \ + options:rxq_pcap=hv1/vif2-rx.pcap \ + ofport-request=1 +ovs-vsctl -- add-port br-int hv1-vif3 -- \ + set interface hv1-vif3 external-ids:iface-id=sw2-p1 \ + options:tx_pcap=hv1/vif3-tx.pcap \ + options:rxq_pcap=hv1/vif3-rx.pcap \ + ofport-request=1 + +sim_add hv2 +as hv2 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.2 +ovs-vsctl -- add-port br-int hv2-vif1 -- \ + set interface hv2-vif1 external-ids:iface-id=sw1-p21 \ + options:tx_pcap=hv2/vif1-tx.pcap \ + options:rxq_pcap=hv2/vif1-rx.pcap \ + ofport-request=1 +ovs-vsctl -- add-port br-int hv2-vif2 -- \ + set interface hv2-vif2 external-ids:iface-id=sw1-p22 \ + options:tx_pcap=hv2/vif2-tx.pcap \ + options:rxq_pcap=hv2/vif2-rx.pcap \ + ofport-request=1 +ovs-vsctl -- add-port br-int hv2-vif3 -- \ + set interface hv2-vif3 external-ids:iface-id=sw2-p2 \ + options:tx_pcap=hv2/vif3-tx.pcap \ + options:rxq_pcap=hv2/vif3-rx.pcap \ + ofport-request=1 + +OVN_POPULATE_ARP + +# Enable IGMP snooping on sw1. +ovn-nbctl set Logical_Switch sw1 other_config:mcast_querier="false" +ovn-nbctl set Logical_Switch sw1 other_config:mcast_snoop="true" + +# No IGMP query should be generated by sw1 (mcast_querier="false"). +truncate -s 0 expected +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected]) + +ovn-nbctl --wait=hv sync + +# Inject IGMP Join for 239.0.1.68 on sw1-p11. +send_igmp_v3_report hv1-vif1 hv1 \ + 000000000001 $(ip_to_hex 10 0 0 1) f9f8 \ + $(ip_to_hex 239 0 1 68) 04 e9b9 \ + /dev/null +# Inject IGMP Join for 239.0.1.68 on sw1-p21. +send_igmp_v3_report hv2-vif1 hv2 000000000002 $(ip_to_hex 10 0 0 2) f9f9 \ + $(ip_to_hex 239 0 1 68) 04 e9b9 \ + /dev/null + +# Check that the IGMP Group is learned on both hv. +OVS_WAIT_UNTIL([ + total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l` + test "${total_entries}" = "2" +]) + +# Send traffic and make sure it gets forwarded only on the two ports that +# joined. 
+truncate -s 0 expected +truncate -s 0 expected_empty +send_ip_multicast_pkt hv1-vif2 hv1 \ + 000000000001 01005e000144 \ + $(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e ca70 11 \ + e518e518000a3b3a0000 \ + expected + +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_empty]) + +# Inject IGMP Leave for 239.0.1.68 on sw1-p11. +send_igmp_v3_report hv1-vif1 hv1 \ + 000000000001 $(ip_to_hex 10 0 0 1) f9f8 \ + $(ip_to_hex 239 0 1 68) 03 eab9 \ + /dev/null + +# Check IGMP_Group table on both HV. +OVS_WAIT_UNTIL([ + total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l` + test "${total_entries}" = "1" +]) + +# Send traffic traffic and make sure it gets forwarded only on the port that +# joined. +as hv1 reset_pcap_file hv1-vif1 hv1/vif1 +as hv2 reset_pcap_file hv2-vif1 hv2/vif1 +truncate -s 0 expected +truncate -s 0 expected_empty +send_ip_multicast_pkt hv1-vif2 hv1 \ + 000000000001 01005e000144 \ + $(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e ca70 11 \ + e518e518000a3b3a0000 \ + expected + +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty]) +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_empty]) + +# Flush IGMP groups. +ovn-sbctl ip-multicast-flush sw1 +ovn-nbctl --wait=hv -t 3 sync +OVS_WAIT_UNTIL([ + total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l` + test "${total_entries}" = "0" +]) + +# Enable IGMP snooping and querier on sw2 and set query interval to minimum. +ovn-nbctl set Logical_Switch sw2 \ + other_config:mcast_snoop="true" \ + other_config:mcast_querier="true" \ + other_config:mcast_query_interval=1 \ + other_config:mcast_eth_src="00:00:00:00:02:fe" \ + other_config:mcast_ip4_src="20.0.0.254" + +# Wait for 1 query interval (1 sec) and check that two queries are generated. +truncate -s 0 expected +store_igmp_v3_query 0000000002fe $(ip_to_hex 20 0 0 254) 84dd expected +store_igmp_v3_query 0000000002fe $(ip_to_hex 20 0 0 254) 84dd expected + +sleep 1 +OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected]) +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected]) + +OVN_CLEANUP([hv1], [hv2]) +AT_CLEANUP diff --git a/tests/system-ovn.at b/tests/system-ovn.at index b7e2d77..10fbd26 100644 --- a/tests/system-ovn.at +++ b/tests/system-ovn.at @@ -1542,3 +1542,122 @@ as OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d /connection dropped.*/d"]) AT_CLEANUP + +AT_SETUP([ovn -- 2 LSs IGMP]) +AT_KEYWORDS([ovnigmp]) + +ovn_start + +OVS_TRAFFIC_VSWITCHD_START() +ADD_BR([br-int]) + +# Set external-ids in br-int needed for ovn-controller +ovs-vsctl \ + -- set Open_vSwitch . external-ids:system-id=hv1 \ + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true + +# Start ovn-controller +start_daemon ovn-controller + +# Logical network: +# Two independent logical switches (sw1 and sw2). 
+# sw1: +# - subnet 10.0.0.0/8 +# - 2 ports (sw1-p1 - sw1-p2) +# sw2: +# - subnet 20.0.0.0/8 +# - 2 port (sw2-p1 - sw2-p2) +# - IGMP Querier from 20.0.0.254 + +ovn-nbctl ls-add sw1 +ovn-nbctl ls-add sw2 + +for i in `seq 1 2` +do + ADD_NAMESPACES(sw1-p$i) + ADD_VETH(sw1-p$i, sw1-p$i, br-int, "10.0.0.$i/24", "00:00:00:00:01:0$i", \ + "10.0.0.254") + ovn-nbctl lsp-add sw1 sw1-p$i \ + -- lsp-set-addresses sw1-p$i "00:00:00:00:01:0$i 10.0.0.$i" +done + +for i in `seq 1 2` +do + ADD_NAMESPACES(sw2-p$i) + ADD_VETH(sw2-p$i, sw2-p$i, br-int, "20.0.0.$i/24", "00:00:00:00:02:0$i", \ + "20.0.0.254") + ovn-nbctl lsp-add sw2 sw2-p$i \ + -- lsp-set-addresses sw2-p$i "00:00:00:00:02:0$i 20.0.0.$i" +done + +# Enable IGMP snooping on sw1. +ovn-nbctl set Logical_Switch sw1 other_config:mcast_querier="false" +ovn-nbctl set Logical_Switch sw1 other_config:mcast_snoop="true" + +# Inject IGMP Join for 239.0.1.68 on sw1-p1. +NS_CHECK_EXEC([sw1-p1], [ip addr add dev sw1-p1 239.0.1.68/32 autojoin], [0]) + +# Inject IGMP Join for 239.0.1.68 on sw1-p2 +NS_CHECK_EXEC([sw1-p2], [ip addr add dev sw1-p2 239.0.1.68/32 autojoin], [0]) + +# Check that the IGMP Group is learned. +OVS_WAIT_UNTIL([ + total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l` + ports=`ovn-sbctl find IGMP_Group | grep ports | cut -f 2 -d ":" | wc -w` + test "${total_entries}" = "1" + test "${ports}" = "2" +]) + +# Inject IGMP Leave for 239.0.1.68 on sw1-p2. +NS_CHECK_EXEC([sw1-p2], [ip addr del dev sw1-p2 239.0.1.68/32], [0]) + +# Check that only one port is left in the group. +OVS_WAIT_UNTIL([ + total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l` + ports=`ovn-sbctl find IGMP_Group | grep ports | cut -f 2 -d ":" | wc -w` + test "${total_entries}" = "1" + test "${ports}" = "1" +]) + +# Flush IGMP groups. +ovn-sbctl ip-multicast-flush sw1 +ovn-nbctl --wait=hv -t 3 sync +OVS_WAIT_UNTIL([ + total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l` + test "${total_entries}" = "0" +]) + +# Enable IGMP snooping and querier on sw2 and set query interval to minimum. +ovn-nbctl set Logical_Switch sw2 \ + other_config:mcast_snoop="true" \ + other_config:mcast_querier="true" \ + other_config:mcast_query_interval=1 \ + other_config:mcast_eth_src="00:00:00:00:02:fe" \ + other_config:mcast_ip4_src="20.0.0.254" + +# Check that queries are generated. +NS_CHECK_EXEC([sw2-p1], [tcpdump -n -c 2 -i sw2-p1 igmp > sw2-p1.pcap &]) + +OVS_WAIT_UNTIL([ + total_queries=`cat sw2-p1.pcap | grep "igmp query" | wc -l` + test "${total_queries}" = "2" +]) + +OVS_APP_EXIT_AND_WAIT([ovn-controller]) + +as ovn-sb +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) + +as ovn-nb +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) + +as northd +OVS_APP_EXIT_AND_WAIT([ovn-northd]) + +as +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d +/connection dropped.*/d"]) +AT_CLEANUP
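
To exercise just these additions, the AT_KEYWORDS([ovnigmp]) tag above
can be used to select the system test. A sketch, assuming the standard
OVS autotest targets (exact targets depend on the build setup):

  # Unit-level test from tests/ovn.at (matches on the test title).
  make check TESTSUITEFLAGS='-k IGMP'
  # System test from tests/system-ovn.at, selected via AT_KEYWORDS.
  make check-kernel TESTSUITEFLAGS='-k ovnigmp'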