From patchwork Mon May 10 16:00:45 2021
X-Patchwork-Submitter: "Miskell, Timothy"
X-Patchwork-Id: 1476550
From: Timothy Miskell
To: dev@openvswitch.org, maxime.coquelin@redhat.com
Date: Mon, 10 May 2021 16:00:45 +0000
Message-Id: <20210510160045.49434-2-timothy.miskell@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210510160045.49434-1-timothy.miskell@intel.com>
References: <20210510160045.49434-1-timothy.miskell@intel.com>
Cc: Liang-min Wang
Subject: [ovs-dev] [PATCH] Extends the existing mirror configuration parameters

From: Liang-min Wang

The following parameters are added:
- mirror-offload: turn mirror offloading on or off.
- output-port-name: specify an output port, by name string, that may be on a different bridge.
- output-src-vlan: output-port VLAN for each select-src-port.
- output-dst-vlan: output-port VLAN for each select-dst-port.
- flow-src-mac: use the source MAC address of each select-dst-port for the header scan.
- flow-dst-mac: use the destination MAC address of each select-src-port for the header scan.
- mirror-tunnel-addr: BDF string of the tunnel device.

The ovs-vsctl test is updated because this patch introduces new mirroring parameters.

Create a deferred procedure call thread to handle all mirror offload requests. This is a lightweight thread that stays asleep while there are no new requests; it sits between ovs-vsctl and the mirror offloading back end.

Implement DPDK tx-burst (VIRTIO ingress traffic mirror) and rx-burst (VIRTIO egress traffic mirror) callbacks. Each callback performs the following tasks:
1. Enable per-packet VLAN insertion.
   - For port mirroring, VLAN insertion is enabled on every packet.
   - For flow mirroring, it is enabled only on packets whose header matches the required MAC address.
2. Send the packets to the specified transport port (the output port in the mirror offload configuration).
   - For port mirroring, all packets are sent to the transport port.
   - For flow mirroring, only matched packets are sent.
3. Restore each packet's attributes (remove the DPDK per-packet offload flag).

Signed-off-by: Liang-min Wang
Tested-by: Timothy Miskell
Suggested-by: Munish Mehan
---
lib/automake.mk | 2 + lib/netdev-dpdk-mirror.c | 516 +++++++++++++++++++++++++++++++++++++ lib/netdev-dpdk-mirror.h | 83 ++++++ lib/netdev-dpdk.c | 397 ++++++++++++++++++++++++++++ lib/netdev-provider.h | 16 ++ lib/netdev.c | 386 +++++++++++++++++++++++++++ lib/netdev.h | 16 ++ tests/ovs-vsctl.at | 2 + vswitchd/bridge.c | 271 ++++++++++++++++++- vswitchd/vswitch.ovsschema | 24 +- vswitchd/vswitch.xml | 50 ++++ 11 files changed, 1759 insertions(+), 4 deletions(-) create mode 100644 lib/netdev-dpdk-mirror.c create mode 100644 lib/netdev-dpdk-mirror.h diff --git a/lib/automake.mk b/lib/automake.mk index 39901bd6d..dcafbfaca 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -170,6 +170,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/multipath.h \ lib/namemap.c \ lib/netdev-dpdk.h \ + lib/netdev-dpdk-mirror.h \ lib/netdev-dummy.c \ lib/netdev-offload.c \ lib/netdev-offload.h \ @@ -460,6 +461,7 @@ if DPDK_NETDEV lib_libopenvswitch_la_SOURCES += \ lib/dpdk.c \ lib/netdev-dpdk.c \ + lib/netdev-dpdk-mirror.c \ lib/netdev-offload-dpdk.c else lib_libopenvswitch_la_SOURCES += \ diff --git a/lib/netdev-dpdk-mirror.c b/lib/netdev-dpdk-mirror.c new file mode 100644 index 000000000..ff2701660 --- /dev/null +++ b/lib/netdev-dpdk-mirror.c @@ -0,0 +1,516 @@ +/* + * Copyright (c) 2014, 2015, 2016, 2017 Nicira, Inc. + * Copyright (c) 2019 Mellanox Technologies, Ltd. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#include +#include + +#include "netdev-dpdk-mirror.h" +#include "openvswitch/vlog.h" +#include "openvswitch/dynamic-string.h" +#include "util.h" + +#define MAC_ADDR_MAP 0x0000FFFFFFFFFFFFULL +#define is_mac_addr_match(a,b) (((a^b)&MAC_ADDR_MAP) == 0) +#define INIT_MIRROR_DB_SIZE 8 +#define INVALID_DEVICE_ID 0xFFFFFFFF + +VLOG_DEFINE_THIS_MODULE(netdev_dpdk_mirror); + +/* port/flow mirror database management routines */ +/* + * The below API is for port/flow mirror offloading which uses a different DPDK + * interface as rte-flow. + */ +static int mirror_port_db_size = 0; +static int mirror_port_used = 0; +static struct mirror_offload_port *mirror_port_db = NULL; + +static void +netdev_mirror_db_init(struct mirror_offload_port *db, int size) +{ + int i; + + for (i = 0; i < size; i++) { + db[i].dev_id = INVALID_DEVICE_ID; + memset(&db[i].rx, 0, sizeof(struct mirror_param)); + memset(&db[i].tx, 0, sizeof(struct mirror_param)); + } +} + +/* Double the db size when it runs out of space */ +static int +netdev_mirror_db_resize(void) +{ + int new_size = mirror_port_db_size << 1; + struct mirror_offload_port *new_db = xmalloc( + sizeof(struct mirror_offload_port)*new_size); + + memcpy(new_db, mirror_port_db, sizeof(struct mirror_offload_port) + *mirror_port_db_size); + netdev_mirror_db_init(&new_db[mirror_port_db_size], mirror_port_db_size); + mirror_port_db_size = new_size; + mirror_port_db = new_db; + + return 0; +} + + +static struct mirror_offload_port* +netdev_mirror_data_find(uint32_t dev_id) +{ + int i; + + if (mirror_port_db == NULL) { + return NULL; + } + + for (i = 0; i < mirror_port_db_size; i++) { + if (dev_id == mirror_port_db[i].dev_id) { + return &mirror_port_db[i]; + } + } + return NULL; +} + +static struct mirror_offload_port* +netdev_mirror_data_add(uint32_t dev_id, int tx, + struct mirror_param *new_param) +{ + struct mirror_offload_port *target = NULL; + int i; + + if (!mirror_port_db) { + mirror_port_db_size = INIT_MIRROR_DB_SIZE; + mirror_port_db = xmalloc(sizeof(struct mirror_offload_port)* + mirror_port_db_size); + netdev_mirror_db_init(mirror_port_db, mirror_port_db_size); + } + target = netdev_mirror_data_find(dev_id); + if (target) { + if (tx) { + if (target->tx.mirror_cb) { + VLOG_ERR("Attempt to add ingress mirror offloading" + " on port, %d, while one is outstanding\n", dev_id); + return target; + } + + memcpy(&target->tx, new_param, sizeof(*new_param)); + } else { + if (target->rx.mirror_cb) { + VLOG_ERR("Attempt to add egress mirror offloading" + " on port, %d, while one is outstanding\n", dev_id); + return target; + } + + memcpy(&target->rx, new_param, sizeof(struct mirror_param)); + } + } else { + struct mirror_param *param; + /* find an unused spot on db */ + for (i = 0; i < mirror_port_db_size; i++) { + if (mirror_port_db[i].dev_id == INVALID_DEVICE_ID) { + break; + } + } + if (i == mirror_port_db_size && netdev_mirror_db_resize()) { + return NULL; + } + + param = tx ? 
&mirror_port_db[i].tx : &mirror_port_db[i].rx; + memcpy(param, new_param, sizeof(struct mirror_param)); + + target = &mirror_port_db[i]; + target->dev_id = dev_id; + mirror_port_used ++; + } + return target; +} + +static void +netdev_mirror_data_remove(uint32_t dev_id, int tx) { + struct mirror_offload_port *target = netdev_mirror_data_find(dev_id); + + if (!target) { + VLOG_ERR("Attempt to remove unsaved port, %d, %s callback\n", + dev_id, tx?"tx": "rx"); + } + + if (tx) { + memset(&target->tx, 0, sizeof(struct mirror_param)); + } else { + memset(&target->rx, 0, sizeof(struct mirror_param)); + } + + if ((target->rx.mirror_cb == NULL) && + (target->tx.mirror_cb == NULL)) { + target->dev_id = INVALID_DEVICE_ID; + mirror_port_used --; + /* release port mirror db memory when there + * is no outstanding port mirror offloading + * configuration + */ + if (mirror_port_used == 0) { + free(mirror_port_db); + mirror_port_db = NULL; + mirror_port_db_size = 0; + } + } +} + +void +netdev_mirror_data_proc(uint32_t dev_id, mirror_data_op op, + int tx, struct mirror_param *in_param, + struct mirror_offload_port **out_param) +{ + switch (op) { + case mirror_data_find: + *out_param = netdev_mirror_data_find(dev_id); + break; + case mirror_data_add: + *out_param = netdev_mirror_data_add(dev_id, tx, in_param); + break; + case mirror_data_rem: + netdev_mirror_data_remove(dev_id, tx); + break; + } +} + +/* port/flow mirror traffic processors */ +static inline uint16_t +netdev_custom_mirror_offload_cb(uint16_t qidx, struct rte_mbuf **pkts, + uint16_t nb_pkts, void *user_params) +{ + struct mirror_param *data = user_params; + uint16_t i, dst_qidx, match_count = 0; + uint16_t pkt_trans; + uint16_t dst_port_id = data->dst_port_id; + uint16_t dst_vlan_id = data->dst_vlan_id; + struct rte_mbuf **pkt_buf = &data->pkt_buf[qidx * data->max_burst_size]; + + if (nb_pkts == 0) { + return 0; + } + + if (nb_pkts > data->max_burst_size) { + VLOG_ERR("Per-flow batch size, %d, exceeds maximum limit\n", nb_pkts); + return 0; + } + + for (i = 0; i < nb_pkts; i++) { + if (data->custom_scan(pkts[i], user_params)) { + pkt_buf[match_count] = pkts[i]; + pkt_buf[match_count]->ol_flags |= PKT_TX_VLAN_PKT; + pkt_buf[match_count]->vlan_tci = dst_vlan_id; + rte_mbuf_refcnt_update(pkt_buf[match_count], 1); + match_count++; + } + } + + dst_qidx = (data->n_dst_queue > qidx)?qidx:(data->n_dst_queue -1); + + rte_spinlock_lock(&data->locks[dst_qidx]); + pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkt_buf, match_count); + rte_spinlock_unlock(&data->locks[dst_qidx]); + + for (i = 0; i < match_count; i++) { + pkt_buf[i]->ol_flags &= ~PKT_TX_VLAN_PKT; + } + + while (unlikely (pkt_trans < match_count)) { + rte_pktmbuf_free(pkt_buf[pkt_trans]); + pkt_trans++; + } + + return nb_pkts; +} + +static inline uint16_t +netdev_flow_mirror_offload_cb(uint16_t qidx, struct rte_mbuf **pkts, + uint16_t nb_pkts, void *user_params, uint32_t offset) +{ + struct mirror_param *data = user_params; + uint16_t i, dst_qidx, match_count = 0; + uint16_t pkt_trans; + uint16_t dst_port_id = data->dst_port_id; + uint16_t dst_vlan_id = data->dst_vlan_id; + uint64_t target_addr = *(uint64_t *) data->extra_data; + struct rte_mbuf **pkt_buf = &data->pkt_buf[qidx * data->max_burst_size]; + + if (nb_pkts == 0) { + return 0; + } + + if (nb_pkts > data->max_burst_size) { + VLOG_ERR("Per-flow batch size, %d, exceeds maximum limit\n", nb_pkts); + return 0; + } + + for (i = 0; i < nb_pkts; i++) { + uint64_t *dst_mac_addr = + rte_pktmbuf_mtod_offset(pkts[i], void *, offset); + if 
(is_mac_addr_match(target_addr, (*dst_mac_addr))) { + pkt_buf[match_count] = pkts[i]; + pkt_buf[match_count]->ol_flags |= PKT_TX_VLAN_PKT; + pkt_buf[match_count]->vlan_tci = dst_vlan_id; + rte_mbuf_refcnt_update(pkt_buf[match_count], 1); + match_count ++; + } + } + + dst_qidx = (data->n_dst_queue > qidx) ? qidx : (data->n_dst_queue -1); + + rte_spinlock_lock(&data->locks[dst_qidx]); + pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkt_buf, match_count); + rte_spinlock_unlock(&data->locks[dst_qidx]); + + for (i = 0; i < match_count; i++) { + pkt_buf[i]->ol_flags &= ~PKT_TX_VLAN_PKT; + } + + while (unlikely (pkt_trans < match_count)) { + rte_pktmbuf_free(pkt_buf[pkt_trans]); + pkt_trans++; + } + + return nb_pkts; +} + +static inline uint16_t +netdev_port_mirror_offload_cb(uint16_t qidx, struct rte_mbuf **pkts, + uint16_t nb_pkts, void *user_params) +{ + struct mirror_param *data = user_params; + uint16_t i, dst_qidx; + uint16_t pkt_trans; + uint16_t dst_port_id = data->dst_port_id; + uint16_t dst_vlan_id = data->dst_vlan_id; + + if (nb_pkts == 0) { + return 0; + } + + for (i = 0; i < nb_pkts; i++) { + pkts[i]->ol_flags |= PKT_TX_VLAN_PKT; + pkts[i]->vlan_tci = dst_vlan_id; + rte_mbuf_refcnt_update(pkts[i], 1); + } + + dst_qidx = (data->n_dst_queue > qidx) ? qidx : (data->n_dst_queue -1); + + rte_spinlock_lock(&data->locks[dst_qidx]); + pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkts, nb_pkts); + rte_spinlock_unlock(&data->locks[dst_qidx]); + + for (i = 0; i < nb_pkts; i++) { + pkts[i]->ol_flags &= ~PKT_TX_VLAN_PKT; + } + + while (unlikely (pkt_trans < nb_pkts)) { + rte_pktmbuf_free(pkts[pkt_trans]); + pkt_trans++; + } + + return nb_pkts; +} + +static inline uint16_t +netdev_rx_custom_mirror_offload_cb(uint16_t port_id OVS_UNUSED, + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, + uint16_t maxi_pkts OVS_UNUSED, void *user_params) +{ + return netdev_custom_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); +} + +static inline uint16_t +netdev_tx_custom_mirror_offload_cb(uint16_t port_id OVS_UNUSED, + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, + void *user_params) +{ + return netdev_custom_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); +} + +static inline uint16_t +netdev_rx_flow_mirror_offload_cb(uint16_t port_id OVS_UNUSED, + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, + uint16_t maxi_pkts OVS_UNUSED, void *user_params) +{ + return netdev_flow_mirror_offload_cb(qidx, pkts, nb_pkts, user_params, 0); +} + +static inline uint16_t +netdev_tx_flow_mirror_offload_cb(uint16_t port_id OVS_UNUSED, + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, + void *user_params) +{ + return netdev_flow_mirror_offload_cb(qidx, pkts, nb_pkts, user_params, 6); +} + +static inline uint16_t +netdev_rx_port_mirror_offload_cb(uint16_t port_id OVS_UNUSED, + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, + uint16_t max_pkts OVS_UNUSED, void *user_params) +{ + return netdev_port_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); +} + +static inline uint16_t +netdev_tx_port_mirror_offload_cb(uint16_t port_id OVS_UNUSED, + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, + void *user_params) +{ + return netdev_port_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); +} + +static rte_rx_callback_fn +netdev_mirror_rx_cb(rte_mirror_type mirror_type) +{ + switch (mirror_type) { + case mirror_port: + return netdev_rx_port_mirror_offload_cb; + case mirror_flow_mac: + return netdev_rx_flow_mirror_offload_cb; + case mirror_flow_custom: + return 
netdev_rx_custom_mirror_offload_cb; + case mirror_invalid: + return NULL; + } + VLOG_ERR("Un-supported mirror type\n"); + return NULL; +} + +static rte_tx_callback_fn +netdev_mirror_tx_cb(rte_mirror_type mirror_type) +{ + switch (mirror_type) { + case mirror_port: + return netdev_tx_port_mirror_offload_cb; + case mirror_flow_mac: + return netdev_tx_flow_mirror_offload_cb; + break; + case mirror_flow_custom: + return netdev_tx_custom_mirror_offload_cb; + case mirror_invalid: + return NULL; + } + VLOG_ERR("Un-supported mirror type\n"); + return NULL; +} + +void +netdev_mirror_cb_set(struct mirror_param *data, uint16_t port_id, + int pmd_cb, int tx) +{ + unsigned int qid; + + data->pkt_buf = NULL; + if (data->extra_data_size) { + data->pkt_buf = xmalloc(sizeof(mirror_fn_cb)*data->max_burst_size * + data->n_src_queue); + } + + data->mirror_cb = xmalloc(sizeof(struct rte_eth_rxtx_callback *) + * data->n_src_queue); + for (qid = 0; qid < data->n_src_queue; qid++) { + if (pmd_cb) { + if (tx) { + data->mirror_cb[qid].pmd = rte_eth_add_tx_callback(port_id, + qid, netdev_mirror_tx_cb(data->mirror_type), data); + } else { + data->mirror_cb[qid].pmd = rte_eth_add_rx_callback(port_id, + qid, netdev_mirror_rx_cb(data->mirror_type), data); + } + } else { + struct rte_eth_rxtx_callback *rxtx_cb = + xmalloc(sizeof(struct rte_eth_rxtx_callback)); + + data->mirror_cb[qid].direct = rxtx_cb; + rxtx_cb->next = NULL; + rxtx_cb->param = data; + + if (tx) { + rxtx_cb->fn.tx = netdev_mirror_tx_cb(data->mirror_type); + } else { + rxtx_cb->fn.rx = netdev_mirror_rx_cb(data->mirror_type); + } + } + } +} + +/* port/flow mirroring device (port) register/un-registe routines */ +int +netdev_eth_register_mirror(uint16_t src_port, struct mirror_param *param, + int tx_cb) +{ + struct mirror_offload_port *port_info = NULL; + struct mirror_param *data; + + netdev_mirror_data_proc(src_port, mirror_data_add, tx_cb, param, + &port_info); + if (!port_info) { + return -1; + } + + data = tx_cb ? &port_info->tx : &port_info->rx; + netdev_mirror_cb_set(data, src_port, 1, tx_cb); + + return 0; +} + +int +netdev_eth_unregister_mirror(uint16_t src_port, int tx_cb) +{ + /* release both cb and pkt_buf */ + unsigned int i; + struct mirror_offload_port *port_info = NULL; + struct mirror_param *data; + + netdev_mirror_data_proc(src_port, mirror_data_find, tx_cb, NULL, + &port_info); + if (port_info == NULL) { + VLOG_ERR("Source port %d is not on outstanding port mirror db\n", + src_port); + return -1; + } + data = tx_cb ? &port_info->tx : &port_info->rx; + + for (i = 0; i < data->n_src_queue; i++) { + if (data->mirror_cb[i].pmd) { + if (tx_cb) { + rte_eth_remove_tx_callback(src_port, i, + data->mirror_cb[i].pmd); + } else { + rte_eth_remove_rx_callback(src_port, i, + data->mirror_cb[i].pmd); + } + } + data->mirror_cb[i].pmd = NULL; + } + free(data->mirror_cb); + + if (data->pkt_buf) { + free(data->pkt_buf); + data->pkt_buf = NULL; + } + + if (data->extra_data) { + free(data->extra_data); + data->extra_data = NULL; + data->extra_data_size = 0; + } + + netdev_mirror_data_proc(src_port, mirror_data_rem, tx_cb, NULL, NULL); + return 0; +} diff --git a/lib/netdev-dpdk-mirror.h b/lib/netdev-dpdk-mirror.h new file mode 100644 index 000000000..ee4b933ba --- /dev/null +++ b/lib/netdev-dpdk-mirror.h @@ -0,0 +1,83 @@ +/* + * Copyright (c) 2014, 2015, 2016 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef NETDEV_DPDK_MIRROR_H +#define NETDEV_DPDK_MIRROR_H + +#include "openvswitch/types.h" + +#ifdef __cplusplus +extern "C" { +#endif + +typedef enum { + mirror_data_find, /* find the mirror-data allocated */ + mirror_data_add, /* add a new mirror_param data int DB */ + mirror_data_rem, /* remove a mirror_param from the DB */ +} mirror_data_op; + +typedef int (*rte_mirror_scan_fn)(struct rte_mbuf *pkt, void *user_param); +typedef enum { + mirror_port, /* port mirror */ + mirror_flow_mac, /* flow mirror according to source mac */ + mirror_flow_custom, /* flow mirror according to a callback scn */ + mirror_invalid, /* invalid mirror_type */ +} rte_mirror_type; + +typedef union { + const struct rte_eth_rxtx_callback *pmd; + struct rte_eth_rxtx_callback *direct; +} mirror_fn_cb; + +struct mirror_param { + uint16_t dst_port_id; + uint16_t dst_vlan_id; + rte_spinlock_t *locks; + int n_src_queue; + int n_dst_queue; + struct rte_mbuf **pkt_buf; + mirror_fn_cb *mirror_cb; + unsigned int max_burst_size; + rte_mirror_scan_fn custom_scan; + rte_mirror_type mirror_type; + unsigned int extra_data_size; + void *extra_data; /* extra mirror parameter */ +}; + +struct mirror_offload_port { + uint32_t dev_id; + struct mirror_param rx; + struct mirror_param tx; +}; + +bool netdev_port_started(uint16_t port_id, uint32_t *num_tx_queue); +int netdev_get_portid_from_addr(const char *pci_addr_str, uint16_t *port_id); +int netdev_tunnel_port_setup(uint16_t portid, uint32_t *num_queue); + +void netdev_mirror_data_proc(uint32_t dev_id, mirror_data_op op, + int tx, struct mirror_param *in_param, + struct mirror_offload_port **out_param); +void netdev_mirror_cb_set(struct mirror_param *data, uint16_t port_id, + int pmd, int tx); +int netdev_eth_register_mirror(uint16_t src_port, + struct mirror_param *param, int tx_cb); +int netdev_eth_unregister_mirror(uint16_t src_port, int tx_cb); + +#ifdef __cplusplus +} +#endif + +#endif /* netdev-dpdk-mirror.h */ diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 9d8096668..eb6644333 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -48,6 +48,7 @@ #include "fatal-signal.h" #include "if-notifier.h" #include "netdev-provider.h" +#include "netdev-dpdk-mirror.h" #include "netdev-vport.h" #include "odp-util.h" #include "openvswitch/dynamic-string.h" @@ -171,6 +172,16 @@ static const struct rte_eth_conf port_conf = { }, }; +struct mirror_tunnel_port_info { + uint16_t port_id; + rte_spinlock_t *locks; + uint32_t share_count; + uint32_t num_queue; + bool port_started; + struct mirror_tunnel_port_info *next; +}; +static struct mirror_tunnel_port_info *mirror_tunnel_head = NULL; + /* * These callbacks allow virtio-net devices to be added to vhost ports when * configuration has been fully completed. 
@@ -443,6 +454,8 @@ struct netdev_dpdk { }; struct dpdk_tx_queue *tx_q; struct rte_eth_link link; + mirror_fn_cb *rx_cb; /* shared pointer */ + mirror_fn_cb *tx_cb; ); PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline1, @@ -2417,6 +2430,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, nb_rx = rte_vhost_dequeue_burst(vid, qid, dev->dpdk_mp->mp, (struct rte_mbuf **) batch->packets, NETDEV_MAX_BURST); + + if (dev->rx_cb && dev->rx_cb[qid].direct->fn.rx) { + dev->rx_cb[qid].direct->fn.rx((uint16_t) vid, qid, + (struct rte_mbuf **) batch->packets, nb_rx, + NETDEV_MAX_BURST, dev->rx_cb[qid].direct->param); + } + if (!nb_rx) { return EAGAIN; } @@ -2634,6 +2654,10 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ; unsigned int tx_pkts; + if (dev->tx_cb && dev->tx_cb[qid].direct->fn.tx) { + dev->tx_cb[qid].direct->fn.tx((uint16_t) vid, qid, cur_pkts, cnt, + dev->tx_cb[qid].direct->param); + } tx_pkts = rte_vhost_enqueue_burst(vid, vhost_qid, cur_pkts, cnt); if (OVS_LIKELY(tx_pkts)) { /* Packets have been sent.*/ @@ -5291,6 +5315,376 @@ netdev_dpdk_rte_flow_query_count(struct netdev *netdev, return ret; } +/* + * mirror tunnel device management routines + * mirror tunnel devices are devices reserved solely for + * traffic mirroring + */ +static void +netdev_dpdk_update_mt_list(struct mirror_tunnel_port_info *mt_port_info, + bool add_port) +{ + struct mirror_tunnel_port_info *ptr = mirror_tunnel_head; + + if (add_port) { + if (!ptr) { + mirror_tunnel_head = mt_port_info; + return; + } + while (ptr->next) { + ptr = ptr->next; + } + ptr->next = mt_port_info; + } else { + while (ptr->next && + ptr->next->port_id != mt_port_info->port_id) { + ptr = ptr->next; + } + + if (ptr->next) { + ptr->next = ptr->next->next; + free(mt_port_info); + } else { + if (ptr->port_id == mt_port_info->port_id) { + mirror_tunnel_head = NULL; + free(mt_port_info); + } else { + VLOG_ERR("Fail to find %s mirror port (%d) info\n", + add_port?"add":"remove", mt_port_info->port_id); + } + } + } +} + +static struct mirror_tunnel_port_info* +netdev_dpdk_get_mt_port_info(uint16_t port_id) +{ + struct mirror_tunnel_port_info *mt_port_info; + + if (mirror_tunnel_head) { + mt_port_info = mirror_tunnel_head; + while (mt_port_info) { + if (mt_port_info->port_id == port_id) { + return mt_port_info; + } + mt_port_info = mt_port_info->next; + } + VLOG_ERR("Could not tunnel port with port-id %d\n", + port_id); + } + + mt_port_info = xmalloc(sizeof(struct mirror_tunnel_port_info)); + memset(mt_port_info, 0, sizeof(*mt_port_info)); + mt_port_info->port_id = port_id; + mt_port_info->next = NULL; + + return mt_port_info; +} + +static int +netdev_dpdk_addr_to_portid(const char *pci_addr_str, uint16_t *port_id) +{ + struct rte_pci_device *pci_dev; + struct rte_pci_addr pci_addr; + int i; + + if (rte_pci_addr_parse(pci_addr_str, &pci_addr)) { + VLOG_ERR("Incorrect pci address %s\n", pci_addr_str); + return -1; + } + + for (i = 0; i < RTE_MAX_ETHPORTS; i++) { + struct rte_pci_addr *eth_pci_addr; + + if (!rte_eth_devices[i].device) { + continue; + } + + pci_dev = RTE_ETH_DEV_TO_PCI(&rte_eth_devices[i]); + if (!pci_dev) { + continue; + } + + eth_pci_addr = &pci_dev->addr; + + if (pci_addr.bus == eth_pci_addr->bus && + pci_addr.devid == eth_pci_addr->devid && + pci_addr.domain == eth_pci_addr->domain && + pci_addr.function == eth_pci_addr->function) { + *port_id = i; + + return 0; + } + } + + return -1; +} + +static int +netdev_dpdk_mt_open(uint16_t port_id, struct mirror_param 
*param) +{ + struct rte_eth_dev_info dev_info; + struct rte_eth_txconf txq_conf; + struct rte_eth_rxconf rxq_conf; + struct rte_mempool *pktbuf; + + struct mirror_tunnel_port_info *mt_info; + + uint16_t nb_rxd = NIC_PORT_DEFAULT_RXQ_SIZE; + uint16_t nb_txd = NIC_PORT_DEFAULT_TXQ_SIZE; + unsigned int i, num_queue; + + struct rte_eth_conf mt_port_conf = { + .rxmode = { + .split_hdr_size = 0, + }, + .txmode = { + .mq_mode = ETH_MQ_TX_NONE, + }, + }; + + mt_info = netdev_dpdk_get_mt_port_info(port_id); + if (!mt_info) { + return -1; + } + + if (mt_info->port_started) { + param->n_dst_queue = mt_info->num_queue; + param->dst_port_id = port_id; + param->locks = mt_info->locks; + mt_info->share_count++; + + return 0; + } + + rte_eth_dev_info_get(port_id, &dev_info); + num_queue = param->n_src_queue; + + /* A tunnel device doesn't require mbuf. It's used as + * hardware channel, transmit packets with + * mbuf provided by source. Need this mbuf creation + * to finish port initialization + */ + pktbuf = rte_pktmbuf_pool_create( + "tunnel-port", + (dev_info.rx_desc_lim.nb_max + dev_info.tx_desc_lim.nb_max), + RTE_MEMPOOL_CACHE_MAX_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, + rte_eth_dev_socket_id(port_id)); + + mt_port_conf.txmode.offloads |= DEV_TX_OFFLOAD_VLAN_INSERT; + if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) { + mt_port_conf.txmode.offloads |= DEV_TX_OFFLOAD_MBUF_FAST_FREE; + } + rte_eth_dev_configure(port_id, 1, num_queue, &mt_port_conf); + + /* init one Rx queue */ + rxq_conf = dev_info.default_rxconf; + rxq_conf.offloads = mt_port_conf.rxmode.offloads; + if (rte_eth_rx_queue_setup(port_id, 0, nb_rxd, + rte_eth_dev_socket_id(port_id), &rxq_conf, pktbuf) < 0) + VLOG_ERR("fail to setup tunnel port (%d) rx-queue\n", port_id); + + /* init # of Tx queue as part of mirror-tunnel setup */ + txq_conf = dev_info.default_txconf; + txq_conf.offloads |= mt_port_conf.txmode.offloads; + for (i = 0; i < num_queue; i++) { + if (rte_eth_tx_queue_setup(port_id, + i, nb_txd, + rte_eth_dev_socket_id(port_id), + &txq_conf) < 0) { + VLOG_ERR("fail to setup tunnel port (%d) tx queue #%u\n", + port_id, i); + return -1; + } + } + + if (rte_eth_dev_start(port_id) < 0) { + VLOG_ERR("fail to start tunnel port %d\n", port_id); + return -1; + } + + mt_info->locks = xmalloc(num_queue * sizeof(rte_spinlock_t)); + if (mt_info->locks) { + for (i = 0; i < mt_info->num_queue; i++) { + rte_spinlock_init(&mt_info->locks[i]); + } + } else { + return -1; + } + mt_info->share_count = 1; + mt_info->port_started = true; + mt_info->num_queue = num_queue; + + param->n_dst_queue = mt_info->num_queue; + param->dst_port_id = port_id; + param->locks = mt_info->locks; + + netdev_dpdk_update_mt_list(mt_info, true); + return 0; +} + +static void +netdev_dpdk_mt_close(uint16_t mirror_port_id) +{ + struct mirror_tunnel_port_info *mt_port_info = + netdev_dpdk_get_mt_port_info(mirror_port_id); + + if (mt_port_info) { + mt_port_info->share_count--; + if (!mt_port_info->share_count) { + netdev_dpdk_update_mt_list(mt_port_info, false); + rte_eth_dev_stop(mirror_port_id); + rte_eth_dev_close(mirror_port_id); + } + } +} + +/* vhost device mirror registration and un-registration routines */ +static int +netdev_vhost_register_mirror(struct netdev_dpdk *dev, + struct mirror_param *param, int tx_cb) +{ + uint32_t vid = netdev_dpdk_get_vid(dev); + struct mirror_offload_port *port_info = NULL; + struct mirror_param *data; + + netdev_mirror_data_proc(vid, mirror_data_add, tx_cb, param, &port_info); + if (!port_info) { + return -1; + } + + data = 
tx_cb ? &port_info->tx : &port_info->rx; + netdev_mirror_cb_set(data, (uint16_t) vid, 0, tx_cb); + + if (tx_cb) { + dev->tx_cb = data->mirror_cb; + } else { + dev->rx_cb = data->mirror_cb; + } + + return 0; +} + +static int +netdev_vhost_unregister_mirror(struct netdev_dpdk *dev, int tx_cb) +{ + /* release both cb and pkt_buf */ + unsigned int i; + uint32_t vid = netdev_dpdk_get_vid(dev); + struct mirror_offload_port *port_info = NULL; + struct mirror_param *data; + + netdev_mirror_data_proc(vid, mirror_data_find, tx_cb, NULL, &port_info); + if (port_info == NULL) { + VLOG_ERR("Source port %d is not on outstanding port mirror db\n", vid); + return -1; + } + data = tx_cb ? &port_info->tx : &port_info->rx; + + if (tx_cb) { + dev->tx_cb = NULL; + } else { + dev->rx_cb = NULL; + } + + for (i = 0; i < data->n_src_queue; i++) { + free(data->mirror_cb[i].direct); + } + + free(data->mirror_cb); + + if (data->pkt_buf) { + free(data->pkt_buf); + data->pkt_buf = NULL; + } + + if (data->extra_data) { + free(data->extra_data); + data->extra_data = NULL; + data->extra_data_size = 0; + } + + netdev_mirror_data_proc(vid, mirror_data_rem, tx_cb, NULL, NULL); + return 0; +} + +static int +netdev_dpdk_mirror_offload(struct netdev *src, struct eth_addr *flow_addr, + uint16_t vlan_id, char *mirror_tunnel_addr, + bool add_mirror, bool tx_cb) { + struct netdev_dpdk *src_dev = netdev_dpdk_cast(src); + bool eth_dev = src_dev->type == DPDK_DEV_ETH; + uint16_t mirror_port_id; + int status = 0; + + if (netdev_dpdk_addr_to_portid(mirror_tunnel_addr, &mirror_port_id)) { + VLOG_ERR("Could not find tunnel port with BDF addr %s\n", + mirror_tunnel_addr); + return -1; + } + + if (add_mirror) { + uint32_t i; + struct mirror_param data; + uint64_t mac_addr = 0; + + memset(&data, 0, sizeof(struct mirror_param)); + data.extra_data_size = 0; + data.extra_data = NULL; + data.mirror_type = mirror_port; + for (i = 0; i < 6; i++) { + mac_addr <<= 8; + mac_addr |= flow_addr->ea[6 - i - 1]; + } + if (mac_addr) { + data.mirror_type = mirror_flow_mac; + data.extra_data_size = sizeof(uint64_t); + data.extra_data = xmalloc(sizeof(uint64_t)); + memcpy(data.extra_data, &mac_addr, sizeof(uint64_t)); + } + data.dst_vlan_id = vlan_id; + data.n_src_queue = tx_cb?src->n_txq:src->n_rxq; + data.max_burst_size = NETDEV_MAX_BURST; + + if (netdev_dpdk_mt_open(mirror_port_id, &data)) { + VLOG_ERR("Fail to initialize mirror tunnel port %d\n", + mirror_port_id); + return -1; + } + + VLOG_INFO("register %s device with %s mirror-offload with" + "src-port:%d (%s) and output-port:%d (%s) vlan-id=%d flow-mac=" + "0x%" PRIx64 "\n", + eth_dev?"ethdev":"vhost", + tx_cb?"ingress":"egress", src_dev->port_id, + src->name, mirror_port_id, mirror_tunnel_addr, vlan_id, + (uint64_t)__builtin_bswap64(mac_addr)); + + if (eth_dev) { + status = netdev_eth_register_mirror(src_dev->port_id, &data, + tx_cb); + } else { + status = netdev_vhost_register_mirror(src_dev, &data, tx_cb); + } + } else { + VLOG_INFO("unregister %s device with %s mirror-offload with" + " src-port:%d(%s)\n", + eth_dev?"ethdev":"vhost", + tx_cb?"ingress":"egress", src_dev->port_id, + src->name); + + if (eth_dev) { + status = netdev_eth_unregister_mirror(src_dev->port_id, tx_cb); + } else { + status = netdev_vhost_unregister_mirror(src_dev, tx_cb); + } + + netdev_dpdk_mt_close(mirror_port_id); + } + + return status; +} + #define NETDEV_DPDK_CLASS_COMMON \ .is_pmd = true, \ .alloc = netdev_dpdk_alloc, \ @@ -5340,6 +5734,7 @@ static const struct netdev_class dpdk_class = { .construct = 
netdev_dpdk_construct, .set_config = netdev_dpdk_set_config, .send = netdev_dpdk_eth_send, + .mirror_offload = netdev_dpdk_mirror_offload, }; static const struct netdev_class dpdk_vhost_class = { @@ -5355,6 +5750,7 @@ static const struct netdev_class dpdk_vhost_class = { .reconfigure = netdev_dpdk_vhost_reconfigure, .rxq_recv = netdev_dpdk_vhost_rxq_recv, .rxq_enabled = netdev_dpdk_vhost_rxq_enabled, + .mirror_offload = netdev_dpdk_mirror_offload, }; static const struct netdev_class dpdk_vhost_client_class = { @@ -5371,6 +5767,7 @@ static const struct netdev_class dpdk_vhost_client_class = { .reconfigure = netdev_dpdk_vhost_client_reconfigure, .rxq_recv = netdev_dpdk_vhost_rxq_recv, .rxq_enabled = netdev_dpdk_vhost_rxq_enabled, + .mirror_offload = netdev_dpdk_mirror_offload, }; void diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h index 73dce2fca..dab278dcd 100644 --- a/lib/netdev-provider.h +++ b/lib/netdev-provider.h @@ -834,6 +834,22 @@ struct netdev_class { /* Get a block_id from the netdev. * Returns the block_id or 0 if none exists for netdev. */ uint32_t (*get_block_id)(struct netdev *); + + /* Configure a mirror offload setting on a netdev. + * 'src': netdev traffic to be mirrored + * 'flow_addr': the destination mac address is of source traffic for + * inspection. + * 'dst': netdev where mirror traffic is transmitted. + * 'vlan_id': vlag to be added to the mirrored packets. + * 'mt_pci_addr': mirror tunnel pcie address. + * 'add_mirror': true: configure a mirror traffic; false: remove mirror + * 'ingress': true: mirror 'src' netdev Rx traffic; false: mirror + * 'src' netdev Tx traffic. + */ + int (*mirror_offload)(struct netdev *src, struct eth_addr *flow_addr, + uint16_t vlan_id, char *mt_pci_addr, + bool add_mirror, bool ingress); + }; int netdev_register_provider(const struct netdev_class *); diff --git a/lib/netdev.c b/lib/netdev.c index 91e91955c..464c2f8fe 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -69,6 +69,8 @@ COVERAGE_DEFINE(netdev_get_stats); COVERAGE_DEFINE(netdev_send_prepare_drops); COVERAGE_DEFINE(netdev_push_header_drops); +#define MIRROR_DB_INIT_SIZE 8 + struct netdev_saved_flags { struct netdev *netdev; struct ovs_list node; /* In struct netdev's saved_flags_list. */ @@ -2297,3 +2299,387 @@ netdev_free_custom_stats_counters(struct netdev_custom_stats *custom_stats) } } } + + +struct netdev_mirror_offload_item { + struct mirror_offload_info info; + + struct ovs_list node; +}; + +struct netdev_mirror_offload { + struct ovs_mutex mutex; + struct ovs_list list; + pthread_cond_t cond; +}; + +static struct netdev_mirror_offload netdev_mirror_offload = { + .mutex = OVS_MUTEX_INITIALIZER, + .list = OVS_LIST_INITIALIZER(&netdev_mirror_offload.list), +}; + +static struct ovsthread_once offload_thread_once + = OVSTHREAD_ONCE_INITIALIZER; + +static void *netdev_mirror_offload_main(void *data); + +/* + * Re-size mirror_db when it's out of space. 
+ * Always double the buffer when it's needed + */ +static int +netdev_mirror_db_resize(struct netdev_mirror_offload_item ***old_db, + int *old_db_size) +{ + struct netdev_mirror_offload_item **new_db; + int cur_size = *old_db_size; + int new_size; + + if (!cur_size) { + new_size = MIRROR_DB_INIT_SIZE; + } else { + new_size = 2 * cur_size; + } + + new_db = xzalloc(sizeof(struct netdev_mirror_offload_item *) * new_size); + + if (!new_db) { + VLOG_ERR("Out of memory!!!"); + return -1; + } + memset(new_db, 0, sizeof(struct netdev_mirror_offload_item *) * new_size); + + if (cur_size) { + int i; + + for (i = 0; i < cur_size; i++) { + new_db[i] = (*old_db)[i]; + } + free(*old_db); + } + + *old_db = new_db; + *old_db_size = new_size; + + return 0; +} + +static void +netdev_free_mirror_offload(struct netdev_mirror_offload_item *offload) +{ + if (!offload) { + return; + } + + if (offload->info.src) { + free(offload->info.src); + } + if (offload->info.dst) { + free(offload->info.dst); + } + if (offload->info.flow_dst_mac) { + free(offload->info.flow_dst_mac); + } + if (offload->info.flow_src_mac) { + free(offload->info.flow_src_mac); + } + if (offload->info.output_src_tags) { + free(offload->info.output_src_tags); + } + if (offload->info.output_dst_tags) { + free(offload->info.output_dst_tags); + } + if (offload->info.name) { + free(offload->info.name); + } + if (offload->info.mirror_tunnel_addr) { + free(offload->info.mirror_tunnel_addr); + } + + free(offload); +} + +static struct +netdev_mirror_offload_item * +netdev_alloc_mirror_offload(struct mirror_offload_info *info) +{ + struct netdev_mirror_offload_item *offload; + int i; + + offload = xzalloc(sizeof(*offload)); + memcpy(&offload->info, info, sizeof(struct mirror_offload_info)); + + if (info->name) { + offload->info.name = xzalloc(strlen(info->name) + 1); + if (offload->info.name) { + ovs_strzcpy(offload->info.name, info->name, strlen(info->name)); + } + } + + if (info->mirror_tunnel_addr) { + offload->info.mirror_tunnel_addr = + xzalloc(strlen(info->mirror_tunnel_addr) + 1); + if (offload->info.mirror_tunnel_addr) { + ovs_strzcpy(offload->info.mirror_tunnel_addr, + info->mirror_tunnel_addr, + strlen(info->mirror_tunnel_addr)); + } + } + + /* only add_mirror request include valid configuration */ + if (info->n_src_port) { + offload->info.src = xzalloc(sizeof(struct netdev *)*info->n_src_port); + offload->info.flow_dst_mac = xzalloc(sizeof(struct eth_addr)* + info->n_src_port); + offload->info.output_src_tags = xzalloc(sizeof(uint16_t)* + info->n_src_port); + if (!offload->info.src || !offload->info.flow_dst_mac || + !offload->info.output_src_tags) { + VLOG_ERR("Out of memory!!!"); + netdev_free_mirror_offload(offload); + return NULL; + } + + for (i = 0; i < info->n_src_port; i++) { + offload->info.src[i] = info->src[i]; + offload->info.output_src_tags[i] = info->output_src_tags[i]; + memcpy(&offload->info.flow_dst_mac[i], &info->flow_dst_mac[i], + sizeof(struct eth_addr)); + } + } + + if (info->n_dst_port) { + offload->info.dst = xzalloc(sizeof(struct netdev *)*info->n_dst_port); + offload->info.flow_src_mac = xzalloc(sizeof(struct eth_addr)* + info->n_dst_port); + offload->info.output_dst_tags = xzalloc(sizeof(uint16_t)* + info->n_dst_port); + if (!offload->info.dst || !offload->info.flow_src_mac || + !offload->info.output_dst_tags) { + VLOG_ERR("Out of memory!!!"); + netdev_free_mirror_offload(offload); + return NULL; + } + + for (i = 0; i < info->n_dst_port; i++) { + offload->info.dst[i] = info->dst[i]; + offload->info.output_dst_tags[i] 
= info->output_dst_tags[i]; + memcpy(&offload->info.flow_src_mac[i], &info->flow_src_mac[i], + sizeof(struct eth_addr)); + } + } + + return offload; +} + +static void +netdev_append_mirror_offload(struct netdev_mirror_offload_item *offload) +{ + ovs_mutex_lock(&netdev_mirror_offload.mutex); + ovs_list_push_back(&netdev_mirror_offload.list, &offload->node); + xpthread_cond_signal(&netdev_mirror_offload.cond); + ovs_mutex_unlock(&netdev_mirror_offload.mutex); +} + +void +netdev_mirror_offload_put(struct mirror_offload_info *info) +{ + struct netdev_mirror_offload_item *offload; + /* only support tunnel port for traffic mirroring */ + if (info->add_mirror && !info->mirror_tunnel_addr) { + return; + } + + if (ovsthread_once_start(&offload_thread_once)) { + xpthread_cond_init(&netdev_mirror_offload.cond, NULL); + ovs_thread_create("netdev_mirror_offload", + netdev_mirror_offload_main, NULL); + ovsthread_once_done(&offload_thread_once); + } + + offload = netdev_alloc_mirror_offload(info); + netdev_append_mirror_offload(offload); +} + +static int +netdev_mirror_offload_configue(struct mirror_offload_info *info, + bool add_mirror) +{ + int un_support_count = 0; + int ret; + + if (info->n_src_port) { + for (int i = 0; i < info->n_src_port; i++) { + const struct netdev_class *class = + info->src[i]->netdev_class; + if (!class) { + return -1; + } + if (class->mirror_offload) { + ret = class->mirror_offload( + info->src[i], + &info->flow_dst_mac[i], + info->output_src_tags[i], + info->mirror_tunnel_addr, + add_mirror, false); + if (ret) { + VLOG_ERR("Fail to %s mirror-offload" + " configuration %s\n", + add_mirror ? "add" : "remove", + info->name); + return ret; + } + } else { + un_support_count++; + } + } + } + + if (info->n_dst_port) { + for (int i = 0; i < info->n_dst_port; i++) { + const struct netdev_class *class = + info->dst[i]->netdev_class; + if (!class) { + return -1; + } + if (class->mirror_offload) { + ret = class->mirror_offload( + info->dst[i], + &info->flow_src_mac[i], + info->output_dst_tags[i], + info->mirror_tunnel_addr, + add_mirror, true); + if (ret) { + VLOG_ERR("Fail to %s mirror-offload" + " configuration %s\n", + add_mirror ? 
"add" : "remove", + info->name); + return ret; + } + } else { + un_support_count++; + } + } + } + + return un_support_count; +} + +static void * +netdev_mirror_offload_main(void *data OVS_UNUSED) +{ + struct netdev_mirror_offload_item *offload; + struct mirror_offload_info *info; + struct ovs_list *list; + struct netdev_mirror_offload_item **offload_db = NULL; + int offload_used_count = 0; + int offload_db_size = 0; + int ret, i, ind; + + /* continue polling to check if there is an outstanding request */ + for (;;) { + ovs_mutex_lock(&netdev_mirror_offload.mutex); + if (ovs_list_is_empty(&netdev_mirror_offload.list)) { + ovsrcu_quiesce_start(); + ovs_mutex_cond_wait(&netdev_mirror_offload.cond, + &netdev_mirror_offload.mutex); + ovsrcu_quiesce_end(); + } + list = ovs_list_pop_front(&netdev_mirror_offload.list); + offload = CONTAINER_OF(list, struct netdev_mirror_offload_item, + node); + ovs_mutex_unlock(&netdev_mirror_offload.mutex); + + if (!offload_db_size && + netdev_mirror_db_resize(&offload_db, &offload_db_size)){ + return NULL; + } + + ind = offload_db_size; + for (i = 0; i < offload_db_size; i++) { + if (offload_db[i] && + !strncmp(offload_db[i]->info.name, offload->info.name, + strlen(offload->info.name) + 1)) { + ind = i; + break; + } + } + + if (!offload->info.add_mirror) { + /* remove mirror offload setup */ + if (ind == offload_db_size) { + VLOG_WARN("Mirror offload remove configuration, %s, " + "not found; clear mirror offload operation" + " aborted\n", offload->info.name); + continue; + } + } else { + /* add mirror offload */ + if (ind < offload_db_size) { + netdev_free_mirror_offload(offload); + VLOG_WARN("Attempt adding an existing mirror-offload " + "configuration; request aborted\n"); + continue; + } + + if (offload_used_count == offload_db_size && + netdev_mirror_db_resize(&offload_db, &offload_db_size)) { + return NULL; + } + } + + info = offload->info.add_mirror ? &offload->info : + &offload_db[ind]->info; + ret = netdev_mirror_offload_configue(info, offload->info.add_mirror); + + if (ret) { + VLOG_ERR("%s mirror configuration fails due to %s\n", + offload->info.add_mirror ? "Add" : "Remove", + ret > 0 ? "unsupport source traffic type" : + "device is not ready"); + netdev_free_mirror_offload(offload); + continue; + } else { + VLOG_INFO("Succeed %s mirror-offload configuration: %s", + offload->info.add_mirror ? 
"adding" : "removing", + offload->info.name); + } + + if (offload->info.add_mirror) { + for (i = 0; i < offload_db_size; i++) { + if (offload_db[i] == NULL) { + offload_db[i] = offload; + offload_used_count++; + break; + } + } + } else { + /* remove the prior "add" request */ + netdev_free_mirror_offload(offload_db[ind]); + offload_db[ind] = NULL; + + /* remove the current("remove") request */ + netdev_free_mirror_offload(offload); + offload_used_count--; + } + + /* free db when the used count drop to 0 */ + if (!offload_used_count) { + free(offload_db); + offload_db = NULL; + offload_db_size = 0; + } + } + + /* clean up memory */ + for (i = 0; i < offload_db_size; i++) { + if (offload_db[i]) { + netdev_free_mirror_offload(offload_db[i]); + } + } + if (offload_db) { + free(offload_db); + } + + return NULL; +} diff --git a/lib/netdev.h b/lib/netdev.h index b705a9e56..cce042fc7 100644 --- a/lib/netdev.h +++ b/lib/netdev.h @@ -201,6 +201,22 @@ int netdev_send(struct netdev *, int qid, struct dp_packet_batch *, bool concurrent_txq); void netdev_send_wait(struct netdev *, int qid); +/* Hardware assisted mirror offloading*/ +struct mirror_offload_info { + struct netdev **src; + struct netdev **dst; + int n_src_port; + int n_dst_port; + struct eth_addr *flow_src_mac; + struct eth_addr *flow_dst_mac; + uint16_t *output_src_tags; + uint16_t *output_dst_tags; + bool add_mirror; + char *mirror_tunnel_addr; + char *name; +}; +void netdev_mirror_offload_put(struct mirror_offload_info *); + /* native tunnel APIs */ /* Structure to pass parameters required to build a tunnel header. */ struct netdev_tnl_build_header_params { diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at index dccb11741..ff6e9e625 100644 --- a/tests/ovs-vsctl.at +++ b/tests/ovs-vsctl.at @@ -1364,7 +1364,9 @@ _uuid : <1> name : eth1 _uuid : <2> name : mymirror +output_dst_vlan : [] output_port : <1> +output_src_vlan : [] output_vlan : [] select_all : false select_dst_port : [<0>] diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 5ed7e8234..7b7603513 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -38,6 +38,7 @@ #include "mac-learning.h" #include "mcast-snooping.h" #include "netdev.h" +#include "netdev-provider.h" #include "netdev-offload.h" #include "nx-match.h" #include "ofproto/bond.h" @@ -330,6 +331,9 @@ static void mirror_destroy(struct mirror *); static bool mirror_configure(struct mirror *); static void mirror_refresh_stats(struct mirror *); +static void mirror_offload_destroy(struct mirror *); +static bool mirror_offload_configure(struct mirror *); + static void iface_configure_lacp(struct iface *, struct lacp_member_settings *); static bool iface_create(struct bridge *, const struct ovsrec_interface *, @@ -423,6 +427,35 @@ if_notifier_changed(struct if_notifier *notifier OVS_UNUSED) seq_wait(ifaces_changed, last_ifaces_changed); return changed; } + +static struct port * +port_lookup_all(const char *port_name) +{ + struct bridge *br; + struct port *port = NULL; + int found = 0; + + HMAP_FOR_EACH (br, node, &all_bridges) { + struct port *temp_port = NULL; + temp_port = port_lookup(br, port_name); + if (temp_port) { + if (!port) { + port = temp_port; + } + found++; + } + } + + if (found) { + if (found > 1) { + VLOG_INFO("More than one bridge owns port with name:%s\n", + port_name); + } + return port; + } + return NULL; +} + /* Public functions. 
*/ @@ -5055,14 +5088,228 @@ mirror_create(struct bridge *br, const struct ovsrec_mirror *cfg) return m; } +static struct netdev *get_netdev_from_port(struct mirror *m, + struct port **port, const char *name) +{ + struct port *temp_port; + struct iface *iface; + + *port = NULL; + temp_port = port_lookup(m->bridge, name); + if (temp_port) { + LIST_FOR_EACH (iface, port_elem, &temp_port->ifaces) { + if (iface) { + *port = temp_port; + return iface->netdev; + } + } + } + /* try different bridges */ + temp_port = port_lookup_all(name); + if (temp_port) { + LIST_FOR_EACH (iface, port_elem, &temp_port->ifaces) { + if (iface) { + *port = temp_port; + return iface->netdev; + } + } + } + return NULL; +} + +static void +release_mirror_offload_info(struct mirror_offload_info *info) +{ + if (info->src) { + free(info->src); + } + if (info->dst) { + free(info->dst); + } + if (info->flow_dst_mac) { + free(info->flow_dst_mac); + } + if (info->flow_src_mac) { + free(info->flow_src_mac); + } + if (info->output_src_tags) { + free(info->output_src_tags); + } + if (info->output_dst_tags) { + free(info->output_dst_tags); + } + if (info->name) { + free(info->name); + } + if (info->mirror_tunnel_addr) { + free(info->mirror_tunnel_addr); + } +} + +static int +set_mirror_offload_info(struct mirror *m, struct mirror_offload_info *info) +{ + const struct ovsrec_mirror *cfg = m->cfg; + struct port *port = NULL; + int i; + + if (m->name) { + info->name = xmalloc(strlen(m->name) + 1); + ovs_strzcpy(info->name, m->name, strlen(m->name)); + } + + if (cfg->mirror_tunnel_addr) { + info->mirror_tunnel_addr = xmalloc(strlen(cfg->mirror_tunnel_addr) + + 1); + ovs_strzcpy(info->mirror_tunnel_addr, cfg->mirror_tunnel_addr, + strlen(cfg->mirror_tunnel_addr)); + } else { + VLOG_ERR("mirror-offload configuration fails because" + " lack of tunnel device\n"); + return -1; + } + + /* source port */ + info->n_src_port = cfg->n_select_src_port; + if (info->n_src_port) { + info->src = xmalloc(sizeof(struct netdev *)*info->n_src_port); + info->flow_dst_mac = xmalloc(sizeof(struct eth_addr)* + info->n_src_port); + if (info->n_src_port != cfg->n_output_src_vlan) { + VLOG_ERR("src port count:%d ouput src vlan count:%lu", + info->n_src_port, (unsigned long) cfg->n_output_src_vlan); + return -1; + } + info->output_src_tags = xmalloc(sizeof(uint16_t)*info->n_src_port); + } + + if (info->n_src_port) { + /* find netdev instance for each port */ + for (i = 0; i < info->n_src_port; i++) { + info->src[i] = get_netdev_from_port(m, &port, + cfg->select_src_port[i]->name); + if (!info->src[i]) { + VLOG_ERR("src-port: %s is not a netdev device\n", + cfg->select_src_port[i]->name); + return -1; + } + } + memset(info->flow_dst_mac, 0, sizeof(struct eth_addr)* + info->n_src_port); + + /* + * for source port, flow is separated by + * different dst mac addr + */ + if (cfg->n_flow_dst_mac) { + int dst_count = (info->n_src_port > cfg->n_flow_dst_mac)? + cfg->n_flow_dst_mac:info->n_src_port; + for (i = 0; i < dst_count; i++) { + eth_addr_from_string(cfg->flow_dst_mac[i], + &info->flow_dst_mac[i]); + } + } + + if (cfg->n_output_src_vlan) { + int count = (cfg->n_output_src_vlan > info->n_src_port)? 
+ info->n_src_port:cfg->n_output_src_vlan; + for (i = 0; i < count; i++) { + info->output_src_tags[i] = cfg->output_src_vlan[i] & 0xFFF; + } + } + } + + /* dst ports */ + info->n_dst_port = cfg->n_select_dst_port; + if (info->n_dst_port) { + info->dst = xmalloc(sizeof(struct netdev *)*info->n_dst_port); + info->flow_src_mac = xmalloc(sizeof(struct eth_addr)* + info->n_dst_port); + if (info->n_dst_port != cfg->n_output_dst_vlan) { + VLOG_ERR("dst port count:%d ouput dst vlan count:%lu\n", + info->n_dst_port, (unsigned long) cfg->n_output_dst_vlan); + return -1; + } + info->output_dst_tags = xmalloc(sizeof(uint16_t)*info->n_dst_port); + } + + if (info->n_dst_port) { + for (i = 0; i < info->n_dst_port; i++) { + info->dst[i] = get_netdev_from_port(m, &port, + cfg->select_dst_port[i]->name); + if (!info->dst[i]) { + VLOG_ERR("dst-port: %s is not a netdev device\n", + cfg->select_dst_port[i]->name); + return -1; + } + } + memset(info->flow_src_mac, 0, sizeof(struct eth_addr)* + info->n_dst_port); + + /* + * for destination port, flow is separated by + * different src mac addr + */ + if (cfg->n_flow_src_mac) { + int src_count = (info->n_dst_port > cfg->n_flow_src_mac)? + cfg->n_flow_src_mac:info->n_dst_port; + for (i = 0; i < src_count; i++) { + eth_addr_from_string(cfg->flow_src_mac[i], + &info->flow_src_mac[i]); + } + } + + if (cfg->n_output_dst_vlan) { + int count = (cfg->n_output_dst_vlan > info->n_dst_port)? + info->n_dst_port:cfg->n_output_dst_vlan; + for (i = 0; i < count; i++) { + info->output_dst_tags[i] = cfg->output_dst_vlan[i] & 0xFFF; + } + } + } + + VLOG_INFO("sucess creating mirror-offload(%s): with %d src-port" + " streams %d dst-port streams to tunnel %s\n", + cfg->name, info->n_src_port, info->n_dst_port, + info->mirror_tunnel_addr?info->mirror_tunnel_addr:"none"); + return 0; +} + +static void +mirror_offload_destroy(struct mirror *m) +{ + struct mirror_offload_info info; + + memset(&info, 0, sizeof(struct mirror_offload_info)); + info.add_mirror = false; + if (m->name) { + info.name = xmalloc(strlen(m->name) + 1); + if (info.name) { + ovs_strzcpy(info.name, m->name, strlen(m->name)); + } + } + + netdev_mirror_offload_put(&info); + if (info.name) { + free(info.name); + } + if (info.mirror_tunnel_addr) { + free(info.mirror_tunnel_addr); + } +} + static void mirror_destroy(struct mirror *m) { if (m) { struct bridge *br = m->bridge; - if (br->ofproto) { - ofproto_mirror_unregister(br->ofproto, m); + if (m->cfg && m->cfg->mirror_offload) { + mirror_offload_destroy(m); + } else { + if (br->ofproto) { + ofproto_mirror_unregister(br->ofproto, m); + } } hmap_remove(&br->mirrors, &m->hmap_node); @@ -5094,12 +5341,32 @@ mirror_collect_ports(struct mirror *m, *n_out_portsp = n_out_ports; } +static bool +mirror_offload_configure(struct mirror *m) +{ + struct mirror_offload_info info; + + memset(&info, 0, sizeof(struct mirror_offload_info)); + info.add_mirror = true; + if (set_mirror_offload_info(m, &info)) { + release_mirror_offload_info(&info); + return false; + } + + netdev_mirror_offload_put(&info); + release_mirror_offload_info(&info); + return true; +} + static bool mirror_configure(struct mirror *m) { const struct ovsrec_mirror *cfg = m->cfg; struct ofproto_mirror_settings s; + if (cfg->mirror_offload) { + return mirror_offload_configure(m); + } /* Set name. 
*/ if (strcmp(cfg->name, m->name)) { free(m->name); diff --git a/vswitchd/vswitch.ovsschema b/vswitchd/vswitch.ovsschema index 0666c8c76..4a1a34a1f 100644 --- a/vswitchd/vswitch.ovsschema +++ b/vswitchd/vswitch.ovsschema @@ -1,6 +1,6 @@ {"name": "Open_vSwitch", - "version": "8.2.0", - "cksum": "1076640191 26427", + "version": "8.2.1", + "cksum": "4051567316 27206", "tables": { "Open_vSwitch": { "columns": { @@ -418,8 +418,18 @@ "columns": { "name": { "type": "string"}, + "mirror_tunnel_addr": { + "type": "string"}, "select_all": { "type": "boolean"}, + "mirror_offload": { + "type": "boolean"}, + "flow_src_mac": { + "type": {"key": {"type": "string"}, + "min": 0, "max": "unlimited"}}, + "flow_dst_mac": { + "type": {"key": {"type": "string"}, + "min": 0, "max": "unlimited"}}, "select_src_port": { "type": {"key": {"type": "uuid", "refTable": "Port", @@ -440,6 +450,16 @@ "refTable": "Port", "refType": "weak"}, "min": 0, "max": 1}}, + "output_src_vlan": { + "type": {"key": {"type": "integer", + "minInteger": 0, + "maxInteger": 4294967295}, + "min": 0, "max": 4096}}, + "output_dst_vlan": { + "type": {"key": {"type": "integer", + "minInteger": 0, + "maxInteger": 4294967295}, + "min": 0, "max": 4096}}, "output_vlan": { "type": {"key": {"type": "integer", "minInteger": 1, diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 4597a215d..fd2049a7f 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -4869,11 +4869,35 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \ selected VLANs.

+ + BDF string of the tunnel device on which mirrored traffic will be + transmitted. + If true, every packet arriving or departing on any port is selected for mirroring. + If true, hw-assisted port mirroring is configured instead of + default mirroring. + + + + The source MAC address(es) for per-flow mirroring. Each MAC + address is separated by ','. This parameter is paired with + select_dst_port. A '0' MAC address indicates the requested mirror + is per-port mirroring; otherwise it is per-flow mirroring. + + + + The destination MAC address(es) for per-flow mirroring. Each MAC + address is separated by ','. This parameter is paired with + select_src_port. A '0' MAC address indicates the requested mirror + is per-port mirroring; otherwise it is per-flow mirroring. + + Ports on which departing packets are selected for mirroring. @@ -4955,6 +4979,32 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \

+ +

Output VLAN for selected source port packets, if nonempty.

+

+ Please note: This is different than + This VLAN is used to add an additional + VLAN tag on the mirror traffic, regardless of whether it already carries a VLAN. + The receiving end can choose to filter out this additional VLAN. + This option is provided so the mirrored traffic can maintain its + original VLAN information, and this mirror can be used to filter + out unwanted traffic such as in .

+
+ + +

Output VLAN for selected destination port packets, if nonempty.

+

+ Please note: This is different than + This VLAN is used to add an additional + VLAN tag on the mirror traffic, regardless of whether it already carries a VLAN. + The receiving end can choose to filter out this additional VLAN. + This option is provided so the mirrored traffic can maintain its + original VLAN information, and this mirror can be used to filter + out unwanted traffic such as in .

+
+

Maximum per-packet number of bytes to mirror.

A mirrored packet with size larger than