From patchwork Mon Dec  2 08:41:36 2019
X-Patchwork-Submitter: Eli Britstein
X-Patchwork-Id: 1203001
X-Patchwork-Delegate: i.maximets@samsung.com
From: Eli Britstein
To: dev@openvswitch.org, Ilya Maximets , Ian Stokes
Date: Mon, 2 Dec 2019 08:41:36 +0000
Message-Id: <20191202084153.14412-3-elibr@mellanox.com>
X-Mailer: git-send-email 2.14.5
In-Reply-To: <20191202084153.14412-1-elibr@mellanox.com>
References: <20191202084153.14412-1-elibr@mellanox.com>
Cc: Oz Shlomo , Majd Dibbiny , Eli Britstein , Ameer Mahagneh
Subject: [ovs-dev] [PATCH V2 02/19] netdev-offload-dpdk: Refactor flow patterns and actions code

Refactor the flow patterns and actions code to a new source file for
better readability and towards adding more code to it.

Signed-off-by: Eli Britstein
Reviewed-by: Oz Shlomo
---
 lib/automake.mk                   |   4 +-
 lib/netdev-offload-dpdk-flow.c    | 479 ++++++++++++++++++++++++++++++++++++++
 lib/netdev-offload-dpdk-private.h |  69 ++++++
 lib/netdev-offload-dpdk.c         | 466 +-----------------------------------
 4 files changed, 562 insertions(+), 456 deletions(-)
 create mode 100644 lib/netdev-offload-dpdk-flow.c
 create mode 100644 lib/netdev-offload-dpdk-private.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b43d..3a813af85 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -142,6 +142,7 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/netdev-dummy.c \
 	lib/netdev-offload.c \
 	lib/netdev-offload.h \
+	lib/netdev-offload-dpdk-private.h \
 	lib/netdev-offload-provider.h \
 	lib/netdev-provider.h \
 	lib/netdev-vport.c \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk.c \
 	lib/netdev-dpdk.c \
-	lib/netdev-offload-dpdk.c
+	lib/netdev-offload-dpdk.c \
+	lib/netdev-offload-dpdk-flow.c
 else
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk-stub.c
diff --git a/lib/netdev-offload-dpdk-flow.c b/lib/netdev-offload-dpdk-flow.c
new file mode 100644
index 000000000..d1d5ce2c6
--- /dev/null
+++ b/lib/netdev-offload-dpdk-flow.c
@@ -0,0 +1,479 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */ +#include + +#include + +#include "dpif-netdev.h" +#include "netdev-offload-dpdk-private.h" +#include "openvswitch/match.h" +#include "openvswitch/vlog.h" +#include "packets.h" + +VLOG_DEFINE_THIS_MODULE(netdev_offload_dpdk_flow); + +void +netdev_dpdk_flow_patterns_free(struct flow_patterns *patterns) +{ + /* When calling this function 'patterns' must be valid */ + free(patterns->items); + patterns->items = NULL; + patterns->cnt = 0; +} + +void +netdev_dpdk_flow_actions_free(struct flow_actions *actions) +{ + /* When calling this function 'actions' must be valid */ + int i; + + for (i = 0; i < actions->cnt; i++) { + if (actions->actions[i].conf) { + free((void *)actions->actions[i].conf); + } + } + free(actions->actions); + actions->actions = NULL; + actions->cnt = 0; +} + +static void +dump_flow_pattern(struct rte_flow_item *item) +{ + struct ds s; + + if (!VLOG_IS_DBG_ENABLED() || item->type == RTE_FLOW_ITEM_TYPE_END) { + return; + } + + ds_init(&s); + + if (item->type == RTE_FLOW_ITEM_TYPE_ETH) { + const struct rte_flow_item_eth *eth_spec = item->spec; + const struct rte_flow_item_eth *eth_mask = item->mask; + + ds_put_cstr(&s, "rte flow eth pattern:\n"); + if (eth_spec) { + ds_put_format(&s, + " Spec: src="ETH_ADDR_FMT", dst="ETH_ADDR_FMT", " + "type=0x%04" PRIx16"\n", + ETH_ADDR_BYTES_ARGS(eth_spec->src.addr_bytes), + ETH_ADDR_BYTES_ARGS(eth_spec->dst.addr_bytes), + ntohs(eth_spec->type)); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + if (eth_mask) { + ds_put_format(&s, + " Mask: src="ETH_ADDR_FMT", dst="ETH_ADDR_FMT", " + "type=0x%04"PRIx16"\n", + ETH_ADDR_BYTES_ARGS(eth_mask->src.addr_bytes), + ETH_ADDR_BYTES_ARGS(eth_mask->dst.addr_bytes), + ntohs(eth_mask->type)); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + if (item->type == RTE_FLOW_ITEM_TYPE_VLAN) { + const struct rte_flow_item_vlan *vlan_spec = item->spec; + const struct rte_flow_item_vlan *vlan_mask = item->mask; + + ds_put_cstr(&s, "rte flow vlan pattern:\n"); + if (vlan_spec) { + ds_put_format(&s, + " Spec: inner_type=0x%"PRIx16", tci=0x%"PRIx16"\n", + ntohs(vlan_spec->inner_type), ntohs(vlan_spec->tci)); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + + if (vlan_mask) { + ds_put_format(&s, + " Mask: inner_type=0x%"PRIx16", tci=0x%"PRIx16"\n", + ntohs(vlan_mask->inner_type), ntohs(vlan_mask->tci)); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + if (item->type == RTE_FLOW_ITEM_TYPE_IPV4) { + const struct rte_flow_item_ipv4 *ipv4_spec = item->spec; + const struct rte_flow_item_ipv4 *ipv4_mask = item->mask; + + ds_put_cstr(&s, "rte flow ipv4 pattern:\n"); + if (ipv4_spec) { + ds_put_format(&s, + " Spec: tos=0x%"PRIx8", ttl=%"PRIx8 + ", proto=0x%"PRIx8 + ", src="IP_FMT", dst="IP_FMT"\n", + ipv4_spec->hdr.type_of_service, + ipv4_spec->hdr.time_to_live, + ipv4_spec->hdr.next_proto_id, + IP_ARGS(ipv4_spec->hdr.src_addr), + IP_ARGS(ipv4_spec->hdr.dst_addr)); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + if (ipv4_mask) { + ds_put_format(&s, + " Mask: tos=0x%"PRIx8", ttl=%"PRIx8 + ", proto=0x%"PRIx8 + ", src="IP_FMT", dst="IP_FMT"\n", + ipv4_mask->hdr.type_of_service, + ipv4_mask->hdr.time_to_live, + ipv4_mask->hdr.next_proto_id, + IP_ARGS(ipv4_mask->hdr.src_addr), + IP_ARGS(ipv4_mask->hdr.dst_addr)); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + if (item->type == RTE_FLOW_ITEM_TYPE_UDP) { + const struct rte_flow_item_udp *udp_spec = item->spec; + const struct rte_flow_item_udp *udp_mask = item->mask; + + ds_put_cstr(&s, "rte flow udp pattern:\n"); + if (udp_spec) { 
+ ds_put_format(&s, + " Spec: src_port=%"PRIu16", dst_port=%"PRIu16"\n", + ntohs(udp_spec->hdr.src_port), + ntohs(udp_spec->hdr.dst_port)); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + if (udp_mask) { + ds_put_format(&s, + " Mask: src_port=0x%"PRIx16 + ", dst_port=0x%"PRIx16"\n", + ntohs(udp_mask->hdr.src_port), + ntohs(udp_mask->hdr.dst_port)); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + if (item->type == RTE_FLOW_ITEM_TYPE_SCTP) { + const struct rte_flow_item_sctp *sctp_spec = item->spec; + const struct rte_flow_item_sctp *sctp_mask = item->mask; + + ds_put_cstr(&s, "rte flow sctp pattern:\n"); + if (sctp_spec) { + ds_put_format(&s, + " Spec: src_port=%"PRIu16", dst_port=%"PRIu16"\n", + ntohs(sctp_spec->hdr.src_port), + ntohs(sctp_spec->hdr.dst_port)); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + if (sctp_mask) { + ds_put_format(&s, + " Mask: src_port=0x%"PRIx16 + ", dst_port=0x%"PRIx16"\n", + ntohs(sctp_mask->hdr.src_port), + ntohs(sctp_mask->hdr.dst_port)); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + if (item->type == RTE_FLOW_ITEM_TYPE_ICMP) { + const struct rte_flow_item_icmp *icmp_spec = item->spec; + const struct rte_flow_item_icmp *icmp_mask = item->mask; + + ds_put_cstr(&s, "rte flow icmp pattern:\n"); + if (icmp_spec) { + ds_put_format(&s, + " Spec: icmp_type=%"PRIu8", icmp_code=%"PRIu8"\n", + icmp_spec->hdr.icmp_type, + icmp_spec->hdr.icmp_code); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + if (icmp_mask) { + ds_put_format(&s, + " Mask: icmp_type=0x%"PRIx8 + ", icmp_code=0x%"PRIx8"\n", + icmp_spec->hdr.icmp_type, + icmp_spec->hdr.icmp_code); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + if (item->type == RTE_FLOW_ITEM_TYPE_TCP) { + const struct rte_flow_item_tcp *tcp_spec = item->spec; + const struct rte_flow_item_tcp *tcp_mask = item->mask; + + ds_put_cstr(&s, "rte flow tcp pattern:\n"); + if (tcp_spec) { + ds_put_format(&s, + " Spec: src_port=%"PRIu16", dst_port=%"PRIu16 + ", data_off=0x%"PRIx8", tcp_flags=0x%"PRIx8"\n", + ntohs(tcp_spec->hdr.src_port), + ntohs(tcp_spec->hdr.dst_port), + tcp_spec->hdr.data_off, + tcp_spec->hdr.tcp_flags); + } else { + ds_put_cstr(&s, " Spec = null\n"); + } + if (tcp_mask) { + ds_put_format(&s, + " Mask: src_port=%"PRIx16", dst_port=%"PRIx16 + ", data_off=0x%"PRIx8", tcp_flags=0x%"PRIx8"\n", + ntohs(tcp_mask->hdr.src_port), + ntohs(tcp_mask->hdr.dst_port), + tcp_mask->hdr.data_off, + tcp_mask->hdr.tcp_flags); + } else { + ds_put_cstr(&s, " Mask = null\n"); + } + } + + VLOG_DBG("%s", ds_cstr(&s)); + ds_destroy(&s); +} + +static void +add_flow_pattern(struct flow_patterns *patterns, enum rte_flow_item_type type, + const void *spec, const void *mask) +{ + int cnt = patterns->cnt; + + if (cnt == 0) { + patterns->current_max = 8; + patterns->items = xcalloc(patterns->current_max, + sizeof *patterns->items); + } else if (cnt == patterns->current_max) { + patterns->current_max *= 2; + patterns->items = xrealloc(patterns->items, patterns->current_max * + sizeof *patterns->items); + } + + patterns->items[cnt].type = type; + patterns->items[cnt].spec = spec; + patterns->items[cnt].mask = mask; + patterns->items[cnt].last = NULL; + dump_flow_pattern(&patterns->items[cnt]); + patterns->cnt++; +} + +static void +add_flow_action(struct flow_actions *actions, enum rte_flow_action_type type, + const void *conf) +{ + int cnt = actions->cnt; + + if (cnt == 0) { + actions->current_max = 8; + actions->actions = xcalloc(actions->current_max, + sizeof *actions->actions); + } else if (cnt == 
actions->current_max) { + actions->current_max *= 2; + actions->actions = xrealloc(actions->actions, actions->current_max * + sizeof *actions->actions); + } + + actions->actions[cnt].type = type; + actions->actions[cnt].conf = conf; + actions->cnt++; +} + +void +netdev_dpdk_flow_actions_add_mark_rss(struct flow_actions *actions, + struct netdev *netdev, + uint32_t mark_id) +{ + struct rte_flow_action_mark *mark; + struct action_rss_data { + struct rte_flow_action_rss conf; + uint16_t queue[0]; + } *rss_data; + int i; + + mark = xmalloc(sizeof *mark); + rss_data = xmalloc(sizeof *rss_data + + netdev_n_rxq(netdev) * sizeof rss_data->queue[0]); + *rss_data = (struct action_rss_data) { + .conf = (struct rte_flow_action_rss) { + .func = RTE_ETH_HASH_FUNCTION_DEFAULT, + .level = 0, + .types = 0, + .queue_num = netdev_n_rxq(netdev), + .queue = rss_data->queue, + .key_len = 0, + .key = NULL + }, + }; + + /* Override queue array with default. */ + for (i = 0; i < netdev_n_rxq(netdev); i++) { + rss_data->queue[i] = i; + } + + mark->id = mark_id; + add_flow_action(actions, RTE_FLOW_ACTION_TYPE_MARK, mark); + add_flow_action(actions, RTE_FLOW_ACTION_TYPE_RSS, &rss_data->conf); + add_flow_action(actions, RTE_FLOW_ACTION_TYPE_END, NULL); +} + +int +netdev_dpdk_flow_patterns_add(struct flow_patterns *patterns, + struct flow_pattern_items *spec, + struct flow_pattern_items *mask, + const struct match *match) +{ + uint8_t proto = 0; + + /* Eth */ + if (!eth_addr_is_zero(match->wc.masks.dl_src) || + !eth_addr_is_zero(match->wc.masks.dl_dst)) { + memcpy(&spec->eth.dst, &match->flow.dl_dst, sizeof spec->eth.dst); + memcpy(&spec->eth.src, &match->flow.dl_src, sizeof spec->eth.src); + spec->eth.type = match->flow.dl_type; + + memcpy(&mask->eth.dst, &match->wc.masks.dl_dst, sizeof mask->eth.dst); + memcpy(&mask->eth.src, &match->wc.masks.dl_src, sizeof mask->eth.src); + mask->eth.type = match->wc.masks.dl_type; + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, + &spec->eth, &mask->eth); + } else { + /* + * If user specifies a flow (like UDP flow) without L2 patterns, + * OVS will at least set the dl_type. Normally, it's enough to + * create an eth pattern just with it. Unluckily, some Intel's + * NIC (such as XL710) doesn't support that. Below is a workaround, + * which simply matches any L2 pkts. + */ + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL); + } + + /* VLAN */ + if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) { + spec->vlan.tci = match->flow.vlans[0].tci & ~htons(VLAN_CFI); + mask->vlan.tci = match->wc.masks.vlans[0].tci & ~htons(VLAN_CFI); + + /* Match any protocols. */ + mask->vlan.inner_type = 0; + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_VLAN, + &spec->vlan, &mask->vlan); + } + + /* IP v4 */ + if (match->flow.dl_type == htons(ETH_TYPE_IP)) { + spec->ipv4.hdr.type_of_service = match->flow.nw_tos; + spec->ipv4.hdr.time_to_live = match->flow.nw_ttl; + spec->ipv4.hdr.next_proto_id = match->flow.nw_proto; + spec->ipv4.hdr.src_addr = match->flow.nw_src; + spec->ipv4.hdr.dst_addr = match->flow.nw_dst; + + mask->ipv4.hdr.type_of_service = match->wc.masks.nw_tos; + mask->ipv4.hdr.time_to_live = match->wc.masks.nw_ttl; + mask->ipv4.hdr.next_proto_id = match->wc.masks.nw_proto; + mask->ipv4.hdr.src_addr = match->wc.masks.nw_src; + mask->ipv4.hdr.dst_addr = match->wc.masks.nw_dst; + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_IPV4, + &spec->ipv4, &mask->ipv4); + + /* Save proto for L4 protocol setup. 
*/ + proto = spec->ipv4.hdr.next_proto_id & + mask->ipv4.hdr.next_proto_id; + } + + if (proto != IPPROTO_ICMP && proto != IPPROTO_UDP && + proto != IPPROTO_SCTP && proto != IPPROTO_TCP && + (match->wc.masks.tp_src || + match->wc.masks.tp_dst || + match->wc.masks.tcp_flags)) { + VLOG_DBG("L4 Protocol (%u) not supported", proto); + return -1; + } + + if ((match->wc.masks.tp_src && match->wc.masks.tp_src != OVS_BE16_MAX) || + (match->wc.masks.tp_dst && match->wc.masks.tp_dst != OVS_BE16_MAX)) { + return -1; + } + + switch (proto) { + case IPPROTO_TCP: + spec->tcp.hdr.src_port = match->flow.tp_src; + spec->tcp.hdr.dst_port = match->flow.tp_dst; + spec->tcp.hdr.data_off = ntohs(match->flow.tcp_flags) >> 8; + spec->tcp.hdr.tcp_flags = ntohs(match->flow.tcp_flags) & 0xff; + + mask->tcp.hdr.src_port = match->wc.masks.tp_src; + mask->tcp.hdr.dst_port = match->wc.masks.tp_dst; + mask->tcp.hdr.data_off = ntohs(match->wc.masks.tcp_flags) >> 8; + mask->tcp.hdr.tcp_flags = ntohs(match->wc.masks.tcp_flags) & 0xff; + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_TCP, + &spec->tcp, &mask->tcp); + + /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */ + mask->ipv4.hdr.next_proto_id = 0; + break; + + case IPPROTO_UDP: + spec->udp.hdr.src_port = match->flow.tp_src; + spec->udp.hdr.dst_port = match->flow.tp_dst; + + mask->udp.hdr.src_port = match->wc.masks.tp_src; + mask->udp.hdr.dst_port = match->wc.masks.tp_dst; + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_UDP, + &spec->udp, &mask->udp); + + /* proto == UDP and ITEM_TYPE_UDP, thus no need for proto match. */ + mask->ipv4.hdr.next_proto_id = 0; + break; + + case IPPROTO_SCTP: + spec->sctp.hdr.src_port = match->flow.tp_src; + spec->sctp.hdr.dst_port = match->flow.tp_dst; + + mask->sctp.hdr.src_port = match->wc.masks.tp_src; + mask->sctp.hdr.dst_port = match->wc.masks.tp_dst; + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_SCTP, + &spec->sctp, &mask->sctp); + + /* proto == SCTP and ITEM_TYPE_SCTP, thus no need for proto match. */ + mask->ipv4.hdr.next_proto_id = 0; + break; + + case IPPROTO_ICMP: + spec->icmp.hdr.icmp_type = (uint8_t) ntohs(match->flow.tp_src); + spec->icmp.hdr.icmp_code = (uint8_t) ntohs(match->flow.tp_dst); + + mask->icmp.hdr.icmp_type = (uint8_t) ntohs(match->wc.masks.tp_src); + mask->icmp.hdr.icmp_code = (uint8_t) ntohs(match->wc.masks.tp_dst); + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ICMP, + &spec->icmp, &mask->icmp); + + /* proto == ICMP and ITEM_TYPE_ICMP, thus no need for proto match. */ + mask->ipv4.hdr.next_proto_id = 0; + break; + } + + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_END, NULL, NULL); + return 0; +} + diff --git a/lib/netdev-offload-dpdk-private.h b/lib/netdev-offload-dpdk-private.h new file mode 100644 index 000000000..397384cb0 --- /dev/null +++ b/lib/netdev-offload-dpdk-private.h @@ -0,0 +1,69 @@ +/* + * Copyright (c) 2019 Mellanox Technologies, Ltd. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#ifndef NETDEV_OFFLOAD_DPDK_PRIVATE_H +#define NETDEV_OFFLOAD_DPDK_PRIVATE_H + +#include "openvswitch/match.h" + +#include + +struct netdev; + +/* + * To avoid individual xrealloc calls for each new element, a 'curent_max' + * is used to keep track of current allocated number of elements. Starts + * by 8 and doubles on each xrealloc call. + */ +struct flow_patterns { + struct rte_flow_item *items; + int cnt; + int current_max; +}; + +struct flow_actions { + struct rte_flow_action *actions; + int cnt; + int current_max; +}; + +struct flow_pattern_items { + struct rte_flow_item_eth eth; + struct rte_flow_item_vlan vlan; + struct rte_flow_item_ipv4 ipv4; + union { + struct rte_flow_item_tcp tcp; + struct rte_flow_item_udp udp; + struct rte_flow_item_sctp sctp; + struct rte_flow_item_icmp icmp; + }; +}; + +void +netdev_dpdk_flow_patterns_free(struct flow_patterns *patterns); +int +netdev_dpdk_flow_patterns_add(struct flow_patterns *patterns, + struct flow_pattern_items *spec, + struct flow_pattern_items *mask, + const struct match *match); +void +netdev_dpdk_flow_actions_free(struct flow_actions *actions); +void +netdev_dpdk_flow_actions_add_mark_rss(struct flow_actions *actions, + struct netdev *netdev, + uint32_t mark_id); + +#endif /* NETDEV_OFFLOAD_DPDK_PRIVATE_H */ diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index 96794dc4d..d6bbb7166 100644 --- a/lib/netdev-offload-dpdk.c +++ b/lib/netdev-offload-dpdk.c @@ -26,6 +26,7 @@ #include "openvswitch/vlog.h" #include "packets.h" #include "uuid.h" +#include "netdev-offload-dpdk-private.h" VLOG_DEFINE_THIS_MODULE(netdev_offload_dpdk); @@ -114,302 +115,6 @@ ufid_to_rte_flow_disassociate(const ovs_u128 *ufid) UUID_ARGS((struct uuid *) ufid)); } -/* - * To avoid individual xrealloc calls for each new element, a 'curent_max' - * is used to keep track of current allocated number of elements. Starts - * by 8 and doubles on each xrealloc call. 
- */ -struct flow_patterns { - struct rte_flow_item *items; - int cnt; - int current_max; -}; - -struct flow_actions { - struct rte_flow_action *actions; - int cnt; - int current_max; -}; - -static void -dump_flow_pattern(struct rte_flow_item *item) -{ - struct ds s; - - if (!VLOG_IS_DBG_ENABLED() || item->type == RTE_FLOW_ITEM_TYPE_END) { - return; - } - - ds_init(&s); - - if (item->type == RTE_FLOW_ITEM_TYPE_ETH) { - const struct rte_flow_item_eth *eth_spec = item->spec; - const struct rte_flow_item_eth *eth_mask = item->mask; - - ds_put_cstr(&s, "rte flow eth pattern:\n"); - if (eth_spec) { - ds_put_format(&s, - " Spec: src="ETH_ADDR_FMT", dst="ETH_ADDR_FMT", " - "type=0x%04" PRIx16"\n", - ETH_ADDR_BYTES_ARGS(eth_spec->src.addr_bytes), - ETH_ADDR_BYTES_ARGS(eth_spec->dst.addr_bytes), - ntohs(eth_spec->type)); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - if (eth_mask) { - ds_put_format(&s, - " Mask: src="ETH_ADDR_FMT", dst="ETH_ADDR_FMT", " - "type=0x%04"PRIx16"\n", - ETH_ADDR_BYTES_ARGS(eth_mask->src.addr_bytes), - ETH_ADDR_BYTES_ARGS(eth_mask->dst.addr_bytes), - ntohs(eth_mask->type)); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - if (item->type == RTE_FLOW_ITEM_TYPE_VLAN) { - const struct rte_flow_item_vlan *vlan_spec = item->spec; - const struct rte_flow_item_vlan *vlan_mask = item->mask; - - ds_put_cstr(&s, "rte flow vlan pattern:\n"); - if (vlan_spec) { - ds_put_format(&s, - " Spec: inner_type=0x%"PRIx16", tci=0x%"PRIx16"\n", - ntohs(vlan_spec->inner_type), ntohs(vlan_spec->tci)); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - - if (vlan_mask) { - ds_put_format(&s, - " Mask: inner_type=0x%"PRIx16", tci=0x%"PRIx16"\n", - ntohs(vlan_mask->inner_type), ntohs(vlan_mask->tci)); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - if (item->type == RTE_FLOW_ITEM_TYPE_IPV4) { - const struct rte_flow_item_ipv4 *ipv4_spec = item->spec; - const struct rte_flow_item_ipv4 *ipv4_mask = item->mask; - - ds_put_cstr(&s, "rte flow ipv4 pattern:\n"); - if (ipv4_spec) { - ds_put_format(&s, - " Spec: tos=0x%"PRIx8", ttl=%"PRIx8 - ", proto=0x%"PRIx8 - ", src="IP_FMT", dst="IP_FMT"\n", - ipv4_spec->hdr.type_of_service, - ipv4_spec->hdr.time_to_live, - ipv4_spec->hdr.next_proto_id, - IP_ARGS(ipv4_spec->hdr.src_addr), - IP_ARGS(ipv4_spec->hdr.dst_addr)); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - if (ipv4_mask) { - ds_put_format(&s, - " Mask: tos=0x%"PRIx8", ttl=%"PRIx8 - ", proto=0x%"PRIx8 - ", src="IP_FMT", dst="IP_FMT"\n", - ipv4_mask->hdr.type_of_service, - ipv4_mask->hdr.time_to_live, - ipv4_mask->hdr.next_proto_id, - IP_ARGS(ipv4_mask->hdr.src_addr), - IP_ARGS(ipv4_mask->hdr.dst_addr)); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - if (item->type == RTE_FLOW_ITEM_TYPE_UDP) { - const struct rte_flow_item_udp *udp_spec = item->spec; - const struct rte_flow_item_udp *udp_mask = item->mask; - - ds_put_cstr(&s, "rte flow udp pattern:\n"); - if (udp_spec) { - ds_put_format(&s, - " Spec: src_port=%"PRIu16", dst_port=%"PRIu16"\n", - ntohs(udp_spec->hdr.src_port), - ntohs(udp_spec->hdr.dst_port)); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - if (udp_mask) { - ds_put_format(&s, - " Mask: src_port=0x%"PRIx16 - ", dst_port=0x%"PRIx16"\n", - ntohs(udp_mask->hdr.src_port), - ntohs(udp_mask->hdr.dst_port)); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - if (item->type == RTE_FLOW_ITEM_TYPE_SCTP) { - const struct rte_flow_item_sctp *sctp_spec = item->spec; - const struct rte_flow_item_sctp *sctp_mask = item->mask; - - 
ds_put_cstr(&s, "rte flow sctp pattern:\n"); - if (sctp_spec) { - ds_put_format(&s, - " Spec: src_port=%"PRIu16", dst_port=%"PRIu16"\n", - ntohs(sctp_spec->hdr.src_port), - ntohs(sctp_spec->hdr.dst_port)); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - if (sctp_mask) { - ds_put_format(&s, - " Mask: src_port=0x%"PRIx16 - ", dst_port=0x%"PRIx16"\n", - ntohs(sctp_mask->hdr.src_port), - ntohs(sctp_mask->hdr.dst_port)); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - if (item->type == RTE_FLOW_ITEM_TYPE_ICMP) { - const struct rte_flow_item_icmp *icmp_spec = item->spec; - const struct rte_flow_item_icmp *icmp_mask = item->mask; - - ds_put_cstr(&s, "rte flow icmp pattern:\n"); - if (icmp_spec) { - ds_put_format(&s, - " Spec: icmp_type=%"PRIu8", icmp_code=%"PRIu8"\n", - icmp_spec->hdr.icmp_type, - icmp_spec->hdr.icmp_code); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - if (icmp_mask) { - ds_put_format(&s, - " Mask: icmp_type=0x%"PRIx8 - ", icmp_code=0x%"PRIx8"\n", - icmp_spec->hdr.icmp_type, - icmp_spec->hdr.icmp_code); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - if (item->type == RTE_FLOW_ITEM_TYPE_TCP) { - const struct rte_flow_item_tcp *tcp_spec = item->spec; - const struct rte_flow_item_tcp *tcp_mask = item->mask; - - ds_put_cstr(&s, "rte flow tcp pattern:\n"); - if (tcp_spec) { - ds_put_format(&s, - " Spec: src_port=%"PRIu16", dst_port=%"PRIu16 - ", data_off=0x%"PRIx8", tcp_flags=0x%"PRIx8"\n", - ntohs(tcp_spec->hdr.src_port), - ntohs(tcp_spec->hdr.dst_port), - tcp_spec->hdr.data_off, - tcp_spec->hdr.tcp_flags); - } else { - ds_put_cstr(&s, " Spec = null\n"); - } - if (tcp_mask) { - ds_put_format(&s, - " Mask: src_port=%"PRIx16", dst_port=%"PRIx16 - ", data_off=0x%"PRIx8", tcp_flags=0x%"PRIx8"\n", - ntohs(tcp_mask->hdr.src_port), - ntohs(tcp_mask->hdr.dst_port), - tcp_mask->hdr.data_off, - tcp_mask->hdr.tcp_flags); - } else { - ds_put_cstr(&s, " Mask = null\n"); - } - } - - VLOG_DBG("%s", ds_cstr(&s)); - ds_destroy(&s); -} - -static void -add_flow_pattern(struct flow_patterns *patterns, enum rte_flow_item_type type, - const void *spec, const void *mask) -{ - int cnt = patterns->cnt; - - if (cnt == 0) { - patterns->current_max = 8; - patterns->items = xcalloc(patterns->current_max, - sizeof *patterns->items); - } else if (cnt == patterns->current_max) { - patterns->current_max *= 2; - patterns->items = xrealloc(patterns->items, patterns->current_max * - sizeof *patterns->items); - } - - patterns->items[cnt].type = type; - patterns->items[cnt].spec = spec; - patterns->items[cnt].mask = mask; - patterns->items[cnt].last = NULL; - dump_flow_pattern(&patterns->items[cnt]); - patterns->cnt++; -} - -static void -add_flow_action(struct flow_actions *actions, enum rte_flow_action_type type, - const void *conf) -{ - int cnt = actions->cnt; - - if (cnt == 0) { - actions->current_max = 8; - actions->actions = xcalloc(actions->current_max, - sizeof *actions->actions); - } else if (cnt == actions->current_max) { - actions->current_max *= 2; - actions->actions = xrealloc(actions->actions, actions->current_max * - sizeof *actions->actions); - } - - actions->actions[cnt].type = type; - actions->actions[cnt].conf = conf; - actions->cnt++; -} - -struct action_rss_data { - struct rte_flow_action_rss conf; - uint16_t queue[0]; -}; - -static struct action_rss_data * -add_flow_rss_action(struct flow_actions *actions, - struct netdev *netdev) -{ - int i; - struct action_rss_data *rss_data; - - rss_data = xmalloc(sizeof *rss_data + - netdev_n_rxq(netdev) * sizeof 
rss_data->queue[0]); - *rss_data = (struct action_rss_data) { - .conf = (struct rte_flow_action_rss) { - .func = RTE_ETH_HASH_FUNCTION_DEFAULT, - .level = 0, - .types = 0, - .queue_num = netdev_n_rxq(netdev), - .queue = rss_data->queue, - .key_len = 0, - .key = NULL - }, - }; - - /* Override queue array with default. */ - for (i = 0; i < netdev_n_rxq(netdev); i++) { - rss_data->queue[i] = i; - } - - add_flow_action(actions, RTE_FLOW_ACTION_TYPE_RSS, &rss_data->conf); - - return rss_data; -} - static int netdev_offload_dpdk_add_flow(struct netdev *netdev, const struct match *match, @@ -426,177 +131,28 @@ netdev_offload_dpdk_add_flow(struct netdev *netdev, }; struct flow_patterns patterns = { .items = NULL, .cnt = 0 }; struct flow_actions actions = { .actions = NULL, .cnt = 0 }; - struct rte_flow *flow; + struct flow_pattern_items spec, mask; struct rte_flow_error error; - uint8_t proto = 0; + struct rte_flow *flow; int ret = 0; - struct flow_items { - struct rte_flow_item_eth eth; - struct rte_flow_item_vlan vlan; - struct rte_flow_item_ipv4 ipv4; - union { - struct rte_flow_item_tcp tcp; - struct rte_flow_item_udp udp; - struct rte_flow_item_sctp sctp; - struct rte_flow_item_icmp icmp; - }; - } spec, mask; memset(&spec, 0, sizeof spec); memset(&mask, 0, sizeof mask); - /* Eth */ - if (!eth_addr_is_zero(match->wc.masks.dl_src) || - !eth_addr_is_zero(match->wc.masks.dl_dst)) { - memcpy(&spec.eth.dst, &match->flow.dl_dst, sizeof spec.eth.dst); - memcpy(&spec.eth.src, &match->flow.dl_src, sizeof spec.eth.src); - spec.eth.type = match->flow.dl_type; - - memcpy(&mask.eth.dst, &match->wc.masks.dl_dst, sizeof mask.eth.dst); - memcpy(&mask.eth.src, &match->wc.masks.dl_src, sizeof mask.eth.src); - mask.eth.type = match->wc.masks.dl_type; - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_ETH, - &spec.eth, &mask.eth); - } else { - /* - * If user specifies a flow (like UDP flow) without L2 patterns, - * OVS will at least set the dl_type. Normally, it's enough to - * create an eth pattern just with it. Unluckily, some Intel's - * NIC (such as XL710) doesn't support that. Below is a workaround, - * which simply matches any L2 pkts. - */ - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL); - } - - /* VLAN */ - if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) { - spec.vlan.tci = match->flow.vlans[0].tci & ~htons(VLAN_CFI); - mask.vlan.tci = match->wc.masks.vlans[0].tci & ~htons(VLAN_CFI); - - /* Match any protocols. */ - mask.vlan.inner_type = 0; - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_VLAN, - &spec.vlan, &mask.vlan); - } - - /* IP v4 */ - if (match->flow.dl_type == htons(ETH_TYPE_IP)) { - spec.ipv4.hdr.type_of_service = match->flow.nw_tos; - spec.ipv4.hdr.time_to_live = match->flow.nw_ttl; - spec.ipv4.hdr.next_proto_id = match->flow.nw_proto; - spec.ipv4.hdr.src_addr = match->flow.nw_src; - spec.ipv4.hdr.dst_addr = match->flow.nw_dst; - - mask.ipv4.hdr.type_of_service = match->wc.masks.nw_tos; - mask.ipv4.hdr.time_to_live = match->wc.masks.nw_ttl; - mask.ipv4.hdr.next_proto_id = match->wc.masks.nw_proto; - mask.ipv4.hdr.src_addr = match->wc.masks.nw_src; - mask.ipv4.hdr.dst_addr = match->wc.masks.nw_dst; - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_IPV4, - &spec.ipv4, &mask.ipv4); - - /* Save proto for L4 protocol setup. 
*/ - proto = spec.ipv4.hdr.next_proto_id & - mask.ipv4.hdr.next_proto_id; - } - - if (proto != IPPROTO_ICMP && proto != IPPROTO_UDP && - proto != IPPROTO_SCTP && proto != IPPROTO_TCP && - (match->wc.masks.tp_src || - match->wc.masks.tp_dst || - match->wc.masks.tcp_flags)) { - VLOG_DBG("L4 Protocol (%u) not supported", proto); - ret = -1; + ret = netdev_dpdk_flow_patterns_add(&patterns, &spec, &mask, match); + if (ret) { + VLOG_WARN("Adding rte match patterns for flow ufid"UUID_FMT" failed", + UUID_ARGS((struct uuid *)ufid)); goto out; } - if ((match->wc.masks.tp_src && match->wc.masks.tp_src != OVS_BE16_MAX) || - (match->wc.masks.tp_dst && match->wc.masks.tp_dst != OVS_BE16_MAX)) { - ret = -1; - goto out; - } - - switch (proto) { - case IPPROTO_TCP: - spec.tcp.hdr.src_port = match->flow.tp_src; - spec.tcp.hdr.dst_port = match->flow.tp_dst; - spec.tcp.hdr.data_off = ntohs(match->flow.tcp_flags) >> 8; - spec.tcp.hdr.tcp_flags = ntohs(match->flow.tcp_flags) & 0xff; - - mask.tcp.hdr.src_port = match->wc.masks.tp_src; - mask.tcp.hdr.dst_port = match->wc.masks.tp_dst; - mask.tcp.hdr.data_off = ntohs(match->wc.masks.tcp_flags) >> 8; - mask.tcp.hdr.tcp_flags = ntohs(match->wc.masks.tcp_flags) & 0xff; - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_TCP, - &spec.tcp, &mask.tcp); - - /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */ - mask.ipv4.hdr.next_proto_id = 0; - break; - - case IPPROTO_UDP: - spec.udp.hdr.src_port = match->flow.tp_src; - spec.udp.hdr.dst_port = match->flow.tp_dst; - - mask.udp.hdr.src_port = match->wc.masks.tp_src; - mask.udp.hdr.dst_port = match->wc.masks.tp_dst; - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_UDP, - &spec.udp, &mask.udp); - - /* proto == UDP and ITEM_TYPE_UDP, thus no need for proto match. */ - mask.ipv4.hdr.next_proto_id = 0; - break; - - case IPPROTO_SCTP: - spec.sctp.hdr.src_port = match->flow.tp_src; - spec.sctp.hdr.dst_port = match->flow.tp_dst; - - mask.sctp.hdr.src_port = match->wc.masks.tp_src; - mask.sctp.hdr.dst_port = match->wc.masks.tp_dst; - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_SCTP, - &spec.sctp, &mask.sctp); - - /* proto == SCTP and ITEM_TYPE_SCTP, thus no need for proto match. */ - mask.ipv4.hdr.next_proto_id = 0; - break; - - case IPPROTO_ICMP: - spec.icmp.hdr.icmp_type = (uint8_t) ntohs(match->flow.tp_src); - spec.icmp.hdr.icmp_code = (uint8_t) ntohs(match->flow.tp_dst); - - mask.icmp.hdr.icmp_type = (uint8_t) ntohs(match->wc.masks.tp_src); - mask.icmp.hdr.icmp_code = (uint8_t) ntohs(match->wc.masks.tp_dst); - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_ICMP, - &spec.icmp, &mask.icmp); - - /* proto == ICMP and ITEM_TYPE_ICMP, thus no need for proto match. 
*/ - mask.ipv4.hdr.next_proto_id = 0; - break; - } - - add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_END, NULL, NULL); - - struct rte_flow_action_mark mark; - struct action_rss_data *rss; - - mark.id = info->flow_mark; - add_flow_action(&actions, RTE_FLOW_ACTION_TYPE_MARK, &mark); - - rss = add_flow_rss_action(&actions, netdev); - add_flow_action(&actions, RTE_FLOW_ACTION_TYPE_END, NULL); + netdev_dpdk_flow_actions_add_mark_rss(&actions, netdev, + info->flow_mark); flow = netdev_dpdk_rte_flow_create(netdev, &flow_attr, patterns.items, actions.actions, &error); - free(rss); if (!flow) { VLOG_ERR("%s: rte flow creat error: %u : message : %s\n", netdev_get_name(netdev), error.type, error.message); @@ -608,8 +164,8 @@ netdev_offload_dpdk_add_flow(struct netdev *netdev, netdev_get_name(netdev), flow, UUID_ARGS((struct uuid *)ufid)); out: - free(patterns.items); - free(actions.actions); + netdev_dpdk_flow_patterns_free(&patterns); + netdev_dpdk_flow_actions_free(&actions); return ret; }
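
For reference, after this refactor the hardware-offload path in netdev_offload_dpdk_add_flow() reduces to the call sequence sketched below. This is only a condensed, illustrative sketch of the code the patch adds, not a literal copy: error handling, the flow_attr setup and the ufid/rte_flow association are elided, and 'match', 'info' and 'netdev' are the usual arguments of that function.

    struct flow_patterns patterns = { .items = NULL, .cnt = 0 };
    struct flow_actions actions = { .actions = NULL, .cnt = 0 };
    struct flow_pattern_items spec, mask;
    struct rte_flow_error error;
    struct rte_flow *flow;

    memset(&spec, 0, sizeof spec);
    memset(&mask, 0, sizeof mask);

    /* Translate the OVS match into rte_flow pattern items
     * (ETH/VLAN/IPv4/L4), terminated by an END item.  A non-zero
     * return value means the match cannot be offloaded. */
    if (netdev_dpdk_flow_patterns_add(&patterns, &spec, &mask, match)) {
        goto out;
    }

    /* Add the MARK and RSS actions, terminated by an END action. */
    netdev_dpdk_flow_actions_add_mark_rss(&actions, netdev, info->flow_mark);

    flow = netdev_dpdk_rte_flow_create(netdev, &flow_attr, patterns.items,
                                       actions.actions, &error);
    ...
out:
    netdev_dpdk_flow_patterns_free(&patterns);
    netdev_dpdk_flow_actions_free(&actions);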