From patchwork Tue Sep 27 12:46:02 2016
X-Patchwork-Submitter: Paul Blakey
X-Patchwork-Id: 675568
From: Paul Blakey
To: dev@openvswitch.org
Cc: Shahar Klein, Andy Gospodarek, Rony Efraim, Paul Blakey, Simon Horman, Or Gerlitz
Date: Tue, 27 Sep 2016 15:46:02 +0300
Subject: [ovs-dev] [PATCH ovs RFC 7/9] dpif-hw-netlink: operate implementation
Message-Id: <1474980364-9291-8-git-send-email-paulb@mellanox.com>
In-Reply-To: <1474980364-9291-1-git-send-email-paulb@mellanox.com>
References: <1474980364-9291-1-git-send-email-paulb@mellanox.com>

added flow offload with tc, supporting flow get, flow put, and
flow del. Signed-off-by: Paul Blakey Signed-off-by: Shahar Klein --- lib/dpif-hw-netlink.c | 687 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 682 insertions(+), 5 deletions(-) diff --git a/lib/dpif-hw-netlink.c b/lib/dpif-hw-netlink.c index 885827a..9473832 100644 --- a/lib/dpif-hw-netlink.c +++ b/lib/dpif-hw-netlink.c @@ -47,6 +47,74 @@ VLOG_DEFINE_THIS_MODULE(dpif_hw_netlink); +extern bool SKIP_HW; + +static inline void * +nla_data(const struct nlattr *nla) +{ + return (char *) nla + NLA_HDRLEN; +} + +static char * +attrname(int type) +{ + static char unkowntype[64]; + + switch (type) { + case OVS_KEY_ATTR_PRIORITY: + return "OVS_KEY_ATTR_PRIORITY"; + case OVS_KEY_ATTR_CT_LABELS: + return "OVS_KEY_ATTR_CT_LABELS"; + case OVS_KEY_ATTR_IN_PORT: + return "OVS_KEY_ATTR_IN_PORT"; + case OVS_KEY_ATTR_ETHERNET: + return "OVS_KEY_ATTR_ETHERNET"; + case OVS_KEY_ATTR_VLAN: + return "OVS_KEY_ATTR_VLAN"; + case OVS_KEY_ATTR_ETHERTYPE: + return "OVS_KEY_ATTR_ETHERTYPE"; + case OVS_KEY_ATTR_IPV4: + return "OVS_KEY_ATTR_IPV4"; + case OVS_KEY_ATTR_IPV6: + return "OVS_KEY_ATTR_IPV6"; + case OVS_KEY_ATTR_TCP: + return "OVS_KEY_ATTR_TCP"; + case OVS_KEY_ATTR_UDP: + return "OVS_KEY_ATTR_UDP"; + case OVS_KEY_ATTR_ICMP: + return "OVS_KEY_ATTR_ICMP"; + case OVS_KEY_ATTR_ICMPV6: + return "OVS_KEY_ATTR_ICMPV6"; + case OVS_KEY_ATTR_ARP: + return "OVS_KEY_ATTR_ARP"; + case OVS_KEY_ATTR_ND: + return "OVS_KEY_ATTR_ND"; + case OVS_KEY_ATTR_SKB_MARK: + return "OVS_KEY_ATTR_SKB_MARK"; + case OVS_KEY_ATTR_TUNNEL: + return "OVS_KEY_ATTR_TUNNEL"; + case OVS_KEY_ATTR_SCTP: + return "OVS_KEY_ATTR_SCTP"; + case OVS_KEY_ATTR_TCP_FLAGS: + return "OVS_KEY_ATTR_TCP_FLAGS"; + case OVS_KEY_ATTR_DP_HASH: + return "OVS_KEY_ATTR_DP_HASH"; + case OVS_KEY_ATTR_RECIRC_ID: + return "OVS_KEY_ATTR_RECIRC_ID"; + case OVS_KEY_ATTR_MPLS: + return "OVS_KEY_ATTR_MPLS"; + case OVS_KEY_ATTR_CT_STATE: + return "OVS_KEY_ATTR_CT_STATE"; + case OVS_KEY_ATTR_CT_ZONE: + return "OVS_KEY_ATTR_CT_ZONE"; + 
case OVS_KEY_ATTR_CT_MARK: + return "OVS_KEY_ATTR_CT_MARK"; + default: + snprintf(unkowntype, sizeof unkowntype, "unknown_type(%d)", type); + return unkowntype; + } +} + static char * printufid(const ovs_u128 * ovs_ufid) { @@ -426,7 +494,7 @@ dpif_hw_tc_flow_to_dpif_flow(struct dpif_hw_netlink *dpif, sizeof (*ipv4_mask)); memset(&ipv4_mask->ipv4_proto, 0xFF, sizeof (ipv4_mask->ipv4_proto)); - ipv4->ipv4_proto = tc_flow->ip_proto; + ipv4->ipv4_proto = tc_flow->ip_proto; ipv4_mask->ipv4_frag = UINT8_MAX; if (tc_flow->ip_type == 4) { @@ -441,7 +509,7 @@ dpif_hw_tc_flow_to_dpif_flow(struct dpif_hw_netlink *dpif, } if (tc_flow->ip_proto == IPPROTO_ICMP) { - /* putting a masked out icmp */ + /* putting a masked out icmp */ struct ovs_key_icmp *icmp = nl_msg_put_unspec_uninit(outflow, OVS_KEY_ATTR_ICMP, sizeof (*icmp)); @@ -490,7 +558,7 @@ dpif_hw_tc_flow_to_dpif_flow(struct dpif_hw_netlink *dpif, size_t actions_offset = nl_msg_start_nested(outflow, OVS_FLOW_ATTR_ACTIONS); if (tc_flow->ifindex_out) { - /* TODO: make this faster */ + /* TODO: make this faster */ int ovsport = get_ovs_port(dpif, tc_flow->ifindex_out); nl_msg_put_u32(outflow, OVS_ACTION_ATTR_OUTPUT, ovsport); @@ -843,13 +911,622 @@ dpif_hw_netlink_flow_dump_next(struct dpif_flow_dump_thread *thread_, max_flows); } +static bool +odp_mask_attr_is_wildcard(const struct nlattr *ma) +{ + return is_all_zeros(nl_attr_get(ma), nl_attr_get_size(ma)); +} + +static bool +odp_mask_is_exact(enum ovs_key_attr attr, const void *mask, size_t size) +{ + if (attr == OVS_KEY_ATTR_TCP_FLAGS) { + return TCP_FLAGS(*(ovs_be16 *) mask) == TCP_FLAGS(OVS_BE16_MAX); + } + if (attr == OVS_KEY_ATTR_IPV6) { + const struct ovs_key_ipv6 *ipv6_mask = mask; + + return ((ipv6_mask->ipv6_label & htonl(IPV6_LABEL_MASK)) + == htonl(IPV6_LABEL_MASK)) + && ipv6_mask->ipv6_proto == UINT8_MAX + && ipv6_mask->ipv6_tclass == UINT8_MAX + && ipv6_mask->ipv6_hlimit == UINT8_MAX + && ipv6_mask->ipv6_frag == UINT8_MAX + && ipv6_mask_is_exact((const struct in6_addr *) +
ipv6_mask->ipv6_src) + && ipv6_mask_is_exact((const struct in6_addr *) + ipv6_mask->ipv6_dst); + } + if (attr == OVS_KEY_ATTR_TUNNEL) { + return false; + } + + if (attr == OVS_KEY_ATTR_ARP) { + /* ARP key has padding, ignore it. */ + BUILD_ASSERT_DECL(sizeof (struct ovs_key_arp) == 24); + BUILD_ASSERT_DECL(offsetof(struct ovs_key_arp, arp_tha) == 10 + 6); + size = offsetof(struct ovs_key_arp, arp_tha) + ETH_ADDR_LEN; + + ovs_assert(((uint16_t *) mask)[size / 2] == 0); + } + + return is_all_ones(mask, size); +} + +static bool +odp_mask_attr_is_exact(const struct nlattr *ma) +{ + enum ovs_key_attr attr = nl_attr_type(ma); + const void *mask; + size_t size; + + if (attr == OVS_KEY_ATTR_TUNNEL) { + return false; + } else { + mask = nl_attr_get(ma); + size = nl_attr_get_size(ma); + } + + return odp_mask_is_exact(attr, mask, size); +} + +static int +parse_to_tc_flow(struct dpif_hw_netlink *dpif, struct tc_flow *tc_flow, + const struct nlattr *key, int key_len, + const struct nlattr *key_mask, int key_mask_len) +{ + size_t left; + const struct nlattr *a; + const struct nlattr *mask[__OVS_KEY_ATTR_MAX] = { 0 }; + + VLOG_DBG("parsing mask:\n"); + NL_ATTR_FOR_EACH_UNSAFE(a, left, key_mask, key_mask_len) { + mask[nl_attr_type(a)] = a; + } + + VLOG_DBG("parsing key attributes:\n"); + NL_ATTR_FOR_EACH_UNSAFE(a, left, key, key_len) { + const struct nlattr *ma = mask[nl_attr_type(a)]; + bool is_wildcard = false; + bool is_exact = true; + + if (key_mask && key_mask_len) { + is_wildcard = ma ? odp_mask_attr_is_wildcard(ma) : true; + is_exact = ma ? 
odp_mask_attr_is_exact(ma) : false; + } + + if (is_exact) + VLOG_DBG("mask: %s exact: %p\n", attrname(nl_attr_type(a)), ma); + else if (is_wildcard) + VLOG_DBG("mask: %s wildcard: %p\n", attrname(nl_attr_type(a)), ma); + else + VLOG_DBG("mask %s is partial, ma: %p\n", attrname(nl_attr_type(a)), + ma); + + switch (nl_attr_type(a)) { + case OVS_KEY_ATTR_UNSPEC: + case OVS_KEY_ATTR_ENCAP: + case OVS_KEY_ATTR_PRIORITY: + case OVS_KEY_ATTR_SKB_MARK: + case OVS_KEY_ATTR_CT_STATE: + case OVS_KEY_ATTR_CT_ZONE: + case OVS_KEY_ATTR_CT_MARK: + case OVS_KEY_ATTR_CT_LABELS: + case OVS_KEY_ATTR_ND: + case OVS_KEY_ATTR_MPLS: + case OVS_KEY_ATTR_DP_HASH: + case OVS_KEY_ATTR_TUNNEL: + case OVS_KEY_ATTR_SCTP: + case OVS_KEY_ATTR_VLAN: + case OVS_KEY_ATTR_IPV6: + case OVS_KEY_ATTR_ICMP: + case OVS_KEY_ATTR_ARP: + case OVS_KEY_ATTR_ICMPV6: + if (is_wildcard) { + VLOG_DBG("unsupported key attribute: %s is wildcard\n", + attrname(nl_attr_type(a))); + break; + } + VLOG_ERR("unsupported key attribute: %s is not wildcard\n", + attrname(nl_attr_type(a))); + return 1; + break; + + case OVS_KEY_ATTR_TCP_FLAGS: + case OVS_KEY_ATTR_RECIRC_ID: + /* Ignore these attributes for now (we might disable some of them in + * probe?). */ + VLOG_DBG + ("ignoring attribute %s -- fix me, exact: %s, wildcard: %s, partial: %s\n", + attrname(nl_attr_type(a)), is_exact ? "yes" : "no", + is_wildcard ? "yes" : "no", (!is_wildcard + && !is_exact) ? "yes" : "no"); + break; + + case OVS_KEY_ATTR_IN_PORT:{ + if (!is_exact) { + VLOG_ERR("%s isn't exact, can't offload!\n", + attrname(nl_attr_type(a))); + return 1; + } + + VLOG_DBG("in_port(%d)\n", nl_attr_get_u32(a)); + tc_flow->ovs_inport = nl_attr_get_u32(a); + tc_flow->indev = port_find(dpif, tc_flow->ovs_inport); + tc_flow->ifindex = + tc_flow->indev ?
netdev_get_ifindex(tc_flow->indev) : 0; + if (!tc_flow->ovs_inport || !tc_flow->ifindex) { + VLOG_ERR + ("RESULT: not found inport: %d or ifindex: %d for ovs in_port: %d\n", + tc_flow->ovs_inport, tc_flow->ifindex, + tc_flow->ovs_inport); + return 1; + } + } + break; + + case OVS_KEY_ATTR_ETHERNET:{ + const struct ovs_key_ethernet *eth_key = 0; + struct ovs_key_ethernet full_mask; + + memset(&full_mask, 0xFF, sizeof (full_mask)); + if (!SKIP_HW) { + ma = 0; + VLOG_DBG + ("%s %d %s: FORCING FULL MASK ON ETH MAC ADDRESS, because of \"syndrome command failed, status bad parameter(0x3), syndrome 0x3ad328\"", + __FILE__, __LINE__, __func__); + } + const struct ovs_key_ethernet *eth_key_mask = + ma ? nla_data(ma) : &full_mask; + eth_key = nla_data(a); + + const struct eth_addr *src = &eth_key->eth_src; + const struct eth_addr *src_mask = &eth_key_mask->eth_src; + const struct eth_addr *dst = &eth_key->eth_dst; + const struct eth_addr *dst_mask = &eth_key_mask->eth_dst; + + memcpy(&tc_flow->src_mac, src, sizeof (tc_flow->src_mac)); + memcpy(&tc_flow->src_mac_mask, src_mask, + sizeof (tc_flow->src_mac_mask)); + memcpy(&tc_flow->dst_mac, dst, sizeof (tc_flow->dst_mac)); + memcpy(&tc_flow->dst_mac_mask, dst_mask, + sizeof (tc_flow->dst_mac_mask)); + + VLOG_DBG("eth(src=" ETH_ADDR_FMT ", src_mask=" ETH_ADDR_FMT + ", dst=" ETH_ADDR_FMT ", dst_mask=" ETH_ADDR_FMT "\n", + ETH_ADDR_ARGS(tc_flow->src_mac), + ETH_ADDR_ARGS(tc_flow->src_mac_mask), + ETH_ADDR_ARGS(tc_flow->dst_mac), + ETH_ADDR_ARGS(tc_flow->dst_mac_mask)); + } + break; + case OVS_KEY_ATTR_ETHERTYPE:{ + if (!is_exact) { + VLOG_ERR("attribute %s isn't exact, can't offload!\n", + attrname(nl_attr_type(a))); + return 1; + } + + tc_flow->eth_type = nl_attr_get_be16(a); + VLOG_DBG("eth_type(0x%04x)\n", ntohs(tc_flow->eth_type)); + } + break; + case OVS_KEY_ATTR_IPV4:{ + const struct ovs_key_ipv4 *ipv4 = nla_data(a); + struct ovs_key_ipv4 full_mask; + + memset(&full_mask, 0xFF, sizeof (full_mask)); + const struct ovs_key_ipv4
*ipv4_mask = + ma ? nla_data(ma) : &full_mask; + + if (ipv4_mask->ipv4_frag) { + VLOG_WARN + ("*** ignoring exact or partial mask on unsupported ipv4_frag, mask: %x", + ipv4_mask->ipv4_frag); + } + + if (ipv4_mask->ipv4_ttl || ipv4_mask->ipv4_tos) { + VLOG_ERR + ("ipv4 mask exact or partial one of unsupported sub attributes (ttl: %x, tos: %x, frag: %x)\n", + ipv4_mask->ipv4_ttl, ipv4_mask->ipv4_tos, + ipv4_mask->ipv4_frag); + return 1; + } + + if (ipv4_mask->ipv4_proto != 0 + && ipv4_mask->ipv4_proto != 0xFF) { + VLOG_WARN + ("*** ignoring partial mask 0x%x on ipv4_proto, taking exact ip_proto: %d\n", + ipv4_mask->ipv4_proto, ipv4->ipv4_proto); + } + + /* If not wildcarded, take an exact match for ipv4_proto + * (ignoring the mask). */ + if (ipv4_mask->ipv4_proto != 0) + tc_flow->ip_proto = ipv4->ipv4_proto; + + if (ipv4_mask->ipv4_src) { + tc_flow->ipv4.ipv4_src = ipv4->ipv4_src; + tc_flow->ipv4.ipv4_src_mask = ipv4_mask->ipv4_src; + } + if (ipv4_mask->ipv4_dst) { + tc_flow->ipv4.ipv4_dst = ipv4->ipv4_dst; + tc_flow->ipv4.ipv4_dst_mask = ipv4_mask->ipv4_dst; + } + tc_flow->ip_type = 4; + } + break; + case OVS_KEY_ATTR_TCP:{ + struct ovs_key_tcp full_mask; + + memset(&full_mask, 0xFF, sizeof (full_mask)); + const struct ovs_key_tcp *tcp_mask = + ma ? nla_data(ma) : &full_mask; + const struct ovs_key_tcp *tcp = nla_data(a); + + if (tcp_mask->tcp_src) { + tc_flow->src_port = tcp->tcp_src; + tc_flow->src_port_mask = tcp_mask->tcp_src; + } + if (tcp_mask->tcp_dst) { + tc_flow->dst_port = tcp->tcp_dst; + tc_flow->dst_port_mask = tcp_mask->tcp_dst; + } + + VLOG_DBG("tcp(src=%d, msk: 0x%x, dst=%d, msk: 0x%x)\n", + htons(tcp->tcp_src), htons(tcp_mask->tcp_src), + htons(tcp->tcp_dst), htons(tcp_mask->tcp_dst)); + } + break; + case OVS_KEY_ATTR_UDP:{ + struct ovs_key_udp full_mask; + + memset(&full_mask, 0xFF, sizeof (full_mask)); + const struct ovs_key_udp *udp_mask = + ma ?
nla_data(ma) : &full_mask; + const struct ovs_key_udp *udp = nla_data(a); + + if (udp_mask->udp_src) { + tc_flow->src_port = udp->udp_src; + tc_flow->src_port_mask = udp_mask->udp_src; + } + if (udp_mask->udp_dst) { + tc_flow->dst_port = udp->udp_dst; + tc_flow->dst_port_mask = udp_mask->udp_dst; + } + VLOG_DBG("udp(src=%d/0x%x, dst=%d/0x%x)\n", + htons(udp->udp_src), htons(udp_mask->udp_src), + htons(udp->udp_dst), htons(udp_mask->udp_dst)); + } + break; + + case __OVS_KEY_ATTR_MAX: + default: + VLOG_ERR("unknown (default/max) key attribute: %s\n", + attrname(nl_attr_type(a))); + return 1; + } + } + VLOG_DBG("--- finished parsing attr - can offload!\n"); + return 0; + +} + +static enum dpif_hw_offload_policy +parse_flow_put(struct dpif_hw_netlink *dpif, struct dpif_flow_put *put) +{ + +/* + * if this is a modify flow cmd and the policy changed: + * delete the old one + * handle the new/modify flow + * + * +*/ + const struct nlattr *a; + size_t left; + struct netdev *in = 0; + enum dpif_hw_offload_policy policy; + + int probe_feature = ((put->flags & DPIF_FP_PROBE) ? 1 : 0); + + if (probe_feature) { + VLOG_DBG("\n.\nPROBE REQUEST!\n.\n"); + /* see usage at dpif_probe_feature, we might want to intercept and + * disable some features */ + return DPIF_HW_NO_OFFLAOAD; + } + int cmd = + put->flags & DPIF_FP_CREATE ? 
OVS_FLOW_CMD_NEW : OVS_FLOW_CMD_SET; + if (!put->ufid) { + VLOG_INFO + ("%s %d %s missing ufid for flow put, might be from dpctl add-flow.", + __FILE__, __LINE__, __func__); + } + + policy = HW_offload_test_put(dpif, put); + if (put->ufid) + put_policy(dpif, put->ufid, policy); + int proto = 0; + int handle = + gethandle(dpif, put->ufid, &in, &proto, "DPIF_OP_FLOW_PUT", 1); + if (handle && proto && (policy == DPIF_HW_NO_OFFLAOAD)) { + put->flags |= DPIF_FP_CREATE; + int ifindex = netdev_get_ifindex(in); + + tc_del_flower(ifindex, handle, proto); + delhandle(dpif, put->ufid); + return DPIF_HW_NO_OFFLAOAD; + } + + if (policy == DPIF_HW_NO_OFFLAOAD) + return DPIF_HW_NO_OFFLAOAD; + + if (cmd == OVS_FLOW_CMD_NEW) + VLOG_DBG("cmd is OVS_FLOW_CMD_NEW - create\n"); + else + VLOG_DBG("cmd is OVS_FLOW_CMD_SET - modify\n"); + + if (put->flags & DPIF_FP_ZERO_STATS && cmd == OVS_FLOW_CMD_SET) + VLOG_WARN + ("We need to zero the stats of a modified flow, not implemented, ignored\n"); + + if (put->stats) + VLOG_WARN("FLOW PUT WANTS STATS\n"); + + /* if not present, and cmd == OVS_FLOW_CMD_SET, means don't modify ACTIONs + * (which we wrongly parse as a drop rule) see include/odp-netlink.h +:490 + * to clear actions with OVS_FLOW_CMD_SET, actions will be present but + * empty */ + if (!put->key) { + VLOG_ERR("%s %d %s error ,missing key, cmd: %d!", __FILE__, __LINE__, + __func__, cmd); + return DPIF_HW_NO_OFFLAOAD; + } + if (!put->actions) { + if (cmd == OVS_FLOW_CMD_NEW) { + VLOG_WARN("%s %d %s error missing actions on cmd new!", __FILE__, + __LINE__, __func__); + } else { + VLOG_WARN + ("%s %d %s missing actions on cmd modify, find and modify key only", + __FILE__, __LINE__, __func__); + } + } + + int outport_count = 0; + + VLOG_DBG("parsing actions\n"); + NL_ATTR_FOR_EACH_UNSAFE(a, left, put->actions, put->actions_len) { + if (nl_attr_type(a) == OVS_ACTION_ATTR_OUTPUT) { + VLOG_WARN("output to port: %d\n", nl_attr_get_u32(a)); + outport_count++; + } + } + if (outport_count == 
0) + VLOG_DBG("output to port: drop\n"); + + struct ds ds; + + ds_init(&ds); + ds_clear(&ds); + if (put->ufid) { + odp_format_ufid(put->ufid, &ds); + ds_put_cstr(&ds, ", "); + } + + ds_put_cstr(&ds, "verbose: "); + odp_flow_format(put->key, put->key_len, put->mask, put->mask_len, 0, &ds, + true); + ds_put_cstr(&ds, ", not_verbose: "); + odp_flow_format(put->key, put->key_len, put->mask, put->mask_len, 0, &ds, + false); + + /* can also use dpif_flow_stats_format(&f->stats, ds) to print stats */ + + ds_put_cstr(&ds, ", actions:"); + format_odp_actions(&ds, put->actions, put->actions_len); + VLOG_DBG("%s\n", ds_cstr(&ds)); + ds_destroy(&ds); + + /* parse tc_flow */ + struct tc_flow tc_flow; + + memset(&tc_flow, 0, sizeof (tc_flow)); + tc_flow.handle = handle; + int cant_offload = + parse_to_tc_flow(dpif, &tc_flow, put->key, put->key_len, put->mask, + put->mask_len); + + int new = handle ? 0 : 1; + + VLOG_DBG + ("cant_offload: %d ifindex: %d, eth_type: %x, ip_proto: %d, outport_count: %d\n", + cant_offload, tc_flow.ifindex, ntohs(tc_flow.eth_type), + tc_flow.ip_proto, outport_count); + if (!cant_offload && tc_flow.ifindex && tc_flow.eth_type + && outport_count <= 1) { + VLOG_DBG("RESULT: %p, ***** offloading (HW_ONLY!)\n", dpif); + if (cmd != OVS_FLOW_CMD_NEW && !handle) { + /* modify and flow is now offloadable, remove from kernel netlink + * datapath */ + int error = + dpif_flow_del(dpif->lp_dpif_netlink, put->key, put->key_len, + put->ufid, PMD_ID_NULL, NULL); + + if (!error) + VLOG_DBG("modify, deleted old flow and offloading new\n"); + else + VLOG_ERR("modify, error: %d\n", error); + } + + int error = 0; + + outport_count = 0; + NL_ATTR_FOR_EACH_UNSAFE(a, left, put->actions, put->actions_len) { + if (nl_attr_type(a) == OVS_ACTION_ATTR_OUTPUT) { + outport_count++; + + tc_flow.ovs_outport = nl_attr_get_u32(a); + tc_flow.outdev = port_find(dpif, tc_flow.ovs_outport); + tc_flow.ifindex_out = + tc_flow.outdev ? 
netdev_get_ifindex(tc_flow.outdev) : 0; + if (tc_flow.ifindex_out) { + VLOG_DBG + (" **** handle: %d, new? %d, adding %d -> %d (ifindex: %d -> %d)\n", + tc_flow.handle, new, tc_flow.ovs_inport, + tc_flow.ovs_outport, tc_flow.ifindex, + tc_flow.ifindex_out); + error = tc_replace_flower(&tc_flow); /* assign the outer 'error'; redeclaring it here shadowed it and hid failures */ + + if (!error) { + if (new) + puthandle(dpif, put->ufid, tc_flow.indev, + tc_flow.ovs_inport, tc_flow.handle, + tc_flow.eth_type); + + VLOG_DBG(" **** offloaded! handle: %d (%x)\n", + tc_flow.handle, tc_flow.handle); + } else + VLOG_ERR + (" **** error! adding fwd rule! tc error: %d\n", + error); + } else { + VLOG_ERR + (" **** error! not found output port %d, ifindex: %d\n", + tc_flow.ovs_outport, tc_flow.ifindex_out); + break; + } + } + } + if (!outport_count) { + VLOG_DBG + (" ***** handle: %d, new? %d, adding %d -> DROP (ifindex: %d -> DROP)\n", + tc_flow.handle, new, tc_flow.ovs_inport, tc_flow.ifindex); + error = tc_replace_flower(&tc_flow); + if (!error) { + if (new) + puthandle(dpif, put->ufid, tc_flow.indev, + tc_flow.ovs_inport, tc_flow.handle, + tc_flow.eth_type); + + VLOG_DBG(" **** offloaded! handle: %d (%x)\n", tc_flow.handle, + tc_flow.handle); + } else + VLOG_ERR(" **** error adding drop rule!
tc error: %d\n", + error); + } + + if (error) + return DPIF_HW_NO_OFFLAOAD; + return DPIF_HW_OFFLOAD_ONLY; + } + + VLOG_DBG("RESULT: SW\n"); + + return DPIF_HW_NO_OFFLAOAD; +} + +static enum dpif_hw_offload_policy +parse_flow_get(struct dpif_hw_netlink *dpif, struct dpif_flow_get *get) +{ + struct netdev *in = 0; + int protocol = 0; + int handle = + gethandle(dpif, get->ufid, &in, &protocol, "DPIF_OP_FLOW_GET", 1); + + if (handle && protocol) { + struct tc_flow tc_flow; + int ifindex = netdev_get_ifindex(in); + int ovs_port = get_ovs_port(dpif, ifindex); + int error = ENOENT; + + if (ovs_port != -1) + error = tc_get_flower(ifindex, handle, protocol, &tc_flow); + + if (!error) { + dpif_hw_tc_flow_to_dpif_flow(dpif, &tc_flow, get->flow, ovs_port, + get->buffer, in); + return DPIF_HW_OFFLOAD_ONLY; + } + } + + return DPIF_HW_NO_OFFLAOAD; +} + +static enum dpif_hw_offload_policy +parse_flow_del(struct dpif_hw_netlink *dpif, struct dpif_flow_del *del) +{ + struct netdev *in = 0; + int protocol = 0; + int handle = + gethandle(dpif, del->ufid, &in, &protocol, "DPIF_OP_FLOW_DEL", 1); + + /* we delete the handle anyway (even if not deleted from tc) */ + delhandle(dpif, del->ufid); + del_policy(dpif, del->ufid); + + if (handle && protocol) { + int ifindex = netdev_get_ifindex(in); + + VLOG_DBG("deleting ufid %s, handle %d, protocol: %d, ifindex: %d\n", + printufid(del->ufid), handle, protocol, ifindex); + int error = tc_del_flower(ifindex, handle, protocol); + + if (error) + VLOG_ERR("DELETE FAILED: tc error: %d\n", error); + else + VLOG_DBG("DELETE SUCCESS!\n"); + + if (error) + return DPIF_HW_NO_OFFLAOAD; + + return DPIF_HW_OFFLOAD_ONLY; + } + + VLOG_DBG("del with no handle/ufid/protocol, SW only\n"); + return DPIF_HW_NO_OFFLAOAD; +} + +static enum dpif_hw_offload_policy +parse_operate(struct dpif_hw_netlink *dpif, struct dpif_op *op) +{ + switch (op->type) { + case DPIF_OP_FLOW_PUT: + VLOG_DBG("DPIF_OP_FLOW_PUT"); + return parse_flow_put(dpif, &op->u.flow_put); + case 
DPIF_OP_FLOW_GET: + VLOG_DBG("DPIF_OP_FLOW_GET"); + return parse_flow_get(dpif, &op->u.flow_get); + case DPIF_OP_FLOW_DEL: + VLOG_DBG("DPIF_OP_FLOW_DEL"); + return parse_flow_del(dpif, &op->u.flow_del); + + case DPIF_OP_EXECUTE: + default: + return DPIF_HW_NO_OFFLAOAD; + } + return DPIF_HW_NO_OFFLAOAD; +} + static void dpif_hw_netlink_operate(struct dpif *dpif_, struct dpif_op **ops, size_t n_ops) { struct dpif_hw_netlink *dpif = dpif_hw_netlink_cast(dpif_); - return dpif->lp_dpif_netlink->dpif_class->operate(dpif->lp_dpif_netlink, - ops, n_ops); + struct dpif_op **new_ops = xmalloc(sizeof (struct dpif_op *) * n_ops); + size_t n_new_ops = 0; /* ops not handled in HW, forwarded to the netlink dpif */ + size_t i; + + for (i = 0; i < n_ops; i++) { + if (parse_operate(dpif, ops[i]) == DPIF_HW_OFFLOAD_ONLY) { + ops[i]->error = 0; + } else + new_ops[n_new_ops++] = ops[i]; + } + dpif->lp_dpif_netlink->dpif_class->operate(dpif->lp_dpif_netlink, new_ops, + n_new_ops); + free(new_ops); } static int