From patchwork Tue Nov 12 15:08:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tonghao Zhang X-Patchwork-Id: 1193622 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Z5ag0/Ls"; dkim-atps=neutral Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47CB421Y73z9sP4 for ; Wed, 13 Nov 2019 02:09:37 +1100 (AEDT) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id D5D2FC5D; Tue, 12 Nov 2019 15:09:34 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 60B0F8FF for ; Tue, 12 Nov 2019 15:09:33 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-pg1-f193.google.com (mail-pg1-f193.google.com [209.85.215.193]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 937C18BB for ; Tue, 12 Nov 2019 15:09:32 +0000 (UTC) Received: by mail-pg1-f193.google.com with SMTP id 15so12037755pgh.5 for ; Tue, 12 Nov 2019 07:09:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=2DzezMOdIfO5arl9C8ik9PsJNSPPcBuBmup2DMQUr1s=; b=Z5ag0/Ls4NVsFcNSAyZnz8aPip7beFbZmHDaznRRWsEdp61lHPMsaYQ9cUOpj6oslT jCQm8AugCKVNvuwvuAlZjzIWleafs3z5ADNMr34K71Iq8iKA/f2bzbFZ1nmJ4Wx5vUgf Nm4eGGj39xdOqM9cAA6DbCt+j6iTP9lp3Ls40CbH+Ss9E0YE9kdmemhKFefArjeCy4a0 Epsbjs90PTej3+dmUUqxFibZP+wanhiiIAUOeplijXJ7jEuXLyBxmVbM2E0rRkEJGE7Q OISFQZnHQ1aD3Rr/REeO0dX9DRdJDFIRgsnCKz7lLpGDEgAjaFeTikqsHEWFX0Q8Qq64 WnAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=2DzezMOdIfO5arl9C8ik9PsJNSPPcBuBmup2DMQUr1s=; b=eL6vE5K08jXwdasubo6oX41tDllISrOI1NZNxCBJuhMBFgAv+vD8rlQn10xjV/FACG y+iJT3GCjvoATgbhKO45tb6JhhNDkFpirCdvkW9jxgUs0zFn1Eo8J0iEV71Tut1Qh9B6 m8yxqj+/7NHdfJMbbAzKAElXle7GRzvE6Oq5EHldWZ4XQvscPPkhvEmJ2oJmnwyMLXHq yAydPitCXJtFRaqcNwpSvDMYX8xBGodDTUimgznEemknn8l+EY4aCxqJNotCioOo64Qc LUGEDf5Mr0Cg17eAsebHDyJvrvfPQUeeEyDld0LCemqQfFn9OYClMM+BPpQDEBZtOJ2V WzVg== X-Gm-Message-State: APjAAAUWk6UYxLKjLzXYFsoGkfzUnmPkPq1rn08Pyz6fKnNcpirzsx9a kVLKR64BLVVVGOz+ZrtnN+g= X-Google-Smtp-Source: APXvYqzsWE9iwRj/u6mVcCR9NEndwE+ULFFmrfaIt0v+UAUxL0Mv0CPCmC51UieWGcicwEUGVBjESA== X-Received: by 2002:aa7:80d2:: with SMTP id a18mr13418590pfn.29.1573571372017; Tue, 12 Nov 2019 07:09:32 -0800 (PST) Received: from local.opencloud.tech.localdomain ([115.171.60.86]) by smtp.gmail.com with ESMTPSA id k32sm2874689pje.10.2019.11.12.07.09.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 12 Nov 2019 07:09:31 -0800 (PST) From: xiangxia.m.yue@gmail.com To: pshelar@ovn.org, blp@ovn.org Date: Tue, 12 Nov 2019 23:08:47 +0800 Message-Id: <1573571327-6906-1-git-send-email-xiangxia.m.yue@gmail.com> X-Mailer: git-send-email 1.8.3.1 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: dev@openvswitch.org, netdev@vger.kernel.org, ychen103103@163.com Subject: [ovs-dev] [PATCH net-next v3] net: openvswitch: add hash info to upcall X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org From: Tonghao Zhang When using the kernel datapath, the upcall don't include skb hash info relatived. That will introduce some problem, because the hash of skb is important in kernel stack. For example, VXLAN module uses it to select UDP src port. The tx queue selection may also use the hash in stack. Hash is computed in different ways. Hash is random for a TCP socket, and hash may be computed in hardware, or software stack. Recalculation hash is not easy. Hash of TCP socket is computed: tcp_v4_connect -> sk_set_txhash (is random) __tcp_transmit_skb -> skb_set_hash_from_sk There will be one upcall, without information of skb hash, to ovs-vswitchd, for the first packet of a TCP session. The rest packets will be processed in Open vSwitch modules, hash kept. If this tcp session is forward to VXLAN module, then the UDP src port of first tcp packet is different from rest packets. TCP packets may come from the host or dockers, to Open vSwitch. To fix it, we store the hash info to upcall, and restore hash when packets sent back. +---------------+ +-------------------------+ | Docker/VMs | | ovs-vswitchd | +----+----------+ +-+--------------------+--+ | ^ | | | | | | upcall v restore packet hash (not recalculate) | +-+--------------------+--+ | tap netdev | | vxlan module +---------------> +--> Open vSwitch ko +--> or internal type | | +-------------------------+ Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html Signed-off-by: Tonghao Zhang --- v3: * add enum ovs_pkt_hash_types * avoid duplicate call the skb_get_hash_raw. * explain why we should fix this problem. --- include/uapi/linux/openvswitch.h | 2 ++ net/openvswitch/datapath.c | 30 +++++++++++++++++++++++++++++- net/openvswitch/datapath.h | 12 ++++++++++++ 3 files changed, 43 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h index 1887a451c388..e65407c1f366 100644 --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -170,6 +170,7 @@ enum ovs_packet_cmd { * output port is actually a tunnel port. Contains the output tunnel key * extracted from the packet as nested %OVS_TUNNEL_KEY_ATTR_* attributes. * @OVS_PACKET_ATTR_MRU: Present for an %OVS_PACKET_CMD_ACTION and + * @OVS_PACKET_ATTR_HASH: Packet hash info (e.g. hash, sw_hash and l4_hash in skb). * @OVS_PACKET_ATTR_LEN: Packet size before truncation. * %OVS_PACKET_ATTR_USERSPACE action specify the Maximum received fragment * size. @@ -190,6 +191,7 @@ enum ovs_packet_attr { OVS_PACKET_ATTR_PROBE, /* Packet operation is a feature probe, error logging should be suppressed. */ OVS_PACKET_ATTR_MRU, /* Maximum received IP fragment size. */ + OVS_PACKET_ATTR_HASH, /* Packet hash. */ OVS_PACKET_ATTR_LEN, /* Packet size before truncation. */ __OVS_PACKET_ATTR_MAX }; diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index 2088619c03f0..b556cf62b77c 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -350,7 +350,8 @@ static size_t upcall_msg_size(const struct dp_upcall_info *upcall_info, size_t size = NLMSG_ALIGN(sizeof(struct ovs_header)) + nla_total_size(hdrlen) /* OVS_PACKET_ATTR_PACKET */ + nla_total_size(ovs_key_attr_size()) /* OVS_PACKET_ATTR_KEY */ - + nla_total_size(sizeof(unsigned int)); /* OVS_PACKET_ATTR_LEN */ + + nla_total_size(sizeof(unsigned int)) /* OVS_PACKET_ATTR_LEN */ + + nla_total_size(sizeof(u64)); /* OVS_PACKET_ATTR_HASH */ /* OVS_PACKET_ATTR_USERDATA */ if (upcall_info->userdata) @@ -393,6 +394,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, size_t len; unsigned int hlen; int err, dp_ifindex; + u64 hash; dp_ifindex = get_dpifindex(dp); if (!dp_ifindex) @@ -504,6 +506,23 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, pad_packet(dp, user_skb); } + hash = skb_get_hash_raw(skb); + if (hash) { + if (skb->sw_hash) + hash |= OVS_PACKET_HASH_SW_BIT; + + if (skb->l4_hash) + hash |= OVS_PACKET_HASH_L4_BIT; + + if (nla_put(user_skb, OVS_PACKET_ATTR_HASH, + sizeof (u64), &hash)) { + err = -ENOBUFS; + goto out; + } + + pad_packet(dp, user_skb); + } + /* Only reserve room for attribute header, packet data is added * in skb_zerocopy() */ if (!(nla = nla_reserve(user_skb, OVS_PACKET_ATTR_PACKET, 0))) { @@ -543,6 +562,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info) struct datapath *dp; struct vport *input_vport; u16 mru = 0; + u64 hash; int len; int err; bool log = !a[OVS_PACKET_ATTR_PROBE]; @@ -568,6 +588,14 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info) } OVS_CB(packet)->mru = mru; + if (a[OVS_PACKET_ATTR_HASH]) { + hash = nla_get_u64(a[OVS_PACKET_ATTR_HASH]); + + __skb_set_hash(packet, hash & 0xFFFFFFFFULL, + !!(hash & OVS_PACKET_HASH_SW_BIT), + !!(hash & OVS_PACKET_HASH_L4_BIT)); + } + /* Build an sw_flow for sending this packet. */ flow = ovs_flow_alloc(); err = PTR_ERR(flow); diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h index 81e85dde8217..e239a46c2f94 100644 --- a/net/openvswitch/datapath.h +++ b/net/openvswitch/datapath.h @@ -139,6 +139,18 @@ struct ovs_net { bool xt_label; }; +/** + * enum ovs_pkt_hash_types - hash info to include with a packet + * to send to userspace. + * @OVS_PACKET_HASH_SW_BIT: indicates hash was computed in software stack. + * @OVS_PACKET_HASH_L4_BIT: indicates hash is a canonical 4-tuple hash + * over transport ports. + */ +enum ovs_pkt_hash_types { + OVS_PACKET_HASH_SW_BIT = (1ULL << 32), + OVS_PACKET_HASH_L4_BIT = (1ULL << 33), +}; + extern unsigned int ovs_net_id; void ovs_lock(void); void ovs_unlock(void);