From patchwork Tue Jan 18 20:20:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bodong Wang X-Patchwork-Id: 1581510 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JdgBf5M6rz9t6h for ; Wed, 19 Jan 2022 07:20:42 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1n9uy3-0006LP-Le; Tue, 18 Jan 2022 20:20:35 +0000 Received: from mail-il-dmz.mellanox.com ([193.47.165.129] helo=mellanox.co.il) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1n9uy0-0006Hu-Tj for kernel-team@lists.ubuntu.com; Tue, 18 Jan 2022 20:20:33 +0000 Received: from Internal Mail-Server by MTLPINE1 (envelope-from bodong@nvidia.com) with SMTP; 18 Jan 2022 22:20:28 +0200 Received: from sw-mtx-016.mtx.labs.mlnx. (sw-mtx-016.mtx.labs.mlnx [10.9.150.102]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 20IKKLgx021211; Tue, 18 Jan 2022 22:20:27 +0200 From: Bodong Wang To: kernel-team@lists.ubuntu.com Subject: [SRU][F:linux-bluefield][PATCH V3 5/5] net: openvswitch: Fix ct_state nat flags for conns arriving from tc Date: Tue, 18 Jan 2022 14:20:20 -0600 Message-Id: <1642537220-22318-6-git-send-email-bodong@nvidia.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1642537220-22318-1-git-send-email-bodong@nvidia.com> References: <1642537220-22318-1-git-send-email-bodong@nvidia.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vlad@nvidia.com, paulb@nvidia.com, majd@nvidia.com MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Paul Blakey BugLink: https://bugs.launchpad.net/bugs/1957807 Netfilter conntrack maintains NAT flags per connection indicating whether NAT was configured for the connection. Openvswitch maintains NAT flags on the per packet flow key ct_state field, indicating whether NAT was actually executed on the packet. When a packet misses from tc to ovs the conntrack NAT flags are set. However, NAT was not necessarily executed on the packet because the connection's state might still be in NEW state. As such, openvswitch wrongly assumes that NAT was executed and sets an incorrect flow key NAT flags. Fix this, by flagging to openvswitch which NAT was actually done in act_ct via tc_skb_ext and tc_skb_cb to the openvswitch module, so the packet flow key NAT flags will be correctly set. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Paul Blakey Acked-by: Jamal Hadi Salim Link: https://lore.kernel.org/r/20220106153804.26451-1-paulb@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 6f022c2ddbcefaee79502ce5386dfe351d457070 linux-next) Signed-off-by: Bodong Wang --- include/linux/skbuff.h | 4 +++- include/net/pkt_sched.h | 4 +++- net/openvswitch/flow.c | 16 +++++++++++++--- net/sched/act_ct.c | 6 ++++++ net/sched/cls_api.c | 2 ++ 5 files changed, 27 insertions(+), 5 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index b53877d..9701643 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -285,7 +285,9 @@ struct tc_skb_ext { __u32 chain; __u16 mru; __u16 zone; - bool post_ct; + u8 post_ct:1; + u8 post_ct_snat:1; + u8 post_ct_dnat:1; }; #endif diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index b52882e..6250acb 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -178,7 +178,9 @@ struct tc_skb_cb { struct qdisc_skb_cb qdisc_cb; u16 mru; - bool post_ct; + u8 post_ct:1; + u8 post_ct_snat:1; + u8 post_ct_dnat:1; u16 zone; /* Only valid if post_ct = true */ }; diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index 89e00eb..090dc2f 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -846,7 +846,7 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT) struct tc_skb_ext *tc_ext; #endif - bool post_ct = false; + bool post_ct = false, post_ct_snat = false, post_ct_dnat = false; int res, err; u16 zone = 0; @@ -887,6 +887,8 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, key->recirc_id = tc_ext ? tc_ext->chain : 0; OVS_CB(skb)->mru = tc_ext ? tc_ext->mru : 0; post_ct = tc_ext ? tc_ext->post_ct : false; + post_ct_snat = post_ct ? tc_ext->post_ct_snat : false; + post_ct_dnat = post_ct ? tc_ext->post_ct_dnat : false; zone = post_ct ? tc_ext->zone : 0; } else { key->recirc_id = 0; @@ -898,8 +900,16 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info, err = key_extract(skb, key); if (!err) { ovs_ct_fill_key(skb, key, post_ct); /* Must be after key_extract(). */ - if (post_ct && !skb_get_nfct(skb)) - key->ct_zone = zone; + if (post_ct) { + if (!skb_get_nfct(skb)) { + key->ct_zone = zone; + } else { + if (!post_ct_dnat) + key->ct_state &= ~OVS_CS_F_DST_NAT; + if (!post_ct_snat) + key->ct_state &= ~OVS_CS_F_SRC_NAT; + } + } } return err; } diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 796089c..8a78cbd 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -832,6 +832,12 @@ static int ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct, } err = nf_nat_packet(ct, ctinfo, hooknum, skb); + if (err == NF_ACCEPT) { + if (maniptype == NF_NAT_MANIP_SRC) + tc_skb_cb(skb)->post_ct_snat = 1; + if (maniptype == NF_NAT_MANIP_DST) + tc_skb_cb(skb)->post_ct_dnat = 1; + } out: return err; } diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 5df4155..aecc4f0 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1683,6 +1683,8 @@ int tcf_classify_ingress(struct sk_buff *skb, ext->chain = last_executed_chain; ext->mru = cb->mru; ext->post_ct = cb->post_ct; + ext->post_ct_snat = cb->post_ct_snat; + ext->post_ct_dnat = cb->post_ct_dnat; ext->zone = cb->zone; }