From patchwork Fri Jan 25 02:32:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Leitner X-Patchwork-Id: 1030779 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43m33Y0WDdz9s7T for ; Fri, 25 Jan 2019 13:33:13 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728627AbfAYCdL (ORCPT ); Thu, 24 Jan 2019 21:33:11 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46620 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728575AbfAYCdJ (ORCPT ); Thu, 24 Jan 2019 21:33:09 -0500 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 16E3CAC5EB; Fri, 25 Jan 2019 02:33:09 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-7.gru2.redhat.com [10.97.116.7]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BCA7E1690C; Fri, 25 Jan 2019 02:33:04 +0000 (UTC) Received: by localhost.localdomain (Postfix, from userid 1000) id E6A49180BE2; Fri, 25 Jan 2019 00:33:02 -0200 (-02) From: Marcelo Ricardo Leitner To: Guy Shattah , Marcelo Leitner , Aaron Conole , John Hurley , Simon Horman , Justin Pettit , Gregory Rose , Eelco Chaudron , Flavio Leitner , Florian Westphal , Jiri Pirko , Rashid Khan , Sushil Kulkarni , Andy Gospodarek , Roi Dayan , Yossi Kuperman , Or Gerlitz , 
Rony Efraim , "davem@davemloft.net" Cc: netdev@vger.kernel.org Subject: [RFC PATCH 1/6] flow_dissector: add support for matching on ConnTrack Date: Fri, 25 Jan 2019 00:32:30 -0200 Message-Id: <144ea7746432c8177a1b9a62db98e7b3ada3c642.1548285996.git.mleitner@redhat.com> In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Fri, 25 Jan 2019 02:33:09 +0000 (UTC) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

This is a preliminary patch to add support in the flow dissector for matching on ConnTrack information.

Two FIXMEs are in place:
- Reusing nf_conn_labels may not be feasible, as we don't want to pull too much of ConnTrack into the flow dissector.
- CT may be there, but it may not be using labels. As hashing zeroes is different from not having the information, it should either be handled as a new dissector key or have an extra bit in flow_dissector_key_ct indicating its presence. Having it as an extra key may speed up searches that do not use labels.

Signed-off-by: Marcelo Ricardo Leitner --- include/net/flow_dissector.h | 17 ++++++++++++++ include/uapi/linux/netfilter/xt_connlabel.h | 5 +++++ net/core/flow_dissector.c | 25 +++++++++++++++++++++ 3 files changed, 47 insertions(+) diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index 6a4586dcdeded9b6cfe7d299d368b6a6ea6801cc..2b5a20a0f65c28d9907697ac3ef7e03d0e20209a 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -4,7 +4,9 @@ #include #include +#include #include +#include /** * struct flow_dissector_key_control: @@ -199,6 +201,19 @@ struct flow_dissector_key_ip { __u8 ttl; }; +/** + * struct flow_dissector_key_ct: + */ +struct flow_dissector_key_ct { + __u16 zone; + __u8 state; + __u32 mark; + /* FIXME: Use nf_conn_labels instead? 
But it pulls all netfilter */ +#define NF_CT_LABELS_MAX_SIZE ((XT_CONNLABEL_MAXBIT + 1) / BITS_PER_BYTE) + unsigned long label[NF_CT_LABELS_MAX_SIZE / sizeof(long)]; +#undef NF_CT_LABELS_MAX_SIZE +}; + enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */ FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */ @@ -224,6 +239,7 @@ enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_CVLAN, /* struct flow_dissector_key_flow_vlan */ FLOW_DISSECTOR_KEY_ENC_IP, /* struct flow_dissector_key_ip */ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */ + FLOW_DISSECTOR_KEY_CT, /* struct flow_dissector_key_ct */ FLOW_DISSECTOR_KEY_MAX, }; @@ -254,6 +270,7 @@ struct flow_keys { #define FLOW_KEYS_HASH_START_FIELD basic struct flow_dissector_key_basic basic; struct flow_dissector_key_tags tags; + struct flow_dissector_key_ct ct; struct flow_dissector_key_vlan vlan; struct flow_dissector_key_vlan cvlan; struct flow_dissector_key_keyid keyid; diff --git a/include/uapi/linux/netfilter/xt_connlabel.h b/include/uapi/linux/netfilter/xt_connlabel.h index 2312f0ec07b2791ffaece0a95eebaefa727f14be..20a1c1fe79a7676c4b9f8727c393443f2d545784 100644 --- a/include/uapi/linux/netfilter/xt_connlabel.h +++ b/include/uapi/linux/netfilter/xt_connlabel.h @@ -1,4 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _NF_CONNTRACK_CONNLABEL_H +#define _NF_CONNTRACK_CONNLABEL_H + #include #define XT_CONNLABEL_MAXBIT 127 @@ -11,3 +14,5 @@ struct xt_connlabel_mtinfo { __u16 bit; __u16 options; }; + +#endif diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 2e8d91e54179c32b41ab53b9fa424fe1691acde0..73336466423aedff9a6fc4724f6c6efe44d5225a 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -26,6 +26,8 @@ #include #include #include +#include +#include static DEFINE_MUTEX(flow_dissector_mutex); @@ -798,6 +800,29 @@ bool __skb_flow_dissect(const struct sk_buff *skb, } 
rcu_read_unlock(); + if (dissector_uses_key(flow_dissector, + FLOW_DISSECTOR_KEY_CT)) { + struct flow_dissector_key_ct *key_ct; + enum ip_conntrack_info ctinfo; + struct nf_conn_labels *labels; + struct nf_conn *ct; + + ct = nf_ct_get(skb, &ctinfo); + if (ct) { + key_ct = skb_flow_dissector_target(flow_dissector, + FLOW_DISSECTOR_KEY_CT, + target_container); + key_ct->zone = ct->zone.id; + key_ct->state = ctinfo; + key_ct->mark = ct->mark; + labels = nf_ct_labels_find(ct); + /* FIXME: should this be a new key then? */ + if (labels) + memcpy(key_ct->label, labels->bits, + sizeof(key_ct->label)); + } + } + if (dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) { struct ethhdr *eth = eth_hdr(skb); From patchwork Fri Jan 25 02:32:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Leitner X-Patchwork-Id: 1030778 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43m33X1jdjz9s7h for ; Fri, 25 Jan 2019 13:33:12 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728597AbfAYCdJ (ORCPT ); Thu, 24 Jan 2019 21:33:09 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46994 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728400AbfAYCdJ (ORCPT ); Thu, 24 Jan 2019 21:33:09 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No 
client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0E8CFC0C6C2F; Fri, 25 Jan 2019 02:33:08 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-7.gru2.redhat.com [10.97.116.7]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B480D7B539; Fri, 25 Jan 2019 02:33:04 +0000 (UTC) Received: by localhost.localdomain (Postfix, from userid 1000) id EBB20180CF5; Fri, 25 Jan 2019 00:33:02 -0200 (-02) From: Marcelo Ricardo Leitner To: Guy Shattah , Marcelo Leitner , Aaron Conole , John Hurley , Simon Horman , Justin Pettit , Gregory Rose , Eelco Chaudron , Flavio Leitner , Florian Westphal , Jiri Pirko , Rashid Khan , Sushil Kulkarni , Andy Gospodarek , Roi Dayan , Yossi Kuperman , Or Gerlitz , Rony Efraim , "davem@davemloft.net" Cc: netdev@vger.kernel.org Subject: [RFC PATCH 2/6] net/sched: flower: add support for matching on ConnTrack Date: Fri, 25 Jan 2019 00:32:31 -0200 Message-Id: <6c976cc538f1f565b74bd2c750639af91a93adc1.1548285996.git.mleitner@redhat.com> In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Fri, 25 Jan 2019 02:33:08 +0000 (UTC) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

Hook flower into the flow dissector's new ConnTrack interface added by the previous patch.
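
For reference, a minimal userspace sketch of the key/mask semantics these fields get, assuming they behave like the other flower keys: a field only takes part in the match where its mask bits are set. The sketch_* names are made up for illustration, labels are left out, and this is not how flower performs the lookup internally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct flow_dissector_key_ct
 * (labels left out for brevity).
 */
struct sketch_ct_key {
	uint16_t zone;
	uint8_t state;
	uint32_t mark;
};

/* True when pkt agrees with key on every bit selected by mask. */
static bool sketch_ct_match(const struct sketch_ct_key *pkt,
			    const struct sketch_ct_key *key,
			    const struct sketch_ct_key *mask)
{
	return !((pkt->zone ^ key->zone) & mask->zone) &&
	       !((pkt->state ^ key->state) & mask->state) &&
	       !((pkt->mark ^ key->mark) & mask->mark);
}
```

With a zone mask of 0xffff and a mark mask of 0xf0, only the zone and the upper nibble of the low mark byte are compared; state is ignored because its mask is zero.
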
Signed-off-by: Marcelo Ricardo Leitner --- include/uapi/linux/pkt_cls.h | 9 +++++++++ net/sched/cls_flower.c | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 95d0db2a8350dffb1dd20816591f3b179913fb2e..ba1f3bc01b2fdfd810e37a2b3853a1da1f838acf 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -490,6 +490,15 @@ enum { TCA_FLOWER_KEY_PORT_DST_MIN, /* be16 */ TCA_FLOWER_KEY_PORT_DST_MAX, /* be16 */ + TCA_FLOWER_KEY_CT_ZONE, /* u16 */ + TCA_FLOWER_KEY_CT_ZONE_MASK, /* u16 */ + TCA_FLOWER_KEY_CT_STATE, /* u8 */ + TCA_FLOWER_KEY_CT_STATE_MASK, /* u8 */ + TCA_FLOWER_KEY_CT_MARK, /* u32 */ + TCA_FLOWER_KEY_CT_MARK_MASK, /* u32 */ + TCA_FLOWER_KEY_CT_LABEL, /* 128 bits */ + TCA_FLOWER_KEY_CT_LABEL_MASK, /* 128 bits */ + __TCA_FLOWER_MAX, }; diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 85e9f8e1da10aa7b01b0f51768edfefbe63d6a10..430b7fceeca0998b8c904acd91f8de53571814ff 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -57,6 +57,7 @@ struct fl_flow_key { struct flow_dissector_key_enc_opts enc_opts; struct flow_dissector_key_ports tp_min; struct flow_dissector_key_ports tp_max; + struct flow_dissector_key_ct ct; } __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. 
*/ struct fl_flow_mask_range { @@ -1079,6 +1080,22 @@ static int fl_set_key(struct net *net, struct nlattr **tb, fl_set_key_ip(tb, true, &key->enc_ip, &mask->enc_ip); + fl_set_key_val(tb, &key->ct.mark, TCA_FLOWER_KEY_CT_MARK, + &mask->ct.mark, TCA_FLOWER_KEY_CT_MARK_MASK, + sizeof(key->ct.mark)); + + fl_set_key_val(tb, &key->ct.zone, TCA_FLOWER_KEY_CT_ZONE, + &mask->ct.zone, TCA_FLOWER_KEY_CT_ZONE_MASK, + sizeof(key->ct.zone)); + + fl_set_key_val(tb, &key->ct.state, TCA_FLOWER_KEY_CT_STATE, + &mask->ct.state, TCA_FLOWER_KEY_CT_STATE_MASK, + sizeof(key->ct.state)); + + fl_set_key_val(tb, &key->ct.label, TCA_FLOWER_KEY_CT_LABEL, + &mask->ct.label, TCA_FLOWER_KEY_CT_LABEL_MASK, + sizeof(key->ct.label)); + if (tb[TCA_FLOWER_KEY_ENC_OPTS]) { ret = fl_set_enc_opt(tb, key, mask, extack); if (ret) @@ -1183,6 +1200,8 @@ static void fl_init_dissector(struct flow_dissector *dissector, FLOW_DISSECTOR_KEY_ENC_IP, enc_ip); FL_KEY_SET_IF_MASKED(mask, keys, cnt, FLOW_DISSECTOR_KEY_ENC_OPTS, enc_opts); + FL_KEY_SET_IF_MASKED(mask, keys, cnt, + FLOW_DISSECTOR_KEY_CT, ct); skb_flow_dissector_init(dissector, keys, cnt); } @@ -1994,6 +2013,20 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, fl_dump_key_enc_opt(skb, &key->enc_opts, &mask->enc_opts)) goto nla_put_failure; + if (fl_dump_key_val(skb, &key->ct.zone, TCA_FLOWER_KEY_CT_ZONE, + &mask->ct.zone, TCA_FLOWER_KEY_CT_ZONE_MASK, + sizeof(key->ct.zone)) || + fl_dump_key_val(skb, &key->ct.mark, TCA_FLOWER_KEY_CT_MARK, + &mask->ct.mark, TCA_FLOWER_KEY_CT_MARK_MASK, + sizeof(key->ct.mark)) || + fl_dump_key_val(skb, &key->ct.state, TCA_FLOWER_KEY_CT_STATE, + &mask->ct.state, TCA_FLOWER_KEY_CT_STATE_MASK, + sizeof(key->ct.state)) || + fl_dump_key_val(skb, &key->ct.label, TCA_FLOWER_KEY_CT_LABEL, + &mask->ct.label, TCA_FLOWER_KEY_CT_LABEL_MASK, + sizeof(key->ct.label))) + goto nla_put_failure; + if (fl_dump_key_flags(skb, key->control.flags, mask->control.flags)) goto nla_put_failure; From patchwork Fri Jan 25 02:32:32 2019 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Leitner X-Patchwork-Id: 1030780 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43m33c0JfPz9s7T for ; Fri, 25 Jan 2019 13:33:16 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728770AbfAYCdP (ORCPT ); Thu, 24 Jan 2019 21:33:15 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47042 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728400AbfAYCdK (ORCPT ); Thu, 24 Jan 2019 21:33:10 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D423BC072253; Fri, 25 Jan 2019 02:33:09 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-7.gru2.redhat.com [10.97.116.7]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BD3A977A1C; Fri, 25 Jan 2019 02:33:04 +0000 (UTC) Received: by localhost.localdomain (Postfix, from userid 1000) id 04C8D180CF7; Fri, 25 Jan 2019 00:33:03 -0200 (-02) From: Marcelo Ricardo Leitner To: Guy Shattah , Marcelo Leitner , Aaron Conole , John Hurley , Simon Horman , Justin Pettit , Gregory Rose , Eelco Chaudron , Flavio Leitner , Florian Westphal , Jiri Pirko , Rashid Khan , Sushil Kulkarni , Andy Gospodarek , Roi Dayan , Yossi Kuperman , Or Gerlitz , Rony Efraim , "davem@davemloft.net" Cc: 
netdev@vger.kernel.org Subject: [RFC PATCH 3/6] net/sched: add CT action Date: Fri, 25 Jan 2019 00:32:32 -0200 Message-Id: <1ec3d8c3ec1256ae6cca2b498caac642c1ce09f0.1548285996.git.mleitner@redhat.com> In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Fri, 25 Jan 2019 02:33:10 +0000 (UTC) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

This is where most of the code is, along with the main pain points. The implementation uses a spinlock on the datapath for now, just for simplicity. Let's get the basics done and then move forward.

Open points:
- nf_ct_netns_get() accepts IPv4, IPv6 or both. It would be interesting to match on what was specified to the filter, but it is not clear whether that is really wanted, nor how to do it.
- The iptables CT target can set a different zone for each direction and also infer it from the mark. These are NOT used by OvS. We can focus on this later, but probably want to consider the need now.
- Datapath fork: as described in the planning RFC PATCH, the OvS ct action creates a fork in the datapath. Consider that the packet is being sent through conntrack. The original packet, without conntrack information, will first finish executing the current list of actions. After that is done, a packet clone created by the ct action will be inserted into the specified chain and resume its processing. Somehow we need to be able to inject this packet, and we can't use the interface backlog for it, as that would create massive reordering.
- Handling multiple calls to the CT action is needed because the first call may be on a packet that still has tunnel headers, and a later one on the packet without them. It is handled in a subsequent patch by dropping any conntrack present in the skb.

On the protocol type on the datapath, note that tc can match on both IPv4 and IPv6 at once.
So far we can't easily tell which filter tc is using. We could tell conntrack to work with both (NFPROTO_INET), but that would be kind of a lazy solution here. Instead, let's trust the packet header: if the packet got here, it is because tc matched it, so we can process it as either IPv4 or IPv6.

Signed-off-by: Marcelo Ricardo Leitner --- include/net/tc_act/tc_ct.h | 29 +++ include/uapi/linux/tc_act/tc_ct.h | 36 +++ net/sched/Kconfig | 6 + net/sched/Makefile | 1 + net/sched/act_ct.c | 356 ++++++++++++++++++++++++++++++ 5 files changed, 428 insertions(+) create mode 100644 include/net/tc_act/tc_ct.h create mode 100644 include/uapi/linux/tc_act/tc_ct.h create mode 100644 net/sched/act_ct.c diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h new file mode 100644 index 0000000000000000000000000000000000000000..65682460f501b5886d9266f811c8ed30a4510304 --- /dev/null +++ b/include/net/tc_act/tc_ct.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __NET_TC_CT_H +#define __NET_TC_CT_H + +#include +#include +#include + +struct tcf_ct { + struct tc_action common; + struct net *net; + + u16 zone; + u32 mark; + u32 mark_mask; + u32 chain; + + /* FIXME: Use nf_conn_labels instead? 
But it pulls all netfilter */ +#define NF_CT_LABELS_MAX_SIZE ((XT_CONNLABEL_MAXBIT + 1) / BITS_PER_BYTE) + u32 label[NF_CT_LABELS_MAX_SIZE / sizeof(u32)]; + u32 label_mask[NF_CT_LABELS_MAX_SIZE / sizeof(u32)]; + + u32 flags; + struct nf_conn *ct; +}; + +#define to_tcf_ct(a) ((struct tcf_ct *)a) + +#endif /* __NET_TC_CT_H */ diff --git a/include/uapi/linux/tc_act/tc_ct.h b/include/uapi/linux/tc_act/tc_ct.h new file mode 100644 index 0000000000000000000000000000000000000000..37b95cda1dedd283b0244a03a20860ba22966dfa --- /dev/null +++ b/include/uapi/linux/tc_act/tc_ct.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef __LINUX_TC_CT_H +#define __LINUX_TC_CT_H + +#include +#include + +#define TCA_ACT_CT 27 + +enum { + TCA_CT_UNSPEC, + TCA_CT_TM, + TCA_CT_PARMS, + TCA_CT_PAD, + TCA_CT_ZONE, + TCA_CT_MARK, + TCA_CT_MARK_MASK, + TCA_CT_LABEL, + TCA_CT_LABEL_MASK, + TCA_CT_CHAIN, + TCA_CT_FLAGS, + __TCA_CT_MAX +}; +#define TCA_CT_MAX (__TCA_CT_MAX - 1) + +enum { + TC_CT_COMMIT, + __TC_CT_MAX +}; +#define TC_CT_MAX (__TC_CT_MAX - 1) + +struct tc_ct { + tc_gen; +}; + +#endif diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 1b9afdee5ba976ba64200d8f85050cf053b7d65c..2c7f963b78f7511bbee8814b1c5bfdb488386c5d 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -912,6 +912,12 @@ config NET_ACT_TUNNEL_KEY To compile this code as a module, choose M here: the module will be called act_tunnel_key. 
+config NET_ACT_CT + tristate "Conntrack manipulation" + depends on NET_CLS_ACT + ---help--- + FIXME + config NET_IFE_SKBMARK tristate "Support to encoding decoding skb mark on IFE action" depends on NET_ACT_IFE diff --git a/net/sched/Makefile b/net/sched/Makefile index 8a40431d7b5c420d86427933a9af383e093812b7..f2f6db5b8352a9594b72bc6197caf2228b45c079 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -23,6 +23,7 @@ obj-$(CONFIG_NET_ACT_BPF) += act_bpf.o obj-$(CONFIG_NET_ACT_CONNMARK) += act_connmark.o obj-$(CONFIG_NET_ACT_SKBMOD) += act_skbmod.o obj-$(CONFIG_NET_ACT_IFE) += act_ife.o +obj-$(CONFIG_NET_ACT_CT) += act_ct.o obj-$(CONFIG_NET_IFE_SKBMARK) += act_meta_mark.o obj-$(CONFIG_NET_IFE_SKBPRIO) += act_meta_skbprio.o obj-$(CONFIG_NET_IFE_SKBTCINDEX) += act_meta_skbtcindex.o diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c new file mode 100644 index 0000000000000000000000000000000000000000..f69509954149a0c8be710916a5289a4448049b5d --- /dev/null +++ b/net/sched/act_ct.c @@ -0,0 +1,356 @@ +/* + * Conntrack manipulation + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static unsigned int ct_net_id; +static struct tc_action_ops act_ct_ops; + +static const struct nla_policy ct_policy[TCA_CT_MAX + 1] = { + [TCA_CT_PARMS] = { .len = sizeof(struct tc_ct) }, + [TCA_CT_ZONE] = { .type = NLA_U16 }, + [TCA_CT_MARK] = { .type = NLA_U32 }, + [TCA_CT_MARK_MASK] = { .type = NLA_U32 }, + [TCA_CT_LABEL] = { .type = NLA_BINARY, + .len = 128/BITS_PER_BYTE }, + [TCA_CT_LABEL_MASK] = { .type = NLA_BINARY, + .len = 128/BITS_PER_BYTE }, + [TCA_CT_CHAIN] = { .type = NLA_U32 }, + [TCA_CT_FLAGS] = { .type = NLA_U32 }, +}; + +static int tcf_ct_init(struct net *net, struct nlattr *nla, struct nlattr *est, + struct tc_action **a, int ovr, int bind, + bool rtnl_held, struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, ct_net_id); + struct nlattr *tb[TCA_CT_MAX + 1]; + struct nf_conntrack_zone zone; + struct nf_conn *ct, *_ct; + struct tc_ct *parm; + int ret = 0, err; + struct tcf_ct *p; + u16 zone_id = NF_CT_DEFAULT_ZONE_ID; + + if (!nla) + return -EINVAL; + + err = nla_parse_nested(tb, TCA_CT_MAX, nla, ct_policy, NULL); + if (err < 0) + return err; + + if (!tb[TCA_CT_PARMS]) + return -EINVAL; + parm = nla_data(tb[TCA_CT_PARMS]); + + if (tb[TCA_CT_ZONE]) + zone_id = nla_get_u16(tb[TCA_CT_ZONE]); + + err = tcf_idr_check_alloc(tn, &parm->index, a, bind); + if (!err) { + ret = tcf_idr_create(tn, parm->index, est, a, + &act_ct_ops, bind, false); + if (ret) { + tcf_idr_cleanup(tn, parm->index); + return ret; + } + ret = ACT_P_CREATED; + } else if (err > 0) { + if (bind) + return 0; + if (!ovr) { + ret = -EEXIST; + goto err1; + } + } else { + return err; + } + + /* XXX Need translation from AF_INET to NFPROTO_ */ + err = nf_ct_netns_get(net, NFPROTO_IPV4 /* XXX par->family */); + if (err < 0) { + ret = err; + goto err1; + } + + /* XXX: CT target supports setting a 
different zone on each direction */ + /* XXX: CT supports inferring zone id from the mark, but we probably + * don't need that here. + if (info->flags & XT_CT_ZONE_MARK) + zone.flags |= NF_CT_FLAG_MARK; + */ + nf_ct_zone_init(&zone, zone_id, NF_CT_DEFAULT_ZONE_DIR, 0); + + ct = nf_ct_tmpl_alloc(net, &zone, GFP_KERNEL); + if (!ct) { + ret = -ENOMEM; + goto err1; + } + + __set_bit(IPS_CONFIRMED_BIT, &ct->status); + nf_conntrack_get(&ct->ct_general); + + p = to_tcf_ct(*a); + spin_lock_bh(&p->tcf_lock); + p->zone = zone_id; + if (tb[TCA_CT_MARK] && tb[TCA_CT_MARK_MASK]) { + p->mark = nla_get_u32(tb[TCA_CT_MARK]); + p->mark_mask = nla_get_u32(tb[TCA_CT_MARK_MASK]); + } + if (tb[TCA_CT_LABEL] && tb[TCA_CT_LABEL_MASK]) { + nla_memcpy(p->label, tb[TCA_CT_LABEL], sizeof(p->label)); + nla_memcpy(p->label_mask, tb[TCA_CT_LABEL_MASK], + sizeof(p->label_mask)); + nf_connlabels_replace(ct, p->label, p->label_mask, + sizeof(p->label)/sizeof(u32)); + } + if (tb[TCA_CT_CHAIN]) + p->chain = nla_get_u32(tb[TCA_CT_CHAIN]); + if (tb[TCA_CT_FLAGS]) + p->flags = nla_get_u32(tb[TCA_CT_FLAGS]); + p->net = net; + + p->tcf_action = parm->action; + + _ct = p->ct; + p->ct = ct; + + spin_unlock_bh(&p->tcf_lock); + + if (_ct) { + nf_conntrack_put(&_ct->ct_general); + } + + if (ret == ACT_P_CREATED) + tcf_idr_insert(tn, *a); + + return ret; + +err1: + tcf_idr_release(*a, bind); + return ret; +} + +static void tcf_ct_cleanup(struct tc_action *a) +{ + struct tcf_ct *p = to_tcf_ct(a); + + if (p->ct) { + nf_conntrack_put(&p->ct->ct_general); + } +} + +static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, + struct tcf_result *res) +{ + struct tcf_ct *p = to_tcf_ct(a); + struct nf_hook_state state = { + .hook = NF_INET_PRE_ROUTING, + }; + struct nf_conn *ct, *new_ct; + u32 mark, mark_mask, flags; + int action, err; + int nh_ofs; + + spin_lock(&p->tcf_lock); + + tcf_lastuse_update(&p->tcf_tm); + mark = p->mark; + mark_mask = p->mark_mask; + flags = p->flags; + state.net = p->net; + 
action = p->tcf_action; + ct = p->ct; + if (ct) + /* This gets transferred to conntrack */ + nf_conntrack_get(&ct->ct_general); + + bstats_update(&p->tcf_bstats, skb); + + spin_unlock(&p->tcf_lock); + + if (unlikely(action == TC_ACT_SHOT)) + goto drop; + + /* FIXME: For when we support cloning the packet + orig_skb = skb; + skb = skb_clone(orig_skb, GFP_ATOMIC); + */ + + /* The conntrack module expects to be working at L3. */ + nh_ofs = skb_network_offset(skb); + skb_pull_rcsum(skb, nh_ofs); + /* FIXME: OvS trims the packet here. Should we? */ + + /* FIXME: Need to handle multiple calls to CT action here. */ + if (ct) + nf_ct_set(skb, ct, IP_CT_NEW); + + if (skb->protocol == htons(ETH_P_IPV6)) { + state.pf = NFPROTO_IPV6; + } else { + /* FIXME: should we restrict this even further? */ + state.pf = NFPROTO_IPV4; + } + + err = nf_conntrack_in(skb, &state); + if (err != NF_ACCEPT) + goto drop; + + new_ct = (struct nf_conn *)skb_nfct(skb); + if (new_ct) { + if (mark_mask) { + new_ct->mark = (new_ct->mark &~ mark_mask) | (mark & mark_mask); + if (nf_ct_is_confirmed(new_ct)) + nf_conntrack_event_cache(IPCT_MARK, new_ct); + } + + nf_ct_deliver_cached_events(new_ct); + } + + if (flags & BIT(TC_CT_COMMIT)) { + err = nf_conntrack_confirm(skb); + if (err != NF_ACCEPT) { + printk("failed to confirm %d\n", err); + goto drop; + } + } + + /* FIXME: inject the packet into another chain (as it would happen if + * it had a miss in hw too) + */ + + skb_push(skb, nh_ofs); + skb_postpush_rcsum(skb, skb->data, nh_ofs); + return TC_ACT_PIPE; + +drop: + spin_lock(&p->tcf_lock); + p->tcf_qstats.drops++; + spin_unlock(&p->tcf_lock); + return TC_ACT_SHOT; +} + +static int tcf_ct_dump(struct sk_buff *skb, struct tc_action *a, + int bind, int ref) +{ + unsigned char *b = skb_tail_pointer(skb); + struct tcf_ct *p = to_tcf_ct(a); + struct tc_ct opt = { + .index = p->tcf_index, + .refcnt = refcount_read(&p->tcf_refcnt) - ref, + .bindcnt = atomic_read(&p->tcf_bindcnt) - bind, + }; + struct tcf_t 
t; + + spin_lock_bh(&p->tcf_lock); + nla_put_u16(skb, TCA_CT_ZONE, p->zone); + nla_put_u32(skb, TCA_CT_MARK, p->mark); + nla_put_u32(skb, TCA_CT_MARK_MASK, p->mark_mask); + nla_put_u32(skb, TCA_CT_CHAIN, p->chain); + nla_put(skb, TCA_CT_LABEL, sizeof(p->label), p->label); + nla_put(skb, TCA_CT_LABEL_MASK, sizeof(p->label_mask), p->label_mask); + nla_put_u32(skb, TCA_CT_FLAGS, p->flags); + opt.action = p->tcf_action; + + if (nla_put(skb, TCA_CT_PARMS, sizeof(opt), &opt)) + goto nla_put_failure; + + tcf_tm_dump(&t, &p->tcf_tm); + if (nla_put_64bit(skb, TCA_CT_TM, sizeof(t), &t, TCA_CT_PAD)) + goto nla_put_failure; + spin_unlock_bh(&p->tcf_lock); + + return skb->len; + +nla_put_failure: + spin_unlock_bh(&p->tcf_lock); + nlmsg_trim(skb, b); + return -1; +} + +static int tcf_ct_walker(struct net *net, struct sk_buff *skb, + struct netlink_callback *cb, int type, + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, ct_net_id); + + return tcf_generic_walker(tn, skb, cb, type, ops, extack); +} + +static int tcf_ct_search(struct net *net, struct tc_action **a, u32 index) +{ + struct tc_action_net *tn = net_generic(net, ct_net_id); + + return tcf_idr_search(tn, a, index); +} + +static struct tc_action_ops act_ct_ops = { + .kind = "ct", + .type = TCA_ACT_CT, + .owner = THIS_MODULE, + .act = tcf_ct_act, + .dump = tcf_ct_dump, + .init = tcf_ct_init, + .cleanup = tcf_ct_cleanup, + .walk = tcf_ct_walker, + .lookup = tcf_ct_search, + .size = sizeof(struct tcf_ct), +}; + +static __net_init int ct_init_net(struct net *net) +{ + struct tc_action_net *tn = net_generic(net, ct_net_id); + + return tc_action_net_init(tn, &act_ct_ops); +} + +static void __net_exit ct_exit_net(struct list_head *net_list) +{ + tc_action_net_exit(net_list, ct_net_id); +} + +static struct pernet_operations ct_net_ops = { + .init = ct_init_net, + .exit_batch = ct_exit_net, + .id = &ct_net_id, + .size = sizeof(struct tc_action_net), +}; + 
+MODULE_DESCRIPTION("Connection Tracking actions"); +MODULE_LICENSE("GPL"); + +static int __init ct_init_module(void) +{ + return tcf_register_action(&act_ct_ops, &ct_net_ops); +} + +static void __exit ct_cleanup_module(void) +{ + tcf_unregister_action(&act_ct_ops, &ct_net_ops); +} + +module_init(ct_init_module); +module_exit(ct_cleanup_module); From patchwork Fri Jan 25 02:32:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Leitner X-Patchwork-Id: 1030783 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43m33w0Zw0z9s7T for ; Fri, 25 Jan 2019 13:33:32 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728199AbfAYCdJ (ORCPT ); Thu, 24 Jan 2019 21:33:09 -0500 Received: from mx1.redhat.com ([209.132.183.28]:36550 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728371AbfAYCdI (ORCPT ); Thu, 24 Jan 2019 21:33:08 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 134FC89C44; Fri, 25 Jan 2019 02:33:07 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-7.gru2.redhat.com [10.97.116.7]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BCCEE5EDE1; Fri, 25 Jan 2019 02:33:04 +0000 (UTC) Received: by localhost.localdomain (Postfix, from userid 1000) id 
0C5E9180CF6; Fri, 25 Jan 2019 00:33:03 -0200 (-02) From: Marcelo Ricardo Leitner To: Guy Shattah , Marcelo Leitner , Aaron Conole , John Hurley , Simon Horman , Justin Pettit , Gregory Rose , Eelco Chaudron , Flavio Leitner , Florian Westphal , Jiri Pirko , Rashid Khan , Sushil Kulkarni , Andy Gospodarek , Roi Dayan , Yossi Kuperman , Or Gerlitz , Rony Efraim , "davem@davemloft.net" Cc: netdev@vger.kernel.org Subject: [RFC PATCH 4/6] net/sched: act_ct: add support for force flag Date: Fri, 25 Jan 2019 00:32:33 -0200 Message-Id: In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 25 Jan 2019 02:33:08 +0000 (UTC) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

The OvS ct action has a 'force' flag, which forces ConnTrack to treat this packet, in this specific direction, as the original direction of the connection. Implement that similarly: if the ct entry is there and the direction is not the expected one, destroy it and create a new one.
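
A minimal userspace sketch of that decision, for illustration only. It is not the kernel code: the sketch_* names are made up here, and the real action works on struct nf_conn and the ctinfo returned by nf_ct_get().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the conntrack direction of a packet. */
enum sketch_dir { SKETCH_DIR_ORIGINAL, SKETCH_DIR_REPLY };

/* Hypothetical summary of what the conntrack lookup returned. */
struct sketch_entry {
	bool present;		/* a conntrack entry was found for the packet */
	bool confirmed;		/* the entry already sits in the conntrack table */
	enum sketch_dir dir;	/* direction in which the packet matched */
};

enum sketch_verdict {
	SKETCH_KEEP,		/* use the entry as it is */
	SKETCH_DROP_REF_RETRY,	/* unconfirmed: drop our reference, look up again */
	SKETCH_DELETE_RETRY,	/* confirmed: delete from the table, look up again */
};

/* Decide what to do with the entry when the force flag may be set. */
static enum sketch_verdict sketch_force(bool force, const struct sketch_entry *e)
{
	if (!force || !e->present || e->dir == SKETCH_DIR_ORIGINAL)
		return SKETCH_KEEP;

	return e->confirmed ? SKETCH_DELETE_RETRY : SKETCH_DROP_REF_RETRY;
}
```

An entry seen in the reply direction is discarded when force is set. A confirmed one also has to be deleted from the conntrack table first, which is why the two retry verdicts are kept separate.
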
Signed-off-by: Marcelo Ricardo Leitner --- include/uapi/linux/tc_act/tc_ct.h | 1 + net/sched/act_ct.c | 16 +++++++++++++++- 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/tc_act/tc_ct.h b/include/uapi/linux/tc_act/tc_ct.h index 37b95cda1dedd283b0244a03a20860ba22966dfa..009e53ee83fb3125bc5c4ca86954af3bf6a0287a 100644 --- a/include/uapi/linux/tc_act/tc_ct.h +++ b/include/uapi/linux/tc_act/tc_ct.h @@ -25,6 +25,7 @@ enum { enum { TC_CT_COMMIT, + TC_CT_FORCE, __TC_CT_MAX }; #define TC_CT_MAX (__TC_CT_MAX - 1) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index f69509954149a0c8be710916a5289a4448049b5d..8a1b5d6a7cd8360c50011d992368464db213a020 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -165,6 +165,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, struct tcf_result *res) { struct tcf_ct *p = to_tcf_ct(a); + enum ip_conntrack_info ctinfo; struct nf_hook_state state = { .hook = NF_INET_PRE_ROUTING, }; @@ -173,6 +174,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, int action, err; int nh_ofs; + /* Again needs to be here because we need a new ref on the ct. */ +again: spin_lock(&p->tcf_lock); tcf_lastuse_update(&p->tcf_tm); @@ -218,8 +221,19 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, if (err != NF_ACCEPT) goto drop; - new_ct = (struct nf_conn *)skb_nfct(skb); + new_ct = nf_ct_get(skb, &ctinfo); if (new_ct) { + /* Force conntrack entry direction. 
*/ + if (flags & BIT(TC_CT_FORCE) && + CTINFO2DIR(ctinfo) != IP_CT_DIR_ORIGINAL) { + if (nf_ct_is_confirmed(new_ct)) + nf_ct_delete(new_ct, 0, 0); + + nf_conntrack_put(&new_ct->ct_general); + nf_ct_set(skb, NULL, 0); + goto again; + } + if (mark_mask) { new_ct->mark = (new_ct->mark &~ mark_mask) | (mark & mark_mask); if (nf_ct_is_confirmed(new_ct)) From patchwork Fri Jan 25 02:32:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Leitner X-Patchwork-Id: 1030781 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43m33j579Fz9s7h for ; Fri, 25 Jan 2019 13:33:21 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728739AbfAYCdO (ORCPT ); Thu, 24 Jan 2019 21:33:14 -0500 Received: from mx1.redhat.com ([209.132.183.28]:44812 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728631AbfAYCdM (ORCPT ); Thu, 24 Jan 2019 21:33:12 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4EAD9138209; Fri, 25 Jan 2019 02:33:12 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-7.gru2.redhat.com [10.97.116.7]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 554EF5EDE0; Fri, 25 Jan 2019 02:33:09 +0000 (UTC) Received: by localhost.localdomain (Postfix, from userid 1000) id 
18F71180D00; Fri, 25 Jan 2019 00:33:03 -0200 (-02) From: Marcelo Ricardo Leitner To: Guy Shattah, Marcelo Leitner, Aaron Conole, John Hurley, Simon Horman, Justin Pettit, Gregory Rose, Eelco Chaudron, Flavio Leitner, Florian Westphal, Jiri Pirko, Rashid Khan, Sushil Kulkarni, Andy Gospodarek, Roi Dayan, Yossi Kuperman, Or Gerlitz, Rony Efraim, "davem@davemloft.net" Cc: netdev@vger.kernel.org Subject: [RFC PATCH 5/6] net/sched: act_ct: add support for clear flag Date: Fri, 25 Jan 2019 00:32:34 -0200 Message-Id: In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 25 Jan 2019 02:33:12 +0000 (UTC) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

The OvS ct action supports a 'clear' flag, which removes any ConnTrack marking from the packet. Implement it similarly here: drop the ct reference and return. Note that the packet is also marked as UNTRACKED. Parsing should ensure that clear is not combined with any other flag, as they are mutually exclusive.
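A parse-time check along the following lines could enforce that. This is only a sketch under assumptions: ct_flags_validate() and CT_BIT() are hypothetical names, the flags are assumed to be carried as a 32-bit mask with one bit per enum value (as the BIT(TC_CT_*) tests in tcf_ct_act() suggest), and nothing like this is part of the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Same values as the flag enum in tc_ct.h after this patch. */
enum {
	TC_CT_COMMIT,
	TC_CT_FORCE,
	TC_CT_CLEAR,
	__TC_CT_MAX
};

/* Hypothetical user-space equivalent of the kernel's BIT(). */
#define CT_BIT(nr)	(UINT32_C(1) << (nr))

/*
 * Hypothetical parse-time check (not part of this patch): reject
 * unknown bits, and reject clear combined with any other flag.
 * Returns 0 when the mask is acceptable, -EINVAL otherwise.
 */
int ct_flags_validate(uint32_t flags)
{
	const uint32_t known = CT_BIT(__TC_CT_MAX) - 1;

	if (flags & ~known)
		return -EINVAL;
	if ((flags & CT_BIT(TC_CT_CLEAR)) && (flags & ~CT_BIT(TC_CT_CLEAR)))
		return -EINVAL;
	return 0;
}
```

With this, commit and force may still be combined, clear is accepted only on its own, and any bit beyond the known flags is refused.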
Signed-off-by: Marcelo Ricardo Leitner --- include/uapi/linux/tc_act/tc_ct.h | 1 + net/sched/act_ct.c | 13 +++++++++++++ 2 files changed, 14 insertions(+) diff --git a/include/uapi/linux/tc_act/tc_ct.h b/include/uapi/linux/tc_act/tc_ct.h index 009e53ee83fb3125bc5c4ca86954af3bf6a0287a..636f435b86e006aa36034f86c65fd5c220ca8a13 100644 --- a/include/uapi/linux/tc_act/tc_ct.h +++ b/include/uapi/linux/tc_act/tc_ct.h @@ -26,6 +26,7 @@ enum { enum { TC_CT_COMMIT, TC_CT_FORCE, + TC_CT_CLEAR, __TC_CT_MAX }; #define TC_CT_MAX (__TC_CT_MAX - 1) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 8a1b5d6a7cd8360c50011d992368464db213a020..77d55c05ed95d8abc8c35a3d19f453a586139914 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -196,6 +196,18 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, if (unlikely(action == TC_ACT_SHOT)) goto drop; + if (flags & BIT(TC_CT_CLEAR)) { + new_ct = nf_ct_get(skb, &ctinfo); + if (new_ct) { + if (nf_ct_is_confirmed(new_ct)) + nf_ct_delete(new_ct, 0, 0); + + nf_conntrack_put(&new_ct->ct_general); + nf_ct_set(skb, NULL, IP_CT_UNTRACKED); + goto out; + } + } + /* FIXME: For when we support cloning the packet orig_skb = skb; skb = skb_clone(orig_skb, GFP_ATOMIC); @@ -257,6 +269,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, skb_push(skb, nh_ofs); skb_postpush_rcsum(skb, skb->data, nh_ofs); +out: return TC_ACT_PIPE; drop: From patchwork Fri Jan 25 02:32:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marcelo Leitner X-Patchwork-Id: 1030782 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: 
ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43m33n587hz9s7h for ; Fri, 25 Jan 2019 13:33:25 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728697AbfAYCdO (ORCPT ); Thu, 24 Jan 2019 21:33:14 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41326 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728575AbfAYCdM (ORCPT ); Thu, 24 Jan 2019 21:33:12 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 548EFC7C1A; Fri, 25 Jan 2019 02:33:12 +0000 (UTC) Received: from localhost.localdomain (ovpn-116-7.gru2.redhat.com [10.97.116.7]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 589654C5; Fri, 25 Jan 2019 02:33:09 +0000 (UTC) Received: by localhost.localdomain (Postfix, from userid 1000) id 1EFD2180CFC; Fri, 25 Jan 2019 00:33:03 -0200 (-02) From: Marcelo Ricardo Leitner To: Guy Shattah, Marcelo Leitner, Aaron Conole, John Hurley, Simon Horman, Justin Pettit, Gregory Rose, Eelco Chaudron, Flavio Leitner, Florian Westphal, Jiri Pirko, Rashid Khan, Sushil Kulkarni, Andy Gospodarek, Roi Dayan, Yossi Kuperman, Or Gerlitz, Rony Efraim, "davem@davemloft.net" Cc: netdev@vger.kernel.org Subject: [RFC PATCH 6/6] net/sched: act_ct: allow sending a packet through conntrack multiple times Date: Fri, 25 Jan 2019 00:32:35 -0200 Message-Id: In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 25 Jan 2019 02:33:12 +0000 (UTC) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

A packet may go through conntrack more than once: the first pass may track the tunnel information, after which the packet jumps to another chain and goes through conntrack again so that the inner header is tracked. This commit clears any previous conntrack info so that the packet can be submitted to conntrack again. Header offsets are expected to be updated by the decapsulating action. The main difference from just adding another act_ct(clear) action is that the clear flag also sets the UNTRACKED mark in the packet (like OvS does).
Signed-off-by: Marcelo Ricardo Leitner --- net/sched/act_ct.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 77d55c05ed95d8abc8c35a3d19f453a586139914..6e446db3bcdda772dbe1090d5c584156f6cc59eb 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -196,16 +196,19 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, if (unlikely(action == TC_ACT_SHOT)) goto drop; - if (flags & BIT(TC_CT_CLEAR)) { - new_ct = nf_ct_get(skb, &ctinfo); - if (new_ct) { - if (nf_ct_is_confirmed(new_ct)) - nf_ct_delete(new_ct, 0, 0); + new_ct = nf_ct_get(skb, &ctinfo); + if (new_ct) { + if (nf_ct_is_confirmed(new_ct)) + nf_ct_delete(new_ct, 0, 0); - nf_conntrack_put(&new_ct->ct_general); + nf_conntrack_put(&new_ct->ct_general); + + if (flags & BIT(TC_CT_CLEAR)) { nf_ct_set(skb, NULL, IP_CT_UNTRACKED); goto out; } + + nf_ct_set(skb, NULL, 0); } /* FIXME: For when we support cloning the packet @@ -218,7 +221,6 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, skb_pull_rcsum(skb, nh_ofs); /* FIXME: OvS trims the packet here. Should we? */ - /* FIXME: Need to handle multiple calls to CT action here. */ if (ct) nf_ct_set(skb, ct, IP_CT_NEW);