diff mbox series

inet: don't call skb_orphan if tproxy happens in layer 2

Message ID 1518715545-2188-1-git-send-email-gregory.vanderschueren@tessares.net
State Changes Requested
Delegated to: Pablo Neira
Headers show
Series inet: don't call skb_orphan if tproxy happens in layer 2 | expand

Commit Message

Gregory Vander Schueren Feb. 15, 2018, 5:25 p.m. UTC
If sysctl bridge-nf-call-iptables is enabled, iptables chains are already
traversed from the bridging code. In such case, tproxy already happened when
reaching ip_rcv. Thus no need to call skb_orphan as this would actually undo
tproxy.

Fixes: 71f9dacd2e4d (inet: Call skb_orphan before tproxy activates)
Signed-off-by: Gregory Vander Schueren <gregory.vanderschueren@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---

Hi,

We noticed issues when using tproxy with net.bridge.bridge-nf-call-iptables
enabled. In such case, ip_rcv() basically undo tproxy's job. The following
patch proposes a fix.

Feedback would be most welcome,
Gregory

 include/linux/netfilter_bridge.h | 12 ++++++++++++
 net/ipv4/ip_input.c              |  9 +++++++--
 net/ipv6/ip6_input.c             |  9 +++++++--
 3 files changed, 26 insertions(+), 4 deletions(-)

Comments

Florian Westphal Feb. 16, 2018, 11:07 a.m. UTC | #1
Gregory Vander Schueren <gregory.vanderschueren@tessares.net> wrote:

[ cc netdev ]

> If sysctl bridge-nf-call-iptables is enabled, iptables chains are already
> traversed from the bridging code. In such case, tproxy already happened when
> reaching ip_rcv. Thus no need to call skb_orphan as this would actually undo
> tproxy.

I don't like this because it adds yet another test in fastpath, and for
a use case that has apparently never worked before.

> We noticed issues when using tproxy with net.bridge.bridge-nf-call-iptables
> enabled. In such case, ip_rcv() basically undo tproxy's job. The following
> patch proposes a fix.

I question wheter its a good idea to mix tproxy with bridges.

Tproxy relies on policy routing, but a bridge doesn't route :-)

I guess you use bridge snat mac mangling to force local delivery of
packets that are otherwise bridged?

If yes, can you use ebtables brouting instead?
This would bypass the bridge (so no iptables invocation from bridge
prerouting anymore).

We will try to get rid of nf-call-iptables eventually.

There might be (more complicated) ways to avoid this problem without
adding code in normal network path, but lets check other options first.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Feb. 16, 2018, 11:28 a.m. UTC | #2
On Fri, Feb 16, 2018 at 12:07:06PM +0100, Florian Westphal wrote:
> Gregory Vander Schueren <gregory.vanderschueren@tessares.net> wrote:
> 
> [ cc netdev ]
> 
> > If sysctl bridge-nf-call-iptables is enabled, iptables chains are already
> > traversed from the bridging code. In such case, tproxy already happened when
> > reaching ip_rcv. Thus no need to call skb_orphan as this would actually undo
> > tproxy.
> 
> I don't like this because it adds yet another test in fastpath, and for
> a use case that has apparently never worked before.
> 
> > We noticed issues when using tproxy with net.bridge.bridge-nf-call-iptables
> > enabled. In such case, ip_rcv() basically undo tproxy's job. The following
> > patch proposes a fix.
> 
> I question wheter its a good idea to mix tproxy with bridges.
> 
> Tproxy relies on policy routing, but a bridge doesn't route :-)
> 
> I guess you use bridge snat mac mangling to force local delivery of
> packets that are otherwise bridged?
> 
> If yes, can you use ebtables brouting instead?
> This would bypass the bridge (so no iptables invocation from bridge
> prerouting anymore).
> 
> We will try to get rid of nf-call-iptables eventually.
> 
> There might be (more complicated) ways to avoid this problem without
> adding code in normal network path, but lets check other options first.

Agreed.

If there's a fix for this, that should be away from the fast path, not
in ip_rcv().
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gregory Vander Schueren Feb. 16, 2018, 3:44 p.m. UTC | #3
Hi Florian & Pablo,

Thank your very much for your quick feedback.

On 02/16/2018 12:28 PM, Pablo Neira Ayuso wrote:
> On Fri, Feb 16, 2018 at 12:07:06PM +0100, Florian Westphal wrote:
>> Gregory Vander Schueren <gregory.vanderschueren@tessares.net> wrote:
>>
>> [ cc netdev ]
>>
>>> If sysctl bridge-nf-call-iptables is enabled, iptables chains are already
>>> traversed from the bridging code. In such case, tproxy already happened when
>>> reaching ip_rcv. Thus no need to call skb_orphan as this would actually undo
>>> tproxy.
>>
>> I don't like this because it adds yet another test in fastpath, and for
>> a use case that has apparently never worked before.

Agreed. I also thought this was not ideal but I did find another way to 
easily fix this.

>>> We noticed issues when using tproxy with net.bridge.bridge-nf-call-iptables
>>> enabled. In such case, ip_rcv() basically undo tproxy's job. The following
>>> patch proposes a fix.
>>
>> I question wheter its a good idea to mix tproxy with bridges.
>>
>> Tproxy relies on policy routing, but a bridge doesn't route :-)
>>
>> I guess you use bridge snat mac mangling to force local delivery of
>> packets that are otherwise bridged?

Indeed, we use DNAT MAC mangling.

>> If yes, can you use ebtables brouting instead?
>> This would bypass the bridge (so no iptables invocation from bridge
>> prerouting anymore).

We were actually pondering over the usage of MAC DNAT vs brouting. I'll 
thus follow your suggestion and use brouting instead then.

>> We will try to get rid of nf-call-iptables eventually.

Good to know!

>> There might be (more complicated) ways to avoid this problem without
>> adding code in normal network path, but lets check other options first.
> 
> Agreed.
> 
> If there's a fix for this, that should be away from the fast path, not
> in ip_rcv().
>
diff mbox series

Patch

diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h
index b671fdf..b1c48f5 100644
--- a/include/linux/netfilter_bridge.h
+++ b/include/linux/netfilter_bridge.h
@@ -66,12 +66,24 @@  static inline bool nf_bridge_in_prerouting(const struct sk_buff *skb)
 {
 	return skb->nf_bridge && skb->nf_bridge->in_prerouting;
 }
+
+static inline bool
+nf_bridge_has_called_iptables(const struct sk_buff *skb)
+{
+	return skb->nf_bridge != NULL;
+}
 #else
 #define br_drop_fake_rtable(skb)	        do { } while (0)
 static inline bool nf_bridge_in_prerouting(const struct sk_buff *skb)
 {
 	return false;
 }
+
+static inline bool
+nf_bridge_has_called_iptables(const struct sk_buff *skb)
+{
+	return false;
+}
 #endif /* CONFIG_BRIDGE_NETFILTER */
 
 #endif
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 57fc13c..2450205 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -143,6 +143,7 @@ 
 #include <net/checksum.h>
 #include <net/inet_ecn.h>
 #include <linux/netfilter_ipv4.h>
+#include <linux/netfilter_bridge.h>
 #include <net/xfrm.h>
 #include <linux/mroute.h>
 #include <linux/netlink.h>
@@ -487,8 +488,12 @@  int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
 	IPCB(skb)->iif = skb->skb_iif;
 
-	/* Must drop socket now because of tproxy. */
-	skb_orphan(skb);
+	/* If nf_bridge calls iptables then tproxy already happened.
+	 * No need to call skb_orphan as this would undo tproxy.
+	 */
+	if (!nf_bridge_has_called_iptables(skb))
+		/* Must drop socket now because of tproxy. */
+		skb_orphan(skb);
 
 	return NF_HOOK(NFPROTO_IPV4, NF_INET_PRE_ROUTING,
 		       net, NULL, skb, dev, NULL,
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 9ee208a..60ce4735 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -32,6 +32,7 @@ 
 
 #include <linux/netfilter.h>
 #include <linux/netfilter_ipv6.h>
+#include <linux/netfilter_bridge.h>
 
 #include <net/sock.h>
 #include <net/snmp.h>
@@ -202,8 +203,12 @@  int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt
 
 	rcu_read_unlock();
 
-	/* Must drop socket now because of tproxy. */
-	skb_orphan(skb);
+	/* If nf_bridge calls iptables then tproxy already happened.
+	 * No need to call skb_orphan as this would undo tproxy.
+	 */
+	if (!nf_bridge_has_called_iptables(skb))
+		/* Must drop socket now because of tproxy. */
+		skb_orphan(skb);
 
 	return NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING,
 		       net, NULL, skb, dev, NULL,