diff mbox series

Question re. skb_orphan for TPROXY

Message ID CACAyw9-pYyvkUBOzdD+XQBEKdGGB9foJ5ph5sdjiuE4_uyEoJg@mail.gmail.com
State RFC
Delegated to: David Miller
Headers show
Series Question re. skb_orphan for TPROXY | expand

Commit Message

Lorenz Bauer April 16, 2019, 2:49 p.m. UTC
Hello Herbert (and List),

Apologies for contacting you out of the blue. I'm currently trying to
understand how TPROXY works under the hood. As part of this endeavour,
I've stumbled upon the commit attached to this email.

From the commit message I infer that somewhere, TPROXY relies on a
check of skb->sk == NULL to function. However, I can't figure out
where! I've traced TPROXY from NF_HOOK(NF_INET_PRE_ROUTING) just after
the call to skb_orphan to __inet_lookup_skb / skb_steal_sock called
from the TCP and UDP receive functions, and as far as I can tell there
is no such check. Can you maybe shed some light on this?

The commit is a fix for https://bugzilla.kernel.org/show_bug.cgi?id=13627

commit 71f9dacd2e4d233029e9e956ca3f79531f411827
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Fri Jun 26 19:22:37 2009 -0700

    inet: Call skb_orphan before tproxy activates

    As transparent proxying looks up the socket early and assigns
    it to the skb for later processing, we must drop any existing
    socket ownership prior to that in order to distinguish between
    the case where tproxy is active and where it is not.

    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>

 err:

Comments

Florian Westphal April 16, 2019, 3 p.m. UTC | #1
Lorenz Bauer <lmb@cloudflare.com> wrote:
> Apologies for contacting you out of the blue. I'm currently trying to
> understand how TPROXY works under the hood. As part of this endeavour,
> I've stumbled upon the commit attached to this email.
> 
> From the commit message I infer that somewhere, TPROXY relies on a
> check of skb->sk == NULL to function. However, I can't figure out
> where! I've traced TPROXY from NF_HOOK(NF_INET_PRE_ROUTING) just after
> the call to skb_orphan to __inet_lookup_skb / skb_steal_sock called
> from the TCP and UDP receive functions, and as far as I can tell there
> is no such check. Can you maybe shed some light on this?

Without the skb_orphan udp/tcp might steal tunnel/ppp etc. socket
instead of tproxy assigned tcp/udp socket.
Lorenz Bauer April 18, 2019, 12:01 p.m. UTC | #2
Hello Florian,

Thank you, that makes sense. I guess that technically early demux also
relies on the skb_orphan call to function? I think that's what
confused me.

Best
Lorenz

On Tue, 16 Apr 2019 at 17:00, Florian Westphal <fw@strlen.de> wrote:
>
> Lorenz Bauer <lmb@cloudflare.com> wrote:
> > Apologies for contacting you out of the blue. I'm currently trying to
> > understand how TPROXY works under the hood. As part of this endeavour,
> > I've stumbled upon the commit attached to this email.
> >
> > From the commit message I infer that somewhere, TPROXY relies on a
> > check of skb->sk == NULL to function. However, I can't figure out
> > where! I've traced TPROXY from NF_HOOK(NF_INET_PRE_ROUTING) just after
> > the call to skb_orphan to __inet_lookup_skb / skb_steal_sock called
> > from the TCP and UDP receive functions, and as far as I can tell there
> > is no such check. Can you maybe shed some light on this?
>
> Without the skb_orphan udp/tcp might steal tunnel/ppp etc. socket
> instead of tproxy assigned tcp/udp socket.
Eric Dumazet May 2, 2019, 5:50 p.m. UTC | #3
On 4/16/19 8:00 AM, Florian Westphal wrote:
> Lorenz Bauer <lmb@cloudflare.com> wrote:
>> Apologies for contacting you out of the blue. I'm currently trying to
>> understand how TPROXY works under the hood. As part of this endeavour,
>> I've stumbled upon the commit attached to this email.
>>
>> From the commit message I infer that somewhere, TPROXY relies on a
>> check of skb->sk == NULL to function. However, I can't figure out
>> where! I've traced TPROXY from NF_HOOK(NF_INET_PRE_ROUTING) just after
>> the call to skb_orphan to __inet_lookup_skb / skb_steal_sock called
>> from the TCP and UDP receive functions, and as far as I can tell there
>> is no such check. Can you maybe shed some light on this?
> 
> Without the skb_orphan udp/tcp might steal tunnel/ppp etc. socket
> instead of tproxy assigned tcp/udp socket.
> 

Florian, it is the responsibility of the loopback code to perform the skb_orphan()

I am confident we can revert 71f9dacd2e4d23 and fix the
paths that eventually miss the skb_orphan() call.

loopback_xmit() properly calls skb_orphan(), we also need to make sure that any kind 
of loopback (veth and others) do the same.

This is a prereq so that XDP or tc code can implement early demux earlier.

As a bonus we remove one skb_orphan() in rx fast path ;)

Note that skb_scrub_packet() used to call skb_orphan(), we need to a bit smarter
and insert it only in __dev_forward_skb()
diff mbox series

Patch

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 490ce20faf38..db46b4b5b2b9 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -440,6 +440,9 @@  int ip_rcv(struct sk_buff *skb, struct net_device
*dev, struct packet_type *pt,
     /* Remove any debris in the socket control block */
     memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));

+    /* Must drop socket now because of tproxy. */
+    skb_orphan(skb);
+
     return NF_HOOK(PF_INET, NF_INET_PRE_ROUTING, skb, dev, NULL,
                ip_rcv_finish);

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index c3a07d75b5f5..6d6a4277c677 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -139,6 +139,9 @@  int ipv6_rcv(struct sk_buff *skb, struct
net_device *dev, struct packet_type *pt

     rcu_read_unlock();

+    /* Must drop socket now because of tproxy. */
+    skb_orphan(skb);
+
     return NF_HOOK(PF_INET6, NF_INET_PRE_ROUTING, skb, dev, NULL,
                ip6_rcv_finish);