
[net] ip_vti/ip6_vti: Clear skb->mark when resetting skb->dev in receive path

Message ID: 20150514020316.1635.50870.stgit@ahduyck-vm-fedora22
State: Changes Requested, archived
Delegated to: David Miller

Commit Message

Alexander Duyck May 14, 2015, 2:04 a.m. UTC
This change makes it so that we clear the skb->mark field when we pass
through the receive path of the IPv4 or IPv6 virtual tunnel interface.  The
reason for clearing these fields is to resolve an apparent regression for
the behavior before skb_scrub_packet was modified.  Without this patch I
have to set disable_policy for the vti tunnel endpoint in order to be able
to receive traffic.

Fixes: 213dd74aee76 ("skbuff: Do not scrub skb mark within the same name space")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---

I have only tested the ipv4 side of this patch as I have yet to be able to
get a message to successfully pass between two ipv6 vti endpoints.

 net/ipv4/ip_vti.c  |    1 +
 net/ipv6/ip6_vti.c |    1 +
 2 files changed, 2 insertions(+)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Herbert Xu May 14, 2015, 3:28 a.m. UTC | #1
On Wed, May 13, 2015 at 07:04:28PM -0700, Alexander Duyck wrote:
> This change makes it so that we clear the skb->mark field when we pass
> through the receive path of the IPv4 or IPv6 virtual tunnel interface.  The
> reason for clearing these fields is to resolve an apparent regression for
> the behavior before skb_scrub_packet was modified.  Without this patch I
> have to set disable_policy for the vti tunnel endpoint in order to be able
> to receive traffic.
> 
> Fixes: 213dd74aee76 ("skbuff: Do not scrub skb mark within the same name space")
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

This patch makes no sense.  Please explain your problem more
clearly and tell us why the mark changes the way your packet
is dealt with and why this isn't a policy decision that should
be made in user-space.

Cheers,
Alexander H Duyck May 14, 2015, 6:14 a.m. UTC | #2
On 05/13/2015 08:28 PM, Herbert Xu wrote:
> On Wed, May 13, 2015 at 07:04:28PM -0700, Alexander Duyck wrote:
>> This change makes it so that we clear the skb->mark field when we pass
>> through the receive path of the IPv4 or IPv6 virtual tunnel interface.  The
>> reason for clearing these fields is to resolve an apparent regression for
>> the behavior before skb_scrub_packet was modified.  Without this patch I
>> have to set disable_policy for the vti tunnel endpoint in order to be able
>> to receive traffic.
>>
>> Fixes: 213dd74aee76 ("skbuff: Do not scrub skb mark within the same name space")
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> This patch makes no sense.  Please explain your problem more
> clearly and tell us why the mark changes the way your packet
> is dealt with and why this isn't a policy decision that should
> be made in user-space.
>
> Cheers,

The problem is that if I set up an IPsec tunnel with vti endpoints on the
current net-next, I am unable to ping the other side.  The packet appears
to make it past the endpoint (I can see the ICMP request in tcpdump), but
the stack appears to be dropping the frame due to a policy check.  The
issue is resolved with my patch applied, and likewise if I revert the
"Fixes" patch or set disable_policy for the vti endpoint.

The fact is I am not all that familiar with the vti code and just 
started crawling through it a few days ago, but it seems like it is 
overwriting the skb->mark value with the i_key to determine which policy 
to use.  The code prior to commit df3893c176e9 ("vti: Update the ipv4 
side to use it's own receive hook.") was saving the old skb->mark, 
overwriting it, and then restoring it after a call to 
xfrm4_policy_check.  After that commit it was letting skb_scrub_packet 
in vti_rcv_cb clear the mark and it was just dropped.

I suppose if we are wanting to get back to the behavior before the 
receive hook change we could look at maybe storing the previous mark in 
the skb->cb assuming there is any room there for it, and then we could 
restore it in vti_rcv_cb.
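
The skb->cb idea above can be sketched in plain userspace C.  This is only
a model under stated assumptions, not kernel code: struct fake_skb,
struct fake_vti_cb, vti_rcv_model() and vti_rcv_cb_model() are hypothetical
names, and the 48-byte scratch area merely imitates skb->cb.  Whether the
real control block has room between the two hooks is exactly the open
question raised above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: only the fields the sketch needs. */
struct fake_skb {
	uint32_t mark;
	unsigned char cb[48];	/* scratch area, modeled on skb->cb */
};

/* Hypothetical layout for the slice of cb[] used by this sketch. */
struct fake_vti_cb {
	uint32_t saved_mark;
};

/* Receive hook model: stash the caller's mark, then borrow the field
 * so the tunnel's input key can drive the policy lookup. */
static void vti_rcv_model(struct fake_skb *skb, uint32_t i_key)
{
	struct fake_vti_cb cb = { .saved_mark = skb->mark };

	assert(sizeof(cb) <= sizeof(skb->cb));
	memcpy(skb->cb, &cb, sizeof(cb));
	skb->mark = i_key;
}

/* Receive callback model: put back the mark the packet arrived with. */
static void vti_rcv_cb_model(struct fake_skb *skb)
{
	struct fake_vti_cb cb;

	memcpy(&cb, skb->cb, sizeof(cb));
	skb->mark = cb.saved_mark;
}
```

In this model a packet that arrives with mark 7 carries the input key only
between the two calls and leaves vti_rcv_cb_model() with mark 7 again,
which is the pre-df3893c176e9 behavior described earlier in this message.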

- Alex
Herbert Xu May 14, 2015, 6:26 a.m. UTC | #3
On Wed, May 13, 2015 at 11:14:39PM -0700, Alexander Duyck wrote:
> 
> The fact is I am not all that familiar with the vti code and just
> started crawling through it a few days ago, but it seems like it is
> overwriting the skb->mark value with the i_key to determine which
> policy to use.  The code prior to commit df3893c176e9 ("vti: Update
> the ipv4 side to use it's own receive hook.") was saving the old
> skb->mark, overwriting it, and then restoring it after a call to
> xfrm4_policy_check.  After that commit it was letting
> skb_scrub_packet in vti_rcv_cb clear the mark and it was just
> dropped.

Steffen, why is vti touching skb->mark at all? This is supposed
to be a field used by user-space to control a packet as it moves
inside the kernel.  Seconding it for other purposes looks very
wrong.

Cheers,
David Miller May 15, 2015, 4:37 p.m. UTC | #4
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 14 May 2015 14:26:14 +0800

> On Wed, May 13, 2015 at 11:14:39PM -0700, Alexander Duyck wrote:
>> 
>> The fact is I am not all that familiar with the vti code and just
>> started crawling through it a few days ago, but it seems like it is
>> overwriting the skb->mark value with the i_key to determine which
>> policy to use.  The code prior to commit df3893c176e9 ("vti: Update
>> the ipv4 side to use it's own receive hook.") was saving the old
>> skb->mark, overwriting it, and then restoring it after a call to
>> xfrm4_policy_check.  After that commit it was letting
>> skb_scrub_packet in vti_rcv_cb clear the mark and it was just
>> dropped.
> 
> Steffen, why is vti touching skb->mark at all? This is supposed
> to be a field used by user-space to control a packet as it moves
> inside the kernel.  Seconding it for other purposes looks very
> wrong.

If anything, the skb_scrub_packet() call right above the skb->mark
clears should be taking care of this.

The only case where mark should be cleared is if we are changing
namespaces, and that's exactly the policy implemented by
skb_scrub_packet() currently.

Yeah, this mark handling via tunnel->parms.o_key looks not so good.
Alexander H Duyck May 15, 2015, 7:14 p.m. UTC | #5
On 05/15/2015 09:37 AM, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Thu, 14 May 2015 14:26:14 +0800
>
>> On Wed, May 13, 2015 at 11:14:39PM -0700, Alexander Duyck wrote:
>>> The fact is I am not all that familiar with the vti code and just
>>> started crawling through it a few days ago, but it seems like it is
>>> overwriting the skb->mark value with the i_key to determine which
>>> policy to use.  The code prior to commit df3893c176e9 ("vti: Update
>>> the ipv4 side to use it's own receive hook.") was saving the old
>>> skb->mark, overwriting it, and then restoring it after a call to
>>> xfrm4_policy_check.  After that commit it was letting
>>> skb_scrub_packet in vti_rcv_cb clear the mark and it was just
>>> dropped.
>> Steffen, why is vti touching skb->mark at all? This is supposed
>> to be a field used by user-space to control a packet as it moves
>> inside the kernel.  Seconding it for other purposes looks very
>> wrong.
> If anything, the skb_scrub_packet() call right above the skb->mark
> clears should be taking care of this.

That only applies if you are crossing namespaces which we are not in 
this case.
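
A rough userspace model of that point, assuming only what is stated in this
thread (the mark is scrubbed when the packet crosses namespaces and left
alone otherwise).  The names struct fake_skb, scrub_model() and
rcv_cb_model() are hypothetical, and everything else the real
skb_scrub_packet() resets is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff. */
struct fake_skb {
	uint32_t mark;
	int netns_id;	/* namespace of the device the packet arrived on */
};

/* Model of the scrub rule: the mark survives unless xnet is true. */
static void scrub_model(struct fake_skb *skb, bool xnet)
{
	assert(skb != NULL);
	if (!xnet)
		return;
	skb->mark = 0;
}

/* Model of the receive callback, with the proposed patch as a switch. */
static void rcv_cb_model(struct fake_skb *skb, int tunnel_netns,
			 bool clear_mark_patch)
{
	scrub_model(skb, tunnel_netns != skb->netns_id);
	if (clear_mark_patch)
		skb->mark = 0;	/* the line added by the patch under review */
}
```

With tunnel and device in the same namespace, the model leaves the mark
untouched unless clear_mark_patch is set, which is the situation described
here.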

> The only case where mark should be cleared is if we are changing
> namespaces, and that's exactly the policy implemented by
> skb_scrub_packet() currently.

Right.  The problem is that vti and vti6 appear to be using the mark to
signal which policy is meant to be used for either end of the tunnel.
From what I can tell, at some point a pre-routing hook was used for this,
but it was later replaced with the i_key for input and the o_key for
output.
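
To make the idea of a mark selecting a policy concrete, here is a minimal
userspace sketch.  It assumes a value/mask comparison of the form
(mark & mask) == value; struct fake_policy and lookup_by_mark() are
hypothetical names used for illustration, not the xfrm API, and the keys
42 and 43 in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy entry keyed by a mark value and mask. */
struct fake_policy {
	uint32_t mark_v;	/* value the masked mark must equal */
	uint32_t mark_m;	/* bits of the mark that take part in the match */
	const char *name;
};

/* Return the first policy whose value/mask pair matches the mark,
 * or NULL when nothing matches (the packet would then be dropped). */
static const struct fake_policy *
lookup_by_mark(const struct fake_policy *tbl, size_t n, uint32_t mark)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		if ((mark & tbl[i].mark_m) == tbl[i].mark_v)
			return &tbl[i];
	}
	return NULL;
}
```

In this model, a tunnel configured with i_key 42 and o_key 43 would set the
mark to 42 before the input lookup and to 43 before the output lookup, so
each direction lands on its own entry, while a packet whose mark matches
neither entry finds no policy at all.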

> Yeah, this mark handling via tunnel->parms.o_key looks not so good.

So are there any recommendations for an alternative to make it so that
the ipsec endpoint is identified as needing to be encrypted or
decrypted?  If needed I could probably take a day or two to try and 
address it as I still have a few other minor things I want to try and 
fix such as the MTU configuration for vti/vti6.

- Alex
Herbert Xu May 16, 2015, 12:34 p.m. UTC | #6
On Fri, May 15, 2015 at 12:14:43PM -0700, Alexander Duyck wrote:
>
> >Yeah, this mark handling via tunnel->parms.o_key looks not so good.
> 
> So are there any recommendations for an alternative to make it so
> that the ipsec endpoint is identified as needing to be encrypted or
> decrypted?  If needed I could probably take a day or two to try and
> address it as I still have a few other minor things I want to try
> and fix such as the MTU configuration for vti/vti6.

I'd like to hear from Steffen as to whether there is anything
in userspace that relies on the mark being used in this way by
vti.  If not it should be easy to get rid of it and use some
field that's not exposed to user-space.  If there is then this
would be tricky to resolve.

Cheers,

Patch

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index ee479495f5a3..d853e78742d3 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -112,6 +112,7 @@  static int vti_rcv_cb(struct sk_buff *skb, int err)
 
 	skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(skb->dev)));
 	skb->dev = dev;
+	skb->mark = 0;
 
 	tstats = this_cpu_ptr(dev->tstats);
 
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index ed9d681207fa..c245fb8298e5 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -363,6 +363,7 @@  static int vti6_rcv_cb(struct sk_buff *skb, int err)
 
 	skb_scrub_packet(skb, !net_eq(t->net, dev_net(skb->dev)));
 	skb->dev = dev;
+	skb->mark = 0;
 
 	tstats = this_cpu_ptr(dev->tstats);
 	u64_stats_update_begin(&tstats->syncp);