[net] ip6_tunnel: set inner ipproto before ip6_tnl_encap.

Message ID 20201016111156.26927-1-ovov@yandex-team.ru
State Changes Requested
Delegated to: David Miller

Commit Message

Alexander Ovechkin Oct. 16, 2020, 11:11 a.m. UTC
ip6_tnl_encap() overwrites proto with the transport protocol that
encapsulates the inner packet, but skb_set_inner_ipproto() must be
given the protocol of the inner packet itself.

Calling skb_set_inner_ipproto() after ip6_tnl_encap() can break GSO.
For example, when an IPv6 packet is encapsulated in a fou6 packet,
inner_ipproto is set to IPPROTO_UDP instead of IPPROTO_IPV6. This leads
to an incorrect GSO call sequence:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
instead of:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment

Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
---
 net/ipv6/ip6_tunnel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
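
The ordering problem described in the commit message can be sketched in plain userspace C. This is only a toy model, not kernel code: `struct toy_skb`, `toy_fou_encap()`, the two `record_*` helpers and the `TOY_IPPROTO_*` constants are hypothetical names introduced for illustration (the constants carry the IANA protocol numbers for UDP and IPv6). The only assumption taken from the thread is that the encap step overwrites `proto` with the protocol of the header it pushes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; values are the IANA protocol numbers. */
enum { TOY_IPPROTO_UDP = 17, TOY_IPPROTO_IPV6 = 41 };

/* Hypothetical stand-in for the single sk_buff field of interest. */
struct toy_skb {
	uint8_t inner_ipproto;
};

/* Toy FOU-style encap: like ip6_tnl_encap() as described above, it
 * replaces *proto with the protocol of the outer header it adds (UDP). */
static int toy_fou_encap(uint8_t *proto)
{
	assert(proto != NULL);
	*proto = TOY_IPPROTO_UDP;
	return 0;
}

/* Ordering before the patch: record the inner protocol after encap,
 * so the value recorded is the one encap wrote. */
static uint8_t record_after_encap(struct toy_skb *skb, uint8_t proto)
{
	toy_fou_encap(&proto);
	skb->inner_ipproto = proto;
	return skb->inner_ipproto;
}

/* Ordering proposed by the patch: record first, then encap, so the
 * value recorded is the protocol of the packet being tunnelled. */
static uint8_t record_before_encap(struct toy_skb *skb, uint8_t proto)
{
	skb->inner_ipproto = proto;
	toy_fou_encap(&proto);
	return skb->inner_ipproto;
}
```

Calling both helpers with `TOY_IPPROTO_IPV6` shows the difference: the first records the UDP value, the second keeps the IPv6 value that the GSO path in the commit message expects.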

Comments

Willem de Bruijn Oct. 16, 2020, 5:55 p.m. UTC | #1
On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>
> ip6_tnl_encap assigns to proto transport protocol which
> encapsulates inner packet, but we must pass to set_inner_ipproto
> protocol of that inner packet.
>
> Calling set_inner_ipproto after ip6_tnl_encap might break gso.
> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto
> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
> incorrect calling sequence of gso functions:
> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
> instead of:
> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment
>
> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>

Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.

It makes sense that that likely broke this behavior for UDP (L4) tunnels.

But it was moved on purpose to avoid setting the inner protocol to
IPPROTO_MPLS. That needs to use skb->inner_protocol to further
segment.

I suspect we need to set this before or after conditionally to avoid
breaking that use case.

Vadim Fedorenko Oct. 17, 2020, 12:59 a.m. UTC | #2
On 16.10.2020 18:55, Willem de Bruijn wrote:
> On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>> ip6_tnl_encap assigns to proto transport protocol which
>> encapsulates inner packet, but we must pass to set_inner_ipproto
>> protocol of that inner packet.
>>
>> Calling set_inner_ipproto after ip6_tnl_encap might break gso.
>> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto
>> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
>> incorrect calling sequence of gso functions:
>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
>> instead of:
>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment
>>
>> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
> the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.
>
> It makes sense that that likely broke this behavior for UDP (L4) tunnels.
>
> But it was moved on purpose to avoid setting the inner protocol to
> IPPROTO_MPLS. That needs to use skb->inner_protocol to further
> segment.
>
> I suspect we need to set this before or after conditionally to avoid
> breaking that use case.

I hope it could be fixed with something like this:

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a0217e5..87368b0 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1121,6 +1121,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
         bool use_cache = false;
         u8 hop_limit;
         int err = -1;
+       __u8 pproto = proto;

         if (t->parms.collect_md) {
                 hop_limit = skb_tunnel_info(skb)->key.ttl;
@@ -1280,7 +1281,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
                 ipv6_push_frag_opts(skb, &opt.ops, &proto);
         }

-       skb_set_inner_ipproto(skb, proto);
+       skb_set_inner_ipproto(skb, pproto == IPPROTO_MPLS ? proto : pproto);

         skb_push(skb, sizeof(struct ipv6hdr));
         skb_reset_network_header(skb);
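
The selection expression in the diff above can be isolated as a small pure function to make its two cases explicit. This is a sketch under stated assumptions, not part of either proposed patch: `pick_inner_ipproto()`, `check_examples()` and the `TOY_IPPROTO_*` constants are hypothetical names (the constants carry the IANA protocol numbers), `orig_proto` plays the role of `pproto`, and `cur_proto` the role of `proto` after encapsulation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants; values are the IANA protocol numbers. */
enum {
	TOY_IPPROTO_UDP  = 17,
	TOY_IPPROTO_IPV6 = 41,
	TOY_IPPROTO_MPLS = 137,
};

/* orig_proto: protocol handed to the transmit path (pproto in the diff).
 * cur_proto:  value of proto after the encap step may have rewritten it.
 *
 * For MPLS the current (possibly rewritten) value is used; for every
 * other protocol the original inner protocol is kept. */
static uint8_t pick_inner_ipproto(uint8_t orig_proto, uint8_t cur_proto)
{
	return orig_proto == TOY_IPPROTO_MPLS ? cur_proto : orig_proto;
}

/* Usage example covering both branches of the expression. */
static void check_examples(void)
{
	/* IPv6 inside a UDP-based tunnel: the inner protocol stays IPv6. */
	assert(pick_inner_ipproto(TOY_IPPROTO_IPV6, TOY_IPPROTO_UDP) ==
	       TOY_IPPROTO_IPV6);
	/* MPLS payload: whatever proto holds after encap is used. */
	assert(pick_inner_ipproto(TOY_IPPROTO_MPLS, TOY_IPPROTO_UDP) ==
	       TOY_IPPROTO_UDP);
}
```

Whether this conditional preserves the MPLS segmentation behaviour discussed in the thread is not established here; the sketch only restates what the expression computes.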

Patch

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a0217e5bf3bc..648db3fe508f 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1271,6 +1271,8 @@  int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 	if (max_headroom > dev->needed_headroom)
 		dev->needed_headroom = max_headroom;
 
+	skb_set_inner_ipproto(skb, proto);
+
 	err = ip6_tnl_encap(skb, t, &proto, fl6);
 	if (err)
 		return err;
@@ -1280,8 +1282,6 @@  int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 		ipv6_push_frag_opts(skb, &opt.ops, &proto);
 	}
 
-	skb_set_inner_ipproto(skb, proto);
-
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
 	ipv6h = ipv6_hdr(skb);