diff mbox

[RFC] GSO: Questions for IPVS and GSO for IPv6

Message ID alpine.LFD.2.11.1407270957400.2036@ja.home.ssi.bg
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Julian Anastasov July 27, 2014, 7:21 a.m. UTC
Hello,

	IPVS (IP Virtual Server) has a sending mode to
encapsulate IPv6 packet in a new IPv6 header (like ip6ip6 tunnel).
As IPVS is lacking skb->encapsulation support we are trying to
call iptunnel_handle_offloads() with some SKB_GSO_ value,
SKB_GSO_SIT looking the only available option.

	IPVS works by catching packets at LOCAL_IN (GRO
is possible) and injecting them via ip_local_out() and
ip6_local_out() after adding new IPIP or IPV6 header.
Draft patch is appended for comments.

	We have the following questions:

- Is it correct to provide SKB_GSO_SIT to iptunnel_handle_offloads()
for the case when we add IPPROTO_IPV6 header, I see
(SKB_GSO_SIT|SKB_GSO_IPIP) check in ipv6_gso_segment() that
may need to account network headers. Or we have to add
SKB_GSO_IPV6 with same value like SKB_GSO_SIT, in case the
SIT name looks confusing for ip6ip6 mode and to register
it in inet6_offloads?

	Questions not related to IPVS:

- I guess ipv6_gro_receive() does not support GRO for
IPPROTO_IPV6 in IPPROTO_IPV6 because IPPROTO_IPV6 is not
registered in inet6_offloads list?

- is it correct ip6_tnl_xmit2() to use
skb->transport_header = skb->network_header before calling
skb_reset_inner_headers() ? skb->inner_transport_header will
get wrong value? I guess this is bad for HW csum offload?

Thanks!

[PATCH net] ipvs: properly declare tunnel encapsulation

The tunneling method should properly use tunnel encapsulation.
Fixes problem with CHECKSUM_PARTIAL packets when TCP/UDP csum
offload is supported.

Thanks to Alex Gartrell for reporting the problem, providing
solution and for all suggestions.

Reported-by: Alex Gartrell <agartrell@fb.com>
TODO: waiting for final ack from Alex before adding Signed-off-by line
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 net/netfilter/ipvs/ip_vs_xmit.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

Comments

David Miller July 29, 2014, 10:43 p.m. UTC | #1
From: Julian Anastasov <ja@ssi.bg>
Date: Sun, 27 Jul 2014 10:21:17 +0300 (EEST)

> 	Questions not related to IPVS:
> 
> - I guess ipv6_gro_receive() does not support GRO for
> IPPROTO_IPV6 in IPPROTO_IPV6 because IPPROTO_IPV6 is not
> registered in inet6_offloads list?

Correct.  SIT works because we register an "ipv4" offload for protocol
ipv6.

You'll need to add an inet6 offload, but I am not sure whether you can
just reuse the SIT skb gso flag, you might need to add a new onw.

Eric would know better, he added the SIT code, CC:'d
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Gartrell July 30, 2014, 3:15 a.m. UTC | #2
Thanks for putting this up Julian.  As a meta-point, working with 
upstream isn't necessarily something that has come easily or obviously 
to us corporate shills at Internet companies, but it's pretty obvious at 
this point that it was foolish for us to go it alone :)

> [PATCH net] ipvs: properly declare tunnel encapsulation
>
> The tunneling method should properly use tunnel encapsulation.
> Fixes problem with CHECKSUM_PARTIAL packets when TCP/UDP csum
> offload is supported.
>
> Thanks to Alex Gartrell for reporting the problem, providing
> solution and for all suggestions.
>
> Reported-by: Alex Gartrell <agartrell@fb.com>
> TODO: waiting for final ack from Alex before adding Signed-off-by line

nit-picky comments below, but this looks good to me.

> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> ---
>   net/netfilter/ipvs/ip_vs_xmit.c | 20 ++++++++++++++++----
>   1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
> index 73ba1cc..8c1172c 100644
> --- a/net/netfilter/ipvs/ip_vs_xmit.c
> +++ b/net/netfilter/ipvs/ip_vs_xmit.c
> @@ -38,6 +38,7 @@
>   #include <net/route.h>                  /* for ip_route_output */
>   #include <net/ipv6.h>
>   #include <net/ip6_route.h>
> +#include <net/ip_tunnels.h>
>   #include <net/addrconf.h>
>   #include <linux/icmpv6.h>
>   #include <linux/netfilter.h>
> @@ -862,14 +863,19 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
>   		old_iph = ip_hdr(skb);
>   	}
>
> -	skb->transport_header = skb->network_header;
> -
>   	/* fix old IP header checksum */
>   	ip_send_check(old_iph);
>
> +	skb = iptunnel_handle_offloads(skb, false, SKB_GSO_IPIP);
> +	if (IS_ERR(skb))
> +		goto tx_error;
> +
> +	skb->transport_header = skb->network_header;
> +
>   	skb_push(skb, sizeof(struct iphdr));
>   	skb_reset_network_header(skb);
>   	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
> +	skb_clear_hash(skb);

We need this in ipv4 but not in ipv6?

>
>   	/*
>   	 *	Push down and install the IPIP header.
> @@ -900,7 +906,8 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
>   	return NF_STOLEN;
>
>     tx_error:
> -	kfree_skb(skb);
> +	if (!IS_ERR(skb))
> +		kfree_skb(skb);
>   	rcu_read_unlock();
>   	LeaveFunction(10);
>   	return NF_STOLEN;
> @@ -953,6 +960,10 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
>   		old_iph = ipv6_hdr(skb);
>   	}
>
> +	skb = iptunnel_handle_offloads(skb, false, SKB_GSO_SIT);

It sounds like this is going to take a fair amount of work.  Can we set 
gso_type_mask to 0 with a scary comment and push it as a fix and then do 
the right thing as a feature for a future release?

> +	if (IS_ERR(skb))
> +		goto tx_error;
> +
>   	skb->transport_header = skb->network_header;
>
>   	skb_push(skb, sizeof(struct ipv6hdr));
> @@ -988,7 +999,8 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
>   	return NF_STOLEN;
>
>   tx_error:
> -	kfree_skb(skb);
> +	if (!IS_ERR(skb))
> +		kfree_skb(skb);
>   	rcu_read_unlock();
>   	LeaveFunction(10);
>   	return NF_STOLEN;
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov July 30, 2014, 9:39 p.m. UTC | #3
Hello,

On Tue, 29 Jul 2014, Alex Gartrell wrote:

> > +	skb_clear_hash(skb);
> 
> We need this in ipv4 but not in ipv6?

	I got it from iptunnel_xmit(), I don't see it
for IPv6 and I was not very sure about it. We can also
leave it for another patch.

> It sounds like this is going to take a fair amount of work.  Can we set
> gso_type_mask to 0 with a scary comment and push it as a fix and then do the
> right thing as a feature for a future release?

	Yes, 0 or SIT with some comment can be enough
for a bugfix change.

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 73ba1cc..8c1172c 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -38,6 +38,7 @@ 
 #include <net/route.h>                  /* for ip_route_output */
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
+#include <net/ip_tunnels.h>
 #include <net/addrconf.h>
 #include <linux/icmpv6.h>
 #include <linux/netfilter.h>
@@ -862,14 +863,19 @@  ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 		old_iph = ip_hdr(skb);
 	}
 
-	skb->transport_header = skb->network_header;
-
 	/* fix old IP header checksum */
 	ip_send_check(old_iph);
 
+	skb = iptunnel_handle_offloads(skb, false, SKB_GSO_IPIP);
+	if (IS_ERR(skb))
+		goto tx_error;
+
+	skb->transport_header = skb->network_header;
+
 	skb_push(skb, sizeof(struct iphdr));
 	skb_reset_network_header(skb);
 	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
+	skb_clear_hash(skb);
 
 	/*
 	 *	Push down and install the IPIP header.
@@ -900,7 +906,8 @@  ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 	return NF_STOLEN;
 
   tx_error:
-	kfree_skb(skb);
+	if (!IS_ERR(skb))
+		kfree_skb(skb);
 	rcu_read_unlock();
 	LeaveFunction(10);
 	return NF_STOLEN;
@@ -953,6 +960,10 @@  ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 		old_iph = ipv6_hdr(skb);
 	}
 
+	skb = iptunnel_handle_offloads(skb, false, SKB_GSO_SIT);
+	if (IS_ERR(skb))
+		goto tx_error;
+
 	skb->transport_header = skb->network_header;
 
 	skb_push(skb, sizeof(struct ipv6hdr));
@@ -988,7 +999,8 @@  ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 	return NF_STOLEN;
 
 tx_error:
-	kfree_skb(skb);
+	if (!IS_ERR(skb))
+		kfree_skb(skb);
 	rcu_read_unlock();
 	LeaveFunction(10);
 	return NF_STOLEN;