xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels

Message ID 20150527173823.1415.96248.stgit@ahduyck-vm-fedora22
State Awaiting Upstream, archived
Delegated to: David Miller

Commit Message

Alexander Duyck May 27, 2015, 5:40 p.m. UTC
This change makes it so that we use icmpv6_send to report PMTU issues back
into tunnels in the case that the resulting packet is larger than the MTU
of the outgoing interface.  Previously xfrm_local_error was being used in
this case, however this was resulting in no changes, I suspect due to the
fact that the tunnel itself was being kept out of the loop.

This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
behavior seen if the socket was orphaned.  Instead of requiring the socket
to be orphaned this patch simply defaults to using icmpv6_send in the case
that the frame came through a tunnel.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 net/ipv6/xfrm6_output.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alexander H Duyck May 29, 2015, 4:53 p.m. UTC | #1
On 05/28/2015 12:15 PM, Alexander Duyck wrote:
> On 05/28/2015 01:40 AM, Steffen Klassert wrote:
>> On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:
>>> On 05/27/2015 10:36 PM, Steffen Klassert wrote:
>>>> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
>>>>> This change makes it so that we use icmpv6_send to report PMTU issues back
>>>>> into tunnels in the case that the resulting packet is larger than the MTU
>>>>> of the outgoing interface.  Previously xfrm_local_error was being used in
>>>>> this case, however this was resulting in no changes, I suspect due to the
>>>>> fact that the tunnel itself was being kept out of the loop.
>>>>>
>>>>> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
>>>>> behavior seen if the socket was orphaned.  Instead of requiring the socket
>>>>> to be orphaned this patch simply defaults to using icmpv6_send in the case
>>>>> that the frame came through a tunnel.
>>>> We can use icmpv6_send() just in the case that the packet
>>>> was already transmitted by a tunnel device, otherwise we
>>>> get the bug back that I mentioned in my other mail.
>>>>
>>>> Not sure if we have something to know that the packet
>>>> traversed a tunnel device. That's what I asked in the
>>>> thread 'Looking for a lost patch'.
>>> Okay I will try to do some more digging.  From what I can tell right
>>> now it looks like my ping attempts are getting hung up on the
>>> xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow
>>> make use of the skb->cb to store a pointer to the tunnel that could
>>> be checked to determine if we are going through a VTI or not.
>> Maybe it is as easy as the patch below, could you please test it?
>>
>> Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.
>>
>> We currently rely on the PMTU discovery of xfrm.
>> However if a packet is locally sent, the PMTU mechanism
>> of xfrm tries to do local socket notification, which
>> might not work for applications like ping that don't
>> check for this. So add pmtu handling to vti6_xmit to
>> report MTU changes immediately.
>>
>> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
>> ---
>>   net/ipv6/ip6_vti.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
>> index ff3bd86..13cb771 100644
>> --- a/net/ipv6/ip6_vti.c
>> +++ b/net/ipv6/ip6_vti.c
>> @@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>>       struct dst_entry *dst = skb_dst(skb);
>>       struct net_device *tdev;
>>       struct xfrm_state *x;
>> +    int mtu;
>>       int err = -1;
>>
>>       if (!dst)
>> @@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>>       skb_dst_set(skb, dst);
>>       skb->dev = skb_dst(skb)->dev;
>>
>> +    mtu = dst_mtu(dst);
>> +    if (!skb->ignore_df && skb->len > mtu) {
>> +        skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
>> +
>> +        icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
>> +
>> +        return -EMSGSIZE;
>> +    }
>> +
>>       err = dst_output(skb);
>>       if (net_xmit_eval(err) == 0) {
>>           struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
>
> That seems to be working for me.  I'm able to ping and while the first 
> packet fails the second one and all that follow make it through 
> correctly after the PMTU update.
>
> - Alex

It looks like I spoke too soon.  It resolves it for IPv6, but IPv4 over 
the tunnel has the same issue.  Probably need to have some sort of 
protocol based check to determine which version of the call to use.

- Alex
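
The protocol-based check suggested above could look roughly like the following. This is a userspace sketch only, not kernel code and not what was eventually merged: the enum and the function name pick_pmtu_report are hypothetical names made up for illustration. In the kernel the two non-trivial outcomes would presumably map to icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu) for IPv6 and icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)) for IPv4, selected on skb->protocol.

```c
#include <stdint.h>

/* EtherType values, matching ETH_P_IP / ETH_P_IPV6 in <linux/if_ether.h>. */
#define ETH_P_IP   0x0800
#define ETH_P_IPV6 0x86DD

/* Hypothetical result type for this sketch; not a kernel type. */
enum pmtu_report {
	PMTU_REPORT_NONE,               /* nothing to report */
	PMTU_REPORT_ICMPV4_FRAG_NEEDED, /* would call icmp_send() */
	PMTU_REPORT_ICMPV6_PKT_TOOBIG,  /* would call icmpv6_send() */
};

/*
 * Decide which error to send back for a packet entering the tunnel.
 * proto is the inner packet's EtherType in host byte order
 * (the kernel keeps skb->protocol in network byte order).
 */
static enum pmtu_report pick_pmtu_report(uint16_t proto, unsigned int len,
					 unsigned int mtu, int ignore_df)
{
	/* Packet fits, or fragmentation is allowed: no report needed. */
	if (ignore_df || len <= mtu)
		return PMTU_REPORT_NONE;

	switch (proto) {
	case ETH_P_IPV6:
		return PMTU_REPORT_ICMPV6_PKT_TOOBIG;
	case ETH_P_IP:
		return PMTU_REPORT_ICMPV4_FRAG_NEEDED;
	default:
		/* Unknown inner protocol: this sketch sends no report. */
		return PMTU_REPORT_NONE;
	}
}
```

With a check like this, an oversized IPv4 packet sent through the IPv6 VTI would get an ICMPv4 "fragmentation needed" error instead of an ICMPv6 message that the IPv4 sender cannot interpret.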

Patch

diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 09c76a7b474d..6f9b514d0e38 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -72,6 +72,7 @@  static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 {
 	int mtu, ret = 0;
 	struct dst_entry *dst = skb_dst(skb);
+	struct xfrm_state *x = dst->xfrm;
 
 	mtu = dst_mtu(dst);
 	if (mtu < IPV6_MIN_MTU)
@@ -82,7 +83,7 @@  static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 
 		if (xfrm6_local_dontfrag(skb))
 			xfrm6_local_rxpmtu(skb, mtu);
-		else if (skb->sk)
+		else if (skb->sk && x->props.mode != XFRM_MODE_TUNNEL)
 			xfrm_local_error(skb, mtu);
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
@@ -149,11 +150,16 @@  static int __xfrm6_output(struct sock *sk, struct sk_buff *skb)
 	else
 		mtu = dst_mtu(skb_dst(skb));
 
-	if (skb->len > mtu && xfrm6_local_dontfrag(skb)) {
-		xfrm6_local_rxpmtu(skb, mtu);
-		return -EMSGSIZE;
-	} else if (!skb->ignore_df && skb->len > mtu && skb->sk) {
-		xfrm_local_error(skb, mtu);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb->dev = dst->dev;
+
+		if (xfrm6_local_dontfrag(skb))
+			xfrm6_local_rxpmtu(skb, mtu);
+		else if (skb->sk && x->props.mode != XFRM_MODE_TUNNEL)
+			xfrm_local_error(skb, mtu);
+		else
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
 		return -EMSGSIZE;
 	}
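
To make the branch order in the patched __xfrm6_output hunk easier to follow, here is a small userspace model of the decision. It is a sketch under stated assumptions: the enum, the function name patched_mtu_action and the boolean parameters are hypothetical stand-ins for skb->ignore_df, xfrm6_local_dontfrag(skb), skb->sk and x->props.mode == XFRM_MODE_TUNNEL, and no real kernel structures are involved.

```c
#include <stdbool.h>

/* Hypothetical outcomes for this sketch; not kernel identifiers. */
enum xfrm6_mtu_action {
	ACT_CONTINUE,      /* size check passed, output continues */
	ACT_RXPMTU,        /* xfrm6_local_rxpmtu(), return -EMSGSIZE */
	ACT_LOCAL_ERROR,   /* xfrm_local_error(), return -EMSGSIZE */
	ACT_ICMPV6_TOOBIG, /* icmpv6_send(PKT_TOOBIG), return -EMSGSIZE */
};

/* Mirrors the if / else-if chain introduced by the patch above. */
static enum xfrm6_mtu_action
patched_mtu_action(unsigned int len, unsigned int mtu, bool ignore_df,
		   bool local_dontfrag, bool has_socket, bool tunnel_mode)
{
	/* The whole block is skipped if DF is ignored or the packet fits. */
	if (ignore_df || len <= mtu)
		return ACT_CONTINUE;

	/* Socket asked for dontfrag: queue the MTU on the socket. */
	if (local_dontfrag)
		return ACT_RXPMTU;

	/* Local socket and a non-tunnel-mode state: socket error as before. */
	if (has_socket && !tunnel_mode)
		return ACT_LOCAL_ERROR;

	/* No socket, or tunnel mode: report via ICMPv6 Packet Too Big. */
	return ACT_ICMPV6_TOOBIG;
}
```

The behavioral change is in the last two branches: a locally generated packet that has a socket but goes through a tunnel-mode state now falls through to the ICMPv6 path instead of xfrm_local_error.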